Article pubs.acs.org/IECR

Fault Detection and Diagnosis in Chemical Processes Using Sensitive Principal Component Analysis

Qingchao Jiang,† Xuefeng Yan,*,† and Weixiang Zhao‡

† Key Laboratory of Advanced Control and Optimization for Chemical Processes of Ministry of Education, East China University of Science and Technology, Shanghai 200237, P. R. China
‡ National Grid, Hicksville, New York 11801, United States

ABSTRACT: Sensitive principal component analysis (SPCA) is proposed to improve the performance of principal component analysis (PCA) based chemical process monitoring by solving the information loss problem and reducing the nondetection rates of the T2 statistic. Generally, the selection of principal components (PCs) in PCA-based process monitoring is subjective, which can lead to information loss and poor monitoring performance. The SPCA method first builds a conventional PCA model from normal samples, then indexes the PCs that reflect the dominant variation of abnormal observations, and uses these sensitive PCs (SPCs) to monitor the process. Moreover, a novel fault diagnosis approach based on SPCA is also proposed, exploiting the ability of the SPCs to represent the main characteristics of a fault. Case studies on the Tennessee Eastman process demonstrate the effectiveness of SPCA for online monitoring, showing that its performance is significantly better than that of classical PCA methods.

1. INTRODUCTION
With the development of modern industry and technology, manufacturing equipment is becoming larger, faster, and more complex and intelligent in operation. Fault detection and diagnosis of chemical processes play increasingly important roles in the process industry, as they are key to ensuring the safety and enhancing the productivity of the process. Recently, multivariate statistical process monitoring (MSPM) methods have progressed rapidly, and among them principal component analysis (PCA) is the most widely used.1−7 PCA can effectively deal with high-dimensional, noisy, and highly correlated data by projecting the data onto a lower-dimensional subspace that contains sufficient variance information of normal operational data. To use PCA, it is necessary to determine how many, and which, principal components should be extracted or retained. Various methods have been proposed for selecting principal components, such as cumulative percent variance (CPV), cross validation, and variance of reconstruction error (VRE), and many comparative studies have been conducted.8,9 CPV is a simple method for determining the number of PCs, selecting the first several PCs that represent the major variance information of the original data. Wold10 proposed the cross-validation method, which divides the training sample into two parts, one for model construction and one for prediction; if the sum of squared prediction residuals is smaller than the preceding one, the new component is added to the model. The VRE method for determining the number of principal components, proposed by Qin and Dunia,11 is based on the best reconstruction of the variables: the number of PCs that minimizes the fault reconstruction error is deemed optimal.
Li and Tang12 proposed a method based on the fault signal-to-noise ratio (SNR), observing the relationship between the sensitivity of fault detection and the number of PCs. Most of the classical methods take only normal operational observations into account while ignoring the information of abnormal observations, making the selection of principal components rather subjective. The SNR method considers the information of the process fault, but it does not take the relationship into further consideration, and it still selects only the first several PCs. Jolliffe13 pointed out that it is a misconception that the principal components with small eigenvalues are useless and demonstrated that these components can be as important as those with large variance. Smith and Campbell14 discussed the importance of the last principal components through a regression problem from chemical engineering. Similar discussions can also be found in the works of Kung and Sharif,15 Hill et al.,16 and others. However, this issue has not yet been sufficiently discussed in PCA-based process monitoring. Generally, in process monitoring, the first several PCs with the largest explained variance are retained while the smaller ones are rejected. This may lose useful information hidden in faults and seriously degrade the monitoring performance. Principal component selection thus remains an open question in PCA-based process monitoring. It should be noted that there are also many research works that select key variables for dimensionality reduction, seeking to identify a subset of measurements containing the most information for process monitoring. Arbel et al.17 suggested selecting the process variables that are preponderant in achieving specific objectives. Tyréus18 highlighted some of the challenges in identifying the dominant variables. Kothare et al.19 noted that a combination of physical insight, mathematical criteria, and engineering judgment is required for choosing such variables. Srinivasan and Qian20 proposed a state-specific key variable selection method for multistate process monitoring.

© 2013 American Chemical Society
These key variable selection methods have significantly facilitated the development of chemical process monitoring; however, they focus on the raw process variables, not on the key latent factors extracted from the original process data.

When a PCA model is used, the monitoring space is divided into two subspaces, known as the dominant subspace and the residual subspace. The T2 statistic and the Q statistic (also known as the squared prediction error, SPE) are constructed to interpret the mean and variance information of a process in the two subspaces, respectively.21,22 When a fault occurs in the process, the useful information may be reflected in the two spaces, giving four possible monitoring outcomes: (i) neither T2 nor Q shows a fault; (ii) Q shows a fault while T2 does not; (iii) T2 shows a fault while Q does not; (iv) both T2 and Q show a fault. Although substantial research on PCA-based process monitoring and quality control can be found in the literature, fundamental theoretical analysis of these four kinds of detection behavior and of the performance of PCA is insufficient. Most studies treat the four situations only qualitatively, regarding scenarios ii, iii, and iv as process faults and scenario i as normal because both the T2 and Q statistics are under control. Wang et al.23 studied the two statistics and pointed out that the detection behavior of PCA is complex and needs further study; however, to the best of our knowledge, the detectability of the statistics is seldom discussed. Undetectable process changes and faults have not gained enough attention, and the issue of information loss, a possible reason for undetectable cases, has not been studied. In PCA process monitoring, the T2 statistic measures the variation directly along the directions of the principal components; however, which directions will be influenced by a fault is indefinite (i.e., fault information has no definite mapping to PCs). Therefore, the fault information may be divided and mapped into both subspaces when the first several PCs are selected for monitoring. If the useful information of a fault is quite limited, such information loss can be fatal, because some fault information is suppressed or submerged in an unmatched subspace, and this can directly lead to fault nondetection. It is therefore important to select the fault-sensitive principal components and concentrate the useful fault information into a specific subspace for process monitoring.

Fault diagnosis is an important task of process monitoring because it is desirable to find the underlying cause of a fault. Currently, the most widely used approach based on PCA models is contribution plots.24 Contribution plots are easy to generate, require no prior process knowledge, and show the contribution of each process variable to the observed statistics. The approach quantifies the contribution of each process variable to the score of each individual principal component; the contributions of each process variable to the principal components that are out of control are summed and termed the variable contribution. The process variable with a large contribution is regarded as the likely root cause of the fault. Recently, many fault diagnosis methods have been reported,25−29 which have improved fault diagnosis performance significantly. However, diagnosing a fault remains challenging, especially when the number of process variables is large and the process is highly complex. Moreover, many of the measured variables may deviate from their set points for only a short time, and the fault may be disguised by the control loops. It is therefore important to detect the fault in a timely manner and capture the most useful fault information for diagnosis. Additionally, if the useful information can be concentrated into one subspace, it becomes easier to accurately find the root cause of the fault and to provide pertinent guidance for subsequent operations.

In this paper, process monitoring based on sensitive principal component analysis (SPCA) is introduced to improve process monitoring performance. The SPCA method takes the information of abnormal observations into account, but it does not need historical fault data. For online monitoring, the principal components are selected objectively according to the change rate of T2, which is an important indicator of incoming faults. This method can effectively concentrate the useful information into one subspace, handle the situation of information loss, and reduce the nondetection rate significantly. The rest of this paper is structured as follows. First, the PCA model used in process monitoring is briefly reviewed and the monitoring behavior of the T2 statistic is illustrated with a numerical process. Second, sensitive principal component analysis is introduced as a novel process monitoring method, followed by fault identification and diagnosis methods based on SPCA. In section 4, the Tennessee Eastman process (TEP) is employed to demonstrate the effectiveness of the method, and the monitoring results based on SPCA are presented. Finally, conclusions are drawn in section 5.

Received: March 31, 2012. Revised: October 29, 2012. Accepted: January 7, 2013. Published: January 7, 2013.
dx.doi.org/10.1021/ie3017016 | Ind. Eng. Chem. Res. 2013, 52, 1635−1644

2. PROCESS MONITORING BASED ON PCA
2.1. Principal Component Analysis. Principal component analysis is concerned with explaining the variance−covariance structure of a set of variables through a few linear combinations of these variables. Its general objectives are data reduction and interpretation.30 Algebraically, principal components are particular linear combinations of the s random variables x1, x2, ..., xs. Geometrically, these linear combinations represent the selection of a new coordinate system obtained by rotating the original coordinate system composed of x1, x2, ..., xs. The new coordinate axes represent the directions with maximum variability and provide a simpler and more parsimonious description of the covariance structure. The PCA model can be obtained via singular value decomposition (SVD). Let X ∈ R^(N×s) denote a scaled data matrix with zero mean and unit variance, where N is the number of samples and s is the number of process variables. On the basis of the SVD algorithm, the matrix X can be decomposed as

X = TP^T + E = X̂ + E    (1)

where T ∈ R^(N×k) and P ∈ R^(s×k) are the score matrix and the loading matrix, respectively; k is the number of principal components retained; X̂ ∈ R^(N×s) is the projection of T back into the s-dimensional observation space; and E is the residual matrix. The number of PCs is commonly determined by the CPV method:

(∑_{i=1}^{k} λ_i / ∑_{i=1}^{s} λ_i) × 100% ≥ 85%    (2)
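As a rough illustration (our own sketch, not code from the paper), the SVD-based decomposition of eq 1 and the CPV criterion of eq 2 can be written in Python/NumPy; the synthetic data and function names are ours:

```python
import numpy as np

def pca_cpv(X, cpv_threshold=0.85):
    """Fit a PCA model on scaled data X (N x s) via SVD (eq 1) and pick the
    smallest k whose cumulative percent variance reaches the threshold (eq 2)."""
    N = X.shape[0]
    _, S, Vt = np.linalg.svd(X, full_matrices=False)
    lam = S**2 / (N - 1)                 # score variances (eigenvalues of X^T X/(N-1))
    cpv = np.cumsum(lam) / lam.sum()
    k = int(np.searchsorted(cpv, cpv_threshold)) + 1  # first k with CPV >= threshold
    return Vt.T, lam, k

# Synthetic scaled data, standing in for a normal operating set
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # zero mean, unit variance
P, lam, k = pca_cpv(X)
T = X @ P[:, :k]                         # score matrix of the retained PCs
```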

where λ_i is the variance of the ith score vector. When the CPV exceeds 85%, the corresponding number of PCs is determined. The residual matrix captures the variations in the observation space spanned by the loading vectors associated with the s − k smallest singular values. The subspaces spanned by X̂ and E are called the score space and the residual space, respectively. The T2 and Q statistics are constructed to monitor the two spaces.31,32 Given an observation vector x ∈ R^(s×1), the T2 statistic of the first k PCs can be calculated as

T² = x^T P Λ^(−1) P^T x ≤ δ_T²    (3)

where Λ ∈ R^(k×k) is a diagonal matrix denoting the estimated covariance matrix of the principal component scores; δ_T² = {[(N − 1)(N + 1)k]/[N(N − k)]}F_α(k, N − k) is the threshold of T2 on the condition that the observations in the process are Gaussian distributed, and F_α(k, N − k) is an F-distribution with k and N − k degrees of freedom at significance level α. The Q statistic is the squared 2-norm of the deviation of the observation from the first k PCs. It can be calculated as

Q = e^T e ≤ δ_Q²,  e = (I − PP^T)x    (4)
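A minimal sketch (ours, not the authors' code) of the T2 and Q statistics of eqs 3 and 4, together with the F-distribution threshold for T2; `scipy.stats` is assumed available:

```python
import numpy as np
from scipy import stats

def fit_pca(X):
    """Return loadings P (s x s) and score variances lam from scaled data X."""
    N = X.shape[0]
    _, S, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt.T, S**2 / (N - 1)

def t2_stat(x, P, lam, k):
    """Hotelling T^2 of observation x on the first k PCs (eq 3)."""
    t = P[:, :k].T @ x
    return float(np.sum(t**2 / lam[:k]))

def q_stat(x, P, k):
    """Q (SPE): squared norm of the residual e = (I - P_k P_k^T) x (eq 4)."""
    e = x - P[:, :k] @ (P[:, :k].T @ x)
    return float(e @ e)

def t2_limit(N, k, alpha=0.005):
    """delta_T^2 = [(N-1)(N+1)k / (N(N-k))] * F_alpha(k, N-k)."""
    return (N - 1) * (N + 1) * k / (N * (N - k)) * stats.f.ppf(1 - alpha, k, N - k)
```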

where e is the residual vector, the projection of the observation x onto the residual space; δ_Q² = θ_1{[C_α h_0 (2θ_2)^(1/2)/θ_1] + 1 + [θ_2 h_0 (h_0 − 1)/θ_1²]}^(1/h_0) is the threshold of Q, with θ_i = ∑_{j=k+1}^{s} λ_j^i, h_0 = 1 − (2θ_1θ_3/3θ_2²), and C_α the normal deviate corresponding to the (1 − α) percentile.

2.2. Motivational Example. To analyze the monitoring performance of PCA and illustrate the situation of information loss in PCA monitoring, a simple multivariate process, which was suggested by Ku et al.4 and modified by Lee et al.,33 is employed. The numerical process is as follows:

z(i) = [0.018 −0.191 0.287; 0.847 0.264 0.943; −0.333 0.514 −0.217] z(i − 1) + [1 2; 3 −4; −2 1] u(i − 1)

y(i) = z(i) + v(i)

u(i) = [0.811 −0.226; 0.477 0.415] u(i − 1) + [0.193 0.689; −0.320 −0.749] h(i − 1)

The input h is a random vector whose elements are uniformly distributed over the interval (−2, 2). The output y equals z plus a random noise vector v; each element of v has zero mean and a variance of 0.1. Both the input u and the output y are measured, but z and h are not. Two hundred samples are generated for analysis, each represented as x(i) = [y^T(i) u^T(i)]^T. The five variables (y1, y2, y3, u1, u2) are scaled to zero mean and unit variance to prevent less important variables with large magnitudes from overshadowing important variables with small magnitudes. The simulated faults for monitoring and diagnosis are created as follows:
Fault 1: a step change of h1 by 3 is introduced at sample 50.
Fault 2: h1 is linearly increased over samples 50−149 by adding 0.05(i − 50) to the h1 value of each sample in this range, where i is the sample number.
Fault 3: a step change of h2 by 1.5 is introduced at sample 50.
The PCA monitoring performance for this process is shown in Figure 1, in which the control limits of the statistics are based on a significance level of 99.5%. Figure 1a exhibits the monitoring charts for fault 1, Figure 1b for fault 2, and Figure 1c for fault 3.

Figure 1. PCA monitoring charts: (a) fault 1, (b) fault 2, (c) fault 3.

Figures 1a and 1b show that faults 1 and 2 can be detected successfully by the PCA method. In Figure 1c, fault 3 appears to be detected, but the nondetection rates and detection delays of both T2 and Q are large. To investigate the cause of the failure in detecting fault 3, the T2 statistic of each principal component is calculated. The T2 of the mth principal component is constructed as

T_m² = x^T p_m (λ_m)^(−1) p_m^T x    (5)

where m = 1, 2, ..., 5, p_m is the mth loading vector, and λ_m is the mth eigenvalue of X^T X. In PCA monitoring, the measurements are assumed to be Gaussian distributed and the T2 statistic follows a χ2 distribution with k degrees of freedom, where k is the number of principal components retained in the dominant subspace. Since T_m² is scaled by the variance of the corresponding component, it follows a χ2 distribution with one degree of freedom. For the above process, at most five principal components can be used for process monitoring. The monitoring performance of T2 for each principal component is illustrated in Figure 2, in which the principal components are ordered by decreasing variance. The first three principal components are retained in the dominant subspace according to the CPV method. It can be seen from Figure 2 that the first two principal components, which own the largest variances, do not show the largest
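The numerical example above can be simulated directly; the following is our own sketch (initial states, random seed, and the exact timing of the h sequence are assumptions, so results will differ in detail from the paper's figures):

```python
import numpy as np

# State-space matrices as given in the text
A = np.array([[0.018, -0.191,  0.287],
              [0.847,  0.264,  0.943],
              [-0.333,  0.514, -0.217]])
B = np.array([[ 1.0,  2.0],
              [ 3.0, -4.0],
              [-2.0,  1.0]])
C = np.array([[0.811, -0.226],
              [0.477,  0.415]])
D = np.array([[ 0.193,  0.689],
              [-0.320, -0.749]])

def simulate(n=200, fault=None, seed=1):
    """Generate n samples x(i) = [y^T(i) u^T(i)]^T; optional fault 1, 2, or 3."""
    rng = np.random.default_rng(seed)
    z, u = np.zeros(3), np.zeros(2)
    X = np.zeros((n, 5))
    for i in range(n):
        h = rng.uniform(-2.0, 2.0, size=2)      # h uniform on (-2, 2)
        if fault == 1 and i >= 50:
            h[0] += 3.0                          # step change of h1 by 3
        elif fault == 2 and 50 <= i <= 149:
            h[0] += 0.05 * (i - 50)              # ramp on h1
        elif fault == 3 and i >= 50:
            h[1] += 1.5                          # step change of h2 by 1.5
        z = A @ z + B @ u                        # z(i) from z(i-1), u(i-1)
        u = C @ u + D @ h                        # u(i) from u(i-1), h(i-1)
        y = z + rng.normal(0.0, np.sqrt(0.1), 3) # measurement noise, variance 0.1
        X[i] = np.concatenate([y, u])
    return X

X_normal = simulate()
X_fault3 = simulate(fault=3)
```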


change of T2 due to the occurrence of fault 3. In fact, the T2 values of the third and fifth principal components change much more drastically. The key reason is that the dominant principal component directions are defined as those with maximum variance of the normal operating samples, and the fault information is not significantly reflected along these directions. Geometrically, the original axes X are rotated to T, which represents the directions of maximum variance of the normal samples; the matrix P rotates the major axes so that they directly correspond to the elements of T. As an example, the rotation of a 2-dimensional normal sample space, which lies within the black ellipse, is demonstrated graphically in Figure 3. In Figure 3, axes x1 and x2 represent the original data directions, t1 represents the direction with maximum variance of the normal sample, and t2 the direction with the smaller variance. The dotted ellipse shows the distribution of the fault data, which also has its maximum variance along t1 and smaller variance along t2. However, the fault does not change the data distribution along t1 but changes it along t2. This is why T1² is smaller than T2²: the T2² statistic, which measures the change along the second loading vector, can detect the fault, but the T1² statistic cannot. The fault information has no definite mapping onto a particular PC; nevertheless, the monitoring space is divided into two subspaces according to the corresponding variances. This can lead to the loss of useful information, as illustrated in Figure 4. From Figure 4a, we can see that in PCA the third and fifth principal components, which contain the dominant fault information, are divided and reflected into the two subspaces, while the data along the other PCs are rarely affected in their own subspaces. The limited fault information is very likely to be submerged, leading to nondetection of the fault. The motive of SPCA is to select fault-sensitive principal directions and concentrate the fault-relevant information into one subspace, i.e. the fault subspace, as illustrated in Figure 4b. The monitoring performance of the T2 statistic for fault 3 using the third and fifth principal components is shown in Figure 5. From Figure 5, we can see that fault 3 is detected successfully when the relevant information is concentrated into one subspace monitored by the T2 statistic. The nondetection rates and detection delays are reduced significantly compared with those in Figure 1c.

Figure 2. T_m² of each principal component when fault 3 occurs.

Figure 3. Graphical demonstration of the PCA rotation.

Figure 4. Schematics of PCs selection in (a) PCA and (b) SPCA.

Figure 5. T2 statistic for fault 3 using the third and fifth principal components.

3. PROCESS MONITORING BASED ON SENSITIVE PRINCIPAL COMPONENTS
It has been pointed out in the literature that principal components with small variance may be as important as those with large variance.13−16 Moreover, as demonstrated in the preliminary study above, the principal components in a PCA model are not equally sensitive to faults. The indefinite mapping of fault information can result in the loss of relevant information and lead to poor monitoring performance. It is therefore important to select the fault-sensitive principal components and concentrate the fault-relevant information into one subspace for process monitoring. In this section, the sensitive principal component analysis method for both fault detection and diagnosis is introduced and described in detail.

3.1. Sensitive Principal Components. In PCA monitoring, the T2 statistic can indicate the variation directly along the directions of the principal components, and this ability is used to capture the sensitive information in the process. To illustrate the method, suppose there are two sets of normal operating data: training set A (N × s) and training set B (n × s), where N and n are the numbers of samples in each set and s is the number of measured variables. First, a conventional PCA model with r PCs retained (CPV ≥ 99%) is built on training set A. To describe the variation in the direction of the mth principal component, the change rate R_Tm,a² of T_m² at the ath sample point in set B is defined as

R_Tm,a² = T_m,a² / {(1/n) ∑_{j=1}^{n} T_m,j²}    (6)
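The per-component statistic and the change rate of eq 6 amount to two small computations; this sketch uses our own helper names and assumes a fitted PCA model (loadings P, score variances lam):

```python
import numpy as np

def tm2_per_pc(x, P, lam, r):
    """Per-component statistics T_m^2 = (p_m^T x)^2 / lambda_m, m = 1..r."""
    t = P[:, :r].T @ x
    return t**2 / lam[:r]

def change_rate(Tm2_cur, Tm2_B):
    """Eq 6: R_{Tm,a}^2 = T_{m,a}^2 / [(1/n) sum_j T_{m,j}^2].

    Tm2_cur: (r,) per-component T^2 of the current sample;
    Tm2_B:   (n, r) per-component T^2 over the normal reference set B."""
    return Tm2_cur / Tm2_B.mean(axis=0)
```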

where T_m,a² is the T2 statistic of the mth (m = 1, 2, ..., r) principal component at the ath sample point in set B, and n is the number of observations in set B. R_Tm,a² directly describes the change of the process data along the mth principal component; if the process is in normal condition, the value of R_Tm,a² should be below the control limit CL_m. The control limit CL_m of R_Tm² can be determined through kernel density estimation (KDE),34,35 an easy and effective approach for nonparametric density estimation. A univariate kernel estimator with kernel K is defined as

f̄(R) = (1/(nd)) ∑_{a=1}^{n} K{(R − R(a))/d}    (7)

where R is the data point under consideration; R(a) is an observation value from the data set; d is the window width (also known as the smoothing parameter); n is the number of observations; and K is the kernel function. The kernel function K determines the shape of the smooth curve and satisfies the condition

∫_{−∞}^{+∞} K(R) dR = 1    (8)

There are a number of kernel functions, of which the Gaussian kernel is the most commonly used; a Gaussian kernel is adopted in this study. In KDE, the window width d usually has a crucial influence on the performance of the density estimation. If d is too small, the density estimator has sharp peaks positioned at the sample points; if d is too large, the density estimate is overly smooth and structure in the probability density is lost. The optimal choice of d depends on several factors, such as the number of data points, the data distribution, and the choice of kernel function. The choice of d has been studied extensively, and it is suggested to find a proper empirical value for each specific case.36 Because training data under normal conditions are easy to obtain, with sufficient information in the training data a satisfactory KDE performance can usually be guaranteed. In this study, the threshold of R_Tm² is determined through the following steps: (1) Calculate R_Tm,a² (a = 1, 2, ..., n) for each principal component obtained from sample set A and each sample in set B, where n is the number of observations in B. (2) Use the univariate kernel density estimator to estimate the density function of R_Tm² for each principal component with respect to R_Tm,a² (a = 1, 2, ..., n). (3) Determine the control limit of each component as the value covering 99% of the area under its density function; the control limit of the mth principal component is denoted CL_m.

For online monitoring, the PCs with R_Tm² exceeding their control limits are defined as the sensitive principal components. For example, the mth principal component is regarded as a sensitive principal component on the condition that

R_Tm,cu² ≥ CL_m    (9)

where cu is the current time point during online monitoring. R_Tm,cu² ≥ CL_m may indicate a possible fault in the process at time point cu. The principal component corresponding to the largest value (R^max_Tm,cu²) among the R_Tm,cu² can be regarded as the most sensitive principal component, using a criterion based on the relative ratio of R_Tm,cu² to its control limit:

R^max_Tm,cu² = max{R_T1,cu²/CL_1, R_T2,cu²/CL_2, ..., R_Tr,cu²/CL_r}    (10)

Here max{} is an operator that selects the maximum value.

3.2. Fault Detection with Sensitive Principal Components. Based on the analysis in section 2, the key issue in monitoring a fault is to observe the T2 statistic with the largest change rate, i.e., R^max_Tm,cu². Considering the challenge of process noise in real industry, the mean value of the two largest R_Tm² at the cuth time point is used instead of a single R^max_Tm², i.e., R̄^max_T² = (R_Tm1,cu² + R_Tm2,cu²)/2, where m1 and m2 index the two largest change rates. If R̄^max_T² < CL, the process is regarded as normal. If R̄^max_T² ≥ CL, there might be a fault in the process and further analysis is needed. In the following, a novel process monitoring approach based on sensitive principal components is presented, and its concrete calculation steps are summarized.

Offline modeling: (1) Obtain a normal operational observation set X ∈ R^(N×s), where N is the number of samples and s the number of variables (training set A), as well as another normal set for threshold determination (training set B). (2) Normalize training sets A and B using the mean value and variance of each variable. (3) Obtain the principal components using SVD. By reconstructing X via X = ∑_{i=1}^{s} t_i p_i^T, where t_i is a score vector and p_i a loading vector, determine the first k PCs covering 85% CPV and the first r PCs covering 99% CPV of the normal sample data. (4) Based on training set B, calculate the control limit δ_T² of the T2 statistic for the first k principal components, and the T_m,a² statistic and R_Tm,a² for each component, m = 1, ..., r. (5) Use kernel density estimation to determine the control limit CL_m of each component and the threshold CL for R^max_Tm². Save the mean values and variances of the variables, the loading vectors and eigenvalues of X^T X, δ_T², and CL for future online monitoring.

Online monitoring: (1) Normalize the current time point data using the mean values and variances of the training data. (2) Calculate the T2 statistic for the first k principal components retained, as well as T_m,cu², R_Tm,cu², and R^max_Tm,cu² (m = 1, ..., r) of the current time point data. (3) Determine whether R̄^max_T² runs beyond the control limit CL. If not, repeat steps 1 and 2 to monitor the next time point. If R̄^max_T² ≥ CL, there may be a fault; go to the following steps. (4) Select the sensitive principal components by checking R_Tm,cu² ≥ CL_m. Suppose the number of sensitive principal components is k1 and they are denoted m1, m2, ..., mk1. (5) Calculate the T2 statistic and the control limit of the selected k1 sensitive principal components, represented by T²(k1) and δ_T1² (eq 3). If T²(k1) < δ_T1², go to step 1. If T²(k1) ≥ δ_T1², there is a fault in the process.
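The KDE control limit and one online decision step can be sketched as follows (our own code; `scipy.stats.gaussian_kde` chooses the bandwidth d automatically by Scott's rule, whereas the paper tunes d empirically, so the limits are only approximations):

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_limit(r_samples, coverage=0.99):
    """Control limit covering `coverage` of the estimated density of R_{Tm}^2
    (eqs 7-8, with a Gaussian kernel)."""
    kde = gaussian_kde(r_samples)
    grid = np.linspace(r_samples.min(), r_samples.max() * 3.0, 4000)
    cdf = np.cumsum(kde(grid))
    cdf /= cdf[-1]                              # normalize to a discrete CDF
    return float(grid[np.searchsorted(cdf, coverage)])

def online_step(R_cur, CL_m, CL):
    """One online decision (steps 3-4 of section 3.2): compare the mean of the
    two largest R_{Tm}^2 with CL, then pick PCs with R_{Tm,cu}^2 >= CL_m."""
    two_largest = np.sort(R_cur)[-2:]
    if two_largest.mean() < CL:
        return False, []                         # process regarded as normal
    sensitive = np.where(R_cur >= CL_m)[0]       # sensitive PCs (eq 9)
    return True, sensitive.tolist()
```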

dx.doi.org/10.1021/ie3017016 | Ind. Eng. Chem. Res. 2013, 52, 1635−1644

Industrial & Engineering Chemistry Research

Article

Figure 6. RT12, cumulative probability density of RT12, and CLm of the TE process.

Figure 7. Variances and RTm2 of principal components for detecting fault 2.

3.3. Fault Identification and Diagnosis with Sensitive Principal Components. Once a fault has been detected, the next step is to determine the cause of the out-of-control status. A method based on the contribution plots of SPCA, in which the most useful information is concentrated, is used in this study. Moreover, a novel metric for fault diagnosis, termed the sensitive principal component similarity rate, is also presented in this section.

3.3.1. Fault Identification with Contribution Plots in SPCA. The objective of fault identification is to identify the observed variables most closely related to the fault. The procedure of SPCA contribution-plot-based fault identification is summarized as follows: (1) Select the sensitive principal component scores, i.e., the scores with R_Tm² ≥ CL_m. (2) Calculate the contribution of each variable x_j to the out-of-control scores t_i:

cont_{i,j} = (t_i/λ_i) p_{i,j} x_j    (11)

where p_{i,j} is the (i, j)th element of the loading matrix P. (3) If cont_{i,j} is negative (i.e., its sign is opposite to that of the score t_i), set it to zero. (4) Calculate the total contribution of the jth process variable:

CONT_j = ∑_{i=1}^{r} cont_{i,j}    (12)

(5) Plot CONT_j for all the process variables on a single graph. Nomikos suggested that a graph with the contributions of each variable to the T2 statistic can be constructed, with contributions either positive or negative; however, Kourti and Macgregor suggested that the negative ones should be set to zero.6,37,38 Both methods have been widely used in the literature, and the details are not repeated in this paper.

3.3.2. Fault Diagnosis with SPC Similarity Rate. Assuming the process data collected during a fault are represented by previous fault classes, the objective of fault diagnosis is to classify the current fault data into the correct fault class. In SPCA monitoring, each class of fault has its own sensitive principal components, so a fault-related feature set can be built from the sensitive principal components of historical fault data. Suppose the sensitive principal components of a known fault are the m1th, ..., mnsth, ..., mk1th principal components, and a new fault is detected whose sensitive principal components are the m1th, ..., mnsth, ..., mk2th principal components. The similarity rate between the known fault and the detected fault is defined as

sr = (ns/k1)(k2/k1) × 100% if k2 ≤ k1;  sr = (ns/k1)(k1/k2) × 100% if k2 > k1    (13)

where ns is the number of sensitive principal components shared by the known fault and the detected fault, k1 is the number of sensitive principal components of the known fault, and k2 is the number of sensitive principal components of the detected fault. After calculating the similarity rate sr between the current fault and all the existing fault classes, the detected fault is assigned to the known fault class with the largest sr.

Figure 8. Online monitoring results for fault 2: (a) online monitoring based on PCA model and (b) online monitoring based on SPCA model.
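A compact sketch of eqs 11−13 (our own helper names; the loading matrix P and score variances lam are assumed to come from a fitted PCA model):

```python
import numpy as np

def contributions(x, P, lam, sensitive):
    """Eqs 11-12: variable contributions summed over the sensitive PC scores;
    negative contributions are set to zero, as in the text."""
    CONT = np.zeros(len(x))
    for i in sensitive:
        t_i = P[:, i] @ x
        c = (t_i / lam[i]) * P[:, i] * x    # cont_{i,j} = (t_i/lambda_i) p_{i,j} x_j
        c[c < 0] = 0.0
        CONT += c
    return CONT

def similarity_rate(known, detected):
    """Eq 13: similarity rate between two sets of sensitive PC indices."""
    ns = len(set(known) & set(detected))
    k1, k2 = len(known), len(detected)
    if k2 <= k1:
        return ns / k1 * (k2 / k1) * 100.0
    return ns / k1 * (k1 / k2) * 100.0
```

The detected fault would then be assigned to the known class with the largest `similarity_rate`.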


Figure 9. Online monitoring results for fault 4: (a) online monitoring with PCA and (b) online monitoring with SPCA.

Figure 10. Online monitoring results for fault 5: (a) online monitoring with PCA and (b) online monitoring with SPCA.

4. CASE STUDIES ON THE TE PROCESS
The Tennessee Eastman process is a benchmark case in process engineering, developed by Downs and Vogel.39 It consists of five major unit operations: a reactor, a product condenser, a vapor−liquid separator, a recycle compressor, and a product stripper. Two products are produced by two simultaneous gas−liquid exothermic reactions, and a byproduct is generated by two additional exothermic reactions. The process has 12 manipulated variables, 22 continuous process measurements, and 19 compositions, as listed in Supporting Information Tables 1, 2, and 3. The fault simulator can generate 21 different types of faults, as shown in Table 4 of the Supporting Information. All the process measurements are contaminated by Gaussian noise. Once a fault enters the process, it affects almost all state variables. The base control scheme for the TE process is shown in Figure 1 of the Supporting Information, and the simulation code for the open loop can be downloaded from http://brahms.scs.uiuc.edu.26,40 The second plantwide control structure described in Lyman's work41 is implemented here to simulate realistic (closed-loop) conditions. To build the monitoring models, a normal process data set (500 samples) was collected under the base operation. A set of 21 programmed faults (default values) was simulated, and the corresponding process data were collected for testing. The simulations were run in a Matlab 7.6.0 (2008a) environment. In total, 52 variables, composed of the 22 continuous process measurements, 19 compositions, and 11 manipulated variables (the agitation speed was not included because it was not manipulated), are employed for this case study. All the faults are introduced into the process at the 160th time point. As an example, Figure 6 shows R_T1², the cumulative probability density of R_T1², the control limits CL_m of the 52 principal components, and the threshold CL of R_Tm² under the normal operating condition. The control limits CL_m of the 52 principal components correspond to 99% of the area under their density functions (the window width d in KDE is an optimized value obtained with the Matlab KDE toolbox; the optimal value is 0.074 for the TE process), and the threshold CL of R_Tm² is 1.389 for the TE process.

4.1. Fault Detection with SPCA in the TE Process. 4.1.1. Case Study on Fault 2. In the TE process, fault 2 is a step change in the composition of the inert B; it is employed to demonstrate the online monitoring performance of the SPCA model. The variances in PCA and the R_Tm² of each principal component when fault 2 occurs are shown in Figure 7. It can be seen from Figure 7 that the first six principal components present larger variances, while the T_m² statistics of the 8th, 9th, 38th, and 44th principal components change more markedly when fault 2 occurs. The monitoring results for fault 2 using SPCA are shown in Figure 8. The T2 values for SPC are calculated using the sensitive principal component model, and R^max_Tm² is shown as the maximum change rate of T_m² (MRT_m²). Comparing Figure 8a and b, we can see that the T2 values calculated using the PCA model do not

Table 1. Nondetection Rates/Detection Delays (Minutes) for the Testing Set

fault no. | PCA T2     | PCA Q      | DPCA T2    | DPCA Q    | SPCA SPC T2 | SPCA MRTm2
1         | 0.008/21   | 0.003/9    | 0.006/18   | 0.005/15  | 0.006/3     | −/3
2         | 0.020/51   | 0.014/36   | 0.019/48   | 0.015/39  | 0.014/36    | −/36
3         | 0.998/−    | 0.991/−    | 0.991/−    | 0.990/−   | 0.972/120   | −/120
4         | 0.956/−    | 0.038/9    | 0.939/453  | 0.019/9   | 0.019/9     | −/9
5         | 0.775/48   | 0.746/3    | 0.758/6    | 0.748/6   | 0.001/3     | −/3
6         | 0.011/30   | 0/3        | 0.013/33   | 0/3       | 0/3         | −/3
7         | 0.085/3    | 0/3        | 0.159/3    | 0/3       | 0/3         | −/3
8         | 0.034/69   | 0.024/60   | 0.028/69   | 0.025/63  | 0.034/60    | −/60
9         | 0.994/−    | 0.981/−    | 0.995/−    | 0.994/−   | 0.981/2190  | −/2190
10        | 0.666/288  | 0.659/147  | 0.580/303  | 0.665/150 | 0.083/69    | −/69
11        | 0.794/912  | 0.356/33   | 0.801/585  | 0.193/21  | 0.335/18    | −/18
12        | 0.029/66   | 0.025/24   | 0.013/9    | 0.024/24  | 0.013/6     | −/6
13        | 0.060/147  | 0.045/111  | 0.049/135  | 0.049/120 | 0.052/111   | −/111
14        | 0.158/12   | 0/3        | 0.061/18   | 0.002/3   | 0.002/6     | −/6
15        | 0.988/−    | 0.973/2220 | 0.964/−    | 0.976/−   | 0.956/2010  | −/2010
16        | 0.834/936  | 0.755/591  | 0.783/597  | 0.708/588 | 0.097/24    | −/24
17        | 0.259/87   | 0.108/75   | 0.240/84   | 0.053/72  | 0.053/66    | −/66
18        | 0.113/279  | 0.101/252  | 0.111/279  | 0.100/252 | 0.113/231   | −/231
19        | 0.996/−    | 0.873/−    | 0.735/−    | 0.849/246 | 0.149/30    | −/30
20        | 0.701/261  | 0.550/261  | 0.490/267  | 0.248/252 | 0.248/195   | −/195
21        | 0.736/1689 | 0.570/855  | 0.644/1566 | 0.558/858 | 0.686/1551  | −/1551
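The delay convention behind Table 1 (a fault is declared only after six consecutive samples exceed the control limit, with one TE sample every 3 min) can be sketched as follows. This is a minimal illustration, not the paper's Matlab code; the function and variable names are invented:

```python
import numpy as np

def detection_stats(statistic, threshold, fault_start, sample_minutes=3, consecutive=6):
    """Nondetection rate and detection delay (minutes) for one monitoring statistic.

    A fault is declared only after `consecutive` successive samples exceed the
    control limit; the delay runs from the stipulated fault location to the
    first sample of that confirming run.
    """
    alarms = np.asarray(statistic)[fault_start:] > threshold
    nondetection_rate = 1.0 - alarms.mean()  # fraction of faulty samples below the limit
    run = 0
    for i, alarm in enumerate(alarms):
        run = run + 1 if alarm else 0
        if run == consecutive:
            # the confirming run starts at relative sample i - consecutive + 1 (0-based)
            return nondetection_rate, (i - consecutive + 2) * sample_minutes
    return nondetection_rate, None  # fault never confirmed

# toy trace: the statistic settles above the limit two samples after the fault enters
stat = [0.5] * 10 + [0.4, 0.6] + [1.5] * 10
rate, delay = detection_stats(stat, threshold=1.0, fault_start=10)
```

With this toy trace the alarm run begins at the third faulty sample, giving a 9 min delay, and the two sub-threshold samples raise the nondetection rate to 2/12.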

show a good performance, while those obtained with SPCA detect the fault successfully.

4.1.2. Case Study on Fault 4. Fault 4 in the TE process involves a step change in the reactor cooling water inlet temperature. When the fault occurs, the reactor cooling water flow rate and the reactor temperature are affected while the other 50 variables remain steady. Fault detection performances of the PCA and SPCA methods are shown in Figure 9. The T2 statistics of PCA do not give a clear indication of the fault, while those based on SPCA detect it successfully.

4.1.3. Case Study on Fault 5. Fault 5 in the TE process involves a step change in the condenser cooling water inlet temperature. A major effect of the fault is a step change in the condenser cooling water flow rate. When the fault occurs, the flow rate of the outlet stream from the condenser to the vapor/liquid separator also increases, which results in an increase in temperature. Monitoring results using PCA and SPCA are both shown in Figure 10, and both methods detect fault 5 successfully. However, as Figure 10a shows, the classical PCA method gradually comes to tolerate the fault as time goes on, causing a high nondetection rate. Superior to PCA, the SPCA method retains its fault detection ability, which further demonstrates the effectiveness of the T2 statistic of the SPCs.

The nondetection rates and detection delays for each fault in the TE process are summarized in Table 1. In computing the detection delays, a fault is indicated only when six consecutive measurement values have exceeded the threshold, and the detection delay is defined as the distance between the location of the stipulated fault and the first time point exceeding the threshold. Faults 3, 9, and 15 are excluded from the comparison because no observable change in the mean or the variance can be detected by visually comparing the plots of each observation variable under these faults with the plots under the normal condition. Table 1 shows that SPCA efficiently reduces nondetection rates and detection delays compared with the PCA and dynamic PCA (DPCA) methods.4,26

4.2. Fault Identification and Diagnosis with SPCA in the TE Process. The contribution plots for fault 5 using PCA and SPCA are shown in Figure 11.

Figure 11. Contribution plots for fault 5 using PCA and SPCA methods.

Both the PCA and SPCA methods indicate the sharp change in the condenser cooling water flow when fault 5 occurs. However, the PCA method fails to indicate the change in the separator water outlet temperature, which is also directly related to the fault. The sensitive principal components concentrate more of the useful information needed to identify the relationship between variables and the process fault.

For fault diagnosis, process data contaminated with all 21 fault types are assumed to be available. The sensitive principal components of each fault can then be obtained, and the SPC set for the TE process can be built. The sensitive principal components of the faults in the TE process are presented in Table 2. During online monitoring, the similarity rates between the current SPCs and the stored SPC set of each known fault are calculated. Here fault 2 is used as an example. When fault 2 is detected, the process data, the fault-related information, and the corresponding solution are stored in the database, and its sensitive principal components, the 8th, 9th, 38th, and 44th, are determined and saved in the fault feature set. During online process monitoring, a fault is identified as fault 2 if the detected




Table 2. Sensitive Principal Components for the Testing Set

fault no. | sensitive principal components
1         | 20, 43, 44, 45
2         | 8, 9, 38, 44
3         | −
4         | 17, 20
5         | 41, 42, 47
6         | 19−40, 42, 44−50
7         | 13, 17, 23, 29, 31, 35, 36, 38, 39, 41−46, 48, 49
8         | 41, 43, 44, 46, 48
9         | −
10        | 39, 47
11        | 17, 20, 27, 28, 43, 45
12        | 15, 41, 44, 47, 48
13        | 3, 6, 28, 40, 44, 49, 50
14        | 17, 18, 25, 43, 44, 45
15        | −
16        | 39, 46
17        | 45, 47
18        | 5, 25, 42
19        | 17, 18, 34, 42, 46, 48, 49
20        | 20, 27, 49, 50
21        | 14, 25, 29, 36
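Signatures like those in Table 2 come from scoring each principal component individually against faulty data. The sketch below is a simplified stand-in for the paper's RTm2 criterion: an empirical 99% quantile replaces the KDE-based control limit CLm, and all function and variable names are invented for illustration:

```python
import numpy as np

def select_sensitive_pcs(X_normal, x_fault, quantile=0.99):
    """Flag principal components whose single-component T_m^2 statistic is
    unusually large for the faulty sample relative to normal operation.

    Illustrative stand-in for the paper's RTm2 criterion: the empirical
    `quantile` of each component's T_m^2 over the normal data replaces the
    KDE-based control limit CLm.
    """
    mu = X_normal.mean(axis=0)
    sigma = X_normal.std(axis=0, ddof=1)
    Xs = (X_normal - mu) / sigma                     # autoscaled training data
    _, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    lam = s ** 2 / (Xs.shape[0] - 1)                 # eigenvalues of the sample covariance
    T_normal = (Xs @ Vt.T) ** 2 / lam                # per-component T_m^2, each sample
    limit = np.quantile(T_normal, quantile, axis=0)  # empirical per-component limit
    t_fault = ((x_fault - mu) / sigma) @ Vt.T
    T_fault = t_fault ** 2 / lam
    return np.where(T_fault > limit)[0] + 1          # 1-based component indices

# hypothetical data: 500 normal samples of 5 variables, then a large step in variable 1
rng = np.random.default_rng(1)
X_normal = rng.standard_normal((500, 5))
x_fault = np.array([10.0, 0.0, 0.0, 0.0, 0.0])
spcs = select_sensitive_pcs(X_normal, x_fault)
```

A step this large excites at least one component well beyond its normal-operation limit, whereas an in-control sample yields an empty SPC set.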


ASSOCIATED CONTENT

*S Supporting Information

Tables 1−4 as mentioned in the text and base control scheme for the Tennessee Eastman process. This information is available free of charge via the Internet at http://pubs.acs.org/.



AUTHOR INFORMATION

Corresponding Author

*E-mail address: [email protected]. Mailing address: East China University of Science and Technology, P.O. Box 293, MeiLong Road no. 130, Shanghai 200237, P. R. China.

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

The authors gratefully acknowledge the support from the following foundations: National Natural Science Foundation of China (21176073), Doctoral Fund of Ministry of Education of China (20090074110005), Program for New Century Excellent Talents in University (NCET-09-0346), "Shu Guang" project (09SG29), 973 project (2012CB721006), and the Fundamental Research Funds for the Central Universities.

fault has the same sensitive principal components as fault 2, and the corresponding solution can be suggested. It should be noted, however, that some faults change only the variable correlations and do not influence any distribution direction. In such cases, no sensitive PCs can be found and the SPCA method would fail to detect the fault. Although we have examined all the faults in the TE process and found no such case (the nondetection of faults 3, 9, and 15 can be partially ascribed to the fact that their Q statistics do not exceed the control limits and the variable correlations are not saliently changed), this situation deserves attention and further study. Sensitive principal components contain plentiful relevant information for fault capture and diagnosis. Nevertheless, further work is needed to reinforce the method's performance, such as analyzing the change tendency of each sensitive principal component. With more properties of the sensitive principal components taken into consideration, the characteristic of a fault could be revealed more clearly and the diagnosis facilitated.
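The SPC-matching step described above can be sketched as follows. The Jaccard-style ratio and the excerpted signature dictionary are illustrative assumptions; the paper defines the similarity rate sr in terms of ns, k1, and k2 (see the Nomenclature):

```python
def similarity_rate(current_spcs, stored_spcs):
    """Overlap between a detected fault's SPC set and a stored signature.

    The paper's expression combines ns (shared SPCs) with the set sizes
    k1 and k2; the Jaccard-style ratio below is one illustrative choice.
    """
    cur, ref = set(current_spcs), set(stored_spcs)
    ns = len(cur & ref)  # number of shared sensitive PCs
    return ns / len(cur | ref) if cur | ref else 0.0

# excerpt of the Table 2 signatures, keyed by fault number
SPC_SET = {
    2: {8, 9, 38, 44},
    5: {41, 42, 47},
    10: {39, 47},
}

def diagnose(current_spcs):
    """Return the known fault whose SPC signature matches best."""
    return max(SPC_SET, key=lambda fault: similarity_rate(current_spcs, SPC_SET[fault]))
```

For example, a detected fault with SPCs {8, 9, 38, 44} matches the fault 2 signature exactly, while {41, 47} is closest to fault 5.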



5. CONCLUSIONS

In this paper, a novel chemical process monitoring method, termed sensitive principal component analysis (SPCA), has been proposed to improve PCA-based process monitoring performance. Considering the drawback of the PCA-based detection method, the behavior of the T2 statistic is analyzed to unveil the cause of the loss of fault-related information. SPCA examines the T2 statistic of each principal component and locates the most sensitive principal components, which are then applied to determine whether a fault is present in the process. It not only makes use of the information in normal observations but also takes fault information into consideration, which efficiently prevents the loss of fault-relevant information. Besides, each fault has its corresponding sensitive principal components (SPCs), and a method based on the SPC similarity rate is proposed for fault diagnosis. Moreover, SPCA concentrates the most useful information into one subspace, which significantly improves the effect of the contribution plots method on root-cause identification. The proposed fault detection and diagnosis methods have been evaluated on the Tennessee Eastman process. The results presented in the case study indicate that the SPCA method provides superior fault detection and diagnosis power compared with the classical PCA-based methods. However, the proposed method is limited to linear processes. Future work could focus on extending SPCA to nonlinear process monitoring.



NOMENCLATURE

a = current sample point in the training set
CL = threshold of maximum RTm2
CLm = threshold of RTm2
cu = current sample point during online monitoring
d = window width in the kernel estimator
E = residual matrix
f̄ = kernel estimator
K = kernel function
k = number of principal components
k1, k2 = numbers of sensitive principal components
N = number of samples in training set A
n = number of samples in training set B
ns = number of identical sensitive principal components
P = loading matrix
pm = mth loading vector
RTm2 = change rate of Tm2
RTm,max2 = maximum value of RTm2
s = number of variables
sr = similarity rate
t = principal component
Tm2 = T2 statistic along the mth principal component
α = level of significance
δQ2 = threshold of the Q statistic
δT2 = threshold of the T2 statistic
λm = mth eigenvalue of XTX
Λ = estimated covariance matrix of principal component scores

REFERENCES

(1) Elshenawy, L. M.; Yin, S.; Naik, A. S.; Ding, S. X. Efficient recursive principal component analysis algorithms for process monitoring. Ind. Eng. Chem. Res. 2009, 49, 252−259. (2) Ge, Z.; Yang, C.; Song, Z. Improved kernel PCA-based monitoring approach for nonlinear processes. Chem. Eng. Sci. 2009, 64, 2245−2255. (3) Kresta, J. V.; Macgregor, J. F.; Marlin, T. E. Multivariate statistical monitoring of process operating performance. Can. J. Chem. Eng. 1991, 69, 35−47.


(4) Ku, W.; Storer, R. H.; Georgakis, C. Disturbance detection and isolation by dynamic principal component analysis. Chemom. Intell. Lab. Syst. 1995, 30, 179−196. (5) Lu, N.; Wang, F.; Gao, F. Combination method of principal component and wavelet analysis for multivariate process monitoring and fault diagnosis. Ind. Eng. Chem. Res. 2003, 42, 4198−4207. (6) Nomikos, P.; MacGregor, J. F. Monitoring batch processes using multiway principal component analysis. AIChE J. 1994, 40, 1361−1375. (7) Srinivasan, R.; Wang, C.; Ho, W.; Lim, K. Dynamic principal component analysis based methodology for clustering process states in agile chemical plants. Ind. Eng. Chem. Res. 2004, 43, 2123−2139. (8) Valle, S.; Li, W.; Qin, S. J. Selection of the number of principal components: the variance of the reconstruction error criterion with a comparison to other methods. Ind. Eng. Chem. Res. 1999, 38, 4389− 4401. (9) Zwick, W. R.; Velicer, W. F. Comparison of five rules for determining the number of components to retain. Psych. Bull. 1986, 99, 432. (10) Wold, S. Cross-validatory estimation of the number of components in factor and principal components models. Technometrics 1978, 20, 397−405. (11) Qin, S. J.; Dunia, R. Determining the number of principal components for best reconstruction. J. Process Control 2000, 10, 245− 250. (12) Li, Y.; Tang, X. C. Improved performance of fault detection based on selection of the optimal number of principal components. Acta Automatica Sinica 2009, 35, 1550−1557. (13) Jolliffe, I. T. A note on the use of principal components in regression. Appl. Statist. 1982, 300−303. (14) Smith, G.; Campbell, F. A critique of some ridge regression methods. J. Am. Stat. Assoc. 1980, 75, 74−81. (15) Kung, E. C.; Sharif, T. A. Regression forecasting of the onset of the Indian summer monsoon with antecedent upper air conditions. J. Appl. Mete. 1980, 19, 370−380. (16) Hill, R. C.; Fomby, T. B.; Johnson, S. Component selection norms for principal components regression. 
Commun. Stat.-Theor. M. 1977, 6, 309−334. (17) Arbel, A.; Rinard, I. H.; Shinnar, R. Dynamics and control of fluidized catalytic crackers. 3. designing the control system: Choice of manipulated and measured variables for partial control. Ind. Eng. Chem. Res. 1996, 35, 2215−2233. (18) Tyréus, B. D. Dominant variables for partial control. 1. A thermodynamic method for their identification. Ind. Eng. Chem. Res. 1999, 38, 1432−1443. (19) Kothare, M. V.; Shinnar, R.; Rinard, I.; Morari, M. On defining the partial control problem: Concepts and examples. AIChE J. 2000, 46, 2456−2474. (20) Srinivasan, R.; Qian, M. State-specific key variables for monitoring multi-state processes. Chem. Eng. Res. Des. 2007, 85, 1630−1644. (21) Chen, Q.; Kruger, U.; Meronk, M.; Leung, A. Synthesis of T2 and Q statistics for process monitoring. Control Eng. Pract. 2004, 12, 745− 755. (22) Jackson, J. E. A user’s guide to principal components; Wiley: New York, 1991. (23) Wang, H.; Song, Z.; Li, P. Fault detection behavior and performance analysis of principal component analysis based process monitoring methods. Ind. Eng. Chem. Res. 2002, 41, 2455−2464. (24) Miller, P.; Swanson, R.; Heckler, C. E. Contribution plots: a missing link in multivariate quality control. Appl. Math. Comput. Sci. 1998, 8, 775−792. (25) Chiang, L. H.; Kotanchek, M. E.; Kordon, A. K. Fault diagnosis based on Fisher discriminant analysis and support vector machines. Comput. Chem. Eng. 2004, 28, 1389−1401. (26) Chiang, L. H.; Russell, E.; Braatz, R. D. Fault detection and diagnosis in industrial systems; Springer Verlag: London, 2001. (27) Lee, G.; Han, C.; Yoon, E. S. Multiple-fault diagnosis of the Tennessee Eastman process based on system decomposition and dynamic PLS. Ind. Eng. Chem. Res. 2004, 43, 8037−8048.

(28) Maurya, M. R.; Rengaswamy, R.; Venkatasubramanian, V. Fault diagnosis using dynamic trend analysis: A review and recent developments. Eng. Appl. Artif. Intel. 2007, 20, 133−146. (29) Rusinov, L.; Rudakova, I.; Remizova, O.; Kurkina, V. Fault diagnosis in chemical processes with application of hierarchical neural networks. Chemom. Intell. Lab. Syst. 2009, 97, 98−103. (30) Johnson, R. A. Applied multivariate statistical analysis; PrenticeHall: New Jersey, 2007. (31) Jackson, J. E. Quality control methods for several related variables. Technometrics 1959, 359−377. (32) Jackson, J. E.; Mudholkar, G. S. Control procedures for residuals associated with principal component analysis. Technometrics 1979, 21, 341−349. (33) Lee, J. M.; Yoo, C. K.; Lee, I. B. Statistical process monitoring with independent component analysis. J. Process Control 2004, 14, 467−485. (34) Scott, D. W. Multivariate density estimation; Wiley Online Library: New York, 1992. (35) Webb, A. R.; Copsey, K. D.; Cawley, G. Statistical pattern recognition; Wiley: New York, 2011. (36) Silverman, B. W. Density estimation for statistics and data analysis; Chapman & Hall/CRC: Boca Raton, 1986. (37) Kourti, T.; MacGregor, J. F. Multivariate SPC methods for process and product monitoring. J. Qual. Technol. 1996, 28, 409−428. (38) Nomikos, P.; MacGregor, J. F. Multivariate SPC charts for monitoring batch processes. Technometrics 1995, 37, 41−59. (39) Downs, J. J.; Vogel, E. F. A plant-wide industrial process control problem. Comput. Chem. Eng. 1993, 17, 245−255. (40) McAvoy, T.; Ye, N. Base control for the Tennessee Eastman problem. Comput. Chem. Eng. 1994, 18, 383−413. (41) Lyman, P. R.; Georgakis, C. Plant-wide control of the Tennessee Eastman problem. Comput. Chem. Eng. 1995, 19, 321−331.
