
Probabilistic Fault Diagnosis Based on Monte Carlo and Nested-Loop Fisher Discriminant Analysis for Industrial Processes

Chunhui Zhao,*,† Wei Wang,† and Furong Gao‡

†State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China
‡Department of Chemical and Biomolecular Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong Special Administrative Region

ABSTRACT: In this study, a new discriminant analysis (DA)-based fault diagnosis method is proposed based on the idea of variable selection to overcome the drawbacks of existing methods. It addresses three issues. First, a prejudgment strategy should be developed that can determine whether it is suitable to perform DA on the fault data. Second, the dependence of the faulty variable-selection performance on the modeling samples should be removed. Third, simple Boolean algebra for reconstruction-based diagnosis may not clearly distinguish between different faults. To achieve these purposes, a Monte Carlo (MC)-based evaluation method is developed to quantify the bias of the fault center and judge whether discriminant analysis can be conducted. Next, a nested-loop Fisher discriminant analysis (NeLFDA) algorithm is used to extract meaningful directions by which to distinguish between normal and abnormal classes. Iterative faulty variable selection is conducted using MC and NeLFDA, in which a stability index of fault-relevant variable significance is defined to isolate the significant faulty variables and reduce the influence of the modeling samples. 2-fold fault diagnosis models are then developed for the general variables and the faulty variables to explore their different variable correlations relative to the normal case, and fault reconstruction is performed for both variable subsets. A probability index is calculated using Bayes' rule for each fault sample to quantitatively evaluate the probability that it belongs to each fault type and to further judge the fault affiliation. The performance of the proposed method is illustrated by applying it to the cigarette-production industrial process.

Received: August 22, 2016. Revised: November 27, 2016. Accepted: November 29, 2016. Published: November 29, 2016.

1. INTRODUCTION

In manufacturing industries and chemical processes, online fault detection and diagnosis,1−9 which are important for both process safety and operation efficiency, have drawn increasing attention over the last few decades. Data-driven statistical methods have emerged as an important tool with which to maximize the productivity and efficiency of industrial processes, where large numbers of process variables are measured and recorded. Because of the high correlations among process variables, dimensionality-reduction techniques10−14 have been widely used. Representative methods are principal component analysis (PCA)10,11 and Fisher discriminant analysis (FDA),12,14 which yield a refined, low-dimensional analysis space. Besides, instead of extracting general variations, some fault-relevant variations can be extracted13 by comparing the relative changes from normal to fault, which highlights the fault information and is thus more sensitive for fault detection.

Although fault-detection methods have been relatively well established, fault isolation still requires deep consideration. The FDA algorithm12 has been reported as one well-known fault classification and diagnosis method.15−18 Since the research work by Chiang et al. in 2000,15 the use of FDA for fault diagnosis of chemical processes has been explored widely. From another perspective, fault diagnosis can be regarded as a classification problem by treating each fault case as one class. By projecting the measurement data onto discriminant directions, samples from different classes are kept as far apart as possible while samples within the same class stay close to each other.

Despite its success in fault diagnosis, the classical FDA algorithm has some recognized problems that may limit its effectiveness. First, for the FDA algorithm itself, the within-class scatter matrix may be singular, which causes calculation problems for the eigenvalue decomposition. Some solutions19−25 have been reported from different perspectives but introduce new undesirable problems. Zhao et al. proposed a nested-loop Fisher discriminant analysis (NeLFDA) algorithm25 to comprehensively handle the problems resulting from a singular within-class scatter matrix and a rank-deficient between-class scatter matrix. Besides, by employing orthogonal transformation, correlations among features are decomposed so that overlapping information between components is avoided. The NeLFDA algorithm thus shows improved discriminant power for classification.



Second, conventional discriminant analysis treats the whole set of measurement variables as a single subject, which does not isolate the specific faulty variables. Some diagnosis methods have focused on analyzing specific variables. Contribution plots26−28 calculate the specific part of each variable in the monitoring statistics, from which the variables that account for large values are isolated as faulty ones highly related to the out-of-control statistics. Their popularity may result from the fact that they do not need any prior fault information. However, this method may give confusing results because it is hard to say whether the contribution of one variable is significant or not. Moreover, the contribution values are calculated along the PCA monitoring directions, which only explore the variation information of the normal data and may not be useful for classifying different fault types. DA considers information from both normal and fault data, so the discriminant directions can better explore the fault effects departing from the normal status. Using an idea similar to the contribution plot, Qin et al.17 identified the major contributing variables to the fault based on FDA directions instead of PCA directions, which, however, does not probe into the relationships among specific variables. Meanwhile, it is limited by the singularity problem of the conventional FDA algorithm.

In general, for each fault case, the variables may be influenced differently. It is recognized that only some specific variables are disturbed significantly, revealing characteristics quite different from those of the normal data, and these variables are deemed to carry significant classification information. Because not all measurement variables are disturbed, the key to producing the most discriminative power for fault diagnosis is to exclude the undesirable influences of the general variables, which stay similar to the normal case and thus may not contain meaningful information about the fault effects. This can be achieved by separating the key faulty variables from the other general ones on the basis of their different relationships with the fault effects. Verron et al.29 performed feature selection based on the mutual information between variables so that discriminant analysis on the selected variables can achieve good classification between different fault classes. However, that work focused on reducing the misclassification error rather than finding all faulty variables for each fault class, so faulty variables shared by fault classes may not be selected if they are not informative for classification. Zhao and Wang30 proposed a NeLFDA-based fault diagnosis method that isolates faulty variables by an iterative variable-selection procedure, in which the process variables related to the fault are identified and ordered by checking their contribution to a predefined fault evaluation index. Its practical application to a real industrial manufacturing process was reported by Wang et al.,31 in which the isolated faulty variables agree well with the real case, as judged by experienced experts.

In this paper, the previous fault diagnosis work30,31 is further extended by the authors. It focuses on solving the following problems. First, discriminant analysis assumes that the data location is biased from the normal to the fault status; however, this assumption is not quantitatively checked (that is, whether it is proper to conduct discriminant analysis). Second, the faulty variable-selection results may depend on the modeling samples. Third, the reconstruction technique can only tell whether the fault model can accommodate the fault characteristics but cannot tell to what extent, so some fault types may not be clearly distinguished from each other if their fault characteristics overlap to some degree.

To address the above issues, on the one hand, the Monte Carlo (MC) technique is used in the procedure of acquiring faulty variables for two purposes. First, a center-bias evaluation method is developed by MC random sampling to define a confidence region that accommodates the normal variation of the data center, which can judge whether discriminant analysis is necessary. Second, a stability index is defined to evaluate variable significance, and the variables with the highest stability are recognized as faulty variables. On the other hand, a probability index is defined to evaluate the extent to which the current fault samples belong to each fault type. Besides, more real faults are considered for the industrial process to demonstrate the practical effectiveness of the proposed method. The major contributions (i.e., the differences from previous work30,31) are as follows: (1) The Monte Carlo principle is used to judge whether it is suitable to perform discriminant analysis on the fault data and which is the most probable faulty variable that should be selected first, which reduces the influence of the modeling samples on the selection results. (2) On the basis of the 2-fold fault diagnosis models, a probabilistic fault diagnosis index is defined to evaluate and compare the probability of the current sample when it is associated with different fault classes, which can further distinguish between different fault types and determine the final fault affiliation.

The remainder of the paper is organized as follows. First, the NeLFDA algorithm is briefly revisited. Subsequently, the proposed algorithm is described, which includes three important parts: variable separation, fault-diagnosis modeling, and online diagnosis. Third, the application to the cigarette-production process is presented to verify the performance of the proposed method, with the results analyzed in detail and compared with other methods. The final conclusions are drawn in the last section.

2. REVISIT OF NELFDA

In this section, the NeLFDA algorithm, whose theoretical part was presented in our previous work,25 is revisited; it is used as the basis of the proposed faulty variable selection based fault diagnosis work. Let Xi(Ni × J) describe a data matrix of class i in which each row denotes an observation of the J process variables. Ni indicates the number of samples in each class. There are C classes, and each class may correspond to historical data collected from a specific fault type. Without a special statement, a vector denotes a column vector. To overcome the problems of conventional FDA, NeLFDA solves two different mathematical optimization problems in the following two steps.

Step 1: The maximization of the between-class scatter is considered in an inner-loop iterative process. It prepares multiple preliminary components, up to the rank of the original within-class scatter matrix, which transfer the within-class scatter matrix into a diagonal one. The specific procedure is implemented as below.


The between-class scatter is maximized to extract the preliminary discriminant components,

J_1(w) = \max_w \, (w^T S_b w)    (1)

where the between-class scatter matrix is calculated from the class-specific means and the total mean of all classes, S_b = \sum_{i=1}^{C} N_i (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})^T, where N_i denotes the number of samples in class i, \bar{x} is the total mean of all classes obtained as the center of all samples, and \bar{x}_i (J × 1) denotes the class-specific mean of class i. The solution converges to the calculation of the eigenvector of the eigenvalue problem

S_b w = \lambda w    (2)

Here, only one eigenvector w is derived. For each class data set and the centered super data set, the class-specific and super components can be calculated, respectively; to guarantee that the within-class scatter can be transformed into a diagonal matrix, data deflation is implemented for each class data set,

t = \bar{X} w, \quad p^T = (t^T t)^{-1} t^T \bar{X}    (3)

E_i = X_i - X_i w p^T    (4)

where the centered super data \bar{X} (\sum_{i=1}^{C} n_i × J) are constructed by stacking the data sets of the different classes X_i(n_i × J) from top to bottom, and p is the super loading vector calculated for \bar{X}. X_i is replaced by E_i to update all related data information. The above procedure is repeated iteratively to calculate the projection directions (i.e., weight vectors) and the preliminary discriminant components. The number of components is set equal to the rank of the original within-class scatter matrix, which results in a weight matrix W(J × M) and a loading matrix P(J × M), where M is the number of retained components. The weight matrix that can directly derive multiple components from each class is then formulated as R = W(P^T W)^{-1}, as derived in the basic PLS algorithm.32

Step 2: The preliminary components are further refined to obtain one final component by maximizing the between-class scatter and minimizing the within-class scatter simultaneously. The information on the final component is subtracted from each class to guarantee that no information is extracted repetitively. The specific procedure is described as below.

First, use X_iR to replace the original class data set. Calculate the sample mean for each class and the total mean of all samples, as well as the between-class scatter matrix (S_b*) and the within-class scatter matrix (S_w*). It is clear that S_b^* = R^T S_b R and S_w^* = R^T S_w R, where S_w = \sum_{i=1}^{C} S_i and S_i = \sum_{x_i \in X_i} (x_i - \bar{x}_i)(x_i - \bar{x}_i)^T. The within-class scatter matrix S_w* in fact converges to a nonsingular diagonal matrix. The final discriminant components are calculated by maximizing the ratio of the between-class scatter to the within-class scatter,

J(\theta) = \frac{w^{*T} S_b^* w^*}{w^{*T} S_w^* w^*} = \frac{w^{*T} R^T S_b R w^*}{w^{*T} R^T S_w R w^*}    (5)

The extraction of the discriminant direction converges to the following eigenvalue decomposition problem,

S_w^{*-1} S_b^* w^* = \lambda w^*    (6)

where w* denotes the optimal discriminant direction. One class-specific final discriminant component can be calculated for each class data X_i, and data deflation is implemented for each class of data to guarantee that the class-specific components are orthogonal with each other:

t_i^* = X_i R w^* = X_i \theta, \quad p_i^{*T} = (t_i^{*T} t_i^*)^{-1} t_i^{*T} X_i, \quad E_i^* = X_i - t_i^* p_i^{*T}

where θ = Rw* is the final weight vector that calculates the discriminant component t_i* directly from each class data X_i, t_i* is one of the class-specific final discriminant components, p_i*(J × 1) is the class-specific loading vector, and E_i* contains the residuals of class data X_i that are irrelevant to t_i*. X_i is replaced by E_i* in the first step to update all relevant data information.

The above two steps are repeated iteratively until the desired number of final discriminant components is derived; because of the data deflation, the components are orthogonal with each other within each class. The procedure outputs class-specific final discriminant components T_i* that distinguish the different classes. The number of retained components R can be determined by checking the discriminant power. The weight matrix Θ(J × R) is obtained by collecting the vectors θ(J × 1) derived from eqs 5 and 6. To directly calculate the discriminant components from each class X_i, the weights R_i*(J × R) can be formulated as R_i* = Θ(P_i*^T Θ)^{-1}. Here, the symbol R is a bold letter, which is different from the italic letter R. For more details, please refer to our previous work.25
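To make the role of the two scatter matrices concrete, the following is a minimal, conventional-FDA-style sketch of the optimization in eqs 5 and 6, written in Python. It is an illustrative assumption, not the authors' implementation or the full NeLFDA algorithm: the pseudo-inverse merely stands in for the singularity handling that NeLFDA achieves through its nested loops and deflation.

```python
# Minimal sketch (not the full NeLFDA algorithm): build the between- and within-class
# scatter matrices and extract discriminant directions by solving Sw^-1 Sb w = lambda w.
# All names below are illustrative assumptions.
import numpy as np

def fisher_directions(class_data, n_components=2):
    """class_data: list of (N_i x J) arrays, one per class."""
    x_bar = np.vstack(class_data).mean(axis=0)          # total mean of all samples
    J = x_bar.size
    S_b = np.zeros((J, J))
    S_w = np.zeros((J, J))
    for X_i in class_data:
        x_bar_i = X_i.mean(axis=0)
        d = (x_bar_i - x_bar)[:, None]
        S_b += X_i.shape[0] * d @ d.T                    # between-class scatter (cf. eq 1)
        S_w += (X_i - x_bar_i).T @ (X_i - x_bar_i)       # within-class scatter
    # pinv guards against the singular S_w that motivates NeLFDA in the first place
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(-eigvals.real)
    return eigvecs.real[:, order[:n_components]]         # J x n_components directions

# Example: two classes whose means differ only in the third variable
rng = np.random.default_rng(0)
Xn = rng.normal(0.0, 1.0, size=(100, 5))
Xf = rng.normal(0.0, 1.0, size=(60, 5)) + np.array([0, 0, 2.0, 0, 0])
W = fisher_directions([Xn, Xf], n_components=1)
print(W.ravel())  # the largest-magnitude weight should fall on the shifted third variable
```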

3. PROPOSED METHOD

Following the theoretical part of the NeLFDA algorithm,25 it is employed here for faulty variable isolation and online fault diagnosis. A total of three issues are addressed during offline model development and online application. One is that a prejudgment strategy should be developed that can determine whether it is suitable to perform DA on the fault data. Second, the dependence of the faulty variable-selection performance on the modeling samples should be removed. Third, the problem imposed by simple Boolean algebra for reconstruction-based diagnosis should be solved. These issues are addressed as below.

3.1. Faulty Variable Selection Based on Monte Carlo and NeLFDA. Here, iterative faulty variable selection is conducted based on the principles of Monte Carlo (MC)33 and NeLFDA. The analysis is conducted for the normal data (Xn(Nn × J)) and one fault case (Xf,i(Nf,i × J), in which subscript i denotes one of the fault cases). The subscript n is used for normal data and subscript f for fault data. They cover the same number of variables (J) but may have different numbers of observations, denoted by Nn and Nf,i, respectively.

The Monte Carlo method, known as a powerful stochastic technique, is often used to investigate problems by employing random numbers and probability statistics. It is also termed the random imitative algorithm. In this work, the MC method is used in the procedure of acquiring faulty variables for the following two purposes.


Industrial & Engineering Chemistry Research First, Monte Carlo based fault center evaluation is performed to judge whether discriminant analysis can be conducted. A certain amount of normal samples, Xs(Ns × J), are randomly selected from Xn(Nn × J), as the normal subsets (where the number of samples in subset Xs(Ns × J) counts for 2/3 of the normal samples of Xn here). The center of each normal subset is also calculated. The random selection procedure is repeated M times (here, M is set with 200), and a matrix of the centers is obtained, X̅ (M × J), where each row vector is the center of subset Xs(Ns × J). Next, the final center μ and the variance σ can be calculated from X̅ (M × J), from which a confidence region is defined as [μ − 3σ,μ + 3σ], which can tell the variation of normal data center. It will be used to check whether the fault variation can be explored by discriminant analysis. If the fault data center is beyond the confidence region for any variable, discriminant analysis can then be performed on Xn and Xf,i for fault variable selection. Second, Monte Carlo based faulty variable evaluation index is defined to reduce dependence on modeling samples and to judge the selection of them. M normal subsets are randomly selected by MC, where the number of samples in each subset counts for 2/3 of the normal samples of Xn and is pairwise analyzed with fault data to extract the discriminant-based fault directions. Along the fault directions, the fault effects of different variables are evaluated in which a large number of relative contribution values are calculated at each time for each variable with randomly selected normal subsets, and then the stability index is defined for each variable. Variables that show high stability are recognized as faulty variables. The evaluation index is described as below. Index 1: Monte Carlo technique is used for random sampling from Xn(Nn × J) so that we can get M normal subsets (Xs(Ns × J)), each composed of some normal samples. By pairwise implementation of the NeLFDA algorithm25 on one normal subset, Xs(Ns × J), that is randomly selected by MC and fault data Xf,i(Nf,i × J), the weights R*i (J × R) are obtained corresponding to the normal class by comparing it with each fault class. It is noted that the italic letter R denotes the number of NeLFDA components, which is different from the bold letter R. The weights for each fault class can also be obtained, R*f,i (J × R). It is noted that for the normal class, when it is associated with different classes of fault data, the extracted Ri* are different and are used as projection directions to discriminate normal and fault classes. R*i are used as the monitoring directions, and along the extracted directions, multiple discriminant components can be extracted for each fault class, which reveal the major information to separate each specific fault class from the normal status, revealing the responsible fault effects, Tf, i* = X f, iR i*

ts, i , m = x s, mT R i* Ds, i , m 2 = (ts, i , m − ts̅ , i)T Σs, i−1(ts, i , m − ts,̅ i)

where, is the discriminant components. ts̅ ,i denotes the mean of Ts,i. Σs,i is the covariance matrix of components (Ts,i), which is diagonal, resulting from the orthogonality of Ts,i. Some elements may approximate zero resulting in singularity problem and can be set to one. The statistic D2 describes the normal variations along different fault directions. For the fault class, the corresponding D2 for each fault sample, which summarizes the variations at fault status along the extracted fault directions, can also be calculated as, t f, i , m = x f, mT R i* Df, i , m 2 = (t f, i , m − ts,̅ i)T Σn , i−1(t f, i , m − ts,̅ i)

(10)

where D2f,i,m in fact reveals the fault version of D2s,i,m that covers the integrated fault information. Index 3: A comparison of D2f,i,m against D2s,i,m, it reveals the changes from the normal status to the fault case. The variables that have significantly contributed to discrimination between normal and fault are regarded as informative ones that are useful for fault classification. The contribution of each variable to D2 statistic is then defined for both normal and fault data, t , i , m = x , mT R i* * * = (t , i , m − ts,̅ i)T Σs, i−1ri , j(x , m , j − xs,̅ j) (11) *, m , j * * where the subscript * can be s for the normal subset or f for the fault case. ri,j, as the jth row vector of R*i , are the weights for the jth variable. x*,m,j denotes the jth variable in each sample, and xs̅ ,j denotes its mean calculated for each normal subset. The contributions of different variables can be summed by C D2 ,

J

J

∑ C D ,*,m,j = ∑ (t*,i ,m − ts̅ ,i)T Σs , i−1ri ,j(x*,m,j − xs̅ ,j) 2

j=1

j=1

= (t , i , m − ts̅ , i)T Σs , i−1(t , i , m − ts̅ , i) * * 2 Index 4: For each normal subset, CD ,s,j(Ns × 1) is available for each variable. The relative variable contribution can be obtained by comparing the variable contribution between normal and fault cases at each time m,
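As an illustration of eqs 9 and 10, the sketch below computes the D² statistics along a given set of projection directions. The random matrix standing in for the pairwise NeLFDA directions R_i* and all helper names are assumptions for illustration, not part of the paper's implementation.

```python
# Hedged sketch of the D^2 statistics in eqs 9 and 10, assuming the NeLFDA projection
# directions are already available (a random stand-in is used here).
import numpy as np

def d2_statistic(X, R, t_bar, Sigma_inv):
    """Mahalanobis-type D^2 of each sample of X along the directions R."""
    T = X @ R                                   # project samples onto the fault directions
    diff = T - t_bar
    return np.einsum('ij,jk,ik->i', diff, Sigma_inv, diff)

rng = np.random.default_rng(1)
Xs = rng.normal(size=(80, 10))                  # one MC normal subset
Xf = rng.normal(size=(50, 10)) + 1.5            # fault data with a mean shift
R_i = rng.normal(size=(10, 3))                  # stand-in discriminant directions

Ts = Xs @ R_i
t_bar = Ts.mean(axis=0)
var = Ts.var(axis=0, ddof=1)
var[var < 1e-12] = 1.0                          # near-zero elements set to one, as in the text
Sigma_inv = np.diag(1.0 / var)                  # diagonal covariance of the components

D2_normal = d2_statistic(Xs, R_i, t_bar, Sigma_inv)   # eq 9
D2_fault = d2_statistic(Xf, R_i, t_bar, Sigma_inv)    # eq 10
print(D2_normal.mean(), D2_fault.mean())              # the fault D^2 should be much larger
```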

Index 3: Comparing D_{f,i,m}^2 against D_{s,i,m}^2 reveals the changes from the normal status to the fault case. The variables that contribute significantly to the discrimination between normal and fault are regarded as informative ones that are useful for fault classification. The contribution of each variable to the D² statistic, C_{D²}, is therefore defined for both the normal and the fault data,

t_{*,i,m} = x_{*,m}^T R_i^*, \quad C_{D^2,*,m,j} = (t_{*,i,m} - \bar{t}_{s,i})^T \Sigma_{s,i}^{-1} r_{i,j} (x_{*,m,j} - \bar{x}_{s,j})    (11)

where the subscript * can be s for the normal subset or f for the fault case, r_{i,j}, the jth row vector of R_i^*, contains the weights of the jth variable, x_{*,m,j} denotes the jth variable in each sample, and \bar{x}_{s,j} denotes its mean calculated for each normal subset. The contributions of the different variables sum to the D² statistic,

\sum_{j=1}^{J} C_{D^2,*,m,j} = \sum_{j=1}^{J} (t_{*,i,m} - \bar{t}_{s,i})^T \Sigma_{s,i}^{-1} r_{i,j} (x_{*,m,j} - \bar{x}_{s,j}) = (t_{*,i,m} - \bar{t}_{s,i})^T \Sigma_{s,i}^{-1} (t_{*,i,m} - \bar{t}_{s,i})

Index 4: For each normal subset, C_{D^2,s,j}(Ns × 1) is available for each variable. The relative variable contribution is obtained by comparing the variable contribution between the normal and fault cases at each time m,

RC_{D^2,f,m,j} = \frac{C_{D^2,f,m,j}}{ctr(C_{D^2,s,j})} = \frac{(t_{f,i,m} - \bar{t}_{s,i})^T \Sigma_{s,i}^{-1} r_{i,j} (x_{f,m,j} - \bar{x}_{s,j})}{ctr(C_{D^2,s,j})}    (12)

where ctr(C_{D^2,s,j}) denotes the control limit of C_{D^2,s,j}. Without prior distribution knowledge, kernel density estimation35 is used to estimate the distribution of C_{D^2,s,j}, from which the control limit at the (1 − α)% confidence level is defined by the point that accounts for the α% tail area. The control limit works as a reference that excludes singular points and thus avoids their influence. The relative contribution evaluates the significance of each variable in uncovering the fault effects that depart from the normal status: the larger the index RC_{D^2,f,m,j} is, the more significant the variable is. For the M normal subsets that are randomly selected by MC, a vector RC_{D^2,f,m,j}(M × 1) is obtained, which can quantitatively measure the fault-relevant significance of each variable j through the significance index

S(RC_{D^2,f,m,j}) = \frac{mean(RC_{D^2,f,m,j})}{std(RC_{D^2,f,m,j})}    (13)

where mean(RC_{D^2,f,m,j}) and std(RC_{D^2,f,m,j}) calculate the mean and the standard deviation of the column vector RC_{D^2,f,m,j}, respectively. The absolute value of the ratio between them evaluates the significance of the variable in describing the fault effects: a high value of S(RC_{D^2,f,m,j}) indicates both a high relative contribution and a low variability among the randomly selected subsets and thus reveals the most significant fault-relevant variable. Besides, an Nf,i-dimensional vector of S(RC_{D^2,f,m,j}) values is available for each variable, with S(RC_{D^2,f,m,j}) calculated for all fault samples; the average level is evaluated by the mean value S̄(RC_{D^2,f,m,j}). The largest mean value S̄(RC_{D^2,f,m,j}) points to the most significant faulty variable.
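The following hedged sketch strings eqs 11−13 together over Monte Carlo subsets for a single fault sample. A simple empirical percentile replaces the KDE-based control limit ctr(·), the directions are random stand-ins for the pairwise NeLFDA weights, and all helper names are assumptions for illustration only; in the paper the index is additionally averaged over all fault samples.

```python
# Illustrative sketch of the relative contribution (eq 12) and stability index (eq 13).
import numpy as np

def variable_contributions(x, R, t_bar, Sigma_inv, x_bar):
    """Per-variable contribution of one sample x to D^2 (eq 11)."""
    t = x @ R
    left = (t - t_bar) @ Sigma_inv              # row vector (t - t_bar)^T Sigma^-1
    return (left @ R.T) * (x - x_bar)           # length-J contributions, summing to D^2

rng = np.random.default_rng(2)
J, M = 10, 200
Xn = rng.normal(size=(600, J))
x_fault = rng.normal(size=J)
x_fault[3] += 4.0                               # variable with index 3 is disturbed

rel_contrib = np.zeros((M, J))
for m in range(M):
    idx = rng.choice(len(Xn), size=int(2 * len(Xn) / 3), replace=False)
    Xs = Xn[idx]                                # one MC normal subset
    R_i = rng.normal(size=(J, 3))               # stand-in for the pairwise NeLFDA directions
    Ts = Xs @ R_i
    t_bar, x_bar = Ts.mean(axis=0), Xs.mean(axis=0)
    Sigma_inv = np.diag(1.0 / Ts.var(axis=0, ddof=1))
    c_normal = np.array([variable_contributions(x, R_i, t_bar, Sigma_inv, x_bar) for x in Xs])
    ctr = np.percentile(c_normal, 99, axis=0)   # percentile control limit in place of the KDE limit
    c_fault = variable_contributions(x_fault, R_i, t_bar, Sigma_inv, x_bar)
    rel_contrib[m] = c_fault / ctr              # eq 12

stability = np.abs(rel_contrib.mean(axis=0) / rel_contrib.std(axis=0))   # eq 13
print(np.argmax(stability))                     # the disturbed variable should typically stand out
```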


Industrial & Engineering Chemistry Research significance of each variable j by defining the significance index as below: S(RC D2 ,f, m , j) =

subset and fault class, respectively. Move back to Step 1 to select the next faulty variable. Iteratively repeat the above steps to find all faulty variables. A pair of stop conditions are defined as described in Steps 1 and 3. For the stop condition shown in Step 1, it is used to judge whether it is necessary to perform discriminant analysis because the deviation of data center is a necessity. For the stop condition shown in Step 3, it is used to check whether the left variables can be accommodated by the model defined for normal data, i.e., whether the fault case presents similar characteristics with those of normal case for the left variables. If all variables have been disturbed, it will converge to the special case with all variables selected. In general, some faulty variables can be separated from general variables, each revealing different variable correlations. Figure 1 depicts the procedure of selecting faulty variables that outputs two variable subsets for each fault class, faulty variables X̃ f,i(Nf × Jf,i) and general variables X̃ f,i(Nf × Jn,i). Jf,i denotes the number of selected faulty variables, and Jn,i denotes

mean(RC D2 ,f, m , j) std(RC D2 ,f, m , j)

(13)

where mean(RCD2,f,m,j) calculates the mean of the column vector RCD2,f,m,j, and std(RCD2,f,m,j) calculates the standard deviation of RCD2,f,m,j. The absolute value of ratio between mean(RCD2,f,m,j) and std(RCD2,f,m,j) can evaluate the variable significance to describe fault effects. The highest value of S(RCD2,f,m,j) indicates both high relative contribution and low variability among randomly selected subsets, which thus can be used to reveal the most significant fault-relevant variable. Besides, an Nf,i-dimensional vector S(RCD2,f,m,j) is available for each variable, with S(RCD2,f,m,j) calculated for all fault samples. The average level can be evaluated by calculating the mean value of S(RCD2,f,m,j) as S̅(RCD2,f,m,j). The largest mean value, S(̅ RCD2,f,m,j), can point to the most-significant faulty variable. The specific faulty variable selection procedure is described as follows. Step 1: Monte Carlo Based Fault Center Evaluation. Monte Carlo technique is used to randomly select normal samples from Xn(Nn × J) and M normal subsets (Xs(Ns × J)) are obtained. Next, the confidence region of normal data center is defined as [μ − 3σ, μ + 3σ] to judge whether the fault variation can be explored by discriminant analysis. If the fault data center is beyond the confidence region, stop the selection process. Otherwise, move forward to Step 2. Step 2: NeLFDA-Based Fault Direction Extraction. The NeLFDA algorithm25 is employed to extract the fault directions by pairwise analysis of normal data subset (Xs) and one fault class (Xf,i). The class-specific components (Tf,i and Ts,i) are then calculated by projecting measurement samples onto these fault directions based on eqs 7 and 8, respectively. Step 3: Fault-Data Monitoring. Based on eqs 9 and 10, the statistical index D2 is obtained using the class-specific components for the normal subset and fault class, respectively. The confidence limit is defined based on the normal data subset, and the values of D2 of fault samples are compared against the limit. If there are not sufficient out-of-control alarms with the number less than a predefined threshold ß, it means that the two classes have similar characteristics. In this case, stop the selection process. Otherwise, move forward to Step 4. Step 4: Relative Variable Contribution Evaluation. The variable contribution are available for both the normal subset and the fault case based on eq 11 and then the relative variable contribution (RCD2,f,m,j) are derived for each variable in the fault class based on eq 12 by setting the control limit based on normal data subset. Move back to Step 2 to calculate RCD2,f,m,j for the next normal subset and fault samples until all normal subsets have been used. Step 5: Calculation of Variable Significance. The significance index, S(RCD2,f,m,j), is calculated as shown in eq 13 for each variable as well as the mean value S(̅ RCD2,f,m,j) for Nf,i fault samples, which is used to indicate the significance of each variable in distinguishing normal and fault data. Step 6: Faulty Variable Selection. Choose the variable (xj*) with the largest value of S(̅ RCD2,f,m,j), which is regarded to be the most significant faulty variable and archived into the library of candidate faulty variables. Step 7: Model Updating. After the removal of the concerned variable (x*j ), the data are updated as X̃ s and X̃ f,i for normal

Figure 1. Illustration of the proposed faulty variable-selection procedure. E
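A small sketch of the Step 1 prejudgment follows; the helper names are assumed for illustration. Monte Carlo subsets of the normal data define the [μ − 3σ, μ + 3σ] region for the data center, and discriminant analysis is attempted only if the fault-data center leaves that region for some variable.

```python
# Hedged sketch of the Monte Carlo based fault center evaluation (Step 1).
import numpy as np

def center_region(Xn, n_subsets=200, frac=2 / 3, rng=None):
    rng = rng or np.random.default_rng()
    centers = np.empty((n_subsets, Xn.shape[1]))
    for m in range(n_subsets):
        idx = rng.choice(len(Xn), size=int(frac * len(Xn)), replace=False)
        centers[m] = Xn[idx].mean(axis=0)        # center of one random normal subset
    mu, sigma = centers.mean(axis=0), centers.std(axis=0, ddof=1)
    return mu - 3 * sigma, mu + 3 * sigma        # [mu - 3*sigma, mu + 3*sigma] region

rng = np.random.default_rng(3)
Xn = rng.normal(size=(595, 23))                  # normal reference data
Xf = rng.normal(size=(50, 23))
Xf[:, 10] += 2.0                                 # the fault shifts one variable
low, high = center_region(Xn, rng=rng)
fault_center = Xf.mean(axis=0)
do_DA = np.any((fault_center < low) | (fault_center > high))
print(do_DA)                                     # True: the fault center leaves the normal region
```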


3.2. Development of 2-Fold Fault Diagnosis Models. It is noted that two fault cases may share the same faulty variables and general variables but present different variable correlations. Besides, the selected faulty variables alone may not clearly distinguish different fault types. To explore their different characteristics, 2-fold fault diagnosis models are developed for each fault type.

First, for the subset of faulty variables, data normalization is conducted for X̃n,i, and the preprocessing information is then used to normalize the fault samples X̃f,i. NeLFDA is used to model X̃n,i and X̃f,i for the extraction of the fault directions, which reveal the major abnormal changes departing from normal. The fault directions are R̃n,i for the normal data X̃n,i and R̃f,i for the fault data X̃f,i. Here, a reconstruction calculation is used to determine how many fault directions should be retained, i.e., just enough to bring the corrected monitoring statistics back into the normal region:

\tilde{T}_{fr,i} = \tilde{X}_{f,i} \tilde{R}_{f,i}, \quad \tilde{P}_{f,i}^T = (\tilde{T}_{fr,i}^T \tilde{T}_{fr,i})^{-1} \tilde{T}_{fr,i}^T \tilde{X}_{f,i}, \quad \tilde{E}_{f,i} = \tilde{X}_{f,i} - \tilde{T}_{fr,i} \tilde{P}_{f,i}^T    (14)

where P̃f,i denotes the loading matrix that relates T̃fr,i to X̃f,i, and Ẽf,i denotes the updated data. The number of retained directions in R̃f,i is determined by checking whether the alarms can be removed. The updated data Ẽf,i are projected onto the PCA model P̃i that has been defined for X̃n,i, and the statistic D̃²f,i is calculated to summarize the fault variations along the PCA directions P̃i,

\tilde{T}_{f,i} = \tilde{E}_{f,i} \tilde{P}_i, \quad \tilde{D}_{f,i}^2 = (\tilde{t}_{f,i} - \tilde{t}_{n,i})^T \tilde{\Sigma}_{n,i}^{-1} (\tilde{t}_{f,i} - \tilde{t}_{n,i})    (15)

where t̃f,i^T is one row vector of T̃f,i, the components extracted from the faulty variables X̃f,i; t̃n,i denotes the mean of the components T̃n,i extracted from X̃n,i by T̃n,i = X̃n,i P̃i, which is in fact a zero vector because of the data normalization; and Σ̃n,i is the variance-covariance matrix of T̃n,i. It is noted that the retained number of directions in P̃i is equal to the rank of X̃n,i. The confidence limit for D̃²f,i is defined based on the variable subset of the normal data using the χ² distribution36 instead of the F-distribution,37 assuming that the measurement data follow a multivariate Gaussian distribution and the sample number is large enough. Alternatively, kernel density estimation,35 a nonparametric way to estimate the probability density function, can be used to define the confidence limit.

Second, for the subset of general variables, the variations are similar between the normal and fault cases and thus can be described by a unified model. Construct the super data X̃_i = [X̃_{n,i}; X̃_{f,i}] and normalize them so that each variable has zero mean and unit standard deviation. Perform PCA on X̃i to obtain the monitoring directions P̃i, with the retained number of directions equal to the rank of X̃i. After projection, the diagnosis statistic is then calculated,

\tilde{T}_i = \tilde{X}_i \tilde{P}_i, \quad \tilde{D}_i^2 = (\tilde{t}_i - \bar{\tilde{t}}_i)^T \tilde{\Sigma}_i^{-1} (\tilde{t}_i - \bar{\tilde{t}}_i)    (16)

where T̃i are the principal components extracted from X̃i, t̃i^T is one row vector of T̃i, the mean of T̃i is in fact a zero vector because of the data preprocessing, and Σ̃i is the variance-covariance matrix of T̃i. A confidence limit can be defined for D̃²i using the same principle as for the D̃²f,i statistic.

Based on the above analysis, each fault class is distinguished by two variable subsets, their respective diagnosis models, and the associated statistics together with their confidence limits. D̃²f,i and D̃²i summarize the variations of the different variable subsets. For online fault diagnosis, each new fault sample is then evaluated against the predefined 2-fold models of each candidate fault to identify its affiliation.

3.3. Online Probabilistic Fault Diagnosis. In conventional fault diagnosis, the fault cause is in general judged by checking whether the candidate fault class can accommodate the current fault sample. Assume that 0 indicates that the current fault sample cannot be brought back into the normal region enclosed by the predefined confidence limit and that 1 denotes that the current fault sample can be well reconstructed; that is, simple Boolean algebra (0 or 1) is used for reconstruction-based diagnosis. Checked against the confidence limit, it can only tell whether the new samples belong to one fault type but cannot tell how strongly the samples are associated with the concerned fault type; the extent of affiliation, which may carry more information on the fault cause, cannot be quantified. Here, a probability index is therefore defined to evaluate the possibility that the current fault samples belong to each fault type. The online probabilistic fault diagnosis strategy is implemented as below.

Whenever a new observation xnew(J × 1) is available at the kth time interval, it is first checked by a detection method to determine whether abnormal behavior is occurring; here, PCA-based statistical models are used to check its status. For an abnormal sample, fault diagnosis is then implemented to determine its affiliation to the candidate fault classes. For each candidate fault class i, two variable subsets are available, the faulty variables and the general variables. Correspondingly, the new fault sample can be organized as x_{new}(J × 1) = [x̃_{new,i}(Jn × 1); x̃_{new,i}(Jf × 1)]. Then, the similarity of this new fault sample to each candidate fault class is evaluated using the 2-fold fault diagnosis models.

First, the faulty variables are evaluated by the corresponding fault diagnosis model developed for the faulty variables of each candidate fault class i,

\tilde{t}_{new,i}^T = (\tilde{x}_{new,i}^T - \tilde{x}_{new,i}^T \tilde{R}_{f,i} \tilde{P}_{f,i}^T) \tilde{P}_i, \quad \tilde{D}_{new,i}^2 = (\tilde{t}_{new,i} - \tilde{t}_{n,i})^T \tilde{\Sigma}_{n,i}^{-1} (\tilde{t}_{new,i} - \tilde{t}_{n,i})    (17)

where x̃new,i(Jf,i × 1) covers the selected faulty variables of the fault sample xnew and is corrected by the specific fault diagnosis model (R̃f,i and P̃f,i). The corrected sample is then projected onto the predefined model P̃i to calculate the components t̃new,i, and the corresponding statistic D̃²new,i is calculated to summarize the variations along P̃i.

Second, the general variables are also evaluated by the predefined diagnosis model,

\tilde{t}_{new,i}^T = \tilde{x}_{new,i}^T \tilde{P}_i, \quad \tilde{D}_{new,i}^2 = (\tilde{t}_{new,i} - \bar{\tilde{t}}_i)^T \tilde{\Sigma}_i^{-1} (\tilde{t}_{new,i} - \bar{\tilde{t}}_i)    (18)

where x̃new,i(Jn,i × 1) contains the general variables of the new sample xnew, and t̃new,i are the corresponding components. The statistic D̃²new,i describes the variation of the general variables in the new sample.
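The online check of eq 17 amounts to a reconstruction along the fault directions followed by a Mahalanobis-type score (the general-variable check of eq 18 is analogous, without the reconstruction step). The sketch below assumes the 2-fold model matrices of one candidate fault class are already available and uses random stand-ins for them, so it only illustrates the computation, not trained models.

```python
# Hedged sketch of the online faulty-variable check in eq 17; all matrices are
# illustrative stand-ins for the trained model of one candidate fault class.
import numpy as np

def reconstructed_d2(x_faulty, R_f, P_f, P, t_bar_n, Sigma_n_inv):
    """Correct the faulty-variable part along the fault directions, then score it."""
    x_corr = x_faulty - x_faulty @ R_f @ P_f.T     # reconstruction along the fault directions
    t_new = x_corr @ P                              # projection onto the normal PCA model
    diff = t_new - t_bar_n
    return float(diff @ Sigma_n_inv @ diff)

rng = np.random.default_rng(4)
Jf = 7                                              # number of selected faulty variables
R_f = rng.normal(size=(Jf, 2))
P_f = rng.normal(size=(Jf, 2))
P = np.linalg.qr(rng.normal(size=(Jf, Jf)))[0]      # stand-in orthonormal PCA loadings
t_bar_n = np.zeros(Jf)                              # zero mean after normalization
Sigma_n_inv = np.eye(Jf)

x_new_faulty = rng.normal(size=Jf) + 2.0            # faulty-variable part of a new sample
D2_new = reconstructed_d2(x_new_faulty, R_f, P_f, P, t_bar_n, Sigma_n_inv)
print(D2_new)   # compared against the confidence limit of the candidate fault class
```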


Industrial & Engineering Chemistry Research Table 1. Variables in Cut-Made Process of Cigarette Used for Modeling variable no.

variable no.

variable description

1

initial cut tobacco flow rate (kg/h)

9

2 3 4

initial moisture content (%) SIROX vapor pressure (bar) SIROX temperature (°C)

10 11 12

5 6 7

SIROX vapor volume flow rate (m3/h) SIROX vapor mass flow rate (kg/h) SIROX vapor diaphragm valve opening (%)

13 14 15

8

KLD moisture exhaust negative pressure (μbar)

16

variable no.

variable description KLD moisture exhaust damper opening (%) KLD vapor pressure (bar) region 1 vapor pressure (bar) region 1 wall temperature

17

region 2 region 2 region 1 (°C) region 2 (°C)

21 22 23

variable description KLD hot wind temperature (°C) KLD hot wind speed (m/s) KLD water removal mass (l/h) KLD dried moisture content (%) KLD dried temperature (°C) cooling moisture content (%) cooling temperature (°C)

18 19 20

vapor pressure (bar) wall temperature (°C) condensed water temperature condensed water temperature

Table 2. (a) Fault Description and (b) Faulty Variable Selection Results in the Cut-Made Process of Cigarette

(a) Fault description

fault no.  fault description                                                     real faulty variables (nos.) from process knowledge and expert experience
1          the vapor pressure valve opening is increased by 25%                  7 (nos. 11, 12, 15, 20, 21, 22, and 23)
2          a step change is imposed on the set point of the KLD hot wind speed   6 (nos. 18, 19, 20, 21, 22, and 23)
3          unknown fault                                                         11 (nos. 10, 11, 12, 13, 14, 15, 16, 17, 20, 21, and 22)

(b) Faulty variable-selection performance, CVSR % / FVSR %

method                                               fault no. 1     fault no. 2     fault no. 3
proposed method                                      100 / 0         100 / 0         100 / 0
NeLFDA-based selection method30                      100 / 0         100 / 16.67     100 / 18.18
contribution plot based on FDA fault direction17     100 / 57.14     100 / 50.00     100 / 36.36
mutual information based method29                    85.71 / 0       83.33 / 0       81.81 / 0

The two new statistical indices can be compared against their predefined control limits. The fault cause can be preliminarily judged by finding which candidate fault class can accommodate the current fault sample. In general, if out-of-control signals are found for either index, the current fault sample is not consistent with the candidate fault type for the concerned variables. Otherwise, if both statistics stay well within the predefined confidence regions for one specific fault class, both the fault effects of the faulty variables and those of the general variables are similar to those of the concerned fault class, and the fault affiliation is thus preliminarily determined. Sometimes, however, multiple fault classes may simultaneously present in-control statistics. To further clarify the fault cause, a probability index is defined for the calculated monitoring statistics to indicate the extent of affiliation. Using Bayes' rule,38 for each concerned statistic x,

P(f_i | x) = \frac{P(x | f_i) P(f_i)}{\sum_{m=1}^{C} P(x | f_m) P(f_m)}    (19)

where P(f_i | x) denotes the posterior probability that x belongs to class f_i, P(f_i) denotes the prior probability of fault class f_i, C is the number of fault classes considered in the fault library, and P(x | f_i) is the conditional probability of x. Without any knowledge of the fault data distribution, kernel density estimation35 is used to estimate the probability density functions. Assuming that P(f_i) is the same for the different fault classes, eq 19 can be replaced by

P(f_i | x) = \frac{P(x | f_i)}{\sum_{m=1}^{C} P(x | f_m)}    (20)

where the denominator \sum_{m=1}^{C} P(x | f_m) is the same for all classes, so the probability is determined by P(x | f_i). Therefore, for each concerned statistic, the posterior probability that x belongs to class f_i can be calculated. Here, by replacing x with one of the monitoring statistics defined in eqs 17 and 18, respectively, the posterior probability can be derived from the fault diagnosis statistics using the above strategy, and the probability of the current sample belonging to each fault type is known. In comparison with the conventional reconstruction-based judgment, in which the fault cause may not be determined if more than one candidate fault class can accommodate the current fault sample, the specific probability value can be used to determine the final affiliation through the largest posterior probability. For example, if a new fault sample is found to stay in the confidence regions of both fault no. 1 and fault no. 2, the difference in probability tells which one better describes the current new fault sample from the perspective of its distribution, and the fault class corresponding to the larger probability points to the final affiliation.
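A minimal sketch of eqs 19 and 20 follows, using SciPy's Gaussian kernel density estimate for the conditional densities P(x | f_i) and equal priors; the function and variable names are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch of the probability index (eqs 19 and 20) with KDE-estimated likelihoods.
import numpy as np
from scipy.stats import gaussian_kde

def fault_posteriors(x_stat, historical_stats):
    """historical_stats: list of 1-D arrays of the statistic collected under each fault class."""
    likelihoods = np.array([gaussian_kde(s)(x_stat)[0] for s in historical_stats])  # P(x | f_i)
    return likelihoods / likelihoods.sum()       # eq 20, assuming equal priors P(f_i)

rng = np.random.default_rng(5)
stats_fault1 = rng.chisquare(df=4, size=300)     # D^2-type statistic collected under fault 1
stats_fault2 = rng.chisquare(df=4, size=300) + 6.0   # collected under fault 2
posterior = fault_posteriors(5.0, [stats_fault1, stats_fault2])
print(posterior)   # the class with the larger posterior indicates the fault affiliation
```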


4. ILLUSTRATIONS AND DISCUSSIONS

In this section, the proposed fault diagnosis method is tested on the cigarette-production process; the conventional discriminant analysis based fault diagnosis method is also tested for comparison. In cigarette production, the cut-made process is one of the most important procedures: it determines the flavor and style characteristics of the cigarette and is thus critical to the final quality. A total of three operation sections are covered, including leaf processing, silk processing, and blending and spicing. Here, only the silk-processing operation section is considered, in which the leaves are transformed into leaf-silk. A pair of major operation machines are included, the SIROX warming and humidification machine and the cylinder dryer (KLD). In the first machine, the leaf-silk is inflated and dampened to prepare for the following operation, while in the second machine, a barrel heated with saturated steam is used to dry the moisture content of the leaf-silk from 20% to 12%. In this case, twenty-three measurement variables with a sampling interval of 10 s are used, as described in Table 1. A total of three fault cases are considered, and the fault descriptions are shown in Table 2a, where the real faulty variables are also listed based on process knowledge and expert experience. MC-based fault center evaluation is performed for the three fault cases, revealing that all of them are suitable to be analyzed by discriminant analysis.

First, during the offline modeling stage, faulty variable separation is performed for the different faults. Here, 595 normal samples are used as the reference. A total of 50 fault samples are considered, representing a short fault process time, so that the fault characteristics can be assumed not to have changed significantly. The faulty variables identified by the different methods are compared in Table 2b. A pair of indices are defined to evaluate the faulty variable-selection performance: the correct variable-selection ratio (CVSR) compares the number of faulty variables that have been correctly selected against the number of real faulty variables, and the false variable-selection ratio (FVSR) is the ratio between the number of variables that have been falsely selected and the number of real faulty variables.

Using the proposed algorithm, all faulty variables are correctly separated, and no other variables are falsely identified. In comparison, using the selection method30 without MC-based evaluation of variable significance, some extra variables are falsely selected as faulty variables, leading to a larger FVSR index. Because that method determines the number of selected faulty variables by checking whether the remaining variables stay in the confidence region predefined by the normal data, all faulty variables have to be selected, resulting in 100% CVSR. Using a contribution plot based on the FDA fault direction by Qin et al.,17 the number of faulty variables cannot be determined because it is hard to say how much contribution is significant. Here, this method17 is modified to determine the number of selected variables by checking whether the remaining variables are similar to those in the normal data. Similarly, all faulty variables have to be picked up; in contrast, some variables are falsely selected, resulting in the worst FVSR index.

The mutual information based method by Verron et al.29 focuses on improving the misclassification error between different faults instead of identifying all faulty variables for each fault case. Therefore, the selected faulty variables may not agree well with the real case. Using this method,29 eight faulty variables are selected to best classify the three fault types, which, however, do not correctly reveal the real faulty variables for each fault case. Some faulty variables shared by different fault classes may not be selected if they are not informative for classification. To improve the selection performance, the method29 is here modified and conducted pairwise on the normal data and each fault class. In this way, the variables that are better for classification between normal and fault data are selected as faulty variables, which, however, may not cover all faulty variables in a given fault case. No extra variables are falsely selected as faulty variables, yielding a better FVSR. However, the CVSR index is smaller, which may result from the fact that the concerned method29 assumes data normality, which in general cannot be satisfied, particularly for fault data.

For each fault case, the selected faulty variables are explained in detail as below. For fault no. 1, the vapor pressure valve opening is increased by 25% to impose a step change on the region 1 vapor pressure. Because of the close variable relationships, other variables also change in response to the change of the region 1 vapor pressure. For modeling, the faulty variables are first separated from the general variables using the proposed selection algorithm. Here, seven faulty variables are obtained, namely variables no. 11, 12, 15, 20, 21, 22, and 23, and the result can be well explained. In response to the increase of the pressure valve opening, the region 1 vapor pressure (variable no. 11) increases immediately, leading to more vapor projected onto the wall. Therefore, the wall temperature (variable no. 12) increases, as does the condensed water temperature (variable no. 15), because more heat has to be condensed by the same amount of cooling water. Variable no. 20 (KLD dried moisture content) and variable no. 21 (KLD dried temperature) are measured at the output of the KLD machine, whereas variable no. 22 (cooling moisture content) and variable no. 23 (cooling temperature) are measured at the output of the winnowing cooling machine located after the KLD machine. Both temperature variables, the KLD dried temperature (variable no. 21) and the cooling temperature (variable no. 23), increase because of the higher wall temperature. Because more water is evaporated out of the tobacco, the KLD dried moisture content (variable no. 20) is reduced, and the same occurs for the cooling moisture content (variable no. 22).

For fault no. 2, the set point of the KLD hot wind speed is increased, which results in an increase of the KLD hot wind speed and, because of the variable relationships, changes in other variables. Six faulty variables are obtained, including variables no. 18, 19, 20, 21, 22, and 23, in response to the disturbance. The selection result can be reasonably explained. Following the increase of the KLD hot wind speed (variable no. 18), the leaf-silk goes through the KLD faster. Thus, less water is evaporated from the leaf-silk, and the moisture content cannot be dried to the expected value, resulting in a smaller KLD water removal mass (variable no. 19). In contrast, more moisture is left, so the KLD dried moisture content (variable no. 20) is larger than the normal value. Because of the shorter stay in the KLD, the KLD dried temperature (variable no. 21) is lower than the reference value. Next, the output of the KLD is fed into the winnowing cooling machine, in which the cooling temperature (variable no. 23) is larger and the cooling moisture content (variable no. 22) smaller than the normal values. It is noted that because the disturbance happens in the KLD machine, which is located after the SIROX warming and humidification processing unit, it does not influence the SIROX process unit before the KLD.


Industrial & Engineering Chemistry Research warming and humidification processing unit, thus will not influence the SIROX process unit before KLD. For fault no. 3, the fault cause is unknown in which the operation process is stopped resulting from the fact that the Region 2 wall temperature is going beyond the normal region. A total of 11 variables are selected as faulty ones, which are explained as below. First, all of the variables measured for Region 1 and Region 2 are selected, including variable nos. 11− 16. It reveals that the fault may happen in this part, which locates in the front of KLD machine. Besides, the change of wall temperature may result from the change of vapor pressure. Moreover, variable nos. 10, 20, 17, and 21 reveal the operation status of back KLD machine, which may result from the changes in Region 1 and Region 2. Besides, variable no. 22 (cooling moisture content) is also selected because it may go beyond the expected value because of all disturbances that happened before. From the selected faulty variables, the fault cause may be analyzed. The vapor pressure leads to some failure in vapor pressure valve, which thus results in changes of wall temperature and condensed water temperature. The experienced engineers checked the pressure values and confirmed the reliability of the fault cause. The selected faulty variables can help to know the specific fault characteristics and diagnose the root cause of fault process. Besides, it can improve the classification between normal and fault data. In Figure 2, the scatter of the first two discriminant components is compared between NeLFDA with and without variable selection for fault no. 1. By the use of all variables, the fault data and normal data can not be clearly separated from each other with some overlapping to a certain extent, as shown in Figure 2a. In contrast, with variable separation, the normal and fault data can be separated better using faulty variables than that shown in Figure 2b. The general variables are thus more similar between two data classes after excluding the faulty variables, which are overlapped significantly. The results reveal that by identifying the significant fault effects, we can improve classification performance between normal and fault data. On the basis of faulty variable separation results, 2-fold models are developed for each fault case based on the selected faulty variables and the general variables, respectively. To test the model performance, another 50 fault samples are employed as testing samples. For fault samples that have been detected, the 2-fold models are used to check whether the current fault sample present similar characteristics with some fault class regarding both faulty variables and general variables. For samples from fault case no. 1, they are evaluated by 2fold models developed from fault no. 1, as shown in Figure 3. For general variables, the statistics stay well below the confidence limit as shown in Figure 3a, revealing similar characteristics with those of training case. For faulty variables, the statistics are going beyond the confidence limit before fault reconstruction; they will be corrected by the predefined fault model using the proposed method, and the reconstructed statistics will go back to the normal region, as shown in Figure 3b. The fault cause is thus judged by double checking the characteristics of general variables and faulty variables. 
The general variables can tell whether the operation status of that variable subset is similar to the normal case, and the faulty variables can tell whether they depart from the normal case along the predefined fault directions. For comparison, if the same fault samples are evaluated by the models developed from fault no. 2, as shown in Figure 4, neither the general variables nor the faulty variables stay within the confidence region, even after reconstruction. Besides, the probability of each fault sample belonging to fault no. 1 is calculated and shown in Figure 5. Clearly, the probability of fault no. 1 is larger than 0.5 for both the general variables and the faulty variables, which is obviously larger than that of the other faults and thus directly indicates the correct fault cause.

As described in subsection 3.3, the 2-fold models of the different candidate faults are tested alternately to check which fault class can well explain the current fault samples. Therefore, for fault samples from fault no. i, if they can also be well explained by the fault models of other faults, wrong fault diagnosis results may be obtained.

Figure 2. First two components extracted from fault data and normal data using (a) the proposed algorithm with faulty variable separation and (b) NeLFDA without faulty variable separation.


Figure 3. Online fault diagnosis results using (a) general variables and (b) faulty variables for fault samples of fault no. 1 using the proposed method (dotted line: monitoring statistics; red dashed line: 99% monitoring confidence limit).


Figure 4. Online fault diagnosis results for fault samples of fault no. 1 using fault models developed from fault no. 2 (dotted line: monitoring statistics; red dashed line: 99% monitoring confidence limit).

A pair of indices are calculated to evaluate the fault diagnosis performance, the correct classification rate (CCR) and the false classification rate (FCR),

CCR = \frac{\text{no. of samples under fault } i}{\text{total samples under any fault}}, \quad FCR = \frac{\text{no. of samples beyond fault } i}{\text{real fault samples}}    (21)

where the CCR index compares the number of samples that fall within the confidence region of fault no. i with the total number of samples that fall within the confidence region of any fault, and the FCR index counts the number of samples that fall outside the confidence region of fault no. i relative to the number of real fault samples. The largest CCR indicates that the models of fault no. i can best accommodate the fault samples in comparison with the other faults, while the FCR quantifies how many fault samples cannot be correctly identified.

The fault diagnosis results are presented in Table 3, in which four methods are compared for the three faults and 50 testing samples to demonstrate the idea of faulty variable selection, the improved discriminant power of NeLFDA, and the superiority of probabilistic fault diagnosis. For each fault case, the four methods are evaluated using CCR and FCR. A paired t-test (α = 0.05)39 is used to quantify the difference between the NeLFDA method without faulty variable selection and the conventional FDA method. Both the CCR and FCR indices are statistically better for the NeLFDA method without faulty variable selection, revealing the improved discriminant power of NeLFDA over conventional FDA. Besides, with faulty variable selection, the NeLFDA method presents a significantly larger CCR index and smaller FCR index than the NeLFDA algorithm without faulty variable separation, revealing that faulty variable separation improves the diagnosis accuracy. Comparatively, the conventional FDA presents the worst fault-diagnosis performance, which can be explained from the following aspects. First, it can only extract a single discriminant component when each fault class is associated with the normal case, which may not cover sufficient information to tell the critical differences among multiple fault types. Second, without faulty variable selection, the different characteristics of the faulty variables and the general variables cannot be clearly explored. Using NeLFDA with faulty variable selection, some fault samples may still not be clearly distinguished between different faults, in particular for fault no. 3, for which some fault samples are also accommodated by the models of fault no. 2. The probabilistic fault diagnosis strategy can then be used to further classify the fault cause: a larger probabilistic value reveals that the fault samples better follow the distribution of a certain fault type, even though they can also be accommodated by other fault types, and thus further indicates the fault cause. The classification between fault no. 2 and fault no. 3 is further improved by using the probabilistic diagnosis strategy.
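For completeness, the two indices of eq 21 can be computed directly from the per-sample in-control decisions. In the hedged sketch below, the Boolean arrays are synthetic stand-ins for the reconstruction-based decisions, and the function name is an assumption for illustration.

```python
# Illustrative computation of CCR and FCR (eq 21) for one candidate fault class.
import numpy as np

def ccr_fcr(in_region_i, in_region_any, n_real_fault):
    """in_region_i: samples accommodated by fault model i; in_region_any: by any fault model."""
    ccr = in_region_i.sum() / max(in_region_any.sum(), 1)      # CCR of eq 21
    fcr = (n_real_fault - in_region_i.sum()) / n_real_fault    # FCR of eq 21
    return ccr, fcr

rng = np.random.default_rng(6)
n_real_fault = 50
in_region_i = rng.random(n_real_fault) < 0.96                  # synthetic decisions for fault model i
in_region_any = in_region_i | (rng.random(n_real_fault) < 0.05)
print(ccr_fcr(in_region_i, in_region_any, n_real_fault))
```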


Figure 5. Online probability values for fault samples of fault no. 1 using fault models developed from fault no. 1.

Table 3. Fault Diagnosis Performance for Four Different Methods Evaluated by CCR and FCR (%)

                                                               fault no. 1       fault no. 2       fault no. 3
method                                                         CCR     FCR       CCR     FCR       CCR     FCR
conventional FDA algorithm                                     66.94   19.00     73.08   24.00     71.43   20.00
NeLFDA algorithm with no faulty variable separation            79.63   14.00     78.70   15.00     77.27   15.00
NeLFDA algorithm with faulty variable separation               96.08    2.00     88.68    6.00     87.04    6.00
probabilistic fault diagnosis with faulty variable separation  100      0        98.02    1.00     98.02    1.00

5. CONCLUSIONS

In the present work, a probabilistic fault diagnosis method is developed based on MC and NeLFDA. The prejudgment strategy can reveal whether discriminant analysis can be conducted on the fault data. The MC-based evaluation strategy of fault-relevant variable significance can reduce the influence of the modeling samples and better separate the significant faulty variables from the other general ones for each fault case. The probabilistic diagnosis strategy can further judge fault affiliation by checking the extent to which each fault sample belongs to every fault class. In addition, the application of the proposed method to a real industrial process verifies its superior performance.



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Tel: 86-571-87951879. Fax: 86-571-87951879.

ORCID

Chunhui Zhao: 0000-0002-0254-5763
Furong Gao: 0000-0002-5900-1353

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

This work is supported by the National Natural Science Foundation of China (nos. 61422306, 61273166, and 61433005) and the Open Research Project of the State Key Laboratory of Industrial Control Technology, Zhejiang University, China (nos. ICT1600198 and ICT1600257).



REFERENCES

(1) Chang, C. C.; Yu, C. C. On-line fault diagnosis using the signed directed graph. Ind. Eng. Chem. Res. 1990, 29, 1290−1299.
(2) Zhao, C. H.; Sun, Y. X.; Gao, F. R. A Multiple-Time-Region (MTR)-Based Fault Subspace Decomposition and Reconstruction Modeling Strategy for Online Fault Diagnosis. Ind. Eng. Chem. Res. 2012, 51, 11207−11217.
(3) Kariwala, V.; Odiowei, P. E.; Cao, Y.; Chen, T. A branch and bound method for isolation of faulty variables through missing variable analysis. J. Process Control 2010, 20, 1198−1206.
(4) van den Kerkhof, P.; Vanlaer, J.; Gins, G. J.; Van Impe, F. M. Analysis of smearing-out in contribution plot based fault isolation for statistical process control. Chem. Eng. Sci. 2013, 104, 285−293.
(5) Zhao, C. H.; Yao, Y.; Gao, F. R.; Wang, F. L. Statistical Analysis and Online Monitoring for Multimode Processes with Between-mode Transitions. Chem. Eng. Sci. 2010, 65 (22), 5961−5975.
(6) Yu, J.; Rashid, M. M. A novel dynamic Bayesian network-based networked process monitoring approach for fault detection, propagation identification, and root cause diagnosis. AIChE J. 2013, 59 (7), 2348−2365.
(7) Zhao, C. H.; Sun, Y. X. Comprehensive subspace decomposition and isolation of principal reconstruction directions for online fault diagnosis. J. Process Control 2013, 23, 1515−1527.
(8) Zhao, C. H.; Gao, F. R. Online Fault Prognosis with Relative Deviation Analysis and Vector Autoregressive Modeling. Chem. Eng. Sci. 2015, 138 (22), 531−543.
(9) Tong, C.; El-Farra, N. H.; Palazoglu, A.; Yan, X. Fault detection and isolation in hybrid process systems using a combined data-driven and observer-design methodology. AIChE J. 2014, 60 (8), 2805−2814.
(10) Wold, S.; Esbensen, K.; Geladi, P. Principal component analysis. Chemom. Intell. Lab. Syst. 1987, 2, 37−52.
(11) Jackson, J. E. A User's Guide to Principal Components; John Wiley & Sons: Ontario, Canada, 2005.
(12) Duda, R. O.; Hart, P. E. Pattern Classification and Scene Analysis; Wiley: New York, 1973.
(13) Zhao, C. H.; Gao, F. R. Fault-relevant principal component analysis (FPCA) method for multivariate statistical modeling and process monitoring. Chemom. Intell. Lab. Syst. 2014, 133 (15), 1−16.
(14) Martínez, A. M.; Kak, A. C. PCA versus LDA. IEEE Trans. Pattern Anal. Mach. Intell. 2001, 23, 228−233.
(15) Chiang, L. H.; Kotanchek, M. E.; Kordon, A. K. Fault diagnosis based on Fisher discriminant analysis and support vector machines. Comput. Chem. Eng. 2004, 28, 1389−1401.
(16) Chiang, L. H.; Russell, E. L.; Braatz, R. D. Fault diagnosis in chemical processes using Fisher discriminant analysis, discriminant partial least squares, and principal component analysis. Chemom. Intell. Lab. Syst. 2000, 50, 243−252.
(17) He, Q. P.; Qin, S. J.; Wang, J. A new fault diagnosis method using fault directions in Fisher discriminant analysis. AIChE J. 2005, 51, 555−571.
(18) Du, Z.; Jin, X. Multiple faults diagnosis for sensors in air handling unit using Fisher discriminant analysis. Energy Convers. Manage. 2008, 49, 3654−3665.
(19) Chakrabarti, S.; Roy, S.; Soundalgekar, M. V. Fast and accurate text classification via multiple linear discriminant projections. VLDB J. 2003, 12, 170−185.
(20) Ye, J.; Janardan, R.; Park, C. H.; Park, H. An optimization criterion for generalized discriminant analysis on undersampled problems. IEEE Trans. Pattern Anal. Mach. Intell. 2004, 26, 982−994.
(21) Friedman, J. H. Regularized discriminant analysis. J. Am. Stat. Assoc. 1989, 84, 165−175.
(22) Raudys, S.; Duin, R. P. W. Expected classification error of the Fisher linear classifier with pseudoinverse covariance matrix. Pattern Recogn. Lett. 1998, 19, 385−392.
(23) Peter, N. B.; Joao, P. H.; David, J. K. Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 929−941.
(24) Ye, J.; Li, Q. A two-stage linear discriminant analysis via QR-decomposition. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 929−941.
(25) Zhao, C. H.; Gao, F. R. A nested-loop Fisher discriminant analysis algorithm. Chemom. Intell. Lab. Syst. 2015, 146, 396−406.
(26) Gertler, J.; Li, W.; Huang, Y.; McAvoy, T. Isolation enhanced principal component analysis. AIChE J. 1999, 45, 323−334.
(27) Westerhuis, J. A.; Gurden, S. P.; Smilde, A. K. Generalized contribution plots in multivariate statistical process monitoring. Chemom. Intell. Lab. Syst. 2000, 51, 95−114.
(28) Liu, J. L.; Chen, D. S. Fault isolation using modified contribution plots. Comput. Chem. Eng. 2014, 61, 9−19.
(29) Verron, S.; Tiplica, T.; Kobi, A. Fault detection and identification with a new feature selection based on mutual information. J. Process Control 2008, 18 (5), 479−490.
(30) Zhao, C. H.; Wang, W. Faulty variable selection based fault reconstruction for industrial processes. Proceedings of the American Control Conference; 2016; pp 6845−6850.
(31) Wang, W.; Zhao, C. H.; Sun, Y. X. Locating faulty variables by evaluating ratio of variable contribution based on discriminant analysis for online fault diagnosis. Proceedings of the Chinese Control Conference; 2015; pp 6366−6371.
(32) Dayal, B. S.; MacGregor, J. F. Improved PLS algorithms. J. Chemom. 1997, 11, 73−85.
(33) Owen, A. B. Monte Carlo Theory, Methods and Examples; Springer: London, 2013.
(34) Mahalanobis, P. C. On the generalized distance in statistics. Proceedings of the National Institute of Sciences (Calcutta) 1936, 2, 49−55.
(35) Silverman, B. W. Density Estimation for Statistics and Data Analysis; CRC: Boca Raton, FL, 1986.
(36) Nomikos, P.; MacGregor, J. F. Multivariate SPC charts for monitoring batch processes. Technometrics 1995, 37, 41−59.
(37) Lowry, C. A.; Montgomery, D. C. A review of multivariate control charts. IIE Trans. 1995, 27, 800−810.
(38) Chiang, L. H.; Russell, E. L.; Braatz, R. D. Fault Detection and Diagnosis in Industrial Systems; Springer-Verlag: London, 2001.
(39) Montgomery, D. C.; Runger, G. C. Applied Statistics and Probability for Engineers; Wiley: New York, 2006.
