Fault Identification of Nonlinear Processes - American Chemical Society

Article pubs.acs.org/IECR

Fault Identification of Nonlinear Processes
Yingwei Zhang,†,* Lingjun Zhang,† and Renquan Lu†

† State Laboratory of Synthesis Automation of Process Industry, Northeastern University, Shenyang, Liaoning 110819, PR China
School of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, PR China



ABSTRACT: In this paper, a new fault identification method based on kernel partial least-squares (KPLS) is proposed. Although KPLS is superior to PLS in detecting faults in nonlinear processes, fault identification methods for KPLS are limited. The contributions of this paper are as follows: (1) the relationship between the input and output variables is considered, and each variable’s contribution is measured using the gradient of the kernel function, whereas existing work considers only the input variables; (2) complex computation is avoided by introducing a new way of computing the partial derivative of the kernel matrix. Compared with PLS using conventional contribution plots, the proposed method has two advantages: it can identify faulty variables in nonlinear processes, and it guarantees correct diagnosis of simple sensor faults. Finally, case studies on a numerical example and an electro-fused magnesia furnace (EFMF) illustrate the effectiveness of the proposed method, including a comparison with the linear PLS method.

1. INTRODUCTION
Multivariate statistical process monitoring (SPM) techniques have been widely used in the monitoring of industrial processes over the past two decades. Multivariate statistical methods such as principal component analysis (PCA), partial least-squares or projection to latent structures (PLS), and, more recently, independent component analysis (ICA) have been very successful in practice.1−5 These methods build statistical models from normal operation data by projecting large numbers of highly correlated measured variables onto low-dimensional latent spaces, and they monitor the process with Hotelling’s T2 statistic and the squared prediction error (SPE), or Q, statistic. A fault is detected when a monitoring statistic moves beyond its confidence limit. After a fault is detected, contribution plots are generally used to identify the faulty variables.

PLS has been used in multivariate monitoring of process operating performance in almost exactly the same way as PCA-based monitoring.6−8 It is a dimensionality reduction technique that finds a set of latent variables by projecting the process space (X) and the quality space (Y) onto new subspaces that maximize the covariance between the two spaces. In PLS, the decomposition of the input variables X is affected by the quality variables Y; therefore, the PLS model is more concerned with the quality variables than the PCA model is. Some extensions of PLS have been reported, while the identification methods for PLS are limited.9−13 Among PLS-based fault diagnosis and identification methods, the contribution plot is also the most popular one.14,15 It assumes that faulty variables have high contributions to the monitoring indices. Besides the conventional PLS model, the contribution plot is also employed in many extensions of PLS. Choi and Lee proposed a fault identification approach based on multiblock PLS.16 The block contributions to the T2 and SPE statistics are derived in this approach. In fact, if the number of blocks equals the number of variables, that is, each block contains one variable, the contribution of each variable to the monitoring indices can be calculated directly with this approach, which is then similar to the contribution plot. Li et al. proposed total PLS based contribution plots.17,18 In this method, the model is developed with total PLS and the fault is identified using conventional contribution plots. Although the contribution plot approach has been employed for years, it sometimes leads to misdiagnosis because of fault smearing.19 Another diagnosis method is the reconstruction-based contribution (RBC) proposed by Alcala and Qin.20 This method is based on reconstruction of the fault detection index along the direction of a variable, and it guarantees correct diagnosis results for faults with unknown direction. A generalized reconstruction-based contribution is also employed in total PLS to diagnose the fault type for output-relevant faults.21

PLS-based monitoring and identification methods are effective in the detection and diagnosis of faults in linear processes. However, most industrial processes have nonlinear characteristics. As a nonlinear extension of linear PLS, kernel PLS (KPLS) has been widely used in the monitoring of quality-related nonlinear processes.22−26 KPLS is based on Cover’s theorem, which states that a nonlinear data structure in the input space is more likely to be linear after a high-dimensional nonlinear mapping.27 Therefore, in the KPLS method, the input data set is first mapped from its original space into a high-dimensional feature space by means of nonlinear kernel functions. Linear relationships between the input variables and the output variables are then developed in this high-dimensional feature space.

Although KPLS has been used for many years, few diagnosis methods are available for it, and other methods are sometimes used instead. Zhang et al. proposed a multiblock KPLS method.28 This method computes the block contributions to the monitoring indices. Just as in multiblock PLS, the contribution of each variable can also be derived, so the method can be used to identify faults in a KPLS model. Zhang and Hu proposed a multiscale KPLS method, which is also employed to identify faults in a KPLS model.

© 2013 American Chemical Society

Received: January 27, 2013
Revised: June 26, 2013
Accepted: July 28, 2013
Published: July 29, 2013
dx.doi.org/10.1021/ie400310q | Ind. Eng. Chem. Res. 2013, 52, 12072−12081

Industrial & Engineering Chemistry Research

Article

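Table 1. KPLS Algorithm (Zhang and Hu, 2011)

| step | for comprehension | for computation |
| --- | --- | --- |
| | initialize $u_i$ | initialize $u_i$ |
| 1 | $w_i = \Phi(x)_i^T u_i / \lVert \Phi(x)_i^T u_i \rVert$ | $t_i = K_i u_i / \sqrt{u_i^T K_i u_i}$ |
| 2 | $t_i = \Phi(x)_i w_i$ | |
| 3 | $q_i = Y_i^T t_i / \lVert t_i^T t_i \rVert$ | $q_i = Y_i^T t_i / \lVert t_i^T t_i \rVert$ |
| | $u_i = Y_i q_i / (q_i^T q_i)$ | $u_i = Y_i q_i / (q_i^T q_i)$ |
| 4 | loop until $u_i$ converges | loop until $u_i$ converges |
| 5 | $\Phi(x)_{i+1} = (I - t_i t_i^T / t_i^T t_i)\,\Phi(x)_i$ | $K_{i+1} = (I - t_i t_i^T / t_i^T t_i)\,K_i\,(I - t_i t_i^T / t_i^T t_i)$ |
| | $Y_{i+1} = (I - t_i t_i^T / t_i^T t_i)\,Y_i$; go to step 2 | $Y_{i+1} = (I - t_i t_i^T / t_i^T t_i)\,Y_i$; go to step 2 |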
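The "for computation" column of Table 1 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `kpls_nipals`, the convergence tolerance, the iteration cap, and the choice of the first column of Y as the initial u are all hypothetical, and the transposes are written so that the matrix dimensions agree (t is N×1, q is p×1).

```python
import numpy as np

def kpls_nipals(K, Y, n_components, tol=1e-10, max_iter=500):
    """Kernel-form NIPALS loop in the spirit of Table 1 (computation column).

    K : (N, N) mean-centered Gram matrix
    Y : (N, p) mean-centered output matrix
    Returns T and U, each of shape (N, n_components).
    """
    N = K.shape[0]
    K_i = np.array(K, dtype=float)
    Y_i = np.array(Y, dtype=float)
    T = np.zeros((N, n_components))
    U = np.zeros((N, n_components))
    for a in range(n_components):
        u = Y_i[:, 0].copy()                 # assumed initialization of u_i
        for _ in range(max_iter):
            t = K_i @ u
            t = t / np.sqrt(u @ K_i @ u)     # t_i = K_i u_i / sqrt(u_i^T K_i u_i)
            q = Y_i.T @ t / (t @ t)          # q_i = Y_i^T t_i / (t_i^T t_i)
            u_new = Y_i @ q / (q @ q)        # u_i = Y_i q_i / (q_i^T q_i)
            done = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if done:                         # loop until u_i converges
                break
        T[:, a] = t
        U[:, a] = u
        D = np.eye(N) - np.outer(t, t) / (t @ t)   # deflation projector
        K_i = D @ K_i @ D                    # K_{i+1}
        Y_i = D @ Y_i                        # Y_{i+1}
    return T, U
```

Because each deflation projects out the current score vector, the columns of T returned by this sketch are mutually orthogonal, which is a convenient sanity check on any implementation.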

In this method, process measurements are decomposed into separate multiscale components, and faults that occur at different scales are diagnosed and identified.29,30 However, an identification method for conventional KPLS has not been reported.

In this paper, a KPLS-based partial derivative contribution is proposed. It is partly derived from the work of Cho et al. for KPCA.31 In this method, the contribution of each variable is calculated using the partial derivative of the kernel function employed in KPLS. The contribution is the amount of each variable’s influence on the kernel function. The contributions of each variable to the monitoring statistics are then derived directly. Compared with the work of Cho et al., the proposed method is applied in KPLS-based monitoring and diagnosis. Each variable’s contribution is measured using the gradient of the kernel function in Cho’s work, while only input variables are considered in this paper. Therefore, fewer variables’ contributions are calculated, so the proposed method is more efficient in fault identification. Moreover, the proposed method simplifies the computation in Cho’s work. In Cho’s work, each element of $\bar{K}_{new}\bar{K}_{new}^T$ is calculated in order to obtain each element of $\partial(\bar{K}_{new}\bar{K}_{new}^T)/\partial v_l$, and the computation is quite complex. In this paper, $\partial(\bar{K}_{new}\bar{K}_{new}^T)/\partial v_l$ is replaced by $(\partial\bar{K}_{new}/\partial v_l)\bar{K}_{new}^T + \bar{K}_{new}(\partial\bar{K}_{new}/\partial v_l)^T$, which is computed easily, so the complex computation is avoided and the faulty variables are identified more easily and quickly.

The remaining sections of this paper are organized as follows. In Section 2, the KPLS algorithm is reviewed. In Section 3, the KPLS detection and identification method based on the proposed contribution is introduced. In Section 4, the model development procedure and the online monitoring and diagnosis structure are presented in detail. In Section 5, two simulation examples are given to illustrate the feasibility of the proposed method. Finally, conclusions are drawn in Section 6.

2. KPLS ALGORITHM
The KPLS algorithm shown in Table 1 is directly derived from the NIPALS algorithm of PLS.32−34 In Table 1, $\Phi(x) = [\Phi(x_1), \Phi(x_2), \ldots, \Phi(x_N)]^T$ is the projection of the input variable vectors $x \in R^{N \times M}$ into the high-dimensional space via the mapping $\Phi: x \in R^M \to \Phi(x) \in F$. It is assumed that $\sum_{i=1}^{N}\Phi(x_i) = 0$, so mean-centering in the high-dimensional space should be performed before applying KPLS. Through the kernel trick $\Phi(x_i)\Phi(x_j)^T = \bar{K}_{ij}$, the explicit nonlinear mapping and the computation of dot products are avoided. In this paper, the Gaussian kernel function $k(x, y) = \exp(-\|x - y\|^2/c)$ is employed, so the kernel Gram matrix is calculated via

$$K_{ij} = k(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{c}\right) \tag{1}$$

and mean-centered via

$$\bar{K} = K - EK - KE + EKE \tag{2}$$

where

$$E = \frac{1}{N}\begin{bmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{bmatrix} \in R^{N \times N}.$$

This is the same as the kernel method in KPCA.35,36 The parameter $c$ in eq 1 is determined by $c = rm\sigma^2$, where $r$ is a constant to be selected, $m$ is the dimension of the input space, and $\sigma^2$ is the variance of the data.

3. FAULT DETECTION AND DIAGNOSIS BASED ON PARTIAL DERIVATIVE CONTRIBUTION PLOT
In this section, a KPLS-based partial derivative contribution plot is proposed. This method is partly derived from the work of Rakotomamonjy and Cho et al.31,37 In this method, the contributions of each variable to the monitoring statistics are derived from the partial derivative of the kernel function employed in KPLS and from the monitoring indices.

For a set of normal data $X = [x_1, x_2, \ldots, x_N] \in R^{N \times M}$, where $N$ is the number of samples and $M$ is the number of input variables, the scores $T = [t_1, t_2, \ldots, t_R] \in R^{N \times R}$ are calculated following the steps in Table 1. $R$ is the number of scores and is determined using cross-validation.33 The T2 statistic of the $i$th normal sample is then calculated as follows:

$$T^2 = t_i \Lambda^{-1} t_i^T \tag{3}$$

where $\Lambda = (1/N)T^T T$ and $t_i$ is the $i$th row of $T$. The SPE statistic of the $i$th normal sample is calculated as follows:

$$\begin{aligned} SPE &= \|\Phi(x_i) - \hat{\Phi}(x_i)\|^2 \\ &= \Phi(x_i)\Phi(x_i)^T - 2\Phi(x_i)\hat{\Phi}(x_i)^T + \hat{\Phi}(x_i)\hat{\Phi}(x_i)^T \\ &= \bar{K}_{ii} - 2\Phi(x_i)Pt_i^T + t_i P^T P t_i^T \\ &= \bar{K}_{ii} - 2\bar{K}_i T t_i^T + t_i T^T \bar{K} T t_i^T \end{aligned} \tag{4}$$

where $\bar{K}_i$ is the $i$th row of $\bar{K}$, which is calculated via eqs 1 and 2, and $P$ is the loading matrix for $\Phi(x)$, calculated with $P = \Phi(x)^T T$. The confidence limits of the T2 and SPE statistics are calculated from their characteristic distributions; the detailed equations are given by Qin et al.38
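Equations 1−4 can be illustrated with a short NumPy sketch. It assumes the inputs are already scaled and that a score matrix T with one row per sample is available; the function names `gaussian_gram`, `center_gram`, and `t2_spe` are hypothetical and only serve this illustration.

```python
import numpy as np

def gaussian_gram(X, c):
    """Eq 1: K_ij = exp(-||x_i - x_j||^2 / c) for the rows of X."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / c)

def center_gram(K):
    """Eq 2: K_bar = K - E K - K E + E K E, with E = (1/N) * ones((N, N))."""
    N = K.shape[0]
    E = np.full((N, N), 1.0 / N)
    return K - E @ K - K @ E + E @ K @ E

def t2_spe(K_bar, T):
    """Eqs 3 and 4 for all training samples; row i of T is the score t_i."""
    N = K_bar.shape[0]
    Lam_inv = np.linalg.inv(T.T @ T / N)            # Lambda = (1/N) T^T T
    t2 = np.einsum("ij,jk,ik->i", T, Lam_inv, T)    # t_i Lambda^{-1} t_i^T
    KT = K_bar @ T                                  # row i equals K_bar_i T
    G = T.T @ K_bar @ T                             # T^T K_bar T
    spe = (np.diag(K_bar)
           - 2.0 * np.sum(KT * T, axis=1)           # 2 K_bar_i T t_i^T
           + np.einsum("ij,jk,ik->i", T, G, T))     # t_i T^T K_bar T t_i^T
    return t2, spe
```

Two properties follow from the definitions and are useful as checks: the rows and columns of the centered Gram matrix sum to zero, and the mean of the T2 values over the training samples equals the number of scores R, because $\Lambda = (1/N)T^T T$.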


Assuming a scale factor $v = [v_1, v_2, \ldots, v_m]^T$ where $v_i = 1$ ($i = 1, 2, \ldots, m$), the kernel function in eq 1 can be written in the following form:

$$k(x_i, x_j) = k(v \cdot x_i, v \cdot x_j) = \exp\left(-\frac{\|v \cdot x_i - v \cdot x_j\|^2}{c}\right) \tag{5}$$

where the symbol “·” denotes the component-wise vector product. The partial derivative of the kernel function with respect to the scale factor $v_l$ of the $l$th variable is then calculated as follows:

$$\frac{\partial k(x_i, x_j)}{\partial v_l} = \frac{\partial k(v \cdot x_i, v \cdot x_j)}{\partial v_l} = -\frac{1}{c}(v_l x_{i,l} - v_l x_{j,l})^2\, k(v \cdot x_i, v \cdot x_j) = -\frac{1}{c}(x_{i,l} - x_{j,l})^2\, k(x_i, x_j)\Big|_{v_l = 1} \tag{6}$$

where $x_{i,l}$ denotes the value of the $l$th variable in the $i$th sample. According to eq 1, $(\partial K/\partial v_l)|_{ij} = \partial K_{ij}/\partial v_l = \partial k(x_i, x_j)/\partial v_l$, so the partial derivative of the mean-centered Gram matrix is obtained as follows:

$$\frac{\partial \bar{K}}{\partial v_l} = \frac{\partial K}{\partial v_l} - E\frac{\partial K}{\partial v_l} - \frac{\partial K}{\partial v_l}E + E\frac{\partial K}{\partial v_l}E \tag{7}$$

The absolute value of $(\partial \bar{K}/\partial v_l)|_{ij}$ is the contribution of the $l$th variable in the $i$th sample to the kernel function. Then, according to eqs 3 and 4, the absolute contributions of the $l$th variable in the $i$th sample to the monitoring statistics are calculated as follows:

$$cont_l^{T^2} = \left|\frac{\partial T^2}{\partial v_l}\right| = \left|\frac{\partial \bar{K}}{\partial v_l}\Big|_i U\Lambda^{-1}U^T\bar{K}_i^T + \bar{K}_i U\Lambda^{-1}U^T\left(\frac{\partial \bar{K}}{\partial v_l}\Big|_i\right)^T\right| \tag{8}$$

$$cont_l^{SPE} = \left|\frac{\partial SPE}{\partial v_l}\right| = \left|\frac{\partial \bar{K}}{\partial v_l}\Big|_{ii} - 2\frac{\partial \bar{K}}{\partial v_l}\Big|_i T t_i^T + t_i T^T\frac{\partial \bar{K}}{\partial v_l}T t_i^T\right| \tag{9}$$

where $cont_l^{T^2}$ and $cont_l^{SPE}$ denote the contributions of the $l$th variable in the $i$th sample to the T2 and SPE statistics, respectively, and $(\partial \bar{K}/\partial v_l)|_i$ denotes the $i$th row of $\partial \bar{K}/\partial v_l$. After the contributions of each variable to the monitoring statistics are obtained, the upper confidence limit for the contribution of the $l$th variable is determined as $C_{UCL,l} = m(cont_l) + 2.5758 \cdot s(cont_l)$, where $m(cont_l)$ and $s(cont_l)$ are the mean and sample standard deviation of $cont_l$, respectively.16,18 $cont_l \in R^{N \times 1}$ is composed of the contributions of the $l$th variable in each sample.

Given the new sample $x_{new} \in R^{1 \times M}$, the new Gram matrix is calculated and mean-centered as follows:

$$K_{new} = k(x_{new}, x_j) = \exp\left(-\frac{\|x_{new} - x_j\|^2}{c}\right) \tag{10}$$

$$\bar{K}_{new} = K_{new} - E_L K - K_{new}E + E_L K E \tag{11}$$

where $\bar{K}_{new} \in R^{1 \times N}$ and $E_L = (1/N)[1, \ldots, 1] \in R^{1 \times N}$. The scores of the new data are then calculated via $t_{new} = \bar{K}_{new}U$, where $U$ is calculated in Table 1. The partial derivative of $\bar{K}_{new}$ is calculated as follows:

$$\frac{\partial \bar{K}_{new}}{\partial v_l} = \frac{\partial K_{new}}{\partial v_l} - \frac{\partial K_{new}}{\partial v_l}E \tag{12}$$

$$\frac{\partial K_{new}}{\partial v_l}\Big|_i = -\frac{1}{c}(x_{new,l} - x_{i,l})^2\, k(x_{new}, x_i) \tag{13}$$

where $(\partial K_{new}/\partial v_l)|_i$ is the $i$th element of $\partial K_{new}/\partial v_l$. Equation 12 is obtained by computing the partial derivative of eq 11. The second and fourth terms on the right-hand side of eq 11 are neglected because they do not depend on the new data. According to eqs 12 and 13, $\partial \bar{K}_{new}/\partial v_l$ and $\bar{K}_{new}$ have the same dimension. The T2 and SPE statistics of the new sample are calculated as follows:

$$T_{new}^2 = t_{new}\Lambda^{-1}t_{new}^T \tag{14}$$

$$\begin{aligned} SPE_{new} &= \|\Phi(x_{new}) - \hat{\Phi}(x_{new})\|^2 \\ &= \Phi(x_{new})\Phi(x_{new})^T - 2\Phi(x_{new})\hat{\Phi}(x_{new})^T + \hat{\Phi}(x_{new})\hat{\Phi}(x_{new})^T \\ &= k(x_{new}, x_{new}) - 2\Phi(x_{new})Pt_{new}^T + t_{new}P^T P t_{new}^T \\ &= k(x_{new}, x_{new}) - 2\bar{K}_{new}Tt_{new}^T + t_{new}T^T\bar{K}Tt_{new}^T \end{aligned} \tag{15}$$

where $k(x_{new}, x_{new}) = 1 - (2/N)\sum_{i=1}^{N}K_{new,i} + (1/N^2)\sum_{i=1}^{N}\sum_{j=1}^{N}K_{ij}$ and $K_{new,i}$ is the $i$th element of $K_{new}$. According to eqs 14 and 15, the contributions of the $l$th variable in the new sample to the $T_{new}^2$ and $SPE_{new}$ statistics are calculated as follows:

$$cont_l^{T_{new}^2} = \left|\frac{\partial T_{new}^2}{\partial v_l}\right| = \left|\frac{\partial \bar{K}_{new}}{\partial v_l}U\Lambda^{-1}U^T\bar{K}_{new}^T + \bar{K}_{new}U\Lambda^{-1}U^T\left(\frac{\partial \bar{K}_{new}}{\partial v_l}\right)^T\right| \tag{16}$$

$$cont_l^{SPE_{new}} = \left|\frac{\partial SPE_{new}}{\partial v_l}\right| = \left|-\frac{2}{N}\sum_{i=1}^{N}\frac{\partial \bar{K}_{new}}{\partial v_l}\Big|_i - 2\frac{\partial \bar{K}_{new}}{\partial v_l}Tt_{new}^T + \frac{\partial \bar{K}_{new}}{\partial v_l}UT^T\bar{K}Tt_{new}^T + t_{new}T^T\bar{K}TU^T\left(\frac{\partial \bar{K}_{new}}{\partial v_l}\right)^T\right| \tag{17}$$

where $(\partial \bar{K}_{new}/\partial v_l)|_i$ denotes the $i$th element of $\partial \bar{K}_{new}/\partial v_l$.
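As an illustration of eqs 10−13 and 16, the T2 contributions of a new sample might be computed along the following lines. This is a sketch under assumptions: the function name `new_sample_t2_contributions` and its argument layout are hypothetical, U and the inverse of Λ are taken as given from the training stage, and the derivative follows eq 13 as printed. The exact derivative of the scaled Gaussian kernel at $v_l = 1$ carries an additional constant factor of 2, which would cancel whenever contributions are divided by a limit computed with the same formula.

```python
import numpy as np

def new_sample_t2_contributions(x_new, X, K, U, Lam_inv, c):
    """Absolute T^2 contributions of each variable for one new sample.

    x_new   : (M,) scaled new sample
    X       : (N, M) scaled training inputs
    K       : (N, N) uncentered training Gram matrix (eq 1)
    U       : (N, R) matrix such that t_new = K_bar_new @ U
    Lam_inv : (R, R) inverse of Lambda
    """
    N, M = X.shape
    diff = x_new[None, :] - X                            # (N, M) differences
    k_new = np.exp(-np.sum(diff**2, axis=1) / c)         # eq 10
    e_row = np.full(N, 1.0 / N)                          # E_L
    E = np.full((N, N), 1.0 / N)
    k_bar = k_new - e_row @ K - k_new @ E + e_row @ K @ E    # eq 11
    A = U @ Lam_inv @ U.T
    cont = np.empty(M)
    for l in range(M):
        dk = -(diff[:, l] ** 2) * k_new / c              # eq 13
        dk_bar = dk - dk @ E                             # eq 12
        cont[l] = abs(dk_bar @ A @ k_bar + k_bar @ A @ dk_bar)   # eq 16
    return cont
```

A variable whose value in the new sample coincides with its value in every training sample has a zero derivative in eq 13 and therefore a zero contribution, which matches the intent that only variables deviating from normal data are highlighted.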


Figure 1. Procedures of model development and online monitoring.

After the contributions of each variable to the monitoring statistics are obtained, the relative contributions are calculated as cont_l/C_UCL,l. If the relative contribution of a variable is larger than 1, this variable is considered to be responsible for the current fault.
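The limit C_UCL,l = m(cont_l) + 2.5758·s(cont_l) and the relative-contribution rule can be sketched as follows. The function names `contribution_limits` and `faulty_variables` are hypothetical; the only inputs assumed are an N×M array of contributions from normal data and an M-vector of contributions for the sample under diagnosis.

```python
import numpy as np

def contribution_limits(cont_train, z=2.5758):
    """Upper limit per variable: mean + z * sample standard deviation.

    cont_train : (N, M) contributions of each variable in each normal sample
    """
    return cont_train.mean(axis=0) + z * cont_train.std(axis=0, ddof=1)

def faulty_variables(cont_new, ucl):
    """Relative contributions and the indices of variables exceeding 1."""
    rel = cont_new / ucl
    return rel, np.flatnonzero(rel > 1.0)
```

The constant 2.5758 is the two-sided 99% quantile of the standard normal distribution, so the limit is only approximate when the contributions of normal data are far from Gaussian.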

4. ONLINE MONITORING PROCEDURE
4.1. Developing the Normal Operating Condition Model.
(1) Acquire normal operating data and normalize the data using the mean and standard deviation of each variable.
(2) Calculate the kernel matrix K according to eq 1.
(3) Carry out centering in the feature space to get K̅ according to eq 2.
(4) Carry out NIPALS for KPLS in Table 1 to calculate T and U.
(5) Calculate the monitoring statistics of the normal operating data according to eqs 3 and 4.
(6) Determine the confidence limits of the T2 and SPE charts.
(7) Calculate the contributions of each variable in each sample to the monitoring statistics according to eqs 8 and 9.
(8) Calculate the upper confidence limits for the contributions of each variable.

4.2. Online Monitoring.
(1) Obtain new data and normalize them using the mean and standard deviation obtained in step 1 of the modeling procedure.
(2) Calculate the new kernel vector Knew according to eq 10.
(3) Mean-center the new kernel vector to get K̅new according to eq 11.
(4) Calculate the monitoring statistics of the new data using eqs 14 and 15.
(5) Monitor the statistics. If any statistic exceeds its corresponding confidence limit, calculate the contributions of each variable to the monitoring statistics using eqs 16 and 17.
(6) Calculate the relative contributions of each variable to the monitoring statistics with the upper confidence limits obtained in step 8 of the modeling procedure.
(7) Monitor the relative contributions of each variable. A variable with any relative contribution that exceeds 1 is identified as a faulty variable.

The procedures of model development and online monitoring are described in Figure 1.

Figure 2. Monitoring results of numerical example in the case of fault 1 based on (a) PLS and (b) KPLS.

5. SIMULATION STUDY
5.1. Simulated Nonlinear Process. Consider the system composed of three variables as follows:

$$x_1 = t_1 + t_2^2 + e_1 \tag{18}$$

$$x_2 = 5t_2 - t_1^2 + e_2 \tag{19}$$

$$x_3 = 3t_1^2 - t_2^3 + e_3 \tag{20}$$

$$y_1 = x_1 + x_2 + x_3 \tag{21}$$

$$y_2 = 2x_1 - 3x_2 + x_3 \tag{22}$$

where e1, e2, and e3 are independent noise terms distributed as N(0, 0.01), t1 ∈ [0.01, 1.99], and t2 ∈ [0, ;3]. x1, x2, and x3 are input variables; y1 and y2 are output variables. 200 samples generated by eqs 18−22 are used to develop the normal operating condition model. The fault data with 200 samples are generated by the following artificial faults:
Fault 1: A step fault is introduced. x2 is changed by −50 from the 101st sample.
Fault 2: A ramp fault is introduced. x1 is increased by adding 0.6(k − 100) from the 101st sample, where k is the sample number.

Figure 3. Contribution plots of numerical example in the case of fault 1 based on (a) conventional contribution plots and (b) partial derivative contribution plots.

For fault 1, the monitoring charts of PLS and KPLS are shown in Figure 2. The fault is clearly detected by both the PLS and KPLS methods. All four statistics exceed their corresponding 99% confidence limits from the 101st sample. In the KPLS monitoring charts, fewer samples fall below the confidence limit than in the PLS charts. The PLS-based contribution plots are shown in Figure 3(a), where the contributions of each input variable in the 101st sample to the monitoring statistics are plotted. From Figure 3(a), the T2 contribution of x2 is higher than those of x1 and x3, but the T2 contributions of x2 and x3 both exceed 1. The SPE contribution of x3 exceeds 1 and is the highest. Both x2 and x3 would therefore be considered responsible for the fault, which is a misdiagnosis. The proposed partial derivative contribution plots are shown in Figure 3(b). The contributions of the faulty variable x2 are much higher than those of the other variables and exceed 1 in both the T2 and SPE contribution charts. The fault source variable is correctly picked out.

Figure 4. Monitoring results of numerical example in the case of fault 2 based on (a) PLS and (b) KPLS.

For fault 2, the monitoring charts of PLS and KPLS are shown in Figure 4. In Figure 4(a), the PLS monitoring statistics exceed their corresponding 99% confidence limits from the 120th sample, a detection delay of 20 samples. In Figure 4(b), the KPLS monitoring statistics exceed their confidence limits from the 110th sample, a delay of 10 samples. Conventional contributions of each input variable in the 121st sample to the PLS monitoring statistics are plotted in Figure 5(a). Both the T2 and SPE contributions of x3 are higher than those of the other two variables and exceed 1, so the diagnosis result is not correct. Partial derivative contributions of each input variable in the 111th sample to the KPLS monitoring statistics are plotted in Figure 5(b). The T2 and SPE contributions of x1 are much higher than those of the other variables. In this case, the KPLS-based contribution plots give the correct diagnosis result.

Figure 5. Contribution plots of numerical example in the case of fault 2 based on (a) conventional contribution plots and (b) partial derivative contribution plots.

In this simulation, two simple faults are detected successfully by both PLS and KPLS. For the step fault, the monitoring performance of KPLS is better, with fewer false alarms. For the ramp fault, KPLS detects the fault more quickly. The diagnosis results given by the conventional contribution plots are confusing, and the correct faulty variable is not identified in these plots. By contrast, the KPLS-based partial derivative contribution plots give the correct diagnosis in both cases.

5.2. Electro-Fused Magnesia Furnace (EFMF). As a kind of mine hot electric arc furnace, the electro-fused magnesia furnace (EFMF) is the equipment used to make electro-fused magnesia, and it is widely used in electro-fused magnesia production.29,30,39,40 It is composed of a transformer, heavy current line, electrode holder, electrodes, furnace body, etc. It makes use of the heat produced by electricity. The currents in the electrodes are so high that electric arcs form between the electrodes and the burden. The burden is melted by the heat produced along with the electric arcs. After the melting process has completed, the trolley under the furnace is moved away to cool the fused magnesia contained in it. The diagram of the EFMF is shown in Figure 6. The production process has nonlinear characteristics, so it is suitable for testing the effectiveness of the proposed method. PLS is also applied in the monitoring and diagnosis of this process.

Figure 6. Diagram of electro-fused magnesia furnace.

The temperature of the furnace is an important parameter, and it is determined by the value of the currents in the electrodes. In this simulation, the temperatures are selected as output variables and the three currents in the electrodes are chosen as input variables. 300 normal samples are used to develop the normal operating condition model, and 300 samples are used for testing. Fault 1 is introduced in the second input variable by sharply decreasing the value of the current from the 50th sample to the 100th sample. Fault 2 is introduced in the third input variable by gradually increasing the current from the 100th sample to the 200th sample.

Figure 7. Monitoring results of EFMF in the case of fault 1 based on (a) PLS and (b) KPLS.

Figure 8. Contribution plots of EFMF in the case of fault 1 based on (a) conventional contribution plots and (b) partial derivative contribution plots.

For fault 1, the monitoring results are shown in Figure 7. The fault is detected by both methods. All four statistics exceed the 99% confidence limits from the 51st sample to the 150th sample. The monitoring performance of KPLS is better than that of PLS, with fewer false alarms. Contributions of the 51st sample are plotted in Figure 8. As shown in Figure 8(a), the two conventional contribution charts give different diagnosis results, and it is hard to identify which variable is the fault source. By contrast, the proposed partial derivative contributions give one consistent diagnosis result, as shown in Figure 8(b). The contributions of the faulty variable to both monitoring statistics are much higher than those of the other variables, and the faulty variable is correctly identified.

For fault 2, the monitoring charts are shown in Figure 9 and the diagnosis results are shown in Figure 10. All four statistics exceed the 99% confidence limits from the 153rd sample. The T2 and SPE statistics of KPLS are similar to those of PLS. Contributions of each variable in the 153rd sample to the monitoring indices are plotted. From Figure 10(a), it is not clear from the two conventional contribution plots which variable is the fault source. By contrast, the proposed partial derivative contribution plots give correct and explicit results, as shown in Figure 10(b). The two contribution plots give the same diagnosis result: in both charts, the contributions of the third variable are much higher than those of the other variables.

In this simulation experiment, the monitoring performance of KPLS is better than that of PLS in the monitoring of fault 1, showing that KPLS is more suitable for monitoring nonlinear processes. If a fault is hardly detected, the diagnosis results do not make sense, because the identification methods are based on the monitoring statistics. Moreover, the PLS-based conventional contribution plots are not suitable for diagnosing faults that occur in this nonlinear process. By contrast, the proposed KPLS-based partial derivative contribution plots identify the faulty variables in this nonlinear process successfully.
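Returning to the numerical example of Section 5.1, the data of eqs 18−22 and the two artificial faults can be generated with a sketch like the one below. Several details are assumptions rather than statements from the text: the function name `simulate` is hypothetical, t1 and t2 are drawn uniformly, the upper bound of t2 is taken as 3 because the printed range is unclear, N(0, 0.01) is read as variance 0.01, and each fault is treated as a sensor fault that changes the measured input but not the outputs.

```python
import numpy as np

def simulate(n=200, fault=None, seed=0, t2_high=3.0):
    """Generate inputs X (n, 3) and outputs Y (n, 2) following eqs 18-22.

    fault=None gives normal data, fault=1 the step fault on x2,
    fault=2 the ramp fault on x1; both faults start at sample 101.
    """
    rng = np.random.default_rng(seed)
    t1 = rng.uniform(0.01, 1.99, n)           # assumed uniform sampling
    t2 = rng.uniform(0.0, t2_high, n)         # assumed upper bound
    e = rng.normal(0.0, 0.1, (n, 3))          # std 0.1, i.e. variance 0.01
    x1 = t1 + t2**2 + e[:, 0]                 # eq 18
    x2 = 5.0 * t2 - t1**2 + e[:, 1]           # eq 19
    x3 = 3.0 * t1**2 - t2**3 + e[:, 2]        # eq 20
    Y = np.column_stack([x1 + x2 + x3,        # eq 21
                         2.0 * x1 - 3.0 * x2 + x3])   # eq 22
    k = np.arange(1, n + 1)                   # 1-based sample number
    late = k >= 101
    if fault == 1:
        x2 = x2 + np.where(late, -50.0, 0.0)              # step fault
    elif fault == 2:
        x1 = x1 + np.where(late, 0.6 * (k - 100), 0.0)    # ramp fault
    return np.column_stack([x1, x2, x3]), Y
```

Using the same seed for the normal and faulty data sets makes the injected fault the only difference between them, which is convenient when checking a contribution-plot implementation against a known faulty variable.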

Figure 9. Monitoring results of EFMF in the case of fault 2 based on (a) PLS and (b) KPLS.

Figure 10. Contribution plots of EFMF in the case of fault 2 based on (a) conventional contribution plots and (b) partial derivative contribution plots.


6. CONCLUSIONS
In this paper, a KPLS-based partial derivative contribution plot is proposed. It is based on the contribution of each variable to the kernel function. In this method, the contribution of each variable is measured using the partial derivative of the kernel function employed in KPLS, and the contributions to the monitoring statistics are then calculated. This contribution reflects the magnitude of the deviation between a normal sample and a faulty sample, so it can correctly identify the faulty variables. Case studies on a numerical example and the electro-fused magnesia furnace (EFMF) are used to demonstrate the effectiveness of the proposed method, with the conventional contribution plot applied for comparison. In these simulations, simple faults that occur in one single variable are introduced. The results show that the proposed method gives correct identification results, whereas the conventional contribution plots are not always effective. Therefore, the proposed identification method is more suitable for identifying faults in nonlinear processes.

AUTHOR INFORMATION
Corresponding Author
*E-mail: [email protected].
Notes
The authors declare no competing financial interest.

ACKNOWLEDGMENTS
The work is supported by China’s National 973 program (2009CB320600) and NSF in China (61273163).

REFERENCES
(1) Hsu, C. C.; Su, C. T. An adaptive forecast-based chart for non-Gaussian process monitoring: With application to equipment malfunctions detection in a thermal power plant. IEEE Trans. Control Syst. Technol. 2011, 26, 1245−1250.
(2) Chen, T.; Zhang, J. On-line multivariate statistical monitoring of batch processes using Gaussian mixture model. Control Eng. Pract. 2010, 34, 500−507.
(3) Kano, M.; Nakagawa, Y. Data-based process monitoring process control and quality improvement: Recent developments and applications in steel industry. Comput. Chem. Eng. 2008, 32, 12−24.
(4) Samara, P. A.; Fouskitakis, G. N.; Sakellariou, J. S.; Fassois, S. D. A statistical method for the detection of sensor abrupt faults in aircraft control systems. IEEE Trans. Control Syst. Technol. 2008, 16, 789−798.
(5) Burnham, A. J.; Viveros, R.; MacGregor, J. F. Frameworks for latent variable multivariate regression. J. Chemom. 1996, 10, 31−45.
(6) Chen, Q.; Kruger, U. Analysis of extended partial least squares for monitoring large-scale processes. IEEE Trans. Control Syst. Technol. 2005, 13, 807−813.
(7) Muradore, R.; Fiorini, P. A PLS-based statistical approach for fault detection and isolation of robotic manipulators. IEEE Trans. Ind. Electron. Control Instrum. 2012, 59, 3167−3175.
(8) Chen, J. H.; Liu, K. C. On-line batch process monitoring using dynamic PCA and dynamic PLS models. Chem. Eng. Sci. 2002, 57, 63−75.
(9) Lee, H. W.; Lee, M. W.; Park, J. M. Multi-scale extension of PLS algorithm for advanced on-line process monitoring. Chemom. Intell. Lab. Syst. 2009, 98, 201−212.
(10) Lee, G.; Han, C.; Yoon, E. S. Multiple-Fault diagnosis of the Tennessee Eastman process based on system decomposition and dynamic PLS. Ind. Eng. Chem. Res. 2004, 43, 8037−8048.
(11) Kourti, T.; Nomikos, P.; MacGregor, J. F. Analysis, monitoring and fault diagnosis of batch processes using multiblock and multiway PLS. J. Process Control 1995, 5, 227−284.
(12) Li, G.; Liu, B.; Qin, S. J.; Zhou, D. Quality relevant data-driven modeling and monitoring of multivariate dynamic processes: The dynamic T-PLS approach. IEEE Trans. Neural Networks 2011, 22, 2262−2271.
(13) Yacoub, F.; MacGregor, J. F. Product optimization and control in the latent variable space of nonlinear PLS models. Chemom. Intell. Lab. Syst. 2004, 70, 63−74.
(14) Miller, P.; Swanson, R. E.; Heckler, C. E. Contribution plots: The missing link in multivariate quality control. Appl. Math. Comput. Sci. 1998, 8, 775−792.
(15) Westerhuis, J. A.; Gurden, S. P.; Smilde, A. K. Generalized contribution plots in multivariate statistical process monitoring. Chemom. Intell. Lab. Syst. 2000, 51, 95−114.
(16) Choi, S. W.; Lee, I. B. Multiblock PLS-based localized process diagnosis. J. Process Control 2005, 15, 295−306.
(17) Zhou, D.; Li, G.; Qin, S. J. Total projection to latent structures for process monitoring. AIChE J. 2010, 56, 168−178.
(18) Li, G.; Qin, S. J.; Ji, Y.; Zhou, D. Total PLS based contribution plots for fault diagnosis. Acta Autom. Sin. 2009, 35, 759−765.
(19) Yoon, S.; MacGregor, J. F. Fault diagnosis with multivariate statistical models, part I: Using steady state fault signatures. J. Process Control 2001, 11, 387−400.
(20) Alcala, C. F.; Qin, S. J. Reconstruction-based contribution for process monitoring. Automatica 2009, 45, 1593−1600.
(21) Li, G.; Alcala, C. F.; Qin, S. J.; Zhou, D. Generalized reconstruction-based contribution for output-relevant fault diagnosis with application to the Tennessee Eastman process. IEEE Trans. Control Syst. Technol. 2011, 19, 1114−1127.
(22) Willis, A. J. Condition monitoring of centrifuge vibrations using kernel PLS. Comput. Chem. Eng. 2010, 34, 349−353.
(23) Kim, K.; Lee, J. M.; Lee, I. B. A novel multivariate regression approach based on kernel partial least squares with orthogonal signal correction. Chemom. Intell. Lab. Syst. 2005, 79, 22−30.
(24) Rosipal, R.; Trejo, L. J. Kernel partial least squares regression in reproducing Kernel Hilbert space. J. Mach. Learn. Res. 2002, 2, 97−123.
(25) Jia, R.; Mao, Z.; Chang, Y.; Zhang, S. Kernel partial robust M-regression as a flexible robust nonlinear modeling technique. Chemom. Intell. Lab. Syst. 2010, 100, 91−98.
(26) Zhang, Y.; Teng, Y. Process data modeling using modified kernel partial least squares. Chem. Eng. Sci. 2010, 65, 6353−6361.
(27) Rosipal, R. Kernel partial least squares for nonlinear regression and discrimination. Neural Network World 2003, 13, 291−300.
(28) Zhang, Y.; Zhou, H.; Qin, S. J. Decentralized fault diagnosis of large-scale processes using multiblock kernel principal component analysis. Acta Autom. Sin. 2010, 36, 593−597.
(29) Zhang, Y.; Hu, Z. Multivariate process monitoring and analysis based on multi-scale KPLS. Chem. Eng. Res. Des. 2011, 89, 2667−2678.
(30) Zhang, Y.; Ma, C. Fault diagnosis of nonlinear processes using multiscale KPCA and multiscale KPLS. Chem. Eng. Sci. 2011, 66, 64−72.
(31) Cho, J. H.; Lee, J. M.; Choi, S. W.; Lee, D.; Lee, I. B. Fault identification for process monitoring using kernel principal component analysis. Chem. Eng. Sci. 2005, 60, 279−288.
(32) Wold, S.; Trygg, J.; Berglund, A.; Antti, H. Some recent development in PLS modeling. Chemom. Intell. Lab. Syst. 2001, 58, 131−150.
(33) Wold, S. Cross-validatory estimation of components in factor and principal components model. Technometrics 1978, 20, 397−405.
(34) Dayal, B. S.; MacGregor, J. F. Improved PLS algorithm. J. Chemom. 1997, 11, 73−85.
(35) Lee, J. M.; Yoo, C. K.; Choi, S. W.; Vanrolleghem, P. A.; Lee, I. B. Nonlinear process monitoring using kernel principal component analysis. Chem. Eng. Sci. 2004, 59, 223−234.
(36) Schölkopf, B.; Smola, A.; Müller, K. R. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 1998, 10, 1299−1319.
(37) Rakotomamonjy, A. Variable selection using SVM-based criteria. J. Mach. Learn. Res. 2003, 3, 1357−1370.
(38) Qin, S. J.; Valle, S.; Piovoso, M. J. On unifying multiblock analysis with application to decentralized process monitoring. J. Chemom. 2001, 15, 715−742.
(39) Dong, B.; Zhang, L.; Wu, Y.; Chai, T. Fuzzy control research on electrodes of electrical-fused magnesia furnace. Presented at the 20th Chinese Control and Decision Conference, Yantai, China, July 02−04, 2008.
(40) Zhang, Y.; Li, S.; Hu, Z. Improved multi-scale kernel principal component analysis and its application for fault detection. Chem. Eng. Res. Des. 2012, 90, 1271−1280.
(41) Zhang, Y.; Li, S. Modeling and monitoring between-mode transition of multimode processes. IEEE Trans. Ind. Inf. 2012, In press.
