Ind. Eng. Chem. Res. 2010, 49, 9175–9183
Output Relevant Fault Reconstruction and Fault Subspace Extraction in Total Projection to Latent Structures Models
Gang Li,† S. Joe Qin,*,‡ and Donghua Zhou*,†
Department of Automation, TNList, Tsinghua University, Beijing 100084, P. R. China, and Departments of Chemical Engineering and Materials Science and Department of Electrical Engineering, University of Southern California, Los Angeles, California 90089
Statistical data-driven process monitoring is critical for efficient operations of industrial processes. However, deviations from normal regions in the process data may or may not lead to poor quality of products. This paper proposes a new combined index for detecting output-relevant faults, which affect the output data, and studies the output-relevant fault detectability based on total projection to latent structures (T-PLS). Given the actual fault direction, fault-free data can be reconstructed and the output-relevant part of the fault magnitude can be estimated. Two new methods are derived to extract the output-relevant fault subspace from faulty data. A simulation example and a case study on the Tennessee Eastman process are used to show the effectiveness of the proposed methods.

* To whom correspondence should be addressed. E-mail: [email protected] (S.J.Q.); [email protected] (D.Z.).
† Tsinghua University.
‡ University of Southern California.

1. Introduction
Process monitoring is important for product quality, reliability, and maintainability of a complex industrial process. There are usually many process sensors installed in a plant. A monitoring technique that quantitatively represents the major relations between process variables and quality variables is required to detect abnormal situations. In the case of sensor faults, the faulty sensor should be identified and reconstructed from other normal sensors. In the case of process faults, the fault type needs to be identified from a set of candidate faults. If the fault is classified as one of the known faults that happened before, the fault can be corrected by proper maintenance. For many practical cases, a multivariate data-driven model is preferred.

Principal component analysis (PCA) is a reliable and simple technique for describing the correlation among process variables and has been used for a long time in process monitoring,1-3 fault identification and diagnosis,4-6 sensor validation,7 and fault reconstruction.8 Qin reviewed the multivariate statistical process monitoring tasks based on PCA models.9 PCA-based monitoring methods are able to monitor all the abnormal variations in process variables, but they cannot tell whether a fault is related to the output variables. In this paper, output variables refer to quality and yield variables, such as mole concentration of the product. Input variables refer to all the process variables that can be measured online. A fault that has an impact on output data is called an output-relevant fault. If one is interested in monitoring the abnormal situations that affect output data, one should use partial least-squares or projection to latent structures (PLS),10,11 which is built from input data X and output data Y. The purpose of a PLS model is to extract the covariance in both input and output variables and to model the relationship between them. PLS models have been used to monitor process operating performance.1,12 Recently, Li et al. revealed the geometric properties of PLS structure for process monitoring, compared the monitoring policies using various PLS models, and concluded that the standard PLS is
preferred for process monitoring over alternative PLS methods in the chemometrics literature.13 However, it is misleading to use the standard PLS model to detect output-relevant faults with the $T^2$ statistic and output-irrelevant faults with the $Q$ statistic. On one hand, the scores that form the $T^2$ statistic contain variations orthogonal to Y, which are unrelated to Y. On the other hand, PLS does not extract the variance of the X-space in descending order; therefore, the residual part still contains large variability and is not suitable for the $Q$ statistic. Zhou et al. proposed total PLS (T-PLS) for monitoring to resolve these problems, which is used as the foundation of this paper.14

Once a fault has been detected, it is important to diagnose an assignable cause for it. Li et al. proposed contribution plots based on the T-PLS model for output-relevant fault diagnosis.15 If the actual fault direction is known, the fault can be analyzed further, including recovering fault-free data and estimating the fault magnitude. In this paper, we perform output-relevant fault reconstruction and estimation based on T-PLS models. To extract the output-relevant fault subspace, we propose two new methods and compare them with the existing approach. In the context of PCA models, reconstruction is defined in the work of Dunia and Qin (1998)8 and a combined index based reconstruction in the work of Yue and Qin (2001).6 The extraction of fault subspace or directions is proposed in the work of Valle et al. (2001).16 Several ideas of this paper are inspired by these early works and applied to output-relevant faults based on T-PLS models.

The organization of this paper is as follows. First, the fault detection methods based on T-PLS models are reviewed in section 2. Then, we propose a new combined index to detect output-relevant faults, and study output-relevant fault detectability. Following that, the output-relevant fault reconstruction with this index is performed and analyzed in section 3.
Section 4 considers how to extract output-relevant fault subspaces efficiently from historical faulty data. In section 5, we illustrate the proposed methods using a simulation case and the Tennessee Eastman process (TEP). Finally, we present our conclusions in the last section.
10.1021/ie901939n 2010 American Chemical Society Published on Web 08/26/2010
Ind. Eng. Chem. Res., Vol. 49, No. 19, 2010
2. Output-Relevant Fault Detection and Detectability
Output-Relevant Fault Detection Based on the T-PLS Model. Let $X \in \mathbb{R}^{n\times m}$ be the input matrix consisting of n samples with m process variables per sample, and $Y \in \mathbb{R}^{n\times p}$ be the output matrix with p quality variables per sample. It is assumed that X and Y are centered to zero mean and scaled to unit variance. PLS decomposes X and Y as follows (referring to ref 12):

$$\begin{cases} X = TP^T + E \\ Y = TQ^T + F \end{cases} \tag{1}$$
where $T \in \mathbb{R}^{n\times A}$ is the score matrix, and $P \in \mathbb{R}^{m\times A}$ and $Q \in \mathbb{R}^{p\times A}$ are the loading matrices for X and Y, respectively. A is the number of PLS components, which is usually determined by cross validation. E and F are the residual matrices of X and Y, respectively. In the PLS procedure, the weight matrix W is used to calculate the matrix T. Let $R = W(P^TW)^{-1}$; then $T = XR$.

The PLS algorithm extracts the scores T to maximize the covariance between X and Y. The scores T have traditionally been monitored using the $T^2$ statistic and regarded as factors related to the output Y. The PLS residuals have traditionally been considered irrelevant to Y and monitored by the $Q$ statistic. However, as revealed by Zhou et al.,14 neither of these statements is accurate. First, the PLS scores contain components orthogonal to Y, which form an output-irrelevant part. This is why orthogonal PLS methods have been proposed in recent years.17,18 Second, the X-residuals in PLS still contain a large variability, as PLS does not extract the X-variance in descending order. As a consequence, using the $Q$ statistic for PLS residuals is not appropriate. A further decomposition of the PLS residual subspace is necessary to apply the $Q$ statistic appropriately. In addition, significant variations in the X-residuals may also affect the product quality. This subspace is not modeled in the PLS model simply because it is not excited in the data that are used to build the PLS model. In summary, variations in both PLS scores and residuals can be relevant to the output Y. The T-PLS algorithm further decomposes the PLS scores and residuals to sort out components that are relevant to the output Y.14

The T-PLS algorithm is used to obtain a further decomposition of the X-residuals and X-scores (see Appendix A):

$$\begin{cases} X = T_yP_y^T + T_oP_o^T + T_rP_r^T + E_r \\ Y = T_yQ_y^T + F \end{cases} \tag{2}$$
where $T_y \in \mathbb{R}^{n\times A_y}$, $T_o \in \mathbb{R}^{n\times (A-A_y)}$, and $T_r \in \mathbb{R}^{n\times A_r}$ are three score matrices and $P_y \in \mathbb{R}^{m\times A_y}$, $P_o \in \mathbb{R}^{m\times (A-A_y)}$, and $P_r \in \mathbb{R}^{m\times A_r}$ are the corresponding loading matrices. $Q_y \in \mathbb{R}^{p\times A_y}$ is the new loading matrix for Y corresponding to $T_y$. $E_r = E(I - P_rP_r^T)$ is the new residual matrix after performing PCA on E. $A_y$ is the number of output-relevant components in the PLS scores and $A_r$ is the number of significant components in the PLS residuals. In the T-PLS model, $T_y$ represents the variations in the original T of the PLS model that are related only to Y, $T_o$ represents the variations in T orthogonal to Y, $T_r$ is the major part of the original X-residual E, and $E_r$ is the residual part of E after $T_r$ is removed.

The T-PLS model decomposes the X-space into four subspaces. They are the range spaces of $P_y$, $P_o$, $P_r$, and $(I - P_rP_r^T)(I - PR^T)$, with dimensions $A_y$, $A - A_y$, $A_r$, and $m - A - A_r$, and are denoted by $S_y$, $S_o$, $S_{rp}$, and $S_{rr}$, respectively. A new sample vector x is partitioned into four portions:

$$x = \hat{x}_y + \hat{x}_o + \hat{x}_r + \tilde{x}_r \tag{3a}$$

$$\hat{x}_y = P_yQ_y^TQR^Tx \equiv C_1x \in S_y \tag{3b}$$

$$\hat{x}_o = (P - P_yQ_y^TQ)R^Tx \equiv C_2x \in S_o \tag{3c}$$

$$\hat{x}_r = P_rP_r^T(I - PR^T)x \equiv C_3x \in S_{rp} \tag{3d}$$

$$\tilde{x}_r = (I - P_rP_r^T)(I - PR^T)x \equiv C_4x \in S_{rr} \tag{3e}$$
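The partition in (3a)-(3e) can be checked numerically. The following is a minimal NumPy sketch rather than the authors' implementation: it assumes a standard NIPALS PLS with deflation, uses an SVD for the PCA steps of Appendix A, and runs on synthetic data. The function names (`nipals_pls`, `pca_loadings`, `tpls_projections`) and the data are hypothetical. $C_2$ is formed directly from (3c), so the loading matrix $P_o$ is not computed here.

```python
import numpy as np

def nipals_pls(X, Y, A, tol=1e-12, max_iter=1000):
    """NIPALS PLS with deflation: X = T P^T + E, Y = T Q^T + F, T = X R."""
    E, F = X.copy(), Y.copy()
    W, P, Q = [], [], []
    for a in range(A):
        u = F[:, [0]]                          # initial output score (first column of F)
        for it in range(max_iter):
            w = E.T @ u
            w /= np.linalg.norm(w)             # unit-norm weight vector
            t = E @ w                          # input score
            q = F.T @ t / (t.T @ t)            # output loading
            u_new = F @ q / (q.T @ q)
            done = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if done:
                break
        p = E.T @ t / (t.T @ t)                # input loading
        E = E - t @ p.T                        # deflate X
        F = F - t @ q.T                        # deflate Y
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.hstack(W), np.hstack(P), np.hstack(Q)
    R = W @ np.linalg.inv(P.T @ W)
    return P, Q, R, E

def pca_loadings(M, k):
    """First k right singular vectors of M, used as PCA loadings."""
    _, _, Vt = np.linalg.svd(M, full_matrices=False)
    return Vt[:k].T

def tpls_projections(X, Y, A, Ar):
    """Projection matrices C1..C4 of eqs 3b-3e (assumed sketch of Appendix A)."""
    P, Q, R, E = nipals_pls(X, Y, A)
    T = X @ R
    Ay = np.linalg.matrix_rank(Q)
    Qy = pca_loadings(T @ Q.T, Ay)             # Y_hat = T Q^T = Ty Qy^T
    Ty = T @ Q.T @ Qy
    Py = (T @ P.T).T @ Ty @ np.linalg.inv(Ty.T @ Ty)
    Pr = pca_loadings(E, Ar)                   # PCA on the PLS residual E
    I = np.eye(X.shape[1])
    C1 = Py @ Qy.T @ Q @ R.T
    C2 = (P - Py @ Qy.T @ Q) @ R.T
    C3 = Pr @ Pr.T @ (I - P @ R.T)
    C4 = (I - Pr @ Pr.T) @ (I - P @ R.T)
    return C1, C2, C3, C4

# Synthetic, hypothetical data: 100 samples, 5 inputs, 1 output.
rng = np.random.default_rng(0)
n, m = 100, 5
X = rng.standard_normal((n, 3)) @ rng.standard_normal((3, m)) + 0.3 * rng.standard_normal((n, m))
Y = X @ rng.standard_normal((m, 1)) + 0.1 * rng.standard_normal((n, 1))
X = (X - X.mean(0)) / X.std(0, ddof=1)         # center and scale to unit variance
Y = (Y - Y.mean(0)) / Y.std(0, ddof=1)

C1, C2, C3, C4 = tpls_projections(X, Y, A=2, Ar=1)
x = X[0]
parts = [C @ x for C in (C1, C2, C3, C4)]      # x_y, x_o, x_r, residual part
print(np.allclose(sum(parts), x))              # checks eq 3a for one sample
```

Because $C_1 + C_2 = PR^T$ and $C_3 + C_4 = I - PR^T$, the four portions always sum to x; the sketch only makes that bookkeeping concrete.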
It has been pointed out that only abnormal situations in $S_y$ and $S_{rr}$ may affect the output data.14 Therefore, we only need $T_y^2$ and $Q_r$ for detecting output-relevant faults. Subspace $S_y$ contains the output-relevant part of the process variation, for which the $T^2$ statistic is suitable:

$$T_y^2 = t_y^T\Lambda_y^{-1}t_y = x^TRQ^TQ_y\Lambda_y^{-1}Q_y^TQR^Tx \tag{4}$$

where $\Lambda_y = [1/(n-1)]T_y^TT_y$ is the covariance matrix of $t_y$. If the process is normal,

$$T_y^2 \le \delta_y^2 = \frac{A_y(n^2-1)}{n(n-A_y)}F_{A_y,\,n-A_y} \tag{5}$$

with $1-\alpha$ confidence, where $F_{A_y,\,n-A_y}$ is the F-distribution with $A_y$ and $n - A_y$ degrees of freedom. Subspace $S_{rr}$ represents the residual part of the X-space that is not excited in the normal process data, but can have an impact on the output if a fault happens in this subspace. It is appropriate to use the $Q$ statistic,

$$Q_r = \|\tilde{x}_r\|^2 = \|C_4x\|^2 \tag{6}$$

If the process is normal,

$$Q_r \le \delta_r^2 = \frac{S}{2\mu}\chi^2_{2\mu^2/S} \tag{7}$$

with $1-\alpha$ confidence, where $\mu$ and $S$ are the sample mean and variance of $Q_r$ and $\chi^2$ denotes the $\chi^2$ distribution.

Output-Relevant Fault Detection Using a Combined Index. In the T-PLS model, $S_y$ represents the known variation that is related to Y, while $S_{rr}$ represents the unknown variation that may be related to Y. Therefore, $T_y^2$ and $Q_r$ detect two different kinds of faults that may affect Y.14 Neither should be omitted when monitoring faults related to Y. In practice, one index rather than two indices is preferred to monitor the process. Yue and Qin gave a combination of the $T^2$ and $Q$ statistics in PCA-based methods.6 Similarly, we propose a combined index, which incorporates $T_y^2$ and $Q_r$ in a balanced way. Considering their respective control limits, the two indices can be combined as follows.

$$\varphi = \frac{T_y^2}{\delta_y^2} + \frac{Q_r}{\delta_r^2} = x^T\Phi x = \|\Phi^{1/2}x\|^2 \tag{8}$$
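As a small illustration of (8), the sketch below evaluates the combined index for one sample in two ways: from $T_y^2$ and $Q_r$ directly, and as the quadratic form $x^T\Phi x$. All inputs are hypothetical stand-ins and are not taken from the paper: `M_y` plays the role of $Q_y^TQR^T$ (so that $t_y$ = `M_y @ x`), `C4` is a simple projector standing in for $C_4$, and the limits `delta_y2` and `delta_r2` are arbitrary numbers rather than values computed from (5) and (7).

```python
import numpy as np

def combined_index(x, M_y, Lambda_y, C4, delta_y2, delta_r2):
    """Return (phi, T2_y, Q_r) for one sample x, following eqs 4, 6, and 8."""
    t_y = M_y @ x                                        # output-relevant scores
    T2_y = float(t_y @ np.linalg.solve(Lambda_y, t_y))   # t_y^T Lambda_y^{-1} t_y
    Q_r = float(np.sum((C4 @ x) ** 2))                   # squared norm of the residual part
    return T2_y / delta_y2 + Q_r / delta_r2, T2_y, Q_r

def phi_matrix(M_y, Lambda_y, C4, delta_y2, delta_r2):
    """Matrix Phi such that phi = x^T Phi x."""
    return M_y.T @ np.linalg.solve(Lambda_y, M_y) / delta_y2 + C4.T @ C4 / delta_r2

# Hypothetical stand-ins (assumptions for illustration only).
rng = np.random.default_rng(0)
m = 4
M_y = rng.standard_normal((1, m))              # stands in for Qy^T Q R^T with A_y = 1
Lambda_y = np.array([[2.0]])                   # stands in for the covariance of t_y
B, _ = np.linalg.qr(rng.standard_normal((m, 2)))
C4 = np.eye(m) - B @ B.T                       # a projector standing in for C4
delta_y2, delta_r2 = 3.8, 1.5                  # arbitrary control limits

x = rng.standard_normal(m)
phi, T2_y, Q_r = combined_index(x, M_y, Lambda_y, C4, delta_y2, delta_r2)
Phi = phi_matrix(M_y, Lambda_y, C4, delta_y2, delta_r2)
print(phi, x @ Phi @ x)                        # the two evaluations should agree
```

Forming `Phi` once is convenient when many samples are monitored, since each new sample then costs a single quadratic form.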
Substituting (4) and (6) into (8), we obtain

$$\Phi = \frac{RQ^TQ_y\Lambda_y^{-1}Q_y^TQR^T}{\delta_y^2} + \frac{(I - RP^T)(I - P_rP_r^T)(I - PR^T)}{\delta_r^2} \tag{9}$$

Notice that $I - P_rP_r^T$ is an idempotent matrix according to the properties of T-PLS. Thus, $\Phi$ is symmetric and positive semidefinite. From (8), $\varphi$ is a quadratic function of x. We use an approximate distribution $g\chi_h^2$ to calculate the confidence limits.19 The process is considered free of output-relevant faults if

$$\varphi < \zeta^2 = g\chi^2_{h,\alpha} \tag{10}$$

The scaling factor g and the degrees of freedom h are given by

$$g = \frac{\operatorname{tr}[(S\Phi)^2]}{\operatorname{tr}(S\Phi)} \tag{11}$$

$$h = \frac{[\operatorname{tr}(S\Phi)]^2}{\operatorname{tr}[(S\Phi)^2]} \tag{12}$$

where $S = \operatorname{cov}(x) \approx [1/(n-1)]X^TX$ is the covariance matrix of x.

It is interesting to consider different values of $\zeta^2$. If $\zeta^2 = 1$, then $\varphi < \zeta^2$ implies $T_y^2 < \delta_y^2$ and $Q_r < \delta_r^2$. However, $\varphi$ may exceed the control limit even though $T_y^2$ and $Q_r$ are both normal. Thus, $\zeta^2 = 1$ is conservative, which increases false alarms. If $\zeta^2 = 2$, then $\varphi > \zeta^2$ implies $T_y^2 > \delta_y^2$ or $Q_r > \delta_r^2$. However, when $\varphi$ is under the control limit, $T_y^2$ or $Q_r$ may still exceed its corresponding limit. Thus, $\zeta^2 = 2$ is loose, which reduces effective alarms. If $1 < \zeta^2 < 2$, the control limit is a trade-off between detection rate and false alarm rate. The comparison of confidence regions with different $\zeta^2$ is illustrated in Figure 1, where the dashed rectangular region is the confidence area defined by the statistics $T_y^2$ and $Q_r$, and the solid ellipses are confidence areas defined by $\varphi$. It can be seen that the elliptical area can approximate the rectangular area with a proper $\zeta^2$, which is calculated by (10). Consequently, $\varphi$ can detect output-relevant faults effectively.

Figure 1. Comparison of control regions with different $\zeta^2$ values.

Output-Relevant Fault Detectability. When a fault occurs, we can represent the faulty sample as

$$x = x^* + \Xi f \tag{13}$$

where $x^*$ is the normal part of x without the fault and $\Xi f$ is the fault part added to the measurement. This representation can describe not only sensor faults, but also process faults. The fault direction matrix $\Xi \in \mathbb{R}^{m\times A_f}$ is an orthonormal matrix that spans the fault subspace with dimension $A_f$, and f denotes the magnitude of the fault. $\Xi$ can be derived from the process data or from first-principles relations. f can be an abrupt fault or a slowly varying fault. Substituting (13) into (8), we obtain

$$\varphi(x) = \|\Phi^{1/2}x^* + \Phi^{1/2}\Xi f\|^2 = \|\bar{x}^* + \bar{\Xi}f\|^2 \tag{14}$$

where $\bar{x}^* = \Phi^{1/2}x^*$ represents the normal part and $\bar{\Xi}f = \Phi^{1/2}\Xi f$ represents the contribution of the faulty part. When an output-relevant fault is detected, $\varphi(x) > \zeta^2$, which implies that $\|\bar{\Xi}f\| > 0$. To satisfy this for all nonzero f, $\bar{\Xi}$ must have full column rank. If, on the other hand, $\bar{\Xi}$ does not have full column rank, then nonzero faults that are in the null space of $\bar{\Xi}$ cannot be detected, which implies that they are output-irrelevant. If the fault can be detected by $\varphi$ for all f with large enough magnitudes, then this kind of fault is called completely output-relevant. If the fault can be detected by $\varphi$ only for some f with large enough magnitudes, then this kind of fault is called partially output-relevant. Denote $\bar{S} = \operatorname{span}\{\bar{\Xi}\}$. Lemma 1 gives conditions for whether a fault is output-relevant.

Lemma 1. The output-relevant detectability for a fault can be determined as follows:
(1) If $\dim(\bar{S}) = A_f$, then the fault F is completely output-relevant.
(2) If $\dim(\bar{S}) = 0$, then the fault F is completely output-irrelevant.
(3) Otherwise, the fault F is partially output-relevant.

Lemma 1 is proven in Appendix B. In practice, we can use the equivalent mathematical condition. The mathematical criterion of complete output-relevance can be described as

$$\sigma_{\min}(\bar{\Xi}) > 0 \tag{15}$$

And the mathematical criterion of partial output-relevance can be described as

$$\sigma_{\max}(\bar{\Xi}) > \sigma_{\min}(\bar{\Xi}) = 0 \tag{16}$$

where $\sigma_{\min}(\bar{\Xi})$ and $\sigma_{\max}(\bar{\Xi})$ represent the minimum and the maximum singular values of $\bar{\Xi}$.

3. Output-Relevant Fault Reconstruction
Output-Relevant Fault Reconstruction Based on the T-PLS Model. In this section, we propose an output-relevant fault reconstruction strategy based on the T-PLS model to eliminate the fault effect on Y and at the same time estimate the output-relevant fault magnitude. First, assuming that the actual fault has been identified with fault direction $\Xi$, the faulty sample x can be corrected along the given fault direction by

$$x_e = x - \Xi f_e \tag{17}$$

where $x_e$ is the corrected value and $f_e$ is the estimated fault magnitude. Then, we calculate the reconstructed $\varphi$ for $x_e$.

$$\varphi(x_e) = x_e^T\Phi x_e = (x - \Xi f_e)^T\Phi(x - \Xi f_e) \tag{18}$$

In order to eliminate the fault effect on the output as much as possible, we solve the following optimization problem:

$$\min_{f_e}\{\varphi(x_e)\} \tag{19}$$

Problem 19 is an unconstrained least-squares problem and has the following analytical solution.

$$f_e = (\Xi^T\Phi\Xi)^+\Xi^T\Phi x \equiv Bx \tag{20}$$

where $(\cdot)^+$ means the Moore-Penrose pseudoinverse of a matrix. If $\Xi^T\Phi\Xi$ has full rank, the solution (20) is unique and optimal for objective (19). If $\Xi^T\Phi\Xi$ does not have full rank, the solution to problem 19 is not unique, and (20) gives the minimum norm solution.

The objective of fault reconstruction based on $\varphi$ is to pull the faulty sample back into the normal region defined in subspace $S_y \oplus S_{rr}$ along the direction $\Xi$, where $\oplus$ means the direct sum of two subspaces. Although the reconstructed sample may still be abnormal in subspace $S_o \oplus S_{rp}$, it does not affect output Y. Simultaneous reconstruction in all subspaces can be
performed, but it is beyond the interest of this paper. When $\varphi$ is minimized, the fault effect on Y is removed as much as possible, which is the essential difference from PCA-based reconstruction. If the fault $\Xi$ is completely output-relevant, (20) is the weighted least-squares solution, which yields complete reconstruction. If the fault $\Xi$ is partially output-relevant, which makes $\bar{\Xi}$ rank-deficient, the minimum norm solution of (20) results, yielding a partial reconstruction. Comparing fault detectability and reconstructability based on $\varphi$, we see that the criterion for fault reconstructability is exactly the same as that for fault detectability. When a fault is completely detectable, it is also completely reconstructible. Notice that a fault should be detected first before it can be reconstructed. After a fault is detected, only the output-relevant portion of the fault magnitude is reconstructed.

Reconstruction Error and Estimation Error. The reconstruction error is the difference between the reconstructed value and the normal value. From (13), (17), and (20), the reconstruction error is calculated as

$$\varepsilon = x^* - x_e = \Xi B(x^* + \Xi f) - \Xi f = \Xi Bx^* + (\Xi B\Xi - \Xi)f \tag{21}$$

Equation 21 means the reconstruction error generally depends on both fault direction and fault magnitude. However, if the fault is completely reconstructible, $(\Xi^T\Phi\Xi)^+ = (\Xi^T\Phi\Xi)^{-1}$. Thus,

$$B\Xi = (\Xi^T\Phi\Xi)^{-1}\Xi^T\Phi\Xi = I \tag{22}$$

Substituting (22) into (21), we have

$$\varepsilon = \Xi Bx^* \tag{23}$$

Therefore, in the case of complete reconstruction, the reconstruction error depends on the fault direction only. Further, $E(\varepsilon) = 0$ and $\operatorname{cov}(\varepsilon) = \Xi BSB^T\Xi^T$, where $E(\cdot)$ and $\operatorname{cov}(\cdot)$ are the expectation and covariance matrix of the random vector. The fault estimation error is the bias of the estimated fault from the actual fault. From (20), the estimation error is obtained as

$$w = f - f_e = f - B(x^* + \Xi f) = (I - B\Xi)f - Bx^* \tag{24}$$

In general, $f_e$ is a biased estimate of f according to (24). If the fault is completely reconstructible, the result can be simplified to

$$w = -Bx^* \tag{25}$$

Thus, $E(w) = 0$ and $\operatorname{cov}(w) = BSB^T$, which means fault estimation is unbiased for the case of complete reconstruction. For the case of complete reconstruction, the reconstructed $\varphi$ can be further calculated by

$$\begin{aligned} \varphi(x_e) &= x_e^T\Phi x_e \\ &= x^{*T}(I - \Xi B)^T\Phi(I - \Xi B)x^* \\ &= x^{*T}\Phi x^* - x^{*T}\Phi^T\Xi Bx^* \\ &= \varphi(x^*) - x^{*T}\Phi^T\Xi(\Xi^T\Phi\Xi)^{-1}\Xi^T\Phi x^* \\ &\le \varphi(x^*) \le \zeta^2 \end{aligned} \tag{26}$$

Equation 26 indicates that the reconstructed $\varphi$ can always be normal whatever f is.

4. Fault Subspace Extraction
In the preceding sections, we assume that the fault direction matrix $\Xi$ is known beforehand. The fault direction matrix is easily derived for sensor faults. Taking a process consisting of five sensors as an example, if the third sensor is faulty, the direction vector is

$$\Xi = [0\ \ 0\ \ 1\ \ 0\ \ 0]^T$$

If the second and the fourth sensors are faulty simultaneously and independently, the direction matrix is

$$\Xi = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}^T$$

However, it is hard to derive the fault direction for process faults or sensor faults in a closed-loop control process. If a fault occurred in the past, we can extract the fault direction matrix from historical faulty samples.

Extraction of the Whole Fault Subspace. Yue and Qin proposed a method to extract the whole fault subspace as follows.6 Let $X_f$ represent the faulty data matrix under the fault $\Xi$. Denote the kth sample as $x_k$, then

$$x_k = x_k^* + \Xi f_k \tag{27}$$

As $x_k^*$ is zero-mean, it can be neglected with some averaging schemes, such as the moving-window average. Denote $\underline{x}_k$ as the sample processed with the scheme. Then

$$\underline{x}_k \approx \Xi\underline{f}_k \tag{28}$$

Therefore, forming a matrix of averaged faulty data,

$$\underline{X}_f^T \approx \Xi[\underline{f}_1, \ldots, \underline{f}_n] \tag{29}$$

and performing singular value decomposition (SVD) on $\underline{X}_f^T$,

$$\underline{X}_f^T = UDV^T \tag{30}$$

where the diagonal matrix D has nonzero singular values in descending order, we can choose $\Xi = U$. In practice, the dimension of the fault subspace is the minimum dimension which brings the reconstructed $\varphi$ under the control limit. This approach to fault subspace extraction is the same as that used in PCA-based fault subspace extraction. It requires no knowledge of the output data for the fault period. To determine whether the extracted fault direction is output-relevant or not, Lemma 1 can be used directly. If the output data Y are available for the same faulty period, a PLS approach can be developed to extract output-relevant fault directions, which is discussed in the next subsection.

Extraction of Output-Relevant Fault Subspace. As we want to remove the fault effect on Y as much as possible, the output-relevant fault subspace, denoted by $\Xi_y$, is preferred. In this section, we extract the output-relevant fault subspace from historical faulty data $X_f$ and $Y_f$. We assume the faulty process data are processed as in (29), and the faulty quality data are processed in the same way, denoted by $\underline{Y}_f$. Performing nonlinear iterative partial least-squares on $\underline{X}_f$ and $\underline{Y}_f$ gives

$$\begin{cases} \underline{X}_f = T_fP_f^T + E_f \\ \underline{Y}_f = T_fQ_f^T + F_f \end{cases} \tag{31}$$

We can choose $\operatorname{span}\{P_f\}$ as the fault subspace related to Y. Since $P_f$ is not orthonormal, we perform SVD on $P_f$ to obtain

$$P_f = U_fD_fV_f^T \tag{32}$$
where $D_f$ contains the $A_f$ nonzero singular values in descending order. Now we can take $\Xi_y = U_f$. According to the properties of the PLS structure, the extracted direction matrix $\Xi_y$ represents the fault directions related to Y, which is more suitable for output-relevant fault reconstruction. However, the PLS component number $A_f$ should not be decided via cross-validation. During the iterative process in NIPALS, we determine $A_f$ as the minimum dimension which can bring the reconstructed $\varphi$ under the control limit. In summary, this approach needs faulty quality data $Y_f$ and can extract the output-relevant fault direction $\Xi_y$.

Extraction of Reduced Fault Subspace. As shown in section 2, the fault direction $\Xi$ is not used directly in determining fault detectability. In fact, the reduced fault direction $\bar{\Xi}$ is enough in all tasks. Therefore, we extract $\bar{\Xi}$ instead of $\Xi$ from faulty X in this subsection. Considering that $\|\bar{x}^*\| < \zeta$ and $\bar{\Xi}f$ is usually large, we have

$$\bar{x}_k = \bar{x}_k^* + \bar{\Xi}f_k \approx \bar{\Xi}f_k \tag{33}$$

Thus, forming

$$\bar{X}_f^T = \bar{\Xi}[f_1, \ldots, f_n] \tag{34}$$

and performing SVD on $\bar{X}_f^T$,

$$\bar{X}_f^T = \bar{U}\bar{D}\bar{V}^T \tag{35}$$

where the diagonal matrix $\bar{D}$ has nonzero singular values in descending order, we can take $\bar{\Xi} = \bar{U}$. The dimension of the reduced fault subspace is also the minimum dimension which brings the reconstructed $\varphi$ under the control limit.

The whole fault subspace can reflect all information of a fault, hence it is widely used in the PCA method. However, it is more efficient to extract the output-relevant fault subspace. The reduced fault subspace from (35) is the intersection between the output-relevant subspace and the fault subspace. Thus, it can capture the output-relevant fault information. Generally speaking, the whole fault subspace extracted in (30) can be output-relevant or output-irrelevant. The fault subspace extracted in (32) is exactly output-relevant as reflected from the faulty output and input data. The reduced subspace extracted from (35) is possibly output-relevant, but the relevance of variations in $S_{rp}$ is not confirmed by the output under this fault. The comparison of these three approaches is shown in the simulation case study.
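The extraction of the whole fault subspace in (27)-(30), combined with the reconstruction of (17) and (20), can be sketched as follows. This is an illustrative sketch on synthetic data, not the authors' code: the helper names are hypothetical, `Phi` is set to the identity and `zeta2` to 1.0 as stand-ins for the matrix of (9) and the limit of (10), and "under the control limit" is interpreted here as at least 99% of the faulty samples, which is an assumption.

```python
import numpy as np

def moving_average(X, l):
    """Row-wise moving-window average over l consecutive samples."""
    c = np.cumsum(np.vstack([np.zeros((1, X.shape[1])), X]), axis=0)
    return (c[l:] - c[:-l]) / l

def reconstruct(X, Xi, Phi):
    """Correct each row x of X along Xi: f_e = (Xi^T Phi Xi)^+ Xi^T Phi x, x_e = x - Xi f_e."""
    B = np.linalg.pinv(Xi.T @ Phi @ Xi) @ Xi.T @ Phi
    F_e = X @ B.T                                   # estimated fault magnitudes, one row per sample
    return X - F_e @ Xi.T, F_e

def extract_fault_subspace(X_f, Phi, zeta2, l, coverage=0.99):
    """Smallest leading block of U (SVD of the averaged faulty data) that
    brings the reconstructed index under zeta2 for the required share of samples."""
    U, _, _ = np.linalg.svd(moving_average(X_f, l).T, full_matrices=False)
    for d in range(1, U.shape[1] + 1):
        Xi = U[:, :d]
        X_e, _ = reconstruct(X_f, Xi, Phi)
        phi = np.einsum("ij,jk,ik->i", X_e, Phi, X_e)   # x_e^T Phi x_e per sample
        if np.mean(phi <= zeta2) >= coverage:
            return Xi
    return U

# Synthetic faulty data (assumed): small normal variation plus a slowly varying 2-D fault.
rng = np.random.default_rng(1)
m, n = 5, 300
Xi_true, _ = np.linalg.qr(rng.standard_normal((m, 2)))  # orthonormal fault directions
k = np.arange(n)
F = np.column_stack([10 * np.sin(k / 20), 8 * np.cos(k / 20)])
X_f = 0.1 * rng.standard_normal((n, m)) + F @ Xi_true.T

Phi = np.eye(m)        # hypothetical stand-in for the matrix in eq 9
zeta2 = 1.0            # hypothetical stand-in for the control limit in eq 10
Xi_hat = extract_fault_subspace(X_f, Phi, zeta2, l=20)
print(Xi_hat.shape)
```

For the reduced fault subspace of (33)-(35), the same routine could be applied to the transformed samples $\bar{x}_k = \Phi^{1/2}x_k$ in place of $x_k$; that variant is not shown here.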
5. Simulation and Application Studies
In this section, we use a numerical simulation to demonstrate the concept of fault detectability and reconstructability based on $\varphi$. Then a study on the Tennessee Eastman process (TEP) is used to compare three fault subspace extraction methods, and to illustrate the fault reconstruction and estimation further.

Numerical Simulation. We simulate a multivariate system with strong correlation among the sensors. The normal measured data are generated by the following model:

$$\begin{cases} x_k = Gz_k + e_k \\ y_k = Cx_k + v_k \end{cases} \tag{36}$$

where

$$z_k \in \mathbb{R}^3 \sim U([0, 1]),\quad G = \begin{pmatrix} 1 & 0 & 4 & 1 & 2 \\ 2 & 1 & 1 & 0 & 1 \\ 1 & 0 & 3 & 2 & -1 \end{pmatrix}^T,\quad C = [1\ \ 0\ \ 1\ \ {-1}\ \ 2]$$

$e_k \in \mathbb{R}^5 \sim N(0, 0.5^2I_5)$, and $v_k \sim N(0, 0.1^2)$. $U([0, 1])$ means the uniform distribution in the interval [0, 1]. The T-PLS algorithm is performed on 100 normal samples with $A = 2$, $A_y = 1$, and $A_r = 1$. The fault is added as shown in (13). In order to show all the different cases of output-relevance, we use a two-dimensional fault. For each fault, we generate 100 faulty samples, i.e., samples 1-100 are normal and samples 101-200 are faulty. In order to indicate the fault effect on output y, we plot the output y and the squared prediction error (SPE) of y under each fault case. Using the known fault subspaces, we validate the results on output relevance. The matrix $\Phi$ is listed as follows.

$$\Phi = \begin{bmatrix} 0.1956 & -0.2277 & -0.0724 & 0.0597 & 0.0606 \\ -0.2277 & 0.4005 & 0.0352 & 0.0846 & 0.0325 \\ -0.0724 & 0.0352 & 0.1267 & -0.2054 & -0.0393 \\ 0.0597 & 0.0846 & -0.2054 & 0.3914 & 0.1042 \\ 0.0606 & 0.0325 & -0.0393 & 0.1042 & 0.1024 \end{bmatrix} \tag{37}$$

The control limit is calculated as $\zeta^2 = 1.45$, which confirms the analysis.

Figure 2. Output y, SPE of y, and fault detection in the complete output-relevant case, $f = [10, -4]^T$.

Case of Complete Output-Relevance. When $\Xi$ is selected as

$$\Xi = \begin{bmatrix} -0.1961 & -0.0749 & -0.3104 & -0.0212 & -0.2600 \\ 0.2267 & -0.6671 & 0.2258 & -0.6436 & -0.1960 \end{bmatrix}^T$$

$\bar{\Xi}$ is calculated as

$$\bar{\Xi} = \begin{bmatrix} -0.0491 & -0.0250 & -0.0407 & 0.0289 & -0.0999 \\ 0.1722 & -0.4975 & 0.1710 & -0.4828 & -0.1412 \end{bmatrix}^T$$

so that $\sigma_{\min}(\bar{\Xi}) > 0$. Therefore, this kind of fault is completely output-relevant. Let $f = [10, -4]^T$. It is observed in Figure 2 that output y is significantly affected by this fault. Thus, the fault is detected
immediately when the fault occurs at the 101st sample. Figure 3 shows that the output-relevant fault can be estimated in all dimensions.

Figure 3. Fault estimation in the complete output-relevant case, $f = [10, -4]^T$.

Case of Partial Output-Relevance. When $\Xi$ is selected as

$$\Xi^T = \begin{bmatrix} 0.5192 & -0.5164 & -0.3945 & 0.5232 & 0.1856 \\ 0.1169 & -0.0924 & 0.8250 & 0.4955 & -0.2271 \end{bmatrix}$$

$\bar{\Xi}$ is calculated as

$$\bar{\Xi} = \begin{bmatrix} 0.3881 & -0.3861 & -0.2949 & 0.3911 & 0.1387 \\ 0.0000 & 0.0000 & -0.0000 & 0.0000 & 0.0000 \end{bmatrix}^T$$

so that $\sigma_{\max}(\bar{\Xi}) > 0$ and $\sigma_{\min}(\bar{\Xi}) = 0$. Therefore, this kind of fault is partially output-relevant. If we choose $f = [50, 10]^T$, then the fault has an impact on the SPE of y and can be detected as shown in Figure 4. However, only the first dimension of the fault can be estimated, as shown in Figure 5.

Figure 4. Output y, SPE of y, and fault detection in the partial output-relevant case, $f = [50, 10]^T$.

Figure 5. Fault estimation in the partial output-relevant case, $f = [50, 10]^T$.

If we choose $f = [0, 10]^T$, then the fault is output-irrelevant and cannot be detected, as shown in Figure 6.

Figure 6. Output y, SPE of y, and fault detection in the partial output-relevant case, $f = [0, 10]^T$.

Application to the TE Process. The Tennessee Eastman process (TEP) was created by the Eastman Chemical Company for evaluating process control and monitoring methods.20 The process consists of five major units: a reactor, condenser, compressor, separator, and stripper; and it contains eight components: A, B, C, D, E, F, G, and H. The gaseous reactants A, C, D, and E and the inert B are fed to the reactor where the liquid products G and H are formed. The species F is a byproduct of the reactions. The process is operated under closed-loop control. TEP has been widely used as a benchmark process for diagnosis methods such as PCA, support vector machine, and Fisher discriminant analysis (FDA).21 T-PLS based monitoring methods are used to detect output-relevant faults.14 The TEP contains two blocks of variables: the MV block of 12 manipulated variables and the MEAS block of 41 measured variables.21,22 Process measurements are sampled with an
interval of 3 min. Nineteen composition measurements, taken from streams 6, 9, and 11, are sampled with time delays that vary from 6 to 15 min. In this study, the composition of G in stream 9, i.e., MEAS(35), is chosen as the output variable y with a time delay of 6 min. Twenty-two process measurements and 11 manipulated variables, i.e., MEAS1-MEAS22 and MV1-MV11, are chosen as X. First, 480 normal samples are centered to zero mean and scaled to unit variance. Then the samples are used to build a T-PLS structure with $A = 6$, $A_y = 1$, and $A_r = 17$. The control limit of $\varphi$ is calculated as $\zeta^2 = 1.04$. We take fault 8 as an example, which causes random variation in the A, B, C feed composition of stream 4. This fault affects the output data y significantly, as shown in Figure 7.

Figure 7. Output y under the normal case and fault case.
Figure 8. Extraction for the whole fault subspace.
Figure 9. Extraction for fault subspace related to Y.
Figure 10. Extraction for reduced fault subspace.

A training set of 480 faulty samples is used to extract the fault subspace of fault 8, using the three approaches given in section 4. First, we use a moving-window average technique to eliminate the effect of the normal value, i.e.,

$$\underline{x}_k = \frac{1}{l}(x_k + \cdots + x_{k-l+1}) \tag{38}$$

where $l = 50$. $\underline{y}_k$ is obtained similarly. Figures 8-10 show the reconstructed $\varphi$ with different fault dimensions using the three methods, respectively. The whole fault subspace needs more
than five dimensions to bring the reconstructed φ under the control limit, output-relevant subspace needs five dimensions, and the reduced fault subspace needs only four dimensions. This result means that extraction of the reduced fault subspace in the output-relevant region can describe output-relevant fault information more efficiently. Thus, we decide to use this method j , then estimate the fault magnitude. Notice that if to extract Ξ j we extract a fault direction matrix using the first method, Ξ may not have full column rank and the fault may be partially j directly, it always has full reconstructible. If we extract Ξ column rank, thus the fault is completely reconstructible. A test data set of 960 samples are used for fault reconstruction and estimation. The fault is introduced at the 161st sample in the test data. Figure 11 compares the original combined index and reconstructed φ and shows that reconstructed φ is within control for most faulty samples, which is the aim of fault reconstruction. At last, the fault magnitude for four dimensions is estimated in Figure 12. Notice that the estimation value before the fault starts reflects the estimation error. 5. Conclusions In this article, we propose a new index φ combining Ty2 and Qr to detect output-relevant faults. Then, the detectability
9182
Ind. Eng. Chem. Res., Vol. 49, No. 19, 2010
ˆ and X ˆ ) TPT. (3) Let PTy ) (TTy Ty)-1TTy X ˆ - TyPTy ) ToPoT. Run PCA on X ˆo ) X ˆ o with A - Ay (4) X components.
(5) Run PCA on E with Ar components, E ) TrPrT + Er,
where Ar < m - A is determined using PCA methods, such as the cumulative percent variance criterion.
j. Figure 11. Original and reconstructed φ for test data with extracted Ξ
Appendix B: Proof of Lemma 1 j ∈ Rm×Af, dim(Sj) ) Af is equivalent to σmin(Ξ j ) > 0. As Ξ j j ), |Ξ j f| j Therefore, |Ξf| g |f|σmin(Ξ) > 0. When |f| > 2ζ/σmin(Ξ j f|)2 > ζ2, which guarantees g 2ζ. As |xj*| < ζ, φ(x) g (|xj*| - |Ξ j ). the fault can be detected as long as |f| > 2ζ/σmin(Ξ j ) 0 Thus, for all f, the Similarly, Sj ) 0 is equivalent to Ξ fault F can not be detected by φ, which means F is completely irrelevant to output y. j ) > σmin(Ξ j) ) At last, 0 < dim(Sj) < Af is equivalent to σmax(Ξ j ) > 0, where σ*min(Ξ j) j f| g |f|σ*min(Ξ 0. There must exist f so that |Ξ j is the minimum of nonzero singular value of Ξ. Similarly, when j ), the fault can be detected. |f| > 2ζ/σ*min(Ξ Acknowledgment This work was supported by the national 973 projects under Grants 2010CB731800 and 2009CB32602, by NSFC under Grants 60721003 and 60736026, and by the Changjiang Professorship (S.J.Q.) by the Ministry of Education, P.R. China. Literature Cited
Figure 12. Fault estimation for four dimensions.
based on φ is studied. Fault reconstruction based on φ is proposed to eliminate the fault effect on Y and estimate the output-relevant part of the fault. In the completely output-relevant case, the reconstructed φ can be brought under the control limit along the actual fault direction. A numerical simulation example illustrates the concept of fault detectability and reconstructability. Several approaches for extraction of the fault subspace are compared. The approaches are successfully applied to TEP benchmark data. Fault subspaces are extracted from historical data that are known to contain a fault. Fault reconstruction and estimation are performed with the extracted fault direction. The extraction of the reduced fault subspace is more efficient. Future work could consider an adaptive T-PLS approach with real-time data that may contain new information about the unexcited residual space.

Appendix A: T-PLS algorithm
Center and scale the raw data to give X and Y.
(1) Perform the nonlinear iterative partial least-squares (NIPALS) algorithm on data pair X and Y and train a PLS model as shown in (1). See the following algorithm for reference.
(2) Run PCA on $\hat{Y}$ with $A_y$ components, $\hat{Y} = T Q^T = T_y Q_y^T$, where $A_y = \mathrm{rank}(Q)$.
(1) Kresta, J. V.; MacGregor, J. F.; Marlin, T. E. Multivariate statistical monitoring of process operating performance. Can. J. Chem. Eng. 1991, 69, 35–47.
(2) MacGregor, J. F.; Kourti, T. Statistical process control of multivariate processes. Control Eng. Pract. 1995, 3, 403–414.
(3) Wise, B. M.; Gallagher, N. B. The process chemometrics approach to process monitoring and fault detection. J. Process Control 1996, 6, 329–348.
(4) Raich, A.; Cinar, A. Statistical process monitoring and disturbance diagnosis in multivariable continuous processes. AIChE J. 1996, 42, 995–1009.
(5) Dunia, R.; Qin, S. J. Joint diagnosis of process and sensor faults using principal component analysis. Control Eng. Pract. 1998, 6, 457–469.
(6) Yue, H.; Qin, S. Reconstruction-Based Fault Identification Using a Combined Index. Ind. Eng. Chem. Res. 2001, 40, 4403–4414.
(7) Dunia, R.; Qin, S. J.; Edgar, T. F.; McAvoy, T. J. Identification of faulty sensors using principal component analysis. AIChE J. 1996, 42, 2797–2812.
(8) Dunia, R.; Qin, S. J. Subspace approach to multidimensional fault identification and reconstruction. AIChE J. 1998, 44, 1813–1831.
(9) Qin, S. J. Statistical process monitoring: basics and beyond. J. Chemom. 2003, 17, 480–502.
(10) Geladi, P.; Kowalski, B. R. Partial least-squares regression: a tutorial. Anal. Chim. Acta 1986, 185, 1–17.
(11) Höskuldsson, A. PLS regression methods. J. Chemom. 1988, 2, 211–228.
(12) MacGregor, J. F.; Jaeckle, C.; Kiparissides, C.; Koutoudi, M. Process monitoring and diagnosis by multiblock PLS methods. AIChE J. 1994, 40, 826–838.
(13) Li, G.; Qin, S. J.; Zhou, D. H. Geometric properties of partial least squares for process monitoring. Automatica 2010, 46, 204–210.
(14) Zhou, D. H.; Li, G.; Qin, S. J. Total projection to latent structures for process monitoring. AIChE J. 2010, 56, 168–178.
(15) Li, G.; Zhou, D. H.; Ji, Y. D.; Qin, S. J. Total PLS based contribution plots for fault diagnosis. Acta Automat. Sin. 2009, 35, 759–765.
(16) Valle, S.; Qin, S.; Piovoso, M.; Bachmann, M.; Mandakoro, N. Extracting fault subspaces for fault identification of a polyester film process. Proceedings ACC, Arlington, VA, June 25–27, 2001; pp 4466–4471.
(17) Wold, S.; Antti, H.; Lindgren, F.; Öhman, J. Orthogonal signal correction of near-infrared spectra. Chemom. Intell. Lab. Syst. 1998, 44, 175–185.
(18) Trygg, J.; Wold, S. Orthogonal projections to latent structures (OPLS). J. Chemom. 2002, 16, 119–128.
(19) Box, G. E. P. Some theorems on quadratic forms applied in the study of analysis of variance problems, I. Effect of inequality of variance in the one-way classification. Ann. Math. Stat. 1954, 25, 290–302.
(20) Downs, J. J.; Vogel, E. F. A plant-wide industrial process control problem. Comput. Chem. Eng. 1993, 17, 245–255.
(21) Chiang, L. H.; Russell, E.; Braatz, R. D. Fault detection and diagnosis in industrial systems; Springer: London, 2001.
(22) Lee, G.; Han, C. H.; Yoon, E. S. Multiple-Fault Diagnosis of the Tennessee Eastman Process Based on System Decomposition and Dynamic PLS. Ind. Eng. Chem. Res. 2004, 43, 8037–8048.
Received for review December 8, 2009. Revised manuscript received June 26, 2010. Accepted August 4, 2010.
IE901939N