Ind. Eng. Chem. Res. 2004, 43, 8037-8048


Multiple-Fault Diagnosis of the Tennessee Eastman Process Based on System Decomposition and Dynamic PLS

Gibaek Lee*
Department of Chemical Engineering, Chungju National University, Chungju, Chungbuk 380-702, Korea

Chonghun Han and En Sup Yoon
School of Chemical Engineering, Seoul National University, Seoul 151-742, Korea

The hybrid fault diagnosis method based on a combination of the signed digraph and partial least-squares (PLS) has the advantage of improving the diagnosis resolution, accuracy, and reliability, compared to those of previous qualitative methods, and of enhancing the ability to diagnose multiple faults [Ind. Eng. Chem. Res. 2003, 42, 6145-6154]. In this study, the method is applied to the multiple-fault diagnosis of the Tennessee Eastman challenge process. The target process is decomposed using the local qualitative relationships of each measured variable. Linear and quadratic models based on dynamic PLS are built to estimate each measured variable, and the measured value is then compared with the estimated value in order to diagnose the fault. Through case studies, the proposed method demonstrated a good diagnosis capability compared with previous statistical methods.

Introduction

When an abnormal situation occurs in a chemical process, human operators must quickly detect the fault, diagnose its root causes, and then bring the process back to a normal state. However, it is difficult for operators to take correct actions when facing these situations. The task is further complicated by the size and complexity of modern chemical plants. Therefore, automatic fault diagnosis systems for chemical process operations have become extremely important in improving the safety of the plant and plant personnel. Such systems are used to analyze process data on-line, monitor process trends, and diagnose faults when an abnormal situation arises.

A variety of fault diagnosis approaches for chemical processes have been developed, including rule-based expert systems, state estimation such as observers and Kalman filters, the signed digraph (SDG), qualitative simulation, statistical methods, and neural networks.1,2 These methods are broadly classified as those that use a process model and those that rely on process history data. They can be further subclassified as qualitative or quantitative.
Our previous study suggested a hybrid method combining the SDG and partial least squares (PLS).2 The target system is decomposed on the basis of the local causal relationships of each measured variable in the SDG, which is classified as a qualitative model-based method. For each decomposed subprocess, local fault diagnosis is performed using PLS, which is classified as a quantitative history-data-based method. The method has the advantage of improving the diagnosis resolution and accuracy compared to previous qualitative methods. Moreover, it enhances the reliability of the diagnosis for all predictable faults, including multiple faults. Although it is based on statistical process data, it allows the diagnosis model to be built on the basis of easily obtainable data sets and does not require faulty case data sets.

* To whom correspondence should be addressed. Tel.: +82-43-841-5230. Fax: +82-43-841-5220. E-mail: glee@chungju.ac.kr.

The Tennessee Eastman (TE) process created by the Eastman Chemical Co. has been widely used as a benchmark process for evaluating process diagnosis methods (Figure 1).3,4 Principal component analysis (PCA),5-9 multiway PCA,10 partial PCA,11 nonlinear dynamic PCA,12 pattern recognition,13 Fisher discriminant analysis (FDA),9,14 PCA-wavelet,15 PLS,9 a steady-state-based approach,16 support vector machines (SVM),14,17 and PCA-QTA (qualitative trend analysis)18 have all been applied to the TE process. Most of the previous methods are based on multivariate statistics, and several studies have used nonlinear or dynamic models to consider process dynamics and nonlinearity.10,12,14,17 Although most methods show good diagnostic performance, they have the crucial drawback of using faulty data sets, which are rarely available.5,8,9,12-15,17,18 Chiang et al. reviewed fault detection and diagnosis methods based on multivariate statistics, such as PCA, FDA, PLS, and CVA (canonical variate analysis), and compared them using case studies of the TE process.4

This study considers the multiple-fault diagnosis of the TE process using the hybrid fault diagnosis method of system decomposition and dynamic PLS (DPLS) proposed in our previous paper. To deal with the nonlinearity of the target process, we use nonlinear (quadratic) PLS as well as linear PLS for the TE process. Using the diagnosis results for the 15 single faults defined in the TE process, the diagnostic performance is compared with the results of Chiang et al. Linear and quadratic PLS models are also compared using the diagnosis results for 15 single faults and 44 double faults.

Previous Studies

System Decomposition Based on SDG. System decomposition methods for fault diagnosis have been formulated to utilize the advantages of providing flexible diagnosis throughout operational condition changes,

10.1021/ie049624u CCC: $27.50 © 2004 American Chemical Society Published on Web 10/26/2004


Ind. Eng. Chem. Res., Vol. 43, No. 25, 2004

Figure 1. Process flow diagram of the TE process.4

reducing the size of the knowledge base, and allowing easy understanding of complex process-to-process interactions. A two-step diagnostic approach narrows the diagnostic focus to a particular decomposed subsystem and then performs diagnosis on this subsystem. Another approach develops a diagnostic knowledge base for each subsystem and collects diagnostic results from all of the subsystems before the final conclusion is drawn.

This study uses the system decomposition method proposed in our previous study,2 which centers on the measured variables in the SDG. In the SDG, each arc represents the instantaneous effect of the source node on the target node. All source nodes connected to a particular target node by means of the arcs have a direct influence on that target node. That is, only the source nodes connected to a target node can affect that target node. Because unmeasured source nodes cannot reveal the direct effects of faults, they are not used; instead, a reduced digraph is used, which consists of the original SDG with the unmeasured nodes removed. Each decomposed subprocess includes a central measured variable (hereafter, the target variable) as well as the measured variables (hereafter, source variables) and faults connected to the target variable. For instance, the reduced digraph of the decomposed subprocess for XA of the TE process is shown in Figure 2a. Local fault diagnosis can be performed for each decomposed subprocess. Because fault diagnosis is executed locally for each target variable, the fault diagnosis method using the system decomposition can diagnose all types of multiple faults except those that affect the same measured nodes.

Fault Diagnosis Based on Dynamic PLS Models: Off-Line Analysis. The simple diagnosis method

Figure 2. Reduced digraph of the decomposed subprocess for (a) XA and (b) F6.

on the basis of the decomposition method is to estimate the value of each target variable using the measured values of the source variables connected to the target variable. A substantial difference between the estimated and measured values implies the occurrence of one or more faults. Sensor faults in the sensors that measure the source variables used for the estimation produce errors in the estimated values. Faults added to the target node give rise to errors in the measured values.

The estimation in our previous study used a PLS model built for each decomposed subprocess. The input X of the model contains the source variables connected to the target variable, and the output Y is the estimated value of the target variable. To handle the process dynamics accurately, we used DPLS integrated with ARMAX. In addition to the current values of the source variables, the resulting input of DPLS for a target variable includes the past values of the target variable and of the source variables. The necessary number of past values (time lags) l and the number of principal components (PCs) are determined from the learning data. The number l is usually 1 or 2, which indicates the order of the dynamic system. Each DPLS model can be built from the operational data set


representing the local relations between the input variables and the output variable of the DPLS models. Therefore, the required data set for each DPLS model can be easily obtained. The available data sets can be obtained in the presence of set-point changes or external disturbances, which occur frequently. Therefore, the proposed method does not need a faulty case data set, which would otherwise be difficult to obtain. Fault detection is performed by the observation of the residual, which is the difference between the estimated value determined by the DPLS model and the measured one.

$$r_i = y_i - \hat{y}_i \tag{1}$$

where $r_i$ is the residual of variable i, and $y_i$ and $\hat{y}_i$ are the measured and estimated values of variable i, respectively. A qualitative state, which corresponds to a range of possible values of the residual, becomes an attribute of the residual. We consider three ranges: low, to which the qualitative state (-) is assigned; normal, assigned (0); and high, assigned (+). If a fault occurs, the qualitative state of the residual may be (+) or (-). An abnormal qualitative state of a residual becomes a symptom, which is expressed as the pair of the target variable and the qualitative state of the residual. The faults that induce the abnormality of each residual are classified along with their symptoms, and the classified faults are stored in a set (called a fault set). Faults are of two types: faults added to the target variable, and sensor faults that occur in the sensors corresponding to the source variables in the DPLS model.

Fault Diagnosis Based on Dynamic PLS Models: On-Line Diagnosis. The first step of fault diagnosis is to monitor the residuals in order to detect changes in their qualitative state. As the statistical detection method, we used CUSUM, which has a recursive form suitable for real-time analysis and does not need filtering. CUSUM accumulates the deviations from the mean value that exceed a constant minimal jump size and signals a change when the accumulated value becomes larger than a threshold. For the two CUSUM parameters, 6σ of the residual distribution is used as the minimal jump size and 3σ of the CUSUM distribution as the threshold. The next step is to obtain the set of detected symptoms; each detected residual of a variable becomes an element of the set. The final step is to obtain the minimum set of faults that can explain all of the detected symptoms. For a detailed description, refer to our previous paper.
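The off-line input construction and the on-line residual monitoring described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `lagged_inputs` and `cusum_states`, and the parameter names `drift` and `threshold`, are hypothetical, and the two-sided tabular CUSUM recursion is one common form that is assumed here. How the 6σ and 3σ settings map onto `drift` and `threshold` is left to the caller.

```python
import numpy as np


def lagged_inputs(sources, target, n_lags):
    """Build ARMAX-style regressors for one target variable.

    Each row holds the current source values followed by the previous
    n_lags values of the target and of the sources; y holds the current
    target values. (Hypothetical helper, for illustration only.)
    """
    sources = np.asarray(sources, dtype=float)   # shape (n_samples, n_sources)
    target = np.asarray(target, dtype=float)     # shape (n_samples,)
    n = len(target)
    blocks = [sources[n_lags:]]                              # sources at time t
    for lag in range(1, n_lags + 1):
        blocks.append(target[n_lags - lag:n - lag, None])    # target at t - lag
        blocks.append(sources[n_lags - lag:n - lag])         # sources at t - lag
    return np.hstack(blocks), target[n_lags:]


def cusum_states(residuals, mean, drift, threshold):
    """Two-sided CUSUM on a residual sequence (assumed tabular form).

    Returns the qualitative state '+', '-' or '0' for every sample.
    """
    g_pos = g_neg = 0.0
    states = []
    for r in residuals:
        # Accumulate only the part of the deviation that exceeds the drift.
        g_pos = max(0.0, g_pos + (r - mean) - drift)
        g_neg = max(0.0, g_neg - (r - mean) - drift)
        if g_pos > threshold:
            states.append("+")
        elif g_neg > threshold:
            states.append("-")
        else:
            states.append("0")
    return states
```

With one lag and a single source variable, each row of the returned matrix is ordered as source(t), target(t-1), source(t-1). The residuals passed to `cusum_states` would be the differences between the measured target and the model estimate, as in eq 1.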
PLS and Nonlinear PLS. By maximizing the covariance between the input data matrix X and the output matrix Y, PLS can predict the values of Y on the basis of the values of X. The PLS model consists of outer relations (for the X and Y blocks individually) and an inner relation (linking the two blocks). The scaled and mean-centered X and Y matrices are decomposed into score matrices, T and U, and loading matrices, P and Q, respectively. The inner relationship between the two blocks is represented as a linear algebraic relation (U = TB) between their scores, and the outer relations are

$$X = TP^{T} + E \tag{2}$$

$$Y = UQ^{T} + F \tag{3}$$

where E and F are the residual matrices for X and Y, respectively. In PLS, the loading and score vectors are determined in such a way as to maximize the prediction accuracy of Y while describing a large amount of the variation in X. The most common method used to calculate the PLS model parameters is known as NIPALS, for nonlinear iterative partial least squares. The prediction of Y from X is done according to the following regression model.

$$\hat{Y} = XB_{PLS} = XW(P^{T}W)^{-1}BQ^{T} \tag{4}$$

In the above equation, BPLS is the coefficient of the PLS regression model, and the weight W is defined in the NIPALS algorithm by

$$T = XW \tag{5}$$
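As an illustration of eqs 2-5, the following is a minimal NumPy sketch of linear PLS fitted with NIPALS. It is a textbook-style variant written for this purpose and is not the implementation used in the study; the function name `nipals_pls` and its arguments are hypothetical, and the inputs are assumed to be mean-centered and scaled beforehand.

```python
import numpy as np


def nipals_pls(X, Y, n_components, tol=1e-10, max_iter=500):
    """Fit linear PLS with NIPALS and return the regression matrix B_PLS.

    X (n x m) and Y (n x p) are assumed to be mean-centered and scaled.
    """
    X = np.array(X, dtype=float)   # copies: both blocks are deflated below
    Y = np.array(Y, dtype=float)
    m, p = X.shape[1], Y.shape[1]
    W = np.zeros((m, n_components))   # weights
    P = np.zeros((m, n_components))   # X loadings
    Q = np.zeros((p, n_components))   # Y loadings
    b = np.zeros(n_components)        # inner-relation coefficients, B = diag(b)

    for a in range(n_components):
        u = Y[:, [0]]                         # initial guess for the Y score
        for _ in range(max_iter):
            w = X.T @ u
            w /= np.linalg.norm(w)
            t = X @ w                         # X score from the deflated X
            q = Y.T @ t
            q /= np.linalg.norm(q)
            u_new = Y @ q                     # Y score
            converged = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if converged:
                break
        tt = (t.T @ t).item()
        p_a = X.T @ t / tt
        b[a] = (u.T @ t).item() / tt          # inner relation: u is fitted by b * t
        X -= t @ p_a.T                        # deflate the X block
        Y -= b[a] * (t @ q.T)                 # deflate the Y block
        W[:, [a]], P[:, [a]], Q[:, [a]] = w, p_a, q

    # Regression matrix of eq 4: B_PLS = W (P^T W)^-1 B Q^T
    return W @ np.linalg.inv(P.T @ W) @ np.diag(b) @ Q.T
```

Predictions for new, identically scaled inputs are then `X_new @ B_pls`. When as many components are kept as X has independent columns, the result coincides with ordinary least squares, which gives a convenient sanity check; fewer components give the regularized PLS estimate.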

In practice, linear PLS cannot always be used to model the significant nonlinear characteristics of real and complex chemical processes. Two methods to integrate nonlinearity within the linear PLS framework have been studied. The first, which does not modify the NIPALS algorithm, is to extend the input matrix by including nonlinear combinations of the original variables (such as logarithms, squares, and products) and to perform a linear PLS on the extended input and output matrices.19 Although this method is very flexible, it tends to be unstable, and its assumption of independence between variables is rarely true in the real world.20 The second method is to use a nonlinear function that relates the output score U to the input score T and to modify the NIPALS algorithm accordingly. Examples of the nonlinear function are the quadratic polynomial,19,21 spline function,20 and neural network.22,23 For fault diagnosis of the TE process, this study adopted the nonlinear quadratic PLS algorithm proposed by Wold et al.21 They modified the inner relationship between the input score and the output score to be nonlinear:

$$U = f(T) + H \tag{6}$$

where H denotes the residual matrix. The nonlinear function denoting the inner relation can take any form such as polynomial, exponential, and logarithmic functions. In particular, Wold et al. proposed a quadratic polynomial relation for the inner relation:

$$U = c_0 + c_1 T + c_2 T^{2} + H \tag{7}$$
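To make eq 7 concrete, the sketch below fits the coefficients c0, c1, and c2 by ordinary least squares for one pair of score vectors that are held fixed. This is only the inner-relation fit; the weight update of the modified NIPALS algorithm is not reproduced, and the function names are hypothetical.

```python
import numpy as np


def fit_quadratic_inner(t, u):
    """Least-squares fit of u = c0 + c1*t + c2*t**2 for fixed score vectors."""
    # np.polyfit returns coefficients from the highest degree down.
    c2, c1, c0 = np.polyfit(np.asarray(t, dtype=float),
                            np.asarray(u, dtype=float), deg=2)
    return c0, c1, c2


def predict_inner(t, coeffs):
    """Evaluate the fitted quadratic inner relation at the input scores t."""
    c0, c1, c2 = coeffs
    t = np.asarray(t, dtype=float)
    return c0 + c1 * t + c2 * t ** 2
```

The difference between u and `predict_inner(t, coeffs)` plays the role of the residual H for that component.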

To obtain the quadratic inner relation between U and T, they modified the NIPALS algorithm. The modified algorithm starts with a linearly derived PLS weight vector, W, which is then updated by means of a Newton-Raphson linearization of the quadratic inner relation estimated by linear PLS. We use their algorithm as implemented in Matlab.

Tennessee Eastman Challenge Process

Process Description. Downs and Vogel3 proposed the TE process and described it in detail. It provides a realistic industrial process for evaluating process control and monitoring methods. The test process is based on a simulation of an actual industrial process where the components, kinetics, and operating conditions have been modified for proprietary reasons (Figure 1). The process has five major units, a reactor, condenser, recycle compressor, vapor/liquid separator, and product


Table 1. Manipulated and Measured Variables of the TE Process

variable  description                                sampling interval (min)
MV1       D feed flow (stream 2)                     1
MV2       E feed flow (stream 3)                     1
MV3       A feed flow (stream 1)                     1
MV4       total feed flow (stream 4)                 1
MV5       compressor recycle valve                   1
MV6       purge valve (stream 9)                     1
MV7       separator pot liquid flow (stream 10)      1
MV8       stripper liquid product flow               1
MV9       stripper steam valve                       1
MV10      reactor cooling water flow                 1
MV11      condenser cooling water flow               1
MV12      agitator speed                             1
F1        A feed (stream 1)                          1
F2        D feed (stream 2)                          1
F3        E feed (stream 3)                          1
F4        total feed (stream 4)                      1
F5        recycle flow (stream 8)                    1
F6        reactor feed rate (stream 6)               1
P7        reactor pressure                           1
L8        reactor level                              1
T9        reactor temp                               1
F10       purge rate (stream 9)                      1
T11       separator temp                             1
L12       separator level                            1
P13       separator pressure                         1
F14       separator underflow (stream 10)            1
L15       stripper level                             1
P16       stripper pressure                          1
F17       stripper underflow (stream 11)             1
T18       stripper temperature                       1
F19       stripper steam flow                        1
J20       compressor work                            1
T21       reactor cooling water outlet temp          1
T22       condenser cooling water outlet temp        1
XA        composition of A (stream 6)                6
XB        composition of B (stream 6)                6
XC        composition of C (stream 6)                6
XD        composition of D (stream 6)                6
XE        composition of E (stream 6)                6
XF        composition of F (stream 6)                6
YA        composition of A (stream 9)                6
YB        composition of B (stream 9)                6
YC        composition of C (stream 9)                6
YD        composition of D (stream 9)                6
YE        composition of E (stream 9)                6
YF        composition of F (stream 9)                6
YG        composition of G (stream 9)                6
YH        composition of H (stream 9)                6
ZD        composition of D (stream 11)               15
ZE        composition of E (stream 11)               15
ZF        composition of F (stream 11)               15
ZG        composition of G (stream 11)               15
ZH        composition of H (stream 11)               15

Table 2. Faults Defined in the TE Process

fault ID      description                                                  type
IDV1          A/C feed ratio, B composition constant (stream 4)            step
IDV2          B composition, A/C ratio constant (stream 4)                 step
IDV3          D feed temp (stream 2)                                       step
IDV4          reactor cooling water inlet temp                             step
IDV5          condenser cooling water inlet temp                           step
IDV6          A feed loss (stream 1)                                       step
IDV7          C header pressure loss - reduced availability (stream 4)     step
IDV8          A, B, C feed composition (stream 4)                          random variation
IDV9          D feed temp (stream 2)                                       random variation
IDV10         C feed temp (stream 4)                                       random variation
IDV11         reactor cooling water inlet temp                             random variation
IDV12         condenser cooling water inlet temp                           random variation
IDV13         reaction kinetics                                            slow drift
IDV14         reactor cooling water valve                                  sticking
IDV15         condenser cooling water valve                                sticking
IDV16-IDV20   unknown
IDV21         valve for stream 4 fixed at the steady-state position        constant position

stripper, and eight components, A-H. The gaseous reactants A, C, D, and E, and the inert B, are fed to the reactor, where the liquid products G and H are formed. The reactions in the reactor are

A(g) + C(g) + D(g) → G(l), product 1
A(g) + C(g) + E(g) → H(l), product 2
A(g) + E(g) → F(l), byproduct
3D(g) → 2F(l), byproduct

The reactions are irreversible, exothermic, and approximately first-order with respect to the reactant concentrations. The reaction rates are Arrhenius functions of temperature, where the reaction for G has a higher activation energy than the reaction for H, resulting in a higher sensitivity to temperature.

The reactor product stream is cooled through a partial condenser and then fed to a vapor-liquid separator. The vapor exiting the separator is recycled to the reactor feed through a compressor. A portion of the recycle stream is purged to keep the inert and byproduct from accumulating in the process. The condensed components from the separator (stream 10) are pumped to a stripper. Stream 4 is used to strip the remaining reactants from stream 10, which are combined with the recycle stream via stream 5. Products G and H exiting the base of the stripper are sent to a downstream process.

The process contains 41 measured and 12 manipulated variables (Table 1). The 22 measured variables F1 through T22 are sampled every minute. The 19 composition measurements XA through ZH are taken from streams 6, 9, and 11. The sampling interval and time delay are both 6 min for streams 6 and 9 and both 15 min for stream 11. All the process measurements include Gaussian noise.

The TE process simulation contains 21 preprogrammed faults (Table 2), 16 of which are known and 5 of which are unknown. IDV1 through IDV7 are associated with a step change in a process variable. IDV8 through IDV12 are associated with an increase in the variability of some process variables. IDV13 is a slow drift in the reaction kinetics, and IDV14, IDV15, and IDV21 are associated with sticking valves.

Table 3. Input Variables, Number of Principal Components (PCs), and Time Delays of the DPLS Model

target    linear                  quadratic               source variables connected to the target variable
variable  PCs   time delays       PCs   time delays
F1        1     1                 1     1                 MV3
F4        1     1                 1     1                 MV4
P7        10    2                 8     2                 F1, F2, F3, F4, F5, L8, T9, P13, F17, YA, YB, YC, YD, YE, YF, YG, YH
L8        6     1                 8     2                 F1, F2, F3, F4, F5, P7, T9, P13, F14, F17, YA, YC, YD, YE, YF, YG, YH
T9        6     2                 6     2                 F1, F2, F3, F4, F5, P7, L8, T11, P13, F17, T18, YA, YB, YC, YD, YE, YF, YG, YH, MV10
T11       11    2                 6     2                 F1, F2, F3, F5, P7, L8, T9, F10, L12, P13, F14, F17, YA, YB, YC, YD, YE, YF, YG, YH, MV11
P13       9     2                 6     2                 F1, F2, F3, F4, F5, P7, L8, T9, F10, T11, L12, F14, F17, YA, YB, YC, YD, YE, YF, YG, YH
T18       8     2                 3     2                 F4, T11, F14, L15, F17, F19, ZD, ZE, ZF, ZG, ZH
T21       4     2                 4     2                 L8, T9, MV10
T22       1     1                 3     2                 T9, MV11, P713
XA        7     2                 3     2                 F1, F2, F3, F4, F5, F17, YA
XB        7     2                 2     2                 F1, F2, F3, F4, F5, F17, YB
XC        5     1                 5     1                 F1, F2, F3, F4, F5, F17, YC
YA        6     1                 5     1                 F5, F6, P7, L8, T9, F10, T11, L12, XA, XC, XD, XE
YC        7     2                 5     1                 F5, F6, P7, L8, T9, F10, T11, L12, XA, XC, XD, XE
YD        6     1                 6     2                 F5, F6, P7, L8, T9, F10, T11, L12, F14, XA, XC, XD
YE        6     1                 7     2                 F5, F6, P7, L8, T9, F10, T11, L12, F14, XA, XC, XE
YF        7     1                 8     2                 F5, F6, P7, L8, T9, F10, T11, L12, F14, XA, XC, XE, XF
YG        7     1                 5     1                 F5, F6, P7, L8, T9, F10, T11, L12, F14, XA, XC, XD
YH        9     1                 4     1                 F5, F6, P7, L8, T9, F10, T11, L12, F14, XA, XC, XD

The TE simulation code is available in Fortran and Matlab, and detailed descriptions of the process and simulation can be found in Downs and Vogel,3 Ricker and Lee,24 and Chiang et al.4 The Fortran code used in this study can be downloaded from http://brahms.scs.uiuc.edu. A 1-s integration interval is used here. Chiang et al. use a diagnosis interval of 3 min, but this study uses a diagnosis interval of 1 min, equal to the sampling interval, to obtain faster diagnosis results.

Diagnosis Model Description

(1) SDG and System Decomposition. The system is decomposed from the SDG of the TE process. However, building an accurate SDG is very difficult, because the process contains four reactions and eight components. In particular, the vapor/liquid equilibria in the reactor, vapor/liquid separator, and stripper are a great obstacle to determining the signs of the causal relationships between process variables. Our method, however, does not need the SDG of the whole process; it uses only the locally reduced SDG of the measured variables that are directly affected by the faults defined in the process. For these reasons, the effort required to build the SDG is greatly reduced.

The proposed method can diagnose only the predefined faults. Among the 16 known faults, IDV21 is not defined in the simulation program. Our study therefore aimed to diagnose the 15 faults IDV1 through IDV15 and found 26 measured variables directly affected by them. Of these 26 variables, ZD through ZH are removed, because their sampling interval of 15 min is too long to allow fast diagnosis. This study also found that P16 in the simulation program is not the stripper pressure but the pressure of the feed mixing zone. Because P16 does not have an accurate value, it is eliminated from the diagnosis. As a result, the process is decomposed around 20 measured variables, and the reduced digraph for each decomposed subprocess is obtained.
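The decomposition step can be illustrated with a small Python sketch. It assumes one plausible reading of the reduced digraph: when an unmeasured node is removed, its upstream measured nodes and faults are treated as acting directly on the target. The function name `decompose` and the toy graph in the usage note are hypothetical and are not taken from the TE process.

```python
def decompose(arcs, measured, faults, target):
    """Collect the source variables and faults that act on one target variable.

    arcs is an iterable of (source, target) pairs of a digraph. The search
    walks upstream from the target, passes through unmeasured nodes, and
    stops at measured nodes and at fault nodes. (Assumed reduction rule.)
    """
    predecessors = {}
    for src, dst in arcs:
        predecessors.setdefault(dst, set()).add(src)

    sources, fault_hits = set(), set()
    stack, seen = [target], {target}
    while stack:
        node = stack.pop()
        for src in predecessors.get(node, ()):
            if src in faults:
                fault_hits.add(src)
            elif src in measured:
                sources.add(src)          # a measured node shields what is upstream
            elif src not in seen:
                seen.add(src)             # unmeasured node: keep walking upstream
                stack.append(src)
    sources.discard(target)               # the target is not its own source
    return sources, fault_hits
```

For a toy graph with arcs A → u1 → T, B → T, C → A, fault1 → u1, and fault2 → C, where u1 is unmeasured, the subprocess of T contains the source variables A and B and the fault fault1; C and fault2 belong to the subprocess of A instead.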
The source variables connected to the 20 target variables are shown in Table 3, and the faults added to the target variables are shown in Table 4.

Table 4. Fault Added to the Measured Node in the TE Process

measd variable  sign of arc                    fault
F1              negative (-)                   IDV6
F4              negative (-)                   IDV7
P7              positive (+) or negative (-)   IDV13
L8              positive (+) or negative (-)   IDV13
T9              positive (+) or negative (-)   IDV3, IDV4, IDV9, IDV11, IDV13, IDV14
T11             positive (+) or negative (-)   IDV5, IDV12, IDV13, IDV15
P13             positive (+) or negative (-)   IDV13
T18             positive (+) or negative (-)   IDV1, IDV2, IDV8, IDV10, IDV13
T21             positive (+) or negative (-)   IDV4, IDV11, IDV14
T22             positive (+) or negative (-)   IDV5, IDV12, IDV15
XA              positive (+) or negative (-)   IDV1, IDV2, IDV8
XB              positive (+) or negative (-)   IDV2, IDV8
XC              positive (+) or negative (-)   IDV1, IDV2, IDV8
YA-YH           positive (+) or negative (-)   IDV13

In Table 3, the source variables connected to P7 were originally 13: F5, F6, L8, T9, P13, XA, XB, XC, XD, XE, XF, YG, and YH. However, because the sampling interval and time delay of XA through XF are as long as 6 min, it is doubtful whether they can represent the effects of F1, F2, F3, and F4 in a timely manner. Figure 3a compares F1 and XA for a set-point change of XA. F1 is the manipulated variable for the control of XA. The response of the composition variable is much slower than that of the flow rate variable. Figure 3b shows similar behavior. The reduced digraphs of XA through XF (for example, refer to Figure 2a) are built, and XA through XF are replaced with F1, F2, F3, F4, F5, F14, F17, and YA through YF. Due to this replacement, IDV1, IDV2, and IDV8 become faults added to P7. F6 is removed from the estimation model because it can be represented by F1, F2, F3, F4, F14, and F17 (Figure 2b). For the same reasons, the reduced digraphs of L8, T9, T11, and P13 are changed as shown in Table 3. This type of model modification enables quicker detection. However, diagnosis will fail if any meaningful information is lost, so the model should be modified carefully. Using the scaled regression coefficients, it was confirmed that F14 has little effect on P7, so F14 is removed from the input of the model that estimates P7.

(2) DPLS Model Building. The learning data to build the DPLS model for each decomposed subprocess can be obtained in the presence of set-point changes or external disturbances. However, the data set in the


Figure 3. Dynamics of (a) F1 and XA for the set-point change of XA and (b) F3 and XE for the set-point change of XE.

Table 5. Set-Point Change for the Learning Data

loop no.  manipulated variable  control variable  set-point change (%)
5         MV5                   F5                +5
7         MV7                   L12               +10
8         MV8                   L15               +10
11        MV11                  F17               -2
13        F1                    XA                -5
14        F2                    XD                -5
15        F3                    XE                +10
16        F19                   T18               +10
17        F4                    L8                -10
18        T21                   T9                -1
19        T18                   YB                -10

presence of external disturbances cannot be used, because changes of operating conditions such as the feed composition and cooling water temperature are not available in the simulation program. To get the learning data, the set points are changed instead. The set points of 11 control loops in the TE process can be changed, and the changes are set on the basis of ±10% (Table 5). The set points of five loops in Table 5 are changed by less than ±10% because the controller outputs saturate when the change is ±10%. The simulation time for each training data set was 10 h, and the set point was changed 1 simulation hour after the start of the run. The total number of samples generated over the 11 runs was 11 × 10 × 60 = 6600.

Because of the time delays, the composition measurements with known time delays of 6 (or 15) min show the values from 6 (or 15) min earlier. Because the number of past values used in the DPLS models is usually 1 or 2, whereas the time delays are 6 or 15 times the diagnosis interval, it is very difficult for the DPLS models to handle the process dynamics accurately. To increase the accuracy of the estimation, the DPLS models should be modified to deal with this dead time.

If the current value and one previous value are used as input data of the DPLS model, the input X for the estimation of F1 at time t is MV3(t), F1(t-1), and MV3(t-1). Consider P7. At time t, YA through YH among the input variables show the values from 6 min earlier. The diagnosis at time t therefore uses the value of P7 from 6 min earlier, which can be estimated with the available data. Accordingly, the input X for the estimation of P7(t-6) is F1(t-6), F2(t-6), F3(t-6), F4(t-6), F5(t-6), L8(t-6), T9(t-6), P13(t-6), F17(t-6), YA(t), YB(t), YC(t), YD(t), YE(t), YF(t), YG(t), YH(t), P7(t-7), F1(t-7), F2(t-7), F3(t-7), F4(t-7), F5(t-7), L8(t-7), T9(t-7), P13(t-7), F17(t-7), YA(t-1), YB(t-1), YC(t-1), YD(t-1), YE(t-1), YF(t-1), YG(t-1), and YH(t-1). Consider XA. At time t, the current value of XA is the value from 6 min earlier. The estimation of XA therefore needs the values of F1, F2, F3, F4, and F17 that were collected 6 min earlier. Also, the diagnosis interval of

Table 6. Fault Sets of the TE Process

symptom  fault set
F1(-)    IDV6
F4(-)    IDV7
P7²      IDV1, IDV2, IDV8, IDV13
L8²      IDV1, IDV2, IDV8, IDV13
T9²      IDV1, IDV2, IDV3, IDV4, IDV8, IDV9, IDV11, IDV13, IDV14
T11²     IDV1, IDV2, IDV5, IDV8, IDV12, IDV13, IDV15
P13²     IDV1, IDV2, IDV8, IDV13
T18²     IDV1, IDV2, IDV8, IDV10, IDV13
T21²     IDV4, IDV11, IDV14
T22²     IDV5, IDV12, IDV15
XA²      IDV1, IDV2, IDV8
XB²      IDV2, IDV8
XC²      IDV1, IDV2, IDV8
SUMY²    IDV13

XA is the same as the sampling interval of 6 min. Therefore, the input for the estimation of XA(t) is F1(t-6), F2(t-6), F3(t-6), F4(t-6), F17(t-6), YA(t), XA(t-6), F1(t-12), F2(t-12), F3(t-12), F4(t-12), F17(t-12), and YA(t-6). In the same way, the diagnosis interval of XB through YH is 6 min.

The number of past values l and the number of PCs are determined from the learning data. To verify the diagnostic performance of nonlinear DPLS models, nonlinear quadratic PLS models as well as linear PLS models are built. As in our previous model, the model based on linear PLS uses the cross-correlation plots of the scores to determine the number of time lags and PCs, as suggested by Ku et al.25 Because this method was proposed for linear models, the model based on quadratic PLS uses cross-validation instead. The numbers of time lags and PCs for the linear and quadratic DPLS models of the TE process are shown in Table 3. The same data are used to determine the CUSUM parameters of minimal jump size and threshold.

(3) Fault Set. Using Table 4, the fault sets for the TE process are obtained as shown in Table 6. If a fault occurs, the qualitative state of the residual may be (+) or (-). However, the signs of the arcs from the faults to the measured variables are unknown, except for IDV6 and IDV7, among the 15 faults defined in the TE process (Table 4). Because IDV8 through IDV12 are random variations, the signs of their symptoms can fluctuate between (+) and (-), which greatly decreases the diagnosis accuracy. To account for the characteristics of the faults defined in the TE process and to make a stable diagnosis, the diagnosis strategy is modified so that CUSUM monitors the squared residuals as well as the residuals of each variable, according to the following equation:

$$r_i^{2} = (y_i - \hat{y}_i)^{2} \tag{8}$$

To detect the increase (+) of the squared residuals, CUSUM monitors the squared residuals. For the


Figure 4. Dynamics of measured variables for IDV1.

CUSUM of the squared residuals, 6σ of the squared residual distribution is used as the minimal jump size and 3σ of the CUSUM distribution as the threshold.

When the symptoms of squared residuals are used, a particular strategy is used to increase the diagnosis speed. For a variable whose squared residual is monitored, the sign of the symptom of the residual is unknown. However, the positive or negative symptom of the residual may be detected earlier than that of the squared residual. To improve the diagnosis speed, the positive or negative symptom of the residual is therefore also monitored by CUSUM. If either the squared residual or the residual (+ or -) of the variable is detected, it is concluded that the symptom of the squared residual is detected. Because this strategy can worsen the diagnosis resolution, it should be adopted carefully.

The eight composition measurement variables YA through YH are affected only by IDV13 and are independent of the other faults. With eight variables, the probability that at least one of them is falsely detected is not negligible. When each composition variable is monitored separately, the false detection of a single variable may make the diagnosis unstable. To resolve this difficulty, YA through YH are grouped into one variable as follows:

$$\mathrm{SUMY}^{2} = \sum_{i=\mathrm{YA}}^{\mathrm{YH}} (Y_i - \hat{Y}_i)^{2} \tag{9}$$
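The fault-set lookup of Table 6 and the grouped statistic of eq 9 can be sketched in Python as follows. The sketch interprets the "minimum set of faults" as the fault combinations of smallest size that explain every detected symptom, which is an assumption; the detailed procedure is given in the earlier paper and is not reproduced here. The names `FAULT_SETS`, `diagnose`, and `sum_y_squared` are hypothetical, and squared-residual symptoms are written with `^2` in the dictionary keys.

```python
from itertools import combinations


def idv(*numbers):
    """Helper that builds a set of fault IDs such as {'IDV1', 'IDV2'}."""
    return {f"IDV{n}" for n in numbers}


# Fault sets transcribed from Table 6.
FAULT_SETS = {
    "F1(-)": idv(6),
    "F4(-)": idv(7),
    "P7^2": idv(1, 2, 8, 13),
    "L8^2": idv(1, 2, 8, 13),
    "T9^2": idv(1, 2, 3, 4, 8, 9, 11, 13, 14),
    "T11^2": idv(1, 2, 5, 8, 12, 13, 15),
    "P13^2": idv(1, 2, 8, 13),
    "T18^2": idv(1, 2, 8, 10, 13),
    "T21^2": idv(4, 11, 14),
    "T22^2": idv(5, 12, 15),
    "XA^2": idv(1, 2, 8),
    "XB^2": idv(2, 8),
    "XC^2": idv(1, 2, 8),
    "SUMY^2": idv(13),
}


def sum_y_squared(measured, estimated):
    """Grouped squared residual of eq 9 over the YA..YH measurements."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(measured, estimated))


def diagnose(symptoms, fault_sets=FAULT_SETS, max_size=3):
    """Return the smallest fault combinations that explain all symptoms.

    A combination explains a symptom if it contains at least one fault
    from that symptom's fault set (assumed interpretation).
    """
    if not symptoms:
        return []
    pool = set().union(*(fault_sets[s] for s in symptoms))
    candidates = sorted(pool, key=lambda name: int(name[3:]))
    for size in range(1, max_size + 1):
        hits = [set(combo) for combo in combinations(candidates, size)
                if all(fault_sets[s] & set(combo) for s in symptoms)]
        if hits:
            return hits
    return []
```

For the symptoms XA², T9², and XC², the sketch returns the three single-fault candidates IDV1, IDV2, and IDV8. If F1(-) is detected in addition, no single fault explains everything, and the candidates become double faults that each include IDV6.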

Example and Discussion Example-IDV1 (A/C Feed Ratio, B Composition Constant). When the fault occurs, a step change is induced in the A/C feed ratio in stream 4, which decreases the A feed in stream 6 (XA) and a control loop reacts to increase the A feed in stream 1 (F1) (Figure 4). The variations in the flow rate and compositions of stream 6 to the reactor cause variations in the reactor level (L8), which affect the flow rate in stream 4 (F4) through a cascade control loop. Since the ratio of

reactants A and C changes, the variables associated with the reaction (level, pressure, and composition) change correspondingly. The simulation time for the faulty data set was 48 h to compare the diagnosis result with that of Chiang et al.’s study. The simulations started with no faults, and the faults were introduced to the run from 8 simulation hours. To measure the diagnostic performance, four parameters were used: accuracy, resolution, wrong detection, and detection delay.2 The accuracy is 1 if the diagnosis is accurate; that is, the true fault is included in the final fault candidates set. Otherwise, the accuracy is 0. The resolution denotes the number of final fault candidates. Wrong detection refers to the number of falsely detected symptoms independent of the true solution. The detection delay refers to the time from fault occurrence to fault diagnosis, and a short detection delay indicates quick detection and diagnosis. Figure 5 shows the residuals of the detected variables when the linear PLS model is used. The bounds of Figure 5 are the minimal jump size of CUSUM (6σ of the residual distribution). Using the linear PLS model, the detection sequence of symptoms are as follows: XA2 at 516 min, T92 and T9(-) at 532 min, XA(-) at 534 min, XC(+) at 570 min (fluctuation), XC2 at 570 min, P7(+) from 616 to 940 min (fluctuation), P72 from 618 to 1093 min, T182 at 694 min, T18(+) from 699 to 1204 min, P13(-) at 1036 min (fluctuation), and P132 at 1037 min. The symptoms having sign are XA(-), XC(+), T18(+), T9(-), P7(+), and P13(-). The fault candidates are IDV1, IDV2, and IDV8 from 516 min to the last diagnosis time. The symptom of F12 is falsely detected at 571, 572, and 1676 min and F42 at 1905 and 1906 min. Although the final fault candidates obtained for these 5 min of false detection are six double faults including IDV6 or IDV7, the final solution does include the true solution and the accuracy is 1 from the detection to the last diagnosis time. 
The fluctuation of XC(+), P7(+), and P13(-) between detection and missed detection has no effect on the accuracy. The accurate symptom XA² is detected 36 min after the fault occurrence, so the detection delay is 36 min. When the quadratic PLS model is used, the accuracy is 0.99 because XB² is falsely detected for 24 min (2700-2705 min and 2718-2735 min). During these 24 min, the final candidates are IDV2 and IDV8, and the true solution, IDV1, is missed. As the final fault candidates are IDV1, IDV2, and IDV8, the resolution is 3. The definition of IDV1 does not include the sign of the fault. However, the symptoms XA(-) and XC(+) indicate a decrease of the A/C feed ratio in stream 4. Therefore, IDV1 plus A/C feed ratio (-) can be a more accurate solution. When IDV2 occurs, the B composition changes and the A/C feed ratio is constant, indicating that the signs of the symptoms of XA and XC should be the same regardless of the B composition. This is a potential diagnosis strategy to increase the diagnosis resolution.

Figure 5. Residuals of the detected variables for IDV1.

Example-IDV11 (Reactor Cooling Water Inlet Temperature). IDV11 induces a fault in the reactor cooling water inlet temperature. The fault in this case is a random variation. As seen in Figure 6, the fault induces large oscillations in the reactor cooling water flow rate (MV10), which results in a fluctuation of the reactor temperature (T9). Figure 7 shows the residuals and squared residuals of the detected variables when the linear PLS model is used. The bounds drawn in Figure 7 are the minimal jump size of CUSUM (6σ of the residual distribution). Using the linear PLS model, the detection sequence of symptoms is T21(+) at 498 min (fluctuation), T21² at 502 min, T21(-) at 533 min (fluctuation), and T9(+), T9(-), and T9² for a total of 228 min between 660 and 2874 min. The detection of T9(+), T9(-), T21(+), and T21(-) fluctuates between detection and missed detection. If the diagnosis used only the residuals, a stable solution could not be obtained. However, because the squared residuals are also used, a stable solution is obtained from the detection to the last diagnosis time. There is no false detection in this case. IDV4, IDV11, and IDV14 are obtained as the solution, and the resolution is 3. Also, the detection delay of 18 min is better than the best result (21 min) of Chiang et al.

Result of Single Fault Cases. Table 7 shows the diagnosis results obtained by the linear and quadratic DPLS models and compares them with those of Chiang et al. The wrong detection and resolution shown in this table are the averages of the values measured every 1 min from the initial detection time to the last diagnosis time. The value in parentheses is the worst result obtained during all diagnostic periods. Chiang et al. compared various statistical methods such as PCA, DPCA, CVA, PLS, FDA, and DFDA. In Table 7, the left and right


Figure 6. Dynamics of MV10 and T9 for IDV11.

Figure 7. Residuals and squared residuals of T9 and T21 for IDV11.

Table 7. Diagnosis Results for Single Faults of the TE Process

| fault | detection delay (min), linear PLS | detection delay (min), quadratic PLS | detection delay (min), Chiang et al. (best/worst) | accuracy, linear PLS | accuracy, quadratic PLS | missed detection, Chiang et al. (best/worst) | misclassification, Chiang et al. (best/worst) | wrong detection, linear PLS | wrong detection, quadratic PLS | resolution, linear PLS | resolution, quadratic PLS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| IDV1 | 36 | 36 | 6/21 | 1 | 0.99 | 0/0.008 | 0.013/0.880 | 0.002(1) | 0.001(1) | 3(3) | 3(3) |
| IDV2 | 115 | 78 | 36/75 | 0.90 | 0.97 | 0.010/0.026 | 0.010/0.441 | 0(0) | 0.20(1) | 3.97(4) | 3.65(4) |
| IDV3 | | | | 0 | 0 | 0.981/0.998 | 0.734/1 | 0 | 0 | | |
| IDV4 | 2 | 3 | 3/- | 1 | 1 | 0/0.975 | 0.119/1 | 0(0) | 0(0) | 3(3) | 3(3) |
| IDV5 | 13 | 12 | 0/48 | 0.995 | 1 | 0/0.775 | 0.006/1 | 0(0) | 0(0) | 7(7) | 6.57(7) |
| IDV6 | 1 | 1 | 0/33 | 1 | 1 | 0/0.013 | 0/0.834 | 7.594(10) | 7.77(9) | 1(1) | 1(1) |
| IDV7 | 1 | 1 | 0/3 | 1 | 1 | 0/0.486 | 0/0.978 | 0.0004(1) | 0.293(3) | 1(1) | 1(1) |
| IDV8 | 96 | 84 | 60/69 | 0.61 | 0.66 | 0.016/0.486 | 0.003/1 | 0.002(1) | 0.02(1) | 4.16(9) | 3.77(4) |
| IDV9 | | | | 0 | 0 | 0.981/0.994 | 0.773/1 | 0 | 0 | | |
| IDV10 | 189 | 206 | 69/303 | 1 | 1 | 0.099/0.666 | 0.098/1 | 0(0) | 0(0) | 5(5) | 5(5) |
| IDV11 | 18 | 19 | 21/912 | 1 | 1 | 0.193/0.801 | 0.118/0.989 | 0(0) | 0(0) | 3(3) | 3(3) |
| IDV12 | 68 | 67 | 0/66 | 0.996 | 0.988 | 0/0.029 | 0.005/0.988 | 0.116(2) | 1.345(7) | 3.49(7) | 4.19(7) |
| IDV13 | 142 | 130 | 111/147 | 1 | 1 | 0.040/0.060 | 0.208/1 | 0.088(1) | 0.056(1) | 2.0(4) | 2.44(4) |
| IDV14 | 11 | 11 | 3/18 | 1 | 1 | 0/0.158 | 0.001/0.998 | 0.0004(1) | 0 | 3(3) | 3(3) |
| IDV15 | | | 2031/- | 0 | | 0.903/0.988 | 0.725/1 | 0 | 0 | | |

values separated by the slash in the results of Chiang et al. are the best and worst results among these methods, respectively. When Chiang et al. evaluated detection delays, a fault was indicated only after six consecutive measured values had exceeded the threshold, and the detection delay was recorded as the first time instant at which the threshold was exceeded. Therefore, the actual detection delay of their methods is expected to be longer than the detection delay shown in Table 7. However, the diagnosis interval of the proposed method is 1 or 6 min, which differs from that (3 min) of Chiang et al. Thus, a difference of 1 or 2 min in detection delay does not mean that one method gives a strictly better result than the other.

Regardless of the order of the DPLS models, the detections of four cases (IDV4, IDV6, IDV7, and IDV11) are faster than those of Chiang et al. The other cases show similar or only slightly worse detection delays than their methods. The detection delays obtained by the linear models are almost the same as those of the quadratic models.

Figure 8. Dynamics and squared residuals of the measured variables for (a) IDV3, (b) IDV15, (c) IDV8, and (d) IDV6.

The diagnosis for three cases (IDV3, IDV9, and IDV15) failed because the fault sizes of these cases were so small that the variations of the process variables were as weak as those at steady state. Parts a and b of Figure 8 compare T9 (reactor temperature) for IDV3 (step change of the D feed temperature in stream 2) with steady-state data and T22 (condenser cooling water outlet temperature) for IDV15 (sticking of the condenser cooling water valve) with steady-state data, respectively. For the same reason, the other methods used by Chiang et al. also encounter difficulty in making an accurate diagnosis (Table 7). In the other 12 cases, except IDV8, the diagnostic accuracies are almost 1 during all diagnosis periods. In the diagnosis of IDV8, our method failed during 39% of the diagnosis periods, and the accuracy is 0.61. The symptoms of P7, T9, P13, and XA are detected but are frequently missed because of the small fault size (Figure 8c), indicating that the diagnostic performance of the suggested method depends on the fault size. The results of the quadratic model do not differ significantly from those of the linear model.

The missed detection and misclassification rates used by Chiang et al. are diagnosis failure rates. Although their meaning is not the same as that of our accuracy, 1 minus either of the two rates can be compared with the accuracy used in this study. In most cases, the accuracy obtained in our study was 1, which was better than the result obtained by Chiang et al.
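The averaged parameters discussed for the single-fault cases can be illustrated with a short sketch that summarizes a per-minute diagnosis log. The record layout, the function name, and the synthetic log are assumptions made for illustration. The log is built only from the IDV1 quadratic-model numbers quoted in the text (fault at 480 min, first detection at 516 min, false detection of XB² during 2700-2705 and 2718-2735 min), and the last diagnosis time of 2880 min (48 h) is assumed. The averaged resolution and wrong detection of this synthetic log are not intended to reproduce the entries of Table 7.

```python
def summarize_diagnosis(records, true_fault, fault_time):
    """Summarize a diagnosis log.

    records: list of (time_min, final_candidates, n_wrong_symptoms), one entry
    per diagnosis time, starting at the first detection.
    """
    n = len(records)
    hits = sum(1 for _, candidates, _ in records if true_fault in candidates)
    sizes = [len(candidates) for _, candidates, _ in records]
    wrong = [n_wrong for _, _, n_wrong in records]
    return {
        "detection_delay": records[0][0] - fault_time,
        "accuracy": hits / n,                             # share of accurate diagnoses
        "resolution": (sum(sizes) / n, max(sizes)),       # (average, worst)
        "wrong_detection": (sum(wrong) / n, max(wrong)),  # (average, worst)
    }


# Synthetic one-minute log; the end time of 2880 min is an assumption.
records = []
for t in range(516, 2881):
    false_xb = 2700 <= t <= 2705 or 2718 <= t <= 2735
    candidates = {"IDV2", "IDV8"} if false_xb else {"IDV1", "IDV2", "IDV8"}
    records.append((t, candidates, 1 if false_xb else 0))

summary = summarize_diagnosis(records, true_fault="IDV1", fault_time=480)
print(summary["detection_delay"], round(summary["accuracy"], 2))
```

With these assumptions, the 24 min of missed diagnosis out of 2365 diagnosis times give an accuracy that rounds to 0.99, and the detection delay is 36 min.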

Except for IDV12, there is little difference between the rates of wrong detection by the linear and quadratic models. Wrong detection elevated other fault candidates to the top tier. For example, in the diagnosis of IDV1 by the quadratic model, the false detection of XB² gave the final solution of IDV2 and IDV8, and the diagnosis failed during the false detection. In the diagnosis of IDV6, the average wrong detections by the linear and quadratic models are over 7. This is due to the controller output saturations, such as MV3 (547 min, 100%, Figure 8d), MV9 (764 min, 100%), MV10 (1037 min, 100%), MV5 (1152 min, 100%), MV6 (1225 min, 0%), and MV4 (1303 min, 100%), and to the fact that the operation range reached under the fault is very different from that of the training data. Taking the average resolutions into account as well, it can be concluded that the linear and quadratic models show comparable performance.

Result of Double Fault Cases. Double faults are generated from combinations of single faults. Because IDV3, IDV9, and IDV15, among the 15 single faults, have a detection problem, they are omitted from the combinations for double faults. IDV6 is also omitted because it produces a number of false detections. From the remaining 11 single faults, 55 double faults (11C2) can be made. However, the proposed method cannot diagnose double faults that affect the same measured variables. For instance, the symptom set that can be generated from IDV2 or IDV8 includes all possible symptoms of IDV1. With the double fault of IDV1 and IDV2, the possible solutions are IDV1, IDV2, and IDV8. Therefore, 11 double faults are removed and 44 double faults are tested. Tables 8 and 9 show the diagnosis results obtained by the linear and quadratic PLS models, respectively. Among the four diagnosis parameters, the detection delays by the linear models are not much different from those by the quadratic models, but the accuracies by the linear models (the arithmetic average of the 44 cases is 0.926) are a little better than those by the quadratic models (0.856).
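The reason why a double fault such as IDV1 plus IDV2 cannot be isolated can be sketched with a small search for the smallest fault sets that explain all observed symptoms. This is an illustrative reading of the "minimum set of causes" rule, not the authors' implementation. The symptom sets in `possible_symptoms` are hypothetical placeholders chosen so that IDV2 and IDV8 cover everything IDV1 can produce, and the function name is also an assumption.

```python
from itertools import combinations
from math import comb


def minimal_explanations(observed, possible_symptoms, max_size=2):
    """Return the smallest fault sets whose possible symptoms cover `observed`."""
    faults = sorted(possible_symptoms)
    for size in range(1, max_size + 1):
        hits = [
            combo
            for combo in combinations(faults, size)
            if observed <= set().union(*(possible_symptoms[f] for f in combo))
        ]
        if hits:  # stop at the smallest size that explains everything
            return hits
    return []


# Hypothetical symptom sets (not taken from the paper's digraph).
possible_symptoms = {
    "IDV1": {"XA", "XC"},
    "IDV2": {"XA", "XB", "XC"},
    "IDV4": {"T9"},
    "IDV8": {"XA", "XB", "XC"},
}

# Symptoms of XA and XC alone are explained by single faults, so a double
# fault of IDV1 and IDV2 would never be returned as the minimal solution.
single = minimal_explanations({"XA", "XC"}, possible_symptoms)
# Symptoms on disjoint variables need two faults.
double = minimal_explanations({"XA", "XC", "T9"}, possible_symptoms)

n_pairs = comb(11, 2)    # double faults from 11 single faults
n_tested = n_pairs - 11  # after removing the 11 indistinguishable pairs
print(single, double, n_pairs, n_tested)
```

Under these placeholder symptom sets, the first query returns the three single faults IDV1, IDV2, and IDV8, which mirrors the candidate list quoted in the text, and the pair count reproduces the 55 and 44 cases.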
The lower accuracy of the quadratic models is due to their making more wrong detections than the linear models. In Table 8, the accuracies of the

Table 8. Diagnosis Results for Double Faults by the Linear DPLS Model

| first fault | second fault | detection delay (first/second) | accuracy | wrong detection | resolution |
|---|---|---|---|---|---|
| IDV1 | IDV4 | 36/2 | 1 | 0.003(1) | 8.92(9) |
| IDV1 | IDV5 | 12/12 | 0.998 | 0(0) | 10.88(11) |
| IDV1 | IDV7 | 30/1 | 1 | 0.006(1) | 2.97(3) |
| IDV1 | IDV11 | 36/18 | 1 | 0.002(1) | 8.96(9) |
| IDV1 | IDV12 | 36/118 | 0.994 | 0.023(2) | 8.97(11) |
| IDV1 | IDV13 | 36/810 | 0.997 | 0.006(1) | 3.16(8) |
| IDV1 | IDV14 | 36/12 | 1 | 0.002(1) | 8.94(9) |
| IDV2 | IDV4 | 115/2 | 0.902 | 0(0) | 11.44(12) |
| IDV2 | IDV5 | 12/12 | 0.842 | 0.007(1) | 10.41(12) |
| IDV2 | IDV7 | 116/1 | 0.729 | 0.008(1) | 3.77(4) |
| IDV2 | IDV11 | 112/18 | 0.897 | 0.001(1) | 11.51(12) |
| IDV2 | IDV12 | 77/77 | 0.178 | 0.179(1) | 6.96(12) |
| IDV2 | IDV13 | 183/183 | 0.945 | 0.004(1) | 3.07(8) |
| IDV2 | IDV14 | 113/16 | 0.872 | 0(0) | 11.48(12) |
| IDV4 | IDV5 | 2/13 | 1 | 0.004(1) | 20.92(21) |
| IDV4 | IDV7 | 2/1 | 1 | 0.002(1) | 3.0(3) |
| IDV4 | IDV8 | 2/96 | 1 | 0.005(1) | 7.97(12) |
| IDV4 | IDV10 | 2/189 | 1 | 0(0) | 14.06(15) |
| IDV4 | IDV12 | 2/68 | 0.975 | 0.022(1) | 10.14(21) |
| IDV4 | IDV13 | 2/142 | 1 | 0.009(1) | 5.54(12) |
| IDV5 | IDV7 | 12/1 | 1 | 0.01(1) | 6.71(7) |
| IDV5 | IDV8 | 13/13 | 0.999 | 0.009(1) | 9.53(27) |
| IDV5 | IDV10 | 13/187 | 0.073 | 0(0) | 7(7) |
| IDV5 | IDV11 | 13/17 | 1 | 0.005(2) | 20.98(21) |
| IDV5 | IDV13 | 13/13 | 0.984 | 0.036(2) | 7.85(12) |
| IDV5 | IDV14 | 12/9 | 1 | 0.005(1) | 20.98(21) |
| IDV7 | IDV8 | 1/96 | 0.999 | 0.055(1) | 3.29(9) |
| IDV7 | IDV10 | 1/184 | 0.9996 | 0(0) | 4.70(5) |
| IDV7 | IDV11 | 1/18 | 1 | 0.003(1) | 2.99(3) |
| IDV7 | IDV12 | 1/31 | 0.985 | 0.118(1) | 3.27(7) |
| IDV7 | IDV13 | 1/147 | 0.9996 | 0.157(1) | 1.97(4) |
| IDV7 | IDV14 | 1/16 | 1 | 0.0004(1) | 2.99(3) |
| IDV8 | IDV11 | 96/18 | 1 | 0.005(1) | 8.81(12) |
| IDV8 | IDV12 | 68/68 | 0.981 | 0.020(2) | 11.01(27) |
| IDV8 | IDV13 | 96/145 | 0.628 | 0.072(2) | 3.67(5) |
| IDV8 | IDV14 | 96/11 | 1 | 0.003(1) | 9.38(12) |
| IDV10 | IDV11 | 188/18 | 1 | 0.0004(1) | 14.14(15) |
| IDV10 | IDV12 | 184/68 | 0.855 | 0.005(1) | 14.53(15) |
| IDV10 | IDV14 | 189/11 | 1 | 0.0004(1) | 14.11(15) |
| IDV11 | IDV12 | 18/68 | 0.966 | 0.014(1) | 9.90(21) |
| IDV11 | IDV13 | 18/141 | 1 | 0.011(1) | 5.55(12) |
| IDV12 | IDV13 | 68/68 | 0.9996 | 0.020(1) | 5.34(12) |
| IDV12 | IDV14 | 68/11 | 0.963 | 0.012(1) | 10.17(21) |
| IDV13 | IDV14 | 141/11 | 1 | 0.010(1) | 5.62(12) |

two cases are low. The accuracy for the double fault of IDV2 and IDV12 is very low, with an average value of 0.178. In this case, SUMY2 was wrongly detected for 1968 min, and XA and XB were detected only for short periods. Since IDV13 can explain one more symptom than IDV2, the diagnosis failed. In the case of IDV5 and IDV10, the average accuracy is 0.073. Because the symptoms of T11 and T18 were detected to the end and T22 was not detected, the single faults IDV1, IDV2, IDV8, and IDV13 can explain the two symptoms of T11 and T18 during long diagnosis periods. Although the quadratic model makes more wrong detections than the linear model, the wrong detections obtained by both models are less than 1 on average. In Tables 8 and 9, the number of final candidates is not small because a number of single faults that affect the same measured variables can be distinguished only by the fault type (step or random variation).

Conclusion

This study investigated the multiple-fault diagnosis of the TE process, which is a benchmark process for evaluating process diagnosis methods. The hybrid diagnosis method combining SDG and DPLS, proposed in

Table 9. Diagnosis Results for Double Faults by the Quadratic DPLS Model

| first fault | second fault | detection delay (first/second) | accuracy | wrong detection | resolution |
|---|---|---|---|---|---|
| IDV1 | IDV4 | 36/2 | 0.990 | 0.021(1) | 8.91(9) |
| IDV1 | IDV5 | 11/11 | 0.990 | 0.005(1) | 3.38(9) |
| IDV1 | IDV7 | 36/1 | 0.950 | 0.019(2) | 2.97(3) |
| IDV1 | IDV11 | 36/18 | 0.982 | 0.002(1) | 8.95(9) |
| IDV1 | IDV12 | 24/95 | 0.964 | 0.25(2) | 8.74(10) |
| IDV1 | IDV13 | 36/402 | 0.720 | 0.002(1) | 3(3) |
| IDV1 | IDV14 | 36/13 | 0.990 | 0.002(1) | 8.94(9) |
| IDV2 | IDV4 | 78/2 | 0.973 | 0.222(2) | 10.67(12) |
| IDV2 | IDV5 | 12/12 | 0.922 | 0.277(2) | 10.88(12) |
| IDV2 | IDV7 | 60/1 | 0.582 | 0.782(2) | 3.04(4) |
| IDV2 | IDV11 | 78/19 | 0.970 | 0.231(2) | 10.76(12) |
| IDV2 | IDV12 | 66/66 | 0.338 | 0.772(3) | 6.30(9) |
| IDV2 | IDV13 | 78/228 | 1 | 0.065(2) | 2.58(3) |
| IDV2 | IDV14 | 78/19 | 0.934 | 0.233(2) | 10.75(12) |
| IDV4 | IDV5 | 3/12 | 1 | 0.002(2) | 19.63(21) |
| IDV4 | IDV7 | 2/1 | 0.96 | 0.197(1) | 3.0(3) |
| IDV4 | IDV8 | 3/84 | 0.977 | 0.064(2) | 8.24(12) |
| IDV4 | IDV10 | 3/206 | 1 | 0(0) | 13.98(15) |
| IDV4 | IDV12 | 3/67 | 0.713 | 0.279(2) | 11.76(21) |
| IDV4 | IDV13 | 3/130 | 1 | 0.193(2) | 6.94(12) |
| IDV5 | IDV7 | 12/1 | 0.741 | 0.114(1) | 6.71(7) |
| IDV5 | IDV8 | 12/12 | 0.939 | 0.022(1) | 7.77(12) |
| IDV5 | IDV10 | 12/210 | 0.204 | 0(0) | 10.5(15) |
| IDV5 | IDV11 | 12/19 | 1 | 0.003(1) | 19.06(21) |
| IDV5 | IDV13 | 12/12 | 0.99 | 0.072(1) | 8.18(12) |
| IDV5 | IDV14 | 12/19 | 1 | 0.005(1) | 19.4(21) |
| IDV7 | IDV8 | 1/84 | 0.805 | 0.191(3) | 3.69(4) |
| IDV7 | IDV10 | 1/269 | 0.795 | 0(0) | 4.68(5) |
| IDV7 | IDV11 | 1/18 | 0.961 | 0.195(1) | 2.98(3) |
| IDV7 | IDV12 | 1/67 | 0.672 | 0.289(2) | 3.48(7) |
| IDV7 | IDV13 | 1/144 | 0.9996 | 0.188(2) | 2.24(4) |
| IDV7 | IDV14 | 1/16 | 0.96 | 0.198(1) | 2.984(3) |
| IDV8 | IDV11 | 84/19 | 0.98 | 0.062(2) | 7.27(12) |
| IDV8 | IDV12 | 67/67 | 0.581 | 0.339(2) | 11.16(27) |
| IDV8 | IDV13 | 84/142 | 0.701 | 0.530(3) | 3.67(5) |
| IDV8 | IDV14 | 84/11 | 0.965 | 0.066(2) | 9.99(12) |
| IDV10 | IDV11 | 207/19 | 1 | 0(0) | 14.05(15) |
| IDV10 | IDV12 | 202/67 | 0.377 | 0(0) | 13.79(15) |
| IDV10 | IDV14 | 206/11 | 1 | 0(0) | 14.02(15) |
| IDV11 | IDV12 | 19/67 | 0.713 | 0.313(2) | 12.31(21) |
| IDV11 | IDV13 | 19/130 | 1 | 0.198(2) | 7.017(12) |
| IDV12 | IDV13 | 67/67 | 1 | 0.220(3) | 3.97(12) |
| IDV12 | IDV14 | 67/11 | 0.481 | 0.009(1) | 12.01(21) |
| IDV13 | IDV14 | 129/11 | 1 | 0.19(2) | 7.10(12) |

our previous study, was used. The process was decomposed around 20 measured variables, which are directly affected by the 15 faults defined in the TE process, and the reduced digraph for each decomposed subprocess was constructed. Dynamic linear and nonlinear (quadratic) PLS models were built for each decomposed subprocess, and fault diagnosis was performed by using the residual between the value estimated by the DPLS model and the measured one. Through the case studies of 15 single faults, the diagnosis performance was compared with that of the statistical methods reviewed by Chiang et al., which need faulty case data sets. The results confirmed the satisfactory accuracy of the proposed method. In particular, the diagnosis of four cases by the proposed method was faster than that by the other methods. In the single-fault cases, the results of the linear and quadratic models did not differ significantly. The average wrong detection of one single-fault case was over 7, because the operation range reached under the fault was very different from that of the training data. In the future, the diagnosis strategy will have to be able to change the DPLS models according to significant changes of the operation range. If sufficient data for various operation ranges are provided, multivariate statistics such as PCA can be helpful in judging the change of the operating conditions. After the estimation models are switched, the CUSUM parameters of minimal jump size and threshold size may have to be changed. The change also needs a strategy to smoothly alter the variables of the detection program.

Double-fault diagnosis of the TE process was also performed. The diagnosis results were acceptably accurate. In several double-fault cases, the accuracy of the linear model was better than that of the quadratic model because the latter made more wrong detections than the former. Although the linear models were applied to a nonlinear process, they showed a good diagnosis capability. However, it must be noted that the results cannot be generalized to other processes.

In this study, faults and symptoms have the form of causes and effects, respectively, and the minimum set of causes that can explain all symptoms becomes the final solution. This can be the basic form of a standardized diagnostic structure that integrates the results obtained from each diagnosis method. In other words, all possible faults are predefined, and the structure of other conventional diagnostic methods is simplified to the form of cause and effects. For example, the if-then sentence in a rule-based expert system expresses cause and effects. After the diagnosis system collects the causes and effects from each diagnosis method applicable to a target process, the minimum set of faults that can explain all effects becomes the solution.

Acknowledgment

This work was supported by Grant No. R01-2004-00010345-0 from the Basic Research Program of the Korea Science & Engineering Foundation.
Notation

B_PLS = matrix of PLS regression coefficients
b_i = PLS regression coefficient
c_0, c_1, c_2 = coefficients of the quadratic polynomial relation
E = residual matrix for X
F = residual matrix for Y
k = number of principal components
l = number of time delays
P = loading matrix for X
Q = loading matrix for Y
r_i = residual of variable i
T = score matrix for X
U = score matrix for Y
W = X-weight matrix
X = input matrix
Y = output matrix
Ŷ = predicted output matrix
y_i = measured value of variable i
ŷ_i = estimated value of variable i

Literature Cited

(1) Becraft, W. R.; Guo, D. Z.; Lee, P. L.; Newell, R. B. Fault Diagnosis Strategies for Chemical Plants: A Review of Competing Technologies. Proceedings of Process Systems Engineering '91, Montebello, Canada, 1991; Vol. 2, pp 12.1-12.15.
(2) Lee, G.; Song, S.-O.; Yoon, E. S. Multiple-Fault Diagnosis Based on System Decomposition and Dynamic PLS. Ind. Eng. Chem. Res. 2003, 42, 6145-6154.
(3) Downs, J. J.; Vogel, E. F. A Plant-wide Industrial Process Control Problem. Comput. Chem. Eng. 1993, 17, 245-255.
(4) Chiang, L. H.; Russell, E. L.; Braatz, R. D. Fault Detection and Diagnosis in Industrial Systems; Springer: London, 2001.
(5) Raich, A.; Çinar, A. Multivariate Statistical Methods for Monitoring Continuous Processes: Assessment of Discrimination Power of Disturbance Models and Diagnosis of Multiple Disturbances. Chemom. Intell. Lab. Syst. 1995, 30, 37-48.
(6) Gertler, J.; Li, W.; Huang, Y.; McAvoy, T. Isolation Enhanced Principal Component Analysis. AIChE J. 1999, 45, 323-334.
(7) Kano, M.; Nagao, K.; Hasebe, S.; Hashimoto, I.; Ohno, H.; Strauss, R.; Bakshi, B. Comparison of Statistical Process Monitoring Methods: Application to the Eastman Challenge Problem. Comput. Chem. Eng. 2000, 24, 175-181.
(8) Raich, A.; Çinar, A. Diagnosis of Process Disturbances by Statistical Distance and Angle Measures. Comput. Chem. Eng. 1997, 21, 661-673.
(9) Chiang, L. H.; Russell, E. L.; Braatz, R. D. Fault Diagnosis in Chemical Processes Using Fisher Discriminant Analysis, Discriminant Partial Least-Squares, and Principal Component Analysis. Chemom. Intell. Lab. Syst. 2000, 50, 243-252.
(10) Chen, G.; McAvoy, T. J. Predictive On-Line Monitoring of Continuous Processes. J. Process Control 1998, 8, 409-420.
(11) Huang, Y.; Gertler, J.; McAvoy, T. J. Sensor and Actuator Fault Isolation by Structured Partial PCA with Nonlinear Extensions. J. Process Control 2000, 10, 459-469.
(12) Lin, W.; Qian, Y.; Li, X. Nonlinear Dynamic Principal Component Analysis for On-line Process Monitoring and Diagnosis. Comput. Chem. Eng. 2000, 24, 423-429.
(13) Kassidas, A.; Taylor, P. A.; MacGregor, J. F. Off-Line Diagnosis of Deterministic Faults in Continuous Dynamic Multivariable Processes Using Speech Recognition Methods. J. Process Control 1998, 8, 381-393.
(14) Chiang, L. H.; Kotanchek, M. E.; Kordon, A. K. Fault Diagnosis Based on Fisher Discriminant Analysis and Support Vector Machines. Comput. Chem. Eng. 2004, 28, 1389-1401.
(15) Akbaryan, F.; Bishnoi, P. R. Fault Diagnosis of Multivariate Systems Using Pattern Recognition and Multisensor Data Analysis Technique. Comput. Chem. Eng. 2001, 25, 1313-1339.
(16) Chen, J.; Howell, J. Towards Distributed Diagnosis of the Tennessee Eastman Process Benchmark. Control Eng. Pract. 2002, 10, 971-987.
(17) Yamashita, Y. Dimensionality Reduction in Computer-Aided Decision Making. Proceedings of Process Systems Engineering 2003, Kunming, China, 2004; Elsevier: Amsterdam, The Netherlands, 2004; pp 356-361.
(18) Maurya, M. R.; Rengaswamy, R.; Venkatasubramanian, V. Qualitative Trend Analysis of the Principal Components: Application to Fault Diagnosis. Proceedings of Process Systems Engineering 2003, Kunming, China, 2004; Elsevier: Amsterdam, The Netherlands, 2004; pp 968-973.
(19) Baffi, G.; Martin, E. B.; Morris, A. J. Nonlinear Projection to Latent Structures Revisited: The Quadratic PLS Algorithm. Comput. Chem. Eng. 1999, 23, 395-411.
(20) Wold, S. Nonlinear Partial Least-Squares Modelling. II. Spline Inner Relation. Chemom. Intell. Lab. Syst. 1992, 14, 71-84.
(21) Wold, S.; Kettaneh-Wold, N.; Skagerberg, B. Nonlinear PLS Modeling. Chemom. Intell. Lab. Syst. 1989, 7, 53-65.
(22) Qin, S. J.; McAvoy, T. J. Nonlinear FIR Modeling via a Neural Net PLS Approach. Comput. Chem. Eng. 1996, 20, 147-159.
(23) Holcomb, T. R.; Morari, M. PLS/Neural Networks. Comput. Chem. Eng. 1992, 16, 393-411.
(24) Ricker, N. L.; Lee, J. H. Nonlinear Modeling and State Estimation for the Tennessee Eastman Challenge Process. Comput. Chem. Eng. 1995, 19, 983-1005.
(25) Ku, W.; Storer, R. H.; Georgakis, C. Disturbance Detection and Isolation by Dynamic Principal Component Analysis. Chemom. Intell. Lab. Syst. 1995, 30, 179-196.

Received for review May 6, 2004
Revised manuscript received August 29, 2004
Accepted September 13, 2004
IE049624U