
Ind. Eng. Chem. Res. 1999, 38, 988-998

Multiple-Fault Diagnosis under Uncertain Conditions by the Quantification of Qualitative Relations

Gibaek Lee
Department of Industrial Chemistry, Chung-Ju National University, Chung-Ju, Chungbuk 380-702, Korea

Byungwoo Lee and En Sup Yoon*
Division of Chemical Engineering, Seoul National University, Seoul 151-742, Korea

Chonghun Han
Automation Research Center and Department of Chemical Engineering, Pohang University of Science and Technology, Pohang, Kyungbuk 790-784, Korea

Among the many fault diagnosis methodologies for chemical processes, the signed digraph (SDG) offers a simple, graphical representation of the causal relationships between process variables and has been widely used. The assumptions of SDG, such as the single-fault assumption and the consistent path assumption, have, however, limited its application to real processes. This paper presents a multiple-fault diagnosis methodology that retains the advantages of SDG. The diagnosis model is organized by carefully modifying the SDG and searching the fault propagation paths. On-line diagnosis is performed by quantifying the qualitative relations of the fault propagation path; this quantification enables the diagnosis to overcome the weakness of SDG under uncertainty and to diagnose multiple faults. This work also suggests how a modified CUSUM (cumulative sum) can be used to handle compensatory or inverse responses. The proposed methodology is illustrated with a case study of a continuously stirred tank reactor (CSTR) with a heat exchanger.

Introduction

The safe and reliable operation of chemical processes has been one of the key factors for chemical industries to survive in a very competitive international market. To ensure safe and reliable operation, however, an effective and reliable methodology is needed to handle factors such as changes in feed specifications or the surroundings, sensor noise, and equipment malfunction. Automatic control systems help operators keep the process within the proper operating range against those factors, while automatic trip systems help protect the process from them. The gap between these two systems, however, has caused operators serious difficulties in monitoring and diagnosing processes, which has led to unstable and unreliable operation. Many simultaneous warnings may mislead operators. Infrequent faults make decision-making more difficult because operators are not trained for them.
As a result, it is not easy for operators to take correct actions when facing these faults. At the same time, operators are under stress to make proper decisions quickly. There is great demand for an automated fault diagnosis system that analyzes process data on-line, monitors process trends, and diagnoses faults in abnormal situations to help process operators make the right decisions. The system should provide operators with sufficient information for decision-making and ultimately help them keep operation continuous and safe more efficiently.

A variety of approaches have been developed for the fault diagnosis of chemical processes. A rule-based expert system, state estimation, signed digraph, qualitative simulation, and neural network are a few of those that have been reviewed in detail.1 Among these approaches, the signed digraph (SDG) offers a simple and graphical representation of the causal relationships between process variables and of the propagation paths of a fault, based on experience and basic principles.2,3 After Iri's work,2 the objectives of studies on SDG-based diagnosis can be divided into three categories: diagnosis resolution, by adding quantitative information to arcs;4-6 diagnosis speed, using off-line analysis of the SDG;3,7-9 and representation of process interaction, through expansion to the time domain.10,11

Despite this research on SDG, two fundamental assumptions limit the wide application of SDG-based diagnosis methods to real processes: the single-fault assumption and the consistent path assumption. The single-fault assumption is that multiple faults rarely occur. As a result, it is difficult even to handle a fault occurring simultaneously with external disturbances. The consistent path assumption says that each observed change of a process variable is connected by an unbroken path to a fault. This is not always true because the fault propagation path does not always conform to the expected or predicted one.12

This paper presents a fault diagnosis methodology for continuous chemical processes in which more than one fault occurs. The suggested method is based on SDG and enables robust diagnosis by relaxing the two assumptions of SDG through a quantitative measure of the qualitative fault propagation path. Because the knowledge base needed for the successful diagnosis of a large-scale process is huge, this work presents a technique for reducing the size of the knowledge base by grouping process variables. A resolution improvement technique that adds the quantitative governing equations available in the process is also introduced. The proposed methodology is illustrated with the case study of the CSTR with the heat exchanger shown in Figure 1.8

* To whom correspondence should be addressed. Tel.: +82-2-873-2605. Fax: +82-2-884-0530. E-mail: esyoon@pslab.snu.ac.kr.

10.1021/ie980359k CCC: $18.00 © 1999 American Chemical Society Published on Web 01/29/1999

Ind. Eng. Chem. Res., Vol. 38, No. 3, 1999 989

Figure 1. Process flow diagram of the CSTR process.8

Figure 2. The type of multiple fault: (a) induced fault; (b) independent multiple fault; (c) masked multiple fault; (d) dependent multiple fault.

Limitations of SDG-Based Diagnosis Methods

Consistent Path Assumption. The consistent path assumption (or single-state transition assumption) is that each observed symptom is connected to the fault by an unbroken path of consistent arcs. In reality, this is not always true because the symptom sequence is affected by the fault size and location, noise, and thresholds, and consequently does not follow the predicted one.12 Because of this problem, which is called symptom variation, methods based on SDG with the consistent path assumption often fail in diagnosis. The problem can be classified into four categories: effect of noise, error in threshold, effect of fault size and location, and effect of control loop.13

Single-Fault Assumption. The diagnostic methodology based on SDG has been developed under the single-fault assumption that all the observed abnormal states arise from one fault.2,3 In most studies of fault diagnosis, including those on SDG, multiple faults have not been treated because they rarely occur. However, disturbances such as changes in demand and product specifications often occur during the operation of a chemical plant. Such disturbances change the process state, and subsequent faults may occur. Consequently, a fault diagnostic methodology should be able to handle a fault that occurs together with disturbances. A multiple fault consists of two or more faults that happen simultaneously or sequentially and can be classified into four categories as follows.
(1) Induced fault: This is the case when a fault is induced by another fault (Figure 2a). In other words, an induced fault depends on the previous fault. To treat this type of multiple fault, Finch et al. introduced an arc (or branch) called the induced failure link (IFL).12 The IFL enables the diagnosis of one fault together with the induced fault that is expected to follow it.
(2) Independent multiple faults: This is the case when different faults affect different variables, so the process variables affected by each fault appear separately, as in Figure 2b. Once multiple faults are allowed in a fault diagnosis system, this type of problem is very easy to handle.
(3) Masked multiple faults: These are faults of which some (A) can explain all the symptoms of the others (B), as shown in Figure 2c. Fault B cannot be recognized using the qualitative method only.
(4) Dependent multiple faults: These are faults that compete with each other, resulting in mixed symptoms (Figure 2d). In this case, no single fault can explain all the symptoms, which result from mutual amplification and diminution.

Figure 3. The signed digraph for the CSTR process.8

Model Development

Modification of SDG. The fault-effect tree (FET) suggested in this work represents, as a tree, the possible fault propagation paths that result from each fault (top node); the terminology of SDG comes from the study of Mohindra and Clark.3 The SDG is modified by adding the following elements, and the FET model is obtained from the modified SDG.
(1) Fault: In previous methodologies, the fault is treated as identical to the root node, which can be physically infeasible.3 Hence, the fault cannot be represented perfectly. This study instead defines physically feasible faults for each piece of equipment and adds them on the root node in order to handle only physically meaningful faults. For instance, the faults of each piece of equipment of the CSTR (the original SDG is shown in Figure 3) are listed in Table 1, and the cases of FEED-FCH (feed flow rate change high) and FEED-FCL (feed flow rate change low) are shown in Figure 4a.


Table 1. Fault Classification According to Equipment in the CSTR

equipment              fault                                      abbreviation letter
sensor                 failure high (including stuck)             FH
                       failure low (including stuck)              FL
                       bias high (including stuck)                BH
                       bias low (including stuck)                 BL
valve                  bias high (including failure and stuck)    BH
                       bias low (including failure and stuck)     BL
pipe                   blockage                                   BK
reactor                leak                                       LK
pump                   equipment failure                          EF
heat exchanger         fouling                                    FL
external disturbance   flow rate change high                      FCH
                       flow rate change low                       FCL
                       temperature change high                    TCH
                       temperature change low                     TCL
                       composition change high                    CCH
                       composition change low                     CCL
operator disturbance   setpoint change high                       SVCH
                       setpoint change low                        SVCL

Figure 4. The example of SDG modification.

(2) Constraint variable: The constraint variable represents the quantitative governing equation14 such as the balance equation and valve relation as variables. For example, three governing equations for CSTR can be formulated as follows:

mass balance equation: $0 = \mathrm{FOS} - \mathrm{FPS}$ (1)

control valve equations: $\mathrm{FRS} = k_1\,\mathrm{CR}$ (2)

$\mathrm{FPS} = k_2\,\mathrm{CL}$ (3)

The three following constraint variables are introduced from these equations and are added to SDG to use as symptoms for the diagnosis.

$\mathrm{CV1} = \mathrm{FOS} - \mathrm{FPS}$ (4)

$\mathrm{CV2} = \mathrm{FRS} / \mathrm{CR}$ (5)

$\mathrm{CV3} = \mathrm{FPS} / \mathrm{CL}$ (6)
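A minimal Python sketch of how eqs 4-6 could be evaluated on-line and turned into qualitative symptoms is given here. The function names, the numerical readings, and the symmetric tolerance band are illustrative assumptions and not part of the original method; the nominal values of CV2 and CV3 would correspond to the valve constants $k_1$ and $k_2$.

```python
def constraint_variables(fos, fps, frs, cr, cl):
    """Evaluate eqs 4-6 from the measured variables FOS, FPS, FRS, CR, CL."""
    return {
        "CV1": fos - fps,  # mass-balance residual, nominally 0
        "CV2": frs / cr,   # valve relation of eq 2, nominally k1
        "CV3": fps / cl,   # valve relation of eq 3, nominally k2
    }


def qualitative_state(value, nominal, tolerance):
    """Map a value to '+', '-' or '0' relative to its nominal value.

    The symmetric tolerance band is an assumption made for this sketch.
    """
    deviation = value - nominal
    if deviation > tolerance:
        return "+"
    if deviation < -tolerance:
        return "-"
    return "0"


# Hypothetical readings in which the inlet flow exceeds the outlet flow.
cv = constraint_variables(fos=1.00, fps=0.80, frs=0.50, cr=0.25, cl=0.40)
symptom = ("CV1", qualitative_state(cv["CV1"], nominal=0.0, tolerance=0.05))
print(symptom)
```

With these made-up readings the mass-balance residual is positive, so the sketch reports the symptom (CV1, +).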

Each constraint variable is used as a node to improve the resolution. For instance, (CV1, +) is used as the symptom of reactor leakage (Figure 4b).
(3) Conditions for an arc: Imposing conditions on arcs originated with the ESDG,8 which introduced conditional branches to prevent the fault propagation path from being broken when a controlled variable presents a compensatory response. In this study, compensatory and inverse responses are treated in symptom detection, and the control mode (automatic, manual, or cascade) is represented by conditions on arcs.

For example, the control mode of the CSTR recycle flow rate controller must be automatic for its setpoint change to affect the process (Figure 4c).
(4) Variable cluster: A variable cluster is a group of process variables (or nodes) that show similar responses to disturbances, for example, the variables in a negative feedback control loop. Each variable cluster has one or more measured variables and is treated as a single measured variable. We suggest formulating a variable cluster from measured variables that show similar dynamic behavior when a disturbance propagates. A variable cluster has two advantages. First, it enhances confidence in the symptoms: even if one or more measured variables behave differently from the prediction, their predicted behavior can be inferred from the other measured variables in the cluster that do show the predicted behavior. Second, it yields a much smaller knowledge base than treating each variable separately, which facilitates the development of a fault diagnosis system for a large plant. For a large process, the target process may be decomposed, and the name of a variable cluster may include the name of the subprocess to which it belongs.9,13 However, if the time delay between measured variables contained in one cluster is greater than the diagnosis interval, detection may be delayed. Consequently, variable clusters should be composed carefully in accordance with the process topology and sensor locations, and it is helpful to consider their composition while formulating the conventional SDG.
In the example, four variable clusters with two or more measured variables and three variable clusters with one measured variable are composed (in Figure 3, lines separate the variable clusters; these lines are not part of a conventional SDG).

Construction of a Fault-Effect Tree. The steps for obtaining a FET are nearly identical to the search for the fault propagation path. The fault propagation path in the FET consists of “not normal” qualitative states (or symptoms) of constraint variables and of variable clusters, while the fault propagation path in the conventional SDG consists of symptoms of individual variables. The qualitative state of a variable cluster comprises those of all measured variables in the cluster, and we denote them as subpatterns of the variable cluster. Because the FET considers only measured variables while the SDG considers unmeasured variables as well, the diagnosis resolution may be worsened.3 However, this loss of resolution can be offset by the definition of physically meaningful faults and the use of constraint variables. To construct a FET, a set of possible subpatterns is composed for each variable cluster, and every variable cluster reachable from a fault, together with its subpattern, is searched for.

(1) Composition of the subpattern: To obtain the possible subpatterns for each variable cluster, the symptoms of all nodes connected by arcs to the current variable cluster are input to the variable cluster. Then, a forward search finds the qualitative states of the nodes in this variable cluster; the qualitative state of a target node is determined as the product of the qualitative state of its source node and the sign of the arc. Finally, the subpattern is stored as pairs of measured node and qualitative state. In the case of the variable cluster VC2, the qualitative states (+) and (-) of PT and the faults VR-BH, VR-FH, VR-BL, VR-FL, FS-BH, FS-BL, FC-SVCH, FC-SVCL, and FP-BLK can be input to VC2. When PT is positive, a forward search determines (FRS, +), (CR, -), and (VR, -), and the determined subpattern, VC2-1, is stored as (FRS, +) and (CR, -). The subpatterns determined for the variable clusters of the CSTR are presented in Table 2.

Table 2. Subpatterns for CSTR

variable cluster   subpattern   qualitative state of measured variables
VC1                VC1-1        (LS, +), (CL, +), (FPS, +)
                   VC1-2        (LS, +), (CL, -), (FPS, -)
                   VC1-3        (LS, +), (CL, +), (FPS, -)
                   VC1-4        (LS, -), (CL, -), (FPS, -)
                   VC1-5        (LS, -), (CL, +), (FPS, +)
                   VC1-6        (LS, -), (CL, -), (FPS, +)
VC2                VC2-1        (FRS, +), (CR, -)
                   VC2-2        (FRS, +), (CR, +)
                   VC2-3        (FRS, -), (CR, +)
                   VC2-4        (FRS, -), (CR, -)
VC3                VC3-1        (TS, +), (CT, +), (FWS, +), (TRS, -)
                   VC3-2        (TS, +), (CT, -), (FWS, -), (TRS, +)
                   VC3-3        (TS, +), (CT, +), (FWS, +), (TRS, +)
                   VC3-4        (TS, -), (CT, -), (FWS, +), (TRS, -)
                   VC3-5        (TS, -), (CT, -), (FWS, -), (TRS, +)
                   VC3-6        (TS, -), (CT, +), (FWS, +), (TRS, -)
                   VC3-7        (TS, -), (CT, -), (FWS, -), (TRS, -)
                   VC3-8        (TS, +), (CT, +), (FWS, -), (TRS, +)
VC4                VC4-1        (CAS, +), (CBS, -)
                   VC4-2        (CAS, +), (CBS, +)
                   VC4-3        (CAS, -), (CBS, +)
                   VC4-4        (CAS, -), (CBS, -)

(2) A search of the fault propagation path: After the subpatterns are determined, the fault propagation paths for all faults are searched. The procedure of the path search is as follows: (1) Select the fault and determine the qualitative state of the next node from the sign on the arc. (2) Determine the qualitative states of all nodes in the variable cluster that includes the current node by a forward search from the current node.
(3) Compare the determined qualitative states of the measured nodes with each subpattern of the variable cluster and replace the current node with the matching subpattern. (4) Search every node connected to the arcs leaving the subpattern and add these nodes and their qualitative states to the end of the path. (5) Repeat steps 2, 3, and 4 with all nodes connected at the end of the path. The stop conditions of the search are (i) there is no reachable node in the path direction and (ii) the path already contains a subpattern of the variable cluster that includes the current node. The obtained fault propagation paths can be represented as a tree structure. For instance, the expansion of the fault propagation path for LS-BH of the CSTR is shown in Figure 5. By step 1, Figure 5a is obtained. Since (LS, +) is a node contained in VC1, a forward search from (LS, +) determines the qualitative states of the measured nodes in VC1, namely (LS, +), (CL, +), and (FPS, +) (Figure 5b). By step 3, VC1-1 is identified in the set of subpatterns for VC1 and replaces (LS, +) (Figure 5c). In step 4, the nodes (FR, -), (T, -), (CA, +), and (CR, -), which are connected to the arcs from VC1-1, are determined and added to the path (Figure 5d). Expanding the tree again on each node gives Figure 5e.

Figure 5. The example of FET construction.

(3) The search of the dominant fault propagation path: Two or more paths from one fault to the subpatterns of one variable cluster may exist in the obtained fault propagation path. For instance, as shown in Figure 5e, the path corresponding to LS-BH can contain subpatterns such as VC3-1, VC3-5, and VC3-7, which have the same 〈SP〉 but different paths. Here, 〈SP〉 denotes the variable cluster that includes SP. As in this case, different subpatterns of one variable cluster may exist because there can be different paths from a fault to a variable cluster. To increase the diagnosis resolution in such cases, the dominant fault propagation path can be found with process knowledge.8 If only the dominant path is left in Figure 5e, Figure 5f is obtained. The final tree is called a fault-effect tree.

As a matter of convenience, the FET is expressed as a set of ordered triples (F, bs, S), in which F stands for the fault, bs for a set of basic symptoms, and S for a set of the subpatterns that can be reached from F.
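The sign rule used in the subpattern composition and path search (state of the target node = state of the source node × sign of the arc) can be sketched in Python as follows. The arc list is hypothetical: it is chosen only so that the result reproduces the VC2 example quoted in the text, since the actual arcs of Figure 3 are not reproduced here. The function names are likewise illustrative.

```python
from collections import deque

# Hypothetical signed arcs: source -> [(target, arc sign)].
ARCS = {
    "PT": [("FRS", +1)],
    "FRS": [("CR", -1)],
    "CR": [("VR", +1)],
    "VR": [("FRS", +1)],  # closes the control loop; FRS is not revisited
}

# Subpatterns of VC2 as listed in Table 2 (measured variables only).
VC2_SUBPATTERNS = {
    "VC2-1": {"FRS": +1, "CR": -1},
    "VC2-2": {"FRS": +1, "CR": +1},
    "VC2-3": {"FRS": -1, "CR": +1},
    "VC2-4": {"FRS": -1, "CR": -1},
}


def propagate(arcs, start, sign, cluster):
    """Forward search: assign each node in `cluster` the product of its
    source state and the arc sign, visiting every node at most once."""
    states = {start: sign}
    queue = deque([start])
    while queue:
        source = queue.popleft()
        for target, arc_sign in arcs.get(source, []):
            if target in cluster and target not in states:
                states[target] = states[source] * arc_sign
                queue.append(target)
    return states


def match_subpattern(states, subpatterns):
    """Return the name of the subpattern whose measured signs all match."""
    for name, pattern in subpatterns.items():
        if all(states.get(var) == s for var, s in pattern.items()):
            return name
    return None


states = propagate(ARCS, "PT", +1, cluster={"FRS", "CR", "VR"})
print(states, match_subpattern(states, VC2_SUBPATTERNS))
```

Under these assumed arcs, a positive PT yields (FRS, +), (CR, -), and (VR, -), and the measured pair matches VC2-1.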
The basic symptom is related to the quantity changed immediately by the corresponding fault and is used to invoke the fault. Therefore, the basic symptom can be a subpattern or a constraint variable directly connected to the fault in the FET. S is declared as a set of triples (SP, P, c), in which SP represents the subpattern, P represents the sequence of subpatterns on the path from F to SP, and c represents a set of conditions on the path. SP is represented as (〈SP〉, M), in which M is a set of (measured variable in the subpattern, sign of the measured variable) pairs. In the case of the CSTR, the FET is constructed for 45 faults, excluding 28 independent sensor faults.

Diagnosis Strategy Based on FET. The process variables do not always follow the predicted qualitative pattern in an environment with many uncertainties such as noise. Therefore, this study quantifies the qualitative pattern and determines how similar the response of the process variables is to the predicted qualitative model. The quantitative evaluation starts from the evaluation of the subpattern. The measure of how similar the given qualitative pattern is to the subpattern, the subpattern-likelihood (SL), is calculated by the following equation:

$$\mathrm{SL}_{SP} = \frac{1}{N_{M_{SP}}} \sum_{i}^{N_{M_{SP}}} \frac{(e_i/\tau_i)^n}{1 + (e_i/\tau_i)^n} \tag{7}$$

where $N_{M_{SP}}$ is the number of measured variables in the subpattern, $e_i$ is the deviation of the measured variable i in the same direction as given in the subpattern, $\tau_i$ is the threshold that determines the range of steady state (1.5σ of the process data distribution in this study), and n is the constant that determines the steepness of the function (four in this study). The term in the summation is the sigmoid-type belief function14 used to obtain the likelihood of each measured variable. When the subpattern-likelihood is greater than a threshold, we say that the subpattern is detected and it becomes a symptom. In this study, 0.5 is used as the threshold. If the detector performs very well, a larger value is possible. In the case of very poor detection, values equal to or less than 0.5 should be given. If the detection is as accurate as SDG assumes, 1 may be used as the threshold.

The next step is the invocation of the primary fault candidates by evaluating the basic symptoms, and the verification of the subpatterns that can be explained by each primary fault candidate. Such a subpattern is called an explained-subpattern, and the set of the explained-subpatterns of a fault F is expressed as ESP(F).

$$\mathrm{ESP}(F) = \{(SP, \langle\langle SP,F \rangle\rangle) \mid SP \in S_F,\ \mathrm{SL}_{SP} > 0.5\}$$

where 〈〈SP,F〉〉 is the path-likelihood from F to SP. The path-likelihood is obtained from a simple quantitative evaluation of the fault propagation path:

$$\langle\langle SP,F \rangle\rangle = \frac{1}{N_{P_F^{SP}}} \sum_{i}^{N_{P_F^{SP}}} \mathrm{SL}_i \tag{8}$$

where $P_F^{SP}$ denotes the sequence of subpatterns existing on the path from F to SP. If an arc condition involved in the path is not satisfied, the path-likelihood of the corresponding path is 0. If two or more subpatterns that have the same 〈SP〉 in the FET of F are detected, only the one with the maximum path-likelihood can be an element of ESP(F).

Finally, the fault-likelihood (FL) of a fault candidate is calculated using the following equation:

$$\mathrm{FL} = \frac{1}{N_{\mathrm{ESP}(F)}} \sum_{i}^{N_{\mathrm{ESP}(F)}} \langle\langle SP_i,F \rangle\rangle \tag{9}$$

To show the extent to which every symptom is explained, the number of variable clusters that have detected subpatterns may be used as the denominator of eq 9. However, false symptoms caused by severe noise and similar effects would then make the fault-likelihood small. For this reason, the number of explained-subpatterns is used, which increases the operator's confidence in the fault candidate set. Once the fault-likelihood is calculated, every invoked fault is ranked by its fault-likelihood and its number of explained-subpatterns. When the fault-likelihood of the top fault candidate is greater than the minimum fault-likelihood for a single-fault decision and the number of explained-subpatterns is equal to the number of variable clusters that have detected subpatterns, the set of the invoked faults becomes the set of final fault candidates; otherwise, the diagnosis for a multiple fault begins.

For example, suppose that the deviations of the process variables are given as follows. The sign indicates the deviation direction of the process variable.

(CR, +, 0.000182), (FRS, -, 0.001156), (CL, +, 0.003687), (LS, +, 0.002315)
(CT, +, 0.028218), (TS, -, 0.036621), (CAS, +, 0.741801), (CBS, -, 0.741801)
(FOS, +, 0.25), (FPS, +, 0.0036694), (FWS, +, 0.028218), (TRS, -, 0.143793)

With the given values, the subpattern-likelihood of VC1-1 is calculated by eq 7 as follows:

$$\mathrm{SL}_{\mathrm{VC1\text{-}1}} = \left( \frac{(e_{LS(+)}/\tau_{LS})^4}{1 + (e_{LS(+)}/\tau_{LS})^4} + \frac{(e_{CL(+)}/\tau_{CL})^4}{1 + (e_{CL(+)}/\tau_{CL})^4} + \frac{(e_{FPS(+)}/\tau_{FPS})^4}{1 + (e_{FPS(+)}/\tau_{FPS})^4} \right) / 3$$

$$= \left( \frac{(0.002315/0.0025)^4}{1 + (0.002315/0.0025)^4} + \frac{(0.003687/0.0025)^4}{1 + (0.003687/0.0025)^4} + \frac{(0.0036694/0.0025)^4}{1 + (0.0036694/0.0025)^4} \right) / 3 = 0.690651$$

The subpatterns with a subpattern-likelihood over 0.5 are found to be as follows:

(FO(+), 1) (VC1-1, 0.690651) (VC1-5, 0.549411) (VC3-1, 0.641147) (VC3-3, 0.526687) (VC4-1, 1)
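Eqs 7-9 can be sketched in a few lines of Python. The function names are illustrative, and two points are assumptions of this sketch: a deviation in the direction opposite to the subpattern is counted as $e_i = 0$, and the two-element path and the two-subpattern fault at the end are hypothetical inputs used only to exercise eqs 8 and 9. The deviations and thresholds are those appearing in the worked calculation of SL for VC1-1.

```python
def belief(e, tau, n=4):
    """Sigmoid-type belief term of eq 7 for one measured variable."""
    r = (e / tau) ** n
    return r / (1.0 + r)


def subpattern_likelihood(deviations, pattern, tau, n=4):
    """Eq 7: mean belief over the measured variables of a subpattern.

    deviations maps a variable to (sign, magnitude); pattern maps it to the
    sign the subpattern expects. A deviation in the opposite direction is
    counted as e = 0 (the reading assumed in this sketch).
    """
    total = 0.0
    for var, expected_sign in pattern.items():
        sign, magnitude = deviations[var]
        e = magnitude if sign == expected_sign else 0.0
        total += belief(e, tau[var], n)
    return total / len(pattern)


def path_likelihood(sl_values):
    """Eq 8: mean SL of the subpatterns on the path from F to SP."""
    return sum(sl_values) / len(sl_values)


def fault_likelihood(path_likelihoods):
    """Eq 9: mean path-likelihood over the explained-subpatterns of F."""
    return sum(path_likelihoods) / len(path_likelihoods)


# Deviations and thresholds as used in the worked calculation of SL(VC1-1).
deviations = {"LS": ("+", 0.002315), "CL": ("+", 0.003687), "FPS": ("+", 0.0036694)}
tau = {"LS": 0.0025, "CL": 0.0025, "FPS": 0.0025}

sl_vc1_1 = subpattern_likelihood(deviations, {"LS": "+", "CL": "+", "FPS": "+"}, tau)
sl_vc1_5 = subpattern_likelihood(deviations, {"LS": "-", "CL": "+", "FPS": "+"}, tau)
print(round(sl_vc1_1, 4), round(sl_vc1_5, 4))

# Hypothetical two-element path: FO(+) with SL = 1, followed by VC1-1.
pl = path_likelihood([1.0, sl_vc1_1])
# Hypothetical fault that explains only FO(+) and VC1-1.
fl = fault_likelihood([1.0, pl])
```

Under these assumptions the sketch reproduces the listed values for VC1-1 and VC1-5 to about four decimal places, which supports reading an opposite-direction deviation as a zero contribution.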

The faults that have one of these subpatterns as a basic symptom are FEED-FCH, LS-BH, LC-SVCL, TS-BH, and FC-SVCH, and they become the primary fault candidates. The next steps are explained for the case of FEED-FCH (the FET of FEED-FCH is given in Figure 7). Among the subpatterns, FO(+), VC1-1, VC3-1, and VC4-1 can be explained by FEED-FCH, and the path-likelihood from FEED-FCH to VC3-1 is calculated as

$$\langle\langle \mathrm{VC3\text{-}1}, \mathrm{FEED\text{-}FCH} \rangle\rangle = \frac{\mathrm{SL}_{FO(+)} + \mathrm{SL}_{\mathrm{VC3\text{-}1}}}{2} = \frac{1 + 0.641147}{2} = 0.820573$$

Figure 7. The FET of FEED-FCH and RX-LK.

The set of the explained-subpatterns is as follows, and the fault-likelihood of FEED-FCH is calculated by eq 9.

ESP(FEED-FCH) = {(FO(+),1), (VC1-1,0.845326), (VC3-1,0.820573), (VC4-1,1)}

$$\mathrm{FL}_{\mathrm{FEED\text{-}FCH}} = \frac{1 + 0.845326 + 0.820573 + 1}{4} = 0.916475$$

Although the number of explained-subpatterns of FEED-FCH, four, is the same as the number of variable clusters that have detected subpatterns, the fault-likelihood of FEED-FCH is less than 1.0. Therefore, the diagnosis for a multiple fault is needed.

Strategy for Multiple-Fault Diagnosis. The diagnosis based on FET considers only the subpatterns that depend on the fault when calculating the fault-likelihood. This lets the FET handle not only independent multiple faults but also dependent multiple faults. The search for a multiple fault is defined as the search for the fault set that explains the current pattern better; that is, the multiple fault must have a greater fault-likelihood than the single fault. The procedure for the evaluation of a multiple fault is as follows.

(1) An nth-order multiple fault MFn and a single fault F are selected. MFn is a single fault when a double fault is being evaluated and a double fault when a triple fault is being evaluated. The path-likelihood of each element of ESP(F) is compared to that of ESP(MFn). There are two possible cases. The first case is when the path-likelihood of the element of ESP(F) is larger than that of ESP(MFn). The second case is when the element of ESP(F) is not an element of ESP(MFn). In both cases, a new candidate is generated.

$$\mathrm{ESP}(MF_{n+1}) = \{(SP, \langle\langle SP,MF_{n+1} \rangle\rangle) \mid (SP \in \mathrm{ESP}(F),\ SP' \in \mathrm{ESP}(MF_n),\ \langle SP \rangle = \langle SP' \rangle \text{ and } \langle\langle SP,F \rangle\rangle > \langle\langle SP,MF_n \rangle\rangle) \text{ or } (SP \in \mathrm{ESP}(F) \text{ and } \langle SP \rangle \neq \langle SP' \rangle\ \forall\, SP' \in \mathrm{ESP}(MF_n))\} \tag{10}$$

The fault-likelihood of MFn+1 is calculated by eq 9, and the single-fault and multiple-fault candidates are reranked. For example, when FEED-FCH and RX-LK are given as primary fault candidates, their sets of explained-subpatterns are as shown below.

ESP(FEED-FCH) = {(FO(+),1), (VC3-1,0.95), (VC4-1,1)}
ESP(RX-LK) = {(VC1-4,1), (VC3-5,0.8), (VC4-1,1)}

Then, FEED-FCH is MF1 and RX-LK is F in the above procedure. Consider VC1-4, the first element of ESP(RX-LK). As ESP(FEED-FCH) does not have any subpattern whose 〈SP〉 is VC1, a new double-fault candidate, MF-N1, is generated and (VC1-4,1) becomes an element of ESP(MF-N1). In the case of (VC3-5,0.8), ESP(FEED-FCH) has the element VC3-1, whose 〈SP〉 is VC3, and as 〈〈VC3-1〉〉 is larger than 〈〈VC3-5〉〉, (VC3-1,0.95) becomes the second element of ESP(MF-N1). (VC4-1,1) is treated with the same procedure. Therefore, the following result is obtained:

ESP(MF-N1) = {(FO(+),1), (VC1-4,1), (VC3-1,0.95), (VC4-1,1)}

To search for a possible multiple fault with an SDG-based method, all possible combinations of single faults must be investigated. That is, if the number of single faults is n, $2^n - (n + 1)$ cases must be investigated. For example, if n is 10, the number of cases is 1013. Therefore, the larger the number of primary fault candidates, the greater the computation time needed. The diagnosis based on FET, however, can reduce the computation time by the following method. In considering a multiple fault of third or higher order, the single faults that are not included in the multiple faults of one less order do not need to be considered. For example, suppose that the set of primary fault candidates is {F1, F2, ..., F6} and the set of double faults is found as follows:

{(F1,F2), (F1,F3), (F1,F4), (F2,F4), (F3,F5), (F5,F6)}

Consider the evaluation of a triple fault that includes (F1,F2). F5 and F6 have no effect on the fault-likelihood of such a triple fault, so only the two triple faults (F1,F2,F3) and (F1,F2,F4) need to be considered. By using this method, the calculation time is greatly reduced. Also, a separate diagnosis model for multiple faults is not needed because the suggested method can diagnose a multiple fault with the diagnostic model constructed for single faults.

Detection

Qualitative diagnosis methods such as SDG require an interpreter that analyzes process data and detects symptoms. The simplest and most widely used method to detect symptoms is the Shewhart chart (or limit checking method). It consists of an upper limit and a lower limit at a constant distance from the central value (or mean value) and generates a symptom if a variable goes beyond a limit. It can detect a large change more quickly than other methods, but it has the drawback of late detection for a small change. Because of this drawback, Lucas used CUSUM.15 CUSUM has a recurrent computational form suitable for real-time analysis and does not need filtering. It signals a change when the accumulated sum of the deviations that exceed a constant minimal jump size from the mean value becomes larger than a threshold. SDG-based methods have failed to reach a solution in the case of compensatory and inverse responses, which are transient responses of process variables. Although the FET-based method can diagnose a transient response, the diagnosis performance, such as precision, may be reduced. To handle a transient response, this work suggests m-CUSUM (modified CUSUM). The variable shown in Figure 6 presents an inverse response. If the parameter tuning of the controller is not too bad, the integrated area above the threshold is larger than that below the threshold. In this case, the initial symptom can be held. The modified algorithm is given by eqs 11 and 12.

Figure 6. The example of inverse response.
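The holding logic of m-CUSUM can be illustrated with a short Python sketch. It is one possible reading of eqs 11 and 12 and not the authors' implementation: the class name is hypothetical, both integrated areas are assumed to be accumulated as positive quantities, the cumulative sums are never reset, and the parameter values and test signals are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class MCusum:
    mu0: float              # in-control mean of the variable
    nu: float               # minimal jump size (nu_m)
    lam: float              # detection threshold (lambda)
    state: str = "normal"   # "normal", "high" or "low"
    T: float = 0.0          # low-side cumulative sum T_n
    M: float = 0.0          # running maximum of T
    U: float = 0.0          # high-side cumulative sum U_n
    m: float = 0.0          # running minimum of U
    area_low: float = 0.0   # I_n, area accumulated below the band (assumed positive)
    area_high: float = 0.0  # i_n, area accumulated above the band (assumed positive)

    def update(self, x):
        low_step = x - self.mu0 + self.nu / 2.0   # L_n
        high_step = x - self.mu0 - self.nu / 2.0  # H_n
        self.T += low_step
        self.M = max(self.M, self.T)
        self.U += high_step
        self.m = min(self.m, self.U)
        if low_step < 0:
            self.area_low -= low_step
        if high_step > 0:
            self.area_high += high_step
        low_fires = (self.M - self.T) > self.lam
        high_fires = (self.U - self.m) > self.lam
        if self.state == "normal":
            if high_fires:
                self.state = "high"
            elif low_fires:
                self.state = "low"
        elif self.state == "high":
            # Switch only if the opposite-side area has become the larger one.
            if low_fires and self.area_low > self.area_high:
                self.state, self.area_high = "low", 0.0
        elif self.state == "low":
            if high_fires and self.area_high > self.area_low:
                self.state, self.area_low = "high", 0.0
        return self.state


def final_state(samples, **params):
    """Run a fresh detector over the samples and return its last state."""
    detector = MCusum(**params)
    for x in samples:
        detector.update(x)
    return detector.state


params = dict(mu0=0.0, nu=1.0, lam=2.0)
# Long high excursion followed by a brief undershoot: the high alarm is held.
held = final_state([3.0] * 5 + [-2.0] * 2 + [0.0] * 3, **params)
# Brief high excursion followed by a sustained drop: the alarm switches to low.
switched = final_state([3.0] * 2 + [-3.0] * 6, **params)
print(held, switched)
```

The first signal mimics a compensatory overshoot whose opposite-side area stays small, so the initial symptom is kept; in the second signal the opposite-side area eventually dominates and the symptom is replaced.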

for low alarm:

$$T_0 = 0,\quad I_0 = 0$$
$$L_n = x_n - \mu_0 + \frac{\nu_m}{2},\quad T_n = T_{n-1} + L_n,\quad M_n = \max_k T_k$$
$$I_n = \begin{cases} I_{n-1} - L_n & \text{when } L_n < 0 \\ I_{n-1} & \text{when } L_n \ge 0 \end{cases}$$
low alarm when (low alarm) or (($M_n - T_n > \lambda$) and normal)
low alarm and set $i_n = 0$ when (($M_n - T_n > \lambda$) and high alarm and $I_n > i_n$)   (11)

for high alarm:

$$U_0 = 0,\quad i_0 = 0$$
$$H_n = x_n - \mu_0 - \frac{\nu_m}{2},\quad U_n = U_{n-1} + H_n,\quad m_n = \min_k U_k$$
$$i_n = \begin{cases} i_{n-1} + H_n & \text{when } H_n > 0 \\ i_{n-1} & \text{when } H_n \le 0 \end{cases}$$
high alarm when (high alarm) or (($U_n - m_n > \lambda$) and normal)
high alarm and set $I_n = 0$ when (($U_n - m_n > \lambda$) and low alarm and $i_n > I_n$)   (12)

When the measured variable is detected by eqs 11 and 12, the value of the belief function in eq 7 becomes 1. Otherwise, its value is always less than 1. CUSUM uses two parameters: the minimal jump size and the threshold size. It is known that the resolution decreases if the threshold is smaller than necessary, and that the diagnosis fails after detection if the threshold is too large. In this study, 6σ of the process data distribution was used as the minimal jump size and 3σ of the CUSUM distribution as the threshold.16 For the robust and continuous operation of a fault diagnosis system, the mean value (or central value) µ0 and the standard deviation σ should be calculated accurately from process data.13

Examples

Process and Diagnosis Description. The example process consists of a heat exchanger and a continuous stirred tank reactor in which an irreversible first-order reaction A → B takes place (Figure 1). This process is simple but displays the most common characteristics of industrial processes, and the SDG of the process involves the most familiar model of SDG (Figure 3). This process was originally used by Kramer and is simulated with the model of Sorsa.8,17 Table 3 gives the control parameters used in this study; the other parameters are found in the study of Sorsa. The diagnosis model is implemented on the expert system shell G2. The selected situations are 15 single faults and 37 double faults, which are generated randomly, and a step function is used to simulate each fault. The parameters for diagnosis are shown below.

diagnosis interval: 5 s
total diagnosis time: 2000 s
time of fault occurrence: 100 s
limit of multiple fault order: 2
minimum sub-pattern-likelihood for detection: 0.5
minimum fault-likelihood for primary fault candidate: 0.5
minimum fault-likelihood for single fault decision: 1.0

Table 3. Control Parameters of CSTR

controller             gain    time constant
reactor level          1.5     250
recycle flow rate      0.98    500
reactor temperature    0.75    500

To measure the diagnostic performance, a performance parameter (Φ) is used.

$$\Phi = (\text{accuracy}) \times (\text{resolution}) \times (\text{precision}) \tag{13}$$

where accuracy is 1 if the diagnosis is accurate, that is, if a true fault is included in the final set of fault candidates. Otherwise, accuracy is 0. Resolution is calculated as follows:12

$$\text{resolution} = \frac{F_{\text{total}} - F_{\text{diagnosis}}}{F_{\text{total}} - 1} \tag{14}$$

where $F_{\text{total}}$ denotes the total number of possible candidates and $F_{\text{diagnosis}}$ denotes the number of candidates in the current diagnosis set. If $F_{\text{diagnosis}}$ is 1 and the candidate is a true fault, both resolution and accuracy are 1. Precision is the ratio of the fault-likelihood of the true fault to that of the candidates in the top tier. A tier represents the rank of the true fault. The candidate with the highest fault-likelihood is in the top tier, and the candidate with the second highest fault-likelihood is in the second tier. If the true fault is in the top tier, the precision is 1. If the fault-likelihood of the true fault is 0.9 and the fault-likelihood of the candidates in the top tier is 1, the precision is 0.9.

Figure 8. The dynamics of measured values in a case study.

Example: Feed Flow Rate High and Reactor Leaking. In this example, the supply rate is increased (+0.25 kg/s) at 100 s and, at the same time, a leak (0.475 kg/s) occurs in the reactor (the FETs of the two faults are given in Figure 7). As shown in Figure 8, the influence of the leak is stronger than that of the increase in the feed rate (FOS, detected at 105 s), and therefore the output of the level controller (CL, at 185 s) and the exit flow rate (FPS, at 190 s) decrease. Also, the reactor temperature (TS, at 355 s) decreases and then increases because of the combined effect of the two faults and the control loop. As the reactor temperature decreases, so do the output of the temperature controller (CT, at 350 s), the cooling water rate (FWS, at 365 s), and the recycle temperature (TRS, at 375 s). The two faults result in an increase in the concentration of component A (CAS, at 125 s) and a decrease in the concentration of component B (CBS, at 125 s). The upper and lower bounds in Figure 8 correspond to 3 times the standard deviation. Table 4 shows the detection sequence of the measured variables, and Table 5 shows the subpatterns detected (among these subpatterns, FO(+), VC3-1, and VC4-1 are related to and explained by FEED-FCH, and VC1-4, VC3-5, and VC4-1 are related to and explained by RX-LK). As it is, the traditional SDG-based methods suggest only the single fault of FEED-FCH from 105 to 180 s, and then fail in diagnosis from 185 s when the

Table 4. Detection Sequence of Measured Variables in the Case Study

time (s)   detected measured variables
105        (FOS, +)
125        (CAS, +), (CBS, -)
185        (CL, -)
190        (FPS, -)
240        (LS, -)
350        (CT, +)
355        (TS, +)
365        (FWS, +)
375        (TRS, -)

Table 5. Detection Sequence of Subpatterns in the Case Study

time (s)   detected subpatterns(a)
105        FO(+),VC4-1
120        FO(+),VC3-5,VC4-1
125        FO(+),VC1-4,VC3-5,VC4-1
130        FO(+),VC1-2,VC1-4,VC3-5,VC3-7,VC4-1
145        FO(+),VC1-2,VC1-4,VC3-2,VC3-5,VC3-7,VC4-1
160        FO(+),VC1-2,VC1-4,VC1-6,VC3-2,VC3-5,VC3-7,VC4-1
175        FO(+),VC1-2,VC1-4,VC1-6,VC3-5,VC4-1
195        FO(+),VC1-2,VC1-4,VC1-6,VC4-1
305        FO(+),VC1-2,VC1-4,VC1-6,VC3-1,VC4-1
310        FO(+),VC1-2,VC1-4,VC1-6,VC3-1,VC3-3,VC4-1
320        FO(+),VC1-2,VC1-4,VC1-6,VC3-1,VC3-3,VC3-6,VC4-1

(a) The underlined subpatterns are newly detected ones at that time.

decrease in the output of the level controller is detected. If the diagnosis of independent multiple faults is enabled, both faults are suggested from 355 s. However, the true solution cannot be obtained during (1) the interval in which the output of the level controller (at 185 s) and the exit flow rate (at 190 s) are detected earlier than the level (at 240 s) and (2) the interval in which the output of the temperature controller (at 350 s) is detected earlier than the reactor temperature (at 355 s). By contrast with conventional SDG-based diagnosis, the FET-based diagnosis gives the true solution. The single fault FEED-FCH is suggested from 105 to 120 s, and the multiple fault of FEED-FCH and RX-LK is suggested from 125 s, as it has the largest fault-likelihood and the largest number of explained subpatterns. Figure 9 gives the fault-likelihood and the number of explained subpatterns for each single fault and multiple fault. The dotted line in (a), (c), and (e) of Figure 9 is the number of variable clusters that have detected subpatterns. In this case, the temperature and level of the reactor showed a compensatory response, but the initial responses of the two variables were retained by m-CUSUM. If the Shewhart chart is used as the detection method instead of m-CUSUM, the result is that shown in (g) and (h) of Figure 9. This result shows that detection by m-CUSUM gives better diagnosis performance than detection by the Shewhart chart.

Figure 9. The FL and ESP no. of the fault candidates in the case study.

Result and Discussion

Result of Single-Fault Cases. The result for the selected single faults is shown in Table 6. In Table 6, the average is taken over all results obtained every 5 s during the simulation period of 2000 s, and the worst is the worst result among them. All cases are accurate during all diagnosis periods, and the diagnosis result is satisfactory. In most cases, the performance parameter is greater than 0.95 and the true solution is placed in the top tier. The only one of the 15 cases with a tier greater than 1 is VL-BH. In this case, the true solution was at tier 2, with multiple faults including the true solution at the top tier, only for 45 s while the symptom was developing. However, unlike with conventional SDG, the true solution was never missed, and it had the largest fault-likelihood among the primary fault candidates. The previous examples used 1.0 as the minimum fault-likelihood for a single-fault decision. However, if

Table 6. Diagnosis Result of the Selected Single Faults

fault      description                                   case      tier   precision   resolution   performance
LC-SVCH    level controller setpoint change high         average   1      1           1            1
                                                         worst     1      1           1            1
FEED-FCH   feed flow rate change high                    average   1      1           1            1
                                                         worst     1      1           1            1
FEED-CCL   feed composition change low                   average   1      1           1            1
                                                         worst     1      1           1            1
FEED-TCL   feed temperature change low                   average   1      1           1            1
                                                         worst     1      1           1            1
RX-LK      reactor leaking                               average   1      1           1            1
                                                         worst     1      1           1            1
WP-BK      cooling water pipe blockage                   average   1      1           3            0.972
                                                         worst     1      1           3            0.972
FC-SVCH    flow rate controller setpoint change high     average   1      1           1            1
                                                         worst     1      1           1            1
VR-BL      recycle flow rate C/V bias low                average   1      1           4            0.958
                                                         worst     1      1           4            0.958
VL-BH      level C/V bias high                           average   1.02   0.999       2.04         0.985
                                                         worst     2      0.978       4            0.938
TC-SVCH    temperature controller setpoint change high   average   1      1           1            1
                                                         worst     1      1           1            1
VT-BH      temperature C/V bias high                     average   1      1           2            0.986
                                                         worst     1      1           2            0.986
CW-TCH     cooling water temperature high                average   1      1           2            0.986
                                                         worst     1      1           2            0.986
LS-BL      level sensor bias low                         average   1      1           1            1
                                                         worst     1      1           1            1
TS-BH      temperature sensor bias high                  average   1      1           1            1
                                                         worst     1      1           1            1
FS-BL      recycle flow rate sensor bias low             average   1      1           4            0.958
                                                         worst     1      1           4            0.958

Table 7. Selected Multiple-Fault Cases

no.   first fault   second fault   second fault time (s)
1     FEED-FCH      FEED-CCL       100
2     FEED-FCL      CW-TCL         200
3     FC-SVCH       TC-SVCH        100
4     LC-SVCL       TC-SVCH        200
5     RX-LK         PP-BK          100
6     RX-LK         WP-BK          200
7     LS-BL         FS-BL          100
8     LS-BL         VL-BH          100
9     LS-BL         TC-BH          200
10    VT-BH         FS-BL          200
11    VR-BH         TS-BL          200
12    VL-BL         TS-BL          200
13    FC-SVCH       VR-BL          200
14    LC-SVCH       VR-BL          100
15    VR-BH         TC-SVCL        100
16    FC-SVCH       VT-BH          100
17    TC-SVCH       VR-BL          200
18    LC-SVCH       VT-BH          100
19    FS-BL         RX-LK          200
20    FV-BL         WP-BK          100
21    LS-BL         RX-LK          100
22    FEED-TCL      FC-SVCH        100
23    LC-SVCH       FEED-CCL       100
24    LC-SVCL       RX-LK          200
25    TC-SVCH       FP-BK          200
26    CW-TCH        LS-BL          100
27    CW-TCH        TS-BH          100
28    FEED-FCH      LC-BL          100
29    FEED-CCH      TC-BH          100
30    FEED-TCL      VL-BL          200
31    FS-BL         FEED-TCL       200
32    FEED-CCL      VR-BH          100
33    FEED-FCL      FS-BH          100
34    FEED-FCH      RX-LK          100
35    WP-BLK        FEED-CCL       200
36    CW-TCH        RX-LK          200
37    FP-BK         FEED-TCL       100

a number less than 1.0 is used, the possibility of suggesting an incorrect multiple fault as the solution is reduced. For example, if 0.9 is used, the case of VL-BH will suggest the true solution in the top tier. This number should be selected carefully, however, because only a single fault may then be presented as the solution even though a multiple fault is the true solution.

Result of Double-Fault Cases. In the 37 randomly selected cases (Table 7), the diagnosis failed for 9 cases (5, 8, 9, 13, 21, 24, 27, 29, 33) because these cases are masked multiple faults, in which one fault can explain the symptoms of the other fault. In the other 28 cases, the diagnosis results are accurate and satisfactory during all diagnosis periods. Among these 28 cases, in which the multiple fault is not a masked multiple fault, 19 cases are independent multiple faults and 9 cases are dependent multiple faults. Except for 9 cases (4, 6, 7, 12, 14, 18, 19, 23, 36), all of them showed tier 1 both on average and in the worst diagnosis (Table 8). Of these 9 cases, the 6 cases other than 6, 14, and 23 showed tier 2 only in short periods (10-15 s) while the symptom was developing and tier 1 for most of the time. Only the 3 cases 6, 14, and 23 showed a tier of 3 or more in the worst diagnosis, and their average tier was still less than 2.

Conclusion

This study is based on SDG, but tries to quantitatively evaluate the qualitative fault propagation path to overcome the weakness of SDG. This quantification


Table 8. Diagnosis Result of the Selected Double Faults

no.   case      tier   precision   resolution   accuracy   performance
4     average   1.0    1.0         1.0          1          1.0
      worst     2      0.99        2            1          0.98
6     average   1.03   0.99        5.38         1          0.96
      worst     3      0.94        6            1          0.89
7     average   1.01   1.0         4.87         1          0.99
      worst     2      0.96        10           1          0.84
12    average   1.0    1.0         4.95         1          0.99
      worst     2      0.99        8            1          0.90
14    average   1.04   1.0         5            1          0.99
      worst     3      0.91        5            1          0.86
18    average   1.0    1.0         1.61         1          1.0
      worst     2      1.0         2            1          1.0
19    average   1.03   0.99        6.75         1          0.92
      worst     2      0.96        7            1          0.9
23    average   1.55   0.96        6.58         1          0.95
      worst     6      0.63        30           1          0.51
36    average   1.03   0.99        3.87         1          0.97
      worst     2      0.88        4            1          0.86

enables the new methodology to overcome the basic assumptions of SDG (the single-fault assumption and the consistent-path assumption) and to diagnose competing multiple faults under uncertain conditions. The proposed methodology has the following advantages. (1) It reduces the possibility of diagnosis failure caused by symptom variations under uncertain conditions. (2) It can diagnose multiple faults using a diagnostic model constructed for single faults. (3) It reduces the knowledge base by grouping process variables, which facilitates the development of a fault diagnosis system for a large plant. The advantages of the presented methodology were confirmed through case studies of a CSTR with a heat exchanger, and the applicability to a real plant was verified through a fault diagnosis system for a utility boiler plant under uncertain conditions.13 However, the methodology still has the disadvantage that it cannot diagnose masked multiple faults. The presented method is based on a hybrid structure that uses a qualitative diagnosis model together with a quantitative model (governing equation method), but it uses the quantitative model only to suggest primary fault candidates. We plan to combine the two models (quantitative and qualitative diagnosis models) to diagnose masked multiple faults and improve diagnosis resolution.

Acknowledgment

This work was supported (in part) by the Korea Science and Engineering Foundation (KOSEF) through the Automation Research Center at POSTECH.

Notation

bs = set of the basic symptom
c = set of conditions on the path
cm = logical function of control mode
CV = constraint-variable
e = deviation of measured variable
ESP = set of explained-subpattern
F = fault
FL = fault-likelihood
M = set of (measured variable in the subpattern, sign) pairs
MFn = nth order multiple fault
N = number of elements in the set
P = sequence of subpatterns existing on the path
S = set of the subpatterns which can be reached by fault
SL = subpattern-likelihood
SP = subpattern
VC = variable cluster

Greek Letters

λ = threshold of CUSUM
µ0 = mean value
νm = minimum jump magnitude
τ = threshold which determines the range of steady state
Φ = performance parameter

Subscripts

F = fault
i = summation letter
S = sensor
SP = subpattern
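As an illustration only, the low-alarm and high-alarm recursions of eqs 11 and 12 can be sketched in Python using the symbols µ0, νm, and λ from the notation above. The class name `MCusum`, the state labels, and the numerical settings in the usage lines are hypothetical. The sketch assumes that both integrated areas (I for the low side, i for the high side) are accumulated as positive magnitudes and that the running extremes M and m start from the initial value 0. It is not the G2 implementation used in the study.

```python
from dataclasses import dataclass


@dataclass
class MCusum:
    """Illustrative m-CUSUM detector for one measured variable (a sketch)."""
    mu0: float           # mean (central) value of the variable
    nu_m: float          # minimum jump magnitude
    lam: float           # CUSUM threshold (lambda)
    T: float = 0.0       # low-side cumulative sum
    M: float = 0.0       # running maximum of T (assumed to start at 0)
    U: float = 0.0       # high-side cumulative sum
    m: float = 0.0       # running minimum of U (assumed to start at 0)
    I: float = 0.0       # integrated area on the low side (assumed positive)
    i: float = 0.0       # integrated area on the high side (assumed positive)
    state: str = "normal"  # "normal", "low", or "high"

    def update(self, x: float) -> str:
        L = x - self.mu0 + self.nu_m / 2.0
        H = x - self.mu0 - self.nu_m / 2.0
        self.T += L
        self.M = max(self.M, self.T)
        self.U += H
        self.m = min(self.m, self.U)
        if L < 0:
            self.I -= L  # accumulate the area below the low band
        if H > 0:
            self.i += H  # accumulate the area above the high band
        low_hit = (self.M - self.T) > self.lam
        high_hit = (self.U - self.m) > self.lam
        if self.state == "normal":
            if low_hit:
                self.state = "low"
            elif high_hit:
                self.state = "high"
        elif self.state == "high" and low_hit and self.I > self.i:
            # the opposite symptom wins only if its area is larger
            self.state = "low"
            self.i = 0.0
        elif self.state == "low" and high_hit and self.i > self.I:
            self.state = "high"
            self.I = 0.0
        return self.state


# Hypothetical compensatory response: a large high excursion, then a short dip.
detector = MCusum(mu0=0.0, nu_m=1.0, lam=2.0)
signal = [3.0] * 10 + [-2.0] * 3 + [0.0] * 5
states = [detector.update(x) for x in signal]
print(states[0], states[-1])
```

With these made-up numbers the high alarm raised at the first sample is retained through the dip, because the low-side area never exceeds the high-side area. A detector without the area comparison would switch to a low alarm once the low-side sum crossed the threshold.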

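The performance parameter Φ of eq 13, with resolution from eq 14 and precision defined as the ratio of fault-likelihoods, can be sketched in the same way. The function names are hypothetical. The total of 72 candidates in the usage lines is an assumption, not a number stated in the paper; it was chosen only because it is consistent with the rounded performance values of Table 6 if the resolution column there is read as the number of candidates in the diagnosis set.

```python
def resolution(f_total: int, f_diagnosis: int) -> float:
    """Eq 14: 1 when a single candidate remains, 0 when all candidates remain."""
    return (f_total - f_diagnosis) / (f_total - 1)


def precision(fl_true: float, fl_top: float) -> float:
    """Ratio of the true fault's fault-likelihood to that of the top tier."""
    return fl_true / fl_top


def performance(accurate: bool, f_total: int, f_diagnosis: int,
                fl_true: float, fl_top: float) -> float:
    """Eq 13: accuracy x resolution x precision."""
    accuracy = 1.0 if accurate else 0.0
    return accuracy * resolution(f_total, f_diagnosis) * precision(fl_true, fl_top)


# Hypothetical case: 3 candidates left out of an assumed 72, true fault in the top tier.
print(round(performance(True, 72, 3, 1.0, 1.0), 3))
# The precision example from the text: fault-likelihood 0.9 against a top tier at 1.
print(precision(0.9, 1.0))
```

Because the three factors are multiplied, a diagnosis that misses the true fault scores 0 regardless of how small the candidate set is.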
Literature Cited

(1) Becraft, W. R.; Guo, D. Z.; Lee, P. L.; Newel, R. B. Fault Diagnosis Strategies for Chemical Plants: A Review of Competing Technologies. Proc. PSE '91 1991, 2, 12.1-12.15.
(2) Iri, M.; Aoki, K.; O'Shima, E. An Algorithm for Diagnosis of System Failures in the Chemical Process. Comput. Chem. Eng. 1979, 3, 489-493.
(3) Mohindra, S.; Clark, P. A. A Distributed Fault Diagnosis Method based on Digraph Models: Steady-State Analysis. Comput. Chem. Eng. 1993, 17, 193-209.
(4) Kokawa, M.; Miyazaki, S.; Shingai, S. Fault Location Using Digraph and Inverse Direction Search with Application. Automatica 1983, 19, 729-735.
(5) Yu, C. C.; Lee, C. Fault Diagnosis Based on Qualitative/Quantitative Process Knowledge. AIChE J. 1991, 37, 617-628.
(6) Wang, X. Z.; Yang, S. A.; Veloso, E.; Lu, M. L.; McGreavy, C. Qualitative Process Modelling: A Fuzzy Signed Directed Graph Method. Comput. Chem. Eng. 1995, 19, S735-S740.
(7) Han, J. A Study on the Knowledge-Based Expert System for Cement Calcination Process Control and Diagnosis. M.S. Dissertation, Seoul National University, Seoul, Korea, 1986.
(8) Kramer, M. A.; Palowitch, B. L. A Rule-Based Approach to Fault Diagnosis Using the Signed Directed Graph. AIChE J. 1987, 33, 1067-1078.
(9) Nam, D. S.; Han, C. H.; Jeong, C. W.; Yoon, E. S. Automatic Construction of Extended Symptom-Fault Associations From the Signed Digraph. Comput. Chem. Eng. 1996, 20, S605-S610.
(10) Hashimoto, Y.; Kawahara, K.; Tanaka, Y.; Yoneya, A.; Togari, Y. Fault Diagnosis Utilizing A Three-Layer Directed Graph. Proc. 4th Intl. Symp. Process Systems Eng. 1991, 2, 10.1-10.7.
(11) Mo, K. J.; Jeong, C. W.; Lee, G.; Yoon, E. S. Development of Operation Aided System for Fault Diagnosis of Chemical Process. J. Expert Systems 1996, 2, 11-26.
(12) Finch, F. E.; Oyeleye, O. O.; Kramer, M. A. A Robust Event-Oriented Methodology for Diagnosis of Dynamic Process Systems. Comput. Chem. Eng. 1990, 14, 1379-1396.
(13) Lee, G. A Study on Process Fault Diagnostic Systems Using the Fault-Effect Tree Model. Ph.D. Dissertation, Seoul National University, Seoul, Korea, 1997.
(14) Kramer, M. A. Malfunction Diagnosis Using Quantitative Models with Non-Boolean Reasoning in Expert Systems. AIChE J. 1987, 33, 130-140.
(15) Lucas, J. M. The Design and Use of V-Mask Control Schemes. J. Quality Technol. 1976, 8, 1-12.
(16) Choe, Y. J. Application and Implementation of Scale-Space Filtering Techniques for Qualitative Interpretation of Real-Time Process Data. Ph.D. Thesis, Department of Chemical Engineering, Seoul National University, Seoul, Korea, 1995.
(17) Sorsa, T.; Koivo, H. N. Neural Networks in Process Fault Diagnosis. IEEE Trans. Systems, Man, Cybern. 1991, 21, 815-825.

Received for review June 8, 1998
Revised manuscript received November 10, 1998
Accepted November 11, 1998
IE980359K