Article pubs.acs.org/IECR
A Multilogic Probabilistic Signed Directed Graph Fault Diagnosis Approach Based on Bayesian Inference
Di Peng,† Zhiqiang Geng,† and Qunxiong Zhu*,†
†College of Information Science & Technology, Beijing University of Chemical Technology, Beijing 100029, China
ABSTRACT: Signed directed graph (SDG), as a widely applied fault diagnosis approach, is unable to express complicated logic relations other than logic OR and usually results in spurious interpretations. To solve the problem, a semiquantitative fault diagnosis approach based on the model of multilogic probabilistic SDG (MPSDG) with Bayesian inference is proposed. The MPSDG model introduces the logic gates to describe multiple logic causalities between process variables, and the priori probabilistic parameters in MPSDG are decided by the historical malfunction frequencies and the deviation of variables. When a failure occurs, the backtracking algorithm using the consistent rule is conducted immediately, and the posterior probabilities of each searched fault are computed and sorted by a set of Bayesian inference mechanisms. Thus, the real reason is further distinguished. This MPSDG based fault diagnosis approach is applied to two examples: a continuous stirred tank heater (CSTH) process and a Tennessee Eastman (TE) process. The experimental results demonstrate that the proposed approach is superior to the conventional SDG approach and can diagnose the production faults more accurately.
1. INTRODUCTION
Because most chemical processes are becoming larger in size and more complicated in nature, the economic loss and social influence caused by industrial accidents have grown more severe. To ensure the reliability and safety of production, fault diagnosis technology1−4 has drawn increasing attention from scholars and industrial companies. As a kind of qualitative-based fault diagnosis technology, the signed directed graph (SDG) uses nodes and directed edges to describe and represent the causal relations between process parameters in the target system. Unlike the conventional quantitative-based5,6 or data-driven approaches,7,8 the SDG model does not require a precise mathematical description or complete operational data, and it can be developed from partial information of equations or the experience of operators. In addition, SDG reveals the latent dangers and the propagation rules in a simple and effective way, so it is especially suitable for fault diagnosis in the chemical industry.
During the three preceding decades, the technology of SDG-based fault diagnosis has improved greatly. In the 1980s, the study of SDG-based fault diagnosis only considered the qualitative information of the system, and some researchers extracted these cause−effect relations as expert rules.9,10 In 1989, Shiozaki et al.11 first proposed the concept of fault reveal time, and realized online fault diagnosis by using temporal information. In the 1990s, many researchers extended the SDG model to solve the problem of multiple fault diagnosis, and the diagnostic algorithm was based on the following fact: “The more failures, the smaller co-occurrence probability”.12 At the same time, some quantitative methods began to be combined with SDG.13,14 Since 1999, additional approaches such as fuzzy-SDG,15 PCA-SDG,16 PLS-SDG,17 and QTA-SDG18 have received considerable interest.
Essentially, almost all the studies of SDG-based fault diagnosis focus on providing a set of reasoning mechanisms to find the fault reasons, and only a few approaches for the construction of an SDG model have been discussed. Actually, modeling is the base of fault diagnosis; if the SDG model is improper, even an excellent inference mechanism is useless. To construct a more accurate SDG model, Mylaraswamy et al.19 modeled the target system on the basis of the mathematical descriptions (including differential equations, algebraic equations, and differential algebraic equations). Maurya et al.20−22 presented the modeling approach for control loops, where disturbances (such as sensor bias) as well as structural failures (such as sensor failure, controller failure) are easily modeled under steady-state conditions. Otherwise, most other SDG models are based on the experts’ experience.
Nevertheless, present SDG models can only express the logic relation among the cause nodes as OR. Obviously, when the interaction between cause nodes is complex, these SDG models are unable to represent this cause−effect relationship precisely. In addition, SDG reasoning is so simple that it results in numerous spurious interpretations of faults. For example, several fault causes that are reasoned from the same qualitative symptoms cannot be distinguished further by SDG. To address this issue, Yang and Xiao23,24 first proposed the probabilistic SDG (PSDG) model. They introduced the probabilistic information of nodes and directed edges in SDG, and proposed a junction tree algorithm to acquire the conditional probabilities of each candidate fault. Song et al.25 presented a similar model in which, compared with the former, the node variable was expressed as a fuzzy variable with more information. Lu et al.26 described the procedure of constructing PSDG models in detail, and then formed an integrated framework of PSDG-based fault diagnosis approaches. However, the studies on PSDG still need more research. On the one hand, these probabilistic parameters are determined only by practical experience, ignoring some quantitative information of the target process. On the other hand, the structure of the conditional probability as well as its calculation procedure has not yet been given in detail.
The purpose of this article is to introduce a novel multilogic probabilistic SDG (MPSDG) fault diagnosis approach for analyzing the fault states during the chemical process. In the off-line modeling stage, the MPSDG model is constructed with an introduction of the logic gates (representing multiple logic causalities between process variables) and the probabilistic parameters (determined by historical malfunction frequencies and deviation values). In the online diagnosis stage, the backtracking algorithm based on the consistent rule is first carried out to find candidate fault reasons, and then the conditional probabilities of these reasons given the fault symptoms are calculated by a set of Bayesian inference steps. By ranking the conditional probability values, the real fault reason can be reasonably concluded. The continuous stirred tank heater (CSTH) process and the Tennessee Eastman (TE) process are used to demonstrate the validity and advantages of this MPSDG based fault diagnosis approach.
Industrial & Engineering Chemistry Research. Received: October 25, 2013. Revised: April 3, 2014. Accepted: May 14, 2014. Published: May 14, 2014. © 2014 American Chemical Society. dx.doi.org/10.1021/ie403608a | Ind. Eng. Chem. Res. 2014, 53, 9792−9804
2. REVIEW OF SDG BASED FAULT DIAGNOSIS
MPSDG inherits and develops the traditional SDG model, so some basic definitions and characteristics of SDG are reviewed briefly before introducing MPSDG.
2.1. SDG Model. Definition 2.1: An SDG model γ = (G, φ) is composed of a directed graph G and a function φ. The directed graph G is defined as G = (V, R, E), where $V = \{v_i\}$ is a variable node set, and $v_i$ denotes a process variable; $R = \{r_l\}$ is a reason node set, and $r_l$ denotes an abnormal reason that will cause variation of its adjacent variable node; $E = \{e_{ij}\}$ is a directed edge set, and $e_{ij}$ is a directed edge from cause node $v_i$ to effect node $v_j$. The function is defined as φ: E → {+, −}, and φ($e_{ij}$) is the sign of directed edge $e_{ij}$.
In the traditional three-state SDG model, each variable node has three qualitative values “0”, “+” and “−”, representing the normal state, higher than the normal state range, and lower than the normal state range, respectively. The directed edge has two signs, i.e., φ($e_{ij}$) = + (cause node and effect node change in the same direction) and φ($e_{ij}$) = − (cause node and effect node change in opposite directions). What is more, the reason node $r_l$ is introduced by some researchers to better express the target system.27 Each reason node is considered a root node, which has at least one edge connecting it to an effect node but no edge connecting it to a cause node.
Definition 2.2: Suppose the qualitative value of node $v_i$ is ψ($v_i$). If ψ($v_i$)φ($e_{ij}$)ψ($v_j$) = +, the edge $e_{ij}$ is said to be a consistent edge. A consistent path is composed of some continuous consistent edges (including one reason node and some variable nodes).
2.2. SDG-Based Fault Diagnosis. SDG assumes that a fault propagates only along consistent paths, and the reason node in a consistent path is regarded as the fault cause. All the process variables are monitored in the system.
When the measured value of a variable node is out of the preset thresholds (i.e., ψ($v_i$) ≠ 0), this node sends an alarm signal and starts the diagnosis. At first, the qualitative values of alarming nodes are determined, and all alarming nodes are collected in an alarming node set. Subsequently, the backtracking search from any alarming node on the SDG starts to find consistent paths, and it stops when all alarming nodes have been reasoned. At last, candidate fault reasons are found. The flowchart of the backtracking algorithm is shown in Figure 1.
Figure 1. Backtracking algorithm flow.
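The consistent-edge test of Definition 2.2 and the backward search of Section 2.2 can be sketched in a few lines. This is a minimal illustration and not the flowchart of Figure 1: the function names, the toy graph, and the encoding of the qualitative values as +1, 0, and −1 are assumptions made here. The sketch also assumes that an edge leaving a reason node is always accepted (a reason node has no measured state) and that a single fault must explain every alarm, hence the set intersection.

```python
def qualitative_state(value, low, high):
    """Three-state value of a measurement: +1 above range, -1 below, 0 normal."""
    if value > high:
        return 1
    if value < low:
        return -1
    return 0


def is_consistent(psi_cause, sign, psi_effect):
    """Definition 2.2: the edge is consistent when psi(v_i)*phi(e_ij)*psi(v_j) = +."""
    return psi_cause * sign * psi_effect == 1


def reasons_for(node, causes, psi, reasons, path=frozenset()):
    """Collect reason nodes reachable backward from `node` along consistent edges."""
    found = set()
    for cause, sign in causes.get(node, []):
        if cause in reasons:
            found.add(cause)  # assumption: edges from reason nodes are accepted
        elif cause not in path and is_consistent(psi.get(cause, 0), sign, psi[node]):
            found |= reasons_for(cause, causes, psi, reasons, path | {node})
    return found


def candidate_reasons(alarms, causes, psi, reasons):
    """Reasons that explain every alarming node (single-fault assumption)."""
    per_alarm = [reasons_for(a, causes, psi, reasons) for a in alarms]
    return set.intersection(*per_alarm) if per_alarm else set()


# Hypothetical toy graph: effect node -> list of (cause node, edge sign).
CAUSES = {"v1": [("r1", 1)], "v2": [("v1", 1)], "v3": [("v2", -1), ("r2", 1)]}
PSI = {"v1": 1, "v2": 1, "v3": -1}   # observed qualitative values
REASONS = {"r1", "r2"}

candidates = candidate_reasons(["v1", "v2", "v3"], CAUSES, PSI, REASONS)
print(candidates)
```

In this toy graph both r1 and r2 can explain v3, but only r1 explains all three alarms, which is why the intersection is used.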
Through the backtracking algorithm, SDG clearly explains the fault propagation route in the target system and is suitable for fault diagnosis in the chemical industry. However, SDG itself still has some disadvantages in the area of real-time fault diagnosis: (1) The default logic relation among cause nodes is defined as OR in SDG, so if the interaction of cause nodes is complex, SDG fails to describe the target process precisely. (2) The traditional SDG model can only produce a list of candidate faults; these candidates are treated as equally likely, so the most probable fault cause cannot be identified directly and reasonably.
3. MPSDG BASED FAULT DIAGNOSIS
To overcome the shortcomings of traditional SDG, an MPSDG model with Bayesian inference is proposed. The purpose of the proposed fault diagnosis approach is to express the multifarious variable correlations, calculate and rank the posterior probabilities of candidate faults, and finally confirm the actual fault cause.
3.1. Definition of the MPSDG Model. Compared with the definition of the SDG model, MPSDG introduces the multilogic relation as well as the probabilistic information of the nodes and edges. The model structure of MPSDG is quite similar to that of SDG, but MPSDG contains more priori information and is able to describe the system more accurately. The definition of MPSDG is as follows: Definition 3.1: An MPSDG model γ = (G′, φ, P) is composed of a multilogic directed graph G′, a function φ, and a probability distribution P. The multilogic directed graph G′ is defined as G′ = (V, R, E), where $V = \{v_i\}$ is a variable node set, $R = \{r_l\}$ is a reason node set, and $E = \{e_{ij}\}$ is a directed edge set. The function φ is the same as in Definition 2.1, where φ($e_{ij}$) denotes the sign of directed edge $e_{ij}$. The probability distribution P is defined as P: R, E → [0,1], where P($r_l$) indicates the occurrence probability of fault reason $r_l$ ($r_l$ ∈ R), and P($e_{ij}$) indicates the propagation probability from the failure of cause node $v_i$ to the failure of effect node $v_j$ ($e_{ij}$ ∈ E). In MPSDG, a variable node denotes an element of the target system and has some qualitative state values, which are
determined by comparing their measured value with the state range. A directed edge, connecting two nodes, also has two signs. To describe the diagnosis system in detail, in this paper, $v_i^n$ is used to describe the nth abnormal state of process variable $v_i$, and $e_{ij}^{nm}$ is used to describe the connection between the nth abnormal state of cause node $v_i^n$ and the mth abnormal state of effect node $v_j^m$. Taking the three-state MPSDG model as an example, there are two abnormal states in every variable node, and the variable nodes and the directed edges are specified in Table 1.

Table 1. Variable Nodes and Directed Edges Specification
qualitative value | cause node $v_i$ | effect node $v_j$ | directed edge $e_{ij}$
+ | $v_i^1$ | $v_j^1$ | $[e_{ij}^{11}\ 0;\ 0\ e_{ij}^{22}]$
− | $v_i^2$ | $v_j^2$ | $[0\ e_{ij}^{21};\ e_{ij}^{12}\ 0]$

3.2. Modeling of Multilogic Directed Graph. The SDG model transmits abnormal information from cause nodes to effect nodes by directed edges. Figure 2 is an illustration of a directed graph. As defined by traditional SDG, the logic relation between cause node $v_1$ and cause node $v_2$ is considered as OR, which means that either cause node $v_1$ or cause node $v_2$, or both together, may cause a variation of effect node $v_3$. Obviously, if the interaction of cause nodes is complex, the traditional SDG cannot describe the cause−effect relationship precisely. To describe the target system more accurately, the logic gate is introduced in the directed graph. The logic gate28 is a flexible tool for the compact representation of multilogic relations among cause nodes. For instance, suppose node $v_1$ can cause a variation of node $v_3$ only if node $v_2$ is in the positive abnormal state. To specify this logic relation, a logic gate variable denoted as $G_4$ is used in Figure 2b. In this instance, $G_4$ is a special parent variable of $v_3$ and has two exclusive states $G_4^1$ and $G_4^2$. The two states of $G_4$ cause $v_3$ with the function φ($e_{43}$). In the MPSDG modeling, we can use a logic gate variable $G_h$ to represent all types of logic relations between cause nodes (where h is the index of the logic gate and is different from the indexes of other nodes), and $G_h$ is specified in Table 2. What is more, the logic gate variable is considered as a variable node in the MPSDG model. This means that a logic gate can be the input of other logic gates, and more than one logic gate can be the cause of the same effect node.

Figure 2. An illustration of the directed graph (each node is supposed to have two abnormal states {+,−}): (a) traditional directed graph; (b) multilogic directed graph.

Table 2. Logic Gate Specification
$G_h^p$ | expression
1 | expression 1
2 | expression 2
⋮ | ⋮
P | expression P

3.3. Evaluation of the Priori Probabilistic Parameters. As mentioned above, MPSDG introduces the probabilistic parameters of reason nodes and directed edges into the SDG. These probabilistic parameters can be determined by statistical analysis of historical data, simulation, and practical experience. In MPSDG, the probability distribution of reason nodes indicates the probability that a failure will occur. The occurrence probabilities of reason nodes are determined by the statistical analysis of malfunction frequencies from historical operational data of the target process. What is more, the probability distribution of directed edges indicates the probability that the propagation rule of an edge applies. The propagation probabilities of directed edges are traditionally described by a conditional probability table (CPT), where the probability values are determined by practical experience. However, the CPT does not contain any quantitative information of the system and sometimes reduces the diagnostic resolution.
Definition 3.2: Suppose the effect node $v_j$ has M states and I cause nodes, and the cause node $v_i$ has N states (where i = 1, 2, ···, I), so the propagation probability of directed edge $e_{ij}$ is denoted as

$$P(e_{ij}^{nm}) = \lambda_{ij}\, p_{ij}^{nm} / \lambda_j \qquad (1)$$

In eq 1, $p_{ij}^{nm}$ is defined as the connection probability from $v_i^n$ to $v_j^m$ with the assurance of $\sum_{m=1}^{M} p_{ij}^{nm} = 1$. $\lambda_{ij}$ ($\lambda_{ij} \in [0,1]$) is defined as the causal relationship intensity between cause node $v_i$ and effect node $v_j$, in which $\lambda_{ij} = 1$ indicates perfect correlation and $\lambda_{ij} = 0$ indicates zero correlation. $\lambda_j = \sum_{i=1}^{I} \lambda_{ij}$ is the sum of the causal relationship intensities over all cause nodes. In eq 1, $\lambda_{ij}$ is a relative value that can be determined by the deviation relationship between cause node $v_i$ (i = 1,2,···,I) and effect node $v_j$. As cause nodes have different ranges, the relative deviations (deviations compared with the normal value) of $v_i$ and $v_j$ are first calculated as $\Delta v_i$ and $\Delta v_j$, and then the deviation relationship is represented as $\Delta v_i/\Delta v_j$.
In nature, Definition 3.2 is based on the following two properties. First, the states of the same node are mutually exclusive, meaning these states do not happen at the same time. Second, for all the states of the same node, the sum of their occurrence probabilities is 1. These two properties are proved in Appendix A.1. According to the proof of mutual exclusion and normalization, the mth abnormal state of variable node $v_j$ is simplified as

$$v_j^m = \bigcup_{i=1}^{I}\bigcup_{n=1}^{N} v_i^n e_{ij}^{nm} = \mathop{+}\limits_{\substack{i=1,\cdots,I \\ n=1,\cdots,N}} v_i^n e_{ij}^{nm} \qquad (2)$$

and the occurrence probability of the mth abnormal state of effect node $v_j$ is expressed as

$$P(v_j^m) = \sum_{i=1}^{I}\sum_{n=1}^{N} P(v_i^n)\, P(e_{ij}^{nm}) = \sum_{i=1}^{I}\sum_{n=1}^{N} P(v_i^n)\, \lambda_{ij}\, p_{ij}^{nm} / \lambda_j \qquad (3)$$
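The parameter evaluation of Section 3.3 can be illustrated numerically. The sketch below is not the authors' implementation; the function names and all numeric values are hypothetical, and states are indexed from 0 in the code (index 0 is the first abnormal state). It estimates an occurrence probability from a malfunction frequency, with the small default of 0.001 that the CSTH example assigns to faults never observed, then evaluates eq 1 for two cause nodes and eq 3 for the effect node.

```python
def occurrence_probability(failures, days, floor=0.001):
    """Frequency estimate of P(r_l); faults never observed get a small default."""
    return failures / days if failures > 0 else floor


def propagation_probability(lam_ij, lam_j, p_ij):
    """Eq 1: P(e_ij^{nm}) = lam_ij * p_ij^{nm} / lam_j for every state pair (n, m)."""
    return [[lam_ij * p / lam_j for p in row] for row in p_ij]


def effect_state_probability(m, causes):
    """Eq 3: sum over cause nodes i and states n of P(v_i^n) * P(e_ij^{nm}).

    `causes` is a list of (state_probabilities, edge_probabilities) pairs.
    """
    return sum(p_v[n] * p_e[n][m] for p_v, p_e in causes for n in range(len(p_v)))


# Hypothetical example: two cause nodes, each with two abnormal states.
lam = [0.6, 0.2]                      # causal intensities lambda_1j, lambda_2j
lam_j = sum(lam)                      # lambda_j, the normalizing sum
p_positive = [[1, 0], [0, 1]]         # connection probabilities, positive edge
p_negative = [[0, 1], [1, 0]]         # connection probabilities, negative edge

edge_1 = propagation_probability(lam[0], lam_j, p_positive)
edge_2 = propagation_probability(lam[1], lam_j, p_negative)

causes = [([0.1, 0.2], edge_1), ([0.3, 0.05], edge_2)]
p_state_1 = effect_state_probability(0, causes)   # about 0.0875
p_state_2 = effect_state_probability(1, causes)   # about 0.225

print(occurrence_probability(20, 500), p_state_1, p_state_2)
```

Because the intensities are divided by their sum, the edge probabilities of all cause nodes entering one effect node share a common scale, which is what lets eq 3 add their contributions directly.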
Figure 3 summarizes the probability calculation method of the effect node, and an MPSDG model is finally formed by introducing the logic relation and the probabilistic information of nodes and directed edges into the corresponding SDG model.

Figure 3. Probability calculation method of effect nodes.

3.4. MPSDG Reasoning with Bayesian Inference. After the MPSDG model of the target system is constructed, the reasoning mechanism is used to find fault causes based on symptoms. In general, the fault diagnosis process of MPSDG reasoning is divided into two parts: determining candidate fault reasons and calculating their probabilities. First, a backtracking algorithm, in accordance with Figure 1, is implemented to obtain the consistent paths and all possible fault reasons. Second, the conditional probabilities of all candidate fault reasons based on the measured symptoms are calculated. Then the candidate faults are ranked according to the probability values, and the fault with maximal probability is regarded as the most likely fault cause.
Definition 3.3: Suppose all alarming variable nodes are considered as a set of fault symptoms E ($E = E_1 \cdots E_Q$), where Q denotes the number of alarming variable nodes and $E_q = v_q^{n_q}$ represents the $n_q$th abnormal state of variable node $v_q$ (q = 1,2,···,Q). Thus, the conditional probability of candidate fault reason $r_l$ under the condition of fault symptom E is expressed as

$$P(r_l \mid E) = \frac{P(r_l, E_1, \cdots, E_Q)}{P(E_1, \cdots, E_Q)} = \frac{P(r_l \cap v_1^{n_1} \cap \cdots \cap v_Q^{n_Q})}{P(v_1^{n_1} \cap \cdots \cap v_Q^{n_Q})} \qquad (4)$$

Up to now, no universal algorithm designed specially for probabilistic SDG reasoning has been reported in the literature. The Bayesian inference29,30 method is usually chosen to solve the posterior probability calculation. However, the MPSDG model has various cycles that denote the necessary feedback control loops of the target system, and standard Bayesian inference cannot be applied directly because of these cycles. The Bayesian inference must be modified in advance to obtain the posterior probability value, and the detailed steps of Bayesian inference in the MPSDG model are described as follows:
(1) Simplify the multilogic directed graph according to the nonalarming variable nodes. As the nodes in the normal state are not the cause of nodes in an abnormal state, the nonalarming nodes should be deleted during the process of Bayesian inference.
(2) Obtain the first-order cut set expression of one fault symptom $v_q^{n_q}$. The $n_q$th abnormal state of variable node $v_q$ is expanded as $v_q^{n_q} = \bigcup_{i=1}^{I}\bigcup_{n=1}^{N} v_i^n e_{iq}^{n n_q}$, where $v_i^n$ is the nth state of cause node $v_i$ (i = 1,2,···,I; n = 1,2,···,N).
(3) Deduce the final cut set expression of fault symptom $v_q^{n_q}$. Expand the first-order cut set expression of $v_q^{n_q}$ until the expression is composed only of fault reasons and directed edges. As the MPSDG model has cycles, a node may be expanded as the cause of itself. However, a node cannot be caused by itself at the same time, so when there are cycles in the expanding process (discussed in Appendix A.2), the following rule is applied to break the cycle: Rule 1: A node cannot be the cause of itself at the same time. When a cycle is encountered, it should be broken at the duplicate node, and the descendant cut set is treated as a null set.
(4) Compute the logic “AND” expansion of all fault symptoms, and then obtain the posterior probability $P(r_l \mid E)$ of the inspected fault reason $r_l$ by inserting the occurrence probabilities of fault reasons and the propagation probabilities of directed edges.
3.5. Integrated Procedure of MPSDG-Based Fault Diagnosis. On the basis of the above analysis, the integrated procedure of MPSDG-based fault diagnosis is illustrated in Figure 4. The procedure contains two parts: off-line modeling and online diagnosis.
Figure 4. Procedure of the MPSDG-based fault diagnosis.
The main goal of the off-line stage is to construct the MPSDG model of the target system, which can be realized by the following steps: (1) model the multilogic directed graph of the target system; (2) evaluate the priori probabilities of reason nodes and directed edges by analysis of historical data, practical experience, and deviation values of process variables; (3) complete the MPSDG model by introducing the probabilistic parameters into the multilogic directed graph. The main goal of the online stage is to get the diagnostic result of the target system, which can be realized by the following steps: (1) monitor the alarming nodes by comparing measured values with the normal state range; (2) implement the backtracking algorithm to find the consistent paths and fault reasons; (3) if there is more than one candidate fault reason, implement the Bayesian inference to calculate the posterior probabilities of all candidate faults; (4) rank the probability values of all candidate fault reasons and output the diagnostic results.
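The cut set expansion of Section 3.4 (steps 2 and 3 together with Rule 1) can be sketched as a short recursion. The sketch is an illustration under stated assumptions rather than the authors' code: the data structure and the function name `expand` are hypothetical, state superscripts are omitted, and the first-order cut sets are those listed as eq 5 in the CSTH example of Section 4.1.3. A node that reappears on the current expansion path contributes the null set, which is how the control-loop cycles are broken.

```python
# First-order cut sets of eq 5 (CSTH example): node -> list of (edge, source).
FIRST_ORDER = {
    "v3": [("e23,3", "r23"), ("e5,3", "v5"), ("e9,3", "v9")],
    "v5": [("e24,5", "r24"), ("e3,5", "v3")],
    "v9": [("e27,9", "r27"), ("e3,9", "v3"), ("e11,9", "v11")],
    "v11": [("e28,11", "r28"), ("e9,11", "v9")],
}


def expand(node, path=()):
    """Return the final cut sets of `node` as tuples (edge, ..., edge, reason)."""
    if node not in FIRST_ORDER:
        return [(node,)]          # a reason node ends the expansion
    if node in path:
        return []                 # Rule 1: a repeated node yields the null set
    cut_sets = []
    for edge, source in FIRST_ORDER[node]:
        for tail in expand(source, path + (node,)):
            cut_sets.append((edge,) + tail)
    return cut_sets


final_cut_sets = {node: expand(node) for node in FIRST_ORDER}
for node, cut_sets in final_cut_sets.items():
    print(node, "=", " ∪ ".join(" ".join(cs) for cs in cut_sets))
```

For v3 the recursion yields e23,3 r23 ∪ e5,3 e24,5 r24 ∪ e9,3 e27,9 r27 ∪ e9,3 e11,9 e28,11 r28, which agrees with the final cut set expression given for v3 in the CSTH example.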
4. APPLICATION EXAMPLES
To demonstrate the feasibility and effectiveness of the proposed MPSDG approach, we apply it to the fault diagnosis of two examples: a CSTH process and a TE process.
4.1. CSTH Process. 4.1.1. Introduction of CSTH Process. The CSTH process was originally presented by Thornhill and Patwardhan,31,32 and is modeled based on the measured data captured from a real process. As depicted in Figure 5, this system consists of two continuous stirred tanks, and the cold water entering stirred tank 1 and stirred tank 2 is first heated by immersing the steam into the tank through a long pipe. Then, a portion of hot water formed from stirred tank 2 is recycled to stirred tank 1, which introduces additional multivariable interactions and complexity in the system. In each stirred tank, the hot and cold water is well mixed, and the temperature in the stirred tank is assumed the same as the outflow temperature. Simulation programs of this CSTH process can be found at the website of Thornhill.33 To strengthen the antidisturbance ability of the CSTH process, we introduce the temperature control loop of stirred tank 1, the temperature control loop of stirred tank 2, and the level-flow cascade control loop of stirred tank 2. Each control loop implements a proportional plus integral (PI) controller, and the tuning parameters of each PI controller are shown in Table 3.

Figure 5. Continuous stirred tank heater.

Table 3. Tuning Parameters of PI Controller
controller | set point | proportion | integral
temp control of stirred tank 1 | 50 °C | 0.5 | 0.1
temp control of stirred tank 2 | 47 °C | 1 | 0.05
level control of stirred tank 2 | 0.36 m | 0.001 | 0.0001
flow control of stirred tank 2 | LC output | 10000 | 0.01

On the basis of the construction of balance equations and control loops, the model parameters and steady states of this CSTH process are listed in Table 4.

Table 4. Model Parameters and Steady State
parameter | description | value
$V_1$ | volume of stirred tank 1 | 1.75 × $10^{-3}$ m$^3$
$A_2$ | cross sectional area of stirred tank 2 | 7.854 × $10^{-3}$ m$^2$
$r_2$ | radius of stirred tank 2 | 0.05 m
U | heat transfer coefficient | 235.1 W/(m$^2$ K)
$T_C$ | cold water temperature | 30 °C
$T_a$ | atmospheric temperature | 25 °C
$T_1$ | steady state temperature (stirred tank 1) | 50 °C
$T_2$ | steady state temperature (stirred tank 2) | 47 °C
$h_2$ | steady state level (stirred tank 2) | 0.36 m

4.1.2. MPSDG Model. This CSTH process defines 15 variables and 10 kinds of failures, where the variables $Q_1$ and $Q_2$ cannot be measured online. Table 5 provides a detailed description of these process variables and fault reasons.
On the basis of the mathematical description and control strategy of CSTH, the multilogic directed graph is first built in Figure 6. In Figure 6, the circle nodes, rectangle nodes, directed solid lines and directed dotted lines represent process variables, fault reasons, positive directed edges and negative directed edges, respectively. Because stirrer $v_{15}$ has only one fault state (stirring stops), and this state might cause a negative deviation of tank level $v_6$, the relationships among the cause nodes $v_1$, $v_7$ and $v_{15}$ cannot be expressed as logic OR. To solve the problem, a special logic gate $G_{16}$ with two exclusive states is introduced in Figure 6:

$$G_{16}^1 = v_1^1 \cup v_7^1, \qquad G_{16}^2 = v_1^2 \cup v_7^2 \cup v_{15}^1$$

where the logic variable $G_{16}^1$ is caused by any increase of incoming flow $v_1$ and $v_7$, and the logic variable $G_{16}^2$ is caused by any decrease of incoming flow $v_1$ and $v_7$ or the stop of stirrer $v_{15}$.
From the multilogic directed graph, the MPSDG model can be obtained by introducing the probabilistic information. The connection probability $p_{ij}^{nm}$ from a cause node $v_i^n$ to an effect node $v_j^m$ is determined by the impact degree according to the experience of operators. Here, the given connection probabilities should accord with the direction of edges with the assurance of $\sum_{m=1}^{M} p_{ij}^{nm} = 1$. As for the correlation $\lambda_{ij}$ between cause node $v_i$ and effect node $v_j$, it is determined according to the deviation relationship between cause nodes and effect nodes. Therefore, with the determination of $p_{ij}$ and $\lambda_{ij}$, the propagation probability $p(e_{ij})$ can be calculated by eq 1. In addition, the occurrence probabilities of each fault reason $p(r_l)$ are decided approximately by the statistical analysis of historical malfunctions. For example, during 500 running days, the fault reason $r_1$ broke down 20 times. Thus, $p(r_1)$ should be 20/500 = 0.04. Because some faults may never have occurred before, a relatively small probabilistic value of 0.001 will be assigned to these reason nodes. However, because the CSTH system does not give the historical malfunction data, the occurrence probability $p(r_l)$ cannot be obtained. In the following section, the fault diagnosis results under different occurrence probabilities are discussed.
4.1.3. Experimental Results. The CSTH process first runs in the nonfault status for 500 s, and then a fault described in Table 6 is added. Once the variable nodes alarm, the backtracking algorithm based on the collected fault symptoms is started to find the consistent paths. Table 6 shows the backtracking search results of each case.
As observed from Table 6, there are 8 cases (1st, 2nd, 4th, 5th, 6th, 8th, 9th and 10th cases) in which the searched fault reason is consistent with the original hypothesis. For the 3rd and the 7th cases, two candidate fault reasons that result in the same qualitative symptoms are detected. If we only use the
conventional SDG-based fault diagnosis, these candidate fault reasons are treated as equally likely and the real fault reason cannot be distinguished directly and reasonably.

Table 5. Description of Variable Nodes and Reason Nodes of the CSTH Process
variable node description | fault state
$v_1$: cold water 1 in-flow, $F_1$ | $v_1^1$: +; $v_1^2$: −
$v_2$: cold water 1 valve, $u_1$ | $v_2^1$: +; $v_2^2$: −
$v_3$: tank 1 temperature, $T_1$ | $v_3^1$: +; $v_3^2$: −
$v_4$: heat in-flow of tank 1, $Q_1$ | $v_4^1$: +; $v_4^2$: −
$v_5$: heat input 1 control valve, $u_4$ | $v_5^1$: +; $v_5^2$: −
$v_6$: level of tank 2, $h_2$ | $v_6^1$: +; $v_6^2$: −
$v_7$: cold water 2 in-flow, $F_2$ | $v_7^1$: +; $v_7^2$: −
$v_8$: cold water 2 control valve, $u_2$ | $v_8^1$: +; $v_8^2$: −
$v_9$: tank 2 temperature, $T_2$ | $v_9^1$: +; $v_9^2$: −
$v_{10}$: heat in-flow of tank 2, $Q_2$ | $v_{10}^1$: +; $v_{10}^2$: −
$v_{11}$: heat input 2 control valve, $u_5$ | $v_{11}^1$: +; $v_{11}^2$: −
$v_{12}$: cold water temperature, $T_C$ | $v_{12}^1$: +; $v_{12}^2$: −
$v_{13}$: recycle flow, $F_R$ | $v_{13}^1$: +; $v_{13}^2$: −
$v_{14}$: recycle flow valve, $u_3$ | $v_{14}^1$: +; $v_{14}^2$: −
$v_{15}$: stirrer state | $v_{15}^1$: −

reason node description | fault reason
$r_{21}$: failure of cold water 1 in-flow | $r_{21}^1$: positive disturbance; $r_{21}^2$: negative disturbance
$r_{22}$: failure of cold water 1 valve | $r_{22}^1$: closed
$r_{23}$: failure of heat in-flow of tank 1 | $r_{23}^1$: positive disturbance; $r_{23}^2$: negative disturbance
$r_{24}$: failure of heat input 1 valve | $r_{24}^1$: closed
$r_{25}$: failure of cold water 2 in-flow | $r_{25}^1$: positive disturbance; $r_{25}^2$: negative disturbance
$r_{26}$: failure of cold water 2 valve | $r_{26}^1$: closed
$r_{27}$: failure of heat in-flow of tank 2 | $r_{27}^1$: positive disturbance; $r_{27}^2$: negative disturbance
$r_{28}$: failure of heat input 2 valve | $r_{28}^1$: closed
$r_{29}$: failure of cold water temperature | $r_{29}^1$: positive disturbance; $r_{29}^2$: negative disturbance
$r_{30}$: failure of recycle flow valve | $r_{30}^1$: closed

Figure 6. Multilogic directed graph of the CSTH process.

Table 6. Backtracking Search Results of 10 Cases
case no. | description | fault symptom | candidate reason
1 | cold water flow $F_1$ jumps +$10^{-5}$ m$^3$ s$^{-1}$ | $v_1^1, v_3^2, v_5^1, v_6^1, v_7^2, v_8^2, v_9^2, v_{11}^1$ | $r_{21}^1$
2 | cold water valve $u_1$ closes | $v_1^2, v_2^2, v_3^1, v_5^2, v_6^2, v_7^1, v_8^1, v_9^1, v_{11}^2$ | $r_{22}^1$
3 | heat flow $Q_1$ jumps +200 J s$^{-1}$ | $v_3^1, v_5^2, v_9^1, v_{11}^2$ | $r_{23}^1$, $r_{27}^1$
4 | heat input valve $u_4$ closes | $v_3^2, v_5^2, v_9^2, v_{11}^1$ | $r_{24}^1$
5 | cold water flow $F_2$ jumps +$10^{-5}$ m$^3$ s$^{-1}$ | $v_3^2, v_5^1, v_6^1, v_7^1, v_8^2, v_9^2, v_{11}^1$ | $r_{25}^1$
6 | cold water valve $u_2$ closes | $v_3^1, v_5^2, v_6^2, v_7^2, v_8^2, v_9^1, v_{11}^2$ | $r_{26}^1$
7 | heat flow $Q_2$ jumps +200 J s$^{-1}$ | $v_3^1, v_5^2, v_9^1, v_{11}^2$ | $r_{23}^1$, $r_{27}^1$
8 | heat input valve $u_5$ closes | $v_3^2, v_5^1, v_9^2, v_{11}^2$ | $r_{28}^1$
9 | cold water temperature $T_C$ jumps +3 °C | $v_3^1, v_5^2, v_9^1, v_{11}^2, v_{12}^1$ | $r_{29}^1$
10 | recycle flow valve $u_3$ closes | $v_3^1, v_5^2, v_9^2, v_{11}^1, v_{13}^2, v_{14}^2$ | $r_{30}^1$

Diagnosis 1: Heat flow $Q_1$ jumps +200 J s$^{-1}$. For the third case, the fault symptom includes $v_3^1$, $v_5^2$, $v_9^1$, and $v_{11}^2$. During the backtracking search process, as the nodes $v_4$ and $v_{10}$ are immeasurable, they disrupt the consistency between the fault symptoms and lower the fault resolution. To solve the problem,34 different values of $v_4$ and $v_{10}$ are assumed until a consistent path that explains these fault symptoms is found. The consistent paths obtained by the backtracking algorithm are as follows:

Path 1: $r_{23}^1 \to v_4^1 \to v_3^1 \to v_5^2$; $r_{23}^1 \to v_4^1 \to v_3^1 \to v_9^1 \to v_{11}^2 \to v_{10}^2$
Path 2: $r_{27}^1 \to v_{10}^1 \to v_9^1 \to v_{11}^2$; $r_{27}^1 \to v_{10}^1 \to v_9^1 \to v_3^1 \to v_5^2 \to v_4^2$

According to the searched consistent paths, both the fault reasons $r_{23}^1$ and $r_{27}^1$ can explain the collected fault symptoms. To distinguish the two candidate faults, Bayesian inference is introduced to rank the possibilities of fault reasons $r_{23}^1$ and $r_{27}^1$,
and the inference results of the third case are specified in the following steps:
(1) Simplify the multilogic directed graph. As the qualitative values of variable nodes $v_1$, $v_2$, $v_6$, $v_7$, $v_8$, $v_{12}$, $v_{13}$, $v_{14}$, and $v_{15}$ are "0", these nodes should be deleted from the original multilogic directed graph. Moreover, directed edges that connect the immeasurable nodes $v_4$ or $v_{10}$ should be transformed before nodes $v_4$ and $v_{10}$ are deleted. Figure 7 shows the simplified multilogic directed graph and the failure states of each variable. As the immeasurable nodes $v_4$ and $v_{10}$ are deleted, the directed edges that connect nodes $v_4$ and $v_{10}$ are transformed as $e_{5,3} = e_{4,3} * e_{5,4}$, $e_{23,3} = e_{4,3} * e_{23,4}$, $e_{11,9} = e_{10,9} * e_{11,10}$, and $e_{27,9} = e_{10,9} * e_{27,10}$.
Figure 7. Simplified multilogic directed graph of CSTH process.
(2) Calculate the first cut set expressions of variables $v_3$, $v_5$, $v_9$, and $v_{11}$:
$$
\begin{aligned}
v_3 &= e_{23,3} r_{23} \cup e_{5,3} v_5 \cup e_{9,3} v_9 \\
v_5 &= e_{24,5} r_{24} \cup e_{3,5} v_3 \\
v_9 &= e_{27,9} r_{27} \cup e_{3,9} v_3 \cup e_{11,9} v_{11} \\
v_{11} &= e_{28,11} r_{28} \cup e_{9,11} v_9
\end{aligned} \tag{5}
$$
(3) Calculate the final cut set expressions of variables $v_3$, $v_5$, $v_9$, and $v_{11}$:
$$
\begin{aligned}
v_3 &= e_{23,3} r_{23} \cup e_{5,3} e_{24,5} r_{24} \cup e_{9,3} e_{27,9} r_{27} \cup e_{9,3} e_{11,9} e_{28,11} r_{28} \\
v_5 &= e_{24,5} r_{24} \cup e_{3,5} e_{23,3} r_{23} \cup e_{3,5} e_{9,3} e_{27,9} r_{27} \cup e_{3,5} e_{9,3} e_{11,9} e_{28,11} r_{28} \\
v_9 &= e_{27,9} r_{27} \cup e_{3,9} e_{23,3} r_{23} \cup e_{3,9} e_{5,3} e_{24,5} r_{24} \cup e_{11,9} e_{28,11} r_{28} \\
v_{11} &= e_{28,11} r_{28} \cup e_{9,11} e_{27,9} r_{27} \cup e_{9,11} e_{3,9} e_{23,3} r_{23} \cup e_{9,11} e_{3,9} e_{5,3} e_{24,5} r_{24}
\end{aligned} \tag{6}
$$
Expand eq 6 into a state matrix, and the final cut set expressions of fault symptoms $v_3^1$, $v_5^2$, $v_9^1$, and $v_{11}^2$ are obtained.
$$
\begin{aligned}
v_3^1 &= e_{23,3}^{11} r_{23}^1 + e_{9,3}^{11} e_{27,9}^{11} r_{27}^1 \\
v_5^2 &= e_{3,5}^{12} e_{23,3}^{11} r_{23}^1 + e_{3,5}^{12} e_{9,3}^{11} e_{27,9}^{11} r_{27}^1 \\
v_9^1 &= e_{27,9}^{11} r_{27}^1 + e_{3,9}^{11} e_{23,3}^{11} r_{23}^1 \\
v_{11}^2 &= e_{9,11}^{12} e_{27,9}^{11} r_{27}^1 + e_{9,11}^{12} e_{3,9}^{11} e_{23,3}^{11} r_{23}^1
\end{aligned} \tag{7}
$$
(4) Calculate the propagation probabilities of each directed edge in Figure 7. From Definition 3.2, the propagation probabilities are determined by the correlation values and the connection probabilities. In Figure 7, the directed edges $e_{3,5}$, $e_{5,3}$, $e_{9,11}$, and $e_{11,9}$ denote the transmission direction of the control signal, so the nodes $v_3$ and $v_5$ and the nodes $v_9$ and $v_{11}$ are perfectly correlated, corresponding to the correlation values $\lambda_{3,5}$, $\lambda_{5,3}$, $\lambda_{9,11}$, and $\lambda_{11,9}$ of 1. Moreover, the correlation values $\lambda_{23,3}$, $\lambda_{24,5}$, $\lambda_{27,9}$, and $\lambda_{28,11}$ between reason nodes and their connected variable nodes are also considered as 1. Thus, only the correlation values $\lambda_{3,9}$ and $\lambda_{9,3}$ should be determined. Figure 8 shows the fault symptoms of nodes $v_3$, $v_5$, $v_9$, and $v_{11}$ in the third case. As observed from Figure 8, the proportion of relative deviation between $\Delta v_{11}/\Delta v_9$ and $\Delta v_3/\Delta v_9$ is almost 3:2, and the proportion between $\Delta v_5/\Delta v_3$ and $\Delta v_9/\Delta v_3$ is almost 10:1, so the correlation values can be set as $\lambda_{3,9} = 0.7$ and $\lambda_{9,3} = 0.1$. Moreover, the connection probabilities should follow the direction of the directed edges while ensuring $\sum_{m=1}^{M} p_{ij}^{nm} = 1$ (namely, if $\varphi(e_{ij}) = +$, then $p_{ij} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$; on the contrary, if $\varphi(e_{ij}) = -$, then $p_{ij} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$), so the propagation probabilities of each directed edge $P(e_{ij}^{nm})$ are calculated by eq 1.
Figure 8. Fault symptoms of nodes v3, v5, v9, and v11 in the third case.
(5) Calculate the posterior probabilities $P(r_{23}^1 \mid v_3^1 v_5^2 v_9^1 v_{11}^2)$ and $P(r_{27}^1 \mid v_3^1 v_5^2 v_9^1 v_{11}^2)$. The logic "AND" of the fault symptoms is expanded as follows:
$$
\begin{aligned}
v_3^1 \cap v_5^2 \cap v_9^1 \cap v_{11}^2
&= e_{3,5}^{12} e_{23,3}^{11} r_{23}^1 e_{9,11}^{12} e_{3,9}^{11}
 + e_{3,5}^{12} e_{23,3}^{11} r_{23}^1 e_{9,11}^{12} e_{27,9}^{11} r_{27}^1
 + e_{3,5}^{12} e_{9,3}^{11} e_{27,9}^{11} r_{27}^1 e_{9,11}^{12} \\
v_3^1 \cap v_5^2 \cap v_9^1 \cap v_{11}^2 \cap r_{23}^1
&= e_{3,5}^{12} e_{23,3}^{11} r_{23}^1 e_{9,11}^{12} e_{3,9}^{11}
 + e_{3,5}^{12} e_{23,3}^{11} r_{23}^1 e_{9,11}^{12} e_{27,9}^{11} r_{27}^1
 + e_{3,5}^{12} e_{9,3}^{11} e_{27,9}^{11} r_{27}^1 e_{9,11}^{12} r_{23}^1 \\
v_3^1 \cap v_5^2 \cap v_9^1 \cap v_{11}^2 \cap r_{27}^1
&= e_{3,5}^{12} e_{23,3}^{11} r_{23}^1 e_{9,11}^{12} e_{3,9}^{11} r_{27}^1
 + e_{3,5}^{12} e_{23,3}^{11} r_{23}^1 e_{9,11}^{12} e_{27,9}^{11} r_{27}^1
 + e_{3,5}^{12} e_{9,3}^{11} e_{27,9}^{11} r_{27}^1 e_{9,11}^{12}
\end{aligned} \tag{8}
$$
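The expansion in eq 8 can be evaluated numerically, as the following minimal Python sketch illustrates. It rests on assumptions that this section does not spell out: the "+" terms are treated as mutually exclusive so their probabilities add, the factors within a term are treated as independent so their probabilities multiply, and each sign-consistent edge is given a propagation probability equal to its correlation value λ (connection probability 1). The function and parameter names are hypothetical.

```python
def term_probs(p_r23, p_r27, lam_39, lam_93):
    """Probabilities of the three product terms of eq 8.

    Each entry is (probability, contains_r23, contains_r27).
    Edges whose correlation value is 1 contribute a factor of 1 and are omitted.
    """
    return [
        (lam_39 * p_r23, True, False),  # term through r23 and edge e3,9
        (p_r23 * p_r27, True, True),    # term containing both r23 and r27
        (lam_93 * p_r27, False, True),  # term through r27 and edge e9,3
    ]


def posteriors(p_r23, p_r27, lam_39=0.7, lam_93=0.1):
    """Return (P(r23 | symptoms), P(r27 | symptoms)) by Bayes' rule."""
    terms = term_probs(p_r23, p_r27, lam_39, lam_93)
    evidence = sum(p for p, _, _ in terms)
    # Intersecting a term with a reason it already contains leaves it unchanged;
    # otherwise the reason's prior enters as one more factor.
    joint_r23 = sum(p if has23 else p * p_r23 for p, has23, _ in terms)
    joint_r27 = sum(p if has27 else p * p_r27 for p, _, has27 in terms)
    return joint_r23 / evidence, joint_r27 / evidence


for priors in [(0.01, 0.01), (0.02, 0.01), (0.01, 0.07)]:
    post23, post27 = posteriors(*priors)
    print(priors, round(post23, 4), round(post27, 4))
```

Under these assumptions the three prior settings give 0.8778/0.1444, 0.9355/0.0882, and 0.5286/0.5571, which agree with the corresponding rows of Table 7. Calling `posteriors(0.01, 0.01, lam_39=0.1, lam_93=0.5)` with the correlation values of the seventh case gives 0.1885/0.8377, the first row of Table 8.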
Because the historical malfunctions of the CSTH system are hard to collect, Table 7 lists the diagnostic results of the third case under different occurrence probabilities. As listed in Table 7, when the occurrence probabilities of $r_{23}^1$ and $r_{27}^1$ are equal, the posterior probability of fault reason $r_{23}^1$ is obviously greater than the posterior probability of $r_{27}^1$. Thus, the fault
reason $r_{23}^1$ (the increase of heat flow Q1) is diagnosed as the most probable cause of symptoms $v_3^1$, $v_5^2$, $v_9^1$, and $v_{11}^2$, and this diagnostic conclusion accords with the original hypothesis. Moreover, the priori occurrence probabilities of fault reasons are also an important factor that influences the diagnostic result. As observed from Table 7, when the ratio of the occurrence probabilities of $r_{23}^1$ to $r_{27}^1$ is 1:6 or higher, the real fault reason $r_{23}^1$ is correctly diagnosed. However, when the occurrence probability of fault reason $r_{27}^1$ is seven or more times that of fault reason $r_{23}^1$, fault reason $r_{27}^1$ (the increase of heat flow Q2) becomes the most probable fault reason.

Table 7. Diagnostic Results of the Third Case under Different Occurrence Probabilities

| occurrence probability $P(r_{23}^1):P(r_{27}^1)$ | diagnostic results^a: $P(r_{23}^1 \mid v_3^1 v_5^2 v_9^1 v_{11}^2)$ | diagnostic results^a: $P(r_{27}^1 \mid v_3^1 v_5^2 v_9^1 v_{11}^2)$ |
|---|---|---|
| 0.01:0.01 | **0.8778** | 0.1444 |
| 0.02:0.01 | **0.9355** | 0.0882 |
| 0.01:0.03 | **0.7117** | 0.3408 |
| 0.01:0.06 | **0.5632** | 0.5162 |
| 0.01:0.07 | 0.5286 | **0.5571** |
| 0.01:0.09 | 0.4728 | **0.6231** |

^a The boldface emphasizes the most probable fault reason by the MPSDG approach, whose posterior probability is maximal.

Diagnosis 2: Heat flow Q2 jumps +200 J s⁻¹. For the seventh case, because its fault symptom is the same as that of the third case, the consistent paths and Bayesian inference results (eq 8) of the seventh case are the same as those of the third case. Figure 9 shows the fault symptoms of nodes $v_3$, $v_5$, $v_9$, and $v_{11}$ in the seventh case. As observed from Figure 9, the proportion between $\Delta v_{11}/\Delta v_9$ and $\Delta v_3/\Delta v_9$ is almost 10:1, and the proportion between $\Delta v_5/\Delta v_3$ and $\Delta v_9/\Delta v_3$ is almost 2:1, so the correlation values are set as $\lambda_{3,9} = 0.1$ and $\lambda_{9,3} = 0.5$. The other correlation values and connection probabilities between cause nodes and effect nodes are the same as those in the third case, and the propagation probabilities of each directed edge $P(e_{ij}^{nm})$ are calculated by eq 1. Table 8 lists the diagnostic results of the seventh case under different occurrence probabilities. As listed in Table 8, when the ratio of the priori occurrence probabilities of $r_{23}^1$ to $r_{27}^1$ is lower than 5:1, the posterior probability of fault reason $r_{27}^1$ is maximal, so fault reason $r_{27}^1$ is concluded to be the most probable cause of the seventh case, which accords with the original hypothesis. However, when the ratio is 5:1 or higher, fault reason $r_{23}^1$ is the diagnostic conclusion.

Figure 9. Fault symptoms of nodes v3, v5, v9, and v11 in the seventh case.

Table 8. Diagnostic Results of the Seventh Case under Different Occurrence Probabilities

| occurrence probability $P(r_{23}^1):P(r_{27}^1)$ | diagnostic results^a: $P(r_{23}^1 \mid v_3^1 v_5^2 v_9^1 v_{11}^2)$ | diagnostic results^a: $P(r_{27}^1 \mid v_3^1 v_5^2 v_9^1 v_{11}^2)$ |
|---|---|---|
| 0.01:0.01 | 0.1885 | **0.8377** |
| 0.01:0.02 | 0.1161 | **0.9125** |
| 0.02:0.01 | 0.3194 | **0.7250** |
| 0.04:0.01 | 0.4894 | **0.5787** |
| 0.05:0.01 | **0.5476** | 0.5286 |
| 0.07:0.01 | **0.6399** | 0.4543 |

^a The boldface emphasizes the most probable fault reason by the MPSDG approach, whose posterior probability is maximal.

4.2. Tennessee Eastman Process. The TE process is a simulation program of a realistic industrial process that is widely used for evaluating and comparing the efficiency of process monitoring techniques. The process produces two products (labeled G and H) and one byproduct (labeled F) from four reactants (labeled A, C, D, and E). A simplified diagram of the TE process (ref 35) is shown in Figure 10, which contains 41 measured variables and 12 manipulated variables in the five major operating units (a continuous stirred tank reactor, a condenser, a separator, a stripper, and a compressor). The step in reactor cooling water inlet temperature $r_4^1$ is one of 20 faults simulated by the TE process. The simulation time is 72 h, and the fault is injected at 20 h. When the fault arises, the following symptoms are detected: reactor temperature $v_9^1$ (+), separator temperature $v_{11}^1$ (+), stripper temperature $v_{18}^1$ (+), reactor cooling water outlet temperature $v_{21}^1$ (+), condenser cooling water outlet temperature $v_{22}^1$ (+), and reactor cooling water flow $v_{10}^1$ (+). The MPSDG model should first be constructed for reasoning and fault diagnosis in this study. Instead of the entire MPSDG model (not shown due to its complexity), the simplified MPSDG model of fault symptoms is given in Figure 11. Once the variable nodes alarm, the backtracking algorithm based on the simplified MPSDG model is started to find the consistent paths. The consistent paths obtained by the backtracking algorithm are as follows:
Path 1: $r_3^1 \to v_9^1 \to v_{21}^1$; $r_3^1 \to v_9^1 \to v_{10}^1$; $r_3^1 \to v_9^1 \to v_{11}^1 \to v_{18}^1$; $r_3^1 \to v_9^1 \to v_{22}^1$
Path 2: $r_4^1 \to v_{21}^1 \to v_{10}^1$; $r_4^1 \to v_{21}^1 \to v_9^1 \to v_{11}^1 \to v_{18}^1$; $r_4^1 \to v_{21}^1 \to v_9^1 \to v_{22}^1$
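A consistent-path search of this kind can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' backtracking algorithm: the edge list is transcribed from the first cut sets of the simplified TE graph (eq 9), every edge is assumed to carry a positive sign (read off the "11" state superscripts), and the names `EDGES`, `OBSERVED`, `roots_explaining`, and `candidate_reasons` are hypothetical.

```python
# (cause, effect) -> edge sign; all signs assumed positive for the simplified TE graph.
EDGES = {
    ("r4", "v21"): +1, ("v9", "v21"): +1, ("v10", "v21"): +1,
    ("r3", "v9"): +1, ("v21", "v9"): +1,
    ("v9", "v10"): +1, ("v21", "v10"): +1,
    ("v9", "v11"): +1, ("v11", "v18"): +1, ("v9", "v22"): +1,
}
# Observed deviations: every symptom is a positive deviation in this scenario.
OBSERVED = {"v9": +1, "v10": +1, "v11": +1, "v18": +1, "v21": +1, "v22": +1}


def roots_explaining(node, sign, visited=frozenset()):
    """Walk consistent edges backwards from (node, sign) and collect root reasons."""
    roots = set()
    for (cause, effect), edge_sign in EDGES.items():
        if effect != node or cause in visited:
            continue  # not an incoming edge, or it would close a loop
        cause_sign = sign * edge_sign  # sign the cause needs for a consistent edge
        if cause.startswith("r"):
            roots.add((cause, cause_sign))
        elif OBSERVED.get(cause) == cause_sign:
            roots |= roots_explaining(cause, cause_sign, visited | {node})
    return roots


def candidate_reasons():
    """Root reasons that can explain every observed symptom."""
    per_symptom = [roots_explaining(n, s) for n, s in OBSERVED.items()]
    return set.intersection(*per_symptom)


print(sorted(candidate_reasons()))
```

With these inputs both `r3` and `r4` (each with a positive deviation) explain all six symptoms, which is the ambiguity that the Bayesian step is then used to resolve.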
Figure 10. Tennessee Eastman process.
Figure 11. Simplified multilogic-directed graph of the TE process.
According to the searched consistent paths, both fault reasons $r_3^1$ (step in D feed temperature) and $r_4^1$ can explain the collected fault symptoms. If only the conventional SDG-based fault diagnosis is used, the possibilities of the two candidate fault reasons are equal, and the real fault reason cannot be identified further by SDG. However, with MPSDG, the possibilities of the two candidate faults are calculated, and the conclusion is obtained by ranking the probability values of the candidate fault reasons. To calculate the possibilities of the two candidate faults, Bayesian inference is applied and the inference results are specified in the following steps:
(1) Simplify the multilogic directed graph, as shown in Figure 11.
(2) Calculate the first cut set expressions of variables $v_{21}$, $v_9$, $v_{10}$, $v_{11}$, $v_{18}$, and $v_{22}$:
$$
\begin{aligned}
v_{21} &= e_{4,21} r_4 \cup e_{9,21} v_9 \cup e_{10,21} v_{10} \\
v_9 &= e_{3,9} r_3 \cup e_{21,9} v_{21} \\
v_{10} &= e_{9,10} v_9 \cup e_{21,10} v_{21} \\
v_{11} &= e_{9,11} v_9 \\
v_{18} &= e_{11,18} v_{11} \\
v_{22} &= e_{9,22} v_9
\end{aligned} \tag{9}
$$
(3) Calculate the final cut set expressions of fault symptoms $v_{21}^1$, $v_9^1$, $v_{10}^1$, $v_{11}^1$, $v_{18}^1$, and $v_{22}^1$:
$$
\begin{aligned}
v_{21}^1 &= e_{4,21}^{11} r_4^1 + e_{9,21}^{11} e_{3,9}^{11} r_3^1 \\
v_9^1 &= e_{3,9}^{11} r_3^1 + e_{21,9}^{11} e_{4,21}^{11} r_4^1 \\
v_{10}^1 &= e_{9,10}^{11} e_{3,9}^{11} r_3^1 + e_{21,10}^{11} e_{9,21}^{11} e_{3,9}^{11} r_3^1 + e_{21,10}^{11} e_{4,21}^{11} r_4^1 \\
v_{11}^1 &= e_{9,11}^{11} e_{3,9}^{11} r_3^1 + e_{9,11}^{11} e_{21,9}^{11} e_{4,21}^{11} r_4^1 \\
v_{18}^1 &= e_{11,18}^{11} e_{9,11}^{11} e_{3,9}^{11} r_3^1 + e_{11,18}^{11} e_{9,11}^{11} e_{21,9}^{11} e_{4,21}^{11} r_4^1 \\
v_{22}^1 &= e_{9,22}^{11} e_{3,9}^{11} r_3^1 + e_{9,22}^{11} e_{21,9}^{11} e_{4,21}^{11} r_4^1
\end{aligned} \tag{10}
$$
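The back-substitution that turns first cut sets into final cut sets (eq 5 to eq 6 in the CSTH case, eq 9 to eq 10 here) can be sketched as a depth-first expansion that drops any branch returning to a node already on the current path. The Python sketch below uses the CSTH first cut sets of eq 5 as input; the data layout and the function name `final_cut_sets` are assumptions made for illustration, and the authors' own procedure may organize the substitution differently. Because the sketch enumerates every loop-free path, it can list more terms than a hand-simplified expression.

```python
# First cut sets of eq 5: effect -> list of (edge label, cause).
FIRST_CUT_SETS = {
    "v3": [("e23,3", "r23"), ("e5,3", "v5"), ("e9,3", "v9")],
    "v5": [("e24,5", "r24"), ("e3,5", "v3")],
    "v9": [("e27,9", "r27"), ("e3,9", "v3"), ("e11,9", "v11")],
    "v11": [("e28,11", "r28"), ("e9,11", "v9")],
}


def final_cut_sets(node, path=()):
    """Expand `node` into products that each end in a root reason.

    Each product is a tuple of edge labels followed by the reason name.
    A cause that is already on `path` would close a loop, so it is dropped.
    """
    products = []
    for edge, cause in FIRST_CUT_SETS[node]:
        if cause not in FIRST_CUT_SETS:  # root reason: nothing left to expand
            products.append((edge, cause))
        elif cause not in path:
            for tail in final_cut_sets(cause, path + (node,)):
                products.append((edge,) + tail)
    return products


for product in final_cut_sets("v3"):
    print(" ".join(product))
```

For `v3` the sketch yields the four products `e23,3 r23`, `e5,3 e24,5 r24`, `e9,3 e27,9 r27`, and `e9,3 e11,9 e28,11 r28`, the same terms as the first line of eq 6.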
(4) Calculate the propagation probabilities of the directed edges in Figure 11. The propagation probabilities are defined by the correlation values and the connection probabilities. In Figure 11, the directed edges $e_{21,10}$, $e_{10,21}$, and $e_{9,10}$ denote the transmission direction of the control signal, so their correlation values $\lambda_{21,10}$, $\lambda_{10,21}$, and $\lambda_{9,10}$ are 1. Moreover, the correlation
values $\lambda_{4,21}$, $\lambda_{3,9}$, $\lambda_{9,11}$, $\lambda_{11,18}$, and $\lambda_{9,22}$ are also considered as 1. Thus, only the correlation values $\lambda_{21,9}$ and $\lambda_{9,21}$ should be determined. Figure 12 shows the fault symptoms of nodes $v_9$, $v_{21}$, and $v_{10}$. As observed from Figure 12, the proportion of relative deviation between $\Delta v_9/\Delta v_{21}$ and $\Delta v_{10}/\Delta v_{21}$ is almost 1:5, and the proportion between $\Delta v_{21}/\Delta v_9$ and $\Delta v_{21}/\Delta v_{10}$ is almost 5:1, so the correlation values are set as $\lambda_{9,21} = 0.2$ and $\lambda_{21,9} = 1$. Moreover, the connection probabilities are determined in the same way as in the CSTH case, so the propagation probabilities of each directed edge $P(e_{ij}^{nm})$ are calculated by eq 1.

Figure 12. Fault symptoms of nodes v9, v21, and v10 in the TE process.

(5) Calculate the posterior probabilities $P(r_3^1 \mid v_{21}^1 v_9^1 v_{10}^1 v_{11}^1 v_{18}^1 v_{22}^1)$ and $P(r_4^1 \mid v_{21}^1 v_9^1 v_{10}^1 v_{11}^1 v_{18}^1 v_{22}^1)$. The logic "AND" of the fault symptoms is expanded as follows:
$$
\begin{aligned}
& v_{21}^1 \cap v_9^1 \cap v_{10}^1 \cap v_{11}^1 \cap v_{18}^1 \cap v_{22}^1 \\
&\quad = e_{3,9}^{11} r_3^1 e_{4,21}^{11} r_4^1 e_{9,10}^{11} e_{11,18}^{11} e_{9,11}^{11} e_{9,22}^{11}
 + e_{3,9}^{11} r_3^1 e_{4,21}^{11} r_4^1 e_{21,10}^{11} e_{11,18}^{11} e_{9,11}^{11} e_{9,22}^{11} \\
&\qquad + e_{3,9}^{11} e_{9,21}^{11} r_3^1 e_{9,10}^{11} e_{11,18}^{11} e_{9,11}^{11} e_{9,22}^{11}
 + e_{3,9}^{11} e_{9,21}^{11} r_3^1 e_{21,10}^{11} e_{11,18}^{11} e_{9,11}^{11} e_{9,22}^{11} \\
&\qquad + e_{21,9}^{11} e_{4,21}^{11} r_4^1 e_{21,10}^{11} e_{11,18}^{11} e_{9,11}^{11} e_{9,22}^{11} \\[4pt]
& v_{21}^1 \cap v_9^1 \cap v_{10}^1 \cap v_{11}^1 \cap v_{18}^1 \cap v_{22}^1 \cap r_3^1 \\
&\quad = e_{3,9}^{11} r_3^1 e_{4,21}^{11} r_4^1 e_{9,10}^{11} e_{11,18}^{11} e_{9,11}^{11} e_{9,22}^{11}
 + e_{3,9}^{11} r_3^1 e_{4,21}^{11} r_4^1 e_{21,10}^{11} e_{11,18}^{11} e_{9,11}^{11} e_{9,22}^{11} \\
&\qquad + e_{3,9}^{11} e_{9,21}^{11} r_3^1 e_{9,10}^{11} e_{11,18}^{11} e_{9,11}^{11} e_{9,22}^{11}
 + e_{3,9}^{11} e_{9,21}^{11} r_3^1 e_{21,10}^{11} e_{11,18}^{11} e_{9,11}^{11} e_{9,22}^{11} \\
&\qquad + e_{21,9}^{11} e_{4,21}^{11} r_4^1 e_{21,10}^{11} e_{11,18}^{11} e_{9,11}^{11} e_{9,22}^{11} r_3^1 \\[4pt]
& v_{21}^1 \cap v_9^1 \cap v_{10}^1 \cap v_{11}^1 \cap v_{18}^1 \cap v_{22}^1 \cap r_4^1 \\
&\quad = e_{3,9}^{11} r_3^1 e_{4,21}^{11} r_4^1 e_{9,10}^{11} e_{11,18}^{11} e_{9,11}^{11} e_{9,22}^{11}
 + e_{3,9}^{11} r_3^1 e_{4,21}^{11} r_4^1 e_{21,10}^{11} e_{11,18}^{11} e_{9,11}^{11} e_{9,22}^{11} \\
&\qquad + e_{3,9}^{11} e_{9,21}^{11} r_3^1 e_{9,10}^{11} e_{11,18}^{11} e_{9,11}^{11} e_{9,22}^{11} r_4^1
 + e_{3,9}^{11} e_{9,21}^{11} r_3^1 e_{21,10}^{11} e_{11,18}^{11} e_{9,11}^{11} e_{9,22}^{11} r_4^1 \\
&\qquad + e_{21,9}^{11} e_{4,21}^{11} r_4^1 e_{21,10}^{11} e_{11,18}^{11} e_{9,11}^{11} e_{9,22}^{11}
\end{aligned} \tag{11}
$$
Table 9 lists the diagnostic results under different occurrence probabilities. As listed in Table 9, when the occurrence probabilities of $r_3^1$ and $r_4^1$ are equal, the posterior probability of fault reason $r_4^1$ is obviously greater than the posterior probability of $r_3^1$. Thus, the fault reason $r_4^1$ (the reactor cooling water inlet temperature) is diagnosed as the most probable cause of symptoms $v_{21}^1$, $v_9^1$, $v_{10}^1$, $v_{11}^1$, $v_{18}^1$, and $v_{22}^1$, and this diagnostic conclusion accords with the original hypothesis. Moreover, when the ratio of the occurrence probabilities of $r_3^1$ to $r_4^1$ is 2:1 or lower, the real fault reason $r_4^1$ is diagnosed. However, when the ratio is higher than 2:1, the fault reason $r_3^1$ is diagnosed as the conclusion.

Table 9. Diagnostic Results of TE Process under Different Occurrence Probabilities

| occurrence probability $P(r_3^1):P(r_4^1)$ | diagnostic results^a: $P(r_3^1 \mid v_{21}^1 v_9^1 v_{10}^1 v_{11}^1 v_{18}^1 v_{22}^1)$ | diagnostic results^a: $P(r_4^1 \mid v_{21}^1 v_9^1 v_{10}^1 v_{11}^1 v_{18}^1 v_{22}^1)$ |
|---|---|---|
| 0.01:0.01 | 0.3028 | **0.7211** |
| 0.01:0.02 | 0.1885 | **0.8393** |
| 0.01:0.03 | 0.1416 | **0.8879** |
| 0.02:0.01 | 0.4674 | **0.5571** |
| 0.03:0.01 | **0.5708** | 0.4743 |
| 0.04:0.01 | **0.6418** | 0.4090 |

^a The boldface emphasizes the most probable fault reason by the MPSDG approach, whose posterior probability is maximal.

4.3. Discussions. In the probability calculation of candidate fault reasons, it is very important to learn the initial priori probabilities of reason nodes and directed edges. The priori probabilities of directed edges determined by the relative deviation are more reasonable than those determined by practical experience. Here, the directed graph in Figure 2a is taken as an example again. If the relative deviation is $\Delta v_1/\Delta v_3 > \Delta v_2/\Delta v_3$, then the correlation values can be assumed to satisfy $\lambda_1 > \lambda_2$. Moreover, the connection probabilities $p_{1,3}^{11}$ and $p_{2,3}^{11}$ are equal to 1, so the priori probabilities of the directed edges satisfy $P(e_{1,3}^{11}) > P(e_{2,3}^{11})$. Of course, if $\Delta v_1/\Delta v_3 \approx \Delta v_2/\Delta v_3$, then assume $\lambda_1 = \lambda_2$, and finally obtain $P(e_{1,3}^{11}) = P(e_{2,3}^{11})$. Therefore, the priori probability values of directed edges determined in this way accord with common sense. Moreover, compared with the traditional PSDG model, MPSDG has fewer conditional probabilistic parameters. Take Figure 2a as an example: node $v_3$ has two cause nodes, $v_1$ and $v_2$. For the traditional PSDG model (ref 25), the number of priori conditional probabilistic parameters that should be attached to node $v_3$ is 2 × 2 × 2 = 8. If a node has n cause nodes, $2^{n+1}$ parameters should be used to specify the conditional probability distribution. For the MPSDG model, because the parameters $\lambda_{ij}$ and $p_{ij}^{nm}$ in eq 1 are independently given and $p_{ij}^{nm}$ accords with the direction of the directed edges, the number of probabilistic parameters for n cause nodes is n.
Considering the CSTH case (referring to Figure 7), as nodes $v_{11}$, $v_9$, $v_3$, and $v_5$ have 2, 3, 3, and 2 cause nodes, respectively, the number of probabilistic parameters in the traditional PSDG is $2^3 + 2^4 + 2^4 + 2^3 = 48$, while the number in MPSDG is only 2 + 3 + 3 + 2 = 10. In addition, the posterior probability of a candidate fault reason largely depends on its initial priori probability. As shown in Tables 7 and 8, when the ratio of the priori probabilities of $r_{23}^1$ and $r_{27}^1$ crosses a threshold value, the diagnostic conclusion changes. Because of this property, if multiple reasons cause the same evidence, the posterior probability of a reason with a small priori probability may not be the maximum even if it is the real fault. Therefore, the diagnostic conclusion should not only focus on the reason with the maximal posterior probability but also check the other candidate reasons in the order of their posterior probabilities.
5. CONCLUSION
To meet the requirements of completeness, accuracy, and performance in real-time fault diagnosis, a semiquantitative fault diagnosis approach based on the MPSDG with Bayesian inference is proposed. Compared with the conventional SDG, MPSDG realizes the intricate causal representation between process variables, enhances the rationality of the priori probabilistic parameters, and calculates and ranks the posterior probabilities of all candidate faults under the conditions of the measured fault symptoms. In the two application examples, the MPSDG-based fault diagnosis approach shows concise causal representation, powerful fault interpretation, and high diagnostic resolution, thus providing a new approach to guarantee the safety of the process. Of course, the MPSDG model also has some limitations, for example, constructing an MPSDG model conveniently, learning and tuning the priori probabilistic parameters in a supervised mode, and so on. These should be addressed in future studies.
■ APPENDIX
A.1. Proof of the Mutex and Normalization Requirement
Firstly, the mutex requirement for the states of a same node is realized as follows:
Proposition 1: Suppose the effect node $v_j$ has M states and I cause nodes, the cause node $v_i$ has N states (i = 1,2,···,I), and the directed edges are independent of each other (i.e., ∃ n ≠ n′ and m ≠ m′ and i ≠ i′ s.t. $e_{ij}^{nm} \cap e_{i'j}^{n'm'} = \phi$), so the effect node is denoted as
$$
v_j^m = \mathop{+}\limits_{i=1,\cdots,I} \; \mathop{+}\limits_{n=1,\cdots,N} v_i^n e_{ij}^{nm} \tag{A1}
$$
where the symbol + means the XOR operation. Furthermore, eq A2 is realized when m ≠ m′.
$$
v_j^m \cap v_j^{m'} = \phi \tag{A2}
$$
Proof: From the condition $e_{ij}^{nm} \cap e_{i'j}^{n'm'} = \phi$, the equation $v_i^n e_{ij}^{nm} \cap v_{i'}^{n'} e_{i'j}^{n'm'} = \phi$ is derived. Moreover, the mth abnormal state of variable node $v_j$ in the SDG is expressed as $v_j^m = \bigcup_{i=1}^{I} \bigcup_{n=1}^{N} v_i^n e_{ij}^{nm}$. Thus, with a logical expansion of effect node $v_j$, the effect node $v_j$ is simplified as
$$
\begin{aligned}
v_j^m &= \mathop{+}\limits_{i=1,\cdots,I} \; \mathop{+}\limits_{n=1,\cdots,N} v_i^n e_{ij}^{nm} \\
v_j^m \cap v_j^{m'} &= \Big( \mathop{+}\limits_{i=1,\cdots,I} \; \mathop{+}\limits_{n=1,\cdots,N} v_i^n e_{ij}^{nm} \Big) \cap \Big( \mathop{+}\limits_{i'=1,\cdots,I} \; \mathop{+}\limits_{n'=1,\cdots,N} v_{i'}^{n'} e_{i'j}^{n'm'} \Big) = \phi
\end{aligned}
$$
Equation A2 demonstrates that all the states of a same cause node are mutex. According to Proposition 1, the mth state of effect node $v_j^m$ is simplified as eq A1, so its occurrence probability can be concluded as
$$
P(v_j^m) = \sum_{i=1}^{I} \sum_{n=1}^{N} P(v_i^n e_{ij}^{nm}) = \sum_{i=1}^{I} \sum_{n=1}^{N} P(v_i^n) P(e_{ij}^{nm}) \tag{A3}
$$
Secondly, the normalization requirement for the states of a same node is proved as follows:
Proof: Equation A3 has defined the occurrence probability of $v_j^m$, and the sum of the occurrence probabilities of the M states is
$$
\sum_{m=1}^{M} P(v_j^m) = \sum_{m=1}^{M} \sum_{i=1}^{I} \sum_{n=1}^{N} P(v_i^n) P(e_{ij}^{nm}) = \sum_{i=1}^{I} \sum_{n=1}^{N} \sum_{m=1}^{M} P(v_i^n) P(e_{ij}^{nm}) = \sum_{i=1}^{I} \sum_{n=1}^{N} \Big[ P(v_i^n) \sum_{m=1}^{M} P(e_{ij}^{nm}) \Big] \tag{A4}
$$
Suppose $\lambda_{ij} = \sum_{m=1}^{M} P(e_{ij}^{nm})$, n = 1,2,···,N. That is, the sum of the propagation probabilities from any state of cause node $v_i$ to all the states of effect node $v_j$ is equal to a constant $\lambda_{ij}$ (where $\lambda_{ij} \in [0,1]$). Then, eq A4 is expressed as
$$
\sum_{m=1}^{M} P(v_j^m) = \sum_{i=1}^{I} \sum_{n=1}^{N} P(v_i^n) \lambda_{ij} = \sum_{i=1}^{I} \lambda_{ij} \sum_{n=1}^{N} P(v_i^n) = \sum_{i=1}^{I} \lambda_{ij} = \lambda_j \tag{A5}
$$
where $\sum_{n=1}^{N} P(v_i^n) = 1$. If all the propagation probabilities $P(e_{ij}^{nm})$ are divided by $\lambda_j$ (namely, suppose $\lambda_{ij}/\lambda_j = \sum_{m=1}^{M} P(e_{ij}^{nm})$), then the normalization requirement $\sum_{m=1}^{M} P(v_j^m) = 1$ is realized. The key to the above proof is to suppose $\lambda_{ij}/\lambda_j = \sum_{m=1}^{M} P(e_{ij}^{nm})$. In nature, $\lambda_{ij}$ ($\lambda_{ij} \in [0,1]$) denotes the causal relationship intensity between cause node $v_i$ and effect node $v_j$, in which $\lambda_{ij} = 1$ suggests perfect correlation and $\lambda_{ij} = 0$ suggests zero correlation. Furthermore, according to the normalization proof, the propagation probability of directed edge $e_{ij}$ is concluded in Definition 3.2. In Definition 3.2, the causal relationship intensity $\lambda_{ij}$ between cause node and effect node is given first, and then the connection probability $p_{ij}^{nm}$ is determined in the normalization way (that is, ensuring $\sum_{m=1}^{M} p_{ij}^{nm} = 1$).
A.2. Cycle Processing
As mentioned above, if the measured value of a variable in a cycle has deviated, this cycle must be processed in advance so that Bayesian inference can be used to calculate the probabilities. In general, a number of fault scenarios may occur in a feedback control loop, and these scenarios (refs 22 and 26) are mainly classified into perfect control (including external disturbances, sensor bias, controller signal bias, and valve bias) and imperfect control (including large external disturbances, sensor failure, controller signal failure, and valve failure). Figure A1 is an illustration of the SDG model with a control loop system. The eight scenarios can be analyzed using this model. When any one of the eight faults happens, the consistent paths based on the detected fault symptoms inside a cycle can be obtained by using the conventional SDG reasoning. The consistent paths help to break down the cycle for further probability calculation. Owing to limited space, the cases of sensor bias in perfect control and controller failure in imperfect control are discussed to illustrate the rule of cycle processing. (1) Sensor bias MVbias. It is supposed that there is a positive sensor bias. Then, the measured initial responses of MV, CS, and VP are positive, negative, and negative. After feedback regulation of the control loop, the steady-state response of MV
returns to normal, and the steady-state responses of CS and VP are still negative. It means that the response of MV is a compensatory response. On the basis of the measured initial responses, the consistent path for sensor bias is shown as follows: MVbias(+) → MV(+) → CS(−) → VP(−). It can be seen that the consistent path ends before the duplicate node MV, illustrating that the cycle is broken down at the duplicate node. (2) Controller failure CSfail. This fault scenario is equivalent to cutting the branch from MV to CS. Then, the signs of VP and MV are always the same as the sign of CS. Therefore, the consistent path for controller failure is as follows: CSfail(−) → CS(−) → VP(−) → MV(−). It also shows that the cycle is broken down at the duplicate node.
Figure A1. SDG model of a control loop system.
■ AUTHOR INFORMATION
Corresponding Author
*Tel.: +86-10-64426960. Fax: +86-10-64437805. E-mail: [email protected].
Notes
The authors declare no competing financial interest.
■ ACKNOWLEDGMENTS
This project is supported by the National Natural Science Foundation of China (Grant No. 61104131, Grant No. 61374166).
■ REFERENCES
(1) Ghantasala, S.; El-Farra, N. H. Robust actuator fault isolation and management in constrained uncertain parabolic PDE systems. Automatica 2009, 45, 2368.
(2) Qin, S. J.; Zheng, Y. Y. Quality-relevant and process-relevant fault monitoring with concurrent projection to latent structures. AIChE J. 2013, 59, 496.
(3) Tang, J. Z.; Wang, Q. F. Online fault diagnosis and prevention expert system for dredgers. Expert Syst. Appl. 2008, 34, 511.
(4) Peng, D.; Xu, Y.; Zhu, Q. X. Study and application of case-based extension fault diagnosis for chemical process. Chin. J. Chem. Eng. 2013, 21, 366.
(5) Lau, C. K.; Ghosh, K.; Hussain, M. A.; Hassan, C. R. C. Fault diagnosis of Tennessee Eastman process with multi-scale PCA and ANFIS. Chemom. Intell. Lab. 2013, 120, 1.
(6) Geng, Z. Q.; Zhu, Q. X. Rough set-based fuzzy rule acquisition and its application for fault diagnosis in petrochemical process. Ind. Eng. Chem. Res. 2009, 48, 827.
(7) Qin, S. J. Survey on data-driven industrial process monitoring and diagnosis. Annu. Rev. Control 2012, 59, 496.
(8) Eslamloueyan, R. Designing a hierarchical neural network based on fuzzy clustering for fault diagnosis of the Tennessee-Eastman process. Appl. Soft Comput. 2011, 11, 1407.
(9) Shiozaki, J.; Matsuyama, H.; O'Shima, E.; Iri, M. An improved algorithm for diagnosis of system failures in the chemical process. Comput. Chem. Eng. 1985, 9, 285.
(10) Kramer, M. A.; Palowitch, B. L. A rule-based approach to fault diagnosis using the signed directed graph. AIChE J. 1987, 33, 1067.
(11) Shiozaki, J.; Shibata, B.; Matsuyama, H.; O'Shima, E. Fault diagnosis of chemical processes utilizing signed directed graphs-improvement by using temporal information. IEEE Trans. Ind. Electron. 1989, 36, 469.
(12) Vedam, H.; Venkatasubramanian, V. Signed digraph based multiple fault diagnosis. Comput. Chem. Eng. 1997, 21, 665.
(13) Vianna, R.; McGreavy, C. Qualitative modeling of chemical processes-A weighted digraph (WDG) approach. Comput. Chem. Eng. 1995, 19, 375.
(14) Tarifa, E. E.; Scenna, N. J. Fault diagnosis, directed graphs, and fuzzy logic. Comput. Chem. Eng. 1997, 21, 649.
(15) Tarifa, E. E.; Scenna, N. J. Fault diagnosis for MSF dynamic states using a SDG and fuzzy logic. Desalination 2004, 166, 93.
(16) Vedam, H.; Venkatasubramanian, V. PCA-SDG based process monitoring and fault diagnosis. Control Eng. Pract. 1999, 7, 903.
(17) Lee, G.; Tosukhowong, T.; Lee, J. H.; Han, C. Fault diagnosis using the hybrid method of signed digraph and partial least squares with time delay: The pulp mill process. Ind. Eng. Chem. Res. 2006, 45, 9061.
(18) Maurya, M. R.; Rengaswamy, R.; Venkatasubramanian, V. A signed directed graph and qualitative trend analysis-based framework for incipient fault diagnosis. Chem. Eng. Res. Des. 2007, 85, 1407.
(19) Mylaraswamy, D.; Kavuri, S.; Venkatasubramanian, V. A framework for automated development of causal models for fault diagnosis. AIChE Annual Meeting, Miami, FL, 1994.
(20) Maurya, M. R.; Rengaswamy, R.; Venkatasubramanian, V. A systematic framework for the development and analysis of signed digraphs for chemical processes. 1. Algorithms and analysis. Ind. Eng. Chem. Res. 2003, 42, 4789.
(21) Maurya, M. R.; Rengaswamy, R.; Venkatasubramanian, V. A systematic framework for the development and analysis of signed digraphs for chemical processes. 2. Control loops and flowsheet analysis. Ind. Eng. Chem. Res. 2003, 42, 4811.
(22) Maurya, M. R.; Rengaswamy, R.; Venkatasubramanian, V. A signed directed graph-based systematic framework for steady-state malfunction diagnosis inside control loops. Chem. Eng. Sci. 2006, 61, 1790.
(23) Yang, F.; Xiao, D. Y. Probabilistic signed directed graph and its application in hazard assessment. Prog. Saf. Sci. Technol. 2006, 6, 111.
(24) Yang, F.; Xiao, D. Y. Model and fault inference with the framework of probabilistic SDG. 9th International Conference on Control, Automation, Robotics and Vision, Singapore, 2006.
(25) Song, Q. J.; Xu, M. Q.; Wang, R. X. Fault diagnosis approach based on fuzzy probabilistic SDG model and Bayesian Inference. IEEE Circuits and Systems International Conference on Testing and Diagnosis (ICTD), Chengdu, 2009.
(26) Lu, N.; Xiong, Z. H.; Wang, X.; Ren, C. R. Integrated framework of probabilistic signed digraph based fault diagnosis approach to a gas fractionation unit. Ind. Eng. Chem. Res. 2011, 50, 10062.
(27) Gao, D.; Wu, C. G.; Zhang, B. K.; Ma, X. Signed directed graph and qualitative trend analysis based fault diagnosis in chemical industry. Chin. J. Chem. Eng. 2010, 18, 265.
(28) Zhang, Q. Dynamic uncertain causality graph for knowledge representation and reasoning: Discrete DAG cases. J. Comput. Sci. Technol. 2012, 27, 1.
(29) Zhu, Y. L.; Huo, L. M.; Lu, J. L. Bayesian networks-based approach for power systems fault diagnosis. IEEE Trans. Power Delivery 2006, 21, 634.
(30) Lo, C. H.; Wong, Y. K.; Rad, A. B. Bond graph based Bayesian network for fault diagnosis. Appl. Soft Comput. 2011, 11, 1208.
(31) Thornhill, N. F.; Patwardhan, S. C.; Shah, S. L. A continuous stirred tank heater simulation model with applications. J. Process Control 2008, 18, 347.
(32) Patwardhan, S. C.; Manuja, S.; Narasimhan, S.; Shah, S. L. From data to diagnosis and control using generalized orthonormal basis filters. Part II: model predictive and fault tolerant control. J. Process Control 2006, 16, 157.
(33) Thornhill, N. F.; Patwardhan, S. C.; Shah, S. L. The CSTH Simulation. http://www.ps.ic.ac.∼nina/CSTHSimulation/index.htm (accessed on September 20th, 2013).
(34) Zhang, J.; Cao, W.-L.; Wang, B.-S. Fault location algorithm based on the qualitative knowledge of signed directed graph. IEEE International Conference on Industrial Technology (ICIT), Hong Kong, 2005; p 1295.
(35) McAvoy, T. J.; Ye, N. Base control for the Tennessee Eastman problem. Comput. Chem. Eng. 1994, 18, 383.