Adaptive Agent-Based System for Process Fault Diagnosis

Ind. Eng. Chem. Res. 2011, 50, 9138–9155. dx.doi.org/10.1021/ie102058d. ARTICLE pubs.acs.org/IECR

Adaptive Agent-Based System for Process Fault Diagnosis

Sinem Perk,*,† Fouad Teymour,† and Ali Cinar†

Department of Chemical and Biological Engineering, Illinois Institute of Technology, Chicago, Illinois 60616, United States

Supporting Information

ABSTRACT: An adaptive agent-based hierarchical framework for fault type classification and diagnosis in continuous chemical processes is presented. Classification techniques such as Fisher’s discriminant analysis (FDA) and partial least-squares discriminant analysis (PLSDA) and diagnosis tools such as variable contribution plots are used by agents in this supervision system. After an abnormality is detected, the classification results reported by different diagnosis agents are summarized via a performance-based criterion, and a consensus diagnosis decision is formed. In the agent management layer of the proposed system, the performances of diagnosis agents are evaluated under different fault scenarios, and the collective performance of the supervision system is improved via performance-based consensus decision and adaptation. The effectiveness of the proposed adaptive agent-based framework for the classification of faults is illustrated using a simulated continuous stirred tank reactor (CSTR) network.

’ INTRODUCTION

Almost all chemical processes experience abnormalities that arise from process disturbances, equipment failure, and environmental factors and, if undetected, these will have negative consequences that range from off-spec production to shut-down of the plant. Statistical process monitoring (SPM), fault detection and diagnosis (FDD), and supervision are used to detect an abnormality in the system, to identify the potential fault by examining the affected parts of the process, and to take corrective action before the fault spreads. Statistical model-based techniques using principal component analysis (PCA),1–3 partial least-squares (PLS),4–7 multiblock PCA (MBPCA) and PLS (MBPLS),8–11 and dynamic PCA (DPCA)12 have been proposed in the literature for fault detection in continuous processes. After an abnormality is detected in the process, two types of approaches are used for fault diagnosis. One approach relies on identifying the process variables that contributed the most to inflating the monitoring statistics and relating these variables to source causes. For the identification of the important variables that have been affected by the fault, contribution plots are used,13–17 and variable contributions to PCA statistics and state space model states are investigated.18–20 The second approach is based on relating symptoms directly to faults by using classification and discrimination techniques. Supervised classification techniques that are used for fault diagnosis use different historical data sets, each representing a different fault class, and relate the new data collected while the fault is active to the most closely related fault class.
Supervised classification techniques include linear and angular distance measures of PCA scores and residuals,18,21–24 FDA,23,24 PLSDA,25 artificial neural networks, and wavelets.26–33 During the design phase of FDD, it is challenging to predict the techniques that would give the best results for all processes and for all possible abnormalities a process can experience. A system is required that includes several alternative techniques, dynamically analyzes the performance of every technique in the detection and diagnosis of different faults, and selects and prioritizes the most effective technique for each fault scenario. © 2011 American Chemical Society

A hierarchical agent-based system can create a flexible environment for distributed FDD and provide a reliable system for automated and adaptive fault classification in complex chemical processes. A combined fault detection, classification, and diagnosis environment with agent-based systems has been developed at IIT as a part of a complex framework for Monitoring, Analysis, Diagnosis, and Control with Agent-Based Systems (MADCABS). MADCABS contains agents that perform tasks such as data preprocessing, process monitoring, fault diagnosis, system identification, and control. The aim is to bring together several alternative techniques for each of these tasks, dynamically identify the best performing agents and techniques, and provide reliable and accurate results via an agent performance evaluation mechanism, performance-based consensus, and adaptation. The paper outlines the structure of this hierarchical system, presents its fault classification and diagnosis module, and illustrates its performance for adaptive fault classification. A performance-based adaptive monitoring and fault detection framework has been proposed in a previous article.34 Agents that use PCA, MBPCA, and DPCA techniques are implemented in a distributed process, where PCA and DPCA agents monitor each unit in the process, and a multiblock PCA agent monitors the entire process using data from each unit. Multivariate monitoring statistics, T2 and squared prediction error (SPE), are used by individual fault detection agents to detect abnormal process operation. A fault detection organizer (FDO) agent summarizes all the decisions given by its respective fault detection agents and provides a consensus decision among the fault detection agents.
It was demonstrated that the historical performance-based consensus scheme yields fewer missed alarms and faster detection than a voting-based scheme, since the consensus can be formed by fewer, reliable agents and the combined system has the capability to predict which of the agents should be assigned a higher

Received: October 10, 2010. Revised: June 15, 2011. Accepted: June 19, 2011. Published: June 20, 2011.


performance value during consensus on the basis of their experiences with similar faults in history.34 This paper presents the fault diagnosis activities that follow fault detection. In this article, the agents implemented for fault diagnosis are presented, and the performances of voting-based and adaptive performance-based criteria are compared in terms of misclassification rates for various single and multiple simultaneous actuator faults. Intelligent fault detection and diagnosis schemes have been proposed using knowledge-based (KB) monitoring and diagnosis with expert system technology and blackboard applications.35–43 Qualitative and probabilistic quantitative methodologies have been proposed where fault detection with SPM is combined with KB fault diagnosis. The fault information that is extracted from the monitoring statistics is either compressed into qualitative measures or used to find the process variables responsible for the alarm to activate the corresponding rules in knowledge-based diagnosis.37,40,41,43 In blackboard applications, several diagnosis techniques are brought together, and one of the techniques is selected based on its likelihood of success.36 In this paper, the proposed framework contains fault discrimination agents that use techniques such as PLSDA, FDA, and contribution plots. The performances of the agents are evaluated under different fault scenarios, and based on their estimated performances for a given fault, a diagnosis consensus decision is formed. The performance values of the agents are updated after every confirmed diagnosis, and agents with poor performances adapt by including the new faulty data and retraining. Agent-performance-based fault diagnosis and adaptation provide novel approaches for integrating the strengths of various fault diagnosis techniques. The outline of the paper is as follows.
The hierarchical layers of MADCABS, fault discrimination agents in the process supervision layer, and the agent-performance management mechanism are presented. The performance-based criterion (PBC) that is used in consensus fault diagnosis is explained, and the adaptation of the agents is discussed. The performance of the combined adaptive agent-based FDD framework is illustrated in case studies on a simulated CSTR network.

’ AGENT-BASED SYSTEMS AND MADCABS

An agent is a software entity that has specific properties and behavioral rules. A proactive agent observes its environment and other agents in that environment, acts on the environment according to its defined behavioral rules, and can adapt to changing process conditions automatically according to predefined criteria. Multilayered, autonomous, and adaptive multiagent systems provide a powerful environment for the supervision and control of complex distributed processes. MADCABS is a multiagent, hierarchical, adaptive, autonomous, distributed decision-making system that automates the sequence of knowledge extraction from data, analysis, and decision making. It includes several successful monitoring, diagnosis, and control methodologies from the literature; efficiently combines and analyzes information provided by different agents; ranks and selects the best performing agent for the current operating conditions of the system; and modifies deteriorating agents so that they improve by using experience and built-in knowledge. MADCABS consists of three hierarchical layers (Figure 1). In the physical communication layer, the software representations of the process units, their connections, and sensor and actuator


Figure 1. The interlayer and intralayer information flow in MADCABS.

representations are created. The communication agents in this layer manage the communication of MADCABS with a process or simulator and update the values in the sensors and the actuators. The process supervision layer contains modules for preprocessing of data, statistical process monitoring, fault classification and diagnosis, and control and decision making. Each of these modules consists of a number of agents with different techniques that collaborate with each other during the execution of their tasks. The manager agents in the agent management layer monitor the performances of agents in the process supervision layer under specific states of process operation, rate and rank their performances, and adjust the confidence level assigned to an agent based on past performances under similar operating conditions. A context is a container in which agents are implemented. In the supervision layer, there are different contexts for monitoring and fault detection, fault diagnosis, and process control. In the fault diagnosis context, a PLSDA agent collects data from the unit for specific fault types, builds a model relating the type of fault to the values of process variables, and uses this model to classify the type of fault for new data after the FDO declares the consensus fault decision. Other fault discrimination agents have similar rules, and they provide their classification results for the current fault observation. The diagnosis organizer (DO) gathers the classification results from all of the fault discrimination agents for the same unit and forms a consensus classification decision on the basis of the agents’ historical performances. If a classification agent has a higher rate of misclassification for a certain fault, and hence a lower performance value, it is given less confidence in the consensus, and the agent retrains its model by including the new observation. The agent becomes more reliable for that type of fault over time via adaptation.
It is difficult to know at the design phase which technique would be more reliable and perform better than the others. Practice has shown that some methods and their agents are better


suited to specific situations and ineffective for others. Therefore, multiple alternative methods are implemented in MADCABS. When there are multiple agents that can perform the same task or similar tasks, they are compared based on their historical performances. The performances of different agents are stored as history along with state metrics that define the situation under which the performances were measured. The performance history is then used as a reference in estimating an agent’s potential performance the next time a similar situation arises. When agents are not selected to take part in a consensus due to poor performance, the agents update their built-in knowledge (their methods) and adapt to perform better. In the agent management layer of MADCABS, a historical performance space is formed for each agent competing for a specific task. The performance is measured and recorded along with the state metrics that define the state of the system when the performance was observed. When MADCABS is supervising a process, the current state of the system is compared with the recorded states, and the performance of the agents for the current state is estimated by using their performances for similar states in the historical performance space.34 The performances of agents can be measured during their performance episodes. All agents update their performance values, reliabilities, or priorities after the performance evaluation. Then, a new performance assessment episode begins, after which the historical performance space is updated with the current performances. This cycle repeats itself for each performance episode.

’ FAULT DIAGNOSIS AGENTS IN MADCABS

MADCABS has three types of monitoring agents, PCA, DPCA, and MBPCA, which build normal operating (NO) models for each unit in the process. Each monitoring agent generates two monitoring statistics, Hotelling’s T2 and SPE. A total of six statistics are observed by six different fault detection agents; hence, a total of six flags are raised for each unit at each time point. A fault flag is raised when the monitoring statistic goes outside the control limits; otherwise, a normal flag is raised. There is an FDO for each process unit. The FDO keeps track of all the fault flags raised by fault detection agents, declares a consensus fault based on a consensus criterion, and, in case of a consensus fault decision, triggers the diagnosis agent. The effectiveness of the combined monitoring and fault detection framework using several alternative consensus criteria was illustrated and compared in terms of detection times and false and missed alarm rates in a previous paper.34 Use of combinations of monitoring agents yielded fewer false alarms, since it is unlikely for all good techniques to flag a false alarm simultaneously. The most effective consensus criterion was the time-averaged performance with history criterion (TAPWHC). TAPWHC uses information about the performances of fault detection agents for similar fault magnitudes in history and forms a consensus decision on the basis of the expected performances of fault flagging agents. It provides faster fault detection and fewer missed alarms and, therefore, is used as the fault detection criterion in this paper. After an abnormality is detected by the FDO, the diagnosis organizer (DO) is triggered for the faulty unit.
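The flag-raising step described above can be sketched as follows; the agent labels, statistic values, and control limits here are illustrative assumptions, not values from MADCABS:

```python
# Hypothetical sketch: each fault detection agent watches one monitoring
# statistic and raises a fault flag when the statistic exceeds its control
# limit; otherwise a normal flag is raised.
from dataclasses import dataclass

@dataclass
class FaultDetectionAgent:
    name: str             # e.g. "PCA-T2" (illustrative label)
    control_limit: float  # upper control limit for the statistic

    def flag(self, statistic_value: float) -> bool:
        # True = fault flag, False = normal flag
        return statistic_value > self.control_limit

# Six agents: T2 and SPE for each of PCA, DPCA, and MBPCA
agents = [FaultDetectionAgent(f"{m}-{s}", limit)
          for (m, s, limit) in [("PCA", "T2", 25.0), ("PCA", "SPE", 12.0),
                                ("DPCA", "T2", 30.0), ("DPCA", "SPE", 14.0),
                                ("MBPCA", "T2", 28.0), ("MBPCA", "SPE", 13.0)]]

# Current values of the six statistics for one unit at one sampling time
stats = {"PCA-T2": 40.1, "PCA-SPE": 5.0, "DPCA-T2": 33.2,
         "DPCA-SPE": 16.0, "MBPCA-T2": 10.0, "MBPCA-SPE": 20.5}

# Six flags per unit per time point; the FDO then applies its consensus
# criterion (e.g., TAPWHC) over these flags.
flags = {a.name: a.flag(stats[a.name]) for a in agents}
```

With these illustrative numbers, four of the six agents raise fault flags, and the FDO would form its consensus over this flag pattern.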
DO is responsible for getting estimates from different fault diagnosis agents, requesting performance estimates from the diagnosis manager for each agent for the current fault state, forming a consensus diagnosis decision via a performance-based criterion (PBC), and triggering agent adaptation (Figure 2).


Figure 2. Various agents used in fault diagnosis system.

Three types of fault diagnosis agents are built in MADCABS, using three different fault classification techniques: PLSDA, FDA, and a fault type estimator that utilizes variable contributions to monitoring statistics. The details of the PLSDA and FDA techniques and the contribution plots are given in the Supporting Information.4,8,9,13,25,44,45 The diagnosis trainer agent contains the process data for each known fault type in the process (Figure 2). All of the fault diagnosis agents require historical data for all possible faults. The diagnosis trainer agent is responsible for collecting process data under each fault and building a fault data matrix in the training phase. This paper emphasizes the fully automated FDD scheme, whose effectiveness is illustrated on a set of known fault types. MADCABS also allows human-operator intervention when unknown faults occur in the system, and the agents retrain their models in real time by incorporating the process data for the newly defined fault. The PLSDA agent uses the fault data matrix and builds another matrix of 1s and 0s to represent the fault type. PLSDA uses the PLS method to relate the fault data to the respective fault under which the data were collected. For each block of process fault data collected under a certain process fault, the respective fault column in that block of the quality matrix is set to 1s, and the rest of the fault type columns of that block are set to 0s. FDA uses the same fault data from the training agent as PLSDA to create different clusters of faults. The multivariate observations are transformed to another coordinate system that enhances the separation of the samples belonging to each class. In FDA, the separation among different fault classes is maximized.
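The construction of the PLSDA 0/1 quality (indicator) matrix described above can be sketched as follows; the function name and block sizes are illustrative:

```python
import numpy as np

def build_plsda_targets(fault_blocks):
    """Stack per-fault data blocks and build the 0/1 indicator matrix Y.

    fault_blocks: list of (fault_label, data_block) pairs, where each
    data_block is an (n_obs x n_vars) array collected under that fault.
    Each row of Y has a 1 in its own fault's column and 0s elsewhere.
    """
    labels = [lbl for lbl, _ in fault_blocks]
    X = np.vstack([blk for _, blk in fault_blocks])
    Y = np.zeros((X.shape[0], len(labels)))
    row = 0
    for j, (_, blk) in enumerate(fault_blocks):
        Y[row:row + blk.shape[0], j] = 1.0  # 1s in this fault's column
        row += blk.shape[0]
    return X, Y, labels

# Two hypothetical faults with 3 and 2 observations of 4 variables each
X, Y, labels = build_plsda_targets([
    ("F1", np.random.randn(3, 4)),
    ("F2", np.random.randn(2, 4)),
])
```

A PLS regression of X onto Y would then yield the regression vectors that the PLSDA agent uses to classify new fault observations.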
Contribution plots identify the process variables that have contributed significantly to the inflation of the T2 and SPE statistics.13 In practice, it is necessary to relate these process variables to various faults. Automation of this process has been proposed in the literature using knowledge-based systems.40,43 In this paper, the use of variable contributions to monitoring statistics differs from the conventional use of contribution plots in fault diagnosis. The contribution values are used to list, for each statistic, the process variables whose contributions to the inflation of that statistic exceed the 3σ confidence limits of the NO contribution values. Since six different monitoring statistics are used for fault detection, six different sets of variable contributions are available (Figure 3). Only the variables that have been listed in the majority of the sets are kept. The variables




Figure 3. Illustration of contribution map estimator agent.

are kept as a mapping of the fault signature on process variables to the fault type itself. During the training phase, the variable contributions are calculated, a fault signature is extracted for each known fault, and a mapping is created. The lists of fault signature variables are mapped against the respective fault under which the data have been collected. This contribution-plot-based classification map is used by one type of fault discrimination agent, namely the variable contribution map estimator (Figure 3). Variable contribution plots are used in this closed-set scheme, without incorporating human expert knowledge, in order to automate classification and diagnosis without integrating a knowledge-based system. Figure 3 illustrates the fault data classification by the contribution map estimator agent with an example for an arbitrary Fault Type 2 (F2). The fault detection agents identify variables X1, X4, X6, X7, and X8 as the variables that have contributed to F2. The consensus set contains only variables X1 and X4, and these two variables are set as the fault signature for F2. In the table, all the fault diagnosis decisions are listed corresponding to different fault signatures. The left side of the table lists the consensus set of variables, and the right side lists the sequence of actual faults in effect when a corresponding set of process variables has high contributions. In this illustrative example, the contribution map estimator incorrectly classified the fault signature [X1, X2, X3] for F1 as F2 in 3 observations out of 10. For F2, the fault signature is represented by [X1, X4] and is correctly classified for all 10 observations. Once a fault is detected and the consensus variables are identified, the map estimator agent determines the potential fault in the process as the known fault that the same fault signature has most often been classified as in history.
For the illustrative example given in Figure 3, a consensus set that consists of X1 and X4 yields F2 as the diagnosis result from the contribution map estimator. Similarly, a consensus set of X1, X2, and X3 yields F1 as the diagnosis decision, since F1 has been in effect for 7 out of 10 previous diagnosis decisions when [X1, X2, X3] had high contributions. In the testing phase (which follows training), every time the FDO detects an abnormality, all fault discrimination agents provide their classification results for the current process fault. PLSDA estimates the fault type using the regression vectors that relate the process observations to the fault types. FDA calculates the distance of the current data to each of the fault clusters and selects the cluster with the minimum distance measure as the likely fault type. The contribution map estimator uses the patterns of variables with significant contributions to select the fault. The DO agent summarizes all the classification results from all available fault discrimination agents for an operating unit and uses a performance-based criterion (PBC) to finalize the classification of the fault for that unit.
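A minimal sketch of the contribution map estimator's signature-to-fault lookup, seeded with the counts from the Figure 3 example; the class and method names are hypothetical:

```python
from collections import Counter

class ContributionMapEstimator:
    """Hypothetical sketch of the contribution map estimator: it maps a
    consensus set of high-contribution variables (a fault signature) to
    the fault most often in effect when that signature was observed."""

    def __init__(self):
        self.history = {}  # frozenset of variables -> list of actual faults

    def record(self, signature, actual_fault):
        self.history.setdefault(frozenset(signature), []).append(actual_fault)

    def classify(self, signature):
        faults = self.history.get(frozenset(signature), [])
        if not faults:
            return None  # unseen signature
        # majority vote over the faults seen with this signature
        return Counter(faults).most_common(1)[0][0]

est = ContributionMapEstimator()
# From the Figure 3 example: [X1, X2, X3] seen with F1 in 7 of 10 cases
# and with F2 in 3; [X1, X4] seen with F2 in all 10 cases.
for f in ["F1"] * 7 + ["F2"] * 3:
    est.record(["X1", "X2", "X3"], f)
for f in ["F2"] * 10:
    est.record(["X1", "X4"], f)
```

As in the text, the signature [X1, X2, X3] is classified as F1 and [X1, X4] as F2.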

Figure 4. Illustrative comparison of the performances of three competing methods for a new state.

’ PERFORMANCE EVALUATION MECHANISM FOR FAULT CLASSIFICATION AGENTS

The PBC in fault diagnosis relies on the previous diagnosis performances of all fault discrimination agents that participate in the consensus-building. These performances are used as confidence measures of a fault discrimination agent’s effectiveness in correctly diagnosing a certain type of fault. The performance space provides the expected performance of an agent for a given fault signature. The agent performance evaluation occurs in the topmost agent management layer in MADCABS. The details of the performance evaluation are illustrated in Figure 4 for three hypothetical agents A, B, and C. A historical performance space is formed for each competing supervision agent. The performance is measured and recorded along with the state metrics that define the state of the process when the performance was observed. The idea is to compare the current state of the process with the recorded states and estimate the performance of the agents for the current state using their performances for similar states in the historical performance space. Figure 4 shows the details of the performance estimation for Method B. The current state of the process is projected onto the history space, and for each method the closest points to the current state point are determined along with the previous performances for those points. The diagnosis agent is expected to perform similarly to its previous performance in similar situations. Therefore, a distance measure is used in the estimation of performances for each agent for the current state. The agent that gives the highest performance estimate is selected or given a higher reliability weight. The variable contributions are well suited as state metrics that express the diagnosis performance of all fault diagnosis agents, since they represent the signature of an abnormality in the


process. Variable contributions represent the actual state of the process and are independent of the performances of the fault classification agents. Ideally, all of the fault diagnosis agents relate this process state to the correct fault type. The variables that have been contributing to the inflation of each statistic are kept as lists in the fault detection agents. The contributions of process variables that are common to all contribution lists are used for assessing the performance of fault diagnosis agents. The variable lists are slightly modified when used as state metrics in performance spaces. The variable space is an N-dimensional space, where N is the total number of variables of a unit operating under fault. The sets of variable contributions are available from each fault detection agent at the time of each diagnosis, and the variables with large contribution values that appear in all sets are identified. These variables contain the signature of the abnormality. In the N-dimensional performance space, the current fault state is described by an array of 1s and 0s. The fault signature is represented by assigning 1s to the variables that contain the signature of the abnormality and 0s to all other variables. The diagnosis performances of a fault diagnosis agent for these states are added as state points to the performance spaces. The performance space is a mapping of the performance of a diagnosis agent for a specific fault signature on the process variables, and consequently it is a mapping of the agent’s performance for the potential fault type. Additional or alternative state metrics can also be defined to assess the performance of diagnosis agents. The performance of each fault diagnosis agent is assigned a value based on its match to the classification consensus.
If the fault indicated by fault diagnosis agent i is the Confirmed Diagnosis Decision, the performance of agent i is assigned the score 100, otherwise the score 0. A diagnosis decision is made at each sampling time after a fault is detected; however, a confirmed diagnosis decision is formed only after three consecutive and consistent consensus diagnosis decisions. The performance space is updated, and the adaptation may begin only after the diagnosis decision is confirmed. For the initial observations before the diagnosis decision is confirmed, the decisions of all of the agents and the corresponding consensus variables are recorded. At the time of confirmed diagnosis, their historical performance spaces are updated with their diagnosis decisions during these initial observations. If they have correctly diagnosed the fault as the confirmed fault, they are rewarded, and if they have provided false classifications, they are penalized. For example, if any agent classifies the fault type for an arbitrary set of consensus variables as [F1,F1,F2,F2,F2,F1,F1,F1] until the decision is confirmed by the consensus as F1 at the eighth fault observation, the agent’s historical space will have five 100s and three 0s assigned. The values will affect the estimated performance of this agent for the later occurrences of F1. The performances of two hypothetical agents A and B implementing methods A and B in the 3D illustrative example (Figure 4) are assessed for two performance evaluation states (1,0,1) and (1,0,0). Each performance evaluation state, (1,0,1) or (1,0,0) denotes a point in the 3D performance space as (statemetric1, statemetric2, statemetric3) and historical performance values measured are assigned to the corresponding state point. 
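The confirmation and scoring rules above can be sketched as follows; the consensus decision stream is an illustrative assumption, while the agent's decision sequence and the resulting five 100s and three 0s follow the example in the text:

```python
def confirm_and_score(consensus_decisions, agent_decisions):
    """Hypothetical sketch: a diagnosis is confirmed after three
    consecutive, identical consensus decisions; the agent's history then
    receives 100 for each decision matching the confirmed fault and 0
    for each mismatch."""
    confirmed = None
    for i in range(2, len(consensus_decisions)):
        if (consensus_decisions[i] == consensus_decisions[i - 1]
                == consensus_decisions[i - 2]):
            confirmed = consensus_decisions[i]
            break
    if confirmed is None:
        return None, []  # no confirmation yet; keep recording decisions
    scores = [100 if d == confirmed else 0 for d in agent_decisions]
    return confirmed, scores

# Illustrative consensus stream that confirms F1 at the eighth observation
consensus = ["F1", "F2", "F2", "F1", "F2", "F1", "F1", "F1"]
# The agent's decisions from the example in the text
agent = ["F1", "F1", "F2", "F2", "F2", "F1", "F1", "F1"]
confirmed, scores = confirm_and_score(consensus, agent)
```

For this example the confirmed fault is F1, and the agent's historical space receives five 100s and three 0s, as stated in the text.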
For a simple worked-out example, consider that Agent A’s performances for state (1,0,1) are recorded as (100,0,0,0,0) and for state (1,0,0) as (0,0,100,100,100), and B’s performances for the same states as (100,0,100,100,100) and (0,0,0,0,100), respectively. Each performance space contains two state points and


Table 1. Estimated Performances of Agents A and B for Two Example States

agent    performance state    P = 5    P = 10
A        (1,0,1)              20.0     20.004
A        (1,0,0)              60.0     59.996
B        (1,0,1)              80.0     79.994
B        (1,0,0)              20.0     20.006

five historical performances assigned to each state point. The agents’ performances for new states will be estimated based on the recorded states using the nearest P points. P is a parameter that specifies the number of closest points taken into consideration in the estimated performance calculation. The estimated performance is a weighted average of the performances recorded at the closest P points. The weights are inversely proportional to the distance of the new state to the recorded states, so that the performances measured at the most similar historical states (closest points) receive higher weights in the estimation. For state metric (1,0,1), the performance evaluator estimates Agent A’s future performance as 20.0 considering the nearest five points, or as 20.004 (computation details are provided as Supporting Information) for P = 10 (Table 1). P = 5 includes only the points that are relevant to each state point, whereas P = 10 includes all the data points for both state points. The estimated performance of Agent A is higher than that of Agent B for operating condition (1,0,0), whereas Agent B has the higher performance for condition (1,0,1). Since there are 12 variables measured and monitored for Reactor 11, for which various faults are simulated in the Results and Discussions section, the performance space is 12D. The performance evaluation works the same way as in the illustrative example in this section, except that the performance space is larger and more than 10 closest points are used in the performance estimation. For a specific fault that affects variable 4 only, the state metric would be (0,0,0,1,0,0,0,0,0,0,0,0), and the performances recorded for that state would be used in the assessment of the future performance of an agent for similar states.
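A sketch of the nearest-P estimation applied to Agent A's history; the exact weighting used in MADCABS is given in the Supporting Information, so the inverse-distance form 1/(d + ε) used here is an assumption, and its P = 10 value only approximates the 20.004 reported in Table 1:

```python
import numpy as np

def estimate_performance(history, current_state, P, eps=1e-6):
    """Weighted average of the P historically recorded performances whose
    state points lie closest to the current state; closer points receive
    higher weights. The weight form 1/(d + eps) is an assumption."""
    states = np.array([s for s, _ in history], dtype=float)
    perfs = np.array([p for _, p in history], dtype=float)
    d = np.linalg.norm(states - np.array(current_state, dtype=float), axis=1)
    idx = np.argsort(d)[:P]           # indices of the P closest points
    w = 1.0 / (d[idx] + eps)          # inverse-distance weights
    return float(np.sum(w * perfs[idx]) / np.sum(w))

# Agent A's history from the worked example: five performances recorded
# at each of the two state points (1,0,1) and (1,0,0)
history_A = ([((1, 0, 1), p) for p in (100, 0, 0, 0, 0)] +
             [((1, 0, 0), p) for p in (0, 0, 100, 100, 100)])

est_p5 = estimate_performance(history_A, (1, 0, 1), P=5)    # exactly 20.0
est_p10 = estimate_performance(history_A, (1, 0, 1), P=10)  # slightly above 20
```

With P = 5 only the five coincident points are used, giving their plain average of 20.0; with P = 10 the five distant points at (1,0,0) pull the estimate slightly upward, mirroring the direction (though not the exact magnitude) of the Table 1 values.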

’ FAULT DIAGNOSIS CONSENSUS CRITERION AND ADAPTATION

The diagnosis manager resides in the agent management layer. It evaluates and records the performances of fault discrimination agents when the process is operating with an abnormality. The diagnosis manager provides the performance estimates of each fault discrimination agent to the DO, which then forms the consensus decision based on the PBC (Figure 5). In Figure 5, circles with dark solid lines, dotted lines, and light solid lines represent PLSDA, FDA, and contribution map estimator agents, respectively. Multiple instances indicate the ability to use variants of the same techniques. In the figure, there are two instances of each of these three agents to underline that one set uses an adaptation mechanism, retraining their models after each confirmed diagnosis, and the other set does not. The performance estimates are gathered from the historical performance space for each agent as soon as the variable contributions are available and the fault signature is extracted. All performance estimates and all classification decisions are listed for each agent for the current fault observation. The agent with the highest


performance estimate is selected, and its classification decision is the consensus diagnosis decision. If multiple agents share the same

Figure 5. Fault diagnosis consensus mechanism with adaptive and static instances of different diagnosis agents. Circles with dark solid lines, dotted lines, and light solid lines represent PLSDA, FDA, and contribution map estimator agents, respectively.

Figure 6. Performance-based consensus decision criterion.


ranking with the same estimated performance values, then, depending on the individual decisions, the decisions of both or all of these agents are considered. If the performances of the two top-ranking agents are the same but their fault decisions differ, then all of the agents participate in the consensus (Figure 6). The observations that the FDA- or PLSDA-type agents misclassified are added as new rows to the respective fault class (the confirmed diagnosis class) of the initial fault data matrix, and the agent’s model is updated (Figure 7). For Fault F1, if any agent yields [F1,F1,F2,F2,F2,F1,F1,F1] for eight consecutive fault observations, this corresponds to adding the three observations that yielded F2 to the model data as new observations that represent F1-type fault behavior. If the agent is a contribution map estimator, F1 is added to the fault list on the right-hand side of the table that corresponds to the set of consensus variables observed during the wrong F2 decisions, namely observations 3 to 5. Only the lists of the relevant sets of consensus variables are updated, and the rest of the lists remain unchanged. The adaptation continues after the consensus diagnosis decision is confirmed. If any of the agents provides a misclassification after the confirmed diagnosis, depending on the type of the agent, either the models or the list maps are updated similarly. All instances of an agent class have common agent-class-specific properties and unique agent-specific properties. The unique properties of fault discrimination agents are their methodologies. The class-specific properties include the classified fault type and the diagnosis performances. When a new fault discrimination agent is created, the methodology that the agent will use is identified and the training fault matrix is provided. Using the method and the data, the agent updates its classified fault type and receives a performance value for its classification performance at each fault observation.
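The retraining step for FDA- or PLSDA-type agents can be sketched as follows; the function name and data shapes are illustrative:

```python
import numpy as np

def adapt_model_data(fault_data, fault_labels, observations,
                     agent_decisions, confirmed_fault):
    """Hypothetical sketch of the adaptation step: observations the agent
    misclassified are appended as new rows to the confirmed fault's class
    in the training data, after which the agent rebuilds its model."""
    wrong = [obs for obs, dec in zip(observations, agent_decisions)
             if dec != confirmed_fault]
    if wrong:
        fault_data = np.vstack([fault_data, np.array(wrong)])
        fault_labels = fault_labels + [confirmed_fault] * len(wrong)
    return fault_data, fault_labels

# Eight two-variable fault observations and the decisions from the F1
# example in the text; F1 is the confirmed diagnosis
observations = [[float(i), 0.0] for i in range(8)]
decisions = ["F1", "F1", "F2", "F2", "F2", "F1", "F1", "F1"]
data0 = np.zeros((4, 2))              # illustrative initial training data
labels0 = ["F1", "F1", "F2", "F2"]
new_data, new_labels = adapt_model_data(data0, labels0, observations,
                                        decisions, "F1")
```

The three observations misclassified as F2 are appended as F1 rows, matching the example in the text; the agent would then retrain its FDA or PLSDA model on the augmented data.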
A confirmed diagnosis decision is formed, and the performances of the methods are updated based on their classification results. The agents that have shown poor performance in classification may adapt by adding the new fault data to their historical data sets and rebuilding their models. In the object-based environment of Java and Repast Simphony,46 two instances each of the PLSDA and FDA agents, one with a static and one with an adaptive model, and a single adaptive contribution map estimator

Figure 7. Adaptation of the fault discrimination agents for a new fault observation of type F1.

dx.doi.org/10.1021/ie102058d |Ind. Eng. Chem. Res. 2011, 50, 9138–9155

Industrial & Engineering Chemistry Research


Figure 8. Two of the interconnected CSTRs and the associated sensors and actuators.

agent are used simultaneously in the case studies. Adaptive agents use an adaptation mechanism whereby they retrain their models after each confirmed diagnosis; the static agents do not.
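The retraining mechanism can be sketched as follows, assuming each adaptive agent stores its training data as one matrix per fault class; `refit` is a placeholder for rebuilding the FDA or PLSDA model:

```python
import numpy as np

class AdaptiveAgent:
    def __init__(self, fault_data):
        # fault_data: dict fault label -> (n_obs x n_vars) training matrix
        self.fault_data = fault_data
        self.refit()

    def refit(self):
        # Placeholder for rebuilding the FDA/PLSDA model from self.fault_data.
        self.n_train = {f: X.shape[0] for f, X in self.fault_data.items()}

    def adapt(self, observations, decisions, confirmed_fault):
        """Append misclassified observations as new rows of the confirmed
        fault class and rebuild the model, as in Figure 7."""
        wrong = [x for x, d in zip(observations, decisions) if d != confirmed_fault]
        if wrong:
            self.fault_data[confirmed_fault] = np.vstack(
                [self.fault_data[confirmed_fault]] + wrong)
            self.refit()
```

For the decision sequence [F1, F1, F2, F2, F2, F1, F1, F1] with confirmed fault F1, the three observations that yielded F2 are appended to the F1 class data before refitting.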

’ CASE STUDY: AUTOCATALYTIC CSTR NETWORK
Reactor networks have various modes of operation to produce different product grades, and their nonlinearities cause different responses to a specific disturbance depending on the current steady state of the process. Reactor networks hosting multiple autocatalytic species show very complex behavior and provide a good case study.47-49 As the number of steady states of the network increases, autocatalytic species can persist in the network that would otherwise not survive in a single CSTR.47-49

The data for illustrating the performance of the fault diagnosis system are generated by a CSTR network simulator in which three competing species coexist in a CSTR network while consuming the same single resource. The ordinary differential equations modeling the kinetics are coded in C and connected to Repast Simphony through the Java Native Interface (JNI). At each sampling time, the data from the simulator are written to a database in MADCABS for use by MADCABS agents. Reproduction and death cycles for each species are represented by the set of isothermal autocatalytic reactions

    R + 2P_i → 3P_i
    P_i → D_i                                                        (1)

where R is the resource continuously fed to the CSTR, and D_i is the so-called dead (or inactive) form of species P_i (i = 1, 2, 3). The reaction rate constants k_i are the species growth rate constants for the first reaction, and k_di are the species death rate constants.

The CSTRs are interconnected in a rectangular grid network, and each has an inlet and an outlet flow. The feed flow contains pure resource that is consumed by the species in the reactor. Each reactor can host multiple species; one species is always dominant, and the other species are usually found in trace amounts, mainly due to the interconnection flows between the CSTRs. The production objective is to produce a desired product grade in the network: the product collected from the reactors should meet the desired grade. For a network hosting three species, the desired production grade used in the case study is 30:30:40, which corresponds to having 30% species 1, 30% species 2, and 40% species 3 in the combined product from all of the CSTRs in the network.
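For a single reactor with these kinetics, the resource and species balances take the following form. This sketch omits the interconnection flows, and the parameter values are illustrative rather than the simulator's:

```python
import numpy as np

def cstr_rhs(t, y, F_over_V=1.0, R_in=1.0,
             k=(10.0, 12.0, 14.0), kd=(0.05, 0.05, 0.05)):
    """Right-hand side for one isothermal CSTR hosting three
    autocatalytic species: R + 2P_i -> 3P_i, P_i -> D_i.
    y = [R, P1, P2, P3]; neighbor interconnection flows are omitted."""
    R, P = y[0], y[1:]
    growth = np.array(k) * R * P**2          # rate of R + 2P_i -> 3P_i
    dR = F_over_V * (R_in - R) - growth.sum()
    dP = -F_over_V * P + growth - np.array(kd) * P
    return np.concatenate(([dR], dP))
```

The signature follows the `(t, y)` convention, so the function can be handed directly to an ODE solver such as `scipy.integrate.solve_ivp`; the washout state `y = [R_in, 0, 0, 0]` is a steady state of this sketch.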

Figure 9. (4 × 5) grid network with three dominant species.

For a CSTR, the available manipulated variable set includes the feed flow rate and the interconnection flow rates to and from its neighbors. Each manipulated variable can be accessed by only one controller at a given time. The outflow of the reactor is adjusted to keep the reactor volume constant (Figure 8). The feed is pure resource in the base case, but the simulator supports the addition of species to the CSTR through the feed flow. This is used to simulate a disturbance in which a species other than the dominant species in the reactor mixes into the pure resource feed and causes fluctuations in the CSTR concentrations. Neither the magnitude of the disturbance nor the concentration of a species in the feed is monitored; the disturbance is detected from the changes in the concentrations of the resource and all species in the reactor. For each reactor, the variables used in FDD are the feed flow rate, the outgoing interconnection flow rates, and the species and resource concentrations in the reactor. To illustrate the performance of the FDD tools for actuator faults, it is assumed that flow sensors are not available and that concentrations are measured with concentration sensors. Other abnormalities a CSTR can experience are a change in the feed flow rate and changes in the outgoing interconnection flow rate to, or the incoming interconnection flow rate from, a neighbor reactor with the same or a different dominant species. These, together with the effect of a nondominant species mixing into the otherwise pure resource feed, are simulated as a case study, and the effectiveness of performance-based diagnosis consensus and agent adaptation is illustrated.
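Such an actuator drift can be sketched as a ramp signal added to the nominal value; the time points and magnitude below mirror the Fault B description, and the function name is illustrative:

```python
def ramp_fault(t, t_start, t_end, final_magnitude):
    """Ramp disturbance: 0 before t_start, linear rise to
    final_magnitude at t_end, held constant afterwards."""
    if t <= t_start:
        return 0.0
    if t >= t_end:
        return final_magnitude
    return final_magnitude * (t - t_start) / (t_end - t_start)

# e.g. species 1 leaking into the pure-resource feed, 0 -> 0.06 over 30 points
leak_conc = [ramp_fault(t, 100, 130, 0.06) for t in range(90, 140)]
```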

’ RESULTS AND DISCUSSION
Diagnosis of Single Faults. The faults and disturbances simulated are listed in Table 2. Four unique actuator (valves regulating flow rates) faults (Faults A, B, C, and D) are simulated between times 210 and 660 for data collection for model training. The faults are simulated for reactor 11 (node 10) (Figure 9). The concentration data, namely the dominant species, resource, and two other species concentrations in the reactor, are shown in Figure 10.

Fault A simulates a ramp change in the feed flow rate to Node 10. This type of fault is common in chemical processes and occurs when a pipe gradually clogs and restricts flow, or when the actuator providing the flow fails over time. The


Figure 10. Sensor data for Node 10 (Reactor 11). (a) Species 1 in the reactor in response to Fault B, (b) Resource concentration in the reactor in response to Fault A, (c) Dominant species (Species 2) concentration in the reactor in response to Fault A, (d) Species 3 in the reactor in response to Fault C.

effect of Fault A is realized mostly in the dominant species and resource concentrations in the reactor. The resource concentration in the reactor initially drops, causing a rapid decrease in the dominant species concentration, and the resource concentration starts increasing when less of the dominant species remains to consume the resource.

Fault B simulates the mixing of a nondominant species into the feed flow, which consists of pure resource under normal operating conditions. In practice, this happens if the product species are used in the feed for initiating reactions in the CSTRs and one of their valves deteriorates in time and leaks. The most significant change is observed in the Species 1 concentration for Fault Type B, where the actuator that regulates the flow of Species 1 in the feed deteriorates in time and Species 1 leaks into the reactor. A drift from an initial value of 0 to 0.06 over 30 time points causes the Species 1 concentration in the reactor to increase suddenly. All of the drifts reach high values by the end of the fault duration compared to the normal variation in the data; however, the fault detection agents are able to detect the drift at its early stages, well before the drift reaches its peak. Since fault diagnosis is triggered immediately after a fault is confirmed at the beginning of the drift, the misclassification rates at the early stages, and their progress as the magnitude of the drift increases, can be observed.

Fault C and Fault D simulate ramp increases in the interconnection flow rates between reactors. The effect of Fault D is not observed in the originating reactor but is realized in the receiving reactor. It is detected by the agents of the originating reactor since


the outgoing flow rates are monitored, and it is detected in the receiving reactor because the reactor concentrations change as more of the originating reactor's contents are introduced to the receiving reactor. Fault C simulates the scenario in which the fault affects the incoming flow from a neighboring reactor with a different dominant species. Species 3 in the receiving reactor increases significantly, signaling a process fault. The neighboring reactor also signals because of a faulty outgoing interconnection flow rate, which is a Type D fault.

Four different types of actuator faults are simulated in a repeating manner during the period between times 210 and 2990 (Table 2). Every simulation starts with a training phase during which faulty operation data corresponding to each fault type are collected. The diagnosis training agent requests data from the database for the time periods 210-260, 310-360, 510-560, and 610-660 and provides the fault data matrix with the corresponding fault information to all fault discrimination agents. The models for PLSDA and FDA are formed. The contribution map estimator gets the variable contribution lists from the fault detection agents for the same time periods, and the fault classification map is created. For the initial performance calculations, the model data and the classification performances of all fault discrimination agents on the model data are used.

The classification results of the fault models for the training data of one example run (time 210-660) are given in Figure 11. The y-axes show the fault class provided by the corresponding agent. All fault discrimination agents misclassify some of the observations, especially in the beginning of each fault, even though these are the data points that were used in model building. For example, Fault A is misclassified as Fault B or C by FDA and PLSDA and could not be classified by the contribution map estimator in the beginning.
For Faults A, B, and C, all agents have similar performances. Most of the misclassifications occur at the beginning, since for ramp changes it takes time for the fault to come into full effect. For Fault D, the contribution map estimator is superior to the others for the example run, even at the beginning stages of the fault.

After the models are built and the performance space is constructed for all agents in the training phase, the testing phase starts. The faults are simulated in a repeating manner: Fault A is simulated five times, and Faults B, C, and D are simulated two, five, and six times, respectively (Table 2). A3 denotes the third occurrence of Fault A in the testing phase. The misclassification rates of all agents are calculated for each fault, and the changes in the performances of all agents and the effects of adaptation are illustrated.

For comparison with the performance-based diagnosis scheme, a voting-based scheme is discussed first. The misclassification rates for voting-based consensus using the PLSDA, FDA, and contribution map estimator agents are calculated. A voting-based criterion (VBC) considers the maximum number of agents providing the same classification result: in a scheme with three agents, a diagnosis decision is formed when two or more agents provide the same diagnosis result. The decisions of each fault discrimination agent are shown in Figure 13, where the y-axis shows the fault decision made by the corresponding fault discrimination agent. If the classification results are different for all three agents at a sampling time, VBC cannot provide a decision. The consensus decision via VBC based on the individual decisions shown in Figure 13 is illustrated in Figure 14. Two points in Figure 14 (indicated by arrows) represent observations for which a diagnosis decision was not provided by VBC and are labeled as unclassified. Most misclassifications are for


Table 2. Faults Used in Model Training and Testing

      fault type   variable                                   time interval   magnitude (ramp)

Model Building (Training)
  1   Fault A      resource feed flow rate                    210-260         9%
  2   Fault B      nondominant species in feed                310-360         0 to 0.07
  3   Fault C      interconnection flow rate from neighbor    510-560         30%
  4   Fault D      interconnection flow rate to neighbor      610-660         30%

Testing
  1   Fault A1     resource feed flow rate                    750-780         6%
  2   Fault B1     nondominant species in feed                880-910         0 to 0.06
  3   Fault C1     interconnection flow rate from neighbor    1010-1040       27%
  4   Fault D1     interconnection flow rate to neighbor      1140-1170       27%
  5   Fault A2     resource feed flow rate                    1270-1300       6%
  6   Fault B2     nondominant species in feed                1400-1430       0 to 0.06
  7   Fault C2     interconnection flow rate from neighbor    1530-1560       27%
  8   Fault D2     interconnection flow rate to neighbor      1660-1690       27%
  9   Fault A3     resource feed flow rate                    1790-1820       6%
 10   Fault A4     resource feed flow rate                    1920-1950       6%
 11   Fault A5     resource feed flow rate                    2050-2080       6%
 12   Fault C3     interconnection flow rate from neighbor    2180-2210       27%
 13   Fault C4     interconnection flow rate from neighbor    2310-2340       27%
 14   Fault D3     interconnection flow rate to neighbor      2440-2470       27%
 15   Fault D4     interconnection flow rate to neighbor      2570-2600       27%
 16   Fault D5     interconnection flow rate to neighbor      2700-2730       27%
 17   Fault D6     interconnection flow rate to neighbor      2830-2860       27%
 18   Fault C5     interconnection flow rate from neighbor    2960-2990       27%

Figure 11. Classifications of the model data. From top to bottom, the figures represent the classification results for PLSDA, FDA, and contribution map estimator.

Fault B and Fault D. Since both PLSDA and FDA have many misclassifications for these two fault types, the voting-based consensus decision is affected and the consensus yields wrong decisions. The numbers of misclassifications for all three fault discrimination agents are shown in Figure 12. The PLSDA agent (Figure 12(a))

misclassified Fault B as Fault C twice during B1 and once during B2, and as Fault A once during B2. Fault D was misclassified a total of seven times by the same agent: twice during D2, once during D5, and once during D6 as Fault C; once during D5 as Fault B; and once during D2 and once during D3 as Fault A. The figures


Figure 12. Misclassifications of fault diagnosis agents for static, voting-based scheme. (a) PLSDA, (b) FDA, and (c) contribution map estimator.

Figure 13. Classifications of fault diagnosis agents for static, voting-based scheme.

are read similarly for the FDA agent and the contribution map estimator agent in Figure 12(b) and (c).

To illustrate the effectiveness of adaptation and performance evaluation, a scheme with five agents using PBC is selected. These agents are adaptive PLSDA, adaptive FDA, adaptive contribution map estimator, PLSDA, and FDA. The performance spaces are built initially using the performances of the agents in diagnosing the model data in Figure 11. Initially, the performance spaces of the adaptive agents are the same as those of the static agents, and P is set to 50. As the performance space is updated with each new state metric during the simulation, the number of performance evaluation points P increases by one, so that the new state point is incorporated into the performance estimation calculations. Since new points contain the improvement in an agent's performance for a specific fault, the

estimated performance also reflects the improvement and increases in time.

The classification results of all fault discrimination agents for an example run are shown in Figure 15. There are differences in the classification results of static agents compared to adaptive agents during the periods under Fault D. At the second occurrence of Fault D (D2, time period 1660-1690), the static FDA agent misclassifies a fault observation, whereas the adaptive FDA agent correctly classifies that observation as Fault D. Even though they use the same technique and the same fault observation data, this outcome shows that the adaptation of the FDA agent for Fault D has been successful. Since the static PLSDA agent did not yield a misclassification for the same occurrence of the fault, the success of its adaptation at that instance (D2) is not observed. For PLSDA, the adaptation is realized later, at D5 (Table 3).
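The growing performance space can be sketched as follows. The Euclidean similarity rule and radius are placeholder assumptions, since the state-similarity metric itself is not restated here:

```python
class PerformanceSpace:
    """Growing history of (state, performance) points for one agent
    and one fault type; starts with P initial evaluation points."""
    def __init__(self, initial_points):
        self.points = list(initial_points)   # (state_vector, performance)

    def record(self, state, performance):
        # Each new state metric enlarges the space, so P grows by one.
        self.points.append((state, performance))

    def estimate(self, state, radius=1.0):
        # Average performance over historical states close to `state`;
        # Euclidean distance is an illustrative similarity measure.
        similar = [p for s, p in self.points
                   if sum((a - b) ** 2 for a, b in zip(s, state)) ** 0.5 <= radius]
        return sum(similar) / len(similar) if similar else 0.0
```

Recording a new good-performance point near a given state raises the estimate returned for that state, which is how adaptation shows up in an agent's ranking over time.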


Figure 14. Consensus diagnosis result in a static, voting-based consensus.

Figure 15. Classifications of individual agents in a performance-based scheme for an example run.

Misclassification numbers for all of these agents for the example run (Figure 15) are provided in Table 3. The misclassification numbers of the adaptive agents improve in time, as shown in bold text. The adaptive FDA agent gives fewer misclassifications than the static FDA agent for Fault Type D starting with D2, and for faults D3 and D6. Adaptive PLSDA gives fewer misclassifications than static PLSDA for fault D5. The adaptation of FDA is more successful than the adaptation of PLSDA for Fault Type D. The success of adaptation depends on the number of misclassifications: if an agent has many misclassifications at the first occurrence of a certain type of fault, the model adaptation starts earlier and continues with every false diagnosis; if the agent does not yield many misclassifications, the model update does not happen as frequently and the adaptation takes longer. The performance-based consensus criterion has no misclassifications for the example run (Figure 16).

The adaptations of the PLSDA and FDA agents are realized more quickly than that of the contribution map estimator agent. The latter uses a map from sets of significantly changed variables to lists of fault types and chooses the fault type that makes up the majority of the list as the potential fault. Successful adaptation of this agent necessitates a sufficient number of observations with the correct fault type for each fault signature.

The estimated performances of all five agents for Fault A are shown in Figure 17. For A1, the agents have similar performance estimates (Figure 17(a)). PLSDA and adaptive PLSDA have higher performance estimates than the contribution map estimator and the FDA agents. For A2 at time 1270, the expected performance of the contribution map estimator for one observation in the beginning is much lower than the expected performances of the other agents (Figure 17(b)). This can happen if for that


Table 3. Number of Misclassifications for Example Run with Performance-Based Consensus Criterion(a)

fault number   fault type   PLSDA   FDA   adaptive PLSDA   adaptive FDA   contribution map estimator
      1        Fault A1       0      0          0                0                   0
      2        Fault B1       1      0          1                0                   0
      3        Fault C1       0      0          0                0                   1
      4        Fault D1       1      1          1                1                   0
      5        Fault A2       0      0          0                0                   1
      6        Fault B2       1      1          1                1                   0
      7        Fault C2       2      1          2                1                   0
      8        Fault D2       0      1          0                0                   0
      9        Fault A3       0      0          0                0                   0
     10        Fault A4       0      0          0                0                   3
     11        Fault A5       0      0          0                0                   0
     12        Fault C3       0      0          0                0                   0
     13        Fault C4       0      0          0                0                   0
     14        Fault D3       2      2          2                1                   0
     15        Fault D4       0      0          0                0                   0
     16        Fault D5       1      1          0                1                   0
     17        Fault D6       0      1          0                0                   0
     18        Fault C5       0      0          0                0                   0

(a) Adaptive agents yield fewer misclassifications than nonadaptive agents, as shown with boldface font.

Figure 16. Consensus diagnosis result using performance-based criterion for the example run.

observation the fault signature is not fully developed or none of the variable contributions is above its normal operation (NO) confidence limit. The contribution map estimator may not be able to provide an estimate (the estimate is equal to zero) when the variable contributions are not significantly higher than their NO values and, hence, should not be considered in the consensus for these observations. The performance evaluation successfully predicts a lower performance for the contribution map estimator in Figure 17(b).
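The contribution map estimator's rule, abstention included, can be sketched as follows; the signature-to-label map structure is assumed from the description in the text:

```python
from collections import Counter

def classify_by_contribution_map(fault_map, significant_vars):
    """fault_map: dict frozenset(variables) -> list of fault labels
    observed with that signature. Returns the majority label, or
    None when no variable exceeds its NO limit or the signature
    is unknown (the agent then abstains from the consensus)."""
    key = frozenset(significant_vars)
    if not key or key not in fault_map:
        return None
    labels = fault_map[key]
    return Counter(labels).most_common(1)[0][0]

def adapt_map(fault_map, significant_vars, confirmed_fault):
    """Append the confirmed fault to the list for this signature only;
    lists for other signatures stay unchanged."""
    fault_map.setdefault(frozenset(significant_vars), []).append(confirmed_fault)
```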

Misclassifications of the contribution map estimator during the time periods 1270-1300 and 1920-1950 (Figure 15) affect the agent's performance history, and the agent drops to the lowest rank for Fault A after the misclassifications during 1920-1950 (Figure 17). Adaptive FDA starts with a higher performance estimate than static FDA at time 1920 because the performance manager calculates an average performance based on the historical performances for all states similar to the current state and estimates a higher performance for adaptive FDA than


Figure 17. Estimated performances of agents for Fault A over time: (a) estimated performances during A1, (b) estimated performances during A2, and (c) estimated performances during A4.

Figure 18. Estimated performances of agents for Fault D over time: (a) estimated performances during D2, (b) estimated performances during D3, and (c) estimated performances during D6.

FDA based on their historical performances. At time 1920, the performances of the agents are ranked, starting with the best, as adaptive PLSDA, PLSDA, adaptive FDA, FDA, and finally the contribution map estimator (Figure 17(c)).

For Fault D, the best performing agent for the example run is the contribution map estimator. The fault signature develops quickly since the outgoing interconnection flow rate is

monitored and the variable contribution quickly exceeds the NO confidence limits. For Fault D, PLSDA, FDA, and their adaptive instances have relatively higher misclassification rates than the contribution map estimator during D1 at time 1140 (Figure 15). During D2, only the FDA agent gives a misclassification (Figure 15). The effect of this misclassification on the FDA agent's estimated performance after the diagnosis is confirmed as


Table 4. Misclassification Rates (Average Results of 100 Runs)

fault number   fault type   voting-based consensus   performance-based consensus
      1        Fault A1            0.0041                     0.0015
      2        Fault B1            0.0290                     0.0076
      3        Fault C1            0.0027                     0.0033
      4        Fault D1            0.0221                     0.0069
      5        Fault A2            0.0054                     0.0015
      6        Fault B2            0.0234                     0.0010
      7        Fault C2            0.0020                     0.0011
      8        Fault D2            0.0208                     0.0045
      9        Fault A3            0.0036                     0.0025
     10        Fault A4            0.0033                     0.0019
     11        Fault A5            0.0043                     0.0018
     12        Fault C3            0.0018                     0.0040
     13        Fault C4            0.0034                     0.0026
     14        Fault D3            0.0250                     0.0051
     15        Fault D4            0.0174                     0.0038
     16        Fault D5            0.0270                     0.0017
     17        Fault D6            0.0240                     0.0024
     18        Fault C5            0.0038                     0.0025

Fault D is shown in Figure 18(a). When the fault is confirmed as Type D by the performance-based diagnosis, the performance history spaces are updated, and the estimated performance of the FDA agent for Fault D decreases below the estimated performances of all other agents. When Fault D reoccurs as D3, the contribution map estimator has the highest estimated performance, whereas the FDA agent has the lowest (Figure 18(b)). For D6, FDA starts with a low performance estimate, and the performance gets lower after another history space update with another misclassification at time 2836. The differences in the estimated performances of adaptive PLSDA, adaptive FDA, and PLSDA are more pronounced when zoomed in at time 2860 (Figure 18(c)): adaptive PLSDA has a higher performance than adaptive FDA, which has a higher performance than PLSDA.

For Fault D, even though four other agents provide the same wrong fault class, the performance-based consensus is built on the basis of only one agent, namely the contribution map estimator, which has a history of successful classifications for Fault D for the example run. For Fault A, the contribution map estimator could not provide good classification results, especially during the beginning of the faulty periods, and the performance-based consensus decision is formed on the basis of the adaptive PLSDA and PLSDA agents.

The diagnosis results using only PLSDA, FDA, and the contribution map estimator in a voting-based consensus, versus using all five agents in a performance-based consensus, are given in Table 4. The numbers are averages over 100 runs. The misclassification rate is the fraction of misclassified diagnosis decisions during the period from the detection time of a fault to the end of the fault period. There is a significant reduction in the average misclassification rate for Fault Types B and D when PBC is used instead of VBC (Table 4).
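Computed per fault occurrence, the rate is simply (a sketch; taking the denominator as the number of diagnosis decisions in the period is an assumption):

```python
def misclassification_rate(decisions, true_fault):
    """Fraction of diagnosis decisions made between fault detection
    and the end of the fault period that disagree with the true fault."""
    wrong = sum(1 for d in decisions if d != true_fault)
    return wrong / len(decisions)
```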
For Fault Type C, the misclassification rate is similar for both criteria. In the receiving reactor the fault is detected only after the concentrations in the reactor have been

Figure 19. Mean misclassification rates of all agents in time (average results of 30 runs). PLSDA, adaptive PLSDA, FDA, adaptive FDA, and contribution map estimator are denoted by +, O, *, Δ, and □, respectively.

Table 5. Actuator Faults Simulated for Diagnosis of Multiple Faults

     fault type          variable                       time interval   magnitude (ramp)

Model Building (Training)
 1   Fault A             resource feed flow rate        210-260         9%
 2   Fault B             nondominant species in feed    310-360         0 to 0.01
 3   Fault E (A + B)     simultaneous A and B faults    410-460         0 to 0.01 + 9%

Testing
 1   Fault A1            resource feed flow rate        600-630         9%
 2   Fault B1            nondominant species in feed    730-760         0 to 0.01
 3   Fault E1 (A + B)    simultaneous A and B faults    860-890         0 to 0.01 + 9%
 4   Fault E2 (A + B)    simultaneous A and B faults    990-1020        0 to 0.01 + 9%
 5   Fault E3 (A + B)    simultaneous A and B faults    1120-1150       0 to 0.01 + 9%

affected significantly due to the increased flow coming in with a different composition. By the time the fault is detected in the receiving reactor and the diagnosis agents are activated, in approximately 7-8 time points, the fault signature in the reactor is already established, and the diagnosis agents do not give many misclassifications thereafter. If the agents do not give misclassifications, adaptation does not occur; this is the reason why there is no significant adaptation for Fault C in time. Although there is no significant adaptation, the misclassification rate is small compared to Faults B and D. For Fault A, the average misclassification rate over 100 runs with PBC is slightly smaller than that with VBC, and both criteria give few misclassifications for Fault A.

The average misclassification rates of all agents in the performance-based consensus scheme for 30 different runs are given in Figure 19. The subplot for Fault D in Figure 19(d) (bottom right) provides the average misclassification rates of the five agents for all occurrences of Fault D during the course of a run (D1 to D6). Fault D is repeated six times during the course of a run, as fault numbers 4, 8, 14, 15, 16, and 17, which are listed on


the x-axis as coordinate values. The y-axis represents the average misclassification rates over 30 runs. In all four plots, the misclassification rates of FDA and adaptive FDA are very similar at the first occurrences of the faults, and PLSDA and adaptive PLSDA are very similar in the beginning as well. For Fault Types A and D, at each repetition of the faults, the difference between the misclassification rates of adaptive PLSDA and static PLSDA, and similarly between adaptive FDA and static FDA, increases, since the adaptive instances yield fewer misclassifications. For Fault C, the average misclassification rate is very small, and this prevented the agents from adapting better for Fault C, since adaptation only occurs when an agent's decisions do not match the confirmed diagnosis decision of the consensus.

Figure 20. Score biplot (PC 2 vs PC 1) showing different fault clusters. The numbers inserted in the plot are time stamps.


Therefore, the misclassification rates of agents with adaptive models are similar to the rates of agents with static models for Fault C.

Diagnosis of Multiple Concurrent Faults. It is important to correctly diagnose multiple simultaneous faults, since in real processes combinations of faults and/or disturbances may occur, and diagnosis should be able to determine whether there are additional faults in the process. Building discriminant models utilizing the data for individual disturbances and their combinations is also effective in the diagnosis of multiple simultaneous faults.

To demonstrate the effectiveness of performance-based diagnosis for multiple simultaneous faults, the fault scenarios given in Table 5 are used. Fault A and Fault B are simulated as drifts with the magnitudes given in Table 5. Fault E is a combination of Faults A and B; namely, both faults are in effect in the system simultaneously. The similarities of the faults are presented in Figure 20. Fault A and Fault B deviate from normal operation starting from the top right, in two different directions toward the bottom of the figure; Fault A is denoted by Δ, and Fault B is denoted by *. The combination fault, Fault E, starts deviating from the normal operation region first toward Fault B and then toward the middle region between the clusters for Fault A and Fault B. Since Fault E shows a very similar deviation trend during the early stages of the drifts, it is expected to be misclassified as Fault B during those time points.

The classification results of the agents for the tested faults (Table 5) are given in Figure 21. Fault A is misclassified as Fault E by the PLSDA and FDA agents in the beginning of the drift for the example run in Figure 21. The FDA agent has a very high misclassification rate for Fault E between times 860 and 890. For the first occurrence of Fault E (E1), there are many misclassifications.
Fault E is especially misclassified as Fault B, an expected outcome since the score biplot in Figure 20 shows the similarity between these two faults in the beginning.
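A score plot of this kind can be reproduced in outline by projecting pooled, mean-centered fault data onto the first two principal components; this generic SVD-based sketch is not the paper's monitoring model:

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project mean-centered data onto the leading principal
    components via SVD; rows of the result are (PC1, PC2) scores."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# Scores for data pooled from Faults A, B, and E would then be plotted
# with one marker per fault class to reveal the overlapping clusters.
```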

Figure 21. Classifications of individual agents for diagnosis of multiple faults using the adaptive performance-based scheme for the example run.


Figure 22. Consensus diagnosis result using performance-based criterion for the example run with multiple faults.

Table 6. Misclassification Numbers for Example Run with Multiple Faults

Figure 23. Estimated performances of agents for Fault E (concurrent Faults A and B).

The contribution map estimator correctly diagnoses Faults A and B when they occur separately; however, the agent cannot classify Fault E as successfully and yields a few misclassifications, all of which occur in the beginning of the fault duration. The misclassification rate of the adaptive FDA agent is smaller than that of FDA for Fault E at each occurrence of the fault.

Using the performance-based consensus criterion in diagnosis yields the classifications shown in Figure 22. For Faults A and B, PBC yields no misclassifications. In the beginning of E1, there are three misclassifications, for which Fault E is misclassified as Fault B.

The estimated performances of all five agents are shown in Figure 23 for all occurrences of Fault E in time. Due to the

fault      PLSDA   FDA   adaptive PLSDA   adaptive FDA   contribution map estimator   diagnosis result
Fault A1     2      3          1                3                    0                       0
Fault B1     0      0          0                0                    0                       0
Fault E1     4     10          4                4                    4                       3
Fault E2     4      1          3                0                    1                       0
Fault E3     4      5          2                2                    1                       0

large number of misclassifications of the FDA agent during E1, the estimated performance of the FDA agent (denoted by *) decreases in time. The contribution map estimator starts with a lower estimated performance based on its misclassifications in the beginning; however, its performance increases in time. The agents with the lowest performance estimates for the example run are the static instances of PLSDA and FDA. Even though they correctly classify the fault toward the end of the fault duration for both E2 and E3, the increase in their performances is not sufficient to change their rankings among the other agents. The rankings of the agents, in decreasing order for the example run, are contribution map estimator, adaptive FDA, adaptive PLSDA, PLSDA, and FDA.

The misclassification numbers for the example run are given in Table 6. For Fault B, there are no misclassifications. For Fault A, PLSDA has fewer misclassifications than FDA, with adaptive PLSDA yielding the minimum number. For Fault E, the misclassification rate is higher than for the other faults: when multiple faults occur simultaneously, it is more difficult to determine which fault is in effect, since the fault signature carries information from the others. Based on the misclassification numbers given in Table 6, PBC in MADCABS yielded few misclassifications during E1 and no misclassifications for the other faults. The adaptive


Table 7. Average Misclassification Rate of Five Runs for the Performance-Based Diagnosis Criterion

fault type                average misclassification rate
Fault A1                  0.0667
Fault B1                  0.0143
Fault E1 (Faults A + B)   0.1500
Fault E2 (Faults A + B)   0.0571
Fault E3 (Faults A + B)   0.0571

PLSDA and FDA agents yielded fewer misclassifications than the static agents, suggesting successful adaptation for Faults A and E. The average misclassification rates of the performance-based diagnosis criterion for each fault, averaged over five different runs, are given in Table 7.
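The performance-based consensus mechanism discussed above can be sketched as follows. This is a minimal illustration under stated assumptions, not MADCABS's actual implementation: the exponentially weighted update rule, the `decay` factor, and the `DiagnosisAgent` class are hypothetical stand-ins for the paper's performance evaluation layer.

```python
from collections import Counter

class DiagnosisAgent:
    """Hypothetical stand-in for a diagnosis agent; only the running
    performance estimate matters for this sketch."""
    def __init__(self, name, performance=1.0):
        self.name = name
        self.performance = performance

def update_performance(agent, correct, decay=0.9):
    # Assumed exponentially weighted update: recent (mis)classifications
    # dominate, old outcomes fade, so agent rankings can change in time.
    agent.performance = decay * agent.performance + (1 - decay) * float(correct)

def majority_vote(votes):
    # Voting-based consensus: every agent's fault-class vote counts equally.
    return Counter(votes).most_common(1)[0][0]

def pbc_decision(agents, votes):
    # Performance-based consensus: each vote is weighted by the voting
    # agent's estimated performance; the heaviest class wins.
    scores = Counter()
    for agent, fault_class in zip(agents, votes):
        scores[fault_class] += agent.performance
    return scores.most_common(1)[0][0]

# Two highly ranked agents voting "E" can outweigh three poorly
# ranked agents voting "B":
agents = [DiagnosisAgent("FDA", 0.2), DiagnosisAgent("PLSDA", 0.3),
          DiagnosisAgent("contribution map estimator", 0.9),
          DiagnosisAgent("adaptive FDA", 0.8),
          DiagnosisAgent("adaptive PLSDA", 0.7)]
votes = ["B", "B", "E", "E", "B"]
print(majority_vote(votes))         # "B" (simple vote count)
print(pbc_decision(agents, votes))  # "E" (performance-weighted)
```

The example shows why a performance-based criterion can beat plain voting: a majority of historically unreliable agents no longer outvotes a minority of reliable ones.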

’ CONCLUSIONS The fault scenarios presented in this paper suggest that no single method or agent provides the best results for all fault types. This supports the necessity of a dynamic multiagent system with multiple alternative built-in techniques that can switch from relying on the diagnosis of one agent to another when the operating conditions or fault types change. MADCABS automatically determines the most successful agent for a certain type of fault based on historical performances and forms a “consensus” on the basis of that agent’s fault classifications. Because the adaptive environment created by MADCABS utilizes a dynamic performance evaluation mechanism, the best-performing agent is determined automatically on an ongoing basis for each run; the ranking of the agents can change from run to run and can also change within a run after each performance evaluation. A performance-based consensus criterion yields a lower misclassification rate than a voting-based consensus criterion, and the misclassification rates of the agents decrease in time via adaptation.

’ ASSOCIATED CONTENT


Supporting Information. Details of the FDA, PLSDA and contribution plot methodologies are included. This material is available free of charge via the Internet at http://pubs.acs.org.

’ AUTHOR INFORMATION Corresponding Author

*E-mail: [email protected].

’ ACKNOWLEDGMENT This work is supported by the National Science Foundation Grant CTS-0325378 of the ITR program.

’ REFERENCES
(1) Jackson, J. E. Principal Components and Factor Analysis: Part I-Principal Components. J. Qual. Technol. 1980, 12, 201–213.
(2) MacGregor, J. F.; Kourti, T. Statistical Process Control of Multivariable Processes. Control Eng. Pract. 1995, 3, 403–414.
(3) Kourti, T.; MacGregor, J. F. Process Analysis, Monitoring and Diagnosis Using Multivariate Projection Methods. Chemom. Intell. Lab. 1995, 28, 3–21.


(4) Geladi, P.; Kowalski, B. R. Partial Least Squares Regression: A Tutorial. Anal. Chim. Acta 1986, 185, 1–17.
(5) Hoskuldsson, A. A Combined Theory for PCA and PLS. J. Chemom. 1995, 9, 91–123.
(6) Hoskuldsson, A. PLS Regression Methods. J. Chemom. 1988, 2, 211–228.
(7) Doymaz, F.; Romagnoli, J. A.; Palazoglu, A. Orthogonal Nonlinear Partial Least Squares Regression. Ind. Eng. Chem. Res. 2003, 42, 5836–5849.
(8) Qin, S. J.; Valle, S.; Piovoso, M. J. On Unifying Multiblock Analysis With Application to Decentralized Process Monitoring. J. Chemom. 2001, 15, 715–742.
(9) MacGregor, J. F.; Jaeckle, C.; Kiparissides, C. Process Monitoring and Diagnosis by Multi-Block PLS Methods. AIChE J. 1994, 40, 826–838.
(10) Westerhuis, J. A.; Kourti, T.; MacGregor, J. F. Analysis of Multiblock and Hierarchical PCA and PLS Models. J. Chemom. 1998, 12, 301–321.
(11) Wold, S.; Kettaneh, N.; Tjessem, K. Hierarchical Multiblock PLS and PC Models for Easier Model Interpretation and as an Alternative to Variable Selection. J. Chemom. 1996, 10, 463–482.
(12) Russell, E. L.; Chiang, L. H.; Braatz, R. D. Fault Detection in Industrial Processes Using Canonical Variate Analysis and Dynamic Principal Component Analysis. Chemom. Intell. Lab. 2000, 51, 81–93.
(13) Miller, P.; Swanson, R. E. Contribution Plots: A Missing Link in Multivariate Quality Control. Appl. Math. Comput. Sci. 1998, 8, 775–792.
(14) Westerhuis, J. A.; Gurden, S. P.; Smilde, A. K. Generalized Contribution Plots in Multivariable Statistical Process Monitoring. Chemom. Intell. Lab. 2000, 51, 95–114.
(15) Alcala, C.; Qin, J. Reconstruction-based Contribution Analysis for Process Monitoring. Automatica 2009, 45, 1593–1600.
(16) Conlin, A. K.; Martin, E. B.; Morris, A. J. Confidence Limits for Contribution Plots. J. Chemom. 2000, 14, 725–736.
(17) Qin, S. J.; Li, W. Detection and Identification of Faulty Sensors in Dynamic Processes. AIChE J. 2001, 47, 1581–1593.
(18) Kosebalaban, F.; Cinar, A. Integration of Multivariable SPM and FDD by Parity Space Technique for a Food Pasteurization Process. Comput. Chem. Eng. 2001, 25, 473–491.
(19) Negiz, A.; Cinar, A. Statistical Monitoring of Multivariable Dynamic Processes with State Space Models. AIChE J. 1997, 43, 2002–2020.
(20) Dunia, R.; Qin, S. J. Joint Diagnosis of Process and Sensor Faults Using Principal Component Analysis. Control Eng. Pract. 1998, 6, 457–469.
(21) Raich, A.; Cinar, A. Statistical Process Monitoring and Disturbance Diagnosis in Multivariable Continuous Processes. AIChE J. 1996, 42, 995–1009.
(22) Raich, A.; Cinar, A. Diagnosis of Process Disturbances by Statistical Distance and Angle Measures. Comput. Chem. Eng. 1997, 21, 661–673.
(23) Duda, R. O.; Hart, P. Pattern Classification and Scene Analysis; Wiley: New York, 1973.
(24) Chiang, L. H.; Kotanchek, M. E.; Kordon, A. K. Fault Diagnosis Based on Fisher Discriminant Analysis and Support Vector Machines. Comput. Chem. Eng. 2004, 28, 1389–1401.
(25) Chiang, L. H.; Russell, E. L.; Braatz, R. D. Fault Diagnosis in Chemical Processes Using Fisher Discriminant Analysis, Discriminant Partial Least Squares, and Principal Component Analysis. Chemom. Intell. Lab. 2000, 50, 243–252.
(26) Doymaz, F.; Bakhtazad, A.; Romagnoli, J. A.; Palazoglu, A. Wavelet-Based Robust Filtering of Process Data. Comput. Chem. Eng. 2001, 25, 1549–1559.
(27) Lu, N.; Wang, F.; Gao, F. Combination Method of Principal Component and Wavelet Analysis for Multivariate Process Monitoring and Fault Diagnosis. Ind. Eng. Chem. Res. 2003, 42, 4198–4207.
(28) Venkatasubramanian, V.; Chan, K. A Neural Network Methodology for Process Fault Diagnosis. AIChE J. 1989, 35, 1993–2002.


(29) Venkatasubramanian, V.; Rengaswamy, R.; Kavuri, S.; Yin, K. A Review of Process Fault Detection and Diagnosis: Part III: Process History Based Methods. Comput. Chem. Eng. 2003, 27, 327–346.
(30) Wong, J.; McDonald, K.; Palazoglu, A. Classification of Abnormal Plant Operation Using Multiple Process Variable Trends. J. Process Control 2001, 11, 409–418.
(31) Ng, Y. S.; Srinivasan, R. Multi-agent Based Collaborative Fault Detection and Identification in Chemical Processes. Eng. Appl. Artif. Intell. 2010, 23, 934–949 (DOI: 10.1016/j.engappai.2010.01.026).
(32) Doymaz, F.; Romagnoli, J. A.; Palazoglu, A. A Strategy for Detection and Isolation of Sensor Failures and Process Upsets. Chemom. Intell. Lab. 2001, 55, 109–123.
(33) Doymaz, F.; Chen, J.; Romagnoli, J. A.; Palazoglu, A. A Robust Strategy for Real-Time Process Monitoring. J. Process Control 2001, 13, 343–359.
(34) Perk, S.; Teymour, F.; Cinar, A. Statistical Monitoring of Complex Chemical Processes with Agent-Based Systems. Ind. Eng. Chem. Res. 2010, 49, 5080–5093 (DOI: 10.1021/ie901368j).
(35) Leung, D.; Romagnoli, J. Dynamic Probabilistic Model-Based Expert System for Fault Diagnosis. Comput. Chem. Eng. 2000, 24, 2473–2492.
(36) Dash, S.; Venkatasubramanian, V. Challenges in the Industrial Applications of Fault Diagnostic Systems. Comput. Chem. Eng. 2000, 24, 785–791.
(37) Leung, D.; Romagnoli, J. An Integration Mechanism for Multivariate Knowledge-Based Fault Diagnosis. J. Process Control 2002, 12, 15–26.
(38) Rao, M.; Yang, H. Integrated Distributed Intelligent System Architecture for Incidents Monitoring and Diagnosis. Comput. Ind. 1998, 37, 143–151.
(39) Yoon, S.; MacGregor, J. F. Fault Diagnosis with Multivariate Statistical Models Part I: Using Steady State Fault Signatures. AIChE J. 2001, 47, 1581–1593.
(40) Tatara, E.; Cinar, A. An Intelligent System for Multivariable Statistical Process Monitoring and Diagnosis. ISA Trans. 2002, 41, 255–270.
(41) Undey, C.; Tatara, E.; Cinar, A. Real-Time Batch Process Supervision by Integrated Knowledge-Based Systems and Multivariate Statistical Methods. Eng. Appl. Artif. Intell. 2003, 16, 555–566.
(42) Undey, C.; Ertunc, S.; Tatara, E.; Teymour, F.; Cinar, A. Batch Process Monitoring and Quality Prediction for Batch/Fed-Batch Cultivations. J. Biotechnol. 2004, 108, 61–77.
(43) Norvilas, A.; Negiz, A.; Cinar, A. Intelligent Process Monitoring by Interfacing Knowledge-Based Systems and Multivariate Statistical Monitoring. J. Process Control 2000, 10, 341–350.
(44) Johnson, R. A.; Wichern, D. W. Applied Multivariate Statistical Analysis; Prentice Hall: Englewood Cliffs, NJ, 1998.
(45) Nomikos, P. Detection and Diagnosis of Abnormal Batch Operations Based on Multiway Principal Component Analysis. ISA Trans. 1996, 35, 147–168.
(46) North, M. J.; Howe, T. R.; Collier, N. T.; Vos, J. R. Repast Simphony Runtime System, 2005.
(47) Tatara, E.; Birol, I.; Cinar, A.; Teymour, F. Measuring Complexity in Reactor Networks with Cubic Autocatalytic Reactions. Ind. Eng. Chem. Res. 2005, 44, 2781–2791.
(48) Tatara, E.; Teymour, F.; Cinar, A. Control of Complex Distributed Systems with Distributed Intelligent Agents. J. Process Control 2007, 17, 415–427.
(49) Tatara, E.; North, M.; Hood, C.; Teymour, F.; Cinar, A. Agent-Based Control of Spatially Distributed Chemical Reactor Networks. In Engineering Self-Organising Systems; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, 2006; Vol. 3910.
