Article Cite This: Ind. Eng. Chem. Res. 2019, 58, 12041−12053


Process Fault Prognosis Using Hidden Markov Model−Bayesian Networks Hybrid Model Mihiran Galagedarage Don and Faisal Khan*


Centre for Risk, Integrity, and Safety Engineering (C-RISE), Faculty of Engineering & Applied Science, Memorial University of Newfoundland, Newfoundland, Canada A1B 3X5

ABSTRACT: A Hidden Markov Model−Bayesian Networks (HMM−BN) hybrid system, coupled with a novel prediction technique, is employed to predict and isolate 10 identified faults in the Tennessee Eastman (TE) process. The HMM is trained offline with Normal Operating Condition data and is then used to train the BN. Using the same trained HMM, a history of Log Likelihood (LL) values of process fault data is generated. The trained HMM then determines the LL values of online data strings, compares them with the LL history, and predicts the most likely future state of the system. This information is fed to the BN as likelihood evidence to isolate the root cause. The system successfully predicts all 10 selected faults of the TE process while accurately isolating 8 of them. The maximum level of noise that can be handled is presented along with the respective results. These results set a benchmark for future prediction and isolation studies of the TE process.

1. INTRODUCTION AND REVIEW OF THE RELEVANT LITERATURE

The present study introduces the use of a hybrid system developed by Don and Khan,1 coupled with a novel prediction technique, to predict and isolate 10 identified process faults in the Tennessee Eastman (TE) process. Initially, the method mentioned above was used to detect and diagnose faults already taking place. The current study is distinguished from the original in that a new prediction technique is employed to predict and isolate potential faults (i.e., prognosis). Although there are methods and techniques available for early detection and diagnosis of the faults mentioned above, there is no literature available on their prediction and isolation. Therefore, this literature survey covers a wide variety of application areas where different prediction techniques are employed.

Process fault prediction and isolation is currently a highly demanded area of study, as it can reduce most of the associated risks, including financial, health, and reputational risks. Numerous studies have been carried out over the years to solve similar problems in applications such as software, electronic circuits, cement manufacturing, distillation columns, computer disk drives, mechanical bearings, chemical and biological processes, metal smelting, semiconductors, stock market prediction, and foreign exchange. Among the techniques used, Artificial Neural Networks (ANN), Genetic Programming (GP), Bayesian methods, Naïve Bayes Submodels, the Naïve Bayes Classifier, the Hidden Markov Model (HMM), the Gaussian Mixture Model (GMM), Principal Component Analysis (PCA), Kernel Principal Component Analysis (KPCA), Decision Trees, K Nearest Neighbors (k-NN), Canonical Variable Trend Analysis (CVTA), Locally Linear Neuro-Fuzzy (LLNF) models, Particle Predictors, Generalized Linear Auto-Regression (GLAR), Fuzzy Models, and Multiway Partial Least Squares are commonly found in most of the applications. According to the available literature, HMM−BN combined fault prediction and isolation has not been explored, and there is excellent potential to outperform the available techniques. Therefore, the present work aims to utilize the hybrid approach presented by Don and Khan1 for fault prediction and isolation using a novel prediction technique.

In general, there are two variations of machine learning approaches, called supervised learning and unsupervised learning, as mentioned in the publication by Kevin Murphy.2 In supervised learning, the algorithms assess the input data and corresponding outputs to learn the mapping function from input to output. In unsupervised learning, the algorithms identify the hidden structures of the data for further evaluation. Classification and regression come under supervised learning, while clustering comes under unsupervised learning. The Hidden Markov Model (HMM) is an unsupervised learning technique based on clustering. Various machine learning techniques have been used as prediction

Received: January 28, 2019. Revised: June 8, 2019. Accepted: June 16, 2019. Published: June 16, 2019.
DOI: 10.1021/acs.iecr.9b00524. Ind. Eng. Chem. Res. 2019, 58, 12041−12053


tools in a variety of applications. There are abundant examples in the literature of both individual and combined use of machine learning techniques.

As an example of the above-mentioned machine learning techniques, the failure prediction methodology for a partially observable system presented by Kim, Jiang, Makis, and Lee3 can be introduced. They model the system behavior as a three-hidden-state, continuous-time homogeneous Markov process. States 0 and 1 are not observable and represent normal and warning conditions, respectively; failure state 2 is observable. The Expectation Maximization (EM) algorithm is employed for model parameter estimation, and a cost-optimal Bayesian fault prediction scheme is also applied. A comparison with other prediction techniques is given.

Bearing failure prediction was successfully made by Fulufhelo and Nelwamondo4 using an HMM and GMM hybrid approach. They introduce feature extraction methodologies that can facilitate early detection of faults. Time domain vibration signals of faulty and standard bearings in rotating machinery are used for feature extraction, and HMM and GMM are then used to classify faults based on the extracted features. Based on the classification performance, HMM is superior to GMM; they further mention that HMM has the disadvantage of being computationally expensive.

Further, Hamerly and Elkan proposed a technique to predict hard disk failures, which are rare but costly.5 Two Bayesian methods are introduced, namely a mixture model of Naïve Bayes submodels and a Naïve Bayes classifier. The former method uses the EM algorithm, while the latter is a supervised learning approach. Both techniques show good prediction accuracy on real-world data. Also, Murray6 confirms that the rank-sum method outperforms other techniques such as Support Vector Machines (SVMs).

Two improved versions of the Self-Monitoring and Reporting Technology (SMART) failure prediction system are proposed by Hughes, Murray, Kreutz-Delgado, and Elkan.7 The proposed techniques give several times higher prediction accuracy than error thresholds on hard disk drives which are prone to fail, at a 0.2% false alarm rate.

Euclidean distance based feature selection is proposed by Kim, Han, and Lee8 for a fault detection and prediction model in the semiconductor synthesizing process. As the first step, the features of the semiconductor manufacturing process are measured in terms of Mean Absolute Value and Standard Deviation (SD). After that, using the Euclidean distance, the most appropriate features are selected using the classification model. Finally, with the filtered features, a neural network is trained to generate a fault prediction model. The proposed method performs well in semiconductor manufacturing process fault prediction.

Neural network modeling is used by Bekat and Erdogan9 to predict the amount of bottom ash accumulated in a pulverized coal-fired power plant. Operating data collected throughout one year and the properties of the coal processed are used in the prediction. The network architecture is feedforward, trained by backpropagation with three layers, and the sigmoid function is used as the activation function. The authors have determined the ideal parameters for accurate predictions, along with the most useful metrics to monitor, based on a sensitivity analysis.

To minimize the error and reduce the maintenance cost, prediction of an incipient fault in transformer oil is essential. An Artificial Neural Network (ANN) and Particle Swarm Optimization (PSO) based methodology is presented by Illias, Chai, Bakar, and Mokhlis10 to predict the incipient transformer fault. It has been found that the proposed ANN−PSO method gives the leading percentage of correct identification.

On the other hand, Jiang, Wey, and Fanin11 propose an algorithm to predict faults in analog circuits. The central concept is continuous monitoring of the component values, which are evaluated according to consecutive voltage measurements taken at the accessible test points at each periodic maintenance. This approach makes it possible to locate faulty components as well as components which are predicted to fail shortly.

Based on a multi-PCA model, a methodology for multiple-mode process fault detection is presented by Ma and Xu.12 It also includes techniques for fault estimation and fault prediction. The multi-PCA model is initially used to detect faults in a process operated at steady state under different conditions; for the transition process, a weighted algorithm is used. The fault amplitude is made consistent by using a consistent estimation algorithm, and finally a Support Vector Machine is used to predict the fault amplitude changing pattern. The method's performance is proven by applying and testing it on Tennessee Eastman process data.

A prediction technique is introduced by Gao and Liu13 that is developed from an improved version of the Kernel Principal Component Analysis (KPCA) method, using indiscernibility and eigenvector concepts. The application area is process fault prediction for distillation columns. This version of KPCA can remove variables with almost no correlation to the fault being monitored and can reduce the number of data strings used several times over. The proposed methodology performs better than the traditional technique. By applying the method to a distillation column scenario, the authors have shown that the KPCA method is capable of predicting process failures caused by a small disturbance.

Weighted least-squares support vector machines regression is used by Jiang, Gui, and Ding14 to develop a Hammerstein model to predict the dynamic behavior and possible faults of the Imperial Smelting Furnace. The proposed model is capable of accurate fault predictions for the furnace.

Further, Ramana, Sapthagiri, and Srinivas15 have introduced a prediction methodology for the quality of injection molding products based on a machine learning approach, which has shown a prediction accuracy of 95%. It is an effective method to increase productivity by eliminating defectives during the production process. To build the data mining models, Decision Tree and k-NN techniques are used, trained on a data set developed from actual production data of a given product; the prediction accuracy is then tested using a testing data set developed similarly.

Prediction of emerging faults in dynamic industrial processes is achieved by Hu, Deng, Cao, and Tian16 using an approach based on CVTA. Canonical variables, the uncorrelated latent features extracted through the analysis of process dynamics, are the leading information carried forward to make the predictions. The initial analysis is done with the Canonical Variate Analysis (CVA) algorithm, while SVM is employed to identify the relationship between past and future values. This facilitates the development of a time series prediction model for the canonical variables. Change of the process status is forecasted using an overall monitoring statistic and based on


the predicted canonical variables. They have demonstrated the effectiveness of the technique by applying it in a simulation of a Continuous Stirred Tank Reactor (CSTR) system.

An intelligent algorithm for fault prediction of the turbine pitch system is proposed by Deng,17 based on Least Squares Support Vector Machines (LS-SVM) parameter optimization. Initially, the data of the SCADA system are analyzed; through this, four kinds of input parameters are selected which are strictly related to turbine pitch system faults. Then minimum output coding (MOC) is introduced to construct multiple-classification LS-SVMs to handle the multiclass classification of pitch faults. Later, to select the optimal feature parameters for the multiclass LS-SVM classifiers, the particle swarm optimization algorithm is employed. The proposed methodology is applied to a pitch fault prediction scenario in wind farms, and its performance is assessed with reference to the back-propagation neural network algorithm and the standard SVM algorithm; the proposed method is found to be superior.

A novel approach for power converter fault prediction in power conversion systems is presented by Y. Di.18 Decision Tree and SVM are used; the two take into account the changes in working conditions and the imbalances of data, respectively. It was validated with an industrial application to be useful in predicting power converter failures.

A prediction method for the cement rotary kiln process is proposed by Sadeghian and Fatehi19 using a nonlinear system identification method. First, the suitable inputs and outputs are selected, and a model of inputs and outputs is identified for the complete system. To identify the various operating points of the kiln, the Locally Linear Neuro-Fuzzy (LLNF) model is used, trained with an incremental tree structure algorithm. The methodology is used to develop three models, one for normal operating conditions and the other two for two faulty situations. The fault can be predicted 7 min before occurrence.

To calculate the probability of fault prediction, a particle predictor-based method has been proposed by Chen, Zhou, and Liu,20 which can be used for nonlinear time-varying systems. As illustrated by simulations, the proposed methodology is capable of triggering an early alarm before the system reaches the faulty state. Although the conventional Particle Filter cannot cope with unknown time-varying parameters, the Particle Predictor can. As further explained, it is almost impossible to predict abrupt faults of a system; nevertheless, slowly developing faults can be predicted with the use of an online monitoring system.

For nonlinear stochastic systems, an incipient fault prediction methodology is proposed by Ding and Fang.21 This fault estimation algorithm is developed based on a particle filter, with some simplifying assumptions on the incipient faults made without losing critical details. A novel fault detection strategy called "intuitive fault detection" is presented. When incipient faults are detected, nonlinear regression is used to identify the respective parameters, and based on the parameters determined, the oncoming fault signal is predicted. Finally, a standard simulation is employed to assess the performance of the proposed methodology.

Wind turbine Remaining Useful Life (RUL) is an important parameter for maintaining the reliability of the service. By employing an Adaptive Neuro-Fuzzy Inference System (ANFIS) along with Particle Filtering (PF) approaches, Cheng, Qu, and Qiao22 propose a method for fault isolation and gearbox RUL prediction. The fault features are extracted from the stator current of the generator coupled with the gearbox. The extracted fault features are used to train the ANFIS, and the PF predicts the RUL; new information about the fault features is also used. The proposed method is found effective based on the experimental results.

A similar problem is successfully solved by Zhao, Li, Dong, Kang, Lv, and Shang.23 It is shown that the RUL of the wind turbine is predictable 18 days ahead with nearly 80% accuracy, and generator faults are diagnosable with an accuracy of 94% once they have occurred. The benefit of the proposed system is that it does not require any additional hardware installation; the already available SCADA system serves as the source of information, and hence it is cost efficient. With a detailed analysis, the authors selected SVM as the most suitable classification technique for this particular application among ANN, Bayes classifier, and k-NN classifiers.

Prediction of the foreign exchange rate is also a deeply researched area. A nonlinear ensemble forecasting model is implemented by Yu, Wang, and Lai24 and recommended as a tool for the exchange rate forecasting process. It combines generalized linear autoregression (GLAR) with Artificial Neural Networks (ANN) for accurate predictions. The performance of the combined model is assessed with reference to the two individual forecasting tools (i.e., GLAR and ANN); as further explained, the new integrated approach is more accurate than the GLAR and ANN individual systems.

Similar to foreign exchange prediction, stock market forecasting is also a well-researched area. An HMM-based tool has been developed for predicting the stock market.25 An HMM is used to scan for similar patterns in the past data; based on the previous trends, the forecasting is done by interpolating the adjacent log likelihood values of the data sets. The results show the high potential of the HMM approach in predicting the stock exchange. A combined model of HMM and fuzzy models is proposed in ref 25 by Hasan and Nath. The HMM identifies hidden data patterns, and fuzzy logic is used to generate a forecast value. The entire data space is partitioned based on the log likelihood of each data string, and these partitions are used to generate the fuzzy rules. The proposed system outperforms ANN and the AutoRegressive Integrated Moving Average (ARIMA) in comparison.

As a further improvement, Cao, Li, Coleman, Belatreche, and McGinnity26 propose an addition to the methodology introduced by Nguyen.27 In addition to the process historical data, they use the likelihoods of the model for the most recent data set. Further, they use the developed models to predict stock closing prices of Apple, Google, and Facebook using single observation data and multiple observation data; it is concluded that multiple observation data perform better in stock price predictions. Also, the Maximum a Posteriori HMM approach, another prediction tool, is introduced by Gupta and Dhingra.28 Given historical data, this method can forecast the stock value for the coming day. To train the continuous HMM, they utilize the high and low values in a given day and the fractional change in the stock value. A Maximum a Posteriori decision is made using the trained HMM. This approach has also shown the excellent potential of HMM in stock price prediction.


Figure 1. Process flow diagram of the TE process.38

Real-time supervision of bioprocesses is beneficial in quality control. A novel, efficient modeling and supervision technique based on multiway partial least-squares is presented by Ündey, Tatara, and Çınar.29 The method can predict the quality of the batch at the end of its growth. A real-time knowledge-based system (RTKBS) is employed for process monitoring, fault diagnosis, and quality estimation. Using a fed-batch penicillin production benchmark process simulator, they have validated the performance of the methodology.

A data-driven fault prediction method is proposed by Wang30 which can quantify the degree of abnormality through probability density estimation and can be used to monitor the state of a complex system quantitatively. They first define an index to quantify the degree of abnormality. Next, a single slack factor multiple kernel SVM probability density estimation model is employed to improve the computational efficiency; in addition, this improves the data mapping performance. The resulting model is capable of providing a rapid estimation with higher precision, and the degree of abnormality is found to be accurately measurable by the abnormality index.

In most applications, systems tend to show some characteristic signals before the actual appearance of a fault. Nevertheless, the extraction of those features for the fault prediction process is challenging. A novel study was done by Baek and Kim31 to address this issue. They introduce two important definitions, called the symptom pattern and the symptom period, and then present a methodology for symptom pattern extraction which collects all evidence from sensor signals related to faults that can occur shortly. This study is based on the assumption that there is a period before the occurrence of a fault which carries the symptoms characteristic of that particular fault. The proposed method is validated using a scenario related to unusual cylinder temperature in a marine diesel engine and engine knocking in automotive gasoline engines.

According to Gabbar,32 Fault Propagation Analysis (FPA) is the foundation for ensuring the proper management of abnormal situations and safe operation of chemical and petrochemical plants. Gabbar introduced a framework in 2007 to assess and develop fault propagation scenarios in complex industrial applications. The importance of this study is that there is a provision to dynamically tune the developed Fault Semantic Network (FSN) by feeding in information on real-time process data, simulation data, and, most importantly, human experience. This can be considered a more efficient approach to developing the structure of a BN and an alternative to Fault Tree Analysis, Event Tree Analysis, and Signed Directed Graphs. As a further development, a method based on dynamic fault semantic networks was presented by Gabbar and Hossein in 2013,33 in which the fault semantic network is probabilistically tuned by combining it with Bayesian belief theory. This can be introduced as a robust method that can estimate interaction strengths among process variables. Further, in 2014, Gabbar and Boafo34 used the FSN developed by Gabbar in 200732 to represent fault knowledge based on interrelationships between objects; the performance of the fault diagnosing system is enhanced by the introduction of the FSN.

The remainder of this paper is organized as follows. Section 2 provides the fundamentals of HMM and BN and how they are integrated; section 3 details the methodology and its testing; section 4 discusses the results, while section 5 presents the conclusions.


2. PRELIMINARIES

2.1. Hidden Markov Model (HMM). A Markov chain is a random process that involves several different states. There are relationships between the states due to state transitions. Each transition has a transition probability linked to it, and each state has an observation linked to it. The main characteristic of a Markov process is that the transition to the next state depends only on the current state and not on any of the former states. The specialty of this technique is that the actual state sequence is not observable; therefore, it is called the Hidden Markov Model. An HMM with a discrete output probability distribution can be represented as λ = {A, B, π}, where λ is the model, and A = {aij}, B = {bj(k)}, and π = {πi} stand for the state transition probability distribution, the observation probability distribution, and the initial state distribution, respectively. If Si is a given state, the parameters are defined as aij = P(qt+1 = Sj | qt = Si), 1 ≤ i, j ≤ N; bj(k) = P(Ok | qt = Sj), 1 ≤ j ≤ N, 1 ≤ k ≤ M; and πi = P(q1 = Si), 1 ≤ i ≤ N. Here, qt, N, Ok, and M stand for the state at time t, the number of states, the kth observation, and the number of distinct observations, respectively. Model λ can generate the probability of an observation sequence of visible states. This probability is calculated using eq 1 based on bj(k):

P(O | λ) = Σ_{all S} π_{S_0} ∏_{t=0}^{T−1} a_{S_t S_{t+1}} b_{S_{t+1}}(O_{t+1})   (1)

Table 1. Faults in the TE Process and Their Respective Root Causes38

  fault ID | description                                        | type             | root cause
  A        | E feed loss                                        | step             | VARIABLE (3)
  B        | reactor cooling water inlet temperature            | random variation | VARIABLE (9)
  C        | condenser cooling water inlet temperature          | random variation | VARIABLE (11)
  D        | reactor cooling water valve                        | sticking         | VARIABLE (9)
  E        | condenser cooling water valve                      | sticking         | VARIABLE (11)
  F        | A/C feed ratio, B composition constant (stream 4)  | step             | VARIABLE (4)
  G        | reactor cooling water inlet temperature            | step             | VARIABLE (9)
  H        | condenser cooling water inlet temperature          | step             | VARIABLE (11)
  I        | A feed loss (stream 1)                             | step             | VARIABLE (1)
  J        | stripper steam valve stiction                      | sticking         | VARIABLE (19)

Table 2. Variables of the Tennessee Eastman Chemical Process38

  variable number | description                                | unit
  (1)             | A feed (stream 1)                          | kscmh
  (2)             | D feed (stream 2)                          | kg/h
  (3)             | E feed (stream 3)                          | kg/h
  (4)             | A and C feed (stream 4)                    | kscmh
  (5)             | recycle flow (stream 8)                    | kscmh
  (6)             | reactor feed rate (stream 6)               | kscmh
  (7)             | reactor pressure                           | kPa gauge
  (8)             | reactor level                              | %
  (9)             | reactor temperature                        | °C
  (10)            | purge rate (stream 9)                      | kscmh
  (11)            | product separator temperature              | °C
  (12)            | product separator level                    | %
  (13)            | product separator pressure                 | kPa gauge
  (14)            | product separator underflow (stream 10)    | m3/h
  (15)            | stripper level                             | %
  (16)            | stripper pressure                          | kPa gauge
  (17)            | stripper underflow (stream 11)             | m3/h
  (18)            | stripper temperature                       | °C
  (19)            | stripper steam flow                        | kg/h
  (20)            | compressor work                            | kW
  (21)            | reactor cooling water outlet temperature   | °C
  (22)            | separator cooling water outlet temperature | °C

Several key algorithms are used within HMM training, namely the k-Means algorithm, the Expectation Maximization (EM) algorithm, and the Viterbi algorithm. k-Means is essentially a hard-assignment algorithm that finds the cluster centers; it converges to a local minimum, and the number of cluster centers (i.e., k) must be supplied as an input. At initiation, the k cluster centers are guessed at random. In the repeating step, each data point is assigned to the nearest cluster center, and the centers are then updated using the corresponding data points; if a cluster becomes empty, it is restarted at a random point. The iteration continues until no assignment changes.

The EM algorithm, on the other hand, is a probabilistic generalization of k-means that also finds the cluster centers; it adapts not only the positions of the clusters but also their covariance matrices. It is probabilistically sound and can be proven to converge in log likelihood space. Like k-means, it converges to a local minimum, and the number of cluster centers k must be known in advance. The mixture density is P(x) = Σ_{i=1}^{k} P(C = i)·P(x | C = i), where P(C = i) = πi is the prior probability of cluster i, and P(x | C = i) is the Gaussian of that cluster with parameters μi and Σi (i = 1, 2, ..., k). The algorithm consists of two main sections, namely the Expectation Step (E-Step) and the Maximization Step (M-Step). In the E-Step, πi, μi, and Σi are assumed to be known. If eij is the probability that the jth data point corresponds to cluster i,

e_ij = π_i (2π)^{−M/2} |Σ_i|^{−1/2} exp[−(1/2)(x_j − μ_i)^T Σ_i^{−1} (x_j − μ_i)]   (2)

where πi stands for the prior probability of the cluster, (2π)^{−M/2}|Σi|^{−1/2} is the normalizer, and the exponential term is the Gaussian expression. In the M-Step, πi is re-estimated as Σj eij divided by the number of data points, μi as Σj eij·xj / Σj eij, and Σi as Σj eij (xj − μi)(xj − μi)^T / Σj eij. Here eij works as a soft correspondence of a data point, acting as a weight in the calculation.

Defining the value of k is the next problem to be solved. This number is not known in real-world applications, yet it is assumed constant. In practical cases, the value of k is guessed and the following expression, called the penalized log likelihood, is minimized:

LL = −Σ_j log P(x_j | π, μ, Σ) + Cost × k   (3)
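The E-Step and M-Step described above can be sketched in a few lines. This is a minimal one-dimensional illustration in Python/NumPy, not the paper's implementation (which uses Matlab and Murphy's HMM Toolbox); the variable names and toy data are invented for the example.

```python
import numpy as np

def e_step(x, pi, mu, var):
    # Responsibilities e[i, j]: probability that data point j belongs to
    # cluster i (eq 2, specialized to one dimension).
    e = np.zeros((len(pi), len(x)))
    for i in range(len(pi)):
        norm = 1.0 / np.sqrt(2.0 * np.pi * var[i])
        e[i] = pi[i] * norm * np.exp(-0.5 * (x - mu[i]) ** 2 / var[i])
    return e / e.sum(axis=0)  # normalize over clusters for each point

def m_step(x, e):
    # Re-estimate priors, means, and variances using the soft
    # correspondences e_ij as weights.
    weights = e.sum(axis=1)
    pi = weights / len(x)
    mu = (e @ x) / weights
    var = np.array([(e[i] * (x - mu[i]) ** 2).sum() / weights[i]
                    for i in range(len(pi))])
    return pi, mu, var

# Toy data: two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(8.0, 1.0, 200)])

pi, mu, var = np.array([0.5, 0.5]), np.array([1.0, 7.0]), np.array([1.0, 1.0])
for _ in range(20):  # EM converges to a local optimum, as noted above
    pi, mu, var = m_step(x, e_step(x, pi, mu, var))
print(np.round(np.sort(mu), 1))  # means recovered near 0 and 8
```

As the text notes, the result depends on the initial guess and only a local optimum is guaranteed; in practice, several random restarts are typically used.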


Here, LL is the log likelihood, and Cost is a constant penalty per cluster. As eq 3 shows, increasing the number of clusters increases the penalty term while increasing the likelihood of the data; typically, the total is minimized at a certain value of k.

A fault diagnosis and prediction system is proposed by Li, Wei, Wang, and Zhou35 which consists of three parts, namely data preprocessing, degradation state detection, and fault diagnosis. For feature extraction, the wavelet transform correlation filter is employed. The proposed methodology is validated to be capable of identifying the system operating state and hence facilitates prediction of the system behavior.

2.2. Bayesian Networks (BNs). As mentioned by Ruggeri,36 a BN is a type of probabilistic graphical model (GM). These graphical structures capture knowledge about an uncertain domain: each node represents a random variable, and the arrows connecting nodes represent probabilistic causal relationships. Known statistical and computational methods are used to estimate the conditional dependencies in the graph. Hence, BNs can be considered a meeting point of graph theory, probability theory, computer science, and statistics. Further, as mentioned by Neapolitan,37 a BN can be used to study the structures of gene regulatory networks. BNs can utilize inputs from both prior knowledge and experimental data to find a root cause, and they are a powerful tool for fault diagnosis, having shown remarkable performance in the fault detection and diagnosis work presented by Amin, Imtiaz, and Khan.38 Therefore, a fault predicted by the HMM can be isolated by the BN. As mentioned by Wu,39 the basic principles of BN are conditional independence and the joint probability distribution, which can be expressed by eqs 4 and 5, respectively.

Figure 2. Establishing HMM and BN.

k

P(V1 , V2 , ..., Vk|ν) =

∏ P(Vi |ν)(i = 1, 2, ..., k) 1

(4)

k

P(V1 , V2 , ..., Vk|ν) =

Figure 3. Overview of prediction and isolation.

∏ P(Vi /Parent(Vi ))(i = 1, 2, ..., k) 1

Table 3. Identified 10 Faults in the TE Process and Their Root Causes

(5)

BNs-based rare event prediction has been studied by Cheon40 and by Cózar, Puerta, and Gámez.41 BN has both causal and probabilistic semantics. Hence, it is powerful in combining actual process historical data and background knowledge. To be used when there are not many details available about all possible values, the authors have proposed a general-purpose decision support system tool. It consists of two BN models: one represents the failure-free behavior of the system, and the other represents abnormal behaviors. This novel system is a robust tool that can be used for health management in industrial environments. 2.3. The Tennessee Eastman (TE) Process. Figure 1 illustrates the process flow diagram of the TE process. This is used to develop the BN which is later trained using HMM. Further, the descriptions of each of the continuous variables are given in Table 1, while Table 2 represents the faults in the TE process and their respective root causes. A detailed explanation of the TE process is presented by Amin, Imtiaz, and Khan.38

fault ID | actual root cause
A | VARIABLE (3)
B | VARIABLE (9)
C | VARIABLE (11)
D | VARIABLE (9)
E | VARIABLE (11)
F | VARIABLE (4)
G | VARIABLE (9)
H | VARIABLE (11)
I | VARIABLE (1)
J | VARIABLE (19)

3. METHODOLOGY
Under this topic, the preprocessing of data, training of HMM, prediction of abnormalities, training of BN, and isolation of the predicted fault using BN are discussed. The main contribution of this work is in the prediction of abnormalities and generation of likelihood evidence for fault isolation. Other

DOI: 10.1021/acs.iecr.9b00524 Ind. Eng. Chem. Res. 2019, 58, 12041−12053


steps of the methodology are adopted from a separate study conducted by Don and Khan.1

The overall methodology can be summarized as follows. As illustrated in Figure 2, data are preprocessed to extract standard operating condition data, and the extracted data are used to establish the HMM. The trained HMM is then used to generate conditional probabilities for the BN. In parallel, the process flow diagram is the input for the development of the qualitative BN (i.e., the structure of the BN), through the development of a Signed Directed Graph (SDG). Then the Conditional Probability Tables (CPTs) of the qualitative BN are filled using the generated conditional probabilities to establish the quantitative BN. The entire software code developed for the above task, using MATLAB software, is available at https://github.com/mihiranpathmika/CPL1.0.

Prediction and isolation of potential faults is the main contribution of this study and is summarized in Figure 3. The incoming real-time data are fed to the trained HMM, and the most likely data string to be received after n seconds is predicted. The predicted data strings are carefully assessed to detect potential future abnormalities. Once an abnormality is detected, that information is sent to the trained BN as likelihood evidence, and the cause of the particular abnormality is isolated. The following sections discuss the prediction and isolation processes in detail.

3.1. Data Preprocessing. Initially, there are 10 data sets for the 10 different faults shown in Table 3. For example, in data set A, the first 1000 data strings represent normal operating conditions, and data strings 1001−1500 represent the respective faulty state. First, standard operating condition data are extracted from each data set and used to train the HMM. Second, to generate a testing data set, ±0.002% of random noise was added to all 22 variables using the Matlab code. This noise level is gradually reduced to find the maximum possible noise level that the testing data set can contain while keeping the accuracy of isolation through the BN.

Figure 4. Training and application process of an HMM.

3.2. Training of HMM and Prediction of the nth Data String. The open-source HMM-toolbox-based Matlab code presented in Don and Khan1 is used in this study. Figure 4 illustrates the training process of the HMM. mu0 and sigma0 are the initial assumptions for the mean and the standard deviation of the mixture of Gaussians, respectively; they are derived through the mixgauss_init function, based on the inputs Q, M, data, O, T, and nex. Then prior0, transmat0, mixmat0, and data, along with mu0 and sigma0, are updated by an EM algorithm to determine LL, prior1, transmat1, mu1, sigma1, and mixmat1. This provides a trained HMM which gives a higher log likelihood when presented with a data set with similar features. Table 4 describes each variable used in training an HMM with the HMM Toolbox developed by Kevin Murphy.42 Further, Figure 4 illustrates the connections between toolbox functions such as mixgauss_init, mhmm_logprob, mixgauss_prob, and viterbi_path; this figure is presented solely to illustrate the information flow from one function to another, and a detailed explanation of each toolbox function is omitted for clarity. In selecting the number of hidden states (Q), the cross-validation technique is used. If there are N training samples and N parameters, a perfect (i.e., error-free) score can be achieved. However, this can be done only for the available data.
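A minimal sketch of this cross-validation-based selection of Q (written in Python for illustration; the study itself uses the Matlab toolbox, and `score_model` here is a stand-in for training an HMM with Q states and evaluating the held-out log likelihood):

```python
import random

def k_fold_indices(n_samples, k):
    """Split sample indices into k roughly equal folds (K-fold CV)."""
    idx = list(range(n_samples))
    random.Random(0).shuffle(idx)  # fixed seed for reproducibility
    return [idx[i::k] for i in range(k)]

def choose_Q(data, candidate_Qs, k, score_model):
    """Pick the number of hidden states Q with the best mean held-out
    score across k folds; score_model(Q, train, held_out) stands in
    for training an HMM and returning the held-out log likelihood."""
    folds = k_fold_indices(len(data), k)
    best_Q, best_score = None, float("-inf")
    for Q in candidate_Qs:
        scores = []
        for fold in folds:
            held_out = [data[i] for i in fold]
            train = [data[i] for i in range(len(data)) if i not in set(fold)]
            scores.append(score_model(Q, train, held_out))
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_Q, best_score = Q, mean_score
    return best_Q
```

With a dummy scorer that peaks at Q = 4, `choose_Q(list(range(20)), [2, 3, 4, 5, 6], 5, lambda Q, tr, ho: -abs(Q - 4))` returns 4, mirroring how the validation score identifies the optimum number of states.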


Table 4. Description of Notations Used

notation | stands for | description
Q | number of hidden states | the optimum value is determined by cross-validation
M | number of mixtures of Gaussians | the optimum value is determined by cross-validation
data | training data | the historical normal operating condition data of the process, used to train the model
O | size of a given vector-valued sequence | the number of different observations for a given sequence, i.e., the number of different sensors that give readings per unit time
T | length of a given vector-valued sequence | the number of observation sequences
nex | number of vector-valued sequences | the observations (i.e., T and O) taken from similar processes
full/spherical/diag | covariance matrix type | the optimum type is selected by trial and error
mixgauss_init | function | estimates initial parameters for a mixture of Gaussians using the K-means algorithm
mu0 | initial parameter: mean | the initial assumption for the mean of the mixture of Gaussians
mu1 | updated mean | the mean updated through the mixgauss_init function for the mixture of Gaussians
sigma0 | initial parameter: standard deviation | the initial assumption for the standard deviation of the mixture of Gaussians
sigma1 | updated standard deviation | the standard deviation updated through the mixgauss_init function for the mixture of Gaussians
prior0 | initial state probability estimate | the initial assumption for the state probability
prior1 | updated state probability | the state probability updated through the EM algorithm
transmat0 | initial state transition probability | the assumed value for the transition probability
transmat1 | updated state transition probability | the state transition probability updated through the EM algorithm
mixmat0 | initial Gaussian mixture matrix | assumed values for the GM matrix
test data | new observations | real-time incoming data can be fed through this variable
LL | log likelihood L(Θ|X) | L(Θ|X) = \prod_{i=1}^{N} p(x_i|Θ) = p(X|Θ)
mhmm_logprob | function | evaluates the log likelihood of a trained model given test data, i.e., computes the log likelihood of a data set using a (mixture of) Gaussians
mixgauss_prob | function | evaluates the pdf of a conditional mixture of Gaussians
viterbi_path | function | determines the most likely hidden-state path the system may take for the given set of observations
loglik | log likelihood of the test data | the log likelihood of the given data sequences (i.e., likelihood of occurrence)
path | path of hidden states | the most probable hidden states for the given set of observations (i.e., test data)
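The LL row of Table 4, L(Θ|X) = \prod_i p(x_i|Θ), becomes a sum of log densities in log form. As a minimal illustration, the sketch below evaluates it for a single Gaussian rather than the toolbox's full mixture-of-Gaussians HMM; the sample values are made up:

```python
import math

def gauss_logpdf(x, mu, sigma):
    """log N(x | mu, sigma^2) for a scalar observation."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def log_likelihood(data, mu, sigma):
    """log L(theta | X) = sum_i log p(x_i | theta): the quantity a trained
    model returns for a data string (here, a single Gaussian model)."""
    return sum(gauss_logpdf(x, mu, sigma) for x in data)

# Hypothetical readings near 0: they score higher under the matching
# model (mu = 0) than under a shifted one (mu = 5), which is exactly how a
# trained HMM gives a higher LL for data with familiar features.
sample = [-0.2, 0.1, 0.05, -0.1]
```

For instance, `log_likelihood(sample, 0.0, 1.0)` exceeds `log_likelihood(sample, 5.0, 1.0)`, the single-Gaussian analogue of a trained HMM assigning a higher LL to normal-operating-condition data.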

Figure 5. 3D matrix of data.

Figure 7. Prediction of nth data string using trained HMM.

Figure 6. Training curve for HMM.


Figure 8. Refining process of the initial prediction.

Figure 9. Predicted vs actual LL of fault A.

That does not mean that the model is strong enough to predict unseen data, for example, the stock price variation in the coming month. If the data are not fitted correctly, the model underfits; if all the noise is fitted as well, the model overfits. Therefore, careful separation of the data is essential. As a solution, the following approach can be taken: the entire data set is separated into several portions, one portion is kept as a validation set while the rest are taken as training data, and the cost on each validation set is then considered. Through this method, the optimum number of states giving the highest validation accuracy can be chosen. A standard way of doing this is K-fold cross-validation, which begins with the separation of the data set into N sets; one set out of N is selected as the validation set while the rest are used to train the system. The value of N is determined based on the mean and the standard deviation. In addition, knowledge of the physical system can also be used in deciding the number of hidden states.

In training the HMM, the data are fed as a 3D matrix to the HMM toolbox as shown in Figure 5. The number of observation sequences (T) is the number of rows in the data set, while the number of different observations in a given data string (O) is the number of columns. There is also a provision for data sets from similar processes, called nex; in the current study, nex = 1. Once the data are fed, the HMM can be trained, and the training curve can be plotted as shown in Figure 6. It shows that the LL value becomes consistent with the number of iterations, meaning that the parameters estimated (i.e., λ) for the HMM are consistently compatible with the training data set. In brief, the HMM is trained for the given data set.

In addition to the above offline work, the trained HMM is then used to predict the nth data string by following the online procedure shown in Figure 7. Prediction with the HMM requires experience of similar past incidents: once the system understands the current state, it scans its memory (i.e., the LL value history) for a similar state, and the prediction is made with respect to that. This is the central concept of the prediction process. If tcurrent is the current time, data strings up to the (tcurrent + n)th are predicted. Once the actual time reaches (tcurrent + n), the system reviews the predicted and observed data strings and creates knowledge which assists in predicting the next n data strings.

In the first step, three adjacent incoming data strings are fed to the HMM, and the respective LL values are evaluated. These LL values are then compared with the history of adjacent LL values. As a result, the closest three adjacent LL values and the respective data strings can be found in past knowledge. Here it is assumed that the same pattern that occurred in the history is likely to occur again; based on that assumption, the nth data string is predicted. As a further improvement of the prediction, the procedure illustrated in Figure 8 is used. It can be observed from Figure 9 that the predicted and actual LL values follow almost the same pattern. Here the solid line represents the actual variation of LL values during the previous data window, and the dotted line represents the prediction made by the HMM for the same window. Therefore, during the revision process shown in Figure 8, the correction of the mean value of


Figure 10. BN for the TE process.38

Figure 11. Procedure for isolation of the fault.


Figure 12. Fault A prediction using HMM.

predicted data with respect to the previous data window is found to be satisfactory.

3.3. Development of Structure and Training of BN. This is the stage at which process knowledge is brought into the prediction and isolation process. The BN structure development begins with the Process Flow Diagram, which is converted to a Signed Directed Graph based on the knowledge of cause and effect. It is then converted to a BN while maintaining the acyclic nature of a BN. The BN for the Tennessee Eastman Process is illustrated in Figure 10.38

3.4. Updating the BN with Likelihood Evidence. In this step, the corresponding time of the future fault is predicted using the HMM. Usually, the HMM takes 20−50 s to identify an abnormality after its occurrence. Therefore, if the HMM identifies an abnormality at time t1, the actual introduction of the abnormality can have happened as early as t1 − 50. For this reason, once an abnormality is detected, the data strings from t1 − 50 to t1 are considered in calculating the likelihood evidence. This likelihood evidence is then fed into the BN for fault isolation.

Isolation Using BN. The methodology for isolation is adopted from the study of Amin, Imtiaz, and Khan.38 Once the BN is updated with the likelihood evidence, the percentage change is evaluated at each node. If the maximum percentage change is in a root node, that node is taken as the cause of the particular problem. If not, the node with the maximum percentage change among the successive parent nodes is taken as the root cause. This process is illustrated in Figure 11.
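Returning to the prediction side (section 3.2, Figures 7 and 8), the LL-history matching can be sketched as follows. This Python sketch matches a window of three adjacent LL values against a stored LL history and reads off the value n steps ahead of the best match; the actual system predicts the full 22-variable data string, not only the LL value:

```python
def predict_nth(ll_history, current_window, n):
    """Predict the LL value n steps ahead of the current window of adjacent
    LL values by finding the closest matching window in the LL history.
    Assumes, as in the text, that a pattern seen in the history is
    likely to recur."""
    w = len(current_window)
    best_i, best_dist = None, float("inf")
    # scan every past window that still has n recorded future samples
    for i in range(len(ll_history) - w - n + 1):
        past = ll_history[i:i + w]
        dist = sum((a - b) ** 2 for a, b in zip(past, current_window))
        if dist < best_dist:
            best_i, best_dist = i, dist
    # read off the value n steps after the best-matching past window
    return ll_history[best_i + w + n - 1]
```

For example, with the toy history `[1, 2, 3, 4, 5, 6, 7, 8, 9]` and current window `[2, 3, 4]`, the best match starts at index 1, so the prediction for n = 2 is the value two steps after that window, namely 6.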

Table 5. Isolation of Fault A

node | initial probability ×10⁻² | final probability ×10⁻² | difference
VARIABLE_1 | 53 | 16 | 37
VARIABLE_2 | 18 | 0 | 18
VARIABLE_3 | 21 | 100 | 79
VARIABLE_4 | 46 | 0 | 46
VARIABLE_5_R | 27 | 100 | 73
VARIABLE_6 | 26 | 16 | 10
VARIABLE_7 | 39 | 100 | 61
VARIABLE_8 | 24 | 100 | 76
VARIABLE_9 | 51 | 16 | 35
VARIABLE_10 | 45 | 84 | 39
VARIABLE_11 | 53 | 100 | 47
VARIABLE_12 | 30 | 84 | 54
VARIABLE_13 | 30 | 100 | 70
VARIABLE_14 | 35 | 100 | 65
VARIABLE_15 | 29 | 16 | 13
VARIABLE_16 | 37 | 16 | 21
VARIABLE_17 | 41 | 84 | 43
VARIABLE_18 | 47 | 84 | 37
VARIABLE_19 | 31 | 0 | 31
VARIABLE_20 | 34 | 84 | 50
VARIABLE_21 | 49 | 16 | 33
VARIABLE_22 | 19 | 84 | 65
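Applying the maximum-change rule of Figure 11 to a few node values from Table 5 can be sketched as follows (a Python illustration that ranks nodes by probability change and skips the parent-node walk of the full procedure):

```python
def isolate(initial, final):
    """Rank nodes by the change in probability after likelihood evidence
    is entered; the node with the largest change is taken as the root
    cause (simplified from the Figure 11 procedure)."""
    changes = {node: abs(final[node] - initial[node]) for node in initial}
    return max(changes, key=changes.get), changes

# Probabilities (x 10^-2) for four of the Table 5 nodes
initial = {"VARIABLE_1": 53, "VARIABLE_3": 21, "VARIABLE_5_R": 27, "VARIABLE_8": 24}
final = {"VARIABLE_1": 16, "VARIABLE_3": 100, "VARIABLE_5_R": 100, "VARIABLE_8": 100}
root, changes = isolate(initial, final)
# root is "VARIABLE_3" with a change of 79, matching Table 5 and the
# actual root cause of fault A
```

The design choice is deliberately simple: once the BN update has been done, isolation reduces to an arg-max over per-node probability changes.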

Table 6. Summary of Results

fault | maximum noise level that can be handled | accurate prediction
A | 10⁻² | yes
B | 10⁻⁷ | yes
C | 10⁻³ | yes
D | 10⁻⁷ | yes
E | 10⁻³ | yes
F | N/A | no
G | 10⁻⁸ | yes
H | 10⁻⁶ | yes
I | 10⁻⁸ | yes
J | N/A | no

4. RESULTS AND DISCUSSION
Under this topic, prediction and isolation results of fault A are presented as a sample. The prediction on the testing data set of fault A is illustrated in Figure 12. According to the figure, fault A is predicted at 995 s, whereas it is actually introduced to the system at 1000 s and grows to a detectable level only by 1020 s. Therefore, this is a decent prediction of the abnormality. Further, Table 5 indicates the change in probability at each node after introducing the likelihood evidence. As illustrated, the highest percentage change is in VARIABLE (3), which is the actual root cause of fault A. This completes the process of prediction and isolation of fault A.

In a similar approach, the rest of the faults were also tested. It was observed that the isolation performance varies for each fault. Therefore, the noise level was gradually reduced from 10⁻¹ until the system accurately predicts the fault. Table 6


CPT Conditional Probability Tables
CSTR Continuous Stirred Tank Reactor
CVA Canonical Variate Analysis
CVTA Canonical Variable Trend Analysis
EM Expectation Maximization
FPA Fault Propagation Analysis
GLAR Generalized Linear Auto-Regression
GMM Gaussian Mixture Model
GP Genetic Programming
HMM Hidden Markov Model
k-NN K Nearest Neighbors
KPCA Kernel Principal Component Analysis
LL Log Likelihood
LLNF Locally Linear Neuro-Fuzzy
LSTM Long Short-Term Memory
PCA Principal Component Analysis
PF Particle Filter
RUL Remaining Useful Life
TE Tennessee Eastman

shows the performance of the prediction of each fault based on the maximum level of noise that can be handled for that fault.

The prediction using HMM mainly utilizes the experience of past faults and the variation characteristics before a fault; in other words, it tracks the features (symptoms) that a system shows just before a fault starts. Therefore, the proposed technique is recommended for systems which show some characteristic symptoms before failure, such as a pattern of variation in pressure or any other physically measurable parameter. Such symptoms are not rare in practical applications, and HMM is efficient in classifying these features. Although there are abrupt changes of process states in the current scenario, the HMM−BN hybrid system performs decently well; there is therefore good potential for the system to perform even better for gradual changes in process states.

The testing data used have ±10⁻² to ±10⁻⁸ added noise in all 22 parameters in comparison with the training data. Some of the faults are not accurately prognosed at higher noise levels, although faults A, C, and E were accurately isolated even at comparatively high noise levels. Because the system is composed of 22 sensors, the information on an abnormality may have been diluted by the rest of the sensor readings, which may explain why the detection of faults F and J was unsuccessful.
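The noise-level sweep used to fill Table 6 can be sketched as follows. This Python illustration steps the added-noise magnitude down by factors of 10, from 10⁻¹ to 10⁻⁸; `predicts_correctly` is a stand-in for a full HMM−BN prediction-and-isolation run at the given noise level:

```python
def max_tolerable_noise(predicts_correctly, exponents=range(1, 9)):
    """Reduce the added-noise magnitude by factors of 10 (10^-1 down to
    10^-8) until the system predicts the fault correctly; return the
    first (largest) level that works, or None if the fault is never
    predicted (cf. faults F and J in Table 6)."""
    for k in exponents:
        level = 10.0 ** -k
        if predicts_correctly(level):
            return level
    return None
```

With a stand-in system that only tolerates noise below 5 × 10⁻³, the sweep returns 10⁻³, the first level at which prediction succeeds.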



REFERENCES

(1) Galagedarage Don, M.; Khan, F. Dynamic process fault detection and diagnosis based on a combined approach of hidden Markov and Bayesian network model. Chem. Eng. Sci. 2019, 201, 82−96.
(2) Murphy, K. Machine Learning: A Probabilistic Perspective; The MIT Press: Cambridge, MA, 2012.
(3) Kim, M. J.; Jiang, R.; Makis, V.; Lee, C. G. Optimal Bayesian fault prediction scheme for a partially observable system subject to random failure. Eur. J. Oper. Res. 2011, 214 (2), 331−339.
(4) Fulufhelo, U. M.; Nelwamondo, V. Early Classifications of Bearing Faults Using Hidden Markov Model. Inf. Control 2006, 2 (6), 1281−1299.
(5) Abraham, A.; Grosan, C. Genetic Programming Approach for Fault Modeling of Electronic Hardware. 2005 IEEE Congress on Evolutionary Computation 2005, 2 (1), 1563−1569.
(6) Murray, J. Hard drive failure prediction using non-parametric statistical methods. Proc. ICANN 2003, 1.
(7) Hughes, G. F.; Murray, J. F.; Kreutz-Delgado, K.; Elkan, C. Improved disk-drive failure warnings. IEEE Trans. Reliab. 2002, 51 (3), 350−357.
(8) Kim, J.; Han, Y.; Lee, J. Euclidean Distance Based Feature Selection for Fault Detection Prediction Model in Semiconductor Manufacturing Process. 2016, 133, 85−89.
(9) Bekat, T.; Erdogan, M.; Inal, F.; Genc, A. Prediction of the bottom ash formed in a coal-fired power plant using artificial neural networks. Energy 2012, 45 (1), 882−887.
(10) Illias, H. A.; Chai, X. R.; Bakar, A. H. A.; Mokhlis, H. Transformer incipient fault prediction using combined artificial neural network and various particle swarm optimization techniques. PLoS One 2015, 10 (6), 1−16.
(11) Jiang, B. L.; Wey, C. L.; Fan, L. J. Fault Prediction for Analog Circuits. Circuits, Systems and Signal Processing 1988, 7 (1), 95−109.
(12) Ma, J.; Xu, J. Fault Prediction Algorithm for Multiple Mode Process Based on Reconstruction Technique. Mathematical Problems in Engineering 2015, 2015, 1.
(13) Gao, Q.; Liu, W.; et al. Research and Application of the Distillation Column Process Fault Prediction based on the Improved KPCA. 2017 IEEE International Conference on Mechatronics and Automation (ICMA) 2017, 247−251.
(14) Jiang, S.; Gui, W.; Ding, S. X.; Naik, A. Development and application of a data-driven fault prediction method to the imperial smelting furnace process. 7th IFAC Symposium on Fault Detection, Supervision and Safety of Technical Processes, Barcelona, Spain; IFAC, 2009; Vol. 42, pp 1174−1179.
(15) Ramana, E. V.; Sapthagiri, S.; Srinivas, P. Data Mining Approach for Quality Prediction and Fault Diagnosis of Injection Molding Process. Indian J. Sci. Technol. 2017, 10 (17), 1−7.

5. CONCLUSIONS
The unique aspect of this study is the integration of the HMM−BN hybrid model with a novel fault prediction technique to perform process fault prediction and isolation (i.e., prognosis). The proposed system predicts all 10 identified faults and successfully isolates 8 of them. The accuracy of isolation varies depending on the level of noise added to the testing data, and the maximum acceptable noise level for each fault was determined. Faults A, C, and E were detectable even at comparatively high added noise levels (10⁻² to 10⁻³). Faults F and J were not isolated due to their high sensitivity to added noise, while the rest of the faults were isolated with around 10⁻⁸ of added noise. Isolation of faults F and J using the HMM−BN hybrid system will be a useful future contribution. Moreover, the false alarm frequency can be reduced to make the system more reliable, and the software code can be further developed into a stand-alone computer program which can predict and isolate process faults.




AUTHOR INFORMATION

Corresponding Author

*E-mail: fi[email protected]. ORCID

Faisal Khan: 0000-0002-5638-4299 Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS
The authors thankfully acknowledge the financial support provided by the Natural Sciences and Engineering Research Council of Canada and the Canada Research Chair (CRC) Tier I Program in offshore safety and risk engineering.



SYMBOLS AND ACRONYMS
ANN Artificial Neural Network
ARIMA Auto-Regressive Integrated Moving Average
BN Bayesian Network


(16) Hu, Y.; Deng, X.; Cao, Y.; Tian, X. Dynamic process fault prediction using canonical variable trend analysis. 2017 Chinese Autom. Congr. 2017, 2015−2020.
(17) Deng, Z. Proceedings of 2017 Chinese Intelligent Systems Conference, 2018; Vol. 460.
(18) Di, Y. Fault prediction of power electronics modules and systems under complex working conditions. Comput. Ind. 2018, 97, 1−9.
(19) Sadeghian, M.; Fatehi, A. Identification, prediction and detection of the process fault in a cement rotary kiln by locally linear neuro-fuzzy technique. J. Process Control 2011, 21 (2), 302−308.
(20) Chen, M. Z.; Zhou, D. H.; Liu, G. P. A new particle predictor for fault prediction of nonlinear time-varying systems. Dev. Chem. Eng. Miner. Process. 2005, 13 (3−4), 379.
(21) Ding, B.; Fang, H. Fault prediction for nonlinear stochastic system with incipient faults based on particle filter and nonlinear regression. ISA Trans. 2017, 68, 327−334.
(22) Cheng, F.; Qu, L.; Qiao, W. Fault Prognosis and Remaining Useful Life Prediction of Wind Turbine Gearboxes Using Current Signal Analysis. IEEE Trans. Sustain. Energy 2018, 9 (1), 157.
(23) Zhao, Y.; Li, D.; Dong, A.; Kang, D.; Lv, Q.; Shang, L. Fault prediction and diagnosis of wind turbine generators using SCADA data. Energies 2017, 10 (8), 1210.
(24) Yu, L.; Wang, S.; Lai, K. K. A novel nonlinear ensemble forecasting model incorporating GLAR and ANN for foreign exchange rates. Comput. Oper. Res. 2005, 32 (10), 2523−2541.
(25) Hassan, M. R.; Nath, B. Stock market forecasting using hidden Markov model: a new approach. 5th Int. Conf. Intell. Syst. Des. Appl. 2005, 192−196.
(26) Cao, Y.; Li, Y.; Coleman, S.; Belatreche, A.; McGinnity, T. M. Adaptive hidden Markov model with anomaly states for price manipulation detection. IEEE Trans. Neural Networks Learn. Syst. 2015, 26 (2), 318−330.
(27) Nguyen, N. Stock Price Prediction using Hidden Markov Model. 2016; pp 1−20.
(28) Gupta, A.; Dhingra, B. Stock market prediction using Hidden Markov Models. 2012 Students Conf. Eng. Syst. (SCES) 2012, 1.
(29) Ündey, C.; Tatara, E.; Çinar, A. Intelligent real-time performance monitoring and quality prediction for batch/fed-batch cultivations. J. Biotechnol. 2004, 108 (1), 61−77.
(30) Wang, H. Q.; Cai, Y. N.; Fu, G. Y.; Wu, M.; Wei, Z. H. Data-driven fault prediction and anomaly measurement for complex systems using support vector probability density estimation. Eng. Appl. Artif. Intell. 2018, 67, 1−13.
(31) Baek, S.; Kim, D. Y. Fault Prediction via Symptom Pattern Extraction Using the Discretized State Vectors of Multi-Sensor Signals. IEEE Trans. Ind. Informatics 2019, 15, 922−931.
(32) Gabbar, H. A. Improved qualitative fault propagation analysis. J. Loss Prev. Process Ind. 2007, 20 (3), 260−270.
(33) Hussain, S.; Hossein, A.; Gabbar, H. A. Tuning of fault semantic network using Bayesian theory for probabilistic fault diagnosis in process industry. QR2MSE 2013: Proceedings of 2013 International Conference on Quality, Reliability, Risk, Maintenance, and Safety Engineering 2013, 1677−1682.
(34) Gabbar, H.; Boafo, E. FSN-Based Cosimulation for Fault Propagation Analysis in Nuclear Power Plants. Process Saf. Prog. 2016, 35 (1), 53−60.
(35) Li, C.; Wei, F.; Wang, C.; Zhou, S. Fault diagnosis and prediction of complex system based on Hidden Markov model. J. Intell. Fuzzy Syst. 2017, 33 (5), 2937−2944.
(36) Ruggeri, F.; Faltin, F.; Kenett, R. Bayesian Networks. Encycl. Stat. Qual. Reliab. 2007, 1 (1), 4.
(37) Neapolitan, R. E. Learning Bayesian Networks. Int. J. Data Min. Bioinform. 2010, 4, 505−519.
(38) Amin, M. T.; Imtiaz, S.; Khan, F. Process system fault detection and diagnosis using a hybrid technique. Chem. Eng. Sci. 2018, 189, 191−211.
(39) Wu, J.; Zhou, R.; Xu, S.; Wu, Z. Probabilistic analysis of natural gas pipeline network accident based on Bayesian network. J. Loss Prev. Process Ind. 2017, 46, 126−136.
(40) Cheon, S. P.; Kim, S.; Lee, S. Y.; Lee, C. B. Bayesian networks based rare event prediction with sensor data. Knowledge-Based Syst. 2009, 22, 336−343.
(41) Cózar, J.; Puerta, J. M.; Gámez, J. A. An application of dynamic Bayesian networks to condition monitoring and fault prediction in a sensored system: A case study. Int. J. Comput. Intell. Syst. 2017, 10 (1), 176−195.
(42) Murphy, K. Hidden Markov Model (HMM) Toolbox for Matlab, distributed under the MIT License, 2005. Available online: https://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html (accessed 19 Jul 2018).
