Process Systems Engineering
Systematic Procedure for Granger-Causality-Based Root Cause Diagnosis of Chemical Process Faults
Han-Sheng Chen, Zhengbing Yan, Yuan Yao, Tsai-Bang Huang, and Yi-Sern Wong
Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/acs.iecr.8b00697 • Publication Date (Web): 26 Jun 2018
Industrial & Engineering Chemistry Research
Systematic Procedure for Granger-Causality-Based Root Cause Diagnosis of Chemical Process Faults
Han-Sheng Chen (a), Zhengbing Yan (b), Yuan Yao (a,*), Tsai-Bang Huang (c), Yi-Sern Wong (c)
(a) Department of Chemical Engineering, National Tsing Hua University, Hsinchu 31003, Taiwan
(b) College of Physics and Electronic Information Engineering, Wenzhou University, Wenzhou 325035, China
(c) Kaohsiung Factory, Chang Chun Plastics Co., Ltd., No.14 Gongye 1st Rd., Renwu District, Kaohsiung 81469, Taiwan
Abstract: Multivariate statistical process monitoring (MSPM) has received considerable attention in both academic research and industrial applications. Most of these efforts have focused on fault detection and isolation, while root cause diagnosis has not yet been fully addressed. In recent years, data-driven causality analysis methods have been adopted to understand the complex relationships between process variables and to identify the causes of the faults triggering the alarms. Among them, the Granger causality (G-causality) test is a popular method of inferring causal associations between signals based on temporal precedence. Nevertheless, the conventional G-causality test applies only to stationary and linear time series. Additionally, it determines the relationships between variable pairs and is not suited to multivariate cases. In this study, the use of statistical tests is proposed to assess whether the time series are non-stationary or nonlinear. For significantly non-stationary or nonlinear signals, the Gaussian process regression (GPR) approach is integrated into the framework of the multivariate G-causality test in order to better indicate the causal relationships between the candidate process variables. The feasibility of the proposed scheme for root cause diagnosis is illustrated through case studies.
Keywords: root cause diagnosis, causality analysis, non-stationary time series, nonlinearity, Granger causality, Gaussian process regression.
* Corresponding author: Tel: 886-3-5713690, Fax: 886-3-5715408, Email: [email protected]
1. Introduction
Modern chemical plants are usually composed of multiple operating units and numerous control loops. Consequently, when a fault occurs in a single unit, it is likely to propagate to other units along the material and signal flows [1], which makes the root cause diagnosis of process and sensor faults difficult. In the last two decades, multivariate statistical process monitoring (MSPM) has received considerable attention in both academic research and industrial applications [2, 3]. Nevertheless, most of the effort has been devoted to fault detection (i.e., recognizing process abnormalities) and fault isolation (i.e., identifying the variables primarily responsible for the detected faults), while the root cause diagnosis problem has not yet been fully addressed.

In recent years, causality analysis techniques, which can be roughly classified into model-based and data-driven methods [4], have been adopted to aid the understanding of the complex relationships between process variables and to identify the causes of the faults triggering the alarms [5]. Model-based methods, such as the adjacency matrix [6] and signed digraphs [7, 8], require at least qualitative process information. For a large-scale and complex chemical plant, such knowledge may not be adequate. In contrast, data-driven methods, such as Bayesian networks [9-12], transfer entropy [13-15], and Granger causality (G-causality) [16], aim to extract the causality information from the process measurements in a statistical manner. Among them, the G-causality test, which infers causal associations between signals based on temporal precedence, seems promising due to its ease of implementation and interpretation. Conventional G-causality is calculated by fitting autoregressive (AR) models to the investigated time series and is only suitable for analyzing the linear relationship between a pair of stationary signals. In many cases, diagnostic approaches using the conventional G-causality test may not be able to accurately identify the root cause and the fault propagation path, because the trajectories of the process variables often exhibit multivariate, non-stationary, and nonlinear characteristics. Transfer entropy is able to handle nonlinearity; however, it still requires process stationarity. The differences between information transfer and causal effect have been discussed in the literature [17, 18]. Most recently, dynamic time warping (DTW) [19] was adopted in the analysis of non-stationary faults. Nevertheless, this approach is only suited to cause-effect signals with similar patterns.

In this study, a data-driven procedure is proposed for the G-causality-based root cause diagnosis of chemical process faults. After the candidate set of faulty variables is determined, statistical hypothesis tests, including the augmented Dickey-Fuller (ADF) test [20, 21], the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test [22], and the Brock-Dechert-Scheinkman (BDS) test [23], are used to assess whether the time series under investigation are non-stationary or nonlinear. For significantly non-stationary or nonlinear signals, the Gaussian process regression (GPR) [24] approach is integrated into the framework of the multivariate G-causality test to better indicate the causal relationships between the candidate variables and to facilitate the root cause diagnosis of the faults triggering the alarm. Although the techniques used in this study are well established in the statistical literature, a systematic procedure that integrates them into a solution to the problem of diagnosing the root cause of process faults was lacking prior to this study.

The rest of this paper is organized as follows. The principle of the conventional pairwise G-causality test is briefly introduced in Section 2. In Section 3, the methodologies adopted in this study are presented. Specifically, multivariate conditional G-causality is introduced in Section 3.1. The GPR-based G-causality test, which can also be used to handle non-stationary time series, is described in Section 3.2. The hypothesis tests for non-stationarity and nonlinearity are presented in Section 3.3. The entire systematic procedure of the proposed strategy is described in Section 3.4. In Section 4, the feasibility of the proposed method is illustrated by its application to a numerical simulation problem and the benchmark Tennessee Eastman process. Finally, Section 5 concludes the paper and discusses future research directions.
2. Preliminary: Granger causality test
G-causality is a concept based on predictability. It originated in the field of econometrics for investigating whether one time series can help in forecasting another [25]. According to the theory of G-causality, if a signal (i.e., a time series) $X_1$ Granger-causes (G-causes) another signal $X_2$, then the past values of $X_1$ should contain information that can assist in the prediction of $X_2$. In other words, if a time series model of $X_2$ that takes the past values of both $X_1$ and $X_2$ as inputs has better prediction performance than a model based solely on the past values of $X_2$, then $X_1$ is a Granger-cause (G-cause) of $X_2$. The underlying principle of G-causality is that an effect does not occur before its cause. Additionally, it is assumed that the cause set does not contain any redundant information; in other words, the cause variables cannot be correlated.

The mathematical expression of G-causality is as follows. Consider two stationary time series: $X_1 = (x_1(1), x_1(2), \ldots, x_1(n))$ and $X_2 = (x_2(1), x_2(2), \ldots, x_2(n))$. To investigate the causal relationship between $X_1$ and $X_2$, two different models are built: a bivariate AR model (the full model) as follows:
$$x_1(t) = \sum_{l=1}^{p} a_{11,l}\, x_1(t-l) + \sum_{l=1}^{p} a_{12,l}\, x_2(t-l) + \varepsilon_1(t), \tag{1}$$

$$x_2(t) = \sum_{l=1}^{p} a_{21,l}\, x_1(t-l) + \sum_{l=1}^{p} a_{22,l}\, x_2(t-l) + \varepsilon_2(t), \tag{2}$$
and a reduced model as follows:

$$x_1(t) = \sum_{l=1}^{p} b_{1,l}\, x_1(t-l) + \varepsilon_{1(2)}(t), \tag{3}$$

$$x_2(t) = \sum_{l=1}^{p} b_{2,l}\, x_2(t-l) + \varepsilon_{2(1)}(t). \tag{4}$$
In the above equations, $a_{ij,l}$ and $b_{i,l}$ are the model coefficients, $\varepsilon_i$ represents the residuals (i.e., prediction errors) of the full model, and $\varepsilon_{i(j)}$ represents the residuals of the reduced model, which predicts $X_i$ with $X_j$ excluded from the model. $p$ is the model order defining the time lags included in the models, and can be selected by minimizing the Akaike Information Criterion (AIC) [26] or the Bayesian Information Criterion (BIC) [27]. If the variability of the full model residual ($\varepsilon_i$) is significantly less than that of the reduced model residual ($\varepsilon_{i(j)}$), then including $X_j$ improves the prediction of $X_i$, where $i = 1, 2$, $j = 1, 2$, and $i \neq j$. This influence can be quantified as follows:

$$F_{X_j \to X_i} = \ln \frac{\operatorname{var}(\varepsilon_{i(j)})}{\operatorname{var}(\varepsilon_i)}. \tag{5}$$
When there is no causal influence from $X_j$ to $X_i$, $F_{X_j \to X_i} = 0$; otherwise, $F_{X_j \to X_i} > 0$. Because the reduced model is nested in the full model, the least-squares residual variance of the full model cannot exceed that of the reduced model, so $F_{X_j \to X_i}$ can never be negative. The statistical significance of this influence can be tested via the F-statistic, which is defined as follows:

$$F_{\text{statistic}} = \frac{(RSS_0 - RSS_1)/p}{RSS_1/(N - 2p - 1)} \sim F(p,\, N - 2p - 1), \tag{6}$$
where $RSS_0$ is the residual sum of squares of the reduced model, $RSS_1$ is the residual sum of squares of the full model, and $N$ is the total number of observations used to estimate the model. If the null hypothesis $F_{X_j \to X_i} = 0$ is rejected at a significance level $\alpha$, $X_j$ is said to have a causal influence on $X_i$.

In root cause diagnosis applications, $X_1$ and $X_2$ represent the trajectories of two candidate process variables, where the candidate set of variables is determined during a fault isolation pre-step. Because industrial chemical processes are inherently multivariate, constructing an entire causal map usually requires repeating the pairwise analysis for every pair of candidates.
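The pairwise test of Equations (1)-(6) can be sketched in a few lines of Python. The following is a minimal illustration, not the authors' implementation: it assumes NumPy, fits the AR models by ordinary least squares with an intercept (which accounts for the "−1" in the degrees of freedom), and uses hypothetical simulated signals in which `x1` drives `x2`. The function names `lagged`, `rss`, and `pairwise_granger` are introduced here for illustration only.

```python
import numpy as np


def lagged(x, p):
    """Columns x(t-1), ..., x(t-p) aligned with the targets x(t), t = p, ..., n-1."""
    n = len(x)
    return np.column_stack([x[p - l:n - l] for l in range(1, p + 1)])


def rss(y, regressors):
    """Residual sum of squares of a least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), regressors])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def pairwise_granger(cause, effect, p):
    """Return (F index of Eq. 5, F-statistic of Eq. 6) for 'cause -> effect'."""
    y = effect[p:]
    own = lagged(effect, p)
    rss0 = rss(y, own)                                  # reduced model, Eqs. (3)-(4)
    rss1 = rss(y, np.hstack([own, lagged(cause, p)]))   # full model, Eqs. (1)-(2)
    n_obs = len(y)
    # Both residual vectors have zero mean and equal length, so the
    # variance ratio in Eq. (5) equals the RSS ratio.
    f_index = np.log(rss0 / rss1)
    f_stat = ((rss0 - rss1) / p) / (rss1 / (n_obs - 2 * p - 1))
    return f_index, f_stat


# Hypothetical test signals (an assumption of this sketch):
# x1 drives x2 with a one-step delay, but not vice versa.
rng = np.random.default_rng(0)
n = 1000
x1 = np.zeros(n)
x2 = np.zeros(n)
for t in range(1, n):
    x1[t] = 0.8 * x1[t - 1] + rng.standard_normal()
    x2[t] = 0.5 * x2[t - 1] + 0.6 * x1[t - 1] + rng.standard_normal()

index_12, f_12 = pairwise_granger(x1, x2, p=2)   # x1 -> x2
index_21, f_21 = pairwise_granger(x2, x1, p=2)   # x2 -> x1
print(f"x1 -> x2: F index = {index_12:.3f}, F-statistic = {f_12:.1f}")
print(f"x2 -> x1: F index = {index_21:.3f}, F-statistic = {f_21:.1f}")
```

The F-statistic is then compared with the critical value of $F(p, N-2p-1)$ at the chosen $\alpha$ (roughly 3.0 for $p = 2$ and a large sample at $\alpha = 0.05$); a p-value could be obtained with, for example, `scipy.stats.f.sf`.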
3. Methodologies
3.1. Multivariate conditional Granger causality test
In chemical processes, process variables are often highly correlated, which violates the basic assumption of the conventional pairwise G-causality test. As a result, the interpretation of the fault diagnosis results is affected. The conditional G-causality [28], which is a multivariate version of G-causality, is better suited to such situations. Unlike the conventional G-causality, the conditional G-causality simultaneously includes all of the candidate variables in the AR models, where the full model is represented as follows:
$$x_1(t) = \sum_{l=1}^{p} a_{11,l}\, x_1(t-l) + \sum_{l=1}^{p} a_{12,l}\, x_2(t-l) + \sum_{j=3}^{J} \sum_{l=1}^{p} a_{1j,l}\, x_j(t-l) + \varepsilon_1(t), \tag{7}$$

$$x_2(t) = \sum_{l=1}^{p} a_{21,l}\, x_1(t-l) + \sum_{l=1}^{p} a_{22,l}\, x_2(t-l) + \sum_{j=3}^{J} \sum_{l=1}^{p} a_{2j,l}\, x_j(t-l) + \varepsilon_2(t), \tag{8}$$
and the reduced model is defined as follows:

$$x_1(t) = \sum_{l=1}^{p} b_{11,l}\, x_1(t-l) + \sum_{j=3}^{J} \sum_{l=1}^{p} b_{1j,l}\, x_j(t-l) + \varepsilon_{1(2)}(t), \tag{9}$$

$$x_2(t) = \sum_{l=1}^{p} b_{22,l}\, x_2(t-l) + \sum_{j=3}^{J} \sum_{l=1}^{p} b_{2j,l}\, x_j(t-l) + \varepsilon_{2(1)}(t). \tag{10}$$
In these equations, $J$ is the total number of variables under investigation. The direct G-causality relationship between $X_1$ and $X_2$ is determined by fitting the full and reduced models expressed above and by conducting the hypothesis test with the F-statistic defined by Equation (6). Because both models are conditioned on the other variables, indirect G-causality transmitted through those variables does not affect the test results. In root cause diagnosis applications, the causal map obtained by repeating the conditional G-causality test for all of the candidate variable pairs is usually much simpler than that obtained by the conventional G-causality test. Therefore, the root cause diagnosis of the fault becomes easier.
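The effect of conditioning can be illustrated with a hypothetical causal chain $X_1 \to X_2 \to X_3$. The sketch below assumes NumPy and least-squares AR fits with an intercept. The residual degrees of freedom are taken as $N - Jp - 1$, which is an assumption of this sketch that reduces to the $N - 2p - 1$ of Equation (6) when $J = 2$. The function `conditional_f_stat` and the simulated data are illustrative and are not taken from the paper.

```python
import numpy as np


def lagged(x, p):
    """Columns x(t-1), ..., x(t-p) aligned with the targets x(t), t = p, ..., n-1."""
    n = len(x)
    return np.column_stack([x[p - l:n - l] for l in range(1, p + 1)])


def rss(y, regressors):
    """Residual sum of squares of a least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), regressors])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def conditional_f_stat(data, cause, effect, p):
    """F-statistic for 'cause -> effect', conditioned on all other columns of data."""
    J = data.shape[1]
    y = data[p:, effect]
    # Full model (Eqs. 7-8): lags of every variable.
    full = np.hstack([lagged(data[:, j], p) for j in range(J)])
    # Reduced model (Eqs. 9-10): lags of every variable except the candidate cause.
    reduced = np.hstack([lagged(data[:, j], p) for j in range(J) if j != cause])
    rss0, rss1 = rss(y, reduced), rss(y, full)
    dof = len(y) - J * p - 1   # assumed generalization of N - 2p - 1
    return ((rss0 - rss1) / p) / (rss1 / dof)


# Hypothetical chain X1 -> X2 -> X3 (columns 0, 1, 2): X1 affects X3 only through X2.
rng = np.random.default_rng(1)
n = 2000
x = np.zeros((n, 3))
for t in range(1, n):
    x[t, 0] = 0.7 * x[t - 1, 0] + rng.standard_normal()
    x[t, 1] = 0.8 * x[t - 1, 0] + rng.standard_normal()
    x[t, 2] = 0.8 * x[t - 1, 1] + rng.standard_normal()

# With only two columns the function reduces to the pairwise test of Section 2.
f_pairwise = conditional_f_stat(x[:, [0, 2]], cause=0, effect=1, p=2)
f_conditional = conditional_f_stat(x, cause=0, effect=2, p=2)
print(f"X1 -> X3, pairwise:    F = {f_pairwise:.1f}")
print(f"X1 -> X3, conditional: F = {f_conditional:.1f}")
```

In this constructed example the pairwise test is expected to flag the indirect link $X_1 \to X_3$, whereas the conditional test should not, which is why the conditional causal map tends to be simpler.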
3.2. Gaussian process regression and GPR-based Granger causality
Conventional and conditional G-causality tests are both based on linear AR models, which may not be able to handle the nonlinear causal influences that commonly exist in industrial data. In recent years, the GPR approach [29], a popular method utilized in various industrial applications to address data nonlinearity [30-32], has been adopted for nonlinear AR modeling and has resulted in a nonlinear G-causality test method [33].

GPR belongs to the family of Bayesian regression methods. It places a Gaussian process prior on the target function $f(\mathbf{u})$ and performs prediction on the basis of Bayesian inference. A Gaussian process is a stochastic model completely specified by its mean function $m(\mathbf{u})$ and its covariance function $k(\mathbf{u}, \mathbf{u}')$, where $\mathbf{u}$ and $\mathbf{u}'$ are input vectors. The covariance function measures the similarity between inputs, and this similarity is used to predict the output value at a new point. The Gaussian process can be written as follows:
$$f(\mathbf{u}) \sim GP\big(m(\mathbf{u}),\, k(\mathbf{u}, \mathbf{u}')\big). \tag{11}$$
Practically, the mean function is often assumed to be zero, while the commonly used covariance functions include the radial basis function (RBF) kernel, polynomial kernel, etc. In this study, the RBF kernel was selected as follows:
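$$k(\mathbf{u}, \mathbf{u}') = \sigma^2 \exp\!\big(-\gamma \lVert \mathbf{u} - \mathbf{u}' \rVert_2^2\big), \tag{12}$$

where $\boldsymbol{\theta} = [\sigma \;\; \gamma]$ are the hyper-parameters defining the covariance function. Thus, the regression model is formulated as follows: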
$$y = f(\mathbf{u}) + \zeta, \tag{13}$$

where $\zeta$ is independent and identically distributed Gaussian noise with variance $\sigma_n^2$. Before applying the GPR model to the test dataset, the hyper-parameters must be determined by using the training dataset $\{U, \mathbf{y}\}$, where the matrix $U$ contains the input vectors of the training data and the vector $\mathbf{y}$ consists of the corresponding outputs. This can be accomplished by maximizing the log-likelihood $\log P(\mathbf{y} \mid U, \boldsymbol{\theta})$ in a Bayesian framework, as follows:

$$\log P(\mathbf{y} \mid U, \boldsymbol{\theta}) = -\frac{1}{2} \mathbf{y}^T K(U, U)^{-1} \mathbf{y} - \frac{1}{2} \log \lvert K(U, U) \rvert - \frac{N}{2} \log 2\pi, \tag{14}$$
where $N$ is the number of observations in the training dataset, and $K(U, U)$ is the covariance matrix of the training data, whose $(i, j)$-th entry is defined by the covariance function as $k(\mathbf{u}_i, \mathbf{u}_j)$, with $\mathbf{u}_i$ and $\mathbf{u}_j$ being the input vectors of the $i$-th and $j$-th observations in the training set. The joint prior distribution of the observed outputs ($\mathbf{y}$) in the training set and the function values ($\mathbf{f}_*$) at the test points can be expressed as follows:
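$$\begin{bmatrix} \mathbf{y} \\ \mathbf{f}_* \end{bmatrix} = \begin{bmatrix} \mathbf{f} \\ \mathbf{f}_* \end{bmatrix} + \begin{bmatrix} \boldsymbol{\zeta} \\ \mathbf{0} \end{bmatrix} \sim N\!\left( \mathbf{0},\; \begin{bmatrix} K(U, U) + \sigma_n^2 I & K(U, U_*) \\ K(U_*, U) & K(U_*, U_*) \end{bmatrix} \right). \tag{15}$$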
Here, $I$ is the identity matrix, $U_*$ is the input matrix of the test data, $\mathbf{f}_*$ contains the function values corresponding to the test inputs, and $K(U, U_*)$, $K(U_*, U)$, and $K(U_*, U_*)$ are defined in a similar way to $K(U, U)$. Assuming that there are $N_*$ observations in the test dataset, $K(U, U_*)$ denotes an $N \times N_*$ covariance matrix evaluated for all pairs of training and test points. The posterior distribution of $\mathbf{f}_*$ can then be inferred as follows:

$$\mathbf{f}_* \mid U, \mathbf{y}, U_* \sim N\big(\bar{\mathbf{f}}_*,\, \operatorname{cov}(\mathbf{f}_*)\big), \tag{16}$$

where

$$\bar{\mathbf{f}}_* = K(U_*, U)\,[K(U, U) + \sigma_n^2 I]^{-1}\, \mathbf{y}, \tag{17}$$

and

$$\operatorname{cov}(\mathbf{f}_*) = K(U_*, U_*) - K(U_*, U)\,[K(U, U) + \sigma_n^2 I]^{-1}\, K(U, U_*). \tag{18}$$
For a given input vector, the prediction of the output is provided by Equation (17), while Equation (18) estimates the prediction uncertainty.

To evaluate the nonlinear conditional Granger causality between two signals $X_1$ and $X_2$, the full and reduced models can be built by using GPR. To build the full model, the inputs $\mathbf{u}$ should consist of the lagged measurements of all $J$ candidate variables, i.e., $\mathbf{u} = [x_1(t-1), x_1(t-2), \ldots, x_1(t-p), x_2(t-1), x_2(t-2), \ldots, x_2(t-p), \ldots, x_J(t-1), x_J(t-2), \ldots, x_J(t-p)]$, and the output $y$ should be equal to $x_1(t)$ or $x_2(t)$. To build the reduced model for $X_1$, the lagged measurements of $X_2$, i.e., $x_2(t-1), x_2(t-2), \ldots, x_2(t-p)$, are excluded from the inputs $\mathbf{u}$. The reduced model for $X_2$ can be built in a similar way. In a previous study [33], it was proposed to infer the causal relationship by comparing the log-evidences of the two models, as follows:

$$d_{X_j \to X_i} = \max_{\boldsymbol{\theta}_1} \log P(\mathbf{y} \mid U_1, \boldsymbol{\theta}_1) - \max_{\boldsymbol{\theta}_2} \log P(\mathbf{y} \mid U_2, \boldsymbol{\theta}_2), \tag{19}$$
where $U_1$ and $U_2$ are the input matrices of the full and reduced models, respectively, for predicting $x_i$, while $\boldsymbol{\theta}_1$ and $\boldsymbol{\theta}_2$ are the corresponding hyper-parameters of these two models. If $d_{X_j \to X_i} > 0$, then it is inferred that $X_j$ causes $X_i$. However, a significance test for this index was not reported in the original literature. Therefore, in this study, causality is assessed by conducting a hypothesis test on $F_{X_j \to X_i}$ as introduced in Section 2, where the calculation of $F_{X_j \to X_i}$ from the model residuals was presented.

It should be noted that GPR can also be used to model non-stationary time series [34]. For online implementation, a non-stationary covariance function is suggested. For causality analysis applications, however, the time series is only analyzed offline. In such cases, the RBF kernel is often adequate for approximating the data distribution.
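The residual-based index can be sketched with a bare-bones GPR written in NumPy. This is only an illustration under stated assumptions: the hyper-parameters `sigma`, `gamma`, and `noise` (standing for $\sigma$, $\gamma$, and $\sigma_n$) are fixed by hand instead of being found by maximizing Equation (14), the noise term $\sigma_n^2 I$ is added to $K(U, U)$ as in Equation (17), the residuals are computed in-sample, and the test signal is hypothetical. In the test signal, `x2` depends on the square of lagged `x1`, a purely nonlinear link that is uncorrelated with `x1` itself.

```python
import numpy as np


def rbf_kernel(A, B, sigma, gamma):
    """RBF covariance of Eq. (12): sigma^2 * exp(-gamma * ||a - b||^2)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return sigma**2 * np.exp(-gamma * np.maximum(sq, 0.0))


def gpr_residuals(U, y, sigma=1.0, gamma=0.5, noise=0.1):
    """In-sample residuals y - f_bar, with f_bar from Eq. (17) evaluated at U* = U."""
    K = rbf_kernel(U, U, sigma, gamma)
    alpha = np.linalg.solve(K + noise**2 * np.eye(len(y)), y)
    return y - K @ alpha


def lagged(x, p):
    """Columns x(t-1), ..., x(t-p) aligned with the targets x(t), t = p, ..., n-1."""
    n = len(x)
    return np.column_stack([x[p - l:n - l] for l in range(1, p + 1)])


def gpr_f_index(data, cause, effect, p, **hyper):
    """F index of Eq. (5) computed from the residuals of full and reduced GPR models."""
    J = data.shape[1]
    y = data[p:, effect]
    y = y - y.mean()   # zero-mean prior, so center the target
    full = np.hstack([lagged(data[:, j], p) for j in range(J)])
    reduced = np.hstack([lagged(data[:, j], p) for j in range(J) if j != cause])
    e_full = gpr_residuals(full, y, **hyper)
    e_reduced = gpr_residuals(reduced, y, **hyper)
    return float(np.log(np.var(e_reduced) / np.var(e_full)))


# Hypothetical nonlinear link: x2(t) = x1(t-1)^2 + noise, with x1 independent noise.
rng = np.random.default_rng(2)
n = 300
x1 = rng.uniform(-1.0, 1.0, n)
x2 = np.zeros(n)
x2[1:] = x1[:-1] ** 2 + 0.05 * rng.standard_normal(n - 1)
data = np.column_stack([x1, x2])

f_12 = gpr_f_index(data, cause=0, effect=1, p=1)   # x1 -> x2
f_21 = gpr_f_index(data, cause=1, effect=0, p=1)   # x2 -> x1
print(f"x1 -> x2: F index = {f_12:.2f}")
print(f"x2 -> x1: F index = {f_21:.2f}")
```

Because a flexible model with more inputs fits the training data more closely even without a causal link, the index is generally slightly positive in the non-causal direction as well, which is one reason a significance test on the index is needed.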
3.3. Test for stationarity and nonlinearity
As mentioned previously, the conventional and conditional G-causality tests require the signals under investigation to be stationary time series. When this requirement is not satisfied, we propose that the GPR-based G-causality test introduced in Section 3.2 be used as a replacement. In statistics, a unit root test is commonly used to test whether a time series is stationary. An AR process is stationary if and only if all of its characteristic values (i.e., the roots of the characteristic equation) are less than one in modulus. The most widely adopted tests include the ADF test [20, 21] and the KPSS test [22]. In the ADF test, the null hypothesis is that a unit root is present in the AR model, while the alternative hypothesis is that the AR process is stationary. The KPSS test reverses these roles: its null hypothesis is that the time series is stationary, and the corresponding alternative hypothesis is that a unit root exists. In practice, it is often suggested that these two tests be combined in order to reduce the chance of an erroneous conclusion. For more details on the unit root tests, please refer to the cited literature.

It is noted that, in time series analysis, there are methods that can be used to deal with non-stationary signals. The most popular approach to making a time series stationary is to compute the differences between consecutive observations. This transformation is known as differencing and eliminates the trends from the signals. However, in causality analysis, differencing has the drawback that possibly important relationships between the trends are ignored, particularly when the non-stationary series are cointegrated [35]. In this study, the GPR-based G-causality test is used to deal with the non-stationary variable trajectories commonly observed when the processes are faulty.

Furthermore, the GPR-based G-causality test is an inherently nonlinear method suitable for nonlinear time series. The most popular tool for testing nonlinearity is the BDS test [23], which is known to be powerful against a wide range of nonlinear time series models. The basic principle of the BDS test is simple. After any linear structure is removed from the time series by detrending or by fitting a linear model, the BDS test examines the null hypothesis that the residuals are independent and identically distributed. Readers who are interested in more details regarding this test may refer to the cited literature.
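To make the unit root idea concrete, the sketch below implements the simplest (non-augmented) Dickey-Fuller regression $\Delta x(t) = c + \rho\, x(t-1) + e(t)$ with NumPy and compares the t-statistic of $\rho$ with the asymptotic 5% critical value of about $-2.86$ for a regression with a constant. It is a simplified stand-in for the ADF test, which additionally includes lagged differences. The function name and the simulated signals are assumptions of this sketch. In practice, library routines such as `adfuller`, `kpss`, and `bds` from `statsmodels.tsa.stattools` would normally be preferred.

```python
import numpy as np

DF_CRITICAL_5PCT = -2.86   # asymptotic 5% critical value, constant-only regression


def df_t_statistic(x):
    """t-statistic of rho in dx(t) = c + rho * x(t-1) + e(t) (no lagged differences)."""
    dx = np.diff(x)
    A = np.column_stack([np.ones(len(dx)), x[:-1]])
    coef, *_ = np.linalg.lstsq(A, dx, rcond=None)
    resid = dx - A @ coef
    s2 = resid @ resid / (len(dx) - 2)          # residual variance estimate
    cov = s2 * np.linalg.inv(A.T @ A)           # covariance of the OLS coefficients
    return float(coef[1] / np.sqrt(cov[1, 1]))


# Hypothetical signals: a stationary AR(1) series and a drifting random walk,
# the latter mimicking a trajectory that trends away after a fault.
rng = np.random.default_rng(3)
n = 1000
e = rng.standard_normal(n)
stationary = np.zeros(n)
for t in range(1, n):
    stationary[t] = 0.5 * stationary[t - 1] + e[t]
drifting = np.cumsum(0.5 + rng.standard_normal(n))

t_stationary = df_t_statistic(stationary)
t_drifting = df_t_statistic(drifting)
for name, t_stat in [("stationary AR(1)", t_stationary), ("drifting walk", t_drifting)]:
    verdict = "reject unit root" if t_stat < DF_CRITICAL_5PCT else "cannot reject unit root"
    print(f"{name}: t = {t_stat:.2f} -> {verdict}")
```

A signal for which the unit root cannot be rejected would be routed to the GPR-based test rather than to the linear AR models.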
3.4. Systematic procedure for root cause diagnosis
The systematic procedure for the root cause diagnosis of chemical process faults is as follows:

Pre-step 1 (fault detection). Detect process abnormality by using any of the statistical process monitoring methods.

Pre-step 2 (fault isolation). Determine the candidate set of faulty variables by using fault isolation tools, such as contribution plots [36], reconstruction analysis [37], variable selection [38], etc.

Step 1. Select the time window for conducting causality analysis. There is no universal technique for determining the period to analyze. Typically, the focus is put on the time period soon after the fault is detected. Moreover, there is a trade-off in selecting the period length: a longer period provides more data for the estimation of the AR model, but it also allows the fault to propagate further, which increases the difficulty of root cause diagnosis. Therefore, experience with the investigated process is helpful.

Step 2. Collect the trajectory data of the candidate process variables in the selected time period.

Step 3. Select a pair of variables in the candidate set.
Step 4. Determine whether the trajectory signals of these two variables are stationary and linear by using the ADF, KPSS, and BDS tests. If both signals are stationary and linear, go to Step 5; otherwise, go to Step 6.

Step 5. Build the full model as expressed in Equations (7) and (8), and the reduced model as expressed in Equations (9) and (10). The time lags included in the models can be selected by minimizing AIC or BIC. Then, proceed to Step 7.

Step 6. Build the full model and the reduced model by using GPR, as introduced in Section 3.2. The time lags included in the models can be selected by minimizing AIC or BIC.

Step 7. Calculate the error terms of the full and reduced models. The calculated error terms are then used to calculate the F-statistic by Equation (6).

Step 8. Infer the causal relationship between the examined pair of variables based on the results of the F-test.

Step 9. Choose another pair of candidate variables and go to Step 4 until all the variable pairs have been tested.

Step 10. Construct the causal map and draw a conclusion regarding the root cause.

4. Case studies
4.1. A numerical example of non-stationary signals
In this section, a numerical example is used to illustrate the feasibility of the proposed method when dealing with non-stationary signals. A process was simulated by using the Simulink toolbox in MATLAB. As shown in Fig. 1, this process consisted of four different units, whose transfer functions were $\frac{3}{3s+1}$, $\frac{2}{6s+1}$, $\frac{2}{3s+1}$, and $\frac{-2}{4s+1}$,
respectively. Each unit was operated under random disturbances and controlled by a proportional-integral (PI) controller. The components linking the units were also represented by the transfer functions shown in the figure. Three variables were measured in each unit: the manipulated variable (i.e., the input of the operating unit), denoted as MV; the controlled variable (i.e., the output of the operating unit), denoted as CV; and the disturbance variable, denoted as DV. A total of 5,000 observations were collected with a sampling interval of 0.1 s. Initially, the process was operated under normal operating conditions and the recorded data fluctuated slightly around the set-points. At the 1,000th sampling point (i.e., the 100th second), a sinusoidal oscillation was introduced into Unit 1 through the transfer function $\frac{2.2}{2s+1}$, which led to a severe process fault that gradually propagated to the other units. The disturbance variable DV1 was influenced directly, and the controller output MV1 was also quickly affected owing to the controller's fast reaction to the oscillation. In this case, the controller could not completely compensate for the oscillation. As a result, the fault eventually propagated to other units and variables. Therefore, only DV1 and MV1 should be recognized as the root cause variables of this fault.

Fig. 2 shows the plots of the variable trajectories in Unit 1 and Unit 2. The data recorded in the other two operating units had a similar pattern. These signals were visibly non-stationary. In this case study, the 1,001st-2,000th process observations, which were collected during a period of 100 s, were used to perform the root cause diagnosis, and all 12 variables were selected as candidates. According to the hypothesis test results, the characteristic equations of the time series had roots larger than one, which implied that the variable trajectories were non-stationary. For comparison, three different G-causality analysis methods were used to perform the root cause diagnosis: the conventional G-causality test, the linear conditional G-causality test, and the GPR-based conditional G-causality test. The results are shown in Figs. 3-5. In these figures, the x-axis represents the potential cause variables, while the y-axis represents the potential effect variables. A black block located in the i-th row and j-th column indicates that variable j G-causes variable i, while a blank block indicates that variable j is not a G-cause of variable i. If a variable is the root cause, it cannot be the effect of any other variable. Consequently, the corresponding row should be completely blank.

As shown in Fig. 3, the conventional G-causality test mistakenly indicated that there were causal relationships between the variable pairs. From this result, no conclusion could be drawn with regard to the root cause of this fault. The correlation among the process variables seriously affected the test result. The results of the conditional G-causality test were different. As shown in Fig. 4, the manipulated and controlled variables in all of the operating units were identified as the root cause. Such an unreasonable result was produced because the conditional G-causality test was not able to handle the non-stationarity of the process. A better diagnosis was achieved by using the GPR-based conditional G-causality test. Fig. 5 shows that DV1 and MV1 were identified as the root cause. Owing to its capability of dealing with both the correlation and the non-stationarity contained in the data, the GPR-based method provided a correct diagnosis result, which is meaningful and valuable for process recovery.
Here, the lag p in the models was automatically selected based on BIC, while the hyper-parameters θ in the GPR model were determined by maximizing the log-likelihood as introduced in Section 3.2. In addition, the transfer
entropy method was also implemented for comparison, whose results are plotted in Fig. 6. In Fig. 6(a), the transfer entropy values between all variable pairs are shown in colors. In order to better identify the causal relationships, this figure is re-plotted in Fig. 6(b) where only the values larger than 0.05 are highlighted in black. Obviously, such results are misleading, because the transfer entropy method cannot handle non-stationary data series.
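The BIC-based lag selection mentioned above can be sketched for the simplest case. The sketch below is a simplification under stated assumptions: it fits a univariate autoregressive model by ordinary least squares, whereas the models in this study are multivariate, and it uses the Gaussian form BIC = n ln(RSS/n) + d ln(n). All function names and the simulated series are hypothetical.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ar_bic(series, p, p_max):
    """BIC of an AR(p) model with intercept, fitted by least squares.

    The same targets (t >= p_max) are used for every p so that the
    BIC values of different lags are comparable.
    """
    rows = [[1.0] + [series[t - k] for k in range(1, p + 1)]
            for t in range(p_max, len(series))]
    y = series[p_max:]
    n, d = len(y), p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(d)]
    beta = solve_linear(xtx, xty)
    rss = sum((yt - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yt in zip(rows, y))
    return n * math.log(rss / n) + d * math.log(n)


def select_lag(series, p_max):
    """Return the lag in 1..p_max with the smallest BIC."""
    return min(range(1, p_max + 1), key=lambda p: ar_bic(series, p, p_max))


# Hypothetical test signal: a stationary AR(2) process with unit-variance noise.
random.seed(0)
series = [0.0, 0.0]
for _ in range(500):
    series.append(0.5 * series[-1] - 0.7 * series[-2] + random.gauss(0.0, 1.0))
best_p = select_lag(series, p_max=6)
```

For the multivariate case, the regressor rows would also contain the lagged values of the other candidate variables, and the parameter count d would grow accordingly.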
4.2. Tennessee Eastman (TE) process
The TE process39 is a benchmark for testing various process control and monitoring methods. As shown in Fig. 7, the process contains five primary operating units including a reactor, product condenser, recycle compressor, vapor-liquid separator, and product stripper. It produces two products (G and H) and a byproduct (F) from four reactants (A, C, D, and E). Additionally, an inert chemical (B) is fed into the reactor. A total of 41 measured variables and 12 manipulated variables are recorded for data collection with a sampling interval of 3 min. In the process database, different types of faults are simulated and each one is triggered after the 160th sampling interval. More details regarding the dataset and faults can be found in the literature39, 40. The complete list of the process variables is presented in Table S1 in the Supporting Information.
The first case study regards IDV(1), which is a fault related to the change of the A/C feed ratio in stream 4. The occurrence of this fault leads to a decrease in the amount of A in the recycling stream. As a result, the A composition in stream 6 also decreases. The composition controller reacts efficiently and increases the flow rate (x1) of the A feed in stream 1, by adjusting the corresponding valve opening, i.e., x44. Consequently, this raises the level of the reactor, changes the residence time of the materials, and affects many other process variables. The flow rate of stream 4 (x4) is affected by the action of the
level controller, which further changes the stripper pressure (x16) and temperature (x18). Then, the product separator pressure (x13) and the reactor pressure (x7) are affected. In the meantime, the temperature control of the stripper results in the adjustment of the stripper steam valve (x50), which manipulates the stripper steam flow (x19). From the above analysis, it is clear that x1 and x44 are the variables closest to the root cause of this fault, while x4 is affected by x1. Additional faulty variables are located downstream of the fault propagation paths. G-causality-based diagnosis methods were implemented in order to check whether they can accurately reveal the root cause. In this case study, a contribution plot based on principal component analysis (PCA)36 was used to select the candidate variable set. Nine principal components (PCs) were retained in the model, as suggested by a previous study41. The average contribution of each variable is shown in Fig. 8, where the control limits are plotted in red. Owing to the smearing effect42, many variables lie outside of the control limits. As has been indicated in a previous study19, the results of the G-causality tests are usually sensitive to the number of candidate signals under investigation. Therefore, the nine variables with the largest contributions, {x1, x4, x7, x13, x16, x18, x19, x44, x50}, were selected as the candidates. The trajectories of several selected variables are shown in Fig. 9, and they all appear to be stationary. The statistical tests confirmed this impression; however, they also indicated that the series were nonlinear. Therefore, the proposed GPR-based conditional G-causality test was expected to outperform the linear conditional test. To reveal the root cause, only the samples collected after the occurrence of the fault were involved in the analysis. The results of the linear conditional G-causality test are presented in Fig. 10. These results could be further
transformed to a causal map as shown in Fig. 11, where the single arrows indicate the cause and effect directions, while the double-headed arrows represent reciprocal causation. In this map, x1 was correctly identified as the root cause. However, x44, which was closely correlated with x1, was mistakenly placed downstream of the fault propagation paths. In comparison, the GPR-based conditional G-causality test led to a more reasonable result. As indicated by Fig. 12 and Fig. 13, both x1 and x44 were located upstream of the paths, which is correct. Additionally, in comparison to the causal map shown in Fig. 11, the causal map illustrated in Fig. 13 is more indicative of the fault propagation route, although some mismatches are still present. It should be pointed out that no purely data-driven method can guarantee the absolutely correct identification of fault propagation paths. However, a more reasonable result can better aid the understanding of the process abnormality mechanism. To further improve the result, the topological structure of the plant should be involved in the analysis. However, this is beyond the scope of this study.
The second fault that was investigated was IDV(7), which is a fault related to the C header pressure loss. When this fault occurs, the feed flow of stream 4 (x4) decreases sharply. The feedback control system attempts to compensate for this disturbance by adjusting the corresponding valve opening (x45). Therefore, both variables could be diagnosed as the possible root cause of this specific fault. Since the performance of the flow control is quite sluggish, x4 and x45 oscillate for a long time before the process reaches a new steady state. Therefore, many process variables are influenced by fault propagation, and their trajectories also exhibit an oscillatory behavior. These variables include the product separator pressure (x13), stripper pressure (x16), reactor pressure (x7),
reactor cooling water outlet temperature (x21), separator cooling water outlet temperature (x22), reactor cooling water flow (x51), etc. Before the root cause diagnosis was conducted, the candidate faulty variables were determined using the PCA contribution plot shown in Fig. 14. The ten variables with the largest contributions, {x4, x7, x13, x16, x20, x21, x22, x45, x46, x51}, were selected. A subset of the variable trajectories is shown in Fig. 15. No clue for determining the root cause could be obtained by visual inspection of these trajectories. The statistical test results revealed that the investigated signals were both stationary and nonlinear. The root cause diagnosis was conducted based on the data samples measured after the occurrence of the fault. Fig. 16 and Fig. 17 show the result of the linear conditional G-causality test. Identifying the root cause based on this result is difficult: all of the variables were in loops, and many of the causal relationships were not meaningful. For comparison, the result of the GPR-based conditional G-causality test is shown in Fig. 18 and Fig. 19, where x45 was determined as the root cause. The fault propagation paths between x45, x13, x16, x7, x21, x51, and so on were identified correctly.
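The conversion of a causality matrix into a causal map with single and double-headed arrows, as used in the case studies above, can be sketched as follows. This is only an illustration under the row = effect, column = cause convention; the function name, the variable names a, b, and c, and the toy matrix are hypothetical and do not reproduce any of the figures.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[str, str]


def causal_map(names: Sequence[str],
               causal: Sequence[Sequence[bool]]) -> Tuple[List[Edge], List[Edge]]:
    """Split the significant relationships into two edge lists.

    causal[i][j] is True when variable j G-causes variable i.
    Returns (directed, reciprocal): a directed edge (cause, effect) is a
    single arrow, a reciprocal edge is a double-headed arrow.
    """
    directed: List[Edge] = []
    reciprocal: List[Edge] = []
    n = len(names)
    for i in range(n):
        for j in range(i + 1, n):
            j_causes_i = causal[i][j]
            i_causes_j = causal[j][i]
            if i_causes_j and j_causes_i:
                reciprocal.append((names[i], names[j]))
            elif j_causes_i:
                directed.append((names[j], names[i]))
            elif i_causes_j:
                directed.append((names[i], names[j]))
    return directed, reciprocal


# Hypothetical toy matrix: a G-causes b, while b and c G-cause each other.
names = ["a", "b", "c"]
causal = [
    [False, False, False],  # row a: no causes
    [True,  False, True],   # row b: caused by a and c
    [False, True,  False],  # row c: caused by b
]
directed, reciprocal = causal_map(names, causal)
```

A map in which every variable appears in a reciprocal edge or a longer loop offers no blank row, which is the situation described for the linear conditional test in the IDV(7) case.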
5. Conclusion
To meet the demand of root cause diagnosis with regard to chemical process faults, the utilization of a data-driven causality analysis technique, namely, the G-causality test, has been previously investigated. However, its performance has often been affected by the multivariate, non-stationary and nonlinear characteristics of the process data. In this study, a systematic procedure was proposed to solve this problem. After selecting the candidate
faulty variables, statistical tests were conducted to check the non-stationarity and nonlinearity of the trajectory signals. If the non-stationarity or nonlinearity was statistically significant, the GPR-based multivariate conditional G-causality test was suggested for the subsequent causality analysis step. The feasibility of the proposed method was illustrated by case studies. We conclude this paper by discussing some challenges and research issues with regard to future work. First, in the case studies of this paper, the PCA contribution plot was utilized as an illustration to determine the candidate set of the root cause variables. As is well known, contribution plots often suffer from a smearing problem and overestimate the number of faulty variables42. In the literature19, it has been pointed out that an overlarge candidate set may distort the diagnosis result and lead to a causal map that is significantly inconsistent with the fault propagation paths. Therefore, a fault isolation method without the smearing effect, such as variable selection38, 43, is a good alternative to the contribution plots. Secondly, to the best of our knowledge, the issue of time window selection for root cause diagnosis has not yet been addressed. Generally, the selected time window should be close to the time point of the fault's occurrence and should be long enough to identify the time series models. However, it should not be so long that significant changes occur in the system within the window. Thirdly, another topic warranting discussion is whether normal operation data should be used in root cause diagnosis along with the fault samples. The utilization of more data collected under different operating conditions may be helpful in understanding the topological structure of the target plant. However, it should be mentioned that the identified causal relationships do not necessarily indicate the fault propagation paths. An interpretation of
this should be carried out with caution. Finally, it is noted that causality analysis is not the only way of root cause diagnosis44, 45. Practically, integrating the information extracted by different types of methods may lead to more effective diagnosis results. In summary, a great amount of research is required in order to bridge the gap between the statistical theories of causality analysis and their industrial applications to root cause diagnosis.
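The procedure summarized in this section (candidate selection, use of post-fault samples, and choice of the causality test) can be outlined as a short planning sketch. It is only a skeleton under stated assumptions: the statistical test outcomes are passed in as flags rather than computed, the fallback to the linear conditional test when neither property is significant is assumed, the 0-based window convention is an assumption, and all names and numbers in the example are hypothetical.

```python
from typing import Dict, List, Tuple


def plan_diagnosis(contributions: Dict[str, float],
                   k: int,
                   fault_index: int,
                   n_samples: int,
                   nonstationary: bool,
                   nonlinear: bool) -> Tuple[List[str], range, str]:
    """Return the candidate set, the post-fault window, and the suggested test."""
    # Step 1: keep the k variables with the largest average contributions.
    ranked = sorted(contributions, key=contributions.get, reverse=True)
    candidates = sorted(ranked[:k])
    # Step 2: use only the samples recorded after the fault occurrence
    # (0-based indices fault_index, ..., n_samples - 1; an assumed convention).
    window = range(fault_index, n_samples)
    # Step 3: choose the test from the stationarity and linearity checks.
    if nonstationary or nonlinear:
        test = "GPR-based conditional G-causality"
    else:
        test = "linear conditional G-causality"  # assumed fallback
    return candidates, window, test


# Hypothetical contributions and sizes, for illustration only.
contributions = {"v1": 0.9, "v2": 0.1, "v3": 0.5, "v4": 0.7}
candidates, window, test = plan_diagnosis(
    contributions, k=2, fault_index=160, n_samples=500,
    nonstationary=False, nonlinear=True)
```

Keeping k as an explicit argument reflects the sensitivity of the G-causality tests to the size of the candidate set discussed above.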
Supporting Information
A list of process variables involved in the case study. This material is available free of charge via the Internet at http://pubs.acs.org.
Acknowledgment
This work was supported in part by the Ministry of Science and Technology, ROC, under Grant No. MOST 106-2622-8-007-017. Yan was supported by the National Natural Science Foundation of China (No. 61703309) and Zhejiang Provincial Natural Science Foundation of China under Grant No. LY18F030014. This paper is extended from a conference paper presented at IFAC Congress 2017.
References
1. Zhao, C.; Gao, F., Critical-to-fault-degradation variable analysis and direction extraction for online fault prognostic. IEEE Transactions on Control Systems Technology, 2017, 25, (3), 842-854.
2. Qin, S. J., Survey on data-driven industrial process monitoring and diagnosis. Annual Reviews in Control, 2012, 36, (2), 220-234.
3. Ge, Z.; Song, Z.; Gao, F., Review of recent research on data-based process monitoring. Industrial & Engineering Chemistry Research, 2013, 52, (10), 3543-3562.
4. Duan, P.; Chen, T.; Shah, S. L.; Yang, F., Methods for root cause diagnosis of plant-wide oscillations. AIChE Journal, 2014, 60, (6), 2019-2034.
5. Chiang, L. H.; Jiang, B.; Zhu, X.; Huang, D.; Braatz, R. D., Diagnosis of multiple and unknown faults using the causal map and multivariate statistics. Journal of Process Control, 2015, 28, 27-39.
6. Jiang, H.; Patwardhan, R.; Shah, S. L., Root cause diagnosis of plant-wide oscillations using the concept of adjacency matrix. Journal of Process Control, 2009, 19, (8), 1347-1354.
7. Wan, Y.; Yang, F.; Lv, N.; Xu, H.; Ye, H.; Li, W.; Xu, P.; Song, L.; Usadi, A. K., Statistical root cause analysis of novel faults based on digraph models. Chemical Engineering Research and Design, 2013, 91, (1), 87-99.
8. Ram Maurya, M.; Rengaswamy, R.; Venkatasubramanian, V., Application of signed digraphs-based analysis for fault diagnosis of chemical process flowsheets. Engineering Applications of Artificial Intelligence, 2004, 17, (5), 501-518.
9. Weidl, G.; Madsen, A. L.; Israelson, S., Applications of object-oriented Bayesian networks for condition monitoring, root cause analysis and decision support on operation of complex continuous processes. Computers & Chemical Engineering, 2005, 29, (9), 1996-2009.
10. Yu, H.; Khan, F.; Garaniya, V., Modified independent component analysis and Bayesian network-based two-stage fault diagnosis of process operations. Industrial & Engineering Chemistry Research, 2015, 54, (10), 2724-2742.
11. Yu, J.; Rashid, M. M., A novel dynamic Bayesian network-based networked process monitoring approach for fault detection, propagation identification, and root cause diagnosis. AIChE Journal, 2013, 59, (7), 2348-2365.
12. Gharahbagheri, H.; Imtiaz, S. A.; Khan, F., Root cause diagnosis of process fault using KPCA and Bayesian network. Industrial & Engineering Chemistry Research, 2017, 56, (8), 2054-2070.
13. Bauer, M.; Cox, J. W.; Caveness, M. H.; Downs, J. J.; Thornhill, N. F., Finding the direction of disturbance propagation in a chemical process using transfer entropy. IEEE Transactions on Control Systems Technology, 2007, 15, (1), 12-21.
14. Xu, S.; Baldea, M.; Edgar, T. F.; Wojsznis, W.; Blevins, T.; Nixon, M., Root cause diagnosis of plant-wide oscillations based on information transfer in the frequency domain. Industrial & Engineering Chemistry Research, 2016, 55, (6), 1623-1629.
15. Duan, P.; Yang, F.; Chen, T.; Shah, S. L., Direct causality detection via the transfer entropy approach. IEEE Transactions on Control Systems Technology, 2013, 21, (6), 2052-2066.
16. Yuan, T.; Qin, S. J., Root cause diagnosis of plant-wide oscillations using Granger causality. Journal of Process Control, 2014, 24, (2), 450-459.
17. Ay, N.; Polani, D., Information flows in causal networks. Advances in Complex Systems, 2008, 11, (1), 17-41.
18. Lizier, J. T.; Prokopenko, M., Differentiating information transfer and causal effect. The European Physical Journal B, 2010, 73, (4), 605-615.
19. Li, G.; Qin, S. J.; Yuan, T., Data-driven root cause diagnosis of faults in process industries. Chemometrics and Intelligent Laboratory Systems, 2016, 159, 1-11.
20. Dickey, D. A.; Fuller, W. A., Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association, 1979, 74, (366), 427-431.
21. Said, E. S.; Dickey, D. A., Testing for unit roots in autoregressive-moving average models of unknown order. Biometrika, 1984, 71, (3), 599-607.
22. Kwiatkowski, D.; Phillips, P. C. B.; Schmidt, P.; Shin, Y., Testing the null hypothesis of stationarity against the alternative of a unit root: How sure are we that economic time series have a unit root? Journal of Econometrics, 1992, 54, (1), 159-178.
23. Broock, W. A.; Scheinkman, J. A.; Dechert, W. D.; LeBaron, B., A test for independence based on the correlation dimension. Econometric Reviews, 1996, 15, (3), 197-235.
24. Williams, C. K. I.; Rasmussen, C. E., Gaussian processes for regression. In Advances in Neural Information Processing Systems 8, The MIT Press: Cambridge, MA, 1996, 514-520.
25. Granger, C. W. J., Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 1969, 37, (3), 424-438.
26. Akaike, H., A new look at the statistical model identification. IEEE Transactions on Automatic Control, 1974, 19, (6), 716-723.
27. Schwarz, G., Estimating the dimension of a model. The Annals of Statistics, 1978, 6, (2), 461-464.
28. Geweke, J. F., Measures of conditional linear dependence and feedback between time series. Journal of the American Statistical Association, 1984, 79, (388), 907-915.
29. Rasmussen, C. E.; Williams, C., Gaussian Processes for Machine Learning. The MIT Press: Cambridge, MA, USA, 2006.
30. Liu, Y.-J.; Chen, T.; Yao, Y., Nonlinear process monitoring and fault isolation using extended maximum variance unfolding. Journal of Process Control, 2014, 24, (6), 880-891.
31. Liu, Y.; Wu, Q.-Y.; Chen, J., Active selection of informative data for sequential quality enhancement of soft sensor models with latent variables. Industrial & Engineering Chemistry Research, 2017, 56, (16), 4804-4817.
32. Deng, H.; Liu, Y.; Li, P.; Ma, Y.; Zhang, S., Integrated probabilistic modeling method for transient opening height prediction of check valves in oil-gas multiphase pumps. Advances in Engineering Software, 2018, 118, 18-26.
33. Amblard, P. O.; Michel, O. J. J.; Richard, C.; Honeine, P., A Gaussian process regression approach for testing Granger causality between time series data. 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, 3357-3360.
34. Brahim-Belhouari, S.; Bermak, A., Gaussian process for nonstationary time series prediction. Computational Statistics & Data Analysis, 2004, 47, (4), 705-712.
35. Engle, R. F.; Granger, C. W. J., Co-integration and error correction: representation, estimation, and testing. Econometrica, 1987, 55, (2), 251-276.
36. Westerhuis, J.; Gurden, S.; Smilde, A., Generalized contribution plots in multivariate statistical process monitoring. Chemometrics and Intelligent Laboratory Systems, 2000, 51, (1), 95-114.
37. Yue, H. H.; Qin, S. J., Reconstruction-based fault identification using a combined index. Industrial & Engineering Chemistry Research, 2001, 40, (20), 4403-4414.
38. Kuang, T.-H.; Yan, Z.; Yao, Y., Multivariate fault isolation via variable selection in discriminant analysis. Journal of Process Control, 2015, 35, 30-40.
39. Downs, J.; Vogel, E., A plant-wide industrial process control problem. Computers & Chemical Engineering, 1993, 17, (3), 245-255.
40. Russell, E. L.; Chiang, L. H.; Braatz, R. D., Fault detection in industrial processes using canonical variate analysis and dynamic principal component analysis. Chemometrics and Intelligent Laboratory Systems, 2000, 51, (1), 81-93.
41. Yin, S.; Ding, S. X.; Haghani, A.; Hao, H.; Zhang, P., A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process. Journal of Process Control, 2012, 22, (9), 1567-1581.
42. Van den Kerkhof, P.; Vanlaer, J.; Gins, G.; Van Impe, J. F. M., Analysis of smearing-out in contribution plot based fault isolation for statistical process control. Chemical Engineering Science, 2013, 104, 285-293.
43. Yan, Z.; Kuang, T.-H.; Yao, Y., Multivariate fault isolation of batch processes via variable selection in partial least squares discriminant analysis. ISA Transactions, 2017, 70, 389-399.
44. Nan, C.; Khan, F.; Iqbal, M. T., Real-time fault diagnosis using knowledge-based expert system. Process Safety and Environmental Protection, 2008, 86, (1), 55-71.
45. Yu, H.; Khan, F.; Garaniya, V., A probabilistic multivariate method for fault diagnosis of industrial processes. Chemical Engineering Research and Design, 2015, 104, 306-318.
Figure list
Figure 1. Simulink process model with four units
Figure 2. Variable trajectories in Unit 1 and Unit 2
Figure 3. Result of conventional G-causality test
Figure 4. Result of linear conditional G-causality test
Figure 5. Result of GPR-based conditional G-causality test
Figure 6. Result of transfer entropy
Figure 7. Flowchart of TE process
Figure 8. PCA contribution plot of IDV(1)
Figure 9. Variable trajectories in IDV(1) scenario
Figure 10. Result of linear conditional G-causality test in IDV(1) scenario
Figure 11. Causal map constructed based on Figure 10
Figure 12. Result of GPR-based conditional G-causality test in IDV(1) scenario
Figure 13. Causal map constructed based on Figure 12
Figure 14. PCA contribution plot of IDV(7)
Figure 15. Variable trajectories in IDV(7) scenario
Figure 16. Result of linear conditional G-causality test in IDV(7) scenario
Figure 17. Causal map constructed based on Figure 16
Figure 18. Result of GPR-based conditional G-causality test in IDV(7) scenario
Figure 19. Causal map constructed based on Figure 18
Figure 1. Simulink process model with four units
Figure 2. Variable trajectories in Unit 1 and Unit 2 (panels: DV1, DV2, MV1, MV2, CV1, CV2; x-axis: Time (s))
Figure 3. Result of conventional G-causality test
Figure 4. Result of linear conditional G-causality test
Figure 5. Result of GPR-based conditional G-causality test
Figure 6. Result of transfer entropy: (a) original values, and (b) values > 0.05
Figure 7. Flowchart of TE process
Figure 8. PCA contribution plot of IDV(1)
Figure 9. Variable trajectories in IDV(1) scenario (panels: x1, x4, x7, x13)
Figure 10. Result of linear conditional G-causality test in IDV(1) scenario
Figure 11. Causal map constructed based on Figure 10
Figure 12. Result of GPR-based conditional G-causality test in IDV(1) scenario
Figure 13. Causal map constructed based on Figure 12
Figure 14. PCA contribution plot of IDV(7) (y-axis: Contribution)
Figure 15. Variable trajectories in IDV(7) scenario (panels: x4, x7, x13, x16)
Figure 16. Result of linear conditional G-causality test in IDV(7) scenario
Figure 17. Causal map constructed based on Figure 16 (nodes: x4, x7, x13, x16, x20, x21, x22, x45, x46, x51)
Figure 18. Result of GPR-based conditional G-causality test in IDV(7) scenario
Figure 19. Causal map constructed based on Figure 18
Table of Contents (TOC) Graphic