Subscriber access provided by University of Rochester | River Campus & Miner Libraries
Process Systems Engineering
Efficient Reconstruction of Granger-Causal Networks in Linear Multivariable Dynamical Processes Sudhakar Kathari, and Arun Kumar Tangirala Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/acs.iecr.8b06109 • Publication Date (Web): 20 Mar 2019 Downloaded from http://pubs.acs.org on March 21, 2019
Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.
is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.
Page 1 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
Efficient Reconstruction of Granger-Causal Networks in Linear Multivariable Dynamical Processes Sudhakar Kathari and Arun K. Tangirala∗ Department of Chemical Engineering, IIT Madras, Chennai - 600 036, India. E-mail:
[email protected] Phone: +91 44 22574181. Fax: +91 44 22574152 Abstract Multivariable dynamical processes are characterized by complex cause and effect relationships among variables. Reconstruction of these causal connections from data, especially based on the concept of Granger causality (GC), has attracted significant attention in process engineering with applications to interaction assessment, topology reconstruction and fault detection. The standard practice for reconstruction of GC networks has been along the parametric route that deploys vector auto-regressive (VAR) models, but without giving due importance to the structural characteristics of data generating process (DGP). In this work, we first demonstrate that the presence of a mismatch between model structure and DGP leads to biased and/or inefficient estimates of causality measures and hence the strengths (weights) of causal connections. This issue is further aggravated for small sample sizes wherein additionally spurious causal relationships are detected. In this respect, we present, secondly, a systematic methodology for efficient reconstruction of weighted GC networks for stationary multivariable linear dynamical processes. This methodology uses a combination of sparse
1
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
optimization and recently introduced scalar correlation functions to select the most appropriate and parsimonious model structure. Applications to synthetic and environmental processes are presented to demonstrate the efficacy of the proposed approach.
1
Introduction
Industrial processes are generally complex multivariable systems due to interconnections (cause and effect relationships / causal relationships) among variables. Identification of inter-variable connections among variables (signals) in multivariable systems is a subject of great interest in different scientific areas including engineering, biology, econometrics and social sciences. Quite often, the process configuration or architecture is not available a priori due to insufficient or absence of process knowledge. In addition due to the presence of recycle streams (in industrial processes), feedback and complex interconnections, it may not be possible to label the variables exclusively as causes and effects since a process variable can play the dual roles of both cause and effect. Notwithstanding these facts, in applications of fault detection and diagnosis, determination of disturbance propagation pathways, interaction assessment, and process topology reconstruction, it is necessary to determine the causal pathways among variables, which implicitly involves the classification of variables into causes and effects or both. 1–8 . In all such applications, joint modelling of process variables has been a standard and natural strategy. Models thus developed can then be used to classify the variables as causes and effects or both while simultaneously detecting the existence or absence of causal pathways between variables. These inferences may be possible from a fundamental understanding (first-principles) of the processes, but data-driven methodologies are popular for a few important reasons: (i) limited or poor fundamental understanding of processes, (ii) natural provision for effective handling of process and measurement uncertainties in data-driven modelling, and (iii) the need for determining “effective” as against theoretical connectivity. These reasons have been further propelled by significant advances in data acquisition technology and algorithms for data-based modelling. 2
ACS Paragon Plus Environment
Page 2 of 53
Page 3 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
For the reasons stated above, identification of causal relationships from data is an emerging field of research today in different areas encompassing the broad fields of engineering, biological systems, and econometrics 9–18 . Historically, the origins of these methods can be traced back to econometrics (prediction theory) and social science subsequently traversing the field of neuroscience before making their appearance in engineering and systems biology. The last three decades in particular has witnessed a huge spurt of activity in this area bringing with it new formalisms and novel ideas. One of the striking impacts of causality analysis is the emergence of so-called causal network representations. These are essentially directed graphical representation of the cause and effect relationships that are inferred from data. The vertices in such networks represent variables while the directed edges encode the causal information between a pair of variables. The formal theory of causal networks with all the necessary synoptics has only been recently established 19–23 deriving inspiration from graphical models in statistics and other fields. 24–26 Broadly speaking, causal networks can be classified into two categories: (i) unweighted (or Boolean) networks, where only the structure (including directionality) of the network is specified, and (ii) weighted causal networks, where the strengths/weights in addition to the network structure are also specified. The weighted causal network naturally encodes more information than the former class and finds applications in determining the strongest paths, fault diagnosis, control of networks, etc. Reconstruction of causal networks from data in an efficient manner depends predominantly on (i) the quality of data, (ii) the quantity (amount) of data, and (iii) the method of estimation (time- and frequency- domain, parametric and non-parametric). It is a wellestablished fact that the data quality is affected by different factors such as lack of sufficient excitation in the inputs, measurement errors, outliers, and missing observations. The impact of measurement errors on reconstruction of Granger-causal (to be introduced shortly) networks is formally studied in 27 . A remedy for the same is proposed in 28 . An isolated work on the effects of additive outliers on causal networks in the context of econometrics appears in 29 . A formal methodology for handling missing observations is presented in 30 , wherein a sparse
3
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
optimization-based formulation is deployed. On the other hand, excitation requirements in data for the purpose of reconstructing causal networks remains an open-ended problem. Data size (quantity) is naturally expected to have a significant impact on the reliability (precision) of reconstructed causal networks. Intuitively, large sample sizes are expected to result in accurate and fairly reliable inferences of networks. Small sample sizes, on the other hand, can lead to highly imprecise estimates of causality measures and can also result in detection of spurious causal relations. The foregoing postulate is one of the focal points of study in this work. With regards to the third factor (method of estimation), it is useful to consider two classes, namely, parametric and non-parametric methods depending on whether a structure is assumed for the model or not. The preferred method is the parametric one (as in signal processing and system identification applications) for reasons of parsimony (of models), numerical stability and smoothness of estimates. Parametric methods for causal network reconstruction involves estimating a model of specified structure followed by inference of causal pathways. Undoubtedly, the accuracy and reliability (efficiency) of the causal inferences is critically determined by the model structure and its estimate. In the literature on Granger-causal network reconstruction, which is also the context of this work, it is a standard practice to use vector auto-regressive models without necessarily giving due regard to the underlying data generating process. In this work, we first demonstrate that this standard practice can give rise to inefficient and inaccurate inferences. Subsequently we present a remedial solution for the same. On a relative scale, the impact of aforementioned factors is larger on the estimates of weighted causal networks than their unweighted counterparts, since the strengths of connections also need to be determined in addition to merely the network structure. The first step in reconstructing a causal network is concerned with the determination of causal pathways, which requires the use of an appropriate statistical measure of causality. In the present work, we use the widely used Granger causality measure. There exist other measures of causality, namely, the Sims causality 31 , Akaike causality 32 or the most recent
4
ACS Paragon Plus Environment
Page 4 of 53
Page 5 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
convergent cross-mapping measure 33 . Each of these measures differs in the “type” of causality they capture (e.g., direct / total) and / or the processes (e.g., stochastic or deterministic). The Granger causality (GC) measure is devised to capture the direct causal pathway between a pair of stochastic signals. It is based on two key assumptions: (i) the cause precedes its effect, and (ii) the cause has unique information with respect to the prediction of effect. The GC is also referred to as Wiener-Granger causality (WGC) in certain parts of literature since the seeds of this prediction-based notion of causality were sown by Wiener 34 which were then watered by Granger 9 in the framework of multivariate linear stationary processes. The technical idea underlying Granger causality is as follows. Suppose we wish to test if a random signal z1 [k] Granger causes another random signal z2 [k]. Predict z2 [k] using its own past and construct another forecast by incorporating past of z1 [k]. If the forecast is improved as a result, then z1 [k] is said to “Granger causes” z2 [k]. In a similar manner, one can determine the existence of a causal pathway from z2 [k] to z1 [k], if a feedback causality or a two-way relationship exists between z1 [k] and z2 [k]. It must be noted that the original formulation of Granger causality does not account for instantaneous causality, but only for lagged causality. Furthermore, as mentioned previously, Granger causality is a measure of direct causality as against another measure (for example, Sims causality 31 ) that examines causality along both indirect and direct pathways. The bivariate Granger causality based approach for multivariable case results in spurious connections due to confounding among variables. Thus the multivariate Granger causality based on joint modelling of all the variables using vector time series modelling has emerged as the standard choice for identification of causal relationships in multivariable processes. Although the concept (and measure) of Granger causality was originally devised in the timedomain, extensions to frequency-domain gradually gained momentum, especially in the fields of neuroscience and process engineering 10,22,35 . A prime benefit of working with frequencydomain measures is that they do not require an accurate specification of the model order in contrast to the prevailing time-domain methods. Furthermore, they provide insights into the
5
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
filtering characteristics of the connectivity. Finally, when the application of interest is wellcharacterized in the frequency domain, it is natural to employ a frequency-domain measure. Among the several frequency-domain causality measures, the two well-established measures that are of interest in this work are partial directed coherence (PDC) 10 , and direct pathway function (DPF) 13 . Regardless of the time- and frequency- domain statistics and the underlying data generating process (DGP), it has been a standard practice to use vector auto-regressive (VAR) models for reconstructing causal networks without giving due importance to the structural characteristics of the DGP 9,10,12,13,35–37 . This may be attributed to two compelling reasons, namely, that (i) VAR models can be uniquely estimated using linear least squares methods that are computationally light and (ii) a sufficiently high-order VAR model can approximate a stationary process with reasonable accuracy. The DGP, on the other hand, could be of non-VAR type - either a vector moving-average (VMA) or a vector auto-regressive movingaverage (VARMA) type. In such situations, the fixed VAR model structure approach can lead to the following two important issues. The first one is that biased and/or inefficient estimates of causality measures result and consequently the strengths of causal connections as well. Secondly, in the small sample scenario, this issue further leads to the detection of spurious causal relationships among variables. These issues are more pronounced in reconstruction of a weighted causal network since it involves the additional step of pathway strengths as against mere detection of pathways. The prime reason for the aforementioned issue is that mismatch between DGP and a fixed model class leads to a non-parsimonious representation, i.e., model structures with parameters in excess of what is required with a model of different class. This difference is amplified in multivariable processes. Consider, for example, a scenario when the DGP is an M -dimensional first-order VMA, essentially an M 2 parameter process. The VARequivalent representation of such a process would be of much higher-order, requiring p M 2 parameters. Recalling that estimation errors are, in general, proportional to the parameter
6
ACS Paragon Plus Environment
Page 6 of 53
Page 7 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
size, the parameter estimates of the VAR-equivalent model would incur much higher errors than acceptable. This problem is further compounded for small sample sizes. In general, since a model structure that is commensurate with the DGP is the most parsimonious among all possible model structures for a given process, it can be expected that networks estimated from models that are in line with the DGP are relatively more efficiently reconstructed from data. To stretch this argument a bit further, convenience or ease of estimation cannot be necessarily the basis for selecting a model structure, especially in network reconstruction. The selection of an appropriate model structure has been a long-standing problem of interest in modelling multivariable processes. The main task of model structure selection is finding a model that is most parsimonious and simplest from a set of plausible models. The choice of a reasonable model structure may not be obvious if the prior knowledge of DGP is not known. A majority of the literature has focused on selecting an appropriate model structure post-estimation by means of information theoretic criteria such as the Akaike information criterion (AIC), and the Bayesian (or Schwartz) information criterion (BIC / SIC) 38,39 . On the other hand, the modern sparse optimization methods have the capability of selecting the most parsimonious model of a pre-specified model family during estimation. However, selecting the most appropriate (parsimonious) model family, i.e., VAR / VMA / VARMA, for a given process requires a manual search. In order to choose an appropriate model structure prior to model estimation, multivariate time series analysis offers two statistics, namely, the matrix auto-correlation function (ACF), and partial lag auto-correlation function (PACF) or alternatively, the inverse auto-correlation function (IACF) 40–42 . The typical usage of these correlations involves examining the plots of matrix correlation functions, which consist of auto-correlations along the diagonal and cross-correlations along the off-diagonal. For an M -dimensional process, it is required to examine 2M 2 plots to determine the correlation structure (e.g., AR or MA and the order) of each channel. This approach serves the purpose well for very low-dimensional processes but becomes cumbersome even for moderately high-dimensional processes.
7
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
In order to overcome this difficulty and also to obtain the overall correlation characteristics of multivariate process, recently the authors have developed scalar correlation functions for multivariable processes (see Section 3.2 for a synopsis of these functions including the theoretical definitions). However, it may be noted that the model order determined by these scalar functions is the overall order of the process, i.e., the largest among orders of the individual channels. Thus a model order determined by these scalar functions can be overparametrized with respect to a few individual channels. In order to obtain a tight model, the sparse optimization or penalized regression methods are used to optimize the order of individual channels. A positive side-benefit of turning to sparse optimization is their ability to select the most parsimonious model even in small sample size scenarios. With the above motivation, we propose a systematic methodology that uses scalar correlation functions combined with the sparse optimization for efficient reconstruction of weighted Granger-causal networks of linear dynamical multivariable processes, a problem that has neither been addressed nor studied in the literature. The key highlights of this work are (a) a demonstration that the presence of a model structure-DGP mismatch leads to detection of spurious causal relationships, especially for small sample sizes, (b) a study to compare the performance of different regularization methods for the purpose of reconstructing causal networks, (c) a definition for strength of connectivity, and (d) a systematic methodology for efficient reconstruction of GC networks in linear multivariable dynamical processes. The rest of this article is organized as follows. Section 2 discusses the brief overview of model descriptions, causality measures in frequency-domain, and different approaches for reconstruction of weighted GC networks. The proposed methodology for efficient estimation of weighted GC networks in linear stationary multivariate processes is presented in Section 3. The effectiveness of the proposed approach is shown through case-studies in Section 4. The article ends with few concluding remarks in Section 5.
8
ACS Paragon Plus Environment
Page 8 of 53
Page 9 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
2
Preliminaries
This section briefly reviews the requisite mathematical definitions and terminology that are essential for the reconstruction of weighted GC networks from measurements.
2.1
Model descriptions
Let z[k] = (z1 [k], z2 [k], · · · , zM [k])T be a M -dimensional jointly stationary multivariate random process. A vector autoregressive (VAR) representation for this process is given by 41 ,
z[k] =
P X
Ar z[k − r] + e[k]
(1)
r=1
where P is the order of the process, Ar is the autoregressive coefficient matrix at lag r and e[k] is an M -dimensional vector white noise (VWN) process with covariance Σe . In this representation, the present state of the process is a superposition of the past state of the process and present uncertainties / innovations. The vector moving average (VMA) representation, on the other hand, models the process as a linear combination of the present and past innovations e[k]. The mathematical representation of finite-order VMA model is given by 40 ,
z[k] =
Q X
Hr e[k − r] + e[k]
(2)
r=1
where Q is the order of the process, Hr is the moving average coefficient matrix at lag r and e[k] is as before, the M -dimensional VWN process. In this work, we work with stationary VAR and invertible VMA models. Frequency-domain representations of these processes are extensively used in order to characterize their spectral densities and also the filtering nature of the individual channels.
9
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 10 of 53
These are characterized by A(ω) and H(ω), given by,
A(ω) , A(e
−jω
)=I−
P X
−jωr
Ar e
,
H(ω) , H(e
−jω
)=I+
Q X
Hr e−jωr
(3)
r=1
r=1
respectively, with
a (r) a12 (r) · · · 11 . .. .. .. . . Ar = . .. .. . . . . aM 1 (r) aM 2 (r) · · ·
a1M (r) .. . , .. . aM M (r)
··· .. . .. . ···
h (r) h12 (r) · · · 11 . .. .. .. . . Hr = . .. .. . . . . hM 1 (r) hM 2 (r) · · ·
··· .. . .. . ···
h1M (r) .. . .. . hM M (r)
(4)
where A(ω) and H(ω) are the Fourier transform of the autoregressive coefficient matrix, Ar and the moving average coefficient matrix, Hr , respectively and the coefficients aij (r) represent the linear influence of zj onto zi at lag r. The following equivalence exists between VAR and VMA processes in the frequency-domain,
H(ω) = A−1 (ω)
(5)
The vector autoregressive-moving average (VARMA) representation is a natural generalization for characterizing processes that have mixed characteristics. Mathematically, a VARMA(P ,Q) model is represented as 43,44 ,
z[k] =
P X
Ar z[k − r] +
r=1
Q X
Br e[k − r] + e[k]
(6)
r=1
where Ar and Br are the autoregressive and the moving average coefficient matrices at lag r, respectively. Section 1.1 of the Supporting Information describes the method of estimating these vector time series models.
10
ACS Paragon Plus Environment
Page 11 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
2.2
Granger causality
As mentioned in Section 1, the concept of Granger causality is based on two assumptions, firstly, the cause precedes its effect and secondly, the cause contains unique information about the effect that is not contained in any other variable. To understand the concept of Granger causality between a cause signal zi [k], on an effect zj [k], predict zj [k] using the past information of available signals with the exception of zi [k] and construct another forecast by including the past of zi [k]. Denote the associate forecast errors by ε1 [k] and ε2 [k] with variances σε21 and σε22 , respectively. Then the signal zi [k] is said to Granger cause zj [k] if σε22 < σε21
(7)
In other words, inclusion of the past of zi [k] should improve the predictions of zj [k]. For the stationary processes with autoregressive representation, Granger showed that the condition for causality in (7) is identical to the requirement that aji (r) 6= 0 for at least one r, where aji (r) is (j, i)th element of Ar in (1). On the other hand, statistical tests for this requirement were provided by Geweke 45 . Modern causality measures include either testing based on parametric models or frequency-domain measures as discussed in the sections to follow.
2.3
Frequency domain causality measures
A variety of data-driven causality detection measures were developed in frequency domain 13,45–49 . Among them, the partial directed coherence (PDC) and direct pathway function (DPF) have proven to be effective in detecting the GC connectivity among variables whereas the directed transfer function (DTF) is useful for identifying the so-called Sims-causality 31,50 . In the past years, the frequency domain Granger causality measures have been widely used for detecting causality. All these frequency domain measures are rooted in the well-known spectral factorization 51 . In this work, we focus on DPF-based Ganger-causality measure with lesser excursions to PDC. 11
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
2.3.1
Page 12 of 53
Partial directed coherence
Baccala and Sameshima 10 introduced partial directed coherence (PDC), which serves as a frequency domain measure for Granger causality. It measures only direct effects and directional flow of information between the variables of a multivariate system. The PDC from zj to zi is given by aij (ω) πij (ω) , πi←j (ω) = v uM uX t |aij (ω)|2
(8)
i=1
The squared PDC is a bounded measure and satisfies the following properties
M X
0 ≤ |πij (ω)|2 ≤ 1
(9)
|πij (ω)|2 = 1 for all 1 ≤ j ≤ M.
(10)
i=1
Partial directed coherence is suitable only for structure determination but does not provide a normalized measure of the direct power transfer between variables. Although the PDC originally was defined in terms of VAR model parameters, the theoretical PDC can be computed from VMA model using the expression in (8) by determining A(ω) through inversion of H(ω). Similarly, the PDC can be computed from VARMA model by computing A(ω) through the product of H−1 (ω) and A(ω).
2.3.2
Direct pathway function
Direct pathway function (DPF) introduced by Gigi and Tangirala 13 directly quantifies the amount of power transfer from the source variable ej (driving force for zj ) to the sink zi . In contrast to PDC which quantifies the power transfer from the source variable zj to the sink
12
ACS Paragon Plus Environment
Page 13 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
zi . The DPF from the source ej to sink zi is given by, hD,ij (ω)
ψij (ω) , ψi←j (ω) = s
M P
(11)
|hD,ij (ω)|2
i=1
where ¯ ij ) −aij (ω) det(M , det(A(ω)) hD,ij (ω) = det(Mij ) = hij (ω), det(A(ω))
i 6= j (12) i=j
¯ ij (ω) is the minor matrix obtained from A(ω) by eliminating both ith and j th row where M and column, while Mij (ω) is the standard minor matrix obtained by eliminating the ith row and j th column, respectively from A(ω). Although, the hD,ij (ω) is defined using VAR representation, it can be obtained for VMA model computing A(ω) through inversion of H(ω). Similarly, for VARMA model by computing A(ω) through the product of H−1 (ω) and A(ω). The squared DPF is a bounded measure and satisfies the following properties
M X
0 ≤ |ψij (ω)|2 ≤ 1
(13)
|ψij (ω)|2 = 1 for all 1 ≤ j ≤ M.
(14)
i=1
The normalization used in this definition is similar to that as in PDC. The squared version of DPF is termed as the direct power transfer (DPT), is a measure of power transfer directly from source ej to sink zi . Section 1.2 of the Supporting Information describes the method of estimating the above measures from a VAR, VMA or VARMA model.
13
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
2.4
Page 14 of 53
Reconstruction of Granger-causal networks
Reconstruction of GC networks (for linear, jointly stationary processes) is based on either time- or frequency-domain implementations of Granger causality 9 ; recall its description in Section 1. The standard approach in time-domain methods consists of fitting a time series model to the observed data followed by a simple inspection of the estimated model. The existence of Granger causal relationship is determined from models and frequency-domain measures based on the following results. VAR approach: The Granger causal relationship does not exist from zj → zi if and only if 41
aij (r) = 0,
for r = 1, · · · , P
(15)
and vice versa when testing for causality in the direction zi → zj . In practice, the estimated VAR model coefficient matrices have to be used; in which case, the existence of Grangercausal relationship from zj → zi can be tested by conducting a hypothesis test using the statistic 50 ,
ˆ −1 a ˆTij V Sij = N a ij ˆ ij
(16)
ˆij = (aij (1), aij (2), · · · , aij (P ))T is the AR coefficient where N is the number of observations, a ˆ ij is a P × P matrix for the pair estimates for the pair of variables (zi , zj ) at all lags, and V ˆ A of VAR model coefficients of variables obtained from the estimate of covariance matrix, Σ ˆ A is M 2 P × M 2 P ). Under the null hypothesis that zj does not Granger-cause (the size of Σ zi , the test statistic Sij is asymptotically χ2 -distributed with P degrees of freedom.
14
ACS Paragon Plus Environment
Page 15 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
VMA approach: The Granger causal connection does not exist from zj → zi if and only if 41
hij (r) = 0,
for r = 1, · · · , Q
(17)
VARMA approach: The concepts used for detecting Granger causality from VAR and VMA representations are applied to a stationary and invertible VARMA(P, Q) representation. A variable zj does not Granger-cause zi if and only if 41
aij (r) ≡ 0 & bij (r) ≡ 0, ∀r
(18)
In the frequency-domain methods, the GC relationships are determined by applying the following result to the estimated PDC / DPF measures. A variable zj does not Granger-cause zi if and only if 10,13 |πij (ω)|2 = 0,
∀ω
(from PDC)
(19)
∀ω
(from DPF)
(20)
OR |ψij (ω)|2 = 0,
Both PDC and DPF are capable of detecting the causal structure. However, an advantage of working with DPF is that an interpretable strength of connectivity can be defined (see Section 3.3).
3
Proposed Methodology
In this section, first, we study the performance of different regularization methods for the purpose of reconstructing causal networks. Next, a synopsis of recently developed scalar correlation functions including the theoretical definitions is provided. Finally, we introduce
15
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
the strength of connectivity and present the proposed systematic methodology for efficient estimation of weighted GC networks in linear multivariate dynamical processes.
3.1
Sparse optimization for reconstruction of causal networks
As mentioned earlier, the parameters of time series models such as VAR , VMA and VARMA describe the temporal relationships and also provide a structural (causal) information among variables. Successful identification of causal relationships depends on efficient estimation of these models. However, fitting these models to large dimensional time series using traditional estimation methods is challenging primarily due to a large number of parameters involved. In a multivariate time series model, the number of parameters grows quadratically with the dimension of the process. For example, fully-parametrized M -dimensional VAR(P ) model contains M 2 × P parameters. The parameters of multivariate models can be efficiently estimated using traditional estimation methods when the dimension (M ) of the process is small. However, when the number of parameters (Np ) to be estimated exceeds sample size (N ), the estimates using traditional estimation methods are not well behaved or even no longer unique since there may exist a large number of spurious parameter estimates, which can result in detection of spurious causal relations. In real life applications, the available data points are often limited, which in turn leads to poor estimation accuracy. As a result, fitting multivariate models using traditional estimation methods are rarely applied to large dimensional processes. Therefore reduction of a number of parameters to be estimated becomes an important aspect in estimating multivariate models to large dimensional process. To address these issues, a possible solution to reduce the number of parameters to be estimated is imposing sparsity constraint on the model parameters and performing variable selection using penalized (regularized) regression. Regularized regression effectively reduces the number of parameters to be estimated, which is particularly important when the data is of limited sample size. The objective of variable selection is identifying the non-zero parameters and giving accurate estimates of those parameters, with the main assumption is that the 16
ACS Paragon Plus Environment
Page 16 of 53
Page 17 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
Np parameter vector is sparse with many parameters being exactly zero or negligibly small. Further, the estimation accuracy and computational efficiency of modelling using regularized methods with sparsity depend on the kind of penalty function. In this section, we present the extensive numerical studies using different penalty functions, namely, least absolute shrinkage and selection operator (LASSO), smoothly clipped absolute deviation (SCAD), elastic net, and minimax conclave penalty (MCP) for estimation of sparse VAR models and the performance of these approaches is compared through simulations on synthetic data.
3.1.1
Estimation of VAR models using sparse optimization
Given a multivariate time series z[k], the VAR(P ) model can be written compactly as 41
Y = ZA + E
(21)
where
Y = [z[P + 1], . . . , z[N ]]T A = [A1 , A2 , . . . , AP ]T Z = [z[P : N − 1], . . . , z[1 : N − P ]]T E = [e[P + 1], e[P + 2], . . . , e[N ]]T
The regularized least squares optimization problem is
ˆ = arg min ||Y − ZA||2 + λ A 2 A
M X
p(|Aj |)
(22)
j=1
where k.k2 denotes the L2 norm, λ is the regularization parameter (λ ≥ 0) and p(.) is the penalty function applied to each regression coefficient. Minimization of regularized optimization problem gives the estimates of VAR model parameters. Commonly used penalization and their penalty functions for sparse VAR modelling are listed in Table S1 of the Supporting 17
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Information. A brief review of these penalty functions and their properties are presented in Section 2 of the Supporting Information. We next demonstrate the performance of different regularization based methods for estimation of VAR models through simulations under small sample scenario on high dimensional synthetic data.
3.1.2
Example: 6-dimensional VAR(3) process
The process under study is a 6-dimensional VAR of order 3, described by the following equations,
0 0 0.4 0 0 0 0.3 0 0 0 0 0 0 0.2 0 0 0 0 0 0 0 0 0 0 0.5 0.4 0 0 0 0 0.4 0 0 0 0 0 z[k − 2] · · · z[k] = z[k − 1] + 0 0 0 0.4 0 0 0 0 0 0.6 0 0 0 0 0 0 0.3 0 0 0 0.6 0 0.3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 −0.2 0 0 0.3 0 0 0 0 0 0 0 z[k − 3] + e[k] (23) ··· + 0 0 0 0.2 0 0 0 0 0 0 0 0 0 0 0 −0.45 0 0.7 where ei [k], i = 1, · · · , 6 are usual zero-mean and unit variance uncorrelated white noise sequences. In order to study the effect of sample size on the error in the parameter estimates, the process is simulated to generate N = 100 samples of data, which is less than the number of parameters (Np = 108) to be estimated (N < Np ). Note that such a short data length is purposely chosen as to illustrate the effects of different regularization methods on the 18
ACS Paragon Plus Environment
Page 18 of 53
Page 19 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
estimates of model parameters. First, we estimate the model parameters using traditional ordinary least squares (OLS) approach. Next, we turn to the estimation of model parameters using sparse based regularization methods. The model parameters are estimated by fitting sparse VAR model to the data using different penalty functions corresponding to LASSO, Elastic Net, SCAD and MCP. The 10-fold cross validation technique is used for selecting the regularization parameter 52 . The estimate of A1 coefficient matrix using different methods for R = 100 replications is displayed in Figure 1. Panel (a) of Figure 1 displays the true AR coefficient matrix A1 , where the presence of circle indicates the existence of non-zero AR coefficient and the colour of the circle shows the value of the corresponding AR coefficient. Panel (b) shows the estimate of A1 coefficient matrix using OLS. The estimates of A1 coefficient matrices using different regularization based methods such as LASSO, elastic net, SCAD, and MCP are displayed in Panels (c), (d) and (e), and (f) respectively. Similarly, the estimates of A2 and A3 coefficient matrices are displayed in Figure 2 and Figure 3, respectively. It is clear that the estimates of model parameters obtained from Elastic net, SCAD and MCP are close to the true parameters. However, the elastic net based regularization method is computationally efficient than the SCAD and MCP based approaches. The computational time and the mean square error of these regularization methods are listed in Table 1. The results suggest that when the number of measurements is limited as often the case in real life applications, the sparse optimization approaches are more suitable for reconstruction of causal networks since successful identification of causal relationships depends on the efficient estimation of model parameters. Further, the results show that the Elastic net / SCAD / MCP based regularization methods produce more accurate results and among these, the Elastic net based method is computationally efficient. Therefore, it is advisable to use Elastic net based regularization method for estimation of VAR models for reconstruction of causal networks (the case studies in Section 4 demonstrate the small sample size challenges with respect to reconstruction of causal networks).
19
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1
2
3
4
5
6
1
2
3
4
5
6
0.8
1
2
●
●
●
●
●
●
●
●
●
●
4
●
●
●
●
●
●
●
●
●
2
●
●
●
●
3
4
6
●
●
●
●
●
6
●
−0.5
1
2
3
4
5
6
●
●
●
●
●
●
●
●
●
1
●
4
●
●
●
●
2
3
4
5
6
●
●
●
●
●
2
●
●
●
●
●
●
●
●
●
6
●
●
●
●
●
2
●
●
●
●
●
●
3
●
●
●
●
●
●
4
●
●
●
●
●
5
●
●
●
●
●
●
6
●
●
●
●
●
●
0.37
−0.07
−0.5
●
●
●
●
3
4
●
●
●
1
2
3
4
5
6
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
0.8
1
2
●
0.37
●
●
●
●
●
●
●
●
●
−0.07
5
0.8 ●
0.8
1
0.37
3
6
●
ˆ 1 using LASSO (c) A
0.8
2
5
●
ˆ 1 using OLS (b) A
●
4
●
−0.5
(a) A1
1
3
●
−0.07
5
●
2
1
0.37
−0.07
5
1 0.8
1
0.37
3
Page 20 of 53
0.37
3
4
●
●
●
5
●
●
●
●
6
●
●
●
●
−0.07
5
●
●
●
●
6
●
●
●
●
●
●
●
−0.5
−0.07 ●
●
●
−0.5
ˆ 1 using Elastic Net (d) A
−0.5
ˆ 1 using SCAD (e) A
ˆ 1 using MCP (f) A
Figure 1: Display of A1 coefficient matrix estimates (N = 100, R = 100)
1
2
3
4
5
6
1
2
3
4
5
6
0.8
1
2
●
●
●
●
●
●
●
●
●
●
●
2 0.37
3
●
●
●
●
●
4
●
●
●
●
●
●
●
6
●
●
●
3
4
●
●
●
6
●
−0.5
1
4
5
●
●
●
●
●
●
●
●
●
2
3
●
6
1
●
●
●
●
●
4
●
●
●
●
●
2
3
4
5
●
●
6
●
●
● ●
●
●
2
●
●
●
●
●
●
3
●
●
●
●
●
●
4
●
●
●
●
●
●
5
●
●
●
●
●
●
6
●
●
●
●
●
●
0.37
−0.07
−0.5
6
2
●
●
●
●
●
●
●
●
●
●
●
●
●
●
3
●
●
●
●
●
4
●
●
●
●
●
2
3
4
5
6 0.8
●
●
●
●
●
●
●
●
●
2
●
●
3
●
●
●
●
●
4
●
●
●
●
●
5
●
●
●
6
●
●
0.37
−0.07
5
●
●
6
●
●
●
●
●
●
●
●
−0.5
ˆ 2 using Elastic Net (d) A
1 1
0.37
−0.07
5
0.8 ●
0.8
1
0.37
3
6
●
ˆ 2 using LASSO (c) A
0.8
2
5
●
ˆ 2 using OLS (b) A
●
4
●
−0.5
(a) A2
1
3
●
−0.07
5
●
2
1
0.37
−0.07
5
1 0.8
1
−0.07
●
●
●
●
●
−0.5
ˆ 2 using SCAD (e) A
−0.5
ˆ 2 using MCP (f) A
Figure 2: Display of A2 coefficient matrix estimates (N = 100, R = 100) 20
ACS Paragon Plus Environment
Page 21 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
1
2
3
4
5
6
1
2
3
4
5
6
0.8
1
●
2
●
●
●
●
●
●
●
2
●
0.37
3
●
●
●
4
●
●
●
●
●
●
●
●
●
●
●
6
●
●
●
●
●
3
4
6
●
−0.5
1
2
2
●
●
3
4
5
6
●
●
●
●
●
●
1
●
●
4
●
●
●
●
●
●
●
6
●
●
●
●
●
●
2
●
●
●
●
●
●
3
●
●
●
●
●
●
4
●
●
●
●
●
●
5
●
●
●
●
●
●
6
●
●
●
●
●
0.37
−0.07
−0.5
2
3
4
5
6
●
●
●
●
●
●
●
●
●
●
●
●
●
●
2
●
●
●
4
5
6
●
●
●
●
●
●
●
●
●
●
●
5
●
●
●
6
●
●
●
2
0.8
1
●
2
●
3
●
4
●
●
0.37
3
●
●
●
4
●
●
●
●
●
●
●
●
0.37 ●
●
●
●
●
●
−0.07
5
●
●
●
6
●
●
●
●
●
●
●
−0.5
ˆ 3 using Elastic Net (d) A
3 ●
1 0.8
1
−0.07
5
0.8 ●
ˆ 3 using LASSO (c) A
0.37
3
6
●
ˆ 3 using OLS (b) A
0.8
●
5
●
−0.5
(a) A3
1
4
●
−0.07
5
●
3
2
1
0.37
−0.07
5
1 0.8
1
●
−0.07 ●
●
−0.5
ˆ 3 using SCAD (e) A
−0.5
ˆ 3 using MCP (f) A
Figure 3: Display of A3 coefficient matrix estimates (N = 100, R = 100)
Table 1: Computational time and mean square error for different regularization methods N = 100, Np = 108 Elastic Net SCAD
3.2
MCP
Computational Time (Sec.)
0.3830
3.8177
3.7678
Mean Square Error
1.059
1.044
1.033
Scalar correlation functions for model structure selection
Recently the authors have developed scalar correlation functions for model structure and its order selection in multivariate processes. These scalar correlation functions are central to the proposed methodology for reconstructing weighted GC networks in an efficient manner. A synopsis of these functions including the theoretical definitions is therefore provided below. The key idea used in arriving at the scalar correlation functions is that of constructing 21
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 22 of 53
two univariate processes v[k] and w[k] that are equivalent to the given multivariate process in the sense of lagged correlation. Given an M -dimensional stationary multivariate process z[k], we first construct univariate process v[k]
α ∈ RM ×1 , αi < ∞, αi 6= 0 ∀i
v[k] = αT z[k],
(24)
such that v[k] is second-order stationary and importantly satisfies
ρvv [l] = 0 ⇐⇒ Γz [l] = 0
(25)
where ρvv [l] is the scalar auto-correlation function for the given multivariate process z[k] and Γz [l] is the standard matrix autocorrelation function at any lag l for a multivariate stationary process 40,41 (with the exception of some highly peculiar processes with skewsymmetric Σz [l]). See Section 3.1 of the Supporting Information for proof. The scalar autocorrelation function (SACF) is then,
ρ[l] ,
1,
l=0
σ [l] αT Σ [l]α vv = T z , |l| ≥ 1 σvv [0] α Σz [0]α
(26)
where Σz [l] is the standard matrix auto-covariance function for a multivariate stationary process. It maybe noted that such a univariate process can be constructed for almost all multivariate processes. The scalar autocorrelations for a stationary VMA(Q) process are zero after the lag l > Q. On the other hand, the scalar autocorrelations for VAR(P ) process decays gradually to zero with lag. The coefficient vector α can be chosen in a free manner so long as it meets the requirement in (24). However, for practical purposes, it is recommended to choose the coefficient vector as the solution to (see Section 3.2 of the
22
ACS Paragon Plus Environment
Page 23 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
Supporting Information for details):
Σz [0]α = diag(Σz [0])
(27)
which has a unique solution since Σz [0] is non-singular. This choice of α ensures that the resulting univariate process optimally captures the information in the sense of lagged correlations. It may be possible that one or few of αi ’s can be zero under some conditions which in our experience very are rarely encountered for real-life processes. Under such circumstances, any other non-zero α may be chosen. The second univariate process w[k] is constructed in a similar manner to that of v[k] but with inverse ACF (IACF) replacing the ACF in the foregoing development. The inverse ACF enjoys a perfect dual relationship with the ACF, in the sense that the IACF of a stationary process is identical to the ACF of the inverse process, defined as follows 53,54 . Given a stationary and invertible multivariate process z[k],
A(q −1 )z[k] = B(q −1 )e[k]
(28)
with variance and covariance matrix of e[k] as Σe , then its inverse process z0 [k] is B(q −1 )z0 [k] = A(q −1 )˜ e[k]
(29)
with variance and covariance matrix of ˜ e[k] as Σ−1 e . With this definition of inverse process, we construct the second univariate process w[k]
w[k] = β T z0 [k], β ∈ RM ×1 , βi < ∞, βi 6= 0 ∀i,
23
ACS Paragon Plus Environment
(30)
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 24 of 53
such that w[k] is second-order stationary and satisfies
ρww [l] = 0 ⇐⇒ Γz0 [l] = 0
(31)
where ρww [l] is the scalar autocorrelation of the inverse process and Γz0 [l] is the standard matrix auto-correlation function for a multivariate inverse process z0 [k] (with the exception ¯ z [l]). Using the duality property, of some highly peculiar processes with skew-symmetric Σ we write (31) as
¯ z [l] = 0 ρ¯vv [l] , ρww [l] = 0 ⇐⇒ Γz0 [l] = Γ
(32)
where ρ¯vv [l] is the scalar inverse auto-correlation function (SIACF) for the given multivariate ¯ z [l] is the standard matrix inverse auto-correlation function for a multiprocess z[k] and Γ variate stationary process 42 . The scalar inverse auto-correlation function (SIACF) at any lag l is then,
ρ¯[l] ,
1,
l=0
¯ [l]β σ [l] βT Σ ww = T z , |l| ≥ 1 ¯ z [0]β σww [0] β Σ
(33)
¯ z [l] is the standard matrix inverse auto-covariance function for a multivariate stawhere Σ tionary process. The scalar inverse autocorrelations for a stationary VAR(P ) process are zero after the lag l > P . On the other hand, the scalar inverse autocorrelations for VMA(Q) process decays gradually to zero with lag. Similar to the coefficient vector α, the coefficient vector β can be chosen in a free manner so long as it meets the requirement in (30). However, for practical purposes, it is recommended to choose the coefficient vector as the solution to:
¯ z [0]β = diag(Σ ¯ z [0]) Σ
24
ACS Paragon Plus Environment
(34)
Page 25 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
¯ z [0] is non-singular. The properties of the scalar correwhich has a unique solution since Σ lation functions for multivariate processes are summarized in Table 2. Table 2: Properties of the scalar correlation functions Model structure
SACF
VAR(P )
VMA(Q)
Decreases exponentially
Cuts off at l = Q Decreases exponentially
SIACF Cuts off at l = P
VARMA(P, Q) Decreases exponentially Decreases exponentially
Two illustrative examples are provided to demonstrate the usefulness of proposed scalar correlation functions for selection of model structure and its order in multivariate processes.
3.2.1
Illustrative example 1
Consider a 3-dimensional VMA(1) generating process,
z1 [k] = 0.5e1 [k − 1] + 0.4e2 [k − 1] + 0.6e3 [k − 1] + e1 [k] z2 [k] = 0.3e2 [k − 1] + e2 [k] z3 [k] = 0.6e2 [k − 1] + 0.2e3 [k − 1] + e3 [k]
(35)
where e1 [k], e2 [k] and e3 [k] are usual zero-mean and unit variance uncorrelated Gaussian white noise sequences. The process is simulated to generate N = 1000 observations. The sample SACF and SIACF from data along with the 95% significance levels are shown in Figure 4. It s clear that the SACF abruptly goes to zero after lag 1 whereas the SIACF decreases gradually to zero with lags. Thus the scalar measures identify the model structure as VMA(1) for the process, which is indeed the case.
25
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research
1
1
Sample SIACF
Sample SACF
0.8 0.6 0.4 0.2
0.5
0
0 -0.2
-0.5 0
2
4
6
8
10
0
2
4
Lag
6
8
10
Lag
(a) SACF
(b) SIACF
Figure 4: Sample SACF and SIACF for the VMA(1) process in (35). 3.2.2
Illustrative example 2
In this example, the process under study is a 6-dimensional VAR(3) process of (23). The process is simulated to generate N = 2000 samples of data. The sample versions of scalar correlation functions obtained from data are shown in Figure 5. From these plots, it is clear that the sample SIACF of the process abruptly goes to zero after lag l = 3 whereas the sample SACF decreases gradually to zero with lag. Thus the scalar correlation functions suggest that the process is VAR(3), which is in agreement with the true process structure.
1
1
0.8
0.8
Sample SIACF
Sample SACF
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 26 of 53
0.6 0.4 0.2 0
0.6 0.4 0.2 0 -0.2
-0.2
-0.4 0
5
10
15
0
2
Lag
4
6
8
10
Lag
(a) SACF
(b) SIACF
Figure 5: Sample SACF and SIACF for the VAR(3) process in (23). It is observed that the user has to examine only a single plot of SACF and single plot of SIACF instead of 2M 2 matrix correlation and inverse correlation plots. Furthermore, these measures provide a correlation structure without any loss of information on the overall correlation characteristics of the multivariate process. Thus the scalar correlation functions 26
ACS Paragon Plus Environment
Page 27 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
are useful in identifying the correlation structure of multivariate data generating process, which is very important in reconstructing weighted causal networks in an efficient manner.
3.3
Strength of connectivity
We introduce the definition of strength of connectivity based on the direct pathway function. The DPF directly quantifies the amount of power transfer from the source ej to the sink zi . The direct power transfers in a multivariable process occur through direct pathways. Thus the strength of the connection can be expressed in terms of direct power transfer. A definition of strength of connectivity ζij along (i, j) pathway that quantifies the amount of power transfer from the source ej to the sink zi as 1 ζij = π
Z
π
|ψij (ω)|2 dω
(36)
0
where |ψij (ω)|2 is the direct power transfer from a source ej to sink zi . The strength satisfies the following properties
M X
0 ≤ ζij ≤ 1
(37)
ζij = 1 for all 1 ≤ j ≤ M.
(38)
i=1
Thus, the strength is always positive and theoretically it is zero when the causal connection does not exist between the variables. The strength (weight) is a measure of the normalized power transfer from the source (causal) variable directly to the sink (effect). It is useful in determining the extent to which the variability in source directly transfers to the sink, i.e, the extent to which the variance of a (white-noise) disturbance in the source directly affects the effect. Since it is normalized across effects (for a given source), it allows us to compare the strongest propagation path for a given source. Higher the strength value stronger is the causal path.
27
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Estimates of connectivity strengths are obtained directly from the estimates of DPF using the expression given in (36).
3.4
Methodology for efficient estimation of weighted GC networks
The proposed methodology for efficient estimation of weighted Granger causal networks in linear dynamical processes is described below in a stepwise manner. Primarily, the proposed approach consists of three steps. In the first step, we identify a suitable model structure and its order for the given multivariate data using scalar correlation functions presented in Section 3.2. Secondly, we fit the identified model to the data using sparse optimization based method and refine the order(s) of the model if required, estimate the Granger causality measures (Test statistic / PDC / DPF) from the model coefficient matrices and subsequently estimate the strengths from the DPF. In the final step, we detect the Granger causal relationships from DPF / PDC, and construct the network of the process and assign the strengths. The complete methodology is provided in Table 3. Table 3: Proposed approach / methodology for efficient estimation of weighted GC networks 1. Estimate the SACF and SIACF for the given multivariate data using (26) and (33), respectively. 2. Identify a suitable parametric model structure (i.e.,VAR/VMA/VARMA) and model order using the estimated SACF and SIACF. 3. Fit the chosen model as per Step 2 to the multivariate data. 4. Test for whiteness of vector residuals of the fitted model by examining their SACF. If the test reveals any remnant correlation, refine the order(s) of the model and iterate until residuals are satisfactorily white. 5. From the estimated model compute the DPF using (11) (or PDC using (8)) and subsequently estimate the strengths/weights from the DPF using (36). 6. From the DPF (or PDC), construct the Granger causal network of the process and assign the strengths to the connecting links.
The following section presents the case studies to demonstrate the efficacy of the proposed
28
ACS Paragon Plus Environment
Page 28 of 53
Page 29 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
methodology for reconstruction of weighted Granger causal networks in an efficient manner.
4
Case Studies
Four case studies pertaining to different types of data generating processes (VARMA, MultiInput Multi-Output, and Environmental, and Wood-Berry distillation column) are presented to demonstrate the efficacy of the proposed methodology for reconstructing weighted GC networks. The first two case studies highlight the efficacy of the proposed methodology in reconstructing weighted GC networks. Case study 3 is presented to demonstrate the efficacy of the proposed approach for a real-life data set. The last case study is presented to demonstrate the use of the proposed methodology for interaction assessment in process engineering. In all the cases, we first implement the standard way of estimation of causal relationships, i.e, fitting VAR model to the data and report the resulting GC measures and weighted GC network. Secondly, we implement the proposed approach and compare the efficiency and accuracy of the estimates.
4.1
Case Study 1: VARMA Process
The process under study is a 3-dimensional process with mixed characteristics, i.e., a mix of both autoregressive and moving average. The following 3-dimensional VARMA(1,1) process is chosen.
z1 [k] = 0.4z1 [k − 1] + 0.4e1 [k − 1] + e1 [k] z2 [k] = 0.5z1 [k − 1] + 0.6z2 [k − 1] + 0.5e2 [k − 1] + e2 [k] z3 [k] = 0.25z3 [k − 1] + 0.3e2 [k − 1] + 0.4e3 [k − 1] + e3 [k]
(39)
where e1 [k], e2 [k] and e3 [k] are uncorrelated Gaussian white noise sequences with zero-mean and unit variance. First, we identify the true (theoretical) cause and effect relationships
29
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 30 of 53
among variables both in time- and frequency-domain. The true AR and MA coefficient matrices for the process are 0 0 0.4 0 0.4 0 ; B1 = 0 0.5 0 . A1 = 0.5 0.6 0 0 0 0.25 0 0.3 0.4
(40)
As described earlier, simple inspection of model coefficients and based on the result given in (18), it is clear that the two causal relationships (z1 −→ z2 and z2 −→ z3 ) exists in the process. Thus the theoretical causal connectivity matrix for the process is
CGC
− 0 0 = 1 − 0 0 1 −
(41)
Next, we compute the frequency-domain Granger causality measure (DPF) theoretically using (11). The matrix layout plots of (true) theoretical DPF for this process are shown in Figure 6(a). In matrix layout plots, the diagonal blocks represent the influence of selfvariable while the off-diagonal block (i, j) represents the direct influence from source zj to sink zi . The columns represent the causes whereas the rows represent the effects. If the squared DPF / PDC is zero at all frequencies then the Granger-causal relationship does not exist between zj and zi . The theoretical squared DPF shown in Figure 6(a) illustrates the presence of Granger-causal relationship from z1 to z2 and z2 to z3 since the squared DPF is non-zero. Therefore the resulting theoretical weighted GC network is illustrated in Figure 6(b). In the network diagram, the vertices represent the variables (signals), and the directed edge between zi and zj indicates the existence of Granger-causal relationship (direct influence). The numerical value on the top of the edge represents the strength of connection (weight).
30
ACS Paragon Plus Environment
Page 31 of 53
0.5 0
0.5 0
0
2
2
0
0.5 0
0
2
0.5 0
0
2
2
z1
1
|ψ 33|2
|ψ 32|2
0
0
0.5
2
1
0.5
2
0 0
1
0 1
|ψ 23|2
|ψ 22|2
1
0.5
0.5 0
0
1
|ψ 21|2
1
|ψ 13|2
1
|ψ 12|2
|ψ 11|2
1
|ψ 31|2
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
0.229
0.5 0
0
2
ω
0
2
ω
z2
0.127
z3
ω
(a) Squared DPF
(b) Weighted GC network
Figure 6: Theoretical DPF and GC network for the VARMA process in Case Study 1.
Now we turn to the problem of reconstructing the weighted GC networks from data. For this purpose, the above 3-dimensional VARMA (1,1) process is simulated to generate N = 1000 samples of data. The Granger causality measures both in time- and frequency-domain are estimated using standard (VAR) model-based approach, i.e., by fitting an appropriate VAR model (P = 3) without giving due importance to the structural characteristics of DGP. The estimated time-domain GC test statistic is
5.8702 0.4682 −− Sˆij = −− 4.4099 312.0398 . 15.6123 169.7661 −−
(42)
Recalling that under the null hypothesis that zj does not Granger-cause zi , this test statistic is asymptotically χ2 -distributed with P degrees of freedom. The critical value at 95% significance level with 3 degrees of freedom is 7.814. Thus the causal connectivity matrix is ˆ GC C
0 −− 0 = 1 −− 0 , 1 1 −−
(43)
which is not identical to the true connectivity matrix. The matrix layout plots of estimated 31
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research
frequency-domain measures such as PDC and DPF with the theoretical values are shown in Figure 7(a) and Figure 7(b), respectively. It is observed that the squared PDC / DPF estimates are in significant deviation from true values. When the sample size is small, the errors in these estimates are very high, which may further lead to identification of spurious causal relationships among variables. In order to demonstrate the effect of sample size on the error in the estimates of DPF (PDC) and hence the strengths, the process is simulated to generate N = 50 samples of data. The matrix layout plots of estimated squared PDC and DPF using standard approach from 50 observations are shown in Figure 8(a) and Figure 8(b), respectively. The DPF (PDC) shown in Figure 8 illustrates the presence of (spurious) causal connection from z3 to z2 as squared DPF is non-zero and suppression of the true causal relation from z2 to z3 (as squared DPF is zero). Thus it is clear that when the sample size is small, the mismatch in model structure and DGP results in detection of spurious causal relationships among variables.
0
1
|π 23|2
0.5
0.5 0
0
2
0
|π 33|2
0.5
1
0.5 0
0
2
ω
2
ω
0
2
ω
(a) Squared PDC
2
1
0.5 0
0
2
ω
|ψ 13|2
|ψ 12|2
0
0 0
0
0.5
2
1
0.5
2
0 0
1
0.5
0.5
2
0 1
0 0
2
2
1
0.5
0.5 0
0
0 0
1
|π 32|2
1
0.5
2
2
1
0 0
0 0
2
0.5
|ψ 23|2
0
Estimated
1
|ψ 33|2
2
1
|π 22|2
|π 21|2
1
0
0 0
0.5
|ψ 22|2
2
0.5
1
|ψ 32|2
0 0
Theoretical
1
|ψ 21|2
0
0.5
Estimated 1
|ψ 31|2
0.5
|π 13|2
|π 12|2
|π 11|2
1
|ψ 11|2
Theoretical 1
|π 31|2
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 32 of 53
0.5 0
0
2
ω
0
2
ω
(b) Squared DPF
Figure 7: Estimated GC measures using standard approach when N = 1000 for the VARMA process in Case Study 1.
32
ACS Paragon Plus Environment
0
0
0.5
0
2
0 0
0
0.5
0
2
0 0
ω
2
ω
0 0
2
2
ω
|ψ 13|2
0.5
0.5 0
0
ω
(a) Squared PDC
2
1
0 0
0
0.5
2
1
0.5
2
0 0
1
0.5
0
0.5
2
0 1
0 0
1
|π 33|2
0.5
2
2
1
0.5
0.5 0
0
0 0
1
|π 32|2
1
2
2
1
0.5
0
0.5 0
0
|ψ 21|2
0.5
2
1
|π 23|2
1
|π 22|2
1
2
0
|ψ 23|2
0
0.5
1
|ψ 33|2
0
Estimated
1
|ψ 12|2
0.5
|ψ 31|2
2
1
|ψ 11|2
0.5 0
0
Theoretical
1
|π 13|2
|π 12|2
|π 11|2
0.5 0
|π 21|2
Estimated
1
|ψ 22|2
Theoretical
1
|π 31|2
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
|ψ 32|2
Page 33 of 53
2
0
ω
2
ω
(b) Squared DPF
Figure 8: Estimated GC measures using standard approach when N = 50 for the VARMA process in Case Study 1.
We implement next the proposed approach for this process, wherein the scalar correlation functions are first estimated to select the suitable model structure for the given process. The sample SACF and SIACF obtained from 1000 observations are shown in Figure 9(a) and Figure 9(b), respectively. It is observed that both the scalar correlation functions decrease gradually to zero with lags suggesting that the suitable model structure of this process is VARMA, which is in agreement with the true process. The VARMA model of an appropriate order (P = 1, Q = 1) is fitted to the data and the estimated AR and MA coefficient matrices using maximum-likelihood estimation (MLE) with the standard errors are
0.41 −0.026 0.016 0.375 0.008 −0.021 (0.0482) (0.0445) (0.0542) (0.0356) (0.0286) (0.0456) ˆ 1 = 0.53 ˆ 1 = −0.052 0.44 0.61 −0.035 0.067 A ; B (0.0402) (0.0262) . (0.0512) (0.0371) (0.0295) (0.0469) −0.01 −0.001 0.234 0.04 0.321 0.392 (0.0510)
(0.0271)
(0.0491)
(0.0530)
(0.0376)
(44)
(0.0482)
Thus the estimated causal connectivity matrix is
ˆ GC C
− 0 0 = 1 − 0, 0 1 − 33
ACS Paragon Plus Environment
(45)
Industrial & Engineering Chemistry Research
which is identical to the true connectivity matrix. In the frequency-domain, the PDC and DPF are re-estimated from VARMA model and matrix layout plots of these estimates are shown in Figure 10(a) and Figure 10(b), respectively. The resulting weighted GC network for the process is illustrated in Figure 10(c). The bias and standard error in the estimates of strengths computed using Monte-Carlo simulations for R = 100 replications for both the scenarios (i.e.,with structural mismatch between model and DGP, and without structural mismatch between model and DGP) are listed in Table 4. It can be observed that the estimates of DPF (PDC) and hence the strengths (weights) estimated using the proposed approach (i.e, no structural mismatch between model and DGP) are almost identical to the theoretical values and consequently the reconstructed weighted GC network is also identical to the true network. In contrast, the errors in the estimates obtained using the VAR modelbased approach (i.e., there is structural mismatch between model and DGP) are relatively high. Thus it is clear that the reconstruction of weighted GC networks using proposed approach results in accurate and reliable (efficient) reconstruction of weighted GC networks.
1
1 0.8
0.8
Sample SIACF
0.6
Sample SACF
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 34 of 53
0.6 0.4 0.2
0.4 0.2 0 -0.2
0
-0.4
-0.2
-0.6 0
2
4
6
8
10
0
2
Lag
4
6
8
10
Lag
(a) SACF
(b) SIACF
Figure 9: Sample SACF and SIACF for the VARMA process in Case Study 1.
34
ACS Paragon Plus Environment
0
0
|π 23|2
|π 22|2
0.5
0.5
0
2
0
2
0.5
0
2
ω
0
0 0
2
0
2
0
2
|ψ 13|2 2
(a) Squared PDC
2
1
0.5
0.5 0
0
ω
0
0 0
0
ω
2
0.5
1
0.5
0 1
0.5
2
0
ω
2
0
1
0.5
0
0 0
1
0.5
2
1
|π 33|2
|π 32|2
0
2
0 0
1
0.5
0
0
0.5
0
1
0.5
0
1
2
1
0.5
|ψ 23|2
2
1
|ψ 33|2
0 1
0.5 0
|ψ 21|2
2
0
Estimated
1
|ψ 12|2
0.5
|ψ 31|2
0
1
|ψ 11|2
0.5 0
1
Theoretical
1
|π 13|2
|π 12|2
|π 11|2
0.5 0
|π 21|2
Estimated
1
|ψ 22|2
Theoretical
1
|π 31|2
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
|ψ 32|2
Page 35 of 53
2
ω
0
2
ω
(b) Squared DPF
z1 0.23
z2
0.135
z3
(c) Weighted GC network
Figure 10: Estimated GC measures and weighted GC network using proposed approach for the VARMA process in Case Study 1.
In order to test the ability to identify the network structure correctly for multiple runs, Monte-Carlo simulations are performed for three different causal structures. The data generating equations for the new causal structures 2 and 3 in Table 5 are given in (46) and (47), respectively. z1 [k] = 0.4z1 [k − 1] + 0.4e1 [k − 1] + e1 [k] z2 [k] = 0.6z2 [k − 1] + 0.5e2 [k − 1] + e2 [k] z3 [k] = 0.5z1 [k − 1] + 0.25z3 [k − 1] + 0.7e2 [k − 1] + 0.4e3 [k − 1] + e3 [k] z1 [k] = 0.4z1 [k − 1] + 0.4e1 [k − 1] + 0.45e3 [k − 1] + e1 [k] z2 [k] = 0.2z1 [k − 1] + 0.6z2 [k − 1] + 0.5e2 [k − 1] + e2 [k] z3 [k] = 0.25z3 [k − 1] + 0.3e2 [k − 1] + 0.4e3 [k − 1] + e3 [k]
(46)
(47)
where e1 [k], e2 [k] and e3 [k] are usual uncorrelated Gaussian white noise sequences with zero-
35
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 36 of 53
mean and unit variance. The results of type I (percentage of cases where existing causal connections are undetected) and type II errors (percentage of cases where spurious causal connections are detected) obtained for 100 runs using both the standard and proposed approaches from 1000 observations are reported in Table 5. From the results, it may be safely concluded that the proposed approach has a higher chance of detecting the underlying network structure than the standard approach. Table 4: Bias and standard error in the estimates of strengths for the VARMA(1,1) process. Link
True strength (ζ)
ˆ Estimated strength (ζ)
Standard error
Bias
Standard
Proposed
Standard
Proposed
Standard
Proposed
(2,1)
0.229
0.247
0.23
0.017
0.014
-0.007
-0.002
(3,2)
0.127
0.096
0.135
0.029
0.018
-0.041
-0.001
Table 5: Type I and Type II errors for three different causal structures for 100 runs Causal Structure
Standard Approach
Proposed Approach
Type I error (%)
Type II error (%)
Type I error (%)
Type II error (%)
z3
0
36
0
13
z3
0
19
0
4
z3
2
41
0
18
z1 z2 z1 z2 z1 z2
36
ACS Paragon Plus Environment
Page 37 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
4.2
Case Study 2: MIMO Process
In this case study, the process of interest is a discrete-domain multivariable process used in Gigi and Tangirala 1 for interaction assessment. The process is
y[k] = Gp (z)u[k] z −1 1 − 0.4z −1 y1 [k] z −1 y [k] = 2 1 − 0.1z −1 y3 [k] 0
(48) z −1 1 − 0.2z −1 z −1 1 − 0.3z −1 z −1 1 − 0.2z −1
z −1 u1 [k] 1 − 0.5z −1 u2 [k] 0 −1 z u3 [k] 1 − 0.5z −1
(49)
The above process has three input variables uj , j = 1, · · · , 3 and three output variables yi , i = 1, · · · , 3, and the causal relationship does not exist from u1 to y3 and u3 to y2 . T The variables are arranged as z = y1 y2 y3 u1 u2 u3 and the theoretical DPF is computed by re-writing (49) in the difference equation form. The theoretical DPF shown in Figure 11(a) illustrates the presence of true causal relationships from u1 −→ y1 , u1 −→ y2 , u2 −→ y1 , u2 −→ y2 , u2 −→ y3 , u3 −→ y1 , and u3 −→ y3 . The resulting weighted GC network for the process is illustrated in Figure 11(b).
37
ACS Paragon Plus Environment
0
2
0
ω
2
0
ω
2
0
ω
1 0.5 0
2
0
2
|ψ 16|2
1 0.5 0
1 0.5 0
2
0
2
1 0.5 0
2
0
2
1 0.5 0
2
0
2
|ψ 26|2
0
2
1 0.5 0
|ψ 36|2
2
1 0.5 0
0
1 0.5 0
|ψ 46|2
0
1 0.5 0
u2
|ψ 56|2
2
1 0.5 0
2
Page 38 of 53
|ψ 66|2
0
|ψ 15|2
1 0.5 0
2
|ψ 25|2
0
0
|ψ 35|2
2
1 0.5 0
2
1 0.5 0
|ψ 45|2
0
1 0.5 0
0
1 0.5 0
|ψ 55|2
2
1 0.5 0
2
u1
|ψ 65|2
0
|ψ 14|2
1 0.5 0
2
|ψ 24|2
0
0
|ψ 34|2
2
1 0.5 0
2
1 0.5 0
|ψ 44|2
0
1 0.5 0
0
1 0.5 0
|ψ 54|2
2
1 0.5 0
2
y3
|ψ 64|2
0
|ψ 13|2
2
|ψ 23|2
1 0.5 0
0
|ψ 33|2
1 0.5 0
2
1 0.5 0
|ψ 43|2
1 0.5 0
0
1 0.5 0
|ψ 53|2
1 0.5 0
2
y2
|ψ 63|2
0
|ψ 12|2
1 0.5 0
0
|ψ 22|2
1 0.5 0
0
|ψ 32|2
|ψ 41|2
1 0.5 0
0
1 0.5 0
|ψ 42|2
|ψ 31|2
1 0.5 0
0
1 0.5 0
|ψ 52|2
|ψ 21|2
1 0.5 0
y1
|ψ 62|2
|ψ 11|2
1 0.5 0
|ψ 51|2
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
|ψ 61|2
Industrial & Engineering Chemistry Research
2
0
2
ω
ω
u3
0
2
0
2
0
2
0
2
0
2
0
2
ω
(a) Squared DPF
u1 0.341
y1
u2 0.325 0.247
0.251 0.323
u3 0.247
y2
0.323
y3
(b) Weighted GC network
Figure 11: Theoretical DPF and GC network for the process in Case Study 2.
In order to infer the causality from data, the process is simulated to generate N = 1500 observations. A Gaussian white noise is added to the outputs such that the SNR is 10. The sample scalar correlation functions obtained data shown in Figure 12 suggesting that the suitable model structure for the process is VAR(2). The DPF is estimated by fitting VAR(2) model to the data and matrix layout plots of squared DPF are shown in Figure 13(a). The estimated DPF illustrates the presence of causal relationships from u1 −→ y1 , u1 −→ y2 , u2 −→ y1 , u2 −→ y2 , u2 −→ y3 , u3 −→ y1 , and u3 −→ y3 . The resulting weighted GC network for the process is illustrated in Figure 13(b). Therefore, the estimated causal relationships are identical to the true relations among variables. 38
ACS Paragon Plus Environment
1
0.8
0.8
Sample SIACF
1
0.6 0.4 0.2 0
0.6 0.4 0.2 0 -0.2
-0.2
-0.4 0
2
4
6
8
10
0
2
4
6
Lag
8
10
Lag
(a) SACF
(b) SIACF
0
2
0
2
0
0
|ψ 16|2 |ψ 26|2 |ψ 36|2 |ψ 46|2
1 0.5 0 1 0.5 0
2
0
2
1 0.5 0
|ψ 56|2
1 0.5 0
1 0.5 0
2
0
2
1 0.5 0
|ψ 66|2
|ψ 15|2 |ψ 25|2 |ψ 35|2 |ψ 45|2 0
2
1 0.5 0
1 0.5 0
2
0
2
Estimated
2
0
2
0
ω
1 0.5 0
|ψ 55|2
1 0.5 0
1 0.5 0
u2
0
2
0
2
1 0.5 0
|ψ 65|2
|ψ 14|2 |ψ 24|2 |ψ 34|2 |ψ 44|2
1 0.5 0
1 0.5 0
2
0
2
0
ω
|ψ 54|2
1 0.5 0
1 0.5 0
Theoretical
u1
0
2
0
2
1 0.5 0
|ψ 64|2
|ψ 13|2 |ψ 23|2 0
ω
|ψ 33|2
1 0.5 0
1 0.5 0
1 0.5 0
2
0
2
1 0.5 0
2
0
2
0
2
|ψ 43|2
0 1 0.5 0
1 0.5 0
|ψ 53|2
1 0.5 0
1 0.5 0
y3
0
2
0
2
1 0.5 0
1 0.5 0
|ψ 63|2
|ψ 51|
|ψ 12|2
|ψ 41|
0
|ψ 61|
|ψ 22|2
|ψ 31|
2
1 0.5 0
1 0.5 0
2
0
2
1 0.5 0
y2
0
2
1 0.5 0 0
2
1 0.5 0
2
1 0.5 0 0
2
1 0.5 0
|ψ 32|2
|ψ 21|
2
0
1 0.5 0
|ψ 42|2
1 0.5 0
|ψ 52|2
y1
|ψ 62|2
|ψ 11|
2
Figure 12: Sample SACF and SIACF for the process in Case Study 2
2
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
Sample SACF
Page 39 of 53
2
0
2
ω
ω
u3
0
2
0
2
0
2
0
2
0
2
0
2
ω
(a) Squared DPF
u1 0.34
y1
u2 0.319 0.244
0.259 0.311
u3 0.236
y2
0.34
y3
(b) Weighted GC network
Figure 13: Estimated DPF and weighted GC network using proposed approach for the process in Case Study 2
39
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research
We next turn to small sample scenario, wherein the process is simulated to generate N = 100 samples of data. The DPF is estimated by fitting the regularized VAR(2) model to the data with elastic net regularization. The squared DPF shown in Figure 14(a) illustrate the presence of GC relationships from u1 −→ y1 , u1 −→ y2 , u2 −→ y1 , u2 −→ y2 , u2 −→ y3 , u3 −→ y1 , and u3 −→ y3 . The resulting weighted GC network for the process when N = 100 is illustrated in Figure 14(b). It is observed that the causal relationships identified using
0
0
2
ω
2
ω
0
2
0
|ψ 16|2 |ψ 26|2 |ψ 36|2 |ψ 46|2
1 0.5 0 1 0.5 0
2
0
2
1 0.5 0
|ψ 56|2
1 0.5 0
1 0.5 0
2
0
2
1 0.5 0
|ψ 66|2
|ψ 15|2 |ψ 25|2 0
ω
|ψ 35|2
1 0.5 0
1 0.5 0
1 0.5 0
2
0
2
Estimated
2
0
2
0
2
|ψ 45|2
0
1 0.5 0
|ψ 55|2
1 0.5 0
1 0.5 0
u2
0
2
0
2
1 0.5 0
|ψ 65|2
|ψ 14|2 |ψ 24|2
1 0.5 0
0
2
|ψ 34|2
0
|ψ 44|2
1 0.5 0
2
1 0.5 0
1 0.5 0
1 0.5 0
2
0
2
Theoretical
u1
0
2
0
2
1 0.5 0
|ψ 54|2
0
0
1 0.5 0
2
0
2
1 0.5 0
|ψ 64|2
1 0.5 0
1 0.5 0
y3
0
2
0
2
|ψ 13|2
0
1 0.5 0
2
0
2
|ψ 23|2
0
|ψ 33|2
1 0.5 0
2
1 0.5 0
|ψ 43|2
0
0
1 0.5 0
|ψ 53|2
1 0.5 0
2
y2
|ψ 63|2
|ψ 12|2
1 0.5 0
|ψ 22|2
1 0.5 0
|ψ 32|2
|ψ 41|2
1 0.5 0
1 0.5 0
|ψ 42|2
|ψ 31|2
1 0.5 0
0
1 0.5 0
|ψ 52|2
|ψ 21|2
1 0.5 0
y1
|ψ 62|2
|ψ 11|2
1 0.5 0
|ψ 51|2
regularization method are identical to the true relations among variables for the process..
|ψ 61|2
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 40 of 53
2
0
2
ω
ω
u3
0
2
0
2
0
2
0
2
0
2
0
2
ω
(a) Squared DPF
u1 0.31
y1
u2 0.33 0.254
0.258 0.304
u3 0.206
y2
0.32
y3
(b) Weighted GC network
Figure 14: Estimated DPF and weighted GC network using proposed approach when N = 100 for the process in Case Study 2
40
ACS Paragon Plus Environment
Page 41 of 53
4.3
Case Study 3: Environmental Process
In this case study, we infer the causal relations among concentration levels of three air pollutants, CO (mg/m3 ) (mainly emitted from cars, house-heating, and industry), NO2 (µg/m3 ) and O3 (µg/m3 ) (created in different reactions in the atmosphere), as well as the atmospheric temperature Tatm (◦ C). The data are recorded hourly during the month January 2016 at Royapuram, Chennai, India. The data is available at Continuous Ambient Air Quality Monitoring Station (CAAQMS) and Tamil Nadu Pollution Control Board (TNPCB). The time series has N = 744 observations and the snapshot of data is shown in Figure 15. It may be noted that the air pollution data is not necessarily generated by a strictly stationary process. Despite this fact, we believe that this case study serves to demonstrate the performance of the standard and proposed approaches on a real-life data set where deviations from ideal assumptions are bound to exist. It may be pointed out that this comparison only rests on a stationary approximation.
CO
5 0 -5 0
100
200
300
400
500
600
700
0
100
200
300
400
500
600
700
0
100
200
300
400
500
600
700
0
100
200
300
400
500
600
700
NO 2
20 0 -20
O3
200 0
-200 10
T atm
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
0 -10
Figure 15: Snapshot of air pollutants data
The sample versions of scalar correlation functions shown in Figure 16 suggest that the suitable model structure for the data is VAR(2) since the sample SACF decreases exponentially whereas the sample SIACF abruptly goes to zero after lag 2. In this case study, the model structure identified by the scalar correlation functions is identical to the model struc41
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research
ture used in the standard approach. The estimated squared DPF by fitting VAR model to the data is shown in Figure 17 (a). The corresponding weighted GC network is illustrated in Figure 17(b). The edge from CO to NO2 indicates that concentration of CO is responsible for the generation of NO2 . The network correctly reflects the creation of O3 from NO2 , CO and Tatm . The edge from CO to Tatm indicates that the CO influences the Tatm . Further, it is observed that Tatm is also responsible for the generation of NO2 .
1
1
Sample SIACF
Sample SACF
0.8 0.6 0.4 0.2
0.5
0
-0.5 0 -0.2
-1 0
5
10
15
20
25
30
0
2
4
Lag
6
8
10
Lag
(a) SACF
(b) SIACF
Figure 16: Sample SACF and SIACF for air pollutants data in case study 3
0 2
0
2
0.5 0
0
2 0.5
0
2
0 0
2
ω
2
ω
O3
0
2
0.47
0.14
0.08
0.11
1
0.5 0
0
2
0.5
2
1
|ψ 43|2
|ψ 42|2
0
0
0 0
1 0.5
0.18
1
0.5
2
CO
0.5
2
0 0
1 0.5
|ψ 14|2 0
0
2
0
1
|ψ 33|2
|ψ 32|2
2
0
0 1
0.5
2
1
0.5
2
0 0
1
0.5 0
0 1
|ψ 23|2
|ψ 22|2
|ψ 21|
2
0
|ψ 31|
0
1
0.5
0.5
|ψ 24|2
0 1
|ψ 41|
0.5
T atm
1
|ψ 34|2
0
O3
1
|ψ 13|2
|ψ 12|2
|ψ 11|
2
0.5
NO 2
1
|ψ 44|2
CO
1
2
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 42 of 53
0.5 0
0
2
0
ω
NO2
2
ω
(a) Squared DPF from regular VAR (N = 744)
0.13
Tatm
(b) Weighted GC network
Figure 17: Estimated DPF and weighted GC network for Case study 3 (N = 744)
We next turn to small sample scenario, wherein a limited number of data points, i.e., N = 120 (beginning) points, is chosen. In this case study, we do not know that (true) theoretical DPF since it is a real life data. Therefore, the DPF estimated in the previous case is considered as a reference, where sufficient data points are used for fitting a model. The 42
ACS Paragon Plus Environment
Page 43 of 53
snapshot of data and estimated squared DPF by fitting VAR model to the data are shown in Figure 18(a) and Figure 18(b), respectively. It is observed the estimated DPF given by regular VAR model coefficients from N = 120 observations noticeably deviates from the referred DPF, resulting in detection spurious causal relationships. In order to address this issue, the sparse optimization based VAR model is fitted to data and the DPF is re-estimated. The matrix layout plots of estimated squared DPF and the corresponding GC network are shown in Figure 18(c) and Figure 18(d), respectively. It is observed that the estimates of DPF are identical to the DPF estimated from sufficient number of data points. Thus the reconstructed Granger causal network correctly reflects the causal relationships among CO, NO2 , O3 , and Tatm . Further, it is also observed that the inferred causal relationships are identical for another set of 120 points taken from point 401 to 520. Thus the proposed approach results in reliable reconstruction of causal networks.
0 20
40
60
80
100
0
120
|ψ 31|2
0 0
20
40
60
80
100
0.5
|ψ 41|2
0
0
-5 0
20
40
60
80
100
0
120
2
(a) Air pollutants data (N = 120)
0
2
0
0 0
2
0
0 0
2
0 2
ω
|ψ 24|2 0
2
ω
|ψ 14|2
0.19
O3
0.42
0.132
0.07
0.105
0
2
1
0.5 0
0
2
0.5
2
2
|ψ 43|
0.5 0
0
0
CO
0
1
2
|ψ 42|
0.5
0.5
2
1
2
1
0 0
1
2
ω
0.5
2
2
|ψ 33|
0.5
0
0 0
1
2
|ψ 32|
0.5
0.5
2
1
0 1
0 0
1
2
2
|ψ 23|
0.5
2
ω
0.5 0
0 1
2
0.5
|ψ 14|2
0
1
|ψ 22|
|ψ 21|2
2
2
0.5
0 0
T atm
1
|ψ 34|2
0
|ψ 31|2
0.5 0
1
O3
1
|ψ 13|
2
|ψ 12|
|ψ 11|2
0.5 0
|ψ 41|2
NO2
1
0.5
(b) Squared DPF from regular VAR (N = 120)
|ψ 44|2
CO
1
2
ω
2
1
0 0
ω
2
0.5
0
0
0.5 0
0 1
0.5
2
1
0.5
2
1
0.5
2
0 0
1
5
0.5
2
0
0 0
1
0 0
120
2
2
0.5
0 0
0 1
0.5
1
0
-10
0.5
2
1
0
1
0
|ψ 32|2
0 10
O3
0.5 0
-20
|ψ 42|2
NO2
20
1
|ψ 24|2
1
0.5
2
|ψ 34|2
120
|ψ 23|2
100
0
|ψ 33|2
80
0.5
2
T atm
1
0 0
|ψ 43|2
60
2
|ψ 22|2
40
|ψ 21|2
20
0.5 0
0
O3
1
|ψ 13|2
0.5 0
0
atm
|ψ 12|2
|ψ 11|2
CO
0 -1
NO 2
1
|ψ 44|2
CO
1 1
T
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
0.5 0
0
2
ω
0
2
NO2
ω
(c) Squared DPF from sparse VAR (N = 120)
0.12
Tatm
(d) Weighted GC network
Figure 18: Snapshot of data, estimated DPF and GC network for Case study 3 (N = 120). 43
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Page 44 of 53
It may be noted that most of the real-life processes such as the one in this case study essentially belong to the category of data generating processes that do not meet all the assumptions. Consequently, the reconstructed networks for all such processes are not necessarily truly reflective of the underlying process but can be viewed as the best approximation under the assumptions made. From this viewpoint, the proposed method can be expected to identify the best approximation.
4.4
Case Study 4: Wood-Berry Distillation Column
In order to demonstrate the use of proposed approach for interaction assessment in process engineering, the 2×2 Wood-Berry (WB) distillation column studied by Garg and Tangirala 3 is considered. The process is described (with sampling time, Ts = 1.09 s) by −1 + 0.74z −1 −3 −0.2414 − 0.7145z z z 1 − 0.9368z −1 1 − 0.9494z −1 Gp (z) = −1 −1 −0.3604 − 0.1.054z 0.3707 + 0.257z z −3 z −7 1 − 0.9048z −1 1 − 0.9271z −1
−1 0.0688
(50)
The above process is known to be interacting and the quantity of interaction based on the directed (causal) power transfer between pair of variables has been computed theoretically by Garg and Tangirala 3 . The aim of this case study is to estimate the control interaction metric (CII), η, from process data using both standard and proposed approaches for decentralized control configuration with different tuning methods and compare the theoretical interaction index given in Garg and Tangirala 3 . The above process is simulated to generate N = 496 samples of data and white noise type disturbance with a variance of 0.5 is added to the outputs.
44
ACS Paragon Plus Environment
1
1
0.8
0.8
0.6
Sample SIACF
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
Sample SACF
Page 45 of 53
0.4 0.2 0
0.6 0.4 0.2 0
-0.2 -0.4
-0.2 0
5
10
15
20
0
5
Lag
10
15
Lag
(a) SACF
(b) SIACF
Figure 19: Sample SACF and SIACF for the Wood-Berry process in case study 4 Now we turn to the estimation of the control interaction metric using standard approach. The CII is estimated using standard (VAR) model-based approach without giving due importance to the structural characteristics of DGP. The estimates of CII from direct power transfer for different tuning methods are reported in Table 6. The theoretical values of CII given in Garg and Tangirala 3 are also reported for the purpose of comparison. We implement next the proposed approach for this process, wherein the scalar correlation functions are first estimated to select an appropriate model structure. The sample SACF and SIACF obtained from data are shown in Figure 19(a) and Figure 19(b), respectively. It is observed that both the scalar correlation functions decrease gradually to zero with lags suggesting that an appropriate model structure of this process is VARMA. The estimates of CII by fitting the VARMA model of an appropriate order to the data are reported in the fourth column of Table 6. It is observed that the estimated interactions using the proposed approach are close to the true interactions in the process. Table 6: Interaction assessment in decentralized control scheme for Wood-Berry column Tuning method
η
ηˆ using Standard Approach
ηˆ Proposed Approach
Seborg - 1
0.0954
0.079
0.0973
BLT
0.0761
0.0025
0.0685
Seborg - 2
0.1991
0.2098
0.1891
Edgar
0.2025
0.2066
0.2021
45
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
To summarize, the proposed methodology for reconstruction of weighted GC networks results in accurate and reliable estimates. In contrast, the errors in the estimates obtained with a fixed model (VAR)-based approach in presence of model structure-data generating process mismatch, are relatively high.
5
Conclusions
In this work, we have presented a methodology / procedure that addresses the issue of the structural mismatch between the model and data generating process for efficient reconstruction of weighted Granger causal networks in linear multivariable dynamical processes. The proposed methodology is based on the use of sparse optimization combined with the recently introduced scalar correlation functions to select the most appropriate and parsimonious model structure. Simulation results demonstrated a comparison of fixed model (VAR)-based and proposed approaches for reconstruction of weighted GC networks under different scenarios. One of the key findings of this work is that estimation of weighted GC networks in presence of model structure-DGP mismatch results in biased and / or inefficient estimates, which can further lead to the detection of spurious causal relationships when the sample size is small. The main contribution of this work is a methodology that delivers efficient estimates of causality measures for reconstruction of GC networks, by modifying the key step of model building such that the effect of model structure-DGP mismatch is effectively alleviated. The results of this work are significant particularly given that the prevalent practice in the causal network reconstruction literature is to use a fixed VAR model regardless of underlying data generating process. From another viewpoint, the work re-affirms a well-known fact, but in the context of network reconstruction, that convenience of estimation as a sole criterion for choosing model structure can result in loss of efficiency and/or accuracy of the target parameter estimates. Future direction involves the extension of proposed approach to non-linear processes.
46
ACS Paragon Plus Environment
Page 46 of 53
Page 47 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
Supporting Information Available The supporting document contains (i) estimation of time series models and causality measures in frequency-domain, (ii) penalty functions for sparse VAR modelling, (iii) proof for the condition ρvv [l] = 0 ⇐⇒ Γz [l] = 0 and the recommendation to choose α and β, and (iv) a link to download the data sets used in this work. This information is available free of charge via the Internet at http://pubs.acs.org/.
References (1) Gigi, S.; Tangirala, A. K. Quantification of Interaction in Multiloop Control Systems Using Directed Spectral Decomposition. Automatica 2013, 49, 1174–1183. (2) Yang, F.; Duan, P.; Shah, S. L.; Chen, T. Capturing connectivity and causality in complex industrial processes; Springer Science & Business Media, 2014. (3) Garg, A.; Tangirala, A. K. Metrics for Interaction Assessment in Multivariable Control Systems Using Directional Analysis. Industrial & Engineering Chemistry Research 2018, 57, 967–979. (4) Duan, P.; Chen, T.; Shah, S. L.; Yang, F. Methods for root cause diagnosis of plant-wide oscillations. AIChE Journal 60, 2019–2034. (5) Xu, S.; Baldea, M.; Edgar, T. F.; Wojsznis, W.; Blevins, T.; Nixon, M. Root Cause Diagnosis of Plant-Wide Oscillations Based on Information Transfer in the Frequency Domain. Industrial & Engineering Chemistry Research 2016, 55, 1623–1629. (6) Yuan, T.; Qin, S. J. Root cause diagnosis of plant-wide oscillations using Granger causality. Journal of Process Control 2014, 24, 450 – 459, ADCHEM 2012 Special Issue.
47
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
(7) Hu, W.; Wang, J.; Chen, T.; Shah, S. L. Cause-effect analysis of industrial alarm variables using transfer entropies. Control Engineering Practice 2017, 64, 205 – 214. (8) Chen, H.-S.; Yan, Z.; Yao, Y.; Huang, T.-B.; Wong, Y.-S. Systematic Procedure for Granger-Causality-Based Root Cause Diagnosis of Chemical Process Faults. Industrial & Engineering Chemistry Research 2018, 57, 9500–9512. (9) Granger, C. Investigating Causal Relations by Econometric Models and Cross-Spectral Methods. Econometrica 1969, 37, 424–38. (10) Baccala, L.; Sameshima, K. Partial directed coherence: a new concept in neural structure determination. Biological Cybernetics 2001, 84, 463–474. (11) Faes, L.; Nollo, G. Biomedical Engineering, Trends in Electronics, Communications and Software; InTech, 2011. (12) Takahashi, D. Y.; Baccal´a, L. A.; Sameshima, K. Information theoretic interpretation of frequency domain connectivity measures. Biological Cybernetics 2010, 103, 463–469. (13) Gigi, S.; Tangirala, A. K. Quantitative analysis of directional strengths in jointly stationary linear multivariate processes. Biological Cybernetics 2010, 103, 119–133. (14) Marques, V. M.; Munaro, C. J.; Shah, S. L. Data-based causality detection from a system identification perspective. 2013 European Control Conference (ECC). 2013; pp 2453–2458. (15) Marques, V. M.; Munaro, C. J.; Shah, S. L. Detection of Causal Relationships Based on Residual Analysis. IEEE Transactions on Automation Science and Engineering 2015, 12, 1525–1534. (16) Duan, P.; Yang, F.; Chen, T.; Shah, S. L. Direct Causality Detection via the Transfer Entropy Approach. IEEE Transactions on Control Systems Technology 2013, 21, 2052– 2066. 48
ACS Paragon Plus Environment
Page 48 of 53
Page 49 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
(17) Triacca, U. Is Granger causality analysis appropriate to investigate the relationship between atmospheric concentration of carbon dioxide and global surface air temperature? Theoretical and Applied Climatology 2005, 81, 133–135. (18) Zhang, J.; Han, D.; Yang, F.; Ye, H.; Chen, M. Frequency domain causality analysis method for multivariate systems in hypothesis testing framework. IFAC Proceedings Volumes 2014, 47, 6870–6877. (19) Dahlhaus, R.; Eichler, M. Causality and graphical models in time series analysis. Oxford Statistical Science Series 2003, 115–137. (20) Eichler, M.; Dahlhaus, R.; Sandk¨ uhler, J. Partial correlation analysis for the identification of synaptic connections. Biological Cybernetics 2003, 89, 289–302. (21) Lauritzen, S. L. Causal inference from graphical models. Complex stochastic systems 2001, 63–107. (22) Eichler, M. A graphical approach for evaluating effective connectivity in neural systems. Philosophical Transactions of the Royal Society of London B: Biological Sciences 2005, 360, 953–967. (23) Eichler, M. Granger causality and path diagrams for multivariate time series. Journal of Econometrics 2007, 137, 334–353. (24) Wright, S. The method of path coefficients. Annals of Mathematical Statistics 1934, 5, 161–215. (25) Whittaker, J. Graphical models in applied multivariate statistics; Wiley and Sons: New York, 1990. (26) Pearl, J. In Quantified Representation of Uncertainty and Imprecision; Smets, P., Ed.; Handbook of Defeasible Reasoning and Uncertainty Management Systems; Springer, 1998; Vol. 1. 49
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
(27) Anderson, B. D.; Deistler, M.; Dufour, J.-M. On the Sensitivity of Granger Causality to Errors-In-Variables, Linear Transformations and Subsampling. Journal of Time Series Analysis 2018, 40, 102–123. (28) Patriota, A. G.; Sato, J. R.; Achic, B. G. B. Vector autoregressive models with measurement errors for testing Granger causality. Statistical Methodology 2010, 7, 478 – 497. (29) Bald´e, T. A.; Rodr´ıguez, G. Finite sample effects of additive outliers on the Grangercausality test with an application to money growth and inflation in Peru. Applied Economics Letters 2005, 12, 841–844. (30) Agarwal, P.; Tangirala, A. K. Reconstruction of missing data in multivariate processes with applications to causality analysis. International Journal of Advances in Engineering Sciences and Applied Mathematics 2017, 9, 196–213. (31) Sims, C. A. Money, Income, and Causality. The American Economic Review 1972, 62, 540–552. (32) Akaike, H. On the use of a linear model for the identification of feedback systems. Annals of the Institute of statistical mathematics 1968, 20, 425–439. (33) Sugihara, G.; May, R.; Ye, H.; Hsieh, C.-h.; Deyle, E.; Fogarty, M.; Munch, S. Detecting Causality in Complex Ecosystems. Science 2012, 338, 496–500. (34) Wiener, N. In Modern mathematics for the engineer, ed. E. F. Beckenbach, McGrawHill, New York,; Beckenbach, E., Ed.; McGraw-Hill, 1956. (35) Kami´ nski, M.; Ding, M.; Truccolo, W.; Bressler, S. Evaluating causal relations in neural systems: granger causality, directed transfer function and statistical assessment of significance. Biological Cybernetics 2001, 85 .
50
ACS Paragon Plus Environment
Page 50 of 53
Page 51 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
(36) Eichler, M. On the Evaluation of Information Flow in Multivariate Systems by the Directed Transfer Function. Biological Cybernetics 2006, 94, 469–482. (37) Faes, L.; Porta, A.; Nollo, G. Testing Frequency-Domain Causality in Multivariate Time Series. IEEE Transactions on Biomedical Engineering 2010, 57, 1897–1906. (38) Akaike, H. A new look at the statistical model identification. IEEE Transactions on Automatic Control 1974, 19, 716–723. (39) Schwarz, G. Estimating the Dimension of a Model. Ann. Statist. 1978, 6, 461–464. (40) Wei, W. Time Series Analysis: Univariate and Multivariate Methods; Pearson, 2005. (41) Lutkepohl, H. New Introduction to Multiple Time Series Analysis; Springer, 2005. (42) Heyse, J.; Wei, W. Inverse and Partial Lag Autocorrelation for Vector Time Series. ASA Proceedings of Business and Economics Statistics Section. 1985. (43) Tiao, G.; Box, G. Modeling Multiple Times Series with Applications. Journal of the American Statistical Association 1981, 76, pp. 802–816. (44) Quenouille, M. H. The Analysis of Multiple Time-Series; New York: Hafner, 1957. (45) Geweke, J. Measurement of Linear Dependence and Feedback Between Multiple Time Series. Journal of the American Statistical Association 1982, 77, 304–313. (46) Kami´ nski, M.; Blinowska, K. A new method of the description of the information flow in the brain structures. Biological Cybernetics 1991, 65, 203–210. (47) Baccala, L.; Sameshima, K.; Ballester, G.; Do Valle, A.; Timo-Iaria, C. Studying the Interaction Between Brain Structures via Directed Coherence and Granger Causality. Applied Signal Processing 1998, 5 .
51
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
(48) Hlavackova-Schindler, K.; Palus, M.; Vejmelka, M.; Bhattacharya, J. Causality detection based on information-theoretic approaches in time series analysis. PhysicsReports 2007, 441, 1–46. (49) Geweke, J. F. Measures of Conditional Linear Dependence and Feedback Between Time Series. Journal of the American Statistical Association 1984, 79, 907–915. (50) Eichler, M. Causal inference with multiple time series: principles and problems. Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences 2013, 371 . (51) Gevers, M. R.; Anderson, B. D. O. Representations of jointly stationary stochastic feedback processes. International Journal of Control 1981, 33, 777–809. (52) Kohavi, R. A study of cross-validation and bootstrap for accuracy estimation and model selection. Ijcai. 1995; pp 1137–1145. (53) Cleveland, W. S. The Inverse Autocorrelations of a Time Series and Their Applications. Technometrics 1972, 14, 277–293. (54) Chatfield, C. Inverse Autocorrelations. Journal of the Royal Statistical Society. Series A (General) 1979, 142, 363–377.
52
ACS Paragon Plus Environment
Page 52 of 53
Page 53 of 53 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
Figure 20: Table of contents graphic
53
ACS Paragon Plus Environment