Process Systems Engineering
Adaptive selection of latent variables for process monitoring Lijia Luo, Shiyi Bao, and Jianfeng Mao Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/acs.iecr.8b05847 • Publication Date (Web): 08 May 2019 Downloaded from http://pubs.acs.org on May 9, 2019
Industrial & Engineering Chemistry Research
Adaptive selection of latent variables for process monitoring

Lijia Luo∗, Shiyi Bao, and Jianfeng Mao

College of Mechanical Engineering, Zhejiang University of Technology, Hangzhou 310023, China

∗ Corresponding author. E-mail address: [email protected]

ABSTRACT: Latent variable (LV) methods have been widely applied to multivariate statistical process monitoring. LV methods simplify the process monitoring problem by projecting the high-dimensional process data into a low-dimensional LV space that retains most of the information important for fault detection. A key issue is the optimal selection of a subset of LVs to constitute such a low-dimensional LV space. Traditional LV selection methods do not choose LVs from a fault detection point of view, which may result in poor process monitoring performance. To overcome this drawback, a cumulative percent contribution (CPC) criterion is proposed to select appropriate LVs for process monitoring. First, the contributions of LVs to the $T^2$ value of a sample are computed by decomposing the $T^2$ statistic. The importance of each LV to fault detection is then evaluated by its contribution: the larger the contribution, the more important the LV. After sorting the LVs in order of decreasing contribution, the CPC criterion selects the first few LVs whose CPC value exceeds a threshold. The LVs are selected adaptively for each sample so that the selected LVs contain the most important information for detecting faults. Based on the selected LVs, two fault detection indices are defined. An online process monitoring procedure is then developed. The effectiveness and advantages of the proposed method are demonstrated with two case studies.

1. Introduction

Online monitoring of industrial processes is an effective way to ensure the safe operation and high
product quality. Process monitoring aims to detect process faults that may be caused by disturbances, equipment malfunctions, improper operations, and other factors arising during the operation of industrial processes.1 Fast and accurate fault detection is crucial for avoiding product deterioration, performance degradation, and equipment damage. Methods for process monitoring can be classified into four categories: quantitative model-based,2 process knowledge-based,3 data-driven,4-6 and knowledge-data-integrated methods.7,8 Data-driven monitoring methods have recently attracted more attention than the other methods because they require only process data, which are readily available in modern process systems. For this reason, data-driven monitoring methods are well suited to complex and large-scale process systems.1 They use historical data to build process monitoring models and then derive monitoring indices for fault detection. A fault is detected when a monitoring statistic violates its control limit.

The data obtained from industrial processes often contain a great deal of redundant information because of strong correlations between process variables. Dimension reduction methods,9-14 such as principal component analysis (PCA),9 are often used to extract the important characteristics of process data. Performing dimension reduction on process data yields a set of latent variables that explain the correlations between process variables. However, latent variables are not equally important to process monitoring. Usually, only a subset of latent variables carries the information that is useful for process monitoring. These crucial latent variables constitute a latent variable space that summarizes the most important data information. The dimension of this latent variable space is often much smaller than that of the original process variable space. The process monitoring problem is simplified by working in a low-dimensional latent variable space instead of the high-dimensional process variable space.
A key issue in using latent variable methods for process monitoring is the selection of latent variables. Using different latent variables may change the values of the monitoring indices and the corresponding control limits, which results in different monitoring performance. The aim of latent variable selection is to choose the subset of latent variables that is most appropriate for process monitoring. Commonly used selection methods try to find a subset of latent variables that explains most of the data variance or produces the minimal reconstruction/prediction errors, for example, the methods based on cumulative percent variance (CPV),15 cross validation,16,17 and variance of the reconstruction error (VRE).18 In these methods, a latent variable that explains a larger amount of data variance is considered more important and is therefore chosen with higher priority. Such selection criteria are reasonable from the viewpoint of data regression but may not be suitable for fault detection.

It is important to note that the fault data differ from the normal data used for computing the latent variables. A latent variable that explains the normal data well may not clearly reveal the differences between the fault data and the normal data. Since the key to fault detection is detecting these differences, more importance should be placed on latent variables that are better at distinguishing the fault data from the normal data. Moreover, different faults can affect process variables in many different ways. A fault may also affect different process variables during different time periods because of fault shift/spread in process units.19 Therefore, for faulty samples obtained at different times or under different fault conditions, the important fault information may be carried by different latent variables. To extract as much as possible of the fault information contained in each sample, latent variables should be selected adaptively for each sample. However, it is quite common that the same subset of latent variables is used to describe all the samples. This is not suitable for fault detection, because a fixed subset of latent variables cannot capture the important information of all faulty samples. Recently, new methods were proposed for better selection of principal components in PCA models for fault detection.20-22
However, these methods are specific to PCA and are not applicable to general latent variable methods.

In this paper, a cumulative percent contribution (CPC) criterion is proposed to select appropriate latent variables for process monitoring. The contributions of latent variables to the $T^2$ value of a sample are computed by decomposing the $T^2$ statistic. The importance of each latent variable to fault detection is quantified by its contribution: a higher contribution indicates greater importance. For each sample, all latent variables are sorted in order of decreasing contribution, and then the first few latent variables whose CPC value exceeds a threshold are selected. Latent variables are selected adaptively for each sample from the fault detection point of view, which improves process monitoring performance significantly. Based on the selected latent variables, two fault detection indices are defined. An online process monitoring procedure is then developed. The effectiveness of the proposed method is illustrated using a numerical example and an industrial case study.

The rest of the article is organized as follows. Section 2 gives a brief review of the $T^2$ statistic and the latent variable model for process monitoring. The effect of latent variable selection on fault detection is illustrated in Section 3. In Section 4, the CPC selection criterion, two fault detection indices, and an online process monitoring procedure are introduced. The performance of the proposed method is illustrated with two case studies in Section 5. The conclusions are given in Section 6.

2. Preliminaries

2.1. The $T^2$ statistic

Hotelling's $T^2$ statistic is one of the most popular multivariate statistics used for statistical
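process control and monitoring.23 Consider a sample $\boldsymbol{x}$ taken from an $m$-variate normal distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$. Let $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ be estimated using the mean $\bar{\boldsymbol{x}} = \sum_{i=1}^{n} \boldsymbol{x}_i / n$ and the covariance matrix $\boldsymbol{S} = \sum_{i=1}^{n} (\boldsymbol{x}_i - \bar{\boldsymbol{x}})(\boldsymbol{x}_i - \bar{\boldsymbol{x}})^T / (n-1)$ of $n$ reference samples $\{\boldsymbol{x}_1, \boldsymbol{x}_2, \ldots, \boldsymbol{x}_n\}$. The $T^2$ statistic of $\boldsymbol{x}$ is defined as

$$T^2 = (\boldsymbol{x} - \bar{\boldsymbol{x}})^T \boldsymbol{S}^{-1} (\boldsymbol{x} - \bar{\boldsymbol{x}}) \tag{1}$$

which follows a scaled F-distribution23

$$T^2 \sim \frac{m(n^2-1)}{n(n-m)} F_{m,\,n-m} \tag{2}$$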
with $m$ and $n-m$ degrees of freedom. The $T^2$ statistic represents the statistical distance of a sample from the mean point. A large $T^2$ value indicates that the sample deviates far from the reference samples. A control limit is used to decide whether a large $T^2$ value is significant; it can be computed from the F-distribution in Eq. (2).

2.2. Latent variable models

Industrial process data usually contain a great deal of redundant information. To reduce this redundancy and to extract the important characteristics of process data, dimension reduction methods (such as PCA) are often applied to the data to build latent variable models. The general form of a linear latent variable model is

$$\boldsymbol{Y} = \boldsymbol{X}\boldsymbol{P} \tag{3}$$

where $\boldsymbol{X} = [\boldsymbol{x}_1, \ldots, \boldsymbol{x}_n]^T \in \mathcal{R}^{n \times m}$ is a matrix of $n$ reference samples, $\boldsymbol{Y} \in \mathcal{R}^{n \times l}$ is a latent variable score matrix, $\boldsymbol{P} \in \mathcal{R}^{m \times l}$ is a loading matrix that relates the latent variables to the original variables in $\boldsymbol{X}$, and $l$ ($l \le m$) is the number of latent variables retained in the model. A new sample is evaluated against normal operation by projecting it onto the loading vectors and analyzing the
result in the latent variable space. The latent variable scores of a new sample $\boldsymbol{x}$ are computed by $\boldsymbol{y} = \boldsymbol{P}^T \boldsymbol{x}$, and the $T^2$ statistic can be defined in the latent variable space as9

$$T^2 = (\boldsymbol{y} - \bar{\boldsymbol{y}})^T \tilde{\boldsymbol{S}}^{-1} (\boldsymbol{y} - \bar{\boldsymbol{y}}) \sim \frac{l(n^2-1)}{n(n-l)} F_{l,\,n-l} \tag{4}$$

where $\bar{\boldsymbol{y}} = \sum_{i=1}^{n} \boldsymbol{y}_i / n$ is the vector of average scores of the reference samples, and $\tilde{\boldsymbol{S}}$ denotes the covariance matrix of $\boldsymbol{Y}$. A key issue in building a latent variable model is to find a subset of latent variables that contains most of the important data information. For many classical dimension reduction methods (such as PCA and PLS), this issue boils down to selecting an appropriate subset from all the latent variables. This selection has a significant effect on fault detection performance.

3. Effects of latent variable selection on fault detection

3.1. Principal component analysis

PCA is one of the most popular methods for building a latent variable model. The PCA loading vectors $\boldsymbol{p}_j$ are eigenvectors of the covariance matrix of the data, and the corresponding eigenvalues $\lambda_j$ represent the variances of the latent variables $y_j$. The loading matrix $\boldsymbol{P}$ is column-wise orthonormal and the score matrix $\boldsymbol{Y}$ is column-wise orthogonal, i.e., $\boldsymbol{P}^T \boldsymbol{P} = \boldsymbol{I}$ and $\boldsymbol{Y}^T \boldsymbol{Y} = \boldsymbol{\Lambda}$, where $\boldsymbol{I}$ denotes the
identity matrix and $\boldsymbol{\Lambda}$ is a diagonal matrix. Based on these characteristics of PCA, the $T^2$ statistic in Eq. (4) can be decomposed as follows

$$\begin{aligned}
T^2 &= (\boldsymbol{y} - \bar{\boldsymbol{y}})^T \tilde{\boldsymbol{S}}^{-1} (\boldsymbol{y} - \bar{\boldsymbol{y}}) \\
&= (\boldsymbol{x} - \bar{\boldsymbol{x}})^T \boldsymbol{P} \left( \frac{\boldsymbol{Y}^T \boldsymbol{Y}}{n-1} \right)^{-1} \boldsymbol{P}^T (\boldsymbol{x} - \bar{\boldsymbol{x}}) \\
&= \sum_{j=1}^{l} \lambda_j^{-1} (\boldsymbol{x} - \bar{\boldsymbol{x}})^T \boldsymbol{p}_j \boldsymbol{p}_j^T (\boldsymbol{x} - \bar{\boldsymbol{x}})
\end{aligned} \tag{5}$$

where $\boldsymbol{p}_j$ and $\lambda_j$ are the $j$th loading vector and eigenvalue (here all eigenvalues are assumed to be nonzero). The contribution of the $j$th principal component (PC) to the $T^2$ value of $\boldsymbol{x}$ is defined as

$$c_{j,x} = \lambda_j^{-1} (\boldsymbol{x} - \bar{\boldsymbol{x}})^T \boldsymbol{p}_j \boldsymbol{p}_j^T (\boldsymbol{x} - \bar{\boldsymbol{x}}) \tag{6}$$
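As an illustration only, the decomposition in Eqs. (5) and (6) can be sketched numerically with NumPy. The function name `pc_contributions`, the synthetic reference data, the correlation value 0.8, and the test sample are all assumptions made for this sketch and do not come from the paper. All PCs are retained ($l = m$), so the contributions should sum to the $T^2$ value of Eq. (1).

```python
import numpy as np

def pc_contributions(X_ref, x):
    """Contributions c_j of each principal component to the T^2 value of x (Eq. 6).

    X_ref : (n, m) array of reference (normal) samples
    x     : (m,) array, the sample to evaluate
    Assumes all eigenvalues of the sample covariance matrix are nonzero.
    """
    x_bar = X_ref.mean(axis=0)
    S = np.cov(X_ref, rowvar=False)      # sample covariance, divisor n - 1
    lam, P = np.linalg.eigh(S)           # eigenvalues returned in ascending order
    lam, P = lam[::-1], P[:, ::-1]       # reorder so the largest variance comes first
    t = P.T @ (x - x_bar)                # centered scores p_j^T (x - x_bar)
    return t**2 / lam, lam

# Hypothetical two-variable reference data with correlation 0.8 (assumed value)
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
X_ref = rng.multivariate_normal(np.zeros(2), cov, size=500)

# Hypothetical sample that breaks the positive correlation between the variables
x = X_ref.mean(axis=0) + np.array([1.0, -1.0])

contrib, lam = pc_contributions(X_ref, x)

# With all PCs retained, the contributions sum to the T^2 value of Eq. (1)
d = x - X_ref.mean(axis=0)
t2_direct = d @ np.linalg.solve(np.cov(X_ref, rowvar=False), d)
```

In this sketch the sample deviates along the low-variance direction, so the PC with the smaller eigenvalue (`contrib[1]`) carries most of the $T^2$ value even though it explains less of the reference data variance.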
The PCs with larger contributions are more useful for fault detection, because they are the main causes of a large $T^2$ value that exceeds the control limit. Such PCs should be chosen with higher priority for fault detection. Note that the PC contribution is inversely proportional to the eigenvalue. This implies that the PC corresponding to the smallest eigenvalue may contribute more than other PCs. Classical selection criteria (e.g., CPV15 and VRE18) may not select the best PCs for fault detection, because they give priority to the PCs corresponding to large eigenvalues.

Consider a simple case in which a system with two variables ($x_1$ and $x_2$), each following the standard normal distribution, is monitored. Let $x_1$ and $x_2$ be correlated with covariance $\rho$. Without loss of generality, $\rho$ is assumed to be nonnegative, i.e., $\rho \ge 0$. The covariance matrix of the two variables can be expressed as $\begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}$. The loading vectors of the two PCs are $\boldsymbol{p}_1 = [1/\sqrt{2},\ 1/\sqrt{2}]^T$ and $\boldsymbol{p}_2 = [1/\sqrt{2},\ -1/\sqrt{2}]^T$, and the corresponding eigenvalues are $1+\rho$ and $1-\rho$. For a centered sample $\boldsymbol{x} = [x_1, x_2]^T$, the first PC has better fault detection performance if it contributes more to the $T^2$ value of $\boldsymbol{x}$ than the second
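PC, namely

$$\begin{aligned}
& \lambda_1^{-1} \boldsymbol{x}^T \boldsymbol{p}_1 \boldsymbol{p}_1^T \boldsymbol{x} > \lambda_2^{-1} \boldsymbol{x}^T \boldsymbol{p}_2 \boldsymbol{p}_2^T \boldsymbol{x} \\
\rightarrow\ & \frac{1}{1+\rho} \left( \frac{1}{\sqrt{2}} x_1 + \frac{1}{\sqrt{2}} x_2 \right)^2 > \frac{1}{1-\rho} \left( \frac{1}{\sqrt{2}} x_1 - \frac{1}{\sqrt{2}} x_2 \right)^2 \\
\rightarrow\ & \rho x_1^2 - 2 x_1 x_2 + \rho x_2^2 < 0
\end{aligned} \tag{7}$$

The solution of Eq. (7) is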