Statistical Monitoring and Fault Diagnosis of Batch Processes Using

Ind. Eng. Chem. Res. 2010, 49, 9961–9969

9961

Statistical Monitoring and Fault Diagnosis of Batch Processes Using Two-Dimensional Dynamic Information Yuan Yao and Furong Gao* Center for Polymer Processing and Systems, The Hong Kong UniVersity of Science and Technology, Nansha, Guangzhou, China

Two-dimensional (2D) dynamics widely exist in batch processes, which inspirit research efforts to develop corresponding monitoring schemes. Recently, two-dimensional dynamic principal component analysis (2DDPCA) has been proposed to model and monitor such 2D dynamic batch processes, in which support region (ROS) determination is a key step. A proper ROS ensures modeling accuracy, monitoring efficiency, and reasonable fault diagnosis. The previous ROS determination method is practicable in many situations but still has certain limitations, as discussed in this paper. To overcome these shortcomings, a 2D-DPCA method with an improved ROS determination procedure is developed, by considering variable partial correlations and performing iterative stepwise regressions. Such a procedure expands ROS batch by batch and is a generalization of the autoregressive (AR) model order selection to the 2D batch process cases. Simulations show that the proposed method extracts 2D dynamics more accurately and improves the monitoring and diagnosis performance of the 2D-DPCA model. 1. Introduction Batch processes have been widely applied in chemical, semiconductor processing, and many bioprocesses, to manufacture high-value-added products. Dynamics are inherent characteristics of batch processes, which exist not only within single-batch duration, but also from batch to batch. The causes of batchwise dynamics include slow changes in material properties, drifts of process characteristics, and the utilizations of batch-to-batch controllers. Therefore, in many batch processes, dynamics behave in a two-time-dimensional way including timewise dynamics and batchwise dynamics, and can be regarded as two-dimensional (2D) dynamics. To ensure operation safety and quality consistency, the multivariate statistical monitoring techniques, such as principal component analysis (PCA),1 have been extended to batch process monitoring.2,3 Several methods have been proposed to model batch process dynamics.4,5 The two-dimensional dynamic principal component analysis (2D-DPCA) model,6 developed by the authors, is the first attempt to model both two types of batch dynamics simultaneously, by combining PCA with 2D autoregressive (AR) model structure. The model is relatively easy to build, while the 2D batch dynamics information can be better extracted. Both process drifts and small changes in correlation structures can be detected more efficiently. Support region determination is a key step in 2D-DPCA model building. In batch processes, the dynamics can be reflected by the relationship between the current measurements and the lagged measurements. When 2D dynamics exist, the lagged measurements indicating process dynamic features forms a region in the past half-plane, which is called the support region or the region of support (ROS). A properly determined ROS ensures modeling accuracy, monitoring efficiency, and correct fault diagnosis. In previous work, ROS is determined following two-stage steps.7 In the first stage, an initial ROS is selected by considering the relationships between current measurements and lagged measurements. Then, backward eliminations are performed in the initial ROS to determine the target ROS. * To whom correspondence should be addressed. Tel.: (852)-23587139. Fax: (852)-2358-0054. E-mail: [email protected].

Although this method is applicable in many cases, there are limitations, as discussed later. Such drawbacks affect the performance of ROS determination. In this work, 2D-DPCA based on an improved ROS automatic determination method is proposed, by considering variable partial correlations and following iterative stepwise regressions. In the determination procedure, the ROS is expanded batch by batch. Simulation results show that the newly proposed method provides more reasonable ROS determination results. Consequently, the accuracy of 2D-DPCA model is enhanced. As a result, the monitoring and fault diagnosis efficiencies are improved. This paper is organized as follows. The preliminaries of PCA are reviewed in section 2, followed by the brief introductions of 2D-DPCA and the previous ROS determination method. In section 3, the motivations of this research are given, by discussing the features and the drawbacks of the previous ROS determination method. The detailed procedure of 2D-DPCA with the improved ROS automatic determination method is presented in section 4. Section 5 illustrates the proposed method with simulation examples. The simulations compare the modeling, online monitoring, and fault diagnosis results of the 2D-DPCA models, based on the previous and the proposed ROS determination methods. Finally, the conclusions are drawn in section 6 to summarize the paper. 2. Preliminaries 2.1. PCA. PCA is an orthogonal linear transformation performed on a data matrix such as X(n × m), where n is the number of samples and m is the number of variables. As shown in eq 1, it decomposes X into two subspaces: score space and residual space. A

X)

∑

m

tjpTj +

j)1

∑

tjpTj ) TPT + E ) Xˆ + E

(1)

j)A+1

where A is the number of the principal components (PCs) retained in score space, tj(n × 1) is the score vector which is orthogonal to each other, pj(m × 1) is the orthonormal loading vector, which projects data into score space, T and P are the

10.1021/ie100860x  2010 American Chemical Society Published on Web 09/16/2010

9962

Ind. Eng. Chem. Res., Vol. 49, No. 20, 2010

score matrix and the loading matrix, respectively, and E is the residual matrix. Algebraically, |tj| is equal to the jth-largest eigenvalue of the covariance matrix, and pj is the corresponding eigenvector. By selecting A properly,8 T extracts most systematical variance information in X, while E contains noises. Loading matrix P reflects the variable correlation structure. In multivariate statistical process control (MSPC), process monitoring can be performed in the subspaces determined by PCA. In score space, the Hotelling T2 statistic measures the distances between each operation point and the center of the historical normal operation region. In residual space, the statistic of squared prediction error (SPE) measures the model fitness. The control limits of T2 and SPE can be calculated based on certain statistical assumptions. In process monitoring, the values of these two statistics are plotted on respective control charts. If the value of either statistic is outside the corresponding control limits, a fault is detected. More details of PCA and PCA-based monitoring can be found in ref 1. After an abnormality is detected, the contribution plots are the most widely utilized tools to diagnose the causes of the fault.9-11 2.2. 2D-DPCA. As mentioned in the first section, two types of dynamics are common in batch processes: within-batch dynamics and batch-to-batch dynamics. Therefore, batch process dynamics can be modeled in a 2D time framework. To monitor such batch processes, 2D-DPCA has been proposed to perform PCA on a parsimonious 2D time series model structure.6 With the existence of 2D dynamics, the current measurements can be described as a function not only of the lagged measurements in the past time of the same batch, but also of the lagged measurements in some past batches. These lagged variables form a region of support (ROS) which reflects batch process 2D dynamic features. In 2D-DPCA, PCA algorithm is performed on an expanded data matrix which is constructed with the current variables together with all the lagged variables in ROS. Thus, both 2D dynamics and variable cross-correlations can be extracted in the PCA score space, while residual space represents information about measurement noises. The SPE statistic and corresponding control limits can be calculated for process monitoring. The dynamics information retained in 2D-DPCA score space breaks the statistical basis of T2-statistic-based monitoring. Therefore, before the calculation of T2 and the corresponding control limits, the scores should be filtered.12 Please notice that an important hypothesis of 2D-DPCA is that the structure of variable correlations and the 2D dynamics do not change significantly throughout the entire batch duration. Otherwise, the multiphase strategy13 should be utilized for phase division. Different 2D-DPCA models then can be built for different phases. Please also note that, at this stage, the 2DDPCA method can only be applied to modeling the batch processes with even operation durations. For more details about 2D-DPCA, one can refer to our previous paper.6 2.3. Existing Method of ROS Determination. In a previous work of the authors,7 a data-based ROS automatic determination method has been designed for 2D-DPCA modeling. This method regards the current measurements as the dependent variables to be predicted, while the lagged variables in the target ROS are regarded as the independent variables that provide the best prediction of the current sample. Thus, the ROS determination problem has been transformed to a variable selection problem of building a regression model between current and lagged samples. The procedure can be divided into two stages. The first stage is called initial ROS selection, in which a past neighborhood

of the current sample is chosen as the candidate region of the target ROS. Simple regression (SR)14 is applied to form the initial ROS, which regresses the current sample with each of the lagged variables individually. If the probability of the associated Student’s t test for a slope coefficient is less than a significant level, the corresponding lagged variable is kept as a component of the initial ROS. In the second stage, which is called target ROS determination, iterative stepwise elimination (ISE),15 which is a type of backward elimination, is performed based on the importance of the independent variables. In each iteration run, a regression model is built to relate the candidate independent variables (the lagged variables in the candidate region) to current measurements. Since the independent variables are correlated, the partial least-squares (PLS), principal component regression (PCR) or ridge regression model can be built. The importance of each lagged variable is quantified as z)

|bi |si

(2)

V

∑ |b |s

i i

i)1

where bi is the regression coefficient corresponding to the ith independent variable, si the standard deviation of the ith independent variable, and V the total number of independent variables in the candidate region. At the beginning of this stage, the candidate region is the same as the initial ROS. Then in each iteration run, the independent variable with the minimum importance is eliminated from the candidate region. An index, which is a model evaluation criterion, is calculated for each regression model. This index can be prediction error sum of squares (PRESS), root-mean-square prediction error (RMSE), Akaike information criterion (AIC), or other indices that represent the model prediction ability. In the end, the best choice of the target ROS is determined to be the candidate region corresponding to the best prediction model. 3. Motivations 3.1. Tentative Specification of Time Series Models. The 2D-DPCA model utilizes the 2D autoregressive (AR) model structure. Thus, the ROS determination of a 2D-DPCA model is analogous with the order selection of a conventional AR model. The order selection task is also called tentative specification, when dealing with multivariate time series models. To get a better understanding of the ROS determination problem, let us do a brief review about the tentative specification of time series models. Autoregressive (AR) and moving average (MA) are two basic structures in time-series models. In a one-dimensional (1D) system, a conventional univariate AR model of order p is defined as p

xk ) c +

∑φx

i k-i

+ εk

(3)

i)1

where k is the index of sampling interval, φi is the model parameter, c is a constant, and εk is the white noise. Partial autocorrelation function (PACF), which indicates the partial correlation between the current sample and a lagged sample with the effects of other lagged samples removed, can be used for AR model order selection. For an AR process, the values of PACF become insignificant when the lags are beyond the


order of the process. A univariate MA model of order q is written as q

xk ) ε k +

∑θε

i k-i

(4)

i)1

where θi is the model parameter and εk is the error term. The order of a MA process can be indicated by the autocorrelation function (ACF), which calculates the values of autocorrelations between the current sample and the lagged samples. For an MA process, the values of ACF become insignificant when the lags are beyond the order of the process. An MA process can be approximated by a high-order AR model, and the corresponding values of PACF of an MA process slowly decay to zero (and vice versa). For more details of the time-series model, please refer to the literature given in ref 16. In multivariate cases, the properties of the partial autoregressive matrix and the cross-correlation matrix are similar to the properties of PACF and ACF in univariate cases, respectively.17 3.2. Discussions on Existing ROS Determination Method. As mentioned in section 2.3, a key point of the initial ROS selection in the previous ROS determination method is that the significances of the SR slope coefficients are regarded as the selection criterion. It is easy to understand that the simple regression slope coefficients are equivalent to the variable autocorrelation coefficients and the lagged cross-correlation coefficients. Thus, the SR method is equivalent to the calculation of a cross-correlation matrix. When a batch process has AR properties, the values of slope coefficients gradually decay to zero, as discussed in the last section. Therefore, the initial ROS selection stage identifies a region larger than the target ROS and provides a reasonable basis for the second stage. On the other hand, if the 2D batch process only has MA properties, the slope coefficients become insignificant when the lags are beyond the MA order in either batch or time direction. As a result, such slope coefficients lead to a relative small initial ROS. However, because of the equivalence between the 2D-DPCA and 2D AR models, a large ROS is needed for 2D-DPCA to approximate the process MA properties. Therefore, in such cases, the SR-based initial ROS selection may not be able to provide a sufficiently large candidate region for the second determination stage. In the target ROS determination stage of the previous method, there are three features: taking model prediction ability (reflected by RMSE, PRESS, AIC, etc.) as the model selection criterion to determine the most proper ROS, using the variable importance index as the variable selection (elimination) criterion, and following the backward elimination procedure. However, such criteria and procedures have some limitations. First, the existing method is inspired from the variable selection for regression models, which focuses on the model prediction abilities. However, the goal of ROS determination is to identify a subset of lagged variables reflecting process dynamics, which focuses on the lagged variable partial correlations. Therefore, the model prediction ability may not be a most proper criterion in ROS determination. Second, the variable importance index defined based on regression coefficients is not a well-defined statistic. Third, as Grechanovsky pointed out, “different procedures are known to select different ‘best’ subsets when applied to one data sample”.18 Procedures can affect the ROS determination results. As a type of backward elimination, iterative stepwise elimination (ISE) starts with a sufficient large subset. In each run, a most “unimportant” variable is deleted. However, the regression coefficients calculated in each run are affected by

9963

other lagged variables in the candidate region. In fact, the answer to the question “which variable is most unimportant” is not known. We can only answer this question conditionally, when the existence of other candidate variables is determinate. Therefore, redundant information provided by other lagged variables and noises in data may lead to unexpected deletions. As discussed previously, although the previous method of ROS determination is practical in many cases, there exist shortcomings that may affect the accuracy of the identified ROS. It may result in a subset of lagged variables providing good prediction, but it may not lead to a most reasonable result in ROS determination. Therefore, an improved ROS automatic determination method is needed to overcome such drawbacks. 4. 2D-DPCA Based on Improved ROS Automatic Determination 4.1. Basic Idea of Improved ROS Automatic Determination. Instead of looking at the model prediction abilities, the proposed method utilizes the partial correlations between current sample and lagged samples as the criterion to determine the most proper ROS, which is consistent with using PACF or the partial autoregressive matrix in the tentative specification of a conventional AR model. The P value of a partial F-test, which is a well-defined statistical index that indicates the significance of partial correlation, is chosen to be the variable selection index. A determination procedure is designed, which iteratively performs stepwise regressions, to avoid including too much redundant information. At the same time, the risk of selecting a too small candidate region is also avoided, since the ROS is expanded batch by batch. In the proposed procedure, first, all lagged variables in the current batch are put into a candidate set. Stepwise regressions19 are then performed between current variables and candidate lagged variables for variable selection, using the P-value of a partial F-test as the criterion. The candidate set can be updated based on the variable selection results, which contains the selected lagged variables in the current batch, together with all lagged variables in the last batch. Again, stepwise regressions are utilized to select variables from this new candidate set. The candidate set then is updated again by including all lagged variables in the secondto-last batch and the variables selected in the last iteration run, followed by another run of stepwise regressions. The above steps are iterated, until no more new variables are persistently selected, in comparison to the selection result of the previous iteration run. 4.2. Procedure of Improved ROS Automatic Determination. Suppose there are J number of process variables xj(j ) 1, 2, ..., J). Denote the lagged variable of xj in 2D space as xj(i - R,k - β), where j(j ) 1, 2, ..., J) is the variable index, R and β are lagged steps in batch and time directions, respectively. Particularly, when a lagged variable locates at a future sampling interval in a past batch, the value of β becomes negative. The term xj(i - R,k - β) is the data vector corresponding to the lagged variable xj(i - R,k - β), which contains the following measurements: xj(i - R, k - β) ) [_xj(i - R, j - β), _xj(i - R, j - β - 1), · · · , _xj(i - R, 1), _xj(i - R - 1, K - β), _xj(i - R - 1, K - β - 1), · · · , _xj(i - R - 1, 1), · · · , _xj(1, K - β), _xj(1, K - β - 1), · · · , _xj(1, 1)]T

(5) where i and j are the batch and sampling interval index values of the current sample, K is the number of sampling intervals in a batch, and the process measurements are denoted as _xj(i,k).

9964


Figure 1. Procedure of the improved ROS automatic determination method.

The parameter xj(i,k) is the current variable of xj. Correspondingly, the data vector xj(i,k) of the current variable xj(i,k) can be described as xj(i, k) ) [_xj(i, j), _xj(i, j - 1), _xj(i, β + 1), _xj(i - 1, K), _xj(i - 1, K - 1), _xj(i - 1, β + 1), _xj(R + 1, K), _xj(R + 1, K - 1), _xj(R + 1, β + 1)]T

(6)

The detailed procedure of improved ROS automatic determination is shown in Figure 1 and interpreted in detail as given below. (1) Let variable index h ) 1. (2) Set the values of the entry criterion F-to-enter and the removal criterion F-to-remove for stepwise regression. Usually, set F-to-enter e F-to-remove to avoid an infinite loop. (3) Let the dependent variable y be given as y ) xh(i,k). (4) Set g ) 0. (5) Define the candidate variable set A as A ) A 1 ∪ A2 ∪ · · · ∪ A J

(7)

which contains all lagged variables in the current batch, where Aj ) {xj(i, k - 1), xj(i, k - 2), ..., xj(i, 1)}

(8)

(6) Calculate the correlation coefficient between y and each candidate variable in A. (7) The candidate variable with the largest correlation to y is selected first, while other variables are kept in the candidate set, if the F-test shows that the selected variable is significant. Otherwise, there is no time-direction dynamics in variable xh; set the selected variable set B to B ) 0, and go to step 12.

(8) Another variable is chosen, based on the highest partial correlation to y, with the effect of the already selected variable(s) removed. If its significance test of partial F-value can pass the entry criterion F-to-enter, the variable is selected. Such a significance test is performed based on the p value. (9) The variables already selected are examined for removal, according to the removal criterion F-to-remove. The removed variables are put back into the candidate set. (10) Repeat steps 8 and 9, until no more variables meet the entry and removal criteria. (11) Put all lagged variables selected through steps 6-10 into the selected variable set B. (12) Increase the value of g by 1. 13. Update Aj and A as Aj ) {xj(i - g, K), xj(i - g, K - 1), ..., xj(i - g, 1)}

(9)

and A ) B ∪ A 1 ∪ A2 ∪ · · · ∪ AJ

(10)

Thus, all previously selected lagged variables and all lagged variables in the gth lagged batch are included in the updated candidate variable set A. (14) Repeat steps 6-14 until there are no more variables persistently selected into set B. (15) Finally, the variables selected into set B are determined as the lagged variables in the target ROS of the process variable xh. (16) Increase the value of h by 1. (17) If h e J, repeat steps 2-16. If h > J, the ROS automatic determination procedure is finished and the complete ROS is the combination of the support regions of all process variables.


In above procedure, steps 6-10 are the steps of stepwise regression, where variable selection in the candidate variable set is performed. For more details of stepwise regression, one can refer to ref 19. In step 8, the partial correlations between the candidate variables and the dependent variable y are calculated. To illustrate the concept of partial correlation, take three arbitrary variablessa1, a2, and a3sas an example. The partial correlation between a1 and a2, with the effect of the a3 removed, can be written as F12.3. The variable a3 is called controlling variable. The partial correlation expresses the correlation between the residuals resulting from the linear regressions of a1 with a3 and of a2 with a3, and can be calculated as

F12.3 )

F12 - F13F23

[(1 - F132)(1 - F232)]1/2

(11)

where Fij is the correlation coefficient between ai and aj. In more general cases, when there are more than one controlling variable, the values of the partial correlations can be derived recursively.20 In time series analysis, PACF is a special case of the partial correlation calculation. Given a time series {a0,a1,a2, · · · }, where the subscript denotes the lag number, the value of PACF equals to F0h.{1,2,...,h-1} for lag h. Therefore, the improved ROS automatic determination method can be regarded as a generalization of the AR order selection to 2D multivariate cases. Since the ROS determination is always performed offline in the stage of 2D-DPCA model building, one does not need to worry about the computation burden. 4.3. Procedure of 2D-DPCA Modeling. After the ROS determination, an augmented data matrix can be constructed, consisting of the current measurements and all lagged variables in the ROS. Suppose the complete ROS includes lagged variables xj(i,k - 1), xj(i,k - 2), ..., xj(i,k - qj(i)), xj(i - 1,k + fj(i - 1)), xj(i - 1,k + fj(i - 1) - 1), ..., xj(i - 1,k), xj(i - 1,k - 1), ..., xj(i - 1,k - qj(i - 1)), ..., xj(i - pj,k + fj(i - pj)), ..., xj(i - pj,k), xj(i - pj,k - 1), xj(i - pj,k - qj(i - pj)), where pj is the maximum number of lagged batch index in the proper ROS on variable j, qj(V) is the maximum number of lagged time index in the proper ROS in the Vth batch on variable j, fj(V) is the maximum number of future time index in the proper ROS ˜ on variable j, and j ) 1, 2, ..., J. The augmented data matrix X is organized as shown below.

[]

T X˜p+1,q+1 l ˜ ) X˜Ti,k X l T X˜I,K-f

where X˜Ti,k ) [x˜T1 (i, k), ..., x˜Tj (i, k), ..., x˜TJ (i, k)]

(12)

9965

) [x_j(i, k), _xj(i, k - 1), ..., _xj(i, k - qj(i)), _xj(i - 1, k + fj(i - 1)), _xj(i - 1, k + fj(i - 1) - 1), ..., _xj(i - 1, k), _xj(i - 1, k - 1), ..., _xj(i - 1, k - qj(i - 1)), ..., _xj(i - pj, k + fj(i - pj)), _xj(i - pj, k + fj(i - pj) - 1), ..., _xj(i - pj, k), _xj(i - pj, k - 1), ..., _xj(i - pj, k - qj(i - pj))]

x˜Tj (i, k)

q ) max(q1(i), q1(i - 1), ..., q1(i - m), ..., qj(i), qj(i - 1), ..., qj(i - m), ..., qJ(i), qJ(i - 1), ..., qJ(i - m)) f ) max(f1(i - 1), f1(i - 2), ..., f1(i - m), ..., fj(i - 1), fj(i - 2), ..., fj(i - m), ..., fJ(i - 1), fJ(i - 2), ..., fJ(i - m)) p ) max(p1, ..., pj, ..., pJ) PCA is then performed to analyze this augmented matrix and to extract variable cross-correlations and 2D dynamic information. The SPE statistic and corresponding control limits can be calculated based on residuals of the 2D-DPCA model, and they can be used in online monitoring when new operation data are coming, as introduced in section 2.2. The T2 statistic, based on filtered scores, can also be utilized in online monitoring. 5. Simulation Results In this section, the proposed improved ROS automatic determination method is verified with two simulation examples. In both examples, the process current measurements are formulated to have dynamics in both the time and batch directions. In the meantime, the variable cross-correlations are also simulated. These simulations are suitable examples to verify the effectiveness of the proposed method, since the exact ROS of these simulated processes are clearly known and can be compared to the automatic determination results. 5.1. Simulation 1: A Simple Batch Process with Irregular ROS. In the first simulation, the batch process model is similar to that depicted below. x1(i, k) ) 0.8x1(i, k - 1) + 0.5x1(i - 1, k + 1) 0.3x1(i - 1, k) + w1 x2(i, k) ) 0.67x2(i, k - 1) + 0.44x2(i - 1, k + 1) 0.11x2(i - 1, k) + w2 x3(i, k) ) 0.4x3(i, k - 1) + 0.25x1(i, k) + 0.35x2(i, k) + w3 x4(i, k) ) 0.8x4(i, k - 1) + 0.53x1(i, k) 0.33x2(i, k) + w4

(13)

where i is the batch index; k is the time index; x1 and x2 are two independent signals with dynamics in both the time and batch directions; x3 and x4 are functions of x1, x2, and their own values at one step before; wj(j ) 1, 2, 3, 4) are Gaussian noises with a variance of 0.01. Fifty batches of data are generated as referenced data, and there are 200 samples in each batch. Obviously, this simulated batch process has an ROS that consists of lagged variables x1(i,k - 1), x1(i - 1,k + 1), x1(i - 1,k), x2(i,k - 1), x2(i - 1,k + 1), x2(i - 1,k), x3(i,k - 1), and x4(i,k - 1) in the past region with an irregular shape. First, the ROS is determined with the proposed method. The entry and removal criteria (F-to-enter and F-to-removal, respectively) are both specified to be 0.01. The ROS of the four process variables are selected one by one, and then are combined into a complete proper ROS. Take variable x1 as an example. First, the stepwise regression is performed between the current variable x1(i,k) and all lagged

9966


variables of x1, x2, x3, and x4 in the current batch. In this stepwise regression, variable x1(i,k - 1) is selected into set B first, since it has the largest correlation to x1(i,k) and the F-test shows its significance. Variable x2(i,k - 1) is chosen to be the second selected variable, because it has the largest partial correlation to x1(i,k), taking the effect of x1(i,k - 1) into consideration, and the corresponding p-value is smaller than the entry criterion. Then, check whether the already selected variable should be removed after the entry of x2(i,k - 1). No variable’s p value is larger than F-to-removal, so no variable should be removed in this step. This stepwise regression is then terminated, since no other candidate lagged variable is significantly partial correlated to x1(i,k). The candidate variable set then is updated to include all variables in set B, including x1(i,k - 1) and x2(i,k - 1), together with all lagged variables in the last batch. Stepwise regression is performed between x1(i,k) and these updated candidate variables. Again, the variable x1(i,k - 1) is selected first. Then, select x1(i - 1,k + 1), followed by x1(i - 1,k). This time, these three lagged variables are selected into set B. Preparing for the stepwise regression in the third iteration run, the candidate variable set is updated again by including x1(i,k - 1), x1(i - 1,k + 1), x1(i - 1,k), and all lagged variables in the second-to-last batch. This time, x1(i,k - 1), x1(i - 1,k + 1), x1(i - 1,k), x1(i - 2,k + 3), x1(i - 2,k + 2), and x1(i - 2,k + 1) are selected. These six lagged variables and all lagged variables in the third-to-last batch form a new candidate set. In the fourth iteration run, x1(i,k - 1), x1(i - 1,k + 1), x1(i - 1,k), x1(i - 3,k + 4), x1(i - 3,k + 3) are selected, and former selected x1(i - 2,k + 3), x1(i - 2,k + 2), x1(i - 2,k + 1) are no longer included into the set B. In the fifth run, x1(i,k - 1), x1(i - 1,k + 1), x1(i - 1,k) and x1(i - 4,k + 3), x1(i - 4,k) are selected. From the above results, it can be found that only lagged variables x1(i,k - 1), x1(i - 1,k + 1), x1(i - 1,k) are persistently retained in set B. Other variables appeared in the third, fourth, and fifth runs are selected occasionally and are deleted soon after other lagged information is introduced into the candidate set, since there is no certain significance on their partial correlations to the current variable x1(i,k). The selection of these variables will only lead to an overfitted model. Therefore, the final determined ROS of variable x1 only consists of x1(i,k - 1), x1(i - 1,k + 1), and x1(i - 1,k). Following a similar procedure, the support regions of other variables can be determined as follows. The x2’s ROS includes x2(i,k - 1), x2(i - 1,k + 1), and x2(i - 1,k), while x1(i,k - 1), x2(i,k - 1), x3(i,k - 1), x1(i - 1,k + 1), and x2(i - 1,k + 1) form the ROS of x3, and x1(i,k - 1), x2(i,k - 1), x4(i,k - 1), x1(i - 1,k + 1), and x2(i - 1,k + 1) form the ROS of x4. In summary, the complete ROS is determined as x1(i,k - 1), x1(i - 1,k + 1), x1(i - 1,k), x2(i,k - 1), x2(i - 1,k + 1), x2(i - 1,k), x3(i,k - 1), and x4(i,k - 1), which is same with the real one. Then, the previous ROS determination method based on backward elimination is also performed. In the initial ROS selection, the SR method shows (a) that the lagged variables of x4 do not need to be included in the initial ROS of x2 and (b) at the same time, the lagged variables of x2 do not need to be included in the initial ROS of x4, which will lead to an incorrect ROS determination result of x4. Besides, the SR method does not give a clear suggestion about the edge of the initial ROS. Therefore, SR is not reliable enough, although it is practical in many situations. Here, suppose that there is some knowledge that the maximum numbers of the lagged batches, the lagged time intervals, and the future time intervals in past batches to be included into the initial ROS are smaller than 3, 5, and 5,

Figure 2. Target ROS determination of x1 in simulation 1 with the previous method.

respectively. The ISE begins with a candidate region that contains 152 lagged variables at the maximum, including 20 lagged variables in the current batches, and 44 lagged variables in each lagged batch. Figure 2 shows the backward elimination results of the target ROS determination for variable x1. Since there is a total of 152 lagged variables in the initial ROS, there are 152 runs of backward elimination. From the RMSE values, it can be seen that the last three eliminated candidate variablessx1(i,k - 1), x1(i - 1,k + 1), and x1(i 1,k)scontribute most to the prediction of the current variable of x1. This is consistent with the true ROS of x1. However, because of the overfitting, the model that takes x1(i,k - 1), x1(i - 1,k + 1), and x1(i - 1,k) as the independent variables does not correspond to the smallest RMSE. In fact, the smallest RMSE value appears in the 71th elimination run. Using other criteria, such as AIC, cannot avoid the overfitting problem either. This causes difficulties in judging which variables should be included in the ROS. Similar situations appear in the ROS determination of other variables. This example shows (a) in the previous ROS determination method, SR can only be used as an assistant method in initial ROS selection and is not very reliable in some situations and (b) because of the overfitting, the target ROS determination steps cannot tell the range of the ROS clearly. On the other hand, the proposed improved ROS automatic determination avoids the above disadvantages of the former method. 5.2. Simulation 2: A More Complicated Process with MA Property. In the second simulation example, the batch process is described as below: x1(i, k) ) 0.8x1(i, k - 1) - 0.6x1(i - 1, k) + 0.8x1(i - 1, k - 1) + w1(i, k) x2(i, k) ) 0.7w2(i, k) + 0.3w2(i, k - 1) x3(i, k) ) 0.4x3(i, k - 1) + 0.25x1(i, k) + 3.5x2(i, k) + w3(i, k)

(14)

As can be seen in the above equation, variable x2 has firstorder MA properties, which should be described by a higherorder AR model structure in 2D-DPCA. x1 has dynamics in both 2D directions and is independent of x2, while x3 is a function of x1, x2, and its own value at one step before. The parameter wj (with j ) 1, 2, 3) has the same definition as that previously given. Again, 50 reference batches of data are generated, and there are 200 samples in each batch. For the ROS determination with the proposed method, the entry and removal criteria F-to-enter and F-to-removal are


9967

Figure 4. Target ROS determination of x3 in simulation 2 with the previous method. Figure 3. Target ROS determination of x2 in simulation 2 with the previous method.

specified to be 0.01. Following steps similar to those given in simulation 1, the complete ROS determined for this process consists of x1(i,k - 1), x1(i - 1,k), x1(i - 1,k - 1), x2(i,k - 1), x2(i,k - 2), x2(i,k - 3), and x3(i,k - 1), which is consistent with the simulated batch process dynamic property. Specifically, x1(i,k - 1), x1(i - 1,k) and x1(i - 1,k - 1) forms the ROS of x1, while the ROS of x2 includes its lagged variables x2(i,k 1), x2(i,k - 2), and x2(i,k - 3), which describe the MA dynamics with a higher-order AR structure. In addition, the ROS determination result of x3 is x1(i,k - 1), x1(i - 1,k), x1(i - 1,k - 1), x2(i,k - 2), and x3(i,k - 1). For comparison, the previous ROS determination method is also applied. The shortcomings of that method are shown in both initial ROS selection and target proper ROS determination stages. To show the effects of the initial ROS selection, take variable x2 as an example. Using the SR method, the candidate region of x2 is constrained in a region only including three lagged variablessx1(i,k - 1), x2(i,k - 1), and x3(i,k - 1)swhich is not large enough for the target ROS determination and cannot lead to a higher-order AR approximation of the MA dynamics. Then, to check the target ROS determination ability, suppose that a sufficiently large initial ROS has been selected, based on process knowledge, which includes five lagged steps in the current batch and three lagged batches. In each lagged batch, there are five lagged sampling intervals and five future sampling intervals, together with a current sample. The last three eliminated lagged variables in the ROS determination of x1 are x1(i,k - 1), x1(i - 1,k), and x1(i - 1,k - 1), which are reasonable. The backward elimination results of x2 are shown in Figure 3. The last four deletions cause significant changes in the RMSE values and indicate that the target ROS corresponds to lagged variables x1(i,k - 1), x1(i,k - 2), x1(i - 1,k - 1), and x1(i - 1,k). However, from eq 14, it is known that x1 and x2 are independent. Therefore, such an ROS is neither proper nor reasonable for variable x2. The ISE does not show a clear ROS determination result for x3. The last three eliminated lagged variables x2(i,k - 1), x3(i,k - 1), and x3(i,k - 2) cause most of the changes in the RMSE values and are regarded as components of the x3’s ROS, as can be read from Figure 4. However, they do not provide sufficient information of real process dynamics. Generally speaking, the improved ROS automatic determination method works well in this simulation, although the process dynamic characteristics are more complicated than the

Figure 5. Comparison between normal and faulty batch trajectories (solid curves represent normal batch trajectories of x2, whereas dashed curves denote faulty batch trajectories of x2).

process in the first example. On the other hand, the previous method is not successful in this case. The initial ROS selection stage cannot address the MA type dynamics, while the target ROS determination stage does not explore the correct information about the process dynamics. Then, the monitoring efficiencies of the 2D-DPCA models based on these two ROS determination methods are compared. One hundred batches of data are generated, within which the first 60 batches are normal operated, while the following cycles are faulty. From batch 61, the correlation structure of variable x2 changes as described below: x2(i, k) ) x2(i - 1, k) + w2(i, k - 1)

(15)

From Figure 5, it is found that the variable trajectory magnitudes change only slightly in the first several batches after the fault happens, which makes the fault difficult to be detected. Since this is a fault about the variable correlation structure change, SPE control plots are used to illustrate the online monitoring results. Based on cross-validation, four PCs have been retained in both types of 2D-DPCA models. Figure 6 shows the monitoring result of the 2D-DPCA model, based on the improved ROS

9968


Figure 6. Monitoring result based on 2D-DPCA with the improved ROS determination method (the solid curve denotes 99% control limits, whereas the dashed curve represents 95% control limits).

Figure 7. Fault diagnosis result based on 2D-DPCA with the improved ROS determination method (the solid curve denotes 99% control limits, whereas the dashed curve represents 95% control limits).

automatic determination method. The fault is detected efficiently from batch 61. The SPE values of 16.75% of the samples in batch 61 are outside the 99% control limit, whereas 29.95% of the SPE values are outside the 95% control limit. The fault diagnosis is performed, using the SPE contribution plot. In Figure 7, variables 1-10 are x1(i,k), x1(i,k - 1), x1(i - 1,k), x1(i - 1,k - 1), x2(i,k), x2(i,k - 1), x2(i,k - 2), x2(i,k - 3), x3(i,k), and x3(i,k - 1), respectively. From the contribution plot, the current and lagged variables of x1 are not significantly beyond the control limits, while all variables related to x2 are diagnosed very clearly to be abnormal. Because of the variable correlations, the variables of x3 are also outside the control limits. The diagnosis results reveal that the fault occurs to x2. The monitoring result of the 2D-DPCA model based on the previous ROS determination method is plotted in Figure 8. The fault also is detected but is less clear than the result in Figure 6. Here, 9.6% of the SPE values are outside the 99% control limit, while 24.75% of the points are outside the 95% control limit. In Figure 9, variables 1-10 are x1(i,k), x1(i,k - 1), x1(i,k - 2), x1(i - 1,k), x1(i - 1,k - 1), x2(i,k), x2(i,k - 1), x2(i,k 3), x3(i,k), x3(i,k - 1), and x3(i,k - 2), respectively. In this contribution plot, x1, x2, and x3 all have current or lagged variables that are significantly outside the control limits. It is hard to conclude the reason of the fault.

Figure 8. Monitoring result based on 2D-DPCA with the previous ROS determination method (the solid curve denotes 99% control limits, whereas the dashed curve represents 95% control limits).

Figure 9. Fault diagnosis result based on 2D-DPCA with the previous ROS determination method (the solid curve denotes 99% control limits, whereas the dashed curve represents 95% control limits).

From the above comparisons, the improved ROS automatic determination method behaves better than the previous one, either in monitoring or in fault diagnosis. 6. Conclusions To better model and monitor the batch processes with twodimensional (2D) dynamics, two-dimensional dynamic principal component analysis (2D-DPCA) has been proposed earlier by the authors. The support region (ROS) determination step is important to ensure the 2D-DPCA modeling accuracy and the monitoring efficiency. This paper discussed the properties and the drawbacks of the previous ROS automatic determination method. Consequently, an improved method, using iterative stepwise regressions and variable partial correlation information, has been proposed. This method could be regarded as a generalization of the autoregressive (AR) model order selection to 2D multivariate cases, which is data-driven and can be readily applied. It overcomes the shortcomings of the previous method, and redounds to better modeling, monitoring, and fault diagnosis of the 2D dynamic batch processes. The benefits of the proposed method have been demonstrated with two simulation applications.


Acknowledgment This project is supported in part by the Hong Kong Research Grant Council (under Project No. 613107). Literature Cited (1) Jackson, J. E. A User’s Guide to Principal Components; Wiley: New York, 1991. (2) Nomikos, P.; MacGregor, J. F. Monitoring of batch processes using multiway principal component analysis. AIChE J. 1994, 40, 1361. (3) Nomikos, P.; MacGregor, J. F. Multiway partial least squares in monitoring batch processes. Chemom. Intell. Lab. Syst. 1995, 30, 97. (4) Chen, J.; Liu, K. C. On-line batch process monitoring using dynamic PCA and dynamic PLS models. Chem. Eng. Sci. 2002, 57, 63. (5) Flores-Cerrillo, J.; MacGregor, J. F. Multivariate monitoring of batch processes using batch-to-batch information. AIChE J. 2004, 50, 1219. (6) Lu, N.; Yao, Y.; Gao, F. Two-dimensional dynamic PCA for batch process monitoring. AIChE J. 2005, 51, 3300. (7) Yao, Y.; Diao, Y.; Lu, N.; Lu, J.; Gao, F. Two-dimensional dynamic principal component analysis with autodetermined support region. Ind. Eng. Chem. Res. 2009, 48, 837. (8) Wold, S. Cross-validatory estimation of the number of components in factor and principal components models. Technometrics 1978, 20, 397. (9) Miller, P.; Swanson, R. E.; Heckler, C. E. Contribution plots: A missing link in multivariate quality control. Int. J. Appl. Math. Comput. Sci. 1998, 8, 775. (10) Conlin, A. K.; Martin, E. B.; Morris, A. J. Confidence limits for contribution plots. J. Chemom. 2000, 14, 725.

9969

(11) Westerhuis, J.; Gurden, S.; Smilde, A. Generalized contribution plots in multivariate statistical process monitoring. Chemom. Intell. Lab. Syst. 2000, 51, 95. (12) Yao, Y.; Gao, F. Batch process monitoring in score space of twodimensional dynamic principal component analysis (PCA). Ind. Eng. Chem. Res. 2007, 46, 8033. (13) Yao, Y.; Gao, F. A survey on multistage/multiphase statistical modeling methods for batch processes. Annu. ReV. Control 2009, 33, 172. (14) Gauchi, J. P.; Chagnon, P. Comparison of selection methods of explanatory variables in PLS regression with application to manufacturing process data. Chemom. Intell. Lab. Syst. 2001, 58, 179. (15) Boggia, R.; Forina, M.; Fossa, P.; Mosti, L. Chemometrics study and validation strategies in the structure-activity relationships of a new class of cardiotonic agents. Quant. Struct.-Act. Relat. (QSAR) 1997, 16, 201. (16) Box, G. E. P.; Jenkins, G. M.; Reinsel, G. C. Time Series Analysis: Forecasting and Control, 3rd ed.; Prentice Hall: Englewood Cliffs, NJ, 1994. (17) Pen˜a, D., Tiao, G. C.; Tsay, R. S. A Course in Time Series Analysis; Wiley: New York, 2001. (18) Grechanovsky, E. Stepwise regression procedures: Overview, problems, results, and suggestions. Ann. N.Y. Acad. Sci. 1987, 197. (19) Draper, N. R.; Smith, H. Applied Regression Analysis, 3rd ed.; Wiley: New York, 1998. (20) Stuart, A.; Ord, J. K. Kendall’s AdVanced Theory of Statistics, Vol. 2: Classical Inference and Relationship, 5th ed.; Oxford University Press: New York, 1991.

ReceiVed for reView April 12, 2010 ReVised manuscript receiVed August 2, 2010 Accepted August 31, 2010 IE100860X

Statistical Monitoring and Fault Diagnosis of Batch Processes Using

Recommend Documents