Multivariate Statistical Process Control Method Including Soft Sensors

Article pubs.acs.org/IECR

Multivariate Statistical Process Control Method Including Soft Sensors for Both Early and Accurate Fault Detection Yasuyuki Masuda, Hiromasa Kaneko, and Kimito Funatsu* Department of Chemical System Engineering, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113-8656, Japan S Supporting Information *

ABSTRACT: The development of process monitoring and control methods is important to maintaining product quality in chemical plants safely and effectively. Therefore, multivariate statistical process control (MSPC) methods have been developed, but traditional MSPC methods cannot detect faults relating to process variables that are difficult to measure online. In this work, a new MSPC method including soft sensor prediction is proposed to solve this problem. Soft sensors predict values of difficult-to-measure variables that are used as input variables of fault detection models. The proposed method enables the real-time control of processes using difficult-to-measure variables. The fault detection performance of the proposed method is demonstrated and compared with that of traditional MSPC methods using the Tennessee Eastman process and real industrial process data sets. The results show that the proposed method can achieve more accurate and earlier fault detection than traditional MSPC methods.

PCA11 have been researched and developed. To overcome the assumption that the data follow a normal distribution in PCA, independent component analysis (ICA)12 was proposed and applied to MSPC.13,14 ICA was combined with PCA15 and applied to dissimilarity16 and the hidden Markov model.17 In addition, nonlinearity in processes can be considered with nonlinear MSPC methods such as kernel PCA18,19 and support vector machines (SVMs),20,21 and multivariate multimodal distributions in process data can be handled with the Gaussian mixture model (GMM)22 and adaptive GMM.23 However, the traditional MSPC methods cannot detect faults that relate to process variables that are difficult to measure online because the values of these variables cannot be obtained in real time and the variables cannot be used as input variables for the fault detection model.
Product properties related to quality such as concentrations and density cannot be measured online or frequently, so it is impossible to detect faults that are relevant to product quality rapidly and accurately. Although Qin and Zheng included output variables in their fault detection models,24 real-time measurement of the output variables is assumed, and real-time fault detection cannot be performed when the output variables are difficult to measure. Difficult-to-measure variables embody much information about the process, and such variables should be controlled in real time. We therefore apply soft sensors that predict difficult-to-measure variables in real time from easy-to-measure variables to MSPC. An inferential model is constructed between the variables X that are easy to measure online and the variables Y that are not, and the Y variables are then predicted using that model.25,26 The proposed MSPC method involves soft sensors, which means that both X variables measured with hard sensors

1. INTRODUCTION The early and accurate detection of faults is very important to prevent unexpected accidents in chemical plants. To maintain product quality and improve productivity, it is also necessary to keep chemical plants out of trouble, because the final products are the results of the process including operating conditions in the plants. Therefore, process variables such as temperature, pressure, flow rate, liquid level, and product quality must be measured and monitored. Control limits for the process variables are set to control the variables by comparing the measured values with the control limits. In this way, univariate process control can be performed, but this approach ignores the relationships between process variables derived from chemical processes. Multivariate statistical process control (MSPC) methods are used to handle process variables and the relationships among them. MSPC shows higher fault detection performance than univariate control because MSPC methods introduce latent variables extracted from the process variables using multivariate analysis and enable the construction of models monitoring inherent variations in processes with smaller numbers of variables. The features of the processes are effectively taken into account in MSPC methods. For these reasons, MSPC methods are widely researched.1,2 Principal component analysis (PCA)3 is one of the most popular statistical analysis methods for MSPC. Jackson and Mudholkar first applied PCA to MSPC.4 The PCA method seeks linear correlations among process variables that express major trends in a data set, enables a decrease in the dimensions of highly correlated variables, and is applicable to process monitoring and control. For these reasons, PCA has been extremely successful for fault detection and diagnosis in the field of MSPC. 
Therefore, various types of PCA-based MSPC methods such as dynamic PCA,5 recursive PCA,6 dissimilarity-based PCA,7 maximum-likelihood PCA,8 progressive PCA,9 cumulative-sum- (CUSUM-) based PCA,10 and distributed
© 2014 American Chemical Society

Received: January 20, 2014
Revised: April 17, 2014
Accepted: May 1, 2014
Published: May 1, 2014
dx.doi.org/10.1021/ie501024w | Ind. Eng. Chem. Res. 2014, 53, 8553−8564

Industrial & Engineering Chemistry Research

Article

data, and so on. The contribution ratio of the ith principal component is the ratio of the variance of the ith component to the sum of the variances of all components. In this article, the number of principal components was determined based on the cumulative contribution ratio. 2.2. MSPC Method Based on PCA. A process variable can be controlled individually using the three-sigma method. However, process variables cannot be controlled appropriately when there are correlations between the process variables. The process control method based on PCA, which is one of the MSPC methods, can be applied when the process variables are correlated. First, PCA is applied to the multivariate data, and the principal components are extracted; then, Hotelling’s T2 statistic is calculated as

and Y variables predicted with soft sensors are used as input variables of the MSPC models. The values of both X and Y variables can be obtained in real time, which enables the fault detection model to detect any faults related to X and Y variables. Additionally, the proposed method can be combined with all of the traditional MSPC methods such as those in refs 4−24 and is expected to perform earlier and more accurate fault detection than traditional MSPC methods. Kano and Nakagawa developed a PCA model for fault detection and a partial least-squares (PLS) model for a soft sensor.25 The PLS-based soft sensor model was used only to predict Y values, and only the PCA model without Y variables was employed to detect abnormal events. Kourti constructed a PLS model and used Hotelling’s T2 and Q statistics for fault detection, but the input variables for the T2 and Q statistics were X variables and the effect of Y variables was not considered explicitly.27 AlGhazzawi and Lennox used not only the T2 and Q statistics of a PLS model, but also the prediction errors of Y variables for the detection of poor control performance.28 Although Y variables were considered directly, the errors could be obtained only after the Y measurements, which were delayed compared to the X measurements, so fault detection was delayed as well. Additionally, the relationships between X variables and Y variables were not considered in fault detection. On the other hand, our proposed method can handle X variables, Y variables, and the relationships between X variables and Y variables simultaneously in real time, because Y variables estimated by soft sensors are input variables of fault detection models. In addition, the methods used to construct the soft sensors are not restricted. Therefore, we can use first-principles models, nonlinear regression models, and hybrid models as soft sensor models even when fault detection models are constructed with linear methods such as the PCA method.
In this article, we use the PCA-based MSPC method as an example of MSPC methods and verify the effectiveness and usability of the proposed method through fault detection results using the Tennessee Eastman process and a real industrial process.
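The data flow of the proposed method (soft sensor predictions of Y appended to the measured X variables before fault detection) can be sketched as follows. This is only an illustration under stated assumptions: an ordinary least-squares model stands in for the PLS or SVR soft sensor, and the function names `fit_soft_sensor`, `predict_soft_sensor`, and `augment_with_predictions` are hypothetical, not taken from the article.

```python
import numpy as np

def fit_soft_sensor(X_measured, Y_measured):
    """Fit a linear soft sensor on the rows where Y was actually analyzed.

    Assumption: plain least squares is used here as a stand-in for PLS/SVR.
    """
    A = np.hstack([np.ones((X_measured.shape[0], 1)), X_measured])  # intercept column
    coef, *_ = np.linalg.lstsq(A, Y_measured, rcond=None)
    return coef

def predict_soft_sensor(coef, X):
    """Predict Y for every X sample, including those with no analyzer reading."""
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    return A @ coef

def augment_with_predictions(coef, X):
    """Build the fault detection input [X, Y_hat] from online X values alone."""
    return np.hstack([X, predict_soft_sensor(coef, X)])
```

Because the augmented matrix is available at every X sampling time, any MSPC model fitted on it can be evaluated without waiting for the analyzer.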

$$T^2 = \sum_{i=1}^{r} \frac{t_i^2}{\sigma_i^2} \tag{2}$$

where ti is the score of the ith principal component and σi is the standard deviation of the ith principal component. For the residual of a PCA model, the Q statistic is given by

$$Q = \sum_{i=1}^{p} (x_i - \hat{x}_i)^2 \tag{3}$$

where p is the number of X variables and x̂i is the value of the xi variable estimated with the PCA model. The Q statistic is also called the squared prediction error and monitors the part of the data that is not expressed with the PCA model. For both the T2 and Q statistics, confidence limits are determined to make the distinction between normal and abnormal states. We monitored the values of the T2 and Q statistics simultaneously and detected abnormal states when the value of either statistic exceeded its threshold. 2.3. Proposed Method. Some process variables are difficult to measure online and are measured only at intervals. The values of those variables cannot be obtained in real time. When those kinds of process variables are included as input variables for fault detection models, the models cannot detect abnormal events in real time. However, the difficult-to-measure variables (Y variables) such as quality of products embody much information on the process. In this study, we propose a fault detection method that includes X variables measured online by hard sensors and Y variables predicted by soft sensors. An overview of the traditional methods and the proposed method is shown in Figure 1. X variables can be measured online, but Y variables are difficult to measure online. Traditional method ① does not include Y variables among the input variables of the fault detection model and cannot detect abnormal events relating to Y. For traditional method ②, although the input variables include Y variables and faults related to Y variables can be detected, fault detection is delayed because of the measurement delay of the Y analyzer and cannot be performed frequently because of the low measurement frequency of the Y analyzer. On the other hand, proposed method ③ includes Y variables whose values are predicted in real time by inputting values of X variables into the soft sensor model. Therefore, the proposed method can perform online fault detection including Y variables.
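The PCA-based monitoring described above (component selection by cumulative contribution ratio, the T2 and Q statistics, and confidence limits derived from normal data) can be sketched with NumPy as follows. This is a minimal sketch, not the authors' implementation: autoscaling of the variables, the use of `np.quantile` for the empirical limits, and all function and key names are assumptions made for illustration.

```python
import numpy as np

def t2_q(model, X):
    """Return Hotelling's T2 and the Q statistic for each row of X."""
    Z = (X - model["mean"]) / model["std"]
    T = Z @ model["P"]                          # scores t_i
    t2 = np.sum(T**2 / model["var"], axis=1)    # sum of t_i^2 / sigma_i^2
    residual = Z - T @ model["P"].T             # part not expressed by the PCA model
    q = np.sum(residual**2, axis=1)             # squared prediction error
    return t2, q

def fit_pca_monitor(X_normal, cum_ratio=0.95, limit_quantile=0.997):
    """Fit a PCA monitoring model on normal-operation data (rows are samples)."""
    mean = X_normal.mean(axis=0)
    std = X_normal.std(axis=0, ddof=1)
    Z = (X_normal - mean) / std                 # autoscaling (assumed preprocessing)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]           # descending order of variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Smallest number of components whose cumulative contribution ratio reaches cum_ratio.
    ratio = np.cumsum(eigvals) / eigvals.sum()
    r = int(np.searchsorted(ratio, cum_ratio) + 1)
    model = {"mean": mean, "std": std, "P": eigvecs[:, :r], "var": eigvals[:r]}
    t2, q = t2_q(model, X_normal)
    # Empirical limits: the chosen quantile of each statistic on the normal data.
    model["t2_limit"] = np.quantile(t2, limit_quantile)
    model["q_limit"] = np.quantile(q, limit_quantile)
    return model

def is_abnormal(model, X):
    """Flag a sample when either statistic exceeds its limit."""
    t2, q = t2_q(model, X)
    return (t2 > model["t2_limit"]) | (q > model["q_limit"])
```

A sample that breaks the correlation structure learned from normal data mainly raises Q, whereas a sample that follows the structure but lies far from the normal operating region mainly raises T2, which is why both statistics are monitored together.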
This method is expected to enable more stable and more accurate detection of abnormal states than traditional method ① and earlier detection than traditional method ②. The proposed method includes two statistical models: One represents a soft sensor and is used for the prediction of Y variables, and the other is used for fault detection. The soft

2. METHODS The proposed method can be combined with all MSPC methods. First, we explain PCA and the PCA-based MSPC method used in this study, and then, we describe our proposed method. Of course, the proposed method can be applied not only to the PCA-based MSPC method but also to any other MSPC methods. 2.1. PCA. PCA is a method for transforming observed multivariate data into statistically uncorrelated components expressed as linear combinations of observed process variables. The principal components can be extracted in descending order of variance of the components. The observed multivariate data set X is represented using r principal components as follows

$$\mathbf{X} = \mathbf{t}_1\mathbf{p}_1^{\mathrm{T}} + \mathbf{t}_2\mathbf{p}_2^{\mathrm{T}} + \cdots + \mathbf{t}_r\mathbf{p}_r^{\mathrm{T}} + \mathbf{E} = \mathbf{T}\mathbf{P}^{\mathrm{T}} + \mathbf{E} \tag{1}$$

where ti is the ith principal component vector, pi is the ith loading vector, E is the matrix of X residuals, T is the principal component matrix, and P is the loading matrix. The pi components are determined in order, so that the variance of the ith principal component vector is maximized. PCA can reduce the dimension of X, which clarifies the data distribution, outliers in

sensor is constructed with statistical methods such as the PLS method29 and the support vector regression (SVR) method.30 An example of a fault detection model is the PCA-based MSPC model discussed in the above section. For example, even when the relationships between X and Y variables are nonlinear, by using a nonlinear regression method for the construction of the soft sensor model, one can consider nonlinear relationships with a linear MSPC model.

Figure 1. Basic concepts of the traditional (① and ②) and proposed (③) methods.

3. RESULTS AND DISCUSSION In this study, to demonstrate the effectiveness of the proposed method in an industrial process, we analyzed dynamic simulation data using the Tennessee Eastman (TE) process31 and real industrial data measured in a distillation column. 3.1. Tennessee Eastman Process. The TE process was developed by Eastman Chemical Company for simulating a real industrial process and is used for comparing the performance of process control and monitoring methods. The process diagram of the TE process is shown in Figure 2. The TE process consists of five units, namely, a reactor, a condenser, a compressor, a separator, and a stripper, and eight components A−H. Liquid products G and H and byproduct F are produced from gaseous reactants A, C, D, and E through the following chemical reactions

A(g) + C(g) + D(g) → G(liq)
A(g) + C(g) + E(g) → H(liq)
A(g) + E(g) → F(liq)
3D(g) → 2F(liq)

where the reactions are irreversible, exothermic, and approximately first-order with respect to the reactant concentrations. The reaction rates are Arrhenius functions of temperature, where the reaction for G has a higher activation energy than the reaction for H, resulting in a higher sensitivity to temperature.
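The link between activation energy and temperature sensitivity follows from the Arrhenius form. As a brief sketch, with the symbols k (rate constant), A (pre-exponential factor), E_a (activation energy), R (gas constant), and T (absolute temperature) introduced here for illustration rather than taken from the article:

```latex
k(T) = A \exp\!\left(-\frac{E_a}{RT}\right)
\quad\Longrightarrow\quad
\frac{\mathrm{d}\ln k}{\mathrm{d}T} = \frac{E_a}{RT^{2}}
```

At a given temperature, the relative change in rate per unit temperature change is proportional to E_a, so the reaction with the larger activation energy (the one producing G) responds more strongly to a temperature disturbance than the one producing H.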

Figure 2. Tennessee Eastman process.

The TE process includes 11 manipulated variables, 22 easy-to-measure process variables (see Table S1, Supporting Information), and 19 composition measurements (see Table S2, Supporting Information). The 11 manipulated variables and 22 easy-to-measure process variables are used as X variables. The 19 composition values are measured with analyzers A−C. The X variables are measured online every 3 min, but the sampling interval of analyzers A and B is 6 min, and that of analyzer C is 15 min. Hence, if abnormal events involving the composition measurements at analyzer C occur, the detection of these events will be delayed because the composition values are difficult to obtain. It is important to control these variables for safe and effective plant operation. There are 21 preprogrammed faults including five unknown faults in the TE process (see Table S3, Supporting Information). In this study, we target and discuss the process faults IDV(4), IDV(7), and IDV(11) because only small differences were confirmed between the results of the traditional and proposed methods for the other process faults. The training and test data were obtained from the literature.32 Thus, the control scheme was completely the same as that of ref 32. The training data sets for the normal condition and each fault consisted of 500 and 480 observations, respectively. The test data set for each fault consisted of 960 observations, the first 160 of which (8 simulation hours) were normal data. The X variables were sampled every 3 min, but the Y variables were sampled every 6 or 15 min. Therefore, the numbers of sampling times were 490 at analyzers A and B and 196 at analyzer C for the training data, and the numbers of sampling times were 480 at analyzers A and B and 192 at analyzer C for the test data.
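The sampling counts quoted above follow from simple arithmetic on the sampling intervals. The sketch below reproduces them, assuming that the soft sensor training set pools the 500 normal and 480 faulty observations (an inference from the stated totals) and using a hypothetical helper name.

```python
def analyzer_sample_count(n_x_samples: int, x_interval_min: int, analyzer_interval_min: int) -> int:
    """Number of analyzer readings available among n_x_samples X observations."""
    if analyzer_interval_min % x_interval_min != 0:
        raise ValueError("analyzer interval must be a multiple of the X interval")
    step = analyzer_interval_min // x_interval_min  # X samples per analyzer reading
    return n_x_samples // step

X_INTERVAL_MIN = 3
n_train = 500 + 480   # normal + one fault (assumed pooling)
n_test = 960

train_ab = analyzer_sample_count(n_train, X_INTERVAL_MIN, 6)    # analyzers A and B
train_c = analyzer_sample_count(n_train, X_INTERVAL_MIN, 15)    # analyzer C
test_ab = analyzer_sample_count(n_test, X_INTERVAL_MIN, 6)
test_c = analyzer_sample_count(n_test, X_INTERVAL_MIN, 15)
print(train_ab, train_c, test_ab, test_c)  # 490 196 480 192
```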
In soft sensor analysis, we used only the data where the Y variables were measured, but to incorporate the time-delayed X variables that will be mentioned later, other data points were also used for X variables. We developed the soft sensor models that predict the Y values by using all of the training data and evaluated them by using all of the test data. It is important to control concentrations of products and byproducts to maintain product quality at a high level, but composition measurements are difficult to obtain in real time. Therefore, we constructed soft sensors to predict the values of the composition measurements of products G and H and byproduct F using the PLS and SVR methods; that is, we regarded the 11 manipulated variables and 22 easy-to-measure process variables in Table S1 (Supporting Information) as X variables and the three composition measurements as Y variables. The PLS method is a linear regression method, whereas the SVR method is a nonlinear regression method; the details of PLS and SVR are given in Appendix A and Appendix B, respectively. To incorporate the dynamics of process variables into soft sensor models, X included each explanatory variable that was delayed for durations ranging from 0 to 30 min in steps of 3 min. In addition, we included each composition measurement that was delayed for 6, 12, and 18 min for analyzer B and for 15 and 30 min for analyzer C because the measurement interval of analyzer B was 6 min and that of analyzer C was 15 min. Tables 1 and 2 report the modeling and prediction results at analyzer B when we targeted IDV(7) and at analyzer C when we targeted IDV(4) and IDV(11), respectively. Both IDV(4) and IDV(11) correspond to reactor cooling water inlet temperature. The details of r2, q2, and r2pred are explained in Appendix C. As can be seen in Tables 1 and 2, almost all r2, q2, and r2pred values exceeded 0.9 and were very high, so the

Table 1. Modeling and Prediction Results at Analyzer B in the Case of IDV(7)

| analyzer B | PLS r2 | PLS q2 | PLS r2pred | SVR r2 | SVR q2 | SVR r2pred |
|---|---|---|---|---|---|---|
| F | 0.979 | 0.968 | 0.963 | 0.991 | 0.966 | 0.960 |
| G | 0.998 | 0.993 | 0.991 | 1.000 | 0.988 | 0.982 |
| H | 1.000 | 1.000 | 1.000 | 1.000 | 0.981 | 0.978 |

Table 2. Modeling and Prediction Results at Analyzer C in the Cases of IDV(4) and IDV(11)^a

| analyzer C | PLS r2 | PLS q2 | PLS r2pred | SVR r2 | SVR q2 | SVR r2pred |
|---|---|---|---|---|---|---|
| F | 0.993 | 0.987 | 0.992 | 1.000 | 0.982 | 0.982 |
| G | 0.961 | 0.927 | 0.910 | 1.000 | 0.925 | 0.876 |
| H | 1.000 | 1.000 | 1.000 | 1.000 | 0.998 | 0.997 |

^a Both IDV(4) and IDV(11) correspond to reactor cooling water inlet temperature.

Table 3. Fault Detection Performance^a

| method | index | IDV(4) | IDV(11) | IDV(7) |
|---|---|---|---|---|
| ① | accuracy rate | 0.1779 | 0.8589 | 0.4642 |
| ① | detection rate | 0.0238 | 0.8337 | 0.3638 |
| ① | precision | 1.0000 | 0.9985 | 1.0000 |
| ② | accuracy rate | 0.9989 | 0.9768 | 1.0000 |
| ② | detection rate | 1.0000 | 0.9738 | 1.0000 |
| ② | precision | 0.9988 | 0.9987 | 1.0000 |
| ③^b | accuracy rate | 0.9947 | 0.9723 | 0.9809 |
| ③^b | detection rate | 1.0000 | 0.9888 | 0.9950 |
| ③^b | precision | 0.9938 | 0.9790 | 0.9827 |

^a See Figure 1 for ①, ②, and ③. ^b Proposed method.

predictive PLS and SVR models could be constructed with high accuracy. High prediction accuracy for analyzer C was achieved with single soft sensors for IDV(4) and IDV(11), because the correlations between process variables are maintained within the same type of process fault even when its behavior differs. The r2pred values of the PLS models were slightly higher than those of the SVR models in all cases, as shown in Tables 1 and 2. The relationship between process variables is linear, and the SVR model would be overfit to the training data. The Y values can be predicted in real time from X values obtained online with the hardware sensors and from past Y values. Figure 3 shows the relationships between the simulated and predicted Y values. Tight clusters of predicted values along the diagonal can be seen, reflecting the high prediction accuracy for component F at analyzer B and component G at analyzer C for both the PLS and SVR models. As shown in Figure 3a,b, the SVR model predicted values in the range from about 65% to 70% more accurately than the PLS model. The concentration region from about 65% to 70% corresponds to the transition from normal to abnormal conditions. The SVR model could predict this transition accurately, but there seemed to be a negative bias of the prediction errors when the actual Y values were higher than 80%, which is why the r2pred value in the SVR model was lower than that in the PLS model in Table 1. Figure 3c,d shows much tighter clusters of predicted values along the diagonal in PLS modeling, compared with those in SVR modeling, when the Y values were lower than about 40% or higher than 45%. These regions correspond to the random variation of the


Here, TP denotes the number of true positives, or the number of samples for which the model detects the faults correctly; TN represents the number of true negatives, or the number of samples for which the model does not detect the faults and the system operation is indeed normal; FP denotes the number of false positives, or the number of samples for which the model detects the faults incorrectly; and FN represents the number of false negatives, or the number of samples for which the system operation is actually abnormal but the model does not detect the faults. The confidence limits of the MSPC models are the 99.7% values of the T2 and Q statistics of normal process data. Thus, these confidence limits are equal to the values for the 937th data point of the 940 normal data points for the T2 and Q statistics. The fault detection results for IDV(4) are shown in Table 3 and Figures 4 and 5. These results show that traditional method

Figure 3. Relationships between simulated and predicted Y values for the TE process.

reactor cooling water inlet temperature in IDV(11). The linearity between the process variables will be retained in the variation as well, and the PLS model should have more predictive accuracy than the SVR model. We compared two traditional MSPC models (① and ② in Figure 1) and the proposed MSPC model (③ in Figure 1) in terms of fault detection performance. As is the case with soft sensor analysis, X included each explanatory variable that was delayed for durations ranging from 0 to 30 min in steps of 3 min. For traditional method ②, composition measurements F−H at analyzer C were added to the X variables in the fault detection of IDV(4) and IDV(11), and composition measurements F−H at analyzer B were added to the X variables in the fault detection of IDV(7). For proposed method ③, we used soft sensor models constructed with PLS. The predicted compositions of F−H at analyzer C were added to the X variables in the fault detection of IDV(4) and IDV(11), and the predicted compositions of F−H at analyzer B were added to the X variables in the fault detection of IDV(7). All of the MSPC models in this case study were based on PCA, and the PCA models were constructed with only normal data in the training data. The number of components was determined so that the cumulative contribution ratio first exceeded 95% or 99%. Although we examined the fault detection performances in both cases (i.e., 95% and 99%), there was no significant difference in the results. Therefore, we discuss the results for 95% in this article. As indexes to compare the fault detection performance, the accuracy rate (AR), precision (PR), and detection rate (DR) were used; they are defined as

$$\mathrm{AR} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{FP} + \mathrm{TN} + \mathrm{FN}} \tag{4}$$

$$\mathrm{PR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}} \tag{5}$$

$$\mathrm{DR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} \tag{6}$$
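The three indexes in eqs 4−6 can be evaluated directly from the four confusion-matrix counts. The following minimal sketch uses a hypothetical helper name and made-up counts chosen only to illustrate the arithmetic; they are not results from the article.

```python
def detection_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Return accuracy rate (AR), precision (PR), and detection rate (DR)."""
    total = tp + tn + fp + fn
    return {
        "AR": (tp + tn) / total,
        # PR and DR are undefined when their denominators are zero.
        "PR": tp / (tp + fp) if (tp + fp) else float("nan"),
        "DR": tp / (tp + fn) if (tp + fn) else float("nan"),
    }

# Hypothetical counts for illustration only.
example = detection_metrics(tp=90, tn=45, fp=5, fn=10)
```

A model that raises very few alarms can have a precision of 1 while its detection rate stays near 0, which is why the three indexes are reported together.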

Figure 4. Time plots of the T2 and Q statistics for IDV(4) using traditional method ①. The dashed lines represent the confidence limits of T2 and Q.

① could not detect the abnormal condition with either the T2 or the Q statistic. Figure 4 indicates that the Q statistic largely exceeded the confidence limit only in the transition region from the normal conditions to the abnormal conditions, whereas the T2 statistic was within the confidence limit in this region. This result suggests that there is a breakage of the correlation among the process variables only in the transition region and that the correlation is maintained under abnormal conditions after the transition. Thus, the accuracy and detection rates were both very low for method ①. The fault detection performances in Table 3 improved greatly when traditional method ② or proposed method ③, both of which monitor the concentrations of components F−H at analyzer C, was used. However, for method ②,


detection results using only the T2 or Q statistic are reported in Tables S4 and S5 (Supporting Information). In addition, the above fault detection results indicate that the step variation of the reactor cooling water inlet temperature did not appear directly in process variables such as the reactor temperature and the reactor cooling water outlet temperature during the disturbance. The heats of all reactions in the disturbance would decrease from those in the normal state, and this change would be induced by the variation of the production of component G, whose reaction rate changed easily compared with those of the other components. In this process, the process variables such as temperatures were not affected, whereas the component concentration was affected by the disturbance. Therefore, monitoring the process with the concentration of component G could improve the fault detection performance. On the other hand, Table 3 and Figure 7 show that the fault in IDV(11) was roughly detected with the Q statistic for

Figure 5. Time plots of the T2 and Q statistics for IDV(4) using proposed method ③. The dashed lines represent the confidence limits of T2 and Q.

fault detection is delayed because of the measurement time of the concentration. Figure 5 shows the T2 and Q statistics of proposed method ③. In this case, the effects of component concentrations of F and H at analyzer C from the disturbance were small. The T2 and Q statistics were greatly influenced by the component concentration of G at analyzer C (Y variable), and the high prediction accuracy of this component (see Figure 6) resulted in the good fault detection performance of

Figure 7. Time plots of the T2 and Q statistics for IDV(11) using traditional method ①. The dashed lines represent the confidence limits of T2 and Q.

traditional method ①, reflecting the fact that there was a change in the correlation among the process variables in the disturbance. Because the same fault detection model was used for IDV(4) and IDV(11), the values of the T2 and Q statistics in Figures 4 and 7 suggest that the process state in IDV(11) differs from the normal state more than the process state in IDV(4) does. Also, in the case of IDV(11), monitoring of component G improved the fault detection performance of traditional method ② and proposed method ③, as shown in Table 3. Figure 8 shows the T2 and Q statistics of proposed method ③. The T2 and Q statistics with both the process variables (X variables) and the predicted component concentrations (Y variables)

Figure 6. Time plot of predicted values of component G at analyzer C for IDV(4).

method ③ that was almost the same as that of method ②. There was a breakage of the correlation structure among the component concentrations and process variables of these models under steady abnormal conditions. One can thus say that monitoring of the component concentration greatly improved the fault detection performance in these systems. The fault

Figure 8. Time plots of the T2 and Q statistics for IDV(11) using proposed method ③. The dashed lines represent the confidence limits of T2 and Q.

Figure 10. Time plots of the T2 and Q statistics for IDV(7) using traditional method ①. The dashed lines represent the confidence limits of T2 and Q.

reflect the disturbance more strongly than those with only the X variables. This is because the predicted concentration of component G at analyzer C (Y variable) could represent the random variation in the process (see Figure 9). Therefore,

accuracy when traditional method ② and proposed method ③ were used. The time plots of the T2 and Q statistics for IDV(7) using proposed method ③ and the time plot of predicted values of component F at analyzer B for IDV(7) are shown in Figures 11 and 12, respectively. The values of the Y variable became large in disturbance IDV(7) and had a positive effect on fault detection. On one hand, the fault detection model constructed with method ② had a time delay in fault detection because of the measurement time of the Y variables; on the other hand, the proposed method achieved accurate and early fault detection by predicting the values of the Y variables with the soft sensor. The disturbance of IDV(7) (i.e., C header pressure loss-reduced availability) appeared in the process variables (X variables) only when the transition from the normal state to the abnormal state happened, whereas the disturbance appeared in the component concentration of purged byproduct F at analyzer B in the abnormal state (see Figure 12). The accurate prediction of this component concentration by the soft sensor and monitoring of the predicted values could improve the fault detection performance. Although C header pressure loss had occurred, the total feed and the total feed flow valve were controlled to the normal-state values, which was the reason for the behaviors of the T2 and Q statistics in ① (Figure 10). Meanwhile, the component composition in the plant would be changed due to disturbance IDV(7); that is, the composition of reactant A and/or the composition of nonreactant B would increase, and the composition of reactant C would decrease. The decrease of the C composition would accelerate the reaction that does not require C and produces byproduct F. Therefore, the disturbance appeared in the F composition. By using the Y values predicted by soft sensors, the proposed method achieved almost the same fault detection performance

Figure 9. Time plot of predicted values of component G at analyzer C for IDV(11).

control with the X variables and the predicted Y variables is useful for accurate fault detection. In addition, the proposed model can detect a fault that has a different type of variation. The proposed method is effective for practical applications. Figure 10 shows the time plots of the T2 and Q statistics for IDV(7) using traditional method ①. The values of the statistics became large only in the transition region from normal conditions to abnormal conditions using traditional method ①, as was the case for IDV(4). The behaviors of those statistics resulted in poor fault detection performance, as shown in Table 3. Even in this case, there was an improvement in the fault detection


to other difficult-to-measure variables if the corresponding predictive soft sensors are constructed. In this case study, we did not consider the degradation of the soft sensor models,33,35 in which the predictive accuracy of the soft sensors tends to decrease gradually for several reasons, including changes in the state of the chemical plant, catalyst performance loss, and sensor and process drift. When degradation occurs in realistic cases, adaptive soft sensor models33−36 should be used for accurate predictions even in irregular process states and with long sampling intervals of Y variables because the performance of the proposed method largely depends on the predictive accuracy of the soft sensor model. By using adaptive models and increasing the predictive ability of the soft sensor model, the accuracy of the proposed method also increases. Next, we employ an adaptive model to maintain a high predictive ability of the soft sensor model in a realistic case study using industrial data. 3.2. Application to a Real Industrial Process. To verify the effectiveness of the proposed method, we analyzed the data set obtained from the operation of a distillation column at Mizushima Works, Mitsubishi Chemical Corporation. The Y variable is the concentration of the bottom product, and the X variables are 19 variables such as temperature and pressure. The measurement interval of Y was 30 min, whereas the X variables were measured every minute. We used data in which the Y variable was measured in years 2002 and 2003. Reference 34 also deals with the same distillation column, and a schematic representation of the distillation column and details of the process variables can be found in ref 34. The normal data from 2002 were used as training data, and 4000 h worth of data from 2003 were used as test data. The training data were used for training both the PLS model and the PCA model. The modeling results are presented in Table 4. The r2 and q2 values are high, meaning

Figure 11. Time plots of the T2 and Q statistics for IDV(7) using proposed method ③. The dashed lines represent the confidence limits of T2 and Q.

Table 4. Modeling and Prediction Results for PLS Using the Real Industrial Data

statistic        value
r2               0.867
RMSE(r2)         0.392
q2               0.859
RMSE(q2)         0.403
r2pred           0.448
RMSE(r2pred)     0.268

Figure 12. Time plot of predicted values of component F at analyzer B for IDV(7).

that a predictive PLS model could be constructed. In this case study, we first eliminated the abnormal data from the training data for the PLS model because the abnormal data decreased its predictive accuracy. At around 3840−3845 h in the test data, a Y-analyzer fault occurred, and a process fault occurred along with it, although the details of the process fault are uncertain. We tried to detect both the Y-analyzer fault and the process fault. The time plots of the T2 and Q statistics obtained using traditional method ① are shown in Figure 13. The number of principal components was determined so that the cumulative contribution ratio first exceeded 99%. The thresholds of the T2 and Q statistics were set according to the same procedure as used in the TE process data analyses. As Figure 13a shows, the values of T2 exceeded the threshold most of the time even though the plant state was normal. Figure 13b is an enlarged plot of the period from 3500 to 4000 h, in which most of the T2 values exceeded the threshold from about 3550 h. However, the actual faults started at 3840 h. For the Q statistic in Figure 13c, the values exceeded

as the method using the measured Y values and having a time delay in fault detection. The proposed method can control the process in real time because the input variables for fault detection include only X variables and difficult-to-measure Y variables predicted in real time. Measuring component concentrations with analyzers generally takes time. In contrast, the predicted component concentrations in this work can be obtained with soft sensors in real time, and the proposed method can detect faults with no time delay. Although we constructed soft sensors with data under normal conditions and under a single abnormal condition in this study, we believe that developing soft sensors with data obtained under various conditions would enlarge the applicability domains34 of the proposed system and enable more accurate fault detection. Additionally, although we handled only component concentrations in this study, the proposed method can be applied

dx.doi.org/10.1021/ie501024w | Ind. Eng. Chem. Res. 2014, 53, 8553−8564


Figure 13. Time plot of the T2 and Q statistics using traditional method ① with real industrial data.

Figure 15. Time plot of the T2 and Q statistics using the proposed method with real industrial data.

was too late compared to the actual fault time (3840−3845 h). Traditional method ① therefore could not detect the abnormality appropriately. We then detected the faults with proposed method ③. Because the degradation of the soft sensor model was confirmed in ref 37 for this data set, the moving-window PLS (MWPLS) model, which is an adaptive soft sensor model, was used to predict the values of Y. The number of data points for

Figure 14. Relationships between measured and predicted Y values when real industrial data are used.

the threshold at some times, although the number of such occurrences was lower than for the T2 statistic. The Q statistic could detect the abnormal event right before 3900 h, but this


values of process variables that are difficult to measure online, and therefore, it can enable earlier fault detection than traditional MSPC methods. The concepts of the method proposed in this study can be applied to other MSPC methods and are expected to improve their fault detection performance. The proposed method depends on the prediction accuracy of the soft sensor models, so the construction of highly predictive models is important for accurate fault detection. The diagnosis of faults using the proposed method is left for future work. By applying the proposed method to process control in industrial plants, with statistical methods selected appropriately for the characteristics of each process, the plants can be operated stably and safely.

model construction (i.e., the window size) was 500. After eliminating the abnormal data, the r2pred value was 0.448, which was relatively low compared to the r2 and q2 values. It should be noted that the variances of Y in the training and test data are different and that the r2, q2, and r2pred values are affected by the variance of Y. Thus, root-mean-square error (RMSE) values corresponding to r2, q2, and r2pred (see Appendix C) are also included in Table 4. The RMSE(r2pred) value was low, indicating that the MWPLS model could predict the Y values with low prediction errors. The relationships between the measured and predicted Y values are shown in Figure 14. The prediction results were affected by the process fault, and the prediction errors for the data with high Y values were relatively large, which could be one of the reasons for the low r2pred value. However, Figure 14 shows that all of the values of Y could be predicted well using the MWPLS model. The predicted Y variable was used as one of the X variables for the PCA model. Figure 15 shows the time plots of the T2 and Q statistics obtained using proposed method ③. In Figure 15a−c, a strong peak appears at around 3840 h, which was the actual time of the faults, and the proposed model could detect the target faults. However, the T2 values exceeded the threshold even when the state of the plant was normal, although the number of such occurrences was lower than that for the T2 statistic of the traditional method. The Q statistic in Figure 15d,e exhibited large values at around 3840−3845 h, which enabled the detection of the Y-analyzer fault and the process fault. In addition, the Q values exceeded the threshold in the normal state less often than the other statistics did. Although there were times when the MSPC model detected the normal state as abnormal, the target abnormal event could be accurately detected by predicting the Y values with an adaptive soft sensor model and using the proposed method.
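The moving-window update described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the class `MovingWindowRegressor`, the helper `least_squares_fit`, the synthetic drifting process, and the window size of 20 are all hypothetical, and ordinary least squares is used as a plain stand-in for the PLS fit so that the block stays self-contained. As in the case study, X is available at every step, whereas Y is assumed to be measured only every 30 steps.

```python
from collections import deque

import numpy as np


def least_squares_fit(X, y):
    """Stand-in for a PLS fit: returns (coefficients, constant)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]


class MovingWindowRegressor:
    """Refits the model on the most recent `window_size` measured samples."""

    def __init__(self, window_size, fit):
        self.window = deque(maxlen=window_size)  # oldest sample drops out
        self.fit = fit
        self.model = None

    def update(self, x, y):
        self.window.append((x, y))
        X = np.array([xi for xi, _ in self.window])
        Y = np.array([yi for _, yi in self.window])
        self.model = self.fit(X, Y)

    def predict(self, x):
        b, constant = self.model
        return float(x @ b + constant)


# Hypothetical process whose offset drifts slowly with time.
rng = np.random.default_rng(1)
true_b = np.array([1.0, -0.5])
mw = MovingWindowRegressor(window_size=20, fit=least_squares_fit)
static = None  # model frozen once the window has filled for the first time
err_mw, err_static = [], []

for t in range(3000):
    x = rng.normal(size=2)
    y = float(x @ true_b + 0.002 * t + 0.01 * rng.normal())
    if static is not None:
        err_mw.append(abs(mw.predict(x) - y))
        err_static.append(abs(float(x @ static[0] + static[1]) - y))
    if t % 30 == 0:  # a Y measurement arrives only every 30 steps
        mw.update(x, y)
        if static is None and len(mw.window) == 20:
            static = mw.model

print(np.mean(err_mw), np.mean(err_static))
```

In this toy setting the frozen model accumulates the drift in its error, whereas the moving-window model lags the drift only by roughly the span of the window, which is the motivation for using an adaptive soft sensor when degradation is known to occur.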
In the real industrial data analysis, the plant state was often detected as abnormal even when it was actually normal. This is because data noise exists and the process characteristics change in real industrial plants. To solve this problem, the proposed method can be combined with adaptive MSPC models such as recursive PCA so that it adapts to changes in the process characteristics.
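As a rough illustration of how a predicted Y value enters the PCA-based monitoring described in this section, the following sketch appends a soft sensor prediction to the X variables, selects the number of principal components so that the cumulative contribution ratio first exceeds 99%, and computes the T2 and Q statistics. The function names, the synthetic data, and the injected fault are hypothetical. The control limits here are simply empirical 99% quantiles of the training statistics, which is an assumption made for brevity and not necessarily the procedure used in the TE process analyses.

```python
import numpy as np


def monitor(model, Z):
    """Return the T2 and Q statistics of each row of Z."""
    Zs = (Z - model["mean"]) / model["std"]
    T = Zs @ model["P"]                       # scores
    t2 = np.sum(T**2 / model["var"], axis=1)  # Hotelling's T2
    resid = Zs - T @ model["P"].T             # part not explained by the PCs
    q = np.sum(resid**2, axis=1)              # squared prediction error
    return t2, q


def fit_pca_monitor(Z_train, cum_ratio=0.99, quantile=0.99):
    mean = Z_train.mean(axis=0)
    std = Z_train.std(axis=0, ddof=1)
    Zs = (Z_train - mean) / std
    _, s, Vt = np.linalg.svd(Zs, full_matrices=False)
    var = s**2 / (len(Zs) - 1)
    ratio = np.cumsum(var) / var.sum()
    k = int(np.argmax(ratio > cum_ratio)) + 1  # first k exceeding the ratio
    model = {"mean": mean, "std": std, "P": Vt[:k].T, "var": var[:k]}
    t2, q = monitor(model, Z_train)
    model["t2_limit"] = np.quantile(t2, quantile)  # assumed empirical limits
    model["q_limit"] = np.quantile(q, quantile)
    return model


# Synthetic data: five X variables and one predicted Y driven by two factors.
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.5, -1.0, 0.8, 0.3],
              [0.2, -1.0, 0.6, 0.9, -1.1]])
c = np.array([0.7, -0.4])


def simulate(n):
    latent = rng.normal(size=(n, 2))
    X = latent @ A + 0.05 * rng.normal(size=(n, 5))
    y_hat = latent @ c + 0.05 * rng.normal(size=n)  # soft sensor output
    return X, y_hat


X_tr, y_tr = simulate(500)
X_te, y_te = simulate(200)
y_te[-20:] += 1.0  # hypothetical fault visible only in the predicted Y

model = fit_pca_monitor(np.column_stack([X_tr, y_tr]))
t2, q = monitor(model, np.column_stack([X_te, y_te]))
alarms = (t2 > model["t2_limit"]) | (q > model["q_limit"])
print(alarms[-20:].all(), alarms[:180].mean())
```

Because the fault breaks the correlation between the predicted Y and the X variables, it appears mainly in the Q statistic, which is consistent with the role that Q plays in the results above.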



APPENDIX A. PLS

PLS is a method for relating explanatory variables, X, and an objective variable, y. Using a linear multivariate model, PLS goes beyond traditional regression methods in that it also models the structures of X and y. In PLS modeling, the covariance between y and the score vector ti is maximized. A PLS model often has higher predictive power than an ordinary least-squares model. A PLS model consists of the following two equations

$$\mathbf{X} = \mathbf{T}\mathbf{P}' + \mathbf{E} \tag{A.1}$$

$$\mathbf{y} = \mathbf{T}\mathbf{q} + \mathbf{f} \tag{A.2}$$

where T is a score matrix, P is an X-loading matrix, q is a y-loading vector, E is a matrix of X residuals, and f is the vector of y residuals. The PLS regression model is given by

$$\mathbf{y} = \mathbf{X}\mathbf{b} + \mathrm{constant} \tag{A.3}$$

$$\mathbf{b} = \mathbf{W}(\mathbf{P}'\mathbf{W})^{-1}\mathbf{q} \tag{A.4}$$

where W is an X-weight matrix and b is a vector of regression coefficients.
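Equations A.1−A.4 can be made concrete with a short sketch of the NIPALS algorithm for a single objective variable (PLS1). The function name `pls1_fit` and the synthetic data are hypothetical, and the article does not state which PLS algorithm was used, so this should be read as one common way of obtaining W, P, and q rather than as the authors' procedure.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """NIPALS PLS1 on mean-centered data; returns (b, constant) of eq A.3."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E = X - x_mean          # X residuals, deflated at each component
    f = y - y_mean          # y residuals, deflated at each component
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f                   # weight maximizing cov(t, y)
        w /= np.linalg.norm(w)
        t = E @ w                     # score vector
        tt = t @ t
        p = E.T @ t / tt              # X loading
        q_a = f @ t / tt              # y loading
        E = E - np.outer(t, p)
        f = f - q_a * t
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)   # eq A.4
    constant = y_mean - x_mean @ b
    return b, constant


# Noise-free linear data: with as many components as X variables,
# the PLS coefficients coincide with the true linear model.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_b = np.array([1.0, -2.0, 0.5])
y = X @ true_b + 3.0

b, constant = pls1_fit(X, y, n_components=3)
y_calc = X @ b + constant
print(b, constant)
```

With fewer components than X variables the fit is regularized toward the directions of X that covary most with y, which is the usual reason for choosing the number of components by cross-validation.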



APPENDIX B. SVR

The SVR method applies a support vector machine (SVM) to regression analysis and, like the SVM, can be used to construct nonlinear models by applying the kernel trick. The primal form of SVR can be expressed as the following optimization problem. Minimize

$$\frac{1}{2}\|\mathbf{w}\|^2 + C \sum_i |y_i - f(\mathbf{x}_i)|_{\varepsilon} \tag{B.1}$$

4. CONCLUSIONS

In this study, we proposed a new MSPC method using soft sensors for both early and accurate fault detection in industrial plants. The process variables that are difficult to measure in real time are predicted with soft sensors and added to the input variables of the fault detection models. Therefore, by using the proposed method, real-time control can be performed while considering Y variables such as product quality. We demonstrated the effectiveness of the proposed method through data analysis of the TE process and a real industrial process. In the case of the TE process, accurate and predictive soft sensor models for the important component values could be constructed using the PLS and SVR methods, and the proposed fault detection models considering the predicted component values achieved high performance. For the real industrial process, although some misdetection occurred because of noise and changes in process characteristics, good prediction of the Y variable was achieved by MWPLS, and the proposed method detected the target abnormal event more appropriately than the traditional method did. The proposed fault detection system does not include measured

where yi and xi are training data, w is a weight vector, ε is a threshold, and C is a penalizing factor that controls the trade-off between model complexity and training errors. The second term of eq B.1 is the ε-insensitive loss function, which is written as

$$|y_i - f(\mathbf{x}_i)|_{\varepsilon} = \max\left(0,\ |y_i - f(\mathbf{x}_i)| - \varepsilon\right) \tag{B.2}$$

Through the minimization of eq B.1, one can construct a regression model that has a good balance between generalization capabilities and the ability to adapt to the training data. The kernel function in our application is a radial basis function

$$K(\mathbf{x}, \mathbf{x}') = \exp\left(-\gamma \|\mathbf{x} - \mathbf{x}'\|^2\right) \tag{B.3}$$


where γ is a tuning parameter controlling the width of the kernel function. The ν-SVR method, which sets an upper bound ν on the fraction of data whose error exceeds ε instead of specifying ε itself, was used in this study.
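The two building blocks of this appendix, the ε-insensitive loss of eq B.2 and the radial basis function kernel of eq B.3, are simple enough to write out directly. The function names and the numerical values below are hypothetical and serve only to show how the formulas behave; the sketch does not solve the optimization problem of eq B.1.

```python
import math


def eps_insensitive_loss(y, f_x, eps):
    """Eq B.2: errors smaller than eps cost nothing."""
    return max(0.0, abs(y - f_x) - eps)


def rbf_kernel(x, x_prime, gamma):
    """Eq B.3: similarity decays with the squared Euclidean distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-gamma * sq_dist)


inside_tube = eps_insensitive_loss(1.0, 1.05, eps=0.1)   # error 0.05 < eps
outside_tube = eps_insensitive_loss(1.0, 1.5, eps=0.1)   # error 0.5 > eps
same_point = rbf_kernel([0.0, 0.0], [0.0, 0.0], gamma=0.5)
far_point = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)
print(inside_tube, outside_tube, same_point, far_point)
```

A larger γ makes the kernel narrower, so that only training points close to a query point influence its prediction.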

(2) Kourti, T. Application of latent variable methods to process control and multivariate statistical process control in industry. Int. J. Adapt. Control Signal Process. 2005, 19, 213.
(3) Wold, S.; Esbensen, K.; Geladi, P. Principal Component Analysis. Chemom. Intell. Lab. Syst. 1989, 2, 37.
(4) Jackson, J. E.; Mudholkar, G. S. Control Procedures for Residuals Associated with Principal Component Analysis. Technometrics 1979, 21, 341.
(5) Ku, W.; Storer, R.; Georgakis, C. Disturbance detection and isolation by dynamic principal component analysis. Chemom. Intell. Lab. Syst. 1995, 30, 179.
(6) Li, W. H.; Yue, H. H.; Valle-Cervantes, S.; Qin, S. J. Recursive PCA for adaptive process monitoring. J. Process Control 2000, 10, 471.
(7) Kano, M.; Hasebe, S.; Hashimoto, I.; Ohno, H. Statistical process monitoring based on dissimilarity of process data. AIChE J. 2002, 48, 1231.
(8) Choi, S. W.; Martin, E. B.; Morris, A. J.; Lee, I. B. Fault Detection Based on a Maximum-Likelihood Principal Component Analysis (PCA) Mixture. Ind. Eng. Chem. Res. 2005, 44, 2316.
(9) Hong, J. J.; Zhang, J.; Morris, J. Fault Localization in Batch Processes through Progressive Principal Component Analysis Modeling. Ind. Eng. Chem. Res. 2011, 50, 8163.
(10) Bin Shams, M. A.; Budman, H. M.; Duever, T. A. Fault detection, identification and diagnosis using CUSUM based PCA. Chem. Eng. Sci. 2011, 20, 4488.
(11) Ge, Z. Q.; Song, Z. H. Distributed PCA Model for Plant-Wide Process Monitoring. Ind. Eng. Chem. Res. 2013, 52, 1947.
(12) Hyvärinen, A.; Oja, E. Independent component analysis: Algorithms and applications. Neural Networks 2000, 13, 411.
(13) Kano, M.; Tanaka, S.; Hasebe, S.; Hashimoto, I.; Ohno, H. Monitoring Independent Components for Fault Detection. AIChE J. 2003, 49, 969.
(14) Kaneko, H.; Arakawa, M.; Funatsu, K. Development of a new soft sensor method using independent component analysis and partial least squares. AIChE J. 2009, 55, 87.
(15) Ge, Z. Q.; Song, Z. H. Process Monitoring Based on Independent Component Analysis−Principal Component Analysis (ICA−PCA) and Similarity Factors. Ind. Eng. Chem. Res. 2007, 46, 2054.
(16) Rashid, M. M.; Yu, J. A new dissimilarity method integrating multidimensional mutual information and independent component analysis for non-Gaussian dynamic process monitoring. Chemom. Intell. Lab. Syst. 2012, 115, 44.
(17) Rashid, M. M.; Yu, J. Hidden Markov Model Based Adaptive Independent Component Analysis Approach for Complex Chemical Process Monitoring and Fault Detection. Ind. Eng. Chem. Res. 2012, 51, 5506.
(18) Lee, J. M.; Yoo, C. K.; Choi, S. W.; Vanrolleghem, P. A.; Lee, I. B. Nonlinear process monitoring using kernel principal component analysis. Chem. Eng. Sci. 2004, 59, 223.
(19) Ge, Z. Q.; Yang, C. J.; Song, Z. H. Improved kernel PCA-based monitoring approach for nonlinear processes. Chem. Eng. Sci. 2009, 64, 2245.
(20) Kittiwachana, S.; Ferreira, D. L. S.; Lloyd, G. R.; Fido, L. A.; Thompson, D. R.; Escott, R. E. A.; Brereton, R. G. One class classifiers for process monitoring illustrated by the application to online HPLC of a continuous process. J. Chemom. 2010, 24, 96.
(21) Yu, J. A Support Vector Clustering-Based Probabilistic Method for Unsupervised Fault Detection and Classification of Complex Chemical Processes Using Unlabeled Data. AIChE J. 2013, 59, 407.
(22) Yu, J.; Qin, S. J. Multimode process monitoring with Bayesian inference-based finite Gaussian mixture models. AIChE J. 2008, 54, 1811.
(23) Xie, X.; Shi, H. B. Dynamic Multimode Process Modeling and Monitoring Using Adaptive Gaussian Mixture Models. Ind. Eng. Chem. Res. 2012, 51, 5497.
(24) Qin, S. J.; Zheng, Y. Y. Quality-Relevant and Process-Relevant Fault Monitoring with Concurrent Projection to Latent Structures. AIChE J. 2013, 59, 496.



APPENDIX C. STATISTICS

In this article, r2 and q2 were used as measures of the accuracy and predictive ability of regression models and were defined as

$$r^2 = 1 - \frac{\sum_{i=1}^{n} (y_{\mathrm{obs},i} - y_{\mathrm{calc},i})^2}{\sum_{i=1}^{n} (y_{\mathrm{obs},i} - \bar{y})^2} \tag{C.1}$$

$$q^2 = 1 - \frac{\sum_{i=1}^{n} (y_{\mathrm{obs},i} - y_{\mathrm{pred},i})^2}{\sum_{i=1}^{n} (y_{\mathrm{obs},i} - \bar{y})^2} \tag{C.2}$$

where yobs is the measured y value, ycalc is the calculated y value, ypred is the predicted y value in the cross-validation procedure, and n is the number of data points. In this study, the five-fold cross-validation method was used in the calculation of ypred. In the above equations, r2 represents the fitting accuracy of the constructed models, and q2 represents the predictive accuracy of the constructed models. Values close to unity for both r2 and q2 are favorable. r2pred is the value of r2 calculated from the test data. The root-mean-square errors (RMSEs) of ycalc and ypred are defined as follows

$$\mathrm{RMSE}(r^2) = \sqrt{\frac{\sum_{i=1}^{n} (y_{\mathrm{obs},i} - y_{\mathrm{calc},i})^2}{n}} \tag{C.3}$$

$$\mathrm{RMSE}(q^2) = \sqrt{\frac{\sum_{i=1}^{n} (y_{\mathrm{obs},i} - y_{\mathrm{pred},i})^2}{n}} \tag{C.4}$$

The lower the RMSE(r2) and RMSE(q2) values, the higher the fitting accuracy and predictive accuracy of the constructed model. RMSE(r2pred) is the RMSE value calculated from the test data.
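The statistics of eqs C.1−C.4 share one form and differ only in which estimated values are supplied (calculated, cross-validated, or test-set predictions), so they can be sketched with two small functions. The names and the toy numbers are hypothetical. The second data set has the same errors as the first but a smaller spread of y, which illustrates why r2-type values and RMSE values are reported together in Table 4.

```python
import math


def r2_statistic(y_obs, y_est):
    """Eqs C.1/C.2: one minus residual sum of squares over total sum of squares."""
    mean = sum(y_obs) / len(y_obs)
    ss_res = sum((o - e) ** 2 for o, e in zip(y_obs, y_est))
    ss_tot = sum((o - mean) ** 2 for o in y_obs)
    return 1.0 - ss_res / ss_tot


def rmse(y_obs, y_est):
    """Eqs C.3/C.4: root of the mean squared error."""
    ss_res = sum((o - e) ** 2 for o, e in zip(y_obs, y_est))
    return math.sqrt(ss_res / len(y_obs))


# Wide spread of y.
wide_obs = [1.0, 2.0, 3.0, 4.0]
wide_est = [1.5, 2.0, 3.0, 3.5]
# Identical errors, narrower spread of y.
narrow_obs = [2.0, 2.5, 3.0, 3.5]
narrow_est = [2.5, 2.5, 3.0, 3.0]

r2_wide, rmse_wide = r2_statistic(wide_obs, wide_est), rmse(wide_obs, wide_est)
r2_narrow, rmse_narrow = r2_statistic(narrow_obs, narrow_est), rmse(narrow_obs, narrow_est)
print(r2_wide, rmse_wide, r2_narrow, rmse_narrow)
```

In this toy case r2 drops from 0.9 to 0.6 while the RMSE is unchanged, mirroring the situation in which a low r2pred coexists with a low RMSE(r2pred) because the variance of Y differs between data sets.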



ASSOCIATED CONTENT

Supporting Information

Additional information as noted in text. This material is available free of charge via the Internet at http://pubs.acs.org.



AUTHOR INFORMATION

Corresponding Author

*Tel.: +81-3-5841-7751. Fax: +81-3-5841-7771. E-mail: [email protected].

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

The authors acknowledge the support of Mizushima Works, Mitsubishi Chemical Co., and the financial support of Mizuho Foundation for the Promotion of Sciences.



REFERENCES

(1) Bersimis, S.; Psarakis, S.; Panaretos, J. Multivariate Statistical Process Control Charts: An Overview. Qual. Reliab. Engng. Int. 2007, 23, 517.


(25) Kano, M.; Nakagawa, Y. Data-based process monitoring, process control, and quality improvement: Recent developments and applications in steel industry. Comput. Chem. Eng. 2008, 32, 12.
(26) Kadlec, P.; Gabrys, B.; Strandt, S. Data-driven soft sensors in the process industry. Comput. Chem. Eng. 2009, 33, 795.
(27) Kourti, T. Process analysis and abnormal situation detection: From theory to practice. IEEE Control Syst. 2002, 22, 10.
(28) AlGhazzawi, A.; Lennox, B. Model predictive control monitoring using multivariate statistics. J. Process Control 2009, 19, 314.
(29) Geladi, P.; Kowalski, B. R. Partial least squares regression: A tutorial. Anal. Chim. Acta 1986, 185, 1.
(30) Smola, A. J.; Schölkopf, B. A tutorial on support vector regression. Stat. Comput. 2004, 14, 199.
(31) Downs, J. J.; Vogel, E. F. A plant-wide industrial process control problem. Comput. Chem. Eng. 1993, 17, 245.
(32) Russell, E. L.; Chiang, L. H.; Braatz, R. D. Fault detection in industrial processes using canonical variate analysis and dynamic principal component analysis. Chemom. Intell. Lab. Syst. 2000, 51, 81.
(33) Kaneko, H.; Funatsu, K. Classification of the degradation of soft sensor models and discussion on adaptive models. AIChE J. 2013, 59, 2339.
(34) Kaneko, H.; Arakawa, M.; Funatsu, K. Applicability domains and accuracy of prediction of soft sensor models. AIChE J. 2011, 57, 1506.
(35) Kadlec, P.; Grbic, R.; Gabrys, B. Review of adaptation mechanisms for data-driven soft sensors. Comput. Chem. Eng. 2011, 35, 1.
(36) Kaneko, H.; Funatsu, K. Adaptive soft sensor model using online support vector regression with the time variable and discussion on appropriate hyperparameters and window size. Comput. Chem. Eng. 2013, 58, 288.
(37) Kaneko, H.; Funatsu, K. A soft sensor method based on values predicted from multiple intervals of time difference for improvement and estimation of prediction accuracy. Chemom. Intell. Lab. Syst. 2011, 109, 197.
