Ind. Eng. Chem. Res. 2007, 46, 955-964


Robust Adaptive Partial Least Squares Modeling of a Full-Scale Industrial Wastewater Treatment Process

Hae Woo Lee, Min Woo Lee, and Jong Moon Park*

Advanced Environmental Biotechnology Research Center, Department of Chemical Engineering/School of Environmental Science and Engineering, POSTECH, San 31 Hyoja-Dong, Nam-Gu, Pohang, Kyungbuk 790-784, Republic of Korea

A new robust adaptive partial least squares (PLS) scheme is proposed for the prediction and monitoring of an industrial wastewater treatment process with highly complex and time-varying dynamics. The essential feature of the method is that all incoming process data are screened in advance on the basis of a combined monitoring index, and each observation identified as an outlier is either eliminated (hard threshold) or suppressed by a weight function (soft threshold) prior to the model update. To assess the feasibility of the proposed scheme, various PLS modeling approaches, including conventional ones, were evaluated and their results compared. While the conventional approaches clearly revealed limitations, such as the inflexibility of the model to process changes and misleading model updates caused by high-leverage outliers, most robust adaptive PLS approaches based on the proposed scheme performed well in both prediction and monitoring. Among the tested methods, the robust adaptive PLS method using the Fair weight function performed best while reasonably maintaining the robustness of the PLS model.

1. Introduction

Multivariate statistical process control (MSPC) has received considerable attention along with the rapid advances in on-line monitoring and computer technology. In MSPC, complex process behavior can be interpreted efficiently by compressing the high-dimensional space of process variables into a low-dimensional latent variable space that retains the essential information of the raw data. The partial least squares (PLS) method is one of the most popular MSPC techniques. It can effectively derive the relationship between process input and output variables, which are usually strongly collinear and noisy. The PLS method also provides powerful process monitoring tools.
One can easily identify abnormal operations on the basis of statistical monitoring indices such as Hotelling's T² and the squared prediction error (SPE). Several reports have shown that the PLS method can be applied to the modeling and monitoring of various chemical and biological processes.1-3

However, the conventional static PLS method has also been criticized for its basic assumption of steady state, which contradicts the fact that most actual processes have a nonstationary, time-varying dynamic nature. To overcome this problem, adaptive PLS methods have been proposed. In an adaptive PLS method, the model is recursively updated using newly incoming data so that slowly changing process behavior is reflected in the model. While several adaptive PLS algorithms have been proposed and successfully applied to the modeling and monitoring of various time-varying processes,4-7 it has been pointed out that the PLS model can deteriorate seriously when a considerable number of abnormal process data are used in the model update procedure.5,8 Because the accuracy of the PLS model depends strongly on the statistical information contained in the operation data, it is crucial that only data representing the relevant variance of normal process dynamics be used for the model update in order to maintain the robustness of the PLS model.

The robustness problem has already been raised for all MSPC techniques. MSPC techniques are derived from a database that usually contains some outliers originating from sensor faults, missing values, process disturbances, malfunction of instruments, process shut-down, and so on. These outliers can distort the distribution of multivariate data and often lead to deceptive results. To minimize the adverse effect of outliers, several authors have proposed robust multivariate methods. In robust multivariate methods, robust statistics such as the median and the median absolute deviation are often used instead of the mean and the variance, respectively. Rousseeuw9 introduced the least median of squares method as an extension of the median methodology to obtain a robust regression model. Several robust techniques, including the minimum volume ellipsoid, ellipsoidal multivariate trimming, and the minimum covariance determinant, have also been developed for the robust estimation of the covariance matrix.9-11 For the static PLS model, Wakeling and Macfie12 proposed a simple and comprehensive robust PLS algorithm, the so-called iteratively reweighted PLS (IRPLS). This method uses the regression residual to examine whether an observation is an outlier and calculates weight values to suppress the outliers in the model building step. The IRPLS has been modified by several authors and widely used as the most representative robust PLS method.13-15 According to our literature survey, however, there have been very few reports on robust adaptive PLS methods, even though the robustness problem is much more important in an adaptive model, because the continuous incursion of incoming data without outlier detection can seriously deteriorate the model structure.

* To whom correspondence should be addressed. Tel.: +82-54-2792275. Fax: +82-54-279-8299. E-mail: [email protected].
In this study, an industrial anaerobic filter process that shows highly complicated dynamic behavior was selected as the model process, and a new robust adaptive PLS scheme was proposed for the modeling and monitoring of the process. To confirm the feasibility of the proposed robust adaptive PLS method, the performances of a conventional static PLS method16 and an adaptive PLS method4 were also evaluated and compared.

10.1021/ie061094+ CCC: $37.00 © 2007 American Chemical Society. Published on Web 12/30/2006

Ind. Eng. Chem. Res., Vol. 46, No. 3, 2007

2. Description of the Model Process

The model process is a full-scale down-flow anaerobic filter process that treats the wastewater discharged from a purified terephthalic acid manufacturing plant (Samsung Petrochemical Co. Ltd., Ulsan, Korea). A detailed schematic diagram of the process is shown in Figure 1. The anaerobic filter process was designed for the preliminary conversion of organic pollutants in the wastewater into methane gas, reducing the organic loading to the subsequent activated sludge process. For stable operation of the anaerobic filter process, the organic loading rate and the pH of the feed stream are manually controlled by changing the flows of both high-strength wastewater and sodium hydroxide added to the raw wastewater feed. The feed temperature is also controlled at 38 °C by a cooling tower. The effluent is recycled to the front of the reactor for mixing and dilution of the feed wastewater. An operation database consisting of the online-measured variables shown in Table 1 was available; it was accumulated automatically by a data acquisition system (Honeywell, Morristown, NJ). Other detailed process descriptions can be found elsewhere.17

Figure 1. Schematic diagram of the industrial anaerobic filter process.

3. Model Identification

(3.1) General Modeling Approach. In all subsequent model identification steps, hourly average data sets with 4369 total observations were used. All variables considered in the model identification were classified into X and Y blocks, as shown in Table 1. The predictor block X consisted of 10 online-measured variables and 5 additional variables that could be calculated from the online-measured variables. It was expected that the inclusion of the additional variables could enhance the model performance, because they were closely related to the actual dynamics of the anaerobic filter process.
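As an illustration of how the five calculated predictors can be obtained from the online-measured variables, the following minimal Python sketch applies the relations listed in Table 1 to a single hourly record. The function name, the argument names, and the reactor volume are hypothetical; the reactor volume in particular is not stated in the text and is passed in as an assumed parameter.

```python
def derived_predictors(q_in, tod_in, q_r, tod_out, reactor_volume):
    """Return the five calculated predictors of Table 1 for one hourly record.

    q_in, q_r      : influent and recycle flow rates (m3/h)
    tod_in, tod_out: TOD of influent and effluent (mg/L, i.e. g/m3)
    reactor_volume : reactor volume (m3); hypothetical input, not given in the text
    """
    total_flow = q_in + q_r
    # Actual TOD seen by the reactor: flow-weighted mix of feed and recycle.
    r_tod_in = (q_in * tod_in + q_r * tod_out) / total_flow
    return {
        "rTODin": r_tod_in,                 # mg/L
        "TODload": q_in * tod_in,           # g TOD/h
        "CT": reactor_volume / total_flow,  # contact time, h
        "HRT": reactor_volume / q_in,       # hydraulic retention time, h
        "rTODload": r_tod_in * total_flow,  # g TOD/h
    }


if __name__ == "__main__":
    # Made-up operating point, for illustration only.
    print(derived_predictors(100.0, 5000.0, 300.0, 1000.0, 2000.0))
```

Because the actual TOD loading rate is the actual influent TOD multiplied by the total flow, it reduces to the sum of the feed and recycle TOD loads, which is a quick consistency check on the calculated columns.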
Table 1. Variables Used in the Modeling of an Industrial Anaerobic Filter Process

| notation | description |
|---|---|
| predictor variables (X) | |
| Qin | flow rate of influent (m³/h) |
| TODin | TOD of influent (mg/L) |
| pHin | pH of influent |
| pHout | pH of effluent |
| Tequ | temperature of equalization tank (°C) |
| Tfeed | temperature of feed flow (°C) |
| Qr | recycle flow rate (m³/h) |
| Qhigh | feed rate of high strength wastewater (m³/h) |
| Qgas | production rate of biogas (m³/h) |
| QNaOH | feed rate of sodium hydroxide solution (ton/h) |
| rTODin^a | actual TOD of influent (mg/L) = (Qin × TODin + Qr × TODout)/(Qin + Qr) |
| TODload^a | TOD loading rate (g of TOD/h) = Qin × TODin |
| CT^a | contact time (h) = (volume of reactor)/(Qin + Qr) |
| HRT^a | hydraulic retention time (h) = (volume of reactor)/Qin |
| rTODload^a | actual TOD loading rate (g of TOD/h) = rTODin × (Qin + Qr) |
| predicted variables (Y) | |
| TODout | TOD of effluent (mg/L) |
| QCH4 | production rate of methane gas (m³/h) |

a Variables were calculated from the measured variables on the basis of the knowledge of the process.

The predicted block Y consisted of the total oxygen demand (TOD) concentration of the effluent and the production rate of methane gas, which are directly related to the performance of the model process. All data were autoscaled before being used in the model identification.

To construct a dynamic PLS model, autoregressive with exogenous inputs (ARX) and finite impulse response (FIR) modeling approaches were considered. These approaches have been widely adopted in data-driven dynamic model identification.18-20 In the preliminary study, however, it was revealed that the ARX modeling approach tended to overemphasize the autoregression terms, making the model insensitive to process changes.4 Therefore, we adopted the FIR modeling approach for all subsequent model identification, as follows:

$$y(t+1) = f\bigl(x(t),\, x(t-1),\, x(t-2),\, \ldots,\, x(t-n_x)\bigr) \tag{1}$$

where x and y represent the input variable vector and the output variable vector, respectively, and n_x is the time lag for the input variables.4,21 The applied value of n_x was 1, which was determined by using Akaike's information criterion.22,23 The model was designed to perform a one-step-ahead prediction of the output variables. All programs for the model identification were implemented in MATLAB using the PLS Toolbox.24

(3.2) Static PLS Model. A static PLS model was developed using the nonlinear iterative partial least squares (NIPALS) algorithm, which is the most widely used algorithm for determining model parameters such as the score, loading, and weight vectors and the regression coefficients.16 The first 1000 observations were used for model calibration, and the remainder was used for model validation. To determine the optimum number of latent variables, the leave-one-out cross-validation technique was used,25 and 3 was selected as the optimum value on the basis of Wold's R criterion.26 To investigate the process monitoring ability of the static PLS model, two typical statistical indices, T² and SPE, were used. T² is a measure of variation in the principal component subspace, whereas SPE is a measure of variation in the residual subspace. During construction of the PLS model, T² and SPE were calculated for all observations,1,2 and their confidence limits were determined from their distributions.27

Figure 2. Flow chart of continuous model updating in the robust adaptive PLS algorithm.

Table 2. Weight Functions Used in the Robust Adaptive PLS with Soft Threshold

| category | weighting function | constant, c |
|---|---|---|
| Cauchy | $\omega_i = 1/\bigl(1 + (\varphi_i/(c\zeta^2))^2\bigr)$ | 3.94 |
| Fair | $\omega_i = 1/\bigl(1 + \varphi_i/(c\zeta^2)\bigr)^2$ | 3.6 |
| Bisquare | $\omega_i = \bigl[1 - (\varphi_i/\zeta^2)^2/c^2\bigr]^2$ for $\varphi_i \le c$; 0 for $\varphi_i > c$ | 8.41 |

(3.3) Adaptive PLS Model. Among the various adaptive PLS methods, we adopted the blockwise recursive PLS algorithm proposed by Qin4 as the basic framework because it updates the PLS model very efficiently with respect to computational cost and memory. First, an initial PLS model was obtained in the same way as the static PLS model described above. Then, the PLS model was recursively updated with the moving window concept whenever a block of newly incoming data became available. In Qin's original algorithm, the maximum possible number of latent variables, which corresponds to the rank of the X block, was used in the model update. It was later shown, however, that this may result in poor prediction ability of the updated model because of overfitting.28 To avoid this, we further modified the updated model by applying the cross-validation technique after every model update. The size of the moving window was 1000, and the size of the subblock for the model update was 100. These sizes were chosen heuristically because no fundamental guideline exists.
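The data handling implied by eq 1 and by the moving-window update can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `fir_regressors` and `window_schedule` are hypothetical helper names, the schedule simply slides a fixed window forward by one subblock at a time, and it does not reproduce the blockwise recursive PLS algorithm itself, which merges the previous model with each new block.

```python
def fir_regressors(x_rows, y_rows, n_x=1):
    """Pair the FIR input [x(t), x(t-1), ..., x(t-n_x)] with the target y(t+1)."""
    inputs, targets = [], []
    # t must leave room for n_x past samples and one future sample.
    for t in range(n_x, len(x_rows) - 1):
        lagged = []
        for lag in range(n_x + 1):
            lagged.extend(x_rows[t - lag])
        inputs.append(lagged)
        targets.append(y_rows[t + 1])
    return inputs, targets


def window_schedule(n_obs, window=1000, block=100):
    """List (start, end) index ranges: the calibration window, then one
    shifted window for every complete new subblock of observations."""
    schedule = [(0, window)]
    end = window
    while end + block <= n_obs:
        end += block
        schedule.append((end - window, end))
    return schedule


if __name__ == "__main__":
    schedule = window_schedule(4369)
    print(len(schedule) - 1, "updates; last window:", schedule[-1])
```

With 4369 observations, a window of 1000, and subblocks of 100, this simple schedule gives 33 window shifts after the initial calibration; the trailing 69 observations never complete a subblock.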

Some discussion of this topic is, however, available in the literature.7,28 For the application of the adaptive PLS method to process monitoring, we used the adaptive confidence limits proposed by Wang et al.7 instead of the constant limits described above for the static PLS model. Whenever a new observation became available, the confidence limits of T² and SPE were recalculated.

(3.4) Robust Adaptive PLS Model. Figure 2 shows the detailed flow chart of the proposed robust adaptive PLS method. The overall scheme is very similar to that of the adaptive PLS method described above. The distinct feature of the robust adaptive PLS method is that abnormal incoming process operation data (outliers) are screened in advance to maintain the robustness of the PLS model during the model update. To screen the outliers, we adopted the combined monitoring index first proposed by Yue and Qin:29

$$\varphi = \frac{\mathrm{SPE}}{\delta^2} + \frac{T^2}{\chi_l^2} \tag{2}$$

where φ is the combined monitoring index, δ² is the 99% confidence limit of the SPE, and χ_l² is the 99% confidence limit of T². The distribution of the combined monitoring index can be approximated by a χ² distribution, and thus the statistical confidence limit (99%) of the combined monitoring index, ζ², can also be calculated.18 Each newly incoming observation can then be tested for being an outlier on the basis of its φ value and ζ². It should be noted that in this robust adaptive PLS method ζ² was also recalculated whenever a new observation became available.

The presented robust adaptive PLS method can be divided into two approaches according to the rejection threshold used for screening abnormal data. In the hard threshold approach, all data identified as outliers were simply eliminated, and thus only normal data were used in the model update. In the soft threshold approach, on the other hand, all incoming data including outliers were used in the model update. The soft threshold approach was designed on the basis of Pell's idea,13 which used a weight function to suppress the adverse effect of outliers in identifying a static PLS model. In this study, however, the weight was calculated from the combined monitoring index instead of the cross-validated residual used by Pell, because it is a more meaningful indicator that can discriminate outliers in both the prediction and the monitoring aspects of the PLS model. The soft threshold approach can be further classified according to the weight function used to calculate the weight value. In this study, the three weight functions listed in Table 2 were tested; they were modified from the weight functions used by Pell.13 The calculated weight for an observation ranges from 0 to 1, depending on its combined monitoring index and the confidence limit. Although the dependency differs among the weight functions, the calculated weight approaches zero as the combined monitoring index increasingly exceeds the confidence limit. After multiplication by a weight close to zero, an observation corresponding to a high-leverage outlier behaves like a normal observation in the update, and the robustness of the PLS model can be maintained during the model update. All weight functions have their own tuning parameter c. We determined these values empirically to give the best prediction and monitoring performance. The tuning parameter values used in this study are also presented in Table 2.

4. Results and Discussion

(4.1) Performance of Static PLS Method. Figure 3 shows the prediction and monitoring results obtained with the static PLS method, which reveal the fundamental drawback of this method. In general, the prediction accuracy of the model for the validation part was worse than that for the calibration part, as is the case for most data-driven modeling approaches. A particular feature is that the prediction accuracy decreased remarkably after 2300 h, especially for the effluent TOD. This failure in prediction indicates that after 2300 h the process states might have changed significantly from the normal states considered in the model calibration step. Because the static PLS model is usually derived from a limited historical database, it cannot adequately describe correlations between process variables that are not reflected in the calibration data sets.

Figure 3. Prediction results and monitoring charts obtained by using the static PLS method. Gray circles: measured values. Solid line: predicted values. Short dashed line: 95% confidence limit. Long dashed line: 99% confidence limit.

The occurrence of these severe process changes can also be recognized in the monitoring charts. As can be seen in the SPE and T² plots, the SPE continuously violated the confidence limit after 2300 h, while T² showed a relatively stable profile. The continuous violation of the SPE confidence limit implies that the process changes were not simple outliers originating from temporary sensor faults or instrument malfunctions but new sources of correlation that should be reflected in the model. Figure 4 shows the actual profiles of the process variables that were identified by the SPE contribution plot as closely related to the process changes. After 2300 h the added amount of sodium hydroxide was abruptly decreased, whereas the added amount of extremely high strength wastewater was increased. It should be noted that both are the major manipulated variables used to control the performance of the anaerobic filter process. Indeed, these control actions were deliberately adopted by field operators because the anaerobic filter process had suffered from a media plugging problem that seemed to be caused by the continuous feeding of suspended solids, mainly consisting of undissolved terephthalic acid. To resolve this media plugging problem, it was intended to remove the suspended solids through preliminary sedimentation. However, this lowered the acidity and TOD concentration of the feed stream, so that control actions to compensate for them were inevitable. The adoption of a new control strategy and the consequent changes of process states are often experienced in industrial processes.

Figure 4. Time profiles of the various process variables showing abrupt changes after 2300 h.
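The monitoring statistics compared in this discussion (T², SPE, and the combined index φ of eq 2) can be sketched as follows. This is a minimal illustration using textbook definitions, which are assumptions here rather than the authors' exact implementation: T² is taken as the sum of squared latent-variable scores divided by their variances, and SPE as the sum of squared residuals. All function and argument names are hypothetical, and the numbers in the usage example are made up.

```python
def hotelling_t2(scores, score_variances):
    """T^2 for one observation: squared scores scaled by their variances."""
    return sum(t * t / v for t, v in zip(scores, score_variances))


def squared_prediction_error(residuals):
    """SPE for one observation: sum of squared residuals."""
    return sum(e * e for e in residuals)


def combined_index(spe_value, t2_value, spe_limit, t2_limit):
    """Combined index of eq 2: phi = SPE/delta^2 + T^2/chi_l^2,
    where the two limits are the 99% confidence limits of SPE and T^2."""
    return spe_value / spe_limit + t2_value / t2_limit


if __name__ == "__main__":
    # Three latent variables, as selected by cross-validation in the text;
    # the scores, variances, residuals, and limits below are invented.
    t2 = hotelling_t2([1.0, -2.0, 0.5], [1.0, 4.0, 0.25])
    spe = squared_prediction_error([0.5, -0.5, 1.0])
    print(t2, spe, combined_index(spe, t2, spe_limit=3.0, t2_limit=12.0))
```

Because each term is normalized by its own confidence limit, an observation that violates only the SPE limit, as happened persistently after 2300 h, still raises φ even when T² stays within its limit.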
Because the static PLS method has no scheme to detect these types of process changes and to reflect them in the model, it does not seem appropriate for the long-term prediction and monitoring of an industrial process.

(4.2) Performance of Adaptive PLS Method. Figure 5 shows the prediction and monitoring results obtained with the adaptive PLS method. Compared with the static PLS method, the adaptive PLS method showed enhanced prediction ability in spite of the intentional process changes explained above. Because the adaptive PLS method updates the model periodically, the intentional process changes were effectively reflected in the model as time proceeded. The adaptation of the model to the process changes can be seen more clearly in the monitoring charts, which show the adaptive confidence limits of T² and SPE. In general, both confidence limits were updated adaptively, providing more reasonable statistical guidelines for discriminating outliers.

As can be seen in the latter part of Figure 5, however, the model performance deteriorated seriously after around 3500 h. The prediction accuracies for the effluent TOD and the methane production rate declined greatly, and the predictions became insensitive to considerable variations in the model input variables. Furthermore, the 95 and 99% confidence limits of the SPE increased dramatically during this period. Fault identification based on T² and SPE contribution plots revealed that this sudden deterioration was closely related to a sensor fault of the pH meter that measured the effluent pH of the anaerobic filter process. During the whole period, the actual pH of the effluent was maintained around 7.0 with a very small variance, but the online-measured pH values from 3486 to 3489 h were 14.0 because of the sensor fault. When these contaminated data (outliers), with an abnormally increased variance of the effluent pH, were used in the model update procedure, the regression coefficients for the other model input variables shrank remarkably from their normal values, and thus the updated model became insensitive to variations in the model input variables.
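The size of this variance inflation can be illustrated with a small numerical sketch. Only the nominal pH of 7.0, the faulty reading of 14.0, the four affected hours, and the window length of 1000 come from the text; the ±0.05 fluctuation of the normal readings is a hypothetical value chosen purely for illustration.

```python
import statistics


def simulate_ph_window(window=1000, fault_hours=4, fault_value=14.0):
    """Return (clean_std, faulty_std) of effluent pH over one moving window."""
    # Hypothetical normal operation: pH alternating between 6.95 and 7.05.
    clean = [7.0 + (0.05 if i % 2 else -0.05) for i in range(window)]
    faulty = list(clean)
    start = window // 2
    for i in range(start, start + fault_hours):
        faulty[i] = fault_value  # sensor fault: reading stuck at 14.0
    return statistics.pstdev(clean), statistics.pstdev(faulty)


if __name__ == "__main__":
    clean_std, faulty_std = simulate_ph_window()
    print(clean_std, faulty_std, faulty_std / clean_std)
```

Under these assumptions, four faulty hours out of 1000 raise the window standard deviation of the effluent pH by roughly a factor of nine, so after autoscaling the normal pH fluctuations are compressed to a small fraction of their usual scaled range.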
The incursion of these severe outliers also distorted the distribution of the SPE, resulting in the increase of the confidence limits described above.

Figure 5. Prediction results and monitoring charts obtained by using the adaptive PLS method. Gray circles: measured values. Solid line: predicted values. Short dashed line: 95% confidence limit. Long dashed line: 99% confidence limit.

The conventional adaptive PLS method seems more appropriate than the static PLS method for the online prediction and monitoring of an industrial process because it can adaptively capture process changes by updating the model periodically. As the above results show, however, it still has the limitation that the model can lose its robustness when a number of severe outliers are used in the model update procedure.

(4.3) Performance of Robust Adaptive PLS Method. (4.3.1) Hard Threshold Approach. The hard threshold approach may be one of the most intuitive ways to suppress the adverse effect of outliers during the model update. Because only the normal operation data passing the hard threshold were used for the model update, it was expected that the robustness of the model would always be guaranteed. Figure 6 shows the prediction and monitoring results obtained with the robust adaptive PLS method using the hard threshold approach. Overall, the prediction accuracy of the model was greatly enhanced compared with the static and adaptive PLS methods. The adaptation of the model also proceeded properly up to 2300 h. Unexpectedly, however, the model was never updated after 2300 h, which resulted in rather problematic process monitoring ability. This interruption of the model update seems to be closely related to the intentional process changes explained in the results of the static PLS method. Because the hard threshold approach strictly eliminates outliers, rapid and persistent process changes can hardly be reflected in the model.

It is very interesting that the prediction accuracy of the model was maintained satisfactorily over the whole time span even though the model was never updated after 2300 h. The prediction accuracy after 3500 h was even higher than that of the adaptive PLS model. These results imply that the operation data after 2300 h contain no information essential for enhancing the prediction accuracy of the model and that the operation data around 3500 h would even be harmful if they were used for the model update. Because the model had already been updated many times before 2300 h and had captured sufficient information to describe the process dynamics, the prediction accuracy could be maintained satisfactorily thereafter. In general, however, it should be remembered that adequate predictions are possible only when the model is updated properly. Moreover, in the given example the operation data after 2300 h still contain some information critical to the monitoring ability of the model, so they should be reflected in the model if possible.

Figure 6. Prediction results and monitoring charts obtained by using the robust adaptive PLS with hard threshold method. Gray circles: measured values. Solid line: predicted values. Short dashed line: 95% confidence limit. Long dashed line: 99% confidence limit.
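The two screening rules can be contrasted in a short Python sketch that transcribes the weight functions and constants of Table 2. The function names are hypothetical. One point is an explicit assumption: the Bisquare cutoff is applied here to the scaled ratio φ/ζ², so that the weight falls continuously to zero, whereas Table 2 prints the condition in terms of φ itself.

```python
def hard_threshold_keep(phi, zeta2):
    """Hard threshold: use the observation only if phi is within its limit."""
    return phi <= zeta2


def cauchy_weight(phi, zeta2, c=3.94):
    """Cauchy weight of Table 2."""
    return 1.0 / (1.0 + (phi / (c * zeta2)) ** 2)


def fair_weight(phi, zeta2, c=3.6):
    """Fair weight of Table 2 (note the square on the whole denominator)."""
    return 1.0 / (1.0 + phi / (c * zeta2)) ** 2


def bisquare_weight(phi, zeta2, c=8.41):
    """Bisquare weight of Table 2.

    Assumption: the cutoff is tested on phi/zeta2 rather than on phi.
    """
    ratio = phi / zeta2
    if ratio > c:
        return 0.0
    return (1.0 - ratio ** 2 / c ** 2) ** 2


if __name__ == "__main__":
    zeta2 = 1.0  # made-up confidence limit, for illustration only
    for phi in (0.5, 1.0, 5.0, 20.0):
        print(phi, hard_threshold_keep(phi, zeta2),
              cauchy_weight(phi, zeta2), fair_weight(phi, zeta2),
              bisquare_weight(phi, zeta2))
```

The hard rule returns a yes/no decision, which is why a persistent shift in φ blocks every later update, while the soft rules return a weight that shrinks gradually and can recover once the confidence limit ζ² has been recalculated.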

(4.3.2) Soft Threshold Approach. In the soft threshold approach, three weight functions (Cauchy, Fair, and Bisquare) were tested for suppressing the adverse effect of outliers during the model update. The performance of each weight function is summarized in Table 3 together with the performances of the other PLS modeling approaches considered in this study. Overall, the soft threshold approach always showed better prediction accuracy than the others regardless of the weight function used. The monitoring ability of the soft threshold approach also seemed to be outstanding in all cases compared with the others. Among the tested weight functions, the Fair weight function gave the best performance, so its detailed results are shown in Figure 7.

Table 3. Performances of the Various PLS Modeling Approaches

| category | RMSE^a | adaptability^b |
|---|---|---|
| static PLS | 1.7818 | no |
| adaptive PLS | 1.6299 | yes |
| robust adaptive PLS, hard threshold | 1.2099 | no |
| robust adaptive PLS, soft threshold, Cauchy | 1.1828 | yes |
| robust adaptive PLS, soft threshold, Fair | 1.1812 | yes |
| robust adaptive PLS, soft threshold, Bisquare | 1.1833 | yes |

a Root-mean-square error. b "Yes" means that the PLS model could track the process changes properly.

Figure 7. Prediction results and monitoring charts obtained by using the robust adaptive PLS method with Fair weighting function. Gray circles: measured values. Solid line: predicted values. Short dashed line: 95% confidence limit. Long dashed line: 99% confidence limit.

As can be seen in Figure 7, the robust adaptive PLS method with the Fair weight function performed well in both prediction and monitoring. In particular, the monitoring ability of the model was greatly improved compared with the hard threshold approach. The fault detection ability of the model was never interrupted by any type of process change, and most alarms violating the confidence limits of T², SPE, and the combined monitoring index were clearly related to abnormal process operations such as process shut-down or back-flushing of the reactor. Although detailed results are presented here only for the Fair weight function, the other soft threshold methods using different weight functions exhibited comparable results.

The weight functions considered in this study share the characteristic that they are all continuous functions generating a weight value from 0 to 1, depending on the combined monitoring index and its confidence limit. However, they generate different weight values for the same observation, resulting in different abilities to screen outliers. Figure 8 shows the weight profiles obtained with the different weight functions. All weight functions considered here generated proper weight values not only for the intentional process changes after 2300 h but also for the sensor faults around 3500 h. For the observations corresponding to the intentional process changes, the weight functions first generated moderately lowered weight values, which eventually increased again as the model adapted to the new process conditions. For the observations corresponding to the sensor faults, on the other hand, the weight functions generated weight values close to zero.

Figure 8. Sample weights calculated from soft threshold method: (a) Cauchy, (b) Fair, and (c) Bisquare weight functions.

It should be noted that each weight function differed in its strictness in screening outliers. The order of strictness seemed to be Bisquare, Cauchy, and Fair, in descending order. However, it is unclear how this strictness is related to the performance of the weight function. Moreover, each weight function has its own tuning parameter, which was determined heuristically to give the best prediction and monitoring performance. Indeed, the weight functions behaved differently with different tuning parameters, which made the selection of the optimum weight function somewhat problematic. Although the Fair weight function gave the best performance in this study, the relative performance of the weight functions could be completely different depending on the process dynamics of the model process.

5. Conclusions

In this paper, a new robust adaptive PLS scheme was proposed to overcome the limitations of the conventional PLS methods. A full-scale anaerobic filter process that showed highly complicated process dynamics was selected as the model process, and the feasibility of different PLS methods was investigated, focusing especially on their performance in the prediction and monitoring of the process.
The conventional static PLS method showed very limited performance over the validation data sets because of the highly time-varying process dynamics of the model process. Although the conventional adaptive PLS method could reflect these time-varying process dynamics in the model, it also performed unsatisfactorily after some severe outliers were used in the model update. On the other hand, most robust adaptive PLS methods based on the proposed scheme showed satisfactory prediction and monitoring performance, reasonably eliminating the adverse effect of outliers during the model update. We believe that the presented robust adaptive PLS modeling approach could be successfully applied to the study of various industrial processes that have complex and time-varying process dynamics.

Acknowledgment

This work was financially supported by the Samsung Petrochemical Co. Ltd. and by the ERC program of MOST/KOSEF (Grant R11-2003-006-01001-1) through the Advanced Environmental Biotechnology Research Center at POSTECH. This work was also supported by the program for advanced education of chemical engineers (second stage of BK21).

Literature Cited

(1) MacGregor, J. F.; Kourti, T. Statistical process control of multivariate processes. Control Eng. Practice 1995, 3, 403.
(2) Wise, B. M.; Gallagher, N. B. The process chemometrics approach to process monitoring and fault detection. J. Process Control 1996, 6, 329.
(3) Teppola, P.; Mujunen, S. P.; Minkkinen, P. Partial least squares modeling of an activated sludge plant: A case study. Chemom. Intell. Lab. Syst. 1997, 38, 197.
(4) Qin, S. J. Recursive PLS algorithms for adaptive data modeling. Comput. Chem. Eng. 1998, 22, 503.
(5) Rosen, C.; Lennox, J. A. Multivariate and multiscale monitoring of wastewater treatment operation. Water Res. 2001, 35, 3402.
(6) Lee, D. S.; Vanrolleghem, P. A. Monitoring of a sequencing batch reactor using adaptive multiblock principal component analysis. Biotechnol. Bioeng. 2002, 82, 489.
(7) Wang, X.; Kruger, U.; Lennox, B. Recursive partial least squares algorithms for monitoring complex industrial processes. Control Eng. Practice 2003, 11, 613.
(8) Li, W.; Yue, H. H.; Valle-Cervantes, S.; Qin, S. J. Recursive PCA for adaptive process monitoring. J. Process Control 2000, 10, 471.
(9) Rousseeuw, P. J. Least median of squares regression. JASA, J. Am. Stat. Assoc. 1984, 79, 781.
(10) Devlin, S. J.; Gnanadesikan, R.; Kettenring, J. R. Robust estimation of dispersion matrices and principal components. JASA, J. Am. Stat. Assoc. 1981, 76, 354.
(11) Rousseeuw, P. J.; van Zomeren, B. C. Unmasking multivariate outliers and leverage points. JASA, J. Am. Stat. Assoc. 1990, 85, 633.
(12) Wakeling, I. N.; Macfie, J. H. H. A robust PLS procedure. J. Chemom. 1992, 6, 189.
(13) Cummins, D. J.; Andrew, C. W. Iteratively reweighted partial least squares: A performance analysis by Monte Carlo simulation. J. Chemom. 1995, 9, 489.
(14) Gil, J. A.; Romera, R. On robust partial least squares (PLS) methods. J. Chemom. 1998, 12, 365.
(15) Pell, R. J. Multiple outlier detection for multivariate calibration using robust statistical techniques. Chemom. Intell. Lab. Syst. 2000, 52, 87.
(16) Geladi, P.; Kowalski, B. R. Partial least-squares regression: A tutorial. Anal. Chim. Acta 1986, 185, 1.
(17) Lee, M. W.; Joung, J. Y.; Lee, D. S.; Park, J. M.; Woo, S. H. Application of a moving-window neural network to the modeling of a full-scale anaerobic filter process. Ind. Eng. Chem. Res. 2005, 44, 3973.


(18) Box, G. E. P.; Jenkins, G. M.; Reinsel, G. C. Time series analysis: Forecasting and control; Prentice Hall: Englewood Cliffs, NJ, 1994.
(19) Dayal, B. S.; MacGregor, J. F. Recursive exponentially weighted PLS and its applications to adaptive control and prediction. J. Process Control 1997, 7, 169.
(20) Shi, R.; MacGregor, J. F. Modeling of dynamic systems using latent variable and subspace methods. J. Chemom. 2000, 14, 423.
(21) Ku, W.; Storer, R. H.; Georgakis, C. Disturbance detection and isolation by dynamic principal component analysis. Chemom. Intell. Lab. Syst. 1995, 30, 179.
(22) Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716.
(23) Wu, T. J.; Sepulveda, A. The weighted average information criterion for order selection in time series and regression models. Stat. Probab. Lett. 1998, 39, 1.
(24) Wise, B. M.; Gallagher, N. B. PLS toolbox, version 2.1; Eigenvector Research, Inc.: Wenatchee, WA, 2000.
(25) Wold, S. Cross-validatory estimation of the number of components in factor and principal component analysis. Technometrics 1978, 20, 397.
(26) Li, B.; Morris, J.; Martin, E. B. Model selection for partial least squares regression. Chemom. Intell. Lab. Syst. 2002, 64, 79.
(27) Jackson, J. E. A user's guide to principal components; Wiley-Interscience: New York, 1991.
(28) Vijaysai, P.; Gudi, R. D.; Lakshminarayanan, S. Identification on demand using blockwise recursive partial least-squares technique. Ind. Eng. Chem. Res. 2003, 42, 540.
(29) Yue, H. H.; Qin, S. J. Reconstruction-based fault identification using a combined index. Ind. Eng. Chem. Res. 2001, 40, 4403.

Received for review August 18, 2006. Revised manuscript received October 31, 2006. Accepted November 13, 2006.

IE061094+