
Ind. Eng. Chem. Res. 2010, 49, 11530–11546

Development of Self-Validating Soft Sensors Using Fast Moving Window Partial Least Squares

Jialin Liu*
Department of Information Management, Fortune Institute of Technology, 1-10, Nwongchang Road, Neighborhood 28, Lyouciyou Village, Daliao Township, Kaohsiung County, Taiwan, Republic of China

Ding-Sou Chen and Jui-Fu Shen
New Materials Research & Development Department, China Steel Corporation, 1, Chung Kang Road, Hsiao Kang, Kaohsiung, Taiwan, Republic of China

In the development of soft sensors for an industrial process, the collinearity of the predictor variables and the time-varying nature of the process need to be addressed. In many industrial applications, partial least-squares (PLS) regression has been proven to capture the linear relationship between input and output variables within a local operating region; therefore, the PLS model needs to be adapted to accommodate the time-varying nature of the process. In this paper, a fast moving window algorithm is derived to update the PLS model. The proposed approach adapts the parameters of the inferential model with the dissimilarities between the new and oldest data and incorporates them into the kernel algorithm for the PLS. The computational loading of the model adaptation is therefore independent of the window size. In addition, the prediction performance of the model depends only on the retained latent variables (LVs) and the window size, which can be predetermined from the historical data. Since a moving window approach is sensitive to outliers, confidence intervals for the primary variables were created based on the prediction uncertainty. The inferential model is thus not misled by outliers from the online analyzers, whereas the model can still be updated during the transition stage. The prediction performance of a soft sensor depends not only on the capability of the inferential model, but also on the data quality of the input measurements. In this paper, the input sensors were validated before performing a prediction. The deterioration of the prediction performance due to failed sensors was removed by the reconstruction approach. A simulated example of a continuous stirred tank reactor (CSTR) with feedback control systems illustrates that the process characteristics captured by the PLS can be adapted to accommodate a nonlinear process.
An industrial example, predicting oxygen concentrations in the air separation process, demonstrated the effectiveness of the proposed approach for the process industry.

* To whom correspondence should be addressed. Tel.: 886-77889888 ext. 6122. Fax: 886-7-7889777. E-mail: jialin@center.fotech.edu.tw.

1. Introduction

In industrial processes, operators adjust manipulated variables to maintain product qualities or exhaust gases within the specifications of the product or government regulations, according to online analyzers and laboratory tests. Owing to malfunctions of the online analyzers or significant delays during laboratory testing, soft sensors that infer the primary output from other process variables provide useful information for regulating the process operation. Soft sensor applications have attracted significant attention in the process industry.1 There are two main categories of soft sensor development: first-principle models and data-driven models. A first-principle physical model can be obtained from fundamental process knowledge. However, due to the complexity of the manufacturing process, such fundamental models either require a lot of effort and time to develop or are too simplistic to be accurate in practice. On the other hand, data-driven models provide accurate information for a particular operating region by multivariate regression methods2 such as principal component regression (PCR), partial least-squares (PLS), and canonical coordinates regression (CCR). Such models are usually linear; therefore, they lack the ability to extrapolate into different operating regions. To cover a wide range of operations, nonlinear models may be used, such as

artificial neural networks3 (ANN), support vector machines4 (SVM), and kernel partial least-squares5 (KPLS). Since nonlinear modeling techniques capture the nonlinear relationships of all the training data, such models need to be retrained with both the old and new data once a new operating region emerges. The linear counterparts, in contrast, can be updated to accommodate the new operating condition with only the new data, owing to the transparency of the model structure.

10.1021/ie101356c © 2010 American Chemical Society. Published on Web 10/15/2010.

The PLS algorithm is a popular multivariate statistical tool for modeling input/output data. It has been proven that the maximal covariance between two data sets can be captured by PLS.6 Since industrial processes are time-varying, PLS models need to be adapted to accommodate this behavior. Qin7 proposed a block-wise recursive PLS (RPLS) for adapting the inferential model, and Vijaysai et al.8 modified it by incorporating the condition number to determine whether the model updating was really necessary. Although RPLS accounts for the time-varying nature of processes by updating models with the newest data, it leads to a reduction in the speed of adaptation as the data size increases. Dayal and MacGregor9 incorporated a variable forgetting factor into the recursive exponentially weighted PLS to discount the old data; however, the factor is difficult to determine without process knowledge. The moving window algorithm is an alternative approach that excludes the oldest data when new data are available. Qin7 reported that the computational loading of the moving window PLS is proportional to the window size. Wang et al.10 proposed a fast moving window algorithm to adapt the PCA model for monitoring processes with a time-varying nature. In their approach, the computational loading is independent of the window size, which is more practicable for online model updating. In this paper, the fast moving window algorithm was enhanced and applied to the PLS for modeling the time-variant process.

Nonlinear iterative partial least-squares11 (NIPALS) is the common algorithm for estimating the parameters of the PLS model. It is a robust procedure for extracting latent variables from the common data structure of the input and output data sets. When the data sets contain a massive number of observations, it becomes inefficient due to iterating the score vectors and deflating the data matrices. Lindgren et al.12 proposed a kernel algorithm for PLS that uses the covariance of the input data and the cross covariance of the input and output data to estimate the parameters of the PLS. In their approach, the number of observations affects the computational loading only when calculating the covariance and cross covariance, which are computed once at the beginning of the algorithm. De Jong and Ter Braak13 improved the kernel algorithm by modifying the deflation procedure. Dayal and MacGregor14 proved that only one of either the predictor or the response data matrix needs to be deflated. On that basis, they derived a faster kernel algorithm in which only the cross covariance needs to be deflated. The kernel algorithm is therefore suitable for deriving a fast moving window algorithm for the PLS, because the computational loading is independent of the window size. Mu et al.15 proposed an online dual updating strategy, which combines RPLS model updating and model offset updating, to deal with the issues of time-varying processes and uncertainty of the process data.
In their approach, the PLS model is updated, and the current bias is calculated from the sampled value and the output of the PLS, once a new sample is available. The model prediction is adjusted by the model offset, which is updated by the time-weighted current bias. In the case of a sample with abrupt noise, the model output would be significantly unstable, since both the PLS model and the current bias would be misled by the abnormal analyzer sample. In this paper, confidence intervals for the response measurements are created using the prediction variance to ensure that the PLS model is not misled by outliers from the online analyzers. Faber and Kowalski16 recommended that the prediction variance include the measurement errors of the response and predictor variables. Zhang and Garcia-Munoz17 used pharmaceutical industrial data sets to compare the performance of prediction-uncertainty estimates for PLS models obtained with several algorithms, such as linearization-based methods, ordinary least-squares (OLS)-type methods, resampling-based methods, and empirical methods.

Since soft sensors infer the measurements of the output variables through a process model and the online measurements of the input variables, the failure of input instruments needs to be dealt with during online predictions. Sensor validation is needed to detect failed instruments and generate validated values based on the physical relationships of the input data. Martin18 constructed a two-layer neural net using the input data to predict itself. When a predicted input differs from its corresponding measurement by more than a predefined tolerance, the operators need to be alerted, and a rational value needs to be generated from process knowledge to replace the failed measurement. However, it is not practical to manually generate validated data in an industrial application. Qin et al.19 minimized the statistic Q of PCA by adjusting the input measurements when the statistic Q was out of its control limits. The failed instrument can be identified if the Q recalculated with the reconstructed inputs is under its control limits. In this paper, the reconstruction approach was extended to the case of multiple sensor failures. In addition, the failed instruments were identified and reconstructed by minimizing the combined index,20 that is, by minimizing the statistics Q and T^2 simultaneously, leading to a more feasible solution than the original approach.19
It outperformed the original approach, since the statistics Q and T^2 of the reconstructed data were under their control limits at the same time. The proposed approach was applied to predict the oxygen concentrations of the separation process in an industrial oxygen plant. The results show that the challenges of developing a soft sensor, such as adapting to the time-varying nature of processes, handling the uncertainty of response measurements, and validating the input variables, can be effectively dealt with by the proposed method.

The remainder of this paper is organized as follows. Section 2 gives preliminaries on sensor validation by the reconstruction approach, the kernel algorithm for the PLS, and the confidence intervals derived from the prediction uncertainty. The proposed FMWPLS algorithm and the sensor validation for multiple sensor faults are detailed in section 3. In section 4, a simulated example and an industrial application are given. The simulated example, a nonisothermal continuous stirred tank reactor (CSTR) with feedback control systems, illustrates how the PLS model can capture process behaviors coming from the short-term effect of disturbances, whereas the model is adapted by the proposed approach to accommodate process variations due to the long-term effect of disturbances. An industrial process with several online analyzers is then modeled. In that example, building a single model for all online analyzers or a separate model for each analyzer is discussed. In addition, how the prediction performance compares with the accuracy of the online analyzer is studied, in order to decide a proper sampling rate and reduce the operating cost of analyzing the samples. Furthermore, a real case of the inferential model being misled by abrupt noise from the online analyzers is dealt with. Finally, conclusions are given.



2. Basic Theory

2.1. Sensor Validation by the Reconstruction Approach. Consider the data matrix X ∈ R^{m×n} with m rows of observations and n columns of variables. Each column is normalized to zero mean and unit variance. The covariance of the reference data can be estimated as

S ≈ (1/(m − 1)) X^T X = PΛP^T + P̃Λ̃P̃^T    (1)

where Λ is a diagonal matrix with the first K significant eigenvalues and P contains the respective eigenvectors; Λ̃ and P̃ are the residual eigenvalues and eigenvectors, respectively. The statistic Q is defined as a measure of the variations in the residual part of the data:

Q = (x − x̂)(x − x̂)^T = x P̃P̃^T x^T = x C x^T    (2)

where C ≡ P̃P̃^T. In addition, the statistic T^2 measures the variations of the systematic part in the PC subspace:

T^2 = x P Λ^{-1} P^T x^T = x D x^T = t Λ^{-1} t^T    (3)

where D ≡ P Λ^{-1} P^T and t contains the first K scores. The confidence limits of Q and T^2 can be found in ref 21. Qin et al.19 developed self-validating soft sensors based on the reconstruction approach.22 Each input sensor is validated by minimizing the statistic Q when Q is over its control limit.
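As a concrete illustration, the matrices C and D and the two statistics of eqs 2 and 3 can be computed with standard linear algebra. The following is a minimal sketch (the function and variable names are ours, not from the paper):

```python
import numpy as np

def fault_index_matrices(X, K):
    """Build C (for Q, eq 2) and D (for T^2, eq 3) from normalized
    reference data X (m x n), keeping K principal components."""
    m = X.shape[0]
    S = X.T @ X / (m - 1)                  # covariance estimate, eq 1
    eigval, eigvec = np.linalg.eigh(S)     # ascending eigenvalues
    order = np.argsort(eigval)[::-1]       # reorder to descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    P, lam = eigvec[:, :K], eigval[:K]     # principal subspace
    P_res = eigvec[:, K:]                  # residual subspace
    C = P_res @ P_res.T                    # Q = x C x^T
    D = P @ np.diag(1.0 / lam) @ P.T       # T^2 = x D x^T
    return C, D

def q_t2(x, C, D):
    """Q and T^2 statistics of a single normalized observation x."""
    return float(x @ C @ x), float(x @ D @ x)
```

Control limits for Q and T^2 would come from the usual approximations cited in the paper (ref 21); they are omitted from this sketch.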

Minimizing Q with respect to the suspect input x_k gives

(∂Q/∂x_k)_{x_i≠x_k} = 2 ∑_{i=1}^{n} x_i c_{k,i} = 0    (4)

where c_{k,i} is an element of C. Rearranging the above equation, the reconstructed input can be obtained:

x_k^* = −(1/c_{k,k}) ∑_{i≠k} x_i c_{k,i}    (5)

where x_k^* is the kth reconstructed input. Substituting the reconstructed input into eq 2, the statistic Q with the kth reconstructed input can be written as

Q_k^* = ∑_{i≠k} ∑_{j≠k} x_i [(c_{i,j} c_{k,k} − c_{i,k} c_{j,k}) / c_{k,k}] x_j    (6)

Q_k^* is the statistic Q recalculated without the kth variable's information, which has been replaced by eq 5. If the statistic Q is out of its control limit and Q_k^* is under the control limit, it can be concluded that the kth input sensor has failed. It should be noted that the reconstructed input is based on the other inputs and the correlation among the inputs; therefore, Q_k^* is recalculated without any further information from the faulty sensor. A process-related fault has most likely occurred when every recalculated Q_k^* is over the control limit. Yue and Qin20 pointed out that T^2 may go over its control limit when Q is minimized to reconstruct the faulty sensor. They therefore proposed a combined index:

φ ≡ Q/Q_α + T^2/T_α^2 = x Φ x^T,  Φ ≡ C/Q_α + D/T_α^2    (7)

where Q_α and T_α^2, respectively, are the (1 − α) confidence limits of the statistics Q and T^2. The combined index is minimized instead of the statistic Q, and the reconstructed input can be obtained as follows.

(∂φ/∂x_k)_{x_i≠x_k} = 0  ⇒  x_k^* = −(1/φ_{k,k}) ∑_{i≠k} x_i φ_{k,i}    (8)

where φ_{k,i} is an element of Φ. In this paper, the self-validating soft sensor is implemented by minimizing the combined index, to enhance the approach of Qin et al.19 The advantage is not only that the faulty input is reconstructed with a more feasible solution, but also that the reconstruction procedure is triggered when T^2 has exceeded its control limit while Q is still under its control limit.

2.2. Kernel Algorithm for Partial Least Squares. PLS regression is a popular statistical tool for modeling predictor and response data sets. It has been proven that the maximal covariance between two data sets can be captured by PLS. A set of latent variables is extracted iteratively to describe the predictor (X) and response (Y) data matrices:

X = T_k P_k^T + E    (9)

Y = T_k Q_k^T + F    (10)

where T_k contains the first k latent variables (score vectors); P_k and Q_k, respectively, are the loading vectors of the data matrices X and Y; and E and F are the residual terms of the PLS. In general, each score vector is extracted by deflating X and Y with the NIPALS algorithm until all the variance in the data structure is explained. This is a time-consuming procedure when the data matrices contain a massive number of observations. However, the score vectors are not necessary for a regression model:

Y = X B_PLS + F    (11)

where B_PLS is the matrix of the regression coefficients. Lindgren et al.12 proposed a kernel algorithm for PLS. They used the covariance of X, Σ_X ≡ (X^T X)/(m − 1), and the cross covariance of X and Y, Σ_XY ≡ (X^T Y)/(m − 1), to evaluate the regression coefficients of the PLS. The algorithm is listed in Appendix A. For a data set with a massive number of observations, the most time-consuming parts of the NIPALS algorithm are iterating the score vector, which projects the data matrix onto the weighting vector w_a, and conducting the deflation procedure, which multiplies the score vector by the loading vectors. In the kernel algorithm, the score vectors are not needed, and the deflation is conducted on the covariance and cross-covariance matrices. The most time-consuming part is estimating the covariance and cross covariance of the data matrices, which is performed only once, at the beginning of the algorithm. When there is a single output variable, the singular value decomposition (SVD) step of the kernel algorithm can be replaced by the following equation:

w_a^T = (Σ_XY)_a^T / ‖(Σ_XY)_a‖    (12)

where (Σ_XY)_a is the deflated cross covariance at step a.

Therefore, the algorithm is further simplified for this special case.
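The single-output recursion can be sketched in the style of the fast kernel algorithm described above, deflating only the cross covariance and using eq 12 in place of the SVD step. This is a sketch with helper names of our own choosing, not the paper's Appendix A listing:

```python
import numpy as np

def fast_kernel_pls(SX, Sxy, n_lv):
    """Fast kernel PLS for a single response: only the cross covariance
    Sxy (n x 1) is deflated, and the weight vector comes from eq 12.
    SX is the n x n covariance of X; returns the regression vector B."""
    R, P, Q = [], [], []
    Sxy = np.array(Sxy, dtype=float)
    for _ in range(n_lv):
        w = Sxy[:, 0] / np.linalg.norm(Sxy[:, 0])  # eq 12 (no SVD needed)
        r = w.copy()
        for rj, pj in zip(R, P):                   # express weights on raw X
            r -= (pj @ w) * rj
        tt = r @ SX @ r                            # score variance t_a' t_a
        p = (SX @ r) / tt                          # X-loading
        q = (r @ Sxy) / tt                         # y-loading (scalar here)
        Sxy = Sxy - tt * np.outer(p, q)            # deflate cross covariance only
        R.append(r); P.append(p); Q.append(q)
    return np.column_stack(R) @ np.array(Q)        # B_PLS of eq 11
```

With all latent variables retained, the regression vector coincides with the ordinary least-squares solution, which gives a convenient numerical check of the recursion.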


2.3. Estimating the Prediction Uncertainty. Consider a regression model for a single output, y = XB + e, in which all data have been normalized to zero mean and unit variance, B is the true regression vector, and e is white noise with standard deviation σ. The variance of y can be written as

var(y) = var(XB + e) = var(XB) + var(e) = σ^2    (13)

Since the true regression vector cannot be observed directly, its variance, when it is estimated by ordinary least-squares (OLS), is

var(B_OLS) = (X^T X)^{-1} X^T var(y) X (X^T X)^{-1} = (X^T X)^{-1} σ^2    (14)

Therefore, σ^2 can be estimated from the training data set. The prediction uncertainty for new data by OLS is written as

var(ŷ_OLS) = var(x_o B_OLS) + σ^2 = σ^2 (x_o (X^T X)^{-1} x_o^T + 1) = σ_OLS^2    (15)

in which x_o is the new observation of the predictor variables and ŷ_OLS is the prediction of the OLS model. In the above equation, the term x_o (X^T X)^{-1} x_o^T is the prediction uncertainty contributed by the predictor variables. The (1 − α) confidence intervals (CI) can be established by the following equation:17

CI = ±t_{α/2,m−n} σ_OLS    (16)

where t_{α/2,m−n} is the t-distribution with m − n degrees of freedom. Faber and Kowalski16 extended the prediction uncertainty of the OLS model to the PLS model. The variance of the regression vector of the PLS model can be written as

var(B_PLS) = R (T^T T)^{-1} R^T σ^2    (17)

where T = XR. The prediction uncertainty for new data by the PLS model can then be obtained:

var(ŷ_PLS) = σ^2 (x_o R (T^T T)^{-1} R^T x_o^T + 1) = σ_PLS^2    (18)

which gives the (1 − α) confidence intervals of a new prediction by the PLS model. In this paper, the confidence intervals were used to examine the reliability of the new observations from the online analyzers. The inferential model is not adapted if the observations are out of the confidence intervals.

3. Proposed Approach

In this paper, the PLS model was used to build an inferential model that predicts the response variable from the predictor variables. Considering the time-varying nature of an industrial process, a fast moving window algorithm was derived to adapt the PLS model so that it keeps the capability of describing the process behavior. Since abrupt noise from an online analyzer may mislead an inferential model, the measurements of the online analyzers need to be examined against the confidence intervals derived from the prediction uncertainty of the predictor and response variables. Before predicting an output from the measurements of the predictor variables, the process inputs need to be validated by the inferential model. The input data are reconstructed once the faulty sensors have been identified. Although it is rare for multiple sensor faults to occur simultaneously in the measurement instruments, the situation may occur due to a malfunction of the data collection system. The reconstruction of multiple sensor faults is also given in this section.

3.1. Fast Moving Window PLS. The conventional moving window algorithm discards the oldest data once new data are available, and the model is rebuilt from all the data in the window; therefore, the computational loading is proportional to the window size. The proposed approach adapts the parameters with the dissimilarities between the new and oldest data and incorporates them into the kernel algorithm for the PLS. The computational loading of the model adaptation is therefore independent of the window size. Given a data set with m measurements, in which the numbers of predictor and response variables, respectively, are n and l, the input and output matrices are W ∈ R^{m×n} and Z ∈ R^{m×l}. The mean and the standard deviation of each variable are as follows:

W̄ = (1/m) ∑_{i=1}^{m} w_i,  Z̄ = (1/m) ∑_{i=1}^{m} z_i    (19)

s_{x_i} = √[(∑_{j=1}^{m} w_{j,i}^2 − m w̄_i^2)/(m − 1)], i = 1...n,  S_X = diag(s_{x_1} s_{x_2} ... s_{x_n})    (20)

s_{y_i} = √[(∑_{j=1}^{m} z_{j,i}^2 − m z̄_i^2)/(m − 1)], i = 1...l,  S_Y = diag(s_{y_1} s_{y_2} ... s_{y_l})    (21)

where W̄ and Z̄ are row vectors, in which w̄_i and z̄_i are the ith elements (the variable means), and S_X and S_Y are diagonal matrices of the standard deviations, in which s_{x_i} and s_{y_i} are the ith diagonal elements. The covariance of the input matrix (Σ_X) and the cross covariance of the input and output matrices (Σ_XY) can be derived from the above equations:

Σ_X ≡ (1/(m − 1)) S_X^{-1} (W − 1W̄)^T (W − 1W̄) S_X^{-1} = (1/(m − 1)) S_X^{-1} (W^T W − m W̄^T W̄) S_X^{-1}    (22)

Σ_XY ≡ (1/(m − 1)) S_X^{-1} (W − 1W̄)^T (Z − 1Z̄) S_Y^{-1} = (1/(m − 1)) S_X^{-1} (W^T Z − m W̄^T Z̄) S_Y^{-1}    (23)

where 1 is a column vector in which all elements are one. Once the new observations are available and the oldest ones are discarded, the adapted means can be written as

W̄^* = W̄ + (1/m)(w_{m+1} − w_1),  Z̄^* = Z̄ + (1/m)(z_{m+1} − z_1)    (24)


s_{x_i}^{*2} = s_{x_i}^2 + (1/(m − 1))[(w_{m+1,i}^2 − w_{1,i}^2) − m(w̄_i^{*2} − w̄_i^2)], i = 1...n,  S_X^* = diag(s_{x_1}^* s_{x_2}^* ... s_{x_n}^*)    (25)

s_{y_i}^{*2} = s_{y_i}^2 + (1/(m − 1))[(z_{m+1,i}^2 − z_{1,i}^2) − m(z̄_i^{*2} − z̄_i^2)], i = 1...l,  S_Y^* = diag(s_{y_1}^* s_{y_2}^* ... s_{y_l}^*)    (26)

where the notations with a superscript asterisk are the adapted quantities, and the subscripts m + 1 and 1 stand for the new and the oldest data, respectively. It can be observed that the means and standard deviations are adapted on the basis of the original quantities and the dissimilarities between the new and the oldest data. Similarly, the adapted covariance and cross covariance can be derived from the same idea:

Σ_X^* ≡ S_X^{*-1} {S_X Σ_X S_X + (1/(m − 1))[(w_{m+1}^T w_{m+1} − w_1^T w_1) − m(W̄^{*T} W̄^* − W̄^T W̄)]} S_X^{*-1}    (27)

Σ_XY^* ≡ S_X^{*-1} {S_X Σ_XY S_Y + (1/(m − 1))[(w_{m+1}^T z_{m+1} − w_1^T z_1) − m(W̄^{*T} Z̄^* − W̄^T Z̄)]} S_Y^{*-1}    (28)
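The one-step adaptation of eqs 24-28 can be verified numerically against a direct recomputation on the shifted window. The sketch below (helper names are ours) updates the X-block mean, standard deviation, and scaled covariance; the cross-covariance update of eq 28 is analogous:

```python
import numpy as np

def fmw_update(W, w_new):
    """One-step moving-window update (eqs 24, 25, 27) for the X-block:
    adapt mean, standard deviation, and scaled covariance using only the
    incoming sample w_new and the discarded sample W[0]."""
    m = W.shape[0]
    w_old = W[0]
    mean = W.mean(axis=0)
    var = W.var(axis=0, ddof=1)
    Sx = np.diag(np.sqrt(var))
    Sx_inv = np.diag(1.0 / np.sqrt(var))
    # scaled covariance of the current window (eq 22)
    Sigma = Sx_inv @ (W.T @ W - m * np.outer(mean, mean)) @ Sx_inv / (m - 1)

    mean_new = mean + (w_new - w_old) / m                       # eq 24
    var_new = var + ((w_new**2 - w_old**2)
                     - m * (mean_new**2 - mean**2)) / (m - 1)   # eq 25
    raw_new = Sx @ Sigma @ Sx + (np.outer(w_new, w_new)
              - np.outer(w_old, w_old)
              - m * (np.outer(mean_new, mean_new) - np.outer(mean, mean))) / (m - 1)
    Sx_new_inv = np.diag(1.0 / np.sqrt(var_new))
    Sigma_new = Sx_new_inv @ raw_new @ Sx_new_inv               # eq 27
    return mean_new, np.sqrt(var_new), Sigma_new
```

The updated quantities match what one would obtain by recomputing eqs 19-22 from scratch on the shifted window, but the cost of the update does not grow with the window size.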

The PLS model is then updated by the kernel algorithm with the adapted covariance and cross covariance. Since the computational loading of the model updating is independent of the window size, this yields the fast moving window PLS (FMWPLS) algorithm.

3.2. Sensor Validation for Multiple Sensor Faults. In the work of Qin et al.,19 multiple sensor faults were reconstructed sequentially when the sensors failed at different times. In practice, the failure of multiple sensors may also occur simultaneously. This is usually caused not by a failure of the measurement instruments but by a malfunction of the data collection system. For example, when part of the predictor variables comes from another control system, the data of those predictor variables will be lost if the gateway of that control system fails. The reconstruction of multiple sensor faults needs to be conducted when the data remain unchanged for a period of time. Equation 8 is rewritten for the case of multiple sensor faults:

ξΦx^T = 0    (29)

where ξ^T ≡ [ξ_1^T ξ_2^T ... ξ_{n_f}^T], in which n_f is the number of faulty sensors and ξ_i is a row vector with a one at the position of the ith faulty sensor and zeros elsewhere. The input variables can be decomposed as

x = xη + x(I − η)    (30)

where η is a diagonal matrix whose diagonal elements are one for the faulty sensors and zero for the normal inputs. Equation 29 can then be rewritten in the following form:

ξΦηx^T = −ξΦ(I − η)x^T    (31)

The left-hand side of the above equation contains the data that need to be reconstructed from the normal data in the right-hand term. The data to be reconstructed can be expressed as ηx^T = ξ^T x_f^T, in which x_f is the collection of the faulty data. The reconstruction of the faulty data can be obtained from the following equation:

x_f^{*T} = −(ξΦξ^T)^+ ξΦ(I − η)x^T    (32)

where (ξΦξ^T)^+ is the generalized inverse of (ξΦξ^T). In the limiting case of a single sensor fault, the above equation is identical to eq 8. Since the latent variables of PLS are extracted from the common data structure of the predictor and response data matrices, it is more suitable to reconstruct the faulty inputs using PLS rather than PCA. Although the framework of the reconstruction approach was developed on the basis of PCA, it can be extended to PLS. The combined index of PLS can be written as follows:

φ_PLS ≡ Q_PLS/Q_α + T_PLS^2/T_α^2 = x Φ_PLS x^T    (33)
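The joint reconstruction of eq 32 can be sketched with a generalized (pseudo-) inverse. The helper below uses hypothetical names of our own and a generic index matrix Φ (eq 7 for PCA, eq 33 for PLS):

```python
import numpy as np

def reconstruct_faulty(x, Phi, faulty):
    """Jointly reconstruct the inputs listed in `faulty` via eq 32:
    x_f* = -(xi Phi xi^T)^+ xi Phi (I - eta) x^T.
    For a single fault this reduces to eq 8. Names are illustrative."""
    n = len(x)
    xi = np.zeros((len(faulty), n))
    xi[np.arange(len(faulty)), faulty] = 1.0   # row selectors of faulty channels
    eta = np.zeros((n, n))
    eta[faulty, faulty] = 1.0                  # diagonal marker of faulty channels
    x_f = -np.linalg.pinv(xi @ Phi @ xi.T) @ (xi @ Phi @ (np.eye(n) - eta) @ x)
    x_rec = x.copy()
    x_rec[faulty] = x_f
    return x_rec
```

Because Φ is positive (semi)definite, the reconstructed values minimize the combined index over the faulty channels; any other values for those channels give a larger xΦx^T.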

In eq 33, Φ_PLS ≡ (I − RP^T)(I − RP^T)^T/Q_α + RΛ^{-1}R^T/T_α^2. The rest of the reconstruction procedure is exactly equivalent to the approach based on PCA.

3.3. Summary. The contributions of the proposed approach are to develop a fast moving window PLS whose computational loading is independent of the window size and to protect the model predictions and the inferential model from incorrect information. For a self-validating soft sensor, the reconstruction model and the prediction model need to be the same model for prediction consistency; that is, the model must use a single number of latent variables (LVs) for the PLS, or a single number of PCs for the PCR. If the faulty inputs are reconstructed with a larger number of LVs, the predictions with the reconstructed inputs will contain unexpected noise induced by the excess LVs. On the other hand, if fewer LVs are used to reconstruct the faulty data, the incorrect information of the faulty inputs will deteriorate the model predictions through the smaller number of LVs. Therefore, the purpose of the self-validating soft sensor is to predict an output without the effects of faulty inputs, rather than to enhance the prediction performance with reconstructed inputs.

The schematic diagram of the proposed approach is depicted in Figure 1. Before performing a prediction, the operating state needs to be confirmed to be within the normal operating condition (NOC) by examining the process inputs using eq 33. If the combined index is over its control limit, it is recalculated by reconstructing the process inputs. Once the faulty sensors are found, the input data are reconstructed by eqs 32 and 33. A process-related fault is most likely occurring when all the recalculated combined indices are over the control limit; the abnormal event then needs to be reported. When a new observation from the online analyzer is available, the measurement is validated against the confidence intervals of the prediction, eqs 16 and 18.
Once the new observation is within the confidence intervals, the FMWPLS model is adapted by eqs 24-28 and the kernel algorithm for PLS.

4. Illustrative Examples

4.1. Continuous Stirred Tank Reactor. A nonisothermal continuous stirred tank reactor (CSTR) with feedback control systems was simulated; the detailed CSTR model and the problem setting are described in Appendix B. The average relative error of prediction (AREP) was used to assess the prediction performance of the inferential model:

AREP = (1/m) ∑_{i=1}^{m} |(y_i − ŷ_i)/y_i| × 100    (34)

where y_i and ŷ_i, respectively, are the measured and predicted values of the online analyzer. For a PLS-based soft sensor, the


Figure 1. Schematic diagram of the proposed approach.

Figure 2. The AREP by different window sizes and numbers of latent variables.

number of latent variables and the modeling window size affect the prediction performance; the AREP values of the PLS models are therefore shown in Figure 2. The model with four latent variables and a 2-day window size had the best prediction performance. The prediction results of the PLS, RPLS, and FMWPLS are compared in Figure 3. In Figure 3a, the three models had comparable prediction performance, since the initial PLS model had captured the process variations due to the short-term effects of the disturbances. However, in Figure 3b-d, as the long-term effect evolved, the prediction performance declined for the PLS model without model updating. On the other hand, although the RPLS model had been adapted with the new data, the prediction results gradually fluctuated as the catalyst deactivated. It is obvious that a global linear model cannot depict this simulated example, with its nonlinear relationships between inputs and output. Since the proposed approach discards the oldest data once new data are available, the prediction performance was maintained consistently. Table 1 lists the AREP of the different models. It shows that the prediction performance of PLS and RPLS declined as the long-term effect of the disturbances became significant.

From the physical perspective of an irreversible exothermic reaction, the measured reactant concentration (CA) was inversely proportional to the reactor temperature (T). From the control strategy of the reactor temperature, the coolant flow rate (QC) was also inversely proportional to the reactor temperature. Therefore, the measured reactant concentration was proportional to the coolant flow rate. This can be observed in Figure 4a, which displays the regression coefficients of the FMWPLS for different time frames. Since the cooling jacket temperature was proportional to the reactor temperature, the reactant concentration was inversely proportional to the coolant temperature (TC). The proposed approach consistently captured the relationships between the inputs and output over the different time frames. When these are compared with the regression coefficients of the RPLS, displayed in Figure 4b, it can be seen that the relationships between the inputs and output varied as observations accumulated. After the 90th day, the measured reactant concentration even became inversely proportional to the coolant flow rate, which is inconsistent with the understanding of the simulated process.

Since the proposed approach adapts the inferential model once a new measurement of the online analyzer is available, the model will be misled by samples with abrupt noise. The

11536

Ind. Eng. Chem. Res., Vol. 49, No. 22, 2010

Figure 3. Prediction results of CA using PLS, RPLS, and FMWPLS: (a) from the 3rd to 45th day, (b) from the 45th to 90th day, (c) from the 90th to 135th day, (d) from the 135th to 180th day.

Table 1. The AREP (%) by Different Soft Sensors

days        PLS     RPLS    FMWPLS
2-45        3.81    2.06    1.20
45-90       10.39   4.92    1.14
90-135      17.28   7.28    1.11
135-180     24.11   9.03    1.13
overall     14.01   5.86    1.15

confidence intervals of the response measurements need to be created to prevent the model from being affected by outliers. In an industrial process, a common source of abrupt noise is the calibration of an online analyzer. Assuming that the online analyzer of the reactant concentration was calibrated every 90 days, the calibrating signals ranged from 1 mol/m3 to 100 mol/m3 over a 2 h period. There were eight calibrating signals, as shown in Figure 5, in which the 99% confidence limits of the predictions are labeled as gray lines. Since the window size of the FMWPLS was 2 days, the predictions of the blindly updated model continued to oscillate until the calibrating signals left the window, as Figure 5 shows. The AREP values of the blindly updated model and the proposed approach were 3.85 and 1.21, respectively. The results show that the confidence intervals of the predictions were capable of protecting the model from abrupt noise.
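The gating logic described above can be sketched as follows. This is an illustrative simplification, not the paper's implementation: it assumes a Gaussian band of z = 2.58 standard deviations around each prediction to approximate the 99% limits, with a single sigma lumping together the prediction and measurement uncertainty; all names and values are hypothetical.

```python
import numpy as np

def within_confidence(y_new, y_pred, sigma, z=2.58):
    """Accept a new analyzer sample only if it lies inside the z-sigma
    band (about 99%) around the model prediction; sigma lumps together
    the model and measurement uncertainty (an assumption here)."""
    return abs(y_new - y_pred) <= z * sigma

# Illustrative gating loop: samples failing the check (for example a
# calibration spike) are excluded from the moving-window update.
y_meas = np.array([1.02, 0.98, 100.0, 1.01])  # third value mimics a calibration signal
y_pred = np.array([1.00, 1.00, 1.00, 1.00])   # one-step-ahead model predictions
sigma = np.full(4, 0.05)                      # assumed prediction standard deviations
mask = np.array([within_confidence(y, yp, s)
                 for y, yp, s in zip(y_meas, y_pred, sigma)])
print(mask)  # the spike is rejected; only accepted samples update the model
```

Only the samples flagged True would be passed on to the moving-window model update, so a calibration spike leaves the model untouched.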

The soft sensors infer the measurements of the output variables through a process model and the online measurements of the input variables; therefore, the failure of the input instruments needs to be addressed during online prediction. From Figure 4a, the coolant temperature (TC) was the most important variable for the output variable. It was assumed that the measured coolant temperature was biased by 5 K from the 6th day. The Q and T2 statistics are shown in Figure 6, in which the dashed lines are the 99% confidence limits. The T2 statistic exceeded its control limit from the 6th day onward. Figure 7 shows the recalculated combined indices using the reconstruction-based approach. The combined index fell below its control limit only when the coolant temperature was the reconstructed variable. The reconstructed coolant temperature is compared with the actual and biased data in Figure 8a. The data reconstructed by minimizing the combined index captured the variations of the coolant temperature. In contrast, the data reconstructed by minimizing the Q statistic failed to recognize the biased measurement, as Figure 8b shows, because most of the Q statistics were under their control limit in Figure 6. The predictions of the reactant concentrations are compared in Figure 9. Figure 9a shows that the predictions with biased inputs were distorted compared to the predictions with


Figure 4. The regression coefficients for different time frames: (a) FMWPLS, (b) RPLS.

Figure 5. The unstable predictions due to the calibration of the online analyzer.

Figure 6. Statistic Q and T2 for the test data with bias measurements.

the reconstructed data; the AREP values were 10.25 and 1.45, respectively. The AREP with the inputs reconstructed

Figure 7. The recalculated combined indices for the reconstruction-based approach.

by minimizing the Q statistic was 9.92, which was close to the AREP with the biased inputs; that is, the prediction deterioration from the biased data was not completely removed. In addition, the predictions with the inputs reconstructed by minimizing the Q statistic oscillated strongly, as Figure 9b shows. Therefore, the sensor validation with a minimized combined index reached a more stable solution. Another common type of sensor failure is missing data from the input sensors. Figure 10 shows the first three input variables, which are the temperatures of the reactor, coolant, and feed (T, TC, TF), assuming data loss after the 6th day. The three variables therefore held their last recorded values during the period of missing data. To prevent the inferential model from being misled by this type of failure, the data range over the past 15 min, which was the sampling interval of the online analyzer, was examined for each input sensor before updating the inferential model. In general, the input sensors were sampled at intervals of less than one minute. When the input readings held their last recorded values, the data ranges of the inputs dropped to zero. If the inferential model were adapted without examining the data ranges, the relationships between the failed inputs and


Figure 8. The reconstructed coolant temperature, (a) minimizing the combined index, (b) minimizing the statistic Q.

Figure 9. The comparisons of the reactant concentrations from the online analyzer, the reconstructed inputs, and bias inputs: (a) minimizing the combined index, (b) minimizing the statistic Q.

Figure 10. The missing data of the input sensors.

output would vanish. Figure 11 shows the regression coefficients of the FMWPLS during the period when the model was updated with the last recorded inputs. Eventually, the effects of the reactor

Figure 11. The regression coefficients for the missing data.


Figure 12. The reconstructed inputs for the missing data.

Figure 13. The comparisons of the reactant concentrations from the online analyzer, the reconstructed inputs, and the missing data.

temperature and the coolant temperature on the measured reactant concentration disappeared. Therefore, the missing data needed to be reconstructed instead of blindly updating the model. Figure 12 compares the reconstructed data with the actual data; the process behavior due to the short-term disturbances, that is, the set point changes of the reactor temperature, was captured. Figure 13 compares the reactant concentrations from the online analyzer, the reconstructed inputs, and the last recorded inputs; the AREP values for the reconstructed inputs and the last recorded inputs were 1.70 and 6.10, respectively. The AREP was significantly improved by the reconstructed inputs, because the incorrect information was removed from the predictions. It should be noted that the reconstruction approach cannot provide further information about the failed sensors. The advantages of the proposed approach have been demonstrated with the simulated example. First, the local linear relationship between the inputs and output was gradually adapted to accommodate the nonlinear system by the fast moving window algorithm; in contrast, the recursive algorithm failed to capture the characteristics of the time-varying process, as can be observed in Figure 4. Second, since the moving window approach is sensitive to outliers, the confidence intervals for the response variable were created to prevent the inferential


model from being misled by the abrupt noise of the online analyzer. In addition, during the transition stage, the confidence intervals widen because the predictor variables leave their steady-state values, which increases the prediction uncertainty and allows the model to be updated with new data; this is demonstrated in the industrial application. Third, the prediction performance of the model would deteriorate if the sensors of the predictor variables failed. In the proposed approach, the contribution of known faulty sensors is removed before the model prediction is made.

4.2. Industrial Application. 4.2.1. Process Description. A brief flow diagram of the air separation process is shown in Figure 14, in which five highly integrated distillation columns separate nitrogen, oxygen, and argon from compressed air. In the process, compressed air from the atmosphere was fed into the pressurized column through stream 1. Part of the N2 product was drawn from the top of the pressurized column. Stream 3, which mostly consisted of O2 and Ar, was used as a coolant for the condensers of crude argon column 2 and the pure argon column. It was then fed into the low pressure column through stream 4 in order to withdraw the rest of the N2 from the top of the low pressure column and the liquefied O2 product from the bottom. The argon was purified by crude argon columns 1 and 2 and the pure argon column. The Ar purity of stream 9 from the crude argon column reached around 99%, and the stream was then fed into the pure argon column. The specification of the Ar product purity from the pure argon column was 99.9999%. In the separation process, three online analyzers were installed to inspect the O2 concentrations, labeled y1, y2, and y3 in Figure 14. The O2 concentration at the sampling point of the pressurized column was used to stabilize the column operation.
The set points of the concentration varied with the operating load in order to keep the ratio of the gas and liquid flows in the column constant. The O2 concentration of stream 8 was monitored to prevent crude argon column 2 from dumping. If the measured O2 concentration was too low, indicating a high amount of nitrogen at the inlet of crude argon column 1, the top pressure of crude argon column 2 would build up until the column dumped. The details can be found in Liu and Chen.26 Owing to the high purity of the Ar product from the pure argon column, the O2 concentration in the Ar product must be less than 1 ppm. Therefore, the operators adjusted the inlet flow rate of the pure argon column according to the O2 concentration of stream 9, labeled y3. When the measured O2 concentration came to a low value, the inlet flow rate was increased, and vice versa. During daily production, the operators relied on these online analyzers to keep the process running smoothly. However, the maintenance of a hardware analyzer required at least 30 days. Therefore, soft sensors were requested by the operators. In addition, the sampling intervals of y1 and y2 were 5 min, and that of y3 was 15 min. The operators wanted to prolong the sampling intervals in order to reduce the operating cost of analyzing the samples. Table 2 lists the input and output variables used to develop the soft sensors. Wold et al.11 suggested that a single PLS model for all output variables is preferred when the output variables are correlated; otherwise, modeling the output variables separately gives a set of simpler models that are easier to interpret. The PCA was applied to the data set that contained the measurements of the three output variables collected every 15 min over 10 days. Figure 15 shows the eigenvalues of the covariance and the accumulated captured variances. It can be observed that all


Figure 14. Brief flow diagram of the air separation process.

Table 2. Predictor and Response Variables for the Air Separation Process

variable  description
1    flow rate of stream 1
2    pressure of stream 1
3    temperature of stream 1
4    temperature at the sampling point of the pressurized column
5    pressure at the sampling point of the pressurized column
6    temperature of stream 8
7    top pressure of the low pressure column
8    recycle flow temperature of the crude argon column 2
9    top pressure of the crude argon column 1
10   recycle flow rate of the crude argon column 2
11   condenser temperature of crude argon column 2
12   condenser pressure of crude argon column 2
13   temperature of stream 9
14   pressure of stream 9
15   flow rate of stream 9
16   flow rate of stream 6
17   flow rate of stream 5
18   flow rate of stream 7
y1   O2 concentration at the sampling point of the pressurized column
y2   O2 concentration of stream 8
y3   O2 concentration of stream 9
eigenvalues were close to 1 and the captured variance of each principal component was close to 33%. Therefore, there was no significant correlation among the output variables, and three inferential models were developed instead of an overall model in this application. The window size and the number of latent variables for each inferential model were determined by investigating the AREP, as Figure 16 shows. Table 3 lists the optimal window sizes and numbers of latent variables for each inferential model. The inferential models were built according to the parameters in Table 3, and the regression coefficients are shown in Figure 17. From Figure 17a, the dominant variables affecting y1 were variables 4, 16, and 1, which were the temperature at the sampling point, the reflux flow rate from the main condenser, and the flow rate of the compressed air. It is reasonable that the O2 concentration

Figure 15. Analyzing the collinearity of the output variables.

of y1 was highly correlated with the temperature at the same sampling point. Since the inlet flow of the pressurized column was in the gas phase and the ratio of the phase flows had to be held constant, that is, a fixed L/V, the reflux flow rate varied with the inlet flow rate. When the process was operated in high-loading mode, the reflux flow rate of the pressurized column had to be increased to maintain the fixed L/V; therefore, the O2 concentration of the pressurized column also increased, because the inlet flow had a constant O2 concentration. Figure 17b shows that the O2 concentration of stream 8 was inversely proportional to variable 18 and proportional to variables 10 and 6. Stream 4, which had two phases, came from stream 3 with lower pressure and higher temperature after leaving the condensers of crude argon column 2 and the pure argon column. In the low pressure column, stream 7, which was liquefied nitrogen, was used as a coolant to condense the


Figure 16. The AREP for different window sizes and numbers of latent variables: (a) y1 model, (b) y2 model, (c) y3 model.

Table 3. The Window Sizes and the Numbers of Latent Variables for Each Inferential Model

model   window size (days)   no. of latent variables
y1      8                    4
y2      6                    5
y3      6                    2
gases of stream 4. When variable 18, the flow rate of stream 7, was too high, the concentrations of the lighter gases in stream 8 were higher, that is, the O2 concentration in stream 8 was lower. On the other hand, when variable 10, the partial flow of stream 4, had a higher value, more O2 came into the low pressure column through stream 4, that is, the O2 concentration in stream 8 was higher. The temperature of stream 8, variable 6, was measured at the same sampling point as y2; therefore, this temperature varied with the O2 concentration. In Figure 17c, only one significant variable, variable 15, affected the concentration of O2 in stream 9. As mentioned before, the operators drew more flow from crude argon column 2 when the concentration of O2 had a low value. From the above discussion, the inferential models captured the relationships of the response and predictor variables, and these were consistent with the understanding of the process engineers.
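The selection of the window size and the number of latent variables by minimizing the AREP, as summarized in Figure 16 and Table 3, can be sketched as below. This is not the paper's FMWPLS implementation: it refits a plain NIPALS PLS1 model on each window and runs on synthetic stand-in data, and the grid values, variable names, and functions are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def pls1_fit(X, y, n_lv):
    """Minimal NIPALS PLS1; returns the regression coefficient vector."""
    Xk, yk = X.copy(), y.copy()
    W, P, Q = [], [], []
    for _ in range(n_lv):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)
        t = Xk @ w
        tt = float(t @ t)
        p = Xk.T @ t / tt
        q = float(yk @ t) / tt
        Xk = Xk - np.outer(t, p)   # deflate X
        yk = yk - q * t            # deflate y
        W.append(w); P.append(p); Q.append(q)
    Wm, Pm = np.column_stack(W), np.column_stack(P)
    return Wm @ np.linalg.inv(Pm.T @ Wm) @ np.array(Q)

def arep(y_true, y_pred):
    """Average relative error of prediction, in percent."""
    return 100.0 * np.mean(np.abs(y_true - y_pred) / np.abs(y_true))

def moving_window_arep(X, y, window, n_lv):
    """One-step-ahead prediction with a sliding training window."""
    y_true, y_hat = [], []
    for t in range(window, len(y)):
        Xw, yw = X[t - window:t], y[t - window:t]
        xm, ym = Xw.mean(axis=0), yw.mean()
        b = pls1_fit(Xw - xm, yw - ym, n_lv)
        y_hat.append(float((X[t] - xm) @ b) + ym)
        y_true.append(y[t])
    return arep(np.array(y_true), np.array(y_hat))

# Synthetic stand-in data (5 inputs, one output); all values illustrative.
n_obs, n_var = 400, 5
X = rng.normal(size=(n_obs, n_var))
coef = np.array([1.0, -0.5, 0.3, 0.0, 0.2])
y = X @ coef + 10.0 + 0.01 * rng.normal(size=n_obs)

# Grid search: keep the (window, n_lv) pair with the lowest AREP.
grid = [(w, a) for w in (50, 100, 200) for a in (1, 2, 3)]
best = min(grid, key=lambda g: moving_window_arep(X, y, *g))
print("best (window, n_lv):", best)
```

In the paper the same search is run per output variable on historical plant data, which is how the per-model optima in Table 3 were obtained.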

4.2.2. Prediction Performance of the Inferential Models. Three inferential models were built using the predictor variables listed in Table 2 and the parameters listed in Table 3 to predict the O2 concentrations at the different locations. Figure 18 compares the prediction results of the PLS and FMWPLS models. Figure 18a shows that the PLS and FMWPLS had comparable performance for the y1 variable; that is, no long-term disturbance effect affected the y1 variable. In contrast, the PLS predictions for the y2 and y3 variables drifted after the 15th day, as Figure 18 panels b and c show, whereas the proposed FMWPLS models effectively captured the long-term disturbance effects on the y2 and y3 variables. In Figure 18b, calibration signals appear around the 13th day; these are discussed later. As mentioned before, the purposes of developing the soft sensors in this work are (1) providing the O2 concentrations when a hardware analyzer is not available and (2) prolonging the sampling intervals of the online analyzers to reduce the operating cost. Table 4 lists the AREP for different model updating rates, in which the rows labeled "every sample" and "without updating" list the results of the FMWPLS and PLS in Figure 18, respectively, and the other rows list the AREP of the FMWPLS with different updating rates. The calibration signals of y2 have been excluded. From


Figure 17. The regression coefficients of the inferential models: (a) y1 model, (b) y2 model, (c) y3 model.

the guidelines of the online analyzers, the accuracies of y1, y2, and y3 are 0.125%, 0.1%, and 0.1 ppm, respectively. The average relative errors (ARE) due to the hardware accuracies are 3.32, 0.11, and 30.54 for y1, y2, and y3, respectively. Compared with the values in Table 4, all the AREP values of soft sensors y1 and y3 were better than the ARE from the hardware accuracies; therefore, the sampling intervals of y1 and y3 could be prolonged, and the information provided by the soft sensors was in line with the hardware sensors. On the other hand, the AREP of soft sensor y2 updated with every sample was worse than the ARE of the hardware sensor. Therefore, the hardware sensor of y2 was critical and could not be replaced by the soft sensor; a redundancy procedure needs to be prepared for periods of hardware maintenance. Figure 19a enlarges Figure 18b around the calibration of y2 and the following 8 days. The calibration signal varied from 80 to 100 over 2 h. In the figure, the confidence limits were derived on the basis of the prediction variances, which include the measurement errors in the response and predictor variables. If the inferential model were blindly updated without considering the prediction uncertainty, the predictions would remain unstable until the calibration signals moved out of the window, which was 6 days for the y2 model. During this window, the AREP of the blindly updated model was 0.46, which was worse than the

AREP of the nonupdating model in Table 4. This case shows that prediction uncertainty is an important issue for a model adapted with new data. During the transition stage, some of the input variables were adjusted to bring the process to a new steady state. The confidence limits widened because the adjusted process variables deviated from their steady-state values; therefore, the inferential models could still be updated during the transition stage. Figure 19b compares the measurements of the online analyzer and the outputs of the y1 model during a period when the operating mode changed several times. At the bottom of the figure, it can be seen that the operating load varied as variable 1, the inlet flow rate of the compressed air, was adjusted. The confidence limits widened accordingly, so that the y1 model could still be updated. Therefore, by implementing the confidence limits on the prediction outputs, the model is not misled by abrupt noise and can be updated with the data during mode transitions.

5. Conclusions

A method is presented to develop a soft sensor that can cope with collinearity among the predictor variables as well as nonlinearity between the predictor and response variables.


Figure 18. Prediction results of PLS and FMWPLS: (a) y1 model, (b) y2 model, (c) y3 model.

The FMWPLS algorithm inherits the characteristics of the PLS model, capturing the linear relationships between inputs and outputs, and adapts the model with the new data,

gradually correcting the captured relationships, to describe a nonlinear process behavior. Since a moving window model is sensitive to outliers, the confidence intervals are created

Figure 19. The inferential models protected by the confidence limits: (a) y2 model, (b) y1 model.


Table 4. The AREP (%) for Different Model Updating Rates

model updating rate   y1     y2     y3
every sample          2.53   0.12   6.69
1 h                   2.61   0.22   13.44
4 h                   2.62   0.30   17.68
8 h                   2.62   0.33   19.26
without updating      2.64   0.37   19.35

based on the prediction variances of the predictor and response variables to prevent the model from being misled by outliers. Additionally, the inferential model can still be adapted during the transition stage, because the intervals widen as the adjusted process variables leave steady-state operation. The prediction performance of an inferential model depends not only on the capability of the model but also on correct measurements of the input variables. Sensor validation is therefore implemented for the input measurements: the measured data are validated and reconstructed once faulty sensors are identified. The reconstruction of the faulty sensors removes the effect of the failed inputs from the model predictions; it does not otherwise enhance the prediction performance. In the simulated example, it was demonstrated that the PLS precisely captured the process behavior driven by the short-term disturbances and correctly performed one-step-ahead predictions. The inferential models were adapted to accommodate the long-term disturbance effect, and the relationships captured by the FMWPLS remained consistent, in contrast to those of the RPLS. In the industrial example, the sampling intervals of the online analyzers were examined by comparing the prediction performance of the soft sensors with the accuracy of the hardware analyzers. The operating cost could be reduced by prolonging the sampling intervals; the results show that the sampling intervals of two of the three hardware analyzers could be prolonged. It was also demonstrated that the inferential models were not misled by outliers from the online analyzers, such as calibration signals, and that the predictions did not deteriorate with missing or biased data from the input instruments. The proposed approach has potential for the implementation of soft sensors in the process industry.
Acknowledgment

This work was supported by the China Steel Corporation.

Appendix A. Kernel Algorithm for the PLS

1. Set a = 1, (ΣXY)a = ΣXY, (ΣX)a = ΣX, and Ha = I.
2. Perform singular value decomposition (SVD) on (ΣXY ΣXY^T)a. The eigenvector of the largest eigenvalue is the weighting vector wa, and an auxiliary vector is defined as ra ≡ Ha wa.
3. The loading vectors pa and qa are obtained from pa^T = wa^T(ΣX)a / (wa^T(ΣX)a wa) and qa^T = wa^T(ΣXY)a / (wa^T(ΣX)a wa).
4. The deflation procedure is conducted as (ΣXY)a+1 = (I − wa pa^T)^T(ΣXY)a and (ΣX)a+1 = (I − wa pa^T)^T(ΣX)a(I − wa pa^T). Prepare Ha+1 = Ha − ra pa^T for the next iteration.
5. Set a = a + 1 and go to step 2 until all data structures of the covariance and cross covariance are extracted.
6. The matrix of regression coefficients is obtained as BPLS = RQ^T, where R = [r1 r2 ... ra] and Q = [q1 q2 ... qa].
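A sketch of the Appendix A kernel algorithm, operating directly on the covariance and cross-covariance matrices; function and variable names are illustrative, and the synthetic-data check exploits the fact that with a full set of latent variables the PLS solution coincides with ordinary least squares.

```python
import numpy as np

def kernel_pls(Sx, Sxy, n_lv):
    """Kernel PLS on the covariance matrix Sx (n x n) and the
    cross-covariance matrix Sxy (n x l), following steps 1-6."""
    n = Sx.shape[0]
    H = np.eye(n)
    R, Qs = [], []
    for _ in range(n_lv):
        # Step 2: dominant eigenvector of Sxy Sxy^T is the weight vector.
        _, eigvec = np.linalg.eigh(Sxy @ Sxy.T)
        w = eigvec[:, -1]
        r = H @ w                        # auxiliary vector r_a = H_a w_a
        # Step 3: loading vectors (Sx is symmetric).
        denom = float(w @ Sx @ w)
        p = Sx @ w / denom
        q = Sxy.T @ w / denom
        # Step 4: deflation and update of H.
        D = np.eye(n) - np.outer(w, p)   # I - w_a p_a^T
        Sxy = D.T @ Sxy
        Sx = D.T @ Sx @ D
        H = H - np.outer(r, p)
        R.append(r); Qs.append(q)
    # Step 6: B_PLS = R Q^T.
    return np.column_stack(R) @ np.column_stack(Qs).T

# Illustrative use on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
B_true = np.array([[1.0], [0.5], [-0.3], [0.2]])
Y = X @ B_true + 0.01 * rng.normal(size=(500, 1))
Sx0 = X.T @ X / len(X)
Sxy0 = X.T @ Y / len(X)
B = kernel_pls(Sx0, Sxy0, 4)
print(np.round(B.ravel(), 2))   # close to [1.0, 0.5, -0.3, 0.2]
```

Working on the (cross-)covariance matrices rather than the raw data is what makes the moving-window update in the paper cheap: the matrices can be updated with the new and oldest samples and the algorithm rerun with cost independent of the window size.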

Figure A1. Schematic of the CSTR system with cascade control.

Table A1. Nominal Operating Conditions and Model Parameters for the CSTR Example

Q = 100 L/min            A = 0.1666 m2
QC = 15 L/min            k0 = 7.2 × 10^10 min-1
TF = 320 K               ΔH = −5 × 10^4 J/mol
TCF = 300 K              ρCp = 239 J/(L·K)
T = 402.35 K             ρCCpC = 4175 J/(L·K)
TC = 345.44 K            E/R = 8750 K
CAF = 1.0 mol/L          UAC = 5 × 10^4 J/(min·K)
CA = 0.037 mol/L         VC = 10 L
h = 0.6 m
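As a quick check of the CSTR balances (eqs A1-A4) against the Table A1 values, the sketch below evaluates the right-hand sides at the nominal operating point. The conversion of the holdup Ah from m3 to liters is an assumption made here to reconcile the mixed units in the table, and the residuals are small but nonzero simply because the tabulated values are rounded.

```python
import numpy as np

# Nominal values from Table A1 (rounded, as published).
Q, QC, QF = 100.0, 15.0, 100.0   # flow rates, L/min (QF = Q assumed at steady state)
TF, TCF = 320.0, 300.0           # feed and coolant feed temperatures, K
T, TC = 402.35, 345.44           # reactor and jacket temperatures, K
CAF, CA = 1.0, 0.037             # concentrations, mol/L
Ax, h = 0.1666, 0.6              # cross-sectional area (m^2) and level (m)
k0, E_R = 7.2e10, 8750.0         # frequency factor (1/min) and E/R (K)
dH = -5.0e4                      # heat of reaction, J/mol
rhoCp, rhoCCpC = 239.0, 4175.0   # volumetric heat capacities, J/(L*K)
UAC, VC = 5.0e4, 10.0            # J/(min*K) and jacket volume (L)

def cstr_rhs(CA, T, TC, h):
    """Right-hand sides of eqs A1-A4 at a given state."""
    V = Ax * h * 1000.0          # reactor holdup in liters (assumed m^3 -> L conversion)
    k = k0 * np.exp(-E_R / T)
    dCA = -k * CA + (QF * CAF - Q * CA) / V
    dT = (QF * TF - Q * T) / V + k * CA * (-dH) / rhoCp + UAC * (TC - T) / (rhoCp * V)
    dTC = QC * (TCF - TC) / VC + UAC * (T - TC) / (rhoCCpC * VC)
    dh = (QF - Q) / Ax
    return np.array([dCA, dT, dTC, dh])

res = cstr_rhs(CA, T, TC, h)
# Residuals should be small relative to the individual terms.
print(np.round(res, 3))
```

Note that the reactor itself is open-loop unstable around this point, which is why the paper's Figure A1 closes the loop with cascade temperature control before data are generated.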

Appendix B. CSTR Model

A schematic diagram of the CSTR and feedback control system is shown in Figure A1. The CSTR model can be derived as follows:23

dCA/dt = −k0 e^(−E/RT) CA + (QF CAF − Q CA)/(Ah)   (A1)

dT/dt = (QF TF − Q T)/(Ah) + k0 e^(−E/RT) CA(−ΔH)/(ρCp) + UAC(TC − T)/(ρCp Ah)   (A2)

dTC/dt = QC(TCF − TC)/VC + UAC(T − TC)/(ρC CpC VC)   (A3)

dh/dt = (QF − Q)/A   (A4)

where the process variables and model parameters are defined in the Nomenclature section. The nominal operating conditions and model parameters are given in Table A1, and the details of the simulated CSTR can be found in ref 24. The data were generated according to the problem settings of Fujiwara et al.25 The set point of the reactor temperature was changed by ±2 K every day. For the catalyst deactivation, the frequency factor (k0) was considered to decrease linearly from 7.2 × 10^10 to 5.4 × 10^10 over 180 days. Two types of disturbances therefore affected the simulated process: a short-term effect due to the set point changes of the reactor temperature, and a long-term effect from the catalyst deactivation. The process variables shown in Figure A1 were used as inputs to predict the measured reactant concentration (CA): the temperatures of the reactor, coolant, feed, and coolant feed (T, TC, TF, TCF), the reactor level (h), and the flow rates of the reactor exit, coolant, and feed (Q, QC, QF). The data were collected every 15 min to simulate the sampling rate of an online analyzer. White noise was added with standard deviations equal to 1% of the nominal values listed in Table A1.

Nomenclature

A = cross-sectional area of the reactor
AC = heat-transfer area
B = true regression vector
BOLS = matrix of the regression coefficients by the OLS
BPLS = matrix of the regression coefficients by the PLS
C = matrix used to calculate the statistic Q
CA = reactant concentration in the reactor
CAF = reactant concentration in the reactor feed stream
Cp = heat capacity of the reactor contents
CpC = heat capacity of the coolant
ci,j = the ith row and jth column element of the matrix C
D = matrix used to calculate the statistic T2
E = residual parts of the predictor data matrix
E = activation energy
e = white noise of the response variable
F = residual parts of the response data matrix
H = matrix used to calculate the auxiliary vectors r in the kernel algorithm
h = liquid level in the reactor
I = identity matrix
K = number of principal components spanning the PC subspace
k0 = frequency factor
l = number of response variables
m = number of observations in the training data set
n = number of predictor variables
nf = number of faulty inputs
P = loading matrix of the PC subspace
P̃ = loading matrix of the residual subspace
pa = the ath loading vector for the predictor variables in the kernel algorithm
Q = loading matrix for the response variables
Q = statistic Q of the PCA; flow rate of the reactor outlet stream in the simulated CSTR
QC = coolant flow rate
QF = feed flow rate of the reactor feed stream
Qk* = recalculated statistic Q with the kth reconstructed input
Qα = control limit of the statistic Q with (1 − α) confidence
qa = the ath loading vector for the response variables in the kernel algorithm
R = matrix containing the auxiliary vectors used to calculate the score vectors
R = gas constant
ra = the ath auxiliary vector used to calculate a score vector in the kernel algorithm
S = covariance matrix of the training data
SX = diagonal matrix of standard deviations for the input variables
SX* = diagonal matrix of updated standard deviations for the input variables
SY = diagonal matrix of standard deviations for the output variables
SY* = diagonal matrix of updated standard deviations for the output variables
sxi = standard deviation of the ith input variable
sxi* = updated standard deviation of the ith input variable
syi = standard deviation of the ith output variable
syi* = updated standard deviation of the ith output variable
T = reactor temperature
TC = temperature of the coolant in the cooling jacket
TCF = inlet coolant temperature
TF = reactor feed temperature
Tk = matrix containing the first k score vectors
T2 = statistic T2 of the PCA
Tα2 = control limit of the statistic T2 with (1 − α) confidence
t = score vector
t = time
tα/2,m−n = critical value of the t-distribution at the (1 − α) significance level with m − n degrees of freedom
U = heat-transfer coefficient
11545

VC ) volume of the cooling jacket W ) data matrix of the input measurements j ) mean vector of the input measurements W j * ) updated mean vector of the input measurements W wa ) the ath weighting vector in the kernel algorithm w j i ) mean of the ith input variable w j *i ) updated mean of the ith input variable X ) normalized predictor data set ˆ ) systemic parts of the data matrix X xo ) new observations of the predictor variables xf ) collection of the faulty inputs x*f ) reconstructed faulty inputs xk ) the kth predictor variable x*k ) the reconstructed data for the kth predictor variable Y ) normalized response data set y ) normalized data with one response variable yˆOLS ) predictions by the OLS yˆPLS ) predictions by the PLS Z ) data matrix of the out measurements j ) mean vector of the output measurements Z j * ) updated mean vector of the output measurements Z jzi ) mean of the ith output variable jz*i ) updated mean of the ith output variable Greek Letters ξi ) a column vector in which the ith element is one and the others are zero ∆H ) heat of reaction Φ ) matrix used to calculate the combined index ΦPLS ) matrix used to calculate the combined index in PLS φi,j ) the element of matrix Φ in the ith row and the jth column η ) diagonal matrix where the elements are one for the faulty sensors and zero for the normal inputs. 
φ ) combined index φPLS ) combined index for PLS Λ ) diagonal matrix with eigenvalues from covariance matrix of the normalized data ˜ ) diagonal matrix of the residual eigenvalues Λ ΣX ) covariance matrix of the predictor data matrix Σ*X ) updated covariance matrix of the predictor data matrix ΣXY ) cross covariance matrix of the predictor and the response data matrices Σ*XY ) updated cross covariance matrix of the predictor and the response data matrices F ) density of the reactor contents FC ) density of the coolant σ ) standard deviation of the regression errors σOLS ) standard deviation of the regression errors by the OLS σPLS ) standard deviation of the regression errors by the PLS ξ ) full matrix contains nf fault directions

Literature Cited

(1) Kadlec, P.; Gabrys, B.; Strandt, S. Data-Driven Soft Sensors in the Process Industry. Comput. Chem. Eng. 2009, 33, 795.
(2) Burnham, A. J.; Viveros, R.; MacGregor, J. F. Frameworks for Latent Variable Multivariate Regression. J. Chemom. 1996, 10, 31.
(3) De Assis, A. J.; Filho, R. M. Soft Sensors Development for Online Bioreactor State Estimation. Comput. Chem. Eng. 2000, 24, 1099.
(4) Yan, W.; Shao, H.; Wang, X. Soft Sensing Modeling Based on Support Vector Machine and Bayesian Model Selection. Comput. Chem. Eng. 2004, 28, 1489.
(5) Zhang, X.; Yan, W.; Shao, H. Nonlinear Multivariate Quality Estimation and Prediction Based on Kernel Partial Least Squares. Ind. Eng. Chem. Res. 2008, 47, 1120.
(6) Höskuldsson, A. PLS Regression Methods. J. Chemom. 1988, 2, 211.


(7) Qin, S. J. Recursive PLS Algorithms for Adaptive Data Modeling. Comput. Chem. Eng. 1998, 22, 503.
(8) Vijaysai, P.; Gudi, R. D.; Lakshminarayanan, S. Identification on Demand Using a Blockwise Recursive Partial Least-Squares Technique. Ind. Eng. Chem. Res. 2003, 42, 540.
(9) Dayal, B. S.; MacGregor, J. F. Recursive Exponentially Weighted PLS and Its Applications to Adaptive Control and Prediction. J. Process Control 1997, 7, 169.
(10) Wang, X.; Kruger, U.; Irwin, G. W. Process Monitoring Approach Using Fast Moving Window PCA. Ind. Eng. Chem. Res. 2005, 44, 5691.
(11) Wold, S.; Sjöström, M.; Eriksson, L. PLS-Regression: A Basic Tool of Chemometrics. Chemom. Intell. Lab. Syst. 2001, 58, 109.
(12) Lindgren, F.; Geladi, P.; Wold, S. The Kernel Algorithm for PLS. J. Chemom. 1993, 7, 45.
(13) De Jong, S.; Ter Braak, C. J. F. Comments on the PLS Kernel Algorithm. J. Chemom. 1994, 8, 169.
(14) Dayal, B. S.; MacGregor, J. F. Improved PLS Algorithms. J. Chemom. 1997, 11, 73.
(15) Mu, S.; Zeng, Y.; Liu, R.; Wu, P.; Su, H.; Chu, J. Online Dual Updating with Recursive PLS Model and Its Application in Predicting Crystal Size of Purified Terephthalic Acid (PTA) Process. J. Process Control 2006, 16, 557.
(16) Faber, K.; Kowalski, B. R. Prediction Error in Least Squares Regression: Further Critique on the Deviation Used in the Unscrambler. Chemom. Intell. Lab. Syst. 1996, 34, 283.
(17) Zhang, L.; Garcia-Munoz, S. A Comparison of Different Methods to Estimate Prediction Uncertainty Using Partial Least Squares (PLS): A Practitioner's Perspective. Chemom. Intell. Lab. Syst. 2009, 97, 152.

(18) Martin, G. Consider Soft Sensors. Chem. Eng. Prog. 1997, 93, 66.
(19) Qin, S. J.; Yue, H.; Dunia, R. Self-Validating Inferential Sensors with Application to Air Emission Monitoring. Ind. Eng. Chem. Res. 1997, 36, 1675.
(20) Yue, H. H.; Qin, S. J. Reconstruction-Based Fault Identification Using a Combined Index. Ind. Eng. Chem. Res. 2001, 40, 4403.
(21) Jackson, J. E. A User's Guide to Principal Components; Wiley: New York, 1991.
(22) Dunia, R.; Qin, S. J.; Edgar, T. F.; McAvoy, T. J. Identification of Faulty Sensors Using Principal Component Analysis. AIChE J. 1996, 42, 2797.
(23) Russo, L. P.; Bequette, B. W. Effect of Process Design on the Open-Loop Behavior of a Jacketed Exothermic CSTR. Comput. Chem. Eng. 1996, 20, 417.
(24) Singhal, A.; Seborg, D. E. Pattern Matching in Multivariate Time Series Databases Using a Moving-Window Approach. Ind. Eng. Chem. Res. 2002, 41, 3822.
(25) Fujiwara, K.; Kano, M.; Hasebe, S.; Takinami, A. Soft-Sensor Development Using Correlation-Based Just-in-Time Modeling. AIChE J. 2009, 55, 1754.
(26) Liu, J.; Chen, D. S. Operational Performance Assessment and Fault Isolation for Multimode Processes. Ind. Eng. Chem. Res. 2010, 49, 3700.

Received for review June 25, 2010. Revised manuscript received September 9, 2010. Accepted October 4, 2010.

IE101356C