
Ind. Eng. Chem. Res. 2010, 49, 4792–4799

Nonlinear Probabilistic Monitoring Based on the Gaussian Process Latent Variable Model

Zhiqiang Ge*,† and Zhihuan Song*,†,‡

State Key Laboratory of Industrial Control Technology, Institute of Industrial Process Control, Zhejiang University, Hangzhou 310027, Zhejiang, China, and Ningbo Institute of Technology, Zhejiang University, Ningbo 315100, Zhejiang, China

* To whom all correspondence should be addressed. E-mail: [email protected] (Z.G.), [email protected] (Z.S.). † Institute of Industrial Process Control. ‡ Ningbo Institute of Technology.

For probabilistic interpretation and monitoring performance enhancement in noisy processes, the probabilistic principal component analysis (PPCA) method has recently been introduced into the monitoring area. However, PPCA is restricted to linear processes. This paper first gives a new interpretation of PPCA from the Gaussian process viewpoint. A new nonlinear probabilistic monitoring method is then proposed, which is developed upon the Gaussian process latent variable model. Different from the traditional PPCA method, the new approach can extract the nonlinear relationships between process variables. Furthermore, it provides more detailed uncertainty information for the process data, through which the operation condition and the fault behavior can be interpreted more easily. Two case studies are provided to show the efficiency of the proposed method.

1. Introduction

With the wide use of distributed control systems in modern industrial processes, an increasingly massive amount of data has been generated and collected. Therefore, data-based monitoring methods have become very popular, especially those based on multivariate statistical process control (MSPC). Traditional MSPC methods include principal component analysis (PCA) and partial least squares (PLS), among others.1-10 By projecting the data into a lower-dimensional space, these MSPC methods can accurately characterize the operation state of the monitored process systems. At the same time, the monitoring performance can be greatly improved and the implementation procedure simplified. However, traditional MSPC methods are not formulated in the probabilistic manner upon which all statistical judgments and decisions should be made.11

To meet the probabilistic monitoring requirement, and also for performance enhancement in noisy process environments, the traditional PCA method has recently been extended to its probabilistic counterpart: probabilistic principal component analysis (PPCA).11 Compared to the traditional method, more satisfactory performance has been obtained by the PPCA approach. However, the developed PPCA-based monitoring method is limited to linear processes. To the best of our knowledge, few probabilistic nonlinear methods have been developed for monitoring purposes so far. Previously, we proposed a nonlinear probabilistic monitoring method based on generative topographic mapping (GTM).12-14 Although the GTM-based method does perform better than the linear probabilistic approach (PPCA), the information carried by the latent variables is lost. This is because GTM is typically designed to embed the data in one or two dimensions, and the embedded latent variables are restricted to a specific region. Therefore, problems arise in the GTM method when the dimensionality of the latent variables increases. In fact, point representations in the latent space are important for monitoring, because a fault can only be detected in the latent space when the model structure has not been violated.

Besides, GTM can only provide a single noise variance for all data samples; in our opinion, however, the uncertainties of the data samples may differ from each other.

Fortunately, a novel probabilistic interpretation of PCA has recently been explored in the machine learning area, which can easily be nonlinearized through Gaussian processes (GP).15 The new nonlinear probabilistic method is called the Gaussian process latent variable model (GPLVM). Within this new probabilistic framework, the traditional PPCA method can be considered a special case in which the variable correlation is linear. In other words, GPLVM is a nonlinear probabilistic PCA method. As a novel unsupervised approach for nonlinear low-dimensional embedding, GPLVM has been used in several application areas, such as face recognition, data visualization, and clustering.15 However, its application to monitoring has rarely been reported. Because of its modeling efficiency for nonlinear noisy process systems, we introduce GPLVM into the process control area for monitoring purposes. Compared to PPCA, the nonlinearity of the process data can be successfully characterized under the new model structure. Besides, unlike the traditional PPCA and GTM methods, which characterize the uncertainty of the data samples by a single value, GPLVM provides specific uncertainties for different data samples. This uncertainty information can also be used for monitoring; therefore, another monitoring statistic can be constructed, which is based on the uncertainty value of each data sample.

The layout of this paper is as follows. The principle of PPCA and its new interpretation through Gaussian processes are given in section 2, followed by a detailed description of the nonlinear probabilistic monitoring method in section 3. Two case studies are carried out in section 4, and conclusions are drawn in the last section.

2. Probabilistic Principal Component Analysis and Its New Interpretation

In this section, the principle of PPCA is first described, followed by its new interpretation through Gaussian processes.


2.1. PPCA. PPCA was first proposed by Tipping and Bishop;16 its formulation can be constructed in a generative manner, which is given as

x = Pt + e    (1)

where x ∈ R^m represents the process variables, t ∈ R^k is the latent variable, P ∈ R^(m×k) specifies the relationship between the latent space and the original data space, and e ∈ R^m is the noise term, which is assumed to follow a Gaussian distribution with zero mean and variance β^{-1}I, thus p(e) = N(e | 0, β^{-1}I). Therefore, the conditional distribution of x can be calculated as p(x | t, P, β) = N(x | Pt, β^{-1}I). In the PPCA model, the prior distribution of the latent variable t is assumed to be Gaussian with zero mean and unit variance, p(t) = N(t | 0, I). Therefore, the marginal likelihood of x can be calculated as p(x | P, β) = ∫ p(x | t, P, β) p(t) dt. For a given data set X = (x_1, x_2, ..., x_n) of n data samples, P and β can be determined by maximizing the following likelihood function:

L(P, β) = ln ∏_{i=1}^{n} p(x_i | P, β)    (2)
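To make eq 2 concrete, the following sketch (not part of the original paper; the function name ppca_log_likelihood and the toy data are our assumptions) evaluates the marginal log-likelihood of a zero-mean data set for a given loading matrix P and noise precision β, using the marginal form p(x | P, β) = N(x | 0, PP^T + β^{-1}I) noted later in section 3.2:

```python
import numpy as np

def ppca_log_likelihood(X, P, beta):
    """Marginal log-likelihood of eq 2: sum_i ln N(x_i | 0, P P^T + beta^{-1} I).

    X : (n, m) data matrix, one sample per row (assumed zero-mean)
    P : (m, k) loading matrix
    beta : scalar noise precision (noise variance is 1/beta)
    """
    n, m = X.shape
    C = P @ P.T + np.eye(m) / beta            # marginal covariance of x
    _, logdet = np.linalg.slogdet(C)
    Cinv = np.linalg.inv(C)
    # sum of quadratic forms x_i^T C^{-1} x_i equals tr(C^{-1} X^T X)
    quad = np.trace(Cinv @ X.T @ X)
    return -0.5 * (n * m * np.log(2 * np.pi) + n * logdet + quad)

# Usage sketch with random (hypothetical) data and parameters.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
P = rng.standard_normal((5, 2))
print(ppca_log_likelihood(X, P, beta=10.0))
```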

It is noticed that the standard PPCA model is obtained by marginalizing the latent variable t and optimizing the parameter matrix P via maximum likelihood. In the next subsection, an alternative approach to PPCA model development is introduced, which instead marginalizes the parameter matrix P and optimizes the latent variable t.

2.2. New Interpretation of PPCA through Gaussian Processes. Gaussian processes are a class of probabilistic models which specify Gaussian distributions over function spaces.17-19 Typically, the mean function is taken to be zero, while the covariance function is constrained to be positive semidefinite. In the Bayesian framework, the parameter matrix P is viewed as a random matrix, and a common choice of prior for P is a spherical Gaussian distribution

p(P) = ∏_{i=1}^{m} N(p_i | 0, I)    (3)

where p_i is the ith row of P. Marginalizing P yields the following new marginalized likelihood:

p(X | T, β) = ∏_{i=1}^{m} N(x_i | 0, TT^T + β^{-1}I)    (4)

where x_i here denotes the n-dimensional vector formed by the ith process variable over all samples and T is the matrix whose rows are the latent variables t. Consider a simple Gaussian process with a covariance matrix of the form K = TT^T + β^{-1}I, where K is also called the kernel matrix of the Gaussian process. The new marginalized likelihood is thus a product of m independent Gaussian processes, and the optimization can be carried out with respect to the latent matrix T. The log-likelihood objective function is given as

L = -(mn/2) ln(2π) - (m/2) ln|K| - (1/2) tr(K^{-1}XX^T)    (5)

where K = TT^T + β^{-1}I. Taking the gradient of L with respect to T and setting it to zero gives the fixed-point condition (1/m) XX^T K^{-1} T = T. Substituting the singular value decomposition T = USV^T, the optimal T can be obtained, where U is an n × k matrix whose columns are the first k eigenvectors of XX^T, S is a k × k diagonal matrix whose ith element is s_i = (λ_i - β^{-1})^{1/2}, with λ_i the eigenvalue associated with the ith eigenvector of m^{-1}XX^T, and V is an arbitrary k × k rotation matrix.
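A minimal sketch of this closed-form solution, assuming the data matrix is arranged with one sample per row and β is known (the function name and the identity choice for V are ours, not the paper's):

```python
import numpy as np

def dual_ppca_latents(X, k, beta):
    """Closed-form latent matrix of the eq 5 fixed point: T = U S V^T,
    with U the first k eigenvectors of X X^T and s_i = (lambda_i - 1/beta)^{1/2},
    where lambda_i are eigenvalues of (1/m) X X^T.

    X : (n, m) zero-mean data, rows are samples
    k : number of latent dimensions
    beta : noise precision
    """
    n, m = X.shape
    S = (X @ X.T) / m                          # n x n matrix (1/m) X X^T
    eigval, eigvec = np.linalg.eigh(S)         # ascending eigenvalues
    idx = np.argsort(eigval)[::-1][:k]         # keep the top-k eigenpairs
    lam, U = eigval[idx], eigvec[:, idx]
    scale = np.sqrt(np.maximum(lam - 1.0 / beta, 0.0))
    return U * scale                           # V taken as identity (arbitrary rotation)

# Usage sketch on random data.
T = dual_ppca_latents(np.random.default_rng(1).standard_normal((50, 8)), k=2, beta=20.0)
print(T.shape)                                 # (50, 2)
```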

3. Nonlinear Probabilistic Monitoring Based on GPLVM

In the previous section, we saw that PPCA can be interpreted as a special Gaussian process, in which the positions of the points in the latent space are determined by maximizing the likelihood function with respect to the latent variables. It is straightforward to extend this framework to nonlinear process modeling, which is known as the nonlinear GPLVM. The model development of the nonlinear GPLVM is demonstrated in the next subsection, followed by the proposed monitoring scheme.

3.1. Nonlinear GPLVM Model Development. By introducing a nonlinear covariance function (or kernel function), the Gaussian process-based PPCA model can easily be nonlinearized. However, the resulting model can no longer be optimized through an eigenvalue decomposition, since the log-likelihood is a highly nonlinear function of the embeddings and parameters. In other words, there is no closed-form solution for the nonlinear GPLVM model. Fortunately, we can turn to gradient-based optimization methods such as scaled conjugate gradient (SCG).20 This approach implicitly considers second-order information while using a scale parameter to regulate the positive definiteness of the Hessian at each data point. Owing to its efficiency, the SCG method is employed for optimization in this paper.

Another important issue is the selection of the kernel function. Generally, a Gaussian process covariance function can be developed from any positive definite kernel, and new kernels can also be constructed by combining existing kernels. Three commonly used kernels are the linear, RBF, and MLP kernels. In the present paper, the RBF kernel is selected for GPLVM modeling, which is given as follows:

k_rbf(t_i, t_j) = θ_rbf exp[-(t_i - t_j)^T(t_i - t_j)/(2σ_rbf)]    (6)

where θ_rbf is the process variance, which controls the scale of the output function, and σ_rbf is the width parameter. Given a training data set X_tr = (x_1, x_2, ..., x_n), the aim of GPLVM modeling is to determine the values of the latent variables T_tr and the corresponding kernel matrix K_tr. A key step of GPLVM modeling is the SCG optimization procedure; a detailed description of this method can be found in ref 20. An important issue is how to guarantee a global solution of the optimization procedure; however, this is beyond the scope of the present paper.

Although the modeling performance of GPLVM is often better than that of other methods, its computational cost is much higher, which can make the method impractical. Therefore, a more practical algorithm for GPLVM modeling has been proposed by Lawrence,15 which is based on the informative vector machine (IVM) algorithm.21 With the IVM method, the most representative subset of X_tr can be selected in a recursive manner, and the modeling process can be greatly sped up through this sparsification, since only a subset of X_tr is used for modeling. For simplicity, we denote the modeling data set as X_md and the remaining data as X_-md. Correspondingly, the latent and kernel matrices of X_md can be represented as T_md and K_md, respectively.
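As an illustration of how such a model might be fitted, the sketch below assembles the RBF kernel of eq 6 and the negative log-likelihood of eq 5 with the nonlinear K, and minimizes it over the latent coordinates with a generic SciPy optimizer as a stand-in for the SCG routine of ref 20 (function names, hyperparameter values, and the PCA initialization are our assumptions, not the paper's):

```python
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(T, theta=1.0, sigma=1.0):
    """RBF kernel of eq 6: k(t_i, t_j) = theta * exp(-||t_i - t_j||^2 / (2*sigma))."""
    sq = np.sum(T**2, axis=1, keepdims=True)
    d2 = sq + sq.T - 2.0 * T @ T.T
    return theta * np.exp(-d2 / (2.0 * sigma))

def gplvm_nll(T_flat, X, k, beta, theta, sigma):
    """Negative log-likelihood of eq 5 with K = K_rbf(T) + beta^{-1} I."""
    n, m = X.shape
    T = T_flat.reshape(n, k)
    K = rbf_kernel(T, theta, sigma) + np.eye(n) / beta
    _, logdet = np.linalg.slogdet(K)
    Kinv = np.linalg.inv(K)
    return 0.5 * (m * n * np.log(2 * np.pi) + m * logdet + np.trace(Kinv @ X @ X.T))

def fit_gplvm(X, k=2, beta=100.0, theta=1.0, sigma=1.0):
    n, _ = X.shape
    # initialize the latent coordinates with linear PCA scores (a common choice)
    U, s, _ = np.linalg.svd(X - X.mean(0), full_matrices=False)
    T0 = U[:, :k] * s[:k]
    res = minimize(gplvm_nll, T0.ravel(), args=(X, k, beta, theta, sigma),
                   method="L-BFGS-B", options={"maxiter": 50})  # numerical gradients
    return res.x.reshape(n, k)

# Usage sketch on random data.
T_md = fit_gplvm(np.random.default_rng(2).standard_normal((40, 6)))
print(T_md.shape)                              # (40, 2)
```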


3.2. Proposed Monitoring Scheme. After the nonlinear GPLVM model has been developed, we are in a position to construct the scheme for nonlinear probabilistic monitoring. In the traditional PPCA model, the distribution of a new data point x_new can easily be presented as p(x_new | P, β) = N(x_new | 0, PP^T + β^{-1}I). However, in the framework of GPLVM, the distribution of x_new must be calculated in a different way. Since the new data point x_new has an associated latent variable t_new, this value should be determined first; again, the SCG optimization method can be used. Denoting the final SCG estimate of t_new as t*_new, the distribution of the new data point x_new is also Gaussian and can be given as

p(x_new | T_md, t*_new) = N(x_new | μ_new, σ_new²)    (7)

where μ_new = X_md^T K_md^{-1} k_md,new and σ_new² = k_new,new - k_md,new^T K_md^{-1} k_md,new; k_md,new is a column vector obtained by evaluating the kernel between the modeling data set and the new data point, and k_new,new is the kernel value between t*_new and itself.

When the latent space has been built, the latent matrix corresponding to the training data set X_tr can be represented as T_tr, and a T² monitoring statistic similar to that of the traditional PPCA method can be constructed as follows:

T_i² = T_tr(i)^T var^{-1}(T_tr) T_tr(i)    (8)

where i represents the ith data sample and var(T_tr) represents the variance of the training latent matrix. The confidence limit of the T² statistic can be determined by a central χ² distribution with significance level γ and k degrees of freedom. When the new latent variable has been determined through the SCG optimization method, its corresponding T² statistic value can be calculated as

T_new² = (t*_new)^T var^{-1}(T_tr) t*_new    (9)

Similarly, the traditional SPE statistic can be developed in the residual space. Note that eq 7 gives a probabilistic result for the new data sample; its mean value can be used as the prediction. Thus the error of the new data sample can be generated as

Er_new = x_new - X_md^T K_md^{-1} k_md,new    (10)

Therefore, the SPE value of the new data sample can be calculated as follows:

SPE_new = Er_new^T Er_new    (11)

The confidence limit of SPE can be determined by a χ²-distributed function as SPE_lim ≈ g · χ²_{γ,h}, where g and h are defined as2

g · h = mean(SPE_tr),   2g² · h = var(SPE_tr)    (12)

where SPE_tr represents the set of SPE values of the training data set X_tr, mean(SPE_tr) is its mean value, and var(SPE_tr) is its variance.
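The sketch below shows one way the quantities of eqs 7-12 could be computed for a single new sample once T_md, K_md, and the optimized t*_new are available; the χ² limits use scipy.stats, and all names (gplvm_monitor, rbf_cross) and the diagonal treatment of var(T_tr) are our assumptions rather than the paper's:

```python
import numpy as np
from scipy.stats import chi2

def rbf_cross(A, B, theta=1.0, sigma=1.0):
    """Cross-kernel of eq 6 between the rows of A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return theta * np.exp(-d2 / (2.0 * sigma))

def gplvm_monitor(x_new, t_new, X_md, T_md, K_md_inv, T_tr, SPE_tr, gamma=0.01):
    """T^2 (eq 9), SPE (eqs 10-11), their limits, and the predictive variance of eq 7."""
    # eq 7: predictive mean and variance of the GP at the optimized latent point t*_new
    k_vec = rbf_cross(T_md, t_new[None, :]).ravel()           # k_md,new
    k_nn = rbf_cross(t_new[None, :], t_new[None, :])[0, 0]    # k_new,new
    mu_new = X_md.T @ K_md_inv @ k_vec
    var_new = k_nn - k_vec @ K_md_inv @ k_vec                 # sigma_new^2 (uncertainty)

    # eq 9: T^2 in the latent space, using the variances of the training scores
    latent_var = np.var(T_tr, axis=0, ddof=1)
    T2_new = float(np.sum(t_new**2 / latent_var))
    T2_lim = chi2.ppf(1.0 - gamma, df=T_tr.shape[1])          # chi-square limit, k dof

    # eqs 10-11: residual and SPE
    err = x_new - mu_new
    SPE_new = float(err @ err)

    # eq 12: g*h = mean(SPE_tr), 2*g^2*h = var(SPE_tr)  ->  SPE_lim = g * chi2(1-gamma, h)
    g = np.var(SPE_tr) / (2.0 * np.mean(SPE_tr))
    h = 2.0 * np.mean(SPE_tr) ** 2 / np.var(SPE_tr)
    SPE_lim = g * chi2.ppf(1.0 - gamma, df=h)

    return T2_new, T2_lim, SPE_new, SPE_lim, var_new
```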

According to the data uncertainty, another new monitoring statistic can be developed. Denote the variances of the training data set as σ²_tr,i (i = 1, 2, ..., n). A control limit at the 1 - γ confidence level should be calculated for this uncertainty. However, the distribution of σ²_tr,i is unknown, so it is difficult to determine the control limit of the data uncertainty directly. Fortunately, the monitoring task for the data uncertainty can be converted into a one-class classification problem, through which the normal process data are labeled as the normal operation region. When a new data sample falls outside this region, it should be judged as a fault.

Among the various one-class classification methods, the recently developed SVDD method is used in this paper. SVDD first maps the data from the original space to a high-dimensional feature space through a mapping function Φ, and then a hypersphere with the minimum volume that includes most of the normal data samples is built by solving the following optimization problem:22

min_{R,a,ξ}  R² + C ∑_{i=1}^{n} ξ_i    (13)

subject to

||Φ(σ²_tr,i) - a||² ≤ R² + ξ_i,  ξ_i ≥ 0

where R and a are the radius and center of the hypersphere, C gives the trade-off between the volume of the hypersphere and the number of errors, and ξ_i is a slack variable that allows some of the training samples to be wrongly classified. To simplify the optimization problem, a kernel function K_svdd(σ²_tr,i, σ²_tr,j) = ⟨Φ(σ²_tr,i), Φ(σ²_tr,j)⟩ is introduced to compute the inner product in the feature space, so the exact form of the mapping function is not needed. The dual form of the optimization problem is given as22

max_{α}  ∑_{i=1}^{n} α_i K_svdd(σ²_tr,i, σ²_tr,i) - ∑_{i=1}^{n} ∑_{j=1}^{n} α_i α_j K_svdd(σ²_tr,i, σ²_tr,j)    (14)

subject to

0 ≤ α_i ≤ C,  ∑_{i=1}^{n} α_i = 1

where the α_i are Lagrange multipliers. To build the SVDD hypersphere for the normal operation region, the center and the radius can be calculated as

a = ∑_{i=1}^{n} α_i Φ(σ²_tr,i)

R = (1 - 2 ∑_{i=1}^{n} α_i K_svdd(σ²_tr,z, σ²_tr,i) + ∑_{i=1}^{n} ∑_{j=1}^{n} α_i α_j K_svdd(σ²_tr,i, σ²_tr,j))^{1/2}    (15)

where σ²_tr,z is one of the support vectors of the SVDD model. For a new data sample, the new uncertainty monitoring statistic UC can be developed as

UC_new = d²(Φ(σ²_tr,new)) = ||Φ(σ²_tr,new) - a||²
       = 1 - 2 ∑_{i=1}^{n} α_i K_svdd(σ²_tr,new, σ²_tr,i) + ∑_{i=1}^{n} ∑_{j=1}^{n} α_i α_j K_svdd(σ²_tr,i, σ²_tr,j)    (16)

The confidence limit of the new UC statistic is given as the square of R, thus UC_lim = R². If UC_new > UC_lim, the new data sample has fallen outside the normal region defined by SVDD.
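A compact sketch of eqs 13-16 applied to the training uncertainties; the dual of eq 14 is solved here with SciPy's SLSQP purely for illustration (a dedicated QP solver would normally be used), and all names, the RBF uncertainty kernel, and the parameter values are our assumptions:

```python
import numpy as np
from scipy.optimize import minimize

def svdd_rbf(u, v, tau=0.01):
    """RBF kernel on scalar uncertainties; K(z, z) = 1, as assumed by eqs 15-16."""
    return np.exp(-(u[:, None] - v[None, :]) ** 2 / (2.0 * tau ** 2))

def fit_svdd(sigma2_tr, C=2.0, tau=0.01):
    """Solve the dual of eq 14 (negated for minimization) and return alpha and R^2."""
    K = svdd_rbf(sigma2_tr, sigma2_tr, tau)
    n = len(sigma2_tr)
    obj = lambda a: -(a @ np.diag(K) - a @ K @ a)
    jac = lambda a: -(np.diag(K) - 2.0 * K @ a)
    cons = [{"type": "eq", "fun": lambda a: np.sum(a) - 1.0}]
    res = minimize(obj, np.full(n, 1.0 / n), jac=jac, bounds=[(0.0, C)] * n,
                   constraints=cons, method="SLSQP")
    alpha = res.x
    # eq 15: radius from a boundary support vector (0 < alpha_z < C)
    cand = np.where((alpha > 1e-6) & (alpha < C - 1e-6))[0]
    sv = cand[0] if cand.size else int(np.argmax(alpha))
    R2 = 1.0 - 2.0 * alpha @ K[sv] + alpha @ K @ alpha
    return alpha, R2

def uc_statistic(sigma2_new, sigma2_tr, alpha, tau=0.01):
    """eq 16: squared distance of the new uncertainty to the SVDD center."""
    K = svdd_rbf(sigma2_tr, sigma2_tr, tau)
    k_new = svdd_rbf(np.atleast_1d(float(sigma2_new)), sigma2_tr, tau).ravel()
    return 1.0 - 2.0 * alpha @ k_new + alpha @ K @ alpha

# Usage sketch: UC_new > UC_lim (= R^2) flags the sample as abnormal.
sig_tr = np.abs(np.random.default_rng(3).normal(1.0, 0.1, 100))
alpha, UC_lim = fit_svdd(sig_tr)
print(uc_statistic(1.8, sig_tr, alpha) > UC_lim)   # a far-away uncertainty should be flagged
```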

3.3. Discussions and Some Remarks. In the present study, a new interpretation of the traditional PPCA method has been provided, which is based on the Gaussian process framework. Through this new interpretation, the PPCA method can easily be extended to its nonlinear form. Under the new nonlinear probabilistic model structure, the uncertainty of each data sample can be obtained. Different from the PPCA method, in which all data samples share an equal uncertainty value, the new GPLVM method gives specific uncertainties for different data samples. Hence, a new monitoring statistic has been constructed, with which process faults can also be detected. Intuitively, the more uncertain a data sample is, the higher the probability that it is abnormal. In section 4, it can be found that the data uncertainty value correctly reflects the fault behavior, and also detects the fault efficiently.

4. Case Studies

In this section, two case studies are presented to evaluate the efficiency of the proposed monitoring method. The first one is the well-known TE benchmark process, which has been widely used for algorithm testing and evaluation in the past decades.23,24 The second one is a real polypropylene production process.

4.1. TE Benchmark Process. The TE benchmark process consists of 41 measured variables and 12 manipulated variables; a set of 21 programmed faults can be simulated in this process. The flowchart of this process is given in Figure 1, and a more detailed description can be found in ref 25. In the present paper, 16 continuous process variables are selected for monitoring purposes; they are listed in Table 1.

Figure 1. TE benchmark process.

Table 1. Monitoring Variables in the TE Process

no.  measured variable
1    A feed
2    D feed
3    E feed
4    A and C feed
5    recycle flow
6    reactor feed rate
7    reactor temperature
8    purge rate
9    product separator temperature
10   product separator pressure
11   product separator underflow
12   stripper pressure
13   stripper temperature
14   stripper steam flow
15   reactor cooling water outlet temperature
16   separator cooling water outlet temperature

To build the model for monitoring, a total of 960 data samples have been collected under normal operation conditions; the sampling interval is 3 min. For comparison, the GPLVM model and the traditional PPCA model are constructed simultaneously, with five principal components selected in both models. First, normal process data sets are tested; both methods show good performance, and the Type I errors of both methods are very low. To evaluate the methods with statistical significance, 100 Monte Carlo simulations have been executed, and the mean detection rates of all 21 process faults are given in Table 2. Because a fault is considered detected if either of the two monitoring statistics exceeds its control limit, the overall detection rates of all 21 faults are also provided in Table 2. Judging from this table, we can find that the monitoring performance is improved by the new GPLVM method; among all 21 process faults, the traditional PPCA performs better in only five fault cases. In addition, the detection times of all of these faults are tabulated in Table 3, from which it can be found that GPLVM detects the fault earlier than PPCA in most cases. In particular, the monitoring results of fault 5 for both the GPLVM and PPCA methods are given in Figure 2. It can be seen that the monitoring performance for this fault has been improved by the new method.

Table 2. Mean Detection Rates of 21 Faults in TE Process

fault  GPLVM_T2  GPLVM_SPE  GPLVM_Overall  PPCA_T2  PPCA_SPE  PPCA_Overall
1      0.995     0.993      0.995          0.990    0.994     0.994
2      0.974     0.988      0.988          0.935    0.984     0.984
3      0.021     0.001      0.021          0.001    0.010     0.013
4      0.014     0.000      0.014          0.000    0.006     0.006
5      0.235     0.199      0.238          0.203    0.219     0.226
6      1.000     0.995      1.000          0.991    1.000     1.000
7      0.379     0.326      0.401          0.320    0.300     0.349
8      0.935     0.975      0.975          0.873    0.968     0.974
9      0.021     0.001      0.021          0.000    0.019     0.019
10     0.329     0.246      0.381          0.136    0.273     0.306
11     0.151     0.060      0.171          0.040    0.368     0.368
12     0.978     0.983      0.984          0.945    0.975     0.985
13     0.939     0.949      0.950          0.916    0.946     0.946
14     0.958     0.788      0.996          0.820    1.000     1.000
15     0.040     0.041      0.070          0.005    0.013     0.018
16     0.226     0.141      0.268          0.061    0.141     0.169
17     0.824     0.984      0.985          0.754    0.945     0.945
18     0.891     0.910      0.911          0.889    0.899     0.899
19     0.013     0.000      0.013          0.000    0.245     0.245
20     0.333     0.099      0.355          0.081    0.306     0.310
21     0.305     0.204      0.306          0.175    0.418     0.418

Table 3. Fault Detection Times of 21 Faults in TE Process (Hour)

fault  GPLVM_T2  GPLVM_SPE  GPLVM_Overall  PPCA_T2  PPCA_SPE  PPCA_Overall
1      8.450     8.150      8.150          8.450    8.250     8.250
2      9.750     8.650      8.650          9.850    8.700     8.700
3      9.750     26.200     9.750          26.200   10.150    10.150
4      8.050     8.050      8.050          8.050    8.050     8.050
5      8.050     8.100      8.050          8.050    8.050     8.050
6      8.050     8.150      8.050          8.150    8.050     8.050
7      8.050     8.050      8.050          8.050    8.050     8.050
8      8.800     8.550      8.550          9.550    8.750     8.750
9      8.150     19.750     8.150          8.050    8.050     8.050
10     8.250     9.650      8.250          12.500   8.400     8.400
11     8.200     9.300      8.200          12.850   8.350     8.350
12     8.150     8.150      8.150          8.150    8.150     8.150
13     10.400    10.400     10.400         10.550   9.950     9.950
14     8.100     8.350      8.100          8.200    8.050     8.050
15     10.350    39.500     10.350         39.550   11.150    11.150
16     8.050     23.550     8.050          10.000   9.200     9.200
17     9.300     9.550      9.300          9.400    8.050     8.050
18     8.400     12.450     8.400          12.450   9.250     9.250
19     8.550     8.550      8.550          8.550    8.550     8.550
20     11.400    12.550     11.400         12.550   12.450    12.450
21     28.850    39.200     28.850         36.700   9.700     9.700

Although the detection rate of the SPE statistic has not been improved, the value of the new SPE statistic is much larger than that of PPCA, which means the GPLVM method is more sensitive to this fault. Similarly, one realization of both methods for fault 10 is shown in Figure 3; again, it can be judged that the monitoring performance for this fault has been improved. Next, the data uncertainties of these two selected faults are presented in Figure 4a and Figure 4b, respectively. As can be seen from Figure 4a, the data uncertainty is greatly enlarged between samples 160 and 350, which is in accordance with the fault behavior shown in Figure 2a. Similarly, the fault behavior can also be judged from the uncertainty information for fault 10, which is clearly presented in Figure 4b. To examine the fault detection performance of the newly developed monitoring statistic, the SVDD model is constructed, with its two parameters selected as C = 2.0 and τ = 0.01 (the width of the SVDD kernel function). The monitoring results of both faults are given in Figure 5. Compared to the monitoring results shown in Figure 2a and Figure 3a, it can be found that the data uncertainties of both faults correctly reflect the process behavior.

Figure 2. Monitoring results of fault 5: (a) GPLVM; (b) PPCA.
Figure 3. Monitoring results of fault 10: (a) GPLVM; (b) PPCA.
Figure 4. Data uncertainty information: (a) fault 5; (b) fault 10.
Figure 5. Monitoring based on the data uncertainty statistic: (a) fault 5; (b) fault 10.

4.2. Polypropylene Production Process. Polypropylene is widely used in the light and chemical industries. To produce different brands of product, three reactors are connected in series. The flowchart of the polypropylene production process is given in Figure 6. The whole process includes four major units, and over 40 variables are measured, among which 14 important and highly correlated variables are selected in this study; they are listed in Table 4.

Figure 6. Flowchart of the polypropylene production process.

Table 4. Monitoring Variables in the Polypropylene Production Process

no.  measured variable
1    hydrogen concentration of the first reactor
2    hydrogen concentration of the second reactor
3    density of the first reactor
4    density of the second reactor
5    TEAL flow
6    DONOR flow
7    Atmer-163 flow
8    propylene feed of the first reactor
9    propylene feed of the second reactor
10   power for the first reactor
11   power for the second reactor
12   level of the second reactor
13   temperature of the first reactor
14   temperature of the second reactor

To construct the GPLVM and PPCA models, a normal data set containing 500 samples has been collected, and five principal components have been selected for both models. After testing on a normal data set, both monitoring methods show good performance. Then, a fault data set consisting of 200 samples is used to evaluate the feasibility and efficiency of the proposed method. This fault is a step change of the TEAL flow starting at sample 101. The monitoring results of this fault by GPLVM and PPCA are given in Figure 7a and Figure 7b, respectively. It can be seen from this figure that the detection rate for this fault has been improved by the T2 statistic of GPLVM. Comparing the SPE monitoring results of both methods, it can be found that the SPE statistic values of GPLVM are much larger than those of PPCA, which means the sensitivity of the GPLVM method is higher than that of PPCA. Similarly, the data uncertainty of this fault is presented in Figure 8a. It can be seen from this figure that the process changed after data sample 101, since the uncertainty of the data is greatly enlarged. Correspondingly, the monitoring result of the UC statistic is given in Figure 8b. From this result, we can also determine the fault behavior of the process, because the value of the UC statistic exceeds its confidence limit after sample 101.

Figure 7. Monitoring results of the process fault: (a) GPLVM; (b) PPCA.
Figure 8. Monitoring based on the data uncertainty statistic: (a) uncertainty information; (b) uncertainty statistic.

Again, the consistency between the traditional monitoring statistics and the new UC statistic is obtained, as can easily be seen by comparing Figure 7a with Figure 8b.

5. Conclusions

A new nonlinear probabilistic monitoring method has been proposed, which is based on the recently developed GPLVM method. Through the Gaussian process approach, the traditional PPCA can be interpreted in an alternative way that is easily extended to the nonlinear form. Because of the relationship between GPLVM and PPCA, GPLVM can be regarded as a new nonlinear probabilistic PCA approach. Under this new probabilistic framework, two traditional monitoring statistics have been constructed. Additionally, one more monitoring statistic has been developed, which is based on the uncertainty of the data samples. Different from the two traditional statistics, the confidence limit of the new statistic is determined by a one-class classification method. The feasibility and efficiency of the proposed method have been evaluated through two case studies, from which one can find that the monitoring performance and the interpretation of the process behavior are improved by the new method.

Acknowledgment

This work was supported by the National Natural Science Foundation of China (60974056) and the China Postdoctoral Science Foundation (20090461370).

Literature Cited

(1) Kruger, U.; Chen, Q.; Sandoz, D. J.; McFarlane, R. C. Extended PLS approach for enhanced condition monitoring of industrial processes. AIChE J. 2001, 47, 2076–2091.
(2) Qin, S. J. Statistical process monitoring: basics and beyond. J. Chemom. 2003, 17, 480–502.
(3) Chen, Q.; Kruger, U.; Meronk, M.; Leung, A. Y. T. Synthesis of T2 and Q statistics for process monitoring. Control Eng. Pract. 2004, 12, 745–755.
(4) Simoglou, A.; Georgieva, P.; Martin, E. B.; Morris, A. J.; Feyo de Azevedo, S. On-line monitoring of a sugar crystallization process. Comput. Chem. Eng. 2005, 29, 1411–1422.
(5) Zhang, Y. W.; Qin, S. J. Improved nonlinear fault detection technique and statistical analysis. AIChE J. 2008, 54, 3207–3220.
(6) Kruger, U.; Dimitriadis, G. Diagnosis of process faults in chemical systems using a local partial least squares approach. AIChE J. 2008, 54, 2581–2596.
(7) AlGhazzawi, A.; Lennox, B. Monitoring a complex refining process using multivariate statistics. Control Eng. Pract. 2008, 16, 294–307.
(8) Zhao, C. H.; Wang, F. L.; Mao, Z. Y.; Lu, N. Y.; Jia, M. X. Adaptive monitoring based on independent component analysis for multiphase batch processes with limited modeling data. Ind. Eng. Chem. Res. 2008, 47, 3104–3113.
(9) Chen, T.; Sun, Y. Probabilistic contribution analysis for statistical process monitoring: A missing variable approach. Control Eng. Pract. 2009, 17, 469–477.
(10) Zhao, C. H.; Wang, F. L.; Gao, F. R. Improved calibration investigation using phase-wise local and cumulative quality interpretation and prediction. Chemom. Intell. Lab. Syst. 2009, 95, 107–121.
(11) Kim, D.; Lee, I. B. Process monitoring based on probabilistic PCA. Chemom. Intell. Lab. Syst. 2003, 67, 109–123.


(12) Bishop, C. M.; Svensen, M.; Williams, C. K. I. The generative topographic mapping. Neural Comput. 1998, 10, 215–234.
(13) Bose, I.; Chen, X. A method for extension of generative topographic mapping for fuzzy clustering. J. Am. Soc. Inf. Sci. Technol. 2009, 60, 363–371.
(14) Ge, Z. Q.; Song, Z. H. A nonlinear probabilistic method for process monitoring. Ind. Eng. Chem. Res. 2009, 49, 1770–1778.
(15) Lawrence, N. D. Probabilistic nonlinear principal component analysis with Gaussian process latent variable models. J. Mach. Learn. Res. 2005, 6, 1783–1816.
(16) Tipping, M. E.; Bishop, C. M. Mixtures of probabilistic principal component analyzers. Neural Comput. 1999, 11, 443–482.
(17) Rasmussen, C. E.; Williams, C. K. I. Gaussian Processes for Machine Learning; MIT Press: Cambridge, MA, 2006.
(18) Chen, T.; Morris, A. J.; Martin, E. Gaussian process regression for multivariate spectroscopic calibration. Chemom. Intell. Lab. Syst. 2007, 87, 59–71.
(19) Li, X. L.; Su, H. Y.; Chu, J. Multiple model soft sensor based on affinity propagation, Gaussian process, and Bayesian committee machine. Chin. J. Chem. Eng. 2009, 17, 95–99.
(20) Møller, M. F. A scaled conjugate gradient algorithm for fast supervised learning. Neural Networks 1993, 6, 525–533.
(21) Lawrence, N. D.; Seeger, M.; Herbrich, R. Fast sparse Gaussian process methods: The informative vector machine. Adv. Neural Inf. Process. Syst. 2003, 15, 625–632.
(22) Tax, D. M. J.; Duin, R. P. W. Support vector data description. Mach. Learn. 2004, 54, 45–66.
(23) Chiang, L. H.; Russell, E. L.; Braatz, R. D. Fault Detection and Diagnosis in Industrial Systems; Springer-Verlag: London, 2001.
(24) Kano, M.; Nagao, K.; Hasebe, H.; Hashimoto, I.; Ohno, H.; Strauss, R.; Bakshi, B. R. Comparison of multivariate statistical process monitoring methods with applications to the Eastman challenge problem. Comput. Chem. Eng. 2002, 26, 161–174.
(25) Downs, J. J.; Vogel, E. F. A plant-wide industrial process control problem. Comput. Chem. Eng. 1993, 17, 245–255.

Received for review December 8, 2009. Revised manuscript received February 23, 2010. Accepted April 6, 2010.

IE9019402