
Ind. Eng. Chem. Res. 2007, 46, 7780-7787

Fault Detection of Nonlinear Processes Using Multiway Kernel Independent Component Analysis

Yingwei Zhang*,†,‡ and S. Joe Qin‡

†Key Laboratory of Integrated Automation of Process Industry, Ministry of Education, Northeastern University, Shenyang, Liaoning 110004, People's Republic of China
‡Department of Chemical Engineering, University of Texas at Austin, Austin, Texas 78712

In this paper, a new nonlinear process monitoring method based on multiway kernel independent component analysis (MKICA) is developed. Its basic idea is to use MKICA to extract dominant independent components that capture nonlinearity from normal operating process data and to combine them with statistical process monitoring techniques. The proposed method is applied to fault detection in a fermentation process and is compared with modified independent component analysis (MICA). The applications indicate that MKICA effectively captures the nonlinear relationships among the process variables and shows superior fault detectability compared to MICA.

1. Introduction

Many chemical, pharmaceutical, and biological products are produced in batch processes, and many other manufacturing processes are batch in nature.1-15 Because of the complexity of batch processes, small changes in the operations during critical periods may degrade the quality and yield of the final product. Furthermore, product quality variables, which are the key indicators of process performance, are often examined off-line in a laboratory.16-38 Therefore, it is desirable to detect faults on-line. The development of effective methods for on-line monitoring and fault diagnosis of batch processes would significantly improve product quality, because such methods would enable the detection of faults during process operation, making it possible to correct a fault either prior to completion of the batch or before the production of subsequent batches. Such early detection and correction of problems in batch processes would reduce the number of rejected batches.

Several techniques based on multivariate statistical analysis have been proposed for on-line monitoring and fault detection in batch processes.6-13 Research groups led by MacGregor6,8,9 and Gallagher10 presented multiway principal component analysis (MPCA) by extending multivariate statistical process control.
Qin and co-workers5,11 proposed multiway partial least-squares (MPLS) with recursive least squares (RLS) for monitoring processes for which both the process data and the product quality are available. Nomikos and MacGregor7 proposed estimating the missing data on trajectory deviations from the current time until the end of the batch. These approaches have worked quite well in practice. However, it would be desirable to have MPCA or MPLS monitoring approaches that do not depend on filling in these missing data. Dong and McAvoy12 used nonlinear principal component analysis (NLPCA), based on principal curves and neural networks, to monitor batch processes. NLPCA approaches are based on neural networks. Artificial neural networks (ANNs) have poor process interpretability and are hindered by problems associated with weight optimization, such as slow learning and local minima.33 As an alternative learning strategy, Schölkopf et al.34 adopted support vector machine (SVM) learning. Lee and co-workers used multilayer perceptrons (MLPs) to form a global approximation of a nonlinear mapping, whereas radial basis function networks (RBFNs) construct a local approximation using radial basis functions, which are exponentially decaying nonlinear functions.35 Rännar et al.13 suggested an adaptive batch monitoring method using hierarchical principal component analysis (PCA) to overcome the need, in PCA, to estimate the missing data on trajectory deviations from the current time until the end of the batch. Recently, kernel PCA has also been used to extract nonlinear relations.14,15 These methods have proven to be very powerful for analyzing historical data from past production and diagnosing operating problems; they have also proven to be very effective for the on-line monitoring of new batches.16-22 However, the previously mentioned methods perform poorly when they are applied to industrial chemical process data that have non-Gaussian distributions.

More recently, several statistical process monitoring methods based on independent component analysis (ICA) have been proposed.23-27 ICA provides more meaningful statistical analysis and on-line monitoring, because ICA assumes that the latent variables do not have a Gaussian distribution, which involves higher-order statistics; that is, not only does it decorrelate the data based on second-order statistics, but it also reduces higher-order statistical dependencies. Hence, independent components (ICs) reveal more useful information on higher-order statistics from observed data than principal components (PCs).28,29 For some complicated cases in industrial chemical and environmental processes, especially those with nonlinear characteristics, kernel independent component analysis (KICA) has recently emerged to solve nonlinear problems.28 KICA can efficiently compute ICs in high-dimensional feature spaces using the kernel matrix K. The basic idea of KICA is, first, to map the input space into a feature space via a nonlinear map and then to extract the ICs in the feature space. KICA extracts several dominant ICs from multivariate data. First, it estimates the variance of the dominant ICs and their directions using PCA, and then it performs conventional ICA to update the dominant ICs while maintaining the variance.

In this paper, a new nonlinear process monitoring method based on multiway kernel independent component analysis (MKICA) is developed. The organization of the paper is as follows. Conventional ICA monitoring is briefly reviewed in the next section, followed by a brief introduction to the KICA algorithm. The MKICA algorithm and its application to process monitoring are then proposed. The performance of process monitoring using MKICA is illustrated through an example (a fermentation process). Conclusions are given at the end of the paper.

* To whom correspondence should be addressed. Tel.: +1-512-471-4417. Fax: +1-512-471-7060. E-mail address: zhangyingwei@ise.neu.edu.cn.
† Key Laboratory of Integrated Automation of Process Industry, Ministry of Education, Northeastern University.
‡ Department of Chemical Engineering, University of Texas at Austin.

10.1021/ie070381q CCC: $37.00 © 2007 American Chemical Society. Published on Web 10/12/2007.

2. Independent Component Analysis (ICA)

Several different algorithms for ICA have been proposed. The most well-known ICA algorithms are based on neural networks, higher-order statistics, and minimum mutual information. We briefly review the fast and robust fixed-point ICA algorithm developed by Hyvärinen and Oja.30-32

Suppose that d measured variables (x1, x2, ..., xd) can be expressed as linear combinations of m unknown independent components s1, s2, ..., sm (where m ≤ d). The ICs and the measured variables have zero mean. If we denote the random column vectors as x = [x1, x2, ..., xd]^T and s = [s1, s2, ..., sm]^T, the relationship between them is given by

x = A s    (1)

where A = [a1, ..., am] ∈ R^(d×m) is the unknown mixing matrix. The basic problem of ICA is to estimate both the mixing matrix A and the independent components s from only the observed data x. This is equivalent to finding a demixing matrix W such that the elements of the reconstructed vector ŝ, given as

ŝ = W x    (2)

become as independent of each other as possible. For convenience, we assume that d = m and that the ICs have unit variance: E(ss^T) = I. The initial step in ICA is whitening, which eliminates all the cross correlation between random variables. This transformation can also be accomplished by classical PCA. The whitening transformation is expressed as

z = Λ^(-1/2) U^T x = Q x    (3)

where Q is the whitening matrix (Q = Λ^(-1/2) U^T), and U (the orthogonal matrix of eigenvectors) and Λ (the diagonal matrix of its eigenvalues) are generated from the eigen-decomposition of the covariance matrix (E(xx^T) = U Λ U^T). After the transformation, we have

z = Q x = Q A s = B s    (4)

where B = Q A is an orthogonal matrix, because E(zz^T) = B E(ss^T) B^T = B B^T = I. We then can estimate s from eq 4 as follows:

ŝ = B^T z = B^T Q x    (5)

From eqs 2 and 5, the relation between W and B can be expressed as

W = B^T Q    (6)

To calculate B, each column vector b_i is randomly initialized and then updated so that the ith independent component ŝ_i = (b_i)^T z has maximum non-Gaussianity. The objective that the elements of ŝ be statistically independent can be reflected by their non-Gaussianity. Negentropy, which is a common measure of non-Gaussianity, is based on the information-theoretic quantity of differential entropy. Note that nonlinearity of the data does not imply that the data distribution is non-Gaussian; it may still be Gaussian.36 Hyvärinen introduced the following fast and robust fixed-point algorithm based on an approximation of negentropy:30-32

b_i ← E{z g(b_i^T z)} − E{g′(b_i^T z)} b_i    (7)

where g′ is the first-order derivative of g, and Hyvärinen suggested the following three functions for g:30-32

g1(u) = tanh(a1 u)    (8)

g2(u) = u exp(−a2 u²/2)    (9)

g3(u) = u³    (10)
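For concreteness, the whitening transformation of eq 3 and the fixed-point update of eq 7 can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation; it assumes the g1 nonlinearity with a1 = 1, and the function names are ours:

```python
import numpy as np

def whiten(x):
    """Whiten zero-mean data x (d x n): z = Lambda^(-1/2) U^T x = Q x (eq 3)."""
    lam, U = np.linalg.eigh(np.cov(x))        # eigen-decomposition of E(xx^T)
    Q = np.diag(lam ** -0.5) @ U.T            # whitening matrix (eq 3)
    return Q @ x, Q

def fastica_component(z, max_iter=200, tol=1e-8):
    """Estimate one column b_i of B by the fixed-point iteration of eq 7,
    using g(u) = tanh(u), g'(u) = 1 - tanh(u)^2."""
    d, n = z.shape
    rng = np.random.default_rng(0)
    b = rng.standard_normal(d)
    b /= np.linalg.norm(b)                    # random unit initialization
    for _ in range(max_iter):
        u = b @ z                             # b^T z for every sample
        # b <- E{z g(b^T z)} - E{g'(b^T z)} b   (eq 7)
        b_new = (z * np.tanh(u)).mean(axis=1) - (1 - np.tanh(u) ** 2).mean() * b
        b_new /= np.linalg.norm(b_new)
        if abs(abs(b_new @ b) - 1.0) < tol:   # direction has stopped changing
            return b_new
        b = b_new
    return b
```

After whitening, `np.cov(z)` is the identity matrix, as required by eq 4; deflation over several components would add the orthogonalization step discussed later for MICA.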

3. Kernel Independent Component Analysis (KICA)

In this section, KICA is derived to extract statistically independent components that also capture the nonlinearity of the observed data.28 The idea is to map the data nonlinearly into a feature space where the data have a more-linear structure. The modified ICA29 is then applied in the feature space, and the extracted components of the data are made as independent as possible. As the central algorithm, we use "kernel tricks" to extract whitened PCs in the high-dimensional feature space and ultimately convert the problem of performing modified ICA (MICA) in the feature space into a problem of implementing MICA in the kernel principal component analysis (KPCA) transformed space. Consider a nonlinear mapping

Φ: R^m → F (feature space)    (11)

We first map the input space into a feature space via this nonlinear mapping and make the covariance structure of the mapped data an identity matrix, to make the problem of ICA estimation simpler and better-conditioned. (The detailed algorithm will be explained later.) Our objective then is to determine a linear operator W^F in the feature space F to recover the ICs from Φ(x) via the following linear transformation:

y = W^F Φ(x)    (12)

where E{Φ(x) Φ(x)^T} = I.

3.1. Whitening of Data in Feature Space Using KPCA. Consider the observed data

x_k ∈ R^m (for k = 1, ..., N)

where N is the number of observations. Using the nonlinear mapping Φ: R^m → F, the observed data in the original space are extended into the high-dimensional feature space, Φ(x_k) ∈ F. The covariance matrix in the feature space then is

C^F = (1/N) Σ_{j=1}^{N} Φ(x_j) Φ(x_j)^T    (13)


where Φ(x_k) (for k = 1, ..., N) is assumed to have zero mean and unit variance, which will be explained later. Let Θ = [Φ(x1), ..., Φ(xN)]; C^F then can be expressed as C^F = (1/N) Θ Θ^T. We can obtain PCs in the feature space by finding the eigenvectors of C^F. Instead of eigendecomposing C^F directly, we can alternatively find the PCs using the "kernel trick". If we define an N × N Gram kernel matrix K by

[K]_ij = K_ij = ⟨Φ(x_i), Φ(x_j)⟩ = k(x_i, x_j)    (14)

we have K = Θ^T Θ. The use of a kernel function k(x_i, x_j) allows us to compute inner products in F without performing the nonlinear mappings. That is, we can avoid performing the nonlinear mappings and computing inner products in the feature space by introducing a kernel function of the form k(x, y) = ⟨Φ(x), Φ(y)⟩. Some of the most widely used kernel functions are the radial basis kernel,

k(x, y) = exp(−||x − y||²/c)

the polynomial kernel,

k(x, y) = ⟨x, y⟩^r

and the sigmoid kernel,

k(x, y) = tanh(β0 ⟨x, y⟩ + β1)

where c, r, β0, and β1 must be specified. A specific choice of kernel function implicitly determines the mapping Φ and the feature space F.

From the kernel matrix K, the mean centering and variance scaling of Φ(x_k) in the high-dimensional space can be performed as follows. The mean-centered kernel matrix K̃ can be easily obtained from

K̃ = K − 1_N K − K 1_N + 1_N K 1_N    (15)

where 1_N ∈ R^(N×N) is the matrix whose elements are all equal to 1/N. Also, the variance scaling of the kernel matrix can be done using the following equation:15

K̃_scl = K̃ / (trace(K̃)/N)    (16)

If we apply eigenvalue decomposition to K̃_scl,

λ r = K̃_scl r    (17)

we can obtain the orthonormal eigenvectors r1, r2, ..., rd of K̃_scl corresponding to the d largest positive eigenvalues λ1 ≥ λ2 ≥ ... ≥ λd. Theoretically, the number of nonzero eigenvalues is equal to the hyperdimension. In this paper, we empirically determine the hyperdimension as the number of eigenvalues that satisfy the relation

λ_i / Σ_i λ_i > 0.001

The d largest positive eigenvalues of C^F then are λ1/N, λ2/N, ..., λd/N, and the associated orthonormal eigenvectors v1, v2, ..., vd can be expressed as

v_j = (1/√λ_j) Θ r_j (for j = 1, ..., d)    (18)

The eigenvector matrix V = [v1, v2, ..., vd] can be briefly expressed in matrix form as

V = Θ H Λ^(-1/2)    (19)

where H = [r1, r2, ..., rd] and Λ = diag(λ1, λ2, ..., λd). V then makes the covariance matrix C^F become a diagonal matrix:

C^F = V diag(λ1/N, λ2/N, ..., λd/N) V^T = (1/N) V Λ V^T    (20)

Let

P = V ((1/N) Λ)^(-1/2) = √N Θ H Λ^(-1)    (21)

Then

P^T C^F P = I    (22)

Thus, we obtain the whitening matrix P, and the mapped data in the feature space can be whitened using the following transformation:

z = P^T Φ(x)    (23)

In detail,

z = P^T Φ(x) = √N Λ^(-1) H^T Θ^T Φ(x) = √N Λ^(-1) H^T [Φ(x1), ..., Φ(xN)]^T Φ(x) = √N Λ^(-1) H^T [k̃_scl(x1, x), ..., k̃_scl(xN, x)]^T    (24)
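The kernel computation, centering, scaling, and whitening of eqs 14-24 can be sketched as follows for the training data. This is an illustrative sketch under our own naming, using the radial basis kernel; the kernel width `c` is an assumed input, and the 0.001 eigenvalue cutoff follows the text:

```python
import numpy as np

def rbf_kernel(X, c=1.0):
    """Gram matrix K_ij = exp(-||x_i - x_j||^2 / c)  (eq 14), X is N x m."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / c)

def whitened_scores(X, c=1.0, cutoff=1e-3):
    """Return the d x N matrix of whitened feature-space scores z (eq 24)."""
    N = X.shape[0]
    K = rbf_kernel(X, c)
    one_n = np.ones((N, N)) / N
    K_c = K - one_n @ K - K @ one_n + one_n @ K @ one_n   # mean centering (eq 15)
    K_s = K_c / (np.trace(K_c) / N)                       # variance scaling (eq 16)
    lam, R = np.linalg.eigh(K_s)                          # eigendecomposition (eq 17)
    lam, R = lam[::-1], R[:, ::-1]                        # descending eigenvalue order
    keep = lam / lam.sum() > cutoff                       # hyperdimension rule
    lam, H = lam[keep], R[:, keep]
    # z_k = sqrt(N) Lambda^(-1) H^T k_scl(., x_k)  (eq 24), all training points at once
    return np.sqrt(N) * (np.diag(1.0 / lam) @ H.T @ K_s)
```

As a sanity check, the sample second moment of the scores, (1/N) Z Z^T, equals the identity matrix, which is exactly the whitening property E{zz^T} = I noted in the text.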

In fact, z is the same as the whitened KPCA score vector that satisfies E{zz^T} = I.

3.2. Further Processing Using the Modified ICA.29 The goal of this step is to extract ICs from the KPCA-transformed space. In the central part of this step, we apply the MICA of Lee and Qin.29 To be suitable for process monitoring, Lee and Qin proposed MICA to extract some dominant ICs from the observed data. Compared to conventional ICA, the MICA algorithm can extract the few dominant factors that are needed to conduct process monitoring, reduce the computational load, consider the ordering of the ICs, and give a consistent solution. From z ∈ R^d, MICA finds p (p ≤ d) dominant ICs, y, that satisfy E{yy^T} = D = diag{λ1, ..., λp} by maximizing the non-Gaussianity of the elements of y, using

y = C^T z    (25)

where C ∈ R^(d×p) and C^T C = D. The requirement E{yy^T} = D reflects that the variance of each element of y is the same as that of the scores in KPCA. Similar to PCA, MICA can rank the ICs according to their variances. If we define the normalized ICs, y_n, as

y_n = D^(-1/2) y = D^(-1/2) C^T z = C_n^T z    (26)

it is clear that D^(-1/2) C^T = C_n^T, C_n^T C_n = I, and E{y_n y_n^T} = I. Although z is not independent, it can be a good initial value of y_n, because it has removed the statistical dependencies of the data up to the second order (mean and variance). Therefore, we can set the initial matrix of C_n^T to be29

C_n^T = [I_p 0]    (27)

where I_p is the p-dimensional identity matrix and 0 is the p × (d − p) zero matrix. When C_n has been determined, the demixing matrix W and the mixing matrix A can be obtained from29

W = D^(1/2) C_n^T Q = D^(1/2) C_n^T Λ^(-1/2) V^T    (28)

A = V Λ^(1/2) C_n D^(-1/2)    (29)

where W A = I_m. To calculate C_n, each column vector c_{n,i} is initialized and then updated so that the ith independent component y_{n,i} = (c_{n,i})^T z has maximum non-Gaussianity. The objective function, that y_{n,i} (for i = 1, ..., p) be statistically independent, is equivalent to maximizing the non-Gaussianity. Hyvärinen's research30-32 introduced a flexible and reliable approximation of negentropy as a measure of non-Gaussianity:

J(y) ≈ [E{G(y)} − E{G(v)}]²

where y is assumed to be of zero mean and unit variance, v is a Gaussian variable of zero mean and unit variance, and G is any nonquadratic function. The detailed algorithm is given below:

(1) Choose p, the number of ICs to estimate. Set the counter i ← 1.
(2) Take the initial vector c_{n,i} to be the ith row of the matrix in eq 27.
(3) Maximize the approximated negentropy by letting c_{n,i} ← E{z g(c_{n,i}^T z)} − E{g′(c_{n,i}^T z)} c_{n,i}, where g is the first derivative and g′ is the second derivative of G.
(4) Perform the orthogonalization to exclude the information contained in the solutions already found:

c_{n,i} ← c_{n,i} − Σ_{j=1}^{i−1} (c_{n,i}^T c_{n,j}) c_{n,j}

(5) Normalize c_{n,i} ← c_{n,i}/||c_{n,i}||.
(6) If c_{n,i} has not converged, go back to step 3.
(7) If c_{n,i} has converged, output the vector c_{n,i}. Then, if i ≤ p, set i ← i + 1 and go back to step 2.

After C_n has been determined, the kernel ICs are obtained from

y = D^(1/2) C_n^T z = D^(1/2) C_n^T P^T Φ(x)    (30)

This equation is the final realization of eq 2.

4. Multiway Kernel Independent Component Analysis (MKICA)

In this section, the MKICA approach is presented and used to analyze and monitor batch process data. Batch data are typically reported in terms of batch numbers, variables, and times. The data are arranged into a three-dimensional matrix X(I × J × K), where I is the number of batches, J is the number of variables, and K is the number of times sampled in each batch. This matrix can be decomposed using various three-way techniques (for example, MKICA). MKICA is equivalent to performing ordinary KICA on a large two-dimensional matrix X, which is constructed by unfolding the three-way data.

4.1. On-line Batch Monitoring Based on MKICA. For on-line batch monitoring using MKICA, we know only the values from the beginning of the batch to the current time. For on-line monitoring, the test data must be completed until the end of the batch. These remaining trajectories can be estimated in many ways:38 (i) use ad hoc approaches, such as using the average trajectories or maintaining the current deviations (CD) of the scaled trajectories for the remainder of the batch, or (ii) use more-powerful statistical estimation approaches based on the PCA or PLS models (missing-data approaches). The latter missing-data approaches are capable of providing excellent predictions of the process variable trajectories by optimally using all of the available data and the knowledge of the time-varying covariance structure among all of the variables over the entire batch, provided by the PCA or PLS models of the batch. The choice of the most suitable approach depends on the characteristics of the batch process. In this article, the missing data are filled in using the current values.

The detailed procedures of the off-line and on-line monitoring methods that use the MKICA model to supervise the batch process are as follows.

4.1.1. Developing the Normal Operating Condition (NOC). Development of the normal operating condition (NOC) involves the following steps:

(1) Acquire an operating data set during normal batch operation.
(2) Unfold X(I × J × K) to X(I × JK), using a batch-wise unfolding scheme, i.e., a set of JK-dimensional scaled normal operating data x_k ∈ R^(JK), for k = 1, ..., I.
(3) Normalize the data using the mean and standard deviation of each variable at each time in the batch cycle over all batches.
(4) Compute the kernel matrix K ∈ R^(I×I) using eq 14. Calculate the mean-centered kernel matrix K̃ according to eq 15. Also, the variance scaling of the kernel matrix (K̃_scl) can be computed according to eq 16, and, by applying eigenvalue decomposition to K̃_scl according to eq 17, we can obtain the orthonormal eigenvectors r̄1, r̄2, ..., r̄d of K̃_scl corresponding to the largest positive eigenvalues λ̄1 ≥ λ̄2 ≥ ... ≥ λ̄d.

(Steps 5-11 are similar to steps 1-7 given in Section 3.2.)

(12) Compute the demixing matrix W and the mixing matrix A:

W = D̄^(1/2) D_n^T Q̄ = D̄^(1/2) D_n^T Λ̄^(-1/2) V̄^T    (31)

A = V̄ Λ̄^(1/2) D_n D̄^(-1/2)    (32)

where D_n = [c1, ..., cq], Λ̄ = diag[λ̄1, ..., λ̄d], D̄ = diag[λ̄1, ..., λ̄q], V̄ = Θ̄ H̄ Λ̄^(-1/2), H̄ = [r̄1, r̄2, ..., r̄d], and Θ̄ = [Φ(x1), ..., Φ(xI)].

(13) Calculate the monitoring statistics (T² and SPE). The T² statistic is defined as follows:

T² = y^T D̄^(-1) y    (33a)

y = D̄^(1/2) D_n^T P̄^T Φ(x)    (33b)

P̄ = √I Θ̄ H̄ Λ̄^(-1)    (33c)
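Given a whitened score vector z and the quantities D_n and D̄ from step 12, the monitoring statistics of eq 33 and of the SPE definition in eqs 34 and 35 reduce to a few matrix products. The sketch below is ours, not the authors' code; it assumes D̄ is diagonal and the columns of D_n are orthonormal, as the text requires:

```python
import numpy as np

def monitoring_stats(z, Dn, D_bar):
    """T^2 = y^T D_bar^(-1) y with y = D_bar^(1/2) Dn^T z   (eq 33);
       SPE = e^T e = z^T (I - Dn Dn^T) z                    (eq 34),
    where z is the whitened score vector of the current observation."""
    y = np.sqrt(D_bar) @ Dn.T @ z        # dominant ICs, scaled to their variances
    T2 = y @ np.linalg.inv(D_bar) @ y    # Hotelling-type statistic on the ICs
    z_hat = Dn @ Dn.T @ z                # reconstruction z_hat = Dn Dn^T z (eq 35)
    e = z - z_hat                        # nonsystematic residual
    SPE = e @ e
    return T2, SPE
```

Note that, after substituting eq 33b into eq 33a, T² collapses to z^T D_n D_n^T z, so the statistic depends only on the projection of z onto the dominant ICs; SPE captures what that projection leaves out.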


Figure 1. Graphical depiction of the performance of kernel independent component analysis (KICA): (a) source data, (b) nonlinear mixed data, (c) principal components, and (d) KICA solutions.

To detect changes in the nonsystematic part of the residual of the MKICA model, the SPE statistic is defined as follows:

SPE = e^T e = (z − ẑ)^T (z − ẑ) = z^T (I − D_n D_n^T) z    (34a)

z = P̄^T Φ(x)    (34b)

where e = z − ẑ, and ẑ can be determined from

ẑ = D_n D̄^(-1/2) y = D_n D_n^T z    (35)

The upper control limit for T² cannot be determined using the F-distribution, because y does not follow a Gaussian distribution. In this paper, kernel density estimation is used to define the control limit.29

4.2. On-line Monitoring. On-line monitoring involves the following steps:

(1) For new batch data up to time k, X_t(k × J), unfold it to X_t^T(1 × Jk). Apply the same scaling used in the modeling and fill in the missing data with the current values to create X_t(1 × JK), using the second method mentioned in this paper.
(2) Given the JK-dimensional scaled test data, x_t ∈ R^(JK), compute the kernel vector k_t ∈ R^(1×I) by

[k_t]_j = [k_t(x_t, x_j)]

where x_j ∈ R^(JK) (for j = 1, 2, ..., I).
(3) The test kernel vector k_t is mean centered as follows:

k̃_t = k_t − 1_t K − k_t 1_I + 1_t K 1_I

where K is determined in step 4 of the NOC development, and

1_t = (1/I) [1, ..., 1] ∈ R^(1×I)

1_I ∈ R^(I×I) is the matrix whose elements are all equal to 1/I.
(4) The mean-centered kernel vector is scaled as

(k_t)_scl = k̃_t / (trace(K̃)/I)    (36)

By applying eigenvalue decomposition to (k_t)_scl,

λ r = (k_t)_scl r    (37)

we can obtain the orthonormal eigenvectors r̃1, r̃2, ..., r̃d of (k_t)_scl that correspond to the d largest positive eigenvalues λ̃1 ≥ λ̃2 ≥ ... ≥ λ̃d.

(Steps 5-13 are similar to steps 5-13 of the NOC development.)

5. Case Studies

5.1. Simple Example. To illustrate the performance of KICA, we apply the method to a simple example. Figure 1a shows the original sources. Figure 1b reflects the mixed data. When KICA is applied, the PCs are shown in Figure 1c. Moreover, the KICA solution (shown in Figure 1d) is able to recover the original sources. This simple example clearly demonstrates that KICA is very effective at extracting a few dominant essential components.

5.2. Fermentation Process. Nosiheptide is a bicyclic peptide antibiotic that is produced with Streptomyces actuosus. Its molecular formula is C51H43N13O12S6. Nosiheptide is mainly used as a feed additive, because it promotes livestock growth and leaves no residue in the body of the livestock. Some plants have produced Nosiheptide as a feed additive. The Nosiheptide production process is an aerobic fermentation in batch fermentors, with batch periods of ∼96 h. The incubation of the culture strains provides the seed, which grows in a seed fermentor until a stage of maturity is reached. The seed then is transferred to a final-stage fermentor. These fermentors are operated in batch mode under standard conditions to optimize the synthesis of Nosiheptide. After that, the product is withdrawn via solvent extraction in the downstream flow.

In the Nosiheptide fermentation process, the variables measurable on-line include physical and chemical parameters (see Table 1).

Table 1. Variables of the Fermentation Process

parameter | minimum | maximum
temperature, T (°C) | 5 | 50
dissolved oxygen tension, DO (% sat.) | 0 | 100
pressure, P (Pa) | 0 | 35
off-gas oxygen concentration, [O2] (%) | 12 | 22
off-gas CO2 concentration, [CO2] (%) | 0 | 7
pH value | 3 | 10
oxygen uptake rate, OUR (mol) | 0 | 5
carbon dioxide production rate, CER (mol) | 0 | 5

In our study, measurement noise was also added to the 8 variables used in monitoring. A total of 30 batches thought to be normal were generated from the simulator. To test whether the batches were statistically normal, off-line analysis was performed. This dataset was analyzed with both MICA and MKICA. Generally, the selected number of ICs for KICA is larger than that for ICA, because KICA extracts major ICs from the high-dimensional feature space. There are 8 linearly independent components and 96 nonlinearly independent components in the fermentation process. Four and eight independent components were selected for MICA and MKICA modeling, respectively (i.e., the nonlinear component number is larger than the linear component number).37 The radial basis kernel function (RBKF),

k(x, y) = exp(−||x − y||²/c)

is selected as the kernel function, with

c = r m σ²

where r is a constant to be selected, m is the dimension of the input space, and σ² is the variance of the data.

5.3. On-line Monitoring with MICA and MKICA. Based on the off-line analysis, all 30 batches are used to build the models for the on-line monitoring with MICA and MKICA. To fill in the future values in X_new, we used the second suggested approach: fill in the missing values with the current values. The models built with MICA and the proposed MKICA are then tested against a new batch with a 99% confidence limit. Two abnormal batches were used to evaluate the monitoring performance.

To compare the monitoring performance of MICA and MKICA, a fault was imposed on the first batch. That is, the oxygen uptake rate was decreased nonlinearly from 4.5 to 0.5, because of the fault of a feeding pump, until the end of batch operation. The initial time of the fault was 1 h. The 99% confidence limit is calculated and defines the normal operating region. The monitoring results of MICA and MKICA are shown in Figures 2a and 2b, respectively. In the case of MICA (Figure 2a), the T² chart did not detect a significant deviation, whereas detection was possible using the SPE chart after 77 h. In comparison to MICA, the on-line batch monitoring charts with MKICA were able to detect the fault earlier. As shown in Figure 2b, the T² chart detected the fault after 60 h, whereas the SPE chart detected it after 30 h. In conclusion, the detection of MKICA is faster than that of MICA (by 47 h).

Figure 2. (a) MICA monitoring results and (b) MKICA monitoring of the fermentation process in the case of the first fault batch.

The 95% confidence limit is also calculated and defines the normal operating region. In the case of MICA (Figure 2a), the T² chart detected a significant deviation after ∼80 h, whereas the SPE chart was able to detect the deviation after 65 h. In comparison to MICA, the on-line batch monitoring charts with MKICA were able to detect the fault earlier: as shown in Figure 2b, the T² chart detected the fault after 52 h, whereas the SPE chart detected it after 23 h. In conclusion, the detection of MKICA is faster than that of MICA (by 42 h). The fault detection delay is caused by the fact that the effect of the glucose substrate feed rate is propagated slowly through the correlated variables (i.e., dissolved oxygen concentration, culture volume, and coolant water flow rate).

For the second batch, a nonlinear decrease in the oxygen uptake rate was introduced at 45 h and retained until the end of the fermentation. The 99% confidence limit is calculated and defines the normal operating region. The monitoring results of MICA and MKICA are shown in Figures 3a and 3b, respectively. For this abnormal batch, as shown in Figure 3a, the monitoring charts of MICA detect the fault after 60 h. However, in comparison to MICA, the MKICA method can detect the abnormal situation earlier. Figure 3b shows that the T² chart of MKICA displayed a significant deviation after 55 h and the SPE chart detected the fault after 60 h, a delay of ∼10 h after the occurrence of the fault. This delay is caused by the fact that the effect of the oxygen uptake rate is propagated slowly through the correlated variables.

Figure 3. (a) MICA monitoring results and (b) MKICA monitoring of the fermentation process in the case of the second fault batch.

The 95% confidence limit is also calculated and defines the normal operating region. For this abnormal batch, as shown in Figure 3a, the T² chart of MICA displayed a significant deviation after 62 h, and the SPE chart detected the fault after 55 h. However, in comparison to MICA, the MKICA method can detect the abnormal situation earlier: Figure 3b shows that the T² chart of MKICA displayed a significant deviation after 52 h, and the SPE chart detected the fault after 55 h.
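The batch-wise unfolding of X(I × J × K) and the current-value fill-in used throughout this on-line monitoring procedure can be sketched as follows. The column ordering of the unfolded matrix shown here is one common convention and not necessarily the authors'; the array shapes and function names are our assumptions:

```python
import numpy as np

def unfold_batchwise(X3):
    """Batch-wise unfolding X(I x J x K) -> X(I x JK).
    X3 has shape (I, J, K): I batches, J variables, K time samples.
    Each row of the result is one batch; here the K samples of each
    variable are kept contiguous (one possible column ordering)."""
    I, J, K = X3.shape
    return X3.reshape(I, J * K)

def fill_with_current(Xt, J, K):
    """Complete a running batch observed only up to time k.
    Xt has shape (J, k); the remaining K - k samples of every variable
    are filled with the current (most recent) values, then the batch is
    unfolded into a single 1 x JK test row, consistent with the above."""
    k = Xt.shape[1]
    last = Xt[:, -1:]                                  # current sample, shape (J, 1)
    full = np.hstack([Xt, np.repeat(last, K - k, axis=1)])
    return full.reshape(1, J * K)
```

A test row produced by `fill_with_current` can then be scaled with the NOC means and standard deviations and fed to the kernel-vector computation of step 2 in Section 4.2.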

In this paper, off-line analysis and on-line batch monitoring are developed based on multiway kernel independent component analysis (MKICA). Three-way batch data of the normal batch process is unfolded batch-wise, and then kernel independent component analysis (KICA) is used to capture the nonlinear characteristics among the process variables. KICA can efficiently compute independent components in the high-dimensional feature spaces. The proposed monitoring method was applied to fault detection in the simulation showing Nosiheptide production. In on-line monitoring, MKICA is able to detect significant deviation that may cause the quality of the final products to be reduced. Consequently, the proposed approach can effectively capture the nonlinear relationship among the process variables and its application to process monitoring shows better performance than modified independent component analysis (MICA). The MKICA approach that has been developed in the present paper shows great promise but requires further study to resolve problems such as how many independent components are to be chosen and how does one select an appropriate kernel function and apply it to real batch data. We believe that, with further development, the proposed method will be useful for nonlinear batch process monitoring. The data mapped into feature space become redundant and linear data introduce error when the kernel trick is used. In addition, in the training process of kernel principal component analysis (KPCA), the size of the kernel matrix is the square of the number of samples. Hence, in future work, data preprocessing is needed. Acknowledgment This work was supported the Texas-Wisconsin Modeling and Control Consortium. Literature Cited (1) Martin, E. B.; Morris, A. J. An overview of multivariate statistical process control in continuous and batch process performance monitoring. Trans. Inst. Meas. Control (London) 1996, 18, 51-60. (2) Zhang, J.; Yang, X. H. 
MultiVariate Statistical Process Control; Chemical Industry Press: Beijing, PRC, 2000. (3) Kourti, T. Process analysis and abnormal situation detection: From theory to practice. Control Syst. Mag., IEEE 2002, 22, 10-25. (4) Liang, J.; Qian, J. Multivariate statistical process monitoring and control: Recent developments and applications to chemical industry. Chin. J. Chem. Eng. 2003, 11, 191-203.

(5) Qin, S. J.; Cherry, G.; Good, R.; Wang, J.; Harrison, C. A. Semiconductor manufacturing process control and monitoring: A fab-wide framework. J. Process Control 2006, 16, 179-191.
(6) Nomikos, P.; MacGregor, J. F. Monitoring batch processes using multiway principal component analysis. AIChE J. 1994, 40, 1361-1375.
(7) Nomikos, P.; MacGregor, J. F. Multi-way partial least squares in monitoring batch processes. Chemom. Intell. Lab. Syst. 1995, 30, 97-108.
(8) Nomikos, P.; MacGregor, J. F. Multivariate SPC charts for monitoring batch processes. Technometrics 1995, 37, 41-59.
(9) Kourti, T.; Lee, J.; MacGregor, J. F. Experiences with industrial applications of projection methods for multivariate statistical process control. Comput. Chem. Eng. 1996, 20, S745-S750.
(10) Gallagher, N. B.; Wise, B. M. Application of multi-way principal component analysis to nuclear waste storage tank monitoring. Comput. Chem. Eng. 1996, 20, S739-S744.
(11) Yue, H. H.; Qin, S. J.; Markle, R. J.; Nauert, C.; Gatto, M. Fault detection of plasma etchers using optical emission spectra. IEEE Trans. Semicond. Manuf. 2000, 13, 374-385.
(12) Dong, D.; McAvoy, T. J. Batch tracking via nonlinear principal component analysis. AIChE J. 1996, 42, 2199-2208.
(13) Rännar, S.; MacGregor, J. F.; Wold, S. Adaptive batch monitoring using hierarchical PCA. Chemom. Intell. Lab. Syst. 1998, 41, 73-81.
(14) Lee, J. M.; Yoo, C. K.; Choi, S. W.; Vanrolleghem, P. A.; Lee, I. B. Nonlinear process monitoring using kernel principal component analysis. Chem. Eng. Sci. 2004, 59, 223-234.
(15) Lee, J. M.; Yoo, C. K.; Lee, I. B. Fault detection of batch processes using multiway kernel principal component analysis. Comput. Chem. Eng. 2004, 28, 1837-1847.
(16) Tates, A. A.; Louwerse, D. J.; Smilde, A. K.; Koot, G. L. M.; Berndt, H. Monitoring a PVC batch process with multivariate statistical process control charts. Ind. Eng. Chem. Res. 1999, 38, 4769-4776.
(17) Westerhuis, J. A.; Kourti, T.; MacGregor, J. F. Comparing alternative approaches for multivariate statistical analysis of batch process data. J. Chemom. 1999, 13, 397-413.
(18) Gregersen, L.; Jorgensen, S. B. Supervision of fed-batch fermentations. Chem. Eng. J. 1999, 75, 69-76.
(19) Albert, S.; Kinley, R. D. Multivariate statistical monitoring of batch processes: an industrial case study of fermentation supervision. Trends Biotechnol. 2001, 19, 53-62.
(20) Lennox, B.; Montague, G. A.; Hiden, H. G.; Kornfeld, G.; Goulding, P. R. Process monitoring of an industrial fed-batch fermentation. Biotechnol. Bioeng. 2001, 74, 125-135.
(21) Chen, Q.; Wynne, R. J. The application of principal component analysis and kernel density estimation to enhance process monitoring. Control Eng. Pract. 2000, 8, 531-543.
(22) Kourti, T. Multivariate dynamic data modeling for analysis and statistical process control of batch processes, start-ups and grade transitions. J. Chemom. 2003, 17, 93-109.
(23) Yoo, C. K.; Lee, J. M.; Vanrolleghem, P. A.; Lee, I. B. On-line monitoring of batch processes using multiway independent component analysis. Chemom. Intell. Lab. Syst. 2004, 71, 151-163.

(24) Kano, M.; Tanaka, S.; Hasebe, S.; Hashimoto, I.; Ohno, H. Monitoring independent components for fault detection. AIChE J. 2003, 49, 969-976.
(25) Kano, M.; Hasebe, S.; Hashimoto, I.; Ohno, H. Evolution of multivariate statistical process control: independent component analysis and external analysis. Comput. Chem. Eng. 2004, 28, 1157-1166.
(26) Lee, J. M.; Yoo, C. K.; Lee, I. B. Statistical process monitoring with independent component analysis. J. Process Control 2004, 14, 467-485.
(27) Albazzaz, H.; Wang, X. Z. Statistical process control charts for batch operations based on independent component analysis. Ind. Eng. Chem. Res. 2004, 43, 6731-6741.
(28) Yang, J.; Gao, X.; Zhang, D.; Yang, J. Kernel ICA: An alternative formulation and its application to face recognition. Pattern Recognit. 2005, 38, 1784-1787.
(29) Lee, J. M.; Qin, S. J.; Lee, I. B. Fault detection and diagnosis of multivariate processes based on modified independent component analysis. AIChE J. 2006, 52, 3501-3514.
(30) Hyvärinen, A.; Oja, E. Independent component analysis: algorithms and applications. Neural Networks 2000, 13, 411-430.
(31) Hyvärinen, A. New approximations of differential entropy for independent component analysis and projection pursuit. Adv. Neural Inf. Process. Syst. 1998, 10, 273-279.
(32) Hyvärinen, A. Fast and robust fixed-point algorithms for independent component analysis. IEEE Trans. Neural Networks 1999, 10, 626-634.
(33) Walczak, B.; Massart, D. L. Local modelling with radial basis function networks. Chemom. Intell. Lab. Syst. 2000, 50, 179-198.
(34) Schölkopf, B.; Sung, K.-K.; Burges, C. J. C.; Girosi, F.; Niyogi, P.; Poggio, T.; Vapnik, V. Comparing support vector machines with Gaussian kernels to radial basis function classifiers. IEEE Trans. Signal Process. 1997, 45, 2758-2765.
(35) Choi, S. W.; Lee, D. K.; Park, J. H.; Lee, I.-B. Nonlinear regression using RBFN with linear submodels. Chemom. Intell. Lab. Syst. 2003, 65, 191-208.
(36) Choudhury, M. A. A. S.; Shah, S. L.; Thornhill, N. F. Diagnosis of poor control-loop performance using higher-order statistics. Automatica 2004, 40, 1719-1728.
(37) Lee, J.; Yoo, C.; Choi, S.; Vanrolleghem, P.; Lee, I. Nonlinear process monitoring using kernel principal component analysis. Chem. Eng. Sci. 2004, 59, 223-234.
(38) García-Muñoz, S.; Kourti, T.; MacGregor, J. F. Model predictive monitoring for batch processes. Ind. Eng. Chem. Res. 2004, 43, 5929-5941.

Received for review March 13, 2007
Revised manuscript received August 7, 2007
Accepted August 21, 2007

IE070381Q
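For readers who wish to experiment with the data arrangement summarized in the conclusions, the following minimal NumPy sketch illustrates batch-wise unfolding of a three-way batch data array, auto-scaling, and construction of a centered Gaussian Gram matrix whose size is the square of the number of training batches. The dimensions, random data, and kernel width here are illustrative assumptions, not the settings used in the Nosiheptide simulation.

```python
import numpy as np

# Assumed, illustrative dimensions: I batches, J variables, K time points.
I, J, K = 30, 4, 50
rng = np.random.default_rng(0)
X = rng.standard_normal((I, J, K))   # three-way batch data array

# Batch-wise unfolding: each batch becomes one row of length J*K.
Xu = X.reshape(I, J * K)

# Auto-scale each column across the batch direction (zero mean, unit variance).
Xs = (Xu - Xu.mean(axis=0)) / Xu.std(axis=0, ddof=1)

# Gaussian (RBF) kernel Gram matrix; its size is I x I, i.e. the
# square of the number of training batches.
c = 10.0 * J * K                     # assumed kernel width
sq = np.sum(Xs**2, axis=1)
D2 = sq[:, None] + sq[None, :] - 2.0 * (Xs @ Xs.T)   # squared distances
G = np.exp(-D2 / c)

# Center the Gram matrix in the feature space, the standard
# preprocessing step shared by KPCA and kernel ICA.
E = np.ones((I, I)) / I
Gc = G - E @ G - G @ E + E @ G @ E
```

The centered matrix `Gc` is what an eigendecomposition (for whitening in the feature space) would operate on; in practice the kernel width `c` must be tuned, which is one of the open selection problems noted above.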