ARTICLE pubs.acs.org/IECR

Nonlinear Bioprocess Monitoring Using Multiway Kernel Localized Fisher Discriminant Analysis

Jie Yu*,†

Department of Chemical Engineering, The University of Texas at Austin, Austin, Texas 78712, United States

ABSTRACT: A novel batch bioprocess monitoring approach based on multiway kernel localized Fisher discriminant analysis (MKLFDA) is proposed in this article. For routine bioprocess operation with some abnormal events, a supervised monitoring method is needed to handle a training data set that includes various types of faulty samples. Because of the inherent process nonlinearity and the multimodality among normal and faulty clusters, the traditional multiway Fisher discriminant analysis (MFDA) method becomes inappropriate and unable to effectively detect or classify faulty samples. The newly developed MKLFDA approach, however, combines a kernel function with localized Fisher discriminant analysis so that the “kernel” feature can retain the process nonlinearity while the “localized” characteristic is able to extract the multi-Gaussianity within data clusters. Furthermore, the integrated multiway analysis uses batch-wise unfolding to convert the three-dimensional data set into a two-dimensional matrix that can be fed into the kernel localized Fisher discriminant analysis for fault detection and classification. The proposed MKLFDA approach is applied to three test scenarios in the fed-batch penicillin fermentation process, and its batch monitoring performance is compared to that of the conventional MFDA method. The results indicate that the MKLFDA approach performs much better than the MFDA method in detecting abnormal operating conditions as well as in classifying various types of process faults occurring in fed-batch operation. The MKLFDA approach results in a higher fault detection rate and fewer false classifications.

Received: August 16, 2010. Revised: January 4, 2011. Accepted: January 7, 2011. Published: February 03, 2011.
© 2011 American Chemical Society. Industrial & Engineering Chemistry Research. dx.doi.org/10.1021/ie1017282 | Ind. Eng. Chem. Res. 2011, 50, 3390–3402

1. INTRODUCTION

In the biopharmaceutical industry, batch or semibatch bioprocess monitoring plays an increasingly important role in meeting the specific requirements of the Process Analytical Technology (PAT) initiative by the U.S. Food and Drug Administration.1,2 The complexity of biological systems and the lack of quantitative analysis of the underlying mechanisms pose great challenges to successful bioprocess monitoring and control. However, product yield and quality in bioprocesses rely upon stable bioreactor operation under normal conditions. Even small perturbations in the process operation may cause dramatic degradation of the yield and quality of final products. Hence, early and accurate detection of abnormal events in bioprocesses can effectively prevent production incidents through rapid, proactive action to fix process faults. As a result, the number of potentially rejected batches can be largely reduced, with improved product yield, quality, and manufacturing profit.

A typical batch or semibatch bioprocess has the following features: (i) the metabolism of cell growth and product formation can be quite complex, so that first-principle model based process control and monitoring become infeasible; (ii) strong nonlinearity is often present in the process, and thus the commonly used linear monitoring techniques are likely to fail; (iii) the batch-to-batch variations may be significant, and changes in operating scenario can result in different process dynamics; (iv) the high sensitivity of the bioprocess to subtle operation disturbances requires strong detectability from the batch monitoring technique; (v) there are insufficient online measurements available in the bioreactors.2-4 Simple methods like visual inspection of variable trending plots or empirical rules often fail because of the complicated variable correlations, inherent process dynamics, and nonlinearity in batch bioprocesses.

In the literature, multivariate statistical process monitoring (MSPM) techniques have been developed and applied to batch or fed-batch processes. Nomikos and MacGregor developed a multiway principal component analysis (PCA)-based monitoring approach to extract faulty features from multivariate trajectory data. The obtained multivariate statistical process control (MSPC) charts can track the progress of new batch runs and detect the occurrence of process upsets.5,6 Kosanovich et al. applied multiway PCA to identify the major source of batch-to-batch variability and improve the fundamental understanding of batch processes.7 In addition to multiway PCA, Wise and Gallagher further explored multiway partial least-squares (PLS) and its potential for batch process monitoring.8 Martin et al. investigated the different types of control limits in the multiway PCA- and PLS-based MSPC charts.9 Lennox et al. demonstrated the monitoring capability of PCA and PLS techniques in an industrial fed-batch fermentation process.10 Considering the slow changes of normal operation in real processes, Lee et al. proposed a consecutively updated MPCA model to include any new batches, and such a strategy can effectively reduce the false alarms during online monitoring of fed-batch bioprocesses.11 Undey et al. developed two types of multiway PLS models through different unfolding methods for fed-batch fermentation process monitoring. The issue of discontinued process measurements due to operation switching can be resolved by batch-wise data matrix unfolding.12 Gunther et al. applied a multiway PCA model to detect abnormal process conditions and diagnose root causes in an industrial fed-batch cell culture process.13 Chiang et al. conducted a comparative study of different data unfolding techniques in multiway PCA and PLS methods for batch process monitoring. It was found that batch unfolding multiway PCA is more sensitive to the overall batch variation, while observation unfolding multiway PLS is more sensitive to the localized batch variation.14 Undey et al. integrated the multiway PLS method into a real-time knowledge-based system for automated process monitoring, quality estimation, and fault diagnosis of batch and fed-batch cultivations.15

Though multiway PCA and PLS methods have gained some success in batch bioprocess monitoring, they suffer from several drawbacks that degrade their monitoring capability. First of all, they are both essentially linear modeling techniques and thus unable to handle the nonlinearity commonly present in various kinds of bioprocesses. Dong and McAvoy proposed a nonlinear PCA-based monitoring approach to deal with nonlinear batch processes.16 Lee et al. developed a multiway kernel PCA method for both online and off-line fault detection of batch processes. The nonlinear kernel function is combined with PCA to characterize the inherent process nonlinearity.17,18 Second, multiway PCA/PLS relies upon the assumption that the unfolded measurement and quality variables follow a multivariate Gaussian distribution. In industrial practice, however, the process data often exhibit non-Gaussian features for reasons such as shifting operation phases, various process conditions, and different types of process faults.
As an alternative solution, the multiway independent component analysis (ICA) method has been proposed to monitor batch and fed-batch processes.19-23 The latent variables that are statistically independent of each other are separated, and the higher-order statistics are examined to extract the non-Gaussian directions for fault detection. Another non-Gaussian monitoring approach, namely the Gaussian mixture model (GMM), has been applied to both continuous and batch process monitoring with the specific capability of characterizing the multimodality of normal operating data.24-27 The aforementioned multiway PCA, PLS, ICA, and GMM methods all belong to the class of unsupervised monitoring techniques, which require a fault-free training data set to build the normal operation model. The real process data, however, are very likely to be contaminated with a certain number of faulty samples because different types of process and measurement faults may occur randomly during plant operation. Note that the faulty samples in this work refer specifically to the data collected under abnormal operation events rather than to invalid or outlier points. In contrast, Fisher discriminant analysis (FDA), a popular supervised monitoring technique, has been widely applied to continuous processes28-30 and also attempted in batch processes.31 The basic idea of the FDA monitoring approach is to seek the optimal separating directions between normal and faulty classes, which can form the discriminant functions for fault detection and point to the variables with high contribution for fault diagnosis. The significant advantage of FDA over the unsupervised monitoring methods lies in the fact that both normal and faulty data sets are fully used to construct the operation model so as to improve the fault prediction accuracy.
Nevertheless, the conventional FDA algorithm depends upon the assumption of within-class Gaussianity and therefore is unable to handle the multimodality within normal and/or faulty classes due to various operating modes or faulty events. A new localized Fisher discriminant analysis (LFDA) method has been applied to continuous chemical process monitoring and can effectively retain the multi-Gaussianity within normal or faulty data sets while separating the faulty samples from normal operation.32 In this research, the LFDA approach is further extended to batch or fed-batch bioprocess monitoring, and a novel multiway kernel LFDA (MKLFDA) method is developed to capture the nonlinearity, batch dynamics, and within-class multimodality. The basic idea is to first unfold the three-dimensional data matrix collected from batch processes and align data arrays of unequal length via a dynamic time warping technique. Then the raw data space is projected into a higher-dimensional kernel feature space through the nonlinear kernel function. The localized Fisher discriminant directions can thus be extracted from the kernel feature space with the nonlinear process characteristics captured. Moreover, the kernel localized Fisher directions maintain the potential multimodality among different faulty clusters or various operating phases. A kernel localized Fisher discriminant function can be constructed to determine the abnormal events and classify different types of process faults. Different from other kernel analysis based nonlinear monitoring methods,22,23,33-35 the presented MKLFDA approach can not only retain and capture the multi-Gaussianity within the batch process data but also identify the best separating hyperplanes with a data set involving both normal and faulty batches.

The organization of the paper is as follows. A brief introduction of the localized FDA method is provided in section 2. Then section 3 describes the kernel LFDA algorithm. The novel multiway kernel LFDA based batch process monitoring approach is further developed in section 4. In section 5, the monitoring performance of the proposed MKLFDA approach is demonstrated on the simulated fed-batch penicillin production process, and the results are also compared to those of the regular multiway FDA (MFDA) method in different test scenarios. The concluding remarks of this paper are finally summarized in section 6.

2. REVIEW OF LOCALIZED FISHER DISCRIMINANT ANALYSIS

As a popular supervised pattern classification and dimensionality reduction method, Fisher discriminant analysis is able to separate different clusters with maximized margin while minimizing the within-class scatter. It is essentially achieved through generalized eigenvalue analysis of the between-class scatter matrix $S_b$ against the within-class scatter matrix $S_w$

$$S_b p = \lambda S_w p \tag{1}$$

where $\lambda$ and $p$ are the generalized eigenvalue and eigenvector, respectively. The definitions of the between-class and within-class scatter matrices along with the FDA algorithm details can be found in the literature.36 The conventional FDA is based upon the underlying assumption that each cluster follows an approximately multivariate Gaussian distribution, and thus the classification results may not be desirable if some of the classes are multimodal. To address this issue, a localized Fisher discriminant analysis method has been proposed with the capability of preserving the local features within every single class.37

Figure 1. Illustration of batch-wise data unfolding.

Figure 2. Schematic diagram of the multiway KLFDA based batch process monitoring approach.

Table 1. Monitored Variables in the Penicillin Production Process

variable no.   monitored variable
1              aeration rate (L/h)
2              agitator power (W)
3              substrate feed flow rate (L/h)
4              substrate feed temperature (K)
5              pH
6              dissolved oxygen concentration (g/L)
7              carbon dioxide concentration (g/L)
8              biomass concentration (g/L)
9              penicillin concentration (g/L)
10             fermentor temperature (K)
11             cooling water flow rate (L/h)
12             generated heat (kcal)

In the LFDA algorithm, two weighting matrices $\tilde{W}_b^{(i,j)}$ and $\tilde{W}_w^{(i,j)}$ are introduced into the localized between-class and within-class scatter matrices as follows:

$$\tilde{S}_b = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \tilde{W}_b^{(i,j)} (x_i - x_j)(x_i - x_j)^T \tag{2}$$

and

$$\tilde{S}_w = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \tilde{W}_w^{(i,j)} (x_i - x_j)(x_i - x_j)^T \tag{3}$$

where $x_i$ (i = 1, 2, ..., n) represent the n training samples from the d-dimensional data space $\mathbb{R}^d$, and the weighting matrices are defined as

$$\tilde{W}_b^{(i,j)} = \begin{cases} A_{i,j}\left(\dfrac{1}{n} - \dfrac{1}{n_k}\right) & \text{if } x_i \in C_k,\ x_j \in C_k \\ \dfrac{1}{n} & \text{otherwise} \end{cases} \tag{4}$$

and

$$\tilde{W}_w^{(i,j)} = \begin{cases} \dfrac{A_{i,j}}{n_k} & \text{if } x_i \in C_k,\ x_j \in C_k \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

with $n_k$ denoting the number of samples in the kth class. The affinity matrix $A$ includes the scaling factors that quantify the similarity between all pairs of samples. Typically it is set by a Gaussian function as $A_{i,j} = \exp(-\|x_i - x_j\|^2/\sigma^2)$, with the parameter $\sigma$ adjusting the speed of exponential decay. Similarly, the generalized eigenvalue analysis can be conducted between the localized between-class and within-class scatter matrices

$$\tilde{S}_b \tilde{p} = \tilde{\lambda} \tilde{S}_w \tilde{p} \tag{6}$$

where the eigenvector $\tilde{p}$ corresponds to the localized Fisher discriminant direction and the eigenvalue $\tilde{\lambda}$ indicates the localized separation ratio between different classes. The local multi-Gaussianity within a single class can be effectively retained by the distance-based weighting matrices while the class separation is simultaneously maximized by the generalized eigenvalue decomposition.
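As an illustration of eqs 2-6, the following is a minimal NumPy sketch, not the implementation used in this work. It assumes that samples are stored as rows of `X` (so the scatter matrices take the form $X^T L X$), and it adds a small ridge term `reg` to the within-class scatter so that a Cholesky-based solver can be used. The function names, the toy data, and the values of `sigma` and `reg` are all hypothetical.

```python
import numpy as np

def lfda_weights(X, y, sigma=1.0):
    """Weighting matrices of eqs 4 and 5. X: (n, d), rows are samples; y: (n,) class labels."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    A = np.exp(-np.sum(diff ** 2, axis=-1) / sigma ** 2)   # Gaussian affinity A_ij
    same = y[:, None] == y[None, :]                        # True where i and j share a class
    nk = np.array([np.sum(y == c) for c in y], dtype=float)[:, None]  # class size per row
    Wb = np.where(same, A * (1.0 / n - 1.0 / nk), 1.0 / n)
    Ww = np.where(same, A / nk, 0.0)
    return Wb, Ww

def scatter(X, W):
    """0.5 * sum_ij W_ij (x_i - x_j)(x_i - x_j)^T, written as X^T (D - W) X for symmetric W."""
    L = np.diag(W.sum(axis=1)) - W
    return X.T @ L @ X

def top_directions(Sb, Sw, r=2, reg=1e-6):
    """Leading r solutions of Sb p = lambda (Sw + reg*I) p (ridge term is an assumption)."""
    C = np.linalg.cholesky(Sw + reg * np.eye(Sw.shape[0]))
    Ci = np.linalg.inv(C)
    M = Ci @ Sb @ Ci.T                       # whitened, symmetric eigenproblem
    evals, V = np.linalg.eigh((M + M.T) / 2.0)
    order = np.argsort(evals)[::-1][:r]      # largest eigenvalues first
    return evals[order], Ci.T @ V[:, order]

# Toy data: two classes in 3-D, purely illustrative
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (15, 3)), rng.normal(3.0, 1.0, (10, 3))])
y = np.array([0] * 15 + [1] * 10)
Wb, Ww = lfda_weights(X, y, sigma=2.0)
Sb, Sw = scatter(X, Wb), scatter(X, Ww)
lam, P = top_directions(Sb, Sw, r=2)
```

The columns of `P` play the role of the localized Fisher directions $\tilde{p}$, and `lam` holds the corresponding $\tilde{\lambda}$ values in descending order.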

3. KERNEL LOCALIZED FISHER DISCRIMINANT ANALYSIS

Though the LFDA method has an enhanced capacity for handling within-class multimodality, it is still a linear classification technique and is unable to extract the complex nonlinear features in bioprocesses. To overcome this limitation, the LFDA algorithm has been extended to the nonlinear framework through kernel feature space projection.37 For the sample data matrix $X$, the within-class scatter matrix can be decomposed as

$$\tilde{S}_w = X \tilde{L}_w X^T \tag{7}$$

where $\tilde{L}_w = \tilde{D}_w - \tilde{W}_w$, with $\tilde{D}_w$ denoting an n-dimensional diagonal matrix whose ith diagonal entry is $\tilde{D}_w^{(i,i)} = \sum_{j=1}^{n} \tilde{W}_w^{(i,j)}$. Define a local mixture scatter matrix as $\tilde{S}_m = \tilde{S}_w + \tilde{S}_b$. It can be further written as

$$\tilde{S}_m = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \tilde{W}_m^{(i,j)} (x_i - x_j)(x_i - x_j)^T \tag{8}$$

where the mixture weighting matrix is defined as

$$\tilde{W}_m^{(i,j)} = \begin{cases} \dfrac{A_{i,j}}{n} & \text{if } x_i \in C_k,\ x_j \in C_k \\ \dfrac{1}{n} & \text{otherwise} \end{cases} \tag{9}$$

Similarly, the between-class scatter matrix is decomposed as

$$\tilde{S}_b = X \tilde{L}_b X^T \tag{10}$$

where $\tilde{L}_b = (\tilde{D}_m - \tilde{D}_w) - (\tilde{W}_m - \tilde{W}_w)$, with $\tilde{D}_m$ representing an n-dimensional diagonal matrix whose ith diagonal entry is $\tilde{D}_m^{(i,i)} = \sum_{j=1}^{n} \tilde{W}_m^{(i,j)}$. With the scatter matrix decomposition in eqs 7 and 10, the generalized eigenvalue analysis can be written as follows

$$X \tilde{L}_b X^T \tilde{p} = \tilde{\lambda} X \tilde{L}_w X^T \tilde{p} \tag{11}$$

Define an n × n matrix $K = \{K_{i,j}\} = \{x_i^T x_j\}$ and an n-dimensional vector $\tilde{\alpha}$ such that $\tilde{p} = X\tilde{\alpha}$. Then, after left-multiplication by $X^T$, the above equation can be transformed to

$$K \tilde{L}_b K \tilde{\alpha} = \tilde{\lambda} K \tilde{L}_w K \tilde{\alpha} \tag{12}$$

Figure 3. Diagram of fed-batch penicillin production process41.

Table 2. Three Test Scenarios in the Simulated Penicillin Production Process

case no.   time period          faulty scenario in the test batch
1          200th-300th hour     step change in substrate feed flow rate
1          300th-400th hour     increased variation in fermentor temperature
2          100th-200th hour     drift error in aeration rate
2          250th-350th hour     increased variation in fermentor temperature
3          100th-150th hour     increased variation in fermentor temperature
3          200th-280th hour     step change in substrate feed flow rate
3          300th-360th hour     drift error in aeration rate

Let $\Phi$ be a nonlinear function that maps sample points from the original data space $X \in \mathbb{R}^d$ into a high-dimensional feature space $F$. Then the localized within-class and between-class scatter matrices can be reformulated as

$$\tilde{S}_w = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \tilde{W}_w^{(i,j)} (\Phi(x_i) - \Phi(x_j))(\Phi(x_i) - \Phi(x_j))^T \tag{13}$$

and

$$\tilde{S}_b = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \tilde{W}_b^{(i,j)} (\Phi(x_i) - \Phi(x_j))(\Phi(x_i) - \Phi(x_j))^T \tag{14}$$

Because of the curse of dimensionality in computing all $\Phi(x_i)$ directly, the kernel function is introduced as follows

$$\tilde{K} = \{\tilde{K}_{i,j}\} = \{\langle \Phi(x_i), \Phi(x_j) \rangle\} \tag{15}$$

Replacing the vector product based matrix $K$ in eq 12 with the above kernel matrix $\tilde{K}$ leads to

$$\tilde{K} \tilde{L}_b \tilde{K} \tilde{\alpha} = \tilde{\lambda} \tilde{K} \tilde{L}_w \tilde{K} \tilde{\alpha} \tag{16}$$

The most commonly used kernel function is the Gaussian function as follows

$$\tilde{K}_{i,j} = \exp\left(-\frac{\|x_i - x_j\|^2}{2\tilde{\sigma}^2}\right) \tag{17}$$

where $\tilde{\sigma}$ determines the width of the Gaussian kernel.
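The kernelized eigenproblem of eq 16 can be sketched in NumPy as shown below. This is an illustrative sketch under stated assumptions rather than the author's code: samples are rows of `X`, the ridge term `reg` added to $\tilde{K}\tilde{L}_w\tilde{K}$ is an assumption made so that the problem is numerically solvable, and the function names, toy data, and parameter values are hypothetical.

```python
import numpy as np

def gaussian_kernel(X, Z, width):
    """Eq 17: exp(-||x_i - z_j||^2 / (2 width^2)) for the rows of X and Z."""
    sq = np.sum(X ** 2, axis=1)[:, None] + np.sum(Z ** 2, axis=1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * width ** 2))

def local_laplacians(X, y, sigma):
    """Return (L_b, L_w) with L_w = D_w - W_w and L_b = (D_m - D_w) - (W_m - W_w)."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    A = np.exp(-np.sum(diff ** 2, axis=-1) / sigma ** 2)   # affinity matrix
    same = y[:, None] == y[None, :]
    nk = np.array([np.sum(y == c) for c in y], dtype=float)[:, None]
    Ww = np.where(same, A / nk, 0.0)          # eq 5
    Wm = np.where(same, A / n, 1.0 / n)       # eq 9
    lap = lambda W: np.diag(W.sum(axis=1)) - W
    Lw = lap(Ww)
    return lap(Wm) - Lw, Lw

def klfda_fit(X, y, sigma=1.0, width=1.0, r=2, reg=1e-4):
    """Leading r solutions of K L_b K a = lambda (K L_w K + reg*I) a."""
    K = gaussian_kernel(X, X, width)
    Lb, Lw = local_laplacians(X, y, sigma)
    B = K @ Lb @ K
    W = K @ Lw @ K + reg * np.eye(len(X))     # ridge term is an assumption
    C = np.linalg.cholesky((W + W.T) / 2.0)
    Ci = np.linalg.inv(C)
    M = Ci @ B @ Ci.T
    evals, V = np.linalg.eigh((M + M.T) / 2.0)
    order = np.argsort(evals)[::-1][:r]       # largest eigenvalues first
    return evals[order], Ci.T @ V[:, order], K

# Toy data: two classes in 4-D, purely illustrative
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (12, 4)), rng.normal(2.0, 1.0, (12, 4))])
y = np.array([0] * 12 + [1] * 12)
lam, alphas, K = klfda_fit(X, y, sigma=2.0, width=2.0, r=2, reg=1e-4)
```

Without some form of regularization, $\tilde{K}\tilde{L}_w\tilde{K}$ is typically singular because every graph Laplacian has the constant vector in its null space, which is why the sketch adds `reg`.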


Figure 4. Normal data trends of nine selected variables in the fed-batch penicillin production process.

4. MULTIWAY KERNEL LFDA-BASED BATCH PROCESS MONITORING APPROACH

In batch or fed-batch bioprocesses, the collected operating data typically form a three-dimensional matrix with time-varying dynamics along the trajectories and batch-to-batch variations. Therefore, the kernel LFDA method described in section 3 needs to be integrated with multiway analysis in order to be applied to batch process monitoring. Consider a batch or fed-batch process with I batches, J process variables, and K sampling instants. Then the formed three-way training data matrix is denoted by $X_t$ (I × J × K). First, the dynamic time warping (DTW) technique is used to synchronize different batches and reconcile the timing differences across various trajectories.38 The DTW algorithm can measure the similarity across different time series and further find the optimal match among those sequences. The similarity measured by DTW is independent of nonlinear variations in the sequences, and the effects of shifting and distortion along the time axis can be minimized. As the DTW technique has been well established in the literature, interested readers can refer to the references for algorithm details.39,40

Then, a batchwise unfolding is conducted on the raw data matrix $X_t$ to convert it into a two-dimensional matrix $\hat{X}_t$ (I × JK), as illustrated in Figure 1. The unfolded data can be further scaled to $\bar{X}_t$ with zero mean and unit variance along each variable at every sampling instant. A k-nearest neighbor (KNN) algorithm36 is adopted to classify the unfolded and normalized training data into C different clusters. Such class information along with the preprocessed training data set $\bar{X}_t$ is used to compute the local matrices $\tilde{L}_b$ and $\tilde{L}_w$, which can further lead to the localized scatter matrices $\tilde{S}_b$ and $\tilde{S}_w$. With the kernel matrix $\tilde{K}$ computed from all the training data, the generalized eigenvalue decomposition can be performed according to eq 16 to obtain the series of eigenvalues and eigenvectors. The eigenvectors corresponding to the r largest eigenvalues form a kernel localized Fisher subspace $\tilde{P}_r = [\tilde{\alpha}_1\ \tilde{\alpha}_2\ \cdots\ \tilde{\alpha}_r]$. For each row $\bar{x}_i$ of the preprocessed training data matrix, compute its image point in the kernel-localized Fisher subspace as

$$\varphi_{x_i} = \left(\sqrt{\tilde{\lambda}_1}\,\tilde{\alpha}_1 \;\; \sqrt{\tilde{\lambda}_2}\,\tilde{\alpha}_2 \;\; \cdots \;\; \sqrt{\tilde{\lambda}_r}\,\tilde{\alpha}_r\right)^T \begin{pmatrix} K(x_1, x_i) \\ K(x_2, x_i) \\ \vdots \\ K(x_I, x_i) \end{pmatrix} \tag{18}$$

where $K(\cdot,\cdot)$ denotes the kernel function of eq 17 evaluated on the preprocessed batches.
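The preprocessing and projection steps described in this section can be sketched as follows. This is a hedged illustration only: the column ordering of the unfolded matrix (all J variables of one sampling instant stored contiguously) is an assumption, since the layout of Figure 1 is not reproduced here, and the names `unfold_batchwise`, `fit_scaler`, and `image_points` are hypothetical. Negative eigenvalues are clipped to zero before the square root, which is also an assumption.

```python
import numpy as np

def unfold_batchwise(X3):
    """(I, J, K) -> (I, K*J); column k*J + j holds variable j at instant k (assumed ordering)."""
    I, J, K = X3.shape
    return X3.transpose(0, 2, 1).reshape(I, K * J)

def fit_scaler(Xu):
    """Column-wise mean and standard deviation across batches."""
    mean = Xu.mean(axis=0)
    std = Xu.std(axis=0, ddof=1)
    std[std == 0.0] = 1.0            # guard against constant columns
    return mean, std

def image_points(K_block, evals, alphas):
    """Eq 18 for m batches at once.

    K_block: (I, m) kernel values between the I training batches and m batches to project.
    evals: (r,) eigenvalues; alphas: (I, r) eigenvectors. Returns an (m, r) array.
    """
    scale = np.sqrt(np.clip(evals, 0.0, None))
    return (K_block.T @ alphas) * scale

# Toy data: I = 6 batches, J = 3 variables, K = 5 sampling instants
rng = np.random.default_rng(0)
X3 = rng.normal(size=(6, 3, 5))
Xu = unfold_batchwise(X3)
mean, std = fit_scaler(Xu)
Xs = (Xu - mean) / std                                   # scaled training matrix
x_new = (unfold_batchwise(rng.normal(size=(1, 3, 5))) - mean) / std  # a new batch, same scaling
```

A new batch is scaled with the training mean and standard deviation rather than its own statistics, so that training and test batches share one coordinate system.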


Figure 5. First test case: Test samples projected into two-dimensional (a) Fisher discriminant subspace and (b) kernel-localized Fisher discriminant subspace.

Similarly, a new batch $x_s$ (1 × J × K) can be batchwise unfolded and scaled to $\bar{x}_s$ (1 × JK). Then its image point is given by

$$\varphi_{x_s} = \left(\sqrt{\tilde{\lambda}_1}\,\tilde{\alpha}_1 \;\; \sqrt{\tilde{\lambda}_2}\,\tilde{\alpha}_2 \;\; \cdots \;\; \sqrt{\tilde{\lambda}_r}\,\tilde{\alpha}_r\right)^T \begin{pmatrix} K(x_1, x_s) \\ K(x_2, x_s) \\ \vdots \\ K(x_I, x_s) \end{pmatrix} \tag{19}$$

Let $\bar{\varphi}^{(l)}_{x_i}$ be the mean vector of all the image points that are from the lth cluster $C_l$. Then the kernel localized Fisher discriminant function can be computed for the new batch $x_s$ as follows:

$$g_l(x_s) = -\frac{1}{2}\left(\varphi_{x_s} - \bar{\varphi}^{(l)}_{x_i}\right)^T \tilde{P}_r \left(\frac{1}{n_l - 1}\,\tilde{P}_r^T \Big(\sum_{x_i \in C_l} \big(\varphi_{x_i} - \bar{\varphi}^{(l)}_{x_i}\big)\big(\varphi_{x_i} - \bar{\varphi}^{(l)}_{x_i}\big)^T\Big)\tilde{P}_r\right)^{-1} \tilde{P}_r^T \left(\varphi_{x_s} - \bar{\varphi}^{(l)}_{x_i}\right) + \ln C - \frac{1}{2}\ln\left[\det\left(\frac{1}{n_l - 1}\,\tilde{P}_r^T \Big(\sum_{x_i \in C_l} \big(\varphi_{x_i} - \bar{\varphi}^{(l)}_{x_i}\big)\big(\varphi_{x_i} - \bar{\varphi}^{(l)}_{x_i}\big)^T\Big)\tilde{P}_r\right)\right] \tag{20}$$

where $n_l$ denotes the number of training batches that are classified into the lth cluster and $g_l(x_s)$ is the discriminant function value of the test batch $x_s$ belonging to the lth cluster. Therefore, the test batch at a sampling instant can be categorized into the particular cluster corresponding to the largest discriminant function value as given below

$$C(x_s) = \arg\max_{1 \le l \le C} g_l(x_s) \tag{21}$$
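The classification rule of eqs 20 and 21 can be illustrated with the simplified sketch below. It is not a literal transcription of eq 20: it assumes that the r-dimensional image points are used directly as coordinates, so the explicit projection by $\tilde{P}_r$ is omitted, and it adds a small `ridge` to each cluster covariance for numerical stability. The $\ln C$ term is kept as a constant that is identical for every cluster. All function names and the toy data are hypothetical.

```python
import numpy as np

def fit_clusters(Phi, labels, ridge=1e-8):
    """Per-cluster mean and sample covariance of the image points Phi (n, r)."""
    stats = {}
    for l in np.unique(labels):
        pts = Phi[labels == l]
        mu = pts.mean(axis=0)
        dev = pts - mu
        # 1/(n_l - 1) * sum of outer products, plus an assumed ridge term
        cov = dev.T @ dev / (len(pts) - 1) + ridge * np.eye(Phi.shape[1])
        stats[l] = (mu, cov)
    return stats

def discriminant_scores(phi, stats):
    """Simplified g_l for one image point phi (r,), evaluated for every cluster l."""
    n_classes = len(stats)
    scores = {}
    for l, (mu, cov) in stats.items():
        d = phi - mu
        maha = d @ np.linalg.solve(cov, d)          # quadratic (Mahalanobis-type) term
        _, logdet = np.linalg.slogdet(cov)          # log-determinant term
        scores[l] = -0.5 * maha + np.log(n_classes) - 0.5 * logdet
    return scores

def classify(phi, stats):
    """Eq 21: pick the cluster with the largest discriminant value."""
    scores = discriminant_scores(phi, stats)
    return max(scores, key=scores.get)

# Toy image points: two well-separated clusters in a 2-D subspace
rng = np.random.default_rng(0)
Phi = np.vstack([rng.normal([0.0, 0.0], 0.5, (20, 2)), rng.normal([5.0, 5.0], 0.5, (20, 2))])
labels = np.array([0] * 20 + [1] * 20)
stats = fit_clusters(Phi, labels)
```

In a monitoring setting, one cluster label would correspond to normal operation and the others to fault types, so the returned label answers both the detection and the classification question.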

Figure 6. First test case: Fault detection and classification results using (a) MFDA and (b) MKLFDA methods.

Table 3. Quantitative Comparison of Fault Detection and Classification Results between MFDA and MKLFDA Methods

          no. false alarms     no. missing faulty samples     no. fault classification errors
          MFDA     MKLFDA      MFDA     MKLFDA                MFDA     MKLFDA
case 1    143      8           36       3                     138      11
case 2    195      14          60       11                    95       11
case 3    186      16          44       12                    159      10

A schematic of the multiway KLFDA-based batch process monitoring approach is shown in Figure 2. The step-by-step procedure can be summarized as follows:
1. Collect the training data set $X_t$ in the format of a three-dimensional matrix including both normal and faulty operations from a batch or fed-batch bioprocess.
2. Perform dynamic time warping based batch alignment and data synchronization.
3. Conduct batchwise unfolding of the training data matrix $X_t$ into a two-dimensional matrix $\hat{X}_t$.
4. Scale the unfolded data matrix $\hat{X}_t$ into $\bar{X}_t$ with zero mean and unit variance along all the process variables at each sampling time.
5. Apply the KNN algorithm to classify the data set $\bar{X}_t$ into a total of C clusters.
6. Compute the within-class and between-class local matrices $\tilde{L}_w$ and $\tilde{L}_b$.
7. Further calculate the within-class and between-class scatter matrices $\tilde{S}_w$ and $\tilde{S}_b$ according to eqs 7 and 10.
8. Specify the kernel function $\tilde{K}_{i,j}$ and then compute the kernel matrix $\tilde{K}$ using the preprocessed training data set $\bar{X}_t$.
9. Conduct the generalized eigenvalue decomposition of eq 16 to obtain the eigenvalues $\{\tilde{\lambda}_1, \tilde{\lambda}_2, \ldots, \tilde{\lambda}_I\}$ and the eigenvectors $\{\tilde{\alpha}_1, \tilde{\alpha}_2, \ldots, \tilde{\alpha}_I\}$. Form a reduced-dimensional kernel localized Fisher subspace $\tilde{P}_r = [\tilde{\alpha}_1, \tilde{\alpha}_2, \ldots, \tilde{\alpha}_r]$.
10. For each batch of preprocessed training data, compute its image point $\varphi_{x_i}$ in the kernel-localized Fisher subspace according to eq 18.
11. For each new monitored batch $x_s$, unfold and normalize it into $\bar{x}_s$.
12. Calculate its image point $\varphi_{x_s}$ in the kernel localized Fisher subspace as per eq 19.
13. Compute the kernel localized Fisher discriminant function of the monitored batch $x_s$ at a sampling instant relative to all C clusters according to eq 20.
14. Classify the monitored batch at a sampling instant into the cluster with the largest Fisher discriminant function value. Thus one can determine whether any fault occurs in the monitored batch operation and to which type of fault it belongs.
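Step 2 of the procedure relies on dynamic time warping, whose details are left to the cited references. For orientation only, the sketch below shows the textbook dynamic-programming form of DTW with a Euclidean local cost and no window or slope constraints. It is not necessarily the variant used in this work, and `dtw` and `warp_to_reference` are hypothetical names.

```python
import numpy as np

def dtw(a, b):
    """Textbook DTW between sequences a (n, J) and b (m, J); returns (total cost, warping path)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)      # accumulated-cost table with a padded border
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack from the end of both sequences to their start
    path = [(n - 1, m - 1)]
    i, j = n, m
    while (i, j) != (1, 1):
        steps = [(D[i - 1, j - 1], i - 1, j - 1), (D[i - 1, j], i - 1, j), (D[i, j - 1], i, j - 1)]
        _, i, j = min(steps)
        path.append((i - 1, j - 1))
    return D[n, m], path[::-1]

def warp_to_reference(ref, seq):
    """Resample seq onto the time axis of ref by averaging the seq samples matched to each ref index."""
    _, path = dtw(ref, seq)
    seq = np.asarray(seq, dtype=float)
    return np.array([np.mean([seq[j] for i, j in path if i == t], axis=0)
                     for t in range(len(ref))])

# A batch that dwells one extra sample at the middle value is aligned back to the reference length
aligned = warp_to_reference([1, 2, 3], [1, 2, 2, 3])
```

After every batch has been warped to a common reference trajectory, all batches share the same number of sampling instants K, which is what the batchwise unfolding in step 3 requires.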

5. CASE STUDY 5.1. Fed-Batch Penicillin Fermentation Process. A simulated fed-batch penicillin fermentation process41,42 is adopted in this research to demonstrate the fault detection and classification capability of the multiway KLFDA-based batch monitoring approach. C-inar’s group has developed a web-based dynamic 3396

dx.doi.org/10.1021/ie1017282 |Ind. Eng. Chem. Res. 2011, 50, 3390–3402

Industrial & Engineering Chemistry Research

ARTICLE

Figure 7. Second test case: Test samples projected into two-dimensional (a) Fisher discriminant subspace and (b) kernel-localized Fisher discriminant subspace.

simulation program, namely, Pensim, to simulate a typical penicillin production process in a bioreactor primarily under a fed-batch operation mode. In the simulator, a mechanistic model has been extended to include various process variables such as aeration rate, agitation power, substrate feed, oxygen level, carbon dioxide concentration, etc. The entire list of operating variables are provided in Table 1. The fermentation process is composed of a bioreactor along with substrate feed, acid and base streams to balance pH value, cold and hot water flows to control temperature, and air flow to manipulate dissolved oxygen level. Two flow ratio controllers are implemented to track pH and temperature set points by adjusting acid over base flows and cold water over hot water flows, respectively. However, the other process variables like glucose, biomass, and oxygen concentrations are under open-loop operation. The bioreactor starts in a batch culture to grow the microorganism and is then switched to fed-batch operation after 40 h to promote the penicillin synthesis. The schematic diagram of the fermentation process is shown in Figure 3, and the online simulation program is available at C-inar’s group Web site (http://www.chbe.iit.edu/cinar/). The whole fermentation process lasts 400 h with a sampling interval of 0.5 h. In this study, three types of process faults are designed, which include a step change of 0.015 L/h in substrate feed flow rate, a drift error of slope 0.01 L/h in aeration rate, and increased

variations in fermentor temperature. First, 100 training batches are collected to build the KLFDA model, and all three kinds of faults are added to the training data. Then, three different test scenarios are set up as given in Table 2 to examine how accurately the MKLFDA method can detect the fault occurrence and identify the fault types in comparison with the conventional MFDA algorithm. 5.2. Fed-Batch Process Monitoring Results Using MFDA and MKLFDA Methods. In the training set, both the normal and the three types of faulty batches are simulated to build the initial KLFDA model and obtain the leading directions with the largest separation margins among four different clusters. The normal operating data of nine selected variables from the training batches are plotted in Figure 4. As listed in Table 2, the first test case is composed of normal operation and two different types of process faults, the step change in substrate feed flow rate occurring from the 200th hour with 100-h duration and the increased variation in fermentor temperature between the 300th and the 400th hour. The 800 samples from the test batch are projected into the two-dimensional Fisher discriminant subspace and kernel-localized Fisher discriminant subspace, respectively. As shown in Figure 5, the projected sample points in the FDA subspace are not well isolated among the three different clusters (normal and two 3397

dx.doi.org/10.1021/ie1017282 |Ind. Eng. Chem. Res. 2011, 50, 3390–3402

Industrial & Engineering Chemistry Research

ARTICLE

Figure 8. Second test case: Fault detection and classification results using (a) MFDA and (b) MKLFDA methods.

kinds of faulty operations). There is significant overlapping area especially between the two types of faults, step change in substrate feed rate, and increased variation in fermentor temperature. On the contrary, the projected points in the KLFDA subspace have fairly clear separation boundary among three clusters. The superiority of KLFDA approach over FDA method in classification capability is due to the fact that the former has both “kernel” and “localized” features to deal with the inherent nonlinearity and multimodality of the batch operating data with various types of faults. The paper includes the projected separation plots in Figure 5 for illustration purposes to show that the graphic visualization can provide an intuitive way to identify a small number of abnormal events. However, such visualizationbased fault identification is unlikely to work in more complicated industrial applications with large numbers of potential faults because their corresponding two- or three-dimensional graphical representations will inevitably overlap among many different faults. Therefore, it needs to be noted that fault identification in real-world applications will depend on calculation and display of the kernel localized Fisher discriminant function rather than the graphical presentation. The process monitoring results in the first case are further depicted in Figure 6. The X axis represents the sampling time while the Y axis provides normal and different faulty classes. For each sampling point in a monitored batch, it can be seen which normal or faulty class this point has been categorized into by either MFDA or MKLFDA monitoring

method. It is obvious that the multiway FDA method leads to rather high fault detection and classification errors. During the initial 200-h normal operation, there are total 143 points detected incorrectly as faulty ones. In the following two faulty periods, 36 points are missed from the fault detection without triggering any alarms. Furthermore, another 138 samples are misclassified into wrong types of process faults even though they are alarmed correctly. The quantitative results of fault detection and classification are summarized in Table 3. In contrast, both the fault detection and classification accuracy are significantly improved by adopting the multiway KLFDA approach. It can be observed from Figure 6 b that there are only eight false alarms triggered in the normal operating period. When the fault of step change is happening, three faulty samples are not captured and another four points are misclassified into the fault of increased variation. Similarly, in the last 100-h faulty period, there are no undetected faulty samples and only seven misclassified points that are categorized into the fault of step change or drift error. It is noted that some of the samples are classified into the fault of drift error by both MFDA and MKLFDA methods although it never occurs during the entire batch and fed-batch fermentation. For the false alarms, typically any types of abnormal events should last certain period of time and therefore the preceding and subsequent sample points can be examined to verify whether the current one is a valid alarm point or not. In other words, if neither the preceding nor the subsequent sample point triggers 3398

dx.doi.org/10.1021/ie1017282 |Ind. Eng. Chem. Res. 2011, 50, 3390–3402

Industrial & Engineering Chemistry Research

ARTICLE

Figure 9. Third test case: Test samples projected into two-dimensional (a) Fisher discriminant subspace and (b) kernel-localized Fisher discriminant subspace.

an alarm, then the current point can be excluded from faulty operation. Thus, most of the false alarms from the MKLFDA approach can be eliminated by the above heuristic rule. Similar guidance can be applied to reexamine the validity of fault classification and reduce the classification errors.

The second test scenario involves both the drift error in aeration rate and the increased variation in fermentor temperature. The fermentation process starts with normal operation until the 100th hour, when the drift error is introduced and lasts 100 h. Then the process operation is shifted back to normal conditions for another 50 h before the fault of increased variation in fermentor temperature takes place. The test samples projected into the Fisher discriminant and kernel-localized Fisher discriminant subspaces are shown in Figure 7 panels a and b, respectively. The multiway FDA method still performs poorly in classifying different clusters, and the normal and two kinds of faulty samples are mixed together in the leading Fisher discriminant subspace. As a result, the process faults occurring in the batch or fed-batch operation cannot be effectively detected and classified by the MFDA algorithm. Figure 8a shows that the detection and classification errors are unacceptably high. For example, 195 out of a total of 400 normal samples are misidentified as faulty, resulting in false alarms. On the other hand, 60 faulty points are missed without any alarms. Beyond fault detection, 46 samples with drift error are misclassified as either increased variation or step change. A similarly high percentage of misclassifications is also observed during faulty operation with increased variation. However, the test sample clustering in the kernel localized Fisher discriminant subspace, as shown in Figure 7b, is much more accurate. The one normal and two faulty classes of test samples are transformed to the feature plane with sufficiently good separation, and there are very few overlapped points around the separating boundaries between classes. Consequently, the process monitoring results of the MKLFDA approach in Figure 8b greatly excel those of the MFDA method. Only 14 normal samples trigger false alarms and 11 faulty points are mistakenly attributed to normal operation. Moreover, fault classification errors occur on as few as 11 samples. The superior fault detection and classification results of the MKLFDA method further confirm its strong ability to characterize the nonlinearity and within-class multi-Gaussianity of batch bioprocesses.

In the third test case, all three types of process faults (step change, drift error, and increased variation) are applied within the same fed-batch penicillin fermentation run. The process is operated under normal conditions during the first 100 h and then


Figure 10. Third test case: Fault detection and classification results using (a) MFDA and (b) MKLFDA methods.

the fault of increased variation in fermentor temperature is added. After 50 h of abnormal operation, another fault, a step change in substrate feed rate, occurs and lasts 80 h. After a short 20-h period of fault-free operation, the last fault, a drift error in aeration rate, appears and remains for 60 h before the fed-batch process is switched back to normal operation. This scenario is even more challenging for fault detection and particularly for fault classification because more kinds of faults are involved during the routine process operation. The two-dimensional projection charts of the test samples in the Fisher and kernel localized Fisher subspaces are drawn in Figure 9 panels a and b, respectively. Similar to the above two cases, the conventional MFDA method cannot capture the separation boundary between any one of the four classes and the others. The projected points of different clusters mostly overlap with each other, and thus the normal and various types of faulty data can hardly be classified. In contrast, the normal and different faulty operation regions as projected by the MKLFDA approach are well isolated from each other, with a fairly small percentage of overlapped points.

Furthermore, the fault detection and classification results of the MFDA and MKLFDA methods are compared in Figure 10 panels a and b. As summarized in Table 3, the MFDA method produces a total of 186 false alarms during normal operation and also misses 44 faulty samples. Among the remaining 336 faulty samples with alarms triggered, 159 points are classified into the wrong types of process faults. The MKLFDA approach, however, results in only 16 false alarms out of a total of 420 normal operating samples, which indicates a dramatic improvement in false alarm rate. In addition, only 12 faulty samples go undetected by the MKLFDA approach, a type-II error count much lower than that of the MFDA method. Moreover, the fault classification error occurs at only 10 faulty points with the MKLFDA approach. In this comparison, the apparent multimodality among the faulty data and the intrinsic nonlinearity of both normal and faulty data cause the failure of the MFDA method. Nevertheless, the MKLFDA approach successfully detects the vast majority of faulty samples and further classifies them into three fault categories with very high accuracy. Its satisfactory performance indicates that the MKLFDA approach can be extended to complex batch bioprocess operations with multiple types of faults.
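As a brief illustration of the third-case comparison and of the neighbor-check heuristic introduced for the first test case, the following Python sketch may be helpful. It is not the author's implementation: the function names `suppress_isolated_alarms` and `error_rates` and the class label `"normal"` are hypothetical, and the total of 380 faulty samples is inferred from the 44 missed plus 336 alarmed samples reported for MFDA. The misclassification rate is taken relative to the detected faulty samples, which is only one possible convention.

```python
from typing import Dict, List

NORMAL = "normal"  # assumed label for the fault-free class


def suppress_isolated_alarms(labels: List[str]) -> List[str]:
    """Relabel an alarmed sample as normal when both of its neighbors are normal.

    Decisions are based on the original labels, so two adjacent alarms
    always survive. The first and last samples are left unchanged.
    """
    cleaned = list(labels)
    for t in range(1, len(labels) - 1):
        isolated = labels[t - 1] == NORMAL and labels[t + 1] == NORMAL
        if labels[t] != NORMAL and isolated:
            cleaned[t] = NORMAL
    return cleaned


def error_rates(n_normal: int, n_faulty: int, false_alarms: int,
                missed: int, misclassified: int) -> Dict[str, float]:
    """Convert raw counts into rates; misclassification is per detected fault."""
    detected = n_faulty - missed
    return {
        "false_alarm_rate": false_alarms / n_normal,
        "missed_detection_rate": missed / n_faulty,
        "misclassification_rate": misclassified / detected,
    }


# Third test case counts as reported in the text.
# n_faulty = 380 is inferred (44 missed + 336 alarmed), not stated directly.
mfda = error_rates(n_normal=420, n_faulty=380,
                   false_alarms=186, missed=44, misclassified=159)
mklfda = error_rates(n_normal=420, n_faulty=380,
                     false_alarms=16, missed=12, misclassified=10)

# Toy label sequence (made up) to show the heuristic on one isolated alarm.
filtered = suppress_isolated_alarms(
    ["normal", "step", "normal", "drift", "drift", "normal"])
```

Under these assumptions, the false alarm rate falls from roughly 44% with MFDA to roughly 4% with MKLFDA, and the isolated `"step"` alarm in the toy sequence is removed while the two consecutive `"drift"` alarms are kept.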

6. CONCLUSIONS

The localized Fisher discriminant analysis algorithm is combined with multiway analysis and a kernel function to monitor challenging batch bioprocesses with strong nonlinearity and multi-Gaussianity. Batch-wise unfolding is first conducted to transform the three-way data matrix into a two-dimensional data set. Then different clusters corresponding to various types of process faults are identified by the clustering algorithm. The kernel function is further integrated with the localized within-class and between-class scatter matrices to extract the reduced-dimensional Fisher discriminant subspace through generalized eigenvalue analysis. The projected image points in the nonlinear and multi-Gaussian feature subspace can be used to compute the kernel-localized Fisher discriminant function values with respect to different clusters. Such a decision criterion therefore supports not only abnormal event detection but also fault type classification. The localized characteristic, along with the multiway analysis, may effectively capture the within-class multimodality due to the switching fault types in routine process operation, while the kernel function can handle the inherent nonlinearity of bioprocesses. The proposed MKLFDA approach is applied to the fed-batch penicillin production process and compared to the conventional MFDA method in terms of fault detection and classification. Three different test cases are designed with multiple types of process faults, and the monitoring results demonstrate that the MKLFDA approach detects and classifies various kinds of faults more accurately than the MFDA method. The sample point projection into the kernel localized Fisher discriminant subspace shows good separation among the normal and different faulty classes, while the regular Fisher subspace projection is unable to isolate the heavily overlapped clusters. The quantitative comparison further verifies that the presented MKLFDA approach outperforms the MFDA method, with a much lower false alarm rate, fewer undetected faulty samples, and fewer fault classification errors. The supervised MKLFDA approach thus offers a promising way to tackle the technical challenges of both multimodality and nonlinearity in batch or fed-batch bioprocesses.
In industrial practice, however, powerful computation methods and tools should still be combined with process experts’ knowledge and insights in order to validate the monitoring results and improve the reliability of the systematic approaches. Future work can focus on industrial applications of the proposed technique in larger-scale and more complicated bioprocess monitoring to tackle the practical issues of real implementation. For instance, the normal batch operation may include several different operating conditions along with various types of faults such as sensor drifting, flow valve sticking, and bacterial contamination. In addition, the potential integration of this monitoring technique into commercial software or automation systems can also be explored.
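The batch-wise unfolding step summarized in the conclusions can be illustrated with a minimal Python sketch. It assumes that each batch is stored as a list of time samples, each holding one value per variable, and that all batches have already been synchronized to equal length. The function names are hypothetical, and the Gaussian kernel with width `sigma` is an assumption for illustration, since the kernel choice is not restated in this section.

```python
import math
from typing import List, Sequence

# Assumed layout of one batch: K time samples, each with J variable values.
Batch = Sequence[Sequence[float]]


def unfold_batchwise(batches: Sequence[Batch]) -> List[List[float]]:
    """Unfold I batches of shape (K, J) into an I x (K*J) matrix, one row per batch."""
    n_samples, n_vars = len(batches[0]), len(batches[0][0])
    rows = []
    for batch in batches:
        if len(batch) != n_samples or any(len(s) != n_vars for s in batch):
            raise ValueError("all batches must share the same (time, variable) shape")
        # Concatenate the time samples of this batch into a single long row.
        rows.append([float(v) for sample in batch for v in sample])
    return rows


def gaussian_kernel_matrix(rows: Sequence[Sequence[float]],
                           sigma: float) -> List[List[float]]:
    """Gram matrix with entries exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [[math.exp(-sq_dist(a, b) / (2.0 * sigma ** 2)) for b in rows]
            for a in rows]


# Toy data: 2 batches, 3 time samples, 2 variables (values are made up).
batches = [
    [[1, 2], [3, 4], [5, 6]],
    [[7, 8], [9, 10], [11, 12]],
]
X = unfold_batchwise(batches)             # 2 x 6 unfolded matrix
K = gaussian_kernel_matrix(X, sigma=6.0)  # 2 x 2 kernel (Gram) matrix
```

In a full monitoring scheme, a kernel matrix of this kind would feed the localized within-class and between-class scatter matrices and the generalized eigenvalue analysis, which are omitted here.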

AUTHOR INFORMATION

Corresponding Author
*E-mail: [email protected]. Tel.: +1-281-544-7629. Fax: +1-281-544-7246.

Present Addresses
†Currently with Shell Global Solutions (US) Inc., Houston, TX 77082, USA.

ACKNOWLEDGMENT

The author appreciates the valuable comments and suggestions of the anonymous reviewers.

REFERENCES

(1) U.S. Food and Drug Administration. Guidance for industry PAT: A framework for innovative pharmaceutical development, manufacturing, and quality assurance. 2004; http://www.fda.gov/cder/guidance/6419fnl.pdf (accessed on July 5th, 2010).
(2) Clementschitsch, F.; Bayer, K. Improvement of bioprocess monitoring: development of novel concepts. Microb. Cell Fact. 2006, 5, 1–11.
(3) Çinar, A.; Parulekar, S.; Ündey, C.; Birol, G. Batch Fermentation: Modeling, Monitoring, Control; CRC Press: New York, NY, 2003.
(4) Lennox, B.; Kipling, K.; Glassey, J.; Montague, G.; Willis, M.; Hiden, H. Automated production support for the bioprocess industry. Biotechnol. Prog. 2002, 18, 269–275.
(5) Nomikos, P.; MacGregor, J. F. Monitoring of batch processes using multi-way principal component analysis. AIChE J. 1994, 40, 1361–1375.
(6) Nomikos, P.; MacGregor, J. Multivariate SPC charts for monitoring batch processes. Technometrics 1995, 37 (1), 41–59.
(7) Kosanovich, K.; Dahl, K.; Piovoso, M. Improved process understanding using multiway principal component analysis. Ind. Eng. Chem. Res. 1996, 35, 138–146.
(8) Wise, B.; Gallagher, N. The process chemometrics approach to process monitoring and fault detection. J. Process Control 1996, 6, 329–348.
(9) Martin, E.; Morris, A.; Papazoglou, M.; Kiparissides, C. Batch process monitoring for consistent production. Comput. Chem. Eng. 1996, 20, 599–604.
(10) Lennox, B.; Montague, G.; Hiden, H.; Kornfeld, G.; Goulding, P. Process monitoring of an industrial fed-batch fermentation. Biotechnol. Bioeng. 2001, 74, 125–135.
(11) Lee, J. M.; Yoo, C.; Lee, I. B. On-line batch process monitoring using a consecutively updated multiway principal component analysis model. Comput. Chem. Eng. 2003, 27, 1903–1912.
(12) Ündey, C.; Ertunç, S.; Çinar, A. Online batch/fed-batch process performance monitoring, quality prediction, and variable-contribution analysis for diagnosis. Ind. Eng. Chem. Res. 2003, 42, 4645–4658.
(13) Gunther, J. C.; Conner, J. S.; Seborg, D. E. Fault detection and diagnosis in an industrial fed-batch cell culture process. Biotechnol. Prog. 2007, 23, 851–857.
(14) Chiang, L.; Leardi, R.; Pell, R.; Seasholtz, M. Industrial experiences with multivariate statistical analysis of batch process data. Chemom. Intell. Lab. Syst. 2006, 81, 109–119.
(15) Ündey, C.; Ertunç, S.; Çinar, A. Intelligent real-time performance monitoring and quality prediction for batch/fed-batch cultivations. J. Biotechnol. 2004, 108, 61–77.
(16) Dong, D.; McAvoy, T. Nonlinear principal component analysis based on principal curves and neural networks. Comput. Chem. Eng. 1996, 20, 65–78.
(17) Lee, J. M.; Yoo, C. K.; Lee, I. B. Fault detection of batch processes using multiway kernel principal component analysis. Comput. Chem. Eng. 2004, 28, 1837–1847.
(18) Lee, J. M.; Yoo, C. K.; Lee, I. B. Nonlinear process monitoring using kernel principal component analysis. Chem. Eng. Sci. 2004, 59, 223–234.
(19) Yoo, C. K.; Lee, D. S.; Vanrolleghem, P. A. Application of multiway ICA for on-line process monitoring of a sequencing batch reactor. Water Res. 2004, 38, 1715–1732.
(20) Albazzaz, H.; Wang, X. Z. Statistical process control charts for batch operations based on independent component analysis. Ind. Eng. Chem. Res. 2004, 43, 6731–6741.
(21) Yoo, C. K.; Lee, J. M.; Vanrolleghem, P. A.; Lee, I. B. On-line monitoring of batch processes using multiway independent component analysis. Chemom. Intell. Lab. Syst. 2004, 71, 151–163.
(22) Lee, J. M.; Qin, S. J.; Lee, I. B. Fault detection of nonlinear processes using kernel independent component analysis. Can. J. Chem. Eng. 2007, 85, 526–536.
(23) Zhang, Y.; Qin, S. Fault detection of nonlinear processes using multiway kernel independent analysis. Ind. Eng. Chem. Res. 2007, 46, 7780–7787.
(24) Choi, S. W.; Park, J. H.; Lee, I. B. Process monitoring using a Gaussian mixture model via principal component analysis and discriminant analysis. Comput. Chem. Eng. 2004, 28, 1377–1387.


(25) Yu, J.; Qin, S. J. Multimode process monitoring with Bayesian inference-based finite Gaussian mixture models. AIChE J. 2008, 54, 1811–1829.
(26) Yoo, C. K.; Villez, K.; Lee, I. B.; Rosen, C.; Vanrolleghem, P. A. Multi-model statistical process monitoring and diagnosis of a sequencing batch reactor. Biotechnol. Bioeng. 2007, 96, 687–701.
(27) Yu, J.; Qin, S. J. Multiway Gaussian mixture model based multiphase batch process monitoring. Ind. Eng. Chem. Res. 2009, 48, 8585–8594.
(28) Chiang, L.; Russell, E.; Braatz, R. Fault Diagnosis in Chemical Processes using Fisher Discriminant Analysis, Discriminant Partial Least Squares, and Principal Component Analysis. Chemom. Intell. Lab. Syst. 2000, 50, 243–252.
(29) Chiang, L.; Kotanchek, M.; Kordon, A. Fault diagnosis based on Fisher discriminant analysis and support vector machines. Comput. Chem. Eng. 2004, 28, 1389–1401.
(30) He, Q. P.; Qin, S. J.; Wang, J. A new fault diagnosis method using fault directions in Fisher discriminant analysis. AIChE J. 2005, 51, 555–571.
(31) Zhang, X.; Yan, W.; Zhao, X.; Shao, H. Nonlinear biological batch process monitoring and fault identification based on kernel fisher discriminant analysis. Process Biochem. 2007, 42, 1200–1210.
(32) Yu, J. Localized Fisher Discriminant Analysis Based Complex Chemical Process Monitoring. AIChE J., published online August 5, 2010; DOI: 10.1002/aic.12392.
(33) Lee, J. M.; Yoo, C. K.; Choi, S. W.; Vanrolleghem, P. A.; Lee, I. B. Nonlinear process monitoring using kernel principal component analysis. Chem. Eng. Sci. 2004, 59, 223–234.
(34) Cho, H. W. Nonlinear feature extraction and classification of multivariate data in kernel feature space. Expert Syst. Appl. 2007, 32, 534–542.
(35) Alcala, C.; Qin, S. Reconstruction-based contribution for process monitoring with kernel principal component analysis. Ind. Eng. Chem. Res. 2010, 49, 7849–7857.
(36) Duda, R. O.; Hart, P. E.; Stork, D. G. Pattern Classification, 2nd ed.; John Wiley & Sons, Inc.: New York, NY, 2001.
(37) Sugiyama, M. Dimensionality reduction of multimodal labeled data by local Fisher discriminant analysis. J. Machine Learn. Res. 2007, 8, 1027–1061.
(38) Kassidas, A.; MacGregor, J.; Taylor, P. Synchronization of batch trajectories using dynamic time warping. AIChE J. 1998, 44, 864–875.
(39) Rabiner, L.; Rosenberg, A.; Levinson, S. Considerations in dynamic time warping algorithms for discrete word recognition. IEEE Trans. Acoustics, Speech Signal Process. 1978, 26, 575–582.
(40) Rabiner, L. R.; Juang, B. Fundamentals of Speech Recognition; Prentice Hall: Englewood Cliffs, NJ, 1993.
(41) Birol, I.; Ündey, C.; Birol, G.; Tatara, E.; Çinar, A. A web-based simulator for penicillin fermentation. Int. J. Eng. Simul. 2001, 2, 24–30.
(42) Birol, G.; Ündey, C.; Çinar, A. A modular simulation package for fed-batch fermentation: penicillin production. Comput. Chem. Eng. 2002, 26, 1553–1565.
