Ind. Eng. Chem. Res. 2007, 46, 4943-4953

Adaptive Monitoring Method for Batch Processes Based on Phase Dissimilarity Updating with Limited Modeling Data

Chunhui Zhao, Fuli Wang,* Furong Gao,† Ningyun Lu,‡ and Mingxing Jia

College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning Province, P. R. China

A monitoring method is proposed for batch processes that starts with a limited number of reference batches and then updates the model data structure as new normal batches accumulate. On the basis of an analysis of the unfolded matrix ((time × batch) × variable), a generalized moving window method is introduced for exploring the local covariance structure, dividing the process into phases, and developing monitoring models. An adaptive updating algorithm based on phase-specific dissimilarity analysis is then developed to accommodate the additional information carried by new batch data and to track time-varying behavior from batch to batch. The proposed method is illustrated with two processes, an industrial-scale experimental injection molding process and a simulated fed-batch penicillin fermentation. Both results show that the proposed method is effective.

1. Introduction

Batch and semibatch processes play an important role in the specialty chemical, semiconductor, food, and biotechnology industries for producing high-value-added products to meet today's rapidly changing market. Characterized by operation over a finite duration, a batch process is intended to produce products of desired quality at low cost. The nonsteady, time-varying, finite-duration, and nonlinear behavior of batch processes makes batch control more difficult than control of a continuous process. Process disturbances, which may vary from batch to batch, affect both process and product reproducibility.
Proper process monitoring and diagnosis is important not only to process safety but also to quality improvement.1-7 Most batch process monitoring methods are based on multiway principal component analysis (MPCA) and multiway partial least-squares (MPLS).8-12 These monitoring methods assume that the MPCA model, once built from reference data, is time-invariant, despite the fact that most batch processes exhibit batch-to-batch variations due to, for example, equipment and catalyst aging, sensor and process drift, and preventive maintenance and cleaning. It is not difficult to imagine that a fixed MPCA model can often produce false alarms in these cases. Several alternative methods have been proposed to resolve this problem.13-16 Li et al.13 proposed two adaptive PCA process monitoring algorithms that recursively update the correlation matrix. Flores-Cerrillo and MacGregor14 extended multiblock MPCA/MPLS to incorporate batch-to-batch trajectory information explicitly. Zhao et al.15 proposed a double moving window MPCA technique for adaptive batch monitoring. Lu et al.16 proposed a two-dimensional dynamic PCA, which captures both within-batch and batch-to-batch dynamics simultaneously. These works provide a theoretical basis for our approach.

In most existing methods, it is assumed that the reference data collected for modeling statistically covers all normal batch-to-batch variation. This is relatively easy for batch processes with short batch durations and for processes for which many trial runs are inexpensive. Slow batch processes, such as bio-related processes, take a long time to complete a batch run. In this case, it is not practical to collect many cycles of reference data before applying process monitoring. A modeling and monitoring method is therefore needed that starts process monitoring with only a limited number of batch cycles, i.e., just several batches. The initial models can then be adaptively improved and updated as new batch data become available. This updating can also account for the slow batch-to-batch variation of the process.

This work extends and refines the idea of our previous work,17 in which a moving time-window modeling and monitoring method was proposed that starts with only one normal batch. It differs from our previous work17 in the following ways: (1) This work employs several reference batches, rather than one batch, for initial modeling. This covers more information, which benefits process understanding and analysis. (2) This work captures the covariance structure and variance information of the process variables along both the batch and time directions, representing both slow and fast time-varying characteristics. (3) This method arranges the modeling batches in operation order and updates the models with new batches according to the manufacturing sequence. This not only acquires new process information but also properly reflects the common slow process drift along the batch direction. (4) The model is selectively updated in each phase based on a proposed dissimilarity-weighted averaging algorithm.

This paper is organized as follows. Details of the proposed method are explained in section 2.

* To whom correspondence should be addressed. Tel.: +86-024-83687434. Fax: +86-024-23890912. E-mail address: [email protected].
† Dept. of Chemical Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR, P. R. China.
‡ College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu Province, P. R. China.
Two illustrative examples are given in section 3: an experimental application to an injection molding process and a benchmark simulation of fed-batch penicillin production. Conclusions are drawn in section 4.

2. Methodology

In each batch run, assume that J variables are measured at k = 1, 2, ..., K time instances throughout the batch. The process data collected from I similar batches can then be organized as a three-way array X(I × J × K). Before

10.1021/ie061320f CCC: $37.00 © 2007 American Chemical Society Published on Web 06/05/2007


performing PCA, the three-way data matrix has to be unfolded to a two-way matrix in one of six possible ways, resulting in one of the following two-dimensional forms: A(KI × J), B(JI × K), C(IJ × K), D(I × KJ), E(I × JK), and F(J × IK). Principal component analysis of these two-dimensional matrices corresponds to observing the data for different types of variability. For industrial applications, only matrices A and D have practical meaning. In the method of Nomikos and MacGregor,11,12 the three-way process data matrix is unfolded to matrix D: X(I × KJ), which is a popular unfolding method for batch analysis and monitoring. PCA performed on matrix D focuses on the variance information and covariance structure of the process variables across different batches at the same sampling time. Commonly, modeling based on D-unfolding requires a statistically sufficient number I of normal batches; in general, it should satisfy I ≥ 2J, as recommended in the field of multivariate statistical regression to ensure a reliable statistical model18 that provides correct and sufficient process information. For the specific situation considered in this paper, the available reference batches are so limited that modeling directly with a D-unfolded matrix may not extract correct and sufficient batch variation information, so a more suitable data arrangement is needed.

2.1. Generalized Moving Windows Based on A-Unfolding. Recently, applications have been presented in which the three-way process data matrix is unfolded to matrix A and scaled to zero mean and unit variance,3,19-21 where the K vertical time-slice matrices, Xk(I × J), are placed beneath one another to form a two-way data matrix X(KI × J). This unfolding does not require all batches to be of equal length, nor does it require estimation of future data for online monitoring.
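The two unfolding forms discussed here (A and D) can be illustrated with a minimal NumPy sketch; the array sizes and names are ours, chosen only for illustration:

```python
import numpy as np

I, J, K = 5, 4, 100                       # batches, variables, time points
X = np.random.randn(I, J, K)              # three-way array X(I x J x K)

# A-unfolding: stack the K time-slice matrices X_k(I x J) beneath one
# another, giving a (K*I) x J matrix.
A = np.concatenate([X[:, :, k] for k in range(K)], axis=0)

# D-unfolding (Nomikos & MacGregor): one row per batch, time-major
# within each row, giving an I x (K*J) matrix.
D = X.transpose(0, 2, 1).reshape(I, K * J)

assert A.shape == (K * I, J)
assert D.shape == (I, K * J)
```

PCA on A then looks at variability over all samples of all batches, while PCA on D compares whole batch trajectories, which is why D-unfolding needs many reference batches to be reliable.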
Subtracting the average batch trajectory is essential for removing the major nonlinear dynamics of the batch. This procedure enables us to model the deviations from the mean trajectory. However, conventional mean centering of an A-unfolded matrix simply subtracts a constant, the grand mean of each variable over all batches and all time, from the trajectory of each variable. This still leaves the nonlinear and time-varying trajectories in the data matrix. Furthermore, the PCA loading model derived from an A-unfolded matrix can only reflect the process variable correlations and their relative importance from an "overall" perspective, without revealing the time-varying correlation structures. Hence, direct unfolding and mean centering in this manner (matrix A) offers little benefit to process monitoring.

In this section, we present the concept of a generalized moving window based on a modified A-unfolding over several batches, which is the basic unit for extracting the variance along both the batch and time directions. As illustrated in Figure 1, the data in each generalized window are arranged as a two-way matrix, X((l·I) × J), in which the sequential production order along the batch axis is preserved. The term l is the time length of the moving window, so l·I is the data size of the generalized window. Here, l should be chosen so that the data length of the generalized window, L (L = l·I), is two to three times the number of process variables, to ensure a reliable statistical model as commonly practiced in multivariable regression.18 Each row represents the process measurements at one sampling time of one batch, and the rows are arranged in ascending time and batch order. With this unfolding, the J-dimensional process variables spanning each moving time window and all modeling batches are combined, with order preserved, to extract the process correlations covering

both the batch and time directions. In the present work, the batches are assumed to be of equal length unless otherwise stated, so that the process time can be used as an indicator for data preprocessing, modeling, and online monitoring.

The modeling procedure is summarized as follows. First, for each batch data set X(K × J) in the initial modeling database, a series of moving time windows is formed, each a two-way matrix denoted Xw(l × J) (w = 1, 2, ..., K - l + 1), where l is the time window length and J is the number of process variables. The total number of moving windows is K - l + 1, as illustrated in Figure 1a. Second, the moving time windows of all modeling batches spanning the same sampling time region are combined and arranged in A-unfolding form, as illustrated in Figure 1b. This arrangement yields K - l + 1 generalized moving windows, designated X̄w((l·I) × J) (w = 1, 2, ..., K - l + 1). Third, within each generalized moving window, since the process trajectory does not vary greatly over such a short period, subtracting the mean of each column approximately eliminates the main nonlinearity due to the dynamic behavior of the process and isolates the deviation from the average trajectory. Each variable is also scaled to unit variance to handle different measurement units, giving each variable equal weight. The means and standard deviations are then indexed by the process time corresponding to the right edge of each moving window, for later use in online monitoring. Fourth, PCA is applied to each generalized moving window to obtain the PCA loadings, Pw(J × J) (w = 1, 2, ..., K - l + 1), which reflect the covariance structure among the process variables and the variance information along both the time and batch directions. It is important to point out that this is a key difference from other work, in which variance along only one axis (time or batch) can be reflected, not both.
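The first four steps above can be sketched as follows. This is a minimal NumPy sketch under our own naming (the paper gives no code), with the per-batch blocks stacked batch-by-batch within each window:

```python
import numpy as np

def generalized_windows(batches, l):
    """batches: list of I arrays, each K x J. Returns the K-l+1 generalized
    moving windows, each of shape (l*I) x J, with batch order preserved."""
    K, J = batches[0].shape
    windows = []
    for w in range(K - l + 1):
        # same time span [w, w+l) taken from every batch
        Xw = np.vstack([B[w:w + l, :] for B in batches])
        windows.append(Xw)
    return windows

def window_loadings(Xw):
    """Center and scale each column, then PCA via SVD.
    Returns the J x J loading matrix and the eigenvalues."""
    Z = (Xw - Xw.mean(axis=0)) / Xw.std(axis=0, ddof=1)
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    return Vt.T, (S ** 2) / (Z.shape[0] - 1)

batches = [np.random.randn(100, 4) for _ in range(5)]   # I=5, K=100, J=4
wins = generalized_windows(batches, l=10)
P, lam = window_loadings(wins[0])
assert len(wins) == 100 - 10 + 1 and P.shape == (4, 4)
```

Here l·I = 50 rows against J = 4 variables, comfortably above the 2J-3J rule of thumb cited in the text.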
This is beneficial for both process understanding and monitoring. Fifth, the PCA loading clustering algorithm22 is applied to divide the PCA loading matrices Pw(J × J) into several groups with different covariance structures. Consequently, a representative process monitoring model, P*c, can be developed for each phase, with corresponding control limits for the Hotelling T2 and squared prediction error (SPE) statistics. Finally, during online monitoring, the models and control limits are updated as new batches accumulate. The detailed modeling algorithm is described in the next two subsections.

2.2. Process Phase Partition. Generally speaking, industrial batch processes may operate in different phases, in which input profiles, conditions, process characteristics, and control strategies can vary from phase to phase. Louwerse and Smilde4 initiated the idea of partitioning reference data into several time periods for online monitoring. However, this method, based on MPCA, requires the estimation of future, unavailable measurements in each remaining time period for online monitoring. The adaptive batch monitoring strategy based on recursive multiblock PCA proposed by Rännar et al.23 avoids the need to estimate or fill in unknown data, but its computation can be overwhelming. The key to the phase-based sub-PCA modeling strategy is to divide a batch process into several phases by proper clustering of process characteristics. Changes in covariance structure, reflecting changes in process characteristics, can be employed in the phase partition algorithm, as pointed out in our earlier work.22 The generalized moving window loading matrices,

Ind. Eng. Chem. Res., Vol. 46, No. 14, 2007 4945

Figure 1. Illustration of moving window formulation: (a) formulation of time windows for individual batches; (b) formulation of generalized window of I number of cycles.

Pw(J × J), represent the local covariance information and the underlying process behavior during the corresponding time period across batches. Although batch process variables are time-varying, fast or slow, neighboring generalized windows have similar local covariance structures and, consequently, similar PCA loading matrices. Phase-based modeling begins with analyzing and clustering the loading matrices of the generalized windows. Different phases yield different loading matrices, driven by different underlying characteristics, whereas within the same phase the process correlation is similar and a representative phase model can be built. This method allows two-way PCA to be "directly" applied to a batch process. It has significant benefits because the latent variable structure is allowed to change at each phase.22-24 Analyzing the process in a phase-based manner also allows more detailed localization of faults in the process.4

Clustering Based on Moving Windows. How to reasonably divide the process trajectory using historical data has been an open issue. Existing works have described various strategies for identifying phases and aligning data accordingly.22,25,26 The changes in PCA loading structure, corresponding to the

varying process characteristics, can be and have been used to divide the process into different phases.22 There are several methods for defining the similarity of two PCA loading matrices. For instance, Krzanowski27 presented a PCA similarity factor as a measure of the similarity between two data spaces. Johannesmeyer et al.28 proposed a statistical method for comparing two PCA models by calculating T2 and/or Q similarity factors. Singhal and Seborg29 reported a pattern-matching methodology based on a modified PCA similarity factor. All of these existing methods, however, are ill-suited to simultaneously partitioning a large number of PCA models into groups. Here, the modified k-means clustering algorithm from our previous work22 is adopted to classify the generalized window PCA loading matrices. The clustering algorithm partitions the K - l + 1 generalized moving window PCA loading matrices, Pw(J × J), into C subgroups, such that the loading matrices remain similar within the same group but differ significantly between groups. The clustering results, associated with process operation time or an indicator variable, can be used to define subprocess phases.
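As a concrete example of comparing two loading matrices, the Krzanowski PCA similarity factor mentioned above can be written in a few lines (a sketch of that published factor, not of the modified k-means distance actually used in ref 22):

```python
import numpy as np

def pca_similarity(P1, P2, k):
    """Krzanowski PCA similarity factor between two J x J loading
    matrices, using their first k principal components.
    Equals 1 for identical subspaces, 0 for orthogonal ones."""
    Q1, Q2 = P1[:, :k], P2[:, :k]
    M = Q1.T @ Q2
    # trace(M M^T)/k is the average squared cosine of the angles
    # between the two k-dimensional subspaces
    return np.trace(M @ M.T) / k

P = np.linalg.qr(np.random.randn(6, 6))[0]
assert abs(pca_similarity(P, P, 3) - 1.0) < 1e-10
```

A clustering algorithm can use 1 minus this factor (or any comparable measure) as the distance between two windows' loading matrices.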


Figure 2. Scheme of clustering algorithm based on a generalized window’s PCA loading matrices.

P*c (c = 1, 2, ..., C) is the center of each phase, representing one of C pattern features. The scheme of the phase division algorithm based on generalized moving windows is illustrated in Figure 2.

In the clustering algorithm, the modeling accuracy and complexity depend on the choice of the clustering threshold. A larger threshold results in fewer clusters with more sampling points in each cluster; these clusters capture stable and sufficient process characteristics but cannot sensitively reveal how the correlations vary between different patterns. By contrast, a smaller threshold generates more clusters that track the evolution of the process correlation dynamics from one phase to another; however, the fewer samples in each class cannot provide enough process operation information for each phase, reducing model robustness. That is, a larger threshold favors extracting stable and sufficient phase characteristics from each cluster, whereas a smaller value favors tracking the varying process dynamics between phases over time. The threshold value should therefore be determined by a trade-off between these two competing abilities, by trial and error.

Moreover, it is noteworthy that as the moving window sweeps across time, some data belonging to one phase and some belonging to the neighboring phase are covered by the same window. Generally, a moving window shows characteristics more similar to a given phase if more of its data are collected from that phase. By clustering, such windows are assigned to the corresponding phase and used to build the phase-representative model. However, because of the smoothing effect of the moving windows, misclassification may occur in the switching regions between two neighboring phases, and the obtained phase division boundary may drift forward along the time direction compared with the real boundary.
Because the time lengths of the moving windows are much shorter than the corresponding phases, their effects on the phase division and on the characteristics of the representative phase-specific model, P*c, are negligible, and the present clustering approach is applicable. However, misclassification may lead to false or missing alarms during online monitoring in the neighboring regions between two sequential phases. Two alternative methods may be used to resolve this problem. One is to adjust the monitoring confidence limits around the neighboring regions of two phases; the other is to combine the indicator variable technique with prior process experience to interrogate phase completion information, which can help divide phases more accurately instead of depending only on the clustering results. Moreover, the partition could be made more

appropriate by applying a soft-partition idea, in which the samples located in the neighboring regions between two phases, termed transition regions here, which lie around the phase boundary obtained from the clustering results, are separated and given special attention. This raises the question of how to determine the proper transition range. Research on such cases is significant and deserves further investigation. Considering the complexity of the transition partition problem, various circumstances should be integrated and given special attention; research on new methods of modeling and monitoring focused on transition regions under different conditions is promising and will certainly arouse extensive academic interest in the future.

2.3. PCA Modeling Based on Phase Partition. With the application of the above clustering algorithm, all moving window data matrices are grouped into several subphases associated with process time. Within the same group, the process correlations are similar and do not change drastically, i.e., the loading matrices Pw(J × J) are similar, so the representative PCA loading matrix P*c for each phase can be modeled directly by a simple averaging algorithm without a heavy computational burden, while still ensuring the accuracy of modeling and online monitoring:

P*c = (1/nphase_c) Σw Pw(J × J)   (w = 1, 2, ..., nphase_c; c = 1, 2, ..., C)   (1)

where nphase_c is the number of generalized moving data windows belonging to phase c. Obtained in this way, the phase-representative PCA loading model P*c is approximately orthogonal, since it is the average of the loadings within the same phase, Pw(J × J), which bear similar orthogonal relationships. In the same way, S*c (c = 1, 2, ..., C) can be defined as the representative singular-value diagonal matrix for each phase, which will later be used to determine the number of retained principal components A*c and the control limits:

S*c = (1/nphase_c) Σw=1..nphase_c Sw = diag(λ*1,c, λ*2,c, ..., λ*J,c)   (2)
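Equations 1 and 2, together with a cumulative-explained-variance choice of the retained components, amount to a few lines of NumPy. The sketch below uses our own names and assumes each window's eigenvalues are supplied sorted in descending order:

```python
import numpy as np

def phase_model(loadings, eigvals, var_threshold=0.85):
    """Average the window loadings and eigenvalues within one phase
    (eqs 1-2), then split the averaged loadings into principal and
    residual subspaces by cumulative explained variance."""
    P_star = np.mean(loadings, axis=0)                 # eq 1
    S_star = np.mean(eigvals, axis=0)                  # eq 2 (diagonal, as a vector)
    ratio = np.cumsum(S_star) / np.sum(S_star)
    A = int(np.searchsorted(ratio, var_threshold) + 1) # retained components A*_c
    return P_star[:, :A], P_star[:, A:], A             # principal, residual, A

# toy phase with two identical windows: eigenvalues 0.7/0.2/0.1
loads = [np.eye(3), np.eye(3)]
eigs = [np.array([0.7, 0.2, 0.1])] * 2
Pc, Pres, A = phase_model(loads, eigs)
assert A == 2 and Pc.shape == (3, 2) and Pres.shape == (3, 1)
```

The 85% variance threshold is only an illustrative choice; the paper does not fix a specific cutoff.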

The number of retained principal components A*c for each phase can be determined from the cumulative explained variance ratio. Accordingly, P*c is divided into two parts, P̂*c and P̃*c, for the principal component subspace and the residual subspace, respectively, and S*c is divided likewise. In each phase c (c = 1, 2, ..., C), the representative loading matrix P̂*c is used to construct a sub-PCA model.

Initial Online Control Limits. The initial control limit trajectories for the two statistics, Hotelling T2 and SPE, are estimated from the initial successful reference batches. For the residual subspace, following the work of Box30 and of Jackson and Mudholkar,31 the SPE statistic can be approximated by a weighted chi-squared distribution, gχ2h, where the weight g and the degrees of freedom h can be obtained following the same approach as that of Nomikos and MacGregor.11 The parameters of the gχ2h distribution at time k are estimated from the SPE values in the corresponding generalized moving window, that is, gk = νk/(2mk) and hk = 2mk2/νk, where mk is the average of

Ind. Eng. Chem. Res., Vol. 46, No. 14, 2007 4947

the window’s SPE values and νk is the corresponding variance. Thus, the SPE control limit at time k can be approximated by

SPEk,α = gkχ2hk,α = (νk/(2mk))χ2(2mk2/νk),α   (3)

where, for the samples within the first generalized moving window of the process, the control limits simply employ the SPE values of the first moving window. For the principal component subspace, the generalized control limit32 of Hotelling T2 is estimated for each phase to describe the average variability of the process variables over both batches and time, as given in eq 4.

T2c ∼ [A*c nphase_c(L - 1)/(nphase_c(L - 1) - A*c)] F(A*c, nphase_c(L - 1) - A*c),α   (4)

where L is the total size of the generalized moving window (L = l·I).

2.4. Online Monitoring and Updating. The PCA models obtained above for every phase, which cover the normal variability information along both the batch and time directions, can be used for online monitoring. Process time or characteristic variables may be used to determine the phase to which the current sample belongs, by simply checking which time span the new data fall into. In online monitoring, the two statistics, T2 and SPE, are compared with the predetermined control limits. When a statistic goes beyond its control limit, signaling an abnormality, the contribution plot,33,34 a commonly used diagnosis tool, can be used to show the variables affected by the abnormality.

Online Process Monitoring. In online monitoring, newly acquired sample data xnew(1 × J) are first normalized using the mean and standard deviation at the current sampling time, obtained from the modeling procedure. Samples within the first window all use the mean and standard deviation of the first window, since they do not correspond to the right edge of any window. The principal component score, the estimated value, and SPE are calculated as follows:

tnew = xnew P̂*c
x̂new = tnew (P̂*c)T
SPEnew = (x̂new - xnew)(x̂new - xnew)T   (5)
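One online monitoring step (the projection of eq 5 checked against the gχ2h limit of eq 3) can be sketched as follows; the function name and arguments are ours, and the chi-squared quantile comes from SciPy:

```python
import numpy as np
from scipy import stats

def monitor_sample(x_new, P_hat, mean_k, std_k, m_k, v_k, alpha=0.99):
    """Normalize one new sample, project it onto the phase model P_hat
    (eq 5), and compare its SPE with the g*chi2_h limit of eq 3.
    m_k, v_k are the mean and variance of the window's SPE values."""
    x = (x_new - mean_k) / std_k
    t = x @ P_hat                        # scores
    x_hat = t @ P_hat.T                  # reconstruction
    spe = float((x - x_hat) @ (x - x_hat))
    g, h = v_k / (2 * m_k), 2 * m_k ** 2 / v_k
    limit = g * stats.chi2.ppf(alpha, h)
    return spe, limit, spe <= limit

J = 4
P_hat = np.linalg.qr(np.random.randn(J, J))[0][:, :2]
spe, limit, ok = monitor_sample(np.zeros(J), P_hat,
                                np.zeros(J), np.ones(J),
                                m_k=0.5, v_k=0.2)
assert ok and spe == 0.0 and limit > 0
```

A sample identical to the current mean has zero SPE and therefore always falls inside the limit, as the final assertion checks.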

Since the initial monitoring models are derived from a limited number of reference batches, they naturally cover limited batch variation and process information. As new successful batches accumulate, the reference trajectory, the representative PCA loading matrices, and the online control limits should be updated to obtain more accurate modeling information and, moreover, to accommodate the slow process drift along the batch direction. The detailed updating algorithm is given below.

Model Updating Based on Phase Dissimilarity. During online monitoring, when a new batch run tracks the usual operation trajectory and stays well within the normal confidence limits of the current monitoring models, it is considered a success and is archived in the normal database; otherwise, it is archived in a fault database. When several new normal batches are available, the models should be updated to accommodate the new normal process variation information.

Because the process exhibits significantly different underlying behaviors in different phases, the different subphase models, reflecting different time-varying process characteristics, need to be updated separately in each phase from batch to batch; i.e., the updating ratio of the monitoring model in each phase should differ in harmony with the characteristic changes of that phase. Here, we introduce the concept of dissimilarity35 as a forgetting factor over batches to quantitatively evaluate the batch evolution and to selectively update the models in different phases, overcoming the drawbacks of a uniform updating algorithm. Moreover, the simple updating algorithm adopted in the present work imposes no heavy computational burden or complexity in its implementation, while still ensuring online monitoring accuracy, as illustrated in the case studies below.

The concept of similarity or dissimilarity is often used for classifying a set of data; for example, the dissimilarity between two classes is measured, and the two classes with the smallest dissimilarity are combined to generate a new class. To evaluate the difference between the distributions of two data sets, a classification method based on the Karhunen-Loeve (KL) expansion36 is used in this work; the KL expansion is a well-known technique for feature extraction and dimensionality reduction in pattern recognition. Considering two data sets with the same number of columns and arbitrary numbers of rows, X1 ∈ RN1×J and X2 ∈ RN2×J, the dissimilarity index D is defined as follows:

D = diss(X1, X2) = (4/J) Σj=1..J (λj - 0.5)2   (6)

where λj is an eigenvalue of the covariance matrix of the transformed matrix35 obtained from one data set Xi, defined as √(Ni/(N1 + N2)) Xi P0 Λ-1/2, where P0TRP0 = Λ (Λ is a diagonal matrix whose elements are the eigenvalues of R) and R = (X1TX1 + X2TX2)/(N1 + N2). J is the number of process variables. The dissimilarity index D has been shown to lie between zero and one. Here, each column of Xi is assumed to be mean-centered and scaled.

On the basis of the phase division information of section 2.2, new updated PCA loading models P̂*c,upd specific to each phase can be readily obtained by recalculating eq 1 and determining the new number of retained principal components; they are then used to adjust the previous models. To detect the normal batch-to-batch variation, the dissimilarity index between the distribution of the old reference batches and the distribution of the current new modeling batches is calculated; it is then used to update the current monitoring model for the cth phase as follows, determining how much new information should be included and how much old information should be excluded:

P̂*c,cur = (1 - Dc,cur) × P̂*c,pre + Dc,cur × P̂*c,upd   (7)

where the subscript c = 1, 2, ..., C; P̂*c,cur denotes the current monitoring model of the cth phase, P̂*c,pre is the previously updated model, and P̂*c,upd is the updating one. The latter two represent the process distributions and are employed as Xi in eq 6 to calculate the D value. Moreover, when the number of retained principal components in P̂*c,upd differs from that of P̂*c,pre, the larger number is chosen to match their dimensionality, so that the number of principal components can change adaptively as the process is updated.
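The KL-based dissimilarity index of eq 6 and the weighted update of eq 7 can be sketched as follows (our own NumPy rendering of the cited definition, with illustrative names):

```python
import numpy as np

def dissimilarity(X1, X2):
    """KL-expansion dissimilarity index D of eq 6
    (0 = identical distributions, up to 1 = very different).
    Columns of X1, X2 are assumed mean-centered and scaled."""
    N1, N2 = len(X1), len(X2)
    J = X1.shape[1]
    R = (X1.T @ X1 + X2.T @ X2) / (N1 + N2)
    lam, P0 = np.linalg.eigh(R)               # P0^T R P0 = Lambda
    Y1 = np.sqrt(N1 / (N1 + N2)) * X1 @ P0 @ np.diag(lam ** -0.5)
    S1 = Y1.T @ Y1 / N1                       # covariance of transformed set
    eig1 = np.linalg.eigvalsh(S1)             # lambda_j of eq 6
    return 4.0 / J * np.sum((eig1 - 0.5) ** 2)

def update_phase_model(P_pre, P_upd, D):
    """Dissimilarity-weighted model update of eq 7."""
    return (1 - D) * P_pre + D * P_upd

X = np.random.randn(50, 4)
assert dissimilarity(X, X) < 1e-8             # identical sets -> D = 0
```

For identical data sets the transformed covariance has all eigenvalues equal to 0.5, so D = 0 and eq 7 keeps the previous model unchanged.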


The forgetting factor is important in the updating algorithm; it determines the ratio of historical information to new information in the current process monitoring models. When the two data sets are very similar to each other, D is close to zero, and the updated phase model is kept largely similar to the old one; when the data sets are very different, D is close to one, and the updating ratio is consequently high. From the above algorithm, it can be seen that the updating strategy based on phase dissimilarity not only allows the centers of the representative PCA loading matrices to shift with the accumulation of new batch data but also reveals the phase-specific extent of batch variation. The updated PCA monitoring models are then employed for subsequent online monitoring.

It should be pointed out that, for online updating, the number of new modeling batches per update, i.e., the update frequency h, is an important parameter deserving special consideration. Obviously, a larger h results in slower reflection of changes along the batch direction, whereas an extremely small h may cause over-updating. To ensure favorable sensitivity and robustness of the proposed method, practical experience suggests choosing h no smaller than the number of initial modeling batches. This serves as a trade-off between the ability to track process drift and the reliability of model updating.

Control Limits Updating. With the accumulation of new successful batch data, the initial control limits described above may become too tight (or too loose) in certain time spans for monitoring a new batch. It is desirable to adjust the control limits by taking the batch-to-batch variation into account. T2 control limits are defined separately in each phase; meanwhile, the normal variation range within a phase is much larger than that along the batch axis.
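The SPE control-limit updating described in the remainder of this subsection, which augments the generalized window with new normal batch data (eq 8) and re-estimates the gχ2h parameters, can be sketched as follows (our own names; SciPy supplies the chi-squared quantile):

```python
import numpy as np
from scipy import stats

def update_spe_limit(X_pre, X_upd, P_hat, alpha=0.99):
    """Augment the generalized window with new normal data (eq 8),
    renormalize the pooled matrix, and re-estimate the g*chi2_h
    SPE control limit from the pooled SPE values."""
    X_cur = np.vstack([X_pre, X_upd])                    # eq 8
    Z = (X_cur - X_cur.mean(0)) / X_cur.std(0, ddof=1)   # renormalize
    E = Z - Z @ P_hat @ P_hat.T                          # residual part
    spe = np.sum(E ** 2, axis=1)
    m, v = spe.mean(), spe.var(ddof=1)
    g, h = v / (2 * m), 2 * m ** 2 / v
    return g * stats.chi2.ppf(alpha, h)

X_pre = np.random.randn(50, 4)                # old window data
X_upd = np.random.randn(10, 4)                # new normal batch data
P_hat = np.linalg.qr(np.random.randn(4, 4))[0][:, :2]
limit = update_spe_limit(X_pre, X_upd, P_hat)
assert np.isfinite(limit) and limit > 0
```

As more normal batches are appended, the pooled window carries more batch-to-batch variation and the limit adjusts accordingly.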
The initial SPE control limits, SPEk,α, are estimated from the SPE values of the data in the generalized moving window, X̄((l·I) × J). In the updating procedure, the data used for estimating the parameters are augmented to include the data of the new normal batches:

X̄k,cur = [X̄k,pre; Xk,upd]   (8)

where X̄k,cur denotes the current augmented generalized moving window data corresponding to sampling time k, X̄k,pre is the old window at time k, and Xk,upd is the new data used for updating. In this way, the generalized moving windows contain more and more batch-to-batch variation, and the SPE control limits estimated from the new data window are adjusted correspondingly. Moreover, as the real process operation evolves, the mean and standard deviation also shift slowly. To reduce the error caused by normalization, when a new batch is judged to be normal, the current mean and standard deviation are recalculated by averaging the new and old ones for subsequent monitoring. The augmented matrix, composed of the new modeling batch set and the historical one, should therefore be renormalized before the SPE parameters are estimated.

3. Illustration and Discussion

The proposed process monitoring approach is tested with two processes: a typical multiphase process, injection molding, and the long-cycle fed-batch penicillin fermentation process. Although the method is developed for long batch processes, it works equally well with a relatively short batch process such

Figure 3. Simplified schematic diagram of the injection molding machine and its major measurements.

Table 1. Process Variables for Injection Molding Process

no.  variable description     unit
1    barrel temperature       °C
2    packing pressure         bar
3    stroke A                 mm
4    stroke B                 mm
5    injection velocity       mm/s
6    hydraulic pressure       bar
7    plastication pressure    bar
8    cavity pressure          bar
9    screw rotation speed     rpm
10   SV1 opening              %
11   SV2 opening              %
12   mold temperature         °C

Table 2. Operation Condition Setting for Injection Molding Process

operating condition          setting value
material                     high-density polyethylene (HDPE)
injection velocity           24 mm/s
packing pressure             200 bar
mold temperature             45 °C
seven-band barrel temp       200, 200, 200, 200, 180, 160, 120 °C
packing-holding time         6 s
cooling time                 15 s

as injection molding. The first example demonstrates the feasibility of the proposed algorithm for a relatively "complex" multiphase process, whereas the second example focuses on the application of the proposed scheme to a slow bio-related batch process. 3.1. Injection Molding Process. Injection molding,37,38 a key process in polymer processing, transforms polymer materials into various shapes and types of products. Figure 3 shows a simplified diagram of a typical reciprocating-screw injection molding machine with instrumentation.37 A typical injection molding process consists of three operation phases: injection of molten plastic into the mold, packing-holding of the material under pressure, and cooling of the plastic in the mold until the part becomes sufficiently rigid for ejection. In addition, plastication takes place in the barrel in the early cooling phase, where polymer is melted and conveyed to the barrel front by screw rotation in preparation for the next cycle.37 The material used in this work is high-density polyethylene (HDPE). The process variables selected for modeling are listed in Table 1, and the operating conditions are set as shown in Table 2. Forty normal batch runs are conducted under the same conditions; the first 4 batches are used for initial modeling, and the other 36 cycles are used for model updating.
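Before modeling, the batch data must be normalized. One common choice for the ((time × batch) × variable) arrangement is time-slice z-scoring across batches, which removes the mean trajectory (and with it the main nonlinear dynamics). This is shown as an assumed sketch, not necessarily the authors' exact normalization scheme:

```python
import numpy as np

def normalize_batches(X):
    """Time-slice z-score normalization of batch data (assumed scheme).

    X: (I, K, J) array (batches x time x variables).  Subtracting the
    across-batch mean trajectory removes the dominant time-varying
    behavior; dividing by the across-batch standard deviation equalizes
    variable weights.  The statistics are returned because online
    monitoring reuses (and later updates) them.
    """
    mean = X.mean(axis=0)                  # (K, J) mean trajectory
    std = X.std(axis=0, ddof=1)
    std[std < 1e-12] = 1.0                 # guard constant channels
    return (X - mean) / std, mean, std
```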


Figure 4. Phase clustering result for injection molding trajectory.

Figure 5. Online SPE monitoring chart for an abnormal batch with chilling water fault (solid line, 99% control limit; dashed line, 95% control limit; bold solid line, online SPE statistic).

The time length of the moving window l is set to 36 samples, so the size of the generalized window is L = 36 × 4 = 144. The weighted loading matrices calculated from the generalized moving window matrices are fed to the clustering algorithm. The clustering result is shown in Figure 4, clearly showing that, without using any prior process knowledge, the trajectories of the injection molding process are automatically divided into 10 phases reflecting the changing trend of local covariance structures. Among them, 4 long phases (shaded with 1 and 3) correspond to the four main physical operation phases, i.e., the injection, packing-holding, plastication, and cooling phases. A few short transitional phases (marked with 2 and 4) emerge between the four main phases, corresponding to the transitions from one major phase to the next. Such a detailed division of a batch process into "steady" and transient phases can not only improve process monitoring and diagnosis performance but also enhance process analysis and understanding. In each phase, only three to five principal components are retained for each representative model. Online process monitoring is conducted by judging whether the SPE values of the current sample in a running batch are below the control limits. A chilling water fault is introduced from the beginning by shutting down the chilling water pump. The chilling water system is important to the injection molding process; it keeps the mold temperature stable within the set range. If it fails, excessive heat cannot be discharged, and the resulting high temperature will affect molding shrinkage and result in defective products. This upset is quickly picked up by the SPE monitoring chart shown in Figure 5, where the SPE values go above the control limit shortly after the start of the process and yield a more stable detection result during the later process phases.
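The generalized moving window construction described above can be sketched as follows. The (I, K, J) array layout and the SVD-based loading extraction are assumptions for illustration; for the settings above (l = 36, I = 4 batches) the window has 144 rows:

```python
import numpy as np

def generalized_window(X, k, l):
    """Stack the window [k-l+1, k] of every batch into one (l*I x J)
    matrix -- the generalized moving window at sampling time k.

    X: (I, K, J) normalized batch data (assumed layout).
    """
    I, K, J = X.shape
    start = max(0, k - l + 1)      # truncate near the batch start
    return X[:, start:k + 1, :].reshape(-1, J)

def window_loadings(W, n_pc):
    """PCA loadings (J x n_pc) of a window matrix via SVD."""
    Wc = W - W.mean(axis=0)
    _, _, Vt = np.linalg.svd(Wc, full_matrices=False)
    return Vt[:n_pc].T
```

The loading matrices obtained window by window are what the clustering algorithm groups into phases: windows inside one phase share a covariance structure, so their loadings are close.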
The detection result shows that this fault can be readily detected with the initial models derived from the limited reference batches. From the contribution plot shown in Figure 6, it can be seen that the mold temperature (variable

Figure 6. Contribution plots to SPE at different main phases.

12) is most seriously impacted by the detected fault throughout all phases and that the barrel temperature (variable 1) displays abnormality only around sample 200 (about phase 2), which generally agrees with the industrial circumstances. It should be noted, however, that process variables tend to be intricately correlated with one another in practical batch industry. If fault detection is delayed, fault diagnosis becomes more complicated and deceptive, since over time all of these variables will show significant deviations from their expected values. Therefore, to identify the cause correctly, the information from contribution plots has to be combined with knowledge of the process to give a reasonable and accurate explanation of the fault causes. Figure 7 shows the monitoring results of the first normal batch in the updating database, where the values of the Hotelling T² and SPE statistics are both well below the control limits, indicating that the batch is free of any process abnormality. According to the proposed method, normal batches are then used to update the models and control limits in production sequence. Here, the number of new modeling batches, i.e., the update frequency, is chosen to be the same as the number of initial modeling batches. Figures 8 and 9 show, respectively, the monitoring results of a normal batch using updated models and using models without updating. By contrast, it can be seen that false alarms mainly occur within the fourth phase if the models are not updated but disappear when updated models are used. The dissimilarity indexes between the historical models and the updated ones are shown in Figure 10 for each phase, which explains why
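The statistics and contribution plots used here follow the standard PCA monitoring formulas; a minimal sketch for one sample under a phase model (the helper name and argument layout are illustrative):

```python
import numpy as np

def monitor_sample(x, P, lam):
    """Hotelling T2, SPE, and per-variable SPE contributions for one
    normalized sample under a phase PCA model.

    x: (J,) sample; P: (J, A) loadings; lam: (A,) score variances.
    The SPE contributions are the squared residuals, so they sum to SPE
    and point to the variables most affected by a detected fault.
    """
    t = x @ P                        # scores in the phase model
    T2 = float(np.sum(t * t / lam))  # Mahalanobis distance in score space
    residual = x - P @ t             # part not explained by the model
    contrib = residual ** 2          # per-variable contributions to SPE
    return T2, float(contrib.sum()), contrib
```

Plotting `contrib` for the faulty samples reproduces a Figure 6 style contribution chart: the chilling water fault would show up as a dominant bar at the mold temperature channel.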


Figure 9. Online SPE monitoring chart for a new normal batch without updating (solid line, 99% control limit; dashed line, 95% control limit; bold solid line, online SPE statistic).

Figure 7. Online monitoring results for a normal batch in updating database (solid line, 99% control limit; dashed line, 95% control limit; bold solid line, online T2 or SPE statistic).

Figure 10. Dissimilarity index between historical models and current ones (x-axis is the sampling time, and y-axis is the dissimilarity index between models corresponding to each process time).

Figure 8. Online SPE monitoring chart for a new normal batch with updated models (solid line, 99% control limit; dashed line, 95% control limit; bold solid line, online SPE statistic).

false alarms can happen without updating: larger dissimilarity values appear mostly during the fourth phase, corresponding to larger batch-to-batch variation and a greater extent of necessary updating. This series of monitoring results demonstrates that it is viable to monitor such a process starting with models and control limits derived from just a few cycles, and that adjusting for the successive batch-to-batch variation is reasonable and benefits the efficiency of process monitoring. 3.2. Fed-Batch Penicillin Fermentation. In this section, the proposed method is applied to the monitoring of a well-known benchmark process, the fed-batch penicillin fermentation process.39,40 A flow diagram of the penicillin fermentation process

Figure 11. Flow diagram of the penicillin fermentation process.

is given in Figure 11. The production of secondary metabolites such as antibiotics has been the subject of many studies because of its academic and industrial importance. In the typical operating procedure for the modeled fed-batch fermentation, most of the necessary cell mass is obtained during the initial

Table 3. Variables Used in the Monitoring of the Benchmark Model

no.  variable
1    aeration rate (L/h)
2    agitator power (W)
3    substrate feed rate (L/h)
4    substrate feed temperature (K)
5    dissolved oxygen concentration (g/L)
6    culture volume (L)
7    carbon dioxide concentration (g/L)
8    pH
9    fermentor temperature (K)
10   generated heat (kcal)
11   cooling water flow rate (L/h)

Table 4. Twenty Updating Batches (Initial Culture Volume Change Bounded within 100-103 L)

batch no.  initial culture volume (L)    batch no.  initial culture volume (L)
1          100.2                         11         101.8
2          100.3                         12         101.9
3          100.5                         13         102.0
4          100.7                         14         102.1
5          100.8                         15         102.3
6          101.0                         16         102.4
7          101.1                         17         102.5
8          101.3                         18         102.7
9          101.5                         19         102.8
10         101.6                         20         103.0
preculture phase. When most of the initially added substrate has been consumed by the microorganisms, the substrate feed begins. Penicillin starts to be generated in the exponential growth phase and continues to be produced until the stationary phase. A low substrate concentration in the fermentor is necessary for achieving a high product formation rate because of the catabolite repressor effect; consequently, glucose is fed continuously during fermentation instead of being added entirely at the beginning. Here, we focus on the process to produce penicillin, which has nonlinear dynamics and multiphase characteristics. The Monitoring and Control Group of the Illinois Institute of Technology has developed a simulator (PenSim v1.0) that is capable of simulating the production of penicillin under various operating conditions (http://www.chee.iit.edu/~cinar). In the present simulation experiment, the duration of each batch is set to 400 h and the sampling interval is 1 h. The process variables retained in this work are shown in Table 3. The initial monitoring models are developed based on data from only four batches of normal operation, providing the information necessary for online monitoring. The length of the moving time window l is set to 33 samples, i.e., three times the number of process variables, and the generalized window size is L = 33 × 4 = 132. Small variations are added to mimic the real process environment under the default initial setting conditions. By the clustering algorithm, the process can be naturally divided into four main phases. Only three to five principal components are selected in each phase for the proposed modeling method. The initial culture volume in realistic process operation tends to increase gradually from batch to batch, whereas that of the initial modeling batches is the default value of 100 L. Therefore, 20 additional normal batches were generated for updating under the conditions shown in Table 4.
Despite the gradual increase in initial culture volume, these batches are regarded as normal because the change reflects the actual operation status and does not drastically affect the correlations among process variables. Figure 12 shows the T² and SPE online monitoring charts of updating batch no. 1 obtained using the initial PCA models. The time trajectories of T²

Figure 12. Online monitoring results for a normal batch in updating database (solid line, 99% control limit; dashed line, 95% control limit; bold solid line, online T2 or SPE statistic).

and SPE are well within the confidence limits, indicating that the batch tracks the normal operation trajectory. In this simulation, the update frequency is chosen as four, i.e., the same as the number of initial modeling batches. According to the proposed method, whenever four new successful batches are available, the initial model is updated discriminatingly in each phase, and the information of the updating batches enters the designed models. Consequently, for the 20 normal updating batches shown in Table 4, the models and control limits are updated five times consecutively. In contrast, without updating, the process is falsely detected to be out of control throughout test batch no. 13, as shown in Figure 13a, although batch no. 13 is one of the normal batches operated under the setting conditions in Table 4. Hence, without updating, the reliability of the fixed PCA models is compromised for the current monitoring. The monitoring result for the same normal batch using the updated models is shown in Figure 13b; it displays no alarm during batch operation, which reflects the actual state. Thus, the model updating approach effectively captures the batch variation trend and significantly reduces false alarms. Moreover, besides the normal cases in Table 4, one abnormal batch with a pH controller failure introduced at 1 h is added for every new operating condition to evaluate the monitoring performance of the new models whenever they are updated.
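Monitoring performance of this kind is conventionally quantified by three indices: the detection delay from the time of fault occurrence, the false alarm rate before the fault, and the missing alarm rate after it. A minimal sketch of their computation (the helper name is hypothetical):

```python
import numpy as np

def monitoring_indices(alarms, fault_start=None):
    """Evaluation indices for one monitored batch.

    alarms: boolean sequence, True where the statistic exceeds its limit.
    fault_start: sample index of fault occurrence (None = normal batch).
    Returns (detection_delay, false_alarm_rate, missing_alarm_rate);
    entries that do not apply are None.
    """
    alarms = np.asarray(alarms, dtype=bool)
    if fault_start is None:                 # normal batch: only false alarms
        return None, float(alarms.mean()), None
    pre, post = alarms[:fault_start], alarms[fault_start:]
    hits = np.flatnonzero(post)             # alarmed samples after the fault
    delay = int(hits[0]) if hits.size else None
    false_rate = float(pre.mean()) if pre.size else 0.0
    return delay, false_rate, float(1.0 - post.mean())
```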
Consequently, there are five monitoring results for normal and abnormal batches, respectively, corresponding to the five updating operations. These are summarized in Table 5 by means of the three evaluation indices: the detection delay from the time of fault occurrence, the false alarm rate before the occurrence of the fault, and the missing alarm rate after the abnormality. Conducting more simulations with different kinds of faults and operating conditions would further confirm the effectiveness of the proposed method; nevertheless, Table 5 already gives a clear impression of its monitoring performance. When there is no abnormality in the process, the monitoring statistics stay well below the control limits throughout the batch with few false alarms, indicating that the batch tracks the normal operation trajectory. For abnormal operation, the proposed method quickly and correctly detects the fault with a low missing alarm rate and an almost 0% false alarm rate. These results confirm the reliability of the updating algorithm.

Figure 13. Online SPE monitoring chart for a new normal batch using (a) fixed models and (b) updated models (solid line, 99% control limit; dashed line, 95% control limit; bold solid line, online SPE statistic).

Table 5. Summary of Monitoring Results for Five Updates

updating no.  batch no.  detecting delay  false alarm rate  missing alarm rate
updating 1    1a         -                3.5%              -
              2b         3                0%                2.25%
updating 2    3a         -                3.0%              -
              4b         5                0%                4.25%
updating 3    5a         -                1.75%             -
              6b         2                0%                3%
updating 4    7a         -                2.0%              -
              8b         0                0%                2.5%
updating 5    9a         -                1.5%              -
              10b        1                0%                4%

a Normal batch. b Fault process.

4. Conclusions

A batch monitoring method has been proposed for cases where process monitoring must be started with limited cycles. A normalization method has been formulated to remove the main nonlinearity and dynamics from the data before modeling. The formulation of a generalized window allows covariance information to be extracted over both batch and time directions. Process phases are determined by analyzing the changes in process covariance structure. Sub-PCA models are then developed for each phase according to the clustering results. A model updating law has been formulated with the accumulation of newly available batches. The phase dissimilarity concept is introduced to allow models to be updated discriminatingly over different phases. The proposed method can not only give a valid initial monitoring scheme with limited reference batches but also allow adaptive model updating to accommodate the slow time-varying behaviors from batch to batch. The applications to two types of batch processes show that the proposed method is effective.

Acknowledgment

This work was supported in part by Hong Kong RGC (DAG01/02 EG11), the National Natural Science Foundation of China (No. 60374003), and Project 973 (No. 2002CB312200), China.

Appendix: Modified k-Means Clustering Algorithm

Inputs: the patterns to be partitioned, {P̌_1, P̌_2, ..., P̌_K}, and the threshold θ for cluster elimination. Outputs: the number of clusters C, the cluster centers {W_1, W_2, ..., W_C}, and the strict membership of P̌_k to the C centers, m(k). The index variables are the iteration index i and the pattern index k.
(1) Choose C_0 (i = 0) cluster centers W_c^0 (c = 1, 2, ..., C_0) from the K patterns along the time series. Practically, the initial cluster centers can be assumed to be uniformly distributed in the pattern set.
(2) Merge pairs of clusters whose intercenter distance, dist(W_c1^(i-1), W_c2^(i-1)), is below the predetermined threshold θ.
(3) Calculate the distances from each pattern P̌_k to all of the centers, dist(P̌_k, W_c^(i-1)); assign P̌_k to the nearest center W_c*^(i-1), and denote its membership as m(k) = c*.
(4) Eliminate the clusters that catch few patterns after a set number of iterations (i > I_num) to avoid singular clusters.
(5) Update the number of cluster centers to C_i and recompute the new cluster centers W_c^i (c = 1, 2, ..., C_i) using the current cluster membership m(k).
(6) Go back to step 2 if a convergence criterion is not met. Typical convergence criteria are minimal changes in the cluster centers and/or a minimal rate of decrease in squared errors.
Remark: The detailed programs for implementing the proposed algorithm and the experimental data are not included in the present work for brevity. A more complete document will be provided upon request addressed to the first author.

Literature Cited

(1) Kourti, T.; MacGregor, J. F. Process analysis, monitoring and diagnosis, using multivariate projection methods. Chemom. Intell. Lab. Syst. 1995, 28, 3.
(2) Kosanovich, K. A.; Dahl, K. S.; Piovoso, M. J. Improved process understanding using multiway principal component analysis. Ind. Eng. Chem. Res. 1996, 35, 138.
(3) Wold, S.; Kettaneh, N.; Friden, H.; Holmberg, A. Modelling and diagnosis of batch processes and analogous kinetic experiments. Chemom. Intell. Lab. Syst. 1998, 44, 331.
(4) Louwerse, D. J.; Smilde, A. K. Multivariate statistical process control of batch processes based on three-way models. Chem. Eng. Sci. 2000, 55, 1225.
(5) Chen, G.; McAvoy, T. J. Process control utilizing data based multivariate statistical models. Can. J. Chem. Eng. 1996, 74, 1010.
(6) Ündey, C.; Cinar, A. Statistical monitoring of multistage, multiphase batch processes. IEEE Control Syst. Mag. 2002, 22, 40.

(7) Kourti, T. Multivariate dynamic data modeling for analysis and statistical process control of batch processes, start-ups and grade transitions. J. Chemom. 2003, 17, 93.
(8) Kosanovich, K. A.; Piovoso, M. J.; Dahl, K. S. Multi-Way PCA Applied to an Industrial Batch Process. In Proceedings of the American Control Conference, Baltimore, MD, June 1994; p 1294.
(9) Dong, D.; McAvoy, T. J. Multi-stage Batch Process Monitoring. In Proceedings of the American Control Conference, Seattle, WA, June 1995; p 1857.
(10) Nomikos, P.; MacGregor, J. F. Monitoring of batch processes using multi-way principal component analysis. AIChE J. 1994, 40, 1361.
(11) Nomikos, P.; MacGregor, J. F. Multivariate SPC charts for monitoring batch processes. Technometrics 1995, 37, 41.
(12) Nomikos, P.; MacGregor, J. F. Multiway partial least squares in monitoring batch processes. Chemom. Intell. Lab. Syst. 1995, 30, 97.
(13) Li, W. H.; Yue, H. H.; Valle-Cervantes, S.; Qin, S. J. Recursive PCA for adaptive process monitoring. J. Process Control 2000, 10, 471.
(14) Flores-Cerrillo, J.; MacGregor, J. F. Multivariate monitoring of batch processes using batch-to-batch information. AIChE J. 2004, 50, 1219.
(15) Zhao, L.; Chai, T.; Wang, G. Double Moving Window MPCA for Online Adaptive Batch Monitoring. Chin. J. Chem. Eng. 2005, 13, 649.
(16) Lu, N. Y.; Yao, Y.; Wang, F. L.; Gao, F. R. Two-dimensional dynamic PCA modeling for batch process monitoring. AIChE J. 2005, 51, 3300.
(17) Lu, N. Y.; Yi, Y.; Wang, F. L.; Gao, F. R. A stage-based monitoring method for batch processes with limited reference data. Presented at the 7th International Symposium on Dynamics and Control of Process Systems (Dycops-7), Boston, MA, July 2004.
(18) Johnson, R. A.; Wichern, D. W. Applied Multivariate Statistical Analysis; Tsinghua University Press: Beijing, 2002.
(19) Wise, B. M.; Gallagher, N. B.; Butler, S. W. A comparison of multiway principal component analysis, trilinear decomposition and parallel factor analysis for fault detection in a semiconductor etch process. J. Chemom. 1999, 13, 379.
(20) Westerhuis, J. A.; Kourti, T.; MacGregor, J. F. Comparing alternative approaches for multivariate statistical analysis of batch process data. J. Chemom. 1999, 13, 397.
(21) van Sprang, E. N. M.; Ramaker, H.-J.; Westerhuis, J. A. Critical evaluation of approaches for on-line batch process monitoring. Chem. Eng. Sci. 2002, 57, 3979.
(22) Lu, N. Y.; Gao, F. R.; Wang, F. L. A sub-PCA modeling and online monitoring strategy for batch processes. AIChE J. 2004, 50, 255.
(23) Rännar, S.; MacGregor, J. F.; Wold, S. Adaptive batch monitoring using hierarchical PCA. Chemom. Intell. Lab. Syst. 1998, 41, 73.
(24) Qin, S. J.; Valle, S.; Piovoso, M. J. On unifying multiblock analysis with application to decentralized process monitoring. J. Chemom. 2001, 15, 715.

(25) Ündey, C.; Ertunc, S.; Cinar, A. Online Batch/Fed-batch Process Performance Monitoring, Quality Prediction, and Variable-Contribution Analysis for Diagnosis. Ind. Eng. Chem. Res. 2003, 42, 4645.
(26) Zhao, S. J.; Zhang, J.; Xu, Y. M. Monitoring of Processes with Multiple Operating Modes through Multiple Principal Component Analysis Models. Ind. Eng. Chem. Res. 2004, 43, 7025.
(27) Krzanowski, W. J. Between-groups comparison of principal components. J. Am. Stat. Assoc. 1979, 74, 703.
(28) Johannesmeyer, M. C.; Singhal, A.; Seborg, D. E. Pattern matching in historical data. AIChE J. 2002, 48, 2022.
(29) Singhal, A.; Seborg, D. E. Pattern Matching in Historical Batch Data Using PCA. IEEE Control Syst. Mag. 2002, 22, 53.
(30) Box, G. E. Some theorems on quadratic forms applied in the study of analysis of variance problems, I. Effect of inequality of variance in the one-way classification. Ann. Math. Stat. 1954, 25, 290.
(31) Jackson, J. E.; Mudholkar, G. S. Control procedures for residuals associated with principal components analysis. Technometrics 1979, 21, 341.
(32) Jackson, J. E. A User's Guide to Principal Components; Wiley: New York, 1991.
(33) Miller, P.; Swanson, R. E.; Heckler, C. E. Contribution plots: A missing link in multivariate quality control. Appl. Math. Comput. Sci. 1998, 8, 775.
(34) Westerhuis, J. A.; Gurden, S. P.; Smilde, A. K. Generalized contribution plots in multivariate statistical process monitoring. Chemom. Intell. Lab. Syst. 2000, 51, 95.
(35) Kano, M.; Hasebe, S.; Hashimoto, I. Statistical Process Monitoring Based on Dissimilarity of Process Data. AIChE J. 2002, 48, 1231.
(36) Fukunaga, K.; Koontz, W. L. G. Application of the Karhunen-Loève Expansion to Feature Selection and Ordering. IEEE Trans. Comput. 1970, 19, 311.
(37) Yang, Y.; Gao, F. R. Cycle-to-cycle and within-cycle adaptive control of nozzle pressures during packing-holding for thermoplastic injection molding. Polym. Eng. Sci. 1999, 39, 2042.
(38) Yang, Y.; Gao, F. R. Adaptive control of injection velocity of thermoplastic injection molding. Control Eng. Pract. 2000, 8, 1285.
(39) Birol, G.; Ündey, C.; Parulekar, S. J.; Cinar, A. A morphologically structured model for penicillin production. Biotechnol. Bioeng. 2002, 77, 538.
(40) Birol, G.; Ündey, C.; Cinar, A. A modular simulation package for fed-batch fermentation: penicillin production. Comput. Chem. Eng. 2002, 26, 1553.

Received for review October 14, 2006
Revised manuscript received April 28, 2007
Accepted April 30, 2007

IE061320F