Online Monitoring of Batch Processes Using IOHMM Based MPLS

Feb 4, 2010 - It consists of the input−output hidden Markov model (IOHMM) and ... An HMM is capable of characterizing a doubly embedded stochastic ...
4 downloads 0 Views 4MB Size
2800

Ind. Eng. Chem. Res. 2010, 49, 2800–2811

Online Monitoring of Batch Processes Using IOHMM Based MPLS Junghui Chen,* Che-Ming Song, and Tong-Yang Hsu R&D Center for Membrane Technology, Department of Chemical Engineering, Chung-Yuan Christian UniVersity, Chung-Li, Taiwan 320, Republic of China

Online monitoring of the batch process is extremely important to the health assessment of the batch operation; it ensures making products of consistent high quality. In this paper, an integrated framework called IOHMMMPLS is proposed to monitor the performance of a batch process. It consists of the input-output hidden Markov model (IOHMM) and multiway partial least-squares (MPLS) method. The sequence of the process variables and the product quality variables are decomposed into linear outer relations, which can be handled by MPLS, and simple inner dynamic sequence relations, which can be coped with by a set of single-input-single-output IOHMM models. MPLS is used to solve the problem of high dimensionality and collinearity, while IOHMM is used to capture the transition probability of the dynamic information. The combined IOHMM-MPLS method requires a smaller computational load, converges faster, and encounters lesser occurrences of false detection. It utilizes output quality variables and within batch process variables to find the most likely state evolution. After extracting the essential features of the past operating information, subsequently, two simple monitoring charts are presented to track the progress of each batch run and monitor the occurrence of the observable upsets. The proposed model is successfully applied to two simulated problems. 1. Introduction When compared to traditional processes, batch and semibatch processes have been receiving increasing attention in most industrial sectors for more than 2 decades now, because they are simpler and easier to set up and operate. Moreover with little or no hardware modification, they can produce several types of high-value-added products such as products in pharmaceutical, semiconductor, polymer, biochemical, and many other industries. Thus, batch operation is usually more efficient than continuous operation when effecting frequent product change. However, unlike the continuous process which operates under steady-state conditions, most batch processes are always transient in nature so that the operating conditions can have significant effects on the product characteristics. The batch process usually exhibits some batch-to-batch variations even if the operation is under its specified trajectories. These variations often mislead monitoring of operations, particularly detecting the occurrence of faults. Therefore, to have safe operation and consistent good-quality products, online process monitoring is imperative. Process monitoring plays an increasingly important role in modern technology because a fault in a single component may cause the malfunction of the whole batch operation. In the model-based approach, the user can compare the predictions of the model with the operational data to determine if the process is to operate within acceptable limits. Often, the process may be too complex or the information may be too limited to develop fundamental models so that the method seems to be useful for only limited applications. The statistical approach provides an alternative model development paradigm. The development of batch process monitoring based on the multivariable statistical control process has increased steadily since the original work on the multiway principal component analysis (MPCA) of Nomikos and MacGregor1 came into play. MPCA based on the work of Nomikos and MacGregor has been proven useful in detecting abnormal conditions while monitoring batch oper* To whom correspondence should be addressed. E-mail: jason@ wavenet.cycu.edu.tw.

ations.2-6 Comprehensive reviews on this area have been found in refs 7 and 8. Using a batch data structure, state space models in combination with batch modeling for identifying a dynamic batch-to-batch model were also developed and applied.9,10 The minimum-order state-space model could be used to deal with batch process data and derive a model capturing batch-to-batch dynamics or correlations. The model was used together with PCA to monitor the process for batch-to-batch variations. Due to the repetitive nature of the batches and the batch-to-batch correlation of the disturbances,11,12 the combination of iterative learning control and MPLS was developed to overcome the model error. The multivariate autoregressive model integrated with MPCA for monitoring of the performance of a batch process was proposed.13 However, the above method assumed that the statistical data followed Gaussian distribution, constructing the joint probability distribution of all the measurements. It is difficult to construct this probability distribution because the observations were of large dimensionality and the number of possible time points was exponential in the length of the sequences. Also, monitoring schemes for detecting changes in batch-to-batch variations, which related to decisions made at the end of the batch, should be considered.9,12 The potentials of combining methods to incorporate external information are not yet fully explored14 and further research is needed in this area. Most monitoring methods make the assumption that the data are independent and identically distributed. The assumption does not necessarily hold true for the items that can be arranged in sequential data, because there is often an interaction between the observations and the variables and between the variables. Recently, hidden Markov models (HMM) appeared as an efficient statistical tool for speech and language modeling.15 In an HMM model, each observation is the data sequence depending on the previous elements in the sequence. An HMM is capable of characterizing a doubly embedded stochastic process with an underlying stochastic process that can be observed through another set of stochastic processes because the state of the system is not observable directly.15,16 It also does have a concept of internal state or memory. More importantly, HMM

10.1021/ie900536z  2010 American Chemical Society Published on Web 02/04/2010

Ind. Eng. Chem. Res., Vol. 49, No. 6, 2010

does not rely heavily on the structure of the model and the assumption regarding the observed data distribution. The chain structure of the parametric model characterized by the state transition probability can be adaptively trained by the observed sequential data. This means that the distribution of the normal operation patterns of the batch operation can be gotten from the trained HMM model. Despite the widespread use of HMM models in many science and technology fields, such as speech utterances, temperature variations, biological sequences, and other sequential data,16-19 the applications to process monitoring of chemical engineering have been relatively meager. In the standard HMM, the model parameters are fixed and HMM is used for output sequences only. In the batch process operation, the sequence dynamics of process variable measurements throughout the duration of the batch can be taken as input variables and the product quality variable measurements at the end of each batch are taken as output variables. If the variables of the product quality are used when the sequence of process variables is given, the approach follows that of an input-output hidden Markov model (IOHMM), and this approach is more discriminating than the standard HMM in fault detection. In this paper, dynamic mapping between product qualities and process variables of the batch process is developed by IOHMM. Our approach is inspired by the IOHMM framework proposed by Bengio and Frasconi.20 With the trained IOHMM, the IOHMM model can be integrated into the existing SPC methods to enhance the capability of the statistical process monitoring. Thus, MSPC control charts under IOHMM are developed in this research. This approach can not only analyze the sequential measurements but also capture the statistical characteristics of the practical data. This paper is organized into six sections. After the Introduction, the end point quality of batch monitoring problems is defined in section 2. In the third section, the conventional chain structure of IOHMM for multivariate sequential data is briefly reviewed. Then, in section 4, an IOHMM-MPLS model with the required monitoring statistics for online monitoring is developed. Illustrative examples are given in section 5 to present the performance of the proposed method through two sets of benchmark data from the simulation problems. A comparison of the proposed method with the conventional MPLS and IOHMM is shown. In the last section, conclusions are made. 2. Batch Process Monitoring In each batch run, the designed profiles of the manipulated variables (xc) are implemented in order to match the final product qualities (y(t ) K)) with the desired qualities when the batch is finished. The dynamic batch can be mathematically formulated as

[]

x˙m ) g(xu, xm, xc), x˙u

y ) h(xu, xm, xc)

0eteK

(1)

K is the duration of each batch run. During this batch run, J process variables, including manipulated variables (xc) and online measured variables (xm) (xk ) {xm,xc}k, xk ∈ RJ, k ) 1, 2, ..., K), are collected at K time instances through one batch, and M quality variables (yK ) y(k ) K), y ∈ RM) can only be measured at the end of the batch after the batch run is finished. xu is the unmeasured vector of disturbances. In most of the openloop batch operation, the goal of batch monitoring is to develop effective technologies to detect faults of product quality variables

2801

Figure 1. Graphical model for (a) HMM and (b) synchronous IOHMM in the sequential variables.

in early enough time and then do something useful with the information. However, it is usually very difficult to get a reliable dynamic batch model (g and h). Thus, a model based on the operation data is a practical way. In this research, an IOHMM probability model is developed to infer if the quality variables (yK ) y(k ) K)) are normal given a sequence of process variables ({x1, x2, ..., xK}). 3. Input-Output Hidden Markov Model Most analysis of time series is restricted to extrapolating series through an autoregressive model assuming linearity and causality of the system. However, how well the models are built relies heavily on the assumption of the prediction error distribution. The autogressive models can also be fit well into another simple model that is a probabilistic fashion by sequences with the Markov property, such as HMM. The HMM model enables one to predict the future of the system on the basis of the present state. HMM is a joint probability distribution over states ({s1, ..., sK}) and observations ({y1, ..., yK}) shown in Figure 1a. HMM can be considered as a mixture model with temporal coupling between the variables ({y1, ..., yK}). When the batch operation model is trained, the sequence of process variables ({y1, ..., yK}) is used in HMM instead of the input variables ({x1,x2, ..., xK}), so the distribution retains most of the variation in process variables only. By specifying {x1,x2, ..., xK} to include the operation information, the input variables can provide a more compact representation correlated with differences in the fault monitoring. The condition distribution of the output sequence ({y1, ..., yK}) can be obtained given the input sequence ({x1,x2, ..., xK}). This means that the output plays the role of a desired output in response to the input. The probability model associated with the sequence of inputs and outputs is called IOHMM, as shown in Figure 1b. IOHMM is an extension of HMM that includes inputs and outputs.20 With the given inputs, the model now represents the distribution of the output condition. In Figure 1b, the arc from xk to yk indicates that IOHMMs represent a conditional distribution of an output sequence when an input sequence is given. The arc from xk to sk implies that, in IOHMM, transition probabilities are conditional on the input. In comparison, standard HMMs are based

2802

Ind. Eng. Chem. Res., Vol. 49, No. 6, 2010

Figure 2. Graphical model for asynchronous IOHMM in the sequential variables.

on homogeneous Markov chains. Therefore, IOHMMs are better suited for learning to represent long-range behavior than standard HMMs.21 IOHMMs shown in Figure 1b are called synchronous IOHMMs, where the length of the input sequence matches the length of the output sequence. However, in our batch application problem, the method for modeling sequential data is the chain structure that captures interactions between a sequence of inputs, x ) {x1,x2, ..., xK}, and the output vector, y ) yK. The length of the input and output sequences may not be equal because the output is only available at the end of time point K. Due to the lack of online sensors for tracking these variables, upon completion of the batch, offline quality measurements are usually made in the laboratory. Thus, the number of sampling units in process variable trajectory measurements (x) is K; only one sampling point at the end of the batch is for quality measurements (y). This IOHMM model, as shown in Figure 2, is an asynchronous IOHMM. The asynchronous IOHMM is trained to fit the conditional distribution probability P(y|x) of the output y given the input sequence x. In Figure 2, there is a set of hidden variables, s ) {s1, ..., sK}, and a fixed possible number of states. The conditional probability P(y|x) can be written as a sum of terms P(y,s|x) over all the possible state sequences, P(y|x) )

∑ P(y, s|x) s

(2)

K

)

∑ P(y|s , x ) ∏ P(s |s K

s

k k-1, xk)

K

k)1

IOHMM has three sets of parameters. (1) The initial state parameters, τ1: It controls the prior distribution of the hidden state at the step k ) 1, P(s1 ) m|x). (2) The transition distribution, P(sk|sk-1,xk): It determines the probability of transitioning to state sk ) j1 given sk-1 ) j2. It has been implemented as a multilayer perceptron with a single hidden layer and a sigmoidal activation function. Here is the transition distribution parametrized by the softmax function.22 P(sk ) j1 |sk-1 ) j2, xk) ∝ exp{λjT1 j2 · xk}

(3)

The parameters of the transition distribution are λj1j2. (3) The emission distribution, P(y|sK,xK): The form of the emission distribution depends on the observation as well as the model assumptions. The distribution model is often characterized by a mixture of conditional linear Gaussians with separate parameter conditions on each setting of the hidden state.21 M

P(y|sK ) j, xK) ∝

∏ N(µ , Σ ) m

m)1

m

(4)

where the means (µm) of the distribution is dependent on the input variables (xK), µm ) µm0 + bmT xK. µm0 is the vector determining the basis center of the Gaussian function and Σm is the covariance matrix. bm represents the regression coefficients. Thus, the parameters of the emission distribution are θm ) {µm0 bm Σm}. Given a set of the sequence observations (inputs and outputs), IOHMM can handle three basic tasks: classification, inference, and learning. Classification is used to calculate the probability that an input-output sequence is generated by this IOHMM. Inference is employed to compute the probability that this IOHMM is in a state at the time point k. Learning is used to determine the setting of the parameters that maximizes the probability of the set of training sequences in the model. For the purpose of monitoring of the batch operation, the tasks of learning and classification are needed. In the learning phase, the parameter estimation of IOHMM follows Bengio’s approach20 to train IOHMM under the expectation-maximization (EM) framework. Let the collected data, D ) {(xi, yi);xi ) {x1i · · · xKi}; i ) 1, ..., I}, be I sets of input-output sequences independently sampled from the identical distribution and Θ ) {θm, m ) 1, ..., M; λj1j2,j1,j2 ) 1, ..., M} be the set of all the parameters of the model. Thus, the log likelihood function is I

L(Θ;D) )

∑ log P(y |x ) i

i

i)1 I

)

∑ ∑ log P(y , s |x ) i

i)1 I

)

∑∑ i)1

i

i

si

si

K

[log P(yi |siK, xiK) +

∑ log P(s |s

i i i k k-1, xk)]

k)1

(5) where the first term of the right-hand side of eq 5 depends only on θ ) {θm} and the second term of the right-hand side of eq 5 depends only on λ ) {λj1j2}. The problem of optimizing the parameters (λ and (θ)) using EM is a problem of weighted logistic regression models. Each data set (yi,xi) has associated posterior probabilities (P(sik ) j1,sik-1 ) j2|yi,xi) and P(sKi ) j|yi,xi)) when the latent state is known. To adapt the representation of emission and transition probabilities, the parameters then are updated. The process of optimizing the parameters is iterative, and it may take some time to converge. For detailed description of the IOHMM learning algorithm, please see Bengio and Frasconi’s work.20 4. IOHMM-MPLS Online Batch Monitoring Although the IOHMM model approach performs a lot in different fields, it still has problems in the case of correlation inputs or limited data. Particularly in the case of limited data, the number of parameters could be larger than the number of observations. Therefore, some of the parameters cannot be uniquely determined from the observed data. In this paper, the integration of MPLS regression and IOHMM is proposed to formulate the probability model that can handle correlated inputs and limited observations. MPLS is applied to decomposing the variables into the lower dimensional ones in the orthonormal space. Then IOHMM is used to obtain information about the variability among batches for batch monitoring. 4.1. Multiway Partial Least-Squares Method. In the batch operation, when there are more than one process variable (J > 1) and quality variable (M > 1), a set of I numbers of historical batch data over K time intervals can be constructed into a three-

Ind. Eng. Chem. Res., Vol. 49, No. 6, 2010

way array and a two-way array shown in Figure 3. The twoway data matrix, Y(I×M), is defined. The three-way data matrix is X _ (I×J×K). Before performing MPLS, the data matrix X _ (I×J×K) should be unfolded to a two-dimensional matrix. Figure 3 shows a schematic diagram of the two unfolding structures used. Structure I allows one to analyze the variability among the batches by summarizing the information in the data with respect to both the measured variables and their time variation. It cannot focus on the local behavior of the batch process. On the other hand, structure II can obtain information on the variability among the batch variables. It cannot take into account the time dependency and the correlation. To solve both structure drawbacks, in this research work, first, the data matrix X _ (I×J×K) is unfolded to the two-dimensional matrix (I×KJ) (X structure I of Figure 3). Structure I is used to scale data and keep the important variables of small magnitudes from being taken over by less important but larger magnitude variables. Then each row of the data matrix for all the process variables at one batch is rearranged to form a two-dimensional matrix with time × variables Xi(K×J). Each block matrix Xi(K×J) is arranged from the top to the bottom, starting with the block corresponding to the first batch run. The resulting twodimensional matrix has dimensions X(IK×J) (X structure II of Figure 3). Structure II is used for variability among the batch variables at each local time point. Likewise, the quality variable is scaled in the same manner (Y structure I of Figure 3). To implement MPLS computation, the quality variables of each batch run are duplicated by K times to form Y(IK×M) (Y structure II of Figure 3) to make a consistent size with the number of rows of X(IK×J). This structure illustrates that the quality variables at the end of the batch run are accepted if the process operation up to the current time point is normal. Note that MPLS just extracts the cross-correlation among batch relationships in the low dimensional space. The temporal behavior of the sequence of the feature variables can be constructed by IOHMM. Now PLS is used to relate a set of predictor variables (X(IK×J)) to a set of response variables (Y(IK×M)). Standard PLS can be viewed as a two-stage process. Initially, an outer model is used to decompose X and Y blocks into scores (tr,ur) and loadings (pr,qr) based on the following projection:

Figure 3. Two unfolded structures for MPLS.

2803

R

X)

∑t p

T r r

+ E ) TPT + E

r)1 R

Y)

∑u q

T r r

+ F ) UQT + F

(6)

r)1

where E and F are the residuals of X and Y and R is the number of components used. The purpose of this step is to eliminate the interactions between process variables in the latent space. The score matrices are

[] []

T1 2 T) T l TI

Ti ) [ti1 ti2 · · · tiR ] )

[

Ui ) [ui1 ui2 · · · uiR ] )

U1 2 and U ) U l UI

ti1,1 ti1,2 · · · ti1,R

ti2,1 ti2,2 · · · ti2,R ·

··

(7)

]

,

tiK,1 tiK,2 · · · tiK,R i ) 1, 2, ..., I

[

ui1,1 ui1,2 · · · ui1,R ui2,1 ui2,2 · · · ui2,R ·

··

]

(8)

,

uiK,1 uiK,2 · · · uiK,R i ) 1, 2, ..., I

(9)

The next step is to find fitting regression for scores. If the inner relation between scores tr and ur is interpolated linearly, i.e., ur ) brtr, it is called a linear PLS model. Some improved PLS approaches can be generalized to nonlinear systems using nonlinear PLS; i.e., uˆr ) fr(tr), where fr(...) represents the relationship between the response score vector ur and the predictor score vector tr. Quadratic PLS is commonly used.23 The nonlinearities can also be based on the sigmoidal function as used in the artificial neural network.24 However, such a deterministic model depicts only those characteristics of the domain model of the system response; therefore, many effects are knowingly left unmodeled, particularly the disturbances

2804

Ind. Eng. Chem. Res., Vol. 49, No. 6, 2010

which we can neither control nor model deterministically. In this research paper, an IOHMM probability model is developed and applied to the inner relation; i.e., ur ) IOHMMr(tr). 4.2. IOHMM-MPLS Learning Algorithm. IOHMM-MPLS is a projection-based probability method that can be estimated in a sequential, stepwise manner. IOHMM-MPLS is different from the direct IOHMM modeling approach because the data are not used directly to train the IOHMM model. It is preprocessed by the MPLS outer transform. The MPLS outer projection is used as a dimensional reduction tool to remove collinearity. The coupling effect in the input-output system can be decomposed into several pairs of inputs and outputs, so the number of pairs can be selected on the basis of the variation captured by each pair. This will simplify the IOHMM model. Now the input and output variables are transformed to feature space variables, and then the probability model for the transformed sequential data set is constructed. In the decomposition procedures shown in Figure 4, each component in the batch matrix Ti is extracted over all the batches to form a component matrix Tr(K×I), Tr ) [tr1 tr2 · · · trI ],

r ) 1, 2, ..., R

(10)

In eq 9, Ui(K×R) represents the output scores corresponding to each batch run. However, the final qualities at the end of each batch run should not depend upon the separative Ui(K×R) at each time point; they should be settled together by the overall process operation scores with the single score. Thus, the average score uav,i(1×R), revealing the pivotal information at the end of the batch run, can be utilized as the modeling output steadily obtained by simply averaging all Ui(K×R) within the batch run i.

Figure 4. New approach of IOHMM-MPLS modeling for batch processes.

K

ui,av ) [ui,av · · · ui,av 1 R ] )



1 [ ui ··· K k)1 k,1

K

∑u

i k,R]

k)1

(11) The pair of the score vectors (uri,av and tri) is used to build up i the inner IOHMM probability model P(ui,av r |tr,θ). The parameters of P(uri,av|tri,θ) should be selected to maximize the likelihood probability estimation. The same training procedures in section 3 can be directly applied. Consequently, as shown in Figure 4, the distribution of the IOHMM model for each component, P(urav|tr,θ), r ) 1, 2, ..., R, is trained. Figure 5 shows the structure of the IOHMM-MPLS method. It uses the MPLS outer transform to generate sequential scores from the sequential data. The proposed IOHMM-MPLS algorithm can be formulated as follows: (1) Collect the historical batch data sets X _ (I×J×K) and Y(I×M) indicative of normal operations. The data should cover the range of the batch operating patterns and the conditions that yield accepted product quality. (2) Unfold data (X _ (I×J×K)) batchwise X(I×KJ). Pretreat X(I×KJ) and Y(I×M) using the mean and the standard deviation of each variable at each time in the batch cycle for all batches. (3) Rearrange the scaled X(I×KJ) and Y(I×M) into X(IK×J) and Y(IK×M). Let X0 ) X(IK×J), Y0 ) Y(IK×M), and r ) 1. (4) For each r, take ur from one of the columns of Yr-1. (5) PLS outer transform in matrix X wrT ) urTXr-1 /urTur and normalize wr to norm 1 (12) trT ) Xr-1wr

(13)

Ind. Eng. Chem. Res., Vol. 49, No. 6, 2010

in matrix Y crT

)

trTYr-1 /trTtr

and normalize cr to norm 1 (14)

urT ) Yr-1cr

4.3. Online Batch Monitoring Scheme. After modeling of IOHMM-MPLS, the model becomes

(15)

Keep doing this step iteratively until it converges. Although several algorithms have been proposed, the above nonlinear iterative partial least-squares algorithm is used to give a clear picture of PLS outer projection. (6) Find the inner IOHMM probability model to get the maximum probability of the output score given a sequence of input scores I

max

∑ log P(u

i,av i r |tr, θ)

(16)

i)1

For online monitoring of the measurement variables, the control limits should be built up at each sampling time point. As the new batch revolves, only the process values from the beginning of the batch until the current time (k) are available (X(k×J)). To fulfill the incomplete data set for online monitoring from the current time to the end of the batch, filling the missing values with the normal profile is applied here. Likewise, the output (yk(M×1)) is filled with the normal values. The score matrix by projecting the input and output data, (Xk(K×J) and yk(M×1)), mentioned above onto the loading matrices (P and Q) can be computed:

The parameter estimation of IOHMM can be successfully solved by means of the iterative MLE algorithm. (7) Calculate X and Y loadings prT qrT

)

)

2805

trTXr-1 /trTtr

(17)

(urav)TYr-1 /(urav)Turav

(18)

where

Tk ) [tk,1 tk,2 · · · tk,R ] ) XkP

(22)

Uk ) [uk,1 uk,2 · · · uk,R ] ) YkQ

(23)

The reconstructed score matrix and the corresponding unexplained matrix at each time point can be gotten by applying the trained IOHMM-MPLS, ek ) [ek,1 ek,2 · · · ek,J ] ) Xk(k, :) - Tk(k, :)PT

urav ) [[ur1,av]T · · · [uri,av]T uri,av ) [uri,av · · · uri,av]T

···

[urI,av]T]T

and

(24)

where the symbol (k,:) indicates that the elements of the kth row from the whole matrix are extracted. Also, the probability models at each time point are av P(uk,r |tk,r(k)), r ) 1, 2, ..., R

(8) Calculate the residuals for component r in matrix X Xr ) Xr-1 - trprT

(19)

Yr ) Yr-1 - uravqrT

(20)

in matrix Y

(9) Let r ) r + 1; return to step 4 until all the principal components are calculated. The number of principal components is determined by the cross-validation using the prediction residual sum of square (PRESS). The order of the components is set to be the order at minimum PRESS.25

(25)

where tk,r(k) is the kth element from tk,r. To check if the current operating process is normal or in control, the control charts for the monitoring procedure should be properly set up. With the developed probability distribution that reflects the normal operation, control limits for the scores are required to detect any departure of the process from its standard behavior.



Ω∈{tr}

av P(uk,r |tk,r(k)) dtr ) 0.95,

r ) 1, 2, ..., R

(26)

However, it is not analytically tractable using the probability model to calculate the threshold. Alternatively, this integral is

Figure 5. Sequential method of IOHMM-MPLS modeling: The data are projected onto the latent scores, and then IOHMMs are built up based on the scores.

2806

Ind. Eng. Chem. Res., Vol. 49, No. 6, 2010

Table 1. Input Variables Used in the Monitoring of the Benchmark Model input variables 1 2 3 4 5 6 7 8 9 10 11

output variable

agitator power substrate feed rate substrate feed temperature dissolved oxygen concentration culture volume carbon dioxide concentration pH fermentor temperature generated heat cooling water flow rate agitator power

1

penicillin concentration

approximated by bootstrapping the original batch data set to generate enough data and obtain truly quality-related variables from the batch process data. For more information on the bootstrapping, refer to ref 26. Given the IOHMM model, the calculation results of the likelihood of these data are sorted. The data corresponding to 95% of the confidence level av |tk,r(k)) > Pth (P(uk,r k ) is taken to get the confidence bounds. The squared prediction error (Qx,k) is statistically defined as Qx,k ) |ek | 2

(27)

The confidence limits of Qx(k) cannot be determined directly from a particular distribution. The kernel density estimation27 is used to determine the confidence limits (Qth k ). The above developed statistic measures (P and Qx) are applied to a practical problem. Nonsmooth responses may happen due to the measurement noise at each instant time point. These nonsmooth response problems are a fact of life in complex chemical processes. To avoid the fluctuation of the calculated statistic measures and pay attention to the deterministic fault, a class of moving average that reduces the variation is added to the control limits, th ˆ th ˆ th Q k ) γQk-1 + (1 - γ)Qk

(28)

th ˆ th Pˆth k ) γPk-1 + (1 - γ)Pk

(29)

where γ is the momentum coefficient between 0 and 1. In the same manner, the calculated statistic measures for the momentum modification are obtained ˆ x,k ) γQ ˆ x,k-1 + (1 - γ)Qx,k Q

(30)

av av av Pˆ(uk,r |tk,r) ) γPˆ(uk,r |tk,r) + (1 - γ)P(uk,r |tk,r),

r ) 1, 2, ..., R (31)

5. Illustration Example To prove the detectability of the proposed model, a problem of the fed-batch penicillin production is given here.8 This example is a challenging test bed for process monitoring and detection. The duration of each batch is 400 h, comprising a preculture stage of about 45 h and a fed-batch stage of about 355 h. The sampling interval is 46.8 min. In the normal operating condition, small variations are added to the simulation input data to mimic process variation under the normal operating condition. Also, measurement noise is added to each of the monitoring variables. The simulation condition and the relevant parameters are the same as those of Cinar et al.8 A total of 11 process variables, shown in Table 1, is collected in each batch, and a total of 50 batches based on

the normal operation is used for the basis analysis. Because there are 250 time points during each batch run, the data are arranged in a three-way array X(50×11×250). The quality variable is penicillin concentration. The resulting output variable measured at the final time point only is summarized as the quality variable (Y(50×1)). With collecting historical batch data, IOHMM-MPLS can be trained using eqs 12-20. The model with three latent variables captures over 97% of the variance in the in-out relationships of the system. The likelihood score charts based on eq 26 and the squared prediction error based on eq 27 at every time point obtained from the IOHMM-MPLS method can be constructed and the control limits representing 95% of the confidence limit are also shown in Figure 6. To compare the performance between the proposed method and the conventional MPLS monitoring method, the two methods are applied to two additional batches, a normal batch and an abnormal one, both of which are not selected from the training data sets. The abnormal situation caused by the substrate feed rate linearly decreased from 0.042 to 0.032 and from the 40th hour until the end of the batch is used for testing. Monitoring of the normal batches in Figure 6 shows that this batch stays below the upper control limits, indicating that there is no abnormal behavior at any time during the batch run. In this situation, this batch is classified as being “in control” or “normal”. The final product quality described by the IOHMM-MPLS model should satisfy the same product specification. If MPLS is applied, the results of the normal batch with MPLS shown in Figure 7 have several false detections in the T2 chart. This is because MPLS assumes that data are statistically independent. The autocorrelation and the normal probability plots of tr of the first three components are shown in Figure 8. The raw data are observed using the MPLS model, which has a significantly effect on the autocorrelation structure and deterioriates the monitoring performance. The successful quantitative detection rates of the proposed IOHMM-MPLS and MPLS are compared and shown in Table 2a. The results show that the proposed method can be used to monitor the dynamic batch system more effectively than MPLS. The IOHMM-MPLS is applied to an abnormal batch, and the results of batch monitoring are shown in Figure 9. Unlike monitoring of the previous batch, this batch starts above the upper control (Figure 9). It is evident from these charts that each score probability plot of IOHMM-MPLS is capable of identifying the fault of the normal operating region from the 50th time point on. This indicates a special variation occurs from the 50th time point on. Applying MPLS to the same batch yields the results presented in Figure 10 and Table 2b. It can be seen that Qx and T2 are not sensitive to this fault. The Qx or T2 monitoring charts of MPLS do not detect this fault until the 100th hour of the batch run. If IOHMM is applied only to the same test batch, the result of Figure 11 evidently shows that the control point has increased remarkably and fallen outside of the likelihood-based confidence limit after 75 h of the batch run. By comparing the number of training parameters between IOHMM-MPLS and IOHMM, IOHMM-MPLS has 42 parameters while IOHMM has 276. The former has significantly less parameters than the latter. The smaller the number of parameters is fitted, the less risk of overfitting the training data has; accordingly, fewer occurrences of false alarms would be detected. The time for fault detected by IOHMM is a little behind that by IOHMM-

Ind. Eng. Chem. Res., Vol. 49, No. 6, 2010

2807

Figure 6. Control charts for online monitoring in the illustration example using IOHMM-MPLS: (a-c) IOHMM control chart of each score and (d) Q chart. Each chart contains 95% (solid lines) of the control limit. The solid line with star signs represents the normal batch.

training result of the two model likelihoods with respect to the iteration number for the training set. The proposed model enables the model training to converge faster and have higher probabilities. Thus, the proposed IOHMM-MPLS has the ability to capture the actual process relationship between the variables under the noise environment. 6. Conclusion

Figure 7. Control charts for online monitoring in the illustration example using MPLS. Each chart contains 95% (solid lines) of the control limit. The solid line with star signs represents the normal batch.

MPLS, probably because the training model is not good at handling the large number of parameters. Also, the benefits of incorporating MPLS into IOHMM can be easily seen by the modeling training. Figure 12 shows the comparison made between IOHMM and IOHMM-MPLS in terms of the EM

Batch process monitoring has been evolving rapidly in academic research and industrial applications over the past decade. Multivariable statistical methods, such as MPCA and MPLS, have been widely used in modeling multivariate batch processes. The entire batch data are taken as a single object to monitor the batch at the end of the process. Those methods are not useful if they are only able to monitor the outcome of the batch process because batch and semibatch processes with operations of different phases typically exhibit significant behaviors with transient states. Moreover, when the control limits of MPCA and MPLS are under development, the fundamental assumption is restricted to a simple Gaussian distribution, which may not be a valid approximation for the data collected from the complex batch operations. This may cause false detection since the serial correlations between the multivariable behaviors and the proper probability distribution that characterizes the system are not considered. In this paper, the IOHMM-MPLS model, which combines the MPLS empiri-

2808

Ind. Eng. Chem. Res., Vol. 49, No. 6, 2010

Figure 8. (a) Autocorrelation plot and (b) normal probability plot of the score variables. Table 2. Successful Detection Rates (%) MPLS

IOHMM-MPLS

T2

Qx

84.8

93.6

45.9

43.1

P(uav 1 |t1)

P(uav 2 |t2)

P(uav 3 |t3)

Qx

100.0

100.0

100.0

91.9

(a) Normal Conditions 100.0

100.0

(b) Abnormal Conditions 99.6%

98.4

cal model and the IOHMM stochastic model, is proposed to efficiently monitor the problem of the online batch process. First, the complicated three-dimensional batch data are unfolded into the two-dimensional data, and then the original data space is projected onto mutually independent subspaces. The strongest relation of the input-output scores of each subspace is found, and the original multivariable system is translated into a singlevariable system of each subspace. A function of single-variable

Ind. Eng. Chem. Res., Vol. 49, No. 6, 2010

2809

Figure 9. Control charts for online monitoring in the illustration example using IOHMM-MPLS: (a-c) IOHMM control chart of each score and (d) Qx chart. Each chart contains 95% (solid lines) of the control limit. The solid line with star signs represents the abnormal batch.

Figure 10. Control charts for online monitoring in the illustration example using MPLS. Each chart contains 95% (solid lines) of the control limit. The solid line with star signs represents the abnormal batch.

Figure 11. Control chart for online monitoring in the illustration example using IOHMM. The chart contains 95% (solid lines) of the control limit. The solid line with star signs represents the abnormal batch.

conditional probability density is constructed through training, and the IOHMM parameters are updated to describe the actual input-output data distribution. The advantages of the IOHMMMPLS model include simplification of the parameters and the transition matrix of the IOHMM model, acceleration of the training convergence, decrease of the overfitting situation, and

enhancement of the applicability and accuracy of the IOHMM probability model. It is a low hanging fruit for engineers to combine MPLS and IOHMM together and to assess the current operation using the proposed framework. In view of online monitoring methods, the input and the output variables are first projected down onto a subspace of

2810

Ind. Eng. Chem. Res., Vol. 49, No. 6, 2010

Acknowledgment The authors wish to express their sincere gratitude to the Center-of-Excellence (COE) Program on Membrane Technology from the Ministry of Education (MOE), R.O.C., and to the project Toward Sustainable Green Technology in the Chung Yuan Christian University, Taiwan, under grant CYCU-98-CRCE for their financial support. Notation

Figure 12. (a) IOHMM and (b) IOHMM-MPLS model likelihoods concerning the EM iterative number in the illustration example, in which three IOHMM-MPLS probability models are used for each input-output score.

orthogonal latent variables to give the input and output scores. Then each pair of the latent variables is trained to build up the corresponding IOHMM probability density function. Unlike T2 control limits with the assumption of Gaussian distribution, the probability control limit for each component is sequentially calculated on the basis of IOHMM probability density estimation and the bootstrap algorithm. Also unlike Qx control limits, the Kernel density function is used to construct the residue control limit. In comparison with conventional MPLS model, the proposed IOHMM-MPLS model shows negligible erroneous judgment on the normal operation condition and efficient monitoring capability to detect the time of occurrence of the process fault. The bad side of the simplicity of the MPLS model is that it can fail even in simple nonlinearities presented in a batch system. With a structure of internal state or memory, IOHMM-MPLS can capture long-range structure. Because IOHMM-MPLS is data-driven-based, it can be easily implemented on any industrial scale. A good IOHMM-MPLS model can maintain explicit links to the observations. Thus, rich and representative data are needed. The appropriately selected variables and data sizes would be the important issues in our extension work. Also, other future work of ours will focus on the use of the IOHMM-MPLS model’s prediction on the diagnosis of fault patterns and the application to the problem of multistage process systems instead of the single-batch unit only.

bm ) mth regression coefficients, m ) 1, ..., M, RR E ) residual matrix for X in MPLS, RIK×J ek ) vector of prediction errors of process variables at time point k, RJ E[...] ) expectation value F ) residual matrix for Y in MPLS, RIK×M I ) number of batch runs J ) number of process variables K ) number of sampling points in a batch run M ) number of quality variables P ) loading matrix for X, RJ×R p ) loading vector for X, RJ P(...) ) probability R ) number of components Q ) loading matrix for Y, RM×R q ) loading vector for Y, RM Qx ) sum of squares of the estimated residuals from MPLS s ) {s1 · · · sK} ) set of hidden variables in a batch run T ) score matrix for X in MPLS, RIK×R Ti ) score matrix of component i for T, RK×R tri ) score vector of component r and batch i for Ti, RK U ) score matrix for Y in MPLS, RK×R ui,aVg ) average score vector of batch i for Ui, RR X ) unfolded data matrix of batch process variables, RIK×J x ) process variables, x ) {xc xm}, RJ xc ) manipulated variables xm ) online measured variables xu ) unmeasured disturbance variables Y ) unfolded data matrix of batch quality variables, RIK×M y ) quality variables, RM Greek Letters Θ ) set of parameters in the IOHMM model, Θ ) {θm, m ) 1, ..., M; λj1j2,j1,j2 ) 1, ..., M} θ ) set of emission parameters in the IOHMM model λ ) set of transition parameters in the IOHMM model µm0 ) vector of the mth basis center of the Gaussian function, RR, m ) 1, ..., M Σm ) mth covariance matrix of the Gaussian function, m ) 1, ..., M, RR×R AbbreViations HMM ) hidden Markov model IOHMM ) input-output hidden Markov model IOHMM-MPLS ) combination of MPLS and IOHMM MPLS ) multiway partial least-squares method PLS ) partial least-squares method

Literature Cited (1) Nomikos, P.; MacGregor, J. F. Multivariate SPC Charts for Monitoring Batch Processes. Technometrics 1995, 37, 41. (2) Ferreira, A. P.; Lopes, J. A.; Menezes, J. C. Study of the Application of Multiway Multivariate Techniques to Model Data from an Industrial Fermentation Process. Anal. Chim. Acta 2007, 595, 120.

Ind. Eng. Chem. Res., Vol. 49, No. 6, 2010 (3) Chiang, L. H.; Leardi, R.; Pell, R. J.; Seasholtz, M. B. Industrial Experiences with Multivariate Statistical Analysis of Batch Process Data. Chemom. Intell. Lab. Syst. 2006, 81, 109. (4) Lee, J.-M.; Yoo, C.-K.; Lee, I.-B. Enhanced Process Monitoring of Fed-Batch Penicillin Cultivation Using Time-Varying and Multivariate Statistical Analysis. J. Biotechnol. 2004, 110, 119. (5) Wang, X.; Kruger, U.; Lennox, B. Recursive Partial Least Algorithms for Monitoring Complex Industrial Processes. Control Eng. Pract. 2003, 11, 613. (6) Chen, J.; Liu, K.-C. On-line Batch Process Monitoring Using Dynamic PCA and Dynamic PLS Models. Chem. Eng. Sci. 2002, 57, 63. (7) Qin, S. J. Statistical Process Monitoring: Basics and Beyond. J. Chemom. 2003, 17, 480. (8) Cinar, A.; Parulekar, S. J.; Undey, C.; Birol, G. Batch Fermentation: Modeling, Monitoring and Control; Dekker: New York, NY, 2003. (9) Lee, J. H.; Dorsey, A. W. Monitoring of Batch Processes through State-Space Models. AIChE J. 2004, 50, 1198. (10) Kadlec, P.; Gabrys, B.; Strandt, S. Data-Driven Soft Sensors in the Process Industry. Comput. Chem. Eng. 2009, 33, 795. (11) Flores-Cerrillo, J.; MacGregor, J. F. Iterative Learning Control for Final Batch Product Quality Using Partial Least Squares Models. Ind. Eng. Chem. Res. 2005, 44, 9146. (12) Flores-Cerrillo, J.; MacGregor, J. F. Control of Batch Product Quality by Trajectory Manipulation Using Latent Variable Models. J. Process Control 2004, 14, 539. (13) Choi, S. W.; Morris, J.; Lee, I.-B. Dynamic Model-Based Batch Process Monitoring. Chem. Eng. Sci. 2008, 63, 622. (14) Ramaker, H. J.; van Sprang, E. N. M.; Gurden, S. P.; Westerhuis, J. A.; Smilde, A. K. Improved Monitoring of Batch Processes by Incorporating External Information. J. Process Control 2002, 12, 569. (15) Rabiner, L. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proc. IEEE 1989, 76, 257. (16) Ross, S. M. Introduction to Probability Models, 5th edl; Academic: New York, 1993.

2811

(17) Ge, M.; Du, R.; Xu, Y. Hidden Markov Model Based Fault Diagnosis for Stamping Processes. Mech. Syst. Signal Process. 2004, 18, 391. (18) Baldi, P. Brunak, S. Bioninformatics, the Machine Learning Approach; MIT Press: Cambride, MA, 1998. (19) Schenkel, M.; Guyon, I.; Henderson, D. On-line Cursive Script Recognition Using Time Delay Neural Networks and Hidden Markov Models. Mach. Vision Appl. 1995, 215. (20) Bengio, Y.; Frasconi, P. Input/Output HMM’s for Sequence Processing. IEEE Trans. Neural Networks 1996, 3, 1231. (21) Bengio, Y.; Lauzon, V.-P.; Ducharme, R. Experiments on the Application of IOHMMs to Model Financial Returns Series. IEEE Trans. Neural Networks 2001, 12, 113. (22) Wold, S.; Kettaneh-Wold, N.; Skagerberg, B. Nonlinear PLS Modeling. Chemom. Intell. Lab. Syst. 1989, 7, 53. (23) Holcomb, T.; Morari, M. PLS/Neural Network. Comput. Chem. Eng. 1992, 16, 393. (24) Zwick, W. R.; Velicer, W. F. Comparison of Five Rules for Determining the Number of Components to Retain. Psychol. Bull. 1986, 99, 432. (25) Efron, B.; Tibshirani, R. J. An Introduction to the Bootstrap; Chapman & Hall: New York, 1993. (26) Bishop, C. M. Neural Networks for Pattern Recognition; Oxford University Press: London, U.K., 1995. (27) Qin, S. J.; Li, W. Detection and Identification of Faulty Sensors in Dynamic Processes. AIChE J. 2001, 47, 1581.

ReceiVed for reView April 2, 2009 ReVised manuscript receiVed October 19, 2009 Accepted December 21, 2009 IE900536Z