Effect of Synchronization on Bilinear Batch Process Modeling

Article pubs.acs.org/IECR

J. M. González-Martínez,*,†,‡ R. Vitale,† O. E. de Noord,‡ and A. Ferrer†

† Departamento de Estadística e Investigación Operativa Aplicadas y Calidad, Universitat Politècnica de València, Camino de Vera s/n, Edificio 7A, 46022 Valencia, Spain
‡ Shell Global Solutions International B.V., Shell Technology Centre Amsterdam, PO Box 38000, 1030 BN Amsterdam, The Netherlands

ABSTRACT: There is a widespread assumption that batch synchronization is only required if the batch trajectories have different durations. This paper demonstrates that synchronization is a critical and necessary preliminary step to bilinear batch process modeling, whether or not batch trajectories have equal length. Another practical assumption is that all batches need the same synchronization method to be aligned. Two different synchronization approaches are compared in terms of synchronization quality: the Multisynchro approach, which takes into account the type of asynchronism, and the method based on linearly expanding and/or compressing pieces of variable trajectories (the TLEC method), implemented in commercial software. The consequences of inappropriately synchronizing batch data with multiple asynchronisms in process monitoring are investigated. For this study, the observationwise unfolding-T scores batchwise unfolding (OWU-TBWU) approach, which integrates the TLEC method for batch synchronization, is used for process modeling. Data from realistic simulations of a fermentation process of the Saccharomyces cerevisiae cultivation with five different types of asynchronism are used for illustration.

Received: June 30, 2013
Revised: February 9, 2014
Accepted: February 9, 2014
Published: February 10, 2014
Industrial & Engineering Chemistry Research, Article
© 2014 American Chemical Society
dx.doi.org/10.1021/ie402052v | Ind. Eng. Chem. Res. 2014, 53, 4339−4351

1. INTRODUCTION In most batch processes, the assumption that all the batch trajectories are synchronized is rarely met. Typically, the recipes for automation are based on triggers that are seldom dependent on time, which causes the batch pace to differ from batch to batch.1 In addition, different sizes of batch charge, modifications of the recipe to release products at lower costs, impurities in the raw materials, and disturbances in the environmental conditions may produce batches of uneven length.2 The main types of asynchronism that can be found are (i) batches with equal duration but key process events not overlapping at the same time point in all batches (class I asynchronism); (ii) batches with different duration and process pace caused by external factors (class II asynchronism); (iii) batches with different duration due to incompletion of some batches and key process events overlapping (class III asynchronism); and (iv) batches with different duration due to a delay in the start but batch trajectories showing the same evolution pace afterward (class IV asynchronism). When batch data contain one or more types of asynchronism, projection to latent structures methods for process understanding, optimization, and monitoring cannot be applied straightforwardly. Prior to modeling, variable trajectories need to be synchronized in such a way that the key events defining the normal process are aligned in all batches. In the scientific community, there is a widespread assumption that batch synchronization is only required if the batch trajectories have different durations.3,4 However, equal duration is not a sufficient condition to consider batch trajectories to be synchronized.

Some authors proposed methods that only address the problem of the different duration among batches without considering the overlap of the key process events, such as the observationwise unfolding-T scores batchwise unfolding (OWU-TBWU) approach (also known as variable-wise unfolding),2,5 implemented in the SIMCA software package by Umetrics. Wold et al.2 stated that “for the OWU-TBWU approach, the data do not need alignment before the OWU modeling, but certainly the resulting OWU scores, T, need alignment before the subsequent batch modeling using the unfolded scores”. Later on, Eriksson et al.6 stressed “One assumption of most methods used for analysis of batch data is that batches have equal duration and are synchronized, i.e., measurements are made at the same time points. If this is not the case, the batch data need to be aligned. Wold and coworkers5 use stretched and contracted values of local time, and thus re-express each batch in terms of interpolated data so that in the end each batch has the same number of rows for the same time points”. Mingxing et al.7 pointed out that variablewise unfolding “has the advantage of being very simple to carry out, because it can be applied in a straightforward way to sets of batches which have different time duration, without the need of synchronizing the batch length”. Facco et al.8 emphasized that “an advantage of variable-wise unfolding is that the uneven batch durations problem in process modeling is spontaneously solved without trajectory synchronization”. Lee et al.,3 Zhao et al.,4 Yao and Gao,9 and Huang and Qu10 claimed that OWU-TBWU does not require that all batches be of equal length. Martin et al.11 stated that the observation level, the first step of the OWU-TBWU approach, is an alternative approach to the analysis of batches with unequal lengths. Simoglou et al.12 reported a comparative evaluation of four multivariate statistical process control techniques for online monitoring, pointing out that the Wold et al. proposal is able “to overcome the problem with unequal batch lengths”. Ündey et al.13,14 stated that OWU-TBWU “provides solutions to data synchronization”. Fransson et al.15 indicated that “there exist a number of methods including [...] using local batch time as the response vector in a PLS model when unfolding the three-way array in the variable direction”, upon which the OWU-TBWU is based, “to deal with varying batch-to-batch process time”. In case these methods do not work, these authors suggest the use of time-alignment algorithms like dynamic time warping (DTW)16 and correlation optimized warping.15 In addition, there are commercial software packages for batch process monitoring, such as SIMCA Release 13.0.3,17 which only demand synchronization of batch trajectories when they have different lengths.

The main goals of this paper are (i) to demonstrate that batch synchronization is a crucial and necessary step prior to batch process modeling, whether or not batches have equal duration, and (ii) to show that not all batches may need the same synchronization method to be aligned. For this purpose, two different synchronization approaches are evaluated under scenarios of multiple asynchronisms: the Multisynchro approach18 and the method based on linearly expanding and/or compressing pieces of variable trajectories in the local batch time dimension,6 which is referred to as the TLEC method. (TLEC is the default synchronization procedure implemented in SIMCA Release 13.0.3. In case the differences in batch length are greater than 20%, a maturity variable is used as the basis of batch synchronization instead of the local batch time.6 Also, TLEC is one of the synchronization techniques provided in ProMV Batch Edition Release 13.02.) To align batches with the latter, the process evolution is linearly interpolated throughout batch time to make batches equal in length.
The Multisynchro approach is devoted to synchronize batch trajectories taking into consideration different types of asynchronisms by using the DTW16 and relaxed greedy time warping (RGTW)19 algorithms. The selection of the synchronization techniques relies on both their capacity to differently handle asynchronous batch trajectories and their widespread use in the chemometrics community. Note that the method based on an indicator variable20 is not taken into consideration in this study because of the class III asynchronism. When the duration of the variables trajectories differ across batches due to incompletion of batches, none of the process variables has the same ending value across batches. Hence, this major requirement20 is not met to use a process variable as indicator for synchronization in this study. To proceed with the comparative study, five experimental cases with different types of asynchronism are designed using data from realistic simulations of a fermentation process of the Saccharomyces cerevisiae cultivation. Batches that produced on-spec product while operating under normal operating conditions (NOC) and faulty batches containing the different types of asynchronism are simulated. These batches are used to evaluate the performance of both approaches in terms of synchronization quality and their influence in the monitoring schemes to accurately detect faults. For this purpose, the OWU-TBWU approach, which integrates the TLEC method for batch synchronization, is used for bilinear batch process modeling. The paper is organized as follows. Batch data and the types of asynchronisms simulated for the comparative study are presented in section 2. Section 3 briefly introduces the OWUTBWU approach for bilinear process modeling, the two synchronization methods, and the metric used for comparison.

Section 4 discusses the results of the comparison of the two different synchronization methods in terms of synchronization quality. Also, the impact of inappropriate synchronization performed in batch data on fault detection is addressed. Finally, some conclusions are drawn in section 5.

2. MATERIAL Batch data are generated based on the biological model of the aerobic growth of Saccharomyces cerevisiae on glucose-limited medium21 to evaluate the importance of batch synchronization in batch multivariate statistical process control. For this purpose, the simulation scheme designed using Simulink for Matlab release 2010a (The MathWorks, Inc.) (available in the MP toolbox22) is used. Data for 60 batches run under normal operating conditions (batches processed with the nominal values of the internal kinetic constants21) are simulated. To make the simulation realistic, low-magnitude Gaussian noise is introduced in the initial conditions (10%) and in the measurements (5%). These batches are split into a NOC calibration data set and a NOC test data set composed of 40 and 20 batches, respectively. Additionally, three different faults are designed: two process faults generated by modifying the internal constants k1l and k6 (denoted as type I and type II fault, respectively) and one engineering fault representing a bias in the biomass concentration sensor (denoted as type III fault). Data for 10 batches for each one of the three abnormalities are simulated. The first two faults do not illustrate abnormal behaviors related to specific biochemical changes in the metabolic network but abnormal operating conditions that may manifest as apparent changes in the kinetic parameters of the model. In particular, scenarios of better diffusion after solving operating problems, such as incorrect stirring of the reactor or high viscosity in the medium, are simulated. In this context, better material transport is expected and, hence, a higher apparent maximum reaction rate.
To simulate these scenarios, the values of the kinetic constants k1l (associated with the reaction describing the glucose uptake system and the glycolytic pathway) and k6 (associated with the reaction describing the formation of ethanol from acetaldehyde) were set in the stoichiometric equations to values higher than the nominal ones indicated in ref 21. The third type of fault represents a malfunction of the biomass concentration probe, one of the most widely used sensors for assessing the yield of a fermentation. For each batch, measurements of 10 process variables are registered at every sampling time: concentrations (glucose, pyruvate, acetaldehyde, acetate, ethanol, and biomass), active cell material, acetaldehyde dehydrogenase (proportional to the measured activity), specific oxygen uptake rate, and specific carbon dioxide evolution rate. The original processing time from the simulation is also added to the batch data matrix. In addition, the intrinsic biological variability of a population of the microorganism is taken into account in the simulation. As a result, batches with different duration and evolution pace (key process events not overlapping at the same sampling points in all batches) are obtained. Five experimental cases with different types of asynchronism were generated for the calibration and test data sets: (a) case 1, equal batch duration and different evolution pace in the last stage of the batch run in all batches, i.e., class I asynchronism (see Figure 1a); (b) case 2, different batch duration produced by natural variability and key process events not overlapping at the same sampling point across batches, that


is, class II asynchronism (see Figure 1b); (c) case 3, different batch duration produced by incompletion of some batch runs and key process events overlapping at the same sampling point in all batches, i.e., class III asynchronism (see Figure 1c); (d) case 4, different batch duration produced by a delay in the start of the batch (shift) and the same evolution pace across batches, i.e., class IV asynchronism (see Figure 1d); and (e) case 5, different batch duration produced by natural variability and incompletion of some batches, and key process events not overlapping at the same sampling point across batches, i.e., combined class II and class III asynchronism (see Figure 1e). For each scenario, 10 out of the 40 calibration NOC batches, 5 out of the 20 test NOC batches, and 5 out of the 10 test faulty batches for each fault are manipulated to incorporate the aforementioned types of asynchronism. For the sake of visualization, the batch trajectories corresponding to the 40 NOC batches from the calibration data set for the process variable acetate concentration and for the five different types of asynchronism are shown in Figure 1. In total, five data sets containing 60 NOC batches and 30 faulty batches are designed (provided in the Supplementary Information).

Figure 1. Trajectories of the process variable acetate concentration corresponding to 40 NOC calibration batches in five different scenarios of asynchronism: (a) case 1, (b) case 2, (c) case 3, (d) case 4, and (e) case 5. The batch trajectories with different asynchronism patterns for each scenario are distinguished by black and gray lines. Note that only black trajectories are shown in case 2 (b) because all batches have the same type of asynchronism.

3. METHODS In this section, the bilinear process modeling method used in this paper, the observation level of the OWU-TBWU, is described. In addition, the basis of the TLEC synchronization approach, the method integrated in the OWU-TBWU approach, and the Multisynchro synchronization method are explained. The metric employed to compare synchronization approaches in terms of synchronization quality and to evaluate the impact of batch synchronization on process monitoring is also presented.

3.1. OWU-TBWU Approach. In batch processes, a set of J process variables is usually measured at K different sampling points over I batches. Batch data can be organized in a three-way data array X̲ (I × J × Ki), which is treated with different rearranging methods, such as batch-wise unfolding (BWU), observation-wise unfolding (OWU) (also called variable-wise unfolding, VWU), and batch dynamic unfolding (BDU), for its subsequent bilinear modeling. A good survey of these methods from different perspectives (dynamics modeling, online prediction, and parameter stability) can be found in refs 2 and 23−25. The OWU-TBWU approach2 consists of two levels: the observation-wise unfolding (OWU) and the T-scores batch-wise unfolding (TBWU) levels. In the former (OWU level), the main motivation is dimension reduction before the BWU stage, which is desired in a context of a large number of process variables (e.g., spectroscopic or chromatographic data). The TBWU level provides a monitoring scheme for end-of-batch online process monitoring based on the OWU scores, Hotellingʼs T2, and DModX (equivalent to squared prediction error, SPE). Concerning the second level, the aim is to analyze the differences among batches using the information summarized through the score matrix T from the observation level. In the following, the observation (OWU) level, which is used in this research work, is described in detail. For an explanation of the batch (TBWU) level, readers are referred to the original work.2,5

Figure 2. Procedure performed in the observation or OWU level.

Observation (OWU) Level and TLEC Synchronization Approach. In the observation (OWU) level, the three-way

array X̲ is unfolded, preserving the variable direction: each variable, measured at different time points for the different batches, is arranged in one column of a new two-way array X (KiI × J), in a way that each one of its rows corresponds to a single time point at which the measurements are registered (see OWU unfolding step in Figure 2). Once the arrangement is performed, each column is autoscaled by subtracting its average value and dividing by its standard deviation (the so-called slab-wise preprocessing or variable centering and scaling). With this normalization, time periods with less variability will be downweighted and periods with more variability will be weighted more in the multivariate analysis. Afterward, a PLS model is built in order to relate all the process variables at every batch time point to a dummy variable y, representing the local batch time (see calibration step in Figure 2). This variable is created as follows:
i. A vector $y_i$ containing sorted integer values ranging from 0 to the length of the ith batch minus one is built for each single batch.
ii. Each value of $y_i$ is transformed as

$$y^{\mathrm{new}}_{i,k} = \frac{y_{i,k}}{K_i - 1} \times \tilde{K} \qquad (1)$$

where $K_i$ is the length of the ith run and $\tilde{K}$ represents the median of the training batch lengths.
iii. All the vectors for the different batches are arranged into a single column array, y, containing the values of the variable y for each batch at each time point.
iv. y is autoscaled.

Note that if all batches have equal duration, the vectors $y_i$ and $y^{\mathrm{new}}_i$ contain the same number of elements and the same values. In the case in which batches have unequal duration, all the vectors $y^{\mathrm{new}}_i$ have the same starting and ending values, but a different number of intermediate values depending on their batch length $K_i$. Once the PLS model is fitted, the resulting OWU scores $T_A$, Hotellingʼs T2, and DModX statistics for each batch are readjusted by linear interpolation (the so-called TLEC-based method) using the $y^{\mathrm{new}}_i$ maturity index vector (see synchronization step in Figure 2). This readjustment permits the new OWU scores (denoted as $\hat{T}_A$) to be used for process monitoring and subsequent multivariate analysis in the batch level.

For real-time process monitoring, the so-called OWU scores control charts (one per latent variable) are designed. The synchronized OWU scores are first batch-wise arranged in such a way that those belonging to a specific batch form one row of a new matrix, $X_{\hat{T}_A}$ (I × $\tilde{K}A$). Afterward, by computing the average $\hat{t}_{a,k}$ and standard deviation $s_{a,k}$ of each batch time point for the synchronized scores of all the latent variables (columns of $X_{\hat{T}_A}$), the upper and lower control limits (UCL and LCL, respectively) can be straightforwardly calculated:

$$\mathrm{UCL}^{T}_{a,k} = \hat{t}_{a,k} + z_{\alpha/2} \cdot s_{a,k} \qquad (2)$$

$$\mathrm{LCL}^{T}_{a,k} = \hat{t}_{a,k} - z_{\alpha/2} \cdot s_{a,k} \qquad (3)$$

where $z_{\alpha/2}$ is the 100·(1 − α/2) % standardized normal percentile. Hotelling-T2-based control charts can also be built from the synchronized OWU scores, but they are rarely taken into account in the OWU-TBWU approach. In contrast, the residual matrix E derived from the PLS model is used in order to estimate the DModX statistic at every time point for each batch,

$$\mathrm{DModX}_k = \sqrt{\frac{\sum_{j=1}^{J} e_{j,k}^{2}}{J - A}} \qquad (4)$$

where J is the number of process variables and A is the number of latent variables extracted. DModX-based control charts are then built and the corresponding control limit (CL) is calculated using the F distribution with J−A and (I−A−1)(J−A) degrees of freedom for in-control observations.6 On the basis of the relation between the squared prediction error (SPE) and DModX,26 the control limits for the DModX-based control chart can be estimated by using the approximation proposed by Box27 and Jackson and Mudholkar.28

A drawback of the observation level in real-time monitoring of new batches is that the OWU scores and the multivariate statistics cannot be aligned until the completion of the batch. Hence, there is no guarantee that the control charts on the

OWU stage show aligned results and, therefore, monitoring may be misleading. At this point, it is worth commenting that the application of the synchronization procedure integrated in the OWU-TBWU approach (the TLEC-based method) may be completely inappropriate due to the underlying assumptions that are seldom fulfilled in batch processes: (i) linear process pace, (ii) all batches are completed and all the key process events defining the process evolution throughout the batch run are present in all batches, and (iii) batches with equal duration are considered as synchronized. If batch data do not meet these assumptions, the process evolution in the trajectories of the process variables is different from batch to batch. Hence, the application of projection to latent structures methods for process understanding and monitoring is meaningless.

Figure 3. Multisynchro approach for batch synchronization in scenarios of multiple asynchronisms. The algorithm is composed of two routines: high-level (a) and low-level (b). The high-level routine is devoted to recognizing asynchronous patterns in batch data (a.1), and classifying and arranging batches into different subdata sets based on the type of asynchronism (a.2). The aim of the low-level routine is to synchronize batch trajectories with an appropriate technique for the type of asynchronism present in the batch trajectories: DTW/RGTW-based iterative synchronization (b.1), DTW/RGTW-based synchronization with relaxed end point constraint and TSR-based missing trajectory imputation (b.2), DTW/RGTW-based synchronization with relaxed start point constraint (b.3), and DTW/RGTW-based synchronization with relaxed start and end point constraint combined with TSR-based missing trajectory imputation (b.4).

3.2. Multisynchro Approach for Batch Synchronization. The Multisynchro approach18 is devoted to synchronizing the key process events, ensuring the same evolution across batches, no matter the type of asynchronism present in batch data. It takes as inputs the three-way batch data array X̲, the technique to weight the process variables, and the strategy to select the reference batch, against which all batches are synchronized. The procedure returns the synchronized batch data matrix and the warping time profiles that indicate how to warp the batch trajectories to make them synchronized.29 The Multisynchro algorithm is composed of a high-level and a low-level routine (see Figure 3). The high-level routine is aimed at recognizing the different types of asynchronous trajectories for the subsequent batch classification as a function of the nature of asynchronism (see Figure 3a). The low-level routine is in charge of synchronizing the variable trajectories of each one of the batches with a specific procedure based on the type

of asynchronism (see Figure 3b). In the following, the algorithm is described.

Asynchronism Recognition. The high-level routine is divided into two steps. The first step is devoted to recognizing the different types of asynchronous trajectories and is carried out by using the warping time profiles derived from a preliminary synchronization as follows:
i. Select a reference batch from the three-way batch data array X̲.
ii. Synchronize all batches using the DTW algorithm, giving the same weight to the process variables. Those variables that either show constant values during most of the production time or are discarded beforehand by prior knowledge are constrained in the synchronization with a null weight. The algorithm returns a set of synchronized batch trajectories X̃ and a warping information matrix containing the matching points between each batch and the reference batch.
iii. For each warping time profile from the three-way warping information array: (1) count the number of consecutive horizontal transitions, denoting the number of compressions carried out by the synchronization algorithm at the first time period of the ith batch; (2) count the number of consecutive vertical transitions, denoting the number of expansions carried out by the synchronization algorithm at the last time period of the ith batch.

These features of the warping time profiles are used to detect the different types of asynchronism present in the data (see Figure 3a.1). In the case of class III asynchronism, uncompleted batches are associated with warping profiles showing an excessive number of vertical transitions at the last time period of the runs. These transitions are related to expansions that the DTW/RGTW algorithm should carry out for synchronization. Batches with a shift at the start of the run are associated with warping profiles that contain a high number of horizontal transitions at the same time period (class IV asynchronism). These transitions are related to compressions that the DTW/RGTW algorithm should carry out for synchronization. Finally, in class I and class II asynchronism, the resulting warping profiles show a reasonable combination of horizontal and vertical transitions throughout the batch run.

The second step of the high-level routine is aimed at classifying each batch by the type of asynchronism (see Figure 3a.1) and arranging them into different data sets (see Figure 3a.2), as follows:
i. Repeat for all batches:
i.1 If the numbers of compressions and expansions are both less than a threshold, arrange the ith batch into the three-way batch data array X̲1.
i.2 If only the number of expansions at the end of the batch is greater than or equal to a threshold, arrange the ith raw batch into the three-way batch data array X̲2.
i.3 If only the number of compressions at the start of the batch is greater than or equal to a threshold, arrange the ith raw batch into the three-way batch data array X̲3.
i.4 If the numbers of compressions and expansions are both greater than or equal to a threshold, arrange the ith raw batch into the three-way batch data array X̲4.

Specific Batch Synchronization. The Multisynchro approach continues the execution by synchronizing the different data sets with different types of asynchronism:
i. Synchronize the three-way batch data array X̲1 using the iterative synchronization based on the DTW algorithm18 (see Figure 3b.1). This iterative procedure consists of synchronizing batch trajectories in such a way that possible abnormalities present in batch data do not affect the synchronization quality:
i.1 While there are outliers in the data:
(i.1.1) Synchronize batches from the three-way array X̲1 against a reference batch.
(i.1.2) Preprocess X̃1 by removing the average trajectory and scaling all the process variables at every sampling point, unfold batch-wise, and fit a PCA model. Subsequently, design a monitoring system based on the SPE statistic. (The SPE and not the Hotelling-T2 statistic is used because batches that break the correlation structure are of interest. When this occurs, the corresponding batch trajectories usually have different shapes in comparison to NOC batches, which may affect the synchronization quality.)
(i.1.3) Project all batches onto the latent subspace for outlier detection. If any abnormality is detected, the raw batch trajectories corresponding to the faulty batches are isolated and arranged into the three-way batch data array X̲FAULT. The rest of the raw batch trajectories are arranged into a new three-way batch data array X̲NOC.
(i.1.4) Replace the three-way array X̲ by X̲NOC.
i.2 Synchronize the raw variable trajectories corresponding to the faulty batches X̲FAULT isolated in the iterative procedure.
The procedure returns the weighting matrix, the loading vector obtained from the PCA-based modeling in the procedure, and the three-way batch data arrays X̃NOC and X̃FAULT containing the synchronized NOC and faulty batches, respectively (see Figure 3b.1).
ii. Synchronize the three-way batch data array X̲2 using the DTW algorithm with the relaxed end point constraint, using the parameters estimated in the iterative synchronization.18 This version of the DTW algorithm synchronizes batches against a segment of the reference batch limited by the first point and the best matching end point e* instead of the reference as a whole. The algorithm returns the batch trajectories X̃2 synchronized up to the best end point of each batch. The missing part of each of the batches is imputed using the trimmed score regression method30 (see Figure 3b.2). The procedure returns the three-way batch data array X̃2 containing the synchronized batch trajectories.
iii. Synchronize the three-way batch data array X̲3 using the DTW algorithm with the relaxed start point constraint and the parameters calculated in the iterative synchronization (see Figure 3b.3).18 This version of the DTW algorithm synchronizes segments of batches against a reference batch. The segments are limited by the best matching start point s* of each batch with the first point of the reference, and their last point. The procedure returns the three-way batch data array X̃3 containing the synchronized batch trajectories.
iv. Synchronize the three-way batch data array X̲4 using the DTW algorithm with the relaxed start and end point constraint, using the parameters estimated in the iterative synchronization18 (see Figure 3b.4). The procedure returns a three-way batch data array X̃4 containing the synchronized batch trajectories.
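The warping-profile features and the threshold-based classification used by the high-level routine can be illustrated with a minimal Python sketch. It is not the Multisynchro implementation: it assumes a plain DTW with fixed end points, a squared Euclidean local distance, equally weighted variables, and no band or slope constraints, and all function names and the threshold value are hypothetical.

```python
import numpy as np

def dtw_path(batch, ref):
    """Plain DTW (fixed end points). Returns the warping path as a list of
    (batch_index, reference_index) pairs. Illustrative assumption only."""
    n, m = len(batch), len(ref)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.sum((batch[i - 1] - ref[j - 1]) ** 2)
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    # Backtrack from the last cell to the first one.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [(cost[i - 1, j - 1], i - 1, j - 1),
                      (cost[i - 1, j], i - 1, j),
                      (cost[i, j - 1], i, j - 1)]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    return path[::-1]

def start_compressions(path):
    """Consecutive leading transitions in which the batch advances while the
    reference stays at its first point (several batch points -> one)."""
    count = 0
    for (i0, j0), (i1, j1) in zip(path, path[1:]):
        if j0 == 0 and j1 == 0 and i1 == i0 + 1:
            count += 1
        else:
            break
    return count

def end_expansions(path):
    """Consecutive trailing transitions in which the reference advances while
    the batch stays at its last point (one batch point -> several)."""
    count = 0
    rev = path[::-1]
    for (i1, j1), (i0, j0) in zip(rev, rev[1:]):
        if i1 == i0 and j1 == j0 + 1:
            count += 1
        else:
            break
    return count

def classify(path, threshold):
    """Assign a batch to one of the four subsets X1..X4 (hypothetical labels)."""
    delayed = start_compressions(path) >= threshold
    incomplete = end_expansions(path) >= threshold
    if delayed and incomplete:
        return "X4"
    if incomplete:
        return "X2"
    if delayed:
        return "X3"
    return "X1"

# Toy single-variable trajectories: a ramp as reference batch.
ref = np.arange(20.0)
incomplete_batch = np.arange(12.0)                        # stops early
delayed_batch = np.concatenate([np.zeros(6), np.arange(20.0)])  # late start
label_incomplete = classify(dtw_path(incomplete_batch, ref), threshold=5)
label_delayed = classify(dtw_path(delayed_batch, ref), threshold=5)
```

In this toy setting the truncated ramp is routed to the subset handled with a relaxed end point, and the ramp with a flat lead-in is routed to the subset handled with a relaxed start point. In practice the threshold would need to be tuned to the process at hand.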

After synchronizing batch data using the Multisynchro approach, all the resulting submatrices need to be merged into a three-way array for subsequent bilinear process modeling. Even though some batches may have been detected as abnormal in the iterative synchronization procedure, they are not discarded for modeling. The reason behind this is that these batches are a valuable source of information. The real-time application of the Multisynchro approach is straightforwardly done by using the RGTW algorithm instead of the DTW algorithm. For off-line applications, DTW is preferred since it provides us with the optimum global solution. However, if the main goal is to design a monitoring scheme for real-time application, the RGTW algorithm is required. For further details in the algorithm, readers are referred to the original work.18 3.3. Comparison of Synchronization Methods. The two synchronization approaches under study are evaluated in scenarios of multiple asynchronisms from two different perspectives: (i) synchronization quality and (ii) accuracy of the monitoring schemes designed from synchronized batch data to efficiently detect abnormal situations. Different metrics are defined for comparison purposes. The synchronization quality is defined as the accuracy of a synchronization procedure to make the key process events overlapped throughout the batch run, ensuring the same process pace in all batches. To assess this factor in the two synchronization techniques under study, the variability of the resulting synchronized batch trajectories around their mean trajectory is calculated. This can be measured by the standard deviation vector after the mean trajectory has been subtracted from batch data. The lower the difference among standard deviation vectors, the higher the synchronization quality. Data corresponding to the experimental cases simulated are used to build the monitoring schemes based on the OWUTBWU approach explained in section 3.1. 
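The synchronization-quality metric described above (the standard deviation of the synchronized trajectories around their mean trajectory) can be sketched in Python as follows. This is only an illustration under stated assumptions: a single process variable stored as an I × K array of already synchronized batches, the sample standard deviation (`ddof=1`), and hypothetical function names and toy sigmoid trajectories.

```python
import numpy as np

def sync_quality_std(X):
    """X: array of shape (I batches, K time points) for one process variable,
    already synchronized to a common length. Returns the length-K vector of
    standard deviations around the mean trajectory."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)        # subtract the mean trajectory
    return centered.std(axis=0, ddof=1)  # variability at each time point

def sigmoid_event(t, event_time):
    """Toy trajectory with a single key process event at event_time."""
    return 1.0 / (1.0 + np.exp(-40.0 * (t - event_time)))

t = np.linspace(0.0, 1.0, 50)
# Key event overlapping in all batches versus occurring at different times.
aligned = np.vstack([sigmoid_event(t, 0.5) for _ in range(5)])
misaligned = np.vstack([sigmoid_event(t, s) for s in (0.3, 0.4, 0.5, 0.6, 0.7)])

std_aligned = sync_quality_std(aligned)
std_misaligned = sync_quality_std(misaligned)
```

When the key event overlaps in all batches the standard deviation vector stays at zero, whereas misaligned events inflate it around the event, which is the behavior the metric is meant to expose.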
The control limits of the control charts designed at the OWU level are initially estimated from theoretical results and subsequently adjusted for an imposed significance level α. The aim of this readjustment is to ensure that the OWU scores and DModX control charts have the same percentage of faults detected by chance for a batch under normal operating conditions (NOC). The performance of the monitoring schemes based on the OWU approach for the different synchronization approaches is compared using two indices (refs 31−33). To evaluate the proper adjustment of the control limits, the Overall Type I (OTI) risk from the batches under NOC belonging to the test data set is computed. This value can be understood as the actual percentage of faults in the NOC batches, or the false alarm rate, and is estimated as follows:

$$\mathrm{OTI} = 100 \cdot \frac{n_f}{I_{NOC}\, K} \qquad (5)$$

where $n_f$ denotes the total number of faulty sampling points and $I_{NOC}$ is the number of NOC batches considered. Note that the adjustment of the control limits can be considered appropriate when the OTI value is close to the imposed significance level (ISL) α. To assess the accuracy of the control chart in terms of fault detection, the Overall Type II (OTII) risk is calculated as

$$\mathrm{OTII} = 100 \cdot \frac{n_{nf}}{I_{faulty}\, l} \qquad (6)$$

where $n_{nf}$ represents the number of nonsignaled faulty sample points, $I_{faulty}$ is the number of faulty batches, and $l$ is the length of the faulty time period. For the monitoring system to be considered to perform well, the OTII value should be as close to 0 as possible.

4. RESULTS AND DISCUSSION

The five batch data sets containing different types of asynchronism are used as a benchmark to illustrate (i) the performance of the Multisynchro approach and the TLEC method in terms of synchronization quality, and (ii) to what extent the accuracy of fault detection of the monitoring schemes based on the OWU-TBWU modeling approach is affected by scenarios of multiple asynchronisms. As a first step of this study, the calibration batch data set of each type of asynchronism under study is synchronized using the Multisynchro approach. In this case, the DTW algorithm is used for batch synchronization. (If the aim is to design monitoring schemes for real-time applications, the RGTW algorithm should be chosen.) The reference batch selected in each scenario of asynchronism was the one closest to the median length among the batches arranged for the iterative synchronization. The rest of the conditions and constraints are set according to the specifications in ref 19. Second, a cross-validated PLS model is fitted for each of the five synchronized calibration data sets (see results in Table 1). For the sake of comparison, the TLEC method was applied for batch synchronization as well. Following the procedure explained in section 3.1, a PLS model is first fitted to each of the raw calibration batch data sets. Second, the corresponding OWU scores and DModX statistics are synchronized by using TLEC following the steps in section 3.1.

Table 1. PLS Models Results in the Different Cases of Asynchronism for the Two Synchronization Procedures under Study

| case | approach | no. LVs^a | R2^a (%) | Q2^a (%) |
|---|---|---|---|---|
| 1 | TLEC | 4 | 89.5 | 98.4 |
| 1 | Multisynchro | 4 | 90.1 | 99.3 |
| 2 | TLEC | 4 | 91.1 | 95.5 |
| 2 | Multisynchro | 4 | 90.4 | 99.2 |
| 3 | TLEC | 4 | 87.6 | 93.3 |
| 3 | Multisynchro | 4 | 90.8 | 99.3 |
| 4 | TLEC | 4 | 91.4 | 98.6 |
| 4 | Multisynchro | 4 | 89.8 | 99.3 |
| 5 | TLEC | 4 | 89.5 | 94.5 |
| 5 | Multisynchro | 4 | 91.8 | 98.8 |

^a LVs, R2, and Q2 stand for the latent variables extracted, the goodness of fit and prediction, respectively.

4.1. Effects of Asynchronisms in Synchronization Quality. The standard deviation vectors of the corresponding synchronized OWU scores for each scenario of asynchronism are computed for comparison purposes. Note that the length of these vectors differs among synchronization approaches since batch duration is different. To make them equal in length, the standard deviation vectors derived from data synchronized by TLEC are linearly interpolated to the same number of values as those obtained from data synchronized by Multisynchro. The resulting vectors for all components are shown in Figure 4. This figure reveals that when data are synchronized using the

dx.doi.org/10.1021/ie402052v | Ind. Eng. Chem. Res. 2014, 53, 4339−4351


Figure 4. Comparison of the standard deviation vectors obtained from the four OWU scores (separated by dashed lines) for the Multisynchro approach (black star lines) and the TLEC method (red empty circle lines) for all the scenarios of asynchronism: (a) case 1, (b) case 2, (c) case 3, (d) case 4, (e) case 5.

Multisynchro approach, the standard deviation values are lower (black star lines) than those obtained from data synchronized by the TLEC method (red empty circle lines). This implies that the Multisynchro approach clearly outperforms the TLEC method in terms of synchronization quality. Also, the standard deviations from TLEC show that this synchronization approach is less vulnerable to class I and class IV asynchronisms (lower values in cases 1 and 4, respectively, in Figure 4) than to class II and class III asynchronisms (higher values in cases 2, 3, and 5 in Figure 4). Among the cases with different evolution pace, cases 2 and 5 show higher standard deviations for all the scores than case 1. This high variability is basically produced by the type of asynchronism and the way the TLEC method addresses batch synchronization. Case 2 and case 5 have in common that the corresponding raw batch trajectories differ in length due to normal variability of the process. This has two main effects: (i) the key process events do not overlap in all batches from the early stage of the process, causing differences in the evolution pace across batches, and (ii) the duration among batches differs much more than in the other cases of asynchronism. In contrast, batch trajectories belonging to case 1 have equal length and different evolution pace only at the last stage of the process. As the TLEC method linearly interpolates data without considering the overlap of the key process events, the asynchronism present in the raw batch data is inherited in the resulting OWU scores (see OWU scores and DModX control charts for the NOC test batches affected by cases 1, 2, and 5 of asynchronism in the Supporting Information). Hence, the normal process variation is dramatically affected by the inheritance of the asynchronism. The more different the evolution pace and the duration among batches, the higher the variability. Comparing the standard deviation vectors of cases 2 and 5, higher values are observed at the last stage of the process (last 50 sampling points) in the latter than in the former. These differences are caused by incompletion of some batches in case 5 (combination of class II and class III asynchronisms). In this scenario, TLEC-based synchronization worsens the asynchronism by introducing misaligned points and flat profiles, which produce artificial variability not related to normal process variation. The same phenomenon occurs with batches from case 3, which are affected only by class III asynchronism. The difference lies in the variability associated with the batches of case 3, which is apparently lower than that of the batches of case 5 (compare Figure 4c and Figure 4e). In conclusion, the larger the incompletion of the batches, the higher the variability. These results show that the accuracy of the synchronization approach in making the key process events overlap in all batches is crucial in bilinear process modeling. TLEC-based synchronization is not focused on ensuring the same process pace in all batches, but on duration equality (TLEC linearly interpolates data without considering the overlapping of the key process events). This means that the different types of asynchronism present in raw batch data are inherited in the latent structure. Hence, the derived OWU scores and DModX statistic have undesired variability that may seriously affect the performance of the monitoring schemes.

4.2. Effects of Synchronization in Process Monitoring. To illustrate the effect of the propagation of asynchronism in the performance of monitoring schemes, the OWU control charts


Figure 5. OWU scores control charts of the first latent variable monitoring from the NOC calibration data set with different asynchronism patterns after TLEC-based synchronization: (a) case 1, (b) case 2, (c) case 3, (d) case 4 and (e) case 5. Red dashed lines represent the control limits at 95% confidence level. Batches with simulated asynchronies are in black lines.

for each type of asynchronism after TLEC-based synchronization are shown in Figure 5. The resulting synchronized OWU scores belonging to the 10 out of 40 batches in which specific types of asynchronism were simulated (see black lines in Figure 5) clearly show the propagation of trajectory variability from raw batch data to the latent structure. In case 1, the difference in process evolution at the last stage of the process remains in the OWU scores since TLEC does not take any action (see black and gray lines in Figure 5a). The OWU scores belonging to case 2 show the same type of asynchronism as the raw batch data, which is more prominent from the 50th sampling point onward. These plots confirm that the asynchronism present in batch data is inherited in the OWU scores due to inappropriate synchronization. In case 3 (see Figure 5c), the OWU scores show a gradual asynchronism from the first time points onward, which becomes more severe at the last stage of the process. This is produced by the incompletion of the batch trajectories, i.e., by the missing trajectory belonging to the last period of the process. Regarding case 4, the shift at the early stage of the process stays in the OWU scores. However, this effect is corrected gradually by the execution of a high number of linear interpolations at the end of the runs (see black lines in Figure 5d). Finally, in case 5 (see Figure 5e), a phenomenon similar to case 3 is observed, given the similarity between the classes of asynchronism. The difference lies in the 30 out of 40 batches (gray lines), whose key process events do not coincide through the batch run in case 5. This asynchronism can also be observed in the OWU scores. The existence of asynchronism in the OWU scores produces a high trajectory variability between batches, as discussed before (Figure 4). Hence, the control limits estimated from data need to be wide enough to meet the ISL requirement (5%). The higher the variability, the wider the control limits. When the batch trajectories (and therefore the OWU scores when TLEC-based synchronization is applied) have different process evolution and lack overlap in the key process events (i.e., cases 2, 3, and 5), the control limits will be wider (see Figure 5b,c,e) than for the remaining types of asynchronism (see Figure 5a,d). Unnecessarily wide control limits may prevent some types of faults from being properly detected and diagnosed, putting the safety and reliability of the process at risk. When the raw batch trajectories are synchronized by taking into consideration the different types of asynchronism (i.e., applying the Multisynchro approach), the same process evolution and occurrence of the process events in time are ensured. This yields synchronized OWU scores with narrower control limits (see Figure 6) than those obtained when OWU scores are synchronized by TLEC (see Figure 5).

To study the risk of applying an inappropriate synchronization in fault detection, the monitoring performance of control charts obtained from batch data synchronized by Multisynchro and by TLEC on the raw OWU scores and DModX are compared. To carry out this comparative study, the OTI and OTII values are calculated (see Tables 2 and 3, respectively). For a fair comparison, the control charts of the two synchronization approaches applied to the five different types of asynchronism should present similar OTI values. Otherwise, the OTII results are not comparable. This can be achieved by readjusting the theoretical control limits estimated using the calibration data sets. The OTI values shown in Table 2 are computed using the independent test sets. As can be appreciated, the OTI values are quite similar for both synchronization approaches in


Figure 6. OWU scores control charts of the first latent variable from the NOC calibration data set with different asynchronism patterns after Multisynchro-based synchronization: (a) case 1, (b) case 2, (c) case 3, (d) case 4, and (e) case 5. Red dashed lines represent the control limits at the 95% confidence level.

the Multisynchro approach are lower than those obtained from batch data synchronized by the TLEC method. This can be observed for all the scores and the DModX statistic. Note that the OTII values belonging to the first scores are prominently lower than for the rest of the scores. This is because the set of OWU-PLS scores captures the average trajectories of the highly correlated variables in different ways at different time points throughout the batch. Recall that these two types of faults illustrate different operating conditions induced from the start of the batch run. Hence, the first score most likely captures the average trajectory of the variables at the early stage of the batch process, when the fault is signaled by the control chart (see OWU scores control charts for type I and type II faults shown in the Supporting Information). Concerning the type III fault, the OTII values of the scores of the first latent variable and the DModX statistic are lower for the Multisynchro approach than for the TLEC method. Note that these differences are more prominent in the scores than in the DModX statistic. This is caused by the type of fault, which produces a different process performance with a break of the data correlation structure some minutes after the fault started. Hence, more samples beyond the control limits are expected in the scores than in DModX. Nonetheless, these differences are not equally important in the Multisynchro approach and in the TLEC method. In the former, the differences in the OTII values between the scores belonging to the first latent variable and the DModX are considerably higher than in the latter. Thanks to the better synchronization carried out by the Multisynchro approach, which considers the types of asynchronism, the variability is reduced and the control limits are better fitted to the actual process variability. This enables the monitoring scheme to detect the faults with

Table 2. Overall Type I Risk (OTI) Values for the Control Charts Based on the DModX Statistic and the OWU Scores (ta) for the Two Synchronization Procedures under Study. ISL = 5%

Test-NOC OTI

| case | approach | DModX (%) | t1 (%) | t2 (%) | t3 (%) | t4 (%) |
|---|---|---|---|---|---|---|
| 1 | TLEC | 4.6 | 2.9 | 3.4 | 4.7 | 3.1 |
| 1 | Multisynchro | 4.2 | 5.3 | 5.0 | 7.4 | 4.7 |
| 2 | TLEC | 5.5 | 1.9 | 2.8 | 3.0 | 3.2 |
| 2 | Multisynchro | 4.8 | 6.8 | 6.0 | 7.3 | 5.9 |
| 3 | TLEC | 5.0 | 6.6 | 6.2 | 7.2 | 6.7 |
| 3 | Multisynchro | 3.5 | 3.6 | 2.6 | 5.1 | 3.3 |
| 4 | TLEC | 4.3 | 2.7 | 2.0 | 3.9 | 2.5 |
| 4 | Multisynchro | 4.8 | 8.1 | 6.3 | 10.6 | 6.8 |
| 5 | TLEC | 4.1 | 6.7 | 6.4 | 7.2 | 6.6 |
| 5 | Multisynchro | 2.3 | 3.7 | 3.6 | 5.4 | 3.2 |

all the types of asynchronism, being close to 5%, which is the ISL of the limits. To compare the OTII values, readers should not focus on the specific percentages shown in Table 3, since they depend on the magnitude of the process fault, but on the difference in OTII values between approaches in each asynchronism case. At first glance, the Multisynchro approach seems to outperform the TLEC method in terms of accurate fault detection using the OWU-TBWU approach in the first level, irrespective of the type of fault and asynchronism added to the batch data. For the first two types of faults, the OTII values derived from batch data synchronized by

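Table 3. Overall Type II Risk (OTII) Values for the Control Charts^a

Test type I fault, OTII

| case | approach | DModX (%) | t1 (%) | t2 (%) | t3 (%) | t4 (%) |
|---|---|---|---|---|---|---|
| 1 | TLEC | 55.5 | 45.9 | 71.1 | 68.8 | 70.1 |
| 1 | Multisynchro | 43.8 | 35.8 | 66.1 | 62.3 | 53.6 |
| 2 | TLEC | 61.8 | 82.3 | 90.6 | 84.6 | 88.7 |
| 2 | Multisynchro | 42.9 | 38.0 | 66.6 | 55.5 | 54.1 |
| 3 | TLEC | 67.7 | 77.0 | 90.8 | 88.8 | 70.4 |
| 3 | Multisynchro | 44.0 | 37.2 | 69.5 | 62.2 | 60.2 |
| 4 | TLEC | 63.8 | 65.6 | 81.1 | 73.3 | 69.0 |
| 4 | Multisynchro | 42.0 | 35.9 | 66.4 | 56.4 | 48.3 |
| 5 | TLEC | 66.5 | 83.0 | 88.1 | 82.4 | 81.7 |
| 5 | Multisynchro | 51.5 | 37.1 | 62.8 | 53.8 | 73.4 |

Test type II fault, OTII

| case | approach | DModX (%) | t1 (%) | t2 (%) | t3 (%) | t4 (%) |
|---|---|---|---|---|---|---|
| 1 | TLEC | 66.8 | 57.3 | 61.6 | 56.4 | 52.6 |
| 1 | Multisynchro | 55.2 | 49.9 | 43.9 | 38.8 | 39.2 |
| 2 | TLEC | 60.4 | 65.0 | 67.5 | 81.9 | 66.8 |
| 2 | Multisynchro | 58.5 | 54.3 | 43.2 | 36.9 | 43.3 |
| 3 | TLEC | 69.0 | 78.9 | 72.4 | 71.7 | 61.1 |
| 3 | Multisynchro | 57.3 | 50.7 | 47.7 | 44.7 | 47.8 |
| 4 | TLEC | 65.1 | 62.7 | 55.1 | 62.1 | 48.0 |
| 4 | Multisynchro | 57.0 | 54.1 | 43.3 | 37.0 | 41.1 |
| 5 | TLEC | 59.9 | 73.0 | 64.8 | 78.7 | 61.3 |
| 5 | Multisynchro | 60.9 | 50.8 | 53.6 | 50.3 | 46.3 |

Test type III fault, OTII

| case | approach | DModX (%) | t1 (%) | t2 (%) | t3 (%) | t4 (%) |
|---|---|---|---|---|---|---|
| 1 | TLEC | 79.4 | 31.3 | 91.1 | 89.0 | 89.9 |
| 1 | Multisynchro | 75.6 | 23.5 | 89.8 | 85.4 | 93.6 |
| 2 | TLEC | 93.2 | 99.0 | 97.2 | 98.6 | 98.8 |
| 2 | Multisynchro | 74.7 | 20.3 | 91.9 | 86.4 | 92.8 |
| 3 | TLEC | 87.1 | 90.7 | 85.8 | 84.6 | 85.5 |
| 3 | Multisynchro | 73.4 | 21.5 | 92.8 | 89.4 | 95.2 |
| 4 | TLEC | 85.7 | 74.1 | 70.8 | 76.3 | 72.5 |
| 4 | Multisynchro | 72.7 | 21.0 | 88.2 | 78.4 | 89.5 |
| 5 | TLEC | 92.3 | 89.1 | 87.1 | 88.3 | 94.2 |
| 5 | Multisynchro | 65.4 | 26.2 | 90.2 | 87.7 | 85.9 |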

^a Values are based on the DModX statistic and the OWU scores (ta) computed from test batches containing three different types of faults and five different types of asynchronism. These asynchronous faulty batches are synchronized using the Multisynchro approach and the TLEC method. Lowest OTII values in each case, approach, and type of fault in bold.
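The OTI and OTII indices reported in Tables 2 and 3 follow eqs 5 and 6. A minimal sketch of both computations is given below. The function names and the boolean "signal" arrays are hypothetical, the toy data are invented for illustration, and K is taken here to be the number of sampling points per batch, which is an assumption since K is not defined in this section.

```python
import numpy as np

def overall_type_i(signals_noc):
    """OTI (eq 5): percentage of NOC sampling points flagged as faulty.

    signals_noc: boolean array of shape (I_NOC, K); True marks a sampling
    point outside the control limits (assumed layout).
    """
    s = np.asarray(signals_noc, dtype=bool)
    i_noc, k = s.shape
    n_f = s.sum()                      # faulty (signaled) points in NOC batches
    return 100.0 * n_f / (i_noc * k)

def overall_type_ii(signals_faulty):
    """OTII (eq 6): percentage of faulty sampling points that were not signaled.

    signals_faulty: boolean array of shape (I_faulty, l), restricted to the
    faulty time period of each batch (assumed layout).
    """
    s = np.asarray(signals_faulty, dtype=bool)
    i_faulty, length = s.shape
    n_nf = (~s).sum()                  # faulty points the chart missed
    return 100.0 * n_nf / (i_faulty * length)

# Toy example: 4 NOC batches x 50 points, 10 false alarms in total.
noc = np.zeros((4, 50), dtype=bool)
noc[0, :10] = True

# Toy example: 2 faulty batches x 20 faulty points, 10 points missed.
faulty = np.ones((2, 20), dtype=bool)
faulty[0, :10] = False

oti = overall_type_i(noc)
otii = overall_type_ii(faulty)
print(oti, otii)
```

With these invented inputs the OTI equals the 5% ISL used in the paper, which is the situation in which OTII values of different charts can be compared fairly.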

Figure 7. LSD intervals (95% confidence level) for the arc sin of the OTII values for (a) the simple effect of the synchronization method and (b) the interaction between the experimental cases containing different asynchronisms and the synchronization method. Red and blue LSD intervals represent the TLEC and Multisynchro methods, respectively.

more accuracy, reducing the number of faulty samples not signaled (see OWU scores control charts for the type III fault shown in the Supporting Information). Another issue worth emphasizing is the apparent differences observed in the OTII values between the different types of asynchronism for all the faults when TLEC-based synchronization is performed. The OTII values belonging to the scores pointed out above and to the DModX for case 1 and case 4 are lower than for the rest of the cases. This suggests that the degree of asynchronism may affect the accuracy of the monitoring schemes when the TLEC method is used for batch synchronization. To determine whether there are statistically significant differences in the OTII values of the monitoring schemes among the synchronization methods and the types of asynchronism, an analysis of variance (ANOVA) is performed on the OTII values (the arcsine square root transformation is used). The outcomes of this analysis determined that the simple effect of the synchronization method and the interaction between synchronization method and type of asynchronism are statistically significant (p-value < 0.05). To find out in

what synchronization approach and type of asynchronism the differences lie, the 95% confidence least significant difference (LSD) intervals are computed (see Figure 7). When the Multisynchro approach is used for batch synchronization, the percentage of faults detected as NOC is statistically lower on average (OTII = 37%) than when the TLEC method is applied (OTII = 51%) (see Figure 7a). Unlike TLEC-based synchronization, Multisynchro-based synchronization is robust to the presence of the different types of asynchronism simulated in the process variables, since no statistically significant differences are found for the different types of asynchronism (see LSD intervals in Figure 7b). Depending on the nature of the asynchronism, the OTII values of the monitoring scheme when the OWU scores are synchronized by TLEC are affected to a lesser or greater extent. In particular, in cases 2, 3, and 5 the OTII values are, on average, statistically higher than those obtained in cases 1 and 4. In the former scenarios, batches show high variability in the evolution pace and batch duration, and there are also incomplete batches. An interesting result is observed in Figure 7b for case 1. In this case (see Figure 1a)


Pharmaceutical Process Improvement through Multivariate Latent Variable Modelling. J. Process Control 2011, 21, 1370−1377.
(2) Wold, S.; Kettaneh-Wold, N.; MacGregor, J.; Dunn, K. Batch Process Modeling and MSPC. Comprehensive Chemometrics; Elsevier: Amsterdam, The Netherlands, 2009, 2, 163−195.
(3) Lee, J.; Yoo, C.; Lee, I. On-line Batch Process Monitoring Using Different Unfolding Method and Independent Component Analysis. J. Chem. Eng. Jpn. 2003, 36, 1384−1396.
(4) Zhao, C.; Wang, F.; Mao, Z.; Lu, N.; Jia, M. Improved Batch Process Monitoring and Quality Prediction Based on Multiphase Statistical Analysis. Ind. Eng. Chem. Res. 2008, 47, 835−849.
(5) Wold, S.; Kettaneh, N.; Friden, H.; Holmberg, A. Modelling and Diagnostics of Batch Processes and Analogous Kinetic Experiments. Chemom. Intell. Lab. Syst. 1998, 44, 331−340.
(6) Eriksson, L.; Johansson, E.; Kettaneh-Wold, N.; Trygg, J.; Wikström, C.; Wold, S. Multi- and Megavariate Data Analysis; Umetrics AB: Umea, Sweden, 2006; Chapter BSPC.
(7) Mingxing, J.; Fengxiang, L.; Shouping, G. Optimal PCA-based modeling and fault diagnosis for uneven-length batch processes. 8th IEEE International Conference on Control and Automation (ICCA); Xiamen, China, June 9-11, 2010; pp 1731−1736.
(8) Facco, P.; Doplicher, F.; Bezzo, F.; Barolo, M. Moving Average PLS Soft Sensor for Online Product Quality Estimation in an Industrial Batch Polymerization Process. J. Process Control 2009, 19, 520−529.
(9) Yao, Y.; Gao, F. A Survey on Multistage/Multiphase Statistical Modeling Methods for Batch Processes. Annu. Rev. Control 2009, 33, 172−183.
(10) Huang, H.; Qu, H. In-Line Monitoring of Alcohol Precipitation by near-Infrared Spectroscopy in Conjunction with Multivariate Batch Modeling. Anal. Chim. Acta 2011, 707, 47−56.
(11) Martin, E.; Morris, J.; Lane, S. Monitoring Process Manufacturing Performance. IEEE Control Systems Mag. 2002, 26−39.
(12) Simoglou, A.; Georgieva, P.; Martin, E.; Morris, A.; Feyo de Azevedo, S. On-Line Monitoring of a Sugar Crystallization Process. Comput. Chem. Eng. 2005, 29, 1411−1422.
(13) Ündey, C.; Tatara, E.; Çinar, A. Real-Time Batch Process Supervision by Integrated Knowledge-Based Systems and Multivariate Statistical Methods. Eng. Appl. Artif. Intell. 2003, 16, 555−566.
(14) Ündey, C.; Ertunç, S.; Çinar, A. Online Batch/Fed-Batch Process Performance Monitoring, Quality Prediction, and Variable-Contribution Analysis for Diagnosis. Ind. Eng. Chem. Res. 2003, 42, 4645−4658.
(15) Fransson, M.; Folestad, S. Real-Time Alignment of Batch Process Data Using COW for on-Line Process Monitoring. Chemom. Intell. Lab. Syst. 2006, 84, 56−61.
(16) Kassidas, A.; MacGregor, J.; Taylor, P. Synchronization of Batch Trajectories Using Dynamic Time Warping. AIChE J. 1998, 44, 864−875.
(17) SIMCA-P release 13.0.3 for Windows, Graphical Software for Multivariate Process Modeling; Umetri: Umea, Sweden, 2013.
(18) González-Martínez, J. M.; de Noord, O. E.; Ferrer, A. Multisynchro: a Novel Approach for Batch Synchronization in Scenarios of Multiple Asynchronisms. Submitted for publication.
(19) González-Martínez, J. M.; Ferrer, A.; Westerhuis, J. Real-Time Synchronization of Batch Trajectories for on-Line Multivariate Statistical Process Control Using Dynamic Time Warping. Chemom. Intell. Lab. Syst. 2011, 105, 195−206.
(20) Nomikos, P.; MacGregor, J. Monitoring Batch Processes Using Multiway Principal Components. AIChE J. 1994, 40, 1361−1375.
(21) Lei, F.; Rotbøll, M.; Jørgensen, S. A Biochemically Structured Model for Saccharomyces cerevisiae. J. Biotechnol. 2001, 88, 205−221.
(22) Camacho, J.; González-Martínez, J.; Ferrer, A. Multi-Phase (MP) toolbox. http://mseg.webs.upv.es/Software.html (accessed 2013).
(23) Camacho, J.; Picó, J.; Ferrer, A. Bilinear Modelling of Batch Processes. Part I: Theoretical Discussion. J. Chemom. 2008, 22, 299−308.

batches have the same length and, therefore, applying TLEC does not change the trajectories. Nevertheless, as shown in Figure 7b, the LSD intervals for TLEC and Multisynchro do not overlap, yielding OTII values that are, on average, significantly higher for TLEC. This proves that equal length does not necessarily ensure synchronized data, and that choosing a good synchronization method may improve the performance of the monitoring scheme. These differences in the capability of detecting faults as a function of the type of asynchronism for TLEC agree with the differences observed in the quality of synchronization summarized in Figure 4. The higher the variability in the synchronized trajectories, the lower the performance of the monitoring schemes in fault detection.
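The ANOVA and LSD intervals discussed above are computed on arcsine square root transformed OTII values. As a rough illustration of that transformation only, and not a reproduction of the authors' ANOVA, the sketch below transforms the t1 OTII values for the type I fault listed in Table 3, averages them per synchronization method, and maps the averages back to percentages. The function names and the choice of this subset of Table 3 are assumptions made for the example.

```python
import numpy as np

def arcsine_sqrt(percent):
    """Arcsine square root transform of percentages in [0, 100]."""
    p = np.asarray(percent, dtype=float) / 100.0
    return np.arcsin(np.sqrt(p))

def inverse_arcsine_sqrt(angle):
    """Map transformed values (radians in [0, pi/2]) back to percentages."""
    return 100.0 * np.sin(np.asarray(angle, dtype=float)) ** 2

# t1 OTII values (%) for the type I fault, cases 1 to 5, copied from Table 3.
otii_t1_fault1 = {
    "TLEC": [45.9, 82.3, 77.0, 65.6, 83.0],
    "Multisynchro": [35.8, 38.0, 37.2, 35.9, 37.1],
}

# Average on the transformed scale, then back-transform for readability.
mean_transformed = {m: arcsine_sqrt(v).mean() for m, v in otii_t1_fault1.items()}
back_transformed = {m: float(inverse_arcsine_sqrt(a)) for m, a in mean_transformed.items()}

for method, value in back_transformed.items():
    print(f"{method}: back-transformed mean OTII = {value:.1f}%")
```

The transformation is commonly applied to proportions before an ANOVA to stabilize their variance; the full analysis in the paper additionally models the asynchronism case and its interaction with the synchronization method.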

5. CONCLUSIONS

Quality of batch synchronization is one of the critical factors that affect the performance of the monitoring schemes in fault detection. When the key process events do not overlap at the same point of process evolution, which would ensure the same process pace in all batches, the capability of the monitoring schemes for fault detection is dramatically reduced. Contrary to what is often assumed in practice (and also in commercial software, for example, SIMCA Release 13.0.3 by Umetrics), equal length does not guarantee synchronized batches. Simple methods like TLEC (implemented in SIMCA) linearly interpolate data without considering the overlap of the key process events. Hence, the asynchronism present in the raw data is inherited in the resulting OWU scores and DModX statistics. The increase of the variability in batch trajectories due to inappropriate synchronization has an important negative effect. The higher the variability, the lower the performance of the control charts in fault detection. Multisynchro is a promising approach to perform reliable batch synchronizations with different classes of asynchronisms and can be used for both end-of-batch and real-time applications.
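The statement that equal length does not guarantee synchronized batches can be illustrated with a minimal sketch. Uniform linear resampling with `numpy.interp` is used here as a simplified stand-in for length equalization by linear interpolation; it is not the TLEC implementation in SIMCA, and the helper names, the step-shaped trajectories, and the threshold-based "key event" are all hypothetical.

```python
import numpy as np

def resample_linear(y, n_out):
    """Linearly stretch or compress a trajectory to n_out sampling points."""
    y = np.asarray(y, dtype=float)
    old_grid = np.linspace(0.0, 1.0, len(y))
    new_grid = np.linspace(0.0, 1.0, n_out)
    return np.interp(new_grid, old_grid, y)

def event_index(y, threshold=0.5):
    """First sampling point at which the trajectory reaches the threshold."""
    return int(np.argmax(np.asarray(y) >= threshold))

# Two batches of equal length whose key event occurs at different times,
# loosely mimicking batches that differ only in evolution pace.
k = np.arange(100)
batch_a = (k >= 60).astype(float)
batch_b = (k >= 80).astype(float)

# Equalizing length changes nothing here: both batches already have 100 points.
a_sync = resample_linear(batch_a, 100)
b_sync = resample_linear(batch_b, 100)

print(event_index(a_sync), event_index(b_sync))
```

Because the resampling only rescales the time axis as a whole, the two events remain 20 sampling points apart after "synchronization", so any model built on these trajectories inherits that misalignment as extra variability.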



ASSOCIATED CONTENT

Supporting Information

OWU scores and DModX control charts for the NOC and faulty (type I, II, and III) test batches affected by five different types of asynchronism (cases 1, 2, 3, 4, and 5) are provided. In addition, the NOC and faulty batches simulated with different types of asynchronism are available. This material is available free of charge via the Internet at http://pubs.acs.org.



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

This research work was partially supported by the Spanish Ministry of Economy and Competitiveness under the project DPI2011-28112-C04-02. Part of this research work was carried out during an internship of the corresponding author at Shell Global Solutions International B.V. (Amsterdam, The Netherlands).



REFERENCES

(1) García-Muñoz, S.; Polizzi, M.; Prpich, A.; Strain, C.; Lalonde, A.; Negron, V. Experiences in Batch Trajectory Alignment for

(24) Camacho, J.; Picó, J.; Ferrer, A. Bilinear Modelling of Batch Processes. Part II: A Comparison of PLS Soft-Sensors. J. Chemom. 2008, 22, 533.
(25) González-Martínez, J. M.; Camacho, J.; Ferrer, A. Bilinear Modeling of Batch Processes. Part III: Parameters Stability. J. Chemom. 2014, 28, 10−27.
(26) Ferrer, A. Multivariate Statistical Process Control based on Principal Component Analysis (MSPC-PCA): Some Reflections and a Case Study in an Autobody Assembly Process. Qual. Eng. 2007, 19, 311−325.
(27) Box, G. E. P. Some Theorems on Quadratic Forms Applied in the Study of Analysis of Variance Problems: Effect of Inequality of Variance in One-Way Classification. Ann. Math. Stat. 1954, 25, 290−302.
(28) Jackson, J.; Mudholkar, G. S. Control Procedures for Residuals Associated with Principal Component Analysis. Technometrics 1979, 21, 341−349.
(29) González-Martínez, J. M.; Westerhuis, J.; Ferrer, A. Using Warping Information for Batch Process Monitoring and Fault Classification. Chemom. Intell. Lab. Syst. 2013, 127, 210−217.
(30) Arteaga, F.; Ferrer, A. Dealing with Missing Data in MSPC: Several Methods, Different Interpretations, Some Examples. J. Chemom. 2002, 16, 408−418.
(31) Ramaker, H.; van Sprang, E.; Westerhuis, J.; Smilde, A. Fault Detection Properties of Global, Local and Time Evolving Models for Batch Process Monitoring. J. Process Control 2005, 15, 799−805.
(32) Camacho, J.; Picó, J.; Ferrer, A. On-Line Monitoring of Batch Processes Based on PCA: Does the Modelling Structure Matter? Anal. Chim. Acta 2009, 642, 59−69.
(33) van Sprang, E.; Ramaker, H.; Westerhuis, J.; Gurden, S.; Smilde, A. Critical Evaluation of Approaches for on-Line Batch Process Monitoring. Chem. Eng. Sci. 2002, 57, 3979−3991.
