A Unifying and Integrated Framework for Feature Oriented Analysis of


A Unifying and Integrated Framework for Feature Oriented Analysis of Batch Processes Ricardo R. Rendall, Bo Lu, Ivan Castillo, Swee-Teng Chin, Leo H. Chiang, and Marco Seabra Reis Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/acs.iecr.6b04553 • Publication Date (Web): 05 Jul 2017 Downloaded from http://pubs.acs.org on July 9, 2017


Industrial & Engineering Chemistry Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Industrial & Engineering Chemistry Research

A Unifying and Integrated Framework for Feature Oriented Analysis of Batch Processes

Ricardo Rendall(1), Bo Lu(2), Ivan Castillo(2), Swee-Teng Chin(2), Leo H. Chiang(2) and Marco S. Reis(1,*)

(1) CIEPQPF, Department of Chemical Engineering, University of Coimbra, Rua Sílvio Lima, 3030-790, Coimbra, Portugal
(2) Analytical Tech Center, Dow Chemical Company, Freeport, TX
* [email protected]

Abstract

We present a data analytics framework for offline analysis of batch processes. The framework provides a unified setting for implementing several variants of feature oriented analysis proposed in the literature, including a new methodology based on the process variables’ profiles presented in this article. It also integrates feature generation and feature analysis, in order to speed up the data exploration cycle, which is especially relevant for complex batch processes. The FOBA (Feature Oriented Batch Analytics platform) is described in detail and applied to several case studies regarding different analysis goals: visualization of the differences between the operation of two industrial units (dryers), quality prediction and end-of-batch process monitoring. The performance of the proposed methodology is also critically assessed and compared with other alternative analytical approaches currently in use.

Keywords

Batch Processes; Process Analysis and Diagnosis; Industrial Data Analytics; Process Data Mining; Knowledge Discovery from Databases; Profiles; Features generation


1 Introduction

Batch processes are ubiquitous in modern industry, irrespective of the sector of activity. Chemical, pharmaceutical, food & drink, and semiconductors are examples of industrial sectors with high economic importance, where many operations take place through repetitive and consistent cycles of production, giving rise to high value-added products. Continuous improvement efforts in these settings tend to present a strong leveraging effect, as even marginal gains in efficiency of a few percentage points can represent a significant accumulated economic impact at the end of the year, whether in cost savings, added production or better product quality. In this context, industrial databases are a key resource for supporting continuous improvement activities. They contain historical information about raw products, process conditions and final product quality that can be further explored to identify improvement opportunities, diagnose problems and conduct process optimization activities. However, some common problems lie ahead when pursuing this data-driven path for process improvement. First of all, data from raw materials, process operations and product quality are not always available in an integrated and easily accessible way, which may create difficulties in retrieving the appropriate records for a given data analysis task and in combining them in a meaningful and coherent way.
Furthermore, even when industrial data can be properly retrieved from the process storage servers, their analysis is not straightforward because of the simultaneous presence of multiple complicating features that lack adequate analytical solutions, such as high dimensionality, missing data, multirate measurements, multiresolution data, noise, outliers, etc.1-5 In the case of batch processes, data analysis is even more complex than for continuous processes, because the process features to be described and analyzed concern not only the relationships between variables, but also their non-stationary variation in time, which plays a central role in the final quality of the products. Adding to this, batch processes often present multiple stages, not all of them equally important and whose interaction pattern may be quite diverse, further complicating their analysis. Correctly handling all these characteristics is fundamental to batch analysis. Therefore, data-driven continuous improvement in this context calls for proper analytical solutions that are able to cope with the aforementioned complicating features and that can efficiently assist users in the off-line analysis of batch data, with all sorts of structures they may present.

Analytics frameworks for batch data

Many solutions have been proposed to address the challenges outlined. The initial proposals, still widely applied, consist of unfolding the three-way data array with dimensions (I×J×K), where I stands for the number of batches, J for the number of variables and K for the number of time steps, into a two-way matrix (Figure 1) that can be analyzed using available multivariate approaches, such as Principal Component Analysis (PCA)6, 7 or Partial Least Squares (PLS).8, 9 For instance, Nomikos and MacGregor10, 11 suggested unfolding to an I×(J×K) matrix (batch-wise unfolding, BWU), containing all the variables at all the times as columns, and the different batches as rows. This usually results in a very wide matrix, with several thousand pseudo-variables whose relationships need to be modelled, and it requires proper preprocessing (centering and scaling) as well as synchronization/alignment (a far from trivial matter, especially in online implementations10-16). Due to the nature of the unfolding, in online applications this approach requires the prediction of future values for the variables being monitored, for which several solutions have also been proposed.17 An alternative that circumvents the prediction problem of batch-wise unfolding is to build a battery of local models or time-evolving models, making use only of the information that is available up until the current time of monitoring.18

Figure 1. Batch-wise unfolding procedure: the three-way array (I×J×K) is unfolded into a two-way array I×(J×K).
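The batch-wise unfolding of Figure 1 can be illustrated with a minimal NumPy sketch. The array sizes, the random data and the column ordering (columns grouped by time step) are assumptions made here for illustration only; they are not taken from the cited works.

```python
import numpy as np

# Hypothetical dimensions: I batches, J variables, K time steps.
I, J, K = 4, 3, 5
rng = np.random.default_rng(0)
X = rng.normal(size=(I, J, K))  # three-way array (I x J x K), stand-in data

# Batch-wise unfolding: one row per batch, J*K pseudo-variables as columns.
# Moving the time axis ahead of the variable axis groups the columns by
# time step (all J variables at k=0, then all J variables at k=1, ...).
X_bwu = X.transpose(0, 2, 1).reshape(I, K * J)

# Usual preprocessing: centre and scale every column across batches,
# which removes the mean trajectory of each variable.
mean_traj = X_bwu.mean(axis=0)
std_traj = X_bwu.std(axis=0, ddof=1)
X_scaled = (X_bwu - mean_traj) / std_traj
```

With this ordering, the value of variable j at time step k for batch i is found in column k*J + j of row i. Note that the centering and scaling step alone estimates 2 × (J × K) parameters.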


Wold and coworkers,19 on the other hand, suggested unfolding the original cube of data into an (I×K)×J matrix (variable-wise unfolding, VWU). The dimensionality of this modelling space is much lower than in the previous unfolding strategy, but the presence of non-stationary dynamics (a key feature of batch processes) is not captured by the direct application of PCA or PLS, which limits the application of this approach in the analysis of complex batch processes. Furthermore, VWU is less appropriate for predicting end-of-batch quality parameters20 and may also require synchronization,16 which increases its implementation complexity.

Dynamic methods, such as dynamic PCA and dynamic PLS, can also be applied to monitoring batch operations.21-23 These methods rely on variable-wise unfolding, and are therefore more parsimonious than those based on the Nomikos and MacGregor unfolding scheme (BWU). However, they also include time-shifted variables to incorporate the batch dynamics in the bilinear modeling and, because of that, they fall somewhat in between batch-wise and variable-wise unfolding. The limiting feature of this class of methods is that batch trajectories are non-stationary, and a single dynamic model is rarely sufficient to describe the autocorrelation of variables throughout the entire batch duration. Therefore, they usually require additional preprocessing or a sound division of the batch into stages with homogeneous dynamic features.24

Another analytical platform for analyzing batch data consists in the adoption of methods that are able to handle its three-way structure directly. This can be achieved by resorting to multi-way methods,25 with Tucker3 and PARAFAC26 being the ones most commonly adopted in this context. These methods also require preprocessing (which is now more complex than with the former unfolding schemes, and has more impact on the final performance) and also require that the batch trajectories are previously aligned/synchronized, as happens for batch-wise unfolding.

In the presence of strong nonlinearities, nonlinear methods have been introduced for batch data analysis and, in particular, kernel methods have been adopted.27-32 These methods are suitable when the relationship between process variables and the quality parameter is predominantly nonlinear. However, their complexity is high and care must be exercised in order to avoid overfitting. Also, the increased complexity requires the availability of large data sets for properly estimating the model’s parameters and hyperparameters.
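The variable-wise unfolding and the time-shifted augmentation used by dynamic methods, both discussed above, can be sketched as follows. The helper `lagged_matrix`, the number of lags and the toy array are hypothetical choices for illustration; the cited dynamic PCA/PLS methods may build their lagged matrices differently.

```python
import numpy as np

# Hypothetical dimensions: I batches, J variables, K time steps.
I, J, K = 4, 3, 5
X = np.arange(I * J * K, dtype=float).reshape(I, J, K)

# Variable-wise unfolding: stack the K time steps of every batch as rows,
# keeping the J variables as columns -> (I*K) x J.
X_vwu = X.transpose(0, 2, 1).reshape(I * K, J)

def lagged_matrix(batch, n_lags):
    """Augment one batch (K x J) with n_lags time-shifted copies.

    Row t of the result holds [x(t), x(t-1), ..., x(t-n_lags)], so the
    first n_lags time steps of the batch are dropped.
    """
    n_steps = batch.shape[0]
    blocks = [batch[n_lags - lag : n_steps - lag] for lag in range(n_lags + 1)]
    return np.hstack(blocks)

# Lags are built inside each batch so that no row mixes two batches.
X_dyn = np.vstack([lagged_matrix(X[i].T, n_lags=2) for i in range(I)])
```

The lagged matrix has J × (n_lags + 1) columns, which places it between the J columns of VWU and the J × K columns of BWU.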


An alternative class of analytical approaches for analyzing batch data offline is based on the computation of a set of features from the variables’ trajectories that capture the essential information about the process behavior but without retaining the entire variables’ profiles. The underlying principle stems from classical Pattern Recognition analysis, where one of the critical steps is precisely “feature generation”. Pattern recognition can be briefly described as a succession of transformations from the measurement space, M, to the feature space, F, and, finally to the decision space, D:33

M → F → D

In a classification problem the set D is discrete and finite, whereas in a prediction problem it is continuous. The information regarding the response, d ∈ D, is available for the objects belonging to the training set in order to develop the classifier/predictor that will be used to estimate the class memberships/value for new objects, symbolically represented by the application δ : F → D. This happens after data in the measurement space is properly mapped onto the feature space, α : M → F, which potentially contains all the information necessary to successfully conduct the pattern recognition task. Applying the same principle to the offline analysis of batch data, batch measurements are first converted into properly defined features, which are then used for process analysis, to build a predictive model, or for process monitoring, among other possibilities. In this context, several classes of features have been proposed in the literature:



• Simple descriptive features for the batch variables, such as minimum, maximum, mean and duration of the stage. To these simple features, others are sometimes added that are based on the background knowledge available about the process and on the aspects of the variables’ profiles that are more important for the outcome of the batch. These features are usually called Landmark Features34 or, in the case where they are based on a priori knowledge, Knowledge-Based Features.

• In the scope of the semiconductor industry, Wang and He proposed the use of several statistical measures for the variables, such as the mean, covariance, skewness and kurtosis. The methodology is called Statistics Pattern Analysis (SPA).35, 36

• Also in the scope of the semiconductor industry, Rato et al. proposed the use of features computed from the wavelet-based multiscale decomposition of the variables’ batch profiles.37 The methodology is called TIME (Translation Invariant Multiscale Energy-based features).
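The first two classes of features listed above can be sketched in a few lines of NumPy. The function names, the use of the non-excess kurtosis, the choice of keeping the upper triangle of the covariance matrix and the random stand-in data are all assumptions made for this illustration; they are not the implementations of the cited works.

```python
import numpy as np

def landmark_features(batch):
    """Simple descriptive features for one batch stored as a (K x J) array."""
    return np.concatenate([
        batch.min(axis=0),
        batch.max(axis=0),
        batch.mean(axis=0),
        [batch.shape[0]],          # duration, in number of samples
    ])

def spa_features(batch):
    """Mean, covariance (upper triangle), skewness and kurtosis of a batch."""
    mean = batch.mean(axis=0)
    centred = batch - mean
    cov = np.cov(batch, rowvar=False)
    upper = cov[np.triu_indices(cov.shape[0])]
    std = batch.std(axis=0)
    skew = (centred ** 3).mean(axis=0) / std ** 3
    kurt = (centred ** 4).mean(axis=0) / std ** 4   # non-excess kurtosis
    return np.concatenate([mean, upper, skew, kurt])

# Batches of different durations map to feature vectors of equal length.
rng = np.random.default_rng(1)
batches = [rng.normal(size=(n_steps, 3)) for n_steps in (50, 60, 55)]
F_landmark = np.vstack([landmark_features(b) for b in batches])
F_spa = np.vstack([spa_features(b) for b in batches])
```

Each row of `F_landmark` or `F_spa` describes one batch, so the resulting two-way matrices can be passed to standard multivariate methods without unfolding or alignment.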

As mentioned above, in its base formulation, SPA makes use of four statistical features computed for the measured variables:36 mean, skewness, kurtosis and covariance; other higher order statistics and more elaborate features can also be used to capture key process characteristics in some variables. These four statistics were developed for characterizing the random behavior of stationary i.i.d. variables, and none of them contains information about the variables’ dynamics or non-stationary behavior, which is a prevalent characteristic in batch processes. In fact, these statistics are invariant to any change in the order of the observations. In another reference,35 SPA also includes the variables’ autocorrelation, which brings some dynamic description of the behavior of the variables around the mean level into the analysis. However, autocorrelation assumes a stationary dynamic behavior, which cannot be assumed in general for this type of process. SPA features are general and therefore easy to compute and automate. They also allow for suitable descriptions of stationary processes (with or without dynamics). On the other hand, they do not directly address the intrinsic non-stationary nature of the variables’ profiles in batch processes. In fact, the typical patterns for batch variables do not consist of i.i.d. variability or simple stationary dynamics (autoregressive and/or moving average processes), but rather of patterns with changing slopes, heteroscedastic variability, piece-wise behavior, etc., depending on the phenomena and the conditions that optimize the production goals. In this context, one of the contributions of this paper is a new set of features (more specifically, a new dictionary of features, in the nomenclature introduced in Section 2) that incorporates the specificity of the profile exhibited by each variable. In this way, we aim at extracting the fundamental patterns of variation of the variables’ behavior in a more compact (with fewer descriptors) and targeted way (looking at the actual non-stationary pattern), without assuming that all variables, with their diverse behavior, will be adequately characterized by the same set of statistical features.
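The order-invariance of the SPA statistics noted above can be checked numerically. The sketch below uses a hypothetical noise-free ramp and its shuffled version (univariate case only, so the covariance between variables is not shown), and compares the four moments with a profile-specific feature, the fitted slope.

```python
import numpy as np

def moments(x):
    """Mean, variance, skewness and kurtosis of a 1-D profile."""
    c = x - x.mean()
    s = x.std()
    return np.array([x.mean(), x.var(), (c**3).mean() / s**3, (c**4).mean() / s**4])

t = np.arange(100.0)
ramp = 2.0 + 0.5 * t                                   # profile with constant slope
shuffled = np.random.default_rng(0).permutation(ramp)  # same values, random order

# The four statistics cannot tell the ramp from its shuffled version...
same_moments = np.allclose(moments(ramp), moments(shuffled))

# ...whereas a feature tied to the profile shape, such as the fitted slope, can.
slope_ramp = np.polyfit(t, ramp, 1)[0]
slope_shuffled = np.polyfit(t, shuffled, 1)[0]
```

The two profiles share the same moments, while only the original ramp recovers the slope of 0.5.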

Complexity mapping of analytical frameworks for batch data

Considering the different analytics frameworks for analyzing batch data that have been proposed in the literature, one can easily find a strong dominance of methodologies that retain time-resolution information, either by directly handling the three-way nature of the data (multiway approaches38) or by converting it into a suitable two-way format through batch-wise or variable-wise unfolding (BWU/VWU),34 where existing multivariate approaches can be applied, possibly complemented or upgraded with other methodologies for handling non-Gaussianity,39 non-linearity,28 multiple phases,40 transitions between phases,31, 41 etc. All these approaches have model structures that capture non-stationarity for the diversity of batch processes in a rather flexible way, a capability achieved at the cost of introducing a large number of parameters to be estimated and tuned for each application: from preprocessing parameters (mean trajectory subtraction, scaling) to those appearing in the two-way and three-way model structures. These additional parameters increase the complexity of the problem because of the extra degrees of freedom they introduce: the higher their number, the more complex the model becomes. This regards the method’s “estimation complexity”, which increases with the number of degrees of freedom used for fitting a model. Therefore, all these methods rank high in the scale of estimation complexity. A large number of parameters need to be estimated, starting with those from preprocessing (mean trajectory subtraction and scaling), 2 × (J × K), to which one must also add the number of degrees of freedom consumed in estimating the bilinear model (sometimes difficult to define, but of the order of magnitude of the number of latent variables considered42). For a few dozen process variables and hundreds of time steps for the batch duration, this class of approaches easily presents several thousand parameters to be estimated from process data. This may create conditions for over-parametrized models, with the usual consequences being data overfitting and loss of robustness, a critical feature in industrial processes. In addition to the estimation complexity, another type of complexity, related to the implementation difficulty of the methods, is also relevant and should be taken into account.

For instance, the two-way BWU methods require non-trivial alignment/synchronization of batch runs, which implies that companies must have personnel with a particular set of skills and/or who have undergone advanced training to implement such tasks. Thus, taking into account both the estimation and implementation aspects referred to above, one can conclude that most of the time-resolved two-way and three-way analytics platforms proposed in the literature for handling batch data (time-resolved approaches are those that retain the information with the original time resolution) belong to the higher levels of the complexity spectrum (see Figure 2). On the other hand, the less complex part of the spectrum of data analytics approaches remains largely unexplored. This part of the spectrum is mainly composed of feature-oriented approaches based on pattern recognition principles,43 for which only a few methodologies have been proposed in the literature. We believe that it is both opportune and important to further explore this domain of analytics solutions, for reasons related to (i) the potentially better characteristics of these frameworks for a wide class of processes and (ii) a methodological perspective, as will be briefly discussed below.
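To make the estimation-complexity argument concrete, the preprocessing count 2 × (J × K) quoted above can be evaluated for a hypothetical process. The values of J and K, and the assumption of roughly four features per variable for a feature-oriented alternative, are illustrative only.

```python
# Hypothetical process: 30 variables sampled at 200 time steps per batch.
J, K = 30, 200

# Time-resolved BWU preprocessing: one mean and one scale per pseudo-variable.
n_preprocessing = 2 * (J * K)

# Feature-oriented alternative, assuming about 4 features per variable.
features_per_variable = 4
n_feature_columns = J * features_per_variable

ratio = n_preprocessing / n_feature_columns
```

Under these assumptions, the preprocessing step alone estimates 12000 parameters, whereas the feature-oriented representation has 120 columns, before any latent variable model is fitted.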

[Figure 2: complexity axis running from LOW COMPLEXITY to HIGH COMPLEXITY. Labels in the figure: SPA (batch-wise); SPA and TIME-PCA (stage-wise); Landmark Features; 2-Way VWU; dynamic methods (ARPCA, BDPCA); 2-Way BWU and 2-Way Kernel-BWU (with and without alignment); 3-Way PARAFAC and Tucker3 (with alignment).]

Figure 2. A schematic representation of the hierarchy of complexity of analytics platforms for batch process data.

Approaches based on the preliminary computation of features for each variable profile (feature-oriented approaches) are simpler to implement (they do not require batch alignment/synchronization, complex preprocessing, trajectory mean shifting, etc.) and the models are less complex (from both the estimation and implementation perspectives). Despite their inherent simplicity (or because of it), several articles report better results in a number of applications, including process monitoring and fault detection.35-37 This means that, in the light of the parsimony principle (Occam’s razor), the simplest solution that works is often found in the low complexity domain of feature-oriented methodologies, rather than in the high complexity domain of time-resolved approaches. From a methodological point of view, common sense advises starting with simpler approaches and increasing complexity only if necessary to improve results (which may turn out not to be needed), progressing along this path only as long as results improve.44 Starting the other way around, i.e., beginning with the high complexity approaches, may lead to a waste of resources (time, human, computational, etc.) with no a priori guarantee that such extra overhead of work will lead to better results. However, this is precisely the dominant scenario nowadays in the context of offline batch analysis: most studies start (and end) with time-resolved two-way or three-way approaches. Therefore, for offline analysis with the purposes of data visualization, troubleshooting, quality prediction, end-of-batch process monitoring, etc. (data-driven continuous improvement activities), we recommend applying the parsimony principle and starting by exploring the low complexity analytics platforms for batch data.44

Summary of the contributions in this article

The main contributions in this article can be summarized in the following three bullets:

• A new dictionary of features (Profiles-driven Features, PdF) is proposed, to enrich the space of solutions in the lower domain of the complexity spectrum (Section 2.1);

• A unifying and integrated platform for feature-oriented analysis is proposed (FOBA, Feature Oriented Batch Analytics platform): unifying because it allows for the use of different sets of features (called “dictionaries of features”, such as Landmark features, SPA, PdF, etc.), one-at-a-time or combined; integrated because feature generation is integrated with analytics platforms for visualization, classification, prediction, etc. (Section 2.2);

• A structured methodologic workflow is proposed for the off-line analysis of batch processes based on the parsimony principle (Section 2.3).

These three contributions have a different nature (conceptual, algorithmic, and methodologic) and will be presented in more detail in the next section (Section 2). Then, in Section 3, we apply FOBA to several case studies, with different goals, in order to illustrate its potential, as well as the added-value of the new dictionary of features (PdF). Finally, we conclude with a summary of the contributions proposed in this article and an indication of possible research paths for the future.

2 Methods

This section describes in detail the newly proposed approach, which is based on a dictionary of Profiles-driven Features (PdF), and outlines different aspects involved in its application. Furthermore, the Feature Oriented Batch Analytics platform (FOBA) is described and a unified view of feature oriented approaches for batch analytics is presented.

2.1 Profiles-driven Features (PdF)

In the scope of FOBA (see Section 2.2), each set of features (Landmark, SPA, etc.) corresponds to a different “dictionary”. A dictionary is a collection of objects whose properties lead to the computation of features for the variables’ profiles. A new dictionary is proposed here that explicitly considers the specific non-stationary nature of the variables’ profiles in the computation of the features. Variable profiles that are widely different will be characterized by different sets of features, each set being more adequate to represent the variability exhibited by the respective type of profile. This is one of the differentiating aspects of this dictionary relative to existing ones, namely SPA and TIME, where more general features are often computed for all variables and may not fully account for their specific profile patterns (see also Appendix A1). The new dictionary is called Profiles-driven Features (PdF), and its components are described in the following paragraphs (see also Figure 3).

Object-profiles. The dictionary is composed of a finite set of objects, $P = \{op_i\}_{i=1,\dots,p}$. Each object corresponds to a given type of profile or pattern. A profile is a parametrized representation of a pattern that, upon training with process data, will lead to a realization of a curve. We call profile $p_i$ the parametrized model structure associated with the object-profile $op_i$. Variables are assigned to the object-profile $op_i$ if they exhibit a similar pattern (of course, the case of non-correspondence is also possible and actually rather common, as the dictionary has a wide coverage of possible profile patterns, not all of them necessarily appearing in every batch process); the assignment process will be addressed further ahead in this section. By “similar” it is meant that it is possible to fit the profile structure to data, obtaining in the end an estimated profile $\hat{p}_i$ ($p_i \xrightarrow{\text{data}} \hat{p}_i$) with a good fit. In other words, “similar” means that the profile is able to adequately describe the non-stationary behavior of the variable in question. Object-profiles considered in the dictionary tend to have very parsimonious model structures, with only a few parameters to be estimated from process data.

Estimation Engine. Each object-profile has an estimation engine. This estimation engine is a method (using object-oriented programming nomenclature) that takes data regarding a given variable under consideration and computes the estimated parameters of the profile assigned to such variable by maximizing model fitting (in the sense of minimizing the residual sum of squares). Thus, the estimation engine provides an estimated trajectory ($\hat{p}_i$) for that variable, which will be used later by the computation engine.

List of Profile Specific Features. Associated with each object-profile is a set of features that are pertinent to compute. These features are specific to the profile under consideration.

Features Computation Engine. This engine computes the values of the features in the list, for each variable that was assigned to the associated object-profile, using process data and the estimated profile provided by the estimation engine.

Figure 3. Schematic representation of the PdF dictionary and its main structural components.
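The structure summarized in Figure 3 can be sketched as a small class. This is a hypothetical illustration of one object-profile (linear), not the FOBA implementation: the class name, the method names, the n − 2 divisor for the residual variance and the choice of computing the area under the measured profile by the trapezoidal rule are all assumptions.

```python
import numpy as np

class LinearProfile:
    """Hypothetical object-profile for variables that follow a straight line."""

    # List of profile-specific features.
    feature_names = ["slope", "intercept", "ssr", "residual_variance", "area"]

    def estimate(self, t, y):
        """Estimation engine: least-squares fit, returns the estimated profile."""
        self.slope, self.intercept = np.polyfit(t, y, 1)
        return self.intercept + self.slope * t

    def features(self, t, y):
        """Features computation engine for one variable in one batch."""
        y_hat = self.estimate(t, y)
        residuals = y - y_hat
        ssr = float(residuals @ residuals)
        # Trapezoidal area under the measured profile.
        area = float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))
        return {
            "slope": float(self.slope),
            "intercept": float(self.intercept),
            "ssr": ssr,
            "residual_variance": ssr / (len(y) - 2),
            "area": area,
        }

# Two batches of different length yield the same five features.
t_short, t_long = np.arange(50.0), np.arange(80.0)
profile = LinearProfile()
f_short = profile.features(t_short, 1.0 + 0.2 * t_short)
f_long = profile.features(t_long, 1.0 + 0.3 * t_long)
```

Because the features are computed per batch from that batch's own time axis, the two batches above do not need to be aligned or resampled to a common length.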


Some examples of object-profiles are presented in Table 1.

Table 1. Examples of object-profiles available in the PdF dictionary.

Object-Profile    Features
Constant          Mean; Variance; [Residual statistics]; Area
Linear            Slope; Intercept; SSR*; Residual variance; [Residual statistics]; Area
Step              Step occurrence; Mean before and after step; Variance before and after step; SSR*; [Residual statistics]; Area
Pulse             Pulse beginning; Mean before, after and in the pulse; Variance before, after and in the pulse; SSR*; [Residual statistics]; Area
Impulse           Impulse beginning; Maximum value of impulse; Time of occurrence of maximum value; Area

* SSR is the sum of squared residuals between the estimated and measured profiles.

[Residual statistics] corresponds to additional statistics that are not in the default set of features in the dictionary, but that could be included to improve performance in particular applications (e.g., high-order statistics to address non-Gaussian residuals).


The process of assigning variables to the characteristic profile (object) can be made in one of three possible ways, called “assignment modes” (AM):



User supervised (AM-US). The user provides the mapping between variables and the corresponding characteristic profile. This is the method adopted in all case studies since it can be done alongside a first exploratory stage for acquiring familiarity with the main variability patterns found in the data set. However, this approach may be cumbersome or impractical in industrial processes containing a large number of variables. In those scenarios, the automated or semi-automated methods are preferred to facilitate the assignment task.



Automated (AM-AU). Profiles are automatically assigned using an algorithm based on the average quality of fit of each type of profile to the variables’ data. For instance, measures such as R2 or the sum of squared errors can be used to measure the similarity/dissimilarity between raw and estimated trajectories. However, one must account for the complexity of the object-profiles and, for instance, a linear object-profile will always fit equally well or better than a constant object-profile. Thus, the adjusted R2 and other alternatives that penalize complexity might be more suitable (such as the AIC and BIC criteria). In other words, the mechanism for achieving an automatic assignment mode consists in using a measure of model fitting to guide the mapping between process variables and object-profiles. However, the AM-AU was not fully developed in this paper and should be considered for future work since further studies are required to identify and compare different measures of model fitting in the context of PdF. Thus, this assignment mode was not used in any of the case studies presented here because, as stated above, we value the AM-US aspect of enabling a first stage of data visualization.



Semi-automated (AM-SA). In this case, AM-AU is first conducted and the best candidates for each variable are saved. The user then selects one of the suggested profiles for each variable. This alternative speeds up the assignment process relative to AM-US while still giving the user control over the process, which may also be useful in cases where variable profiles do not show a clear pattern.
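Since AM-AU is left as future work, the following is only a minimal sketch of how a complexity-penalized assignment could look, assuming just two candidate object-profiles (constant and linear) and the BIC as the fit measure. The function names (`fit_constant`, `fit_linear`, `rank_profiles`) and the Gaussian-error form of the BIC are illustrative assumptions, not the authors' algorithm.

```python
import math

def fit_constant(t, y):
    """Least-squares constant profile: returns (SSE, number of parameters)."""
    mean = sum(y) / len(y)
    return sum((v - mean) ** 2 for v in y), 1

def fit_linear(t, y):
    """Least-squares straight line: returns (SSE, number of parameters)."""
    n = len(y)
    t_mean, y_mean = sum(t) / n, sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    sse = sum((yi - (intercept + slope * ti)) ** 2 for ti, yi in zip(t, y))
    return sse, 2

def bic(sse, n, k):
    """BIC for Gaussian errors, up to an additive constant (floor avoids log(0))."""
    return n * math.log(max(sse, 1e-12) / n) + k * math.log(n)

# Hypothetical candidate set; a real dictionary would contain more profiles.
CANDIDATES = {"constant": fit_constant, "linear": fit_linear}

def rank_profiles(batches):
    """batches: list of (t, y) trajectories of one variable, one per batch.
    Returns the candidate names sorted by average BIC, best first."""
    scores = {}
    for name, fit in CANDIDATES.items():
        total = 0.0
        for t, y in batches:
            sse, k = fit(t, y)
            total += bic(sse, len(y), k)
        scores[name] = total / len(batches)
    return sorted(scores, key=scores.get)
```

Under AM-AU the first entry of the ranking would be assigned directly; under AM-SA the ranked list would be shown to the user, who makes the final choice.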

The proposed dictionary of profile-driven features presents the following positive characteristics:



Avoids the need for preprocessing, namely trajectory mean shifting and scaling (which can be seen as an over-parametrized way of modeling the non-stationary behavior of individual variables);



Does not require unfolding and complex batch alignment/synchronization – a task that is usually difficult or requires advanced training;



Significantly reduces the dimension of the predictor space when compared to time-resolved alternatives;



Retains the specificity of each variable's profile in the computation of the features;



Can be applied when batches do not have equal lengths;



Fast to implement;



Easily expandable to include more features for each object-profile.

As a disadvantage, the lack of time resolution in the analysis (it is the whole batch or stage that is being analyzed, and not its behavior at a given instant) may lead to some information loss. In particular, fine changes in the local process correlation structure may be difficult to detect if they do not significantly affect the profile-driven features (see also Appendix A2 for a more detailed discussion). Another disadvantage is that the dictionary always has a finite set of features, which could limit its application to more specific processes. However, the dictionary is easily expandable with more profiles and features, or can be combined with other dictionaries, as described in the next section, which mitigates the impact of this limitation. We would also like to point out that, when PCA, PLS and other multivariate methods are applied to the feature space generated by PdF, it is assumed that the features are i.i.d. across batches. This assumption is common to most multivariate methods used in the analysis of batch process data.

2.1.1 Illustrative example of PdF

The workflow for the PdF dictionary consists of: (i) allocating variables to object-profiles ($X_k \xrightarrow{\text{data/user}} p_i$); (ii) running the estimation engine to estimate the parameters of the model structure associated with the profiles ($p_i \xrightarrow{\text{data}} \hat{p}_i$); and finally (iii) running the feature computation engine, from which the set of variable-specific features is obtained ($\hat{p}_i \xrightarrow{\text{data}} \{F_k\}$). Consider, for instance, Figure 4.a, which presents the totalized feed during 12 batches of a drying operation (more details regarding this data set are provided in the case study of Section 3.1). By visual inspection, and adopting the user supervised assignment mode (AM-US), one would classify this variable as a linear object-profile and, following the workflow specified above, proceed to compute its estimated profile. In this simple case, the estimated profile is obtained by computing the slope and the intercept that minimize the sum of squared residuals (Figure 4.b presents an example of the estimated profile for one batch). The specific features characterizing the linear object-profile are the slope, intercept, area, SSE and the variance of SE, as defined in Table 1. These features are computed from the estimated profile and capture the essential characteristics of its evolution.
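The estimation and feature computation steps for a linear object-profile can be sketched as follows. Table 1 is not reproduced in this section, so two definitions below are assumptions: the area is taken as the integral of the estimated profile over the batch (trapezoidal rule, exact for a straight line), and the "variance of SE" is taken as the sample variance of the squared residuals. The function name `linear_profile_features` is hypothetical.

```python
def linear_profile_features(t, y):
    """Fit y ~ intercept + slope * t by least squares and return PdF-style features."""
    n = len(y)
    t_mean, y_mean = sum(t) / n, sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean

    # Estimated profile and squared errors (SE) around it
    fitted = [intercept + slope * ti for ti in t]
    sq_err = [(yi - fi) ** 2 for yi, fi in zip(y, fitted)]
    sse = sum(sq_err)

    # Assumed definition: area under the estimated profile (trapezoidal rule)
    area = sum(0.5 * (fitted[i] + fitted[i + 1]) * (t[i + 1] - t[i])
               for i in range(n - 1))

    # Assumed definition: sample variance of the squared errors
    se_mean = sse / n
    var_se = sum((e - se_mean) ** 2 for e in sq_err) / (n - 1)

    return {"slope": slope, "intercept": intercept, "area": area,
            "sse": sse, "var_se": var_se}
```

Applying this function to each batch of a variable yields one row of features per batch, regardless of whether the batches have equal lengths.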

[Figure 4 plots: Totalized Feed versus Time, panels (a) and (b); legend: Measured data, Estimated Profile]

Figure 4. Example of the application of PdF: (a) the totalized feed is classified as a linear object-profile due to their similar evolution pattern. As an example, the estimated profile for the first batch is presented (b).

2.2 The Feature Oriented Batch Analytics (FOBA) platform

The unifying and integrated platform for feature-oriented analysis of batch processes (FOBA) is presented in Figure 5. As shown in this figure, the first stage corresponds to the generation of features from process data. For this purpose, a dictionary must be selected. In the case depicted, the selected dictionary is the proposed PdF (this is the default option; combining dictionaries is also possible, at the cost of an inflated dimension of the feature space). Following the workflow described for the PdF dictionary (Section 2.1.1), a set of features is obtained ($\hat{p}_i \xrightarrow{\text{data}} \{F_k\}$).

[Figure 5 diagram: FOBA Analytics Platform. Process Data → Feature Generation (library of feature dictionaries: PdF, SPA, Landmark, …) → Features → Feature Oriented Analysis (Visualization, Troubleshooting, Quality Prediction, Off-line Monitoring, …) → Analysis Outcome]

Figure 5. Schematic representation of the FOBA superstructure.

The feature generation stage can be implemented for the entire batch, or for all the stages of the batch in a stage-wise analysis. In the latter case, each stage requires the repetition of the three-step procedure (profile assignment → profile estimation → feature computation), resulting in a collection of features representing the variables' behavior in all the stages. The entire set of features arising from all process variables (at all stages) is analyzed in a second stage (Feature Oriented Analysis). We strongly recommend starting with the Visualization module, where a pre-selection and preliminary analysis of features is made using robust measures of association (Spearman correlation, mutual information) in order to remove noisy and irrelevant features (which in turn improves model interpretability and mitigates the risk of overfitting). A variety of graphical and descriptive analytical tools is applied in this module to get basic insights into the nature of variation along the batches and variables (stratified box-plots, analysis of PCA scores; more information is provided in the next section, where FOBA is applied to several case studies). Then, depending on the specific goal of the analysis, the workflow can proceed to one of the following modules: troubleshooting (root cause analysis), quality prediction and offline (and stage-wise) process monitoring, where specific tools are available to conduct the required analysis.
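As an illustration of the feature pre-selection step, the sketch below screens features by the absolute Spearman correlation with a response (for instance, a quality variable). The threshold value, the choice of response and the function names are assumptions made for the example; the text does not specify the screening rule.

```python
import math

def ranks(values):
    """1-based ranks, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tie group
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # a constant column carries no association
    return cov / math.sqrt(va * vb)

def spearman(a, b):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))

def screen_features(features, response, threshold=0.3):
    """features: dict mapping feature name -> list of values (one per batch).
    Keeps the features whose |Spearman rho| with the response reaches the
    (hypothetical) threshold."""
    return [name for name, column in features.items()
            if abs(spearman(column, response)) >= threshold]
```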

2.3 Methodologic Workflow

A methodologic workflow for batch data analysis is presented in Figure 6. It follows the parsimony (Occam’s razor) principle, i.e., it starts from simple methodologies and moves to more complex approaches only if the performance gains justify the increased complexity. The workflow starts with the problem definition step, where clear objectives are defined and data sets are retrieved from historical databases. In terms of objectives, one should distinguish those that can be recast as a regression problem from those that involve a classification problem, in order to select suitable performance metrics or KPIs (Key Performance Indicators). Classification metrics include the Area Under the receiver-operating Curve45 (AUC), the True Positive Rate (TPR), the False Positive Rate (FPR) and the F-score, among others. The performance in regression problems is typically assessed by the Root Mean Squared Error of Prediction (RMSEP). These measures of performance should be chosen according to the objectives of the problem at hand. In industrial processes, one often wants to constrain the number of false detections in order to avoid unnecessary control actions, whereas a missed detection may not be so critical because the fault usually persists long enough to be discovered by the monitoring system. The next steps in the methodological framework are rather straightforward: if the available knowledge is enough to select an appropriate complexity level (see Figure 2), then the corresponding method is adopted and tested. If process knowledge is insufficient or unavailable, the starting point should be the PdF dictionary. The PdF dictionary is a suitable first choice since it contains a small number of features for each object-profile and is the easiest to implement. Thus, the models tend to be more parsimonious and may lead to similar or better results when compared to more complex approaches (see results section).
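For reference, the metrics named above can be computed as in the following sketch, which uses their standard textbook definitions. It assumes binary labels coded as 1 (faulty/positive) and 0 (normal/negative); the function names are illustrative.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels coded as 1/0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def true_positive_rate(y_true, y_pred):
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)

def false_positive_rate(y_true, y_pred):
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return fp / (fp + tn)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)

def rmsep(y_true, y_pred):
    """Root mean squared error of prediction on a test set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
```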
After choosing the initial method complexity, one moves one step up the complexity scale presented in Figure 2 and evaluates whether the performance metric improves. This sequential procedure should stop once an acceptable performance is obtained or when higher levels of complexity lead to no improvements in the results. In this scenario, one has adopted the least complex model that still achieves the desired accuracy, or the one that maximizes it. Either way, one only makes use of as much complexity as needed to cope with the problem at hand, potentially avoiding time-consuming and highly technical solutions that sometimes do not offer proportional benefits.
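The sequential procedure described above can be sketched as a simple loop. This is one possible reading of the stopping rule, not the authors' implementation: the names `select_method`, `evaluate`, `target` and `min_gain` are hypothetical, and the performance score is assumed to be "higher is better" (for RMSEP one would pass its negative).

```python
def select_method(methods, evaluate, target=None, min_gain=0.0):
    """Walk up a complexity scale until performance is acceptable or stops improving.

    methods  : method names ordered from least to most complex
    evaluate : callable mapping a method name to a performance score
    target   : optional score regarded as acceptable performance
    min_gain : smallest improvement that justifies one more complexity step
    """
    best_name = methods[0]
    best_score = evaluate(best_name)
    for name in methods[1:]:
        if target is not None and best_score >= target:
            break  # acceptable performance already reached
        score = evaluate(name)
        if score - best_score <= min_gain:
            break  # extra complexity brings no worthwhile improvement
        best_name, best_score = name, score
    return best_name, best_score
```

Setting `min_gain` above zero encodes how much improvement is considered worth one additional step in complexity.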

Figure 6. Methodologic workflow for selecting a suitable approach for offline batch data analysis.

3 Case studies

In this section, the FOBA framework coupled with the PdF dictionary is used to conduct a variety of tasks concerning offline batch process data analysis, with the purpose of demonstrating the potential advantages of adopting the proposed methodology and illustrating its application in practice. In particular, the first task concerns data visualization of a real industrial crystallization process, where the operation of two dryers is compared. The second task concerns quality prediction for a widely used simulated fed-batch process for penicillin production (PENSIM46). Lastly, end-of-batch process monitoring is also conducted for the PENSIM simulator, allowing the identification of normal and abnormal batches. The proposed dictionary of features is compared to benchmark approaches such as SPA and the more complex time-resolved two-way approaches (BWU with data aligned using Dynamic Time Warping, DTW). The SPA features considered here are similar to those used for monitoring a semiconductor batch process in Ref. 47 (mean, variance, covariance, skewness and kurtosis); other features such as auto-correlation and lagged cross-correlation were not used. The use of lagged variables could potentially improve performance, but only ad-hoc rules are available35 for selecting the proper lag. Thus, we tested the version of SPA without these features.

3.1 Industrial Data Exploration

Data exploration or visualization is the recommended first task in any offline batch data analysis, where the main goal is to quickly extract useful insights into the nature of process variability, mainly through graphical methods given their strong synergy with human visual and pattern recognition capabilities. This task can be cumbersome and time consuming for batch data, given their intrinsic three-dimensional structure, which may become even more complex due to the presence of multiple stages, misaligned process variables, batches with different lengths, etc. The FOBA framework avoids the alignment step because features are computed directly from the estimated object-profiles ($\hat{p}_i$), providing a representation whose distribution can be used for assessing the batch-to-batch variability (some features also address the intra-batch variability, such as those involving the residuals of the estimated profiles). In this context, a case study is considered here with the aim of identifying differences in the operation of two industrial dryers working in parallel in an industrial crystallization process (examples of the trajectories of the 20 process variables measured during the batch are provided in Figure 7). Existing process knowledge points to possible differences between the dryers, given their production outcomes. This happens even though they share the same batch stages and recipe, which motivates a more detailed analysis of their operation, looking for possible sources of the different behavior. The proposed PdF dictionary, described in Section 2.1, was applied in a stage-wise fashion (i.e., the PdF features were computed for each stage in the crystallization process) and the trajectories of the process variables were assigned manually (AM-US) to one of the object-profiles in the dictionary, as specified in Table 2.

[Figure 7 plots: trajectories of var:1 to var:20, each plotted against Time (s)]

Figure 7. Example of the trajectory of 20 process variables measured during the drying operation.

Table 2. User specified object-profiles for process variables at different batch stages of an industrial crystallization unit.

Variable index | Product feed | Spin | Wash 1 | Wash 2 | Spin Dry
1, 2 | Constant, Constant | - | - | - | -
3, 4 | Constant, Linear | - | - | - | -
5, 6 | Constant, Constant | - | - | - | -
7, 8 | Linear, Constant | - | - | - | -
9, 10 | Constant, Linear | - | - | - | -
11 | Constant | - | - | - | -
12 | Constant | - | - | - | -
13, 14 | - | Linear | Linear, Constant | Linear, - | -
15, 16 | - | Linear | Linear, Linear | Linear, Constant | Constant, Constant
17, 18 | Linear, - | Constant, - | Constant | Constant | -
19 | - | - | - | - | -
20 | Pulse | Constant | Constant | Linear | Linear

Table 2 shows that most variables analyzed were classified as simple object-profiles from the PdF dictionary. This stems from the fact that the majority of process variables had simple dynamic profiles, suggesting that the use of features can effectively capture most of the information regarding their evolution, although time resolution is lost. Thus, although the whole-batch behavior of some variables shown in Figure 7 presents rather intricate profiles that are not similar to the profiles available in the PdF dictionary, the patterns in a given batch stage are much simpler and can be easily classified into one of the available object-profiles. As stated above, the assignment to one of the object-profiles followed the manual assignment mode (AM-US), which was implemented by plotting each variable at each batch stage and then selecting the object-profile most similar to the observed trajectory. Analyzing Table 2, one can also note that not all variables were analyzed at all batch stages, because prior knowledge regarding the dryers’ operation revealed that some variables are not relevant at some stages. The flexibility to discard irrelevant variables at some batch stages is shared by all FOBA approaches: it can be performed in a supervised way (as in the present case) or in an unsupervised way (using feature selection approaches, e.g., based on parametric or non-parametric measures of association). The specification defined in Table 2 might seem cumbersome to conduct manually, since one can potentially analyze 20 variables at 5 batch stages for a total of 100 profiles. However, the assignment task can be carried out rather quickly alongside a first step of data visualization by inspecting the trajectories of the process variables.
After the assignment task, the estimation engine adjusts the parameters of the selected object-profile using a least squares approach and the computation engine generates the features associated with the object-profile, thus completing the PdF workflow. As a practical example, variable #20 from the product feed step was assigned to the pulse object-profile (see Table 1), due to their intrinsic similarity. In order to obtain the estimated trajectory, the product feed step was split into three contiguous and non-overlapping regions with minimum sum of variances. Once the regions were identified, the estimated trajectory consists of the mean value of each region, and these mean values are then used to compute the features associated with the pulse object-profile. Returning to the problem of identifying differences between the dryers’ operation, a data set composed of 12 batches was available for each dryer and the assignment shown in Table 2 resulted in a total of 115 features for each batch. Thus, two matrices are now available, X1 for dryer 1 and X2 for dryer 2, each with dimensions 12×115. In order to find differences in the dryers’ operation, the distributions of the features from both dryers were compared: each column of X1 was compared to the corresponding column of X2 and a t-test (at a 5% significance level) was used to identify features whose distribution was statistically different across dryers. Of the 115 features, 21 were found to be statistically different across dryers, and some of the identified features and process variables are presented in Figure 8.
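The column-by-column comparison can be sketched as below. The text only states that "a t-test" was used, so the pooled-variance (Student) form is an assumption here. To stay free of external libraries, the decision compares |t| with the two-sided 5% critical value for 12 + 12 − 2 = 22 degrees of freedom (approximately 2.074), which is valid only for this particular pair of sample sizes. Function names are illustrative.

```python
import math

# Two-sided 5% critical value of Student's t with 22 degrees of freedom
# (two groups of 12 batches, pooled variance).
T_CRIT_DF22 = 2.074

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (equal variances assumed)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((x - ma) ** 2 for x in a)
    ssb = sum((x - mb) ** 2 for x in b)
    sp2 = (ssa + ssb) / (na + nb - 2)
    if sp2 == 0:
        # No within-group spread: identical means give 0, otherwise +/- infinity
        return 0.0 if ma == mb else math.copysign(math.inf, ma - mb)
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))

def differing_features(X1, X2, t_crit=T_CRIT_DF22):
    """X1, X2: lists of rows (batches x features) for dryer 1 and dryer 2.
    Returns the 0-based indices of columns whose means differ at the chosen level."""
    flagged = []
    for j in range(len(X1[0])):
        col1 = [row[j] for row in X1]
        col2 = [row[j] for row in X2]
        if abs(pooled_t(col1, col2)) > t_crit:
            flagged.append(j)
    return flagged
```

For other sample sizes, the critical value would have to be replaced by the one matching the corresponding degrees of freedom.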

[Figure 8 plots, Dryer 1 versus Dryer 2: (a) Variance of Variable 11; (b) Variable 11 versus Time (s); (c) Mean of Variable 14; (d) Variable 14 versus Time (s); (e) Regression Coefficient of Variable 20; (f) Variable 20 versus Time (s)]

Figure 8. Examples of identified differences between dryer 1 and dryer 2 obtained with the PdF dictionary of features: (a) the variance of variable 11 at the product feed stage and (b) the plot of its evolution; (c) the average value of variable 14 at water wash stage and (d) the plot of its evolution; (e) the regression coefficient for variable 20 at the spin dry stage and (f) the plot of its evolution.

Figure 8 shows that the identified variables have rather distinct trajectories and that the PdF dictionary is able to effectively identify features with different distributions across dryers. Moreover, more than one feature can be identified for the same variable, highlighting multiple differences regarding their evolution. For instance, Figure 8.a shows that the variance of variable 11 at the product feed stage is higher for dryer 2, but another identified feature (not shown in Figure 8) was the average value, which is also higher for dryer 2. The average value and the variance cover independent aspects of the evolution of variable 11, and the interpretation of these features is straightforward since they are directly related to the object-profile assigned to variable 11. Variable 11 is the opening percentage of the feed valve, and the differences spotted here correlate with differences in the quality of the batch. Figure 8.c also shows that the average value of variable 14 is higher at the water wash stage, which is confirmed by looking at the variable's trajectory (Figure 8.d). Variable 14 is the opening percentage of the control valve for the recycling water flow, which suggests that, on average, dryer 1 is using more recycled water than dryer 2. Lastly, Figure 8.e suggests that the slope of variable 20 at the spin dry stage is statistically different across dryers, a fact that is again confirmed by observing the trajectory (Figure 8.f). Variable 20 is the rotational speed of the dryers, and a clear difference is observed between dryers 1 and 2. These clear differences, quickly identified during the data visualization stage based on PdF features and FOBA, constitute useful insights about process variation that can be used by process experts to interpret and raise conjectures about the observed differences in production quality of the two dryers.

3.2 Quality prediction

Quality prediction is another important task in batch data analysis, where the aim is to predict quality parameters based on the trajectories of the process variables. For this task, the PENSIM46,48 simulator was adopted in order to test the potential of the proposed PdF dictionary of features in this application context. The PENSIM simulator has been widely used in the literature as a testing environment for assessing and comparing methodologies for batch analysis. It consists of a detailed model of a fed-batch reactor for penicillin production. The simulator includes typical characteristics found in practice, namely noise and other sources of natural process variability, non-stationarity, and PID controllers to regulate pH and the reactor temperature. The natural parameter for characterizing batch quality is the penicillin concentration at the end of the batch. In order to predict it, three sets of predictors were considered: features obtained from the PdF dictionary with manual assignment of object-profiles (AM-US), SPA features (namely, the mean, variance, covariance, skewness and kurtosis) and time-resolved BWU process data, properly aligned with DTW. The relationship between predictors and penicillin concentration is modelled by PLS, and a total of 70 batches were simulated for analysis. Of the 70 batches, 50 were randomly selected for model training while the remaining 20 batches were used as a test set. Monte Carlo iterations of 5-fold cross-validation were used to determine the optimal number of latent variables (LV), and the median coefficient of determination of cross-validation ($\tilde{R}^2_{cv}$) is presented in Table 3 for the training set. Table 3 also presents the coefficient of determination for predictions on the test set ($R^2_{test}$).

Table 3. Prediction performances obtained with the PdF dictionary and benchmark methods using PLS as the regression method.

Dictionary | $\tilde{R}^2_{cv}$ | $R^2_{test}$
PdF | 0.82 (2 LV) | 0.90
SPA | 0.75 (2 LV) | 0.82
BWU | 0.85 (1 LV) | 0.89

The prediction performances presented in Table 3 point to rather interesting results, showing that the PdF and BWU dictionaries were the top performing approaches under testing conditions, followed by SPA. The best performance obtained with PdF dictionary can be explained in terms of parsimoniously matching model and system complexity since, in this case study, PdF was able to compress the entire batch trajectory in a small number of features that preserved information regarding the penicillin concentration. Although the PLS model with PdF features uses 2 latent variables compared to 1 latent variable with BWU data, the reduction in the dimension of the predictor matrix is considerable: when PdF is adopted, the predictor matrix contains 67 columns/features describing each batch, whereas BWU data has a very wide predictor matrix with 19216 columns (16 variables and 1201 time points). Thus, in terms of the complexity scale (see Figure 2), there is no clear advantage to move to higher complexity methods, since no clear improvement in prediction performance is obtained.
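To make the dimensionality comparison concrete, the sketch below unfolds one aligned batch into a single predictor row and computes a test-set coefficient of determination. The time-major ordering of the unfolded row and the function names are assumptions; the definition used for R² is the usual 1 − SS_res/SS_tot.

```python
def unfold_batchwise(batch):
    """batch: list of time points, each a list of variable values.
    Returns one flat row (time-major order, an assumed convention)."""
    return [value for time_point in batch for value in time_point]

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# One aligned batch with 1201 time points and 16 variables (placeholder zeros)
batch = [[0.0] * 16 for _ in range(1201)]
row = unfold_batchwise(batch)
print(len(row))  # 19216 columns, versus 67 features per batch with PdF
```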

3.3 End-of-Batch Process Monitoring

In this section we illustrate the use of the proposed PdF dictionary for end-of-batch process monitoring. Typical industrial databases contain hundreds or thousands of measurements made during batch operations, and this information can in turn be used for quality assessment and improvement. The exploitation of such data sets provides knowledge regarding normal operating conditions, characterizing the typical measurement levels and correlation structures. Furthermore, abnormal batches can be identified and the possible source of faults pinpointed. For this task, the PENSIM46,48 simulator is again considered, with the aim of identifying and characterizing abnormal batches. The set of 70 NOC batches described in Section 3.2 was used to build a PCA model, and an additional 80 faulty batches were simulated and monitored. In the simulation of the faulty batches, 20 batches are affected by a drop in the aeration rate, 20 are affected by a change in the feed temperature, 20 contain a drift in the reactor temperature sensor and the remaining 20 batches are contaminated. The number of principal components was chosen by analyzing the explained variability: 3 principal components were selected, explaining 72% of the features’ total variability. Note that, similarly to the quality prediction case study (Section 3.2), the user specified assignment mode (AM-US) was used to map each variable to one of the object-profiles in Table 1. The NOC and faulty batches were monitored using the T² and Q statistics and the results are presented in Figure 9.a and Figure 9.b, where it can be seen that the PdF dictionary is able to effectively identify abnormal batches, since the T² and/or Q statistics are clearly above their theoretical 95% upper control limits for the faulty batches. As benchmarks, Figure 9.c and Figure 9.d present the monitoring results obtained with the SPA dictionary (using as features the mean, variance, covariance, skewness and kurtosis), while Figure 9.e and Figure 9.f present the results obtained with BWU data aligned with DTW. Although both benchmark methods are also able to detect most faults, a few of the monitoring statistics for fault 4 (batch indexes 131-150) are roughly at the level of the NOC data, which means that, in a real monitoring scenario, these faults would be difficult to detect with these benchmark methods. BWU uses a very wide matrix with a high potential for overfitting due to the high number of model parameters, and this is probably deteriorating its ability to detect fault 4.
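For reference, the two monitoring statistics can be written in their standard PCA form. The notation is an assumption introduced here, since the section does not define symbols: $\mathbf{x}$ is the preprocessed feature vector of one batch, $\mathbf{P}$ is the loading matrix with $A$ retained components ($A = 3$ in this case study), and $\lambda_a$ is the variance of the $a$-th score in the NOC data.

```latex
\mathbf{t} = \mathbf{P}^{\top}\mathbf{x}, \qquad
T^2 = \sum_{a=1}^{A} \frac{t_a^2}{\lambda_a}, \qquad
\mathbf{e} = \mathbf{x} - \mathbf{P}\mathbf{t}, \qquad
Q = \mathbf{e}^{\top}\mathbf{e} = \sum_{j} e_j^2
```

$T^2$ measures variation inside the PCA subspace, while $Q$ measures the squared distance of a batch from that subspace, which is why a fault may be flagged by either statistic or by both.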
When comparing the ability of PdF and SPA to detect fault 4, one can observe a small advantage obtained by adopting the PdF dictionary, which results from the use of specific and targeted information about the time-varying profiles, which captures the relevant dynamic patterns in the evolution of process variables in a more effective way. This example shows that methods belonging to the low spectrum of the complexity scale (PdF and SPA) are suitable for conducting off-line process monitoring and can have superior performance when compared to more complex approaches. The methodological differences between PdF and SPA suggest that their optimality is dependent on the case study under analysis, which is typical of data-driven approaches. For instance, when monitoring a semiconductor batch process,47 SPA conducted to better results compared to a k-nearest neighbor classifier and BWU, because the variability around the constant profiles of these process variables can be very well described by the features of this method. A similar result is expected for PdF using a constant profile. The theoretical justification for this is the following: apart from the 26 ACS Paragon Plus Environment

Page 26 of 37

Page 27 of 37

mean, SPA uses mean-corrected statistics, such as the variance, covariance, skewness and kurtosis. Therefore, the results of SPA should match those of PdF with a “constant” profile assignment, as long as the same statistics are used for characterizing the residuals around the level of this constant profile (which is also estimated by the mean). However, by default PdF proposes the use of a reduced set of statistics for characterizing the residuals, in order to avoid generating a large number of features, whereas SPA also makes use of higher-order moments (e.g., skewness and kurtosis), which can be advantageous in certain applications, such as in the semiconductor industry.
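The argument above can be written out as a short worked equation. The notation is an assumption introduced here for illustration (it is not taken from the paper): x_k, k = 1, ..., K, are the samples of one variable within one batch, and the normalization of the moments is one common convention among several.

```latex
% Assumed notation: x_k are the K samples of one variable in one batch.
\begin{align}
  \hat{\mu} &= \bar{x} = \frac{1}{K}\sum_{k=1}^{K} x_k,
  & e_k &= x_k - \bar{x}, \\
  s^2 &= \frac{1}{K-1}\sum_{k=1}^{K} e_k^2,
  & g_1 &= \frac{\tfrac{1}{K}\sum_{k} e_k^3}{\bigl(\tfrac{1}{K}\sum_{k} e_k^2\bigr)^{3/2}},
  \qquad
  g_2 = \frac{\tfrac{1}{K}\sum_{k} e_k^4}{\bigl(\tfrac{1}{K}\sum_{k} e_k^2\bigr)^{2}}
\end{align}
```

Under a constant profile, the estimated level is the mean and the residuals e_k are exactly the mean-corrected samples, so any statistic computed from the residuals (variance s^2, skewness g_1, kurtosis g_2) coincides with the corresponding mean-corrected SPA statistic.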

[Figure 9: six panels, (a) to (f), plotting the T2 and Q statistics against the batch index for normal and faulty batches]

Figure 9. Monitoring statistics for NOC and abnormal simulated batches: (a) T2 and (b) Q statistics for PCA model with the PdF dictionary; (c) T2 and (d) Q statistics for PCA model with the SPA dictionary of features; (e) T2 and (f) Q statistics for PCA model with the BWU data aligned with DTW. Batch indexes 1-70 correspond to NOC data, 71-90 correspond to fault 1, 91-110 correspond to fault 2, 111-130 correspond to fault 3 and 131-150 represent fault 4. The values of the Q statistic for some batches are very high and are not presented, for ease of visualization. The dashed line is the 95% upper control limit.

Besides fault detection, another critical aspect of end-of-batch process monitoring is the ability to identify the source of the fault so that process engineers may obtain insights into its root causes. In order to illustrate the ability of the PdF dictionary to identify the faults’ root causes, Figure 10 presents the average contributions of each feature to the Q statistic for the first 2 faults. The average is taken over all faulty batches affected by the same fault. As can be seen in Figure 10, the Q statistic is very specific since only 2 features are highlighted for each fault. In fault 1 (Figure 10.a), the process was disturbed by a drop in the aeration rate and upon inspection, features 50 and 51 correspond, respectively, to the mean and variance of the aeration rate. In fault 2 (Figure 10.b), the process was perturbed by a change in the feed temperature and features 47 and 48 correspond, respectively, to the mean and variance of the temperature of the feed flow. Similar results were obtained for faults 3 and 4 but are not presented here. The specificity obtained with the PdF dictionary is a useful characteristic since it clearly pinpoints the source of the fault, avoiding ambiguity in fault identification.
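As a minimal sketch of the contribution analysis described above, the contribution of each feature to the Q statistic can be taken as its squared PCA residual, averaged over the batches that share a fault. This is one common definition and is assumed here, since the text does not spell out the formula. The loadings, the data and the fault below are hypothetical toy values.

```python
import numpy as np


def q_contributions(Z, P):
    """Contribution of each feature to Q for each batch: squared PCA residuals."""
    E = Z - (Z @ P) @ P.T
    return E ** 2


def mean_q_contributions(Z_fault, P):
    """Average contribution of each feature over batches sharing the same fault."""
    return q_contributions(Z_fault, P).mean(axis=0)


# Hypothetical orthonormal loadings for 8 scaled features and 2 components.
P = np.zeros((8, 2))
P[:4, 0] = 0.5
P[4:, 1] = 0.5

rng = np.random.default_rng(0)
Z_fault = 0.1 * rng.normal(size=(20, 8))   # 20 scaled faulty batches (toy data)
Z_fault[:, 5] += 3.0                       # hypothetical fault: shift in feature index 5

contrib = mean_q_contributions(Z_fault, P)
print(np.argmax(contrib))                  # index of the most implicated feature
```

In this toy setup the largest average contribution falls on the shifted feature, while the features that load on the same component receive smaller, non-zero contributions.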

[Figure 10: two panels, (a) and (b); vertical axis: mean contributions to the Q statistic]

Figure 10. The average contribution of each feature to the Q statistic for: (a) fault 1 and (b) fault 2.

The results obtained with FOBA and, in particular, with the PdF and SPA dictionaries show that even a model with a small number of parameters can be suitable for fault detection and identification: the matrix obtained with the PdF dictionary contains 67 features for each batch, while SPA uses 153 features. This low number of features contrasts with the large number obtained with a BWU procedure (19216 columns, see Section 3.2), where the potential for overfitting is very high because all auto- and cross-correlations are modelled. The development of PdF- and SPA-based models required little user input, effort and technical specialization, since the features are easily computed from raw data and the complex synchronization step can be avoided, which in practice means a lower barrier to model development. Thus, as long as the results obtained fit the purpose of the analysis, one does not necessarily need to move to higher-complexity methods, as the gains in doing so are not guaranteed a priori. Therefore, it is our opinion that more complex methods should not be considered as default approaches; their use should be justified by the improvements obtained over the simpler ones. In practice, process knowledge may be used to identify a suitable approach, and a high-complexity method may be the correct starting point if the process is known to contain many complex auto- and cross-correlation features. However, for offline quality improvement activities, a more complex model may be difficult to estimate and interpret and, although simpler models may be limited in capturing all relevant process characteristics, their simplicity can be an advantage for extracting the fundamental trends and regularities in the data set.
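The difference in matrix width discussed above can be made concrete with a small sketch. The array sizes are hypothetical (they are not those of the case study), the batches are assumed to have equal length (i.e., to be already synchronized), and the two summary statistics per variable merely stand in for a feature dictionary.

```python
import numpy as np

n_batches, n_times, n_vars = 150, 200, 5      # hypothetical sizes
rng = np.random.default_rng(0)
data = rng.normal(size=(n_batches, n_times, n_vars))   # batch x time x variable

# Batch-wise unfolding: one row per batch, one column per (time, variable) pair.
X_bwu = data.reshape(n_batches, n_times * n_vars)

# Feature-oriented alternative: a few summary statistics per variable.
X_feat = np.hstack([data.mean(axis=1), data.var(axis=1, ddof=1)])

print(X_bwu.shape, X_feat.shape)
```

With these sizes, the unfolded matrix has 1000 columns while the feature matrix has 10, for the same 150 batches.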

4 Conclusions

In this work, a framework for feature-oriented batch analytics (FOBA) is presented, unifying available approaches for feature generation. The proposed framework integrates feature generation and feature analysis for batch data in the context of continuous improvement activities. We also presented a new dictionary of features, known as Profile-driven Features (PdF). The proposed PdF dictionary lies in between the current feature-oriented approaches (e.g. SPA, TIME) and the time-resolved approaches (2-way BWU, Tucker 3, PARAFAC, etc.), in the sense that it lacks time resolution (like the current feature-oriented approaches) but retains the specificity of the profiles through the use of profile-specific features (as in the time-resolved approaches, but not in the current feature-oriented approaches, where the same features are typically used for all profiles). One limitation of the proposed PdF dictionary and other feature-oriented approaches is that they can only be applied at the end of a batch phase or at the end of a batch run. Nevertheless, quality improvement activities are mostly conducted offline and therefore PdF is a suitable first choice, due to the simplicity of its implementation and the low estimation complexity of the model. Lastly, a methodological workflow is presented for conducting offline batch data analysis, starting with simpler methods and only increasing the modelling complexity if significant improvements in performance are obtained, which should be monitored and assessed step-by-step. The PdF dictionary was applied to three different and relevant activities in batch data analysis: data visualization, quality prediction and end-of-batch process monitoring. In the first case study, the use of the PdF dictionary allowed for the identification of differences in the operation of two dryers in an industrial crystallization process, by pinpointing differences in the evolution of process variables.
The second case study concerned the prediction of penicillin concentration for the PENSIM simulator and the PdF dictionary, with the smallest number of model parameters, obtained the lowest prediction errors under testing conditions, followed by batch-wise unfolding with alignment and SPA. In the third case study, the PENSIM simulator was again adopted and the ability of the PdF dictionary to detect abnormal simulated batches was assessed. Monitoring statistics for a PCA model showed that all faulty batches were indeed identified since the monitoring statistics were well above their upper control limits.


Furthermore, contribution plots allowed the correct identification of the faults’ root causes. As future work, we intend to:

• Extend the application of the PdF dictionary to more case studies, representing different analysis scopes: visualization, troubleshooting, quality prediction and end-of-batch process monitoring.

• Develop another dictionary of features, based on the promising results obtained in wavelet-based multiscale decomposition of the variables’ batch profiles, namely with TIME-PCA (Translation Invariant Multiscale Energy-Based Principal Components Analysis).37

• Apply this methodology to the analysis of profiles other than those from batch data, following the initiatives found in other works for which FOBA is a valuable alternative.49-53

• Develop feature selection methods for process monitoring and quality prediction. It is known that using irrelevant features for model building needlessly increases model complexity and the risk of overfitting. Thus, using appropriate features is of key importance in order to obtain improved performance on both tasks.

Appendix A

A.1 A closer look at PdF features

In this section we provide more information about how the features generated according to the PdF scheme differ from those of other feature-oriented approaches, namely SPA. As mentioned before, the PdF dictionary is able to integrate the non-stationary nature of the profiles and their specificity in the features that are computed to summarize them. SPA features, on the other hand, are based on statistics derived to characterize stationary random processes and do not, in general, have the same capabilities, except in the particular case of constant profiles, where they are indeed very suitable for describing the variation around the constant level. To illustrate this aspect, Figure A1 presents several artificially generated curves representing rather distinct patterns. Despite their differences, most SPA features (e.g. mean, variance, skewness and kurtosis) remain almost unchanged, as shown in Table A1. PdF features, in contrast, capture the changes between the different patterns, because the non-stationary behavior of the profiles is considered and incorporated in the computation of these quantities.

[Figure A1: trajectories of process variable #1 against time (0 to 30) for profiles 1 to 4]

Figure A1. Simulated trajectories of a process variable for different batches.

Table A1. Features computed with PdF (linear profile) and SPA for the profiles presented in Figure A1.

           PdF (linear profile)                                         SPA
Profile    Slope    Intercept    SSR     Residual variance    Area     Mean     Variance    Kurtosis    Skewness
1           0.14    -1.97        0.58    0.02                 5.06     0.174    1.45        2.12        0.11
2           0.12    -1.69        9.52    0.31                 5.09     0.176    1.45        2.11        0.14
3           0.05     0.10        41.4    1.38                 5.08     0.175    1.43        2.10        0.10
4          -0.14     2.28        0.49    0.02                 4.96     0.171    1.45        2.07        0.08
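The contrast summarized in Table A1 can be reproduced in spirit with a minimal sketch that computes linear-profile features and SPA-type moments for a rising and a falling ramp. The function names, the choice of divisor for the residual variance, the moment normalization and the two synthetic profiles are assumptions made for illustration. They are not the profiles of Figure A1 and will not reproduce the values in Table A1.

```python
import numpy as np


def pdf_linear_features(t, x):
    """Features for a variable assigned to a linear profile (illustrative set)."""
    slope, intercept = np.polyfit(t, x, 1)            # least-squares line
    resid = x - (slope * t + intercept)
    ssr = float(np.sum(resid ** 2))                   # sum of squared residuals
    area = float(np.sum((x[1:] + x[:-1]) * np.diff(t)) / 2.0)   # trapezoidal rule
    return {
        "slope": float(slope),
        "intercept": float(intercept),
        "ssr": ssr,
        "residual_variance": ssr / len(x),            # assumption: divide by n
        "area": area,
    }


def spa_features(x):
    """Mean and mean-corrected moments, ignoring the time ordering."""
    c = x - x.mean()
    var = np.mean(c ** 2)
    return {
        "mean": float(x.mean()),
        "variance": float(var),
        "skewness": float(np.mean(c ** 3) / var ** 1.5),
        "kurtosis": float(np.mean(c ** 4) / var ** 2),
    }


t = np.arange(1.0, 31.0)
rising = np.linspace(-2.0, 2.0, 30)
falling = rising[::-1]

for name, x in [("rising", rising), ("falling", falling)]:
    print(name, pdf_linear_features(t, x), spa_features(x))
```

Because the two ramps contain the same set of values in a different time order, their SPA-type moments coincide, whereas the fitted slopes have opposite signs.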

A.2 Modelling the process correlation structure using feature oriented approaches

Modelling the correlation between process variables is one of the most important factors contributing to the success of multivariate methods. Therefore, a more detailed discussion of the ability of PdF to capture the variables’ correlations is in order. PdF summarizes the essential characteristics of the variables’ profiles in a reduced set of features. Thus, if two variables present mutual correlations in their original representation, this correlation should be passed on to at least some of their features. For instance, let us consider the case of constant profiles for simplification (which are also very common in the semiconductor industry). If two variables are correlated, it means that their levels are correlated. But as their levels are well estimated by the mean of the profile (one of the features of PdF or SPA), either PdF or SPA will be able to capture very well the native correlation between these two variables (in this case without any significant loss of information). This illustrates that the univariate nature of the features does not limit or preclude, a priori, the success of correlation-based approaches implemented over the features. However, if variables are only correlated at a finer resolution (e.g. at the observation level), it may not be possible to capture this correlation with PdF or SPA, since some detailed information is lost in the computation of the features. The tradeoff between information loss and complexity (related to the number of features) is inherent to any method that attempts to obtain a compact and informative representation of a dataset in terms of features, instead of using all the raw measurements. It may be opportune to use finer correlation information in order to detect some very particular faults, but the PdF assumption is that the effects of the fault will also be reflected in the correlation of the features that characterize the process variables. If this assumption is not met, using more complex methods is recommended, but at the expense of a higher risk of overfitting.
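The constant-profile argument above can be checked numerically with a minimal sketch. Two variables with correlated batch levels are simulated, each batch profile is summarized by its mean, and the correlation between the resulting features is compared with the correlation between the true levels. All names, sizes and noise levels are hypothetical.

```python
import numpy as np


def level_correlations(n_batches=200, n_times=100, seed=1):
    """Compare the correlation of true levels with that of the mean features."""
    rng = np.random.default_rng(seed)
    # Batch-to-batch levels of two variables, correlated by construction.
    level_a = rng.normal(10.0, 1.0, n_batches)
    level_b = 2.0 * level_a + rng.normal(0.0, 0.5, n_batches)
    # Constant profiles: level plus independent within-batch noise.
    a = level_a[:, None] + rng.normal(0.0, 0.3, (n_batches, n_times))
    b = level_b[:, None] + rng.normal(0.0, 0.3, (n_batches, n_times))
    r_levels = np.corrcoef(level_a, level_b)[0, 1]
    # One feature per variable and batch: the mean of the profile.
    r_features = np.corrcoef(a.mean(axis=1), b.mean(axis=1))[0, 1]
    return float(r_levels), float(r_features)


r_levels, r_features = level_correlations()
print(r_levels, r_features)
```

In this setting the correlation between the mean features stays close to the correlation between the underlying levels, since averaging over the batch suppresses the within-batch noise.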

Acknowledgements

Ricardo Rendall acknowledges the Portuguese Foundation for Science and Technology (FCT) for financial support through PhD grant with reference SFRH/BD/123774/2016. Marco Reis acknowledges financial support through project 016658 (references PTDC/QEQ-EPS/1323/2014, POCI-01-0145-FEDER-016658) financed by Project 3599-PPCDT (Promover a Produção Científica e Desenvolvimento Tecnológico e a Constituição de Redes Temáticas) and co-financed by the European Union’s FEDER.


References

(1) Reis, M. S. Data-Driven Multiscale Monitoring, Modelling and Improvement of Chemical Processes. PhD Thesis, University of Coimbra, Coimbra, 2006.
(2) Reis, M. S.; Saraiva, P. M., Heteroscedastic Latent Variable Modelling with Applications to Multivariate Statistical Process Control. Chemom. Intell. Lab. Syst. 2006, 80, 57-66.
(3) Reis, M. S.; Saraiva, P. M., Multiscale Statistical Process Control with Multiresolution Data. AIChE J. 2006, 52, (6), 2107-2119.
(4) Reis, M. S.; Saraiva, P. M., Multivariate and Multiscale Data Analysis. In Statistical Practice in Business and Industry, Coleman, S.; Greenfield, T.; Stewardson, D.; Montgomery, D. C., Eds. Wiley: Chichester, 2008; pp 337-370.
(5) Ge, Z.; Song, Z.; Gao, F., Review of Recent Research on Data-Based Process Monitoring. Ind. Eng. Chem. Res. 2013, 52, 3543-3562.
(6) Jackson, J. E., A User's Guide to Principal Components. Wiley: New York, 1991.
(7) Jolliffe, I. T., Principal Component Analysis. 2nd ed.; Springer: New York, 2002.
(8) Martens, H.; Naes, T., Multivariate Calibration. Wiley: Chichester, 1989.
(9) Wold, S.; Sjöström, M.; Eriksson, L., PLS-Regression: A Basic Tool of Chemometrics. Chemom. Intell. Lab. Syst. 2001, 58, 109-130.
(10) Nomikos, P.; MacGregor, J. F., Monitoring Batch Processes Using Multiway Principal Component Analysis. AIChE J. 1994, 40, (8), 1361-1375.
(11) Nomikos, P.; MacGregor, J. F., Multivariate SPC Charts for Monitoring Batch Processes. Technometrics 1995, 37, (1), 41-59.
(12) Kassidas, A.; MacGregor, J. F.; Taylor, P. A., Synchronization of batch trajectories using dynamic time warping. AIChE J. 1998, 44, 864-875.
(13) Gins, G.; Van den Kerkhof, P.; Van Impe, J. F. M., Hybrid Derivative Dynamic Time Warping for Online Industrial Batch-End Quality Estimation. Ind. Eng. Chem. Res. 2012, 51, (17), 6071-6084.
(14) González-Martínez, J. M.; Ferrer, A.; Westerhuis, J. A., Real-time synchronization of batch trajectories for on-line multivariate statistical process control using Dynamic Time Warping. Chemom. Intell. Lab. Syst. 2011, 105, (2), 195-206.
(15) Fransson, M.; Folestad, S., Real-time alignment of batch process data using COW for on-line process monitoring. Chemom. Intell. Lab. Syst. 2006, 84, (1-2), 56-61.
(16) González-Martínez, J. M.; Vitale, R.; De Noord, O. E.; Ferrer, A., Effect of Synchronization on Bilinear Batch Process Modeling. Ind. Eng. Chem. Res. 2014, 53, 4339-4351.
(17) Nomikos, P.; MacGregor, J. F., Multivariate SPC Charts for Monitoring Batch Processes. Technometrics 1995, 37, (1), 41-59.
(18) Ramaker, H.-J.; van Sprang, E. N. M.; Westerhuis, J. A.; Smilde, A. K., Fault detection properties of global, local and time evolving models for batch process monitoring. J. Process Control 2005, 15, (7), 799-805.
(19) Wold, S.; Kettaneh, N.; Friden, H.; Holmberg, A., Modelling and diagnostics of batch processes and analogous kinetic experiments. Chemom. Intell. Lab. Syst. 1998, 44, 331-340.
(20) Camacho, J.; Picó, J.; Ferrer, A., Bilinear modelling of batch processes. Part II: a comparison of PLS soft-sensors. Journal of Chemometrics 2008, 22, (10), 533-547.
(21) Van den Kerkhof, P.; Vanlaer, J.; Gins, G.; Van Impe, J. F. M., Dynamic model-based fault diagnosis for (bio)chemical batch processes. Computers & Chemical Engineering 2012, 40, 12-21.
(22) Chen, J.; Liu, K.-C., On-line batch process monitoring using dynamic PCA and dynamic PLS models. Chemical Engineering Science 2002, 57, (1), 63-75.
(23) Choi, S. W.; Morris, J.; Lee, I.-B., Dynamic model-based batch process monitoring. Chemical Engineering Science 2008, 63, (3), 622-636.
(24) Camacho, J.; Picó, J.; Ferrer, A., Multi-phase analysis framework for handling batch process data. Journal of Chemometrics 2008, 22, (11-12), 632-643.
(25) Smilde, A.; Bro, R.; Geladi, P., Multi-way Analysis: Applications in the Chemical Sciences. Wiley: Chichester, UK, 2004.
(26) Louwerse, D.; Smilde, A., Multivariate statistical process control of batch processes based on three-way models. Chemical Engineering Science 2000, 55, (7), 1225-1235.
(27) Lee, J.-M.; Yoo, C.; Choi, S. W.; Vanrolleghem, P. A.; Lee, I.-B., Nonlinear process monitoring using kernel principal component analysis. Chemical Engineering Science 2004, 59, (1), 223-234.
(28) Tian, X. M.; Zhang, X. L.; Deng, X. G.; Chen, S., Multiway kernel independent component analysis based on feature samples for batch process monitoring. Neurocomputing 2009, 72, 1584-1596.
(29) Zhang, Y., Fault detection and diagnosis of nonlinear processes using improved kernel independent component analysis (KICA) and support vector machine (SVM). Industrial & Engineering Chemistry Research 2008, 47, (18), 6961-6971.
(30) Lee, J. M.; Qin, S. J.; Lee, I. B., Fault detection of non-linear processes using kernel independent component analysis. The Canadian Journal of Chemical Engineering 2007, 85, (4), 526-536.
(31) Zhao, C. H.; Gao, F. R.; Wang, F. L., Nonlinear batch process monitoring using phase-based kernel independent component analysis-principal component analysis. Industrial & Engineering Chemistry Research 2009, 48, 9163-9174.
(32) Vitale, R.; Noord, O.; Ferrer, A., A kernel-based approach for fault diagnosis in batch processes. Journal of Chemometrics 2014, 28, (8), S697-S707.
(33) Pal, S. K.; Mitra, M., Pattern Recognition Algorithms for Data Mining. Chapman & Hall/CRC: Boca Raton, 2004.
(34) Wold, S.; Kettaneh-Wold, N.; MacGregor, J. F.; Dunn, K. G., Batch Process Modeling and MSPC. In Comprehensive Chemometrics, Elsevier: Oxford, 2009; pp 163-197.
(35) Wang, J.; He, Q. P., Multivariate statistical process monitoring based on statistics pattern analysis. Ind. Eng. Chem. Res. 2010, 49, 7858-7869.
(36) He, Q. P.; Wang, J., Statistics Pattern Analysis: A New Process Monitoring Framework and its Application to Semiconductor Batch Processes. AIChE J. 2011, 57, (1), 107-121.
(37) Rato, T. J.; Blue, J.; Pinaton, J.; Reis, M. S., Translation Invariant Multiscale Energy-based PCA (TIME-PCA) for Monitoring Batch Processes in Semiconductor Manufacturing. IEEE Transactions on Automation Science and Engineering 2016, In Press (DOI: 10.1109/TASE.2016.2545744).
(38) Louwerse, D. J.; Smilde, A. K., Multivariate statistical process control of batch processes based on three-way models. Chem. Eng. Sci. 2000, 55, (7), 1225-1235.
(39) Yoo, C. K.; Lee, J.-M.; Vanrolleghem, P. A.; Lee, I.-B., On-line monitoring of batch processes using multiway independent component analysis. Chemom. Intell. Lab. Syst. 2004, 71, (2), 151-163.
(40) Undey, C.; Çinar, A., Statistical monitoring of multistage, multiphase batch processes. IEEE Control Systems 2002, 22, (5), 40-52.
(41) Ge, Z.; Zhao, L.; Yao, Y.; Song, Z.; Gao, F., Utilizing transition information in online quality prediction of multiphase batch processes. J. Process Control 2012, 22, 599-611.
(42) van der Voet, H., Pseudo-Degrees of Freedom for Complex Predictive Models: The Example of Partial Least Squares. J. Chemom. 1999, 13, 195-208.
(43) Seasholtz, M. B.; Kowalski, B., The parsimony principle applied to multivariate calibration. Analytica Chimica Acta 1993, 277, (2), 165-177.
(44) Levenspiel, O., Modeling in chemical engineering. Chem. Eng. Sci. 2002, 57, 4691-4696.
(45) Tiago, R.; Rendall, R.; Gomes, V.; Chin, S.-T.; Leo, H. C.; Saraiva, P.; Reis, M., Batch Process Monitoring Methods: Part I - Assessing detection strength. In Industrial & Engineering Chemistry Research, 2015.
(46) Birol, G.; Ündey, C.; Cinar, A., A modular simulation package for fed-batch fermentation: penicillin production. Computers & Chemical Engineering 2002, 26, (11), 1553-1565.
(47) He, Q. P.; Wang, J., Statistics pattern analysis: A new process monitoring framework and its application to semiconductor batch processes. AIChE Journal 2011, 57, (1), 107-121.
(48) Van Impe, J.; Gins, G., An extensive reference dataset for fault detection and identification in batch processes. Chemometrics and Intelligent Laboratory Systems 2015, 148, 20-31.
(49) Kim, K.; Mahmoud, M. A.; Woodall, W. H., On the Monitoring of Linear Profiles. J. Qual. Technol. 2003, 35, (3), 317-328.
(50) Reis, M. S.; Saraiva, P. M. In A Multiscale Approach for the Monitoring of Paper Surface Profiles, 5th Annual Meeting of ENBIS, Newcastle (UK), 2005.
(51) Woodall, W. H.; Spitzner, D. J.; Montgomery, D. C.; Gupta, S., Using Control Charts to Monitor Process and Product Quality Profiles. J. Qual. Technol. 2004, 36, (3), 309-320.
(52) Reis, M. S.; Bauer, A., Wavelet texture analysis of on-line acquired images for paper formation assessment and monitoring. Chemom. Intell. Lab. Syst. 2009, 95, (2), 129-137.
(53) Jin, J.; Shi, J., Feature-Preserving Data Compression of Stamping Tonnage Information Using Wavelets. Technometrics 1999, 41, (4), 327-339.
