A Systematic Methodology for Comparing Batch Process Monitoring


Process Systems Engineering

A Systematic Methodology for Comparing Batch Process Monitoring Methods: Part II – Assessing Detection Speed Tiago J Rato, Ricardo R. Rendall, Véronique Medeiros Gomes, Pedro M Saraiva, and Marco Seabra Reis Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/acs.iecr.7b04911 • Publication Date (Web): 27 Mar 2018 Downloaded from http://pubs.acs.org on April 1, 2018


Industrial & Engineering Chemistry Research

A Systematic Methodology for Comparing Batch Process Monitoring Methods: Part II – Assessing Detection Speed

Tiago J. Rato (1), Ricardo Rendall (1), Veronique Gomes (2), Pedro M. Saraiva (1) and Marco S. Reis (1,*)
(1) CIEPQPF, Department of Chemical Engineering, University of Coimbra, Rua Sílvio Lima, 3030-790, Coimbra, Portugal
(2) CITAB-Centre for the Research and Technology of Agro-Environmental and Biological Sciences, University of Trás-os-Montes e Alto Douro, Vila Real, Portugal

*Corresponding author: e-mail: marco@eq.uc.pt, phone: +351 239 798 700, FAX: +351 239 798 703 Note: Tiago Rato and Ricardo Rendall have contributed equally to this work.


Abstract

Since the first batch process monitoring approaches were published approximately 20 years ago, a significant number of extensions and new contributions have been proposed. They enrich the toolkit of solutions available to practitioners, who today face the daunting task of finding the best tool among them and of defining and tuning all the associated configuration options. This task could be greatly facilitated by sound comparison studies with rigorous and unambiguous metrics and language. However, comparative studies are rare, and some of them present limitations or even flaws that make their findings hard to generalize or, in extreme situations, fundamentally wrong. There is also a lack of agreement and consistency in the way batch process monitoring methods are assessed, which makes comparing the approaches available in the literature very difficult or even impossible. Therefore, in this two-part series of articles, we address in detail the different comparison perspectives in use, critically assess their merits and limitations, and point out some of the flaws found in comparison studies of batch process monitoring approaches. The first part was dedicated to “detection strength” (the ability to correctly detect abnormal situations without incurring excessive false alarms). The present article addresses the complementary dimension of “detection speed” (the ability to rapidly signal an abnormality after it occurs). The comparison and assessment methodology (CAM) for on-line batch process monitoring approaches proposed in Part I is now extended with figures of merit for assessing detection speed, and a well-defined workflow is established for analyzing this dimension. The proposed figure of merit is based on the Conditional Expected Delay (CED), the Probability of False Alarms (PFA), and their relationship. These quantities are largely overlooked by the research community, despite their rigor and adequacy for characterizing detection speed, especially for batch processes. The framework can be used either in the retrospective evaluation of existing methods (as illustrated in this article) or in the analysis of new contributions, where the true added value should be demonstrated in a rigorous, unambiguous and as extensive as possible way. As in Part I, the major contribution of this work is the CAM framework itself, and not the generalization of the conclusions drawn for the particular case studies analyzed here. For illustration purposes, this framework was applied to the comparative analysis of 60 different types of methods and their variants, in 7 situations.


The results obtained were analyzed at different aggregation levels. For example, they showed that from the standpoint of detection speed and for the scenarios tested, the dynamic and 2-way methods tend to present the highest detection speeds. The 2-way methods showed better results when coupled with missing data infilling and control limits for the Q statistic based on one observation (WS1). The best synchronization approach was found to be highly case dependent.

Keywords: Process Monitoring; Batch Processes; Non-stationary processes; Figures of Merit; Conditional Expected Delay; Probability of False Alarms.


1 Introduction

Batch processes play a major role in today’s industrial processes. From chemicals and semiconductors to the pharmaceutical and agrofood sectors, production often takes place in a series of stages that repeat cyclically: loading of materials and initiators, adjusting operational conditions following a product-specific recipe, unloading products and cleaning. Compared to continuous processes, this operation mode brings flexibility in producing final products with customized characteristics and allows diverse production scales. Rather paradoxically, the same feature that brings competitive advantage is also the main source of challenges: flexibility means that there are many opportunities to interfere with the process and drive it away from the desired course (initial quantities, phase durations, characteristics of temperature/pressure time-profiles, etc.). For this reason, there is a permanent interest in industry in the development of better process monitoring approaches, and a rich variety of batch process monitoring approaches has been developed, including 2-way methodologies based on unfolding the data cube (or 3rd-order tensor) in a batch-wise [1, 2] or observation-wise fashion [3, 4]; 3-way methods [5, 6]; dynamic methods [7, 8]; and the more recent feature-oriented approaches [9-11] (see also Part I for more examples [12]).

The central aim of industrial process monitoring is to detect a special event as fast as possible, so that the proper remedial actions can be taken in useful time, minimizing product loss while securing safe operation and product quality. In this context, detection speed is of primary interest when comparing alternatives for batch process monitoring. This is the main topic of this article, the second and last of the series dedicated to establishing robust and effective metrics and practices for the sound comparison of batch process monitoring methods.

As mentioned before, an increasing number of methods has been proposed for monitoring batch operations [10, 13]. Each of them adopts a certain methodology for validating and comparing its performance against some reduced subset of the existing approaches, which are used as benchmarks. In Part I [12], we reviewed the main classes of approaches adopted for establishing such comparisons and pointed out their pros and cons, as well as some of the flaws committed in this type of study. These classes are: graphical evaluation; conformance analysis; detection strength; and detection speed. The first two classes can be used to complement the introduction and characterization of the process monitoring schemes, but can hardly support a rigorous and complete comparative assessment. Detection strength, i.e., the ability to correctly detect abnormal situations without incurring excessive false alarms, is an often-used approach and was extensively covered in Part I, where a comparison framework was proposed whose Key Performance Indicators (KPI) are based on a robust measure of detection strength: the area under the ROC curve (AUC; for more details see Part I [12]). In Part II, we complete the series by proposing the analogous robust comparison methodology for the complementary assessment dimension: detection speed.

As before, the main purpose is neither to propose a new monitoring method nor to conduct a study to generalize their relative performance (which is certainly case dependent [12, 14-16]). The emphasis is instead on establishing robust figures of merit and sound procedures that can be adopted in the future for testing new methods or for rigorously conducting comparison studies. However, we not only extend the comparison methodology, called CAM (Comparison and Assessment Methodology), to incorporate the dimension of detection speed, but also extensively illustrate its application in a study involving a rich variety of methods (2-way, 3-way and dynamic, 60 methods overall) and testing scenarios (2 systems tested in 7 different conditions).

This article is organized as follows: Section 2 presents the comparison framework, where figures of merit are defined for assessing detection speed, and key performance indicators are established to comparatively assess the performance of monitoring methods. Section 3 describes the variety of batch monitoring methods used to exemplify a comparison study following the CAM framework, and also provides details regarding the simulated systems used to generate the datasets. In Section 4, the main results are presented, followed by an analysis conducted at different aggregation levels. In Section 5, these results are discussed in detail and, lastly, Section 6 summarizes the main conclusions of this work.

2 Comparison framework

In this section, we discuss the existing approaches for characterizing detection speed in process monitoring schemes and introduce the proposed figures of merit to assess this dimension in Batch Process Monitoring (BPM) methods. Afterwards, following the structure of the comparison and assessment methodology (CAM) described in detail in the first part of this article [12], we briefly review the main stages considered in the simulation of the different batch operation conditions where raw data for comparison was generated, as well as the computation of the key performance indicator (KPI) used to score the BPM methods according to their relative performance.

2.1 Baseline figures of merit for assessing the detection speed of batch process monitoring methods

Detection speed is the performance dimension of a monitoring scheme concerned with how fast a fault can be detected after it takes place. It is usually measured by the amount of time (or number of observations) that a monitoring scheme takes to signal a fault after its onset. In this regard, the most common figures of merit are the Average Run Length (ARL) and the Average Time to Signal (ATS) [17-22], which is related to the ARL through ATS = ARL × ∆t, where ∆t is the sampling period. However, the adoption of the ARL approach has several limitations. First, it requires that all methods under comparison are tuned to the same in-control ARL, ARL(0); otherwise the comparison is meaningless. Second, the run length follows a geometric distribution and, as such, its standard deviation has the same order of magnitude as its mean. Therefore, an ARL estimate is likely to be affected by a large uncertainty, which limits its quality and practical use. Furthermore, as the ARL construct was derived for continuous processes with ergodic behavior (i.e., whose probabilistic properties are invariant in space and time), it is assumed that the fault occurs right at the beginning of the process, t = 1 (under the ergodicity assumption, the detection behavior would be exactly the same if the fault occurred at any other time t ≥ 1; we consider here a discrete time grid, which starts at t = 1). In mathematical terms:

$$\mathrm{ARL} = E\left(t_A \mid \tau = 1\right) \qquad (1)$$

In this equation, τ represents the time at which the fault takes place and t_A is the time when an alarm is issued. While this definition does not pose any strong limitation for stationary processes, since the fault's location does not affect the detection speed, the same does not apply to batch processes, where ergodicity no longer holds. Therefore, the ARL approach cannot fully capture the effects of non-stationarity on detection speed.

In this context, an alternative figure of merit for characterizing detection speed should be adopted for BPM. Surveillance [23-25], a field closely related to monitoring and strongly focused on the timely detection of process perturbations, offers a conceptual background and alternatives for characterizing detection speed that are opportune to consider for batch processes. As an alternative to the ARL, we consider the conditional expected delay (CED), which measures the expected delay between the fault's onset (τ) and the time when an alarm is issued by the monitoring method (t_A), conditioned on the fact that no false alarm has been issued before τ [23]:

$$\mathrm{CED}(t) = E\left(t_A - \tau \mid t_A \geq \tau = t\right) \qquad (2)$$

For completeness, the CED should be computed for multiple values of τ, since it depends on the fault's onset time. Likewise, the CED also depends on the Type I error or false alarm rate (α) considered for the control limits: lower control limits give a faster detection. However, this increased sensitivity comes at the expense of an increased number of false alarms. Therefore, together with the CED, it is also important to measure the probability of false alarms (PFA) before the fault's onset at time τ:

$$\mathrm{PFA} = P\left(t_A < \tau\right) \qquad (3)$$

Note that PFA is not the same as the Type I error rate (α) of the monitoring procedure. PFA is the probability of having at least one false alarm during the entire period from the beginning of the surveillance until the point in time where the fault occurs; the Type I error rate, on the other hand, is the probability that any given observation from a process operating under Normal Operating Conditions (NOC) gives rise to a false alarm. In this study, we adopt and integrate both metrics (CED and PFA) in the computation of the Key Performance Indicator (KPI) used as the basis for comparing the monitoring methods. In this way, it is possible to assess the detection speed dimension while considering the associated probability of false alarms, securing a fair and rigorous assessment of the methods. The best BPM methods will have simultaneously low values of PFA and CED. In other words, methods that detect faults faster without compromising their reliability through excessive false alarms will be preferred.

We recall that in the absence of ergodicity (which is the case of batch processes), both CED and PFA depend on the fault's onset time (τ) and on the false alarm rate considered for the control limits (α) used in the monitoring schemes. Therefore, to have a clear picture of the methods' performance, both τ and α should be varied over a range of values. To do so, we fix τ, generate multiple independent replicates of a given type of fault, and then compute the associated CED and PFA for each value of α. By doing so, a CED versus PFA curve similar to that represented in Figure 1 is obtained. To summarize this curve, we compute the area under the curve (AUC) by numerical integration. For practical reasons, we only compute the AUC for PFA in the interval [0.01, 0.20] (by varying α), since this corresponds to the interval of probability of false alarms in which virtually all BPM methods operate. In this study, as in typical BPM implementations, we consider that an alarm is issued only after three consecutive observations exceed the control limit of the BPM method.
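The computation of these figures of merit can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the array layout (one row per replicate), the exclusion of replicates that never raise an alarm from the CED average, and the linear interpolation of the curve onto a regular PFA grid are all assumptions made here for illustration.

```python
import numpy as np

def first_alarm_times(stat, limit, run=3):
    """First alarm time (1-based) per replicate, np.inf if no alarm is raised.

    stat: array (R, T) with one monitoring-statistic profile per replicate.
    An alarm is issued at the observation that completes `run` consecutive
    exceedances of `limit` (three in the study).
    """
    exceed = np.asarray(stat) > limit
    n_rep, n_times = exceed.shape
    times = np.full(n_rep, np.inf)
    for r in range(n_rep):
        count = 0
        for t in range(n_times):
            count = count + 1 if exceed[r, t] else 0
            if count >= run:
                times[r] = t + 1
                break
    return times

def ced_and_pfa(alarm_times, tau):
    """Empirical CED (Eq. 2) and PFA (Eq. 3) for a fault starting at time tau."""
    alarm_times = np.asarray(alarm_times, dtype=float)
    pfa = np.mean(alarm_times < tau)
    # Assumption: replicates that never signal are left out of the delay average.
    valid = (alarm_times >= tau) & np.isfinite(alarm_times)
    ced = np.mean(alarm_times[valid] - tau) if valid.any() else np.nan
    return ced, pfa

def auc_ced_vs_pfa(pfa, ced, lo=0.01, hi=0.20, n_grid=200):
    """Area under the CED versus PFA curve, restricted to lo <= PFA <= hi."""
    order = np.argsort(pfa)
    pfa = np.asarray(pfa, dtype=float)[order]
    ced = np.asarray(ced, dtype=float)[order]
    grid = np.linspace(lo, hi, n_grid)
    # np.interp holds the end values outside the observed PFA range.
    ced_grid = np.interp(grid, pfa, ced)
    return float(np.sum(0.5 * (ced_grid[1:] + ced_grid[:-1]) * np.diff(grid)))
```

In use, one pair (CED, PFA) would be obtained per candidate control limit (i.e., per value of α), and the resulting pairs passed to `auc_ced_vs_pfa`.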

Figure 1. Example of the CED versus PFA curve (horizontal axis: PFA).


As the BPM methods considered in this study are based on principal component analysis (PCA) models, they evaluate the process state through two monitoring statistics (with the exception of autoregressive PCA, ARPCA, which makes use of three statistics). Consequently, for each method there is one AUC for each of the monitoring statistics considered. Since these monitoring statistics convey complementary information, one of them tends to be more sensitive than the other for a given type of fault. Furthermore, as only one of them needs to be out-of-control in order to signal a fault, the overall performance of each method is here defined by the AUC of the most sensitive monitoring statistic in each condition studied. Based on this consideration, the AUC of the i-th method in situation j (j is a compound index, which specifies the particular combination of conditions contemplated in a given case study, cs=1:CS, and for a given fault, f=1:F) is defined as the minimum of the AUCs obtained with the T² and Q statistics:

$$\mathrm{AUC}_{i,j} = \min\left\{ \mathrm{AUC}_{i,j}^{T^2}, \mathrm{AUC}_{i,j}^{Q} \right\}$$

The main advantage of this AUC methodology is that it depends neither on the observed false alarm rates for a given control limit nor on the use of inaccurate theoretical control limits for a given false alarm rate. This happens because all values of α are tested and the observed PFA is determined from data. Finally, it should be noted that, as the AUC is related to the overall detection delay, lower AUCs are associated with better performance.
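As a small illustration of this selection rule, the sketch below takes the per-statistic AUCs and keeps the lowest one for each method and situation. The function name, the dictionary layout and the use of NaN for a statistic that a method does not have (e.g., a third statistic available only for ARPCA) are assumptions for this example, not part of the original study.

```python
import numpy as np

def overall_auc(auc_per_statistic):
    """Overall AUC per (method, situation): the minimum over the statistics.

    auc_per_statistic: dict mapping a statistic name to an array of shape
    (n_methods, n_situations); NaN marks a statistic a method does not use.
    """
    stacked = np.stack(
        [np.asarray(a, dtype=float) for a in auc_per_statistic.values()]
    )
    # Lower AUC means faster detection, so the most sensitive statistic wins.
    return np.nanmin(stacked, axis=0)

example = overall_auc({
    "T2": [[3.0, 5.0], [2.0, 8.0]],
    "Q": [[4.0, 1.0], [2.0, 9.0]],
    "third": [[np.nan, np.nan], [1.0, 10.0]],  # hypothetical extra statistic
})
```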

2.2 Background on the Comparison and Assessment Methodology (CAM) and key performance indicators (KPI)

Computer simulations form the basis of the overwhelming majority of approaches for characterizing the “detection speed” dimension of complex monitoring methods in a rigorous and robust way. For some univariate methods, analytical expressions are available to compute some figures of merit [20, 26], but for multivariate approaches this becomes unfeasible, and the only solution is the implementation of large-scale Monte Carlo simulations [25]. In analogy to the workflow adopted in the first part of this study, the Comparison and Assessment Methodology (CAM) was also used here. It is based on the results collected from a structured design of conditions where extensive computational simulations are run. This framework contains five stages [12], as presented in Figure 2. These stages largely overlap with the methodology proposed earlier for the analysis of detection strength (see Ref. 12). However, as some minor adaptations are required to accommodate the different figures of merit, the entire methodology and its building blocks are reviewed here.

Figure 2. The five stages of the Comparison and Assessment Methodology (CAM) for the analysis of detection speed.

I. Definition of testing conditions and methods/versions to analyze. This stage concerns the selection of the case studies (cs=1:CS) and the faults over which they are to be analyzed (f=1:F(cs)). For each fault, we also define (i) the type of fault, (ii) the fault's magnitude and (iii) the fault's onset time (note that a change in any of these parameters leads to a different faulty scenario). Afterwards, for each scenario (i.e., for each fault in each case study), several independent replicates are generated (r=1:R) in order to compute the figures of merit. Furthermore, to obtain meaningful results, the case studies and faults must be representative of realistic situations.

II. Generate training data sets to estimate the monitoring models and testing data sets for all the conditions to be considered. The first step in this stage concerns the generation of NOC batches to be used as training data in each case study. These data are then used to fit the BPM models. In a second step, the test conditions are simulated, originating multiple replicates of each fault in each case study. These simulations are performed through the hierarchy of steps presented in Table 1.


Table 1 Pseudocode for the generation of data.

for cs = 1:CS
    for f = 1:F(cs)
        for r = 1:R
            Simulate process @ conditions (cs, f, r)
            Save data @ (cs, f, r)
        end (r)
    end (f)
end (cs)
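The nested loop of Table 1 could be written in Python roughly as follows. This is only a sketch: `simulate_batch` is a hypothetical stand-in that returns random data, since the actual process simulators are not reproduced here, and the in-memory dictionary keyed by (cs, f, r) is an assumed storage choice.

```python
import numpy as np

def simulate_batch(cs, f, r, n_times=100, n_vars=3, seed=0):
    """Hypothetical placeholder simulator: one batch of shape (n_times, n_vars).

    A real study would call the process simulator for case study `cs`
    with fault `f`; here white noise is returned, seeded by (cs, f, r)
    so that every replicate is reproducible.
    """
    rng = np.random.default_rng([seed, cs, f, r])
    return rng.standard_normal((n_times, n_vars))

def generate_test_data(n_faults_per_case, n_replicates, simulate=simulate_batch):
    """Run the Table 1 loop and store each batch under the key (cs, f, r).

    n_faults_per_case: list whose entry cs-1 gives F(cs), the number of
    faults defined for case study cs.
    """
    data = {}
    for cs, n_faults in enumerate(n_faults_per_case, start=1):
        for f in range(1, n_faults + 1):
            for r in range(1, n_replicates + 1):
                data[(cs, f, r)] = simulate(cs, f, r)
    return data

# Two case studies with 2 and 1 faults, 3 replicates each: 9 batches.
data = generate_test_data([2, 1], n_replicates=3)
```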

III. Run the selected process monitoring approaches over all conditions and replicates and compute the CED versus PFA curves and the associated AUCs. In this stage, the different variants of the BPM methods are applied to the data sets generated during Stage II, returning the respective profiles of the monitoring statistics for each batch run, data@(cs, f, r). Afterwards, for each monitoring statistic, CED@(cs, f) and PFA@(cs, f) are computed using information collected from all the replicates. By varying the false alarm rates of the monitoring methods, the CED@(cs, f) versus PFA@(cs, f) curves are obtained and used to determine AUC@(cs, f) by numerical integration. Finally, for each method, the minimum AUC is selected to represent its overall performance. One should note that many fault replicates (e.g., 10000) are required to construct the CED versus PFA curve. The availability of industrial datasets with these characteristics is of course limited; thus, as previously stated, CAM is better suited for comparing the performance of monitoring methods under simulated scenarios, where replicates are available and the characteristics of the fault are known in detail.

IV. Process the results and compute Key Performance Indicators (KPI). After Stage III, we have access to the fundamental figure of merit (AUC) for each BPM method/variant and for all case studies and faults. However, AUCs are only comparable within the same fault; AUCs from different faults are not related to each other. Therefore, it is convenient to build a Key Performance Indicator (KPI) that is transversal to all circumstances and does not depend on the scale of the AUCs. One way to achieve this is to score the BPM methods based on their relative performance in each fault. To attribute a score to each method within a case study/fault, we resort to a successive comparison between each pair of methods. This comparison determines whether the AUCs of the two methods are equal to each other or one AUC is significantly lower than the other. To decide whether two methods have significantly different AUCs, we analyze all possible differences between the AUCs within a fault and record the interval containing the lowest 10% of the differences. If the difference between two AUCs falls within this interval, the two methods are considered to have similar performance. Based on this, the methods are scored by comparing each method's AUC against the others and counting the number of victories (lower AUC), losses (higher AUC) and ties (similar AUC). These counts are then translated into a single number by attributing a score to each outcome and summing all the scores obtained by each method: 2 for each victory, 1 for each tie and 0 for each loss. This means that, for each fault under analysis, a method can have a score between 0 (all losses) and 2 × (number of methods − 1) (all wins, as a method is not compared against itself).

V. Analyze the results at different levels of aggregation. To conclude the CAM, in this stage the KPI are analyzed at different levels of aggregation, such as per class of methods, fault types, infillings, etc. This analysis is illustrated in Section 4.
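The scoring rule of Stage IV can be sketched in Python as below. The sketch rests on one interpretation that is an assumption here: the "interval containing the lowest 10% of the differences" is taken as [0, q], where q is the 10% quantile of the absolute pairwise AUC differences within a fault. The function name and argument names are likewise illustrative.

```python
import numpy as np

def score_methods(auc, tie_fraction=0.10):
    """Score methods within one fault: 2 per win, 1 per tie, 0 per loss.

    auc: 1-D sequence with the overall AUC of each method (lower is better).
    Two methods tie when their absolute AUC difference does not exceed the
    `tie_fraction` quantile of all pairwise absolute differences (assumed
    reading of the 10% rule).
    """
    auc = np.asarray(auc, dtype=float)
    n = auc.size
    diff = auc[:, None] - auc[None, :]      # diff[i, j] = AUC_i - AUC_j
    upper = np.triu_indices(n, k=1)         # each pair counted once
    threshold = np.quantile(np.abs(diff[upper]), tie_fraction)
    tie = np.abs(diff) <= threshold
    win = (diff < 0) & ~tie                 # strictly lower AUC and not a tie
    np.fill_diagonal(tie, False)            # a method is not compared to itself
    return 2 * win.sum(axis=1) + tie.sum(axis=1)

scores = score_methods([1.0, 1.05, 3.0, 6.0])
```

With this rule every pair of methods contributes exactly 2 points in total, so the scores within a fault always add up to (number of methods) × (number of methods − 1), which makes scores comparable across faults.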

3 Application of the CAM framework to a large-scale comparison study

In this section, we provide a brief overview of the methods (Section 3.1) and simulated case studies (Section 3.2) considered to illustrate the application of CAM. For more details on the construction and implementation of the entire methodology, the reader is referred to the first article of the series [12].


3.1 Batch process monitoring approaches considered in the study

This study includes the typical modelling methodologies employed for BPM, as well as some more recent ones: 2-way, 3-way and dynamic methods. For each class of methods, different variants were also considered through the use of different infilling schemes for future observations as well as different approaches for setting the control limits. These methods were selected based on their ability to cover a wide range of processes and on their prevalence in practical scenarios. Furthermore, the large number of methods considered in this study shows that the framework is flexible and suitable for extensive comparison studies. Flexibility is of paramount importance, since practitioners can easily test many monitoring methods with current computational resources. Regarding the infilling of future observations in 2-way and 3-way methods, the following alternatives were applied [27]:

• zero deviation (ZD), based on the assumption that future measurements will not deviate from their mean value;

• current deviations (CD), where one assumes that the deviations observed in the current measurements will remain constant throughout the batch duration;

• missing data imputation approach (MD), which is an approach based on the projection capabilities of PCA to recover missing data.
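The three infilling alternatives can be illustrated with the following Python sketch for a single, already mean-centered batch. It is a simplified reading, not the implementation used in the study: the function name and arguments are hypothetical, the batch-wise unfolding is assumed to be time-major (all variables at time 1, then time 2, and so on), and the MD option is implemented as a least-squares projection of the known part of the batch onto the corresponding rows of the PCA loadings.

```python
import numpy as np

def infill_future(x_known, n_times, method="ZD", loadings=None):
    """Complete a partially observed batch of mean-centered deviations.

    x_known: array (k, n_vars) with the deviations observed up to time k.
    loadings: PCA loading matrix of shape (n_times * n_vars, n_components),
    required only for method "MD".
    Returns an array (n_times, n_vars) with the future rows filled in.
    """
    x_known = np.asarray(x_known, dtype=float)
    k, n_vars = x_known.shape
    full = np.zeros((n_times, n_vars))
    full[:k] = x_known
    if method == "ZD":
        pass                                  # future deviations stay at zero
    elif method == "CD":
        full[k:] = x_known[-1]                # last deviation is held constant
    elif method == "MD":
        P = np.asarray(loadings, dtype=float)
        P_known = P[: k * n_vars]             # rows matching the observed part
        scores, *_ = np.linalg.lstsq(P_known, x_known.ravel(), rcond=None)
        full[k:] = (P @ scores)[k * n_vars:].reshape(n_times - k, n_vars)
    else:
        raise ValueError(f"unknown infilling method: {method}")
    return full
```

For example, `infill_future([[1.0, 2.0], [3.0, 4.0]], n_times=4, method="CD")` repeats the deviations `[3.0, 4.0]` for the two remaining time points.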

To represent the class of 2-way methods, multiway PCA (MPCA) [27] was selected, and different window sizes to compute the control limits of the Q (SPE) statistic were considered: in addition to limits based on one observation (WS1), the control limits for time k are obtained by considering also the values computed for the Q statistic at times k ± 1 (window size of 3, WS3) and k ± 2 (window size of 5, WS5). The 3-way methods contemplated in this comparison study are Tucker3 [28] and PARAFAC [29]. Both methods were applied after preprocessing the data by centering the first mode (batch mode) and scaling the third mode (time mode). As for the dynamic methods, batch dynamic PCA (BDPCA) [7] and autoregressive PCA (ARPCA) [8] were included in this study, and their variants correspond to two different approaches to normalize the scores during the computation of the T² statistic. In the first case, the original T² statistic was retained without local normalization of the scores (the results for this variant are identified by the label “ORIG”). Therefore, the control limit for the T² statistic is time dependent. In the second case, the scores were normalized based on their time-dependent mean and covariance, resulting in a T² statistic with a constant control limit (these results are identified as “NORM”). A variant of BDPCA based on dynamic PCA with decorrelated residuals (DPCA-DR) [30] is also included in this study. DPCA-DR assumes that the current observation is missing and makes use of previous samples and a dynamic PCA model to predict the current sample. Afterwards, the residuals between model predictions and current observations are monitored.

Lastly, different data synchronization approaches were tested: no synchronization, synchronization based on an indicator variable (IV) [31] and dynamic time warping (DTW) [32]. In all methods, a leave-one-out (LOO) approach was adopted for estimating the monitoring parameters and computing the control limits of the monitoring statistics. All the aforementioned methods/variants result in 60 BPM alternatives, which are summarized in Table 2. A more detailed description of the methods can be found in the first part of this series (Ref. 12).

Table 2 Summary of the different methods and variants contemplated in the comparison study. Also shown is the nomenclature adopted in the results section.

| Modeling approach | Representative of the class | Synchronization | Window size (*) | Infilling approach | Versions of dynamic methods (†) | Number of versions in the line |
|---|---|---|---|---|---|---|
| 2-way | MPCA | None, IV, DTW | WS1, WS3, WS5 | ZD, CD, MD | -- | 27 |
| 3-way | PARAFAC, Tucker3 | None, IV, DTW | -- | ZD, CD, MD | -- | 18 |
| Dynamic (DYN) | ARPCA | None, IV, DTW | -- | -- | ORIG, NORM | 6 |
| Dynamic (DYN) | BDPCA | None, IV, DTW | -- | -- | ORIG, NORM, DPCA-DR | 9 |
| Total | | | | | | 60 |

* For computing the control limits of the instantaneous Q statistic.
† Without scores normalization (ORIG) and with scores normalization (NORM).
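As a quick consistency check on Table 2, the 60 alternatives can be enumerated from the options listed for each class. The sketch below is illustrative only: the label format (for example `MPCA-MD-WS1-IV`) and the function name `enumerate_variants` are assumptions made here, not the authors' implementation.

```python
from itertools import product

SYNC = ["None", "IV", "DTW"]        # synchronization options
WINDOWS = ["WS1", "WS3", "WS5"]     # window sizes (2-way methods only)
INFILL = ["ZD", "CD", "MD"]         # infilling options (2-way and 3-way)


def enumerate_variants():
    """Return hypothetical labels for the BPM variants implied by Table 2."""
    variants = []
    # 2-way: MPCA x infilling x window size x synchronization (3*3*3 = 27)
    for infill, ws, sync in product(INFILL, WINDOWS, SYNC):
        variants.append(f"MPCA-{infill}-{ws}-{sync}")
    # 3-way: {PARAFAC, Tucker3} x infilling x synchronization (2*3*3 = 18)
    for model, infill, sync in product(["PARAFAC", "Tucker3"], INFILL, SYNC):
        variants.append(f"{model}-{infill}-{sync}")
    # Dynamic, ARPCA: {ORIG, NORM} x synchronization (2*3 = 6)
    for version, sync in product(["ORIG", "NORM"], SYNC):
        variants.append(f"ARPCA-{version}-{sync}")
    # Dynamic, BDPCA: {ORIG, NORM, DPCA-DR} x synchronization (3*3 = 9)
    for version, sync in product(["ORIG", "NORM", "DPCA-DR"], SYNC):
        prefix = "DPCA-DR" if version == "DPCA-DR" else f"BDPCA-{version}"
        variants.append(f"{prefix}-{sync}")
    return variants


variants = enumerate_variants()
print(len(variants))
```

The per-class counts (27, 18, 6 and 9) add up to the total reported in the table.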


3.2 Testing scenarios

This section provides an overview of the two simulated case studies and lists the faults introduced in each. These scenarios (case studies and faults) are intended to cover a variety of realistic situations, enabling a robust comparison of the BPM methods under distinct conditions.

3.2.1 Case study 1: PENSIM

The first simulated system implements a mathematical model of a fed-batch fermentation process for the production of penicillin (PENSIM, version 2.0, developed at the Illinois Institute of Technology and available at http://simulator.iit.edu/web/pensim/index.html).³³ The PENSIM simulator generates profiles with several realistic features, allows full control of the operating conditions, and can simulate several types of faults with different magnitudes. For monitoring purposes, nine variables were considered: aeration rate, substrate feed temperature, dissolved oxygen concentration, culture volume, CO2 concentration, pH, bioreactor temperature, generated heat, and cooling/heating water flow rate. The duration of each batch was set to 200 h with a sampling interval of 0.5 h (corresponding to 400 observations). The pre-culture stage lasts for about 45 h and the fed-batch stage has a duration of approximately 155 h. More details about the simulation conditions can be found elsewhere.³³

A total of 200 batches representing normal operation conditions (NOC) were simulated for training the BPM methods. Three types of step faults were simulated to reflect abnormal situations: aeration rate (fault 1), agitator power (fault 2) and substrate feed rate (fault 3), as summarized in Table 3. For each type of fault, 10 000 replicates were generated.

Table 3. Summary of the testing conditions considered for the PENSIM case study.

| Fault | Description | Fault Magnitude (%) |
|---|---|---|
| 1 | Step fault in the aeration rate | 1 |
| 2 | Step fault in the agitator power | 1 |
| 3 | Step fault in the substrate feed rate | 30 |
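To make the step faults of Table 3 concrete, the sketch below applies a relative step change to a signal from a given observation onward. It is a minimal illustration under stated assumptions: the fault is taken to be a multiplicative change relative to the nominal value, the nominal aeration rate of 8.0 is a hypothetical placeholder, and `apply_step_fault` is a name introduced here; the PENSIM simulator applies faults internally and may define the magnitude differently. The onset at observation 170 is the one reported in the Results section.

```python
def apply_step_fault(signal, start_index, magnitude_pct):
    """Return a copy of `signal` with a relative step applied from `start_index` on.

    Assumption: the step is multiplicative, i.e. x -> x * (1 + magnitude_pct / 100).
    """
    factor = 1.0 + magnitude_pct / 100.0
    return [x * factor if i >= start_index else x for i, x in enumerate(signal)]


# 200 h sampled every 0.5 h gives 400 observations per batch.
n_obs = round(200 / 0.5)

# Hypothetical constant nominal profile, used only for illustration.
aeration_rate = [8.0] * n_obs

# Fault 1 of Table 3: 1 % step in the aeration rate, starting at observation 170.
faulty = apply_step_fault(aeration_rate, start_index=170, magnitude_pct=1.0)
print(faulty[169], faulty[170])
```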


3.2.2 Case study 2: SEMIEX

The SEMIEX case study is a simulation of a semi-batch reactor where an exothermic second-order reaction takes place (A + B → C).³⁴ The reactor temperature is maintained at 25 °C by a PID controller that manipulates the flow of cooling fluid. Similar to the PENSIM case study, natural sources of variability affect the system: Gaussian noise is present, and some process variables exhibit auto-regressive drifting patterns. In this simulated system, seven process variables were monitored: the reactor volume, the concentrations of reactants A and B, the concentration of product C, the reactor temperature, and the inlet temperature and flow rate of the cooling fluid. For illustration purposes, the trajectories of some of these process variables are presented in Figure 3 for five batches.

Monitoring models were built using 200 simulated NOC batches, and four different faults disturbed the reactor: a fault in the inlet flow sensor for reactant B (fault 1); a lag in the response of the control signal (fault 2); a fault in the sensor measuring the reactor temperature (fault 3); and, finally, a reactor leakage (fault 4). These faulty conditions are summarized in Table 4. For each fault, 10 000 batches were simulated.

Table 4. Summary of the testing conditions considered for the SEMIEX case study.

| Fault | Description | Fault Magnitude |
|---|---|---|
| 1 | Fault in the inlet flow sensor for reactant B | -6 % |
| 2 | Lag in the response of the control signal | 0.01 h |
| 3 | Fault in the sensor measuring the reactor temperature | 1 % |
| 4 | Reactor leakage | 17 % |

[Figure 3: four panels, (a) to (d), each plotted against Time (h) from 0 to 6 h.]

Figure 3. Examples of batch trajectories (five batches) for four variables in the SEMIEX case study: (a) reactor temperature, (b) concentration of reactant A in the reactor, (c) concentration of reactant B in the reactor, and (d) flow of reactant B to the reactor.

4 Results

To demonstrate the application of the proposed CAM framework as a comparison tool, the 60 variants of BPM methods described in Section 3.1 were implemented on the two case studies under different faulty scenarios. For each BPM method, 200 in-control batches were used to train the models. Afterwards, for each process and fault, 10 000 batches were generated. This large number of replicates is required to reliably compute the CED and PFA statistics. For the SEMIEX process, the faults were introduced at observation 250 (from a total of 601 observations), while for the PENSIM process, the faults started at observation 170 (from a total of 400 observations). Furthermore, in the PENSIM case study, each batch was divided according to its operational phase and only the second phase was considered in this analysis. Therefore, about 100 observations were discarded at the beginning of each batch in the PENSIM process.

After defining the test data sets, the CED and PFA of each fault can be computed. The resulting CED versus PFA curve was summarized by the area under the curve (AUC) for the relevant PFA range, i.e., between 0.01 and 0.20. Under this framework, an AUC equal to 0 means that the fault can be detected without any delay, while a larger AUC value implies a larger detection delay. Consequently, the methods with the lowest AUCs are considered the best approaches for detecting a fault quickly without compromising reliability. Based on the AUC of each method, the Key Performance Indicator (KPI) is computed as described in Section 2.
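The summary statistic described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that PFA is estimated as the fraction of batches that signal before the fault onset, that CED is the mean delay among batches whose first alarm occurs at or after the onset (batches that never signal are excluded), and that the AUC is obtained with the trapezoidal rule and linear interpolation at the range boundaries. The exact definitions are those given in Section 2; the function names are hypothetical.

```python
def ced_and_pfa(first_alarm, fault_onset):
    """Estimate (CED, PFA) from the first-alarm index of each batch.

    first_alarm: list of indices (None when a batch never signals).
    Assumption: undetected batches are left out of the CED average.
    """
    n_false = sum(1 for t in first_alarm if t is not None and t < fault_onset)
    pfa = n_false / len(first_alarm)
    delays = [t - fault_onset for t in first_alarm
              if t is not None and t >= fault_onset]
    ced = sum(delays) / len(delays) if delays else float("inf")
    return ced, pfa


def interp(x, xs, ys):
    """Linear interpolation of y at x; xs must be strictly increasing."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
            return ys[i] + slope * (x - xs[i])
    raise ValueError("x lies outside the range covered by the curve")


def restricted_auc(pfa, ced, lo=0.01, hi=0.20):
    """Trapezoidal area under the CED-vs-PFA curve for lo <= PFA <= hi."""
    xs = [lo] + [p for p in pfa if lo < p < hi] + [hi]
    ys = [interp(x, pfa, ced) for x in xs]
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))


# One point of the curve, e.g. for a single control-limit setting.
print(ced_and_pfa([250, 252, 100, None, 254], fault_onset=250))

# Hypothetical curve obtained by sweeping the control limit.
print(restricted_auc([0.01, 0.05, 0.10, 0.20], [4.0, 2.0, 1.0, 0.0]))
```

A curve with zero delay over the whole PFA range yields an AUC of 0, matching the interpretation given in the text.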


According to the specifications of Stage V, the results are aggregated at different levels in order to extract general trends in the performance of the methods and useful insights for the selection of BPM schemes. Following this procedure, an inter-methods analysis is first conducted through an overall comparison of the different BPM classes (2-way, 3-way and dynamic). Afterwards, a finer intra-methods analysis of the features within each modeling methodology is made. Before discussing the results, we note that the main contribution of this article is the CAM framework and that the following analysis aims solely to demonstrate its applicability and advantages.

4.1 Aggregation level: inter-methods

The broadest aggregation level that can be defined is the overall comparison of the different model types (2-way, 3-way and dynamic methods), regardless of their specific version. Figure 4 presents the distribution of the KPI at this aggregation level, i.e., for all methods within a given class, for both the SEMIEX and PENSIM case studies. From these results, it can be observed that in the SEMIEX process the dynamic methods tend to perform best, followed by the 2-way methods. On the other hand, the 3-way methods performed considerably worse, with the exception of six variants for fault 1, which receive a score of 100 (out of a maximum of 118). Likewise, for the PENSIM process, the dynamic and 2-way methods presented the best performances, while the 3-way methods only show a high score in one situation.
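The scores discussed above come from pairwise comparisons of the AUCs. The sketch below shows one scoring rule that is consistent with the stated maximum of 118 for 60 variants (2 points per win over each of the 59 competitors). Both the 2/1/0 point scheme and the tolerance-based tie rule are assumptions made here for illustration; the actual KPI, including how ties are decided, is the one defined in Section 2.

```python
def kpi_scores(auc, tol=1e-6, win=2, tie=1):
    """Pairwise score per method: lower AUC wins, near-equal AUCs tie.

    auc: dict mapping method name -> AUC for one fault.
    Assumptions: `win`/`tie` points and the tolerance `tol` are hypothetical.
    """
    scores = {name: 0 for name in auc}
    for a in auc:
        for b in auc:
            if a == b:
                continue
            if abs(auc[a] - auc[b]) <= tol:
                scores[a] += tie          # indistinguishable detection speed
            elif auc[a] < auc[b]:
                scores[a] += win          # method a detects faster than b
    return scores


# Three hypothetical methods: A and B tie, both beat C.
print(kpi_scores({"A": 0.10, "B": 0.10, "C": 0.50}))

# With 60 methods and distinct AUCs, the fastest one reaches 2 * 59 points.
many = {f"m{i}": float(i) for i in range(60)}
print(kpi_scores(many)["m0"])
```

Under such a rule, a method that ties with many competitors scores lower than one that wins outright, even when its detection delay is the same, which is the effect noted in the text when scores drop because of ties.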

[Figure 4: panels (a) and (b); vertical axis: Score.]

Figure 4. Score (KPI) distribution stratified by class of modeling approach for the SEMIEX process (a) and the PENSIM process (b). Higher scores indicate better performance.

Another interesting observation from Figure 4 is that even similar methods do not perform equally; performance depends on the elements contemplated in each variant (type of infilling, synchronization approach and control limits). Likewise, the KPI can vary with the type of fault. Therefore, for each class of methods it is necessary to go deeper into the aggregation level in order to determine which components most affect the speed of detection.

To achieve a deeper analysis of the KPI, the results are stratified by class of method and synchronization approach in Figure 5, by means of a multi-vary chart. For the SEMIEX process (see Figure 5 (a)), synchronization can have a significant impact on the KPI. The major improvement in the dynamic methods occurs with IV synchronization, which is responsible for a faster detection by the BDPCA models in faults 1 and 4 (higher scores). However, even with this improvement, BDPCA is still not competitive against the other dynamic methods (see Section 4.2.3). Similarly, for the 2-way methods, there is an improvement when IV synchronization is used. On the other hand, all 3-way methods have a low KPI regardless of the synchronization approach adopted. Note, however, that synchronization does have an impact on the performance of 3-way methods (see Section 4.2.2), but since they tend to detect the faults much later (especially faults 2 and 4), they often lose in the AUC comparison and the effect of this feature is attenuated.

Regarding the results for the PENSIM process in Figure 5 (b), it is again observed that IV synchronization leads to the fastest detections for the dynamic and 3-way methods. Again, for the dynamic methods this apparent improvement is only due to a slight enhancement of the BDPCA models (see Section 4.2.3), while the ARPCA variants are mostly unaffected by the synchronization approach (all of them have a fast detection). As for the 3-way methods, the better performance with IV synchronization is caused by the ability to detect fault 3 when this synchronization is used. On the other hand, for the 2-way methods, IV synchronization now leads to the worst performance.
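For readers unfamiliar with indicator-variable (IV) synchronization, the idea can be sketched as resampling each batch onto common levels of a monotonic indicator variable instead of clock time. The code below is a generic illustration using linear interpolation; the function name, the choice of indicator and the grid are assumptions, and the study's own implementation (ref 31) may differ.

```python
def synchronize_by_indicator(indicator, values, grid):
    """Resample `values` at the indicator-variable levels listed in `grid`.

    indicator: strictly increasing indicator values within one batch.
    values:    process variable measured at the same instants.
    grid:      increasing indicator levels shared by all batches.
    """
    out = []
    j = 0
    for g in grid:
        if not indicator[0] <= g <= indicator[-1]:
            raise ValueError("grid level outside the indicator range")
        # Advance to the segment [indicator[j], indicator[j + 1]] containing g.
        while indicator[j + 1] < g:
            j += 1
        x0, x1 = indicator[j], indicator[j + 1]
        y0, y1 = values[j], values[j + 1]
        out.append(y0 + (y1 - y0) * (g - x0) / (x1 - x0))
    return out


# Two hypothetical batches of different length, aligned on the same grid.
grid = [0.0, 2.0, 3.0, 4.0]
batch_a = synchronize_by_indicator([0.0, 1.0, 2.0, 4.0], [10.0, 12.0, 14.0, 18.0], grid)
batch_b = synchronize_by_indicator([0.0, 2.0, 4.0], [10.0, 15.0, 18.0], grid)
print(batch_a, batch_b)
```

After resampling, all batches have the same number of observations, so a single monitoring model can be fitted across them.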

[Figure 5: panels (a) and (b); horizontal axis: Models (Dyn, 2Way, 3Way); legend: None, IV, DTW.]

Figure 5. Multi-vary chart for the performance scores, stratified by class of BPM models and synchronization: (a) SEMIEX process and (b) PENSIM process. Higher scores indicate better performance.

Another aspect that has an impact on the performance of the BPM methods is the testing scenario (type of fault). This level of stratification is represented in Figure 6 for both case studies. Regarding the SEMIEX process (Figure 6 (a)), the scores obtained by each class of methods do not vary much with the type of fault, with the exception of fault 1. For the dynamic methods, the lower scores in fault 1 are due to an increase in ties against the other methods. Therefore, while dynamic methods are still capable of rapidly detecting fault 1, the other methods also perform well in this fault (although they are not as good at detecting the remaining faults). Similarly, the 3-way methods also show better scores for fault 1 due to a rapid detection by the Tucker3 variants with CD and ZD infillings (regardless of the synchronization approach).

As for the PENSIM process (Figure 6 (b)), the dynamic methods have a fast detection speed for faults 1 and 3, while the 2-way methods are more suitable for detecting fault 2. Again, the 3-way methods are unable to detect the faults (apart from a late detection of fault 3), which causes them to have the lowest scores.

[Figure 6: panels (a) and (b); horizontal axis: Models (Dyn, 2Way, 3Way); legend: Fault 1 to Fault 4 in (a), Fault 1 to Fault 3 in (b).]

Figure 6. Multi-vary chart for the performance scores, stratified by class of BPM methods and type of process fault: (a) SEMIEX process and (b) PENSIM process. Higher scores indicate better performance.

Based on the aforementioned results, and from the standpoint of speed of detection, 2-way and dynamic methods should be considered as first choices to monitor processes similar to those presented in this study. It is also verified that IV synchronization tends to lead to higher scores. However, using unsynchronized data is still a valid approach, since the lower scores of these variants are caused by a higher number of ties in the AUCs and not by a later detection of the faults.

4.2 Aggregation level: intra-methods

In order to have a clearer picture of the characteristics driving each class of BPM methods, in the following subsections we conduct a detailed intra-method analysis of the results.

4.2.1 2-way methods

The performance of the 2-way methods can be affected by the use of different infilling approaches, window sizes in the definition of the control limits, and synchronization approaches. Thus, in order to better assess the contribution of these elements, Figure 7 presents the multi-vary charts for the scores stratified by infilling and window size, while Figure 8 presents the scores stratified by infilling and synchronization approach.

For the SEMIEX case study, the synchronization approach is the major contributor to the variability in performance (see Figure 8 (a)). Furthermore, the best results are obtained with IV synchronization, especially when coupled with the ZD or MD infillings. Regarding the window sizes used for the control limits, Figure 7 (a) shows that a window size of one observation (WS1) leads to the best performances and that larger windows have a negative impact.

As for the PENSIM process, the major source of variability in detection speed is the infilling approach. In this regard, the variants based on MD infilling generally have higher scores (see Figures 7 (b) and 8 (b)). On average, CD infilling also produces results similar to those of MD infilling. However, when the synchronization approach is also taken into account (see Figure 8 (b)), it becomes clear that MD infilling without synchronization or with DTW synchronization leads to considerably higher scores. As for the window size, it is again observed that the detection performance decreases as the number of observations used to set the control limits increases.

These results show that none of the variants has a generally superior performance in both case studies. Nevertheless, within the 2-way methods, the variants using MD infilling without synchronization or with IV synchronization are a good compromise between the two case studies.
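The window sizes compared above (WS1, WS3, WS5) determine which NOC values of the Q statistic are pooled when setting the control limit at time k. The sketch below illustrates this pooling under assumptions that are not taken from the article: the limit is an empirical quantile (nearest-rank definition) of the pooled values, and the window is truncated at the batch boundaries. The authors' limits may instead rely on a parametric approximation.

```python
import math


def windowed_control_limits(q_noc, half_width, alpha=0.01):
    """Empirical (1 - alpha) control limit of the Q statistic at each time k.

    q_noc[b][k]: Q (SPE) value of NOC batch b at time k.
    half_width:  0, 1 or 2, corresponding to WS1, WS3 and WS5.
    """
    n_times = len(q_noc[0])
    limits = []
    for k in range(n_times):
        lo = max(0, k - half_width)               # truncate at batch start
        hi = min(n_times, k + half_width + 1)     # truncate at batch end
        pooled = sorted(q for batch in q_noc for q in batch[lo:hi])
        # Nearest-rank index of the empirical (1 - alpha) quantile.
        rank = max(0, math.ceil((1 - alpha) * len(pooled)) - 1)
        limits.append(pooled[rank])
    return limits


# Tiny hypothetical example: 3 NOC batches, 3 time points.
q_noc = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
print(windowed_control_limits(q_noc, half_width=0, alpha=0.0))  # WS1
print(windowed_control_limits(q_noc, half_width=1, alpha=0.0))  # WS3
```

When the Q statistic trends over time, a wider window pools neighbouring values and tends to give looser limits at time k, which is one plausible reading of why larger windows slowed detection in this study.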

[Figure 7: panels (a) and (b); horizontal axis: Infilling (ZD, CD, MD); legend: WS1, WS3, WS5.]

Figure 7. Multi-vary chart for the performance scores, stratified by class of infilling methods and window sizes: (a) SEMIEX process and (b) PENSIM process. Higher scores indicate better performance.

[Figure 8: panels (a) and (b); horizontal axis: Infilling (ZD, CD, MD); legend: NONE, IV, DTW.]

Figure 8. Multi-vary chart for the performance scores, stratified by class of infilling methods and synchronization: (a) SEMIEX process and (b) PENSIM process. Higher scores indicate better performance.

4.2.2 3-way methods

Regarding the 3-way methods, Figures 9 and 10 show that the two case studies present distinct results. For the SEMIEX process, performance degrades when moving from ZD to CD to MD infilling, while for the PENSIM process the best performance is attained with MD, followed by ZD and CD. Likewise, for the synchronization approaches, there is no discernible trend in the SEMIEX process, while in the PENSIM process IV synchronization is clearly superior. In the SEMIEX process, the best conditions are observed for PARAFAC with ZD infilling and DTW synchronization, and for Tucker3 with ZD infilling and either no synchronization or IV synchronization. As for the PENSIM process, the best approach is found to be PARAFAC with MD infilling and IV synchronization. Nevertheless, even in these variants, the 3-way methods tend to detect the faults much later than all other methods, which causes them to have lower scores. The only occasional exceptions to this general behavior occur for fault 1 of the SEMIEX process (for four Tucker3 variants and two PARAFAC variants) and for fault 3 of the PENSIM process (for one PARAFAC variant).

[Figure 9: panels (a) and (b); horizontal axis: Models (PARAFAC, Tucker3); legend: ZD, CD, MD.]

Figure 9. Multi-vary chart for the performance scores, stratified by the different 3-way models and infilling options: (a) SEMIEX process and (b) PENSIM process. Higher scores indicate better performance.

[Figure 10: panels (a) and (b); horizontal axis: Models (PARAFAC, Tucker3); legend: NONE, IV, DTW.]

Figure 10. Multi-vary chart for the performance scores, stratified by the different 3-way models and synchronization approach: (a) SEMIEX process and (b) PENSIM process. Higher scores indicate better performance.

4.2.3 Dynamic methods

The dynamic methods vary in performance due to the use of different normalization approaches for the retained scores (“ORIG” or “NORM”). These effects are depicted in Figure 11. From this figure, it is noticeable that the ARPCA variants have higher scores, regardless of the type of normalization, while BDPCA improves when the DPCA-DR variant is used. However, while DPCA-DR can lead to a faster detection of faults, in some cases it is not capable of sustaining an out-of-control signal, which has a negative impact on the method’s detection strength (this was verified in the first part of the article¹²). As for the synchronization approaches, Figure 12 reveals that no synchronization and IV synchronization lead to the best results.
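The “NORM” option discussed above, in which the scores are normalized by their time-dependent mean and covariance before computing T², can be sketched as follows. This is a generic illustration that assumes the NOC scores are available as a three-dimensional array and estimates the mean and covariance separately at each time point; the function name and array layout are hypothetical, and the sketch does not reproduce the authors' leave-one-out estimation.

```python
import numpy as np


def t2_norm(scores_noc, scores_new):
    """T2 statistic with time-dependent normalization of the scores.

    scores_noc: array (n_batches, n_times, n_components) from NOC batches.
    scores_new: array (n_times, n_components) for the monitored batch.
    """
    n_times = scores_noc.shape[1]
    t2 = np.empty(n_times)
    for k in range(n_times):
        noc_k = scores_noc[:, k, :]
        mean_k = noc_k.mean(axis=0)
        # Sample covariance of the scores at time k (kept 2-D for one component).
        cov_k = np.atleast_2d(np.cov(noc_k, rowvar=False))
        diff = scores_new[k] - mean_k
        t2[k] = diff @ np.linalg.solve(cov_k, diff)
    return t2


# Hypothetical NOC scores: 50 batches, 5 time points, 2 retained components.
rng = np.random.default_rng(0)
noc = rng.normal(size=(50, 5, 2)) * np.array([1.0, 3.0])
print(t2_norm(noc, noc[0]))
```

Because each time point is referred to its own mean and covariance, the statistic has a comparable scale throughout the batch, which is what allows a constant control limit. Without this step (the “ORIG” option), the limit has to follow the changing score distribution over time.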

[Figure 11: panels (a) and (b); horizontal axis: Models (ARPCA, BDPCA); legend: ORIG, NORM, DPCA-DR.]

Figure 11. Multi-vary chart for the performance scores, stratified by type of dynamic methods and normalization approach: (a) SEMIEX process and (b) PENSIM process. Higher scores indicate better performance.

[Figure 12: panels (a) and (b); vertical axis: Mean Score; horizontal axis: Models (ARPCA, BDPCA, DPCA-DR); legend: NONE, IV, DTW.]

Figure 12. Multi-vary chart for the performance scores, stratified by type of dynamic methods and synchronization: (a) SEMIEX process and (b) PENSIM process. Higher scores indicate better performance.

5 Discussion

The CAM framework proposed in the first part of this article¹² is here applied, for illustration purposes, to the analysis of the detection speed of a wide variety of BPM methods (60 variants overall). As in the case of detection strength studied earlier, it is again observed that there is no universal solution for a fast detection of faults and that the speed of detection is highly dependent on the type of process and fault. Therefore, these results highlight that, before implementing any BPM method, its building elements should be carefully selected and analyzed on a case-by-case basis. To this end, we propose the CAM framework as a systematic and rigorous comparison procedure that can be applied to any type of process, fault and monitoring procedure.


To exemplify the analysis conducted through the CAM framework, three classes of BPM methods (2-way, 3-way and dynamic methods) were considered, as well as their multiple variants based on different infillings of future observations (ZD, CD and MD), synchronization approaches (no synchronization, IV and DTW) and control limits (window sizes in the 2-way methods; normalization approaches in the dynamic methods). As a result, 60 BPM variants were studied. Subsequently, these variants were used to monitor two distinct simulated batch processes: the SEMIEX process and the PENSIM process.

Among the BPM methods considered, the 2-way methods are the most demanding, since they have to model the entire batch, which requires a large number of parameters. Furthermore, in the case of on-line monitoring, this class of methods also requires the infilling of future observations. Despite these disadvantages, the 2-way methods are still among the methods leading to the fastest detection of faults. More specifically, in the SEMIEX process, some of the 2-way variants rank as the second fastest approaches (after some of the dynamic variants), while in the PENSIM process they are among the best methods. Regarding their building elements, MD infilling and WS1 lead to the best results. However, the best synchronization approach is not the same in both case studies: IV synchronization is preferable for the SEMIEX process and no synchronization for the PENSIM process.

Conversely, the 3-way methods presented the worst performances in most scenarios, being only capable of detecting faults 1 and 3 of the SEMIEX process and fault 3 of the PENSIM process. Furthermore, even in these scenarios only a few variants have a detection speed comparable to that of the 2-way and dynamic methods. In the SEMIEX process the 3-way methods do show some detection capability, but their detection is often slower. On the other hand, for the PENSIM process, the 3-way methods were mostly unable to detect the faults, apart from fault 3. Furthermore, in fault 3, the MD infilling (for both Tucker3 and PARAFAC and all synchronization approaches) produced a large number of false alarms, which makes these variants unreliable.

The dynamic methods showed consistent results in both case studies and also presented the highest scores in both cases. This is mostly due to the ARPCA variants, which were capable of promptly detecting most faults. Furthermore, the normalization options and synchronization approaches were shown to have a small impact on the performance of ARPCA. Regarding BDPCA, the DPCA-DR variant can enhance detection speed, putting it in line with ARPCA. The major drawback of ARPCA and DPCA-DR is that, while they are capable of detecting the initial stages of the faults, there are cases in which their monitoring statistics return to an in-control state while the fault persists. Therefore, in such scenarios the detection strength of the dynamic approaches is suboptimal, as observed in the first part of this study.¹²

As a final illustration of the overall speed of detection of each variant, we ranked the BPM variants based on the average AUC across all faults. This representation is given in Figure 13 for the SEMIEX process and in Figure 14 for the PENSIM process. Figure 13 shows that the multiple variants of ARPCA dominate the detection speed, as they provide a rapid detection in all faults. As mentioned before, this fast detection is independent of the normalization and synchronization approaches. ARPCA is then closely followed by DPCA-DR and afterwards by the 2-way methods with IV synchronization. Regarding the PENSIM process in Figure 14, the best performance is attained by 2-way methods with MD infilling, followed by ARPCA with IV synchronization. In both case studies it is also observed that 3-way methods tend to have substantially slower detection speeds.

[Figure 13: vertical axis: Average AUC; horizontal axis: BPM variants in ranked order, as labeled in the figure: ARPCA-ORIG-IV, ARPCA-ORIG-NONE, ARPCA-NORM-NONE, ARPCA-NORM-IV, ARPCA-ORIG-DTW, ARPCA-NORM-DTW, DPCA-DR-IV, DPCA-DR-NONE, DPCA-DR-DTW, MPCA-MD-WS0-IV, MPCA-ZD-WS0-IV, MPCA-ZD-WS1-IV, MPCA-CD-WS0-IV, BDPCA-ORIG-IV, MPCA-MD-WS0-NONE, MPCA-MD-WS1-IV, MPCA-ZD-WS2-IV, MPCA-ZD-WS0-NONE, MPCA-CD-WS1-IV, MPCA-ZD-WS1-NONE, BDPCA-ORIG-NONE, MPCA-CD-WS0-NONE, MPCA-MD-WS0-DTW, MPCA-MD-WS1-NONE, MPCA-CD-WS1-NONE, MPCA-CD-WS2-IV, MPCA-ZD-WS0-DTW, MPCA-MD-WS2-IV, MPCA-ZD-WS1-DTW, MPCA-MD-WS1-DTW, BDPCA-NORM-IV, BDPCA-ORIG-DTW, MPCA-ZD-WS2-DTW, MPCA-ZD-WS2-NONE, BDPCA-NORM-NONE, MPCA-CD-WS2-NONE, BDPCA-NORM-DTW, MPCA-CD-WS0-DTW, MPCA-MD-WS2-DTW, MPCA-MD-WS2-NONE, MPCA-CD-WS1-DTW, MPCA-CD-WS2-DTW, PARAFAC-ZD-DTW, Tucker3-CD-IV, PARAFAC-CD-DTW, Tucker3-CD-DTW, Tucker3-CD-NONE, PARAFAC-ZD-IV, Tucker3-MD-IV, PARAFAC-CD-NONE, Tucker3-ZD-NONE, PARAFAC-MD-IV, Tucker3-MD-NONE, Tucker3-ZD-DTW, Tucker3-ZD-IV, PARAFAC-MD-DTW, Tucker3-MD-DTW, PARAFAC-MD-NONE, PARAFAC-CD-IV, PARAFAC-ZD-NONE.]

Figure 13. Ranking of the BPM variants based on the average AUC over the 4 faults of the SEMIEX process. Lower values represent a faster detection of faults. The horizontal lines represent the thresholds for the 10 % and 20 % of variants with the fastest detection speeds.
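The ranking shown in Figures 13 and 14 amounts to averaging each variant's AUC over the faults of a process and sorting in ascending order. The sketch below illustrates this with made-up AUC values; the function names and the rule used to select the fastest 10 % or 20 % of variants (a simple floor of the fraction times the number of variants) are assumptions for illustration only.

```python
import math


def rank_by_average_auc(auc_by_fault):
    """Sort variants by their mean AUC across faults (fastest first)."""
    averages = {name: sum(vals) / len(vals) for name, vals in auc_by_fault.items()}
    return sorted(averages.items(), key=lambda item: item[1])


def top_fraction(ranking, fraction):
    """Return the fastest `fraction` of the ranked variants (at least one)."""
    n_keep = max(1, math.floor(fraction * len(ranking) + 1e-9))
    return ranking[:n_keep]


# Hypothetical AUCs for five variants over two faults (not data from the study).
auc_by_fault = {
    "A": [0.2, 0.4],
    "B": [0.1, 0.1],
    "C": [1.0, 3.0],
    "D": [0.5, 0.5],
    "E": [2.0, 3.0],
}
ranking = rank_by_average_auc(auc_by_fault)
print([name for name, _ in ranking])
print([name for name, _ in top_fraction(ranking, 0.4)])
```

With 60 variants, the same rule keeps 6 variants for the 10 % threshold and 12 for the 20 % threshold.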

[Figure 14: vertical axis: Average AUC; horizontal axis: BPM variants in ranked order, as labeled in the figure: MPCA-MD-WS0-NONE, MPCA-MD-WS1-NONE, DPCA-DR-DTW, MPCA-MD-WS0-DTW, MPCA-MD-WS2-NONE, MPCA-MD-WS1-DTW, MPCA-MD-WS2-DTW, MPCA-CD-WS1-NONE, MPCA-CD-WS1-DTW, ARPCA-ORIG-IV, ARPCA-NORM-IV, DPCA-DR-IV, MPCA-CD-WS0-NONE, ARPCA-ORIG-DTW, MPCA-CD-WS0-DTW, ARPCA-NORM-DTW, MPCA-CD-WS0-IV, MPCA-CD-WS1-IV, MPCA-CD-WS2-IV, ARPCA-NORM-NONE, ARPCA-ORIG-NONE, MPCA-CD-WS2-NONE, MPCA-CD-WS2-DTW, DPCA-DR-NONE, PARAFAC-MD-IV, MPCA-ZD-WS1-DTW, PARAFAC-CD-IV, MPCA-ZD-WS1-NONE, MPCA-ZD-WS0-DTW, MPCA-ZD-WS2-DTW, BDPCA-ORIG-IV, BDPCA-NORM-IV, MPCA-ZD-WS0-IV, MPCA-ZD-WS0-NONE, MPCA-MD-WS0-IV, Tucker3-MD-IV, MPCA-ZD-WS2-NONE, MPCA-MD-WS1-IV, MPCA-MD-WS2-IV, Tucker3-CD-IV, PARAFAC-ZD-IV, MPCA-ZD-WS1-IV, MPCA-ZD-WS2-IV, PARAFAC-CD-DTW, BDPCA-NORM-DTW, BDPCA-ORIG-DTW, PARAFAC-CD-NONE, BDPCA-ORIG-NONE, PARAFAC-ZD-DTW, Tucker3-MD-DTW, PARAFAC-MD-DTW, BDPCA-NORM-NONE, Tucker3-MD-NONE, PARAFAC-ZD-NONE, Tucker3-ZD-IV, PARAFAC-MD-NONE, Tucker3-ZD-DTW, Tucker3-CD-DTW, Tucker3-ZD-NONE, Tucker3-CD-NONE.]

Figure 14. Ranking of the BPM variants based on the average AUC over the 3 faults of the PENSIM process. Lower values represent a faster detection of faults. The horizontal lines represent the thresholds for the 10 % and 20 % of variants with the fastest detection speeds.

Overall, dynamic and 2-way methods proved to be the most consistent approaches for a fast detection of faults in both case studies. Therefore, they are recommended as an initial choice when building an on-line BPM scheme. However, it should be noted that this recommendation is tied to the characteristics of the processes considered and that different results may be found for other processes. Furthermore, we highlight that the major contribution of this article is not the relative performance of each variant, but rather the comparison methodology used to evaluate the methods and the useful insights it provides for selecting the most suitable BPM method for a specific process.

6 Conclusions

In this article, the comparison and assessment methodology (CAM) is applied to the analysis of the detection speed of on-line batch process monitoring methods. To perform such a comparison, the CAM framework uses a set of figures of merit that effectively track the time delay between the onset of a fault and its detection (measured by the CED), as well as the number of false alarms incurred (measured by the PFA). Based on this methodology, the CED versus PFA curve is then summarized by the area under the curve (AUC). Afterwards, a Key Performance Indicator (KPI) is computed in order to assess the relative performance of each BPM method and its variants when subject to different synchronization, infilling and normalization schemes. The results are then analyzed at different levels of aggregation in order to determine the main effects driving the performance of the studied BPM methods.


To illustrate the applicability of the CAM framework, two simulated batch processes (SEMIEX and PENSIM) were considered and subjected to multiple faults. The overall results suggest that dynamic and 2-way methods have the fastest detection speeds. The 2-way methods showed better results when coupled with MD infilling and control limits based on one observation (WS1). The best synchronization approach was, however, case dependent. As for the dynamic methods, ARPCA was considerably better than its counterparts and rapidly detected the faults regardless of the synchronization or normalization scheme. Nevertheless, it is worth noting that faster detections were obtained with IV synchronization. It was also observed that 3-way methods are generally unable to detect most of the faults considered. Furthermore, in the cases where the 3-way methods did detect the faults, their detection speed was usually slower than that of the 2-way or dynamic methods. Finally, it should be noted that detection strength and detection speed measure two complementary characteristics and that a good score in one metric does not ensure good performance in the other. This is observed, for instance, in the dynamic methods, which can detect faults in their early stages (good detection speed) but whose monitoring statistics in some cases return to in-control levels shortly afterwards (poor detection strength). Therefore, both metrics should be considered when selecting or developing BPM methods in order to obtain reliable detection performance.
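
The distinction between the two metrics can be made concrete with a toy example. The Python sketch below compares a monitoring statistic that alarms immediately but soon returns below the control limit with one that alarms later but stays above it. The post-onset alarm rate is used here only as a stand-in for detection strength; it is an assumption made for this illustration, as are the function names and the numerical values, and it is not the figure of merit defined in Part I.

```python
from typing import Optional, Sequence


def detection_delay(stat: Sequence[float], onset: int, limit: float) -> Optional[int]:
    """Samples elapsed between the fault onset and the first alarm after it."""
    for k in range(onset, len(stat)):
        if stat[k] > limit:
            return k - onset
    return None  # fault never detected


def post_onset_alarm_rate(stat: Sequence[float], onset: int, limit: float) -> float:
    """Fraction of post-onset samples above the limit.

    Assumption: used as a simple proxy for detection strength.
    """
    post = stat[onset:]
    return sum(value > limit for value in post) / len(post)


if __name__ == "__main__":
    onset, limit = 2, 1.0
    # Illustrative trajectories of a monitoring statistic (not simulation output).
    transient = [0.2, 0.3, 1.8, 1.4, 0.5, 0.4, 0.3, 0.2]   # early alarm, then fades
    persistent = [0.2, 0.3, 0.6, 0.8, 1.2, 1.5, 1.7, 1.9]  # later alarm, sustained
    for label, stat in (("transient", transient), ("persistent", persistent)):
        print(
            label,
            detection_delay(stat, onset, limit),
            post_onset_alarm_rate(stat, onset, limit),
        )
```

In this example the transient trajectory has the shorter delay but the lower post-onset alarm rate, which mirrors the behaviour described above for some dynamic methods.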

Acknowledgements

Marco Reis and Tiago Rato acknowledge financial support through project 016658 (references PTDC/QEQ-EPS/1323/2014, POCI-01-0145-FEDER-016658), financed by Project 3599-PPCDT (Promover a Produção Científica e Desenvolvimento Tecnológico e a Constituição de Redes Temáticas) and co-financed by the European Union’s FEDER. Ricardo Rendall acknowledges the Portuguese Foundation for Science and Technology (FCT) for financial support through the PhD grant with reference SFRH/BD/123774/2016.

References

(1) Nomikos, P.; MacGregor, J. F., Monitoring Batch Processes Using Multiway Principal Component Analysis. AIChE Journal 1994, 40, 8, 1361-1375.
(2) Nomikos, P.; MacGregor, J. F., Multivariate SPC Charts for Monitoring Batch Processes. Technometrics 1995, 37, 1, 41-59.
(3) Wold, S.; Kettaneh, N.; Friden, H.; Holmberg, A., Modelling and diagnostics of batch processes and analogous kinetic experiments. Chemometrics and Intelligent Laboratory Systems 1998, 44, 331-340.
(4) González-Martínez, J. M.; Vitale, R.; De Noord, O. E.; Ferrer, A., Effect of Synchronization on Bilinear Batch Process Modeling. Industrial & Engineering Chemistry Research 2014, 53, 4339-4351.
(5) Louwerse, D. J.; Smilde, A. K., Multivariate statistical process control of batch processes based on three-way models. Chemical Engineering Science 2000, 55, 7, 1225-1235.
(6) Meng, X.; Morris, A. J.; Martin, E. B., On-line monitoring of batch processes using a PARAFAC representation. Journal of Chemometrics 2003, 17, 1, 65-81.
(7) Chen, J.; Liu, K.-C., On-line batch process monitoring using dynamic PCA and dynamic PLS models. Chemical Engineering Science 2002, 57, 1, 63-75.
(8) Choi, S. W.; Morris, J.; Lee, I.-B., Dynamic model-based batch process monitoring. Chemical Engineering Science 2008, 63, 3, 622-636.
(9) Rato, T. J.; Blue, J.; Pinaton, J.; Reis, M. S., Translation Invariant Multiscale Energy-based PCA (TIME-PCA) for Monitoring Batch Processes in Semiconductor Manufacturing. IEEE Transactions on Automation Science and Engineering 2017, 14, 2, 894-904.
(10) Rendall, R.; Lu, B.; Castillo, I.; Chin, S.-T.; Chiang, L. H.; Reis, M. S., A Unifying and Integrated Framework for Feature Oriented Analysis of Batch Processes. Industrial & Engineering Chemistry Research 2017, 56, 30, 8590-8605.
(11) Wang, J.; He, Q. P., Multivariate statistical process monitoring based on statistics pattern analysis. Industrial & Engineering Chemistry Research 2010, 49, 7858-7869.
(12) Rato, T. J.; Rendall, R.; Gomes, V.; Chin, S.-T.; Chiang, L. H.; Saraiva, P. M.; Reis, M. S., A Systematic Methodology for Comparing Batch Process Monitoring Methods: Part I - Assessing Detection Strength. Industrial & Engineering Chemistry Research 2016, 55, 18, 5342-5358.
(13) Ge, Z.; Song, Z.; Gao, F., Review of Recent Research on Data-Based Process Monitoring. Industrial & Engineering Chemistry Research 2013, 52, 3543-3562.
(14) Chiang, L. H.; Leardi, R.; Pell, R. J.; Seasholtz, M. B., Industrial experiences with multivariate statistical analysis of batch process data. Chemometrics and Intelligent Laboratory Systems 2006, 81, 2, 109-119.
(15) Camacho, J.; Picó, J.; Ferrer, A., Bilinear modelling of batch processes. Part I: theoretical discussion. Journal of Chemometrics 2008, 22, 5, 299-308.
(16) Camacho, J.; Picó, J.; Ferrer, A., The best approaches in the on-line monitoring of batch processes based on PCA: Does the modelling structure matter? Analytica Chimica Acta 2009, 642, 1, 59-68.
(17) Rato, T. J.; Reis, M. S., On-line process monitoring using local measures of association. Part I: Detection performance. Chemometrics and Intelligent Laboratory Systems 2015, 142, 255-264.
(18) Reis, M. S.; Bakshi, B. R.; Saraiva, P. M., Multiscale statistical process control using wavelet packets. AIChE Journal 2008, 54, 9, 2366-2378.

(19) Woodall, W. H., Controversies and Contradictions in Statistical Process Control. Journal of Quality Technology 2000, 32, 4, 341-350.
(20) Montgomery, D. C., Introduction to Statistical Quality Control. 4th ed.; Wiley: New York, 2001.
(21) Ramaker, H.-J.; van Sprang, E. N.; Westerhuis, J. A.; Smilde, A. K., Fault detection properties of global, local and time evolving models for batch process monitoring. Journal of Process Control 2005, 15, 7, 799-805.
(22) Kenett, R. S.; Pollak, M., On Assessing the Performance of Sequential Procedures for Detecting a Change. Quality and Reliability Engineering International 2012, 28, 5, 500-507.
(23) Frisén, M., Methods and evaluations for surveillance in industry, business, finance, and public health. Quality and Reliability Engineering International 2011, 27, 5, 611-621.
(24) Järpe, E.; Wessman, P. Some power aspects of methods for detecting different shifts in the mean; Department of Statistics, Goteborg University, Sweden: 1999.
(25) Frisén, M., On multivariate control charts. Produção 2011, 21, 2, 235-241.
(26) Reynolds, M. R., Jr.; Stoumbos, Z. G., Should Observations Be Grouped for Effective Process Monitoring. Journal of Quality Technology 2004, 36, 4, 343-366.
(27) Nomikos, P.; MacGregor, J. F., Multivariate SPC Chart for Monitoring Batch Processes. Technometrics 1995, 37, 1, 41-59.
(28) Tucker, L. R., Some mathematical notes on three-mode factor analysis. Psychometrika 1966, 31, 3, 279-311.
(29) Louwerse, D.; Smilde, A., Multivariate statistical process control of batch processes based on three-way models. Chemical Engineering Science 2000, 55, 7, 1225-1235.
(30) Rato, T. J.; Reis, M. S., Fault detection in the Tennessee Eastman benchmark process using dynamic principal components analysis based on decorrelated residuals (DPCA-DR). Chemometrics and Intelligent Laboratory Systems 2013, 125, 101-108.
(31) Ündey, C.; Ertunç, S.; Çinar, A., Online batch/fed-batch process performance monitoring, quality prediction, and variable-contribution analysis for diagnosis. Industrial & Engineering Chemistry Research 2003, 42, 20, 4645-4658.
(32) Kassidas, A.; MacGregor, J. F.; Taylor, P. A., Synchronization of batch trajectories using dynamic time warping. AIChE Journal 1998, 44, 4, 864-875.
(33) Birol, G.; Ündey, C.; Çinar, A., A modular simulation package for fed-batch fermentation: penicillin production. Computers & Chemical Engineering 2002, 26, 11, 1553-1565.
(34) Ingham, J.; Dunn, I. J.; Heinzle, E.; Prenosil, J. E.; Snape, J. B., Chemical engineering dynamics: an introduction to modelling and computer simulation. John Wiley & Sons: 2008; Vol. 3.

Graphical Abstract

[Graphical abstract figure. Panels: Batch Process Monitoring (2-way methods, 3-way methods; 60 methods/variants); CAM, Comparison and Assessment Methodology; Simulated scenarios (SEMIEX, PENSIM; 1650 test data sets); Baseline figures of merit (Conditional Expected Delay, Probability of False Alarms); KPI: AUC (CED vs PFA).]
