Resilience-based process upset events prediction analysis for


General Research

Resilience-based process upset events prediction analysis for uncertainty management using Bayesian deep learning: application to a PVC process system Prerna Jain, Antik Chakraborty, Efstratios N. Pistikopoulos, and M. Sam Mannan Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/acs.iecr.8b01069 • Publication Date (Web): 05 Oct 2018 Downloaded from http://pubs.acs.org on October 8, 2018



Industrial & Engineering Chemistry Research

Resilience-based process upset events prediction analysis for uncertainty management using Bayesian deep learning: application to a PVC process system

Prerna Jain* (1,3,4), Antik Chakraborty (2), E.N. Pistikopoulos (3,4), M. Sam Mannan (1,3,4)
1 Mary Kay O’Connor Process Safety Center
2 Department of Statistics
3 Texas A&M Energy Institute
4 Artie McFerrin Department of Chemical Engineering
Texas A&M University, College Station, TX 77843-3122, USA
*Corresponding author email: [email protected]

Abstract
Risk assessment of process system operations involves uncertainties. In addition, process systems are complex and deteriorate gradually with time or through exposure to expected or unexpected disturbances and events. Questions such as "What is the frequency of a process upset?" and "Can we predict incidents?" are yet to be fully explored and answered. Using the Process Resilience Analysis Framework (PRAF), this work presents a resilience-based approach to managing uncertainties in order to better predict process upsets. Prior specifications on the uncertain parameters are assumed based on historical data. Established sampling and Bayesian techniques, namely Markov chain Monte Carlo (MCMC) simulation and mixture modeling, are used for posterior inference on the parameters. The application of the predictability assessment for uncertainty management is demonstrated using a Poly Vinyl Chloride (PVC) process system. Three types of uncertainties are considered: cooling medium temperature, agitator failure, and reactant charging. It is concluded that, with the use of resilience metrics data, the variance of the statistical parameters can be updated, leading to the high-probability regions of the parameter space responsible for the observed data. This helps risk assessors make more accurate and informed process risk decisions.
Keywords: uncertainty; PVC; resilience; risk analysis; process safety; Bayesian deep learning; chemical industry; PRAF

1. Introduction
Process systems are complex socio-technical systems that are susceptible to catastrophic incidents because of the process, mechanical, instrumentation, and human hazards present in the plant1. Uncertainty management in risk assessment plays a pivotal role in predicting process upsets, prioritizing safeguards to survive upsets, and minimizing emergency response time to reduce the severity of consequences. Recently, a considerable body of literature has grown up around the theme of resilience engineering applied to process systems for improved risk assessment. For example, the book by Hollnagel et al. marked the maturation of a new approach to safety management, and its chapters explored different facets of resilience2. The work of Morari and Grossmann demonstrates the simulation and optimization modeling of technical processes with respect to flexibility and operability3, 4. The Columbia disaster was analyzed using resilience engineering concepts, and it was concluded that a resilience perspective creates foresight about the changing patterns of risk before failure and harm occur5. The Resilience Analysis Grid was introduced to provide a well-defined characterization of a system that can be used to manage the system and, specifically, to develop its potential for resilient performance6. Dinh et al. proposed six resilience aspects for safety in the chemical industry7. A survey method was used to identify deficiencies related to resilience engineering by measuring seven safety culture indicators and managerial factors8. A resilience analysis framework was proposed whose implementation is encapsulated within a resilience metric incorporating absorptive, adaptive, and restorative capacities9. A mathematical programming


approach was applied to identify the most important of the integrated managerial and organizational factors10. The Process Resilience Analysis Framework used in the current study has the following characteristics: integrated, systems-based, quantitative, data-driven, dynamic, uncertainty management, and cost-effective11. This novel framework consists of three phases covering the whole anatomy of a process incident: avoidance, survival, and recovery. The avoidance phase deals with the objective of predicting process upsets. The major contribution of the proposed approach is the establishment of an accurate and integrated method to predict process upset situations by means of an MCMC formulation backed by process resilience concepts. To demonstrate the application of the methodology for uncertainty management in predictability assessment, batch reactor operation has been selected as the principal example. Table 1 presents statistical data on the prime causes of batch reactor incidents12, 13, 14. Previous research has found that polymerization is one of the leading processes contributing to thermal runaway reactions14. Therefore, the example of a PVC suspension polymerization batch process is chosen.

Table 1: Major causes of batch reactor incidents

Incident cause | 1962-1984 (%) | 1962-1987 (%) | 1986-1990 (%) | 1988-2013 (%)
Thermo-reaction chemistry | 21.4 | 20.1 | 14.8 | -
Raw material quality | 7.9 | 8.9 | 9.803 | 10
Maintenance factors | 22.3 | 21.3 | 22.2 | 13.3
Temperature control | 22.2 | 18.9 | 13.9 | -
Loss of agitation | 9.5 | 10.1 | 13.1 | 13
Mischarging of reactants | 16.7 | 20.7 | 26.2 | 16.7

As the key purpose of this work is to establish a methodology for uncertainty management in predictability assessment, it does not cover the other steps in detail. The terminology and primary concepts related to PRAF are summarized in Section 2. Section 3 provides a brief overview of the methodology for predictability assessment. Section 4 introduces the PVC example, and Section 5 presents the application of the methodology with detailed algorithms. The results of this work are discussed in Section 6 and the conclusions are given in Section 7.

2. Process Resilience Analysis Framework (PRAF)
The Process Resilience Analysis Framework is a novel system-based framework developed for process systems to improve the management of risk and process safety11. PRAF has been developed with the following objectives: early detection of unsafe domains of operation, assessment of aggregate risks and prioritization of safeguards during process upsets, and reduction in response time. Accomplishing these objectives would result in a reduced frequency of process upsets, reduced consequences, and enhanced recovery. The framework has three phases (avoidance, survival, and recovery) and four aspects of process resilience (early detection, error tolerant design, recoverability, and plasticity15), as represented in Figure 1 below. As PRAF is a quantitative and data-driven approach, 24 resilience metrics have been identified for the four process resilience aspects. The data based on these metrics, together with their weights, have been used in the case study. This results in two significant improvements: integration of social metrics in a quantified model, and use of data to improve accuracy, leading to a more meaningful analysis.


Figure 1: Process Resilience Analysis Framework

3. Predictability assessment methodology
In recent years, the application of resilience engineering concepts has been widely accepted as an important tool for improved risk assessment. However, most models and studies are based on qualitative approaches, on purely technical or purely human/organizational analysis carried out in isolation from each other, and/or on consideration of resilience principles only after a major catastrophe has happened. There have been research efforts in process systems resilience modeling; however, such modeling has not been used to predict process upsets in a comprehensive manner. A few studies in the literature also cover the analysis of a management layer in hazard identification. Some examples are Accimap16, System-theoretic process analysis (STPA)17, Blended Hazid (BLHAZID)18, Dynamic Procedure for Atypical Scenarios Identification (DyPASI)19, and Resilience-based Integrated Process Systems Hazard Analysis (RIPSHA)20, 21. However, these methods are limited in their application by their qualitative nature or lack of common terminology, and are hence suitable only for screening purposes. Some work has been done on modeling early warning signals or early fault detection in the chemical industry22, 23. One of the major observations made by many researchers in the area over the past decade is that incidents continue to happen: advancements in risk assessment methods have taken place, but something more is needed11. The method proposed and demonstrated in this manuscript is intended to overcome issues such as the limited historical database, scenarios missed in HAZOP, and the separate, non-integrated analysis of social aspects. This paper builds on the idea of using plant data for the prediction of process upsets. The major contribution of the proposed approach is the establishment of an accurate and integrated method to predict process upset situations by virtue of Bayesian deep learning backed by process resilience concepts.
The overall predictability assessment methodology is presented in Figure 2.


Figure 2: Predictability assessment

This methodology has three main steps: 1) scenario analysis, 2) Bayesian analysis, and 3) dynamic simulation and optimization. Additionally, it has two side steps: assigning weights to the resilience metrics and conducting congruence analysis for social factors. This paper covers the first two steps, scenario analysis and Bayesian analysis. The scenario analysis consists of identifying the triggers or events that may cause the system to transition from the normal operations state to the process upset state, as illustrated in Figures 1 and 6. In the Bayesian analysis step, the posterior values of the statistical parameters are inferred from the prior, the resilience metrics information (plant performance data), and the congruence analysis (social aspects quantification) results. Incorporating the relevant resilience metrics information enables the use of the various data collected in the plant and, more importantly, makes the analysis more accurate because it is based on the performance of the process system or plant under study. Some of the resilience metrics are social metrics, and where such social metrics influence one another it is important to quantify them before including them in the model. A method called congruence analysis is used to achieve this24. For example, the mutual impact of social metrics such as procedures revised and updated and trainings completed on procedures is combined with observations such as shift handover communication violations to obtain a socio-technical congruence metric (Cst). This metric is then utilized in the assessment. Figure 3 illustrates the flow diagram of the procedural steps taken to calculate the posterior distribution of the parameters using Bayes' theorem. The authors apply the developed methodology to a case study on a Poly Vinyl Chloride (PVC) process system. Step 3 of this methodology is the dynamic simulation and optimization.
This consists of sensitivity analysis using global methods, followed by optimization for maximum profitability. Global sensitivity analysis (GSA) is a tool used to identify the significant model parameters and their ranges with regard to the model output. For example, for the runaway reaction scenario, the impact of three uncertainties (or model parameters) would be studied on the reactor


temperature (model output). Then, optimization is employed to assess the maximum profitability of the system under process and safety constraints. The primary objective of the predictability assessment is to predict process upsets so as to avoid the propagation of an event to the catastrophic incident state. This research aims to:
• integrate the social factors analysis in a single approach, rather than having a separate human risk analysis, while covering any scenarios missed in HAZOP studies;
• move from the traditional point values for the occurrence of loss of containment events (used in QRAs) to a range, similar to Probabilistic Risk Assessment (PRA) methods; and
• manage the uncertainties in the limited historical database on frequencies or failure rates by using resilience metrics, thus incorporating the performance data of the plant under study.

Figure 3: Bayesian analysis for uncertainty management

3.1. Bayesian methods
The two principal formal statistical methodologies are the frequentist and the Bayesian. The Bayesian methodology uses Bayes’ theorem. It is based on the idea that there might be prior information (knowledge or belief) about the distribution of a parameter value before a sample of observations is taken. The Bayesian methodology provides a way to update this prior information about the model parameters using sample information. Generally, the prior information is summarized in the form of a probability rule called the prior distribution of the model parameters. The posterior distribution of the parameters is proportional to the product of the likelihood and the prior distribution25. Recently, safety assessment methods based on Bayesian analysis have been extensively developed and used in the chemical process industry. It has been recognized that techniques such as hierarchical Bayesian analysis and Bayesian networks are effective in overcoming a limitation of conventional techniques such as fault trees, namely that they are not suited to dynamic safety analysis26. The paper by Lee et al. summarizes some recent progress in the process systems engineering field, such as Bayesian Q learning, Bayesian-adaptive Markov Decision Processes, and Bayesian reinforcement learning27. Examples of work on Bayesian network methods include a Bayesian network method for vulnerability assessment of chemical facilities to intentional attacks28 and a Bayesian network model extended to a dynamic Bayesian network based on Dynamic Operational Risk Assessment29. Additionally, other application methods have been developed, such as a Bayesian-LOPA methodology for the LNG industry30 and a new algorithm using Bayes’ rule for the diagnosis of known, multiple, and unknown faults31.
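As a minimal illustration of the update rule described in this section (posterior proportional to likelihood times prior), the sketch below uses a conjugate Beta prior with a binomial likelihood. The function names and the numbers (a Beta(2, 18) prior and 3 failures in 50 demands) are hypothetical and are not taken from the manuscript.

```python
def beta_binomial_update(a, b, failures, trials):
    """Posterior Beta parameters after observing `failures` in `trials` demands.

    Prior: p ~ Beta(a, b). Likelihood: failures ~ Binomial(trials, p).
    Posterior: p | data ~ Beta(a + failures, b + trials - failures).
    """
    if not 0 <= failures <= trials:
        raise ValueError("failures must lie between 0 and trials")
    return a + failures, b + (trials - failures)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical numbers, chosen only for illustration.
prior_a, prior_b = 2.0, 18.0
post_a, post_b = beta_binomial_update(prior_a, prior_b, failures=3, trials=50)

print("prior mean:", beta_mean(prior_a, prior_b))
print("posterior mean:", beta_mean(post_a, post_b))
```

In this conjugate case the posterior mean lies between the prior mean and the observed failure fraction, and it moves toward the observed fraction as more data are collected.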

3.2. Motivating example
A systematic description of the predictability assessment for uncertainty management is provided below using a motivating example of pump failure. This example demonstrates how actual plant performance data, in the form of operations and resilience metrics data, can be incorporated to estimate a failure probability, instead of relying on a failure probability taken from a limited incident database.

Problem statement: To estimate $p$, the probability of a pump failure that can lead to a process upset, given the pump operating characteristics and information on the resilience metrics. A further aim is to provide an adequate quantification of the uncertainty in the estimation procedure. From a Bayesian perspective, this can be done by providing credible intervals based on the posterior distribution (the Bayesian analogue of a confidence interval, which in this case provides a range of values in which the probability of pump failure lies).

Step 1: Resilience metrics and congruence analysis data and function
Resilience metric variables $(X_1, X_2, \ldots, X_k)$, based on metrics collected in the plant, are considered along with their weights. Let $\nu = f(X_1, X_2, \ldots, X_k)$ be an index of the current state of the resilience metrics. For instance, one can consider the weighted sum $\nu = \sum_{i=1}^{k} w_i X_i$, with weights determined from the analysis of the survey results32.

Step 2: Appropriate model and simulation data
Variables $(Y_1, Y_2, \ldots, Y_m)$ associated with the pump's operating characteristics recorded in the plant are considered in this step. The authors believe that the joint distribution $g(Y_1, Y_2, \ldots, Y_m)$ carries information about the pump's functioning. Let $\theta$ be the parameter characterizing the joint distribution $g(Y_1, Y_2, \ldots, Y_m)$; $\theta$ can have dimension greater than 1. The distribution is expected to differ between two circumstances: when the pump is working properly and when it is not.
The difference is determined by the resilience metrics, which capture the actual plant performance data, including social (human and organizational) aspects. The following joint density summarizes the statistical formulation of the notions described above:

$$g(Y_1, \ldots, Y_m) \sim p\, g(Y_1, \ldots, Y_m; \theta, \nu(\cdot)) + (1 - p)\, g(Y_1, \ldots, Y_m; \theta),$$

which is a mixture density with two components. The joint density $g(\cdot)$ can be formulated so as to capture the dependence among $Y_1, Y_2, \ldots, Y_m$. For example, when $m = 2$ and $Y_2$ depends on $Y_1$, $g(Y_1, Y_2)$ may be defined as $g_1(Y_1)\, g_2(Y_2 \mid Y_1)$. If they are independent, then $g(Y_1, Y_2) = g_1(Y_1)\, g_2(Y_2)$. Here $Y_1$ and $Y_2$ could be the current load and the unplanned maintenance, respectively.
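The weighted index of Step 1 and the two-component mixture of Step 2 can be sketched for a single variable (m = 1) as follows. This is only an illustration under stated assumptions: the manuscript does not specify the form of g or how ν enters it, so the sketch assumes a Normal density for g and assumes that ν shifts the mean of the failure component. All names and numerical values are hypothetical.

```python
import math


def resilience_index(metrics, weights):
    """Weighted sum nu = sum_i w_i * X_i of the resilience metric values."""
    if len(metrics) != len(weights):
        raise ValueError("metrics and weights must have the same length")
    return sum(w * x for w, x in zip(weights, metrics))


def normal_pdf(y, mean, sd):
    """Density of a Normal(mean, sd^2) distribution at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_density(y, p, mean_ok, sd, nu):
    """Two-component mixture p * g(y; theta, nu) + (1 - p) * g(y; theta).

    Assumptions (not from the manuscript): g is a Normal density with
    theta = (mean_ok, sd), and nu shifts the mean of the failure component.
    """
    failing = normal_pdf(y, mean_ok + nu, sd)
    healthy = normal_pdf(y, mean_ok, sd)
    return p * failing + (1.0 - p) * healthy


# Hypothetical metric values and weights.
metrics = [4.0, 10.0, 5.0]
weights = [0.5, 0.3, 0.2]
nu = resilience_index(metrics, weights)

# Density of a current-load reading of 55 (arbitrary units) when p = 0.2.
print("nu =", nu)
print("density =", mixture_density(55.0, p=0.2, mean_ok=50.0, sd=3.0, nu=nu))
```

With a sufficiently large shift ν relative to the spread, the mixture is bimodal, which is the kind of behavior a density with two humps represents.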


Figure 4: A bimodal density of current load measurements (simulated pump failure scenario)

Figure 4 illustrates how indicators might behave when plant operations are running smoothly and performance is good, and when there are issues. The two humps in Figure 4 represent two modes from two different distributions of the current load. A standard statistical way to model bimodal densities is to consider mixture densities, as described above.

Step 3: Prior distributions of parameters
The next step is to formulate a prior distribution for the failure probability $p$. Since $p \in [0, 1]$, a natural candidate for the prior is the conjugate Beta distribution, $p \sim \text{Beta}(a, b)$, written as $\pi(p)$. The hyperparameters $a$ and $b$ can be set according to domain knowledge or previous analysis. For further detail on the choice of the prior distributions, refer to the Supporting Information. For large samples, the exact choice of the prior parameters, or of the prior distribution itself, has little influence on the final outcome. This follows from the Bernstein-von Mises theorem33, which essentially says that, asymptotically and under regularity conditions, the Bayes posterior can be approximated by a Gaussian distribution centered at the maximum likelihood estimate with the inverse Fisher information as its covariance.

Step 4: Formation of likelihood function
Suppose there are $n$ independent observations of $(Y_1, Y_2, \ldots, Y_m)$ from plant data and, for simplicity, $\theta$ and $\nu(\cdot)$ are assumed to be known. First, the likelihood of the observed data for a failure probability $p$ is formed as

$$l(p \mid \text{Data}) = \prod_{i=1}^{n} \left\{ p\, g(Y_{1i}, Y_{2i}, \ldots, Y_{mi}; \theta, \nu(\cdot)) + (1 - p)\, g(Y_{1i}, Y_{2i}, \ldots, Y_{mi}; \theta) \right\}.$$

Next, latent indicator variables $Z_1, \ldots, Z_n$ are introduced. Here $Z_i$ is 1 if $(Y_{1i}, Y_{2i}, \ldots, Y_{mi})$ is an observation from a time when the pump was not working properly and 0 otherwise. Note that $Z_i \sim \text{Bernoulli}(p)$. Although the $Z_i$ are unobserved, the likelihood from the previous step can be augmented with these latent variables as

$$l(p, Z_1, \ldots, Z_n \mid \text{Data}) = \Big\{ \prod_{i:\, Z_i = 1} g(Y_{1i}, Y_{2i}, \ldots, Y_{mi}; \theta, \nu(\cdot)) \prod_{i:\, Z_i = 0} g(Y_{1i}, Y_{2i}, \ldots, Y_{mi}; \theta) \Big\}\, p^{S} (1 - p)^{n - S},$$

where $S = \sum_{i=1}^{n} Z_i$.


Step 5: Posterior failure probability functions
Step 5.1
With the likelihood and the prior defined above, the posterior distribution is obtained as

$$\pi(p, Z_1, \ldots, Z_n \mid \text{Data}) \propto l(p, Z_1, \ldots, Z_n \mid \text{Data})\, \pi(p) \propto \Big\{ \prod_{i:\, Z_i = 1} g(Y_{1i}, Y_{2i}, \ldots, Y_{mi}; \theta, \nu(\cdot)) \prod_{i:\, Z_i = 0} g(Y_{1i}, Y_{2i}, \ldots, Y_{mi}; \theta) \Big\}\, p^{S + a - 1} (1 - p)^{b + n - S - 1}.$$

Step 5.2
The posterior distribution does not have a standard form. Thus, to aid the analysis, a standard Markov chain Monte Carlo sampling technique known as Gibbs sampling is used to obtain samples from the posterior; see the Supporting Information for more details. In the current scenario, the Gibbs sampler comprises two steps:
(a) Draw $p \sim \text{Beta}(a + S,\; b + n - S)$.
(b) Update $Z_i \sim \text{Bernoulli}(p_i)$, where $p_i$ is defined as

$$p_i = \frac{p\, g(Y_{1i}, \ldots, Y_{mi}; \theta, \nu(\cdot))}{p\, g(Y_{1i}, \ldots, Y_{mi}; \theta, \nu(\cdot)) + (1 - p)\, g(Y_{1i}, \ldots, Y_{mi}; \theta)}.$$

Starting with an initial choice of $p$ and $(Z_1, \ldots, Z_n)$, the above two steps are repeated $N$ times to obtain $N$ draws from the posterior distribution in Step 5.1.

Result
Finally, the posterior mean of $p$ is reported as an estimate of the pump failure probability, and 95% symmetric credible intervals are provided as a quantification of the uncertainty involved in the procedure. In the simulation experiments the true parameter values are known, so the validity of the proposed statistical procedure can be tested by checking whether the credible intervals contain the true parameter. Another important indicator is the length of these intervals. Because of the large sample sizes considered in the experiments, the intervals are expected to be short, leading to more precise and improved decision making.
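The two-step Gibbs sampler of Step 5.2 can be sketched as follows for a single observed variable. This is a minimal illustration, not the authors' implementation: it assumes Normal component densities with known means and a common standard deviation (standing in for the known θ and ν(·) of Step 4), a Beta(1, 1) prior, and simulated data. All names and numbers are hypothetical.

```python
import math
import random


def normal_pdf(y, mean, sd):
    """Density of a Normal(mean, sd^2) distribution at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def gibbs_failure_probability(data, mean_ok, mean_fail, sd,
                              a=1.0, b=1.0, n_iter=1500, burn_in=500, seed=0):
    """Gibbs sampler for the mixing weight p of a two-component mixture.

    Both component densities are treated as known, as in Step 4.
    Returns the posterior draws of p retained after burn-in.
    """
    rng = random.Random(seed)
    n = len(data)
    # The component densities are fixed, so evaluate them once per observation.
    g_fail = [normal_pdf(y, mean_fail, sd) for y in data]
    g_ok = [normal_pdf(y, mean_ok, sd) for y in data]

    # Arbitrary starting values for the latent indicators Z_1, ..., Z_n.
    z = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
    draws = []
    for it in range(n_iter):
        # (a) p | Z ~ Beta(a + S, b + n - S), where S counts the Z_i equal to 1.
        s = sum(z)
        p = rng.betavariate(a + s, b + n - s)
        # (b) Z_i | p ~ Bernoulli(p_i).
        for i in range(n):
            num = p * g_fail[i]
            den = num + (1.0 - p) * g_ok[i]
            p_i = num / den if den > 0.0 else p
            z[i] = 1 if rng.random() < p_i else 0
        if it >= burn_in:
            draws.append(p)
    return draws


def summarize(draws, level=0.95):
    """Posterior mean and equal-tailed credible interval from MCMC draws."""
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0
    lo = ordered[int(tail * (len(ordered) - 1))]
    hi = ordered[int((1.0 - tail) * (len(ordered) - 1))]
    return sum(ordered) / len(ordered), lo, hi


# Simulated current-load data; every number here is hypothetical.
sim = random.Random(1)
TRUE_P, MEAN_OK, MEAN_FAIL, SD = 0.2, 50.0, 65.0, 3.0
data = [sim.gauss(MEAN_FAIL if sim.random() < TRUE_P else MEAN_OK, SD)
        for _ in range(400)]

draws = gibbs_failure_probability(data, MEAN_OK, MEAN_FAIL, SD)
post_mean, ci_lo, ci_hi = summarize(draws)
print(f"posterior mean of p: {post_mean:.3f}")
print(f"95% credible interval: ({ci_lo:.3f}, {ci_hi:.3f})")
```

Because the data are simulated with a known mixing weight, the output can be compared against that value, which mirrors the validity check described in the Result paragraph. Discarding an initial burn-in portion of the chain is a common practical choice and is an addition of this sketch.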

4. Case study: Poly Vinyl Chloride suspension batch process
Poly Vinyl Chloride (PVC) is one of the most important products of the polymer industry. The three main processes used for the commercial production of PVC are suspension (providing 80% of world production), emulsion (12%), and bulk (8%). Because the physicochemical properties change during the batch, heat and mass transfer change significantly; thus, the control of reactor temperature becomes difficult34. In the PVC process presented in Figure 5, the batch reactor is the most critical piece of equipment from a process safety perspective and is therefore the one considered by the authors. For the predictability analysis, the thermal runaway reaction scenario, which is the worst-case scenario12, is studied in this example.


Figure 5: Poly Vinyl Chloride manufacturing

4.1. Motivation
As noted in the introduction of Section 4, PVC is an important product of the chemical industry. Table 2 lists PVC incidents that have occurred over several decades35, 36. The incidents have a variety of causes, including social aspects such as failure to learn from previous incidents, operator errors, and lack of implementation of the findings of hazard evaluation studies. This emphasizes the significance of incorporating social metrics in the analysis.

Table 2: PVC manufacturing incidents35, 36

Year | Location | Cause | Consequence
1961 | Japan | Operator error | Four killed, ten injured. Major structural damage.
1966 | New Jersey | Operator error | One killed. Plant destroyed.
1980 | Massachusetts | Operator error | Two injured; damage over $1 million.
1980 | California | Design error | Major structural damage.
2003 | Louisiana | Operator error | Release of 8,000 pounds of VCM
2004 | Illinois | Operator error | Release of VCM to the atmosphere
2004 | Illinois | Human factors; lack of implementation of hazard evaluation findings; lack of learning from previous incidents; emergency response | Five killed, three injured, and community evacuated
2005 | Delaware City | Operator error | VCM release of about 2,500 pounds

4.2. Resilience metrics for batch plant operations
For any quantitative analysis, metrics and data are important. With the objective of quantifying risk and resilience, 24 resilience metrics covering the three phases of PRAF were developed32. They include both technical metrics (process or equipment related parameters such as alarm rate37) and social metrics (human and organization related factors such as current and accurate procedures, training completed on schedule, and shift-handover communication). To incorporate their impact or contribution in the mathematical model, weights had to be assigned. The weights of these resilience metrics were assessed based on the analysis of the PRAF survey responses32. The survey produced categorical responses, and appropriate techniques such as ordinal alpha calculation, the Kruskal-Wallis test, and polychoric correlation assessment were used to analyze them. These metrics incorporate actual plant performance data from the historian, the Central Maintenance Management System (CMMS), the Process Safety Management (PSM) system, etc.38 Table 3 lists the resilience metrics selected for the analysis of the PVC process system, with their weights. They were selected according to the type of case under study.

Table 3: Resilience Metrics for Batch Plant Operations32

Metric | Resilience aspect | Weight
Percentage of time when reactor was operated outside design limits | Early Detection | 0.85
Alarm rate - temperature of incoming supply of cooling agent/water, current load data, agitator speed | Early Detection | 0.65
Number of unplanned maintenance | Early Detection | 0.64
Number of unplanned shutdowns | Early Detection | 0.73
Percentage of process safety required training sessions completed with skills verification | Plasticity | 0.71
Percentage of maintenance backlogs | Plasticity | 0.71
Number of process safety near-miss | Plasticity | 0.86
Number of shift handover communication violations | Plasticity | 0.72
Learning From Incident (LFI) communication | Plasticity | 0.58
Percentage of process safety action items not closed | Plasticity | 0.80
Percentage of process hazard evaluations not completed | Plasticity | 0.69
Percentage of required process safety related procedures reviewed or revised | Plasticity | 0.71


5. Methodology application
5.1. Step 1: Scenario analysis
A report published by the U.S. Chemical Safety Board reported that around 35% of incidents were caused by runaway reactions39. In the PVC process system, there are three main system states: normal operation, process-upset event, and catastrophic state, as shown in Figure 6. As noted in the literature, the rate of heat removal relative to the rate of heat generation is critical for a thermal runaway reaction40. The literature on exothermic reactions leading to thermal runaway has highlighted several causes41, 42. Some of them are thermochemistry, reactant mischarging, temperature control, inadequate agitation, maintenance, and human errors. In this example, the purpose is to predict the high temperature situation in the reactor. Hence, the initiating events selected for this study, IE1 (high cooling medium temperature), IE2 (agitator failure), IE3 (mischarging of reactants), and IE4 (unknown disruptions), are the ones that relate closely to the selection of optimal operating conditions.

Figure 6: PVC process system transition diagram

5.2. Step 2: Statistical analysis
The finite-sample performance of statistical procedures can be greatly enhanced when domain expertise is incorporated in the analysis43. In the current case study, domain knowledge and information from plant performance are utilized in the form of the prior information and the likelihood, respectively. The objective is two-pronged: to estimate the parameters in the model, and to quantify the associated uncertainty in the form of a posterior distribution. Models for three variables are demonstrated using hypothetical data to quantify uncertainty, incorporating the plant process and resilience metrics data. The metrics used in the analysis should be selected based on the following two factors:
• Relevance to the case study or scenario under study. For example, metrics such as unplanned maintenance can give an indication of the health of the agitator or pump.
• Availability of information on the metrics. Many organizations already capture information on most of the metrics, but it is important to check what information is available to conduct the analysis.

5.2.1. Cooling medium temperature analysis

Model and Prior

Figure 7 shows the detailed algorithm for cooling medium temperature analysis. Suppose there is a set of cooling medium temperatures $\{T_i\}_{i=1}^{n}$ measured on the Kelvin scale (these can be obtained from the Historian). An additive model with three components is proposed for $T_i$: a grand mean $\mu$, a deterministic known effect $\nu$, and a random effect $\alpha$. $\mu$ can be interpreted as the unknown population mean of the temperatures. The number $\nu = \nu(X) = \sum_j w_j X_j$ is a linear weighted summary of the resilience metrics $X_j$, where the weights have been determined from the results of the PRAF survey on resilience metrics (32). To accommodate further randomness due to unknown sources, a normally distributed random effect $\alpha \sim N(0, \varphi^2)$ is added. With these components, the $i$th cooling medium temperature $T_i$ is assumed to be

$$T_i = \mu + \nu(X) + \alpha + \varepsilon_i, \tag{1}$$

where $\varepsilon_i \sim N(0, \sigma^2)$ is the usual measurement randomness, independent of the random effect $\alpha$. Equation (1) can alternatively be written as $T_i^* = \mu + \varepsilon_i^*$, where $T_i^* = T_i - \nu(X)$ and $\varepsilon_i^* = \alpha + \varepsilon_i$. Clearly, $\varepsilon_i^* \sim N(0, \tau^2)$ with $\tau^2 = \sigma^2 + \varphi^2$. The model is complemented with a conjugate prior specification on the parameters $\mu$ and $\tau^2$:

$$T_i^* \mid \mu, \tau^2 = \mu + \varepsilon_i^*, \qquad \mu \sim N(\mu_0, \sigma_0^2), \qquad \tau^2 \sim \text{Inverse-Gamma}(\beta_1, \beta_2). \tag{2}$$

The hyperparameters $\mu_0$ and $\sigma_0^2$ are chosen to reflect the prior belief about $\mu$. $\beta_1$ and $\beta_2$ are chosen such that the prior distribution of $\tau^2$ remains sufficiently noninformative.


Figure 7: Algorithm: cooling medium temperature analysis

Analysis

The objective is to obtain the full posterior distribution of the parameters, which can be written as $\pi(\mu, \tau^2 \mid T, X)$. Often, such posterior distributions are not analytically tractable, and hence Markov chain Monte Carlo (MCMC) techniques are employed to sample from the posterior (43). The conjugate prior specification allows for conditional Gibbs updates of the parameters, and the MCMC sampler cycles through the following steps:

$$
\begin{aligned}
&1.\;\; \pi(\mu \mid T, X, \tau^2) \sim N\!\left( \frac{\dfrac{\sum_{i=1}^{n} T_i^*}{\tau^2} + \dfrac{\mu_0}{\sigma_0^2}}{\dfrac{n}{\tau^2} + \dfrac{1}{\sigma_0^2}},\; \frac{1}{\dfrac{n}{\tau^2} + \dfrac{1}{\sigma_0^2}} \right) \\[1ex]
&2.\;\; \pi(\tau^2 \mid T, X, \mu) \sim \text{Inverse-Gamma}\!\left( \frac{n}{2} + \beta_1,\; \frac{1}{2} \sum_{i=1}^{n} (T_i^* - \mu)^2 + \beta_2 \right)
\end{aligned}
\tag{3}
$$

In these simulation experiments, a sample size of $n = 8736$ (one year of hourly observations of temperature) is considered. The prior mean is fixed at $\mu_0 = 283.15$ K based on domain knowledge of the process, with a high prior variance, $\sigma_0^2 = 30$ K². Essentially, this choice quantifies a moderately strong belief that the process is running under normal conditions. $\beta_1$ and $\beta_2$ were both fixed at 0.001, which gives a sufficiently flat prior for $\tau^2$. Figure 8 shows the data generated from this simulation model, which are used to obtain the posterior distribution.
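The two Gibbs steps in Eq. (3) can be sketched in a few lines of code. The following is a minimal illustration in Python rather than the authors' R implementation, which is not shown in the paper. The data are simulated, and the true mean, the value of τ, and the value of ν(X) are hypothetical choices made only for this example; the prior settings are the ones quoted above.

```python
import random
import statistics

def gibbs_cooling_temperature(t_star, mu0, sigma0_sq, beta1, beta2,
                              n_iter=2000, burn_in=500, seed=0):
    """Two-step Gibbs sampler for (mu, tau^2) in T*_i = mu + eps*_i."""
    rng = random.Random(seed)
    n = len(t_star)
    t_bar = statistics.fmean(t_star)
    ss_centered = sum((t - t_bar) ** 2 for t in t_star)
    mu, tau_sq = t_bar, ss_centered / n  # crude starting values
    draws = []
    for it in range(n_iter):
        # Step 1: mu | tau^2 is normal with precision n/tau^2 + 1/sigma0^2
        precision = n / tau_sq + 1.0 / sigma0_sq
        mean = (n * t_bar / tau_sq + mu0 / sigma0_sq) / precision
        mu = rng.gauss(mean, precision ** -0.5)
        # Step 2: tau^2 | mu is Inverse-Gamma(n/2 + beta1, SS/2 + beta2),
        # drawn as scale / Gamma(shape, 1), with SS = sum (T*_i - mu)^2
        shape = n / 2.0 + beta1
        scale = 0.5 * (ss_centered + n * (t_bar - mu) ** 2) + beta2
        tau_sq = scale / rng.gammavariate(shape, 1.0)
        if it >= burn_in:
            draws.append((mu, tau_sq))
    return draws

# Hypothetical data: true mean 283.15 K, tau = 2 K, nu(X) = 1.5 K
data_rng = random.Random(42)
NU = 1.5
temps = [283.15 + NU + data_rng.gauss(0.0, 2.0) for _ in range(8736)]
t_star = [t - NU for t in temps]  # T*_i = T_i - nu(X)

draws = gibbs_cooling_temperature(t_star, mu0=283.15, sigma0_sq=30.0,
                                  beta1=0.001, beta2=0.001)
post_mu = statistics.fmean(d[0] for d in draws)
post_tau_sq = statistics.fmean(d[1] for d in draws)
print(f"posterior mean of mu: {post_mu:.2f} K, of tau^2: {post_tau_sq:.2f} K^2")
```

The sum of squares in step 2 is computed from the centered sum of squares plus n(T̄* − µ)², so each iteration costs constant time regardless of n.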

Figure 8: Histogram with density estimate of cooling medium temperatures

5.2.2. Agitator failure analysis

Model and Prior

Figure 9 illustrates the detailed algorithm for agitator failure analysis. Typical statistical techniques for modeling binary events such as agitator failure include logistic regression and probit regression. Here, the authors instead analyze underlying variables that are believed to drive agitator performance (44). Two variables are used: $I_1$, the current load in amperes, and $I_2$, the number of unplanned maintenance events. The joint distribution of $(I_1, I_2)$ is modeled as a mixture with two components: it takes the form $f_1(\cdot, \cdot)$ when the agitator is more likely to fail and the form $f_2(\cdot, \cdot)$ when the agitator is working normally. Formally, the distribution $g$ of $(I_1, I_2)$ can be written as

$$g(I_1, I_2) = p_A f_1(I_1, I_2) + (1 - p_A) f_2(I_1, I_2), \tag{4}$$

where $p_A$ denotes the agitator failure probability. Although $f_1$ and $f_2$ can be assumed to belong to different parametric families, the analysis here is restricted to the case where both belong to the same parametric family, possibly with different parameters. Since $I_1$ is a positive real number and $I_2$ is a non-negative integer, the following data-generating mechanism is assumed:

$$I_1 \sim \text{Gamma}(\alpha_1, \beta_1), \qquad I_2 \mid I_1 \sim \text{Poisson}(c_1 I_1) \tag{5}$$

when the agitator is working normally, and

$$I_1 \sim \text{Gamma}(\alpha_2 + \nu(X), \beta_2), \qquad I_2 \mid I_1 \sim \text{Poisson}(c_2 I_1) \tag{6}$$

when the agitator is not working normally. The resilience summary $\nu(X)$ determines the additional shift in the shape parameter of the distribution of $I_1$. Since $I_2$ is generated conditional on $I_1$, this shift also influences $I_2$. Here, it is assumed that $c_1, c_2 \in \mathbb{R}^+$. The objective, given the data, is to estimate $p_A$ and to quantify the associated uncertainty. To complete the model specification, a conjugate Beta$(a, b)$ prior is placed on the agitator failure probability.
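To make the data-generating mechanism in Eqs. (4) to (6) concrete, the sketch below draws hypothetical pairs (I1, I2) from the mixture in Python. Several things here are assumptions rather than statements from the paper: the second Gamma parameter is treated as a rate (as in R's `rgamma(n, shape, rate)`), the failure probability 0.005 and the value ν(X) = 0.5 are illustrative, and the Poisson sampler is a simple textbook method. The α, β, and c values are those reported in the Analysis part of this section.

```python
import math
import random

ALPHA1, BETA1, C1 = 1.0, 3.0, 2.0    # normal operation, Eq. (5)
ALPHA2, BETA2, C2 = 10.0, 3.0, 4.0   # failing agitator, Eq. (6)

def sample_poisson(rng, lam):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate_agitator(n, p_fail, nu, rng):
    """Draw n triples (I1, I2, failing) from the two-component mixture."""
    rows = []
    for _ in range(n):
        failing = rng.random() < p_fail
        shape, rate, c = ((ALPHA2 + nu, BETA2, C2) if failing
                          else (ALPHA1, BETA1, C1))
        i1 = rng.gammavariate(shape, 1.0 / rate)  # gammavariate takes a scale
        i2 = sample_poisson(rng, c * i1)
        rows.append((i1, i2, failing))
    return rows

rows = simulate_agitator(n=8736, p_fail=0.005, nu=0.5, rng=random.Random(7))
normal_i1 = [i1 for i1, _, failing in rows if not failing]
failing_i1 = [i1 for i1, _, failing in rows if failing]
print(len(failing_i1), "failing pairs out of", len(rows))
print("mean I1 (normal): ", sum(normal_i1) / len(normal_i1))
print("mean I1 (failing):", sum(failing_i1) / len(failing_i1))
```

Under these assumptions the two components separate mainly through the shape parameter: the mean of I1 is α/β, so roughly 0.33 for normal operation and 3.5 for the failing component.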

Figure 9: Algorithm: agitator failure analysis

Analysis

To facilitate the posterior computation, a binary latent variable $Z_i$ is introduced for each pair $(I_{1i}, I_{2i})$ to indicate which mixture component the pair comes from: $(I_1, I_2) \mid Z = 1 \sim f_1$ (agitator not working normally, Eq. (6)) and $(I_1, I_2) \mid Z = 0 \sim f_2$ (normal operation, Eq. (5)). Clearly, a priori $P(Z_i = 1) = p_A$ for all pairs $(I_{1i}, I_{2i})$. The following conditional Gibbs steps are used to sample from the joint posterior distribution of $(p_A, Z \mid I_1, I_2)$:

$$1.\;\; \pi(p_A \mid Z, I_1, I_2) \sim \text{Beta}\!\left( a + \sum_{i=1}^{n} Z_i,\; b + \sum_{i=1}^{n} (1 - Z_i) \right) \tag{7}$$

$$2.\;\; P(Z_i = 1 \mid p_A, I_1, I_2) = \frac{p_A\, f_\gamma(I_{1i} \mid \alpha_2 + \nu(X), \beta_2)\, f_P(I_{2i} \mid c_2 I_{1i})}{p_A\, f_\gamma(I_{1i} \mid \alpha_2 + \nu(X), \beta_2)\, f_P(I_{2i} \mid c_2 I_{1i}) + (1 - p_A)\, f_\gamma(I_{1i} \mid \alpha_1, \beta_1)\, f_P(I_{2i} \mid c_1 I_{1i})}$$

where $f_\gamma(\cdot \mid \alpha, \beta)$ denotes the Gamma probability density function with parameters $\alpha$ and $\beta$, and $f_P(\cdot \mid \lambda)$ denotes the Poisson probability mass function with parameter $\lambda$. In these simulation experiments, a sample size of $n = 8736$ (one year of hourly observations of the agitator) was used, with $\alpha_1 = 1$, $\beta_1 = 3$, $\alpha_2 = 10$, and $\beta_2 = 3$. The constants $c_1$ and $c_2$ were chosen as 2 and 4, respectively. The prior parameters $a$ and $b$ were fixed at 0.00001 and 0.009702. Figure 10 displays the values of $(I_1, I_2)$ generated from the mixture density in Eq. (4).
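The latent-variable Gibbs scheme can be sketched as follows. This is an illustrative Python version, not the authors' R code. It assumes that Z_i = 1 marks the failing component of Eq. (6), that the second Gamma parameter is a rate, and that ν(X) = 0.5 and a true failure probability of 0.005 generate the hypothetical data. The parameter values and the Beta prior are those quoted above.

```python
import math
import random

ALPHA1, BETA1, C1 = 1.0, 3.0, 2.0      # normal operation, values from the text
ALPHA2, BETA2, C2 = 10.0, 3.0, 4.0     # failing agitator, values from the text
A_PRIOR, B_PRIOR = 0.00001, 0.009702   # Beta(a, b) prior on p_A, from the text
NU, TRUE_P = 0.5, 0.005                # hypothetical nu(X) and true p_A

def sample_poisson(rng, lam):
    """Knuth's multiplication method; fine for the small rates used here."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k, prod = k + 1, prod * rng.random()
    return k

def log_gamma_pdf(x, shape, rate):
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(x) - rate * x)

def log_poisson_pmf(k, lam):
    return k * math.log(lam) - lam - math.lgamma(k + 1.0)

def simulate(n, rng):
    """Hypothetical (I1, I2) pairs from the mixture in Eqs. (4)-(6)."""
    pairs = []
    for _ in range(n):
        failing = rng.random() < TRUE_P
        shape, rate, c = ((ALPHA2 + NU, BETA2, C2) if failing
                          else (ALPHA1, BETA1, C1))
        i1 = max(rng.gammavariate(shape, 1.0 / rate), 1e-12)  # keep log(i1) finite
        pairs.append((i1, sample_poisson(rng, c * i1)))
    return pairs

def gibbs_agitator(pairs, n_iter=200, burn_in=50, seed=1):
    rng = random.Random(seed)
    # f_normal / f_fail for each pair; this ratio does not depend on p_A.
    ratio = []
    for i1, i2 in pairs:
        lf_fail = log_gamma_pdf(i1, ALPHA2 + NU, BETA2) + log_poisson_pmf(i2, C2 * i1)
        lf_norm = log_gamma_pdf(i1, ALPHA1, BETA1) + log_poisson_pmf(i2, C1 * i1)
        ratio.append(math.exp(min(lf_norm - lf_fail, 700.0)))
    p_a, draws = 0.5, []
    for it in range(n_iter):
        odds_normal = (1.0 - p_a) / p_a
        # Step 2: Z_i = 1 with prob. p_A f_fail / (p_A f_fail + (1 - p_A) f_normal)
        n_fail = sum(rng.random() < 1.0 / (1.0 + odds_normal * r) for r in ratio)
        # Step 1: p_A | Z ~ Beta(a + sum Z_i, b + sum (1 - Z_i))
        p_a = rng.betavariate(A_PRIOR + n_fail, B_PRIOR + len(pairs) - n_fail)
        p_a = min(max(p_a, 1e-12), 1.0 - 1e-12)  # guard against division by zero
        if it >= burn_in:
            draws.append(p_a)
    return draws

pairs = simulate(8736, random.Random(11))
draws = gibbs_agitator(pairs)
post_mean = sum(draws) / len(draws)
print(f"posterior mean of p_A: {post_mean:.4f}")
```

Because the likelihood ratio of the two components does not depend on p_A, it is computed once outside the loop; only the count of Z_i = 1 is needed to update p_A.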

Figure 10: Scatterplot showing the clustering of (I1, I2). Red triangles - when agitator is working in normal conditions, black triangles - otherwise.

5.2.3. Mischarging of reactants

Model and Prior

Figure 11 illustrates the detailed algorithm for reactor mischarging analysis. Two metrics are considered: $Y_1$, the percentage of process-safety-required procedures reviewed or revised as scheduled, and $Y_2$, the percentage of process-safety-required training sessions completed. Additionally, there is $C_{st}$, a socio-technical congruence metric lying in the interval [0, 1], which depends on $Y_1$ and $Y_2$. In these simulation experiments, a sample size of $n = 720$ (one year of observations for every batch) is considered. The following decomposition is used for the joint distribution of $(Y_1, Y_2, C_{st})$:

$$f(Y_1, Y_2, C_{st}) = p_R\, f(Y_1) f(Y_2) f(C_{st} \mid Y_1, Y_2) + (1 - p_R)\, f(Y_1) f(Y_2) f(C_{st} \mid Y_1, Y_2), \tag{8}$$

where $p_R$ is the probability of reactant mischarging and the conditional density $f(C_{st} \mid Y_1, Y_2)$ differs between the two components, as specified in Eq. (9). The mixture formulation can be interpreted in the same manner as in the agitator failure case. Each component of the mixture density is the product of the densities of $Y_1$ and $Y_2$, which are assumed to be independent, multiplied by the conditional distribution $f(C_{st} \mid Y_1, Y_2)$. In these simulations, $f(Y_1)$ and $f(Y_2)$ are assumed to be Uniform(0, 1). A weighted sum of $Y_1$ and $Y_2$ is used to generate the $C_{st}$ values. $C_{st}$ is formulated as

$$C_{st} = \begin{cases} \dfrac{e^{Z_1}}{1 + e^{Z_1}} & \text{(no reactant mischarging)} \\[2ex] \dfrac{e^{Z_2}}{1 + e^{Z_2}} & \text{(reactant mischarging)} \end{cases} \tag{9}$$

where $Z_1 \sim N(w_1 Y_1 + w_2 Y_2, 1)$ and $Z_2 \sim N(w_1 Y_1 + w_2 Y_2 + \nu(X), 1)$. Here $Z_1$ represents a Gaussian random summary of $(Y_1, Y_2)$ when there is no reactant mischarging.
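As an illustration of Eq. (9), the sketch below generates hypothetical batches in Python. The weights w1 and w2, the shift ν(X), and the mischarging probability are not given in the text at this point and are chosen arbitrarily here; only the Uniform(0, 1) metrics, the unit-variance normal index, and the logistic transformation come from the model description.

```python
import math
import random

W1, W2 = 0.5, 0.5      # hypothetical weights w1, w2
NU = -1.0              # hypothetical resilience shift nu(X)

def logistic(z):
    """Numerically stable e^z / (1 + e^z)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def simulate_batches(n, p_mischarge, rng):
    """Draw n tuples (Y1, Y2, Cst, mischarged)."""
    rows = []
    for _ in range(n):
        y1, y2 = rng.random(), rng.random()      # Uniform(0, 1) metrics
        mischarged = rng.random() < p_mischarge
        mean = W1 * y1 + W2 * y2 + (NU if mischarged else 0.0)
        z = rng.gauss(mean, 1.0)                 # Z2 if mischarged, else Z1
        rows.append((y1, y2, logistic(z), mischarged))
    return rows

rows = simulate_batches(n=720, p_mischarge=0.005, rng=random.Random(3))
print("batches:", len(rows), "mischarged:", sum(r[3] for r in rows))
```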


Figure 11: Algorithm: reactor mischarging analysis

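If reactant mischarging occurs, the mean of the index is shifted by an amount determined by $\nu(X)$. Finally, since $C_{st} \in [0, 1]$, a logistic transformation is used to generate $C_{st}$. The logistic transformation is not the only possible choice; for example, $C_{st}$ could be set to $P(Z \le z)$ for any cumulative distribution function $P$. However, the logistic transformation is the most widely used transformation in the analysis of variables that lie in the interval [0, 1]. As in the previous section, a Beta$(a, b)$ prior is assumed on the mischarging probability, with the prior parameters $a$ and $b$ fixed at 0.00001 and 0.9999.

Analysis

A binary latent variable $V_i$ is introduced for each observation. These latent variables play the same role as $Z$ in the agitator failure analysis; as such, $P(V_i = 1) = p_R$ a priori. The following Gibbs sampling scheme is adopted to sample from the joint posterior distribution of $(p_R, V_1, \ldots, V_n)$:

$$1.\;\; \pi(p_R \mid V, C_{st}, Y_1, Y_2) \sim \text{Beta}\!\left( a + \sum_{i=1}^{n} V_i,\; b + \sum_{i=1}^{n} (1 - V_i) \right) \tag{10}$$

$$2.\;\; P(V_i = 1 \mid p_R, C_{st}, Y_1, Y_2) = \frac{p_R\, f_U(Y_{1i}) f_U(Y_{2i})\, f_N(\operatorname{logit} C_{st,i} \mid w_1 Y_{1i} + w_2 Y_{2i} + \nu(X), 1)}{p_R\, f_U(Y_{1i}) f_U(Y_{2i})\, f_N(\operatorname{logit} C_{st,i} \mid w_1 Y_{1i} + w_2 Y_{2i} + \nu(X), 1) + (1 - p_R)\, f_U(Y_{1i}) f_U(Y_{2i})\, f_N(\operatorname{logit} C_{st,i} \mid w_1 Y_{1i} + w_2 Y_{2i}, 1)}$$

where $f_U$ denotes the Uniform(0, 1) density, $f_N(\cdot \mid m, 1)$ denotes the normal density with mean $m$ and unit variance, and $\operatorname{logit} C = \log\{C / (1 - C)\}$ inverts the logistic transformation in Eq. (9).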
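The membership probability in step 2 can be evaluated directly for a single batch. The Python sketch below assumes that V_i = 1 marks the component whose mean is shifted by ν(X), and that the normal density is evaluated at the logit of Cst, which is the inverse of the logistic transformation in Eq. (9). The function and argument names are illustrative, and the numerical inputs in the usage lines are hypothetical.

```python
import math

def normal_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def logit(c):
    """Inverse of the logistic transformation, for 0 < c < 1."""
    return math.log(c / (1.0 - c))

def prob_mischarged(c_st, y1, y2, p_r, w1, w2, nu):
    """P(V_i = 1 | ...), with V_i = 1 meaning the nu-shifted component."""
    z = logit(c_st)
    base = w1 * y1 + w2 * y2
    f_shifted = normal_pdf(z, base + nu)   # mischarging: mean shifted by nu(X)
    f_plain = normal_pdf(z, base)          # no mischarging
    # The Uniform(0, 1) densities of Y1 and Y2 equal 1 and cancel.
    num = p_r * f_shifted
    return num / (num + (1.0 - p_r) * f_plain)

# Hypothetical inputs: with a negative shift, a low Cst is stronger evidence of mischarging.
p_low = prob_mischarged(0.2, 0.8, 0.8, p_r=0.005, w1=0.5, w2=0.5, nu=-1.0)
p_high = prob_mischarged(0.8, 0.8, 0.8, p_r=0.005, w1=0.5, w2=0.5, nu=-1.0)
print(p_low > p_high)
```

When ν(X) = 0 the two components coincide and the function returns the prior probability p_R, which is a quick sanity check on the formula.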


Figure 12: Scatterplot of Z1 and Z2. Red crosses indicate values of Z1 and black triangles indicate values of Z2.

Since the distributions of Y1 and Y2 are the same for the two mixture components, the values of Z1 and Z2 are plotted in Figure 12 to give better insight into what typical data generated from the above model look like.

6. Results

The results for the three variables (cooling medium temperature, agitator failure, and reactor mischarging) for six different choices of the true mean or true probability are presented in Table 4. The true data-generating parameters (µ0, p0A, p0R) are reported, together with the posterior mean estimates obtained from the model and analysis described in Section 5. Additionally, the posterior standard deviations and the 95% symmetric credible intervals of all the parameters, computed as the 0.025 and 0.975 quantiles of the posterior samples, are provided. All simulations and analyses were carried out in the programming language R (45). To summarize, the proposed method provides a formal statistical way to incorporate important resilience indicators that may influence plant operations. The simulation experiments demonstrate how these methods lead to accurate inference. A more interesting exercise would be to apply the proposed method to data from chemical plant operations, which is beyond the scope of this article.
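The posterior summaries described above (mean, standard deviation, and the 0.025 and 0.975 quantiles) can be computed from a list of posterior draws as in the following Python sketch. The quantile helper uses linear interpolation between order statistics, which is the default rule of R's `quantile()`; whether the authors used that default is an assumption. The input used here is a placeholder sequence, not output from the paper's models.

```python
import math
import statistics

def quantile(sorted_values, p):
    """Linear-interpolation quantile of an already sorted list."""
    h = (len(sorted_values) - 1) * p
    lo = math.floor(h)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (h - lo) * (sorted_values[hi] - sorted_values[lo])

def summarize(draws):
    """Posterior mean, sd, 95% symmetric credible interval, and its length."""
    ordered = sorted(draws)
    lower, upper = quantile(ordered, 0.025), quantile(ordered, 0.975)
    return {
        "mean": statistics.fmean(ordered),
        "sd": statistics.stdev(ordered),
        "interval": (lower, upper),
        "length": upper - lower,
    }

# Placeholder "draws": the values 0, 1, ..., 100
summary = summarize([float(i) for i in range(101)])
print(summary)
```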


Table 4: Uncertainty analysis: simulation results

High cooling medium temperature (prior mean: 278.15 K; prior variance: 30 K²)

| Case | True mean temperature (µ0), K | Posterior mean (µ), K | Posterior sd | Length of credible interval |
|---|---|---|---|---|
| 1 | 287.15 | 285.66 | 2.758799 | 7.11 |
| 2 | 285.15 | 284.25 | 2.230892 | 7.05 |
| 3 | 283.15 | 282.58 | 1.909877 | 7.36 |
| 4 | 281.15 | 280.84 | 1.777232 | 7.94 |
| 5 | 279.15 | 279.06 | 1.735566 | 9.24 |
| 6 | 277.15 | 277.25 | 1.744221 | 11.68 |

Agitator failure (prior mean probability: 0.00103; prior variance: 0.001)

| Case | True probability of agitator failure (p0A) | Posterior mean (p̂0A) | Posterior sd | Length of credible interval |
|---|---|---|---|---|
| 1 | 0.004 | 0.00400 | 0.000840 | 0.0035 |
| 2 | 0.005 | 0.00490 | 0.000917 | 0.0040 |
| 3 | 0.006 | 0.00600 | 0.001000 | 0.0044 |
| 4 | 0.007 | 0.00730 | 0.001096 | 0.0046 |
| 5 | 0.008 | 0.00790 | 0.001132 | 0.0050 |
| 6 | 0.009 | 0.00900 | 0.001207 | 0.0053 |

Mischarging of reactants (prior mean probability: 0.001; prior variance: 0.00091)

| Case | True probability of mischarging of reactants (p0R) | Posterior mean (p̂0R) | Posterior sd | Length of credible interval |
|---|---|---|---|---|
| 1 | 0.006 | 0.00620 | 0.002081 | 0.0037 |
| 2 | 0.005 | 0.00560 | 0.001951 | 0.0046 |
| 3 | 0.004 | 0.00420 | 0.001696 | 0.0060 |
| 4 | 0.003 | 0.00350 | 0.001549 | 0.0065 |
| 5 | 0.002 | 0.00210 | 0.001199 | 0.0076 |
| 6 | 0.001 | 0.00140 | 0.000988 | 0.0081 |
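The prior mean and variance listed for agitator failure in Table 4 can be checked against the Beta(a, b) prior stated in Section 5.2.2 using the standard moment formulas of the Beta distribution. The short Python sketch below does this arithmetic; the function name is illustrative.

```python
def beta_mean_variance(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, variance

# Beta prior on the agitator failure probability: a = 0.00001, b = 0.009702
prior_mean, prior_var = beta_mean_variance(0.00001, 0.009702)
print(f"{prior_mean:.5f} {prior_var:.4f}")  # prints "0.00103 0.0010"
```

These values agree with the prior mean (0.00103) and prior variance (0.001) reported for agitator failure in Table 4.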


6.1. Cooling medium temperature

Simulation results for different choices of the true mean µ are summarized in Figure 13. The posterior distribution is centered around the true data-generating µ with a very small spread. Posterior credible intervals were also formed to gain further insight into the uncertainty associated with the estimation procedure. The coverage probability was 0.96.
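The text does not describe how the coverage probability was obtained. One common way, assumed here purely for illustration, is to simulate many data sets from a known true mean, form a 95% credible interval for each, and report the fraction of intervals that contain the true value. The Python sketch below uses a simplified stand-in for the paper's model: the variance τ² is treated as known, so the posterior of µ is available in closed form (step 1 of Eq. (3)) and no MCMC is needed. The sample size, number of replications, and τ are hypothetical.

```python
import random

def posterior_normal_known_var(data, mu0, sigma0_sq, tau_sq):
    """Posterior mean and variance of mu for known tau^2 (step 1 of Eq. (3))."""
    n = len(data)
    precision = n / tau_sq + 1.0 / sigma0_sq
    mean = (sum(data) / tau_sq + mu0 / sigma0_sq) / precision
    return mean, 1.0 / precision

def coverage(true_mu, tau, n, n_rep, mu0, sigma0_sq, seed=0):
    """Fraction of 95% credible intervals that contain true_mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        data = [rng.gauss(true_mu, tau) for _ in range(n)]
        mean, var = posterior_normal_known_var(data, mu0, sigma0_sq, tau ** 2)
        half = 1.959964 * var ** 0.5  # 0.975 quantile of the standard normal
        hits += (mean - half) <= true_mu <= (mean + half)
    return hits / n_rep

cov = coverage(true_mu=283.15, tau=2.0, n=200, n_rep=500,
               mu0=283.15, sigma0_sq=30.0)
print(f"estimated coverage: {cov:.3f}")
```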

Figure 13: Cooling medium temperature: prior (black) and posterior (blue) distribution

A comparative analysis is conducted for different choices of resilience metrics leading to three values of ν(X). The results are reported in Table 5. The posterior means show how the estimate varies with the selection of the resilience metrics. For cases II and III, the posterior mean values differ substantially from the true mean values. This implies that domain knowledge of the underlying variables, and hence the choice of resilience metrics, is critical for a robust analysis.

Table 5: Comparative analysis with different choices of resilience factors: cooling medium temperature

High cooling medium temperature (prior mean: 278.15 K; prior variance: 30 K²)

| True mean temperature (µ0), K | Posterior mean (µ), case I | Posterior mean (µ), case II | Posterior mean (µ), case III |
|---|---|---|---|
| 277.15 | 277.25 | 272.08 | 271.32 |
| 279.15 | 279.06 | 273.71 | 272.87 |
| 281.15 | 280.84 | 275.45 | 274.58 |
| 283.15 | 282.58 | 277.24 | 276.37 |
| 285.15 | 284.25 | 279.05 | 278.15 |
| 287.15 | 285.66 | 280.85 | 279.94 |
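A short worked step based on step 1 of the Gibbs sampler in Eq. (3) suggests why a different choice of resilience metrics moves the posterior mean. This is a sketch that holds τ² fixed; the symbol δ (a hypothetical change in ν(X)), the weight w, and the sample means T̄ and T̄* are shorthand introduced here, not notation from the paper.

```latex
\begin{aligned}
E[\mu \mid T, X, \tau^2] &= w\,\bar{T}^* + (1 - w)\,\mu_0,
  \qquad w = \frac{n/\tau^2}{n/\tau^2 + 1/\sigma_0^2}, \\
\bar{T}^* &= \frac{1}{n}\sum_{i=1}^{n} T_i - \nu(X) = \bar{T} - \nu(X), \\
\nu(X) \to \nu(X) + \delta
  &\;\Longrightarrow\;
  E[\mu \mid T, X, \tau^2] \to E[\mu \mid T, X, \tau^2] - w\,\delta .
\end{aligned}
```

With n = 8736 and σ0² = 30, the weight w is very close to 1 for any moderate τ² (for example, w ≈ 0.99998 for a hypothetical τ² = 4), so the conditional posterior mean moves by almost the full amount δ.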

6.2. Agitator failure

The results are summarized with boxplots of the prior and posterior samples of the agitator failure probability, shown in Figure 14. Here too, the posterior distribution is concentrated around the true value with a very small spread. As in the temperature case, posterior credible intervals were formed, and the coverage probability was 0.97.


Figure 14: Boxplot of prior and posterior samples of agitator failure probability

6.3. Reactor mischarging

The results are summarized with boxplots of the prior and posterior samples of the reactor mischarging probability, shown in Figure 15. Here too, the posterior distribution is concentrated around the true value with a very small spread. As in the previous cases, posterior credible intervals were formed, and the coverage probability was 0.95.


Figure 15: Boxplot of prior and posterior samples of reactor mischarging probability

7. Conclusions

With changing technology and increasing regulatory standards in the process industries, process safety management and risk analysis have become challenging. The risk assessment of process system operations involves uncertainties due to exposure to expected or unexpected disturbances. In this work, a resilience-based approach to managing uncertainties for the predictability assessment of process-upset conditions has been presented, using plant performance data and advanced statistical methods. The use of resilience metrics data in quantifying uncertainties for the prediction of the process-upset event of high reactor temperature in the PVC process system has been demonstrated. The statistical models developed for the three uncertainties (cooling medium temperature, agitator failure, and reactant mischarging) are robust. The two main conclusions from the case study results are a reduction in the variance of the statistical parameters and information on their accurate values. The major contribution of the proposed approach is the establishment of an accurate and integrated method to predict process-upset situations by virtue of an MCMC formulation backed by process resilience concepts. Most existing risk assessment methods use failure rates drawn from historical databases, which do not represent the true picture of the process system for which the risk assessment is being conducted. The proposed approach would enable risk assessors to make risk decisions based on information from their own process plant and hence better allocate resources to improvement areas. Implementation of the developed models can (i) integrate the analysis of social factors into a single approach, (ii) move from the traditional point values for the occurrence of loss-of-containment events (used in QRAs) to a range, similar to Probabilistic Risk Assessment (PRA) methods, and (iii) manage the uncertainties in the limited historical database on frequencies or failure rates by using resilience metrics and thus incorporating the performance data of the plant under study. This may subsequently increase the safety, reliability, efficiency, and profitability of the process system. The proposed method is easy to understand and to implement in a real process system. Some of the initial challenges that a team could face are the identification of process operations data, the choice of resilience metrics, the data collection format, and data preprocessing. Overall, this study strengthens the idea that the PRAF approach incorporates both technical and social aspects, which would help the risk assessor make informed process risk decisions. A further study will incorporate the results of this work into the dynamic simulation and robust optimization step to complete the overall predictability evaluation. In addition, further studies to validate the models from this work with a real dataset would be worthwhile.

Supporting Information

This information is available free of charge via the Internet at http://pubs.acs.org/. Statistical methods: Bayes’ theorem; conjugate priors; Gibbs sampling.

Acknowledgments

The Mary Kay O’Connor Process Safety Center sponsored this research with partial financial support from the Texas A&M Energy Institute. We would like to express our gratitude to Mr. Albert Halphen for his valuable insights.

References

(1) Khan, F. I.; Abbasi, S., Techniques and methodologies for risk analysis in chemical process industries. Journal of Loss Prevention in the Process Industries 1998, 11 (4), 261-277.
(2) Hollnagel, E.; Woods, D. D.; Leveson, N., Resilience engineering: Concepts and precepts. Ashgate Publishing, Ltd.: 2007.
(3) Morari, M., Design of resilient processing plants—III: A general framework for the assessment of dynamic resilience. Chemical Engineering Science 1983, 38 (11), 1881-1891.
(4) Grossmann, I. E.; Morari, M., Operability, resiliency, and flexibility: Process design objectives for a changing world. 1983.
(5) Woods, D. D., Creating foresight: How resilience engineering can transform NASA’s approach to risky decision making. Work 2003, 4 (2), 137-144.
(6) Hollnagel, E., RAG-The resilience analysis grid. Resilience engineering in practice: a guidebook. Ashgate Publishing Limited, Farnham, Surrey 2011, 275-296.
(7) Dinh, L. T.; Pasman, H.; Gao, X.; Mannan, M. S., Resilience engineering of industrial processes: principles and contributing factors. Journal of Loss Prevention in the Process Industries 2012, 25 (2), 233-241.
(8) Shirali, G.; Motamedzade, M.; Mohammadfam, I.; Ebrahimipour, V.; Moghimbeigi, A., Challenges in building resilience engineering (RE) and adaptive capacity: A field study in a chemical plant. Process Safety and Environmental Protection 2012, 90 (2), 83-90.
(9) Francis, R.; Bekera, B., A metric and frameworks for resilience analysis of engineered and infrastructure systems. Reliability Engineering & System Safety 2014, 121, 90-103.
(10) Azadeh, A.; Salehi, V.; Mirzayi, M.; Roudi, E., Combinatorial optimization of resilience engineering and organizational factors in a gas refinery by a unique mathematical programming approach. Human Factors and Ergonomics in Manufacturing & Service Industries 2017, 27 (1), 53-65.
(11) Jain, P.; Pasman, H. J.; Waldram, S.; Pistikopoulos, E.; Mannan, M. S., Process Resilience Analysis Framework (PRAF): A systems approach for improved risk and safety management. Journal of Loss Prevention in the Process Industries 2017.
(12) Westerterp, K.; Molga, E., Safety and runaway prevention in batch and semibatch reactors—a review. Chemical Engineering Research and Design 2006, 84 (7), 543-552.
(13) Verwijs, J. W., Reactor start-up and safeguarding in industrial chemical processes. 1994.
(14) Saada, R.; Patel, D.; Saha, B., Causes and consequences of thermal runaway incidents—will they ever be avoided? Process Safety and Environmental Protection 2015, 97, 109-115.
(15) Jain, P.; Pasman, H. J.; Waldram, S. P.; Rogers, W. J.; Mannan, M. S., Did we learn about risk control since Seveso? Yes, we surely did, but is it enough? An historical brief and problem analysis. Journal of Loss Prevention in the Process Industries 2016.
(16) Rasmussen, J., Risk management in a dynamic society: a modelling problem. Safety Science 1997, 27 (2-3), 183-213.
(17) Leveson, N., A new accident model for engineering safer systems. Safety Science 2004, 42 (4), 237-270.
(18) Seligmann, B. J.; Németh, E.; Hangos, K. M.; Cameron, I. T., A blended hazard identification methodology to support process diagnosis. Journal of Loss Prevention in the Process Industries 2012, 25 (4), 746-759.
(19) Paltrinieri, N.; Tugnoli, A.; Buston, J.; Wardman, M.; Cozzani, V., Dynamic procedure for atypical scenarios identification (DyPASI): a new systematic HAZID tool. Journal of Loss Prevention in the Process Industries 2013, 26 (4), 683-695.
(20) Jain, P.; Rogers, W. J.; Pasman, H. J.; Keim, K. K.; Mannan, M. S., A Resilience-based Integrated Process Systems Hazard Analysis (RIPSHA) approach: Part I plant system layer. Process Safety and Environmental Protection 2018, 116, 92-105.
(21) Jain, P.; Rogers, W. J.; Pasman, H. J.; Mannan, M. S., A resilience-based integrated process systems hazard analysis (RIPSHA) approach: Part II management system layer. Process Safety and Environmental Protection 2018, 118, 115-124.
(22) Pariyani, A.; Seider, W.; Oktem, U.; Soroush, M., Improving process safety and product quality using large databases. Computer Aided Chemical Engineering 2010, 28, 175-180.
(23) Pariyani, A.; Seider, W. D.; Oktem, U. G.; Soroush, M., Dynamic risk analysis using alarm databases to improve process safety and product quality: Part I—Data compaction. AIChE Journal 2012, 58 (3), 812-825.
(24) Cataldo, M.; Herbsleb, J. D.; Carley, K. M. In Socio-technical congruence: a framework for assessing the impact of technical and work dependencies on software development productivity, Proceedings of the Second ACM-IEEE international symposium on Empirical software engineering and measurement, ACM: 2008; pp 2-11.
(25) Bolstad, W. M.; Curran, J. M., Introduction to Bayesian statistics. John Wiley & Sons: 2016.
(26) Khakzad, N.; Yu, H.; Paltrinieri, N.; Khan, F., Reactive approaches of probability update based on Bayesian methods. In Dynamic Risk Analysis in the Chemical and Petroleum Industry, Elsevier: 2017; pp 51-61.
(27) Lee, J. H.; Shin, J.; Realff, M. J., Machine learning: Overview of the recent progresses and implications for the process systems engineering field. Computers & Chemical Engineering 2017.
(28) Argenti, F.; Landucci, G.; Reniers, G.; Cozzani, V., Vulnerability assessment of chemical facilities to intentional attacks based on Bayesian Network. Reliability Engineering & System Safety 2018, 169, 515-530.
(29) Barua, S.; Gao, X.; Pasman, H.; Mannan, M. S., Bayesian network based dynamic operational risk assessment. Journal of Loss Prevention in the Process Industries 2016, 41, 399-410.
(30) Yun, G.; Rogers, W. J.; Mannan, M. S., Risk assessment of LNG importation terminals using the Bayesian–LOPA methodology. Journal of Loss Prevention in the Process Industries 2009, 22 (1), 91-96.
(31) Chiang, L. H.; Jiang, B.; Zhu, X.; Huang, D.; Braatz, R. D., Diagnosis of multiple and unknown faults using the causal map and multivariate statistics. Journal of Process Control 2015, 28, 27-39.
(32) Jain, P.; Mentzer, R.; Mannan, M. S., Resilience metrics for improved process-risk decision making: Survey, analysis and application. Safety Science 2018, 108, 13-28.
(33) Freedman, D., Wald lecture: On the Bernstein-von Mises theorem with infinite-dimensional parameters. The Annals of Statistics 1999, 27 (4), 1119-1141.
(34) Nagy, Z.; Agachi, Ş., Model predictive control of a PVC batch reactor. Computers & Chemical Engineering 1997, 21 (6), 571-591.
(35) Nass, L. I.; Heiberger, C. A., Encyclopedia of PVC, vol. 1. Marcel Dekker Inc., New York and Basel 1976, 271, 271.
(36) CSB Investigation Report, Vinyl Chloride Monomer Explosion; 2004-10-I-IL; Chemical Safety Board: 2007.
(37) Goel, P.; Datta, A.; Mannan, M. S., Industrial alarm systems: Challenges and opportunities. Journal of Loss Prevention in the Process Industries 2017, 50, 23-36.
(38) Goel, P.; Datta, A.; Mannan, M. S. In Application of big data analytics in process safety and risk management, Big Data (Big Data), 2017 IEEE International Conference on, IEEE: 2017; pp 1143-1152.
(39) CSB Incident data: Reactive hazard investigation; Chemical Safety Board: 2003.
(40) Nolan, P. F.; Barton, J. A., Some lessons from thermal-runaway incidents. Journal of Hazardous Materials 1987, 14 (2), 233-239.
(41) Barton, J.; Nolan, P., Incidents in the chemical industry due to thermal runaway chemical reactions. Hazards X: Process Safety in Fine and Speciality Chemical Plants 1989, (115), 3-18.
(42) CCPS (Center for Chemical Process Safety), Layer of protection analysis: simplified process risk assessment. Wiley: 2011.
(43) Robert, C., The Bayesian choice: from decision-theoretic foundations to computational implementation. Springer Science & Business Media: 2007.
(44) Nelder, J. A.; Baker, R. J., Generalized linear models. Wiley Online Library: 1972.
(45) R Development Core Team, R: A language and environment for statistical computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing; 2014. 2015.
