
Article

Penalized Maximum Likelihood Estimation - A Reliable Statistical Method for Hydrate Nucleation Data Analysis
Thor Martin Svartaas, Wei Ke, Sergei Tantciura, and Aina Undersrud Bratland
Energy Fuels, Just Accepted Manuscript • DOI: 10.1021/acs.energyfuels.5b02056 • Publication Date (Web): 16 Nov 2015
Downloaded from http://pubs.acs.org on November 24, 2015

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Energy & Fuels is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Energy & Fuels

Maximum Likelihood Estimation - A Reliable Statistical Method for Hydrate Nucleation Data Analysis

Thor M. Svartaas¹*, Wei Ke¹, Sergei Tantciura¹‡, Aina U. Bratland¹

¹ Department of Petroleum Engineering, Faculty of Science and Technology, University of Stavanger, 4036 Stavanger, Norway

‡ Present affiliation: Lukoil, Arkhangelsk, Russia

Received: / Accepted: / Published:

Abstract

Analysis of hydrate nucleation, and of nucleation processes in general, is assumed to require a large number of parallels because of the stochastic nature of the process. Such stochastic processes are described by an exponential probability distribution. The present work adopts Maximum Likelihood Estimation (MLE) with penalized estimators for data analysis in gas hydrate nucleation studies. The key parameters analyzed and discussed are the nucleation rate and the lag time during formation of

* Corresponding author: Tel.: +4751832285; fax: +4751832050. Email address: [email protected]


ACS Paragon Plus Environment


critical size nuclei. The MLE technique is commonly used in the analysis of other types of stochastic processes but has not, to our knowledge, previously been applied to hydrate nucleation. A total of six data sets, three on methane-propane hydrate nucleation from the present work and three on general crystal nucleation extracted from the literature, were analyzed using MLE. The MLE analysis was compared with the conventional method of fitting the Experimental Probability Array to the exponential Probability Distribution Function (EPA-PDF). The results indicated that the MLE method outperforms the EPA-PDF method for data analysis of stochastic nucleation processes in both consistency and reliability. The dependence of reliable nucleation rate estimates on the number of parallels was examined. Permutation tests were incorporated to ensure unbiased comparison from a statistical point of view. The MLE method proved to be consistent and robust, and less dependent on a large number of parallels for reliable estimation of nucleation rates. MLE analysis of the present hydrate nucleation data indicated that 20 parallels are most probably sufficient for this method to maintain reliable estimation of the hydrate nucleation rate with statistical consistency. Literature data on other crystallization processes supported a range of 20 to 25 parallels as sufficient.

Keywords: gas hydrates; nucleation; probability distribution; number of parallels; statistical method


Introduction

Gas hydrates are a family of crystalline, ice-like, non-stoichiometric clathrate compounds. They are formed of hydrogen-bonded water molecules as hosts and a variety of gas molecules as guests. Gas hydrates are usually stable at relatively high pressures and low temperatures. The pressure and temperature range of the hydrate stability zone depends on the type and size of the gas molecules involved. Hydrate-forming gas molecules are usually small (< 10 Å) [1] and include hydrocarbon gases such as methane, ethane, propane and iso-butane, as well as inorganic gases such as nitrogen, carbon dioxide, hydrogen, oxygen, etc. [2] Gas hydrates from hydrocarbon gases may form at temperatures well above the freezing point of water and may thus cause problems during oil and gas production [3, 4]. Over the last few decades, natural gas hydrates formed from methane, ethane, propane, carbon dioxide and hydrogen sulfide, and mixtures of these, have been systematically studied and reviewed [1, 2, 5-11]. Traditionally, hydrate formation in gas pipelines is prevented by adding large quantities of antifreezes such as methanol (MeOH) or monoethylene glycol (MEG) to the water phase [12-14]. In multiphase pipelines and at harsh conditions of low temperatures, high pressures and high water cuts, the cost of MeOH or MEG for hydrate prevention may be too high. This led to the search for and development of low dosage hydrate inhibitors (LDHIs) [15, 16]. LDHIs are divided into two groups: anti-agglomerants (AAs) and kinetic hydrate inhibitors (KHIs). KHIs, for instance, may prevent or delay nucleation and reduce growth rates, allowing operation within the hydrate stability region for a limited period [17, 18]. Research on gas hydrates and crystallization in general has targeted the three main stages of phase transitions: nucleation [2, 19], growth [2, 19, 20], and dissociation [2, 21]. Numerous experimental


investigations [2, 10, 22-29] as well as modeling and simulation work [6, 26, 30-33] have been performed to understand the thermodynamics and kinetics behind each stage. The behavior of hydrate growth and dissociation is relatively easy to monitor in real time and to study on a macroscopic scale [28, 29, 34-39]. In contrast, hydrate nucleation is a stochastic process. Gas-water clusters of various sizes form and shrink in the metastable region until the free energy barrier is overcome and stationary nucleation commences [2, 40]. The nucleation mechanism is not yet fully understood. During nucleation, assembly and destruction of water lattices with encaged gas molecules occur at a molecular level. At such a scale, molecular dynamics (MD) simulations could offer useful insight into the nucleation process. Several research groups have conducted MD studies on hydrate nucleation [32, 41, 42] and visualized possible mechanisms and interactions at the molecular level. MD has also been used to study possible interactions between gas hydrates and KHIs during nucleation and growth in attempts to disclose mechanisms [43-45]. The nucleation rate at a given PT condition is a function of the energy barrier (i.e. the activation energy) at the critical-sized nuclei [19] and is thus an important parameter describing the process. Yuhara et al. [30] claimed that it is difficult to study hydrate nucleation by MD simulations at "normal" temperature and pressure conditions because of the relatively high free energy barrier. They conducted studies on methane hydrate nucleation at 255 K and 50 MPa. According to Mullin [19], the energy barrier and the critical nucleus size both decrease with decreasing temperature. Using MD to estimate nucleation rates in real systems may thus not be straightforward. In experimental studies, laborious efforts have been made to obtain abundant high-quality induction time data for analytical deduction of the stationary nucleation rate [24, 46-50]. Despite monitoring and measurement challenges, hydrate nucleation has been widely studied from various points of view. The literature on hydrate nucleation includes the following topics: hydrate structure on guest


components (organic and inorganic) and compositions [51, 52]; driving force expressions fitting general and specific nucleation systems [53-55]; KHIs [56, 57] and antifreeze proteins (AFPs) [58-60]; and the use of lab-scale reactors (batch [22, 61], semi-batch [62, 63], continuous reactor [64]) and types of devices (autoclave [25, 65-68], DSC [58, 69], HP-ALTA [47, 48], rocking cell [70, 71], etc.). Depending on the type of experimental device used and the procedures applied in different hydrate systems, two major experimental approaches have been widely adopted in crystallization studies. One is to obtain isothermal induction time distributions at constant degrees of subcooling or supersaturation [24, 25, 48, 70, 72-74]; the other is to observe the temperature at spontaneous nucleation during continuous cooling or cooling ramps [24, 50, 66-68, 70, 74, 75]. Nucleation data collected by either method represent the stochastic nature of nucleation processes well. Nucleation temperatures during constant cooling experiments [76, 77] may appear less stochastic than induction times measured during experiments at constant temperature [78, 79]. This could be the effect of an energy barrier that decreases with decreasing temperature [19], and hence of an increased nucleation rate and a decreased critical nucleus size at the point of spontaneous nucleation. Given a set of nucleation data from hydrate nucleation or other crystallization processes, several analytical methods may apply. Zachariassen et al. [80] studied ice-nucleating agents in freeze-tolerant insects. They proposed the concept of the supercooling point (SCP, also known as the freezing temperature) to describe the nucleation behavior quantitatively. Barlow and Haymet presented an automated lag-time apparatus (ALTA) [50] and studied isothermal ice nucleation through more than 350 parallel experiments in a system containing AgI as nucleator. They analyzed the data through decay curves showing the fraction of samples not nucleated within given time intervals and concluded that 300 to 500 parallels would be required using this experimental


method. Thus, the isothermal method appears impractical for studying the effect of, e.g., KHIs on the hydrate nucleation process. Wilson et al. [77] used the ALTA technique to study the SCP during constant cooling experiments and collected statistical nucleation data while the same water sample was repeatedly frozen and thawed more than a hundred times. They recorded the individual induction time and subcooling temperature for each run and demonstrated through lag time histograms ("Manhattans") that the nucleation process appeared less stochastic using this method [77]. From the Manhattans, the fraction of unfrozen samples versus the degree of subcooling could be plotted to produce a survival curve from which the SCP distribution could be determined [76, 77]. Wilson et al. [77] defined the supercooling temperature as the temperature at which the sample had frozen in half of the experimental runs. They applied this analytical method both to water freezing to ice [77] and to tetrahydrofuran (THF) solution forming hydrate / ice [76]. In constant cooling experiments the induction time, $t_i$, is a function of the cooling gradient, $dT/dt$, and the degree of subcooling, $\Delta T$, at the nucleation point (SCP), and is given by $t_i = \Delta T/(dT/dt)$ [66]. Thus the induction time and the SCP are interrelated in this type of experiment, and both must be considered in the analysis of nucleation rates. For single-component crystallization at constant cooling, Kashchiev et al. [81] have proposed a progressive nucleation rate expression. This expression has been applied in methane hydrate studies in attempts to estimate nucleation rates in constant cooling processes [66-68]. Kvamme et al. [82] have applied the phase field theory of nuclei to carbon dioxide hydrate, where the stationary nucleation rate could be related to the temperature and the formation work of nuclei. Other analytical methods may also apply for specific nucleation systems. For instance, Goh et al. [83] have linked the statistics of nucleation in droplet-based microfluidic systems with varying degrees of supersaturation, where the induction time and the nucleation probability distribution could be described analytically. Considering the range and convenience of


applicability, the EPA-PDF method, which originated from Toschev's work [84] on electrolytic nucleation of mercury and cadmium on platinum, is probably one of the most commonly used in crystallization studies. The method arranges the nucleation data into an Experimental Probability Distribution Function (EPDF) / Experimental Probability Array (EPA) and fits this probability array / probability function to the theoretical Probability Distribution Function (PDF) describing the behavior of stochastic processes. In the present work, we use the abbreviation EPA for the experimental probability and PDF for the theoretical probability. Through several hundred parallel experiments, Toschev et al. [84] showed that the probability of nucleation followed an exponential distribution function $P(t) = 1 - \exp(-Jt)$, where $J$ is the stationary rate of nucleation when $t \to \infty$. Various studies on hydrate or general crystallization processes have applied this methodology for data analysis. For instance, Kulkarni et al. [24] have applied the method to analyze both induction times and metastable zone widths of isonicotinamide (INA) nucleation in ethanol. Abay and co-workers [25, 73, 85, 86] have applied the EPA-PDF method to both methane and multi-component natural gas hydrate formation data, in the presence or absence of KHIs. Takeya et al. [87] have applied EPA-PDF to carbon dioxide hydrate nucleation data analysis. Kulkarni et al. [24] have demonstrated that the EPA-PDF method is equally applicable to nucleation data collected under isothermal conditions or with cooling / cooling ramps, though the equations for curve fitting are slightly different. However, the huge number of parallels required to obtain a representative description of the EPA is a drawback and makes the method time consuming. As a well-accepted statistical method for non-linear data modeling and prediction in modern statistics, Maximum Likelihood Estimation (MLE) is one of the great contributions by the British


statistician and mathematician Sir Ronald Aylmer Fisher (R. A. Fisher) [88-90]. For several types of statistical estimation, one may not be able to conduct a large number of measurements on a certain variable because of limits on time or cost. Hydrate nucleation experiments at constant temperature may be a good example of such a case. For a given statistical model, MLE estimates the mean and deviation of the model parameters by assigning the observed results an optimal probability. With the probability density function and the cumulative distribution function, MLE proves to be a unified and capable approach for statistical estimation of exponential distributions [91-93]. MLE has been used for establishing a wide range of statistical models with broad applications. Millar [94] has demonstrated examples of MLE applications including numerical optimization, population, survival analysis, profile likelihood, power transformation, etc. Later improvements by Sarhan [95] and Cohen and Helm [96] have made MLE an even more powerful statistical tool, especially for exponential distributions. Epstein [97] and Zheng [98] have demonstrated that MLE also has the statistical capability to handle truncated and incomplete / censored data sets. In hydrate nucleation experiments, censored data may arise if a set time limit for an experimental run is exceeded. As a result, MLE has been adopted in several fields such as communication systems [99], transport networks [100], econometrics [101], genetics [102, 103], neuroscience and cognition [104], time-delay of arrival (TDOA) in acoustic or electromagnetic detection [105], and magnetic resonance imaging [106, 107]. It has also been successfully used on stochastic processes such as radioactive decay [108] and prediction of the next extreme snowfall [109]. In the present work, hydrate nucleation experiments have been conducted on a binary methane-propane mixture and distilled water in a high-pressure autoclave under isothermal conditions. In addition, isothermal nucleation data have been extracted from the literature [24, 49, 50] within acceptable


accuracy. This generated a total of six datasets consisting of 30 to 309 parallel isothermal induction times for analysis of nucleation rate and lag time. Each set of parallels was divided into several equal groups of 6 to 50 parallels. These data groups were analyzed individually to examine the effect of the number of parallels on the estimation accuracy. If a moderate or limited number of parallels can be used for reliable data analysis, the isothermal nucleation method could be made applicable to studies of chemicals affecting the nucleation process. To validate the statistical consistency and reliability of the MLE method against a reduced number of experimental parallels, permutation tests [110, 111] have been performed. As a control and a conventional estimation tool, the EPA-PDF method was applied as the reference method. The use of both the EPA-PDF and MLE methods is elucidated in the following section. To the best of our knowledge, this is the first attempt to apply MLE to the statistical analysis of experimentally obtained gas hydrate nucleation data for the deduction of stationary nucleation rate and system lag time.

Experimental Setup and Analytical Methods

This section introduces the experimental setup used in the current work for collection of hydrate nucleation data, followed by the two analytical methods (EPA-PDF and MLE) applied for data analysis.

Experimental. The experimental work on hydrate nucleation was conducted on a binary methane-propane mixture consisting of 92.5 mol% methane + 7.5 mol% propane (SNG2) in a high-pressure autoclave cell at isothermal conditions. An outline of the experimental setup and the hydrate equilibrium properties of this binary mixture is reported elsewhere (Abay and Svartaas [25, 73]). The cell setup reported by Abay and Svartaas was modified and equipped with an extra temperature sensor to include temperature measurements both in the aqueous phase at the cell


bottom and in the vapor phase at the top. The nucleation experiments were conducted at a pressure of 90 ± 0.2 bar and at three experimental temperatures (11.75 °C, 13 °C, and 14.25 °C). At 90 bar, the equilibrium temperature for SNG2 was estimated to be 21.8 °C by CSMGem (software from Colorado School of Mines [2]). Thus, the three experimental temperatures gave constant degrees of subcooling ($\Delta T$) of 10.05 °C, 8.8 °C, and 7.55 °C, respectively, at the experimental pressure. Prior to the start of an experiment, the autoclave cell was cooled at a rate of 6 °C/h from the hydrate-free region to the desired temperature without stirring. Once the experimental temperature was reached, the stirrer was started at a preset rate of 750 rpm. The start of stirring was defined as point zero for the measurement of induction time. The induction time was calculated as the time elapsed between point zero and the first sign of hydrate formation in the cell. Onset of hydrate formation was detected through a slight pressure spike just prior to the first detection of gas consumption in the cell. Natarajan et al. [112] made similar observations of a slight pressure pulse occurring a few seconds before observation of the turbidity point defining onset in their study. Thirty parallels were produced at each of 11.75 °C and 14.25 °C and 60 parallels at 13 °C, and data were recorded every 0.05 min (i.e. at 3 s intervals). The measured isothermal induction times are given as supporting information. In addition to our own data collection, isothermal induction time distributions in three other crystallization systems were read from figure plots in the literature [24, 49, 50]. This was done to extend the statistical basis for evaluation of the MLE method. The first dataset was extracted from work by Jiang and ter Horst [49]. This dataset contained 80 parallels on nucleation of m-amino benzoic acid (m-ABA) from supersaturated solution ($S = 1.96$) at 25 °C, obtained using a Crystal16 multiple-reactor. The second dataset was obtained from work by


Barlow and Haymet [50]. They conducted 358 experiments with the automated lag time apparatus (ALTA) and recorded induction times for the nucleation of supercooled water on insoluble AgI crystals at -4 °C. Because of some overlapping points, only 309 of the 358 data points could be read and extracted from Fig. 3 in Barlow and Haymet's article. The third dataset was regenerated from the work of Kulkarni et al. [24]. They measured induction times from 144 parallel nucleation experiments in a Crystal16 multiple-reactor for a supersaturated solution ($S = 1.40$) of isonicotinamide (INA) in ethanol at a temperature of 25 °C. The accuracy of data reading from the work by Jiang and ter Horst and by Kulkarni et al. was assumed to be within ± 5 s of the exact values. The accuracy of data reading from Barlow and Haymet was estimated to be within ± 20 s of the exact values. The error range of the data extracted from the literature should not influence the evaluation of methods or the conclusions.

EPA-PDF method. This method estimates the stationary nucleation rate and the system lag time in a nucleation system by curve fitting the Experimental Probability Array (EPA) to the theoretical exponential Probability Distribution Function (PDF) for a stochastic process. The expressions for curve fitting at isothermal conditions, applicable to the nucleation data in the current study, are given below. For a total of $N$ parallel experiments, the probability $P(t)$ of observing an induction time in the time interval $[0, t]$ is defined as:

$$P(t) = M(t)/N \tag{1}$$

where $M(t)$ is the number of parallels in which nucleation is detected within a time interval of $t$ seconds or shorter. This generates an experimental probability array, EPA, as described by the


experimental probability distribution function in Eq. (1), starting from a least probability of $1/N$ at the shortest induction time and reaching a probability of unity at the longest. Accordingly, an exponential, two-parameter PDF can describe the behavior of this type of isothermal EPA:

$$P(t) = 1 - \exp(-J \cdot V \cdot (t - \tau_0)) \tag{2a}$$

where $J$ is the stationary nucleation rate (s⁻¹ m⁻³), $\tau_0$ is the lag time (s) and $V$ is the sample volume involved. At $t < \tau_0$, Eq. (2a) gives a probability of nucleation $P(t) < 0$; thus $\tau_0$ describes the minimum time required for the process to become probable. There may be some confusion around the term lag time in hydrate nucleation because the term is often used to describe different time intervals in the process. Another interpretation of lag time describes the time elapsed between $\tau_0$ and the point at which the process reaches steady-state nucleation. The nucleation rate is a function of this latter lag time. A third interpretation of lag time describes the time elapsed until the onset of observable macroscopic hydrate growth and is commonly used in the evaluation of KHI effects. In our experiments, we did not know the exact sample volume involved in the nucleation process because hydrate nucleation commonly takes place at the gas-water interface and not in the bulk. Assuming that hydrate nucleation occurs at the gas-water interface, the volume involved will be proportional to the interfacial area. Assuming further that the interfacial area is fairly constant at a fixed stirring rate, the volume involved can be approximated as constant. Eq. (2a) can then be expressed as:

$$P(t) = 1 - \exp(-J^{*} \cdot (t - \tau_0)) \tag{2b}$$


where $J^{*} = J \cdot V$ is the "volume-independent" stationary nucleation rate (s⁻¹). With the induction time distribution and its EPA determined by Eq. (1) as known variables, the key parameters for nucleation, $J$ or $J^{*}$ and $\tau_0$, can be obtained by curve fitting with Eq. (2a) or (2b).
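As an illustration of the EPA-PDF procedure, the following Python sketch builds the experimental probability array of Eq. (1) and estimates the rate and lag time of Eq. (2b). It is a minimal sketch under stated assumptions, not the fitting routine used in this work: the function names are hypothetical, and the non-linear curve fit is replaced by an ordinary least-squares line through $\ln(1 - P)$ versus $t$, which follows from rearranging Eq. (2b) to $\ln(1 - P) = -J^{*} t + J^{*} \tau_0$.

```python
import math

def experimental_probability_array(induction_times):
    """Eq. (1): the k-th shortest of N induction times gets P = k / N."""
    ordered = sorted(induction_times)
    n = len(ordered)
    return [(t, (k + 1) / n) for k, t in enumerate(ordered)]

def fit_epa_linearized(induction_times):
    """Estimate the rate and lag time of Eq. (2b) from a straight-line fit.

    Assumption: a linearized least-squares fit stands in for non-linear
    curve fitting. The slope of ln(1 - P) versus t is -rate and the
    intercept is rate * lag_time. The last point (P = 1) is skipped
    because ln(0) is undefined.
    """
    epa = experimental_probability_array(induction_times)[:-1]
    xs = [t for t, _ in epa]
    ys = [math.log(1.0 - p) for _, p in epa]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    rate = -slope                 # volume-independent rate, s^-1
    lag_time = intercept / rate   # lag time, s
    return rate, lag_time
```

Because every point of the array enters the fit with equal weight, the estimate depends on how well a limited number of parallels populates the whole curve, which is the dependence on the number of parallels examined in this work.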

Penalized MLE & Permutation Test. Maximum Likelihood Estimation (MLE) is especially useful for non-linear modeling with non-uniformly distributed data. Consider, for instance, a measured induction time distribution, and take the induction time, $t$, as a random variable following a two-parameter exponential distribution. As in the conventional EPA-PDF method, the two parameters in MLE are the stationary nucleation rate, $J$, and the system lag time, $\tau_0$. The probability density function [93] for the induction time, $t$, is given by:

$$f(t; J, \tau_0) = \begin{cases} J \exp(-J (t - \tau_0)), & t \ge \tau_0 \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

Based on Eq. (3), the Cumulative Probability Function (CPF) for steady-state nucleation to occur within a time interval $[\tau_0, t]$ is of a similar form as the PDF in Eq. (2b):

$$F(t) = P(\tau_0 \le T \le t) = \int_{\tau_0}^{t} J e^{-J (t' - \tau_0)} \, dt' = 1 - \exp(-J (t - \tau_0)) \tag{4}$$
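The relation between Eqs. (3) and (4) can be checked numerically. The Python sketch below is illustrative only: the function names and the trapezoidal integration are assumptions introduced here, with `rate` standing for the nucleation rate and `lag_time` for the lag time.

```python
import math

def pdf(t, rate, lag_time):
    """Eq. (3): probability density of the induction time t."""
    if t < lag_time:
        return 0.0
    return rate * math.exp(-rate * (t - lag_time))

def cdf(t, rate, lag_time):
    """Eq. (4): probability of nucleation within [lag_time, t]."""
    if t < lag_time:
        return 0.0
    return 1.0 - math.exp(-rate * (t - lag_time))

def integrate_pdf(t, rate, lag_time, steps=10000):
    """Trapezoidal integral of Eq. (3) from lag_time to t."""
    if t <= lag_time:
        return 0.0
    h = (t - lag_time) / steps
    total = 0.5 * (pdf(lag_time, rate, lag_time) + pdf(t, rate, lag_time))
    for k in range(1, steps):
        total += pdf(lag_time + k * h, rate, lag_time)
    return total * h
```

For any `t` above the lag time, `integrate_pdf` and `cdf` should agree to within the integration error, which confirms that the closed form in Eq. (4) is the integral of the density in Eq. (3).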

With the estimates of $J$ and $\tau_0$ obtained by MLE as input to Eq. (4), $F(t)$ defines the statistical probability array, SPA, of the experimental data. Estimation of $P(t)$ via the deterministic EPA function is thus omitted. As demonstrated in Zheng's work, a penalty multiplier was added to the original MLE likelihood function to obtain Uniformly Minimum Variance Unbiased Estimators (UMVUE) [98]. Applying this to hydrate nucleation, for a set of $n$ induction time samples in their original


orders, $t_1, t_2, \ldots, t_n$, with the probability density function described by Eq. (3), the penalized likelihood function takes the form:

$$L(J, \tau_0) = (t_{1:n} - \tau_0) \prod_{i=1}^{n} f(t_i; J, \tau_0) = (t_{1:n} - \tau_0) \, J^{n} \exp\left(-J \sum_{i=1}^{n} (t_i - \tau_0)\right), \quad t_{1:n} \ge \tau_0 \tag{5}$$

where $t_{1:n} \le t_{2:n} \le \ldots \le t_{n:n}$ is the re-ordered data sequence starting from the shortest induction time, $t_{1:n}$. As seen in Eq. (5), by adding the penalty multiplier $(t_{1:n} - \tau_0)$, the likelihood function $L(J, \tau_0)$ is no longer monotone in the lag time, $\tau_0$. More importantly, if $t_{1:n} = \tau_0$ (though not practical in reality), $L(J, \tau_0)$ would give a likelihood of zero. This agrees with classical nucleation theory, in which the lag time $\tau_0$ is the starting point before which nucleation has no probability and no stationary nucleation rate can be obtained. Taking the natural logarithm of Eq. (5):

$$\ln L(J, \tau_0) = \ln(t_{1:n} - \tau_0) + n \ln J - J \sum_{i=1}^{n} (t_i - \tau_0), \quad t_{1:n} \ge \tau_0 \tag{6}$$

With Eq. (6), the penalized MLE estimators of $J$ and $\tau_0$ can be deduced by taking the partial derivative with respect to each variable and setting the derivatives to zero. This gives:

$$J = \frac{n - 1}{n (\bar{t} - t_{1:n})} \tag{7}$$

$$\tau_0 = \frac{n \, t_{1:n} - \bar{t}}{n - 1} \tag{8}$$

where $\bar{t}$ is the mean value of the induction time distribution. With $J$ and $\tau_0$ estimated in this way, the cumulative probability, $F(t)$, for each individually measured induction time can be calculated with Eq. (4). As can be seen, unlike in the EPA-PDF method where its probability


distribution function is obtained by curve fitting to a deterministic experimental distribution function, the CPF in MLE is itself an "end-use" statistical relation that can be deduced from the solved MLE estimators. The original equations in MLE analysis refer to a scale parameter, $\theta$, where $\theta = 1/J$, and a location parameter, $\tau_0$ or $\eta$. From Eq. (7) the scale parameter is given by $\theta = [n(\bar{t} - t_{1:n})]/(n - 1)$, which is a volume-independent function of the average induction time, $\bar{t}$, and the number of parallels included, $n$. In the above equations we have replaced $\theta$ with $1/J$ to relate the probability distribution to the nucleation rate in MLE analysis. For the purpose of conducting unbiased comparison and verifying results with statistical significance, a permutation test can be performed in conjunction with MLE analysis. The permutation test is a powerful statistical significance test based on resampling by exchanging labels on data points [110, 111]. In this work, permutation tests help determine whether two data groups are statistically homogeneous and interchangeable, or whether a subgroup still retains the statistical properties of the full original data set from which it was drawn. A positive answer to the former indicates that the two groups were probably generated from the same data distribution. A positive answer to the latter means that the subgroup, from a statistical point of view, does not differ significantly from the full data set and can thus be taken to represent it well. Technically, the permutation test is run on MLE estimates of the stationary nucleation rate from Eq. (7). In a permutation test, the two data groups of measured induction times are first combined into one group. The combined group is then resampled repeatedly, each resampling randomly generating a pair of new data groups that keep the original group sizes. For each regenerated data pair, the difference in the estimated nucleation rates, $\Delta J$, is calculated by MLE. The resulting differences in nucleation rates, $\Delta J_1, \Delta J_2, \ldots$, are each compared with the nucleation rate difference between the two original data groups.


Some of the resampled differences will be larger than the original difference and the rest smaller; the fractions of the two cases sum to 1. The p-value from a permutation test is assigned the smaller of the two fractions and thus lies in the range [0, 0.5]. A p-value of 0.05 or less indicates a significant difference between the two compared data groups, and it can be assumed unlikely that they originate from the same data distribution. If the p-value is close to 0.5, the probability of the datasets being interchangeable is high, showing no significant difference. In the region 0.05 < p ≤ 0.1 the datasets may be assumed representative for the original distribution, and if 0.1 < p ≤ 0.5 the probability that the datasets represent the original distribution can be assumed significant. If the latter is fulfilled, the datasets possess statistical homogeneity and could be used independently to describe the system properties.

For this work, nucleation rate and lag time estimation by penalized MLE with 95% confidence boundary and the permutation test were programmed in R and managed with the RStudio computing platform.113 A total of 10000 resamplings was chosen to ensure sufficient resampling and statistical reliability for the permutation tests. R scripts for MLE estimations and permutation tests are given as supporting information.

MLE on Incomplete (Censored) data. MLE is able to include and handle incomplete data in the analysis of the nucleation rate, J, and the lag time. Incomplete (censored) data are data points where nucleation did not occur within the time limit of the experiment, or where the induction time was too short to be determined with acceptable accuracy. If incomplete data are encountered during our experiments, we normally extend the time limit of the experiment, or increase the minimum experimental temperature to avoid spontaneous nucleation at the start of the experiment. In the present work, incomplete data were not encountered in any of the experiments. Zeng98 has given a description of the inclusion of incomplete data in MLE analysis. Incomplete data should be kept to a minimum, e.g. by increasing the time limit of the experiment. Experimental temperatures where spontaneous nucleation at the start of the experiment would dominate should be avoided.
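The estimation and permutation steps described above can be sketched in a few lines. The following Python sketch is illustrative only and is not the authors' R script. The scale relation is the one quoted for Eq. (7); the lag-time form `t_min - scale / n` is an assumption (a common penalized estimator for a two-parameter exponential), since Eq. (8) is not reproduced in this section. The function names, the seed and the synthetic induction times are hypothetical.

```python
import random


def penalized_mle(times):
    """Return (rate, lag) for a shifted-exponential induction-time model.

    scale = n * (mean - t_min) / (n - 1) follows the relation quoted for Eq. (7);
    lag = t_min - scale / n is an ASSUMED form of the penalized lag estimator.
    """
    n = len(times)
    if n < 2:
        raise ValueError("need at least two induction times")
    t_min = min(times)
    scale = n * (sum(times) / n - t_min) / (n - 1)
    if scale <= 0:
        raise ValueError("induction times must not all be identical")
    return 1.0 / scale, t_min - scale / n


def permutation_p_value(group_a, group_b, n_resamples=10000, seed=0):
    """Smaller of the two fractions of resampled rate differences that lie
    above / not above the original difference, so the result is in [0, 0.5]."""
    rng = random.Random(seed)
    observed = penalized_mle(group_a)[0] - penalized_mle(group_b)[0]
    pooled = list(group_a) + list(group_b)
    size_a = len(group_a)
    larger = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # exchange the group labels
        diff = (penalized_mle(pooled[:size_a])[0]
                - penalized_mle(pooled[size_a:])[0])
        if diff > observed:
            larger += 1
    fraction_larger = larger / n_resamples
    return min(fraction_larger, 1.0 - fraction_larger)


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical induction times (s): lag of 100 s, rate of 0.01 s^-1.
    a = [100.0 + rng.expovariate(0.01) for _ in range(30)]
    b = [100.0 + rng.expovariate(0.01) for _ in range(30)]
    print(penalized_mle(a))
    print(permutation_p_value(a, b, n_resamples=2000))
```

Because both synthetic groups are drawn from the same distribution, the p-value will usually be well above 0.05, although any single random draw can fall below it.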

Results

Complete datasets and corresponding subgroups were analyzed individually using the EPA-PDF and MLE methods. Figs. 1 to 6 show data points according to the experimental probability distributions, P(t), as determined by the EPA, Eq. (1), and data points according to the statistical probability array, SPA, as determined by the cumulative probability function (CPF), Eq. (4). The data points created by the EPA method are deterministic by nature, while the SPA-deduced data points maintain the stochastic nature of the nucleation process. Curves were fitted to data points via the theoretical PDF as given by Eq. (2b). Nucleation rates and lag times deduced by the SPA-PDF fit are identical with the values determined by Eqs. (7) and (8), and the SPA-deduced data points were included to enable comparison between the EPA-PDF and MLE methods. The SPA-deduced data points help visualize the effects of lag time tuning for the two methods in Figs. 4 and 6. In the present work the nucleation rate is given as a "volume independent" rate according to Eq. (2b). The rates reported here for data taken from ter Horst and coworkers24, 49 should be divided by the actual sample volume (V = 10⁻⁶ m³) for comparison with the nucleation rates given in their publications.

Figure 1. Probability distribution by EPA-PDF and MLE analysis on m-ABA nucleation (data from literature49).


Figure 2. Probability distribution by EPA-PDF and MLE analysis on nucleation from supercooled water on AgI crystals (data from literature50).

Figure 3. Probability distribution by EPA-PDF and MLE analysis on INA nucleation (data from literature24).

For the literature data shown in Figs. 1-3, the discrepancies between EPA-PDF and MLE in estimation of nucleation rates are relatively small owing to the sufficient number of parallels involved. However, it should be noted that using EPA-PDF on the m-ABA nucleation data we obtain a lag time of 1165 ± 14 s, while the lowest induction time was 1030 s. This contradicts the definition of the probability distribution function, which is not valid at induction times shorter than the lag time. Using MLE on these data we obtain a lag time of 1006 ± (7×10⁻⁶) s, which is slightly less than the lowest induction time observed. The probability distribution function is then valid for all induction times measured. For the m-ABA system, EPA-PDF gave a nucleation rate of 0.000627 s⁻¹ (627 s⁻¹ m⁻³ using Eq. 2a) and MLE gave 0.000589 s⁻¹, i.e. 6 % less. To examine the effect of an overestimated lag time by EPA-PDF, we set the lag time to the lowest induction time observed (1030 s) and conducted a new curve fit with the lag time locked to this value. This tuning gave a nucleation rate of 0.000559 s⁻¹ by the EPA-PDF method, i.e. a reduction of approx. 11 %. This is not drastic, and with respect to nucleation rate the EPA-PDF method would probably give estimates within acceptable accuracy provided a sufficient number of parallels. However, tuning of the lag time should be considered if simulated values are in conflict with the definition of the theoretical probability distribution function, and tuned values must be in accordance with the statistical behavior of the system.
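The percentages and the per-volume rate quoted for the m-ABA system can be checked with a short calculation. The sketch below uses only numbers stated in the text (the three rates and the sample volume of 10⁻⁶ m³); the helper names are hypothetical.

```python
SAMPLE_VOLUME_M3 = 1e-6  # sample volume quoted in the text

rate_epa = 0.000627        # s^-1, EPA-PDF fit
rate_mle = 0.000589        # s^-1, MLE
rate_epa_tuned = 0.000559  # s^-1, EPA-PDF with lag time locked at 1030 s


def per_volume(rate, volume=SAMPLE_VOLUME_M3):
    """Convert a volume-independent rate (s^-1) to a rate per unit volume."""
    return rate / volume


def percent_lower(value, reference):
    """How many percent 'value' lies below 'reference'."""
    return 100.0 * (reference - value) / reference


print(round(per_volume(rate_epa)))                        # 627
print(round(percent_lower(rate_mle, rate_epa), 1))        # 6.1
print(round(percent_lower(rate_epa_tuned, rate_epa), 1))  # 10.8
```

The results agree with the rounded values of 6 % and approx. 11 % given in the text.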


Figure 4. Probability distribution by EPA-PDF and MLE analysis for binary methane-propane hydrate nucleation at 90 bar and 11.75 °C (data from present work). The MLE analysis includes a simulation with the lag time set to zero because the estimated value was negative; the resulting probability curve overlaps the original.

Figure 5. Probability distribution by EPA-PDF and MLE analysis for binary methane-propane hydrate nucleation at 90 bar and 13 °C (data from present work).

Figure 6. Probability distribution by EPA-PDF and MLE analysis for binary methane-propane hydrate nucleation at 90 bar and 14.25 °C (data from present work). The figure shows nucleation rates for the system with the lag time according to the primary simulation, and with the lag time tuned to zero or to the lowest observed induction time, to demonstrate the effect of unrealistically low lag time values for the EPA-PDF and MLE methods.

As can be seen from the literature data in Figs. 1-3, the estimates of nucleation rate, J, by the EPA-PDF and MLE methods are in good agreement. In all simulations the regression coefficients, R, were above 0.99 and the Chi square values (Chisq) showed good correlation between data points and fitted curves. Chi square values below 0.05 indicate minimal disagreement between the data points and the corresponding points on the curve, and Chi square values below 0.1 could still be acceptable. For the hydrate experiments, discrepancies between the EPA-PDF and MLE estimates of nucleation rates occurred, as shown in Figs. 4-6 and Table 1. The regression coefficients for the EPA-PDF fit on the hydrate data were acceptable, but the Chi square values showed a greater deviation between the estimated P(t) curve and the experimental data than desirable. This could be due to a number of parallels below the minimum required for the EPA-PDF method. The Chi square values show that 60 parallels worked better with the EPA-PDF method than 30 parallels. The nucleation rate estimated by the EPA-PDF method is on average three to four times higher than that estimated by MLE at all experimental temperatures. A comparison between Figs. 4-6 and Figs. 1-3 indicates that EPA-PDF did not manage to perform a "perfect" curve fit for the stochastic hydrate nucleation behavior, but performed better on the literature data. This could be because the experimental probability array (EPA) is deterministic, making it too rigid for curve fitting on stochastic data when the number of parallels is below a critical limit. MLE generates the cumulative probabilities for each induction time via its penalized estimators. As a result, it gives smoother predictions of hydrate nucleation probability at any induction time observed.

If more data points fall on one side of the average value than on the other, EPA-PDF will give those data a higher weight, displacing the P(t) curve towards that side of the average value. The result would then be an overestimated or underestimated nucleation rate. With MLE the data points are given an even weighting around the average. This weighting also influences the estimated lag time, and EPA-PDF sometimes yields a lag time longer than the shortest induction time measured. This anomaly shows a deviation between prediction and theory for the EPA-PDF method. However, predicted negative lag times may occur with both methods, as shown in Figs. 4 and 6.
Adjusting negative lag times to zero, or setting the lag time equal to the lowest observed induction time, may have a moderate influence on the nucleation rates estimated by the EPA-PDF and MLE methods. In Fig. 6 we see that an adjustment of the estimated lag time from 113.8 s to zero gave a 28.5 % increase in nucleation rate for the EPA-PDF method, while an adjustment from -55 s to zero gave a 4.2 % increase in nucleation rate with the MLE method. Tuning the lag time to 54 s, i.e. equal to the lowest induction time observed, increased the MLE nucleation rate by 8.5 % compared to the un-tuned simulation. We assume that the most accurate MLE estimate of the nucleation rate in this case would be obtained with the lag time tuned between zero and the lowest observed induction time. For the hydrate experiments, the EPA-PDF method gave an unfavorable experimental probability distribution, with points that did not adapt well to the theoretical probability distribution function (PDF) at the lower and higher induction times of the distribution range. The MLE method appears less sensitive to corrections for negative lag time estimates than EPA-PDF. This is seen through the changed slope of the fitted P(t) curves in Fig. 6.

The lag time defines a single point probability, and the present study indicates that an infinite number of parallels may be required for its exact determination. On the other hand, the nucleation rate is based on data representing the whole probability distribution, and the present study indicates that fewer parallels are required to estimate the nucleation rate within acceptable accuracy using MLE. The ratio between hydrate nucleation rates obtained at 11.75 °C (cf. Fig. 4) and at 13 °C (cf. Fig. 5) was estimated to 1.885 by EPA-PDF and 2.014 by MLE. Going from 13 to 14.25 °C (cf. Fig. 6) the ratios increased to 3.777 by EPA-PDF and 3.689 by MLE, respectively. Thus, the relation between estimated nucleation rate and temperature (subcooling) appears not to be affected by the method, provided a sufficient number of parallels and estimation based on a realistic lag time value for both methods.
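Why raising a fixed lag time raises the estimated rate can be illustrated with the textbook result for a shifted exponential: when the lag is held fixed, the maximum likelihood rate is the reciprocal of the mean excess time, 1/(mean − lag). The sketch below is an assumption-laden illustration, not the tuning procedure used in this work (which is not spelled out here), and it is not expected to reproduce the percentages quoted above. The function name and the induction times are hypothetical.

```python
def rate_with_fixed_lag(times, lag):
    """Maximum likelihood rate of a shifted exponential when the lag is fixed.

    Standard result: rate = 1 / mean(t_i - lag), valid for lag <= min(times).
    """
    if lag > min(times):
        raise ValueError("lag time cannot exceed the shortest induction time")
    mean_excess = sum(t - lag for t in times) / len(times)
    if mean_excess <= 0:
        raise ValueError("mean excess time must be positive")
    return 1.0 / mean_excess


# Hypothetical induction times in seconds (not data from this work).
times = [54.0, 310.0, 820.0, 1500.0, 2900.0, 4100.0]
for lag in (-55.0, 0.0, 54.0):
    print(lag, rate_with_fixed_lag(times, lag))
```

Under this simple model the rate grows monotonically as the lag is moved from a negative value towards the shortest induction time, which matches the direction of the changes reported for the MLE method.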


Table 1. A summary of stationary nucleation rate and lag time ± standard error as estimated by EPA-PDF and MLE respectively. Note: Nucleation rates, J, are multiplied with given scaling factors.

Table 1 summarizes the results of stationary nucleation rate and lag time for the six datasets analyzed. For the C1-C3 experiments at 11.75 °C and 14.25 °C, the negative lag time estimates shown in Figs. 4 and 6 indicate that accurate lag time estimation probably requires a larger number of data points. A negative lag time has no physical meaning, and such values were tuned to zero in Table 1. Fig. 6 shows estimated nucleation rates with the negative lag time tuned to zero or to a value equal to the lowest induction time observed for the dataset. The tuning resulted in slight increases in the nucleation rates estimated by both the EPA-PDF and MLE methods. Tuning of the lag time was justified by the fact that negative induction times (i.e. nucleation prior to start of stirring) were not observed in any of the experiments conducted, and that the lag time estimator of the non-penalized MLE analysis is equal to the lowest induction time observed.98 Table 1 indicates that the nucleation rate for binary methane-propane hydrate formation decreased with increasing temperature (i.e. decreasing degree of subcooling). This is consistent with classical nucleation theory.

Based on the discussion above, MLE appeared to be a more reliable tool than EPA-PDF for analysis of nucleation rates from experimental data. With a sufficient number of parallels, EPA-PDF appears to approach MLE with respect to estimated nucleation rates, while an offset in the estimated lag time may be experienced with EPA-PDF independent of the number of parallels involved. We assume that the discrepancies are caused by the weighting of data around the average value, in addition to the EPA-PDF relation being based on a deterministic approach to stochastic behavior.


To investigate the robustness of MLE with penalized estimators, the datasets were divided into subgroups containing a gradually lower number of parallels; the subgroups were analyzed individually and finally compared with the analysis of the complete dataset containing all parallels. For example, the induction time distribution from the C1-C3 nucleation experiments at 13 °C contains 60 data points. This dataset was divided into 2, 3, 5 or 10 subgroups containing 30, 20, 12 or 6 data points each, respectively. Figs. 7a and 7b show the average nucleation rates for each subgroup size for C1-C3 nucleation at 13 °C using the MLE and EPA-PDF methods, respectively. Using MLE, the average nucleation rates estimated from subgroups containing 30, 20 or 12 data points are in good agreement with the nucleation rate estimated for the complete dataset, as seen from Fig. 7a. As seen from Fig. 7b, greater variations in the nucleation rate estimates were observed over the same range of subgroup sizes with the EPA-PDF method. When the dataset was further split into 10 subgroups containing only 6 data points each, the estimated nucleation rates fluctuated strongly, with an error bar indicating unreliable predictions for both methods. It should be noted that although the error bar for the groups of six parallels stretches into the negative region, none of the estimated nucleation rates were negative; the negative lower bound is mainly an effect of the fluctuating values.
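The remark about error bars reaching below zero can be illustrated numerically. The sketch below assumes the error bar is the mean plus or minus one sample standard deviation of the subgroup estimates (the text does not state how the bar is defined), and the ten subgroup rates are hypothetical values chosen to fluctuate strongly.

```python
from statistics import mean, stdev

# Hypothetical nucleation rate estimates (s^-1) from ten small subgroups;
# all are positive, but a few are much larger than the rest.
subgroup_rates = [0.0002, 0.0003, 0.0004, 0.0005, 0.0030,
                  0.0002, 0.0003, 0.0025, 0.0004, 0.0003]

average = mean(subgroup_rates)
spread = stdev(subgroup_rates)  # assumed error-bar half-width
lower, upper = average - spread, average + spread

print(f"mean = {average:.6f}, lower = {lower:.6f}, upper = {upper:.6f}")
print("all estimates positive:", min(subgroup_rates) > 0)
print("error bar reaches below zero:", lower < 0)
```

A symmetric error bar built from a strongly skewed set of positive values can therefore extend below zero even though no individual estimate is negative.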

Figure 7. Effect of reduced number of experimental parallels on estimated nucleation rate (methane-propane nucleation at 13 °C) using MLE (Fig. 7a) and EPA-PDF (Fig. 7b) methods.

Similar data division was done on the other five datasets, followed by MLE estimation. The m-ABA experiments containing 80 data points were split into 2, 4, 8 or 16 subgroups, containing 40, 20, 10 or 5 data points per subgroup. The first 300 of the 309 induction time measurements on AgI-water nucleation were split into 6, 12, 24 or 60 subgroups, containing 50, 25, 12/13 or 5 data points per subgroup, respectively. The 144 data points on INA nucleation were split into 3, 6, 9, 12 or 24 subgroups, containing 48, 24, 16, 12 or 6 data points per subgroup, respectively. In addition to the split conducted on the hydrate nucleation data at 13 °C, the data at 11.75 °C and 14.25 °C were split into 2, 3 or 5 subgroups, containing 15, 10 or 6 data points per subgroup, respectively. In total, 226 subgroups were generated from the six original datasets. Table 2 below summarizes all results of MLE performance against reduced number of data points per subgroup.
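The subgroup bookkeeping above can be verified with a short script. The split counts are taken from the text; the 30 points per dataset at 11.75 °C and 14.25 °C are implied by the stated subgroup sizes. How data points were assigned to subgroups is not described, so the consecutive chunking in `split_into_subgroups` is an assumption made for illustration, and the function name is hypothetical.

```python
# Number of subgroups per split, as listed in the text.
splits = {
    "m-ABA (80 points)": [2, 4, 8, 16],
    "AgI-water (first 300 points)": [6, 12, 24, 60],
    "INA (144 points)": [3, 6, 9, 12, 24],
    "C1-C3 hydrate, 11.75 C (30 points)": [2, 3, 5],
    "C1-C3 hydrate, 13 C (60 points)": [2, 3, 5, 10],
    "C1-C3 hydrate, 14.25 C (30 points)": [2, 3, 5],
}

total = sum(sum(counts) for counts in splits.values())
hydrate = sum(sum(c) for name, c in splits.items() if "hydrate" in name)
literature = total - hydrate
print(total, literature, hydrate)  # 226 186 40


def split_into_subgroups(data, n_groups):
    """Split data into consecutive subgroups whose sizes differ by at most one.

    Consecutive chunking is an assumption; the assignment rule is not given.
    """
    base, extra = divmod(len(data), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        size = base + (1 if i < extra else 0)
        groups.append(data[start:start + size])
        start += size
    return groups


# 300 points in 24 subgroups gives the mixed 12/13 sizes mentioned above.
sizes = sorted({len(g) for g in split_into_subgroups(list(range(300)), 24)})
print(sizes)  # [12, 13]
```

The totals match the 226 subgroups stated above and split into 186 literature and 40 hydrate subgroups.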

Table 2. Nucleation rate by MLE, J (s⁻¹), with reduced number of induction times for analysis.

When we tested MLE on the literature data on supercooled water nucleation on AgI,50 we noticed that the predicted nucleation rate remained fairly constant when analyzing data groups containing a decreasing number of parallels, from 300 down to a minimum of 25 points (see Table 2). However, the lag time gradually decreased from 163 s to an average of 111.6 s over the same range. All individually estimated lag times remained positive over this range. Negative lag time estimates were experienced when the number of parallels was reduced to 12-13 points (average lag time of 68.8 s). We made similar observations on the INA data from the literature,24 where nucleation rates were less affected by the number of parallels down to 20 points in each group, while the average lag time decreased with decreasing number of parallels included.


As can be seen from Table 2, a reduced number of parallels within the subgroup affects the accuracy of estimation. With 40 to 50 data points per subgroup, the literature data showed nucleation rates close to those obtained with all parallels included. At a level of 20 to 30 data points per subgroup, the literature data and the hydrate data at 13 °C gave nucleation rates within ± 25 % of the average with all data included, except the water-AgI system, which required 40 to 50 data points to be within ± 25 %. At a level of 12 to 20 data points per subgroup, most of the estimated nucleation rates deviated by more than ± 30 %, except the hydrate data at 13 °C (20 points per subgroup) and 14.25 °C (15 points per subgroup), which both remained within ± 25 % of the nucleation rate with all data included. With 10 to 12 data points included, all systems except hydrate at 14.25 °C (10 points per subgroup) gave deviations greater than ± 30 %. With 5-6 data points per subgroup, the resulting average nucleation rate and error range suffered huge deviation and loss of consistency, and were completely unacceptable. The level of data randomness should be related to the intrinsic nucleation kinetics, and could depend on the actual nucleation reactor and the experimental setup / procedures applied in each case. Based on the above, we assume that 20 to 30 parallels would be sufficient for estimation of nucleation rates within acceptable limits of uncertainty. The hydrate data indicate that 10 to 15 parallels per subgroup could be acceptable in some situations and that 20 parallels most probably were sufficient.

To verify whether the subgroups contained enough data points to be representative for the total distribution range, permutation tests were conducted on all individual subgroups and their corresponding mother datasets.
The results from the permutation tests are listed in Table 3 (literature data) and Table 4 (methane-propane hydrate nucleation data from the present study). A p-value ≤ 0.05 indicates that the subgroup has a probability distribution significantly different from its mother dataset and is thus not eligible to represent the probability distribution of the nucleation system. Tables 3 and 4 list the number of subgroups into which the complete dataset was divided and the number of data points, N (i.e. no. of parallels), per subgroup. Column (p ≤ 0.05) shows the number of subgroups with no homogeneity as compared to the mother dataset, and column (p ≤ 0.1) includes the region 0.05 < p ≤ 0.1, where the subgroup is close to an assumed limit for being stated representative or not. The proportion of subgroups lacking statistical homogeneity with the overall data was estimated from the numbers in the bottom row (sum).
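The column tallies and proportions for Tables 3 and 4 amount to simple counting, sketched below. The thresholds follow the text; the function names and the example p-values are hypothetical, while the counts passed to `percent` are the sums reported in the text for the literature data (32 and 21 of 186 subgroups) and the hydrate data (1 and 2 of 40 subgroups).

```python
def tally(p_values):
    """Count subgroups in the two p-value bands used in Tables 3 and 4."""
    significant = sum(1 for p in p_values if p <= 0.05)
    borderline = sum(1 for p in p_values if 0.05 < p <= 0.1)
    return significant, borderline


def percent(count, total):
    """Share of subgroups in percent, rounded to one decimal."""
    return round(100.0 * count / total, 1)


# Hypothetical p-values for five subgroups of one dataset.
print(tally([0.03, 0.07, 0.21, 0.48, 0.10]))  # (1, 2)

# Sums reported in the text.
print(percent(32, 186), percent(21, 186))  # 17.2 11.3
print(percent(1, 40), percent(2, 40))      # 2.5 5.0
```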

Table 3. p-value statistics summarized from permutation tests on literature nucleation data for all subgroups within each dataset.

Table 4. p-value statistics summarized from permutation tests on hydrate nucleation for all subgroups within the dataset.

As seen in Table 3, among the 186 subgroups generated from literature data, 32 subgroups were identified with p ≤ 0.05 and 21 subgroups in the range 0.05 < p ≤ 0.1, corresponding to 17.2 % and 11.3 % of the total number of subgroups, respectively. Among all 40 subgroups of data from hydrate nucleation (see Table 4), one subgroup was identified with p ≤ 0.05 and two subgroups in the range 0.05 < p ≤ 0.1, corresponding to 2.5 % and 5.0 % of the total number of subgroups, respectively. Cases where permutation tests give p ≤ 0.1 show that the number of parallels should be increased to obtain reliable estimation of the nucleation rate, while p > 0.1 shows that the subgroup can be assumed representative for the actual distribution range. The analysis indicates that the minimum number of parallels needed to ensure representative sampling for fair data analysis varies from nucleation system to nucleation system. For nucleation analysis by MLE, the literature data indicate that a minimum of 20 to 25 parallels would be required for significant representation of the actual probability distribution range. This is in agreement with the nucleation rate analysis presented in Table 2. The permutation tests conducted on the hydrate data, shown in Table 4, indicate that 10 to 12 parallels could be the minimum. Table 2 shows that the average nucleation rates for the hydrate system are within acceptable accuracy down to 10 parallels at all temperatures, but the width of the error band increases with decreasing number of parallels. This indicates that even when data can be stated representative for the actual probability distribution, a minimum number of parallels is required to obtain acceptable accuracy in the estimated nucleation rate. Based on the results presented in Tables 2 to 4, we assume that the minimum number of parallels required to determine the nucleation rate within acceptable accuracy is in the range 20 to 25, and that 25 parallels most probably is sufficient. Our stirred cells could be assumed less perfect for nucleation experiments than the equipment and methods used in the referred literature studies, but the results and the MLE analysis with permutation tests indicated that the equipment and method used could produce reliable nucleation data.

Discussion

a) EPA-PDF vs. MLE. The EPA-PDF method is quite simple and easy to apply. Despite its ease of use and its reliable estimation of nucleation rate through curve fitting, there may be flaws in situations where the number of parallels is below some critical limit. In EPA-PDF the experimental probability array is generated by re-ordering induction time distributions according to observed nucleation frequency within a given time period, and is thus deterministic by nature. The real physical process may not support such a "subjective" probability assignment if the parallels are too few. In EPA-PDF a probability of unity is assigned to the longest measured induction time. For a finite number of parallels, the highest probability for critical nuclei formation can only be located in the vicinity of the longest induction time measured, with a probability less than unity. Experiments can only cover the complete probability array when the number of experiments approaches infinity; thus EPA-PDF most often depends on a huge number of parallels, as also suggested in the existing literature. The applicability and reliability of EPA-PDF thus inevitably deteriorate at a reduced number of parallels. Conducting hydrate nucleation experiments is often time-consuming, making complete nucleation studies a nearly impossible task. The use of a large-volume, stirred autoclave at isothermal conditions in the current work is an example, taking several weeks or months to complete one series of nucleation experiments. This kind of imperfection and practical limitation has called for reliable alternative methods to study hydrate nucleation.

With the MLE method, the nucleation rate and lag time are estimated through statistical relations. The corresponding probability distribution is then deduced with the cumulative probability function. The estimation performance of MLE is less dependent on the number of parallels and relies more on its statistical data handling. This probably makes MLE more consistent and applicable for varied nucleation systems. At the longest induction time measured, MLE always yields a P(t) value less than but close to unity, instead of unity as assigned in EPA-PDF. This is more natural and reasonable with a finite number of parallels.


Regular MLE analysis always chooses the shortest induction time as the estimate of the lag time.98 This is convenient but conservative and logically fragile. According to classical nucleation theory, the system must pass the lag time before stationary nucleation becomes probable and can eventually be observed or detected in the system. To mitigate this and bring the estimation in line with theory, the modified MLE with penalized estimators was applied in the current work.

b) Data collection approaches. Both isothermal induction time measurements and continuous cooling / cooling ramps with formation temperature monitoring have been widely used in crystallization studies. Differences do exist. For instance, in a liquid-vapor system, the solubility of natural gas components like methane36 and carbon dioxide114 in the liquid phase would remain constant at isothermal conditions within the metastable region. With continuous cooling, however, gas solubility will increase as the system approaches the supersaturation required for nucleation onset.36, 114 As a result, the mass transfer from the vapor phase to the gas-water interface and / or the bulk liquid phase will probably become a limiting factor in nucleation.48 Isothermal induction time distributions were reported to be more stochastic than the distributions observed with the continuous cooling method.48 Another remark on the two data collection approaches is that while the continuous cooling method is less labor intensive, the isothermal induction time measurements are easier to analyze and give higher accuracy.24 Nucleation data collected from either approach can be analyzed with the EPA-PDF method, with similar forms of equations.49 Since a formation temperature distribution can easily be translated via the cooling rate into an induction time distribution,66 nucleation rate and lag time estimators through MLE should work equally well.
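The translation from formation temperature to induction time mentioned above can be sketched as follows, assuming a constant cooling rate and counting time from the moment the system passes a chosen reference temperature (for example the hydrate equilibrium temperature). The reference point, the function name and the numbers are assumptions made for illustration; the cited work may define the conversion differently.

```python
def induction_time_s(reference_temp_c, formation_temp_c, cooling_rate_k_per_h):
    """Time in seconds from passing the reference temperature to nucleation,
    for a constant cooling rate given in kelvin per hour."""
    if cooling_rate_k_per_h <= 0:
        raise ValueError("cooling rate must be positive")
    if formation_temp_c > reference_temp_c:
        raise ValueError("formation temperature lies above the reference")
    return (reference_temp_c - formation_temp_c) / cooling_rate_k_per_h * 3600.0


# Hypothetical ramp: reference at 15.0 C, cooling at 2.0 K/h.
formation_temps_c = [12.0, 11.5, 13.0]
times_s = [induction_time_s(15.0, t, 2.0) for t in formation_temps_c]
print(times_s)  # [5400.0, 6300.0, 3600.0]
```

The resulting list of times can then be passed to the same rate and lag time estimators as isothermal induction times.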


c) “Maximum Likelihood Estimate” in hydrate literature. The term “maximum likelihood estimate” has appeared in MD simulation studies of methane hydrate nucleation.30, 115 These studies calculated the mean first-passage time (MFPT) in an attempt to combine results from both nucleated and non-nucleated systems in a molecular dynamics environment. Although survival probability analysis30 was adopted in these studies to deduce lag time and nucleation rate, the approach differs from the “Maximum Likelihood Estimation” examined in the present work. Although MLE has been used for crystallization processes, to the best of our knowledge this is the first time such a statistical evaluation tool has been applied to laboratory hydrate nucleation data to deduce nucleation rate, lag time and probability distribution.

d) The conjunctive use of MLE and permutation test. Independent of method, the number of parallels affects the accuracy of the estimated nucleation rate and lag time. Unlike EPA-PDF, MLE can treat a smaller number of parallels with acceptable accuracy. However, a minimum yet sufficient number of parallels must be included to ensure statistical homogeneity and valid data sampling. As shown in the current study, the minimum number of parallels required can be examined by conjunctive use of MLE and permutation tests on larger datasets. This helps verify that the estimated parameters are statistically meaningful, thus avoiding biased conclusions. In two previous nucleation studies on methane and binary methane-propane hydrate with the same type of autoclave cell at isothermal conditions, 6-16 parallel nucleation experiments were performed.25, 73 Although the present study indicates that 10 to 15 parallels may capture the statistical properties of hydrate nucleation behavior, we assume that around 20+ parallels should be included to obtain sufficient accuracy in the estimates of nucleation rate and lag time.

e) MLE for incomplete or censored data. It is not rare in scientific research that some experiments are interrupted and stopped, e.g. due to lack of time or because a preset maximum experimental time is exceeded. In the hydrate literature, such incomplete datasets have been reported but are often excluded from analysis, since EPA-PDF cannot handle censored data. MLE can be combined with different types of data censoring.93, 97, 116, 117 We normally include censored data in the MLE analysis, but no censored data occurred in the present study.

Conclusion

In the present work, we analyzed hydrate nucleation data from our laboratory and literature data from other crystallization systems by the Maximum Likelihood Estimation (MLE) method and compared the results with conventional analysis by the EPA-PDF method. While the latter requires a large number of parallels for good estimation of nucleation characteristics, MLE with penalized nucleation rate and lag time estimators proved to be a powerful alternative. MLE is efficient and statistically consistent with decent accuracy, and appears to be less dependent on a large number of experimental parallels. To ensure valid data sampling and unbiased estimation, a minimum yet sufficient number of parallels should be determined for the actual experimental setup. The minimum number may vary between nucleation systems because of differences in experimental device, procedure, data monitoring, measurement techniques, etc. The present study indicates that our experimental setup and method may produce reliable estimates of hydrate nucleation rates based on a minimum of 10 to 15 parallels. However, the number of parallels included in MLE analysis of nucleation rates should most probably be in the range 20 to 25 to ensure adequate accuracy.
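As a minimal sketch of the kind of estimation discussed above, the Python below fits a two-parameter exponential model, P(t) = 1 - exp(-J (t - τ0)) for t ≥ τ0, to a set of induction times. The model form, the function names, the boolean censoring flags and the bias-corrected variant are illustrative assumptions; the exact penalized estimators used in this work are not reproduced here.

```python
from typing import Optional, Sequence, Tuple


def mle_shifted_exponential(
    times: Sequence[float],
    censored: Optional[Sequence[bool]] = None,
) -> Tuple[float, float]:
    """Plain MLE (J, tau0) for P(t) = 1 - exp(-J * (t - tau0)), t >= tau0.

    censored[i] is True when run i was stopped at times[i] without
    nucleation (right-censoring). This flag convention is an assumption.
    """
    if censored is None:
        censored = [False] * len(times)
    if len(censored) != len(times):
        raise ValueError("times and censored must have the same length")
    observed = [t for t, c in zip(times, censored) if not c]
    if not observed:
        raise ValueError("at least one observed induction time is required")
    # The likelihood increases with tau0 up to the shortest observed time.
    tau0 = min(observed)
    # Total time at risk after the lag; runs censored before tau0 add nothing.
    exposure = sum(max(t - tau0, 0.0) for t in times)
    if exposure <= 0.0:
        raise ValueError("induction times must not all be identical")
    rate = len(observed) / exposure
    return rate, tau0


def bias_corrected_estimates(times: Sequence[float]) -> Tuple[float, float]:
    """Bias-corrected scale and lag for complete (uncensored) data.

    Uses the standard unbiased estimators of the scale (theta = 1/J) and the
    lag of a two-parameter exponential distribution. Whether these coincide
    with the penalized estimators of the paper is an assumption.
    """
    n = len(times)
    if n < 2:
        raise ValueError("at least two induction times are required")
    t_min = min(times)
    scale = sum(t - t_min for t in times) / (n - 1)
    if scale <= 0.0:
        raise ValueError("induction times must not all be identical")
    tau0 = t_min - scale / n  # can be negative when the true lag is short
    return 1.0 / scale, tau0
```

For example, for induction times of 10, 20, 30 and 40 s the plain estimate is J = 4/60 s⁻¹ with τ0 = 10 s, whereas the bias-corrected variant gives J = 0.05 s⁻¹ and τ0 = 5 s. Marking the 40 s run as censored lowers the plain estimate to J = 3/60 s⁻¹, since the run still contributes time at risk but no nucleation event.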


Acknowledgements

The authors appreciate support from and discussions with Professor Jan Terje Kvaløy at University of Stavanger on stochastic processes, permutation testing and MLE analysis. The authors thank The Norwegian Ministry of Education and Research and University of Stavanger for their financial support of this work.

Conflicts of Interest

The authors declare no conflict of interest.

Nomenclature

D    Number of parallels (data points) per divided sub-group
%    Probability density function /
J    Stationary nucleation rate (s⁻¹, or s⁻¹ m⁻³)
O    Maximum number of resampling applied in permutation test
?    Penalized likelihood function
:    Number of induction time points
/    Number of experimental parallels
     Significance indicator given by permutation test
P(t)    Probability of hydrate nuclei formation within time interval [0, t]
     Supersaturation level in aqueous solution
     Number of divided sub-groups from complete dataset
t    Induction time, time interval (s)
     Average induction time (s)
;:>    Shortest induction time measured (s)
"    Sample volume for aqueous nucleation (m³)
I    Scale parameter (I = 1/J) (s)
τ0    Lag time for hydrate nucleation (s)
ΔT    Degree of subcooling (°C)

Abbreviations

EPA    Experimental Probability Array (experimental probability distribution function)
PDF    Probability Distribution Function (theoretical probability distribution function)
EPA-PDF    Method fitting EPA values to PDF
MLE    Maximum Likelihood Estimation
CPF    Cumulative Probability Function
SPA    Statistical Probability Array

References

1. Sum, A. K.; Koh, C. A.; Sloan, E. D., Clathrate Hydrates: From Laboratory Science to Engineering Practice. Ind. Eng. Chem. Res. 2009, 48 (16), 7457-7465.
2. Sloan, E. D.; Koh, C. A., Clathrate Hydrates of Natural Gases. Third ed.; CRC Press: 2008.
3. Sloan, E. D., Fundamental Principles and Applications of Natural Gas Hydrates. Nature 2003, 426, 353-363.
4. Sloan, E. D., Hydrate Engineering. Society of Petroleum Engineers: Richardson, Texas, 2000; Vol. 21.
5. Collett, T.; Bahk, J.-J.; Baker, R.; Boswell, R.; Divins, D.; Frye, M.; Goldberg, D.; Husebø, J.; Koh, C. A.; Malone, M.; Morell, M.; Myers, G.; Shipp, C.; Torres, M., Methane Hydrates in Nature - Current Knowledge and Challenges. J. Chem. Eng. Data 2015, 60 (2).
6. Sum, A. K.; Koh, C. A.; Sloan, E. D., Developing a Comprehensive Understanding and Model of Hydrate in Multiphase Flow: From Laboratory Measurements to Field Applications. Energy Fuels 2012, 26 (7), 4046-4052.
7. Koh, C. A., Towards a Fundamental Understanding of Natural Gas Hydrates. Chem. Soc. Rev. 2002, 31 (3), 157-167.
8. Sloan, E. D., Clathrate Hydrates: The Other Common Solid Water Phase. Ind. Eng. Chem. Res. 2000, 39, 3123-3129.
9. Ripmeester, J. A., Hydrate Research - From Correlations to a Knowledge-Based Discipline. In Gas Hydrates: Challenges for the future, Holder, G. D.; Bishnoi, P. R., Eds. Ann. N. Y. Acad. Sci. 2000, 912, 1-16.

10. Makogon, Y., Hydrates of Hydrocarbons. PennWell Publishing Company: Tulsa, Oklahoma, USA, 1997.
11. Berecz, E.; Balla-Achs, M., Gas Hydrates. In Akadémiai Kiadó: Budapest, 1983.
12. Kinnari, K.; Hundseid, J.; Li, X.; Askvik, K. M., Hydrate Management in Practice. J. Chem. Eng. Data 2015, 60 (2), 437-446.
13. Sloan, E. D.; Koh, C.; Sum, A. K., Natural Gas Hydrates in Flow Assurance. Gulf Professional Publishing: New York, 2011.
14. Li, X.; Gjertsen, L.; Austvik, T., Thermodynamic Inhibitors for Hydrate Plug Melting. In Gas Hydrates: Challenges for the Future, Holder, G.; Bishnoi, P., Eds. Ann. N. Y. Acad. Sci. 2000, 912, 822-831.
15. Tohidi, B.; Anderson, R.; Chapoy, A.; Yang, J.; Burgass, R. W., Do We Have New Solutions to the Old Problem of Gas Hydrates? Energy Fuels 2012, 26 (7), 4053-4058.
16. Kelland, M. A., History of the Development of Low Dosage Hydrate Inhibitors. Energy Fuels 2006, 20 (3), 825-847.
17. Jensen, L.; Thomsen, K.; Von Solms, N., Inhibition of Structure I and II Gas Hydrates Using Synthetic and Biological Kinetic Inhibitors. Energy Fuels 2011, 25 (1), 17-23.
18. Villano, L. D.; Kommedal, R.; Fijten, M. W. M.; Schubert, U. S.; Hoogenboom, R.; Kelland, M. A., A Study of the Kinetic Hydrate Inhibitor Performance and Seawater Biodegradability of a Series of Poly(2-alkyl-2-oxazoline)s. Energy Fuels 2009, 23 (7), 3665-3675.
19. Mullin, J. W., Crystallization. Fourth ed.; Butterworth Heinemann: Oxford, 2001.
20. Freer, E. M.; Selim, M. S.; Sloan, E. D., Methane Hydrate Film Growth Kinetics. Fluid Phase Equilib. 2001, 185 (1–2), 65-75.

21. Kawamura, T.; Ohga, K.; Higuichi, K.; Yoon, J. H.; Yamamoto, Y.; Komai, T.; Haneda, H., Dissociation Behaviour of Pellet-shaped Methane-ethane Mixed Gas Hydrate Samples. Energy Fuels 2003, 17, 614-618.
22. Lee, J. M.; Cho, S. J.; Lee, J. D.; Linga, P.; Kang, K. C.; Lee, J., Insights into the Kinetics of Methane Hydrate Formation in a Stirred Tank Reactor by in situ Raman Spectroscopy. Energy Technol. 2015, 3 (9), 925-934.
23. Fandino, O.; Ruffine, L., Methane Hydrate Nucleation and Growth from the Bulk Phase: Further Insights into Their Mechanisms. Fuel 2014, 117, 442-449.
24. Kulkarni, S. A.; Kadam, S. S.; Meekes, H.; Stankiewicz, A. I.; ter Horst, J. H., Crystal Nucleation Kinetics from Induction Times and Metastable Zone Widths. Cryst. Growth Des. 2013, 13 (6), 2435-2440.
25. Abay, H. K.; Svartaas, T. M., Multicomponent Gas Hydrate Nucleation: The Effect of the Cooling Rate and Composition. Energy Fuels 2011, 25, 42-51.
26. Jensen, L. Experimental Investigation and Molecular Simulation of Gas Hydrates. PhD thesis, Technical University of Denmark, 2010.
27. Abay, H. K. Kinetics of Gas Hydrate Nucleation and Growth. PhD thesis, University of Stavanger, Stavanger, Norway, 2011.
28. Li, S.; Sun, C.; Liu, B.; Li, Z.; Chen, G.; Sum, A. K., New Observations and Insights into the Morphology and Growth Kinetics of Hydrate Films. Sci Rep. 2014, 4.
29. Ebinuma, T.; Takeya, S.; Chuvilin, E. M.; Kamata, Y.; Uchida, T.; Nagao, J.; Narita, H. Dissociation Behavior of Gas Hydrates at Low Temperature, in Fourth International Conference on Gas Hydrates, Yokohama, Japan, May 19-23, 2002.

30. Yuhara, D.; Barnes, B. C.; Suh, D.; Knott, B. C.; Beckham, G. T.; Yasuoka, K.; Wu, D. T.; Sum, A. K., Nucleation Rate Analysis of Methane Hydrate from Molecular Dynamics Simulations. Faraday Discuss. 2015, 179, 463-474.
31. Moon, C.; Hawtin, R. W.; Rodger, P. M., Nucleation and Control of Clathrate Hydrates: Insights from Simulation. Faraday Discuss. 2007, 136, 367.
32. Vatamanu, J.; Kusalik, P. G., Molecular Insights into the Heterogeneous Crystal Growth of sI Methane Hydrate. J. Phys. Chem. B 2006, 110 (32), 15896-15904.
33. Ribeiro Jr, C. P.; Lage, P. L. C., Modelling of Hydrate Formation Kinetics: State-of-the-art and Future Directions. Chem. Eng. Sci. 2008, 63 (8), 2007-2034.
34. Veluswamy, H. P.; Yang, T.; Linga, P., Crystal Growth of Hydrogen/Tetra-n-butylammonium Bromide Semiclathrates Based on Morphology Study. Cryst. Growth Des. 2014, 14, 1950-1960.
35. Lim, Y.-A.; Babu, P.; Kumar, R.; Linga, P., Morphology of Carbon Dioxide-Hydrogen-Cyclopentane Hydrates with or without Sodium Dodecyl Sulfate. Cryst. Growth Des. 2013, 13, 2047-2059.
36. Taylor, C. J.; Miller, K. T.; Koh, C. A.; Sloan, E. D., Macroscopic Investigation of Hydrate Film Growth at the Hydrocarbon/Water Interface. Chem. Eng. Sci. 2008, 62 (24), 6524-6533.
37. Osegovic, J. P.; Tatro, S. R.; Holman, S. A.; Ames, A. L.; Max, M. D., Growth Kinetics of Ethane Hydrate from a Seawater Solution at an Ethane Gas Interface. J. Pet. Sci. Eng. 2007, 56 (1-3), 42-46.
38. Hussain, S. M. T.; Kumar, A.; Laik, S.; Mandal, A.; Ahmad, I., Study of the Kinetics and Morphology of Gas Hydrate Formation. Chem. Eng. Technol. 2006, 29 (8), 937-943.

39. Ohmura, R.; Shigetomi, T.; Mori, Y. H., Formation, Growth and Dissociation of Clathrate Hydrate Crystals in Liquid Water in Contact with a Hydrophobic Hydrate-Forming Liquid. J. Cryst. Growth 1999, 196, 164-173.
40. Vekilov, P. G., Nucleation. Cryst. Growth Des. 2010, 10 (12), 5007-5019.
41. Zhang, J.; Hawtin, R. W.; Yang, Y.; Nakagava, E.; Rivero, M.; Choi, S. K.; Rodger, P. M., Molecular Dynamics Study of Methane Hydrate Formation at a Water/methane Interface. J. Phys. Chem. B 2008, 112 (34), 10608-10618.
42. Hawtin, R. W.; Quigley, D.; Rodger, P. M., Gas Hydrate Nucleation and Cage Formation at a Water/methane Interface. Phys. Chem. Chem. Phys. 2008, 10 (32), 4853-4864.
43. Anderson, B. J.; Tester, J. W.; Borghi, G. P.; Trout, B. L., Properties of Inhibitors of Methane Hydrate Formation via Molecular Dynamics Simulations. J. Am. Chem. Soc. 2005, 127, 17852-17862.
44. Kvamme, B.; Kuznetsova, T.; Aasoldsen, K., Molecular Dynamics Simulations for Selection of Kinetic Hydrate Inhibitors. J. Mol. Graphics Modell. 2005, 23 (6), 524-536.
45. Hawtin, R. W.; Rodger, P. M., Polydispersity in Oligomeric Low Dosage Gas Hydrate Inhibitors. J. Mater. Chem. 2006, 16 (20), 1934-1942.
46. Maeda, N., Fuel Gas Hydrate Formation Probability Distributions on Quasi-free Water Droplets. Energy Fuels 2015, 29, 137-142.
47. Maeda, N., Measurements of Gas Hydrate Formation Probability Distributions on a Quasi-Free Water Droplet. Rev. Sci. Instrum. 2014, 85 (6), 065115.
48. Wu, R.; Kozielski, K. A.; Hartley, P. G.; May, E. F.; Boxall, J.; Maeda, N., Probability Distributions of Gas Hydrate Formation. AIChE J. 2013, 59 (7), 2640-2646.

49. Jiang, S.; ter Horst, J. H., Crystal Nucleation Rates from Probability Distributions of Induction Times. Cryst. Growth Des. 2011, 11 (1), 256-261.
50. Barlow, T. W.; Haymet, A. D. J., ALTA: An Automated Lag-time Apparatus for Studying the Nucleation of Supercooled Liquids. Rev. Sci. Instrum. 1995, 66 (4), 2996-3007.
51. Subramanian, S.; Kini, R. A.; Dec, S. F.; Sloan, E. D., Evidence of Structure II Hydrate Formation from Methane + Ethane Mixtures. Chem. Eng. Sci. 2000, 55, 1981-1999.
52. Subramanian, S.; Ballard, A. L.; Kini, R. A.; Dec, S. F.; Sloan, E. D., Structural Transitions in Methane+Ethane Gas Hydrates - Part I: Upper Transition Point and Applications. Chem. Eng. Sci. 2000, 55 (23), 5763-5771.
53. Arjmandi, M.; Tohidi, B.; Danesh, A.; Todd, A. C., Is Subcooling the Right Driving Force for Testing Low-Dosage Hydrate Inhibitors? Chem. Eng. Sci. 2005, 60 (5), 1313-1321.
54. Anklam, M. R.; Firoozabadi, A., Driving Force and Composition for Multicomponent Gas Hydrate Nucleation from Supersaturated Aqueous Solutions. J. Chem. Phys. 2004, 121 (23), 11867-11875.
55. Kashchiev, D.; Firoozabadi, A., Driving Force for Crystallization of Gas Hydrates. J. Cryst. Growth 2002, 241, 220-230.
56. Kelland, M. A., Gas Hydrate Control. In Production Chemicals for the Oil and Gas Industry, second ed.; CRC Press: Boca Raton, FL, 2014.
57. Kelland, M. A., A Review of Kinetic Hydrate Inhibitors - Tailor-made Water-soluble Polymers for Oil and Gas Industry Applications. In Advances in Materials Science Research; Nova Science Publishers, Inc: New York, 2012; Vol. 8.

58. Walker, V. K.; Zeng, H.; Ohno, H.; Daraboina, N.; Sharifi, H.; Bagherzadeh, S. A.; Alavi, S.; Englezos, P., Antifreeze Proteins as Gas Hydrate Inhibitors. Can. J. Chem. 2015, 93 (8), 839-849.
59. Sharifi, H.; Walker, V. K.; Ripmeester, J. A.; Englezos, P., Insights into the Behavior of Biological Clathrate Hydrate Inhibitors in Aqueous Saline Solutions. Cryst. Growth Des. 2014, 14 (6), 2923-2930.
60. Bagherzadeh, S. A.; Alavi, S.; Ripmeester, J. A.; Englezos, P., Why Ice-binding Type I Antifreeze Protein Acts as a Gas Hydrate Crystal Inhibitor. Phys. Chem. Chem. Phys. 2015, 17, 9984-9990.
61. Skovborg, P.; Ng, H. J.; Rasmussen, P.; Mohn, U., Measurement of Induction Times for the Formation of Methane and Ethane Gas Hydrates. Chem. Eng. Sci. 1993, 48 (3), 445-453.
62. Vysniauskas, A.; Bishnoi, P. R., A Kinetic Study of Methane Hydrate Formation. Chem. Eng. Sci. 1983, 38 (7), 1061-1072.
63. Englezos, P.; Kalogerakis, N.; Dholabhai, P. D.; Bishnoi, P. R., Kinetics of Formation of Methane and Ethane Gas Hydrates. Chem. Eng. Sci. 1987, 42 (11), 2647-2658.
64. Okano, T.; Yanagisawa, Y.; Yamasaki, A. Development of a New Method for Hydrate Formation Kinetics Measurements - a Breakthrough Method, in Fifth International Conference on Gas Hydrates, Trondheim, Norway, June 13-16, 2005.
65. Seo, Y.; Shin, K.; Kim, H.; Wood, C. D.; Tian, W.; Kozielski, K. A., Preventing Gas Hydrate Agglomeration with Polymer Hydrogels. Energy Fuels 2014, 28, 4409-4420.
66. Ke, W.; Svartaas, T. M.; Abay, H. K. An Experimental Study on sI hydrate Formation in Presence of Methanol, PVP and PVCap in an Isochoric Cell, in Seventh International Conference on Gas Hydrates, Edinburgh, Scotland, July 17-21, 2011.

67. Ke, W.; Svartaas, T. M. The Effect of Molar Liquid Water-gas Ratio on Methane Hydrate Formation, in Seventh International Conference on Gas Hydrates, Edinburgh, Scotland, July 17-21, 2011.
68. Ke, W.; Svartaas, T. M. Effects of Stirring and Cooling on Methane Hydrate Formation in a High-pressure Isochoric Cell, in Seventh International Conference on Gas Hydrates, Edinburgh, Scotland, July 17-21, 2011.
69. Lachance, J. W.; Sloan, E. D.; Koh, C. A., Determining Gas Hydrate Kinetic Inhibitor Effectiveness Using Emulsions. Chem. Eng. Sci. 2009, 64 (1), 180-184.
70. Daraboina, N.; Pachitsas, S.; Von Solms, N., Experimental Validation of Kinetic Inhibitor Strength on Natural Gas Hydrate Nucleation. Fuel 2015, 139, 554-560.
71. Daraboina, N.; Pachitsas, S.; Von Solms, N., Natural Gas Hydrate Formation and Inhibition in Gas/Crude Oil/Aqueous Systems. Fuel 2015, 148, 186-190.
72. Abay, H. K.; Svartaas, T. M.; Ke, W., Effect of Gas Composition on sII Hydrate Growth Kinetics. Energy Fuels 2011, 25, 1335-1341.
73. Abay, H. K.; Svartaas, T. M., Effect of Ultralow Concentration of Methanol on Methane Hydrate Formation. Energy Fuels 2010, 24, 752-757.
74. Daraboina, N.; Malmos, C.; Von Solms, N., Synergistic Kinetic Inhibition of Natural Gas Hydrate Formation. Fuel 2013, 108, 749-757.
75. Sharifi, H.; Ripmeester, J. A.; Walker, V. K.; Englezos, P., Kinetic Inhibition of Natural Gas Hydrates in Saline Solutions and Heptane. Fuel 2014, 117, 109-117.
76. Wilson, P. W.; Lester, D.; Haymet, A. D. J., Heterogeneous Nucleation of Clathrates from Supercooled Tetrahydrofuran (THF)/Water Mixtures, and the Effect of an Added Catalyst. Chem. Eng. Sci. 2005, 60 (11), 2937-2941.

77. Wilson, P. W.; Heneghan, A. F.; Haymet, A. D. J., Ice Nucleation in Nature: Supercooling Point (Scp) Measurements and the Role of Heterogeneous Nucleation. Cryobiology 2003, 46 (1), 88-98.
78. Kelland, M.; Svartaas, T. M.; Øvsthus, J.; Namba, T., A new class of kinetic hydrate inhibitor. In Gas Hydrates: Challenges for the future, Holder, G. D.; Bishnoi, P. R., Eds. Ann. N. Y. Acad. Sci. 2000, 912, 281-293.
79. Ohmura, R.; Ogawa, M.; Yasuoka, K.; Mori, Y. H., Statistical Study of Clathrate-Hydrate Nucleation in a Water/Hydrochlorofluorocarbon System: Search for the Nature of the “Memory Effect”. J. Phys. Chem. B 2003, 107, 5289-5293.
80. Zachariassen, K. E.; Baust, J. G.; Lee, R. E., A Method for Quantitative Determination of Ice Nucleating Agents in Insect Hemolymph. Cryobiology 1982, 19 (2), 180-184.
81. Kashchiev, D.; Borissova, A.; Hammond, R. B.; Roberts, K. J., Effect of Cooling Rate on the Critical Undercooling for Crystallization. J. Cryst. Growth 2010, 312 (5), 698-704.
82. Kvamme, B.; Graue, A.; Aspenes, E.; Kuznetsova, T.; Granasy, L.; Toth, G.; Pusztai, T.; Tegze, G., Kinetics of Solid Hydrate Formation by Carbon Dioxide: Phase Field Theory of Hydrate Nucleation and Magnetic Resonance Imaging. Phys. Chem. 2004, 6 (9), 2327-2334.
83. Goh, L.; Chen, K.; Bhamidi, V.; He, G.; Kee, N. C. S.; Kenis, P. J. A.; Zukoski, C. F.; Braatz, R. D., A Stochastic Model for Nucleation Kinetics Determination in Droplet-Based Microfluidic Systems. Cryst. Growth Des. 2010, 10 (6), 2515-2521.
84. Toschev, S.; Milchev, A.; Stoyanov, S., On Some Probability Aspects of the Nucleation Process. J. Cryst. Growth 1972, 13/14, 123-127.

85. Abay, H. K.; Hovland, J.; Svartaas, T. M. The Effect of PVCap on Methane Hydrate Nucleation and Growth, in Seventh International Conference on Gas Hydrates, Edinburgh, Scotland, July 17-21, 2011.
86. Abay, H. K.; Hoevring, E.; Svartaas, T. M. Does PVCap Promote Nucleation of Structure II Hydrate?, in Seventh International Conference on Gas Hydrates, Edinburgh, Scotland, July 17-21, 2011.
87. Takeya, S.; Hori, A.; Hondoh, T.; Uchida, T., Freezing-Memory Effect of Water on Nucleation of CO2 Hydrate Crystals. J. Phys. Chem. B 2000, 104, 4164-4168.
88. Fisher, R. A., On an Absolute Criterion for Fitting Frequency Curves. MM. 1912, 41, 155-160.
89. Savage, L. J., On Rereading R. A. Fisher. Ann. Stat. 1976, 4 (3), 441-500.
90. Aldrich, J., R. A. Fisher and the Making of Maximum Likelihood 1912-1922. Stat. Sci. 1997, 12 (3), 162-176.
91. Epstein, B., Estimation of the Parameters of Two Parameter Exponential Distributions from Censored Samples. Technometrics 1960, 2 (3), 403-406.
92. Epstein, B.; Sobel, M., Sequential Life Tests in the Exponential Case. Ann. Math. Statist. 1955, 26, 82-93.
93. Epstein, B.; Sobel, M., Some Theorems Relevant to Life Testing from an Exponential Distribution. Ann. Math. Stat. 1954, 25 (2), 373-381.
94. Millar, R. B., Some Widely Used Applications of Maximum Likelihood. In Maximum Likelihood Estimation and Inference: With Examples in R, SAS and ADMB, John Wiley & Sons, Ltd: Chichester, UK, 2011.

95. Sarhan, A. E., Estimation of the Mean and Standard Deviation by Order Statistics. Ann. Math. Stat. 1954, 25, 317-328.
96. Cohen, A. C.; Helm, F. R., Estimation in the Exponential Distribution. Technometrics 1973, 15 (2), 415-418.
97. Epstein, B., Truncated Life Tests in the Exponential Case. Ann. Math. Statist. 1954, 25 (3), 555-564.
98. Zheng, M. Penalized Maximum Likelihood Estimation of Two-parameter Exponential Distributions. Master Thesis, University of Minnesota, 2013.
99. Forney, G. D., Maximum-Likelihood Sequence Estimation of Digital Sequences in the Presence of Intersymbol Interference. IEEE Trans. Inf. Theory 1972, 18 (3), 363-378.
100. Tamin, O. Z., Public Transport Demand Estimation by Calibrating a Combined Trip Distribution-modal Choice from Passenger Counts: A Case Study in Bandung (Indonesia). EASTS 1997, 2 (3), 949-961.
101. Greene, W. H., Maximum Likelihood Estimation of Econometric Frontier Functions. J. Econometrics 1980, 13, 27-56.
102. Hutchinson, J. B., The Application of the "Method of Maximum Likelihood" to the Estimation of Linkage. Genetics 1929, 14, 519-537.
103. York, T. L.; Durrett, R. T.; Tanksley, S.; Nielsen, R., Bayesian and Maximum Likelihood Estimation of Genetic Maps. Genet. Res. Camb. 2005, 85, 159-168.
104. Myung, J., Tutorial on Maximum Likelihood Estimation. J Math Psychol. 2003, 47, 90-100.
105. Zhong, S.; Xia, W.; He, Z. Approximate Maximum Likelihood Time Differences Estimation in the Presence of Frequency and Phase Consistence Errors, in 2013 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), Athens, December 12-15, 2013.
106. Sijbers, J.; Den Dekker, A. J.; Scheunders, P.; Van Dyck, D., Maximum-Likelihood Estimation of Rician Distribution Parameters. IEEE Trans. Med. Imaging 1998, 17 (3), 357-361.
107. Sijbers, J.; den Dekker, A. J., Maximum Likelihood Estimation of Signal Amplitude and Noise Variance from MR Data. Magn. Reson. Med. 2004, 51 (3), 586-594.
108. Gardner, D. G.; Gardner, J. C.; Meinke, W. W., Method for the Analysis of Multicomponent Exponential Decay Curves. J. Chem. Phys. 1959, 31 (4), 978-986.
109. Rotondi, M. A., To Ski or not to Ski: Estimating Transition Matrices to Predict Tomorrow’s Snowfall Using Real Data. J. Stat. Educ. 2010, 18 (3), 1-14.
110. Good, P., Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses. Second ed.; Springer: New York, 2000.
111. Collingridge, D., A Primer on Quantitized Data Analysis and Permutation Testing. J. Mix. Methods. Res. 2013, 7 (1), 81-97.
112. Natarajan, V.; Bishnoi, P. R.; Kalogerakis, N., Induction Phenomena in Gas Hydrate Nucleation. Chem. Eng. Sci. 1994, 49 (13), 2075-2087.
113. R Core Team (2013). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
114. Zatsepina, O. Y.; Buffett, B. A., Experimental Study of the Stability of CO2-Hydrate in a Porous Medium. Fluid Phase Equilib. 2001, 192 (1–2), 85-102.
115. Walsh, M. R.; Beckham, G. T.; Koh, C. A.; Sloan, E. D.; Wu, D. T.; Sum, A. K., Methane Hydrate Nucleation Rates from Molecular Dynamics Simulations: Effects of Aqueous Methane Concentration, Interfacial Curvature, and System Size. J. Phys. Chem. C 2011, 115, 21241-21248.
116. Childs, A.; Chandrasekar, B.; Balakrishnan, N.; Kundu, D., Exact Likelihood Inference Based on Type-I and Type-II Hybrid Censored Samples from the Exponential Distribution. Ann. Inst. Statist. Math. 2003, 55 (2), 319-330.
117. Ganguly, A.; Mitra, S.; Samanta, D.; Kundu, D., Exact Inference for the Two-Parameter Exponential Distribution under Type-II Hybrid Censoring. J. Stat. Plan. Inference 2012, 142 (3), 613-625.

FIGURES

[Figure 1 plot: m-aminobenzoic acid (m-ABA) nucleation at 25 °C. Axes: probability distribution, P(t), versus induction time [s].]
Fit parameters shown in the plot (value ± error):
EPA-PDF: J = 0.000627 ± 1.13e-5; τ0 = 1164.5 ± 14.2; Chisq = 0.0831; R = 0.99375
EPA-PDF, τ0,tuned = 1030 s: J = 0.000559 ± 9.6e-6; Chisq = 0.15824; R = 0.98806
Penalized MLE: J = 0.000588 ± 4.7e-12; τ0 = 1005.7 ± 7.4e-6; Chisq = 1.3e-14; R = 1

Figure 1. Probability distribution by EPA-PDF and MLE analysis on m-ABA nucleation (data from literature49). The figure shows nucleation rates for the system with lag time, τ0, according to the primary simulation, and with lag time tuned to the lowest measured induction time, to demonstrate the effect of an unrealistically high lag time value with the EPA-PDF method.

[Figure 2 plot: Nucleation of super-cooled water on AgI crystals at -4 °C. Axes: probability distribution, P(t), versus induction time [s].]
Fit parameters shown in the plot (value ± error):
EPA-PDF: J = 0.000335 ± 1.7e-6; τ0 = 95.6 ± 5.6; Chisq = 0.09208; R = 0.99838
Penalized MLE: J = 0.0004798 ± 1.7e-12; τ0 = 163.3 ± 2.9e-6; Chisq = 4.8e-14; R = 1

Figure 2. Probability distribution by EPA-PDF and MLE analysis on nucleation from supercooled water on AgI crystals (data from literature50).

[Figure 3 plot: Isonicotinamide (INA) nucleation in ethanol solutions at 25 °C. Axes: probability distribution, P(t), versus induction time [s].]
Fit parameters shown in the plot (value ± error):
EPA-PDF: J = 0.000461 ± 3.9e-6; τ0 = 211.3 ± 8.5; Chisq = 0.04938; R = 0.99794
Penalized MLE: J = 0.000401 ± 2.2e-12; τ0 = 182.7 ± 6.2e-6; Chisq = 2.0e-14; R = 1

Figure 3. Probability distribution by EPA-PDF and MLE analysis on INA nucleation (data from literature24).

[Figure 4 plot: Methane-propane sII hydrate nucleation at ΔT = 10.05 °C. Axes: probability distribution, P(t), versus induction time [s].]
Fit parameters shown in the plot (value ± error):
EPA-PDF: J = 0.00833 ± 0.000728; τ0 = 6.39 ± 4.71; Chisq = 0.1425; R = 0.97105
Penalized MLE: J = 0.00233 ± 6.46e-7; τ0 = -2.29 ± 0.03; Chisq = 1.36e-6; R = 1
Penalized MLE, τ0,tuned = 0 s: J = 0.00237 ± 6.1e-6; Chisq = 0.00025; R = 0.99994

Figure 4. Probability distribution by EPA-PDF and MLE analysis for binary methane-propane hydrate nucleation at 90 bar and 11.75 °C (data from present work). The MLE analysis includes a simulation with the lag time set to zero because the estimated value was negative; the resulting probability curve overlaps the original.

[Figure 5 plot: Methane-propane sII hydrate nucleation at ΔT = 8.8 °C. Axes: probability distribution, P(t), versus induction time [s].]
Fit parameters shown in the plot (value ± error):
EPA-PDF: J = 0.004419 ± 0.0001497; τ0 = 6.05 ± 3.45; Chisq = 0.10901; R = 0.98904
Penalized MLE: J = 0.0011768 ± 1.5e-7; τ0 = 6.85 ± 0.03; Chisq = 9.94e-7; R = 1

Figure 5. Probability distribution by EPA-PDF and MLE analysis for binary methane-propane hydrate nucleation at 90 bar and 13 °C (data from present work).

[Plot: Methane-propane sII hydrate nucleation at ∆T = 7.55 °C. Probability distribution, P(t), and P(t)MLE versus Induction time [s] on a logarithmic axis from 10 to 10^5 s.]

Fit parameters shown in the plot (value ± error):
EPA-PDF: J = 0.000913 ± 0.000123, τ0 = -113.8 ± 60.6, Chisq = 0.2597, R = 0.94657
EPA-PDF, τ0,tuned = 0 s: J = 0.00117 ± 0.000102, Chisq = 0.3039, R = 0.93718
Penalized MLE: J = 0.000306 ± 5.1e-12, τ0 = -55.0 ± 1.3e-5, Chisq = 4.4e-15, R = 1
Penalized MLE, τ0,tuned = 0 s: J = 0.000319 ± 3.6e-6, Chisq = 0.00313, R = 0.9996
Penalized MLE, τ0,tuned = 54 s: J = 0.000332 ± 8.0e-6, Chisq = 0.0135, R = 0.99827

Figure 6. Probability distribution by EPA-PDF and MLE analysis for binary methane-propane hydrate nucleation at 90 bar and 14.25 °C (data from present work). The figure shows nucleation rates for the system with lag time, τ0, according to the primary simulation, and with the lag time tuned to zero or to the lowest observed induction time, to demonstrate the effect of unrealistically low lag time values on the EPA-PDF and MLE methods.
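
The effect of tuning the lag time can be sketched numerically. The snippet below assumes the common shifted-exponential model P(t) = 1 - exp(-J(t - τ0)) and uses the plain maximum-likelihood rate for a fixed lag time, J = n / Σ(t_i - τ0). It is not the penalized estimator used for the figure, and the function names and induction times are illustrative placeholders rather than data from this work.

```python
import math


def rate_for_fixed_lag(times, tau0):
    """Plain MLE of the nucleation rate J when the lag time tau0 is held fixed.

    Assumes P(t) = 1 - exp(-J * (t - tau0)) for t >= tau0, so the
    log-likelihood n*ln(J) - J*sum(t_i - tau0) peaks at J = n / sum(t_i - tau0).
    """
    if tau0 > min(times):
        raise ValueError("tau0 must not exceed the shortest induction time")
    excess = sum(t - tau0 for t in times)
    if excess <= 0:
        raise ValueError("induction times must not all equal tau0")
    return len(times) / excess


def nucleation_probability(t, rate, tau0):
    """Cumulative nucleation probability at time t under the assumed model."""
    if t <= tau0:
        return 0.0
    return 1.0 - math.exp(-rate * (t - tau0))


# Hypothetical induction times in seconds (made up for this sketch).
induction_times = [54.0, 310.0, 820.0, 1500.0, 2900.0, 4100.0, 7600.0]

# Lag time tuned to zero, then to the lowest observed induction time.
j_zero = rate_for_fixed_lag(induction_times, 0.0)
j_min = rate_for_fixed_lag(induction_times, min(induction_times))

print(f"tau0 = 0 s:      J = {j_zero:.3e} 1/s")
print(f"tau0 = min(t) s: J = {j_min:.3e} 1/s")
print(f"P(1000 s) with tau0 = 0: {nucleation_probability(1000.0, j_zero, 0.0):.3f}")
```

Under these assumptions a larger fixed lag time shortens every t_i - τ0, so the estimated rate can only rise as the lag time is tuned upward.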

[Plot: C1-C3 nucleation at 13 °C, J with data splitting (mean±SD). Vertical axis: nucleation rate, J [s^-1], from -0.005 to 0.03.]
a) MLE estimates: J = 0.0012 s^-1 (all data included), compared with subgroups of 30, 20, 12 and 6 datapoints.
b) EPA-PDF estimates: J = 0.0044 s^-1 (all data included), compared with subgroups of 30, 20, 12 and 6 datapoints.

Figure 7. Effect of reduced number of experimental parallels on estimated nucleation rate (methane-propane nucleation at 13 °C) using MLE (Fig. 7a) and EPA-PDF (Fig. 7b) methods.
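
A data-splitting experiment of this kind can be sketched as follows. How the subgroups were formed is not stated in this section, so the sketch assumes a random shuffle followed by cuts into consecutive, equally sized groups. It also assumes a zero lag time, a plain exponential MLE (J = n / Σt_i), and synthetic induction times; all names are hypothetical.

```python
import random
import statistics


def rate_mle(times, tau0=0.0):
    """Plain exponential MLE of the rate for a fixed lag time tau0."""
    return len(times) / sum(t - tau0 for t in times)


def subgroup_rates(times, group_size, seed=0):
    """Shuffle the data, cut it into full groups of group_size, estimate J per group."""
    shuffled = list(times)
    random.Random(seed).shuffle(shuffled)
    last_start = len(shuffled) - group_size
    groups = [shuffled[i:i + group_size] for i in range(0, last_start + 1, group_size)]
    return [rate_mle(group) for group in groups]


# Synthetic stand-in for 60 parallels: exponential induction times, assumed rate.
rng = random.Random(42)
assumed_rate = 0.0012  # 1/s, chosen only for illustration
times = [rng.expovariate(assumed_rate) for _ in range(60)]

print(f"all data: J = {rate_mle(times):.5f} 1/s")
for size in (30, 20, 12, 6):
    rates = subgroup_rates(times, size)
    mean = statistics.mean(rates)
    sd = statistics.stdev(rates)
    print(f"subgroups of {size:2d}: J = {mean:.5f} ± {sd:.5f} 1/s ({len(rates)} groups)")
```

With this estimator and equal group sizes, the all-data rate is the harmonic mean of the subgroup rates, so the arithmetic mean over subgroups can only be equal or larger. That is one reason small subgroups tend to give higher average rates with wider spread.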

TABLES

Table 1. A summary of stationary nucleation rate, J, and lag time, τ0, ± standard error as estimated by EPA-PDF and MLE respectively. Note: Nucleation rates, J, are multiplied by the given scaling factors.

Method               Results
                     m-ABA         Water on AgI  INA           C1-C3         C1-C3         C1-C3
                     25 °C         -4 °C         25 °C         11.75 °C      13 °C         14.25 °C
                     J x 10^4      J x 10^4      J x 10^4      J x 10^3      J x 10^3      J x 10^3
EPA-PDF, J (s^-1)    6.3 ± 0.1     3.4 ± 0.0     4.6 ± 0.0     8.3 ± 0.7     4.4 ± 0.1     1.2 ± 0.1
MLE, J (s^-1)        5.9 ± 0.0     4.8 ± 0.0     4.0 ± 0.0     2.4 ± 0.0     1.2 ± 0.0     0.3 ± 0.0
EPA-PDF, τ0 (s)      1165 ± 14     96 ± 6        211 ± 9       6 ± 4         6 ± 3         0*
MLE, τ0 (s)          1006 ± 0      163 ± 0       183 ± 0       0*            7 ± 0         0*

*: Negative lag time, τ0, by simulation adjusted to zero.

Table 2. Nucleation rate, J (s^-1), by MLE with reduced number of induction times for analysis.

Data points          m-ABA         Water on AgI        INA           C1-C3         C1-C3         C1-C3
per group            25 °C         -4 °C               25 °C         11.75 °C      13 °C         14.25 °C
                     (80 points)   (first 300 points)  (144 points)  (30 points)   (60 points)   (30 points)
                     x 10^4        x 10^4              x 10^4        x 10^3        x 10^3        x 10^3
All Data             5.9 ± 0.0     4.8 ± 0.0           4.0 ± 0.0     2.4 ± 0.0     1.2 ± 0.0     0.32 ± 0.00
40 – 50              6.0 ± 0.1     4.8 ± 0.9           4.0 ± 0.4     -             -             -
20 – 30              6.0 ± 0.6     5.2 ± 2.3           4.1 ± 0.8     -             1.2 ± 0.3     -
12 – 20              -             6.0 ± 5.0           5.1 ± 3.1     3.2 ± 2.4     1.2 ± 0.3     0.30 ± 0.05
10 – 12              8.1 ± 4.8     -                   5.4 ± 4.1     3.9 ± 2.5     1.6 ± 0.9     0.30 ± 0.07
5 – 6                8.3 ± 3.8     9.2 ± 11.5          7.2 ± 6.2     6.3 ± 5.2     6.7 ± 12.6    0.70 ± 0.90

Table 3. p-value statistics summarized from permutation tests on literature nucleation data for all subgroups within each dataset.

      m-ABA                           Water on AgI                    INA
      25 °C                           -4 °C                           25 °C
      (80 parallels)                  (300 parallels)                 (144 parallels)
      N     Points  p ≤0.05  p ≤0.1   N     Points  p ≤0.05  p ≤0.1   N     Points  p ≤0.05  p ≤0.1
      2     40      0        0        6     50      0        0        3     48      0        0
      4     20      0        0        12    25      2        3        6     24      0        0
      8     10      0        2        24    12/13   4        8        9     16      3        4
      16    5       1        3        60    5       14       20       12    12      2        2
      -     -       -        -        -     -       -        -        24    6       6        11
Sum:  30    -       1        5        102   -       20       31       54    -       11       17

Table 4. p-value statistics summarized from permutation tests on hydrate nucleation for all subgroups within the dataset.

      C1-C3                           C1-C3                           C1-C3
      11.75 °C                        13 °C                           14.25 °C
      (30 parallels)                  (60 parallels)                  (30 parallels)
      N     Points  p ≤0.05  p ≤0.1   N     Points  p ≤0.05  p ≤0.1   N     Points  p ≤0.05  p ≤0.1
      2     15      0        0        2     30      0        0        2     15      0        0
      3     10      0        0        3     20      0        0        3     10      0        0
      5     6       0        0        5     12      0        0        5     6       0        2
      -     -       -        -        10    6       1        1        -     -       -        -
Sum:  10    -       0        0        20    -       1        1        10    -       0        2
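
The tables report only how many subgroups fall at or below the 0.05 and 0.1 thresholds. As a generic illustration, and not necessarily the procedure used in this work, a two-sample permutation test on the difference in mean induction time between two subgroups could be sketched as below. The test statistic, the number of permutations, the function name and the synthetic data are all assumptions made for this sketch.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the absolute difference of group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_permutations):
        # Relabel the pooled data at random and recompute the statistic.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Synthetic stand-in: two subgroups of 15 induction times from one distribution.
rng = random.Random(7)
subgroup_1 = [rng.expovariate(0.0012) for _ in range(15)]
subgroup_2 = [rng.expovariate(0.0012) for _ in range(15)]

p_value = permutation_p_value(subgroup_1, subgroup_2)
print(f"p = {p_value:.3f}, at or below 0.05: {p_value <= 0.05}, at or below 0.1: {p_value <= 0.1}")
```

Repeating such a comparison over every subgroup and counting the results at or below each threshold would give summary counts in the same form as the tables.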
