Automatic Fitting of Binary Interaction Parameters for Multi-fluid

Sep 26, 2016 - ... automatic evolutionary optimization algorithm written in the python programming language to fit the two most important interaction ...
1 downloads 8 Views 2MB Size
Article pubs.acs.org/jced

Automatic Fitting of Binary Interaction Parameters for Multi-fluid Helmholtz-Energy-Explicit Mixture Models Ian H. Bell* and Eric W. Lemmon* National Institute of Standards and Technology, Boulder, Colorado 80305, United States S Supporting Information *

ABSTRACT: In the highest-accuracy mixture models available today, these being the multi-fluid Helmholtz-energyexplicit formulations, there are a number of binary interaction parameters that must be obtained through correlation or estimation schemes. These binary interaction parameters are used to shape the thermodynamic surface and yield higherfidelity predictions of various thermodynamic properties including vapor-liquid equilibria and homogeneous p-v-T data, among others. In this work, we have used a novel and entirely automatic evolutionary optimization algorithm written in the python programming language to fit the two most important interaction parameters for more than 1100 binary mixtures. This fitting algorithm can be run on multiple processors in parallel, resulting in a reasonable total running time for this large set of binary mixtures. For more than 830 of the binary pairs, the median absolute relative error in bubble-point pressure is less than 5%. The source code for the fitter is provided as supplemental data, as well as the entire set of binary interaction parameters obtained and comparisons with the best experimental vapor-liquid-equilibrium data that are available.



thermophysical property libraries REFPROP,3 CoolProp,4 and TREND.5 Non-dimensionalized Helmholtz Energy Models. REFPROP 9.13 is the current state-of-the-art in mixture binary parameters; it includes interaction parameters for 697 mixtures, with approximately 200 of these obtained from the literature6−13 as well as mixture interaction parameters that were fitted through the use of in-house code. To the authors’ knowledge, the only other libraries that implement a significant number of binary interaction parameters for high-accuracy mixture models are the open-source library CoolProp 5.14 comprising 220 binary pairs and the TREND 2.0 package from the University of Bochum, Germany,5 comprising 210 binary pairs. In the multi-fluid mixture model, the pressure of the mixture can be obtained from

INTRODUCTION

There are many types of mixture models available to represent the thermodynamic properties of mixtures of fluids that cover both phase equilibria and homogeneous (liquid, gas, and supercritical) states. Over the years, the mixture models have increased in complexity and accuracy, largely driven by the increasing power of computers, which has both allowed the use of more computationally expensive equations of state, as well as allowed correlators the ability to leverage this computational power to fit more complex equations of state. One of the most commonly used families of models is that of cubic equations of state. These cubic equations are straightforward to implement, even for mixtures, and can allow for solution of the density as a function of temperature and pressure in an explicit formulation, even though they are expressed in the form of pressure as a function of temperature and specific volume. These cubic equations of state can be converted to a Helmholtz-energy-explicit formulation through the use of the analysis of Bell and Jäger.1 Other models have also been proposed, including the SAFT-based models, LeeKesler-Plö cker,2 and the non-dimensionalized Helmholtz energy formulation utilized here. The multi-fluid mixture property model has seen wide application because it represents a fundamental thermodynamic potential, and therefore all other thermodynamic properties can be obtained by derivatives of the Helmholtz energy. This mixture formulation is that used in the state-of-the-art This article not subject to U.S. Copyright. Published 2016 by the American Chemical Society

⎛ ⎞ ∂α r p = ρRT ⎜1 + δ (τ , δ )⎟ ⎝ ⎠ ∂δ

(1)

Other thermodynamic properties of interest (enthalpy, entropy, etc.) can be obtained in a similar fashion. For more information on calculating other thermodynamic properties from derivatives of the residual Helmholtz energy, see the work of Span.14 The Helmholtz energy is expressed in terms of the reduced density Received: March 21, 2016 Accepted: September 13, 2016 Published: September 26, 2016 3752

DOI: 10.1021/acs.jced.6b00257 J. Chem. Eng. Data 2016, 61, 3752−3760

Journal of Chemical & Engineering Data

Article

δ = ρ/ρr(x)̅ and the reciprocal reduced temperature τ = Tr(x)/ ̅ T. For further information, the reader is directed to the coverage of the mixture models in the literature.11−13 According to the most recent formulation used in all of the state-of-the-art libraries, the reducing parameters for the mixture (Tr and vr = 1/ρr) can be given in a common form by N

Yr(x ̅ ) =

N−1

N

∑ xi2Yc ,i + ∑ ∑ i=1

i=1 j=i+1

2xixj

xi + xj βY2,ijxi + xj

Yij (2)

where Y is the parameter of interest, either molar specific volume v or temperature T. The necessary parameters are given by

Figure 1. Boxplot of the mixture interaction parameters in the REFPROP 9.1 library (blue box shows interquartile range containing 25th to 75th percentile of the data, fliers outside this range shown by markers, the red lines show median values).

These mixture reducing equations are simply weighting functions of the critical properties of the pure fluids that form the mixture. For the ij pair, there are a total of four adjustable parametersβT,ij, γT,ij, βv,ij, and γv,ij. The γ parameters are symmetric (γY,ij = γY,ji), while the β parameters are not symmetric (βY,ij = 1/βY,ji), and thus the order of fluids in the binary pair is important and must be handled carefully when implementing this sort of mixture model. There is additionally an adjustable parameter Fij that is applied to the generalized departure function as described by Kunz and Wagner.12 In this work, we do not consider the departure terms because of the paucity of experimental data for many of the mixtures under study; a significant body of experimental data is needed to be able to fit the departure function. Hence, the parameter Fij in the departure term is set equal to zero. The mixture interaction parameters βT,ij, γT,ij, βv,ij, and γv,ij are often obtained through manual fitting, some deterministic optimization routines, as well as user intervention to divide the experimental data into training and validation sets. The culmination of all this work is the set of binary interaction parameters in the REFPROP 9.1 library. Figure 1 shows a boxplot of the parameter distributions for all binary pairs that are included in REFPROP 9.1. As can be seen in the figure, the median value (given by the red bar inside the box), is very nearly unity for all four of the interaction parameters. Aside from γT,ij, the range of the parameters is also quite narrow, being quite tightly clustered around unity. While the work, which culminated in the mixture interaction parameters that are implemented in the REFPROP library (and the GERG models,12,13,15 among others), was able to use the binary-specific departure functions to strongly shape the thermodynamic property surface, the analysis here only fits two parameters, βT,ij and γT,ij, and sets the parameters βv,ij and γv,ij to 1.0. The motivation for fitting only these two parameters is that they are the parameters that are the most important for fitting bubble-point data, and adjusting these parameters within reasonable ranges does not tend to result in significant (and undesired) deformation of the thermodynamic property surfaces. The significance of these parameters can be seen by plotting phase envelopes for binary mixtures. Phase envelopes represent the saturation curve (dew-point and bubble-point), including

the entire critical region, for a fixed mixture composition. Figure 2 shows several phase envelopes for an equimolar

Figure 2. Phase envelopes for an equimolar mixture of methane and ethane.

mixture of methane and ethane, demonstrating that while adjusting the interaction parameters deforms the phase envelopes, the deformation does not result in drastically different phase envelope shapes. Importantly, modifying the interaction parameters has a much more significant impact on the bubble-point lines than on the dew-point lines. This is important because fitting only dew point data to obtain the interaction parameters could yield significantly erroneous predictions for the bubble-point data. It is largely for this reason that the bubble-point data were used exclusively to fit the mixture interaction parameters in this work. Furthermore, for many mixtures, the only data available are bubble-point measurements.



DATA COLLECTION AND PREPARATION The values obtained for the binary interaction parameters can only be as good as the experimental data upon which they are based. Therefore, it is of utmost importance to have as 3753

DOI: 10.1021/acs.jced.6b00257 J. Chem. Eng. Data 2016, 61, 3752−3760

Journal of Chemical & Engineering Data

Article

comprehensive and complete an experimental database to draw from as possible. In this work, two large databases of experimental vapor-liquid equilibria data were merged into one joined database. The primary database for this study is the database ThermoDataEngine16−18 that is developed at the National Institute of Standards and Technology in Boulder, Colorado. As of publication, ThermoDataEngine includes vapor-liquid equilibria data for more than 40 000 binary mixtures. The second database of experimental data is an internal database developed by researchers at NIST over the last few decades. Both ThermoDataEngine and the internal NIST mixture property database include other types of data, including homogeneous-phase pressure-density-temperature (p-ρ-T) data, virial coefficient data, etc. Merging the two databases required careful reconciliation of their respective data sets. In many cases the same data set was present in both databases, in which case only one copy was retained. Furthermore, the databases were in entirely different formats with incompatible metadata, requiring several steps of normalization to make the databases possible to be merged into one master database. In the end, the master database of experimental data was created in a form compatible with the python package pandas, and a separate database of bibliographic information was generated in the BibTeX format. Figure 3 shows the distribution of bubble-point data in the master database, sorted by the bubble-point data available.

• Modification of the interaction parameters can cause failure of flash routines that previously succeeded, or vice versa. Sometimes very small changes can cause this failure state transition, and only one data point may see this transition. • The objective function is not smooth, making numerical partial differentiation of the objective function with respect to the interaction parameters difficult or impossible. • A highly accurate and well constructed equation of state must already exist for each of the components in the mixture. • Some of the experimental data points are questionable or are known to be faulty, though that may not be noted as such in the respective database. • The experimental uncertainties are in most cases unknown. The overarching conclusion is therefore that deterministic (especially derivative-based) optimization routines are not viable for this problem. Therefore, another method must be selected that can accommodate the roughness of the objective function, is robust to transitions in the failure state of the flash routines, and can conveniently arrive at the optimal values for the parameters. Before beginning an optimization campaign, it is crucial to understand the shape of the objective function. Figure 4 shows

Figure 3. Distribution of bubble-point data from reconciled databases.

Figure 4. Presentation of the objective function over a regular grid for the binary pair methane-isobutane along with cross sections cutting through the surface in the vicinity of the minimum of the surface. The objective function surface is truncated at a value of 300.

There are a number of popular mixtures that have been exceptionally well studied; in this case the five mixtures with the most experimental bubble-point data are ethanol-water, methanol-water, nitrogen-water, benzene-cyclohexane, and ammonia-water. On the other end of the spectrum are comparably many mixtures for which there is only a small amount of data available, with as few as one bubble-point measurement.

a plot of the objective function for the methane-isobutane binary pair, as well as cross sections A and B cutting through the surface in the vicinity of the minimum value of the objective function. There is a clear minimum in the objective function near βT,ij = 1.01, and γT,ij = 1.2. The color filling the plot is generated by interpolation of the objective function at the gridded data points. The objective function is not smooth due to the random selection of experimental data points, as demonstrated by the jagged form of the contours, especially at high values of γT,ij. In the course of this work, several optimization methods were investigated, including brute-force optimization with cubic interpolation, among others. In the end, the most robust and computationally efficient solution proved to be genetic



FITTING METHODOLOGY As described above, the goal of the fitting procedure is to arrive at the best values for βT,ij and γT,ij for a given binary mixture of two pure fluids; this pair of optimal values represents a global minimization of the objective function (to be defined below). The fitting of binary interaction parameters is complicated by a number of unavoidable features of the fitting process: 3754

DOI: 10.1021/acs.jced.6b00257 J. Chem. Eng. Data 2016, 61, 3752−3760

Journal of Chemical & Engineering Data

Article

error metric. The weakness of these error metrics is that one very large value (for instance a failure of a flash routine entered as a large number) can dramatically skew the error metric. On the other hand, the use of the median absolute error rather than either RSSE or MAE results in an error that rejects the failures in the flash routine. So long as only a few randomly selected handfuls contain failures of the flash routines, the median absolute error metric will allow the optimization algorithm to reject these failures and keep moving toward the optimal solution. Thus, the base objective function that is being minimized is

optimization. In particular, we employed the open-source DEAP python package in this work,19 the details of which are described further on. Optimization and Fitting. As described above, determination of the mixture interaction parameters is a challenging optimization problem that is not well suited to deterministic (derivative-based) optimization methods. For this reason, various stochastic optimization routines were considered. The use of stochastic, or evolutionary, optimization routines is not a new concept. For instance Schmidt and Lipson20 used evolutionary optimization to distill governing physical laws from experimental data on (simple) physical systems. Furthermore, other stochastic optimization methods are available, and Krink et al.21 demonstrated that evolutionary optimization is one of the best optimization methodologies for noisy objective functions. They show that evolutionary optimization is superior to differential evolution for noisy optimization. This provides further support for the optimization scheme used here. Fitting Process. The computational pipeline employed for this problem is to start with a library of many (approximately 160000) experimental vapor-liquid-equilibrium data points spanning more than 1 200 binary mixtures for which reliable equations of state are available for both components. The current fitting methodology includes only the vapor-liquidequilibrium data, and more specifically, the fitting procedure is only based on the use of bubble-point measurements. All of the bubble-point data for a given binary pair are extracted from the library into a pair-specific subset. This subset for the selected mixture forms the data source for the fitting routines. The optimization routine has two independent variables, βT,ij and γT,ij. For a given set of βT,ij and γT,ij, a handful of data points (generally 5−10) are randomly selected from the data set for the binary pair. For each of the selected data points in the handful, the bubble-point pressure is calculated as a function of the given bubble-point temperature and bulk mole fraction (the mole fractions of the liquid phase at the bubble point are equivalent to the bulk mole fractions). The signed error vector over these randomly selected handful of data points is calculated as eS⃗ =

pexp ⃗ − pcalc ⃗ pexp ⃗

OBJ0(βT ,ij , γT ,ij) = median([E RSS,1, E RSS,2 ,...])

It is necessary to further constrain the objective function to yield more reliable flash routine evaluations. With the median error metric, if 49% of the experimental data points cause failures of the flash routines, the median error can still be quite low since the middle value is used when sorting the absolute error. As an example, we might have an error array ERSS (sample data for illustrative purposes) that looks like E RSS = [1, 2, 3, 1010 , 1010]

where 10 are large values signifying the failure of a flash routine. In this case, 40% of the runs have failed, but we still yield a median error that is 3%. In order to “push” the optimizer back to the best solution (trading flash routine failures for errors in prediction of VLE data), it is necessary to penalize solutions that have a low median error but cause a significant number of flash routine failures. To that end, the OBJ0(βT,ij,γT,ij) function is multiplied by a penalty function defined by F = 10 ffail nfail

(7)

where nfail is an integral penalty exponent (usually 2 or 3), which controls the sharpness of the penalty and f fail is the fraction (in the range 0 to 1) of the flash routine evaluations that fail. If none of the flash routine evaluations fail, the penalty factor F is 1, and thus no penalty is applied. If all the flash routines fail, the F factor is 10nfail. Thus, the final objective function that is used in the optimizer is given by

× 100

∑ [(eS⃗ )2 ]

(6)

10

(3)

OBJ(βT ,ij , γT ,ij) = F ·OBJ0(βT ,ij , γT ,ij)

If a flash routine fails for one of the data points forming the handful, a very large number is entered in the appropriate location in p⃗calc. The error metric for the randomly selected handful is then simply the root-sum-of-squares E RSS =

(5)

(8)

Fitting Implementation. The DEAP package used (and as described above) is a flexible set of open-source tools for minimization of objective function(s). The emphasis is on evolutionary optimization, though other optimization methodologies are also supported. As such, it is relevant to briefly describe the methodology that is used in DEAP to arrive at the “best” solution. To begin with, a population of individuals (sometimes referred to as chromosomes in genetic optimization literature) is generated, each with its own fitness associated with how well the individual predicts the bubble-point pressures. In this case the fitness is the inverse of the objective functionthe higher is the fitness, the lower is the objective function, and the better the values represent the experimental measurements. After having randomly generated the initial population of individuals, a number of methods are used to combine and mutate the population of individuals. The primary types of operations in the genetic optimization are listed:

(4)

With the RSSE error metric for the randomly selected handful, at least one data point causing failure of a flash routine will also result in large errors for the handful. To get a representative calculation of the error for the binary pair for a given set of βT,ij and γT,ij, this random-selection and error-calculation process must be carried out a number of times. In general, this process is repeated about 100 times, each time randomly selecting approximately 5 of the data points. Selection of the appropriate error metric is key when considering all the calculated values and errors for one pair of βT,ij and γT,ij. In general, it is quite common to use either a rootsum-of-squares error (RSSE) or mean absolute error (MAE) 3755

DOI: 10.1021/acs.jced.6b00257 J. Chem. Eng. Data 2016, 61, 3752−3760

Journal of Chemical & Engineering Data

Article

(1) Mutation: Similar to spontaneous mutation of genetic chromosomes in biological systems, the individual can mutate, and each of its parameters can be given an offset, where the offset is given by a Gaussian distribution centered around zero. The standard deviation of the Gaussian distribution is used to control the spread of the mutations, and a probabilistic parameter is applied to decide whether the individual should mutate or not. (2) Crossover/reproduction: In crossover or reproduction, various models can be applied to govern the outcome of the interaction of two individuals. This is similar, in biological systems, to the resulting offspring’s chromosomes when its parents reproduce sexually. Various models are available, including weighting of the parent’s chromosome, or allowing for weights that can yield chromosomes outside the range of the parent’s chromosomes. The extent of user involvement in running this code is tuning the width of the Gaussian distributions, and setting the probabilities of mutation, crossover, and reproduction of the individuals. Otherwise, the code is entirely automatic and requires no user intervention. The following values were used in the final version of the fitter: • number of generations: 30 • population size: 150 • tournament selection size: 5 (Tournament selection refers to the process of comparing a number of individuals, in randomly selected pools, and keeping the best individual from this pool to populate the next generation. The larger the pool size, the greater the selection pressure, corresponding to the aggressive selection of high-fitness individuals at the expense of population diversity.) • mutation probability: 50% • crossover probability: 30% The implementation is entirely based on a python-based open-source framework, aside from the use of NIST REFPROP as the property backend to provide the bubble-point evaluation. All thermodynamic calls are made through the CoolProp library, which delegates to the REFPROP library to make the necessary bubble-point calculations. The system is constructed so that it can scale to multiple processes in parallel. Processlevel parallelization (as opposed to thread-level parallelization) is required as REFPROP is not thread-safe, and multiple mixture interaction files in REFPROP must have the interaction parameters injected into them by the optimization routines (one mixture file per process). Finally, the fitting routines make heavy use of the pandas python package for data management and data subsetting. The flowchart in Figure 5 shows the way that the optimization is divided into subprocesses, each of which operates on a given binary mixture. The master process spawns a number of subprocesses, each of which is passed a subset of the data for the binary pair that is being calculated by the subprocess. In this way, the large job is subdivided into smaller segments that can be easily handled by one processor. As each subprocess finishes its optimization, it writes its data to a file and signals the master process that it has completed, and begins to work on the next mixture. The source code for the fitting routines is available as an electronic appendix, along with sample data for one binary pair (propane + n-decane) taken from the work of Mansfield et al.22

Figure 5. Flowchart for the fitting process.

These data serve as a real-world demonstration of the fitting methodology.



RESULTS This fitting procedure was carried out for all the mixtures for which at least one experimental bubble-point data point could be found in the reconciled database. If bubble-point data were not available for at least two temperatures and two compositions then βT,ij was set to 1.0 and only γT,ij was fitted. For more complex mixtures, adjusting γT,ij and βT,ij does not provide enough flexibility to properly model the vapor-liquid equilibria and homogeneous fluid properties of the binary mixture in a consistent fashion. Highly asymmetric binary mixtures (helium + water being perhaps the most extreme example) further stress the flash routines, resulting in frequent failures due to insufficiently accurate starting values. These failures of the flash routines mean that potentially high-accuracy vapor-liquid equilibria data points are being rejected. As a result of these challenges, the resulting binary interaction parameters were divided into two tiers. The first tier is mixtures for which it is believed that the interaction 3756

DOI: 10.1021/acs.jced.6b00257 J. Chem. Eng. Data 2016, 61, 3752−3760

Journal of Chemical & Engineering Data

Article

Figure 6. Improvement I and median absolute relative error (MARE) of the bubble point predictions. The improvement I is based on the median error over all experimental bubble point measurements. The entries in the legend correspond to the number of interaction parameters that were fit in REFPROP 9.1. The count of mixtures with this number of parameters fit in REFPROP 9.1 is given in parentheses.

improvement term will neither go to zero nor infinity. The improvement I should be interpreted in this way: a value of I = 1 is a 1% decrease in the MARE, I = −1 is a 1% increase in the MARE. Basically, the more positive I is, the better. Figure 6 shows a plot of the MARE versus the improvement factor I. While more than 1100 mixtures were fit in this work, REFPROP includes fitted binary interaction parameter values for 522 mixtures, and those are plotted here. For approximately 300 of the binary mixtures, the current fitted parameters yield an improvement over the parameters in REFPROP; some of this measured improvement is a result of the data sets considered in the respective fitting process. On the other hand, there are many mixtures where the fitted parameters provide a worse representation of the VLE data than the use of the parameters from REFPROP. It is challenging to draw clear conclusions from a comparison of the resultant errors caused by the use of the binary interaction parameters (BIP) in REFPROP and those in this work. The BIPs in REFPROP were obtained by careful curation of the experimental data that were used in the fitting process, while in this work, the sheer amount of experimental data considered required an automatic approach. Furthermore, in many cases (the open markers), the developers of the given pair of BIPs also fit a departure function with the multiplication factor Fij. The reader is directed to the description of the GERG model for a further discussion of the departure function. In many cases, two BIP parameters were fit, but the two parameters that were fit were either γT,ij and γv,ij instead of the pair βT,ij and γT,ij used in this work. Additionally, the fits in REFPROP are, in some cases, also based on p-v-T, heat capacity, speed of sound, and any other data available for a particular binary pair. Figure 7 shows the overall MARE distribution for all the mixtures that were fit in this work. For nearly 1000 of the binary mixtures, the MARE is less than 10%; a 10% MARE can be considered as a sufficiently accurate representation of the data for many applications. More than 800 of the mixtures have a MARE less than 5%. In some cases this is because there was only one data point, which was fit very accurately, so the MARE is an imperfect metric for goodness-of-fit.

parameters are reliable and can be used for accurate predictions of the properties of this system. For a binary pair to be in tier no. 1, the following two conditions must both be met: (1) The median absolute relative error (defined below) associated with the bubble-point data is less than 5%. (2) Sufficient experimental data are available to fit both βT,ij and γT,ij (experimental bubble point data for at least two temperatures and two compositions). If either of the conditions is not met, the binary pair is relegated to tier no. 2. The Supporting Information associated with this paper includes the set of interaction parameters that were obtained from the fitting procedure, as well as the complete set of bibliographic information associated with the data used for each of the binary pairs. Accuracy. The tables in the Supporting Information contain all of the mixture interaction parameters that were fit, along with information about the accuracy of each fit at predicting the bubble point data. The median absolute relative error (MARE) parameter is defined by ⎛ p⃗ − p⃗ exp calc MARE = median⎜ ⎜ p ⃗ exp ⎝

⎞ × 100⎟ ⎟ ⎠

(9)

The median absolute relative error is only a relevant metric in comparison to existing values for the error. Here we define an improvement-in-error term I that quantifies the improvement compared with REFPROP ⎧(ER − 1) × 100 for ER ≥ 1 ⎪ I=⎨ ⎛ 1 ⎞ − 1⎟ × 100 for ER < 1 ⎪− ⎜ ⎠ ⎩ ⎝ ER

(10)

where the error ratio ER is defined by

ER =

MARE REFPROP MARE Bell

(11)

The reason for this (admittedly convoluted) definition for the improvement in error is such that if one (but not both) of MAREREFPROP or MAREBell values is very nearly zero, the 3757

DOI: 10.1021/acs.jced.6b00257 J. Chem. Eng. Data 2016, 61, 3752−3760

Journal of Chemical & Engineering Data

Article

Figure 9. Evolution of the fitness function for the binary mixtures n‑heptane + n-hexane and refrigerant R143a + refrigerant R152a from the repeatability tests.

Figure 7. Distribution of MARE for the binary pairs fit in this work.

Repeatability. Due to the stochastic nature of the fitting process, each time the optimization is executed, the algorithm will arrive at a different optimal solution for the mixture interaction parameters βT,ij and γT,ij. Ideally, the solutions for the optimal mixture interaction parameters should be tightly clustered around their respective mean values. To demonstrate the repeatability of the optimization outlined here, the optimizer was run 100 times for two mixtures. These mixtures were selected because they demonstrated low computational requirements, while also having enough bubble-point measurements that the same points were not always sampled. Figure 8 demonstrates the results from 100 runs of the optimizer for the interaction parameters for these mixtures. As

decrease in the error initially, followed by a near-asymptotic behavior in the limit of an infinite number of generations. For both mixtures, most runs arrive very near their optimal value after approximately 10 generations, and the remaining generations are used to refine the solution. In some cases, there are local increases in the objective function (points with worse error), but in the end, all runs arrive near the same solution. Homologous Families. As a demonstration of the output of the fitting procedure, the results for the family of n-alkane + n-alkane mixtures are presented for all binary pairs where it was possible to fit both βT,ij and γT,ij. Figure 10 presents the fitted values for βT,ij and γT,ij. A few comments are required for this plot. As described above, βT,ij = 1/βT,ji. Thus, the order of fluids in the binary pair is of consequence. To yield a consistent set of variables, the binary pair was sorted such that the first element in the binary pair (i = 1) is that with the lower molar mass, and the second element (j = 2) in the binary pair is that with the higher molar mass. Furthermore, the abscissa (x-coordinate) of the plot is the ratio of the maximum molar mass in the binary pair to the minimum molar mass in the binary pair. The trend in interaction parameters for the homologous n‑alkane + n‑alkane family is nearly linear, and as a result, linear curves were fitted to βT,ij and γT,ij as a function of the ratio of molar masses. These linear correlations can be used to yield reasonable predictions of the interaction parameters for binary n-alkane mixtures for which no experimental data of any kind exist. This estimation scheme has been applied to mixtures that include higher alkanes and has demonstrated reasonable extrapolation ability. Similar exercises could be carried out for other families of fluids. As another demonstration of the strong familial trends, the obtained values of γT,ij are plotted for the binary pairs containing carbon dioxide in Figure 11. This figure demonstrates greater values of γT,ij for greater differences in critical temperature between carbon dioxide and the other component.

Figure 8. Deviations from the mean values (given by overbar) for βT,ij and γT,ij for the binary mixtures n-heptane + n-hexane and refrigerant R143a + refrigerant R152a over 100 runs of the fitter. Each marker represents the results from one execution of the fitter.

seen in this figure, the largest deviation of βT,ij and γT,ij from the mean value is less than 0.2%. In the vast majority of cases, the deviations are less than 0.05% from the mean value. Error Evolution. As the evolutionary algorithm works to find an optimal solution for the interaction parameters βT,ij and γT,ij, the error decreases in a nearly monotonic fashion. In the case of this optimization problem, there are sometimes steps that result in an increase in the error due to the experimental data points that have been selected. For instance, even if βT,ij and γT,ij are unchanged, the error can sometimes increase due to the inclusion of one or more points with higher error. Figure 9 shows the evolution of the fitness function over the 200 repeatability tests described above. These profiles demonstrate the characteristic error evolution, that of a steep



CONCLUSIONS The binary interaction parameters βT,ij and γT,ij have been fitted for more than 1100 mixtures. The improved interaction parameters will be included in a future version of REFPROP. The interaction parameters are obtained from an evolutionary optimization algorithm based on an open-source python framework. The fitting procedure conveniently scales to large computational clusters. 3758

DOI: 10.1021/acs.jced.6b00257 J. Chem. Eng. Data 2016, 61, 3752−3760

Journal of Chemical & Engineering Data

Article

Figure 10. Binary interaction parameters for the homologous n-alkane + n-alkane family.



AUTHOR INFORMATION

Corresponding Authors

*E-mail: [email protected]. *E-mail: [email protected]. Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS The authors thank Ken Kroenlein and Chris Muzny of the National Institute of Standards and Technology who facilitated the extraction of data from ThermoDataEngine and made available computational resources used for this study, and Marcia Huber, also of NIST, for background on genetic optimization. Commercial equipment, instruments, or materials are identified only in order to adequately specify certain procedures. In no case does such identification imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that the products identified are necessarily the best available for the purpose. Contribution of the National Institute of Standards and Technology, not subject to copyright in the US.

Figure 11. Values of γT,ij for the carbon dioxide family. The abcissa Tc refers to the critical temperature of the other component in the mixture with carbon dioxide. The value for water is not included.

Further work will entail the development of a predictive scheme for βT,ij and γT,ij based on information about the pure fluids forming the mixture.





ASSOCIATED CONTENT

* Supporting Information S

REFERENCES

(1) Bell, I. H.; Jäger, A. Helmholtz energy transformations of common cubic equations of state for use with pure fluids and mixtures. J. Res. NIST 2016, 121, 238. (2) Plocker, U.; Knapp, H.; Prausnitz, J. Calculation of high-pressure vapor-liquid equilibria from a corresponding-states correlation with emphasis on asymmetric mixtures. Ind. Eng. Chem. Process Des. Dev. 1978, 17, 324−332.

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jced.6b00257. Python source code of the fitter (ZIP) Fitted binary interaction parameters; comparison with the available experimental data (PDF) 3759

DOI: 10.1021/acs.jced.6b00257 J. Chem. Eng. Data 2016, 61, 3752−3760

Journal of Chemical & Engineering Data

Article

(3) Lemmon, E. W.; Bell, I. H.; Huber, M. L.; McLinden, M. O. NIST Standard Reference Database 23: Reference Fluid Thermodynamic and Transport Properties-REFPROP, version 9.1.1; National Institute of Standards and Technology, 2016. (4) Bell, I. H.; Wronski, J.; Quoilin, S.; Lemort, V. Pure and Pseudopure Fluid Thermophysical Property Evaluation and the Open-Source Thermophysical Property Library CoolProp. Ind. Eng. Chem. Res. 2014, 53, 2498−2508. (5) Span, R.; Eckermann, T.; Herrig, S.; Hielscher, S.; Jäger, A.; Thol, M. TREND: Thermodynamic Reference and Engineering Data 2.0. 2015. (6) Lemmon, E. W.; Jacobsen, R. T.; Penoncello, S. G.; Friend, D. G. Thermodynamic Properties of Air and Mixtures of Nitrogen, Argon, and Oxygen from 60 to 2000 K at Pressures to 2000 MPa. J. Phys. Chem. Ref. Data 2000, 29, 331−385. (7) Lemmon, E. W.; Jacobsen, R. T. Equations of State for Mixtures of R-32, R-125, R-134a, R-143a, and R-152a. J. Phys. Chem. Ref. Data 2004, 33, 593−620. (8) Lemmon, E. W.; Jacobsen, R. T. A Generalized Model for the Thermodynamic Properties of Mixtures. Int. J. Thermophys. 1999, 20, 825−835. (9) Akasaka, R. A Thermodynamic Property Model for the R-134a/ 245fa Mixtures. 15th International Refrigeration and Air Conditioning Conference at Purdue, July 14−17, 2014. (10) Akasaka, R. Thermodynamic property models for the difluoromethane (R-32) + trans-1,3,3,3-tetrafluoropropene (R-1234ze(E)) and difluoromethane + 2,3,3,3-tetrafluoropropene (R-1234yf) mixtures. Fluid Phase Equilib. 2013, 358, 98−104. (11) Gernert, G. J. A New Helmholtz Energy Model for Humid Gases and CCS Mixtures. Ph.D. Thesis, Ruhr-Universität Bochum, 2013. (12) Kunz, O.; Klimeck, R.; Wagner, W.; Jaeschke, M. The GERG2004 Wide-Range Equation of State for Natural Gases and Other Mixtures; VDI Verlag GmbH, 2007. (13) Kunz, O.; Wagner, W. The GERG-2008 Wide-Range Equation of State for Natural Gases and Other Mixtures: An Expansion of GERG-2004. J. Chem. Eng. Data 2012, 57, 3032−3091. (14) Span, R. Multiparameter Equations of StateAn Accurate Source of Thermodynamic Property Data; Springer, 2000. (15) Gernert, J.; Span, R. EOS-CG: A Helmholtz energy mixture model for humid gases and CCS mixtures. J. Chem. Thermodyn. 2016, 93, 274−293. (16) Diky, V.; Chirico, R. D.; Muzny, C. D.; Kazakov, A. F.; Kroenlein, K.; Magee, J. W.; Abdulagatov, I.; Kang, J. W.; Frenkel, M. ThermoData Engine (TDE) software implementation of the dynamic data evaluation concept. 7. Ternary mixtures. J. Chem. Inf. Model. 2011, 52, 260−276. (17) Diky, V.; Chirico, R. D.; Muzny, C. D.; Kazakov, A. F.; Kroenlein, K.; Magee, J. W.; Abdulagatov, I.; Kang, J. W.; Gani, R.; Frenkel, M. ThermoData Engine (TDE): Software implementation of the dynamic data evaluation concept. 8. Properties of material streams and solvent design. J. Chem. Inf. Model. 2012, 53, 249−266. (18) Diky, V.; Chirico, R. D.; Muzny, C. D.; Kazakov, A. F.; Kroenlein, K.; Magee, J. W.; Abdulagatov, I.; Frenkel, M. ThermoData Engine (TDE): Software implementation of the dynamic data evaluation concept. 9. Extensible thermodynamic constraints for pure compounds and new model developments. J. Chem. Inf. Model. 2013, 53, 3418−3430. (19) Fortin, F.-A.; De Rainville, F.-M.; Gardner, M.-A.; Parizeau, M.; Gagné, C. DEAP: Evolutionary Algorithms Made Easy. J. Mach. Learn. Res. 2012, 13, 2171−2175. (20) Schmidt, M.; Lipson, H. Distilling free-form natural laws from experimental data. Science 2009, 324, 81−85. (21) Krink, T.; Filipič, B.; Fogel, G. B. Noisy optimization problems A particular challenge for differential evolution? CEC2004. Congress on Evolutionary Computation 2004, 332−339. (22) Mansfield, E.; Bell, I. H.; Outcalt, S. L. Bubble Point Measurements of n-Propane + n-Decane Binary Mixtures with

Comparisons of Binary Mixture Interaction Parameters for Linear Alkanes. J. Chem. Eng. Data 2016, 61, 2573.

3760

DOI: 10.1021/acs.jced.6b00257 J. Chem. Eng. Data 2016, 61, 3752−3760