Ind. Eng. Chem. Res. 2008, 47, 4533–4541
4533
Comparing the Use of Gibbs Ensemble and Grand-Canonical Transition-Matrix Monte Carlo Methods to Determine Phase Equilibria Andrew S. Paluch,† Vincent K. Shen,‡ and Jeffrey R. Errington*,† Department of Chemical and Biological Engineering, The State UniVersity of New York at Buffalo, Buffalo, New York 14260-4200, and Physical and Chemical Properties DiVision, National Institute of Standards and Technology, 100 Bureau DriVe MS 8380, Gaithersburg, Maryland 20899-8380
We present results from a computational study investigating the use of Gibbs ensemble and grand-canonical transition-matrix Monte Carlo (GC-TMMC) methods to determine the liquid-vapor phase coexistence properties of pure molecular fluids of varying degrees of complexity. The molecules used in this study were ethane, n-octane, cyclohexane, 2,5-dimethylhexane, 1-propanol, and water. We first show that the GC-TMMC method can reproduce Gibbs ensemble results found in the literature. Given the excellent agreement for each molecule, we then compare directly the performance of Gibbs ensemble and GC-TMMC simulations at both low and high reduced temperatures by monitoring the relative uncertainties in the saturation properties as a function of computational time. In general, we found that the GC-TMMC method yielded limiting uncertainties in the saturated vapor density and pressure that were significantly smaller, by an order of magnitude in some instances, than those of the Gibbs ensemble method. Limiting Gibbs ensemble uncertainties for these properties were generally in the 0.8-5.0% range. However, both methods yielded comparable limiting uncertainties in the saturated liquid density, which fell within the range of 0.1-1.0%. In the case of water at 300 K, we found that the Gibbs ensemble outperformed GC-TMMC. The relatively poor performance of the GC-TMMC method in this situation was tied to the slow convergence of the density probability distribution at this low temperature. We also discuss strategies for improving the convergence rate under these conditions. I. Introduction The phase behavior of fluids is a subject of fundamental importance in both the biological and physical sciences. Knowledge of the manner in which a fluid separates into distinct phases is crucial to the design and performance of various separation processes used in industrial settings. Historically, the thermodynamic data used for these purposes have been empirically based, manifesting themselves in the form of tables, charts, and correlations. However, it is often the situation that data are required at state points dramatically different from those of existing data, thus precluding the use of extrapolation. Furthermore, in many instances, obtaining new experimental data can be a nontrivial task, requiring a significant investment of time and money. A powerful, alternative means to this end is molecular simulation. Among the important historical methodological advances that have facilitated the determination of fluid-phase equilibria via molecular simulation is the Gibbs ensemble Monte Carlo method, which was introduced by Panagiotopoulos. 1–3 It is essentially an m-phase (m g 2) Monte Carlo simulation that avoids explicit simulation of the interfacial region between coexisting phases. In the Gibbs ensemble method, the phases of interest are simulated in separate simulation boxes, and three basic types of trial moves are performed: molecule displacements, volume changes, and interbox molecule transfers. Molecule displacements equilibrate internally in each phase; volume changes establish mechanical equilibrium between phases (i.e., equality of pressures); and interbox molecule transfers establish chemical equilibrium between phases (i.e., equality of chemical potentials). Temperature equality is trivial * To whom correspondence should be addressed. E-mail: jerring@ buffalo.edu. † The State University of New York at Buffalo. ‡ National Institute of Standards and Technology.
to establish in Monte Carlo simulation because it is a userspecified input parameter. Although the development of the Gibbs ensemble was a significant advance, in its incipient form, the method was restricted to the study of simple fluids. The phase behavior of complex molecular fluids became computationally feasible to investigate only when the Gibbs ensemble was combined with the subsequent advent of configurationalbias Monte Carlo (CBMC) sampling techniques,4–9 which made it possible to efficiently sample internal degrees of freedom. While other computational approaches for studying the phase coexistence of pure fluids have since emerged, for example, Gibbs-Duhem integration,10–13 histogram reweighting Monte Carlo,14 the NpT + test-particle method,15 and multicanonical sampling Monte Carlo,16,17 the Gibbs ensemble remains the most popular simulation method used to study the phase behavior of fluids due largely to its conceptual simplicity. Transition-matrix Monte Carlo (TMMC)18–22 is a comparatively recent development that has emerged as a highly efficient simulation method for computing the thermophysical properties of fluids. In contrast to the Gibbs ensemble where the coexisting fluid phases are simulated simultaneously in separate simulation boxes, a single simulation box is used in TMMC. The conditions of phase equilibrium are determined by first using a self-adaptive TMMC scheme to calculate the free energy along an order parameter path connecting the coexisting phases of interest. Histogram reweighting14 is then used to locate precisely the conditions of phase coexistence. For the study of fluid-phase equilibria, TMMC is typically implemented in the grandcanonical ensemble, and under these conditions is commonly referred to as grand-canonical transition-matrix Monte Carlo (GC-TMMC). At fixed chemical potential µ, temperature T, and volume V, a GC-TMMC simulation involves two basic types of trial moves: molecule displacements and insertions/deletions. Once again, molecule displacements internally equilibrate the system. Molecule insertions/deletions reflect the fact that the
10.1021/ie800143n CCC: $40.75 2008 American Chemical Society Published on Web 05/30/2008
4534 Ind. Eng. Chem. Res., Vol. 47, No. 13, 2008
system is in contact with a particle bath at chemical potential µ. An attractive practical aspect of GC-TMMC is that it can be implemented by making minimal changes to an existing grandcanonical Monte Carlo simulation code. The general transitionmatrix methodology has found a variety of applications. Examples include the determination of pure and multicomponent fluid properties in bulk and confinement,23–33 surface tension of pure fluids and mixtures,23,34–37 Henry’s constants,38 wetting transitions on solid surfaces,39–42 adsorption isotherms,43,44 and protein solution behavior. 42,45,46 However, the types of fluids that have been studied by transition-matrix-based methods have been largely restricted to relatively simple ones, a limitation that the Gibbs ensemble also faced when it was first introduced. To address this apparent limitation, it has recently been shown that GC-TMMC can be combined straightforwardly with CBMC and expanded ensemble techniques, thus significantly extending the range of fluids that can be studied. 29 In fact, any sampling technique that can be used in the Gibbs ensemble can also be used in a GC-TMMC simulation because the two methods share the same basic underlying suite of trial moves (molecule insertions/deletions are analogous to interbox molecule transfers), except for volume changes in the Gibbs ensemble. In light of this observation, an interesting and practical question therefore arises regarding the relative efficiency of the two methods. In this paper, we compare the performance of the Gibbs ensemble and GC-TMMC methods for determining the phasecoexistence properties of a broad range of single-component molecular fluids comprising ethane, n-octane, cyclohexane, 2,5dimethyhexane, 1-propanol, and water. Our intended audience is the simulator who is familiar with the basic concepts of molecular simulation but uses “off-the-shelf” software packages. The outline of this paper is as follows. In Section II, we briefly describe the simulation methods used in this work. Computational details are provided in Section III. Results and discussion are given in Section IV. Finally, in Section V, we summarize our findings and present conclusions. II. Simulation Methods In this section, the GC-TMMC method is described. GCTMMC is the focus of this section because it is a comparatively new simulation method and has not been discussed as extensively in the literature as the Gibbs ensemble. The interested reader can find detailed discussions of the Gibbs ensemble method in refs 1-3 and 47. In the grand-canonical ensemble, an important statistical quantity relevant to the thermodynamic description of a singlecomponent system is the density probability distribution Π(N; µ, V, T), which is the probability of observing N molecules in a system of volume V at fixed temperature T and chemical potential µ. A simple approach to calculate this probability distribution involves the construction of a histogram by simply counting the number of times the system volume contains N molecules during the course of a grand-canonical Monte Carlo simulation. In GC-TMMC, the distribution is calculated via a more efficient computational route. Rather than simply counting the number of times the system holds a given number of molecules, transition statistics are accumulated to calculate the overall probability PN, δ that the number of particles in the system N will change by an amount δ. In this work, only single molecules can be inserted/deleted, thus δ ) -1, 0, or +1. Actual calculation of the transition probability PN, δ takes place through the accumulation of acceptance probabilities pa in a collection matrix C. For any trial move, two elements of the collection matrix are updated simultaneously
CN,δ ) CN,δ + pa
(1a)
CN,0 ) CN,0 + 1 - pa
(1b)
Notice that for trial moves not proposing to change the number of molecules in the system (e.g., displacement, rotation, or regrowth), δ ) 0, and thus, the application of eqs 1a and 1b simply results in incrementing the element CN, 0 by unity. The acceptance probability pa is defined generically as pa ) min(1, fo, n)
(2)
where fo, n is a ratio of the product of probabilities fo, n )
Rn, o πn Ro, n πo
(3)
In eq 3, Ro, n is the probability of generating a new system configuration n from the old system configuration o, and Rn, o is the probability of generating the reverse move. For conventional Monte Carlo trial moves, Ro, n ) Rn, o . In contrast, advanced sampling techniques seek to modify or bias Ro, n and Rn, o to enhance the overall acceptance of trial moves.5,47 The probabilities of finding the system in the old configuration o and the new configuration n are πo and πn, respectively. Note that the above notation is completely general. That is, no distinction is made between conventional and advanced Monte Carlo sampling moves. Examples of the former include centerof-mass displacements and rotations about the center of mass, while examples of the latter include configurational-bias insertions/deletions and regrowths. In other words, the accumulation of statistics proceeds according to the above protocol regardless of the complexity of the attempted trial move. The interested reader can find detailed expressions for the acceptance criteria for the above-mentioned trial moves in refs 4-9 and 47- 50. The overall transition probability PN, δ is calculated from the statistics accumulated in the collection matrix and is just the corresponding element of C normalized appropriately as PN,δ )
CN,δ
∑
+1 j)-1
(4) CN,j
The quantity of interest Π(N) is obtained by imposing a detailed balance condition. In the commonly encountered situation where N can change by at most unity, the distribution Π(N) can be calculated straightforwardly using a sequential updating scheme ln Π(N + 1) ) ln Π(N) + ln
PN,+1 PN+1,-1
(5)
In order for the above scheme to be effective, the system must be able to sample N space adequately. Notice that in the case of liquid-vapor coexistence, N necessarily spans a broad range of values. To ensure that the system samples sufficiently the entire range, an N-dependent biasing function η(N) is employed, requiring the use of a biased acceptance probability. Trial moves are now accepted according to a biased acceptance probability pη
{
pη ) min 1,
}
exp[η(Nn)] f exp[η(No)] o, n
(6)
Although trial moves are accepted/rejected using the above biased acceptance criterion, the collection matrix continues to be updated with the unbiased acceptance probability pa (eq 2). Notice that if η(N) ) -ln Π(N)
(7)
Ind. Eng. Chem. Res., Vol. 47, No. 13, 2008 4535
then the system samples all values of N with equal probability. However, Π(N) is the unknown probability distribution that one seeks to calculate. In practice, a GC-TMMC simulation is usually initiated with a flat (uniform) biasing function, that is, η(N) ) k where k is an arbitrary constant. Initially, the system preferentially samples the most probable values of N consistent with the specified combination of µ, V, and T. Subsequently, during the course of the simulation, Π(N) is periodically estimated using the information accumulated in C, and a more effective biasing function η(N) that promotes uniform sampling of N -space is obtained via eq 7. This updating scheme provides a self-adaptive approach to the equilibrium distribution. The density probability distribution Π(N; µ, V, T) obtained from a GC-TMMC simulation is unique to the actual chemical potential used in the simulation. With the use of histogram reweighting,14 the distribution can be determined at other chemical potential values. Liquid-vapor coexistence is located by finding the chemical potential value µsat that yields a bimodal probability distribution with equal areas (pressures) under each peak. If Π(N; µsat, V, T) is normalized, then the saturated vapor pressure psat is βpsatV ) - ln[2Π(0; µsat, V, T)]
(8)
where β ) 1/(kBT) and kB is the Boltzmann constant. The factor of 2 appears due to the existence of two peaks in the distribution at phase coexistence. The statistical mechanical formalism that provides the connection between the thermodynamic pressure and the density probability distribution can be found elsewhere.23,25,27,28,51 The saturated density of phase i is i ) Fsat
〈N 〉i
V
(9)
where 〈N〉i is the mean value of N averaged over the region of Π(N; µsat, V, T) corresponding to phase i. III. Computational Details Comparison of the Gibbs ensemble and GC-TMMC methods for the simulation of molecular fluids was made within the framework of the general-purpose MCCCS Towhee simulation package.52 There were three primary reasons for using Towhee. First, the Towhee simulation package offered the capability of performing conventional Monte Carlo simulations, such as those of Gibbs ensemble and traditional grand-canonical Monte Carlo. Second, the advanced sampling techniques, in particular CBMC, required for the efficient conformational sampling of complex fluids, were available and seamlessly implemented. Last, Towhee offered a wide range of force-fields. While traditional grand-canonical Monte Carlo simulation was already implemented in Towhee, GC-TMMC was not. Therefore, the Towhee source code was minimally modified to implement the GCTMMC method. GC-TMMC functionality is now available in the Towhee simulation package.52 All simulations were performed on a single CPU equivalent to what is available on a desktop personal computer. Simulations wherein performance data were collected were performed on the same type of processor. A. Force-Fields. Gibbs ensemble and GC-TMMC simulations were performed for ethane, 1-propanol, n-octane, 2,5dimethylhexane, cyclohexane, and water. The organic molecules were modeled using the united-atom version of the Transferable Potentials for Phase Equilibria (TraPPE-UA) force-field developed by Siepmann and co-workers.8,53–55 Water was described using the extended simple point charge (SPC/E) model.56 Since
these force-fields have been discussed in detail elsewhere in the literature, only brief descriptions are provided here. In the TraPPE-UA force-field, the nonbonded interaction energy between sites i and j, either on different molecules or on the same molecule separated by more than three bonds, is given by the sum of a Lennard-Jones (LJ) and an electrostatic contribution
[( ) ( ) ]
unb(rij) ) 4εij
σij rij
12
-
σij rij
6
+
qiqj 4πε0rij
(10)
where rij is the distance between sites i and j, εij is the well depth of the LJ interaction, σij is the distance at which the LJ interaction is zero, and qk is the partial charge of site k. Partial charges were relevant, that is, qk * 0 only in the case of 1-propanol. Bond lengths are fixed, but bond angles are flexible and bend according to a harmonic potential kθ (11) (θ - θo)2 2 where θ is the bond angle, θo is the equilibrium bond angle, and kθ is the force constant. Bond rotations are governed by a torsional potential ubend(θ) )
utors(φ) ) c1[1 + cos(φ)] + c2[1 + cos(2φ)] + c3[1 + cos(3φ)] (12) where φ is the dihedral angle and the constants c1, c2, and c3 are Fourier coefficients. Force-field parameters and further details can be found in refs 8 and 53–55. The SPC/E model56 for water consists of three point masses with a fixed O-H bond distance of 1 Å and fixed H-O-H bond angle of 109.47°. The oxygen and hydrogen sites carry point charges of -0.8476e and +0.4238e, respectively. Finally, a Lennard-Jones interaction site is also located on the oxygen site, with parameters ε ) 0.65 kJ/mol and σ ) 3.166 Å. The functional form of the interaction between two water molecules is analogous to that of eq 10. For all systems studied in this work, the Lennard-Jones interaction was truncated at a cutoff distance ranging from 10 to 15 Å, depending on the size of the molecule, and standard uniform fluid tail corrections were used. Electrostatic interactions, in the case of water and 1-propanol, were evaluated using the Ewald summation, with the real-space interactions truncated at half the simulation box length. In addition, a dielectric constant of unity was used. For water, a damping parameter value of RL ) 5.6 was used, and the maximum number of reciprocal space lattice vectors was set by Kmax ) 5. These parameter values were identical to those used in other studies of SPC/E water found in the literature.57–59 For 1-propanol, parameter values of RL ) 5.6 and Kmax ) 6 were chosen, as these values yielded relative error estimates60 for the electrostatic contribution to the potential energy that were less than 0.015%. B. Simulation Parameters. For each molecule, an initial series of GC-TMMC simulations was performed at several subcritical temperatures to generate its liquid-vapor phase boundary, which could be compared with the Gibbs ensemble data available in the literature. Simulation parameters were subsequently taken from this first set of runs for use in a second series of simulations consisting of both GC-TMMC and Gibbs ensemble simulations where the performance of the methods was compared directly. The relative efficiency of the methods was assessed on the basis of monitoring the uncertainties in the saturation properties as a function of computational time. In practice, a GC-TMMC simulation requires specification of the chemical potential µ, system volume V, temperature T,
4536 Ind. Eng. Chem. Res., Vol. 47, No. 13, 2008
and the maximum number of molecules Nmax allowed in the system. In principle, any chemical potential can be used because Π(N; µ, V, T) can be determined easily at other chemical potential values using histogram reweighting,14 a post-simulation procedure that requires negligible computational time relative to a GC-TMMC run. However, from a practical perspective, setting µ ≈ µsat leads to efficient sampling of N space. The primary purpose of the first set of GC-TMMC simulations was to verify that the predicted liquid-vapor coexistence properties agreed with literature values. As a result, educated guesses for the saturation chemical potential values were used as input for µ. After completion of the simulation, a relatively straightforward reweighting procedure was used to find the saturation point, and coexistence properties were obtained from the corresponding density probability distribution using the procedures outlined above. The objective of the second set of simulations was to obtain performance data. In this case, the input value for µ was set to the saturation value µsat, which was obtained from the first set of runs. This was done to eliminate variation in the performance of the GC-TMMC simulations related to the input chemical potential, which enabled us to focus more clearly on the influence of temperature and chemical identity. Note that the strategy adopted for the first set of runs is consistent with the approach one would follow in practice (i.e., a single GC-TMMC simulation with a reasonable estimate for µ, which provides an efficient means to obtain saturation properties). The system volume V depended on the size of the molecule being simulated and ranged from (20 Å)3 for water to (35 Å)3 for the large organic molecules, such as n-octane and 2,5-dimethylhexane. The temperature T in a Monte Carlo simulation is simply an input parameter. Several subcritical temperatures were used in the first series of GC-TMMC simulations to generate each molecule’s liquid-vapor phase coexistence curve. In the second series of simulations, where performance data were collected, we restricted our attention to two or three temperatures broadly spaced along the liquid-vapor saturation line. Within this same series, the maximum number of molecules in the system Nmax was chosen based on the following empirical heuristic23,29 ln
Π(Nliq ;µsat, V, T) > 10 Π(Nmax ;µsat, V, T)
(13)
where Nliq is the location of the liquid peak in the saturated probability distribution. Nmax was calculated using the results of the first series of simulations. All GC-TMMC simulations were initiated with an empty box. The basic suite of trial moves generally consisted of 20% center-of-mass translations, 20% rotations about the center-of-mass, 40% CBMC insertions/ deletions, and 20% CBMC regrowths. The precise breakdown of trial moves varied slightly according to the complexity of the molecule, and no attempt was made to optimize these parameters. For each molecule, Gibbs ensemble simulations were performed at a low and a high reduced temperature. Performance data from these simulations were collected for comparison with GC-TMMC performance data. Gibbs ensemble simulations were performed at fixed total volume and were initiated by placing Nmax molecules in the liquid-phase box, and Nmax /2 molecules in the vapor-phase box. This initial distribution of molecules was chosen so that the GC-TMMC and Gibbs simulations sampled liquid phases of comparable extent, and that precise vapor-phase properties could be obtained in the Gibbs ensemble. Within each box, the first site of each molecule was placed at a cubic lattice point. The volumes of each box were chosen to
be consistent with their respective fluid-phase densities as determined from the first series of GC-TMMC simulations. The basic suite of trial moves consisted of center-of-mass translations, rotations about the center-of-mass, isotropic volume changes, CBMC interbox molecule transfers, and CBMC regrowths. Siepmann and co-workers8,53–55 have found that Gibbs performance is optimized by adjusting the frequency of volume and interbox transfer moves such that one successful volume update and one particle transfer occur on average every 10 Monte Carlo cycles. In our studies, the fraction of interbox molecule transfer and volume change moves required to achieve this target was obtained from a short, preliminary Gibbs ensemble simulation. The remaining fraction of trial moves was distributed as evenly as possible among the displacement, rotation, and regrowth moves. The precise breakdown depended on the complexity of the molecule. In all of the simulations, the coupled-decoupled CBMC method for selecting and generating trial positions was used.8 Other techniques, such as multiple-first-bead insertions and dualcutoffs,7 were also employed to facilitate efficient sampling in the simulations described above. Aggregation-volume-bias moves48–50 were used in the simulations of water and 1-propanol. For a given molecule, parameters for the CBMC moves and the other advanced ones lying outside the standard suite of trial moves were set to identical values so that the observed performance discrepancies could be attributed to the basic underlying differences in the methods themselves. IV. Results and Discusion Liquid-vapor phase coexistence properties computed by GCTMMC for ethane, n-octane, cyclohexane, 2,5-dimethylhexane, 1-propanol, and water are presented first. These simulations were performed over a sufficiently broad range of subcritical temperatures to generate the liquid-vapor coexistence curve for each molecule. Data are presented in the temperature-density plane in Figure 1 and in the pressure-temperature plane in Figure 2. Error bars represent the standard deviation of four independent simulations. Included in each figure are Gibbs ensemble results taken from the literature.8,58 For cyclohexane, Gibbs ensemble simulation results were not available in tabular form. In addition, empirically based equation-of-state data for each molecule obtained from the software package ThermoData Engine (using the Refprop option when possible) are plotted for point of reference.61–63 These predictions represent best-fit curves to available experimental data. For example, in the case of n-octane, the equation-of-state-predicted saturation pressures are within 0.1% of the experimental values across the temperature range of interest. In the T-F plane, the coexisting fluid densities predicted by GC-TMMC and Gibbs ensemble are in excellent agreement. In Figure 2, we plot psat versus 1/T for the various fluids, since the slope is related to the enthalpy of vaporization. Again, the agreement between GC-TMMC and Gibbs ensemble predictions is excellent. In the cases of cyclohexane and water, there is a noticeable discrepancy between the simulation results and the equation-of-state data. However, the observed discrepancy is a deficiency of the forcefield and not of the simulation methods. The important point to stress in Figures 1 and 2 is the agreement between the Gibbs ensemble and GC-TMMC methods. Given the excellent agreement between Gibbs ensemble and GC-TMMC data, an important practical question naturally arises: Which method is more efficient? To address this issue, GC-TMMC and Gibbs ensemble simulations were performed for each molecule at two or three broadly spaced temperatures
Ind. Eng. Chem. Res., Vol. 47, No. 13, 2008 4537
Figure 1. Fluid-phase diagrams of the molecular fluids investigated in this work in the temperature-density plane. Linear alkanes, ethane and n-octane, are shown in panel a; nonlinear alkanes, cyclohexane and 2,5-dimethylhexane, are shown in panel b; and polar fluids, water and 1-propanol, are shown in panel c. Circles represent the results of GC-TMMC simulations, and error bars represent the standard deviation of four independent simulations. Diamonds are Gibbs ensemble simulation data found in the literature using the same force-field. Solid curves represent equation-ofstate predictions.61–63
along the saturation curve. Because of the inherent differences in the two methods, it is not possible to directly map the simulation details germane to one onto the other. Therefore, when comparing the two methods, reasonable guidelines must be established regarding the manner in which the simulations are parametrized. As described in Section IIIB, simulation parameters were chosen to maximize the extent to which the observed performance discrepancies could be ascribed to the intrinsic differences in the simulation methods. Although we feel that the guidelines used here provide a fair comparison between the methods, alternative mappings are certainly possible and would likely lead to slightly different results. Our intention is to provide information useful to users who are inclined to use “off-the-shelf” simulation software packages. As a result, computational expediency as well as precision is an important factor to consider when comparing the two methods. For systems requiring more sophisticated sampling techniques (e.g., multiplefirst bead insertions, dual-cutoffs, aggregation-volume bias), those parameters, in addition to the CBMC parameters, were set to identical values in the Gibbs ensemble and GC-TMMC simulations. For each simulation method, uncertainties in the saturation properties (pressure and fluid densities) were calculated as a function of computational time. Here, the uncertainty was taken to be the relative standard deviation of four independent simulations. To be more explicit, the relative standard deviation σX for a generic property X was calculated using the following expression σX )
1 X(t)
1 Nr
∑ [X Nr
2
j
j
]2
(t) - X(t)
(14)
Figure 2. Fluid-phase diagram of the molecular fluids investigated in this work in the pressure-temperature plane. Linear alkanes, ethane and n-octane, are shown in panel a; nonlinear alkanes, cyclohexane and 2,5dimethylhexane, are shown in panel b; 1-propanol is shown in panel c; and water is shown panel d. Circles represent the results of GC-TMMC simulations, and error bars represent the standard deviation of four independent simulations. Diamonds are Gibbs ensemble simulation data found in the literature using the same force-field. Solid curves represent equation-of-state predictions.61–63
where Nr ()4 in the present study) is the number of independent j (t) is the mean value of the property X at time t, runs and X averaged over the independent runs X(t) )
1 Nr
∑
Nr j
Xj(t)
(15)
It was found that the saturation pressure was the most sensitive of the three properties, and therefore we predominantly focused our attention on its behavior in our comparison of the Gibbs and GC-TMMC methods. Note that in a Gibbs ensemble simulation, the pressure is calculated via the virial route and is thus prone to fluctuations, whereas in a GC-TMMC simulation, it is obtainable directly from the probability distribution Π(N). In the figures that follow, the time evolution of the uncertainties in the saturation pressure for both methods is shown. Before presenting the performance data, it is instructive to consider how each method might be expected to perform. Away from the critical point, the Gibbs ensemble method naturally converges to saturation conditions where each coexisting phase is clearly contained in a separate simulation box. In this work, recall that the initial densities of the liquid and vapor boxes were set to their saturation values. This spared considerable computational effort (and time) that would otherwise have been spent performing interbox molecule transfers had the initial conditions been far from phase coexistence. Since the Gibbs ensemble simulations performed in this work were initiated with ideal or best-case conditions, it should therefore be expected that the uncertainties in the saturation properties should start out within a reasonable range of their limiting levels and gradually decay with increasing computational time. In contrast, the corresponding GC-TMMC simulations were not initiated with best-case conditions. These simulations were initiated with
4538 Ind. Eng. Chem. Res., Vol. 47, No. 13, 2008
Figure 3. Gibbs ensemble and GC-TMMC performance data for ethane. The top panel corresponds to a temperature of 240 K, and the bottom panel corresponds to a temperature of 180 K. Performance data is plotted as the uncertainty versus computational time. The uncertainty (y axis) represents the relative standard deviation in the saturation pressure obtained from four independent simulations.
an empty simulation box and a uniform biasing function, that is, η(N) ) k where k is a constant. In the early stages of a GCTMMC simulation, because it is unlikely that the system is capable of accumulating enough statistics to construct an effective biasing function that enables it to sweep across the relevant range of values of N, it is reasonable to expect that the predicted saturation properties should be neither precise nor accurate. In fact, before reasonably accurate saturation properties can be obtained from a GC-TMMC simulation, both liquidlike and vapor-like values of N must be sampled by the system. Thus, the key to the GC-TMMC method lies in the construction of the biasing function η(N) . One can therefore expect the GCTMMC performance (uncertainty as a function of computational time) to display an initial transitory period where the uncertainty level is high and can potentially fluctuate dramatically. However, as the system accumulates enough statistics to enable it to sample N space, the uncertainty should decay and subsequently reach its intrinsic level. In the situation encountered in this study, where the intrinsic GC-TMMC uncertainty level was generally lower than that of the Gibbs ensemble, one should in fact expect a “crossover” point beyond which the GC-TMMC method provides estimates for the saturation properties that are superior (using precision as a metric) to those obtained from a Gibbs ensemble simulation. The results below provide an indication of how the crossover point changes with temperature for several common compounds. By combining this type of information with the knowledge of the computational time one has to invest, an educated decision can be made with respect to the method that optimizes precision within imposed resource constraints. The performance of the two methods is first compared for ethane, the simplest molecule investigated in this study. In Figure 3, the uncertainty in the saturated vapor pressure is plotted versus CPU time at 240 K (top panel) and 180 K (bottom panel). The salient feature of Figure 3 is the difference in the limiting uncertainties intrinsic to each simulation method. In the case of the Gibbs ensemble, the limiting uncertainty in the saturation pressure is 1-2% while that of GC-TMMC is smaller
Figure 4. Comparison of the evolution of the relative uncertainty in the saturated liquid density of ethane using Gibbs ensemble and GC-TMMC. The top panel corresponds to a temperature of 240 K, and the bottom panel corresponds to a temperature of 180 K. Performance data is plotted as the uncertainty versus computational time. The uncertainty (y axis) represents the relative standard deviation in the saturated liquid density obtained from four independent simulations.
by an order of magnitude. As expected, the same relative trend holds true for the saturated vapor densities (not shown). However, in the case of the saturated liquid densities shown in Figure 4, both methods yield comparable uncertainties, which are of the order of 0.1%. In terms of computing time, notice that GC-TMMC takes less time to reach its limiting precision than the Gibbs ensemble. This is an interesting observation when one considers, as discussed above, that the Gibbs ensemble simulations were initiated with best-case conditions, and thus the initial uncertainties started out within a reasonable range of their limiting values. This is in stark contrast to the comparatively abrupt GC-TMMC performance data, which possess an initial period where the uncertainty is quite high, but then drops precipitously to its limiting uncertainty level. This precipitous drop is a clear indication that the density probability distribution has converged in all four simulations and that the system is able to sweep across N space easily. As can be seen in Figure 3, extending the GC-TMMC simulations beyond this point only results in marginal reduction of the uncertainty, whereas the Gibbs ensemble uncertainties appear to benefit to a larger degree from increasing the simulation time. The statistical uncertainties do not decrease with the square root of the CPU time, as one might expect from simple mathematical arguments.47 In the case of the GC-TMMC calculations, additional transition probability data collected at each step contribute to the vapor pressure indirectly, and therefore these simple arguments are not applicable. Mathematical modeling of the evolution of the statistical uncertainty with CPU time would be a useful issue to examine in future work. Also, it is important to notice that the time required to attain the precipitous uncertainty drop, that is, the length of the initial transitory period, increases with decreasing temperature. The “crossover” point is essentially reached instantly at 240 K and increases to 8 h at a temperature of 180 K. Similar performance trends are also observed for n-octane, and thus, we do not show them.
Ind. Eng. Chem. Res., Vol. 47, No. 13, 2008 4539
Figure 5. Gibbs ensemble and GC-TMMC performance data for 2,5dimethylhexane. The top panel corresponds to a temperature of 510 K, and the bottom panel corresponds to a temperature of 390 K. Performance data is plotted as the uncertainty versus computational time. The uncertainty (y axis) represents the relative standard deviation in the saturation pressure obtained from four independent simulations.
Figure 6. Gibbs ensemble and GC-TMMC performance data for water. Comparisons were made at 500 K (top panel), 400 K (middle panel), and 300 K (bottom panel). Performance data is plotted as the uncertainty versus computational time. The uncertainty (y axis) represents the relative standard deviation in the saturation pressure obtained from four independent simulations.
In Figure 5, the performance of the Gibbs ensemble and GCTMMC methods for 2,5-dimethylhexane, a molecule with a more complex topology than the previously discussed linear n-alkanes, at 510 K (top panel) and 390 K (bottom panel) is compared. The uncertainty in the saturated vapor pressure is plotted as a function of CPU time. Again, it is apparent that the limiting uncertainty of GC-TMMC is significantly lower than that of Gibbs ensemble. For the saturated liquid density, the precisions of the two simulation methods are comparable and are in the 0.1% range. Notice again that after the initial transitory period in the GC-TMMC performance data, there is only limited reduction in the uncertainties. From a practical perspective, the GC-TMMC simulations could be terminated at this point. The crossover time is less than one hour at 510 K and increases to 10 h at 390 K. Similar performance trends were observed in the case of cyclohexane. Performance data for water are presented in Figure 6. For this fluid, Gibbs ensemble and GC-TMMC simulations were compared at 300, 400, and 500 K, which represent low, intermediate, and high reduced temperatures. At the elevated temperatures (middle and top panel of Figure 6), the limiting GC-TMMC uncertainty is smaller than the limiting Gibbs ensemble uncertainty. The crossover point occurs at approximately 5 and 20 h at temperatures of 500 and 400 K, respectively. Both methods yield comparable limiting uncertainties in the liquid densities, which are on the order of 0.1%. As expected, since the vapor phase is sufficiently dilute, at 300 K (bottom panel of Figure 6), the limiting uncertainties in the saturated vapor pressure for both methods are comparable. At this lowest temperature, the Gibbs ensemble proves to be slightly better. More importantly, notice the discrepancy in the amount of computational time required by each method to attain its limiting uncertainty level. By 100 CPU hours, the Gibbs ensemble already yields a reasonably precise saturation pressure at 300 K. In contrast, GC-TMMC requires almost 400 CPU hours to reach a comparable level of precision. Analogous performance was found for the saturated vapor densities at this
temperature. Similar difficulties were also encountered in the case of 1-propanol at low reduced temperatures. The apparent inefficiency of the GC-TMMC simulations of water at 300 K can be understood by focusing on the initial transitory periods in the performance data at 400 and 500 K. At the two higher temperatures, the duration of this transient is relatively short-lived and decays within 30 CPU hours. Notice, too, at 400 K that there is a gradual decay in the uncertainty rather than an abrupt drop following the initial transitory period. At 300 K, however, the transitory period consumes most of the simulation time. In fact, it is arguable that the transient requires more time to die out than the given duration of the simulation itself. As discussed earlier, the very existence of the initial transient in the GC-TMMC simulations is due to inadequate sampling of N space, which in turn is tied to the lack of an effective biasing function. The convergence rate of the probability distribution (and therefore the biasing function) is influenced by a number of factors, including the magnitude of the free energy barrier separating the liquid and vapor phases, the complexity of intermediate inhomogeneous configurations sampled by the system,33 and the anisotropy of the molecular interactions. In general, the rate of convergence decreases with decreasing temperature. This trend is quite clearly illustrated in the performance data presented here. Of crucial importance to the efficacy of the GC-TMMC method is the construction of the biasing function. At high temperatures, this is not a serious issue since the free energy barrier is low relative to thermal fluctuations. Moreover, the likelihood of encountering well-defined inhomogeneous structures, which often hinder efficient sampling,33 is reduced considerably at elevated temperatures. However, as can be seen in the case of water at 300 K, convergence can be a serious problem at low reduced temperatures. It makes sense that the performance of the GC-TMMC method can be improved by reducing the computational burden associated with construction of the biasing function. This can be achieved by a number of strategies. Here, we discuss three of these. The first, which has already been used in a number of previous studies,23–29,33,41 can
4540 Ind. Eng. Chem. Res., Vol. 47, No. 13, 2008
be described as a “divide and conquer” approach, which entails decomposing N space (0 e N e Nmax) into windows. Within each window, independent GC-TMMC simulations are performed to collect transition probability statistics over a relatively narrow range of N values, which are subsequently combined within a master collection matrix and used to determine the probability distribution over the entire N range of interest. This approach is ideally suited to parallel processing and is particularly useful when Nmax is very large. Recall that the GC-TMMC simulations in this study were initiated with a uniform biasing function. A second possible approach involves initiating a GCTMMC simulation with a reasonable estimate of Π(N), and thus η(N) (see eq 7). A plausible route involves the use of a hightemperature distribution Π(N; TH), which, according to the performance data, can be computed easily and rapidly, to estimate the distribution at a lower temperature Π(N; TL), where TL < TH . Future work involves the development of statistical mechanical approaches for this purpose. A third potential strategy involves combining the TMMC approach adopted here with the Wang-Landau (WL) sampling scheme.64 The latter technique enables one to rapidly obtain a rough estimate of the density probability distribution, but converges at a slower rate than the TMMC method.65 Shell et al. have demonstrated the utility of a combined scheme65 in which sampling is initiated with the WL approach and later turned over to the fasterconverging TMMC method. These authors focused on the construction of energy probability distributions at constant density. Employing this approach to sample density space represents an intriguing possibility. The GC-TMMC methodology is well-suited for calculating a wide variety of fluid-phase properties, including phase equilibrium. For example, from a practical perspective, one can show that the isothermal equation of state (pressure as a function of density) can be obtained directly from Π(N).25 The broad utility of the GC-TMMC method stems from the intimate relationship between the system’s free energy and the density probability distribution Π(N). In addition, an attractive aspect of GC-TMMC is the prospect of obtaining data over a wide range of state points from a single simulation. Given a generic system property X, say the potential energy or degree of polymerization or density profile, the combination of Π(N) with canonical averages 〈X(N, V, T)〉, which can be easily collected during the course of a GC-TMMC simulation, the average value 〈X〉 can be calculated straightforwardly by Nmax
〈X 〉
)
∑ Π(N;µ, V, T)〈X(N, V, T) 〉
(16)
N)0
Since histogram reweighting can be used to determine Π(N; µ, V, T) at other chemical potential values, 〈X〉 can therefore be calculated over a wide range of state points. Extension of these ideas to multicomponent systems is straightforward.28,45,46 Interestingly, a possible direction for future work involves a comparative study of the type performed here to phase equilibria calculations involving mixtures of molecular fluids. For simple binary fluid mixtures, it has already been shown that complete isothermal phase diagrams can be obtained in a single grand-canonical transition-matrix simulation by determining the multidimensional density probability distribution Π(N1, N2; µ1, µ2, V, T) .27,35 In contrast, a single Gibbs ensemble simulation only yields a single phase coexistence point. A related transition-matrix-based variant for mixtures that has recently been developed uses the total density as an order parameter. Like the Gibbs ensemble, this variant yields a single phase coexistence point per simulation. The relative speed of
the methods in this context has yet to be compared systematically. This requires implementation of a molecule identity change move in Towhee, which is currently not available. Finally, we note that it is possible to conduct TMMC-based phase equilibria calculations within the isothermal-isobaric ensemble. The formalism for completing single-component simulations of this type is outlined in ref 23. Comparison of this approach with the grand-canonical variant explored here provides another possible direction for future work. V. Summary and Conclusions Results have been presented from a computational study comparing the use of the GC-TMMC and Gibbs ensemble methods for determining the saturation properties of a variety of pure molecular fluids. The study was performed within the framework of the MCCCS Towhee simulation package.52 Towhee was chosen for several reasons. First, the Gibbs ensemble method was already available in the package. Second, other state-of-the-art sampling techniques required for simulating complex fluids, such as CBMC and aggregation-volume bias trial moves, were already implemented in Towhee. Last, a variety of force-fields could be used in this package. Implementation of the GC-TMMC method within Towhee required making minimal modification to the source code. Two sets of simulations were performed. In the first set, the saturation curves for ethane, n-octane, cyclohexane, 2,5dimethylhexane, 1-propanol, and water were determined via GCTMMC and compared with the Gibbs ensemble results reported in the literature. In all cases, the agreement was excellent. For each molecule, a second set of simulations, consisting of both Gibbs ensemble and GC-TMMC, was performed at a low and a high reduced temperature to directly compare their performance. Simulation parameters were chosen such that observed performance discrepancies could be attributed to the differences in the methods. Performance was assessed on the basis of monitoring the uncertainties in the phase coexistence properties as a function of computational time. In all cases, with one exception, the limiting uncertainty in the saturated vapor pressure obtained by GC-TMMC was significantly lower, in some instances by an order of magnitude, than that of the Gibbs ensemble. This was also true for the saturated vapor density. Both methods had comparable limiting uncertainties in the saturated liquid density. In the case of water at 300 K, the Gibbs ensemble uncertainties were slightly better than the GC-TMMC uncertainties. The GC-TMMC performance data were characterized by an initial transitory period where the uncertainties were as large as and sometimes larger than the Gibbs ensemble uncertainties. This initial transient was a consequence of the system’s inability to sample the range of N values, which in turn was due to lack of enough accumulated statistics to construct an effective biasing function. However, once enough statistics were accumulated, the limiting uncertainty levels were reached shortly thereafter. In general, for short computational times, the Gibbs ensemble method appears to be suitable for providing rough (order-ofmagnitude) estimates of saturation properties, but the GCTMMC method can provide significantly more precise predictions given slightly longer computational time. The exception to this general observation was low-temperature water. In the specific case of water at 300 K, it appeared that the GC-TMMC simulations were not able to accumulate enough statistics over the duration of the simulations and thus were stuck in this transitory period. Three approaches aimed at circumventing these difficulties by reducing the work required to construct the
Ind. Eng. Chem. Res., Vol. 47, No. 13, 2008 4541
biasing function were also discussed. Future work will focus on the development of methods for this purpose. Acknowledgment The authors would like to thank Professor Ilja Siepmann for informative discussions regarding his group’s experience with the Gibbs ensemble method. We would also like to thank Dr. Michael Frenkel for providing a copy of ThermoData Engine. J.R.E. gratefully acknowledges the financial support from the National Science Foundation under Grant No. CTS-0238772. This study utilized the high-performance computational capabiliities of the Biowulf PC/Linux cluster at the National Institutes of Health, Bethesda, MD (http://biowulf.nih.gov) and the University at Buffalo Center for Computational Research (http:// www.ccr.buffalo.edu). Literature Cited (1) Panagiotopoulos, A. Z. Mol. Phys. 1987, 61, 813–826. (2) Panagiotopoulos, A. Z.; Quirke, N.; Stapleton, M.; Tildesley, D. J. Mol. Phys. 1988, 63, 527–545. (3) Panagiotopoulos, A. Z. Int. J. Thermophys. 1989, 10, 447–457. (4) Siepmann, J. I. Mol. Phys. 1990, 70, 1145–1158. (5) Siepmann, J. I.; Frenkel, D. Mol. Phys. 1992, 75, 59–70. (6) Laso, M.; de Pablo, J. J.; Suter, U. W. J. Chem. Phys. 1992, 4, 2817– 2819. (7) Vlugt, T. J. H.; Martin, M. G.; Smit, B.; Siepmann, J. I.; Krishna, R. Mol. Phys. 1998, 94, 727–733. (8) Martin, M. G.; Siepmann, J. I. J. Phys. Chem. B 1999, 103, 4508– 4517. (9) Martin, M. G.; Frischknecht, A. L. Mol. Phys. 2006, 104, 2439– 2456. (10) Kofke, D. A. J. Chem. Phys. 1993, 98, 4149–4162. (11) Mehta, M.; Kofke, D. A. Chem. Eng. Sci. 1994, 49, 2633–2645. (12) Agrawal, R.; Kofke, D. A. Mol. Phys. 1995, 85, 23–42. (13) Agrawal, R.; Kofke, D. A. Mol. Phys. 1995, 85, 43–59. (14) Ferrenberg, A. M.; Swendsen, R. H. Phys. ReV. Lett. 1988, 61, 2635–2638. (15) Vrabec, J.; Fischer, J. Mol. Phys. 1995, 85, 781–792. (16) Berg, B. A.; Neuhaus, T. Phys. ReV. Lett. 1992, 68, 9–12. (17) Fenwick, M. K.; Escobedo, F. A. J. Chem. Phys. 2004, 120, 3066– 3074. (18) Smith, G. R.; Bruce, A. D. J. Phys. A 1995, 28, 6623–6643. (19) Wang, J.-S.; Tay, T. K.; Swendsen, R. H. Phys. ReV. Lett. 1999, 82, 476–479. (20) Wang, J.-S.; Swendsen, R. H. J. Stat. Phys. 2002, 106, 245–285. (21) Fitzgerald, M.; Picard, R. R.; Silver, R. N. Europhys. Lett. 1999, 46, 282–287. (22) Fitzgerald, M.; Picard, R. R.; Silver, R. N. J. Stat. Phys. 2000, 98, 321–345. (23) Errington, J. R. J. Chem. Phys. 2003, 118, 9915–9925. (24) Singh, J. K.; Kofke, D. A.; Errington, J. R. J. Chem. Phys. 2003, 119, 3405–3412. (25) Shen, V. K.; Errington, J. R. J. Phys. Chem B 2004, 108, 19595– 19606. (26) Singh, J. K.; Kofke, D. A. J. Chem. Phys. 2004, 121, 9574–9580. (27) Shen, V. K.; Errington, J. R. J. Chem. Phys. 2005, 122, 064508. (28) Errington, J. R.; Shen, V. K. J. Chem. Phys. 2005, 123, 164103. (29) Singh, J. K.; Errington, J. R. J. Phys. Chem. B 2006, 110, 1369– 1376. (30) Singh, J. K.; Adhikari, J.; Kwak, S. K. Fluid Phase Equilib. 2006, 248, 1–6.
(31) Errington, J. R.; Truskett, T. M.; Mittal, J. J. Chem. Phys. 2006, 125, 244502. (32) Mittal, J.; Errington, J. R.; Truskett, T. M. Phys. ReV. Lett. 2006, 96, 177804. (33) MacDowell, L. G.; Shen, V. K.; Errington, J. R. J. Chem. Phys. 2006, 125, 034705. (34) Errington, J. R. Phys. ReV. E: Stat., Nonlinear, Soft Matter Phys. 2003, 67, 012102. (35) Shen, V. K.; Errington, J. R. J. Chem. Phys. 2006, 124, 024721. (36) Shen, V. K.; Mountain, R. D.; Errington, J. R. J. Phys. Chem. B 2007, 111, 6198–6207. (37) Errington, J. R.; Kofke, D. A. J. Chem. Phys. 2007, 127, 174709. (38) Cichowski, E. C.; Schmidt, T. R.; Errington, J. R. Fluid Phase Equilib. 2005, 236, 58–65. (39) Errington, J. R.; Wilbert, D. W. Phys. ReV. Lett. 2005, 95, 226107. (40) Errington, J. R. Langmuir 2004, 20, 3798–3804. (41) Grzelak, E. M.; Errington, J. R. J. Chem. Phys. 2008, 128, 014710. (42) Rosh, T. W.; Errington, J. R. J. Phys. Chem. B 2007, 111, 12591– 12598. (43) Chen, H.; Sholl, D. S. Langmuir 2006, 22, 709–716. (44) Chen, H.; Sholl, D. S. Langmuir 2007, 23, 6431–6437. (45) Shen, V. K.; Cheung, J. K.; Errington, J. R.; Truskett, T. M. Biophys. J. 2006, 90, 1949–1960. (46) Cheung, J. K.; Shen, V. K.; Errington, J. R.; Truskett, T. M. Biophys. J. 2007, 92, 4316–4324. (47) Frenkel, D.; Smit, B. Understanding Molecular Simulation, 2nd ed.; Academic Press, San Diego, CA, 2002. (48) Chen, B.; Siepmann, J. I. J. Phys. Chem. B 2001, 105, 11275– 11282. (49) Chen, B.; Siepmann, J. I.; Oh, K. J.; Klein, M. L. J. Chem. Phys. 2001, 115, 10903–10913. (50) Chen, B.; Siepmann, J. I.; Oh, K. J.; Klein, M. L. J. Chem. Phys. 2002, 116, 4317–4329. (51) Potoff, J. J.; Errington, J. R.; Panagiotopoulos, A. Z. Mol. Phys. 1999, 97, 1073–1083. (52) MCCCS Towhee. Available at: http://towhee.sourceforge.net. (53) Martin, M. G.; Siepmann, J. I. J. Phys. Chem. B 1998, 102, 2569– 2577. (54) Chen, B.; Potoff, J. J.; Sipemann, J. I. J. Phys. Chem. B 2000, 105, 3093–3104. (55) Lee, J.-S.; Wick, C. D.; Stubbs, J. M.; Siepmann, J. I. Mol. Phys. 2005, 103, 99–104. (56) Berendsen, H. J.; Grigera, J. R.; Straatsma, T. P. J. Phys. Chem. 1987, 91, 6269–6279. (57) Alejandre, J.; Tildesley, D. J.; Chapela, G. A. J. Chem. Phys. 1995, 102, 4574–4583. (58) Errington, J. R.; Panagiotopoulos, A. Z. J. Phys. Chem B 1998, 102, 7470–7475. (59) Arbuckle, B. W.; Clancy, P. J. Chem. Phys. 2002, 116, 5090–5098. (60) Karasawa, N.; Goddard, W. A. J. Phys. Chem. 1989, 93, 7320– 7327. (61) Frenkel, M.; Chirico, R. D.; Diky, V. V.; Muzny, C.; Lemmon, E. W.; Yan, X.; Dong, Q. NIST ThermoData Engine, NIST Standard Reference Database 103, version 2.0; National Institute of Standards and Technology, Standards Reference Data Program: Gaithersburg, MD, 2006. (62) Frenkel, M.; Chirico, R. D.; Diky, V. V.; Yan, X.; Dong, Q.; Muzny, C. J. Chem. Inf. Model. 2005, 45, 816–838. (63) Diky, V. V.; Muzny, C.; Lemmon, E. W.; Chirico, R. D.; Frenkel, M. J. Chem. Inf. Model. 2007, 47, 1713–1725. (64) Wang, F.; Landau, D. P. Phys. ReV. Lett. 2001, 86, 2050–2053. (65) Shell, M. S.; Debenedetti, P. G.; Panagiotopoulos, A. Z. J. Chem. Phys. 2003, 119, 9406–9411.
ReceiVed for reView January 25, 2008 ReVised manuscript receiVed March 14, 2008 Accepted March 26, 2008 IE800143N