A Novel Technique To Predict the Solubility of ... - ACS Publications

Nov 21, 2016 - and P. Tarazona*,‡. †. Departamento de Física Teórica de la Materia Condensada, and. ‡. Departamento de Física Teórica de la ...
1 downloads 0 Views 2MB Size
Subscriber access provided by Fudan University

Article

A novel technique to predict the solubility of planar molecules Sergio Panzuela Pérez, Marco Bernabei, Enrique Velasco, Rafael Delgado-Buscalioni, and Pedro Tarazona Energy Fuels, Just Accepted Manuscript • DOI: 10.1021/acs.energyfuels.6b01587 • Publication Date (Web): 21 Nov 2016 Downloaded from http://pubs.acs.org on November 22, 2016

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Energy & Fuels is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

A novel technique to predict the solubility of planar molecules S. Panzuela,∗,† M. Bernabei,† E. Velasco,‡ R. Delgado-Buscalioni,‡ and P. Tarazona∗,‡ Departamento de Física Teórica de la Materia Condensada, Universidad Autónoma de Madrid, 28049 Madrid (Spain), and Departamento de Física Teórica de la Materia Condensada and Institute of Condensed Matter Physics (IFIMAC), Universidad Autónoma de Madrid, 28049 Madrid (Spain) E-mail: [email protected]; [email protected]

Abstract We present a new computational technique to quantify the solubility of planar molecules in a solvent. Solubility is calculated as the critical concentration at which solute molecules cease to stack as columns, but rather aggregate in all directions. An explicit expression for the solubility is obtained which involves the potential of mean force between two solute molecules as a function of their centre-of-mass distance in the limit of infinite dilution. This function can be easily obtained from molecular dynamics simulations involving a pair of solute molecules in a solvent using the umbrella sampling method. As a validation of our approach, we use a generic coarse-grained molecular model to represent the molecular interactions of ∗ To

whom correspondence should be addressed de Física Teórica de la Materia Condensada, Universidad Autónoma de Madrid, 28049 Madrid (Spain) ‡ Departamento de Física Teórica de la Materia Condensada and Institute of Condensed Matter Physics (IFIMAC), Universidad Autónoma de Madrid, 28049 Madrid (Spain) † Departamento

1

ACS Paragon Plus Environment

Energy & Fuels

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

polycyclic-aromatic-hydrocarbon. Within that coarse-grained model, the solubility of pyrene and acenaphthene in heptane are estimated through large molecular dynamics simulations and compared with the experimental results. The umbrella sampling method, applied to single pairs of these molecules in the solvent, provides the values of the critical cluster size in the theoretical model of molecular stacking. Then umbrella sampling simulations for the first members of the polycyclic-aromatic-hydrocarbon series are used to predict their solubilities through our theoretical method. Within the typical uncertainty associated to theoretical solubility estimates, the agreement of our results with the experimental values is quite remarkable for most of the members of the series in a wide range of molecular masses, which confirms the general validity of the method in the case of planar molecules. Among the molecules explored, agreement with experiment fails for anthracene, for which the experimental solubility is clearly out of the general trend along the polycyclic-aromatic-hydrocarbon series, indicating that the coarse-grained representation used here is not able to capture its peculiarity. The new methodology can be applied to planar molecules to obtain relatively accurate values of solubility at a very low computational cost.

1 Introduction A proper understanding of the solubility behaviour of organic solids in different solvents is of great interest in many fields, including environmental predictions, biochemistry, pharmacy, drug-design, oil industry, agrochemical design, and protein ligand binding. The solubility of a solute in a solvent is defined as the concentration of the saturated solution in equilibrium with the precipitate. Any extra amount of solute added to the system precipitates as solid, until the solute concentration ρs in the liquid solvent is restablished to the equilibrium solubility ρsol . The solubility of a solute depends obviously on the kind of solvent and on the thermodynamic conditions of pressure and temperature. Despite its simple definition, the accurate calculation of solubilities is a considerable

2

ACS Paragon Plus Environment

Page 2 of 34

Page 3 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

challenge since the solubility depends exponentially on a free-energy difference. The challenge of predicting and accounting for solubilities of pairs of substances is also an experimental one and recent efforts in collecting disperse and, in many instances, inaccurate data is being carried out by several groups and institutions. 1,2 Not surprisingly, uncertainties in theoretical predictions are typically large and can be as high as 50% 3 and need to be handled with care, 4 particularly for sparingly soluble substances, like large PAH’s 5 as clearly shown in the recent compilation of Ref. 6 (see Fig. 1 therein). The number of theoretical models aimed at predicting the solubility behaviour of an organic solid in a solvent is not very large. Phenomenological approaches, based on a comparison of single-substance quantities (and the assumption that like-dissolves-like) are useful and provide rapid guidance on the “solvation strength” of a substance. The most popular and experimentally accessible one is probably the Hildebrand parameter δ (the square root of the substance cohesive energy density). Although this type of approach greatly oversimplifies the physics of solubility, many computational works have been devoted to directly computing Hildebrand and Hansen solubility parameters. 3,7 Another route taken in computational works to predict the solvation behaviour of organic solutes is partially based on thermodynamic relations and often require a lot of experimental data in order to solve their constitutive equations and thus provide reliable predictions. 8,9 QSPR (quantitative structure-properties relationship), often using machine learning algorithms and neural networks, predict the solubility from the molecular structures of solvent and solute, 10 which requires a large database of highly accurate experimental solubilities. More recently, molecular dynamics (MD) and Monte Carlo (MC) simulation techniques have been exploited to study the solubility behaviour of different solute/solvent systems. An advantage of this approach is that, once the molecular model is formulated, there is no need to incorporate external experimental data. As a bonus, the simulation technique also provides further insight into the molecular mechanism governing the solubility process. A obvious difficulty of this method lies in the formulation of a suitable interaction model

3

ACS Paragon Plus Environment

Energy & Fuels

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

that can reproduce at the end realistic values of solubility. The standard route to study solubility by means of computer simulations is to measure the concentration dependence of the chemical potential of the solute in solution. 11,12 At the saturated concentration, the chemical potential of the dissolved solute equals the chemical potential of the pure crystal or amorphous solid phase. However, computing the chemical potential of the solute in the liquid and solid states is not an easy task. If molecules are too large or the system is too dense, a direct calculation using the Widom particle insertion method 13 is not feasible. An alternative route involves a generalized thermodynamic integration in terms of a parameter that couples a reference system (of known free energy) to the system of interest. 14–18 This approach demands a considerable computational effort. In a more direct simulation approach, one monitors the formation of clusters of solute molecules, and observe their size distributions in simulations with increasing concentration of solute. As ρs approaches its solubility limit ρsol , the concentrations of isolated solute molecules and small clusters (dimers, trimers,...) saturate, because any further increase in the total number of solute molecules goes to form larger clusters. However, this direct ’brute force’ estimation of ρsol becomes computationally very expensive in the case of large organic molecules with very low solubility (i.e. in a ’poor solvent’), since huge simulation boxes full of solvent molecules, and very long simulations times, are needed to explore the formation of even small solute clusters, and a fair estimation of solubility requires the presence of relatively large clusters. Even for smaller molecules (like pyrene, hereby considered) brute force calculations are not appropriate to achieve fast evaluation of the solubility of a series of different compound pairs. But such a problem is precisely the one encountered in most applications e.g. when seeking for the “optimal” solute. In this work we propose a novel, fast, and simple methodology to obtain a good estimation of the solubility of large planar molecules in a poor solvent from computer simulations. For concentrations well below the solubility limit, the typical clusters formed by these molecules are roughly linear stacks, since in this configuration molecules take full

4

ACS Paragon Plus Environment

Page 4 of 34

Page 5 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

advantage of their mutual attraction over the whole of their flat sides. For large and flat molecules, the attractive energy in these molecular-stacking configurations grows as the molecular weight, and clusters are very stable, even at very low solvent concentrations, causing a ’brute-force’ estimation by simulation of the very low solubility to be difficult. What we propose here is to take advantage of the strong tendency to molecular stacking and to describe the cluster size distribution from the structure of a single dimer, as it would happen for perfectly one-dimensional (1D) growth of columnar clusters. Then, we assume the existence of a ’critical mean cluster size’ beyond which solvent segregation becomes three-dimensional (3D) like. We show that this is a rather robust empirical parameter to describe a whole series of molecules with solubility values spanning more than three orders of magnitude. As a test bed for the method, we have studied the solubility of polycyclic aromatic hydrocarbon (PAHs) compounds in heptane. These planar hydrocarbon molecules are of great interest in various contexts. They are found in fossil fuels and as a product in the combustion of organic matter, 19 and affect health because of their mutagenic and carcinogenic character. 20 Also, PAHs may be abundant in the universe and are considered to be a possible carbon reservoir for the origin of simple forms of life. PAHs are nonpolar, hydrophobic molecules; therefore, they are insoluble in water and, as contaminants, are not easy to eliminate. Knowledge of their solubility in various solvents 6,21,22 is of great importance in agriculture in connection with possible soil remediation techniques. PAHs can also be regarded as building blocks of the central cores of asphaltene molecules present in crude oil, and a proper understanding of their solubility properties in polar and nonpolar solvents is of interest in the petroleum industry. 5,23 The extended PAHs, the largest members of the PAH family (also called nanographenes 24 ), are attracting renewed and more widespread attention as part of the present interest in graphene. The experimental data for PAHs in water and other solvents shows that their solubility decreases approximately exponentially with molecular mass and our method is able to

5

ACS Paragon Plus Environment

Energy & Fuels

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

capture this feature. Previous simulation studies revealed that PAHs form one-dimensional columnar aggregates in heptane. 25 Such a structure is formed by π − π stacking of the molecular aromatic cores and is facilitated by the planarity of the molecules. This type of stacking is ideally suited to our method, which analyses the infinitely dilute mixture. To crosscheck the robustness of our method against experimental results we have to split the problem in two steps, the modelling of the molecular interactions and the estimation of the solubility for a given molecular interaction model. Both steps are focused towards the application of the method to estimate the solubility of large and very large PAH, and therefore the method should be designed to use a transferable model for the interactions, which once it has been validated for a reference PAH molecule, in a given solvent, allows to guess a good interaction model for other molecules. We use a MARTINI coarse-grained (CG) force-field whose parameters give a good representation of the heptane-benzene liquid mixture at room temperature. Then, under the usual MARTINI rules, the CG interactions of any PAH in the same solvent (heptane) may be obtained. This is obviously an approximate representation of the molecular interactions, but it offers a straightforward practical procedure to cover the whole range of large PAH, in terms of their basic geometrical structure. We have chosen pyrene and acenaphtene as our reference PAHs, to study their solubility in heptane. Pyrene it is large enough to have a fairly low solubility, but still not so low as to make impossible its evaluation by means of extensive (’brute force’) coarse-grained molecular dynamics (CGMD) simulations, which allow for the cluster size analysis. Acenaphtene is a smaller PAH, with higher solubility, that may be used as a more stringent test for the planar stacking hypothesis. For these two molecules we obtain first a direct check on the accuracy of the CG model with ’brute force’ large MD simulations. Then we apply our proposal of columnar stacking to estimate the effective threshold to full 3D cooperative aggregation. Theoretical predictions are made for the solubility of a larger series of PAHs molecules, based on the empirical stacking threshold obtained for pyrene and acenaphtene,

6

ACS Paragon Plus Environment

Page 6 of 34

Page 7 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

and the results are compared directly with the experimental data.

2 A theoretical framework for the solubility of planar molecules We first consider the solubility of identical planar molecules that aggregate in columns from a dilute solution of monomers. The partition function of a monomer in a volume V is Z1 = ΓV/vo , where vo is the configuration space unit volume and the factor Γ represents the complete solid angle explored by the orientation of the molecule. Consider two such molecules, 1 and 2, with interaction energy U (12) that depends on their relative distance and orientation. We define a dimer, rather loosely, as a given subset of the (12) space in which the two molecules have a strong mutual attraction, and write the partition function of such a dimer as Z2 = Z1 Zb /2, where Z1 includes all possible positions and orientations of the dimer (as those of a monomer) and

Zb =

1 vo

Z Z

dimer

d3 r12 dΩ2 e− βU (12) .

(1)

Here the integral is extended over the ’dimer’ subspace and includes the Boltzmann factor for the pair potential energy U (12), multiplied by β = 1/kT, being k Bolztmann’s constant and T the temperature. The factor 2 between Zb and Z2 /Z1 takes into account the symmetry between the monomers labelled as 1 and 2, or equivalently the fact that monomer 2 may be attached to any of the two sides of monomer 1. With this consideration, the partition function for larger clusters, assuming linear stacking of n molecules, is directly obtained as Zn = Z1 ( Zb /2)n−1 , and the total grand-partition function for the ideal mixture of monomers and clusters of all sizes, with the chemical equilibrium requirement µn = nµ between their chemical potentials, is ∞

ΞT =





∏ Ξn =

∏ ∑

n =1

n=1 Nn =1

 βnµ Nn

Zn e Nn !



= exp



Zn e βnµ

n =1

7

ACS Paragon Plus Environment

!



 = exp 



Z1 e βµ  . Zb βµ  1− e 2

(2)

Energy & Fuels

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 34

The mean number of clusters of size n is ∂ log (Ξn ) 2Z1 h Nn i = = nZn e βnµ = ∂( βnµ) Zb



Zb e βµ 2

n

.

(3)

From this, the density of monomers is ρ1 = h N1 i/V = Z1 e βµ /V = Γe βnµ /vo , and the ratio between the density of dimers and monomers, λ ≡ ρ2 /ρ1 = h N2 i/h N1 i = Zb e βµ /2, allows to calculate the density of any n-molecule cluster as

ρ n = ρ1



Zb e βµ 2

 n −1

≡ ρ1 λ n −1 .

(4)

Obviously ρn , and the total concentration of the solute, would grow unbounded for λ ≥ 1, so that the chemical potential in a saturated solution has to be µ < −kT log( Zb /2). In practice, we expect that the solubility is below the 1D limit since for large clusters the planar stacking assumption would fail: new molecules can join the cluster not only at the two ends of the column, but also on lateral positions, giving a (3D) growth of the clusters. Condensation (precipitation) of the solvent would occur as a first-order phase transition, instead of the continuous 1D transition (at λ → 1) predicted by a strictly 1D stacking. To include the 1D-3D threshold in our analysis we calculate, within the assumption of 1D stacking, the total concentration of solute by adding molecules in aggregates of any size: ∞

ρs =





nρn = ρ1

n =1

ρ1

∑ nλn−1 = (1 − λ)2 .

(5)

n =1

Always within the 1D columnar stacking hypothesis, we can also obtain the mean cluster size



∑ nρn hni =

n =1 ∞

=

∑ ρn

1 , 1−λ

(6)

n =1

and its mean square fluctuation, which may be written either in terms of λ or in terms of

8

ACS Paragon Plus Environment

Page 9 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

ρs :

h∆n2 i ≡ hn2 i − hni2 =

vo Zb λ ρs = ρ = s 2Γ 2Γ (1 − λ )2

Z Z

dimer

d3 r12 dΩ2 e− βU (12) .

(7)

This equation states that, within the 1D (columnar) model for the solute clusters, the mean square fluctuations in the cluster size distribution grows linearly with the total concentration, and the ratio of growth is determined by the bond partition function Zb , which may be obtained from a computer simulation with just two solute molecules. The key point in our approach comes now as the assumption that the solubility limit for the saturated solution, ρs ≤ ρsol , may be described as a limit λ ≤ λc or equivalently

h∆n2 i ≤ ∆2c ≡ λc /(1 − λc )2 , at which columnar clusters would be overcome by the full 3D-like growth of clusters. The hope is that, over a broad range of PAH molecules, the solubility ρsol , could be obtained through Equation (7), with very different Zb but with similar values of ∆2c or λc . In that case, the large variation observed in the experimental results for the solubility of different PAHs could be theoretically estimated from Equation (1), i.e. from the Boltzmann factor of a bonded pair, averaged over the molecular orientations, which is precisely the zero concentration limit of the solute-solute radial distribution function, calculated in terms of the distance between the molecular centres, 1 g0 ( r ) = Γ

Z

dΩ2 e− βU (12) .

(8)

If we set a maximum distance between the molecular centres r12 ≤ rb to define a bonded pair, the bond partition function is 4πΓ Zb = vo

Z r b 0

9

dr r2 g0 (r ),

ACS Paragon Plus Environment

(9)

Energy & Fuels

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 34

and the solubility, calculated as the λ = λc limit of Equation (7), becomes λc

ρsol = 2π (1 − λc )

2

Z r b 0

2

=

dr r g0 (r )



Z r b 0

∆2c 2

.

(10)

dr r g0 (r )

The above expression is reminiscent of an alternative estimation of solubility in terms of the solvent second virial coefficient, which is also an integral of the the radial distribution function at the ρs = 0 limit,

B2 = −2π

Z ∞ 0

dr r2 [ g0 (r ) − 1] .

(11)

The virial expansion for the osmotic pressure of the solute, βps /ρs = 1 + B2 ρs + ..., may be compared with the total osmotic pressure for an ideal mixture of monomers, dimers, and larger (1D) clusters, βps = ∑n ρn = ρ1 /(1 − λ) = (1 − λ)ρs . In the limit of very low solute density, λ ≪ 1 or hni ≈ 1 + λ + ..., we obtain B2 ρs ≈ −λ + O(λ2 )), and we could estimate the solubility as the value of ρs at which λ reaches the threshold value λc = 1 − 1/nc , ρsol = −

λc + O 2 (λc ) = B2



Z ∞ 0

∆2c 2

dr r [ g0 (r ) − 1]

+ O 2 ( λ c ).

(12)

The difference between our proposal Equation (10) and the latter expression sheds some light on the limits of the whole approach. For large solute molecules, with a strong tendency to segregate from the solvent, we expect that g0 (r ) ≃ exp [− βU (12)] exhibits a very high peak at the typical dimer distance. As far as the whole peak is included in the ’bond’ definition, the integrals in the denominator of Equations (10) and (12) should be similar (and very large). Then the integral of g0 (r ) for r ≤ rb and the integral of g0 (r ) − 1 over the whole range of r would not differ qualitatively, and both would give similar estimates for the (very low) solubility of those molecules. However, as discussed below and shown in Figure 6 in Section 5.2, our proposal, Equation (10), happens to be much

10

ACS Paragon Plus Environment

Page 11 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

more robust than Equation (12) in the description of PAH molecules of relatively small size.

3 Coarse-grained molecular models for PAH As mentioned in the introduction, we focus here on the solubility behaviour of polycyclic aromatic compounds in heptane. The analysed molecules are displayed in Figure 1, which correspond to the first members of the PAH series for increasing molecular mass: naphthalene, acenaphthene, phenanthrene, anthracene, pyrene and coronene. 26,27 The main mechanism of aggregation in polycyclic aromatic compounds should be the parallel π − π stacking of the aromatic cores. Under this assumption, we chose to represent the interactions between molecules through a coarse-grained force field. In this representation, groups of atoms are joined into single beads that interact via Lennard-Jones potentials with suitably chosen energy and length parameters. The benzene molecule has already been parametrised within the MARTINI force field, a successful coarse-grained (CG) representation of organic solvents and biomolecules, in terms of three interaction beads joined by very stiff bonds. Taking the benzene molecule parametrized by MARTINI as a reference, and following the MARTINI philosophy, we obtained a force field for all the molecules in Figure 1 by introducing a two-to-one mapping procedure that preserves the cyclic polyaromatic structure of the molecules. The method may be directly applied to any larger PAH, setting a generic scheme to approximate the properties of any of these molecules in an organic solvent. The mapping is done in such a way that every two carbons and their corresponding hydrogens are mapped onto a single CG bead (green bead in Figure 1). Note that, in this procedure there is not an univocal representation of the CG molecule. However, all possible representations are analogous as long as they respect the two-to-one mapping. Since all the PAH molecules possess benzene-like aromatic rings, we assigned a MARTINI

11

ACS Paragon Plus Environment

Energy & Fuels

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 34

SC4-type to all beads, using a standard bond length (0.27 nm), and set a high force-constant value kbond = 9000 kJ mol−1 to the bonds so as to keep molecules as rigid as possible. Also, dihedral angles were controlled to prevent out-of-plane distortion among the aromatic cores by means of a force constant kdihedral = 200 kJ mol−1 . These values for the force constants guarantee a planar geometry within an uncertainty of 10%. Organic solvents like heptane are already parametrized within the MARTINI force field. 28 It should be noted that the properties of the alkane series are well reproduced by the MARTINI force field at T = 298 K and P = 1 atm. The cross interaction between alkane and PAH molecules is treated using the standard combining rules of the MARTINI force field. Finally, Lennard-Jones intermolecular potential were truncated and shifted at the cut-off distance of 1.2 nm.

Figure 1: Chemical structure and MARTINI bead representation of the PAH molecules studied. (a) Naphthalene. (b) Acenaphthene. (c) Phenanthrene. (d) Anthracene. (e) Pyrene. (f) Coronene.

12

ACS Paragon Plus Environment

Page 13 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

4 Computational details All the molecular dynamic (MD) simulations presented in this work were performed in the isobaric-isothermal NPT ensemble at P = 1 atm, T = 298 K and time step dt=0.03 ps using the simulation package GROMACS. 29 Temperature and pressure were kept constant by coupling the system to a Nosé-Hoover thermostat (coupling constant τ = 8 ps) and an isotropic Parrinello-Rahman barostat (coupling constant τ = 10 ps), respectively.

4.1 Large unconstrained MD simulations We have carried out MD simulations of pyrene and acenaphthene molecules in heptane at different concentrations of solute, above and below the experimental values for their solubility concentration (see Table 1). Pyrene is a good candidate in this respect, since it features an intermediate solubility to make feasible the direct (brute force) estimation of the solubility and it is already large enough flat molecule to represent the series of very large PAH. On the other hand, acenaphthene is a smaller molecule, with a larger solubility and therefore more easily studied through the cluster analysis in CGMD simulations. Its study here helps to set the limits of validity for our proposal for the smallest PAHs. Simulations were performed with a number of solute molecules ranging from Ns = 100 to 500 for pyrene and Ns = 250 to 1200 for acenaphtene, which allows the formation of large clusters. Appropriate numbers of solvent molecules (up to 33657) were used in order to obtain solute concentrations in the range ρs = 4.3 − 21.6 g/l for pyrene and ρs = 18 − 90 g/l for acenaphtene, which spans the experimental and predicted solubility values. For each value of solute concentration, we performed a simulation run for data production of the order of 500 ns, collecting data every 20 ps to improve statistics. Configurations of the solute molecules were post-processed by means of a computer routine to study the cluster distribution. The criterion used to consider two molecules as being connected was r < rb , where r is the molecular centre-of-mass distance and rb was defined above.

13

ACS Paragon Plus Environment

Energy & Fuels

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 14 of 34

Clusters were then defined as groups of molecules where a given molecule is connected at least to another molecule belonging to the cluster.

4.2 Constrained MD simulations: Umbrella Sampling In order to use the model presented in Section 2 for the above molecules, and for other PAH in heptane, we need to provide two external inputs to Equation (10), namely the parameter ∆c and the dimer radial distribution functions g0 (r ). The latter are obtained by means of the umbrella-sampling simulation technique. 30 We set up a simulation box with two solute molecules in the chosen solvent. Then the MD sampling is biased towards the interesting configurations in phase space by introducing a biasing harmonic force between the solute molecules that depends quadratically on the centre-of-mass distance. The strength of the biasing force is set to have a spatial resolution of 0.2 nm, and the zero value of the force is fixed at r = r0 . This effectively restricts the relative molecular distance to a window centred at r0 . From a series of simulations with different r0 , we can obtain a histogram for r and, assuming the windows overlap, a weighted histogram analysis removes the effect of the biasing forces, and the potential of mean force F (r ) and the dimer (zero-concentration limit) radial distribution function g0 (r ) = e− βF(r) can be obtained. Note that F (r ) can also be regarded as a depletion potential resulting from the direct plus solvent-mediated interaction between the two solute molecules. The clear advantage of this method is that it requires only small volumes of solvent, just enough to sample the dimer distance over a few times rb . In a normal (unconstrained) MD simulation, of two solute molecules in a small box with a poor solvent, would be forming a dimer most of the time. A fair estimation of the peak in g0 (r ), with respect to its long-distance limit g0 (r ) → 1, could only be achieved with much larger simulation boxes (i.e. with a larger number of solvent molecules) and over very long simulation times, in order to obtain good statistics of the rare un-bonding and re-bonding events. By using the above described CG force field umbrella-sampling MD simulations, two 14

ACS Paragon Plus Environment

Page 15 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

solute molecules were splitted into 15 umbrella windows, and 250 ns simulations were run in each spatial window.

5 Results 5.1 Unconstrained simulations: Cluster analysis for pyrene and acenaphthene in heptane We performed large unconstrained MD simulations (see Section 4.1) of pyrene and acenaphthene in heptane at different concentrations of solute above and below the experimental value for the critical concentration (see Table 1). The simulated trajectories of the solute molecules were post-processed by means of a computational routine to obtained their corresponding cluster distribution function. The criterion used to consider two molecules as being connected was r < rb , where r is the molecular centre-of-mass distance and rb is the cut-off radius defined in Section 5.2. Clusters were then defined as groups of molecules where a given molecule is connected at least to another molecule belonging to the cluster.

15

ACS Paragon Plus Environment

Energy & Fuels

(b)

100 *

* 10-1

*

*

*

-2

10

10-3 10-4 0

5

**

**

**

* **

**

4.3 g/l 8.6 10.8 13.0 16.0 17.3 21.6

*****

10 15 cluster size n

cluster distribution h(n)/h(1)

(a)

cluster distribution h(n)/h(1)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

**

20

** 25

Page 16 of 34

100 *

* * * -2 * 10 * ** ** -3 *** 10 ** ****** *** ** * -4 * * 10 *** ** ** -5 ** 10 10-1 **

10-6 0

25

*

10.8 g/l 17.3 21.6

***** * ************ ****** * *** * * ****** * * * * * ** * **

50 75 100 125 150 cluster size n

Figure 2: Cluster size distributions h(n) for pyrene in heptane, at several concentrations. Data for each concentration are represented by different symbols, as indicated in the key box. (a) Continuous curves correspond to exponential fits h(n) ∝ e−αn for the three lower concentrations. Dash-dotted curve is an inverse-power fit h(n) ∝ ( a + n)−4 , which reproduces reasonably well the decay at the three highest concentrations. For ρs = 13.0 g/l, the dashed curve is just a guide to the eye. (b) Detail of the distribution tails for three different concentrations. For ρs = 21.6 g/l, large (n & 100) clusters are formed (i.e. ‘precipitate’, taking about one third of the total number of pyrene molecules), so that the fits to h(n) are done only for n < 30, and (with the appropriate normalization) are very similar for any concentration ρs & 16 g/l. Cluster distributions for pyrene are shown in Fig. 2. For ρs ≤ 10.8 g/l the distributions of clusters with 1 ≤ n . 8 are fitted very accurately by h(n) = h(1)λn−1 , with a parameter λ that decreases when the concentration increases. The exponential decay of h(n)/h(1) = ρn /ρ1 is consistent with the columnar stacking regime of cluster growth. The umbrella sampling result for vo Zb /Γ (see next section and Table 1) may be used in Equation (4) to predict the value of λ for a given total concentration of solute, and the results are in good agreement with those of the best exponential fit for the lowest concentrations. This gives support to Equation (9) and to our choice of rb to get the bond partition function Zb from the umbrella sampling result for g0 (r ). In contrast, for ρs & 13 g/l we already observe a deviation from the low-concentration (1D) regime. The exponential decay of the cluster size distribution changes to a power law h(n) ∝ ( a + n)−ν with ν ≈ 4 − 5; this reflects the cooperativity in the formation of the larger aggregates, which produces much larger size fluctuations. As shown in the inset of Fig. 2, for ρs ≥ 21.6 g/l some very large clusters 16

ACS Paragon Plus Environment

Page 17 of 34

(b)

0

10

cluster distribution h(n)/h(1)

(a) cluster distribution h(n)/h(1)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

-1

10

-2

10

-3

10

-4

10

0

10

18 g/l 36 45 60 67.5 90

-1

10

-2

10

-3

10

-4

10

-5

10

-6

10

-7

0

2

4 6 cluster size n

8

10

10

0

10

20 30 cluster size n

40

Figure 3: Cluster size distributions h(n)/h(1) for acenaphtene in heptane, at several concentrations. The distributions are normalized by the value for monomers. Data for each concentration are represented by different symbols, as indicated in the key box. In panel (a) the dashed line represents exponential behaviour. In panel (b) details of the distribution tails are given for the same concentrations. (n > 100) are formed, taking up to a third of the solute molecules (in the presence of gravity, this material would precipitate out of the solution). From these results, we estimate that the solubility of pyrene in heptane, with our MARTINI force-field representations, is about ρsol = 15 ± 2 g/l, with the central value fairly close to the experimental data –see Table 1. The cluster size distribution for acenaphthene in Fig. 3 shows the same qualitative features as for pyrene, although the exponential decay, characteristic of 1D aggregation, is lost more rapidly for concentrations ρs & 18 g/l. The signature of strong cooperativity is fully developed for 45g/l ≤ ρs ≤ 68 g/l, and the formation of large (precipitate) clusters is clear for ρs ≥ 90 g/l. From these results, we estimate the solubility of our CG-MARTINI model around ρsol = 60 ± 8 g/l, bracketing the experimental data in Table 1. Note that, within this CG scheme, the PAH-solvent and PAH-PAH molecular interactions are directly obtained from the MARTINI interaction sites derived for the benzene-heptane mixture. The molecular characteristics of pyrene or acenaphthene enter only through the number of carbon atoms in aromatic rings and their translation into the corresponding number of MARTINI interaction sites. The procedure may be directly applied to any

17

ACS Paragon Plus Environment

Energy & Fuels

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 18 of 34

other PAH molecules to obtain an approximate representation of their interactions with no computational cost. The reasonable agreement with the experimental solubility for the two molecules studied here adds to the general support for the MARTINI CG method, and it allows us to explore the accuracy of our proposal for the estimation of the solubility in comparison with experimental data for a larger set of PAH molecules. Nevertheless, a warning is due with respect to the general validity of the CG interactions to describe the solubility of the PAHs. As shown Table 1 and Figure 1, phenanthrene and anthracene are isomers, with three aromatic rings and molecular mass 178 amu. The experimental solubility of phenanthrene (ρsol = 46.8 g/l) fits well between the values of acenaphthene and pyrene, which bracket its molecular mass and its CG description within the MARTINI rules. However, anthracene has a much lower experimental solubility (ρsol = 2.0g/l), despite the fact that its CG description is very similar to that of phenanthrene (both with 7 interaction sites), with the result that the observed simulation properties of their heptane mixtures are very similar. Therefore, it is clear that the molecular differences between phenanthrene and anthracene, which produce a large effect in their solubility in heptane, are lost in the translation to the GC-MARTINI representation. We discuss below the possible origin for the very low solubility of anthracene in heptane, but in any case anthracene has to be ruled out from the main trend of the PAH series, which we address in our analysis. The identification of clusters in the pyrene-heptane mixture allows a quantitative check for the hypothesis of columnar stacking. With this aim, we have calculated, for each cluster of size n, both the gyration and nematic-order tensors, G and S respectively, whose elements are defined as

Gαβ (n) =

1 n ri,α ri,β , n i∑ =1

Sαβ (n) =

 1 n ˆ ˆ u − δ 3 u , i,β αβ i,α 2n i∑ =1

(13)

where uˆ i,α are the Cartesian components (α, β = x, y, z) of a unit vector normal to the

18

ACS Paragon Plus Environment

Page 19 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

average molecular plane of the ith molecule, and ri,α the position vector of this molecule with respect to the center of mass of a cluster of size n. For each cluster these two tensors are diagonalized and keep the separate averages over all clusters of size n for the ordered eigenvalues, S3 ≥ S2 ≥ S1 for the nematic order tensor and G3 ≥ G2 ≥ G1 for the gyration tensor. The nematic order parameter is given by the mean value of the largest eigenvalue, s(n) = hS3 in , so that perfectly parallel configurations correspond to s(n) = 1, while with a random orientation of n molecules the mean nematic order would decay with the cluster size as s(n) = n−1/2 . The three gyration moments are combined to represent the cluster shape by the three parameters R g = (h G1 in + h G2 in + h G3 in )1/2 b = h G3 in −

radius of gyration,

1 (h G1 in + h G2 in ) 2

asphericity,

c = h G2 in − h G1 in

acylindricity.

(14)

Another useful parameter that quantifies the spherical or cylindrical shape of a cluster of size n is the relative shape anisotropy κ 2 , given by κ2 =

b2 + (3/4)c2 . R4g

(15)

Now perfect stacking, where molecular planes arrange in a columnar configuration, would correspond to κ 2 (n) = 1, while a random orientation of the n molecules would give lower κ 2 (n), approaching the perfect spherical shape with κ 2 (n) = 0. Fig. 4 presents average values of s(n) (a) and κ 2 (n) (b) for clusters of different sizes, calculated at different total concentrations of pyrene in heptane. These results confirm the formation of small columnar clusters, which are otherwise clearly observed in snapshots from the MD simulations. Regardless of the total concentration of pyrene in heptane, the

19

ACS Paragon Plus Environment

(a) 4.3 g/l 8.6 10.8 13.0 16.0 17.3 21.6

0.4

1

cluster size n

10

2

0

10

relative shape anisotropy κ (n)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

nematic order parameter S(n)

Energy & Fuels

10

Page 20 of 34

0

(b)

0.4 1

cluster size n

10

Figure 4: (a) Average nematic order parameter s(n), and (b) relative shape anisotropy κ 2 (n) as a function of cluster size n for pyrene in heptane at different solute concentrations, indicated in key box of panel (a). The dashed line in (a)corresponds to random orientation of the molecules, for which s(n) = n−1/2 . relative shape anisotropy and the nematic order parameter both decrease very slowly with cluster size up to n ≈ 3 − 4, showing that these small clusters are close to the perfect columnar staking. For larger clusters, n & 5 − 6, the nematic order parameter decays as s(n) ∝ n−1/2 , but with a displacement with respect to the straight broken line in Figure 4(a) that indicates that these cluster are typically formed by smaller units, with nematic order within each one, but with random mixture of orientations between them. For those larger cluster the relative shape anisotropy decreases but in a more noisy way, indicating a complex ensemble of cluster shapes. These results give full support to the approach described above to estimate the solubility of PAH: small clusters are mainly formed by molecular stacking, and their concentration may be obtained directly from the radial distribution function g0 (r ) in the dilute limit; however, when the mean cluster size becomes too large, the 1D regime is replaced by full 3D condensation of clusters, which produces phase separation of large aggregates. On the other hand, the nematic order parameter of the largest (n ≫ 100) clusters (not shown in Fig. 4) is fairly large, s ≃ 0.6, clearly above the typical values observed for medium-sized clusters. This result is in agreement with the visual aspect of these aggregates, which appear elongated in

20

ACS Paragon Plus Environment

Page 21 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

our MD simulations, and typically formed by several long (and approximately parallel) columns of stacked molecules. These face-to-face stacks are typical in pyrene crystals. 31 Similar analysis of the nematic order parameter for clusters of acenaptene in heptane shows that molecular stacking is much weaker for these smaller PAH molecules, although the stacking tendency is enhanced in MD simulations at lower temperature. Together with the smoother transition from exponential to inverse-power decay in the cluster size distribution, we may expect that the hypothesis of planar stacking becomes less founded for lower molecular weights.

5.2 Constrained simulations: Prediction of PAHs solubility from their molecular stacking The direct evaluation of the solubility of large PAHs by cluster analysis increases exponentially the computational cost, which makes it unfeasible for molecular weight above ∼ 300 amu. In contrast, the method based on Equation 10 requires only the solute-solute radial distribution function g0 (r ) in the low concentration limit. The rapid decay of the solubility with the molecular weight comes directly from the integral (up to rb ) of g0 (r ) = e− βF(r) , in terms of the potential of mean force F (r ), computed from the constrained UmbrellaSampling simulations described in Section 3. These MD simulations are done just with two PAH molecules and a small volume of solvent, with a computational cost with increases only with a low power (∼ 3/2) of the molecular weight. The cut-off radius rb in Equation 10 is defined as the location of the first minimum of the radial distribution function, which loosely defines the limit of bound configurations for the two solute molecules. In cases where the minimum is not visible, rb should correspond to a distance beyond which the potential of mean force becomes zero, or the radial distribution function has already set to unity (long-distance limit). In practice we obtain values of rb ≃ 1 nm. An example of a potential of mean force obtained is shown in Figure 5, which pertains to the case of pyrene in heptane at 298 K and 1 atm. In Figure 5(a) the histogram 21

ACS Paragon Plus Environment

Energy & Fuels

(a) 1000

counts

800 600 400 200 0

(b)

0

0.5

1

1.5 r (nm)

2

1 r (nm)

1.5

2.5

5 0

PMF[kJ/mol]

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 22 of 34

-5 -10 -15 -20

0

0.5

2

Figure 5: Results for the umbrella-sampling analysis of two pyrene molecules in an heptane solvent. (a) Histogram of relative centre-of-mass distances between the two solute molecules. (b) Potential of mean force resulting from processing the histograms using the weighted-histogram technique. The number of solvent molecules used in the simulation was 3000 in a box of dimensions 9 × 9 × 9 nm3 . The dashed curve is an extrapolation to short distances.

22

ACS Paragon Plus Environment

Page 23 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

of relative molecular distances is shown for the different windows applied. In panel (b) the potential of mean force obtained after processing the histograms is shown. A clear potential well located at ≃ 0.47 nm can be seen, with essentially no free-energy barrier separating bound from unbound configurations. Table 1: Solubilities of some members of the PAH series in heptane. Experimental values obtained from, 1 with estimated relative error of less than 3%. Theoretical values obtained for the range 0.30 ≤ λc ≤ 0.45 within the method proposed in the present work, Equation (10), with values of Zb vo /Γ given by the umbrella-sampling simulations, and from Equation (12) with the second virial coefficient B2 obtained from the same umbrella-sampling simulations.

Exp.

Coronene Pyrene Phenanthrene Anthracene Acenaphthene Naphthalene

0.1 15.1 46.8 2.0 64.3 77.0

Solubility in heptane (g/l) MARTINI Eq.(10) Eq. (12) cluster λc = 0.45 λc = 0.45 analysis (0.30) (0.30) 0.1 (0.05) 0.1 (0.05) 15 ± 2 15 (5) 16 (5.3) 107 (44) 302 (124) 124 (51) 490 (201) 60 ± 8 149 (61) 635 (260) 186 (76) —

Zb vo /Γ (nm−3 )

−2B2 (nm−3 )

19450 ± 10 66.4 ± 0.5 8.2 ± 0.5 7.1 ± 0.5 5.1 ± 0.5 3±1

19430 ± 10 62.5 ± 0.5 2.9 ± 0.2 1.8 ± 0.2 1.2 ± 0.1 −2.0 ± 0.5

Alternatively, following the second virial coefficient route of Section 2, the same g0 (r ) may be used to obtain the integral in the denominator of Equation (12), using any reasonable upper limit, i.e. g0 (r ) − 1 for r ≤ rcut , in the farther side of our umbrella sampling. In this case we may also separate the integral of g0 (r ) − 1 into a positive contribution, which represents the attraction between the solute molecules and pushes B2 to negative values, and a contribution from low distances, g0 − 1 ≃ −1, which is negative and represents the excluded-volume effects (which have been neglected in all of the above theoretical discussions). Keeping track of these two contributions is helpful in order to understand some of the results presented below. The experimental results for the PAH solubilities obtained from Equation (10) are listed in Table 1, with substances arranged from higher to lower molecular mass (or number of coarse-grained units). The table also presents the values of the integrals in the 23

ACS Paragon Plus Environment

Energy & Fuels

molecular mass (amu) 1000

128 154 178 202

300

10

λc=0.30 & 0.45

1 0.1

acenaphtene

100

experiment 2ndvirial coeff. (λc=0.375)

naphtalene

solubility (g/l)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 24 of 34

phenanthrene anthracene pyrene coronene

0.01 4

λ= c 0 .45

λ= c 0 .30

6

8 10 12 number of CG units

14

Figure 6: Solubilities of the members of the PAH series analysed as a function of molecular mass and number of coarse-grained units in the MARTINI force field. Open circles: experimental results. Dashed curves: theoretical band as predicted from Eqn. (10) with the parameter range 0.30 ≤ λc ≤ 0.45, including error bars at both limits coming from uncertainty in the estimation of the bond partition function (see Table 1). Continuous curve: estimation from the virial route, Equation (12), with the central choice for the parameter λc = 0.375. Chemical structure and names for the different substances are depicted below the corresponding data.

24

ACS Paragon Plus Environment

Page 25 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

denominators of Equations (10) and (12). To get an estimation of the solubility from these equations we still have to fix the empirical parameter λc , which describes the threshold between the 1D (columnar) and the full 3D condensation of the aggregates. The hope is that this parameter should change little along the PAH series, while Table 1 shows that the factor Zb obtained from the umbrella sampling MD simulations changes by several orders of magnitude. Indeed, the experimental solubility and our MD result for Zb vo /Γ could be used to get λc ≈ 0.45 as the appropriate value for pyrene in heptane, and if we use that value with the MD result for Zb vo /Γ for coronene we get an excellent prediction for the much lower solubility (0.1 g/l) of these larger PAH molecule. To give a more intuitive idea of p that value, it corresponds to a cluster distribution hni ± h∆n2 i = 1.8 ± 1.2, and to have 42% of the solute molecules in clusters of size n ≤ 3. An interpretation for this threshold p may be attempted in geometrical terms, as a columnar cluster of nc = hni + h∆n2 i ≈ 3

pyrene (or coronene) molecules looks already like a rather globular object, so that the

further aggregation of these clusters should follow 3D-like cooperative condensation, rather than 1D stacking. This is in good agreement with the analysis of the nematic order √ parameter in Figure 4(a) for 20 & n & 6, which is very well approximated by S = 3/n that corresponds to n/3 objects with randomly distributed orientations. The extrapolation to much larger PAH would be probably more accurate with some weak increase of the parameter λc with the molecular size, but that effect should be much less important than the exponential decay of the solubility from the Zb vo /Γ factor. In the opposite direction, going to smaller PAH molecules, the validity of the 1D analysis is expected to be more restricted, as already suggested by the cluster analysis for acenaphtene in heptane presented above. The direct extrapolation of the parameter p λc = 0.45 (or nc = hni + h∆n2 i ≈ 3) from pyrene gives a clear overestimation of

the experimental solubility of phenanthrene, acenaphthene and naphthalene (setting

aside the atypical case of anthracene). Very good estimates for the solubility of these

25

ACS Paragon Plus Environment

Energy & Fuels

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 26 of 34

three small PAH may be obtained with the smaller threshold λc = 0.3, equivalent to p nc = hni + h∆n2 i ≈ 2.2, which could be interpreted in geometrical terms as if the 1D

stacking of these molecules could not go much further than the dimers to form globular objects, at least within the MARTINI-CG representation used in our umbrella sampling MD simulations to obtain Zb vo /Γ. Figure 6 presents the experimental solubility data together with the theoretical estimates using a value for λc in the range 0.30 ≤ λc ≤ 0.45. This gives a fair idea of the predictive power of the approach proposed here. The low solubility of large PAH molecules is seen to decrease exponentially with molecular mass, with similar slopes in the experimental and theoretical results. For the smaller molecules the slope of the decay is significantly reduced, and again this feature is perfectly captured by our theoretical predictions within the narrow band of λc values. The uncertainty associated to our theoretical predictions for the solubility may be estimated in two steps: the error bars in our evaluation of the bond partition function for each PAH, and those associated to the extrapolation of the optimal parameter λc from the reference molecules (pyrene and acenaphthene) to the full PAH series. The values of Zb vo /Γ in Table 1 are given with error bars that take into account the changes associated to a ±10% in the choice of the cut-off radius rb for the bound, as well as possible errors in the umbrella sampling estimation of g(r ). The relative errors are lower for the larger molecules, because in those cases g(r ) has a very high and narrow peak. But for smaller PAH molecules the difference between ’bonded’ and ’unbonded’ pairs becomes less clear, and we have to accept larger error bars. The uncertainty in Zb vo /Γ produces an error bar in the optimal value of the parameter λc =0.45± 0.02 for pyrene, and λc =0.30± 0.04 for acenaphthene, to reproduce the experimental results. The predictive power of our approach comes from the extrapolation of the parameter λc from these reference molecules to the other PAHs, and in that sense the main contribution to the associated error bars comes from the difference between the optimal values for pyrene and acenaphthene, which

26

ACS Paragon Plus Environment

Page 27 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

is represented by the band between broken lines in Figure 6 and by the two sets of values in the central columns of Table 1. Note that in the application of our method to larger PAH molecules, the accuracy of our prediction would be certainly better than that implied by the above gross estimate given by the λc =0.3−0.45 band. In fact, as shown by the results for coronene, the better accuracy in our evaluation of Zb vo /Γ and the direct use of the optimal value of λc =0.45± 0.02 for pyrene could give solubility predictions with an error well within 10%. With the expected exponential decrease of the solubility for larger PAH molecules, which would require extremely large MD simulations to estimate the solubility by direct cluster analysis, the practical advantages of our proposal become very clear.

6 Discussion To put these results into context, we should say from the outset that theoretical solubility estimates, as well as experimental measurements of solubility, are very prone to error due to their extreme exponential sensitivity to free-energy differences of a solute molecule in the dilute and solid phases. In particular the lower experimental values of the solubility should be regarded as ‘order of magnitude’ estimations. Therefore, our degree of uncertainty in determining the solubility is not uncommon: different computational or phenomenological methodologies applied to different kind of systems, from salts to organic solids, can only predict critical solubility values within an error in the range 10 − 50%. 3–6 With this in mind, we can say that our theoretical method provides very good accuracy considering the computationally cheap character of the method. Using a single empirical parameter for the whole set of PAH molecules, and from the results of an umbrella-sampling molecular dynamics simulation with just two solute molecules in the solvent, we obtain a rather accurate prediction for a property that changes over three orders of magnitude. However, while the result for phenanthrene fits nicely in our general scheme, anthracene, with the same molecular mass (178 amu), presents a gross discrepancy. Our

27

ACS Paragon Plus Environment

Energy & Fuels

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 28 of 34

predicted values for the solubility of these two isomers are very similar. This is not surprising, considering that the number of interacting units of the molecular bodies is identical and energies associated to configurations where two molecules are face-to-face should be very similar, at least in the framework of the MARTINI force field. However, the experimental solubilities for these two isomers differ in more than a factor twenty, reflecting the stark difference in molecular arrangement and energies of their dimers in the real system. Clearly the MARTINI force field is a gross representation that cannot cope with subtle differences in interaction and molecular shape. These molecular factors give rise to different symmetries in the molecular crystalline structures of the two isomers observed in the experiments, 32 and to significantly different melting temperatures (215◦ C 33 for anthracene, as opposed to 101◦ C 33 for phenanthrene). These differences are rooted in the electronic structure of the molecules. 34 The latter feature is particularly illuminating in interpreting the large difference in experimental solubility of phenanthrene and anthracene, which reflects larger binding energies for anthracene in perfect crystalline molecular order. 35 In contrast, both isomers have very similar boiling points (332◦ C for phenanthrene versus 339◦ C for anthracene 36 ) indicating that, when sampled over the more disordered molecular configurations of the liquid phase, their typical molecular interactions are similar, as they appear in our MARTINI coarse-grained model representation. Therefore, the failure of our theoretical prediction for the solubility of anthracene stems most probably from the simplification associated to the MARTINI force-field, which cannot account for the subtle symmetry differences of the anthracene and phenanthrene crystals. Our interaction model gives similar free-energy profiles for both molecules, because the subtleties associated with geometry-dependent binding energies, aggregation geometry, etc., are lost. Letting aside that peculiar case of anthracene, the different sources of inaccuracies and uncertainties implicit in our model, namely, inadequacy of the columnar-stacking hypothesis, unrealistic value for λc , inaccurate estimation of the potential of mean force F (r ), etc., seem to be rather mild. For comparison, the use of the same g0 (r ) through the

28

ACS Paragon Plus Environment

Page 29 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

second virial coefficient route gives very similar theoretical results for the less soluble molecules, but it deviates upwards for the smaller molecules. In the case of naphtalene we cannot even get a prediction from Equation 12, since B2 becomes positive. This reflects the fact that our proposal based on Equation (10) is much more robust than any estimation based on the second virial coefficient, such as Equation (12)

7 Conclusions We have presented a novel, fast and simple method to approximately estimate the solubility of planar molecules in a solvent. The method has been tested for the first members of the PAH series in heptane, and agreement with experimental values is found within the typical uncertainties for solubilities from present theoretical approaches. The model assumes that the cluster distribution in the system, immediately before solid precipitation, consists of aggregates formed by molecules stacked in columns. Under this assumption, which should be reasonable for approximately planar molecules, the solubility can be obtained in terms of two single ingredients: the potential of mean force in the limit of infinite dilution (which can be obtained from simulations of two single solute molecules) and the threshold for the change between 1D (columnar) aggregation to 3D cooperative growth of the largest clusters. This threshold may be equivalently described in terms of the parameter λc ≥ λ ≡ N2 /N1 between dimers and isolated monomers, or in terms of the mean square √ fluctuation in the cluster size ∆n2 ≥ hn2 i − hni2 = λc /(1 − λc )2 = (1 + λc )/(1 − λc ), when the molecular aggregates start to condensate in full 3D fashion, rather than in the limited 1D stacking. A perhaps more intuitive description may be given in terms of a √ typical ’mean plus standard deviation’ cluster size, nc ≡ hni + ∆n2 which (within the 1D assumptions) corresponds to having about 40% of the molecules in clusters with size n ≥ nc . For pyrene in heptane the empirical value λc = 0.45 corresponds to nc ≈ 3, as the threshold for the solubility limit. For smaller molecules (acenaphthene), the threshold 29

ACS Paragon Plus Environment

Energy & Fuels

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 30 of 34

shifts to a lower cluster size nc ≈ 2.2, i.e. λc = 0.3, compatible with the intuition that a cluster of size nc should have the aspect of a globular object, rather than a planar one. Given the value of λc (or nc ) for a solute-solvent system, the solubility may be obtained just from the potential of mean force in the limit of infinite dilution. The crucial point is that a relatively narrow interval 0.30 ≤ λc ≤ 0.45 covers the complete series of PAHs explored here, with the only exception of anthracene. The huge variations of the solubility along the PAH series may be predicted from simple umbrella-sampling simulations using only two solute molecules and a small simulation box with ∼ 103 solvent (heptane) molecules. For comparison, we have presented also a direct estimation of the solubility of pyrene, using up to 500 solute molecules, and 33657 solvent molecules. The same estimation for coronene (with a solubility 150 times smaller than that of pyrene) would require to increase the number of solvent molecules by 150, and to use very long simulations, so that the coronene clusters could reach equilibrium in such a huge simulation box. In contrast, a single umbrella sampling fo two coronene molecules kept at short distance, and borrowing the same λc as for pyrene, provides excellent agreement with the reported value for the experimental solubility. The advantage of the method should be even greater for still larger molecules, providing a practical strategy to estimate the ultra-low solubilities of the extended PAHs (nanographenes) members of the series and even of large supramolecular assemblies such as paraffin platelet nanocrystals, laponite. On the other hand, all the results presented here have been simulated within the MARTINI coarse-grained force-field scheme. The good comparison with experimental values obtained from the large ‘brute-force’ simulations of pyrene and acenaphthene in heptane, with no free parameters, and a narrow range of λc values that is able to cover the PAH series analysed, indicate that this coarse-grained representation of the molecular interactions is sufficiently accurate to calculate PAH solubilities in heptane. However, the limitations of the MARTINI representation show up clearly in the case of anthracene and phenantrene, two isomers that present very different values of solubility. Our method is unable to

30

ACS Paragon Plus Environment

Page 31 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

distinguish between the two, since their coarse-grained potentials of mean force are very similar and, consistent with the spirit of the approach, the same value for λc is used, giving rise to very similar predictions for the solubility of these two PAH molecules. We believe that the problem here comes from the coarse-grained representation of the molecular interactions, rather than from a gross failure of the assumption of transferability for the parameter λc . Our method could equally be applied to a fully-atomistic representation of the molecular interactions so as to reproduce the difference between the structure and stability of the crystal phases of these two isomers. However, it is unclear that atomistic models are more able to correctly reproduce solubility properties. 37 To finish we note that, letting aside the peculiar case of anthracene, the globally linear decay of the solubility with the molecular weight (clearly visible in Figure 6 and also observed in the experimental data) may be more accurately described in terms of two different ranges, with a milder decay for the smaller molecules than for the larger ones. Our analysis shows that, within a relatively narrow band of our empirical parameter λc , these two different decays can be predicted just from the bond partition functions, Zb , i.e. from the potential of mean force between two solute molecules in the infinitely dilute limit. However, we may go even a bit further and speculate that small molecules, exhibiting a slower decay of their ρs , can be represented by an even narrower band λc ≈ 0.3, while larger members of the PAH family (pyrene and coronene) seem to share a larger value λc ≈ 0.45, i.e. a larger size and mean-square fluctuation of their columnar clusters before they reach the threshold for 3D cluster growth; this could be expected from purely geometrical considerations.

Acknowledgement We acknowledge financial support from MINECO (Spain) under grant FIS2010-22047-C0501 , from REPSOL and the “María de Maeztu” Programme for Units of Excellence in R&D (MDM-2014-0377). 31

ACS Paragon Plus Environment

Energy & Fuels

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 32 of 34

References (1) IUPAC-NIST Solubility Database. http://srdata.nist.gov/solubility/ IUPAC/iupac.aspx. (2) Bradley, J.-C.; Neylon, C.; Williams, A.; Guha, R.; Hooker, B.; Lang, A. S.; Freisen, B.; Bohinski, T.; Bulger, D.; Federici, M. Open Notebook Science Challenge: Solubilities of Organic Compounds in Organic Solvents (2ND); 2009. (3) Jouyban, A. In Handbook of Solubility Data for Pharmaceuticals; Press, C., Ed.; 2009. (4) Ekberg, C.; Emrén, A. T. Monatshefte für Chemie / Chemical Monthly 2001, 132, 1171– 1179. (5) Fetzer, J. C. Polycyclic Aromatic Compounds 2007, 27, 143–162. (6) Achten, C.; Andersson, J. T. Journal Polycyclic Aromatic Compounds 2015, 35, 177. (7) Hansen, C. M. Hansen solubility parameters: a user’s handbook; CRC press, 2007. (8) Acree-Jr., W. E.; Abraham, M. H. Fluid Phase Equilibria 2002, 30, 245–258. (9) Jouyban, A.; Shayanfar, A.; Acree-Jr., W. E. Fluid Phase Equilibria 2010, 293, 47–58. (10) Jorgensen, W. L.; Duffy, E. M. Advanced Drug Delivery Reviews 2002, 54, 355 – 366, Computational Methods for the Prediction of {ADME} and Toxicity. (11) Aragones, J. L.; Sanz, E.; Vega, C. The Journal of Chemical Physics 2012, 136. (12) Ferguson, A. L.; Debenedetti, P. G.; Panagiotopoulos, A. Z. J. Phys. Chem. B 2009, 113, 6405–6414. (13) Widom, B. The Journal of Chemical Physics 1963, 39, 2808–2812. (14) Kirkwood, J. G. The Journal of Chemical Physics 1935, 3, 300–313.

32

ACS Paragon Plus Environment

Page 33 of 34

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy & Fuels

(15) Frenkel, D.; Ladd, A. J. C. The Journal of Chemical Physics 1984, 81, 3188–3193. (16) Kolafa., J.; Nezbeda, I. Fluid Phase Equilibria 1994, 100, 1–34. (17) Paluch, A. S.; Dan D. Cryan, I.; Maginn, E. J. Journal of Chemical & Engineering Data 2011, 56, 1587–1595. (18) Iwai, Y.; Uchida, H.; Arai, Y.; Mori, Y. Fluid Phase Equilibria 1998, 144, 233–244. (19) Edwards, N. T. Journal of Environmental Quality 1983, 12, 427–441. (20) Luch, A. In The Carcinogenic Effects of Polycyclic Aromatic Hydrocarbons, 1st ed.; Luch, A., Ed.; Imperial College Press, 2005. (21) Aray, Y.; Hernández-Bravo, R.; Parra, J. G.; Rodríguez, J.; Coll, D. S. The Journal of Physical Chemistry A 2011, 115, 11495–11507. (22) Barré, L.; Simon, S.; Palermo, T. Langmuir 2008, 24, 3709–3717. (23) Goual, L.; Firoozabadi, A. AIChE Journal 2002, 48, 2646–2663. (24) Narita, A.; Wang, X.-Y.; Feng, X.; Mullen, K. Chem. Soc. Rev. 2015, 44, 6616–6643. (25) Jian, C.; Tang, T. The Journal of Physical Chemistry B 2014, 118, 12772–12780. (26) Public Health Statement, Polycyclic Aromatic Hydrocarbons (PAHs); Agency for Toxic Substances and Disease. Registry (ATSDR): U.S. Public Health Service, U.S. Department of Health and Human Services, Atlanta, 1990. (27) Emergency Planning and Community Right-to- Know Act - Section 313: Guidance for Reporting Toxic Chemicals: Polycyclic Aromatic Compounds Category; Office of Environmental Information: United States Environmental Protection Agency, Washington, DC, 2001.

33

ACS Paragon Plus Environment

Energy & Fuels

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 34 of 34

(28) Marrink, S. J.; Risselada, H. J.; Yefimov, S.; Tieleman, D. P.; de Vries, A. H. The Journal of Physical Chemistry B 2007, 111, 7812–7824. (29) Van Der Spoel, D.; Lindahl, E.; Hess, B.; Groenhof, G.; Mark, A. E.; Berendsen, H. J. C. Journal of Computational Chemistry 2005, 26, 1701–1718. (30) Torie, G. M.; Valleus, J. P. J. Comp. Phys. 1997, 23, 187–199. (31) Lee, S.; Chen, B.; Fredrickson, D. C.; DiSalvo, F. J.; Lobkovsky, E.; Adams, J. A. Chemistry of materials 2003, 15, 1420–1433. (32) Mason, R. Molecular Physics 1961, 4, 413–415. (33) Allen, J. O.; Sarofim, A. F.; Smith, K. A. Polycyclic Aromatic Compounds 1999, 13, 261– 283. (34) Hancock, R. D.; Nikolayenko, I. V. The Journal of Physical Chemistry A 2012, 116, 8572– 8583. (35) Pinal, R. Org. Biomol. Chem. 2004, 2, 2692–2699. (36) Haynes, W. M. CRC handbook of chemistry and physics; CRC press, 2014. (37) Kubicki, J. D. Environ. Sci. Technol. 2006, 40, 2298–2303.

34

ACS Paragon Plus Environment