This is an open access article published under an ACS AuthorChoice License, which permits copying and redistribution of the article or any adaptations for non-commercial purposes.
Article Cite This: J. Am. Chem. Soc. 2018, 140, 12527−12537
pubs.acs.org/JACS
Mixed Explicit−Implicit Solvation Approach for Modeling of Alkane Complexation in Water-Soluble Self-Assembled Capsules Henrik Daver,† Andreś G. Algarra,‡ Julius Rebek, Jr.,§,∥ Jeremy N. Harvey,*,⊥ and Fahmi Himo*,† †
Department of Organic Chemistry, Arrhenius Laboratory, Stockholm University, SE-106 91 Stockholm, Sweden Departamento de Ciencia de los Materiales e Ingeniería Metalúrgica y Química Inorgánica, Facultad de Ciencias, Instituto de Biomoléculas, Universidad de Cádiz, Puerto Real, Cádiz 11510, Spain § The Skaggs Institute for Chemical Biology and Department of Chemistry, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States ∥ Center for Supramolecular Chemistry and Catalysis, Shanghai University, Shanghai 200444, China ⊥ Department of Chemistry, KU Leuven, Celestijnenlaan 200F, B-3001 Heverlee, Belgium
Downloaded via 95.85.69.143 on October 30, 2018 at 15:18:26 (UTC). See https://pubs.acs.org/sharingguidelines for options on how to legitimately share published articles.
‡
S Supporting Information *
ABSTRACT: The host−guest binding properties of a water-soluble resorcinarene-based cavitand are examined using density functional theory methodology. Experimentally, the cavitand has been observed to self-assemble in aqueous solution into both 1:1 and 2:1 host/ guest complexes with hydrophobic guests such as n-alkanes. For n-decane, equilibrium was observed between the 1:1 and 2:1 complexes, while 1:1 complexes are formed with shorter n-alkanes and 2:1 complexes are formed with longer ones. These findings are used to assess the standard quantum chemical methodology. It is first shown that a rather advanced computational protocol (B3LYP-D3(BJ)/6-311+G(2d,2p) with COSMO-RS and quasi-rigid-rotor-harmonic-oscillator) gives very large errors. Systematic examination of the various elements of the methodology shows that the error stems from the implicit solvation model. A mixed explicit−implicit solvation protocol is developed that involves a parametrization of the hydration free energy of water such that water cluster formation in water is predicted to be thermoneutral. This new approach is demonstrated to lead to a major improvement in the calculated binding free energies of n-alkanes, reproducing very well the 1:1 versus 2:1 host/guest binding trends.
1. INTRODUCTION Reversible encapsulation is a process that temporarily confines guest molecules in container host structures. The guest is isolated from other solutes and the solvent, as the container walls act as more-or-less rigid solvation shells. The containers are usually assemblies that emerge through hydrogen bonding in organic media,1−9 while metal/ligand interactions,10,11 ionic forces,12 or even purely hydrophobic effects can stabilize capsules in water.13 A number of contortions accompany the confinement of flexible guest molecules in the hosts including coiling, bending, folding, and other distortions that effect reactivity.14,15 In the framework of homogeneous chemistry, the containers allow the study of species or events that cannot usually be observed in solution. Potential applications include selective binding, recognition of unstable intermediates, and encapsulation for catalysis purposes, to mention just a few.16 The hosts can also be considered as mimics of biomolecules, which usually have binding pockets that selectively bind guests with high affinities.17 Several self-assembling complexes have been synthesized and constructed from noncovalent linkages, such as hydrogen bonds, allowing for flexible binding and release of molecules to the solution.16,18 Of particular interest are molecular containers that can selfassemble in aqueous medium. Water is ubiquitous in biology; © 2018 American Chemical Society
thus, to properly mimic biological events, the hosts must be active in this medium. The aqueous environment implies competition for all polar interactions, including hydrogen bonds, making self-assembly more difficult. On the other hand, the hydrophobic effect might favor the process of self-assembly.17 A number of hosts have been designed for self-assembly in water, and supramolecular hosts can make guest binding and reactivity possible that are usually not viable in aqueous medium.17−21 One specific example of a water-soluble self-assembling molecule is cavitand 1 (Figure 1), developed by Rebek and co-workers,22 building upon the design of de Mendoza.23 This host has shown interesting binding trends for a variety of alkane guests in water.22,24,25 Complexes of 1 host n-alkanes up to n-C18H38 (here referred to as C18 for brevity).25 The stoichiometries of the complexes depend on the alkane length. Alkanes up to n-nonane (C9) result in 1:1 host/guest complexes,24 whereas C11 and longer n-alkanes form 2:1 complexes in which the alkane is incarcerated inside the 12 capsule in an extended conformation.22,24 For C10, equilibrium is observed between the 1:1 and 2:1 complexes, as expressed in Equilibrium 1-C10).24 Received: July 3, 2018 Published: September 6, 2018 12527
DOI: 10.1021/jacs.8b06984 J. Am. Chem. Soc. 2018, 140, 12527−12537
Article
Journal of the American Chemical Society
C10@1 and C10@12 complexes were observed in a roughly 1:1 ratio.24 Hence, out of three molecules of 1, two are engaged in the C10@12 complex and one is engaged in the C10@1 complex. The concentration of cavitand 1 in solution was 0.5 mM, which means that the concentrations of both C10@1 and C10@12 were 0.167 mM. Assuming alkane C10 to be present in excess as a separate phase and its aqueous concentration to be approximately constant, the equilibrium constant K′ for Equilibrium 1-C10 becomes K′ =
[C10@12 ] K 0.167 × 10−3 mol · L−1 = = 2 [C10] [C10@1] (0.167 × 10−3 mol · L−1)2
= 6 × 103 L·mol−1
which can be converted to a free energy difference of −RT ln(K′) = −5 kcal/mol (including the standard free energy of C10 in its pure phase). This value will be used as a reference to assess the various computational protocols. There are a couple of other pieces of information known about this reaction, and these will be mentioned in the appropriate places of the further discussion. We show that the first attempt to model Equilibrium 1-C10, using a rather advanced methodology, yielded very large errors. Systematic examination of the various elements of the methodology is then presented, showing that the main deviation stems from the solvation model. An amended computational protocol is then proposed that significantly reduces the error.
Figure 1. Cavitand 1 and capsule 12. In the picture of the capsule, the R groups are replaced with methyl groups for clarity.
2 × C10@1(aq) F C10@12 (aq) + C10(l)
(1-C10)
Water has been suggested to make specific interactions with both cavitand 1 and capsule 12.22,24,25 For cavitand 1, with its vaselike structure, water molecules are suggested to bridge between the benzimidazolone groups of the vase’s rim, forming hydrogen bonds with the urea groups.22,24 Studies on watersoluble cyclodextrin hosts have shown that release of hostbound water molecules that are energetically “frustrated”, i.e., that have insufficient numbers of possible hydrogen bonds, is a driving force in the formation of inclusion complexes of other guests than water.26,27 A similar effect has been proposed for 1.22 It has also been suggested that water molecules can play a role in the widening of the capsule by binding along the seam of the dimer in order to accommodate folded forms of n-alkanes longer than C14.25 Quantum chemical methodology has recently become an important tool in the study of supramolecular host−guest complexes.28−52 The various components of the methodology have gradually improved, and more accurate results are possible today. Advances in density functional theory are, of course, of great importance here, but also important is the development of various corrections, such as dispersion and free energy corrections.33,53 A significant number of theoretical studies of supramolecular complexes in water have applied implicit solvation models.28,30,31,33,37,38,47,50,51,53−55 However, none of these studies treated dimerization processes. In a molecular dynamics study of cyclodextrin dimerization, it was found that explicit solvent models performed better than generalized Born models.56 Furthermore, in the SAMPL5 challenge of calculating host− guest binding free energies, in which a wide variety of theoretical methods were applied, it was found that methods involving explicit water molecules in general performed better than implicit solvent models.57 Aqueous solvation clearly creates a challenge for standard quantum chemical protocols, and a combination of implicit and explicit solvation has been suggested to provide a general improvement for calculations of pKa, solvation free energies, and reaction mechanisms in the case of strong interactions between the solute and the solvent.58−63 In the present work, we study the host−guest binding properties of 1 and 12 using DFT methodology. We will use Equilibrium 1-C10 as a benchmark, because it is well-defined and quantifiable in terms of free energy. Experimentally, the
2. REFERENCE COMPUTATIONAL PROTOCOL The total Gibbs energy of the system is calculated according to the following equation, Gtot = Egas + δGfree + δGsol
where Egas is the gas-phase energy of the system, δGfree is the free energy correction to the gas-phase energy, and δGsol is the solvation free energy. First, the calculations were performed using the following computational scheme, which will be referred to as the reference protocol. This computational protocol was used very recently in the study of a cycloaddition reaction inside a very similar self-assembled capsule, but in an organic solvent, and yielded very good results.45 The calculations were performed using the B3LYP-D3(BJ) functional,53,64−66 i.e., the B3LYP functional augmented with the DFT-D3 dispersion method, utilizing the Becke−Johnson damping function, as implemented in Gaussian 09.67 Geometry optimizations were performed with the 6-31G(d,p) basis set, and on the basis of these optimized geometries, single-point calculations were conducted with the 6-311+G(2d,2p) basis set. In Gaussian 09, only the two-body part of the D3(BJ) dispersion correction is included in the DFT-D3 method. The three-body terms were therefore added manually to the single-point energies calculated with the larger basis set. The sum of these contributions constitutes Egas. Next, frequency calculations were carried out at the same level of theory as the geometry optimizations to characterize the nature of the optimized structures and to calculate the free energy correction, δGfree. The quasi-rigid-rotor-harmonicoscillator (qRRHO) approach33 was applied, as this method corrects for the overestimation of the entropic contributions from low-frequency vibrational modes by the standard harmonicoscillator approach. An additional benefit of this approach is that it minimizes the impact of errors in the magnitude of lowfrequency modes due to numerical error in the Hessian. 12528
DOI: 10.1021/jacs.8b06984 J. Am. Chem. Soc. 2018, 140, 12527−12537
Article
Journal of the American Chemical Society
part of the alkane is inside the cavitand, while the four linearly arranged carbons are found in the solvent-exposed end. In free C10, each gauche interaction is calculated to be associated with an energy penalty of ca. 0.7 kcal/mol. This penalty is, however, compensated by stronger interactions between the cavitand and the six-gauche conformation of C10 as well as a smaller solvent-exposed area. This isomer is calculated to be 1.2 kcal/mol more stable than the one with an all-anti C10. In the capsule complex C10@12, the most stable structure of C10 was found to be the all-anti conformation, as shown in Figure 2. Using the lowest-energy structures of C10@1 and C10@12 and the computational protocol described earlier, the free energy of Equilibrium 1-C10 is calculated to be −46.8 kcal/mol, which is an overestimation by >40 kcal/mol compared to the experimental value of −5 kcal/mol. This shows that the adopted methodology, which is normally considered as relatively accurate and has been employed successfully in previous studies, fails severely in this case. In the following sections, we will examine the possible sources of this failure by considering a number of alterations of the various elements of the computational protocol.
For both reasons, this approach can be important for large systems with many soft modes, and it has been used quite frequently in recent years, yielding generally good results.68−73 The free energy calculations were done using rotational symmetry numbers σr = 1. Determination of the approximate molecular point groups and symmetry numbers for all molecules was carried out in GaussView,74 and the term RT ln(σr) was added to the calculated free energy corrections. For a couple of structures (the empty cavitand 1 and the water dimer, see Supporting Information), small imaginary frequencies were obtained that could not be eliminated by further optimization of the geometries. In these cases, to avoid inaccuracies that would arise from comparing entropic values calculated with different numbers of vibrational modes, these imaginary frequencies were treated as real in the qRRHO calculations. Finally, solvation free energy, δGsol, was calculated with the COSMOtherm software,75,76 at 298.15 K and 1 atm pressure, at the BP86/TZVP level of theory.77,78,79 This solvation method has been used in several similar applications in recent years.33,36,51 To correct for the change in standard state in going from the ideal gas (1 atm standard state) to a water solution (1 mol/L standard state for all compounds except water, for which 55.3 mol/L is used), a value of RT ln(24.5 L/mol × 1 mol/L) = +1.9 kcal/mol is added for all solutes, and RT ln (24.5 L/mol × 55.4 mol/L) = +4.3 kcal/mol is added for water. The solvation free energies of the n-alkanes are calculated in their own phases because they are poorly soluble in water, and the corresponding liquid-phase concentrations are used in the standard state corrections. An important note here concerns the conformational search. The size of the system under study requires a thorough examination of different conformations in order to identify the energetically most stable one. We have therefore explicitly considered different hydrogen-bonding patterns, guest conformations, and host−guest binding modes for each complex studied, and the structures reported here are the ones with the lowest free energy. A model of 1 is used in which the alkylpyridinium substituents and the Cl− counterions are replaced by methyl groups (the effect of this replacement will be examined below). In the most stable conformer of C10@1, the alkane adopts six gauche configurations and one anti configuration for the internal C−C single bonds (Figure 2). The coiled
3. CAVITAND MODEL The alkylpyridinium substituents are crucial for the solubility in water, but these groups are not expected to strongly affect the binding properties of the cavitand. In the reference protocol presented earlier, these alkylpyridinium substituents, together with the Cl− counterions, were therefore replaced by methyl groups. Here, we evaluate the effect of this truncation on the free energy of Equilibrium 1-C10. Using the full alkylpyridinium substituents means that cavitand 1 has 4+ charge, and bringing together two of these units in the dimer leads consequently to a large repulsion in the gas phase. Indeed, as seen in Table 1 (entries 2 and 3), the gas-phase energy change (ΔEgas) associated with Equilibrium 1-C10 for the full model is >230 kcal/mol higher than that for the truncated model. The solvation correction (ΔδGsol) of course acts in the opposite direction, but only by 160 kcal/mol. In total, thus, the model with alkylpyridinium substituents results in an energy of +25.5 kcal/mol, to be compared to −46.8 kcal/mol obtained with the truncated model. Adding eight chloride counterions to the model, such that the full model is neutral, gives energies that are very close to those for the truncated model (−50.0 vs −46.8 kcal/mol, see Table 1, entries 2 and 4). This shows that the truncated model provides a good representation of the full system and can be used for the evaluation of the methodology. In water, the halide ions are not expected to be tightly bound to the alkylpyridinium substituents, so in principle the 4+ and 8+ models should yield accurate results, but we presume that continuum solvation of the multiple charged groups leads to the error obtained when including only the alkylpyridinium groups in the model (and not the chloride ions). 4. GAS-PHASE ENERGY To evaluate the sensitivity of the computational results to the choice of DFT functional, we have in addition to the B3LYPD3(BJ) method also used the TPSS-D3(BJ)53,66,80 and M06-2X81 functionals. These are popular methods and have been used previously in supramolecular chemistry applications.30,33,36−38,40,42−44,47,54,82 The geometries were then
Figure 2. B3LYP-D3(BJ)/6-31G(d,p)-optimized geometries of the most stable conformers of C10@1 and C10@12. 12529
DOI: 10.1021/jacs.8b06984 J. Am. Chem. Soc. 2018, 140, 12527−12537
Article
Journal of the American Chemical Society Table 1. Energies (kcal/mol) for Equilibrium 1-C10 Calculated with Different Computational Protocolsa entry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
method experimental value reference protocolb
ΔEgas
ΔδGfree
−115.9 +7.1 cavitand model [full model]4+ +118.1 [full model]4+·4Cl− −115.5 functional TPSS-D3(BJ)c −117.1 +5.7 M06-2Xc −95.7 +5.5 B3LYPc (no dispersion) −105.9 +19.5 dispersion model XDMd −115.8 DFT-D2 −102.3 basis set 6-311+G(2d,2p) + −114.7 counterpoise def2-TZVP −116.1 def2-TZVP + gCP −115.1 free energy protocol RRHO +8.2 qRRHO (only vibrational) +3.0 implicit solvation model SMD C-PCM SMD, joined cavities C-PCM, joined cavities
ΔδGsol
ΔGtot
+62.0
−5 −46.8
−99.7 +58.4
+25.5 −50.0
+63.3 +60.1 +66.2
−48.1 −30.1 −20.2
Figure 3. Top views of the geometry of C10@1, optimized with the functionals (a) B3LYP, (b) B3LYP-D3(BJ), and (c) B3LYP-D3(BJ) with four water molecules present. The alkane is made transparent for clarity.
−46.7 −33.2
obtained when four water molecules are included in the model, forming hydrogen bonds between the cavitand units (ca. 7 Å). As seen, uncorrected B3LYP gives a similar overall free energy for Equilibrium 1-C10 as M06-2X, which is closer to the experimental value than the reference B3LYP-D3(BJ) result. This is somewhat unexpected because it is generally accepted that B3LYP-D3(BJ) gives better results than uncorrected B3LYP. To further investigate this issue, one can consider the following equilibrium as an auxiliary condition on the modeling. Namely, according to the experiments, Equilibrium 2-C10 should be exergonic, because complex C10@1 can be observed.24
−45.6 −47.0 −46.0 −45.7 −50.9 +61.1 +44.6 +63.4 +38.9
−47.7 −64.2 −45.4 −69.9
1(aq) + C10(l) F C10@1(aq)
(2-C10)
Note that, in solution, a 1:10 mixture of C4v-symmetric 1 and the so-called “velcrand” dimer (1-C2v)2 is observed,22 so Equilibrium 2-C10 should include both these species on the left-hand side. However, modeling of (1-C2v)2 proved to be quite challenging, because the positively charged alkylpyridinium substituents and the chloride counterions were found to interact with the 1-C2v units at the monomer−monomer interface (see Supporting Information). Interestingly, when n-decane is added to the mixture, no signal from (1-C2v)2 or 1 was observed,24 which means that C10@1 is more stable than both complexes, in particular 1, as formulated in Equilibrium 2-C10. Uncorrected B3LYP predicts Equilibrium 2-C10 to be endergonic by as much as +23.3 kcal/mol, and therefore C10 is calculated to be unbound to cavitand 1. B3LYP-D3(BJ), TPSSD3(BJ), and M06-2X also predict endergonicity (+3.0, +5.1, and +1.7 kcal/mol, respectively), but these values are significantly lower than uncorrected B3LYP and much closer to the experimental outcome. It can therefore be concluded that the better agreement of B3LYP for Equilibrium 1-C10 as compared to B3LYP-D3(BJ) must be fortuitous. To further assess the sensitivity of the results to the specific choice of dispersion correction, we have also used the wave function-based XDM protocol83 and the DFT-D2 correction,84 in both cases in conjunction with the B3LYP functional (Table 1, entries 8 and 9). The calculations were performed on the basis of the geometries optimized with B3LYP-D3(BJ). While the XDM and D3(BJ) corrections give very similar results (−46.7 vs −46.8 kcal/mol), the D2 correction results in a much lower exergonicity (−33.2 kcal/mol). The latter value is closer to the experimental result but still far from the measured −5 kcal/mol. In addition, with XDM, Equilibrium 2-C10 is calculated to have a positive free energy of +5.2 kcal/mol, while B3LYP-D2 is the only method considered here that predicts this equilibrium to be exergonic, by −7.0 kcal/mol.
a
Unless stated otherwise, all calculations are performed on the basis of the geometries optimized at the B3LYP-D3(BJ)/6-31G(d,p) level of theory. bTruncated cavitand model with B3LYP-D3(BJ)/6311+G(2d,2p), COSMO-RS, and qRRHO. cThe geometries are reoptimized with the respective functional and the 6-31G(d,p) basis set. dOn the basis of the electron density calculated at the B3LYP/6311+G(2d,2p) level of theory.
reoptimized with these functionals, and corresponding thermal and solvation corrections were recalculated. The results are given in Table 1, entries 5 and 6. While TPSS-D3(BJ) is found to give very similar results compared to B3LYP-D3(BJ), −48.1 vs −46.8 kcal/mol, the difference is quite large using M06-2X. With this method, the overall free energy of Equilibrium 1-C10 is calculated to be −30.1 kcal/mol, which is 17 kcal/mol smaller in magnitude than that with the reference protocol. As seen in Table 1, the difference stems mainly from the gas-phase energies. Interestingly, the M06-2X value is rather insensitive to whether the D3 correction is added or not. The ΔEgas value changes only by 1.7 kcal/mol upon inclusion of this correction (from −95.7 to −97.4 kcal/mol). On the other hand, the B3LYP functional without the dispersion correction gives a large difference compared to B3LYP-D3(BJ), −20.2 vs −46.8 kcal/mol (Table 1, entries 2 and 7). Notably, both the gas-phase energies and the free energy corrections differ in this case. The geometry of the cavitand optimized with the B3LYP functional changes significantly with and without the dispersion correction. The diameter of the cavitand−decane complex C10@1, measured as the distance between the oxygen atoms of diametrically opposite carbonyl groups at the rim, is ca. 11 Å when the geometry is optimized with the uncorrected B3LYP functional and ca. 8 Å when optimized with B3LYP-D3(BJ). As shown in Figure 3, the latter is much closer to the distances 12530
DOI: 10.1021/jacs.8b06984 J. Am. Chem. Soc. 2018, 140, 12527−12537
Article
Journal of the American Chemical Society
To understand the source of the discrepancy between the different solvation methods, one can consider the urea molecule, which is akin to the imidazolone units at the rim of the cavitand, and for which the hydration free energy has previously been calculated to be −13.8 ± 0.4 kcal/mol using Monte Carlo simulations.97 Also for this molecule, COSMO-RS and SMD give almost identical results (−11.3 and −11.2 kcal/mol, respectively), while C-PCM gives the significantly lower estimate of −7.8 kcal/mol. This strongly suggests that the difference between C-PCM and the other methods stems thus from the fact that the solvation free energy of the cavitand monomer is underestimated. These results also hint at the possibility that inclusion of explicit water molecules at the rim may provide a systematic way to improve the results. This will be investigated in detail later. Before that, however, it is important to discuss an aspect of the implicit solvation methods that can be problematic in the case of host−guest systems and that is related to the way in which the molecular cavities used in these solvation calculations are generated. Specifically, for the inclusion complexes studied here, two disjoint cavities are formed: one around the host and one around the bound guest. Between them there is a void, which the implicit solvent models decorate with point charges in order to model the electric field created by the solvent. The effect is thus that the incarcerated guests are modeled as if they were directly interacting with the solvent, which of course is an artifact of the solvation model. To estimate the effect of this on the solvation free energy of Equilibrium 1-C10, the two cavities were joined by adding extra spheres (using the “ExtraSph” keywords in Gaussian) between the host and the guest. Quite reassuringly, with SMD it turns out that this is a quite minor problem, as the procedure described above gives a negligible effect of +2.3 kcal/mol, (Table 1, entries 15 and 17). With C-PCM, the effect is −5.7 kcal/mol (Table 1, entries 16 and 18), i.e., in the wrong direction compared to experiments.
According to these results, it seems that the B3LYP-D2 method gives somewhat better agreement with the experimental findings than B3LYP-D3(BJ) and B3LYP-XDM. The latter ones are, however, generally found to be superior to B3LYP-D2,85 which might indicate that the D2 benefits from a fortuitous cancellation of errors. Further tests with B3LYP-D2 will be presented in the last part of this Article. As a final element of the evaluation of the gas-phase energy, we also investigated the sensitivity of the results to the basis set superposition error (BSSE). Because eight hydrogen bonds are formed between the cavitands in the 12 dimer, the energies might be prone to BSSE, where the basis functions on one monomer extend to the other. This would result in an artificial stabilization of the supramolecular complex with respect to the monomers, which would favor the right-hand side of Equilibrium 1-C10. To examine this effect, counterpoise calculations86,87 were performed with the 6-311+G(2d,2p) basis set. As seen in Table 1 (entries 2 and 10), the difference is negligible, ∼1 kcal/mol. In addition, calculations were also performed with the def2-TZVP basis set88 and the geometrical counterpoise correction (gCP)89 (entries 11 and 12). The calculated difference compared to the reference protocol is −5 −44.0 −27.5 −34.7 −35.8
4-C9
1-C10
+3.4 +2.5 +12.9 +10.7
−5 −47.3 −32.5 −42.1 −42.9
4-C10
1-C11
−3.7 −5.5 +6.3 +1.9