Letter pubs.acs.org/OrgLett
Solving the Density Functional Conundrum: Elimination of Systematic Errors To Derive Accurate Reaction Enthalpies of Complex Organic Reactions Arkajyoti Sengupta and Krishnan Raghavachari* Department of Chemistry, Indiana University, 800 East Kirkwood Avenue, Bloomington, Indiana 47405, United States S Supporting Information *
ABSTRACT: The failure of available density functional methods to compute accurate reaction enthalpies of common organic reactions is well documented. Herein, we demonstrate that the disparate results from different functionals stem from the systematic errors in the underlying elementary reactions that represent the changes in the bonding environment between reactants and products. We develop a rigorous protocol to correct for these systematic errors and obtain dramatically improved results with deviations of only 1−2 kcal/mol for most functionals.
T
as wave function theory (WFT) methods using CBH that are in excellent agreement with accurate methods such as G4 theory11 or CCSD(T)12 in conjunction with large basis sets.13 Herein, we use a novel CBH-based error-correction scheme to determine highly accurate reaction enthalpies for a range of organic reactions with commonly used methods. The systematic errors of the different DFT and WFT methods are identified as the intrinsic errors in the reaction enthalpies of the underlying elementary reactions that represent the fundamental changes in the bonding environment between reactants and products. This insight leads to a systematic error-correction protocol to derive accurate reaction enthalpies via the following steps: (1) Compute the reaction energy with the current DFT (or WFT) method under consideration and assess its performance vs the accurate energies from G4 theory (vide infra), and the deviation is denoted as “Dev-0”. Dev-0 varies across a wide range of 0−45 kcal/mol, depending on the method used and the reaction considered. (2) For each rung of CBH-n (only n = 1, 2 considered in this paper though higher rungs can be used as needed), the CBH reaction schemes are set up for the reactant and the product to identify the net change in the elementary model reactions (see Figure 1 for an example). For all of the compounds considered in this work, all of the fragments representing the reactants and products in the CBH reactions can be automatically and uniquely defined. For more complex systems such as delocalized aromatic molecules that have multiple resonance structures, the fragments may depend on the nature of the valence bond structure in the case of the higher rungs (CBH-3 and higher). Fragment molecules common to both the reactant and product sides cancel each other, and the resultant CBH-1
he widespread application of computational chemistry for the elucidation of reaction mechanisms has contributed immensely toward the huge success of fields such as organocatalysis.1 Density functional theory (DFT) provides a great trade-off between accuracy and computational cost, becoming the go-to approach for computational modeling of large molecules (∼100 atoms). The rapid growth of the field has also shed some light on the inadequate performance of many DFT methods, particularly the widely used B3LYP functional.2 Resolving some of the deficiencies has popularized some new functionals3 and has led to the development of useful schemes (e.g., D3 dispersion corrections).4 However, the incessant search for the “exact” functional has resulted in numerous new functionals (with acronyms that can be bewildering even for the expert user), with each one specialized in its own turf, yet not successful enough to be used universally.5 The increasing number of studies resulting in erroneous conclusions calls for a broader education among the users.6 With hundreds of variants available, there is a growing need for a systematic protocol to assess different methods. In this paper, we present a well-defined protocol that yields dramatically improved results. Wheeler, Houk, Schleyer, and Allen revived the age-old technique of error cancellation through their systematic hybridization-based hierarchy of homodesmotic reactions for hydrocarbons.7 While some applications have been carried out,8 the need to predefine reactants and products in higher levels of the hierarchy has restricted their use to only a few classes of organic molecules like hydrocarbons.9 To overcome some of the associated limitations, a more systematic and generalized error-canceling thermochemical hierarchy called connectivitybased hierarchy (CBH) was developed in our group that is applicable to a broad range of organic and biomolecular systems.10 We have derived thermochemical properties of both open-shell and close-shell organic molecules with DFT as well © 2017 American Chemical Society
Received: March 26, 2017 Published: May 5, 2017 2576
DOI: 10.1021/acs.orglett.7b00891 Org. Lett. 2017, 19, 2576−2579
Letter
Organic Letters
Figure 1. Derivation of ΔCBH-1 and ΔCBH-2 schemes with an illustrative example to define Dev-0, Dev-1, and Dev-2.
and CBH-2 schemes (ΔCBH-1 and ΔCBH-2) are then used to provide error correction. (3) Calculations are carried out on the resultant reactions with the current method as well as the reference method (G4 theory, vide infra) to calculate the associated corrections. Corrn = ΔCBH-n(G4) − ΔCBH-n(method). The overall energy deviation at rung n is Dev-n = Dev-0 − Corr-n. CBH-1 corresponds to the well-known isodesmic bond separation scheme, and CBH-2 is denoted as the isoatomic scheme that preserves the immediate chemical environment of all the heavy (non-H) atoms. Thus, ΔCBH-1 corrects for the intrinsic errors in bond transformations and ΔCBH-2 corrects for the errors from medium-range effects such as hyperconjugation and protobranching (1,3 alkyl−alkyl interactions) along with bond transformations. Due to the scarce availability of experimental enthalpies of the associated species with high accuracy, we have used G4 theory (typically accurate to 1 kcal/ mol) as our “experimental” values, though other more accurate methods such as Wn theory can also be used. In our thermochemical protocol, expensive G4 calculations are needed only on the small molecules involved in ΔCBH-n schemes, independent of the size of the reactant or product. In addition, to overcome the redundancy in calculations on common species across different reactions, a stored data table can save computer time. The overall scaling of the protocol for a large organic system is thus the cost of the underlying DFT (e.g., B3LYP) or WFT (e.g., MP2) method. We have computed reaction energies with different DFT and WFT methods for a set of 25 organic reactions (named as CBH-R25 test set) shown in Figure 2. CBH-R25 set includes common organic reactions like Diels−Alder (R1−R6), aldol condensation (R7−R8), Pauson−Khand reaction (R13), aminoxylation (R14), and isomerization reactions (R17−
Figure 2. CBH-R25 test set of 25 organic reactions.
2577
DOI: 10.1021/acs.orglett.7b00891 Org. Lett. 2017, 19, 2576−2579
Letter
Organic Letters
analogues (e.g., R23) as well.18 The deviation in reaction R23 decreases from −8.4 to 1.9 kcal/mol with ΔCBH-2. B3LYP isomerization energy for reaction R24 (Claisen rearrangement) is underestimated by 6.1 kcal/mol but accurately described by ΔCBH-2 (Dev-2 = 0.0 kcal/mol). A large deviation of −26.9 kcal/mol in the B3LYP cyclization energy in reaction R25 decreases to a more reasonable value of Dev-2 = −3.7 kcal/mol. Averaged over the CBH-R25 test set, the large B3LYP mean absolute deviation (MAD-0 = 12.9 kcal/mol) decreases only marginally with the isodesmic scheme (ΔCBH-1: MAD-1 = 9.6 kcal/mol) but improves dramatically using ΔCBH-2 schemes (MAD-2 = 1.7 kcal/mol). Similar improvements are seen for all DFT and WFT methods. Figure 4 represents the mean absolute deviations (MAD-0) in
R25). It includes molecules with a broad range of functional groups and chemical environments to provide a rigorous calibration of the performance of existing methods. Details about the computational methods and the different density functionals (including references) are provided in the Supporting Information. As an illustration, we first examine the performance of the ΔCBH schemes in conjunction with the widely used B3LYP functional. Table S1 lists the ΔCBH-1 and ΔCBH-2 reaction schemes for the reactions R1−R25. Prior studies have repeatedly shown the failure of B3LYP toward the determination of accurate reaction enthalpies of Diels−Alder reactions.3,14 B3LYP underestimates the reaction enthalpies of R1−R6 due to an inadequate description of σ → π bond transformations and hyperconjugation effects present in the cyclic and bicyclic products. Figure 3a shows that application of
Figure 4. Calculated mean absolute deviations (MAD-0) and CBH corrected MAD-2 in the reaction energies of R1−R25 reactions at selected levels of theory.
Figure 3. B3LYP-calculated Dev-0 and CBH-corrected deviations (Dev-1 and Dev-2 at ΔCBH-1 and ΔCBH-2, respectively) in the reaction energies of (a) R1−R6 and (b) R17, R19, R20, and R22.
the reaction energies of reactions R1−R25 and corrected deviations through ΔCBH-2 schemes (MAD-2). The dramatic decrease of the deviations from the MAD-0 to MAD-2 across the various DFT and WFT methods demonstrates the consistently excellent performance of the ΔCBH-2 schemes. Only the local density functional (SVWN5) has a MAD-2 of greater than 3 kcal/mol after error cancellation. Table S4 lists the MAD for all 26 different DFT- and WFT-based methods. Among the different DFT methods, ωB97xD (see Figure 4) performs best in yielding reaction energies of the parent reactions R1−R25 directly with MAD-0 = 2.2 kcal/mol. While there is not room for substantial improvements in such cases, we note that the ΔCBH-2 scheme still yields significant improvements for some of the reactions where the errors are larger. For example, the maximum deviation for ωB97xD was found in reaction R13 (Dev-0 = 7.2 kcal/mol). Application of ΔCBH-2 scheme resulted in a much smaller deviation, Dev-2 = −1.4 kcal/mol. Overall, ωB97xD, in conjunction with ΔCBH-2, yields accurate reaction enthalpies (MAD-2 = 1.6 kcal/mol). Many of the previously known performance trends for families of DFT functionals still hold after ΔCBH-2 corrections, but the range of errors is compressed. For example, hybrid functionals work better than gradient-corrected (GGA) functionals. Thus, PBE has a MAD-2 of 1.9 kcal/mol, while the PBE-0 hybrid functional performs better (1.4 kcal/mol). Inclusion of D3 dispersion corrections improves the performance for these organic reactions. Thus, B3LYP has a MAD-2 of 1.7 kcal/mol while B3LYP-D3BJ has a smaller error of 1.3 kcal/ mol. We evaluated the performance of four different DFT-D3 methods namely, ωB97xD (discussed in the previous paragraph), B3LYP-D3BJ, CAM-B3LYP-D3BJ, and B97-D3BJ. In conjunction with ΔCBH-2 schemes, they all offer results with similar accuracy (∼1.5 kcal/mol). Similarly, B3LYP-NL, wherein the nonlocal part of the VV10 functional is used for dispersion,19 also has a MAD-2 of 1.5 kcal/mol.
ΔCBH-1 (isodesmic) schemes improves the results only marginally, but dramatic improvements are observed with ΔCBH-2. The mean absolute deviations, Dev-0, Dev-1, and Dev-2, for the reactions R1−R6 are 15.2, 10.8, and 0.9 kcal/ mol, respectively. The large value of Dev-1 shows that correcting errors via the isodesmic scheme alone is insufficient, while the small value of Dev-2 suggests that the 1,3 alkyl−alkyl interactions and hyperconjugation effects present in the parent molecules are accurately represented in the CBH-2 fragments.15 Similarly, ΔCBH-2 schemes for R9 and R10 lower the underestimations from −6.9 and −6.2 kcal/mol to −2.5 and −0.5 kcal/mol, respectively. The addition reaction resulting in a cyclized product in R11 involves CC → −C−C− and CO → −C−O− transformations. A deviation of −15.7 kcal/mol in the reaction due to delocalization errors and protobranching reduces to a small deviation of −1.7 kcal/mol using the ΔCBH2 scheme. Similarly, a spectacular improvement is seen in the dimerization of tetramethylethene forming the cyclized product (R12),16 the ΔCBH-2 scheme resulting in a deviation of only −1.5 kcal/mol compared to the original deviation of −28.1 kcal/mol. The [3 + 2 + 2] cyclization in R13 involves multiple bond transformations: CC → CC, CC → −C−C−, and CO → CO. The calculated B3LYP deviation of −7.1 kcal/ mol is improved by the ΔCBH-2 scheme, yielding Dev-2 of a mere 0.6 kcal/mol. A representative set of six isomerization reactions from Grimme’s study has been included in the present study (R17− R22) as additional challenging systems.17 Reactions R17, R19, and R22 have deviations (Dev-0) greater than 10 kcal/mol with values as high as 45 kcal/mol in R19. On application of ΔCBH2 reaction schemes, dramatic improvements are observed as shown in Figure 3b. Our analysis reveals that the incorrect isomerization energies for the allene to propyne rearrangement, seen for many density functionals, crops up in their higher 2578
DOI: 10.1021/acs.orglett.7b00891 Org. Lett. 2017, 19, 2576−2579
Letter
Organic Letters ORCID
Double-hybrid DFT functionals include a component of MP2 electron correlation and are known to perform better than hybrid functionals. B2PLYP, a double-hybrid derivative of B3LYP, resulted in significant improvement over B3LYP. Without any error corrections, B2PLYP and B2PLYP-D3BJ yield MAD-0 of 6.9 and 3.7 kcal/mol, respectively. The ΔCBH2 schemes resulted in substantial improvements and yielded highly accurate reaction enthalpies: MAD-2 = 1.1 and 1.0 for B2PLYP and B2PLYP-D3BJ, respectively. With only two deviations greater than 2 kcal/mol and a MAD of 1.0 kcal/ mol, B2PLYP-D3BJ shows consistent performance irrespective of the size of the molecules in the reactions, making it the most accurate functional tested. As in the case of DFT methods, impressive improvements are also seen for the Hartree−Fock method with the ΔCBH-2 schemes (MAD-0 = 13.3 vs MAD-2 = 2.1 kcal/mol). Grimme’s spin-component scaled MP2 (SCS-MP2) yields more accurate reaction energies (MAD-0 = 1.8 kcal/mol) than the conventional MP2 method (MAD-0 = 4.0 kcal/mol). The performance with ΔCBH-2 schemes (MAD-2) improved to 1.3 and 1.1 kcal/mol for MP2 and SCS-MP2 methods, respectively. Direct computations with CCSD(T) on the large sized molecules involved in the reactions R19, R20 and R22 were not carried out. For the remaining systems, CCSD(T) with ΔCBH-2 schemes resulted only in marginal improvement to the already accurate results (MAD-0 = 0.9 kcal/mol, MAD-2 = 0.5 kcal/ mol). This is consistent with the similar performance expected for CCSD(T) and G4 theory. Overall, the protocol presented in this work eliminates the systematic errors of DFT and WFT methods to yield accurate enthalpies of complex organic reactions. Notably, most of the performance differences between different density functionals decrease dramatically. In conjunction with ΔCBH-2 schemes, many functionals yield MAD less than 2.0 kcal/mol, and B2PLYP-D3BJ and SCS-MP2 yield MAD of only 1.0 kcal/mol. We note that the current protocol is valid for a wide range of chemical reactions involving organic and biomolecular systems and should find widespread applications in many fields of research. In particular, we expect the methods to be applicable for systems containing heteroatoms or halogens, and for unstable systems such as radicals to enable applications in challenging areas such as combustion chemistry of biofuels. Further developments are ongoing to extend the applicability of these methods for transition states, for delocalized aromatic systems, and for complex systems such as metal−organic frameworks.
■
Krishnan Raghavachari: 0000-0003-3275-1426 Notes
The authors declare no competing financial interest.
■
ACKNOWLEDGMENTS A.S. acknowledges a Kraft fellowship from Indiana University. We acknowledge support from NSF Grant No. CHE-1266154 at Indiana University.
■
ASSOCIATED CONTENT
* Supporting Information S
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.orglett.7b00891. ΔCBH-1 and ΔCBH-2 reaction schemes for R1−R25, the corresponding reaction enthalpies at various levels of theory, and Cartesian coordinates of optimized geometries (PDF)
■
REFERENCES
(1) (a) Tantillo, D. J. Acc. Chem. Res. 2016, 49, 1079−1079. Refer to all articles included in the special issue. (b) Cheng, G.; Zhang, X.; Chung, L. W.; Xu, L.; Wu, Y. J. Am. Chem. Soc. 2015, 137, 1706. (2) (a) Schreiner, P. R.; Fokin, A. A.; Pascal, R. A.; de Meijere, A. Org. Lett. 2006, 8, 3635. (b) Pieniazek, S. N.; Clemente, F. R.; Houk, K. N. Angew. Chem., Int. Ed. 2008, 47, 7746. (c) Johnson, E. R.; MoriSánchez, P.; Cohen, A. J.; Yang, W. J. Chem. Phys. 2008, 129, 204112. (3) (a) Mardirossian, N.; Head-Gordon, M. J. Chem. Phys. 2015, 142, 074111. (b) Zhao, Y.; Truhlar, D. Theor. Chem. Acc. 2008, 120, 215. (4) (a) Grimme, S.; Ehrlich, S.; Goerigk, L. J. Comput. Chem. 2011, 32, 1456. (b) Grimme, S.; Antony, J.; Ehrlich, S.; Krieg, H. J. Chem. Phys. 2010, 132, 154104. (5) (a) Peverati, R.; Truhlar, D. G. Philos. Trans. R. Soc., A 2014, 372, No. 20120476. (b) Mardirossian, N.; Head-Gordon, M. J. Chem. Phys. 2016, 144, 214110. (6) (a) Medvedev, M. G.; Bushmarinov, I. S.; Sun, J.; Perdew, J. P.; Lyssenko, K. A. Science 2017, 355, 49. (b) Wilson, E. K. Chem. Eng. News 2011, 89, 36. (c) Plata, R. E.; Singleton, D. A. J. Am. Chem. Soc. 2015, 137, 3811. (7) Wheeler, S. E.; Houk, K. N.; Schleyer, P. v. R.; Allen, W. D. J. Am. Chem. Soc. 2009, 131, 2547. (8) Wheeler, S. E.; Moran, A.; Pieniazek, S. N.; Houk, K. N. J. Phys. Chem. A 2009, 113, 10376. (9) (a) Fishtik, I. J. Phys. Chem. A 2012, 116, 8792. (b) Wodrich, M. D.; Gonthier, J. F.; Corminboeuf, C.; Wheeler, S. E. J. Phys. Chem. A 2012, 116, 8794. (10) Ramabhadran, R. O.; Raghavachari, K. J. Chem. Theory Comput. 2011, 7, 2094. (11) Curtiss, L. A.; Redfern, P. C.; Raghavachari, K. WIREs Comput. Mol. Sci. 2011, 1, 810. (12) Friesner, R. A. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 6648. (13) (a) Sengupta, A.; Ramabhadran, R. O.; Raghavachari, K. J. Phys. Chem. B 2014, 118, 9631. (b) Sengupta, A.; Raghavachari, K. J. Chem. Theory Comput. 2014, 10, 4342. (c) Ramabhadran, R. O.; Sengupta, A.; Raghavachari, K. J. Phys. Chem. A 2013, 117, 4973. (14) Mezei, P. D.; Csonka, G. I.; Kállay, M. J. Chem. Theory Comput. 2015, 11, 2879. (15) McKee, W. C.; Schleyer, P. v. R. J. Am. Chem. Soc. 2013, 135, 13008. (16) Check, C. E.; Gilbert, T. M. J. Org. Chem. 2005, 70, 9828. (17) Huenerbein, R.; Schirmer, B.; Moellmann, J.; Grimme, S. Phys. Chem. Chem. Phys. 2010, 12, 6940. (18) Woodcock, H. L.; Schaefer, H. F.; Schreiner, P. R. J. Phys. Chem. A 2002, 106, 11923. (19) Vydrov, O. A.; Van Voorhis, T. J. Chem. Phys. 2010, 132, 164113.
AUTHOR INFORMATION
Corresponding Author
*E-mail:
[email protected]. 2579
DOI: 10.1021/acs.orglett.7b00891 Org. Lett. 2017, 19, 2576−2579