Improving “Silver-Standard” Benchmark Interaction Energies with Bond Functions


Quantum Electronic Structure

Improving "silver-standard" benchmark interaction energies with bond functions Narendra Nath Dutta, and Konrad Patkowski J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.8b00204 • Publication Date (Web): 17 May 2018 Downloaded from http://pubs.acs.org on May 19, 2018



Journal of Chemical Theory and Computation

Improving “silver-standard” benchmark interaction energies with bond functions

Narendra Nath Dutta and Konrad Patkowski
Department of Chemistry and Biochemistry, Auburn University, Auburn, Alabama 36849
(Dated: April 24, 2018)

We investigate the effect of adding midbond basis functions on the performance of various conventional and explicitly correlated (F12) estimates of complete basis set limit coupled-cluster (CCSD(T)/CBS) noncovalent interaction energies. In particular, we search for an improved “silver standard” of interaction energy calculations for systems where the CCSD(T) computation is feasible in a double-zeta basis but not in a triple-zeta one. We follow a recent study (Sirianni et al., J. Chem. Theory Comput. 2017, 13, 86) of different CCSD(T)-F12 variants in midbondless bases over the A24 and S22 benchmark interaction energy databases, and extend Dunning’s correlation-consistent basis sets with three different midbond sets. The addition of bond functions is highly beneficial for conventional CCSD(T) and most CCSD(T)-F12 variants, improving both the CCSD part and the unscaled triples contribution. However, the commonly used scaling of triples by the ratio of the MP2-F12 and MP2 correlation energies usually overshoots: as a result, the scaled triples term gets worse upon the addition of bond functions. In contrast, a milder triples scaling by the ratio of the CCSD-F12b and CCSD correlation energies (Brauer et al., Phys. Chem. Chem. Phys. 2016, 18, 20905) leads to the most accurate estimates of this term as long as bond functions are included. The combination of the triples term scaled in this way with the CCSD-F12b interaction energy leads to the CCSD(Tbb)-F12b approach that provides consistent high accuracy when a (3s3p2d2f) set of midbond functions is added to the aug-cc-pVDZ atom-centered basis set. The combination of midbond functions and the composite MP2/CBS+δ(CCSD(T)) treatment is able to make up for the deficiencies in the atom-centered part of the basis set, in particular, for a partial (or even complete) lack of diffuse functions. Considering both the A24 and S22 accuracy and the computational efficiency, we propose several new “silver standard” approaches improving upon the currently established midbondless levels of theory, ranging from the most consistent CCSD(Tbb)-F12b/aug-cc-pVDZ+(3s3p2d2f) variant (with mean unsigned errors of 0.010 and 0.042 kcal/mol for the A24 and S22 databases, respectively) to the significantly cheaper MP2/CBS+δ(CCSD(T))/cc-pVDZ+(3s3p2d2f) approach (mean unsigned errors of 0.039 and 0.096 kcal/mol for A24 and S22, respectively).

ACS Paragon Plus Environment


I. INTRODUCTION

Noncovalent interactions (NCI) are ubiquitous and the knowledge of accurate intermolecular interaction energies is indispensable for many fields of chemistry, physics, biology, and materials science. At the same time, an accurate calculation of interaction energies is not a trivial task as the results are highly sensitive to both the one-electron basis set and the treatment of electron correlation. Therefore, one would ideally study NCI at the “gold standard” level of electronic structure theory, that is, the coupled-cluster method with single, double, and perturbative triple excitations [CCSD(T)]1 at the complete basis set (CBS) limit (for the smallest complexes, computing corrections beyond the CCSD(T)/CBS level, in particular, the effects of higher-order coupled-cluster excitations, is also feasible2–4). However, the steep $o^3 v^4$ scaling of CCSD(T) with system size (with $o$ and $v$ being the numbers of occupied and virtual molecular orbitals, respectively) implies that very often a small-basis CCSD(T) calculation is feasible but converging the interaction energy to the CBS limit is a serious problem. In order to alleviate the slow convergence of CCSD(T) interaction energies with the size of the one-electron basis set, four approaches have been proposed5: composite (focal point) methods, CBS extrapolation, midbond functions, and the explicitly correlated (R12/F12) treatment. The composite approach, highly popular in the thermochemistry field6, assumes that the basis set convergence pattern of a high-level method (such as CCSD(T)) is similar to the convergence pattern of a low-level, more computationally tractable method (such as the second-order Møller-Plesset perturbation theory, MP2), so that the effects beyond a medium-size basis set can be approximated at a lower level of theory.
The CBS extrapolation requires a sequence of calculations in a family of systematically improved basis sets (most typically, the correlation-consistent cc-pVXZ sets of Dunning and coworkers7,8) and assumes a functional form for the interaction energy convergence with the basis set cardinal number X. The most important example is the $X^{-3}$ extrapolation of correlation energy contributions introduced by Helgaker and coworkers9,10; however, other variations are also possible11–13. The addition of midbond functions was popularized by Tao14–16, who observed that noncovalent interaction energies, especially those dominated by dispersion, require a reliable description of the wavefunction in the region where the molecular density tails overlap, a region that is not well described by conventional atom-centered basis sets. Therefore, Tao suggested adding a set of basis functions centered between the interacting molecules (monomers), most typically halfway between the monomers’ centers of mass. Finally, the newest option to speed up the basis set convergence of CCSD(T) is the use of explicitly correlated17,18 (F12) variants such as CCSD(T)-F12a and CCSD(T)-F12b (Refs. 19 and 20) and CCSD(T)(F12*) (also termed CCSD(T)-F12c; Ref. 21). In addition to these four CBS convergence acceleration techniques, it should be mentioned that the accuracy of a finite-basis calculation also depends on the treatment of basis set superposition error (BSSE), that is, whether counterpoise-corrected (CP)22, uncorrected, or average (half corrected, half uncorrected) interaction energies are used. Only the full CP treatment is compatible with midbond functions14,15; therefore, in this work, we will use the CP-corrected approach exclusively. If a CCSD(T) computation in the aTZ≡aug-cc-pVTZ basis is feasible for a given noncovalent complex, a very accurate CCSD(T)/CBS estimate can be computed using, e.g., the composite MP2/CBS+δ(CCSD(T))/aTZ approach:

$$E^{\rm int}_{\rm CCSD(T)/CBS} \approx E^{\rm int}_{\rm MP2/CBS} + \delta E^{\rm int}_{\rm CCSD(T)/aTZ} \qquad (1)$$

where each $E^{\rm int} = E_{AB} - E_{A} - E_{B}$ is the supermolecular interaction energy and

$$\delta E^{\rm int}_{\rm CCSD(T)/aTZ} = E^{\rm int}_{\rm CCSD(T)/aTZ} - E^{\rm int}_{\rm MP2/aTZ}. \qquad (2)$$
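As a minimal numerical sketch of Eqs. (1) and (2), the Python snippet below assembles a composite CCSD(T)/CBS estimate from three interaction energies. The function names and the numerical values are hypothetical and serve only as an illustration; they are not taken from this work.

```python
def interaction_energy(e_dimer, e_mono_a, e_mono_b):
    """Supermolecular interaction energy E_int = E_AB - E_A - E_B."""
    return e_dimer - e_mono_a - e_mono_b


def composite_ccsdt_cbs(e_int_mp2_cbs, e_int_ccsdt_small, e_int_mp2_small):
    """Eq. (1): MP2/CBS interaction energy plus the delta(CCSD(T)) term of Eq. (2)."""
    delta = e_int_ccsdt_small - e_int_mp2_small  # Eq. (2), same (small) basis for both
    return e_int_mp2_cbs + delta


# Hypothetical interaction energies in kcal/mol, for illustration only.
estimate = composite_ccsdt_cbs(
    e_int_mp2_cbs=-5.00,
    e_int_ccsdt_small=-4.60,
    e_int_mp2_small=-4.80,
)
print(f"Composite CCSD(T)/CBS estimate: {estimate:.2f} kcal/mol")
```

The only requirement built into this sketch is that the CCSD(T) and MP2 energies entering the correction are computed in the same basis set, as Eq. (2) prescribes.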

However, CCSD(T)/aTZ can be performed only for systems up to about 19 nonhydrogen atoms (the adenine-thymine complex, the largest system in the popular S22 database23). The calculation for the latter system was actually quite nontrivial and required a supercomputer back in 2010 (Ref. 24). On the other hand, numerical evidence shows25 that even the composite approach to CCSD(T)/CBS is sometimes highly inaccurate if the $\delta E^{\rm int}_{\rm CCSD(T)}$ term is not computed in at least a partially augmented cc-pVDZ basis (plain cc-pVDZ is not accurate enough for this purpose). The CCSD(T)/aDZ calculations are feasible up to about 27 nonhydrogen atoms (the CO2–curved coronene complexes of Ref. 26). Therefore, there exists an important class of medium-sized complexes for which CCSD(T)/aDZ is feasible but CCSD(T)/aTZ is not: in such a case, obtaining benchmark CCSD(T)/CBS interaction energies accurate to about 0.1 kcal/mol is possible but requires great care. Extending the “precious metals” language where an MP2/CBS+δ(CCSD(T))/aTZ (or similar) estimate of CCSD(T)/CBS is regarded as gold standard, one needs to look for “silver” and “bronze standards” that provide accuracy only slightly inferior to the gold one, do not exhibit particularly bad outliers, and are computationally tractable for systems of (somewhat) larger size. It should be noted that the range of applicability of CCSD(T) has recently been significantly extended thanks to approximations that exploit the local character of electron correlation27–29. However, it is still very difficult to keep the residual interaction energy errors of the local approximation below the desired 0.1–0.2 kcal/mol level30; therefore, the local approximation will not be considered in the present work.


The issue of appointing “silver” and “bronze standards” of electronic structure theory for NCI was thoroughly studied by Sherrill and coworkers31 using a benchmark dataset composed of 345 structures. They found respectably low values of mean unsigned errors (MUE) for several approaches that did not require coupled-cluster calculations beyond aDZ: specifically, the MUE for MP2/CBS+δ(CCSD(T))/aDZ, CCSD(T**)-F12b/aDZ, and MP2-F12/CBS+δ(CCSD(T**)-F12b)/aDZ were 0.08, 0.09, and 0.06 kcal/mol, respectively (the double asterisk in (T**)-F12b denotes an empirical scaling of the triples interaction energy contribution20,32 to approximately account for the effect of explicit correlation on this term; see Sec. II for details). An even lower MUE of 0.05 kcal/mol was found for the dispersion-weighted DW-CCSD(T**)-F12/aDZ variant, which is a linear combination of CCSD(T**)-F12a and CCSD(T**)-F12b with system-dependent coefficients33. Therefore, the latter variant was termed the “silver standard” of electronic structure theory, while the name “bronze standard” was assigned to MP2C-F12/aDZ, an approach that is much cheaper but not quite of benchmark quality (the MP2C “MP2 coupled” approach of Hesselmann corrects the dispersion in the MP2 interaction energy with a coupled Kohn-Sham expression34). The choice of DW-CCSD(T**)-F12/aDZ as a “silver standard” has an additional advantage that no large-basis MP2 calculation is needed; such a calculation is not nearly as costly as large-basis CCSD(T) but might not be trivial nonetheless. The “silver standard” has been subsequently employed to generate various benchmark-level interaction energies, in particular, the BBI and SSI databases of backbone-backbone and sidechain-sidechain interactions between amino acids in actual protein structures35.
The most comprehensive assessment of different CCSD(T∗∗ )-F12x interaction energies in both the aXZ basis sets and the cc-pVXZ-F12 sets constructed specifically for F12 calculations36,37 is likely the recent work by Sirianni et al.38 involving the A2439 and S2223 databases. In this study, the improvement brought about by the F12 approach in the aXZ sequence was impressive (equivalent to increasing the cardinal number X by 2–3). However, the cc-pVXZ-F12 family did not perform nearly as well for any F12 variants, indicating that the exponents of these bases, optimized for molecular correlation energies, are likely not diffuse enough for NCI. This result confirms earlier findings40–42 indicating lower-than-expected accuracy of CCSD(T)-F12/cc-pVXZ-F12 interaction energies for a more limited set of complexes. In view of this lower accuracy, we will not consider the cc-pVXZ-F12 basis sets in this work. It should be noted that, even more recently, sets of diffuse functions augmenting the cc-pVXZ-F12 sets have been optimized43 . However, the additional functions leading to the aug-cc-pVDZ-F12 basis are limited to a single set of d functions for nonhydrogen atoms and a set of p functions for hydrogen, which might be too little to introduce a substantial improvement. One should also note that, contrary to what the name implies, the nonaugmented cc-pVDZ-F12 basis is already larger than aDZ. Nevertheless, we will make some limited comparisons to the aug-cc-pVDZ-F12 results in Sec. III. The aforementioned studies by the Sherrill group31,38 contribute to the large body of work (initiated by Tew et al.44 and Marchetti and Werner32 ) that demonstrates the utility of small-basis approximate CCSD(T)-F12 treatment of interaction energies, both stand-alone and in combination with the composite approach and/or CBS extrapolation. 
In comparison, quite little is known about the performance of the fourth CBS convergence acceleration technique described above, the addition of midbond functions. While it has been verified that bond functions can be successfully combined with CBS extrapolations45 and with the CCSD(T)-F12 approach40,41 , hardly any benchmark sets available have utilized this methodology (the only example that comes to mind are the MP2/CBS+δ(CCSD(T))/aTZ+(bond) results of Ref. 46 for the S22 benchmark database23 ). In particular, while the strategies for using “silver standard”, CCSD(T)-F12/aDZ-level computations to maximize interaction energy accuracy have been thoroughly investigated38 , a similar study at the (only slightly more expensive) CCSD(T)-F12/aDZ+(bond) level has not been performed so far. In this work, we extend the investigations of Refs. 31 and 38 to a new dimension: bases involving midbond functions. Our main goal is to find out if an addition of a single set of bond functions to the aDZ basis set, bringing about only a small increase in the computational cost of the CCSD(T)/CCSD(T)-F12 calculation, can substantially increase its accuracy. In other words, we are looking for an improved “silver standard” in a form of either a stand-alone CCSD(T)-F12/aDZ+(bond) calculation or a composite MP2/CBS+δ(CCSD(T)-F12)/aDZ+(bond) treatment. The performance of various CCSD(T)/CBS estimates will be tested on the same two databases as in Ref. 38: A2439 and S2223 . For the former database, the reference CCSD(T)/CBS limit values are available from extrapolations involving basis sets as large as (aQZ,a5Z) ((a5Z,a6Z) for selected systems)47 : for the latter one, a series of improvements to the original benchmark values24,32,46,48 resulted in reference data in which the δ(CCSD(T)) term is computed in at least the aTZ basis set. 
Therefore, to make sure that the reference values have sufficient precision to distinguish between different small-basis estimates, we extend our studies to one cardinal number below the benchmark, that is, we will examine CCSD(T)/CCSD(T)-F12 results in basis sets aDZ, aTZ, and aQZ for A24 and only aDZ for S22. In addition to the assessment of aDZ-level CCSD(T)/CBS estimates, we aim to find out whether bond functions can make up for deficiencies in the diffuse functions of the atom-centered part of the basis set, for example, when the latter is truncated to a “calendar” set jul-cc-pVDZ or jun-cc-pVDZ49 or even down to nonaugmented cc-pVDZ. The relevant details of the complexes, basis sets, and the methodology of calculations are given in Sec. II. The results are presented and analyzed in Sec. III. Finally, Sec. IV contains conclusions.
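For readers unfamiliar with the $X^{-3}$ extrapolation mentioned above, the following sketch shows the standard two-point form, assuming that the correlation energy behaves as $E_X = E_{\rm CBS} + A X^{-3}$ for two cardinal numbers. The function name and the synthetic input values are hypothetical; the precise extrapolation protocol behind the reference data is that of the cited works and may differ in detail.

```python
def extrapolate_x3(e_small, x_small, e_large, x_large):
    """Two-point CBS estimate assuming E_X = E_CBS + A * X**-3.

    e_small, e_large: correlation-energy contributions in the smaller and
    larger basis; x_small, x_large: the corresponding cardinal numbers.
    """
    if x_large <= x_small:
        raise ValueError("x_large must exceed x_small")
    w_large, w_small = x_large ** 3, x_small ** 3
    # Eliminating A from the two equations E_X = E_CBS + A / X**3 gives:
    return (w_large * e_large - w_small * e_small) / (w_large - w_small)


# Synthetic data generated from E_CBS = -1.0 and A = 0.5 (arbitrary units).
e_q = -1.0 + 0.5 / 4 ** 3  # X = 4, e.g. an aQZ-like basis
e_5 = -1.0 + 0.5 / 5 ** 3  # X = 5, e.g. an a5Z-like basis
e_cbs = extrapolate_x3(e_q, 4, e_5, 5)
print(f"Extrapolated value: {e_cbs:.6f}")
```

Because the synthetic inputs follow the assumed functional form exactly, the extrapolation recovers the chosen limit; for real data the quality of the estimate depends on how well that form holds.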


II. METHODOLOGY

In this work, we performed calculations for 46 weakly interacting complexes belonging to two popular NCI databases: A2439 and S2223 . The geometries of the complexes were taken from the original works as listed in the BEGDB online database50 . The reference interaction energies for the A24 set are taken from counterpoise-corrected conventional CCSD(T) calculations extrapolated to CBS using results in the (aQZ,a5Z) basis sets ((a5Z,a6Z) for selected small systems). These reference values were computed in Ref. 47, and used as benchmark in Ref. 38 under the name A24B. We would like to stress that, in order to ensure a fair assessment of the CBS convergence, the reference values should not contain any effects beyond the frozen-core CCSD(T) level of theory: therefore, we used the CCSD(T)-level A24B values as benchmarks even though more accurate interaction energies, containing coupled-cluster excitations up to CCSDT(Q), are available for this database4 . For the S22 set, we used the latest, most precise revision of benchmark CCSD(T)/CBS estimates, introduced in Ref. 48 under the name S22B. In addition to the reference values for the “gold standard” CCSD(T)/CBS interaction energies, we need the data for MP2/CBS (for the composite treatment) and CCSD/CBS (to separate out the convergence of the CCSD and (T) contributions). For the former data and the A24 database, we use the well converged MP2-F12/a5Z values from Ref. 38. For the S22 set, we compute the benchmark MP2/CBS interaction energy as the difference between the (S22B) CCSD(T)/CBS value and its δ(CCSD(T)) contribution as listed in Table S1 of the Supplementary Material to Ref. 48. The CCSD/CBS reference values for the A24 database are extracted at exactly the same (system-dependent) basis set levels as the A24B benchmark; for example, if the benchmark is calculated as CCSD(T)/(aQZ,a5Z extrapolation), we will define the CCSD/CBS reference as CCSD/(aQZ,a5Z extrapolation). 
All the required CCSD interaction energies have been extracted from the BioFragment Database BFDb35 using its powerful Python front end. The CCSD/CBS estimates for the S22 database are not available at the same level as the CCSD(T)/CBS benchmarks; therefore, we will restrict the analysis of the separate CCSD and (T) contributions to the A24 set. The conventional practice of adding midbond functions locates them at a point halfway between the centers of mass of the interacting monomers. While the precise location of a midbond center has little influence on the resulting interaction energies51,52, the standard practice runs into problems when one monomer is much longer than the other and the midpoint between the molecular centers of mass is still within one of the monomers. In such a case, a more elaborate algorithm of placing a midbond center has been proposed53 where its location is an $r^{-6}$-weighted average of intermolecular atom-atom midpoints:

$$\mathbf{r}_{\rm bond} = \frac{\sum_{a\in A}\sum_{b\in B} w_{ab}\,\frac{\mathbf{r}_a+\mathbf{r}_b}{2}}{\sum_{a\in A}\sum_{b\in B} w_{ab}}, \qquad w_{ab} = |\mathbf{r}_a-\mathbf{r}_b|^{-6} \qquad (3)$$

where the summations run over all atoms a in molecule A and all atoms b in molecule B. We observed that, for the A24 database, the midpoint between the monomers’ centers of masses sometimes happens to be too close to one monomer: therefore, we used Eq. (3) to place the midbond function centers for all complexes, both A24 and S22. We will consider both the standard (3s3p2d) and (3s3p2d2f) sets of midbond functions (dating back to the original work of Tao and coworkers14) and a set of bond functions that varies with X. The midbond function exponents for the (3s3p2d2f) set were 0.9, 0.3, and 0.1 for sp and 0.6 and 0.2 for df; the (3s3p2d) set involves the same exponents as (3s3p2d2f) for the spd orbitals, only the f orbitals are omitted. We will use the shorthand notation aXZ+(332) and aXZ+(3322) for the aXZ atom-centered basis set with (3s3p2d) and (3s3p2d2f) bond functions, respectively. The variable-midbond calculations, denoted as aXZ+(aXZ), place the same (aXZ) set, with exponents and contraction coefficients appropriate for hydrogen, at the midbond location. An explicitly correlated CCSD(T)-F12 calculation brings about two additional degrees of freedom. First, a particular approximation to full CCSD-F12 has to be selected: in this work, we will investigate the CCSD-F12a and CCSD-F12b approaches of Refs. 19 and 20. The second degree of freedom pertains to the treatment of perturbative triples. While an explicit F12 Ansatz for triples has been proposed54, it is too costly for practical applications: as a result, the (T) energy term is normally calculated from a standard, non-F12 formula using the converged CCSD-F12 singles and doubles amplitudes. To account for the missing F12 contribution to triples, it has been proposed20,32 to scale the (T) term by the ratio of the MP2-F12 and MP2 correlation energies in a given basis set:

$$E_{\rm (T^*)} = E_{\rm (T)}\,\frac{E^{\rm corr}_{\rm MP2-F12}}{E^{\rm corr}_{\rm MP2}} \qquad (4)$$
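Equation (4) amounts to multiplying the conventional (T) energy by a single basis-set-dependent factor. A minimal sketch follows, with a hypothetical function name and made-up energies (in hartree) that are not taken from any calculation in this work:

```python
def scale_triples(e_t, e_corr_mp2_f12, e_corr_mp2):
    """Eq. (4): scale the (T) energy by the MP2-F12 / MP2 correlation-energy ratio."""
    if e_corr_mp2 == 0.0:
        raise ValueError("conventional MP2 correlation energy must be nonzero")
    return e_t * (e_corr_mp2_f12 / e_corr_mp2)


# Made-up values: the F12 treatment recovers 10% more MP2 correlation energy,
# so the (T) term is scaled up by the same factor of 1.1.
e_t_scaled = scale_triples(e_t=-0.010, e_corr_mp2_f12=-0.330, e_corr_mp2=-0.300)
print(f"Scaled (T) energy: {e_t_scaled:.6f} hartree")
```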

In the context of NCI, such a scaling is performed in the calculations for dimer AB, monomer A, and monomer B. If the scaling factors in these three calculations are different, residual size inconsistency occurs: to prevent that, the scaling factor for AB is also employed in calculations for A and B (after Ref. 33, this practice is denoted by double asterisk, e.g., CCSD(T**)-F12b). The ratio of MP2-level correlation energies is not the only possible choice for the triples scaling factor: Brauer et al.55 examined several choices and recommended two new factors that are the ratios of correlation energies at the CCSD level:

$$E_{\rm (Tb)} = E_{\rm (T)}\,\frac{E^{\rm corr}_{\rm CCSD-F12b}}{E^{\rm corr}_{\rm CCSD}} \qquad (5)$$

and an analogous $E_{\rm (Tc)}$ choice using the CCSD(F12*)≡CCSD-F12c correlation energy instead of the CCSD-F12b one. In this work, we will test the performance of $E_{\rm (Tb)}$ with the scaling factor obtained for the dimer used for the monomer calculations as well. In accordance with the double asterisk notation of Ref. 33, we will refer to such a variant as CCSD(Tbb)-F12b. It should be noted that the (Tbb) scaling, unlike the (T**) one, requires a conventional CCSD calculation in addition to the CCSD(T)-F12 one. All CCSD(T) and CCSD(T)-F12 calculations are performed with the MOLPRO 2012.1 code56,57. The explicitly correlated variants include the unscaled CCSD(T)-F12a and CCSD(T)-F12b approaches, their scaled CCSD(T**)-F12a, CCSD(T**)-F12b, and CCSD(Tbb)-F12b counterparts, and their dispersion-weighted DW-CCSD(T**)-F12 combination (the details and results for the latter approach are presented in the Supporting Information). We did not investigate the CCSD(T)(F12*)≡CCSD(T)-F12c variant in the present study as it typically produces results similar to CCSD(T)-F12b38. All F12 computations use the default value (1.0 $a_0^{-1}$) of the geminal correlation factor β. As far as the auxiliary basis sets are concerned, we use the standard cc-pVXZ/MP2FIT and aug-cc-pVXZ/MP2FIT sets58,59 for fitting the MP2-F12 pair functions, and the cc-pVXZ/JKFIT and aug-cc-pVXZ/JKFIT sets60 for fitting the Fock matrix as well as for the complementary auxiliary basis set to approximate many-electron integrals via the resolution of identity. We use the augmented aug-cc-pVXZ/MP2FIT and aug-cc-pVXZ/JKFIT sets for all atoms that have some or all diffuse functions in the orbital basis and nonaugmented cc-pVXZ/MP2FIT and cc-pVXZ/JKFIT sets otherwise: for example, the jun-cc-pVDZ and jul-cc-pVDZ orbital sets are accompanied by augmented auxiliary sets centered on nonhydrogen atoms and nonaugmented auxiliary sets on hydrogens.
For the (3322) and (332) midbond sets, the corresponding auxiliary basis, proposed in Ref. 61, involves (1.8, 1.2, 0.6, 0.4, 0.2) exponents for spd orbitals, (1.5, 0.9, 0.5, 0.3) exponents for f , and (1.5, 0.9, 0.3) exponents for g. In the special case of the aug-cc-pVDZ-F12 basis set of Ref. 43, the auxiliary basis sets employed were aug-cc-pVDZ/MP2FIT for the MP2-F12 pair functions, aug-cc-pVDZ/JKFIT for fitting the Fock matrix, and cc-pVDZ-F12/OPTRI62 , augmented by one diffuse function for each angular momentum in an even-tempered fashion, for the resolution of identity. As already mentioned, all calculations employ the full counterpoise correction for BSSE. We will compare the performance of various CCSD(T)/CBS estimates using mean unsigned errors (MUE) averaged over a given database (A24 or S22). The corresponding comparisons in terms of mean unsigned relative errors (MURE), that would lead to virtually identical conclusions, are shown in the Supporting Information.
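The midbond placement rule of Eq. (3) is simple enough to sketch in a few lines of Python. The function name, the coordinate format (sequences of Cartesian triples), and the example geometry below are assumptions made for illustration; this is not the code used to generate the results reported here.

```python
import math


def midbond_center(coords_a, coords_b):
    """r^-6 weighted average of intermolecular atom-atom midpoints, Eq. (3).

    coords_a, coords_b: sequences of (x, y, z) positions of the atoms of
    monomers A and B, in any consistent length unit.
    """
    numerator = [0.0, 0.0, 0.0]
    denominator = 0.0
    for r_a in coords_a:
        for r_b in coords_b:
            weight = math.dist(r_a, r_b) ** -6  # w_ab = |r_a - r_b|^-6
            denominator += weight
            for k in range(3):
                numerator[k] += weight * 0.5 * (r_a[k] + r_b[k])
    return tuple(component / denominator for component in numerator)


# Hypothetical geometry: a one-atom monomer A and a two-atom monomer B on the z axis.
monomer_a = [(0.0, 0.0, 0.0)]
monomer_b = [(0.0, 0.0, 2.0), (0.0, 0.0, 4.0)]
center = midbond_center(monomer_a, monomer_b)
print(center)
```

In this toy geometry the closer atom pair carries a weight 64 times larger than the more distant one, so the center lies only slightly beyond the midpoint of the closest contact rather than at the midpoint between the monomers' centers of mass.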

III. RESULTS AND DISCUSSION

We will first examine the performance of the (332), (3322), and (aXZ) sets of midbond functions in straight-up CCSD(T)/CCSD(T)-F12 computations on the A24 dataset. The relevant MUE values of conventional CCSD(T) and different CCSD(T)-F12 variants for the aDZ, aTZ, and aQZ basis sets are presented in Figs. 1, 2, and 3, respectively. The corresponding percentage MURE values as well as the individual interaction energies are given in the Supporting Information. The midbondless aXZ values (which were already computed in Ref. 38, albeit with slightly different auxiliary bases) are listed for comparison. As expected, the plain CCSD(T)/aDZ interaction energies in Fig. 1 are far from benchmark quality, with an MUE of 0.38 kcal/mol and a maximum error of 0.91 kcal/mol attained for the formaldehyde dimer. As is typically the case with counterpoise-corrected results, all CCSD(T)/aDZ interaction energies are higher than the benchmark values. The addition of bond functions lowers all interaction energies, and the presence of the larger (3322) set lowers them more than the (332) set. Nevertheless, all CCSD(T)/aDZ+(bond) interaction energies are still above the benchmark values. This ordering,

$$E^{\rm int}_{\rm CCSD(T)/aXZ} > E^{\rm int}_{\rm CCSD(T)/aXZ+(332)} > E^{\rm int}_{\rm CCSD(T)/aXZ+(3322)} > E^{\rm int}_{\rm CCSD(T)/CBS}$$

and

$$E^{\rm int}_{\rm CCSD(T)/aXZ} > E^{\rm int}_{\rm CCSD(T)/aXZ+(aXZ)} > E^{\rm int}_{\rm CCSD(T)/CBS},$$

which is expected but not formally guaranteed in a supermolecular counterpoise-corrected calculation, will be referred to as the natural ordering. As illustrated in the Supporting Information, this natural ordering for conventional CCSD(T) interaction energies is maintained also for the aTZ and aQZ basis sets for all A24 complexes. In quantitative terms, the improvement brought about by midbond functions is quite impressive for aDZ, with the largest (3322) midbond set lowering the MUE to 0.09 kcal/mol and the maximum error to 0.37 kcal/mol, again for the formaldehyde dimer.

[Figure 1: bar chart of MUE (kcal/mol) for CCSD(T), CCSD(T)-F12a, CCSD(T)-F12b, CCSD(T**)-F12a, CCSD(T**)-F12b, and CCSD(Tbb)-F12b.]
FIG. 1. Mean unsigned errors for different CCSD(T)/CCSD(T)-F12 variants in the aDZ, aDZ+(332), aDZ+(3322), and aDZ+(aDZ) basis sets, averaged over the A24 database.

[Figure 2: bar chart of MUE (kcal/mol) for the same six variants.]
FIG. 2. Mean unsigned errors for different CCSD(T)/CCSD(T)-F12 variants in the aTZ, aTZ+(332), aTZ+(3322), and aTZ+(aTZ) basis sets, averaged over the A24 database.

[Figure 3: bar chart of MUE (kcal/mol) for the same six variants.]
FIG. 3. Mean unsigned errors for different CCSD(T)/CCSD(T)-F12 variants in the aQZ, aQZ+(332), aQZ+(3322), and aQZ+(aQZ) basis sets, averaged over the A24 database.

The next most effective midbond set for the aDZ atom-centered basis is (332), and (aDZ) provides the least improvement. A similar pattern is observed for the aTZ basis, although the improvement in MUE provided by the best (3322) midbond set is twofold rather than fourfold as in the case of aDZ. At the aQZ level, the improvement is further reduced but still substantial: a factor of 1.6 reduction in MUE with the (3322) midbond and a factor of 1.7 with the (now best performing) (aQZ) midbond. This behavior is very much expected: for a constant midbond set such as (332) or (3322), bond functions constitute a smaller fraction of the entire basis set as X is increased, leading to a less pronounced (but still clearly visible) improvement. On the other hand, the variable midbond (aXZ) set grows together with the atom-centered basis, and it is expected
to provide larger improvement than any constant midbond for sufficiently large X. Overall, we see from Figs. 1–3 that the convergence pattern with midbond is very similar to that without it, only the errors are smaller. This is why midbond functions work well together with CBS extrapolation: in fact, the CCSD(T)/CBS estimates employed in several ultra-accurate potentials for small complexes [63–66] owe their impressive precision to a combination of large correlation-consistent atom-centered basis sets, midbond functions, and CBS extrapolation. Moreover, the CCSD(T)/aDZ+(3322) treatment is quite accurate even though such a basis set is sometimes criticized [66] for a lack of balance, as the angular momenta of the midbond functions (spdf) exceed those of the atom-centered functions (spd). Clearly, an “imbalanced” basis set with midbond is not automatically a bad choice for computing noncovalent interaction energies.

Before we turn to the explicitly correlated results in Figs. 1–3, it is important to set the expectations for the ordering of different F12 variants. Recall that CCSD-F12a is formally more approximate (it neglects more diagrams) than CCSD-F12b [19,20]. At the same time, while the scaling of triples can sometimes overshoot [40,41], it should certainly be better than no scaling at all. Therefore, one could naively expect CCSD(T∗∗)-F12b (or possibly CCSD(Tbb)-F12b, which will be discussed later in this section) to be the best F12 variant and CCSD(T)-F12a to be the worst one, with CCSD(T∗∗)-F12a and CCSD(T)-F12b providing intermediate performance. However, in practical calculations of interaction energies, the CCSD(T)-F12a approach strongly benefits from an error cancellation between the CCSD-F12a part and the triples part [41]. As a result, the intermediate CCSD(T∗∗)-F12a and CCSD(T)-F12b variants, where this error cancellation is disturbed, quite consistently perform worse than either CCSD(T)-F12a or CCSD(T∗∗)-F12b [26,67]. As Figs. 2 and 3 show, the same ordering is observed for the midbondless aTZ and aQZ results averaged over the A24 database. Only for the aDZ basis set (Fig. 1) are the CCSD(T∗∗)-F12a results more accurate than either CCSD(T)-F12a or CCSD(T∗∗)-F12b. It is likely that, in the aDZ case, the basis set incompleteness errors are too large to ensure the consistent cancellation pattern observed for larger basis sets. From now on, we will refer to the CCSD(T)-F12a and CCSD(T∗∗)-F12b variants as balanced and to the CCSD(T∗∗)-F12a and CCSD(T)-F12b ones as imbalanced.

We can now evaluate the influence of midbond functions on the CCSD(T)-F12 performance as displayed in Figs. 1–3. First, the unscaled-triples CCSD(T)-F12b results are consistently improved by the addition of midbond, and a larger midbond provides more improvement than a smaller one. This is the case because the “natural ordering” of CCSD(T)-F12b interaction energies, defined earlier in this section, holds true for all A24 complexes. However, CCSD(T)-F12b is the least accurate F12 variant to start with, and the improvement afforded by the midbond functions, while consistent, still does not make the CCSD(T)-F12b results competitive with other F12 flavors. On the other hand, the balanced


CCSD(T)-F12a and CCSD(T∗∗)-F12b variants exhibit impressive performance when midbond functions are added, with the MUE values in the aDZ basis dropping from 0.070–0.075 kcal/mol to 0.009–0.022 kcal/mol for the two fixed-midbond sets and to 0.034–0.038 kcal/mol for aDZ+(aDZ). As expected, the improvement decreases in magnitude (but remains consistent) for the aTZ atom-centered basis set. At the aQZ level, the midbondless CCSD(T)-F12a and CCSD(T∗∗)-F12b results are already highly accurate, and the addition of bond functions leaves the MUE almost the same (with a slight increase or decrease possible) while slightly decreasing the MURE values. However, the observed average improvement is a net result of different tendencies for different systems, as the natural ordering does not in general hold for the CCSD(T)-F12a and CCSD(T∗∗)-F12b interaction energies: the midbond correction sometimes overshoots for both variants. As a result, the (332) midbond outperforms the larger (3322) one for both variants and all basis sets. Finally, the effects of adding midbond functions on the CCSD(T∗∗)-F12a interaction energies are mostly harmful. Therefore, this variant, which is relatively inaccurate already in midbondless basis sets (except for aDZ), should not be recommended for benchmark interaction energy calculations; its good performance for A24 in the aDZ set is likely accidental.

The overall MUE values for different variants and midbond functions for the S22 database, displayed in Fig. 4, follow the same patterns as for the A24 database discussed so far. The CCSD(T)/aDZ+(bond) and CCSD(T)-F12b/aDZ+(bond) interaction energies all follow the natural ordering: as a result, the MUE values improve when bond functions are included, and aDZ+(3322) performs better than aDZ+(332). However, as before, neither of these variants provides acceptable accuracy even when midbond functions are added. The CCSD(T∗∗)-F12a variant is again quite erratic: even without midbond, the errors of this approach go in both directions, although the overall MUE happens to be the lowest among the four variants, barely overtaking CCSD(T∗∗)-F12b. Again, this variant cannot be improved by the addition of bond functions. However, the balanced CCSD(T)-F12a and CCSD(T∗∗)-F12b methods are significantly improved by the addition of midbond, with the overall MUE decreased from 0.11–0.17 kcal/mol to 0.06–0.09 kcal/mol for the two fixed midbonds and to 0.07–0.14 kcal/mol for aDZ+(aDZ). Interestingly, the (332) midbond is slightly preferable to the (3322) one for CCSD(T∗∗)-F12b but slightly inferior for CCSD(T)-F12a.

The performance of the dispersion-weighted DW-CCSD(T∗∗)-F12 approach of Ref. 33 (the pertinent MUE values for the A24 and S22 databases are presented in the Supporting Information) is intermediate between those of its parent CCSD(T∗∗)-F12a and CCSD(T∗∗)-F12b approximations: therefore, it is in general inferior to a straight-up CCSD(T∗∗)-F12b treatment when midbond functions are added. An interesting exception is the variable-midbond case, where DW-CCSD(T∗∗)-F12/aDZ+(aDZ) happens to provide the lowest MUE on the S22 database (0.030 kcal/mol) of any method considered. Apparently, a small midbond set such as (aDZ) can enhance the DW-CCSD(T∗∗)-F12 accuracy, while a larger midbond disturbs the balance between the CCSD(T∗∗)-F12a and CCSD(T∗∗)-F12b contributions reached by fitting the midbondless results.

The mean unsigned errors for the hydrogen-bonded, dispersion-dominated, and mixed-influence subsets of the A24 and S22 databases (as defined in Ref. 38), displayed in the Supporting Information, show that the addition of bond functions generally affects the interaction energies for different types of systems in a similar way: in the aDZ basis set, the improvement that bond functions provide to CCSD(T), CCSD(T)-F12a, and CCSD(T∗∗)-F12b for different classes of systems is mostly comparable.

The convergence patterns of different CCSD(T)-F12 variants become much clearer when one investigates the behavior of the CCSD and triples contributions separately. Therefore, the MUE values for the CCSD/CCSD-F12 interaction energies (relative to the CCSD/CBS benchmark) and for the triples interaction energy contribution (relative to the difference between the CCSD(T)/CBS and CCSD/CBS benchmarks) are presented in Figs. 5, 6, and 7 for the A24 database and basis sets aDZ, aTZ, and aQZ, respectively. The reference CCSD/CBS values, at exactly the same basis set and extrapolation level as the A24B reference CCSD(T)/CBS values, were extracted from the BioFragment Database (BFDb) [35] project website. Unfortunately, not all CCSD/CBS values for the S22 database are available at the level consistent with the CCSD(T)/CBS benchmarks: therefore, we limit the consideration of the CCSD and (T) interaction energy contributions to the A24 set. The results in Figs. 5–7 clearly demonstrate both the power of F12 to improve the CCSD interaction energy contribution and the lack of improvement (actually, a slightly detrimental effect) that the converged CCSD-F12 amplitudes have on the perturbational (T) term. The CCSD-F12a and CCSD-F12b interaction energies are all much more accurate than the conventional CCSD ones. Without midbond functions, the CCSD-F12a variant performs better in the aDZ and aTZ basis sets while CCSD-F12b is superior for aQZ.
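The separate error measures described above can be illustrated with a minimal Python sketch. It computes the MUE of the CCSD part against a CCSD/CBS reference, of the triples part against the difference of the CCSD(T)/CBS and CCSD/CBS references, and of the total. The function names and all numerical values are hypothetical placeholders chosen for illustration, not data from this work.

```python
from statistics import mean


def mue(values, references):
    """Mean unsigned error between computed and reference values."""
    if len(values) != len(references):
        raise ValueError("values and references must have equal length")
    return mean(abs(v - r) for v, r in zip(values, references))


def split_errors(ccsd, ccsd_t, ref_ccsd, ref_ccsd_t):
    """Return (MUE of CCSD part, MUE of triples part, MUE of total).

    The triples contribution is the difference CCSD(T) - CCSD, and its
    reference is the difference of the two CBS benchmarks.
    """
    triples = [t - c for t, c in zip(ccsd_t, ccsd)]
    ref_triples = [t - c for t, c in zip(ref_ccsd_t, ref_ccsd)]
    return (
        mue(ccsd, ref_ccsd),
        mue(triples, ref_triples),
        mue(ccsd_t, ref_ccsd_t),
    )


# Hypothetical interaction energies (kcal/mol) for three complexes.
ccsd = [-1.00, -2.10, -0.40]
ccsd_t = [-1.30, -2.50, -0.55]
ref_ccsd = [-1.10, -2.00, -0.45]
ref_ccsd_t = [-1.35, -2.45, -0.58]

mue_ccsd, mue_t, mue_total = split_errors(ccsd, ccsd_t, ref_ccsd, ref_ccsd_t)
```

With these made-up numbers the total MUE comes out smaller than the CCSD-only MUE, which is the kind of error cancellation between the two contributions that the text refers to.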
The addition of midbond functions has a slightly harmful effect on CCSD-F12a but significantly improves the interaction energies for both CCSD-F12b and conventional CCSD: as expected, the improvement is most impressive in aDZ and gradually decreases as the basis set is enlarged. Moreover, the CCSD and CCSD-F12b MUE values are always improved more by the addition of a larger midbond. The slight worsening of the unscaled (T) interaction energy term by the F12 treatment has been observed before [41]. In fact, similar considerations recently led Manna et al. [68] to propose a hybrid CCSD-F12b+δ(T) variant to compute benchmark interaction energies for water clusters [69,70]: the CCSD part of the benchmark is extrapolated from CCSD-F12b calculations but the triples part is extrapolated from conventional CCSD(T) results. For completeness, we


FIG. 4. Mean unsigned errors for different CCSD(T)/CCSD(T)-F12 variants in the aDZ, aDZ+(332), aDZ+(3322), and aDZ+(aDZ) basis sets, averaged over the S22 database.

FIG. 5. Mean unsigned errors for the separate CCSD and (T) interaction energy contributions (both conventional and explicitly correlated) in the aDZ, aDZ+(332), aDZ+(3322), and aDZ+(aDZ) basis sets, averaged over the A24 database.

have extracted the CCSD-F12b+δ(T) interaction energies from our calculations — the pertinent MUE values are reported in the Supporting Information. However, while such an approach is a viable way to determine gold-standard CCSD(T)/CBS values in large-basis calculations, it is obviously not the best choice for a silver-standard estimate: the conventional δ(T) values in double-zeta basis sets are not accurate enough. Figures 5–7 show that both (T)


FIG. 6. Mean unsigned errors for the separate CCSD and (T) interaction energy contributions (both conventional and explicitly correlated) in the aTZ, aTZ+(332), aTZ+(3322), and aTZ+(aTZ) basis sets, averaged over the A24 database.

FIG. 7. Mean unsigned errors for the separate CCSD and (T) interaction energy contributions (both conventional and explicitly correlated) in the aQZ, aQZ+(332), aQZ+(3322), and aQZ+(aQZ) basis sets, averaged over the A24 database.

and unscaled (T)-F12 contributions strongly benefit from the addition of midbond functions, and again the larger (3322) midbond set leads to more accurate results than the smaller (332) one. However, midbond functions are quite detrimental to the accuracy of the scaled (T∗∗)-F12 term in all basis sets tested, likely because the scaling overshoots further as the unscaled results get closer to the correct value. This overshooting is partially alleviated by the (Tbb)


triples scaling proposed in Ref. 55, as the scaling ratios in Eq. (5) are generally closer to one (a milder scaling) than those of Eq. (4). As a result, while the (Tbb)-F12 interaction energy contribution, computed without midbond, is less accurate than (T∗∗)-F12 (although still better than (T)-F12), the (Tbb)-F12 results improve impressively in the presence of midbond functions.

We have observed that the (Tbb)-F12 variant with midbond functions is the best strategy to recover the triples interaction energy contribution alone. Thus, it is worthwhile to go back to Figs. 1–4 and investigate how the combination of (Tbb)-F12 with CCSD-F12b (the best strategy to recover the CCSD interaction energy according to Figs. 5–7) recovers the complete CCSD(T)/CBS benchmark values. Interestingly, when no midbond functions are employed, the CCSD(Tbb)-F12b approach does not usually fare as well as the two balanced variants CCSD(T)-F12a and CCSD(T∗∗)-F12b. A similar hierarchy is observed for the aXZ+(aXZ) and aXZ+(332) calculations. However, the benefits of using the larger (3322) set of midbond functions are more pronounced, and more consistent, for CCSD(Tbb)-F12b than for either CCSD(T)-F12a or CCSD(T∗∗)-F12b. It appears that the combination of the best-performing CCSD variant and the best-performing (T) variant can only achieve maximum performance for the complete CCSD(T)/CBS interaction energy if a sufficiently large set of bond functions is used. The comparison of, e.g., Figs. 1 and 5 indicates that this maximum performance still involves a good deal of error cancellation: the lowest aDZ-level MUE values for any straight-up CCSD(T)/CCSD(T)-F12 calculation on the A24 database, 0.009 kcal/mol for CCSD(T)-F12a/aDZ+(332) and 0.010 kcal/mol for CCSD(Tbb)-F12b/aDZ+(3322), are lower than any aDZ-computed MUE for either CCSD or (T) alone.
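The difference between the two triples scalings can be sketched in a few lines of Python. Eqs. (4) and (5) are not reproduced in this part of the text, so the sketch assumes the commonly used definitions: the (T∗∗) ratio is taken as the dimer MP2-F12/MP2 correlation energy ratio and the (Tbb) ratio as the dimer CCSD-F12b/CCSD correlation energy ratio, each applied to the dimer and both monomers. These definitions, the function names, and all numbers are assumptions for illustration and should be checked against the equations themselves.

```python
def scaling_ratio(corr_f12, corr_conventional):
    """Ratio of explicitly correlated to conventional correlation energy."""
    return corr_f12 / corr_conventional


def triples_interaction(t_dimer, t_mon_a, t_mon_b, ratio=1.0):
    """Triples contribution to the counterpoise-corrected interaction energy.

    One common ratio (taken from the dimer) multiplies the dimer and both
    monomer (T) energies, which keeps the scaled term size consistent.
    ratio=1.0 gives the unscaled (T) contribution.
    """
    return ratio * (t_dimer - t_mon_a - t_mon_b)


# Hypothetical dimer correlation energies (hartree), for illustration only.
mp2_f12_corr, mp2_corr = -1.0500, -0.9000     # assumed inputs for the (T**) ratio
ccsd_f12b_corr, ccsd_corr = -1.0200, -0.9200  # assumed inputs for the (Tbb) ratio

ratio_tss = scaling_ratio(mp2_f12_corr, mp2_corr)
ratio_tbb = scaling_ratio(ccsd_f12b_corr, ccsd_corr)

# Hypothetical (T) energies (hartree) of the dimer and the two monomers.
t_dimer, t_a, t_b = -0.0500, -0.0240, -0.0250

e_t = triples_interaction(t_dimer, t_a, t_b)
e_tss = triples_interaction(t_dimer, t_a, t_b, ratio_tss)
e_tbb = triples_interaction(t_dimer, t_a, t_b, ratio_tbb)
```

In this made-up example the (Tbb) ratio is closer to one than the (T∗∗) ratio, so the (Tbb)-scaled triples term lies between the unscaled and the (T∗∗)-scaled values, mirroring the "milder scaling" described in the text.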
Having verified how the addition of midbond functions influences the performance of straight-up CCSD(T) and CCSD(T)-F12 computations, we now turn to the assessment of the composite MP2/CBS+δ(CCSD(T))/aXZ+(bond) approaches. The pertinent MUE values for the A24/aDZ, A24/aTZ, A24/aQZ, and S22/aDZ combinations of the database and the atom-centered part of the basis set are presented in Figs. 8, 9, 10, and 11, respectively; the corresponding MURE data are given in the Supporting Information. On the one hand, the inclusion of the leading MP2 interaction energy contribution at the CBS limit reduces the errors of conventional (non-F12) approaches immensely. On the other hand, when only the δ(CCSD(T)) term is computed instead of the full CCSD(T), the improvement afforded by the F12 approaches is greatly reduced and sometimes vanishes altogether. Importantly, the MP2/CBS+δ(CCSD(T∗∗)-F12a) and MP2/CBS+δ(CCSD(T)-F12b) approaches remain inferior to the MP2/CBS+δ(CCSD(T)-F12a), MP2/CBS+δ(CCSD(T∗∗)-F12b), and MP2/CBS+δ(CCSD(Tbb)-F12b) ones (and to conventional MP2/CBS+δ(CCSD(T))). For example, for the A24 database in the aDZ atom-centered basis, the MUE values for different choices of midbond (or no midbond at all) are in the range 0.04–0.08 kcal/mol for both MP2/CBS+δ(CCSD(T∗∗)-F12a) and MP2/CBS+δ(CCSD(T)-F12b); the corresponding MUE values from the balanced F12 variants are 0.01–0.03 kcal/mol for MP2/CBS+δ(CCSD(T)-F12a) and MP2/CBS+δ(CCSD(T∗∗)-F12b), and around 0.03 kcal/mol for conventional MP2/CBS+δ(CCSD(T)). The balanced F12 composite variants remain substantially superior to the two imbalanced F12 flavors for the aTZ and aQZ atomic basis sets as well as for the S22/aDZ combination. In the latter case, conventional MP2/CBS+δ(CCSD(T)) interaction energies are just as good as the balanced F12 composite results, whereas for A24/aTZ and especially A24/aQZ the explicitly correlated δ(CCSD(T)) values converge faster.
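How a composite MP2/CBS+δ(CCSD(T)) estimate is assembled can be shown with a short Python sketch. It assumes the standard two-point X⁻³ extrapolation of the MP2 correlation contribution (with the Hartree-Fock part taken from the larger basis); this section does not state the exact extrapolation scheme, so that choice, the function names, and all numbers below are illustrative assumptions.

```python
def cbs_two_point(e_small, e_large, x_small, x_large):
    """Two-point X**-3 extrapolation of a correlation-type quantity.

    Assumes E(X) = E_CBS + A / X**3 and eliminates A from two basis sets
    with cardinal numbers x_small < x_large.
    """
    w_small, w_large = x_small ** 3, x_large ** 3
    return (w_large * e_large - w_small * e_small) / (w_large - w_small)


def composite_interaction(mp2_cbs, cc_small, mp2_small):
    """MP2/CBS + delta(CCSD(T)), with delta evaluated in one small basis."""
    delta = cc_small - mp2_small
    return mp2_cbs + delta


# Hypothetical interaction energy components (kcal/mol), illustration only.
hf_aqz = 2.00                   # Hartree-Fock part in the larger (aQZ) basis
mp2_corr_atz = -4.90            # MP2 correlation part, X = 3
mp2_corr_aqz = -4.9578125       # MP2 correlation part, X = 4

mp2_corr_cbs = cbs_two_point(mp2_corr_atz, mp2_corr_aqz, 3, 4)
mp2_cbs = hf_aqz + mp2_corr_cbs

# Coupled-cluster and MP2 interaction energies in the same small basis.
e_composite = composite_interaction(mp2_cbs, cc_small=-2.60, mp2_small=-2.90)
```

Because δ(CCSD(T)) is a difference taken in a single basis, much of the basis set incompleteness error of the coupled-cluster and MP2 terms cancels, which is consistent with the small effect of F12 and midbond functions on the composite results noted above.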
The presence or absence of midbond functions has only a secondary effect on the accuracy of the composite approach; however, if the atom-centered part of the basis is aDZ, the effect of midbond on conventional and balanced F12 results is usually beneficial. The influence of bond functions on the A24/aTZ and A24/aQZ results is so tiny, and the results already so accurate, that no conclusions should be drawn from the directions of the MUE/MURE changes in this case. An interesting exception is the set of MP2/CBS+δ(CCSD(Tbb)-F12b) interaction energies, which are overwhelmingly improved by the addition of any midbond set: without bond functions, the MP2/CBS+δ(CCSD(T)-F12a) and MP2/CBS+δ(CCSD(T∗∗)-F12b) approaches are more accurate, but the MP2/CBS+δ(CCSD(Tbb)-F12b) results with midbond often outperform all other variants.

Overall, the straight-up CCSD(Tbb)-F12b/aDZ+(3322) calculation is always a very good choice. Among all 56 single-step and composite aDZ-level CCSD(T)/CBS estimates for a given database, presented in Figs. 1, 4, 8, 11, and the Supporting Information, CCSD(Tbb)-F12b/aDZ+(3322) is very near the top for both A24 (a MUE of 0.010 kcal/mol versus 0.008 kcal/mol for MP2/CBS+δ(CCSD(Tbb)-F12b)/aDZ+(332)) and S22 (a MUE of 0.042 kcal/mol versus 0.030 kcal/mol for DW-CCSD(T∗∗)-F12/aDZ+(aDZ)). As the CCSD(Tbb)-F12b/aDZ+(3322) method relies less than the other variants on an error cancellation between the CCSD part and the triples part, it is at this point our top candidate for computing “silver standard” interaction energies that only require aDZ-level coupled-cluster calculations. We have demonstrated the ability of aDZ-level calculations to deliver highly accurate interaction energies as long as an appropriate combination of F12 variants and bond functions is selected.
As the next step, we turn to investigating whether a further reduction of the atom-centered basis set, in particular, a removal of some or all diffuse functions, can improve the computational efficiency without a highly adverse effect on the accuracy. For this purpose, we examine the effects of successively deleting diffuse functions from aDZ within a sequence of “calendar” basis sets [49]: jul-cc-pVDZ≡jul-DZ, jun-cc-pVDZ≡jun-DZ, and finally nonaugmented cc-pVDZ≡DZ. Furthermore, we limit our consideration of bond


FIG. 8. Mean unsigned errors for different composite MP2/CBS+δ(CCSD(T)) variants in the aDZ, aDZ+(332), aDZ+(3322), and aDZ+(aDZ) basis sets, averaged over the A24 database.

FIG. 9. Mean unsigned errors for different composite MP2/CBS+δ(CCSD(T)) variants in the aTZ, aTZ+(332), aTZ+(3322), and aTZ+(aTZ) basis sets, averaged over the A24 database.

functions to the largest (3322) set. The resulting MUE values for conventional CCSD(T) and different CCSD(T)-F12 variants are presented in Figs. 12 and 13 for the A24 and S22 databases, respectively. The jul-DZ set is equivalent to aDZ for non-hydrogen atoms and DZ for hydrogens, while jun-DZ differs from jul-DZ by the absence of the diffuse d basis function for non-hydrogen atoms. Accordingly, the auxiliary basis sets for F12 calculations were selected


FIG. 10. Mean unsigned errors for different composite MP2/CBS+δ(CCSD(T)) variants in the aQZ, aQZ+(332), aQZ+(3322), and aQZ+(aQZ) basis sets, averaged over the A24 database.

FIG. 11. Mean unsigned errors for different composite MP2/CBS+δ(CCSD(T)) variants in the aDZ, aDZ+(332), aDZ+(3322), and aDZ+(aDZ) basis sets, averaged over the S22 database.

as nonaugmented cc-pVDZ/MP2FIT and cc-pVDZ/JKFIT variants for all atoms lacking any diffuse functions, and augmented aug-cc-pVDZ/MP2FIT and aug-cc-pVDZ/JKFIT sets for all atoms possessing a full or partial (in case of jun-DZ) set of diffuse functions. The first observation from Figs. 12 and 13 is that the accuracy of midbondless results strongly increases with


FIG. 12. Mean unsigned errors for different CCSD(T)/CCSD(T)-F12 variants in the “calendar” sequence of double-zeta basis sets, averaged over the A24 database. The leftmost set of data for each method corresponds to the nonaugmented DZ basis.

FIG. 13. Mean unsigned errors for different CCSD(T)/CCSD(T)-F12 variants in the “calendar” sequence of double-zeta basis sets, averaged over the S22 database. The leftmost set of data for each method corresponds to the nonaugmented DZ basis.

the addition of diffuse functions. The results with the (3322) midbond (which are significantly more accurate than those without midbond) show the same trend with the exception of the CCSD(T∗∗)-F12a approach that once again exhibits erratic behavior when the augmentation level is increased. Without midbond, the largest increase in accuracy is clearly afforded by going from jun-DZ to jul-DZ: the former is nearly as bad as DZ and the latter nearly


as good as aDZ. Most results with bond functions also improve significantly when going from jun-DZ+(3322) to jul-DZ+(3322); however, even the former results can be quite accurate (the lowest MUE values at this level are 0.028 kcal/mol (CCSD(T∗∗)-F12b/jun-DZ+(3322)) for A24 and 0.086 kcal/mol (CCSD(T∗∗)-F12a/jun-DZ+(3322)) for S22). The CCSD(Tbb)-F12b variant, recommended above for full aDZ+(3322), remains one of the top performers in jul-DZ+(3322), and the atom-centered basis set reduction leads to only a small drop in accuracy: the MUE increases from 0.010 kcal/mol to 0.013 kcal/mol for A24 and from 0.042 kcal/mol to 0.057 kcal/mol for S22. Incidentally, some of the other balanced F12 results, CCSD(T)-F12a and CCSD(T∗∗)-F12b, become slightly more accurate when going from aDZ+(3322) to jul-DZ+(3322), and the lowest MUE value in all of Fig. 12, 0.008 kcal/mol, belongs to CCSD(T)-F12a/jul-DZ+(3322) (for the S22 database and Fig. 13, the CCSD(Tbb)-F12b/aDZ+(3322) MUE of 0.042 kcal/mol mentioned above is the lowest one). We conclude from Figs. 12 and 13 that while midbond functions substantially improve the accuracy in any atom-centered basis set, at least a partial augmentation (preferably at the jul-DZ level) of the latter is required to obtain silver-standard quality results.

In order to find out how a removal of diffuse functions affects the composite MP2/CBS+δ(CCSD(T)) interaction energies, we have plotted the MUE values for various variants, with δ(CCSD(T)) computed in a double-zeta-level set, in the Supporting Information. Without midbond, a larger number of diffuse functions is always beneficial for the δ(CCSD(T)) accuracy, at least for conventional CCSD(T) and the balanced CCSD(T)-F12 variants. When midbond functions are present, the diffuse functions have a minor effect on the performance of the composite methods; however, having some diffuse functions is in most cases better than having none at all.
Overall, we observe that the straight-up CCSD(T)-F12 and composite MP2/CBS+δ(CCSD(T)-F12) calculations, in their best chosen variants, afford virtually the same accuracy of benchmark interaction energies. However, while stand-alone CCSD(T)-F12 needs at least the jul-DZ+(3322) basis set for silver-standard accuracy, the composite approach can include δ(CCSD(T)-F12) computed in jun-DZ+(3322) or even DZ+(3322).

Our findings about the performance of various combinations of basis sets, midbond functions, and explicitly correlated approaches need to be put into the context of their relative computational efficiency. For this purpose, we selected the parallel-displaced benzene dimer as a representative of the S22 database that is neither one of the smallest nor one of the largest complexes in this set. We compared the Molpro 2012.1 wall times for different approaches utilizing all 16 AMD Opteron 4386 cores on a single cluster node. We took into account the fact that the CCSD(Tbb)-F12b and CCSD-F12b+δ(T) approaches require two computations, a conventional one and an F12 one (one with triples and one without), and that all composite variants require an additional MP2/CBS calculation (which, for the purpose of timings, was represented as a sequence of DF-MP2 calculations in the aTZ and aQZ basis sets). All timings include the dimer calculation as well as the counterpoise-corrected calculations for monomers (technically, the CCSD(Tbb)-F12b calculation only needs the dimer CCSD correlation energy, but we ran the monomers as well so all three scaling factors are available). We will measure the relative computational complexity of a given variant by the ratio of its wall time to the wall time of the cheapest variant considered, conventional CCSD(T)/DZ.
The resulting relative timings as well as the corresponding accuracies (quantified by the mean of the MUE values for the A24 and S22 sets, as exhibited by the best performing variant of a given complexity) are displayed in Table I (the (332) midbond was omitted as its intermittent slight superiority over (3322) is clearly accidental). The maximum unsigned errors on each database are also listed in Table I. A complete table listing the accuracy and relative timings of all variants considered is given in the Supporting Information. One should pay special attention to methods that constitute “Pauling points”, that is, exhibit higher accuracy (in terms of MUE for each separate database) than all basis set, bond function, and CCSD(T)/CCSD(T)-F12 variant combinations that are computationally cheaper. Such methods, which are clearly the best choices at a given level of complexity, are distinguished by the MUE values given in bold in Table I. One should also note the position of the current silver standard [31], DW-CCSD(T∗∗)-F12/aDZ (which gives a lower average MUE than any individual CCSD(T)-F12 variant in this basis, so it is present in Table I), and of the historically most important estimate of this kind, MP2/CBS+δ(CCSD(T))/aDZ, employed e.g. in the construction of the highly popular S66x8 benchmark database [71]. These two reference levels exhibit similar complexity for the benzene dimer, both requiring calculations 9–10 times longer than CCSD(T)/DZ. While both approaches are quite accurate, neither of them represents a Pauling point for either A24 or S22.

The relative timings in Table I indicate that the addition of (3322) midbond functions leads to only a modest increase in complexity: it is always cheaper to include bond functions than to increase the augmentation level of the basis set or to switch from conventional to explicitly correlated CCSD(T). The latter two improvements bring about a comparable increase in calculation time.
It should be noted that even the most expensive double-zeta level calculation, MP2/CBS+δ(CCSD(Tbb)-F12b)/aDZ+(3322), with a relative wall time of 21.8, is still much less involved than the cheapest triple-zeta level calculation, CCSD(T)/TZ (a relative wall time of 29.9). This stresses the importance of DZ-level silver standards as the only viable option to obtain reliable benchmarks for a large class of complexes. Moreover, the recently constructed aug-cc-pVDZ-F12 basis set [43] is substantially larger than aDZ, so that the relative timings of the CCSD(T)/aug-cc-pVDZ-F12 and CCSD(T)-F12/aug-cc-pVDZ-F12 calculations for the benzene dimer are 42.7 and 51.1, respectively. Thus, this basis set can only be a sensible option for benchmark calculations if it leads


TABLE I. The best performing approaches (as classified by the arithmetic mean of the A24 and S22 MUE values) at a given computational cost, measured by the calculation time for the parallel-displaced benzene dimer relative to CCSD(T)/DZ (column ‘Time’). The maximum unsigned errors (MaxUE) and the basis set size for the benzene dimer (‘Size’) are also given. The “Pauling-point” MUE values, that is, the errors that are lower than those of any cheaper method, are marked in bold. The MUE and MaxUE unit is kcal/mol.

 Time  Method                        Basis   Bond    Size  MUE(A24)  MUE(S22)  MaxUE(A24)  MaxUE(S22)
 1.00  CCSD(T)                       DZ      -        228     0.970     2.764       2.823       6.335
 1.64  CCSD(T)                       DZ      (3322)   264     0.275     1.327       1.212       4.270
 1.89  CCSD(T)                       jun-DZ  -        276     0.804     2.242       2.017       5.134
 2.45  CCSD(T**)-F12a                DZ      -        228     0.365     0.908       0.908       2.158
 2.94  CCSD(Tbb)-F12b                DZ      -        228     0.436     1.136       1.131       2.856
 3.06  CCSD(T)                       jun-DZ  (3322)   312     0.138     1.023       0.520       2.842
 3.51  MP2/CBS+δ(CCSD(T))            DZ      -        228     0.093     0.222       0.275       0.659
 3.59  CCSD(T**)-F12a                DZ      (3322)   264     0.094     0.314       0.587       1.251
 4.08  CCSD(T)                       jul-DZ  -        336     0.486     1.168       1.002       2.970
 4.15  MP2/CBS+δ(CCSD(T))            DZ      (3322)   264     0.039     0.096       0.141       0.339
 4.34  CCSD(T**)-F12a                jun-DZ  -        276     0.249     0.537       0.542       0.957
 4.36  CCSD(Tbb)-F12b                DZ      (3322)   264     0.149     0.518       0.667       1.871
 4.40  MP2/CBS+δ(CCSD(T))            jun-DZ  -        276     0.070     0.155       0.176       0.383
 4.95  MP2/CBS+δ(CCSD(T)-F12a)       DZ      -        228     0.079     0.159       0.241       0.472
 5.22  CCSD(Tbb)-F12b                jun-DZ  -        276     0.326     0.770       0.763       1.597
 5.45  MP2/CBS+δ(CCSD(Tbb)-F12b)     DZ      -        228     0.100     0.156       0.271       0.462
 5.57  MP2/CBS+δ(CCSD(T))            jun-DZ  (3322)   312     0.039     0.076       0.116       0.224
 5.90  CCSD(T)                       jul-DZ  (3322)   372     0.095     0.643       0.379       2.073
 6.05  CCSD(T**)-F12a                jun-DZ  (3322)   312     0.036     0.086       0.091       0.235
 6.10  MP2/CBS+δ(CCSD(T)-F12a)       DZ      (3322)   264     0.021     0.064       0.082       0.215
 6.57  CCSD(T)                       aDZ     -        384     0.375     1.026       0.910       2.802
 6.58  MP2/CBS+δ(CCSD(T))            jul-DZ  -        336     0.048     0.105       0.111       0.350
 6.81  CCSD(T**)-F12a                jul-DZ  -        336     0.078     0.133       0.266       0.401
 6.85  MP2/CBS+δ(CCSD(T)-F12a)       jun-DZ  -        276     0.059     0.108       0.156       0.257
 6.87  MP2/CBS+δ(CCSD(Tbb)-F12b)     DZ      (3322)   264     0.013     0.063       0.026       0.253
 7.36  CCSD(T)                       aDZ     (aDZ)    393     0.242     0.896       0.719       2.592
 7.45  CCSD(Tbb)-F12b                jun-DZ  (3322)   312     0.039     0.252       0.114       0.886
 7.72  MP2/CBS+δ(CCSD(Tbb)-F12b)     jun-DZ  -        276     0.079     0.117       0.185       0.337
 8.40  MP2/CBS+δ(CCSD(T))            jul-DZ  (3322)   372     0.031     0.065       0.106       0.167
 8.55  MP2/CBS+δ(CCSD(T)-F12a)       jun-DZ  (3322)   312     0.020     0.038       0.061       0.161
 8.58  CCSD(Tbb)-F12b                jul-DZ  -        336     0.145     0.238       0.353       0.458
 9.07  MP2/CBS+δ(CCSD(T))            aDZ     -        384     0.030     0.083       0.086       0.289
 9.32  MP2/CBS+δ(CCSD(T**)-F12b)     jul-DZ  -        336     0.053     0.095       0.130       0.331
 9.39  CCSD(T**)-F12b                jul-DZ  (3322)   372     0.016     0.050       0.041       0.158
 9.72  CCSD(T)                       aDZ     (3322)   420     0.089     0.607       0.372       2.028
 9.87  MP2/CBS+δ(CCSD(T))            aDZ     (aDZ)    393     0.026     0.076       0.096       0.244
 9.96  MP2/CBS+δ(CCSD(Tbb)-F12b)     jun-DZ  (3322)   312     0.012     0.044       0.034       0.199
10.27  DW-CCSD(T**)-F12              aDZ     -        384     0.054     0.056       0.144       0.153
11.09  MP2/CBS+δ(CCSD(Tbb)-F12b)     jul-DZ  -        336     0.068     0.122       0.159       0.378
11.49  DW-CCSD(T**)-F12              aDZ     (aDZ)    393     0.025     0.030       0.093       0.095
11.90  MP2/CBS+δ(CCSD(T)-F12a)       jul-DZ  (3322)   372     0.013     0.056       0.053       0.200
11.97  CCSD(Tbb)-F12b                jul-DZ  (3322)   372     0.013     0.057       0.035       0.212
12.22  MP2/CBS+δ(CCSD(T))            aDZ     (3322)   420     0.033     0.066       0.108       0.152
12.78  MP2/CBS+δ(DW-CCSD(T**)-F12)   aDZ     -        384     0.028     0.047       0.121       0.167
13.18  CCSD(Tbb)-F12b                aDZ     -        384     0.092     0.167       0.216       0.402
13.99  MP2/CBS+δ(CCSD(T**)-F12b)     aDZ     (aDZ)    393     0.015     0.074       0.059       0.249
14.48  MP2/CBS+δ(CCSD(Tbb)-F12b)     jul-DZ  (3322)   372     0.008     0.048       0.030       0.227
14.74  CCSD(Tbb)-F12b                aDZ     (aDZ)    393     0.051     0.122       0.162       0.349
14.84  CCSD(T)-F12a                  aDZ     (3322)   420     0.011     0.067       0.034       0.191
15.69  MP2/CBS+δ(CCSD(Tbb)-F12b)     aDZ     -        384     0.045     0.087       0.131       0.332
17.25  MP2/CBS+δ(CCSD(Tbb)-F12b)     aDZ     (aDZ)    393     0.022     0.069       0.091       0.300
17.35  MP2/CBS+δ(CCSD(T)-F12a)       aDZ     (3322)   420     0.015     0.050       0.055       0.192
19.26  CCSD(Tbb)-F12b                aDZ     (3322)   420     0.010     0.042       0.024       0.212
21.76  MP2/CBS+δ(CCSD(Tbb)-F12b)     aDZ     (3322)   420     0.010     0.050       0.024       0.219
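The "Pauling-point" criterion stated in the caption of Table I (an error lower than that of every cheaper method) can be checked mechanically. The following is a minimal sketch rather than the procedure used in this work: it covers only the ten cheapest entries of Table I, and the function name `pauling_points` and the tuple layout are assumptions introduced for illustration.

```python
# Ten cheapest entries of Table I:
# (relative time, label, MUE on A24, MUE on S22), errors in kcal/mol.
ROWS = [
    (1.00, "CCSD(T)/DZ", 0.970, 2.764),
    (1.64, "CCSD(T)/DZ+(3322)", 0.275, 1.327),
    (1.89, "CCSD(T)/jun-DZ", 0.804, 2.242),
    (2.45, "CCSD(T**)-F12a/DZ", 0.365, 0.908),
    (2.94, "CCSD(Tbb)-F12b/DZ", 0.436, 1.136),
    (3.06, "CCSD(T)/jun-DZ+(3322)", 0.138, 1.023),
    (3.51, "MP2/CBS+δ(CCSD(T))/DZ", 0.093, 0.222),
    (3.59, "CCSD(T**)-F12a/DZ+(3322)", 0.094, 0.314),
    (4.08, "CCSD(T)/jul-DZ", 0.486, 1.168),
    (4.15, "MP2/CBS+δ(CCSD(T))/DZ+(3322)", 0.039, 0.096),
]


def pauling_points(rows, column):
    """Return labels whose error in `column` beats that of every cheaper row."""
    best = float("inf")
    labels = []
    for row in sorted(rows, key=lambda r: r[0]):  # walk in order of increasing cost
        if row[column] < best:
            best = row[column]
            labels.append(row[1])
    return labels


a24_points = pauling_points(ROWS, 2)
s22_points = pauling_points(ROWS, 3)
# "Double" Pauling points are those shared by both databases.
double_points = [label for label in a24_points if label in s22_points]
print(double_points)
```

Within this subset, the double Pauling points are the two cheapest CCSD(T) variants followed by MP2/CBS+δ(CCSD(T))/DZ and MP2/CBS+δ(CCSD(T))/DZ+(3322), in line with the discussion of Table I.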


to significantly higher interaction energy accuracy than aDZ. However, this is not the case: the MUE values of different straight-up and composite CCSD(T)/CBS estimates for the A24 database, presented in the Supporting Information, show that the aug-cc-pVDZ-F12 basis set provides only a modest improvement over ordinary aDZ. Moreover, while some improvement is always present in midbondless bases, the relative accuracy of aDZ and aug-cc-pVDZ-F12 with the (332) and/or (3322) midbond sets does not follow a consistent pattern. Therefore, just like its nonaugmented counterpart (Ref. 38), aug-cc-pVDZ-F12 does not appear to be a good choice for benchmark interaction energy calculations.

Let us now examine the Pauling points in Table I, especially the ones that are common to both databases. Disregarding the two cheapest variants that are clearly far from benchmark accuracy, the first noteworthy option is MP2/CBS+δ(CCSD(T))/DZ. Considering the commonly listed target of 0.1 kcal/mol for benchmark accuracy, this inexpensive variant barely makes the cut for the A24 set but is not accurate enough for the S22 one. Thus, as already observed by several research groups, including ours (Refs. 24, 25, 71), the nonaugmented DZ basis set does not provide δ(CCSD(T)) correction values of benchmark quality. The accuracy improves substantially upon adding midbond functions: the next Pauling point for either database, MP2/CBS+δ(CCSD(T))/DZ+(3322), affords a more than twofold reduction of MUE and brings the S22 accuracy (barely) within the 0.1 kcal/mol range. From this point on, all Pauling-point variants include bond functions. The next two approaches that improve the MUE for both databases relative to all cheaper variants are MP2/CBS+δ(CCSD(T)-F12a)/DZ+(3322) and MP2/CBS+δ(CCSD(Tbb)-F12b)/DZ+(3322): once the composite approach with midbond functions is selected as the best choice, the F12 calculations provide an additional benefit.
For the latter variant, the average errors of 0.013 and 0.063 kcal/mol for A24 and S22, respectively, are truly remarkable given that the only diffuse functions present in the coupled-cluster calculation are those centered on the midbond. Incidentally, MP2/CBS+δ(CCSD(Tbb)-F12b)/DZ+(3322) is also the last double Pauling point in Table I: while the errors for each of the two datasets can be reduced somewhat further, the lowest MUE for each set is attained by a different variant. The overall winners in Table I regardless of the computational cost are MP2/CBS+δ(CCSD(Tbb)-F12b)/jul-DZ+(3322) (MUE of 0.008 kcal/mol) for A24 and DW-CCSD(T**)-F12/aDZ+(aDZ) (MUE of 0.030 kcal/mol) for S22. However, the differences between top-performing approaches are small at this level. Importantly, the CCSD(Tbb)-F12b/aDZ+(3322) choice, which we have praised above for its consistency and low reliance on error cancellation, gives MUE values for both datasets very close to the top performers.

As expected, the silver standard proposed in Ref. 31, DW-CCSD(T**)-F12/aDZ, performs well for the S22 database; after all, the parameters for the dispersion weighting have been optimized (Ref. 33) on this very dataset. Nevertheless, it is not a Pauling point: three cheaper variants in Table I give slightly lower average errors. The performance of DW-CCSD(T**)-F12/aDZ for the A24 dataset is respectable but not great: many approaches above it in Table I afford higher accuracy. Moreover, as argued above, the dispersion-weighted variant benefits little from bond functions, except for the smallest (aDZ) midbond set, which improves the straight-up DW-CCSD(T**)-F12 results and slightly worsens the composite ones. The conventional MP2/CBS+δ(CCSD(T))/aDZ benchmark level performs somewhat better than DW-CCSD(T**)-F12/aDZ for A24 but worse for S22.
Overall, while the accuracy of this level is quite acceptable, there exist other variants, some of them computationally cheaper, that lead to still lower average errors for both datasets. For example, the MP2/CBS+δ(CCSD(T)-F12a)/jun-DZ+(3322) approach is computationally less demanding, but more accurate, than either DW-CCSD(T**)-F12/aDZ or MP2/CBS+δ(CCSD(T))/aDZ.

Based on the simultaneous consideration of the A24 and S22 accuracy and computational efficiency, several recommended approaches can be identified from Table I. The aforementioned CCSD(Tbb)-F12b/aDZ+(3322) variant (MUE of 0.010 and 0.042 kcal/mol for A24 and S22, respectively) and its composite MP2/CBS+δ(CCSD(Tbb)-F12b)/aDZ+(3322) counterpart (MUE of 0.010 kcal/mol for A24 and 0.050 kcal/mol for S22) offer impressive and consistent accuracy. While these approaches are among the most expensive at the double-zeta level (due to an additional CCSD calculation needed for the scaling factor), they are nevertheless significantly more accessible than any triple-zeta calculations, including the "gold standard" MP2/CBS+δ(CCSD(T))/aTZ one. Accordingly, we recommend CCSD(Tbb)-F12b/aDZ+(3322) as the "silver-plus" standard. A cheaper alternative (about two times faster for the benzene dimer) is the MP2/CBS+δ(CCSD(T)-F12a)/jun-DZ+(3322) approach, which incidentally leads to the second lowest MUE overall for the S22 database (0.038 kcal/mol) and remains very accurate (MUE of 0.020 kcal/mol) for the A24 dataset. This approach is both somewhat cheaper and more accurate than the established "silver standard" DW-CCSD(T**)-F12/aDZ (Ref. 31), and we recommend it as a replacement "silver standard". The cheapest approach to attain the 0.1 kcal/mol target accuracy, MP2/CBS+δ(CCSD(T))/DZ+(3322), is another factor of two faster for the benzene dimer and gives MUE of 0.039 and 0.096 kcal/mol for A24 and S22, respectively. This variant gets our recommendation as the "silver-minus" standard.

Finally, the MP2/CBS+δ(CCSD(T)-F12a)/DZ+(3322) and MP2/CBS+δ(CCSD(Tbb)-F12b)/DZ+(3322) approaches, with their computational complexity in between the "silver" and "silver-minus" treatments, are both more accurate on either database than any cheaper variant. In addition, the DW-CCSD(T**)-F12/aDZ+(aDZ) method is only slightly more expensive than DW-CCSD(T**)-F12/aDZ but significantly more accurate, leading to the lowest overall MUE on S22 (0.030 kcal/mol) of all variants considered. These three methods are also excellent choices.
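The cost and accuracy relations among the recommended standards can be verified with a few lines of arithmetic. The sketch below is illustrative only: the numbers are the relative timings and MUE values quoted in Table I, whereas the dictionary layout and the helper names `meets_target` and `speedup` are assumptions made for this example.

```python
# (relative time for the benzene dimer, MUE on A24, MUE on S22); errors in kcal/mol.
STANDARDS = {
    "silver-minus": (4.15, 0.039, 0.096),   # MP2/CBS+δ(CCSD(T))/DZ+(3322)
    "silver": (8.55, 0.020, 0.038),         # MP2/CBS+δ(CCSD(T)-F12a)/jun-DZ+(3322)
    "silver-plus": (19.26, 0.010, 0.042),   # CCSD(Tbb)-F12b/aDZ+(3322)
}
OLD_SILVER = (10.27, 0.054, 0.056)          # DW-CCSD(T**)-F12/aDZ of Ref. 31
TARGET = 0.1                                # benchmark-accuracy target, kcal/mol


def meets_target(entry, target=TARGET):
    """True if both the A24 and the S22 MUE fall below the target."""
    _, mue_a24, mue_s22 = entry
    return mue_a24 < target and mue_s22 < target


def speedup(slow, fast):
    """Ratio of relative timings (slow / fast)."""
    return slow[0] / fast[0]


plus_vs_silver = speedup(STANDARDS["silver-plus"], STANDARDS["silver"])
silver_vs_minus = speedup(STANDARDS["silver"], STANDARDS["silver-minus"])
# The new "silver" level is cheaper and more accurate than the old one on both sets.
silver_beats_old = all(new < old for new, old in zip(STANDARDS["silver"], OLD_SILVER))

print(f"silver-plus / silver cost ratio:  {plus_vs_silver:.2f}")
print(f"silver / silver-minus cost ratio: {silver_vs_minus:.2f}")
print("all three meet the target:", all(meets_target(e) for e in STANDARDS.values()))
```

Both cost ratios come out slightly above two, consistent with the "about two times faster" statements in the text.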


IV. SUMMARY

Computing benchmark noncovalent interaction energies requires a reliable estimate of the complete basis set limit at the CCSD(T) level. For a large class of important complexes, CCSD(T) calculations can only be performed in a double-zeta basis set, and the accuracy of the resulting CCSD(T)/CBS estimate strongly depends on the particular coupled-cluster variant employed: for example, whether one uses straight-up CCSD(T) or composite MP2/CBS+δ(CCSD(T)), and which approximate CCSD(T)-F12 approach has been selected. While the performance of different CCSD(T)/CCSD(T)-F12 flavors and atom-centered basis sets for noncovalent interaction energies has been thoroughly studied (most recently, in Ref. 38), the benefits of adding basis functions centered on the intermolecular midbond have not been explored beyond a few model complexes (Refs. 40, 41). Therefore, in this work we extend the investigations of Sirianni et al. (Ref. 38), performed over the A24 (Ref. 39) and S22 (Ref. 23) benchmark interaction energy databases, to basis sets with midbond functions. We investigate the performance of both the constant-midbond (332) and (3322) sets and a variable-midbond hydrogenic (aXZ) set with the cardinal number X changing together with the atom-centered part of the basis.

The effect of midbond functions on the CBS convergence strongly varies with the CCSD(T)/CCSD(T)-F12 variant employed. The presence of bond functions is uniformly beneficial for conventional CCSD(T) and CCSD(T)-F12b and mostly harmful for CCSD(T**)-F12a. The other two variants, CCSD(T)-F12a and CCSD(T**)-F12b, outperform CCSD(T**)-F12a and CCSD(T)-F12b (especially in larger basis sets) and generally benefit from the inclusion of midbond functions, although the improvement is not always systematic and larger midbond sets are not guaranteed to give better results. The accuracy trends of different approximations are rationalized by examining the separate CCSD and (T) interaction energy contributions.
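The (T**) and (Tbb) labels differ only in how the perturbative triples term is rescaled: (T**) multiplies it by the ratio of the MP2-F12 and MP2 correlation energies, while (Tbb) uses the ratio of the CCSD-F12b and CCSD correlation energies. The following minimal sketch shows this arithmetic for a single calculation. The function name `scaled_triples` is an assumption, and all energies are hypothetical values chosen only to illustrate the formula, not results from this work.

```python
def scaled_triples(e_triples, e_corr_f12, e_corr_conventional):
    """Scale a conventional (T) energy by a ratio of two correlation energies."""
    return e_triples * (e_corr_f12 / e_corr_conventional)


# Hypothetical energies in hartree (illustration only).
E_T = -0.0100                           # unscaled (T) contribution
E_MP2, E_MP2_F12 = -0.2500, -0.3000     # MP2 and MP2-F12 correlation energies
E_CCSD, E_CCSD_F12B = -0.2600, -0.2900  # CCSD and CCSD-F12b correlation energies

t_star = scaled_triples(E_T, E_MP2_F12, E_MP2)    # (T**): MP2-F12 / MP2 ratio
t_bb = scaled_triples(E_T, E_CCSD_F12B, E_CCSD)   # (Tbb): CCSD-F12b / CCSD ratio

print(f"(T)   = {E_T:.6f}")
print(f"(T**) = {t_star:.6f}")
print(f"(Tbb) = {t_bb:.6f}")
```

With these hypothetical inputs the (Tbb) factor is closer to unity than the (T**) one, which is the sense in which the CCSD-based scaling is milder.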
While the CCSD part strongly benefits from the F12 treatment, and both CCSD and CCSD-F12b are made substantially more accurate by midbond functions, the converged CCSD-F12b amplitudes actually perform slightly worse than the conventional CCSD ones in the perturbative calculation of the triples interaction energy contribution. A scaling of the (T) term by the ratio of the MP2-F12 and MP2 correlation energies, as included in CCSD(T**)-F12a and CCSD(T**)-F12b, strongly improves the midbondless results but generally overshoots. As a result, the inclusion of midbond functions actually worsens the (T**)-F12 accuracy. The remedy for the overshooting is the recently proposed milder scaling utilizing the ratio of the CCSD-F12b and CCSD correlation energies (Ref. 55). The resulting (Tbb)-F12 interaction energy contributions with midbond functions outperform all other triples estimates. As such, the CCSD(Tbb)-F12b combination is the variant that relies least on an error cancellation between the CCSD-F12b and (T) interaction energy contributions (although some error cancellation does take place). Indeed, the CCSD(Tbb)-F12b accuracy strongly benefits from the inclusion of midbond functions, larger midbond sets provide more improvement, and the CCSD(Tbb)-F12b/aXZ+(3322) variant is among the most accurate ones for both databases and all X values considered.

The composite MP2/CBS+δ(CCSD(T)) approach exhibits higher accuracy than the corresponding straight-up CCSD(T) computation, although the improvement decreases significantly with the basis set increase and/or the F12 treatment as the straight-up results become more converged. The real power of the combination of midbond functions and the composite approach is its ability to make up for the deficiencies in the atom-centered part of the basis set, in particular, for a partial (or even complete) lack of diffuse functions.
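The composite estimate discussed above can be written out explicitly. The sketch below assumes the usual definition of the δ(CCSD(T)) correction as the difference between the CCSD(T) and MP2 interaction energies computed in the same small basis. The function names are assumptions, and the interaction energies are hypothetical values for illustration, not data from this work.

```python
def delta_ccsdt(e_int_ccsdt_small, e_int_mp2_small):
    """δ(CCSD(T)): CCSD(T) minus MP2 interaction energy in the same small basis."""
    return e_int_ccsdt_small - e_int_mp2_small


def composite_estimate(e_int_mp2_cbs, e_int_ccsdt_small, e_int_mp2_small):
    """MP2/CBS + δ(CCSD(T)) estimate of the CCSD(T)/CBS interaction energy."""
    return e_int_mp2_cbs + delta_ccsdt(e_int_ccsdt_small, e_int_mp2_small)


# Hypothetical interaction energies in kcal/mol (illustration only).
E_MP2_CBS = -4.95      # MP2 at the (extrapolated) complete basis set limit
E_MP2_SMALL = -4.40    # MP2 in the small basis, e.g. DZ+(3322)
E_CCSDT_SMALL = -2.30  # CCSD(T) in the same small basis

straight_up = E_CCSDT_SMALL
composite = composite_estimate(E_MP2_CBS, E_CCSDT_SMALL, E_MP2_SMALL)

print(f"straight-up CCSD(T)/small basis: {straight_up:.2f} kcal/mol")
print(f"composite MP2/CBS+δ(CCSD(T)):    {composite:.2f} kcal/mol")
```

The composite value differs from the straight-up one by exactly the MP2 basis set incompleteness error of the small basis, so the approach works well whenever δ(CCSD(T)) converges faster with the basis set than the MP2 energy itself.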
Already at the nonaugmented DZ+(3322) level, the conventional MP2/CBS+δ(CCSD(T)) treatment delivers average errors below 0.1 kcal/mol for both the A24 and S22 databases, and the MP2/CBS+δ(CCSD(T)-F12a) and MP2/CBS+δ(CCSD(Tbb)-F12b) approaches are still more accurate. In fact, the latter two variants are markedly cheaper than the well established MP2/CBS+δ(CCSD(T))/aDZ approach and the DW-CCSD(T**)-F12/aDZ "silver standard" of Ref. 31, yet they are more accurate (the only exception being that the dispersion-weighted variant performs somewhat better on the S22 database, the one used to fit its parameters). By considering the A24 and S22 accuracy as well as computational efficiency, several recommended approaches were identified in Table I, leading to the establishment of the new "silver-minus", "silver", and "silver-plus" standards. The accuracy and relative computational complexity of these newly recommended variants and a few established ones are summarized in Fig. 14.

FIG. 14. Comparison of the accuracy (as measured by the MUE values on the A24 and S22 databases) and computational efficiency (as measured by the relative timings for the parallel-displaced benzene dimer) of (left to right) conventional CCSD(T)/aDZ without and with midbond functions, CCSD(T**)-F12b/aDZ without and with midbond functions, the "silver standard" of Ref. 31, and the new "silver-minus", "silver", and "silver-plus" standards recommended in this work. Axes: MUE (kcal/mol) and calculation time. Legend: A24, S22.

In the future, we plan to apply the best performing double-zeta+(midbond) CCSD(T)/CBS estimates to more extensive benchmark databases, especially those including short-range and long-range configurations in addition to the van der Waals minima. Here, the S66x8 dataset (Ref. 71) or its recent short-range S66x10 extension (Ref. 72) are particularly intriguing choices, and a comparison with the MP2/CBS+δ(CCSD(T)-F12)/cc-pVDZ-F12 benchmark calculations of Brauer et al. (Ref. 55) will be worthwhile.

Shortly after this manuscript was initially submitted, a paper by Shaw and Hill (Ref. 73) appeared, challenging the prevalent assertion that the precise exponents of midbond functions are relatively unimportant. Shaw and Hill showed, on a somewhat limited range of test systems (noble gas dimers, alkali metal dimers, and 4 out of 7 very small molecular dimers studied in Ref. 41), that a carefully optimized set of system-dependent midbond exponents as small as (2s2p1d) leads to an impressive improvement of the CCSD(T)/aDZ and CCSD(T)-F12b/aDZ interaction energies. Thus, Ref. 73 not only corroborates our findings of the superior performance of selected CCSD(T)-F12 variants with bond functions, but also suggests that the general-purpose unoptimized midbond sets employed in this work have not yet realized the full potential of this technique, and that the accuracy gains afforded by optimized midbond sets might be even more impressive. We are planning to explore bond function optimization as a possible avenue for further improvement of double-zeta-level benchmark CCSD(T) interaction energies.

ASSOCIATED CONTENT

The Supporting Information contains details of the DW-CCSD(T**)-F12 calculations; figures showing the average performance of DW-CCSD(T**)-F12, of the composite methods in "calendar" basis sets, and of the aug-cc-pVDZ-F12 basis set; mean unsigned relative errors for the methods considered; tables with all individual interaction energies; separate error statistics for hydrogen-bonded, dispersion-dominated, and mixed-influence subsets; and an extended version of Table I. This material is available free of charge via the Internet at http://pubs.acs.org/.

ACKNOWLEDGMENTS

We thank Mr. Dominic Sirianni for providing us with the MP2/CBS benchmark values with additional significant digits and Professor Clémence Corminboeuf for reading and commenting on the manuscript. This work was supported by the U.S. National Science Foundation CAREER award CHE-1351978.

1. Raghavachari, K.; Trucks, G. W.; Pople, J. A.; Head-Gordon, M. A 5th-Order Perturbation Comparison of Electron Correlation Theories. Chem. Phys. Lett. 1989, 157, 479–483.
2. Šimová, L.; Řezáč, J.; Hobza, P. Convergence of the Interaction Energies in Noncovalent Complexes in the Coupled-Cluster Methods Up to Full Configuration Interaction. J. Chem. Theory Comput. 2013, 9, 3420–3428.
3. Smith, D. G. A.; Jankowski, P.; Slawik, M.; Witek, H. A.; Patkowski, K. Basis Set Convergence of the Post-CCSD(T) Contribution to Noncovalent Interaction Energies. J. Chem. Theory Comput. 2014, 10, 3140–3150.


4. Řezáč, J.; Dubecký, M.; Jurečka, P.; Hobza, P. Extensions and applications of the A24 data set of accurate interaction energies. Phys. Chem. Chem. Phys. 2015, 17, 19268–19277.
5. Patkowski, K. Benchmark Databases of Intermolecular Interaction Energies: Design, Construction, and Significance. In Annual Reports in Computational Chemistry; Dixon, D. A., Ed.; Elsevier, Amsterdam, 2017; Vol. 13; pp 3–91.
6. Tajti, A.; Szalay, P. G.; Császár, A. G.; Kállay, M.; Gauss, J.; Valeev, E. F.; Flowers, B. A.; Vázquez, J.; Stanton, J. F. HEAT: High accuracy extrapolated ab initio thermochemistry. J. Chem. Phys. 2004, 121, 11599–11613.
7. Dunning Jr., T. H. Gaussian-Basis Sets for Use in Correlated Molecular Calculations. 1. The Atoms Boron through Neon and Hydrogen. J. Chem. Phys. 1989, 90, 1007–1023.
8. Kendall, R. A.; Dunning Jr., T. H.; Harrison, R. J. Electron Affinities of the 1st-Row Atoms Revisited - Systematic Basis Sets and Wave Functions. J. Chem. Phys. 1992, 96, 6796–6806.
9. Helgaker, T.; Klopper, W.; Koch, H.; Noga, J. Basis-set convergence of correlated calculations on water. J. Chem. Phys. 1997, 106, 9639–9646.
10. Halkier, A.; Helgaker, T.; Jørgensen, P.; Klopper, W.; Koch, H.; Olsen, J.; Wilson, A. K. Basis-Set Convergence in Correlated Calculations on Ne, N2, and H2O. Chem. Phys. Lett. 1998, 286, 243–252.
11. Martin, J. M. L. Ab initio total atomization energies of small molecules - towards the basis set limit. Chem. Phys. Lett. 1996, 259, 669–678.
12. Klopper, W. Highly accurate coupled-cluster singlet and triplet pair energies from explicitly correlated calculations in comparison with extrapolation techniques. Mol. Phys. 2001, 99, 481–507.
13. Schwenke, D. W. The extrapolation of one-electron basis sets in electronic structure calculations: How it should work and how it can be made to work. J. Chem. Phys. 2005, 122, 014107.
14. Tao, F.-M.; Pan, Y.-K. Møller-Plesset perturbation investigation of the He2 potential and the role of midbond basis functions. J. Chem. Phys. 1992, 97, 4989–4995.
15. Tao, F.-M. A new approach to the efficient basis set for accurate molecular calculations: Applications to diatomic molecules. J. Chem. Phys. 1994, 100, 3645–3650.
16. Tao, F.-M. Bond functions, basis set superposition errors and other practical issues with ab initio calculations of intermolecular potentials. Int. Rev. Phys. Chem. 2001, 20, 617–643.
17. Hättig, C.; Klopper, W.; Köhn, A.; Tew, D. P. Explicitly Correlated Electrons in Molecules. Chem. Rev. 2012, 112, 4–74.
18. Kong, L.; Bischoff, F. A.; Valeev, E. F. Explicitly Correlated R12/F12 Methods for Electronic Structure. Chem. Rev. 2012, 112, 75–107.
19. Adler, T. B.; Knizia, G.; Werner, H.-J. A Simple and Efficient CCSD(T)-F12 Approximation. J. Chem. Phys. 2007, 127, 221106.
20. Knizia, G.; Adler, T. B.; Werner, H.-J. Simplified CCSD(T)-F12 Methods: Theory and Benchmarks. J. Chem. Phys. 2009, 130, 054104.
21. Hättig, C.; Tew, D. P.; Köhn, A. Accurate and efficient approximations to explicitly correlated coupled-cluster singles and doubles, CCSD-F12. J. Chem. Phys. 2010, 132, 231102.
22. Boys, S. F.; Bernardi, F. Calculation of Small Molecular Interactions by Differences of Separate Total Energies - Some Procedures with Reduced Errors. Mol. Phys. 1970, 19, 553–566.
23. Jurečka, P.; Šponer, J.; Černý, J.; Hobza, P. Benchmark database of accurate (MP2 and CCSD(T) complete basis set limit) interaction energies of small model complexes, DNA base pairs, and amino acid pairs. Phys. Chem. Chem. Phys. 2006, 8, 1985–1993.
24. Takatani, T.; Hohenstein, E. G.; Malagoli, M.; Marshall, M. S.; Sherrill, C. D. Basis set consistent revision of the S22 test set of noncovalent interaction energies. J. Chem. Phys. 2010, 132, 144104.
25. Smith, D. G. A.; Patkowski, K. Interactions between Methane and Polycyclic Aromatic Hydrocarbons: A High Accuracy Benchmark Study. J. Chem. Theory Comput. 2013, 9, 370–389.
26. Smith, D. G. A.; Patkowski, K. Benchmarking the CO2 Adsorption Energy on Carbon Nanotubes. J. Phys. Chem. C 2015, 119, 4934–4948.
27. Riplinger, C.; Sandhoefer, B.; Hansen, A.; Neese, F. Natural Triple Excitations in Local Coupled Cluster Calculations with Pair Natural Orbitals. J. Chem. Phys. 2013, 139, 134101.
28. Schütz, M.; Masur, O.; Usvyat, D. Efficient and accurate treatment of weak pairs in local CCSD(T) calculations. II. Beyond the ring approximation. J. Chem. Phys. 2014, 140, 244107.
29. Guo, Y.; Riplinger, C.; Becker, U.; Liakos, D. G.; Minenkov, Y.; Cavallo, L.; Neese, F. Communication: An improved linear scaling perturbative triples correction for the domain based local pair-natural orbital based singles and doubles coupled cluster method [DLPNO-CCSD(T)]. J. Chem. Phys. 2018, 148, 011101.
30. Pavošević, F.; Peng, C.; Pinski, P.; Riplinger, C.; Neese, F.; Valeev, E. F. SparseMaps-A systematic infrastructure for reduced scaling electronic structure methods. V. Linear scaling explicitly correlated coupled-cluster method with pair natural orbitals. J. Chem. Phys. 2017, 146, 174108.
31. Burns, L. A.; Marshall, M. S.; Sherrill, C. D. Appointing silver and bronze standards for noncovalent interactions: A comparison of spin-component-scaled (SCS), explicitly correlated (F12), and specialized wavefunction approaches. J. Chem. Phys. 2014, 141, 234111.
32. Marchetti, O.; Werner, H.-J. Accurate calculations of intermolecular interaction energies using explicitly correlated wave functions. Phys. Chem. Chem. Phys. 2008, 10, 3400–3409.
33. Marshall, M. S.; Sherrill, C. D. Dispersion-Weighted Explicitly Correlated Coupled-Cluster Theory [DW-CCSD(T**)-F12]. J. Chem. Theory Comput. 2011, 7, 3978–3982.


34. Hesselmann, A. Improved supermolecular second order Møller-Plesset intermolecular interaction energies using time-dependent density functional response theory. J. Chem. Phys. 2008, 128, 144112.
35. Burns, L. A.; Faver, J. C.; Zheng, Z.; Marshall, M. S.; Smith, D. G. A.; Vanommeslaeghe, K.; MacKerell, Jr., A. D.; Merz, Jr., K. M.; Sherrill, C. D. The BioFragment Database (BFDb): An open-data platform for computational chemistry analysis of noncovalent interactions. J. Chem. Phys. 2017, 147, 161727.
36. Peterson, K. A.; Adler, T. B.; Werner, H.-J. Systematically convergent basis sets for explicitly correlated wavefunctions: The atoms H, He, B-Ne, and Al-Ar. J. Chem. Phys. 2008, 128, 084102.
37. Peterson, K. A.; Kesharwani, M. K.; Martin, J. M. L. The cc-pV5Z-F12 basis set: reaching the basis set limit in explicitly correlated calculations. Mol. Phys. 2015, 113, 1551–1558.
38. Sirianni, D. A.; Burns, L. A.; Sherrill, C. D. Comparison of Explicitly Correlated Methods for Computing High-Accuracy Benchmark Energies for Noncovalent Interactions. J. Chem. Theory Comput. 2017, 13, 86–99.
39. Řezáč, J.; Hobza, P. Describing Noncovalent Interactions beyond the Common Approximations: How Accurate Is the 'Gold Standard,' CCSD(T) at the Complete Basis Set Limit? J. Chem. Theory Comput. 2013, 9, 2151–2155.
40. Patkowski, K. On the accuracy of explicitly correlated coupled-cluster interaction energies - have orbital results been beaten yet? J. Chem. Phys. 2012, 137, 034103.
41. Patkowski, K. Basis Set Converged Weak Interaction Energies from Conventional and Explicitly Correlated Coupled-Cluster Approach. J. Chem. Phys. 2013, 138, 154101.
42. Boese, A. D. Basis set limit coupled-cluster studies of hydrogen-bonded systems. Mol. Phys. 2015, 113, 1618–1629.
43. Sylvetsky, N.; Kesharwani, M. K.; Martin, J. M. L. The aug-cc-pVnZ-F12 basis set family: Correlation consistent basis sets for explicitly correlated benchmark calculations on anions and noncovalent complexes. J. Chem. Phys. 2017, 147, 134106.
44. Tew, D. P.; Klopper, W.; Hättig, C. A diagonal orbital-invariant explicitly-correlated coupled-cluster method. Chem. Phys. Lett. 2008, 452, 326–332.
45. Jeziorska, M.; Bukowski, R.; Cencek, W.; Jaszuński, M.; Jeziorski, B.; Szalewicz, K. On the performance of bond functions and basis set extrapolation techniques in high-accuracy calculations of interatomic potentials. A helium dimer study. Collect. Czech. Chem. Commun. 2003, 68, 463–488.
46. Podeszwa, R.; Patkowski, K.; Szalewicz, K. Improved interaction energy benchmarks for dimers of biological relevance. Phys. Chem. Chem. Phys. 2010, 12, 5974–5979.
47. Burns, L. A.; Marshall, M. S.; Sherrill, C. D. Comparing Counterpoise-Corrected, Uncorrected, and Averaged Binding Energies for Benchmarking Noncovalent Interactions. J. Chem. Theory Comput. 2014, 10, 49–57.
48. Marshall, M. S.; Burns, L. A.; Sherrill, C. D. Basis Set Convergence of the Coupled-Cluster Correction, δ_MP2^CCSD(T): Best Practices for Benchmarking Non-Covalent Interactions and the Attendant Revision of the S22, NBC10, HBC6, and HSG Databases. J. Chem. Phys. 2011, 135, 194102.
49. Papajak, E.; Zheng, J.; Xu, X.; Leverentz, H. R.; Truhlar, D. G. Perspectives on Basis Sets Beautiful: Seasonal Plantings of Diffuse Basis Functions. J. Chem. Theory Comput. 2011, 7, 3027–3034.
50. Řezáč, J.; Jurečka, P.; Riley, K. E.; Černý, J.; Valdes, H.; Pluháčková, K.; Berka, K.; Řezáč, T.; Pitoňák, M.; Vondrášek, J.; Hobza, P. Quantum Chemical Benchmark Energy and Geometry Database for Molecular Clusters and Complex Molecular Systems (www.begdb.com): A Users Manual and Examples. Collect. Czech. Chem. Commun. 2008, 73, 1261–1270.
51. Tao, F.-M. The use of midbond functions for ab initio calculations of the asymmetric potentials of He-Ne and He-Ar. J. Chem. Phys. 1993, 98, 3049–3059.
52. Tao, F.-M. An accurate ab initio potential energy surface of the He–H2 interaction. J. Chem. Phys. 1994, 100, 4947–4954.
53. Akin-Ojo, O.; Bukowski, R.; Szalewicz, K. Ab Initio Studies of He–HCCCN Interaction. J. Chem. Phys. 2003, 119, 8379–8396.
54. Köhn, A. Explicitly correlated connected triple excitations in coupled-cluster theory. J. Chem. Phys. 2009, 130, 131101.
55. Brauer, B.; Kesharwani, M. K.; Kozuch, S.; Martin, J. M. L. The S66x8 benchmark for noncovalent interactions revisited: explicitly correlated ab initio methods and density functional theory. Phys. Chem. Chem. Phys. 2016, 18, 20905–20925.
56. Werner, H.-J. et al. MOLPRO, version 2012.1, a package of ab initio programs. 2012; see http://www.molpro.net (accessed June 2, 2016).
57. Werner, H.-J.; Knowles, P. J.; Knizia, G.; Manby, F. R.; Schütz, M. Molpro: a general-purpose quantum chemistry program package. WIREs Comput Mol Sci 2012, 2, 242–253.
58. Weigend, F.; Köhn, A.; Hättig, C. Efficient use of the correlation consistent basis sets in resolution of the identity MP2 calculations. J. Chem. Phys. 2002, 116, 3175–3183.
59. Hättig, C. Optimization of auxiliary basis sets for RI-MP2 and RI-CC2 calculations: Core-valence and quintuple-zeta basis sets for H to Ar and QZVPP basis sets for Li to Kr. Phys. Chem. Chem. Phys. 2005, 7, 59–66.
60. Weigend, F. A fully direct RI-HF algorithm: Implementation, optimised auxiliary basis sets, demonstration of accuracy and efficiency. Phys. Chem. Chem. Phys. 2002, 4, 4285–4291.
61. Podeszwa, R.; Bukowski, R.; Szalewicz, K. Potential energy surface for the benzene dimer and perturbational analysis of π–π interactions. J. Phys. Chem. A 2006, 110, 10345–10354.
62. Yousaf, K. E.; Peterson, K. A. Optimized auxiliary basis sets for explicitly correlated methods. J. Chem. Phys. 2008, 129, 184108.
63. Hellmann, R.; Bich, E.; Vogel, E. Ab initio potential energy curve for the neon atom pair and thermophysical properties of the dilute neon gas. I. Neon-neon interatomic potential and rovibrational spectra. Mol. Phys. 2008, 106, 133–140.
64. Patkowski, K.; Szalewicz, K. Argon pair potential at basis set and excitation limits. J. Chem. Phys. 2010, 133, 094304.


65. Bakr, B. W.; Smith, D. G. A.; Patkowski, K. Highly accurate potential energy surface for the He-H2 dimer. J. Chem. Phys. 2013, 139, 144305.
66. Jäger, B.; Hellmann, R.; Bich, E.; Vogel, E. State-of-the-art ab initio potential energy curve for the krypton atom pair and thermophysical properties of dilute krypton gas. J. Chem. Phys. 2016, 144, 114304.
67. Li, S.; Smith, D. G. A.; Patkowski, K. An accurate benchmark description of the interactions between carbon dioxide and polyheterocyclic aromatic compounds containing nitrogen. Phys. Chem. Chem. Phys. 2015, 17, 16560–16574.
68. Manna, D.; Kesharwani, M. K.; Sylvetsky, N.; Martin, J. M. L. Conventional and Explicitly Correlated ab Initio Benchmark Study on Water Clusters: Revision of the BEGDB and WATER27 Data Sets. J. Chem. Theory Comput. 2017, 13, 3136–3152.
69. Bryantsev, V. S.; Diallo, M. S.; van Duin, A. C. T.; Goddard III, W. A. Evaluation of B3LYP, X3LYP, and M06-Class Density Functionals for Predicting the Binding Energies of Neutral, Protonated, and Deprotonated Water Clusters. J. Chem. Theory Comput. 2009, 5, 1016–1026.
70. Temelso, B.; Archer, K. A.; Shields, G. C. Benchmark Structures and Binding Energies of Small Water Clusters with Anharmonicity Corrections. J. Phys. Chem. A 2011, 115, 12034–12046.
71. Řezáč, J.; Riley, K. E.; Hobza, P. S66: A Well-balanced Database of Benchmark Interaction Energies Relevant to Biomolecular Structures. J. Chem. Theory Comput. 2011, 7, 2427–2438.
72. Smith, D. G. A.; Burns, L. A.; Patkowski, K.; Sherrill, C. D. Revised Damping Parameters for the D3 Dispersion Correction to Density Functional Theory. J. Phys. Chem. Lett. 2016, 7, 2197–2203.
73. Shaw, R. A.; Hill, J. G. Midbond basis functions for weakly bound complexes. Mol. Phys. 2018, DOI:10.1080/00268976.2018.1440018.
