Scalable Electron Correlation Methods. 3. Efficient and Accurate

Jun 29, 2017 - Nevertheless, due to the parallelization that is efficient up to about 100–200 CPU cores (dependent on the molecular size), accurate ...
0 downloads 8 Views 3MB Size
Article pubs.acs.org/JCTC

Scalable Electron Correlation Methods. 3. Efficient and Accurate Parallel Local Coupled Cluster with Pair Natural Orbitals (PNOLCCSD) Max Schwilk, Qianli Ma, Christoph Köppl, and Hans-Joachim Werner* Institut für Theoretische Chemie, Universität Stuttgart, Pfaffenwaldring 55, D-70569 Stuttgart, Germany S Supporting Information *

ABSTRACT: A well-parallelized local singles and doubles coupledcluster (LCCSD) method using pair natural virtual orbitals (PNOs) is presented. The PNOs are constructed using large domains of projected atomic orbitals (PAOs) and orbital specific virtual orbitals (OSVs). We introduce a hierarchy of close, weak, and distant pairs, based on pair energies evaluated by local Møller−Plesset perturbation theory (LMP2). In contrast to most previous implementations of LCCSD methods, the close and weak pairs are not approximated by LMP2 but treated by higher-order methods. This leads to greatly improved accuracy for large systems, in particular when long-range dispersion interactions are important. Close pair amplitudes are optimized using approximate LCCSD equations, in which slowly decaying terms that mutually cancel at long-range are neglected. For weak pairs, the same approximations are used, but in addition, the nonlinear terms are neglected (coupled electron pair approximation). Distant pairs are treated by spin-component scaled (SCS)-LMP2 using multipole approximations. For efficiency reasons, various projection approximations are also introduced. The impact of these approximations on absolute and relative energies depends on the PNO domain sizes. The errors are found to be negligible, provided that sufficiently large PNO domains are used for close and weak pairs. For the selection of these domains the usual natural orbital occupation number criterion is found to be insufficient, and an additional energy criterion is used. For extended one-dimensional systems, the computational effort of the method scales nearly linearly with the number of correlated electrons, but the linear scaling regime is usually not reached in real-life applications for three-dimensional systems. Nevertheless, due to the parallelization that is efficient up to about 100−200 CPU cores (dependent on the molecular size), accurate calculations for three-dimensional molecules with about 100 atoms and augmented triple-ζ basis sets (e.g., cc-pVTZF12) can be carried out in 1−3 h of elapsed time (depending on the molecular structure and the number of CPU cores, excluding the time for Hartree−Fock). The convergence of the results with respect to the thresholds and options that control the domain, pair, and projection approximations is extensively tested. Benchmark examples for several difficult and large cases are presented, which demonstrate that the errors of relative energies (e.g., reaction energies, barrier heights) caused by the pair and projection approximations can be reduced to below 1 kJ mol−1. The remaining errors are mainly caused by the finite PAO domains. The larger these are made, the more intramolecular or intermolecular basis set superposition errors are introduced, and the canonical results are approached only very slowly. This problem is eliminated in the explicitly correlated variant of our method (PNO-LCCSD-F12), which will be described in a separate paper, along with a larger set of benchmark calculations.

1. INTRODUCTION

Nel). This problem is particularly severe since the correlation energy converges very slowly with basis set size. The latter problem can be alleviated by explicit correlation methods,1−43 which include terms in the wave function that depend explicitly on the interelectronic distances rij and thus make it possible to describe the wave function cusp for rij → 0 accurately. This leads to a drastic reduction of the basis set incompleteness errors, and with modern F12 methods,13−43 near complete basis set (CBS) quality can be achieved with triple-ζ basis sets. There are several ways to reduce the scaling of the computational effort with molecular size, which exploit the

The coupled-cluster method with singles and doubles amplitudes and a perturbative treatment of triples amplitudes [CCSD(T)] is considered the gold standard of quantum chemistry. Provided that the single-determinant Hartree−Fock wave function is a good zeroth-order approximation and large basis sets are used, CCSD(T) yields relative energies with chemical accuracy (100 atoms, often achieving chemical accuracy (1 kcal mol−1) for relative energies (as compared to canonical calculations with the same basis).175 However, more recently the errors were found to be much larger when dispersion interactions are important, and then very tight thresholds were needed to obtain converged results.107,176 This is due to the overestimation of dispersion energies by LMP2, which was used to approximate the weak and distant pairs.

short-range character of dynamical electron correlation in insulators using localized orbitals. In traditional local correlation methods,44−77,79−107 usually two kinds of approximations are introduced: The treatment of distant electron pairs is simplified or neglected, and the excitation space for each localized electron pair is restricted to a subspace (domain) of local virtual orbitals that are spatially close to the corresponding occupied orbitals. Asymptotically, this (formally) leads to linear scaling of all computational resources with the number of correlated electrons. A disadvantage of these methods is that they are complicated and difficult to implement, and the treatable molecular size is limited by the available computational resources (mainly memory and disk space). Nevertheless, the correlation energy of rather large molecules (100−200 atoms) can be very accurately computed using local coupled-cluster methods, and the errors can be well controlled. The method described in the current paper belongs to this class. Local approximations have also been developed for multi-reference wave functions.108−116 Another approach to reduce the scaling is to split the molecule or the occupied orbital space into smaller pieces, which are treated independently, mostly using conventional methods (although the use of local correlation methods is also possible). Advantages of such fragmentation methods117−144 is the ease of parallel implementation and strict linear scaling with the molecular size in the asymptotic limit. However, since the fragments must strongly overlap, there is a lot of redundancy and the prefactor is high. Furthermore, in order to obtain accurate results, the fragments must in some cases be very large, making canonical calculations with triple-ζ (or better) basis sets for the largest fragments very expensive or even impossible. A special way of assembling the correlation energy using a manybody expansion is used in the so-called incremental methods, 145−153 which also belong to the group of fragmentation methods. In the following, we only consider traditional local correlation methods. For a long time, mostly projected atomic orbitals (PAOs) have been used to span the domains, as originally proposed by Pulay.44 Already 30 years ago, Saebø and Pulay have shown that with small domains (typically all PAOs at two atoms for a bicentric bond and at one atom for lone pairs) 98−99% of the canonical correlation energy (for a given basis set) can be recovered.45−49 The errors caused by the domain approximation (in the following “domain errors”) decrease with increasing basis set size. Distant pairs have been treated by local MP2 (LMP2) and multipole approximations.55,56,83 On the basis of these approximations, integraldirect linear-scaling PAO-LMP2 and PAO-LCCSD(T) methods have been developed in our group.57−63 Somewhat later, a significant improvement of the efficiency was achieved by the introduction of local density fitting approximations.65,66,74 Furthermore, the combination of local and explicit correlation (F12) treatments lead to a drastic improvement in the accuracy.78−81,154−166 This is not only caused by the reduction of basis set incompleteness errors but also due to a strong reduction of the domain errors. The latter improvement is caused by implicit excitations to the space outside the domains in the explicitly correlated treatment. Since the F12 terms are fast decaying, near linear scaling can also be realized for explicitly correlated LMP2-F12 and LCCSD(T)-F12.80,156,157 A disadvantage of the PAO-based local methods is that the errors are not always easy to control. In most cases, the domains are chosen based on a selection of atoms for each pair 3651

DOI: 10.1021/acs.jctc.7b00554 J. Chem. Theory Comput. 2017, 13, 3650−3675

Article

Journal of Chemical Theory and Computation

LMP2 pair energies and energy thresholds. Distant and very distant pairs are treated by LMP2 and multipole approximations.55,56,83 The impact of this approximation for PNOLMP2 correlation energies has recently been investigated in detail by one of us.83 It was found to be very small provided that sufficiently high multipole orders (up to dipole−octupole and quadrupole−quadrupole interactions) are included, and couplings between the distant pairs are not neglected. In the current work, these couplings are included for distant pairs but not for very distant pairs. In our previous PAO-LCCSD(T) as well as in the DLPNO-LCCSD(T) methods of Riplinger and Neese,101,104 weak pairs were also treated by LMP2 (or even semi-canonical LMP2). As already pointed out in earlier studies,42,181−183 this is not a good approximation for intermolecular interaction energies since LMP2 significantly overestimates long-range dispersion contributions. Schütz et al.181−183 therefore proposed more accurate approximate coupled-cluster treatments of intermolecular pairs. In a previous communication,81 we demonstrated that the overestimation of intramolecular dispersion energies by LMP2 can also have a significant effect on reaction energies of large systems. We therefore proposed an approximate PNO-LCCD treatment of close and weak pairs, which is closely related to the approximations used by Schü tz et al.182 In fact, pair approximations based the same principles have already been proposed and tested 30 years ago by Saebø and Pulay in their seminal work on local MP4(SDQ) (fourth-order Møller− Plesset perturbation theory with single, double, and quadruple excitation operators).47 Most importantly, they showed that the expensive contributions of 4-external integrals cancel with 0external and 2-external integral contributions at long-range and that the neglect of the canceling terms leads to large savings without much loss of accuracy. This has been confirmed in our previous work81 for very much larger systems than could be treated by Saebø and Pulay. In the current paper, we generalize this accurate weak and close pair treatment for PNO-LCCSD. We also discuss and test the projection approximations. In their general form, they have been introduced by Neese et al.,96,97,101 but to the best of our knowledge, some of them have not been tested in detail so far. These approximations project integrals from one domain to another, thereby strongly reducing the number of nonredundant integrals that must be computed and stored. They become exact for complete PNO domains. Testing the local approximations is possible in various ways. For relatively small molecules, local and canonical calculations can be directly compared. With tighter and tighter thresholds, the local method should then converge exactly to the canonical result. Pair approximations can be tested by comparing local and canonical calculations for molecular dimers as a function of the intermolecular distance. If the intermolecular pairs are treated by pair approximations, the errors caused by these approximations can be seen directly. Unfortunately, such tests are not possible for large molecules, for which canonical calculations are not feasible anymore. In this case, one can only study the dependence of absolute or relative energies on the truncation thresholds. In the current work, we use all these possibilities to establish the accuracy of our method. As already pointed out, explicitly correlated terms strongly reduce the basis set incompleteness and domain errors, and in our opinion, they are indispensable for accurate calculations on large systems. However, the current work focuses only on the PNO-LCCSD method itself and the related domain, pair, and

Originally, Neese et al. created the PNOs from semicanonical MP2 amplitudes in the full virtual orbital basis, which scales as 6(Nel5) [or 6(Nel4) if distant pairs are neglected]. Later on, in the so-called domain-based local PNO (DLPNO) methods,101−104 the initial semi-canonical amplitudes were created in large domains of PAOs, which leads asymptotically to linear scaling. More recently, PNO-based local correlation methods have also been developed in our group78−83 and by Tew and Hättig et al.158−163 Some of these methods used domains of orbital specific virtuals177−180 (OSVs) to obtain the initial amplitude approximation, which makes it possible to reduce the scaling for the PNO generation to cubic.161 In our methods, a sequence of transformations PAO → OSV → PNO is implemented, which leads to linear scaling and minimizes the CPU and memory demands for generating PNOs without losing significant accuracy.78−82 Our current efforts are concentrated on developing new PNO-based low-order scaling correlation methods which are well parallelized. Ideally, the targets are “scalable” implementations, i.e., linear scaling of the computational effort with molecular size and inverse linear scaling of the compute time with the number of CPU cores. Near linear scaling with molecular size has been achieved previously for PNOLMP279,82 and PNO-LMP2-F12,80 and these methods are also well parallelized. In the current work, this is extended to PNO-LCCSD. With “well parallelized”, we mean an implementation which runs well on multiple computer nodes with communication over a fast network (e.g., Infiniband) and typically a total of 100−200 CPU cores. This is to be distinguished from “moderately” parallel codes that work only on a single node with shared memory and a shared file system or “massively” parallel programs that are designed for many thousands of CPU cores. Due to the large communication demands in CCSD methods, massive parallelization is very difficult to achieve without introducing large redundancy in the computations. Even for our well-parallelized PNO programs, true inverse linear scaling with the number of cores or nodes could not be achieved due to communication overheads and imperfect load balancing. True linear scaling of the computational effort with system size is also difficult to achieve in PNOLCCSD. While all dominant parts of our current PNO-LCCSD method scale (in the asymptotic limit) nearly linearly with the molecular size, we decided for the sake of accuracy and robustness to keep a few terms which scale quadratically or cubically, though with a very small prefactor. Efficient PNO-LCCSD methods require a number of approximations beyond the use of PNO domains. First of all, approximations for close, weak, and distant pairs are necessary in order to make calculations for large molecules feasible. Second, many summations in the residual equations have to be truncated in order to reduce the computational effort and to achieve low-order scaling. Third, various projection approximations as first introduced by Neese et al.96,97,101 can be used to reduce the number of transformed integrals that need to be computed and stored. Finally, density fitting and screening techniques are used for efficient computation of the transformed integrals in the LMO/PNO basis. The correlated electron pairs can be classified according to their distance or their correlation energy contributions. In the current work, we distinguish strong, close, weak, distant, and very distant pairs. Energy thresholds are generally preferred, and the different pair classes can then be defined based on 3652

DOI: 10.1021/acs.jctc.7b00554 J. Chem. Theory Comput. 2017, 13, 3650−3675

Article

Journal of Chemical Theory and Computation

Figure 1. Benchmark reactions.

2. BENCHMARK SYSTEMS AND COMPUTATIONAL DETAILS In order to test our PNO-LCCSD and PNO-LCCSD-F12 methods, we have computed reaction energies, isomerization energies, barrier heights, and intermolecular interaction energies. On the one hand, the benchmark systems should be representative for real-life chemistry and large enough to test all kinds of local approximations. In particular, strong dispersion interactions, which can, for example, arise in large molecules between aromatic rings, may have a large effect, and their importance increases with system size. On the other hand, the benchmark systems should not be too large so that the many necessary calculations can be run in a reasonable time. The systems we have chosen comprise 60−174 atoms and include the most difficult test cases that we have encountered so far. The following benchmark sets have been used [cf. Figure 1; three-dimensional representations can be found in the Supporting Information (SI)]: 1. The FH benchmark set of Friedrich and Hänchen,151 which contains 104 small to medium size molecules and 55 reactions. This includes the “difficult cases”, which have been omitted in some previous benchmarks. 2. The ISOL24 benchmark of Huenerbein et al.186 This contains 24 real-life isomerization reactions with up to 81 atoms, which have been extensively used in the past to

projection approximations. The F12 treatment is rather independent of these and will be described in a following paper.184 Extensive benchmarks for the overall accuracy of our PNO-LCCSD-F12 method will be given therein as well. Clearly, connected triple excitations are also indispensable for accurate calculations. Linear scaling PAO,59,60,62,85 OSV,180 and PNO102 implementations of the perturbative (T) treatment have been described in the past. Alternatively, Laplace transform methods163,185 can be used. A new parallel implementation of the PNO-LCCSD(T)-F12 method is currently developed in our group and will be described in a future publication. The organization of this paper is as follows. First, we describe our benchmark systems and the relevant computational details. Then, we introduce our notation and discuss some general aspects of local coupled-cluster theory. Subsequently, we discuss the domain, pair, and projection approximations that are used in our method. For each approximation, we present representative benchmark results. Finally, we describe some technical details about our implementation and demonstrate the scaling with the molecular size and the number of processing cores for some model systems. A summary concludes the paper. 3653

DOI: 10.1021/acs.jctc.7b00554 J. Chem. Theory Comput. 2017, 13, 3650−3675

Article

Journal of Chemical Theory and Computation

was used for the hydrogen atoms, which reduces linear dependencies (in particular, for the coronene dimer). In the following, this mixed basis is denoted aVTZ. Local density fitting (DF) approximations65 were employed throughout this work for the evaluation of two-electron integrals, and the augcc-pVTZ/MP2FIT basis sets198 were used as the DF auxiliary basis in all calculations. For computing the Fock matrix, we used the aug-cc-pVTZ/JKFIT basis sets, which were derived from the cc-pVTZ/JKFIT sets199 by adding for each angular momentum another shell of diffuse functions. We note that the additional diffuse functions in the fitting basis sets have a relatively small effect in standard LCCSD calculations. However, they sometimes have a non-negligible effect in F12 calculations and were used here for consistency with the F12 calculations in ref 184. Intrinsic bond orbitals200 were used as LMOs. Inner-shell core orbitals were not correlated. The local fitting domains correspond to the IEXT = 2 orbital domains (cf. Section 4.1 and ref 79), i.e. IDFDOM = 2, RDFDOM = 5 a0. For the calculations with IEXT = 3, larger DF domains (IDFDOM = 3, RDFDOM = 7 a0) were employed. Unless otherwise stated, all other parameters for local approximations are our program defaults values, as described in the following sections. All benchmark calculations were performed using the development version201 of the Molpro quantum chemistry package.202 The PNO-LCCSD method will be made available for public use in the near future.

test the performance of DFT methods with numerous different functionals.186−188 In many of these reactions, the electronic structure changes significantly. In the current paper, we only use isomerization reaction 4, which was found to be the most difficult one due to its strong dispersion contribution to the reaction energy. In the original paper of Huenerbein et al.,186 it was also found to be the most difficult one for density functional theory (DFT), with a dispersion correction of ≈40 kcal mol−1. Results for the full ISOL24 set will be presented in our accompanying paper on PNO-LCCSD-F12.184 3. The WCCR10 benchmark set of Weymuth, Couzijn, Chen, and Reiher, 189 which contains 10 ligand dissociation energies of large transition metal complexes for which gas-phase experimental values are available. In the current paper, we only present results for reaction 4 of this set, denoted as “WCCR10-4” in the following. This is by far the largest system of the whole benchmark set. Again, results for all 10 reactions will be presented in the PNO-LCCSD-F12 paper.184 4. The dissociation of a gold(I)-aminonitrene complex (AuC41H45N4P, for simplicity denoted Auamin, Figure 1). This “Auamin reaction” is taken from ref 190. It plays an important role in catalytic aziridination and insertion reactions. The Auamin molecule contains 92 atoms and has three phenyl and three mesityl groups. There are strong dispersion interactions between these groups and with the gold atom. The computed dissociation energy can be directly compared to an experimental gas-phase value190 of 47.0 ± 2.7 kcal mol−1, which has been obtained by subtracting the PW91/cc-pVTZ-PP zeropoint correction190 of −2.0 kcal/mol from the measured value. The Hartree−Fock value is computed to be 22.0 kcal mol−1. Thus, the correlation contribution to the dissociation energy is large, and it is very sensitive to local approximations. The reaction therefore provides an excellent “difficult” benchmark system. 5. The “Androstendione reaction”, which is the last step in the synthesis of androstendione. This has also been used as a benchmark reaction in our previous work on local correlation methods.79−81,157 6. An organocatalysis reaction,191 which plays a role in a highly enantioselective organocatalysis. 7. The coronene dimer in a parallel displaced stacked configuration (C2C2PD) from the L7 test set of ref 192. This has also been used as benchmark system in the recent paper of Pavošević et al. on a DLPNO-CCSD-F12 method.107 For consistency with the other papers in this series,79,80,184 all calculations have been carried out using the cc-pVTZ-F12 (VTZ-F12) basis sets193 unless otherwise noted. These were specially optimized for explicitly correlated (F12) calculations. They contain diffuse functions, which are particularly important for F12 calculations (the s and p sets correspond to the aug-ccpVQZ basis sets194). Furthermore, these basis sets yield considerably better Hartree−Fock energies than the standard cc-pVTZ195 or aug-cc-pVTZ194 basis sets. For the gold and ruthenium atoms, the ECP60MDF and ECP28MDF effective core potentials,196 respectively, with the associated aug-ccpVTZ-PP basis sets197 were employed. In some calculations of intermolecular interactions, we used the standard aug-cc-pVTZ basis set. In these cases, the nonaugmented cc-pVTZ basis set

3. THEORY AND IMPLEMENTATION 3.1. Definitions and Notation. The local closed-shell coupled-cluster singles and doubles (LCCSD) wave function is defined as ̂

Ψ = eT |0⟩

(1)

where |0⟩ is the Hartree−Fock reference function and T̂ the cluster operator T̂ =

1 a taiEî + 2 a ∈ [i]

∑∑ i

∑ ∑ ij

a , b ∈ [ij]

ij ̂ a ̂ b Tab Ei Ej

(2)

where tia and Tijab = Tjiba are the singles and doubles amplitudes, † † respectively, and Ê ai = ηαa ηαi + ηβa ηβi are the usual spin-summed one-electron excitation operators. Here, [i] and [ij] denote orbital and pair domains of virtual orbitals, respectively. We assume [i] = [ii], even though this restriction could also be lifted. If necessary, the type of virtual orbitals is given by a subscript: for example, [i]PAO denotes an orbital-specific domain of PAOs, and [ij]PNO denotes a pair-specific domain of PNOs. Here and throughout this paper, we use the following index notation for the various orbital spaces: i, j, k, l denote occupied localized molecular orbitals (LMOs), and a, b, c, d denote virtual orbitals, belonging to the domains specified in the summations. In addition, the indices r, s refer to nonorthogonal projected atomic orbitals (PAOs) and μ, ν to AOs. Indices aij, bij, cij, dij denote virtual orbitals that are specific for a pair ij. In order to avoid excessive and redundant indexing, we mostly omit the superscripts of the virtual orbital indices and use the convention that they are implied by the domain specifications in the summations. The orbitals in each domain are assumed to be orthonormal, but those belonging to 3654

DOI: 10.1021/acs.jctc.7b00554 J. Chem. Theory Comput. 2017, 13, 3650−3675

Article

Journal of Chemical Theory and Computation T̅ [ijkl , mn] = S[kl , ij]T[ijij , ij]S[ij , mn]

different pairs are mutually nonorthogonal. The overlap matrices are defined as [S[ij , kl]]ab = ⟨aij|bkl⟩

T̃ [ijkl , mn] = 2T̅ [ijkl , mn] − T̅ [jikl , mn]

(3)

a ∈ [i]

ij [T[ijij , ij]]ab = Tab ,

Note that S[kl, kl] is the unit matrix, and therefore, = S[ij, ik] ij i i Tik[ik, ik], T̅ [ijij, ij] = T[ij, ij], and t[̅ ii] = t[ii]. Thus, there is some redundancy in this notation, but we keep the quantities without overbars for clarity. Equation 13 can now be written most compactly as R [ijij , ij] = K [ijij , ij] + T̃ [ikij , ik]K [kjik , ij] + αij , kl T̅ [klij , ij] + ...

(5)

The domain specification in square brackets is redundant here and could be omitted, but it is sometimes kept for clarity. Furthermore, we define integral matrices of 2-external integrals [K [ijkl , mn]]ab = (akli|bmnj) [J[ijkl , mn]]ab

(6)

| a⟩ =

kl mn

= (a b |ij)

ij ij



[K(T )[ij , ij]]ab =

ij ij

(a c | d b



(20)

(10)

(aik k|bijc jj)tcj (11)

c ∈ [jj]

| i ⟩̃ = |i⟩ +

Some typical terms in the LCCSD residual equations are ij ij = Kab + Rab

∑ ∑ k

+

kl

|a ̅ ii⟩ = |aii⟩ −

c , d ∈ [ik]PNO

∑ αij , kl ∑

+ ...,

=

+

+

(12)

(

∑ |k⟩[ t ̅[kii]]a

(22)

Using “dressed” integrals such as

a , b ∈ [ij]

c , d ∈ [kl]

K [ijij , ij]

(21)

k

ij

[K̃ [ijij , ij]]ab = (a ̅ ij i |̃ b ̅ j ̃ )

Here, the indices a, b refer to PNOs for pair ij, while c, d refer to PNOs for pair ik or kl. In matrix form, this can be written as R [ijij , ij]

|c⟩tci

∑ c ∈ [ii]

kj Sac(2Tcdik − Tcdki)Kdb

SacTcdklSdb

S[ij , ik] 2T[ikik , ik]



αij , kl S[ij , kl]T[klkl , kl]S[kl , ij]

a , b ∈ [ij]PC‐PAO

Similarly, OSV or PNO pair domains can be transformed to pseudo-canonical representations. The PNO-LMP2 and PNOLCCSD calculations are carried out in the PC-PNO basis, and for brevity, we use in the following [ij] ≡ [ij]PC‑PNO. 3.2. Dressed and Undressed Formulations of PNOLCCSD Equations. As shown in the Appendix, the PNOLCCSD equations can be compactly described in terms of dressed orbitals

(9)

c ∈ [jj]

[K(Ekj)[ik , ij]]ab =

W raijFrsW sbij = δab ϵija ,



(8)

(aik bij|kc jj)tcj



(19)

r , s ∈ [ij]PAO

c , d ∈ [ij]

[J(Ekj)[ik , ij]]ab =

ij Fab =

(amnc ij|d ijk)Tcdij



a ∈ [ij]PC‐PAO

so that S[ij, ij]PC‑PAO = 1 and

)Tcdij

c , d ∈ [ij] k) [K(Tij)[(mn ]]a =

|r ⟩W raij ,

∑ r ∈ [ij]PAO

(7)

as well as contractions of 3-external and 4-external integrals with amplitudes ij

(18)

The equations are then formally analogous to the canonical case. The domains can be spanned by different equivalent orbital representations, and the above definitions are independent of the choice, provided that the orbitals within each domain are orthonormal. A PAO domain [ij]PAO comprises a subspace of nonorthogonal PAOs r, s. The orbitals in this domain can be orthogonalized and transformed to a pseudo-canonical (PC) representation [ij]PC−PAO by a transformation Wij

(4)

a , b ∈ [ij]

(17)

T̅ ik[ij, ik]

In most cases we express the PNO-LCCSD equations in matrix form. Matrices are written in bold face, and unless otherwise noted, their elements refer to PNO domains given in square brackets. The definitions of all matrices and vectors are given in the Appendix. Here, we mention only a few, which are used in the following text. The singles and doubles amplitudes are considered as vectors and matrices, respectively, [t[i ii]]a = tai ,

(16)

+ ...

mn

[K̅ [ijkl , mn]]ab = (akli|b ̅ j ̃ )

)

T[kiik , ik]

(23)

K [kjik , ij]

(24)

mn

[ J̅ [ijkl , mn] ]ab = (akl b ̅ |ij ̃ )

the doubles amplitude equations then take exactly the same form as for PNO-LCCD. An advantage of this formulation is the simple and compact form of the equations to be implemented. Furthermore, many summations over occupied indices are absorbed in the dressed orbitals, and therefore truncations of these summations in large molecules can be avoided. However, a serious disadvantage is that the dressed integrals depend on the singles amplitudes and must therefore be recomputed in each LCCSD iteration. This is very expensive, particularly for the dressed 3-external and 4-external integrals, as well as for the dressed 2-external Coulomb integrals in eq 25. Another disadvantage of the dressed

(13)

and then the ranges of the summations in the matrix multiplications are defined by the domains given in square brackets. Summations over repeated occupied indices (in this case k, l) are also implied. For brevity in later expressions, we define “projected” quantities, which are indicated by an overbar t ̅ [i ij] = S[ij , ii]t[i ii]

(14)

† D̅ [ijkl , mn] = t ̅[i kl] t ̅[jmn ]

(15)

(25)

3655

DOI: 10.1021/acs.jctc.7b00554 J. Chem. Theory Comput. 2017, 13, 3650−3675

Article

Journal of Chemical Theory and Computation

Table 1. Default Values for Thresholds for Domains and Pair Approximations Used in PNO-LMP2 and PNO-LCCSD Calculations threshold

PNO-LMP2

PNO-LCCSD

description

TLMO IEXT REXT TOSV TPNO Ten

0.2 2 5 10−9 10−8 0.997

0.2 2 5 10−9 10−7 0.990

primary PAO domains PAO domain extension (bonds) PAO domain extension (distance, in a0) OSV domains PNO occ. number threshold PNO energy threshold

computed and used to determine semi-canonical (SC)-PAO amplitudes

equations is that the dressed orbitals are less sparse than the original orbitals, and therefore, linear scaling is more difficult to achieve. In the first stage of our work, we have implemented a partially dressed form of the PNO-LCCSD equations, in which only dressed 0−2-external integrals need to be recomputed in each iteration. This still leads to great simplifications since, for example, eq 23 expands to 16 terms in the undressed form. Nevertheless, this approach is still rather expensive and not linearly scaling. In our final implementation, all terms are fully expanded, and all required integrals are precomputed. The only exception is the dressed Fock matrix Frs̃ = hrs +

ii = Tab

∑ [2(rs|l l̃ ) − (rl|l s̃ )] ∑∑ l

c ∈ [ll]

[2(rs|cl) − (rl|cs)]tcl (26)

which is recomputed in each iteration. The terms involving this matrix describe orbital relaxation contributions and are slowly decaying. This is further discussed in Section 5. In order to achieve linear scaling (except for the dressed Fock terms), many summations have to be truncated and domain as well as projection approximations have to be introduced. The principles of these approximations are discussed in the following sections; full details are given in the Appendix, so that the method is exactly reproducible.

Dij =

4. DOMAIN APPROXIMATIONS 4.1. Definition of Domains. The construction of PAO, OSV, and PNO domains has been described in detail in ref 79, and therefore we only briefly outline the procedure in this section. The thresholds and their default values are summarized in Table 1. For LMP2, the same thresholds as have been recommended in our previous papers79,80 are used. The subsequent PNO-LCCSD calculation can be carried out with smaller domains, as explained in more detail in Section 4.2. The LMOs and PAOs are represented in the AO basis by matrices Lμi and Pμr, respectively, with P = 1 − LL†SAO

a , b ∈ [i]PC‐PAO

(28)

Next, OSV orbital domains [i]OSV are created by diagonalizing pair densities Dii = Tii Tii (or Tii directly; in this case, the eigenvalues have to be squared to get the natural occupation number nia). All OSVs with occupation numbers nia ≥ TOSV are kept and made pseudo-canonical. The transformations from the nonorthogonal PAO domains [i]PAO to the PC-OSV domains [i]PC‑OSV are described by matrices Qi. The OSVs are used to compute semi-canonical distant pair energies EOSV using multipole expansions for the integrals. The ij approximate pair energies are used to select distant pairs, using an energy threshold Tdist (for details, see Section 5.1 and ref 83). For the remaining nondistant pairs, OSV pair domains [ij]OSV = [i]PC−OSV ∪ [j]PC‑OSV are generated, orthogonalized, and made pseudo-canonical. The transformation from the nonorthogonal domains [ij]OSV to the pseudo-canonical domains [ij]PC‑OSV is denoted Wij. Then, the integrals (ai|bj), a, b ∈ [ij]PC‑OSV are computed using local density fitting techniques as described in refs 79 and 80. They are first evaluated in the PAO basis and then transformed to the PCOSV domains using the transformation matrices Qi and Wij. The OSV integrals are used to compute semi-canonical OSV amplitudes Tij, and from these OSV pair densities

l

= Frs +

−(ai|bi) , εai + εbi − 2fii

1 (T̃ ij †Tij + T̃ ijTij †), 1 + δij

ij

ij ij ̃ = 2Tab Tab − Tba

(29)

which are diagonalized to yield the OSV → PNO transformations Vij. The eigenvalues of Dij are the pair natural occupation numbers nija. PNO domains for nondistant pairs are selected using two thresholds TPNO and Ten. The PNOs for a pair ij are always processed in the order of decreasing occupation numbers nija. A PNO aij is included if either the pair occupation number nija is greater or equal to the threshold TPNO or the semi-canonical PNO pair energy EPNO is still smaller than Ten × ESC‑OSV . As ij ij demonstrated further below, the latter criterion is important to ensure that the PNO domains of weak pairs are sufficiently large. Finally, the PNO domains are made pseudo-canonical, using a transformation matrix Uij. The whole process can be summarized as follows:

(27)

where [SAO]μν = ⟨μ|ν⟩ is the AO overlap matrix. The PAO orbital domains [i]PAO are determined by center lists, and all PAOs arising from AOs at these centers are included. In the first step, primary center lists are selected on the basis of IBO partial charges200 (threshold TLMO). These lists are then extended by adding all atoms which are connected by ≤ IEXT bonds or are within a distance REXT (in a0) to any atom in the primary domain. The extended PAO domains are orthogonalized and made pseudo-canonical. The eigenvalues are denoted εia. Subsequently, the integrals (ri|si), r, s ∈ [i]PC‑PAO are

Qi

P

AO → [i]PAO → [i]PC‐OSV

(30)

[ij]OSV = [i]PC‐OSV ∪ [j]PC‐OSV

(31)

W ij

V ij

U ij

[ij]OSV ⎯→ ⎯ [ij]PC‐OSV → [ij]PNO → [ij]PC‐OSV 3656

(32)

DOI: 10.1021/acs.jctc.7b00554 J. Chem. Theory Comput. 2017, 13, 3650−3675

Article

Journal of Chemical Theory and Computation In each step of the transformation sequence PAO → OSV → PNO, the domains are reduced. For the VTZ-F12 basis,193 typical average pair domain sizes of large molecules with default settings are 600, 300, 30. The intermediate OSV step can optionally be skipped (as in the DLPNO methods of Riplinger and Neese101,104), but the advantage of it is that the PNO generation, which scales cubically with the domain size, is significantly faster. Also the transformation matrices Vij are smaller, which is important to minimize the local memory demands of each processor in parallel calculations. Timings for some typical cases can be found in ref 82. In the following, the dependence of the domain sizes on the PNO thresholds is demonstrated for the so-called Auamin molecule, which is shown in Figure 1. Figure 2 shows the

Figure 3. Auamin PNO domain sizes for various combinations of the thresholds TPNO and Ten.

pairs begin slowly to grow (Figure S2, SI). For a value of 0.99 (second row of Figure 3), many close and weak pair domains grow significantly, while the strong and distant pair domains are still hardly affected. For TPNO = 0.997, the weak and close pair domains become even larger than most strong pair domains (which are still not much affected). Overall, the average domain size increases almost by a factor of 2 when increasing Ten from 0.9 to 0.997. These findings show that even with an occupation number threshold of 10−8 the close and weak pair energies are significantly underestimated, unless the energy criterion is used. This affects the long-range interactions and can lead to significant errors of the reaction energy. However, the convergence of the absolute and relative energies with respect to the PNO domain thresholds can be much improved by domain corrections, as shown in the next section. 4.2. Domain Corrections. The PNO domain sizes significantly affect the computational cost of PNO-LCCSD calculations, in particular, the memory or disk space needed to store 4-external integrals (ab|cd), 3-external integrals (ab|ck), and the very many PNO-overlap matrices S[ij,kl]. On the other hand, as demonstrated below, very tight thresholds are needed to converge the relative energies to within ∼1 kJ mol−1. To a large extent, this is probably due to intramolecular basis set superposition effects (BSSE) on the reaction energies, which can only be recovered by using large domains. However, accurate results can also be obtained by first performing an LMP2 calculation with tight PNO domain thresholds and then the LCCSD calculation with smaller PNO domains. The difference of the LMP2 correlation energies obtained with large and small domains is then added to the LCCSD correlation energy. For this, we determine the large and small domains simultaneously using the semi-canonical OSV (or PAO) amplitudes. The initial small PNO domains are a subset of the larger ones. However, this is no longer true once the transformation to PC-PNOs is carried out. This must be done separately for the large and small domains, and the eigenvectors of the Fock matrices for pair ij are the columns of unitary

Figure 2. Pair energies of the Auamin molecule as a function of the orbital distance Rij. The red line corresponds to R−6 decay.

LMP2 pair energies as a function of the distances Rij = |rr − rj|, where ri and rj are the charge centers of the LMOs i and j, respectively. Each of the of 7503 points represents one pair. As expected, at long-range, the pair energies decay approximately as R−6 ij . Also shown are the numbers of strong, close, weak, and distant pairs using default thresholds. One can see that there are 4295 weak and distant pairs. The individual correlation energies of these pairs are small but add up to a significant contribution to the total correlation energy. As shown later, these pairs also have a significant effect on the reaction energy. Figure 3 shows the PNO domain sizes as a function of Rij. Again, each point represents one pair. The first row shows the domain sizes for TPNO = 10−7 (left) and TPNO = 10−8 (right) without the energy criterion (Ten = 0). In these calculations, all pairs have been fully optimized by PNO-LMP2 without multipole approximations. The domains obtained with TPNO = 10−8 are nearly twice as large as the ones with TPNO = 10−7, and the maximum domain size increases from about 120 to 220. The domain sizes quickly decrease with increasing distance Rij (decreasing pair energies), and many distant pairs vanish (zero domain size). Also, the domains of weak pairs are rather small. This leads to an underestimation of the weak and distant pair energies. In the second and third rows of Figure 3, the energy completeness criterion is switched on (see also Figure S1 in the SI for additional values of Ten). The distant pair domains now remain finite, though very small. As has been shown in ref 79, these very small domains are sufficient to recover a very large fraction of the long-range dispersion energies. Up to a threshold of about TPNO = 0.95 not much happens for strong and close pairs, but the domains of weak 3657

DOI: 10.1021/acs.jctc.7b00554 J. Chem. Theory Comput. 2017, 13, 3650−3675

Article

Journal of Chemical Theory and Computation

Figure 4. PNO-LCCSD reaction energies of the Isomer4 (left) and Auamin (right) reactions as a function of the PNO threshold TPNO with and without LMP2 domain corrections and for various thresholds Ten (given in parentheses in the caption). Default PAO domains (IEXT = 2) are used.

Figure 5. PNO-LMP2 and PNO-LMP2-F12 reaction energies of the Isomer4 (left) and Auamin (right) reactions as a function of the PAO domain parameter IEXT, using REXT = (2 × IEXT + 1) a0. Default PNO domains are used.

matrices Uijlarge and Uijsmall, respectively. The transformation from the large to the small pseudo-canonical domains is then ij ij Uij† largeUsmall, where only the vectors of Ularge that belong to the small domains are used. After carrying out the LMP2 calculation with the large domains, the integrals, amplitudes, as well as the OSV-PNO transformation matrices, are transformed from the large to the small domains. The overlap matrices are recomputed, and the LMP2 is repeated with the smaller domains. The extra cost is quite small since only very few iterations are needed. Then, the following domain corrections are evaluated: sc sc sc ΔE LMP2(large) = EOSV ‐LMP2 − E PNO‐LMP2(large)

ΔE LMP2(small) = E LMP2(large) − E LMP2(small) +

Auamin and Isomer4 reactions are shown in Figure 4 as a function of TPNO using various thresholds Ten. It is found that without the energy criterion (Ten = 0), the convergence is rather slow and similar for both molecules. For a threshold TPNO = 10−7, only about 99.5% of the estimated correlation energy for full domains is recovered (Figure S2, SI), and the domain error of the reaction energy of Auamin is still larger than 10 kJ mol−1. With increasing energy threshold, the convergence is strongly improved. With Ten = 0.99, the domain-corrected values are very close to the estimated limit, even for the largest threshold, TPNO = 10−6. We have therefore chosen TPNO = 10−7 and Ten = 0.99 as default values for our PNO-LCCSD program. These corrections are very much in the spirit of composite correlation approaches, as used, for example, in G3203 or W3204 theories. Here, the corrections affect only domain errors, and no recomputation of any integrals is required. It is also possible to carry out the PNO-LCCSD calculation using smaller PAO domains and/or smaller basis sets. Then the correction ELCCSD(small) − ELMP2(small) is added to an accurate PNOLMP2 energy. We found that this also works well, but then two independent calculations have to be carried out. In Figure S2 of the SI, we also compare the above approach with the simpler semi-canonical domain corrections SC ΔELMP2(small) as used in the DLPNO-CCSD program of Riplinger and Neese. In this case, the correction is computed as in eq 33 but using the small PNO domains, which are used in the PNO-LCCSD calculation. The correction ΔESC LMP2(small) is somewhat smaller than ΔELMP2(small) and underestimates the

(33) sc ΔE LMP2(large)

(34)

ΔEscLMP2(large)

is a semi-canonical domain correction for the large domains, which amounts to the difference of the semi-canonical OSV-LMP2 and PNO-LMP2 energies. Similar domain corrections are used in the DLPNO methods of Riplinger and Neese.101,104 In our program, they are obtained for free since the semi-canonical energies are computed anyway in the domain selection process (cf. Section 4.1). If the correction ΔELMP2(small) is added to ELMP2(small), the domain-corrected energy of the large PNO-LMP2 is exactly recovered. The correction ΔELMP2(small) is also added to the final LCCSD energy, which is computed with the small PNO domains. The effects of the PNO domain thresholds and domain corrections on the PNO-LCCSD reaction energies of the 3658

DOI: 10.1021/acs.jctc.7b00554 J. Chem. Theory Comput. 2017, 13, 3650−3675

Article

Journal of Chemical Theory and Computation

Figure 6. PNO-LCCSD reaction energies of the Isomer4 (left) and Auamin (right) reactions as a function of the threshold Tcutpairs. Pairs with energies below this threshold are either neglected (black) or approximated by LMP2 (violet), SCS-LMP2 (blue), or by our new close and weak pair approximations (red). In the latter case, Tclose = Tcutpairs, Tweak = Tclose/10. All other pairs are fully treated by LCCSD.

Table 2. Definition of Pair Classes, Thresholds, and Their Default Values pair class

threshold

strong pairs close pairs weak pairs distant pairs very distant pairs

Eij > Tclose Tclose ≥ Eij > Tweak Tweak ≥ Eij > Tdist Tdist ≥ Eij > Tvdist Tvdist ≥ Eij

default valuea (Eh)

treatment

Tclose = 10−4 Tweak = 10−5 Tdist = 10−6 Tvdist = 10−7

full LCCSD approximate LCCSD linearized LCCSD iterative LMP2 multipole approximationb noniterative SC-LMP2 multipole approximationb

a

Thresholds Tclose and Tweak are applied using the converged LMP2 pair energies, while Tdist and Tvdist are applied using semi-canonical LMP2 pair energies obtained with the noniterative multipole approximation, p = 3. bUsing SCS-LMP2 distant pair energies computed with the iterative multipole approximation, p = 3 (see ref 79).

domain error of PNO-LCCSD, while ΔELMP2(small) overestimates it, in particular, for small PNO domains. However, for tight thresholds (TPNO ≤ 3 × 10−8), the full LMP2 correction yields in both cases faster convergence to the limit. The overshooting of the full LMP2 correction for very small domains is particularly strong for Isomer4. This is probably due to the strong overestimation of long-range dispersion interactions by LMP2, which have a large effect on the isomerization energy. The semi-canonical correction then appears to be better, but this is probably due to error compensation. Obviously, the domain correction and overall accuracy also depends on the PAO domains. In fact, the finite PAO domain sizes are probably causing the largest errors in PNO-LMP2 and PNO-LCCSD calculations. This is demonstrated for our two test reactions in Figure 5. The convergence of the relative energies with increasing PAO domain sizes is frustratingly slow. Most likely, this is due to intramolecular BSSE effects, which are partly removed in local correlation methods. It is well known that this improves the accuracy of intermolecular interaction energies57,205 [without counterpoise (CP) correction], but unfortunately, this is not always true for reaction energies. As demonstrated in Figure 5, this problem is almost entirely removed if F12 terms are included, which strongly reduce the basis set incompleteness errors and consequently the BSSE. Furthermore, they also strongly reduce the domain errors. The importance of BSSE effects will be discussed more extensively and for more examples in our forthcoming paper on PNO-LCCSD-F12.184 Finally, we note that the convergence of the results with the PNO domain thresholds is also much improved when F12 terms are included. In this case, it is again possible to apply domain corrections to the LCCSD-F12 energy, using LMP2-

F12 instead of LMP2. But, these corrections are very much smaller than the ones shown in Figure 4. This will be demonstrated in our paper on PNO-LCCSD-F12.184

5. PAIR APPROXIMATIONS The computational cost of local coupled-cluster calculations depends strongly on the number of pairs that are fully treated by LCCSD. Pulay44 has therefore proposed to exploit the fast decay of the pair energies as a function of the distances Rij (cf. Figure 2) and to treat “weak” pairs with small energies (or large Rij) more approximately by LMP2. This has been adopted by our group in early PAO-LCCSD methods42,50,61 and also by Neese and co-workers in their DLPNO-CCSD methods.101,104 However, in the course of the current work, we found that this approximation can sometimes lead to large errors since LMP2 strongly overestimates long-range dispersion interactions. To demonstrate this, we consider the reaction energies of the Auamin and Isomer4 reactions as a function of a threshold Tcutpairs. Pairs with energies below this threshold are considered as weak pairs and are either neglected or approximated by LMP2 or SCS-LMP2 (this includes distant pairs). The remaining “strong pairs” are fully treated by PNO-LCCSD. The results are shown in Figure 6. It is shown that the many weak pairs with individually small energy contributions have a large overall effect on the reaction energies. If they are entirely neglected, the reaction energies of both systems are strongly underestimated since the weak pair correlation contributions to the reactant molecules are larger than those of the products. On the other hand, if the weak pairs are treated by LMP2, their correlation contribution is overestimated, leading to too large reaction energies. Better convergence is obtained with spincomponent-scaled (SCS)-LMP2.206 But even this approximation converges rather slowly with increasing Tcutpairs and is not 3659

DOI: 10.1021/acs.jctc.7b00554 J. Chem. Theory Comput. 2017, 13, 3650−3675

Article

Journal of Chemical Theory and Computation Table 3. Truncation of Summations in PNO-LCCSD Equations (see text) pair classes of P included in the summations linear terms ij

R

strong close weak a

e

−RP

decay

strong, close strong, close strong, close

nonlinear terms R−6 P

decay

strong, close, weak strong, close, weak strong, close, weak

−RP

e

decay

strong, close strong a

R−6 P decay strong, close, weak strong, close a

Not included for weak pair residuals.

sufficiently accurate for pairs with Eij > 10−6 Eh. A very similar qualitative behavior has been found for several other reactions. We have therefore introduced a much more accurate hierarchy of pair approximations, which treat strong, close, weak, distant, and very distant pairs at different levels. A summary of these classes and the corresponding default thresholds is shown in Table 2. The individual approximations are described in the following section. 5.1. Distant Pair Approximation. The integrals (ai|bj) for distant pairs are approximated by multipole approximations55,56,83 using pseudo-canonical OSV domains a ∈ [i]PC‑OSV, b ∈ [j]PC‑OSV. Exchange and charge transfer contributions are neglected (NOEX approximation83). In the current work, we include all contributions up to the order p = 3, which means all contributions to the integrals that decay with R−(p+2) or slower. These are the dipole−dipole (p = 1), dipole− ij quadrupole (p = 2), and quadrupole−quadrupole, as well as dipole−octupole interactions (p = 3). For details and explicit expressions, we refer to ref 83. The approximated integrals are used to calculate semi-canonical distant pair energies Eijdist (noniterative multipole approximation). Very distant pairs with energies Edist ij ≤ Tvdist are dropped, and their semi-canonical pair energies are added later to the final LMP2 and LCCSD correlation energies. For pairs with Tdist ≥ Edist > Tvdist, two ij different approximations are possible: either they are treated noniteratively by the semi-canonical approximation as the very distant pairs (in this case, there is no difference between distant and very distant pairs), or their amplitudes and pair energies are optimized iteratively together with all other pairs using the approximate integrals. For this purpose, semi-canonical amplitudes Tijab, a ∈ [i]PC‑OSV, b ∈ [j]PC‑OSV are used to generate distant pair PNOs (dPNOs, cf. ref 83). Due to the latter restriction of the indices a and b, the distant pair density matrices Dij are block diagonal, and separate PNOs are obtained for i and j. A completeness criterion Tdist PNO = 0.997 is used to determine the dPNO domains [i]ijdPNO and [j]ijdPNO. All PNOs a ∈ [i]ijdPNO and b ∈ [j]ijdPNO are kept, which are needed dist dist to satisfy EPNO ≥ Tdist ij PNO × Eij , where Eij is the semi-canonical distant pair energy in the PC-OSV domains. The dPNOs a and b are added stepwise in the order of decreasing energy contributions. Note that the domains [i]ijdPNO and [j]ijdPNO are pair specific. Typically, each of them contains only 3−4 dPNOs. The integrals are transformed to the dPNO basis and stored for later use in the iterative LMP2. Due to the very small domains, the storage requirements for these integrals are small, despite the large number of distant pairs. It has been demonstrated in ref 83 that the iterative multipole approximation is much more accurate than the noniterative one, and it is therefore used throughout this paper. The dominant part of the remaining error stems from the neglect of charge transfer and exchange terms (NOEX approximation) and not from the multipole approximation.

More significant errors can arise if only the dipole−dipole approximation (p = 1) is used as in other works.101,103 As shown in Figure 6, for distant pairs the SCS-LMP2 approximation is more accurate than LMP2. Therefore, SCSLMP2 distant pair energies are used throughout this work (but note that the distant pair selection uses LMP2 distant pair energies). Due to the NOEX approximation, the SCS-LMP2 distant pair energies are simply obtained by multiplying the LMP2 ones by s = ( f∥ + f⊥)/2, where f∥ and f⊥ are Grimme’s scaling factors206 for parallel and antiparallel spin contributions, respectively. Using the recommended values of f∥ = 6/5, f⊥ = 1/ 3, a scaling factor of s = 23/30 = 0.7666 is obtained. It should be noted that this is closely related to the so-called scaled opposite spin (SOS)-MP2 approximation,207 in which case the scaling factor would be s = cSOS/2 = 0.65. 5.2. Weak, Close, and Strong Pair Approximations. One can classify the contribution to the LCCSD correlation energy of each individual summation in the residuals Rij and ri according to its decay with respect to the distances Rik, Rjk, Ril, Rjl, Rij, and Rkl where k and/or l denote summation indices in the residual equations (cf. Appendix). The decay of a particular term with respect to these distances can be different, and the distance which causes the fastest decay is used to classify the term. In the following discussion this distance is simply denoted as RP. Apart from the decay properties of a term, the expected magnitude of its contribution is an important consideration. A demonstration how to elaborate the contributions of the individual terms of the residuals Rij and ri to the correlation energy is given in the SI. The long-range decay of a term with respect to Rp can be either exponential (if dependent on the overlap of two LMOs) or with R−n (1 ≤ n ≤ 6) (if dependent on multipole P interactions). It turns out that all terms in the residuals that −2 −4 −5 decay with R−1 P , RP , RP , and RP in the summation indices cancel at long-range, and only three types of decay occur: fast −3 exponential decay, medium fast R−6 P decay, and slow RP decay. The cancellations are related to particle−hole symmetries: in a diagrammatic representation, the terms that cancel differ only in the direction of the arrows in the related diagrams. This has already been worked out for many LCCSD terms by Schütz et al.,181,182 and the resulting approximations for PNO-LCCD have been described in detail in ref 81. Here, they are extended and refined for the LCCSD case. With the exception of one high-order term in the singles residual (the second term in eq 65, entering only in MP6), the slowly decaying R−3 P terms all involve single excitations and can be fully included through the singles Fock-like contribution F̃ (cf. eq 26).61,74,100,182 This matrix is computed in each iteration in the AO basis and then transformed to the LMO/PNO basis. This proceeds in close analogy to the computation of the twoelectron contributions in a Fock matrix, and local density fitting approximations can be used to achieve linear scaling for the 3660

DOI: 10.1021/acs.jctc.7b00554 J. Chem. Theory Comput. 2017, 13, 3650−3675

Article

Journal of Chemical Theory and Computation

Figure 7. Reaction energies of the Isomer4 (left) and Auamin (right) reactions as a function of the pair selection thresholds Tclose and Tweak with default domains and default projection approximations. The red curves correspond to those in Figure 6.

exchange contribution.208 The Coulomb contribution scales quadratically, and the transformation of F̃ from the AO into the PAO basis even scales cubically. Earlier attempts to approximate these contributions in a linear scaling fashion showed that this is a very sensitive approximation: In some cases, the errors were small, and it worked well. But in others, the LCCSD iterations did not even converge.61,74 We therefore decided to sacrifice linear scaling for accuracy and robustness. The additional effort in each LCCSD iteration is less than for a Hartree−Fock iteration. The general strategy for the pair approximations used in our method is summarized in Table 3. In the first column of this table, the pair class of the residuals Rij is given, while the remaining columns refer to the classes of pairs P in the summations of these residuals. As outlined above, the decay of each individual term depends on a distance RP, where P is a pair index ik, il, jk, jl, or kl, and k, l are the summation indices. The summations in the residuals are restricted by the condition that the index P belongs to the listed pair classes. For the selection of the pair classes we use the converged PNO-LMP2 pair energies (Table 2). If the LCCSD is carried out with smaller PNO domains, then the pair energies of the PNO-LMP2 calculation with large domains (cf. eq 34) are used to determine the pair classes. The strong pair residuals are based on the full LCCSD equations, but the summations in the terms are restricted to certain pair classes as indicated in Table 3. The singles residuals ri are treated similarly to the strong pair residuals Rii, which is justified by considerations given in the SI. All details of the summation restrictions in the strong pair and singles residuals are given in the Appendix; those of the close and weak pair residuals are given in the SI. The general goal of the close and weak pair approximations is to simplify the full LCCSD equations without sacrificing much accuracy. The close pair residuals include all terms that formally contain contributions to the MP4 correlation energy with exception of the terms whose summed contributions decay as R−9 ij , as described in detail in ref 81 and references therein. Two higher-order groups of terms are also included into the close pair residuals, even though they only contribute in the order of MP5 or higher to the correlation energy. One of the groups contains dressed Fock matrices. Their magnitudes can be significant, while the computational overhead is minimal. Also, we discovered that the contributions containing the 3-external integral contractions K(Ekj) (cf. eq 11) and their related 1external counterparts as defined in eq 77 give significant

contributions to the close pair residuals and are therefore included. This can be rationalized by the fact that if the orbital k is located spatially in between the orbitals i and j, it acts as a “bridging” orbital that slows down the decay of the overall contribution with the distance Rij. In the weak pair residuals, we use the same approximations as for close pairs, and in addition, all nonlinear terms in the LCCSD equations are neglected, leading to the coupled electron pair approximation (CEPA).169,170 The detailed summation restrictions of the close and weak pair residual can be found in the SI. If Tweak ≤ Tdist, the weak pair approximation is not used. Figure 7 demonstrates the accuracy of these approximations for the Auamin and Isomer4 reactions on a finer scale than in Figure 6. The figure shows the reaction energies as a function of the threshold Tclose. In order to minimize the number of independent parameters, we have chosen Tweak = Tclose/10. In addition, the results without weak pair approximation (Tweak = 0) are also shown. For the largest threshold (Tclose = 10−3Eh), the errors of the Isomer4 and Auamin reaction energies caused by the close pair approximation amount to 0.8 and 0.6 kcal mol−1, and the additional weak pair errors are 0.03 and 0.2 kcal mol−1, respectively. The total errors quickly decrease to negligible values for smaller thresholds Tclose. For our default threshold Tclose = 10−4Eh, they are reduced to 0.03 and 0.1 kcal mol−1, respectively. This should be compared with the much larger errors that arise if the close and weak pairs are treated by LMP2 or SCS-LMP2 as shown in Figure 6. The improved treatment of close and weak pairs is particularly important if strong dispersion interactions are present. An extreme example is the interaction energy of the coronene dimer, which is discussed in Section 9. 5.3. Case Study: Benzene Dimer. As a further test and demonstration of the improved pair approximations, we apply in this section our method to the stacked and T-shaped forms of the benzene dimer, with geometries taken from the S22 benchmark set.209 For these calculations, we have chosen the aVTZ orbital basis set. Note that these calculations do not include F12 terms and triples corrections. The total interaction energies may therefore have large errors, but here we are only interested in the additional errors caused by the pair approximations. Figure 8 shows the computed PNO-LCCSD interaction energies of the stacked and T-shaped structures as a function of the pair selection threshold Tclose using different approximations for the pairs with energies below this threshold. The reference 3661

DOI: 10.1021/acs.jctc.7b00554 J. Chem. Theory Comput. 2017, 13, 3650−3675

Article

Journal of Chemical Theory and Computation

distances using a restart option. To test and demonstrate this effect, we have computed the interaction energy of the benzene dimer for the sandwich structure (without horizontal displacement of the two molecules) as a function of the distance between the two molecular planes. The upper panel of Figure 9

Figure 8. Dependence of the interaction energy of the benzene dimer on the close and weak pair thresholds using different pair approximations (see text). Upper panel: stacked structure. Lower panel: T-shaped structure. The open symbols (arbitrarily placed at the threshold 10−3 Eh) show the interaction energies if only the intermolecular pairs are treated by the respective approximations. Figure 9. Intermolecular PNO-LCCSD interaction energy of the benzene dimer sandwich structure as a function of the distance R between the benzene planes. The two molecules are parallel to each other and not horizontally displaced. Upper panel: potential without pair approximations (red) and with close pair approximation for intermolecular pairs (blue). Lower panel: errors relative to the PNOLCCSD (strong) potential caused by the close pair approximation for Tclose = 3 × 10−4, Tweak = 0. Red curve: close pair list selected at R = 3.5 a0 and then frozen. Blue curve: close pair list determined at R = 8.0 a0 and then frozen. Black curve: close pairs selected at each R.

is a calculation without pair approximations, which corresponds to a threshold Tclose = 1 μEh. If only the close-pair CCSD approximation is applied (red curves), the errors for the largest threshold (1 mEh) amount only to 0.07 kcal mol−1 for both structures. If only the weak pair CEPA approximation is used (violet curves, Tweak = Tclose), the errors increase to about 0.29 kcal mol−1, still being quite similar for both structures. However, if the pairs with energies below the threshold of 1 mEh are treated by LMP2 (black curves, no multipole approximations), the errors increase drastically to 3.35 and 1.48 kcal mol−1 for the stacked and T-shaped structures, respectively. This is due to the well-known overshooting of the MP2 approximation for dispersion interactions. These errors are reduced to 1.63/0.62 kcal mol−1 by the SCS-MP2 variant (green curves). The blue curves show the results for the case that both the close and weak pair approximations are used, with Tclose = Tweak/10. In this case, the errors become negligibly small for Tclose ≤ 0.3 mEh. The open symbols (placed arbitrarily at the threshold of 1 mEh) correspond to results obtained with all intramolecular pairs fully treated by PNO-LCCSD, while all intermolecular pairs are treated by the respective approximation. A general problem associated with the assignment of pairs to different classes is that this assignment depends on the geometry, which can lead to small steps on potential energy surfaces. This problem can be avoided by selecting the pair classes at one distance and then keeping them fixed for all other

shows the computed PNO-LCCSD potential energy curves with (blue) and without (red) close pair approximations. In the latter case, the close pair list has been selected at the longest distance (8 a0) using Tclose = 3 × 10−4 Eh and then kept fixed. To keep things simple, the weak and distant pair approximations have been disabled (Tweak = 0). This means that all intermolecular pairs (and some others) are treated by the close pair approximation. The error caused by this approximation is shown in the lower panel of Figure 9 (blue curve). As expected, it increases with decreasing R but amounts to only 0.11 kcal mol−1 at R = 3.5 a0. An alternative approach is to select the close pairs at a short distance, e.g., R = 3.5 a0. In this case, some intermolecular pairs are strong pairs, and consequently, the error is much smaller (red curve). In fact, this significant improvement is caused by only nine additional pairs; in the latter calculation, there are 207 strong pairs and in the former case 198. 3662

DOI: 10.1021/acs.jctc.7b00554 J. Chem. Theory Comput. 2017, 13, 3650−3675

Article

Journal of Chemical Theory and Computation In both cases, the potentials are smooth. However, if the close pairs are selected at each individual distance R, small steps occur on the potential where pairs change their class. Using the chosen threshold Tclose = 3 × 10−4 Eh, this happens near the minimum of the potential. The steps are small (of the order 0.02 kcal mol−1) but could cause trouble when fitting the potential. It should be noted that with our default threshold Tclose = 1 × 10−4 Eh, the steps are an order of magnitude smaller and hardly visible on the scale of the figure (Figure S4, SI). Small discontinuities can also be caused by variations of the PAO or PNO domain sizes as a function of geometry, but these are not visible on the scale of the figure. Microscopically smooth potentials could be obtained by freezing the domains, as described earlier.70

The advantage of this approximation is that in the density fitting less integrals of the (A|rs) type are required. This greatly simplifies and speeds up the evaluation of the integrals Jkj, and the memory required to hold the 3-index integrals is significantly reduced. However, since these terms are linear in the doubles amplitudes, the error is larger than that caused by applying the projection approximation only to the nonlinear terms. We implemented an option project_j to control this projection, which can take the values off (no projection), weak (only terms in weak pair residuals projected), close (terms in weak and close pair residuals projected), or all (terms in all residuals projected). The quality of this approximation depends on the PNO domain sizes. The errors are particularly sensitive to the energy criterion Ten, which mainly affects the close and weak pair domains. This can be rationalized by the fact that without the energy criterion the close and weak pair domains [ij] decrease with Rij (Figure 3) and then do not overlap sufficiently with the pair domains [ik]. However, with larger domains (e.g., Ten ≥ 0.99), the errors become very small even if the approximation is used for all pairs. This effect is demonstrated in Figure 10 for the Auamin reaction. The figure also shows that the errors are much smaller than the domain correction described in Section 4.2.

6. PROJECTION APPROXIMATIONS A major problem in PNO-LCCSD is the large number of integral blocks that have to be computed and stored. A typical term in the PNO-LCCSD residuals is (summations over k, l are implied) R [ijij , ij] = ... T̅ [ikij , ik]K [klik , jl]T̅ [ljjl , ij]

(35)

Since the PNOs are different for each pair, one has to compute and store a huge number of integral blocks Kkl[ik, jl]. In order to reduce the number of these blocks, Neese et al.96,97,101 proposed projection approximations, in which an orbital in a particular domain, e.g., |aik⟩, is projected onto another domain, e.g., [kl]. This leads to the expansion |aik ⟩ ≈ |a ik̃ ⟩ =



|c kl⟩⟨c kl|aik ⟩ =

c ∈ [kl]



|c kl⟩[S[kl , ik]]ca

c ∈ [kl]

(36)

It is easily shown that this also minimizes the mean square deviation ⟨aik − ãik|aik − ãik⟩. This leads to the approximation K [ijkl , mn] ≈ S[kl , ij]K [ijij , ij]S[ij , mn]

(37)

The term in eq 35 can then be written as R [ijij , ij] ≈ ... T̅ [ikij , kl]K [klkl , kl]T̅ [ljkl , ij]

Figure 10. Dependence of the Auamin reaction energy on the energy threshold Ten and project_j. IEXT = 2, TPNO = 10−7 (see text). The dashed line corresponds to the result with large domains, TPNO = 10−8, Ten = 0.997. Results are shown with and without the domain correction described in Section 4.2 (eq 34).

(38)

The projection of the amplitudes (or integrals, whatever is more convenient) is done on the fly. The big advantage is that only the Kkl[kl, kl] blocks have to be precomputed and stored. The number of these integrals is drastically smaller than the number of all integrals Kkl[ik, lj], but more matrix multiplications and additional S-matrix blocks are needed. Overall, the reduction of data is very significant, and the CPU time is comparable (due to savings in the integral transformation). The projection approximation might look rather crude since the domain [kl] only partly spans the domains [ik] and [jl]. However, we have implemented this as an optional approximation (denoted project_k) for all contributions of the integrals Kkl in nonlinear terms of the LCCSD equations and found that the errors are surprisingly small (see below). We also tested a slightly different projection approximation for the contributions of the Coulomb integrals Jkj in the intermediates Yijk and Zijk:

In order to reduce the number of 3- and 4-external integrals, we introduce further projection approximations. First of all, we approximate products of singles amplitudes by eq 15 in all terms where they occur (see Appendix). We tested a few difficult molecules in the FH test set with the dressed form of PNO-LCCSD equations and found the errors caused by this projection less than our default energy convergence threshold of 1 μEh. The linear contribution of the singles amplitudes to the doubles residual can use the approximation K(Eij)[ij , ij] ≈ [K(E̅ ij)[ij , ij]]ab =

c ∈ [ij]

(aiji|bijc ij)[ t ̅[jij]]c (40)

where the contractions of 3-external integrals with singles amplitudes in eq 11 are approximated using the definition of the projected singles in eq 14. This should be a rather good approximation since in the PAO basis the singles domain [j]PAO

R[ijij , ij] = ... T̅ [kiij , ki]J[kjki , ij] ≈ ... T̅ [kiij , ij]J[kjij , ij]



(39) 3663

DOI: 10.1021/acs.jctc.7b00554 J. Chem. Theory Comput. 2017, 13, 3650−3675

Article

Journal of Chemical Theory and Computation

Table 4. Statistical Analysis (in kcal mol−1) of Errors Caused by Projection Approximations on Reaction Energies of FH Set without Pair Approximationsa Ten = 0.99 (default)

project_

Ten = 0.90

j

k

je

ke

s

MAXb

RMSDc

MADd

MAXb

RMSDc

MADd

off off all all all

on on on on on

on on on on on

1 2 1 2 2

off off off off all

0.01 0.09 0.07 0.16 0.15

0.00 0.03 0.02 0.03 0.03

0.00 0.02 0.02 0.02 0.02

0.01 0.10 0.16 0.14 0.26

0.00 0.03 0.05 0.04 0.06

0.00 0.02 0.04 0.02 0.04

a

Default domains, except for Ten, which is shown in the table, have been used. The reference values have been obtained with all projections turned off, except for the one shown in eq 76. bMaximum deviation from the reference values. cRoot-mean-square deviation. dMean absolute deviation.

The effect of this approximation is very small, but the number of 3-external integrals (aikk |bijcij) in these terms still quite large. (iii) In addition, project both [j] and [ik] to [ij] (project_ke = 2):

is contained in the pair domain [ij]PAO (which is the union of [i]PAO and [j]PAO). It is not guaranteed that [j]PNO is contained in [ij]PNO, but the error will become smaller with decreasing PNO threshold and zero if TPNO = 0. This approximation is controlled by the option project_s, whose possible values are the same as project_j. When this projection is enabled for a pair ik, we also approximate a term in the singles residuals ri[ii] with r[i ii] + = K(T̃ [ikik , ik])[kii]



S[ii , ik]K(T̃ [ikik , ik])[kik]

G[ijij , ij] + ≈ T̅ [ikij , ij]K(E̅ kj)[ij , ij]

This leads to the smallest number of integrals and has also been used by Riplinger and Neese.104 The integral set is the same as needed for the contributions of the projected J(Ekj) integrals (cf. eq 42). In Table 4, we show the error statistics of various projection approximations of the FH test set with the VTZ-F12 basis set and two values of Ten. In these tests, pair approximations were turned off, and the reference values were computed with the dressed PNO-LCCSD equations, enabling only the single unavoidable projection approximation shown in eq 76. This should have a negligible effect. We can see from the first row of Table 4 that the combined errors of project_k, project_je, and project_ke = 1 are negligible for both domain sizes. The errors of other projections are somewhat larger and overall increase when smaller PNO domains are used. Still, the RMSD projection errors are well within 0.1 kcal mol−1 in all cases. In Table 5 the computed reaction energies for some large molecular systems are shown using our default domain sizes and various combinations of projections. We found that the approximations are nearly additive, and therefore, only a selection of the many possible combinations is presented, in which the various projections are successively turned on.

(41)

since otherwise the projection would not reduce the number of required 3-external integrals. For the terms involving the J(Ekj) contractions, we use G[ijij , ij] + = T̅ [ikij , ik]J(Ekj)[ik , ij] ≈ T̅ [ikij , ij]J(E̅ kj)[ij ,ij]

(42)

where in analogy to eq 40 [J(E̅ kj)[ik , ij]]ab =



(aik bij|kc ij)[ t ̅ [jij]]c

c ∈ [ij]

(43)

This has the advantage that only integrals (aijbij|cijk) involving a single PNO domain are needed. The index k is restricted by the condition that it needs to be a strong pair with either i or j (see Appendix). The effect of these approximations is small. They can be enabled using option project_je = on. For consistency, when project_je = on, we also project the 1-external contribution in eq 75 by

k [lkjik]t ̅ [l †ij] ≈ S[ik , ij]k [lkjij]t ̅ [l †ij]

(44)

Similar approximations can be applied for the corresponding exchange contributions G[ijij , ij] + = T̅ [ikij , ik]K(Ekj)[ik , ij]

Table 5. Effect of Projection Approximations on Reaction Energies of Some Large Systems (in kcal mol−1) Using Default Domains

(45)

In this case three different projection approximations have been implemented and tested: (i) No projection, i.e., use eq 45 (project_ke = 0). The number of the required 3-external integrals (aikk|bjjcij) is large, especially since they are also included into the close pair residuals. Their storage can be avoided by recomputing the contractions K(Ekj)[ik, ij] in each iteration by a partial integral transformation, similar to the integrals Kkj[ik, ij]. However, this is rather expensive in terms of CPU time. (ii) Only project [jj] to [ij] as defined in eq 40 (project_ke = 1): G[ijij , ij] + ≈ T̅ [ikij , ik]K(E̅ kj)[ik , ij]

(47)

project_

(46)

a

3664

j

k

je

ke

s

off off weak close all all all all all all

on on on on on on on on on on

off on on on on on on on on on

0 1 1 1 1 0 1 2 2 2

off weak weak weak weak weak weak weak close all

Isomer4a Aureacta 62.84 62.84 62.84 62.82 62.82 62.81 62.82 62.91 62.89 62.84

42.34 42.33 42.32 42.29 42.25 42.24 42.25 42.27 42.25 42.20

Coronene dimerb 14.09 14.06 14.04 14.01 13.95 14.00 13.95 13.94 13.92 13.93

Basis VTZ-F12. bBasis aVTZ, without counterpoise correction. DOI: 10.1021/acs.jctc.7b00554 J. Chem. Theory Comput. 2017, 13, 3650−3675

Article

Journal of Chemical Theory and Computation

pair approximations. A similar table showing the errors relative to these reference values can be found in the SI. We believe that for the calculations with large domains and Tclose = 10−5Eh the remaining errors caused by the PNO, pair, and projection approximations are negligible (