Optimization of the Linear-Scaling Local Natural Orbital CCSD (T

orbital CCSD(T) method: improved algorithm and benchmark .... local methods mostly benefit from the use of LMO pair specific pair natural orbital (PNO...
0 downloads 0 Views 2MB Size
Subscriber access provided by University of Massachusetts Amherst Libraries

Quantum Electronic Structure

Optimization of the linear-scaling local natural orbital CCSD(T) method: improved algorithm and benchmark applications Péter R. Nagy, Gyula Samu, and Mihaly Kallay J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.8b00442 • Publication Date (Web): 02 Jul 2018 Downloaded from http://pubs.acs.org on July 4, 2018

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Optimization of the linear-scaling local natural orbital CCSD(T) method: improved algorithm and benchmark applications P´eter R. Nagy,∗ Gyula Samu, and Mih´aly K´allay∗ MTA-BME Lend¨ ulet Quantum Chemistry Research Group, Department of Physical Chemistry and Materials Science, Budapest University of Technology and Economics, H-1521 Budapest, P.O.Box 91, Hungary E-mail: [email protected]; [email protected]

Abstract An optimized implementation of the local natural orbital (LNO) coupled-cluster (CC) with single-, double,- and perturbative triple excitations [LNO-CCSD(T)] method is presented. The integral-direct, in-core, highly efficient domain construction technique of our local second-order Møller–Plesset (LMP2) scheme is extended to the CC level. The resulting scheme, which is also suitable for general-order LNO-CC calculations, inherits the beneficial properties of the LMP2 approach, such as the asymptotically linear-scaling operation count, the asymptotically constant data storage requirement, and the completely independent domain calculations. In addition to integrating our recent redundancy-free LMP2 and Laplace-transformed (T) algorithms with the LNO-CCSD(T) code, the memory demand, the domain and LNO construction, the auxiliary basis compression, and the previously rate-determining two-external integral ∗

To whom correspondence should be addressed

1

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

transformation have been significantly improved. The accuracy of all the approximations is carefully examined on medium-sized to large systems to determine reasonably tight default truncation thresholds. Our benchmark calculations, performed on molecules of up to 63 atoms, show that the optimized method with the default settings provides average correlation and reaction energy errors of less than 0.07% and 0.34 kcal/mol, respectively, compared to the canonical CCSD(T) reference. The efficiency of the present LNO-CCSD(T) implementation is demonstrated on realistic, three-dimensional examples. Using the new code an LNO-CCSD(T) correlation energy calculation with a triple-ζ basis set is feasible on a single processor for a protein molecule including 2380 atoms and more than 44000 atomic orbitals.

1

Introduction

Wave function based quantum chemical approaches, especially the coupled-cluster (CC) hierarchy of methods, 1 exhibit many features desired for the accurate modeling of molecular systems. The CC methods are systematically improvable, size-extensive, and their extensions are also available for various molecular properties and excites states. 2,3 For single reference cases the CC model with single and double excitations (CCSD) augmented with perturbative triples correction [CCSD(T)], 4 is often considered as the “gold standard” of quantum chemistry since assuming a sufficiently complete atomic orbital (AO) basis, chemical accuracy can be achieved with CCSD(T). Although higher-order CC methods 5,6 are available and potentially more accurate, they are in turn drastically more demanding. However, even the use of the recent, highly-optimized conventional implementations of the most commonly employed CCSD and CCSD(T) methods 7–22 is limited by the steep, sixth and seventh power scaling computational requirements. Consequently, molecular systems above 20–30 atoms and with at least triple-ζ quality AO bases can only be studied via reduced-cost, approximate CC approaches. Local correlation methods, exploiting the relatively rapid decay of the electron correlation, have become 2

ACS Paragon Plus Environment

Page 2 of 73

Page 3 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

particularly effective, affordable, and still accurate tools to study such systems. The extensive and fast-growing literature of local methods have been thoroughly summarized in our previous reports 23–29 and in recent reviews, 30–33 here we only focus on the introduction of the closely related techniques. Historically, the development of local correlation methods followed two paths. The socalled fragmentation-based approaches 23,24,31–48 decompose the corresponding equations or often even the actual physical system into separately treatable parts and evaluate the results as the sum of fragment (and inter-fragment interaction) contributions. The alternative, “direct” methods 49–68 avoid such partitioning into smaller subunits and employ local and additional cost-reducing approximations for the equations characterizing the entire system. Recently a third group of methods, 25–27,69 including the present scheme, has emerged at the intersection of the above two classes which combine ideas originating from both traditional approaches. Many of the presently applied and generally popular approximations find its roots in the pioneering work of Pulay and Saebø. 49–52 Firstly, the correlation contributions of the distant, occupied localized molecular orbital (LMO) pairs can be neglected or approximated (pair approximation). Secondly, an LMO or LMO pair-specific subset of local virtual orbitals, usually expressed in a projected atomic orbital (PAO) form, can be employed to span the virtual subspace for the LMOs or LMO pairs (domain approximation). On top of this framework Werner and Sch¨ utz introduced the density-fitting (DF) 55 approximation and developed local CC methods including triples contributions. 53–55 The next significant leap was the introduction of various natural orbitals (NOs), constructed at the level of second-order Møller–Plesset (MP2) perturbation theory, to compress the virtual subspace. 23,56,60 Among those the direct local methods mostly benefit from the use of LMO pair specific pair natural orbital (PNO) based solutions first developed by Neese, Valeev, Riplinger, Guo, Pinski, Pavoˇsevi´c, and co-workers. 60–66 Similarly efficient PNO-based local CC methods are also reported recently by Werner, Ma, Schwilk, and K¨oppl 57–59 and by H¨attig, Tew, and Schmitz. 67,68

3

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The subset of fragmentation-based methods, which have also been developed up to the CC level, collects a wide variety of approaches. 23,24,34–43 The CCSD part of the present method is closely related to the basic concept of the cluster-in-molecule (CIM) approach developed by Li, Li, Piecuch, Guo, and their co-workers. 43,48,69 Similar ideas are utilized in the divide-expand-consolidate (DEC) approach of Jørgensen, Kristensen, Kjærgaard, and their co-workers, 42,44,70 and in the divide-and-conquer (DC) method of Li and Li 46 and Kobayashi and Nakai. 38,39 The form of the inter-domain correlation correction of our scheme shows resemblance to the second increment term applied in the incremental method, which was proposed by Stoll 45 and further employed in the local correlation context by Friedrich, Dolg, and their co-workers. 36,37,71 Concerning the fragment-based methods the use of the local natural orbitals (LNOs) introduced by us for local CC 23,24,27 and local direct random phase approximation (dRPA) 25 methods turned out to be an especially efficient solution for the compression of the virtual orbital space. Here the relatively simple but accurate, efficient, and fully integral-direct LMO-specific domain construction scheme of our recent local MP2 (LMP2) 26 method is extended to accommodate the LNO approximation successfully utilized before for domain CC calculations. 23,24,27 In the resulting LNO-based local CCSD(T) [LNO-CCSD(T)] scheme we have also significantly improved upon the algorithms employed for the LNO construction and the most expensive integral transformation yielding the all-external integrals. The so-called natural auxiliary function (NAF) 72 based auxiliary space reduction technique is also introduced here for the first time in the context of local CC methods. Our recently developed, redundancy-free, Laplace-transform LMP2 26 and (T) 27 algorithms are adopted here providing at least an order of magnitude better operation count efficiency compared to the naive, redundant evaluation of conventional perturbative expressions in the domains. Additionally, we have performed extensive optimization regarding the memory consumption of the method allowing for exceptionally large LNO-CCSD(T) calculations on average, single-CPU workstations. Some of these efficiency gains have been reinvested in the improvement of the

4

ACS Paragon Plus Environment

Page 4 of 73

Page 5 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

accuracy of the local approximations by tightening the default thresholds. The LNO-CCSD(T) scheme presented herein combines attractive properties of both fragmentation and direct methods as discussed further in section 2.1. It is also important to emphasize that the method is completely free from real space cutoffs or other empirical parameters, manual fragment definitions via input atom lists, bond cutting and capping, etc. In other words, the domain construction is completely black box and automatic, it uses solely system specific truncations, which adapt to the complexity of the electronic structure of the actual molecule. A development version of the present LNO-CCSD(T) method has already been successfully employed, with slightly different settings, in chemical applications 73,74 and as part of our multilevel approach combining, e.g., LNO-CCSD(T) for the most important subset of atoms of a large system and LMP2 for its environment. 28,29 Although general-order LNO-CC methods are also implemented in the present framework, here we focus on the development (sections 2 and 3) and performance assessment (sections 4 to 7) of the LNO-CCSD(T) method. The truncation threshold dependence and the accuracy of triple- and quadruple-ζ LNO-CCSD(T) correlation, reaction, and interaction energies are thoroughly investigated on a new test system compilation containing the canonical CCSD(T) energies of molecules with up to 63 atoms, and further convergence tests are carried out for even larger systems. Additional benchmark applications illustrating the convergence of LNO-CCSD(T) energies towards the complete basis set limit will be provided in a forthcoming report. 75 Presently we employ conventional extrapolation techniques 76 to accelerate the convergence with the AO basis set size. As an alternative the use of explicitly correlated terms in recent local methods 59,65,68 was also proven to be highly effective to remedy the finite basis problem.

5

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2

Page 6 of 73

Theoretical background

The following considerations begin by assuming a closed-shell reference determinant constructed from spatial orbitals. These orbitals will be subjected to different orbital transformations. Notations for the most frequently used orbital sets are collected in Table 1. For instance, correlation energy expressions will mostly be expressed in terms of quasi-canonical orbitals denoted by indices i, j, k, . . . and a, b, c . . . for the occupied and virtual subsets, respectively, while most of the local approximations are introduced using localized orbitals (i0 , j 0 , . . . ). Table 1: Summary of index notations. The definition of the various orbital sets will be provided in sections 2.1-3.3. i0 , j 0 , k 0 , . . . i, j, k, . . . a, b, c, . . . P, Q, . . . I, J, K, . . . ˜i, . . . , a ˜, . . . ¯i, . . . , a ¯, . . .

2.1

localized occupied molecular orbitals (LMOs) (quasi-)canonical occupied orbitals (quasi-)canonical virtual orbitals (natural) auxiliary functions for the DF approximation Gram–Schmidt–L¨owdin (GSL) basis orbitals in the primary or extended domain (ED) orbitals in the local interacting subspace (LIS)

The LNO-CCSD(T) Ansatz

Motivated by the CIM formulation developed by Li, Piecuch, and co-workers 43,48 correlation energy expressions, such as the closed shell CCSD correlation energy can be rewritten as the sum of individual occupied orbital contributions, δEiCCSD , as #

" E CCSD =

X i

δEiCCSD =

X

2

X a

i

fai tai +

X

a b ab Lab ij (tij + ti tj ) ,

(1)

abj

where fai is an element of the Fock-matrix, tai and tab ij are cluster amplitudes for single and double excitations, and Lab ij = 2(ai|bj) − (aj|bi)

6

ACS Paragon Plus Environment

(2)

Page 7 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

with (pq|rs) as a two-electron electron repulsion integral (ERI) in the Mulliken notation. Similarly, correlation energy expressions corresponding to wave functions obtained by perturbative or higher-order iterative CC methods 23 can also be expressed as sum of δEiCCSD type energy contributions. For instance, the closed-shell MP2 formula can be given as

E MP2 =

X

" X X

δEiMP2 =

i

ab[1]

with tij

i

# ab[1]

Lab ij tij

,

(3)

abj

denoting the first-order (MP1) amplitudes. The analogous but more involved (T)

expression for the δEi

orbital contributions to the perturbative (T) correction was also

presented in our earlier work, 24,27 with which the closed-shell CCSD(T) correlation energy is given as E CCSD(T) =

X

δEiCCSD +

X

(T)

δEi

=

X

.

(4)

i

i

i

CCSD(T)

δEi

In our LNO-CCSD(T) Ansatz Eq. (4) is rewritten relying on, for instance, orbital transformation techniques so that it can be carefully and efficiently approximated in the resulting form. The detailed steps are collected in sections 3.1-3.3, a general overview is presented in the following. Let us first recognize that the above correlation energy and δEicorr expressions (where corr = MP2, CCSD, (T), CCSDT, ...) are invariant to the separate unitary transformations of the occupied and virtual orbitals. Hence the canonical occupied orbitals can be replaced by localized molecular orbitals (LMOs), indexed by i0 , in order to facilitate the forthcoming local approximations, e.g.:

E CCSD(T) =

X

CCSD(T)

δEi

=

X

CCSD(T)

δEi0

.

(5)

i0

i

To reduce the scaling of the evaluation of Eq. (5) to linear, pair and domain approximations are invoked. First, each LMO i0 is selected as the central LMO of its own extended domain (ED), denoted as Ei0 . Then such LMOs are collected into the ED that form strong pairs

7

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 73

with the central LMO. LMO pairs having an MP2 pair correlation energy [δEi0 j 0 (Pi0 j 0 )] larger (smaller) than a threshold are characterized as strong (distant), where δEi0 j 0 (Pi0 j 0 ) is evaluated in i0 − j 0 specific pair domains, Pi0 j 0 . In the second step of the domain construction the virtual space of the ED is restricted to local virtual orbitals (in the form of PAOs) surrounding the LMOs of the ED. The MP2 energy contribution of the ED is computed in the full basis of the ED, yielding the δEiMP2 (Ei0 ) correlation energy contributions [see Eq. 0 (10) below]. The readily available MP2 amplitude list is also useful to construct MP2 natural orbitals, the LNOs of the ED. The full LMO-PAO basis of the ED is then compressed by transforming to a truncated LNO list, yielding the local interacting subspace (LIS, Pi0 ) of the central LMO. The final energy formula, for instance, for the LNO-CCSD(T) energy, is expressed with CCSD(T)

the CCSD(T) and MP2 energies of the LIS, δEi0 E LNO−CCSD(T) =

(Pi0 ) and δEiMP2 (Pi0 ), respectively, as 0

i X h CCSD(T) δEi0 (Pi0 ) + ∆EiMP2 , 0

(6)

i0

where ∆EiMP2 corrects for the above pair and LNO approximations at the MP2 level: 0

∆EiMP2 0

=

δEiMP2 (Ei0 ) 0



δEiMP2 (Pi0 ) 0

distant 1 X δEi0 j 0 (Pi0 j 0 ) . + 2 j0

(7)

Currently, the same ∆EiMP2 correction is applied in the case of all of our LNO-CC type 0 methods. The essential difference is in the CC method used in the LIS, which can be, in principle, of general order ranging from CCSD through CCSD(T), CCSDT, CCSDT(Q), etc. to full CC. The most involved part of any local correlation scheme, especially for iterative CC methods, is the computation of the amplitudes and the corresponding intermediates, as well as the two-electron integral lists in the MO basis. Two main strategies have been developed in the literature for this purpose. Recent implementations of the direct methods 57–68 determine

8

ACS Paragon Plus Environment

Page 9 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

an occupied orbital pair specific list of the MP2, CCSD, and (T) amplitudes simultaneously for the entire molecule using a PNO basis. The main benefit of working in the pair specific PNO basis ({ai0 j 0 }) is that, for instance, in the case of double excitations, the number of a

0 0

the necessary ti0ij 0j

bi0 j 0

type amplitudes can be significantly reduced. In turn the size of the

transformed integral lists and other intermediates in the PNO basis, especially with more than two PNO indices, leads to data storage and/or traffic bottlenecks, which demand grows asymptotically linearly with the system size. Even with highly optimized implementations several hundreds of GBs of memory and/or disk allocations were reported 58,59,62,66 for realistic molecules with around 100 atoms and triple-ζ quality basis sets. The alternative, fragmentation-based methods 23,24,34–43 split up the molecule into overlapping fragments and evaluate the fragment energies and the corresponding amplitudes independently for each fragment. The obvious benefits of such decoupling of the amplitude calculations are the asymptotically constant disk and memory requirements and the straightforward paralellizability. Existing canonical implementations are often attempted to be reused for the individual fragment calculations, for which a fragment-specific (quasia b

canonical) basis ({pi0 }) is constructed. However, to determine the ti0ik0 i0i0 amplitudes required a b

to evaluate fragment energies analogous to Eq. (1) with sufficient accuracy additional tjii00kii00 type amplitudes have to be introduced to take into account the otherwise neglected coupling to the neighboring fragments. The size of the resulting fragment calculation, especially for realistic, three-dimensional (3D) systems with large AO basis sets, often reaches the limit that can still be handled economically with conventional CC implementations. In contrast to the case of direct methods, the most common bottleneck of fragmentation based methods is the extensive operation count requirement arising from the overlapping fragment calculations. In our recent contributions we were able to completely remove the need for the evaluaa b

tion of tjii00kii00 type “buffer” amplitudes as well as of the corresponding integrals from our local MP2 25,26 and (T) 27 implementations. In other words, in our local MP2 and (T) expressions a b

only the necessary ti0ik0 i0i0 -type (Laplace-transform) MP2 and (T) amplitudes and the respec9

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 73

tive integrals with one fixed i0 occupied index are evaluated, there is no quasi-redundancy among such quantities of different domains. For that purpose not only the correlation energy, but also the amplitude equations have to be decoupled even when working in the (non-canonical) LMO basis. This is achieved via the factorization of the energy denominators of the perturbative amplitude expressions using Cholesky-decomposition 25,26,77,78 or Laplace-transform 27,68,70,79–85 techniques. Consequently, our latest local MP2 and (T) approaches resemble more closely direct local methods, which utilize similar local, pair and domain approximations and also free from quasi-redundant amplitude evaluation. At the same time we were able to keep the benefits of the asymptotically constant array sizes and the completely independent domain computations. The advantages of direct and fragmentation based approaches cannot be so well combined for iterative methods, such as the CCSD. For iterative CC methods we choose to utilize fragmentation based ideas related to the core concept of the CIM 43,48 or even the DEC 42 schemes. More precisely, one separate domain of LNOs is formed around each LMO a b

in which the CCSD equations are solved for all the necessary tjii00kii00 amplitudes of the domain. In our scheme average domain sizes saturate between 100 and 150 atoms even for the largest presented examples making the operation count linear-scaling and the data storage requirement constant above this limit. However, such domain calculations are only feasible if additional, e.g., local and natural orbital based cost reduction techniques are also utilized. The current set of approximations are detailed in sections 3.1–3.3. A brief summary of the algorithm is given in the end of the next section in Table 2.

3

Algorithm and implementation

The initial HF calculation and the localization of the occupied orbitals are the two, currently not linearly scaling, prerequisite steps of our local correlation methods. We have found the Boys 86 localization to be the best suited for our algorithms. 26 If desired, Pipek–Mezey 87

10

ACS Paragon Plus Environment

Page 11 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

or intrinsic bonding orbitals 88 can also be chosen at the cost of noticeably larger domain sizes. The steps required for the local MP2 calculations in the EDs and the (T) calculations in the LISs closely follow the approaches proposed in Refs. 26 and 27. These are briefly summarized in the following focusing on the minor modifications. Our improved schemes for the LNO and two-external three-center integral evaluation are discussed in extensive detail.

3.1

Pair energy computation

Having localized occupied orbitals at hand the virtual space of the molecule is spanned by PAOs as proposed by Pulay: 49 

|aµ i = 1 −

X

 |i ihi | |µi, 0

0

(8)

i0

where the aµ PAO results from projecting the µ PAO-center AO onto the virtual subspace. Next, pair correlation energies are determined in order to separate a linear scaling number of strong LMO pairs from the remaining distant ones. To that end a compact atom list is compiled for each LMO and PAO utilizing our modified 26 Bougton–Pulay 89 (BP) algorithm. The algorithm ensures that the projection of the input MO onto the AOs of the output BP atom list specific to the MO is practically the closest to the original MO in the least-squares sense with their least-squares residual being smaller than 1 minus an input completeness criterion. BP atom lists of the LMOs and PAOs with TPDo = 0.999 and TPDv = 0.98 completeness criteria are then used to construct sufficiently large primary domains (PD) around each LMO. The PD of LMO i0 , PDi0 , consists of the BP atoms of LMO i0 , the PAOs (with their respective BP atom lists) residing on the BP atoms of LMO i0 , and all AOs centered on the atoms selected above. The LMOs and PAOs are then projected onto the PD AOs, and the PAOs orthogonalized and canonicalized among each other, leading to the domain PAO set {˜ ai0 }. Finally, for each LMO pair opposite-spin (OS) MP2 pair correlation energies are computed relying on the multipole approximation of ERIs, as presented in Ref.

11

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

26: PDi0 PDj 0

δEi0 j 0 (Pi0 j 0 ) = −8



a ˜i0 i | ˜bj 0 j 0

0

[4] 2

X X a ˜i0

˜b 0 j

Page 12 of 73

εa˜i0 + ε˜bj0 − Fi0 i0 − Fj 0 j 0

.

(9)

Here εa˜i0 (ε˜bj0 ) stands for the pseudo-canonical orbital energies corresponding to the canonicalized PAOs a ˜i0 (˜bj 0 ), and Fi0 i0 and Fj 0 j 0 are the Fock-matrix elements calculated with LMOs [4]  i0 and j 0 , respectively. Moreover, integrals a ˜i0 i0 | ˜bj 0 j 0 are multipole expansion approximated ERIs up to fourth order, including the terms with dipole-dipole, dipole-quadrupole, quadrupole-quadrupole and dipole-octupole moments. 26 Similar multipole approximated pair correlation energies were also successfully utilized in several recent local methods, 25,61,90 mostly motivated by the early work of Hetzer, Pulay, and Werner. 91 The authors, studying the interaction of two water molecules and small polyglycine chains, concluded that the optimal order to truncate the multipole series is the same as in the above OS-MP2 expression of Eq. (9). Our recent study performed on a more representative set of large molecules demonstrated that including the higher-order terms on top of the more widely used dipole–dipole only approach 25,61,63,92 provides significantly better pair energies, assuming the use of sufficiently large PDs. 26 Our conclusions have also been confirmed by Werner in the context of the PNO-LMP2 and PNO-LCCSD methods. 90 Having such reliable pair correlation energies allowed us to simplify our Ansatz and replace the role of the previously applied weak-type pairs 25 partly by more accurately computed distant pairs and partly by more efficiently handled strong pairs. 26 LMO pairs are identified as distant [strong] if δEi0 j 0 (Pi0 j 0 ) < εw [δEi0 j 0 (Pi0 j 0 ) ≥ εw ], where εw is the pair energy threshold.

3.2

Local MP2 domain energy

The remaining computations are performed only for the selected, linear scaling number of strong pairs. To ensure linear-scaling operation count these steps are performed separately for each LMO in the LMO specific EDs having asymptotically constant domain sizes. 12

ACS Paragon Plus Environment

Page 13 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

3.2.1

Orbitals of the extended domain

The occupied space of the ED for LMO i0 consists of LMO i0 itself, the central LMO of the ED, and all LMOs characterized as strong pairs with i0 , n ˜ o LMOs altogether. BP atom lists, referred to as BPED, are constructed for each LMO with TEDo = 0.9999 criterion. The atom and AO lists of the ED are given by merging the atom and AO lists of the BPEDs of all the LMOs included in the ED. The LMOs of the ED are projected onto the AO basis of their own BPED, which is necessary for the linear scaling of the subsequent integral transformation steps. TEDo = 0.9999 corresponds to the tighter settings in our previous LMP2 approach 26 and guarantees that the above LMO completeness is better than 99.99%. After the truncation of the LMOs their exact orthogonality is lost, which was previously recovered via L¨owdin symmetric orthogonalization. 26 This procedure mixes the central LMO with the other LMOs of the ED, which was not a problem for LMP2 but is incompatible with our current LNO construction scheme (see section 3.3.1) required for the LNO-CC models. However, an alternative, so-called Gram–Schmidt–L¨owdin (GSL) basis 27,93,94 is suitable for all of our purposes. This GSL basis is formed by first orthogonalizing the LMOs to the central LMO in a Gram–Schmidt step and then performing a L¨owdin symmetric orthogonalization on the resulting n ˜ o − 1 orbitals. This GSL procedure keeps the central LMO intact, preserves the locality of the remaining LMOs as much as possible, and can be performed very efficiently. 93,94 Finally, a quasi-canonical occupied orbital set, denoted as {˜j}, is also constructed to enable the use of the working equations valid in a canonical basis. For simplicity, subscript i0 labeling the orbital sets of the corresponding domains will be omitted from this point wherever possible. The virtual space of the ED is spanned by PAOs whose PAO center AO [see Eq. (8)] is located in the PAO center domain (PCD) of the ED. To construct the PCD, first, the most important atoms of the ED LMOs are determined using the BP algorithm with criterion To = 0.985, then the PCD is assembled as the union of these compact BP lists as presented in Ref. 26. We have found recently that if there are ghost atoms in the system, i.e., there 13

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 14 of 73

are AO centers where the densities of all LMOs are especially small, this PCD construction scheme does not necessarily collect all the important PAO center atoms required for better than 99.9% accuracy. The issue is resolved by extending the initial PCD (PCD0 ) with a small number of additional PAO center atoms (APAO+ ) so that all important three-center two-electron AO integrals (µν|P ) are kept in the enlarged PCD with µ, ν, and P being two AOs and an auxiliary function of the ED. Precisely, an APAO+ atom is added to PCD0 if the Schwartz-estimate of any (µν|P ) integral with µ ∈ PCD0 , ν ∈ APAO+ , and P ∈ PCD0 ∪APAO+ is above 2 Eh . A simpler solution of choosing the closest ghost atoms would probably also suffice but our preference was to keep the algorithm completely free from real space distance based thresholds. The PAOs are significantly more delocalized than the LMOs so they are projected onto the entire AO basis of the ED in order to minimize their truncation error, following the procedure discussed in Ref. 26. The truncated PAOs are again not orthogonal, which is simply solved by their L¨owdin canonical orthogonalization among each other. The truncated PAOs also slightly overlap with the occupied subspace of the ED. At the LMP2 level we found it more beneficial not to truncate the PAO space any further by orthogonalizing the PAOs to the occupied LMOs of the ED. 26 However, for CC calculations the orthogonality of the occupied and virtual subspaces will be enforced later in the LISs (see section 3.3.1). Finally, the linear dependency of the PAOs is removed, and the remaining n ˜ PAO number of PAOs are canonicalized in the AO basis of the ED, yielding the {˜ a} set of the ED.

3.2.2

MP2 domain energy

The MP2 domain energy is evaluated, following closely the procedure of Ref. 26, as:

δEi0 (Ei0 ) =

i ˜ Xh a ˜b[1] 2(˜ ai0 |˜b˜j) − (˜ a˜j|˜bi0 ) ti0˜j , a ˜˜b˜ j

14

ACS Paragon Plus Environment

(10)

Page 15 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

where the necessary integrals are obtained relying on the density-fitting (DF) approximation:

(˜ ai0 |˜b˜j) =

X

Ja˜i0 ,P (˜b˜j|P ) .

(11)

P

Here only those P auxiliary functions (AFs) are used that reside on the atoms of the PCD, which was shown to be an excellent approximation for the fitting of LMO-PAO type integrals. 26 Since the i0 index is restricted to only one element of the occupied ED basis, the central LMO, it is more efficient to use the asymmetric form, Eq. (11), for the integral assembly with Ja˜i0 ,P expressed as X

Ja˜i0 ,Q VQP = (˜ ai0 |P ) ,

(12)

Q

and VP Q = (P |Q) being the usual two-center two-electron integral in the auxiliary basis. Note that the fitting step is performed using the Cholesky-factorized form of V, V = LLT , and solving a system of linear equations for Ja˜i0 ,Q of Eq. (12) in order to avoid any matrix inversion in the AF space. The above (˜ ai0 |P ) and (˜b˜j|P ) integrals are obtained in an integral direct manner completely in memory relying on our highly optimized three-center two-electron AO integral coma ˜˜b[1]

puting 95 and AO to LMO-PAO integral transformation 26 implementations. Finally, the ti0˜j

amplitudes appearing in Eq. (10) are evaluated directly in the mixed LMO/canonicalized LMO basis using a unitary rotation invariant amplitude expression with Cholesky-decomposed energy denominators. 25,26 This is an important cost reduction step, because it keeps the operation count of all steps performed in the EDs strictly at most fourth power scaling with the size of the ED and makes it possible to deal with large EDs containing more than 200 atoms and 5000 AOs in a matter of minutes. Note that at the end of the loop for the ED calculations an accurate LMP2 energy can

15

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 16 of 73

be composed of the already available energy components as " E

LMP2

=

X

EiMP2 (Ei0 ) 0

i0

# distant 1 X + δEi0 j 0 (Pi0 j 0 ) , 2 j0

(13)

which is numerically very close to the previous LMP2 energy 26 obtained with the tight settings of Ref. 26.

3.3

Local interacting subspace and the MP2 energy correction

Following the concluded LMP2 step the LNO-CC calculation proceeds with the construction of the compressed, LNO basis for each LMO.

3.3.1

Local natural orbitals

The occupied LNO space is obtained following our previous approach 23,24 with minor modifications. First, the occupied-occupied block of the MP2 density matrix (of the ED) is built as (i0 ) D˜j k˜

=

n ˜X PAO



a ˜˜b[1] a ˜˜b[1]

a ˜˜b[1] ˜b˜ a[1]

2t˜ji0 tki tki ˜ 0 − t˜ ˜0 ji0



,

(14)

a ˜, ˜b=1

and it is transformed to the occupied GSL basis introduced in section 3.2.1, resulting in (i0 )

(i0 )

(i0 )

the DJK quantities. The off-diagonal elements of Di0 K and DJi0 are set to zero in order to avoid the mixing of the central LMO with the rest of the occupied orbitals of the ED. Note that it is necessary to switch to the GSL basis, where LMO i0 is the first element, because 0

(i ) i0 is not present explicitly in the quasi-canonical ˜j set. Hence, if D˜j k˜ is diagonalized in

the quasi-canonical basis, the mixing of the central LMO with the remaining LMOs cannot be avoided. Thus, the occupied LNOs and the corresponding occupation numbers (n¯j ) are obtained by diagonalizing the edited D

(i0 )

in the GSL basis, and the occupied LNOs with

n¯j < εo are kept frozen in the remaining calculations. The retained LNOs are canonicalized leading to the {¯j} occupied basis of the LIS.

16

ACS Paragon Plus Environment

Page 17 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

We have also improved on the virtual LNO construction procedure. Previously, the 0

virtual-virtual block of the MP2 density matrix of the domain (D(i ) ) was constructed using all MP1 amplitudes of the domain, and the virtual LNOs, distinguished here for clarity as LNO0 ’s, were given as the eigenvalues of that complete density matrix block. 23,24 Clearly, this LNO0 set is suitable to describe the correlation of all electrons residing on the occupied orbitals in the ED. Since a part of the occupied space has already been frozen in the form of occupied LNOs with n¯j < εo , the final virtual set of the LIS only has to be able to account for the correlation of the remaining electrons of the retained, n ¯ o number of occupied LNOs. To exploit this the occupied index of the MP1 amplitudes is first transformed to the restricted 0

occupied LNO basis, and then D(i ) is built as (i0 )

Da˜˜b =

n ˜ PAO X n ¯o   1 X c˜a ˜[1] c˜˜b[1] a ˜c˜[1] ˜b˜ c[1] c˜a ˜[1] ˜b˜ c[1] a ˜c˜[1] c˜˜b[1] 3ti0¯j ti0¯j + ti0¯j ti0¯j − ti0¯j ti0¯j − ti0¯j ti0¯j . 2 c˜=1 ¯

(15)

j=1

Note that the invariance of the original expression 23,24 to the unitary rotation of the (untruncated) occupied basis was also exploited. As before only the most important n ¯ v virtual 0

LNOs, i.e., D(i ) eigenvectors, with occupation numbers na¯ ≥ εv are retained for the CC calculation of the LIS. After canonicalization this leads to the {¯ a} virtual basis of the LIS. According to the best of knowledge the use of such LNOs is a unique feature of our scheme among the available local correlation methods. The above LNOs resemble the most to the orbital specific virtual (OSV) orbitals introduced by Manby, Chan, and their coworkers, 56 which, in the context of our scheme, could be obtained by diagonalizing the density formed a ˜˜b[1]

of only the diagonal MP1 amplitudes ti0 i0 . OSVs can be constructed especially efficiently, and for this reason can serve as useful intermediate 58,59 or final 96 virtual basis functions in some of the recent local approaches. We have found that such OSVs are much less optimal for our purposes than the above LNOs, simply because the contribution of the amplitudes corresponding to the strong pair LMOs of the central LMO is completely neglected in the former.

17

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

There is also a notable relation between the LNO and the popular PNO 59,66,68 schemes. Using again the above notation PNOs for the LMO pair i0 −j 0 in the present context would be a ˜˜b[1]

the eigenvectors of the density constructed from amplitudes ti0 j 0 . Clearly, without dropping any functions both the virtual LNOs and the PNOs span the same, the complete PAO space of the ED. The main difference is that for similar accuracy usually 3-4 times smaller number of PNOs (nPNO ) is needed for each LMO pair than the number of virtual LNOs in a LIS (¯ nv ). If there are no and nsp number of occupied orbitals and strong pairs in the entire system, then about 2nsp nPNO different PNOs would be needed in our scheme, while the total number of virtual LNOs is only no n ¯ v . For our case the use of such a large number of PNOs and the corresponding excessive array sizes would not be beneficial, while the PNO basis seems optimal for local correlation methods with direct CCSD solvers. 59,66,68 From an other perspective LNOs are more advantageous for our method, because in each LIS only ¯

the tai¯0b¯j CCSD amplitudes are required highly accurately [see Eqs. (1) and (6)], the other “buffer” amplitudes have secondary importance. Considering Eq. (15) the virtual LNOs are obviously optimal for these essential doubles amplitudes, while the union of the PNO bases of all strong pairs in the domain could, at a higher cost, represent the wave function a ˜˜b[1]

component corresponding to all ti0 j 0 -type amplitudes similarly well. 3.3.2

Two-external integral transformation

In order to perform the CCSD calculation in the LISs (¯ a¯i|P ), (¯i¯j|P ), and (¯ a¯b|P ) three-center integrals are also needed. The first set can simply be transformed from the (˜ a˜i|P ) integrals available in the ED. The (¯i¯j|P ) integrals are also obtained with negligible cost by calculating the (˜i˜j|P ) list in the same direct transformation step as (˜ a˜i|P ) and then transforming it to the occupied LNO basis. The integral direct construction of the two-external list is, however, significantly more expensive since there are usually almost 3 times more virtual LNOs in the LIS than LMOs in the ED. Both sets are expanded in the same, potentially quite extensive AO basis of the ED but local approximations can only be efficiently applied to the LMOs.

18

ACS Paragon Plus Environment

Page 18 of 73

Page 19 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Taking these into account it is understandable that the AO to virtual LNO three-center integral transformation with the same algorithm is nearly an order of magnitude slower than the AO to LMO transformation in the ED. Our attempts to obtain an alternative basis spanning the virtual LNO subspace that exhibits an acceptable amount of locality were less satisfactory and are not discussed in the present report. A much more preferable idea is to utilize the fact that the virtual LNO space is a subspace of not only the full AO space of the ED but also the PAO space of the ED, the latter being usually of 2-3 times smaller in rank. This cannot be directly exploited since the (˜ a˜b|P ) PAO-PAO integrals are not available, and their evaluation for each ED would be even more demanding than the direct AO to virtual LNO transformation. We have found a computationally much more favorable alternative where the AO→PAO→LNO route is replaced by an approximate but numerically very close AO→PAO’→LNO’ one. For the brief introduction of the PAO’ and LNO’ functions let us consider the following form of the complete system’s PAOs projected on the AO basis of the ED:

ED |˜ µi = PˆAO



1 − Pˆocc



|µPCD i ,

(16)

ED projects onto the AO space of the ED, where |µPCD i is an AO of the given ED’s PCD, PˆAO

and Pˆocc projects onto the (untruncated) occupied subspace of the complete system. The set of the |˜ µi functions is overlapping, redundant, and can exactly represent any PAO of the ED. Consequently, the virtual LNOs can be expressed with the

ED ED |¯ µi = PˆvLNO |˜ µi = PˆvLNO



1 − Pˆocc



|µPCD i

(17)

ED functions, where we introduced the PˆvLNO projector that projects onto the virtual LNO ED ED ED subspace and exploited that the effect of acting with the PˆvLNO PˆAO and PˆvLNO projectors

is the same since the virtual LNO space is a subspace of the ED’s AO space.

19

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 20 of 73

Let us now expand Pˆocc as Pˆocc = Pˆsp + Pˆdp + Pˆcore

(18)

with Pˆsp , Pˆdp , and Pˆcore projecting onto the space of the (untruncated) strong pair LMOs of the ED, the complement distant pair LMOs, and the core occupied MOs (if there is any core orbital frozen), all three expressed in the complete system’s AO basis. Let us substitute Eq. ED ED (18) into Eq. (17) and recognize that the PˆvLNO Pˆcore |µPCD i and PˆvLNO Pˆdp |µPCD i terms are

negligible because of the near exact orthogonality of the virtual LNOs to the core MOs and to the distant pair LMOs located outside the ED. These terms are not exactly zero due to the slight mixing of the occupied and virtual subspaces when the PAOs are projected onto the ED’s AO basis but are nevertheless very small, and we can neglect them. 0 , the latter On top of that our second approximation is the substitution of Pˆsp with Pˆsp

projecting onto the space of the strong pair LMOs truncated to their BPED AO domain. As discussed before, the least-squares similarity of the truncated and the original LMOs is 0 the more severe but still very accurate at least 99.99% by default, which makes Pˆsp ≈ Pˆsp 0 to the linear transformation of the occupied approximation. If we exploit the invariance of Pˆsp

orbitals of the ED and express it with the quasi-canonical, orthogonal {˜j} set we arrive at the following form of the PAO’ functions:  0

|˜ µ i = 1 −

sp X

 |˜jih˜j| |µPCD i = |µPCD i −

˜ j

sp X

C¯˜jµ |˜ji ,

(19)

˜ j

with C¯˜jµ = h˜j|µPCD i and with the summation running over only the strong pair list of the central LMO. The above PAO’ space is not identical with the space of the ED PAOs due to the ap0 proximations of neglecting the Pˆdp and Pˆcore terms, and the introduction of Pˆsp , hence the

PAO’ space cannot exactly expand the virtual LNOs. The closest functions to the original

20

ACS Paragon Plus Environment

Page 21 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

LNOs which lie exactly in the PAO’ space are the !

PCD

ED |¯ a0 i = PˆPAO ai = 0 |¯

X

Aµ˜0 a¯0 |˜ µ0 i =

X

µ ˜0

Aµ¯a0 |µPCD i

µ

 −



sp

X

A¯˜j¯a0 |˜ji

(20)

˜ j

ED LNO’ functions. Here PˆPAO 0 projects onto the PAO’ space, A collects the expansion coeffi-

¯ = CA. ¯ cients of the |¯ a0 i LNO’-type functions in the PAO’ basis, and A Finally, using the above form of the LNO’ functions the necessary two-external integrals can be written as

0¯0

(¯ a b |P ) =

PCD X

µ, ν

PCD, sp

PCD, sp

Aµ¯a0 Aν¯b0 (µν|P )−

X

A¯˜j¯a0 Aν¯b0 (˜jν|P )−

ν, ˜ j

X

sp X ˜ ˜ ). ¯ Aµ¯a0 Ak˜¯b0 (µk|P )+ A¯˜j¯a0 A¯k˜¯b0 (˜j k|P

˜ µ, k

˜ ˜ j, k

(21) The first term on the right-hand side of this expression can be computed much more operation count efficiently than the naive AO to LNO transformation because both summations run over the AOs of the PCD instead of the AOs of the ED, the former being usually about 2-3 times smaller in size for at least medium-sized systems. The last three terms are even cheaper ˜ ) and (˜j k|P ˜ ) quantities can be constructed with negligible cost to compute since the (µk|P ˜ ) and (¯j k|P ¯ ) integral calculations, respectively. For small from the intermediates of the (˜ ak|P molecules where the PCD and ED are the same the last three terms are clear overhead. Since in this case the LNO and the LNO’ orbitals are identical, we automatically switch back to the direct AO to LNO transformation. The main benefit of the new approach is that, compared to the direct AO to LNO case,  PCD 2 the AO to LNO’ transformation can be performed by a factor of nED /n less operations, AO AO PCD where nED AO and nAO are the number of AOs in the ED and PCD, respectively. Note that

nPCD AO is slightly larger than the number of the PAOs after the redundancy removal, that is, 2 n ˜ PAO . Thus one would get only a marginally better speedup of nED /˜ n if the (˜ a˜b|P ) PAO AO  PCD 2 PAO-PAO integrals were available. The nED /n factor is close to 1 for small molecules, AO AO where the PCD is almost as big as the ED but can be significant for medium sized molecules and reaches up to 8-9 (10-11) for the average (largest) EDs as the ED size saturates with

21

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

increasing system size. Without the improved approach this transformation would be the clear operation count bottleneck of the entire LNO-CC calculation for systems with more than about 50-100 atoms and large AO basis sets. Let us note that this scheme shows a close resemblance to the integral transformation algorithm recently presented by Kats. 97 Both the approach of Ref. 97 and of this report utilize the substitution of the explicit PAO expression of Eq. (8) (suitable for the exact, complete system PAOs) into corresponding integral transformation expressions. Ref. 97 also presents a PAO-PAO integral expression analogous to the LNO-LNO integral formula of Eq. (21), although only PAO-LMO type integrals were calculated in Ref. 97, the PAO-PAO integrals were not implemented. Basically, the same core idea is realized both in Ref. 97 and here differing mostly in the details of the introduced approximations to the original expressions. To the best of our knowledge this idea has not been employed anywhere else, and the present one is the first implementation of such two-external integral expressions in the context of local correlation methods. One drawback of our current implementation is that the last three and the first terms of Eq. (21) are computed in two separate direct integral transformation routine calls, both with integral pre-screening. 26 The AO to LNO’ transformation is more sensitive to the balanced accuracy of all four terms in Eq. (21), and increased inaccuracies were observed for handful cases. To solve this issue the original Itol = 10−7 Eh pre-screening tolerance 26 was tightened a˜j|P ) evaluation to Itol = 10−8 Eh which turned out to be suitable for all purposes, for the (˜ in the ED, for the AO→PAO’→LNO’ route, and for the direct AO to LNO transformation as well. The above PAO’ and LNO’ sets are orthogonal to the truncated LMO and hence to the occupied LNO spaces by definition. The additional benefit of not imposing this orthogonality earlier, beside those presented in Ref. 26, is that the MP2 correction of Eq. (7) also corrects for the loss of correlation energy resulting from the orthogonalization of the virtual space to the occupied one.

22

ACS Paragon Plus Environment

Page 22 of 73

Page 23 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Since the LNO’ and LNO sets are not identical in general, all the quantities depending on the virtual LNOs are constructed consistently in the LNO’ basis. To keep the formulation more transparent the prime will be dropped from this point wherever clarity allows it.

3.3.3

Natural auxiliary functions

The LNO approximation decreases the number of orbital product densities (|¯ pq¯i) formed with general LNOs (|¯ pi) of the LIS by about an order of magnitude compared to the number of |˜ ai0 i densities for which the AFs are selected in the ED. Consequently, the auxiliary space of the ED can also be compressed, for which we utilize natural auxiliary functions (NAFs). 25–27,72,98,99 The NAFs 72 are defined using the three-center MO integrals, J, defined as X

Q T Jpq LQP = (pq|P ) .

(22)

Q

By construction the NAFs, 72 being the singular vectors of J, provide the best reduced-rank approximation to J constructed with the complete AF basis. It is more cost- and memoryefficient to compute NAFs of the LIS via diagonalization. 72 For that purpose, first, we build the matrix WP0 Q =

X

(¯ pq¯|P ) (¯ pq¯|Q)

(23)

p¯≤¯ q

and perform the fitting step directly on W0 . Matrix inversions in the AF space are again avoided via solving two consecutive system of linear equations to form W = LT W0 L. Then W is diagonalized, and only those singular vectors, i.e., NAFs, are retained for which the corresponding eigenvalue (squared singular value) is above the εNAF threshold. Finally, all three-center LNO integrals are fitted and transformed to the NAF basis in a single step as discussed in Ref. 72. Note that in our local MP2 and dRPA schemes NAFs were formulated only using the Ja˜Pi0 or Ja¯P¯i integrals, respectively. 25,26 Unfortunately, the NAFs composed as the singular vectors of the Ja¯P¯i or even the Jp¯P¯i integrals are insufficient for the LNO-CC methods, although it 23

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

would be beneficial from the algorithm optimization perspective if the NAF construction could precede the two-external integral transformation. The reason for the insufficiency is the potentially missing AF linear combinations needed for the fitting of the |¯ a¯bi orbital products. Therefore, in contrast to the Ja˜Pi0 -based NAF scheme of our local MP2 algorithm, 26 the full AF set has to be retained in the EDs and the Jp¯Pq¯ -based NAFs can only be constructed in the LISs, so that the retained NAFs fit accurately all four-center integrals appearing in the LIS. Let us also note that the MP2, CCSD, (T), etc. domain energies [e.g., in Eq. (6)] are evaluated using ERIs constructed with the full AF basis, because this was found to effectively reduce the error of the NAF basis truncation. 25,26,72 The corresponding amplitudes are of course affected by the NAF approximation but this error is very small with a sufficiently tight εNAF . Additionally, this error is also partially corrected by the ∆EiMP2 correction of 0 Eq. (7) since δEiMP2 (Ei0 ) is evaluated with the complete auxiliary basis of the ED. 0 Finally, we mention the relation of the NAF scheme to the AF basis truncation approach of Kjærgaard. 70 The main difference is that those AFs are discarded in Ref. 70 which give the least energy contribution to the so-called SOS-MP2 atomic fragment energies. 70 We find the original NAF approach to be more suitable for our purposes because the Jp¯Pq¯ -based NAFs take into account all the pair densities appearing in the CC equations, and the approximate three-center integrals are the closest to the original ones in the reduced-rank SVD basis.

3.3.4

CCSD(T) calculation in the LIS

Having the three-center integrals in the compact LNO-NAF basis at hand the assembly of the four-center LNO integrals is very efficient. Our slightly modified canonical CCSD implementation 24 is employed for the CCSD iteration and domain energy calculation. The (T) step is performed via our recent, highly optimized Laplace-transform (T) approach. 27 The most demanding operations for both the CCSD and the (T) parts scale as n ¯ 2o n ¯ 4v . Even with the much compressed LNO bases this steps are often the rate determining for the entire

24

ACS Paragon Plus Environment

Page 24 of 73

Page 25 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

LNO-CC calculation, especially with large, diffuse AO basis sets and with LNO thresholds tighter than the default. If higher-order CC methods are used, we perform the same steps up to the assembly of the four-center integrals and utilize the general-order iterative and perturbative CC implementations of the Mrcc suite. 5,6,100 One has to consider, however, that the computational expenses of these higher-order methods scale at least with the eighth power of the LNO dimensions. For instance, for the same system an LNO-CCSDT calculation is about 3 orders of magnitude slower than an LNO-CCSD(T). Table 2: The purpose and composition of the various domains are collected as a brief summary of the domain construction scheme. primary domain (PD) purposes occupied MO virtual MOs atoms

purposes

occupied MOs PAO center domain (PCD) virtual MOs atoms AFs

purpose occupied/virtual MOs AFs

3.4

1) evaluate pair energies in pairs of PDs 2) determine the distant and strong pair lists (threshold εw ) the central LMO PAOs centered on the BP atoms of the central LMO (threshold TPDo ) union of the BP atoms lists of the PD’s MOs (thresholds TPDo and TPDv ) extended domain (ED) 1) determine LMP2 correlation energy contribution of its central LMO 2) obtain LMP2 natural orbitals, the LNOs 3) perform integral transformation for the CC calculations in the LIS the central LMO and all of its strong pairs union of the BP atom lists of the ED’s occupied orbitals (threshold To ) all PAOs centered on the PCD’s atoms with above TEDv BP completeness union of the BP atom lists of the ED’s occupied orbitals (threshold TEDo ) all auxiliary functions centered on the PCD’s atoms local interacting subspace (LIS) compute the CCSD(T) correlation energy contribution of its central LMO truncated list of LNOs (thresholds εo and εv ) truncated list of NAFs composed of the AFs of the ED (threshold εNAF )

Point group symmetry

Our first LNO-CC implementations have already included a simple way to exploit the point group symmetry of molecules. 23,24 The main idea is that, if the central LMOs of two do25

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

mains can be transformed into each other via symmetry operations, their respective domain energy contributions are equal. Therefore the domain calculation is performed only once for equivalent domains, leading to a speedup comparable to the number of symmetry elements of the point group. Such symmetry-equivalent domains were previously identified using the diagonal elements of the full Fock-matrix transformed to the LMO basis since the LMO Fockian exhibits the same symmetry properties as the domain energies. Presently, this approach is improved to distinguish truly symmetry equivalent LMOs and the ones with quasi-degenerate diagonal elements. To this end the relations of all off-diagonal LMO Fock-matrix elements are also examined. For instance, for a σ ˆ symmetry operation of the point group relations Fi0 j 0 = Fσˆ i0 σˆ j 0 are also need to be fulfilled. Note that this approach can exploit the full symmetry of even non-Abelian point groups. We found it highly unlikely that the domains themselves exhibit spatial symmetry, unless the entire molecule is included in all domains, hence this option was not pursued. Admittedly, the medium/large systems targeted with our scheme are rarely symmetric, therefore we did not invest any more effort in this direction. For instance, our orbital localization algorithms are not yet symmetry adapted and may not yield exactly symmetry equivalent LMOs for all cases. We also mention that other local schemes implement similar symmetry based cost reduction techniques for orbital 71 or orbital pair 101 domains.

4

Computational details and test systems

The above LNO-CC approach has been implemented in the Mrcc suite of quantum chemical programs and it is already available in the latest release of the package. 100 The benchmark calculations of sections 6 and 7 were performed with the default settings determined in the following sections. The most important thresholds having the largest effect on the accuracy are collected in Table 3. The quality of the LNO-CCSD(T) energies with varying εo ,

26

ACS Paragon Plus Environment

Page 26 of 73

Page 27 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

εv , TEDo , εw , and εNAF thresholds will be investigated in section 5 in extensive detail. For that purpose reference energies will be obtained using a very tight threshold value for the investigated approximation and the default settings for the remaining truncations. For the sake of completeness, the reference values are collected in the Supporting Information. Note that those reference energies in that sense do not correspond either to results obtained with the default settings or with any other preset threshold combinations. The more meaningful results obtained with balanced, preset threshold combinations will be analyzed in our forthcoming report. 75 Similarly thorough benchmarking led to sufficiently rigorous default values for the remaining BP completeness criteria (see TPDo , TPDv , To , and TEDv of Ref. 26) and for the TLT Laplace-quadrature parameter. 27 The previous tests 26,27 performed for the latter five thresholds are not repeated here for the sake of brevity, but their effect on the corresponding approximations will also be assessed on the examples of sections 5 and 6. Table 3: Default threshold values set by the composite keyword of lcorthr=Normal in the case of localcc=2018. Symbol Keyword εo lnoepso εv lnoepsv TEDo bpedo εw wpairtol εNAF naf cor TLT laptol

4.1

Default value 10−5 10−6 0.9999 10−5 Eh 10−2 Eh 10−2

Computational details

In the calculations presented here Dunning’s (augmented) correlation-consistent polarized valence X-tuple-ζ basis sets [(aug-)cc-pVXZ, X = D, T, and Q], 102–104 the correlationconsistent polarized valence triple-ζ basis for explicitly correlated calculations (cc-pVTZF12) developed by Peterson et al., 105 as well as the triple- and quadruple-ζ valence basis sets including polarization functions (def2-TZVP and def2-QZVP) optimized by Weigend 27

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

and Ahlrichs 106 were used. If not indicated otherwise, for the above AO basis sets the corresponding auxiliary bases of Weigend and co-workers were applied. 107–109 In the case of the cc-pVTZ-F12 the auxiliary basis corresponding to the aug-cc-pVTZ basis set was applied. For most of our calculations the DF approximation was also invoked at the HF calculations, while for systems with more than 9000 AOs the exchange contribution in the DF-HF calculations was computed with local fitting domains as described in Ref. 26. The core electrons were kept frozen in all the correlation calculations. The reported computation times are wall-clock times determined on a machine with 128 GB of main memory, a standard 7200 rpm hard disk, and an 8-core 3.0 GHz Intel Xeon E5-1660 processor.

4.2

Benchmark sets and test systems

The accuracy of the LNO-CCSD(T) correlation and reaction energies was assessed on the test set of Neese, Wennmohs, and Hansen (NWH). 60 The NWH set is assembled from 23 reactions including molecules of up to 36 atoms. 60 Furthermore, weak interaction energies ˇ aˇc and co-workers. 110 A new were tested for dimers collected in the S66 benchmark set of Rez´ test set composed of 26 CCSD(T) correlation energies of medium-sized systems (CEMS26) has also been constructed. The CCSD(T) reference energies were obtained with at least triple-ζ quality basis sets for molecules containing 30–63 atoms (see section 6.2). Tests calculations were also performed for the triphenylphosphine (PPh3 ) molecule and the androstendion and organocatalysis reactions presented in Ref. 57, and the fourth reaction of the isomerization test set (ISOL4) of Grimme and co-workers 111 (see Figure 1). The latter ISOL4 reaction represents two intermediate steps in the biosynthesis of cholesterol: lanosterol [ISOL4 educt] and (S)-2,3-oxidosqualene [ISOL4 product]. The large-scale calculations collected in Table 11 are performed for one of the key transition states (TSRS CC · · · pnp) of a Michael-addition reaction depicted in Figure 7 of Ref. 73, the crambin protein (Figure 9 of 28

ACS Paragon Plus Environment

Page 28 of 73

Page 29 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Ref. 62), and the core domain of the HIV-1 integrase enzyme (integrase for short) illustrated in Figure 1 of Ref. 26. The structures of all the investigated molecules are collected in the Supporting Information. For the statistical analysis the maximum absolute error (MAX), the mean absolute error (MAE), and standard deviation of the absolute errors (STD) were applied. Relative energy differences with respect to a reference energy (E ref ) are obtained as (100%)·(E LNO−CCSD(T) − E ref )/E ref . Exact CCSD(T) or LNO-CCSD(T) correlation energies obtained with very tight thresholds will be used as E ref .

Figure 1: Three reactions investigated in the convergence tests of section 5: formation of androstendion 57 (top), fourth isomerization of the ISOL test set 111 (middle), and an organocatalysis reaction taken from Ref. 57 (bottom).

29

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

5

Convergence of the local approximations

This section documents the truncation threshold dependence of the LNO-CCSD(T) energies in order to corroborate the choices of the above default values in Table 3. The accuracy of the approximations corresponding to the tested parameters were already investigated previously in the context of related local correlation methods. The effect of the εw pair energy, the TEDo BP completeness, and the εNAF NAF eigenvalue thresholds on the accuracy was assessed previously for our local dRPA 25 and local MP2 26 methods. The range of sufficiently tight but still effective LNO threshold values, εo and εv , were also found in our previous, local dRPA 25 and local CCSD(T) 23,24 studies. Default threshold settings are sought here for the present LNO-CCSD(T) scheme which are suitable for accurate computation for various medium and large systems with at least triple- or quadruple-ζ basis sets. The main focus will be on optimizing the method for triple-ζ bases including the case where the basis set is also supplemented with diffuse functions. One may argue that the more expensive quadruple-ζ local CCSD(T) calculations are less relevant than the triple-ζ ones, because the majority of the remaining basis set incompleteness of the triple-ζ basis can alternatively be included at a lower level of theory via composite energy expressions 57,112 or by utilizing explicitly-correlated approaches. 59,65,68 We have also found that the basis set limit of the LNO-CCSD(T) energies can usually be approached well by adding an LMP2-based basis set incompleteness correction to a triple-ζ LNO-CCSD(T) result, which will be presented in our forthcoming report. 75

5.1

LNO selection: Medium-sized systems

First, the system and basis-set dependence of the LNO occupation number thresholds are studied for such systems where the exact CCSD(T) reference is available to us: the relatively average case of the PPh3 molecule with the cc-pVTZ basis, the glycine tetrapeptide (Gly4 ) of Ref. 96 with the aug-cc-pVTZ basis, and the more compact and somewhat more challenging

30

ACS Paragon Plus Environment

Page 30 of 73

Page 31 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

octamethylcyclobutane (OMCB) of the NWH set with the cc-pVTZ, aug-cc-pVTZ, and cc-pVQZ bases sets. The above medium-sized systems are tested first because for these the effect of the LNO basis truncation can be almost completely separated from the other approximations. To that end all but three approximations were turned off in the tests of this section. Larger molecules are also tested in section 5.6, where the LNO truncation is applied in combination with all other approximations turned on. The first of the three remaining approximations in this test is, of course, the LNO truncation. The second one is the use of the Laplace-transform 27 approximation for the (T) energy. The latter is kept because it makes the (T) calculations much more manageable with the extremely tight LNO thresholds used here in turn for an additional error that is negligible in this test compared to the error of the LNO truncation. 27 The third difference between the LNO-CCSD(T) and the reference calculations is the DF approximation, which cannot be completely eliminated from the former but not present in the canonical CCSD(T) calculations performed for the Gly4 and the OMCB molecules. To minimize the DF error in the LNO-CCSD(T) energies of these examples the (aug)-cc-pV(X + 2)Z-RI bases were applied in combination with the (aug)-cc-pVXZ AO bases. Previous studies concluded 23–25 that it is beneficial to use an about an order of magnitude smaller threshold for the virtual LNOs than for the occupied ones. Motivated by the benefits of reducing the number of independent truncation thresholds we decided to tie together the related εo and εv values and determine default values for εo and the εo /εv ratio. The basis set dependence of the LNO-CCSD(T) correlation energy error is shown in Figure 2 for the OMCB molecule, which is the most difficult of the three. For all three bases the εo /εv = 10 ratio performs significantly better than the εo /εv = 5 option, and it is close to the more accurate but also more demanding εo /εv = 20. Thus εo /εv = 10 is selected as default. The convergence with εo is noticeably faster for the most commonly applied triple-ζ bases than for cc-pVQZ. The errors using the triple-ζ bases are already below the very satisfactory 0.015% with the default εo = 10−5 . In combination with quadruple-ζ basis

31

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

0.2

Correlation energy error [%]

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

0.2 OMCB cc−pVTZ

0.15 0.1

Page 32 of 73

OMCB aug−cc−pVTZ

0.2

εo/εv=20 εo/εv=10 εo/εv=5

0.15

0.15

εo/εv=20 εo/εv=10 εo/εv=5

0.1

OMCB cc−pVQZ εo/εv=20 εo/εv=10 εo/εv=5

0.1 0.05

0.05

0

0

0.05

3⋅10−5

3⋅10−6 εο

3⋅10−7

3⋅10−5

3⋅10−6 εο

3⋅10−7

0 3⋅10−5

3⋅10−6 εο

3⋅10−7

Figure 2: Relative error of LNO-CCSD(T) correlation energies compared to the exact CCSD(T) reference values as a function of the εo and εo /εv LNO thresholds using the cc-pVTZ (left), the aug-cc-pVTZ (middle), and the cc-pVQZ (right) basis sets and the settings described in section 5.1. Horizontal lines indicate ± 0.01%, 0.03%, and 0.1% values on the horizontal axis. The presented errors are obtained with the following εo values: 3 · 10−5 , 10−5 , 3 · 10−6 , 10−6 , 3 · 10−7 , and 10−7 .

sets the relative error with the default threshold is larger but steeply drops to 0.02% with the tighter εo = 3 · 10−6 . The system dependence is investigated in Figure 3 by comparing the LNO errors for PPh3 , Gly4 , and OMCB. The εo /εv = 10 curves are again closely following the more accurate εo /εv = 20 ones. The convergence with εo is both satisfactory and systematic, the errors are below 0.1% with the default εo = 10−5 and decrease further to 0.035% if εo = 3 · 10−6 even for the more challenging aug-cc-pVTZ and cc-pVQZ bases. The performance for the PPh3 /cc-pVTZ system is even better since the error is already below 0.025% with the default settings.

5.2

Strong pair selection

Unlike the LNO truncation the remaining local approximations only moderately affect the small or medium-sized systems, and larger test cases are needed to obtain generally applicable conclusions. Thus, the LNO-CCSD(T) correlation energies are assessed here for the androstendion precursor, the educt of the ISOL4 reaction, and the educt of the organocatal32

ACS Paragon Plus Environment

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Correlation energy error [%]

Page 33 of 73

PPh3 cc−pVTZ

0.2

εo/εv=20 εo/εv=10 εo/εv=5

−0.15 −4

10

−5

10

εο

−6

10

OMCB cc−pVQZ εo/εv=20 εo/εv=10 εo/εv=5

0.1

−0.1

0.05

0.2 0.15

−0.05

0.1

0

Gly4 aug−cc−pVTZ

0

εo/εv=20 εo/εv=10 εo/εv=5

0.15

0.05

10−4

10−5 εο

0.05

10−6

0 3⋅10−5

3⋅10−6 εο

3⋅10−7

Figure 3: Relative error of LNO-CCSD(T) correlation energies compared to the exact CCSD(T) reference values as a function of the εo and εo /εv LNO thresholds. The PPh3 (left), Gly4 (middle), and OMCB (right) molecules are studied using the cc-pVTZ, aug-ccpVTZ, and cc-pVQZ basis sets, respectively. See section 5.1 and the caption of Figure 2 for more details.

ysis reaction (denoted as AI in Ref. 57), containing 61, 81, and 106 atoms, respectively. The corresponding reaction energy errors will also be considered. The cc-pVTZ-F12 basis is used for the species in the androstendion and ISOL4 reactions, and cc-pVTZ is selected for the educt and product of the organocatalysis reaction. The exact CCSD(T) reference is, of course, not available, therefore we study the convergence with respect to tightening one threshold at a time, while the default settings are fixed for the rest of the approximations. Similar strategies were also employed previously for direct local schemes. 58,59,63 This test is not ideal since the error sources are not completely separated. For instance, with a tighter pair energy threshold the size of both the ED and the PCD grows adaptively. Nevertheless, a suitable error measure is the LNO-CCSD(T) correlation and reaction energy difference compared to the results obtained with the tightest, sufficiently converged threshold values. The εw = 10−5 Eh pair energy threshold was found to be highly accurate for a test set containing a wide range of molecules with 92–260 atoms in the context of our local MP2 scheme, 26 where the ED construction and MP2 domain energy computation steps are closely analogous to those of the present approach. In the LNO-CCSD(T) scheme εw also determines

33

ACS Paragon Plus Environment

0 −0.03 −0.06 −0.09 Androstendion p. ISOL4 educt Organocat. educt

−0.12 −0.15

10−4

3⋅10−5

10−5 εw [Eh]

3⋅10−6

10−6

Reaction E difference [kcal/mol]

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Correlation E difference [%]

Journal of Chemical Theory and Computation

Page 34 of 73

0.4 0.2 0 −0.2 −0.4 Androstendion ISOL4 Organocat.

−0.6 −0.8

10−4

3⋅10−5

10−5 εw [Eh]

3⋅10−6

10−6

Figure 4: LNO-CCSD(T) relative correlation energy (left) and reaction energy (right) differences for the androstendion, ISOL4, and the organocatalysis reactions as a function of the pair energy threshold, εw . Solid lines indicate the default distant–strong pair two-layer scheme, while dashed lines are used for the results obtained with three pair categories. See section 5.2 for more details.

the set of strong pair LMOs spanning the occupied LNO subspace of the CC calculation, which is an other error source if εw is not sufficiently tight. The solid lines of Figure 4 show the εw dependence of the LNO-CCSD(T) correlation and reaction energy differences compared to references obtained with εw = 10−6 Eh . This very tight value ensures that for the three systems at least 60–70% of the LMO pairs become strong and the distant pair contribution to the correlation energy is below 0.01%. All three cases exhibit rapid convergence, the correlation energy errors with respect to the reference are below 0.05% already with the default εw = 10−5 Eh , and decrease below 0.013% with the tighter εw = 3 · 10−6 Eh . Even the somewhat looser εw = 3 · 10−5 Eh is sufficient to yield relative errors of less than 0.1%. The corresponding reaction energy differences are also satisfactory being at most 0.3 kcal/mol with the default tolerance and less than 0.1 kcal/mol with the tighter εw = 3 · 10−6 Eh value. Note that this default value of εw = 10−5 Eh corresponds to the tighter settings in Ref. 26, matches the close and weak pair separating threshold in the local PNO family of methods of Werner and co-workers, 58 and it is halfway between the default strong/weak

34

ACS Paragon Plus Environment

Page 35 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(10−4 Eh ) and weak/distant (10−6 Eh ) separating values used in the domain based local PNO (DLPNO) methods of Neese and co-workers. 62 In order to make the thresholds consistent in our multilevel approaches, 28,29 εw = 10−5 Eh is set as default also for the 2018 variant of our LMP2 method. Let us briefly add that we also considered a three-layer distant–weak–strong pair hierarchy (dashed lines of Figure 4), where the distant and strong pairs are treated as before. In the intermediate zone of the weak pairs accurate MP2 pair energies are evaluated in the EDs as

δEi0 j 0 (Ei0 ) =

i ˜ Xh a ˜b[1] 2(˜ ai0 |˜bj 0 ) − (˜ aj 0 |˜bi0 ) ti0 j 0 ,

(24)

a ˜˜b

a ˜˜b[1] a ˜˜b[1] where (˜ ai0 |˜bj 0 ) and ti0 j 0 are obtained from the corresponding (˜ ai0 |˜b˜j) integrals and ti0˜j

amplitudes of the ED by transforming their second occupied index to the LMO basis. In these tests the δEi0 j 0 (Ei0 ) improved local MP2 pair energies were evaluated for non-distant pairs with OS-MP2 pair energy estimates of at least 10−6 Eh . Then a new, more compact ED was built including strong pairs with δEi0 j 0 (Ei0 ) > εw , and the calculation then proceeded with the LIS construction as discussed before. Notice that this procedure is reminiscent of the weak pair energy computing scheme of the DLPNO-CCSD(T) method introduced recently. 64 Looking at the performance on the ISOL4 and the organocatalysis examples of Figure 4 the accuracy of the three-layer scheme is systematically but not much better in the region of the default and tighter settings compared to the two-layer distant–strong approach. Unfortunately, having more precise pair energies, and potentially smaller number of strong pairs does not lead to more compact LNO domains and shorter wall times for the CC calculations in the LISs. To illustrate this point let us compare the domain sizes of the three-layer scheme obtained with εw = 3 · 10−5 Eh to the dimensions in the similarly accurate two-layer calculations performed with εw = 10−5 Eh and εw = 3 · 10−6 Eh on the educt of the organocatalysis reaction. There are indeed much fewer strong pairs in the three-layer case (22% vs. 29% and

35

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

42%), and the EDs are also more compact (57 vs. 66 and 81 atoms on the average) but the total (occupied plus virtual) number of LNOs in the LISs are barely smaller (140 vs. 141 and 143 in average and 212 vs. 214 and 218 at maximum). Considering these limited merits, the overhead of the repeated ED constructions, the drawbacks coming with the more complicated ansatz, and the benefits of involving the contributions of more weak pairs into the CC calculation 57,58 we decided to keep the two-layer distant–strong pair selection strategy.

5.3

Atom lists of the localized MOs

The TEDo threshold controls the BP atom lists of the truncated LMOs in the ED and implicitly, together with εw , also determines the atom list of the EDs and consequently the PAO truncation error (see section 3.2.1). The BP algorithm ensures that the LMO truncation error is below (1 − TEDo ) · 100%, that is, below 0.01% for the default TEDo = 0.9999. The PAO truncation error contribution to the local MP2 correlation energy controlled by TEDo was also found in the 0.01% range. 26 The overall effect of the LMO truncation on the accuracy of LNO-CCSD(T) can be inferred from Table 4. The LNO-CCSD(T) correlation energy difference (∆E corr ) and reaction energy difference (∆E reac ) values in the table are compared to references obtained with TEDo = 0.99999 for the androstendion and the ISOL4 reactions. Table 4: Relative LNO-CCSD(T) correlation energy differences (∆E corr ) and the corresponding reaction energy differences (∆E reac ) for the androstendion and ISOL4 reactions as a function of TEDo . Reference energies were obtained by using 0.99999 for TEDo and the default settings for the other approximations. ∆E corr [%] TEDo andr. p. ISOL4 educt 0.99995 -0.008 -0.007 0.9999 -0.020 -0.016 0.9998 -0.049 -0.040 0.9997 -0.071 -0.072 0.9995 -0.127 -0.134

∆E reac [kcal/mol] andr. ISOL4 -0.04 -0.10 -0.07 -0.26 -0.16 -0.33 0.05 -0.94 0.36 -1.47

In accord with the findings of our LMP2 study 26 the correlation energy error compared 36

ACS Paragon Plus Environment

Page 36 of 73

Page 37 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

to the reference is about 2–3 times larger than (1 − TEDo ) · 100% in the investigated range, for instance, it is between −0.016% and −0.020% with the default TEDo = 0.9999. The analogous reaction energy errors also decrease rapidly with increasing TEDo . The errors of −0.07 and −0.26 kcal/mol obtained with the default threshold are also excellent, considering that the total reaction energies are about 5 and 70 kcal/mol for the androstendion and the ISOL4 reactions, respectively.

5.4

Natural auxiliary functions

The NAF approximation was found to be very effective in both the local dRPA 25 and MP2 26 contexts, although the conclusions are not completely transferable because Ja¯P¯i -type and Ja˜Pi0 type NAFs are used for the above methods, respectively, while the present LNO-CCSD(T) scheme employs the Jp¯Pq¯ -type ones, as discussed in section 3.3.3. Table 5: Relative LNO-CCSD(T) correlation energy differences (∆E corr , in percents) for the PPh3 /cc-pVTZ, the OMCB/cc-pVQZ, and the androstendion precursor/cc-pVTZ-F12 systems as a function of the εNAF threshold. The reference is obtained with the default settings and εNAF = 0 Eh . εNAF [Eh ] PPh3 0.001 -0.00005 0.003 -0.0006 0.01 0.0027 0.03 0.0366 0.1 0.5526

OMCB andr. p. 0.0001 0.0002 0.0003 0.0008 0.0008 0.0035 0.0161 0.0253 0.2509 0.3379

The ∆E corr correlation energy difference measures are collected in Table 5 as a function of the εNAF threshold used for Jp¯Pq¯ -type NAFs. The reference point here is εNAF = 0 Eh , i.e., the case when the NAF space is not truncated. The PPh3 /cc-pVTZ/cc-pVTZ-RI, OMCB/ccpVQZ/cc-pVQZ-RI, and the androstendion precursor/cc-pVTZ-F12/aug-cc-pVTZ-RI systems are selected in order to sample various important AO and auxiliary basis set types. The results in Table 5 reveal a rapid convergence with εNAF , all errors are well below 0.005% for all three test systems with the default εNAF = 0.01 Eh . Generally we find the NAF

37

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

compression highly effective since the LNO truncation drastically decreases the number of independent LNO product densities in the LISs. Compared to the number of AFs in the PCDs (which is already 2-3 times smaller than the number of AFs residing on all atoms of the EDs) the average ratio of the retained NAFs are 23%, 17%, and 18% for the three test systems of Table 5, respectively. For larger domains and/or AO basis sets the compression rate is even better, for instance, 12% and 9% of the NAFs were sufficient for the TSRS CC · · · pnp structure with the aug-cc-pVTZ and aug-cc-pVQZ bases, and 16% and 9% of the NAFs were kept for the crambin protein with the def2-TZVP and def2-QZVP bases (see Table 11).

5.5

Approximations for the virtual LNOs

Four similar but not identical virtual LNO sets can be distinguished considering the two new approximations introduced in sections 3.3.1 and 3.3.2. First, the previously employed 23–25,27 LNO0 and the LNO set defined by Eq. (15) can be compared, differing only in the inclusion of the frozen occupied LNO contributions in the virtual–virtual block of the domain MP2 density matrix. Second, both the LNO0 and LNO sets can be approximated by projection from the PAO’ space (see 3.3.2), leading to the LNO0 ’ and LNO’ sets. The LNO’ approach combines the algorithmic benefits of both ideas and it is chosen as default. The accuracy of the two approximations is illustrated on the example of the organocatalysis reaction (see Table 6). The previous correlation and reaction energy difference measures are studied again using the LNO0 results as reference. As expected, the numerical results obtained with the four LNO sets are very close, all of them are found within a 0.02% wide correlation and a 0.009 kcal/mol wide reaction energy range. The accuracy of the LNO0 to LNO approximation is rationalized by the fact that the virtual LNOs do not have to describe the correlation of the electrons on frozen occupied LNOs, that is taken into account at the MP2 level in the ED. Considering the LNO’ approximation the difference of the LNO and LNO’ orbitals is fortunately noticeable only around the edges of the ED, where the tails of the distant pair LMOs of the neighboring 38

ACS Paragon Plus Environment

Page 38 of 73

Page 39 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

EDs start to overlap with the LMOs of the given ED, and where the tails of the ED’s strong pair LMOs are cut to the largest extent. By construction the virtual LNOs are centered more closely to the central LMO of the ED, where the approximations leading to the LNO’ set are the least severe. Indeed, we observe that, on the average, the overlap of a virtual LNO’s projection onto the LNO’ space with the parent LNO is about 0.9995-0.9999, which is almost an order of magnitude better than the same measure for the PAOs. Furthermore, since neither of the new virtual LNO approximations affect the MP2 domain energies of the EDs, both of them are corrected by the ∆E MP2 correction of Eq. (7) at the MP2 level. Table 6: LNO-CCSD(T) relative correlation energy (∆E corr ) difference for the organocatalysis educt and reaction energy (∆E reac ) difference for the organocatalysis reaction. The reference for the differences and speedup measurements is obtained with the previous virtual LNO0 orbitals, 23,24,27 the three LNO sets in the last three columns are defined in section 5.5. LNO LNO0 ’ ∆E corr [%] -0.019 0.006 ∆E reac [kcal/mol] -0.003 0.009 n ¯v 117.3 121.7 speedup 1.06 2.06

LNO’ -0.012 0.007 117.3 2.20

Considering the efficiency gain the two ideas enhance each other. The most direct consequence is that the operation count requirements of the rate determining steps in the twoexternal integral transformation and in the CCSD(T) calculation decrease by a factor of f and f 4 , respectively. Here f is the ratio of the number of LNO0 and LNO type virtual functions, that is, 121.7/117.3=1.04 in an average LIS for the above example. This in itself is quite insignificant, but f can reach up to 1.14 in the case of the crambin/def2-TZVP example. This predicts a potential speedup of at most f 4 = 1.7, and we indeed measured factors of 1.4 and 1.6 for the wall times of the CCSD and (T) steps, respectively. As discussed in section 3.3.2, the operation count reduction in the first step of the two-external  PCD 2 integral transformation on the primed route is g 2 = nED , while the cost of the AO /nAO CCSD(T) part is not affected. For the organocatalysis and the crambin examples this factor is about 1.92 = 3.6 and 2.82 = 7.8, while the total theoretical speedup, f · g 2 , is 3.7 and 8.9, 39

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

respectively, for the first AO to LNO transformation step. It is more practical to measure the wall times for the complete two-external transformation (tJab ). The measured LNO0 to LNO’ speedups in tJab (see, e.g., the last line of Table 6) are considerably less, specifically, 2.2 and 4.1 for the organocatalysis and the crambin examples. The lower but still significant improvement is understandable since the tJab times include more steps. Namely, the recomputation of all the necessary three-center AO integrals, both AO to LNO transformations, the NAF construction, and the AF to NAF transformation are all taken into account in tJab , but only the operation count of the most demanding first AO to LNO step decreases by the whole f · g 2 factor.

5.6

LNO selection: Larger examples

The convergence with the LNO truncation thresholds have been illustrated for relatively small molecules in section 5.1. To verify our conclusions for large molecules further convergence tests were carried out for bigger systems. The results are shown in Figure 5. Of course, unlike in section 5.1, the exact CCSD(T) reference is not available here any more. Instead, all but the εo thresholds are set to their default values, including the εo /εv = 10 ratio, and only εo is varied. For the sake of clarity, the updated, LNO’ virtual orbitals are employed in the LISs from this point throughout the rest of this report but the prime will not be indicated any more. The left panel of Figure 5 reveals that, for these three larger systems, with (augmented) triple-ζ basis sets the absolute value of the correlation energy is consistently overestimated compared to the reference due to the systematic overcorrection of the ∆E MP2 term of Eq. (7). Since the second largest contribution to the correlation energy error with respect to conventional CCSD(T), which is caused by the pair- and domain approximations and mostly controlled by εw , is found to be of the opposite sign (both the LMO and the PAO space of the ED is shrunk), often a fortunate error compensation occur. All three correlation energy differences of Figure 5 are below 0.035% with the default 40

ACS Paragon Plus Environment

Page 40 of 73

0.3 Androstendion p. ISOL4 educt Organocat. educt

0.25 0.2 0.15 0.1 0.05 0 10−4

3⋅10−5

10−5 εο

3⋅10−6

10−6

Reaction E difference [kcal/mol]

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Correlation E difference [%]

Page 41 of 73

3 2.5 2 1.5 1 0.5 0 −0.5 −1 −1.5

Androstendion ISOL4 Organocat.

10−4

3⋅10−5

10−5 εο

3⋅10−6

10−6

Figure 5: LNO-CCSD(T) relative correlation energy (left) and reaction energy (right) differences for the androstendion, ISOL4, and the organocatalysis reactions as a function of the LNO threshold, εo . See section 5.6 for more details.

εo = 10−5 , while the largest error is only 0.01% with the tighter εo = 3 · 10−6 (ISOL4 educt). It is very promising that the relative correlation energy error is comparable for the medium and the larger systems of sections 5.1 and 5.6. The reaction energy differences are below 0.16 kcal/mol for the androstendion and organocatalysis reactions, and 0.76 kcal/mol for the ISOL4 reaction using the default εo . These are also satisfactory since the relative errors compared to the reference total reaction energies are all below 1.1%.

6

Benchmarks with the default settings

The quality of the LNO-CCSD(T) correlation and reaction energies has been assessed so far by changing the thresholds individually. In this section the errors of LNO-CCSD(T) correlation energies and energy differences are studied with respect to the exact CCSD(T) reference for a wide range of relatively small and medium sized systems using the default settings. Such comparisons are limited by the extremely high costs of the conventional CCSD(T) reference calculations and therefore cannot be performed for significantly larger systems than those in the CEMS26 compilation presented in section 6.2. In order to illustrate the size- and system-dependence of the accuracy, large-scale convergence tests will also be 41

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

presented in our forthcoming benchmark study 75 using a series of systematically improving threshold combinations.

6.1

Correlation energies

The NWH test set 60 contains 23 organic reactions of 47 different chemical species including the largest OMCB with 36 atoms. We computed the reference CCSD(T) energies for the whole test set using the cc-pVTZ, aug-cc-pVTZ, and cc-pVQZ bases. The reference data is provided in the Supporting Information. It was not possible to use DF for the aug-cc-pVTZ, and the cc-pVQZ reference calculations. To determine the magnitude of the DF error when comparing LNO-CCSD(T) correlation energies including the DF error to CCSD(T) reference without DF we computed the CCSD(T)/cc-pVTZ reference energies both with and without DF. The average (maximum) relative correlation energy error of DF-CCSD(T) with respect to canonical CCSD(T) for the 47 species is 0.051% (0.088%). All the 47 correlation energies are overestimated, that is, DF-CCSD(T) gives larger absolute values. The maximum of the relative errors, corresponding to the H2 molecule, is an outlier. It is significantly larger than the others due to the small CCSD and 0 Eh (T) correlation energy of H2 . If we exclude H2 from the test set for this reason, the maximum error drops to 0.055%. The resulting STD of 0.002% indicates that the DF error is highly systematic and can be predicted to be around 0.05%. As it is expected for the DF approximation, the reaction energies are hardly affected, the average (maximum) error is only 0.01 (0.05) kcal/mol. Nonetheless, if aiming at a local approximation error of about 0.1% in the correlation energy, the ca. 0.05% additional error of the DF approximation also has to be taken into consideration. Since the DF and the local errors are usually of the opposite sign, this DF error could systematically improve the relative errors of local correlation energies if comparisons are made to reference energies obtained without DF. For instance, the average local error of about 0.15% compared to a DF-CCSD(T) reference would appear to be only 0.1% with respect to CCSD(T) without DF. Of course, if only the 42

ACS Paragon Plus Environment

Page 42 of 73

Page 43 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

errors of energy differences are of interest, references both with and without DF are equally suitable. In order to decrease the DF error as much as possible when comparing to DFerror-free CCSD(T) references, in those cases conventional HF calculations were performed and the (aug)-cc-pV(X+2)Z-RI auxiliary bases were employed with the (aug)-cc-pVXZ AO bases. Table 7: Statistical measures for relative LNO-CCSD(T) correlation energy errors in percents for the species of the NWH 60 and S66 110 test sets. CCSD(T) and DF-CCSD(T) references were used for the NWH and S66 sets, respectively. See section 6.1 for more details. test set NWH

S66

AO basis cc-pVTZ aug-cc-pVTZ cc-pVQZ aug’-cc-pVTZ

MAE MAX STD 0.024 0.092 0.025 0.024 0.084 0.030 0.069 0.143 0.035 0.032 0.094 0.033

The LNO-CCSD(T) correlation energy errors of Table 7 are in accord with our design goals. The MAE errors with the (aug)-cc-pVTZ bases are 0.024% and the MAX errors are still below 0.1%. The cc-pVQZ MAE being 0.069% is still acceptable although larger than 0.1% errors also occur with the quadruple-ζ basis. The STD values are consistently small, lying in the 0.025-0.035% range, which is very promising from the perspective of energy differences. The accuracy of the local approximations is also consistent if the LNO-CCSD(T)/ccpVTZ/cc-pVTZ-RI energies are compared to the DF-CCSD(T)/cc-pVTZ/cc-pVTZ-RI reference: 0.028%, 0.095%, and 0.024% are obtained for the MAE, MAX, and STD, respectively, on the NWH set. To illustrate the above point about the drawbacks of comparing local CCSD(T) results to DF-error-free CCSD(T) reference the error measures for the LNOCCSD(T)/cc-pVTZ/cc-pVTZ-RI energies are also given using CCSD(T)/cc-pVTZ (without DF) as reference. The corresponding MAE, MAX, and STD values of 0.030%, 0.095%, and 0.025% are quite close to the previous ones. However, this agreement is accidental since the average signed relative error is −0.018% for the cc-pVTZ case of Table 7, where the DF error is almost eliminated, while the corresponding signed relative error is 0.027% when 43

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

LNO-CCSD(T)/cc-pVTZ/cc-pVTZ-RI energies are compared to a reference without DF. Thus, here we observe that the systematic DF error shifts the mean signed error by almost exactly 0.05%, from −0.018% to 0.027%. The absolute measures are insensitive to that shift in this special case, but generally the quality of the reference regarding the use of DF will affect the correlation energy errors. The same correlation energy error measures are also given in Table 7 for the monomers and dimers of the S66 set. 110 The aug’-cc-pVTZ basis was employed, that is, cc-pVTZ for the H atoms and aug-cc-pVTZ for the remaining heavy atoms. The DF-CCSD(T)/aug’-cc-pVTZ reference energies are obtained relying on our recently improved CCSD(T) algorithm 27 and collected in the Supporting Information. The S66/aug’-cc-pVTZ errors are quite consistent with the NWH/aug-cc-pVTZ ones, all measures are only about 0.01% larger.

6.2

Correlation energies of medium-sized systems

Considering only relatively small systems, like those in the above test sets, can be misleading for multiple reasons. Firstly, some of the local approximations, such as the LMO and PAO truncation or the pair approximation, are only moderately activated for smaller systems. Secondly, the correlation energy error, as the CC correlation energy itself, is an extensive quantity and grows linearly with the system size. It is not evident, however, that the relative errors remain constant when all the approximations start to operate at full power. On top of that, both error amplifications and cancellations could happen. Although efficient and parallel CCSD(T) implementations have become available recently, 7–10,14–21 it is still challenging to find or compute reference CCSD(T) results for systems with more than, say, 30 atoms and at least triple-ζ quality AO basis sets due to the high computational costs. In addition, some of the CCSD(T) energies published for large systems are difficult to reproduce because the exact computational settings are unavailable or some approximations (e.g., frozen NOs) are made. We attempted to collect the currently available and well reproducible CCSD(T) data from the literature and also supplemented this 44

ACS Paragon Plus Environment

Page 44 of 73

Page 45 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

list with some of our own reference CCSD(T) calculations. The resulting test set contains 26 CCSD(T) correlation energies for medium-sized systems (CEMS26) with 30–63 atoms computed with at least triple-ζ quality basis sets (667–1564 AOs). The structures of the molecules are depicted in Figure 6, and some additional shorthands are introduced in the figure caption. The reference CCSD(T) correlation energies and references to the original publications containing the geometries and correlation energies are collected in Table 8. The entries without references are produced by us. Further computational details, the corresponding HF, MP2, and CCSD energies, point group assignations, and all the structures are available in the Supporting Information. As shown in Table 8 a reasonable range of molecule types are present, but mainly organic and biomolecules are represented. Besides the most common H, C, N, and O, other elements appear as well, namely P, Cl, S, Si, Na, Mg, and Li. Using the available CCSD(T) reference data it was also possible to calculate nine reaction energies and three conformation energy differences (see later in Table 10). The reference CCSD(T) energies are collected from multiple sources, consequently there are inconsistencies to be highlighted. Various different AO bases are employed, and fortunately in almost half of the cases basis sets beyond the polarized triple-ζ quality are used by including either diffuse AOs or a fourth valence shell. Most but not all the CCSD(T) energies are obtained without the DF approximation. In those cases the LNO-CCSD(T) calculations are also performed without relying on DF at the HF calculation. The DF error in the local correlation energy is minimized as before, using a “(X+2)-ζ-RI” type auxiliary basis in combination with the corresponding “X-ζ” AO basis wherever it was possible. The details on the auxiliary basis choices are also gathered in the Supporting Information. The relative LNO-CCSD(T) errors in Table 8 are mostly negative, the absolute value of the correlation energy is underestimated. This is in accordance with the expectation that the truncation of the orbital space should decrease the absolute value of the correlation energy. There are two exceptions: for the highly compact [2.2]PCP and OMCB systems

45

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Figure 6: Structures of the molecules in the CEMS26 compilation. Abbreviations: dCMP is deoxycytidine monophosphate, [2.2]PCP is [2.2]paracyclophane, OMCB is octamethylcyclobutane, OMS is octamethylsilsesquioxane, [Li(crown)2 ]+ is the complex of Li+ and two 12crown-4 ether molecules, and GC-dDMP is guanine-cytosine deoxydinucleotide monophosphate. The structure of the second, B conformer of GC-dDMP is similar to that of the A conformer and is not shown. The figures are produced with the IboView program. 88

46

ACS Paragon Plus Environment

Page 46 of 73

Page 47 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Table 8: Relative LNO-CCSD(T) correlation energy errors with respect to the exact E CCSD(T) reference. Statistical measures for the accuracy of LNO-CCSD(T) on the CEMS26 set: MAE = 0.067%, MAX=0.145%, STD=0.066%. Wall times for the correlation part are presented in column 7 obtained with the more relevant default auxiliary basis (i.e., “X-ζ-RI” is used with “X-ζ” bases) using an 8-core CPU. Wall times measured with the “(X+2)-ζ-RI” bases and further details are given in the Supporting Information. Molecule diclofenac 66 ISOL11 educt 111 ISOL11 product 111 cyclopentene dimer 110 Gly4 96 Gly4 96 [2.2]PCP 60 [2.2]PCP 60 [2.2]PCP 60 melatonine aa 113 melatonine dw 113 PPh3 57 neopentane dimer 110 dCMP OMCB 60 OMCB 60 OMCB 60 Mg-porphyrin 114 porphyrin 114 penicillin 66 (H2 O)17 sphere 115 (H2 O)17 5525 115 OMS 116 [Li(crown)2 ]+ GC-dDMP A 9 GC-dDMP B 9

Atoms 30 30 30 30 31 31 32 32 32 33 33 34 34 34 36 36 36 37 38 42 51 51 52 57 63 63

AO basis def2-TZVP cc-pVTZ cc-pVTZ aug’-cc-pVTZ cc-pVTZ aug-cc-pVTZ cc-pVTZ aug-cc-pVTZ cc-pVQZ cc-pVT’Z cc-pVT’Z cc-pVTZ aug’-cc-pVTZ cc-pVTZ cc-pVTZ aug-cc-pVTZ cc-pVQZ cc-pVTZ cc-pVTZ def2-TZVP aug-cc-pVTZ aug-cc-pVTZ cc-pVTZ cc-pVTZ 6-311++G** 6-311++G**

No. of AOs 667 740 740 740 706 1104 704 1104 1360 606 606 784 796 800 696 1104 1380 922 916 858 1564 1564 1208 1198 1042 1042

E CCSD(T) [Eh ] -3.459576 66 -3.540825 -3.566624 -1.940429 -3.459994 -3.523466 -2.802003 -2.841173 -2.938780 -3.149493 113 -3.139806 113 -3.219445 -2.018191 -4.022583 -3.459994 -2.374168 -2.448038 -4.197969 -4.189472 -4.550853 66 -4.925483 115 -4.920207 115 -5.353334 -4.982973 -6.600887 9 -6.602004 9

error [%] -0.02 -0.08 -0.08 -0.00 -0.10 -0.10 0.01 0.04 0.11 -0.03 -0.04 0.00 -0.03 -0.07 -0.02 0.01 0.08 -0.13 -0.12 -0.05 -0.14 -0.14 -0.15 -0.06 -0.06 -0.06

time [min] 119 131 262 22 15 24 95 131 178 104 64 45 18 62 17 26 37 441 420 91 50 32 3 14 362 328

the error is positive due to the overcorrection of ∆E MP2 . In general the statistical measures are highly satisfactory being 0.067% for the MAE and 0.066% for the STD. Errors above 0.1% also occur for six systems, these include the more challenging [2.2]PCP/cc-pVQZ, the two (H2 O)17 clusters, the two porphyrin molecules with extensive delocalized π-systems, and OMS. The MAX error of 0.145% measured for the latter is, however, still acceptable with the default settings. Promisingly, the relative errors for analogous (e.g., educt-product pair or conformational isomer) structures differ by less than 0.016%, therefore error cancellation of 47

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

at least an order of magnitude can be expected for the energy differences. Let us again point out the importance of separating the DF and the local errors also on example of the CEMS26 set. For instance, compared to the same reference containing mostly DF-error-free CCSD(T) energies the mean signed error of the LNO-CCSD(T) correlation energy is −0.047% with the “(X+2)-ζ-RI” auxiliary basis sets, which is decreased by the DF error to only −0.015% if the “X-ζ-RI” bases are employed. The CEMS26 results can be compared to the previous observations for the NWH set. The overall MAE for the NWH set averaging for all three bases is lower, only 0.039%, but the significant size differences also have to be taken into account. The average number of atoms is 13.5 for the NWH set, almost three times smaller than the corresponding CEMS26 value of 39 atoms. Similarly, there are only 437 AOs on the average for the molecules of the NWH set with all three bases, while the same average for the CEMS26 is 960 AOs. Nevertheless, the MAE of 0.067% measured for the CEMS26 set is probably closer to the average error to be expected for even larger molecules. We continue to investigate the size dependence of the local errors on even larger systems in the forthcoming benchmark study. 75

6.3

Reaction and interaction energies

The accuracy of the LNO-CCSD(T) method is also characterized for reaction, non-covalent interaction, and conformational energies of the NWH, 60 S66, 110 and CEMS26 compilations. Looking at the NWH reaction energy errors presented in Table 9 one finds the accuracy of LNO-CCSD(T) highly balanced across the three studied basis sets. The MAE, MAX, and STD measures vary only a few hundredths of a kcal/mol around 0.13, 0.63, and 0.21 kcal/mol, respectively. Interestingly, the relatively larger inaccuracies in the cc-pVQZ correlation energies do not have apparently larger effect on the cc-pVQZ reaction energies. This is at least partly explained by the comparably small STD of the relative correlation energy errors obtained for all three bases (see Table 7). The LNO-CCSD(T) interaction energies, at least for the S66 set, are almost as accurate 48

ACS Paragon Plus Environment

Page 48 of 73

Page 49 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Table 9: Error measures (in kcal/mol) characterizing the accuracy of LNO-CCSD(T) reaction and interaction energies with respect to the CCSD(T) reference values. In the last three lines the S66 set is partitioned into three groups including mainly H-bonding, dispersion, and other interactions according to the characterization of Ref. 110. test set NWH

AO basis MAE MAX STD cc-pVTZ 0.13 0.64 0.19 aug-cc-pVTZ 0.14 0.61 0.21 cc-pVQZ 0.13 0.63 0.23 S66 (all) aug’-cc-pVTZ 0.16 0.58 0.15 S66 (H bond) aug’-cc-pVTZ 0.14 0.41 0.13 S66 (dispersion) aug’-cc-pVTZ 0.24 0.58 0.18 S66 (other) aug’-cc-pVTZ 0.09 0.22 0.07 as the above reaction energies. It is possible to further analyze the interaction energy errors by partitioning the S66 collection according to the interaction types of hydrogen bonding, dispersion, and others. 110 The same statistical measures, collected also in Table 9, are found to be below the S66 average for the hydrogen bonding and other categories, while the MAE for the dispersion subset is above the total S66 average. On the other hand one should consider that many factors contribute to the local errors besides the interaction type. For instance, the average number of atoms in the dimers of the dispersion set is 1.4 and 1.7 times larger than the same number in the other and the hydrogen bonding lists, respectively. Perhaps it is even more important to note that the average DF-CCSD(T)/aug’-cc-pVTZ interaction energy for the dispersion set is 1.7 and 2.3 times larger than in the remaining two subsets. Considering all the above the interaction energies can be considered quite balanced over the different interaction types. Altogether, for these smaller systems, the performance of the LNO-CCSD(T) model with the default settings is promising, the MAE (MAX) is reliably below 0.25 (0.65) kcal/mol with very small, lower than 0.25 kcal/mol STD. The latter is especially important in realistic applications, where, for instance, different reaction paths or consecutive steps of the same reaction chain are compared. The largest errors consistently occur when extensive delocalized π-systems are involved, indicating in these cases the non-negligible contribution of

49

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

the medium-range interactions treated at the MP2 level. For instance, the errors are 0.5 kcal/mol or higher for the 1,2,3,4,5,6-heptahexene → hepta-1,3,5-triyne and the 2 p-xylene → [2, 2]PCP + 2 H2 reaction energies of the NWH set, and for the uracil–uracil H-bonding and benzene–uracil stacking interaction energies of the S66 set. For instance, the error of 0.58 kcal/mol measured for the benzene–uracil interaction corresponds 5.4% of the reference interaction energy obtained in the aug’-cc-pVTZ basis with DF-CCSD(T). All relative correlation energy errors are within the target accuracy, 0.02%, -0.07%, and 0.002% for the benzene, uracil, and the dimer systems, respectively, but there is no cancellation of errors between the monomer and dimer results. It is also worth pointing that for the benzene–uracil interaction energy the remaining basis set incompleteness error with the aug’-cc-pVTZ basis 68 is almost an order of magnitude larger than the error originating from the local approximations. The issue of the basis set convergence will be thoroughly investigated in our next report. 75 Finally, we note that the above generally good performance cannot solely be attributed to error compensation between the terms appearing in the energy difference expressions since many of the above reaction energies and all interaction energies result from a chemical equation with two smaller species being on one side and a single larger species being on the other side. In Table 10 reaction energies and conformer energy differences are put together including the medium-sized systems of the CEMS26 collection. The reactions forming [2, 2]PCP and OMCB are also part of the NWH set. Most apparently there is only one error grown above the previous MAX value of 0.64 kcal/mol, the 1.01 kcal/mol error for the formation of [Li(crown)2 ]+ . Since the reference energy of -125.20 kcal/mol is quite large, the relative error is fairly satisfactory being only 0.8%. Generally, for the nine reactions of Table 10 the relative reaction energy errors are all below 1.1%, hence the LNO-CCSD(T) and the exact CCSD(T) results are equally suitable for chemical purposes. The list of the three conformer energy differences in Table 10 is, of course, far from being representative. Nevertheless, it is a good start that all three errors are below 0.31

50

ACS Paragon Plus Environment

Page 50 of 73

Page 51 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Table 10: Error of LNO-CCSD(T) reaction energies and conformation energies with respect to the CCSD(T) reference [∆E CCSD(T) ] in kcal/mol. The statistical error measures are MAE=0.34 kcal/mol, MAX=1.01 kcal/mol, and STD=0.46 kcal/mol. reaction/conformers ISOL11 educt → ISOL11 product 2 p-xylene → [2, 2]PCP + 2 H2 2 p-xylene → [2, 2]PCP + 2 H2 2 p-xylene → [2, 2]PCP + 2 H2 2 2,3-dimethylbut-2-ene → OMCB 2 2,3-dimethylbut-2-ene → OMCB 2 2,3-dimethylbut-2-ene → OMCB porphyrin + Mg → Mg-porphyrin + H2 Li+ + 2 12-crown-4 → [Li(crown)2 ]+ melatonine aa → melatonine dw (H2 O)17 sphere → (H2 O)17 5525 GC-dDMP A → GC-dDMP B

AO basis ∆E CCSD(T) cc-pVTZ 35.32 cc-pVTZ 57.95 aug-cc-pVTZ 55.74 cc-pVQZ 57.90 cc-pVTZ -19.67 aug-cc-pVTZ -22.03 cc-pVQZ -18.98 cc-pVTZ -97.28 cc-pVTZ -125.20 cc-pVT’Z 9.56 aug-cc-pVTZ 0.71 6-311++G** -1.22

error -0.16 -0.54 -0.61 -0.62 0.03 -0.13 0.00 0.29 1.01 0.31 -0.07 -0.29

kcal/mol. Since these differences are usually much smaller than the reaction energies, their computation often requires the use of tighter threshold settings. 117

7

Hardware requirements and performance for large examples

Wall-clock times presented in this section are measured using a single, 8-core processor. Representative timings for the systems of 30 to 63 atoms of the CEMS26 collection can be found in the last column of Table 8. The table actually collects LNO-CCSD(T) wall times measured with the more realistic default auxiliary bases that corresponds to the selected AO basis, i.e., “X-ζ-RI” is employed with an “X-ζ” AO basis. The timings measured with the “(X+2)ζ-RI” auxiliary bases are presented in the Supporting Information. For half of the systems in Table 8 at least C2 spatial symmetry was utilized in the local correlation calculation. This is one aspect of the compilation that is not representative of the target systems which the LNO-CCSD(T) method will most probably be applied to. The inclusion of molecules with spatial symmetry was, however, necessary to reduce the high costs of the reference 51

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

CCSD(T) calculations. The gain in the most cases is about a factor of two for molecules with C2 symmetry. In the most extreme example there are only 4 symmetry-inequivalent LMOs out of the 80 total correlated LMOs in OMS exhibiting octahedral symmetry. Thus 20·3=60 minutes would be a better estimate for the wall time for analogous non-symmetric systems. The ratio of symmetry-inequivalent per total LMOs for the other elements of the set is also available in the Supporting Information allowing for similar wall-time estimations. The actual wall times in Table 8 being in the 1–7 hour range prove that accurate an LNOCCSD(T) energy calculation with sufficiently large AO basis sets has become a routine task for such medium-sized systems even with a simple desktop computer. As that the approximations automatically adopt to the complexity and electronic structure of the molecule, longer evaluation times are observed for systems with extended delocalized π-systems, with basis sets including diffuse functions, or with compact, three-dimensional structure. The latter aspect is observable, for instance, on the two significantly different melatonine conformers.

7.1

Large-scale applications

Large-scale examples representative of the current capabilities of our implementation are collected in Table 11. The TSRS CC · · · pnp example is the largest species appearing in a Michaeladdition reaction that was a subject of a recent, computational mechanistic study. 73 This example was selected because its size is close to the limit for which near-CBS quality LNOCCSD(T) energies can be obtained by extrapolating from augmented quadruple-ζ results. Additionally, LNO-CCSD(T)/aug-cc-pVQZ calculations with significantly tighter thresholds can still be performed for molecules of about 100 atoms using the same modest hardware in a matter of 1–2 weeks. 75 The latter calculations are useful to estimate the accuracy of the local approximations for larger systems. 75 The basis set limit of LNO-CCSD(T) with the default settings can be approached presently up to the crambin protein 62 containing 644 atoms. The corresponding def2-TZVP and def2-QZVP LNO-CCSD(T) calculations took about 3 and 11 days on the same single-CPU machine, respectively. The largest LNO-CCSD(T) calculation 52

ACS Paragon Plus Environment

Page 52 of 73

Page 53 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

we have attempted so far is performed for the integrase protein with the def2-TZVP basis set, containing 2380 atoms and 44035 AOs. The integrase calculation represents the limit of our current cubic-scaling reduced-cost SCF implementation 26 on this single CPU machine, but the LNO-CCSD(T) calculation was still relatively manageable taking only about 10 days. Table 11: Average (maximum) domain sizes and orbital space dimensions, detailed wall-clock times in minutes (obtained with an 8-core CPU) and memory requirements, HF energies, and LMP2 and LNO-CCSD(T) correlation energies for LNO-CCSD(T) computations on large molecules using the default threshold set. Molecule No. of atoms Basis Total AOs Total AFs Total LMOs Strong pairs [%] Atoms in ED Atoms in PCD AOs in ED AFs in PCD LMOs in ED PAOs in ED Occupied LNOs Virtual LNOs NAFs HFa +localization Local MP2+LIS const. Two-external trf. LNO-CCSD Laplace (T) Total LNO-CCSD(T) Max. memory [GB] EHF [Eh ] ELMP2 [Eh ] ELNO−CCSD(T) [Eh ]

TSRS CC · · · pnp 90 aug-cc-pVTZ 3155 7001 123 39 78 (90) 48 (77) 2769 (3155) 3669 (6163) 49 (88) 1678 (2645) 29 (62) 134 (273) 461 (987) 384 552 799 1065 366 2782 24.3 -2328.083841 -8.3971 -8.9503

TSRS CC · · · pnp 90 aug-cc-pVQZ 5742 11362 123 39 76 (90) 45 (76) 4904 (5742) 5946 (9626) 49 (88) 2891 (4659) 29 (62) 154 (300) 544 (1122) 1244 2309 3145 1401 539 7395 62.9 -2328.207262 -8.8428 -9.3412

crambin 644 def2-TZVP 12075 29829 909 6.0 128 (270) 46 (89) 2615 (5694) 2398 (4917) 55 (117) 943 (1936) 28 (56) 108 (191) 373 (639) 3906 802 716 2318 428 4264 23.5 -18011.6227 -59.9816 -64.2288

crambin 644 def2-QZVP 28227 64015 909 6.1 127 (270) 46 (89) 5747 (12310) 4945 (9910) 57 (117) 2071 (4107) 29 (56) 138 (231) 484 (828) 16122 5482 5594 3934 988 15988 101.0 -18012.5067 -65.0684 -68.8940

integrase 2380 def2-TZVP 44035 108414 3301 1.6 138 (314) 47 (95) 2712 (6359) 2368 (4902) 55 (102) 938 (1950) 28 (57) 108 (212) 371 (706) 98220 3158 2651 7647 1436 14892 104.7 -59078.9084 -217.6134 -232.9814

a Canonical

DF-HF for TSRS CC · · · pnp, DF-HF with local fitting domains employed for the exchange for crambin and integrase. The initial guess for the HF calculation was the density obtained with the corresponding basis set with a cardinal number of lower by 1, e.g., the aug-cc-pVQZ calculation was started from the aug-cc-pVTZ density, etc.

part 26

Looking briefly at the average (maximum) domain sizes and orbital space dimensions presented in the top panel of Table 11 a fast saturation is observed both in the ratio of 53

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

strong LMO pairs and in the average (maximum) number of atoms, AOs, and strong pair LMOs in the EDs. The latter, fairly constant number, 50–60 in average (90–120 maximally), seems to be the limit for large, 3D (bio)organic molecules similar to those in Table 11. The computational demand beyond this molecular size is already constant in a single ED, and hence the overall computation time is linear scaling, as it was already proven by measurements in our previous report. 26 However, simply calling even an optimized conventional RI-MP2 routine would not be feasible hundreds of times for each EDs containing thousands of AOs. Due to the compact construction of the PCDs the MO (LMO and PAO) and AF space dimensions are already a factor of 2–3 smaller than those that would correspond to the full ED size. This reduces the operation count by more than one order of magnitude. An important point to note again is that the cost of all steps in a single ED scales only at most quarticly with the MO/AF dimensions of the ED, which alone decreases the time of the “Local MP2+LIS construction” steps (see Table 11) by an other order of magnitude. Compared to the PAO/AF dimensions of the EDs an additional 6- to 15-fold compression is achieved in the virtual LNO/NAF spaces of the LISs, and the number of occupied orbitals is roughly cut in half during the LMO to occupied LNO transition. With that the wall-clock times required for the highly-expensive “two-external integral transformation”, and “LNO-CCSD” and “Laplace (T)” steps scaling as the sixth-power of the LIS size become comparable to the ones measured for the LMP2 calculation in the ED. Consequently, the “Total LNO-CCSD(T)” wall-clock times are only about 3–5 times longer than those for LMP2. This ratio increases when the LNO thresholds are tightened, which, at some point, might provide sufficient motivation to stop at the LMP2 level. Since the highly-optimized Laplace transform (T) calculation requires less than 15% of the total LNO-CCSD(T) computation time, there is, however, absolutely no reason to stay at the LNO-CCSD level and not to perform the full LNO-CCSD(T) calculation. We arrived at a point in our optimization efforts where the above four major time consuming parts of the LNO-CCSD(T) calculation became roughly equally long. More precisely,

54

ACS Paragon Plus Environment

Page 54 of 73

Page 55 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

the first two steps (LMP2 and two-external transformation), whose computational requirements scale at least as the third power of dimensions of the ED, are usually more demanding with the quadruple-ζ basis, but the CCSD and (T) calculations in the LIS appear to be more expensive with smaller basis sets or when tighter LNO thresholds are applied. However, the wall-clock times can still be significantly improved in future versions, for example, by the parallel execution of the independent domain computations. At the same time more attention has to be devoted to the reference HF calculation since we have well passed the crossover point of our asymptotically linear-scaling LNO-CCSD(T) and the cubic-scaling SCF implementation. 26 The tools at our disposal to determine the accuracy of the local correlation energies for such large-scale examples are presently rather limited. Monitoring the convergence of ELNO−CCSD(T) with tighter thresholds for the TSRS CC · · · pnp and the crambin system is in the scope of our next study. 75 An alternative route for checking the accuracy of our results is the comparison to other local CCSD(T) results available in the literature. For instance, impressive DLPNO 64 and CIM-DLPNO 69 local correlation calculations have recently been presented by Neese, Riplinger, Guo, and their co-workers for the crambin/def2-TZVP system with both MP2 and CCSD(T0 ). The three available correlation energies at the MP2 level (DLPNO-MP2, CIM-DLPNOMP2, and the present LMP2 of Table 11) agree very closely, they are within 0.05%. However, the disagreement among the CIM-DLPNO-CCSD(T0 ), 69 the DLPNO-CCSD(T0 ), 64 and the LNO-CCSD(T) correlation energies is more significant. The larger difference can at least partially be attributed to the use of different HF reference (no approximation is used in the HF calculations of Refs. 64 and 69) and to the different (T) expressions [i.e., the semicanonical (T0 ) and the present Laplace-transformed (T)], to name a few. The difference in the local (T) energies might be soon eliminated to a large extent via the new iterative DLPNO(T) method of Guo et al. 66 To better understand the source of the remaining difference we are planning to perform LNO-CCSD(T) calculations with tighter thresholds, which will be

55

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

presented in a later report. 75

7.2

Scaling and data storage requirement

The prerequisite HF, occupied orbital localization, and PAO construction steps are currently cubic-scaling, while the pair-energy computation is quadratic-scaling in our present implementation. Due to the small prefactor of the latter three, in the limit of our largest examples only the time spent on the SCF iteration becomes comparable to or larger than the time of the remaining linear-scaling local correlation steps. Our current SCF implementation also requires the storage of 7 n2AO -sized arrays, which, with nAO = 44035 in the case of the integrase/def2-TZVP system, grow up to about 104 GB. The LNO-CCSD(T) part also requires the quadratic-scaling storage of the complete Fock matrix, and the LMO and PAO coefficients, which three arrays are alone responsible for about 35 GB of the memory demand of the integrase/def2-TZVP calculation. The size of the latter three arrays has, however, not caused any problem so far, and as they are sparse, their linear-scaling storage can also be implemented relatively simply if needed. We also note that currently there are 6 pieces of n2AO -sized arrays allocated for the PAO and pair energy construction parts, which are mostly responsible for the almost 105 GB memory requirement of the integrase calculation. The majority of the arrays are deallocated as soon as the ED calculations starts, and some of them are planned to be removed in the near future. This allocation is not problematic presently since the memory demand of the SCF step is still not exceeded. Following the pair prescreening the operation count and the memory demand of all the operations are asymptotically linear-scaling and constant, respectively. We have devoted considerable efforts to minimize the memory requirement of the operations in the EDs without resorting to any hard disk use. Hence the maximum memory consumption, e.g., for the largest examples of the CEMS26 set, is kept below 12 GB (OMCB/cc-pVQZ), and only 7.2 GB was measured for the GC-dDMP/6-311++G** calculation. The larger TSRS CC · · · pnp/augcc-pVTZ and crambin/def2-TZVP calculations can also be performed with less than 25 GB 56

ACS Paragon Plus Environment

Page 56 of 73

Page 57 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

of memory, the allocation of 63 GB for the TSRS CC · · · pnp/aug-cc-pVQZ system is still acceptable, and only the crambin/def2-QZVP and the integrase/def2-TZVP examples require slightly more than 100 GB of memory. Up to the beginning of the LIS calculations all the steps are performed in the memory. Currently, the assembled four-center integrals and other intermediates of the CC iteration are stored on the hard drive, therefore memory bottlenecks do not occur during the LIS calculations. The total disk use is also quite moderate, it only grows above 100 GB for the largest examples. Moreover, for normal applications, these files can often be stored in memory as well if, for instance, a sufficiently large portion of memory is available mounted as a shared memory or temporary file storage facility (usually called /dev/shm on unix-like operating systems).

8

Summary and outlook

In this work a significantly more efficient LNO-CCSD(T) implementation is presented combining beneficial properties of both fragmentation and direct local correlation methods with new algorithmic ideas. The major improvements are achieved in the domain construction, the LNO determination, the auxiliary basis compression, and the previously rate-determining two-external integral transformation steps. The accuracy is also improved by tuning some of the truncation thresholds. The errors of the local approximations are illustrated on the example of correlation and reaction energies of realistic molecules of 61–106 atoms also including species (e.g., the molecules of the ISOL4 reaction) that were identified as especially challenging from the perspective of local approximations. A reasonable default threshold set is determined providing on the average correlation and reaction energy errors of smaller than 0.07% and 0.34 kcal/mol, respectively, compared to the exact CCSD(T) reference. Those error statistics hold for both triple- and quadruple-ζ basis sets and also for the molecules and reactions of a new test set compiled herein collecting exact CCSD(T) reference energies

57

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

for medium-sized systems of 30–63 atoms. Further benchmarks will be provided in our upcoming report 75 to assess the efficient LNO-CCSD(T) based tools suitable for approaching the CBS CCSD(T) energies of even larger molecules. It is important to highlight that the entire LNO-CCSD(T) approach is free from heuristic parameters (such as real space cutoffs, scaling parameters, etc.), the domain construction is system specific and automatic. Additionally, a highly-accurate LMP2 correlation energy is also obtained as a spin-off of the LNO-CCSD(T) calculation. We have shown that, due to the extensive optimization of both the operation count and the memory requirement, LNO-CCSD(T) calculations can now be preformed in a matter of days and with a few tens of GBs of memory on a single-CPU workstation for systems with close to 100 atoms and augmented quadruple-ζ quality basis sets. Using the same modest hardware we have presented by far the largest local CCSD(T) calculations performed for realistic, three-dimensional systems, the integrase enzyme of 2380 atoms and the crambin protein with 644 atoms using the def2-TZVP and the def2-QZVP basis sets, respectively. To our knowledge both systems include almost four times as many atoms as the molecules considered in the previously reported largest applications with the same basis sets, which were performed for the crambin/def2-TZVP 27,64,69 and vancomycin/def2-QZVP 66 systems, respectively. Since, for example, in an enzyme reaction it is usually not necessary to perform quantum mechanical calculations involving more than about 500–1000 atoms, the present LNO-CCSD(T) scheme, in combination with embedding or multilevel techniques, 28,29 can already be employed even for entire biomolecules.

Acknowledgement Helpful correspondence regarding some of the CEMS26 reference energies are gratefully acknowledged to Professors Jan M. L. Martin (Weizmann Institute), Eduardo Apr`a (PNNL), Sotiris S. Xantheas (PNNL), Victor M. Anisimov (University of Illinois), and Yang Guo (MPI M¨ ulheim). The authors are grateful for the financial support from the National Research, 58

ACS Paragon Plus Environment

Page 58 of 73

Page 59 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Development, and Innovation Office (NKFIH, Grant No. KKP126451). This work was also supported by the BME-Biotechnology FIKP grant of EMMI (BME FIKP-BIO). The work of P.R.N. is supported through the New National Excellence Program of the Ministry of Human Capacities, ID: UNKP-17-4-BME-55. The computing time granted on the Hungarian HPC ´ Infrastructure at NIIF Institute, and on the BME HPC Cluster (project number: TAMOP - 4.2.2.B-10/1-2010-0009) is thankfully acknowledged.

Supporting Information Available See the supporting information for the structure of the investigated molecules including the ones in the NWH and the CEMS26 compilations, and the large systems of Table 11. Reference canonical CCSD(T) or DF-CCSD(T) correlation energies and further computational details are also provided for the NWH, S66, and the CEMS26 test sets. Correlation and reaction energies used as reference in Figures 2–5 and Tables 4–6 are also reported. This material is available free of charge via the Internet at http://pubs.acs.org/.

References (1) Bartlett, R. J.; Musial, M. Coupled-cluster theory in quantum chemistry. Rev. Mod. Phys. 2007, 79, 291. (2) K´allay, M.; Gauss, J. Analytic second derivatives for general coupled-cluster and configuration interaction models. J. Chem. Phys. 2004, 120, 6841. (3) K´allay, M.; Gauss, J. Calculation of excited-state properties using general coupledcluster and configuration-interaction models. J. Chem. Phys. 2004, 121, 9257. (4) Raghavachari, K.; Trucks, G. W.; Pople, J. A.; Head-Gordon, M. A fifth-order perturbation comparison of electron correlation theories. Chem. Phys. Lett. 1989, 157, 479.

59

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(5) K´allay, M.; Surj´an, P. R. Higher excitations in coupled-cluster theory. J. Chem. Phys. 2001, 115, 2945. (6) K´allay, M.; Gauss, J. Approximate treatment of higher excitations in coupled-cluster theory. J. Chem. Phys. 2005, 123, 214105. (7) Deegan, M. J. O.; Knowles, P. J. Perturbative corrections to account for triple excitations in closed and open shell coupled cluster theories. Chem. Phys. Lett. 1994, 227, 321. (8) Kobayashi, R.; Rendell, A. P. A direct coupled cluster algorithm for massively parallel computers. Chem. Phys. Lett. 1997, 265, 1–11. (9) Anisimov, V. M.; Bauer, G. H.; Chadalavada, K.; Olson, R. M.; Glenski, J. W.; Kramer, W. T. C.; Apr`a, E.; Kowalski, K. Optimization of the Coupled Cluster Implementation in NWChem on Petascale Parallel Architectures. J. Chem. Theory Comput. 2014, 10, 4307–4316. (10) Harding, M. E.; Metzroth, T.; Gauss, J.; Auer, A. A. Parallel Calculation of CCSD and CCSD(T) Analytic First and Second Derivatives. J. Chem. Theory Comput. 2008, 4, 64. (11) DePrince, A. E.; Sherrill, C. D. Accurate Noncovalent Interaction Energies Using Truncated Basis Sets Based on Frozen Natural Orbitals. J. Chem. Theory Comput. 2013, 9, 293. (12) DePrince, A. E.; Sherrill, C. D. Accuracy and Efficiency of Coupled-Cluster Theory Using Density Fitting/Cholesky Decomposition, Frozen Natural Orbitals, and a t1Transformed Hamiltonian. J. Chem. Theory Comput. 2013, 9, 2687–2696. (13) Epifanovsky, E.; Zuev, D.; Feng, X.; Khistyaev, K.; Shao, Y.; Krylov, A. I. General implementation of the resolution-of-the-identity and Cholesky representations of 60

ACS Paragon Plus Environment

Page 60 of 73

Page 61 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

electron repulsion integrals within coupled-cluster and equation-of-motion methods: Theory and benchmarks. J. Chem. Phys. 2013, 139, 134105. (14) Janowski, T.; Pulay, P. Efficient Parallel Implementation of the CCSD External Exchange Operator and the Perturbative Triples (T) Energy Calculation. J. Chem. Theory Comput. 2008, 4, 1585–1592. (15) Pitoˇ na´k, M.; Aquilante, F.; Hobza, P.; Neogr´ady, P.; Noga, J.; Urban, M. Parallelized implementation of the CCSD(T) method in MOLCAS using optimized virtual orbitals space and Cholesky decomposed two-electron integrals. Coll. Czech. Chem. Comm. 2011, 76, 713–742. (16) Deumens, E.; Lotrich, V. F.; Perera, A.; Ponton, M. J.; Sanders, B. A.; Bartlett, R. J. Software design of ACES III with the super instruction architecture. WIREs Comput. Mol. Sci. 2011, 1, 895–901. (17) Peng, C.; Calvin, J. A.; Pavoˇsevi´c, F.; Zhang, J.; Valeev, E. F. Massively Parallel Implementation of Explicitly Correlated Coupled-Cluster Singles and Doubles Using TiledArray Framework. J. Phys. Chem. A 2016, 120, 10231–10244. (18) Kaliman, I. A.; Krylov, A. I. New algorithm for tensor contractions on multi-core CPUs, GPUs, and accelerators enables CCSD and EOM-CCSD calculations with over 1000 basis functions on a single compute node. J. Comput. Chem. 2017, 38, 842–853. (19) A. Eugene DePrince, I.; Kennedy, M. R.; Sumpter, B. G.; Sherrill, C. D. Densityfitted singles and doubles coupled cluster on graphics processing units. Mol. Phys. 2014, 112, 844–852. (20) Asadchev, A.; Gordon, M. S. Fast and Flexible Coupled Cluster Implementation. J. Chem. Theory Comput. 2013, 9, 3385–3392.

61

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(21) Eriksen, J. J. Efficient and portable acceleration of quantum chemical many-body methods in mixed floating point precision using OpenACC compiler directives. Mol. Phys. 2017, 115, 2086. (22) Dutta, A. K.; Neese, F.; Izs´ak, R. Accelerating the coupled-cluster singles and doubles method using the chain-of-sphere approximation. Mol. Phys. 2018, 116, 1428–1434. (23) Rolik, Z.; K´allay, M. A general-order local coupled-cluster method based on the clusterin-molecule approach. J. Chem. Phys. 2011, 135, 104111. (24) Rolik, Z.; Szegedy, L.; Ladj´anszki, I.; Lad´oczki, B.; K´allay, M. An efficient linearscaling CCSD(T) method based on local natural orbitals. J. Chem. Phys. 2013, 139, 094105. (25) K´allay, M. Linear-scaling implementation of the direct random-phase approximation. J. Chem. Phys. 2015, 142, 204105. (26) Nagy, P. R.; Samu, G.; K´allay, M. An integral-direct linear-scaling second-order Møller–Plesset approach. J. Chem. Theory Comput. 2016, 12, 4897. (27) Nagy, P. R.; K´allay, M. Optimization of the linear-scaling local natural orbital CCSD(T) method: Redundancy-free triples correction using Laplace transform. J. Chem. Phys. 2017, 146, 214106. (28) H´egely, B.; Nagy, P. R.; Ferenczy, G. G.; K´allay, M. Exact density functional and wave function embedding schemes based on orbital localization. J. Chem. Phys. 2016, 145, 064107. (29) H´egely, B.; Nagy, P. R.; K´allay, M. Dual basis set approach for density functional and wave function embedding schemes. J. Chem. Theory Comput. 2018, submitted. (30) Zale´sny, R.; Papadopoulos, M.; Mezey, P.; Leszczynski, J. Linear-Scaling Techniques

62

ACS Paragon Plus Environment

Page 62 of 73

Page 63 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

in Computational Chemistry and Physics: Methods and Applications; Challenges and Advances in Computational Chemistry and Physics; Springer: Netherlands, 2011. (31) Collins, M. A.; Bettens, R. P. A. Energy-Based Molecular Fragmentation Methods. Chem. Rev. 2015, 115, 5607–5642. (32) Raghavachari, K.; Saha, A. Accurate Composite and Fragment-Based Quantum Chemical Models for Large Molecules. Chem. Rev. 2015, 115, 5643–5677. (33) Gordon, M., Ed. Fragmentation: Toward Accurate Calculations on Complex Molecular Systems; Wiley: New York, 2017. ˇ ıˇzek, J. Coupled-cluster studies. II. The role of (34) F¨orner, W.; Ladik, J.; Otto, P.; C´ localization in correlation calculations on extended systems. Chem. Phys. 1985, 97, 251. (35) Flocke, N.; Bartlett, R. J. A natural linear scaling coupled-cluster method. J. Chem. Phys. 2004, 121, 10935. (36) Friedrich, J.; Dolg, M. Fully Automated Incremental Evaluation of MP2 and CCSD(T) Energies: Application to Water Clusters. J. Chem. Theory Comput. 2009, 5, 287. (37) Fiedler, B.; Schmitz, G.; Hattig, C.; Friedrich, J. Combining Accuracy and Efficiency: An Incremental Focal-Point Method Based on Pair Natural Orbitals. J. Chem. Theory Comput. 2017, 13, 6023–6042. (38) Kobayashi, M.; Nakai, H. Divide-and-conquer-based linear-scaling approach for traditional and renormalized coupled cluster methods with single, double, and noniterative triple excitations. J. Chem. Phys. 2009, 131, 114108. (39) Nakano, M.; Yoshikawa, T.; Hirata, S.; Seino, J.; Nakai, H. Computerized implementation of higher-order electron-correlation methods and their linear-scaling divide-andconquer extensions. J. Comput. Chem. 2017, 38, 2520–2527. 63

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(40) Mochizuki, Y.; Yamashita, K.; Nakano, T.; Okiyama, Y.; Fukuzawa, K.; Taguchi, N.; Tanaka, S. Higher-order correlated calculations based on fragment molecular orbital scheme. Theor. Chim. Acta 2011, 130, 515–530. (41) Yuan, D.; Li, Y.; Ni, Z.; Pulay, P.; Li, W.; Li, S. Benchmark Relative Energies for Large Water Clusters with the Generalized Energy-Based Fragmentation Method. J. Chem. Theory Comput. 2017, 13, 2696–2704. (42) Eriksen, J. J.; Baudin, P.; Ettenhuber, P.; Kristensen, K.; Kjærgaard, T.; Jørgensen, P. Linear-Scaling Coupled Cluster with Perturbative Triple Excitations: The Divide– Expand–Consolidate CCSD(T) Model. J. Chem. Theory Comput. 2015, 11, 2984– 2993. (43) Li, W.; Piecuch, P.; Gour, J. R.; Li, S. Local correlation calculations using standard and renormalized coupled-cluster approaches. J. Chem. Phys. 2009, 131, 114109. (44) Zi´olkowski, M.; Jans´ık, B.; Kjærgaard, T.; Jørgensen, P. Linear scaling coupled cluster method with correlation energy based error control. J. Chem. Phys. 2010, 133, 014107. (45) Stoll, H. Correlation energy of diamond. Phys. Rev. B 1992, 46, 6700. (46) Li, W.; Li, S. Divide-and-conquer local correlation approach to the correlation energy of large molecules. J. Chem. Phys. 2004, 121, 6649. (47) Findlater, A. D.; Zahariev, F.; Gordon, M. S. Combined Fragment Molecular Orbital Cluster in Molecule Approach to Massively Parallel Electron Correlation Calculations for Large Systems. J. Phys. Chem. A 2015, 119, 3587. (48) Li, W.; Ni, Z.; Li, S. Cluster-in-molecule local correlation method for post-Hartree– Fock calculations of large systems. Mol. Phys. 2016, 114, 1447–1460. (49) Pulay, P. Localizability of dynamic electron correlation. Chem. Phys. Lett. 1983, 100, 151. 64

ACS Paragon Plus Environment

Page 64 of 73

Page 65 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(50) Pulay, P.; Saebø, S. Orbital-invariant formulation and second-order gradient evaluation in Møller–Plesset perturbation theory. Theor. Chim. Acta 1986, 69, 357. (51) Saebø, S.; Pulay, P. Fourth-order Møller–Plessett perturbation theory in the local correlation treatment. I. Method. J. Chem. Phys. 1987, 86, 914. (52) Saebø, S.; Pulay, P. Local configuration interaction: An efficient approach for larger molecules. Chem. Phys. Lett. 1985, 113, 13. (53) Sch¨ utz, M.; Werner, H.-J. Local perturbative triples correction (T) with linear cost scaling. Chem. Phys. Lett. 2000, 318, 370. (54) Sch¨ utz, M. Low-order scaling local electron correlation methods. III. Linear scaling local perturbative triples correction (T). J. Chem. Phys. 2000, 113, 9986. (55) Werner, H.-J.; Sch¨ utz, M. An efficient local coupled cluster method for accurate thermochemistry of large systems. J. Chem. Phys. 2011, 135, 144116. (56) Yang, J.; Kurashige, Y.; Manby, F. R.; Chan, G. K.-L. Tensor factorizations of local second-order Møller–Plesset theory. J. Chem. Phys. 2011, 134, 044123. (57) Schwilk, M.; Usvyat, D.; Werner, H.-J. Communication: Improved pair approximations in local coupled-cluster methods. J. Chem. Phys. 2015, 142, 121102. (58) Schwilk, M.; Ma, Q.; K¨oppl, C.; Werner, H.-J. Scalable Electron Correlation Methods. 3. Efficient and Accurate Parallel Local Coupled Cluster with Pair Natural Orbitals (PNO-LCCSD). J. Chem. Theory Comput. 2017, 13, 3650–3675. (59) Ma, Q.; Werner, H.-J. Scalable Electron Correlation Methods. 5. Parallel Perturbative Triples Correction for Explicitly Correlated Local Coupled Cluster with Pair Natural Orbitals. J. Chem. Theory Comput. 2018, 14, 198–215.

65

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(60) Neese, F.; Wennmohs, F.; Hansen, A. Efficient and accurate local approximations to coupled-electron pair approaches: An attempt to revive the pair natural orbital method. J. Chem. Phys. 2009, 130, 114108. (61) Riplinger, C.; Neese, F. An efficient and near linear scaling pair natural orbital based local coupled cluster method. J. Chem. Phys. 2013, 138, 034106. (62) Riplinger, C.; Sandhoefer, B.; Hansen, A.; Neese, F. Natural triple excitations in local coupled cluster calculations with pair natural orbitals. J. Chem. Phys. 2013, 139, 134101. (63) Pinski, P.; Riplinger, C.; Valeev, E. F.; Neese, F. Sparse maps—A systematic infrastructure for reduced-scaling electronic structure methods. I. An efficient and simple linear scaling local MP2 method that uses an intermediate basis of pair natural orbitals. J. Chem. Phys. 2015, 143, 034108. (64) Riplinger, C.; Pinski, P.; Becker, U.; Valeev, E. F.; Neese, F. Sparse maps—A systematic infrastructure for reduced-scaling electronic structure methods. II. Linear scaling domain based pair natural orbital coupled cluster theory. J. Chem. Phys. 2016, 144, 024109. (65) Pavoˇsevi´c, F.; Peng, C.; Pinski, P.; Riplinger, C.; Neese, F.; Valeev, E. F. SparseMaps—A systematic infrastructure for reduced scaling electronic structure methods. V. Linear scaling explicitly correlated coupled-cluster method with pair natural orbitals. J. Chem. Phys. 2017, 146, 174108. (66) Guo, Y.; Riplinger, C.; Becker, U.; Liakos, D. G.; Minenkov, Y.; Cavallo, L.; Neese, F. Communication: An improved linear scaling perturbative triples correction for the domain based local pair-natural orbital based singles and doubles coupled cluster method [DLPNO-CCSD(T)]. J. Chem. Phys. 2018, 148, 011101.

66

ACS Paragon Plus Environment

Page 66 of 73

Page 67 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(67) Schmitz, G.; Hattig, C.; Tew, D. P. Explicitly correlated PNO-MP2 and PNO-CCSD and their application to the S66 set and large molecular systems. Phys. Chem. Chem. Phys. 2014, 16, 22167–22178. (68) Schmitz, G.; H¨attig, C. Perturbative triples correction for local pair natural orbital based explicitly correlated CCSD(F12*) using Laplace transformation techniques. J. Chem. Phys. 2016, 145, 234107. (69) Guo, Y.; Becker, U.; Neese, F. Comparison and combination of “direct” and fragment based local correlation methods: Cluster in molecules and domain based local pair natural orbital perturbation and coupled cluster theories. J. Chem. Phys. 2018, 148, 124117. (70) Kjærgaard, T. The Laplace transformed divide-expand-consolidate resolution of the identity second-order Møller–Plesset perturbation (DEC-LT-RIMP2) theory method. J. Chem. Phys. 2017, 146, 044103. (71) Friedrich, J.; Hanrath, M.; Dolg, M. Using symmetry in the framework of the incremental scheme: Molecular applications. Chem. Phys. 2008, 346, 266 – 274. (72) K´allay, M. A systematic way for the cost reduction of density fitting methods. J. Chem. Phys. 2014, 141, 244113. ´ R´ev´esz, A.; ´ Dobi, Z.; Varga, S.; Hamza, A.; Nagy, P. R.; (73) F¨oldes, T.; Madar´asz, A.; Pihko, P. M.; P´apai, I. Stereocontrol in Diphenylprolinol Silyl Ether Catalyzed Michael Additions: Steric Shielding or Curtin–Hammett Scenario? J. Am. Chem. Soc. 2017, 139, 17052–17063. (74) Bojt´ar, M.; Janzs´o-Berend, P. Z.; Mester, D.; Hessz, D.; K´allay, M.; Kubinyi, M.; Bitter, I. An uracil-linked hydroxyflavone probe for the recognition of ATP. Beilstein J. Org. Chem. 2018, 14, 747–755.

67

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(75) Nagy, P. R.; K´allay, M. J. Chem. Theory Comput. 2018, in preparation. (76) Helgaker, T.; Klopper, W.; Koch, H.; Noga, J. Basis-set convergence of correlated calculations on water. J. Chem. Phys. 1997, 106, 9639. (77) Koch, H.; S´anchez de Mer´as, A. M. Size-intensive decomposition of orbital energy denominators. J. Chem. Phys. 2000, 113, 508. (78) Cacheiro, J. L.; Pedersen, T. B.; Fern´andez, B.; S´anchez de Mer´as, A.; Koch, H. The CCSD(T) model with Cholesky decomposition of orbital energy denominators. Int. J. Quantum Chem. 2011, 111, 349–355. (79) Alml¨of, J. Elimination of energy denominators in Møller–Plesset perturbation theory by a Laplace transform approach. Chem. Phys. Lett. 1991, 181, 319. (80) Constans, P.; Ayala, P. Y.; Scuseria, G. E. Scaling reduction of the perturbative triples correction (T) to coupled cluster theory via Laplace transform formalism. J. Chem. Phys. 2000, 113, 10451. (81) Constans, P.; Scuseria, G. E. The Laplace Transform Perturbative Triples Correction Ansatz. Coll. Czech. Chem. Comm. 2003, 68, 357–373. (82) Ayala, P. Y.; Scuseria, G. E. Linear scaling second-order Moller–Plesset theory in the atomic orbital basis for large molecular systems. J. Chem. Phys. 1999, 110, 3660. (83) Maslen, P. E.; Dutoi, A. D.; Lee, M. S.; Shao, Y.; Head-Gordon, M. A local correlation model that yields intrinsically smooth potential-energy surfaces. Mol. Phys. 2005, 103, 425. (84) Nakajima, T.; Hirao, K. An approximate second-order Møller–Plesset perturbation approach for large molecular calculations. Chem. Phys. Lett. 2006, 427, 225 – 229. (85) Scuseria, G. E.; Ayala, P. Y. Linear scaling coupled cluster and perturbation theories in the atomic orbital basis. J. Chem. Phys. 1999, 111, 8330. 68

ACS Paragon Plus Environment

Page 68 of 73

Page 69 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(86) Foster, J. M.; Boys, S. F. Canonical Configurational Interaction Procedure. Rev. Mod. Phys. 1960, 32, 300. (87) Pipek, J.; Mezey, P. A fast intrinsic localization procedure applicable for ab initio and semiempirical linear combination of atomic orbital wave functions. J. Chem. Phys. 1989, 90, 4916. (88) Knizia, G. Intrinsic Atomic Orbitals: An Unbiased Bridge between Quantum Theory and Chemical Concepts. J. Chem. Theory Comput. 2013, 9, 4834. (89) Boughton, J. W.; Pulay, P. Comparison of the Boys and Pipek–Mezey Localizations in the Local Correlation Approach and Automatic Virtual Basis Selection. J. Comput. Chem. 1993, 14, 736. (90) Werner, H.-J. Communication: Multipole approximations of distant pair energies in local correlation methods with pair natural orbitals. J. Chem. Phys. 2016, 145, 201101. (91) Hetzer, G.; Pulay, P.; Werner, H.-J. Multipole approximation of distant pair energies in local MP2 calculations. Chem. Phys. Lett. 1998, 290, 143. (92) Werner, H.-J.; Knizia, G.; Krause, C.; Schwilk, M.; Dornbach, M. Scalable Electron Correlation Methods I.: PNO-LMP2 with Linear Scaling in the Molecular Size and Near-Inverse-Linear Scaling in the Number of Processors. J. Chem. Theory Comput. 2015, 11, 484. ´ Mayer’s orthogonalization: relation to the (93) Nagy, P. R.; Surj´an, P. R.; Szabados, A. Gram–Schmidt and L¨owdin’s symmetrical scheme. Theor. Chem. Acc. 2012, 132, 1109. ´ Novel orthogonalization and (94) T´oth, Z.; Nagy, P. R.; Jeszenszki, P.; Szabados, A. biorthogonalization algorithms. Theor. Chem. Acc. 2015, 134, 100.

69

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(95) Samu, G.; K´allay, M. Efficient evaluation of three-center Coulomb intergrals. J. Chem. Phys. 2017, 146, 204101. (96) Sch¨ utz, M.; Yang, J.; Chan, G. K.-L.; Manby, F. R.; Werner, H.-J. The orbital-specific virtual local triples correction: OSV-L(T). J. Chem. Phys. 2013, 138, 054109. (97) Kats, D. Speeding up local correlation methods. J. Chem. Phys. 2014, 141, 244101. (98) Mester, D.; Nagy, P. R.; K´allay, M. Reduced-cost linear-response CC2 method based on natural orbitals and natural auxiliary functions. J. Chem. Phys. 2017, 146, 194102. (99) Mester, D.; Nagy, P. R.; K´allay, M. Reduced-cost second-order algebraic diagrammatic construction method for excitation energies and transition moments. J. Chem. Phys. 2018, 148, 094111. (100) Mrcc, a quantum chemical program suite written by M. K´allay, Z. Rolik, J. Csontos, P. Nagy, G. Samu, D. Mester, J. Cs´oka, B. Szab´o, I. Ladj´anszki, L. Szegedy, B. Lad´oczki, K. Petrov, M. Farkas, P. D. Mezei, and B. H´egely. See also Ref. 24 as well as http://www.mrcc.hu/. (101) K¨oppl, C.; Werner, H.-J. On the use of Abelian point group symmetry in density-fitted local MP2 using various types of virtual orbitals. J. Chem. Phys. 2015, 142, 164108. (102) Dunning Jr., T. H. Gaussian basis sets for use in correlated molecular calculations. I. The atoms boron through neon and hydrogen. J. Chem. Phys. 1989, 90, 1007. (103) Kendall, R. A.; Dunning Jr., T. H.; Harrison, R. J. Electron affinities of the first-row atoms revisited. Systematic basis sets and wave functions. J. Chem. Phys. 1992, 96, 6796. (104) Woon, D. E.; Dunning Jr., T. H. Gaussian basis sets for use in correlated molecular calculations. III. The atoms aluminum through argon. J. Chem. Phys. 1993, 98, 1358.

70

ACS Paragon Plus Environment

Page 70 of 73

Page 71 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(105) Peterson, K. A.; Adler, T. B.; Werner, H.-J. Systematically convergent basis sets for explicitly correlated wavefunctions: The atoms H, He, B-Ne, and Al-Ar. J. Chem. Phys. 2008, 128, 084102. (106) Weigend, F.; Ahlrichs, R. Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for H to Rn: Design and assessment of accuracy integrals over Gaussian functions. Phys. Chem. Chem. Phys. 2005, 7, 3297. (107) Weigend, F. Hartree–Fock Exchange Fitting Basis Sets for H to Rn. J. Comput. Chem. 2008, 29, 167. (108) Weigend, F.; H¨aser, M.; Patzelt, H.; Ahlrichs, R. RI-MP2: optimized auxiliary basis sets and demonstration of efficiency. Chem. Phys. Lett. 1998, 294, 143. (109) Weigend, F.; K¨ohn, A.; H¨attig, C. Efficient use of the correlation consistent basis sets in resolution of the identity MP2 calculations. J. Chem. Phys. 2002, 116, 3175. ˇ aˇc, J.; Riley, K. E.; Hobza, P. S66: A Well-balanced Database of Benchmark (110) Rez´ Interaction Energies Relevant to Biomolecular Structures. J. Chem. Theory Comput. 2011, 7, 2427. (111) Huenerbein, R.; Schirmer, B.; Moellmann, J.; Grimme, S. Effects of London dispersion on the isomerization reactions of large organic molecules: a density functional benchmark study. Phys. Chem. Chem. Phys. 2010, 12, 6940. (112) Liakos, D. G.; Neese, F. Improved Correlation Energy Extrapolation Schemes Based on Local Pair Natural Orbital Methods. J. Phys. Chem. A 2012, 116, 4801–4816. (113) Fogueri, U. R.; Kozuch, S.; Karton, A.; Martin, J. M. L. The Melatonin Conformer Space: Benchmark and Assessment of Wave Function and DFT Methods for a Paradigmatic Biological and Pharmacological Molecule. J. Phys. Chem. A 2013, 117, 2269.

71

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(114) Chaudhuri, R. K.; Freed, K. F.; Chattopadhyay, S.; Mahapatra, U. S. Application of an efficient multireference approach to free-base porphin and metalloporphyrins: Ground, excited, and positive ion states. J. Chem. Phys. 2011, 135, 084118. (115) Yoo, S.; Apr`a, E.; Zeng, X. C.; Xantheas, S. S. High-Level Ab Initio Electronic Structure Calculations of Water Clusters (H2 O)16 and (H2 O)17 : A New Global Minimum for (H2 O)16 . J. Phys. Chem. Lett. 2010, 1, 3122–3127. (116) Wann, D. A.; Less, R. J.; Rataboul, F.; McCaffrey, P. D.; Reilly, A. M.; Robertson, H. E.; Lickiss, P. D.; Rankin, D. W. H. Accurate Gas-Phase Experimental Structures of Octasilsesquioxanes (Si8 O12 X8 ; X = H, Me). Organometallics 2008, 27, 4183– 4187. (117) Liakos, D. G.; Sparta, M.; Kesharwani, M. K.; Martin, J. M. L.; Neese, F. Exploring the Accuracy Limits of Local Pair Natural Orbital Coupled-Cluster Theory. J. Chem. Theory Comput. 2015, 11, 1525–1539.

72

ACS Paragon Plus Environment

Page 72 of 73

Page 73 of 73 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Figure 7: For Table of Contents Only

73

ACS Paragon Plus Environment