Efficient Semi-Numerical Implementation of Relativistic Exact

Aug 12, 2019 - Further, σ is the vector of the Pauli matrices, whose components are given by (5) .... and efficiency of our new method for several mo...
0 downloads 0 Views 1MB Size
Article pubs.acs.org/JCTC

Cite This: J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Efficient Semi-Numerical Implementation of Relativistic Exact Exchange within the Infinite-Order Two-Component Method Using a Modified Chain-of-Spheres Method Toni M. Maier,† Yasuhiro Ikabata,‡ and Hiromi Nakai*,†,‡,§

Downloaded via UNIV OF NEW SOUTH WALES on August 26, 2019 at 05:09:05 (UTC). See https://pubs.acs.org/sharingguidelines for options on how to legitimately share published articles.



Department of Chemistry and Biochemistry, School of Advanced Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan ‡ Waseda Research Institute for Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan § Elements Strategy Initiative for Catalysts and Batteries (ESICB), Kyoto University, Katsura, Kyoto 615-8520, Japan S Supporting Information *

ABSTRACT: We present an efficient implementation of relativistic exact exchange within the infinite-order two-component method (IOTC) by employing a state-of-the-art seminumerical integration technique. For accurate consideration of the picture change, inherent to two-component methods, we propose a new scheme based on a relativistic or picture-change transformation of the density matrix, which provides a simple and efficient formulation of relativistically transformed quantities such as the electron density or exact exchange and thus avoids expensive integral transformations. We show that the new scheme does not introduce additional numerical or theoretical errors beyond the approximations of the IOTC method. For the efficient implementation of exactexchange integrals, we build upon a modified version of the chain-of-spheres exact-exchange (COSX) method. In addition to the conventional overlap and density matrix screening by S- and P-junctions, respectively, we introduce a new simple screening technique in the sense of the original COSX method by additionally considering the asymptotic decay of the integrals over the Coulomb operator within the new F-junctions. Together with the picture-change transformation of the density matrix, this modified COSX method is shown to provide superior efficiency for the calculation of relativistic exact exchange compared to a conventional analytical direct self-consistent-field implementation of exact exchange.

1. INTRODUCTION

g Cij =

1−9

Relativistic quantum chemistry represents the backbone for ab initio calculations on molecular systems containing heavy elements, which play a major role in many fields of chemical research, ranging from homogeneous10,11 and heterogeneous catalysis12−14 to bioinorganic chemistry,15 electrochemistry,16 and material science.17 The most basic and commonly used method of considering special relativity18 in many-electron systems is the use of the Dirac−Coulomb Hamiltonian DC

H

=

∑ i

hiD

1 + 2



g Cij

i≠j

Vi = −∑ A

ZA ·12 |ri − RA|

(4)

Further, σ is the vector of the Pauli matrices, whose components are given by i 0 1 yz i 0 −i yz i1 0 yz zz, σy = jjj zz, σz = jjj zz σx = jjj k1 0 { ki 0 { k 0 −1{

(1)

(5)

Note that boldface, unless otherwise specified, has been used to indicate nonscalar quantities such as vectors and matrices (hence, also formally nonscalar operators as in eq 3). Special relativistic effects on electron−electron interactions, such as retardation effects, do not play a major role for most

(2) Received: March 7, 2019 Published: August 12, 2019

with the Coulomb Hamiltonian for two-electron interactions © XXXX American Chemical Society

(3)

Here, pi = −i∇i denotes the linear momentum operator, c the speed of light, and Vi the electron−nucleus interaction, which is given as

which combines the relativistic one-electron Dirac Hamiltonian19 (here given in its shifted form without external fields; atomic units are used throughout the manuscript) cσ ·pi zy jij Vi zz zz hiD = jjjj j cσ ·p Vi − 2c 2·12 zz i k {

1 ji12 ·12 02 yzz zz ·jjj |ri − rj| jk 02 12 ·12 z{

A

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Journal of Chemical Theory and Computation

jij h+̃ , i 02 zyz D zz h̃ i = U†i hiDUi = jjjj zz jj 0 zz ̃ h 2 − , i k {

chemical questions,20 but they significantly increase the computational effort. Hence, they are commonly neglected,21 but they can be considered by using more sophisticated operators derived from quantum electrodynamics, for example, the Breit Hamiltonian for a perturbational treatment of twoelectron relativistic effects.22 In contrast to the nonrelativistic Schrödinger equation,23 these relativistic Hamiltonians describe electrons and positrons on an equal footing, including their respective spin states, thus requiring a wave function with four individual components, the large and the small component with two spin components each. By connecting small- and large-component basis sets by the kinetic balance condition (in most cases restricted kinetic balance), the variational collapse observed for individual basis sets can be avoided.24−26 However, it was recognized early27−29 that such an explicit four-component description is not required for an accurate treatment of most environmental chemical processes, for which electron−positron pair formation can be safely neglected (the no-pair approximation). Hence, it is generally sufficient to consider only a quasi-relativistic two-component electronic wave function, in which negative energy solutions are projected. In essence, this requires a decoupling of the positive and negative states of the four-component wave function. During the last decades, a large variety of different methods have been developed for this purpose, which can be roughly categorized into approximate and exact schemes. Representatives of the former are, for example, the Fouldy−Wouthuysen (FW)27 and finite-order Douglas−Kroll−Hess (DKH) methods,29−31 as well as the regular approximation32 (typically used at the zeroth,33 first, or infinite order34) and the relativistic elimination of the small component (RESC).35 Also, a similarly large number of exact decoupling schemes have been proposed, such as the infinite-order two-component (IOTC) method by Barysz and co-workers,36,37 which is sometimes also called the Barysz−Sadlej−Snijders or infiniteorder DKH method (not to be confused with the converged finite-order DKH scheme31), the exact quasi-relativistic method developed by Liu and co-workers,38,39 the normalized elimination of the small component (NESC) as proposed by Dyall,4,40 and the exact two-component (X2C) method.41,42 The basic concept of all of these exact methods boils down to two critical features that are necessary for an exact decoupling of the four-component wave function, i.e., an appropriate normalization and consideration of the kinetic balance condition.42 The main differences between these methods lie in the complexity of the employed equations and the way they are solved, which mainly has some implications on the efficiency43 but generally does not diminish their applicability.42 For example, the original IOTC method involves a two-step procedure, i.e., a FW transformation followed by an exact decoupling step, with an iterative solution of the underlying equations, whereas the equations of the sole exact decoupling step in the X2C method are solved in a one-step procedure.41 Although an exact decoupling considering oneand two-electron contributions for the calculation of the transformation matrix is possible, it was found that good agreement with four-component calculations employing the Dirac−Coulomb Hamiltonian can be obtained in most cases by considering only the one-electron terms for the calculation of the transformation matrix,42,44 i.e.,

Article

(6)

where Ui denotes the one-electron unitary transformation of the ith electron with a dimension of 4 × 4. In either case, the theoretical problem of exact decoupling of the four-component Hamiltonian appears to be largely solved, so approximate decoupling schemes seem to be obsolete in large parts, when replaced with exact decoupling methods. An inherent issue affecting all two-component methods is the so-called picture change, which arises due to the decoupling of the four-component Hamiltonian.42,45−49 That is, all quantum-mechanical operators need to be subject to the same transformation as the Hamilton operator to avoid picture-change errors (PCEs). This obviously applies to property operators probing core regions but also to the twoelectron operators in the Hamiltonian itself.50 In fact, the transformation of the Coulomb Hamiltonian is mostly omitted in two-component calculations for the sake of efficiency,43 which can be either associated with the neglect of indirect twoelectron relativistic effects or interpreted as PCEs in the twoelectron energy contributions. Accordingly, the terms picturechange transformation (PCT) and relativistic transformation are largely interchangeable in this context, but we prefer the former. As other authors have stated,51 the occasionally used term picture-change ef fect, in contrast, is somehow misleading, since neglecting the picture change simply leads to an incorrect relativistic treatment and does not describe an additional physical effect.42 We further note that the PCE is specifically related to electronic operators, since pure magnetic operators contain only off-diagonal elements (in the 2 × 2 matrix formulation used, for example, in eq 2), so the corresponding expectation values cannot be computed without applying a PCT.47 Overall, there are two different strategies to account for the picture change. In the first one, the PCT is applied directly to the quantum mechanical operator in question. Besides leading to relatively complicated expressions,49,50 picture-change artifacts (PCAs) may be a significant source of error.47 Similarly, the calculation of magnetic properties requiring an additional gauge-independent atomic orbital transformation may pose several difficulties.52 As an alternative, Malkin et al.51 proposed a back-transformation of the two-component wave function to the four-component picture. Although formally straying from a pure two-component framework, this approach fully eliminates complicated operator transformations as well as PCAs, thus rendering the back-transformation a “simple” and “smart procedure”47 to account for the picture change. The same approach has also been investigated successfully by Barysz et al. in terms of the IOTC method.53 However, so far, only the addition of a few core basis functions and excessively large basis sets have been investigated by the mentioned authors, respectively. Errors related to the neglect of the PCT for two-electron integrals have been investigated by Seino and Hada50 at the Hartree−Fock (HF) level. Significant errors have been found, especially for heavy elements, suggesting that the PCT of twoelectron integrals is essential in these cases. Corresponding errors in the framework of Kohn−Sham density functional theory (DFT) have been analyzed recently by Oyama et al.54 by employing a PCT of the Dirac delta operator used for the calculation of the electron density.34,49,55 Most notably, PCEs B

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

2. PICTURE-CHANGE TRANSFORMATION OF THE DENSITY MATRIX 2.1. General Formalism. First, we introduce the general formalism for the derivation of the picture-change-transformed density matrix. As a starting point, we consider a general property operator Ô and a unitary transformation operator Û that diagonalizes the Hamiltonian within the four-component space, which are given by

were larger when neglecting only the PCT of the exchangecorrelation (XC) functional, but applying it for the Coulomb interaction than when the PCT was completely neglected for two-electron contributions. This shows that only a PCT of all two-electron contributions guarantees more accurate results. An alternative approach to correct electron densities based on atomic four-component densities has been investigated by several authors.38,56,57 Also, effective one-electron operators for the inclusion of two-electron spin−orbit effects have been developed.58 However, the PCTs of the two-electron integrals and XC contributions make the corresponding calculations as expensive as or even more expensive than four-component calculations while providing no advantage in accuracy.44,59 The latter issue has been addressed successfully by Seino and Nakai60 by employing the local unitary transformation (LUT) method,61 which represents one of several approaches to exploit the locality of relativistic effects.59,62,63 Nonetheless, corresponding calculations can be expected to be less efficient than four-component calculations employing state-of-the-art screening schemes, such as the chain-of-sphere exact-exchange (COSX) method64−66 by Neese et al., the resolution-of-theidentity (RI) for Coulomb integrals, 67−74 and many others,75−92 which have experienced major successful developments during the last decades. In fact, the combination of these efficient integration methods with a PCT of the Coulomb operator is not straightforward, since many of them are designed exclusively for a direct-SCF procedure,93,94 whereas the expensive PCT of two-electron integrals is best performed in a conventional disc-SCF or in-core-SCF93 manner (for a more detailed discussion, see Section 4). In this work, we address the problem of the expensive direct PCT of four-center integrals by adopting the back-transformation concept of Malkin et al.51 In particular, we develop an exact density matrix formulation of the back-transformation approach based on a formal RI ansatz for the transformed integrals (Section 2). This naturally combines the advantages of working within a two-component framework and the benefits of the direct-SCF scheme, such as the applicability of modern integral acceleration techniques based on density matrix screening and easy scalability. Although the actual implementation is based on the IOTC method, the derived formulations can be similarly applied to the X2C method. We will exploit the advantages of the new PCT scheme within the first semi-numerical implementation of relativistic exact exchange in a two-component quantum-chemical program (see ref 95 for a four-component implementation of the pseudospectral method). Note that after a short introduction on semi-numerical integration techniques of exact exchange in Section 3.1, we present a modified variant of the COSX method (mCOSX) in Section 3.2, which combines the conventional overlap and density matrix prescreenings of the COSX method64 with a new, simple integral screening based on ideas adopted from the preLinX method87 and thus enables an accelerated exact-exchange evaluation in both nonrelativistic and relativistic calculations but especially the latter. After discussing the details of our new implementation and providing computational details in Sections 4 and 5, respectively, we highlight the accuracy and efficiency of our new method for several molecular test cases (Section 6).

ijÔ Ô yz ij Û Û yz mz j el zz, Û = jjj 11 12 zzz Ô = jjj z jj z jj ̂ j Û 21 Û 22 zz ̂ zz (7) k { kOm Oel { respectively. Here, Ô el and Ô m denote the electric and magnetic Hermitian property operators, respectively. In contrast, the unitary transformation operator consists of four individual components. The expectation value of Ô in terms of the original and picture-change-transformed four-component wave functions, |Ψ⟩ and |Ψ̃⟩ = Û †|Ψ⟩, respectively, is then given by † ̂ ̂ |Ψ̃⟩ ⟨O⟩ = ⟨Ψ|Ô |Ψ⟩ = ⟨Ψ̃|Û OU

(8)

By considering only the electronic two-component wave function |Ψ̃+⟩, i.e., the electronic part of |Ψ̃⟩, the electronic expectation value may be expressed as † † ⟨O⟩+ = ⟨Ψ+̃ |Û 11Ôel Û 11 + Û 21Ôel Û 21|Ψ+̃ ⟩ † † + ⟨Ψ+̃ |Û 11Ô m Û 21 + Û 21Ô m Û 11|Ψ+̃ ⟩

(9)

where the first and second lines on the right-hand side represent the electric and magnetic parts of the expectation value, respectively. In terms of the two-component density matrix

ij Dαα ̃ ̃ yz ijC̃ α yz Dαβ j zz jj zz ̃ † ̃ † zz = j z( C C ) D̃ = jjjj j D̃βα D̃ββ zz jjjC̃ β zzz α β (10) k { k { with C̃ being the occupied molecular orbital coefficients and α and β denoting the two spin indices and the corresponding scalar basis set {χ̃}, one obtains the density matrix formulation of the electronic expectation value as ⟨O⟩+ =









∑ tr[D̃ νμ⟨χμ̃ |Û 11Ôel Û11 + Û 21Ôel Û 21|χν̃ ⟩] μν

+

∑ tr[D̃ νμ⟨χμ̃ |Û 11Ô mÛ 21 + Û 21Ô mÛ11|χν̃ ⟩] μν

(11)

where tr denotes the trace in the 2 × 2 spin space. For clarity in the following formulas and in accordance with the numerical determination method of the unitary transformation matrix by Heß,29 the scalar basis set {χ̃} is assumed to be fully uncontracted, i.e., to consist only of primitive basis functions. The corresponding formulations for contracted basis sets require only a simple additional transformation step. Note also that similar equations can be formulated for the positronic twocomponent wave function. In the next step, which is the essential one for the definition of the picture-change-transformed density matrix, we insert an RI between Ô and Û by introducing the scalar nonorthogonal auxiliary basis {χ} with its related overlap integrals

Sμν = ⟨χμ |χν ⟩ C

(12) DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation After rearranging the resulting formulas, we obtain the final expression ⟨O⟩+ =

1

Π̂ = K ·[αb·12 − p−1 ·Y]·(12 + Y†Y)− /2

The definitions of K, α, b, p, and Y are identical to those commonly used in the literature.49,50,54 As in the original IOTC method,37 Y is determined in an iterative procedure. Hence, in addition to determining the matrix elements of the scalar operators M̂ and Π̂ within the uncontracted AO basis, which is already required for the transformation of the oneelectron Hamiltonian, only the calculation of simple overlap and momentum integrals is required to evaluate the matrix elements of Û . By including the inverse of the auxiliary basis overlap matrix from eqs 14 and 15 into eq 16, one can define the transformation matrix

m ⟨χμ |Ô m |χν ⟩] ∑ tr[Delνμ⟨χμ |Ôel |χν ⟩ + Dνμ

(13)

μν

where Delμν =



∑ S−μκ1⟨χκ |Û11|χλ̃ ⟩D̃ λη⟨χη̃ |Û 11|χϑ ⟩S−ϑν1 κλη ϑ

+



∑ S−μκ1⟨χκ |Û 21|χλ̃ ⟩D̃ λη⟨χη̃ |Û 21|χϑ ⟩S−ϑν1 (14)

κλη ϑ

and Dmμν =



∑ S−μκ1⟨χκ |Û11|χλ̃ ⟩D̃ λη⟨χη̃ |Û 21|χϑ ⟩S−ϑν1

[T†0]μν =

∑ S−μκ1⟨χκ |χλ̃ ⟩S̃−λη1⟨χη̃ |M̂ |χν̃ ⟩

κλη ϑ

+

(21)

κλη †

Due to the scalar product of σ with ∇, eq 17 consists of three terms, so inclusion of the inverse overlap matrix equivalent to eq 21 makes it possible to define a transformation matrix for each spatial component, which is given by

∑ S−μκ1⟨χκ |Û 21|χλ̃ ⟩D̃ λη⟨χη̃ |Û 11|χϑ ⟩S−ϑν1 (15)

κλη ϑ

denote the electric and magnetic picture-change-transformed 2 × 2 density matrices, respectively. Note that the order of the transformation operators in eqs 14 and 15 differs from that in eq 11, which is allowed only due to the Hermiticity of the operators Ô el and Ô m. In fact, eq 13 resembles the related density matrix formulation in the non-picture-change-transformed case, except for the use of an auxiliary basis instead of the atomic orbital (AO) basis and the picture-changetransformed density matrices Del and Dm instead of the corresponding density matrices within the AO basis. That is, calculation of the expectation value ⟨O⟩+ within this formalism requires only the introduction of an auxiliary basis and calculation of the picture-change-transformed density matrices, while the remaining routines are equivalent to those in the non-picture-change-transformed case. An implementation into existing quantum-chemical programs is thus significantly easier than a full four-component implementation. For the actual calculation of the picture-change-transformed density matrices, the matrix elements of the transformation operator Û in eqs 14 and 15 need to be determined. In this work, we explicitly apply the transformation scheme used in the IOTC method. However, similar formulas can be easily derived for other exact decoupling schemes as well, such as X2C or NESC. In either case, the essential step is the separation of the scalar parts of the transformation operators (scalar with respect to real space), whose matrix elements can be easily determined following the approach by Heß29, from nonscalar operators using an RI within the uncontracted AO basis. For the IOTC method, this leads to the following expressions ⟨χμ |Û 11|χν̃ ⟩ =

(20)

∑ ⟨χμ |χκ̃ ⟩S̃−κλ1⟨χλ̃ |M̂ |χν̃ ⟩ κλ

⟨χμ |Û 21|χν̃ ⟩ = −iσ ·∑ ⟨χμ |∇T |χκ̃ ⟩S̃−κλ1⟨χλ̃ |Π̂ |χν̃ ⟩ κλ

[T†j ]μν = σj·∑ S−μκ1⟨χκ |∇j χλ̃ ⟩S̃−λη1⟨χη̃ |Π̂ |χν̃ ⟩

with j ∈ {x, y, z}. In terms of these transformation matrices, the picture-change-transformed density matrices can be easily expressed as ̃ 0 + [T†x + T†y + T†z]D̃ [Tx + Ty + Tz] Del = T†0DT

(24)

Consequently, the calculation of the picture-change-transformed density matrices from the precalculated transformation matrices (eqs 21 and 22) requires only simple matrix algebra. An even more compact formulation can be achieved by defining the supervector ij T0 yz jj zz jj zz jj Tx zz  = jjjj zzzz jj Ty zz jj zz jj zz j Tz z k {

(25)

Equations 23 and 24 can then be reformulated in terms of a supermatrix formulation, which in the general case is given by Dμν =

∑ D̃ κλ†μκλν (26)

κλ

 represents a coupling matrix between the components of the supervector  . In the general case, the electric and magnetic coupling matrices are defined as

(16)

jij1 jj j0 el = jjjj jj 0 jj j0 k

(17)

(18)

and the scalar operators being defined as

0 1 1 1

0 1 1 1

0 zy jij 0 zz jj j− 1 1 zzz m zz,  = i·jjj z z jjj−1 1 zz jj zz j− 1 1{ k

1 0 0 0

1 0 0 0

1 zy zz 0 zzz zz 0 zzzz z 0 z{

(27)

respectively, while for scalar-relativistic calculations, the electric coupling matrix reduces to

1

M̂ = K ·[12 + αbp ·Y]·(12 + Y†Y)− /2

(23)

̃ 0 Dm = i·T†0D̃ [Tx + Ty + Tz] − i·[T†x + T†y + T†z]DT

with the overlap integrals in the uncontracted AO basis

̃ = ⟨χ ̃ |χ ̃ ⟩ Sμν μ ν

(22)

κλη

(19) D

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Journal of Chemical Theory and Computation

 elscal

jij1 jj j0 = jjjj jj 0 jj j0 k

0 1 0 0

0 0 1 0

0 zy zz 0 zzz zz 0 zzzz z 1 z{

Exex

∑ tr[Delνμ]·χμ (r)χν (r) (29)

μν

Taking the operators Ô el = 0 and Ô m = cσ·δ(r − r′) yields the picture-change-transformed current density96

∂J = ∂D̃ ϑη

m j(r) = c·∑ tr[σ Dνμ ]·χμ (r)χν (r)

(30)

μν

1 2

∑ tr[Delνμ]·tr[Delλκ ]gμνκλ

∫∫

χμ (r)χν (r)χκ (r′)χλ (r′) |r − r′|

dr′dr

μνκλ

(34)

(35)

respectively, with Tr denoting the trace operation in the supermatrix space only. Calculation of eq 34 can be realized by first calculating the electric picture-change-transformed density matrix Del and then determining its contraction with the fourcenter integrals within the auxiliary basis set, eq 32. In fact, the latter step is equivalent to that in the non-picture-changetransformed case but gives only the Coulomb matrix within the auxiliary basis set. Hence, one needs to back-transform these integrals into the AO basis by applying the remaining transformation with the supervectors  (including the coupling matrix  ). A similar technique can be employed to calculate the picture-change-transformed exact-exchange matrix in eq 35. However, due to the existence of the outer product instead of the inner product of the transformation supervectors within the four-dimensional supermatrix space, resulting from the nonlocality of exact exchange, the number of picture-changetransformed density matrices that need to be considered increases to 16, i.e., one density matrix for each possible combination of the elements of  . In particular, one first has to determine the 16 picture-change-transformed density matrices, which are then individually contracted with the four-center integrals within the auxiliary basis set. The resulting matrices are a representation of the picture-change-transformed exactexchange matrix within the auxiliary basis set and can be used directly to calculate the picture-change-transformed exactexchange energy in eq 33, which requires only another contraction step with the 16 picture-change-transformed density matrices. To calculate the exact-exchange part of the Fock matrix, one has to apply the remaining transformation with the supervectors  , which includes the coupling matrix  as well as the trace operation. The required steps can thus be

with the four-center integrals over the Coulomb operator within the auxiliary basis set being defined as gμνκλ =

∑ †μηϑν·tr[Delλκ ]·gμνκλ

ÄÅ ÉÑ ÅÅ ÑÑ ∂Exex ÅÅ Ñ † † = − ∑ TrÅÅÅϑλ μη ∑ Dςι̃ ςνκιÑÑÑÑgμνκλ Å ÑÑ ∂D̃ ϑη ÅÅÇ μνκλ ις ÑÖ

(31)

μνκλ

(33)

and

Therefore, the determination of picture-change-transformed ingredients in DFT requires only the calculation of the respective picture-change-transformed density matrices, their multiplication with the basis function vectors in the auxiliary basis set, and taking the trace within the spin space. The latter procedure is essentially equivalent to that in the non-picturechange-transformed case and thus does not require major changes in existing program codes. Furthermore, the expensive individual PCTs of different DFT quantities for each point of the numerical integration grid, employed in previous work,54 can be avoided. 2.2. Two-Electron Integrals. While the picture-changetransformed density matrices were derived only for oneelectron operators so far, a similar formalism can also be applied to the calculation of the picture-change-transformed (or relativistically transformed) two-electron integrals. This is obvious in the case of the Coulomb energy, which can be expressed in terms of the electron density (eq 29) and is thus given by J=

ÄÅ ÉÑ ÅÅ ÑÑ ÅÅ Ñ † † ̃ ̃ Å   ·   Tr D D ∑ ÅÅÅ ∑ ϑη ϑλ μη ςι ςν κιÑÑÑÑÑgμνκλ ÅÅÇ η ϑις ÑÑÖ μνκλ

where Tr denotes the matrix trace within the supermatrix and the 2 × 2 spin space. Note that the order of the transformation matrix supervectors and their adjoints is reversed compared to that in eq 26, resulting in a 4 × 4 supermatrix (outer product) instead of a scalar (inner product) within the supermatrix space. Application of the trace operation finally leads to a scalar in the supermatrix space. While the corresponding formulas for exact exchange using a conventional matrix formulation, as in eqs 23 and 24, would lead to relatively lengthy expressions, especially in the nonscalar-relativistic case, i.e., using the electric coupling matrix (eq 27), the supermatrix formulation provides a more compact representation. Both the Coulomb and exact-exchange energies thus depend explicitly on the two-component density matrix D̃ . Hence, calculation of respective Fock matrix contributions, i.e., taking the first derivative with respect to the two-component density matrix, causes no further complications. Elements of the Coulomb and exact-exchange parts of the Fock matrix are thus given by

(28)

Although the introduction of the supermatrix formalism is less important in the case of one-electron operators discussed so far, it will significantly simplify the corresponding formulas for the picture-change-transformed two-electron integrals discussed in Section 2.2. Finally, we show two explicit examples of application of the picture-change-transformed density matrices for calculating the expectation values of one-electron operators. For the calculation of the picture-change-transformed electron density ρ(r), which is the basic ingredient of DFT, Ô el = δ(r − r′), and Ô m = 0. This gives ρ(r) =

1 =− 2

Article

(32)

For exact exchange, however, the situation is more difficult due to its nonlocality. By applying the same formalism as for the one-electron operators, i.e., insertion of the RI and rearrangement of the integrals over the unitary transformation matrix to define a picture-change-transformed density matrix, one obtains the following expression within the supermatrix formulation E

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

takes the kinetic balance condition into account, but as a result, the calculation cost also becomes largely equivalent to that of four-component calculations. In this context, it should be noted that singularity issues which might occur when inverting the overlap matrix of a general RI basis due to similar exponents can be fully avoided using the exact auxiliary basis set by either restricting the overlap matrix to one-center contributions with same exponents and angular momentum quantum numbers l, l ± 1 (as done in this work) or by directly calculating the respective transformation matrices. In this sense, the exact PCT of the density matrix can be regarded in two different ways. From a two-component point of view, it formally represents a back-transformation of the two-component density matrix into four-component space. Since the density matrix is just another representation of the wave function, the present scheme is formally equivalent to the approach of Malkin et al.51 and the inverse IOTC method,53 which, however, apply the back-transformation directly to the wave function. Furthermore, introducing the auxiliary basis provides an analytically exact back-transformation without extending the AO basis, as done in the previous studies, if the exact transformation matrix is known. The developed matrix formulation can also be implemented easily into existing twocomponent quantum-chemical programs, effectively enabling calculations on the four-component level. From a fourcomponent point of view, the PCT of the density matrix can be regarded as a scheme for projecting negative energy states on the level of the density matrix. In principle, the introduction of the auxiliary basis set, i.e., using a common basis set for small- and large-component-type basis functions, leads to a significantly increased number of integrals over primitive basis functions, with a significant percentage being not needed for the calculation of molecular integrals. Since the density matrix elements related to these unneeded primitive integrals are, however, exactly or at least close to 0, these integrals can be easily screened by a density matrix screening scheme and are thus never evaluated in actual calculations. That is, the larger size of the auxiliary basis set compared to the AO basis is compensated by the sparseness of the corresponding density matrix. This is true especially for the large exact auxiliary basis. Hence, an efficient density matrix screening is essential within the picture-change-transformed density matrix scheme. The implicit expression of the kinetic balance in terms of an RI also suggests the possibility to employ approximate auxiliary basis sets. In contrast to four-component calculations without or just with an approximate kinetically balanced small component basis, errors due to an approximate auxiliary basis would lead only to an incomplete correction of the PCE and not to a potential breakdown of the SCF procedure. The use of optimized approximate auxiliary basis sets could generally help to reduce the calculation costs of relativistic calculations. The requirement for the exact construction of the auxiliary basis, i.e., adding basis functions with the angular momentum quantum numbers being shifted by ±1, suggests the development of specialized AO basis sets that already contain those additional basis functions needed in the exact auxiliary basis. An example of such a basis set is the universal Gaussian basis set (UGBS) developed by de Castro and Jorge.97 Although it is relatively large and not optimized for efficiency, the difference in the number of basis functions between the auxiliary and AO basis sets is significantly lower than that for many optimized basis sets. Furthermore, one can

summarized as follows. First, perform the transformation of the density matrix. Next, determine the integrals and contractions within the auxiliary basis. Finally, apply a back-transformation into the AO basis. Except for the two transformation steps and the treatment of 16 density matrices for exact exchange, this procedure is identical to that in the non-picture-changetransformed case and thus requires only minor modifications in existing quantum-chemical programs featuring a relativistic two-component treatment of one-electron terms. Although we perform only self-consistent energy calculations employing the Dirac−Coulomb Hamiltonian in this work, some further general aspects of the PCT of the density matrix should be noted. First, the presented formalism for the Coulomb and exact-exchange integrals can be similarly applied to the Gaunt operator (one term in the Breit Hamiltonian). In fact, formulas for the corresponding terms arising from the Gaunt operator within the supermatrix formalism will be similar to eqs 31 and 32, except that the magnetic instead of the electric picture-change-transformed density matrix is used, and one has to consider the three components of the σ operator (as in the calculation of the current density in eq 30). Since the same four-center integrals as those for the Coulomb and exact-exchange integrals can be used, there will be an additional cost only for the contraction steps due to the 17 additional density matrices but not for the electron repulsion integral evaluation itself. Second, the picture-change-transformed density matrices, in contrast to the AO density matrix, depend explicitly on the atomic coordinates through the transformation matrices. Accordingly, calculation of the gradients with respect to the atomic positions, which is needed for structure optimizations, requires several additional terms, in particular, the derivatives of the transformation matrices. However, these terms would also be necessary for a conventional PCT of the integrals. 2.3. Choice of Auxiliary Basis Set. For the definition of the picture-change-transformed density matrices in the previous two sections, we made extensive use of the RI technique. While its use for the evaluation of the matrix elements of the scalar operators in eqs 16 and 17 in the eigenvector space of the momentum operator, which was introduced by Heß,29 represents an established scheme and thus requires no further discussion in the current context, the choice of the auxiliary basis set {χ} has not been explained yet. In particular, {χ} is involved in the calculation of the overlap and momentum integrals in eqs 21 and 22, respectively. In terms of the RI method, these integrals can be interpreted as a representability problem. Accordingly, the auxiliary basis set needs to fully represent the AO basis set (eq 21) as well as its first derivatives (eq 22). Hence, an exact representation is achieved when the auxiliary basis includes all original AO basis functions as well as additional basis functions with the same exponents as the AO basis functions but with the angular momentum quantum number decreased or increased by 1 (both for Gaussian basis functions but only the down-shift for Slater basis functions). In fact, using the exact auxiliary basis set and thus the exact PCT is equivalent to satisfying the kinetic balance condition in four-component calculations exactly. This also includes the larger size of the auxiliary basis, which for a general Gaussian AO basis set is three to four times larger than the original AO basis. The auxiliary basis set thus treats small- and large-component-type basis functions of the four-component scheme together within one common basis set. Hence, the exact PCT of the density matrix implicitly F

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

size N, which is commonly used in the literature as a common measure to compare the scaling behavior of different methods.87 The latter obviously renders semi-numerical integration an excellent choice for relativistic calculations using kinetically balanced basis sets, which is equivalent to the exact auxiliary basis in this work (see Section 2.3). Nonetheless, besides a nonrelativistic two-component implementation by Plessow and Weigend,78 semi-numerical integration techniques have not been applied within a relativistic framework to date. This advantage generally applies equally to the exact-exchange and Coulomb integrals. However, while semi-numerical exact exchange determined by eq 38 has been found to provide sufficiently accurate results even for relatively small standard grids used for conventional DFT calculations,64,78,104,105 numerical instabilities have been observed in the case of Coulomb integrals when the simple semi-numerical scheme (eq 36) is applied together with standard grids.106 In the original PS method, these errors are corrected by employing an overlap-based projection between electronic and real space and introducing grid- and basis-set-specific dealiasing functions.80 However, this type of specialization, although it enables the use of small grids while retaining accurate results, increases the methodological complexity and reduces the universality compared to those of a simple seminumerical treatment. In fact, this might be one reason for the low distribution of the PS method compared to the prevalent RI and density-fitting methods,67−74 which also provide cubic scaling for the Coulomb integral evaluation. The simple seminumerical scheme would require large numerical grids, which are inappropriate for efficient integral evaluation, to avoid numerical instabilities. Semi-numerical integration techniques for determining exact exchange, on the other hand, do not suffer from these issues and have thus gained larger popularity during the past decade. In particular, the reduction in formal scaling compared to conventional RI schemes,75,76 which only reduce the prefactor compared to an analytical integral evaluation but retain the formal N4 scaling, has been pointed out as one of the main appealing points. Furthermore, semi-numerical integration naturally provides a simple expression for the exact-exchange energy density, which is the main ingredient in many state-ofthe-art hyper-generalized-gradient approximation (GGA) XC functionals, such as local hybrids107,108 or the B05 functional by Becke.109 Together with its improved efficiency, especially in combination with efficient integration techniques for Coulomb integrals such as the RI-J method,65,104 seminumerical exact exchange has thus played a key role in the development of these functionals during the last years. The first asymptotically linear scaling implementation of semi-numerical exact exchange was presented by Neese et al. in their COSX method64,66 by introducing additional prescreenings for the orbital overlap and density matrix, denoted as Sand P-junctions, respectively. S-junctions are determined by first surrounding each shell of the contracted basis functions with a sphere outside of which all absolute basis function values of the shell are less than a certain threshold. The A matrix elements of shell pairs with nonoverlapping spheres are expected to yield negligible matrix elements and are thus omitted. S-junctions need to be determined only once per calculation. P-junctions exploit the locality of the basis function vector and the band-diagonal nature of the density matrix. In particular, for a certain grid point, X contains only a constant number of nonzero values, which, in combination with the

think about developing schemes that automatically construct an adequate approximate RI basis based on a given basis set, for example, by reusing the original AO basis functions within the small-component part of the RI basis and just adding a few additional basis functions. While the size of the auxiliary basis set could be reduced significantly in this way, singularity issues of the overlap matrix due to similar basis function exponents need to be taken into account properly to avoid numerical instabilities. Hence, there is some potential for future optimization of the basis sets employed within the picturechange-transformed density matrix scheme. In this work, although not being the optimal choice, we stay with the exact auxiliary basis set to facilitate comparison with reference calculations.

3. SEMI-NUMERICAL EXACT EXCHANGE 3.1. General. Traditionally, molecular four-center integrals over the Coulomb operator (eq 32) are determined using analytical integration techniques such as the Obara−Saika, McMurchie−Davidson,98 and other recursion schemes,99,100 or Rys quadrature.101−103 In an alternative approach introduced by Friesner in his pseudospectral (PS) method,79−82 integration over one electronic coordinate in eq 32 is replaced by an integration in real space, i.e., a numerical integration on a molecular grid, thus giving n gμνκλ ≈ ∑ wi ·χμ (ri)A κλ (ri)χν (ri) (36)

i

with ri and wi being the ith grid point and its weight, respectively, n the number of grid points, and A the analytical two-center integral over the Coulomb operator, which is given by A κλ (ri) =



χκ (r)χλ (r) |ri − r|

dr

(37)

Since it uses both analytical and numerical integration techniques, such an approach is commonly denoted as seminumerical integration. For exact exchange, this leads to the (nonrelativistic) energy expression 1 n Exex ≈ − ∑ wi ·XT(ri)DA(ri)DX(ri) 2 i (38) where X denotes the basis function vector for the corresponding grid point. In actual calculations, eq 38 is evaluated in a stepwise procedure with the intermediate vectors B(ri) = DX(ri)

(39)

G(ri) = A(ri)B(ri)

(40)

thus reducing the computational effort to conventional matrix−vector multiplications. The semi-numerical treatment of the picture-change-transformed exact-exchange energy in eq 33 is essentially equivalent, except that respective algebraic operations have to be performed individually for the 16 density matrices. Compared to analytical techniques, the formal scaling with respect to the system and basis set size is reduced from quartic scaling to N3 and N2BF, respectively, suggesting higher efficiency for larger systems and especially for larger basis sets. Note that NBF and n are proportional but not equivalent to the system G

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

any bias, which is consistent with the observation that looser thresholds can be applied for preLinX gradients. Finally, in light of the remarkably improved computation times compared to the COSX method, we consider the authors’ explanations of the effectiveness of the batch-wise integral screening to be insufficient. In particular, the effect of a common overlap and density matrix screening on computation times can theoretically be expected to be significantly smaller than that of the individual screenings. However, the improvements appear to be at the same order of magnitude as those of the S-junctions, thus suggesting that another additional effect is considered by the batch-wise integral screening. In this work, we thus propose a new screening technique for determining the semi-numerical exact exchange which combines the simplicity of the screenings introduced in the COSX method with the improved screening of preLinX. First, we provide a more detailed explanation for the effectiveness of the batch-wise integral screening in the preLinX method. By employing S- and P-junctions, the localities of the basis function vector in eq 38 and the nonoverlapping basis function pairs in eq 37 are considered. A critical aspect which is not considered is the locality of the A matrix, or rather, the locality of the Coulomb operator, in eq 37. That is, integrals of overlapping basis function pairs in which the center is distant from the grid point ri, thus giving negligible values, are not screened by S- and P-junctions. Consequently, the additional effect implicitly covered by preLinX by calculating the central A matrix is directly related to the asymptotic decay of the Coulomb operator. In fact, the latter case can be accurately treated using an asymptotic approximation, so explicit precalculation of the central A matrix can be avoided. In particular, we propose an alternative additional screening procedure that is closely related to the concepts of the COSX method. In the first step, we determine the center and radius of the smallest sphere that contains all grid points of the current grid point batch. While the exact solution of the smallestsphere problem can be rather expensive, in practice, a sufficiently small sphere can be constructed by determining the minimum and maximum spatial components of the grid points of the batch, Rmin and Rmax, respectively. This is equivalent to finding the smallest possible cuboid coaxial to the Cartesian axes containing all grid points and causes negligible costs, as it is done outside of the basis function loops for determining the A matrix elements. The center of the sphere is then given by the point exactly centered between Rmin and Rmax, i.e., at the center of the cuboid, while the radius of the sphere is simply the distance from this center to Rmin or Rmax. That is, we employ the radius and center of the sphere covering the cuboid. Although not being optimal, the sphere thus contains all grid points of the batch, which is a necessary requirement of the approach. On the basis of the distance ΔRsphe from the center of a basis function pair to the surface of the determined sphere, we propose the asymptotic integral estimate (similar to the common asymptotic approximation of the Boys function)

band-diagonal structure of D, also results in a constant, albeit larger, number of nonzero values in the B vector. A matrix elements for shell pairs in which both shells exhibit only negligible values in the B vector are omitted, finally enabling an asymptotic linear scaling. P-junctions need to be determined for each batch of grid points and each SCF cycle. Recently, Laqua et al. proposed a new linear scaling seminumerical exact-exchange method, named preLinX.87 In addition to the conventional S-junctions from the COSX method, they employ a prescreening similar to that in the preLinK110 method (denoted as K-junctions), as well as an additional batch-wise integral screening. For the latter, the A matrix of the central point of the grid batch is precalculated and screened for elements that are negligible upon multiplication with the B vector from the right and the left. The central grid point is determined using a sophisticated batching algorithm based on Hilbert curves to ensure compact grid batches. Compared to the COSX method, preLinX shows a significantly reduced prefactor and reaches the linear scaling regime somewhat earlier, which the authors explained in terms of an additional effect of the batch-wise integral screening beyond the independent screening of the density matrix and the overlap. 3.2. Modified Chain-of-Spheres Exact Exchange. Despite the undeniable improvement of preLinX over the COSX method, which should be retained in the development of new screening techniques, several of its features require a critical examination. First, the intrinsic requirement for a sophisticated batching algorithm, which is rarely part of quantum-chemical programs, as simple batching methods are generally sufficient for conventional DFT codes, may require higher implementation efforts than the simple COSX method. Further, the assumption that the central grid point is representative for the entire batch, i.e., the assumption of compactness of the batch, does not necessarily hold for all grids and all grid regions. For example, for small grids, which are desirable due to their lower computational cost, or in outer grid regions with a sparse distribution of grid points, this assumption is questionable and theoretically can lead to overscreening, although in many cases associated errors do not necessarily have a significant impact on the numerical results. Another minor issue is the fact that the entire central A matrix is required for the batch-wise screening, so determination of the central A matrix itself cannot benefit from this screening and just employs conventional S- and K-junctions, which at least already provide linear scaling. A more critical issue is the definition of the screening condition. As described above, the central A matrix elements are multiplied from the left and the right with the corresponding elements of the B vector. This results in a screening condition for the exact-exchange energy. However, as has been pointed out by Almlöf in the context of the directSCF method,94 such a screening condition guarantees only sufficient accuracy of the exact-exchange energy but not of the corresponding part of the Fock matrix, which can lead to convergence problems in several cases. Although selfconsistent energy calculations such as those shown in the work of Laqua et al.87 may not be severely affected, one can expect that more sensitive response calculations will be significantly biased. At least tighter thresholds than those for the ground-state energy calculation would be required. Note that screening with respect to the energy, in contrast, is adequate for gradient calculations and thus does not introduce

ij π yz I est = Ipre·jjj zzz j pz k {

3/2

·

1 , ΔR sphe > 0 ΔR sphe

(41)

where Ipre denotes the r-independent prefactor and p the sum of the basis function exponents. If the center of the basis function pair lies within the sphere, i.e., ΔRsphe ≤ 0, the integral estimate is not determined, and no additional screening is H

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation applied. For centers lying outside of the sphere, ΔRsphe is a lower bound to the exact distances ΔR for the grid points in the sphere. Hence, in principle, the integral estimate is larger than the real integrals and thus represents an upper bound for the real integrals. In particular, we assumed that in the asymptotic limit all basis functions decay like s-functions, i.e., spherical basis functions. In fact, it can be easily shown mathematically that this assumption holds exactly by considering explicit integral formulas in terms of the nthorder Boys functions Fn and the asymptotic formula98 Fn(x) ≤

(2n − 1)!! 2n + 1

π x 2n + 1

method, our prescreening thus ensures a certain accuracy for the Fock matrix elements and not just for the energy, thus guaranteeing direct rather than indirect control over the SCF convergence behavior. In fact, P-junctions in the COSX method work in the same way. Indirect control over the accuracy of the exact-exchange energy, in contrast, is not a significant issue, since screening of the G vector represents the tighter screening. Therefore, depending on the employed threshold, the new screening can lead to less screened integrals than in the preLinX method while giving the same accuracy for the exact-exchange energy. However, we consider direct control over the screening to be more important than the presumably small benefit of an energy-based screening, which is in accordance with the arguments of Almlöf et al.94 Being based on the definition of a grid batch sphere and a spherical integral estimate, the newly proposed prescreening procedure displays many similarities with the original COSX method. As screening of the G vectors is closely related to a direct screening of Fock matrix elements, we thus denote our new prescreening technique as F-junctions. In combination with conventional S- and P-junctions, F-junctions essentially describe a modified version of the original COSX method, which we thus denote as the modified COSX (mCOSX) method in the following. However, in contrast to the original COSX method, the F- and P-junctions in the mCOSX method do not consider a grid weight dependence in the basis function vector, which can be rationalized by the fact that such a dependence would ultimately lead to neglect of all A matrix elements for infinitely large molecular grids. Hence, our COSX implementation also does not consider grid-weight-dependent P-junctions. Although this might not be a major issue for most finite grids, we thus consider a grid-dependent screening as an undesirable feature. This is in accordance with the arguments of Laqua et al.87 for their batch-wise screening and has already been employed for P-junctions in previous COSX implementations of local hybrid functionals.105

(42)

In particular, the integral value will be dominated by the zeroth-order Boys function in the asymptotic limit, since higher-order Boys functions decay more rapidly. This results in asymptotic formulas identical to the integral expression for sfunctions. Accordingly, eq 41 represents the exact asymptotic integral expression when used with the exact distance R from the center of a basis function pair to the grid point ri instead of ΔRsphe. Since all grid points lie within the sphere surrounding the grid point batch, ΔRsphe is guaranteed to be less than or equal to the exact distance ΔR for any grid point of the batch, so the asymptotic integral estimates determined using eq 41 are bound to be larger than the correct asymptotic integrals. Note, however, that eq 41 represents a strict upper bound only in asymptotic regions, i.e., for large values of ΔRsphe, and for integrals of two s-functions. That is, integral estimates for smaller distances ΔR sphe that include higher angular momentum quantum numbers can be lower than the correct integral. Since, as discussed above, the additional screening is based on the asymptotic decay of the Coulomb operator, one can expect that the integral estimate in eq 41 nonetheless represents an adequate though not exact upper bound, even for shorter distances ΔRsphe, so the integral screening generally does not lead to overscreening, as long as reasonable thresholds are chosen. Note also that the integral screening by Neese et al.64 is a special case of our estimate where F0 = 1 is assumed, which thus gives a tighter screening threshold, as the asymptotic decay of the Coulomb operator is neglected. Furthermore, calculation of the integral estimate (eq 41) is less expensive than the precalculation and screening of the central A matrix in the preLinX method, while providing a safer integral approximation, since no assumptions about grid batch compactness are necessary. Nonetheless, the present scheme can benefit from sophisticated batching schemes as used in the preLinX method to provide more compact grid point batches.87 Note that direct screening with respect to the integral estimate (eq 41), in principle, also enables linear scaling of the semi-numerical evaluation of Coulomb integrals but most likely only for very large molecules. As the last step in our screening procedure, we include the concept of a common prescreening from the preLinX method. In particular, we screen for the estimated G vectors of the current batch of grid points est max Gνest(ri) = I μν ·Bμ

4. IMPLEMENTATION All implementations in this work are performed in a development version of the recently published Relativistic and Quantum Electronic Theory (RAQET) program package,111 which is designed for large-scale two-component relativistic quantum-chemical calculations. This includes the PCT of the density matrix (see Section 2), as well as seminumerical exact exchange using the new mCOSX method (see Section 3.2). In the following, both parts of the implementation are described in detail. For the first evaluation of the picture-change-transformed density matrix scheme, we restrict ourselves to the calculation of scalar-relativistic Coulomb and exact-exchange integrals, as described in Section 2.2, using the exact auxiliary basis set. While the latter ensures direct comparability with results obtained using the conventional operator transformation approach, the scalar-relativistic treatment substantially reduces the formal complexity of the corresponding transformations, since only the scalar-relativistic and thus diagonal coupling matrix (eq 28) needs to be considered. Nonetheless, the most expensive part of the calculation of relativistic two-electron integrals, i.e., the evaluation of four-center integrals within the large auxiliary basis set and their contraction with the density matrix, is already included, so the computational effort, in principle, is similar to that of exact spin-free four-component calculations. Analytical four-center integrals are determined

(43)

where Bmax is the vector containing the maximum absolute values of all B vectors of the current batch. That is, if the est estimated vector elements Gest ν and Gμ are negligible with respect to a threshold, explicit integral evaluation for the basis function pair can be omitted. In contrast to the preLinX I

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation using the hybrid integration scheme in RAQET111 together with the standard overlap and density matrix screenings of the direct SCF method.94 Density matrix screenings of the integrals are performed using the largest absolute values of all density matrices (note again that there are 16 density matrices in the case of exact exchange), whereas the contraction with the density matrices is screened individually. Here, a fundamental difference between the implementation strategies of the picture-change-transformed density matrix scheme and the operator transformation approach should be highlighted. In the latter, four-center integrals in the auxiliary basis set are directly transformed into the AO basis. This transformation step formally scales as N5 and thus represents the computational bottleneck. Application of the LUT60,61 significantly reduces the cost for large molecules but not the scaling for atomic transformations. Accordingly, repetitive transformations of the four-center integrals should be strictly avoided in this approach. This can be realized by storing the transformed integrals either on disc or in the main memory, which is known as disc-SCF and in-core-SCF,112 respectively. In many cases, however, especially for molecules containing heavy elements with large basis sets, the latter is not applicable due to the resulting high main memory requirements. In contrast, the efficiency of the disc-SCF method, especially in parallelized calculations and for large molecules, can be sharply limited by disc input/output, which was one of the main reasons for the development of direct SCF methods.94 In the operator transformation scheme, this limitation also holds for the transformation of the four-center integrals themselves. A direct comparison with the PCT of the density matrix, which is mainly bound by CPU capacity, is thus not possible. In addition, integral screenings based on the density matrix cannot be applied straightforwardly within the disc-SCF approach. However, the number of transformed integrals that must be stored is clearly lower than the number of integrals in the auxiliary basis set, which needs to be fully uncontracted and contains up to four times as many primitive basis functions as the AO basis. Hence, the advantage of not recalculating four-center integrals is significantly larger than that in the nonrelativistic case. Furthermore, contraction of the transformed four-center integrals in the AO basis is more efficient than contraction in the large auxiliary basis. In the picture-change-transformed density matrix scheme, the expensive transformation of four-center integrals is completely avoided, while the evaluation of four-center integrals in the auxiliary basis is equivalent to that in the operator transformation approach. The additional cost of the PCT of the density matrix itself, as discussed below, can be largely neglected. However, the contraction of the picturechange-transformed density matrix with the four-center integrals in the auxiliary basis without screening of the picture-change-transformed density matrix, which contains many zero values, can be significantly more expensive than in the disc-SCF case. Hence, application of the direct-SCF method, which by default enables density matrix screenings, represents the natural choice for the picture-change-transformed density matrix approach, rendering the picture-changetransformed density matrix scheme the preferable choice for large molecules and parallelized calculations, especially when it is combined with linear scaling methods such as COSX64 or LinK.86 In contrast, the operator transformation approach together with the disc-SCF method represents the best choice for smaller molecules on single cores. However, the disc-SCF

method, especially in combination with the LUT scheme, might still be faster for many medium-sized molecules due to the more efficient storage in the AO basis. The transformation matrices are calculated according to eqs 21 and 22, with the spin dependence of the Pauli matrices having no effect due to the diagonality of  . The corresponding overlap integrals are calculated using the conventional Gauss−Hermite quadrature.98 The PCT of the density matrix follows eqs 26 and 33 for the Coulomb and exact-exchange integrals, respectively. While the picturechange-transformed density matrix in eq 26 is symmetric (Hermitian in spin-dependent calculations) by default, the 16 density matrices in the case of exact exchange are decomposed into symmetric and antisymmetric (or Hermitian and antiHermitian) contributions ̃ j + T†j DT ̃ i Dijs = T†i DT

(44)

̃ j − T†j DT ̃ i Dijas = T†i DT

(45)

with i, j ∈ {0, x, y, z}, thus facilitating treatment by standard algebraic methods. Although the band-diagonal structure of most of the relevant matrices suggests the general applicability of sparse algebra routines, which theoretically would enable linear scaling of the density matrix transformation, we employed the standard BLAS and LAPACK routines so far, since the cost of the density matrix transformation, although exhibiting a formal cubic scaling, can be expected to be much lower than that of the four-center integral evaluation in the auxiliary basis, even for larger molecules, and thus usually does not influence the overall calculation cost significantly. Note that consideration of the band-diagonal structure of the relevant matrices, i.e., the density matrix and the transformation matrices, is similar to the LUT scheme, which considers the locality of relativistic effects. That is, in the present picture-change-transformed density matrix scheme, the LUT approach would reduce only the computational cost of the transformation of the density matrix but would have no impact on the integral evaluation, so it would become effective first for large molecules. The reduction of integral evaluation costs, on the other hand, is achieved by density matrix screenings within the direct-SCF method. As an approximation to the full transformation, we implemented a variant, denoted as partial transformation, in which the PCT is applied only to certain atoms, which might become useful for the treatment of molecules with only one or few heavy atoms and is similar to previous approximations for the one-electron contributions.62 Note also that the picture-change-transformed density matrix (eq 26), which is needed for the calculation of relativistic Coulomb integrals, can be directly applied for the calculation of picture-change-transformed DFT ingredients arising from electric operators such as the electron density and its spatial derivatives. For clarity, we postpone these analyses to future work, along with the assessment of the spin-dependent transformations and the treatment of magnetic quantities. For the implementation of the mCOSX method, we employed efficient molecular integration grids based on the scheme developed by Treutler and Ahlrichs.113 In particular, we used a combination of a mapped Gauss−Chebychev and Gauss−Lebedev quadrature and introduced small spherical grids for radial grid points close to nuclei. The details of the grid definitions, however, differ slightly from those in the original work and can be found in the Supporting Information. J

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

relativistically when the partial PCT was applied. SCF energies were converged to 10−10 Eh, with the density matrix elements being typically converged below 10−6. Molecular grids as described in Section 4 and specified in the Supporting Information were used. All calculations using the picturechange-transformed density matrix scheme were performed in a direct-SCF manner. Except for the semi-numerical treatment of exact exchange, two-electron integrals were calculated analytically using the standard quadratically scaling hybrid integral scheme implemented in RAQET.111 Numerical accuracies were evaluated for three properties, total energies (in Eh), atomization energies (in kcal/mol), and 1s orbital energies (in eV). Tight integral thresholds of 10−20 a.u. were used for one- and two-electron integrals, so numerical errors in the analytical integration routines were largely avoided. As a measure for comparing different accuracies, we introduce the number of accurate digits

Furthermore, we employed a simple nonoptimized atomic grid batching with 100 grid points per batch and optimized screened matrix−vector multiplication to evaluate eq 39 and construct the Fock matrix elements based on the G vectors, as described in detail elsewhere.114 The contraction in eq 40 is done shell-pairwise to avoid full construction of the A matrices for the entire grid point batch and thus significantly reduces memory requirements. The elements of the A matrix themselves are calculated using a combination of machinegenerated explicit two-center integral routines for angular momentum quantum numbers of lμ + lν ≤ 10 and Rys quadrature for the remaining integrals. This enables the use of efficient explicit integral formulas for most shell pairs, even for heavy elements with large auxiliary basis sets. The calculation of shell radii for the determination of the S-junctions is performed using a quadratically convergent algorithm, in contrast to the linear screening proposed in the original work.64 As for the analytical two-electron integral evaluation, S- and P-junctions in terms of the integral screening are calculated for the maximum absolute density matrix, whereas screenings associated with the contraction in eq 40 are applied individually for each density matrix. In this way, the large number of negligible values in the 16 exact-exchange density matrices can be efficiently screened, significantly reducing the computation cost of the contraction step, when the corresponding A matrix elements are not screened. This procedure is equivalent to the commonly used screening of different transition density matrices in semi-numerical timedependent DFT implementations.105

i |x − xref | yz zzz P(u) = −log10jjjj u k {

(46)

where x and xref represent the calculated and reference values, respectively, and u denotes the unit of the quantity in question (see above). For analytical HF calculations using the picturechange-transformed density matrix scheme, analytical HF results obtained by the operator transformation approach were employed as a reference. For atomization and 1s orbital energies, technical accuracies of 10−2 kcal/mol and 10−2 eV, respectively, i.e., P = 2, were considered to be sufficiently accurate for typical applications, since methodological errors of most electronic structure methods are more than 1 order of magnitude larger. These values were thus used as benchmark values for evaluating the accuracy. Except for the gold rings, computation times were evaluated on one core of an Intel i7-7700 processor with 8 GB of main memory using analytical integral thresholds of 10−12 a.u. The gold rings were computed on 12 cores of a single node with Intel Xeon X5690 processor and a total of 72 GB of main memory. To ensure comparability, all computation times for the Fock matrix build were measured for the initial SCF step; i.e., they were based on a core Hamiltonian guess.

5. COMPUTATIONAL DETAILS For the technical evaluation of the picture-change-transformed density matrix scheme and the mCOSX method, we employed three groups of molecular systems, the halogen molecules F2, Cl2, Br2, and I2 together with the UGBS basis set97 (in one case also At2), hydrogen sulfide and the linear alkanethiols, given by CnH2n+1SH (n = 0−14), in combination with the uncontracted cc-pVDZ basis set,115,116 and cationic gold rings of the sum formula Aun+ n (n = 1−10) with an uncontracted SapporoDKH3-DZP basis set.117 As this work addresses only the technical properties of the new implementation, we used nonrelativistic HF/6-311G** structures in the former two cases. For the gold rings, we employed an Au−Au distance of 3.332 Å and an Au−Au−Au angle of (n − 2)/n·180° (with n > 2). For the alkanethiols, computation times required for the PCT were additionally measured using contracted and uncontracted cc-pVXZ (X = D,T,Q) bases. 115,116 All calculations were done at the HF level of theory using a development version of the RAQET program package.111 Oneelectron energy contributions are treated within the conventional IOTC method, while the picture-change-transformed density matrix scheme with the IOTC transformation matrices was applied for two-electron integral calculations. In particular, we used three levels of transformation for the two-electron integrals (a) a PCT of all atoms, denoted as full, (b) no PCT, denoted as none, and (c) a partial PCT of one heavy atom, denoted as partial. In the case of the halogen molecules, (c) thus refers to a relativistic transformation of only one of the two halogen atoms, which can be considered as the worst-case scenario for the partial transformation approach, since molecular symmetry will be broken. For the considered linear alkanethiols, only two-electron integrals of the sulfur atom were treated

6. RESULTS 6.1. Picture-Change Transformation of the Density Matrix. For convenience, we start by evaluating the picturechange-transformed density matrix scheme. Although the theoretical derivation of the transformation scheme in Section 2 does not introduce any approximation compared to the conventional operator transformation approach if the exact auxiliary basis is employed, which is the case in this work, the actual calculation of the transformation matrices, as well as the transformation of the density matrix itself, may introduce numerical errors, especially in the calculation of the inverse overlap matrices. Hence, we compared the accuracy of the picture-change-transformed density matrix with that of the operator transformation scheme for the 1s orbital and atomization energies of the halogen molecules. The results are presented in Table 1 in the columns labeled f ull. In fact, atomization energies (in kcal/mol) and 1s orbital energies (in eV) were calculated accurately to at least the fifth digit, which is 3 orders of magnitude better than the defined benchmark accuracy of P = 2. In view of their large absolute values, especially deviations of the 1s orbital energies of the K

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation Table 1. Comparison of Number of Accurate Digits P of Analytical HF Results Using Picture-Change-Transformed Density Matrix Scheme and Operator Transformation Approach for 1s Orbital Energies (in eV) and Atomization Energies (in kcal/mol) of Halogen Molecules 1s orbital energy

atomization energy

molecule

full

partial

full

partial

none

F2 Cl2 Br2 I2 At2

8.56 6.38 5.33 7.21 −

2.47 3.56 3.62 3.54 3.10a

7.20 7.20 6.42 5.57 4.85b

1.80 1.89 1.25 1.02 1.10b

1.82 2.19 2.28 2.14 1.54b

a

Full PCT of the density used as reference. bFull PCT of the density used as reference for At2 molecule.

heavier elements can be considered to be insignificant, while the accuracy of the atomization energies is close to the SCF convergence threshold. Numerical results obtained using the picture-change-transformed density matrix approach can thus be considered to be essentially exact. In addition, we also investigated the accuracy of the partial PCT of the density matrix as described in Section 5. For the 1s orbital energies, since they are strongly affected by relativity, we compare the lower of the two orbital energies with the full PCT and the higher 1s orbital energy with the nontransformed results. The obtained accuracies (see Table 1) are significantly lower than those using the full PCT but are still above the set benchmark accuracy, even for F2, which has the potentially highest valence character of the 1s core orbitals. This suggests that a partial PCT can be useful for reducing the cost of relativistic calculations of core excitation energies in larger molecules if excitations for only a few atoms are needed. Atomization energies, on the other hand, are expected to be less affected by relativistic effects. Hence, we compared the partial PCT results directly with the full PCT results. In fact, atomization energies obtained using the partial PCT were even less accurate than those obtained without applying a PCT for the two-electron contributions at all (column labeled “none” in Table 1), especially for the heavier elements. We attribute this behavior to the treatment of bond breaking between atoms treated on different relativistic levels, which occurs when a partial PCT is applied. Hence, the partial PCT appears to be less useful for calculating atomization energies, and most likely also for other calculations involving bond breaking, although we expect higher accuracies when only the two-electron integrals of hydrogen atoms are treated nonrelativistically. As we discussed in Section 4, the additional step of the PCT contributes to the computation time in each SCF step and thus might has a non-negligible influence on total computation times. Hence, we measured combined computation times for the PCT of the density matrix and back-transformation of the Fock matrix for the alkanethiols with different basis sets (Figure 1). As expected, transformation times increase steadily with the basis set size and the number of molecules, reaching approximately 100 s for 10 carbon atoms and the uncontracted cc-pVQZ basis set. In the latter case, the auxiliary basis set already consists of 5775 basis functions (compared to 1746 in the AO basis), which roughly corresponds to a quintuple- to sextuple-ζ basis set in a nonrelativistic calculation. Hence, the four-center integral evaluation is still expected to be far more expensive than the PCT of the density matrix despite the latter’s unfavorable formal cubic scaling. Numerical evidence

Figure 1. Transformation times (in s) of the PCT of the density matrix for linear alkanethiols obtained using different basis sets.

can be found for the uncontracted cc-pVDZ basis set, which we employed to evaluate computation times of the mCOSX method (see Section 6.2). In particular, the calculation time for one SCF step is at least 3 orders of magnitude larger than the measured transformation times. In contrast to the operator transformation approach, the PCT in the picture-changetransformed density matrix scheme thus does not represent a computational bottleneck, so acceleration schemes such as the LUT have less impact on total computation times, as discussed in Section 4. In summary, the picture-change-transformed density matrix scheme provides a numerically exact treatment of relativistic two-electron contributions with calculation costs for the PCT being significantly lower than four-center integral evaluation times. It thus represents an efficient way to treat relativistic effects, for example, in direct-SCF calculations, due to the possibility to reduce the number of four-center integrals to be calculated, and for the numerical integration of XC functionals, since an individual PCT on each grid point is not needed. The partial PCT, on the other hand, appears to be useful only for specific applications, such as the calculation of relativistic core orbital energies. 6.2. Semi-Numerical Exact Exchange. As semi-numerical integration has never been applied for the calculation of relativistic exact exchange but has been used only in nonrelativistic calculations, conclusions regarding the accuracy and performance of the approximation for relativistic calculations need to be evaluated. However, before doing so, we should determine a set of reliable thresholds for the new mCOSX method, i.e., for S-, P-, and F-junctions. For this purpose, we rely solely on an accuracy measure, as efficiency considerations should play only a minor role for reliability. Since S-junctions are defined in the same way as in the original COSX method and a previous semi-numerical local hybrid implementation,104 the thresholds can be simply transferred. In contrast, P-junctions in the mCOSX method do not account for the grid dependence, so the respective thresholds need to be redetermined. The effect of different thresholds for F-junctions was investigated here for the first time. In particular, we measured the accuracy of the nonrelativistic HF energy of tetradecanethiol with different individual thresholds ϵ for P- and F-junctions. That is, Pjunctions were switched off for the evaluation of different FL

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

Table 2. Accuracy P of Relativistic HF 1s Orbital Energies (in eV) and Atomization Energies (in kcal/mol) of Halogen Molecules Obtained Using the mCOSX Method with Different Thresholds ϵPF for P- and F-Junctions

junction thresholds and vice versa. In all calculations, the Sjunction threshold was set to ϵS = 10−7, and grid 1 was used. The results are shown in Figure 2.

1s orbital energy

atomization energy

ϵPF

F2

Cl2

Br2

I2

F2

Cl2

Br2

I2

10−5 10−6 10−7 10−8 10−9 10−10 10−11 10−12

4.67 5.88 6.70 7.69 7.96 8.09 8.02 8.09

4.45 5.18 6.27 5.97 5.63 5.62 5.62 5.62

4.81 5.64 7.13 6.46 7.26 5.97 6.85 6.66

4.80 5.37 5.45 6.38 6.05 6.41 6.29 6.28

2.93 4.77 5.77 5.65 5.65 5.65 5.65 5.65

2.80 4.43 6.84 5.37 5.33 5.33 5.33 5.33

2.96 4.06 3.98 3.91 3.91 3.91 4.33 5.04

2.78 3.66 3.07 2.94 2.94 3.54 4.16 5.45

Again, grid 1 was used, and S-junctions were switched off. For 1s orbital energies, even a relatively loose threshold of ϵPF = 10−5 provides accuracies beyond P(eV) = 4, while at the same threshold, the accuracy of the atomization energies is only approximately P(kcal/mol) = 2.8−2.9, which, however, is still sufficient with respect to the intended technical accuracy of P = 2. By reducing the threshold by 1 order of magnitude to ϵPF = 10−6, an accuracy close to P(kcal/mol) = 4 is also achievable for the atomization energies. Obtaining more accurate atomization energies, however, might require very low thresholds, especially for heavy elements. However, F- and P-junctions in the mCOSX method do not appear to introduce significant systematic errors related to the relativistic framework. Hence, corresponding considerations for nonrelativistic calculations can be applied similarly. On the basis of accuracies determined with different S-junction thresholds by Bahmann and Kaupp104 and our own measured accuracies for P- and F-junctions, we thus define two threshold sets (ϵS, ϵP, ϵF) for the mCOSX method, a standard set, (10−7, 10−6, 10−9), with an expected accuracy of approximately 10−7 − 10−8 Eh, denoted as high (with respect to the accuracy), and a looser set, (10−6, 10−5, 10−8), for faster calculations with an expected accuracy of approximately 10−6 Eh, denoted as low. As the last step in the accuracy evaluation, we investigated the effect of different numerical grids on nonrelativistic, partially relativistic (partial PCT), and fully relativistic (full PCT) semi-numerical HF calculations. For simplicity, screening by S-, P-, and F-junctions was switched off. In fact, the results for the three levels of the two-electron PCT were exactly the same for atomization energies and very similar for 1s orbital energies. This shows that there are no further requirements for the numerical grid in relativistic calculations. Therefore, we show only results for the full PCT in Figure 3. As is known from the nonrelativistic case,105 the orbital energies are accurately described even when relatively small grids are used. On the other hand, slightly larger grids, for example, grid 3, are required to obtain accurate atomization energies. This supports the use of different grid sizes for the SCF convergence and energy calculation, which is commonly used in some quantum-chemical programs. However, less precise results with errors of approximately 0.1 kcal/mol, which still exceeds the accuracy of most XC functionals, can already be computed using smaller grids (grids 1 and 2). In all cases, the results for heavier elements are less accurate, reflecting higher requirements on the numerical grid. In addition to evaluating the accuracy of the numerical grids, we measured the computation times for the three levels of the

Figure 2. Number of accurate digits P of nonrelativistic HF energies (in Eh) of n-tetradecanethiol using the mCOSX method with grid 1 and different thresholds ϵ for P- and F-junctions (for further details, see text and Section 5).

For P-junctions, a threshold of ϵP = 10−5 already provides an accuracy of approximately 10−6 Eh, while errors less than 10−8 Eh are achieved with ϵP = 10−6. In view of the different methodologies, this is in fair agreement with the results obtained in ref 104 for local hybrids, although the errors for ϵP = 10−6 are found to be somewhat lower in the mCOSX method, in part due to less screening. We observed a similar behavior for the F-junction threshold ϵF, with thresholds of ϵF = ϵP·10−3 providing accuracies comparable to those of Pjunctions with ϵP in the range of intermediate accuracies of P(Eh) = 6−8, which is sufficiently accurate in most cases. For simplicity, we thus fixed ϵF to be 3 orders of magnitude lower than ϵP and evaluated the accuracy for joint use of F- and Pjunctions using the threshold ϵPF (green curve in Figure 2), with ϵP = ϵPF and ϵF = ϵPF·10−3. Except for slower convergence in the less-important high-accuracy region, the results of the joint screening are as accurate as the results obtained employing P-junctions alone. Hence, the additional consideration of F-junctions does not introduce significant errors when the joint threshold ϵPF is used. However, in relativistic HF calculations using the picturechange-transformed density matrix scheme, the picturechange-transformed density matrices contain many zero values, in contrast to the nonrelativistic case even for compact molecules, since only few basis functions of the auxiliary basis set, which is shared by the large and small component, are commonly used by both components when the exact auxiliary basis is employed. Hence, the behavior of screenings based on the density matrix, as it is the case for P- and F-junctions, might differ from that in the nonrelativistic case, where density matrix elements usually become zero only for distant orbitals. Therefore, we also determined the accuracy of different P- and F-junction thresholds for the relativistic atomization and 1s orbital energies of the halogen molecules (Table 2), which, due to the known locality of relativistic effects,61 suffice as test cases, as two-center relativistic effects are already considered. M

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

Figure 3. Accuracy P of relativistic HF (a) atomization energies and (b) 1s orbital energies for halogen molecules obtained using the mCOSX method with different grid sizes.

Figure 4. Computation times t (in s) for building the respective parts of the Fock matrix (Coulomb and/or exact-exchange contributions) for linear alkanethiols of different lengths using analytical and semi-numerical integration techniques (a) without PCT (nonrelativistic two-electron integrals), (b) with partial PCT, and (c) with full PCT (relativistic two-electron integrals). Solid lines represent fits for alkanethiols with n = 10−14. (d) Computation times t (in s) for building the respective parts of the Fock matrix for the employed linear cationic gold rings using the full PCT.

Coulomb integrals (J + K). As test systems, we employed alkanethiols containing up to 14 carbon atoms and cationic gold rings up to 10 gold atoms, with grid 1 and grid 3 being used for semi-numerical calculations, respectively. In fact, linear chains consisting of light atoms represent the worst case when comparing analytical and semi-numerical integration

PCT mentioned above. In particular, we compared the mCOSX method using the high and low threshold sets with analytical Coulomb and analytical exact-exchange integrals, which are denoted as J and K, respectively. As a standard direct-SCF code for analytical integrals was used, exactexchange integrals were calculated only together with N

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

size, i.e., a quadratic rather than a quartic scaling in the case of analytical integration, together with the efficient screenings and the large size of the auxiliary basis set, which is approximately 3 to 4 times larger than the original AO basis. The second point concerns the dimensions of the picture-change-transformed density matrices, whose size is also increased by the larger auxiliary basis set. While only one picture-change-transformed density matrix is needed to evaluate Coulomb integrals, fourcenter integrals need to be contracted with 16 picture-changetransformed density matrices in the case of exact exchange (see Section 4). Furthermore, more electron repulsion integrals contribute to exact exchange, since, in contrast to the picturechange-transformed density matrix in eq 26, the density matrix in eq 33 is not block diagonal; so, less integrals can be omitted due to the density matrix screening. In fact, these two aspects cause analytical relativistic exact exchange to be more expensive than the corresponding Coulomb integrals, which can be seen in the difference between the red and blue curves in Figure 4. The efficient screening of the numerous negligible elements of the individual density matrices in the mCOSX method eliminates many of these contraction steps, which can be regarded as one of its main advantages. In addition, the contraction requires only conventional matrix vector multiplications, which is significantly less expensive than the contraction of a four-index quantity. Despite the larger number of treated density matrices, mCOSX is thus faster than an analytical Coulomb integral evaluation. In contrast to the nonrelativistic case, a combination of analytical Coulomb integrals and mCOSX thus provides a significant speed-up compared to the full analytical integral evaluation. Therefore, the combination with RI-J or other acceleration techniques for Coulomb integrals, which is not investigated in this work, can be expected to be even more beneficial. However, mCOSX might offer slightly less improvement over linear scaling analytical exact-exchange methods, which feature more sophisticated density matrix screenings. The partial PCT (Figure 4b) represents an intermediate approach between the full PCT and the nonrelativistic case, in particular, with respect to the size of the auxiliary basis set, which is almost as large as that in the full-PCT case for H2S, but increases like that in the nonrelativistic case, as no PCT is applied to carbon and hydrogen atoms. However, the calculation of the exact-exchange integrals still requires the treatment of 16 density matrices, which is why the crossover between the mCOSX method and analytical Coulomb integrals is observed for longer chain lengths than in the nonrelativistic case, and the analytical exact-exchange evaluation is dominated by the contraction step with the density matrices. Due to the mentioned differences, a direct comparison with the nonrelativistic case in Figure 4b) should be thus treated with caution. Nonetheless, mCOSX is still significantly faster than the conventional analytical exactexchange evaluation. Most notably, total calculation times with the partial PCT are distinctly lower than those for the full PCT. Hence, the partial PCT represents a reasonable alternative if, for example, only core excitations of the sulfur atom are to be calculated. The comparison of the performance of the mCOSX method for the three levels of the PCT is shown in Figure S1 of the Supporting Information. According to the increase in the basis set size by a factor of 3−4 and the treatment of 16 density matrices instead of one in the full-PCT case compared to the no-PCT case, one could naively expect an increase in the total

techniques, since the latter exhibits a formal quadratic and cubic scaling with the basis set and system size, respectively, while the former scales as N4BF. In both schemes, quadratic scaling with respect to the system size is realized by a proper overlap screening, so the quadratic scaling regime is approached at a comparable system size. For longer chain lengths, the semi-numerical scheme would eventually reach the linear scaling regime, as already shown by several other authors.64,87 Another demonstration is thus not needed. Since conventional direct-SCF codes, however, scale only quadratically in the asymptotic limit, larger chain lengths would favor the semi-numerical scheme, again. Hence, the chosen linear alkanethiols of intermediate length represent the worst case for the semi-numerical scheme in this comparison regarding the scaling behavior and are thus a suitable test case, although absolute calculation times, in fact, would be larger for more compact molecules. For lighter elements, the use of the small grid 1, which, as shown above, itself provides only a relatively low accuracy, can be rationalized by the common use of small grids within a multi-level grid setup, for which the calculation times of the small grid dominates the overall calculation time. For the alkanethiols, timings for the larger grids 2 and 3 can be estimated by multiplying a factor of ca. 1.9 and 3.5, respectively. Further notice that the chosen double-ζ basis sets (larger bases would be problematic for the analytical relativistic calculations) already favor the analytical integration scheme. We note in passing that the measured parallelization efficiency of our implementation using OpenMP is similar to that of the original COSX method64 (see Supporting Information). First, the performance of the mCOSX method for the alkanethiols in the nonrelativistic case is discussed (Figure 4a). In fact, we observe a similar behavior for the mCOSX method compared to the conventional COSX method with gridweight-independent P-junctions using the same thresholds, as reported by Laqua et al. for preLinX,87 i.e., an earlier crossover with analytical integration and a somewhat faster convergence to the linear scaling regime. In particular, we observed subquadratic scaling for alkanethiols with n > 10 in the mCOSX method, whereas the COSX method did not reach quadratic scaling for the same chain lengths. This highlights the fact that the newly defined F-junctions cover the same effects as the batch-wise screening in preLinX. For tetradecanethiol, the calculation times are already essentially halved when mCOSX is used with the low threshold set instead of an analytical exact-exchange integral evaluation. However, due to the joint calculation of Coulomb and exact-exchange integrals in the conventional direct-SCF method, an overall acceleration of the nonrelativistic HF (or DFT calculations using hybrid functionals) will be observed only when mCOSX is combined with an efficient Coulomb integral evaluation method such as RI-J. For the full PCT, i.e., when relativistic two-electron integrals are used (Figure 4 c), the mCOSX method clearly outperforms conventional analytical integration. In particular, the mCOSX method with the loose threshold requires only less than 5% and less than 15% of the calculation time for tetradecanethiol compared to the joint calculation of analytical Coulomb and exact-exchange integrals and separate calculation of analytical Coulomb integrals, respectively. However, mCOSX is already faster for the small H2S molecule. The main reason for this great improvement is the known superior scaling of seminumerical methods with respect to the one-particle basis set O

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

directly related to additional density matrix screenings due to the PCT of the density matrix. Again, grid 1 was used, and Sjunctions were switched off. For very tight thresholds, between 10−12 and 10−8, we observed only moderate speed-ups of up to 1.2 times. However, when the proposed standard threshold set was employed, i.e., ϵPF = 10−6, the calculation cost was already reduced to less than 65%. When the low threshold set was used, speed-ups by more than a factor of 2 could be achieved. Most notably, the effectiveness of the density matrix screenings appeared to be similar for heavy and light elements. The measured speed-ups thus support the finding that the density matrix screenings, in the mCOSX method provided by P- and F-junctions, are essential for an efficient relativistic treatment within the picture-change-transformed density matrix scheme.

computation time by a factor of approximately 200. However, the computation time of mCOSX with the full PCT increases by factors of only approximately 18 and 15 for the high and low threshold sets, respectively, compared to increases by factors of more than 150 for analytical exact exchange. In fact, this finding highlights the remarkably high efficiency of the mCOSX method for the calculation of relativistic exact exchange and the need for an efficient density matrix screening within the picture-change-transformed density matrix scheme. As the calculation of four-center integrals in the disc-SCF method has fundamentally different technical requirements, as discussed in Section 4, a direct comparison is not possible. In most cases, however, calculations on standard cluster machines with eight cores appeared to be faster using the direct-SCF method with the picture-change-transformed density matrix scheme than for the disc-SCF scheme with a PCT of the integrals. In addition to the alkanethiols, the performance of mCOSX has been investigated also for the cationic gold rings, which, in contrast to the alkanethiols, consist only of heavy elements (Figure 4d). Despite the larger employed grid 3, mCOSX outperforms the analytical integration scheme even more distinctly than for the alkanethiols. Already for Au2+ 2 , less than 15% and 5% of the computation time compared to the analytical evaluation of Coulomb integrals and Coulomb together with exact-exchange integrals are required. Besides highlighting the high efficiency of mCOSX particularly for heavy elements, this also shows that linear alkanethiols, in fact, represent a worse case for the semi-numerical integration scheme than heavy-element-containing systems, which, as explained in detail above, rationalizes the use of the linear alkanethiols for a fair technical comparison between mCOSX and the analytical integration scheme. To complete the performance evaluation, we tried to quantify the observed additional effect of the density matrix screenings in the mCOSX method, i.e., the P- and F-junctions, in the full-PCT case. Therefore, we measured the speed-ups for the halogen molecules with different thresholds ϵPF relative to mCOSX without screening (Figure 5). In fact, P- and Fjunctions can be assumed to be largely ineffective in the nonrelativistic case due to the small size of the halogen molecules. Hence, the speed-ups in the relativistic case can be

7. CONCLUSIONS In this work, we presented the first semi-numerical implementation of relativistic exact exchange into a twocomponent quantum-chemical program, specifically within an exact two-component framework using the IOTC method, focusing mainly on the efficiency and reliability of the integration method. This was achieved by introducing two individual approaches, which represent the core of our method, a PCT of the two-component density matrix based on a formal resolution of the identity and introduction of an (exact) auxiliary basis set, as well as a modified version of the COSX method. The picture-change-transformed density matrix scheme allows us to eliminate the expensive four-index integral transformations in the conventional integral transformation approach of two-component methods and thus represents an ideal choice when combined with the direct-SCF method, which is known for its high efficiency for large molecules and highly parallel environments. Similarly, the new PCT scheme can be easily applied to the calculation of relativistically transformed DFT ingredients such as the electron density to avoid expensive integral transformations on the numerical grid. The simple algebraic formulation of the PCT in terms of a transformation of the density matrix, which requires only the introduction of an (exact) auxiliary basis set and the calculation of standard overlap integrals, provides a straightforward way to easily upgrade existing relativistic two-component codes to quasi-four-component accuracy without the need to rewrite existing four-center integral routines. In this way, acceleration schemes implemented for nonrelativistic Coulomb and exact-exchange integrals are readily available without further ado. The new mCOSX method represents an extension of the COSX method that combines the simplicity and robustness of the original method with the improved potential for screening exploited in preLinX. In particular, we introduced an improved spherical integral estimate for the screening of two-center integrals over the Coulomb operator on the numerical grid based on the introduction of a grid sphere which takes into account the asymptotic decay of the Coulomb operator. A joint screening with the density matrix in the sense of the original P-junctions, denoted as F-junctions, allows an additional prescreening of negligible contributions to the Fock matrix, thus significantly accelerating the semi-numerical exact-exchange evaluation compared to the original COSX method. However, due to the simplicity of F-junctions, existing COSX implementations can be easily extended to the mCOSX method by minor changes in the source code. We also

Figure 5. Speed-ups of the mCOSX method for halogen molecules with different thresholds ϵPF relative to mCOSX without screening. P

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

(5) Dyall, K. G.; Fægri, K. Introduction to Relativistic Quantum Chemistry; Oxford University Press, Oxford, 2007. (6) Reiher, M.; Wolf, A. Relativistic Quantum Chemistry, The Fundamental Theory of Molecular Science; Wiley-VCH, Weinheim, 2009. (7) Nakajima, T.; Yanai, T.; Hirao, K. Relativistic Electronic Structure Theory. J. Comput. Chem. 2002, 23, 847−860. (8) Pyykkö, P. Relativistic Effects in Structural Chemistry. Chem. Rev. 1988, 88, 563−594. (9) Autschbach, J. Perspective: Relativistic effects. J. Chem. Phys. 2012, 136, 150902. (10) Fürstner, A.; Davies, P. W. Catalytic Carbophilic Activation: Catalysis by Platinum and Gold π Acids. Angew. Chem., Int. Ed. 2007, 46, 3410−3449. (11) Raubenheimer, H. G.; Schmidbaur, H. The Late Start and Amazing Upswing in Gold Chemistry. J. Chem. Educ. 2014, 91, 2024− 2036. (12) van Spronsen, M. A.; Frenken, J. W. M.; Groot, I. M. N. Observing the oxidation of platinum. Nat. Commun. 2017, 8, 429. (13) Penschke, C.; Paier, J. Reduction and oxidation of Au adatoms on the CeO2(111) surface − DFT+U versus hybrid functionals. Phys. Chem. Chem. Phys. 2017, 19, 12546−12558. (14) Paier, J.; Penschke, C.; Sauer, J. Oxygen Defects and Surface Chemistry of Ceria: Quantum Chemical Studies Compared to Experiment. Chem. Rev. 2013, 113, 3949−3985. (15) Gohr, S.; Hrobárik, P.; Kaupp, M. Four-Component Relativistic Density Functional Calculations of EPR Parameters for Model Complexes of Tungstoenzymes. J. Phys. Chem. A 2017, 121, 9106− 9117. (16) Ahuja, R.; Blomqvist, A.; Larsson, P.; Pyykkö, P.; ZaleskiEjgierd, P. Relativity and the Lead-Acid Battery. Phys. Rev. Lett. 2011, 106, 018301. (17) Ibañez-Azpiroz, J.; Eiguren, A.; Bergara, A. Relativistic effects and fully spin-polarized Fermi surface at the Tl/Si(111) surface. Phys. Rev. B: Condens. Matter Mater. Phys. 2011, 84, 125435. (18) Einstein, A. Ist die Trägheit eines Körpers von seinem Energieinhalt abhängig? Ann. Phys. 1905, 323, 639−641. (19) Dirac, P. A. M. The quantum theory of the electron. Proc. R. Soc. London, Ser. A 1928, 117, 610−624. (20) Barysz, M.; Syrocki, Ł. Relativistic calculations of X-ray photoelectron spectra and the accuracy of the IOTC method. Mol. Phys. 2014, 112, 583−591. (21) Moss, R. E. Advanced Molecular Quantum Mechanics; Chapman and Hall, London, 1973. (22) Breit, G. Dirac’s Equation and the Spin-Spin Interactions of Two Electrons. Phys. Rev. 1932, 39, 616−624. (23) Schrödinger, E. An Undulatory Theory of the Mechanics of Atoms and Molecules. Phys. Rev. 1926, 28, 1049−1070. (24) Kutzelnigg, W. Basis Set Expansion of the Dirac Operator without Variational Collaps. Int. J. Quantum Chem. 1984, 25, 107− 129. (25) Stanton, R. E.; Havriliak, S. Kinetic balance: A partial solution to the problem of variational safety in Dirac calculations. J. Chem. Phys. 1984, 81, 1910−1918. (26) Dyall, K. G.; Fægri, K. Kinetic balance and variational bounds failure in the solution of the Dirac equation in a finite Gaussian basis set. Chem. Phys. Lett. 1990, 174, 25−32. (27) Foldy, L. L.; Wouthuysen, S. A. On the Dirac Theory of Spin 1/2 Particles and Its Non-Relativistic Limit. Phys. Rev. 1950, 78, 29− 36. (28) Sucher, J. Foundations of the relativistic theory of manyelectron atoms. Phys. Rev. A: At., Mol., Opt. Phys. 1980, 22, 348−362. (29) Heß, B. A. Applicability of the no-pair equation with freeparticle projection operators to atomic and molecular structure calculations. Phys. Rev. A: At., Mol., Opt. Phys. 1985, 32, 756−763. (30) Reiher, M.; Wolf, A. Exact decoupling of the Dirac Hamiltonian. I. General theory. J. Chem. Phys. 2004, 121, 2037−2047.

provided two threshold sets for mCOSX, with a focus on performance and accuracy, respectively, which can be readily applied. While each approach individually represents an advance in its field, the combination of the picture-change-transformed density matrix scheme, which has been applied only in the scalar-relativistic case so far, with the mCOSX method exhibited superior performance compared with a conventional analytical computation of relativistic exact exchange within the picture-change-transformed density matrix scheme, with distinct speed-ups already for medium-sized molecules using a conventional double-ζ basis set. Most notably, the resulting accuracies were found to be similar to those in the nonrelativistic case, rendering mCOSX an excellent choice for the acceleration of relativistic calculations. Although we investigated its use only within the two-component framework, we suspect that four-component relativistic codes can also benefit greatly from an mCOSX implementation. Finally, seminumerical relativistic exact exchange for the first time enables the use of hyper-GGA functionals such as local hybrids or B05 within a two-electron-relativistic framework. This opens the door to include relativistic effects in modern XC functionals and thus allows more accurate relativistic DFT calculations, where GGA and GGA-hybrid functionals still represent the standard approach.



ASSOCIATED CONTENT

S Supporting Information *

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jctc.9b00228. Details on the employed numerical grids and parallel efficiencies of the mCOSX implementation. (PDF)



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. ORCID

Toni M. Maier: 0000-0001-5571-7576 Yasuhiro Ikabata: 0000-0003-3811-2908 Hiromi Nakai: 0000-0001-5646-2931 Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS As an International Research Fellow of the Japan Society for the Promotion of Science (Postdoctoral Fellowships for Research in Japan (Standard)), T.M.M. thanks the JSPS for financial support via a postdoctoral scholarship. We further acknowledge financial support from the JSPS through KAKENHI Grant Number JP18K14184 and KAKENHI Grant Number 17F17818.



REFERENCES

(1) Pyykkö, P. Relativistic Effects in Chemistry: More Common Than You Thought. Annu. Rev. Phys. Chem. 2012, 63, 45−64. (2) Saue, T. Relativistic Hamiltonians for Chemistry: A Primer. ChemPhysChem 2011, 12, 3077−3094. (3) Liu, W. Ideas of relativistic quantum chemistry. Mol. Phys. 2010, 108, 1679−1706. (4) Cremer, D.; Zou, W.; Filatov, M. Dirac-exact relativistic methods: the normalized elimination of the small component method. WIREs Comput. Mol. Sci. 2014, 4, 436−467. Q

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

order two-component wave functions. J. Chem. Phys. 2009, 130, 164114. (54) Oyama, T.; Ikabata, Y.; Seino, J.; Nakai, H. Relativistic density functional theory with picture-change corrected electron density based on infinite-order Douglas-Kroll-Hess method. Chem. Phys. Lett. 2017, 680, 37−43. (55) Knecht, S.; Fux, S.; van Meer, R.; Visscher, L.; Reiher, M.; Saue, T. Mö ssbauer spectroscopy for heavy elements: a relativistic benchmark study of mercury. Theor. Chem. Acc. 2011, 129, 631−650. (56) Peng, D.; Liu, W.; Xiao, Y.; Cheng, L. Making four- and twocomponent relativistic density functional methods fully equivalent based on the idea of ”from atoms to molecule”. J. Chem. Phys. 2007, 127, 104106. (57) van Wüllen, C.; Michauk, C. Accurate and efficient treatment of two-electron contributions in quasirelativistic high-order DouglasKroll density-functional calculations. J. Chem. Phys. 2005, 123, 204113. (58) Heß, B. A.; Marian, C. M.; Wahlgren, U.; Gropen, O. A meanfield spin-orbit method applicable to correlated wavefunctions. Chem. Phys. Lett. 1996, 251, 365−371. (59) Peralta, J. E.; Scuseria, G. E. Relativistic all-electron twocomponent self-consistent density functional calculations including one-electron scalar and spin−orbit effects. J. Chem. Phys. 2004, 120, 5875−5881. (60) Seino, J.; Nakai, H. Local unitary transformation method toward practical electron correlation calculations with scalar relativistic effect in large-scale molecules. J. Chem. Phys. 2013, 139, 034109. (61) Seino, J.; Nakai, H. Local unitary transformation method for large-scale two-component relativistic calculations: Case for a oneelectron Dirac Hamiltonian. J. Chem. Phys. 2012, 136, 244102. (62) Peng, D.; Reiher, M. Local relativistic exact decoupling. J. Chem. Phys. 2012, 136, 244108. (63) Peralta, J. E.; Uddin, J.; Scuseria, G. E. Scalar relativistic allelectron density functional calculations on periodic systems. J. Chem. Phys. 2005, 122, 084108. (64) Neese, F.; Wennmohs, F.; Hansen, A.; Becker, U. Efficient, approximate and parallel Hartree-Fock and hybrid DFT calculations. A ‘chain-of-spheres’ algorithm for the Hartree-Fock exchange. Chem. Phys. 2009, 356, 98−109. (65) Izsák, R.; Neese, F.; Klopper, W. Robust fitting techniques in the chain of spheres approximation to the Fock exchange: The role of the complementary space. J. Chem. Phys. 2013, 139, 094111. (66) Petrenko, T.; Kossmann, S.; Neese, F. Efficient time-dependent density functional theory approximations for hybrid density functionals: Analytical gradients and parallelization. J. Chem. Phys. 2011, 134, 054116. (67) Eichkorn, K.; Treutler, O.; Ö hm, H.; Häser, M.; Ahlrichs, R. Auxiliary basis sets to approximate Coulomb potentials. Chem. Phys. Lett. 1995, 240, 283−290. (68) Bauernschmitt, R.; Häser, M.; Treutler, O.; Ahlrichs, R. Calculation of excitation energies within time-dependent density functional theory using auxiliary basis set expansions. Chem. Phys. Lett. 1997, 264, 573−578. (69) Whitten, J. L. Coulombic potential energy integrals and approximations. J. Chem. Phys. 1973, 58, 4496−4501. (70) Bleiziffer, P.; Heßelmann, A.; Görling, A. Resolution of identity approach for the Kohn-Sham correlation energy within the exactexchange random-phase approximation. J. Chem. Phys. 2012, 136, 134102. (71) Vahtras, O.; Almlöf, J.; Feyereisen, M. W. Direct atomic-orbitalbased time-dependent Hartree-Fock of frequency-dependent polarizabilities. Chem. Phys. Lett. 1993, 213, 514−518. (72) Dunlap, B. I.; Connolly, J. W. D.; Sabin, J. R. On some approximations in applications of Xα theory. J. Chem. Phys. 1979, 71, 3396−3402. (73) Mintmire, J. W.; Dunlap, B. I. Fitting the Coulomb potential variationally in linear-combination-of-atomic-orbitals density-functional calculations. Phys. Rev. A: At., Mol., Opt. Phys. 1982, 25, 88−95.

(31) Reiher, M.; Wolf, A. Exact decoupling of the Dirac Hamiltonian. II. The generalized Douglas−Kroll−Hess transformation up to arbitrary order. J. Chem. Phys. 2004, 121, 10945−10956. (32) Chang, C.; Pelissier, M.; Durand, P. Regular Two-Component Pauli-Like Effective Hamiltonians in Dirac Theory. Phys. Scr. 1986, 34, 394−404. (33) van Lenthe, E.; Baerends, E. J.; Snijders, J. G. Relativistic regular two-component Hamiltonians. J. Chem. Phys. 1993, 99, 4597− 4610. (34) Dyall, K. G.; van Lenthe, E. Relativistic regular approximations revisited: An infinite-order relativistic approximation. J. Chem. Phys. 1999, 111, 1366−1372. (35) Nakajima, T.; Hirao, K. A new relativistic theory: a relativistic scheme by eliminating small components (RESC). Chem. Phys. Lett. 1999, 302, 383−391. (36) Barysz, M.; Sadlej, A. J.; Snijders, J. G. Nonsingular Two/OneComponent Relativistic Hamiltonians Accurate Through Arbitrary High Order in α2. Int. J. Quantum Chem. 1997, 65, 225−239. (37) Barysz, M.; Sadlej, A. J. Infinite-order two-component theory for relativistic quantum chemistry. J. Chem. Phys. 2002, 116, 2696− 2704. (38) Liu, W.; Peng, D. Infinite-order quasirelativistic density functional method based on the exact matrix quasirelativistic theory. J. Chem. Phys. 2006, 125, 044102. (39) Kutzelnigg, W.; Liu, W. Quasirelativistic theory equivalent to fully relativistic theory. J. Chem. Phys. 2005, 123, 241102. (40) Dyall, K. G. Interfacing relativistic and nonrelativistic methods. I. Normalized elimination of the small component in the modified Dirac equation. J. Chem. Phys. 1997, 106, 9618−9626. (41) Iliaš, M.; Saue, T. An infinite-order two-component relativistic Hamiltonian by a simple one-step transformation. J. Chem. Phys. 2007, 126, 064102. (42) Peng, D.; Reiher, M. Exact decoupling of the relativistic Fock operator. Theor. Chem. Acc. 2012, 131, 1081. (43) Peng, D.; Middendorf, N.; Weigend, F.; Reiher, M. An efficient implementation of two-component relativistic exact-decoupling methods for large molecules. J. Chem. Phys. 2013, 138, 184105. (44) Nakajima, T.; Hirao, K. Extended Douglas−Kroll transformations applied to the relativistic many-electron Hamiltonian. J. Chem. Phys. 2003, 119, 4105−4111. (45) Baerends, E. J.; Schwarz, W. H. E.; Schwerdtfeger, P.; Snijders, J. G. Relativistic atomic orbital contractions and expansions: magnitudes and explanations. J. Phys. B: At., Mol. Opt. Phys. 1990, 23, 3225−3240. (46) Kellö, V.; Sadlej, A. J. Picture Change and Calculations of Expectation Values in Approximate Relativistic Theories. Int. J. Quantum Chem. 1998, 68, 159−174. (47) Wolf, A.; Reiher, M. Exact decoupling of the Dirac Hamiltonian. III. Molecular properties. J. Chem. Phys. 2006, 124, 064102. (48) Wolf, A.; Reiher, M. Exact decoupling of the Dirac Hamiltonian. IV. Automated evaluation of molecular properties within the Douglas-Kroll-Hess theory up to arbitrary order. J. Chem. Phys. 2006, 124, 064103. (49) Seino, J.; Uesugi, W.; Hada, M. Expectation values in twocomponent relativistic theories. J. Chem. Phys. 2010, 132, 164108. (50) Seino, J.; Hada, M. Examination of accuracy of electron− electron Coulomb interactions in two-component relativistic methods. Chem. Phys. Lett. 2008, 461, 327−331. (51) Malkin, I.; Malkina, O. L.; Malkin, V. G.; Kaupp, M. Scalar relativistic calculations of hyperfine coupling tensors using the Douglas−Kroll−Hess method. Chem. Phys. Lett. 2004, 396, 268−276. (52) Hayami, M.; Seino, J.; Nakai, H. Gauge-origin independent formalism of two-component relativistic framework based on unitary transformation in nuclear magnetic shielding constant. J. Chem. Phys. 2018, 148, 114109. (53) Barysz, M.; Mentel, Ł.; Leszczyński, J. Recovering fourcomponent solutions by the inverse transformation of the infiniteR

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation (74) Neese, F. An Improvement of the Resolution of the Identity Approximation for the Formation of the Coulomb Matrix. J. Comput. Chem. 2003, 24, 1740−1747. (75) Weigend, F. Hartree−Fock Exchange Fitting Basis Sets for H to Rn. J. Comput. Chem. 2008, 29, 167−175. (76) Weigend, F. A fully direct RI-HF algorithm: Implementation, optimised auxiliary basis sets, demonstration of accuracy and efficiency. Phys. Chem. Chem. Phys. 2002, 4, 4285−4291. (77) Werner, H.-J.; Manby, F. R.; Knowles, P. J. Fast linear scaling second-order Møller-Plesset perturbation theory (MP2) using local and density fitting approximations. J. Chem. Phys. 2003, 118, 8149− 8160. (78) Plessow, P.; Weigend, F. Seminumerical Calculation of the Hartree-Fock Exchange Matrix: Application to Two-Component Procedures and Efficient Evaluation of Local Hybrid Density Functionals. J. Comput. Chem. 2012, 33, 810−816. (79) Friesner, R. A. An Automatic Grid Generation Scheme for Pseudospectral Self-Consistent Field Calculations on Polyatomic Molecules. J. Phys. Chem. 1988, 92, 3091−3096. (80) Friesner, R. A. Solution of self-consistent field electronic structure equations by a pseudospectral method. Chem. Phys. Lett. 1985, 116, 39−43. (81) Friesner, R. A. Solution of the Hartree-Fock equations for polyatomic molecules by a pseudospectral method. J. Chem. Phys. 1987, 86, 3522−3531. (82) Ko, C.; Malick, D. K.; Braden, D. A.; Friesner, R. A.; Martínez, T. J. Pseudospectral time-dependent density functional theory. J. Chem. Phys. 2008, 128, 104103. (83) Janesko, B. G.; Scalmani, G.; Frisch, M. J. Practical auxiliary basis implementation of Rung 3.5 functionals. J. Chem. Phys. 2014, 141, 034103. (84) Proynov, E.; Liu, F.; Shao, Y.; Kong, J. Improved self-consistent and resolution-of-identity approximated Becke’05 density functional model of nondynamic electron correlation. J. Chem. Phys. 2012, 136, 034102. (85) Proynov, E.; Shao, Y.; Kong, J. Efficient self-consistent DFT calculation of nondynamic correlation based on the B05 method. Chem. Phys. Lett. 2010, 493, 381−385. (86) Ochsenfeld, C.; White, C. A.; Head-Gordon, M. Linear and sublinear scaling formation of Hartree-Fock-type exchange matrices. J. Chem. Phys. 1998, 109, 1663−1669. (87) Laqua, H.; Kussmann, J.; Ochsenfeld, C. Efficient and LinearScaling Seminumerical Method for Local Hybrid Density Functionals. J. Chem. Theory Comput. 2018, 14, 3451−3458. (88) White, C. A.; Head-Gordon, M. A J matrix engine for density functional theory calculations. J. Chem. Phys. 1996, 104, 2620−2629. (89) Manzer, S.; Horn, P. R.; Mardirossian, N.; Head-Gordon, M. Fast, accurate evaluation of exact exchange: The occ-RI-K algorithm. J. Chem. Phys. 2015, 143, 024113. (90) Reine, S.; Tellgren, E.; Krapp, A.; Kjærgaard, T.; Helgaker, T.; Jansik, B.; Høst, S.; Salek, P. Variational and robust density fitting of four-center two-electron integrals in local metrics. J. Chem. Phys. 2008, 129, 104101. (91) Sierka, M.; Hogekamp, A.; Ahlrichs, R. Fast evaluation of the Coulomb potential for electron densities using multipole accelerated resolution of identity approximation. J. Chem. Phys. 2003, 118, 9136− 9148. (92) Schwegler, E.; Challacombe, M. Linear scaling computation of the Hartree−Fock exchange matrix. J. Chem. Phys. 1996, 105, 2726− 2734. (93) Häser, M.; Ahlrichs, R. Improvements on the Direct SCF Method. J. Comput. Chem. 1989, 10, 104−111. (94) Almlöf, J.; Fægri, K., Jr.; Korsell, K. Principles for a Direct SCF Approach to LCAO-MO Ab-initio Calculations. J. Comput. Chem. 1982, 3, 385−399. (95) Nakajima, T.; Hirao, K. Pseudospectral approach to relativistic molecular theory. J. Chem. Phys. 2004, 121, 3438−3445. (96) Jacob, C. R.; Reiher, M. Spin in Density-Functional Theory. Int. J. Quantum Chem. 2012, 112, 3661−3684.

(97) de Castro, E. V. R.; Jorge, F. E. Accurate universal Gaussian basis set for all atoms of the Periodic Table. J. Chem. Phys. 1998, 108, 5225−5229. (98) Helgaker, T.; Jørgensen, P.; Olsen, J. Molecular ElectronicStructure Theory; John Wiley & Sons, 2000; Chapter 9. (99) Hayami, M.; Seino, J.; Nakai, H. Accompanying coordinate expansion and recurrence relation method using a transfer relation scheme for electron repulsion integrals with high angular momenta and long contractions. J. Chem. Phys. 2015, 142, 204110. (100) Ishimura, K.; Nagase, S. A new algorithm of two-electron repulsion integral calculations: a combination of Pople−Hehre and McMurchie−Davidson methods. Theor. Chem. Acc. 2008, 120, 185− 189. (101) Rys, J.; Dupuis, M.; King, H. F. Computation of electron repulsion integrals using the Rys quadrature method. J. Comput. Chem. 1983, 4, 154−156. (102) Dupuis, M.; Rys, J.; King, H. F. Evaluation of molecular integrals over Gaussian basis functions. J. Chem. Phys. 1976, 65, 111− 116. (103) King, H. F. Strategies for Evaluation of Rys Roots and Weights. J. Phys. Chem. A 2016, 120, 9348−9351. (104) Bahmann, H.; Kaupp, M. Efficient Self-Consistent Implementation of Local Hybrid Functionals. J. Chem. Theory Comput. 2015, 11, 1540−1548. (105) Maier, T. M.; Bahmann, H.; Kaupp, M. Efficient Seminumerical Implementation of Global and Local Hybrid Functionals for Time-Dependent Density Functional Theory. J. Chem. Theory Comput. 2015, 11, 4226−4237. (106) van Wüllen, C. A hybrid method for the evaluation of the matrix elements of the Coulomb potential. Chem. Phys. Lett. 1995, 245, 648. (107) Jaramillo, J.; Scuseria, G. E.; Ernzerhof, M. Local hybrid functionals. J. Chem. Phys. 2003, 118, 1068−1073. (108) Maier, T. M.; Arbuznikov, A. V.; Kaupp, M. Local hybrid functionals: Theory, implementation, and performance of an emerging new tool in quantum chemistry and beyond. WIREs Comput. Mol. Sci. 2019, 9, No. e1378. (109) Becke, A. D. Real-space post-Hartree-Fock correlation models. J. Chem. Phys. 2005, 122, 064101. (110) Kussmann, J.; Ochsenfeld, C. Pre-selective screening for matrix elements in linear-scaling exact exchange calculations. J. Chem. Phys. 2013, 138, 134114. (111) Hayami, M.; Seino, J.; Nakajima, Y.; Nakano, M.; Ikabata, Y.; Yoshikawa, T.; Oyama, T.; Hiraga, K.; Hirata, S.; Nakai, H. RAQET: Large-Scale Two-Component Relativistic Quantum Chemistry Program Package. J. Comput. Chem. 2018, 39, 2333−2344. (112) Guidon, M.; Schiffmann, F.; Hutter, J.; VandeVondele, J. Ab initio molecular dynamics using hybrid density functionals. J. Chem. Phys. 2008, 128, 214104. (113) Treutler, O.; Ahlrichs, R. Efficient molecular numerical integration schemes. J. Chem. Phys. 1995, 102, 346−354. (114) Maier, T. M. Development of Local Hybrid Functionals for Time-Dependent Density Functional Theory. Doctorial Thesis, Technische Universität Berlin, 2016. DOI: 10.14279/depositonce5625. (115) Dunning, T. H., Jr. Gaussian basis sets for use in correlated molecular calculations. I. The atoms boron through neon and hydrogen. J. Chem. Phys. 1989, 90, 1007−1023. (116) Woon, D. E.; Dunning, T. H., Jr. Gaussian basis sets for use in correlated molecular calculations. III. The atoms aluminum through argon. J. Chem. Phys. 1993, 98, 1358−1371. (117) Noro, T.; Sekiya, M.; Koga, T. Sapporo-(DKH3)-nZP (n = D, T, Q) sets for the sixth period s-, d-, and p-block atoms. Theor. Chem. Acc. 2013, 132, 1363.

S

DOI: 10.1021/acs.jctc.9b00228 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX