Efficient Computation of Exchange Energy Density with Gaussian

Apr 27, 2017 - Practical Density Functionals beyond the Overdelocalization–Underbinding Zero-Sum Game. Benjamin G. Janesko , Emil Proynov , Jing Kon...
0 downloads 8 Views 498KB Size
Subscriber access provided by UB + Fachbibliothek Chemie | (FU-Bibliothekssystem)

Article

Efficient Computation of Exchange Energy Density with Gaussian Basis Functions Fenglai Liu, and Jing Kong J. Chem. Theory Comput., Just Accepted Manuscript • Publication Date (Web): 27 Apr 2017 Downloaded from http://pubs.acs.org on April 28, 2017

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Chemical Theory and Computation is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Efficient Computation of Exchange Energy Density with Gaussian Basis Functions Fenglai Liu and Jing Kong∗ Department of Chemistry, Middle Tennessee State University, TN, 37130 E-mail: [email protected]



To whom correspondence should be addressed

1

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Abstract Density Functional Theory(DFT) is widely applied in chemistry and Physics. Still it fails to predict correctly quantitatively or even qualitatively for systems with significant nondynamic correlation. Several DFT functionals were proposed in recently years to treat the nondynamic correlation, most of which added the exact exchange energy density as a new variable. This quantity, calculated as Hartree-Fock (HF) exchange energy density, is the computational bottleneck for calculations with these new functionals. We present an implementation of an efficient seminumerical algorithm in this paper as a solution for this computational bottleneck. The method scales quadratically with respect to the molecular size and the basis set size. The scheme, exact for the purpose of computing the HF exchange energy density, is favored for medium-sized basis sets and can be competitive even for large basis sets with efficient grids when compared with our previous approximate resolution-of-identity scheme. It can also be used as a seminumerical integration scheme to compute the HF exchange energy and matrix on a standard atom-centered grid. Calculations on a series of alanine peptides show that the seminumerical scheme becomes competitive for large basis sets to the conventional analytical method and can be about six times faster for aug-cc-pvtz basis. The practicality of the algorithm is demonstrated through a local hybrid self-consistent calculation of the acenes-20 molecule.

Introduction Density Functional Theory(DFT) has made tremendous impact on the application of quantum mechanics in chemistry and physics. 1 At the core of DFT is the exchange-correlation(XC) functional that uses electron density to describe quantum mechanical effects under the meanfield formalism of the Kohn-Sham scheme. The progression of widely applied functionals can be characterized by the variables included in these functionals, from local density approximation (LDA) that uses electron density only, to general-gradient-approximation (GGA) with the gradient of the electron density added, and to meta-GGA with the kinetic energy 2

ACS Paragon Plus Environment

Page 2 of 31

Page 3 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

density and/or the Laplacian of the electron density further added. This depiction fits the Jacob’s Ladder 2 going from the 1st-rung for LDA to the 3rd for meta-GGA. These functionals are often mixed with a fraction of Hartree-Fock exchange, resulting in the so-called hybrid functionals. Currently a number of hybrid GGA and meta-GGA functionals are extensively applied in many fields of science and engineering. 3–7 Still, further development of DFT functionals is needed, especially for the nondynamic correlation, where current DFT functionals fail to count for. 8 The nondynamic correlation arises because an electronic state can not be well described by a single Slater determinant. Instead a linear combination of degenerated or nearly degenerated Slater determinants is needed for correct description of the electronic state. This problem manifests in the prediction of barriers of chemical reactions, band gaps of materials, bond dissociations of molecules etc. 9 Methods to treat nondynamic correlation within the single-determinant KS DFT scheme have been proposed in the last decade. Early methods, such as Becke03 (B03), 10 Becke05 (B05) 11 by Becke, and Perdew-Staroverov-Tao-Scuseria (PSTS) by Perdew and co-workers, 12 measure the delocalization of the exact exchange in real-space and compensate for this effect with a local correction serving as nondynamic correlation. We implemented B05 and PSTS self-consistently and re-optimized the parameters for B05. 13–16 We demonstrated that these methods perform similarly to the mainstream methods on standard thermodynamic benchmarks, and do much better on molecules with significant nondynamic correlation, such as NO dimer. Recently, general single determinant model functionals have been proposed to treat very strong nondynamic correlations up to the dissociation of a covalent bond. 17,18 In this vein, Becke proposed the Becke13 (B13) method by adding a strong correlation term to B05 17,19 to count the correlation at the dissociation limit of a covalent bond. The potential of the method was demonstrated for transition-metal complexes 19 and the treatment of avoided crossings and conical interactions. 19–21 Our group recently developed another general functional that computes the nondynamic correlation of all strength with a single term. 18 The correlation potential part of the functional is borrowed from the

3

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 4 of 31

correlation potential of B13. The correlation kinetic part was modeled via adiabatic connection based on physical arguments and some exact conditions for both the weak and strong correlations. The preliminary results with self-consistent-field (SCF) implementation show that the method, with only three empirical parameters, recovers the majority of the left-right nondynamic/strong correlation upon bond dissociation and performs well for near equilibrium properties such as heats of formation and singlettriplet energy splittings of diradicals. Other methods based on B05 have also been proposed for strong correlation in the cases of fractional charge and fractional spin. 22,23 All the emerging functionals discussed above include the exact exchange energy density as a fundamental variable:

eex σ (r)

0 0 occ X occ Z X φ∗i,σ (r)φ∗j,σ (r)φi,σ (r )φj,σ (r ) 0 dr =− 0 |r − r | i j

,

(1)

where φσ denotes Kohn-Sham occupied orbital for a given spin state σ. Such a functional is categorized as hyper-GGA,the 4th-rung on the Jacob’s Ladder. 2 We note that this definition of the exact exchange energy density has the same formalism for the computation of the Hartree-Fock (HF) exchange energy density, and we will refer it as HF exchange energy density or exchange energy density henceforth. Besides those discussed above, other examples of hyper-GGA functionals include the so-called local hybrid functionals. 24–28 We note that functionals going beyond 4th-rung have also been proposed. 29–32 They include virtual orbitals in the formalism and require higher computational cost than the lower rungs of functionals that depend on the occupied orbitals only. Although quite a few of hyper-GGA functionals have been developed in the last decade, there have been rarely reports of their applications. One reason is the high cost to calculate the exchange energy density eex σ (r). It needs to be evaluated at each grid point with four atomic orbital (AO) indexes when Gaussian AO basis functions are used. Since the numerical integration for an XC functional involves several thousands to dozens of thousands of grid

4

ACS Paragon Plus Environment

Page 5 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

points for each atom, the exchange energy density could be many times more expensive to compute than the normal exchange energy, which calculates each four-AO electron repulsion integral (ERI) analytically. Gorling etc. 33 proposed an algorithm that uses the Gaussian basis functions themselves in a resolution-of-the-identity (RI) expansion of basis functions, resulting in a convenient formalism that requires only the standard HF exchange matrix. Its application to the implementation of PSTS showed that this method demands large, uncontracted basis sets, 34 which is rather inconvenient for routine calculations. In Becke’s non-SCF implementation of B05 and B13, the HF exchange energy density is calculated based on a multi-center Poisson solver 35 with numerical basis functions. In our previous SCF implementations of B05 and PSTS with Gaussian basis functions, 13–15 we employed a RI expansion for pairs of basis functions, which reduces the number of indexes to two per grid point. The algorithm applies to any molecular orbital from the standard basis set libraries. However, the RI approach has two problems. First it is not efficient for large molecules, as it scales cubically with respect to the molecular size. Second, the calculated exchange energy density value is not negative-definite. This has not been a major problem in practice if large auxiliary basis set is used, but does introduce extra complexity to the formulations of functionals as the occasional small positive values have to be removed. A more recent implementation of local hybrid functionals using a seminumerical HF scheme for the computation of HF exchange energy density showed promising computational performance. 36–38 In this work we report our implementation of a seminumerical algorithm to calculate the HF exchange energy density exactly. This scheme evaluates Eq. 1 by breaking it into several steps. We provide the cost analysis of each step with respect the basis set size and the molecular size. We show that the overall computational cost is quadratic scaling with respect the molecular size and the basis set size. Compared with our previous approximate RI scheme, this scheme is exact and more efficient for large molecules. A key component of the scheme is the application of an efficient integral algorithm recently developed 39 to compute the following two-center electrostatic potential (ESP), which is at the core computing the

5

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 31

HF exchange energy density: Z vµν (r) =

0

0

ϕµ (r )ϕν (r ) 0 dr |r − r0 |

,

(2)

ϕ is a general basis set function. This ESP integral is computed with analytical formalism. The above scheme to calculate HF exchange energy density also leads to a way to compute the HF exchange energy and matrix with numerical integration, giving rise to a seminumerical algorithm. The HF exchange energy can be computed by integrating the HF exchange energy density (Eq. 1) numerically :

Eσex

1 = 2

Z

eex σ (r)dr

.

(3)

Various methods have been proposed for computing HF exchange seminumerically. Frisner 40–44 proposed an efficient pseudospectral method for computation of HF exchange energy and matrix with Gaussian basis functions. Beside the Gaussian type basis functions, HF exchange energy and matrix can also be evaluated numerically with atom-centered numerical basis functions 35 and plane wave basis functions. 45–47 More recently, Neese et al. 36–38 have proposed a seminumerical method for HF and local hybrid functionals calculations, and these methods showed good accuracy and efficiency for large molecules with large basis sets. The present algorithm is essentially the same as that of Neese et al. except that a different integral engine for ESP is used. In this paper, we will discuss in detail the performance of the seminumerical algorithm with respect to basis sets and numerical quadrature.

Algorithms HF exchange energy density The exchange energy densities for spins α and β are needed at every grid point for the computation with a hyper-GGA functional. It can be defined through atomic orbital(AO) 6

ACS Paragon Plus Environment

Page 7 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

index: eex σ (r)

=−

XX µν

σ σ Pλη Pµν

Z

λη

0

0

ϕµ (r)ϕλ (r)ϕν (r )ϕη (r ) 0 dr |r − r0 |

.

(4)

σ is the density matrix with spin σ in AO basis. The In the equation, ϕ is an AO, and Pµν

complex conjugate sign is removed in comparison with Eq. 1 since the AO basis and density matrix are usually in the real form. Compared with the computation of the HF exchange energy, Eq. 4 involves looping over grid points plus looping over the four AO indices. Thus its computational cost appears to be equivalent to the cost of computing HF energy per grid point, which would be tremendously expensive. Here we implement an algorithm that computes Eq. 4 exactly in several steps such that the four-index loop can be avoided. First, the density matrix is combined with basis set value ϕ(r) through a BLAS(Basic Linear Algebra Subprograms) level 3 matrix-matrix multiplication: Qσν (r) =

X

σ ϕµ (r)Pµν

.

(5)

µ

The cost for this step scales in O(ng NB2 ), where NB is the number of AO basis functions, and ng is the average number of grid points that a basis function effectively reaches. The quantity ng does not change in general when the molecular size increases asymptotically. Thus the cost for this step increases quadratically with respect to the basis set size for the same molecule. When the size or the volume (V) of the molecule increases, the cost increases in O(V 2 ) assuming the density matrix is dense, since ng does not change for the same basis set. The next step is to calculate the ESP of the pairs of basis functions vνη (r) as shown in Eq. 2. Its cost scales in O(Ng NB2 ) with respect the basis set size for the same molecule and the total number of grid points Ng of the molecule. The number of the pairs of basis functions, however, scales linearly with the increase of the molecular size, i.e. in O(V ), since the two Gaussian basis functions have zero effective overlap beyond a certain distance. Therefore, the cost of this step scales in O(V 2 ) with respect to the molecular size. However, the

7

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 31

computation of each ESP integral involves multiple operations, unlike the previous step where the operation per basis function pair and grid point is just one multiplication/addition. First, the computation of the bottom integral requires a few dozen operations at least. Second, the computation effort scales quadratically with respect to the degree of contraction of the basis set. Thus, this step has the highest cost among all the steps for the seminumerical algorithm. Details of the computation of the ESP integral are discussed in the next subsection. The third step is to calculate the following quantity Rνσ (r): Rνσ (r) =

X

Qση (r)vνη (r) .

(6)

η

The scalings of this step with respect to the basis set size and the molecular size are the same as the second step, but it costs less than the latter, since it only includes one multiplication and one addition at the innermost loop. The last step is the formation of the HF energy density:

eex σ (r) = −

X

Qσν (r)Rνσ (r) .

(7)

ν

Its computational cost is in O(NB Ng ), and scales in O(V 2 ) asymptotically with respect to the molecular size. The rate-determined step for the exchange energy density calculation is the second step and the least costly step is the last one. Overall, the above algorithm scales quadratically with respect to the basis set size of the same molecule and with respect to the molecular size with the same basis set.

ESP integrals of basis function pairs The rate-determining step in the above procedure is the computation of ESP integrals. The ESP integral calculation is performed for all the basis function pairs per each grid point. Since the number of grid points for each atom involved for practical DFT applications is

8

ACS Paragon Plus Environment

Page 9 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

large, the key to achieve a productive exchange energy density calculation is to find efficient integral code for ESP integrals. Our implementation for ESP integrals uses a combination of the Obara-Saika(OS) 48,49 and Head-Gordon-Pople(HGP) 50 algorithms. The evaluation of ESP integral starts by de1 |0)(m) through the following vertical recurrence relariving a group of integrals (a| |r − r0 | tion(VRR) in the OS scheme:

((a + ιi )|

1 1 1 (m) (m) (m+1) = (Pi − Ai )(a| − (Pi − Ri )(a| 0 |0) 0 |0) 0 |0) |r − r | |r − r | |r − r |   Ni (A) 1 1 (m) (m+1) + ((a − ιi )| |0) − ((a − ιi )| |0) 2 |r − r0 | |r − r0 |

, (8)

2

In the above equation, a refers to a general Cartesian Gaussian function xm y n z l e−αr , and 0 0 2

is the Gaussian function e−α r without Cartesian component. ιi is the increment on x, y or z on the Cartesian part of Gaussian function:

a + ιx = xm+1 y n z l e−αr

2

.

(9)

2

Ni (A) is the m, n or l in the Cartesian part of Gaussian function xm y n z l e−αr . The definitions 1 |0)(m) is of other symbols can be found in Ref. 48 Additionally, the integral of (a| |r − r0 | expressed as: 1 2 (m) (a| =√ 0 |0) |r − r | π

Z

2 =√ π

Z



 du

0 ∞

 du

0

u2 ρ + u2

m

u2 ρ + u2

m Z

(a|

1 |0) |r − r0 | 2

0 2

xm y n z l e−αr e−α r dr |r − r0 |

.

(10)

When m = 0, such integral becomes the ESP integral. 1 In the OS scheme, the bottom integral (0| |0)(m) m = 0, 1, · · · are calculated |r − r0 | first, then one can raise up the angular momentum according to Eq. 8 until the target

9

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 31

1 |0) are reached. The next step is the transformation of the VRR output |r − r0 | 1 1 integral (a| |b) through “horizontal recurrence 0 |0) into the result integral (a| |r − r | |r − r0 | relation”(HRR): 50 integrals (a|

(a|

1 1 1 |b) . 0 |(b + ιi )) = ((a + ιi )| 0 |b) + (Ai − Bi )(a| |r − r | |r − r | |r − r0 |

(11)

A and B are the centers of the Cartesian Gaussian functions. By using a greedy search algorithm on VRR and HRR, 39 we are able to find the optimal recurrence relations structure which use less number of temporary variables so that both of FLOPS(Floating-point Operations Per Second) and CPU cache usage are optimized. Another improvement in computational efficiency is the employment of a compact hybrid 1 scheme to calculate the bottom integral (0| |0)(m) . 39 This scheme replaces the use of |r − r0 | the expensive incomplete Gamma functions with a combination of recurrence relations and error function calculation to derive the bottom integrals, therefore it’s much more efficient in terms of FLOPS evaluation.

Numerical integration for Hartree-Fock exchange energy and matrix The above scheme for calculating the HF exchange energy density also provides a seminumerical way to compute the HF exchange energy and matrix. The HF exchange energy can be calculated according to Eq. 3 on a quadrature once the HF exchange energy density is calculated. The HF exchange matrix can also be evaluated on a quadrature as the following: ∂Eσex σ = Fµν =− σ ∂Pλη

Z

ϕµ (r)Rνσ (r)dr

r

grid



X

ϕµ (r)Rνσ (r)w(r) ,

r

10

ACS Paragon Plus Environment

(12)

Page 11 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

where w(r) is the numerical weight of the grid points. As a comparison, a conventional analytic computation of HF exchange matrix is done the following way: σ Fµν =

X

σ Pλη (µλ|νη) ,

(13)

λη

where (µλ|νη) is the four-center ERI(Electron Repulsion Integral). Its computational cost scales O(V 2 ) with respect to the size of the molecule since the basis function pairs on either side the Coulomb operator have limited overlap range. It scales in O(N 4 ) with respect to the basis set size for the same molecule as all the four AO indexes are present in a single ERI. Both the analytic approach and the present seminumerical algorithm scale quadratically with respect to the molecular size. The seminumerical algorithm has the disadvantage of requiring thousands of grid points for even a small number of integrals. On the other hand, it has the advantage of better scaling in terms of the basis set size, as it only scales in O(NB2 ). Furthermore, the algorithm is favored for high contraction basis set. As shown from Eq. 4 to Eq. 7 all the values on the grid points are for contracted basis functions except the ESP integral calculation. ESP integral calculation scales up to O(K 2 ) with respect to the contraction, where K is the average degree of contraction for a basis set. In comparison, the analytical evaluation of the four center ERI scales up to O(K 4 ). We note that our computation steps are similar to those in Frisner’s pseudospectral method. 40–44 The pseudospectral method includes additional steps such as the optimization of the grid points. It provides a way for efficient Coulomb and exchange matrix calculation through “spectral space” over a designed small grid. Our motivation here is to compute the matrix with the established DFT quadrature. 51–54 The computation steps described above is essentially same with the chain of spheres implementations used for Hartree-Fock matrix and local hybrid functional calculations. 36–38 However, the present scheme uses a different integral engine for ESP integrals. It also does not use the density matrix for screening out negligible integrals, the so-called P junction 11

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 31

screening in the previous implementations. Compared with the implementation for local hybrid functional application, 38 we use a tight threshold (1.0×10−11 ) for the effective overlap of a pair of basis functions when testing the performance of our implementation.

Previous RI scheme We previously implemented the computation of the HF exchange energy density using the RI approximation. 13 We briefly present here the algorithm for the purpose of comparing with the exact scheme introduced in this paper. In the RI scheme the product of a pair of AO basis functions is approximated through a linear expansion of a set of predefined auxiliary basis functions: ϕµ (r)ϕλ (r) ≈

X

m Cµλ Xm (r) ,

(14)

m m m is determined is the RI fitting coefficient. Cµλ where Xm is an auxiliary basis function and Cµλ

as the following:

m Cµλ =

X

(Xm |

n

1 1 −1 (Xn | |ϕµ ϕλ ) . 0 |Xn ) |r − r | |r − r0 |

(15)

Based on the RI approximation the HF exchange energy density can be approximated as:

eex σ (r)



X

Bσmn w(r)Xm (r)

0

Z

Xn (r ) 0 dr |r − r0 |

mn

,

(16)

.

(17)

The matrix Bσmn is given as: Bσmn

=

X

σ σ m n Pµν Pλη Cµλ Cνη

=

occ X

m n Cij,σ Cij,σ

ij

µνλη

m m In Eq 17, Cij,σ is the occupied-MO transformation of the fitting coefficient Cµλ .

There are two main computational parts for this RI scheme. One is the part on the quadrature in Eq. 16. This cost scales in O(V 2 ) with respect to the molecular size since the

12

ACS Paragon Plus Environment

Page 13 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

2 auxiliary basis function has limited range in space. It scales in O(Naux ) in terms of the size

of the auxiliary basis set. The other part is outside of the grid, including the computation of the fitting coefficients and the subsequent Bσmn matrix. The computation of the fitting 3 coefficients has the higher cost. It scales in O(Naux ) and O(V 3 ) in terms of the size of the

auxiliary basis set and the molecular size, respectively, due to the matrix inversion in the calculation of the fitting coefficient in Eq. 15, and in O(NB2 ) with respect to the AO basis set size if the auxiliary basis set is kept the same. Compared with the exact scheme, the RI scheme has the advantage on the grid part as it requires much less number of ESP integrals because the number of auxiliary basis functions is much less than the number of the pairs of AO basis functions. The computation of the ESP integrals dominates the cost for the exact scheme, especially when the AO basis set becomes large. The non-grid part of the RI scheme has an advantage for small molecules where the numbers of AO and auxiliary basis functions are moderate. However, it grows cubically with respect to the molecular size, much faster than the exact scheme, making it less suitable for large molecules. Another drawback of the RI scheme is that the negative definiteness of the HF exchange energy density cannot be rigorously maintained with the RI approximation. This definiteness is a formal property assumed by DFT functionals. The grid points where the HF exchange energy density is positive must be abandoned in the calculation, which may affect the calculation accuracy as well as SCF stability. In practice, we have found that large auxiliary basis sets need to be used even for moderate sized AO basis sets in order to avoid this problem. Besides unfavorable scaling of the computing time with respect to the molecular size, a RI scheme also tends to require more memory because of the desire to store the fitting coefficients to save CPU time, the amount of which scales in O(V 2 ).

13

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Results and Discussions Performance comparison with RI scheme Table 1 shows the timing comparison between the exact scheme and the RI scheme for four small molecules. The calculation with RI scheme was performed using Q-Chem (version 4) 55 package, and the calculation with the exact scheme in this paper was carried out with our development program. Both Q-Chem and the development program are compiled with Intel compiler with O3 optimization. The standard atom-centered grid scheme is used, and the scheme uses Eular-Maclaurine formula for calculating the radius part 53 and Lebedev quadrature for the angular. 51,52 Two sizes of grid are tested. One is an often applied pruned grid named SG1 56 based on (50,194), i.e. 50 radial points and 194 angular points, with approximately 3000 points per atom. The other is unpruned (128,194), with about 24000 points per atom. All the calculations below are using single thread. The rimp2-aug-cc-pVTZ basis set 57 is used as the RI auxiliary basis set for both AO basis sets. Table 1 contains calculations with two basis sets of different sizes, namely 6-31G** and G3LARGE 58 on the two grids, respectively. As one can see, the exact scheme performs better than the RI scheme for the smaller basis set 6-31G** and worse for the larger G3LARGE basis set. As we analyzed earlier, the computational cost of the exact scheme scales quadratically with respect to the AO basis set size. With the RI scheme, on the other hand, the grid part of the computation scale linearly with respect to the AO basis set size. Since the molecules tested here have modest sizes, the grid part takes more computing time than the non-grid part, and the computing time for the exact scheme grows faster than the RI scheme when the size of the AO basis set is increased. On the other hand, the exact scheme benefits more from the reduction of the grid size for the same reason as evidenced by the reduction of the CPU time for the same molecule from the (128,94) grid to SG1 grid, e.g. five times reduction for pyrrole with G3LARGE with the exact scheme versus three times with the RI. When the molecular size is changed from a ten-atom pyrrole to an 18-atom naphthanlene,

14

ACS Paragon Plus Environment

Page 14 of 31

Page 15 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

the computing time for the RI scheme increases by about 7 times versus 4-5 times for the exact scheme, when the larger grid, (128,194), is used. The faster increase of computational cost for the RI scheme is consistent with our analysis that the non-grid part of the RI scheme has a cubic scaling with respect to the molecular size versus the quadratic scaling with the exact scheme. This also causes the larger increase of computing for the RI scheme when the smaller grid SG1 is used especially for the larger basis set G3LARGE. The relative efficiency of the exact scheme for larger molecules and more moderate grid size makes the exact scheme to be about as fast as the RI for G3LARGE basis set on SG1 grid. Comparative calculations with larger molecules were prevented by the limit on the storage of the RI fitting coefficients in Q-Chem implementation. We are pleased that the exact scheme presented here is competitive even for small molecules, for which RI is presumed to perform efficiently. With the efficient ESP integral code and the better arrangement of computation steps, the exact scheme is faster for basis sets of medium size, and only about two-times slower for large basis sets with large grids. In contrast, our previous implementation of the exact HF exchange energy density was about 100 times slower than the RI scheme with G3LARGE basis set on (128,194) grid set. 13 Table 1: Timing comparison between the present scheme described in this paper and RI scheme for one SCF iteration in seconds. grid set (128,194)

SG1

Method Present(6-31G**) RI(6-31G**) Present(G3LARGE) RI(G3LARGE) Present(6-31G**) RI(6-31G**) Present(G3LARGE) RI(G3LARGE)

Pyrrole Benzene Alanine 42 67 79 57 98 121 200 306 341 75 136 160 5.8 9 16 14 16 19 41 76 75 24 54 54

Naphthalene 164 400 1099 554 28 68 281 217

To demonstrate the practicality of our implementation, We performed an SCF calculation with a 4th-rung functional B05 11 on an acenes-20 molecule as shown in Fig. 1 with cc-pvdz 15

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

basis set. Acenes have been shown to become increasingly polyradical as its size grows. 59,60 Acenes-20 is composed of 20 benzene rings with 126 atoms, and has 1368 basis functions with cc-pvdz. We use two pruned grids for calculation, one is the SG1 grid and the other is a pruned (50,434) grid for the first row atoms. The latter is obtained from the PQS program 61 . For SG1 grid set each atom contains approximately 3000 grid points, for the fine grid set each atom contains about 9000 grids. The integral threshold is set to 1.0 × 10−10 and the threshold for numerical part is set to 1.0 × 10−12 . The SCF process was carried out on Intel Xeon E5-1650(3.2GHZ) CPU with 6 hardware threads. For the SG1 grid, each SCF cycle took 740 seconds; for the PQS fine grid each SCF cycle took 1625 seconds.

Figure 1: Acenes-20 molecule

Comparison with analytic computation of Hartree-Fock exchange The HF exchange energy density can also be used to compute the HF exchange energy and matrix through numerical integration. The algorithm and its cost are presented in the previous section. Here we demonstrate its accuracy and efficiency through series of calculations. Tables 2 illustrates the accuracy of the numerical integration, using a series of alanine peptides of Alpha helix ranging from 6 to 10 alanines. Analytical results are used as the reference. Dunning basis set aug-cc-pvdz 62 is employed. Both numerical and analytical exchange matrix calculations use the same input density matrix. Integral threshold value 1.0 × 10−10 is applied for analytical exchange matrix calculation to screen out the insignificant electron repulsion integrals. The seminumerical computation of the exchange matrix 16

ACS Paragon Plus Environment

Page 16 of 31

Page 17 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

employed threshold value 1.0 × 10−11 to generate significant shell pair data for the ESP integral evaluation. Three numerical grids are tested, namely (128,302), (50,194) and SG1. The errors are reported in several forms, including the maximum absolute deviation (Max Dev) in the elements of the HF exchange matrix, the root-mean-square deviation (RMSD) of the HF exchange matrix, and the total energy difference and the difference per atom, all in Hartree. As one can see from the table, the numerical integration yields consistently small errors for molecules of different sizes. The very fine grid (128,302) yields better accuracy as expected. The pruning of SG1 based on the (50,194) grid causes little extra error. Overall SG1 is an acceptable choice of grid due to its small error and computational efficiency. As one can see from the table, the numerical integration yields consistently small errors for molecules of different sizes. The very fine grid (128,302) yields better accuracy as expected. The pruning of SG1 based on the (50,194) grid causes little extra error, about the same level of error for RI HF exchange. 36 Overall SG1 is an acceptable choice of grid due to its small error and computational efficiency. We also benchmarked the error and the computational performance of the seminumerical scheme for larger molecules and basis sets with the SG1 grid. The molecules are analine peptides of various sizes. Four basis sets were tested, namely 6-31G**, cc-pvdz, aug-cc-pvdz and aug-cc-pvtz. The results are listed in Tables 3 to 6. One can see that adding extra diffuse functions, e.g. aug-cc-pvdz versus cc-pvdz, increases the maximum deviation and the RMSD somewhat, but the difference in total energy per atom changes little. Overall, the errors of the numerical integration are stable for molecules of different sizes with different basis sets, and stay below or about 1.0 × 10−4 Hartree per atom in total energy. For the comparison in performance, the CPU timings corresponding to the calculations in Tables 3 to 6 are shown in Figures 2 to 5. The timing of the standard analytical exchange matrix calculation is used as the reference, and the timing ratios between numerical and analytical calculations are listed the figures. The calculations were performed on Intel

17

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Table 2: Accuracy assessment for exchange matrix calculation with aug-cc-pvdz basis function on different grid sets Moleculea ala6(1001)

grid MAX Dev.b (128,302) 1.7E-4 (50,194) 2.8E-4 SG1 2.8E-4 ala7(1161) (128,302) 1.5E-4 (50,194) 2.8E-4 SG1 2.8E-4 ala8(1321) (128,302) 1.2E-4 (50,194) 2.9E-4 SG1 2.9E-4 ala9(1481) (128,302) 1.4E-4 (50,194) 3.4E-4 SG1 3.4E-4 ala10(1641) (128,302) 1.5E-4 (50,194) 4.3E-4 SG1 4.3E-4

RMSDc 3.3E-6 8.3E-6 8.3E-6 3.3E-6 7.0E-6 7.1E-6 3.2E-6 6.4E-6 6.4E-6 3.5E-6 6.7E-6 6.7E-6 3.5E-6 6.7E-6 6.8E-6

E diff.d E diff. per atome 1.09E-3 1.73E-5 3.57E-3 5.67E-5 3.54E-3 5.63E-5 1.44E-3 1.97E-5 2.31E-3 3.16E-5 2.22E-3 3.04E-5 0.61E-3 0.73E-5 1.53E-3 1.84E-5 1.46E-3 1.76E-5 0.11E-3 0.11E-5 3.82E-3 4.11E-5 3.85E-3 4.14E-5 1.11E-3 1.08E-5 6.65E-3 6.46E-5 6.77E-3 6.58E-5

a

The number in parenthesis is the number of basis functions. MAX Dev is maximum absolute deviation. c RMSD is root-mean-square deviation. d E diff. is the exchange energy difference between the seminumerical and analytical schemes, unit is in Hartree e E diff. per atom is the exchange energy difference per atom, unit is in Hartree b

18

ACS Paragon Plus Environment

Page 18 of 31

Page 19 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Table 3: Accuracy assessment for exchange matrix calculation with cc-pvdz basis function on SG1 grid Molecule MAX Dev. ala10(974) 2.1E-4 ala11(1069) 2.2E-4 ala12(1164) 2.2E-4 ala13(1259) 2.5E-4 ala14(1354) 2.6E-4 ala15(1449) 2.7E-4 ala16(1544) 2.7E-4 ala17(1639) 2.7E-4 ala18(1734) 2.6E-4 ala19(1829) 2.5E-4 ala20(1924) 3.0E-4 ala21(2019) 3.4E-4 ala22(2114) 3.6E-4 ala23(2209) 3.7E-4 ala24(2304) 3.8E-4 ala25(2399) 3.8E-4

RMSD 3.2E-6 3.1E-6 3.0E-6 2.9E-6 2.9E-6 2.9E-6 2.9E-6 2.8E-6 2.8E-6 2.9E-6 2.8E-6 2.8E-6 2.8E-6 2.7E-6 2.7E-6 2.7E-6

E diff. E diff. per atom 7.86E-3 7.63E-5 1.04E-2 9.19E-5 1.11E-2 9.03E-5 1.09E-2 8.23E-5 1.18E-2 4.53E-5 1.19E-2 7.75E-5 1.38E-2 8.49E-5 1.50E-2 8.66E-5 1.49E-2 8.13E-5 1.48E-2 7.67E-5 1.62E-2 8.00E-5 1.78E-2 8.34E-5 1.96E-2 8.79E-5 2.04E-2 8.76E-5 2.27E-2 9.35E-5 2.62E-2 1.04E-4

Table 4: Accuracy assessment for exchange matrix calculation with aug-cc-pvdz basis function on SG1 grid Molecule MAX Dev. ala10(1641) 4.3E-4 ala11(1801) 4.5E-4 ala12(1961) 4.3E-4 ala13(2121) 4.8E-4 ala14(2281) 5.4E-4 ala15(2441) 5.6E-4 ala16(2601) 5.6E-4 ala17(2761) 5.8E-4 ala18(2921) 5.7E-4 ala19(3081) 5.5E-4 ala20(3241) 5.3E-4 ala21(3401) 5.4E-4 ala22(3561) 6.0E-4 ala23(3721) 6.3E-4 ala24(3881) 6.7E-4 ala25(4041) 6.9E-4

RMSD 6.7E-6 6.4E-6 6.9E-6 6.1E-6 6.1E-6 6.1E-6 6.0E-6 6.0E-6 6.0E-6 6.0E-6 6.0E-6 6.0E-6 6.0E-6 6.0E-6 6.0E-6 6.0E-6

19

E diff. E diff. per atom 6.78E-3 6.58E-5 9.81E-3 8.68E-5 7.87E-3 6.40E-5 5.06E-3 3.81E-5 2.95E-3 1.09E-5 8.05E-4 5.26E-6 2.23E-3 1.37E-5 6.62E-3 3.83E-5 9.40E-3 5.14E-5 1.43E-2 7.39E-5 1.37E-2 6.74E-5 1.42E-2 6.68E-5 2.14E-2 9.61E-5 2.00E-2 8.59E-5 2.54E-2 1.05E-4 3.75E-2 1.48E-4

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Table 5: Accuracy assessment for exchange matrix calculation with aug-cc-pvtz basis function on SG1 grid Molecule MAX Dev. ala6(2162) 4.3E-4 ala7(2507) 3.9E-4 ala8(2852) 4.3E-4 ala9(3197) 4.6E-4 ala10(3542) 5.1E-4 ala11(3887) 7.1E-4 ala12(4232) 7.9E-4 ala13(4577) 7.9E-4 ala14(4922) 7.5E-4 ala15(5267) 7.4E-4 ala16(5612) 7.6E-4

RMSD 7.9E-6 7.1E-6 6.2E-6 6.1E-6 6.1E-6 5.9E-6 5.6E-6 5.5E-6 5.5E-6 5.4E-6 5.5E-6

E diff. E diff. per atom 3.41E-3 5.42E-5 3.27E-4 4.48E-6 1.36E-3 1.64E-5 1.14E-3 1.23E-5 1.37E-3 1.33E-5 2.00E-3 1.77E-5 4.17E-3 3.39E-5 8.42E-4 6.33E-6 2.24E-3 1.56E-5 5.72E-3 3.74E-5 1.09E-2 6.66E-5

Table 6: Accuracy assessment for exchange matrix calculation with 6-31G** basis function on SG1 grid Molecule MAX Dev. ala10(1025) 1.9E-4 ala11(1125) 1.8E-4 ala12(1225) 2.1E-4 ala13(1325) 2.3E-4 ala14(1425) 2.4E-4 ala15(1525) 2.3E-4 ala16(1625) 2.3E-4 ala17(1725) 2.3E-4 ala18(1825) 2.3E-4 ala19(1925) 2.3E-4 ala20(2025) 2.6E-4 ala21(2125) 3.0E-4 ala22(2225) 3.1E-4 ala23(2325) 3.2E-4 ala24(2425) 3.3E-4 ala25(2525) 3.3E-4

RMSD 2.5E-6 2.3E-6 2.3E-6 2.3E-6 2.2E-6 2.3E-6 2.2E-6 2.2E-6 2.2E-6 2.2E-6 2.2E-6 2.2E-6 2.1E-6 2.1E-6 2.1E-6 2.0E-6

20

E diff. E diff. per atom 7.59E-3 7.37E-5 9.55E-3 8.45E-5 1.04E-2 8.46E-5 1.05E-2 7.93E-5 1.08E-2 5.19E-5 1.05E-2 6.89E-5 1.25E-2 7.65E-5 1.40E-2 8.07E-5 1.50E-2 8.18E-5 1.59E-2 8.22E-5 1.79E-2 8.81E-5 2.00E-2 9.40E-5 2.04E-2 9.14E-5 2.13E-2 9.13E-5 2.38E-2 9.78E-5 2.56E-2 1.01E-4

ACS Paragon Plus Environment

Page 20 of 31

Page 21 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Xeon E5-1650 at 3.20GHz. Both numerical and analytical calculations were done in parallel with six hardware threads. As one can see, the numerical scheme performs worse than the analytical one for double-zeta basis sets 6-31G** and cc-pvdz, but better for triple-zeta or when diffuse functions are added to the basis sets. Between the two double-zeta basis sets, the numerical integration catches up to the analytical integration somewhat as its CPU-time ratio to the analytical one goes down from about 6 with 6-31G** down to about 4 with ccpvdz, as shown in Figures 2 and 3. Part of the reason is that cc-pvdz has a higher degree of contraction. As we analyzed in the previous section, the numerical integration favors higher degree of contraction. Our cost analysis also shows that numerical integration favors larger basis sets, which is evidenced in Figures 4 and 5. For both aug-cc-pvdz and aug-cc-pvtz, the numerical integration outperforms the analytical integration, with about 1.5 times speed-up for augcc-pvdz and about 6 times for aug-cc-pvtz. This is due the fact that the seminumerical scheme scales in O(N 2 ) with respect the basis set size whereas the analytical scheme scales in O(N 4 ).

Conclusion In this paper, we present an algorithm to compute the Hartree-Fock exchange energy density exactly in real space used as the exact exchange energy density for local hybrid DFT functionals. It scales quadratically with respect to the molecular size and the basis set size. In comparison, a previous approximate scheme based RI scales cubically if the auxiliary basis remains the same. Calculations on small molecules, for which an RI scheme is presumably good for, show that the exact scheme is favored with medium-sized basis sets and can be competitive even for large basis sets with efficient grids. This scheme can also be used to compute HF exchange energy and matrix on a standard atom-centered grid. Compared with the conventional analytical approach, it has the advantage of quadratic scaling with respect

21

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

3500

analytical HF exchange matrix calculation exchange matrix forming with grid SG1

3000

4.79X 4.55X

2500

4.63X 4.53X

time in seconds

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 22 of 31

2000

4.34X 4.37X 4.35X

1500

4.23X 4.20X 4.10X

1000

4.02X 4.03X

3.98X

3.93X

500 3.90X

3.87X

0 1000

1200

1400

1600 1800 number of basis functions

2000

2200

2400

Figure 2: timing comparison for HF exchange matrix calculation on cc-pvdz basis function, time in seconds to the basis set size instead of quadruple scaling for a given molecule. Calculations on a series of alanine peptides show that the seminumerical scheme becomes more efficient for aug-cc-pvdz basis and is about six times faster for aug-cc-pvtz basis. The practicality of our scheme is demonstrated through an SCF calculation of acenes-20 (C82 H44 ) with the B05, a local hybrid functional, and cc-pvdz basis set. Each SCF iteration takes 750 seconds for SG1 grid, and 1625 seconds for a finer grid on an Intel Xeon E5-1650(3.20GHz) CPU with six hardware threads.

Acknowledgement This project is supported by Petroleum Research Funds (PRF 55229-DNI6). F. Liu is grateful to the help from Dr. David A. Kofke in this project. The authors thank Dr. Peter Pulay and late Dr. Jon Baker for letting us use their pruned fine grid. They also thank Dr. Jianguo Yu for technical assistance.

22

ACS Paragon Plus Environment

Page 23 of 31

1600

analytical HF exchange matrix calculation exchange matrix forming with grid SG1

6.63X

1400

6.38X 6.46X

1200

time in seconds

6.27X 6.06X

1000 6.09X 6.07X

800 5.83X 5.81X 600

5.71X 5.72X

400

5.57X

5.57X 5.71X 5.41X

200

5.30X

0 1000

1200

1400

1600 1800 2000 number of basis functions

2200

2400

2600

Figure 3: timing comparison for HF exchange matrix calculation on 6-31G** basis function, time in seconds

20000

analytical HF exchange matrix calculation exchange matrix forming with grid SG1

18000 16000 14000 time in seconds

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

12000 0.63X 0.68X

10000 0.66X 0.66X

8000 0.66X 0.68X

6000

0.67X 0.68X 0.67X

4000

0.68X 0.67X 0.68X 0.71X

2000

0.72X 0.73X 0.73X

0 1500

2000

2500

3000 3500 number of basis functions

4000

4500

Figure 4: timing comparison for HF exchange matrix calculation on aug-cc-pvdz basis function, time in seconds

23

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

80000

analytical HF exchange matrix calculation exchange matrix forming with grid SG1

70000

60000

time in seconds

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 24 of 31

50000

40000

30000

20000

10000

0

0.21X

0.20X

2500

0.17X

3000

0.17X

0.17X

0.17X

0.16X

0.17X

3500 4000 4500 number of basis functions

0.16X

5000

0.16X

0.16X

5500

6000

Figure 5: timing comparison for HF exchange matrix calculation on aug-cc-pvtz basis function, time in seconds

24

ACS Paragon Plus Environment

Page 25 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Figure 6: TOC Picture. For TOC ONLY. Not cited in the text.

References (1) Cohen, A. J.; Mori-Snchez, P.; Yang, W. Challenges for Density Functional Theory. Chem. Rev. 2012, 112, 289–320. (2) Perdew, J. P.; Schmidt, K. Jacob’s ladder of density functional approximations for the exchange-correlation energy. AIP Conf. Proc. 2001, 577, 1–20. (3) Vosko, S. H.; Wilk, L.; Nusair, M. Accurate spin-dependent electron liquid correlation energies for local spin density calculations: a critical analysis. Can. J. Phys. 1980, 58, 1200–1211. (4) Lee, C.; Yang, W.; Parr, R. G. Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density. Phys. Rev. B 1988, 37, 785–789. (5) Becke, A. Density-functional thermochemistry. III. The role of exact exchange. Chem. Phys 1993, 98, 5648–5652. (6) Perdew, J. P.; Burke, K.; Ernzerhof, M. Generalized Gradient Approximation Made Simple. Phys. Rev. Lett. 1996, 77, 3865–3868.

25

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(7) Zhao, Y.; Truhlar, D. G. Density Functional for Spectroscopy: No Long-Range SelfInteraction Error, Good Performance for Rydberg and Charge-Transfer States, and Better Performance on Average than B3LYP for Ground States. J. Phys. Chem. A 2006, 110, 13126–13130. (8) Becke, A. D. Perspective: Fifty years of density-functional theory in chemical physics. J. Chem. Phys. 2014, 140, 18A301. (9) Cohen, A. J.; Mori-Sanchez, P.; Yang, W. Insights into Current Limitations of Density Functional Theory. Science 2008, 321, 792–794. (10) Becke, A. D. A real-space model of nondynamical correlation. J. Chem. Phys. 2003, 119, 2972–2977. (11) Becke, A. D. Real-space post-Hartree-Fock correlation models. J. Chem. Phys. 2005, 122, 064101. (12) Perdew, J. P.; Staroverov, V. N.; Tao, J.; Scuseria, G. E. Density functional with full exact exchange, balanced nonlocality of correlation, and constraint satisfaction. Phys. Rev. A 2008, 78, 052513. (13) Proynov, E.; Shao, Y.; Kong, J. Efficient self-consistent DFT calculation of nondynamic correlation based on the B05 method. Chem. Phys. Lett. 2010, 493, 381 – 385. (14) Proynov, E.; Liu, F.; Kong, J. Modified Becke05 method of nondynamic correlation in density functional theory with self-consistent implementation. Chem. Phys. Lett. 2012, 525526, 150 – 152. (15) Emil Proynov, Y. S., Fenglai Liu; Kong, J. Improved self-consistent and resolutionof-identity approximated Becke’05 density functional model of nondynamic electron correlation. J. Chem. Phys. 2012, 136, 034102.

26

ACS Paragon Plus Environment

Page 26 of 31

Page 27 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(16) Liu, F.; Proynov, E.; Yu, J.-G.; Furlani, T. R.; Kong, J. Comparison of the performance of exact-exchange-based density functional methods. J. Chem. Phys. 2012, 137, 114104. (17) Becke, A. D. Density functionals for static, dynamical, and strong correlation. J. Chem. Phys. 2013, 138, 074109. (18) Kong, J.; Proynov, E. Density Functional Model for Nondynamic and Strong Correlation. J. Chem. Theory Comput. 2015, 12, 133–143. (19) Becke, A. D. Communication: Calibration of a strong-correlation density functional on transition-metal atoms. J. Chem. Phys. 2013, 138, 161101. (20) Becke, A. D. Excited-state surfaces of ethylene from the B13 strong-correlation density functional. Mol. Phys. 2015, 113, 1884–1889. (21) Becke, A. D. Density Functionals; Springer, 2014; pp 175–186. (22) Johnson, E. R.; Contreras-Garc´ıa, J. Communication: A density functional with accurate fractional-charge and fractional-spin behaviour for s-electrons. J. Chem. Phys. 2011, 135, 081103. (23) Johnson, E. R. A density functional for strong correlation in atoms. J. Chem. Phys. 2013, 139, 074110. (24) Jaramillo, J.; Scuseria, G. E.; Ernzerhof, M. Local hybrid functionals. J. Chem. Phys. 2003, 118, 1068–1073. (25) Bahmann, H.; Rodenberg, A.; Arbuznikov, A. V.; Kaupp, M. A thermochemically competitive local hybrid functional without gradient corrections. J. Chem. Phys. 2007, 126, 011103. (26) Janesko, B. G.; Scuseria, G. E. Local hybrid functionals based on density matrix products. J. Chem. Phys. 2007, 127, 164117. 27

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(27) Janesko, B.; Krukau, A.; Scuseria, G. Self-consistent generalized Kohn-Sham local hybrid functionals of screened exchange: Combining local and range-separated hybridization. J. Chem. Phys. 2008, 129, 124110. (28) Johnson, E. R. Local-hybrid functional based on the correlation length. J. Chem. Phys. 2014, 141, 124120. (29) Schwabe, T.; Grimme, S. Double-hybrid density functionals with long-range dispersion corrections: higher accuracy and extended applicability. Phys. Chem. Chem. Phys. 2007, 9, 3397–3406. (30) Tarnopolsky, A.; Karton, A.; Sertchook, R.; Vuzman, D.; Martin, J. M. Double-hybrid functionals for thermochemical kinetics. J. Phys. Chem. A 2008, 112, 3–8. (31) Grimme, S.; Neese, F. Double-hybrid density functional theory for excited electronic states of molecules. J. Chem. Phys. 2007, 127, 154116. (32) Zhang, Y.; Xu, X.; Goddard, W. Doubly hybrid density functional for accurate descriptions of nonbond interactions, thermochemistry, and thermochemical kinetics. Proc. Natl. Acad. Sci. USA 2009, 106, 4963. (33) Della Sala, F.; Gorling, A. Efficient localized Hartree–Fock methods as effective exactexchange Kohn–Sham methods for molecules. J. Chem. Phys. 2001, 115, 5718–5732. (34) Arbuznikov, A. V.; Kaupp, M.; Bahmann, H. From local hybrid functionals to localized local hybrid potentials: Formalism and thermochemical tests. J. Chem. Phys. 2006, 124, 204102. (35) Becke, A. D.; Dickson, R. M. Numerical solution of Poissons equation in polyatomic molecules. J. Chem. Phys. 1988, 89, 2993–2997.

28

ACS Paragon Plus Environment

Page 28 of 31

Page 29 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(36) Neese, F.; Wennmohs, F.; Hansen, A.; Becker, U. Efficient, approximate and parallel HartreeFock and hybrid DFT calculations. A chain-of-spheres algorithm for the HartreeFock exchange. Chem. Phys. 2009, 356, 98 – 109. (37) Kossmann, S.; Neese, F. Comparison of two efficient approximate HarteeFock approaches. Chem. Phys. Lett. 2009, 481, 240 – 243. (38) Bahmann, H.; Kaupp, M. Efficient Self-Consistent Implementation of Local Hybrid Functionals. J. Chem. Theory Comput. 2015, 11, 1540–1548. (39) Liu, F.; Furlani, T.; Kong, J. Optimal Path Search for Recurrence Relation in Cartesian Gaussian Integrals. J. Phys. Chem. A 2016, 120, 10264–10272. (40) Friesner, R. A. Solution of self-consistent field electronic structure equations by a pseudospectral method. Chem. Phys. Lett. 1985, 116, 39–43. (41) Friesner, R. A. Solution of the HartreeFock equations by a pseudospectral method: Application to diatomic molecules. J. Chem. Phys. 1986, 85, 1462–1468. (42) Friesner, R. A. Solution of the HartreeFock equations for polyatomic molecules by a pseudospectral method. J. Chem. Phys. 1987, 86, 3522–3531. (43) Friesner, R. A. An automatic grid generation scheme for pseudospectral self-consistent field calculations on polyatomic molecules. J. Phys. Chem. 1988, 92, 3091–3096. (44) Chasman, D.; Beachy, M. D.; Wang, L.; Friesner, R. A. Parallel pseudospectral electronic structure: I. Hartree–Fock calculations. J. Comput. Chem. 1998, 19, 1017–1029. (45) Paier, J.; Hirschl, R.; Marsman, M.; Kresse, G. The Perdew–Burke–Ernzerhof exchangecorrelation functional applied to the G2-1 test set using a plane-wave basis set. J. Chem. Phys. 2005, 122, 234102. (46) Wu, X.; Selloni, A.; Car, R. Order-N implementation of exact exchange in extended insulating systems. Phys. Rev. B 2009, 79, 085102. 29

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(47) DiStasio, R. A.; Santra, B.; Li, Z.; Wu, X.; Car, R. The individual and collective effects of exact exchange and dispersion interactions on the ab initio structure of liquid water. J. Chem. Phys. 2014, 141 . (48) Obara, S.; Saika, A. Efficient recursive computation of molecular integrals over Cartesian Gaussian functions. J. Chem. Phys. 1986, 84, 3963. (49) Obara, S.; Saika, A. General recurrence formulas for molecular integrals over Cartesian Gaussian functions. J. Chem. Phys. 1988, 89, 1540. (50) Head-Gordon, M.; Pople, J. A method for two-electron Gaussian integral and integral derivativeevaluation using recurrence relations. J. Chem. Phys. 1988, 89, 5777. (51) Lebedev, V. Values of the nodes and weights of ninth to seventeenth order gauss-markov quadrature formulae invariant under the octahedron group with inversion. USSR Comput. Math. and Math. Phys. 1975, 15, 44–51. (52) Lebedev, V. Quadratures on a sphere. USSR Comput. Math. and Math. Phys. 1976, 16, 1024. (53) Murray, C. W.; Handy, N. C.; Laming, G. J. Quadrature schemes for integrals of density functional theory. Mol. Phys. 1993, 78, 997. (54) Becke, A. D. A multicenter numerical integration scheme for polyatomic molecules. J. Chem. Phys. 1988, 88, 2547–2553. (55) Shao, Y.; Molnar, L.; Jung, Y.; Kussmann, J.; Ochsenfeld, C.; Brown, S.; Gilbert, A.; Slipchenko, L.; Levchenko, S.; ONeill, D. et al. Advances in methods and algorithms in a modern quantum chemistry program package. Phys. Chem. Chem. Phys. 2006, 8, 3172–3191. (56) Gill, P. M. W.; Johnson, B. G.; Pople, J. A. A standard grid for density functional calculations. Chem. Phys. Lett. 1993, 209, 506 – 512. 30

ACS Paragon Plus Environment

Page 30 of 31

Page 31 of 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(57) Jung, Y.; Sodt, A.; Gill, P. M. W.; Head-Gordon, M. Auxiliary basis expansions for large-scale electronic structure calculations. Proc. Natl. Acad. Sci. USA 2005, 102, 6692–6697. (58) Curtiss, L. A.; Redfern, P. C.; Raghavachari, K.; Rassolov, V.; Pople, J. A. Gaussian-3 theory using reduced Mo/ller-Plesset order. J. Chem. Phys. 1999, 110 . (59) Hachmann, J.; Dorando, J. J.; Avils, M.; Chan, G. K.-L. The radical character of the acenes: A density matrix renormalization group study. J. Chem. Phys. 2007, 127, 134309. (60) Jiang, D.-e.; Dai, S. Electronic Ground State of Higher Acenes. J. Phys. Chem. A 2008, 112, 332–335. (61) Baker, J.; Janowski, T.; Wolinski, K.; Pulay, P. Recent developments in the PQS program. WIREs Comput. Mol. Sci 2012, 2, 63–72. (62) Davidson, E. R.; Feller, D. Basis set selection for molecular calculations. Chem. Rev. 1986, 86, 681–696.

31

ACS Paragon Plus Environment