Implementation of Exact Exchange with Numerical Atomic Orbitals

J. Phys. Chem. A 2010, 114, 1039–1043

1039

Implementation of Exact Exchange with Numerical Atomic Orbitals Honghui Shang, Zhenyu Li, and Jinlong Yang* Hefei National Laboratory for Physical Sciences at Microscale, UniVersity of Science and Technology of China, Hefei, Anhui 230026, China ReceiVed: September 13, 2009; ReVised Manuscript ReceiVed: October 21, 2009

A method to calculate Hartree-Fock-type exact exchange has been implemented in the electronic structure code SIESTA based on a localized numerical atomic orbital basis set. In our implementation, the electron repulsion integrals are calculated by solving Poisson’s equation using the interpolating scaling function method and then doing numerical integration in real-space. Test calculations for both isolated and periodic systems are performed, and good agreement with results calculated by Gaussian03 or Crystal06 packages is obtained. I. Introduction First-principles electronic structure calculations based on density functional theory (DFT)1 have been successfully applied to both molecular and condensed-matter systems. However, the widely used functionals based on the local density approximation (LDA) or the generalized gradient approximation (GGA) sometimes are not accurate enough.2 A possible remedy is adding nonlocal Hartree-Fock-type exact exchange (HFX) into local or semilocal density functionals. Such hybrid functionals can be used to treat dynamical, nondynamical, and dispersion correlations.3,4 The heavy computational demand to evaluate HFX is the main bottleneck for hybrid functional calculations, which prohibits its use for large molecules and for periodic systems with a big unit cell. Previous implementations of HFX mainly focus on Gaussian-type orbital (GTO) and plane wave (PW) basis sets, for example, in Crystal06 with GTO,5 CPMD6,7 and VASP8,9 with PW, and CP2K10 with a hybrid GTO and PW basis set. A basis set based on numerical atomic orbitals is practically very attractive because numerical orbitals can be designed to be localized. Due to its localization, a numerical atomic orbital only overlaps with a finite number of neighboring orbitals, which naturally leads to lower order scaling of the computational time versus the system size. In fact, a localized pseudoatomic orbital basis set has been widely used in order-N DFT packages, such as SIESTA,13 CONQUEST,14 and OPENMX.15 Implementation of HFX with a numerical atomic basis set is thus very desirable. A couple of attempts have been made in this direction very recently. Toyoda and Ozaki12 have tried to calculate the electron repulsion integrals (ERIs) of HFX in the reciprocal space with a numerical atomic basis set, and Wu et al.11 have tried to calculate HFX using maximally localized Wannier functions obtained from PW calculation using the GGA functional. In this work, we present an accurate scheme for efficient HFX calculation in real-space with a localized numerical atomic basis set implemented in the SIESTA package. Permutational symmetry of ERIs is considered to reduce computational time. The outline of this paper is as the following. In section II, the basic theory of HFX and hybrid functional is reviewed. In section III, all techniques that we have incorporated in our implementa* To whom correspondence should be addressed. E-mail: jlyang@ ustc.edu.cn.

tion are discussed in detail, and in section IV, benchmark results are presented. Section V concludes this paper. II. Theory A. The Hamiltonian. For periodic systems, the crystalline orbital ψi(k, r) is a linear combination of Bloch functions φµ(k, r), defined in terms of atomic orbitals χµR(r)

ψi(k, r) )

∑ cµ,i(k)φµ(k, r)

(1)

∑ χµR(r)eik · R

(2)

µ

φµ(k, r) )

R

where µ is the index of atomic orbitals, i is the suffix for different bands, R is the origin of a unit cell, and summation for R runs for all unit cells in the real-space. In practice, however, it is not feasible to add contributions from all unit cells. An extended cell with Born-von Karman boundary condition is thus used in our calculations, and the summation of R is limited in the extended cell; χµR(r) ) χµ(r - R - rµ) is the µth atomic orbital, whose center is displaced from the origin of the unit cell at R by rµ; cµ,i(k) is the wave function coefficient, which is obtained by solving the following equation

H(k)c(k) ) E(k)S(k)c(k)

(3)

∑ HReik · R

(4)

ˆ |χRν 〉 [HR]µν ) 〈χµ0 |H

(5)

H(k) )

R

S(k) )

∑S

R ik · R

e

(6)

R

[SR]µν ) 〈χµ0 |χRν 〉

(7)

ˆ between the In eq 5, [HR]µν is a matrix element of operator H atomic orbitals, χµ locates in the central unit cell 0, and χν locates ˆ is the one-electron Hamiltonian operator, in the unit cell R. H and in pseudopotential approximation, it is composed of the following terms:

ˆ ) Tˆ + VˆPS + VˆH + VˆXC H

(8)

where Tˆ is the kinetic energy operator, VˆPS is the pseudopotential operator, VˆH is the Hartree potential operator, and VˆXC is the exchange-correlation potential operator, which is local in pure DFT, but is nonlocal in HF.

10.1021/jp908836z  2010 American Chemical Society Published on Web 11/23/2009

1040

J. Phys. Chem. A, Vol. 114, No. 2, 2010

Shang et al.

B. HFX and the B3LYP Functional. In LDA and GGA, VXC(r) is the same in every unit cell. However, the HFX potential matrix element is defined as Q [VX]µλ )-

1 2

∑ ∑ PνσN [(χµ0χHν |χQλ χσH+N)]

∑ il jl kl Fi,j,k ) ∫ drxl yl zl F(r) 1 2

(9)

3

∑ ∫BZ c*ν,j(k)cσ,j(k)θ(εF - εj(k))eik · Ndk j

where θ is the step function, εF is the Fermi energy, and εj(k) is the jth eigenvalue at point k. The B3LYP hybrid functional16 includes HFX in its exchange part. Its correlation part is a combination of LYP17 and VWN18 functionals (VWN III is used in Gaussian03, while VWN V is used in Crystal06)

E

)

AESlater x

+ (1 -

A)EHF x

+

+ EVWN c C∆Esemilocal c

B∆EBecke x

A

H χµ0 (r)χNν (r)χQ λ (r )χσ (r )

|r - r |

(11)

∫

|r - r |

dr

∫ Vµ ,ν (r)χQλ (r)χσH(r) dr 0

(13)

N

(14)

III. Methods In this section, we describe the techniques we have employed to calculate and store the ERIs, which is the most computationally demanding part to evaluate HFX. A. The ISF Method To Solve Poisson’s Equation. The ISF method23 is used to solve Poisson’s equation. With a mother wavelet, we can obtain a scaling function basis set by translations. There is a refinement relation between the scaling functions on a grid with spacing h and another one with spacing h/2, and the scaling function coefficients of any function are just the values of the function to be expanded on the grid. As a result, a continuous charge density is given by

F(r) )

∑ Fi ,i ,i φ(x - i1)φ(y - i2)φ(z - i3)

i1,i2,i3

1 2 3

(17)

Denoting the potential on the grid point rj1, j2, j3 by Vj1, j2, j3 ) V(rj1, j2, j3), we have

Vj1,j2,j3 )

∑ Fi ,i ,i ∫ dr

i1,i2,i3

φ(x - i1)φ(y - i2)φ(z - i3)

1 2 3

|rj1,j2,j3 - r |

(18) The above integral defines a discrete kernel

K(i1 - j1, i2 - j2, i3 - j3) ) K(i1, j1 ;i2, j2 ;i3, j3) )

∫ dr

φ(x - i1)φ(y - i2)φ(z - i3) |rj1,j2,j3 - r |

Then potential Vj1, j2, j3 can be obtained from the charge density Fi1, i2, i3 by the following three-dimensional convolution:

Vj1,j2,j3 )

which was obtained by solving the Poisson’s equation in the real-space with free boundary condition, using the interpolating scaling functions (ISF) method.23 The next step is an integration also in the real-space H (χµ0 χNν |χQ λ χσ ) )

∫ dr |r -1 r′| F(r)

(19)

in a real-space grid, we first calculate

Vµ0,νN(r) )

(16)

3

drdr

(12)

χµ0 (r)χNν (r)

V(r) )

+

where ∆EBecke ) EBecke - ESlater and ∆Esemilocal ) ELYP - EVWN x x x c c c with A ) 0.8, B ) 0.72, and C ) 0.81. In our B3LYP . implementation, we use the VWN V functional for EVWN c In order to calculate the following ERI in eq 9 H (χµ0 χNν |χQ λ χσ ) )

2

for l1, l2, and l3 smaller than m. Since the first m discrete and continuous moments are identical for a mth order interpolating scaling function, a scaling function representation gives the most faithful mapping between a continuous and discretized charge distribution for electrostatic problems. The following integral equation gives the potential generated by charge density F

(10)

B3LYP

1

i,j,k

νσ N,H

where Q, H, and N represent different unit cells, and summations of H and N run for all unit cells in the extended cell. The N is computed by an integration over density matrix element Pνσ the Brillouin zone (BZ) N Pνσ )

The interpolating scaling functions are localized and thus suitable to describe nonperiodic systems. The first mth moments of scaling functions satisfy23

(15)

where φ is one-dimensional scaling functions and i1, i2, and i3 are indexes of points in a uniform three-dimensional mesh.

∑ K(i1 - j1, i2 - j2, i3 - j3)Fi ,i ,i

i1,i2,i3

1 2 3

(20)

It is easy to make this convolution in the Fourier space. Therefore, the kernel and charge density are both transformed to the Fourier space, and finally, the potential is transformed back to the real-space. The kernel is calculated in the real-space using the refinement relation for interpolating scaling functions. We only need to calculate the kernel once before calculating all the ERIs. The 50th order interpolating scaling functions are used, which leads to a very high accuracy.23 B. Calculate ERIs. With Possion’s equation solved, ERIs can be, in principle, directly calculated by eq 14. However, to improve computational efficiency, several techniques are adopted. First, those very small ERIs are not calculated. With GTO, it is common to estimate the upper bound of ERIs by calculating all the two-index quantities and then applying the Schwarz inequality. Such a prescreening procedure, taking advantage of the exponential decay of the charge distributions with respect to the distance between Gaussian centers, reduces the total number of ERIs to be considered from O(N4) to O(N2). In our numerical basis set case, solving the poisson equation takes most of the computational time (almost 400 times of the integration time). Considering the locality of numerical atomical orbitals, we do not need to solve Poisson’s equation for those two indexes if their corresponding orbitals have no overlap. To further improve the efficiency, we take the full permutational symmetry of the ERIs into account for both molecules and solids. In the molecule case, the permutational symmetry of the ERIs reads

Exact Exchange with Numerical Atomic Orbitals

J. Phys. Chem. A, Vol. 114, No. 2, 2010 1041

(µν|λσ) ) (µν|σλ) ) (νµ|λσ) ) (νµ|σλ) ) (λσ|µν) ) (λσ|νµ) ) (σλ|µν) ) (σλ|νµ)

(21)

In the solid case, the first index varies for all orbitals in the unit cell, and the other three indexes varies for all orbitals in the extended cell. Considering the following symmetry R -R Hµν ) Hνµ

(22)

(µ0νH |λGσN) ) (µ0νH |σNλG) ) (ν0µ-H |λG-HσN-H) ) (ν0µ-H |σN-HλG-H) ) (λ0σN-G |µ-GνH-G) ) (λ0σN-G |νH-Gµ-G) ) (σ0λG-N |µ-NνH-N) ) (σ0λG-N |νH-Nµ-N)

(23)

we have

In this way, we save a factor of 8 in the number of integrals that have to be considered. For ERI storage, we only keep the ERIs whose absolute value is greater than 10-10 Hartree and store them with single precision numbers, which gives a similar precision with the compression algorithm used in CP2K.10 Using single precision instead of double precision only leads to a very small error. As an example, for polyyne with double-ζ (DZ) basis set, the band gap difference is only 4.0 × 10-4 eV.

TABLE 2: ERIs (in Hartree) of H2 with a STO-3G Basis Set (The Exact ERIs Are Calculated Analytically) index (1 (2 (2 (2 (2 (2

1 1 1 2 2 2

1 1 2 1 2 2

1) 1) 1) 1) 1) 2)

exact

ISF

relative error

1.2493681 0.3695430 0.1498951 0.6768696 0.3695430 1.2493681

1.2493733 0.3695434 0.1498951 0.6768702 0.3695434 1.2493733

4.16 × 10-6 9.22 × 10-7 1.78 × 10-8 8.01 × 10-7 9.22 × 10-7 4.16 × 10-6

For comparison, the conjugate gradient (CG) method is also used to solve Poisson’s equation.11 The Laplace operator is discretized with both 7 and 13 mesh points. The computational time for CG method is about 10 times longer than the ISF method, while its precision is lower. We calculate ERIs for a H2 molecule with a H-H bond length of 0.74 Å. The STO-3G basis function was discretized in a 10 × 10 × 10 Å3 cell with a 200 Ry cutoff, which leads to a h equal to 0.21 Bohr. As shown in Table 2, compared to analytical results, the ERIs obtained from ISF method carry small errors

IV. Benchmark Calculations In order to demonstrate the capabilities of our code, we present benchmark calculations for several systems. We present results for molecules and one-dimensional chains here; other tested systems including two-dimensional boron nitride sheet and three-dimensional silicon bulk are not given. For all tested systems, our results agree well with those obtained from existing software using Gaussian basis set. The norm-conserving pseudopotentials generated using the Troullier-Martins20 scheme, in the fully separable form developed by Kleiman and Bylader,19 are used to represent interaction between core and valence electrons. Pseudopotentials derived within semilocal formulations are used in hybrid functional calcualtions without any modification.11,24 All of the numerical atomic orbital basis are generated using SIESTA’s default parameters.21,22 A. The Poisson Solver and ERIs. Before performing benchmark calculations for real systems, we do a Poisson solver test with a Gaussian charge to illustrate the efficiency of the ISF method. For a Gaussian charge

F(r) ) (R√π)3e-|r-r | /R 02

2

Figure 1. Eigenvalue spectra of Si2H6 molecule calculated with HF (a), B3LYP functional (b).

(24)

the exact solution is available

V(r') )

{

derf(r /R) r * 0 r ) 0 2/(R√π)

(25)

Using ISF method, we solve Poisson’s equation with different R values in a 10 × 10 × 10 Å3 cell. Grid spacing h is chosen to be 0.21 Bohr. As shown in Table 1, a very impressive accuracy is obtained. TABLE 1: Maximal Absolute Errors of Electrostatic Potential for a Gaussian Charge Distribution Using ISF Method and CG Method Where the Laplace Operator Is Discretized Both with 7 (CG7) and 13 (CG13) Mesh Points (r is in Bohr) R

ISF

CG13

CG7

0.84 2.31

4.27 × 10-10 4.60 × 10-9

3.10 × 10-4 3.17 × 10-5

8.51 × 10-3 3.71 × 10-4

Figure 2. Structure of (a) polyyne and (b) trans-polyacetylene. The unit cells are marked by dashed brackets. The geometry of polyyne is taken from ref 27, and the geometry of trans-polyacetylene is optimized with Crystal06 using the B3LYP functional and 6-31G** basis set.

TABLE 3: HOMO-LUMO Gap of Si2H6 Molecule (in eV) software

method

LUMO-HOMO

Gaussian03 our approach Gaussian03 our approach

B3LYP B3LYP HF HF

8.64 8.64 14.79 15.30

1042

J. Phys. Chem. A, Vol. 114, No. 2, 2010

Shang et al. TABLE 4: Band Gap of trans-Polyacetylene (in eV): Small (S), Medium (M), and Large (L) Basis Sets Mean SZ, DZ, and DZP in Our Approach, and STO-3G, 6-31G, and 6-31G** in Crystal06, Respectively band gap

Figure 3. Band structure of polyyne calculated with the HF method. The DZ basis set was used in our code, and the 4-31G basis set was used in the PLH code.28 The Fermi energy is set to zero.

at the order of 10-6 Hartree. The precision can be further improved by using larger extended cell and/or finer real-space grid. We also test the computational time for a methane molecule, which has been adopted in the study by Toyoda and Ozaki.12 In our implementation, only 4.9 × 10-2 s is required for a single integral on a 2.33 GHz Intel Xeon processor, much faster than in the previous implementation.12 B. Si2H6 Molecule. A Si2H6 molecule, with its atomic coordinates taken from the G2 set,25 is placed in a 10 × 10 × 10 Å3 box with a 200 Ry cutoff grid. The energy gap between the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO) is calculated using both the HF and B3LYP method with our implementation. The results are listed in Table 3 together with those obtained from the Gaussian03 package. Double-ζ plus polarization (DZP) basis set is used in our code, and 6-31G** basis set is used in Gaussian03 code. For B3LYP, the gap difference between our implementation and Gaussian03 is very small (within 0.01 eV). An about 0.51

software

S

M

L

Crystal06 our approach difference

1.36 1.27 0.09

1.18 1.24 0.06

1.18 1.19 0.01

eV difference is observed for HF calculations because we use pseudoatomic basis sets and Gaussian03 uses Gaussian basis sets. As shown in Figure 1, the eigenvalue spectra are also agree reasonably well between our implementation and Gaussian03, especially for the occupied orbitals. C. One-Dimensional Polymers. Polyyne is chosen as our first benchmark for extended systems. Equilibrium geometry we employed was optimized with the MP2 method.28 In our approach, the DZ basis set is used, which has a similar size with the 4-31G basis set. The Brillouin zone is sampled by 1 × 1 × 400 special k-points using the Monkhorst-Pack scheme.26 The grid step h we adopted is 0.205 Bohr. The HF band gap we get is 6.55 eV, which agrees well with previous results using PLH code (6.60 eV)28 and Gaussian03 package (6.59 eV).27 As shown in Figure 3, our HF band structures with all valence bands and the lowest unoccupied conduction band are essentially identical to those obtained from the PLH code.28 Our last benchmark system is 1D polymer trans-polyacetylene. Its geometry is optimized with the Crystal06, using the B3LYP functional and 6-31G** basis set (Figure 2). With this geometry, we calculate its band structure. The single-ζ (SZ), DZ, and DZP basis sets are employed with our code, and the STO-3G, 6-31G, and 6-31G** basis sets are employed with Crystal06. The Brillouin zone is sampled by 1 × 1 × 200 special k-points using the Monkhorst-Pack scheme. The grid step h we used is 0.219 Bohr. The five highest occupied bands, along with the three lowest conduction bands, are plotted in Figure 4. Generally, the agreement between numerical and analytical basis sets are good. Bigger basis set leads to better agreement between our results with those obtained with Gaussian basis set in the Crystal06 package. This trend is clearly shown in the band gap differences listed in Table 4.

Figure 4. Band structure of trans-polyacetylene calculated with B3LYP functional using (a) SZ, (b) DZ, and (c) DZP basis sets in our implementation, and (a) STO-3G, (b) 6-31G, and (c) 6-31G** basis sets are used in Crystal06, respectively. The Fermi energy is set to zero.

Exact Exchange with Numerical Atomic Orbitals V. Conclusions In this work, we have prensented the implementation of HFX in SIESTA. We use the ISF Possion solver to calculate ERIs, which shows good precision and efficiency. The full permutational symmetry of the ERIs for both molecules and solids is considered, which leads to a factor of 8 saving in the number of ERIs to calculate and to store. We test our method thoroughly for systems from Gaussian charge distribution to molecular and extended systems. Good agreement with results obtained using Gaussian03 and Crystal06 packages is obtained, especially when the DZP basis set is used in our approach and 6-31G** basis set is used in Gaussian03 or Crystal06. The implementation of periodic HFX serves as a starting point for the implementation of MP2 and other correlated methods in extended systems. Acknowledgment. This work is partially supported by NSFC (50721091, 20533030, 50731160010, 20873129), by MOE (FANEDD-2007B23, NCET-08-0521), by the National Key Basic Research Program (2006CB922004), by the USTC-HP HPC project, and by the SCCAS and Shanghai Supercomputer Center. References and Notes (1) Parr, R. G.; Yang, W. Density Functional Theory of Atoms and Molecules; Oxford University Press: New York, 1989. (2) Mori-Sanchez, P.; Cohen, A. J.; Yang, W. Phys. ReV. Lett. 2008, 100, 146401. (3) Becke, A. D.; Johnson, E. R. J. Chem. Phys. 2007, 127, 124108. (4) Heyd, J.; Scuseria, G. E.; Ernzerhof, M. J. Chem. Phys. 2003, 118, 8207. (5) Causa, M.; Dovesi, R.; Orlando, R.; Pisani, C.; Saunders, V. R. J. Phys. Chem. 1988, 92, 909.

J. Phys. Chem. A, Vol. 114, No. 2, 2010 1043 (6) Alkauskas, A.; Broqvist, P.; Devynck, F.; Pasquarello, A. Phys. ReV. Lett. 2008, 101, 106802. (7) Car, R.; Parrinello, M. Phys. ReV. Lett. 1985, 55, 2471. Hutter, J.; Curioni, A. Chem. Phys. Chem. 2005, 6, 1788. (8) Paier, J.; Hirschl, R.; Marsman, M.; Kresse, G. J. Chem. Phys. 2005, 122, 234102. (9) Paier, J.; Marsman, M.; Hummer, K.; Kresse, G.; Gerber, I. C.; Angyan, J. G. J. Chem. Phys. 2006, 124, 154709. (10) Guidon, M.; Schiffmann, F.; Hutter, J.; VandeVondele, J. J. Chem. Phys. 2008, 128, 214104. (11) Wu, X.; Selloni, A.; Car, R. Phys. ReV. B 2009, 79, 085102. (12) Toyoda, M.; Ozaki, T. J. Chem. Phys. 2009, 130, 124114. (13) Soler, J. M.; Artacho, E.; Gale, J. D.; Garcia, A.; Junquera, J.; Ordejon, P.; Sanchez-Portal, D. J. Phys.: Condens. Matter 2002, 14, 2745. Artacho, E.; et al. Phys. Status Solidi B 1999, 215, 809. Sanchez-Portal, D.; et al. Int. J. Quantum Chem. 1997, 65, 453. (c) Ordejon, P.; et al. Phys. ReV. B 1996, 53, 10441. (14) Torralba, A. S.; Todorovi, M.; Miyazaki, T.; Gillan, M. J.; Bowler, D. R. J. Phys.: Condens. Matter 2008, 20, 294206. (15) Ozaki, T.; Kino, H. Phys. ReV. B 2004, 69, 195113. (16) Becke, A. D. J. Chem. Phys. 1993, 98, 5648. (17) Lee, C.; Yang, W.; Parr, R. G. Phys. ReV. B 1988, 37, 785. (18) Vosko, S. H.; Wilk, L.; Nusair, M. Can. J. Phys. 1980, 58, 1200. (19) Kleinman, L.; Bylander, D. M. Phys. ReV. Lett. 1982, 48, 1425. (20) Troullier, N.; Martins, J. L. Phys. ReV. B 1991, 43, 1993. (21) Perdew, J. P.; Zunger, A. Phys. ReV. B 1981, 23, 5048. (22) Ceperley, D. M.; Alder, B. J. Phys. ReV. Lett. 1981, 45, 566. (23) Genovese, L.; Deutsch, T.; Neelov, A.; Goedecker, S.; Beylkin, G. J. Chem. Phys. 2006, 125, 074105. (24) Broqvist, P.; Alkauskas, A.; Pasquarello, A. Phys. ReV. B 2009, 80, 085114. (25) Curtiss, L. A.; Raghavachari, K.; Redfern, P. C.; Pople, J. A. J. Chem. Phys. 1997, 106, 1063. (26) Monkhorst, H. J.; Pack, J. D. Phys. ReV. B 1976, 13, 5188. (27) Ayala, P. Y.; Kudin, K. N.; Scuseria, G. E. J. Chem. Phys. 2001, 115, 9698. (28) Poulsen, T. D.; Mikkelsen, K. V.; Fripiat, J. G.; Jaquemin, D.; Champagne, B. J. Chem. Phys. 2001, 114, 5917.

JP908836Z

Implementation of Exact Exchange with Numerical Atomic Orbitals

Recommend Documents