Local Fitting of the Kohn–Sham Density in a Gaussian and Plane

Apr 6, 2017 - However, due to the overhead introduced by the local density fitting, single ... available by participants in Crossref's Cited-by Linkin...
0 downloads 0 Views 2MB Size
Article pubs.acs.org/JCTC

Local Fitting of the Kohn−Sham Density in a Gaussian and Plane Waves Scheme for Large-Scale Density Functional Theory Simulations Dorothea Golze,*,†,‡ Marcella Iannuzzi,† and Jürg Hutter† †

Department of Chemistry, University of Zürich, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland COMP/Department of Applied Physics, Aalto University, P.O. Box 11100, Aalto FI-00076, Finland



S Supporting Information *

ABSTRACT: A local resolution-of-the-identity (LRI) approach is introduced in combination with the Gaussian and plane waves (GPW) scheme to enable largescale Kohn−Sham density functional theory calculations. In GPW, the computational bottleneck is typically the description of the total charge density on real-space grids. Introducing the LRI approximation, the linear scaling of the GPW approach with respect to system size is retained, while the prefactor for the grid operations is reduced. The density fitting is an O(N) scaling process implemented by approximating the atomic pair densities by an expansion in one-center fit functions. The computational cost for the grid-based operations becomes negligible in LRIGPW. The self-consistent field iteration is up to 30 times faster for periodic systems dependent on the symmetry of the simulation cell and on the density of grid points. However, due to the overhead introduced by the local density fitting, single point calculations and complete molecular dynamics steps, including the calculation of the forces, are effectively accelerated by up to a factor of ∼10. The accuracy of LRIGPW is assessed for different systems and properties, showing that total energies, reaction energies, intramolecular and intermolecular structure parameters are well reproduced. LRIGPW yields also high quality results for extended condensed phase systems such as liquid water, ice XV, and molecular crystals.

1. INTRODUCTION Kohn−Sham density functional theory1,2 (KS-DFT) has become a standard technique for solving electronic structure problems. KS-DFT provides a reasonable compromise between accuracy and efficiency thus enabling the simulation of systems of several hundreds of atoms. However, the time scales accessible in molecular dynamics (MD) simulations are limited to the range of picoseconds and calculations of very large condensed matter systems remain challenging. The computational cost of nonhybrid KS-DFT calculations is dominated by the evaluation of the Coulomb term J J[ρ] =

1 2



ρ(r)ρ(r′) d rd r′ |r − r′|

Resolution-of-the-identity (RI) approaches, also referred to as “density fitting methods”, were introduced to alleviate the computational cost for the evaluation of J. Early attempts aimed at approximating the two-center product χAμ χBν by an expansion in orbital basis functions.3−10 Pioneering work on approximating the full KS density11−13 (global fitting) or the atomic pair densities14 (local fitting) by an expansion in one-center auxiliary fit functions dates back to the 1970s. Local and especially global RI approaches have been intensively developed from the early 1990s onward.15−19 Global RI leads to an O(N3) scaling behavior, since the fitting procedure itself scales cubically. In local RI schemes on the contrary, the fitting of the expansion coefficients scales formally only quadratically and can be further reduced to O(N). RI approaches are not only used for DFT but also for Hartree−Fock20,21 and higher-level ab initio methods such as second-order Møller−Plesset (MP2) perturbation theory,22−36 the random phase approximation,21,37−41 and GW.21,42,43 Local RI in particular has been more recently implemented for the computation of the exchange contribution in Hartree−Fock and hybrid KS-DFT calculations.44−49 In most approaches, the overlap or the Coulomb metric is chosen as the measure of the fitting error and minimized.15 The overlap metric minimizes the residual of the overlap or, equivalently, the deviation between

(1)

where the electronic density ρ(r) ρ(r) =

∑ ρAB (r) = ∑ ∑ AB

Pμν χμA (r)χνB (r)

AB μ ∈ A, ν ∈ B

(2)

can be expressed as the sum of atomic pair densities ρAB, where A and B label the atomic centers. The pair densities are expanded in terms of a local basis of, e.g., atom-centered Gaussian functions {χμ}. The expansion coefficients are the density matrix elements Pμν. Hence, the evaluation of J requires the calculation of the four-center two-electron repulsion integrals (ERIs) which scales with the fourth power of the number of basis functions N. © 2017 American Chemical Society

Received: February 12, 2017 Published: April 6, 2017 2202

DOI: 10.1021/acs.jctc.7b00148 J. Chem. Theory Comput. 2017, 13, 2202−2214

Article

Journal of Chemical Theory and Computation the exact and fitted density. When the Coulomb metric is used, a least-squares minimization of the Coulomb repulsion of the density residual is performed. The locality in the density fitting is achieved either by using local metrics14,45,47 or local domains.44,50−53 The latter rely on predefined criteria to localize the fitting region, such as certain radii around atoms.44,51 The local metric, instead, is based on the definition of an alternative local operator. Baerends et al.14 introduced a local metric, where the electron density is decomposed into pair-atomic densities ρAB , which are approximated as a linear combination of auxiliary functions localized at atoms A and B. The expansion coefficients are obtained by employing an overlap metric. This strategy was subsequently used by several others54,55 and also in this work. A similar approach has also been applied in a density-functional tight-binding model.56 In addition to the O(N) scaling, the local atomic-pair fitting approach proposed by Baerends et al. has several other advantages compared to global RI approaches. The quality of the fitted pair density does not depend on the chemical environment, because the fitting is applied individually to each pair. The (matrix) dimensions of the least-squares minimization step remain modest and are independent of the system size. The pairwise fitting schemes can be easily parallelized by distributing the pairs over the processors. The main drawback is that a local overlap metric introduces linear errors in the fit, whereas the error is quadratic for a global Coulomb metric.12,45 Therefore, larger auxiliary basis sets have to be used to reach the same accuracy as with a global Coulomb metric. Several attempts were made to introduce locality for the Coulomb-based fitting.57,58 Recently, Merlot et al.45 proposed the pair atomic resolution of the identity (PARI) approximation, where locality is exploited in a similar manner as by Baerends et al.14 In this method, each single product χAμ χBν is approximated individually instead of fitting the entire pair density ρAB. To improve the accuracy, the Dunlap correction59 is used to remove the linear-error terms. The advantages compared to Baerends’ approach are that the fitting is robust and variational and that the error of the fit is quadratic. However, each product of basis functions is addressed individually, which potentially increases the number of fit operations. In this work, the local resolution-of-the-identity (LRI) approach of Baerends et al.14 has been combined with the Gaussian and plane waves (GPW) framework,60,61 and implemented in the CP2K program package.62,63 In the following, this combined approach is referred to as LRIGPW. Employing GPW, the construction of the KS matrix scales quasi linearly O(N ln N) thanks to the properties of the local Gaussian basis and to the solution of the Poisson equation in reciprocal space. However, the operations that are performed on the real-space grids can become a computational bottleneck for periodic systems. Using LRI, the prefactor for these operations can be massively reduced, while retaining the linear scaling. In Sections 2 and 3, the theory and implementation details of LRIGPW are described. The computational setup is given in Section 4. The accuracy of LRIGPW is discussed in Section 5 by evaluating total and reaction energies as well as molecular structure parameters for a broad range of systems. We employ LRIGPW also to evaluate net dipole moments, cohesive energies, and lattice constants of condensed matter systems, such as ice XV and molecular crystals. MD simulations of liquid

water have been performed as well. We discuss the computational efficiency of the LRI approach and finally draw conclusions in Section 6.

2. THE LRIGPW METHOD 2.1. The GPW Method. In GPW, a dual representation of the electronic density in real and reciprocal space is used.60,61 In the real-space representation, the density ρ(r) is expressed in terms of atom-centered, contracted Gaussian functions {χμ}, see eq 2. By introducing a sufficiently large, auxiliary plane wave (PW) basis set, the same density can then be expressed as ρ(G) in reciprocal space by applying a Fourier transformation, where G are the reciprocal lattice vectors. This can be considered a density fitting scheme itself. In Fourier space, the Poisson equation is easily solved to obtain the Hartree potential VH(G) = 4πρ(G)/G2. In the GPW scheme, the potential is subsequently converted to its real-space analogue VH(r). In order to exploit the advantages of the Fast Fourier Transform (FFT) algorithms, both the density and the potential VH(r) need to be mapped on regular and sufficiently dense grids in real space. The exchange-correlation (XC) potential Vxc(r) is also evaluated at each real-space grid point and added to the Hartree potential to obtain the potential V(r) = VH(r) + Vxc(r). The collocation of the density on the grid and the subsequent integration of the potential V(r) to compute the contribution HAB μν to the KS matrix elements AB Hμν =

∫ V (r)χμA (r)χνB (r)d r

(3)

both require the evaluation of products of basis functions of the type χAμ (r)χBν (r) at each grid point r. Within the GPW framework, the evaluation of the KS matrix scales linearly with system size. This is achieved by solving the Poisson equation in reciprocal space and by discarding pair densities ρAB that are negligible. Since we use a local basis of Gaussian functions, effective screening procedures can be employed. Only nonzero products of Gaussian functions on atoms that are sufficiently close are included. The operations on the real-space grids, i.e. the collocation of the density and the integration of the KS potential, have to be performed for each pair of interacting atoms. Even though the number of these pairs scales linearly with system size, this part of the calculation is predominant for condensed matter systems and dense grids, i.e. large PW basis sets. For a liquid water box consisting of 64 H2O molecules, the number of pairs is already larger than 200 000. The proportion between total run time and the time spent in the grid-based operations is displayed in Figure 1 for

Figure 1. Total GPW run time and time needed for the operations on the real-space grid (collocation of the density and integration of the potential). The calculations are carried out for liquid water boxes of different sizes with a plane wave cutoff of 400 Ry and a DZVP basis set, see Section 4. 2203

DOI: 10.1021/acs.jctc.7b00148 J. Chem. Theory Comput. 2017, 13, 2202−2214

Article

Journal of Chemical Theory and Computation water boxes of different size. For the medium-sized systems up to 128 H2O molecules, the grid operations require more than 90% of the total run time. Even for larger systems, where, for example, the diagonalization of the KS matrix contributes increasingly, the grid operations remain the most expensive part. The objective of this work is to reduce the prefactor for the grid-based operations by introducing a density fitting scheme. The basic idea is to map the fitted instead of the exact density on the real-space grid. The fitting scheme needs to be local to retain the linear scaling behavior. The fitted density is employed for both, the Coulomb and XC term, which has been also proposed by Laikov64 several years ago. We employ an overlap metric rather than a Coulomb metric since the latter minimizes only the error in the Coulomb term. These requirements, locality in combination with an overlap metric, are fulfilled by the pair-fitting scheme proposed by Baerends et al.,14 which is opportunely adapted to the GPW scheme and from now on named the LRIGPW method. 2.2. Local Density Fitting. Within the LRIGPW approach, the atomic pair densities ρAB, see eq 2, are approximated by an expansion in a set of fit functions centered at atom A {f Ai (r)} and atom B {f Bj (r)} ρ̃AB (r) =

SilAA =

SijAB =



ti =

=

Tμνi =

(13)

∫ χμA (r)χνB (r)fiA/B (r)d r

(14)

χAμ (r)χBν (r)f Ai (r) is integrated for fit functions χAμ (r)χBν (r)f Bi (r) for functions at B. The last

where the product centered at A and term on the rhs of eq 8 is associated with the constraint, where the vector n is given by ni =

∫ fiA/B (r)d r

(15)

which is nonzero only for s-functions. The Lagrange multiplier λ is given by λ=

NAB − n TS−1t n TS−1n

(16) A

A

A

Note that for the self-pair AA, only a , t , n , and the selfoverlap SAA il remain. Based on eq 2, the total fitted density ρ̃(r) is ρ ̃(r) =

∑ ρAB̃ (r) AB

∫ |ρAB − ρAB̃ | dr 2

(5)

∫ ρAB̃ (r)d r

=

Pμν

∫ χμA (r)χνB (r)d r

aĩ A =

(7)

∑ aiA,(AB) B

μ ∈ A, ν ∈ B

ρ̃(r) =

(17)

(18)

i

(19)

In LRIGPW, the total density is thereby presented as a sum over the number of atoms, whereas in GPW we have a sum over pairs, see eq 2. In the case of the 64H2O box, this means that the fitted density is collocated on the grid by looping over 192 atoms rather than 200 000 pairs. As discussed in Section 1, the error in the fit is linear. Localized RI schemes like PARI45 employ the Dunlap correction,59 which eliminates the linear error terms directly from the RI expressions of the four-center ERIs. This is not possible within the GPW framework, since we do not derive explicit RI expressions for the ERIs but evaluate the fitted KS density on a real-space grid, subsequently changing to a PW representation. Furthermore, this correction would only apply to the Coulomb term, but we employ the grid representation of the fitted density also for the XC potential. Additionally, it has been demonstrated45 that including the Dunlap correction

(8)

(9)

Note that we have dropped the pair index (AB) in eqs 8 and 9. The upper part of vector a, aA, yields the expansion coefficients at center A and the lower part, aB, at center B. The overlap of the auxiliary basis functions is given by ⎛ SAA SAB ⎞ S = ⎜⎜ ⎟⎟ ⎝ SBA SBB ⎠

∑ ∑ aĩ A fiA (r) A

which can be expressed in terms of submatrices and subvectors ⎛ SAA SAB ⎞⎛ a A ⎞ ⎛ tA ⎞ ⎛nA ⎞ ⎜⎜ ⎟⎜⎜ ⎟⎟ = ⎜⎜ ⎟⎟ + λ⎜⎜ ⎟⎟ ⎟ ⎝ nB ⎠ ⎝ SBA SBB ⎠⎝ a B ⎠ ⎝ tB ⎠

⎤ B,(AB) B ⎥ a f ( r ) ∑ j j ⎥⎦ j

where the sum is running over the neighbors of A. Using eq 18, the total density is given by

where {χμ} denotes functions of the primary basis set. The total number of electrons Nelec corresponds to Nelec = ∑ABNAB. Minimizing eq 5 under the charge constraint with respect to the expansion coefficients yields a linear set of equations for each pair

Sa = t + λ n

⎡ ∑ ⎢⎢∑ aiA,(AB)fiA (r) + AB ⎣ i

The expansion coefficients are summed up for each atom A to get

(6)

∫ ρAB (r)d r = ∑

PμνTμνi

with

In order to conserve the charge, the integral of the pair density, NAB, has to be constant, i.e. NAB =

(12)

μ ∈ A, ν ∈ B

The fit functions are also Gaussian-type functions and provided as auxiliary basis set. The expansion coefficients {aA,(AB) } and i {aB,(AB) } are obtained for each pair AB by minimizing the j difference DAB between exact density ρAB and approximated density ρ̃AB DAB =

∫ fiA (r)f jB (r)d r

The elements on the right-hand side (rhs) of eq 8 are

(4)

j

(11)

and the overlap between functions at atoms A and B is

∑ aiA,(AB)fiA (r) + ∑ ajB,(AB)f jB (r) i

∫ fiA (r)flA (r)d r

(10)

where the self-overlap of atom A is 2204

DOI: 10.1021/acs.jctc.7b00148 J. Chem. Theory Comput. 2017, 13, 2202−2214

Article

Journal of Chemical Theory and Computation

“Self-pairs” AA and AA′, where A′ is the periodic image of A, give zero contributions. The derivatives of the expansion coefficients are given by

yields a four-center ERI matrix that is not manifestly positive semidefinite which can lead to convergence problems, see also ref 47. In the fitting scheme proposed here, no constraint is set that prevents the fitted density from getting negative. Indeed, numerical inaccuracies of the fitting procedure can produce small negative values of the density at grid points far from the atomic centers, where the density should be zero. However, the absolute values of these numerical errors are smaller than 10−6 au, which is less than the error introduced by the interpolation/ extrapolation operations associated with the multigrid approach in CP2K.60 2.3. Construction of the Kohn−Sham Matrix. The density-dependent term of the KS matrix, HAB μν , given in eq 3, has to be substituted by its LRIGPW analogue H̃ AB μν AB H̃ μν =

∑ i

daiA,(AB) A Ii + dPμν



dajB,(AB) dPμν

j

∂aiA/B,(AB) =− ∂RA, k +

(20)

Λkl =

Ṽ (r) = VH̃ (r) + Vxc̃ (r)

The derivatives of the expansion coefficients with respect to the density matrix are

∑ Sij−1nj (23)

j

j

(n TS−1)j n TS−1n

where

ãAi

FAB =

∂I A ∑ aĩ i + ∂RA, k i

is defined in eq 18,

∑ i

∂aiA,(AB) A Ii + ∂RA, k

∑ FAB

Tμνj (24)

B

(25)

IAi

is defined in eq 21 and where

∑ j

∂ajB,(AB) ∂RA, k

∂ajB,(AB) ∂RB, k

=−

∂ajB,(AB) ∂RA, k

i

∂IiA ∂hkl

(29)

1 2

∑ FAB(k)rAB(l)

(30)

AB

3. IMPLEMENTATION DETAILS 3.1. Fitting Procedure. The LRIGPW approach has been implemented in the DFT part of the CP2K program package.62,63 The pseudocode for the LRIGPW implementation is shown in Figure 2. In the following, the two- and three-index overlap integrals, see eqs 7, 12, and 14, are denoted by

Note that the integral of the potential IAi in eq 21 is computed for each atom A, whereas in GPW we have to evaluate the integral ∫ V(r)χAμ χBν dr for each pair, see eq 3. 2.4. Forces and Stress. The expressions for the forces on the atoms have to be adapted for the terms that depend explicitly on the fitted density ρ̃, i.e. for the Coulomb and XC term. The contributions due to the kinetic energy and the electron−core interactions are evaluated as discussed in ref 60. The Hartree and XC energy are denoted by EH(ρ̃) and EXC(ρ̃). The derivative of E(ρ̃) = EH(ρ̃) + EXC(ρ̃) with respect to the components k = x, y, z of the position vector RA of atom A is A

and

where rAB is the vector connecting A and B, and FAB is the pairwise force defined in eq 26. Note that also the periodic “self-pairs” AA′ contribute to the virial, since modifying the box size changes the internal interactions between A and its images.

with

dE(ρ̃) = dRA, k

∑ ∑ aĩ A

Πkl =

(22)



∂Sml −1 Slj nj ∂RA, k

where hkl refers to the kth component of one of the three lattice vectors. The contribution to the virial related to the second term of eq 25 is given by

The potential Ṽ (r) is the sum of the Hartree Ṽ H(r) and XC potential Ṽ xc(r), which are computed from the fitted density ρ̃(r)

∫ χμA (r)χνB (r)d r dλ = − dPμν n TS−1n

j,m,l

∂RA, k

(28)

A

j

j

∂t j

The computation of the stress tensor requires the virial. The contribution to the virial derived from the first term of eq 25 is

(21)

dλ dPμν

j

∑ Sij−1nj − λ ∑ Sim−1

∂aiA,(AB) ∂a A,(AB) =− i ∂RB, k ∂RA, k

I jB

∫ Ṽ (r)fiA (r)d r

∑ Sij−1Tμνj +

∂λ ∂RA, k

∑ Sij−1

For the derivatives of the expansion coefficients with respect to RB we use the following symmetry relations

where μ ∈ A and ν ∈ B. is the integral of the potential Ṽ (r) multiplied by the auxiliary function f Ai on atom A

daiA/B,(AB) = dPμν

j,m,l

∂Sml −1 Slj t j + ∂RA, k

(27)

IAi

IiA =

∑ Sim−1

(ab) =

∫ χμA (r)χνB (r)d r

(31)

(ab̃ )̃ =

∫ fiA (r)f jB (r)d r

(32)

(aba)̃ =

∫ χμA (r)χνB (r)fiA (r)d r

(33)

(abb)̃ =

∫ χμA (r)χνB (r)f jB (r)d r

(34)

The electronic density ρ has to be fitted at each step of the self-consistent field (SCF) procedure by solving eq 8. However, most of the matrices and vectors required to solve the linear system of equations or to update the KS matrix are calculated only once when the SCF procedure is initialized. The selfoverlap, see eq 11, and the vector elements {ni} can be precalculated for each elemental type. The overlap matrix S and its inverse, the integrals (abã) and (abb̃) and the derivatives (AB) (AB) daA, /dPμν and daB, /dPμν do not depend on the density i j matrix, too. At each SCF iteration, the vector t and the Lagrange multiplier λ are easily constructed from the precomputed integrals and the density matrix. Eq 8 is solved,

I jB (26)

The first term in eq 25 is evaluated on the grid and is analogous to the GPW expression. For the second term, which has no GPW-equivalent, we sum over all neighbors of A with A ≠ B. 2205

DOI: 10.1021/acs.jctc.7b00148 J. Chem. Theory Comput. 2017, 13, 2202−2214

Article

Journal of Chemical Theory and Computation

collocation of the density and the integration of the potential, the distribution is applied to the grid points. The fitting procedure requires an independent set of auxiliary functions {f i} as input. Our choice for the auxiliary basis is a set of uncontracted Gaussian functions. They are generated by geometric progression using 2αmin and 2αmax as start and end point, where αmin and αmax are the smallest and largest exponents in the orbital basis set. The largest l quantum number is set to 2lmax where lmax is the maximal l quantum number in the primary basis set. We found that the inclusion of polarization functions up to 2lmax + 1 for H improves the accuracy for hydrogen-containing compounds and is sometimes required to reproduce the electronic structure correctly when using very large primary basis sets. In general, the RI basis set has to be 3−4 times larger than the primary basis. Extensive tests showed indeed that smaller LRI basis sets lead to highly inaccurate energies and molecular properties or even to the failure of the fit, i.e. instability of the SCF procedure since the fitted density becomes completely unphysical. Note that the LRI fitting approach is not variational, and an optimization procedure24 as proposed for global RI schemes is not possible. 3.2. Integral Evaluation and Inversion of the Overlap Matrix. The evaluation of the three-index overlap integrals (abã) and (abb̃) over spherical harmonic Gaussian functions and the evaluation of their derivatives is computationally expensive for large auxiliary basis sets. Based on ref 65, we developed an integral scheme, which employs contracted solid harmonic Gaussian functions, see ref 66 for details. Our integral scheme is superior to widely used Cartesian Gaussian-based methods. It is particularly efficient for highly contracted primary basis sets and functions with large angular momentum, which are typically included in the auxiliary basis sets. For very large auxiliary basis sets, the overlap matrix S can become ill-conditioned. The inversion of an ill-conditioned matrix is numerically unstable and leads to convergence problems. It is then necessary to identify the few pairs (usually 20). For these pairs, a singular value decomposition of S is performed. Singular values smaller than a given threshold are skipped, and the pseudoinverse is constructed following the method proposed in ref 67, for details see Figure S1 in the Supporting Information (SI). This introduces small inconsistencies in the derivatives. However, we could not find any numerical problems caused by these. Note that for the auxiliary basis sets and systems discussed in this work, S is in most cases sufficiently well-conditioned to perform a regular inversion.

Figure 2. Pseudocode for LRIGPW for a single point calculation. Only operations directly related to LRIGPW are displayed. The evaluation of the other terms, e.g. pseudopotentials, is as in GPW and not reported.

and the fitted density is subsequently collocated on the realspace grid for each atom A. Once the fitted density is available, the standard GPW procedures are applied to compute the Hartree and XC potentials. The sum of the electrostatic and the XC contribution to the potential is integrated for each atom A yielding the vector elements {IAi }. The KS matrix is finally updated for each pair from the precalculated derivatives of the expansion coefficients and the vector elements {IAi }. If forces are computed, the derivatives of the overlap integrals (ab), (ãb̃), (abã), and (abb̃) with respect to the atomic positions are required, see eq 27. FAB is then built from the precomputed matrices and their derivatives. The LRIGPW operations can be easily parallelized using a standard message passing interface (MPI). The computation of the LRI overlap integrals, the evaluation of the expansion coefficients, and the update of the KS matrix are performed independently for each pair. These tasks are parallelized by distributing the pairs over the processors. For the grid-based

4. COMPUTATIONAL DETAILS 4.1. General Setup. KS-DFT calculations comparing the LRIGPW and GPW approach are carried out for isolated molecules and dimers as well as condensed matter systems. For the primary basis sets, double-ζ plus polarization (DZVP) and triple-ζ plus double polarization (TZV2P) basis sets of the MOLOPT type68 are used, which are good quality basis sets for ground state calculations. The MOLOPT basis sets are compact and highly contracted. For example, the double-ζ basis set for oxygen contains 13 contracted spherical Gaussian functions corresponding to 98 primitive Cartesian functions, which implies that a set of 300−400 functions is required as fit basis, see Section 3.1. The LRIGPW tests are performed with medium- (m-aux) and large-sized (l-aux) auxiliary basis sets, which are given in Tables S17−S32 (SI). Both auxiliary basis 2206

DOI: 10.1021/acs.jctc.7b00148 J. Chem. Theory Comput. 2017, 13, 2202−2214

Article

Journal of Chemical Theory and Computation sets contain the same number of exponents, but the large basis sets include additionally functions with higher angular momentum for the large exponents. For hydrogen, an alternative auxiliary basis (m+-aux), that contains higher polarization functions, is also available. The latter has been partly employed in combination with the TZV2P basis sets. If not indicated otherwise, the valence electrons are described at the DZVP level and the m-aux basis set is used. Normconserving pseudopotentials69−71 are employed to approximate the core−electron interactions. The Perdew−Burke−Ernzerhof (PBE)72 functional is used for the XC potential. Unless otherwise stated, the Grimme D3 correction73 is employed to model dispersion interactions. The PW cutoff has been adjusted for each type of system individually and is reported in Table 1.

Figure 3. Two of the 18 symmetry inequivalent structures of ice XV. In both cases, 2 × 2 × 2 supercells are shown along the [001] axis. (a) 2C1 structure with P1̅ symmetry. (b) 9A2 structure exhibiting Cc symmetry. Color code: H, white; O, green and red indicate distinct sublattices.

Table 1. PW Cutoffs Ecut (Ry) and Relative PW Cutoffs Erel_cut (Ry) for All Systems system

Ecut

Erel_cut

isolated molecules, (HF)2 dimer (NH3)2, (H2O)2 dimers graphene GMTKN24 benchmark ice XV molecular crystals H2O boxes (benchmark) 128H2O box (MD)

400 600 500 600 800 800 400 300

40 40 40 40 50 60 50 50

the theoretically predicted 9A2 structure80 are shown. For the ice XV structures, we use the labeling conventions introduced in refs 79 and 80, where the letters A, B, and C indicate three nonequivalent hydrogen networks. The unit cell of ice XV consists of 10 water molecules and has been extended to a 2 × 2 × 2 supercell. The structures and the supercell parameters have been retrieved from ref 80, where the geometry and the volume were relaxed under 1 GPa isotropic external pressure at the RI-MP2 level of theory. The relaxed cells are triclinic slightly deviating from an orthorhombic configuration. Net dipole moments for these cells are obtained by summing up the individual molecular dipole moments. The latter are computed from the centers of the atoms and the centers of the maximally localized Wannier orbitals (MLWO).81,82 4.2.5. Molecular Crystals. Cohesive energies and lattice parameters of five different molecular crystals are studied: NH3, CO2, formic acid, urea, and succinic anhydride. Their geometries and supercells are displayed in Figure 4. The

The optimization of the molecular orbitals is achieved with the orbital transformation (OT) technique74 for all systems, except graphene. In the latter case, the KS matrix is diagonalized. Periodic boundary conditions are applied in all spatial directions for the condensed matter systems. 4.2. Systems and Properties Investigated. 4.2.1. Dimers. The accuracy of LRIGPW for intermolecular distances is studied for the hydrogen-bonded dimers of HF, NH3, and H2O. The ammonia dimer has a very flat potential energy surface and several minima exist. The minimum structure of the ammonia dimer considered here has D2h symmetry, see ref 75. 4.2.2. Graphene. The equilibrium lattice constant of graphene is optimized employing a 12 × 12 supercell. A vacuum of 25 Å is added in the direction perpendicular to the surface of the graphene slab to avoid spurious interactions with the periodic images. 4.2.3. GMTKN24 Benchmark Suite. Reaction energies are studied for the WATER27, DARC, and S22 subsets of the GMTKN24 benchmark database.76 The WATER27 subset assesses the decomposition energies of neutral (H2O)n, positively H3O+(H2O)n, and negatively charged OH−(H2O)n water clusters. The DARC subset is a collection of different Diels−Alder reactions. S22 includes, among others, the binding energies of hydrogen-bonded dimers. For the labeling of the reactions see ref 76. In this case, the PBE calculations are carried out without D3 correction. The relative difference between the LRIGPW and GPW reaction energies is calculated as ΔGPW rel

|ΔEGPW − ΔE LRIGPW | = ΔEGPW

Figure 4. Supercells of molecular crystals studied and number of molecules Nmol per cell. (a) NH3, 2 × 2 × 2 supercell, Nmol = 32. (b) CO2, 2 × 2 × 2 supercell, Nmol = 32. (c) Formic acid, 1 × 3 × 2 supercell, Nmol = 24. (d) Urea, 2 × 2 × 2 supercell, Nmol = 16. (e) Succinic anhydride, 2 × 2 × 1 supercell, Nmol = 16. Color code: C orange, O red, N blue, H white.

experimental structures of the formic acid, urea, and succinic anhydride crystals have been retrieved from the Cambridge Structural Database.83 The respective reference codes are FORMAC01, UREAXX09, and SUCANH15. The structures of NH3 and CO2 are obtained from the experimental lattice parameters and the information on the space group, see ref 34. The geometries and lattice vectors have been relaxed at the GPW level starting from the experimental cell parameters. The

(35)

4.2.4. Ice XV. Ice XV is one of the high-pressure ice phases and occurs in the ∼1 GPa regime.77−80 There are 18 symmetry inequivalent structures possible within the ice XV unit cell. In Figure 3, the experimentally determined 2C1 structure77 and 2207

DOI: 10.1021/acs.jctc.7b00148 J. Chem. Theory Comput. 2017, 13, 2202−2214

Article

Journal of Chemical Theory and Computation

and XC energies) are given in Table S1. The energy values are obtained for different combinations of the DZVP and TZV2P primary basis sets and of the medium- and large-sized fit sets. For the single molecules, the total energies are computed for structures relaxed at the GPW-DZVP level. The total energies of the water samples are computed for single snapshots of structures equilibrated with the TIP5P88 force field. The LRI error is mostly less than 100 μEH (∼0.26 kJ/mol), for DZVP combined with the m-aux basis set. Employing larger auxiliary basis sets, the errors become smaller than 80 μEH. The TZV2P basis set captures more features of the electronic structure and should be more difficult to fit. Nevertheless, the deviations are only slightly larger. The LRI errors are in the range reported for global RI schemes applied to the Coulomb part, where total energies are affected by errors of less than 200 μEH.16,17 Further reduction of the error to less than 60 μEH has been reported with carefully optimized auxiliary basis sets.18 The errors in the total energies are expected to be smaller when using a larger auxiliary basis set. This is generally the case, but numerical instabilities associated with larger auxiliary basis sets can introduce additional errors of several μEH in a few cases. The main reason for this behavior is the already discussed ill-conditioned overlap matrix S, due to near-linear dependencies occurring for very large basis sets. However, the LRIGPW accuracy is always improved by using larger auxiliary basis sets for chemically relevant properties such as relative energies and structural parameters. 5.1.2. Structure Parameters. The bond lengths of the isolated molecules are reproduced with an error smaller than 0.6 mÅ, and the angles deviate by less than 0.14°, see Table S2 (SI). For comparison, the global RI errors are reported to be smaller than 1 mÅ for the bond length and 0.1° for the angles.16−18 Bond lengths larger or smaller than the equilibrium distance are also well reproduced, see Figure S2, where the dissociation curve of HF is reported. The accurate description of intermolecular interactions is of major importance for condensed phase systems. In order to evaluate the LRIGPW accuracy for intermolecular properties, the geometries of the HF, NH3, and H2O dimers are compared to those obtained by GPW. The dimers are optimized at the DZVP and TZV2P levels. Their geometries are displayed in Figure 5, and the intermolecular structure parameters, such as

counterpoise (CP) corrected cohesive energies are computed for the relaxed structures at a given volume V as33,34 Ecoh(V ) =

Esupercell(V ) Nmol

gas − Emol − ECP

(36)

where Nmol is the number of molecules in the supercell, Esupercell(V) is the total energy of the supercell, and Egas mol is the gas phase energy of an isolated, relaxed molecule. The CP correction,84 which is introduced to account for the basis set superposition error85−87 (BSSE), is defined as crystal crystal ECP = Emol + ghost (V ) − Emol (V )

Ecrystal mol+ ghost(V)

(37)

Ecrystal mol (V)

where and are the energies of one single molecule in the crystal geometry, with and without ghost functions. For NH3 and CO2, the first two coordination shells, i.e. the 12 nearest neighbors, are considered as ghost atoms, whereas for the other molecular crystals only atoms of the first coordination shell are included, see refs 33 and 34 for further details. 4.2.6. Water. Benchmark calculations are performed for orthorhombic liquid water boxes with a density of 1.0 g/cm3 increasing the size from 32 to 512 water molecules. MD simulations of liquid water are carried out within the canonical (NVT) ensemble for the water box with 128 molecules. The temperature is set to 330 K, and the time step is 0.5 fs. The system has been equilibrated for about 4 ps. Structural features are sampled from, at least, the last 25 ps of the simulation.

5. RESULTS AND DISCUSSION 5.1. Total Energies and Structure Parameters. Differences between exact and fitted density are usually very small unless the fit breaks down, e.g., due to a too small auxiliary basis set. The total residual ∑ABDAB is usually in the range of 10−7− 10−6 au and is not a good measure for the accuracy. The error caused by the LRI approach is better assessed by comparing molecular energies and structure parameters. 5.1.1. Total Energies. Total energies of a few small isolated molecules and of liquid water boxes are compared to the GPW value. The energy differences per atom are reported in Table 2. Further details on the individual contributions (e.g., Hartree Table 2. Differences in Total Energies ΔE = |EGPW−ELRI| (μEH) per Atom Using Two Different Primary and Two Different Auxiliary Basis Setsa ΔE(DZVP) HF HCl O2 CO2 H2O NH3 CH4 borazineb (32H2O)liq (128H2O)liq (512H2O)liq

ΔE(TZV2P)

m-aux

l-aux

m-aux

l-aux

263 72 1 90 112 71 89 497 41 68 27

66 75 74 2 59 60 76

258 71 6 191 116 75 86

70 77 72 99 56 64 71

62 35 57

190 51 164

202 157 188

Figure 5. Dimer geometries. (a) HF dimer, (b) NH3 dimer, and (c) H2O dimer. Atoms of the H-bond acceptor molecule are labeled with a, and atoms of the donor molecule are labeled with d. Color code: F cyan, H white, N blue, O red.

internuclear distances and angles, are given in Table 3. The interactions between these dimers are mediated by hydrogen bonds. Linear hydrogen bonds are characteristic for (HF)2 and (H2O)2, whereas the ammonia dimer is nonlinearly bound. The geometry parameters obtained for (HF)2, (H2O)2, and (NH3)2 in this work are in good agreement with previously published results.75,89 For (HF)2, (H2O)2, and (NH3)2, the LRI internuclear distances computed at the DZVP level are affected by errors smaller than 5 mÅ, and the error of the LRI angles is smaller

ΔE is calculated for isolated molecules and liquid water (liq) boxes of different sizes. bl-aux and TZV2P basis sets are not available for B. a

2208

DOI: 10.1021/acs.jctc.7b00148 J. Chem. Theory Comput. 2017, 13, 2202−2214

Article

Journal of Chemical Theory and Computation

Table 3. Intermolecular Geometry Parameters for the (HF)2, (NH3)2, and (H2O)2 Obtained with Two Different Primary Basis Setsa DZVP quantity

GPW

TZV2P

LRIGPW

Δ

GPW

LRIGPW

Δ

(HF)2

R(F···F) R(Fa···Hd) θHdFdFa

(NH3)2

R(N···N) θHdNdNa

3.157 42.29

3.154 42.36

0.003 0.07

3.160 42.22

3.158 42.25

0.002 0.03

(H2O)2

R(O···O) θHdOdOa

2.923 5.26

2.928 5.10

0.005 0.16

2.922 5.03

2.917 3.79

0.005 1.24

θHaFaHd

θdipb

2.762 1.822 5.56

2.765 1.826 5.66

112.71

112.81

120.37

0.003 0.004 0.10 0.10

120.30

0.07

2.762 1.828 5.55 112.29

119.50

2.766 1.833 5.80 112.58

118.26

0.004 0.005 0.25 0.29

1.24

For LRIGPW, the m-aux basis set is used (except: m -aux for H/(H2O)2 at the TZV2P level). Distances R are given in Å, and angles θ are given in degrees. Site names and labels are specified in Figure 5. bθdip: angle between the OO-axis and the bisection line of the HOH angle of the H-bond acceptor water molecule. a

+

reproduced with the larger auxiliary basis set. The remaining differences can be attributed to the limited length of the trajectories, i.e. incomplete sampling. In order to test the performance of LRIGPW for 2D periodic materials, a lattice parameter optimization of graphene is carried out. The total energy of the graphene layer has been evaluated by varying progressively the lattice parameter from 2.40 to 2.55 Å and accordingly rescaling the coordinates. The energy has been plotted as a function of the lattice constants, showing that the equilibrium is reached for a0 between 2.46 and 2.47 with both methods, see Figure 7. However, for larger lattice constants, the LRIGPW curve deviates slightly from the GPW results, suggesting increasing errors for longer C−C bond lengths.

than 0.16°. At the TZV2P level, the LRI errors are in most cases not significantly larger. However, the features of the electronic density are more sophisticated, and more diffuse auxiliary functions have been required for hydrogen to reproduce the structure of (H2O)2 correctly. We can argue that we have almost reached the basis set limit with the TZV2P basis sets, see ref 68. The differences between the GPW structures computed with the DZVP and TZV2P basis set can be taken as a measure for the incompleteness of the DZVP basis set, which is already a high-quality basis set. The GPW differences are up to 6 mÅ and 0.87° for distances and angles, respectively. Thus, the error margins due to the incompleteness of the DZVP basis set are larger than the error of the fit. Even more interesting is the analysis of the structural properties averaged over properly extended MD trajectories. Figure 6 displays the radial distribution functions (RDFs)

Figure 7. Lattice optimization for a graphene sheet employing the maux basis set at the DZVP level. Displayed are the relative energies. E0 is the total energy at the equilibrium lattice constant a0. Figure 6. RDFs for liquid water obtained with GPW and LRIGPW at the DZVP level. (a) Oxygen−oxygen and (b) oxygen−hydrogen RDFs.

5.2. Reaction Energies. Several decomposition energies have been calculated for the WATER27 subset of the GMTKN24 benchmark suite. The absolute and relative errors with respect to the GPW results are plotted in Figure 8 for two different combinations of primary and auxiliary basis sets. The relative LRI error ΔGPW is typically less than 2% and never rel larger than 5% using DZVP in combination with the m-aux basis sets. The agreement is quite good already at this level of approximation, in particular considering that a 5% deviation corresponds to an absolute error of only 3.6 kJ/mol, as shown Figure 8(a). Employing a larger auxiliary basis set, the maximal

sampled over the NVT trajectories of the water box with 128 H2O molecules. The RDFs obtained with LRIGPW in combination with the m-aux basis sets show small differences from the GPW results. The lower and smoother RDF profile indicates that the liquid is less structured, i.e., the intermolecular interactions (hydrogen bonds) are somewhat underestimated. However, the GPW-RDFs are almost perfectly 2209

DOI: 10.1021/acs.jctc.7b00148 J. Chem. Theory Comput. 2017, 13, 2202−2214

Article

Journal of Chemical Theory and Computation

Figure 9. Results for the 18 symmetry inequivalent ice XV structures at the DZVP level. (a) Relative energies Erel per molecule are defined as (E − Eaver)/Nmol, where E is the total energy of the supercell, Eaver = ∑iE(i)/18 is the average of all 18 structures, and Nmol is the number of molecules in the supercells. (b) Net dipole moments calculated from MLWO. Note that the curves in (b) are exactly on top of each other. Figure 8. Decomposition energies for the GMTKN24 subset and (b) relative errors ΔGPW WATER27. (a) Absolute errors ΔGPW abs rel with respect to GPW at the DZVP level using the m-aux and (c) l-aux basis sets. (d) Relative errors at the TZV2P level.

structures of 2A1, 2B1, and 2C1 have a net dipole moment of exactly zero, while the slightly polar structures 9A1 and 9B2 have very small residual dipoles. 5.4. Molecular Crystals. 5.4.1. Cohesive Energies. The five investigated molecular crystals are characterized by intermolecular interactions of different nature, i.e., electrostatic polarization, hydrogen bonding, and dispersion forces. The computed and measured cohesive energies, Ecoh, are listed in the upper part of Table 4. The experimental values are derived from sublimation energies ΔHs. When comparing to experiment, it is to note that the temperature-dependence and zeropoint vibrational energies are not considered for the computed values. The cohesive energies of molecular crystals have been calculated at the MP2 level previously, and good agreement with experimental values has been reported for the five crystals under investigation.33−35,90 However, even at the PBE-D3 level, the computed Ecoh compare reasonably well to experiment. In fact, the results for formic acid and succinic anhydride are in remarkably good agreement with the experimental values. LRIGPW reproduces the cohesive energies computed with GPW very well. The absolute differences are smaller than 5 kJ/ mol, and relative deviations are smaller than 5%. By increasing the size of the auxiliary basis sets, the maximal deviation is reduced to less than 2 kJ/mol corresponding to a relative difference of 2%. The absolute errors are in any case within the margins of the CP correction ECP, see Table 4. Note that ECP is determined by the primary basis sets. Within the error range of the fit, the same ECP are obtained with LRIGPW. 5.4.2. Lattice Vectors. The cell relaxation with both, GPW and LRIGPW, preserves the orthorhombic or, in the case of NH3 and CO2, cubic symmetry. Only for urea, the lattice vectors deviate by 1−2° from an orthorhombic symmetry. Comparing the GPW results to experiment, relatively large deviations of up to 3% are observed. One source of error is the incompleteness of the basis. See ref 35 for a discussion of the effect of the basis set on the cell optimization of molecular crystals. Using LRIGPW in combination with the m-aux basis sets, the lattice parameters did not converge for formic acid and succinic anhydride, and many steps have been required for the other crystals. Providing more flexibility by employing the l-aux basis

deviation with respect to GPW is reduced to 1.3%. At the TZV2P level, the decomposition energies are also well reproduced using the medium-sized auxiliary basis sets with relative deviations smaller than 7%. At the DZVP level, one of the largest absolute deviations (13.1 kJ/mol) is observed for the decomposition energy of the (H2O)20 cluster, reaction 14, see Figure 8(a). The absolute difference between the GPW results obtained with the DZVP and TZV2P basis sets is 27.8 kJ/mol for this reaction. Assuming that the basis limit is reached with the TZV2P basis set, we conclude that the LRI error is significantly smaller than the basis set error. Furthermore, the absolute error of 13.1 kJ/mol corresponds to a relative deviation of only 1.4%. LRIGPW can also accurately reproduce the GPW reaction energies for organic compounds as demonstrated by Figures S3 and S4 displaying the results for the DARC and S22 subsets. 5.3. Ice XV. The relative stability of the 18 symmetry inequivalent ice XV structures is rather subtle, and reproducing it provides a tough test for the LRIGPW approach. In Figure 9(a) we report the change in the total energies per molecule with respect to a mean value obtained by averaging over all the considered structures. The black curve shown Figure 9(a) resembles closely the curve obtained with PBE in ref 80. Small deviations must be attributed to different basis sets and the inclusion of the D3 correction. The differences between the 18 structures are in a small range of 1 kJ/mol. GPW and LRIGPW consistently predict that 9A2 is the global minimum structure. Using the m-aux basis sets, deviations are observed for the structures with C-type and partly with A-type hydrogen bonding. Increasing the auxiliary basis sets, the accuracy is noticeably improved, and only small deviations remain for the 4A1, 7A1, and 9A1 structures. The net dipole moments of the supercells are presented in Figure 9(b). Their values are insensitive to the inaccuracies introduced by the density fit. LRIGPW and GPW dipoles match exactly. In particular, we consistently find that the apolar 2210

DOI: 10.1021/acs.jctc.7b00148 J. Chem. Theory Comput. 2017, 13, 2202−2214

Article

Journal of Chemical Theory and Computation Table 4. CP Corrected Cohesive Energies Ecoh (kJ/mol) for Structures and Cells Relaxed at the DZVP-GPW Level and Lattice Parameter abc (Å) of the Unit Cell for NH3, CO2, Formic Acid (FA), Urea (U), and Succinic Anhydride (SA)a GPW Ecoh

CP

LRIGPW

LRIGPW

(m-aux)

(l-aux)

Ecoh

ΔGPW

Ecoh

ΔGPW

Ecoh

−44.3 −24.1 −68.7 −111.0 −81.3 GPW abc

abc

δGPW %

abc

δGPW %

abc

NH3 CO2

5.035 5.720 10.328 3.604 5.374 5.572 5.572 4.704 5.385 6.960 11.700

4.941 5.626

1.9 1.6

5.522 5.522 4.704

0.9 0.9 0.0

4.986 5.656 10.236 3.561 5.354 5.547 5.547 4.692 5.391 6.954 11.540

1.0 1.1 0.9 1.2 0.4 0.4 0.4 0.2 0.1 0.1 1.4

5.048 5.550 10.241 3.544 5.356 5.645 5.645 4.704 5.426 6.975 11.717

U

SA

0.6 0.2 2.1 4.7 3.8

−44.3 0.0 −23.4 0.7 −69.8 1.1 −111.0 0.0 −83.2 1.9 LRIGPW (l-aux)

exptb

NH3 CO2 FA U SA

FA

−1.5 −44.9 −1.8 −23.9 −3.5 −70.8 −4.2 −106.3 −3.5 −85.1 LRIGPW (m-aux)

results for Ecoh, it is reasonable to assume that the error introduced by the fit is smaller than the error due to the finite size of the primary basis set. The latter might be relevant since the cells have been relaxed at the DZVP level. Note that BSSEfree equilibrium lattice constants can be in principle obtained from an indirect approach, where the CP corrected cohesive energies are calculated at various fixed volumes for the respective equilibrium geometry. The equilibrium lattice constant is then obtained by fitting the Ecoh − V curves to an appropriate model function.33,35 However, the indirect approach is computationally more expensive. 5.5. Computational Efficiency of LRIGPW. The SCF convergence with GPW and LRIGPW requires approximately the same number of SCF iterations. Slower convergence is only observed for systems with an extremely ill-conditioned overlap matrix S, but the problem can be solved applying the pseudoinverse. The total run times reported in Figure 10(a) refer to single point calculations of water boxes with NH2O molecules, which are carried out on NH2O/4 cores of a Cray XC40 machine. The major objective of LRIGPW is to reduce the computational cost for the grid-based operations as discussed in Section 2.1, and indeed, the time required for the collocation of the density and the integration of the potential is drastically reduced with LRIGPW and becomes in fact negligible, see Figure 10(a). The grid operations are speeded up by a factor of 40−90 with respect to GPW. These speed-ups do not directly translate in an acceleration of the SCF iteration because LRI introduces also additional overhead. The density has to be fitted at each SCF step, and eq 8 has to be solved for each pair. Moreover, summing up the expansion coefficients in order to obtain {ãAi }, see eq 18, requires additional MPI communication since the pairs are distributed.

−36.3 −31.1 −68 −92 −81 exptc

The CP correction and the difference ΔGPW with respect to GPW are given in kJ/mol as well. Note that for NH3 and CO2 a = b = c. δGPW is the following ratio: |aGPW − aLRIGPW|/aGPW. bFor experimental Ecoh see refs 33, 35, 90, and 91. cFor experimental abc see refs 33, 35, and the Cambridge Structural Database.83

a

sets, the cell parameters converged smoothly and differ typically by less than 1% from the GPW results. With reference to the

Figure 10. Timings for GPW and LRIGPW on a Cray XC40 machine at the DZVP/m-aux level. (a) Total GPW run time and time for grid-based operations for liquid water boxes of different sizes. (b) Speed-up of the SCF step for an isolated water cluster of 20 molecules and liquid water. Speed-up of the (c) SCF step, (d) single point (SP) calculation, and (e) MD or geometry optimization (G-OPT) step as a function of the grid cutoff for liquid water, formic acid (FA), urea, and ice XV crystals. 2211

DOI: 10.1021/acs.jctc.7b00148 J. Chem. Theory Comput. 2017, 13, 2202−2214

Article

Journal of Chemical Theory and Computation The speed-up of the SCF step is dependent on the periodicity and system size. For isolated systems, the speedup is negligible since the number of pairs is not much larger than the number of atoms. On the contrary, a very large number of pairs is generated for condensed matter systems like liquid water. For the medium-sized water boxes, the SCF step is accelerated by a factor of 3−4, see Figure 10(b). The speed-up is noticeably smaller for large water boxes since the GPW grid operations contribute less to the total run time and we can gain less by the LRI. The grid operations in the GPW approach require more than 90% of the total run time for 128 H2O compared to 66% for 512 H2O, see Figure 10(a). The computational cost for the diagonalization of the KS matrix or, in this case, the calculation of the preconditioner for the OT approach scales O(N3) and increasingly contributes for large systems. The efficiency of LRIGPW depends also on the size of the auxiliary basis sets. Larger basis sets do not increase the time for the grid operations significantly but have an impact on the density fitting operations. The speed-up factors are a bit smaller for the l-aux basis set, see Figure 10(b). In Figure 10(c)-(e), the speed-up of the SCF step, for a single point (SP) calculation and for the MD and geometry optimization (G-OPT) step, is reported dependent on the grid cutoff. Generally, LRIGPW is increasingly efficient for large cutoffs. The efficiency depends also on the symmetry of the simulation cells. The largest speed-ups are obtained for nonorthorhombic cells. For example, for the triclinic ice XV cells, the SCF step is accelerated by up to a factor of 30. Speedups by a factor of 20 are also observed for the urea crystal, which deviates slightly from the orthorhombic shape after relaxation of the cell vectors. The reason for this behavior is that the grid operations can be simplified for orthorhombic cells by separating the spatial directions, thereby reducing the computational cost. The GPW grid operations are thus more expensive for triclinic cells, while the computational cost is still marginal for the LRIGPW grid operations. Since the LRI overhead, i.e. solving eq 8, is independent of the cell symmetry, LRIGPW is more efficient with respect to GPW for nonorthorhombic boxes. The computational efficiency for a fully converged SP calculation is presented in Figure 10(d), where we observe a maximal speed-up factor of 15 for ice XV. The speed-up is smaller than for the SCF step since the initialization of the LRI integrals, eqs 31−34, and the inversion of the overlap matrix S introduce computational overhead prior to the SCF iteration. LRIGPW is especially beneficial when many iterations are required until the electronic structure is converged. For an MD or G-OPT run, the SCF cycle converges quickly because the wave function can be extrapolated from the previous step. The speed-up of the MD and G-OPT steps has been therefore measured for a maximum of 8 (128H2O) and 10 (ice XV and urea) SCF steps to ensure a fair comparison. Even though the number of SCF steps is small, we observe a performance enhancement by a factor of up to 8 for the nonorthorhombic cells. However, the MD step is also accelerated for the orthorhombic 128H2O box, if the cutoff is larger than 600 Ry. In LRIGPW, the computational cost for the MD step is dominated by the initialization step, i.e. the computation of the inverse of the overlap matrix and, to a significantly smaller extent, the evaluation of the LRI integrals and their derivatives, see Figure S5 for details.

Figure 11. Scaling of the LRIGPW approach (a) with respect to system size and (b) number of processes on a Cray XC40 machine. Execution times are measured for a single point calculation at the DZVP/m-aux level.

The scaling behavior of LRIGPW is displayed in Figure 11(a). The LRI approach scales linearly with respect to system size for liquid water boxes with up to 128 molecules. For smalland medium-sized systems, the computational cost is dominated by the construction of the KS matrix, which also scales linearly. With larger system size, the nonlinearly scaling procedures for optimizing the wave function such as the diagonalization of the KS matrix or, in this case, the evaluation of the preconditioner for the OT method become predominant. Figure 11(b) shows that LRIGPW scales also well with respect to the number of processes.

6. CONCLUSIONS A local density fitting approach has been introduced within GPW in order to reduce the computational cost for the construction of the KS matrix. For periodic systems, the gridbased part of the calculation, such as the collocation of the density and integration of the potential, clearly dominates the calculation of the KS matrix and the overall run time. The computational demands for these operations are massively decreased with LRI, since the density is expanded in single center terms, rather than in terms of pairs. The accuracy of LRIGPW has been assessed for a wide range of systems. Total energies, reaction energies, and intramolecular and intermolecular structure parameters are well reproduced by LRIGPW. This has also been found for the cohesive energies of five different molecular crystals. The difference to the GPW result is significantly smaller or at least not larger than the CP correction. Good agreement with respect to the GPW reference has been also found for the relative energies of the 18 symmetry inequivalent structures of ice XV. Especially, the net dipole moments are exactly reproduced. Furthermore, LRIGPW is applicable for lattice optimizations of 2D and 3D periodic structures. The medium-sized auxiliary basis sets appeared to be applicable for almost all systems and properties. Employing larger auxiliary basis sets improves the results, particularly for 2212

DOI: 10.1021/acs.jctc.7b00148 J. Chem. Theory Comput. 2017, 13, 2202−2214

Article

Journal of Chemical Theory and Computation

(7) Boys, S. F.; Shavitt, I. University of Wisconsin Naval Research Laboratory Report WIS-AF-13; 1959. Retrievable from https://ntrl.ntis. gov/NTRL/ (accessed Apr 11, 2017). (8) Newton, M. D. J. Chem. Phys. 1969, 51, 3917−3926. (9) Billingsley, F. P.; Bloor, J. E. J. Chem. Phys. 1971, 55, 5178−5190. (10) Whitten, J. L. J. Chem. Phys. 1973, 58, 4496−4501. (11) Sambe, H.; Felton, R. H. J. Chem. Phys. 1975, 62, 1122−1126. (12) Dunlap, B. I.; Connolly, J. W. D.; Sabin, J. R. J. Chem. Phys. 1979, 71, 3396−3402. (13) Dunlap, B. I.; Connolly, J. W. D.; Sabin, J. R. J. Chem. Phys. 1979, 71, 4993−4999. (14) Baerends, E. J.; Ellis, D. E.; Ros, P. Chem. Phys. 1973, 2, 41−51. (15) Vahtras, O.; Almlöf, J.; Feyereisen, M. W. Chem. Phys. Lett. 1993, 213, 514−518. (16) Eichkorn, K.; Treutler, O.; Ö hm, H.; Häser, M.; Ahlrichs, R. Chem. Phys. Lett. 1995, 240, 283−290. (17) Eichkorn, K.; Weigend, F.; Treutler, O.; Ahlrichs, R. Theor. Chem. Acc. 1997, 97, 119−124. (18) Weigend, F. Phys. Chem. Chem. Phys. 2006, 8, 1057−1065. (19) Dunlap, B. I.; Palenik, M. C. Phys. Rev. B: Condens. Matter Mater. Phys. 2016, 93, 195162. (20) Weigend, F. Phys. Chem. Chem. Phys. 2002, 4, 4285−4291. (21) Ren, X.; Rinke, P.; Blum, V.; Wieferink, J.; Tkatchenko, A.; Sanfilippo, A.; Reuter, K.; Scheffler, M. New J. Phys. 2012, 14, 053020. (22) Feyereisen, M.; Fitzgerald, G.; Komornicki, A. Chem. Phys. Lett. 1993, 208, 359−363. (23) Weigend, F.; Häser, M. Theor. Chem. Acc. 1997, 97, 331−340. (24) Weigend, F.; Häser, M.; Patzelt, H.; Ahlrichs, R. Chem. Phys. Lett. 1998, 294, 143−152. (25) Bernholdt, D. E.; Harrison, R. J. J. Chem. Phys. 1998, 109, 1593−1600. (26) Werner, H.-J.; Manby, F. R.; Knowles, P. J. J. Chem. Phys. 2003, 118, 8149−8160. (27) Pisani, C.; Busso, M.; Capecchi, G.; Casassa, S.; Dovesi, R.; Maschio, L.; Zicovich-Wilson, C.; Schütz, M. J. Chem. Phys. 2005, 122, 094113. (28) Maschio, L.; Usvyat, D.; Manby, F. R.; Casassa, S.; Pisani, C.; Schütz, M. Phys. Rev. B: Condens. Matter Mater. Phys. 2007, 76, 075101. (29) Usvyat, D.; Maschio, L.; Manby, F. R.; Casassa, S.; Schütz, M.; Pisani, C. Phys. Rev. B: Condens. Matter Mater. Phys. 2007, 76, 075102. (30) Pisani, C.; Maschio, L.; Casassa, S.; Halo, M.; Schütz, M.; Usvyat, D. J. Comput. Chem. 2008, 29, 2113−2124. (31) Izmaylov, A. F.; Scuseria, G. E. Phys. Chem. Chem. Phys. 2008, 10, 3421−3429. (32) Katouda, M.; Nagase, S. J. Chem. Phys. 2010, 133, 184103. (33) Maschio, L.; Usvyat, D.; Schütz, M.; Civalleri, B. J. Chem. Phys. 2010, 132, 134706. (34) Del Ben, M.; Hutter, J.; VandeVondele, J. J. Chem. Theory Comput. 2012, 8, 4177−4188. (35) Del Ben, M.; Hutter, J.; VandeVondele, J. J. Chem. Phys. 2015, 143, 102803. (36) Baudin, P.; Ettenhuber, P.; Reine, S.; Kristensen, K.; Kjærgaard, T. J. Chem. Phys. 2016, 144, 054102. (37) Eshuis, H.; Yarkony, J.; Furche, F. J. Chem. Phys. 2010, 132, 234114. (38) Eshuis, H.; Bates, J. E.; Furche, F. Theor. Chem. Acc. 2012, 131, 1084. (39) Del Ben, M.; Hutter, J.; VandeVondele, J. J. Chem. Theory Comput. 2013, 9, 2654−2671. (40) Del Ben, M.; Schütt, O.; Wentz, T.; Messmer, P.; Hutter, J.; VandeVondele, J. Comput. Phys. Commun. 2015, 187, 120−129. (41) Wilhelm, J.; Seewald, P.; Del Ben, M.; Hutter, J. J. Chem. Theory Comput. 2016, 12, 5851−5859. (42) van Setten, M. J.; Weigend, F.; Evers, F. J. Chem. Theory Comput. 2013, 9, 232−246. (43) Wilhelm, J.; Del Ben, M.; Hutter, J. J. Chem. Theory Comput. 2016, 12, 3623−3635. (44) Sodt, A.; Head-Gordon, M. J. Chem. Phys. 2008, 128, 104106.

cell relaxations and structural properties extracted from MD trajectories such as RDFs. Generally, the medium-sized auxiliary basis sets yield accurate results for primary basis sets of DZVP as well as TZV2P quality. However, providing more flexibility by adding more functions might be occasionally necessary at the TZV2P level. For LRIGPW, the computational cost of the grid-dependent operations becomes negligible. This is reflected in a speed-up of the SCF step which depends on the symmetry of the simulation cell, the size of the system, and the grid cutoff. LRIGPW is most efficient for nonorthorhombic cells and large grid cutoffs, where speed-ups of the SCF step by a factor of 30 can be obtained. Since the initialization procedure of the LRI introduces computational overhead prior to the SCF iteration, LRIGPW is particularly beneficial when many SCF iterations are required to converge the wave function. SP calculations are up to 15 times faster with LRI and MD or G-OPT steps are accelerated by a factor of up to 8. The LRIGPW approach has been designed for the computation of extended condensed phase systems where compromises on, e.g., the size of the primary basis sets are required to run the calculation in a feasible time. Employing LRI, the computation can be speeded-up significantly, while introducing a relatively small additional error.



ASSOCIATED CONTENT

S Supporting Information *

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jctc.7b00148. Flowchart for calculating the inverse or pseudoinverse of the overlap matrix, Figure S1; details on individual energy contributions, Table S1; intramolecular structure parameters, Table S2; dissociation curve of HF, Figure S2; reaction energies for the DARC subset, Figure S3, and the S22 subset, Figure S4; details on timings for the MD/G-OPT step, Figure S5; primary and auxiliary basis sets, Tables S3−S32 (PDF)



AUTHOR INFORMATION

Corresponding Author

*E-mail: dorothea.golze@aalto.fi. ORCID

Dorothea Golze: 0000-0002-2196-9350 Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS We thank the Swiss National Supercomputing Center (CSCS) for providing computational resources and Andreas Glöß for technical support. D. Golze acknowledges financial support by the Swiss National Science Foundation.



REFERENCES

(1) Hohenberg, P.; Kohn, W. Phys. Rev. 1964, 136, B864−B871. (2) Kohn, W.; Sham, L. J. Phys. Rev. 1965, 140, A1133−A1138. (3) Mulliken, R. S. J. Chim. Phys. 1949, 46, 497−542. (4) Ayed, O.; Bernard, E.; Silvi, B. J. Mol. Struct.: THEOCHEM 1986, 135, 159−168. (5) Rüdenberg, K. J. Chem. Phys. 1951, 19, 1433−1434. (6) Löwdin, P.-O. J. Chem. Phys. 1953, 21, 374−375. 2213

DOI: 10.1021/acs.jctc.7b00148 J. Chem. Theory Comput. 2017, 13, 2202−2214

Article

Journal of Chemical Theory and Computation

(82) Marzari, N.; Vanderbilt, D. Phys. Rev. B: Condens. Matter Mater. Phys. 1997, 56, 12847−12865. (83) Allen, F. H. Acta Crystallogr., Sect. B: Struct. Sci. 2002, 58, 380− 388. (84) Boys, S. F.; Bernardi, F. Mol. Phys. 1970, 19, 553−566. (85) Jansen, H. B.; Ros, P. Chem. Phys. Lett. 1969, 3, 140−143. (86) Liu, B.; McLean, A. D. J. Chem. Phys. 1989, 91, 2348−2359. (87) van Duijneveldt, F. B.; van Duijneveldt-van de Rijdt, J. G. C. M.; van Lenthe, J. H. Chem. Rev. 1994, 94, 1873−1885. (88) Mahoney, M. W.; Jorgensen, W. L. J. Chem. Phys. 2000, 112, 8910−8922. (89) Rankin, K. N.; Boyd, R. J. J. Comput. Chem. 2001, 22, 1590− 1597. (90) Maschio, L.; Civalleri, B.; Ugliengo, P.; Gavezzotti, A. J. Phys. Chem. A 2011, 115, 11179−11186. (91) http://www.webbook.nist.gov/chemistry (accessed April 11, 2017).

(45) Merlot, P.; Kjærgaard, T.; Helgaker, T.; Lindh, R.; Aquilante, F.; Reine, S.; Pedersen, T. B. J. Comput. Chem. 2013, 34, 1486−1496. (46) Manzer, S. F.; Epifanovsky, E.; Head-Gordon, M. J. Chem. Theory Comput. 2015, 11, 518−527. (47) Ihrig, A. C.; Wieferink, J.; Zhang, I. Y.; Ropo, M.; Ren, X.; Rinke, P.; Scheffler, M.; Blum, V. New J. Phys. 2015, 17, 093020. (48) Levchenko, S. V.; Ren, X.; Wieferink, J.; Johanni, R.; Rinke, P.; Blum, V.; Scheffler, M. Comput. Phys. Commun. 2015, 192, 60−69. (49) Rebolini, E.; Izsák, R.; Reine, S. S.; Helgaker, T.; Pedersen, T. B. J. Chem. Theory Comput. 2016, 12, 3514−3522. (50) Polly, R.; Werner, H.-J.; Manby, F. R.; Knowles, P. Mol. Phys. 2004, 102, 2311−2321. (51) Sodt, A.; Subotnik, J. E.; Head-Gordon, M. J. Chem. Phys. 2006, 125, 194109. (52) Franchini, M.; Philipsen, P. H. T.; van Lenthe, E.; Visscher, L. J. Chem. Theory Comput. 2014, 10, 1994−2004. (53) Mejía-Rodríguez, D.; Köster, A. M. J. Chem. Phys. 2014, 141, 124114. (54) Guerra, C. F.; Snijders, J. G.; te Velde, G.; Baerends, E. J. Theor. Chem. Acc. 1998, 99, 391−403. (55) te Velde, G.; Bickelhaupt, F. M.; Baerends, E. J.; Guerra, C. F.; van Gisbergen, S. J. A.; Snijders, J. G.; Ziegler, T. J. Comput. Chem. 2001, 22, 931−967. (56) Giese, T. J.; York, D. M. J. Chem. Phys. 2011, 134, 194103. (57) Jung, Y.; Sodt, A.; Gill, P. M. W.; Head-Gordon, M. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 6692−6697. (58) Reine, S.; Tellgren, E.; Krapp, A.; Kjærgaard, T.; Helgaker, T.; Jansik, B.; Høst, S.; Salek, P. J. Chem. Phys. 2008, 129, 104101. (59) Dunlap, B. I. J. Mol. Struct.: THEOCHEM 2000, 529, 37−40. (60) VandeVondele, J.; Krack, M.; Mohamed, F.; Parrinello, M.; Chassaing, T.; Hutter, J. Comput. Phys. Commun. 2005, 167, 103−128. (61) Lippert, G.; Hutter, J.; Parrinello, M. Mol. Phys. 1997, 92, 477− 487. (62) The CP2K developers group. CP2K is freely available from http://www.cp2k.org/ (accessed November 2016). (63) Hutter, J.; Iannuzzi, M.; Schiffmann, F.; VandeVondele, J. WIREs Comput. Mol. Sci. 2014, 4, 15−25. (64) Laikov, D. N. Chem. Phys. Lett. 1997, 281, 151−156. (65) Giese, T. J.; York, D. M. J. Chem. Phys. 2008, 128, 064104. (66) Golze, D.; Benedikter, N.; Iannuzzi, M.; Wilhelm, J.; Hutter, J. J. Chem. Phys. 2017, 146, 034105. (67) Golub, G. H.; Loan, C. F. V. Matrix computation, 2nd ed.; The Johns Hopkins Unversity Press: Baltimore, 1989; p 243. (68) VandeVondele, J.; Hutter, J. J. Chem. Phys. 2007, 127, 114105. (69) Goedecker, S.; Teter, M.; Hutter, J. Phys. Rev. B: Condens. Matter Mater. Phys. 1996, 54, 1703−1710. (70) Hartwigsen, C.; Goedecker, S.; Hutter, J. Phys. Rev. B: Condens. Matter Mater. Phys. 1998, 58, 3641−3662. (71) Krack, M. Theor. Chem. Acc. 2005, 114, 145−152. (72) Perdew, J. P.; Burke, K.; Ernzerhof, M. Phys. Rev. Lett. 1996, 77, 3865−3868. (73) Grimme, S.; Antony, J.; Ehrlich, S.; Krieg, H. J. Chem. Phys. 2010, 132, 154104. (74) VandeVondele, J.; Hutter, J. J. Chem. Phys. 2003, 118, 4365− 4369. (75) Boese, A. D.; Jansen, G.; Torheyden, M.; Höfener, S.; Klopper, W. Phys. Chem. Chem. Phys. 2011, 13, 1230−1238. (76) Goerigk, L.; Grimme, S. J. Chem. Theory Comput. 2010, 6, 107− 126. (77) Salzmann, C. G.; Radaelli, P. G.; Mayer, E.; Finney, J. L. Phys. Rev. Lett. 2009, 103, 105701. (78) Salzmann, C. G.; Radaelli, P. G.; Slater, B.; Finney, J. L. Phys. Chem. Chem. Phys. 2011, 13, 18468−18480. (79) Nanda, K. D.; Beran, G. J. O. J. Phys. Chem. Lett. 2013, 4, 3165− 3169. (80) Del Ben, M.; VandeVondele, J.; Slater, B. J. Phys. Chem. Lett. 2014, 5, 4122−4128. (81) Wannier, G. H. Phys. Rev. 1937, 52, 191−197. 2214

DOI: 10.1021/acs.jctc.7b00148 J. Chem. Theory Comput. 2017, 13, 2202−2214