Article Cite This: J. Chem. Theory Comput. 2017, 13, 5282-5290
pubs.acs.org/JCTC
Density-Based Multilevel Hartree−Fock Model Sandra Sæther,† Thomas Kjærgaard,‡ Henrik Koch,† and Ida-Marie Høyvik*,† †
Department of Chemistry, The Norwegian University of Science and Technology, Høgskoleringen 5, 7491 Trondheim, Norway qLEAP Center for Theoretical Chemistry, Department of Chemistry, Aarhus University, Langelandsgade 140, 8000 Aarhus C, Denmark
‡
ABSTRACT: We introduce a density-based multilevel Hartree−Fock (HF) method where the electronic density is optimized in a given region of the molecule (the active region). Active molecular orbitals (MOs) are generated by a decomposition of a starting guess atomic orbital (AO) density, whereas the inactive MOs (which constitute the remainder of the density) are never generated or referenced. The MO formulation allows for a significant dimension reduction by transforming from the AO basis to the active MO basis. All interactions between the inactive and active regions of the molecule are retained, and an exponential parametrization of orbital rotations ensures that the active and inactive density matrices separately, and in sum, satisfy the symmetry, trace, and idempotency requirements. Thus, the orbital spaces stay orthogonal, and furthermore, the total density matrix represents a single Slater determinant. In each iteration, the (level-shifted) Newton equations in the active MO basis are solved to obtain the orbital transformation matrix. The approach is equivalent to variationally optimizing only a subset of the MOs of the total system. In this orbital space partitioning, no bonds are broken and no a priori orbital assignments are carried out. In the limit of including all orbitals in the active space, we obtain an MO densitybased formulation of full HF.
1. INTRODUCTION Electronic-structure theory has become an integral part of molecular sciences, and it provides accurate determination of energy levels and molecular properties. However, as the limit toward larger and larger molecular systems is pursued, the conventional approach of treating the entire system at the same level of accuracy is counterproductive. Coupling high level theories with low level theories alleviates this problem, but it cannot be done in a haphazard manner. Ideally, for wave function-based methods, a single wave function for the entire system should be obtained such that the Pauli principle is fulfilled across all borders. For wave function theories, this has been done, e.g., in the multilevel coupled-cluster (MLCC) method, where different levels of the CC hierarchy are coupled together in a consistent manner.1 Other wave function in wave function (WF-in-WF) embedding schemes for correlation effects exist. In particular, we mention MP2 in Hartree−Fock (HF) by Carter and collaborators,2 the hybrid scheme of Mata et al.,3 the cluster-in-molecule approach of Li and Piecuch,4 the combination of the cluster-in-molecule approach with the use of frozen natural orbitals by Rolik and Kállay,5 CC-in-CC by Hö fener and Visscher,6 and the embedded many-body expansion by Manby and collaborators.7 Most embedding theories are developed within the framework of density functional theory (DFT). For DFT-in-DFT embedding methods, which will not be covered in this article, we refer to the review article of Wesolowski et al.8 and references therein. We briefly mention here that DFT-in-DFT © 2017 American Chemical Society
embeddings are carried out by considering an embedding operator that couples the environment to the region of interest. There are several approaches for the embedding operator such as pseudopotential approaches,9 exact local potential theory10,11 (related to subsystem DFT12 and partition DFT13), and subsystem embedding methods.14,15 WF-in-DFT is another class, where DFT is used as embedding for more accurate methods. 16−19 We refer to the work of Kál lay and collaborators19 for a comparison of DFT-in-DFT and WF-inDFT methods. In general, WF-in-WF embedding schemes found in the literature depend on having solved the self-consistent field (HF or Kohn−Sham) problem in advance. An exception to this is the HF embedding method developed by Shukla et al.,20 aimed at describing crystalline solids, which has also been extended to correlated calculations.21 Shukla et al. solved the Roothaan− Hall equations for molecular orbitals (MOs) in the reference cell in the atomic orbital (AO) basis of the reference cell and neighboring cells. Due to the diagonalization procedure, systems with large reference cells will pose problems, as the AO dimension grows quickly when including neighboring cells. Further, the environment orbitals must be generated to construct the reference cell Fock matrix. In this article, we introduce a multilevel HF method based on a density-based energy optimization. An MO formulation allows for a reduction Received: June 30, 2017 Published: September 25, 2017 5282
DOI: 10.1021/acs.jctc.7b00689 J. Chem. Theory Comput. 2017, 13, 5282−5290
Article
Journal of Chemical Theory and Computation of dimensions from the number AOs to the dimension of MOs in the region of interest (active MOs). The active MOs are generated by a decomposition of a starting guess AO density matrix (e.g., purified superposition of atomic densities22 density matrix), after which AO indices can be transformed to active MOs. The inactive MOs are never generated or referenced. The exponential parametrization of the orbital transformation matrix ensures that the active and inactive density matrices both separately, and in sum, satisfy the symmetry, trace, and idempotency requirements for representing a single Slater determinant. Thus, such a wave function has no problematic behavior across the region treated at HF level of theory and the region described by the starting guess AO density. Further, no bonds are broken and no a priori orbital assignments are carried out. The wave function will therefore be able to serve as a starting point for multilevel correlated calculations or other HFbased developments. In addition to the aforementioned electronic-structure embedding/multilevel methods, multiscale models should also be mentioned. Multiscale models are based on the same philosophy as multilevel methods, but they go beyond electronic-structure theory. Examples are quantum mechanics−molecular mechanics (QM/MM) models and QM combined with continuum models for the environment. The QM/MM method was originally introduced by Warshel and Levitt23 in 1976. Multiscale models have since been extensively used and developed, and we refer the reader to the review by Senn and Thiel.24 The MM methods may be both polarizable25−27 and nonpolarizable and have successfully been coupled to high level electronic-structure methods, e.g., the coupled-cluster hierarchy.28,29 For a review of solvation continuum models, see the paper by Tomasi et al.,30 and a review for treating local electronic excitations in complex systems can be found in ref 31. Combinations of QM/MM methods and continuum methods have also been developed,32−35 and multiscale approaches should be seen as extension possibilities to multilevel electronic-structure methods. Developing schemes that encompass multilevel electronicstructure regions, MM descriptions, and a continuum yields a powerful tool for large molecules in complex environments. The article is organized as follows. In Section 2, we outline the MO density matrix driven energy optimization, equivalent to standard HF theory, and in Section 3, we show how to exploit this formulation in a multilevel framework. In Section 4, we present some numerical illustrations of the approach with respect to efficiency and performance, before we provide a summary and concluding remarks in Section 5.
[G(D)]ij =
kl
ij
⎛1oo 0ov ⎞ ⎟ D=⎜ ⎝ 0vo 0vv ⎠
(3)
for a set of orthogonal MOs. All density matrices D representing a single closed-shell Ne electron Slater-determinant fulfill the symmetry, trace, and idempotency conditions36
DT = D Tr[D] =
(4)
1 Ne 2
(5)
D2 = D
(6)
2.2. Exponential Parametrization of the Density Matrix. We consider the parametrization of the density matrix D(κ) = exp( −κ)D exp(κ)
(7)
in terms of an antisymmetric parameter matrix, κ, in which only nonredundant (occupied-virtual) rotations for the HF state are considered. This parametrization is easily seen to conserve the symmetry, trace, and idempotency conditions of eqs 4−6. Rather than transforming the density matrix to the AO basis as done by Helgaker et al.,37 we consider the MO density driven energy optimization. Advantages of keeping the equations in the MO basis will be discussed in Section 3. A trust-region38 energy minimization is used for the energy minimization; hence, we here develop expressions for the gradient and linear transformation of the Hessian on a trial vector. Expressing the energy in terms of the transformed MO density, we have E(κ) = 2Tr[hD(κ)] + Tr[D(κ)G(D(κ))] + hnuc
(8)
Using the Baker−Campbell−Hausdorff expansion for the transformed density D(κ) = D + [D, κ ] +
1 [[D, κ ], κ ] + ··· 2
(9)
we find the gradient with respect to the occupied-virtual rotations to be E[1] = Fvo − Fov
(10)
where F = h + G(D) is the Fock matrix. The linear transformation of the Hessian matrix on a trial vector κ is given by E[2]κ = (Foo − Fvv )κ + κ(Foo − Fvv ) − Gov ([D, κ ]) + Gvo([D, κ ])
(11)
Both gradient and linear transformation expressions have been scaled by 1 and an occupied-virtual projection is applied 4 according to
∑ (2(ij|kl) − (ik|jl))DijDkl + hnuc ijkl
= 2Tr[hD] + Tr[DG(D)] + hnuc
(2)
The MO density matrix is given by
2. FULL SPACE MO DENSITY MATRIX DRIVEN ENERGY OPTIMIZATION 2.1. Hartree−Fock Energy in Terms of the MO Density Matrix. The HF energy expression for a closed-shell molecule in the MO basis is E HF = 2 ∑ hijDij +
∑ (2(ij|kl) − (ik|jl))Dkl
E[1] → PE[1]Q + QE[1]P
(1)
(12)
where P = D and Q = 1 − D. We have used the fact that κ contains only nonredundant rotations, i.e., κ = PκQ + QκP. We have introduced a superscript notation where, e.g., “ov” means that all blocks except the occupied-virtual block are zero, and similarly for vo, oo, and vv. This comes from the occupied-
where h contains the one-electron terms of the Hamiltonian, hnuc is the nuclear repulsion contribution, and (pq|rs) are twoelectron repulsion integrals in Mulliken notation. We have introduced the matrix of two-electron repulsion integrals 5283
DOI: 10.1021/acs.jctc.7b00689 J. Chem. Theory Comput. 2017, 13, 5282−5290
Article
Journal of Chemical Theory and Computation virtual projections and is analogous to the projections in the AO formulation.37 The level-shifted Newton equations can be solved using the same techniques as for the AO formulation, e.g., using the CROP solver39 of Ziółkowski et al. Each SCF iteration of the MO driven optimization algorithm consists of the following steps (1) construct the Fock matrix (2) solve level-shifted Newton equations to obtain κ (3) update orbitals C̃ = C exp(κ) and MO quantities M̃ = exp(−κ)M exp(κ) (4) compute gradient and check convergence Matrix exponentials are evaluated according to
Figure 1. Illustration of the structure of the active and environment MO density matrices, Da and De, with labels indicating the dimensions of the blocks. We have also illustrated the structure of the reduced space MO density matrix for the active space, Dar .
k
exp(κ) = [exp(2−kκ)]2
(13) −k
for accelerated convergence (see eq 50 in ref 40), where 2 is a small parameter related to the Frobenius norm of κ. In the AO-based formulation, transforming to the orthogonalized AO basis has proven useful since it reduces the condition number of the Hessian and simplifies the matrix algebra.41 In the orthogonal MO basis, this step is superfluous. However, the advantages of the MO density driven approach compared to the AO density driven approach are limited when considering standard full space HF. In the next section, we describe multilevel HF, where the MO formulation is critical to obtain computational efficiency.
E(Da , De) = E(Dar , De) = e(Dar ) + e(De) + 2Tr[Dar Gr(De)] + hnuc (17)
where e(Dar ) implies that the MO matrices entering eq 16 are in the active MO basis. Matrices in the reduced dimension are AO matrices transformed with MO coefficients belonging to only the active region (active MO coefficients). That is, they contain contributions from all AOs (in principle) of the entire system wrapped into the smaller dimension of Na. The transformation from AO quantities to active MO space needs to be performed only once. With respect to the optimization, we are only considering rotations among active occupied and active virtual orbitals, so the expression used for energy optimization will be
3. MULTILEVEL MO DENSITY MATRIX DRIVEN ENERGY OPTIMIZATION In this section, we consider the multilevel extension of the MO driven HF optimization described in Section 2. Specifically, we consider a density matrix D for a Ne-electron closed-shell Slater determinant that fulfills the symmetry, trace, and idempotency conditions of eqs 4−6 and its partitioning into an active part, Da, and an environment part, De. D = Da + De
E(Dar (κr), De) − e(De) = e(Dar (κr)) + 2Tr[Dar (κr)Gr(De)] + hnuc
where Dar (κr) = exp(−κr)Dar exp(κr). The first term on the right-hand side of eq 18 gives rise to standard gradient an linear transformation terms, as given in eqs 10 and 11, whereas additional terms to the gradient and linear transformations are found from the second term of eq 18. We may write an effective Fock matrix as
(14)
Both Da and De are required to separately fulfill the symmetry, trace, and idempotency conditions, as is Da + De. In particular, the requirement of idempotency for the sum of the two densities is equivalent to requiring orthogonality between the active and environment orbital spaces without referencing the orbitals themselves. Writing the energy of eq 1 in terms of the density in eq 14, we obtain a
e
a
e
a
Feff,r = Fr + Gr(De)
(19)
We obtain the gradient and linear transformations with respect to rotation among active occupied and active virtual orbitals as
e
E(D , D ) = e(D ) + e(D ) + 2Tr[D G(D )] + hnuc
vo ov E[1] r = Feff,r − Feff,r
(15)
where the energies for the separate density matrices are defined as e(Dx ) = 2Tr[hDx ] + Tr[Dx G(Dx )]
(18)
(20)
oo vv oo vv E[2] r κr = (Feff,r − Feff,r )κr + κr (Feff,r − Feff,r ) a vo a − Gov r ([Dr , κr ]) + Gr ([Dr , κr ])
(16)
for x = e, a. The structure of the matrices is illustrated in Figure 1, where we have indicated the dimensions for the active occupied and virtual space, oa and va, respectively, and the dimensions for the environment occupied and virtual space, oe and ve, respectively. The energy will be optimized through rotations among active occupied and active virtual MOs. Due to the structure of Da (see Figure 1), we may write terms containing Da in terms of the MO reduced space dimensions Na × Na, where Na = oa + va is the total number of MOs belonging to the active region of the system. We obtain
(21)
The level-shifted Newton equations can be solved in the same manner as the linear equations in the full MO basis and in the AO based formulation (discussed in Section 2.2). In the current implementation, we use the Roothaan−Hall Newton equations, e.g., only the Fock matrix contributions to the linear transformation are considered. 3.1. Starting Guess for Total Density and Decomposition to an Active Space. Generating the starting guess AO density, Dao, for the entire system is a matter of choice. Since the environment density will not be optimized, the starting guess density should be a reasonable starting point. In 5284
DOI: 10.1021/acs.jctc.7b00689 J. Chem. Theory Comput. 2017, 13, 5282−5290
Article
Journal of Chemical Theory and Computation [Gr(De)]pq =
this work, we choose to work with the superposition of atomic densities (SAD) starting guess of Van Lenthe.22 However, the SAD density matrix cannot be used directly as the starting guess since it is not an idempotent density. To obtain idempotency, one may either purify it through some scheme such as McWeeny purification42−44 or construct the Fock matrix from the SAD density, diagonalize, and use the resulting orbitals to construct an idempotent density. The construction of the Fock matrix in this case will be highly efficient due to the sparsity of the diagonal SAD density, but diagonalization will scale as N3 and will become less efficient as the system size increases. However, the diagonalization of the Fock matrix from the diagonal SAD density is not the time critical step, and for now, we use this scheme to generate an idempotent AO starting density. From the idempotent starting density, we may generate Dao,a and Dao,e by a Cholesky decomposition of AO density matrix diagonals belonging to active atoms, similar to the procedure described by Myhre et al.1 This can be seen as a decomposition involving diagonal elements Dao μμ, where μ is an AO centered on an active atom. The procedure will thus generate a set of active occupied orbitals (which will constitute Dao,a) and a set of active virtual orbitals when decomposing the virtual density matrix. The remainder of the density matrix after decomposition (for the occupied part) is the embedding density Dao,e = Dao − Dao,a. Active atoms are atoms in the region of interest, determined either by specifying atoms explicitly or choosing atoms within a given distance from the site of interest. The selection of atoms does not imply hard boundaries between the active region and the environment. The density decomposition generates orthogonal MOs which span the active−active part of the AO density matrix, and they will extend into the environment, although they will be largely localized in the active region due to their locality.45 Note that we never need the embedding orbitals since the constant contribution (computed once) involving the embedding density may be computed in the AO basis. The active MOs will be used to transform matrices from the AO basis to the active MO basis before commencing with the optimization procedure. Furthermore, Gr(De), which enters the contribution 2Tr[Dar Gr(De)], is kept in memory in the active MO representation (as indicated by the “r” subscript) and directly updated by the orbital transformations; thus, no AO quantities need to be stored in memory. The contribution e(De), eq 16, need not be computed at all if the purpose is to generate a reference for post-HF calculations since it represents a constant shift in the energy. 3.2. Sparsity of the AO Dimension. For large molecular systems where the active region constitutes a small fraction of the molecular system, one may not need AOs of the entire system to span the active MO space. This comes in addition to the usual integral screening procedures. The dimension of the AO space may thus be reduced for quantities relating to the active MO space. Naturally, the reduced AO dimension is much greater than the dimension of the active MO space, but exploiting AO sparsity will reduce dimensions of two-electron integral evaluation. To carry out a multilevel HF calculation, we need the two-electron contribution of the embedding expressed in the active MO basis, Gr(De). The evaluation of the Gr(De) may be written as
∑ ∑ [2(αβ|ρσ ) − (αρ|βσ )]Dρσao,eCβaqCαap α ,β ρ,σ
(22)
where Caαp and Caβq denote the active MO coefficients (p,q ∈ active space), and we have used Gr(De) = Gr(Dao,e) due to the diagonal structure of De (see Figure 1). Since the active MO space is local to the active region, AOs far from the active region will have vanishing contributions to the active MO coefficients and thus to the matrix Gr(De). The AO summation that involves elements of Ca may therefore be restricted to an effective AO space denoted with a prime [Gr(De)]pq =
∑ ∑ [2(α′β′|ρσ ) − (α′ρ|β′σ )]Dρσao,eCβa ′ qCαa ′ p α′,β′ ρ,σ
(23)
The effective set of AOs is obtained by removing the vanishing block of Ca, more specifically, the block where no AO has a contribution greater than a threshold Tao. ϕpa ≈ ϕpa′ =
∑ Cαa ′ pχα ′
(24)
α′
This can make a small impact on the orthonormality of the set; thus, the set is reorthogonalized if required. Further, the threshold should not be set such that the active MOs and thus the active density is compromised, i.e., that Da + De is no longer idempotent. The total density will then no longer represent a Slater determinant. This will be discussed with numerical examples in Section 4.3.1. The sparsity of the AO space may also be used for the contribution Gr(Da), which is computed once in every SCF iteration. We write [Gr(Da )]pq =
a a ∑ ∑ [2(α′β′|ρ′σ′) − (α′ρ′|β′σ′)]Dρao,a ′ σ ′Cβ ′ qCα ′ p α′,β′ ρ′,σ′
(25)
where
Dρao,a ′σ′ =
oa
∑ Cρa ′ iCσa′ i
(26)
i=1
If the contribution e(De) is required (typically when computing energy differences), then a linear scaling algorithm augmented by an additional screening criteria may be used. The screening criteria disregards integrals if the Schwarz estimated contribution to the energy is less than a certain threshold, T ao,e ao,e Dαβ (αβ|αβ)1/2 (ρσ |ρσ )1/2 Dρσ