Amplitude Determinant Coupled Cluster with Pairwise Doubles

Amplitude Determinant Coupled Cluster with Pairwise Doubles ... John A. Gomez , Matthias Degroote , Jinmo Zhao , Yiheng Qiu , Gustavo E. Scuseria. Phy...
3 downloads 3 Views 879KB Size
Subscriber access provided by UNIV TORONTO

Article

Amplitude determinant coupled cluster with pairwise doubles Luning Zhao, and Eric Neuscamman J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.6b00812 • Publication Date (Web): 14 Nov 2016 Downloaded from http://pubs.acs.org on November 15, 2016

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Chemical Theory and Computation is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Amplitude determinant coupled cluster with pairwise doubles Luning Zhao† and Eric Neuscamman∗,†,‡ †Department of Chemistry, University of California, Berkeley, California, 94720, USA ‡Chemical Science Division, Lawrence Berkeley National Laboratory, Berkeley, California, 94720, USA E-mail: [email protected]

Abstract Recently developed pair coupled cluster doubles (pCCD) theory successfully reproduces doubly occupied configuration interaction (DOCI) with mean field cost. However, the projective nature of pCCD makes the method non-variational and thus hard to improve systematically. As a variational alternative, we explore the idea of coupledcluster-like expansions based on amplitude determinants and develop a specific theory similar to pCCD based on determinants of pairwise doubles. The new ansatz admits a variational treatment through Monte Carlo methods while remaining size-consistent and, crucially, polynomial cost. In the dissociations of LiH, HF, H2 O and N2 , the method performs very similarly to pCCD and DOCI, suggesting that coupled-clusterlike ansatzes and variational evaluation may not be mutually exclusive. In an attractive pairing model, the method retains its accuracy even when pCCD suffers a severe variational violation.

1

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

1

Introduction

The electronic correlation energy can be divided into two parts. 1 Dynamic correlation, small changes to mean field theory, can be described effectively through many-body perturbation theory. 2 Static or non-dynamic correlation becomes important when the ground state contains degenerate or near-degenerate configurations, i.e. where more than one determinant is needed for a qualitative description of the system. Such situations arise in bond breaking, 1 large π-conjugated molecules where the HOMO-LUMO gap is small, 3 transition metal complexes 4 and superconductivity. 5 For such cases, a single Slater determinant gives qualitatively incorrect results. The concept of seniority has a long history in nuclear physics. 6 Seniority number is the measurement of unpaired electrons in a determinant. It has been showed before that strong correlation is very often concentrated within the low seniority regions of Hilbert space. 7 Among them, the seniority zero wave function is the most studied due to the fact that it is both size consistent and capable of capturing a large amount of strong correlation. The most elaborate seniority zero wave function is doubly occupied configuration interaction (DOCI), 8 which is a full configuration interaction (FCI) wave function limited to seniority zero space. DOCI was thoroughly studied in the pioneering days of quantum chemistry 8–10 and has many desirable features, being size-consistent, variational and effective for describing strong correlation. However, DOCI’s factorial cost scaling severely limits its applicability. Coupled Cluster (CC) with single and double excitations (CCSD) 11 and CCSD with perturbative triple excitations [CCSD(T)] 12 provide a very powerful way to describe dynamic correlation for a polynomial scaling cost. Between its rigorous size consistency, its systematic improvability, and its exceptional accuracy even at the CCSD(T) level, CC has become one of the most trusted and even routine methods in quantum chemistry for weakly correlated systems that are not too large. Unfortunately, the main drawback of traditional CC is that the ansatz is optimized in a projective manner to achieve polynomial scaling, which makes 2

ACS Paragon Plus Environment

Page 2 of 33

Page 3 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

the energy non-variational and leads to qualitative failure in strongly correlated regimes. 13–15 Examples include breaking of multiple bonds simultaneously, and the Hubbard model at large U, in which RHF-based CCSD or even CCSD(T) “overcorrelates” and the converged energy is well below the FCI energy. 16 While many approaches have been taken to resolve this issue, including multi-reference CC, 17 combinatorially scaling variational CC, 18 single-pair couple cluster 19 and CC valence bond, 20 there remain no CC methods that can capture both static and dynamic correlation in large molecules at an affordable cost. Recently, the antisymmetric-product of one-reference-orbital geminals (AP1roG) 21–23 was introduced by Ayers and coworkers in an attempt to deal with static correlation at polynomial cost. Impressively, they showed that AP1roG gives results almost equivalent to those of DOCI. Later, Scuseria and coworkers recognized that AP1roG is a simplified version of coupled cluster doubles (CCD), in which electrons are paired and only pair excitations are allowed, which they termed pair-CCD (pCCD). 13,24,25 It is quite remarkable that by eliminating the vast bulk of the cluster operator, the pCCD ansatz accurately reproduces the factorially complex DOCI. It is also important: since DOCI provides a powerful description of strong correlation in a wide variety of systems, so do AP1roG and pCCD. However, like most other CC methods, both pCCD and AP1roG are non-variational. One may raise the question that since pCCD is already so close to DOCI, what is the point of achieving variationality? One should note that although pCCD itself often does not have enough flexibility to violate the exact energy lower bound, once one breaks electron pairs and adds higher seniority determinants to the ansatz in an attempt to recover more dynamic correlation, the theory can fail in the same manner as traditional CC, yielding nonvariational energies. 13 The ultimate source of this failure is the inability of CC’s projected amplitude equations to detect wave function errors in the higher-rank sectors of Hilbert space (e.g. CCD ignores errors made in the quadruples). Variational methods must by their very nature account for energy errors stemming from all parts of Hilbert space, and so their stability comes not only from their guarantee of an energy lower bound but also

3

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

from the fact that any errors that might appear in higher-rank excitation sectors as ansatz flexibility is increased will be accounted for, and, within the limits of the ansatz’s flexibility, avoided. Traditional CC theory’s inability to see such errors and the tendency of a more flexible doubles operator to create them 13 makes it very challenging to systematically improve pCCD. This difficulty is all the more unfortunate given that we know that if standard coupled cluster ansatzes could be optimized variationally, the unphysical overcorrelation in multiple bond breaking would be eliminated. 18 Such a variational CC (VCC) approach has combinatorial cost scaling, however, and so can only be applied to small systems. As a compromise, Knowles and Robinson developed quasi-variational coupled cluster theory, 26 which as its name suggests is only approximately variational and does not provide an upper bound guarantee. Besides deterministic methods, variational quantum Monte Carlo (VMC) 27 is another accurate and versatile method for treating electron correlation. The VMC method is used to find expectation values of operators for a given trial wave function and to optimize the parameters in the trial wave function stochastically. In general, VMC need not depend on the independent particle approximation as it is compatible with more general ansatzes such as Jastrow Slater, Jastrow multi-Slater 27 and Jastrow AGP. 28 VMC is strictly variational, but it does not currently admit a polynomial-cost evaluation of CC wave functions. There has been progress recently in combining CC with projector Monte Carlo formalisms, both in using CC as a trial function 29 and in using a FCI quantum Monte Carlo 30,31 formalism to solve the amplitude equations for very high level cluster operators. 32–34 However, both of these approaches have costs that in general scale factorially with system size, in the former case for reasons similar to those discussed in Sec 2.3 while in the latter case due to a sign problem in the sampling over excitation pathways. In this study, we introduce amplitude determinant coupled cluster theory (ADCC), an approach that seeks to maintain the properties of traditional CC, such as size-consistency,

4

ACS Paragon Plus Environment

Page 4 of 33

Page 5 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

while achieving polynomial cost and variational evaluation through VMC. The central idea is to redefine the cluster expansion so that each configuration in the expansion has a coefficient given by a determinant of cluster amplitudes. The thinking is that a determinant has the rare ability to map a combinatorially large sum of terms (the likes of which arise when one wants to consider all possible excitation pathways into a given configuration) into an object that can be evaluated for a polynomially scaling cost. While many choices for the matrix whose determinant is to be employed can be imagined, we seek in this study to develop what is likely the simplest example in which we follow the pairwise doubles approach of pCCD, whose cluster expansion suggests a particularly simple choice for the matrix in question. The resulting ADCC with pairwise doubles (pADCCD) proves to be remarkably similar to its non-variational cousin, motivating future investigations into more general ADCC expansions. This paper is structured as follows. We begin with a review of VMC and then pCCD, which allows us to show clearly why the two are not efficiently compatible. We then introduce pADCCD theory and discuss its properties as compared to pCCD, after which we explain how the linear method 35 can be used optimize its parameters. Having laid out the general formalism, we then introduce the orbital optimization of pADCCD using reduced density matrices (RDMs) and the Newton-Raphson algorithm, after which we conclude our theoretical analysis by discussing the scaling of the method. Results are presented for a size consistency check, the bond stretching potentials of LiH, HF, H2 O and N2 , and an attractive pairing model that is particularly difficult for pCCD. We conclude with a summary of our findings and comments on future directions.

2 2.1

Theory Notation

As in most CC approaches, we will distinguish between orbitals that are occupied (indexed by i, j, k, . . .) and those that are unoccupied or virtual (indexed by a, b, c, . . .). Unless 5

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 33

stated otherwise, it will be assumed that occupied and virtual are defined based on the reference determinant. The numbers of occupied and virtual spatial orbitals are given by no and nv , respectively. In cases where sums range over all orbitals, we will use p, q, r, . . . for the indices. When necessary, we will distinguish α and β spin orbitals by placing a double ¯. bar over indices for the latter, i.e. ¯i or a

2.2

VMC Energy Evaluation

In variational Monte Carlo in Hilbert space, an ansatz’s energy is expressed using a resoP lution of the identity ~n |~nih~n| over the orthonormal Slater determinants |~ni (denoted by their occupation number vectors ~n) that form a complete many-body basis within the space allowed by the chosen one particle orbitals, hΨ|H|Ψi X | h~n|Ψi |2 h~n|H|Ψi E= = . hΨ|Ψi hΨ|Ψi h~n|Ψi

(1)

~ n

After sampling a set ξ of occupation number vectors from the wave function’s probability distribution | h~n|Ψi |2 / hΨ|Ψi, the energy can be estimated as an average of local energies

E=

1 X EL (~n) ns

EL (~n) ≡

~ n∈ξ

h~n|H|Ψi h~n|Ψi

(2)

where ns is the sample length, i.e. the size of the set ξ. For our purposes, it is convenient to write the ab initio Hamiltonian as H=

X pq

+

X pqrs

  gpq a†p aq + a†p¯aq¯

(pq|rs)



 1 † † 1 † † † † a aq a as + ap¯aq¯ar¯as¯ + ap aq ar¯as¯ 2 p r 2

(3)

where (pq|rs) are the usual two-electron coulomb integrals in (11|22) order, a†p and ap are

6

ACS Paragon Plus Environment

Page 7 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

creation and destruction operators, and gpq are modified one-electron integrals,

gpq = hpq −

1X (pr|rq) 2 r

(4)

where hpq are the standard one-electron integrals. By inspecting Eqs. (2) and (3), we see that if an ansatz |Ψi can evaluate h~n|Ψi for a cost that grows polynomially with system size for any randomly chosen |~ni, then the cost of evaluating its overall energy with VMC will also scale polynomially. This is true because there are a polynomially scaling number of terms in the Hamiltonian, and because the number of samples required to maintain a fixed absolute uncertainty in an extensive quantity grows linearly with system size.

2.3

pCCD

The pCCD wave function ansatz, ˆ

Tˆ ≡

|ΨpCCD i = eT |Ri

X

+ Tia a+ ¯ a¯i ai a aa

(5)

ia

is specified by a reference Slater determinant |Ri and the no by nv matrix T of cluster amplitudes. It can be seen as a simplification of CCD in which only excitations involving one occupied and one virtual orbital are allowed in the cluster operator Tˆ. Like any wave function in the Hilbert space defined by the chosen one particle basis, the pCCD ansatz can be expanded in the basis of Slater determinants,

|ΨpCCD i =

X

|~nih~n|ΨpCCD i.

(6)

~ n

The coefficients in this linear combination, 

 1 ˆ2 ˆ h~n|ΨpCCD i = h~n| 1 + T + T + . . . |Ri, 2

7

ACS Paragon Plus Environment

(7)

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 33

are clearly zero in cases when |~ni is not seniority zero (i.e. when it has unpaired electrons), as |Ri is seniority zero and Tˆ preserves the seniority of whatever it operates on. As originally shown by Limacher et al, 21 the coefficients in cases when |~ni is seniority zero are given by matrix permanents, (8)

 h~n|ΨpCCD i = per A(~n) ,

where A(~n) is the (nx /2) by (nx /2) matrix obtained by deleting all rows and columns of T that correspond to orbitals not involved in the excitation that excites |Ri into |~ni. Here nx is equal to the excitation level, i.e. the number of creation operators in the excitation. For ab example, consider the quadruply excited configuration |Rij i obtained from |Ri by replacing

¯, b, and ¯b, respectively. The coefficient occupied orbitals i, ¯i, j, and ¯j with virtual orbitals a, a for this configuration is 



 Tia Tib  ab ab Tˆ ab 1 ˆ 2 hRij |ΨpCCD i = hRij |e |Ri = hRij | T |Ri = per  . 2 Tja Tjb

(9)

To summarize, evaluating pCCD’s coefficient for a given electron configuration is equivalent evaluating the permanent of a matrix whose dimension grows with excitation level. As the maximum excitation level grows with system size, and because there are no known polynomial cost methods for evaluating permanents, the cost of a variational evaluation of pCCD via VMC grows exponentially with system size.

2.4

Amplitude Determinants

In defining the pCCD ansatz, Eqs. (5) and (8) are equivalent, although they do provide two different viewpoints from which one may consider the pCCD cluster expansion. On the one hand, Eq. (5) lends itself to developing the traditional CC methodology, in which a non-variational projected approach is used to optimize the amplitudes T and evaluate the 8

ACS Paragon Plus Environment

Page 9 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

energy. On the other hand, Eq. (8) helps us see that any configuration’s coefficient contains contributions from all possible excitation pathways that lead to it from the reference, as well as why pCCD is not compatible with VMC. Starting from one viewpoint or the other, one can derive the key properties 13 of pCCD: it is polynomial cost, exact for an electron pair, size consistent, invariant to the ordering of occupied or virtual orbitals, and non-variational. To produce an ansatz similar to pCCD that permits an efficient variational evaluation while differing as little as possible in its other properties, we propose pADCCD as a coupledcluster-like expansion based on amplitude determinants. Just as pCCD can be defined by Eq. (8), we define the seniority-zero pADCCD ansatz via the coefficients it imparts on each closed shell electron configuration,

 h~n|ΨpADCCD i ≡ det B(~n) ,

(10)

where B(~n) is obtained from A(~n) by changing the signs of all elements below the diagonal. Note that we use B rather than A in an effort to make the ansatzes as similar as possible, as this choice makes the coefficients for pCCD and pADCCD equal for all configurations below hextuple excitations when the amplitudes T are taken to be the same. Crucially, the cost of a VMC evaluation now scales polynomially thanks to the fact that determinants can be evaluated via matrix diagonalization. Which qualities of pCCD are maintained by pADCCD? Its definition in Eq. (10) makes clear that, like pCCD, each excited determinant’s coefficient includes contributions from all possible excitation pathways (the number of which grows factorially with excitation level), although the details of how these contributions are combined to form the coefficient are of course different as permanents and determinants are not the same. Like pCCD, the cost scaling of pADCCD is polynomial, although the order as currently implemented is higher and the need for stochastic sampling raises the prefactor (see Section 2.10). As pCCD and pADCCD are equivalent at the doubles level, the theory remains exact for an electron pair.

9

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Finally, we will see that strict size consistency is also maintained despite the fact that it is no longer obvious how to write the ansatz in an exponential form like that in Eq. (5). The property that we fail to maintain is invariance to orbital ordering, with the exact pattern of signs appearing in the cluster expansion now depending on the ordering of the occupied and the virtual orbitals. While the importance of this shortcoming can only be fully discovered through numerical experiment (see Section 3.6 for a preliminary test), we can offer some general comments. First, orbital-ordering-dependence is present in other ansatzes, such as the highly successful matrix product state that underlies the density matrix renormalization group. 36 Second, as a typical optimization will start from T = 0, which is to say from a Slater determinant starting point in which orbital ordering does not matter, the pADCCD ansatz can be used in a way in which the user’s choice of orbital ordering in the initial guess has no effect. Third, orbital optimization (see below) can in principle find the optimal ordering, although of course in practice it would not be surprising for the nonlinear parameter optimization to become trapped in a local minimum characterized by a less-thanoptimal ordering. Finally, as for any nonlinear ansatz (e.g. consider Coulson-Fischer points in Hartree Fock), pADCCD may display non-smooth potential energy surfaces, and it seems natural to assume that a loss of orbital-ordering-invariance would if anything make this issue worse. Thus, while a dependence on orbital ordering is a concern, we suggest that in practice its effects may be mitigated by orbital optimization and an invariant initial guess, and, most importantly, that such effects may be a price well worth paying in order to achieve the stability offered by strict variationality. To conclude this section, we would like to point out that, although we have taken pains to make pADCCD and pCCD as similar as possible, the reason for this is not because we seek to create an approximation to pCCD. Indeed, pADCCD is not an approximation to pCCD any more than pCCD is an approximation to pADCCD. Instead, the two ansatzes form a pair of different but related approximations to the exact wave function. Rather than approximate pCCD, our purpose in keeping the ansatzes similar is to maintain as best as

10

ACS Paragon Plus Environment

Page 10 of 33

Page 11 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

possible the good properties of pCCD while also achieving variationality. The hope is that pADCCD will be effective both when pCCD is effective and in cases where pCCD suffers from severe variational violations.

2.5

Energy Expression

We will derive the energy expression for pADCCD in this subsection. As the VMC local energy EL (~n) involves excitations relative to the sampled configuration ~n, we will in this subsection use the convention of indexing orbitals occupied in ~n by {i, j, k, . . .} and orbitals not occupied in ~n by {a, b, c, . . .}. Before considering our particular wave function, notice that the subset of Hamiltonian terms that contain only number and hole operators (i.e. a†i ai and ab a†b ) will add a wavefunction-independent contribution to the local energy, E0 (~n) = 2

X

gii + 2

i

X

(ii|jj) +

ij

X

(ia|ai).

(11)

ia

Note that in deriving this formula for the wave-function-independent local energy component, we have taken advantage of the fact that the occupied α and occupied β orbitals are the same for any configuration ~n found in a seniority-zero wave function. The remaining one-electron terms in the Hamiltonian of the type a†i aa will not contribute to the local energy as they break seniority symmetry and pADCCD is strictly seniority zero. Likewise, the only remaining two-electron terms in the Hamiltonian that will give a nonvanishing contribution to the local energy are terms of the type a†i aa a¯†i aa¯ . Therefore, the expression for the local energy is  h~na |Ψi i ¯¯i ia|a h~ n |Ψi ia  det [B(~na )] X i ¯¯i = E0 (~n) + ia|a det [B(~ n )] ia

EL (~n) = E0 (~n) +

X

11

ACS Paragon Plus Environment

(12)

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 33

in which ~nai is understood to be the configuration resulting from ~n after two electrons have been moved from the ith to the ath spatial orbital.

2.6

Proof of Size-Consistency

In this subsection, we will prove that our pADCCD ansatz is strictly size consistent. The definition of size-consistency states that the energy calculated with two non-interacting systems A and B together as a “super system” should be equal to the sum of the energies of systems A and B calculated separately. To prove this, consider two non-interacting systems A and B. Then the matrix used to evaluate the coefficient becomes block diagonal 



0  BA (~nA ) B(~nAB ) =   0 BB (~nB )

(13)

since there are no “cross-excitations” between the two systems. Then due to the properties of determinants, we have

h~nAB |ΨAB i = h~nA |ΨA i h~nB |ΨB i

(14)

and it follows that, hΨAB |ΨAB i =

X

hΨAB |~nAB i h~nAB |ΨAB i

~ nAB

=

X

|hΨA |~nA i|2 |hΨB |~nB i|2

~ nAB

= hΨA |ΨA i hΨB |ΨB i

12

ACS Paragon Plus Environment

(15)

Page 13 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Thus the energy expression becomes

EAB =

X |hΨAB |~nAB i|2 h~nAB |HA + HB |ΨAB i

hΨAB |ΨAB i h~nAB |ΨAB i X |hΨAB |~nAB i|2  h~nA |HA |ΨA i h~nB |ΨB i h~nB |HB |ΨB i h~nA |ΨA i  + = hΨAB |ΨAB i h~nA |ΨA i h~nB |ΨB i h~nB |ΨB i h~nA |ΨA i ~ nAB

~ nAB

=

X |hΨA |~nA i|2 |hΨB |~nB i|2

~ nAB

=

hΨA |ΨA i hΨB |ΨB i

X |hΨA |~nA i|2 ~ nA

hΨA |ΨA i

(16)

(ELA (~nA ) + ELB (~nB ))

ELA (~nA ) +

X |hΨB |~nB i|2 ~ nB

hΨB |ΨB i

ELB (~nB )

= EA + EB and so pADCCD is size consistent. We should note, however, that its seniority-zero character may cause difficulty in describing some cases (but not all, see LiH below) involving open shell fragments. For closed shell fragments A and B, however, the method is clearly size consistent.

2.7

Derivative Ratios

Before discussing the variational minimization of the energy, we first lay the groundwork by developing efficient evaluations of terms that we call derivative ratios, which will make the optimization relatively simple to describe. Defining derivative notation |Ψx i ≡ ∂ |Ψi /∂µx , where µx is the xth wave function parameter, we first consider the “bare” derivative ratio

D~n (µx ) ≡

h~n|Ψx i h~n|Ψi

(17)

For derivatives with respect to the elements of T , these ratios are: h

det [B (~n)] Tr Θ (~n) D~n (Tia ) =

det [B (~n)]

13

∂B(~ n) ∂Tia

i



∂B (~n) = Tr Θ (~n) ∂Tia

ACS Paragon Plus Environment



(18)

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 14 of 33

in which Θ (~n) is the inverse of B (~n), Tr indicates the matrix trace, and we stress that i and a range over the occupied and virtual orbitals, respectively, of the reference determinant |Ri. In addition to the bare derivative ratios, we also define the energy derivative ratios,

G~n (µx ) ≡

h~n|H|Ψx i h~n|Ψi

(19)

which are related to derivatives of the local energy by

G~n (µx ) =

∂EL (~n) + D~n (µx ) EL (~n) . ∂µx

(20)

The local energy derivatives in question work out to be   ∂EL (~n) X  ¯¯ det B(~nbj ) = jb|bj ∂Tia det [B(~n)] jb

#  ! b ∂B(~ n ) ∂B(~ n ) j − Tr Θ(~n) Tr Θ(~nbj ) ∂Tia ∂Tia "

(21)

where i and a range over the occupied and virtual orbitals, respectively, of the reference determinant |Ri but j and b range over the occupied and unoccupied orbitals within ~n. Once these local energy derivatives are evaluated, Eq. (20) is used to construct the desired energy derivative ratios G~n (µx ).

2.8

Amplitude Optimization by Linear Method

We variationally optimize the energy of the pADCCD wave function using the linear method (LM). 37,38 The LM works by repeatedly solving the Schrödinger equation in a special subspace of the full Hilbert space defined by the span of the wave function and its first derivatives. More precisely, we construct the generalized eigenvalue problem X

hΨx |H|Ψy i cy = λ

y∈{0,1,...}

X

hΨx |Ψy i cy

y∈{0,1,...}

14

ACS Paragon Plus Environment

(22)

Page 15 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

where |Ψx i and |Ψy i are shorthand for derivatives of |Ψi with respect to the xth and yth wave function parameters µx and µy , respectively, and |Ψ0 i ≡ |Ψi. After solving this eigenvalue problem for c, one updates the parameters by

µx ← µx + cx /c0

(23)

The Hamiltonian and overlap matrices are built by Monte Carlo sampling X

X

~ n∈ξ y∈{0,1,...}



X

hΨx |~ni h~n|H|Ψy i cy hΨ|~ni h~n|Ψi

X

~ n∈ξ y∈{0,1,...}

hΨx |~ni h~n|Ψy i cy hΨ|~ni h~n|Ψi

(24)

Inspecting the above Monte Carlo approximations to the Hamiltonian and overlap matrices in the first derivative subspace makes clear that only derivative ratios Dn (µx ) and Gn (µx ) from Eqs. 17 and 19 are needed. Therefore, the LM can be applied efficiently to optimize the amplitudes by evaluating these ratios.

2.9

Orbital Optimization

Like DOCI and pCCD, pADCCD is not invariant to the choice of orbital basis, because all open-shell determinants are neglected. One benefit of a variational method is that we can straightforwardly define the optimal orbitals to be those that minimize the energy, and so in this section we introduce an orbital optimization method for pADCCD. First, consider modifying the ansatz by performing a unitary rotation of the orbital basis, ˆ

|Ψi → eK |Ψi XX  ˆ = K Kpq a†pσ aqσ − a†qσ apσ p>q

σ

15

ACS Paragon Plus Environment

(25) (26)

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 16 of 33

where K is an anti-Hermitian matrix and σ indexes the spin. For describing the optimization, it will be useful to organize the elements of the lower triangle of K into the length-d vector ~κ, where

d = (no + nv )(no + nv − 1)/2.

(27)

The variational energy can then be written as a function of ~κ, ˆ

E(~κ) =

ˆ

hΨ|e−K HeK |Ψi , hΨ|Ψi

(28)

where it is understood that the elements of ~κ determine those of K and thereby the operator ˆ Starting from the initial orbitals (i.e. at the point ~κ = 0), we expand the energy out to K. second order in ~κ to obtain 1 E(~κ) ≃ E (0) + ~κ+ ω ~ + ~κ+ Q~κ 2

(29)

where the length-d energy gradient ω ~ and the d × d energy hessian Q are given by ∂E (~κ) ∂κx 2 ∂ E (~κ) = ∂κx ∂κy

ωx = Qxy

(30) (31)

which in turn are functions of the one- and two-electron reduced density matrices

γpq

hΨ|a†p aq + a†p¯aq¯|Ψi , ≡ hΨ|Ψi

Γrs pq

hΨ| 12 a†p ar a†q as + 21 a†p¯ar¯a†q¯as¯ + a†p ar a†q¯as¯|Ψi . ≡ hΨ|Ψi

(32)

Note that both γ and Γ can be evaluated in the same manner and for the same cost as the energy in Section 2.5. Note also that the one-electron reduced density matrix γ is diagonal

16

ACS Paragon Plus Environment

Page 17 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

in the basis in which we define the pairing. The two-electron reduced density matrix Γ is also very sparse thanks to pADCCD being seniority zero. Detailed expressions for the orbital gradient and hessian can be found in the Appendix. Using ω ~ and Q, we can choose a ~κ that reduces the energy using the Newton-Raphson method, ~κ = −Q−1 ω ~.

(33)

At this point, we could continue with more Newton-Raphson steps until the energy is minimized with respect to ~κ, but we find it convenient to first reset ~κ to zero by absorbing its effects into the one- and two-electron integrals through a standard molecular orbital transformation. At this point another Newton-Raphson step can be taken (now using the convenient formulas above as ~κ = 0 again), and the method can be iterated to convergence. In practice, we find it effective to instead interleave the LM optimization steps for T with the NewtonRaphson steps for ~κ. Thus we first update T through a LM step, after which we evaluate the one- and two-electron reduced density matrices, which allows us to construct ω ~ and Q and take a Newton-Raphson step to update ~κ and transform the integrals into the corresponding new orbital basis. We then repeat this entire process until convergence is reached.

2.10

Scaling

Having presented our optimization method, we now analyze its cost. In each iteration, we need to loop over occupied and virtual orbitals to compute local energies, RDMs, and derivative ratios. As the evaluation of the relevant determinants scale as n3ex , in which nex is the pair excitation level, the overall cost scales as ns no nv n3ex , in which ns is the number of samples. Although this O (N 6 ) scaling looks much steeper than the O (N 3 ) scaling of pCCD, one should note that the highly excited configurations are rarely sampled, and so in many molecules the n3ex term in the scaling may behave more like a constant. In such a regime,

17

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

pADCCD’s scaling may appear closer to O (N 3 ). We should also note that in this pilot study, we have not attempted to accelerate determinant evaluations using the Sherman-Morrison formula or similar techniques, 39 which if applied would lower the cost scaling of the energy evaluation and the LM. The number of samples required to maintain a constant statistical error, ns , scales linearly with system size. This is because the energy variance is an extensive quantity (growing linearly with system size), and the statistical error is determined by the ratio of the variance to the sample length as per the central limit theorem. Thus, in order to maintain a constant statistical uncertainty, one should increase the sample length linearly with system size to compensate for the linear growth in the variance. For the most expensive calculations we did so far, for example N2 in 6-31G basis, the number of samples required is 3.6×106 , and so we have used this as our sample length throughout the paper. At the end of each iteration, we need to reset κ ˆ to 0 via a molecular orbital transformation. The one- and two-electron integrals needed to represent Hamiltonian in the new basis can be evaluated at an O (N 5 ) cost. However, as the basis rotation is required only once per LM iteration, rather than once per sample, its cost is typically negligible compared to that of the sampling effort involved in optimizing the cluster amplitudes.

3 3.1

Results Computational Details

pADCCD results were obtained using our own software for VMC in Hilbert space, with oneand two-electron integrals for the Hamiltonian taken from PySCF. 40 The full configuration interaction (FCI) results were obtained from Molpro 41 and CISD results from Psi4. 42 pCCD and DOCI results for molecules were kindly shared by Peter A. Limacher, 43 while pCCD and DOCI results for the attractive pairing model were kindly shared by Thomas Henderson. 44 We froze the N and O 1s orbitals at the RHF level. The sample size is taken to be 3.6×106 , 18

ACS Paragon Plus Environment

Page 18 of 33

Page 19 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Table 1: Absolute energy errors per molecule (|E − EFCI |/n) in kcal/mol for n well separated H2 molecules in the STO-3G basis. 45 Statistical uncertainties for pADCCD are less than 0.01 kcal/mol. See Section 3.2 for details. n 1 2 3 4 5

pADCCD 0.00 0.00 0.00 0.00 0.00

CISD 0.00 0.64 1.21 1.73 2.21

which produces statistical uncertainties of less than 0.25 kcal/mol in all cases.

3.2

Size Consistency Check

To begin, we check the size consistency of pADCCD by calculating the energy of one or more H2 molecules. We arrange the molecules such that no two are within 100 Å of each other, thus creating a situation where the overall system’s energy should be the sum of the subsystems’ energies, or in this particular case just nEH2 , where n is the number of molecules and EH2 is the energy of a single H2 molecule. As shown in Table 1, pADCCD displays no size consistency error within our tightly controlled (< 0.01 kcal/mol) statistical uncertainty. For comparison, the size consistency error of configuration interaction singles and doubles (CISD) is already readily apparent even in this simple test.

3.3

LiH

We begin our bond dissociation results with a simple example, the LiH molecule in the ccpVDZ basis. 46 The system has only two valence electrons, and both DOCI and pCCD deliver almost exact results compared to FCI. Due to our method’s similarity with pCCD, we also ˚ expect nearly exact results. Figure 1 shows our results at 18 bond lengths between 0.9A ˚ The non-parallelity error (NPE) defined as the difference between the largest and and 4.2A. smallest error with respect to FCI along the potential surface is about 2 mEh , confirming our expectations. 19

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

2

−7.86

−7.90

pADCCD FCI

Energy Error(kcal/mol)

−7.88 Energy(Hartree)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

−7.92 −7.94 −7.96 −7.98

1

−8.00 −8.02 0.8 1.2 1.6 2.0 2.4 2.8 3.2 3.6 4.0 Li−H distance (Å)

0 0.8 1.2 1.6 2.0 2.4 2.8 3.2 3.6 4.0 Li−H distance (Å)

Figure 1: Dissociation of LiH in cc-pVDZ basis set and energy error vs FCI.

3.4

HF

We next turn our attention to the dissociation of hydrogen fluoride in a 6-31G basis. 47 Figure 2 shows the absolute energy of pADCCD, along with DOCI, pCCD 43 and FCI results. The results show that pADCCD, DOCI and pCCD are energetically very similar. A close analysis shows that the NPE of pADCCD and pCCD with respect to FCI is about 17 and 18 mEh , respectively. Unlike LiH, where seniority zero based wave functions are near exact, we see a large energy gap between all seniority zero based ansatzes and FCI. This is a reminder that while seniority zero wave functions are often effective for strong correlations, they do not capture all the details of weak correlation.

3.5

H2 O

Our next example is the symmetric double dissociation of H2 O, as shown in Figure 3. Again, pADCCD provides very similar energies compared to DOCI and pCCD, and the NPE with respect to FCI is about 19 mEh for pADCCD and 16 mEh for pCCD. The coincidence of the pADCCD with pCCD and DOCI is thus true not just for one pair of strongly correlated

20

ACS Paragon Plus Environment

Page 20 of 33

Page 21 of 33

−99.80 −99.85

48 pADCCD DOCI pCCD FCI

46 Energy Error(kcal/mol)

−99.75

Energy(Hartree)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

−99.90 −99.95 −100.00 −100.05

pADCCD DOCI pCCD

44 42 40 38 36

−100.10 0.8

1.2 1.6 2.0 2.4 H−F distance (Å)

2.8

0.8

1.2 1.6 2.0 2.4 H−F distance (Å)

2.8

Figure 2: Dissociation of HF in 6-31G basis set and energy error vs FCI. electrons, but for two pairs as well. However, we can still see from the plot that like pCCD and DOCI, a significant amount of dynamic correlation is clearly missing in all these methods. In order to show the importance of orbital optimization, we plot optimized and RHF ˚ in Figure 4. It is very clear that at this canonical orbitals for H2 O at bond length 2.2A stretched geometry, the optimized orbitals are much more localized than canonical orbitals. This localization is also seen in pCCD. 7 Indeed, the qualitative difference between optimized orbitals and canonical orbitals emphasizes the necessity of orbital optimization. In all the systems presented so far, the major components of the wave function are all at the quadruples excitation level or lower. As pCCD and pADCCD have the same flexibility up to quadruples, it is not surprising that their results have been nearly identical. Indeed, this was precisely our intention: to behave like pCCD when it was well behaved while also adding variational stability. The fact that the results have not been exactly the same is most likely due to the difference in the two methods’ optimization conditions. While pADCCD relies on variational minimization of the energy for optimizing both T and ~κ, pCCD relies on non-variational projected amplitude equations for determining T and can rely on multiple different orbital optimization criteria 13,48–50 for ~κ, none of which are equivalent to energy 21

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

−75.40

46

−75.60

Energy Error(kcal/mol)

pCCD DOCI FCI pADCCD

−75.50 Energy(Hartree)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

−75.70 −75.80 −75.90 −76.00

Page 22 of 33

pADCCD DOCI pCCD

44 42 40 38 36

−76.10 0.8

1.2 1.6 2.0 H−O distance (Å)

2.4

34

0.8

1.2 1.6 2.0 H−O distance (Å)

2.4

Figure 3: Symmetric Dissociation of H2 O in 6-31G basis set and energy error vs FCI with a bond angle of 109.57◦ . minimization (although making the energy stationary is very similar).

3.6

N2

Our final molecular example is the dissociation of N2 . As Figure 5 reveals, the difference between pCCD and pADCCD is more noticeable in N2 , which is to be expected now that hextuples (the first excitation level at which the ansatz forms are different) are needed for a qualitatively correct description of the dissociation. Indeed, we find that disabling hextuple and higher excitations raises the pADCCD energy by 11.5 kcal/mol. Although their NPEs now differ noticeably, 56 mEh for pCCD versus 65 mEh for pADCCD, they are of the same order of magnitude and neither are close to quantitative. Achieving a more quantitative accuracy clearly requires a more flexible cluster expansion, but, as is well known, 18 this route leads to qualitatively incorrect variational violations when pursued in a traditional CC approach. As variational violations are not possible in a VMC-based approach, it will be interesting in future to investigate more flexible expansions within the amplitude determinant framework. 22

ACS Paragon Plus Environment

Page 23 of 33

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

−108.40

100

−108.60

Energy Error(kcal/mol)

pADCCD DOCI FCI pCCD

−108.50 Energy(Hartree)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

−108.70 −108.80 −108.90 −109.00

Page 24 of 33

pADCCD DOCI pCCD

95 90 85 80 75 70 65

−109.10 0.8

1.2 1.6 2.0 N−N distance (Å)

0.8

1.2 1.6 2.0 N−N distance (Å)

Figure 5: Dissociation of N2 in 6-31G basis set and energy error vs FCI. al. 51 In this model, for which DOCI is exact, 51 the Hamiltonian is given by

H=

X

+ ǫp (a+ p↑ ap↑ + ap↓ ap↓ ) − G

X

+ a+ p↑ ap↓ aq↓ aq↑

(34)

pq

p

where ǫp = p sets the site energies and G > 0 determines the strength of the pairing interaction and therefore of correlation effects. Previous studies have shown that in the strongly correlated large-G regime, both traditional CC 15 and pCCD 51 fail qualitatively with large variational violations. By construction, the variational pADCCD method cannot fail in the same manner, as its energy is guaranteed to stay above the exact energy. This behavior can be seen in Figures 6 and 7, where pADCCD not only stays variational, but retains a high accuracy even in the strongly correlated regime. Indeed, Figure 7 shows that pADCCD recovers between 95% and 100% of the exact correlation energy regardless of the value of G, while by G = 0.8 pCCD has failed qualitatively, recovering fully 200% of the correlation energy. This example demonstrates how a method like pADCCD that maintains strict variationality can be significantly more reliable than similar methods that do not. It is also worth noting that in the weak correlation region, where pCCD is accurate, pADCCD

24

ACS Paragon Plus Environment

Page 25 of 33

74

22 20

72

pADCCD DOCI pCCD

18

pADCCD DOCI pCCD

70 Energy(Hartree)

Energy(Hartree)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

16 14 12

68 66 64 62

10

60 58

8 0.2

0.4

0.6 G

0.8

0.1

1.0

0.2

0.3

0.4

0.5

0.6

0.7

G

Figure 6: Results for the 8 (left) and 16 (right) site attractive pairing model. Data for pCCD and DOCI kindly shared by Thomas Henderson. 44 and pCCD behave very similarly. A close observation shows that the energy difference between pCCD and pADCCD is at most 0.002 when G ≤ 0.2. This makes sense, because in the weakly correlated regime the system will behave more like a collection of independent electron pairs, for which both ansatzes are exact.

4

Conclusions

We have presented amplitude determinant coupled cluster with pairwise doubles (pADCCD) as a variational cousin to pCCD. Unlike the permanent-based coefficients of pCCD, pADCCD defines its expansion coefficients as amplitude determinants. Combined with variational Monte Carlo methods, this choice produces a method that is exact for an electron pair, size-consistent, polynomial cost, and variational. Initial tests on the dissociations of LiH, HF, H2 O and N2 reveal that pADCCD and pCCD produce similar results, suggesting that pADCCD’s determinant-based cluster expansion offers similar flexibility to pCCD’s more traditional form. In tests on the attractive pairing model, pADCCD proves accurate while pCCD suffers catastrophic variational violations. Together, these examples suggest that a 25

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

2.0

1.8

pCCD pADCCD

1.6 EC / ECexact

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 26 of 33

1.4

1.2

1.0

0.8 0.0

0.2

0.4

0.6

0.8

1.0

G

Figure 7: Fraction of the correlation energy EC recovered in the 8 site attractive pairing model. Data for pCCD kindly shared by Thomas Henderson. 44 determinant-based cluster expansion can offer both the reliability of a variational method and the compact encoding of correlation achieved by more traditional cluster expansions. Like pCCD and other seniority zero approaches, pADCCD proves effective for describing some strong electron correlations but is unable to deliver quantitative accuracy, a difficulty that may in future be addressed in two different ways. First, one may seek to increase the flexibility of the cluster expansion. We know that generalizing pCCD into CCSD greatly improves the recovery of weak correlation effects near equilibrium, but that it also leads to unacceptable variational violations as bonds are stretched. One exciting direction for future inquiry is therefore to investigate whether it is possible to construct a variational, determinant-based cluster expansion that can mimic CCSD as well as pADCCD mimics pCCD. A second exciting possibility would be to use an amplitude-determinant-based expansion as the trial function in projector Monte Carlo, which the low per-sample cost of pADCCD for low-lying configurations suggests may be an especially effective pairing. Of course, these avenues are not mutually exclusive, and we look forward to investigating both 26

ACS Paragon Plus Environment

Page 27 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

in future research.

Acknowledgement The authors thank Peter A. Limacher and Thomas Henderson for sharing raw pCCD and DOCI data with us. We also acknowledge funding from the Office of Science, Office of Basic Energy Sciences, the US Department of Energy, Contract No. DE-AC02-05CH11231. Calculations were performed using the Berkeley Research Computing Savio cluster.

Appendix For completeness, we include here expressions for the pADCCD orbital rotation gradient and hessian. These should provide everything needed for the Newton-Raphson algorithm we use for orbital optimization. The expressions are similar to those in the pCCD paper. 14 The energy can be written as ˆ

ˆ

E(~κ) =

hΨ|e−K HeK |Ψi hΨ|Ψi

(35)

with ˆ = K

XX p>q

Kpq a†pσ aqσ − a†qσ apσ

σ

(36)



  ˆ and transform the where we evaluate the orbital rotation unitary transformation as exp K

integrals into this new basis. The orbital gradient is 

∂E ∂Kpq

= Ppq



= Ppq

~κ=0

X

X 

H, a†pσ aqσ

σ

(uv|tp) Γvq ut

+



(up|tv) Γqv ut



(37) (uv|qt) Γvt up

uvt

27

ACS Paragon Plus Environment



(qv|tu) Γvu pt



Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 28 of 33

where Ppq is a permutation operator Ppq = 1 − (p ↔ q) and the notation for the expectation value means

hOi =

D

ˆ Ψ|O|Ψ hΨ|Ψi

E

(38)

Similarly, the Hessian is  ∂ 2E Qpq,rs = ∂Kpq ∂Krs ~κ=0 X    1 = Ppq Prs H, a†pσ aqσ , a†rτ asτ 2 σ,τ X  

 1 H, a†rσ asσ , a†pτ aqτ + Ppq Prs 2 σ,τ 

(39)

We obtain  1X sv vt vu δqr (uv|tp) Γvs ut + (up|tv) Γut + (uv|st) Γup + (sv|tu) Γpt 2 uvt  vq qv vu (uv|qt) Γvt ur + (qv|tu) Γrt + (uv|tr) Γut + (ur|tv) Γut

Qpq,rs = Ppq Prs

+ δps X sq vu vu + (up|vr) Γqs uv + (ur|vp) Γuv + (qv|su) Γpr + (sv|qu) Γrp uv



X

us st su (ut|qr) Γts up + (qu|tr) Γpt + (ur|qt) Γup + (qr|tu) Γpt

tu

qu uq tp + (ut|sp) Γtq ur + (up|st) Γsu + (su|tp) Γrt + (sp|tu) Γrt

28

ACS Paragon Plus Environment

(40)

Page 29 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

References (1) Helgaker, T.; Jøgensen, P.; Olsen, J. Molecular Electronic Structure Theory; John Wiley and Sons, Ltd: West Sussex, England, 2000; p 162. (2) Møller, C.; Plesset, M. S. Phys. Rev 1934, 46, 618. (3) Grimme, S.; Parac, M. ChemPhysChem 2003, 4, 292. (4) Neuscamman, E. J. Chem. Theory Comput 2016, 12, 3149. (5) Subedi, A.; Zhang, L.; Singh, D. J.; Du, M. H. Phys. Rev. B 2008, 78, 134514. (6) Ring, P.; Schuck, P. The Nuclear Many-Body Problem; Springer-Verlag: Berlin, 1980; pp 221–228. (7) Bytautas, L.; Henderson, T. M.; Jiménez-Hoyos, C. A.; Ellis, J. K.; Scuseria, G. E. J. Chem. Phys 2011, 135, 044119. (8) Weinhold, F.; Jr., E. B. W. J. Chem. Phys 1967, 46, 2752. (9) Allen, T. L.; Shull, H. J. Phys. Chem 1962, 66, 2281. (10) Smith, D. W.; Fogel, S. J. J. Phys. Chem 1965, 43, S91. (11) III, G. D. P.; Barlett, R. J. J. Chem. Phys. 1982, 76, 1910. (12) Raghavachari, K.; Trucks, G. W.; Pople, J. A.; Head-Gordon, M. Chem. Phys. Lett. 1989, 157, 479. (13) Stein, T.; Henderson, T. M.; Scuseria, G. E. J. Chem. Phys 2014, 140, 214113. (14) Henderson, T. M.; Bulik, I. W.; Stein, T.; Scuseria, G. E. J. Chem. Phys 2014, 141, 244104.

29

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(15) Bulik, I. W.; Henderson, T. M.; Scuseria, G. E. J. Chem. Theory Comput. 2015, 11, 3171. (16) Bartlett, R. J.; Musial, M. Rev. Mod. Phys 2007, 79, 291. (17) Lyakh, D. I.; Muslal, M.; Lotrlch, V. F.; Barlett, R. J. Chem. Rev 2012, 112, 182–243. (18) Cooper, B.; Knowles, P. J. J. Chem. Phys 2010, 133, 234102. (19) Gomez, J. A.; Henderson, T. M.; Scuseria, G. E. J. Chem. Phys 2016, 144, 244117. (20) Small, D. W.; Lawler, K. V.; Head-Gordon, M. J. Chem. Theory Comput 2014, 10, 2027. (21) Limacher, P. A.; Ayers, P. W.; Johnson, P. A.; Baerdemacker, S. D.; Neck, D. V.; Bultinck, P. J. Chem. Theory Comput 2013, 9, 1394. (22) Tecmer, P.; Boguslawski, K.; Johnson, P. A.; Limacher, P. A.; Chan, M.; Verstraelen, T.; Ayers, P. W. J. Phys. Chem. A 2015, 118, 9058–9068. (23) Boguslawski, K.; Ayers, P. W. J. Chem. Theory Comput. 2015, 11, 5252–5261. (24) Henderson, T. M.; Bulik, I. W.; Scuseria, G. E. J. Chem. Phys 2015, 142, 214116. (25) Shepherd, J. J.; Henderson, T. M.; Scuseria, G. E. J. Chem. Phys 2016, 144, 094112. (26) Robinson, J. B.; Knowles, P. J. J. Chem. Phys 2012, 136, 054114. (27) Umrigar, C. J. J. Chem. Phys. 2015, 143, 164105. (28) Casula, M.; Sorella, S. J. Chem. Phys. 2003, 119, 6500. (29) Roggero, A.; Mukherjee, A.; Pederiva, F. Phys. Rev. B 2013, 88, 115138. (30) Booth, G. H.; Thom, A. J. W.; Alavi, A. J. Chem. Phys. 2009, 131, 054106. (31) Cleland, D.; Booth, G. H.; Alavi, A. J. Chem. Phys. 2010, 132, 041103. 30

ACS Paragon Plus Environment

Page 30 of 33

Page 31 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(32) Thom, A. J. W. Phys. Rev. Lett. 2010, 105, 263004. (33) Franklin, R. S. T.; Spencer, J. S.; Zoccante, A.; Thom, A. J. W. J. Chem. Phys. 2016, 144, 044111. (34) Spencer, J. S.; Thom, A. J. W. J. Chem. Phys. 2016, 144, 084108. (35) Umrigar, C. J.; Filippi, C. Phys. Rev. Lett 2005, 94, 150201. (36) Chan, G. K.-L.; Sharma, S. Annu. Rev. Phys. Chem. 2011, 62, 465. (37) Nightingale, M. P.; Melik-Alaverdian, V. Phys. Rev. Lett. 2001, 87, 043401. (38) Umrigar, C. J.; Toulouse, J.; Filippi, C.; Sorella, S.; Hennig, R. G. Phys. Rev. Lett. 2007, 98, 110201. (39) Neuscamman, E. J. Chem. Phys. 2013, 139, 194105. (40) PySCF,

a

quantum

chemistry

package

written

in

python.

see

http://chemists.princeton.edu/chan/software/pyscf (accessed October 13, 2016). (41) Werner, H.-J.; Knowles, P. J.; Knizia, G.; Manby, F. R.; Schütz, M.; others, MOLPRO, version 2012.1, a package of ab initio programs. see http://www.molpro.net (accessed October 13, 2016). (42) Turney, J. M.; Simmonett, A. C.; Parrish, R. M.; Hohenstein, E. G.; Evangelista, F.; Fermann, J. T.; Mintz, B. J.; Burns, L. A.; Wilke, J. J.; Abrams, M. L.; Russ, N. J.; Leininger, M. L.; Janssen, C. L.; Seidl, E. T.; Allen, W. D.; Schaefer, H. F.; King, R. A.; Valeev, E. F.; Sherrill, C. D.; Crawford, T. D. WIREs Comput. Mol. Sci. 2012, 2, 556. (43) Limacher, P. A. personal communication, 2016. (44) Henderson, T. personal communication, 2016. (45) Hehre, W. J.; Stewart, R. F.; Pople, J. A. J. Chem. Phys. 1969, 51, 2657. 31

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(46) Dunning, T. J. Chem. Phys 1989, 90, 1007. (47) Hehre, W. J.; Ditchfield, R.; Pople, J. A. J. Chem. Phys 1972, 56, 2257. (48) Boguslawski, K.; Tecmer, P.; Limacher, P. A.; Johnson, P. A.; Ayers, P. W.; Bultinck, P.; De Baerdemacker, S.; Van Neck, D. J. Chem. Phys. 2014, 140, 214114. (49) Boguslawski, K.; Tecmer, P.; Bultinck, P.; Baerdemacker, S. D.; Neck, D. V.; Ayers, P. W. J. Chem. Theory Computi. 2014, 10, 4873–4882. (50) Boguslawski, K.; Tecmer, P.; Ayers, P. W. Phys. Rev. B 2014, 89, 201106(R). (51) Henderson, T. M.; Bulik, I. W.; Scuseria, G. E. J. Chem. Phys. 2015, 142, 214116.

32

ACS Paragon Plus Environment

Page 32 of 33

Page 33 of 33

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment