J. Phys. Chem. 1984, 88, 3532-3538
A Program for Efficient Perturbation Configuration Interaction James J. Diamond, Gerald A. Segal,* Department of Chemistry, University of Southern California, Los Angeles, California 90089- 1062
and Ross W. Wetmore Department of Chemistry, College of Physical Science, University of Guelph, Guelph, Ontario, Canada N l G 2 Wl (Received: February 6, 1984) A CI + Bk method has been developed by which configuration interaction energies and vectors may be obtained within the Bkapproximation at a fraction of the cost of two-electron integral transformations even when the size of the CI space required is large (>lo5 configurations), with errors in relative energies due to the Bkapproximation typically in the range of 0.1-0.15 eV. Our approach to large-scale electronic structure problems, PEPCI (Program for Efficient Perturbational Configuration Interaction), combines previously developed matrix-partitioning techniques with a rapid integral-driven Hamiltonian matrix element evaluation scheme based on a modified unitary group approach. The key features leading to the efficiency of this method are (a) a reorganization of the two-electron integral file which allows complete Hamiltonian matrix element evaluation with a single integral buffer, (b) a graphical representation of configuration (Le., a distinct row table framework) which bypasses the use of an explicit configuration list, (c) the use of separate DRTs for spatial occupancy and spin coupling, and (d) the precalculation of the matrix elements, (wlE,,lw), of single-particle excitation operators represented within the spin basis. Examples are given for the molecules BO2, allene, and (+)-(S)-2-butanol.
Introduction It has long been accepted in theoretical chemistry that explicit inclusion of electron correlation is necessary to achieve quantitative agreement between experimental and ab initio results. In addition, electron correlation can also make qualitative changes in the physical picture of some systems (eg., those possessing RHF-UHF type instabilities). Recent developments such as the symbolic matrix element method,2 CASSCF3 or super-CI method, and multireference direct CI techniques4 have made the configuration interaction (CI) method of electron correlation increasingly tractable. Foremost in the past several years has been the development of the unitary group approach (UGA), based on the elegant and powerful work of J. Paldus,, and its graphical interpretation (GUGA) by I. ShavittS6 Particular implementations of these ideas such as the loop-driven GUGA of B. Brooks et aL7s8 or the multireference direct CI-GUGA of H. Lischka et aL9 have resulted in computer codes capable of relatively large CI calculations. While these methods represent an order of magnitude improvement over older (e.g., configuration selection) CI programs, it still appears that very large (about 105-106 configurations, depending on the machine used) CI spaces form a computational stumbling block several times as expensive as the two-electron integral transformation. For this class of problems (in particular,
those not requiring great ( J), FIJ is the IJth Fock matrix element, and V,,, and ,V are two-electron integrals.
I I I I I Figure 3. Data structure used for ijkr subrecords. Each structure contains four words. IJKL is a packed label ( I > J > K > L ) , and VIJKL, VrKJL, and V,, label IJKL.
are the three two-electron integrals associated with the
ments but represents the effect of “subtracting out” the collection of terms in the Hamiltonian common to the bulk of C I configurations; these terms are summed prior to any explicit matrix element evaluation. This is done by choosing a reference configuration (Le., a set of reference occupations) and forming the Fock matrix and reference energy corresponding to the single determinant reference configuration. This, in the case of diagonal matrix elements, reduces the basis size dependence from n2 to (6n)2, where n is the number of orbitals and 6n is the average number of differences between the CI configurations’ occupations and the reference occupations; the latter depends only on the level of excitation, not on the orbital basis set size, and is typically in the range 4-8. In a similar way, the basis dependence of one-electron difference matrix elements is reduced from n to 6n. While, in general, the reference occupations may be any real number (e.g., average occupations), we choose to restrict the possible values of the reference occupation to 2 or 0. As a rule, the reference configuration has the lowest N mod 2 orbitals doubly occupied, and the remainder empty where N is the number of particles. The formal expressions for the reference energy and reference Fock matrix are given by Segal et al.14 We then make the crucial step of reorganizing the two-electron file into four parts: (1) preliminary information (i.e., basis size, reference occupations, reference energy, and diagonal Fock matrix elements). (2) an n X n matrix, whose lower triangle contains the Coulomb integrals, Kjh (i 2 j ) , and upper triangle, the exchange integrals, K,u,where Kjkl is the standard two-electron integral, (i( 1)k( 2 ) ~ 1 / r l 2 ~ ( 1 ) l ( 2(Note ) ) that 1 and 2 are small enough to reside in the core permanently; this is the only integral information needed for diagonal Hamiltonian matrix elements.) (3) the collection of “ij subrecords”, which contain all the integrals required for one-electron-difference and two-electrondifference (with two indices the same) Hamiltonian matrix elements. Each “ij subrecord” has the same structure as shown in Figure 2, where Z J is the subrecord label ( I > J), FIJ is the IJth Fock matrix element, and VIJkkand VIkJk are two-electron integrals. If all entries are zero, the subrecord is never created. (4) the large set of “ijkl subrecords”; these contain the integrals required for the two-electron difference matrix elements with all indices different. Each has the four-word structure as shown in Figure 3 where ZJKL is a packed label ( I > J > K > L ) . In the two-electron integral file, all nonempty ij subrecords are stored first, followed by all nonempty ijkl subrecords. Besides the obvious utility of this file structure in Hamiltonian matrix element formation, certain advantages accrue when the orbital basis is permuted (Le., when “interacting-space” CI calculations are desired), insofar as each subrecord has its two-electron integrals permuted and its label repacked; the set of subrecords of each type 3 and 4 may then be easily resorted, if necessary, with considerably less labor than resorting the canonically ordered two-electron integral file. The stream of two-electron integrals is associated with twoparticle excitation operators: EVEkl;this stream is grouped into 6 types of excitations (similar to the 14 types of loops in the
3536 The Journal of Physical Chemistry, Vol. 88, No. 16, 1984 loop-driven approach of Brooks et al.738). The explicit formulae for the six types are given in Appendix I. As in GUGA C I methods, evaluation of Hamiltonian matrix elements reduces to calculations of one- and two-body excitation operators represented in the spin basis. The spin basis used by us differs from the ordinary unitary group approach; only open shells and orbitals changing occupation under the influence of an excitation operator are used to identify the appropriate spin function. The spin basis used here resembles the symmetric group, in which particles and holes are treated equivalently, rather than the unitary group, where the location of doubly filled orbitals affects the sign of various loop segments. The essential difference is that middle one-body segments of type I, L, M, and P (in Shavitt’s notation6) are never used; these correspond to walks between the tail and head of a loop where orbitals are empty and filled in both configurations. The head segments (types A-D) must be modified by the phase factor ( - l ) b when the left- and right-hand walks have differing numbers of open shells; this maintains a consistent sign convention in the evaluation of spin-matrix elements of generators, ( wlEiilw). The resulting C I vectors differ from those generated by GUGA only by an irrelevant overall sign factor. Two special cases of two-body spin-matrix elements are treated: (a) The calculation of diagonal Hamiltonian matrix elements requires the evaluation of ( wlEijEj,Jw);since E, is simply the adjoint of Eji, this reduces to the positive number IEjilw)12.These are tabulated prior to any actual matrix element evaluation and kept in core with the Coulomb and exchange integrals (requiring less than 4000 words even for the case of a 12 open-shell singlet). (b) The calculation of matrix elements involving double excitations with two pairs of identical indices requires the evaluation of spin-matrix elements of the form (~jEf!~lw). This is nonzero only if orbital j is filled on the right-han side and empty on the left, and vice versa for orbital i; the left- and right-hand spin couplings must be identical. In our spin basis, the value of this two-body matrix element is exactly 2. All other two-body matrix elements are calculated in the “dot product” form by an explicit sum over intermediate states:
The one-body matrix elements are pretabulated and kept in core during the evaluation of Hamiltonian matrix elements. The explicit intermediate sum need not be expensive, if the nonzero terms involving Iw’)and Iw”) can be readily identified. In keeping with the picture of the Hamiltonian as a string of operators acting on a specific configuration, we seek all states Iw’) such that (wlEijEk,lw)is nonzero for a given Iw),rather than fixing both Iw’) and Iw). In a full space-spin graphical representation of configurations, the set of functions Iw’)(wlEijlw)is, in general, noncontiguous; and the lexical indices w’,if unknown, must be found for each Id). This “repacking” operation is, in our experience, the single most expensive step in Hamiltonian matrix element formation. We use two separate distinct row tables for spatial occupancy and spin coupling. Finding the configurations coupled by two-body excitations with a particulate configuration involves only one repacking operation for a given change in occupations. For example, in the predominant case where all four indices are different, the intermediate sum is
V i l k j C (SIEi/ls”)( s ’ l E k j l ~ )(18) S
Diamond et al. TABLE I: Sample Timing Data for Allene DZdsymmetry 44 AOs 38
transformed MOs
D2 symmetry used in CI
A 0 integral construction
SCF (9 iterations) integral transform preliminary sort total
state(s) nucleus CSFs core CSFs width of Hbo CSFs in total CI space CSFs actually used active/filled/virtual orbitals diagonal elements generatedb CPU time, s off-diagonal elements generated CPU time, s no. of CI roots sought total job time (CPU) per root, s
l’A, 7 128 128 73032 63251 4/5/29 190890 103.2 276634 217.1 1 553.4
‘A,
+ IBIa
591 s 232 s 1313 s 21 s 2157 s
+ IB?”
128 128 441801 289470
128 128 441680 293990
731144 334.5 919247 328.1 11 230.5
735542 334.7 964702 338.1 10 238.7
” CI space used was the appropriate D2 subspace spanned by all single, double, and triple substitutions from the SCF configuration. bThe number of diagonal matrix elements calculated exceeds the number of CSFs in the entire CI space because diagonal elements are used in a number of places in the program; it is easier to recalculate them when needed rather than store them on an external device. ble-{-quality atomic basis sets were used, supplemented with Rydberg-like Gaussians. In all cases, only SCF orbitals were used in the CI calculations. The theoretical values of the excitation energy for various states in each of these species are generally within 0.1 eV of the experimental results; corresponding oscillator strengths, MCD densities, etc. are also in good agreement with experiment. When at least two states of interest are sought in the CI, the required CPU time per root is no more than 20% of the CPU time needed for the combined atomic integral generation, SCF, and two-electron integral transformation. Readers are referred to the appropriate publication^^^-^^ for more details on particular calculations. Allene. The vertical excited states (at the ground-state DZd geometry) were calculated in the range of 6.0-8.5 eV above the ground state.z3 There were 18 singlet excited states found in this region. The primary purpose of this calculation was the ab initio evaluation of the MCD spectrum in this region. Another 12 singlet excited states were used to supplement the sum over states in an approximate evaluation of the MCD “B term”. The large number of desired roots and the relatively simple structure of the lower excited states permitted an unusual choice of configuration space; all single, double, and triple substitutions from the SCF occupation were allowed (symmetry selected within the Abelian D, subgroup of D2k) This led to C I spaces on the order of 440000 CSFs (in 0,). The core space was filled with all CSFs with a coefficient (from some initial CI) larger than 0.1 in magnitude and all CSFs with a diagonal Hamiltonian matrix element less than 0.85 au above the S C F energy. The ground state was also calculated within the C I space formed from all totally symmetric (in 0,) single and double substitutions from the leading seven terms in the CI expansion; the final eigenvalue differed by 0.05 eV from that obtained in the larger calculation. All the states of interest were thought to be of essentially Rydberg character. The standard Dunningz6atomic basis was supplemented with a set of (3s,3p,3d) Rydberg-like Gaussians on the central carbon atom. Timing data
)‘
The new spatial index, n’, is calculated only once for all spin functions Is’). Computational Timings Let us now examine some performance data for three sample molecules: allene, BOz, and (+)-(S)-2-butanol. The configuration spaces range in size over 2 orders of magnitude, from 25 000 to 3 500 000 configuration spin functions (CSFs). All calculations were performed on an IBM 3 7 0 / 1 6 8 (about 3 times faster than a VAX 11/780 with floating-point accelerator). Standard dou-
(23) J. J. Diamond and G. A. Segal, J . Am. Chem. Soc., 106,952 (1984). (24) V. Saraswathy, J. J. Diamond, and G. A. Segal, J . Phys. Chem., 87, 718 (1983). (25) G. A. Segal and K. Wolf, J. Am. Chem. SOC.,in press. (26) T. H. Dunning, Jr., and P. J. Hay, “Modern Theoretical Chemistry, Vol. 3, H. F. Schaefer 111, Ed., Plenum Press, New York, 1977, 1. (27) (a) K. Fuke and 0. Schnepp, Chem. Phys., 38,211 (1979); (b) J. W. Rabalais, J. M. McDonald, V. Scherr, and S . P. McGlynn, Chem. Rev., 71, 73 (1971). (28) J. W. Johns, Can. J . Phys., 39, 1738 (1961). (29) P. A. Snyder and W. C. Johnson, Jr., J . Chem. Phys., 59, 2618 (1973).
The Journal of Physical Chemistry, Vol. 88, No. 16, 1984 3537
Program for Configuration Interaction TABLE 11: Sample Timing Data for BO, D,, symmetry, RBO= 1.265 A A 0 integral construction 221 s 42 AOs 36 transformed MOs D2,,symmetry used in CI
SCF (5 iterations) integral transform preliminary sort
32 s 120 s 10 s
total nucleus CSFs core CSFs width of Hba CSFs in total CI space CsFs actually used active/filled/virtual orbitals diagonal elements generated CPU time, s off-diagonal elements generatedb CPU time, s no. of CI roots sought total job time (CPU) per root, s
383 s
XZII,
AQ,
128 128 32560 25369 5/5/23 42539 45.1 116418 73.9 1 308.8
4 128 128 25397 29960 5/5/23 34285 32.6 103388 71.4 1 274.3
state@)
5
C2Zgf 5 128 128 32508 28909 6/4/23 49440 55.3 131510 77.0 1 320.7
'See footnote b, Table I.
TABLE III: Sample Timing Data for 2-Butanol C1symmetry 59 AOs 54 transformed MOs C1symmetry used in CI
A 0 integral construction SCF (1 1 iterations) integral transform preliminary sort
3331 699 4309 258
s s s s
8597 s
total
statebl
XIA
F-SA
nucleus CSFs core CSFs width of Hba CSFs in total CI space CSFs actually used active/filled/virtual orbitals diagonal elements generated CPU time, s off-diagonal elements generated CPU time, s no. of CI roots sought total job time (CPU) per root, s
4 128 4s 735624 735446 5/14/35 735446 386.1 1076869 1032.4 1 2376.7
10 128 45 3414818 2387101 11/21/31 2387101 1502.6 48658 11 2056.0 4 1653.4
for these calculations are given in Table I. Hamiltonian matrix element formation was quite fast, comprising only about 25% of
the total job time. The submatrix Hbawas in each case very sparse, with 2%-3% nonzero elements. Because we calculate complete Hamiltonian matrix elements, rather than loops, a direct comparison with timings from loop-driven GUGA-based programs is not possible. In this set of calculations, diagonal matrix elements consisted of about 10 "loop pieces" on the average and off-diagonal elements about 1 "loop pieces". The matrix element production rates of 2200 and 2900 elements/CPU s for diagonal and offdiagonal elements, respectively, are, in this regard, moderately efficient. Comparable rates for other calculations reported here are generally within the range of 1500-2500 elements/CPU s. For each of the vertical single excited states of allene, the total job time (CPU) per root was about 10% of the total CPU time needed prior to any C I (integral generation, SCF, etc.). The theoretical results and experimental data for the lower excited states of allene are shown in Table IV. The ab initio spectrum agrees with experiment to about 0.15 eV, despite the use of results at only the vertical geometry. Oscillator strength and calculated MCD parameters are also in good agreement. In particular, the lowest optically allowed state (1'E in DZd)is calculated at 6.87 eV; the experimental maximum lies at 6.72 eV. Given the complicated nature of the observed spectrum, this level of agreement is certainly satisfactory for the theoretical interpretation of the overall absorption spectrum. BO2. The ground state (X2n) and three low-lying excited double states (A211,, B2Z,+, and C1Z6+)were calculated at various g e ~ m e t r i e s .The ~ ~ location of the optically forbidden (but twophoton-allowed) C2Z6+state was the primary focus of the calculation. The DZhsubgroup of Dh was used to symmetry select CSFs in the linear geometry. Only one state of a given symmetry was sought. The performance data for some of these states are shown in Table 11. This series of calculations is the smallest example given here. C I spaces averaged around 30 000 CSFs. Individual calculations required about 5 CPU min each, with Hamiltonian matrix element formation using about 35% of the total job time. The two-electron integral transform in the linear geometry took only about 2 min, due to the presence of high symmetry. Although only Abelian symmetry was exploited in the CI, the total CPU time required for a given geometry for the four electronic states (including integral generation, SCF, etc.) was about 25 min. A summary of the comparison between theoretical and experimental results is shown in Table IV. No attempt
TABLE I V Comparison of Theoretical and Experimental Results for Various Molecular and Electronic Soecies molecule alieneafb
upper state 1'E 1'B,
AEobsd, eV
fcalcd
fobsd
6.872 7.44s
6.72 7.23
0.079 0.285
-0.03
21B2
7.691 8.200 8.339
-7.7 8.02 8.15
0.036 0.037 0.028
4'E
8.4184
8.25
5IE 6'E
8.4188 8.624
8.32 8.38
AQ" BZZ, C2ZB+
2.446 2.998 3.785
2.329 3.039
7.10 7.68 7.85 7.9s 8.08 8.31 8.60 8.70 8.75 8.80 8.91 8.98
6.94 7.64
0.005 0.005 0.021
7.94
0.001
AEWlCd
MCDcalcd
-2.07 12.43
0.34 2IE 3IE
-13.79 -59.8 -3.2
MCDobsd
-8.06 19.0
-
-7
-0.12 0.020
BOZcqd
butanol'/
0.019
-100 -28.1
0.029 0.005 0.0 16 0.0 14 0.017 0.014 0.004 0.002
"Theoretical data from ref 23. bExperimental data from ref 27. cTheoretical data from ref 24. dExperimental data from ref 28. eTheoretical data from ref 25. /Experimental data from ref 29.
3538 The Journal of Physical Chemistry, Vol. 88, No. 16, 1984 was made to locate the minimum of the A211, state; the distance RB0 = 1.302 A (suggested by experiment) was used. The calculated adiabatic excitation energies for the A and B states are in reasonably good agreement with experiment, differing by 0.12 and 0.04 eV, respectively. The CzZ,+ state is predicted to be linear, with an equilibrium bond length of Rm = 1.268 A, 3.785 eV above the ground state. In spite of the little time required for the C I part of these calculations, this example represents the most relatively inefficient case given here. This is because only one root of a given symmetry was sought in any particular C I run. (+)-(S)-2-Butanol. The singly excited states of optically active 2-butanol have been calculated in the 7.0-9.0-eV region.25 The complete lack of symmetry and presence of many excited states made this the largest and most difficult set of C I calculations performed in this laboratory to date. The lower excited states are mast likely of Rydberg character. A standard double-{-quality atomic basis set was used on each atom, except on the carbon atom farthest from the oxygen. In an attempt to minimize the size of the basis set, an STO-3G set was placed on this atom: this was the only center where this economy could be made without serious degradation. Two sets of (3s,3p) and (4s,4p) Rydberg-like Gaussians were added to the oxygen atom, leading to 59 atomic orbitals. The lowest five molecular orbitals were not transformed. Nonetheless, total CPU time for all work prior to any CI, including the two-electron integral transform, was approximately 2.5 CPU h. This approaches the limit of feasibility in our computing environment. Table I11 shows the timing data for the ground state and some of the excited states. The calculation of the excited states was complicated by the fact that, within this one-particle basis, none of the excited states was essentially represented by three or four configurations; many states had at least 10 or more configurations with coefficients larger than 0.2 in smaller C I runs. To keep the size of the C I calculation tractable, the width of the columns over which the submatrix Hbawas evaluated was kept at 46 CSFs, the full core space of 128 CSFs was retained, and the 46 CSFs most important to the states of interest were to form Hba. The configuration space used in the ground-state calculation consisted of all single and double substitutions from the leading four terms in the CI expansion; this led to a C I space of dimension -750000. The excited states were formed from the space of all single and doubles from usually 10 leading terms in the states of interest: this typically resulted in dimensions of 3 000 000 CSFs or more. The column matrix Hba was somewhat less sparse than usual, with around 4.5% nonzero elements. Given the large amount of data handling required, it was pleasing that CPU time (in a multiuser and virtual memory environment) was no less than 40%-50% of wall clock time for these very large calculations. As seen in Table 111, the off-diagonal Hamiltonian matrix element production rate was still quite high in the largest calculation cited (2400 elements/CPU s), indicating that the size of the orbital basis has little effect on matrix element formation. Comparison of ab initio and experimental results is shown in Table IV. The calculated excitation energies for the lower two bands with maxima at 6.94 and 7.69 eV are in reasonable agreement with experiment, as are the oscillator strengths for the states in these bands. The continuously increasing absorption beyond 8.5 eV in the experimental spectrum is reflected in the higher density of states calculated above 8.6 eV. Within this or an even larger atomic orbital basis, it was not possible to obtain a satisfactory theoretical description of the excited-state spectrum without the use of extensive CI. These calculations represent perhaps the largest C I calculations ever attempted (albeit within the Bkapproximation). Given the necessarily limited nature of the atomic orbital basis set used and the rather large dimensions of the C I calculations, it is gratifying to obtain reasonably accurate (ca. 0.1-0.15eV) excitation energies in a relatively short amount of time.
Diamond et al.
Conclusion We have developed a rapid and efficient scheme for performing large-scale CI + Bkcalculations. The calculations of eigenvalues and eigenvectors is stable and free from pathologies. Using modest basis sets and our C I approach, we have consistently obtained ab initio results in agreement with experiment at within 0.1-0.15 eV or better. Complete Hamiltonian matrix element formation is achieved with a single pass of the two-electron integral file. The computational times required are not demanding, even in the largest cases. For example, four approximate eigenvalues and eigenvectors of a Hamiltonian matrix of dimension 3.5 X lo7 were obtained in 2 CPU h. We believe this approach to CI is a practical one where great accuracy is not needed and where the explicit effects of electron correlation are essential to the theoretical description of electronic structure problems.
t
Acknowledgment. This work was supported by N S F Grant C H E 81-06016 and, in part, by N I H Grant G M 30114.
Appendix I. Hamiltonian Matrix Element Formulas There are six different classes of matrix elements treated in the program PEPCI: three are associated with the Coulomb and exchange two-electron integrals, two with the integrals in each ij subrecord, and one with the remaining integrals of the form K ] k , = (i(l)k(2)~l/r12~(l)l(2)). In each of the following examples, it is assumed that the left- and right-hand side occupations are consistent with the indicated change in orbital occupancy. Each configuration is labeled by the pair (n;s), where n designates the lexical index associated with a given set of occupations and s designates one of the valid spin couplings for that spatial occupancy. The quantities Erefand FIJare the reference energy and the ZJth Fock matrix element of the reference configuration used to form the repartitioned Hamiltonian. Likewise, An, is the difference between the configuration of interest and the reference configuration in the occupations of the ith orbital. (1) Diagonal Hamiltonian Matrix Elements.
(n;slHln;s)= Eref+ CF,,An, + I
VIIil+
I
IAnl=2
C[KI]J-
KJlJ/21AnlAnJ
+
-
~ J l ] ( ~ E J l ~ n ; s ) ~ 2
+-i
I>J
1/21
(A-1)
y=nl=l
( 2 ) Spin-Exchange Elements.
(n;slHln;s)=
KJV;,,(n;slE$,,In;s) 1’1 n,=n,= 1
(S’ f S)
-+
(A-2)
( 3 ) Excitations of the Form aa bb. (n’;slHln;s) =Vabab if S ’ = S = 0 otherwise (A-3) ( 4 ) Excitations of the Form a b. (n’;SlHln;s)= eS,,[Fab Vaaab(ni Ana)/2 vabbb(nb + + ( v a b k k - Vakbk/2)Ankl +
+
c
+
k#a,b
k#a,b nt=
vakbk[(SIEbkEkalS)
- es’s/21
(A-4)
1
where eSfs= (n’;slEbaln;s). ( 5 ) Excitations of the Form ab -cc. = Vakbk(S1EkaEkblS)
-
(n’;SlHln;S)
(‘4-5)
Excitations of the form cc ab may be formed from the adjoint of this expression. (6) Excitations of the Form ab cd.
-
(n’;slNln;s)= Vacbd(SIEeaE,&) + Vadbc(SIEdaEcblS) (A-6)
: