Improved Design of Orbital Domains within the Cluster-in-Molecule


J. Phys. Chem. A 2010, 114, 8644–8657

Improved Design of Orbital Domains within the Cluster-in-Molecule Local Correlation Framework: Single-Environment Cluster-in-Molecule Ansatz and Its Application to Local Coupled-Cluster Approach with Singles and Doubles†

Wei Li and Piotr Piecuch*

Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, USA

Received: January 26, 2010; Revised Manuscript Received: March 11, 2010

The improved variant of the local correlation coupled-cluster (CC) framework termed “cluster-in-molecule” (CIM), defining the single-environment (SE) CIM-CC approach, is presented and tested at the CC singles and doubles (CCSD) level. In the proposed SECIM-CC method, the previous design of the CIM orbital subsystems [Li, W.; Gour, J. R.; Piecuch, P.; Li, S. J. Chem. Phys. 2009, 131, 114109], referred to as the dual-environment (DE) CIM-CC approach, which is based on the ideas of central orbitals and the associated primary and secondary environments, is replaced by the simplified design in which the central localized molecular orbitals (LMOs) and the corresponding environment LMOs are first assigned to each nonhydrogen atom and the hydrogen atoms that are bound to it. The SECIM-CC approach offers improvements over the DECIM-CC results, particularly for weakly bound molecular clusters using diffuse basis functions. Through the use of a single parameter to define the environment LMOs and through the assignment of subsystem LMOs to atoms, the SECIM-CC calculations are easy to control and the CIM subsystems do not unnecessarily vary with the nuclear geometry, creating smoother potential energy surfaces. The performance of SECIM-CCSD is illustrated by the calculations for normal alkanes and water clusters described by the 6-31G(d), 6-31++G(d,p), and 6-311++G(d,p) basis sets.

1. Introduction

The coupled-cluster (CC) theory [1] has become the standard for high-accuracy electronic structure calculations [2,3], but there is a need for advancing CC methodology, particularly in the area of larger molecular systems. Indeed, the CPU times of the basic CC singles and doubles (CCSD) calculations [4] are already defined by the steps that scale as $n_o^2 n_u^4$, where $n_o$ and $n_u$ are the numbers of occupied and unoccupied orbitals, respectively, included in the post-Hartree-Fock (HF) or other post-self-consistent-field calculations, or as $N^6$, where $N$ represents a measure of molecular size. In the case of the CC methods with higher-than-double excitations, including the popular CCSD(T) approach [5] and its CR-CCSD(T) [3,6], CR-CC(2,3) [7], CCSD(2) [8], and other similar extensions [9] that improve the CCSD(T) results in the bond breaking and biradical regions of potential energy surfaces (PESs), and in the case of other higher-order CC approaches, one has to consider even more expensive steps that make the CC calculations for larger polyatomic systems prohibitively expensive. The memory requirements, which typically scale as $N^4$, rapidly grow with the system size as well. Code parallelization alleviates the situation to some extent [10], and impressive CC calculations that scale well across ~100,000 CPUs have been reported [10p], but parallelization of CC equations alone, when one has to deal with the steep polynomial scalings of the computer costs with the system size, is insufficient. One must attack the power scaling laws that define costs of the CC and other correlated calculations without sacrificing the accuracies these methods offer. One can do this by turning to localized molecular orbitals (LMOs) [11].
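To make the scaling argument concrete, the following minimal Python sketch evaluates how the cost of the $n_o^2 n_u^4$ steps and the $N^4$ memory demand grow when the system is enlarged. The function names are hypothetical and chosen for illustration only, and the loop assumes that $n_o$ and $n_u$ both grow in proportion to the system size.

```python
def scaling_ratio(size_factor: float, power: int) -> float:
    """Cost ratio for a step scaling as N**power when N grows by size_factor."""
    return size_factor ** power


def ccsd_cpu_ratio(occ_factor: float, unocc_factor: float) -> float:
    """Cost ratio of the n_o^2 n_u^4 steps when n_o and n_u grow by the given factors."""
    return occ_factor ** 2 * unocc_factor ** 4


if __name__ == "__main__":
    # Assumption: n_o and n_u both grow linearly with the system size.
    for k in (2, 4, 10):
        cpu = ccsd_cpu_ratio(k, k)      # equivalent to N^6 growth
        memory = scaling_ratio(k, 4)    # N^4 growth
        print(f"system x{k}: CPU x{cpu:g}, memory x{memory:g}")
```

Under these assumptions, doubling the system size multiplies the CPU time of the most expensive CCSD steps by 64 and the memory demand by 16, whereas doubling only the number of unoccupied orbitals (for example, by enlarging the basis set for a fixed molecule) multiplies the CPU time by 16.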

† Part of the “Klaus Ruedenberg Festschrift”.
* To whom correspondence should be addressed. E-mail: piecuch@chemistry.msu.edu.

Among the earliest uses of LMOs in molecular CC calculations are the works of Laidig et al. [12], Cullen and Zerner [4b], Förner et al. [13], and Takahashi and Paldus [14] in the 1980s, and significant progress has occurred since then toward reducing the large costs of canonical CC calculations employing delocalized HF orbitals through the use of LMOs or atomic orbitals (AOs) [15-28] (see, e.g., refs 29 and 30 for information about the analogous low-order scaling algorithms for other kinds of ab initio correlated calculations). In addition to the intrinsically local CC methods, one can also reduce the costs of CC calculations by exploiting the energy-based fragmentation approaches in which the system is divided into intuitively defined fragments and the total energy is assembled from the corresponding fragment energies, as in the fragment-molecular-orbital (FMO) approach [31] (see ref 32 for the FMO-CC work), the fragmentation schemes of ref 33 and, to a certain extent, the incremental methods of ref 34 (see ref 35 for a recent review). Of all intrinsically local CC approximations to date, the approach that has received the most attention is that of Werner, Schütz, and their co-workers [17,18], who reported the low-order or linear scaling implementations of the local CCSD, CCSD(T), and CCSDT-1 methods, most recently adding the explicitly correlated F12 capability to local CCSD [36]. Their implementations of local CC approaches rely on the local correlation formalism of Pulay and Saebø [29], in which one solves the CC equations in a basis of orthonormal occupied LMOs obtained with one of the MO localization schemes [37] and nonorthogonal unoccupied orbitals constructed from the projected AOs (PAOs), while dividing the system into domains to which excitations defining the CC ansatz are restricted.
We have chosen a different and, as we believe, more transparent route compared to many of the past development efforts, which has resulted in the formulation and highly efficient computer implementations of the linear scaling local CCSD, CCSD(T), and CR-CC(2,3) approaches [28] that rely on the theoretical framework termed “cluster-in-molecule” (CIM), which was initially suggested in the context of the pilot second-order many-body perturbation theory (MBPT) and CC doubles (CCD) [1c-e,38] considerations in ref 22. The CIM-CC approaches result in a straightforward algorithm in which, beginning with the AO → MO integral transformation and ending with the final CC work, the CC calculation for a large system is split into independent and relatively inexpensive CC calculations for local orbital domains called CIM subsystems, which are subsequently used to evaluate the correlation energy of the entire system from the correlation energy contributions that are associated with the individual occupied LMOs. Since the CIM subsystems are practically independent of the system size, the CPU time of the CIM-CC calculations scales nearly linearly with the size of the system, while the memory requirements of the largest CC calculation within the CIM framework are virtually independent of the system size [28]. Other features of the CIM-CC methods include the applicability to covalently and weakly bound molecular systems and to all kinds of CC and MBPT methods, which one can mix within a single calculation; the use of orthonormal orbitals in the calculations for individual orbital subsystems, which enables one to utilize the conventional CC routines for orthonormal bases in coding CIM-CC approaches; the coarse-grain parallelism when splitting the calculation into CIM orbital subsystems, which can be further enhanced by the fine-grain parallelism of each CC (or MBPT) subsystem calculation; and the noniterative character of the local triples corrections of CCSD(T), CR-CC(2,3), truncated MBPT, etc., when one makes use of the quasi-canonical MOs (QCMOs) that diagonalize the occupied-occupied and unoccupied-unoccupied blocks of the Fock matrix in each CIM subsystem calculation [28].

We note that many of the previously formulated approaches use the less convenient nonorthogonal orbitals that complicate the resulting computer codes (cf., e.g., refs 17 and 18), do not always have the intrinsic linear scaling due to the presence of higher many-body terms in the energy cluster expansions [34], and sometimes replace the noniterative steps of CCSD(T), CR-CC(2,3), MBPT, etc. by the more complex iterative steps [17c-e], which can be avoided within the CIM-CC framework. By comparing the canonical and CIM CC results for alkanes and water clusters of varying size, we have demonstrated that CIM-CCSD, CIM-CCSD(T), and CIM-CR-CC(2,3) can accurately recover the corresponding canonical correlation and relative energies, while offering savings in the computer effort by orders of magnitude [28].

There remain, however, challenges that the CIM-CC methodology faces. One such challenge is the CIM subsystem design adopted in ref 28, which is based on the ideas of central orbitals and the associated primary and secondary environments, and which is referred to here and elsewhere in this paper as the dual-environment (DE) CIM approach. The DECIM subsystem design uses two parameters, $\zeta_1$ and $\zeta_2$, to define the primary and secondary environments for each CIM orbital subsystem. Although we have developed intuition about the values of $\zeta_1$ and $\zeta_2$ appropriate for a given application, and come up with reasonable default values of these parameters [28], it is important to examine if one can reduce the number of variable CIM parameters without losing accuracy and without increasing the computer costs of the CIM-CC calculations. Another challenge for CIM-CC is the use of diffuse basis set functions which, as shown in this study, create difficulties in the DECIM-CC calculations for weakly bound molecular clusters if we do not use extremely tight values of $\zeta_1$ and $\zeta_2$.

Both of these issues are addressed in this paper by developing the improved variant of the CIM-CC methodology, termed the single-environment (SE) CIM-CC approach, in which the previous DECIM subsystem design adopted in ref 28, which uses two kinds of orbital environments and two parameters $\zeta_1$ and $\zeta_2$ to define them, is replaced by the simplified design in which the central LMOs of each CIM subsystem and the corresponding environment LMOs are assigned to each nonhydrogen atom and the adjacent hydrogen atoms that are bound to it. The SECIM-CC approach preserves all of the essential features of the CIM-CC methodology, while offering significant improvements in the description of the relative energies of weakly bound molecular clusters when the AO basis set used in the calculations contains diffuse functions and additional small improvements in the CIM-CC calculations without diffuse functions when compared to the analogous DECIM-CC calculations. Through the use of only one parameter $\zeta$ to define the environment LMOs in the SECIM-CC methodology, rather than the two parameters $\zeta_1$ and $\zeta_2$ used in the DECIM-CC approach of ref 28, and through the assignment of the environment LMOs to atoms, the SECIM-CC calculations are easier to control and the final orbital subsystems do not unnecessarily vary with the nuclear geometry, creating smoother PESs. The performance of the SECIM-CC approach is illustrated by the SECIM-CCSD calculations for normal alkanes and water clusters of varying size, as described by the 6-31G(d) [39], 6-31++G(d,p) [39,40], and 6-311++G(d,p) [40,41] basis sets.

J. Phys. Chem. A, Vol. 114, No. 33, 2010. 10.1021/jp100782u © 2010 American Chemical Society. Published on Web 04/07/2010

2. Theory and Algorithmic Details

2.1. Key Concepts of CIM-CC.
The basic idea of all CIM-CC approaches, including the DECIM-CC methods reported in ref 28 and the SECIM-CC scheme developed in this work, is the observation that the total correlation energy of a many-electron system can be formally obtained as a sum of contributions from the occupied orthonormal LMOs and their respective occupied and unoccupied orbital domains. For example, if $i, j, k, \ldots$ ($a, b, c, \ldots$) designate the spin-orbitals occupied (unoccupied) in the reference determinant $|\Phi\rangle$ and $p, q, r, \ldots$ are the generic spin-orbitals, the CCSD correlation energy is determined using the equation

$$\Delta E^{(\mathrm{CCSD})} = \sum_{i,a} f_i^a t_a^i + \sum_{i<j,\,a<b} v_{ij}^{ab} \tau_{ab}^{ij} \qquad (1)$$
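As an illustration of how the energy expression of eq 1 can be split into contributions from individual occupied spin-orbitals, the following NumPy sketch evaluates the CCSD correlation energy from given amplitudes. It rests on several assumptions that are not stated in the text above: the function names and array layouts (`f_ov[i, a]`, `v_oovv[i, j, a, b]`, `t1[i, a]`, `t2[i, j, a, b]`) are hypothetical, the integrals `v_oovv` and amplitudes `t2` are taken to be antisymmetric in each index pair, $\tau_{ab}^{ij}$ is given its standard definition $t_{ab}^{ij} + t_a^i t_b^j - t_b^i t_a^j$, and the per-orbital partition is one natural choice rather than necessarily the one used by the authors. The doubles term is written as one quarter of the unrestricted sum, which equals the restricted sum over $i<j$, $a<b$ for antisymmetric quantities.

```python
import numpy as np


def antisymmetrize(x):
    """Antisymmetrize a 4-index array in its first and in its last index pair."""
    x = x - x.transpose(1, 0, 2, 3)
    return x - x.transpose(0, 1, 3, 2)


def build_tau(t1, t2):
    """tau[i,j,a,b] = t2[i,j,a,b] + t1[i,a] t1[j,b] - t1[i,b] t1[j,a] (assumed standard form)."""
    return t2 + np.einsum("ia,jb->ijab", t1, t1) - np.einsum("ib,ja->ijab", t1, t1)


def ccsd_correlation_energy(f_ov, v_oovv, t1, t2):
    """Total energy: singles term plus 1/4 of the unrestricted doubles sum."""
    tau = build_tau(t1, t2)
    singles = np.einsum("ia,ia->", f_ov, t1)
    doubles = 0.25 * np.einsum("ijab,ijab->", v_oovv, tau)
    return singles + doubles


def occupied_contributions(f_ov, v_oovv, t1, t2):
    """Illustrative split of the energy into one contribution per occupied spin-orbital i."""
    tau = build_tau(t1, t2)
    return (np.einsum("ia,ia->i", f_ov, t1)
            + 0.25 * np.einsum("ijab,ijab->i", v_oovv, tau))


def restricted_doubles(v_oovv, tau):
    """Doubles term written as an explicit restricted sum over i<j and a<b."""
    n_occ, n_unocc = tau.shape[0], tau.shape[2]
    return sum(v_oovv[i, j, a, b] * tau[i, j, a, b]
               for i in range(n_occ) for j in range(i + 1, n_occ)
               for a in range(n_unocc) for b in range(a + 1, n_unocc))
```

With random test data generated by `antisymmetrize`, summing the array returned by `occupied_contributions` reproduces `ccsd_correlation_energy`, and the restricted and unrestricted forms of the doubles term agree to numerical precision.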