Perturbation Improved Natural Linear-Scaled Coupled-Cluster Method

2 days ago - The fragment-based coupled-cluster (CC) theory utilizing the transferable functional groups through natural localized molecular orbital (...
2 downloads 0 Views 3MB Size
Article Cite This: J. Phys. Chem. A XXXX, XXX, XXX−XXX

pubs.acs.org/JPCA

Perturbation Improved Natural Linear-Scaled Coupled-Cluster Method and Its Application to Conformational Analysis Yifan Jin and Rodney J. Bartlett*

J. Phys. Chem. A Downloaded from pubs.acs.org by UNIV OF SOUTH DAKOTA on 12/26/18. For personal use only.

Quantum Theory Project and Departments of Chemistry and Physics, University of Florida, Gainesville, Florida 32611, United States ABSTRACT: The fragment-based coupled-cluster (CC) theory utilizing the transferable functional groups through natural localized molecular orbital (NLMO), that is, the natural linear-scaled coupled-cluster (NLSCC) has been further developed to take the extra-fragment interactions into account. The correction to the interaction energies sacrificed during the fragmentation process for the previous NLSCC method is computed by a computationally efficient perturbation theory that maintains the original scaling. The new linear-scaled coupled-cluster for the singles and doubles (CCSD) method is applied to the analysis of relative energies of delicate conformational problems of polypeptides. By adding a perturbation correction, results accurate to less than a kcal/ mol are obtained for the alanine tetramer. (1) Simply use frozen natural orbitals (FNO)8 or optimized virtual orbitals (OVOS)9−11 to reduce the dimension of the virtual space. (2) Use an internal simplification via pseudonatural orbitals (PNO)12−15 that has demonstrated application to quite large systems. (3) Cut a large molecule into several smaller fragments and treat the correlation problem in each of them semiindependently.16−19

1. INTRODUCTION Coupled-cluster (CC) theory can calculate correlated electronic properties of atoms and molecules with high accuracy, causing it to be used more and more widely in chemistry, physics, material sciences, and even nuclear physics.1−4 Under the same truncation as in configuration interaction (CI), the power of CC theory lies in its natural inclusion of higher-order disconnected clusters making the method size-extensive. This feature enables the CC method to evaluate properties more accurately as it converges faster to the full CI results while relative energies on potential energy surfaces are also more accurate for size-extensive treatments.5 Therefore, the results calculated by the CC methods, especially those that include triple excitations or higher connected clusters [such as CCSD(T), CCSDT, CCSDTQ, etc.] are often taken to provide benchmark values. However, the high computational cost of the CC method makes it difficult to apply it to large systems. For a system with n occupied orbitals and N virtual orbitals, even when the truncation is restricted to single and double excitations, as in CCSD, its naive scaling is roughly n2N4, where n and N are the numbers of occupied and virtual orbitals. To calculate the properties of large molecules, one often has to switch to other electronic structure methods like density functional theory (DFT), sacrificing any way to guarantee convergence to the right answer.6,7 Instead of depending on other less-predictive methods, an alternative approach is to reduce the scaling of the coupledcluster theory by making some approximations. There are about three quasi-independent approaches: © XXXX American Chemical Society

In the last case, by properly choosing the theoretical methodologies and fragmentation techniques, the accuracy of the properties computed by such approximate methods is comparable to the coupled-cluster calculation of the entire unit. At the same time, the computational cost is significantly reduced, usually scaling linearly with the size of the molecule. This is the foundation for the natural linear scaled coupledcluster (NLSCC) approach. There is an extensive history of using localized orbitals to facilitate fragmentation in large molecule calculations.20−35 The basic idea is similar in all these applications, with only the choice of orbital localization and other details being different. The merits of localized SCF orbitals and their approximate transferability are well-known.36−38 But that is only one component of the transferability of functional groups that is at the heart of NLSCC.28,29 The latter provides a full, correlated description of a functional group (QM1) in a buffer (QM2), with its energy and density matrix, that, in principle, could be Received: August 15, 2018 Revised: December 3, 2018

A

DOI: 10.1021/acs.jpca.8b07947 J. Phys. Chem. A XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry A

Figure 1. Examples of the Hartree−Fock orbitals and NLMO of the pentane molecule.

summation of the units (T = TA + TB + TC + ··· + Ti). The coupled-cluster wave function is thus29

stored and strung together to build large molecules as an electronic structure analog of molecular mechanics. To the degree that these units are transferable, one immediately has a wave function or density matrix and energy for the units that can be tied together to represent large molecules with essentially no additional computational demand. This transferability is achieved by identifying two units in a large molecule. A QM2 unit is meant to contain the QM1 units. The latter are the transferable functional groups of chemistry, like that for a ketone, R, R′CO, or acid, R COOH, or a peptide unit, or phenyl ring, or almost anything composed of a few atoms. So the rate-determining step will be the CC computation of the QM2 unit, whose dangling bonds might be neutralized by H atoms, or some kind of pseudopotenital for the rest of the system. Within this QM2 unit, one or more QM1 units are identified, and by virtue of its CC wave function being represented within local occupied and virtual units, it can be extracted and even stored for future use insofar as its average molecular geometry is adequate as a first approximation to that group in a different molecule. The same pertains to its energy and density matrix. Thus, QM2 provides a buffer to more correctly represent the wave function for the extracted QM1 units. Although many choices of localized orbitals could be used, NLSCC chooses to use Weinhold’s natural localized molecular orbitals (NLMO),39 primarily because that procedure readily yields local occupied and virtual orbitals, while many other localization methods do not (see, however, references from ref 1). Having both localized in an unambiguous (though not unique) way is essential in a description of the correlation problem that depends upon excitations from the occupied to the unoccupied space. We experimented with other localized orbital choices like that advocated in the divide−expand− consolidate (DEC) scheme,33,40 e.g., and found that the NLMOs were equally local in application. Once the orbitals are localized, the reference wave function is approximately the product of the buffered wave functions of each QM1 fragment, enabling the excitation amplitudes to be approximated by the

̂

|Ψ⟩ = eT |ϕ0⟩ ≈

̂i

∏ eT |ϕ0i⟩

(1)

i

And the correlation energy of the entire molecule is roughly the summation of that for each fragment: A B C ΔECC ≈ ΔECC + ΔECC + ΔECC + ···

(2)

Analogously, the density matrix for a QM2 unit is now decomposed into that for the QM1 units. Compared to some other localized orbitals, NLMO appeal to chemists since each of them tends to correspond to a core, a lone pair electron, or a chemical bond (Figure 1) as dictated by the particular prescription Weinhold offers. Therefore, within his perscription, it is easy to build the fragments to approximate familiar functional groups. As discussed above, the excitation amplitudes computed in terms of the NLMO will be transferable to a degree, as the amplitudes for the excitations inside a region of a molecule remain similar regardless of the modification of the surroundings since the electrons occupied on the orbitals inside a specific fragment will predominately be excited to the virtual orbitals in the same unit. The transferability has been demonstrated numerically in previous studies.28,29 Although the transferability of the NLMO minimizes the impact of the environment for each orbital, the effect of the environment cannot be eliminated. For a large molecule, the functional groups connected to the central fragment with chemical bonds are included in the QM2 buffer region, but the intramolecular interactions (the exta-fragment interactions) from the other functional groups are partly neglected. As the NLSCC is meant to treat correlation, the unperturbed reference is the global SCF for the whole system. So all the inter- and intraelectrostatic interactions common to the HFSCF are introduced at that level, leaving just electron correlation for the NLSCC. Nonetheless, the NLSCC correlation energies calculated by a semi-independent B

DOI: 10.1021/acs.jpca.8b07947 J. Phys. Chem. A XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry A

Figure 2. Basic procedure of NLSCC calculation. (a) Molecule with six groups divided into four fragments. (The dashed lines indicate the cutting point.) (b) QM1 and QM2 regions. (c) Final input geometries with the linked atoms. (d) Example of the dodecane molecule.

In previous studies, only the first index of the occupied orbitals was given special consideration. The second index of the occupied orbitals together with the indices of the virtual orbitals are simply included in the summation. However, pair indices i and j together provide much more information. As will be presented in the next section, it includes information about the interaction between the electrons in different occupied orbitals. Hence one can, in principle, calculate the energies sacrificed due to the fragmentation based on the coupled-cluster energy equations. In NLSCC, to accurately calculate the interaction energies for two functional groups that are not bonded, one always has the option to combine them into a larger fragment that is still small compared to the large molecule of interest. But to avoid building larger units, one can use perturbation theory to add the missing information. This also maintains the linear scaling of the original NLSCC. The new method will be applied to study the relative stability of the conformers of polypeptides. There are 27 conformers of the alanine tetramer (Ala4), and the energies are close to each other. According to a previous theoretical study, the energy difference between the most stable and the least stable conformers is smaller than 10 kcal/mol, and the relative energies between two conformers are usually less than 1 kcal/ mol.46−48 Therefore, the accurate computation of the correlation energies is critical for the accurate prediction of the stabilities. This manuscript will first introduce the theory of the new method and then demonstrate the performance of the method in conformational analysis.

approach inevitably differ from the exact coupled-cluster energies, and particularly in relative energies, there may be more severe errors. One such prominent example is the relative stability of conformers. Since the energies of conformers are close to each other, even a tiny error can switch the order.41−43 This effect is even more significant in those molecules that involve hydrogen bonds such as polypeptides.44 Therefore, to make further improvement of the accuracy so that the NLSCC method can be more widely applied, it is necessary to find an efficient method to calculate the intramolecular contributions sacrificed due to the fragmentation. These effects include additional electrostatic interactions, hydrogen bonds, and weaker interactions like those due to dispersion. Though such dispersion values might be individually small, there are many such contributions in a large molecule, often causing an essential net effect that is critical for “predictive” methods. However, this is not a paper focusing on dispersion. Instead, it simply wants to correct for the residual effects lost due to any kind of fragmentation beyond our QM2 (buffered) model. This is the “extra-fragmentation” (EF) error, and in particular, it includes H-bonds. Nonetheless, as dispersion coefficients are system dependent and not constant, they should be evaluated in a ab initio manner, and our approach also does that to a degree.45 This manuscript will present a method that could effectively calculate the “extra-fragmentation” energy corrections to the NLSCC based on many-body perturbation theory (MBPT) by taking advantage of the localization of the NLMOs. In the coupled-cluster energy equation, the indices of both the occupied orbitals and virtual orbitals appear in pairs (throughout this manuscript, the indices i, j, and k represent the occupied orbitals, while a, b, and c are for virtual orbitals): ΔE =

1 4

2. THEORY The detailed transformation process from Hartree−Fock orbitals to NLMO has been presented by Weinhold and coworkers in a series of publications39,49−52 and will not be repeated here. However, we use our own program to generate

∑ ⟨ij ab⟩(tijab + tiat jb − tibt ja) ijab

(3) C

DOI: 10.1021/acs.jpca.8b07947 J. Phys. Chem. A XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry A

occupied orbital. And, in fact, the energy of each occupied orbital can be further decomposed. According to eq 4, the energy of each orbital i is, in fact, contributed further by each of the occupied orbitals j throughout the entire molecule. Therefore, the pair correlation energy (Δeij) can be defined as

them. In this section, we will start by slightly reviewing some basic principles of the NLSCC method. Equations 1 and 2 summarize the fundamental ideas of the NLSCC method, but they cannot be used directly except for those systems whose fragments are far away from each other and there are no bonding interactions. The starting point of the actual calculation is the energy of each NLMO, that is, the contribution of each orbital to the correlation energy of the entire unit, as proposed by Flocke and Bartlett:28,29 1 Δei = 4

Δeij =

∑ ⟨ij ab⟩(tijab + tiat jb − tibt ja)

ΔE(A+B) =

(4)

∑ Δei i

1 4



Δeij +

i ∈ B,j ∈ B

Δeij

i ∈ A,j ∈ B

Δeij (9)

i ∈ B,j ∈ A

Obviously, the first two terms on the right-hand side of eq 9 are the correlation energies of the two fragments:



Δeij = ΔE A

i ∈ A,j ∈ A



Δeij = ΔEB (10)

i ∈ B,j ∈ B

Therefore, the third and fourth terms on the right side of eq 9 represent the interaction between the two fragments. If the effect of another fragment Y to the correlation energy of fragment X is represented by ΔEX←Y, then ΔE A ← B =



Δeij

i ∈ A, j ∈ B

ΔEB ← A =

∑ i ∈ B, j ∈ A

Δeij (11)

If the two fragments are not connected through a chemical bond, then the energy computed through eq 11 will account for the small effects that arise from excitations from a localized orbital on one site to an unoccupied one on a second site. However, it is not practical to use this method to calculate the correction to the NLSCC as the input geometry will include both fragments and their buffer regions. Then the computational cost will increase significantly, which diminishes the scaling of the correlation energy calculation (it will remain linear when used for suitably large systems, however). One could attempt to remove the buffer regions in the input geometry, but the buffer regions are essential for the transferability of the NLMO. To reduce the computational cost of a larger input geometry, one of the most straightforward approaches is to use second-order perturbation theory since it is the starting point of the coupled-cluster iterations:

∑ ⟨ij ab⟩(tijab + tiat jb − tibt ja) (6)

The NLSCC correlation energy for the entire molecule is thus obtained by summing over the orbital energies in each QM1 region: jij QM2 zyz ΔE ≈ ∑ jjj ∑ Δei zzz jj zz QM1 k i ∈ QM1 {



+ (5)



Δeij +

i ∈ A,j ∈ A

QM2 jab



=

The correlation energy of each QM1 region of the molecule is the sum of the energies tied to all the occupied orbitals inside that fragment. Similar to the excitation amplitudes and density matrices, these “orbital identified correlation energies” (OICE, not SCFlike orbital energies) are also expected to be largely transferable; that is, their energies should have a low dependence on its environment. For example, for a molecule with six groups A, B, C, D, E, and F (Figure 2a), properly terminated to avoid unoccupied orbital issues, the energies of the occupied orbitals in group A will not change too much if the other five groups are removed. However, there is still some possibility for them to be excited to neighbor regions, and the corresponding contribution cannot be neglected. Therefore, to ensure accuracy, the neighbors of each fragment are also included in the calculation as the buffer regions, and together with the fragment itself, they generate the QM2. For a large molecule, the buffer regions are usually the functional groups bonded directly to the fragment (Figure 2b). To avoid dealing with radicals, a linked hydrogen atom is added to the atoms where a bond is broken (Figure 2c). For each QM2, we can first calculate the energies associated with each occupied orbital: Δei ≈

∑ Δeij ij

Clearly, if there are no approximations made, the summation of the energies for all the occupied orbitals will be the total correlation energy of the molecule: ΔE =

(8)

ab

For a system with two fragments A and B, the total correlation energy can be written as

∑ ⟨ij ab⟩(tijab + tiat jb − tibt ja) jab

1 4

t ijab(1) = (7)

⟨ij ab⟩ εi + εj − εa − εb

(12)

Analogous to eq 8, we can define the second-order pair correlation energy: 1 Δeij(2) = ∑ ⟨ij ab⟩t ijab(1) 4 ab (13)

This strategy has been used to calculate the energies of polyglycine and met-enkephalin, and the correlation energies calculated by the NLSCC method are very close to the coupled-cluster calculation of the entire molecule.5 To further increase the accuracy, it is necessary to find an efficient method to calculate the interaction energies between the two fragments that are not connected to each other. This interaction can be evaluated through the pair correlation energy. Equation 5 indicates that the total correlation energy of the system can be decomposed into the contribution from each

If there is no approximation made, the summation of the second-order pair correlation energies across the entire set of the occupied orbital pairs is the exact MBPT(2) energy of the molecule. In the NLSCC calculation, one has to reduce the size of the fragments inside the buffer regions. Similar to what is discussed previously, the two groups that the interaction D

DOI: 10.1021/acs.jpca.8b07947 J. Phys. Chem. A XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry A

Figure 3. (a) QM1 (circled) and different QM2 regions (rectangles) defined in glycine tetramer. (b) Relative CCSD correlation energies of QM1 and the computing times for the six QM2 regions.

NLSCC. So even though the basis is modest, the comparison is well-defined, to suggest general conclusions as to the accuracy of such a perturbative correction. The geometries of the Ala4 are taken from reference Ala4 are taken from ref 46. When the linked hydrogen atom is added, the bond length between hydrogen and other atoms is fixed to 1.0 Å. 3.1. Buffer Region and Transferability of NLMO. The transferability of NLMO has been previously studied systematically. In this work, we will first further investigate the effect of the size of the buffer region on the transferability as well as the computing time by using the example of the glycine tetramer as shown in Figure 3a. In this example, we calculate the CCSD correlation energy of the functional group circled in the figure. This region is the QM1. The six rectangles are the possible QM2 regions. It is clear that the first QM2 has no buffer region, and the sixth QM2 is the entire molecule without approximation. By choosing different sizes of buffer regions, one calculated the correlation energy of the QM1 unit through eq 7. Taking the correlation energy computed with the entire molecule as the buffer region (the sixth QM2) as the reference, the relative errors correspond to the reference and the computing times are plotted in Figure 3b. As expected, the larger the buffer region (QM2), the more accurate a correlation energy will be obtained, but at the same time, the more computing time is required. If there is no buffer region, the result is not reliable since the error is significantly larger than those with buffer regions included. In this example,

energy will be computed can be defined as a QM1 (such as A and B in the examples above), and the functional groups close to the QM1 will be the QM2 regions. Therefore, the extrafragmentation (EF) contribution from fragment B to fragment A is A←B ΔE EF =

1 4

QM2

QM2

∑ ∑ ⟨ij ab⟩tijab(1) i ∈ A,j ∈ B

ab

(14)

By summing over all the correlated interaction energies for each fragment pair that are neglected in eq 7, one finally gets the total correction. Of course, the correction calculated by this approach is not as accurate as the direct coupled-cluster method. But since this correction is a small component of the total correlation energy, it should still be effective considering the low computational cost. The perturbation improved correlation energy (ΔEPNLSCC) is thus ΔE PNLCC = ΔE NLSCC + ΔE EF

(15)

3. RESULTS AND DISCUSSION All the NLSCC calculations are done on a locally modified version of the ACES II program,53 and the single-point energy calculations of the 27 conformers of Ala4 are completed with the ACES III program.54 The NWChem 6.8 program is used for the density functional theory calculations.55 The cc-pVDZ basis set is used for all the calculations, full versus fragmented E

DOI: 10.1021/acs.jpca.8b07947 J. Phys. Chem. A XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry A

Figure 4. QM1 (circled in red) and QM2 (circled in green) defined for the alanine conformer.

Table 1. CCSD and NLSCCSD Correlation Energies and Relative Energies, the Percentage of NLSCCSD Correlation Energies Recovered, and the Error of the Relative Energies correlation energies (au)

relative energies (kcal/mol)

conformer

CCSD

NLSCCSD

percentage

CCSD

NLSCCSD

error

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27

−3.20442 −3.20584 −3.21202 −3.20615 −3.20590 −3.20970 −3.21191 −3.21160 −3.21053 −3.21155 −3.21233 −3.21251 −3.20776 −3.20793 −3.21301 −3.21027 −3.21277 −3.20955 −3.20973 −3.20920 −3.21049 −3.21121 −3.21097 −3.21086 −3.20968 −3.21018 −3.20961

−3.18854 −3.18798 −3.18880 −3.18904 −3.18780 −3.18777 −3.18819 −3.18895 −3.18676 −3.18731 −3.18735 −3.18534 −3.18813 −3.18558 −3.18742 −3.18693 −3.18733 −3.18814 −3.18735 −3.18573 −3.18854 −3.18905 −3.18717 −3.18640 −3.18583 −3.18554 −3.18536

99.50% 99.44% 99.28% 99.47% 99.44% 99.32% 99.26% 99.29% 99.26% 99.25% 99.22% 99.15% 99.39% 99.30% 99.20% 99.27% 99.21% 99.33% 99.30% 99.27% 99.32% 99.31% 99.26% 99.24% 99.26% 99.23% 99.24%

5.71 5.23 0.17 6.57 6.83 2.59 6.29 4.54 9.04 7.78 0.00 1.53 4.01 5.72 1.88 4.60 2.59 1.64 4.16 2.67 2.27 5.58 5.95 3.88 3.44 2.26 5.52

0.00 0.76 −0.94 1.63 2.51 0.67 5.50 3.08 8.28 7.32 0.00 2.90 0.65 4.07 2.26 3.57 2.88 −0.60 2.53 1.72 0.37 3.81 5.21 3.55 2.73 2.04 5.07

−5.71 −4.47 −1.11 −4.94 −4.32 −1.91 −0.79 −1.46 −0.76 −0.46 0.00 1.38 −3.35 −1.65 0.38 −1.03 0.29 −2.24 −1.63 −0.95 −1.90 −1.77 −0.74 −0.33 −0.71 −0.21 −0.46

the relative error of the QM1 without a buffer region is 2.8 kcal/mol. If we calculate each QM1 in the molecule in a similar way, the cumulative error can be a huge number. More importantly, if a small amount of the buffer region is added such as the second and third QM2, the computing time required increases slightly, but the accuracy can be significantly improved. This observation further demonstrates that the

buffer region is required to ensure the transferability of the NLMO. It can also be observed from Figure 3 that from the second to the third QM, the relative error just drops 0.11 kcal/mol (from 0.34 to 0.23 kcal/mol), but the computing time multiplies (from 0.5 to 1.0 h). If the size of the QM2 further increases, the relative error drops to 0.02 kcal/mol, which F

DOI: 10.1021/acs.jpca.8b07947 J. Phys. Chem. A XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry A

Figure 5. (a) Geometries of the Ala4 conformers 7, 9, and 12 with the index of the seven fragments labeled. The meaning of the dashed lines is the same as in Figure 2. (b) Interaction energies (kcal/mol) based on the full MPBT(2)/cc-pVDZ calculation. The cells with color have relatively large energies (the darker the color, the stronger the interaction).

3.3. Analysis of Extra-Fragmentation Interactions. The large errors of the relative energies in Table 1 is clearly due to neglecting the interactions between those functional groups that are not directly bonded during the fragmentation process. In the polypeptides, the strongest intramolecular interaction is the hydrogen bond. On the basis of the structure of Ala4 in Figure 4, there should be more or fewer hydrogen bond interactions for the group pairs 1↔3, 3↔5, and 5↔7. However, the strength of these interactions may vary considerably for different conformers. And more importantly, if the molecule is not linear, there may be additional hydrogen bond interactions for the group pair 1↔5, 3↔7, or 1↔7. To make the further analysis of the effect of the weak interactions on the correlation energies, we select three conformers (7, 9, and 12) that have special features for hydrogen bonds and their geometries with the corresponding groups that are fragmented are shown in Figure 5a. Conformer 9 just has the three basic hydrogen bond interaction pairs, that is, 1↔3, 3↔5, and 5↔7. In conformer 7, there is an additional hydrogen bond interaction between the group 1 and 7. And there are two relatively strong interaction pairs in conformer 12:1↔5 and 3↔7. There is also some interaction for the pair 1↔7 in conformer 12 although not as strong as in conformer 7. For illustrative purposes, the interaction energies of each group pair are computed at the second-order level without fragmentation (that is, the QM2 is the entire molecule in eq 14) and Figure 5b shows the tabulated results of the corresponding energies. Each fragment has the strongest interaction with the groups connected directly to it, as expected, and this contribution has already been included in the NLSCC energy. Among the groups that are not connected to each other, the pairs 1↔3, 3↔5, and 5↔ 7 have much stronger interaction energies, which can be noticed in Figure 5b, but the strength varies on the basis of the geometry. For example, the interaction energies between group 5 and 7 in conformer 9 is 1.55 kcal/mol, but it is only 0.82 kcal/mol in conformer 7. The relatively strong interaction for the pairs 1↔ 5, 3↔7, and 1↔7 is not available in all the conformers. In conformer 9, which is almost a linear structure, the interaction

could almost reproduce the exact energy without approximation. But at the same time, the computing time increases to about 5.5 h which violates linear scaling. In this work, to accommodate both the accuracy and the scaling, one more functional group is added on top of the QM1 as the buffer region. 3.2. NLSCC Calculation of the Energies of Ala4 Conformers. According to the functional group of alanine, each Ala4 is divided into seven fragments (QM1). For each QM1, the functional groups connected to it with chemical bonds are defined as the QM2. Figure 4 presents the QM1 and QM2 of one of the conformers. The natural linear scaled CCSD (NLSCCSD) energies of each conformer are calculated using eq 7. According to the CCSD calculation for the entire molecule, the 11th conformer has the lowest total energy, and thus we take it as the reference to calculate the relative energies of the other conformers. The CCSD correlation energies, NLSCCSD correlation energies, the percentage of correlation energies recovered for NLSCCSD compared to CCSD, the relative energies with the 11th conformer as the reference, and the error of the relative energies of NLSCCSD compared to CCSD are summarized in Table 1. According to Table 1, by comparing the absolute value of the coupled-cluster correlation energies, the approximate NLSCCSD results are close enough to the CCSD energies without fragmentation. It is clear that the NLSCCSD method can recover over 99% of the CCSD correlation energies for some well-localized systems. So from this viewpoint, the results calculated by NLSCCSD are already very good approximations. However, huge errors are observed for the relative energies. For NLSCCSD, the third conformer becomes the most stable since its total energy is 0.94 kcal/mol lower than the 11th conformer. Compared to CCSD results, the largest error in the NLSCCSD results is over 5 kcal/mol. Therefore, although the NLSCCSD correlation energies are close to CCSD, they cannot be used to analyze the relative stabilities where ordering might be completely different. G

DOI: 10.1021/acs.jpca.8b07947 J. Phys. Chem. A XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry A

Figure 6. Fragments created to calculate the dispersion interaction.

for these three pairs is almost zero. Conformer 7 has a strong interaction between the group 1 and 7. In conformer 12, all three pairs have relatively strong interactions. Obviously, different conformers have very different weak interactions, and the cumulative effect can play an important role. If these interactions are neglected in the correlation energy, the large error can easily switch the ordering, as has been shown in Table 1. 3.4. Conformational Analysis with Perturbation Improved NLSCC Method. Based on the analysis in section 3.3, it is necessary to create new fragments that account for the weak interactions especially for the hydrogen bonds between those groups that are not directly connected. Since there are four peptide bonds in Ala4, one straightforward starting point for the fragmentation process is to define the four basic units so that each one includes one peptide bond. Since a strong interaction may exist between each pair of the units, there are six possible combinations. Therefore, six new fragments are created by defining each combination of the unit pair as QM1 and the functional groups connected to them as QM2 (Figure 6). For each fragment, the MBPT(2) calculation is done on the basis of the NLMO. And then the interaction energies between those functional groups that are not included in the calculation in section 3.2 are computed through eq 14. The final PNLCCSD energies are obtained using eq 15. The intramolecular correction, new correlation energies, and the relative energies are summarized in Table 2. Comparing the data in Table 2 to those in Table 1, it is clear that the correlation energies computed by PNLCCSD are closer to the reference CCSD. More importantly, the errors of the relative energies are reduced significantly. Instead of an error as large as over 5 kcal/mol, the largest error of PNLCCSD is just 1.20 kcal/mol, and 24 conformers out of the 27 have an absolute error smaller than 1 kcal/mol. The difference between the results calculated by NLSCCSD and PNLCCSD is even clearer in Figure 7. The statistics of the results computed by NLSCCSD and PNLCCSD are summarized in Table 3. Obviously, although the intramolecular interaction energies are calculated by

Table 2. Extra-Fragmentation Corrections to NLSCCSD Correlation Energies (au), PNLCCSD Correlation Energies (au), Relative Energies (kcal/mol), and the Error of PNLCCSD Compared to CCSD Correlation Energies (kcal/ mol) conformer

NLSCCSD

perturbation correction

PNLCCSD

relative energies

error

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27

−3.18854 −3.18798 −3.18880 −3.18904 −3.18780 −3.18777 −3.18819 −3.18895 −3.18676 −3.18731 −3.18735 −3.18534 −3.18813 −3.18558 −3.18742 −3.18693 −3.18733 −3.18814 −3.18735 −3.18573 −3.18854 −3.18905 −3.18717 −3.18640 −3.18583 −3.18554 −3.18536

−0.01070 −0.01210 −0.01815 −0.01176 −0.01259 −0.01587 −0.01614 −0.01818 −0.01742 −0.01749 −0.01886 −0.01915 −0.01392 −0.01468 −0.01882 −0.01672 −0.01931 −0.01605 −0.01659 −0.01637 −0.01684 −0.01682 −0.01697 −0.01650 −0.01649 −0.01748 −0.01723

−3.19924 −3.20008 −3.20695 −3.20080 −3.20040 −3.20363 −3.20433 −3.20713 −3.20418 −3.20481 −3.20621 −3.20449 −3.20204 −3.20026 −3.20624 −3.20365 −3.20664 −3.20419 −3.20394 −3.20210 −3.20538 −3.20588 −3.20414 −3.20290 −3.20232 −3.20302 −3.20258

5.12 5.01 −0.49 6.08 6.44 2.55 7.21 3.51 9.18 8.18 0.00 2.72 3.76 6.70 2.29 4.91 2.59 1.17 3.95 3.28 1.64 5.09 6.40 5.04 4.22 2.91 6.10

−0.59 −0.22 −0.66 −0.49 −0.38 −0.04 0.92 −1.03 0.14 0.39 0.00 1.20 −0.25 0.97 0.41 0.31 0.00 −0.48 −0.21 0.61 −0.63 −0.49 0.45 1.15 0.77 0.66 0.57

second-order perturbation theory, it still significantly improves the accuracy. One has to admit that although the PNLCC energies are more accurate than NLSCC, it still could not completely reproduce the CCSD ordering of the conformers. However, as H

DOI: 10.1021/acs.jpca.8b07947 J. Phys. Chem. A XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry A

systems, the new method should have a satisfactory performance leading to a reliable analysis. As one example of a density functional method providing results for this problem, Figure 8 shows a comparison of CAMQTP01 to the ab initio CCSD results. This example works remarkably well, showing an error of 0.38 kcal/mol. The intent of the QTP functionals is to reproduce CC results within an effective one-particle theory. Here that objective is realized. Note that CAM-QTP01 has no additional dispersion term added unlike most other functionals, indicating that dispersion is not a discriminate here for the small relative energies of these isomers.

4. CONCLUSIONS By maintaining the important features of the NLSCC method, it is further developed to make corrections for the interaction energies among those functional groups that are far apart using perturbation theory. By appropriately defining the QM1 and QM2 regions, the NLSCC method recovers over 99% of the coupled-cluster correlation energy computed without approximation. However, the functional groups that are not included in the bigger region (QM2) may also have significant contributions, and removing such a contribution may result in a large error for the relative energy between two molecules. To incorporate such contributions and at the same time not increase the scaling, perturbation theory is applied. The new approach significantly improves the accuracy of the NLSCC method and reduces the error of the relative energies to chemical accuracy. A few examples of relative energies of conformers are shown, where a perturbation corrected augmentation of NLSCC, PNLSCC, reduces the error relative to a supermolecule CCSD computation of the whole system (facilitated by the massively parallel ACES III program) to around 1 kcal/mol.

Figure 7. Comparison of the relative energies of the 27 conformers calculated by NLSCCSD and PNLCCSD to the CCSD data.

Table 3. Mean Absolute Error, Maximum Error, RMS, and Standard Deviation of the Relative Energies of 27 Conformers (kcal/mol)

NLSCCSD PNLCCSD

mean absolute error

maximum absolute error

RMS

standard deviation

1.67 0.52

5.71 1.20

2.26 0.61

1.71 0.61



an approximation method that attempts to achieve linear scaling, it should not be expected to have exactly the same accuracy of the original method. As has been demonstrated in Table 3, the energies of the 27 conformers calculated by PNLCCSD have a mean absolute error of just 0.52 kcal/mol, and even the maximum error is only 1.20 kcal/mol, close to the limit of chemical accuracy. Therefore, for most of the chemical

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]fl.edu. ORCID

Yifan Jin: 0000-0001-8894-7693

Figure 8. Comparison of the CAM-QTP01 relative energies to the CCSD data. I

DOI: 10.1021/acs.jpca.8b07947 J. Phys. Chem. A XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry A Notes

(19) Raghavachari, K.; Saha, A. Accurate Composite and FragmentBased Quantum Chemical Models for Large Molecules. Chem. Rev. 2015, 115, 5643−5677. (20) Laidig, W. D.; Purvis, G. D.; Bartlett, R. J. Can Simple Localized Bond Orbitals and Coupled Cluster Methods Predict Reliable Molecular Energies? J. Phys. Chem. 1985, 89, 2161−2171. (21) Saebø, S.; Pulay, P. Local Configuration-Interaction - An Efficient Approach for Larger Molecules. Chem. Phys. Lett. 1985, 113, 13−18. (22) Förner, W.; Ladik, J.; Otto, P.; Č ížek, J. Coupled-Cluster Studies 0.2. The Role of Localization in Correlation Calculations on Extended Systems. Chem. Phys. 1985, 97, 251−262. (23) Saebø, S.; Pulay, P. The Local Correlation Treatment. II. Implementation and Tests. J. Chem. Phys. 1988, 88, 1884−1890. (24) Saebø, S.; Pulay, P. Local Treatment of Electron Correlation. Annu. Rev. Phys. Chem. 1993, 44, 213−236. (25) Hetzer, G.; Pulay, P.; Werner, H.-J. Multipole Approximation of Distant Pair Energies in Local MP2 Calculations. Chem. Phys. Lett. 1998, 290, 143−149. (26) Hampel, C.; Werner, H. Local Treatment of Electron Correlation in Coupled Cluster Theory. J. Chem. Phys. 1996, 104, 6286−6297. (27) Schütz, M.; Werner, H.-J. Low-Order Scaling Local Electron Correlation Methods. IV. Linear Scaling Local Coupled-Cluster (LCCSD). J. Chem. Phys. 2001, 114, 661−681. (28) Flocke, N.; Bartlett, R. J. Localized Correlation Treatment Using Natural Bond Orbitals. Chem. Phys. Lett. 2003, 367, 80−89. (29) Flocke, N.; Bartlett, R. J. A Natural Linear Scaling CoupledCluster Method. J. Chem. Phys. 2004, 121, 10935−10944. (30) Neese, F.; Wennmohs, F.; Hansen, A. Efficient and Accurate Local Approximations to Coupled-Electron Pair Approaches: An Attempt to Revive the Pair Natural Orbital Method. J. Chem. Phys. 2009, 130, 114108. (31) Li, W.; Piecuch, P.; Gour, J. R.; Li, S. Local Correlation Calculations Using Standard and Renormalized Coupled-Cluster Approaches. J. Chem. Phys. 2009, 131, 114109. (32) Li, W.; Piecuch, P. Multilevel Extension of the Cluster-inMolecule Local Correlation Methodology: Merging Coupled-Cluster and Møller-Plesset Perturbation Theories. J. Phys. Chem. A 2010, 114, 6721−6727. (33) Ziólkowski, M.; Jansík, B.; Kjærgaard, T.; Jørgensen, P. Linear Scaling Coupled Cluster Method With Correlation Energy Based Error Control. J. Chem. Phys. 2010, 133, No. 014107. (34) Riplinger, C.; Neese, F. An Efficient and Near Linear Scaling Pair Natural Orbital Based Local Coupled Cluster Method. J. Chem. Phys. 2013, 138, No. 034106. (35) Saitow, M.; Becker, U.; Riplinger, C.; Valeev, E. F.; Neese, F. A New Near-Linear Scaling, Efficient and Accurate, Open-Shell Domain-Based Local Pair Natural Orbital Coupled Cluster Singles and Doubles Theory. J. Chem. Phys. 2017, 146, 164105. (36) Gordon, M. S.; England, W. Localized charge distributions. III. Transferability and trends of carbon-hydrogen moments and energies in acyclic hydrocarbons. J. Am. Chem. Soc. 1972, 94, 5168−5178. (37) England, W.; Gordon, M. S.; Ruedenberg, K. Localized Charge Distributions. Theor. Chem. Acc. 1975, 37, 177−216. (38) Khaliullin, R. Z.; Cobar, E. A.; Lochan, R. C.; Bell, A. T.; HeadGordon, M. Unravelling the Origin of Intermolecular Interactions Using Absolutely Localized Molecular Orbitals. J. Phys. Chem. A 2007, 111, 8753−8765. (39) Reed, A. E.; Weinhold, F. Natural Localized Molecular Orbitals. J. Chem. Phys. 1985, 83, 1736−1740. (40) Kristensen, K.; Ziółkowski, M.; Jansík, B.; Kjærgaard, T.; Jørgensen, P. A Locality Analysis of the Divide-Expand-Consolidate Coupled Cluster Amplitude Equations. J. Chem. Theory Comput. 2011, 7, 1677−1694. (41) Molt, R. W.; Watson, T.; Lotrich, V. F.; Bartlett, R. J. RDX Geometries, Excited States, and Revised Energy Ordering of Conformers via MP2 and CCSD(T) Methodologies: Insights into Decomposition Mechanism. J. Phys. Chem. A 2011, 115, 884−890.

The authors declare no competing financial interest.



ACKNOWLEDGMENTS This work was supported by the U.S. Army Research Office under grant No. W911NF-16-1-0260.



REFERENCES

(1) Bartlett, R. J. Many-Body Perturbation Theory and Coupled Cluster Theory for Electron Correlation in Molecules. Annu. Rev. Phys. Chem. 1981, 32, 359−401. (2) Purvis, G. D.; Bartlett, R. J. A Full Coupled-Cluster Singles and Doubles Model: The Inclusion of Disconnected Triples. J. Chem. Phys. 1982, 76, 1910−1918. (3) Bartlett, R. J.; Musial, M. Coupled-Cluster Theory in Quantum Chemistry. Rev. Mod. Phys. 2007, 79, 291−352. (4) Bartlett, R. J. Coupled-Cluster Theory and its Equation-ofMotion Extensions. Wiley Interdiscip. Rev.-Comput. Mol. Sci. 2012, 2, 126−138. (5) Hughes, T. F.; Flocke, N.; Bartlett, R. J. Natural Linear-Scaled Coupled-Cluster Theory with Local Transferable Triple Excitations: Applications to Peptides. J. Phys. Chem. A 2008, 112, 5994−6003. (6) Verma, P.; Bartlett, R. J. Increasing the Applicability of Density Functional Theory. IV. Consequences of Ionization-potential Improved Exchange-correlation Potentials. J. Chem. Phys. 2014, 140, 18A534. (7) Jin, Y.; Bartlett, R. J. The QTP Family of Consistent Functionals and Potentials in Kohn-Sham Density Functional Theory. J. Chem. Phys. 2016, 145, No. 034107. (8) Taube, A. G.; Bartlett, R. J. Frozen Natural Orbital CoupledCluster Theory: Forces and Application to Decomposition of Nitroethane. J. Chem. Phys. 2008, 128, 164101. (9) Adamowicz, L.; Bartlett, R. J. Optimized Virtual Orbital Space for High-Level Correlated Calculations. J. Chem. Phys. 1987, 86, 6314−6324. (10) Adamowicz, L.; Bartlett, R. J.; Sadlej, A. J. Optimized Virtual Orbital Space for High-Level Correlated Calculations. II. Electric Properties. J. Chem. Phys. 1988, 88, 5749−5758. (11) Urban, M.; Noga, J.; Cole, S. J.; Bartlett, R. J. Towards a Full CCSDT Model for Electron Correlation. J. Chem. Phys. 1985, 83, 4041−4046. (12) Meyer, W. PNO-CI Studies of Electron Correlation Effects. I. Configuration Expansion by Means of Nonorthogonal Orbitals, and Application to the Ground State and Ionized States of Methane. J. Chem. Phys. 1973, 58, 1017−1035. (13) Ahlrichs, R.; Lischka, H.; Staemmler, V.; Kutzelnigg, W. PNOCI (Pair Natural Orbital Configuration Interaction) and CEPA-PNO (Coupled Electron Pair Approximation With Pair Natural Orbitals) Calculations of Molecular Systems. I. Outline of the Method for Closed-Shell States. J. Chem. Phys. 1975, 62, 1225−1234. (14) Neese, F.; Wennmohs, F.; Hansen, A. Efficient and Accurate Local Approximations to Coupled-Electron Pair Approaches: An Attempt to Revive the Pair Natural Orbital Method. J. Chem. Phys. 2009, 130, 114108. (15) Neese, F.; Hansen, A.; Liakos, D. G. Efficient and Accurate Approximations to the Local Coupled Cluster Singles Doubles Method Using a Truncated Pair Natural Orbital Basis. J. Chem. Phys. 2009, 131, No. 064103. (16) Gordon, M. S.; Fedorov, D. G.; Pruitt, S. R.; Slipchenko, L. V. Fragmentation Methods: A Route to Accurate Calculations on Large Systems. Chem. Rev. 2012, 112, 632−672. (17) Ramabhadran, R. O.; Raghavachari, K. The Successful Merger of Theoretical Thermochemistry with Fragment-Based Methods in Quantum Chemistry. Acc. Chem. Res. 2014, 47, 3596−3604. (18) Yang, J.; Hu, W.; Usvyat, D.; Matthews, D.; Schütz, M.; Chan, G. K.-L. Ab Initio Determination of the Crystalline Benzene Lattice Energy to Sub-Kilojoule/Mole Accuracy. Science 2014, 345, 640−643. J

DOI: 10.1021/acs.jpca.8b07947 J. Phys. Chem. A XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry A (42) Martin, J. M. L. What Can We Learn about Dispersion from the Conformer Surface of n-Pentane? J. Phys. Chem. A 2013, 117, 3118− 3132. (43) Fogueri, U. R.; Kozuch, S.; Karton, A.; Martin, J. M. The Melatonin Conformer Space: Benchmark and Assessment of Wave Function and DFT Methods for a Paradigmatic Biological and Pharmacological Molecule. J. Phys. Chem. A 2013, 117, 2269−2277. (44) Beachy, M. D.; Chasman, D.; Murphy, R. B.; Halgren, T. A.; Friesner, R. A. Accurate ab Initio Quantum Chemical Determination of the Relative Energetics of Peptide Conformations and Assessment of Empirical Force Fields. J. Am. Chem. Soc. 1997, 119, 5908−5920. (45) Hughes, T. F.; Bartlett, R. J. Transferability in the Natural Linear-scaled Coupledcluster Effective Hamiltonian Approach: Applications to Dynamic Polarizabilities and Dispersion Coefficients. J. Chem. Phys. 2008, 129, No. 054105. (46) DiStasio, R. A.; Jung, Y.; Head-Gordon, M. A Resolution-ofThe-Identity Implementation of the Local Triatomics-In-Molecules Model for Second-Order Møller-Plesset Perturbation Theory with Application to Alanine Tetrapeptide Conformational Energies. J. Chem. Theory Comput. 2005, 1, 862−876. (47) Distasio, R. A.; Steele, R. P.; Rhee, Y. M.; Shao, Y.; HeadGordon, M. An Improved Algorithm for Analytical Gradient Evaluation in Resolution-of-the-Identity Second-Order Møller-Plesset Perturbation Theory: Application to Alanine Tetrapeptide Conformational Analysis. J. Comput. Chem. 2007, 28, 839−856. (48) Tkatchenko, A.; Rossi, M.; Blum, V.; Ireta, J.; Scheffer, M. Unraveling the Stability of Polypeptide Helices: Critical Role of van der Waals Interactions. Phys. Rev. Lett. 2011, 106, 118102. (49) Foster, J. P.; Weinhold, F. Natural Hybrid Orbitals. J. Am. Chem. Soc. 1980, 102, 7211−7218. (50) Reed, A. E.; Weinhold, F. Natural Bond Orbital Analysis of Near-Hartree-Fock Water Dimer. J. Chem. Phys. 1983, 78, 4066− 4073. (51) Reed, A. E.; Weinstock, R. B.; Weinhold, F. Natural Population Analysis. J. Chem. Phys. 1985, 83, 735−746. (52) Glendening, E. D.; Landis, C. R.; Weinhold, F. Natural Bond Orbital Methods. Wiley Interdiscip. Rev.-Comput. Mol. Sci. 2012, 2, 1− 42. (53) ACES II is a program product of the Quantum Theory Project, University of Florida. Authors: J. F. Stanton, J. Gauss, S. A. Perera, J. D. Watts, A. D. Yau, M. Nooijen, N. Oliphant, P. G. Szalay, W. J. Lauderdale, S. R. Gwaltney, S. Beck, A. Balkovât’a, D. E. Bernholdt, K. K. Baeck, P. Rozyczko, H. Sekino, C. Huber, J. Pittner, W. Cencek, D. Taylor, and R. J. Bartlett. Integral packages included are VMOL (J. Almlöf and P. R. Taylor); VPROPS (P. Taylor); ABACUS (T. Helgaker, H. J. Aa. Jensen, P. Jørgensen, J. Olsen, and P. R. Taylor); HONDO/GAMESS (M. W. Schmidt, K. K. Baldridge, J. A. Boatz, S. T. Elbert, M. S. Gordon, J. J. Jensen, S. Koseki, N. Matsunaga, K. A. Nguyen, S. Su, T. L. Windus, M. Dupuis, J. A. Montgomery). (54) Lotrich, V.; Flocke, N.; Ponton, M.; Yau, A. D.; Perera, A.; Deumens, E.; Bartlett, R. J. Parallel Implementation of Electronic Structure Energy, Gradient and Hessian Calculations. J. Chem. Phys. 2008, 128, 194104. (55) Valiev, M.; Bylaska, E.; Govind, N.; Kowalski, K.; Straatsma, T.; Dam, H. V.; Wang, D.; Nieplocha, J.; Apra, E.; Windus, T.; et al. NWChem: A Comprehensive and Scalable Open-Source Solution for Large Scale Molecular Simulations. Comput. Phys. Commun. 2010, 181, 1477−1489.

K

DOI: 10.1021/acs.jpca.8b07947 J. Phys. Chem. A XXXX, XXX, XXX−XXX