Benchmark ab Initio Conformational Energies for the Proteinogenic

Dec 11, 2015 - Further, we have also considered the following types of empirical ..... (ab initio) with total and relative energies at various levels ...
0 downloads 0 Views 1MB Size
Article pubs.acs.org/JCTC

Benchmark ab Initio Conformational Energies for the Proteinogenic Amino Acids through Explicitly Correlated Methods. Assessment of Density Functional Methods Manoj K. Kesharwani,† Amir Karton,‡ and Jan M. L. Martin*,† †

Department of Organic Chemistry, Weizmann Institute of Science, 76100 Reḥovot, Israel School of Chemistry and Biochemistry, The University of Western Australia, Perth, WA 6009, Australia



S Supporting Information *

ABSTRACT: The relative energies of the YMPJ conformer database of the 20 proteinogenic amino acids, with N- and C-termination, have been re-evaluated using explicitly correlated coupled cluster methods. Lowercost ab initio methods such as MP2-F12 and CCSD-F12b actually are outperformed by double-hybrid DFT functionals; in particular, the DSDPBEP86-NL double hybrid performs well enough to serve as a secondary standard. Among range-separated hybrids, ωB97X-V performs well, while B3LYP-D3BJ does surprisingly well among traditional DFT functionals. Treatment of dispersion is important for the DFT functionals; for the YMPJ set, D3BJ generally works as well as the NL nonlocal dispersion functional. Basis set sensitivity for DFT calculations on these conformers is weak enough that def2-TZVP is generally adequate. For conformer corrections to heats of formation, B3LYP-D3BJ and especially DSDPBEP86-D3BJ or DSD-PBEP86-NL are adequate for all but the most exacting applications. The revised geometries and energetics for the YMPJ database have been made available as Supporting Information and should be useful in the parametrization and validation of molecular mechanics force fields and other low-cost methods. The very recent dRPA75 method yields good performance, without resorting to an empirical dispersion correction, but is still outperformed by DSD-PBEP86-D3BJ and particularly DSD-PBEP86-NL. Core−valence corrections are comparable in importance to improvements beyond CCSD(T*)/cc-pVDZ-F12 in the valence treatment.



INTRODUCTION Understanding the structure−function relationship in polypeptides and proteins is a crucial step in the interpretation of biochemical processes. Also, the study of the conformational flexibility of these biologically active molecules is of the greatest importance. Many conformers of these flexible molecules are high in energy and are not populated at physiological temperatures; however, a subset of low-energy ones is populated. A description of each possible conformer is essential since the biologically active conformers do not always correspond to the energetic global minimum. In a 2004 report, analysis of 150 protein−ligand complexes revealed that the bound conformer is on average 4−5 kcal/mol higher in energy than the lowest energy conformer in solution.1 A flexible molecule may possess many distinct conformers in a narrow energy range; as a result, at room or physiological temperature, said molecule exists as a population mixture of conformers. Experimentally observed properties reflect a Boltzmannweighted average over the entire accessible conformational spacenot just a single, idealized minimum energy structure. However, standard ab initio thermochemistry refers to one particular conformation of a molecule, typically the global minimum on the potential energy surface. This causes nontrivial © 2015 American Chemical Society

contributions to the enthalpy and Gibbs energy corrections for nonrigid molecules such as linear alkanes2 and amino acids.3 Aside from the presence of multiple torsion axes, amino acids (and peptides more generally) also may exhibit intramolecular hydrogen bonding or other nondispersion interactions. All this results in a variety of conformers, ranging in the present work from five for proline to 61 for arginine. A correct description of protein conformation is essential in computational chemical biology generally and drug design in particular. Unfortunately, all present-day computational methods capable of handling proteins with hundreds, let alone thousands, of amino acids are highly approximate (such as molecular mechanics force fields), and their benchmarking and parametrization for these properties remains a necessity. A considerable number of experimental and theoretical studies have been reported for the conformational analysis of amino acids and small peptides: for some examples, the reader may consult refs 4−21. Recently, we have reported3 the accurate theoretical heats of formation for all 20 proteinogenic amino acids by means of Received: November 8, 2015 Published: December 11, 2015 444

DOI: 10.1021/acs.jctc.5b01066 J. Chem. Theory Comput. 2016, 12, 444−454

Article

Journal of Chemical Theory and Computation explicitly correlated high-level thermochemical procedures.22 A meaningful comparison with the experiment required the inclusion of thermal conformer corrections; in that study, conformer energies were gathered from already published calculations in the literature. These range from complete basis set CCSD(T) studies to low-level MP2 or DFT calculations.3 Very recently, Yuan, Mills, Popelier, and Jensen (YMPJ)23 have reported energy minima of all 20 natural amino acids. Unlike in ref 3 where free amino acids were considered, YMPJ placed termination groups on both the N and C termini of the amino acids, and their results thus are more relevant to amino acids inside an actual peptide chain. The conformer manifolds were obtained at the following low levels of theory: HF/ 6-31G(d,p), B3LYP/apc-1, and MP2/cc-pVDZ). In the present work, we report benchmark ab initio relative energies for the YMPJ conformer database obtained by means of explicitly correlated MP2 and coupled cluster methods, starting from reoptimized geometries with a larger basis set. These energetic data will then be used to evaluate the performance of various ab initio, double-hybrid DFT, and DFT methods.

alanine, CCSD(T)-level core−valence contributions were additionally obtained according to the same protocol as used in the W1-F12 method.22 In the DFT calculations, we employed the Weigend− Ahlrichs basis sets46 def2-QZVP, def2-TZVPP, def2-TZVP, and def2-SVP exclusively, using corresponding auxiliary basis sets47 for simultaneously fitting Coulomb and exchange for all DFT methods and associated RI-MP2 auxiliary basis sets48 for the double-hybrid calculations in ORCA. We also considered certain polarization consistent (pc-n) basis sets,49−51 such as pc-2 and aug-pc-1 (where aug indicates the addition of diffuse functions). The following DFT functionals were considered (grouped by rung on the Perdew52 “Jacob’s Ladder”): • on the second (GGA) rung, BP86,53 BLYP,53,54 and PBE55 • on the third (meta-GGA) rung, TPSS56 • on the fourth (occupied-orbital dependent) rung, B3LYP,54,57 B3PW91, PBE0,58 TPSS0,59,60 M06,61 and M06-2X61 • on the fifth (virtual-orbital dependent) rung, the double hybrid B2GP-PLYP,62 and the spin component scaled double hybrids DSD-PBEP86,63 DSD-PBEPBE, and DSD-PBEB9564 methods • We have also considered ωB97X,65 ωB97X-D,66 ωB97XD3,67 and ωB97X-V68 as range-separated fourth-rung functionals. At a referee’s request, we added in several additional range-separated hybrids: CAM-B3LYP,69 LC-ωPBE,70 M11,71 N12SX,72 and MN12SX.72 Further, we have also considered the following types of empirical dispersion corrections (for a review, see ref 73) for DFT energies: (a) Grimme’s 2006 version, denoted by the suffix “-D2” using Grimme’s expression



COMPUTATIONAL METHODS Most calculations were performed on the Faculty of Chemistry cluster at the Weizmann Institute of Science and the remainder at the National Computational Infrastructure (Canberra, Australia). All explicitly correlated coupled cluster calculations were carried out using MOLPRO 2012.1,24 while density functional calculations were carried out using either the ORCA or Gaussian 09 Rev. D.01 packages.25,26 Geometry optimizations were carried out using ORCA’s implementation of the RI-MP2 (resolution of the identity MP2) method,27,28 which uses an auxiliary fitting basis set to avoid treating the complete set of two-electron repulsion integrals. The cc-pVTZ correlation-consistent29−32 basis set was used for these optimizations. At the final geometries, single-point explicitly correlated CCSD(T)-F12b and RI-MP2-F12 calculations were carried out using cc-pVnZ-F12 (where n = D, T, Q) basis sets of Peterson et al.33 together with the associated auxiliary and complementary auxiliary (CABS) basis sets.34,35 The cc-pVnZ-F12 basis sets were specifically developed for explicitly correlated calculations. As recommended in ref 36, the geminal exponents were set to β = 0.9 for cc-pVDZ-F12 and β = 1.0 for cc-pVTZ-F12 as well as cc-pVQZ-F12 (only considered for glycine). The SCF component was improved through the “CABS correction”.37,38 For the (T) term, which does not benefit from the F12, we employed the Marchetti−Werner approximation,39,40 which tries to achieve convergence acceleration through scaling the (T) contribution by the MP2-F12/MP2 correlation energy ratio. This is indicated by the notation CCSD(T*)-F12b. As Marchetti−Werner scaling is not exactly size-consistent (if in practice approximately so), we also consider CCSD(Ts)F12b,41 in which the (T) contributions are multiplied by constant scaling factors of 1.1413 for cc-pVDZ-F12 and 1.0527 for cc-pVTZ-F12 (Table 3 in ref 41). Aside from CCSD-F12b and CCSD(T)-F12b results, we obtained MP2-F12 and SCS-MP2-F1242−44 data as byproducts of the explicitly correlated coupled cluster calculations. Core−valence correlation effects were considered as an additive correction at the RI-MP2 level with the cc-pwCVTZ and cc-pwCVQZ weighted core−valence basis sets.45 For glycine and

Nat − 1

Edisp = −s6

Nat

C6ij

∑ ∑ i=1

j=i+1

R ij6

fdmp (R ij)

(1)

in which the damping function is taken as −1 ⎡ ⎛ ⎛ R ij ⎞⎞⎤ ⎢ ⎥ ⎜ ⎟ fdmp (R ij) = 1 + exp⎜ −α⎜ − 1⎟⎟ ⎢⎣ ⎠⎠⎥⎦ ⎝ ⎝ sR R r

(2)

where s6 is a scaling factor that depends only of the functional used, Cij6 ≈ (Ci6Cj6)1/2 is the dispersion coefficient for the atom pair ij computed by using a geometric mean, Rr = RvdW,i + RvdW,j is the sum of the van der Waals radii of the two atoms in question, and specific numerical values for the atomic Lennard−Jones constants Ci6 and the van der Waals radii have been taken from ref 74. The length-scaling factor sR = 1.0 and hysteresis exponent α = 20.0 were set as in ref 75. (b) Grimme’s DFT-D376,77 version with Becke−Johnson damping, denoted by the suffix “-D3BJ” D3(BJ) Edisp = −∑ s6 i>j

C6, ij R ij6

6

+ [f (R r)]

+ s8

C8, ij R ij8

+ [f (R r)]8 (3)

in which f(Rr) = a1Rr + a2. This modified cutoff function does not fade to zero at short distance but to a small finite value. In addition to these molecular mechanics-like corrections, we have also considered the Vydrov−van Voorhis (VV10) “nonlocal” (NL) dispersion functional,78 in which an a 445

DOI: 10.1021/acs.jctc.5b01066 J. Chem. Theory Comput. 2016, 12, 444−454

Article

Journal of Chemical Theory and Computation posteriori correction is obtained from the electron density. The required short-range attenuation parameter, b, used for various DFT-NL calculations were obtained from ref 79 for the conventional DFT functionals and optimized in our group for the DSD double hybrids; the various values are listed in Table 1. These calculations were carried out using its implementation in ORCA.

Table 2. Number of Minima for Each of the 20 Natural Amino Acids at the MP2/cc-pVDZa and MP2/cc-pVTZ Levels alanine arginine asparagine aspartic acid cysteine glutamine glutamic acid glycine histidine isoleucine leucine lysine methionine phenylalanine proline serine threonine tryptophan tyrosine valine

Table 1. Short-Range Attenuation Parameter, b, in the Vydrov−Van Voorhis Nonlocal Dispersion Correction for DFT Functionals Considered in This Work DFT methods

b

DFT methods

b

BP86-NL BLYP-NL PBE-NL TPSS-NL B3LYP-NL B3PW91-NL

4.4 4.0 6.4 5.0 4.8 4.5

PBE0-NL TPSS0-NL B2GP-PLYP-NL DSD-PBEP86-NL DSD-PBEPBE-NL DSD-PBEB95-NL

6.9 5.5 9.9 12.8 9.6 12.5

The attenuation parameters for the DSD double hybrids were obtained by minimizing the RMSD in counterpoisecorrected def2-QZVP interaction energy calculations on the revised S66 weak interaction benchmark80 of Hobza and co-workers. The values for DSD-PBEP86-NL and B2GP-PLYPNL differ slightly from those obtained earlier81 from uncorrected calculations; even with the def2-QZVP basis set, the basis set superposition error in a double hybrid is significant enough that this makes a difference. The conformer contribution to the enthalpy function hcf298 ≡ HT=298 − E0 is calculated as2 hcf 298 =

∑i dixi exp( −xi) ∑i di exp( −xi)

where xi ≡

a

MP2/cc-pVDZ

MP2/cc-pVTZ

11 61 12 36 24 21 36 9 24 25 28 39 57 30 5 26 17 26 17 15

10 59 12 32 23 20 30 8 22 24 26 37 56 26 5 23 17 26 16 14

Reported by Yuan et al.23

energies is only 0.05 kcal/mol, the largest individual difference being just 0.09 kcal/mol. Small as these numbers are, we can slightly improve on them at a small additional computational cost. For the cc-pVTZ-F12 basis set, the choice between (T*) and (Ts) has no significant consequences, the RMSD between the two sets being just 0.007 kcal/mol. Completely neglecting scaling of the triples leads to RMSD = 0.018 kcal/mol between CCSD(Ts)-F12b and CCSD(T)-F12b or 0.024 kcal/mol between CCSD(T*)-F12b and CCSD(T)-F12b. Since MP2-F12/cc-pVTZ-F12 proved feasible for all systems, we considered an additivity approximation in which the HLC (high-level correction, defined as the CCSD(T) − MP2 difference) was obtained at the CCSD(T*)-F12b/cc-pVDZ-F12 level and added to MP2-F12/cc-pVTZ-F12 relative energies. (It was previously found41,82,83 that the combination of cc-pVDZ-F12 HLCs with larger-basis MP2-F12 energetics yields excellent results for noncovalent interaction data sets.) Again, for the triples, we have the choices of no scaling, uniform scaling, and Marchetti−Werner scaling, leading to HLC(T), HLC(Ts), and HLC(T*), respectively. For glycine, using the CCSD(Ts)/cc-pVTZ-F12 data as the reference, HLC(Ts) yields the lowest RMSD = 0.026 kcal/mol, compared to 0.036 kcal/mol for HLC(T*), and about 0.06 kcal/mol for any of the three purely cc-pVDZ-F12 choices. If we use the CCSD(T*)-F12b/cc-pVTZ-F12 or CCSD(T)-F12b/cc-pVTZF12 data as the reference instead, HLC(Ts) still has an edge of HLC(T*). Therefore, our final chosen level of theory for the reference is MP2-F12/cc-pVTZ-F12 + [CCSD(Ts)-F12b − MP2-F12]/cc-pVDZ-F12 Just for glycine, we were also able to carry out MP2-F12/ cc-pVQZ-F12 calculations. Extrapolation according to ref 36 from cc-pV{T,Q}Z energies yields an approximate MP2 basis set limit. The RMSD from that is just 0.009 kcal/mol for cc-pVQZ-F12, rising to 0.02 kcal/mol for cc-pVTZ-F12 and 0.054 kcal/mol for cc-pVDZ-F12. Our best estimate values for

Ei − E0 RT

where the index i runs over the conformers. di is the degeneracy of the state, R is the gas constant, and T is the temperature in Kelvin. In this approach, we have not considered the difference between the rovibrational partition functions of the different conformers.



RESULTS AND DISCUSSION As starting geometries for the optimization, we took the MP2/ cc-pVDZ optimized conformer structures reported by YMPJ23 for all 20 natural amino acids. The number of minima found for each of the amino acids is listed in Table 2. With the exception of lysine and phenylalanine, MP2/cc-pVTZ reoptimization yields the same global minimum conformer as reported by YMPJ at the MP2/cc-pVDZ level; for the two exceptions, the two competing conformers were close in energy to begin with. Reoptimization also reduced the number of local minima for most amino acids. In four cases, namely, asparagine, proline, threonine, and tryptophan, all of the original minimal are also found at the MP2/cc-pVTZ level. However, one minimum each vanished for alanine, cysteine, glutamine, glycine, isoleucine, methionine, tyrosine and valine and two each for arginine, histidine, leucine, and lysine. Three minima disappeared for serine, four each for aspartic acid and phenylalanine, and no less than six for glutamic acid. Choice of Reference Level. For the glycine conformers, we were able to carry out CCSD(T*)-F12b calculations with the cc-pVTZ-F12 as well as cc-pVDZ-F12 basis sets. The RMS difference between the cc-pVDZ-F12 and cc-pVTZ-F12 relative 446

DOI: 10.1021/acs.jctc.5b01066 J. Chem. Theory Comput. 2016, 12, 444−454

Article

Journal of Chemical Theory and Computation

procedure for three additional amino acids and compared against the MP2-F12/cc-pVTZ-F12 + [CCSD(Ts)-F12b − MP2-F12]/cc-pVDZ-F12 data: RMSDs were found to be 0.07 kcal/mol for phenylalanine, 0.06 kcal/mol for tyrosine, and 0.11 kcal/mol for arginine. Core−valence corrections amount to 0.06 kcal/mol AveRMS with unit weights, dropping to 0.03 kcal/mol with Boltzmann weighting (TB = 1000 K). This means that any attempts at improving the energetics by computing the HLC at a higher level are futile unless core−valence corrections are also included. What about lower-level ab initio electron correlation methods? Omitting the (T) entirely raises AveRMSD to 0.34 kcal/mol, which is actually worse than MP2-F12 (0.29 kcal/mol). Upgrading the basis set size to cc-pVTZ-F12 does not change the performance of MP2-F12, which suggests a faster convergence of MP2-F12 method to the CBS limit. This accords well with a previous report,12 which suggests MP2-F12/ cc-pVDZ-F12 in conjunction with a post-MP2 correction as a fairly cost-effective approach for the calculation of conformational energies of oligopeptides. SCS-MP2-F12, with AveRMSD = 0.38 kcal/mol, performs more poorly than either CCSD-F12b or MP2-F12; we considered other spincomponent scaled MP2 variants44 such as SCS(MI)MP286 and S2-MP287 to no avail. As expected, SCF performs worst in the group (AveRMSD = 1.73 kcal/mol). Boltzmann-weighted AveRMSDs exhibit the same trends but with comparatively smaller values. Analyzing the results for individual amino acids, one finds in most cases that MP2-F12 is superior to both CCSD-F12b and SCS-MP2-F12; for about half the amino acids (namely, alanine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, isoleucine, serine, threonine and valine), RMSDs are actually less than 0.25 kcal/mol. For leucine, lysine, phenylalanine, and tyrosine, both CCSD-F12b and SCS-MP2-F12 outperform MP2-F12. All these three methods are performing well for cysteine, isoleucine, and valine with RMSD values under 0.21 kcal/mol. However, for arginine, histidine, and methionine, all three approximate methods yields RMSDs of 0.4 kcal/mol or more. For the case of tryptophan, MP2-F12/cc-pVTZ-F12 shows 0.46 kcal/mol RMS deviation with reference to HLC corrected MP2-F12, Unsurprisingly, conventional ab initio methods such as CCSD(T), CCSD, MP2, and SCS-MP2 with the small cc-pVDZ basis set perform poorly, with RMSDs in excess of 0.6 kcal/mol. Increasing the basis set to def2-QZVP, improves the RMSD for SCS-MP2 to 0.20 kcal/mol and for ordinary MP2 to 0.53 kcal/mol. RMS core−valence corrections for individual amino acids range from just 0.01 kcal/mol for proline to 0.10 kcal/mol for aspartic acid and indeed 0.15 kcal/mol for arginine. Switching from unit to Boltzmann weights typically reduces these amounts by half or better, the largest value of 0.06 kcal/mol still being for arginine. Let us now discuss the DFT methods. AveRMSD values for conventional and double-hybrid DFT functionals, relative to our MP2-F12/cc-pVTZ-F12 + [CCSD(Ts)-F12b − MP2-F12]/ cc-pVDZ-F12+CV reference values, are given in Table 5, both equally weighted and Boltzmann-weighted (TB = 1000 K). All these calculations were performed using ORCA, except the ωB97X-D calculations, which were performed using Gaussian 09. The def2-QZVP basis set was used throughout for this table.

glycine can be taken as MP2-F12/cc-pV{T,Q}Z-F12 + HLC(Ts)/cc-pVTZ-F12. The chosen level of theory for the entire set, MP2-F12/cc-pVTZ-F12 + HLC(Ts)/cc-pVDZ-F12, deviates 0.05 kcal/mol RMS from that (Table 3). On the basis Table 3. RMSD (kcal/mol) for the Relative Energies of Glycine Conformers with Reference to MP2-F12/ cc-pV{T,Q}Z-F12 + [CCSD(Ts)-F12b − MP2-F12]/ cc-pVDZ-F12 Energies CCSD(T*)-F12b CCSD(Ts)-F12b CCSD(T)-F12b MP2-F12/cc-pVQZ-F12 + HLC(Ts)a/cc-pVnZ-F12 MP2-F12/cc-pVQZ-F12 + HLC(T*)b/cc-pVnZ-F12 MP2-F12/cc-pVTZ-F12 + HLC(Ts)a/cc-pVnZ-F12 MP2-F12/cc-pVTZ-F12 + HLC(T*)b/cc-pVnZ-F12

cc-pVDZ-F12

cc-pVTZ-F12

0.075 0.072 0.043 0.032

0.032 0.026 0.011 0.008

0.042

0.014

0.050 0.061

a HLC(Ts) = CCSD(Ts)-F12b − MP2-F12. bHLC(T*) = CCSD(T*)-F12b − MP2-F12.

of that, we would argue that the reference data in the remainder approach the nonrelativistic, Born−Oppenheimer, valence correlation limit to about 0.05 kcal/mol RMSD, and that 0.1 kcal/mol is a conservative estimate for their remaining uncertainty. Core−valence corrections for alanine were calculated both at the CCSD(T) level using the basis sets and extrapolation from W1-F12 theory and at the RI-MP2/cc-pwCVTZ and cc-pwCVQZ levels. The RMS core−valence contribution to the relative energies is 0.081 kcal/mol from W1-F12; the RMS difference between those core−valence contributions and those obtained at the RI-MP2/cc-pwCVQZ level is just 0.005 kcal/mol. We thus conclude that the costly CCSD(T) core−valence calculations can be dispensed with in favor of RI-MP2/cc-pwCVQZ. The RMS difference between RI-MP2/ cc-pwCVTZ and cc-pwCVQZ of just 0.008 kcal/mol suggests that the latter basis set is close enough to the basis set limit for the present application. We thus add those corrections in (denoted by the suffix “+CV”) to all reference values used in the remainder of the work. Relative Energies. RMSDs for relative energies of amino acid conformers relative to the reference data are listed in Table 4. Performance statistics in term of AveRMSD both using equal weights (corresponding to an infinite Boltzmann temperature) and at a semi-arbitrary finite Boltzmann temperature of 1000 K (see, e.g., refs 84 and 85) are also given in Table 4. The Boltzmann-weighted error statistics may be more informative, as they are less influenced by large errors for (bio)chemically less relevant high-energy regions. For the problem at hand here, the average RMS difference (AveRMSD) between those (valence correlation only) composite results and the raw CCSD(T*)-F12b/cc-pVDZF12 answers is just 0.03 kcal/mol with unit weights, dropping to 0.02 kcal/mol with Boltzmann weighting (TB = 1000 K). The resource demands for CCSD(T*)-F12b/cc-pVDZ-F12 on tryptophan turned out to be prohibitive. For want of a better alternative, we obtained HLCs for this amino acid at the CCSD(T)/cc-pVDZ level and added them in to MP2-F12/ cc-pVTZ-F12 energies. By way of validation, we repeated this 447

DOI: 10.1021/acs.jctc.5b01066 J. Chem. Theory Comput. 2016, 12, 444−454

448

0.86 0.70

0.68

0.27

0.42

0.72 0.40

0.46

MP2-F12/ cc-pVTZ-F12

0.24

0.24 0.75 0.18 0.22 0.21 0.24 0.25 0.16 0.43 0.15 0.27 0.35 0.40 0.44 0.30 0.17 0.17 0.39 0.20 0.29

MP2-F12/ cc-pVDZ-F12

0.00

0.00

0.19 MP2-F12/cc-pVTZ-F12 + HLCd

0.33 0.92 0.63 0.39 0.16 0.27 0.30 0.38 (0.38) 0.47 0.18 0.17 0.21 0.49 0.25 0.31 0.38 0.22 0.27 0.20 0.34

CCSD-F12b/ cc-pVDZ-F12

0.45

0.53

0.22 MP2/ def2-QZVP

0.21 0.72 0.82 0.47 0.17 0.55 0.38 0.28 (0.27) 0.49 0.16 0.15 0.17 0.47 0.26 0.48 0.44 0.43 0.22 0.17 0.37

SCS-MP2-F12/ cc-pVDZ-F12

0.17

0.20

0.19

0.25 0.76 0.18 0.19 0.23 0.20 0.28 0.15 0.45 0.15 0.26 0.33 0.43 0.47 0.30 0.17 0.15 0.41 0.19 0.29

0.31 1.33 0.28 0.56 0.42 0.40 0.38 0.27 0.48 0.19 0.37 0.59 0.34 0.46 0.28 0.33 0.29 0.38 0.29 0.42

MP2/ cc-pVTZ

2.03

2.62

0.30 HF/ def2-QZVP

MP2-F12/ cc-pVTZ-F12

1.10 SCS-MP2/ def2-QZVP

2.16 4.12 2.06 2.55 0.94 1.26 1.23 1.23 (1.25) 1.66 1.60 1.42 1.72 2.23 1.75 1.11 1.43 1.26 1.73 1.71 1.75

SCF/ cc-pVDZ-F12

0.56

0.60

0.03 MP2/ cc-pVTZ

0.03

0.07

0.08(0.04) 0.15(0.06) 0.03(0.02) 0.10(0.05) 0.04(0.02) 0.06(0.02) 0.07(0.03) 0.03(0.02) 0.06(0.02) 0.05(0.02) 0.04(0.02) 0.03(0.01) 0.10(0.06) 0.06(0.03) 0.01(0.01) 0.05(0.02) 0.06(0.03) 0.05(0.03) 0.05(0.02) 0.06(0.03)

core−valence MP2/cc-pwCVQZ

HLC(T*) = CCSD(T*)-F12b/cc-pVDZ-F12 − MP2-F12/cc-pVDZ-F12. bRMSD for relative energies calculated with cc-pVTZ-F12 basis set are given in parentheses. cAverage RMSD of 19 amino acids. d Here HLC = [CCSD(T) − MP2]/cc-pVDZ. eRMSD (kcal/mol) relative to MP2-F12/cc-pVTZ-F12 + [CCSD(T) − MP2]/cc-pVDZ. The last column shows the core−valence contributions calculated at the RI-MP2/cc-pwCVQZ level; values with 1000 K Boltzmann weights are given in parentheses. For alanine, the difference between CCSD(T)/cc-pwCVTZ and MP2/cc-pwCVTZ is 0.013 kcal/mol RMS.

a

0.77 tryptophane Boltzmann-weighted TB = 1000 K 0.42

0.01

0.03 SCS-MP2/ cc-pVDZ

0.02 0.01 0.04 0.04 0.02 0.03 0.03 0.02 0.03 0.01 0.01 0.02 0.03 0.02 0.02 0.03 0.03 0.02 0.02 0.02

MP2-F12/cc-pVTZF12+ HLC(T*)a

0.02 0.12 0.11 0.11 0.02 0.08 0.09 0.05 (0.04) 0.08 0.01 0.02 0.02 0.09 0.04 0.04 0.06 0.07 0.03 0.02 0.06

CCSD(T)-F12b/ cc-pVDZ-F12

MP2/ cc-pVDZ

CCSD(Ts)-F12b/ cc-pVDZ-F12

alanine 0.04 0.06 arginine 0.02 0.02 asparagine 0.03 0.04 aspartic acid 0.04 0.07 cysteine 0.02 0.03 glutamine 0.04 0.07 glutamic acid 0.04 0.07 0.03 (0.02) 0.04 (0.03) glycineb histidine 0.03 0.04 isoleucine 0.02 0.02 leucine 0.01 0.02 lysine 0.02 0.03 methionine 0.03 0.04 phenylalanine 0.04 0.06 proline 0.02 0.01 serine 0.02 0.03 threonine 0.03 0.06 tyrosine 0.03 0.05 valine 0.03 0.04 0.03 0.04 AveRMSDc Boltzmann-weighted TB = 1000 K AveRMSD 0.02 0.02 CCSD(T)/ CCSD/ cc-pVDZ cc-pVDZ

CCSD(T*)-F12b/ cc-pVDZ-F12

Table 4. RMSD (kcal/mol) for the Relative Energies of Amino Acids Conformers Relative to MP2-F12/cc-pVTZ-F12 + [CCSD(Ts)-F12b − MP2-F12]/cc-pVDZ-F12 Energies

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.5b01066 J. Chem. Theory Comput. 2016, 12, 444−454

Article

Journal of Chemical Theory and Computation

Table 5. AveRMSD (kcal/mol) for the relative energies of amino acids conformers calculated with def2-QZVP basis set and various DFT methods relative to MP2-F12/cc-pVTZ-F12 + [CCSD(Ts)-F12b − MP2-F12]/cc-pVDZ-F12+CV energies

Table “heatmapping” is on a spectrum from green (lowest) to red (highest) values. def2-TZVPP results for additional functionals with unit weights (Boltzmann TB = 1000 K): CAM-B3LYP 0.40 (0.22); LC-ωPBE 0.55(0.36); M11 0.75(0.50); MN12SX 0.51(0.43); N12SX 0.89(0.49); B97-D3BJ 0.82(0.39) kcal/mol.

Among GGA functionals, BLYP-NL is the best performer with 0.43 kcal/mol AveRMSD; among conventional hybrids, B3LYP-D3BJ and B3LYP-NL do surprisingly well. The “survival of the most transferable” range-separated hybrid ωB97X-V,68 which includes an NL correction, has a slight edge over B3LYP both for unit weights and for TB = 1000 K. Hybrid DFT functionals such as B3PW91, PBE0, TPSS, and TPSS0 yield quite similar performance with both D3BJ and NL corrections (AveRMSD 0.5−0.6 kcal/mol). However, heavily parametrized kinetic functionals such as M06 and M06-2X performed better without than with dispersion corrections. As the present paper was being prepared for publication, a paper by Kállay and co-workers89 appeared in which they propose a new “dual hybrid” denoted dRPA75, in which the orbitals and exchange energy are obtained using a PBE hybrid with 75% exact exchange, and the entire correlation energy is obtained using the dRPA (direct random phase approximation,90 a.k.a. drCCD or direct ring coupled cluster91,92 with all doubles). We applied this method to the present problem using the def2-QZVP basis set by means of the MRCC program package.93 The RMSD for the amino acid conformers is 0.37 kcal/mol with units weights and 0.21 kcal/mol with Boltzmann weights (TB = 1000 K), which qualifies as very good

Aside from the uncorrected functionals, we considered both Grimme’s D2 and D3BJ empirical dispersion corrections, as well as the NL nonlocal dispersion functional. The importance of dispersion corrections for the accurate reproduction of conformer energies has been established for both alkanes (e.g., n-pentane84) and for biomolecules with weak internal hydrogen bonds, such as melatonin.88 The double hybrids compare favorably to CCSD-F12b and MP2-F12. DSD-PBEP86-NL, in particular, does very well, with an AveRMSD of just 0.09 kcal/mol Boltzmann-weighted and 0.15 kcal/mol with unit weights. DSD-PBEPBE performs only slightly worse on both criteria, as does B2GP-PLYP Boltzmannweighted; with unit weights, B2GP-PLYP-NL markedly deteriorates, reflecting less good performance in the higherenergy region. Of the four double hybrids considered, DSDPBEP86 is the least sensitive to the type of dispersion correction used, even returning 0.14 kcal/mol at TB= 1000 K with no dispersion correction at all. (In the parametrization for DSD-PBEP86-noD, increased same-spin correlation partly compensates for the absent correction.) Generally, D3BJ performs quite similarly to NL, both for the double hybrids and for the conventional DFT functionals, and both are markedly superior to D2 or no correction at all. 449

DOI: 10.1021/acs.jctc.5b01066 J. Chem. Theory Comput. 2016, 12, 444−454

Article

Journal of Chemical Theory and Computation

Table 6. Conformer Corrections to the Enthalpy Function (hcf298) (kcal/mol) for Various Natural Amino Acids Calculated with ab Initio Methods

a HLC(Ts) = [CCSD(Ts)-F12b − MP2-F12]/cc-pVDZ-F12. bHLC(T*) = [CCSD(T*)-F12b − MP2-F12]/cc-pVDZ-F12. chcf298 calculated with cc-pVTZ-F12 basis set are given in parentheses. dRMSD of 19 amino acids. As an indication of scale, the average hcf298 = 0.23 kcal/mol. eHere, HLC = [CCSD(T) − MP2]/cc-pVDZ.

LC-ωPBE, M11, NM12SX, and N12SX. For technical reasons, these calculations were carried out using the smaller def2TZVPP basis set. CAM-B3LYP-D3BJ matches the very good performance of B3LYP-D3BJ, while LC-ωPBE-D3BJ behaves similarly to PBE-D3BJ. M11, N12SX and to a lesser extent MN12SX yield somewhat disappointing results. As a final note, the B97-D3BJ GGA functional74,77 was likewise checked and found to yield performance intermediate between BP86-D3BJ and PBE-D3BJ for the present benchmark (with Boltzmann weighting). Conformer Corrections to the Enthalpy Function. As discussed in the Introduction, comparison of theoretical results with experimental value obviously demands thermal corrections. The procedure to calculate the conformer contribution to the enthalpy function hcf298 is discussed in the Computational Methods section. The conformer corrections to the enthalpy function hcf298 for all natural amino acids calculated with various ab initio methods are listed in Table 6. Conformer corrections to the enthalpy functions reported in this work for the proteinogenic amino acids, with N- and C- termination, are remarkably smaller than the values reported for their corresponding gaseous amino acids.3 For the gaseous

performance considering that dRPA75 is devoid of any dispersion correction. Detailed inspection reveals that dRPA75 performs especially well for amino acids with aliphatic side chains but less so for polar or aromatic side chains. In order to investigate basis set sensitivity, we have performed additional calculations with the def2-TZVPP, def2TZVP, and def2-SVP basis sets for one each of pure (m)GGAs (PBE), hybrid-DFT (B3LYP), and double-hybrid (B2GPPLYP) methods. For B3LYP, we also considered the aug-pc-1 and pc-2 polarization consistent basis sets. (These basis sets were optimized for HF and DFT calculations, while the def2 series was developed with both correlated ab initio and DFT in mind and might thus be a natural choice for double-hybrid functionals.) The B3LYP AveRMSD between def2-QZVP and pc-2, def2-TZVP, and def2-TZVPP are all relatively small at 0.08, 0.12, and 0.09 kcal/mol, respectively, while for aug-pc-1 and def2-SVP, they increase to 0.4 and 1.0 kcal/mol, respectively. The former exceed the intrinsic error of many functionals, while the latter exceeds that of all but the worst functionals without dispersion correction (or with D2). At a reviewer’s request, we considered performance for five additional range-separated hybrids, namely, CAM-B3LYP, 450

DOI: 10.1021/acs.jctc.5b01066 J. Chem. Theory Comput. 2016, 12, 444−454

Article

Journal of Chemical Theory and Computation amino acids, conformer corrections values ranged from 0.22−0.81 kcal/mol; however, for amino acids with N- and C-termination, values ranged between 0.044−0.473 kcal/mol. The magnitude of conformer correction correlates disproportionally with the energy difference between the second lowest energy conformer and the global minimum. Additional peptide bonds at both ends in the proteinogenic amino acids enable additional intramolecular hydrogen bonds, which often preferentially stabilize the lowest energy conformer of most of the amino acids. The resulting increased energy gap with the second lowest conformer then results in reduced conformer corrections to the enthalpy function. Histidine has the smallest conformer correction (0.043 kcal/ mol), closely followed by serine (0.049 kcal/mol), while the two largest corrections are seen for isoleucine (0.473 kcal/mol) and valine (0.408 kcal/mol). Cysteine, leucine, lysine, phenylalanine, and tyrosine are among the other natural amino acids that have conformer corrections greater than or equal to 0.3 kcal/mol. MP2-F12/cc-pVTZ-F12 + [CCSD(Ts)-F12b − MP2-F12]/ cc-pVDZ-F12 calculated conformer corrections are considered as the reference value. In relation to the reference value, for the triples, the choices of no scaling, uniform scaling, and Marchetti− Werner scaling show RMS deviations of just 0.009, 0.005, and 0.004 kcal/mol, respectively. Because of the large size of tryptophan, we were unable to calculate hcf298 value with CCSD(T*)-F12b method; however, we have considered the [CCSD(T) − MP2]/cc-pVDZ HLC corrected MP2-F12/cc-pVTZ-F12 result as the reference value for this particular case. To check the consistency of this methodology, in same manner, we have also calculated the hcf298 for arginine, phenylalanine, and tyrosine and found a deviation of 0.012 kcal/mol for arginine, whereas 0.002 kcal/mol for phenylalanine and tyrosine relative to MP2-F12/cc-pVTZF12 + [CCSD(Ts)-F12b − MP2-F12]/cc-pVDZ-F12. Since the conformer corrections values are small in magnitude, percentual deviations may be more informative than RMSDs. The ab initio calculated percent deviations relative to MP2-F12/cc-pVTZ-F12 + [CCSD(Ts)-F12b − MP2-F12]/ cc-pVDZ-F12 are given in the Supporting Information; averages over all amino acids are presented in Table 7. Among the lower level ab initio methods, MP2-F12 performs well with about 14% average deviation, followed by SCS-MP2F12 at about 18%, both outperforming the much more expensive CCSD-F12b method at 26%. Average percentual deviations calculated for various doublehybrid DFT and DFT methods with def2-QZVP basis set are also given in Table 7. In general, for both double-hybrid DFT and DFT methods, both D3BJ and NL corrected results are better than D2-corrected as well as raw (uncorrected) values. Except for DSD-PBEB95, all other DH-DFTs outperform the lower level ab initio methods, even showing less than 11% average percent deviation with D3BJ and NL corrections. Among other DFT methods, B3LYP, PBE0, and TPSS0 are performing reasonably well with D3BJ and NL corrections, showing average percent deviations less than 17%. The rangeseparated hybrid ωB97X-V also performs well with 17% average percent deviation; however, performance of all other DFT methods leaves something to be desired. What about the basis set size? Additional calculations with def2-TZVPP, def2-TZVP, and def2-SVP basis sets were performed for the PBE, B3LYP, and B2GP-PLYP functionals. Calculated average percent deviations for hcf298 are given in the

Table 7. Average percent deviations for hcf298 in referencea to MP2-F12/cc-pVTZ-F12 + [CCSD(Ts)-F12b − MP2-F12]/cc-pVDZ-F12 for various ab initiob and DFTc methods

For tryptophan reference is MP2-F12/cc-pVTZ-F12 + [CCSD(T) − MP2]/cc-pVDZ. bAverage percent deviation of 19 amino acids. c All DFT calculations were performed using def2-QZVP basis set. d HLC(T*) = [CCSD(T*)-F12b − MP2-F12]/cc-pVDZ-F12. a

Supporting Information. Performance of def2-SVP is very poor with 37−56% average deviation in reference to def2-QZVP. However, def2-TZVPP and def2-TZVP deviate only up to 8% and 9% from def2-QZVP results, respectively; this applies to PBE and B3LYP as well as B2GP-PLYP. Particularly for B3LYP, def2-TZVP appears to be an excellent compromise between quality and computational cost for hcf298. Effect of Reference Geometry Level. It might be argued, with benefit of hindsight, that DSD-PBEP86-D3BJ was a more appropriate level of theory for the reference geometries. This choice, in turn, might have been open to criticism on the grounds that the results would no longer be purely ab initio, as DSD-PBEP86-D3BJ is not only a DFT functional but it contains six adjustable parameters, all of them fixed from highlevel ab initio calculations rather than experiment. (Specifically, they are the percentages of HF-like exchange, DFT-like correlation, same-spin MP2-like correlation, and opposite-spin MP2-like correlation, and the dispersion correction, plus the a2 shaping parameter of the dispersion correction.) For a subset of amino acids, geometries of all conformers were optimized at the DSD-PBEP86-D3BJ/def2-QZVP level using ORCA with the COSX (chain-of-spheres exchange) approximation94 and single-point calculations DSD-PBEP86-D3BJ/ def2-QZVP without COS performed at the final geometry. Comparison of the relative energies with those obtained at the same level at the MP2/cc-pVTZ reference geometries used in the rest of the work will permit us to assess the sensitivity of the relative energies to the reference geometries to some degree. The RMS differences between the two sets of relative energies are: alanine, asparagine, and valine each 0.06 kcal/mol, 451

DOI: 10.1021/acs.jctc.5b01066 J. Chem. Theory Comput. 2016, 12, 444−454

Article

Journal of Chemical Theory and Computation

Lise Meitner-Minerva Center for Computational Quantum Chemistry, and a grant from the Yeda-Sela Initiative (Weizmann Institute of Science). A.K. gratefully acknowledges the generous allocation of computing time from the National Computational Infrastructure (NCI) National Facility, as well as an Australian Research Council (ARC) Discovery Early Career Researcher Award (DECRA, project number: DE140100311).

aspartic acid 0.05 kcal/mol, isoleucine and threonine 0.04, proline 0.03, and cysteine 0.10 kcal/mol. (For cysteine, two conformers collapsed to different ones.) For perspective, the RMS difference between the relative energies with and without COSX is itself 0.065 kcal/mol. We wish to emphasize that since the same set of stationary geometries is used throughout for evaluation of the performance of the lower-level methods, that a re-evaluation of all energetics and statistics at a slightly different reference geometry level should not materially affect any conclusions concerning their performance.





CONCLUSIONS The relative energies of the YMPJ conformer database of the 20 proteinogenic amino acids, with N- and C-termination, have been re-evaluated using explicitly correlated coupled cluster methods. The revised geometries and energetics have been made available as Supporting Information and should be useful in the parametrization and validation of molecular mechanics force fields and other low-cost methods. Core−valence corrections are comparable in importance to improvements beyond CCSD(T*)/cc-pVDZ-F12 in the valence treatment. Lower-cost ab initio methods such as MP2-F12 and CCSDF12b actually perform less well than double hybrid DFT functionals; in particular, the DSD-PBEP86-NL double hybrid performs well enough to serve as a secondary standard. Among range-separated hybrids, ωB97X-V performs well, while B3LYP-D3BJ does surprisingly well among traditional DFT functionals. For this particular problem set, D3BJ generally performs as well as the NL nonlocal dispersion functional. Basis set sensitivity for DFT calculation on these conformers is weak enough that def2-TZVP is generally adequate. For conformer corrections to heats of formation, B3LYPD3BJ and definitely DSD-PBEP86-D3BJ or DSD-PBEP86-NL are adequate for all but the most exacting applications. The dRPA75 method, without any further dispersion correction, performs quite well, particularly for amino acids with aliphatic side chains, but is still outperformed by DSD-PBEP86-NL.



ASSOCIATED CONTENT

* Supporting Information S

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jctc.5b01066. Spreadsheet (ab initio) with total and relative energies at various levels of theory (ZIP) Spreadsheet (DFT) with total and relative energies at various levels of theory (ZIP) MP2/cc-pVTZ optimized geometries for all conformers in Cartesian coordinates (ZIP)



REFERENCES

(1) Perola, E.; Charifson, P. S. Conformational Analysis of Drug-like Molecules Bound to Proteins: An Extensive Study of Ligand Reorganization upon Binding. J. Med. Chem. 2004, 47, 2499−2510. (2) Gruzman, D.; Karton, A.; Martin, J. M. L. Performance of Ab Initio and Density Functional Methods for Conformational Equilibria of CnH2n+1 Alkane Isomers (n = 4−8). J. Phys. Chem. A 2009, 113, 11974−11983. (3) Karton, A.; Yu, L.-J.; Kesharwani, M. K.; Martin, J. M. L. Heats of Formation of the Amino Acids Re-Examined by Means of W1-F12 and W2-F12 Theories. Theor. Chem. Acc. 2014, 133, 1483. (4) Jensen, J. H.; Gordon, M. S. Conformational Potential Energy Surface of Glycine: A Theoretical Study. J. Am. Chem. Soc. 1991, 113, 7917−7924. (5) Császár, A. G. Conformers of Gaseous α-Alanine. J. Phys. Chem. 1996, 100, 3541−3551. (6) Czinki, E.; Császár, A. G. Conformers of Gaseous Proline. Chem. Eur. J. 2003, 9, 1008−1019. (7) Wilke, J. J.; Lind, M. C.; Schaefer, H. F.; Császár, A. G.; Allen, W. D. Conformers of Gaseous Cysteine. J. Chem. Theory Comput. 2009, 5, 1511−1523. (8) Pang, R.; Guo, M.; Ling, S.; Lin, Z. Thorough Theoretical Search of Conformations of Neutral, Protonated and Deprotonated Glutamine in Gas Phase. Comput. Theor. Chem. 2013, 1020, 14−21. (9) Szidarovszky, T.; Czakó, G.; Császár, A. G. Conformers of Gaseous Threonine. Mol. Phys. 2009, 107, 761−775. (10) Balabin, R. Communications: Intramolecular Basis Set Superposition Error as a Measure of Basis Set Incompleteness: Can One Reach the Basis Set Limit without Extrapolation? J. Chem. Phys. 2010, 132, 211103. (11) Petrovic, A. G.; Polavarapu, P. L.; Mahalakshmi, R.; Balaram, P. Characterization of Folded Conformations in a Tetrapeptide Containing Two Tryptophan Residues by Vibrational Circular Dichroism. Chirality 2009, 21, E76−E85. (12) Goerigk, L.; Karton, A.; Martin, J. M. L.; Radom, L. Accurate Quantum Chemical Energies for Tetrapeptide Conformations: Why MP2 Data with an Insufficient Basis Set Should Be Handled with Caution. Phys. Chem. Chem. Phys. 2013, 15, 7028−7031. (13) Rai, R.; Aravinda, S.; Kanagarajadurai, K.; Raghothama, S.; Shamala, N.; Balaram, P. Diproline Templates as Folding Nuclei in Designed Peptides. Conformational Analysis of Synthetic Peptide Helices Containing Amino Terminal Pro-Pro Segments. J. Am. Chem. Soc. 2006, 128, 7916−7928. (14) Ling, S.; Yu, W.; Huang, Z.; Lin, Z.; Harañczyk, M.; Gutowski, M. Gaseous Arginine Conformers and Their Unique Intramolecular Interactions. J. Phys. Chem. A 2006, 110, 12282−12291. (15) Chen, M.; Huang, Z.; Lin, Z. Ab Initio Studies of Gas Phase Asparagine Conformers. J. Mol. Struct.: THEOCHEM 2005, 719, 153− 158. (16) Chen, M.; Lin, Z. Ab Initio Studies of Aspartic Acid Conformers in Gas Phase and in Solution. J. Chem. Phys. 2007, 127, 154314. (17) Huang, Z.; Yu, W.; Lin, Z. First-Principle Studies of Gaseous Aromatic Amino Acid Histidine. J. Mol. Struct.: THEOCHEM 2006, 801, 7−20. (18) Dokmaisrijan, S.; Lee, V. S.; Nimmanpipug, P. The Gas Phase Conformers and Vibrational Spectra of Valine, Leucine and Isoleucine: An Ab Initio Study. J. Mol. Struct.: THEOCHEM 2010, 953, 28−38.

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Fax: +972 8 934 4142. Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS M.K.K. acknowledges a Feinberg Graduate School postdoctoral fellowship. Research at Weizmann was supported by the Israel Science Foundation (grant 1358/15), Minerva Foundation, and 452

DOI: 10.1021/acs.jctc.5b01066 J. Chem. Theory Comput. 2016, 12, 444−454

Article

Journal of Chemical Theory and Computation (19) Boeckx, B.; Maes, G. Experimental and Theoretical Observation of Different Intramolecular H-Bonds in Lysine Conformations. J. Phys. Chem. B 2012, 116, 12441−12449. (20) Huang, Z.; Yu, W.; Lin, Z. Exploration of the Full Conformational Landscapes of Gaseous Aromatic Amino Acid Phenylalanine: An Ab Initio Study. J. Mol. Struct.: THEOCHEM 2006, 758, 195−202. (21) Barone, V.; Biczysko, M.; Bloino, J.; Puzzarini, C. Accurate Structure, Thermodynamic and Spectroscopic Parameters from CC and CC/DFT Schemes: The Challenge of the Conformational Equilibrium in Glycine. Phys. Chem. Chem. Phys. 2013, 15, 10094− 10111. (22) Karton, A.; Martin, J. M. L. Explicitly Correlated Wn Theory: W1-F12 and W2-F12. J. Chem. Phys. 2012, 136, 124114. (23) Yuan, Y.; Mills, M. J. L.; Popelier, P. L. A.; Jensen, F. Comprehensive Analysis of Energy Minima of the 20 Natural Amino Acids. J. Phys. Chem. A 2014, 118, 7876−7891. (24) Werner, H.-J.; Knowles, P. J.; Knizia, G.; Manby, F. R.; Schütz, M.; Celani, P.; Korona, T.; Lindh, R.; Mitrushenkov, A.; Rauhut, G.; Shamasundar, K. R.; Adler, T. B.; Amos, R. D.; Bernhardsson, A.; Berning, A.; Cooper, D. L.; Deegan, M. J. O.; Dobbyn, A. J.; Eckert, F.; Goll, E.; Hampel, C.; Hesselman, A.; Hetzer, G.; Hrenar, T.; Jansen, G.; Köppl, C.; Liu, Y.; Lloyd, A. W.; Mata, R. A.; May, A. J.; McNicholas, S. J.; Meyer, W.; Mura, M. E.; Nicklass, A.; O’Neill, D. P.; Palmieri, P.; Peng, D.; Pflüger, K.; Pitzer, R. M.; Reiher, M.; Shiozaki, T.; Stoll, H.; Stone, A. J.; Tarroni, R.; Thorsteinsson, T.; Wang, M. MOLPRO, Version 2012.1, a Package of Ab Initio Programs. University of Cardiff Chemistry Consultants (UC3): Cardiff, Wales, U.K., 2012. http://www.molpro.net. (25) Neese, F. The ORCA Program System. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2012, 2, 73−78, http://orcaforum.cec.mpg.de,. (26) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Mennucci, B.; Petersson, G. A.; Nakatsuji, H.; Caricato, M.; Li, X.; Hratchian, H. P.; Izmaylov, A. F.; Bloino, J.; Zheng, G.; Sonnenberg, J. L.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Vreven, T.; Montgomery, J. A., Jr.; Peralta, J. E.; Ogliaro, F.; Bearpark, M.; Heyd, J. J.; Brothers, E.; Kudin, K. N.; Staroverov, V. N.; Kobayashi, R.; Normand, J.; Raghavachari, K.; Rendell, A. P.; Burant, J. C.; Iyengar, S. S.; Tomasi, J.; Cossi, M.; Rega, N.; Millam, M. J.; Klene, M.; Knox, J. E.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.; Martin, R. L.; Morokuma, K.; Zakrzewski, V. G.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Dapprich, S.; Daniels, A. D.; Farkas, Ö .; Foresman, J. B.; Ortiz, J. V.; Cioslowski, J.; Fox, D. J. Gaussian 09 Rev. D01. Gaussian, Inc.: Wallingford, CT, 2012. http://www. gaussian.com. (27) Weigend, F.; Häser, M. RI-MP2: First Derivatives and Global Consistency. Theor. Chem. Acc. 1997, 97, 331−340. (28) Kendall, R. A.; Früchtl, H. A. The Impact of the Resolution of the Identity Approximate Integral Method on Modern Ab Initio Algorithm Development. Theor. Chem. Acc. 1997, 97, 158−163. (29) Dunning, T. H. Gaussian Basis Sets for Use in Correlated Molecular Calculations. I. The Atoms Boron through Neon and Hydrogen. J. Chem. Phys. 1989, 90, 1007−1023. (30) Kendall, R. A.; Dunning, T. H.; Harrison, R. J. Electron Affinities of the First-Row Atoms Revisited. Systematic Basis Sets and Wave Functions. J. Chem. Phys. 1992, 96, 6796−6806. (31) Woon, D. E.; Dunning, T. H. Gaussian Basis Sets for Use in Correlated Molecular Calculations. III. The Atoms Aluminum through Argon. J. Chem. Phys. 1993, 98, 1358−1371. (32) Peterson, K. A.; Woon, D. E.; Dunning, T. H. Benchmark Calculations with Correlated Molecular Wave Functions. IV. The Classical Barrier Height of the H+H2→H2+H Reaction. J. Chem. Phys. 1994, 100, 7410−7415. (33) Peterson, K. A.; Adler, T. B.; Werner, H.-J. Systematically Convergent Basis Sets for Explicitly Correlated Wavefunctions: The Atoms H, He, B-Ne, and Al-Ar. J. Chem. Phys. 2008, 128, 084102.

(34) Yousaf, K. E.; Peterson, K. A. Optimized Auxiliary Basis Sets for Explicitly Correlated Methods. J. Chem. Phys. 2008, 129, 184108. (35) Yousaf, K. E.; Peterson, K. A. Optimized Complementary Auxiliary Basis Sets for Explicitly Correlated Methods: aug-cc-pVnZ Orbital Basis Sets. Chem. Phys. Lett. 2009, 476, 303−307. (36) Hill, J. G.; Peterson, K. A.; Knizia, G.; Werner, H.-J. Extrapolating MP2 and CCSD Explicitly Correlated Correlation Energies to the Complete Basis Set Limit with First and Second Row Correlation Consistent Basis Sets. J. Chem. Phys. 2009, 131, 194105. (37) Adler, T. B.; Knizia, G.; Werner, H.-J. A Simple and Efficient CCSD(T)-F12 Approximation. J. Chem. Phys. 2007, 127, 221106. (38) Noga, J.; Šimunek, J. On the One-Particle Basis Set Relaxation in R12 Based Theories. Chem. Phys. 2009, 356, 1−6. (39) Marchetti, O.; Werner, H.-J. Accurate Calculations of Intermolecular Interaction Energies Using Explicitly Correlated Wave Functions. Phys. Chem. Chem. Phys. 2008, 10, 3400−3409. (40) Marchetti, O.; Werner, H.-J. Accurate Calculations of Intermolecular Interaction Energies Using Explicitly Correlated Coupled Cluster Wave Functions and a Dispersion-Weighted MP2 Method. J. Phys. Chem. A 2009, 113, 11580−11585. (41) Peterson, K. A.; Kesharwani, M. K.; Martin, J. M. L. The ccpV5Z-F12 Basis Set: Reaching the Basis Set Limit in Explicitly Correlated Calculations. Mol. Phys. 2015, 113, 1551−1558. (42) Grimme, S. Improved Second-Order Møller−Plesset Perturbation Theory by Separate Scaling of Parallel- and Antiparallel-Spin Pair Correlation Energies. J. Chem. Phys. 2003, 118, 9095−9102. (43) Szabados, A. Theoretical Interpretation of Grimme’s SpinComponent-Scaled Second Order Møller-Plesset Theory. J. Chem. Phys. 2006, 125, 214105. (44) Grimme, S.; Goerigk, L.; Fink, R. F. Spin-Component-Scaled Electron Correlation Methods. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2012, 2, 886−906. (45) Peterson, K. A.; Dunning, T. H. Accurate Correlation Consistent Basis Sets for Molecular Core−valence Correlation Effects: The Second Row Atoms Al−Ar, and the First Row Atoms B−Ne Revisited. J. Chem. Phys. 2002, 117, 10548. (46) Weigend, F.; Ahlrichs, R. Balanced Basis Sets of Split Valence, Triple Zeta Valence and Quadruple Zeta Valence Quality for H to Rn: Design and Assessment of Accuracy. Phys. Chem. Chem. Phys. 2005, 7, 3297−3305. (47) Weigend, F. Hartree-Fock Exchange Fitting Basis Sets for H to Rn. J. Comput. Chem. 2008, 29, 167−175. (48) Hättig, C. Optimization of Auxiliary Basis Sets for RI-MP2 and RI-CC2 Calculations: Core−valence and Quintuple-ζ Basis Sets for H to Ar and QZVPP Basis Sets for Li to Kr. Phys. Chem. Chem. Phys. 2005, 7, 59−66. (49) Jensen, F. Polarization Consistent Basis Sets: Principles. J. Chem. Phys. 2001, 115, 9113−9125. (50) Jensen, F. Polarization Consistent Basis Sets. III. The Importance of Diffuse Functions. J. Chem. Phys. 2002, 117, 9234− 9240. (51) Jensen, F.; Helgaker, T. Polarization Consistent Basis Sets. V. The Elements Si-Cl. J. Chem. Phys. 2004, 121, 3463−3470. (52) Perdew, J. P.; Schmidt, K. Jacob’s Ladder of Density Functional Approximations for the Exchange-Correlation Energy. In AIP Conference Proceedings; Van Doren, V., Van Alsenoy, C., Geerlings, P., Eds.; AIP Conference Proceedings; AIP: Antwerp, Belgium, 2001; Vol. 577, pp 1−20. (53) Becke, A. D. Density-Functional Exchange-Energy Approximation with Correct Asymptotic Behavior. Phys. Rev. A: At., Mol., Opt. Phys. 1988, 38, 3098−3100. (54) Lee, C.; Yang, W.; Parr, R. G. Development of the Colle-Salvetti Correlation-Energy Formula into a Functional of the Electron Density. Phys. Rev. B: Condens. Matter Mater. Phys. 1988, 37, 785−789. (55) Perdew, J.; Burke, K.; Ernzerhof, M. Generalized Gradient Approximation Made Simple. Phys. Rev. Lett. 1996, 77, 3865−3868. (56) Tao, J.; Perdew, J.; Staroverov, V.; Scuseria, G. Climbing the Density Functional Ladder: Nonempirical Meta−Generalized Gra453

DOI: 10.1021/acs.jctc.5b01066 J. Chem. Theory Comput. 2016, 12, 444−454

Article

Journal of Chemical Theory and Computation dient Approximation Designed for Molecules and Solids. Phys. Rev. Lett. 2003, 91, 146401. (57) Becke, A. D. A New Mixing of Hartree−Fock and Local Density-Functional Theories. J. Chem. Phys. 1993, 98, 1372−1377. (58) Adamo, C.; Barone, V. Toward Reliable Density Functional Methods without Adjustable Parameters: The PBE0 Model. J. Chem. Phys. 1999, 110, 6158−6170. (59) Grimme, S. Accurate Calculation of the Heats of Formation for Large Main Group Compounds with Spin-Component Scaled MP2 Methods. J. Phys. Chem. A 2005, 109, 3067−3077. (60) Quintal, M. M.; Karton, A.; Iron, M. A.; Boese, A. D.; Martin, J. M. L. Benchmark Study of DFT Functionals for Late-Transition-Metal Reactions. J. Phys. Chem. A 2006, 110, 709−716. (61) Zhao, Y.; Truhlar, D. G. The M06 Suite of Density Functionals for Main Group Thermochemistry, Thermochemical Kinetics, Noncovalent Interactions, Excited States, and Transition Elements: Two New Functionals and Systematic Testing of Four M06-Class Functionals and 12 Other Function. Theor. Chem. Acc. 2008, 120, 215−241. (62) Karton, A.; Tarnopolsky, A.; Lamère, J.-F.; Schatz, G. C.; Martin, J. M. L. Highly Accurate First-Principles Benchmark Data Sets for the Parametrization and Validation of Density Functional and Other Approximate Methods. Derivation of a Robust, Generally Applicable, Double-Hybrid Functional for Thermochemistry and Thermochemical. J. Phys. Chem. A 2008, 112, 12868−12886. (63) Kozuch, S.; Martin, J. M. L. DSD-PBEP86: In Search of the Best Double-Hybrid DFT with Spin-Component Scaled MP2 and Dispersion Corrections. Phys. Chem. Chem. Phys. 2011, 13, 20104− 20107. (64) Kozuch, S.; Martin, J. M. L. Spin-Component-Scaled Double Hybrids: An Extensive Search for the Best Fifth-Rung Functionals Blending DFT and Perturbation Theory. J. Comput. Chem. 2013, 34, 2327−2344. (65) Chai, J.-D.; Head-Gordon, M. Systematic Optimization of LongRange Corrected Hybrid Density Functionals. J. Chem. Phys. 2008, 128, 84106. (66) Chai, J.-D.; Head-Gordon, M. Long-Range Corrected Hybrid Density Functionals with Damped Atom-Atom Dispersion Corrections. Phys. Chem. Chem. Phys. 2008, 10, 6615−6620. (67) Lin, Y.-S.; Li, G.-D.; Mao, S.-P.; Chai, J.-D. Long-Range Corrected Hybrid Density Functionals with Improved Dispersion Corrections. J. Chem. Theory Comput. 2013, 9, 263−272. (68) Mardirossian, N.; Head-Gordon, M. ωB97X-V: A 10-Parameter, Range-Separated Hybrid, Generalized Gradient Approximation Density Functional with Nonlocal Correlation, Designed by a Survival-of-the-Fittest Strategy. Phys. Chem. Chem. Phys. 2014, 16, 9904−9924. (69) Yanai, T.; Tew, D. P.; Handy, N. C. A New Hybrid Exchange− correlation Functional Using the Coulomb-Attenuating Method (CAM-B3LYP). Chem. Phys. Lett. 2004, 393, 51−57. (70) Vydrov, O. A.; Scuseria, G. E. Assessment of a Long-Range Corrected Hybrid Functional. J. Chem. Phys. 2006, 125, 234109. (71) Peverati, R.; Truhlar, D. G. Improving the Accuracy of Hybrid Meta-GGA Density Functionals by Range Separation. J. Phys. Chem. Lett. 2011, 2, 2810−2817. (72) Peverati, R.; Truhlar, D. G. Screened-Exchange Density Functionals with Broad Accuracy for Chemistry and Solid-State Physics. Phys. Chem. Chem. Phys. 2012, 14, 16187−16191. (73) Klimeš, J.; Michaelides, A. Perspective: Advances and Challenges in Treating van Der Waals Dispersion Forces in Density Functional Theory. J. Chem. Phys. 2012, 137, 120901. (74) Grimme, S. Semiempirical GGA-Type Density Functional Constructed with a Long-Range Dispersion Correction. J. Comput. Chem. 2006, 27, 1787−1799. (75) Schwabe, T.; Grimme, S. Double-Hybrid Density Functionals with Long-Range Dispersion Corrections: Higher Accuracy and Extended Applicability. Phys. Chem. Chem. Phys. 2007, 9, 3397−3406. (76) Grimme, S.; Antony, J.; Ehrlich, S.; Krieg, H. A Consistent and Accurate Ab Initio Parametrization of Density Functional Dispersion

Correction (DFT-D) for the 94 Elements H-Pu. J. Chem. Phys. 2010, 132, 154104. (77) Grimme, S.; Ehrlich, S.; Goerigk, L. Effect of the Damping Function in Dispersion Corrected Density Functional Theory. J. Comput. Chem. 2011, 32, 1456−1465. (78) Vydrov, O. A.; Van Voorhis, T. Nonlocal van Der Waals Density Functional: The Simpler the Better. J. Chem. Phys. 2010, 133, 244103. (79) Hujo, W.; Grimme, S. Performance of the van Der Waals Density Functional VV10 and (hybrid)GGA Variants for Thermochemistry and Noncovalent Interactions. J. Chem. Theory Comput. 2011, 7, 3866−3871. (80) Ř ezác,̌ J.; Riley, K. E.; Hobza, P. Extensions of the S66 Data Set: More Accurate Interaction Energies and Angular-Displaced Nonequilibrium Geometries. J. Chem. Theory Comput. 2011, 7, 3466−3470. (81) Yu, F. Spin-Component-Scaled Double-Hybrid Density Functionals with Nonlocal van Der Waals Correlations for Noncovalent Interactions. J. Chem. Theory Comput. 2014, 10, 4400−4407. (82) Brauer, B.; Kesharwani, M. K.; Martin, J. M. L. Some Observations on Counterpoise Corrections for Explicitly Correlated Calculations on Noncovalent Interactions. J. Chem. Theory Comput. 2014, 10, 3791−3799. (83) Temelso, B.; Renner, C. R.; Shields, G. C. Importance and Reliability of Small Basis Set CCSD(T) Corrections to MP2 Binding and Relative Energies of Water Clusters. J. Chem. Theory Comput. 2015, 11, 1439−1448. (84) Martin, J. M. L. What Can We Learn About Dispersion from the Conformer Surface of N-Pentane? J. Phys. Chem. A 2013, 117, 3118− 3132. (85) Robertson, M. J.; Tirado-Rives, J.; Jorgensen, W. L. Improved Peptide and Protein Torsional Energetics with the OPLS-AA Force Field. J. Chem. Theory Comput. 2015, 11, 3499−3509. (86) Distasio, R. A., Jr.; Head-Gordon, M. Optimized SpinComponent Scaled Second-Order Møller-Plesset Perturbation Theory for Intermolecular Interaction Energies. Mol. Phys. 2007, 105, 1073− 1083. (87) Fink, R. F. Spin-Component-Scaled Møller-Plesset (SCS-MP) Perturbation Theory: A Generalization of the MP Approach with Improved Properties. J. Chem. Phys. 2010, 133, 174113. (88) Fogueri, U. R.; Kozuch, S.; Karton, A.; Martin, J. M. L. The Melatonin Conformer Space: Benchmark and Assessment of Wave Function and DFT Methods for a Paradigmatic Biological and Pharmacological Molecule. J. Phys. Chem. A 2013, 117, 2269−2277. (89) Mezei, P. D.; Csonka, G. I.; Ruzsinszky, A.; Kállay, M. Construction and Application of a New Dual-Hybrid Random Phase Approximation. J. Chem. Theory Comput. 2015, 11, 4615−4626. (90) Toulouse, J.; Zhu, W.; Savin, A.; Jansen, G.; Á ngyán, J. G. Closed-Shell Ring Coupled Cluster Doubles Theory with Range Separation Applied on Weak Intermolecular Interactions. J. Chem. Phys. 2011, 135, 084119. (91) Scuseria, G. E.; Henderson, T. M.; Sorensen, D. C. The Ground State Correlation Energy of the Random Phase Approximation from a Ring Coupled Cluster Doubles Approach. J. Chem. Phys. 2008, 129, 231101. (92) Eshuis, H.; Bates, J. E.; Furche, F. Electron Correlation Methods Based on the Random Phase Approximation. Theor. Chem. Acc. 2012, 131, 1084. (93) Kallay, M.; Rolik, Z.; Csontos, J.; Ladjanski, I.; Szegedy, L.; Ladoczki, B.; Samu, G. MRCC, a Quantum Chemical Program Suite; Budapest University of Technology and Economics: Budapest, Hungary, 2015. (94) Izsák, R.; Neese, F. An Overlap Fitted Chain of Spheres Exchange Method. J. Chem. Phys. 2011, 135, 144105.

454

DOI: 10.1021/acs.jctc.5b01066 J. Chem. Theory Comput. 2016, 12, 444−454