Regression Formulas for Density Functional Theory Calculated 1H

Oct 3, 2011 - This study aimed at investigating the performance of a series of basis sets, density functional theory (DFT) functionals, and the IEF-PC...
0 downloads 0 Views 2MB Size
ARTICLE pubs.acs.org/JPCA

Regression Formulas for Density Functional Theory Calculated 1 H and 13C NMR Chemical Shifts in Toluene-d8 Ivan A. Konstantinov and Linda J. Broadbelt* Department of Chemical and Biological Engineering, Northwestern University, 2145 Sheridan Rd, E-136, Evanston, Illinois 60208, United States

bS Supporting Information ABSTRACT: This study aimed at investigating the performance of a series of basis sets, density functional theory (DFT) functionals, and the IEF-PCM solvation model in the accurate calculation of 1H and 13C NMR chemical shifts in toluene-d8. We demonstrated that, on a test set of 37 organic species with various functional moieties, linear scaling significantly improved the calculated shifts and was necessary to obtain more accurate results. Inclusion of a solvation model produced larger deviations from the experimental data as compared to the gas-phase calculations. Moreover, we did not find any evidence that very large basis sets were necessary to reproduce the experimental NMR data. Ultimately, we recommend the use of the BMK functional. For the 1H shifts the use of the 6-311G(d) basis set gave linearly scaled mean unsigned (MU) and root-mean-square (rms) errors of 0.15 ppm and 0.21 ppm, respectively. For the calculation of the 13C chemical shifts the 6-31G(d) basis set produced MUE of 1.82 ppm and RMSE of 3.29 ppm.

’ INTRODUCTION Nuclear magnetic resonance (NMR) is one of the most powerful and widely used spectroscopic techniques for the determination of the structure and dynamics of complex novel compounds.1,2 Unfortunately, even with two- and three-dimensional NMR experiments one must exercise extreme care in the proper structural assignment of the various chemical shifts.3 Recently, the potential of quantum mechanical (QM) methods to aid in the assignment of complex NMR shifts has been recognized, and a significant effort has been devoted to develop accurate methodologies.436 However, in order to employ QM on a regular basis, the error between the predicted and experimental NMR data needs to be within a few ppm for the 13C shifts and even smaller for the 1H shifts.6,15,37 The problem is further complicated by the fact that novel chemical species are very often large molecules with functionalities that produce chemical shifts within very narrow spectral regions.6,12,3741 In addition, such species exist in many conformations and a vast portion of the NMR experiments are conducted in a solvent environment.11,13,1719,24,3739,4143 Thus, accurate calculations need to be achieved at a modest computational cost and yet be able to account for structural and environmental differences.16,17,37 With the advent of density functional theory (DFT), computational chemists have managed to achieve a combination of accuracy and efficiency which has allowed for the treatment of molecules composed of several hundred atoms.14,44 Moreover, scientists can choose among a plethora of available DFT functionals that have been developed to suit their various needs. To date, it has been shown in the literature that many of the existing functionals when combined with very large basis sets r 2011 American Chemical Society

are capable of producing quantitatively accurate NMR chemical shifts.611,21,22,24,25,31,34,35 The effect of the reference state has also been investigated.20 An excellent review for the determination of the relative configuration of organic compounds based on computational and experimental NMR data was provided by Bifulco and co-workers.37 In addition, Smith and Goodman introduced probability-based methodologies to assign the correct stereochemistry to single and pairs of diastereomers using QM calculations.18,19 Theoretical NMR investigations necessitate the calculation of the nuclear magnetic shielding tensor. To this purpose, methods such as the GIAO (gauge-independent atomic orbitals),21,4548 the IGLO (individual gauge for localized orbital),49 and the CGST (continuous set of gauge transformation)21,50,51 have been developed. Generally, the GIAO method has been shown to achieve faster convergence of the shielding value with respect to the size of the basis set and, consequently, has been the most commonly employed procedure.21 Computational 1H and 13C NMR studies in the literature have mainly focused on DFT and ab initio performance for structures in CDCl3.9,24,36 This is expected due to the fact that a large number of experimental spectra are taken in deuterated chloroform and there is an abundance of data for numerous molecules and functional groups. Nevertheless, NMR is also employed in the presence of solvents such as deuterated toluene, hexane, DMSO, THF, etc. Thus, this work aims to investigate the Received: June 28, 2011 Revised: September 30, 2011 Published: October 03, 2011 12364

dx.doi.org/10.1021/jp2060975 | J. Phys. Chem. A 2011, 115, 12364–12372

The Journal of Physical Chemistry A

ARTICLE

Figure 1. Molecules in the 1H and 13C NMR test set.

performance of various basis sets and DFT functionals and the subsequent development of empirical scaling factors for 1H and 13 C chemical shifts in toluene-d8 that can be of tremendous help to chemists for the structural characterization of novel compounds.

’ RESULTS AND DISCUSSION Computational Methodology. All calculations were performed employing the Gaussian 09 computational software package.52 Geometries were optimized in the gas phase using Becke’s three-parameter hybrid exchange functional (B3),53 the correlation functional of Lee, Yang, and Parr (LYP),54 and the 6-31+G(d,p) basis set.5561 This level of theory had been previously shown on numerous occasions to produce very

accurate geometric parameters for organic molecules at a low computational cost. Moreover, because our study investigated the 1 H and 13C chemical shifts in toluene-d8, the gas-phase geometry optimizations were justified due to the low dielectric constant of the solvent (2.37). In order to verify that all geometries were minima on the electronic potential energy surface, the vibrational frequencies of each species were calculated and the absence of any imaginary frequencies was noted. To test the effect of the basis set on the chemical shifts, both Pople-type and Dunning-type basis sets of double- and triple-ζ quality with the B3LYP functional were investigated. The basis sets in this study were 6-31G(d), 6-31G(d,p), 6-31+G(d,p), 6-31++G(d,p),55,56,5861 6-311G(d), 6-311G(d,p), 6-311+G(d,p), 6-311++G(d,p), 6-311+G(2d,p),62,63 cc-pVDZ, aug-cc-pVDZ, cc-pVTZ, and aug-ccpVTZ.6467 The isotropic magnetic shielding tensors were computed using the gauge-independent 12365

dx.doi.org/10.1021/jp2060975 |J. Phys. Chem. A 2011, 115, 12364–12372

The Journal of Physical Chemistry A

ARTICLE

Table 1. Dependence of the Calculated 1H NMR Shifts on the Basis Set and Solvation Model MUEa (ppm) basis setb

RMSE (ppm)

linear regression

PCMc

scaledd

unscalede

scaled

unscaled

slope

intercept

R2

6-31G(d)

no

0.17

0.25

0.23

0.34

1.01

31.97

0.9911

6-31G(d,p)

no

0.18

0.29

0.25

0.38

1.03

31.54

0.9900

6-31+G(d,p)

no

0.19

0.35

0.26

0.45

1.05

31.47

0.9887

6-31++G(d,p)

no

0.19

0.36

0.27

0.45

1.04

31.45

0.9884

6-311G(d)

no

0.16

0.32

0.21

0.38

1.02

31.96

0.9925

6-311G(d,p)

no

0.17

0.34

0.22

0.42

1.05

31.73

0.9917

6-311+G(d,p)

no

0.18

0.39

0.25

0.48

1.05

31.70

0.9898

6-311++G(d,p) 6-311+G(2d,p)

no no

0.18 0.20

0.38 0.39

0.25 0.28

0.47 0.52

1.05 1.07

31.69 31.66

0.9896 0.9871

cc-pVDZ

no

0.17

0.24

0.23

0.33

1.03

31.40

0.9910

aug-cc-pVDZ

no

0.21

0.48

0.28

0.59

1.07

31.43

0.9870

cc-pVTZ

no

0.20

0.37

0.27

0.48

1.06

31.41

0.9878

aug-cc-pVTZ

no

0.21

0.41

0.29

0.54

1.07

31.38

0.9860

6-31G(d)

yes

0.19

0.29

0.26

0.38

1.01

31.95

0.9892

6-31G(d,p)

yes

0.19

0.33

0.26

0.44

1.04

31.53

0.9885

6-31+G(d,p) 6-31++G(d,p)

yes yes

0.21 0.21

0.39 0.41

0.28 0.28

0.51 0.52

1.05 1.06

31.43 31.44

0.9870 0.9872

6-311G(d)

yes

0.18

0.36

0.23

0.43

1.03

31.94

0.9909

6-311G(d,p)

yes

0.19

0.38

0.25

0.47

1.05

31.71

0.9900

6-311+G(d,p)

yes

0.20

0.43

0.27

0.54

1.06

31.67

0.9879

6-311++G(d,p)

yes

0.20

0.42

0.28

0.53

1.06

31.67

0.9876

6-311+G(2d,p)

yes

0.22

0.43

0.30

0.57

1.08

31.64

0.9851

cc-pVDZ

yes

0.19

0.27

0.25

0.37

1.03

31.38

0.9894

aug-cc-pVDZ cc-pVTZ

yes yes

0.22 0.22

0.52 0.40

0.30 0.29

0.64 0.53

1.08 1.07

31.41 31.39

0.9858 0.9859

aug-cc-pVTZ

yes

0.23

0.45

0.31

0.58

1.07

31.36

0.9840

a

MUE and RMSE are the mean unsigned error and the root-mean-square error, respectively. b All calculations were done with the B3LYP functional. c “Yes” indicates that the GIAO method was used along with the IEF-PCM solvation model, “no” means that the calculation was done in the gas phase. d Scaled chemical shift δ = (intercept  isotropic magnetic shielding)/slope. e Unscaled chemical shift relative to TMS at the same level of theory.

atomic orbital (GIAO) methodology on the GAS/B3LYP/6-31+ G(d,p)-optimized geometries. Because NMR chemical shifts are a result of a dynamic process, the computed isotropic magnetic shielding values were averaged over all symmetry-related positions of the nuclei under consideration in each molecule. A complete list of the magnetic shielding values and, when appropriate, their averages are presented in the Supporting Information. The effect of the solvent was included via the polarizable continuum model (IEF-PCM)6883 with the solute cavities built using the van der Waals parameters from the universal force field (UFF) scaled by 1.1. This is the default and recommended solvation model in Gaussian 09. The effects of the basis set and solvation model on the calculated 1H and 13C NMR shifts were established using the B3LYP functional. Moreover, the performance of six additional DFT functionals that had been previously shown to produce quantitatively good results in chloroform was also investigated. Because it was impractical to test each functional with all the basis sets, only the basis set that performed the best with B3LYP was selected. Thus, the M06L and VSXC pure functionals developed by Truhlar and co-workers84 and van Voorhis and Scuseria,85 respectively, were employed. These are two meta-generalized gradient approximation (meta-GGA) functionals that do not

include HartreeFock (HF) exchange and, thus, are suitable for large molecular systems. In addition, the M06 (hybrid meta functional) from Truhlar et al.,86 the OPBE functional which includes Handy’s OPTX modification of Becke’s exchange functional87,88 and the gradient-corrected correlation functional of Perdew, Berke, and Ernzerhof,89,90 the BMK functional by Boese and Martin,91 and the hybrid GGA WP(C)04 functionals of Cramer and co-workers24 that were developed exclusively for the prediction of NMR shifts in chloroform were investigated. Moreover, it was decided to test these functionals of the many available because they had been previously shown to produce superior thermodynamic and kinetic parameters for a large set of reactions and species and because prior to Gaussian 09, M06L, VSXC, M06, and BMK functionals were not coded to work with the GIAO method. Test Set. For the purposes of this study, 37 organic molecules were chosen whose 65 1H and 79 13C NMR experimental chemical shifts in toluene-d8 were taken exclusively from Fulmer et al.92 in order to avoid possible inconsistencies between experimental results. The only difference between the species in the 1H and 13C test sets was the omission of H2 in the latter. The complete set is presented in Figure 1. The species in the test set were chosen according to the following criteria: 12366

dx.doi.org/10.1021/jp2060975 |J. Phys. Chem. A 2011, 115, 12364–12372

The Journal of Physical Chemistry A

ARTICLE

Table 2. Dependence of the Calculated 1H NMR Shifts on the DFT Functional MUEa (ppm)

RMSE (ppm)

linear regression

functionalb

PCMc

scaledd

unscalede

scaled

unscaled

slope

intercept

R2

B3LYP

no

0.16

0.32

0.21

0.38

1.02

31.96

0.9925

M06L

no

0.18

0.29

0.26

0.37

1.00

32.16

0.9893

OPBE

no

0.18

0.27

0.23

0.36

1.02

31.77

0.9912

VSXC

no

0.19

0.34

0.26

0.42

0.99

31.76

0.9893

M06

no

0.16

0.33

0.22

0.41

1.04

32.22

0.9918

BMK

no

0.15

0.40

0.21

0.50

1.08

31.98

0.9926

WP04

no

0.17

0.22

0.23

0.30

0.99

32.21

0.9912

a

MUE and RMSE are the mean unsigned error and the root-mean-square error, respectively. b All calculations were done with the 6-311G(d) basis set. c “No” indicates that the GIAO method was performed in the gas phase. d Scaled chemical shift δ = (intercept  isotropic magnetic shielding)/slope. e Unscaled chemical shift relative to TMS at the same level of theory.

(a) The chemical shifts of the functional groups spanned almost the entire 1H and 13C NMR spectra. (b) Only species that had one or one dominant conformer were selected. Molecules that have multiple isomers which are significantly populated require a Boltzmannweighted averaging of the calculated NMR shifts which, in turn, necessitates a very accurate calculation of the relative Gibbs free energy at the temperature and solvent of interest. Unfortunately, such relative precision is hard to achieve at a reasonable computational price. (c) Protons that are bonded to heteroatoms such as nitrogen and oxygen were not included because they are strongly affected by the ability to form hydrogen bonds as well as the solvent environment. 1H and 13C nuclei which were immediately next to a hydrogen bond were also neglected because their experimental chemical shifts would also be influenced by the hydrogen bond interaction. Nevertheless, species that are involved in hydrogen bonds but whose functionalities were taken into account were tertbutyl alcohol (CH3), ethanol (CH3), 2-propanol (CH3), pyrrolidine (CH2(3,4)), and butylated hydroxytoluene (BHT). It was expected that nuclei positioned more than two bonds away from the hydrogen bonding region would not be affected to a significant degree. Linear Scaling. Linear scaling is a commonly applied technique that allows for the improvement of the calculated chemical shifts via a linear regression of the calculated chemical shifts versus experimental data.17,3133,36,93 However, instead of including the calculated shifts relative to TMS, the computed isotropic shielding values are used. This approach has the advantage of eliminating the need of a reference compound, such as TMS, and the error associated with it, as was previously demonstrated by Rablen et al.31 Moreover, the linear regression method is capable of correcting systematic errors across the whole 1H and 13C NMR spectra. Furthermore, because NMR shifts are a function of the temperature and solvent, these factors will be taken into account in the linear scaling approach. As a result, the slope and intercept from the best-fit line allow for the direct calculation of the empirically scaled chemical shifts according to eq 1 δscaled calc ¼

ðintercept  σÞ slope

ð1Þ

where σ is the calculated isotropic shielding value for a particular nucleus or, when appropriate, the average of several symmetry-

related nuclei. Thus, in eq 1, the intercept serves as an alternative for a reference compound while the slope provides information about the size of the systematic error. Dependence of the Calculated 1H Chemical Shifts on the Basis Set and Solvation Model. Table 1 summarizes the results of the effect of the basis set and solvation model on the calculated 1 H NMR shifts for the test set. The data was organized by the type and size of the basis set. All calculations were performed with the B3LYP functional. Throughout the years, the performance of B3LYP has been tested extensively for numerous experimental observables and remains one of the most widely used density functional theory functionals for organic and inorganic molecules. Hence, this study regarded B3LYP as a comparative standard. The first result that is evident from Table 1 is that the incorporation of an implicit solvation model via the default IEF-PCM procedure in Gaussian 09 to model toluene-d8 increased the error in the calculated chemical shifts regardless of the basis set and method employed, i.e., unscaled relative to TMS or empirically scaled by a linear regression. These results did not support the study of Jain and co-workers36 who showed that a calculation involving a solvation model generally reduced the root-mean-square error (RMSE) between calculated and experimental chemical shifts in CDCl3. However, this is not surprising since toluene has a very low dielectric constant (2.37), and as a consequence, calculations in the gas phase are capable of providing accurate chemical shifts. Moreover, implicit solvation models are generally parametrized for energy, not NMR, calculations. Furthermore, gas-phase calculations of large species have the advantage of not suffering from convergence problems and increased computational times when compared to condensedphase calculations. When 1H chemical shifts were calculated with respect to TMS, the Dunning cc-pVDZ basis set provided the smallest rms and mean unsigned (MU) errors of 0.33 and 0.24 ppm, respectively. The use of the Pople 6-31G(d) basis set resulted in the second smallest RMSE of 0.34 ppm and MUE of 0.25 ppm. Overall, increasing the size of the basis set from double- to triple-ζ and adding extra diffuse and polarization functions increased the rms and MU errors for both the Dunning and Pople families of basis sets when TMS was employed as a reference state. As can be seen in Table 1, the RMSE and MUE had the largest values of 0.59 and 0.48 ppm, respectively, when the aug-cc-pVDZ basis set was used. Linear scaling provided significantly reduced rms and MU errors. Table 1 shows that after scaling the error was much less 12367

dx.doi.org/10.1021/jp2060975 |J. Phys. Chem. A 2011, 115, 12364–12372

The Journal of Physical Chemistry A

ARTICLE

Figure 3. Problematic 1H chemical shifts. The numbers are errors (δcalc  δexp) in the shifts with linear regression at BMK/6-311G(d) in normal font and relative to TMS at WP04/6-311G(d) in italics.

Figure 2. Analysis of experimental and calculated 1H chemical shifts at BMK/6-311G(d) after linear scaling via (a) parity plot and (b) histogram of the errors (δcalc  δexp).

dependent on the size of the basis set. The RMSE and MUE ranged from 0.21 to 0.29 ppm and from 0.16 to 0.21 ppm, respectively. The best results were obtained with the 6-311G(d) basis set. Our results compare favorably with scaled errors in chloroform from previous studies. The inclusion of diffuse and polarization functions to the double- and triple-ζ basis sets did not reduce the size of the error. However, it is interesting to note that the 6-31G(d) and cc-pVDZ basis sets provided RMSE of 0.23 ppm and MUE of 0.17 ppm. Consequently, while the use of the 6-311G(d) basis set is ultimately recommended, for large molecular systems the use of 6-31G(d) and cc-pVDZ is justified as a compromise between accuracy and computational time. Hence, in order to aid chemists, the slope and intercept scaling parameters for all basis sets under investigation are provided in Table 1. All linear regressions had excellent correlation coefficients (R2) whose values varied between 0.9860 and 0.9925. Detailed scatter plots and best-fit lines for all basis sets are provided in the Supporting Information. Dependence of Calculated 1H Chemical Shifts on the DFT Functional. In order to investigate the dependence of the calculated 1H chemical shifts on the DFT functional, the M06L, OPBE, VSXC, M06, BMK, and WP04 functionals were selected. Because it was impractical to investigate the performance of each of the six functionals with every basis set, the 6-311G(d) basis set was chosen which, when combined with B3LYP, produced the smallest rms and MU errors between the experimental and calculated 1H chemical shifts after linear scaling. It was already shown that the inclusion of the IEFPCM solvation model in the GIAO calculations had a negative effect on the calculated shifts in toluene-d8, and hence, all calculations were done in the gas phase. The results are summarized in Table 2.

When chemical shifts were calculated with respect to TMS, the rms and MU errors spanned the range between 0.30 and 0.50 ppm, and 0.22 and 0.40 ppm, respectively. The level of theory that performed the best was WP04/6-311G(d), and the worst was BMK/6-311G(d). Because the WP04 functional is parametrized for 1H NMR shifts in chloroform, it is not surprising that it produced the best results in toluene since both solvents are nonpolar and aprotic. Note that the OPBE and M06L functionals slightly outperformed B3LYP. On the other hand, the VSXC and M06 functionals produced larger errors than B3LYP. Empirically scaling the calculated magnetic shielding values resulted in a significant improvement over using TMS as a reference. The range of rms and MU errors was considerably reduced with the best outcome achieved by BMK in terms of MUE (0.15 ppm). Both BMK and B3LYP gave RMSE of 0.21 ppm. However, the M06L/6-311G(d) level of theory also produced satisfactory results. Because M06L is a pure functional and does not involve HartreeFock (HF) exchange, it is suitable for the treatment of large molecular systems. The linear regression correlation coefficients (R2) for all DFT functionals showed that there was an excellent correlation between the calculated magnetic shielding values and the experimental 1H NMR shifts in toluene-d8 over the entire spectrum range. Detailed Investigation of the BMK/6-311G(d)-Calculated 1 H Chemical Shifts after Linear Scaling. Figure 2 shows the parity plot and histogram between the calculated and experimental 1H chemical shifts at the BMK/6-311G(d) level of theory after linear scaling. It is clear that most of the points are located close to the line of parity with roughly equal distribution above and below the line. The histogram in Figure 2 shows in more detail the distribution of the residuals between the calculated and experimental values. For BMK/6-311G(d) the distribution is close to normal with only a few shifts producing errors larger than (0.30 ppm. Thus, 58.5% of the errors fell within (0.15 ppm and 72.3% within (0.2 ppm. Generally, linear regression employing the BMK/6-311G(d) level of theory slightly underpredicted experimental shifts. Nevertheless, only six of the calculated 1H chemical shifts had errors larger than (0.30 ppm. The functionalities that produced errors larger than (0.30 ppm were defined as problematic and are presented in Figure 3. Unfortunately, common features among the six functionalities that could account for the larger errors were not detected. After linear 12368

dx.doi.org/10.1021/jp2060975 |J. Phys. Chem. A 2011, 115, 12364–12372

The Journal of Physical Chemistry A

ARTICLE

Table 3. Dependence of the Calculated 13C NMR Shifts on the Basis Set and DFT Functional MUEa (ppm) level of theoryb

RMSE (ppm)

linear regression

scaledc

unscaledd

scaled

unscaled

slope

intercept

R2

B3LYP/6-31G(d)

2.02

4.13

3.96

5.77

0.93

186.79

0.9955

B3LYP/6-31G(d,p)

2.05

4.32

4.04

5.92

0.94

188.57

0.9953

B3LYP/6-31+G(d,p)

2.46

3.38

4.23

4.85

0.95

189.00

0.9948

B3LYP/6-31++G(d,p)

2.50

3.46

4.23

4.93

0.95

188.63

0.9948

B3LYP/6-311G(d)

2.21

4.24

4.06

5.90

1.02

180.71

0.9952

B3LYP/6-311G(d,p)

2.23

4.84

4.11

6.43

1.02

180.51

0.9951

B3LYP/6-311+G(d,p)

2.15

5.29

4.08

6.97

1.03

180.62

0.9952

B3LYP/6-311++G(d,p) B3LYP/6-311+G(2d,p)

2.18 2.27

5.35 4.96

4.12 4.32

7.04 6.84

1.03 1.03

180.62 180.09

0.9951 0.9946

B3LYP/cc-pVDZ

2.44

2.91

4.81

5.33

0.97

189.61

0.9933

B3LYP/aug-cc-pVDZ

2.66

2.64

4.76

5.04

0.99

189.19

0.9935

B3LYP/cc-pVTZ

2.29

5.07

4.33

6.87

1.03

180.43

0.9946

B3LYP/aug-cc-pVTZ

2.43

5.91

4.38

7.69

1.03

180.75

0.9945

M06L/6-31G(d)

3.13

7.73

5.79

10.41

0.86

184.36

0.9903

OPBE/6-31G(d)

2.59

5.77

4.97

8.07

0.90

189.55

0.9929

VSXC/6-31G(d) M 06/6-31G(d)

3.46 2.06

7.58 3.33

5.61 4.16

9.84 4.98

0.86 0.96

186.87 183.90

0.9909 0.9950

BMK/6-31G(d)

1.82

4.86

3.29

6.37

1.04

194.56

0.9969

WC 04/6-31G(d)

1.96

6.10

2.41

7.79

0.90

192.49

0.9983

a

MUE and RMSE are the mean unsigned error and the root-mean-square error, respectively. b All calculations were done in the gas phase. c Scaled chemical shift δ = (intercept  isotropic magnetic shielding)/slope. d Unscaled chemical shift relative to TMS at the same level of theory.

scaling, the error for the methyl protons of acetonitrile, nitromethane, and dimethylacetamide was reduced, as compared to the error when TMS was the reference state. However, the methylene protons of trimethylamine and ethyl acetate as well as the phenyl proton of BHT exhibited larger errors if scaling was employed. Also, it is interesting to note that among all the functionals explored, only BMK was able to reduce the error in the chemical shifts of protons of chlorinated carbon atoms below 0.30 ppm. Dependence of the Calculated 13C Chemical Shifts on the Basis Set and DFT Functional. Table 3 summarizes the results of the effect of the basis set and DFT functional on the calculated 13C NMR shifts for the test set. Similar to the 1H investigation, B3LYP was regarded as a comparative standard. Also, since the 1H NMR study demonstrated that the inclusion of the IEF-PCM solvation model did not lead to an improvement of the calculated shifts, only the gas-phase isotropic shielding values were considered. In terms of the basis set used, when TMS was employed as a reference state, the rms and MU errors varied between 4.85 and 7.69 ppm and 2.64 and 5.91 ppm, respectively. The smallest RMSE was produced by the 6-31+G(d,p) basis set, and the smallest MUE resulted with the aug-cc-pVDZ basis set. For the Pople-type double-ζ basis sets, adding diffuse functions on the heavy atoms reduced the calculated error, while the incorporation of diffuse and polarization functions for the hydrogen atoms increased the deviation from the experimental data. For the Pople family of triple-ζ basis sets the inclusion of diffuse and polarization functions increased the error. Thus, the smallest rms and MU errors were obtained with the 6-311G(d) basis set and the largest ones with the 6-311+G(2d,p) basis set. Overall, the increase of the number of the split-valence basis functions did not produce better results. This is in direct contradiction with the

currently accepted notion that accurate 13C NMR chemical shifts require very large basis sets. The Dunning-type double-ζ basis sets outperformed the Pople-type ones in terms of MUE but not RMSE. If one desires to use these basis sets, aug-cc-pVDZ is recommended which gave a MUE of 2.64 ppm. Linear scaling was able to reduce both the MUE and RMSE in the 13C test set. The only exception was the aug-cc-pVDZ basis set whose MUE slightly increased from 2.64 to 2.66 ppm but whose RMSE decreased by 0.28 ppm. The Pople double-ζ basis sets were greatly affected by linear scaling, and their MU and rms errors were diminished to a greater extent than the Dunning ones. The best performance in terms of the basis set after linear scaling was achieved at the modest 6-31G(d) level of theory which gave RMSE and MUE values of 3.96 and 2.02 ppm, respectively. Again, the inclusion of diffuse and polarization functions had a negative effect on the deviation from the experimental data. Once it was established that linear scaling was necessary to reproduce more accurately the experimental 13C chemical shifts and that the best basis set was 6-31G(d), the performance of M06L, OPBE, VSXC, M06, BMK, and WC04 DFT functionals with the 6-31G(d) basis set was evaluated. In terms of shifts calculated relative to TMS, M06 was the only functional that gave better results than B3LYP with a RMSE of 4.98 ppm and a MUE of 3.33 ppm. It was surprising that WC04 failed to outperform B3LYP. However, WC04 was parametrized using the much larger 6-311+G(2d,p) basis set which could account for its poor performance in predicting the 13C chemical shifts with 6-31G(d). Linear scaling reduced the MUE and RMSE values in certain cases by more than a factor of 2. The functional that was least affected by scaling was M06, whose MUE and RMSE values were reduced only by 1.27 and 0.82 ppm, respectively. However, both BMK and WC04 performed better than B3LYP. Interestingly, 12369

dx.doi.org/10.1021/jp2060975 |J. Phys. Chem. A 2011, 115, 12364–12372

The Journal of Physical Chemistry A

ARTICLE

Figure 5. Histogram of the errors (δcalc  δexp) in the calculated 13 C chemical shifts after linear scaling at (a) BMK/6-31G(d) and (b) WC04/6-31G(d).

Figure 4. Parity plot of experimental and calculated 13C chemical shifts after linear scaling at (a) BMK/6-31G(d) and (b) WC04/6-31G(d).

BMK gave the smallest MUE (1.82 ppm), while WC04 had the smallest RMSE (2.41 ppm). This necessitated the investigation of the two functionals to a greater extent with the results presented in the next section. In Table 3 the slope, intercept, and correlation coefficients (R2) for all basis sets and functionals in the 13C NMR study are provided. The R2 values varied between 0.9903 and 0.9983 which showed excellent correlation between the calculated isotropic shielding values and the experimental chemical shifts for the test set. Comparison between BMK and WC04 Functionals for the Calculation of 13C Chemical Shifts. Figure 4 shows the parity plots of the experimental and calculated 13C chemical shifts after linear scaling at the BMK/6-31G(d) and WC04/6-31G(d) levels of theory. Most of the data points lay on or very close to the 45° line with fewer outliers in the case of WC04. However, the histograms in Figure 5 reveal that BMK produced an error distribution much closer to normal with 46.8% of the errors within (1 ppm and 78.5% within (2 ppm. In contrast, 30.4% of the WC04 errors were within (1 ppm and 49.4% were within (2 ppm. Consequently, even though WC04 had a smaller RMSE than BMK, the use of BMK when scaling 13C chemical shifts is recommended. As far as the problematic 13C cases (those with errors larger than (6 ppm) are concerned, they are presented in Figure 6.

Figure 6. Problematic 13C chemical shifts. The numbers are errors (δcalc  δexp) in the shifts with linear regression at BMK/6-31G(d) in normal font and WC04/6-31G(d) in italics.

Out of the 79 13C chemical shifts in this study, BMK failed to reproduce only five whose errors were larger than (6 ppm. When the carbon nucleus was bound to more than one chlorine atom, as in chloroform and dichloromethane, BMK overestimated the chemical shifts by 20.18 and 8.64 ppm, respectively. It was discovered that the size of the error augmented as the number of chlorine atoms increased. Other problematic systems included the carbon atom being part of an amide, carboxylate, or nitrile functionality. BMK underestimated the chemical shifts in dimethyl acetamide, ethyl acetate, and acetonitrile by 7.33, 7.16, and 8.90 ppm, respectively. The WC04 functional, however, could not capture correctly only the 13C shift in chloroform even though it managed to reduce the error to a 12370

dx.doi.org/10.1021/jp2060975 |J. Phys. Chem. A 2011, 115, 12364–12372

The Journal of Physical Chemistry A

ARTICLE

Table 4. 1H and 13C NMR Shifts Calculated with Alternate Basis Sets MUEa (ppm) basis setb

NMR

6-31G(d)

1

6-311G(d)

13

H C

scaledc

RMSE (ppm)

unscaledd

scaled

0.17

0.34

2.09

10.78

linear regression intercept

R2

1.08

32.00

0.9920

1.25

187.69

0.9967

unscaled

slope

0.22

0.45

3.36

13.55

a

MUE and RMSE are the mean unsigned error and the root-mean-square error, respectively. b All calculations were done in the gas phase with the BMK functional. c Scaled chemical shift δ = (intercept  isotropic magnetic shielding)/slope. d Unscaled chemical shifts are relative to TMS at the same level of theory.

significant extent. As a consequence, it is suggested that in cases where a compound contains any of the problematic groups in Figure 5, the WC04/6-31G(d) level of theory be used. Performance of the BMK Functional with the 6-31G(d) and 6-311G(d) Basis Sets for the Calculation of 1H and 13C Chemical Shifts. In an attempt to prescribe a single level of theory for the prediction of the carbon and proton NMR chemical shifts in a particular molecular structure, the performance of the BMK functional with the 6-31G(d) and 6-311G(d) basis sets for the calculation of the 1H and 13C chemical shifts, respectively, was investigated. The results are presented in Table 4. It was discovered that a decrease in the size of the basis set for the 1H chemical shifts resulted in MUE of 0.34 ppm and RMSE of 0.45 ppm when TMS was the reference standard. Once linear scaling was applied, MUE decreased to 0.17 ppm and RMSE decreased to 0.22 ppm. However, the 6-31G(d) basis set failed to outperform the larger 6-311G(d) basis set whose linearly scaled MUE was 0.15 ppm and RMSE was 0.21 ppm. For the 13C calculations relative to TMS, the BMK/6-311G(d) level of theory produced MUE equal to 10.78 ppm and RMSE equal to 13.55 ppm. A considerable reduction in the error was achieved with empirical scaling. MUE was 2.09 ppm and RMSE was 3.36 ppm. Nevertheless, the results were significantly worse than those obtained with the BMK/6-31G(d) method (Table 3).

’ CONCLUSION The performance of different basis sets and functionals, and the inclusion of a solvation model, have been investigated to calculate the 1H and 13C NMR chemical shifts for a series of organic compounds in toluene-d8. It was determined that linear scaling was necessary to achieve more accurate results when compared to those obtained with TMS as a reference state. The smallest error for the 1H shifts in our test set was obtained using the BMK/6-311G(d) level of theory. The use of the default IEFPCM solvation model in Gaussian 09 increased the error in the 1 H calculated chemical shifts. For the 13C shifts the use of either BMK/6-31G(d) or WC04/6-31G(d) is recommended. However, it was discovered that, even after linear scaling, BMK failed to capture six 1H and five 13C NMR shifts with sufficient accuracy. Hence, given the number of experimental chemical shifts in the study, the methodology must be employed with caution when it is applied to the characterization of novel chemical species. Ultimately, this investigation suggested that it was not necessary to employ very large basis sets to obtain quantitatively accurate results. ’ ASSOCIATED CONTENT Information. bS 1Supporting 13

Contains all molecules from the H and C test sets with the appropriate atom numbering. In

addition, we have presented all the calculated isotropic shielding values, the experimental chemical shifts, and the linear regressions at the various levels of theory employed in this study. This material is available free of charge via the Internet at http://pubs. acs.org.

’ AUTHOR INFORMATION Corresponding Author

*E-mail: [email protected]. Fax: (847)491-3728.

’ ACKNOWLEDGMENT We thank the National Energy Research Scientific Computing (NERSC) Center for the computational time that made this project possible. Funding for this project was available through the Institute for Catalysis in Energy Processes (ICEP) which is a collaborative research institute between Northwestern University and Argonne National Laboratory sponsored by the Department of Energy. ’ REFERENCES (1) Carey, F. A. Organic Chemistry, 5th ed.; McGraw-Hill: New York, 2002. (2) Claridge, T. D. High-Resolution NMR Techniques in Organic Chemistry, 2nd ed.; Elsevier: New York, 2008; Vol. 27. (3) Nicolaou, K. C.; Snyder, S. A. Angew. Chem., Int. Ed. 2005, 44, 1012. (4) Barone, G.; Gomez-Paloma, L.; Duca, D.; Silvestri, A.; Riccio, R.; Bifulco, G. Chem.—Eur. J. 2002, 8, 3233. (5) Barone, G.; Gomez-Paloma, L.; Duca, D.; Silvestri, A.; Riccio, R.; Bifulco, G. Chem.—Eur. J. 2002, 8, 3240. (6) Bagno, A.; Rastrelli, F.; Saielli, G. Chem.—Eur. J. 2006, 12, 5514. (7) Helgaker, T.; Jaszunski, M.; Ruud, K. Chem. Rev. 1999, 99, 293. (8) Special issue on “Theoretical Methods in Magnetic Resonance”, edited by: Buhl, M., Magn. Reson. Chem. 2004, 42, S1. (9) Bagno, A. Chem.—Eur. J. 2001, 7, 1652. (10) Bagno, A.; Rastrelli, F.; Saielli, G. J. Phys. Chem. A 2003, 107, 9964. (11) Tahtinen, P.; Bagno, A.; Klika, K. D.; Pihlaja, K. J. Am. Chem. Soc. 2003, 125, 4609. (12) Colombo, D.; Ferraboschi, P.; Ronchetti, F.; Toma, L. Magn. Reson. Chem. 2002, 40, 581. (13) Aiello, A.; Fattorusso, E.; Luciano, P.; Mangoni, A.; Menna, M. Eur. J. Org. Chem. 2005, 2005, 5024. (14) Oschsenfeld, C.; Kussmann, J.; Koziol, F. Angew. Chem., Int. Ed. 2004, 43, 4485. (15) Baldridge, K.; Siegel, J. S. J. Phys. Chem. A 1999, 103, 4038. (16) Ochsenfeld, C.; Kussmann, J.; Koziol, F. Angew. Chem., Int. Ed. 2004, 43, 4485. (17) Forsyth, D. A.; Sebag, A. B. J. Am. Chem. Soc. 1997, 119, 9483. (18) Smith, S. G.; Goodman, J. M. J. Am. Chem. Soc. 2010, 132, 12946. 12371

dx.doi.org/10.1021/jp2060975 |J. Phys. Chem. A 2011, 115, 12364–12372

The Journal of Physical Chemistry A (19) Smith, S. G.; Goodman, J. M. J. Org. Chem. 2009, 74, 4597. (20) Sarotti, A. M.; Pellegrinet, S. C. J. Org. Chem. 2009, 74, 7254. (21) Cheeseman, J. R.; Trucks, G. W.; Keith, T. A.; Frisch, M. J. J. Phys. Chem. 1996, 104, 5497. (22) Cimino, P.; Gomez-Paloma, L.; Duca, D.; Riccio, R.; Bifulco, G. Magn. Reson. Chem. 2004, 42, S26. (23) Tormena, C. F.; da Silva, G. V. J. Chem. Phys. Lett. 2004, 398, 466. (24) Wiitala, K. W.; Hoye, T. R.; Cramer, C. J. Chem. Theory Comput. 2006, 2, 1085. (25) Zhao, Y.; Truhlar, D. G. J. Phys. Chem. A 2008, 112, 6794. (26) Gauss, J. J. Chem. Phys. 1993, 99, 3629. (27) Gauss, J.; Stanton, J. F. J. Chem. Phys. 1995, 103, 3561. (28) Vignale, G.; Rasolt, M.; Geldart, D. J. W. Adv. Quantum Chem. 1990, 21, 235. (29) Malkin, V. G.; Malkina, O. L.; Salahub, D. R. Chem. Phys. Lett. 1993, 204, 80. (30) Shreckenbach, G.; Ziegler, T. J. Chem. Phys. 1995, 103, 606. (31) Rablen, P. R.; Pearlman, S. A.; Finkbiner, J. J. Phys. Chem. A 1999, 103, 7357. (32) Costa, F. L. P.; de Albuquerque, A. C. F.; dos Santos, F. M., Jr.; de Amorim, M. B. J. Phys. Org. Chem. 2009, 23, 972. (33) van Eikema Hommes, N. J. R.; Clark, T. J. Mol. Model. 2005, 11, 175. (34) Wu, A.; Zhang, Y.; Xu, X.; Yan, Y. J. Comput. Chem. 2007, 28, 2431. (35) Kupka, T.; Stachow, M.; Nieradka, M.; Kaminsky, J.; Pluta, T. J. Chem. Theory Comput. 2010, 6, 1580. (36) Jain, R.; Bally, T.; Rablen, P. R. J. Org. Chem. 2009, 74, 4017. (37) Bifulco, G.; Dambruoso, P.; Gomez-Paloma, L.; Riccio, R. Chem. Rev. 2007, 107, 3744. (38) Malkina, O. L.; Hricovini, M.; Bizik, F.; Malkin, V. G. J. Phys. Chem. A 2001, 105, 9188. (39) Roslund, M. U.; Klika, K. D.; Lehtila, R. L.; Tahtinen, P.; Sillanapaa, R.; Leino, R. J. Org. Chem. 2004, 69, 18. (40) Saielli, G.; Bagno, A. Org. Lett. 2009, 11, 1409. (41) Fattorusso, C.; Stendardo, E.; Appendino, G.; Fattorusso, E.; Luciano, P.; Romano, A.; Taglialatela-Scafati, O. Org. Lett. 2007, 9, 2377. (42) Fattorusso, E.; Luciano, P.; Romano, A.; Taglialatela-Scafati, O.; Appendino, G.; Borriello, M.; Fattorusso, C. J. Nat. Prod. 2008, 71, 1988. (43) Smith, S. G.; Paton, R. S.; Burton, J. W.; Goodman, J. M. J. Org. Chem. 2008, 73, 4053. (44) Cramer, C. Essentials of Computational Chemistry: Theories and Models, 2nd ed.; John Wiley and Sons Ltd.: Chichester, UK, 2004. (45) London, F. J. Phys. Radium 1937, 8, 397. (46) McWeeny, R. Phys. Rev. 1962, 126. (47) Ditchfield, R. Mol. Phys. 1974, 27, 789. (48) Wolinski, K.; Hilton, J. F.; Pulay, P. J. Am. Chem. Soc. 1990, 112, 8251. (49) Schindler, M.; Kutzelnigg, W. Mol. Phys. 1983, 48, 781. (50) Keith, T. A.; Bader, R. F. W. Chem. Phys. Lett. 1992, 194, 1. (51) Keith, T. A.; Bader, R. F. W. Chem. Phys. Lett. 1993, 210, 223. (52) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Mennucci, B.; Petersson, G. A.; Nakatsuji, H.; Caricato, M.; Li, X.; Hratchian, H. P.; Izmaylov, A. F.; Bloino, J.; Zheng, G.; Sonnenberg, J. L.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Vreven, T.; Montgomery, J., J. A.; Peralta, J. E.; Ogliaro, F.; Bearpark, M.; Heyd, J. J.; Brothers, E.; Kudin, K. N.; Staroverov, V. N.; Kobayashi, R.; Normand, J.; Raghavachari, K.; Rendell, A.; Burant, J. C.; Iyengar, S. S.; Tomasi, J.; Cossi, M.; Rega, N.; Millam, N. J.; Klene, M.; Knox, J. E.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.; Martin, R. L.; Morokuma, K.; Zakrzewski, V. G.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Dapprich, € Foresman, J. B.; Ortiz, J. V.; Cioslowski, J.; S.; Daniels, A. D.; Farkas, O.; Fox, D. J. Gaussian 09, Revision A.1; Gaussian, Inc.: Wallingford, CT, 2009.

ARTICLE

(53) Becke, A. D. J. Chem. Phys. 1993, 98, 5648. (54) Lee, C.; W., Y.; Parr, R. Phys. Rev. B 1988, 37, 785. (55) Hehre, W. J.; Ditchfield, R.; Pople, J. A. J. Chem. Phys. 1972, 56, 2257. (56) Hariharan, P. C.; Pople, J. A. Theor. Chim. Acta 1973, 28, 213. (57) Frisch, M. J.; Pople, J. A. J. Chem. Phys. 1984, 80, 3265. (58) Ditchfield, R.; Hehre, W. J.; Pople, J. A. J. Chem. Phys. 1971, 54, 724. (59) Hariharan, P. C.; Pople, J. A. Mol. Phys. 1974, 27, 209. (60) Gordon, M. S. Chem. Phys. Lett. 1980, 76, 163. (61) Francl, M. M.; Pietro, W. J.; Hehre, W. J.; Binkley, J. S.; DeFrees, D. J.; Pople, J. A.; Gordon, M. S. J. Chem. Phys. 1982, 77, 3654. (62) McLean, A. D.; Chandler, G. S. J. Chem. Phys. 1980, 72, 5639. (63) Raghavachari, K.; Binkley, J. S.; Seeger, R.; Pople, J. A. J. Chem. Phys. 1980, 72, 650. (64) Dunning, T. H., Jr. J. Chem. Phys. 1989, 90, 1007. (65) Kendall, R. A.; Dunning, T. H., Jr.; Harrison, R. J. J. Chem. Phys. 1992, 96, 6796. (66) Woon, D. E.; Dunning, T. H., Jr. J. Chem. Phys. 1993, 98, 1358. (67) Peterson, K. A.; Woon, D. E.; Dunning, T. H., Jr. J. Chem. Phys. 1994, 100, 7410. (68) Miertus, S.; Scrocco, E.; Tomasi, J. Chem. Phys. 1981, 55, 117. (69) Miertus, S.; Tomasi, J. Chem. Phys. 1982, 65, 239. (70) Pascual-Ahuir, J. L.; Silla, E.; Tu~ non, I. J. Comput. Chem. 1994, 15, 1127. (71) Cossi, M.; Barone, V.; Cammi, R.; Tomasi, J. Chem. Phys. Lett. 1996, 255, 327. (72) Barone, G.; Cossi, M.; Tomasi, J. J. Chem. Phys. 1997, 107, 3210. (73) Cances, E.; Mennucci, B.; Tomasi, J. J. Chem. Phys. 1997, 107, 3032. (74) Mennucci, B.; Tomasi, J. J. Chem. Phys. 1997, 106, 5151. (75) Mennucci, B.; Cances, E.; Tomasi, J. J. Phys. Chem. B 1997, 101, 10506. (76) Barone, G.; Cossi, M. J. Phys. Chem. A 1998, 102, 1995. (77) Cossi, M.; Barone, G.; Mennucci, B.; Tomasi, J. Chem. Phys. Lett. 1998, 286, 253. (78) Barone, G.; Cossi, M.; Tomasi, J. J. Comput. Chem. 1998, 19, 404. (79) Cammi, R.; Tomasi, J.; Mennucci, B. J. Phys. Chem. A 1999, 103, 9100. (80) Tomasi, J.; Mennucci, B.; Cances, E. J. Mol. Struct.: Theochem 1999, 464, 211. (81) Cossi, M.; Rega, N.; Scalmani, G.; Barone, V. J. Chem. Phys. 2001, 114, 5691. (82) Cossi, M.; Scalmani, G.; Rega, N.; Barone, G. J. Chem. Phys. 2002, 117, 43. (83) Scalmani, G.; Frisch, M. J. J. Chem. Phys. 2010, 132, 114110. (84) Zhao, Y.; Truhlar, D. G. J. Chem. Phys. 2006, 125, 194101: 1. (85) Van Voorhis, T.; Scuseria, G. E. J. Chem. Phys. 1998, 109, 400. (86) Zhao, Y.; Truhlar, D. G. Theor. Chem. Acc. 2008, 120, 215. (87) Handy, N. C.; Cohen, A. J. Mol. Phys. 2001, 99, 403. (88) Hoe, W.-M.; Cohen, A.; Handy, N. C. Chem. Phys. Lett. 2001, 341, 319. (89) Perdew, J. P.; Burke, K.; Ernzerhof, M. Phys. Rev. Lett. 1996, 77, 3865. (90) Perdew, J. P.; Burke, K.; Ernzerhof, M. Phys. Rev. Lett. 1997, 78, 1396. (91) Boese, A. D.; Martin, J. M. L. J. Chem. Phys. 2004, 121, 3405. (92) Fulmer, G. R.; Miller, A. J. M.; Sherden, N. H.; Gottlieb, H. E.; Nudelman, A.; Stoltz, B. M.; Bercaw, J. E.; Goldberg, K. I. Organometallics 2010, 29, 2176. (93) Giesen, D.; Zumbulyadis, N. Phys. Chem. Chem. Phys. 2002, 4, 5498.

12372

dx.doi.org/10.1021/jp2060975 |J. Phys. Chem. A 2011, 115, 12364–12372