Fragment-Based Approach for the Evaluation of NMR Chemical Shifts

Feb 14, 2017 - (79) These following procedures are implemented through an external Perl script interfacing with G09. 1. ..... comparison of isotropic ...
0 downloads 15 Views 4MB Size
This is an open access article published under an ACS AuthorChoice License, which permits copying and redistribution of the article or any adaptations for non-commercial purposes.

Article pubs.acs.org/JCTC

Fragment-Based Approach for the Evaluation of NMR Chemical Shifts for Large Biomolecules Incorporating the Effects of the Solvent Environment K. V. Jovan Jose† and Krishnan Raghavachari* Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States S Supporting Information *

ABSTRACT: We present an efficient implementation of the molecules-in-molecules (MIM) fragment-based quantum chemical method for the evaluation of NMR chemical shifts of large biomolecules. Density functional techniques have been employed in conjunction with large basis sets and including the effects of the solvent environment in these calculations. The MIM-NMR method is initially benchmarked on a set of (alanine)10 conformers containing strong intramolecular interactions. The incorporation of a second low level of theory to recover the missing long-range interactions in the primary fragmentation scheme is critical to yield reliable chemical shifts, with a mean absolute deviation (MAD) from direct unfragmented calculations of 0.01 ppm for 1H chemical shifts and 0.07 ppm for 13C chemical shifts. In addition, the performance of MIM-NMR has been assessed on two large peptides: the helical portion of ubiquitin (1UBQ) containing 12 residues where the X-ray structure is known, and E6-binding protein of papilloma virus (1RIJ) containing 23 residues where the structure has been derived from solution-phase NMR analysis. The solvation environment is incorporated in these MIM-NMR calculations, either through an explicit, implicit, or a combination of both solvation models. Using an explicit treatment of the solvent molecules within the first solvation sphere (3 Å) and an implicit solvation model for the rest of the interactions, the 1 H and 13C chemical shifts of ubiquitin show excellent agreement with experiment (mean absolute deviation of 0.31 ppm for 1H and 1.72 ppm for 13C), while the larger E6-binding protein yields a mean absolute deviation of 0.34 ppm for 1H chemical shifts. The proposed MIM-NMR method is computationally cost-effective and provides a substantial speedup relative to conventional full calculations, the largest density functional NMR calculation included in this work involving more than 600 atoms and over 10,000 basis functions. The MIM-NMR solvation protocols developed in this work may pave the way for very accurate de novo prediction of NMR chemical shifts of a range of large biomolecules in the future. or GIAO−CCSD(T)24 have been successfully applied for small molecules, DFT-based25−27 methods are more cost-effective for large molecules such as polypeptides.28−32 Incorporating the solvation environment of the protein is vital in improving the accuracy of the gas-phase NMR chemical shifts.33 While early attempts to simulate the solvation environment used point charges, implicit solvation models such as the polarizable continuum model (PCM) are being increasingly employed to account for the electrostatic environment in these proteins.34−36 However, within such a description, the structural details of the solvation environment, especially the strong hydrogen bonding within the first solvation sphere, are neglected. QM/MM approaches have also been used to incorporate the solvent environment in NMR calculations,37,38 though the accuracy of the chemical shifts may be dictated by the reliability of the force fields.39,40 An explicit treatment of solvent molecules may be inevitable for a precise description of the first solvation sphere and has been shown to yield

1. INTRODUCTION Nuclear magnetic resonance (NMR) spectroscopy is a powerful technique to understand the structural and dynamical processes of a broad range of biomolecular systems.1,2 In particular, NMR provides complementary information to X-ray crystallography and chiroptical spectroscopy for three-dimensional structure elucidations. However, with the number of protein residues, the spectral assignment of NMR spectral information can be challenging, and theoretical methods could assist in interpreting and assigning experimentally measured chemical shifts.3−6 While empirical7,8 approaches are most commonly used in making assignments of NMR chemical shifts for large systems, ab initio and density functional theoretical (DFT) approaches are finding increasing use.9−21 Typically, GIAO (gauge including atomic orbitals) or related approaches are used to ensure the origin independence of the computed chemical shifts. An accurate theoretical prediction of NMR chemical shifts necessitates a reliable level of theory, large basis sets, and the incorporation of solvation effects in the calculations. While chemical shifts evaluated at GIAO-MP222 and GIAO−CCSD23 © XXXX American Chemical Society

Received: September 20, 2016

A

DOI: 10.1021/acs.jctc.6b00922 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

NMR chemical shifts of large molecules, incorporating accurate levels of theory, large basis sets, and the solvent environment. Section 2 of this work describes the steps involved in the evaluation of MIM-NMR isotropic shielding tensors and chemical shifts. Section 3.1 presents a benchmark study on a set of poly(alanine)10 conformations containing strong intramolecular interactions. We present illustrative applications comparing the MIM-NMR chemical shifts with experimental spectra for two large peptide molecules, ubiquitin (1UBQ) in section 3.2.1 and E6-binding protein of papilloma virus in section 3.2.2. The final conclusions are summarized in section 4.

significant improvement in the accuracy of chemical shifts for small molecules.41−44 However, incorporating explicit solvent molecules can be computationally expensive, particularly for large molecules such as polypeptides. In recent years, fragment-based methods have been employed to overcome the high computational scaling associated with accurate theoretical methods and to assist in a fast evaluation of energies and properties of molecules.45−47 Although there are a plethora of fragment-based methods for evaluating the energies, only a few, viz., fragment molecular orbital (FMO),48−50 automated fragmentation QM/MM approach (AF-QM/MM),51,52 combined fragmentation method (CFM),53,54 charge field perturbed method (CFP),55,56 systematic molecular fragmentation (SMF),57 adjustable density matrix assembler (ADMA),58 density-fitted local second-order Møller−Plesset perturbation theory method,59 and recent work on molecular crystals by Beran and coworkers60,61 have extended their approach for the calculation of NMR chemical shifts. In order to improve the accuracy of the fragment-based methods, hybrid energy schemes,55,56 which combine two or more computational methods in one calculation, can be used to accurately explore the chemistry of large systems.62−70 ONIOM is the simplest hybrid method to study a truncated active region of a large molecule.71−73 In order to develop a more sophisticated method that can treat the whole system at an appropriate quantum chemical level of theory, we have developed an extended model called molecules-in-molecules (MIM).74 Recently, MIM method has been used for the precise determination of the atomic forces, and infrared,75 Raman,76 vibrational circular dichroism,77 and Raman optical activity78 spectra of large molecules. The relatively small sizes of the subsystems evaluated at the high level of theory and the incorporation of long-range corrections using an efficient low level of theory (vide inf ra) make this method a promising tool for evaluating higher energy derivatives for large molecules with quantum chemical methods. In order to reproduce and reliably assign the experimental spectra, fairly large basis sets including diffuse and polarization functions are needed. MIM and locally dense basis functions offer alternative approaches to treat the need for large basis function. In the approach using locally dense basis functions,29 different elements are described by different quality of basis sets, whereas, in the MIM scheme, layers of subsystems are accounted using different combinations of levels of theory and basis sets. Thus, MIM has the additional flexibility in choosing the combinations of levels of theory as appropriate to achieve reliable performance. In addition to using large basis sets, accounting for all the long-range interactions is very important in reproducing the molecular properties of large system.43 In several of the previous approaches, the local environment of each fragment is accounted for via an embedded point charge treatment to yield 1 H, 13C, 15N, and 17O chemical shifts. However, the chemical shifts can be very sensitive to the choice of the environmental partial charges, potentially limiting the accuracy of such methods. The MIM model provides a complementary approach by including all the long-range interactions in a low-level calculation. Additionally, the MIM model is flexible enough to choose the appropriate combinations of levels of theory and basis functions to achieve balanced results. In the present work, we have adapted the MIM method for the precise evaluation of

2. THEORY AND IMPLEMENTATION This section discusses the procedure involved in the implementation of MIM-based evaluation of NMR isotropic shielding tensors and chemical shifts. All the actual and MIM calculations in this work are performed with the Gaussian (G09) package.79 These following procedures are implemented through an external Perl script interfacing with G09. 1. The MIM algorithm used in this work begins with constructing fragments from the large peptide molecule by cutting C−C bonds (vide inf ra). Each fragment interacts with neighboring fragments (using the connectivity information) via a number-based scheme (vide inf ra) to generate overlapping subsystems. All the dangling bonds in the subsystems are saturated with link-hydrogen atoms as in the ONIOM model.80 2. In the two-layer MIM model (MIM2), any missing longrange interactions are taken into account by employing a low level of theory. The total energies and higher energy derivatives are constructed from three separate sets of calculations, denoted as real-low (Erl), model-low (Eml), and model-high (Emh), respectively, using a generalized ONIOM-like hybrid extrapolation expression. ETotal = Erl − Eml + Emh

(1)

Here, the overcounted regions from the overlapping subsystems are corrected at the model-high and model-low levels of theory through the inclusion−exclusion principle. Eml / mh =

∑ Eli/ h − ∑ Eli/∩hj + ... + (−1)n− 1 ∑ Eli/∩hj ∩ k... ∩ n i

ij

n

(2)

Eil/h

represents the energies of primary-, secondary-, and n-body overlapping subsystems, respectively. 3. In the case of the isolated smaller peptides, molecular geometries are obtaind by performing a MIM-based optimization procedure.75 For the larger ubiquitin and E6-binding peptide systems (vide inf ra), the geometry resulting from a partial optimization starting from the experimentally determined X-ray or NMR structure was used. 4. The essential parameters for evaluating the NMR spectra are the isotropic shielding tensors. The isotropic shielding tensor σN of atom N is given as the second-order response of the electronic energy, E, with respect to external magnetic field, ⎡ ∂ 2E ⎤ B, and nuclear magnetic moment, mN, σijN = ⎢ ∂B ∂m ⎥ ⎣ i Nj ⎦ B = 0 N Here, σij is the ijth component of the shielding tensor, Bi is the ith component of the external magnetic field, and mNj is the jth component of the magnetic moment of nucleus N. Within the GIAO method, field-dependent atomic basis functions are employed to ensure the gauge-invariance of the NMR shielding B

DOI: 10.1021/acs.jctc.6b00922 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation tensors. In MIM2, the atomic shielding constants for the full set of atoms are gathered based on the general expression σijN =

∂ 2ETotal ∂ 2Erl ∂ 2Eml ∂ 2Emh = − + ∂Bi ∂m Nj ∂Bi ∂m Nj ∂Bi ∂m Nj ∂Bi ∂m Nj

(3)

5. The atomic NMR isotropic shielding constant (σi) is defined as one-third of the sum of the trace of the atomic shielding tensor from eq 3. σi is subtracted from the corresponding atomic contribution in the standard reference molecule (σref) to yield the chemical shift. For 1H and 13C, the chemical shift (δi) is obtained by subtracting the isotropic shielding constants of the nuclei in the molecule of interest from those of tetramethylsilane (TMS, σref) as δi = σref − σi 15

(4) 17

Similarly, N and O chemical shifts are referenced relative to the corresponding 15N atom in NH3 and 17O atom in H2O molecules, respectively. 6. In the MIM-NMR calculations, the solvent environment on the NMR chemical shifts can be incorporated through the general expression. Solvent Solvent Solvent ETotal = ErlSolvent − Eml + Emh

(5)

The solvent environment can be incorporated either through implicit, explicit, or a combination of both solvation models. We have explored all three approaches in our present work. We have employed the SMD-SCRF81 solvation model for incorporating the implicit solvation environment. In the case of the larger peptide systems, we have also developed protocols for selecting the explicit solvent molecules (vide inf ra).

Figure 1. Structures of 310-(Ala)10, α-(Ala)10, γ-(Ala)10, and β-(Ala)10 isomers employed for benchmarking the MIM-NMR method.

tetrapeptide model is needed to capture the intramolecular hydrogen-bonding interactions within the high level of theory in the α-helix while the tripeptide model is sufficient to capture such interactions in the other conformers. The fragment-based models described thus far use a single level of theory (MIM1). Long-range interactions are missing in the MIM1 model, and the predicted NMR chemical shifts may not be reliable. However, in the MIM2 model, all the missing interfragment interactions are included via a real-low calculation on the full molecule, and the predicted NMR chemical shifts (vide inf ra) show excellent agreement with experiment. In particular, the residues that are distant in sequence but close in space are only included in MIM2 in the current work. We have compared the performance of the MIM1[MPW1PW91/6-311G(d,p)] model with MIM2 using four different low levels of theory: HF/6-31G, HF/6-31G(d), MPW1PW91/6-31G, and MPW1PW91/6-31G(d). The geometries of the four conformations were first optimized using the two-layer MIM model: MIM2[MPW1PW91/6-311G(d,p):HF/6-31G] using procedures described previously.66 These geometries were then used to test the performance of the different models. The mean absolute deviations (MADs) in the calculated NMR chemical shifts (for all NMR-active nuclei) using both tripeptide and tetrapeptide models are shown in Figure 2. It is clear from Figure 2 that the single-layer MIM1 model has significant deviations, and that the inclusion of longrange effects through the second layer improves the results substantially. The largest deviations are seen for the α-helix since it requires the largest fragments to capture the hydrogenbonding interactions. As expected, tetrapeptide models work substantially better than tripeptide models. Using a second layer at the MPW1PW91/6-31G or HF/6-31G(d) provides a cost-effective approach to yield quantitatively accurate results (vide inf ra).

3. RESULTS AND DISCUSSION 3.1. Calibration of MIM-NMR Chemical shifts for (Alanine)10 Conformers. To calibrate the performance of MIM-NMR, we have evaluated the NMR chemical shifts of four conformational isomers of (alanine)10. They represent common secondary structures seen in many polypeptides: two different helical conformers (310- and α-helices), the γ-turn, and the extended β-strand conformer, depicted in Figure 1. The different conformers have different intramolecular interactions resulting in different networks of hydrogen bonds, with the two helical conformers having different pitches. For a given number of residues as in (alanine)10, the 310-helix has tighter turns and, hence, is longer than the α-helix, while the β-strand and the γturn conformers are even more extended. The different intramolecular interactions require different fragment sizes to capture them, potentially resulting in imbalances in their treatment with fragment-based methods. We have carried our reference calculations on each conformer using the MPW1PW91/6-311G(d,p) model to calculate the isotropic magnetic shielding tensors and the associated NMR chemical shifts at all the NMR-active nuclei. A variety of fragment-based models were then calibrated relative to these reference calculations to assess their performance. The fragments were first generated by cutting only C−C single bonds along the backbone and side chains in the peptide, and overlapping primary subsystems were generated using two different models. In the smaller tripeptide model, the interactions involving three fragments along the bond connectivity of the molecule were included to generate the primary subsystems. In the larger tetrapeptide model, interactions involving four fragments were used. The C

DOI: 10.1021/acs.jctc.6b00922 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation 1

H (blue) NMR spectra of the four conformational isomers of (alanine)10 are depicted in Figure S1 and that of 13C (black) and 15N (red) in Figure S2. The performance data for the MIM2[MPW1PW91/6311G(d,p):HF/6-31G(d)] model are listed in Table 1. They show a MAD of less than 1 kcal/mol in the calculated total energies using the tetrapeptide model, and 0.01 and 0.07 ppm MAD for 1H and 13C chemical shifts, respectively. These values are around 3% of the target accuracy in the calculated NMR chemical shifts of 0.3 ppm for 1H and 2−3 ppm for 13C. Thus, the uncertainties resulting from the fragmentation scheme are very small relative to the quantities being calculated, allowing the study of large systems using MIM-NMR. 3.2. Comparison of MIM-NMR Chemical Shifts of Large Peptides with Experiment. In the MIM-NMR calibration for (alanine)10 considered above, we have used molecular geometries determined from a full MIM2 optimization. However, many of the larger peptides exist in an environment where free unconstrained optimization is likely to yield unphysical geometries. In this section, we have selected two larger systems where geometries have been derived from experimental data. The two most common experimental techniques to derive peptide structural information are X-ray diffraction and solution-phase NMR. They both have their advantages and drawbacks. The X-ray structures typically show the positions of the solvating water molecules but do not yield the positions of the hydrogens. Structures derived from NMR show the hydrogens in the molecule but do not include the positions of the solvent molecules. Moreover, NMR-derived structures are not unique but typically include 10−50

Figure 2. Mean absolute deviation (MAD) in MIM-NMR chemical shift (ppm) with tripeptide (top) and tetrapeptides (bottom) models for (alanine)10 isomers relative to the full calculation at MPW1PW91/ 6-311G(d,p). Also see results in Table 1.

The calculated isotropic magnetic shielding tensors (ppm) at the MIM2[MPW1PW91/6-311G(d,p):HF/6-31G(d)] model are compared with those from the corresponding full calculations in Figure 3 for (A) 1H, (B) 13C, (C) 15N, and (D) 17O. The fitted lines in all the 1H, 13C, 15N, and 17O graphs show a correlation coefficient ∼1, with y-axis intercept zero. These graphs show that the MIM method reproduces the NMR magnetic isotropic shielding tensors evaluated at actual (i.e., unfragmented) high-level calculations. The molecular geometries of all these test systems along with the NMR spectra at the MIM1 and the four different MIM2 levels of theory are reported in the Supporting Information (Tables S1−S4). The

Figure 3. Comparison of (A) 1H, (B) 13C, (C) 15N, and (D) 17O isotropic magnetic shielding tensors (ppm) evaluated at actual [MPW1PW91/6311G(d,p)] and MIM2[MPW1PW91/6-311G(d,p):HF/6-31G(d)]. In each graph, 310-(Ala)10 is in black, α-(Ala)10 in red, γ-(Ala)10 in green, and β(Ala)10 in blue. D

DOI: 10.1021/acs.jctc.6b00922 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

Table 1. Structure Number, Symbol, Number of Basis Functions (NBasis), Actual Energies (hartrees) at MPW1PW91/6311G(d,p), Calculated Errors in MIM2[MPW1PW91/6-311G(d,p):HF/6-31G(d)] Level (kcal/mol), and Maximum and Mean Absolute Deviations in the MIM2 Chemical Shift (ppm) for (Ala)10 Isomers energy no.

syst

NBasis

Act

E

maximum deviation ΔE

2L

1

13

H

C

I II III IV

310-(Ala)10 α-(Ala)10 γ-(Ala)10 β-(Ala)10 average

1332 1332 1332 1332

−2682.62946 −2682.61988 −2682.61458 −2682.60293

6.020 5.638 1.507 0.501 3.417

Tripeptides 0.08 0.26 0.15 0.62 0.02 0.14 0.01 0.18 0.07 0.30

I II III IV

310-(Ala)10 α-(Ala)10 γ-(Ala)10 β-(Ala)10 average

1332 1332 1332 1332

−2682.62946 −2682.61988 −2682.61458 −2682.60293

0.979 0.672 0.458 0.321 0.608

Tetrapeptides 0.03 0.22 0.05 0.50 0.01 0.09 0.01 0.09 0.02 0.23

15

N

mean absolute deviation 17

O

1

H

13

C

0.51 3.90 0.21 0.31 1.23

5.31 6.53 0.84 0.30 3.25

0.03 0.04 0.01 0.01 0.02

0.14 0.25 0.06 0.08 0.13

0.32 0.69 0.13 0.16 0.33

0.67 1.68 0.26 0.19 0.70

0.01 0.02 0.00 0.00 0.01

0.08 0.14 0.03 0.03 0.07

15

N

17

O

total

0.25 1.08 0.15 0.23 0.43

3.70 3.20 0.62 0.18 1.93

0.47 0.77 0.10 0.07 0.35

0.20 0.27 0.09 0.12 0.17

0.45 0.71 0.19 0.15 0.38

0.10 0.13 0.04 0.05 0.05

Figure 4. Experimental structure of Ubiquitin (1D3Z, 1XQQ, and 1UBQ) employed for benchmarking the MIM-NMR method. A detailed analysis of MIM2[MPW1PW91/6-311++G(2d,2p):MPW1PW91/6-31G] 1H and 13C chemical shifts with experiment is furnished in Figures 5 and 6.

geometries that are all consistent with the available NMR data. In order to showcase the strength of MIM-NMR, we have investigated two different systems in this section: (1) the helical portion of ubiquitin where the X-ray structure is available and (2) the E6-binding protein of papilloma virus where only NMR-derived structures are available. 3.2.1. Comparison of the MIM-NMR Spectra of Ubiquitin with Experiment. First we consider NMR chemical shifts of the helical portion of the uiquitin protein molecule. The molecular coordinates of ubiquitin have been derived from the X-ray crystallographic data (PDB structure 1UBQ) or NMR (1D3Z and 1XQQ) and are depicted in Figure 4. The X-ray structure also includes the positions of the oxygen atoms from the solvent water molecules around the protein. The experimental NMR chemical shifts (BMRB 5387) have been reported in neutral pH in water. In the MIM calculations we have considered the 12 peptide residues in the helical region from 23 to 34, viz., ILE, GLU, ASN, VAL, LYS, ALA, LYS, ILE, GLN, ASP, LYS, and GLU. In addition, in the explicit solvation model (vide inf ra), 18 water molecules within the first solvation sphere (3.0 Å) have been included. These explicit water molecules correspond to the entrapped water molecules within the 1UBQ crystal structure, though it is likely that not all the water molecules are retained in the crystal structure. The initial geometry was generated by saturating the dangling bonds of the

heavy (i.e., non-hydrogen) atoms of the experimental X-ray structure with the Rosetta Program.82 From this geometry, the water molecules closest to the protein are selected based on a 3.0 Å distance cutoff. In the MIM-NMR procedure, all the residues are treated in their neutral forms, and the positions of all the hydrogen atoms, including those in the solvent water molecules, are determined through a constrained optimization after fixing the positions of the heavy atoms, at the MPW1PW91/6-31G level. The isotropic magnetic shielding tensors and the corresponding chemical shifts are evaluated at the MIM2[MPW1PW91/6-311++G(2d,2p):MPW1PW91/631G] level at this optimized geometry. The full NMR calculation with 12 peptide residues along with 18 water molecules at the target MPW1PW91/6-311++G(2d,2p) level of theory is computationally expensive, containing 259 atoms and 4,562 Gaussian basis functions. In the MIM-NMR procedure, the initial fragments are generated by breaking only the C−C bonds in the ubiquitin molecule, both along the backbone and the side chain. We have used a tetrapeptide model considered in the previous section where each primary subsystem involves four fragments along the connectivity network. Figure 5A depicts a comparison of experiment (BMRB 5387) with MIM2-gas-phase NMR chemical shifts (i.e., without the solvent molecules). The correlation graph for 1H shows some large deviations compared E

DOI: 10.1021/acs.jctc.6b00922 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

Figure 5. Comparison of experimental 1H NMR spectrum of ubiquitin (1UBQ) molecule with neutral residues at (A) MIM-gas-phase, (B) MIMimplicit, (C) MIM-explicit, and (D) MIM-explicit-implicit models at the MIM2[MPW1PW91/6-311++G(2d,2p):MPW1PW91/6-31G] level. The MIM 1H NMR chemical shifts are depicted in reference to tetramethyl silane (TMS).

Figure 6. Comparison of experimental 13C NMR spectrum of ubiquitin (1UBQ) molecule with neutral residues at (A) MIM-gas-phase, (B) MIMimplicit, (C) MIM-explicit, and (D) MIM-explicit-implicit models at the MIM2[MPW1PW91/6-311++G(2d,2p):MPW1PW91/6-31G] level. The MIM 13C NMR chemical shifts are depicted in reference to tetramethyl silane (TMS).

F

DOI: 10.1021/acs.jctc.6b00922 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

Figure 7. Illustration of the steps involved in the explicit MIM solvation model. (A) 1RIJ molecule inside an explicit solvated 40 × 40 × 40 Å3 cube and (B) 1RIJ protein and the explicit solvated water molecules selected within 3.0 Å inside the primary solvation sphere. The comparison of MIM-1H NMR chemical shift with experiment is depicted in Figure 8.

with deviations of 0.2−0.4 ppm. In a later study,43 the same authors found that an explicit solvation model was needed to provide an accurate description of the amide chemical shifts and obtained an root-mean-square (RMS) deviation of 0.59 ppm. Interestingly, the best performance for the amide protons in that study was achieved for the ShiftX2 empirical model that yields an RMS deviation of 0.28 ppm with a correlation coefficient of 0.927.43 Our MIM-NMR model furnishes a mean absolute deviation of 0.31 ppm including all types of protons, and the value obtained is indeed promising. Moreover, from the initial benchmark study on polyalanine, it is clear that accuracy is not lost due to the fragmentation schemes in our MIM approach. Panels A−D of Figures 6 depict the corresponding theoretical−experimental correlation graph for 13C NMR chemical shifts. The MAD in 13C NMR chemical shifts, which span a much larger range, are (A) 3.22, (B) 2.43, (C) 2.88, and (D) 2.07 ppm, respectively. Among the different chemical shifts, slightly larger than average deviations are seen for carboxyl carbon (CO) and Cα carbon. The correlation coefficients of the spectrum are (A) 0.9985, (B) 0.9987, (C) 0.9989, and (D) 0.9999, respectively. A comparison of these NMR chemical shifts in Figure 5 and Figure 6 shows that incorporation of solvation effects through the MIM-explicitimplicit model improves the accuracy of both 1H and 13C chemical shifts, though the most visually striking results are seen for the amide 1H chemical shifts. Similar to 1H MIMNMR chemical shift, we have performed a linear fit for 13C chemical shift and it resulted in a MAD of (A) 2.38, (B) 2.10, (C) 2.15, and (D) 1.72 ppm, respectively. 15 N chemical shifts in molecules such as ubiquitin are known to be challenging36 and can also provide good tests of the reliability of the theoretical calculations. We have followed a similar fitting scheme for 15N chemical shifts, and the mean absolute deviations in the different independent sets are (A) 3.77, (B) 3.54, (C) 3.48, and (D) 3.48 ppm, respectively. The effect of the solvation environment on the 15N chemical shift is considerably smaller than in the calculation of the 1H chemical

to experiment, especially, in the amidic hydrogen (OC− NH), while the remaining chemical shifts show fairly good agreement with the experiment. The mean deviation from experiment for the amide region in the 6−10 ppm range is 1.00 ppm while the deviation in the 0−6 ppm range is smaller, 0.42 ppm (overall deviation, 0.51 ppm). This suggests that modeling the solvation environment of the amide groups is vital in the MIM-NMR calculations. We have incorporated the solvation effects for 1UBQ through three separate models: MIM-implicit using the SMD continuum solvation model, MIM-explicit including 18 solvent water molecules seen in the X-ray structure within 3 Å, and MIM-explicit-implicit where the ubiquitin and the 18 explicit waters are embedded in the SMD continuum model. A full comparison of the 1H NMR experimental and MIM chemical shifts for 1UBQ are depicted in Figure 5A−D. They represent MIM-gas-phase, MIM-implicit, MIM-explicit, and MIM-explicit-implicit models, respectively. The wine-red lines in Figure 5A−D are the 45° diagonal of the graph. The black line shows the best fitted line for the data with each model. The MAD in 1 H NMR chemical shifts for the entire range of chemical shifts are (A) 0.51, (B) 0.43, (C) 0.42, and (D) 0.33 ppm, respectively. The largest improvement occurs for the amide modes where the corresponding deviations are 1.00, 0.64, 0.41, and 0.27 ppm, respectively. It is clear that the explicit water molecules in the first solvation shell are needed, but embedding them in the implicit model gives the best performance. The MIM-explicit-implicit model yields similar deviations in the 0− 6 ppm (0.34 ppm) and 6−10 ppm (0.27 ppm). The best linearfitted line (black) shows a correlation coefficient (cc) of (A) 0.945, (B) 0.968, (C) 0.969, and (D) 0.986 for the 1H NMR chemical shifts. Based on the best linear fit, a comparison of the MIM-NMR chemical shift with experiment results in a MAD of (A) 0.52, (B) 0.40, (C) 0.42, and (D) 0.31 ppm, respectively. MIM-NMR results can be compared to those from previous studies on ubiquitin. In a AF-QM/MM study using implicit solvation,36 Zhu et al. obtained non-amide 1H chemical shifts G

DOI: 10.1021/acs.jctc.6b00922 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

Figure 8. Comparison of experimental 1H NMR spectrum of PDB 1RIJ molecule with (A) MIM-gas-phase, (B) MIM-implicit, (C) MIM-explicit, and (D) MIM-explicit-implicit models at the MIM2[MPW1PW91/6-311++G(2d,2p):MPW1PW91/6-31G] level. The MIM 1H NMR chemical shifts are depicted in reference to tetramethyl silane (TMS).

shifts. Our calculated 15N chemical shifts appear to have significantly smaller deviations than the values obtained previously for ubiquitin.36 An analysis of the NMR spectra using the NMR-derived structures 1XQQ and 1D3Z is depicted in the Supporting Information in Figures S4 and S5, respectively. The best linear fit for 1H chemical shifts are (A) 0.53, (B) 0.43, (C) 0.51, and (D) 0.42 ppm, respectively. Similarly, for the 13C chemical shifts are (A) 3.18, (B) 2.98, (C) 3.23, and (D) 2.91 ppm, respectively. We have only considered the gas-phase and implicit solvation models for these structures, and the deviations are larger than for 1UBQ. While the correlation coefficients show a trend similar to that in the neutral structures, the deviations are larger for the charged residues. The NMR-MIM calculations reduce the computational effort in treating explict water molecules at a high level of theory by a substantial amount. We note that in the MIM-explicit model, the water molecules closest to a fragment are incorporated as a part of that fragment. Hence, the number of fragments remains the same as that in the MIM-gas-phase calculations. The improvements in the MAD and the correlation coefficient in Figures 5D and 6D suggest the best method for an accurate prediction of NMR chemical shifts: an explicit solvation model for the first solvation sphere and an implicit model for the rest of the molecule. The overall mean absolute deviations of 0.31 ppm for 1H chemical shifts and 1.72 ppm for 13C chemical shifts are in the same range as the best values in the previous literature for such large peptide systems. 3.2.2. Comparison of MIM-NMR Chemical Shifts of E6Binding Protein with Experiment. For the final benchmarking of the MIM-NMR chemical shifts with experiment, we have

selected E6-binding protein of papilloma virus (PDB 1RIJ, BMRB 6088) with 23 amino acid residues (Figure 7). We have taken into account all 23 residues in the MIM-NMR theoretical calculation, viz, ALA, LEU, GLN, GLU, LEU, LEU, GLY, GLN, TRP, LEU, LYS, ASP, GLY, GLY, PRO, SER, SER, GLY, ARG, PRO, PRO, PRO, and SER. The initial fragments are again constructed by cutting only the C−C bonds, and none of the heteroatom bond or rings in the systems are cut. The primary subsystems are constructed from these fragments based on their connectivity information using a tetrepeptide model discussed earlier. Unlike ubiquitin where the X-ray structure is known, only the NMR-derived structures (30 conformers) are available for 1RIJ. This protein is in the zwitterionic state, and the individual fragments are either neutral or positively or negatively charged. Solvation interactions, clearly critical for this structure, are not available from NMR and have to be obtained from computational models (vide inf ra). We have evaluated the relative energies at the MIM2 level for all 30 conformers using the same fragmentation scheme. In the gas phase, these conformers occur in a range within 2.5−10.0 kcal/mol above the most stable conformer. We have performed the MIM-NMR calculations only on the most energetically favorable conformer. The MIM2[MPW1PW91/6-311++G(2d,2p):HF/6-31G] 1H NMR chemical shifts for the most stable conformer evaluated in the gas phase (Figure 8A do not reproduce the experimental chemical shifts accurately. As in the case of ubiquitin, the amidic hydrogens deviate significantly from experiment due to the lack of solvation effects. The experimental NMR chemical shifts are evaluated in water, close to neutral pH. We have developed the following step-by-step protocol for the H

DOI: 10.1021/acs.jctc.6b00922 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

chemical shifts show excellent agreement with the experiment, and the accuracy of the method shows that similar high-quality quantum mechanical results can be obtained on large molecules.83 With potentially increasing the applicability of NMR spectroscopy for the determination of the structure and dynamics of proteins, the MIM-NMR-explicit-implicit method may assist in de novo protein structure predictions in the future.84,85

construction of the explicit solvation sphere as depicted in Figure 7A,B. All 23 amino acid residues are solvated with explicit water molecules by defining a box of dimension 40 × 40 × 40 Å3 around the molecule, employing the Gromacs package.69,70 A constrained MD simulation was carried out under standard conditions (1 bar and 273 K) for 20,000 ps using a 1 fs time step. In the MD simulations, only the solvent molecules are allowed to move around the protein molecules to find an optimum geometry for the explicit water molecules. From the final MD-optimized structure, Figure 7A, the water molecules within a distance of 3.0 Å from the protein molecule were then selected, and a constrained solvent optimization was carried out at the semiempirical PM6 level. This final structure is considerd for evaluating the MIM-NMR chemical shifts. The MIM-explicit model includes 92 water molecules in the primary solvation sphere within 3.0 Å of the protein molecule. This calculation with 23 amino acid residues and 92 explicit water molecules results in 613 atoms. We have used the popular MPW1PW91 functional at the MIM2[MPW1PW91/6-311+ +G(2d,2p):MPW1PW91/6-31G) level to calculate the NMR spectra. A full calculation at the MPW1PW91/6-311++G(2d,2p) level involves 10,567 basis functions and is not possible using our available computational resources, thus requiring the use of our fragment-based approach. As in the case of ubiquitin, we have employed four types of MIM solvation models for evaluating the 1H NMR chemical shift: (A) MIM-gas-phase, (B) MIM-implicit, (C) MIMexplicit, and (D) MIM-explicit-implicit solvation models. A comparison of the NMR experimental and MIM chemical shifts are depicted in Figure 8A−D. The brown lines in Figure 8A−D are the 45° line through the diagonal of the graph. The MAD in the 1H NMR chemical shifts using the best linear fit in Figure 8 are (A) 1.08, (B) 0.89, (C) 0.53, and (D) 0.34 ppm, respectively. The overall performance of our best model (MIM-explicit-implicit) is slightly worse than for ubiquitin (0.34 ppm vs 0.31 ppm), and probably reflects a larger uncertainty in the NMR-derived structure of the molecule. For 1RIJ protein, 15N experimental chemical shifts are not available and are not discussed in this work. 3.3. Analysis and Outlook. The overall performance of MIM2 meets our target accuracy of 0.3 ppm for 1H and 2−3 ppm for 13C. We note from the benchmark analysis on polyalanine that the errors coming from the fragmentation schemes are only 3% of the target accuracy, and thus should not interfere with our analysis of the performance of the methods. As mentioned earlier, some of the empirical models such as SHIFTX2 can achieve comparable or even better performance. While it is not uncommon for empirical models to outperform quantum mechanical models,52 the latter offer scope for systematic improvement and will have a better predictive capability in challenging situations. Additionally, DFT methods are used in this work, and the intrinsic errors of the DFT functionals used may limit the target accuracy. Finally, it is known that the results for such large peptides will be dependent on the selection of appropriate structural ensembles. In this work, we have only used a single structure, and it may be unrealistic to expect a significantly smaller deviation. Nevertheless, the accuracy of our methods is comparable to some of the best previous results, and the overall performance is very encouraging. The accurate prediction of the NMR shielding tensors and chemical shifts at MIM2 levels of theory validate the MIMbased method for exploring large molecules. The MIM-NMR

4. CONCLUSIONS In the present work, the molecules-in-molecules (MIM) fragment-based method is adapted for the precise evaluation of 1H, 13C, 15N, and 17O NMR chemical shifts of large biomolecules. The MIM protocol for evaluating NMR chemical shifts is independent of the levels of theory and basis sets, provided the associated analytic energy derivatives are available. The fragments are generated by cutting the C−C bonds, and a number-based connectivity scheme is used for constructing the primary subsystems. All the dangling bonds of the subsystems are saturated with link-hydrogen atoms. MIM-NMR method is benchmarked on (alanine)10 conformers, with strong intramolecular interactions. The incorporation of a second low level of theory to recover the missing long-range interactions in the primary fragmentation scheme is critical to yield reliable chemical shifts, with a mean absolute deviation (MAD) from direct unfragmented calculations of 0.01 ppm for 1H chemical shifts and 0.07 ppm for 13C chemical shifts. We have also employed MIM-NMR for predicting the chemical shifts of the helical portion of ubiquitin (PDB 1UBQ containing 12 residues) and E6-binding protein of papilloma virus (PDB 1RIJ containing 23 residues). We have incorporated the solvent environment employing MIM-explicit, MIM-implicit, and MIM-explicit-implicit models for evaluating NMR chemical shifts. The two-layer MIM2 model in combination with explicit solvents in the primary solvation sphere, and implicit solvation for the rest, could reproduce the experimental NMR chemical shifts accurately. A full calculation with explicit solvation using 6-311++G(2d,2p) for the E6-binding protein containing 23 residues and 92 explicit water molecules contains more than 600 atoms and involves more than 10,000 basis functions. Such quantum chemical calculations on a large protein involving a direct calculation of NMR chemical shift using fully quantum calculations with large basis sets are now possible. The accuracy and performance of the MIM-NMR method for evaluating 1H and 13C spectra of these benchmark systems with strong intramolecular interactions holds promise for exploring the structure and NMR spectra of a range of large biomolecules in the future.



ASSOCIATED CONTENT

S Supporting Information *

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jctc.6b00922. 1 H, 15N, and 13C MIM-NMR spectra, comparison of isotropic magnetic shielding tensors, comparison of ubiquitin(1XQQ, 1D3Z) molecule experimental NMR spectra with those of neutral residues, and optimized structure results (PDF)



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. I

DOI: 10.1021/acs.jctc.6b00922 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation ORCID

(18) Hansen, A. E.; Bouman, T. D. Localized orbital/local origin method for calculation and analysis of NMR shieldings. Applications to 13C shielding tensors. J. Chem. Phys. 1985, 82, 5035−5047. (19) Keith, T. A.; Bader, R. F. W. Calculation of magnetic response properties using atoms in molecules. Chem. Phys. Lett. 1992, 194, 1−8. (20) Keith, T. A.; Bader, R. F. W. Calculation of magnetic response properties using a continuous set of gauge transformations. Chem. Phys. Lett. 1993, 210, 223−231. (21) Cheeseman, J. R.; Trucks, G. W.; Keith, T. A.; Frisch, M. J. A comparison of models for calculating nuclear magnetic resonance shielding tensors. J. Chem. Phys. 1996, 104, 5497−5509. (22) Kollwitz, M.; Häser, M.; Gauss, J. Non-Abelian point group symmetry in direct second-order many-body perturbation theory calculations of NMR chemical shifts. J. Chem. Phys. 1998, 108, 8295− 8301. (23) Gauss, J.; Stanton, J. F. Coupled-cluster calculations of nuclear magnetic resonance chemical shifts. J. Chem. Phys. 1995, 103, 3561− 3577. (24) Gauss, J.; Stanton, J. F. Analytic CCSD(T) second derivatives. Chem. Phys. Lett. 1997, 276, 70−77. (25) Johnson, B. G.; Fisch, M. J. An implementation of analytic second derivatives of the gradient-corrected density functional energy. J. Chem. Phys. 1994, 100, 7429−7442. (26) Johnson, B. G.; Frisch, M. J. Analytic second derivatives of the gradient-corrected density functional energy. Effect of quadrature weight derivatives. Chem. Phys. Lett. 1993, 216, 133−140. (27) Becke, A. D. Perspective: Fifty years of density-functional theory in chemical physics. J. Chem. Phys. 2014, 140, 18A301. (28) Chesnut, D. B.; Moore, K. D. Locally dense basis sets for chemical shift calculations. J. Comput. Chem. 1989, 10, 648−659. (29) Reid, D. M.; Kobayashi, R.; Collins, M. A. Systematic Study of Locally Dense Basis Sets for NMR Shielding Constants. J. Chem. Theory Comput. 2014, 10, 146−152. (30) Frank, A.; Möller, H. M.; Exner, T. E. Toward the quantum chemical calculation of NMR chemical shifts of proteins. 2. level of theory, basis set, and solvents model dependence. J. Chem. Theory Comput. 2012, 8, 1480−1492. (31) Exner, T. E.; Frank, A.; Onila, I.; Möller, H. M. Toward the Quantum Chemical Calculation of NMR Chemical Shifts of Proteins. 3. Conformational Sampling and Explicit Solvents Model. J. Chem. Theory Comput. 2012, 8, 4818−4827. (32) Flaig, D.; Maurer, M.; Hanni, M.; Braunger, K.; Kick, L.; Thubauville, M.; Ochsenfeld, C. Benchmarking Hydrogen and Carbon NMR Chemical Shifts at HF, DFT, and MP2 Levels. J. Chem. Theory Comput. 2014, 10, 572−578. (33) Continuum solvation models in chemical physics: from theory to applications. Mennucci, B., Cammi, R., Eds.; John Wiley & Sons: Chichester, England, 2007; DOI: 10.1002/9780470515235. (34) Cammi, R.; Mennucci, B.; Tomasi, J. Nuclear magnetic shieldings in solution: Gauge invariant atomic orbital calculation using the polarizable continuum model. J. Chem. Phys. 1999, 110, 7627−7638. (35) Mukkamala, D.; Zhang, Y.; Oldfield, E. A solid state 13C NMR, crystallographic, and quantum chemical investigation of phenylalanine and Tyrosine residues in dipeptides and proteins. J. Am. Chem. Soc. 2007, 129, 7385−7392. (36) Zhu, T.; He, X.; Zhang, J. Z. H. Fragment density functional theory calculation of NMR chemical shifts for proteins with implicit solvation. Phys. Chem. Chem. Phys. 2012, 14, 7837−7845. (37) Sebastiani, D.; Rothlisberger, U. Nuclear magnetic resonance chemical shifts from hybrid DFT QM/MM calculations. J. Phys. Chem. B 2004, 108, 2807−2815. (38) Komin, S.; Gossens, C.; Tavernelli, I.; Rothlisberger, U.; Sebastiani, D. NMR solvent shifts of adenine in aqueous solution from hybrid QM/MM molecular dynamics simulations. J. Phys. Chem. B 2007, 111, 5225−5232. (39) Best, R. B.; Buchete, N.-V.; Hummer, G. Are current molecular dynamics force fields too helical? Biophys. J. 2008, 95, L07−L09.

Krishnan Raghavachari: 0000-0003-3275-1426 Present Address †

School of Chemistry, University of Hyderabad, Gachibowli, Hyderabad 500046, India. Funding

Funding from NSF Grant CHE-1266154 at Indiana University is gratefully acknowledged. Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS We thank the Indiana University Big Red II, Supercomputing facility for computing time.



REFERENCES

(1) Mulder, F. A. A.; Filatov, M. NMR chemical shift data and ab initio shielding calculations: emerging tools for protein structure determination. Chem. Soc. Rev. 2010, 39, 578−590. (2) Helgaker, T.; Jaszuński, M.; Ruud, K. Ab Initio Methods for the Calculation of NMR Shielding and Indirect Spin−Spin Coupling Constants. Chem. Rev. 1999, 99, 293−352. (3) de Dios, A.; Pearson, J.; Oldfield, E. Secondary and tertiary structural effects on protein NMR chemical shifts: an ab initio approach. Science 1993, 260, 1491−1496. (4) Casabianca, L. B.; de Dios, A. C. Ab initio calculations of NMR chemical shifts. J. Chem. Phys. 2008, 128, 052201. (5) He, X.; Wang, B.; Merz, K. M. Protein NMR Chemical Shift Calculations Based on the Automated Fragmentation QM/MM Approach. J. Phys. Chem. B 2009, 113, 10380−10388. (6) Oldfield, E. CHEMICAL SHIFTS IN AMINO ACIDS, PEPTIDES, AND PROTEINS: From Quantum Chemistry to Drug Design. Annu. Rev. Phys. Chem. 2002, 53, 349−378. (7) Neal, S.; Nip, A.; Zhang, H.; Wishart, D. Rapid and accurate calculation of protein 1H, 13C and 15N chemical shifts. J. Biomol. NMR 2003, 26, 215−240. (8) Shen, Y.; Bax, A. Protein backbone chemical shifts predicted from searching a database for torsion angle and sequence homology. J. Biomol. NMR 2007, 38, 289−302. (9) Calculation of NMR and EPR parameters: theory and applications; Kaupp, M., Bühl, M., Malkin, V. G., Eds. Wiley-VCH: Weinheim, Germany, 2004. (10) Vaara, J. Theory and computation of nuclear magnetic resonance parameters. Phys. Chem. Chem. Phys. 2007, 9, 5399−5418. (11) Facelli, J. C. Calculations of chemical shieldings: Theory and applications. Concepts Magn. Reson. 2004, 20A, 42−69. (12) Ditchfield, R. Self-consistent perturbation theory of diamagnetism. Mol. Phys. 1974, 27, 789−807. (13) Gauss, J. Effects of electron correlation in the calculation of nuclear magnetic resonance chemical shifts. J. Chem. Phys. 1993, 99, 3629−3643. (14) Wolinski, K.; Hinton, J. F.; Pulay, P. Efficient implementation of the gauge-independent atomic orbital method for NMR chemical shift calculations. J. Am. Chem. Soc. 1990, 112, 8251−8260. (15) Rauhut, G.; Puyear, S.; Wolinski, K.; Pulay, P. Comparison of NMR Shieldings Calculated from Hartree−Fock and Density Functional Wave Functions Using Gauge-Including Atomic Orbitals. J. Phys. Chem. 1996, 100, 6310−6316. (16) Kutzelnigg, W. Theory of Magnetic Susceptibilities and NMR Chemical Shifts in Terms of Localized Quantities. Isr. J. Chem. 1980, 19, 193−200. (17) Schindler, M.; Kutzelnigg, W. Theory of magnetic susceptibilities and NMR chemical shifts in terms of localized quantities. II. Application to some simple molecules. J. Chem. Phys. 1982, 76, 1919− 1933. J

DOI: 10.1021/acs.jctc.6b00922 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation (40) Komin, S.; Sebastiani, D. Optimization of capping potentials for spectroscopic parameters in hybrid quantum mechanical/mechanical modeling calculations. J. Chem. Theory Comput. 2009, 5, 1490−1498. (41) Poon, C.-D.; Samulski, E. T.; Weise, C. F.; Weisshaar, J. C. Do Bridging water molecules dictate the structure of a model dipeptide in aqueous solution? J. Am. Chem. Soc. 2000, 122, 5642−5643. (42) Han, W.-G.; Jalkanen, K. J.; Elstner, M.; Suhai, S. Theoretical study of aqueous N-Acetyl-l-alanine N′-Methylamide: structures and Raman, VCD, and ROA Spectra. J. Phys. Chem. B 1998, 102, 2587− 2602. (43) Zhu, T.; Zhang, J. Z. H.; He, X. Automated fragmentation QM/ MM calculation of amide proton chemical shifts in proteins with explicit solvent model J. J. Chem. Theory Comput. 2013, 9, 2104−2114. (44) He, X.; Zhu, T.; Wang, X.; Liu, J.; Zhang, J. Z. H. Fragment quantum mechanical calculation of proteins and Its applications. Acc. Chem. Res. 2014, 47, 2748−2757. (45) Kussmann, J.; Ochsenfeld, C. Linear-scaling method for calculating nuclear magnetic resonance chemical shifts using gaugeincluding atomic orbitals within Hartree-Fock and density-functional theory. J. Chem. Phys. 2007, 127, 054103. (46) Ochsenfeld, C.; Head-Gordon, M. A reformulation of the coupled perturbed self-consistent field equations entirely within a local atomic orbital density matrix-based scheme. Chem. Phys. Lett. 1997, 270, 399−405. (47) Gordon, M. S.; Fedorov, D. G.; Pruitt, S. R.; Slipchenko, L. V. Fragmentation methods: A route to accurate calculations on large systems. Chem. Rev. 2012, 112, 632−672. (48) Fedorov, D. G.; Kitaura, K. Extending the Power of Quantum Chemistry to Large Systems with the Fragment Molecular Orbital Method. J. Phys. Chem. A 2007, 111, 6904−6914. (49) Gao, Q.; Yokojima, S.; Fedorov, D. G.; Kitaura, K.; Sakurai, M.; Nakamura, S. Fragment-Molecular-Orbital-Method-Based ab Initio NMR Chemical-Shift Calculations for Large Molecular Systems. J. Chem. Theory Comput. 2010, 6, 1428−1444. (50) Gao, Q.; Yokojima, S.; Kohno, T.; Ishida, T.; Fedorov, D. G.; Kitaura, K.; Fujihira, M.; Nakamura, S. Ab initio NMR chemical shift calculations on proteins using fragment molecular orbitals with electrostatic environment. Chem. Phys. Lett. 2007, 445, 331−339. (51) Victora, A.; Möller, H. M.; Exner, T. E. Accurate ab initio prediction of NMR chemical shifts of nucleic acids and nucleic acids/ protein complexes. Nucleic Acids Res. 2014, 42, e173. (52) Case, D. A. Chemical shifts in biomolecules. Curr. Opin. Struct. Biol. 2013, 23, 172−176. (53) Lee, A. M.; Bettens, R. P. A. First Principles NMR Calculations by Fragmentation. J. Phys. Chem. A 2007, 111, 5111−5115. (54) Tan, H.-J.; Bettens, R. P. A. Ab initio NMR chemical-shift calculations based on the combined fragmentation method. Phys. Chem. Chem. Phys. 2013, 15, 7541−7547. (55) Vreven, T.; Morokuma, K. Hybrid Methods: ONIOM(QM:MM) and QM/MM. In Annual Reports in Computational Chemistry; David, C. S., Ed.; Elsevier: 2006; Vol. 2, Chapter 3, pp 35−51, DOI: 10.1016/S1574-1400(06)02003-2. (56) Chung, L. W.; Hirao, H.; Li, X.; Morokuma, K. The ONIOM method: its foundation and applications to metalloenzymes and photobiology. Wiley Interdisciplinary Reviews: Computational Molecular Science 2012, 2, 327−350. (57) Collins, M. A.; Cvitkovic, M. W.; Bettens, R. P. A. The Combined Fragmentation and Systematic Molecular Fragmentation Methods. Acc. Chem. Res. 2014, 47, 2776−2785. (58) Frank, A.; Onila, I.; Möller, H. M.; Exner, T. E. Toward the quantum chemical calculation of nuclear magnetic resonance chemical shifts of proteins. Proteins: Struct., Funct., Genet. 2011, 79, 2189−2202. (59) Loibl, S.; Schütz, M. NMR shielding tensors for density fitted local second-order Møller-Plesset perturbation theory using gauge including atomic orbitals. J. Chem. Phys. 2012, 137, 084107. (60) Hartman, J.; Beran, G. Fragment-based electronic structure approach for computing nuclear magnetic resonance chemical shifts in molecular crystals. J. Chem. Theory Comput. 2014, 10, 4862−4872.

(61) Hartman, J.; Monaco, S.; Schatschneider, B.; Beran, G. Fragment-based 13-C nuclear magnetic resonance chemical shift predictions in molecular crystals: An alternative to planewave methods. J. Chem. Phys. 2015, 143, 102809. (62) Ř ezác,̌ J.; Salahub, D. R. Multilevel Fragment-Based Approach (MFBA): A Novel Hybrid Computational Method for the Study of Large Molecules. J. Chem. Theory Comput. 2010, 6, 91−99. (63) Vreven, T.; Byun, K. S.; Komáromi, I.; Dapprich, S.; Montgomery, J. A.; Morokuma, K.; Frisch, M. J. Combining Quantum Mechanics Methods with Molecular Mechanics Methods in ONIOM. J. Chem. Theory Comput. 2006, 2, 815−826. (64) Fedorov, D. G.; Ishida, T.; Kitaura, K. Multilayer Formulation of the Fragment Molecular Orbital Method (FMO). J. Phys. Chem. A 2005, 109, 2638−2646. (65) Isegawa, M.; Wang, B.; Truhlar, D. G. Electrostatically Embedded Molecular Tailoring Approach and Validation for Peptides. J. Chem. Theory Comput. 2013, 9, 1381−1393. (66) Beran, G. J. O. Approximating quantum many-body intermolecular interactions in molecular clusters using classical polarizable force fields. J. Chem. Phys. 2009, 130, 164115. (67) He, X.; Merz, K. M. Divide and Conquer Hartree−Fock Calculations on Proteins. J. Chem. Theory Comput. 2010, 6, 405−411. (68) Nagata, T.; Fedorov, D. G.; Sawada, T.; Kitaura, K.; Gordon, M. S. A combined effective fragment potential−fragment molecular orbital method. II. Analytic gradient and application to the geometry optimization of solvated tetraglycine and chignolin. J. Chem. Phys. 2011, 134, 034110. (69) Mullin, J. M.; Roskop, L. B.; Pruitt, S. R.; Collins, M. A.; Gordon, M. S. Systematic Fragmentation Method and the Effective Fragment Potential: An Efficient Method for Capturing Molecular Energies. J. Phys. Chem. A 2009, 113, 10040−10049. (70) Guo, W.; Wu, A.; Zhang, I. Y.; Xu, X. XO: An extended ONIOM method for accurate and efficient modeling of large systems. J. Comput. Chem. 2012, 33, 2142−2160. (71) Karadakov, P. B.; Morokuma, K. ONIOM as an efficient tool for calculating NMR chemical shielding constants in large molecules. Chem. Phys. Lett. 2000, 317, 589−596. (72) Hall, K. F.; Vreven, T.; Frisch, M. J.; Bearpark, M. J. ThreeLayer ONIOM Studies of the Dark State of Rhodopsin: The Protonation State of Glu181. J. Mol. Biol. 2008, 383, 106−121. (73) Gascón, J. A.; Sproviero, E. M.; Batista, V. S. QM/MM Study of the NMR Spectroscopy of the Retinyl Chromophore in Visual Rhodopsin. J. Chem. Theory Comput. 2005, 1, 674−685. (74) Mayhall, N. J.; Raghavachari, K. Molecules-in-Molecules: An Extrapolated Fragment-Based Approach for Accurate Calculations on Large Molecules and Materials. J. Chem. Theory Comput. 2011, 7, 1336−1343. (75) Jose, K. V. J.; Raghavachari, K. Evaluation of Energy Gradients and Infrared Vibrational Spectra through Molecules-in-Molecules Fragment-Based Approach. J. Chem. Theory Comput. 2015, 11, 950− 961. (76) Jovan Jose, K. V.; Raghavachari, K. Molecules-in-molecules fragment-based method for the evaluation of Raman spectra of large molecules. Mol. Phys. 2015, 113, 3057−3066. (77) Jose, K. V. J.; Beckett, D.; Raghavachari, K. Vibrational Circular Dichroism Spectra for Large Molecules through Molecules-inMolecules Fragment-Based Approach. J. Chem. Theory Comput. 2015, 11, 4238−4247. (78) Jovan Jose, K. V.; Raghavachari, K. Raman Optical Activity Spectra for Large Molecules through Molecules-in-Molecules Fragment-Based Approach. J. Chem. Theory Comput. 2016, 12, 585−594. (79) Frisch, M. J. T.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Mennucci, B.; Petersson, G. A.; Nakatsuji, H.; Caricato, M.; Li, X.; Hratchian, H. P.; Izmaylov, A. F.; Bloino, J.; Zheng, G.; Sonnenberg, J. L.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Vreven, T.; Montgomery, J. A., Jr.; Peralta, J. E.; Ogliaro, F.; Bearpark, M.; Heyd, J. J.; Brothers, E.; Kudin, K. N.; Staroverov, V. N.; Kobayashi, R.; Normand, J.; Raghavachari, K.; K

DOI: 10.1021/acs.jctc.6b00922 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation Rendell, A.; Burant, J. C.; Iyengar, S. S.; Tomasi, J.; Cossi, M.; Rega, N.; Millam, M. J.; Klene, M.; Knox, J. E.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.; Martin, R. L.; Morokuma, K.; Zakrzewski, V. G.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Dapprich, S.; Daniels, A. D.; Farkas, Ö .; Foresman, J. B.; Ortiz, J. V.; Cioslowski, J.; Fox, D. J. Gaussian 09, Revision D.01; Gaussian: Wallingford, CT, USA, 2009. (80) Dapprich, S.; Komáromi, I.; Byun, K. S.; Morokuma, K.; Frisch, M. J. A new ONIOM implementation in Gaussian98. Part I. The calculation of energies, gradients, vibrational frequencies and electric field derivatives. J. Mol. Struct.: THEOCHEM 1999, 461−462, 1−21. (81) Cramer, C. J.; Truhlar, D. G. Implicit Solvation Models: Equilibria, Structure, Spectra, and Dynamics. Chem. Rev. 1999, 99, 2161−2200. (82) Leaver-Fay, A.; Tyka, M.; Lewis, S. M.; Lange, O. F.; Thompson, J.; Jacak, R.; Kaufman, K.; Renfrew, P. D.; Smith, C. A.; Sheffler, W.; Davis, I. W.; Cooper, S.; Treuille, A.; Mandell, D. J.; Richter, F.; Ban, Y. E. A.; Fleishman, S. J.; Corn, J. E.; Kim, D. E.; Lyskov, S.; Berrondo, M.; Mentzer, S.; Popović, Z.; Havranek, J. J.; Karanicolas, J.; Das, R.; Meiler, J.; Kortemme, T.; Gray, J. J.; Kuhlman, B.; Baker, D.; Bradley, P. ROSETTA3: An object-oriented software suite for the simulation and design of macromolecules. Methods Enzymol. 2011, 487, 545−574. (83) Schleyer, P. v. R.; Maerker, C.; Dransfeld, A.; Jiao, H.; Hommes, N. J. R. v. E. Nucleus-Independent Chemical Shifts: A Simple and Efficient Aromaticity Probe. J. Am. Chem. Soc. 1996, 118, 6317−6318. (84) Vila, J.; Villegas, M.; Baldoni, H.; Scheraga, H. Predicting 13Cα chemical shifts for validation of protein structures. J. Biomol. NMR 2007, 38, 221−235. (85) Vila, J. A.; Ripoll, D. R.; Scheraga, H. A. Use of 13Cα Chemical Shifts in Protein Structure Determination. J. Phys. Chem. B 2007, 111, 6577−6585.

L

DOI: 10.1021/acs.jctc.6b00922 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX