Vibrational Spectroscopy in Solution through Perturbative ab Initio

Jun 12, 2019 - Download Hi-Res ImageDownload to MS-PowerPointCite This:J. Chem. .... In the case of explicit models in MD simulations, the free energy...
0 downloads 0 Views 5MB Size
Article Cite This: J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

pubs.acs.org/JCTC

Vibrational Spectroscopy in Solution through Perturbative ab Initio Molecular Dynamics Simulations Carlos Bistafa,† Yukichi Kitamura,† Marilia T. C. Martins-Costa,‡ Masataka Nagaoka,*,†,§,∥,⊥ and Manuel F. Ruiz-Loṕ ez*,‡,⊥

Downloaded via NOTTINGHAM TRENT UNIV on July 18, 2019 at 20:05:09 (UTC). See https://pubs.acs.org/sharingguidelines for options on how to legitimately share published articles.



Department of Complex Systems Science, Graduate School of Informatics, Nagoya University, Chikusa Ku, Furo Cho, Nagoya, Aichi 4648601, Japan ‡ Laboratoire de Physique et Chimie Théoriques, UMR CNRS 7019, Faculté des Sciences et Technologies, Université de Lorraine, CNRS, BP 70239, 54506 Vandoeuvre-lès-Nancy Cedex, France § ESICB, Kyoto University, Kyodai Katsura, Nishikyo-ku, Kyoto 6158520, Japan ∥ Core Research for Evolutional Science and Technology, Japan Science and Technology Agency, Honmachi, Kawaguchi 3320012, Japan ⊥ Future Value Creation Research Center, Graduate School of Informatics, Nagoya University, Chikusa Ku, Furo Cho, Nagoya, Aichi 4648601, Japan S Supporting Information *

ABSTRACT: We have developed a method that allows computing the vibrational spectra at a high quantum mechanical level for molecules in solution or other complex systems. The method is based on the use of configurational samplings from combined QM/MM molecular dynamics simulations and the use of perturbation theory to calculate accurate molecular properties. Such calculations provide in addition accurate free energy gradient vectors and Hessian matrices and thus open the door for the characterization of stationary points in free energy landscapes and the study of chemical reaction mechanisms in large disordered systems. The vibrational spectrum of the water molecule in liquid water has been computed as a test case. It has been obtained using a weighted average of instantaneous signals assuming the instantaneous normal modes approach. Vibrational frequencies are also computed by diagonalizing the Hessian of the free energy surface. Comparison is made with experimental data and with calculations using the Fourier transform of the time autocorrelation function of the dipole moment. The discussion emphasizes the advantages of the developed methodology compared to other techniques in terms of the accuracy/computational cost ratio. between the solute and solvent molecules;1 in that case, the free energy of the system can easily be minimized with respect to the solute coordinates and the free energy Hessian (FEH) is then used to obtain the vibrational frequencies. Nowadays, however, advanced computational methods in vibrational spectroscopy exploits the potentialities offered by molecular dynamics (MD) simulations. Though in principle it is conceivable to use molecular mechanics (MM) force-fields, vibrational properties at this level are generally not very accurate (and in addition the necessary MM parameters are not always available). Hence, most often, ab initio simulations with full QM or combined QM/MM force-fields are carried out. From the MD trajectory, there are three main strategies2−9 to calculate vibrational properties, which are briefly summarized hereafter. We focus here on simulation methods that assume classical dynamics for the nuclei because including

1. INTRODUCTION In gas phase systems, the calculation of vibrational properties is a well-established procedure. In general, one assumes that the potential energy surface is approximately harmonic around the minima and the vibrational normal modes and frequencies are obtained by geometry optimization followed by diagonalization of the mass-weighted Hessian matrix. If needed, anharmonic corrections are estimated afterward, either by a simple scaling technique or by calculation of higher-order derivatives. Furthermore, the derivatives of the dipole moment and polarizability of the system with respect to these normal modes are related, respectively, to infrared and Raman spectrum intensities. In the case of liquids and solutions, owing to the large number of degrees of freedom, things are more complicated as it is not possible in practice to minimize the potential energy of the whole system. Therefore, some approximations have to be done. A great step forward was done in the 1980s using implicit continuum models to represent the interactions © XXXX American Chemical Society

Received: April 13, 2019 Published: June 12, 2019 A

DOI: 10.1021/acs.jctc.9b00362 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

the solute’s atomic coordinates, G(q), instead of the potential energy used in the gas phase. The gradient and the Hessian of this thermodynamic function are then calculated to locate the energy minima and to obtain the vibrational frequencies, following the idea developed with implicit continuum solvation models. In the case of explicit models in MD simulations, the free energy forces FFE(q) can be obtained from Free Energy gradient (FEG) theory,18−21 which leads to

nuclear quantum effects still represent an important challenge in this field (see for instance refs 10−12 and references cited therein, as well as ref 13 for a perturbative QM/MM approach). The first strategy and possibly most popular one consists in analyzing time correlation functions from the MD simulation.2 Vibrational frequencies are obtained from the power spectrum, i.e., the Fourier transform of the particle velocities time autocorrelation functions (TAF), while the IR and Raman spectra are obtained, respectively, from the dipole moment and molecular polarizability TAF, respectively. Specifically, the IR Beer−Lambert coefficient α(ω) is given by14,15 α(ω) =

πω i 1 y (1 − e−βℏω)D(ω)jjj zzz 3ℏcVn(ω)ε0 k 2π {

e−iωt ⟨M(0) ·M(t )⟩

F FE(q) = −

(1)

H FE(q) =

ÄÅ ÅÅ Å ∂V (q) ∂V T(q) − βÅÅÅÅ ÅÅ ∂q ∂q ÅÇ É Ñ TÑ ∂V (q) ∂V (q) ÑÑÑ ÑÑ ÑÑ ∂q ∂q ÑÑÖ

∂ 2V (q) ∂q∂q −

(5)

where T means transpose and β is the inverse temperature (kBT)−1, which is generally approximated by8,22−24

(2)

The second class of strategy is based on the so-called instantaneous normal mode (INM) analysis.3−7 The INMs and associated frequencies are computed for a set of instantaneous system configurations in the simulation, by diagonalizing the corresponding Hessian matrix of the potential energy: ij ∂ 2V yz z H α = jjj j ∂q∂q zzz k {R α

(4)

∂ 2G(q) ∂q∂q

=

β ℏω (1 − e−βℏω)

∂V (q) ∂q

The brackets mean an average over solvent configurations for a fixed solute’s structure. Likewise, by deriving the preceding expression, one obtains the free energy Hessian (FEH) elements:8,9



∫−∞ dt

where n(ω) is the frequency dependent refraction index, c is the speed of light, V is the volume of the system, ε0 is the permittivity of vacuum, D(ω) is a quantum correction factor, and the last factor represents the Fourier transform of the TAF of dipole moment vector M. The quantum factor D(ω) is introduced to correct for the fact that the TAF is obtained from classical MD simulations. There is not a unique choice for this factor5 but a common choice consists in using the socalled “Harmonic approximation”:16 D(ω) =

∂G(q) =− ∂q

H FE(q) ≈

∂ 2V (q) ∂q∂q

(6)

Though we will assume expressions (4) and (6) here, it is worth noting that some simplifications are possible by assuming the mean-field approximation, i.e., by replacing the average of the potential derivatives by the derivatives of the average potential (or the potential generated by an average solvent configuration).24−26 Such simplifications allow much faster computations at the cost of neglecting the role of solvent fluctuations. The preceding methods are complementary and exhibit their own advantages and drawbacks, the type of system and the problem under study often dictating the best choice. In all cases, however, the computational cost is high, which inevitably implies (a) making simplifications in the QM treatment (which increases the computational error) and (b) carrying out short time simulations (which increases the statistical uncertainty). In this work, we introduce the use of perturbative methods in ab initio simulations27 for predicting the vibrational spectra of solutions and other complex systems, and we show that this approach opens a way to push back the limits of current approaches. In short, an MD trajectory is first carried out at an affordable (but reliable) QM or QM/MM level that allows keeping the computational cost into reasonable amount. The configurational sampling is then used in conjunction with perturbation theory to obtain the vibrational spectrum at a high QM level. The perturbation Hamiltonian is the difference between the target and reference QM Hamiltonians. In recent years, we have shown that a comparable perturbative treatment is valuable to get highly accurate free energies and electronic properties,27,28 especially if the simulations are conducted with efficient sampling techniques.29 Besides, in combination with

(3)

where V is the total potential energy of the system, q are the mass-weighted atomic Cartesian coordinates of the nuclei, and Ra represents a specific configuration of the system. In these configurations, the system is not necessarily close to one of equilibrium configurations and some eigenvalues of the Hessian matrix may be negative, leading to imaginary INM frequencies. These imaginary frequencies are related to hills (such parts with the upward convex slope) in the potential energy landscape of the system connecting minima (designed as unstable modes) and have been sometimes associated with properties of the liquid such as libration, hindered rotation, and diffusion. The histogram of positive frequencies obtained over a very large number of configurations, i.e., the density of states, weighted by the derivative of the dipole moment with respect to the INM, yields the IR absorption spectrum. In the case of a solution, it may be advantageous to decouple the solute vibrations from the solvent coordinates, a technique which is especially well-suited in combined QM/MM MD simulations (for details, see refs 8 and 17). Diagonalization of the reduced Hessian matrix leads to the INMs QiINM associated with instantaneous vibrational frequencies wiINM of the solute, whose vibrational spectra can efficiently be obtained in this way. The last strategy applicable to the study of a solute in a solvent considers the free energy of the system as a function of B

DOI: 10.1021/acs.jctc.9b00362 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation FEG theory, perturbation theory has allowed us to develop a cost-effective method for free energy minimization and geometry optimization in solution.30 The methodology is explained in section 2, while section 3 summarizes the computational details and section 4 the results for the case study “water in water”. This system has been chosen to illustrate the theory developed here because there is a large number of previous simulations and experimental studies devoted to it, and in addition it is well-known that the water vibrational spectrum undergoes large solvation effects in terms of both frequencies and intensities. For simplicity, we will focus on IR spectra but applications of our approach to Raman spectroscopy are easily achievable. In section 5, some conclusions and perspectives are presented.

∂G HL(q) = ∂q

∂VHL(q ; s) ∂q

∂ 2G HL(q) = ∂q∂q

HL

∂ 2VHL(q ; s) ∂q∂q

HL

∂ 2VHL(q ; s) −β ΔV (q ; s) e ∂q∂q

⟨e−β ΔV (q; s)⟩LL

LL

(10)

which provides the elements of the HL Hessian matrix whose eigenvectors are the vibrational normal modes Qi and whose eigenvalues ki provide the vibrational frequencies ν̃i0 : νi0̃ =

1 ki 2πc

(11)

Equations 9−11 are used below to optimize the geometry and calculate the vibrational frequencies of the water molecule in liquid water in the FEP-FEH approach. However, it must be emphasized that the same equations allow us to locate and characterize saddle points, thus offering new perspectives to study chemical reaction mechanisms in solution at high theoretical levels. 2.2. FEP-INM Method. As explained in the introduction, in the INM approximation, the IR spectrum is generally obtained using the density of states, weighted by the derivative of the dipole moment (actually the square of the derivatives). Such a calculation requires in general handling a very large number of configurations in order to get a smooth spectrum. In the present study, where the number of snapshots is necessarily limited due to high computational cost, we have preferred to use a slightly different but equivalent technique, namely we replace the weighted average density of states by the average of instantaneous IR signals. The IR signals are obtained as a sum of Lorentzian shape bands over the INMs, the band area being the calculated absorbance, as explained hereafter. First, one must remind that the dimensionless linear absorbance is obtained from the intensities of the incident and transmitted radiation (respectively I0 and I) as

(7)

where ΔV = V1 − V0 is the corresponding difference in potential energy and the average is calculated on the sampling provided by equilibrium configurations of the reference state. The Hamiltonians H0 and H1 may correspond to the same system in different conditions (for example, gas phase and solution) or, as assumed here, to the same system in the same conditions but described at different quantum mechanical levels. In other words, FEP may be used to estimate free energy variations upon a modification of the QM method used to describe the system. It therefore allows to estimate free energy at high QM levels (HL) from the sampling of cheaper simulations at low QM levels (LL).27 For a selected value of the solute’s mass-weighted Cartesian coordinates q, the free energy at high level reads: 1 ln⟨e−β ΔV (q; s)⟩LL β

LL

⟨e−β ΔV (q; s)⟩LL

which allows locating the stationary points in the free energy landscape. A similar development (see the SI) leads to the second derivatives of the free energy:

2. THEORETICAL METHOD The method proposed here aims to get the vibrational spectrum of a solute in solution from ab initio molecular dynamics at a high QM level using either the INM or the FEH approaches described above. Since the spectra are obtained using Free Energy Perturbation (FEP) theory, the new methods developed here will be named FEP-INM and FEPFEH. We will assume in this paper that the system is described through a combined QM/MM force-field but extension to full QM simulations is straightforward. In all simulations, the Born−Oppenheimer approximation and a classical dynamics of the nuclei are assumed. 2.1. FEP-FEH Method. The fundamental equation in free energy perturbation (FEP) theory states that the free energy difference between a reference system described by Hamiltonian H0 and a target system described by Hamiltonian H1 can be obtained as31,32

G HL(q) = G LL(q) −

=

(9)

=

1 ΔG = G1 − G0 = − ln⟨e−β ΔV ⟩0 β

∂VHL(q ; s) −β ΔV (q ; s) e ∂q

i I (ν)̃ y A(ν)̃ = log10jjjj 0 zzzz k I(ν)̃ {

(12)

where ν̃ is the wavenumber in cm−1, and is related to the molar absorption coefficient (usual units L mol−1 cm−1) by the equation:

(8)

where s is used to emphasize the explicit dependence of the potential difference ΔV with the solvent coordinates. The conditions for a satisfactory convergence of this procedure have already been analyzed (see works cited in ref 30) and will not be discussed here again. We showed before that the derivatives of eq 8 with respect to the mass-weighted solute’s coordinates allow to get the FEPcorrected free energy gradient as30

ε(ν)̃ =

1 A(ν)̃ Cd

(13)

where C is the concentration and d the optical path length. The molar absorption coefficient in turn is connected to the integrated absorption coefficient ( by (= C

∫ ε(ν)̃ dν ̃

(14) DOI: 10.1021/acs.jctc.9b00362 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation Table 1. Calculated and Experimental Vibrational Frequencies of the Water Molecule in the Gas Phase (in cm−1)a bend

symm stretch

asymm stretch

HF/6-31G(d)b

harmonic

1826

4069

4188

B3LYP/6-311+G(d,p)b

harmonic anharmonic scaling anharmonic second order

1601 1539 1555

3817 3668 3646

3923 3770 3739

MP2/aug-cc-PVQZc

harmonic anharmonic second order

1631 1578

3846 3674

3972 3788

CAS/ANO[6s5p4d3f/5s4p3d]d

harmonic

1657

3849

3965

1595

3657

3756

experimente a

Calculated anharmonic frequencies are estimated using the scaled harmonic frequencies or the second-order perturbative approach.42. bThis work. c Reference 43. dReference 44. eReference 41.

Theoretically, the band area ( may be obtained from the derivatives of the dipole moment μ with respect to the normal modes; for a specific band i with absorption wavenumber ν̃i0 :33 N π ij ∂μ yzz zz ( i = A 2 jjjj 3c k ∂Q i z{

INM, but for comparison, we have also carried out computations using the standard FEH, INM, and TAF approaches. QM/MM-MD simulations have been carried out using the interface developed by some of us35 and connecting Gaussian 0934 and AMBER 09.36 The simulation cell consists of a cubic box of side 26.25 Å, containing one QM water molecule and 592 TIP3P37 MM water molecules. The target QM high-level is B3LYP/6-311+G(d,p)38 while the reference QM low-level corresponds to the HF/6-31G(d) level. This combination of methods has successfully been used to optimize the geometry of the water molecule in the liquid phase.30 We assume the NVT ensemble at 300 K using a Berendsen thermostat.39 The QM/MM Lennard-Jones interaction is treated in the same way that the classical MM/MM terms, as this approximation has been shown to provide satisfactory results.30 Periodic boundary conditions in the cubic box have been assumed: we use a strict cutoff radius of 11 Å for QM-MM interactions and for MM-MM interactions we use the Ewald method with the same cutoff, as explained in ref 35. The time step is 1 fs. The geometry of the QM water molecule is first optimized at the target level using either the FEP-FEG or the standard FEG algorithms, as reported in our previous work.30 The optimized internal parameters with these two methods are very close and are as follows: dOH = 0.9822 Å and θHOH = 105.19° (when the FEG-FEP method is used) and dOH = 0.9820 Å and θHOH = 105.18° (when the standard FEG method is used). For comparison, the gas phase geometry of the water molecule at the B3LYP/6-311+G(d,p) level is dOH = 0.9620 Å and θHOH = 105.11°. After QM water geometry free energy optimization, two QM/MM-MD trajectories of 40 ps are carried out to obtain the required sampling for the vibrational calculations with a fixed geometry of solute’s optimized structure. Convergence of the results as a function of simulation time is shown in Figure S1. The first simulation is carried out at LL and the corresponding sampling is used for FEP-FEH and FEP-INM calculations. The second simulation is carried out at HL and the corresponding sampling is used for standard FEH and INM calculations. In both cases, the QM water molecule geometry is constrained using the BELLY method implemented in AMBER 09.36 In the two simulations, snapshots are saved for further analysis. The averages in FEH and INM calculations are done using 4000 snapshots (configurations stored every 10 fs). The

2

(15)

and assuming a Lorentzian shape for the IR spectrum bands, the molar absorption coefficient becomes εi(ν)̃ =

2( i Γ π 4(ν ̃ − νi0̃ )2 + Γ 2

(16) −1

where Γ is the full width at half-maximum (cm ). Note that the IR intensity given by the program Gaussian,34 IIR (km mol−1), corresponds to the Napierian logarithm of the absorbance and is related to ( (L mol−1 cm−2) by the expression ( = 100/ln10 IIR = 43.42945IIR. In our FEP-INM method, the IR spectrum at the high QM level is calculated by a weighted averaged of instantaneous εi(ν̃) bands: εHL(ν ̃, q) =

−β ΔV (q ; s) (∑i εHL, i(ν ̃, q ; s))e

⟨e−β ΔV (q; s)⟩LL

LL

(17)

where the sum runs over all the INMs i and the corresponding absorbance εHL,i is obtained from the derivatives of the dipole moment at HL and eqs 15 and 16. In this expression, we make use of the FEP expression for a mechanical property that depends on the Cartesian coordinates of the system, such as the molar IR absorption coefficient.32 In principle, the MD simulation at LL level defining the sampling used in (17) does not need to involve constraints for the solute’s coordinates. However, for consistency and comparison with the FEP-FEH method described above, in our calculations below we will use the sampling from a simulation in which the solute’s coordinates are fixed to the geometry minimizing the free energy.

3. COMPUTATIONAL DETAILS In this work, we have studied the IR spectrum of the water molecule in liquid water since theoretical and experimental data are widely available, so that it appeared as a suitable test case for our methodology. We have carried out calculations with the two methods proposed above, FEP-FEH and FEPD

DOI: 10.1021/acs.jctc.9b00362 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation Lorentzian functions in the INM calculations have a full width at half-maximum Γ = 15 cm−1. In the case of the TAF calculations, another trajectory of 40 ps at HL without constraints has been carried out. We used the whole set of 40000 snapshots and the maximum entropy method40 to obtain the IR spectrum. The frequency dependent refraction index was taken from previous work.41 In order to compare the present calculational values with experimental ones, it is important to account for anharmonic effects. In the gas phase, anharmonic frequencies have been calculated using not only the usual scaling technique but also the second-order perturbative approach42 implemented in Gaussian.34 In the liquid phase, however, the latter method would be computationally too costly. Hence, anharmonic frequencies have been estimated only by the scaling technique, i.e., by fitting the harmonic gas phase frequencies to experimental data. The factor 0.961 has been deduced in our case. Note that this fitting factor do not only account for anharmonic effects but also for other errors in the theoretical method. In the case of the TAF calculations, no scaling has been done as, in principle, this approach takes anharmonicity into account.

Table 2. Calculated Vibrational Frequencies in the Liquid Phase and Frequency Shifts with Respect to Those in the Gas Phase of the Water Molecule Using Different Methods (cm−1)a symm stretch

asymm stretch

FEP-FEH

harmonic anharmonic scaling Δν

1755 1687 +148

3501 3364 −304

3571 3432 −338

FEH

harmonic anharmonic scaling Δν

1735 1667 +128

3514 3377 −291

3587 3447 −323

FEP-INM

harmonic anharmonic scaling Δν

1754 1685 +146

3495 3359 −309

3573 3433 −337

INM

harmonic anharmonic scaled Δν

1744 1676 +137

3506 3370 −298

3589 3449 −321

Δν

1650 ∼+55

experimentb

4. RESULTS AND DISCUSSION 4.1. Gas Phase Frequencies. In Table 1, we compare the vibrational frequencies of the water molecule in the gas phase computed at the reference HF/6-31G(d) and target B3LYP/6311+G(d,p) levels with other theoretical and experimental ones. As shown, the HF/6-31G(d) frequencies display very large errors compared to experimental values or to MP2/augcc-PVQZ and CAS/ANO[6s5p4d3f/5s4p3d] calculations. In contrast, our values at the target level B3LYP/6-311+G(d,p) are close to those reported with the correlated ab initio methods and they also compare well with experimental ones after the correction for anharmonic contributions, which are significant for water, as it is well-known. The largest frequency difference between the anharmonic values at the B3LYP/6311+G(d,p) level and the experimental values appears for the bending mode and amounts 40 cm−1. This is an acceptable error for these test calculations. 4.2. Liquid Water Spectrum. Calculated and experimental values for the bending and stretching modes of the water molecule in liquid water are presented in Table 2 (vibrational frequencies) and Figure 1 (IR spectrum). Experimentally,45−48 the IR spectrum displays a large absorption band in the 2800−3800 cm−1 region that can be associated with the two OH stretching modes. As a consequence, the bands for the symmetric and asymmetric modes are strongly overlapped and, despite some attempts reported in the literature to determine the position of each one,45,47,49 exact values are not available although it is generally assumed that there is a red shift of both modes by at least 250 cm−1. Another less intense band appears at around 1650 cm−1 in association with the bending mode and is blue-shifted by about 55 cm−1 with respect to that in the gas phase. In comparison to the IR spectrum in gas phase, that in the liquid phase exhibits other important characteristics, particularly broadening of the bending and stretching absorption bands and inversion of their relative intensities. The liquid water spectrum displays an additional absorption band below 900 cm−1 that basically has intermolecular origin and therefore is not accessible in our present model.

bend

2800−3700 ∼−250

a

Anharmonic frequencies are obtained by scaling the calculated harmonic values by 0.961, as explained in the text. For consistency, frequency shifts are obtained using also the scaled frequencies in the gas phase. bReferences 45−48.

Figure 1. Comparison of the theoretical and experimental48 IR spectra of the water molecule in liquid water. Instantaneous frequencies in FEP-INM and INM calculations have been scaled by 0.961 as explained in the text. TAF intensities are given in arbitrary units.

Let us first discuss the frequency values in Table 2, where the harmonic and anharmonic frequencies calculated using the FEP-FEH and FEP-INM methods developed here are compared to the corresponding values using the standard FEH and INM methods. In the case of the INM methods, the frequencies correspond to the maxima of the bands (harmonic or anharmonic) after averaging the Lorentzian functions (for details, see Figure S2). As shown, the agreement between the values given by the different theoretical methods is quite good. In particular, the frequencies obtained with the FEP methods developed here are close to the values obtained in the standard E

DOI: 10.1021/acs.jctc.9b00362 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation calculations (the entries FEH and INM in Table 2), frequency differences being typically in the range 10−20 cm−1. The comparison with the experimental values shows also a good agreement; in particular the positive and negative frequency shifts of, respectively, the bending and stretching modes are correctly reproduced, although they are overestimated. Overall, our calculations are in very good agreement with previous theoretical studies and in particular with those based on related QM/MM simulations.9,43,49 Another trend shown in Table 2 is that the FEH and INM calculations of the vibrational frequencies exhibit a very good agreement and differ by less than 10 cm−1. This is an interesting result because FEH calculations are extremely useful to assign the normal modes for the vibrational bands in the experimental spectra, which is less straightforward using INM calculations. This assignment is trivial in the case of the water molecule but might be cumbersome for more complex systems with many degrees of freedom and strong band overlapping. It is worth mentioning here that the FEH and INM calculations take into account the effect of solvent fluctuations along the simulation. For completeness, we have carried out a calculation of the frequencies obtained by diagonalizing the Hessian of the potential energy for the QM water molecule surrounded by an average solvent configuration of MM molecules (B3LYP/6-311+G(d,p) level), i.e., using the mean-field approximation. The obtained harmonic frequencies, 1744, 3506, and 3577 cm−1, are not far from the corresponding FEH and INM values in Table 1, meaning that this is a useful approximation when only average values are of interest.24−26 In Figure 1, the calculated IR spectra (FEP-INM and INM methods) are compared to the experimental one. For completeness, the theoretical spectrum by the TAF method is also drawn. As shown, the approximate FEP-INM spectrum is very close to the standard INM one, the main difference being the larger intensity of the FEP-INM peak at the higher frequency (asymmetric stretching). Nevertheless, the two methods qualitatively reproduce the higher intensity of the stretching mode in respect to that of the bending mode, which is a distinctive characteristic of the liquid water IR spectrum (the opposite is found in gas phase). The broadening of the bands is underestimated but this fact is probably due to the use of the QM water structure fixed to the free energy optimized geometry and should be improved by carrying out calculations for instantaneous configurations along the MD simulation with a nonconstrained solute structure. This option was beyond the initial scope of the present study and has not been explored. Finally, note that the TAF approach provides reasonably good results for the spectrum but absorption bands are slightly shifted toward higher frequencies and relative intensities are less well reproduced than in the INM approaches. To get some further insights on these results, Figure 2 displays the calculated spectra for the optimized water molecule in the gas phase and in the liquid phase (FEP-INM calculations), as well as for the isolated molecule with the geometry optimized in the liquid phase. The comparison of the three spectra shows that the change in water geometry in going from gas to liquid phase produces a significant change in the stretching bands and is responsible for the main shift of the associated frequencies in the liquid. This result is explained by the large increase of the OH bond lengths and by the anharmonicity of the stretching modes (roughly, for purely harmonic vibrations, the calculated frequency should be independent of the OH distance at which the second energy

Figure 2. Analysis of the calculated solvation effect on the water molecule IR spectrum. For simplicity, the liquid phase calculations correspond to the FEP-INM results but standard INM results are very similar.

derivatives and associated eigenvalues and frequencies are calculated). The geometry effect on bending frequencies is smaller because the bond angle changes very slightly in the liquid phase and in addition the associated frequency is less anharmonic (see Table 1). The role of anharmonicity to explain observed IR frequency shifts in solution has been discussed in other cases, in particular a detailed analysis has been made for the CO stretching mode in amides based on a rigorous energy decomposition scheme.50 4.3. Computational Requirements. From the computational point of view, the combination of QM/MM simulations and FEP corrections is quite efficient, not only because the total number of high-level (HL) calculations is hugely decreased but also because such calculations are independent and can be carried out in parallel. Thus, both CPU and clockwall times are significantly reduced allowing to conduct studies that would be difficult or even impossible to perform otherwise. The calculation of IR spectra in the FEP theory framework encompasses two steps: free energy minimization (geometry optimization) and calculation of the normal modes and vibrational frequencies. The first step is more computationally demanding because it is an iterative procedure and several MD trajectories have to be carried out. As described in more detail previously,30 in the FEP approach the computational effort (CPU time) is reduced by as much as 1−3 orders of magnitude, and possibly more important, the wall-clock time required to obtain HL results is not significantly higher than the one required for the full LL calculation, which can be a decisive advantage in practical terms. The computational costsaving in the second step is basically associated with the simulation of the QM/MM trajectory to generate the sampling, so that it is simply proportional to the number of time steps in the simulation and to the CPU time difference between the HL and LL methods for one single time step.

5. CONCLUSIONS In the present study, we have extended FEP theory to the calculation of free-energy second-derivatives and its application to the simulation of IR vibrational spectra from QM/MM Molecular Dynamics simulations. We have shown that QM/ F

DOI: 10.1021/acs.jctc.9b00362 J. Chem. Theory Comput. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Theory and Computation

Nagoya University for a Designated Professor position. M.F.R.L. and M.T.C.M.C. acknowledge the French CINES for providing computational resources (Project LCT2550). This work was also supported partially by the Core Research for Evolutional Science and Technology (CREST) of the Japan Science Technology Agency (JST); by a Grant-in-Aid for Science Research from the Ministry of Education, Culture, Sport, Science and Technology (MEXT) in Japan; and also by the MEXT programs “Elements Strategy Initiative for Catalysts and Batteries (ESICB)” and “Priority Issue on Post-K computer” (Development of new fundamental technologies for high-efficiency energy creation, conversion/storage and use). The calculations were partially performed using several computing systems at the Information Technology Center at Nagoya University.

MM FEP-methods are capable of reproducing those expensive results obtained at high-levels of theory with a considerably reduced computational effort. As a test case, we have investigated the IR spectrum of the water molecule in liquid water at the B3LYP/6-311+G(d,p) level, which is properly reproduced by combining QM/MM simulations at the simple HF/6-31G(d) level and perturbational corrections. More generally, the approach can be used in association with elaborated ab initio approaches to obtain accurate Hessian matrices in complex chemical systems. Thus, in combination with the possibility of calculating accurate gradients developed in our previous work,30 the present approach opens avenues for the study of chemical reactions at unprecedented levels of theory. A key question to this approach is of course the conditions that the LL and HL methods must fulfill in general to ensure a good convergence of the calculations. This question has already been discussed28,30,51 and we are just reminded here that despite FEP is an exact theory, in practice, the computations have to be carried out over a limited number of snapshots. Good converged results can only be achieved if the LL solvent sampling “sufficiently” overlaps the equilibrated system sampling at HL. This means that, for a given solute structure, both methods should provide comparable solute− solvation interaction energies: the closer the solvation energies are to each other, the faster convergence is achieved, differences in the intrinsic solute potential energy playing a secondary role. Some practical rules were reported in our previous work,30 where we proposed to look at the distribution of the solute−solvent interaction energy HL-LL difference, estimated from a set of snapshots of a low-level trajectory. Typically, the distribution should present a maximum at low energies (