

Computational Chemistry

How to Model Inter- and Intramolecular Hydrogen Bond Strengths with Quantum Chemistry Christoph Alexander Bauer J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.9b00132 • Publication Date (Web): 14 Aug 2019 Downloaded from pubs.acs.org on August 14, 2019



Journal of Chemical Information and Modeling

How to Model Inter- and Intramolecular Hydrogen Bond Strengths with Quantum Chemistry

Christoph A. Bauer∗,†,‡

†Department of Chemistry, University of Bergen, 5007 Bergen, Norway
‡Computational Biology Unit, University of Bergen, 5007 Bergen, Norway

E-mail: [email protected]

Abstract

This article presents the computation of both inter- and intramolecular hydrogen bond strengths from first principles. Quantum chemical calculations at the dispersion-corrected density functional theory level, including free energy and solvation contributions, are conducted for (i) one-to-one hydrogen-bonded complexes of alcohols with N-methylpyrrolidinone measured by an infrared spectroscopy method and (ii) a set of intramolecular hydrogen bond-forming phenol and pyrrole compounds, with intramolecular hydrogen bond strengths derived from a nuclear magnetic resonance method. The computed complexation free energies in solution show a correlation to experiment of R² = 0.74 with a root mean square error of 4.85 kJ mol−1. The intramolecular hydrogen bonding free energies in solution show a correlation of R² = 0.79 with a root mean square error of 5.51 kJ mol−1. The results of this study can be used as a guide on how to build reliable quantum chemical databases for computed hydrogen bonding strengths.


ACS Paragon Plus Environment


Introduction

Hydrogen-bonding (HB) 1 is one of the crucial non-covalent interactions to be taken into account in biochemistry. 2,3 Its role in medicinal chemistry and drug discovery has been widely discussed, 4–6 and recent studies have explained the biological activity of small molecules via their HB behavior, for example through correlations of HB acceptor strengths of individual sites in molecules with IC50 values. 7–9 Intramolecular HB (IMHB) has also recently come into focus in medicinal chemistry, opening the perspective of designing drugs using IMHB rationally, 10 although IMHB formation itself does not necessarily correlate with a compound's biological activity. 11 IMHB may, however, affect important molecular properties like lipophilicity; 12–14 it influences the formation and stabilities of molecular crystal polymorphs, 15 and it plays a role even in mass spectrometry, as higher energies are required to fragment molecules that form IMHB. 16 It is therefore not surprising that the HB strength has been a quantity of considerable interest.

There exists a vast literature on the experimental quantification of HB acceptor (HBA) and HB donor (HBD) strengths. 17–23 Equally long is the list of approaches to modeling HBA/HBD strengths, i.e., to predicting the Gibbs free energy of formation (∆Gexp) for HB complexes. The list includes cheminformatic approaches using ISIDA fragment descriptors and support vector regression, 24,25 modeling with quantum-chemical (QC) descriptors such as orbital energies, electrostatic potential minima, and shared electron numbers, 26–36 and full supramolecular QC treatment. 37–40

The experimental measurement of intermolecular HB strength is usually achieved by obtaining the equilibrium constant K of 1:1 complex formation against reference molecules. The most popular reference HBD is 4-fluorophenol, and the largest database of HBA strengths, the pKBHX database, 41 comprises about 1,200 values.
For HBD, the available experimental reference values are considerably fewer. One popular acceptor molecule is N-methylpyrrolidinone (NMP). The makers of the pKBHX database have devised a measure for HBD strength, called pKAHY, which is the decadic logarithm of the formation constant K for a 1:1 complex of an HBD molecule with NMP. 42 It is measured in CCl4 solution using an infrared (IR) spectroscopy method, exploiting the shift of the OH band of alcohol HBDs. The Gibbs free energy is obtained by:

$$\Delta G_{\mathrm{exp}}^{\mathrm{p}K_{\mathrm{AHY}}}\ (\mathrm{kJ\ mol^{-1}}) = -RT \ln K = -5.705 \times \mathrm{p}K_{\mathrm{AHY}} \qquad (1)$$
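As a minimal sketch of Eq. (1), the conversion between pKAHY and ∆Gexp can be written in a few lines of Python. The temperature of 298 K is an assumption inferred from the prefactor 5.705 (it is not stated explicitly in the text), and the function names are illustrative only.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def pk_to_delta_g(pk_ahy: float, temperature: float = 298.0) -> float:
    """Gibbs free energy of complexation (kJ/mol) from pK_AHY = log10(K)."""
    # -RT ln K = -RT ln(10) * log10(K)
    return -R_KJ * temperature * math.log(10.0) * pk_ahy


def delta_g_to_pk(delta_g: float, temperature: float = 298.0) -> float:
    """Inverse of pk_to_delta_g: pK_AHY from a free energy in kJ/mol."""
    return -delta_g / (R_KJ * temperature * math.log(10.0))


# Prefactor of Eq. (1): free energy per pK unit at the assumed temperature.
print(f"{pk_to_delta_g(1.0):.3f} kJ/mol per pK unit")
# pK value corresponding to a free energy of -19.90 kJ/mol.
print(f"pK_AHY = {delta_g_to_pk(-19.90):.2f}")
```

A more negative ∆Gexp therefore corresponds to a larger pKAHY, i.e., a stronger hydrogen bond donor.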

The main assumption and limitation of this method is that HB strength should be the only non-negligible contribution to ∆Gexp, which is to say that other, size-extensive non-covalent interactions such as dispersion should have little weight. Therefore, only rather small, monofunctional HBDs fall within the scope.

The measurement of IMHB strengths is less straightforward. One can resort to the measurement of competing inter- and intramolecular HB formation processes, 43 or to spectroscopy, for instance nuclear magnetic resonance (NMR). The NMR approach exploits the measurable difference in chemical shift (∆σ) when a reference molecule (e.g., phenol) and a related IMHB-forming molecule (e.g., o-substituted phenols) are compared. 44,45 The IMHB energy is obtained in the following way:

$$\Delta G_{\mathrm{exp}}^{\mathrm{IMHB}}\ (\mathrm{kJ\ mol^{-1}}) = 4.184 \times (\Delta\sigma + (0.4 \pm 0.2)), \qquad (2)$$

with appropriate non-IMHB reference molecules, for example phenol or pyrrole. Afonin and co-workers 46 have compared results from this approach to IR-spectroscopic 47 and computational methods of obtaining IMHB strengths. They find a root mean square error (RMSE) of 2 kJ mol−1 between NMR- and IR-derived IMHB strengths, which means that at least two experimental sources for derived IMHB strengths are in reasonably close agreement with each other. Since all these measurements are made at equilibrium conditions in solution, it is reasonable to assume that the measured thermodynamic quantity is indeed ∆Gexp.

While there have been many QC studies on intermolecular HB formation, not many have compared the full supramolecular Gibbs free energy in solution, ∆Gsol, directly to experiment. What is more, benchmarking of QC against experiment, especially for ∆Gsol, is in general rarely performed. 48 One of the few examples of successful thermochemical modeling is dispersion-corrected density functional theory (DFT), 49 which has reproduced experimental host-guest binding affinities in supramolecular chemistry. 50–52 This is also the method used in this report. ∆Gsol is computed as follows:

$$\Delta G_{\mathrm{sol}} = \Delta E + \Delta G_{\mathrm{HO}} + \Delta\delta G_{\mathrm{solv}}, \qquad (3)$$

where the energy difference ∆E is obtained as:

$$\Delta E = E(\mathrm{product}) - E(\mathrm{reactant(s)}). \qquad (4)$$

∆GHO are the harmonic oscillator (HO) corrections to the free energies in the gas phase, obtained from harmonic frequency calculations, and ∆δGsolv are the differences in computed solvation free energies, obtained from implicit solvent calculations. For the intermolecular HB case, ∆E is obtained by computing the energies of the molecule and of the standard reference acceptor/donor molecule and subtracting both from the energy of the complex. In the case of an IMHB, ∆E is obtained as the difference in energy of the limiting closed (IMHB) and open (non-IMHB) conformations.

What has not been reported at all, to the best of our knowledge, is a comparison of the QC modeling of inter- and intramolecular HB at the same level. The only difference in the modeling is a concentration dependence for the intermolecular case, for which a correction of the form suggested by the group of Whitesides should account. 53 To this end, two data sets have been compiled from the papers of Graton and co-workers 42 and Afonin and co-workers. 46 These two sources are chosen because they (i) provide consistent experimental sources for ∆Gexp and (ii) have a reasonable span of ∆Gexp values. For the specifications of the data sets, see the results section.
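The bookkeeping of Eqs. (3) and (4) can be sketched in Python as follows. The class and function names are illustrative and not taken from the paper; the numerical example uses the OBC-1 contributions quoted later in the discussion (∆E = −11.78, ∆GHO = 0.92, ∆δGsolv = 6.41 kJ mol−1).

```python
from dataclasses import dataclass
from typing import Sequence


def reaction_energy(e_product: float, e_reactants: Sequence[float]) -> float:
    """Eq. (4): energy of the product minus the summed reactant energies.

    Intermolecular case: product = complex, reactants = donor and acceptor.
    Intramolecular case: product = closed (IMHB) form, reactant = open form.
    """
    return e_product - sum(e_reactants)


@dataclass
class Contributions:
    """Individual contributions to a hydrogen bond free energy in kJ/mol."""
    delta_e: float        # electronic energy difference
    delta_g_ho: float     # harmonic-oscillator free energy correction
    delta_dg_solv: float  # difference of solvation free energies

    def delta_g_gas(self) -> float:
        """Gas-phase free energy: dE + dG_HO."""
        return self.delta_e + self.delta_g_ho

    def delta_g_sol(self) -> float:
        """Eq. (3): free energy in solution."""
        return self.delta_e + self.delta_g_ho + self.delta_dg_solv


obc1 = Contributions(delta_e=-11.78, delta_g_ho=0.92, delta_dg_solv=6.41)
print(f"dG(gas) = {obc1.delta_g_gas():.2f} kJ/mol")
print(f"dG(sol) = {obc1.delta_g_sol():.2f} kJ/mol")
```

For the intermolecular case, the concentration-dependent shift described in the Methods would be folded into `delta_dg_solv` before calling `delta_g_sol`.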


Methods

All molecules were handled using the RDKit. 54 Initial Cartesian coordinates for all molecules were generated by the ETKDG (Experimental Torsion Distance Geometry augmented with basic Knowledge) method 55 as implemented in the RDKit. IMHB conformations were generated by placing a distance constraint on the donor-acceptor pairs: 2.0 Å for IMHB forms and 5.0 Å for open forms. The IMHB and open forms were then pre-optimized for 100 steps using the Merck Molecular Force Field, version for static computations (MMFF94s), 56–61 as implemented in the RDKit. 62 Force-field pre-optimized H-bonded complex geometries with the reference acceptor molecule NMP were generated by placing the hydrogen atom of the donor moiety 2 Å from the acceptor carbonyl oxygen, followed by 100 steps of MMFF94s optimization.

The dispersion-corrected DFT workflow was as follows: The MMFF94s pre-optimized structures were optimized without any constraints at the TPSS 63 -D3(BJ) 64–66 /def2-TZVP 67 level of theory. The harmonic frequencies were computed at the same level to give the harmonic-oscillator free energy contributions (∆GHO) to the gas phase free energies ∆G. Single-point energies for the ∆E contributions were computed at the PW6B95 68 -D3(BJ)/def2-QZVP 67 level of theory using the TPSS-optimized structures. The free energy of solvation contributions (∆δGsolv) were computed using an implicit solvent model at the SMD 69 (BP86 70–72 -D3(BJ)/def2-TZVP) level of theory. For the intermolecular cases, the solvent was CCl4. To account for concentration dependence in the intermolecular case, a shift of -22.64 kJ mol−1 was added to ∆δGsolv, using the solvent free volume ($V_{\mathrm{solv}}^{\mathrm{free}}$) correction for translational entropy in solution suggested by Mammen et al. 53 This correction was evaluated using the GoodVibes program 73 with a molarity of 10.4 mol L−1 and a computed molecular volume of 128.8 Å3 for CCl4 (corresponding to a $V_{\mathrm{solv}}^{\mathrm{free}}$ of 0.264 mol L−1) at a concentration of 1.0 M. For the intramolecular hydrogen bonding calculations, the solvent was chloroform and no shift was added because unimolecular reactions are not concentration dependent. All DFT calculations were performed using the Gaussian suite of programs. 74


Linear regression models were created using the scikit-learn library and python. 75

Results and Discussion

The two data sets devised for this study are (i) the intermolecular pKAHY29 data set of 29 compounds from the paper by Graton and co-authors, using the pKAHY values as the source for ∆Gexp, 42 and (ii) the intramolecular IMHB16 data set of 16 compounds compiled from the paper by Afonin and co-workers, using the NMR-derived values for ∆Gexp. 46 Figures 1 and 2 illustrate the data sets. The pKAHY29 set contains 28 aromatic and aliphatic alcohols and cyclohexanone oxime as donors, with NMP as the reference acceptor. For reasons of data consistency in a one-conformer approach, the hydrogen-bonded complexes are in the conformer where the lone pair in 'anti' position to the amide nitrogen is the donating one. The IMHB16 set contains 16 IMHB-forming phenols and pyrroles. For the experimental target values, see Tables 1 and 2.

Figure 1: Molecular 1:1 complexes with the reference acceptor molecule NMP contained in the pKAHY29 intermolecular HB data set: (a) phenols, (b) aliphatic alcohols, (c) cyclohexanone oxime. 42


Figure 2: Compounds of the IMHB16 intramolecular HB set, labeled OBC-1 to OBC-16. The label 'OBC' has been chosen because of the source of the data. 46

The results of the dispersion-corrected DFT calculations are compared to the experimentally derived hydrogen bonding strengths. Inter- and intramolecular HB formation energies ∆E, gas phase free energies ∆G = ∆E + ∆GHO, and free energies in solution ∆Gsol are reported. First, the results for the pKAHY29 set are presented, followed by the IMHB16 results. All the optimized structures and absolute energies obtained from the DFT computations are available in the Supporting Information (SI) in the form of SDF data files 76 with corresponding documentation.

Table 1 shows the computed ∆E, ∆G, and ∆Gsol values for the pKAHY29 data set in comparison with experiment. 42 The first batch of data points are 17 phenols, ordered by increasing experimental free energies, i.e., decreasing hydrogen bond donor strength, from 3-4-5-trichlorophenol (-19.90 kJ mol−1) to 2-4-dichlorophenol (-10.10 kJ mol−1). Then come 11 aliphatic alcohols, analogously ordered from hexafluoroisopropanol (-17.70 kJ mol−1) to 2-butanol (-3.70 kJ mol−1). Finally, there is cyclohexanone-oxime as the sole representative of the oxime functional group.

The ∆E column shows overbound, i.e., too negative, results. This is expected because ∆E are pure gas-phase interaction energies; thus, no counteracting enthalpic or entropic effect is accounted for. The ∆G values correspond, in contrast, to underbinding, with some values significantly larger than zero, which means that the complex is predicted to be energetically unfavorable in the gas phase. This is the result of the thermal ∆GHO contributions, which are almost always repulsive for bimolecular reactions. The ∆δGsolv corrections, including a large concentration-dependent shift of -22.64 kJ mol−1, make the experimental and computed ∆Gsol values directly comparable. All of the complexes are bound, with a slight systematic overbinding. This may arise from the uncertainty whether the experimental results were in fact back-corrected to standard conditions (1.0 M). The value of the correction increases (i.e., becomes less negative) with decreasing concentration. The value of the shift is surprisingly large, owing to the density and molecular volume of CCl4.
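The size of the concentration-dependent shift can be checked with a short Python sketch. This is not the GoodVibes source code: it assumes the hard-cube free-volume expression usually attributed to Mammen et al. and a gas-phase reference state of an ideal gas at 1 atm and 298.15 K, neither of which is spelled out in the text. Only the molarity (10.4 mol L−1) and the molecular volume (128.8 Å3) of CCl4 are taken from the Methods; all function names are illustrative.

```python
import math

N_A = 6.02214076e23  # Avogadro constant, mol^-1
R = 8.314462618      # gas constant, J mol^-1 K^-1


def free_volume_litres(molarity: float, molecular_volume_a3: float) -> float:
    """Free volume (L) in one litre of bulk solvent, hard-cube model (assumed)."""
    # Volume available per solvent molecule in the liquid (1 L = 1e27 A^3).
    volume_per_molecule = 1.0e27 / (molarity * N_A)
    gap = volume_per_molecule ** (1.0 / 3.0) - molecular_volume_a3 ** (1.0 / 3.0)
    v_free_per_molecule = 8.0 * gap ** 3  # A^3 per solvent molecule
    return v_free_per_molecule * molarity * N_A * 1.0e-27


def association_shift(molarity: float, molecular_volume_a3: float,
                      concentration: float = 1.0, temperature: float = 298.15,
                      pressure: float = 101325.0) -> float:
    """Free energy shift (kJ/mol) for A + B -> AB, gas phase to solution.

    One translational degree of freedom is lost on association, so the shift
    is -RT ln(V_gas / V_solution) with both volumes taken per mole of solute.
    """
    v_gas = R * temperature / pressure * 1000.0  # L per mol, ideal gas
    v_solution = free_volume_litres(molarity, molecular_volume_a3) / concentration
    return -R * temperature * math.log(v_gas / v_solution) / 1000.0


shift_ccl4 = association_shift(molarity=10.4, molecular_volume_a3=128.8)
print(f"shift for CCl4 at 1.0 M: {shift_ccl4:.2f} kJ/mol")
```

Under these assumptions the sketch gives a shift close to the -22.64 kJ mol−1 quoted above, and lowering `concentration` makes the shift less negative, in line with the statement in the text.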


Table 1: Quantum Chemical Results and ∆Gexp Values for the pKAHY29 Data Set. The energies are in kJ mol−1. The ∆Gsol values include a concentration-dependent shift of -22.64 kJ mol−1 for CCl4 at standard conditions.

Name                           ∆E       ∆G    ∆Gsol    ∆Gexp
3-4-5-trichlorophenol      -57.47   -10.03   -26.15   -19.90
3-5-dichlorophenol         -54.49    -7.54   -23.49   -18.40
4-bromophenol              -49.86    -2.67   -18.66   -15.20
4-chlorophenol             -50.10    -3.01   -18.94   -15.10
4-fluorophenol             -48.64    -1.55   -17.58   -13.60
1-naphthol                 -50.54     2.80   -11.61   -13.20
3-5-dimethoxyphenol        -46.73     1.09   -14.73   -12.40
3-isopropylphenol          -46.28     0.59   -15.41   -11.80
4-methoxyphenol            -45.60     2.25   -13.64   -11.80
phenol                     -46.72     0.35   -15.86   -11.80
3-tert-butylphenol         -53.49     4.04   -10.04   -11.80
p-cresol                   -45.87     2.17   -13.94   -11.70
4-tert-butylphenol         -48.36    -1.68   -18.06   -11.70
pentachlorophenol          -42.91     0.12   -19.54   -11.60
3-5-diisopropylphenol      -46.46     0.62   -14.86   -11.20
3-4-5-trimethylphenol      -45.03     2.60   -13.24   -10.70
2-4-dichlorophenol         -36.70     7.14   -11.15   -10.10
hexafluoroisopropanol      -55.31   -10.82   -29.41   -17.70
2-2-2-trifluoroethanol     -47.15    -0.40   -16.90   -11.30
2-2-2-trichloroethanol     -48.04     1.09   -15.14    -9.50
2-2-dichloroethanol        -45.37     4.66   -10.33    -8.40
3-methoxybenzylalcohol     -51.60     2.63    -8.46    -6.10
benzylalcohol              -49.06     4.20    -8.36    -5.90
allylalcohol               -34.09    11.43    -6.81    -4.90
1-ethynyl-c-pentanol       -41.76     3.43   -12.23    -4.90
ethanol                    -33.00     7.44    -9.77    -4.30
c-hexanol                  -32.47     8.05    -9.02    -3.80
2-butanol                  -33.87     8.88    -7.04    -3.70
cyclohexanone-oxime        -40.23     4.71   -13.51    -5.80

The computed ∆Gsol values for the pKAHY29 data set are plotted against ∆Gexp in Figure 3, where the dashed line indicates perfect agreement. The correlation is clearly visible, with R² = 0.74. Some of the largest deviations from experiment occur for the strongest binders. Their structures are shown in Figure 3 along with 1-naphthol (∆Gexp = −13.20 kJ mol−1), which has the largest deviation from the regression line. The largest difference between ∆Gsol and ∆Gexp is found for hexafluoroisopropanol, the strongest binder according to the computed ∆Gsol, which is overbound by almost 12 kJ mol−1. In contrast, the errors for the other strongest binders are smaller. Interestingly, the largest deviations of ∆Gsol with respect to experiment occur for both the strongest and the weakest binders. One reason for this could be the chosen QC protocol itself. Another could be the single-conformer nature of the chosen approach: in some of the complex geometries, the NMP reference acceptor molecule is oriented differently towards the donating molecule, even with the constraint that the lone pair involved in the HB should be the one schematically depicted in Figure 1 (see the example complexes depicted in the SI). While extending the number of conformers appears to be the most plausible way forward, this has not been done in the present study, as the results are sufficient to move on to the evaluation of the QC protocol on the IMHB16 data set.


Figure 3: Calculated (∆Gsol) vs. ∆Gexp for the intermolecular pKAHY29 set. The structures of the three strongest experimental binders and 1-naphthol (largest deviation from the regression line) are shown. The dashed line indicates perfect agreement between ∆Gsol and ∆Gexp.

The results for the IMHB16 data set are given in Table 2. The identifiers of the molecules are as in Figure 2. The target ∆Gexp values range from -38 kJ mol−1 (strong IMHB) to -3 kJ mol−1 (weak IMHB), a larger span than for the intermolecular pKAHY29 set. The ∆E values reveal overbound results. The IMHB16 set consists only of unimolecular reactions, and so only weakly repulsive ∆GHO contributions are expected. Indeed, ∆G also shows overbound results in Table 2. The ∆δGsolv contributions are, somewhat unexpectedly, repulsive. The ∆Gsol values of the IMHB16 set are very close to the experimental values.


Table 2: Quantum Chemical Results and ∆Gexp for the Compounds of the IMHB16 Set. The energies are in kJ mol−1.

Identifier         ∆E       ∆G    ∆Gsol    ∆Gexp
OBC-1          -11.78   -10.86    -4.45    -3.35
OBC-2          -14.69   -13.64    -4.72    -5.02
OBC-3          -14.67   -13.68    -3.42    -5.02
OBC-4          -18.33   -17.44   -11.11    -5.44
OBC-5          -41.73   -37.44   -22.22   -26.36
OBC-6          -43.90   -39.30   -24.99   -28.03
OBC-7          -38.29   -33.89   -25.02   -33.47
OBC-8          -46.43   -37.58   -24.81   -26.78
OBC-9          -50.86   -46.42   -34.50   -38.07
OBC-10         -31.44   -28.86   -17.75   -23.43
OBC-11         -27.99   -25.91   -17.40   -24.27
OBC-12         -28.24   -25.06   -15.70   -25.94
OBC-13         -14.89   -13.81    -6.68   -10.88
OBC-14         -34.24   -30.99   -19.45   -24.69
OBC-15         -33.66   -30.71   -19.15   -22.18
OBC-16         -48.90   -42.04   -29.93   -20.08

The correlation between computation and experiment for the IMHB16 set is displayed in Figure 4, where the dashed line indicates perfect agreement. A univariate linear regression model built on ∆Gsol has a score of R² = 0.79. One compound with a relatively high error is OBC-4, 2-methoxyphenol, which is overbound by 6 kJ mol−1 and is shown in Figure 4. The reason is that the OBC-4 ∆E contribution is already too favorable when compared to OBC-1 to OBC-3, which have experimental target values comparable to that of OBC-4. A similar observation, but in the opposite direction, can be made about acetophenone, OBC-7, also shown in Figure 4. Furthermore, OBC-13 is interesting to discuss because it is the only oxime in the set, with the OH group as the acceptor. The deviation of ∆Gsol from experiment is only 4 kJ mol−1. The structures with the largest errors (OBC-12 and OBC-16, for which |∆Gsol − ∆Gexp| ≈ 10 kJ mol−1) are depicted in Figure 4. The rationalization of these two data points is similar to the intermolecular case: as the data points with the biggest errors concern the largest molecules, the inclusion of more conformers would likely improve the accuracy of this IMHB prediction approach. To ensure data consistency, for OBC-15 and OBC-16 two maximally aligned conformations of the limiting open and closed (IMHB) forms of the molecules have been evaluated; see Figure S3 in the SI.

Figure 4: Calculated (∆Gsol) vs. ∆Gexp for the IMHB16 data set. Structures with large errors vs. experiment (OBC-4, OBC-7, OBC-12, and OBC-16) are shown.

Figure 5 shows the results for both sets in one plot. It becomes apparent that with the chosen protocol, ∆Gexp can be reproduced quite successfully for both the inter- and the intramolecular cases. There are only two differences in the protocols: (i) the solvent is chloroform for the intramolecular case and CCl4 for the intermolecular case, and (ii) there is a concentration- and solvent-dependent shift of -22.64 kJ mol−1 for the intermolecular case. The magnitude of this shift for CCl4 at 1.0 M is larger than the shift of -7.90 kJ mol−1 (or -1.89 kcal mol−1) commonly applied without the $V_{\mathrm{solv}}^{\mathrm{free}}$ correction. 77,78

Figure 5: Comparison of ∆Gexp for the intermolecular pKAHY29 and the intramolecular IMHB16 data sets vs. the computed ∆Gsol values.

Table 3 shows the R² scores and errors vs. experiment of the computed ∆E, ∆G, and ∆Gsol values for both data sets. The correlation improves in both cases when the ∆GHO contributions are added. What is more, the ∆Gsol values also correlate well with experiment for both sets, although there is a slight decline in R² for the pKAHY29 set from gas phase to solution phase. The standard deviation reflects that the spread of the error follows the same trend. For pKAHY29, the ∆E results are overbound, whereas the ∆G results are underbound. This is repaired by the large ∆δGsolv contributions, including the correction for the standard state at 1.0 M in CCl4, which lead to a slightly overbound ∆Gsol with a mean error of -3.98 kJ mol−1 and an RMSE of 4.85 kJ mol−1, close to chemical accuracy (4.18 kJ mol−1). For IMHB16, the ∆E values are also overbound; however, the ∆G values are not as repulsive because IMHB formation is a unimolecular process. Therefore, these values are still slightly overbound, and only the ∆δGsolv contributions lead to results comparable to ∆Gexp. The mean error of the ∆Gsol results is only 2.61 kJ mol−1, indicating only slightly underbound results on average. The chosen approach for the modeling of IMHB by energetic comparison of an open and a closed conformation therefore seems valid.

Table 3: Statistical Error Measures vs. Experiment for the Two Data Sets in kJ mol−1 (R² is dimensionless).

Metric                         ∆E       ∆G    ∆Gsol

Intermolecular pKAHY29 set
R²                           0.63     0.78     0.74
mean error                 -35.13    12.10    -3.98
standard deviation           4.10     2.34     2.76
mean absolute error         35.13    12.10     4.22
root mean square error      35.37    12.33     4.85

Intramolecular IMHB16 set
R²                           0.75     0.77     0.79
mean error                 -11.06    -7.79     2.61
standard deviation           6.73     5.49     5.02
mean absolute error         11.06     7.90     4.68
root mean square error      12.84     9.43     5.51
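The ∆Gsol error measures for the IMHB16 set in Table 3 can be recomputed from the ∆Gsol and ∆Gexp columns of Table 2. The sketch below uses only the Python standard library; it assumes that errors are defined as computed minus experimental values and that the standard deviation is the sample standard deviation of those errors, which is consistent with the tabulated numbers but not stated explicitly in the text.

```python
import math
import statistics

# (dG_sol, dG_exp) in kJ/mol for OBC-1 ... OBC-16, transcribed from Table 2.
IMHB16 = [
    (-4.45, -3.35), (-4.72, -5.02), (-3.42, -5.02), (-11.11, -5.44),
    (-22.22, -26.36), (-24.99, -28.03), (-25.02, -33.47), (-24.81, -26.78),
    (-34.50, -38.07), (-17.75, -23.43), (-17.40, -24.27), (-15.70, -25.94),
    (-6.68, -10.88), (-19.45, -24.69), (-19.15, -22.18), (-29.93, -20.08),
]


def error_metrics(pairs):
    """Signed and unsigned error statistics for (computed, experimental) pairs."""
    errors = [calc - exp for calc, exp in pairs]
    return {
        "mean_error": statistics.mean(errors),
        "standard_deviation": statistics.stdev(errors),  # sample (n - 1) form
        "mean_absolute_error": statistics.mean([abs(e) for e in errors]),
        "rmse": math.sqrt(statistics.mean([e * e for e in errors])),
    }


metrics = error_metrics(IMHB16)
for name, value in metrics.items():
    print(f"{name}: {value:.2f} kJ/mol")
```

A positive mean error means that the computed values are on average less negative than experiment, i.e., slightly underbound.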

Univariate linear regression models based on ∆Gsol and their scores are presented in Table 4. The slope of the regression line for pKAHY29 is only 0.69 in the intermolecular case. The intercept of -0.56 kJ mol−1 is owed to the value of the slope. For IMHB16, the slope is practically 1 and the intercept indicates the slightly underbound results. The MAE and the RMSE are lower for the pKAHY29 linear regression, as the error is more systematic than for IMHB16. All in all, the linear models are just for orientation, as the MAE/RMSE values of ∆Gsol compared directly to ∆Gexp (Table 3) already reach levels that are at least close to an often desired value of 4.18 kJ mol−1 (1 kcal mol−1).

Table 4: Linear models of the form c1 ∆Gsol + d and error measures vs. experiment for the two data sets in kJ mol−1. MAE = mean absolute error, RMSE = root mean square error.

Parameter     pKAHY29   IMHB16
c1               0.69     1.02
d               -0.56    -2.19
R²               0.74     0.79
MAE              1.80     3.47
RMSE             2.20     4.85
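The IMHB16 column of Table 4 can be reproduced with an ordinary least-squares fit of ∆Gexp against ∆Gsol. The study itself used scikit-learn; the sketch below swaps in a plain-Python closed-form fit so that it runs without third-party packages, and the function name is illustrative. The data are again the ∆Gsol and ∆Gexp columns of Table 2.

```python
# (dG_sol, dG_exp) in kJ/mol for OBC-1 ... OBC-16, transcribed from Table 2.
IMHB16 = [
    (-4.45, -3.35), (-4.72, -5.02), (-3.42, -5.02), (-11.11, -5.44),
    (-22.22, -26.36), (-24.99, -28.03), (-25.02, -33.47), (-24.81, -26.78),
    (-34.50, -38.07), (-17.75, -23.43), (-17.40, -24.27), (-15.70, -25.94),
    (-6.68, -10.88), (-19.45, -24.69), (-19.15, -22.18), (-29.93, -20.08),
]


def fit_line(x, y):
    """Ordinary least squares for y = c1 * x + d; returns (c1, d, R^2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # valid for a univariate linear fit
    return slope, intercept, r_squared


dg_sol = [calc for calc, _ in IMHB16]
dg_exp = [exp for _, exp in IMHB16]
c1, d, r2 = fit_line(dg_sol, dg_exp)
print(f"c1 = {c1:.2f}, d = {d:.2f} kJ/mol, R^2 = {r2:.2f}")
```

With the Table 2 data this gives a slope near 1 and a small negative intercept, matching the IMHB16 entries of Table 4.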

This article’s findings for the intermolecular case compare well to earlier quantum chemical studies on hydrogen bond strength prediction, 38,40,79–85 of which a selection is discussed. Using a geometric angle constraint for the HB, Nocker and co-authors 38 found a correlation with R2 = 0.77 of B3LYP/aug-cc-pVDZ computed HBD strengths using the NMP-based experimental Kα scale. 19 We have not used such a constraint and arrive at R2 = 0.74 for the ∆Gsol values. R2 scores of up to 0.92 are reported by Clark and co-workers using up to two angular constraints and different density functional/basis set combinations. 40 However, these scores are only reported for zero-point-energy corrected gas-phase values and thus the influence of entropic contributions on the results remains unknown. A score of R2 = 0.77 for the correlation of MP2/aug-cc-pVTZ computed vs. experimental HB enthalpies is reported 84 by the group of Graton who are also the authors of the pKAHY experimental study. In a study from 2018, Rosenberg finds that the entropic contributions make the formation of H-bonded complexes theoretically endergonic. 85 It is known from the QC modeling of the thermodynamics of supramolecular chemistry 50 that the ∆GHO contributions are generally repulsive and that solvation contributions can also be repulsive. DFT studies of supramolecular binding affinities have revealed the need to model accurately the intricate balance of attractive (∆E, including dispersion corrections) and repulsive contributions (∆GHO and sometimes ∆δGsolv ). 51,86,87 In the case of large supramolecular complexes, that balance often 16

ACS Paragon Plus Environment

Page 16 of 37

Page 17 of 37 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

results in a negative final ∆Gsol due to the large attractive dispersion contributions outweighing the ∆GHO type contributions. For pKAHY 29, the average ∆E is -46 kJ mol−1, the average ∆GHO is +47 kJ mol−1, and the average ∆δGsolv is -16 kJ mol−1 when including the shift for standard conditions at 1.0 M in CCl4 solution, rendering the average 1:1 complex with NMP bound. The main difference to the supramolecular case is the smaller, almost negligible dispersion contribution to complex formation. Since IMHBs are unimolecular, the ∆GHO effects nearly cancel out, which Rosenberg also notes in his study. 85 This is reflected in the average individual contributions for the IMHB16 set, which are as follows: the average ∆E is -31 kJ mol−1 (same tendency as the intermolecular case, strongly bound), the average ∆GHO is +3 kJ mol−1 (a negligibly small value), and the average ∆δGsolv is +10 kJ mol−1. The IMHB average ∆E compares well with literature values from various sources, 88–96 of which a few examples are discussed: While the jury is still out on whether an IMHB is truly formed in 2-fluorophenol (OBC-1), 95 the ∆E values of Rosenberg (approximated CCSD(T)/aug-cc-pVTZ) 85 and this study are exactly the same. For peptides, molecular tailoring approaches place average IMHB energies around -16 to -24 kJ mol−1 (B3LYP/6-311++G(d,p) level), 91 whereas a force-field-like approach places them at -24 to -32 kJ mol−1, 92 which is comparable with the IMHB16 ∆E values with a carbonyl acceptor and a heteroaromatic nitrogen donor (e.g., OBC-15). To arrive at observable populations, however, one must go beyond computing ∆E, which usually overestimates the IMHB energy. 2-Fluorophenol (OBC-1) again serves as a good example. Experimental estimates of the IMHB energy range between -2 and -6 kJ mol−1, depending on the solvent. 95,97 Our computed ∆Gsol value is -4 kJ mol−1 in chloroform (OBC-1 individual contributions in kJ mol−1: ∆E = −11.78, ∆GHO = 0.92, ∆δGsolv = 6.41), in close agreement with the IMHB16 reference value in chloroform of -3 kJ mol−1. 46 It is also the repulsive solvation contributions that push the mean error vs experiment of the IMHB16 ∆Gsol values close to zero. This is another indication that the balance between the individual contributions is correct for IMHB modeling.
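The OBC-1 numbers quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the author's workflow: it assumes that the three contributions are simply additive, that ∆Gsol is defined as closed minus open conformer, and that a two-state Boltzmann model at 298.15 K is adequate for estimating a population. The function names, the temperature, and the two-state model are illustrative assumptions.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def delta_g_sol(delta_e, delta_g_ho, delta_dg_solv):
    """Sum the three contributions (kJ/mol), assuming simple additivity."""
    return delta_e + delta_g_ho + delta_dg_solv


def closed_population(delta_g, temperature=298.15):
    """Two-state Boltzmann population of the closed (hydrogen-bonded) conformer.

    delta_g is taken as G(closed) - G(open) in kJ/mol (an assumed sign
    convention); negative values favor the closed form.
    """
    return 1.0 / (1.0 + math.exp(delta_g / (R_KJ * temperature)))


# OBC-1 contributions in chloroform as quoted in the text (kJ/mol)
obc1 = delta_g_sol(-11.78, 0.92, 6.41)
print(f"OBC-1 dGsol = {obc1:.2f} kJ/mol")
print(f"closed-conformer population = {closed_population(obc1):.2f}")
```

Under these assumptions the sum reproduces the rounded value of -4 kJ mol−1 given in the text, and it illustrates why ∆E alone (about -12 kJ mol−1 here) would overstate the preference for the closed form.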


Lastly, any improvement of the modeling quality for both inter- and intramolecular HB will most likely need to include fully representative sets of conformations for all molecules. One of the problems with such an endeavor is that the conformational spaces of even molecules as small as those in the pKAHY 29 and IMHB16 data sets may grow very large. Therefore, in order to conduct high-level ∆E computations, an efficient method to sample the conformational space and evaluate the representative structures is needed. Very recently, such a method has been presented by Grimme 98 based on the semi-empirical GFN-xTB family of methods. 99
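One common way to fold a set of conformers into a single free energy is Boltzmann weighting. The sketch below illustrates that general technique only; it is not the sampling method of ref 98 and is not taken from this study. The conformer energies are hypothetical, and the temperature of 298.15 K is an assumption.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def boltzmann_weights(energies, temperature=298.15):
    """Normalized Boltzmann weights for conformer free energies in kJ/mol."""
    rt = R_KJ * temperature
    e_min = min(energies)  # shift by the minimum for numerical stability
    factors = [math.exp(-(e - e_min) / rt) for e in energies]
    total = sum(factors)
    return [f / total for f in factors]


def ensemble_free_energy(energies, temperature=298.15):
    """Ensemble free energy, G = -RT ln sum_i exp(-G_i / RT), in kJ/mol."""
    rt = R_KJ * temperature
    e_min = min(energies)
    partition = sum(math.exp(-(e - e_min) / rt) for e in energies)
    return e_min - rt * math.log(partition)


# Hypothetical relative conformer free energies (kJ/mol), illustration only
conformers = [0.0, 2.5, 6.0]
weights = boltzmann_weights(conformers)
g_ens = ensemble_free_energy(conformers)
print([round(w, 3) for w in weights], round(g_ens, 3))
```

The ensemble value always lies at or below the lowest single-conformer free energy, so neglecting low-lying conformers biases a computed ∆G; how large that bias is for the data sets discussed here is not something this sketch can answer.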

Conclusions

Dispersion-corrected DFT computations of the Gibbs free energies in solution ∆Gsol compare well to experimental values ∆Gexp for both inter- and intramolecular HB. The following three major conclusions can be drawn from the computations on the intermolecular pKAHY 29 and the intramolecular IMHB16 data sets:

1. Computations of ∆Gsol for both inter- and intramolecular HB strengths yield a correlation to experiment, with R2 = 0.74 for pKAHY 29 and R2 = 0.79 for IMHB16.

2. The ∆Gsol values of intermolecular HB result in an RMSE of ≈ 5 kJ mol−1 vs. ∆Gexp. The inclusion of all energetic contributions, i.e., ∆E, ∆GHO, and ∆δGsolv including a large shift of almost -23 kJ mol−1 to account for standard conditions in CCl4, is vital for this excellent correspondence between predicted and measured ∆G.

3. Similarly, the ∆Gsol values for IMHB correspond well to ∆Gexp values (RMSE ≈ 6 kJ mol−1), indicating that the energetic comparison of an open and a closed conformation is a good model. The balance of the individual energetic contributions differs from the intermolecular case because the reaction is unimolecular.

These findings may contribute to building a bridge between the fields of cheminformatics and quantum chemistry by:

(i) the computation of databases containing intermolecular HB strengths, building scales on various standard reference molecules such as 4-fluorophenol or NMP in continuation of existing literature. ∆Gsol is a good approximation for the HB strength of 1:1 complexes. A first step in this direction has already been taken.

(ii) the computation of a database containing ∆Gsol values for IMHB-forming molecules. A large database of IMHB strengths is currently not available. Research in our laboratory is going in this direction.
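The two figures of merit quoted in the conclusions, RMSE and R2, can be computed as in the following minimal sketch. It assumes that R2 denotes the squared Pearson correlation between computed and experimental values, which the text does not state explicitly. The input values are hypothetical and are not data from this study.

```python
import math


def rmse(calc, exp):
    """Root-mean-square error between computed and experimental values."""
    return math.sqrt(sum((c - e) ** 2 for c, e in zip(calc, exp)) / len(exp))


def r_squared(calc, exp):
    """Squared Pearson correlation coefficient between the two series."""
    n = len(exp)
    mean_c = sum(calc) / n
    mean_e = sum(exp) / n
    cov = sum((c - mean_c) * (e - mean_e) for c, e in zip(calc, exp))
    var_c = sum((c - mean_c) ** 2 for c in calc)
    var_e = sum((e - mean_e) ** 2 for e in exp)
    return cov * cov / (var_c * var_e)


# Hypothetical free energies in kJ/mol, illustration only (not from the paper)
dg_calc = [-4.0, -10.0, -15.0, -21.0]
dg_exp = [-3.0, -12.0, -14.0, -20.0]

print(f"RMSE = {rmse(dg_calc, dg_exp):.2f} kJ/mol")
print(f"R2   = {r_squared(dg_calc, dg_exp):.2f}")
```

The two measures answer different questions: RMSE reports absolute deviations in kJ mol−1, whereas R2 is insensitive to a constant offset, so a systematic shift such as a standard-state correction would change the former but not the latter.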

Acknowledgement

The author is supported by the Bergen Research Foundation (BFS) grant number BFS2017TMT01. The author thanks Prof. Johannes Kirchmair of the University of Bergen, Dr. Andreas Göller of Bayer, and Prof. Stefan Grimme of the University of Bonn for valuable discussions. The author thanks reviewer 1 for their comments, yielding a major improvement of this manuscript.

Supporting Information Available

The following files are available free of charge.

• pKHAY29.3D.sdf: pKAHY 29 data set: geometries of molecules, complexes and standard acceptor NMP, quantum chemical absolute energies
• IMHB16.3D.sdf: IMHB16 data set: geometries of open and closed conformers, quantum chemical absolute energies
• Documentation.pdf: Documentation of the two data sets.


References

(1) Arunan, E.; Desiraju, G. R.; Klein, R. A.; Sadlej, J.; Scheiner, S.; Alkorta, I.; Clary, D. C.; Crabtree, R. H.; Dannenberg, J. J.; Hobza, P.; Kjaergaard, H. G.; Legon, A. C.; Mennucci, B.; Nesbitt, D. J. Definition of the hydrogen bond (IUPAC Recommendations 2011). Pure Appl. Chem. 2011, 83, 1637–1641.
(2) Hornby, D. FEBS Lett.; Springer Berlin Heidelberg: Berlin, Heidelberg, 1993; Vol. 323; pp 295–295.
(3) Herschlag, D.; Pinney, M. M. Hydrogen Bonds: Simple after All? Biochemistry 2018, 57, 3338–3352.
(4) Laurence, C.; Berthelot, M. Observations on the strength of hydrogen bonding. Perspect. Drug Discov. Des. 2000, 18, 39–60.
(5) Hessler, G. In Angew. Chemie; Böhm, H., Schneider, G., Eds.; Wiley VCH Verlag GmbH & Co. KGaA: Weinheim, 2003; Vol. 116; pp 148–148.
(6) Bissantz, C.; Kuhn, B.; Stahl, M. A Medicinal Chemist’s Guide to Molecular Interactions. J. Med. Chem. 2010, 53, 5061–5084.
(7) Hamaguchi, W.; Masuda, N.; Miyamoto, S.; Shiina, Y.; Kikuchi, S.; Mihara, T.; Moriguchi, H.; Fushiki, H.; Murakami, Y.; Amano, Y.; Honbou, K.; Hattori, K. Synthesis, SAR study, and biological evaluation of novel quinoline derivatives as phosphodiesterase 10A inhibitors with reduced CYP3A4 inhibition. Bioorganic Med. Chem. 2015, 23, 297–313.
(8) Lawhorn, B. G.; Philp, J.; Graves, A. P.; Holt, D. A.; Gatto, G. J.; Kallander, L. S. Substituent Effects on Drug-Receptor H-bond Interactions: Correlations Useful for the Design of Kinase Inhibitors. J. Med. Chem. 2016, 59, 10629–10641.


(9) Helal, C. J.; Arnold, E.; Boyden, T.; Chang, C.; Chappie, T. A.; Fisher, E.; Hajos, M.; Harms, J. F.; Hoffman, W. E.; Humphrey, J. M.; Pandit, J.; Kang, Z.; Kleiman, R. J.; Kormos, B. L.; Lee, C. W.; Lu, J.; Maklad, N.; McDowell, L.; McGinnis, D.; O’Connor, R. E.; O’Donnell, C. J.; Ogden, A.; Piotrowski, M.; Schmidt, C. J.; Seymour, P. A.; Ueno, H.; Vansell, N.; Verhoest, P. R.; Yang, E. X. Identification of a Potent, Highly Selective, and Brain Penetrant Phosphodiesterase 2A Inhibitor Clinical Candidate. J. Med. Chem. 2018, 61, 1001–1018.
(10) Caron, G.; Kihlberg, J.; Ermondi, G. Intramolecular hydrogen bonding: An opportunity for improved design in medicinal chemistry. Med. Res. Rev. 2019, 1–23.
(11) Giordanetto, F.; Tyrchan, C.; Ulander, J. Intramolecular Hydrogen Bond Expectations in Medicinal Chemistry. ACS Med. Chem. Lett. 2017, 8, 139–142.
(12) Shalaeva, M.; Caron, G.; Abramov, Y. A.; O’Connell, T. N.; Plummer, M. S.; Yalamanchi, G.; Farley, K. A.; Goetz, G. H.; Philippe, L.; Shapiro, M. J. Integrating intramolecular hydrogen bonding (IMHB) considerations in drug discovery using ∆logP as a tool. J. Med. Chem. 2013, 56, 4870–4879.
(13) Caron, G.; Vallaro, M.; Ermondi, G. High throughput methods to measure the propensity of compounds to form intramolecular hydrogen bonding. Medchemcomm 2017, 8, 1143–1151.
(14) Caron, G.; Vallaro, M.; Ermondi, G. Log P as a tool in intramolecular hydrogen bond considerations. Drug Discov. Today Technol. 2018, 27, 65–70.
(15) Karamertzanis, P. G.; Day, G. M.; Welch, G. W.; Kendrick, J.; Leusen, F. J.; Neumann, M. A.; Price, S. L. Modeling the interplay of inter- and intramolecular hydrogen bonding in conformational polymorphs. J. Chem. Phys. 2008, 128.
(16) Seo, J.; Yoon, H. J.; Shin, S. K. Effects of intramolecular hydrogen bonds on the collision-induced dissociation of tryptic peptide ions. Int. J. Mass Spectrom. 2019, 435, 272–279.
(17) Taft, R. W.; Gurka, D.; Joris, L.; Schleyer, P. R.; Rakshys, J. W. Studies of Hydrogen-Bonded Complex Formation with p-Fluorophenol. V. Linear Free Energy Relationships with OH Reference Acids. J. Am. Chem. Soc. 1969, 91, 4801–4808.
(18) Kamlet, M. J.; Taft, R. W. The Solvatochromic Comparison Method. I. The β-Scale Of Solvent Hydrogen-Bond Acceptor (HBA) Basicities. J. Am. Chem. Soc. 1976, 98, 377–383.
(19) Abraham, M. H.; Duce, P. P.; Prior, D. V.; Barratt, D. G.; Morris, J. J.; Taylor, P. J. Hydrogen bonding. Part 9. Solute proton donor and proton acceptor scales for use in drug design. J. Chem. Soc. Perkin Trans. 1989, 2, 1355–1375.
(20) Abraham, M. H.; Grellier, P. L.; Prior, D. V.; Morris, J. J.; Taylor, P. J. Hydrogen bonding. Part 10. A scale of solute hydrogen-bond basicity using log K values for complexation in tetrachloromethane. J. Chem. Soc. Perkin Trans. 1990, 2, 521.
(21) Abraham, M. H. Hydrogen bonding. 31. Construction of a scale of solute effective or summation hydrogen-bond basicity. J. Phys. Org. Chem. 1993, 6, 660–684.
(22) Abraham, M. H. Scales of solute hydrogen-bonding: Their construction and application to physicochemical and biochemical processes. Chem. Soc. Rev. 1993, 22, 73–83.
(23) Abraham, M. H.; Abraham, R. J.; Byrne, J.; Griffiths, L. NMR method for the determination of solute hydrogen bond acidity. J. Org. Chem. 2006, 71, 3389–3394.
(24) Ruggiu, F.; Solov’Ev, V.; Marcou, G.; Horvath, D.; Graton, J.; Le Questel, J. Y.; Varnek, A. Individual hydrogen-bond strength QSPR modelling with ISIDA local descriptors: A step towards polyfunctional molecules. Mol. Inform. 2014, 33, 477–487.


(25) Glavatskikh, M.; Madzhidov, T.; Solov’ev, V.; Marcou, G.; Horvath, D.; Varnek, A. Predictive Models for the Free Energy of Hydrogen Bonded Complexes with Single and Cooperative Hydrogen Bonds. Mol. Inform. 2016, 35, 629–638.
(26) Reiher, M.; Sellmann, D.; Hess, B. A. Stabilization of diazene in Fe(II)-sulfur model complexes relevant for nitrogenase activity. I. A new approach to the evaluation of intramolecular hydrogen bond energies. Theor. Chem. Acc. 2001, 106, 379–392.
(27) Thar, J.; Kirchner, B. Hydrogen bond detection. J. Phys. Chem. A 2006, 110, 4229–4237.
(28) Kenny, P. W. Hydrogen bonding, electrostatic potential, and molecular design. J. Chem. Inf. Model. 2009, 49, 1234–1244.
(29) Schwöbel, J.; Ebert, R. U.; Kühne, R.; Schüürmann, G. Prediction of the intrinsic hydrogen bond acceptor strength of chemical substances from molecular structure. J. Phys. Chem. A 2009, 113, 10104–10112.
(30) Schwöbel, J.; Ebert, R. U.; Kühne, R.; Schüürmann, G. Prediction of the intrinsic hydrogen bond acceptor strength of chemical substances from molecular structure. J. Phys. Chem. A 2009, 113, 10104–10112.
(31) Klamt, A.; Reinisch, J.; Eckert, F.; Hellweg, A.; Diedenhofen, M. Polarization charge densities provide a predictive quantification of hydrogen bond energies. Phys. Chem. Chem. Phys. 2012, 14, 955–963.
(32) Klamt, A.; Reinisch, J.; Eckert, F.; Graton, J.; Le Questel, J. Y. Interpretation of experimental hydrogen-bond enthalpies and entropies from COSMO polarisation charge densities. Phys. Chem. Chem. Phys. 2013, 15, 7147–7154.
(33) Green, A. J.; Popelier, P. L. Theoretical prediction of hydrogen-bond basicity pKBHX using quantum chemical topology descriptors. J. Chem. Inf. Model. 2014, 54, 553–561.


(34) Kenny, P. W.; Montanari, C. A.; Prokopczyk, I. M.; Ribeiro, J. F.; Sartori, G. R. Hydrogen Bond Basicity Prediction for Medicinal Chemistry Design. J. Med. Chem. 2016, 59, 4278–4288.
(35) Zheng, S.; Xu, S.; Wang, G.; Tang, Q.; Jiang, X.; Li, Z.; Xu, Y.; Wang, R.; Lin, F. Proposed Hydrogen-Bonding Index of Donor or Acceptor Reflecting Its Intrinsic Contribution to Hydrogen-Bonding Strength. J. Chem. Inf. Model. 2017, 57, 1535–1547.
(36) Bauer, C. A.; Schneider, G.; Göller, A. H. Gaussian Process Regression Models for the Prediction of Hydrogen Bond Acceptor Strengths. Mol. Inform. 2019, 38, 1800115.
(37) Kaminski, G. A.; Maple, J. R.; Murphy, R. B.; Braden, D. A.; Friesner, R. A. Methods for Computation of Hydrogen Bonding Energies of Molecular Pairs. 2005, 248–254.
(38) Nocker, M.; Handschuh, S.; Tautermann, C.; Liedl, K. R. Theoretical prediction of hydrogen bond strength for use in molecular modeling. J. Chem. Inf. Model. 2009, 49, 2067–2076.
(39) Schneebeli, S. T.; Bochevarov, A. D.; Friesner, R. A. Parameterization of a B3LYP Specific Correction for Non-covalent Interaction Energies. J. Chem. Theory Comput. 2011, 7, 658–668.
(40) El Kerdawy, A.; Tautermann, C. S.; Clark, T.; Fox, T. Economical and accurate protocol for calculating hydrogen-bond-acceptor strengths. J. Chem. Inf. Model. 2013, 53, 3262–3272.
(41) Laurence, C.; Brameld, K. A.; Graton, J.; Le Questel, J.-Y.; Renault, E. The pKBHX Database: Toward a Better Understanding of Hydrogen-Bond Basicity for Medicinal Chemists. J. Med. Chem. 2009, 52, 4073–4086.
(42) Graton, J.; Besseau, F.; Brossard, A. M.; Charpentier, E.; Deroche, A.; Le Questel, J. Y. Hydrogen-bond acidity of OH groups in various molecular environments (phenols, alcohols, steroid derivatives, and amino acids structures): Experimental measurements and density functional theory calculations. J. Phys. Chem. A 2013, 117, 13184–13193.
(43) Hubbard, T. A.; Brown, A. J.; Bell, I. A.; Cockroft, S. L. The Limit of Intramolecular H-Bonding. J. Am. Chem. Soc. 2016, 138, 15114–15117.
(44) Gränacher, I. Einfluss der Wasserstoffbrückenbindung auf das Kernresonanzspektrum von Phenolen. Helv. Phys. Acta 1961, 34, 272–302.
(45) Schaefer, T. Relation between hydroxyl proton chemical shifts and torsional frequencies in some ortho-substituted phenol derivatives. J. Phys. Chem. 2005, 79, 1888–1890.
(46) Afonin, A. V.; Vashchenko, A. V.; Sigalov, M. V. Estimating the energy of intramolecular hydrogen bonds from 1H NMR and QTAIM calculations. Org. Biomol. Chem. 2016, 14, 11199–11211.
(47) Yurenko, Y. P.; Zhurakivsky, R. O.; Ghomi, M.; Samijlenko, S. P.; Hovorun, D. M. Ab initio comprehensive conformational analysis of 2′-deoxyuridine, the biologically significant DNA minor nucleoside, and reconstruction of its low-temperature matrix infrared spectrum. J. Phys. Chem. B 2008, 112, 1240–1250.
(48) Mata, R. A.; Suhm, M. A. Benchmarking Quantum Chemical Methods: Are We Heading in the Right Direction? Angew. Chemie Int. Ed. 2017, 56, 11011–11018.
(49) Grimme, S.; Hansen, A.; Brandenburg, J. G.; Bannwarth, C. Dispersion-Corrected Mean-Field Electronic Structure Methods. Chem. Rev. 2016, 116, 5105–5154.
(50) Grimme, S. Supramolecular binding thermodynamics by dispersion-corrected density functional theory. Chem. - A Eur. J. 2012, 18, 9955–9964.
(51) Sure, R.; Grimme, S. Comprehensive Benchmark of Association (Free) Energies of Realistic Host-Guest Complexes. J. Chem. Theory Comput. 2015, 11, 3785–3801.


(52) Sure, R.; Grimme, S. Comprehensive Benchmark of Association (Free) Energies of Realistic Host-Guest Complexes. J. Chem. Theory Comput. 2015, 11, 3785–3801.
(53) Whitesides, G. M.; Mammen, M.; Shakhnovich, E. I.; Deutch, J. M. Estimating the entropic cost of self-assembly of multiparticle hydrogen-bonded aggregates based on the cyanuric acid center dot melamine lattice. J. Org. Chem. 1998, 63, 3821–3830.
(54) The RDKit: Open-Source Cheminformatics Software, version 2018.09.1, https://www.rdkit.org, accessed January 15, 2019.
(55) Riniker, S.; Landrum, G. A. Better Informed Distance Geometry: Using What We Know to Improve Conformation Generation. J. Chem. Inf. Model. 2015, 55, 2562–2574.
(56) Halgren, T. A. Merck molecular force field. V. Extension of MMFF94 using experimental data, additional computational data, and empirical rules. J. Comput. Chem. 1996, 17, 616–641.
(57) Halgren, T. A. Merck molecular force field. II. MMFF94 van der Waals and electrostatic parameters for intermolecular interactions. J. Comput. Chem. 1996, 17, 520–552.
(58) Halgren, T. A.; Nachbar, R. B. Merck molecular force field. IV. Conformational energies and geometries for MMFF94. J. Comput. Chem. 1996, 17, 587–615.
(59) Halgren, T. A. Merck molecular force field. III. Molecular geometries and vibrational frequencies for MMFF94. J. Comput. Chem. 1996, 17, 553–586.
(60) Halgren, T. A. MMFF VII. Characterization of MMFF94, MMFF94s, and other widely available force fields for conformational energies and for intermolecular-interaction energies and geometries. J. Comput. Chem. 1999, 20, 730–748.
(61) Halgren, T. A. MMFF VI. MMFF94s option for energy minimization studies. J. Comput. Chem. 1999, 20, 720–729.


(62) Tosco, P.; Stiefl, N.; Landrum, G. Bringing the MMFF force field to the RDKit: Implementation and validation. J. Cheminform. 2014, 6, 4–7.
(63) Tao, J.; Perdew, J. P.; Staroverov, V. N.; Scuseria, G. E. Climbing the density functional ladder: Nonempirical meta–generalized gradient approximation designed for molecules and solids. Phys. Rev. Lett. 2003, 91, 14601.
(64) Grimme, S.; Antony, J.; Ehrlich, S.; Krieg, H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. J. Chem. Phys. 2010, 132.
(65) Grimme, S.; Ehrlich, S.; Goerigk, L. Effect of the damping function in dispersion corrected density functional theory. J. Comput. Chem. 2011, 32, 1456–1465.
(66) Becke, A. D.; Johnson, E. R. A density-functional model of the dispersion interaction. J. Chem. Phys. 2005, 123, 154101.
(67) Weigend, F.; Ahlrichs, R. Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for H to Rn: Design and assessment of accuracy. Phys. Chem. Chem. Phys. 2005, 7, 3297–3305.
(68) Zhao, Y.; Truhlar, D. G. Design of density functionals that are broadly accurate for thermochemistry, thermochemical kinetics, and nonbonded interactions. J. Phys. Chem. A 2005, 109, 5656–5667.
(69) Marenich, A. V.; Cramer, C. J.; Truhlar, D. G. Universal solvation model based on solute electron density and on a continuum model of the solvent defined by the bulk dielectric constant and atomic surface tensions. J. Phys. Chem. B 2009, 113, 6378–6396.
(70) Becke, A. D. Density-functional exchange-energy approximation with correct asymptotic behavior. Phys. Rev. A 1988, 38, 3098–3100.


(71) Perdew, J. P. Erratum: Density-functional approximation for the correlation energy of the inhomogeneous electron gas. Phys. Rev. B 1986, 34, 7406.
(72) Perdew, J. P. Erratum: Density-functional approximation for the correlation energy of the inhomogeneous electron gas [Phys. Rev. B 33, 8822 (1986)]. Phys. Rev. B 1986, 34, 7406.
(73) Funes-Ardoiz, I.; Paton, R. S. GoodVibes: GoodVibes v2.0.3, https://doi.org/10.5281/zenodo.595246. 2016.
(74) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Petersson, G. A.; Nakatsuji, H.; Li, X.; Caricato, M.; Marenich, A. V.; Bloino, J.; Janesko, B. G.; Gomperts, R.; Mennucci, B.; Hratchian, H. P.; Ortiz, J. V.; Izmaylov, A. F.; Sonnenberg, J. L.; Williams-Young, D.; Ding, F.; Lipparini, F.; Egidi, F.; Goings, J.; Peng, B.; Petrone, A.; Henderson, T.; Ranasinghe, D.; Zakrzewski, V. G.; Gao, J.; Rega, N.; Zheng, G.; Liang, W.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Vreven, T.; Throssell, K.; Montgomery, J. A., Jr.; Peralta, J. E.; Ogliaro, F.; Bearpark, M. J.; Heyd, J. J.; Brothers, E. N.; Kudin, K. N.; Staroverov, V. N.; Keith, T. A.; Kobayashi, R.; Normand, J.; Raghavachari, K.; Rendell, A. P.; Burant, J. C.; Iyengar, S. S.; Tomasi, J.; Cossi, M.; Millam, J. M.; Klene, M.; Adamo, C.; Cammi, R.; Ochterski, J. W.; Martin, R. L.; Morokuma, K.; Farkas, O.; Foresman, J. B.; Fox, D. J. Gaussian 16, Revision B.01. 2016.
(75) Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Müller, A.; Nothman, J.; Louppe, G.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; Vanderplas, J.; Passos, A.; Cournapeau, D.; Brucher, M.; Perrot, M.; Duchesnay, É. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2012, 12, 2825–2830.
(76) Hounshell, W. D.; Dalby, A.; Nourse, J. G.; Gushurst, A. K. I.; Leland, B. A.; Grier, D. L.; Laufer, J. Description of several chemical structure file formats used by computer programs developed at Molecular Design Limited. J. Chem. Inf. Model. 2005, 32, 244–255.
(77) Ochterski, J. W. Thermochemistry in Gaussian. 2000; http://www.gaussian.com/g_whitepap/thermo/thermo.pdf.
(78) Reimers, J. R.; Panduwinata, D.; Visser, J.; Chin, Y.; Tang, C.; Goerigk, L.; Ford, M. J.; Sintic, M.; Sum, T.-J.; Coenen, M. J. J.; Hendriksen, B. L. M.; Elemans, J. A. A. W.; Hush, N. S.; Crossley, M. J. A priori calculations of the free energy of formation from solution of polymorphic self-assembled monolayers. Proc. Natl. Acad. Sci. 2015, 112, E6101–E6110.
(79) Fonseca Guerra, C.; Bickelhaupt, F. M.; Snijders, J. G.; Baerends, E. J. Hydrogen bonding in DNA base pairs: Reconciliation of theory and experiment. J. Am. Chem. Soc. 2000, 122, 4117–4128.
(80) Morozov, A. V.; Kortemme, T.; Tsemekhman, K.; Baker, D. Close agreement between the orientation dependence of hydrogen bonds observed in protein structures and quantum mechanical calculations. Proc. Natl. Acad. Sci. 2004, 101, 6946–6951.
(81) Hao, M. H. Theoretical calculation of hydrogen-bonding strength for drug molecules. J. Chem. Theory Comput. 2006, 2, 863–872.
(82) Besseau, F.; Graton, J.; Berthelot, M. A theoretical evaluation of the pKHB and ∆HHB hydrogen-bond scales of nitrogen bases. Chem. - A Eur. J. 2008, 14, 10656–10669.
(83) Wendler, K.; Thar, J.; Zahn, S.; Kirchner, B. Estimating the hydrogen bond energy. J. Phys. Chem. A 2010, 114, 9529–9536.
(84) Koné, M.; Illien, B.; Laurence, C.; Graton, J. Can quantum-mechanical calculations yield reasonable estimates of hydrogen-bonding acceptor strength? The case of hydrogen-bonded complexes of methanol. J. Phys. Chem. A 2011, 115, 13975–13985.
(85) Rosenberg, R. E. The Strength of Hydrogen Bonds between Fluoro-Organics and Alcohols, a Theoretical Study. J. Phys. Chem. A 2018, 122, 4521–4529.
(86) Sure, R.; Antony, J.; Grimme, S. Blind prediction of binding affinities for charged supramolecular host-guest systems: Achievements and shortcomings of DFT-D3. J. Phys. Chem. B 2014, 118, 3431–3440.
(87) Antony, J.; Sure, R.; Grimme, S. Using dispersion-corrected density functional theory to understand supramolecular binding thermodynamics. Chem. Commun. 2015, 51, 1764–1774.
(88) Dietrich, S. W.; Jorgensen, E. C.; Kollman, P. A.; Rothenberg, S. A Theoretical Study of Intramolecular Hydrogen Bonding in Ortho-Substituted Phenols and Thiophenols. J. Am. Chem. Soc. 1976, 98, 8310–8324.
(89) Deshmukh, M. M.; Gadre, S. R.; Bartolotti, L. J. Estimation of intramolecular hydrogen bond energy via molecular tailoring approach. J. Phys. Chem. A 2006, 110, 12519–12523.
(90) Jabłoński, M.; Kaczmarek, A.; Sadlej, A. J. Estimates of the energy of intramolecular hydrogen bonds. J. Phys. Chem. A 2006, 110, 10890–10898.
(91) Deshmukh, M. M.; Gadre, S. R. Estimation of N-H···O=C Intramolecular hydrogen bond energy in polypeptides. J. Phys. Chem. A 2009, 113, 7927–7932.
(92) Sun, C. L.; Wang, C. S. Estimation on the intramolecular hydrogen-bonding energies in proteins and peptides by the analytic potential energy function. J. Mol. Struct. THEOCHEM 2010, 956, 38–43.


(93) Rusinska-Roszak, D.; Sowinski, G. Estimation of the intramolecular O-H···O=C hydrogen bond energy via the molecular tailoring approach. Part I: Aliphatic structures. J. Chem. Inf. Model. 2014, 54, 1963–1977.
(94) Rusinska-Roszak, D. Intramolecular O-H···O=C hydrogen bond energy via the molecular tailoring approach to RAHB structures. J. Phys. Chem. A 2015, 119, 3674–3687.
(95) Abraham, M. H.; Abraham, R. J.; Aliev, A. E.; Tormena, C. F. Is there an intramolecular hydrogen bond in 2-halophenols? A theoretical and spectroscopic investigation. Phys. Chem. Chem. Phys. 2015, 17, 25151–25159.
(96) Karas, L. J.; Batista, P. R.; Viesser, R. V.; Tormena, C. F.; Rittner, R.; De Oliveira, P. R. Trends of intramolecular hydrogen bonding in substituted alcohols: A deeper investigation. Phys. Chem. Chem. Phys. 2017, 19, 16904–16913.
(97) Carlson, G. L.; Fateley, W. G.; Manocha, A. S.; Bentley, F. F. Torsional frequencies and enthalpies of intramolecular hydrogen bonds of o-halophenols. J. Phys. Chem. 1972, 76, 1553–1557.
(98) Grimme, S. Exploration of Chemical Compound, Conformer, and Reaction Space with Meta-Dynamics Simulations Based on Tight-Binding Quantum Chemical Calculations. J. Chem. Theory Comput. 2019, 15, 2847–2862.
(99) Grimme, S.; Bannwarth, C.; Shushkov, P. A Robust and Accurate Tight-Binding Quantum Chemical Method for Structures, Vibrational Frequencies, and Noncovalent Interactions of Large Molecular Systems Parametrized for All spd-Block Elements (Z = 1-86). J. Chem. Theory Comput. 2017, 13, 1989–2009.


Graphical TOC Entry

[Graphic: two structure sketches with HB donor and HB acceptor sites marked, shown side by side ("vs."). Annotations: ΔGsol,QC vs. exp RMSE = 5.5 kJ mol−1; ΔGsol,QC vs. exp RMSE = 4.9 kJ mol−1.]

[Figure: structure sketches in panels a, b, and c with HB donor and HB acceptor sites marked.]

[Figure: chemical structures of the IMHB16 molecules, labeled OBC-1 through OBC-16.]

[Figure: chemical structures of chloro- and trifluoromethyl-substituted hydroxyl compounds.]

[Figure: chemical structures labeled OBC-4, OBC-7, OBC-12, and OBC-16.]