Subscriber access provided by Imperial College London | Library
Article
The hpCADD NDDO-Hamiltonian: Parameterization Heike B. Thomas, Matthias Hennemann, Patrick Kibies, Franziska Hoffgaard, Stefan Gussregen, Gerhard Hessler, Stefan M. Kast, and Timothy Clark J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.7b00080 • Publication Date (Web): 12 Jul 2017 Downloaded from http://pubs.acs.org on July 24, 2017
Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.
Journal of Chemical Information and Modeling is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.
Page 1 of 39
Journal of Chemical Information and Modeling
1 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
The hpCADD NDDO-Hamiltonian: Parameterization Heike B. Thomas,a Matthias Hennemann,a Patrick Kibies,b Franziska Hoffgaard,b Stefan Güssregen,c Gerhard Hessler,c Stefan M. Kast,b and Timothy Clarka,* a
Computer-Chemie-Centrum, Department Chemie und Pharmazie, Friedrich-
Alexander-Universität Erlangen-Nürnberg, Nägelsbachstr. 25, 91052 Erlangen, Germany. b
Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a,
44227 Dortmund, Germany c
R&D, IDD, Structural Design and Informatics, Sanofi-Aventis Deutschland GmbH,
Industriepark Höchst, 65926 Frankfurt am Main, Germany.
*Corresponding author (
[email protected])
Abstract: A Neglect of Diatomic Differential Overlap (NDDO) Hamiltonian has been parameterized as an electronic component of a polarizable force field. Coulomb and exchange potentials derived directly from the NDDO Hamiltonian in principle can be used with classical potentials, thus forming the basis for a new generation of efficiently applicable multipolar polarizable force fields. The new hpCADD Hamiltonian uses force-field like atom types and reproduces the electrostatic properties (dipole moment, molecular electrostatic potential) and Koopmans’ Theorem ionization potentials closely, as demonstrated for a large training set and an independent test set of small molecules. The Hamiltonian is not intended to reproduce geometries or total energies well as these will be controlled by the classical force-field potentials. In order to establish the hpCADD Hamiltonian as electronic component in force field-based calculations, we test its performance in combination with the 3D Reference Interaction Site Model (3D RISM) for aqueous solutions. Comparison of the resulting solvation free energies for training and test sets to atomic charges derived from standard procedures, exact solute-solvent electrostatics based on high-level quantum chemical reference data and established
ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
Page 2 of 39
2 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
semiempirical
Hamiltonians
demonstrates
the
advantages
parametrization.
ACS Paragon Plus Environment
of
the
hpCADD
Page 3 of 39
Journal of Chemical Information and Modeling
3 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Introduction Molecular mechanics force fields1 have proven to be far more successful in reproducing molecular structures and energies than might be expected from the sometimes very simple physical model on which they are based. However, they do have systematic weakness, one of which is that most are not polarizable (i.e. their electrostatic parameters remain constant under an applied electric field). As polarization is now increasingly being recognized to be an important factor in noncovalent interactions,2,3 this can be a serious disadvantage because specific electrostatic effects that depend on polarization are only reproduced in an averaged fashion. Although multipole-based force fields do exist,4,5,6 most use a simple atomic monopole (net atomic charge) electrostatic model, which cannot reproduce many important features of the molecular electrostatic potential (MEP) around the molecule, including the σ-hole.7 Many of these problems can be traced back to the coarse graining of all electronic degrees of freedom inherent in most force fields. Polarizable force fields attempt to alleviate this problem, but no single implementation has yet achieved general acceptance.8 Although much progress has been made in developing polarizable force fields,9 modifying the electrostatics to reproduce the σhole,10,11,12,13,14 or both together,15 we decided to use the fact that NDDO wavefunctions are polarizable and can reproduce anisotropic electrostatics around atoms as the electrostatic component of a new force field. We here report a Neglect of Diatomic Differential Overlap (NDDO) Hamiltonian designed to reproduce the electronic properties of unperturbed molecules as closely as possible, confining ourselves in this pilot study to isolated species. Later work will focus on environment-induced polarization and classical force-field potentials for completing the model description in a way that is compatible with force field-based molecular simulation. Note that the only function that the NDDO Hamiltonian needs to perform in such a hybrid force field is to reproduce the molecular electrostatics (Coulomb and Exchange interactions) correctly. It need not reproduce either geometries or energies well as these will be governed by the classical force-field potentials being developed. The concept of developing the force field is first to parameterize the Coulomb (NDDO) Hamiltonian independently and then to leave it fixed and parameterize the classical potentials “on top” of the existing electrostatics. ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
Page 4 of 39
4 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
In this way, we can use high-level ab initio standards to parameterize the electrostatics, which are thus removed from the plethora of non-independent twocenter interactions in the classical potentials. In contrast to standard force fields, computationally economical quantum mechanical techniques such as NDDO-based16 molecular-orbital theory,17 reproduce the MEP around molecules (and other one-electron properties) very well18 and are polarizable, but often have difficulties reproducing structures and energies (heats of formation or reaction energies) correctly if other parameters (e.g. ionization potentials or dipole moments) are also to be reproduced accurately. Thus, there are often conflicts in the goals of parameter optimization. Stewart, for instance, used a relatively low weight for dipole moments in developing PM619 and has discussed the problems encountered and strategies used in parameterizing new, generalist Hamiltonians (i.e. those intended to reproduce a wide range of properties).20 These observations are the basis for the hpCADD (high-Performance Computer-Aided Drug Design) force field, which we introduce here. We have chosen to use NDDO, rather than, for instance DFTB3,21
or
other
self-consistent
charge
density-functional
tight-binding
(SCCDFTB)22 methods because MNDO-like NDDO methods offer the advantage of multipole electrostatics and reproduce electrostatic properties well.18 As highperformance semiempirical molecular orbital programs such as EMPIRE23,24 are of comparable speed and scaling to classical polarizable force fields and are able to calculate tens of thousands of molecules in aggregates or single molecules of perhaps 100,000 atoms on highly parallel hardware, the NDDO formalism represents an attractive alternative to classical polarization models for force fields. One weakness is that NDDO wavefunctions with the traditional minimal Slater basis sets underestimate polarizability. Truhlar and Gao recently observed25 that the polarizabilities of organic molecules can be reproduced well in ab initio and Density Functional Theory (DFT) calculations simply by adding diffuse p-functions to the basis set. We pointed out2 that this effect can be traced back to the facile shift of electron density in bonds to hydrogen atoms that is also responsible for the electrostatic component of the directionality of hydrogen bonds. This means that even techniques that use minimal basis sets, such as traditional NDDO-based Modified Neglect of Diatomic Overlap (MNDO-) like formalisms, can be made sufficiently polarizable by adding p-orbitals on hydrogen. Interestingly, adding pfunctions to hydrogen was found to enable SINDO/1 (Symmetrically Orthogonalized ACS Paragon Plus Environment
Page 5 of 39
Journal of Chemical Information and Modeling
5 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Intermediate Neglect of Differential Overlap) to reproduce hydrogen bonds correctly.26 Thus, an MNDO-like ansatz should be able to reproduce the electrostatics of molecules very well. An additional advantage of this approach is that the one- and two-electron integrals in MNDO-like theories are approximated by multipole interactions up to the quadrupole-quadrupole level,27,28,29 so that they deliver an atom-centered multipole model for molecular electrostatics directly.18 Note that the use of atom-centered multipole expansion up to quadrupoles is an approximation introduced with MNDO,28,29 rather than an analysis method such as that of Stone.30 In our case, however, the multipole approximation can be translated directly into an atom-centered multipole representation.18 We have therefore decided to construct a force field, initially intended for drug design and biomolecular modeling, which uses a re-parameterized MNDO Hamiltonian for the Coulomb and exchange interactions and classical force-field potentials for all others. In this way, we can use the strengths of semiempirical molecular-orbital theory in reproducing one-electron properties and those of classical force fields for structures and energies. We now present first proof-of-principle results for the parameterization of the electrostatic component in the absence of environmental polarization. Our strategy is to separate the Coulomb (and exchange) interaction from other two-center potentials by parameterizing the quantum mechanical Hamiltonian only with respect to electrostatic and electronic properties. These are the MEP close to the van der Waals surface, where it is most important for intermolecular interactions, both the magnitude and direction of dipole moments and the molecular ionization potential, which guarantees that other local properties such as the local ionization energy31 and by implication the local electron affinity32 are also reproduced reliably. This has the advantage that conflicting goals in the parameterization process (such as, for instance, the necessity to reproduce structures and energies well and, at the same time, to give accurate dipole moments)33 are alleviated or eliminated, so that the parameterization process is normally easier and more successful. Once the semiempirical Hamiltonian has been parameterized to reproduce these properties as accurately as possible, we test its performance by computing solvation free energies from three-dimensional Reference Interaction Site Model (3D RISM) integral equation theory for aqueous solutions34,35,36 in comparison to high-level MEP ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
6 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
data used as reference for the parameterization and to other common MEP models. Note that we do not intend to reproduce the experimental solvation thermodynamics with absolute accuracy since RISM models are known to have systematic deficiencies that can be corrected using a suitable approach based on the partial molar volume (PMV).37,38 Here, we only test the consistency between results using the MEP from reference calculations and from the hpCADD model. In the next stage to be described elsewhere, the optimized NDDO-Hamiltonian (and thus the molecular electrostatics) will be fixed and the remaining force-field potentials can be parameterized “on top” of it. Similarly, a suitable PMV correction for RISM-based solvation free energies can be parametrized once the final model becomes available. In its final form, the hpCADD force field should represent a transferable and universal atom-type based polarizable multipole force field that will not require expensive reparameterization calculations in order to treat chemical entities not considered in the training set. By applying the final parameters to an independent test set of small molecules, we will demonstrate that hpCADD provides this desirable transferability feature. By construction, the hpCADD parametrization strategy is easily extensible to situations where new chemical types are thought to be essential for enhancing predictive capabilities. It is thus in principle possible to cover large and diverse chemical spaces by this approach. We now describe the parameterization process for the semiempirical Hamiltonian, present the parameters for a series of common elements, and outline the solution state calculations. Note that we are able to use different parameters for different types of atoms of the same element (e.g. one set for sp3- and one for sp2- and spcarbon atoms) because these atom types will be carried over into the classical components of the force field.
Methods Parameterization All semiempirical molecular orbital calculations used the EMPIRE program.23,24 The reference geometries used for the parameterization were taken from second order Møller-Plesset perturbation theory MP2/aug−cc−pVDZ39,40,41 geometry optimizations
ACS Paragon Plus Environment
Page 6 of 39
Page 7 of 39
Journal of Chemical Information and Modeling
7 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
that were performed without symmetry constraints using Gaussian09.42 The target properties for the parameterization were: 1. The three Cartesian components of the molecular dipole moment calculated at MP2/aug−cc−pVDZ. 2. The standard deviation, mean signed and unsigned deviations, most positive and most negative deviations between the MP2/aug−cc−pVDZ-calculated MEP at points on the ParaSurf43 standard AM1 (Austin Model 1)44,45 isodensity surface and those given by the atomic multipole technique18 at the same points with the Hamiltonian being parameterized. 3. Vertical experimental ionization potentials (IP) taken from the literature and given in the Supporting Information (Table S4). Initial parameters were usually based on those taken from the MNDO27 Hamiltonian with the exception that the one-electron ionization potential for the hydrogen 1sorbital (USS in the original terminology used in Reference 25) was fixed at its experimental value, as suggested previously46 in order to anchor the ionization energies for the entire parameterization. In contrast to what has become standard practice since the parameterization of PM3,47,48 (Parameterized Model 3) the derived parameters for the multipole integral calculations were not treated as parameterizable variables, but calculated as in the original MNDO technique.27 This reduces the maximum number of parameters per atom type to twelve. Parameterization was performed using an in-house standalone version of Baker’s Eigenvector-Following (EF) algorithm49 using numerical gradients with EMPIRE as the semiempirical compute engine. The Hamiltonians parameterized do not contain d-orbitals. The magnitudes of the parameters were monitored continuously during the parameterization runs in order to avoid overtraining or optimization into spurious minima. The training set, and the initial set of atoms types, which are described in Table 1, do not contain phosphorus, which would most likely need d-orbitals for good results. There is, however, no reason other than computational cost that d-orbitals should not be included in a future hpCADD NDDO Hamiltonian.
ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
Page 8 of 39
8 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Atom Types and Parameterization Datasets The atom types defined for the initial parameterization are shown in Table 1. Table 1: Initial atom types for the hpCADD force field. Atom Symbol
Type
Definition
3
Csp3
sp -hybridized Carbon
Csp2
sp -hybridized Carbon
Csp1
sp-hybridized Carbon
HCsp3
H bound to sp3-hybridized Carbon
HCsp2
H bound to sp2-hybridized Carbon
HCsp1
H bound to sp-hybridized Carbon
OC
Carbonyl oxygen
CO
Carbonyl carbon
HCO
Aldehyde hydrogen
SC
Thioketone sulfur
HAS
Thioaldehyde hydrogen
2
O
ACS Paragon Plus Environment
C
Page 9 of 39
Journal of Chemical Information and Modeling
9 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
OHc
Carboxylic acid OH oxygen
HOc
Carboxylic acid proton
ORc
Ester OR oxygen
Namd
Amide nitrogen
Hamd
Amide NH hydrogen
OH
Alcohol oxygen
HO
Alcohol hydrogen
SH
Thiol sulfur
HS
Thiol hydrogen
N3
Amine nitrogen
HN
Amine hydrogen
NC
Nitrile nitrogen
ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
10 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
OR
Ether oxygen
SR
Thioether sulfur
F
Fluoride
Cl
Chloride
Br
Bromide
I
Iodide
SO
Sulfone sulfur
OS
Sulfone oxygen
HSO
HSO2 hydrogen
Nim
Divalent nitrogen as part of a π-system
Ccon
Carbon as part of a π-system
Hcon
Hydrogen bonded to a Carbon as part of a π-system
Ncon
Trivalent nitrogen as part of a π-system
HNcon
Hydrogen bonded to conjugated nitrogen
ACS Paragon Plus Environment
Page 10 of 39
Page 11 of 39
Journal of Chemical Information and Modeling
11 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Ocon
Oxygen as part of a π-system
Scon
Sulfur as part of a π-system
The parameterization data for 188 compounds (training set) are shown in Tables S1S4 of the Supporting Information. They include many common organic groups without assuming complete coverage of chemical space. The parameterization was performed incrementally in the order in which the atom types appear in Table 1. This means that, for instance, the Csp3, Csp2, Csp1, HCsp3, HCsp2 and HCsp1 atom types were parameterized for the hydrocarbons and then used unchanged for other groups unless they led to poor performance. Thus, for instance, HCsp2 is not used for aldehyde hydrogens, but rather a dedicated atom type HCO. In many cases, the existing parameters were found to perform well and no new atom type was introduced. The ordering of chemical compound classes used for this sequential parametrization strategy is indicated behind the class names in Tables 2 and 5 and in the tables in the Supporting Information. The raw performance of the hpCADD model for approximating electronic structure was characterized by computing various electrostatic and quantum mechanical descriptors in comparison with reference MP2 results. In particular, we report below the overall root mean square deviations (RMSD) for dipole moments, ionization potentials and MEP together with the largest MEP errors with mean signed errors (MSE). We furthermore compare details of the MEP topographies.
3D RISM calculations A technique that calculates the electron density, and therefore the MEP around compounds well should also be appropriate for reproducing solvation free energies deduced from reference MEPs used for the parameterization process. We therefore tested the hpCADD Hamiltonian in conjunction with the 3D RISM solvation model34,35 for aqueous solutions. This model yields analytical expressions for the excess chemical potential, µex, within certain approximations.36 Since we are not interested in ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
12 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
absolute data here but only in measuring the performance of hpCADD electrostatics in comparison with high-level calculations and other established electrostatics models, we do not consider the solvent-induced polarization of the solute, which is in principle possible.50 The electrostatic component of the solute-solvent interaction is described by computing the energy of solvent point charges in the MEP field. As we have demonstrated earlier,51 it is necessary for numerical reasons to split the total electrostatic interaction into an auxiliary long-range part arising from atomic solute point charges and a residual short ranged part such that the sum is identical with the exact MEP. After subsequent iterative solution to the 3D RISM integral equations on a 3D grid, µex is obtained from the set of correlation functions between solute and solvent sites. As a stringent test of the hpCADD model we compared its performance for solution thermodynamics not only with reference MP2 MEPs, but also with common semiempirical Hamiltonians (AM1, PM3, MNDO) and with a standard approach to fixed charge force field models following the Generalized Amber Force Field (GAFF) strategy to derive atomic point charges for small non-peptide molecules.52 In this case, the RESP (“Restrained Electrostatic Potential”) procedure was employed using antechamber from the Amber tools53 to represent the HF/6-31G(d) Hartree-Fock MEP as accurately as possible with topologically symmetrized atomic site charges.54 MEP fitting points were selected according to the Merz-Singh-Kollman scheme55,56 on four concentric layers around the atoms with an increased number of six points per unit area. Since the Amber-GAFF model is atom-centered, the resulting MEP was used directly for 3D RISM calculations, not accounting for the residual difference to the exact HF MEP. The reference MP2 MEP was calculated using the Gaussian auxiliary program cubegen using the MP2 density. Auxiliary charges for semiempirical Hamiltonians including hpCADD were obtained by fitting VESPA charges57 to the MEP calculated with the hpCADD Hamiltonian. In all cases, optimized geometries from the MP2/aug-cc-pvDZ calculations were used for charge and MEP calculations. As input for 3D RISM iterations, the MEP for all methods was computed on a grid with a minimum distance between box edges and any solute atom of 15 Å and a grid spacing of 0.3 Å in each direction. Following closely our earlier work,51 a modified SPC/E (Extended Simple Point Charge) water representation58 was used to model
ACS Paragon Plus Environment
Page 12 of 39
Page 13 of 39
Journal of Chemical Information and Modeling
13 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
the solvent. The Lennard-Jones parameters for the dispersion-cavitation solutesolvent description were taken from the GAFF force field in all cases. The PSE-3 closure was used throughout;36 the convergence criterion was set to 10-6 for the maximum residual deviation of the direct correlation functions between subsequent iteration cycles. The results for training and test data sets were compared by computing statistical descriptors for the deviation between the approximate (hpCADD, Amber, AM1, PM3, MNDO) solvation free energies measured by µex and the MP2-based reference results. To this end, the RMSD, the mean absolute deviation (MAD) along with the MSE, and the Pearson correlation coefficient were computed. Additionally, slope and intercept of a linear regression model were derived together with the resulting coefficient of determination R2 and the estimated standard deviation σ.
Results and discussion Performance of electronic structure descriptors
ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
14 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Table 2 shows the parameterization results for each of the target values and for the entire training set divided into classes of molecules. The results table is organized in the order in which the molecule classes were parameterized (i.e. lower classes of molecules use parameters from those above them in addition to their own specific parameterization). The “inapplicable” test-set compounds belong to structural classes not represented in the training set and therefore outside the applicability domain of the current parameterization. They are included to illustrate the locality of the hpCADD Hamiltonian and to illustrate likely errors for “unknown” compound types (see below).
ACS Paragon Plus Environment
Page 14 of 39
Page 15 of 39
Journal of Chemical Information and Modeling
15 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Table 2: Performance of the optimized parameters for the target variables organized in compound groups. The data for the individual compounds are given in the Supporting Information. Dipole moments are reported in Debye, ionization potentials in eV and the MEP in kcal mol−1. The number in parentheses after the class name indicates the parametrization ordering. The column “MEP-RMSD” gives the root-mean-square deviation of the RMSDs of the individual compounds for the MEP, whereas the “MEP” column gives the RMSD of the MEP for all surface points for all molecules in the dataset. Comparison of the two values indicates how evenly the molecules in the dataset are treated. The values in parentheses give the range of the target value for the dataset. See the text above for the definition of “Inapplicable” compounds. Atom Types
Csp3 Csp2 Csp1 HCsp3 HCsp2 HCsp1
Overall RMSD µ
IP
0.08 (-0.63 to 0.76)
0.23 (8.09 – 11.51)
OH HO
0.11 (-1.50 to 1.67)
0.37 (10.2 to 10.96)
Namd Hamd
0.40 (-4.14 to 4.06)
0.49 (9.14 to 10.5))
N3 HN
0.24 (-1.41 to 1.15)
0.56 (8.54 to 9.65)
Br
0.17 (-3.56 to 2.54)
0.35 (9.39 to 10.76)
OHc HOc
0.39 (-2.96 to 3.92)
Cl
Largest MEP errors
MEPMEP RMSD 1. Hydrocarbons (1)
0.85 (-23.68 to 32.31)
2. Alcohols (7) 0.85 (-41.02 to 44.02) 3. Amides (6) 2.22 (-51.23 to 48.76) 4. Amines (9) 0.94 (-52.86 to 39.70) 5. Bromides (15)
3.96
-5,87
0.58
0.98
4.93
-5.57
0.72
2.84
6.95
-19.37
0.88
1.12
6.51
-7.67
0.67
7.23
-9.15
0.68
11.23
-15.72
1.04
7.10
-10.79
0.46
1.12
2.43
-7.80
0.22
1.53
7.98
-10.55
0.89
1.54
6.07
-10.96
1.13
2.17
11.17
-12.05
0.63
0.32 0.87 0.16 1.27 (-2.50 to 2.25) (10.18 to 11.29) (-19.87 to 34.97) 8. Conjugated Systems (18)
0.20 (-3.22 to 4.86)
0.64 (7.86 to 10.15)
ORc
0.20 (-2.02 to 3.43)
0.52 (8.54 to 10.75)
OR
0.18 (-1.44 to 1.90)
0.43 (9.32 to 10.02)
F
0.24 0.43 (-1.97 to 2.51) (10.63 to 13.27)
0.79 (-55.52 to 311.97)
9. Esters (5) 1.31 (-43.81 to 43.64) 10. Ethers (11) 1.19 (-44.34 to 30.26) 11. Fluorides (13) 1.08 (-30.43 to 34.56) 12. Iodides (16)
MSE
1.13
1.01 1.41 (-23.06 to 33.61) 6. Carboxylic acids (4) 1.37 1.82 2.50 (10.3 to 10.84) (-43.06 to 55.02) 7. Chlorides (14)
Ccon HCcon Nim Ncon HNcon Ocon Scon
Positive Negative
ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
Page 16 of 39
16 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
I
0.27 (-2.12 to 2.47)
OC CO HCO
0.34 (-3.11 to 3.11)
0.57 1.49 2.08 (9.00 to 9.9) (-14.31 to 33.61) 13. Ketones and Aldehydes (2)
7.73
-8.99
0.97
1.95
10.56
-6.21
1.38
NC
0.07 0.57 (-4.02 to 3.85) (11.03 to 12.46)
14. Nitriles (10) 0.71 0.75 (-44.91 to 34.51) 15. Sulfones (17)
3.51
-5.32
0.27
SO OS HSO
0.15 (-4.72 to 5.54)
0.83 (-43.81 to 54.61)
7.14
-7.32
0.36
-10.76
1.10
-7.49
0.71
9.21
-4.91
1.41
2.84
11.23
-19.37
1.41
3.36 (-69.60 to 64.08)
13.05
36.28
-37.75
5.11
7.60 (-68.79 to 64.08)
21.82
119.84
-55.33
1.45
1.55 (9.29 to 10.24)
0.42 (9.96 to 10.81)
1.52 (-43.10 to 33.78)
1.20
16. Thioethers (12) SR
0.20 (-2.17 to 1.70)
SH HS
0.26 (-1.70 to 1.70)
SC HSA CS
0.47 (-2.64 to 1.69)
0.65 (8.07 to 8.81)
1.46 1.82 7.95 (-31.71 to 31.13) 17. Thiols (8) 0.59 1.01 1.36 7.96 (8.90 to 9.51) (-27.64 to 31.59) 18. Thioketones and Thioaldehydes (3) 0.30 (8.18 to 9.37)
1.80 (-29.21 to 29.04)
1.95
Complete Training Set All types
0.23 (-4.72 to 5.54)
0.57 (7.86 to 13.27)
1.15 (-55.52 to 311.97) Test sets
All types
Inapplicable
Experimental 0.90 values not (-7.70 to 4.74) available 1.45 (-7.81 to 12.94)
Experimental values not available
Parameters Table 3 lists the parameters obtained for the individual atom types. These parameters were used in the precision given in the table in the EMPIRE parameters file used for all subsequent hpCADD calculations. The symbols and units used for the parameters correspond to references 27 and 47 and their roles are defined in the Supporting Information.
ACS Paragon Plus Environment
Page 17 of 39
Journal of Chemical Information and Modeling
17 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Table 3: Optimized parameters for the atom types listed in Table 1. The Uss value for all types of hydrogen was fixed at the experimental value of -13.59844 eV and is indicated in bold in the table. Nomenclature for the parameters and units are taken from reference 27 and are defined in the Supporting Information. Parameters Ussa a Upp
βs a βp a ξs b ξp b a
Gss Gspa Gppa a Gp2
ρcore b Parameters Ussa a Upp
βs a βp a ξs b ξp b Gssa Gspa Gppa a Gp2
ρcore b Parameters a Uss a Upp
βs a βp a ξs b ξp b Gssa a Gsp a Gpp a Gp2
ρcore b Parameters Ussa Uppa
βs a βp a ξs b ξp b Gssa a Gsp
Csp1 -57.134540 -31.550208 -20.576034 -11.556630 2.651424 1.350009 11.385277 11.367651 7.317063 7.193675 1.203862
Csp2 -56.004533 -32.771727 -19.937626 -4.392654 1.823864 1.232033 10.373712 8.595365 9.095091 9.094891 1.350157
OC -99.290252 -82.324401 -33.087822 -30.608878 2.441799 2.53864 28.527808 7.026354 22.464162 17.820685 0.319475
-51.051834 -34.019691 -23.242293 -7.193323 1.845027 1.720874 10.008872 10.539312 8.029731 7.929731 1.291188
13.598440 -0.111755 -7.279856 -0.347596 1.133597 1.378217 14.311001 10.831106 4.089672 3.989672 0.732684
-70.525034 -52.846144 -8.159588 4.887574 1.748760 2.023847 14.714087 5.506948 15.744037 11.022401 0.544163
-50.667713 -37.016841 -19.336368 -10.944181 1.332529 3.268046 8.697626 11.297338 18.208000 10.165255 1.687183
-13.598440 -4.296529 -7.385536 -2.365866 0.977236 1.694048 16.471460 6.335711 0.952519 0.766008 0.467238
OHc -98.773392 -73.959228 -30.285285 -39.393993 1.812872 1.390249 16.682234 19.098621 4.429111 3.630789 0.805882
HOc -13.598440 -9.792256 -2.518255 -5.178289 1.402333 6.690095 11.745390 10.993214 11.359259 10.134907 0.425108
ORc -101.397130 -73.497761 -31.105996 -32.086173 1.168803 4.643811 12.006680 13.387283 13.796043 13.012912 1.148969
Namd -71.594154 -57.995725 -20.724598 -24.181100 2.264992 0.883725 22.923393 16.675275 0.392244 0.284953 0.876562
Hamd -13.598440 -4.507666 -5.932420 -7.518052 2.456534 1.421647 10.208671 10.381695 4.611840 2.925733 0.935030
NC -68.982170 -71.527271 -20.781192 -29.734675 4.286493 1.064035 28.502824 8.867105 17.346125 17.246125 0.550223
CN OH -54.214780 -110.644199 -61.018598 -62.779013 -18.599950 -48.049193 -4.884952 -6.068511 1.905249 2.025751 2.538566 1.544880 10.001897 6.447996 15.288876 11.438543
HO -13.598440 -6.345576 -5.188109 -30.289408 3.741170 4.300671 15.921467 23.187677
SH -69.404196 -53.329494 -20.797253 -0.477919 2.187098 1.393829 6.432390 7.225023
HS -13.598440 -6.828391 -3.524485 -0.732089 1.456146 2.160097 18.802828 16.774856
OR -93.131866 -57.266407 -56.385721 -8.437938 2.211918 1.576284 7.393121 9.332506
CO
Csp3 -56.866067 -32.873021 -18.879427 -13.704305 2.053747 2.325245 15.913764 9.351601 10.084143 9.901188 0.872597 HCO
HCsp1 -13.598440 4.651429 -6.781701 -0.660783 0.840477 3.117138 22.504821 1.812203 1.034360 0.615950 0.490854 SC
ACS Paragon Plus Environment
HCsp2 -13.598440 5.936670 -3.015955 0.437893 3.656390 0.301103 18.219326 8.755309 3.297240 3.297040 0.542826 CS
HCsp3 -13.598440 1.106749 -12.598789 -0.934851 1.370618 1.694617 15.522633 7.340549 5.429447 5.311533 0.678804 HAS
Journal of Chemical Information and Modeling
Page 18 of 39
18 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Gppa a Gp2
27.464126 16.226985 1.324776
10.437636 10.153486 2.174147
8.720638 8.129336 0.669973
10.264111 9.936538 2.075309
12.439347 6.937294 0.662903
10.041088 9.685367 1.880738
SR -81.441743 -51.075098 -26.434130 -1.725599 1.970969 1.358496 6.277973 11.108872 7.621889 7.151393 2.100699
NH -48.552277 -43.524106 -9.463622 -3.842305 1.752407 1.205681 9.520222 8.195266 10.116551 10.016551 1.552650
HN -13.598440 -2.829472 -3.994636 -1.449631 0.844586 1.605460 14.582207 11.878712 8.245272 -0.007277 0.308455
SO -67.374544 -56.231516 -10.736591 -17.849105 2.129794 2.195676 12.286556 11.450864 8.416305 8.316251 1.154768
OS -98.359575 -73.238164 -33.959484 -21.324128 11.982624 1.684346 12.951124 16.664765 9.698725 9.598725 1.050488
HSO -13.598440 3.017358 -5.890047 0.928311 1.141711 1.496751 11.820608 21.994001 8.058844 1.210303 0.915851
Parameters F a -145.121500 Uss a -121.815460 Upp a -34.145166 βs -12.371271 βp a 2.334286 ξs b 2.190831 ξp b 14.038205 Gssa a 14.903272 Gsp a 19.703140 Gpp a 19.551353 Gp2 b 0.989531 ρcore
Cl -93.675395 -69.792916 -62.799677 -3.995695 2.756296 1.565311 13.913884 1.222128 14.324101 14.084516 0.960562
Br -97.497102 -75.241413 -44.314455 0.305045 2.185464 1.814781 9.574625 4.056305 14.360031 14.020251 1.365639
I -78.669148 -83.328898 -29.207027 0.518614 2.339867 1.977634 8.364136 4.636808 15.828207 15.427902 1.504931
Nim -70.183426 -50.981265 -16.290831 -13.112620 2.867447 1.271439 14.318383 13.672876 8.380220 8.136448 0.834358
Ncon -69.894881 -60.974364 -16.942307 -17.327652 1.192675 2.200598 14.527863 11.562961 18.841732 14.985688 0.951010
Parameters Ussa Uppa
Scon -63.926189 -52.527459 -30.025352 -7.921570 2.842908 1.318051 8.550021 8.450424 9.472057 9.009453 1.494701
Ocon -101.058827 -72.283708 -38.365078 -21.418025 0.938976 2.772824 6.074053 10.621859 21.388329 13.010110 2.275529
Ccon -52.690492 -52.690492 -33.376010 -20.694394 1.684054 1.068103 8.361445 10.574547 8.857099 8.550273 1.670025
HCcon -13.598440 4.914856 -12.787739 -0.432002 1.917542 1.793264 17.156526 4.526758 8.133185 -2.022178 0.075378
ρcore b Parameters Ussa a Upp
βs a βp a ξs b ξp b a
Gss a Gsp Gppa a Gp2
ρcore b
βs a βp a ξs b ξp b Gssa a Gsp Gppa a Gp2
ρcore b a
eV
b
HNcon -13.598440 -5.249159 0.000000 -1.491020 1.197927 0.664124 21.655184 14.406163 6.031612 5.535249 0.079740
Bohr−1
As expected, it is straightforward to parameterize a standard MNDO-like Hamiltonian to reproduce our limited set of electrostatic and electronic structure-related target properties quite accurately without needing to consider structures or energies. The different parameter sets for hydrogen illustrate the effect of using force field like atom types for the semiempirical Hamiltonian. Table 4 shows the mean values and
ACS Paragon Plus Environment
Page 19 of 39
Journal of Chemical Information and Modeling
19 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
standard deviations of the nine variable hydrogen parameters (Uss is always fixed) for the 12 different hydrogen atom types. Table 4: Statistical distribution of the optimized parameters for hydrogen (HCsp1, HCsp2, HCsp3, HCO, HAS, HOc, Hamd, HO, HS, HN, HSO, HCcon and HNcon). Nomenclature for the parameters and units are defined in the Supporting Information. Parameter a
-1.671 -5.592 -4.031 1.690 2.253 16.218 11.563 6.496 2.911 0.521
Upp
a
βs βp a ξs b ξp b a
Gss a Gsp a Gpp Gp2a
ρcore b a
eV
b
Mean
Standard Deviation Absolute % 5.119 406 3.723 167 8.216 304 0.999 41 1.830 19 3.807 77 6.594 43 3.623 44 3.050 5 0.277 47
Bohr−1
The parameters, in particular those associated with the p-polarization functions, exhibit considerable scatter. Although much of this results from the pragmatic statistical parameterization and thus has little physical meaning, some of the variation can be interpreted as a surrogate for the effects that would normally be accounted for by a split-valence basis set. Upp, βs, βp and ξp, for instance, all correlate roughly with the acidity (in this case, the pKa value in water) of the proton, in accord with the polarization effect observed previously. The trends can be understood in terms of the 1s-orbital becoming more compact and the polarization effect made possible by the p-orbitals becoming more important as the hydrogen becomes more positive; Upp is most negative and ξp largest for the most acidic protons.
ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Figure 1: Visual comparison of the optimized parameters for the three differently hybridized carbon atom types. Nomenclature for the parameters and units are defined in the Supporting Information.
Figure 1 shows the optimized parameters for the three differently hybridized carbon atom types. The parameters show consistent trends, although the Csp2 ξs, ξp and βs parameters are lower than those for both Csp1 and Csp3. This is likely because the Csp2 parameterization is strongly adapted to reproduce the electrostatics of π-clouds accurately.
ACS Paragon Plus Environment
Page 20 of 39
Page 21 of 39
Journal of Chemical Information and Modeling
21 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Figure 2: Visual comparison of the optimized parameters for the halogens F, Cl, Br and I. Nomenclature for the parameters and units are defined in the Supporting Information.
Figure 2 shows a similar comparison for the halogens. Fluorine generally forms an exception, but the trends are quite smooth and without large exceptions.
Molecular electrostatic potential topography Atom-centered monopole electrostatic models often fail to capture the correct topography of the MEP projected onto the van der Waals surface, especially for strongly anisotropic atoms such as the heavier elements.59 This failure has strong consequences for intermolecular interactions because features that correspond to the H-bond accepting faces of aromatic systems60 or the σ-hole7,61 are often not reproduced using atom-centered monopoles, which tend towards isotropic MEP distributions around single atoms or homogeneous rings. We have therefore selected some molecules that exhibit such binding features to investigate the behavior of the newly parameterized hpCADD NDDO Hamiltonian with respect to the accurate ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
Page 22 of 39
22 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
reproduction of electrostatic properties. As for the parameterization, we have used the MP2/aug-cc-pVDZ-calculated data and optimized geometries as reference. p-Chloro benzoic acid, 1
Figure 3 shows the hpCADD MEP projected onto the standard isodensity surface of p-chloro benzoic acid, 1. Red arrows in the above scheme indicate favorable interactions with H-bond (or halogen–bond) acceptors, blue ones with H-bond donors. This molecule shows several important features that must be reproduced by a satisfactory electrostatic model. It can enter into classical hydrogen bonds through the OH-donor and =O-acceptor functions of the carboxylic acid, function as a halogen-bond donor via the σ-hole on chlorine or as an H-bond acceptor through the negative equatorial belt of the halogen. The aromatic ring can function as an H-bond acceptor on its faces and as a CH hydrogen-bond donor in the ring plane. Figure 3a) shows the MEP calculated with the hpCADD Hamiltonian and Figure 3b) the difference between hpCADD and MP2/aug-cc-pVDZ MEPs projected onto the same surface.
kcal mol−1
kcal mol−1
Figure 3: a) MEP for p-chloro benzoic acid in kcal mol−1 calculated at the 0.001 a.u. (atomic units) isodensity surface with the hpCADD Hamiltonian. The color scale has been compressed to 0 ± 20 kcal mol−1 to emphasize the features of the MEP map. A corresponding figure in which the color code extends over the entire range (-41 – +37 kcal mol−1) is given in the Supporting Information (Table S5); b) the difference between the hpCADD and the MP2/aug-cc-pVDZ calculated MEP (i.e. the hpCADD error) projected onto the same surface.
ACS Paragon Plus Environment
Page 23 of 39
Journal of Chemical Information and Modeling
23 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
The hpCADD Hamiltonian gives the largest deviation (-15.7 kcal mol−1) for the σ-hole at the chlorine atom in p-chloro benzoic acid with a most positive MEP value in this region of -7.2 kcal mol−1. The largest positive error (+11.2 kcal mol-1) is situated in the negatively polarized area of the hydrogen atoms around the benzene ring. The overall
RMSE
for
the
MEP
is
1.7 kcal mol-1.
The
hpCADD
Hamiltonian
underestimates the polarity of these two important binding features. Most likely, this depends on the electronic properties of the chlorine atom and may mean that dorbitals are needed. 2-(4-pyridyl)-pyrrole, 2
Pyridine is a very electron-poor aromatic system and pyrrole very electron-rich. Therefore, linking the two at the pyridine 4-position and the 2-position of pyrrole should result in a bis-aromatic system, 2, with two very different aromatic ring systems and different MEPs on the π-faces of the rings. Figure 4 shows hpCADD results and a comparison with MP2/aug-cc-pVDZ analogously to that shown in
Figure 3. Figure 4: The difference between the hpCADD and the MP2/aug-cc-pVDZ calculated MEP (i.e. the hpCADD error, VhpCADD(r) – VMP2(r)) projected onto the 0.001 a.u. isodensity surface. The right-hand structure is that optimized fully at MP2/aug-cc-pVDZ. The three left-hand structures are based on partially optimized geometries in which the twist angle around the central bond was constrained to 30°, 60° and 90°.
ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
Page 24 of 39
24 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
The range of the MEP values given by the hpCADD Hamiltonian is similar to that found for MP2/aug-cc-pVDZ. The most positive deviation (8.87 kcal mol-1, 9%) of the overall MEP range of the MP2/aug-cc-pVDZ calculations corresponds to the positive area around the pyridine ortho-hydrogen atoms and the most negative for their metaneighbors. The hpCADD errors for the pyridine moiety are approximately 50% larger than for unsubstituted pyridine, whereas the pyrrole ring is reproduced very similarly to the unsubstituted parent. hpCADD underestimates the induced dipole of the pyridine ring slightly. Twisting the rings relative to each other demonstrates that the hpCADD Hamiltonian can adapt to conformational changes in conjugated systems; the most positive error remains approximately constant in the twisted structures and the most negative one decreases with increasing twist angle. Carbohydrate modeling is traditionally regarded as being difficult, so that a good description of the electrostatic properties of glucose is an important quality criterion for any new force field for natural systems. Figure 5 shows the hpCADD results for glucose, 3, analogously to Figure 4. Glucose, 3
kcal mol−1
kcal mol−1
Figure 5: a) MEP for glucose calculated at the 0.001 a.u. isodensity surface with the hpCADD Hamiltonian. The color scale has been compressed to 0 ± 20 kcal mol−1 to emphasize the features of the MEP map. A corresponding figure in which the color code extends over the entire range (-49 – +52 kcal mol−1) is given in the Supporting Information (Table S5); b) the difference between the hpCADD and the MP2/aug-cc-pVDZ calculated MEP (i.e. the hpCADD error, V(r)hpCADD – V(r)MP2) projected onto the same surface.
ACS Paragon Plus Environment
Page 25 of 39
Journal of Chemical Information and Modeling
25 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
The overall RMSE for the MEP from the hpCADD Hamiltonian relative MP2/aug-ccpVDZ to is 3.40 kcal mol-1. The largest deviation is -12.04 kcal mol-1, 12% of the MEP range from MP2/aug-cc-pVDZ calculations, is found around the oxygen of an equatorial OH. The MP2 value here is -14 kcal mol−1, compared with the hpCADD value of -26 kcal mol-1. The most positive error (10 kcal mol−1) is found on the side of C6 opposite the axial hydrogen atom.
3D RISM solvation free energy results Tables 5 and 6 (raw data are shown in Fig. S1 and Table S6 of the Supporting Information) summarize the statistical descriptors for a comparison to the MP2 reference of the 3D RISM excess chemical potentials calculated using MEPs calculated with hpCADD, Amber-GAFF, and three semiempirical methods. The results for are shown for the complete training data set, for the individual chemical compound classes and for two independent test sets. The “inapplicable” test set consists of compound types not represented in the training set (predominantly guanidines, ureas, anhydrides and aromatics with two or more adjacent nitrogen atoms). These compounds are included to test and illustrate the applicability of the parameterized Hamiltonian. Their poor performance indicates that additional atom types are needed. Figure 6 shows RMSD and MSE for distinct classes and all methods and Figure 7 in addition the deviations for each model for both training and test sets. More details including raw data for all compounds and separate scatter plots for all models/classes are given in the Supporting Information.
-1
Table 5: Statistical results of aqueous solvation free energies (in kcal mol ) from 3D RISM calculations for the training data set for hpCADD and other charge models measured in relation to reference MP2 calculations. R2 and σ that were derived from a linear model fit with respect to the MP2based results. The statistics were obtained for the following numbers of data points within each chemical compound class; all (188), hydrocarbons (21), alcohols (8), amides (9), amines (6), bromides (12), carboxylic acids (6), chlorides (11), conjugated systems (17), esters (22), ethers (11), fluorides (10), iodides (10), ketones and aldehydes (8), nitriles (5), sulfones (6), thioethers (11), thiols (8) as well as thiones and thioaldehydes (7). The number in parentheses after the class name indicates the parametrization ordering. For the raw data, see the Supporting Information. All hpCADD Amber AM1 PM3
RMSD 1.052 1.891 2.676 2.608
MAD 0.756 0.978 1.795 1.561
MSE -0.270 -1.356 -1.266 -0.725
Pearson Slope Intercept R2 σ 0.990 0.923 1.182 0.979 0.887 0.979 0.930 2.109 0.959 1.245 0.925 0.943 1.884 0.855 2.346 0.921 0.880 2.089 0.848 2.397
ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
Page 26 of 39
26 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
MNDO 1. Hydrocarbons (1) hpCADD Amber AM1 PM3 MNDO 2. Alcohols (7) hpCADD Amber AM1 PM3 MNDO 3. Amides (6) hpCADD Amber AM1 PM3 MNDO 4. Amines (9) hpCADD Amber AM1 PM3 MNDO 5. Bromides (15) hpCADD Amber AM1 PM3 MNDO 6. Carboxylic acids (4) hpCADD Amber AM1 PM3 MNDO 7. Chlorides (14) hpCADD Amber AM1 PM3 MNDO 8. Conjugated systems (18) hpCADD Amber AM1 PM3 MNDO
2.328 RMSD 0.539 1.285 0.512 0.780 1.250 RMSD 0.502 1.352 1.272 1.819 1.681 RMSD 1.512 2.537 0.706 0.811 1.174 RMSD 0.610 1.235 1.022 1.399 0.860 RMSD 0.911 0.585 0.529 0.367 0.336 RMSD 1.099 2.632 1.136 1.133 0.959 RMSD 0.491 0.609 0.351 0.755 0.200 RMSD 1.982 3.163 2.676 2.608 2.316
1.479 MAD 0.215 0.678 0.398 0.410 0.566 MAD 0.212 0.245 0.143 0.202 0.297 MAD 0.564 0.240 0.664 0.746 0.504 MAD 0.362 0.693 0.458 0.636 0.570 MAD 0.487 0.163 0.224 0.206 0.153 MAD 0.800 0.474 0.537 0.511 0.491 MAD 0.345 0.201 0.171 0.307 0.156 MAD 1.263 1.279 1.795 1.561 1.444
-0.635 MSE 0.471 -1.019 0.189 0.589 1.050 MSE -0.433 -1.311 1.260 1.800 1.635 MSE -1.358 -2.524 0.176 0.181 0.963 MSE -0.436 -0.877 -0.792 1.061 0.234 MSE -0.566 -0.544 -0.455 -0.267 -0.273 MSE -0.071 -2.568 0.864 0.884 0.653 MSE -0.247 -0.563 0.255 0.623 -0.028 MSE -0.649 -2.125 -1.266 -0.725 -0.692
0.935 Pearson 0.998 0.985 0.997 0.995 0.989 Pearson 0.999 0.997 0.999 0.998 0.996 Pearson 0.988 0.998 0.988 0.985 0.989 Pearson 0.996 0.982 0.995 0.988 0.990 Pearson 0.986 0.999 0.998 0.998 0.999 Pearson 0.946 0.992 0.980 0.981 0.981 Pearson 0.996 0.999 0.999 0.996 0.999 Pearson 0.946 0.858 0.925 0.921 0.937
ACS Paragon Plus Environment
0.914 Slope 1.014 1.038 1.083 1.048 1.003 Slope 0.949 1.047 1.027 1.045 1.048 Slope 0.953 0.981 1.061 1.102 0.939 Slope 0.962 0.934 1.125 1.168 1.149 Slope 1.006 0.989 1.018 1.006 1.025 Slope 1.047 1.140 1.110 1.100 1.085 Slope 0.956 1.034 0.993 0.970 1.021 Slope 0.718 0.716 0.943 0.880 0.912
1.628 Intercept -0.690 0.489 -1.463 -1.348 1.097 Intercept 0.785 1.034 -1.492 -2.204 -2.060 Intercept 1.499 2.558 -0.451 -0.642 -0.641 Intercept 0.685 1.280 -0.025 -2.403 -1.306 Intercept 0.493 0.687 0.222 0.195 -0.046 Intercept -0.238 2.007 -1.682 -1.627 -1.262 Intercept 0.811 0.141 -0.163 -0.219 -0.238 Intercept 2.815 3.890 1.884 2.089 1.695
0.874 R2 0.997 0.971 0.995 0.989 0.977 R2 0.999 0.994 0.999 0.997 0.992 R2 0.976 0.997 0.975 0.971 0.977 R2 0.992 0.965 0.990 0.976 0.980 R2 0.972 0.998 0.996 0.997 0.998 R2 0.894 0.985 0.961 0.963 0.962 R2 0.992 0.998 0.997 0.991 0.998 R2 0.895 0.736 0.855 0.848 0.878
2.185 σ 0.268 0.805 0.345 0.492 0.712 σ 0.172 0.330 0.167 0.245 0.401 σ 0.718 0.272 0.729 0.791 0.700 σ 0.478 0.998 0.531 0.823 0.748 σ 0.781 0.230 0.283 0.275 0.184 σ 1.331 0.504 0.811 0.788 0.802 σ 0.414 0.203 0.264 0.448 0.196 σ 1.310 2.078 2.346 2.397 2.151
Page 27 of 39
Journal of Chemical Information and Modeling
27 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
9. Esters (5) hpCADD Amber AM1 PM3 MNDO 10. Ethers (11) hpCADD Amber AM1 PM3 MNDO 11. Fluorides (13) hpCADD Amber AM1 PM3 MNDO 12. Iodides (16) hpCADD Amber AM1 PM3 MNDO 13. Ketones/aldehydes (2) hpCADD Amber AM1 PM3 MNDO 14. Nitriles (10) hpCADD Amber AM1 PM3 MNDO 15. Sulfones (17) hpCADD Amber AM1 PM3 MNDO 16. Thioethers (12) hpCADD Amber AM1 PM3 MNDO 17. Thiols (8)
RMSD 1.004 2.374 4.775 0.879 3.576 RMSD 0.739 0.439 0.622 0.797 0.384 RMSD 0.763 0.626 0.611 0.594 0.854 RMSD 0.221 1.269 1.279 1.214 1.226 RMSD 0.696 2.148 0.464 0.532 0.547 RMSD 1.682 1.503 1.075 0.698 1.592 RMSD 1.339 4.597 9.285 11.760 9.514 RMSD 1.401 0.501 2.197 2.861 0.507 RMSD
MAD MSE 0.506 0.769 0.209 -2.357 0.784 -4.669 0.315 -0.783 0.547 -3.500 MAD MSE 0.470 -0.436 0.265 -0.269 0.376 -0.383 0.412 0.553 0.264 -0.064 MAD MSE 0.590 -0.300 0.144 -0.604 0.164 0.563 0.219 0.514 0.310 -0.741 MAD MSE 0.171 -0.040 0.555 -1.109 0.295 -1.213 0.283 -1.160 0.290 -1.156 MAD MSE 0.562 -0.194 0.276 -2.121 0.164 -0.378 0.188 -0.459 0.248 -0.448 MAD MSE 0.765 -1.370 0.181 -1.487 0.180 1.055 0.091 -0.688 0.462 1.468 MAD MSE 1.075 -0.537 0.332 -4.577 0.441 -9.265 0.637 -11.728 0.666 -9.484 MAD MSE 0.789 -1.043 0.341 0.237 0.458 -2.093 0.377 -2.803 0.346 -0.156 MAD MSE
Pearson 0.991 0.998 0.986 0.998 0.992 Pearson 0.998 0.998 0.999 0.998 0.999 Pearson 0.994 0.999 0.998 0.997 0.994 Pearson 0.999 0.991 0.996 0.997 0.995 Pearson 0.998 1.000 1.000 1.000 0.999 Pearson 0.955 0.998 0.998 0.999 0.981 Pearson 0.961 0.993 0.981 0.965 0.980 Pearson 0.992 0.997 0.996 0.996 0.996 Pearson
ACS Paragon Plus Environment
Slope 0.969 1.010 1.148 1.060 1.090 Slope 0.919 1.0011 1.082 1.085 1.053 Slope 0.870 1.023 0.974 0.981 1.003 Slope 0.983 0.926 1.043 1.039 1.024 Slope 0.960 1.022 1.022 1.017 1.003 Slope 0.928 1.015 1.027 1.005 0.995 Slope 0.758 0.918 1.069 0.882 1.199 Slope 0.880 1.042 1.102 1.075 1.028 Slope
Intercept -0.242 2.223 2.998 -0.124 2.383 Intercept 1.488 0.254 -0.681 -1.739 -0.645 Intercept 1.696 0.365 -0.260 -0.290 0.713 Intercept 0.270 2.024 0.687 0.679 0.859 Intercept 0.666 1.899 0.126 0.260 0.417 Intercept 1.896 1.384 -1.315 0.646 -1.413 Intercept 2.019 4.748 9.445 11.129 10.046 Intercept 2.812 -0.915 0.689 1.825 -0.277 Intercept
R2 0.983 0.997 0.972 0.996 0.983 R2 0.996 0.996 0.998 0.995 0.998 R2 0.987 0.999 0.997 0.994 0.987 R2 0.997 0.983 0.992 0.994 0.990 R2 0.996 0.999 0.999 0.999 0.999 R2 0.911 0.996 0.997 0.999 0.962 R2 0.923 0.987 0.961 0.931 0.961 2 R 0.984 0.994 0.991 0.992 0.992 2 R
σ 0.657 0.293 0.835 0.310 0.650 σ 0.384 0.384 0.289 0.422 0.285 σ 0.476 0.160 0.241 0.322 0.476 σ 0.230 0.588 0.411 0.364 0.445 σ 0.656 0.324 0.233 0.262 0.360 σ 1.223 0.275 0.239 0.148 0.795 σ 1.009 0.419 0.714 0.958 0.721 σ 0.704 0.432 0.524 0.497 0.512 σ
Journal of Chemical Information and Modeling
Page 28 of 39
28 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
hpCADD Amber AM1 PM3 MNDO 18. Thiones/thioaldehydes (3) hpCADD Amber AM1 PM3 MNDO
0.640 0.306 2.456 2.719 0.361 RMSD 0.600 1.287 3.981 4.252 1.813
0.249 0.199 0.189 0.191 0.249 MAD 0.418 0.411 0.365 0.375 0.409
-0.577 -0.162 -2.443 -2.705 0.086 MSE -0.338 -1.199 -3.961 -4.229 -1.736
0.999 0.940 1.252 0.999 0.168 0.999 1.055 -0.476 0.998 0.200 0.999 1.043 2.036 0.997 0.237 0.998 1.032 2.413 0.996 0.286 0.996 1.028 -0.416 0.992 0.388 2 Pearson Slope Intercept R σ 0.999 0.968 0.997 0.997 0.499 0.999 1.036 0.491 0.998 0.449 0.999 1.002 3.931 0.998 0.474 0.999 0.995 4.310 0.997 0.520 0.998 0.994 1.858 0.996 0.615
Table 6: Statistical results of aqueous solvation free energies (in kcal mol-1) from 3D RISM calculations for the independent test data set comprising a total of 55 compounds and for hpCADD 2 and other charge models measured in relation to reference MP2 calculations. R and σ that were derived from a linear model fit with respect to the MP2-based results. For raw data, see Supporting Information. Test set (all compounds) hpCADD Amber AM1 PM3 MNDO
RMSD
MAD
MSE
Pearson
Slope
Intercept
2.158 3.533 2.730 2.964 2.547
1.594 1.282 1.731 1.779 1.648
-0.678 -3.078 -0.413 0.182 0.412
0.985 0.989 0.972 0.966 0.976
0.961 1.033 1.000 0.985 0.984
1.402 2.554 0.415 0.102 -0:010
2
R
0.970 0.978 0.945 0.934 0.952
σ 2.036 1.731 2.752 3.012 2.556
-1
Table 7: Statistical results of aqueous solvation free energies (in kcal mol ) from 3D RISM calculations for the 18 “inapplicable” substances excluded from the independent test set for hpCADD and other charge models measured in relation to reference MP2 calculations. R2 and σ that were derived from a linear model fit with respect to the MP2-based results. For raw data, see Supporting Information. Test set (“inapplicable compounds”) hpCADD Amber AM1 PM3 MNDO
RMSD
MAD
MSE
Pearson
Slope
8.709 5.807 5.543 6.062 5.259
4.039 1.549 4.018 4.189 3.257
-6.715 -5.502 -2.868 -3.511 -3.262
0.909 0.990 0.926 0.919 0.944
0.822 1.067 1.139 1.143 1.111
Intercept 8.458 4.768 0.978 1.662 1.796
R2 0.826 0.980 0.857 0.844 0.891
σ 5.316 1.802 4.821 5.033 4.207
For the complete training data set (55 compounds), we find that the electrostatics of the hpCADD Hamiltonian best reproduce the excess chemical potentials of all the methods investigated with respect to the reference calculations. This trend is confirmed by the consistency of all statistical metrics applied and becomes particularly obvious from both the resulting deviations and from the regression analysis, for which the slope of the hpCADD model is closer to one than for all others.
ACS Paragon Plus Environment
Page 29 of 39
Journal of Chemical Information and Modeling
29 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Clearly and not unexpectedly, MNDO is the runner-up among semiempirical Hamiltonians. The overall good performance, however, varies among different compound classes, but, most notably, hpCADD performs significantly better in cases where AM1, PM3, and MNDO fail dramatically, particularly for compounds containing sulfur. The MSE data and the values of the regression intercept indicate that hpCADD electrostatics lead to a better balance and less systematic shifts than other methods. Interestingly, Amber-GAFF point charge electrostatics, although worse than hpCADD, also show good overall agreement with the reference data as measured by the RMSD, although the MSE and Fig. 7 indicate a strong and systematic shift with respect to the MP2 reference. This property of the RMSD for a simple fixed charge Hamiltonian could be among the reasons why established force fields such as Amber still prevail. However, several compounds classes, mostly containing oxo and thiocarbonyl groups, amines, iodides, and conjugated systems exhibit stronger deviations. This is related to the difference between exact electrostatics and the point charge approximation, which neglects diffuse electron distribution components. To verify the predictive and thus transferability capability of the hpCADD electrostatics, 3D RISM calculations were performed for two independent test sets (a 55-compound test set and 18 “inapplicable” compounds) and evaluated equivalently as has been done for the training data set. The resulting statistics (see Tables 6 and 7 and Fig. S1 and Table S6 in the Supporting Information) reveal both overall good correlation with reference data and superior predictive quality of the hpCADD model measured by the RMSD and MAD, although the MSE of other semiempirical models show their somewhat better balance, at the price of considerably larger spread of absolute values. A clear advantage of the hpCADD approach is further revealed by regression analysis. Here, we find that both slope and intercept are optimal, indicating a well-balanced model that could be interpreted in the sense that the physical basis is captured best among the competing approaches. The hpCADD quality for solvation properties also agrees with the statistics of quantum mechanical observables (see Table 2), where the performance of hpCADD is optimal for the training set, while the test set shows only a small increase of error. The set of 18 “inapplicable” compounds illustrates the limits of our approach. Compound types not represented in the training set are treated less well by the hpCADD Hamiltonian than by the more generally applicable MNDO, AM1 and PM3
ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Hamiltonians and by Amber. This deficit is expected for a highly parameterized Hamiltonian with specific atom types. It illustrates the need to consider all relevant atom types in the parameterization.
Figure 6: Bar chart for the root mean square deviation (RMSD) and the mean signed error (MSE) of aqueous solvation free energies computed with the electrostatics described by hpCADD, Amber, AM1, PM3 and MNDO with respect to MP2 calculations. The data is shown for the complete training data set and for the chemical compound classes hydrocarbons 1, alcohols 2, amides 3, amines 4, bromides 5, carboxylic acids 6, chlorides 7, conjugated systems 8, esters 9, ethers 10, fluorides 11, iodides 12, ketones and aldehydes 13, nitriles 14, sulfones 15, thioethers 16 thiols 17, thiones and thioaldehydes 18.
However, the present parametrization does not assume completeness for the entire range of small molecule chemistries. For instance, it is necessary to introduce special nitrogen parameters e.g. for urea-type molecular fragments in order to cover a wider range of pharmaceutically relevant compounds or nucleobases. The “inapplicable test set was assembled in order to characterize these systematic deficiencies more clearly. Remarkably, not only the hpCADD parameterization fails for these species, but all Hamiltonians used in this work show rather large RMSD values in comparison with high level MP2 calculations (see Table 7 and Supporting Information for details). This finding is actually an advantage of the hpCADD approach since the enhanced parametrization flexibility will most likely allow for improvements in future work. The parametrization strategy allows for a simple complementation of atom types, should it
ACS Paragon Plus Environment
Page 30 of 39
Page 31 of 39
Journal of Chemical Information and Modeling
31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
turn out to be necessary, since these new parameters additively supplement the existing set.
Figure 7: Scatter plot of the aqueous solvation free energies derived by MP2 reference calculations and the solute models under investigation shown for both the training (left) and the test (right) data set (not accounting for the “inapplicable” compounds discussed in the text).
Conclusions We have shown that it is possible to parameterize an NDDO (MNDO-like) Hamiltonian to reproduce molecular electrostatic potentials around molecules, dipole moments, ionization potentials, and aqueous solvation free energies well. The agreement between the MEP from our first hpCADD parameterization reported here at the standard isodensity surface and that calculated at the MP2/aug-cc-pVDZ level is good. The structure of the MEP hypersurface is reproduced far better using the hpCADD Hamiltonian than by fitting MEP-derived charges to the MP2/aug-cc-pVDZ results. Important effects such as the σ-hole and the H-bond accepting properties of aromatic rings are reproduced well without having to resort to additional charges etc. Of all the semiempirical Hamiltonians tested, the hpCADD model reproduces MP2/aug-cc-pVDZ excess chemical potentials best within the 3D RISM approach. This holds for both training set (188 compounds) and independent test data sets (55 compounds, excluding 18 compounds that belong to “inapplicable” compound types not covered by the parametrization). Amber-GAFF fixed charged chemical potentials also show good correlation and relatively small errors in comparison with established ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
32 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
semiempirical models. However, a fixed charge model exhibits systematic shifts compared to reference data while hpCADD turns out to improve fixed charge thermodynamics considerably and in a well-balanced way at a reasonable cost. In ongoing work, we will adapt the classical potentials used in the complete hpCADD force field. Further developments will concentrate on reproducing polarizability for self-consistent applications. Finally, the recently introduced NDDO one-electron integrals based on CCSD(T) calculations may provide the basis for a later generation hpCADD NDDO-Hamiltonian.62
Acknowledgments This work was supported by the Bundesministerium für Bildung und Forschung (BMBF) as part of the hpCADD project. We thank Nicolas Tielker for supporting the selection of test set compounds.
ACS Paragon Plus Environment
Page 32 of 39
Page 33 of 39
Journal of Chemical Information and Modeling
33 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
References 1
Allinger, N. L. Molecular Structure: Understanding Steric and Electronic Effects from
Molecular Mechanics, John Wiley and Sons, Hoboken, NJ, 2010. 2
Hennemann, M.; Murray, J. S.; Politzer, P.; Riley, K. E.; Clark, T. Polarization-
Induced σ-Holes and Hydrogen Bonding, J. Mol. Model., 2012, 18, 2461-2469. 3
Clark, T.; Murray, J. S.; Politzer, P. The Role of Polarization in a Halogen Bond, Aus
J. Chem., 2014, 67, 451-456. 4
Devereux, M.; Plattner, N.; Meuwly, M. Application of Multipolar Charge Models and
Molecular Dynamics Simulations to Study Stark Shifts in Inhomogeneous Electric Fields, J. Phys. Chem. A 2009, 113, 13199-13209. 5
Plattner, N.; Bandi, T.; Doll, J. D.; Freeman, D. L.; Meuwly, M. MD Simulations
Using Distributed Multipole Electrostatics: Structural and Spectroscopic Properties of CO- and Methane-Containing Clathrates, Mol. Phys. 2008, 106, 1675-1684. 6
Devereux, M.; Gresh, N.; Piquemal J.-P.; Meuwly, M. A Supervised Fitting
Approach to Force Field Parametrization with Application to the SIBFA Polarizable Force Field, J. Comput. Chem. 2014, 35, 1577-1591. 7
Clark, T. σ-Holes, WIREs Comput. Mol. Sci., 2013, 3, 13-20.
8
Warshel, A.; Kato, M.; Pisliakov, A. V. Polarizable Force Fields: History, Test
Cases, and Prospects. J. Chem. Theor. Comput. 2007, 3, 2034-2045. 9
Lopes, P. E. M.; Huang, J.; Shim, J.; Luo, Y.; Li, H.; Roux, B.; MacKerell, A. D. jr.
Polarizable Force Field for Peptides and Proteins Based on the Classical Drude Oscillator. J. Chem. Theory Comput., 2013, 9, 5430-5449. 10
Jorgensen, W. L.; Schyman, P. Treatment of Halogen Bonding in the OPLS-AA
Force Field: Application to Potent Anti-HIV Agents. J. Chem. Theory Comput., 2012, 8, 3895-3901. 11
Wolters, L. P.; Schyman, P.; Pavan, M. J.; Jorgensen, W. L.; Bickelhaupt, F. M.;
Kozuch, S. WIRES Comput Mol. Sci., 2014, 4, 523-540. 12
El Hage, K.; Bereau, T.; Jakobsen, S.; Meuwly, M. Impact of Quadrupolar
Electrostatics on Atoms Adjacent to the Sigma-Hole in Condensed-Phase Simulations. J. Chem. Theory Comput., 2016, 12, 3008-3019.
ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
34 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
13
Hédin, F.; El Hage, K.; Meuwly, M. A Toolkit to Fit Nonbonded Parameters from
and for Condensed Phase Simulations. J. Chem. Inf. Model., 2016, 56, 1479-1489. 14
Kolář, M. H.; Hobza, P. Computer Modeling of Halogen Bonds and Other σ-Hole
Interactions. Chem. Rev., 2016, 116, 5155-5187. 15
Mu, X.; Wang, Q.; Wang, L.-P.; Fried, S. D.; Piquemal, J.-P.; Dalby, K. N.; Ren, P.
Modeling Organochlorine Compounds and the σ-Hole Effect Using a Polarizable Multipole Force Field. J Phys Chem B. 2014,118, 6456-6465. 16
Pople, J. A.; Santry, D. P.; Segal, G. A. Approximate Self-Consistent Molecular
Orbital Theory I. Invariant Procedures, J. Chem. Phys. 1965, 43, S129-S135. 17
Clark, T.; Stewart, J. J. P.; MNDO-like Semiempirical Molecular Orbital Theory and
its Application to Large Systems, in, Computational Methods for Large Systems, Reimers, J. J. (ed.), Wiley, Chichester, 2011, Chapter 8 (ISBN: 978-0-470-48788-4). 18
Horn, A. H. C.; Lin J.-H.; Clark, T. Multipole Electrostatic Model for MNDO-Like
Techniques with Minimal Valence spd-Basis Sets, Theor. Chem. Acc. 2005, 114, 159-168; Erratum: Theor. Chem. Acc., 2007, 117, 461−465. 19
Stewart, J. J. P.; Optimization of Parameters for Semiempirical Methods V:
Modification of NDDO Approximations and Application to 70 Elements. J. Mol. Model. 2007, 13, 1173-1213. 20
Stewart, J. J. P. Optimization of Parameters for Semiempirical Methods VI: More
Modifications to the NDDO Approximations and Re-optimization of Parameters. J. Mol. Model. 2013, 19, 1-32. 21
Gaus, M.; Cui, Q.; Elstner, M. DFTB3: Extension of the Self-Consistent-Charge
Density-Functional Tight-Binding Method (SCC-DFTB). J. Chem. Theory Comput. 2011, 7, 931-948. 22
Elstner, M.; Porezag, D.; Jungnickel, G.; Elsner, J.; Haugk, M.; Frauenheim, T.;
Suhai, S.; Seifert, G. Self-consistent-charge density-functional tight-binding method for simulations of complex materials properties. Phys. Rev. B 1998, 58, 7260-7268 23
Hennemann M.; Clark, T. EMPIRE: A Highly Parallel Semiempirical Molecular
Orbital Program: 1: Self-Consistent Field Calculations, J. Mol. Model. 2014, 20, 2331;
ACS Paragon Plus Environment
Page 34 of 39
Page 35 of 39
Journal of Chemical Information and Modeling
35 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
see also http://www.ceposinsilico.de/products/empire.htm, accessed on November 7th, 2016. 24
Margraf, J. T.; Hennemann, M.; Meyer B.; Clark, T. EMPIRE: A Highly Parallel
Semiempirical Molecular Orbital Program: 2: Periodic Boundary Conditions, J. Mol. Model., 2015, 21, 144. 25
Fielder, L.; Gao, J.; Truhlar, D. G.; Polarized Molecular Orbital Model Chemistry. 1.
Ab Initio Foundations, J. Chem. Theory Comput. 2011, 7, 852-856. 26
Jug K.; Geudtner, G. Treatment of Hydrogen Bonding in SINDO1, J. Comput.
Chem., 1993, 14, 639-646. 27
Dewar, M. J. S.; Thiel, W. Ground States of Molecules, 38. The MNDO Method.
Approximations and Parameters, J. Am. Chem. Soc., 1977, 99, 4899-4907. 28
Dewar, M. J. S.; Thiel, W. Ground States of Molecules. 39. MNDO Results for
Molecules Containing Hydrogen, J. Am. Chem. Soc., 1977, 99, 4907-4917. 29
Thiel, W. MNDO, in Encyclopedia of Computational Chemistry, Vol. 3, Schleyer, P.
v. R.; Allinger, N. L.; Clark, T.; Gasteiger, J.; Kollman, P. A.; Schaefer III, H. F.; Schreiner, P. R. (Eds.), Wiley, Chichester, 1998, vol 3, pp 1599-1604. 30
Stone, A. J. Distributed Multipole Analysis, or how to Describe a Molecular Charge
Distribution. Chem. Phys. Lett. 1981, 83, 233-239. 31
Sjoberg, P.; Murray, J. S.; Brinck, T.; Politzer, P. A. Average Local Ionization
Energies on the Molecular Surfaces of Aromatic Systems as Guides to Chemical Reactivity, Can. J. Chem. 1990, 68, 1440-1443. 32
Ehresmann, B.; Martin, B.; Horn A. H. C.; Clark, T. Local Molecular Properties and
their Use in Predicting Reactivity, J. Mol. Model. 2003, 9, 342-347. 33
Stewart, J. J. P. Optimization of Parameters for Semiempirical Methods VI: More
Modifications to the NDDO Approximations and Re-Optimization of Parameters. J. Mol. Model. 2013, 19, 1-32. 34
Beglov, D.; Roux, B. An Integral Equation to Describe Solvation of Polar Molecules
in Liquid Water, J. Phys. Chem. 1997, 101, 7821-7826.
ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
36 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
35
Kovalenko, A.; Hirata, F. Three-Dimensional Density Profiles of Water in Contact
with a Solute of Arbitrary Shape: A RISM Approach, Chem. Phys. Lett. 1998, 290, 237-244. 36
Kast, S. M.; Kloss, T. Closed-Form Expressions of the Chemical Potential for
Integral Equation Closures with Certain Bridge Functions. J. Chem. Phys. 2008, 129, 236101. 37
Ratkova, E. L.; Palmer, D. S.; Fedorov, M. V. Solvation Thermodynamics of
Organic Molecules by the Molecular Integral Equation Theory: Approaching Chemical Accuracy. Chem. Rev. 2015, 115, 6312-6356. 38
Tielker, N.; Tomazic, D.; Heil, J.; Kloss, T.; Ehrhart, S.; Güssregen, S.; Schmidt, K.
F. Kast, S. M. The SAMPL5 Challenge for Embedded-Cluster Integral Equation Theory: Solvation Free Energies, Aqueous pKa, and Cyclohexane–Water log D. J. Comput. Aided Mol. Des. 2016, 30, 1035-1044. 39
Pople, J. A.; Binkley, J. S.; Seeger, R. Theoretical Models Incorporating Electron
Correlation, Int. J. Quantum Chem. Symp. 1976, 10, 1-19. 40
Dunning, T. H. Gaussian-Basis Sets for Use in Correlated Molecular
Calculations.1. The Atoms Boron through Neon and Hydrogen. J. Chem. Phys. 1989, 90, 1007-1023. 41
Kendall, R. A.; Dunning, T. H.; Harrison, R. J. Electron-Affinities of the 1st-Row
Atoms Revisited - Systematic Basis-Sets and Wave-Functions. J. Chem. Phys. 1992, 96, 6796-6806. 42
Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.;
Cheeseman, J. R.; Scalmani, G.; Barone, V.; Mennucci, B.; Petersson, G. A.; Nakatsuji, H.; Caricato, M.; Li, X.; Hratchian, H. P.; Izmaylov, A. F.; Bloino, J.; Zheng, G.; Sonnenberg, J. L.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Vreven, T.; Montgomery, J. A., Jr.; Peralta, J. E.; Ogliaro, F.; Bearpark, M.; Heyd, J. J.; Brothers, E.; Kudin, K. N.; Staroverov, V. N.; Kobayashi, R.; Normand, J.; Raghavachari, K.; Rendell, A.; Burant, J. C.; Iyengar, S. S.; Tomasi, J.; Cossi, M.; Rega, N.; Millam, N. J.; Klene, M.; Knox, J. E.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.;
ACS Paragon Plus Environment
Page 36 of 39
Page 37 of 39
Journal of Chemical Information and Modeling
37 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Martin, R. L.; Morokuma, K.; Zakrzewski, V. G.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Dapprich, S.; Daniels, A. D.; Farkas, Ö.; Foresman, J. B.; Ortiz, J. V.; Cioslowski, J.; Fox, D. J. Gaussian 09, Gaussian, Inc., Wallingford CT, 2009. 43
Hennemann, M.; Lin, J.-H., T. Clark, ParaSurf’12, Cepos Insilico Ltd, Kempston,
UK, 2013: http://www.ceposinsilico.de/products/parasurf.htm, accessed on November 7th, 2016. 44
Dewar, M. J. S.; Zoebisch, E. G.; Healy, E. F.; Stewart, J. J. P. AM1 - A New
General Purpose Quantum Mechanical Molecular Model. J. Am. Chem. Soc. 1985, 107, 3902-3909. 45
Holder, A. J. AM1 in Encyclopedia of Computational Chemistry, Schleyer, P. v. R.;
Allinger, N. L.; Clark, ;T. Gasteiger, J.; Kollman, P. A.; Schaefer III, H. F.; Schreiner, P. R. (Eds.), Wiley, Chichester, 1998, Vol. 1, pp. 8-11. 46
Winget, P.; Selçuki, C.; Horn, A. H. C.; Martin B.; Clark, T. Towards a "Next
Generation" NDDO-Based Semiempirical Molecular Orbital Technique. Theor. Chem. Acc. 2003, 110, 254-266. 47
Stewart, J. J. P. Optimization of Parameters for Semi-Empirical Methods I-Method,
J. Comput. Chem. 1989, 10, 209-220; 221-264. 48
Stewart, J. J. P. Encyclopedia of Computational Chemistry, Vol. 3, Schleyer, ;
Allinger, N. L.; Clark, T.; Gasteiger, J.; Kollman, P. A.; Schaefer III, H. F.; Schreiner, P. R. (Eds.), Wiley, Chichester, 1998, pp. 2080-2086. 49
Baker, J. An Algorithm for the Location of Transition-States, J. Comput. Chem.
1986, 7, 385-395. 50
Kloss, T.; Heil, J.; Kast, S. M. Quantum Chemistry in Solution by Combining 3D
Integral Equation Theory with a Cluster Embedding Strategy. J. Phys. Chem. B 2008, 112, 4337-4343. 51
Hoffgaard, F.; Heil, J.; Kast S. M. Three-Dimensional RISM Integral Equation
Theory for Polarizable Solute Models. J. Chem. Theory Comput. 2013, 9, 4718-4726. 52
Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case D. A. Development
and Testing of a General Amber Force Field. J. Comput. Chem. 2004, 25, 11571174.
ACS Paragon Plus Environment
Journal of Chemical Information and Modeling
38 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
53
Wang, J., Wang, W., Kollman P. A.; Case, D. A. J. Mol. Graph. Model. 2006, 25,
247-260. 54
Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D.A. Development
and Testing of a General Amber Force Field. J. Comput. Chem. 2003, 25, 9, 11571174. 55
Singh, U. C.; Kollman, P. A. An Approach to Computing Electrostatic Charges for
Molecules, J. Comput. Chem. 1984, 5, 129-145. 56
Besler, B. H.; Merz Jr., K. M.; Kollman, P. A. Atomic Charges Derived from
Semiempirical Methods, J. Comput. Chem. 1990, 11, 431-439. 57
Beck, B.; Clark, T.; Glen, R. C. VESPA: A New, Fast Approach to Electrostatic
Potential-Derived Atomic Charges from Semiempirical Methods, J. Comput. Chem. 1997, 18, 744-756. 58
Maw, S.; Sato, H.; Ten-no, S.; Hirata F. Ab Initio Study of Water: Self-Consistent
Determination of Electronic Structure and Liquid State Properties, Chem. Phys. Lett. 1997, 276, 20-25. 59
Politzer, P.; Murray, J. S.; Concha, M. C.; σ-Hole Bonding Between Like Atoms; A
Fallacy of Atomic Charges. J. Mol. Model. 2008, 14, 659-665. 60
Suzuki, S.; Green, P. G.; Bumgarner, R. E.; Dasgupta, S.; Goddard, W. A.; Blake,
G. A. Benzene Forms Hydrogen Bonds with Water, Science, 1992, 257, 942-945. 61
Clark, T.; Hennemann, M.; Murray, J. S.; Politzer, P. Halogen Bonding: The σ-
Hole, J. Mol. Model. 2007, 13, 291-296. 62
Margraf, J. T.; Claudino, D.; Bartlett, R. J. Determination of Consistent
Semiempirical One-Centre Integrals Based on Coupled Cluster Theory, Mol. Phys., 2016, published online, DOI: 10.1080/00268976.2016.1200755
ACS Paragon Plus Environment
Page 38 of 39
Page 39 of 39
Journal of Chemical Information and Modeling
39 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
ToC Graphic:
ACS Paragon Plus Environment