Application of the ChIMES Force Field to Nonreactive Molecular

Nov 26, 2018 - (48,49) While these machine learning force fields yield a more .... In this framework, ghost atoms are generated by replicating real at...
0 downloads 0 Views 3MB Size
Subscriber access provided by Stockholm University Library

Molecular Mechanics

Application of the ChIMES Force Field to Non-Reactive Molecular Systems: Water at Ambient Conditions Rebecca Kristine Lindsey, Laurence E. Fried, and Nir Goldman J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.8b00831 • Publication Date (Web): 26 Nov 2018 Downloaded from http://pubs.acs.org on December 2, 2018

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Application of the ChIMES Force Field to Non-Reactive Molecular Systems: Water at Ambient Conditions Rebecca K. Lindsey,∗,† Laurence E. Fried,† and Nir Goldman†,‡ †Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States ‡Department of Chemical Engineering, University of California, Davis, California 95616, United States E-mail: [email protected] Phone: +1-925-422-0915

Abstract We demonstrate development of the Chebyshev Interaction Model for Efficient Simulation (ChIMES) for molecular systems through application to water under ambient conditions (298 K, 1 g/cm3 ). These models, which are comprised of linear combinations of Chebyshev polynomials explicitly describing two- and three-body interactions, are largely fit by force matching to Kohn–Sham Density Functional Theory (DFT). Protocols for selecting user-specified parameters and inclusion of stress tensor data are investigated, and structural and dynamical property prediction for resulting models is benchmarked against DFT. We show that the present ChIMES force fields yield excellent agreement with DFT without the need for additional terms such as those for Coulomb interactions. Overall, we show that tractable parameterization and subse-

1

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

quent accuracy of the present models make ChIMES an ideal candidate for extension of DFT dynamics to larger system sizes and longer time scales.

1

Introduction

Water plays a crucial role in innumerable processes, motivating decades of research into the phenomena governing its many properties. Nevertheless, a complete and detailed theoretical picture of even the ambient liquid is lacking. 1 Substantial effort has been devoted to the creation of empirical force fields based on high-level quantum chemical gas-phase cluster calculations. 2–4 Indeed, rapid convergence of the n-body expansion for water 2–7 suggests force fields using 2- and 3-body descriptions (or many body descriptions, as through polarization) can yield accurate results for both small clusters as well as the bulk liquid. Earlier successes on this front focused on reproducing vibrational-rotation-tunneling (VRT) spectroscopic data 8–10 and/or use of symmetry-adapted perturbation theory (SAPT). 11,12 More recent works have focused on parameterization of many-body terms directly to CCSD(T) data of clusters up to the tetramer, yielding accurate prediction of dimer binding energies, 13,14 larger cluster data, 5 and the vibrational spectrum of the liquid, 15 among numerous other properties of the bulk phase. 16 Molecular dynamics (MD) simulations based on Kohn-Sham Density Functional Theory (DFT) offer a particularly attractive approach for computation of bulk water, as they can provide simultaneous information about electronic states and the physical-chemical properties of multiple phases. 17,18 This in in contrast to higher level quantum chemical approaches (e.g., coupled cluster calculations), which are generally restricted to small clusters or gasphase data. 19,20 DFT remains one of the most popular and widely used simulation methods for condensed matter, including the prediction of material properties and bulk chemical reactivity. This is particularly true for reactive conditions, where the underlying MD potential needs to accurately account for the breaking and forming of chemical bonds. 21–24 Many of

2

ACS Paragon Plus Environment

Page 2 of 41

Page 3 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

the issues pertaining to computation of bulk water via DFT 25,26 can be resolved through use of improved hybrid functionals 27–33 and Van der Waals dispersion corrections. 34–39 Nevertheless, for many problems of interest, such as free energies, transport properties, or chemistry under extreme conditions, access to time and length scales well beyond those attainable with DFT is desired (i.e. µm and µs compared to Å and ps, respectively). In these cases, force-field based extension of DFT-MD has demonstrated promise as an alternative approach for reaching these target scales. 40–44 For example, bond-order methods like ReaxFF 45 or the reactive empirical bond order (REBO) model 46 and the SAPT based force field of Schmidt and co-workers 12 have proven successful for some applications. However these models can prove challenging to parameterize due to the sparsity of available fitting data as well as the difficulty in performing non-linear optimizations for these complex functional forms. On the other hand, models with greater flexibility that utilize machine learning are being developed, such as the Gaussian Approximation Potentials (GAP), 47 which have already been used to aid in quantification and correction of many body errors within exchange correlation functionals. 48,49 While these machine learning force fields yield a more automated fitting procedure, we note that the potentials themselves can be exceedingly complex, and can require large repositories of training data, resulting in slow generation of parameters sets. Thus, the creation of an MD force field that has sufficient flexibility to capture DFT level physics while allowing for rapid parameterization remains an open challenge in this area. Recently, a new force field comprised of linear combinations of Chebyshev polynomials explicitly describing two- and three-body interactions (“Chebyshev Interaction Model for Efficient Simulation” or “ChIMES”) 50 was developed, which results in good agreement with DFT for the structure and dynamics of a metallic molten carbon system. The ChIMES model is particularly well suited for DFT-MD extension due to its simple yet flexible functional form that (i) allows complex DFT potential energy surfaces to described with high accuracy, (ii) is completely linear in fitted coefficients, allowing for rapid parameterization, and (iii) affords

3

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 4 of 41

size scalability and orders-of-magnitude improvements in associated computational cost with DFT-MD. In the present work, we investigate use of the ChIMES model and fitting approach for extension of DFT-MD liquid water simulations at ambient conditions. Liquid water presents a particularly challenging test case for force field development due to the wide range of energies that need to be accurately sampled, including short-ranged librational and vibrational modes as well as long-ranged weak (but extremely significant) interactions due to hydrogen bonding. We note that there has been significant effort in the literature to determine optimal exchange-correlation functional and/or dispersion models for aqueous systems. 25 However, the goal of this work is not to produce a universal water force field, but rather to provide a straightforward framework for determining ChIMES force fields based on DFT or other ab initio methods of choice. Consequently, we choose to determine our ChIMES model based on the PBE functional with D2 dispersion corrections, though we note that a similar approach could be used to determine ChIMES models based on other, more accurate DFT models. In the following sections, the influence of user-defined model parameters and inclusion of stress tensor data on ability to reproduce structure, dynamics, density, and pressure of the reference DFT-MD simulation will be evaluated for water at 298 K and 1 g/cm3 . A general recipe for development of ChIMES models to DFT data for liquid water is provided, and a force field based on the present reference data is provided.

2

Method and Computational Details

The ChIMES model, given below, is comprised of explicit two- and three-body interaction terms: EChIMES =

XnaX i

Eij +

j>i

na X XX i

Eijk ,

(1)

j>i k>j

where na is the total number of atoms in the system and EChIMES , Eij , and Eijk are the total energy of the system, the energy between a pair of atoms i and j, and the energy between 4

ACS Paragon Plus Environment

Page 5 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

a set of three atoms, i, j, and k, respectively. The two-body energy takes on the following form: O2B    X e ,e Eij = fpei ,ej rij + fsei ,ej rij ceni ,ej Tn siji j ,

(2)

n=1

where Tn is a Chebyshev polynomial of order O2B = n, which depends on a transformed pair e ,e

ee

distance, siji j , defined over [−1,1], with the corresponding polynomial coefficient of cni j . Note that the superscripts ei and ej indicate the element type of atoms i and j, and for the present system, are assigned values of “O” or “H”. We enforce permutational invariance such that cOH ≡ cHO n n . Further details on the transformation used to convert between rij and sij will be presented in section 3.1.  e ,e The fp i j rij function in Equation 2 introduces a penalty to discourage the simulation from exploring pair distance (rij ) values below the minimum distance sampled in the training trajectory and is defined by

 fpei ,ej rij =

  3  ei ,ej ei ,ej  Ap rc,in + dp − rij , if rij < rc,in + dp   0,

,

(3)

otherwise e ,e

i j where Ap is the penalty prefactor, rc,in is the inner cutoff radius for the pair of atoms with

element types ei and ej , and dp is the penalty initiation distance (e.g., a “skin” of 0.02 Å).  e ,e The remaining function, fs i j rij , ensures the potential goes smoothly to zero at the outer cutoff and is given by fsei ,ej rij = 

1−

rij

!3

e ,e

i j rc,out

,

(4)

e ,e

i j where rc,out is the outer cutoff distance for the pair of atoms with elements ei and ej .

The three-body energy, constructed from the 2-body descriptions, is given as the product

5

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 41

of Chebyshev polynomials for constituent atom pairs:

Eijk =

fsei ,ej

rij



fsei ,ek

(rik ) fsej ,ek

rjk

O3B X O3B X O3B X

eij ,eik ,ejk Tm cm,p,q



e ,e siji j



Tp

seiki ,ek



Tq



e ,e sjkj k



.

m=0 p=0 q=0 ∗

(5) Here, the single sum over two-body order given in Equation 2 is replaced with a triple sum over three-body orders, O3B , for each of the three constituent atom pairs, ij, ik, or jk of element pair types eij , eik , ejk . Coefficients again have permutational invariance enforced with respect to atom type, such that for the set of atoms i, j, k of types ei , ej , ek = O, O, H, cOO,OH,OH = cOH,OO,HO = cHO,HO,OO , etc. The asterisk below the triple sum indicates that 2,1,1 1,2,1 1,1,2 only terms for which two or more of the m, p, q indicies are greater than zero are included, ensuring that Eijk always corresponds to the energy between three atoms. The expression for Eijk also contains fs smoothing functions for each constituent pair distance. Note that outer cutoffs can differ between two-body or three-body equations. Penalty functions are e ,e

i j not included for three-body interactions; instead, all penalties for rij < rc,in are handled by

the two-body Ap terms. The instantaneous pressure tensor P including three body interactions is calculated as follows (see Ref. 51 ): 

P ab

n

n

a X aX ∂Eij rij,a rij,b 1 X =  mi vi,a vi,b − V ∂rij rij i i j>i



na X XX ∂Eijk rij,a rij,b i

j>i k>j

∂rij

rij

 +

∂Eijk rik,a rik,b ∂Eijk rjk,a rjk,b  + . ∂rik rik ∂rjk rjk

(6)

In Eq. 6, a and b are cartesian components that can take the values (x, y, z). rij,a is the a component of the displacement vector between atom i and atom j. vi is the velocity of atom i, and mi is the mass of atom i. For calculations with charges, there is an additional contribution to the pressure due to the Ewald sum, which has been derived by Heyes. 52 e ,ej

The ChIMES force field is entirely linear in cni 6

e ,e ,ejk

ij ik and cm,p,q

ACS Paragon Plus Environment

coefficients, which allows

Page 7 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

for rapid and automated fitting by force matching to a DFT trajectory. Specifically, we aim to minimize the following objective function: v u u u RMSE = t

1 nf (3na + nt )

nf X

 wF2

i=1

na X 3  X

ChIMES{c}

DFT Fijk − Fijk

j=1 k=1

2

+ wσ2

nt  X

 ChIMES{c}

σilDFT − σil

l=1

(7) where RMSE and {c} are the root-mean-squared error and model coefficients, respectively. The number of frames, atoms, and stress tensor components considered in the fit are given by nf , na , and nt , respectively. Fijk indicates the k th Cartesian component of the force on atom j in configuration i, while σil indicates the lth system stress tensor component in configuration i. The superscripts “ChIMES” and “DFT” indicate forces/stresses predicted from the present force-matched model and the DFT molecular dynamics (DFT-MD) training trajectory. Model parameters {c} can be obtained through solution of the following overdetermined matrix equation:

wM c = wX DFT ,

(8)

DFT where X DFT the vector of Fijk and σilDFT values, w is a diagonal matrix of weights, and

the elements of matrix M are given by:

Mab =

∂Xa,ChIMES{c} . ∂cb

(9)

In the above, a represents a combined index over force and stress tensor components, and b is the index over permutationally invariant model parameters. For fits that do not include stress tensor data, we set nt in eq 7 to zero, and wF to unity; otherwise, stress tensor and force components are assigned weights of wσ and unity, respectively. In the present work, principal component analysis (PCA) based on the singular value decomposition (SVD) of the force derivative matrix with a regularization variable of 6.667 × 10−5 is used to solve

7

ACS Paragon Plus Environment

2

,

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

eq 8. For further details of the SVD approach, and how it is used in conjunction with force matching to generate ChIMES models, the reader is directed to references 50 and 53. The initial training trajectory was generated from a 24 ps DFT molecular dynamics (DFT-MD) simulation of 32 water molecules in the N V T ensemble at 298 K and 1.0 g/cm3 . Because the DFT-generated trajectory may not completely sample all possible model interaction space, 50 MD trajectories resulting from ChIMES force fields fit in a non-iterative fashion can sample unphysical configurations. To overcome this issue, training sets can be obtained in the following self-consistent manner: first, a preliminary force field is fit to a small subset of the original DFT atomic configurations (we call these frames below), e.g. 50. Dynamics are then run using the preliminary model, which can lead to poor or unphysical structures, such as abnormally large concentrations of H3 O+ species. A fraction of these “faulty” frames are then collected and added to the training set, e.g., one out of every five. The final training trajectory is then constructed from a relatively large number of original DFT frames (250) and the 20 frames taken from the preliminary model. The DFT-MD simulation cell is small (i.e. 9.8553 Å3 ), which can be problematic for force field fitting. Ideally, interaction cutoff distances should be less than half of the training trajectory box length when force matching, to prevent multiple interaction distances from being associated with the set of ions due to periodic boundary conditions. However, this restriction can preclude inclusion of pertinent longer range data (e.g., third and fourth solvation shells for H2 O) for the small simulations cells typical to DFT-MD. To ensure we do not violate this requirement while still allowing for relatively large cutoff radii, we include seven additional DFT-computed force frames, determined from configurations taken from preliminary ChIMES based simulations on a 768 atom system of box length 19.713 Å, allowing for cut off radii of up to 9.857 Å. This leads to a final training trajectory size of 277 frames. All DFT simulations are run with VASP 54–57 using a planewave cutoff of 1000 eV, the Perdew-Burke-Ernzerhof generalized gradient approximation functional 58,59 (PBE), projectoraugmented wave pseudopotentials 60,61 (PAW), and the DFT-D2 method for description of

8

ACS Paragon Plus Environment

Page 8 of 41

Page 9 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

dispersion interactions. DFT-MD simulations are initialized from a well-converged N V T DFT-MD simulation with the same level of theory and at the same conditions. These simulations are run with a 0.2 fs timestep and a Nose-Hoover thermostat 62,63 for constant volume-temperature (N V T ) simulations, and Langevin dynamics with a Parrinello-Rahman barostat 64,65 for constant pressure-temperature (N pT ). It is well known that quantum nuclear vibrational effects can have significant impact on computed bulk water properties, 43,44 even under extreme conditions. 24 However, for the purpose of this work, we choose to focus on results from classical ionic equations of motion, only. We provide parameters for the best-performing model in the Supporting Information. Unless otherwise stated, results for the ChIMES models are obtained from simulations in the N V T ensemble at 298 K, across eight statistically independent simulations (i.e. all simulations use a different initial configuration and seed for velocity generation). All reported errors are computed as the standard error of the mean across all eight independent trajectories. For the sake of consistent comparison to DFT, all ChIMES simulations are run for the same 32 water molecule system size and 25 ps production simulation length as DFT, unless otherwise stated. Where applicable, ChIMES simulations utilize a global Hoover thermostat and barostat, and the Ewald sum for charges, and a 0.2 fs timestep. Rather than computing rij based on the minimum image convention for atoms within the simulation cell, rij values are taken as the explicit distance between any real atom i, and any real or ghost atom j, whose parent atom is > i. In this framework, ghost atoms are generated by replicating real atoms in the ± x, y, and z directions. 66

3

Results and Discussion e ,ej

Aside from the automatically generated cni

e ,e ,ejk

ij ik and cm,p,q

coefficients, the user can specify

a number of other ChIMES model features. For ease of discussion we have divided these parameters into three categories as follows: the first category, henceforth referred to as

9

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 41

e ,e

e ,e

i j i j “set 1” controls fine-tuning aspects of the model and includes rc,out,2B and rc,out,3B . This

category also contains an option to generate atomic site charges during the force matching procedure or to exclude them altogether. Exclusion of atomic charges can be advantageous from a computational efficiency standpoint, i.e., not requiring computation of expensive Ewald summations. We note that, when charges are included in the fit, augmentation to eqs 7 through 9 is required. Further details on fitting of atomic charges are described in Koziol et al. 67 The final set 1 option is the method used to transform rij to sij . Coordinate transforms provide a means to transform interatomic distance variables to the [−1,1] interval over which Chebyshev polynomials are defined. They also allow for fine tuning of model sensitivity to small changes in interatomic distances through use of different coordinate transformation functions (e.g., exponentials, etc., discussed below). The second category, “set 2,” contains O2B and O3B and controls the polynomial order/complexity of the primary functionality of the ChIMES force field. An additional catOO HH OH egory of parameters contains Ap , dp , rc,in , rc,in , and rc,in , all of which are set through a

pre-established procedure. 50 These are thus considered fixed with values of 105 kcal/mol/Å3 , 0.02 Å, 2.307 Å, 1.141 Å, and 0.894 Å for this work, respectively. The following sub-sections will present force field sensitivity to set 1 and 2 parameters. Initial models were parameterized without explicitly including stress tensor data in the training set. Subsequent models explore the influence of explicit stress tensor fitting on our results. Models performance will be discussed in terms of predicted radial distribution functions (RDF), vibrational power spectra, pressure, and H-bonding relative to DFT.

3.1

Influence of Set 1 Variables e ,e

e ,e

i j i j , distance transWe begin with a study of the influence of set 1 variables, rc,out,2B , rc,out,3B

formation method, and whether to include charges, on model performance. During this portion of the work, set 2 variables O2B and O3B are fixed at values of 12 and 4, respectively, which has yielded good model performance previously. 50 We note also that no stress tensor 10

ACS Paragon Plus Environment

Page 11 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

data are included in the fit. The set 1 default options along with investigated alternatives e ,e

i j are provided in Table 1. The two distances evaluated for rc,out,2B ensure that the two-body

potential is sufficiently ranged to capture either the 3rd or 4th solvation shells of the OO atom pair, respectively. These values were selected based on the OO atom pair because its solvation shells persist at longer distances than either the OH or HH pairs. The outer cutoff for each 3-body interaction is defined by a set of three values, one for each constituent pair. For example, the interaction between respective atom types ei , ej , and ek = O, H, and H OH OH HH is constrained by the set of 3-body outer cutoffs: rc,out,3B , rc,out,3B , and rc,out,3B . Each set of e ,e

i j OO OH HH distances listed in Table 1 for rc,out,3B give the possible {rc,out,3B ,rc,out,3B ,rc,out,3B } three-body

cutoff distances in Ångstroms, and are set to encompass either the 3rd solvation shell for each individual pair type, as for {8.0,6.0,5.0}, or for the longest of the three (i.e., OO), as for {8.0,8.0,8.0}. Set 1 variables also explore the influence of charges on model performance. When included, charges are obtained during the force matching process, and constraints are introduced to ensure that (i) qO + 2qH = 0 and (ii) qO < 0. We note that the resulting charges, i.e. qO = −0.638, are somewhat smaller than those found in common 3-site water models (i.e. qO ≈ −0.8). 68,69 The final set 1 option is the method for transforming rij to sij (i.e. the model symmetry functions), and Table 1 provides the three options explored herein. Here, pair distances as well as inner and outer cutoffs are first converted to transformed coordinates through one of three approaches: (i.) direct, where x(rij ) = rij , (ii.) inverse, where x(rij ) = 1.0/rij , and (iii.) Morse, 5,13,70 where xei ,ej (rij ) = exp(−rij /λei ,ej ). In the third conversion method, λei ,ej can be considered a characteristic bonding distance and is set to the location of the first peak in the DFT radial distribution function for the ei , ej atom pair type. Increasing λei ,ej has the effect of decreasing the rate of decay of interactions e ,e

i j as the outer cutoff is approached, in a Morse-type fashion. 70 Upon conversion of rij , rc,in ,

e ,e

e ,e

e ,e

e ,ej

i j i j i j and rc,out to xij , xc,in , and xc,out values, we compute the transformed pair distance, siji

through:

11

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

e ,e

e ,e

(10)

e ,e

e ,e

(11)

i j i j i ,ej xeavg = 0.5(xc,out + xc,in )

e ,e

i j i j i j xdiff = 0.5|xc,out − xc,min |

e ,ej

siji

Page 12 of 41

e ,e

i j ei ,ej = (xij − xavg )/xdiff

(12)

Figure 1 and Table 2 summarize the performance of the set 1 models relative to our DFT calculations. We note that each model is named according to the option that differs from the default set. For example, “Direct Transf.” means that the only difference between this model and the default is use of a direct transformation, as opposed to the default Morse-type option. With the exception of the set 1 model that uses a direct transformation (which exhibits poorest recovery of the DFT diffusion coefficient and yields the O–O and H–H RDFs that differ most significantly from DFT), all others are in excellent agreement with DFT across all properties examined. Error bars for ChIMES diffusion constants are 1 standard deviation values calculated from 8 independent 25 ps trajectories. Since a single trajectory is used for the DFT diffusion constant, no error bar is reported, but the DFT diffusion constant error is expected to have similar magnitude to that of the ChIMES simulations. The generally good set 1 model performance is reflected in small root-mean-squared errors in the force, RMSE, and reduced RMSE, RMSEr = RMSE/h| FDFT |i, provided in Table 2, which all fall at or below RMSE = 4.0 kcal/(mol Å) and RMSEr = 0.35. In fact, the present RMSEr values are found to be less than that found for the ChIMES molten carbon model (i.e. 0.439). 50 In order to compare system structure predicted by each set 1 model against DFT, Figure 1 gives model RDFs in relative terms, i.e. as the difference between model and DFT values. For reference, DFT RDFs for the OO pair predict rmax , g(rmax ), rmin , and g(rmin ) of 2.75 Å, 3.0, 3.27 Å, and 0.6, respectively, while experiment yields values of 2.8 Å, 2.57, 3.47 Å, and 0.84, respectively. 71 All models exhibit similar results with good reproduction of the DFT RDF, though a few distinguishing remarks can be made. Deviation between RDFs predicted by the set 1 models and those from DFT are most noticeable for the OO and OH 12

ACS Paragon Plus Environment

Page 13 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

atom pair types, and persist over the longest distances for the OO pair type. Although the deviations from DFT are small, we find that use of a direct transformation yields the RDFs that differ most significantly from the DFT result. Decreasing the two-body cutoff appears to increase the O–O peak height only slightly, while increasing the three-body cutoff leads to slight improvement relative to the other examined models. For reference, the absolute RDF for the latter model is drawn along with that from DFT, in Figure 1. Overall, the largest deviations from DFT in non-bonded structure in set 1 models occur near the 1st OO peak, 2nd and 3rd HH peaks, and across the first three non-bonding OH peaks, with absolute differences from DFT RDFs of less than 0.75, 0.5, and 0.25, respectively. Dynamic properties including the vibrational power spectrum and self-diffusion coefficients are also provided in Figure 1. Overall, the DFT power spectrum exhibits peaks near the expected values, with low frequency peaks for O... O... O bending/caging, hydrogen bond vibrations, and water libration occurring at 71 (50), 215 (200), and 401–1270 (400-1200) cm−1 for DFT (experiment), respectively, 72–75 and H–O–H bending and O–H stretching at 1633 (1595) and 3410–3670 (3657–3756) cm−1 , respectively. 76 Power spectra for the models are in good agreement with DFT, yielding only small differences as the set 1 variables are changed. All models exhibit a shift of roughly 100 cm−1 in the H–O–H bending peak, though this shift amounts to only a 0.3 kcal/mol energy difference. We also find that agreement between self-diffusion coefficients, ds predicted by DFT and the set 1 models are generally good, with all but the model using a direct transformation yielding similar results. These values should be interpreted with caution, however; experiments predict ds on the order of 2.4 × 10−9 m2 /s, 77,78 or roughly 0.24 Å2 /ps. Thus, during the 24 ps DFT trajectory computed herein, molecules have only had time to diffuse roughly one box length. In contrast, it is typically recommended that ds be computed from trajectories long enough to observe diffusion over several box lengths. Furthermore, system size effects can be present due to the long-ranged flow fields induced by molecular motion. 79 Nevertheless, comparisons in predicted values from DFT and each model can still be useful

13

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

when computed from the same trajectory length and not taken as estimates of experimental values. Structural and dynamic performance of the set 1 models were also evaluated in terms of H-bonding. Here, we define two water molecules to be H-bonded if the following criteria are met: (i) rO... O < 3.3 Å, (ii) rO... H < 2.6 Å, and (iii) cos(θO−H... O ) < −0.1, which fall in the range of commonly used criteria. 80–82 Note that the resulting number of hydrogen bonds per molecule, nH-bond , is highly dependent on choice of H-bonding criteria. Thus, we simply use nH-bond as a means of consistent structural comparison between DFT and the present models. We also consider intermittent H-bond lifetimes, τH-bond , which we define as the average length of time that the above criteria are met, for any H-bonded pair. Table 2 gives nH-bond , and τH-bond for each of the set 1 models as well as DFT. Good agreement in nH-bond is observed between the models and DFT, with the largest differences found for the direct and inverse transformation styles (≈ 50 fs). Nevertheless, we find good agreement within the range of published τH-bond predictions (i.e. 50–400 fs), which employ a variety of differing H–bond criteria. 83,84 Finally, we compare pressures predicted by the set 1 models with those from DFT. As will be discussed in Section 3.3, there is no a priori expectation that the force matched models should yield agreement with the DFT pressures. We find that the set 1 force fields generally fall within 2 − 10 GPa of the DFT result, with the inverse transformation and direct transformation models as outliers. None of the models presented in this section is found to yield good reproduction of the DFT total system pressure, however. Our results for set 1 parameters suggest that model accuracy is roughly independent of inclusion of atomic point charges. We note that these findings, i.e., that explicit long-ranged and electrostatics interactions are unnecessary for calculation of standard MD properties, are consistent with those of Molinero and Moore, who demonstrated similar results using a modified Stillinger– Weber potential 85 for liquid water. For the remainder of this work, we fix set 1 variables at the default values of rc,out,2B = 9.855, rc,out,3B = {8.0,6.0,5.0}, no charges, and a “Morse-

14

ACS Paragon Plus Environment

Page 14 of 41

Page 15 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

style” transformation, unless otherwise stated, affording a balance between model accuracy and computational expense.

3.2

Influence of Set 2 Variables

The second category of user-specified variables, set 2, contains O2B and O3B , controlling the level of complexity with which two- and three-body interactions can be described. Chemical intuition cannot be leveraged to set the Chebyshev polynomial orders, and thus we look to a more automated method of paring down the possible parameter space. In this work, the hold-out cross-validation approach was used, 86 which aims to identify the O2B and O3B yielding the highest accuracy, without statistical over-fitting to the training data. This is of particular pertinence to ChIMES models, which can contain hundreds of parameters. In principle, for a fixed training set, force RMSE will decrease as the number of model parameters increase. Thus, if one aims only to drive RMSE down, the resulting model can easily become over-fit. In hold-out cross-validation, however, a number of configurations are withheld from the training data, and used only for computation of a cross-validation error, RMSExval . In practice, RMSExval should go through a minimum as parameter set size is increased, indicating the optimal balance between model accuracy and complexity. To obtain these results, force fields were fit for 63 different combinations of the set 2 variables, where O2B was varied over {4, 6, 8, 10, 12, 16, 20, 24, 28} and O3B over {0, 2, 4, 6, 10, 12, 14}. As with section 3.1, no stress tensor data are included in the fit. We note that these models were fit to the same training trajectory used for generation of set 2 models. A new set of 50 randomly selected configurations were obtained from the remaining original DFT trajectory, for which forces from each resulting set 2 model were computed, yielding a collection of cross-validation RMSE values. This procedure was performed only once. The results of this test (Figure 2) show that RMSExval do not reach a minimum even at the highest investigated orders, namely O2B /O3B = 28/14, which is substantially larger than the order used in previous work on metallic carbon (i.e. O2B /O3B = 12/4). 50 15

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

We find that at O3B ≥ 4, RMSExval continues to decrease, although results for different O2B values have converged at this point. This trend indicates that here O3B begins to have a greater influence on RMSExval than O2B . From this point to the maximum range of the plot, the changes in RMSExval (RMSExval,r ) are small, decreasing to values of roughly 1.9 kcal/mol/Å (0.15). We then test through MD simulation a number of different set 2 force fields that appear to yield acceptable cross-validation errors, where we consider twobody orders of 12 and 24 simultaneously with three-body orders of 2, 4, 6, and 10, resulting in a total of eight combinations. Lower 2-body orders were not considered as they yielded the highest RMSE for a given 3-body order, and thus were expected to give poor results. We note that when tested, 2-body only models yielded stable simulations but did a poor job of recovering the H–O–H angle, often yielding values greater than 120◦ . Findings from simulations using set 2 models are summarized in Figure 3 and Table 3. Data for O3B = 2 and 10 are excluded due to observation of water dissociation in the former case (indicative of an under-fit model) and singularities in the latter (indicative of over-fitting to unphysical configurations from self-consistently generated configurations). Focusing first on the relative RDF data in Figure 3, it is clear that O2B /O3B = 12/4 provides the best reproduction of DFT structure. This can be seen more clearly in the full RDFs, where this model is compared directly with DFT. Increasing O2B generally leads to an over-structured system while increasing O3B leads to under-structuring, particularly near the first O–O peak. The tendency for larger orders to yield worse structure is likely driven by the presence of self-consistent configurations exhibiting large magnitude forces, (i.e., overfitting, despite continued decreases in RMSExval ) and can be particularly problematic for three-body terms, with parameter set size increasing rapidly with O3B . The set 2 power spectra exhibit subtle differences as two- and three-body order are increased; increasing O3B leads to greater population of high frequency O−H stretching (≈ 3600 cm−1 ), a shift of the H−O−H bending peak of ≈ 50 cm−1 , and a relatively broader water libration peak (≈ 600 cm−1 ). Increasing two-body order mimics these trends, though

16

ACS Paragon Plus Environment

Page 16 of 41

Page 17 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

to a much lesser extent. Changes in self-diffusion coefficients are small but observable; values are found to increase when O3B is increased and decrease with larger O2B , consistent with structural trends observed in the RDF data. Despite this and the relatively less accurate RDF data observed at high O3B , H-bond counts and τH-bond values do not appear to be significantly affected by set 2 settings. Additionally, we find no clear trend in predicted pressures, given in Table 3. Based on these results, we fix polynomial orders at O2B /O3B = 12/4 for the remainder of this work.

3.3

Targeting Pressure and Density

The models and subsequent results discussed to this point have been the result of fitting directly to DFT forces. Though reproduction of structure and dynamics has been good, pressures have been substantially different from the DFT value. This result is somewhat expected, as even a perfectly force matched potential (i.e. having RMSE = 0) can yield poor estimates of the DFT pressure, since the energy dependence on system volume is not constrained. 67,87 Furthermore, for any atom i with x-component of the total force, Fi,x = Patoms DFT FF j6=i Fij,x , a perfect force match only requires Fi,x = Fi,x , while the components of the defining summation (i.e. Fij,x ) are free to take on any value (considering only 2-body interactions). This effect can be compensated for, however, through inclusion of stress tensor data in fitting procedure. 87 Here, we evaluate the trade off between accuracy in previously discussed properties with reproduction of system pressure/density. To accomplish this, we include the diagonal components of the stress tensor corresponding to the potential energy contribution to the pressure, for frames from the original DFT-MD simulation. To account for the difference between magnitude and number of the stress tensor components and forces considered in the fit, we evaluate the influence of weighting the stress components versus the forces on the final results. For this test, we use the set 1 and set 2 options that yielded the best balance between performance and computational efficiency in previously discussed metrics, e ,e

e ,e

i j i j namely rc,out,2B = 9.855, rc,out,3B = {8.0,6.0,5.0}, no charges, Morse transformation, O2B =

17

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

12, and O3B = 4. Each frame contains 3N force components, but only 3 stress components, so we would expect that the weighting factor for stress (wσ ) should be substantially greater than 1. Our force matching routine reports forces in units of kcal/mol-Å, and stress components in units of kcal/mol-Å3 , so wσ has units of Å2 . We begin by focusing on results presented in Table 4. Here, densities are obtained from simulations in the N pT ensemble, while all other data are extracted from N V T runs. It is clear that a weight of wσ = 1 is insufficient to recover reasonable system pressures, and indeed, in N pT simulations using this model, the system is found to evaporate. Conversely, weights of 250, 500 and 1000 yield both density and pressure values in agreement with the DFT prediction. No significant or systematic changes are observed in nH-bond or τH-bond as weight is increased, and the relatively unchanging RMSE and RMSEr values are indicative of DFT force recovery that is minimally altered by inclusion and weighting of stress tensor data. Figure 4 shows that increased stress tensor weighting has the effect of gradually and subtly increasing differences between DFT and model RDFs, diffusion coefficients, and leads to a less pronounced O–H stretching shoulder near 3500 cm−1 in the vibrational power spectrum. Overall, the model fit with wσ = 250, which is provided in the supporting information, yields the best balance of accuracy across all properties predicted. Finally, an additional 100 ps ChIMES simulation was run for an 864 molecule (2592) molecule system, in order to shed light on system size effects. The resulting RDFs, vibrational power spectrum, and diffusion coefficient are provided in Figure 5 along with those from DFT and the 32-molecule ChIMES system, while predicted pressures, densities, nHB , and tHB values are given in Table 5. Statistically, differences between the 864- and the 8 32-molecule simulations are minimal, though pressures and densities from the 864-molecule system are found to be slightly smaller in magnitude than those from the 32-molecule system (i.e. −0.03 GPa and 1.01 g/cm3 compared to −0.137 GPa and 1.031 g/cm3 for the 864- and 32molecule simulations, respectively). Nevertheless, predictions from the 864 molecule system are still in good agreement with DFT.

18

ACS Paragon Plus Environment

Page 18 of 41

Page 19 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

4

Conclusions

We have presented an overall methodology for using ChIMES interaction models to accurately replicate DFT-MD simulations of condensed matter, applied to water at 298 K and 1 g/cm3 . We have shown that, overall, the models are minimally sensitive to the user selected parameters presented here, and that use of force RMSE as a metric in hold-out cross-validation can be misleading. In fact, moving to large polynomial orders is shown to be potentially problematic when a self-consistent scheme is used for creation of the training trajectory, due to over-fitting. A possible reason for this is that the cross-validation was performed using possibly highly correlated frames from a short, DFT-MD simulation at a single thermodynamic state, while longer ChIMES MD trajectories can easily move outside of this fitting space of configurations. We show that model pressures resulting from fits made exclusively to forces are often poor with respect to DFT, but that weighted inclusion of stress tensor data can provide accurate estimates of pressure and density with no significant trade off in model accuracy. Our final model provides excellent agreement with DFT in all properties examined, indicating that a relatively simple, explicit two-plus-three body description of interatomic interactions is sufficient to capture the structure and dynamics in a complex H-bonding liquid such as water. Furthermore, the present force field drastically reduces the computational effort required to run “quantum accurate” simulations. Specifically, our ChIMES models exhibit linear scaling with system size, and we find computational efficiencies of 0.021 (CPU hrs)/(ps molecule), compared to 43.2 (CPU hrs)/(ps molecule) for DFT. Thus, our ChIMES water model and fitting approach is ideal for extending DFT to time and length scales conducive to prediction of properties such as surface tension, diffusion, and viscosity, and is the subject of future work.

19

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 20 of 41

Acknowledgement This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. The project 16-LW-020 was funded by the Laboratory Directed Research and Development Program at LLNL with N.G. as principal investigator.

Supporting Information Available • ChIMES-SI.pdf: Parameter file for the recommended water model developed in this work.

20

ACS Paragon Plus Environment

Page 21 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Table 1: List of set 1 options and defaults. Option/Parameter Values Tested Default ei ,ej rc,out,2B 8 or 9.855 9.855 ei ,ej a rc,out,3B {8.0,6.0,5.0} or {8.0,8.0,8.0} {8.0,6.0,5.0} Charges fit/exclude exclude Transformationb direct, inverse, Morse Morse a Bracketed values correspond to cutoff distances for {OO,OH,HH} atom pairs within a given triplet, respectively, and are given in Å. e ,e b For conversion between rij and siji j .

21

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 22 of 41

Table 2: Pressure, number of H-bonds, H-bond lifetimes, force RMSE, and force RMSEr predicted by DFT and Set 1 modelsa p hnH-bond i hτH-bond i RMSE [GPa] [ps] [kcal/(mol Å)] DFT −0.1 2.93 0.35 – Default 7.624 2.913 0.341 3.94 Direct Transf. −3.804 2.983 0.392 3.13 Inverse Transf. 14.332 2.932 0.401 3.86 Fit Charges 5.591 2.932 0.371 3.68 ei ,ej rc,out,2B = 8.0 2.442 2.963 0.372 3.93 ei ,ej = 8.0 7.399 2.942 0.371 3.68 rc,out,3B a Subscripted values provide uncertainties in the final digit.

22

ACS Paragon Plus Environment

RMSEr – 0.31 0.24 0.30 0.29 0.31 0.29

Page 23 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Table 3: Pressure, number of H-bonds, H-bond lifetimes, force RMSE, and force RMSEr predicted by DFT and Set 2 modelsa O2B

O3B

p hnH-bond i hτH-bond i [GPa] [ps] DFT −0.1 2.93 0.35 12 4 7.643 2.932 0.352 12 6 4.872 2.933 0.321 24 4 2.872 2.963 0.394 24 6 3.852 2.922 0.341 a Subscripted values provide uncertainties in the final digit.

23

ACS Paragon Plus Environment

RMSE [kcal/(mol Å)] – 3.94 3.17 3.82 3.09

RMSEr – 0.31 0.25 0.30 0.24

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Table 4: Pressure, density, number of H-bonds, H-bond lifetimes, force RMSE, and force RMSEr predicted by DFT and models fit with stressesa p ρ hnH-bond i hτH-bond i RMSE RMSEr [GPa] g/cm3 [ps] [kcal/(mol Å)] DFT −0.1 1.01 2.93 0.35 – – Force-only fitb 7.624 − 2.913 0.343 3.94 0.31 Force+σ, wσ = 1 6.813 − 2.963 0.381 3.93 0.31 Force+σ, wσ = 250 −0.132 1.0295 2.962 0.372 4.17 0.32 Force+σ, wσ = 500 −0.141 1.0529 2.992 0.382 3.95 0.31 Force+σ, wσ = 1000 −0.112 0.0445 2.953 0.361 4.04 0.31 a Subscripted values provide uncertainties in the final digit. b Table entries for “Force-only fit” differ from Table 3 because they were obtained from a different set of independent simulations.

24

ACS Paragon Plus Environment

Page 24 of 41

Page 25 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Table 5: Pressure, density, number of H-bonds, and H-bond lifetimes from DFT and the best performing ChIMES modela,b p ρ hnH-bond i hτH-bond i [GPa] g/cm3 [ps] DFT −0.1 1.01 2.93 0.35 32 Molecules −0.137 1.031 2.967 0.375 864 Molecules 0.03 1.01 2.97 0.39 a Subscripted values provide uncertainties (standard deviation) in the final digits. b ChIMES results are presented for 864- and 32-molecule systems.

25

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

OO O-O Pair Pair

g(rChIMES) - g(rDFT)

7.0

HH Pair H-H Pair

OH Pair O-H Pair

rcut,3B = 8.0 Å

6.0

rcut,2B = 8.0 Å

5.0

Fit Charges

4.0

Inverse Transf.

3.0

Direct Transf.

2.0

Default

1.0 3.0

rcut,3B = 8.0 Å

g(r)

DFT

2.0

1.0

0.0 0.0

2.0

4.0

6.0

8.0 0.0

2.0

4.0

r [Å]

6.0

8.0 0.0

2.0

4.0

r [Å]

6.0

8.0

r [Å]

8

OOO bending/caging

rc,out,3B = 8.0 Å

H-bond OO stretch rc,out,2B = 8.0 Å

Water libration 6

Intensity (Arb. Units)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 26 of 41

Fit Charges

OHH bending OH stretching 4

Inverse Transf.

rc,out,3B = 8.0 Å rc,out,2B Å O /O = 8.0 = 24/7 2B 3B

Direct Transf.

O2B/O3B = 24/5

2

Default

O2B/O3B = 12/7 O2B/O3B = 12/5

DFT

DFT

0

0

1000

2000

3000 -1

4000

5000

0

n [cm ]

1E-10

2E-10

3E-10

4E-10

5E-10

6E-10

ds [m2/s]

Figure 1: Predicted properties for water by DFT and set 1 ChIMES models. Top: DFT radial distribution functions and deviation of set 1 ChIMES models from DFT (deviations shifted for clarity). Bottom left: Power spectra (shifted for clarity). Bottom right: Self-diffusion coefficients.

26

ACS Paragon Plus Environment

Page 27 of 41

7.0 [0.56] 7

1.7

6.0 [0.48] 6

1.6

RMSExval

RMSExval [RMSExval,r]

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

5.0 [0.40] 5 4 4.0 [0.32]

1.4 1.3

10

4 6 8 10 12 16 20 24 28

3 3.0 [0.24] 2 2.0 [0.16] 1.0 [0.08] 1

1.5

0

12

4

8

14

12

O3B

Figure 2: Hold-out cross-validation force RMSExval and RMSExval,r . Two-body order is given in the legend, and the inset figure provides a magnified view of data at high three-body order.

27

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

g(rChIMES) - g(rDFT)

OO O-O Pair Pair 4.0 3.0 2.0 1.0

HH Pair H-H Pair

OH Pair O-H Pair

O2B/O3B = 24/6 O2B/O3B = 24/4 O2B/O3B = 12/6 O2B/O3B = 12/4

3.0 O2B/O3B = 12/4 DFT

g(r)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 28 of 41

2.0

1.0

0.0 0.0

2.0

4.0

6.0

8.0 0.0

r [Å]

2.0

4.0

6.0

8.0 0.0

2.0

4.0

r [Å]

6.0

8.0

r [Å] O2B/O3B = 24/6

OOO bending/caging H-bond OO stretch Water libration

O2B/O3B = 24/4

OHH bending

O2B/O3B = 12/6

OH stretching

O2B/O3B = 12/4

DFT

0

2E-10

4E-10

6E-10

8E-10

ds [m2/s]

Figure 3: Predicted properties for water by DFT and set 2 ChIMES models. Top: DFT radial distribution functions and deviation of set 2 ChIMES models from DFT. Bottom left: Power spectra. Bottom right: Self-diffusion coefficients.

28

ACS Paragon Plus Environment

Page 29 of 41

OO O-O Pair Pair g(rFM) - g(rDFT)

4.0 3.0 2.0 1.0

HH Pair H-H Pair

OH Pair O-H Pair

ws = 1000.0 ws = 500.0 ws = 100.0 ws = 1.0

3.0 ws = 250.0 DFT

g(rDFT)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

2.0

1.0

0.0 0.0

2.0

4.0

6.0

8.0 0.0

r [Å]

2.0

4.0

6.0

8.0 0.0

2.0

4.0

r [Å]

6.0

8.0

r [Å] wσ = 1000

OOO bending/caging H-bond OO stretch Water libration

wσ = 500

OHH bending OH stretching

wσ = 250

wσ = 1

DFT

0

2E-10

4E-10 ds

6E-10

8E-10

1E-09

[m2/s]

Figure 4: Predicted properties for water by DFT and ChIMES models fit with stress tensor data. Top: DFT radial distribution functions and deviation of ChIMES models fit with stress tensor data from DFT. Bottom left: Power spectra. Bottom right: Self-diffusion coefficients.

29

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

OO Pair

Page 30 of 41

HH Pair

OH Pair

OOO bending/caging

864 molecules

H-bond OO stretch Water libration

OHH bending

32 molecules OH stretching

DFT

0

1E-10

2E-10

3E-10

4E-10

5E-10

6E-10

ds [m /s] 2

Figure 5: Predicted properties for water by DFT and the best performing ChIMES model. ChIMES results are presented for 864- and 32-molecule systems. Top: DFT radial distribution functions and deviation of ChIMES models from DFT. Bottom left: Power spectra. Bottom right: Self-diffusion coefficients; errors given as the standard deviation across independent simulations.

30

ACS Paragon Plus Environment

Page 31 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

References (1) Ball, P. Water: Water – an enduring mystery. Nature 2008, 452, 291–292. (2) Xantheas, S. S. Ab initio studies of cyclic water clusters (H2 O)n , n = 1 − 6. II. Analysis of many-body interactions. J. Chem. Phys. 1994, 100, 7523–7534. (3) Cui, J.; Liu, H.; Jordan, K. D. Theoretical Characterization of the (H2 O)21 Cluster: Application of an N -body Decomposition Procedure. J. Phys. Chem. B 2006, 110, 18872–18878. (4) Góra, U.; Podeszwa, R.; Cencek, W.; Szalewicz, K. Interaction energies of large clusters from many-body expansion. J. Chem. Phys. 2011, 135, 224102. (5) Wang, Y.; Huang, X.; Shepler, B. C.; Braams, B. J.; Bowman, J. M. Flexible, ab initio potential, and dipole moment surfaces for water. I. Tests and applications for clusters up to the 22-mer. J. Chem. Phys. 2011, 134, 094509. (6) Mas, E. M.; Bukowski, R.; Szalewicz, K.; Groenenboom, G. C.; Wormer, P. E. S.; van der Avoird, A. Water pair potential of near spectroscopic accuracy. I. Analysis of potential surface and virial coefficients. J. Chem. Phys. 2000, 113, 6687–6701. (7) Groenenboom, G.; Wormer, P.; Van Der Avoird, A.; Mas, E.; Bukowski, R.; Szalewicz, K. Water pair potential of near spectroscopic accuracy. II. Vibration–rotation– tunneling levels of the water dimer. J. Chem. Phys. 2000, 113, 6702–6715. (8) Goldman, N.; Fellers, R. S.; Brown, M. G.; Braly, L. B.; Keoshian, C. J.; Leforestier, C.; Saykally, R. J. Spectroscopic determination of the water dimer intermolecular potentialenergy surface. J. Chem. Phys. 2002, 116, 10148. (9) Goldman, N.; Leforestier, C.; Saykally, R. J. A ‘first principles’ potential energy surface for liquid water from VRT spectroscopy of water clusters. Phil. Trans. R. Soc. A 2005, 363, 493–508. 31

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(10) Bukowski, R.; Szalewicz, K.; Groenenboom, G. C.; van der Avoird, A. Predictions of the Properties of Water from First Principles. Science 2007, 315, 1249–1252. (11) Leforestier, C.; Szalewicz, K.; van der Avoird, A. Spectra of water dimer from a new ab initio potential with flexible monomers. J. Chem. Phys. 2012, 137, 014305. (12) McDaniel, J. G.; Schmidt, J. Next-Generation Force Fields from Symmetry-Adapted Perturbation Theory. Annu. Rev. Phys. Chem. 2016, 67, 467–488. (13) Medders, G. R.; Götz, A. W.; Morales, M. A.; Bajaj, P.; Paesani, F. On the representation of many-body interactions in water. J. Chem. Phys. 2015, 143, 104102. (14) Babin, V.; Leforestier, C.; Paesani, F. Development of a “First Principles” Water Potential with Flexible Monomers: Dimer Potential Energy Surface, VRT Spectrum, and Second Virial Coefficient. J. Chem. Theory Comput. 2013, 9, 5395–5403. (15) Wang, Y.; Shepler, B. C.; Braams, B. J.; Bowman, J. M. Full-dimensional, ab initio potential energy and dipole moment surfaces for water. J. Chem. Phys. 2009, 131, 054511. (16) Reddy, S. K.; Straight, S. C.; Bajaj, P.; Huy Pham, C.; Riera, M.; Moberg, D. R.; Morales, M. A.; Knight, C.; Götz, A. W.; Paesani, F. On the accuracy of the MBpol many-body potential for water: Interaction energies, vibrational frequencies, and classical thermodynamic and dynamical properties from clusters to liquid water and ice. J. Chem. Phys. 2016, 145, 194504. (17) Grossman, J.; Schwegler, E.; Draeger, E. W.; Gygi, F.; Galli, G. J. Chem. Phys. 2004, 120, 300–311. (18) Schwegler, E.; Sharma, M.; Gygi, F.; Galli, G. Melting of ice under pressure. Proc. Nat. Acad. Sci.(USA) 2008, 105, 14779.

32

ACS Paragon Plus Environment

Page 32 of 41

Page 33 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(19) Wang, Y.; Shepler, B. C.; Braams, B. J.; Bowman, J. M. Full-dimensional, ab initio potential energy and dipole moment surfaces for water. J. Chem. Phys. 2009, 131, 054511. (20) Wang, Y.; Huang, X.; Shepler, B. C.; Braams, B. J.; Bowman, J. M. Flexible, ab initio potential, and dipole moment surfaces for water. I. Tests and applications for clusters up to the 22-mer. J. Chem. Phys. 2011, 134, 094509. (21) Schwegler, E.; Galli, G.; Gygi, F. Phys. Rev. Lett. 2000, 84, 2429 – 2432. (22) Schwegler, E.; Galli, G.; Gygi, F.; Hood, R. Q. Phys. Rev. Lett. 2001, 87, 265501–1. (23) Goldman, N.; Reed, E. J.; Kuo, I.-F. W.; Fried, L. E.; Mundy, C. J.; Curioni, A. Ab initio simulation of the equation of state and kinetics of shocked water. J. Chem. Phys. 2009, 130, 124517. (24) Goldman, N.; Reed, E. J.; Fried, L. E. Quantum Corrections to Shock Hugoniot Temperatures. J. Chem. Phys. 2009, 131, 204103. (25) Gillian, M. J.; Michaelides, A. Perspective: How good is DFT for water? J. Chem. Phys. 216, 144, 130901. (26) Alfè, D.; Bartók, A.; Csanyi, G.; Gillan, M. Analyzing the errors of DFT approximations for compressed water systems. J. Chem. Phys. 2014, 141, 014104. (27) Becke, A. Density-functional thermochemistry. III. The role of exact exchange. J. Chem. Phys 1993, 98, 5648–5652. (28) Lee, C.; Yang, W.; Parr, R. G. Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density. Phys. Rev. B 1988, 37, 785–789. (29) Perdew, J. P.; Ernzerhof, M.; Burke, K. Rationale for mixing exact exchange with density functional approximations. J. Chem. Phys. 1996, 105, 9982–9985. 33

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(30) Adamo, C.; Cossi, M.; Scalmani, G.; Barone, V. Accurate static polarizabilities by density functional theory: assessment of the PBE0 model. Chem. Phys. Lett. 1999, 307, 265–271. (31) Mardirossian, N.; Ruiz Pestana, L.; Womack, J. C.; Skylaris, C.-K.; Head-Gordon, T.; Head-Gordon, M. Use of the rVV10 Nonlocal Correlation Functional in the B97MV Density Functional: Defining B97M-rV and Related Functionals. J. Phys. Chem. Letters 2017, 8, 35–40. (32) Mardirossian, N.; Head-Gordon, M. Mapping the genome of meta-generalized gradient approximation density functionals: The search for B97M-V. J. Chem. Phys. 2015, 142, 074111. (33) Zhao, Y.; Truhlar, D. G. A new local density functional for main-group thermochemistry, transition metal bonding, thermochemical kinetics, and noncovalent interactions. J. Chem. Phys. 2006, 125, 194101. (34) Grimme, S. Accurate description of van der Waals complexes by density functional theory including empirical corrections. J. Comput. Chem. 2004, 25, 1463–1473. (35) Grimme, S. Semiempirical GGA-type density functional constructed with a long-range dispersion correction. J. Comput. Chem. 2006, 27, 1787–1799. (36) Grimme, S.; Antony, J.; Ehrlich, S.; Krieg, H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. J. Chem. Phys. 2010, 132, 154104. (37) Von Lilienfeld, O. A.; Tavernelli, I.; Rothlisberger, U.; Sebastiani, D. Optimization of effective atom centered potentials for London dispersion forces in density functional theory. Phys. Rev. Lett. 2004, 93, 153004.

34

ACS Paragon Plus Environment

Page 34 of 41

Page 35 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(38) Tkatchenko, A.; Scheffler, M. Accurate molecular van der Waals interactions from ground-state electron density and free-atom reference data. Phys. Rev. Lett. 2009, 102, 073005. (39) Dion, M.; Rydberg, H.; Schröder, E.; Langreth, D. C.; Lundqvist, B. I. Van der Waals density functional for general geometries. Phys. Rev. Lett. 2004, 92, 246401. (40) Izvekov, S.; Parrinello, M.; Burnham, C. J.; Voth, G. A. Effective force fields for condensed phase systems from ab initio molecular dynamics simulation: a new method for force-matching. J. Chem. Phys. 2004, 120, 10896–10913. (41) Paesani, F.; Zhang, W.; Case, D. A.; Cheatham III, T. E.; Voth, G. A. An accurate and simple quantum model for liquid water. J. Chem. Phys. 2006, 125, 184507. (42) Pinilla, C.; Irani, A. H.; Seriani, N.; Scandolo, S. Ab initio parameterization of an allatom polarizable and dissociable force field for water. The Journal of Chemical Physics 2012, 136, 114511. (43) Fritsch, S.; Potestio, R.; Donadio, D.; Kremer, K. Nuclear Quantum Effects in Water: A Multiscale Study. J. Chem. Theory Comput. 2014, 10, 816–824. (44) Spura, T.; John, C.; Habershon, S.; Kühne, T. D. Nuclear quantum effects in liquid water from path-integral simulations using an ab initio force-matching approach. Molecular Physics 2015, 113, 808–822. (45) Van Duin, A. C.; Dasgupta, S.; Lorant, F.; Goddard, W. A. ReaxFF: a reactive force field for hydrocarbons. J. Phys. Chem. A 2001, 105, 9396–9409. (46) Brenner, D. W.; Shenderova, O. A.; Harrison, J. A.; Stuart, S. J.; Ni, B.; Sinnott, S. B. A second-generation reactive empirical bond order (REBO) potential energy expression for hydrocarbons. J. Phys. Condens. Matter 2002, 14, 783.

35

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(47) Bartók, A. P.; Payne, M. C.; Kondor, R.; Csányi, G. Gaussian approximation potentials: The accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 2010, 104, 136403. (48) Bartók, A. P.; Gillan, M. J.; Manby, F. R.; Csányi, G. Machine-learning approach for one-and two-body corrections to density functional theory: Applications to molecular and condensed water. Phys. Rev. B 2013, 88, 054104. (49) Alfè, D.; Bartók, A.; Csanyi, G.; Gillan, M. Analyzing the errors of DFT approximations for compressed water systems. J. Chem. Phys. 2014, 141, 014104. (50) Lindsey, R. K.; Fried, L. E.; Goldman, N. ChIMES: A Force Matched Potential with Explicit Three-Body Interactions for Molten Carbon. J. Chem. Theory Comput. 2017, 13, 6222–6229. (51) Heyes, D. M. The Liquid State: Applications of Molecular SImulations; John Wiley and Sons Ltd: West Sussex, UK, 1998; p 68. (52) Heyes, D. M. Pressure tensor of partial-charge and point-dipole lattices with bulk and surface geometries. Phys. Rev. B 1994, 49, 755–764. (53) Friedman, J.; Hastie, T.; Tibshirani, R. Numerical Recipes: The Art of Scientific Computing,3rd ed.; Cambridge University Press: Cambridge, U.K., 2007. (54) Kresse, G.; Hafner, J. Ab initio molecular dynamics for liquid metals. Phys. Rev. B: Condens. Matter Mater. Phys. 1993, 47, 558. (55) Kresse, G.; Hafner, J. Ab initio molecular-dynamics simulation of the liquid-metal– amorphous-semiconductor transition in germanium. Phys. Rev. B: Condens. Matter Mater. Phys. 1994, 49, 14251. (56) Kresse, G.; Furthmüller, J. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Comput. Mater. Sci. 1996, 6, 15–50. 36

ACS Paragon Plus Environment

Page 36 of 41

Page 37 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(57) Kresse, G.; Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B: Condens. Matter Mater. Phys. 1996, 54, 11169. (58) Perdew, J. P.; Burke, K.; Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 1996, 77, 3865. (59) Perdew, J. P.; Burke, K.; Ernzerhof, M. Generalized gradient approximation made simple [Erratum to Phys. Rev. Lett. 77, 3865 (1996)]. Phys. Rev. Lett. 1997, 78, 1396– 1396. (60) Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B: Condens. Matter Mater. Phys. 1994, 50, 17953. (61) Kresse, G.; Joubert, D. From ultrasoft pseudopotentials to the projector augmentedwave method. Phys. Rev. B: Condens. Matter Mater. Phys. 1999, 59, 1758. (62) Nosé, S. A unified formulation of the constant temperature molecular dynamics methods. J. Chem. Phys. 1984, 81, 511–519. (63) Hoover, W. G. Canonical dynamics: equilibrium phase-space distributions. Phys. Rev. A: At., Mol., Opt. Phys. 1985, 31, 1695. (64) Parrinello, M.; Rahman, A. Crystal structure and pair potentials: A moleculardynamics study. Phys. Rev. Lett. 1980, 45, 1196. (65) Parrinello, M.; Rahman, A. Polymorphic transitions in single crystals: A new molecular dynamics method. J. Appl. Phys. 1981, 52, 7182–7190. (66) Plimpton, S. Fast parallel algorithms for short-range molecular dynamics. J. Phys. Chem. C 1995, 117, 1–19.

37

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(67) Koziol, L.; Fried, L. E.; Goldman, N. Using force-matching to determine reactive force fields for bulk water under extreme thermodynamic conditions. J. Chem. Theory Comput. 2017, 13, 135. (68) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983, 79, 926–935. (69) Berendsen, H.; Grigera, J.; Straatsma, T. The missing term in effective pair potentials. J. Phys. Chem. 1987, 91, 6269–6271. (70) Braams, B. J.; Bowman, J. M. Permutationally invariant potential energy surfaces in high dimensionality. Int. Rev. Phys. Chem. 2009, 28, 577–606. (71) Skinner, L. B.; Huang, C.; Schlesinger, D.; Pettersson, L. G.; Nilsson, A.; Benmore, C. J. Benchmark oxygen-oxygen pair-distribution function of ambient water from x-ray diffraction measurements with a wide Q-range. J. Chem. Phys. 2013, 138, 074506. (72) Walrafen, G. Water. A Comprehensive Treatise, ed. F. Franks. 1972. (73) Downing, H. D.; Williams, D. Optical constants of water in the infrared. J. Geophys. Res. 1975, 80, 1656–1661. (74) Chen, S.; Teixeira, J. Structure and dynamics of low-temperature water as studied by scattering techniques. Adv. Chem. Phys 1986, 64 . (75) Bertie, J. E.; Ahmed, M. K.; Eysel, H. H. Infrared intensities of liquids. 5. Optical and dielectric constants, integrated intensities, and dipole moment derivatives of water and water-d2 at 22. degree. C. J. Phys. Chem. 1989, 93, 2210–2218. (76) Shimanouchi, T. Tables of Molecular Vibrational Frequencies Consolidated Volume I; National Bureau of Standards: Washington, DC, 1972. 2005, 1–160.

38

ACS Paragon Plus Environment

Page 38 of 41

Page 39 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(77) Mills, R. Self-diffusion in normal and heavy water in the range 1-45. deg. J. Phys. Chem. 1973, 77, 685–688. (78) O’Reilly, D.; Peterson, E. Self-Diffusion Coefficients and Rotational Correlation Times in Polar Liquids. II. J. Chem. Phys. 1971, 55, 2155–2163. (79) Pranami, G.; Lamm, M. H. Estimating error in diffusion coefficients derived from Molecular Dynamics simulations. J. Chem. Theory Comput. 2015, 11, 4586–4592. (80) Stubbs, J. M.; Siepmann, J. I. Aggregation in dilute solutions of 1-hexanol in n-hexane: A Monte Carlo simulation study. J. Phys. Chem. B 2002, 106, 3968–3978. (81) Zhang, L.; Rafferty, J. L.; Siepmann, J. I.; Chen, B.; Schure, M. R. Chain conformation and solvent partitioning in reversed-phase liquid chromatography: Monte Carlo simulations for various water/methanol concentrations. J. Chromatogr. A 2006, 1126, 219–231. (82) Stubbs, J. M.; Siepmann, J. I. Elucidating the vibrational spectra of hydrogen-bonded aggregates in solution: Electronic structure calculations with implicit solvent and firstprinciples molecular dynamics simulations with explicit solvent for 1-hexanol in nhexane. J. Am. Chem. Soc. 2005, 127, 4722–4729. (83) Rapaport, D. Hydrogen bonds in water: Network organization and lifetimes. Mol. Phys. 1983, 50, 1151–1162. (84) Voloshin, V.; Naberukhin, Y. I. Hydrogen bond lifetime distributions in computersimulated water. J. Struct. Chem. 2009, 50, 78–89. (85) Molinero, V.; Moore, E. B. Water modeled as an intermediate element between carbon and silicon. J. Phys. Chem. B 2008, 113, 4008–4016. (86) Friedman, J.; Hastie, T.; Tibshirani, R. The elements of statistical learning; Springer series in statistics New York, 2001; Vol. 1. 39

ACS Paragon Plus Environment

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(87) Goldman, N.; Fried, L. E.; Koziol, L. Using Force-matched potentials to improve the accuracy of density functional tight binding for reactive conditions. J. Chem. Theory Comput. 2015, 11, 4530.

40

ACS Paragon Plus Environment

Page 40 of 41

Page 41 of 41 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Graphical TOC Entry

Predictive, Scale-limited DFT-MD

ChIMES

41

Scalable, Quantum-Accurate MM-MD

ACS Paragon Plus Environment