From Molecular Fragments to the Bulk: Development of a Neural

May 15, 2019 - OR SEARCH CITATIONS ..... The construction of equally reliable and efficient reactive potentials is a ... theoretically,(17,22) is shar...
0 downloads 0 Views 7MB Size
Article Cite This: J. Chem. Theory Comput. 2019, 15, 3793−3809

pubs.acs.org/JCTC

From Molecular Fragments to the Bulk: Development of a Neural Network Potential for MOF‑5 Marco Eckhoff* and Jörg Behler* Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, D-37077 Göttingen, Germany

Downloaded via NOTTINGHAM TRENT UNIV on August 9, 2019 at 23:22:25 (UTC). See https://pubs.acs.org/sharingguidelines for options on how to legitimately share published articles.

S Supporting Information *

ABSTRACT: The development of first-principles-quality reactive atomistic potentials for organic−inorganic hybrid materials is still a substantial challenge because of the very different physics of the atomic interactionsfrom covalent via ionic bonding to dispersionthat have to be described in an accurate and balanced way. In this work we used a prototypical metal−organic framework, MOF-5, as a benchmark case to investigate the applicability of high-dimensional neural network potentials (HDNNPs) to this class of materials. In HDNNPs, which belong to the class of machine learning potentials, the energy is constructed as a sum of environment-dependent atomic energy contributions. We demonstrate that by the use of this approach it is possible to obtain a high-quality potential for the periodic MOF-5 crystal using density functional theory (DFT) reference calculations of small molecular fragments only. The resulting HDNNP, which has a root-mean-square error (RMSE) of 1.6 meV/atom for the energies of molecular fragments not included in the training set, is able to provide the equilibrium lattice constant of the bulk MOF-5 structure with an error of about 0.1% relative to DFT, and also, the negative thermal expansion behavior is accurately predicted. The total energy RMSE of periodic structures that are completely absent in the training set is about 6.5 meV/atom, with errors on the order of 2 meV/atom for energy differences. We show that in contrast to energy differences, achieving a high accuracy for total energies requires careful variation of the stoichiometries of the training structures to avoid energy offsets, as atomic energies are not physical observables. The forces, which have RMSEs of about 94 meV/a0 for the molecular fragments and 130 meV/a0 for bulk structures not included in the training set, are insensitive to such offsets. Therefore, forces, which are the relevant properties for molecular dynamics simulations, provide a realistic estimate of the accuracy of atomistic potentials.

1. INTRODUCTION Metal−organic frameworks (MOFs) are three-dimensional nanoporous crystalline solids with exceptionally large internal pore surface areas.1−4 As organic−inorganic hybrid materials, they contain inorganic secondary building units (SBUs), which are usually small metal oxide clusters, that are connected by organic linker molecules via covalent bonds. Because of their modular structure and the large number of possible combinations of SBUs and linkers, MOFs offer enormous potential for highly flexible design and customized functionalization.5−8 Their exceptional properties make MOFs interesting for a wide range of applications. The high surface area9 enables capture and storage of gases like H2 and CO2,10,11 and their unique capabilities for selective separation of different molecules are superior to conventional systems like zeolites because not only the pore size but also the chemical nature of the interior surfaces can be modified to increase the selectivity. Moreover, MOFs can act as heterogeneous catalysts,12 and their catalytic properties can be further enhanced by the incorporation of small metal clusters through chemical vapor deposition.13,14 The use of suitable luminescent linker © 2019 American Chemical Society

molecules can be applied in chemical sensing to indicate the presence of specific guest molecules.15 Finally, their global structural properties are related to the pores and building units, as several MOFs show very pronounced negative thermal expansion (NTE) behavior, which is caused by thermal motions of their organic linkers.16 Thus, by combination of structural elements giving rise to thermal expansion as well as contraction, MOFs with customized mechanical properties can be synthesized.17 NTE materials are of interest, e.g., in engineering of optical and electronic devices.18 In combination with positive thermal expansion materials, they can lead to materials with zero or low temperature dependence of the lattice constants, which are required for, e.g., high antithermal shock applications.19 As essentially all properties of MOFs are determined by their building blocks and the resulting pore and channel structures, right from their discovery a lot of effort has been spent on the investigation of the underlying interactions governing these properties. In particular, theoretical investigations aiming to Received: January 3, 2019 Published: May 15, 2019 3793

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation

Figure 1. (a) Unit cell of MOF-5 containing eight Zn4O(BDC)3 formula units (BDC = benzene-1,4-dicarboxylate; H in white, C in gray, O in red, and Zn in blue). (b) Supercell highlighting the large (yelllow) and small (orange) pores of MOF-5. This and all other figures in this work were created with VESTA version 3.4.4.58

form17,29 or adapted to specific MOFs to increase their accuracy for these systems.30−32 Typically, classical force fields exhibit reasonable performance for organic compounds, but the parametrization for the inorganic SBUs is far from trivial. Essentially, two main strategies have been developed in the construction of force fields for MOFs: “nonbonded” force fields describe the interactions within the metal−oxo clusters exclusively by van der Waals and electrostatic interactions without explicit directionality,30,33 while “bonded” force fields treat the inorganic components as covalently bonded.34 A general limitation of conventional force fields is that they are unable to describe the making and breaking of bonds, which prevents the study of several interesting phenomena like the formation of MOFs by self-assembly, chemisorption of guest species, and catalytic reactions. A promising new approach for the construction of highly accurate reactive potentials is based on the use of machine learning (ML) techniques.35 ML potentials have in common that they employ highly flexible functions to represent the multidimensional potential energy surface (PES) with very high accuracy, relying on reference data from electronic structure calculations. Numerous types of ML potentials have been proposed and used with great success in the literature in recent years for different kinds of systems, employing, e.g., artificial neural networks (NNs),36−39 Gaussian approximation potentials (GAPs),40,41 kernel ridge regression,42 and many others. In this work we describe the construction and validation of a high-dimensional neural network potential (HDNNP)43,44 for a typical MOF based on DFT data. HDNNPs were introduced by Behler and Parrinello in 2007.45 They provide essentially the same accuracy as the underlying electronic structure reference data and can be applied in the simulation of large systems containing up to 105 atoms on nanosecond time scales, which has been demonstrated for a variety of systems ranging from liquid water46,47 and aqueous NaOH solutions48 via solid−liquid interfaces49,50 to bulk materials such as bulk silicon51 and oxide-supported metal clusters.52 A few studies of organic molecules have also been reported,53−55 but to date HDNNPs have not been used to investigate the properties of

predict new structures with tailored pore sizes, mechanical behavior, and interactions with guest species have attracted substantial interest. Several high-throughput theoretical studies have been carried out to date20,21 with the goal to unravel the relationship between composition and properties. For a comprehensive understanding, theoretical investigations focus not only on the properties of pure MOFs but also include the role of guest species, which can significantly influence, e.g., the local structural properties of the system.13,17 In order to predict the properties and reactivity of MOFs in computer simulations, a reliable description of the atomic interactions is needed, which is very challenging because very different interactions such as strong covalent bonds and ionic interactions as well as weak van der Waals interactions are important. An obvious solution for this problem would be the use of first-principles methods like (dispersion-corrected) density functional theory (DFT), which allow chemical reactions to be described and are applicable to a wide range of systems with good accuracy. Indeed several theoretical studies of MOFs have been carried out using DFT.22−25 A main hindrance toward a general application of DFT is that MOF structures can be very complicated, resulting in unit cells with hundreds or even thousands of atoms, making electronic structure calculations very demanding. The situation becomes even worse if the problem of interest involves dynamical processes requiring extensive configurational sampling, possibly including a solvent, adsorbed species, and reactant molecules. This is clearly beyond reach for ab initio molecular dynamics relying on the on-the-fly calculation of the energies and forces by DFT. Consequently, less demanding potentials are required that allow such simulations to be performed at affordable costs. The construction of equally reliable and efficient reactive potentials is a substantial challenge for hybrid systems like MOFs, and only a few methods have been applied to MOFs, e.g., the reactive force field ReaxFF26,27 and approximate electronic structure methods like density functional tight binding (DFTB).28 Instead, because of their superior efficiency, to date most theoretical studies of MOFs have been carried out using nonreactive classical force fields, either directly in their generic 3794

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation

In the present work we investigate and discuss the following central issues regarding the applicability of HDNNPs to organic−inorganic hybrid systems: 1. What is the accuracy that can be obtained for energies and forces for hybrid systems, taking MOF-5 as a typical example? Can we achieve a balanced and high accuracy for all parts of the system? 2. MOFs have very large unit cells, and constructing large reference sets containing the data from thousands of DFT calculations is a very demanding task. Since HDNNPs rely on atomic energies with only a local environment dependence, we investigate whether it is possible to construct a potential for periodic bulk MOF5 from small molecular fragment data that enables correct prediction of the physical properties of the bulk, like its lattice constant and the NTE. This would allow us to provide training data even in cases in which the full periodic system would be computationally too demanding to be calculated by DFT. Of course, to validate this approach, we need a system that still allows calculations on the full bulk crystal for comparison, which is why we chose MOF-5. Other MOFs, or in general more complicated systems with defects or molecular species included in the pores, require much larger unit cells, which are inaccessible by DFT but still could be addressed by our HDNNP approach. For other systems, like metals64 and organic molecules,65 it has already been shown that in principle it is possible to construct HDNNPs for larger systems from much smaller fragments. 3. Using MOF-5, we discuss the role of the atomic energies in the construction of the HDNNP. We demonstrate that the internal flexibility of the atomic energy contributions in HDNNPs in some situations can give rise to offsets in the total energy, which can be overcome by using systems of varying stoichiometries. This paper is organized as follows: First, the HDNNP method is briefly outlined and the computational details are summarized. The discussion of the results starts with an analysis of the HDNNP’s quality and a validation on bulk properties. Subsequently, the atomic energy contributions to the total energy of the system are discussed in detail, followed by the prediction and analysis of the negative thermal expansion of MOF-5 as well as its phonon density of states. The paper concludes with a summary of this work.

complex hybrid organic−inorganic bulk materials like MOFs. In principle, one HDNNP can describe the PESs of a variety of MOFs and their interactions and reactions with guest species. In this study, we focus on the construction and validation of a HDNNP for a bulk MOF as a first benchmark case. Our goal here is to represent the PES of MOF-5also called isoreticular metal−organic framework 1 (IRMOF-1) which was synthesized for the first time by Li et al. in 1999.5 It is probably the most frequently studied MOF, as it is very stable up to elevated temperatures of about 650 K22 even in the absence of a solvent56 and forms the structural basis for a large family of isoreticular MOFs.1 Its cubic structure is built from Zn4O tetrahedra that are octahedrally coordinated by six bidentate benzene-1,4-dicarboxylate (BDC) ligands (Figure 1a). The resulting three-dimensional framework has Fm3̅m space group symmetry and exhibits an alternating sequence of two different pore diameters resulting from the spatial orientation of the phenylene rings, as shown in Figure 1b. As can be seen, the cubic unit cell consists of eight Zn4O(BDC)3 units and contains four large and four small pores that are connected by channels. The inorganic SBUs as well as the organic linkers can be modified or even replaced by alternative building blocks, making the design of this type of MOFs extremely flexible.6,57 Of particular interest in the present work is the NTE behavior of MOF-5, which was first reported by Rowsell et al. in 200559 and can be observed from 0 K up to the decomposition temperature. This property, which has also been studied theoretically,17,22 is shared with only a few other material families besides MOFs, like ZrW 2 O 8 and HfW2O8,60−63 and will serve here as a test case to investigate the predictive power of HDNNPs concerning physical properties not included in thehere molecularreference set. Typically, the volume of a material increases with temperature because of the anharmonic potentials of chemical bonds. However, electronic or magnetic transitions, low-energy transverse motions, or cooperative rotations can lead to NTE behavior.18 In X-ray diffraction experiments the actual dynamics of the system cannot be resolved, and only the time-averaged structure is detected, which retains the symmetry of the crystal. Therefore, in diffraction experiments the bond lengths seem to shrink with increasing temperature, and a linear thermal expansion coefficient α is obtained, which is the derivative of the lattice constant a with respect to the temperature T divided by the lattice constant:

α=

1 da a dT

2. METHOD In the HDNNP method,43,44,66 the potential energy E of a many-atom molecular or periodic condensed system containα ing Nelem elements and Natoms atoms of element α is constructed as a sum of atomic energy contributions Eαn ,

(1)

In the case of MOF-5, the nearly linear NTE is mainly caused by transverse motions of the BDC linkers. The amplitudes of the motions increase with temperature, resulting in a decreasing unit cell volume.16 Measurements yield α = −13.1 × 10−6 K−1 using synchrotron powder X-ray diffraction in the temperature range from 80 to 500 K56 and α = −16 × 10−6 to −10 × 10−6 K−1 using neutron powder diffraction in the range from 4 to 600 K.22 Consequently, the unit cell volume decreases by more than 1% in going from 0 to 300 K. Guest molecules or external pressure can reduce the NTE by damping the transverse motions of the BDC linkers.16 On the other hand, the NTE can be increased by substituting the organic linkers and the inorganic SBUs, which can result in very high NTE coefficients.

α Nelem Natoms

E=

∑ ∑ α=1 n=1

Enα

(2)

The atomic energy contributions are computed using individual atomic neural networks and depend on the local atomic chemical environments consisting of all atoms inside a cutoff sphere of radius Rc, which has to be sufficiently large to avoid the trunction of energetically relevant atomic interactions. Typically Rc values of 6−10 Å are used for this purpose. For atom n of element α, the positions of the 3795

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation neighboring atoms in these atomic environments are described by a set of NαG many-body atom-centered symmetry functions {Giα,n}.67 These symmetry functions form translationally, rotationally, and permutationally invariant vectors of descripα,n T tors for each atom, Gα,n = (Gα,n 1 , ..., GNαG ) , which are used as input for the atomic neural networks to construct a functional relation between the environment of an atom and its atomic energy contribution Eαn . Since the symmetry function values are calculated from the Cartesian coordinates of the atoms, the HDNNP uses only information about the atomic positions, while the specification of atom typesapart from the elementand connectivities between atoms are not required. Therefore, in contrast to most classical force fields, HDNNPs are intrinsically “reactive”, i.e., they are able to describe the making and breaking of bonds like electronic structure methods. For each element, an individual atomic NN, which is a standard feed-forward NN,68 is defined such that the architecture and weight parameters are the same for all atoms of the same element. This NN is then replicated according to eq 2 as many times as there are atoms of the respective element in the system, and the different atomic energies obtained as output result from the numerically different input symmetry function vectors. The weight parameters for all of the atomic NNs are optimized simultaneously to minimize the errors of known DFT total energies and force components of a set of training structures using gradient-based optimization algorithms, e.g., an adaptive, global, extended Kalman filter.69,70 This requires the analytic derivatives of the NN energies and forces with respect to the weight parameters, which are readily available in analytic form. It should be noted that for the energy training no partitioning of the DFT energies into target atomic energies is required only total energies of the full systems are used as references in the iterative optimization process. More detailed information about the HDNNP method in general, the functional form of HDNNPs as well as the symmetry functions, and the training process can be found in several reviews.43,44,66 For the present work it is important to realize that atoms provide the same energy contributions to the total energy of the system if their environments are characterized by numerically identical symmetry function vectors. In this situation the atomic environments are indistinguishable for the atomic NNs. This can be exploited to mimic a bulklike environment using molecular fragments if the cutoff of the symmetry functions is sufficiently large to include the relevant interactions and if for the atom of interest the positions of the neighboring atoms within this cutoff are identical to the respective bulk positions. This allows us to replace the reference calculations for large unit cells of MOF-5, which has a cubic lattice constant of about 25.9 Å and contains 424 atoms,56 by a series of molecular fragments centered at the geometrically inequivalent atoms of the system. Because of the drastically reduced system size, the resulting DFT calculations are significantly less demanding. The trained HDNNP will then be directly applicable to simulations of the large periodic system by putting together the bulk structure from the known bulklike environments in the molecular fragments, as will be shown in this work. The crystal structure of MOF-5 contains seven chemically inequivalent atoms (H1, C1, C2, C3, O1, O2, and Zn1), as shown in Figure 2. These sites do not imply any symmetry in the

Figure 2. Seven inequivalent atoms H1, C1, C2, C3, O1, O2, and Zn1 of MOF-5 are highlighted as balls.

system, and each site can have many different environments because of the thermal fluctuations of the surrounding atoms. Our general ansatz is to cut spheres around these atoms to generate many molecular fragments that are used for the reference DFT calculations covering the relevant configuration space. The radius of these spheres must be at least as large as the cutoff radius of the symmetry functions used for the HDNNP, which is 12 a0 in this study. Furthermore, cutting of important structural features like aromatic rings should be avoided in this procedure in order to ensure that the electronic structures of all atoms within the cutoff spheres do not undergo artificial changes in the cutting process. Therefore, if at least one carbon atom of a phenylene ring is located inside the cutoff radius, the complete ring is included in the molecular fragment. This can result in an anisotropic extension of the fragment beyond the cutoff radius around the central atom. The inorganic Zn4O clusters are treated in a similar way in most fragments: If more than one atom of a Zn4O cluster is inside the cutoff, the whole SBU is kept and saturated at least with formic acid molecules. All dangling bonds are then saturated by additional H atoms, maintaining the directions of the original bonds. The initial bond length is set to a chemically reasonable value (aliphatic C−H, 1.07 Å; aromatic C−H, 0.95 Å; O−H, 0.97 Å; Zn−H, 1.55 Å). Here only five out of the seven fragments centered around the unique atomic sites labeled in Figure 2 are used. They are shown in Figure 3. The reason is that usually the fragments simultaneously contain the full cutoff spheres of several sites, e.g., fragment F5 in Figure 3 contains the environments of two atoms, O2 and Zn1. In principle, fragments F4 and F5 are also redundant in that they contain only sites that are also included in F1, F2, and F3, but as will be discussed below, it is useful to include molecular fragments with different stoichiometries in the reference set.

3. COMPUTATIONAL DETAILS The DFT reference calculations were performed using the Fritz-Haber-Institute ab initio molecular simulations (FHIaims) package (version 160328_3),71 which is an all-electron electronic structure code employing numerical atomic orbitals as basis functions to construct the Kohn−Sham orbitals. The Perdew−Burke−Ernzerhof (PBE) functional72 was applied in combination with the Tkatchenko−Scheffler method73 for van der Waals corrections. The “light” settings of FHI-aims were used for the basis set, the integration grids, and the expansion of the electron density for the Hartree potential. Convergence tests with “tight” settings were performed to confirm that the convergence of the light settings was satisfactory. For instance, the equilibrium lattice constant at 0 K changed only from 3796

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation

Molecular dynamics (MD) simulations were carried out using the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) (version March 31, 2017)75,76 which has been extended to include HDNNPs.77 The time step was set to 0.5 fs. A Nosé−Hoover78,79 thermostat and barostat were applied with damping constants of 0.05 ps for the thermostat and 0.5 ps for the barostat. Before the MD simulations were started, the experimental MOF-5 structure was optimized using the HDNNP.

4. RESULTS AND DISCUSSION 4.1. Neural Network Potential. The reference configurations for the construction of the HDNNP are determined using an iterative process.64 The general procedure is that starting from an initial data set, several preliminary HDNNPs are created. MD simulations using one of these HDNNPs then enable structures to be found that are not well-represented by determining the uncertainty of energies and forces using a comparison with the predictions of the other HDNNPs. These structures can then be recomputed by DFT and added to the data set to construct an improved set of HDNNPs. These steps are repeated until the quality of the HDNNP has converged. Here an initial set of 4594 reference fragment structures was obtained in the following way: First, the five fragments shown in Figure 3 were generated by cutting and saturating the atomic environments starting from the experimental X-ray structure at 100 K.56 Then those fragments were scaled by a factor between 0.96 and 1.10, and the atomic positions were distorted randomly with a maximum displacement of 0.16 Å in the x, y, and z directions. The total energies and forces were calculated for all of these structures using DFT, and the obtained data were used to fit a set of initial HDNNPs having different functional forms because different NN architectures and different initial random weight parameters were used, although the energy and force errors were typically very similar. As the available reference set does not cover all chemically relevant parts of the configuration space, missing structures have to be identified to improve the HDNNP. For this purpose, MD simulations of the fragments in the canonical (NVT) ensemble, driven by one of the obtained HDNNPs, were performed to generate a large number of candidate structures. As starting structures for the MD simulations the five basis fragments and scaled derivatives were used. The initial velocity distribution as well as the temperature of the MD simulations were varied from 50 to 500 K. The resulting MD trajectories allow the identification of new reference structures in two ways: First, a structure needs to be included if its symmetry function values are outside the training set range, which means that the current HDNNP is extrapolating. Another criterion is whether one or more force components of a given structure recomputed with a second independent HDNNP deviate substantially from the forces underlying the HDNNP used in the MD simulation, which means that the HDNNP still has too much flexibility for this structure.66 When a sufficient number of new structures were identified, they were recomputed by DFT and added to the reference set to create a new generation of HDNNPs. This iterative improvement was repeated several times until extrapolations were absent in 1 ns NVT MD simulations of the periodic structure. We note that in this construction of the HDNNP no DFT data on bulk structures were included. This would in principle also enable our approach to be used for

Figure 3. Five hydrogen-saturated molecular basis fragments used for the DFT reference calculations. All atoms that have complete bulklike environments within a cutoff radius Rc = 12 a0 are shown as balls. All other atoms, which form the chemical environment and do not have the full bulklike environments, are represented as sticks. Capping H atoms saturating broken bonds are colored orange.

26.046 (light) to 26.047 Å (tight), and the energy difference between typical structures of interest like the experimental crystal structures at 100 and 500 K (differing in lattice constant by 0.136 Å) is 73.7 meV/atom (light) compared with 71.9 meV/atom (tight). Consequently, the level of convergence of about 2 meV/atom for energy differences is similar to the typical target accuracy of HDNNPs, and the light basis set is sufficient. For the periodic calculations of MOF-5, only the Γ point was used. The construction of the HDNNPs was done using the RuNNer code.44,74 For each element, the atomic NNs in all of the HDNNPs used for production runs in this work consisted of two hidden layers with 20 neurons per layer. The parameters of the employed symmetry functions are given in the Supporting Information. To train the HDNNP, 90% of the reference data were used, and the remaining 10% were used to test its reliability for structures not included in the training set. The values of the symmetry functions were rescaled to the interval [0.0, 1.0], and their averages were shifted to zero. The DFT energies of free H, C, O, and Zn atoms were subtracted from the DFT total energies Etot before the training, so that only the cohesive energies E were fitted. As the free atom energies are just constant shifts, the cohesive energies have the same complexity as the total energies, but have numbers closer to zero. The initial weights were chosen as random numbers in the interval [−1.0, 1.0]. A preconditioning scheme66 was applied to the weights before training to reduce the initial rootmean-square error (RMSE). The training data points were mixed randomly in each epoch to avoid early trapping in local minima. 3797

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation

Figure 4. Signed errors of the cohesive energies (training set (a) and test set (b)) and force components (training set (c) and test set (d)) of NNP1 as a function of the respective DFT data for fragments F1 to F5, which are shown in different colors. The black lines mark errors of 2.5 meV/atom for the energies and 0.25 eV/a0 for the forces.

of the DFT cohesive energy are shown in Figure 4a for the training set and Figure 4b for the test set. Most structures (98.77% in the training set and 97.13% in the test set) have an absolute energy error of less than 2.5 meV/atom. Only five structures have errors larger than 10 meV/atom, and all of them are test structures of fragment F3 with rather high cohesive energies. The RMSEs of NNP1 for the forces are 88.8 meV/a0 and 88.9 meV/a0 for training and test sets, respectively, and the deviations of the NNP1 force components from the DFT values are shown in Figure 4c for the training set and Figure 4d for the test set. Only 1.86% of the total of 4.7 × 106 force

higher-level electronic structure methods for which no periodic boundary conditions are available. The converged reference set for the HDNNP, which for later reference we call NNP1, contains 23 132 fragment structures for training and 2543 fragment structures for testing. The cohesive energies cover a range from −5.421 to −4.850 eV/atom, and the maximum force component is 7.41 eV/a0. The final RMSE of the energies of the training set is 0.81 meV/atom, while that of the test set is 1.40 meV/atom. The maximum errors for the energies are 8.87 meV/atom for the training set and 34.1 meV/atom for the test set. The deviations between the NNP1 and DFT energies as a function 3798

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation components in the training set have an absolute error greater than 0.25 eV/a0, while for the test set it is 1.89% of the 5.3 × 105 force components. The stability of NNP1 was tested in a 1 ns NVE MD simulation of the periodic MOF-5 structure starting from a structure equilibrated at 300 K for 1 ns. We found excellent energy conservation with a total energy drift of only 4 × 10−9 eV/ps. No extrapolation was observed for any symmetry function in any configuration of the trajectory, and the MOF-5 structure did not undergo any structural changes apart from the typical thermal fluctuations. 4.2. Validation on Bulk Properties of MOF-5. A main feature of MOF-5 is the orientational order of its phenylene rings. The rotational barrier leading to this quantity can be used to judge the quality of the HDNNP. Figure 5 shows the energy profile along a full 180° rotation of a phenylene group in bulk MOF-5 calculated using DFT and NNP1 as well as the experimentally determined value of the energy barrier, which was obtained from 2H NMR data.80 The energy profile of NNP1 matches the DFT data very well. The maximum deviation is 0.03 eV. The rotational barrier was calculated to be 0.508 eV by PBE, 0.491 eV by NNP1, and (0.490 ± 0.087) eV by NMR, which means a close to perfect agreement of the theoretical and experimental data. Since the reference set for NNP1 does not include any geometry-optimized fragments or periodic MOF-5 structures, the prediction of the equilibrium lattice constant of MOF-5 by NNP1 is an important test for the potential. Table 1 lists the DFT-PBE- and NNP1-optimized lattice constants of the periodic system. They differ by only 0.036 Å, corresponding to a relative error of about 0.1%. Therefore, the molecularfragment-based HDNNP can predict the periodic structure reliably making use of information about the local atomic environments only. The experimental lattice constant from single-crystal X-ray diffraction (SCXRD),56 which is also given in Table 1, is about 0.6% smaller than our DFT value, which is in very good agreement with previous PBE data (26.0443 Å) for this system.24 However, in this comparison between experiment and theory it must also be considered that the experimental structure was measured at 100 K, at which the lattice constant should be approximately 0.03 to 0.04 Å smaller than the one at 0 K because of the NTE, while the DFT lattice constant was determined at 0 K. The NNP1 equilibrium lattice constant, which cannot be better than the quality of the underlying DFT functional, clearly shows that we have achieved our first goal to

Table 1. Equilibrium Lattice Constants of MOF-5 (in Å) Obtained Using PBE-DFT, NNP1, and SCXRD at 100 K56 PBE

NNP1

SCXRD

26.046

26.082

25.892

represent the periodic crystal structure by an HDNNP based only on data for small molecular fragments. Apart from the lattice constant, other bulk properties like the bulk modulus, the elastic constants, the Young’s modulus, and the Poisson’s ratio are also of interest. The data obtained with NNP1 and two further potentials to be discussed later, NNP4*l and NNP*all, are compiled in Table 2 and compared with DFT reference data, which would be the ideal outcome of our HDNNPs. The results for the different HDNNPs vary slightly. The bulk modulus is overestimated by about 2.7 GPa compared to DFT. All of the NNP elastic constants are predicted to be somewhat higher than the DFT reference values, and a rather large relative error for C44 is present because of its small absolute value. The Young’s modulus and Poisson’s ratio are on average slightly higher than the DFT reference values. The DFT results are in agreement with previous published data using the PBE functional and the D3 dispersion correction.81 The Supporting Information gives the PBE data obtained without using the Tkatchenko−Scheffler van der Waals correction, showing only a small influence of the dispersion correction. 4.3. Bulk Energy and Atomic Energy Contributions. Next, we examine the total energy of the periodic MOF-5 structure. For this purpose, the cohesive energy was calculated using NNP1 and PBE-DFT for the SCXRD structures measured between 100 and 500 K,56 which have different lattice constants and slightly differing internal structures, and for the PBE-optimized structure. The cohesive energies and the differences per atom between NNP1 and PBE-DFT are shown in Figure 6a and Figure 6b, respectively. The energy differences in Figure 6b are roughly inside a window of 2 meV/ atom, which is in the typical energy error range of HDNNPs. However, there is a cohesive energy offset Eoff tot between the energies of NNP1 and DFT-PBE in Figure 6a of about 0.237 eV/atom, which is remarkably large. Apparently this offset is present only for bulk structures, which are not included in the reference data of NNP1, as otherwise the energy RMSE values of the training and the test sets of the molecular fragments would be much higher than the value of about 1 meV/atom that we observed. Next, we investigate the reason for the cohesive energy offset in the bulk structures, propose a possible solution, and discuss the consequences for the general applicability of our ansatz. Before we start with this investigation, we note that the offset is present only for the cohesive energies, while the forces and energy differences, e.g., Table 2. Bulk Modulus B (in GPa), Elastic Constants C11, C12, and C44 (in GPa), Young’s Modulus E (in GPa), and Poisson’s Ratio ν Calculated Using PBE-DFT, NNP1, NNP4l *, and NNPall *

Figure 5. Energy profile along a full 180° rotation of a phenylene group in bulk MOF-5 obtained by NMR,80 PBE-DFT, and NNP1. 3799

B

C11

C12

C44

E

ν

PBE NNP1 NNP*4l

16.0 19.0 18.4

25.9 29.1 28.1

11.1 13.9 13.6

1.1 1.9 1.6

19.2 20.1 19.2

0.30 0.32 0.33

NNPall *

18.7

30.4

12.8

1.7

22.8

0.30

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation

the distribution of the cohesive energy between the atoms, and consequently, different HDNNPs of similar quality can have very different sets of atomic energies. We note that only for perfect simple crystal structures of single elements (e.g., the diamond structure), in which all of the atoms have exactly the same geometric environment, would the atomic energy contributions have a unique numerical value, equal to the cohesive energy divided by the number of atoms. Even for slight distortions of such a system or for systems containing more than one element, this uniqueness is lost, as discussed below. To compare the atomic energy contributions in different HDNNPs, we constructed a second potential called NNP*all, which is based on an extended reference set including next to all of the molecular fragments also the experimental bulk structure at 100 K and the larger aggregates of fragment F1 that are shown schematically in Figure 7. We chose this extended data set as an extreme case that must yield an HDNNP with a close to zero offset for bulk structures with respect to DFT because a bulk structure is explicitly included in the training set. Each of the aggregates is represented by only one undistorted structure, which was cut from the experimental bulk structure at 100 K. These aggregates were constructed such that a larger variation of stoichiometries and ratios of different atomic environments is present in the extended reference set. In spite of the extended training set, NNP*all has accuracies of the energies (RMSEs of 0.83 and 1.30 meV/atom for the training and test sets, respectively) and forces (RMSEs of 91.2 and 93.8 meV/a0 for the training and test sets, respectively) that are very similar to those for NNP1.

Figure 6. (a) Cohesive energies (total energy minus the energies of the free atoms) and (b) cohesive energy differences in eV/atom between NNP1 and PBE-DFT for 21 experimental X-ray structures of bulk MOF-5 measured at different temperatures.56 For comparison, data for the PBE-optimized structure are also shown (PBEopt). While the differences are similar for all of the structures and are all within an interval of only about 2 meV/atom, there is a general offset of about 0.237 eV/atom in the cohesive energies.

between two bulk structures, do not show a similar phenomenon. Therefore, the offset does not play a role for applications in MD simulations, nor is it relevant for any of the physical properties investigated in this work, which are all insensitive to systematic offsets in the total energy. First, we find that even if we add just a single periodic structure to the training data set consisting of the molecular fragments, e.g., the experimental bulk structure at 100 K (NNPexp * ), the offset in the cohesive energies decreases essentially to zero for all bulk structures, even for those not included in the training set. Therefore, the offset cannot be related to physical phenomena like a too-small cutoff radius of the symmetry functions. We conclude that all of the energetically relevant atomic interactions are adequately included with the chosen cutoff spheres, which is also consistent with the accurate prediction of bulk properties discussed before. Instead, we now show that the offset arises from some freedom in the values of the atomic energy contributions that enter the cohesive energy expression (eq 2). We note that atomic energies in machine learning potentials other than HDHDNNPs have been investigated before in a similar spirit, e.g., for GAPs82 and convolutional neural networks.83 In general, atomic energies are not quantum-mechanical observables. Thus, they have no physically unique values and are not available for training of the atomic NNs. Instead, in the HDNNP method only cohesive energies are used for determining the NN parameters, and the atomic energies are constructed such that their sum provides the correct cohesive energies for the reference systems used in the fitting process, which in the case of NNP1 are the cohesive energies of the molecular fragments. These constraints leave some freedom in

Figure 7. Schematic representation of larger nonperiodic aggregates of fragment F1 that were been added to the reference data sets. The octahedra represent the Zn4O SBUs, and sticks correspond to the BDC linkers. All of the dangling bonds of the Zn4O SBUs are saturated with benzoic acid residues as in fragment F1 (see Figure 3). The labels specify the numbers of Zn4O SBUs included in the aggregates. The subscript “l” is used for linear arrangements, and the subscript “c” is used if one or more cyclic (Zn4O)4(BDC)4 units are present. 3800

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation Both NNP1 and NNP*all were used to determine the atomic energy contributions in bulk MOF-5. They are plotted in Figure 8a and Figure 8b, respectively, which for clarity show only a part of the MOF-5 bulk structure. In spite of the similar accuracy in terms of energy and force RMSEs, Figure 8 shows clearly that the atomic energy contributions are very different in the two HDNNPs. These differences can have two different reasons. First, there could be atomic energy contribution depending on the chemical elements, as could be offsets Eoff,ele n suggested by the carboxylate carbon atoms C1 and oxygen atoms O2. Although it is reasonable that carboxylate carbon and oxygen atoms contribute different atomic energies, in Figure 8a,b their relative contributions are essentially inverted, as in Figure 8a (NNP1) the oxygen atoms have a lower atomic energy than the carbon atoms, while the opposite is observed in Figure 8b (NNP*all). Second, the atomic energy contribution offsets not only can be element-specific but also depend on differences in the chemical environments of the atoms for a given element. For instance, in case of NNP1 the carboxylate carbon atoms C1 provide a higher energy contribution than the * shows the aromatic carbon atoms C2 and C3, whereas NNPall opposite trend. In order to understand the origin of the atomic energy contribution offsets, we first address the element-specific part and start with a thought experiment using water as an example system. As the H:O atomic ratio in water always remains 2:1 irrespective of the system size and no matter whether it is a small cluster or a bulk liquid, an HDNNP trained to reproduce the energies of water configurations has the freedom to introduce offsets in the atomic energy contributions as long as Enoff,ele(O) = − 2Enoff,ele(H)

to ensure that the sum of atomic energies, i.e., the cohesive energy, is not shifted. Therefore, in this case the elementspecific atomic energy contribution offset does not change the cohesive energies and there is no penalty in the form of an increased energy RMSE. However, if the energies of species with different stoichiometries like H2O2 are predicted with such an HDNNP, this compensation of the element-specific atomic offsets does not work anymore, resulting in a net offset in the cohesive energy, which for H2O2 would amount to Eoff tot = Eoff,ele (O). This would yield a substantial error for this n structure, which can be reduced by adding H2O2 species to the training process. This augmentation of the training set then results in an adjustment of the weight parameters to eliminate this cohesive energy offset. Thus, effectively, the inclusion of H2O2 introduces a second constraint Enoff,ele(O) = −Enoff,ele(H)

(4)

for the element-specific atomic energy offsets. To fulfill both conditions of eqs 3 and 4, the element-specific offsets Eoff,ele (O) and Eoff,ele (H) have to be zero. Consequently, the n n combination of a sufficient number of different stoichiometries is an effective means to avoid element-specific offsets in the atomic energy contributions. The same principle to avoid the arbitrariness in the elementspecific offsets can be applied to the MOF-5 system. Using molecular fragments of five different stoichiometries should remove or at least substantially reduce the element-specific atomic energy offsets. Since nevertheless there is obviously some freedom in the atomic energy contributions in Figure 8, there must be a second origin of atomic energy offsets that is not directly related to the specific elements. To further investigate this, we perform a second thought experiment addressing the flexibility in the atomic energy contributions for an elemental monocomponent system. In this case the offset can only be a function of the atomic environments. As a test system we select a simple crystal structure, e.g., the ideal diamond structure, in which all of the atoms have exactly the same environment. Then, as already mentioned, the numerical value of the atomic energy contributions, which are all the same, is unique, and there is no offset. If now one atom is displaced, its atomic environment and also the atomic environments of all atoms inside its cutoff sphere change. Therefore, it is now impossible to assign a unique value to each atomic energy, and there are many possible sets of atomic energies yielding the modified cohesive energy of the system. Again, just the sum is constrained to yield the correct cohesive energy, but there can be environment-dependent offsets in the atomic energies. This situation is more complex than the case of the element-specific atomic offsets because, in contrast to the nuclear charges, the atomic positions and environments are continuous properties, and they are all overlapping and thus coupled. Therefore, while it is in principle reasonable to assume that by variation of the numbers of atoms in given local chemical environments it is possible to reduce or even eliminate these environmentdependent atomic energy offsets, in full analogy to the variation of the stoichiometries in the case of the elementspecific offsets, in practice this will turn out to be very difficult. Both the element-specific and environment-dependent offsets cannot beand do not need to bestrictly separated in systems of the complexity of MOF-5 but can be addressed together. As we now show, it is possible to reduce the flexibility in the atomic energy contributions by increasing the diversity

(3)

Figure 8. Atomic energy contributions En in the bulk structure of MOF-5 calculated by (a) NNP1 and (b) NNP*all, whose training set additionally includes the bulk structure and the larger fragments shown in Figure 7. For clarity, only a part of the bulk structure is shown. 3801

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation in the stoichiometries and geometries present in the training set. We investigate this by adding larger aggregates or even a bulk structure to the training set. Table 3 lists the stoichiometries of the five molecular fragments, of the larger aggregates, and of bulk MOF-5. Since MOF-5 contains four elements, already the information included in the five fragments of different stoichiometry should be sufficient to formally avoid element-specific atomic energy offsets. To test our thought experiments and to reduce the cohesive energy offsets of bulk MOF-5 structures, different HDNNPs with augmented training sets were constructed. They all start from the complete set of molecular fragments F1 to F5. The additional training structures that were included are the aggregates of fragment F1 (Figure 7). The HDNNPs containing only one aggregate at a time are named as the respective aggregate that is added to the molecular fragments. Further, in the training set of NNP*agg, all of the aggregates were added simultaneously. As described above, NNPexp * contains all of the fragments and the experimental bulk structure at 100 K, while NNP*all contains all of the fragments, all of the aggregates, and the experimental bulk structure at 100 K. The asterisk in the labels of the HDNNPs indicates that the splitting of the reference structures into training and test sets and the initial random weight parameters are the same for all these HDNNPs. In the same way, the HDNNPs without an asterisk share a common but different splitting and initial weight parameter set. The aggregates and the bulk structure, if present, are always included in the training set. The RMSE values of the energies and forces for all of the HDNNPs, which are very similar, are given in the Supporting Information. To measure the cohesive energy offsets for bulk MOF-5 structures, the RMSEs with respect to DFT-PBE calculations of 21 cohesive energies of X-ray diffraction structures measured between 100 and 500 K as well as of the PBEDFT-optimized structure are shown in Figure 9 for all of the HDNNPs. The HDNNPs based on only one aggregate are ordered according to the fraction of atoms in the aggregate that have a bulklike environment inside the cutoff radius (cf. Table 4). This means that no capping atoms are included in the environment and/or atoms are missing that would be present in the bulk environment. In this order also the deviation of the

Figure 9. Energy RMSEs with respect to DFT for a data set containing 21 experimental X-ray structures of bulk MOF-5 measured at different temperatures56 (100 to 500 K) as well as the PBEoptimized bulk structure for a series of HDNNPs based on different training sets. This RMSE can be used as a measure of the cohesive energy offset for bulk structures. The HDNNPs are labeled according to the additional structures augmenting the training data set. For a detailed description of the HDNNPs, see the text.

aggregates from the bulk stoichiometry decreases (cf. Table 3). The values of the cohesive energy offsets and the RMSEs of the offset-corrected cohesive energies are given in the Supporting Information. The data in Figure 9 show the expected trend. Including an aggregate structure with a different stoichiometry and a different ratio of environments in the training set in general reduces the cohesive energy offset of the bulk. The magnitude of the reduction is correlated with the fraction of atoms in the aggregate that are in a bulklike environment (cf. the “total %” column Table 4 ). Only the HDNNPs with larger linear aggregates (NNP*3l and NNP*4l ) are slight exceptions. Of all single aggregates, 8c is most efficient in reducing the offset (RMSE = 22.4 meV/atom). When all of the aggregates are included simultaneously (NNP*agg), the offset of the cohesive energy vanishes almost completely (RMSE = 6.5 meV/atom), indicating that the freedom in the numerical values of the atomic energy contributions is substantially reduced. If even a bulk structure is included in the training set (NNP*exp and NNPall * ), the atomic energy contributions are adjusted to work close to perfectly for this stochiometry (RMSE = 3.5 meV/ atom for both). The same trend is observed if a different splitting of the reference structures into training and test sets and different initial weight parameters are used (NNP1, NNP2, and NNPagg). However, the offset of NNP1 is smaller than that of NNP*1 , which underlines that for a given reference set the numerical values of the atomic energies and thus of the offset depend on the specific NNP parameters, which is more pronounced for smaller data sets. Since adding one or more aggregates or even a single bulk structure reduces the cohesive energy offset of MOF-5 significantly, one could expect that the atomic energies converge to numerically unique values. However, this is not the case for MOF-5, as can be seen in Figure 10. Figure 10a,b shows the differences in the bulk atomic energy contributions of NNP1* and NNP1 with respect to NNPall * . Obviously the atomic energies of NNP1* and NNP1 deviate from those of NNP*all as well as from each other. The difference is much smaller if the same training set and the same initial random weight parameters are used (NNP1*), suggesting that these choices influence the local minimum adopted during the NNP

Table 3. Chemical Compositions of the Five Molecular Fragments (Figure 3), the Larger Aggregates (Figure 7), and Bulk MOF-5 (Figure 1a) stoichiometry F1 F2 F3 F4 F5 2l 3l 4l 6 7 4c 6c 8c bulk

atom %

H

C

O

Zn

H

C

O

Zn

30 26 18 18 18 54 78 96 102 150 138 174 168 96

42 37 25 25 24 78 114 144 150 222 210 258 264 192

13 15 15 15 13 26 39 52 52 78 78 91 104 104

4 4 5 4 4 8 12 16 16 24 24 28 32 32

33.7 31.7 28.6 29.0 30.5 32.5 32.1 31.9 31.6 31.6 31.2 30.7 29.6 22.6

47.2 45.1 39.7 40.3 40.7 47.0 46.9 46.9 46.8 46.8 46.8 46.7 46.5 45.3

14.6 18.3 23.8 24.2 22.0 15.7 16.0 16.3 16.5 16.5 16.9 17.3 18.3 24.5

4.5 4.9 7.9 6.5 6.8 4.8 4.9 5.0 5.1 5.1 5.2 5.3 5.6 7.5 3802

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation

Table 4. Numbers of Atoms in Bulklike and Non-bulklike Environments Inside the Cutoff Radius of the Symmetry Functionsa atoms in a bulklike environment F1 F2 F3 F4 F5 2l 3l 4l 6 7 4c 6c 8c bulk

O1

Zn1

O2

1

4 2 1 1 1 8 12 16 24 28 16 24 32 32

12 6 3 3 3 24 36 48 72 84 48 72 96 96

2 3 4 6 7 4 6 8 8

C1

C2

1

2 4 6 10 12 8 14 24 48

atoms in a non-bulklike environment

C3

H1

1 1 1

1

2 1 1

2 4 6 10 12 8 14 24 48

4 8 12 20 24 16 28 48 96

4 8 12 20 24 16 28 48 96

total

total-%

H

C

17 12 7 6 4 46 75 104 162 191 116 186 280 424

19.1 14.6 11.1 9.7 6.8 27.7 30.9 32.5 34.2 34.7 37.7 41.3 49.3 100

30 24 17 17 18 50 70 90 130 150 80 110 120

42 35 23 24 24 70 98 126 182 210 112 154 168

O

Zn

9 12 12 10

2 4 3 3

total

total %

72 70 56 56 55 120 168 216 312 360 192 264 288 0

80.9 85.4 88.9 90.3 93.2 72.3 69.1 67.5 65.8 65.3 62.3 58.7 50.7 0

a

A non-bulklike environment includes capping atoms and/or misses atoms that would be present in the bulk system. Numbers are given for the seven unique atomic bulk sites in the five molecular fragments (Figure 3), the larger aggregates (Figure 7), and bulk MOF-5 (Figure 1a). The “total %” value equals 100 times the “total” value divided by the number of atoms in the corresponding fragment, aggregate, or bulk unit cell.

fitting process. On the other hand, the differences in the atomic energies of NNP1 compared with NNP*all are up to 3.5 eV/atom, which is very large. The comparison of NNP*exp and NNP*all (Figure 10c), both of which include a bulk structure in the training set, shows that the resulting atomic energy contributions can still vary up to about 1.0 eV/atom even in the absence of a significant cohesive energy offset. The same is found for NNP*agg and NNPagg (Figure 10d), which do not contain a bulk structure but include all of the larger aggregates, resulting in a very small cohesive energy offset as well. Here the deviation in the atomic energies reaches more than 3 eV/atom. In all of these HDNNPs, the remaining flexibility in the atomic energies does not give rise to any notable offset in the cohesive energies of MOF-5. Therefore, we first note that our goal to remove this offset by a suitable extension of the training set has been reached. Second, we now unravel the origin of the remaining notable flexibility in the atomic energies, which is related to the stoichiometry of MOF-5. In the construction of the molecular fragments as well as the larger aggregates, we ensured that certain structural features, like the phenylene rings, the carboxyl groups, and in most cases also the SBUs remain intact. Therefore, the ratios of the individual atomic environments in these groups are always the same. For instance, there are always two oxygen atoms O2 and one carbon atom C1 forming the carboxyl group, which is never cut. Since this atomic ratio is never changed, there is some freedom in internally distributing the net energy contribution of this “group”, while the contribution of the entire group is constrained by the varying “group stoichiometry”, e.g., in the different aggregates. In Figure 10d, which shows the atomic energy differences between the two HDNNPs containing all of * and NNPagg, the atomic energy the aggregates, NNPagg differences of C1 and O2 are of opposite sign and still very large although the cohesive energy offset is close to zero for * and NNPall * , both of both HDNNPs. In the potentials NNPexp which contain a bulk structure, the atomic energy differences are much smaller but still notable (Figure 10c). To support the hypothesis that with a group of atoms the atomic energies compensate for each other, we analyzed the characteristic groups of atoms forming MOF-5. In Figure 11

the individual atomic energies as well as their weighted sum is shown for the Zn4O SBU (b), the CO2 carboxyl group (c), and the backbone C6H4 of the phenylene ring (d). It can be seen that our assumption is correct. For instance, in Figure 11d, NNP*6c and NNP*8c show inverted contributions of the carbon atoms C2 and C3, for which we have to take into account the fact that there are twice as many C3 atoms as C2 atoms. A similar case can be found for these HDNNPs for the atoms C1 and O2 in Figure 11c. However, the separation into individual groups of atoms is not the only mechanism of compensation, which can also take place to some extent between different groups of atoms. An example is a mutual compensation of the energy contribution of the atoms in the Zn4O SBU and the carboxyl groups for NNPexp * and NNPall * (Figure 10a). On the basis of this analysis, it should be possible in principle to obtain unique numerical values for all of the atomic energies by introducing additional structures with different stoichiometries of local environments into the training set. However, for all but the simplest systems this will turn out to be unfeasible in practice. Therefore, the atomic energies of HDNNPs for real systems are not unique in the general case. Consequently, they should not be physically interpreted. However, for an accurate description of the potential energy surface this extension of the reference data set is not required because the atomic energy offsets compensate for each other in the trained configurational space and forces are not affected at all. 4.4. Negative Thermal Expansion. A characteristic property of MOF-5 that we now address is its NTE behavior, and we investigate whether the molecular-fragment-based HDNNP is able to predict the NTE behavior of the bulk reliably. For this purpose we performed NPT MD simulations of a 2 × 2 × 2 MOF-5 supercell at temperatures from 50 to 500 K in steps of 50 K for 500 ps each to calculate the average lattice constant as a function of the temperature. The obtained lattice constants are shown in Figure 12. NNP1, NNP4*l , and * predict a correct NTE behavior of MOF-5 in NNPall agreement with the experimental SCXRD result56 (α = −13.2 × 10−6 K−1), exhibiting a similar slope. We note that the HDNNPs are based on PBE-DFT reference data only, 3803

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation

Figure 10. Differences in the atomic energy contributions, ΔEn, in a * ); (b) part of the bulk MOF-5 structure: (a) En(NNP1*) − En(NNPall * ); (c) En(NNPexp * ) − En(NNPall * ); (d) En(NNP1) − En(NNPall * ). En(NNPagg) − En(NNPagg

Figure 11. Weighted sums of the atomic energy contributions (in eV/ atom) of the Zn4O SBU, the CO2 carboxyl group, and the backbone C6H4 of the phenylene ring (a) and the individual atomic energy contributions of the seven inequivalent atoms in bulk MOF-5 (b−d) calculated by different HDNNPs.

which allows prediction of physical properties without relying on experimental data. In this approach we are limited by the accuracy of the reference method that we consider as an example here. In future studies it can be replaced by more accurate methods, which are applicable because of the use of small molecular fragments as references only. The results for the linear thermal expansion coefficient at 300 K are comparable for the HDNNPs (NNP1, α = −9.1 × 10−6 K−1; NNP*4l , α = −10.5 × 10−6 K−1; NNP*all, α = −8.3 × 10−6 K−1). Even NNP1, which exhibits an offset in the cohesive energy of bulk MOF-5, yields results of good quality although it is based only on training data of small molecular fragments. The small shift in lattice constants observed in Figure 12 is mainly a consequence of the deviation of the PBE-DFT lattice constant

from experiment, but there is also a small dependence on the particular potential, which is not sensitive to whether the bulk structure was included in the fit. Our simulations further yield detailed information about the changes in the interatomic distances as a function of the temperature. This analysis of distances and angles provides insights into the mechanisms that lead to the NTE behavior. From previous studies16 it is known that the main contributor to the NTE behavior is the transverse motion of the BDC linker. In the ideal crystal structure at 0 K, the atoms O1, C1, C2, C2′, C1′, and O1′ (Figure 2) form a straight line that is parallel to a lattice vector. If these atoms are transversely 3804

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation

is not parallel to the lattice vector. Figure 14 shows that the decrease in r(Zn4O1−Zn4O1′) is about 3 times greater than that in r(Zn4O 1−C6). Consequently, there is a large contribution of the in-phase transverse motion. The distance r(Zn4O1−C1) decreases by 0.007 Å when the temperature increases from 50 to 450 K, and therefore, the motion of the carboxylate group contributes to the NTE as well. Because the NTE behavior is a result of the changing bond angles at higher temperatures, some important interatomic angles are shown in Figure 15. The thermal motion reduces the mean interatomic angle between two adjacent Zn4O SBUs and the center of the connecting phenylene ring, θ(Zn4O1−C6− Zn4O1′), from 180° at 0 K to 175.1° at 450 K, i.e., a distortion from the planar arrangement takes place. A simple linear approximation of the connection yields a lattice constant decrease of 0.05 Å, which is about half of the actual decrease. This is a lower bound for the impact of the transverse in-plane displacement on the lattice constant that is responsible for the decreased angle. Because the BDC linker is bent and not formed like a triangle, the actual influence of this motion on the NTE is larger than 0.05 Å. Figure 15 shows that the angles θ(Zn4O1−C1−C2) and θ(C1−C2−C6) deviate more from 180° than the angle θ(C2−C6−C2′), meaning that the phenylene ring itself is not so much deformed from the planar configuration, which could be expected on the basis of the stabilization by the conjugated π system. The decrease in the bond angles θ(O1−Zn1−O2) and θ(Zn1−O2−C1) shows that the motion of the carboxylate group slightly contributes to the NTE behavior as well. * show the same trend. The only The results of NNPall difference is that the NTE is slightly less pronounced which agrees with the calculated linear thermal expansion coefficients. 4.5. Phonon Density of States. The phonon spectrum is another characteristic property of a material. However, the calculation by first-principles methods is demanding for systems with large unit cells, such as MOF-5. The phonon density of states ρ(ν̃) can be obtained from NVE MD simulations by a discrete Fourier transform of the atomic velocities.84 This can be done using an HDNNP with low computational costs. The phonon modes show how accurately the dynamics are represented by the HDNNP. For the calculation of the phonon density of states, 40 ps NVE simulations were performed on 40 thermally equilibrated structures from an NPT simulation at 298 K of a 2 × 2 × 2

Figure 12. Experimental SCXRD56 and calculated average lattice constants a of MOF-5 as functions of the temperature T.

displaced, the bonding path, i.e., the distance via all connecting bonds, increases (Figure 13). Therefore, both in-phase and out-of-phase transverse vibrational motions can lead to a shrinking of the lattice vector at higher temperatures if the increase in the overall bonding path is larger than the increase in the interatomic bond lengths inside the BDC linker, because then two Zn4O SBUs have to approach each other. The NNP MD simulations show that all of the bond lengths increase at higher temperatures, as expected from the anharmonic nature of covalent bonds. On average the bond lengths increase by 0.01 Å when the temperature is raised from 50 to 450 K. However, the lattice constant decreases by 0.1 Å, which means that the increase in the bonding path due to the transverse motions is larger. In order to estimate which of the two motions has a higher contribution to the NTE behavior, we can analyze mean interatomic distances. If only the out-of-phase motion is present, the distance between the center of a Zn4O SBU and the center of the phenylene ring, r(Zn4O1−C6), will be half the distance between two Zn4O SBUs, r(Zn4O1−Zn4O1′). If only the in-phase motion is present, 2r(Zn4O1−C6) will be larger than r(Zn4O1−Zn4O1′) because in this case the vector between a Zn4O SBU and the center of the phenylene ring

Figure 14. Changes in mean distances r with respect to the structure at 50 K as functions of the temperature T. Zn4O1 is the center of the SBU, and C6 is the center of the phenylene ring. The results were obtained using NNP1.

Figure 13. (a) In-phase and (b) out-of-phase transverse vibrational motions of the BDC linker at increased temperatures. 3805

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation

with each other. The difference is that the NNP*4l C−H stretching bands are more split. This could be a consequence of anharmonic and temperature effects, which were not included in the harmonic frequency calculations that were performed in the previous DFT studies. The IR and Raman spectra of MOF-5 agree with our data as well. The observed Raman bands are up to about 1630 cm−1,86,87 and the IR bands are up to about 1650 cm−1.88−91 Unfortunately, the C−H stretching modes do not show significant intensity in either IR or Raman spectroscopy. The IR and Raman bands of solid benzene-1,4-dicarboxylic acid are between 3064 and 3104 cm−1,92 which is slightly below the observed C−H stretching modes of MOF-5 calculated using NNP*4l .

Figure 15. Changes in mean absolute interatomic angles calculated as functions of the temperature T. Zn4O1 is the center of the SBU, and C6 is the center of the phenylene ring. The results were obtained using NNP1.

5. SUMMARY In this work, using bulk MOF-5 as a benchmark system, we have demonstrated that it is possible to construct highdimensional neural network potentials of first-principles quality for organic−inorganic hybrid systems based on a reference set of small molecular fragments. Even in the absence of bulk structures in the construction of the HDNNP, all of the investigated bulk properties, such as lattice parameters, elastic constants, bulk modulus, and negative thermal expansion, are accurately predicted. A detailed analysis of the molecular distortions as a function of temperature has been performed, providing insights into the atomistic origin of the NTE. Without information about the bulk, the predicted total energies of bulk MOF-5 exhibit an energy offset that can be reduced by including training data with an increased diversity in chemical composition. This augmentation of the training set poses additional constraints to be fulfilled by the individual atomic energies. Energy gradients and energy differences do not exhibit any offsets. Consequently, applications like MD simulations are insensitive to some freedom in the distribution of the total energy in atomic energy contributions. In conclusion, the present work shows that HDNNPs of very high accuracy can be developed for complex hybrid organic− inorganic materials like metal−organic frameworks.

MOF-5 supercell. The velocities after every 4 fs were used. The resolution of the spectrum is 0.4 cm−1. The spectrum calculated using NNP*4l at 298 K is shown in Figure 16, as the offset-corrected energy RMSE and the force RMSE with respect to DFT of the bulk test data set is smallest for NNP*4l (see the Supporting Information). Furthermore, NNP4*l shows the best agreement for the equilibrium lattice constant and the elastic constants. An analysis of spectra calculated by other HDNNPs is given in the Supporting Information. There are two major spectral regions. The one at higher wavenumbers ν̃ is associated with the C−H stretching bands. The peak positions are at 3104 and 3148 cm−1. The lower spectral region is from about 6 to 1640 cm−1 and includes all of the other vibrations. In previous studies the phonon modes were calculated using B3LYP for a simplified model system of MOF-5 that included only 106 atoms in the unit cell25 and using PBEsol-D3 for the real MOF-5 structure.85 The lower spectral region is from about 9 to 1670 cm−1 as calculated by B3LYP and up to 1640 cm−1 as calculated by PBEsol-D3. Generally, these DFT results agree with our results. Two C−H stretching bands were observed at 3230 and 3243 cm−1 by B3LYP and at 3110 and 3122 cm−1 by PBEsol-D3. The wavenumbers differ by 120 cm−1 depending on the functional. However, for both DFT functionals the splitting of the bands is very similar. The NNP spectrum should resemble Tkatchenko−Scheffler-corrected PBE results. These are expected to be similar to the PBEsol-D3 results. However, the functional as well as the dispersion correction are not the same. Still, the C− H wavenumbers of NNP4*l and PBEsol-D3 agree rather well



ASSOCIATED CONTENT

* Supporting Information S

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jctc.8b01288. Neural network architecture, symmetry functions, RMSEs of the neural network potentials, bulk properties calculated by PBE, and phonon density of states analysis (PDF)



AUTHOR INFORMATION

Corresponding Authors

*E-mail: marco.eckhoff@chemie.uni-goettingen.de. *E-mail: [email protected]. Phone: +49 551 3923133. ORCID

Marco Eckhoff: 0000-0002-5581-789X Jörg Behler: 0000-0002-1220-1542 Figure 16. Phonon density of states ρ(ν̃) of MOF-5 at 298 K calculated using NNP*4l .

Notes

The authors declare no competing financial interest. 3806

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation



(20) Wilmer, C. E.; Leaf, M.; Lee, C. Y.; Farha, O. K.; Hauser, B. G.; Hupp, J. T.; Snurr, R. Q. Large-scale screening of hypothetical metal− organic frameworks. Nat. Chem. 2012, 4, 83−89. (21) Krishna, R.; van Baten, J. M. In silico screening of metal− organic frameworks in separation applications. Phys. Chem. Chem. Phys. 2011, 13, 10593−10616. (22) Zhou, W.; Wu, H.; Yildirim, T.; Simpson, J. R.; Walker, A. R. H. Origin of the exceptional negative thermal expansion in metal− organic framework-5 Zn4O(1,4-benzenedicarboxylate)3. Phys. Rev. B: Condens. Matter Mater. Phys. 2008, 78, No. 054114. (23) Mueller, T.; Ceder, G. A Density Functional Theory Study of Hydrogen Adsorption in MOF-5. J. Phys. Chem. B 2005, 109, 17974− 17983. (24) Yang, L.-M.; Vajeeston, P.; Ravindran, P.; Fjellvag, H.; Tilset, M. Theoretical Investigations on the Chemical Bonding, Electronic Structure, And Optical Properties of the Metal−Organic Framework MOF-5. Inorg. Chem. 2010, 49, 10283−10290. (25) Civalleri, B.; Napoli, F.; Noel, Y.; Roetti, C.; Dovesi, R. Abinitio prediction of materials properties with CRYSTAL: MOF-5 as a case study. CrystEngComm 2006, 8, 364−371. (26) Han, S. S.; Choi, S.-H.; van Duin, A. C. T. Molecular dynamics simulations of stability of metal−organic frameworks against H2O using the ReaxFF reactive force field. Chem. Commun. 2010, 46, 5713−5715. (27) Huang, L.; Bandosz, T.; Joshi, K. L.; van Duin, A. C. T.; Gubbins, K. E. Reactive adsorption of ammonia and ammonia/water on CuBTC metal−organic framework: A ReaxFF molecular dynamics simulation. J. Chem. Phys. 2013, 138, No. 034102. (28) Lukose, B.; Supronowicz, B.; St. Petkov, P.; Frenzel, J.; Kuc, A. B.; Seifert, G.; Vayssilov, G. N.; Heine, T. Structural properties of metal−organic frameworks within the density-functional based tightbinding method. Phys. Status Solidi B 2012, 249, 335−342. (29) Skoulidas, A. I.; Sholl, D. S. Self-Diffusion and Transport Diffusion of Light Gases in Metal−Organic Framework Materials Assessed Using Molecular Dynamics Simulations. J. Phys. Chem. B 2005, 109, 15760−15768. (30) Greathouse, J. A.; Allendorf, M. D. The Interaction of Water with MOF-5 Simulated by Molecular Dynamics. J. Am. Chem. Soc. 2006, 128, 10678−10679. (31) Tafipolsky, M.; Schmid, R. Systematic First Principles Parameterization of Force Fields for Metal−Organic Frameworks using a Genetic Algorithm Approach. J. Phys. Chem. B 2009, 113, 1341−1352. (32) Boyd, P. G.; Moosavi, S. M.; Witman, M.; Smit, B. Force-Field Prediction of Materials Properties in Metal−Organic Frameworks. J. Phys. Chem. Lett. 2017, 8, 357−363. (33) Greathouse, J. A.; Allendorf, M. D. Force Field Validation for Molecular Dynamics Simulations of IRMOF-1 and Other Isoreticular Zinc Carboxylate Coordination Polymers. J. Phys. Chem. C 2008, 112, 5795−5802. (34) Yang, Q.; Jobic, H.; Salles, F.; Kolokolov, D.; Guillerm, V.; Serre, C.; Maurin, G. Probing the Dynamics of CO2 and CH4 within the Porous Zirconium Terephthalate UiO-66(Zr): A Synergic Combination of Neutron Scattering Measurements and Molecular Simulations. Chem. - Eur. J. 2011, 17, 8882−8889. (35) Behler, J. Perspective: Machine Learning Potentials for Atomistic Simulations. J. Chem. Phys. 2016, 145, 170901. (36) Handley, C. M.; Popelier, P. L. A. Potential Energy Surfaces Fitted by Artificial Neural Networks. J. Phys. Chem. A 2010, 114, 3371−3383. (37) Behler, J. Neural network potential-energy surfaces in chemistry: a tool for large-scalesimulations. Phys. Chem. Chem. Phys. 2011, 13, 17930−17955. (38) Faraji, S.; Ghasemi, S. A.; Rostami, S.; Rasoulkhani, R.; Schaefer, B.; Goedecker, S.; Amsler, M. High accuracy and transferability of a neural network potential through charge equilibration for calcium fluoride. Phys. Rev. B: Condens. Matter Mater. Phys. 2017, 95, 104105.

ACKNOWLEDGMENTS Support by the DFG (Project BE3264/12-1 number 405479457 and Heisenberg Professorship Be3264/11-2 number 329898176) is gratefully acknowledged.



REFERENCES

(1) Rowsell, J. L. C.; Yaghi, O. M. Metal−organic frameworks: a new class of porous materials. Microporous Mesoporous Mater. 2004, 73, 3− 14. (2) James, S. L. Metal−organic frameworks. Chem. Soc. Rev. 2003, 32, 276−288. (3) Long, J. R.; Yaghi, O. M. The pervasive chemistry of metal− organic frameworks. Chem. Soc. Rev. 2009, 38, 1213−1214. (4) Heinen, J.; Dubbeldam, D. On flexible force fields for metal− organic frameworks: Recent developments and future prospects. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2018, 8, No. e1363. (5) Li, H.; Eddaoudi, M.; O’Keeffe, M.; Yaghi, O. M. Design and synthesis of an exceptionally stable and highly porous metal−organic framework. Nature 1999, 402, 276−279. (6) Eddaoudi, M.; Kim, J.; Rosi, N.; Vodak, D.; Wachter, J.; O’Keeffe, M.; Yaghi, O. M. Systematic Design of Pore Size and Functionality in Isoreticular MOFs and Their Application in Methane Storage. Science 2002, 295, 469−472. (7) Furukawa, H.; Cordova, K. E.; O’Keeffe, M.; Yaghi, O. M. The Chemistry and Applications of Metal−Organic Frameworks. Science 2013, 341, 1230444. (8) Schneemann, A.; Bon, V.; Schwedler, I.; Senkovska, I.; Kaskel, S.; Fischer, R. A. Flexible metal−organic frameworks. Chem. Soc. Rev. 2014, 43, 6062−6096. (9) Furukawa, H.; Ko, N.; Go, Y. B.; Aratani, N.; Choi, S. B.; Choi, E.; Yazaydin, A. Ã .; Snurr, R. Q.; O’Keeffe, M.; Kim, J.; Yaghi, O. M. Ultrahigh Porosity in Metal−Organic Frameworks. Science 2010, 329, 424−428. (10) Li, J.-R.; Kuppler, R. J.; Zhou, H.-C. Selective gas adsorption and separation in metalorganic frameworks. Chem. Soc. Rev. 2009, 38, 1477−1504. (11) Rosi, N. L.; Eckert, J.; Eddaoudi, M.; Vodak, D. T.; Kim, J.; O’Keeffe, M.; Yaghi, O. M. Hydrogen Storage in Microporous Metal− Organic Frameworks. Science 2003, 300, 1127−1129. (12) Lee, J.; Farha, O. K.; Roberts, J.; Scheidt, K. A.; Nguyen, S. T.; Hupp, J. T. Metalorganic framework materials as catalysts. Chem. Soc. Rev. 2009, 38, 1450−1459. (13) Meilikhov, M.; Yusenko, K.; Esken, D.; Turner, S.; van Tendeloo, G.; Fischer, R. A. Metals@MOFs - Loading MOFs with Metal Nanoparticles for Hybrid Functions. Eur. J. Inorg. Chem. 2010, 2010, 3701−3714. (14) Hermes, S.; Schroeter, M.-K.; Schmid, R.; Khodeir, L.; Muhler, M.; Tissler, A.; Fischer, R. W.; Fischer, R. A. Metal@MOF: Loading of Highly Porous Coordination Polymers Host Lattices by Metal Organic Chemical Vapor Deposition. Angew. Chem., Int. Ed. 2005, 44, 6237. (15) Hu, Z.; Deibert, B. J.; Li, J. Luminescent metal−organic frameworks for chemical sensing and explosive detection. Chem. Soc. Rev. 2014, 43, 5815−5840. (16) Lock, N.; Christensen, M.; Wu, Y.; Peterson, V. K.; Thomsen, M. K.; Piltz, R. O.; Ramirez-Cuesta, A. J.; McIntyre, G. J.; Noren, K.; Kutteh, R.; Kepert, C. J.; Kearley, G. J.; Iversen, B. B. Scrutinizing negative thermal expansion in MOF-5 by scattering techniques and ab initio calculations. Dalton Trans. 2013, 42, 1996−2007. (17) Han, S. S.; Goddard, W. A., III .Metal−Organic Frameworks Provide Large Negative Thermal Expansion Behavior. J. Phys. Chem. C 2007, 111, 15185−15191. (18) Evans, J. S. O. Negative thermal expansion materials. J. Chem. Soc., Dalton Trans. 1999, 3317−3326. (19) Roy, R.; Agrawal, D. K.; McKinstry, H. A. Very Low Thermal Expansion Coefficient Materials. Annu. Rev. Mater. Sci. 1989, 19, 59− 81. 3807

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation

dynamics theoretical foundation. Rep. Prog. Phys. 2016, 79, No. 066503. (61) Balestra, S. R. G.; Bueno-Perez, R.; Hamad, S.; Dubbeldam, D.; Ruiz-Salvador, A. R.; Calero, S. Controlling Thermal Expansion: A Metal−Organic Frameworks Route. Chem. Mater. 2016, 28, 8296− 8304. (62) Cliffe, M. J.; Hill, J. A.; Murray, C. A.; Coudert, F.-X.; Goodwin, A. L. Defectdependent colossal negative thermal expansion in UiO-66(Hf) metal−organic framework. Phys. Chem. Chem. Phys. 2015, 17, 11586−11592. (63) Mary, T. A.; Evans, J. S.; Vogt, T.; Sleight, A. W. Negative Thermal Expansion from 0.3 to 1050 K in ZrW208. Science 1996, 272, 90−92. (64) Artrith, N.; Behler, J. High-dimensional neural network potentials for metal surfaces: A prototype study for copper. Phys. Rev. B: Condens. Matter Mater. Phys. 2012, 85, No. 045439. (65) Gastegger, M.; Kauffmann, C.; Behler, J.; Marquetand, P. Comparing the accuracy of high-dimensional neural network potentials and the systematic molecular fragmentation method: A benchmark study for all-trans alkanes. J. Chem. Phys. 2016, 144, 194110. (66) Behler, J. Constructing High-Dimensional Neural Network Potentials: A Tutorial Review. Int. J. Quantum Chem. 2015, 115, 1032−1050. (67) Behler, J. Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J. Chem. Phys. 2011, 134, No. 074106. (68) Haykin, S. Neural Networks and Learning Machines, 3rd ed.; Prentice Hall, 2008. (69) Kalman, R. E. A New Approach to Linear Filtering and Prediction Problems. J. Basic Eng. 1960, 82, 35−45. (70) Blank, T. B.; Brown, S. D. Adaptive, global, extended Kalman filters for training feedforward neural networks. J. Chemom. 1994, 8, 391−407. (71) Blum, V.; Gehrke, R.; Hanke, F.; Havu, P.; Havu, V.; Ren, X.; Reuter, K.; Scheffler, M. Ab initio molecular simulations with numeric atom-centered orbitals. Comput. Phys. Commun. 2009, 180, 2175− 2196. (72) Perdew, J. P.; Burke, K.; Ernzerhof, M. Generalized Gradient Approximation Made Simple. Phys. Rev. Lett. 1996, 77, 3865−3868. (73) Tkatchenko, A.; Scheffler, M. Accurate Molecular Van Der Waals Interactions from Ground-State Electron Density and FreeAtom Reference Data. Phys. Rev. Lett. 2009, 102, No. 073005. (74) Behler, J. RuNNerA Neural Network Code for Constructing High-Dimensional Potential-Energy Surfaces; Georg-August-Universität Göttingen, 2018. (75) Plimpton, S. Fast Parallel Algorithms for Short-Range Molecular Dynamics. J. Comput. Phys. 1995, 117, 1−19. (76) http://lammps.sandia.gov (accessed June 1, 2017). (77) Singraber, A.; Behler, J.; Dellago, C. A library-based LAMMPS implementation of high-dimensional neural network potentials. 2018, submitted. (78) Nosé, S. A molecular dynamics method for simulations in the canonical ensemble. Mol. Phys. 1984, 52, 255−268. (79) Hoover, W. G. Canonical dynamics: Equilibrium phase-space distributions. Phys. Rev. A: At., Mol., Opt. Phys. 1985, 31, 1695−1697. (80) Gould, S. L.; Tranchemontagne, D.; Yaghi, O. M.; GarciaGaribay, M. A. Amphidynamic Character of Crystalline MOF-5: Rotational Dynamics of Terephthalate Phenylenes in a Free-Volume, Sterically Unhindered Environment. J. Am. Chem. Soc. 2008, 130, 3246−3247. (81) Banlusan, K.; Strachan, A. First-principles study of elastic mechanical responses to applied deformation of metal−organic frameworks. J. Chem. Phys. 2017, 146, 184705. (82) Bernstein, N.; Bhattarai, B.; Csányi, G.; Drabold, D. A.; Elliott, S. R.; Deringer, V. L. Quantifying Chemical Structure and Atomic Energies in Amorphous Silicon Networks. Angew. Chem., Int. Ed. 2019, 58, 7057−7061.

(39) Schuett, K. T.; Sauceda, H. E.; Kindermans, P.-J.; Tkatchenko, A.; Mueller, K.-R. SchNet - A deep learning architecture for molecules and materials. J. Chem. Phys. 2018, 148, 241722. (40) Bartók, A. P.; Payne, M. C.; Kondor, R.; Csányi, G. Gaussian Approximation Potentials: The Accuracy of Quantum Mechanics, without the Electrons. Phys. Rev. Lett. 2010, 104, 136403. (41) Bartók, A. P.; Csányi, G. Gaussian approximation potentials: A brief tutorial introduction. Int. J. Quantum Chem. 2015, 115, 1051− 1057. (42) Rupp, M.; Tkatchenko, A.; Müller, K.-R.; von Lilienfeld, O. A. Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning. Phys. Rev. Lett. 2012, 108, No. 058301. (43) Behler, J. Representing potential energy surfaces by highdimensional neural network potentials. J. Phys.: Condens. Matter 2014, 26, 183001. (44) Behler, J. First Principles Neural Network Potentials for Reactive Simulations of Large Molecular and Condensed Systems. Angew. Chem., Int. Ed. 2017, 56, 12828−12840. (45) Behler, J.; Parrinello, M. Generalized Neural-Network Representation of High-Dimensional Potential-Energy Surfaces. Phys. Rev. Lett. 2007, 98, 146401. (46) Morawietz, T.; Singraber, A.; Dellago, C.; Behler, J. How van der Waals interactions determine the unique properties of water. Proc. Natl. Acad. Sci. U. S. A. 2016, 113, 8368−8373. (47) Cheng, B.; Behler, J.; Ceriotti, M. Nuclear Quantum Effects in Water at the Triple Point: Using Theory to Validate Experiments. J. Phys. Chem. Lett. 2016, 7, 2210. (48) Hellström, M.; Behler, J. Concentration-Dependent Proton Transfer Mechanisms in Aqueous NaOH Solutions: From AcceptorDriven to Donor-Driven and Back. J. Phys. Chem. Lett. 2016, 7, 3302− 3306. (49) Natarajan, S. K.; Behler, J. Neural network molecular dynamics simulations of solidliquid interfaces: water at low-index copper surfaces. Phys. Chem. Chem. Phys. 2016, 18, 28704. (50) Quaranta, V.; Hellstrom, M.; Behler, J. Proton Transfer Mechanisms at the Water-ZnO Interface: The Role of Presolvation. J. Phys. Chem. Lett. 2017, 8, 1476. (51) Behler, J.; Martonák, R.; Donadio, D.; Parrinello, M. Metadynamics Simulations of the High-Pressure Phases of Silicon Employing a High-Dimensional Neural Network Potential. Phys. Rev. Lett. 2008, 100, 185501. (52) Artrith, N.; Hiller, B.; Behler, J. Neural network potentials for metals and oxides - First applications to copper clusters at zinc oxide. Phys. Status Solidi B 2013, 250, 1191−1203. (53) Gastegger, M.; Marquetand, P. High-Dimensional Neural Network Potentials for Organic Reactions and an Improved Training Algorithm. J. Chem. Theory Comput. 2015, 11, 2187−2198. (54) Gastegger, M.; Behler, J.; Marquetand, P. Machine Learning Molecular Dynamics for the Simulation of Infrared Spectra. Chem. Sci. 2017, 8, 6924. (55) Smith, J. S.; Isayev, O.; Roitberg, A. E. ANI-1: An extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. 2017, 8, 3192−3203. (56) Lock, N.; Wu, Y.; Christensen, M.; Cameron, L. J.; Peterson, V. K.; Bridgeman, A. J.; Kepert, C. J.; Iversen, B. B. Elucidating Negative Thermal Expansion in MOF-5. J. Phys. Chem. C 2010, 114, 16181− 16186. (57) Wang, C.; Xie, Z.; deKrafft, K. E.; Lin, W. Doping Metal− Organic Frameworks for Water Oxidation, Carbon Dioxide Reduction, and Organic Photocatalysis. J. Am. Chem. Soc. 2011, 133, 13445−13454. (58) Momma, K.; Izumi, F. VESTA 3 for three-dimensional visualization of crystal, volumetric and morphology data. J. Appl. Crystallogr. 2011, 44, 1272−1276. (59) Rowsell, J. L. C.; Spencer, E. C.; Eckert, J.; Howard, J. A. K.; Yaghi, O. M. Gas Adsorption Sites in a Large-Pore Metal−Organic Framework. Science 2005, 309, 1350−1354. (60) Dove, M. T.; Fang, H. Negative thermal expansion and associated anomalous physical properties: review of the lattice 3808

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809

Article

Journal of Chemical Theory and Computation (83) Chen, X.; Jørgensen, M. S.; Li, J.; Hammer, B. Atomic Energies from a Convolutional Neural Network. J. Chem. Theory Comput. 2018, 14, 3933−3942. (84) Lee, C.; Vanderbilt, D.; Laasonen, K.; Car, R.; Parrinello, M. Ab initio studies on the structural and dynamical properties of ice. Phys. Rev. B: Condens. Matter Mater. Phys. 1993, 47, 4863−4872. (85) Bristow, J. K.; Skelton, J. M.; Svane, K. L.; Walsh, A.; Gale, J. D. A general forcefield for accurate phonon properties of metal−organic frameworks. Phys. Chem. Chem. Phys. 2016, 18, 29316−29329. (86) Grzesiak, A. L.; Uribe, F. J.; Ockwig, N. W.; Yaghi, O. M.; Matzger, A. J. Polymer-Induced Heteronucleation for the Discovery of New Extended Solids. Angew. Chem., Int. Ed. 2006, 45, 2553−2556. (87) Siberio-Pérez, D. Y.; Wong-Foy, A. G.; Yaghi, O. M.; Matzger, A. J. Raman Spectroscopic Investigation of CH4 and N2 Adsorption in Metal−Organic Frameworks. Chem. Mater. 2007, 19, 3681−3685. (88) Hermes, S.; Schröder, F.; Amirjalayer, S.; Schmid, R.; Fischer, R. A. Loading of porous metal−organic open frameworks with organometallic CVD precursors: inclusion compounds of the type [LnM]a@MOF-5. J. Mater. Chem. 2006, 16, 2464−2472. (89) Saha, D.; Deng, S. Ammonia adsorption and its effects on framework stability of MOF-5 and MOF-177. J. Colloid Interface Sci. 2010, 348, 615−620. (90) Wu, C.-M.; Rathi, M.; Ahrenkiel, S. P.; Koodali, R. T.; Wang, Z. Facile synthesis of MOF-5 confined in SBA-15 hybrid material with enhanced hydrostability. Chem. Commun. 2013, 49, 1223−1225. (91) Chaemchuem, S.; Kui, Z.; Verpoort, F. Control of interpenetration via in situ lithium incorporation in MOFs and their gas adsorption properties and selectivity. CrystEngComm 2016, 18, 7614− 7619. (92) Téllez S, C. A.; Hollauer, E.; Mondragon, M. A.; Castaño, V. M. Fourier transform infrared and Raman spectra, vibrational assignment and ab initio calculations of terephthalic acid and related compounds. Spectrochim. Acta, Part A 2001, 57, 993−1007.

3809

DOI: 10.1021/acs.jctc.8b01288 J. Chem. Theory Comput. 2019, 15, 3793−3809