Combining Linear-Scaling DFT with Subsystem DFT in Born

May 31, 2016 - In this work, methods for the efficient simulation of large systems embedded in a molecular environment are presented. These methods co...
1 downloads 15 Views 5MB Size
Subscriber access provided by UNIV OF NEBRASKA - LINCOLN

Article

Combining Linear-Scaling DFT with Subsystem DFT in Born-Oppenheimer and Ehrenfest Molecular Dynamics simulations: from Molecules to a Virus in Solution. Samuel Andermatt, Jinwoong Cha, Florian Schiffmann, and Joost VandeVondele J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.6b00398 • Publication Date (Web): 31 May 2016 Downloaded from http://pubs.acs.org on June 1, 2016

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Chemical Theory and Computation is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 43

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Combining Linear-Scaling DFT with Subsystem DFT in Born-Oppenheimer and Ehrenfest Molecular Dynamics simulations: from Molecules to a Virus in Solution. Samuel Andermatt,† Jinwoong Cha,†,‡ Florian Schiffmann,†,¶ and Joost VandeVondele∗,† Department of Materials, ETH Zürich, Switzerland E-mail: [email protected]

To whom correspondence should be addressed of Materials, ETH Zürich, Switzerland ‡ Current address: Department of Mechanical and Process Engineering, ETH Zürich, Switzerland ¶ Current address: Centre of Policy Studies, Victoria University, Melbourne, Australia



† Department

1 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Abstract In this work, methods for the efficient simulation of large systems embedded in a molecular environment are presented. These methods combine linear-scaling (LS) Kohn-Sham (KS) density functional theory (DFT) with subsystem (SS) DFT. LS DFT is efficient for large subsystems, while SS DFT is linear scaling with a smaller prefactor for large sets of small molecules. The combination of SS and LS, which is an embedding approach, can result in a ten-fold speedup over a pure LS simulation for large systems in aqueous solution. In addition to a ground-state Born-Oppenheimer SS+LS implementation, a time-dependent density functional theory based Ehrenfest molecular dynamics (EMD) using density matrix propagation is presented that allows for performing non-adiabatic dynamics. Density matrix based EMD in the SS framework is naturally linear scaling and appears suitable to study the electronic dynamics of molecules in solution. In the LS framework, linear scaling results as long as the density matrix remains sparse during time propagation. However, we generally find a less than exponential decay of the density matrix after a sufficiently long EMD run, preventing LS EMD simulations with arbitrary accuracy. The methods are tested on various systems, including spectroscopy on dyes, the electronic structure of TiO2 nanoparticles, electronic transport in carbon nanotubes, and the satellite tobacco mosaic virus in explicit solution.

2 ACS Paragon Plus Environment

Page 2 of 43

Page 3 of 43

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

1 Introduction Density functional theory (DFT), in particular in the formulation by Kohn and Sham (KS DFT), is extremely successful for the description of molecules, solids, liquids and interfaces. 1,2 The success of KS DFT can be attributed to the excellent performance / cost ratio of the theory as well as to its versatility. Furthermore, the original ground-state formulation has been extended to the time domain 3 (TDDFT), giving access to various spectroscopic properties and electronic dynamics. As such, there is growing demand to describe increasingly complex systems, moving more and more towards realistic models of experimental systems and to build in-silico chemical labs. As a motivating example for the developments presented in this manuscript, we would like to explicitly describe the electronic dynamics through the mesoporous titanium oxide, in contact with a liquid electrolyte, found in dye sensitized solar cells. 4,5 Unfortunately, this is not yet possible. An important reason is the fact that the computational cost of traditional KS DFT scales O(N 3 ) with the size of the system N. This can be attributed to the orthogonality requirement on the single-particle wave functions employed in KS DFT. Here, we combine two approaches that aim at reducing the cost of such calculations to O(N). First we employ linear-scaling (LS) DFT as implemented recently in CP2K 6 and available in various other codes. 7–12 Approaches to LS DFT have been reviewed in Ref. 13 and Ref. 14 Generally speaking, these methods exploit the fact that, for large ground-state non-metallic systems, the matrices representing the operators in a localized basis set are sparse. These methods avoid at all times the computation of canonical single-particle molecular orbitals that extend throughout the whole system. The LS methods employed in this work directly transform the sparse Kohn-Sham matrix into the sparse density matrix of the system, and employ sparse matrix-matrix multiplications to achieve that goal. This approach is linear scaling with system size, which has been demonstrated for homogeneous systems up to millions of atoms. 6,7 The accuracy of LS DFT can be arbitrarily close to the O(N 3 ) KS reference. In our implementation, the accuracy is controlled by a single parameter determining the filtering threshold at which matrix elements are considered negligible and thus are removed from the matrix. However, the prefactor of the linear-scaling term 3 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

in this approach is, in particular with basis sets suitable for accurate DFT calculations, large in condensed phase (3D) systems. An alternative to this linear-scaling approach with arbitrary precision is to describe electrons in a way that is more approximate than KS DFT. Our method of choice is based on subsystem (SS) DFT, in which a large system is partitioned into smaller subsystems. The subsystems themselves are described by KS DFT, but interact via a simpler orbital-free density functional. SS DFT has recently been reviewed in Ref. 15 It relates to the early work of Kim and Gordon (ref. 16 ) and Cortona (ref. 17 ). SS DFT became well known with the development of the frozen-density embedding approach. 18 SS DFT is an intermediate between fully orbital-free DFT (OF DFT) and KS DFT. Fully orbital-free DFT has been successfully used to describe millions of atoms, 19 but is not very accurate if covalent bonds are present. SS DFT remains suitable if covalent bonds are present, as long as these are contained within a subsystem. SS DFT exploits the fact that the non-covalent interaction between molecules is simpler to describe than e.g. the electronic interactions across a bond. In self-consistent SS DFT, ionic forces are readily available. SS DFT is thus suitable to describe the dynamics of molecular liquids. An earlier implementation of SS DFT in CP2K has been used to perform molecular dynamics (MD) of liquid water. 20 However, for this subtle system, these results demonstrated reduced accuracy compared to KS DFT. Extensions of the theory towards linear response TDDFT are available 21,22 as well as to real-time propagation. 23 This approach has been used to compute excitations in a protein complex containing a few thousand atoms, 24 and could be used for larger systems as SS DFT is linear scaling in the number of subsystems. However, SS DFT is only effective if all fragments are sufficiently small, since SS DFT suffers from O(M 3 ) scaling, where M is the size of the largest fragment. For sufficiently small fragments, the prefactor of SS DFT is smaller than that of LS DFT. Here, we combine LS DFT and SS DFT to obtain a method that has a small prefactor, yet is able to deal with fragments or subsystems of arbitrary size. The method is particularly suitable if the investigated system contains a region of main interest, for which KS DFT accuracy is needed, for example if this region can not easily be partitioned into weakly interacting fragments. The

4 ACS Paragon Plus Environment

Page 4 of 43

Page 5 of 43

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

expected areas of application are, e.g., nanoparticles, bio-molecules or colloids in solution, where the solvent is less important than the actual solute itself. As we will present in the following, this combination is very natural in its implementation. Finally, we will establish how electron dynamics can be simulated within this framework. These methods will permit access to spectroscopic information and will provide insight in the mobility and transfer properties of electrons. For this, the usual ground-state Born-Oppenheimer (gs-BO) approach, where the wave function or density matrix is minimized at each time step, is not suitable. The alternative is an explicit real-time propagation of the electronic wave function using time-dependent density functional theory, which we will name real-time propagation (RTP) at fixed nuclei, or Ehrenfest molecular dynamics (EMD) if the ionic forces are computed and employed for their propagation. Recently, density matrix based RTP has been implemented in the CONQUEST code. 25 Here we extend the implementation of EMD as available in CP2K by explicitly propagating the density matrix, instead of the molecular orbitals, which offers the prospect of linear scaling. Anticipating our results, this linear scaling is observed in the SS DFT scheme, but in the LS scheme, the density matrix is found not to be sparse at arbitrary precision and longer time scales. In the following sections, first the theoretical background of subsystem DFT, linear-scaling DFT and EMD will be provided. We then discuss implementation and usage aspects, namely coloring the graph of interacting subsystems, and the optimal time step for EMD. Following this, an in-depth investigation of the sparsity of the density matrix within the LS EMD method is presented. We conclude the manuscript with various applications of the SS+LS method in ground and excited states.

5 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 43

2 Theory 2.1 Combining Linear-Scaling DFT with Subsystem DFT In SS DFT, 15–17 the system is subdivided into (weakly) interacting subsystems, for example molecules. The energy of the full system is obtained from the Hohenberg-Kohn (HK) energy functional 26 for the full system, but corrected by the difference between the Kohn-Sham 27 (KS) and HK energy functionals for each of the subsystems:  i ESSDFT =EHK [ρ ] + ∑(EKS [ρi [ ψ j ]] − EHK [ρi ]) i

 i =Es [ρ ] + ∑(Ts [ ψ j ] − Es [ρi ])+ i

Eext [ρ ] + ECoul [ρ ] + Exc [ρ ],

(1)

where we have used the definitions of the HK energy functional:

EHK [ρ ] = Es [ρ ] + Eext [ρ ] + ECoul [ρ ] + Exc [ρ ], and KS energy functional:

  EKS [ρ [ ψ j ]] = Ts [ ψ j ] + Eext [ρ ] + ECoul [ρ ] + Exc [ρ ]. The index i iterates over all subsystems, while the index j iterates for each subsystem over the electrons of that subsystem. Eext , ECoul and Exc stand for the external, Coulomb and exchange and correlation energy, respectively. Ts is the KS kinetic energy, which is computed from the singleparticle orbitals, while Es is the HK kinetic energy, which is computed from the electron density. The total density is the sum of the subsystem densities (ρ = ∑i ρi ), and the subsystem density itself

6 ACS Paragon Plus Environment

Page 7 of 43

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

 is obtained from the set of molecular orbitals ψ j of each of the subsystems as  ρ [ ψ j ] = ∑ ||ψ j ||2 .

(2)

j

Note that these molecular orbitals are enforced to be orthogonal only within the set belonging to a single subsystem. In our implementation, these molecular orbitals are expanded in the basis functions of corresponding subsystems only. The lack of orthogonality of the wave functions between subsystems is taken into account by the HK kinetic energy functional. However, since the exact form of this functional is unknown, and must be approximated, the interaction between subsystems is usually described somewhat less accurately in SS DFT than in KS DFT. As in the earlier CP2K implementation 20 this functional is minimized fully self-consistently with respect to the expansion coefficients of the molecular orbitals, which implies that ionic forces are obtained straightforwardly. In this SS DFT approach, the Kohn-Sham matrix, the density matrix and the overlap matrix of the full system have a clear structure, as they can be made block-diagonal, where each block corresponds to the sub-matrix of a single subsystem. This structure can be exploited to perform the calculations more efficiently. Traditionally, the equations are solved by diagonalizing each sub-block, and parallellizing the calculation over each of these blocks. While this procedure works well for small systems or homogeneous liquids, it has limitations as soon as the subsystems are very different in size. Indeed, if very large and small subsystems are present at the same time, load balancing the calculation can become problematic, and sometimes calculations would not fit on a single processor further complicating the implementation. Finally, as the very large fragments increase in size, diagonalization based techniques are not suitable anymore. We therefore prefer to work with the matrix of the full system, and instead solve the equations with the same techniques as employed in linear-scaling calculations, i.e. with methods that directly benefit from the sparsity of the matrices describing the system. In the following, we show that an SS DFT implementation can be combined very easily with linear-scaling DFT, addressing the above mentioned problems,

7 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 43

and ultimately allowing for the efficient simulation of very large systems in an environment of molecules. We base our implementation of the linear-scaling SS DFT method on the linear-scaling KS DFT implementation in CP2K. 6 This method uses the fact that the density matrix (P) can be written as a function of the Kohn-Sham matrix: 1 P = (I − sign(S−1 H − µ I))S−1 , 2

(3)

where H is the Kohn-Sham matrix, S is the overlap matrix and µ is the chemical potential. The 1

sign function sign(A) = A(A2 )− 2 can be computed iteratively through matrix multiplications alone 1 Xn+1 = Xn (3I − Xn2 ). 2

(4)

The benefit of this formulation derives from the fact that P is known to decay exponentially for insulators in the ground state. 13 If small elements are thus truncated to zero (filtered), P is a sparse matrix for large systems. The same holds for the S and H matrices. If all multiplications are implemented using sparse matrix matrix multiplication, for example using the DBCSR library, 28 the computation of the density matrix, starting from the H and S matrix can be performed in O(N) cost. However, the sparsity of the matrices is not constant during the multiplication, i.e. if two sparse matrices are multiplied, the sparsity pattern of the result contains more entries. To retain sparsity, small elements, below the filtering threshold, are filtered away from the resulting matrix. Nevertheless, the density matrix is usually significantly less sparse than the Kohn-Sham or overlap matrices. This filling-in of the sparsity pattern contributes to the prefactor of the O(N) algorithm. We note that the sign matrix based formulation given above is intuitive and useful, but various other linear-scaling techniques are available in CP2K, including the TRS4 method 29 and curvysteps optimization. 30 These methods are advantageous for example when the chemical potential is not known, or the self-consistent electronic structure converges slowly. Within the SS DFT method, using linear-scaling algorithms based on multiplication, P auto8 ACS Paragon Plus Environment

Page 9 of 43

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 43

method is furthermore naturally parallel and GPU accelerated, since the underlying sparse matrix multiplication library is. 28,31 As an added benefit of this approach, the charge of the individual subsystems is not fixed, but naturally adjusts, in integer steps, to the global chemical potential of system, which is, for example, advantageous if ions are present in solution.

2.2 Density Matrix Based Ehrenfest Molecular Dynamics EMD is a way to perform molecular dynamics where the electronic wave function is explicitly propagated over time. This differentiates it from gs-BO DFT where the electronic wave function is minimized at each time step. EMD allows for non-adiabatic dynamics and gives access to the electronic properties of the system. 32–44 EMD has been implemented in CP2K 45 and a number of other codes. 46–51 The trade off for the ability to investigate electronic processes is that the time step is determined by the electrons and not by the ions. The natural time step is thus multiple orders of magnitude shorter than in gs-BO DFT. We differentiate between EMD and RTP. In the latter case, the ions are kept fixed, which avoids the need to compute the ionic forces. The computation of the atomic forces is relatively involved in codes that employ atom centered basis functions, but the equations of motion have been derived in Ref. 52 All terms of the ionic forces have been implemented in CP2K, and given here for completeness:

j

j

−1 (iHβ γ + Bβ γ )aγ , a˙α = − ∑ Sαβ

(5)

βγ

MA R¨ A = −

∂ U(R,t) + ∂ RA

Ne

∑ ∑ aαj∗(DAαβ − j=1 αβ

∂ Hαβ j )a . ∂ RA β

(6)

RA and MA are the position and mass of the ion A. U(R,t) is the potential energy of the ion-ion j

interaction. a˙α is the derivative of a coefficient of the molecular orbital j in the linear combination 10 ACS Paragon Plus Environment

Page 11 of 43

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

of atomic orbitals (LCAO) expansion: j

ψ j (r,t) = ∑ aα (t)φα (r − RAα ),

(7)

α

with α iterating over all basis functions of the atoms. DA contains all the terms that are a result of the dependence of the basis set on the position of the ion A: −1 −1 A+ DAαβ = ∑(BA+ αγ Sγδ Hδ β + Hαγ Sγδ Bδ β )+ γδ

  A + −1 A A+ −1 A+ i Cαβ −Cαβ + ∑(Bαγ Sγδ Bδ β + Bαγ Sγδ Bδ β ) ,

(8)

γδ

where γ ,δ iterate over all basis functions and

Bαβ ∗ B+ αβ = Bβ α

d dt

 d = φα φβ , dt   d = φα φβ , dt 

=

BAαβ = A+ Bαβ = BA∗ βα = A Cαβ = A+ Cαβ = CβA∗α =

Ni ∂ ∂ , + ∑ R˙ A ∂ t A=1 ∂ RA   ∂ φα φ , ∂ RA β   ∂ φα φβ , ∂R   A ∂ d φα φ , dt ∂ RA β   d ∂ φα φ . ∂ RA dt β

Note that all quantities can be computed in linear-scaling time using the usual screening procedures. Two propagators for the electronic wave function were implemented. Enforced time reversal

11 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 43

symmetry (ETRS): 53

ψ (t + ∆t) = e

−∆t 2 X(t+∆t)

e

−∆t 2 X(t)

ψ (t);

(9)

ψ (t),

(10)

and exponential midpoint (EM): 53

ψ (t + ∆t) = e

−∆t 2 (X(t)+X(t+∆t))

where X is the right-hand side of equation 5 X(t) = S−1 (iH(t) + B(t)).

(11)

Both propagators are implicit and a self-consistent iteration has to be performed for the propagation. We also investigated a fourth order Magnus propagator, 53 but consistent with Rourke et. al. 25 we did not find it to improve the results. The initial guess for the density at t + ∆t is estimated by the always stable predictor–corrector method. 54 Self-consistency is usually obtained without mixing, but mixing is automatically used if the self-consistency procedure does not converge. Both propagators are computationally similarly expensive, since the second matrix exponential in ETRS only depends on X(t) and not X(t + ∆t). Therefore it only has to be calculated in the first step of the implicit propagation. For most of the calculations in this paper the ETRS propagator was used. Note that the wave functions of all subsystems are propagated simultaneously and that the electronic dynamics of each subsystem is thus fully coupled to that of the other subsystems. The propagation of the wave function can be reformulated as a propagation of P: P(t + ∆t) =ψ (t + ∆t)ψ (t + ∆t)∗ =U(t,t + ∆t)P(t)U(t,t + ∆t)∗ , with U(t,t + ∆t) = e

−∆t 2 (X(t)+X(t+∆t))

for EM and U(t,t + ∆t) = e

12 ACS Paragon Plus Environment

−∆t 2 X(t+∆t)

(12)

e

−∆t 2 X(t)

for ETRS. This

Page 13 of 43

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

formulation offers the prospect to achieve linear scaling in cases where P is sparse. The sparsity could occur either through the nearsightedness of electrons, 55 or is implicitly enforced by the use of SS DFT. Note that in the SS DFT approach, electrons are not transferred between subsystems during the propagation. The straightforward way to propagate P is to calculate the matrix exponential explicitly, for example through a Taylor series expansion. 25 However P(t +∆t) can be calculated more efficiently. The multiplication of a Hermitian matrix with a matrix exponential, eM , from one side and the complex conjugate of the same exponential on the other side can be written as a series similar to a Baker-Campbell-Hausdorff (BCH) expansion: ∞

P(t + ∆t) = eM P(t)(eM )∗ =

∑ Zn,

(13)

n=0

1 Z0 = P(t), Zn>0 = (MZn−1 + (MZn−1 )∗ ). n

(14)

This sum converges much faster than the explicit calculation of the matrix exponential, requiring only 5-10 terms depending on the accuracy needed. Indeed, if H(t) and P(t) commute, the series terminates immediately independently of the value of ∆t, while for small perturbations away from the ground state, only the small non-commuting part remains. The increased sparsity of these higher order terms furthermore reduces their cost with the sparse matrix multiplication employed.

3 Implementation 3.1 Subsystem Graph Coloring The calculation of the energy according to equation 1 requires the calculation of the HK energy of all individual subsystems. Each subsystem can be computed fully independently, and the calculation could be parallellized over the various subsystem. However, this can be challenging if large

13 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

subsystems are present. Furthermore, in CP2K and various other codes that employ regular grids for the electron density, the computation of small subsystem is relatively inefficient, as sufficiently large auxiliary simulation cells, centered around each subsystem, are needed. This implies an overhead to deal with essentially empty space and complicates the implementation, which was one reason why the older CP2K implementation 20 was replaced by a new scheme. In this equivalent alternative, the sum of the HK energies is computed considering subsets of subsystems that have non-overlapping densities. The HK energy of such a subset equals the sum of the HK energies of each of the subsystems in the subset. To compute this energy, this subset of non-overlapping subsystems is placed in the simulation cell that equals the original simulation cell, and the HK energy of the subset is computed. This calculation is repeated for all subsets, each being essentially a relatively sparse system. By packing the subsystems as efficiently as possible in subsets, the amount of empty space is minimized and the number of iterations over subsets reduced, speeding up the calculation. As an additional advantage, only few code changes relative to a standard DFT code are needed to perform such a computation, it just implies a standard calculation for each subset of particles. Furthermore, in this approach, large subsystems are not problematic, as the standard parallelism can efficiently be employed. The computational effort to compute the HK energy is proportional to the number of subsets needed. Finding these subsets is a problem that can be formulated as a graph coloring problem. Graph coloring is a standard problem in computer science. To obtain the graph, each subsystem is considered a node (vertex) in a graph . Two nodes are connected by an edge of the graph if the corresponding subsystems overlap. This information can be obtained from the overlap matrix of the full system. An allowed partitioning of the graph in subsets is obtained coloring the graph in such a way that two connected nodes have different colors. After coloring, each color represents a different subset, and all nodes of the same color belong to the same subset. The coloring of a graph is not unique. The optimal solution with the fewest possible colors is computationally intractable to find, but two established heuristic algorithms to solve the coloring problem were tested. A third algorithm improves on the obtained solutions.

14 ACS Paragon Plus Environment

Page 14 of 43

Page 15 of 43

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

The first algorithm to color a graph is the greedy algorithm. In the greedy algorithm all nodes are colored in an order that is independent of the topology of the graph, for example using the node’s index. Each node is assigned the color with the lowest index that is not yet present in a neighboring node. The second algorithm is DSATUR. 56 In DSATUR two criteria are used to decide which node of the graph is colored next. The first criterion is the degree of saturation. The degree of saturation is the number of unique colors present in the neighborhood of the node. The algorithm is to choose the node with the highest degree of saturation and give it the lowest color not yet present in the neighborhood. In case of multiple nodes with the same degree of saturation, the second criterion takes the node which has the most neighbors. After an initial solution is obtained, the number of colors is further reduced by switching neighboring pairs and decreasing their colors where applicable. The pair-switching algorithm can be found in the supplementary information. The coloring was tested on an STMV virus in water. 57,58 The system contains around one million atoms. Most of the subsystems are water molecules, but there are also proteins that are much larger. The greedy algorithm results in 36 colors and the DSATUR algorithm in 28 colors. The pair-switching algorithm reduces the number of colors to 30 and 26, respectively. The coloring of the graph is fast and does not significantly contribute to the total execution time.

3.2 Optimal Time Step For the computational cost of an EMD simulation the time step is of crucial importance. Results by O’Rourke and Bowler suggests a time step of 0.06 a.u (1.44 as) for real-time propagation (benzene system) as obtained by the exponential midpoint propagator, only applied self-consistently in the initial 100 steps. 25 For SS DFT Krishtal et. al. suggested a time step 2 as. 23 The present approach should be stable at longer time steps thanks to the reduction of numerical noise described in detail below and the use of an implicit propagation. The stability with respect to the time step has been investigated using a simple condensed phase system with a large band gap that was created replicating in one dimension a cubic simulation cell 15 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Page 16 of 43

Page 17 of 43

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

of ice 1h containing 288 atoms up to required size. The one dimensional nature of the test system facilitates reaching the linear scaling regime. Note that this test does not employ SS DFT, for which the enforced matrix structure would trivially lead to linear scaling, but treats the system as a whole. The wave function and nuclei were propagated for a short time, starting from a ground-state wave function. As shown in Figure 3, the computational cost increases linearly with system size. This is consistent with the results reported for RTP in Ref., 25 and supports the correctness of the implementation. However, the full picture is more complicated, and as described in detail in the following section, at longer timescales the computational cost of EMD gradually increases.

4.2 EMD Density Matrix Filling In CP2K the density matrix is not filtered based on an a priori fixed sparsity pattern, for example as determined from a spatial cutoff 25 or from the fixed sparsity pattern of the overlap matrix. Instead, the elements are simply filtered based on their magnitude, as computed during the matrix multiplications. The advantage of this scheme is that the accuracy is determined by one parameter, the filtering threshold, independently of the chemistry of the system. Arbitrary accuracy can be obtained reducing the filtering threshold. However, as a result of this filtering method, the occupation of P depends on the decay of the physical density matrix, and also on the quality of the numerical scheme. To achieve linear scaling, the occupation of P needs to stay significantly below 1. To investigate the evolution of the density matrix sparsity during EMD, a simple test system has been prepared and let to evolve close to the ground state using EMD started from the electronic ground state, but with finite ionic temperature. The test system is a quasi-1D system, with a large band gap, namely the ice 1h system described above, replicated in one direction to contain approximately 2000 atoms. As such, the ionic evolution is very similar to that of a gs-BO MD. Within gs-BO MD, the density matrix is sparse at all times. As shown in Figure 4, a straightforward implementation of density matrix based EMD, which directly implements the formulae given with filtering sparse matrix multiplication, results in a 18 ACS Paragon Plus Environment

Page 18 of 43

Page 19 of 43

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 20 of 43

course of a simulation. A second possible explanation is that P could be become intrinsically non sparse, i.e. would become non-sparse also if computed with essentially infinite precision. EMD simulates non-adiabatic dynamics, and the proofs for the exponential decay of P however were given for the ground state at zero temperature for insulators and finite temperature for metals. 13 To control the numerical noise caused by the filtering of P two techniques were combined. The first is the restoration of the idempotency of P. This can be achieved through McWeeny purification: 59 P = 3P2 − 2P3 .

(15)

An idempotent P, such as the true density matrix, remains invariant, while a noisy P moves back in the direction of idempotency. In our experience, one or two purification steps after each implicit propagation are sufficient to maintain good idempotency. McWeeny purification has therefore little impact on the short term performance, but improves the long-term stability. The second technique is to use two different filtering thresholds during the simulation. Steps in the simulation that are critical for numerical errors are performed with a tighter filtering threshold. Once the simulation leaves the critical region the matrices are filtered at the normal filtering threshold. The critical operations are the McWeeny purification and the application of the propagator to P. In our calculations, reducing the filtering threshold by two orders of magnitude for these steps yielded stable results at low computational cost. The reduction of the numerical noise through these control mechanisms is displayed in Figure 4, showing that this significantly outperforms the straightforward implementation. As illustrated in Figure 5 for 2000 time steps (8000 as), the combination of both techniques allows for an effective control of the numerical noise. For the large band gap system employed, P remains sparse (0.2 occupation) at a filtering threshold of 10−5 . However, this does not hold at arbitrary precision. Using a tighter threshold (10−7 ), a dense matrix is obtained after 6000as. As illustrated in Figure 6 for the simulation of a boron-nitride nanotube (a smaller gap system), this threshold depends on the system studied. Even at a filtering threshold of 10−5 the density matrix

20 ACS Paragon Plus Environment

Page 21 of 43

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Page 22 of 43

Page 23 of 43

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

context 61–63 might provide a viable approach to obtain a truly linear-scaling non-adiabatic MD.

5 Applications In the following sections, various more realistic systems are studied with the methods presented in this work. Two systems are studied in the ground state and illustrate the applications of SS DFT in combination with the LS approach, while two further systems illustrate density matrix based EMD, one of them in combination with the SS DFT approach.

5.1 Computational Setup All the following calculations were performed using the CP2K software. 45,64 CP2K combines a primary contracted Gaussian basis with an auxiliary plane-wave (PW) basis. This Gaussian and Plane Wave (GPW) 65 scheme allows for an efficient linear-scaling calculation of the Kohn-Sham matrix. The auxiliary PW basis is used to calculate the Hartree (Coulomb) energy in linear-scaling time using Fast Fourier Transforms. The transformation between the Gaussian and PW basis can be computed rapidly. The cutoff for the PW basis set was chosen to be at least 300 Ry in all simulations. While the PW basis is efficient for the Hartree energy, the primary Gaussian basis-set is local in nature and allows for a sparse representation of the Kohn-Sham matrix. Furthermore, the electronic density can be represented in fewer basis functions, especially in system which contain empty space. Unless mentioned otherwise, DZVP-MOLOPT and DZVP-MOLOPT-SR basis sets 66 were employed in the simulations. For the simulations, the Perdew-Burke-Ernzerhof 67 (PBE) exchange and correlation (xc) functional and Goedecker-Teter-Hutter (GTH) pseudopotentials 68 were used. In the HK energy functional, used in SS DFT, the kinetic energy was calculated using the Lee-Lee-Parr (LLP) 69 kinetic energy functional. For the simulations of liquid acetonitrile dispersion was added using the DFTD3 approach. 70 For the virus calculations the non-local rVV10 density functional was employed to efficiently and non-empirically take van der Waals interactions into account. 71–73 24 ACS Paragon Plus Environment

Page 24 of 43

Page 25 of 43

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

The linear-scaling calculations have been performed with the implementation as described in Ref., 6 which in particular allows for variable sparsity patterns of the matrices. Calculations are based on the TRS4 29 algorithm, except for the virus, which employs TRS4 only to obtain the first density matrix, and employs curvy-steps 30 for successive SCF steps. These calculations employ an EPS_FILTER of 10-7 . The EMD simulations were started from the ground-state wave functions. The wave-function minimization was converged with a threshold (EPS_SCF) of 10-7 or tighter. The BCH like series described in eq. 13, was truncated at a threshold (EXP_ACCURACY) of 10-12 or tighter. Choosing a very tight threshold was computationally inexpensive since the higher order terms are very sparse, which makes their computation cheap. In practice the threshold resulted in the convergence of the exponential to the point were the term became 100 percent sparse. The self-consistent propagation of the density matrix was converged to a threshold (EPS_ITER) of 10-5 or tighter. The default accuracy of CP2K (EPS_DEFAULT) was set to 10-10 or tighter. All simulations were run in double precision on a CRAY XC30. CP2K input files are available in the supplementary information.

5.2 Ground state applications 5.2.1

The band gap of TiO2 Nanoparticles in acetonitrile solution

The electronic structure of TiO2 anatase nanoparticles in an acetonitrile solvent was investigated, as a test of the gs-BO SS DFT implementation. Such particles are central for DSSCs by serving as the conduction path for the photo-excited electrons from dye molecules. 4 To validate the performance of the SS DFT implementation, several tests were performed. First, the structure of the neat liquid has been computed using MD based simulations (10-15ps in length) based on both KS DFT and SS DFT. Pair correlation functions for a cubic unit cell (edge 15.74Å) containing 45 molecules are shown in Figure 8. Overall, the structure of the liquid is well described by the SS DFT approach. All peaks are well reproduced and are similar in height. The SS DFT approach leads to a small (0.1-0.2Å) inwards shift of the peaks, suggesting that some more repulsion would be required in the PBE+LLP functional. Based on these data, we consider the SS 25 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Page 26 of 43

Page 27 of 43

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

leads to a smaller band gap. This mechanism has also been suggested based on experiments. 79,80 5.2.2

Satellite Tobacco Mosaic Virus

Figure 11: Visualization of the electrostatic properties of the viral capsid as computed with DFT. The solvent (blue sticks), protein and RNA (ribbons) are shown in part, while the surface, which corresponds to the viral capsid, is colored according to the computed charges of its constituent atoms. The system contains somewhat more than one million atoms. Visualized using VMD. 81 For the final demonstration of the ground-state combination of the SS and LS approaches, we compute the electronic structure of the satellite tobacco mosaic virus in solution. This virus was the first entire life form simulated at the atomic level by Freddoline et. al in Ref. 58 using empirical force fields. Here, we perform the first ever first principles simulation of such a system. This calculation takes all important interactions (bonding, repulsion, electrostatics and polarization, van der Waals, etc.) into account based on the electronic structure only. Clearly, our simulations can not asses the dynamics of the virion at relevant timescales (ns to ms). For these timescales, empirical force fields, running on specialized hardware are more appropriate for the task. 82 With this calculation we illustrate that relaxing such structures and computing their electronic densities is 28 ACS Paragon Plus Environment

Page 28 of 43

Page 29 of 43

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

possible and learn how to deal with practical aspects of such calculations. This could be useful to analyse X-ray structure in more detail, 83 which is commonly still based on databases of precomputed densities, 84 even though a subsystem based approach has been applied to proteins already. 85 Furthermore, the computed densities provides a route to obtaining electrostatic properties of, and possibly interactions between, large protein assemblies, or the RNA and protein capsid. 86 The starting point for our calculations is the fully solvated STMV structure that is provided as a benchmark for the NAMD program, and available online. 87 Each of the 60 proteins ( 2200 atoms) and the RNA fragment ( 30300 atoms) are considered a molecule in the SS scheme. The latter is highly charged (949 e-, corresponding to the number of nucleotides), which is correctly captured in the SS+LS calculation. The system is neutralized by magnesium and chloride ions, solvated in water. 99 geometry relaxation steps have been performed to relax the structure. This relaxation is intended to reduce the gradients and total energy, but ideally the system would be equilibrated at a finite temperature, which is still out of reach. The resulting electronic structure has been analyzed in terms of the Mulliken charges, as shown in Figure 11. During optimization, total energy is lowered by approx. 2000 Hartree, only 2 mHt/atom, and supports the quality of the initial structure. The root mean square gradient is reduced by roughly one order of magnitude to 6 · 10−4 a.u. . However, the relaxation of the structure is rather challenging in this context. Large gradients are present in the structure, and a trust radius approach, limiting the atomic displacements to 0.05Å turned out to be essential to obtain a stable optimization in combination with a standard LBFGS 88,89 optimizer. Without trust radius, large local distortions that result from attempted optimization steps, would lead to an instable SCF procedure. Linear scaling preconditioning of the geometry optimization problem might be a complementary approach to further improve the geometry optimization algorithm, but is not yet available in CP2K. An interesting quantity of the system is the HOMO-LUMO gap. This gap increases from 0.7eV in the original structure to 1.2eV after relaxation. Even though GGA DFT underestimates band gaps, 90,91 a finite gap is thus obtained for this large structure. This is important as it shows that a properly solvated protein/DNA structure can indeed be described with GGA DFT, even in the presence of various charged residues and nu-

29 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

cleotides. Computing the corresponding orbitals is not yet possible, and we therefore do not know if the two orbitals that contribute to this small HOMO-LUMO gap are spatially close, or rather far apart, but this seems an interesting question for future research. Indeed, fluctuations in these large, disordered, systems should modulate the electrostatic potential in such a way that the orbital energies can be locally pushed up or down significantly, so that the energies of HOMO orbitals of molecular fragments in one region come closer to the energies of LUMO orbitals of molecular fragments in far away regions. In extreme cases where these levels would cross, Born-Oppenheimer simulations would become problematic. Finally, we believe that the STMV structure is an excellent benchmark system for linear scaling electronic structure calculations. We have established that it is at the same time challenging, but also feasible to deal with this system. It complements the usual benchmarks in that it is fully disordered, three dimensional, and chemically sufficiently rich. The computation of the HOMO-LUMO gap of the original structure, and the relaxation of that structure seem two reasonable targets of different complexity for such a benchmark. With the SS+LS approach, the computational cost of this calculation is manageable on today’s supercomputing resources. Using 2450 compute nodes (each 8 cores, no GPU employed) of a Cray XC30 system at CSCS, one geometry optimization step requires roughly 1h, while the memory peaks at 13GB per node. The speedup over a purely LS approach is significant, as measured for the first SCF step. SS+LS required 233s on 2450 nodes, while LS required 1385s on 4626 nodes, an 11-fold speedup. Memory usage per node is more similar, as both KS+LS and LS only required 9GB per node for the first SCF step, despite the fact that the occupation of the matrices (8.4M × 8.4M in size) is 0.019% in the SS+LS and 0.267% in the LS case. The similarity might be due to the fact that the implementation of the SS approach stores several density grids, which are large (2304 × 2304 × 2304 points). The usage of GPUs (one NVIDIA K20X with 6GB memory, per node) results in a 30% speedup in the SS+LS case, which mostly results from the accelerated sparse matrix matrix multiplications, as performed by the GPU accelerated version of the DBCSR library. 31 A GPU accelerated run was not possible for the LS only approach, as insufficient GPU memory was available. Interestingly, in the SS+LS approach, the computer time is approximately

30 ACS Paragon Plus Environment

Page 30 of 43

Page 31 of 43

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

density matrix EMD, and to study the influence of SS DFT model. Furthermore, this spectrum can be compared to a traditional LR-TDDFT calculation, as the peak positions (at fixed nuclei) coincide with that of LR-TDDFT. A correct reproduction of the intensities would require an initial electronic excitation, i.e. as obtained by applying a delta pulse. 23 The performed GGA simulations are expected to underestimate the excitation energies as compared to experiment, which is due to the fact that GGA functionals systematically underestimate the band gap. 90,91 The results of these simulations are shown in Figure 12. As expected, the density matrix based implementation of EMD was found to produce indistinguishable results from the traditional wave function based EMD implementation in CP2K. Comparing the EMD + SS and EMD simulations, fair agreement is found, with peak positions at similar locations, and with similar intensities. Clearly, as the ionic dynamics between the two models differs, some differences are to be expected. We note that the sampling of ionic configurations during the EMD or EMD + SS runs is a useful aspect that is not present in typical, static, RTP or LR-TDDFT calculations, and that this will naturally broaden the peaks in the spectrum. However, the dynamics is roughly 250fs (up to 1ps with SS DFT), so that only the fastest vibrations are averaged over. The fact that simulations of up to 1ps can be performed demonstrates the long-term stability of the method. Finally, the results compare favorably with LR-TDDFT calculations by Fantacci et. al. 94 These calculations employ an implicit ethanol solvent model, and employ a different xc functional (BPW91). Despite these differences in computational setup, a similar spectrum is obtained.

5.3.2

Electron Injection into a Boron-Nitride Nanotube

In the previous section, EMD was employed with the SS+LS scheme to compute spectra, comparable to linear response time-dependent DFT calculations. However, the current developments have also been triggered by our interest in electron dynamics beyond the linear response. This includes the simulation of electron transfer across interfaces as e.g. in DSSC, 95–101 in the presence of time-dependent external fields as found in various spectroscopies. 102–104 Furthermore, EMD could be employed to simulate the response of electrons to applied potentials in devices. 105 In the 32 ACS Paragon Plus Environment

Page 32 of 43

Page 33 of 43

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment

Page 34 of 43

Page 35 of 43

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Ehrenfest MD has been implemented in a formalism that employs a propagation of the density matrix. Within a SS scheme, where the sparsity of the matrices results from the model adopted, this leads to linear-scaling EMD. Comparing the spectra of a dye in solution obtained from EMD simulations with and without the SS approximation suggests that the approximation is relatively accurate. EMD in the LS framework, and by generalization in the SS+LS approach, is linear scaling as long as the assumption of a sparse density matrix holds. With a carefully refined algorithm a stable and robust propagation scheme has been derived, with forces consistently implemented. However, we find that the assumption of sparsity initially holds, but breaks down at longer timescales where the decay of the density matrix not exponential anymore. Therefore EMD can not be performed at arbitrary precision in a linear scaling setup, unless a relatively loose threshold is employed. Nevertheless, EMD simulations can be performed relatively efficiently for system containing up to a few thousand atoms, which might allow for the simulation of transient electronic transport effects in nanoscale devices.

Acknowledgement For the initial implementation and testing of the SS+LS scheme we acknowledge the work of Martin Häufel. We thank the Swiss National Science Foundation (SNF 200020-146485/1 and SNF 200021-129602/2) and the European union FP7 with an ERC Starting Grant under contract No. 277910 for funding this work. It was furthermore supported by a grant from the Swiss National Supercomputing Centre (CSCS) under project ID ch5.

Supporting Information Available Representative input files for most simulations are provided, as well as a pseudo-code for the pairswitching algorithm and a detailed derivation of Eq. 14. This information is available free of charge via the Internet at http://pubs.acs.org. This material is available free of charge via the Internet at http://pubs.acs.org/.

35 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

References (1) Hohenberg, P.; Kohn, W. Phys. Rev. 1964, 136, B864–B871. (2) Kohn, W.; Sham, L. J. Phys. Rev. 1965, 140, A1133–A1138. (3) Marques, M.; Gross, E. Annual Rev. Phys. Chem. 2004, 55, 427–455. (4) O’regan, B.; Grätzel, M. Nature 1991, 353, 737–740. (5) Grätzel, M. Nature 2001, 414, 338–344. (6) VandeVondele, J.; Borštnik, U.; Hutter, J. J. Chem. Theory Comput. 2012, 8, 3565–3573. (7) Bowler, D.; Miyazaki, T. J. Phys. Condens. Matter 2010, 22, 074207. (8) Skylaris, C.-K.; Haynes, P. D.; Mostofi, A. A.; Payne, M. C. J. Chem. Phys 2005, 122, 084119. (9) Artacho, E.; Anglada, E.; Diéguez, O.; Gale, J. D.; García, A.; Junquera, J.; Martin, R. M.; Ordejón, P.; Pruneda, J. M.; Sánchez-Portal, D. J. Phys. Condens. Matter 2008, 20, 064208. (10) Qin, X.; Shang, H.; Xiang, H.; Li, Z.; Yang, J. Int. J. Quant. Chem. 2015, 115, 647–655. (11) Ozaki, T.; Kino, H. Phys. Rev. B 2005, 72, 045121. (12) Mohr, S.; Ratcliff, L. E.; Genovese, L.; Caliste, D.; Boulanger, P.; Goedecker, S.; Deutsch, T. Phys. Chem. Chem. Phys. 2015, 17, 31360–31370. (13) Goedecker, S. Rev. Mod. Phys. 1999, 71, 1085. (14) Bowler, D. R.; Miyazaki, T. Rep. Prog. Phys. 2012, 75, 036503. (15) Jacob, C. R.; Neugebauer, J. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2014, 4, 325–362. (16) Gordon, R. G.; Kim, Y. S. J. Chem. Phys 1972, 56, 3122–3133. (17) Cortona, P. Phys. Rev. B 1991, 44, 8454–8458. 36 ACS Paragon Plus Environment

Page 36 of 43

Page 37 of 43

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(18) Wesołowski, T. A.; Warshel, A. J. Phys. Chem. 1993, 97, 8050–8053. (19) Hung, L.; Carter, E. A. Chem. Phys. Lett. 2009, 475, 163 – 170. (20) Iannuzzi, M.; Kirchner, B.; Hutter, J. Chem. Phys. Lett. 2006, 421, 16–20. (21) Casida, M. E.; Wesołowski, T. A. Int. J. Quantum Chem. 2004, 96, 577–588. (22) Neugebauer, J. J. Chem. Phys. 2007, 126, 134116. (23) Krishtal, A.; Ceresoli, D.; Pavanello, M. J. Chem. Phys. 2015, 142, 154116. (24) König, C.; Neugebauer, J. J. Chem. Theo. Comput. 2013, 9, 1808–1820. (25) O’Rourke, C.; Bowler, D. R. J. Chem. Phys 2015, 143, 102801. (26) Hohenberg, P.; Kohn, W. Phys. Rev. 1964, 136, B864. (27) Kohn, W.; Sham, L. J. Phys. Rev. 1965, 140, A1133–A1138. (28) Borštnik, U.; VandeVondele, J.; Weber, V.; Hutter, J. Parallel Comput. 2014, 40, 47–58. (29) Niklasson, A. M.; Tymczak, C.; Challacombe, M. J. Chem. Phys 2003, 118, 8611–8620. (30) Shao, Y.; Saravanan, C.; Head-Gordon, M.; White, C. A. J. Chem. Phys 2003, 118, 6144– 6151. (31) Schütt, O.; Messmer, P.; Hutter, J.; VandeVondele, J. In Electronic Structure Calculations on Graphics Processing Units; Walker, R. C., Götz, A. W., Eds.; John Wiley & Sons, Ltd, 2016; pp 173–190. (32) Miyamoto, Y.; Rubio, A.; Tománek, D. Phys. Rev. Lett. 2006, 97, 126104. (33) Isborn, C. M.; Li, X.; Tully, J. C. J. Chem. Phys 2007, 126, 134307. (34) Krasheninnikov, A. V.; Miyamoto, Y.; Tománek, D. Phys. Rev. Lett. 2007, 99, 016104.

37 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(35) Race, C. The modelling of radiation damage in metals using Ehrenfest dynamics; Springer Science & Business Media, 2011. (36) Wardlow, A.; Dundas, D. Phys. Rev. A 2016, 93, 023428. (37) Laarmann, T.; Shchatsinin, I.; Stalmashonak, A.; Boyle, M.; Zhavoronkov, N.; Handt, J.; Schmidt, R.; Schulz, C. P.; Hertel, I. V. Phys. Rev. Lett. 2007, 98, 058302. (38) Uhlmann, M.; Kunert, T.; Schmidt, R. Phys. Rev. A 2005, 72, 045402. (39) Handt, J.; Schmidt, R. EPL 2015, 109, 63001. (40) Lindenblatt, M.; van Heys, J.; Pehlke, E. Surf. Sci. 2006, 600, 3624 – 3628, Berlin, Germany: 4-9 September 2005 Proceedings of the 23th European Conference on Surface Science. (41) Liang, W.; Isborn, C. M.; Li, X. J. Phys. Chem. A 2009, 113, 3463–3469. (42) Moss, C. L.; Isborn, C. M.; Li, X. Phys. Rev. A 2009, 80, 024503. (43) Tavernelli, I.; Gaigeot, M.-P.; Vuilleumier, R.; Stia, C.; Hervé du Penhoat, M.-A.; Politis, M.-F. ChemPhysChem 2008, 9, 2099–2103. (44) Chapman, C. T.; Liang, W.; Li, X. J. Phys. Chem. Lett. 2011, 2, 1189–1192. (45) Hutter, J.; Iannuzzi, M.; Schiffmann, F.; VandeVondele, J. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2014, 4, 15–25. (46) Ojanperä, A.; Havu, V.; Lehtovaara, L.; Puska, M. J. Chem. Phys 2012, 136, 144103. (47) Meng, S.; Kaxiras, E. J. Chem. Phys 2008, 129, 054110. (48) Andrade, X.; Alberdi-Rodriguez, J.; Strubbe, D. A.; Oliveira, M. J. T.; Nogueira, F.; Castro, A.; Muguerza, J.; Arruabarrena, A.; Louie, S. G.; Aspuru-Guzik, A.; Rubio, A.; Marques, M. A. L. J. Phys. Condens. Matter 2012, 24, 233202. 38 ACS Paragon Plus Environment

Page 38 of 43

Page 39 of 43

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(49) Wang, F.; Yam, C. Y.; Hu, L.; Chen, G. J. Chem. Phys 2011, 135, 044126. (50) Li, X.; Tully, J. C.; Schlegel, H. B.; Frisch, M. J. J. Chem. Phys 2005, 123, 084106. (51) Tavernelli, I.; Röhrig, U. F.; Röthlisberger, U. Mol. Phys. 2005, 103, 963–981. (52) Kunert, T.; Schmidt, R. Eur. Phys. J. D 2003, 25, 15–24. (53) Castro, A.; Marques, M. A.; Rubio, A. J. Chem. Phys 2004, 121, 3425–3433. (54) Kolafa, J. J. Comput. Chem. 2004, 25, 335–342. (55) Kohn, W. Phys. Rev. Lett. 1996, 76, 3168–3171. (56) Brélaz, D. Commun. ACM 1979, 22, 251–256. (57) Larson, S. B.; Day, J.; Greenwood, A.; McPherson, A. J. Mol. Biol. 1998, 277, 37 – 59. (58) Freddolino, P. L.; Arkhipov, A. S.; Larson, S. B.; McPherson, A.; Schulten, K. Structure 2006, 14, 437 – 449. (59) McWeeny, R. Rev. Mod. Phys. 1960, 32, 335–369. (60) Hückel, E. Z. Phys. 1931, 70, 204–286. (61) Bedard-Hearn, M. J.; Larsen, R. E.; Schwartz, B. J. J. Chem. Phys. 2005, 123, 234106. (62) Shenvi, N.; Subotnik, J. E.; Yang, W. J. Chem. Phys. 2011, 134. (63) Akimov, A. V.; Long, R.; Prezhdo, O. V. J. Chem. Phys. 2014, 140. (64) VandeVondele, J.; Krack, M.; Mohamed, F.; Parrinello, M.; Chassaing, T.; Hutter, J. Comput. Phys. Commun. 2005, 167, 103 – 128. (65) Lippert, G.; Hutter, J.; Parrinello, M. Mol. Phys. 1997, 92, 477–488. (66) VandeVondele, J.; Hutter, J. J. Chem. Phys 2007, 127, 114105. 39 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(67) Perdew, J. P.; Burke, K.; Ernzerhof, M. Phys. Rev. Lett. 1996, 77, 3865–3868. (68) Goedecker, S.; Teter, M.; Hutter, J. Phys. Rev. B 1996, 54, 1703–1710. (69) Lee, H.; Lee, C.; Parr, R. G. Phys. Rev. A 1991, 44, 768–771. (70) Grimme, S.; Antony, J.; Ehrlich, S.; Krieg, H. J. Chem. Phys. 2010, 132, 154104. (71) Vydrov, O. A.; Van Voorhis, T. J. Chem. Phys 2010, 133, 244103. (72) Román-Pérez, G.; Soler, J. M. Phys. Rev. Lett. 2009, 103, 096102. (73) Sabatini, R.; Gorni, T.; de Gironcoli, S. Phys. Rev. B 2013, 87, 041108. (74) Reddy, K. M.; Manorama, S. V.; Reddy, A. R. Mater. Chem. Phys. 2003, 78, 239 – 245. (75) Monticone, S.; Tufeu, R.; Kanaev, A.; Scolan, E.; Sanchez, C. Appl. Surf. Sci. 2000, 162 163, 565 – 570. (76) Satoh, N.; Nakashima, T.; Kamikura, K.; Yamamoto, K. Nat. Nanotechnol. 2008, 3, 106– 111. (77) Kormann, C.; Bahnemann, D. W.; Hoffmann, M. R. J. Phys. Chem. 1988, 92, 5196–5201. (78) Anpo, M.; Shima, T.; Kodama, S.; Kubokawa, Y. J. Phys. Chem. 1987, 91, 4305–4310. (79) Tao, J.; Luttrell, T.; Batzill, M. Nat. Chem. 2011, 3, 296–300. (80) Ariga, H.; Taniike, T.; Morikawa, H.; Tada, M.; Min, B. K.; Watanabe, K.; Matsumoto, Y.; Ikeda, S.; Saiki, K.; Iwasawa, Y. J. Am. Chem. Soc. 2009, 131, 14670–14672. (81) Humphrey, W.; Dalke, A.; Schulten, K. J. Mol. Graphics 1996, 14, 33–38. (82) Shaw, D. E.; Maragakis, P.; Lindorff-Larsen, K.; Piana, S.; Dror, R. O.; Eastwood, M. P.; Bank, J. A.; Jumper, J. M.; Salmon, J. K.; Shan, Y.; Wriggers, W. Science 2010, 330, 341– 346. 40 ACS Paragon Plus Environment

Page 40 of 43

Page 41 of 43

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(83) Koritsanszky, T. S.; Coppens, P. Chem. Rev. 2001, 101, 1583–1628. (84) Domagala, S.; Fournier, B.; Liebschner, D.; Guillot, B.; Jelsch, C. Acta Crystallogr. Sect. A 2012, 68, 337–351. (85) Kiewisch, K.; Jacob, C. R.; Visscher, L. J. Chem. Theo. Comput. 2013, 9, 2425–2440. (86) Ren, P.; Chun, J.; Thomas, D. G.; Schnieders, M. J.; Marucho, M.; Zhang, J.; Baker, N. A. Quart. Rev. Biophys. 2012, 45, 427–491. (87) STMV, NAMD benchmark input. http://www.ks.uiuc.edu/Research/namd/ utilities/stmv/, 2016. (88) Zhu, C.; Byrd, R. H.; Lu, P.; Nocedal, J. ACM Trans. Math. Softw. 1997, 23, 550–560. (89) Morales, J. L.; Nocedal, J. ACM Trans. Math. Softw. 2011, 38, 7:1–7:4. (90) Perdew, J. P. Int. J. Quant. Chem. 1985, 28, 497–523. (91) Mori-Sánchez, P.; Cohen, A. J.; Yang, W. Phys. Rev. Lett. 2008, 100, 146401. (92) Cleveland, W. S. J. Amer. Statist. Assoc. 1979, 74, 829–836. (93) Cleveland, W. S.; Devlin, S. J. J. Amer. Statist. Assoc. 1988, 83, 596–610. (94) Simona Fantacci, F. D. A.; Selloni, A. J. Am. Chem. Soc. 2003, 125, 4381–4387. (95) Duncan, W. R.; Prezhdo, O. V. Annu. Rev. Phys. Chem. 2007, 58, 143–184. (96) Akimov, A. V.; Neukirch, A. J.; Prezhdo, O. V. Chem. Rev. 2013, 113, 4496–4565. (97) Prezhdo, O. V.; Duncan, W. R.; Prezhdo, V. V. Prog. Surf. Sci. 2009, 84, 30 – 68. (98) Duncan, W. R.; Stier, W. M.; Prezhdo, O. V. J. Am. Chem. Soc. 2005, 127, 7941–7951. (99) Duncan, W. R.; Craig, C. F.; Prezhdo, O. V. J. Am. Chem. Soc. 2007, 129, 8528–8543.

41 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 42 of 43

(100) Stier, W.; Prezhdo, O. V. J. Mol. Struct.: Theochem 2003, 630, 33 – 43. (101) Negre, C. F. A.; Fuertes, V. C.; Oviedo, M. B.; Oliva, F. Y.; Sánchez, C. G. J. Phys. Chem. C 2012, 116, 14748–14753. (102) Sánchez-de Armas, R.; Oviedo López, J.; A. San-Miguel, M.; Sanz, J. F.; Ordejón, P.; Pruneda, M. J. Chem. Theory Comput. 2010, 6, 2856–2865. (103) Sun, J.; Song, J.; Zhao, Y.; Liang, W.-Z. J. Chem. Phys. 2007, 127, 234107. (104) Oviedo, M. B.; Wong, B. M. J. Chem. Theory Comput. 2016, 12, 1862–1871. (105) Bani-Hashemian, M. H.; Brück, S.; Luisier, M.; VandeVondele, J. J. Chem. Phys. 2016, 144, 044113. (106) Kadanoff, L. P.; Baym, G. Quantum statistical mechanics; WA Benjamin, Inc. New York, 1962. (107) Luisier, M. Chem. Soc. Rev. 2014, 43, 4357–4367. (108) Frey,

J.

T.;

Doren,

D.

J.

TubeGen

3.4

http://turin.nss.udel.edu/research/tubegenonline.html, 2011. (109) Perebeinos, V.; Tersoff, J.; Avouris, P. Phys. Rev. Lett. 2005, 94, 086802. (110) Pennington, G.; Goldsman, N. Phys. Rev. B 2003, 68, 045426. (111) Chen, Y.-F.; Fuhrer, M. S. Phys. Rev. Lett. 2005, 95, 236803.

42 ACS Paragon Plus Environment

web-interface.

Page 43 of 43

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ACS Paragon Plus Environment