Local Molecular Orbitals from a Projection onto Localized Centers

May 10, 2016 - presented which exploits the locality of the eigenfunctions ... locality then depends more strongly on the shape of the set of local fu...
1 downloads 0 Views 10MB Size
Subscriber access provided by UNIV OF PITTSBURGH

Article

Local molecular orbitals from a projection onto localised centres Andreas Hesselmann J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.6b00321 • Publication Date (Web): 10 May 2016 Downloaded from http://pubs.acs.org on May 12, 2016

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Chemical Theory and Computation is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Local molecular orbitals from a projection onto localised centres Andreas Heßelmann∗ Lehrstuhl für Theoretische Chemie, Universität Erlangen-Nürnberg, Egerlandstr. 3, D-91058 Erlangen, Germany E-mail: [email protected]

Abstract A localisation method for molecular orbitals is presented which exploits the locality of the eigenfunctions associated with the largest eigenvalues of the matrix representation of spatially localised functions. Local molecular orbitals are obtained by a projection of the canonical orbitals onto the set of the eigenvectors which correspond to the largest eigenvalues of these matrices. Two different types of spatially localised functions were chosen in this work, a two-parameter smoothstep-type function and the weight functions determined by a Hirshfeld partitioning of the molecular volume. It will be shown that the method can provide fairly local occupied molecular orbitals if the positions of the set of local functions are set to the molecular bond centres. The method can also yield reasonably well localised virtual molecular orbitals, but here a sensible choice of the positions of the functions are the atomic sites and the locality then depends more strongly on the shape of the set of local functions. The method is tested for a range of polypeptide molecules in two different conformations, namely a helical and a

β -sheet conformation. Futhermore it will be shown that an adequate locality of the occupied and virtual orbitals can also be obtained for highly delocalised systems. ∗ To

whom correspondence should be addressed

1 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

1 Introduction In standard Hartree-Fock (HF) and Kohn-Sham (KS) density-functional theory calculations the electronic energy is calculated from a single-determinant wave function that itself is constructed from a set of single-electron wave functions termed as molecular orbitals (MO’s). Usually these orbitals are obtained from the diagonalisation of the Fock respectively Kohn-Sham matrix (F) to fulfill the underlying stationary condition Fia = 0 which is also refered to as Brillouin’s theorem 1 (i: occupied orbital index, a: virtual orbital index). While the criterion of Fia = 0 is sufficient to optimise the HF/KS wave functions, usually also the occupied-occupied and virtual-virtual blocks of the F matrix are diagonalised leading to a unique set of so-called canonical orbitals. The advantage of this additional determination of the set of occupied or virtual orbitals is that they may also be used for gaining a qualitative insight into the electronic structure. Moreover, due to Koopman’s theorem 2 (and a corresponding relation for KS methods, see Ref. 3 ), the orbital energies associated with the occupied or virtual HF/KS orbitals have a physical meaning and these, especially in case of KS, can e.g. be used to compute zeroth order estimates for the molecular excitation energies 4 (according to the so-called uncoupled approximation). The eigenvalues (at given k-points) from KS calculations are also of importance in the analysis of the electronic structure of solid state systems as they are often used to estimate the band structure of periodic sytems. However, the canonicalisation of the MO’s often leads to a set of strongly delocalised orbitals even in small molecular systems. Due to this the canonical orbitals are sometimes not useful for a qualitative description of the electronic structure. More important, however, is the fact that the nonlocal canonical orbitals are not well suited for the description of electron correlation. Electron correlation interaction energies between two electronic systems rapidly decay as 1/R6 with the distance R and therefore they are much more short-ranged than first-order Coulomb interaction energies. When using canonical orbitals for describing the correlated wave function, however, the short-range nature of electron correlation effects can not be exploited, both, for gaining a qualitative description of electron correlation 5–7 and for reducing the steep scaling behaviour of wave function correlation methods 8 . 2 ACS Paragon Plus Environment

Page 2 of 62

Page 3 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

The HF and KS energies are, however, invariant with respect to (arbitrary) rotations among the occupied and among the unoccupied MO’s. Therefore a number of methods have been developed by which the set of canonical occupied (or also unoccupied) MO’s can be transformed to local ones 9–25 . Almost all of these methods are based on an optimisation of a given orbital functional. The Boys method (also termed Foster-Boys method) 9,10,13 determines localised orbitals by minimising the orbital spread, which is identical to the easier task of maximising the sum of squares between the distances of orbital centroids 12 N

fFoster−Boys [φ ] = ∑ |hφi |r|φi i|2

(1)

i

The Pipek-Mezey (PM) method 15 is based on the principle of reducing the number of atomic centres in the occupied and virtual orbital space according to the Mulliken population number using the functional N M

fPM [φ ] = ∑ ∑ |hφi |PˆA |φi i|2 i

(2)

A

where PˆA is the projection operator onto the space of atomic orbitals centered on atom A. The PM population localisation approach has been shown to be superior to the Boys method for physically well-localised systems 16 and also preserves the σ − π separation of double bonds opposed to the Boys method. Due to the fact, however, that Mulliken charges do not possess a well defined basis set limit, also the PM localisation method inherits this unphysical basis set dependence. Because of this only recently Lehtola et al. have derived a number of alternative PM approaches employing a variety of other partial charge models which yield atomic charges which do converge with respect to the basis set size 25 . While the PM method to date is perhaps the most widely used localisation method, another notable early orbital localisation method is the Edminston-Ruedenberg method 11 in which the self-repulsion energy is maximised. This method, however, formally scales as N

5

compared to the lower scaling behaviour of the Boys method (N 3 ) and the PM method (N

3−4 )

as

and therefore is not favourable anymore for large electronic systems. Usually the localisation methods described above are only used for localising the occupied 3 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

orbitals where strong and isolated minima of the given functionals exist, enabling a rapid convergence of the iterative optimisation method. In case of the virtual orbitals, however, the localisation functional does not possess this property and therefore an iterative optimisation method can require many iteration steps until convergence is achieved. Because of this, in local correlation methods the virtual space is usually not represented by unitary transformed virtual MO’s. Instead, it is constructed from the atomic orbital basis from which the occupied space is projected out (so called projected AO’s (PAO’s)). While this procedure yields localised virtual orbitals very efficiently for large systems, the set of resulting virtual orbitals builds a redundant and nonorthonormal set. As pointed out by Subotnik et al. 18 , however, local correlation methods would benefit from using orthonormal local virtual orbitals because in case of nonorthogonal (and perhaps even linearly dependent) projected orbitals the residuals of the amplitude equations occuring in, e.g., coupledcluster methods have to be transformed first to an orthogonal basis before the amplitude update can be computed 26 . The solution algorithm to compute the amplitudes in local correlation methods is therefore more demanding if nonorthogonal virtual orbitals instead of orthogonal ones are used, and this may also affect the computational efficiency of the approach. An alternative solution to the technical problems in local correlation methods which are based on local virtual PAO’s is provided by local correlation methods in which the virtuals are represented by pair-specific pair-natural orbitals which are, for a given occupied orbital pair, orthogonal among themselves 27 . There exist a number of approaches which can yield both occupied and virtual orthonormal local MO’s 17–19,22,23,28 . The method by Subotnik 18 uses a decomposition of the full AO basis space into a minimal basis space, encompassing the occupied and valence virtual orbital space, and a hard virtual space that includes polarisation, diffuse and higher harmonic functions. The latter is constructed from diagonalisations of overlap matrices that are associated with a given atomic centre. The method is not based on an optimisation of a functional and the rate determining step in the algorithm is only limited by the diagonalisation of matrices of the size of the virtual orbital space. Orbital localisation methods which employ optimisation methods, namely a ’leastchange’ strategy 20 and the trust-region method 22 , for minimising the sum of orbital variances have

4 ACS Paragon Plus Environment

Page 4 of 62

Page 5 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

been developed by Jørgensen et al. 20,22,28 . These methods can be used both, to determine occupied and virtual orthonormal local MO’s. Another notable orbital localisation method is the Cholesky decomposition approach by Koch et al. 19 which exploits the locality of the density matrix in the atomic orbital basis. The Cholesky decomposition of the density matrix is given by N

Pµν = ∑ Lµ i Lν i

(3)

i

where L is a triangular matrix containing the (orthonormal) Cholesky orbitals in their columns and

µ , ν label atomic orbital basis functions. The idea of the Cholesky localisation method is that, since the density matrix P is sparse in the atomic orbital (AO) basis set, this sparsity transfers to the Cholesky vectors L which then represent, to some extent, local MO’s. The method has the advantage that it can be applied to very large systems due to the fact that it scales cubically with the system size and may be made even linear scaling. Furthermore, the Cholesky localisation method can be used also to compute local virtual orbitals 19 from a decomposition of the orthogonal complement of the density matrix. It has been found, however, that the virtual Cholesky MO’s are significantly less local than corresponding PAO’s and therefore are unsuitable for local correlation methods which aim at strong reductions of the computation time 19 . Cholesky orbitals serve, however, well as a starting guess for iterative localisation methods which are based on an optimisation of an energy functional, see e.g. Ref. 21 which presents a linear scaling algorithm to compute Boys localised MO’s. In this work an new noniterative orbital localisation method is presented which can be used both to compute orthonormal occupied and virtual localised orbitals. The method is based on a projection of the nonlocal canonical MO’s onto the eigenfunctions of the matrix representation of spatially localised functions within the restricted (occupied or virtual) orbital space. An orbital localisation method comparable to the one of this work has recently been derived by Zhang et al. 29 . In theirs RLMO (regionally localised molecular orbitals) method the canonical MO’s are projected onto the eigenfunctions of regionally separated overlap matrices represented in the AO

5 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

basis set through a singular value decomposition of the overlap matrix of the local eigenfunctions and the canonical MO’s 29 . The perhaps first type of orbital localisation method which is based on a projection technique is the method by Meunier et al. 30 . In this method some arbitary, yet linearly independent, orbitals of the bonds, lone pairs and inner shells are constructed in advance and then projected onto the set of canonical occupied MO’s. The resulting orbitals, however, are nonorthogonal and, e.g., a Löwdin orthogonalisation has to be applied on the projected set of orbitals 30,31 . Another method that can be viewed as a projective localisation method is the NLMO (natural localised molecular orbital) method by Reed and Weinhold in which the MO’s are diagonalised within local subblocks of the density matrix, see Ref. 14 . In this work it will be shown that local molecular orbitals can also be obtained by a projection onto the eigenfunction space of a finite-basis set matrix representation of arbitrary spatially localised functions. It will be explored herein how the shape of these functions influence the locality of the resulting MO’s. Moreover, it will be shown that also the positioning of the functions can have a significant impact on the locality of the orbitals. The results of the projection localisation method will be compared to those from localisation methods that optimise a certain localisation functional explicitly. The methods of this work will be tested for a selection of polypetide molecules as well as an electronically delocalised polyacene molecule and the results will be compared to standard orbital localisation methods. This work is organised as follows: section 2.1 describes the general scheme of the orbital localisation method. Section 2.2 describes the different functionals that have been used to measure the locality of the MO’s in this work. In section 2.3 the different types of local functions used in conjunction with the localisation method will be presented and some further aspects, namely the influence of the orthonormalisation constraint of the MO’s, the σ -π separation property and the performance in electronically difficult cases involving multiple centre bonds, are presented in the sections 2.4, 2.5 and 2.6. Section 3 describes the computational details of the calculations performed and section 4 illustrates the results. Finally, section 5 summarises the results of this work.

6 ACS Paragon Plus Environment

Page 6 of 62

Page 7 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

2 Method 2.1 Orbital localisation method Consider a general positive-definite function f A (r) which is localised at a point rA , i.e.

A

f (r) =

    0

if |r − rA | < d

  ≈ 0

elsewhere

(4)

where d may be of the order of a typical covalent bond radius of a chemical element of the second row. Any normalised function χ (r) (i.e. hχ |χ i = 1) can be attributed to be localised at rA if the overlap integral hχ | f A |χ i  0

(5)

is large. It can be shown 10 that a function χe which maximises hχe| f A |χei minimises the variance 2 2 hχe| f A |χei − hχe| f A |χei = min

(6)

which measures how much an expectation value of the quantity f A deviates from its mean value. When regarding f A as a multiplicative linear operator one can formally infer from Eq. (6) that the variance Eq. (6) is exactly zero if the function χe is an eigenfunction of f A . Certainly, in a complete function space {Ψ} the equation ( f A − τ )Ψ = 0 (with a scalar value τ ) can only have the trivial solution Ψ = 0. The only notable exception to this would be, if the function f A is a constant in the region of centre A and equal to zero otherwise (i.e., is a step function). In this case f A can have infinitely many eigenfunctions with the eigenvalue τ = f A (r ∈ A) and with the property Ψ(r ∈ / A) = 0, i.e., the eigenfunctions are localised in the region of A (for which f A (r) 6= 0). In the following the function f A will be represented in a finite subspace of the full function space that can be, for instance, the space spanned by the atomic orbital (AO) basis functions. The task of seeking for a set of functions which are ’maximally’ localised on a set of points {rA }

7 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 62

can then be redefined to the determination of the eigenfunctions of the matrix representation of f A within this finite subspace. For a set of N functions {χi } and M centres {A} one can then formulate that the quantity N

M

∑∑

i=1 A=1



2 N hχi | f A |χi i = ∑

M

∑ (SiiA)2 = max

(7)

i=1 A=1

would be maximised if the functions χi correspond to the N eigenfunctions of the matrix representation of f A possessing the largest eigenvalues. The matrix representation of f A will be denoted as SA in the following. The above given general consideration is now exploited to derive a method for localising a given set of N orthonormal functions onto the M centres {A}. This means that the maximum of Eq. (7) is seeked with the additional constraint that the functions φi ∈ χi define a subset of the full function space (spanned by all eigenfunctions of { f A } within the AO basis set) and that the functions are all orthonormal among themselves, i.e. hφi |φ j i = δi j . If the functions were identical to eigenfunctions of { f A } the latter is true only if two functions correspond to the same centre A. In section 2.4 it will be explored how strongly the orthonormalisation constraint influences the locality of the resulting orbitals. In this work the procedure shown in algorithm 1 is used to project a set of N nonlocal orbitals onto a set of orbitals localised on the centres A ({φi } → {φei }). By using the ordering of the centres as given in the first statement in algorithm 1 it is accomplished that the centres which give the smallest contribution to the total sum ∑i→A SiiA (i.e., the centres on which the orbitals are least localised) which has to be maximised are prefered in the following projection on the local centres (in the sum ∑i→A it is implied that orbital i is associated with a given centre A). The reason for this setting is that the first projected vectors (associated with the largest eigenvalues) of the procedure can be considered as the most local ones. Indeed, it has been found that this ordering of the centres yields more strongly localised orbitals than a reverse ordering of centres. It should be noted that in cases where, due to the symmetry of the molecule, the ordering of the atom centres according to the partial charges is not unique, also the resulting local MO’s will 8 ACS Paragon Plus Environment

Page 9 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Algorithm 1 ALOC localisation method 1: Calculate the expectation values SiiA = hi| f A |ii of the functions f A in the orbital subspace {φi }

2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15:

and order the set of centres {A} such that the one yielding the smallest value of ∑i SiiA comes first and the one yielding the largest values of the sum of expectation values comes last. Initialise the transformation matrix X by the unit matrix. Initialise the number of functions to be transformed to m = N Set a threshold τmax for the magnitude of the largest eigenvalues of SA . A of the dimension of the number of centres with some large numbers Initialise a vector τupper A (which exceed the largest eigenvalues of SA , i.e. τupper  1) l f irst =True if not l f irst then τmax → τmax · γ (0 < γ < 1) end if l f irst =False for A=1,N do A if τupper < τmax then discard the current centre, goto 11 end if Transform SA to the subspace spanned by the m remaining vectors: eA = XT SA XN×m S m×m N×m N×N where XN×m corresponds to a submatrix of X containing the first m columns (in the first cycle XN×m corresponds to the full matrix XN×N ).

16: 17:

18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29:

Gershgorin

If τmax < τmax then goto 10 eA . Order the eigenvectors such Calculate the eigenvectors UA and eigenvalues τ A of S that the one corresponding to the smallest eigenvalue is in the first column of UA and the one corresponding to the largest eigenvalue in the last column of UA . If max(τ A )< τmax then goto 10 XN×m → XN×m UAm×m (since UA is a unitary matrix the transformation matrix X stays unitary, too) for all τ A > τmax do m → m−1 end for A Save the largest eigenvalue of SA which is smaller than τmax as an entry in τupper If m ≤ 1 then goto 29 end for if m > 1 then goto 7 end if Determine the new set of localised orbitals φei = ∑i j Xi j φ j

9 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

differ in locality depending on the random initial ordering of the centres. In order to handle such cases the algorithm 1 would have to be modified such that, as soon as the first of the symmetry equivalent centres is processed, the remaining atoms are ordered anew according, e.g., to the largest eA matrix. Consider the benzene molecule as an example, labeling the ring eigenvalues of the S atoms by the list 1-2-3-4-5-6: as soon as, say, centre 1 is processed, the largest eigenvalues of e SA in the remaining orbital subspace on the centres 2 to 6 will partially lose their degeneracy, i.e., eA when the eigenfunctions only the centres 2-5 and 3-4 will possess degenerate eigenvalues of S on centre 1 are projected out. The algorithm then should sort the remaining symmetry equivalent eA and proceed in this way as long groups 2-5, 3-4 and 6 according to the largest eigenvalues S as all initial centres are processed. Such a stepwise-type adaptation of the localisation algorithm has been applied to the B− 9 cluster studied in section 2.6 and which has D8h symmetry. It was found, however, that compared to the algorithm where symmetry equivalent centres of the system are not sorted properly, the resulting localisation measures (see section 2.2) showed only marginal differences. Due to this it can be expected that the consideration of the internal symmetry of the system in the order of the projections in the localisation scheme has little impact in praxis. The idea of using the acceptance threshold τmax in algorithm 1 is that only the eigenvectors corresponding to the largest eigenvalues of SA are used for the projection in the first cycle. These eigenvectors are the most localised ones for a given centre and will give the strongest contribution to the functional of Eq. (7). In this way the threshold τmax serves as a filter by which less localised eigenvectors are discarded. As soon as a new cycle over the centre loop is started, the acceptance threshold τmax is reduced by a defined scaling factor γ . In this work γ has been set to a value of 0.9 leading to a relatively smooth scan over the eigenvector space. An alteration of this value lead, however, only to a small change in the locality of the resulting MO’s, so that also smaller values of it would be feasible. The value of τmax is initialised by the largest Gershgorin eigenvalue estimate 32 of the {SA } matrices multiplied by the γ factor. The algorithm 1 to compute localised molecular orbitals aims at a maximisation of the functional given in Eq. (7). It is noted in passing that if the functions f A were replaced by projection

10 ACS Paragon Plus Environment

Page 10 of 62

Page 11 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

operators PA which project onto the basis functions centered on atom A, then hχi |PA |χi i is the contribution of orbital χi to the Mulliken charge of atom A and the functional of Eq. (7) would be identical to the Pipek-Mezey (PM) functional, see Ref. 15 and Eq. (2). Alternative choices for f A therefore lead to a generalised PM scheme as was also recently investigated by Lehtola et al. 25 . The choices for the local functions f A used in this work are described in section 2.3. It has to be noted that the constraints which have to be satisfied in the above localisation method, namely the constraint that the functions χi are orthonormal and in particular that they are spanned by the restricted function space of the occupied (or virtual) molecular orbitals {φi } means that in general it will not be possible to approach the exact maximum of Eq. (7). However, one can argue that the better the exact eigenfunctions of f A (in a full, yet finite, AO basis function representation) can be represented by the orbitals {φi }, the better the functional of Eq. (7) will be suited to localise the MO’s. It is therefore worthwhile to compare the (projected) local MO’s obtained from the above method against the local MO’s obtained by a direct optimisation of Eq. (7). As stated above, the functional is essentially identical in its form to the PM functional and thus the same optimisation method as is used by Pipek and Mezey can be adopted here, see Ref. 15 . The unitary transformation matrix X which transforms the orbitals from the canonical to local ones can then generally be written by a product of individual 2-by-2 Jacobi rotation matrices 15 : X = ∏ Gi j (θi j )

(8)

ij

where Gi j (θi j ) contains the submatrix 



 cos(θi j ) sin(θi j )  gi j (θi j ) =   − sin(θi j ) cos(θi j )

(9)

in its i’th and j’th column and row and is a unit matrix elsewhere. In the PM 15 and also Boys 9,10 localisation methods the 2-by-2 rotation angles θi j are determined for a given pair of orbitals φi and

φ j stepwise, i.e., the orbitals are always updated by the rotation Gi j (θi j ) before the rotation angle 11 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 62

for the next orbital pair is calculated. While this approach can lead to a rapid convergence towards the maximum of Eq. (7) for a set of local orbitals, it can be very time consuming for localising a large set of virtual orbitals. This in particular is true, because the intermediate quantities A and B (see Eqs. (29a) and (29b) in Ref. 15 ) used to determine the rotation angles have to be updated, too, each time a 2-by-2 rotation is applied. Because of this, herein the rotation angles (Eqs. (13a-c) in Ref. 15 ) are determined for all orbital pairs i j in advance. The MO’s are then updated by the total transformation matrix X of Eq. (8) after which a new set of angles θi j is calculated until all angles θi j are smaller than a predefined threshold (in this work this threshold was set to 10−8 rad for localising the occupied orbitals while for the virtual orbitals a threshold of 10−4 rad was found to be sufficient). However, this approach is based on the assumption that the optimal rotation angle for two orbitals is independent from all other orbitals which is certainly not true especially at the beginning of the optimisation. Therefore, the optimal rotation angles θi j are scaled by a small factor γ in the construction of the transformation matrix X(γ ) = ∏ Gi j (γθi j )

(10)

ij

which means that a damping method is used within the procedure. In this work γ was set to a value of 0.2 which leads to a reasonable convergence behaviour for occupied and a moderate convergence behaviour for virtual orbitals. In some cases the ’step-length’ of γ lead to a decrease of the functional of Eq. (7) after X(γ ) was applied. Whenever this happened the last step has been reset and γ has been reduced by γ → γ /2. This step was done repeatedly until a new maximum was found. In order to distinguish between the localisation method which employs projections onto the eigenvectors of SA and the explicit optimisation method the former one will be denoted by the acronym ALOC and the latter one with the acronym ALOC(opt) in the following. The speed of convergence of the iterative optimisation methods can also critically depend on the starting guess for the MO’s. If the canonical MO’s were used as input to the ALOC(opt) method

12 ACS Paragon Plus Environment

Page 13 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

a large number of iteration steps were needed, especially for localising the virtual MO’s. A much more rapid convergence could be achieved by using the (already) local MO’s from a corresponding projective ALOC localisation as a starting guess for the ALOC(opt) procedure. Identifying the eigenvectors of SA with the smallest eigenvalues that are projected in the ALOC method as the least local contributions to the set of MO’s leads to the idea that an increased locality of the resulting MO’s can be achieved if the ALOC projection scheme is combined with the ALOC(opt) method as soon as the threshold τmax for the acceptance of eigenvectors in the ALOC method drops below a certain threshold. In this work this hybrid method, termed as ALOC(hyb), has been tested for the localisation of virtual orbitals. It has been implemented as follows: as 0 , where τ 0 soon as the current acceptance threshold τmax is below 1/4 τmax max denotes the initial

eA matrices according to threshold used in the algorithm, the ALOC method is stopped and the S step 9 in the ALOC algorithm are calculated for all centres. These are then used as input to the ALOC(opt) method which maximises the function 7 within the remaining orbital space of dimension m. Since this dimension is typically only a fraction of the total number of virtual orbitals, the total ALOC(hyb) method is much less time consuming than the ALOC(opt) method, requiring also a much smaller number of iterations in the explicit optimisation part, while it gives an improved locality of the resulting virtual MO’s compared to the bare ALOC method, as will be shown later on in this work.

2.2 Measurement of locality In this section the different methods to measure the locality of a set of (occupied or virtual) orbitals is briefly described. A more detailed description of different measurements for the locality of indivual orbitals or a set of orbitals has recently been given by Høyvik and Jørgensen 28 . A natural choice to measure the locality of a pair of orbitals is to calculate their spatial overlap ˆ Ωi j =

dr |φi (r)||φ j (r)|

13 ACS Paragon Plus Environment

(11)

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 14 of 62

This quantity could also directly be adopted for prescreening over a set of orbital pairs in local correlation methods. E.g., if Ωi j is small then any two-electron integral of the type (i j|rs) = ´ drφi (r)φ j (r) 1/r12 φr (r)φs (r) would be small, too. For a set of N orbitals one can then measure the functional fOV [φ ] =

1 Ωi j N(N − 1)/2 ∑ ij

(12)

which is normalised over the number of distinct orbital pairs. The functional of Eq. (12) will be refered to as spatial overlap functional in the following. The related functional 1 Ωia Nocc Nvir ∑ ia

occ−vir fOV [φ ] =

(13)

can be used to measure the average spatial overlap for a set of Nocc occupied and Nvir virtual orbitals which may be even more relevant in the context of local correlation methods and underlying prescreening approaches. Another choice for measuring the locality is the functional of Eq. (7). In a more general form it can be written as: M N

fcharg [φ ] = ∑ ∑



2 hφi | f |φi i A

(14)

i

A

which will be termed as charge functional in the following. Basically this functional measures, for a set of M local functions f A , how strongly a set of N orbitals φi are localised on the different centres A. In this work, for the calculation of fcharg [φ ], the functions f A were chosen as the weight functions from a Hirshfeld partitioning of the molecular volume (see section 2.3.2 for details). The centres A in this partitioning were set to the atomic centres of the respective systems in this case. The maximisation of the charge functional is the underlying concept of the PM localisation method and the ALOC method described in section 2.1. Finally, the average orbital spread

σ=

1/2 1  2 2 h φ |r | φ i − (h φ |r| φ i) i i i i N∑ i

14 ACS Paragon Plus Environment

(15)

Page 15 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

will be used as a quantity to measure the locality. The indivdual orbital spreads σi in the sum in Eq. (15) measure how much on average the orbital density |φi (r)|2 deviates from its mean position and thus becomes small the more localised the orbital φi is. It should be noted here that the Boys orbitals minimise the related function fBoys [φ ] = ∑hφi |r2 |φi i − (hφi |r|φi i)2

(16)

i

which is the sum over the orbitals’ second moment. A minimisation of fBoys is congruent with a maximisation of the sum of squares between the distances of orbital centroids, see Eq. (1).

2.3 Choice of the functions f A 2.3.1 General two-paramter function A central question is now, which types of functions f A (Eq. (4)) would be most suitable within the localisation method ALOC described in section 2.1, more precisely, which shape of f A leads to a well localised set of orbitals? In this work the spherically symmetric function f1A (r) = 1 − erf(α (r − rA )β )

(17)

is chosen which contains the two parameters α and β and which obeys 0 ≤ f1A (r) ≤ 1. The parameter α controls the spatial extent of the function while the parameter β the slope of it, leading to a large variety of function shapes on adjustment of the two values. Figure 1 shows the function of Eq. (17) for two possible values of α and β , namely α = 2, β = 4 and α = 1/2, β = 1/2. As can be seen, in the former case the function is very narrow and quickly decays to zero while in the latter case the function is wider and decreases more slowly towards zero. For the case of a helical alanyl-phenylalanyl (AF) dipeptide molecule it is now investigated which influence the choices of the parameters α , β have on the locality on the resulting orbitals. The functions f1A were set to the locations of the atomic positions for this test. This approach will

15 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 16 of 62

be termed as ALOCa1 method here and in the following (to denote that ’a’tomic sites were used in conjunction with the ALOC localisation method described in section 2.1). The top left and right diagrams in Figure 2 display the spatial overlap functionals (Eq. (12)) for a range of parameter values for the localised occupied (left) and virtual (right) orbitals. Dark blue regions within the diagrams mark small values of fOV and therefore an increased locality of the resulting orbitals. In both cases small values of fOV are obtained for small values of α , β in the region of α , β ≈ 0.5. In case of the occupied orbitals an increased locality can also be obtained by some larger parameter values of the order of α = 2.5 and β = 4 as can be seen in the top left diagram in Figure 2. In contrast to this, in case of the virtual orbitals an increase of both parameters α and β generally leads to an increase of fOV and thus a reduction of the locality, see top right diagram in Figure 2. It can therefore be concluded that for atom centered functions f1A it is advantageous if the functions are more diffuse (as is the case with small values of α and β ). In order to explain this finding the mean positions of the local orbitals of acrylic acid obtained by the PM and the Boys localisation methods are displayed in Figure 3 (marked as purple dots in the figure). Note that in case of the PM the mean positions of all orbitals are located on the molecular plane while some of the Boys orbitals are located outside of the molecular plane (indicating a mixing of σ - and

π -bonds), see top right subfigure in Figure 3. In both cases, however, it can clearly be seen that, except for the oxygen lone pairs, the occupied orbitals are not located on the atomic centres but are located along the respective molecular bonds. The ALOC localisation method, however, can only work in a reasonable way if there is some resemblance between the exact eigenvectors of SA and the local MO’s. Accordingly, if the functions f A are located on the atomic centres a more diffuse shape allows to better represent the local MO’s which are displaced from the atomic centres. As a cross check the same scan as for the diagrams in the top of Figure 2 has been repeated with the functions f1A located now, however, at the bond centres which are estimated by pbond AB = rA +

rAcov (rB − rA ) rAcov + rBcov

16 ACS Paragon Plus Environment

(18)

Page 17 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

cov are covalent radii for the where rA,B are the positions of two atoms A and B and the values rA,B

atoms A and B taken from the work of Pykkö and Atsumi 33 . For the oxygen lone pairs a corresponding function was also set at the positions of the two oxygen atoms of acrylic acid. The resulting locations of the functions are displayed in the bottom left subfigure in Figure 3. It should be noted now, that by this approach also the ALOC projection method described in section 2.1 simplifies since one local function f A now exactly is associated with one (single bond) or two (double bond or oxygen lone pairs) local MO(s). In the loop over centres in the ALOC method described eA to the transformain section 2.1 one can then therefore add exactly as many eigenvectors of S tion matrix X as correspond to the number of MO’s for the given centre and the total procedure is finished after a single loop over all centres (core orbitals have been excluded in the localisation method). With this method, which will be termed as ALOCb1 method in the following (where ’b’ denotes that bond centered functions f A were used) the spatial overlap values (for the AF molecule) shown in the bottom diagram in Figure 2 result for various values of the parameters α and β . Again one can observe that small values of α , β lead to a decrease of fOV , but the point is now, that the spatial overlap functional is much less senisitive with respect to variations of α and β as can be seen from the much smaller differences of the maximum and minimum value in the bottom diagram as compared to the corresponding differences for the top left diagram in Figure 2. Also, for extremely small values of α , β the functional fOV increases again in case of the ALOCb1 method as can be observed in Figure 2. Note in addition that the magnitudes of fOV for almost all choices of α , β are significantly smaller for the bond centered approach than with atom centered functions. A corresponding approach also for the localisation of virtual orbitals is, however, less successful. The reason for this is obvious on inspection of the positions for local virtual MO’s as obtained by the ALOCa2 method for acrylic acid shown in the bottom right subfigure in Figure 3 (the meaning of the ’2’ in ALOCa2 will be explained below in this section). While (with this method) all local virtual MO’s lie within the molecular plane, their positions are much more distributed than in case of the local occupied MO’s. Some virtuals are again located on the bond centres but a range

17 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 18 of 62

of virtual MO’s are also close to the atomic centres and many lie even farther outside. Generally this behaviour can be attributed to the fact that higher lying virtual orbitals (try to) represent the continuum and accordingly are given by linear combinations of AO basis functions including the most diffuse functions in the basis. From a practical point of view this means that generally orbital localisation methods will fail to properly localise the virtual orbitals if the basis set possesses very diffuse functions. For the localisation of the virtual orbitals a placement of the functions f A onto the atomic centres therefore seems to be reasonable. In this work, for the ALOCa1 method, the parameters α and β of the function of Eq. (17) were set to the values α = 0.5 and β = 0.5 while in case of the ALOCb1 method a more tight form was chosen using α = 2 and β = 4. While the choice of the positioning of the local centres in the ALOCb1 method depends on the average covalent radii used in Eq. (18), also an iterative method has been tested in which the positions for the functions f1A were chosen explicitly by the mean positions of the local MO’s which are given by the expectation values hi|r|ii for an orbital φi . The procedure of this approach is then as follows: a) compute the local MO’s using Eq. (18) as an initial guess for the positions of the local centres; b) compute the mean orbital positions hi|r|ii for all resulting localised MO’s and set the functions f1A on these new centres; c) again localise the MO’s for the new set of centres; repeat the steps b) and c) until the mean orbital positions of subsequent cycles do not change anymore (using a reasonable threshold criterion). It has been found that this procedure rapidly converges for the different systems studied in this work. However, no clear improvement of the locality of the resulting MO’s over the zeroth-order guess (Eq. (18)) was found. Therfore this approach will not be considered in the following in this work.

2.3.2 Local weight functions from a Hirshfeld partitioning In addition to the function f1A of Eq. (17) two alternative functions were chosen for the local sites. The first one is the weight function obtained by a Hirshfeld partitioning of the molecular volume defined as f2A (r) =

ρ A (r) ∑B ρ B (r)

18 ACS Paragon Plus Environment

(19)

Page 19 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

A A,B in Eq. (19) are atom centered which has the property that ∑M A f 2 (r) = 1. The densities ρ

densities of atoms A, B. The function f2A will therefore be close to one in the region of centre A and it will be close to zero otherwise. The densities ρ A were set to the atomic densities obtained by Slater’s rules in this work, see Refs. 34,35 . With this an expectation value hi| f2A |ii can be identified as a contribution of orbital φi to the partial atomic charge of atom A. Note that atomic Hirshfeld charges were also investigated within generalisations of the PM localisation method by Lehtola et al. 25 . Hereafter the method using the function in Eq. (19) will be termed as ALOCa2 method. Similarly as was done for the f1A functions an alternative method to ALOCa2 has been devised for the localisation of occupied orbitals in which the densities ρ A in Eq. (19) were not positioned on the atomic sites but on the bond centres, herefater denoted as ALOCb2 method. In this case, ρ A was set to the hydrogen density for a single bond and to the helium atomic density for a double bond (or double lone pair). It should be noted that the densities used in Eq. (19) in this work are fixed quantities which do not depend on the atomic basis set used in the calculations. Because of this the Hirshfeld partitioning itself is basis set independent, too, and this means that the partital atomic charges obtained by this approach converge fairly well upon increase of the basis set size, in contrast to the Mulliken charges which are used in the corresponding PM localisation method. The Hirshfeld charges for the carbon and oxygen atoms of acrylic acid are shown in Table 1 for the aug-cc-pVXZ Dunning basis sets with X =D,T,. . . ,6 36–39 . The labeling of the atoms used in this table is shown in Figure 4. As can be seen therein, for all atoms a rapid convergence towards a basis set limit can be observed. At the aug-cc-pVQZ basis set level the partial charges are already converged to 3 digits after the decimal point and at the aug-cc-pV5Z level to 4 digits. Finally, as it has been observed in case of the ALOCa1 method that a more diffuse shape of the function f1A can lead to an increased locality of the resulting orbitals, a corresponding approach has also been used for the Hirshfeld weight functions by exponentiating the charges in Eq. (19) by a factor of 1/4: f3A (r) =

 1/4 ρeA (r) A e with ρ = ρA ∑B ρeB (r) 19

ACS Paragon Plus Environment

(20)

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 20 of 62

The localisation method which uses the functions f3A on the atomic sites will be termed as ALOCa3 method in this work. Note that with the property ∑A f A (r) = 1 that is obeyed by the functions f2A and f3A the sum ∑A SA = S yields the overlap matrix in the MO basis . There exist a number of other methods which provide a decomposition of the overlap matrix into atomic ones and which might be useful as an alternative to the choice used in this work, see e.g. Refs. 40,41 .

2.4 Influence of the orthonormalisation constraint In the ALOC localisation method presented in algorithm 1 the eigenvectors of the matrices SA are calculated within subspaces of the full orbital function space in order to keep the resulting orbitals (not belonging to the same centre) orthogonal to each other (only the first vector is represented in the full space spanned by the orbitals). Because of this one can assume that the resulting local MO’s are less local than if they were identical to the eigenvectors of the matrices SA calculated in the full space of the N orbitals. In order to determine the influence of this orthonormalisation constraint on the locality of the MO’s the ALOC algorithm 1 has been modified such that the exact eigenvectors of SA (corresponding to the N largest eigenvalues) are copied to the transformation matrix X. This approach has been used in conjunction with the ALOCa1 method and the results for some local measurement functionals are shown in Table 2 for the occupied orbitals of the helical AF dipeptide molecule (displayed in the last column denoted as ALOCa1(noorth)). For comparison, also the results of the ALOCa1 and ALOCb1 methods are shown in the table. From the results of the charge functional and the mean orbital spread it can be seen that indeed the omission of the orthogonality constraint leads to a higher locality of the MO’s. The effect is, however, rather small. E.g., the mean orbital spread decreases by only 0.08 a.u. without the constraint. For the spatial overlap functional one can even observe a slight increase if the local MO’s for different centres are not orthogonalised in the ALOCa1 method. Compared to this, the bond centered method ALOCb1 yiels more localised orbitals than the ALOCa1(noorth) method by all measurement functionals displayed in Table 2. 20 ACS Paragon Plus Environment

Page 21 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Some of the eigenfunctions of the MO-basis matrix representation of the f1A function of Eq. (17) are shown in Figure 5. Some of these can be well intrepreted as σ -bonds or π -bonds, see e.g. the first two subfigures on the top line of Figure 5. Interestingly, however, among the eigenvectors which correspond to the N largest eigenvalues of {SA } there are no one which represent any σ - or

π -type bond orbitals on the phenyl ring in contrast to the ALOCa1 or ALOCb1 methods (see, e.g., Figure 10). It can be summarised that the orthonormality constraint for orbitals on different centres has only a small impact on the locality of the resulting MO’s. Other boundary conditions, such as the shape of the local functions (Eq. (4)) or their positioning can have a much larger influence on the MO’s locality.

2.5

σ -π separation in acrylic acid

It has previously been investigated by Lehtola and Jonsson 25 that an alteration of the atomic charge definitions in the PM localisation method does not abandon the property of the PM method to yield properly separated σ and π local MO’s. Here it is shown that the ALOC localisation method conserves a separation of σ - and π - orbitals, too. In order to demonstrate this, Figure 6 shows the highest nine occupied orbitals of acrylic acid obtained by the ALOCa1 method. The first four orbitals in Figure 6 (from the top) clearly correspond to the π -type orbitals on the C-C double bond and the carboxy group while the remaining orbitals are all of σ -type character representing C-H and O-H single bonds. Note that a corresponding finding is also made for the ALOCb1 method. Also note that the choice of the local function (see section 2.3) does not have any influence on the property of the σ -π separation of the MO’s, in line with the work of Lehtola et al. 25 .

2.6 Performance for molecules including multiple centre bonds Orbital localisation methods can generally be assumed to work well if a single Lewis structure is able to represent the electronic structure of a molecule. In this case it will always be possible to find suitable unitary transformations such that the delocalised canonical orbitals can be trans21 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

formed to local ones corresponding, in case of occupied MO’s, to one of the valence lines drawn in the Lewis structure. Already in case of electronically delocalised π -systems, however, multiple resonance Lewis structures are necessary to describe the system and the question is whether orbital localisation methods can still be applied. In case of aromatic carbon compounds it is known that, while the degree of localisation deteriorates as compared to electronically localised system (see Ref. 28 or section 4.2), still orbital localisation methods are able to localise the MO’s because they can usually be transformed to orbitals which are extended over only two centres (like in a corresponding Lewis structure where the double valence lines mark a ’local’ double bond). The situtation, however, gets more complicated if the molecule includes multiple centre bonds that can not properly be localised between atoms anymore. Examples for such cases are systems containing the cationic cyclopropyl group or a number of cationic and anionic borane clusters 42,43 . In these cases not only the outerplanar p-electrons can form bonding orbitals which are extended over multiple centres, but also the planar s- and p-orbitals. This kind of in-plane delocalisation is also termed as σ -aromaticity. Standard iterative orbital localisation methods are unsuitable for such cases as they will even fail to converge. Instead, the adaptive natural density partitioning (AdNDP) method by Zubarev et al., which extends the natural bonding orbital (NBO) analysis by Weinhold 14 to general n-center bond descriptions, can be used in this case 43 . Both, in the NBO and the AdNDP method the (n-centre) orbitals are obtained as the local block eigenfunctions of the one-electron density matrix and thus these methods, as the method of this work, are projective orbital localisation methods. Because of this it will be interesting to compare the performance of the method of the current work to the performance of the AdNDP for molecules which involve multiple centre bonds. For this, the anionic B− 9 cluster will be considered in the following. The global minumum of this system is given by the wheel-shaped structure of D8h symmetry as shown in figure 5 in Ref. 43 . According to the analysis of the canonical MO’s this system is doubly σ - and π -aromatic. The AdNDP method yields eight 2c-2e (2 electrons extended over 2 centres) orbitals of the σ -type, three σ type orbitals which are extended over the full system (9c-2e) and also three fully delocalised 9c-2e

22 ACS Paragon Plus Environment

Page 22 of 62

Page 23 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

π -type orbitals, see figure 5 in Ref. 43 . It goes without saying that the ALOCb method described in section 2.3.1 can not be applied for cases involving n-centre bonds with n > 2. Because of this the ALOCa1 method has been used to transform the canonical valence MO’s of the B− 9 cluster (the geometry of the system has been optimised on the MP2 (second-order Møller-Plesset perturbation theory) level). The resulting MO’s of the projective transformation are shown in Figure 7. As one can observe, in contrast to the results of the AdNDP method the first eight σ -type orbitals in Figure 7 are extended over three and four atoms, respectively. They are grouped into two different types which correspond to linear combinations of only atomic s-orbitals (first line in Figure 7) or s- and innerplanar porbitals (second line in Figure 7). The following six delocalised σ - and π -type orbitals yielded by the ALOCa1 method (lines 3 and 4 in Figure 7) basically correspond to the orbitals obtained also by the AdNDP method, compare figure 5 in Ref. 43 . There is only one noteable exception to this, namely the ALOCa1 method yields two π orbitals which are extended over six atoms, respectively, while the 9c-2e π orbital of A2u symmetry as yielded by the AdNDP method (see figure 5 in Ref. 43 ) is missing. The six delocalised σ - and π -type orbitals obtained by the AdNDP and ALOCa1 methods resemble the canonical σ - and π -MO’s and are responsible for the σ - and π -aromaticity of the system. In contrast to this, the remaining eight orbitals describe the peripheric bonding in the system and accordingly can be localised both by the AdNDP method and the ALOCa1 method of this work. Projective localisation methods like the AdNDP method therefore can help to interpret the bonding in such electronically complicated cases involving multicentre bonds. It is clear, however, that generally orbital localisation methods come to a limit as soon as the electronic structure of a molecule can no longer be represented by a set of 2-centre bonds.

23 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

3 Computational details The orbital localisation methods described in section 2 have been tested for a range of helical and

β -sheet polypeptide structures containing alternating alanine (A) and phenylalanine (F) units. In order to analyse the dependence of the locality of the molecular orbitals on the size of the systems the two different conformations of the peptides have been extended from bi- (AF) to decapeptides (AFAFAFAFAF) by adding AF sequences incrementally, leading to five different peptides (for the two different conformations) altogether. The peptide structures were obtained by first generating the global helical and β -sheet conformations using the Python PeptideBuilder program from Ref. 44 . For this, dihedral angles of

φ = −58◦ and ψ = −47◦ (helix structure) and φ = −139◦ and ψ = 135◦ (β -sheet structure) were used. Then, since the PeptideBuilder program omits the hydrogen atoms, the molecules were saturated with hydrogen atoms (leading to an amino and an aldehyde group for the terminating amino acid units) and the structures were optimised using the DFTB+ program 45 . In these optimisations only the positions of the backbone atoms were kept fixed in order to conserve the global conformation of the molecules. In addition the localisation methods have also been tested for an extended helical polypeptide molecule containing 20 alanyl units (Ala20 ) and a 37-ring polyacene molecule (C96 H24 ). In case of the Ala20 system the geometry has been obtained by the same strategy as described above for the alanyl-phenylalanine peptides. The geometry for the polyacene molecule has been taken from Ref. 46 . The implementation of the ALOC method described in section 1 requires the calculation of matrix elements of the type SiAj = hi| fxA | ji (x = 1 − 3) with i, j denoting two molecular orbitals φi and φ j and fxA is one of the functions defined in section 2.3 (Eqs. (17), (19) and (20)). All these different types of integrals have been implemented using a numerical quadrature. For this a tight threshold of 10−8 a.u. was set for the grid accuracy. With this value the number of quadrature points is chosen such that a numerical integration over the Slater exchange potential is identical to an analytical integration to within the given threshold value 47 . In case of the functions f1A (Eq. 24 ACS Paragon Plus Environment

Page 24 of 62

Page 25 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(17)) also an alternative method to compute the integrals hi| f1A | ji has been implemented. This was done by fitting the functions f1A by a linear combination of s-type Gaussian functions located on the centres A: f1A (r) =

K

∑ cr e−αr (r−r

A )2

K

=

r=1

∑ cr ζrA(r)

(21)

r=1

using a least-squares fit for the linear coefficients cr and using αr = 12 , 1, 32 , . . . . It has been found that a good accuracy of the fit is obtained for all ranges of the parameters α , β of the function f1A (Eq. (17)) that were used in this work if the number of Gaussian functions is set to K = 9. The integrals SiAj computed in this way showed a fairly good agreement with the integral values from a numerical quadrature. The computation time for calculating the integrals could be drastically reduced with this approach because with Eq. (21) they are now given as K

K

r

r

SiAj = ∑ cr hi|ζrA | ji = ∑ cr ∑ cµ i cν j hµ |ζrA |ν i

(22)

µν

where hµ |ζrA |ν i is a three-centre overlap integral of three Gaussian functions which can be calculated fast utilising Molpro’s integral routines for three-index Gaussian integrals. In Eq. (22) µ , ν label two atomic orbital basis functions and cµ i , cν j are the orbital coefficients of the orbitals φi and φ j . The orbital localisation methods have been applied to canonical (nonlocal) Hartree-Fock orbitals calculated for the different molecules. The def2-TZVP basis set 48 has been used as orbital basis set throughout. Two-electron integrals were calculated using the density fitting approximation. For this, the corresponding def2-QZVPP/JKFit fitting basis set has been used for the auxiliary basis functions 49 . For comparison, the canonical Hartree-Fock orbitals were also localised using the Cholesky decomposition (Chol) 19 , the Pipek-Mezey (PM) 15 and the Boys 9,10 localisation methods. While all methods have been applied to the set of occupied MO’s, the Chol and Boys methods have also been used to compute localised virtual orbitals which are then compared against the orbitals from the methods of this work. 25 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

All calculations were performed using a developers version of the Molpro quantum chemistry program 47,50 . Core orbitals were left unlocalised in this work.

4 Results Summary of acronyms The following list gives a brief description of the different acronyms that have been introduced in section 2: ALOC the orbital localisation method which uses the projection procedure of algorithn 1 ALOCa corresponds to the ALOC method employing ’a’tomic sites for the local functions for the projection ALOCb orresponds to the ALOC method employing ’b’ond centered sites for the local functions for the projection ALOCa1, ALOCa2, ALOCa3 the ALOCa method used in conjunction with the three different choices for the local functions given in Eqns. (17)→ALOCa1, (19)→ALOCa2 and (20)→ALOCa3 ALOCb1, ALOCb2 the ALOCb method used in conjunction with the two different choices for the local functions given in Eqns. (17)→ALOCb1 and (19)→ALOCb2 ALOCa1-3(opt) corresponds to the ALOCa1-3 method, but the charge functional (Eq. (14)) is explicitly optimised for the different choices of the functions (1-3) ALOCa1(hyb) a hybrid method between the ALOCa1 and ALOCa1(opt) method in which the projection method is used for the eigenvectors which correspond to the largest eigenvalues A (Eqns. (17), (19) and (20)) while an explicit optimisation of the charge of the functions f1−3

functional is performed within the remaining orbital space (corresponding to small eigenvalA ) ues of f1−3

26 ACS Paragon Plus Environment

Page 26 of 62

Page 27 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

4.1 Helical and β -sheet polypeptide conformers The diagrams in Figure 8 show the results for the spatial overlap functional (Eq. (12)) for the different structures of the (AF)1−5 polypeptide molecules and the different localisation methods. The two diagrams in the top of the figure display fOV for the local occupied orbitals of the helical (left) and β -sheet (right) conformers and the the two diagrams on the bottom the corresponding results for the local virtual orbitals. For comparison, the results for the Boys and PM localisation methods are included as well (the latter only for the results for the occupied orbitals). Generally one can see that the relative locality increases with larger system sizes in all cases (an improved locality is indicated by a smaller value of the spatial overlap functional in the figures). Note that the curves in the diagrams in Figure 8 are normalised over the functional values obtained by the corresponding canonical Hartree-Fock orbitals. The small ratio values in the diagrams in Figure 8 therefore actually also indicate the strong improvement of the locality of the orbitals for all methods as compared to the canonical ones. Only small differences of the curves for the helical and β -sheet structures can be observed in the diagrams. This shows that the spatial overlap functional is rather independent on the conformer structure of the different peptide molecules. No significant differences of the performances of the different localisation methods can be detected in the diagrams in Figure 8. The only exception to this is observed for the occupied MO’s. Here the ALOCa2 method yields clearly larger values for the fOV functional than all other methods for all conformer systems. Both, the ALOCb1 and ALOCb2 methods improve the results of the atom-centered ALOCa1-3 methods, as expected. Indeed, it can be seen that the fOV functional values obtained with the ALOCb1-2 methods are practically identical to the results obtained by the explicit optimisation methods ALOCa1-3(opt), PM and Boys shown in the two upper diagrams in Figure 8. This means that, by the measure of the average orbital’s spatial overlap, the ALOCb1-2 MO’s are as local as the orbitals which are obtained from an explicit optimisation of a localisation functional. In case of the virtual MO’s for the polypeptide systems there is hardly any difference of the mean spatial overlap yielded by the different localisation methods, see bottom diagrams in Fig27 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ure 8. The ALOCa1(hyb) method yields a small improvement over the ALOCa1 method, as expected, and yields fOV functional values very close to the ALOCa1(opt) ones. The diagram in Figure 9 shows the results for the charge functional (Eq. (14)) obtained for the occupied MO’s of the helical and β -sheet structures of the (AF)5 molecule. The values in the diagram are normalised over the result for the ALOCa2(opt) method because this method optimises fcharg (the Hirshfeld weights of Eq. (19) were used in Eq. (14)). One can see again that the atomcentered methods ALOCa1-3 can only yield reasonably localised MO’s if the local functions used in the projection approach of the ALOC localisation has a more diffuse shape as in the ALOCa1 and ALOCa3 methods. In contrast to this, by the measure of the charge functional, the ALOCa2 orbitals are significanly less local because only about 88% of the maximum of the sum of the atomic charges can be accumulated for both conformers, see Figure 9. The ALOCb1 and ALOCb2 methods yield about 99% of the maximum of the charge functional and thus again yield results practically identical to the localisation methods which are based on a functional optimisation. The tables Table 3 and Table 4 show the average orbital spreads that are obtained for the occupied and virtual local MO’s for the different methods. Qualitatively the same relations between the performances of the different methods as for the spatial overlap functional (Figure 8) can be observed here as well. Among the ALOC localisation methods that use atom-centered local sites (ALOCa1-3) the ALOCa1 methods yields the smallest average orbital spreads for the occupied MO’s while in case of the virtual ones all three ALOCa1-3 methods yield about the same magnitudes for σ of about 2.0 a.u., see Table 3. For the virtual MO’s one can now (more clearly than in the diagrams in Figure 8) see that the ALOCa1(hyb) method indeed improves the ALOCa1 results for this property and gives mean orbital spreads that are close to the ALOCa1(opt) ones, see Table 4. The figures Figure 10 and Figure 11 show some selected occupied and virtual localised MO’s for the helical (AF)5 molecule which have been obtained by the ALOCb1 (Figure 10) and ALOCa1 (Figure 11) method.

28 ACS Paragon Plus Environment

Page 28 of 62

Page 29 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

4.2 Alanine20 polypeptide and C96 H24 polyacene molecule To investigate the performances of the different localisation methods for a delocalised electronic system a 96-carbon polyacene molecule is considered in this work. The results will be compared to those for a helical (Ala)20 polypeptide molecule which represents an example of an extended molecule with a localised electronic structure. A thorough analysis of the performances of the PM localisation method and some other localisation methods which minimise powers of the orbital’s fourth central moment for a delocalised electronic strucure system represented by a graphene sheet (using a slightly larger C106 H28 acene cluster than considered in this work) has recently been given by Høyvik and Jørgensen 28 . Therein it is shown that the failure for obtaining local orbitals for electronically nonlocal systems are due to inadequacies in the used localisation schemes, rather than to the inherent nonlocal electronic structure of the corresponding molecular system. In this work, however, these enhanced localisation methods developed by Høyvik et al. 28 are not considered and instead the standard PM and Boys localisation methods will be used as a reference, besides noting that the PM method is considered as a method which is not capable to produce local orbitals for a system with a delocalised electronic structure in Ref. 28 . While this conclusion is based on the results for average orbital spreads for graphene obtained by the different localisation methods in Ref. 28 , it has not been shown whether also the spatial overlap values of orbital pairs, which could be utilised for prescreening over pairs of orbitals in local correlation methods, is significantly worse for the PM method, too, if compared against the enhanced localisation methods of Ref. 28 and when this is set in relation to localised electronic systems as (Ala)20 which is studied, too, in Ref. 28 . For the (Ala)20 system Høyvik et al. indeed observe an improvement of theirs more sophisticated localisation methods compared to the PM method for the maximum spatial overlap values Ωia between occupied and virtual MO’s. Yet, the maximum spatial overlap values presented in Ref. 28 do not give any evidence about how many occupied-virtual orbital pairs can actually be screened for a given threshold. In this work therefore an alternative plot type has been chosen to highlight this, see section 4.3. Table 5 presents the different localisation measures of section 2.2 for the occupied MO’s of 29 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(Ala)20 and the C96 H24 systems. A comparison between the spatial overlap values and the average orbital spreads, which both are normalised quantities with respect to the system size, shows that the MO’s of the polyacene system are quite less localised than the MO’s of (Ala)20 for all methods presented. A comparison of the performances of the different methods yields basically the same trends for (Ala)20 as for the (AF)x polypeptides studied in section 4.1. In case of the C96 H24 acene system the ALOCb1-2 methods yield only slightly more localised orbitals than the ALOCa1-3 methods on the basis of the spatial overlap and mean orbital spread measure. Considering the results for the charge functional, however, the ALOCb1-2 methods significantly improve the ALOCa1-3 methods and yield values of fcharg that are close to the results for the explicit optimisation methods ALOCa1-3(opt), Boys and PM. The results for the ALOCa1-3(opt) methods are almost identical to each other in contrast to the ALOCa1-3 counterpart methods. This shows that the explicit shape of the local function used in Eq. (4) has only a marginal effect on the locality of the resulting MO’s if the charge functional is maximised explicitly. The different results for the localisation functionals for the virtual MO’s are shown in Table 6. Again a significant deterioration of the locality of the MO’s as described by the different functionals is observed for the delocalised acene system. Except for the Cholesky method all methods yield comparable results for the mean spatial overlap and orbital spread values for the polypeptide and the polyacene systems. Explicit diagrams displaying the individual orbital spreads for the (Ala)20 molecule for the 281 occupied and 3331 virtual MO’s are presented in the figures Figure 12 and Figure 13, respectively. Note that the orbitals were ordered according to the magnitudes of σ for these plots. In case of the occupied MO’s one can see that the Boys orbitals yield a more uniform distribution of the orbital spreads compared to the PM and ALOCa2(opt) method, see Figure 12. The latter two methods yield a number of occupied MO’s which possess smaller σ values than the Boys method, but in turn theirs largest orbital spreads exceed those of the Boys method. The ALOCb1 and ALOCb2 methods yield an orbital spread distribution which is very close to the related ALOCa2(opt) method as can be seen in the diagrams in Figure 12. For the virtual orbitals the distributions of the orbital

30 ACS Paragon Plus Environment

Page 30 of 62

Page 31 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

spread values all look similar to each other for the different methods, see diagrams in Figure 13. A comparison of the plots for the ALOCa1-3 and ALOCa1-3(opt) methods among each other shows that the main differences are detected between the orbitals with the largest orbital spread. The explicit optimisation of the charge functional performed by the ALOCa1-3(opt) methods therefore seems to have mainly an effect on the orbitals which possess a larger orbital spread. Since this is only a small fraction of the complete set of virtual orbitals, the ALOCa1(hyb) method which combines the ALOCa1 and ALOCa1(opt) methods gives an orbital spread dsitribution almost indistinguishable from the corresponding ALOCa1(opt) method, see bottom left diagram in Figure 13. The figures Figure 14, Figure 15, Figure 16 and Figure 17 show some selected MO’s obtained by the ALOCb1 and ALOCa1 methods for the (Ala)20 and C96 H24 systems. Here and in all other orbital plots presented in this work a contour cutoff of 0.05 has been used.

4.3 Spatial overlap of occupied-virtual orbital pairs One of the most important fields of application for orbital localisation methods are local correlation methods 8 . In local correlation methods the occupied and virtual local MO’s are usually assigned to certain domains within which excitations are considered while excitations outside of a specified domain are completely neglected or treated by a lower level electron correlation approach. Accordingly the magnitude of the spatial overlap between occupied and virtual MO’s can directly be used as a screening criterion by which certain excitations, assigned to given occupied-virtual pairs, can be omitted based on a screening threshold for this quantity. Figure 18 shows the percentual contribution of the number of occupied-virtual orbital pairs with a spatial overlap lower than a given threshold, given on the abscissa axis, to the total number of distinct orbital pairs for the helical and β -sheet conformers of (AF)5 (upper diagrams) and the (Ala)20 polypeptide and C96 H24 acene molecules (lower diagrams). Therefore, in the diagrams in Figure 18, all curves turn towards 100% for large spatial overlap values because then all orbital pairs undermatch the given threshold. In turn, for smaller spatial overlap threshold values the 31 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

curves decrease because fewer orbital pairs exist for which the spatial overlap is smaller than this limit. The most strong decrease of the curves towards smaller spatial overlap thresholds is observed for the canonical orbitals for which the number of occ-virt pairs is below 10% already at a threshold value of 0.1 (a.u.). In contrast to this, in case of the orbitals from the localisation methods a siginficant number of orbital pairs exist with much smaller spatial overlap values than 0.1 and less for all systems, so a much weaker decay of the curves is observed in Figure 18 compared to the canonical ones. An essential result that can be deduced from the diagrams in Figure 18 is, that all orbital localisation methods presented in the plots yield about the same number of occupied-virtual orbital pairs for a given spatial overlap threshold value. The only exception to this is observed for the Cholesky method which yields a much smaller number of pairs for a given threshold value for all systems when compared to the other localisation methods, see Figure 18. By this result it can be concluded that there is little to choose between the different localisation methods presented here if used in conjunction with local correlation methods. A comparison of the diagrams for the four different systems in Figure 18 reveals, however, that the type of the system can have a strong influence on the spatial overlap values obtained by the different localisation methods. At a threshold value of 0.001 the number of occ-virt pairs amounts to about 60% for the localisation methods in case of the β -sheet conformer of (AF)5 while only about 40% of the orbital pairs undermatch the given threshold value in case of the more densely packed helical conformer, see top left and right diagrams in Figure 18. The most problematic case for all localisation methods is, as expected, the delocalised electronic system, namely the C96 H24 polyacene molecule, where only about 20% of all occupied-virtual orbital pairs could be screened at a threshold value of 0.001, see bottom right diagram in Figure 18.

5 Summary An orbital localisation method is presented which is based on a projection of the nonlocal canonical orbitals onto the eigenvectors of a matrix representation of spatially localised functions. Both, the

32 ACS Paragon Plus Environment

Page 32 of 62

Page 33 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

influence of the shape of these functions as well as their positioning on the locality of the resulting MO’s has been studied. If the positions of the functions are chosen as the atomic sites of the given molecule it was found that a more diffuse rather than tight form of the function can lead to an increased locality of the MO’s. This finding could be explained by the fact that occupied local MO’s are usually localised on the molecular bond centres of the molecule rather than on the atomic positions. Virtual local MO’s, on the other hand, possess no prefered mean position within the molecule. Accordingly, if the local functions used in the localisation method are positioned on the molecular bond centres, an increased locality of the resulting occupied MO’s is obtained by the projection approach. It was found, on the basis of the results of several functionals for measuring the locality of MO’s, that the occupied local MO’s obtained in this way were as local as those from localisation methods which are based on an explicit optimisation of a defined functional. The localisation method presented in this work can also yield fairly well local virtual MO’s if the functions for the projection are diffuse and set on the atomic sites. A combination of the projection method and explicit optimisation methods for orbital localisations has been derived as well, which can lead to an even improved locality of the virtual MO’s. Several choices for the local functions used in the localisation method have been tested in this work, a two-parameter smooth-step type function and two different local weight functions from a Hirshfeld partitioning of the molecular volume. Regarding the attained locality of the resulting MO’s no clear preference of the explicit type of the function could be consitituted, apart from the above mentioned fact that a more diffuse form of the function is advantageous if set on the atomic sites. Regarding the computational efficiency, however, a general (spherically symmetric) function can be beneficial as it can easily be fitted by a linear combination of s-type Gaussians (Eq. (21)). With this the CPU time for localising the 281/3331 occupied/virtual orbitals of (Alanine)20 (def2TZVP basis set) amounted to 2.4 and 24 minutes when applying this works localisation method. This in sum was less than half of the CPU time of one SCF iteraction cycle on the same computer (67 minutes) (measured on a 2GHz Sandybridge computer with 16 cores). The localisation method

33 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

of this work is therefore capable to yield local molecular orbitals of extended molecules in fairly short computation times. The computational efficiency of noniterative orbital localisation methods like the one presented in this work or the methods by Subotnik 18 or Zhang 29 is a significant advantage over methods which are based on optimising a localisation functional explicitly. However, it has to be added in favour of the latter class of methods that they can be used in conjunction with local analytic gradient methods because it is possible to formulate coupled perturbed localisation theories for these methods 51 . Basically all traditional localisation methods 9–11,15 and also a range of more recent localisation methods 24,25,52 fall into this category. For projective localisation methods, such as the one described in this work, corresponding coupled perturbed equations to determine the derivatives of the orbital transformation matrix have yet to be derived. The orbital localisation method of this work has been tested for a number of electronically localised and also unlocalised systems. It has been found that the locality of the resulting MO’s is comparable to the locality achieved by standard localisation methods like the Pipek-Mezey or Boys localisation methods, irrespective of the type of the underlying electronic strucuture of the molecule. It has been shown that the σ -π separation of molecular double bonds is conserved by the localisation method of this work, unlike, e.g., the standard Boys method. In case of electronically delocalised systems all methods considered in this work, including the Pipek-Mezey and Boys method, produce less localised orbitals and therefore may be less well suited for use in local correlation methods in this particular case.

Acknowledgement Financial support of this work through the DFG (Deutsche Forschungsgemeinschaft) priority Program No. SPP1807 ("Control of London dispersion interactions in molecular chemistry") is gratefully acknowledged.

34 ACS Paragon Plus Environment

Page 34 of 62

Page 35 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

References (1) Brillouin, L. Actual. scient. ind. 1933, 71, 159. (2) Koopmans, T. Physica (Amsterdam) 1933, 1, 104. (3) Chong, D. P.; Gritsenko, O. V.; Baerends, E. J. J. Chem. Phys. 2002, 116, 1760. (4) van Meer, R.; Gritsenko, O. V.; Baerends, E. J. J. Chem. Theory Comput. 2014, 10, 4432. (5) Runeberg, N.; Schütz, M.; Werner, H.-J. J. Chem. Phys. 1999, 110, 7210. (6) Schütz, M.; Rauhut, G.; Werner, H.-J. J. Phys. Chem. A 1998, 102, 5997. (7) Reinhardt, P. Chem. Phys. Lett. 2003, 370, 338. (8) Korona, T.; Kats, D.; Schütz, M.; Adler, T. B.; Liu, Y.; Werner, H.-J. Local approximations for an efficient and accurate treatment of electron correlation and electron excitations in molecules. In Series: Challenges and Advances in Computational Chemistry and Physics; Springer, 2011; Vol. 13, p 345. (9) Boys, S. F. Rev. Mod. Phys. 1960, 32, 296. (10) Foster, J. M.; Boys, S. F. Rev. Mod. Phys. 1960, 300. (11) Edminston, C.; Ruedenberg, K. Rev. Mod. Phys. 1963, 35, 457. (12) Kleier, D. A.; Halgren, T. A.; Hall, J. H.; Lipscomb, W. N. J. Chem. Phys. 1974, 61, 3905. (13) Haddon, R. C.; Williams, G. R. J. Chem. Phys. Lett. 1976, 42, 453. (14) Reed, A. E.; Weinhold, F. J. Chem. Phys. 1985, 83, 1736. (15) Pipek, J.; Mezey, P. G. J. Chem. Phys. 1989, 90, 4916. (16) Boughton, J. W.; Pulay, P. J. Comp. Chem. 1993, 14, 736. (17) Rubio, J.; Malrieu, J. P.; Reinhardt, P. J. Chem. Phys. 1997, 107, 10044. 35 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(18) Subotnik, J. E.; Dutoi, A. D.; Head-Gordon, M. J. Chem. Phys. 2005, 123, 114108. (19) Aquilante, F.; Pedersen, T. B.; Sanchez de Meras, A.; Koch, H. J. Chem. Phys. 2006, 125, 174101. (20) Ziotlowski, M.; Jansik, B.; Jørgensen, P.; Olsen, J. J. Chem. Phys. 2009, 131, 124112. (21) Guo, Y.; Li, W.; Li, S. J. Chem. Phys. 2011, 135, 134107. (22) Janisk, B.; Høst, S.; Kristensen, K.; Jørgensen, P. J. Chem. Phys. 2011, 134, 194104. (23) Høyvik, I.-M.; Jansik, B.; Jørgensen, P. J. Comp. Chem. 2013, 34, 1456. (24) Knizia, G. J. Chem. Theory Comput. 2013, 9, 4834. (25) Lehtola, S.; Jonsson, H. J. Chem. Theory Comput. 2014, 10, 642. (26) Knowles, P.; Schütz, M.; Werner, H.-J. Ab Initio Methods for Electron Correlation in Molecules. Modern Methods and Algorithms of Quantum Chemistry, John von Neumann Institute for Computing, Jülich, 2000. (27) Neese, F.; Wennmohs, F.; Hansen, A. J. Chem. Phys. 2009, 130, 114108. (28) Høyvik, I.-M.; Jørgensen, P. Chem. Rev. 2016, 116, 3306. (29) Zhang, C.; Li, S. J. Chem. Phys. 2014, 141, 244106. (30) Meunier, A.; levy, B.; Berthier, G. Theoret. Chim. Acta 1973, 29, 49. (31) Levie, B.; Millie, P.; Ridard, J.; Vinh, J. J. Electr. Spectr. 1974, 4, 13. (32) Gerschgorin, S. Izv. Akad. Nauk. USSR Otd. Fiz.-Mat. Nauk 1931, 6, 749. (33) Pykkø, P.; Atsumi, M. Chem. Eur. J. 2009, 15, 186. (34) Slater, J. C. Phys. Rev. 1930, 36, 57.

36 ACS Paragon Plus Environment

Page 36 of 62

Page 37 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(35) Cramer, C. J. Essentials of Computational Chemistry. Theories and Models; Wiley & Sons, 2004. (36) Dunning, T. H. J. Chem. Phys. 1989, 90, 1007. (37) Kendall, R. A.; T. H. Dunning, J.; Harrison, R. J. J. Chem. Phys. 1992, 96, 6796. (38) Woon, D.; T.H. Dunning, Jr., J. Chem. Phys. 1993, 98, 1358. (39) Woon, D.; T.H. Dunning, Jr., J. Chem. Phys. 1994, 100, 2975. (40) Mayer, I.; Salvador, P. Chem. Phys. Lett. 2004, 383, 368. (41) Vanfleteren, D.; Van Neck, D.; Bultinck, P.; P. W. Ayers,; Waroquier, M. J. Chem. Phys. 2010, 132, 164111. (42) Huang, W.; Sergeeva, A. P.; Zhai, H.-J.; Averkiev, B. B.; Wang, L.-S.; Boldyrev, A. I. Nature Chemistry 2010, 2, 202. (43) Zubarev, D. Y.; Boldyrev, A. I. Phys. Chem. Chem. Phys. 2008, 10, 5207. (44) Tien, M. Z.; Sydykova, D. K.; Meyer, A. G.; Wilke, C. O. PeptideBuilder: A simple Python library to generate model peptides, PeerJ 1:e80 https://dx.doi.org/10.7717/peerj.80, 2013. (45) Aradi, B.; Hourahine, B.; Frauenheim, T. J. Phys. Chem. A 2007, 111, 5678. (46) Heßelmann, A. J. Chem. Theory Comput. 2013, 9, 273. (47) Werner, H.-J. et al. MOLPRO, version 2012.1, a package of ab initio programs, 2012, see http://www.molpro.net. (48) Weigend, F.; Ahlrichs, R. Phys. Chem. Chem. Phys. 2005, 7, 3297. (49) Weigend, F. J. Comput. Chem. 2008, 29, 167. (50) Werner, H.-J.; Knowles, P. J.; Knizia, G.; Manby, F. R.; Schütz, M. WIREs Comput Mol Sci 2012, 2, 242. 37 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(51) A. E. Azahri,; Rauhut, G.; Pulay, P.; Werner, H.-J. J. Chem. Phys. 1998, 108, 5185. (52) A. C. West,; M. W. Schmidt,; M. S. Gordon,; Ruedenberg, K. J. Chem. Phys. 2013, 139, 234107.

38 ACS Paragon Plus Environment

Page 38 of 62

Page 39 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

39 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Tables Table 1: Basis set dependence of the partial atomic charges in acrylic acid obtained by a Hirshfeld partitioning using Slater’s rules atomic densities. See Figure 4 for the labeling used in the table. basis set aug-cc-pVDZ aug-cc-pVTZ aug-cc-pVQZ aug-cc-pV5Z aug-cc-pV6Z

C1 C2 C3 O1 O2 3.828329 4.024062 3.988721 6.056841 6.231858 3.826827 4.024193 3.988235 6.060505 6.232881 3.825630 4.024082 3.988456 6.061794 6.233537 3.825196 4.024113 3.988507 6.062139 6.233858 3.825153 4.024125 3.988507 6.062153 6.233883

Table 2: Comparison of the results for the overlap functional, the charge functional and the average orbital spread for the occupied orbitals of the helical AF dipeptide molecule as obtained by the ALOCa1 and ALOCb1 methods as well as the ALOCa1(noorth) method. In case of the ALOCa1(noorth) approach the canonical HF MO’s are projected onto the eigenfunctions of the matrix representation of f (rA ) = 1 − erf(α (rA )β ) (for α , β = 0.5) without the additional constraint of keeping MO’s on different sites orthogonal to each other. Values are in a.u. fOV fcharg σ

ALOCa1 0.0797 19.30 1.695

ALOCb1 ALOCa1(noorth) 0.0730 0.0818 20.13 19.93 1.606 1.614

40 ACS Paragon Plus Environment

Page 40 of 62

Page 41 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Table 3: Average orbital spread σ of the occupied orbitals of the helical and β -sheet (AF)x (x = 1 − 5) peptide conformers for various localisation methods (in a.u.). conformer method helix ALOCa1 ALOCa2 ALOCa3

(AF)1 1.695 1.930 1.766

(AF)2 1.687 1.929 1.779

(AF)3 1.690 1.920 1.793

(AF)4 1.691 1.934 1.804

(AF)5 1.692 1.933 1.808

ALOCb1 ALOCb2

1.606 1.601

1.611 1.605

1.612 1.605

1.612 1.606

1.613 1.606

ALOCa1(opt) ALOCa2(opt) ALOCa3(opt)

1.588 1.590 1.587

1.590 1.592 1.589

1.590 1.593 1.590

1.590 1.593 1.591

1.590 1.594 1.591

Cholesky Boys PM

2.330 1.590 1.598

2.466 1.590 1.603

2.452 1.592 1.606

2.647 1.592 1.608

2.707 1.592 1.609

ALOCa1 ALOCa2 ALOCa3

1.692 1.968 1.774

1.692 1.933 1.762

1.691 1.932 1.767

1.685 1.931 1.767

1.680 1.930 1.768

ALOCb1 ALOCb2

1.615 1.604

1.617 1.604

1.617 1.605

1.617 1.605

1.616 1.604

ALOCa1(opt) ALOCa2(opt) ALOCa3(opt)

1.591 1.593 1.590

1.591 1.594 1.592

1.591 1.594 1.592

1.591 1.594 1.592

1.590 1.593 1.591

Cholesky Boys PM

2.314 1.591 1.601

2.509 1.592 1.606

2.620 1.592 1.607

2.670 1.592 1.607

2.720 1.591 1.607

β -sheet

41 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 42 of 62

Table 4: Average orbital spread σ of the virtual orbitals of the helical and β -sheet (AF)x (x = 1 − 5) peptide conformers for various localisation methods (in a.u.). conformer method helix ALOCa1 ALOCa2 ALOCa3

β -sheet

(AF)1 2.026 2.013 1.968

(AF)2 2.039 2.033 1.987

(AF)3 2.069 2.044 1.996

(AF)4 2.077 2.056 1.998

(AF)5 2.083 2.060 2.001

ALOCa1(hyb)

1.906

1.904

1.910

1.911

1.910

ALOCa1(opt) ALOCa2(opt) ALOCa3(opt)

1.839 1.881 1.775

1.838 1.882 1.773

1.844 1.882 1.771

1.846 1.883 1.771

1.843 1.880 1.768

Cholesky Boys

2.711 1.719

2.748 1.718

2.774 1.719

2.797 1.720

2.802 1.718

ALOCa1 ALOCa2 ALOCa3

2.007 2.002 1.969

2.027 2.021 1.978

2.027 2.025 1.976

2.029 2.023 1.979

2.025 2.023 1.977

ALOCa1(hyb)

1.897

1.904

1.904

1.903

1.910

ALOCa1(opt) ALOCa2(opt) ALOCa3(opt)

1.831 1.874 1.766

1.837 1.879 1.765

1.835 1.877 1.762

1.835 1.875 1.760

1.835 1.874 1.760

Cholesky Boys

2.663 1.710

2.725 1.713

2.754 1.709

2.766 1.707

2.772 1.708

Table 5: Overlap and charge functionals as well as average orbital spreads obtained for the occupied orbitals of the (Ala)20 polypeptide molecule and the C96 H24 polyacene molecule for various localisation methods (in a.u.). method ALOCa1 ALOCa2 ALOCa3

fOV 0.0128 0.0158 0.0140

(Ala)20 fcharg 136.98 125.40 131.05

σ 1.577 1.809 1.696

fOV 0.0374 0.0488 0.0392

C96 H24 fcharg 66.065 59.151 64.115

σ 2.226 2.556 2.284

ALOCb1 ALOCb2

0.0122 0.0121

138.84 138.77

1.537 1.533

0.0371 0.0351

76.209 76.604

2.231 2.128

ALOCa1(opt) 0.0118 ALOCa2(opt) 0.0118 ALOCa3(opt) 0.0118

139.46 139.96 139.79

1.514 1.519 1.515

0.0291 0.0292 0.0290

78.747 78.750 78.725

1.889 1.893 1.880

Cholesky Boys PM

0.0232 116.78 2.065 0.0119 138.48 1.503 0.0123 139.65 1.527

0.0915 50.118 3.457 0.0339 77.535 1.916 0.0299 78.683 1.896

42 ACS Paragon Plus Environment

Page 43 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Table 6: Overlap functional and average orbital spreads obtained for the virtual orbitals of the (Ala)20 polypeptide molecule and the C96 H24 polyacene molecule for various localisation methods (in a.u.). (Ala)20 fOV σ 0.0132 2.015 0.0137 1.994 0.0125 1.923

method ALOCa1 ALOCa2 ALOCa3

C96 H24 fOV σ 0.0332 2.451 0.0343 2.423 0.0333 2.395

ALOCa1(hyb) 0.0118

1.827

0.0330

2.381

ALOCa1(opt) ALOCa2(opt) ALOCa3(opt)

0.0112 0.0111 0.0108

1.759 1.806 1.692

0.0297 0.0291 0.0274

2.271 2.274 2.188

Cholesky Boys

0.0265 2.670 0.0114 1.649

0.0564 0.0284

3.277 2.127

43 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

Figures 1

0.8

α=2, β=4 α=1/2, β=1/2

0.6 f(x)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 44 of 62

0.4

0.2

0 0

1

2

x

3

4

5

Figure 1: The function f (x) = 1 − erf(α xβ ) for two different choices of the parameters α and β .

44 ACS Paragon Plus Environment

Page 45 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

occupied, atom centered 5

virtual, atom centered 5

0.11

4.5

0.12

4.5

0.115

0.105

4

4 0.1

3.5 3

β

3.5

β

2.5

0.1

2.5

0.09

1.5

0.105

3

0.095

2

0.11

0.095

2

0.09

1.5

0.085

1

0.085

1

0.08

0.08

0.5

0.5

0

0

0.075

0

0.5

1

1.5

2

α

0.075

2.5

0.07

0

0.5

1

1.5

α

2

2.5

occupied, bond centered 5

0.0765

4.5

0.076

4 0.0755

3.5 0.075

3

β

2.5

0.0745

2

0.074

1.5 0.0735

1 0.073

0.5 0

0.0725

0

0.5

1

1.5

α

2

2.5

Figure 2: Spatial overlap functional values (Eq. (12)) obtained by the ALOCa1 (top diagrams) and ALOCb1 (bottom diagram) for various values of the parameters α and β . Dark blue regions mark parameter ranges of small values of the functional and thus an increased locality of the orbitals. The helical AF dipeptide molecule has been used for this analysis.

45 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

PM (occ)

Boys (occ)

bond centres (est.)

ALOCa2 (virt)

Figure 3: Average positions hφi |r|φi i of the occupied local molecular orbitals in acrylic acid obtained by the PM method (top left) and the Boys method (top right) (marked by the purple dots). The figure on the bottom left displays the bond centres estimated by average covalent radii of the underlying atoms. The figure on the bottom right-hand side shows the positions of the localised virtual orbitals as obtained by the ALOCa2 method (def2-TZVP basis set).

46 ACS Paragon Plus Environment

Page 46 of 62

Page 47 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

1

2 1

3

2 Figure 4: Labeling of the carbon and oxygen atoms of acrylic acid used in Table 1

47 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Figure 5: Some selected eigenfunctions of the matrix representation of f (rA ) = 1 − erf(α (rA )β ) located on the atomic centres of the helical AF dipeptide molecule (for α , β = 0.5). The eigenfunctions are determined within the restricted function space spanned by the canonical HF MO’s.

48 ACS Paragon Plus Environment

Page 48 of 62

Page 49 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Figure 6: The highest nine occupied orbitals of acrylic acid obtained by the ALOCa1 method. The first orbital on the top line corresponds to the HOMO.

49 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Figure 7: Valence orbitals of the B− 9 cluster obtained by the ALOCa1 localisation method. The first eight orbitals from the top correspond to 4c-2e and 3c-2e σ -type bonds, respectively. The three orbitals in the third line can be interpreted as completely delocalised 9c-2e σ -type bonds and the three orbitals in the last line as (partially) delocalised π -type bonds extended over 6 and 9 centres, respectively.

50 ACS Paragon Plus Environment

Page 50 of 62

0.15

0.1

(AF)1 (AF)2 (AF)3 (AF)4 (AF)5

β-sheet / occ

0.2

0.15

0.1

PM

pt ) Bo ys

3( o

2( op t)

Ca

AL O

pt )

Ca AL O

AL O

Ca

1( o

Cb 2 O

Cb 1

AL

Ca 3

AL O

2

AL O

0.1

)

s Bo y

AL O

Ca

3( o

pt

t)

AL

O Ca

2( op

op t)

1( hy

AL O

O Ca AL

Ca 1(

3 O Ca

Ca AL O

b)

0.05

1

ys Bo

op t)

) 2( o

Ca 3( AL O

Ca AL O

1(

op t

pt

)

b) 1( hy

AL

O Ca AL

O Ca

3 O Ca AL

Ca 2 AL O

AL O

Ca

1

0.05

0.15

AL

0.1

(AF)1 (AF)2 (AF)3 (AF)4 (AF)5

β-sheet / virt

Ca 2

fOV(method)/fOV(canonical)

0.15

O Ca

Ca 1

0.2

(AF)1 (AF)2 (AF)3 (AF)4 (AF)5

helix / virt

AL O

0.2

AL

AL O

PM

pt )

2( op AL t) O Ca 3( op t) Bo ys

1( o Ca

AL O

Ca

Cb 2 AL O

Cb 1

AL O

AL O

2

AL O

O Ca

Ca 1 AL O

Ca 3

0.05

0.05

AL

fOV(method)/fOV(canonical)

0.25

(AF)1 (AF)2 (AF)3 (AF)4 (AF)5

helix / occ

0.2

fOV(method)/fOV(canonical)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

fOV(method)/fOV(canonical)

Page 51 of 62

Figure 8: Spatial overlap functional evaluated for the occupied (top) and virtual (bottom) MO’s obtained by the different localisation methods. The curves are normalised by the magnitude of the functional yielded by the canonical HF MO’s. Left diagrams: helical structures and right diagrams: β -sheet structures.

51 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

100 98 (AF)5 helix (AF)5 β-sheet

96 94 92 90

PM

AL O

AL OC

Ca 2 AL OC a3 AL OC b1 AL OC b2 AL OC a1 (o AL pt OC ) a3 (o pt Bo ) ys

88

a1

fcharg(method) / fcharg(ALOCa2(opt))

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 52 of 62

Figure 9: Charge functional evaluated for various localisation methods for the helical and β -sheet conformers of the (AF)5 decapeptide molecule. The values are normalised by the charge functional values of the ALOCa2 method.

52 ACS Paragon Plus Environment

Page 53 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Figure 10: Localised occupied orbitals of the helical conformer of the (AF)5 peptide molecule obtained by the ALOCb1 method. The orbital on the top left-hand side corresponds to the HOMO.

53 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Figure 11: Localised virtual orbitals of the helical conformer of the (AF)5 peptide molecule obtained by the ALOCa1 method. The orbital on the top left-hand side corresponds to the LUMO.

54 ACS Paragon Plus Environment

Page 54 of 62

Page 55 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

2.4

2.4

2.4

ALOCa1

σ

ALOCa2

ALOCa3

2.2

2.2

2.2

2

2

2

σ

1.8

σ

1.8

1.8

1.6

1.6

1.6

1.4

1.4

1.4

1.2

1.2 0

50

100

150

200

250

1.2 0

50

100

orbital

150

200

250

0

2.2

2

2

σ

1.8

1.6

1.4

1.4 1.2 50

100

150

200

250

0

50

orbital

100

150

2.4

PM

ALOCa2(opt)

2.2

2.2

2.2

2

2

2

σ

1.8

σ

1.8

1.8

1.6

1.6

1.6

1.4

1.4

1.4

1.2

1.2 150

200

250

2.4

Boys

orbital

200

orbital

2.4

100

250

1.8

1.6

0

50

200

ALOCb2

2.2

1.2

0

150

2.4

ALOCb1

σ

100

orbital

2.4

σ

50

orbital

250

1.2 0

50

100

150

200

250

orbital

0

50

100

150

200

250

orbital

Figure 12: Orbital spread yielded by various localisation methods for the occupied orbitals of the (Ala)20 molecule.

55 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

8

8

ALOCa1

7

6

5

σ

4

5

σ

4 3

3

2

2

2

1

1

1

0

0 500

1000

1500

2000

2500

3000

0 0

500

1000

orbital

1500

2000

2500

3000

0

6

5

5

σ

4

4 3

2

2

1

1

1

0

0 2000

2500

3000

0 0

500

1000

orbital

1500

2000

2500

3000

0

500

1000

orbital 8

1500

2000

2500

3000

orbital 8

ALOCa1(hyb)

7

Boys

7

6

6

5

σ

3000

5

σ

4

2

1500

2500

6

3

1000

2000

ALOCa3(opt)

7

3

500

1500

8

ALOCa2(opt)

7

6

0

1000

orbital

8

ALOCa1(opt)

7

500

orbital

8

σ

4

3

0

ALOCa3

7

6

5

σ

8

ALOCa2

7

6

Page 56 of 62

5

σ

4

4

3

3

2

2

1

1

0

0 0

500

1000

1500

2000

2500

3000

0

500

1000

orbital

1500

2000

2500

3000

orbital

Figure 13: Orbital spread yielded by various localisation methods for the virtual orbitals of the (Ala)20 molecule.

56 ACS Paragon Plus Environment

Page 57 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Figure 14: Localised occupied orbitals of the C96 H24 polyacene molecule obtained by the ALOCb1 method. The orbital on the top left-hand side corresponds to the HOMO.

57 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Figure 15: Localised virtual orbitals of the C96 H24 polyacene molecule obtained by the ALOCa1 method. The orbital on the top left-hand side corresponds to the LUMO.

58 ACS Paragon Plus Environment

Page 58 of 62

Page 59 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Figure 16: Localised occupied orbitals of the (Ala)20 polypeptide molecule obtained by the ALOCb1 method. The orbital on the top left-hand side corresponds to the HOMO.

59 ACS Paragon Plus Environment

Journal of Chemical Theory and Computation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Figure 17: Localised virtual orbitals of the (Ala)20 polypeptide molecule obtained by the ALOCa1 method. The orbital on the top left-hand side corresponds to the LUMO.

60 ACS Paragon Plus Environment

Page 60 of 62

Page 61 of 62

100

(AF)5 helix

number of (occ-vir) orbital pairs [%]

number of (occ-vir) orbital pairs [%]

100

80 canonical ALOCa1 ALOCa2 ALOCa3 ALOCa1(opt) ALOCa2(opt) ALOCa3(opt) Cholesky Boys

60

40

20

0 0.001

0.01 0.1 spatial overlap threshold

number of (occ-vir) orbital pairs [%]

Ala20

40

20

0 0.001

80 canonical ALOCa1 ALOCa2 ALOCa3 ALOCa1(opt) ALOCa2(opt) ALOCa3(opt) Cholesky Boys

60

40

20

0.01 0.1 spatial overlap threshold

1

100

80

60

(AF)5 β-sheet

0 0.001

1

100

number of (occ-vir) orbital pairs [%]

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

canonical ALOCa1 ALOCa2 ALOCa3 ALOCa1(opt) ALOCa2(opt) ALOCa3(opt) Cholesky Boys

0.01 0.1 spatial overlap threshold

C96H24 80

60

40

20

0 0.001

1

canonical ALOCa1 ALOCa2 ALOCa3 ALOCa1(opt) ALOCa2(opt) ALOCa3(opt) Cholesky Boys 0.01 0.1 spatial overlap threshold

1

Figure 18: Number of occupied-virtual orbital pairs which possess a spatial overlap that is smaller than a given threshold shown on the absissa axis (in precent). Top diagrams: helical and β -sheet conformers of (AF)5 , bottom left diagram: helical (Ala)20 molecule and bottom right diagram: C96 H24 polyacene molecule.

61 ACS Paragon Plus Environment

projection centres MO’s (occupied) Journal of Chemical Theory andlocal Computation Page 62 of 62 canonical MO’s 1

f(r)

1 2 3 4 5 6 7 8

0

projection function occupied virtual

r

local MO’s (virtual)

ACS Paragon Plus Environment