Large Eigenvalue Problems in Coarse-Grained Dynamic Analyses of Supramolecular Systems


Patrice Koehl, J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.8b00338 • Publication Date (Web): June 6, 2018





Patrice Koehl
Department of Computer Sciences and Genome Center, University of California, Davis, CA 95616, USA
E-mail: [email protected]

Abstract

Computational methods ranging from all-atom molecular dynamics simulations to coarse-grained normal mode analyses based on simplified elastic networks provide a general framework for studying molecular dynamics. Despite recent successes in analyzing very large systems with up to a hundred million atoms, those methods are currently limited to small and medium-size molecular systems when used on standard desktop computers, due to computational limitations. The hope of circumventing those limitations rests on the development of improved algorithms with novel implementations that mitigate their computationally challenging parts. In this paper, we address the computational challenges associated with computing coarse-grained normal modes of very large molecular systems (up to one million atoms), focusing on the calculation of the eigen pairs of the Hessian of the potential energy function from which the normal modes are computed. We describe and implement a new method for handling this Hessian based on tensor products. This new formulation is shown to reduce space requirements and to improve the parallelization of its implementation. We have implemented and tested four different methods for computing some eigen pairs of the Hessian, namely the standard, robust Lanczos method, a simple modification




of this method based on polynomial filtering, a functional-based method recently proposed for normal mode analyses of viruses, and a block Chebyshev–Davidson method with inner-outer restart. We show that the latter provides the most efficient implementation when computing eigen pairs of extremely large Hessian matrices corresponding to large viral capsids. We also show that for those viral capsids, a large number of eigen pairs is actually needed, on the order of thousands, noting, however, that this large number is still a small fraction of the total number of possible eigen pairs (a few percent).

1 Introduction

The current views of the relationship between the structure and function of biomolecules remain fragmented. The main difficulty comes from the understanding that biomolecular function arises from action, i.e. the dynamics or movements, of a structure. However, we lack the experimental and computational tools for directly studying dynamics from the molecular to the supra-molecular level. Indeed, only a few experimental techniques are capable of collecting time-resolved structural data, and those that can are usually limited to a narrow time window. In parallel, state-of-the-art computational methods are limited in scope, both in time scale (usually microseconds) and in length scale (systems of up to a few hundred thousand atoms), because of limitations in computing power. These limitations in computing power are directly related to the size of the molecular system considered. Any simulation of molecular dynamics based on Newton's equations requires numerous calculations of the energy of the system under study. Each such calculation is at least of order O(N log(N)), where N is the number of atoms included in the system. These calculations need to be repeated many times: for example, 10^12 times for a 1 ms molecular dynamics simulation with a standard time step Δt equal to one to a few femtoseconds. For large values of N, such calculations become prohibitive. Viral capsids are one example of large supramolecular systems with such large N. They are





multi-protein assemblies that serve as containers for the genetic material of viruses. 1 These capsids need to be able to change conformation to release their contents into a host cell during infection; studying their dynamics is therefore one key to understanding the life cycle of a virus. 1 They are usually too big, with N in the hundreds of thousands, up to millions, for comprehensive analyses using standard molecular dynamics simulations. There are two main options to alleviate those problems. One option is to work on the implementation of the software or hardware designed to perform these calculations. 2–8 Such approaches have enabled simulations of systems of up to one hundred million atoms. 9,10 The second option is to simplify both the physical model used to capture dynamics and the model used to represent the system, i.e. to develop models with a reduced number of particles M, to make the calculation more tractable. This approach is usually identified as coarse-graining and has attracted a lot of attention in the recent literature. It is the topic of this paper. More specifically, we look at the numerical challenges associated with those coarse-grained models when they remain large. An alternative and promising approach to standard molecular dynamics on a molecular system is to infer dynamics from static structures corresponding to locally stable states, 11 together with reliable coarse-graining approaches to bridge the time-scale gap. 12,13 Cartesian normal modes, for example, represent a class of movements around a local energy minimum that are both straightforward to calculate and biologically relevant. 14–16 The low-frequency part of the spectrum of normal modes is often associated with functional transitions, for instance between two known states of the same macromolecule, such as its apo (ligand-free) and holo (bound) forms. The Elastic Network Model (ENM), introduced by M. Tirion in 1996, offers a particularly simple and efficient way to calculate these modes, allowing fast access to the collective dynamics of large complexes with no minimization issues, as it enforces that the crystal structure is already at the energy minimum. 17 Computing coarse-grained normal modes for a molecular system is deceptively easy. Starting from a conformation C0 of the molecular system, a coarse-grained potential V





that is minimum at C0, and taking a second-order approximation of that potential, Newton's equations of motion expressed in Cartesian coordinates lead to a generalized eigenvalue problem

    Hv = λMv,    (1)

where H ∈ R^(3N×3N) is the Hessian matrix of the potential V computed at C0, M is the (diagonal) mass matrix of the system, λ is an eigenvalue, and v the eigenvector associated with λ. The row size of the Hessian matrix, 3N, is three times the number N of atoms in the system. When N is small, this Hessian can be stored as a dense matrix and the computation of its eigen pairs is straightforward, which explains the popularity of the method described above. While the contribution of normal mode techniques to the study of the dynamics of such small systems is undeniable, the hope is that they can be used to analyze much bigger molecular systems, in which case N will be large. As standard algorithms for eigen analyses have a time complexity of O(N^3), they become prohibitive for such systems. Normal mode techniques have, for example, been used extensively on virus capsids (for reviews, see for example Refs. 18–20). Most of those applications, however, resort to a symmetry-specific implementation 21–23 or to higher levels of coarse-graining, as in the Rotations and Translations of Blocks (RTB) method 24 and in the block normal mode method, 25 and only a few have used all-atom representations. 19,26,27 There are two main, related issues to take into account when trying to compute normal modes for a very large molecular system. First, there are storage issues. The full Hessian matrix requires storage of order O(N^2). This requirement is not a strict bottleneck, however, as the matrix is highly sparse: the number of non-zero values is only a fraction of the theoretical requirement, since the definition of "interactions" within the molecule usually relies on a cutoff value, making the storage linear in N rather than quadratic. For very large N, however, it remains a moderate issue. Second, there is the problem of computing





the eigen pairs of the Hessian matrix itself. There are a few elements that simplify this problem. As the mass matrix M in equation 1 is diagonal and invertible, the generalized eigenvalue problem can easily be rewritten as a standard symmetric eigenvalue problem. In addition, Tirion had indicated in her original paper describing elastic normal modes that the lowest-frequency normal modes can capture most of the dynamics of the molecular system of interest. 17 This observation has since been supported by further evidence that the lowest-frequency normal modes do conform with conformational changes observed in X-ray and NMR experiments 28–30 as well as with the results of MD simulations. 31–33 While it is unclear how many of those low-frequency normal modes need to be considered, 34 it remains that only a small fraction of the eigenvalues and eigenvectors of the Hessian matrix need to be computed. 35 This opens the opportunity to use powerful iterative algorithms for computing those quantities. The most common family of such algorithms used for normal mode analysis is based on efficient Krylov subspace methods, such as those that perform minimization of the Rayleigh quotient. 36–38 More recently, the Krylov-based implicitly restarted Lanczos iteration method 39 has become popular for computing a subset of the eigenvalue spectrum of a large, sparse matrix, as it is robust, usually efficient and fast, and can be easily implemented thanks to the general availability of the package ARPACK. 40 While ARPACK has been used for computing the normal modes of very large molecular systems (see for example Ref. 27), it is unclear whether it is the most efficient method in this context. This is the main question addressed in this paper. This paper is developed around two ideas. First, a new representation of the Hessian of a pairwise potential is proposed. It is shown that implementation of this new representation leads to reduced memory requirements and better options for parallelization. Second, we explore different methods for solving the eigenvalue problems associated with normal mode analyses. Many methods have been developed recently for solving large-scale eigenvalue problems (for recent reviews, see Refs. 41,42), as this is a recurring problem in many scientific settings. We analyze three of those methods, namely the implicitly restarted Lanczos iteration




method, 39,43,44 a Jacobi–Davidson method, 45,46 and a method inspired by electronic structure theory, based on an energy functional. 26,47 We note that the latter is in fact similar to a trace-minimization approach. 48,49 For the first method, we consider both the standard implementation from ARPACK 40 and a modified version in which we have introduced Chebyshev polynomial filtering. 50,51 The second method follows the block Chebyshev–Davidson algorithm with inner-outer restart. 52 Finally, the third method follows the algorithm described by Dykeman and Sankey. 47 All methods are tested on Hessian matrices derived from the elastic energies of viruses of different sizes, from 10,000 atoms up to close to 800,000 atoms. The paper is organized as follows. In the next section, we describe normal mode analysis (NMA) in the context of the Elastic Network Model. We provide an overview of the theory and discuss the computation of the Hessian matrix, describing a new formulation in which this matrix is expressed as a sum of tensor products. In the following section, we provide a (brief) description of the three different methods that we have tested for finding a small subset of the eigen pairs of a large symmetric matrix. In the Results section, we present and discuss the application of these methods to compute the normal modes of large viral capsids. We find that the block Chebyshev–Davidson algorithm performs best. We conclude the paper with a discussion of future developments to increase the number of eigen pairs that can be computed.

2 Normal mode analysis

2.1 Normal mode analysis based on the Tirion elastic network model

The Elastic Network Model (ENM) was originally introduced by Tirion. 17 It is a model that captures the geometry of the molecule of interest in the form of a network of inter-atomic connections, linked together with elastic springs. Two categories of normal mode analyses





based on ENMs are widely used today, namely the Gaussian Network Model (GNM) 53,54 and the Anisotropic Network Model (ANM). 17,55 Here we follow the latter model, in which the energy of the molecule is equated with the harmonic energy associated with these springs. This defines a quadratic energy on the inter-atomic distances. Let B be a biomolecule containing N atoms, with atom i characterized by its position X_i = (x_i, y_i, z_i). The whole molecule is then described by a 3N-dimensional position vector X. For two atoms i and j of B, we set r_ij = |X_i − X_j| and r⁰_ij = |X⁰_i − X⁰_j| to be their Euclidean distances in any conformation X and in the ground-state conformation X⁰ (usually the X-ray structure), respectively. The total potential V_ENM of the biomolecule is then set to:

    V_ENM(X) = (1/2) ∑_(i,j) k_ij (r_ij − r⁰_ij)²    (2)

where k_ij is the force constant of the "spring" formed by the pair of atoms i and j. This sum includes all pairs of atoms (i, j) that satisfy r⁰_ij < R_c, where R_c is a cutoff distance.
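As an illustration of how such a spring network can be set up in practice, the sketch below builds the cutoff pair list and evaluates the elastic energy of Equation 2. This is our own illustrative code, not the paper's; the helper names (`build_pairs`, `V_enm`) and the use of SciPy's k-d tree are assumptions made for the example.

```python
# Illustrative sketch (not the paper's code): build the ENM spring list
# for Eq. (2) from ground-state coordinates X0, using a cutoff Rc.
import numpy as np
from scipy.spatial import cKDTree

def build_pairs(X0, Rc):
    """Return atom pairs (i, j), i < j, with |X0_i - X0_j| <= Rc,
    together with their ground-state distances r0_ij."""
    tree = cKDTree(X0)
    pairs = np.array(sorted(tree.query_pairs(r=Rc)))   # shape (P, 2)
    r0 = np.linalg.norm(X0[pairs[:, 0]] - X0[pairs[:, 1]], axis=1)
    return pairs, r0

def V_enm(X, X0, pairs, k=1.0):
    """Elastic network energy of Eq. (2) for a trial conformation X."""
    d = np.linalg.norm(X[pairs[:, 0]] - X[pairs[:, 1]], axis=1)
    d0 = np.linalg.norm(X0[pairs[:, 0]] - X0[pairs[:, 1]], axis=1)
    return 0.5 * np.sum(k * (d - d0) ** 2)
```

By construction, the energy vanishes at the ground state X⁰, consistent with the assumption V_ENM(X⁰) = 0 used below.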

In the original ENM, 17 the force constants k_ij are set constant for all pairs of residues. In other models, k_ij varies between pairs of atoms. For example, Ming and Wall 56 employed an enhanced ENM in which the interactions of neighboring Cα atoms along the backbone of a protein are strengthened to reproduce the correct bimodal distribution of the density of states of an all-atom model, while Kondrashov et al. 57 used a strategy in which they classified residue interactions into several categories corresponding to different physical properties. In parallel, the values of R_c differ between models. Those values are usually in the range of 12 to 15 Å if only one atom per residue (usually Cα) is considered, and in the range of 6 to 8 Å if all atoms are considered. 58 We will use constant k_ij = 1, with R_c = 8 Å for all-atom models and R_c = 15 Å for Cα-only models, unless otherwise noted. In the normal mode framework, the potential V_ENM is then approximated with a second-





order Taylor expansion in the neighborhood of the ground state X⁰:

    V_ENM(X) ≈ V_ENM(X⁰) + ∇V_ENM(X⁰)ᵀ (X − X⁰) + (1/2) (X − X⁰)ᵀ H (X − X⁰)    (3)

where ∇V_ENM and H are the gradient and Hessian of V_ENM, respectively. Note that, based on Equation 2, V_ENM(X⁰) = 0 and ∇V_ENM(X⁰) = 0. The ENM energy is then simply

    V_ENM(X) ≈ (1/2) (X − X⁰)ᵀ H (X − X⁰)    (4)

The 3 × 3 submatrix H_ij of the Hessian H corresponding to two atoms i and j that are in contact (i.e. with r⁰_ij < R_c) is given by:

    H_ij = −(k_ij / (r⁰_ij)²) (X_i − X_j)(X_i − X_j)ᵀ

         = −(k_ij / (r⁰_ij)²) ⎡ (x_i−x_j)²           (x_i−x_j)(y_i−y_j)   (x_i−x_j)(z_i−z_j) ⎤
                              ⎢ (y_i−y_j)(x_i−x_j)   (y_i−y_j)²           (y_i−y_j)(z_i−z_j) ⎥
                              ⎣ (z_i−z_j)(x_i−x_j)   (z_i−z_j)(y_i−y_j)   (z_i−z_j)²         ⎦    (5)
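For readers who want to form H explicitly, the following hedged sketch (our own code and naming, not the paper's) assembles the sparse Hessian from the 3 × 3 off-diagonal blocks of Equation 5, accumulating the diagonal blocks as H_ii = −∑_j H_ij:

```python
# Hedged sketch: assemble the sparse ENM Hessian from its 3x3 blocks.
# Function and variable names are ours, not the paper's.
import numpy as np
from scipy.sparse import coo_matrix

def enm_hessian(X0, pairs, k=1.0):
    """Sparse 3N x 3N Hessian of the ENM potential at the ground state X0.
    `pairs` is a (P, 2) integer array of contacting atom pairs (i, j)."""
    N = len(X0)
    rows, cols, vals = [], [], []
    for i, j in pairs:
        d = X0[i] - X0[j]
        block = -k * np.outer(d, d) / np.dot(d, d)   # Eq. (5)
        for a in range(3):
            for b in range(3):
                # off-diagonal blocks H_ij and H_ji
                rows += [3 * i + a, 3 * j + a]
                cols += [3 * j + b, 3 * i + b]
                vals += [block[a, b], block[a, b]]
                # diagonal blocks accumulate -H_ij
                rows += [3 * i + a, 3 * j + a]
                cols += [3 * i + b, 3 * j + b]
                vals += [-block[a, b], -block[a, b]]
    return coo_matrix((vals, (rows, cols)), shape=(3 * N, 3 * N)).tocsr()
```

A quick sanity check of such an assembly is that any rigid translation vector lies in the null space of H, consistent with the six zero modes discussed below.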

As the gradient of V_ENM at X⁰ is zero, the 3 × 3 submatrix H_ii on the diagonal of H is given by:

    H_ii = − ∑_j H_ij    (6)

where the sum extends over all atoms j in contact with i. For simplicity, we will assume that each atom is assigned a mass of 1. The procedure described above can easily be extended to the more general case of arbitrary masses. In Cartesian coordinates, the equations of motion defined by the potential V_ENM





are derived from Newton's equation:

    d²X/dt² = −H(X − X⁰)    (7)

Writing the solution to this equation as a linear sum of intrinsic motions (the "normal modes" of the system),

    X_j = ∑_(k=k₀)^N A_jk α_k cos(ω_k t + δ_k)    (8)

we get a standard eigenvalue problem,

    HV = V Ω    (9)

The eigenfrequencies ω are given by the elements of the diagonal matrix Ω, namely ω_i² = Ω(i, i). The eigenvectors are the columns of the matrix V, and the amplitudes and phases, α_k and δ_k, are determined by the initial conditions. We note that the first six eigenvalues in Ω are equal to 0, as they correspond to global translations and rotations of the biomolecule.
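As a concrete, hypothetical illustration of solving Equation 9 for the low-frequency modes, the sketch below uses SciPy's sparse eigensolver (a wrapper around ARPACK, discussed later in the paper). The function name `lowest_modes` and the shift-invert settings are our own choices, not the paper's implementation.

```python
# Hedged sketch: smallest nonzero eigen pairs of a sparse, symmetric,
# positive semi-definite matrix H (for an ENM Hessian, n_skip = 6 zero
# modes correspond to rigid-body translations and rotations).
import numpy as np
from scipy.sparse.linalg import eigsh

def lowest_modes(H, n_modes=20, n_skip=6):
    # Shift-invert about a small negative sigma: H - sigma*I is positive
    # definite and factorizable, and the eigenvalues closest to sigma
    # are the smallest ones.
    vals, vecs = eigsh(H, k=n_modes + n_skip, sigma=-1e-6, which='LM')
    order = np.argsort(vals)
    return vals[order][n_skip:], vecs[:, order[n_skip:]]
```

For very large systems this dense-free approach only needs H through factorizations and matrix-vector products, which motivates the tensor-product representation developed in the next subsection.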

2.2 A new representation for the Hessian of the elastic potential and its properties

Let us rewrite the quadratic potential for the elastic network as:

    V_ENM(X) = (1/2) ∑_(i,j) V_ij(X)    (10)

where the summation extends over all pairs of atoms (i, j) that satisfy the cutoff criterion (see above). We compute the derivatives and Hessian of this potential in vector form. We first introduce some notation. We write the inner and outer products of two vectors u and v as ⟨u, v⟩ and u ⊗ v, respectively. We define the vector U_ij such that



    U_ij = (0, …, 0, (X_i − X_j)/r_ij, 0, …, 0, (X_j − X_i)/r_ij, 0, …, 0),

namely, U_ij is zero everywhere except at positions i and j, where it is equal to the normalized difference vector between the positions of i and j.

Let us first analyze the pairwise potential V_ij(X). Its gradient in R^(3N) at a position X is given by:

    ∇V_ij(X) = k_ij (r_ij − r⁰_ij) U_ij    (11)

and its Hessian at the same position X is given by:

    H_ij(X) = k_ij (r_ij − r⁰_ij) ∂U_ij/∂X + k_ij U_ij ⊗ U_ij    (12)

Note that both terms in the expression of the Hessian are matrices of size 3N × 3N. For normal mode analyses, the gradient and Hessian are evaluated at X⁰:

    ∇V_ij(X⁰) = 0    (13)

and

    H_ij(X⁰) = k_ij U_ij ⊗ U_ij    (14)

The total Hessian of the elastic potential is then given by:

    H = H(X⁰) = ∑_(i,j) k_ij U_ij ⊗ U_ij    (15)

Expressing the Hessian as a (weighted) sum of tensor products (Equation 15) has at least two main advantages: it reduces (slightly) the amount of memory required to store




the Hessian, and it provides for simpler computation of Hessian-vector products. We provide illustrations of both.

More efficient storage of the Hessian matrix. Let P be the number of pairs of atoms (i, j) that satisfy the cutoff criterion (i.e. with r⁰_ij < R_c). In the sparse representation of the matrix H, we need to store P 3 × 3 matrices H_ij (see Equation 14) as well as N corresponding diagonal block matrices, for a total storage S1 = 9P + 9N. In the tensor representation of the Hessian (Equation 15), we need to store P vectors U_ij that each contain only 3 independent, non-zero values, for a total storage of S2 = 3P. Strictly speaking, S1 could be reduced to S1 = 6P + 6N, as the diagonal and off-diagonal block matrices are symmetric, at the cost, however, of recomputing H_ij every time it is needed. In practice, therefore, the tensor representation results in a factor of 3 savings in storage, with no computing cost.

More efficient implementation of the product of the Hessian with a vector. Let W be a vector in R^(3N). Based on Equation 15 above, the product HW is given by:

    HW = ∑_(i,j) k_ij (U_ij ⊗ U_ij) W    (16)

Using the property (U_ij ⊗ U_ij)W = ⟨U_ij, W⟩ U_ij, we get:

    HW = ∑_(i,j) k_ij ⟨U_ij, W⟩ U_ij    (17)

The 3N-dimensional dot product ⟨U_ij, W⟩ reduces to a 3-dimensional dot product:

    ⟨U_ij, W⟩ = ⟨X_i − X_j, W_i − W_j⟩ / r_ij    (18)

where W_i is the 3D vector containing the components of W at indices 3i + 1, 3i + 2, and 3i + 3 (using Fortran-style indexing), with a similar definition for W_j. This dot product only requires 3 multiplications. Multiplying this dot product with U_ij requires an additional 3




multiplications, making the whole calculation of HW of order 6P. In addition, the computation of the product HW is embarrassingly parallelizable, as it can easily be divided over all the pairs of interacting atoms (i, j) in the molecule.
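The matrix-free product of Equations 17 and 18 can be sketched in a few vectorized lines. This is our own illustrative naming (`hessian_times_vector`), written under the assumption of a uniform force constant k; it never forms the matrix H.

```python
# Minimal sketch of the matrix-free Hessian-vector product of Eq. (17):
# HW = sum over pairs of k_ij <U_ij, W> U_ij, using the 3-dimensional
# reduction of Eq. (18).
import numpy as np

def hessian_times_vector(X0, pairs, W, k=1.0):
    """Apply the ENM Hessian (Eq. 15) to a 3N-vector W.
    X0 is (N, 3); pairs is a (P, 2) integer array."""
    W3 = W.reshape(-1, 3)                    # W_i blocks
    d = X0[pairs[:, 0]] - X0[pairs[:, 1]]    # X_i - X_j, shape (P, 3)
    r2 = np.einsum('pa,pa->p', d, d)         # r_ij^2
    dw = W3[pairs[:, 0]] - W3[pairs[:, 1]]   # W_i - W_j
    # per-pair scalar: k_ij <X_i - X_j, W_i - W_j> / r_ij^2
    s = k * np.einsum('pa,pa->p', d, dw) / r2
    out = np.zeros_like(W3)
    np.add.at(out, pairs[:, 0], s[:, None] * d)    # block i gets +s*d
    np.add.at(out, pairs[:, 1], -s[:, None] * d)   # block j gets -s*d
    return out.ravel()
```

`np.add.at` is used because several pairs may touch the same atom; splitting the pair list across threads or processes gives the embarrassingly parallel version described above.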

3 Computing selected eigenvalues of large symmetric real matrices

As mentioned in the introduction, many methods have been developed recently for solving large-scale eigenvalue problems for sparse symmetric matrices. In this section, we present three of those methods, namely the implicitly restarted Lanczos iteration method, 39 a Jacobi–Davidson method, 46 and a method based on the minimization of an energy functional. 26,47 We focus on the main concepts and refer the reader to the original papers for more detailed presentations of these methods. We note first that the goal is to compute a small subset of the eigen pairs of the large, sparse Hessian matrix H described above. Tirion 17 reported that the lowest-frequency normal modes capture most of the dynamics of the molecular system; therefore, we are interested in the smallest eigenvalues of H. Second, we note that the matrix H is positive semi-definite. This can easily be verified: for any non-zero vector W in R^(3N), based on equation 17, we have:

    ⟨W, HW⟩ = ∑_(i,j) k_ij ⟨U_ij, W⟩² ≥ 0    (19)

H is semi-definite, as it has zero as an eigenvalue, with a null space of dimension (at least) 6. If λmax is the largest eigenvalue of H, then the matrix A = H − λmax I has all its eigenvalues non-positive, with its "largest" values (i.e. those with the largest amplitudes) corresponding to the smallest eigenvalues of H. In the following, we will focus on computing those largest-amplitude eigenvalues of A. Finally, we note that there is no reason for H to be diagonally dominant; in practice, we have observed that it is not.
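The spectral shift just described can be sketched as follows; the helper name `smallest_eigvals_via_shift` and the use of SciPy's `LinearOperator`/`eigsh` are our own choices for illustration, assuming only that the Hessian is available through a matrix-vector product routine.

```python
# Hedged sketch of the shift A = H - lambda_max * I: the largest-
# magnitude eigenvalues of A map back to the smallest eigenvalues of H.
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

def smallest_eigvals_via_shift(matvec_H, n, k=8):
    """matvec_H computes H @ w for a length-n vector w (n = 3N)."""
    H_op = LinearOperator((n, n), matvec=matvec_H, dtype=float)
    # Estimate lambda_max first (largest algebraic eigenvalue of H).
    lam_max = eigsh(H_op, k=1, which='LA', return_eigenvectors=False)[0]
    A = LinearOperator((n, n),
                       matvec=lambda w: matvec_H(w) - lam_max * w,
                       dtype=float)
    # Largest-magnitude eigenvalues of A <-> smallest eigenvalues of H.
    mu = eigsh(A, k=k, which='LM', return_eigenvectors=False)
    return np.sort(mu + lam_max)
```

Only matrix-vector products with H are needed, so this combines directly with the tensor-product representation of Equation 17.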

3.1 Implicitly restarted Arnoldi method

An intuitive method for finding the largest eigenvalue of a given N × N matrix A is the power iteration. Starting with an initial random vector x, this method calculates Ax, A²x, A³x, … iteratively, storing and normalizing the result into x at every iteration. The corresponding sequence of Rayleigh quotients R_i,

    R_i = x_iᵀ A x_i / (x_iᵀ x_i),    (20)

where x_i is the iterate at step i, converges to the largest eigenvalue of A, while x itself converges to the corresponding eigenvector. However, much potentially useful computation is wasted by using only the final result. This suggests forming, instead, the so-called Krylov matrix:

    K_n = [ x  Ax  A²x  …  Aⁿ⁻¹x ]    (21)

The columns of this matrix are not orthogonal, but an orthogonal basis can be constructed via a Gram–Schmidt orthogonalization. The resulting vectors form a basis of the corresponding Krylov subspace, and they give approximations of the eigenvectors corresponding to the n largest eigenvalues of A. This procedure is, however, unfortunately unstable. Arnoldi 59 proposed an iterative approach for solving this problem, outlined in Algorithm 1. We refer to this algorithm as the Lanczos algorithm, as it is used here specifically for symmetric matrices. In Algorithm 1, steps (1) and (2) are basically a Gram–Schmidt procedure that enforces that the new vector q_k is orthogonal to all previous vectors q_j. After k steps, this algorithm has computed a truncated factorization of A of the form:

    A V_k ≈ V_k H_k    (22)



Algorithm 1 Lanczos iterations for matrix eigenvalue problems
Input: matrix A, initial random vector q1 of norm 1, and maximum number of iterations k
for j = 2, …, k do
    (1) h_{i,j−1} = ⟨q_i, A q_{j−1}⟩, ∀i ∈ [1, j−1]
    (2) q_j = A q_{j−1} − ∑_{i=1}^{j−1} h_{i,j−1} q_i
    h_{j,j−1} = ||q_j||
    q_j = q_j / ||q_j||
end for

where V_k is an orthogonal matrix whose columns are the vectors q_j and H_k is the upper Hessenberg matrix formed by the numbers h_{j,k}. The eigenvalues λ_i of A are approximated by those of H_k (named Ritz values) and the associated approximate eigenvectors u_i (Ritz vectors) are given by:

    u_i = V_k y_i    (23)

where y_i is an eigenvector of H_k associated with λ_i. What makes this method so valuable for approximating a few eigenvalues of a large sparse matrix is that it only requires knowledge of the matrix A through matrix-vector products. Equation 17 shows that those products can be computed efficiently for the Hessian matrix of a quadratic potential. Algorithm 1 has been implemented in the package ARPACK 40 with three major modifications. First, ARPACK implements an implicit restarting scheme, 43 in which the vectors q_j are continually divided into a "wanted" list and an "unwanted" list, essentially compressing the interesting information into a fixed-size subspace. Second, it performs a deflation procedure to improve convergence. 39 Finally, it implements a block method, in which multiple vectors q_j are handled at once. A key parameter when using the Lanczos algorithm is the size of the subspace considered, i.e. the parameter k defined above. This parameter is usually set to twice the number of eigenvalues to be computed, though there are no specific guidelines as to why





this value should be used.
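The core of Algorithm 1 can be sketched in a few lines. This toy version (our own naming) uses full reorthogonalization and none of ARPACK's implicit restarting, deflation, or blocking machinery; it is an illustration of the basic iteration, not a substitute for ARPACK.

```python
# Hedged sketch of the Lanczos iteration of Algorithm 1 with full
# reorthogonalization, returning Ritz values and Ritz vectors (Eq. 23).
import numpy as np

def lanczos(A_matvec, n, k, rng=None):
    """Build an orthonormal Krylov basis Q and the projected matrix
    H = Q^T A Q, then diagonalize H to obtain Ritz pairs."""
    rng = rng or np.random.default_rng()
    Q = np.zeros((n, k))
    H = np.zeros((k, k))
    q = rng.standard_normal(n)
    Q[:, 0] = q / np.linalg.norm(q)
    for j in range(1, k):
        w = A_matvec(Q[:, j - 1])
        # steps (1)-(2): Gram-Schmidt against all previous vectors
        h = Q[:, :j].T @ w
        w = w - Q[:, :j] @ h
        H[:j, j - 1] = h
        H[j, j - 1] = np.linalg.norm(w)
        Q[:, j] = w / H[j, j - 1]
    H[:, k - 1] = Q.T @ A_matvec(Q[:, k - 1])   # last column of Q^T A Q
    theta, Y = np.linalg.eigh((H + H.T) / 2)    # Ritz values
    return theta, Q @ Y                         # Ritz vectors (Eq. 23)
```

With k equal to the matrix dimension this recovers the full spectrum; in practice k is kept small and only the extreme Ritz pairs are trusted.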

3.2 Speeding up convergence: Chebyshev acceleration

There are some known limitations to the Lanczos algorithm described above. For example, for large matrices the number of iterations and/or the size of the search subspace may be large, leading to problems with computing time and memory requirements, respectively. In addition, if the wanted eigenvalues are highly clustered, the algorithm may be dominated by convergence toward eigenvectors outside the range of the wanted ones. One option to reduce those shortcomings is to apply polynomial filtering. The main idea is to compute eigen pairs of the matrix p_m(A) instead of the matrix A, where p_m is a polynomial of degree m. This polynomial p_m is chosen such that the product p_m(A)W amplifies the components of the vector W along the desired eigenvectors, while dampening those along the unwanted eigenvectors. Chebyshev polynomials of the first kind have been, and remain, very popular choices for such filtering polynomials (see Refs. 50,51 for their use in eigenvalue problems for symmetric and non-symmetric matrices, respectively). As a reminder, Chebyshev polynomials of the first kind can be defined as the unique polynomials that satisfy

    C_m(x) = { cos(m arccos(x)),             |x| ≤ 1
             { cosh(m arccosh(x)),           x > 1
             { (−1)^m cosh(m arccosh(−x)),   x < −1    (24)

The exponential growth of Cm outside of the interval [−1, 1] is what makes the filter work: components mapped into [−1, 1] are dampened relative to those mapped outside, and the higher the degree of the polynomial, the sharper the filter. If the spectrum of the matrix H considered lies in the interval [a0 , b] and we want to dampen the interval [a, b] with a > a0 , it suffices to map [a, b] to [−1, 1] with a linear mapping. Note that with this scheme, in the case of normal modes we would use the matrix H directly, and not its shifted version A.


Computing a polynomial of a matrix applied to a vector comes at a computational cost. For Chebyshev polynomials, this cost is mitigated by the fact that they satisfy a three-term recurrence relation,

Cm (x) = 2x Cm−1 (x) − Cm−2 (x)        (25)

with C0 (x) = 1 and C1 (x) = x. This leads to Algorithm 2 below (see also Refs. 46,52) for computing pm (H)W , with pm (x) = Cm ((x − c)/e), such that the interval to the left of [a, b] is magnified. As noted in step (1) of the algorithm, c is the center of [a, b] and e its half-width.

Algorithm 2 Filter a vector W with a degree-m Chebyshev polynomial that dampens the interval [a, b].
Input: matrix H, vector W , interval [a, b], degree m.
(1) e = 0.5(b − a); c = 0.5(b + a)
(2) s = e/(a − c); s1 = 1/s
(3) Y = (s/e)(HW − cW )
for j = 2, . . . , m do
  (4) snew = 1/(2s1 − s)
  (5) Ynew = (2snew /e)(HY − cY ) − s snew W
  (6) W = Y ; Y = Ynew ; s = snew
end for
Output: filtered vector Y .

While this procedure can be embedded directly in the Lanczos algorithm (see for example Refs. 60,61), an easier implementation is to use the package ARPACK as is, but replace each request for the matrix-vector product HW with the product pm (H)W . This is equivalent to computing the eigen pairs of the matrix pm (H), which has the same eigenvectors as H. The eigenvalues of H are then obtained by computing the corresponding Rayleigh quotients (see equation 20).
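Algorithm 2 transcribes directly into code; the sketch below (our own naming, with the matrix accessed only through a matvec callback and a toy diagonal Hessian) shows how eigencomponents inside the dampened interval [a, b] are suppressed relative to those below a.

```python
import numpy as np

def chebyshev_filter(matvec, w, a, b, m):
    """Degree-m Chebyshev filter of Algorithm 2: eigencomponents of H
    inside [a, b] are dampened, those below a are magnified."""
    e = 0.5 * (b - a)              # half-width of the dampened interval
    c = 0.5 * (b + a)              # its center
    s = e / (a - c)
    s1 = 1.0 / s
    y = (s / e) * (matvec(w) - c * w)
    for _ in range(2, m + 1):
        s_new = 1.0 / (2.0 * s1 - s)
        y_new = (2.0 * s_new / e) * (matvec(y) - c * y) - s * s_new * w
        w, y, s = y, y_new, s_new
    return y

# Toy check: a diagonal "Hessian" with eigenvalues 1..10. Dampening
# [3, 10] leaves the components along the two lowest eigenvectors
# dominant in the filtered vector.
H = np.diag(np.arange(1.0, 11.0))
v = np.random.default_rng(0).standard_normal(10)
f = chebyshev_filter(lambda x: H @ x, v, a=3.0, b=10.0, m=10)
ratio = np.max(np.abs(f[2:])) / np.max(np.abs(f[:2]))   # should be tiny
```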

3.3 A Chebyshev-filtered Jacobi-Davidson method

The Jacobi-Davidson method, originally introduced by Sleijpen and van der Vorst, 45 solves the eigenvalue problem from a different point of view than the Arnoldi method. Given a target


eigenvalue τ and an approximate eigenpair (λ, u) close to this target, where u belongs to a search subspace U , the idea is to expand U with a correction s such that a better approximation for the eigenpair can be found in the expanded space. Algorithm 3 below, adapted from Ref. 62, provides an overview of this method.

Algorithm 3 The Jacobi-Davidson method
Input: matrix A, initial vector u1, target τ.
Initialize: U0 = [ ]
for k = 1, 2, . . . do
  (1) s = GS(Uk−1, s); Uk = [Uk−1 s]
  (2) Hk = Uk^T A Uk
  (3) Find the eigenpair (λ, c) of Hk closest to τ
  (4) u = Uk c; r = Au − λu
  (5) Stop if ||r|| < TOL
  (6) Solve for s ⊥ u satisfying (I − uu^T)(A − λI)(I − uu^T)s = −r
end for
Output: converged eigenpair (λ, u).

In step (1), GS stands for a Gram-Schmidt procedure such that s is modified to be orthogonal to Uk−1 . The matrix Hk in step (2) is the projection of A onto the subspace Uk . As k is usually small (at least compared to the dimension of A), any method can be used to compute its eigendecomposition (step 3). In steps (4) and (5), the updated eigen pair is checked and the procedure is stopped if the norm of the residual r is below a tolerance TOL. The core of the Jacobi-Davidson algorithm lies in step (6), where a correction vector s is found by solving a linear system of equations. The key idea in most implementations of the Jacobi-Davidson algorithm is to solve this linear system only approximately, using an iterative method (see for example Ref. 62). There is however an alternative approach that has proved efficient 46 and is of interest for normal mode calculations. The exact solution of the correction equation for s gives an expansion vector proportional to (A − λI)−1 u. Zhou and Saad 46 noticed that this is equivalent to applying a polynomial filter to the current vector u, with the rational function φ(x) = 1/(x − λ). Applying this filter has the advantage of magnifying the direction of the eigenvector corresponding to the Ritz value λ, which is the current best approximation to a wanted eigenvalue of A. Following the discussion in the


previous subsection, Chebyshev filtering offers an alternative that was shown to improve global convergence as well as robustness. The corresponding Chebyshev-filtered Jacobi-Davidson algorithm is fully described in Zhou and Saad. 46 Zhou further improved this algorithm and developed a block version, BlockChebDav, with inner-outer restart to improve convergence and reduce its memory footprint. 52 An overview of BlockChebDav is given in Algorithm 4.

Algorithm 4 The BlockChebDav method
Input: matrix A; estimate of largest eigenvalue of A, λmax; active space size, kact; Chebyshev polynomial order, m; number of eigenvalues needed, kwant; block size, kblock; maximum number of iterations, itmax; tolerance on eigen pairs, TOL.
(1) Initialize: E = [ ], H0 = [ ], Vt[:, 1 : kblock] ← random, kc = 0, ka = 0, a = λmax/8, b = λmax, i = 0.
while i < itmax do
  (2) Filter the block with the Chebyshev filter of degree m on [a, b]: Vt ← pm(A)Vt (Algorithm 2)
  (3) Orthonormalize the filtered vectors against Vi (converged eigenvectors and current search space) and append them to form Vi+1
  (4) Form the Rayleigh-Ritz projection Hi+1 of A onto the active part of Vi+1
  (5) Compute the eigendecomposition of Hi+1; store the Ritz values in the diagonal matrix D and rotate the active columns of Vi+1 by the Ritz vectors
  (6) Test the first Ritz pairs against TOL; append the knew newly converged eigenvalues to E and set kc = kc + knew; if kc ≥ kwant return E and Vi+1
  (7) Update Hi+1[1 : ka, 1 : ka] = D[knew + 1 : knew + ka, knew + 1 : knew + ka]
  (8) Update a = median(D[knew + 1 : knew + ka, knew + 1 : knew + ka])
  (9) Vt[:, 1 : kblock] = Vi+1[:, kc + 1 : kc + kblock]
  (10) if ka + kblock > kact then ka = max(kact/2, ka − 3kblock)
  i = i + 1
end while

The pseudo-code in Algorithm 4 follows the Matlab conventions for the representation of matrices. Briefly, M[i : j, k : l] denotes the submatrix of M between rows i and j, and columns k and l. When any of the indices is omitted, the range either starts at 1 or ends at the maximum row or column size of M; for example, M[:, k : l] denotes the submatrix that contains all full columns between k and l.

Step 1 of the algorithm serves as initialization. The array E of eigenvalues and the intermediate matrix H are set empty. ka and kc are two counters that track the size of the current search space and the number of converged eigen pairs; they are initialized to 0. The interval for the Chebyshev filter is initialized based on the estimated maximum eigenvalue λmax of the matrix A. Finally, a random set of kblock vectors of size n (where n is the size of the matrix A) is computed. In step 2, each of those vectors is filtered using the Chebyshev filter of degree m, as described in Algorithm 2. The filtered vectors are then made orthogonal to the current matrix V, which contains the converged eigenvectors and the current trial vectors of the search space (step 3). The Rayleigh-Ritz procedure (steps 4 and 5) is then applied to generate a set of potential Ritz values and Ritz vectors, stored in the matrices D and V, respectively. Those potential eigen pairs are checked in step 6. Converged eigenvalues are stored in E, and the counter kc is incremented accordingly. The matrix H is then updated (step 7), and the interval for the Chebyshev filter is modified to account for the part of the spectrum already identified (step 8). Finally, the first kblock non-converged Ritz vectors define the new Vt, and the procedure is iterated (step 9). If the current search space becomes too large, i.e. if ka becomes larger than the input kact, its size is reduced as given in step 10, which corresponds to the inner restart defined by Zhou. 52 The formula in step 10 is ad hoc and was originally proposed by Zhou; it was found to work well on all cases included in this study. We note that the three main parameters for this method are the order of the Chebyshev polynomial filter, m, the maximum size of the search space, kact, and the tolerance on the eigen pairs, TOL. Their importance will be discussed below in the context of normal mode computations.
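For illustration, the basic loop of Algorithm 3 can be sketched with dense matrices as follows. This is not the BlockChebDav scheme: the correction equation of step (6) is solved exactly with a least-squares solve, where a practical implementation would use a few steps of an iterative solver or a Chebyshev filter, and all names are ours.

```python
import numpy as np

def jacobi_davidson(A, u1, tau, tol=1e-10, itmax=20):
    """Dense-matrix sketch of Algorithm 3 for symmetric A: expand a search
    space with correction vectors and extract the Ritz pair closest to tau."""
    n = A.shape[0]
    U = np.zeros((n, 0))
    s = u1
    for _ in range(itmax):
        # (1) Gram-Schmidt s against the search space, then expand it
        s = s - U @ (U.T @ s)
        s = s / np.linalg.norm(s)
        U = np.hstack([U, s[:, None]])
        # (2)-(3) project A and take the Ritz pair closest to the target
        theta, C = np.linalg.eigh(U.T @ A @ U)
        j = np.argmin(np.abs(theta - tau))
        lam, u = theta[j], U @ C[:, j]
        # (4)-(5) residual test
        r = A @ u - lam * u
        if np.linalg.norm(r) < tol:
            return lam, u
        # (6) correction equation (I - uu^T)(A - lam*I)(I - uu^T) s = -r
        P = np.eye(n) - np.outer(u, u)
        s, *_ = np.linalg.lstsq(P @ (A - lam * np.eye(n)) @ P, -r, rcond=None)
    return lam, u

# Example: eigenvalue of a random symmetric matrix closest to a target
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((20, 20)))
A = Q @ np.diag(np.linspace(0.0, 5.0, 20)) @ Q.T   # known spectrum [0, 5]
lam, u = jacobi_davidson(A, rng.standard_normal(20), tau=0.9)
```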

3.4 Solving an eigenvalue problem through minimization of an energy functional

The last method we consider has its roots in physics and was originally proposed by Dykeman and Sankey. 19,26 We describe it here from an applied mathematics perspective. For


a matrix H of size N × N for which we want to find the M lowest-frequency eigenvalues and corresponding eigenvectors, Dykeman and Sankey have considered a phonon energy functional G defined as 26

G = tr {D + (I − S)D}        (26)

where "tr" stands for the trace, and D = U^T AU and S = U^T U are the "Hamiltonian" and "overlap" matrices in R^{M×M}. The M columns of the matrix U are vectors in R^N. The matrix A is defined as A = H − λmax I, where λmax is the largest eigenvalue of H; this shift makes the whole spectrum of A non-positive, which in turn guarantees that the functional G has a global minimum. The largest eigenvalue λmax can easily be estimated using a few iterations of the power method or of the Lanczos method described above. The minimization of G is performed over the elements of the matrix U. The columns of the converged matrix Umin form an orthonormal basis of the space spanned by the lowest states associated with A. The actual eigenvectors are obtained by first solving the generalized M × M eigenvalue problem DV = θSV. The corresponding eigenvalues for H are λi = θi + λmax, and the associated eigenvectors E are given by E = Umin V. Finding the minimum of a function over R^{N×M} (the size of the matrix U) is usually considered a difficult problem. However, in this specific case there is an efficient implementation of the conjugate gradient method for finding that minimum. At step n of the conjugate gradient method, the gradient of G with respect to the matrix U is given by:

gn = dG/dU = 4AU − 2AUS − 2UD        (27)

The conjugate gradient search direction pn is then given by

pn = −gn + β pn−1        (28)


where β is computed using the Polak-Ribière variant of conjugate gradient. The matrix U is then updated using a step of size γ along the direction pn, Un+1 = Un + γpn, where γ is obtained through a line minimization. The key to an efficient implementation is that this line minimization can be performed analytically, by requiring that the corresponding new gradient be perpendicular to the old search direction; this yields a cubic equation for γ that can easily be solved. 47 The minimization of the functional G is then performed using conjugate gradient with this analytical line minimization. It is stopped when G no longer changes, within a tolerance TOL.
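A dense-matrix sketch of the whole scheme is given below. The analytic cubic line minimization is emulated here by fitting the exact quartic G(Un + γpn) at five sample points and minimizing it; this is our own shortcut for illustration, not the implementation of Ref. 47, and all names are ours.

```python
import numpy as np

def G(U, A):
    """Phonon functional G = tr{D + (I - S)D}, with D = U^T A U, S = U^T U."""
    S = U.T @ U
    D = U.T @ (A @ U)
    return np.trace(2.0 * D - S @ D)

def lowest_eigenpairs(H, M, iters=500):
    """Find the M lowest eigenpairs of symmetric H by minimizing G with
    Polak-Ribiere conjugate gradient and exact (polynomial) line search."""
    n = H.shape[0]
    lmax = np.linalg.eigvalsh(H)[-1]       # in practice: a few Lanczos steps
    A = H - lmax * np.eye(n)               # shift the spectrum below zero
    U = np.linalg.qr(np.random.default_rng(2).standard_normal((n, M)))[0]
    g_old, p = None, None
    for _ in range(iters):
        AU, S, D = A @ U, U.T @ U, U.T @ (A @ U)
        g = 4.0 * AU - 2.0 * AU @ S - 2.0 * U @ D      # gradient, eq. (27)
        if np.linalg.norm(g) < 1e-10:
            break                                      # G no longer changes
        if g_old is None:
            p = -g
        else:                                          # Polak-Ribiere beta
            beta = max(0.0, np.sum(g * (g - g_old)) / np.sum(g_old * g_old))
            p = -g + beta * p
        g_old = g
        d = p / np.linalg.norm(p)
        # G(U + x*d) is an exact quartic in x: fit it, then minimize it
        xs = np.linspace(0.0, 1.0, 5)
        coef = np.polyfit(xs, [G(U + x * d, A) for x in xs], 4)
        crit = np.roots(np.polyder(coef))
        crit = crit[np.abs(crit.imag) < 1e-8].real
        U = U + crit[np.argmin(np.polyval(coef, crit))] * d
    # recover eigenpairs from the generalized problem D V = theta S V
    S, D = U.T @ U, U.T @ (A @ U)
    L = np.linalg.cholesky(S)
    Li = np.linalg.inv(L)
    theta, W = np.linalg.eigh(Li @ D @ Li.T)
    return theta + lmax, U @ (Li.T @ W)

lams, E = lowest_eigenpairs(np.diag(np.arange(1.0, 31.0)), M=4)
```

On the diagonal test matrix the four recovered eigenvalues converge to the four lowest entries of the diagonal.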

4 Mapping conformational changes onto normal modes

Let us consider a molecular system S with M atoms for which we have two conformations, A and B. The total change between those two conformations is captured by a displacement vector D such that D = B − A. Let us now consider a set of K normal modes for S in conformation A. These normal modes have been computed from the eigenvalues Λ and eigenvectors V of the Hessian of an elastic network for A. Under the normal mode model, the dynamics of A can be described as a linear superposition of the fundamental motions described by those eigenvectors. The corresponding dynamics that bring A closer to B are obtained by assigning the weights W of the modes in this superposition through projections of the displacement vector onto the eigenvectors:

W = V^T D        (29)

The contribution of mode i to this optimal collective change of conformation can then be measured as the absolute value of the cosine of the angle between the displacement and the


direction of the mode, given by column vi in the matrix V :

Oi = |⟨vi , D⟩| / (||vi || ||D||)        (30)

Oi takes values between 0 and 1, with small values indicating that mode i contributes little to the conformational change, while large values indicate a significant contribution. We note that Σ_{i=1}^{3M} Oi² = 1, as the vi are normalized to 1 and are orthogonal to each other. Then SOk = Σ_{i=1}^{k} Oi² is a measure of the contribution of the first k normal modes to the total overlap between the normal modes of A and the displacement between A and B. Using the optimal weights defined in equation 29, the new conformation Ak of S obtained by applying k of the K modes of A is given by:

Ak = A + Σ_{i=1}^{k} Wi vi        (31)

With some basic linear algebra, and taking into account the orthogonality of the eigenvectors, it can be shown that the coordinate root mean square (cRMS) distance between Ak and the target conformation B, denoted RMSk, is given by:

RMSk² = RMS(A, B)² − (1/M) Σ_{i=1}^{k} Wi²        (32)

where RMS(A, B) is the cRMS between A and B, computed as ||B − A||/√M after optimal rigid-body transformation of B onto A to minimize this RMS value. Note that the more normal modes we include, the smaller RMSk becomes.
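Equations 29-32 are easy to verify numerically on synthetic data (a random orthonormal matrix V stands in for the eigenvectors, and all names are ours):

```python
import numpy as np

M = 50                                       # number of atoms
rng = np.random.default_rng(3)
V, _ = np.linalg.qr(rng.standard_normal((3 * M, 3 * M)))  # mock eigenvectors
A = rng.standard_normal(3 * M)               # conformation A, flattened coords
B = A + V[:, :10] @ rng.standard_normal(10)  # B differs along 10 modes only
D = B - A                                    # displacement vector

W = V.T @ D                                  # optimal weights, equation (29)
O = np.abs(W) / np.linalg.norm(D)            # overlaps, equation (30), ||vi|| = 1
rms_AB = np.linalg.norm(D) / np.sqrt(M)      # cRMS(A, B)
# RMS_k after applying the first k modes, equation (32)
rms_k = np.sqrt(np.maximum(rms_AB**2 - np.cumsum(W**2) / M, 0.0))
```

By construction the first ten modes capture the whole change, so rms_k vanishes from k = 10 on, while the squared overlaps sum to one.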


5 Implementation

We have implemented the different methods described above in a stand-alone application, NormalModes, written mainly in C++, with some calls to libraries in Fortran (see below). The source code is available upon request from the author. In the following, we highlight some of the elements of NormalModes that are relevant to the analysis of large systems. The program first reads the conformation of a biomolecule from a file in the PDB format. 63 Each line starting with the "ATOM" tag within this PDB file is processed and interpreted to extract the positions of the atoms of the molecule. One of the parameters provided to NormalModes enables the user to choose between an "all-atom" representation and a coarse representation in which only one atom is kept for each residue, its Cα. This structural information is then used to define the Elastic Network Model. In its current implementation, NormalModes only allows for a simple cutoff model. Namely, the network is defined as a set of links, with a link between two atoms only if the corresponding interatomic distance is smaller than a given cutoff value Rc (see section 2 above). If the number of atoms in the molecule is M , a brute-force approach to defining this network has a complexity of O(M²). This complexity can easily be reduced to O(M) by using the linked-list cell MD algorithm, 64 and reduced even further by distributing the cells to be checked uniformly over the p processors available for the computation, leading to a complexity of O(M/p). We have implemented this solution without further refinement, as the average clock time observed for a system of one million atoms is of the order of a few seconds on a desktop computer, a time that is negligible compared to the computing time required to find the eigen pairs of the Hessian of the elastic potential.
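The cell-list construction can be sketched as follows (a simple serial version for illustration; the production code additionally distributes the cell scan over processors, and all names are ours):

```python
from collections import defaultdict
from itertools import product
import numpy as np

def build_network(coords, rc):
    """O(M) construction of a cutoff elastic network: bin atoms into cubic
    cells of side rc, then test only pairs drawn from neighboring cells."""
    cells = defaultdict(list)
    for i, p in enumerate(coords):
        cells[tuple((p // rc).astype(int))].append(i)
    links, rc2 = [], rc * rc
    for cell, members in cells.items():
        for off in product((-1, 0, 1), repeat=3):    # the 27 nearby cells
            other = tuple(c + o for c, o in zip(cell, off))
            for i in members:
                for j in cells.get(other, ()):
                    # keeping only i < j records each link exactly once
                    if i < j and np.sum((coords[i] - coords[j])**2) <= rc2:
                        links.append((i, j))
    return links

coords = np.random.default_rng(5).uniform(0.0, 20.0, (200, 3))
links = build_network(coords, rc=4.0)
```

Each atom is compared only against atoms in its own and adjacent cells, so the cost stays linear in M for roughly uniform densities.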
NormalModes includes the four different methods for computing eigenvalues discussed above, namely the standard Lanczos method, the Chebyshev-filtered Lanczos method, the Chebyshev-filtered block Jacobi-Davidson method, and the phonon functional method. For the Lanczos method, we have simply implemented C++ wrappers to the ARPACK code, 40 which is written in Fortran 77. The Chebyshev-filtered Lanczos method uses the same implementation, with two simple changes. First, the call to the Hessian-vector product in ARPACK is replaced with a call to a Chebyshev filtering (see Algorithm 2 above). Second, the eigenvalues generated by ARPACK are replaced with the Ritz values obtained from the eigenvectors using the Rayleigh quotient (see equation 20). Note that this is not per se a Chebyshev-Lanczos method; its implementation, however, is straightforward once the Lanczos method has been implemented. For the Chebyshev-filtered block Jacobi-Davidson method, BlockChebDav, we use our own implementation of the algorithm proposed by Zhou. 52 Finally, the implementation of the phonon functional method is straightforward, following the Polak-Ribière variant of conjugate gradient with the analytical expression for the line search, as described above and in Dykeman and Sankey. 47

All four methods described above require at their core a procedure to compute the product of the Hessian of the elastic energy with a vector. We have implemented two parallel versions of this procedure, both based on the tensor representation of this product (see Equation 17). For any request for a single Hessian-vector product, we divide the sum over all links in the elastic network over the p processors available. When a block of K Hessian-vector products is requested, however, we split those K products over the p processors and perform each individual product sequentially. Finally, we have used the BLAS level-2 and level-3 routines as much as possible, with a preference for the latter when possible.
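The edge-wise product at the core of all four methods can be sketched as follows, assuming the standard Hookean elastic-network Hessian whose 3 × 3 block for a link (i, j) is k uu^T along the unit bond vector u (a serial illustration; function names are ours):

```python
import numpy as np

def hessian_vector(coords, links, v, k=1.0):
    """Hessian-vector product for an elastic network, accumulated edge by
    edge without ever assembling the full 3M x 3M matrix."""
    Hv = np.zeros_like(v)
    for i, j in links:
        u = coords[i] - coords[j]
        u /= np.linalg.norm(u)              # unit vector along the link
        s = k * np.dot(u, v[i] - v[j])      # projection on the link axis
        Hv[i] += s * u                      # apply the k*u*u^T block with
        Hv[j] -= s * u                      # the +/- pattern of the edge
    return Hv

# Symmetry and the zero modes are easy to check on a random network
rng = np.random.default_rng(6)
M = 30
coords = 3.0 * rng.standard_normal((M, 3))
links = [(i, j) for i in range(M) for j in range(i + 1, M)
         if np.linalg.norm(coords[i] - coords[j]) < 5.0]
v = rng.standard_normal((M, 3))
w = rng.standard_normal((M, 3))
```

Rigid translations give vi − vj = 0 on every link and thus lie in the nullspace, consistent with the six zero frequencies reported in the Results section.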

6 Results and Discussion

We present computational results associated with the two topics of this paper, namely a validation of the tensor representation of the Hessian of the elastic energy of a biomolecule, and an analysis of the adequacy and efficiency of four different algorithms for computing a subset of the eigenvalues and eigenvectors of such Hessian matrices for very large molecular systems. We also illustrate the applications of those eigen pairs when studying conformational changes of those large molecular systems, and discuss the number of eigen pairs that are needed. All computational experiments described below were conducted on an Apple computer with an Intel i7 4 GHz processor with 4 cores (and two hyperthreads per core) and 64 GB of RAM. The program NormalModes was compiled using the Intel icpc and ifort compilers for its C++ and Fortran components, respectively, and linked against the Intel MKL library for all BLAS and LAPACK procedures used within the code.

6.1 The molecular systems

We have used two different virus systems as support for our studies, the Zika virus (ZIKV) and the West Nile virus (WNV), as they both form very large molecular systems. ZIKV is a member of the flaviviridae family of viruses. Its capsid has icosahedral symmetry and is formed of 60 asymmetrical units, with each unit containing three copies of the E protein and three copies of the membrane protein M. We have used the structure of the mature form of ZIKV, available in the PDB database with code 5IZ7. 65 A cartoon representation of ZIKV is given in Figure 1A. In this structure, each asymmetrical unit contains 1737 residues (3 × 504 E protein residues and 3 × 75 M protein residues) formed of 13,226 atoms. The full capsid includes 60 copies of this unit, for a total of 104,220 residues and 793,560 atoms. We have considered multiple versions of the structure, with the number of asymmetric units varying between 1 and 60, to assess the impact of size on computing normal modes. We have chosen WNV to illustrate conformational changes of large molecular systems. WNV is another member of the flaviviridae family of viruses, whose capsid resembles the ZIKV capsid, with the same composition, the same icosahedral symmetry, and the same number (60) of asymmetrical units (see Figure 1B, C, and D). We have used three different structures of WNV: an immature form, with PDB code 2OF6, 66 an immature form complexed with FAB fragments to mimic the effects of antibody binding to a viral capsid, with PDB code 3IXX, 67 and its mature form, with PDB code 3J0B. 68 We extracted from those three files all



copies of the E proteins. As only the Cα atoms were present in the three PDB files, we obtained three conformations of the viral capsid of WNV, each with 71,100 atoms.

Figure 1: Capsids of flaviviridae. A) Space-filling, all-atom model of the mature form of the capsid of ZIKV (PDB file 5IZ7). The capsid includes 180 copies of both protein E and protein M. We omit the latter in the display as they point inside the capsid. The three E proteins from each asymmetric unit are colored green, orange, and blue. B) Space-filling, Cα-only model of the mature form of the capsid of West Nile virus (PDB code 3J0B), using the same color code as in panel A. Note the similarity of the architecture of the two capsids, which have approximately the same size in their mature forms. C) Space-filling, Cα-only model of the immature form of the capsid of West Nile virus (PDB code 2OF6), using the same color code as in panel B. The capsid undergoes a major rearrangement as it goes from its immature to its mature conformation; in particular, the immature form is 10% bigger than the mature form. D) Space-filling model of the immature form of the capsid of West Nile virus, bound to FAB fragments, in magenta (PDB code 3IXX). All four panels were generated using Pymol (http://www.pymol.org).

In Figure 2 we compare the frequencies of the first fifty normal modes of the elastic networks of the full capsids of ZIKV and WNV. The elastic networks were computed using a full-atom representation with a cutoff Rc = 8 Å for ZIKV, and a Cα-only representation with a cutoff Rc = 15 Å for WNV. As expected, the first six frequencies are found equal


to zero for all systems considered, as those frequencies correspond to the rigid-body motions (three translations and three rotations). Of significance is the presence of degeneracies in the spectra, namely repeated frequencies with multiplicity 1, 2, 3, 4, or 5, as expected from the icosahedral symmetries of the capsids. The impact of those degeneracies will be discussed with respect to computing the eigenvalues of the Hessian. For WNV, we observe a decrease in the frequencies of the modes between the mature and immature forms of the capsids (see Figure 2B).

Figure 2: The low frequencies of the normal modes of ZIKV and WNV. (A) The frequencies of the first fifty normal modes of the full capsid of ZIKV are plotted against the mode number. The frequencies are in arbitrary units, as the force constants are also in arbitrary units. (B) The frequencies of the first fifty normal modes of the full capsid of WNV in its mature form (black), immature form (red), and immature form with FAB fragments (blue), are plotted against the mode number. Note the similarity between the spectra of the mature forms of ZIKV and WNV, and the decrease in the amplitudes of the mode frequencies between the mature and immature forms of WNV.

6.2 Faster Hessian-vector multiplications with the tensor representation of the Hessian

We first assessed whether rewriting the Hessian as a sum of tensor products leads to a more efficient implementation of Hessian-vector products. We considered different constructs of the capsid of ZIKV with varying numbers of asymmetrical units, from 1 to 60. We computed the elastic


networks of all those constructs, using all atoms and a cutoff Rc = 8 Å. The sizes of those elastic networks vary from 554,636 edges for one asymmetrical unit to 34,477,555 edges for sixty units. Note that the latter corresponds to 315,059,355 non-zero values (NNZ) in the Hessian matrix, taking into account the fact that it is symmetric. The storage of this matrix requires 2.52 GB in double precision. In contrast, the storage of the Hessian in its tensor representation only requires 827 MB in double precision. We computed the first 100 modes of the elastic network models corresponding to those constructs using the traditional Lanczos algorithm as implemented in ARPACK, in two ways that differ only in the method used to compute the Hessian-vector products. In the first approach, the Hessian is stored using the compressed sparse row format and the Hessian-vector products are computed accordingly. This part of the code is parallelized using OpenMP instructions. In the second approach, the Hessian is stored using its tensor representation and the Hessian-vector products are computed using equation 17. The sum over all edges in the elastic network is distributed over the different cores available using the standard pthread library. The computing times associated with those two approaches are compared in Figure 3. The tensor representation of the Hessian leads to more efficient computations of Hessian-vector multiplications, with the total computing time being reduced by up to 40% for small Hessian matrices, and up to 15% for large Hessian matrices (Figure 3A). More importantly, this tensor representation enables a better parallelization of those matrix-vector multiplications, as shown in Figure 3B. For small elastic networks, i.e. for Hessian matrices with small to moderate numbers of non-zero values, the overall calculation of normal modes is dominated by the computations that are internal to ARPACK. Those computations are efficient and parallel through the usage of optimized BLAS routines. As such, the cost of the matrix-vector multiplications has only a small impact, and the speedup due to the use of multiple cores is similar for the two representations of the Hessian. However, as the size of the Hessian increases, reaching hundreds of millions of NNZ, the impact of the multiplications on the total computing time increases. This increase has little effect on the speedup obtained with multiple cores when the Hessian is represented by tensors. It has, however, a significant impact when the Hessian is stored in sparse format, with the speedup decreasing rapidly. The parallelization that was implemented for the sparse representation of the Hessian matrix is naive, based on standard OpenMP, and it is likely that more optimized implementations exist.

Figure 3: CPU time and efficiency of Lanczos-based NormalModes for computing the first 100 normal modes of elastic networks of virus capsids. All calculations are based on the ARPACK implementation of the Lanczos method. Two sets of calculations are performed. In the first set, Hessian-vector multiplications are performed using a sparse representation of the Hessian (in black); in the second set, they are based on the tensor representation of the Hessian (in red). A. The total CPU times are plotted against the sizes of the elastic networks, given in terms of the number of non-zero values (NNZ) in their Hessians. B. The speedup (computed as the ratio of total CPU time to clock time) is plotted against the same number of non-zero values. All calculations are performed on an Intel i7 4 GHz processor with 4 cores and two hyperthreads per core.
The results presented here, however, illustrate that such optimization is not needed for the specific case in which an alternate representation of the matrix is available.


6.3 Computing a subset of the normal modes of very large elastic networks of virus capsids

Computational methods applied to molecular systems that rely on second-order analyses of a potential, usually in the form of a Hessian, are usually limited to small systems because of the time and space complexity necessary to manage such a Hessian. This is unfortunately the case for normal mode calculations. The simplified elastic network model enables applications of methods that scale linearly with the number of atoms, as it relies on short-range interactions. This does not, however, solve the time complexity associated with computing even a small number of normal modes of such elastic networks when the size of the system becomes large. To assess the importance of this problem and the efficiency of existing algorithms, we computed the first 100 modes of the elastic network models corresponding to the ZIKV capsid constructs described above using the four different eigenvalue algorithms described in section 3. Comparisons of the computing times required by those four methods are provided in Figure 4. The Lanczos method is the most common method used to compute a small subset of eigenvalues and eigenvectors of a Hessian for normal mode calculations. We have used here what is considered its most robust implementation, ARPACK, with default values for its parameters, setting the size of the Krylov subspace to twice the number of requested eigenvalues, namely 2 × 100. While it is found to be efficient and competitive compared to the other methods that we tested, it becomes inefficient for much larger systems. The total CPU time to compute the first 100 modes for the whole ZIKV viral capsid is 72,000 seconds, with a clock time of 17,200 s. As a reminder, this viral capsid contains 793,560 atoms, and the Hessian of its elastic network contains 3.15 × 10^8 NNZ. Of concern is the limited speedup induced by parallelization of the Hessian matrix products and by the use of optimized BLAS and LAPACK routines.
This speedup is of the order of 4, compared to an ideal value of 8 on the desktop computer that was used.

Figure 4: Comparing four methods for computing the first 100 normal modes of the elastic networks of virus capsids. A. The total CPU times are plotted against the sizes of the elastic networks, given in terms of the number of non-zero values (NNZ) in their Hessian. B. The speedup (computed as the ratio of total CPU time over clock time) is plotted against the same number of non-zero values. All calculations are performed on an Intel i7 4GHz processor with 4 cores, and two hyperthreads per core.

Interestingly, for the problem of computing normal modes, the Lanczos method, as implemented in ARPACK, can be significantly accelerated with a minor modification, namely Chebyshev filtering. As described in the Methods section, the modification amounts to replacing the Hessian-vector product HW with p(H)W, where p is a Chebyshev polynomial designed to filter out unwanted eigenvalues. We tested this method with a Chebyshev polynomial of order 5, keeping all other ARPACK parameters the same as above. We observed a significant improvement, with the total CPU time 50% lower for small systems and close to 100% lower for large systems. In addition, filtering led to a (slightly) better parallel speedup, as shown in figure 4B. The computing efficiency of the functional-based method was found to be intermediate between the Lanczos method and the Chebyshev-filtered Lanczos method (figure 4A). Its parallelization is more efficient, with a speedup of 5.5. There is, however, a significant difference between the results obtained with the Lanczos method (filtered or not) and the functional-based method. The convergence of the eigenpairs obtained with the Lanczos method is
assessed by computing the quality of the truncated factorization of the matrix (see equation 22). This leads to accurate eigenpairs (λ, v), with a mean accuracy of 10^-11, where accuracy is defined as ||Hv − λv||. The convergence of the functional-based method is assessed through the tolerance TOL used as a stopping criterion for the conjugate gradient minimization of the functional. With TOL = 10^-10, the eigenpairs have a mean accuracy of 5 × 10^-4, i.e., significantly worse than with the Lanczos procedure. We tried smaller values for TOL; this led to a significant increase in computing time, as the number of conjugate gradient iterations grows substantially, without ever reaching the accuracy of the Lanczos procedure. Larger values of TOL reduce the computing time, but at a cost in accuracy: with TOL = 10^-8, we observe a reduction of the total computing time, but a decrease in accuracy from 5 × 10^-4 to 5 × 10^-3. The Chebyshev-filtered, block version of the Jacobi-Davidson method was found to be the most efficient of the four methods tested. It was run with a block size kblock set to 16, a Chebyshev polynomial order of 100, the size of the active search subspace kact set to 400, the total size of the search space set to keig + kact (where keig = 100 is the number of eigenvalues computed), and a tolerance TOL = 10^-8 (see algorithm 3). The choices for those values are discussed below. It is on average 65% faster than traditional Lanczos (80% for the largest Hessians), and 30% faster than Chebyshev-filtered Lanczos (80% for the largest Hessians). The total CPU time to compute the first 100 modes for the whole ZIKV viral capsid is 21,000 seconds, with a clock time of 3,200 s. As the tolerance directly controls the accuracies of the identified eigenpairs, their mean accuracy was found to be 10^-8. A lower value for TOL would have led to more accurate eigenpairs; however, a mean accuracy of the order of the square root of the machine precision is usually deemed acceptable. The use of a block version enables efficient parallelization, as all the Hessian-vector multiplications can be distributed efficiently over the different processors, and as all matrix-matrix products benefit from optimized BLAS3 routines. The corresponding speedup is of the order of 6.6 on average, compared to an ideal value of 8.


6.4 Choosing the polynomial order for the Chebyshev filters

Two of the four methods considered here rely on Chebyshev filtering, namely the filtered Lanczos method and the Jacobi-Davidson method. In the comparisons of those two methods presented above, we used two different settings for the filtering, with the polynomial order of the filter set to 5 for the Lanczos method and to 100 for Jacobi-Davidson. The question then arises as to why we chose those different values for the two methods. We answer this question with the results presented in figure 5.
Figure 5: The effects of the Chebyshev polynomial degree on computing the first 100 eigenpairs of the elastic network of the full ZIKV capsid for the filtered Lanczos method (left) and the block Chebyshev Jacobi-Davidson method (right).

We computed the first 100 eigenpairs of the elastic network of the full ZIKV virus using the filtered Lanczos method and the block filtered Jacobi-Davidson method, for different degrees m of the Chebyshev polynomial used for filtering. For the filtered Lanczos method, the total CPU time needed to compute those 100 eigenpairs first drops as the degree m is increased to 6, and then increases linearly for degrees larger than 6. The total number of Hessian-vector products shows the same linear increase as a function of the polynomial degree. A similar behavior is observed for the block filtered Jacobi-Davidson method, but with a significant shift of the minimum in both computing time and number of Hessian-vector products, which is now observed for m between 80 and 100. Zhou had reported a
similar behavior for electronic structure calculations,52 with the main difference that he reported good values for the polynomial degree of the order of 40-50. That we found larger optimal values for m is not too surprising, as the eigenvalues for the full ZIKV capsid are clustered, with a high level of degeneracy (see Figure 2). In such situations, more aggressive filtering is expected to provide faster convergence. It is, however, surprising that the filtered Lanczos method becomes much less efficient for aggressive filtering. We note that our implementation is not a true filtered Lanczos method, as we only modify the Hessian-matrix product and do not change the Lanczos procedure itself.
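The filtering step itself reduces to a three-term Chebyshev recurrence applied to a block of vectors; no polynomial coefficients are ever formed. A minimal sketch of our own (not the paper's code), assuming spectral bounds a < b that enclose the unwanted upper part of the spectrum:

```python
import numpy as np

def chebyshev_filter(matvec, w, m, a, b):
    """Return T_m(L) w, where L = (H - c I) / e maps the unwanted interval
    [a, b] of H's spectrum onto [-1, 1]. Eigenvalues inside [a, b] are
    damped (|T_m| <= 1 there) while those below a are strongly amplified,
    which steers the iterations toward the low-frequency modes."""
    e = (b - a) / 2.0          # half-width of the damped interval
    c = (b + a) / 2.0          # its center
    L = lambda x: (matvec(x) - c * x) / e
    if m == 0:
        return w
    t_prev, t_cur = w, L(w)    # T_0 w and T_1 w
    for _ in range(2, m + 1):  # T_{k+1} = 2 L T_k - T_{k-1}
        t_prev, t_cur = t_cur, 2.0 * L(t_cur) - t_prev
    return t_cur
```

Applied to a diagonal matrix, the result agrees component-wise with evaluating the degree-m Chebyshev polynomial at the mapped eigenvalues, which makes the amplification of the wanted modes easy to verify.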

6.5 Parameterizing the block filtered Jacobi-Davidson method

Figure 6: Parameterizing the block filtered Jacobi-Davidson method for normal mode calculations. The effects of the block size and of the active search space size on computing the first 100 eigenpairs of the elastic network of the full ZIKV capsid are shown in panels (A) and (B), respectively.

In addition to the degree of the polynomial filter, two other parameters influence the running time of the block filtered Jacobi-Davidson method, namely the block size, kblock, and the size of the active search subspace, kact. To assess their importance, we computed the first 100 modes of the elastic network model corresponding to the full ZIKV capsid over ranges of their values, keeping all other parameters fixed. Results are shown in figure 6. All calculations over different block sizes were performed with a large value of kact, set to 400. The total CPU time required to compute the first 100 eigenvalues for the full ZIKV capsid decreases as the block size is increased up to 16, and then increases significantly. We note that the decrease over the range 1 to 16 is not monotonic, with clear minima at 8 and 16 and local increases before and after those values. This behavior finds its origin in the way we implemented the parallelization of the matrix-vector multiplications. When the block size is a multiple of the number of processors available (8 for our calculations), each processor performs a set of full matrix-vector operations. Otherwise, some single matrix-vector multiplications are spread over the different processors; this second mode of parallelization is clearly less efficient than the first. When the block size is larger than 16, the total computing time increases, even when this block size is a multiple of the number of processors. However, we cannot exclude that the position of the minimum, 16, is a consequence of using only 8 processors. We note that the total number of matrix-vector multiplications increases monotonically with the block size and does not directly reflect the total computing time. This was already reported by Zhou for applications of the block filtered Jacobi-Davidson method to electronic structure calculations.52 The active search subspace, with size kact, is part of the inner-outer restart procedure included in BlockChebDav.52 This procedure was introduced to reduce the computational cost by decreasing the memory footprint of the eigenvalue search space. Zhou had shown for electronic structure calculations that small values of kact (of the order of 100) were usually sufficient to compute 100 eigenvalues. For normal mode calculations, however, we have found that larger values of kact significantly reduce the total computing time, with best values around 400; increasing the active search space beyond 400 does not improve the computation time. We assign this difference to the nature of the eigenvalue problem when computing normal modes of large virus capsids with high levels of symmetry. These symmetries lead to high degeneracies in the eigenvalue spectrum (see Figure 2), which are best handled with larger kact values. Interestingly, the
total number of matrix-vector multiplications follows the total computing time as a function of the active search space size. A search space of size 400 is larger than the one used for the Lanczos procedure, which was set to 200 (see above). We repeated the calculations for both the traditional Lanczos method and the filtered Lanczos method with a larger search space of 400, and saw no improvement in computing time compared to a search space of size 200.
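The divisibility argument above can be made concrete. A sketch of our own (not the paper's scheduler) of how a block of matvecs splits over workers:

```python
def split_block(kblock, nproc):
    """Assign kblock matrix-vector products to nproc workers.
    When kblock is a multiple of nproc, every worker performs whole
    matvecs; otherwise the load is uneven and some single matvecs
    must be spread across processors, which parallelizes worse."""
    base, extra = divmod(kblock, nproc)
    return [base + (1 if w < extra else 0) for w in range(nproc)]

# 16 vectors on 8 cores: every worker gets exactly 2 full matvecs.
assert split_block(16, 8) == [2] * 8
# 12 vectors on 8 cores: uneven, 4 workers do 2 and 4 workers do 1.
assert split_block(12, 8) == [2, 2, 2, 2, 1, 1, 1, 1]
```

This is why the minima in the CPU-time curve sit at block sizes of 8 and 16 on an 8-processor machine.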

6.6 Computing a much larger number of eigenpairs

All comparisons presented above focused on computing the hundred smallest eigenvalues, and corresponding eigenvectors, of very large Hessian matrices. We now expand those analyses to computing a much larger set, namely the 5000 smallest eigenvalues of the elastic network corresponding to the full WNV capsid in its immature form. The choice of this capsid is justified below. We note that had we considered the full ZIKV capsid with its 793,560 atoms, storing all 5000 eigenvectors of its elastic network would have required more than 95 GB of memory, putting this calculation out of reach of a regular desktop computer. We use a coarse-grained representation of the WNV capsid, with one atom per residue (its Cα). The full capsid is then represented with 71,100 atoms. We computed the elastic network of this capsid using a cutoff Rc = 15 Å. This elastic network contains 1,765,740 edges, corresponding to 16,318,260 non-zero values in the Hessian. We then computed the 5000 smallest eigenvalues of this Hessian using the traditional Lanczos method with a search space of size 10,000; the filtered Lanczos method with a Chebyshev polynomial of degree 5 and the same search space of size 10,000; the functional-based method with a tolerance set to 10^-8; and the block filtered Jacobi-Davidson method with a Chebyshev polynomial of degree 100, a block size of 16, a total search space of size 10,000, and an active search subspace of size 1000. All those values were chosen based on the analyses presented above, with the exception of the active search space size for the Jacobi-Davidson
method, which was set to 1000 to reduce its memory footprint. The total computing times and corresponding Wall times are given in table 1.

Table 1: CPU time for computing the first 5000 eigenpairs of the Hessian of a full viral capsid with 71,100 atoms

Method            CPU time (s) a   Wall time (s) a   Speedup b   Improvement c   Accuracy d
Lanczos           611608           163937            3.74        1               2 × 10^-14
Filtered Lanczos  261258           69808             3.74        2.34            2 × 10^-14
Functional        234812           51324             4.6         3.2             0.5
BlockChebDav      94352            20173             4.7         8.2             5 × 10^-9

a) Total CPU time and corresponding Wall time on an Intel i7 4GHz processor with 4 cores (and two hyperthreads per core) and 64 GB of RAM. b) Ratio of the CPU time over the Wall time; this speedup defines the quality of the parallelization. The ideal value for the system considered is 8. c) Ratio of the Wall time for the Lanczos method over the method's own Wall time; this ratio defines the improvement compared to the robust ARPACK method. d) The accuracy is computed as the mean value of ||Hv − λv|| over all 5000 eigenpairs (λ, v) computed.
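As a consistency check on the system sizes quoted above, the reported NNZ count matches storing only the upper triangle of the 3N × 3N Hessian: 6 entries per symmetric 3 × 3 diagonal block, and 9 per off-diagonal (edge) block. This storage scheme is our inference from the numbers, not stated explicitly in the text:

```python
def nnz_upper_triangle(n_atoms, n_edges):
    """Non-zero count of an ENM Hessian stored as its upper triangle:
    6 entries per symmetric 3x3 diagonal block (one per atom) plus
    9 entries per full 3x3 off-diagonal block (one per edge)."""
    return 6 * n_atoms + 9 * n_edges

# WNV coarse-grained capsid: 71,100 atoms and 1,765,740 edges
# reproduce the 16,318,260 NNZ reported in the text.
assert nnz_upper_triangle(71_100, 1_765_740) == 16_318_260
```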

The block filtered Chebyshev Jacobi-Davidson method (BlockChebDav) remains the method of choice when computing a large number of eigenvalues, with differences in computing time with respect to the other methods even larger than those observed for a smaller number of eigenvalues. The corresponding wall time for computing 5000 eigenpairs for this large system is approximately 5 hours and 30 minutes on a four-core, hyperthreaded desktop computer. While not small, it makes such a computation feasible. In comparison, it takes 46 hours to compute the same number of eigenpairs using the standard ARPACK package, and close to 20 hours with the filtered Lanczos method. While Chebyshev filtering clearly improves the running time of the Lanczos method, its impact remains more significant for the block Jacobi-Davidson method. We note that the functional method is seemingly faster than the filtered Lanczos method, but still significantly slower than BlockChebDav. The functional method, however, leads to eigenpairs that are not accurate. As mentioned above, the convergence of the conjugate gradient method is based on the convergence of the functional G itself, measured as the relative change |G_current − G_old| / G_old. When the value of the functional is large, the procedure may end prematurely. Reducing the tolerance may improve the overall accuracy; this will however increase the computing time significantly.
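The stopping test described here, and its failure mode, can be sketched as follows (the function name and numerical values are illustrative, not the paper's code):

```python
def converged(g_current, g_old, tol=1e-8):
    """Stopping test for the functional minimization: relative change
    |G_current - G_old| / |G_old| below tol. Caveat from the text:
    when G itself is still large, a small *relative* change can stop
    the iteration while the *absolute* error is far from small."""
    return abs(g_current - g_old) / abs(g_old) < tol

# A large functional changing by 5.0 per iteration already "converges"
# at tol = 1e-5, even though the absolute change is not small at all.
assert converged(1.0e6 - 5.0, 1.0e6, tol=1e-5)
```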


Figure 7: Total CPU time (left axis) and parallelization speedup (right axis) of the block filtered Jacobi-Davidson method as a function of the number of converged eigenvalues of the elastic network corresponding to the immature form of the full WNV capsid. This elastic network contains 71,100 atoms and 1,765,740 edges.

The computing time for BlockChebDav remains linear as a function of the number of eigenvalues computed, up to at least 4000 eigenvalues, as illustrated in Figure 7. Above 4000 eigenvalues, however, we notice a significant slowdown. There are several reasons for this, including the fact that every new putative eigenvector has to be orthogonalized against the set of current eigenvectors, which has by then become large. If more than 5000 eigenvalues are needed, new strategies will probably have to be implemented, such as the spectrum slicing approach proposed by Saad and co-workers.69 We note that the speedup introduced by parallelization drops faster with the number of eigenvalues; it remains significant, however, and larger than 4.5.
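The projection step behind that growing cost can be sketched with classical Gram-Schmidt against the block of already-converged eigenvectors; this is our own illustration, not the BlockChebDav code, and its O(n·k) cost per candidate explains why the iteration slows as k grows:

```python
import numpy as np

def orthogonalize_against(v, converged_vecs):
    """Project out of v its components along the already-converged
    eigenvectors (the orthonormal columns of converged_vecs), then
    normalize. Cost is O(n * k), so it grows with every converged
    eigenpair, which contributes to the slowdown seen above ~4000."""
    v = v - converged_vecs @ (converged_vecs.T @ v)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((50, 10)))  # 10 "converged" vectors
w = orthogonalize_against(rng.standard_normal(50), Q)
```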


6.7 How many normal modes do we actually need?

The question arises as to how many normal modes are actually needed. We address this question with respect to predicting conformational changes in proteins. Petrone and Pande had shown that for small systems with up to 6000 atoms, up to thousands of normal modes were needed to accurately map the conformational changes between two known conformations.34 We repeated their analyses for WNV, for which we have three conformations of the capsid: an immature form, an immature form with FAB fragments bound, and a mature form (see section 6.1 above). We computed normal modes for different elastic networks corresponding to the immature form of WNV. Those networks differ in their cutoff values, Rc, varying from 12 Å to 30 Å. We then assessed how well the first 5000 modes of those networks map the conformational changes to the two other forms of the capsid. The mapping was measured using both an overlap value between each mode and the conformational change, and a coordinate RMS between the conformation generated by combining the normal modes and the target conformation (see section 4). Note that in this analysis we use the eigenvectors simply as a coordinate basis onto which the structural changes are projected; the frequency associated with each normal mode is only used to rank the vectors. Results are shown in figure 8 for the two transitions immature → immature+FAB and immature → mature. We used a Cα representation of WNV in this analysis. Standard cutoff values for the elastic networks of such systems are in the range 13 to 15 Å.55 The first WNV network we analyze has its cutoff value in that range, namely Rc = 15 Å.
Figure 8: Overlaps between normal modes and conformational changes (left panels) and cRMS between the best mapped conformation and the target conformation (right panels) for the transition between the immature form of WNV (PDB code 2OF6) and the immature form of WNV with FAB fragments bound to it (PDB code 3IXX), and for the transition between the same immature form and the mature form of WNV (PDB code 3J0B). The normal modes were computed based on the immature structure. Overlaps on the left are shown for a single cutoff distance of the elastic network, set to 15 Å, while the cRMS curves are given for a range of cutoff distances Rc. Modes are indexed by increasing frequency.

The overall contributions of the first 1000, 3000, and 5000 modes of that network to the collective motions mapped onto the conformational changes are 56%, 90%, and 93% for the transition from the immature form to the immature form with FAB, and 56%, 95%, and 97% for the transition from the immature form to the mature form. These two examples show that even though the individual contributions of the modes tend to decrease as their frequencies increase, we need to include high-frequency modes, up to at least mode number 5000, to map the conformational changes in the viral capsid. We note, however, that 5000 modes represent only 2% of the total number of modes for the elastic network of the viral capsid. As we keep adding modes to the computed collective motions mapped onto the conformational changes, the cRMS between the mapped conformation and the target conformation decreases. For the transition between the two immature forms of the WNV capsid (without and with FAB fragments), this cRMS decreases from 4.94 to 1.33. The change is even more drastic for the transition between the immature and mature forms of the WNV capsid, as the cRMS
decreases from 85.1 to 14.7. Interestingly, the choice of the cutoff value Rc used to construct the elastic network of WNV has a significant impact on how effectively the normal modes of that network reproduce conformational changes (see the right panels in figure 8). If we reduce Rc from 15 Å to 12 Å, we observe only small changes. Decreasing Rc even further introduces problems in the network, as it then splits into multiple components (this is especially the case for the immature form of the viral capsid, which is looser than the mature form). However, when we increase Rc to significantly larger values, up to 30 Å, we find that the lowest 5000 modes are no longer sufficient to capture the transitions in the conformation of the WNV capsid. The overall contribution of the first 5000 modes to the conformational change from the immature form to the mature form of WNV decreases from 98% to 75% as the cutoff is increased from 12 Å to 30 Å, with a similar decrease for the transition induced by FAB binding. Larger values of Rc lead to denser elastic networks. In addition, they lead to more rigidity between the different E proteins in the virus capsid, as the elastic networks include more links between those proteins. These two effects make it harder for those subunits of the viral capsid to rearrange through low-frequency collective motions. Increasing the cutoff value Rc thus has two adverse effects. First, larger values of Rc lead to larger elastic networks, and therefore to Hessian matrices that are less sparse: the elastic network for the immature form of WNV contains 1,026,912 edges when Rc = 12 Å, and 7,740,024 edges when Rc = 30 Å, i.e., an increase in size by a factor of 7. Second, based on the observations described above, more normal modes are needed to capture a significant fraction of the transition between two conformations of the WNV capsid. Generating more normal modes comes at a high computational cost for such large systems, especially in the range of several thousand modes. We therefore recommend using a small cutoff value, in the range 12 Å to 15 Å, as previously discussed.55 We note that while the cRMS reported here would drop to 0 if we included all normal modes (as the displacement vector is projected onto the eigenvectors, which form an orthonormal basis),
the conformational changes may not preserve the stereochemistry of the molecular system. Indeed, Cartesian normal modes based on a simple potential such as the elastic potential, when weighted with possibly large amplitudes, are likely to stretch the structures in non-physical ways. This is a concern that will need to be addressed.
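The mapping procedure — projecting a displacement onto the mode basis and tracking the residual left after the first m modes — can be sketched as follows. This is our own minimal version; the paper's cRMS additionally normalizes by the number of atoms.

```python
import numpy as np

def cumulative_mapping(modes, displacement):
    """Project a conformational change onto a set of orthonormal normal
    modes (columns of `modes`). Returns the per-mode overlap coefficients
    and the norm of the residual displacement after keeping the first
    1..m modes, the quantity behind the decreasing cRMS curves."""
    coeffs = modes.T @ displacement               # overlap of each mode
    captured = np.cumsum(coeffs ** 2)             # norm^2 captured so far
    residual = np.sqrt(np.maximum(displacement @ displacement - captured, 0.0))
    return coeffs, residual

# Toy check: with a complete orthonormal basis the residual must vanish,
# mirroring the statement that the cRMS drops to 0 with all modes included.
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((12, 12)))
d = rng.standard_normal(12)
coeffs, res = cumulative_mapping(Q, d)
```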

7 Concluding remarks

Computational methods, from all-atom molecular dynamics simulations to coarse-grained normal mode analyses based on simplified elastic networks, provide a general framework for studying the dynamics of biomolecular systems. Despite recent successes in studying very large systems with up to a hundred million atoms,9,10 those methods are currently limited to small to medium-size molecular systems due to computational limitations, especially when used on standard desktop computers. The hope of circumventing those limitations rests on the development of improved algorithms, with novel implementations of their computationally challenging parts. In this paper, we have argued that a major limitation of normal mode analyses comes from the calculation of the eigenpairs of the Hessian of the potential energy function from which the normal modes are computed. We have described and implemented a new method for handling this Hessian, based on tensor products, for the simple elastic potential of the Tirion elastic network model.17 This new method was shown to reduce space requirements and improve the parallelization of its implementation. We have implemented and tested four different methods for computing some eigenpairs of that Hessian: the standard, robust Lanczos method as implemented in ARPACK; a simple modification of this method based on polynomial filtering; a functional-based method; and a block Chebyshev-Davidson method with inner-outer restart. We have shown that the latter provides the most efficient implementation when computing eigenpairs of extremely large Hessians corresponding to full-atom representations of large viral capsids. We have also shown that for those viral capsids a large number of eigenpairs is needed, in the order
of thousands, noting however that this large number is still a small fraction (a few percent) of the total number of possible eigenpairs. The tensor representation of the Hessian was introduced here in the context of the Tirion elastic potential used for normal mode analyses. This representation in fact applies to a much larger family of potentials. Equation 12, in which the Hessian is written as a sum of two terms, one of which involves tensor products, remains true for all potentials. For quadratic potentials, when the Hessian is evaluated at the stable conformation, it simplifies and only includes the sum of tensor products. This expression makes the computation of Hessian-vector products simpler and more efficient, as illustrated in this study. We are currently expanding this formalism to more complex potentials, such as the Gō potential.70–72 There has recently been a lot of interest in using filtered block iterative methods for computing partial eigenvalue decompositions of large, sparse matrices (for a recent review discussing their diverse applications, including problems outside of eigenvalue decomposition, see for example Ref. 73). We have used a block Chebyshev-Davidson method, based on an algorithm originally proposed by Zhou.52 We have shown that our implementation requires between 5 and 6 hours of Wall time to compute 5000 eigenpairs of the Hessian of a coarse-grained representation of a viral capsid with around 70,000 atoms, on a desktop computer with 4 cores and 64 GB of RAM. While not small, this computing time makes such a computation feasible. There are still many options for improving upon this method. First, we do not exclude that other methods for solving large symmetric eigenvalue problems may prove more efficient. In addition, we have limited ourselves to running those methods on standard desktop computers; computing clusters, as well as specialized architectures such as those available for GPU computing, are likely to provide significant speedups. There are, however, more fundamental problems to solve. For viral capsids and other large molecular systems with more than 100,000 atoms, storage and the corresponding memory requirements become problematic: computing the first 10,000 eigenvectors of a system with one million atoms would require 240 GB of RAM,
not including the memory needed to define the search space itself. To circumvent this problem, we will probably have to rely on divide-and-conquer methods such as spectrum slicing.69 We are currently looking into implementing such methods.
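The memory estimates quoted above follow from simple arithmetic (double precision, 3 Cartesian coordinates per atom):

```python
def eigvec_storage_gb(n_atoms, n_vectors, bytes_per_float=8):
    """Memory (GB) needed to hold n_vectors eigenvectors of length
    3 * n_atoms in double precision."""
    return 3 * n_atoms * n_vectors * bytes_per_float / 1e9

# The two estimates quoted in the text:
# 10,000 eigenvectors of a one-million-atom system -> 240 GB;
# 5000 eigenvectors of the 793,560-atom ZIKV capsid -> more than 95 GB.
assert round(eigvec_storage_gb(1_000_000, 10_000)) == 240
assert eigvec_storage_gb(793_560, 5_000) > 95
```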

8 Acknowledgments

The author thanks Zhaojun Bai and Marc Delarue for helpful comments and suggestions.


References

(1) Hagan, M. Modeling viral capsid assembly. Adv. Chem. Phys. 2014, 155, 1–68.

(2) Shaw, D.; Deneroff, M.; Dror, R.; Kuskin, J.; Larson, R.; Salmon, J.; Young, C.; Batson, B.; Bowers, K.; Chao, J.; Eastwood, M.; Gagliardo, J.; Grossman, J.; Ho, C.; Ierardi, D.; Kolossváry, I.; Klepeis, J.; Layman, T.; McLeavey, C.; Moraes, M.; Mueller, R.; Priest, E.; Shan, Y.; Spengler, J.; Theobald, M.; Towles, B.; Wang, S. Anton, a Special-purpose Machine for Molecular Dynamics Simulation. Commun. ACM 2008, 51, 91–97.

(3) Stone, J.; Hardy, D.; Ufimtsev, I.; Schulten, K. GPU-accelerated molecular modeling coming of age. J. Mol. Graph. Model. 2010, 29, 116–125.

(4) Wang, Y.; Harrison, C.; Schulten, K.; McCammon, J. Implementation of accelerated molecular dynamics in NAMD. Comput. Sci. Discov. 2011, 4, 015002.

(5) Pierce, L.; Salomon-Ferrer, R.; de Oliveira, C.; McCammon, J.; Walker, R. Routine access to millisecond time scale events with accelerated molecular dynamics. J. Chem. Theory Comput. 2012, 8, 2997–3002.

(6) Sweet, J.; Nowling, R.; Cickovski, T.; Sweet, C.; Pande, V.; Izaguirre, J. Long Timestep Molecular Dynamics on the Graphical Processing Unit. J. Chem. Theory Comput. 2013, 13, 3267–3281.

(7) Shaw, D. E.; Grossman, J. P.; Bank, J. A.; Batson, B.; Butts, J. A.; Chao, J. C.; Deneroff, M. M.; Dror, R. O.; Even, A.; Fenton, C. H.; Forte, A.; Gagliardo, J.; Gill, G.; Greskamp, B.; Ho, C. R.; Ierardi, D. J.; Iserovich, L.; Kuskin, J. S.; Larson, R. H.; Layman, T.; Lee, L. S.; Lerer, A. K.; Li, C.; Killebrew, D.; Mackenzie, K. M.; Mok, S. Y. H.; Moraes, M. A.; Mueller, R.; Nociolo, L. J.; Peticolas, J. L.; Quan, T.; Ramot, D.; Salmon, J. K.; Scarpazza, D. P.; Schafer, U. B.; Siddique, N.; Snyder, C. W.; Spengler, J.; Tang, P. T. P.; Theobald, M.; Toma, H.; Towles, B.; Vitale, B.; Wang, S. C.; Young, C. Anton 2: Raising the Bar for Performance and Programmability in a Special-Purpose Molecular Dynamics Supercomputer. SC14: International Conference for High Performance Computing, Networking, Storage and Analysis. 2014; pp 41–53.

(8) Eastman, P.; Pande, V. OpenMM: A Hardware Independent Framework for Molecular Simulations. Comput. Sci. Eng. 2015, 12, 34–39.

(9) Zhao, G.; Perilla, J.; Yufenyuy, E.; Meng, X.; Chen, B.; Ning, J.; Ahn, J.; Gronenborn, A.; Schulten, K.; Aiken, C.; Zhang, P. Mature HIV-1 capsid structure by cryo-electron microscopy and all-atom molecular dynamics. Nature 2013, 497, 643–646.

(10) Sener, M.; Strumpfer, J.; Singharoy, A.; Hunter, C.; Schulten, K. Overall energy conversion efficiency of a photosynthetic vesicle. Elife 2016, 5.

(11) Mahajan, S.; Sanejouand, Y.-H. On the relationship between low-frequency normal modes and the large-scale conformational changes of proteins. Arch. Biochem. Biophys. 2015, 567, 59–65.

(12) Saunders, M.; Voth, G. Coarse-graining Methods for Computational Biology. Annu. Rev. Biophys. 2013, 42, 73–93.

(13) Lopez-Blanco, J.; Chacon, P. New generation of elastic network models. Curr. Opin. Struct. Biol. 2016, 37, 46–53.

(14) Noguti, T.; Go, N. Collective variable description of small-amplitude conformational fluctuations in a globular protein. Nature 1982, 296, 776–778.

(15) Brooks, B.; Bruccoleri, R.; Olafson, B. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 1983, 4, 187–217.

(16) Levitt, M.; Sander, C.; Stern, P. Protein normal-mode dynamics: trypsin inhibitor, crambin, ribonuclease and lysozyme. J. Mol. Biol. 1985, 181, 423–447.


(17) Tirion, M. Large amplitude elastic motions in proteins from a single parameter, atomic analysis. Phys. Rev. Lett. 1996, 77, 1905–1908.

(18) Tama, F.; Brooks, C. L., III. Symmetry, form, and shape: guiding principles for robustness in macromolecular machines. Annu. Rev. Biophys. Biomol. Struct. 2006, 35, 115–133.

(19) Dykeman, E.; Sankey, O. Normal mode analysis and applications in biological physics. J. Phys. Condens. Matter 2010, 22, 423202.

(20) Lezon, T.; Shrivastava, I.; Yan, Z.; Bahar, I. In Handbook on Biological Networks; Boccaletti, S., Latora, V., Moreno, Y., Eds.; World Scientific Publishing Co: Singapore, 2010; pp 129–158.

(21) Simonson, T.; Perahia, D. Normal modes of symmetric protein assemblies. Application to the tobacco mosaic virus protein disk. Biophys. J. 1992, 61, 410–427.

(22) van Vlijmen, H.; Karplus, M. Normal mode calculations of icosahedral viruses with full dihedral flexibility by use of molecular symmetry. J. Mol. Biol. 2005, 350, 528–542.

(23) Peeters, K.; Taormina, A. Group theory of icosahedral virus capsid vibrations: a top-down approach. J. Theor. Biol. 2009, 256, 607–624.

(24) Tama, F.; Gadea, F.; Marques, O.; Sanejouand, Y.-H. Building-block approach for determining low-frequency normal modes of macromolecules. Proteins: Struct., Funct., Genet. 2000, 41, 1–7.

(25) Li, G.; Cui, Q. A coarse-grained normal mode approach for macromolecules: an efficient implementation and applications to Ca2+-ATPase. Biophys. J. 2002, 86, 743–763.

(26) Dykeman, E.; Sankey, O. Low frequency mechanical modes of viral capsids: an atomistic approach. Phys. Rev. Lett. 2008, 100, 028101.

(27) Hsieh, Y.-C.; Poitevin, F.; Delarue, M.; Koehl, P. Comparative normal mode analysis of the dynamics of DENV and ZIKV capsids. Front. Mol. Biosci. 2016, 3, 85.


(28) Kim, M.; Jernigan, R.; Chirikjian, G. Efficient generation of feasible pathways for protein conformational transitions. Biophys. J. 2002, 83, 1620–1630.

(29) Maragakis, P.; Karplus, M. Large amplitude conformational change in proteins explored with a plastic network model: adenylate kinase. J. Mol. Biol. 2005, 352, 807–822.

(30) Kurkcuoglu, O.; Turgut, O.; Cansu, S.; Jernigan, R.; Doruker, P. Focused functional dynamics of supramolecules by use of a mixed-resolution elastic network model. Biophys. J. 2009, 97, 1178–1187.

(31) Rueda, M.; Chacon, P.; Orozco, M. Thorough validation of protein normal mode analysis: a comparative study with essential dynamics. Structure 2007, 15, 565–575.

(32) Orellana, L.; Rueda, M.; Ferrer-Costa, C.; Lopez-Blanco, J.; Chacon, P.; Orozco, M. Approaching elastic network models to molecular dynamics flexibility. J. Chem. Theory Comput. 2010, 6, 2910–2923.

(33) Leioatts, N.; Romo, T.; Grossfield, A. Elastic network models are robust to variations in formalism. J. Chem. Theory Comput. 2012, 8, 2424–2434.

(34) Petrone, P.; Pande, V. Can conformational change be described by only a few normal modes? Biophys. J. 2006, 90, 1583–1593.

(35) Mahajan, S.; Sanejouand, Y.-H. On the relationship between low-frequency normal modes and the large-scale conformational changes of proteins. Arch. Biochem. Biophys. 2015, 567, 59–65.

(36) Harrison, R. Variational calculation of the normal modes of a large macromolecule: methods and some initial results. Biopolymers 1984, 23, 2943–2949.

(37) Brooks, B.; Karplus, M. Normal modes for specific motions of macromolecules: Application to the hinge-bending mode of lysozyme. Proc. Natl. Acad. Sci. U.S.A. 1985, 82, 4995–4999.


(38) Hao, M.-H.; Harvey, S. Analyzing the normal mode dynamics of macromolecules by the component synthesis method. Biopolymers 1992, 32, 1393–1405.

(39) Lehoucq, R.; Sorensen, D. Deflation techniques for an implicitly restarted Arnoldi iteration. SIAM J. Matrix Anal. Appl. 1996, 17, 789–821.

(40) Lehoucq, R.; Sorensen, D.; Yang, C. ARPACK Users' Guide: Solution of Large-Scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods; SIAM: Philadelphia, PA, 1998.

(41) Golub, G.; van der Vorst, H. Eigenvalue computation in the 20th century. J. Comput. Appl. Math. 2000, 123, 35–65.

(42) Saad, Y. Numerical Methods for Large Eigenvalue Problems; Society for Industrial and Applied Mathematics: Philadelphia, PA, 2011.

(43) Sorensen, D. Implicit application of polynomial filters in a k-step Arnoldi method. SIAM J. Matrix Anal. Appl. 1992, 13, 357–385.

(44) Lehoucq, R. Analysis and implementation of an implicitly restarted Arnoldi iteration. Ph.D. thesis, Rice University, Houston, TX, 1995.

(45) Sleijpen, G.; van der Vorst, H. A Jacobi–Davidson iteration method for linear eigenvalue problems. SIAM J. Matrix Anal. Appl. 1996, 17, 401–425.

(46) Zhou, Y.; Saad, Y. A Chebyshev–Davidson algorithm for large symmetric eigenvalue problems. SIAM J. Matrix Anal. Appl. 2007, 29, 341–359.

(47) Dykeman, E.; Sankey, O. Theory of the low frequency mechanical modes and Raman spectra of the M13 bacteriophage capsid with atomic detail. J. Phys. Condens. Matter 2009, 21, 035116.

(48) Sameh, A.; Wisniewski, J. A trace minimization algorithm for the generalized eigenvalue problem. SIAM J. Numer. Anal. 1982, 19, 1243–1259.


(49) Wen, Z.; Yang, C.; Liu, X.; Zhang, Y. Trace penalty minimization for large-scale eigenspace computation. J. Sci. Comput. 2016, 66, 1175–1203.

(50) Rutishauser, H. Computational aspects of F. L. Bauer's simultaneous iteration method. Numer. Math. 1969, 13, 4–13.

(51) Saad, Y. Chebyshev acceleration techniques for solving nonsymmetric eigenvalue problems. Math. Comput. 1984, 42, 567–588.

(52) Zhou, Y. A block Chebyshev–Davidson method with inner-outer restart for large eigenvalue problems. J. Comput. Phys. 2010, 229, 9188–9200.

(53) Bahar, I.; Atilgan, A.; Erman, B. Direct evaluation of thermal fluctuations in proteins using a single parameter harmonic potential. Folding Des. 1997, 2, 173–181.

(54) Haliloglu, T.; Bahar, I.; Erman, B. Gaussian dynamics of folded proteins. Phys. Rev. Lett. 1997, 79, 3090–3093.

(55) Atilgan, A.; Durell, S.; Jernigan, R.; Demirel, M.; Keskin, O.; Bahar, I. Anisotropy of fluctuation dynamics of proteins with an elastic network model. Biophys. J. 2001, 80, 505–515.

(56) Ming, D.; Wall, M. Allostery in a Coarse-Grained Model of Protein Dynamics. Phys. Rev. Lett. 2005, 95, 198103.

(57) Kondrashov, D.; Cui, Q.; Phillips, G., Jr. Optimization and evaluation of a coarse-grained model of protein motion using X-ray crystal data. Biophys. J. 2006, 91, 2760–2767.

(58) Eyal, E.; Yang, L.; Bahar, I. Anisotropic network model: systematic evaluation and a new web interface. Bioinformatics 2006, 22, 2619–2627.

(59) Arnoldi, W. The principle of minimized iterations in the solution of the matrix eigenvalue problem. Quart. Appl. Math. 1951, 9, 17–29.


(60) Li, R.; Xi, Y.; Vecharynski, E.; Yang, C.; Saad, Y. A Thick-Restart Lanczos algorithm with polynomial filtering for Hermitian eigenvalue problems. SIAM J. Sci. Comput. 2016, 38, A2512–A2534.

(61) Aurentz, J.; Kalantzis, V.; Saad, Y. Cucheb: A GPU implementation of the filtered Lanczos procedure. Comput. Phys. Commun. 2017, 220, 332–340.

(62) Hochstenbach, M.; Notay, Y. The Jacobi–Davidson method. GAMM Mitteilungen 2006, 29, 368–382.

(63) Berman, H.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T.; Weissig, H.; Shindyalov, I.; Bourne, P. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235–242.

(64) Quentrec, B.; Brot, C. New method for searching for neighbors in molecular dynamics computations. J. Comput. Phys. 1973, 13, 430–432.

(65) Kostyuchenko, V.; Lim, E.; Zhang, S.; Fibriansah, G.; Ng, T.-S.; Ooi, S.; Shi, J.; Lok, S.-M. Structure of the thermally stable Zika virus. Nature 2016, 533, 425–428.

(66) Zhang, Y.; Kaufmann, B.; Chipman, P.; Kuhn, R.; Rossmann, M. Structure of immature West Nile Virus. J. Virol. 2007, 81, 6141–6145.

(67) Cherrier, M.; Kaufmann, B.; Nybakken, G.; Lok, S.; Warren, J.; Nelson, C.; Kostyuchenko, V.; Holdaway, H.; Chipman, P.; Kuhn, R.; Diamond, M.; Rossmann, M.; Fremont, D. Structural basis for the preferential recognition of immature flaviviruses by a fusion-loop antibody. EMBO J. 2009, 28, 3269–3276.

(68) Zhang, W.; Kaufmann, B.; Chipman, P.; Kuhn, R.; Rossmann, M. Membrane curvature in flaviviruses. J. Struct. Biol. 2013, 183, 86–94.

(69) Lin, L.; Saad, Y.; Yang, C. Approximating spectral densities of large matrices. SIAM Rev. 2016, 58, 34–65.


(70) Ueda, Y.; Taketomi, H.; Gō, N. Studies on protein folding, unfolding and fluctuations by computer simulation. I. The effects of specific amino acid sequence represented by specific inter-unit interactions. Int. J. Pept. Protein Res. 1975, 7, 445–459.

(71) Clementi, C.; Nymeyer, H.; Onuchic, J. Topological and energetic factors: what determines the structural details of the transition state ensemble and "en-route" intermediates for protein folding? An investigation for small globular proteins. J. Mol. Biol. 2000, 298, 937–953.

(72) Lin, T.-L.; Song, G. Generalized spring tensor models for protein fluctuation dynamics and conformation changes. BMC Struct. Biol. 2010, 10, S3.

(73) Zhou, Y.; Wang, Z.; Zhou, A. Accelerating large partial EVD/SVD calculations by filtered block Davidson methods. Sci. China Math. 2016, 59, 1635–1662.


Figure 9: TOC graphic. [Plot residue: RMS to the target structure along a normal-mode transition from a start structure to a target structure, as a function of mode number (modes 1000–5000).]
