A generalized Markov State Modeling method for non-equilibrium

May 29, 2018 - Markov state models (MSMs) have received an unabated increase in popularity in recent years, as they are very well suited for the ident...
1 downloads 0 Views 4MB Size
Subscriber access provided by University of Massachusetts Amherst Libraries

Molecular Mechanics

A generalized Markov State Modeling method for nonequilibrium biomolecular dynamics – exemplified on peptide conformational dynamics driven by an oscillating electric field. Bernhard Reuter, Marcus Weber, Konstantin Fackeldey, Susanna Röblitz, and Martin E Garcia J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.8b00079 • Publication Date (Web): 29 May 2018 Downloaded from http://pubs.acs.org on June 1, 2018

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

A generalized Markov State Modeling method for non-equilibrium biomolecular dynamics – exemplified on peptide conformational dynamics driven by an oscillating electric field. Bernhard Reuter*,†, Marcus Weber‡, Konstantin Fackeldey‡,§, Susanna Röblitz‡, Martin E. Garcia† †

University of Kassel, Institute of Physics, Theoretical Physics II, Heinrich-Plett-Str. 40, 34132

Kassel, Germany ‡

Zuse Institute Berlin (ZIB), Takustraße 7, 14195 Berlin, Germany

§

Institute of Mathematics, Technical University Berlin, Straße des 17. Juni 136, 10623 Berlin,

Germany KEYWORDS. Markov State Model, non-equilibrium, PCCA+, oscillating electric field, molecular dynamics, protein

ACS Paragon Plus Environment

1

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 2 of 57

ABSTRACT: Markov state models (MSMs) have received an unabated increase in popularity in recent years, as they are very well suited for the identification and analysis of metastable states and related kinetics. However, the state-of-the-art Markov state modeling methods and tools enforce the fulfillment of a detailed balance condition, restricting their applicability to equilibrium MSMs. To date, they are unsuitable to deal with general dominant data structures including cyclic processes, which are essentially associated with non-equilibrium systems. To overcome this limitation we developed a generalization of the common robust Perron Cluster Cluster Analysis (PCCA+) method, termed generalized PCCA (G-PCCA). This method handles equilibrium and non-equilibrium simulation data, utilizing Schur vectors instead of eigenvectors. G-PCCA is not limited to the detection of metastable states but enables the identification of dominant structures in a general sense, unraveling cyclic processes. This is exemplified by application of G-PCCA on non-equilibrium molecular dynamics data of the Amyloid β (1-40) peptide, periodically driven by an oscillating electric field.

ACS Paragon Plus Environment

2

Page 3 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

1. Introduction A major challenge in biomolecular simulations consists in the identification of the relevant conformational changes of large biomolecules without relying in rough coarse-grained models, i.e., by keeping the accuracy provided by the atomistic simulations. Since simulations do not only include all atoms of the biomolecule but also thousands of water molecules surrounding them, during analysis a dramatic reduction of the degrees of freedom is necessary. This can be achieved by Markov state modeling, which is a mathematically rigorous technique, adapted for Molecular Dynamics (MD) in thermodynamic equilibrium by Schütte1 and refined in recent years.2–6 Markov State Models (MSMs)7 have received an ongoing increase in popularity in years past as they enable the conflation of data from simulations of different lengths and the identification and analysis of the relevant metastabilities and kinetics of molecular systems. Markov state modeling has become a powerful tool to infer the long-time dynamic behavior of molecular systems from short-time MD simulations. MSMs foster the identification and characterization of rare events in molecular systems. In order to estimate the frequency of rare events, acceleration methods (see, for example, Chaterjee and Voter8) are a useful alternative to MSMs. However, many acceleration methods make use of the reversibility of the processes. In our context we will investigate molecular processes that are influenced by an outer stimulus, which soils the reversibility of the process. In a first step, the conformational space is discretized in terms of a finite number of subsets, and the dynamic process is represented as a discrete-time Markov chain on this finite state space for an appropriate lag time 𝜏. In a second step, the metastable (i.e., almost invariant) subsets of this finite dimensional state space are identified, i.e., those sets with large holding probabilities, which typically represent metastable conformations of the molecular system. Finally, the Markov

ACS Paragon Plus Environment

3

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 4 of 57

chain is projected onto these metastable sets, giving rise to a low-dimensional matrix that contains the transition probabilities between the metastable conformations. Under certain conditions, this matrix again describes a Markov chain. However, standard methods and tools for MSM building are so far restricted to the creation of equilibrium MSMs fulfilling the detailed balance condition9. Although recently increased efforts were made to derive equilibrium MSMs from non-equilibrium simulations,10,11 there are only few contributions regarding the construction of actual non-equilibrium MSMs12–14. This constitutes a severe constraint, since molecular non-equilibrium systems, e.g., disturbed by an external force, have drawn increasing interest. Thus, to get over this limitation, we have developed and implemented a generalization of the widely used robust Perron Cluster Cluster Analysis (PCCA+)15 method, termed generalized PCCA (G-PCCA). This method, based on the utilization of Schur vectors16 instead of eigenvectors, can handle both data from equilibrium and nonequilibrium simulations, resulting in a respective MSM. G-PCCA is able to identify dominant structures in a more general sense, not limited to the detection of metastable states, unraveling cyclic processes. Markov state modeling has been developed for reversible autonomous17 systems, where the transition probabilities only depend on the lag-time. Furthermore, it is also applicable to nonreversible autonomous and non-autonomous systems. A system is non-autonomous17, if external forces influence its law of behavior. The application of Markov state modeling is particularly simple for non-autonomous systems with a periodic outer stimulus, if the lag-time 𝜏 equals the length of one period.18,19 This will be the basis for constructing MSMs in this article. It is exemplified by application of G-PCCA on Non-Equilibrium Molecular Dynamics (NEMD) data

ACS Paragon Plus Environment

4

Page 5 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

of the human Amyloid β (1-40) peptide (Aβ40)20, crucially involved with Alzheimer disease, periodically driven by an oscillating electric field. Aβ40 is cleaved by β secretase and γ secretase from the amyloid precursor protein (APP). APP is an integral membrane protein, which is concentrated in neuronal synapses. Experimental studies of Aβ40 in non-polar and micellar environments have shown that this peptide consists mainly of α-helical regions.21 However, studies under physiological conditions show Aβ40 populates a multiplicity of (partially) structured states with mainly α-helical and β-sheet conformations.22,23 This α/β conformations tend to build toxic oligomers that are found in Alzheimer disease brain lesions (plaques) and, therefore, are pathogenic conformations of the peptide, whereas the α-helical conformation is assumed to not be linked to diseases.20 We conducted extensive NEMD simulations, based on the method of Wang et al.24, utilizing the GROMACS program package. The resulting huge amount of simulation data, showing complex conformational dynamics between a manifold of microstates, was then successfully modeled utilizing G-PCCA, unraveling both metastabilities and nonreversible cyclic processes.

2. Theory 2.1. MSM building The aim of MSMs is to represent the dynamic behavior of a molecular system, given as some stochastic process (e.g., Langevin dynamics), by conditional transition probabilities between subsets of the conformational space. Originally, this type of models has been developed for autonomous systems. Based on a sufficiently fine decomposition {Ω! }! !!! of the underlying conformational space into subsets, the dynamics of the stochastic process are interpreted as a Markov process, that is the transition probability depends on the current state only. In this case, the dynamical behavior of the system can be represented by a transition matrix 𝐏 𝜏 ∈ ℝ!×! ,

ACS Paragon Plus Environment

5

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 57

where every entry 𝑃!" 𝜏 of this matrix gives the conditional probability that a trajectory starting in subset 𝑖 will be in subset 𝑗 after a certain lag-time 𝜏. The transition matrix 𝐏 is a stochastic matrix with the largest eigenvalue 𝜆 = 1, called the Perron root25. Under certain ergodicity assumptions, this eigenvalue is algebraically and geometrically simple. In that case, the MSM has a unique stationary distribution 𝛑 with 𝛑! 𝐏(𝜏) = 𝛑! . For a reversible process the matrix 𝐃𝐏 𝜏 is symmetric, where 𝐃 = diag 𝜋! . . . 𝜋! , i.e., the detailed balance condition 𝜋! 𝑃!" = 𝜋! 𝑃!" holds. In the context of reversible Markov chains it has been understood that the long time relaxation kinetics of a process are represented by the eigenvectors belonging to the eigenvalues close to one. Reversible matrices 𝐏 always have real eigenvalues and real eigenvectors. Thus we can arrange the eigenvalues of 𝐏 in a descending order and take only the first 𝑛 ≪ 𝑁 dominant eigenvectors to describe the dominant processes of the system. This kind of spectral analysis holds for reversible autonomous systems, where the transition probability only depends on the lag-time. MSMs of time-reversible processes represent transition probabilities between metastable sets 𝑀 ⊂ Ω in space. When considering non-reversible, especially non-autonomous, processes, the sets 𝑀(𝑡) ⊂ Ω have to be regarded in space and time.18 By extension of the configuration space Ω via including the time variable 𝑡, one can consider metastable sets in space-time. Those are called coherent sets and can simply be considered as metastable sets in the augmented space Ω × [0, ∞).18 Thus, coherent sets are meant, if one speaks of metastable sets or states of non-autonomous systems. With extending the configuration space, it is possible to also extend common spectral algorithms like PCCA+ to identify coherent sets of non-autonomous dynamical systems. This is particularly straightforward when analyzing systems with a periodic outer stimulus, if the lag-time 𝜏 equals the length of one period.18,19 Wang and Schütte12 showed

ACS Paragon Plus Environment

6

Page 7 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

how to spatially and temporally discretize periodically driven NEMD utilizing the Floquet theorem26, obtaining a time-homogeneous (but non-reversible) Markov jump process. This will be the basis for constructing the transition matrix 𝐏(𝜏) in this article as detailed at the example of diffusive molecular dynamics, based on the description by Wang and Schütte12: The non-autonomous stochastic differential equation of the system’s state 𝐱(𝑡) ∈ Ω at time 𝑡 in the state space Ω ⊂ ℝ!! , driven by the time-dependent external force 𝐄 𝑡 𝐃(𝐱(𝑡)) with periodic external field 𝐄 𝑡 , is given by d𝐱 𝑡 = −∇𝑉 𝐱(𝑡) + 𝐄 𝑡 𝐃 𝐱(𝑡) d𝑡 + 2𝛽 !! d𝑊 Here 𝑊 marks 3𝑁-dimensional Brownian motion and 𝛽 = 1/(𝑘! 𝑇). Then the propagation of probability densities 𝜌(𝐱, 𝑡), 𝜌 𝐱, 𝑡 d𝐱 = ℙ[𝐱(𝑡) ∈ [𝐱, 𝐱 + d𝐱)] is given by the Fokker-Planck equation !" !"

= ℒ! 𝑡 𝜌

(1)

Here ℒ ! 𝑡 is the adjoint of the generator ℒ 𝑡 = 𝛽 !! ∆𝐱 + −∇𝐱 𝑉 𝐱 + 𝐄 𝑡 𝐃 𝐱 ∇𝐱 with ∆𝐱 and ∇𝐱 the Laplacian and Nabla operators with respect to 𝐱. The generator ℒ 𝑡 = ℒ 𝑡 + 𝜏!" is periodic with the period 𝜏!" of the external field 𝐄(𝑡). The temporal evolution of the probability densities 𝜌(𝐱, 𝑡) is given by 𝜌(𝐱, 𝑡) = 𝒫! (𝑡)𝜌(𝐱, 0) with the non-autonomous propagator or transfer operator 𝒫! (𝑡), called Perron–Frobenius operator19, solving (1) with the initial condition 𝜌(𝐱, 0) = 𝜌! . Firstly, a spatial discretization is performed by introducing a partition {Ω! }! !!! of the state space Ω into a finite number of disjoint sets and discretizing the Fokker-Planck equation, obtaining a time-inhomogeneous Markov jump

ACS Paragon Plus Environment

7

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 57

process in state space 𝑆 = {1, … , 𝑁}. This Markov jump process has a time-dependent rate Matrix 𝐋 𝑡 ∈ ℝ!×! with periodicity 𝐿!" 𝑡 = 𝐿!" (𝑡 + 𝜏ex ). Like in the time-continuous case, the Markov jump process generated by 𝐋(𝑡) propagates probability distributions according to the master equation d𝐩(𝑡) = 𝐋! 𝑡 𝐩(𝑡) d𝑡 with 𝐩(𝑡) the vector of the probability distribution of the states 𝑆 = {1, … , 𝑁} at time 𝑡. Then the temporal evolution of the probability distribution 𝐩(𝑡) is given by 𝐩 𝑡 = 𝐏 ! 𝑡 𝐩(0)

(2)

utilizing the associated propagator matrix 𝐏 𝑡 ∈ ℝ!×! solving d d!

𝐏 ! 𝑡 = 𝐋! 𝑡 𝐏 ! 𝑡 ,

𝐏! 0 = I

(3)

with 𝑃!" 𝑡 = P 𝑋! = 𝑗 𝑋! = 𝑖 , where 𝑋! is the Markov jump process generated by 𝐋(𝑡). Obviously, 𝐏 𝑡

is the time-dependent transition Matrix of the Markov jump process 𝑋!

generated by 𝐋(𝑡), derived by spatial discretization of the transfer operator 𝒫! (𝑡). The elements of 𝐏 𝑡 are the probabilities 𝑃!" 𝑡 = P 𝑋! = 𝑗 𝑋! = 𝑖 to be in state 𝑗 at time 𝑡 under the condition of being in state 𝑖 at 𝑡 = 0. Subsequently we can introduce a temporal discretization by taking advantage of the periodicity of 𝐋(𝑡). Regarding (3) column-wise as periodic linear differential equation, 𝐏(𝑡 + 𝜏ex ) satisfies 𝐏(𝑡 + 𝜏ex ) = 𝐏(𝑡) 𝐏(𝜏ex ) as a consequence of Floquet's theorem26. Thereof follows that 𝐏 𝑡 + 𝑚𝜏ex = 𝐏 𝑡 𝐏 ! 𝜏ex for all 𝑚 = 0, 1, 2, … . If we know 𝐏(𝑡) for 𝑡 ∈ (0, 𝜏ex ) and use (2), we get the solution 𝐩 𝑡 of the master equation for all 𝑡 ≥ 0. Particularly we get the long-term evolution of the propagator 𝐏 𝑚𝜏ex = 𝐏 ! 𝜏ex

ACS Paragon Plus Environment

8

Page 9 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

and the probability at multiples of the period 𝐩 𝑚𝜏ex = 𝐏 ! 𝑚𝜏ex 𝐩 0 = 𝐏 !

!

𝜏ex 𝐩 0

Applying Floquet's theorem, the time-inhomogeneous (non-autonomous) Markov jump process 𝑋! is discretized to a time-homogeneous (autonomous) non-reversible Markov jump process 𝑋!!!" , which is generated by the transition matrix 𝐏 𝜏 = 𝜏ex . By the described discretization one gains an autonomous transition matrix 𝐏 𝜏ex approximating the dynamics of the periodically driven non-equilibrium system on timescales of multiples of the period 𝜏ex . If the lag-time does not coincide with the length of the period 𝜏ex , then the transition matrix 𝐏 𝑡, 𝜏 ∈ ℝ!×! does not only depend on the lag-time 𝜏, but also on the starting time 𝑡 of the process. In general, if the studied process is non-autonomous, then the transfer operator 𝒫! (𝑡), acting on functions 𝑓 ∶ Ω → ℝ, is time-dependent and, if the external influence is not periodic, the above discretization procedure is not applicable. To overcome this, it is convenient to define an autonomous transfer operator 𝒯! acting on functions 𝑓 ∶ Ω×[0, ∞) → ℝ by18 𝒯! 𝑓 (𝐱, 𝑡) ∶=

𝒫! (𝑡)𝑓(𝑡) (𝐱), 𝑡 + 𝜏

Then, one possibility is to perform a Galerkin discretization of 𝒯! to construct a space-time discretized autonomous transition matrix 𝐓(𝜏), as detailed by Fackeldey et al.18 that can be used for further analysis. Additional methods to deal with transfer operators of non-autonomous processes can be found in the publication by Fackeldey et al.18. Having gained such an autonomous transfer operator 𝒯! and an appropriate discretization 𝐓(𝜏) of a non-autonomous process, the G-PCCA method, detailed in the following sections, becomes directly applicable. Thus, G-PCCA is generally applicable to non-autonomous Markov processes, provided an appropriate discretization leading to a time-independent transition matrix 𝐓(𝜏), is utilized.

ACS Paragon Plus Environment

9

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 57

Schild27 applied the construction of time-dependent MSMs to periodic processes for the first time. He manipulated the matrices 𝐏 𝑡, 𝜏 in such a way that they all became reversible, so that he could apply real-valued spectral analysis. In this article, we will not manipulate the transition matrices. The transition matrices are non-reversible and may thus have complex-valued eigenvectors. We will explain, how PCCA+ still describes the dynamics of the slow processes in this case. Coarse graining MSMs using Schur vectors Usually, PCCA+ is used for projecting the high-dimensional transition matrices 𝐏 𝜏 to low dimensional transition matrices 𝐏! 𝜏 ∈ ℝ!×! , 𝑛 ≪ 𝑁, accounting only for the transition probabilities between metastable subsets of the conformational space. This means that the 𝑁 discretization sets are further combined to 𝑛 larger subsets, such that the diagonal entries of 𝐏! are nearly one, representing the long holding times of the metastable sets. Note that in general, the projected matrix 𝐏! based on subset projection does not provide the correct transition probabilities of the projected chain due to a projection error.28 Only if 𝐏! is constructed by a Galerkin projection19 based on an invariant subspace of 𝐏 (invariant subspace projection), this property can still be assured.29,30,31 In other words, "propagation with 𝐏 and then applying projection" should be the same as "first projection and then propagation with 𝐏! ". The diagram in Figure 1 should commute.29,30,31 This is the reason, why PCCA+ uses invariant eigenspaces of 𝐏 to provide this invariant subspace projection: In the case of reversible Markov chains, the realvalued eigenspaces coincide with the invariant subspaces of 𝐏. In our case, the transition matrix 𝐏 is non-reversible due to the outer stimulus, which could lead to complex-valued eigenspaces. PCCA+ is not prepared to cope with complex vectors. In order to still provide an invariant subspace projection, real-valued basis vectors of (another kind of) invariant subspaces of 𝐏 have

ACS Paragon Plus Environment

10

Page 11 of 57

to be taken into account. This can be done by simply using the real Schur vectors as an input for the PCCA+ algorithm instead of eigenvectors.32

projection!

P! projection!

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Pc! Figure 1: To preserve the time-scales of the slow processes, propagation and projection are assumed to commute. To achieve this, the projection to a low-dimensional matrix 𝐏! must be based on an invariant subspace projection. A partial Schur decomposition provides an invariant subspace of a given matrix. This means, one can find a real matrix 𝐗𝜖ℝ!×! such that 𝐏𝐗 = 𝐗𝚲 where 𝚲𝜖ℝ!×! is a (non-diagonal) real upper triangle matrix with 1-by-1 and/or 2-by-2 blocks on its diagonal, depending on whether the corresponding eigenvalue of 𝐏 is real or complex valued. Every linear combination 𝝌 = 𝐗𝐀 with 𝝌𝜖ℝ!×! and a non-singular matrix 𝐀 ∈ ℝ!×! provides the basis of an invariant subspace of 𝐏. Instead of combining the 𝑁 states of 𝐏 to 𝑛 subsets in order to get 𝐏! , we have to apply a Galerkin projection, i.e., 𝐏! = 𝝌! 𝐃𝝌

!!

𝝌! 𝐃𝐏𝝌

In this expression, 𝐃 is a diagonal matrix of a weighted scalar product. In order to provide an invariant subspace projection, the Schur vectors have to be orthogonal with respect to this weighted product, i.e., 𝐗 ! 𝐃𝐗 = 𝐈, with 𝐈 being the 𝑛-dimensional unit matrix. This can be

ACS Paragon Plus Environment

11

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 57

guaranteed if the Schur decomposition 𝐏𝐗 = 𝐗𝚲 of the matrix 𝐏 = 𝐃!.! 𝐏𝐃!!.! is computed. Then 𝐗 = 𝐃!!.! 𝐗 are the 𝐃-orthogonal Schur vectors of 𝐏 and 𝚲 is its Schur matrix. The following calculation proves this statement: 𝐏𝐗 = 𝐗𝚲 ⇔ 𝐃!.! 𝐏𝐃!!.! 𝐗 = 𝐗𝚲 ⇔ 𝐏𝐃!!.! 𝐗 = 𝐃!!.! 𝐗𝚲 ⇔ 𝐏𝐗 = 𝐗𝚲,

𝐗 = 𝐃!!.! 𝐗

With these preparations, it becomes clear that the diagram in Figure 1 commutes, i.e., applying the projection after 𝑘 steps with the high-dimensional Markov chain is the same as 𝑘 steps of the projected Markov chain. This is shown by the following equations: 𝐀! 𝐗 ! 𝐃𝐗𝐀

!!

𝐀! 𝐀 !! 𝐀! 𝚲𝐀 = 𝐀!! 𝚲𝐀 𝐀! 𝐗 ! 𝐃𝐏𝐗𝐀 = 𝐏! = 𝝌! 𝐃𝝌

!!

𝝌! 𝐃𝐏𝝌

which leads to the desired result: 𝝌! 𝐃𝝌

!!

𝝌! 𝐃𝐏 ! 𝛘 = 𝐀!! 𝚲! 𝐀 = 𝐏! !

The last equation shows that the transition on the coarse level is in principle given by 𝚲! . If 𝚲! is a diagonal matrix, i.e., if 𝐏 is reversible, then one can clearly see that the diagonal elements (the eigenvalues) determine a decay, and thus the eigenvalues determine characteristic time scales of the process. If 𝚲! has non-diagonal elements, the process comprises cycles as well. The non-symmetry of 𝚲! is directly connected to the cyclic behavior of the system.14 This means that complex eigenvalues close to the unit circle in the complex plane are characteristic for cyclic processes. The diagonal elements of 𝐃 are not discussed yet. In principle, they are arbitrary. In our examples, they are given by the initial distribution of the molecular states (per discretization set), i.e., 𝐷!! is the probability to start a trajectory in set 𝑖. With this setting, 𝐃𝝌 represents an assignment of every element of the initial distribution to the corresponding row of 𝝌 and 𝝌! 𝐃𝝌 is a matrix that assigns the initial distribution to the projected space (regarding the overlap of 𝝌).

ACS Paragon Plus Environment

12

Page 13 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

If 𝝌! 𝐃𝝌 was a diagonal matrix, then the diagonal elements would be the initial probabilities of the macrostates of the system. Thus, we will determine the transformation matrix 𝐀 in such a way that 𝝌! 𝐃𝝌 has only as small as possible off-diagonal entries. This determines the objective function to be optimized by PCCA+.33 If the off-diagonal entries of 𝝌! 𝐃𝝌 are zero, then 𝝌 represents characteristic functions of a disjoint decomposition of the configuration space into metastable sets (rather than fuzzy sets). Thus, a non-negative matrix 𝐏! combined with almost characteristic functions 𝝌 is what we aim at. In this situation, the projection 𝐏! of 𝐏 onto 𝝌 provides what is usually defined as a MSM. Röblitz and Weber33 have shown that minimizing the off-diagonal elements of 𝝌! 𝐃𝝌 is equivalent to maximizing the trace of 𝐒 = 𝐃𝝌! 𝐃𝝌, where 𝐃 is constructed such that 𝐒 is a stochastic matrix. The function 𝑓! = 𝑛 − trace 𝐒 is a convex objective function. Note, if 𝐒 was the identity matrix, then the coarse-grained matrix 𝐏! is automatically non-negative, thus allowing for an interpretation of the coarse-grained process as a Markov chain. 2.2. Determining the number of states Usually the number 𝑛 of macrostates is determined by searching for a spectral gap in the eigenvalues of 𝐏 near the Perron root 1. This procedure has been suggested by Huisinga34 for reversible Markov chains. However, if the computation of spectral properties of 𝐏 is replaced by a real Schur decomposition, then the spectral gap (of a complex valued spectrum) may not be the correct measure for the number of states. For determining dominant structures in non-reversible Markov chains, the eigenvalues of 𝐏 near the unit circle in the complex plane play an important role, because they correspond to dominant cyclic or metastable processes.14 After sorting the Schur matrix according to the distance of the eigenvalues from the unit cycle by using the MATLAB routine SRSchur35, we first use the minChi-criterion to determine a possible interval

ACS Paragon Plus Environment

13

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 14 of 57

for the number 𝑛 of macrostates.36 While generally the initial guess for the transformation matrix (!)

𝐀 leads to membership vectors 𝝌!

with negative entries, the minChi-criterion states that the

value minChi = min min 𝜒 (!) 𝑖, 𝑗 !

!

will be close to zero, if well separated clusters or states exist. minChi is the minimal value in the matrix 𝝌 ! that is constructed from the infeasible starting point of the PCCA+ optimization problem. If the smallest negative entry of 𝝌 ! is almost zero, then the infeasible (but optimal) starting point is almost feasible. Thus, we expect the optimization problem to provide a good decomposition of the configuration space. This means that we should spend the effort to solve the PCCA+ optimization problem with the Nelder-Mead method (like originally proposed by Deuflhard and Weber15) or with the Gauss-Newton method NLSCON37,38.33 Note that both local optimization routines lead to similar results in our applications, but Gauss-Newton is much faster than Nelder-Mead. After determining the crispness33 𝜉=

trace 𝐒 𝑛

i.e., the value of the trace of 𝐒 = 𝐃𝝌! 𝑫𝝌, we try to find a number 𝑛 of macrostates, such that not only the crispness is high, but also the minimal element of 𝐏! is not negative. This is a Markovianity check. In the case of a non-negative matrix 𝐏! , this matrix allows for an interpretation of the coarse process as a Markov chain. If 𝐏! has negative elements, then the projected process cannot be regarded as a Markov chain. High crispness of 𝝌 indicates that the process can be well decomposed into subsets, but it does not ensure that the transition behavior between the subsets is Markovian. For assuring Markovianity we need the additional minPc-

ACS Paragon Plus Environment

14

Page 15 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

criterion. To take the requirement of Markovianity into account, we defined a new indicator, which we termed minPc minPc = min min 𝑃! (𝑖, 𝑗) !

!

in resemblance to the minChi indicator. A process can be assumed Markovian, if minPc is nonnegative (up to numerical precision). The matrix 𝝌𝜖ℝ!×! in the last section has the meaning of a membership matrix. The nonnegative entry 𝜒!" denotes the grade of membership of micro-subset 𝑖 to the metastable macrostate 𝑗. The rows of the membership matrix are summing up to 1. In order to meet the invariant subspace condition, the membership matrix must be a linear transformation of the basis vectors spanning the corresponding 𝑛-dimensional subspace of 𝐏. This means 𝝌 = 𝐗𝐀, where 𝐗 ∈ ℝ!×! is the 𝑛-dimensional basis of the invariant subspace and 𝐀 ∈ ℝ!×! is the non-singular transformation matrix, such that 𝝌 has the described properties. The rows of the matrix 𝝌 as vectors in ℝ! are all pointing into a simplex spanned by the unit vectors of ℝ! . Thus, by linear transform, all rows of 𝐗 are also 𝑛-dimensional vectors, which point into a simplex spanned by unknown vertices in ℝ! . If these vertices can be found among the rows of 𝐗, then the transformation matrix 𝐀 ∈ ℝ!×! is unique.15 The minChi-criterion36 tests, if it is possible to find vertices of the simplex close to some rows of 𝐗. If this is the case, then the transformation matrix 𝐀 ∈ ℝ!×! leads to a well-conditioned matrix 𝝌, i.e., its columns are far from being linearly dependent. Therefore, we applied this criterion in order to figure out, if our choice of the number 𝑛 of states is a good choice. Remark 1: The Schur decomposition is not unique. It depends on the ordering of the 1-by-1 and 2-by-2 blocks on the diagonal of the Schur matrix. In order to "select" a special 𝑛dimensional invariant subspace, one has to sort the Schur matrix in such a way that the

ACS Paragon Plus Environment

15

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 16 of 57

corresponding blocks, i.e., eigenvalues, are on the top left of the Schur matrix. Then, one can pick the leading 𝑛 Schur vectors for the PCCA+ analysis. The routine SRSchur35 in MATLAB can be used for this ordering. 2.3. From fuzzy sets to the assignment of states Once the number of states is determined via minChi, crispness and minPc, and the linear transformation 𝝌 = 𝐗𝐀 is done via PCCA+, we want to interpret the columns of 𝝌 as indicator functions of subsets of the conformational space. The entries of 𝝌 are nonnegative numbers and sum up to 1 (row-wise). Thus, subset 𝑖 is assigned to macrostate 𝑗, if 𝜒!" is the largest entry of 𝝌 in its 𝑖-th row. By this "defuzzification", the macrostates are metastable subsets of the conformational space and the transition matrix 𝐏! has an interpretation as a (non-reversible) conditional transition probability matrix among these macrostates. The interpretation is possible, since the matrix 𝝌! 𝐃𝝌 in the calculation of the matrix 𝐏! becomes a diagonal matrix in this case, with the projected initial distribution as diagonal elements, and 𝝌! 𝐃𝐏𝝌 simply counts the number of transitions between the sets. Remark 2: If the matrix 𝐏 has already large entries on its diagonal, then PCCA+ will tend to separate single rows of 𝐏 as metastable states. In order to circumvent this problem, one could transform the matrix 𝐏 by deleting the diagonal elements and rescaling the rows, such that the resulting matrix is row stochastic again. This is like a transition from a Markov chain to the embedded Markov chain. 2.4. Metastable states and dominant cycles When building a MSM of a non-reversible Markov process, not only metastable states but also so-called dominant cycles must be regarded. A dominant cycle is a repeatedly traversed sequence of transient states. After space and time discretization of the Markov process, a Markov chain

ACS Paragon Plus Environment

16

Page 17 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

with a non-reversible transition matrix 𝐏 is obtained. This transition matrix possesses complex eigenvalues and eigenvectors that can be interpreted in terms of coherence (coherent sets) in the case of a periodically driven process.19 Searching for eigenvalues near the unit circle in the complex plane is often the same as searching for eigenvalues near the Perron root 1. Especially if the matrices 𝐏 are strongly diagonally dominant, then the diagonal entries of 𝐏 nearly correspond to the leading eigenvalues. The cyclic sub-processes are hidden. They correspond to eigenvalues near the unit circle, which are away from 1.14 In order to reveal the cyclic sub-processes, we compute the Embedded Markov Chain30 (EMC) of the process. The EMC is derived from an infinitesimal generator29,39 or from the derivative of a transfer operator at lag-time 𝜏 = 0. If an infinitesimal generator exists, then the infinitesimal generator is in principle the matrix logarithm of the transfer operator.29,40 Expanding the matrix logarithm into a Taylor series, one can construct the first order approximation of this EMC by deleting the diagonal elements of the transition matrix 𝐏 and rescaling the rows of the matrix, such that we end up with a stochastic matrix 𝐊. The EMC 𝐊 does not account for trajectories that start in a microstate and end in the same microstate after the lag-time. Using the previously described analysis on 𝐊 instead of 𝐏, we can see that cyclic processes become visible.

3. Application: Amyloid β (1-40) peptide in an oscillating electric field 3.1. System setup and simulation details The system setup was performed utilizing GROMACS 4.6.741 and the AMBER99SB-ILDN42 force field, following a approach based on a protocol by English et al.43 as detailed here. An Amyloid β (1-40) peptide (Aβ40) (PDB 1BA421) in mostly helical conformation was placed in a rhombic dodecahedral box of volume 229.739 nm3 and edge length 4.21 nm (minimum image distance of 6.87 nm), with periodic boundary conditions in all directions. The peptide was

ACS Paragon Plus Environment

17

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 18 of 57

protonated appropriate to a pH of 7, with six negatively charged (Asp 7, and 23, Glu 3, 11, and 22, Val 40) and three positively charged (Arg 5, Lys 16, and 28) residues (termini charged), resulting in a total peptide charge of −3 𝑒. 7388 TIP3P water molecules, taken from a relaxed liquid configuration at 300 K and 1 bar, were added into the box. To neutralize the overall system charge 3 Na+ ions were added. Then the system was energy minimized utilizing the steepest descent method, stopping after the maximum force falls below 50 kJ . Afterwards the potential energy was well converged. Thereafter, the system was slowly heated to 300 K in 60 K steps by MD, with 20 ps duration and time step ∆𝑡 = 1 fs, coupled to a v-rescale thermostat44 with relaxation time constant 𝜏! = 0.1 ps. The initial velocities were taken from the MaxwellBoltzmann distribution at 60 K. During the heating procedure positional restraints were applied on the protein atoms that were reduced with each temperature increase step as follows: 5000, 2500, 1250, 600, and 300 kJ mol!! nm!! . Afterwards, the system was relaxed by MD, with time step ∆𝑡 = 1 fs, for 10 ns in the NPT ensemble at 300 K and 1 bar, coupled to a Nose-Hoover thermostat with relaxation time constant 𝜏! = 2.5 ps and a Berendsen barostat with relaxation time constant 𝜏! = 0.1 ps. Positional restraints of 300 kJ mol!! nm!! were applied on the protein atoms. This first 10 ns were discarded and the system was subsequently relaxed for 10 ns using the same settings, whereby the system's configuration was stored every picosecond. This second 10 ns NPT relaxation MD simulation was used to calculate averages and errors45 with the GROMACS gmx energy tool. Since the system's potential, kinetic, and total energy were all well converged, fluctuating around steady values, the data from the second 10 ns NPT equilibration was used for the next step: The frame whose volume was closest to the average Volume 𝑉 = 229.739 ± 0.0073 nm3 was selected. There was one single frame at 1109 ps that matched exactly the average volume, which was chosen as starting frame – using the

ACS Paragon Plus Environment

18

Page 19 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

positions and velocities for exact continuation – for the next equilibration step. Using this, a 10 ns NVT equilibration by MD was performed, utilizing standard GROMACS 5.1.246 with a time step of 1 fs and positional restraints of 300 kJ mol!! nm!! applied on the protein atoms. In this equilibration step two coupling groups were used: Only the water molecules and ions were coupled to a v-rescale thermostat with relaxation time constant 𝜏! = 2.0 ps, while the protein was not coupled to the thermostat, following an approach by Gallo et al.47 Thereby the protein's degrees of freedom were not disturbed by the velocity rescaling procedure, exerted by the thermostat, but only indirectly thermostated through the contact with the water molecules via collisions and interactions. In the course of this NVT equilibration run, the systems potential, kinetic, and total energy remained fluctuating around steady values. Statistics were done with the GROMACS gmx energy tool. The temperatures of the system, protein and water were 𝑇 = (300.018 ± 0.038) K, 𝑇protein = (301.09 ± 0.2) K, and 𝑇water = (299.981 ± 0.041) K, respectively. Evidently, the temperatures of the two coupling groups, protein and water, were well converged within the error margins. Hence, the criterion for thermal equilibration of a protein-water system, introduced by Gallo et al.47, was satisfied and the system was prepared for production simulations. The higher fluctuations of the protein temperature, as compared to the water temperature, were effected by the much smaller number of atoms in the protein group: 598 protein atoms vs. 22167 water and ion atoms. Overall 40×1000 ns NEMD24 production simulation runs, with time step ∆𝑡 = 1 fs, were performed, utilizing an in-house modified GROMACS 5.1.246 and the AMBER99SB-ILDN42 force field. The modification was necessary to incorporate an oscillating electric field of RMS strength 0.01 V nm!! and 1 GHz frequency, starting with zero field intensity at time zero (sine

ACS Paragon Plus Environment

19

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 20 of 57

wave). An oscillating electric field 𝐄(𝑡) couples with the electric charges 𝑞! of the atoms 𝑖 = 1, . . . , 𝑁 resulting in an 𝐄-field dependent force term in the Newton's equations of motion 𝑚! 𝐫! = −∇𝑈 + 𝑞! 𝐄 𝑡 additional to the force term −∇𝑈 derived from the MD force field 𝑈. As before in the NVT equilibration step, a two coupling group scheme was applied were the water molecules and ions were coupled to a v-rescale thermostat with relaxation time constant 𝜏! = 2.0 ps, while the protein was not coupled to the thermostat. Thereby, the influence of the external oscillating electric field on the protein's degrees of freedom was not disguised by coupling to a thermostat, while the protein was still indirectly thermostated through the contact with the water molecules via collisions and interactions. All atom coordinates and energies were stored every 10 ps for further processing and analysis. Starting from 294 conformations, selected from the preceding 40×1000 ns NEMD simulations, 881×10 ns of additional adaptive NEMD production simulation runs were performed to improve and refine the sampling. Said 294 conformations were identified from the simulation data by performing feature selection and extraction, clustering and discretization of the trajectories as described in the sections below. Resulting from the 40×1000 ns NEMD simulations, a dataset consisting of 4 million frames or conformations was gained, since we stored snapshots every 10 ps. Assuming 𝑘kmeans = 𝑁, 𝑁 equals the total number of samples, as a good guess (see section 3.2.2) for the number 𝑘kmeans of clusters in a dataset, 𝑘kmeans = 2000 was obtained. Thereby, a clustering of the conformations into 2000 microstates and a discretization of the trajectories by projection onto those microstates was achieved. Of these microstates, 100 particularly rarely visited states were selected from the discretized trajectories, termed rare states. Furthermore, we were also interested in identifying transition states, exhibiting particularly many transitions to other states in the discretized

ACS Paragon Plus Environment

20

Page 21 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

trajectories. To this purpose, a 2000×2000 transition count matrix was created from the discretized trajectories, by the method described in section 3.3.1. Based on this transition count matrix, the transition states and dominant states were identified by summing up the rows of the matrix and taking the 100 states with the 100 highest row sums, that is the highest count of transition events (including self-transitions). Thereby, states with many outgoing transitions are called transition states and states with many self-transitions are called dominant. Afterwards, the conformations nearest to the cluster centers were extracted from the trajectories as representatives of the associated microstates. From each of these trajectories, the structure nearest to the cluster center was used, since some microstates appeared in several trajectories, leaving us with more than one representative for several states. As a result, 106 conformations representing rare states and 188 conformations representing transition states were gained - in sum 294 conformations, used as starting structures for short adaptive runs. From each of these conformations three 10 ns NEMD runs were started, resulting in 882 (881, since one run terminated prematurely for technical reasons and was not considered in the further analysis) short adaptive NEMD simulations. All MD simulations were conducted using the leapfrog algorithm, as implemented in GROMACS, to integrate Newton’s equations of motion. Neighbor searching was performed utilizing the Verlet cutoff-scheme with a Verlet buffer tolerance of 0.005 kJ mol!! ps !! per atom, setting the maximum error for pair interactions per atom to this value.45 The cutoff radii for the van der Waals and Coulomb interactions were set to 1 nm and both potentials were shifted by a constant, resulting in potentials of zero at the cutoff radii. Long-range electrostatics were handled by the Particle-Mesh Ewald (PME)48,49 method with a FFT grid spacing of 0.12 nm and a PME interpolation order of 4. The relative error tolerance of the Ewald-shifted direct Coulomb

ACS Paragon Plus Environment

21

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 22 of 57

potential at the Coulomb-cutoff was set to 1×10!! . The relative dielectric constant was set to 1. Long-range dispersion corrections were applied for energy and pressure. Only the bonds with hydrogen atoms were constrained during MD using the LINCS50,51 constraint algorithm with an highest order of 4 in the expansion of the constraint coupling matrix and 1 iteration for the correction of rotational lengthening. It is particularly important not to use the SHAKE52 constraint algorithm, since it was shown by Gallo et al.47 that SHAKE creates artifacts inconsistent with an accurate equilibration.47 3.2. Data preparation To derive a MSM from a dataset, it is essential to first prepare the data appropriately by feature selection and extraction, clustering and discretization as schematized in Figure 2 and detailed in the following sections.

Data preparation Featurizer

Feature Extraction Dimension reduction TICA

Clustering

discrete trajectories

k-means

feature trajectories MD Data Figure Workflow schematizing the process of data preparation prior to the construction of a MSM2:building MSM.

Coarse grained transition matrix generation

MSM

discrete Cluster number Transition 3.2.1. Featurematrix selection and extraction trajs estimation creation When performing Markov modeling, one firstly needs to select appropriate input features for P c, minChi G-PCCA the following modeling process. Since MD simulations do not only include all atoms of the protein but also thousands of surrounding water molecules, it is common and justified to remove

ACS Paragon Plus Environment

22

Page 23 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

the water degrees of freedom for further analysis, since one is mostly only interested in the protein conformational dynamics. Nevertheless, an analysis employing 3𝑁 Cartesian coordinates of hundreds or thousands of protein atoms is still prohibitive, necessitating a dramatic reduction of the degrees of freedom. Thus, the input features used for Markov state modeling are usually the coordinates of the protein in reduced representations like dihedral angles or Calpha distances. Still, one is routinely confronted with several hundreds of dimensions, hampering a clustering of the conformations in continuous space into discrete states as required for MSM construction. Therefore, further dimension reduction or feature extraction by linear transformation methods, enabling selection of the relevant dimensions, associated with the slow processes of conformational dynamics, is commonly performed. Time-lagged Independent Component Analysis (TICA) is a good, regarding linear methods even the best, choice for the feature extraction, because it was shown to be an optimal linear projection method to identify the slow reaction coordinates and to approximate their relaxation time scales.53 When utilizing TICA, besides the coordinates (Cartesian coordinates, dihedral angles, Calpha distances), one can include further features, typically like RMSD, contacts between amino acid residues etc., as input for the TICA transformation.54 It is even possible to include all the mentioned features by simply concatenating them into an input feature vector as stated by Scherer54, since one can remove possible redundant information via TICA. Since we were attempting to model a protein under the influence of an external oscillating electric field, we decided for an uncommon input feature selection: First – insofar not unusual – we included the 218 sine and cosine values sin 𝜒! , cos 𝜒!

sin 𝜙 , cos 𝜙 ,

sin 𝜓 , cos 𝜓 ,

of the 39 backbone dihedral angle pairs 𝜙, 𝜓 and 31 side chain torsions 𝜒!

of Aβ40 into the input feature vector. Then, we assigned to every amino acid residue and time

ACS Paragon Plus Environment

23

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 24 of 57

frame the secondary structure type, utilizing the DSSP55 algorithm as implemented in MDTraj56 1.8.0. The DSSP assignment character codes were converted to real values, since the TICA input features vector ought to be in the real coordinate space ℝ!feat , with 𝑛feat the number of input features, as follows: α-helix

H → 3.0, isolated β-bridge

B → 7.0, extended β-strand

participating in a β-ladder E → 8.0, 310-helix G → 2.0, 𝜋-helix I → 4.0, hydrogen bonded turn T → 5.0, bend S → 0.0, loops and irregular structure elements C → −1.0. Thus, we got a 40element (one element for each of the 40 residues of Aβ40) column vector with a real value out of {−1.0, 0.0, 2.0, 3.0, 4.0, 5.0, 7.0, 8.0} describing the secondary structure of every time frame. Afterwards, we joined this 40-element vector to the input feature vector. Finally, we appended the input feature vector of every analyzed time frame by the total dipole moment 𝑝total 𝑡 , the Cartesian dipole moment components 𝑝! 𝑡 , 𝑝! 𝑡 , 𝑝! 𝑡 , the radius of gyration 𝑅! 𝑡 , the total number 𝑁HB 𝑡 of hydrogen bridge bonds, and the 7 numbers 𝑁HB !,! 𝑡 , … , 𝑁HB !,!!! 𝑡 of 𝑛 to 𝑛 + 𝑖 hydrogen bridge bonds, where 𝑛 and 𝑛 + 𝑖 are residue numbers and 𝑖 = 0, … ,6. Thereby, we ended up with a 271-element feature input vector in the ℝ!"# for every time frame. Afterwards, this was used as input for TICA using PyEMMA54 2.2.3, resulting in linearly transformed new coordinate vectors. We only kept the first 20 dominant TICA dimensions corresponding to the slowest reaction coordinates and longest relaxation timescales. Furthermore, we scaled the TICA coordinates as suggested by Scherer et al.54 to get a kinetic map, i.e., obtaining coordinates that span a metric space were geometric and kinetic distances are proportional, to enhance geometric clustering as performed later on. 3.2.2. Clustering and discretization of the trajectories Since MSMs are build by counting transitions between discrete states, the TICA-transformed input data needs to be discretized. This is done by a Voronoi tesselation via clustering with

ACS Paragon Plus Environment

24

Page 25 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

!

kmeans standard methods like the k-means algorithm57, thereby inducing a decomposition {Ω! }!!! of

the underlying conformational space into disjoint subsets. To discretize the data, we performed k-means57 clustering with the kmeans++ initialization in the 20-dimensional space, spanned by the TICA coordinates, utilizing PyEMMA 2.2.3. The number of clusters was chosen to 𝑘kmeans = 2210, based on the suggestion by Scherer et al.54 to assume 𝑘kmeans = 𝑁, 𝑁 equals the total number of samples in the input data. Taking a look on Table 1, this choice appears reasonable: There, the dependence of the optimal number 𝑘 of states, as selected by the crispness 𝜉 and the minPc indicator (see section 2.2), of the projected coarse MSM on the number 𝑘kmeans of k-means clusters is shown for a transition matrix creation lag time 𝜏 = 1 ns (see section 3.3.1) and different TICA lag times 𝜏TICA . From Table 1, it is apparent that 𝑘kmeans = 2000 is not a sufficient number of k-means clusters, since the optimal numbers of states of the coarse MSMs are systematically underestimated compared to 𝑘kmeans = 2210. At the same time, it is not justified to increase 𝑘kmeans over 𝑘means = 𝑁, since this will lead to insufficient statistics. Here, every frame of all 40×1000 ns and 881×10 ns trajectories – equals the conformational state every 10 ps – was used for clustering, i.e., 𝑁 = 4.881×10! . Thus, 𝑘kmeans = 𝑁 = 2210 appears as a reasonable choice of the number of k-means clusters. The procedure to only store and utilize the conformational state of the trajectories every 10 ps is sensible to remove fluctuations from the dataset used for analysis, since we are not interested in the typically nonMarkovian short-time dynamics, which only increase the variance in the conformations, worsening the quality of the analysis. Next, the trajectories were discretized applying a Voronoi discretization by assigning each frame to the closest cluster center, resulting in discrete trajectories were every frame is represented by the integer number 𝑖center ∈ 1, … , 2210 , representing the corresponding cluster center.

ACS Paragon Plus Environment

25

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 26 of 57

Table 1: Comparison of optimal numbers 𝑘 of MSM states for 𝑘kmeans = 2000, 𝑘kmeans = 2210 and 𝜏TICA = 0.5 ns, 𝜏TICA = 1 ns. 𝑘kmeans 2000

𝜏TICA = 0.5 ns

𝜏TICA = 1 ns

{56, … , 61, 63, 65} {54, … , 64, 66, … , 69}

2210

{67, … , 71}

{68, … , 73, 76, 77}

All transition matrices were created from Aβ40 MD data featurized and transformed as described in the preceding section 3.2.1. In all cases, MSMs were created using G-PCCA and the optimal numbers 𝑘 of states of the coarse MSMs were selected by both, the crispness and the minPc indicator.

DataMSM preparation 3.3. building After data preparation, as Feature described before, the coarse-grained MSM of the discrete projected Markov

Featurizer

Extraction Clustering trajectories Dimension reduction chain is generated by transition matrix creation, number of states estimation and projection (the k-means

TICA

actual coarse graining) of the transition matrix by the G-PCCA algorithm, as schematized in Figure 3 and described in the subsequent sections.

feature trajectories

MD Data

MSM building

discrete trajs

Transition matrix creation

Cluster number estimation

Coarse grained transition matrix generation

minChi

G-PCCA

MSM

Pc ,

Figure 3: Workflow schematizing the building of a coarse-grained MSM. 3.3.1 Count matrix and transition matrix creation After discretizing the trajectories onto a jump process between 𝑘kmeans = 2210 microstates (cluster centers), the protein kinetics can be approximated by a MSM. Therefore, a 2210×2210

ACS Paragon Plus Environment

26

Page 27 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

conditional transition count matrix 𝐂 = 𝑐!"

!,!!!,…,!

was build, based on the discretized

trajectories. Starting at time 𝑡 = 0 in one microstate, the current microstate was checked every lag time 𝜏, updating the transition count matrix as follows: If the protein was in microstate 𝑖 at time 𝑡 and is in microstate 𝑗 at time 𝑡 + 𝜏, then the value of the transition count matrix element 𝑐!" is increased by 1. The transition count matrix 𝐂 was accumulated from all 40×1000 ns and 881×10 ns discretized trajectories, using the sliding window approach and a lag time of 𝜏 = 1 ns, fitting the period 𝜏!" = 1 ns of the oscillating electric field, which means that 𝐂 was assembled from 𝑁 = 4.881×10! counts. In the sliding window approach, one constructs the transition count matrix from all available data: Assume that the conformations were sampled at times 𝑡! = 𝑡! + 𝑛 Δ𝑡, 𝑛 ∈ {1, … , 𝑁} with Δ𝑡 < 𝜏, 𝜏 = 𝑚Δ𝑡, 𝑚 ≪ 𝑁, then transitions between states 𝑠 𝑡! at times 𝑡! are counted as 𝑠 0 → 𝑠 𝜏 , 𝑠 Δ𝑡 → 𝑠 Δt + 𝜏 , 𝑠 2Δ𝑡 → 𝑠 2Δt + 𝜏 … .58 As detailed in the literature9,59, this approach underestimates model uncertainty, but results in increased precision of the transition probability estimates compared to sampling of the conformations only at multiples of the lag time 𝜏. In the case of NEMD, driven by a periodic outer stimulus, sampling the trajectories with the sliding window approach and a lag time 𝜏, equals the period of the stimulus, denotes a time averaging over the period. Thus, by application of this approach, the coarse (in the augmented space) long-term dynamics on the scale of the period of the outer stimulus are obtained, time-averaged over the period of the driving force. From the transition count matrix, the row stochastic transition Matrix 𝐏 was estimated9 by 𝐏 = diag 𝑐! with 𝑐! ∶=

!kmeans 𝑐!" , !!!

!!

𝐂

the row sums of 𝐂.

ACS Paragon Plus Environment

27

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 28 of 57

Extraction of the embedded Markov chain To facilitate the identification of dominant cycles, additionally, the transition matrix 𝐊 of the EMC was calculated by subtracting the diagonal elements of the transition count matrix 𝐂 𝐂EMC = 𝐂 − diag(diag 𝐂 ) and making it stochastic by 𝐊 = diag 𝑐! with 𝑐! ∶=

!kmeans 𝑐!" , !!!

!!

𝐂EMC

the row sums of 𝐂EMC .

3.3.2. Estimation of the number of states via the minChi criterion Since G-PCCA requires the number of states as input, which is beforehand unknown, it needs to be estimated. To estimate the number of Markov states, the minChi-criterion, defined in section 2.2, was used. Utilizing this, the number of states can be estimated by the number 𝑘 that maximizes the minChi-value or at least a number 𝑘 with minChi close to zero.33 The polymorphic monomeric Aβ40 shows very rich conformational dynamics between a multitude of states, particularly when starting in the here considered helical state21, which is not stable in water.20,22,23,60 Since a large amount of simulations was conducted, the observation of extensive dynamics between an ample set of conformational states was to be expected. This expectation was underpinned by inspection of DSSP plots of the trajectories, which showed many transitions between various metastable conformations. Therefore, a number 𝑘 of states was aimed for, which is large enough to describe this behavior, but not to large to sufficiently reduce the complexity of the underlying data and unravel the actually relevant dynamics.

ACS Paragon Plus Environment

28

Page 29 of 57

0.0

b) 0.0

0.2

0.2

0.4

0.4

0.6

0.6

minChi

a)

minChi

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

0.8

0.8

1.0

1.0

1.2

1.2

1.4

1.4

1.6

0

20

40 60 80 number of clusters

100

1.6

0

20

40 60 80 number of clusters

100

Figure 4: minChi plotted over the number 𝑘 of states for the full Markov chain with a sliding window lag time 𝜏 = 1 ns and a TICA lag time (a) 𝜏TICA = 1 ns, (b) 𝜏TICA = 0.5 ns. Figure 4 shows minChi as a function of 𝑘 for the full Markov chain. It can be seen that in Figure 4a 𝑘 = 64 till 𝑘 = 80 (𝜏TICA = 1 ns, 𝜏 = 1 ns) and in Figure 4b 𝑘 = 67 till 𝑘 = 76 (𝜏TICA = 0.5 ns, 𝜏 = 1 ns) fulfill the minChi-criterion sufficiently well. To test this assumption, spectral clustering with G-PCCA for 𝑘 = 64, … , 80 (𝜏TICA = 1 ns, 𝜏 = 1 ns) and 𝑘 = 66, … , 77 (𝜏TICA = 0.5 ns, 𝜏 = 1 ns) was performed in the case of the full Markov chain. In the second case, we extended the range of numbers of states on both sides into the non-optimal area to show the sensitivity of the minChi indicator.

ACS Paragon Plus Environment

29

Journal of Chemical Theory and Computation

b)

0.0

0.0

0.2

0.2

0.4

0.4

minChi

a)

minChi

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

0.6 0.8

0.6 0.8

1.0

1.0

1.2

1.2

1.4

0

20

40 60 80 number of clusters

100

Page 30 of 57

1.4

0

20

40 60 80 number of clusters

100

Figure 5: minChi plotted over the number 𝑘 of states for the EMC with a sliding window lag time 𝜏 = 1 ns and a TICA lag time (a) 𝜏TICA = 1 ns, (b) 𝜏TICA = 0.5 ns. Figure 5 shows minChi as a function of 𝑘 for the EMC. One recognizes that in Figure 5a 𝑘 = 42 till 𝑘 = 57 (𝜏TICA = 1 ns, 𝜏 = 1 ns) and in Figure 5b 𝑘 = 39 till 𝑘 = 60 (𝜏TICA = 0.5 ns, 𝜏 = 1 ns) fulfill the minChi-criterion sufficiently well. To test this assumption, spectral clustering with G-PCCA for 𝑘 = 42, . . . , 57 (𝜏TICA = 1 ns, 𝜏 = 1 ns) and 𝑘 = 39, … , 60 (𝜏TICA = 0.5 ns, 𝜏 = 1 ns) was performed in the case of the EMC. 3.3.3. Fuzzy clustering and generation of the coarse grained Markov transition matrix via G-PCCA After estimation of the number 𝑘 of states via the minChi criterion, the G-PCCA algorithm is used to calculate the membership matrix 𝝌 and the coarse-grained transition matrix 𝐏! as described in the following. First, the initial distribution of the Markov process is determined by 𝜼=

!kmeans 𝑐!" !!! !kmeans !kmeans 𝑐!" !!! !!! !

from the transition count matrix 𝐂. Then, the initial distribution 𝜼, the row-stochastic transition matrix 𝐏, the minimum number 𝑘min and maximum number 𝑘max of macrostates (fuzzy clusters)

ACS Paragon Plus Environment

30

Page 31 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

and some structures wk and iopt, containing input parameters, are used as input to the MATLAB routine pcca.m. This function executes the G-PCCA algorithm as detailed in the flowchart shown in Figure 6, which is summarized in the following.

ACS Paragon Plus Environment

31

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 32 of 57

input: D, η, kmin, kmax, wk, iopt

BEGIN pcca.m

do_schur.m 0.5 ̃ compute P=D PD−0.5 ,D=diag( η )

̃ compute Schur decomposition of P

order Schur matrix Λ with respect to the unit circle

SRSchur.m

gram_schmidt_mod.m orthonormalize the ̃ DΧ ̃ =I ordered Schur verctors Χ

̃ compute Χ =D −0.5 Χ ̃ from the Schur vectors Χ

generate starting guess A0

initialize_A.m

use_minChi.m apply minChi criterion for cluster numbers k ∈[k min ,k max ]

indexsearch.m

cluster_by_isa.m

choose new kmin and kmax interactively optimization_loop.m

k =k min no

k ⩽k max

compute objective function fk

yes opt_soft.m

k → k +1

n c =argmax choose new kmin, kmax

fillA.m

make A feasible

objective.m

postprocess.m

compute Pc write Pc , χ , A , f k , ξ to fles plot Pc ,χ , ...

k −f k k

yes fag>0

no

output: Pc , χ ,... from the last optimization

END

ACS Paragon Plus Environment

32

Page 33 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Figure 6: Flowchart describing the G-PCCA algorithm as implemented in the routine pcca.m and its subroutines. First, the Schur vectors of 𝐏 are calculated, sorted and orthonormalized as detailed in section 2.1 by the subroutine do_schur.m. Afterwards, a starting guess 𝐀! of the transformation matrix is generated and the inner simplex algorithm15,61 is utilized to calculate 𝝌(!) and minChi, as detailed in section 2.2, for every number 𝑘min ≤ 𝑘 ≤ 𝑘max of states, using the subroutine use_minChi.m. This gives minChi depending on the number 𝑘 of states, which is used to interactively choose new 𝑘min and 𝑘max from an interval were minChi is very close to zero. Now, the actual optimization of the transformation matrix 𝐀 is performed in a loop for 𝑘 between 𝑘min and 𝑘max , by the subroutine optimization_loop.m. During this, an objective function 𝑓! = 𝑘 − trace 𝐒 = 𝑘 − trace 𝐃𝝌! 𝐃𝛘 is optimized, utilizing either the Gauss-Newton solver NLSCON37,38 or the MATLAB Nelder-Mead solver fminsearch. The optimization loop can either be performed serially or in parallel as a parfor loop, reducing computation time significantly on a current multicore PC unit. In the course of the optimization loop, the projected transition matrix 𝐏! = 𝝌! 𝐃𝝌

!!

𝝌! 𝐃𝐏𝝌 and the crispness 𝜉 =

!!!! !

are calculated together

with the membership matrix 𝝌, the optimized transformation matrix 𝐀 and the optimal value of the objective function 𝑓! , by the subroutine postprocess.m. If selected, an additional optional optimization (𝑘min = 𝑘max ) or optimization loop (𝑘min < 𝑘max ) can be performed by interactively setting 𝑘min and 𝑘max . This is useful, if one needs to conduct many optimizations, i.e., in case when minChi is nearly zero for a large interval of numbers of states. Since the Nelder-Mead algorithm, as a direct search method, can take very long to converge and is not guaranteed to find a global minimum, but only a local one close to the starting guess,33

ACS Paragon Plus Environment

33

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 34 of 57

it can be reasonable to first perform an optimization loop using Gauss-Newton with only a few steps. In our study, 10 Gauss-Newton steps or less already sufficed to find a solution near a local minimum. Second, a final Nelder-Mead optimization for a number of states or a range of numbers of states with high crispness 𝜉 can be performed, as suggested by Röblitz and Weber33. Another possibility could be to first perform a loop of Nelder-Mead optimizations for a limited number of steps, to improve the initial guess 𝐀! , and afterwards perform a second optimization loop using the Gauss-Newton method. Thereby, the chances for convergence could possibly be improved, since the success of the Gauss-Newton algorithm can depend strongly on the quality of the initial guess. In our study, we experimented with both of the above optimization strategies: Following strategy one, a range of numbers 𝑘min ≤ 𝑘 ≤ 𝑘max with minChi close to zero was chosen and a first optimization loop was performed using the Gauss-Newton method with a maximum of 10 Gauss-Newton steps. Subsequently, a range of numbers 𝑘min ≤ 𝑘 ≤ 𝑘max with high crispness 𝜉(𝑘) was chosen to perform a final optimization with the Nelder-Mead algorithm. Applying strategy two, again a range of numbers 𝑘min ≤ 𝑘 ≤ 𝑘max with minChi close to zero was chosen, but a first optimization loop was performed using the Nelder-Mead algorithm with a maximum of MaxIter = MaxFunEvals = 50× 𝑘 − 1

!

iterations and function evaluations. Following, a

second optimization with the Gauss-Newton method was executed with a maximum of 10 Gauss-Newton steps for a range of numbers 𝑘min ≤ 𝑘 ≤ 𝑘max with high crispness 𝜉(𝑘). Upon application of both strategies, it became apparent that already the application of only one GaussNewton optimization with a maximum of 10 Gauss-Newton steps sufficed to find a solution near a local minimum, as will be detailed in the following section. This solution was not significantly improved by performing a Nelder-Mead optimization with MaxIter = MaxFunEvals =

ACS Paragon Plus Environment

34

Page 35 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

200× 𝑘 − 1

!

or the above detailed combination of a Nelder-Mead optimization with

MaxIter = MaxFunEvals = 50× 𝑘 − 1

!

and a subsequent Gauss-Newton optimization.

3.4. Selection of the optimal number of states via the crispness and the minPc criterion After performing the optimization of the transformation matrix 𝐀 for a range of numbers of clusters with almost feasible (minChi ≈ 0) starting guess 𝐀! , one has to decide for an optimal number of (fuzzy) clusters tantamount to the number of macrostates of the final coarse grained MSM. Therefore, as discussed in section 2.2, it is essential to counterbalance two important requirements: First, an as crisp as possible fuzzy clustering of the microstates into macrostates by the G-PCCA algorithm, quantified by the crispness indicator 𝜉(𝑘) ∈ [0,1], were 𝜉(𝑘) = 0 indicates no separation of the microstates into macrostates at all and 𝜉(𝑘) = 1 denotes a crisp clustering in the sense of a Voronoi clustering. Second, a projection of the original dynamics between the microstates onto dynamics between the macrostates that fulfills Markovianity, meaning that the projected process can be regarded as a Markov chain. This is indicated by nonnegativity of the projected matrix 𝐏! = 𝝌! 𝐃𝝌

!!

𝝌! 𝐃𝐏𝝌 . Implying that significant negative

elements in 𝐏! indicate non-Markovianity of the projected process. The softening of the demanded non-negativity by the adjective "significant" denotes the necessity to regard the finite precision of the numerical implementation: Since most parts of the computation were performed in 64 bits double precision, the precision is limited to 15 or 16 digits and further reduced by numerical error, accumulated during extensive computation. Hence, negative entries of 𝐏! with a magnitude of 10!!" and below were considered as insignificant regarding numerical precision, assuming the corresponding projected process as Markovian. To check for Markovianity, the minPc-criterion, defined in section 2.2, was used. Following the above argumentation, a process can be assumed Markovian, if minPc is non-negative within numerical precision, which led to

ACS Paragon Plus Environment

35

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 36 of 57

𝐏! Markovian ⇔ minPc ≥ −10!!" in our case. In our study, as described in the sections above, the dynamics of Aβ40 under the influence of an oscillating electric field were simulated, the conformations were clustered into 2210 microstates, the trajectories were discretized by projection onto those microstates, the transition matrix was sampled at lag time 𝜏 = 1 ns with the sliding window approach, a infeasible guess of the transformation matrix 𝐀! was generated for numbers of states between 2 and 100, the minChi criterion was applied to identify the numbers of states with almost feasible 𝐀! and the transformation matrix 𝐀 was optimized in this range of numbers to identify the optimal number of states by means of the crispness 𝜉(𝑘) and the minPc indicator. This procedure was performed for the full Markov chain and also for the EMC. First, the case of the full Markov chain is discussed in the following. Here, optimizations utilizing the Gauss-Newton method were executed with a maximum of 10 Gauss-Newton steps between 𝑘min = 64 and 𝑘max = 80 (𝜏TICA = 1 ns, 𝜏 = 1 ns) and between 𝑘min = 66 and 𝑘max = 77 (𝜏TICA = 0.5 ns, 𝜏 = 1 ns), as identified by the minChi criterion in section 3.3.2. In the second case, the range of numbers of states was extended from 67 ≤ 𝑘 ≤ 76 into the nonoptimal area to show the sensitivity of the minChi indicator. The thereby obtained values of the crispness and minPc indicators as functions of the number of states 𝑘 are shown in Figure 7.

ACS Paragon Plus Environment

36

Page 37 of 57

4 6 8 10 12 65

70 75 number of clusters

80

0.60

crispness

b)

0.510 0.495 0.480 0.465 0.450

0.45

log10 (|minPc|)

crispness

a)

log10 (|minPc|)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

2 4 6 8 10 12

0.30 0.15

66

68

70 72 74 number of clusters

76

Figure 7: Crispness vs. minPc (logarithmic) over the number 𝑘 of states for the Markov chain with a sliding window lag time 𝜏 = 1 ns and a TICA lag time (a) 𝜏TICA = 1 ns, (b) 𝜏TICA = 0.5 ns. As one can clearly extract from the Figure 7a, 𝑘 = 69, 𝑘 = 76 and 𝑘 = 77 can be assumed optimal numbers of states of the projected process, for a TICA lag time 𝜏TICA = 1 ns and a sliding window lag time 𝜏 = 1 ns used for transition count matrix sampling. Those decompositions are reasonably crisp, considering the high number of states, and the associated projected transition matrices 𝐏! fulfill the minPc criterion within the numerical precision, allowing for the assumption of Markovianity of the projected processes. The projection of a dynamical process between microstates onto a process between macrostates, compiled into a MSM, is typically undertaken to represent the underlying process in an as simple as possible fashion. Thus, it was decided upon 𝑘 = 69 as optimal number of cluster. Likewise, one can extract from Figure 7b that 𝑘 = 69 can be presumed as an optimal number of states of the projected process, for a TICA lag time 𝜏TICA = 0.5 ns and a sliding window lag time 𝜏 = 1 ns used for transition count matrix sampling. The decomposition is reasonably crisp and the associated projected transition matrix 𝐏! fulfills the minPc criterion within the numerical

ACS Paragon Plus Environment

37

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 38 of 57

precision. Consequently, 𝑘 = 69 appears as an optimal number of states for clustering with GPCCA regardless of the TICA lag time 𝜏TICA , indicating robustness of our method against the parameter 𝜏TICA . Concerning the sensitivity of the minChi indicator, one can conclude from Figure 7b that it is indeed a good indicator for the crispness of the decomposition associated to a given number of states, since the crispness degrades strongly for 𝑘 = 66 and 𝑘 = 77, which were assessed as clearly non-optimal via the minChi indicator in section 3.3.2. Comparing Figure 7 with Figure 4, one can even conclude with a high degree of certainty that the closer to zero the minChi indicator the higher the crispness. Second, the case of the EMC is discussed. Here, it was experimented with different optimization schemes as discussed in the preceding section. Frist, optimizations utilizing solely the Gauss-Newton method were executed, with a maximum of 10 Gauss-Newton steps, between 𝑘min = 42 and 𝑘max = 57 (𝜏TICA = 1 ns, 𝜏 = 1 ns) and between 𝑘min = 39 and 𝑘max = 60 (𝜏TICA = 0.5 ns, 𝜏 = 1 ns), as identified by the minChi criterion in section 3.3.2. The thereby obtained values of the crispness and minPc indicator, as functions of the number 𝑘 of states, are shown in Figure 8. Additionally, we performed a combination of a Nelder-Mead optimization with MaxIter = MaxFunEvals = 50× 𝑘 − 1

!

and a subsequent Gauss-Newton optimization

with a maximum of 10 Gauss-Newton steps between 𝑘min = 39 and 𝑘max = 60 (both 𝜏TICA = 1 ns, 𝜏 = 1 ns and 𝜏TICA = 0.5 ns, 𝜏 = 1 ns). The wider range of numbers of states, compared to the optimality region 42 ≤ 𝑘 ≤ 57 in the case 𝜏TICA = 1 ns, 𝜏 = 1 ns, as determined by the minChi criterion, was chosen to exemplify the validity of the minChi criterion. The thus obtained values of the crispness and minPc indicators, as functions of the number 𝑘 of states, are shown in Figure 8, comparing the solely Gauss-Newton optimization with the Nelder-Mead optimization only, and Figure 9, showing both Nelder-Mead and subsequent Gauss-Newton optimizations.

ACS Paragon Plus Environment

38

Page 39 of 57

b)

crispness

0.75 0.60 0.45 0.30 0.15 0.0 2.5 5.0 7.5 10.0 12.5

log10 (|minPc|)

crispness

a)

log10 (|minPc|)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

40

45 50 55 number of clusters

60

0.6 0.5 0.4 0.3 0.2

0.0 2.5 5.0 7.5 10.0 12.5 40

45 50 55 number of clusters

60

Figure 8: Crispness vs. minPc (logarithmic) over the number 𝑘 of states, for the EMC, obtained via Gauss-Newton optimization (black dots) vs. Nelder-Mead optimization (red crosses) with a sliding window lag time 𝜏 = 1 ns and a TICA lag time (a) 𝜏TICA = 1 ns, (b) 𝜏TICA = 0.5 ns. As one can clearly extract from Figure 8a, 𝑘 = 42 till 𝑘 = 52 can be assumed as optimal numbers of states of the projected process, in the case of a TICA lag time 𝜏TICA = 1 ns and a sliding window lag time 𝜏 = 1 ns, with a little accentuated maximum of the crispness for 𝑘 = 49. Regarding Figure 8b, in the case of 𝜏TICA = 0.5 ns and 𝜏 = 1 ns, one can gather that 𝑘 = 39 till 𝑘 = 49 can be assumed as optimal numbers of states. All of those decompositions are reasonably crisp, considering the high number of cluster, and the associated projected transition matrices 𝐊 ! of the EMC fulfill the minPc criterion within the numerical precision, allowing for the assumption of Markovianity of the projected processes. One could now either, for the sake of simplicity as above, argue for 𝑘 = 42 as the preferable number of states or choose 𝑘 = 49, since it presents the maximum of the crispness for 𝜏TICA = 1 ns, 𝜏 = 1 ns and is also elevated for 𝜏TICA = 0.5 ns, 𝜏 = 1 ns. We will take both possibilities into account and discuss them in more detail in the subsequent section 3.5. Figure 8a and Figure 8b exhibit largely overlapping

ACS Paragon Plus Environment

39

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 40 of 57

optimality ranges, regardless of the TICA lag time 𝜏TICA , again (compare Figure 7) indicating robustness of our method against the parameter 𝜏TICA .

Figure 9: Crispness vs. minPc as a function of the number 𝑘 of states for the EMC with a TICA lag time 𝜏TICA = 1 ns and a sliding window lag time 𝜏 = 1 ns, obtained from a combination of Nelder-Mead optimization (red crosses) and subsequent Gauss-Newton optimization (black dots). Again, one can conclude from Figure 9 (compare Figure 7b) that minChi is indeed a good indicator for the crispness of a decomposition corresponding to a given number of states, since the crispness degrades significantly for 39 ≤ 𝑘 ≤ 41 and 58 ≤ 𝑘 ≤ 60, which were assessed as clearly non-optimal via the minChi indicator in section 3.3.2. By reference to Figure 8 and Figure 9, one can also compare the different optimization strategies discussed earlier in more detail: From Figure 8, one can conclude that optimization via the Gauss-Newton method and optimization via the Nelder-Mead algorithm both lead to similar values of the crispness indicator. Concerning the minPc indicator, the Gauss-Newton method appears as comparable, or even superior for some numbers of states, to the Nelder-Mead algorithm. That is why it was decided in favor of the Gauss-Newton method, since it takes

ACS Paragon Plus Environment

40

Page 41 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

significantly less processor time to execute. Regarding the strategy to combine a first NelderMead optimization with a subsequent Gauss-Newton optimization, one can conclude from Figure 9 compared to Figure 8a that this optimization scheme provides no benefit over a single Nelder-Mead or Gauss-Newton optimization, at least for the example discussed here. 3.5. Metastable states and dominant cycles Considering the example system, Aβ40 under the influence of an oscillating electric field with a frequency of 1 GHz tantamount to a period of 1 ns, one is not only interested in the identification of metastable states, as usual for equilibrium systems, but in addition, one searches for cyclic processes, like the periodic switching between conformations. Thus, one aims to identify dominant structures in a more general sense than just metastability. To this purpose, G-PCCA was utilized to cluster the microstates corresponding to transition patterns they have in common, thereby identifying metastabilities and dominant cycles. Referring to the previous section 3.4, a closer look is taken onto certain projected transition matrices, representing the underlying dynamics effectively coarse-grained as transition processes between metastable states and - as one will see - as dominant cycles. First, lets take a look at the projected transition matrix in Figure 10 of the full Markov chain for 𝑘 = 69, which was identified as optimal number of states earlier.

ACS Paragon Plus Environment

41

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

a)

b)

1

1

10

10

20

20

30

30

40

40

50

50

60

60

69

Page 42 of 57

69 1

10 10 10

17

18

10

20 15

10

10

17

30 13

10

40

11

10

10

16

9

50

10

7

10

15

10

60 5

10

10

14

3

69 10

1

10

1 10

13

10 18

10

10

19

20 16

30

10

14

10

18

10

12

10

10

17

40 10

10

10

50 8

16

10

6

10

60 10 15

4

69

10

10

2

100

14

Figure 10: Projected transition matrix of the Markov chain for 𝑘 = 69 with a sliding window lag time 𝜏 = 1 ns and a TICA lag time (a) 𝜏TICA = 1 ns, (b) 𝜏TICA = 0.5 ns. Positive elements are shown in red to yellow logarithmic color scale, negative elements are shown in white to blue logarithmic color scale. In Figure 10, one can see that both projected transformation matrices are diagonally dominant and approximately similar to the unit matrix. Thereof, one can conclude that these matrices represent a Markov process between 69 metastable macrostates, showing no cyclic processes. In order to reveal the cyclic sub-processes, as detailed in section 2.4, we computed the EMC of the process, as exemplified in section 3.3.1. That is why a look on Figure 11 is taken at the projected transition matrices, obtained with a TICA lag time 𝜏TICA = 1 ns and a sliding window lag time 𝜏 = 1 ns, of the EMC for 𝑘 = 42 and 𝑘 = 49, which were identified as optimal numbers of states by both the crispness and the minPc indicator earlier.

ACS Paragon Plus Environment

42

Page 43 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

a)

b) 1

1

10

10

20 20 30 30 40 40

49 1

10

10 10

17

18

10

15

10

10

17

20 13

10

11

10

10

16

30 9

10 10

7

15

10

40 5

10

10

14

3

10

1 1

10

13

10

10

17

10

18

10

20

15

10

10

17

13

10 10

11

16

30 10

9

10

10

15

40 7

10 10

5

10 14

49 3

10

10

1

13

Figure 11: Projected transition matrix of the EMC for a TICA lag time 𝜏TICA = 1 ns and a sliding window lag time 𝜏 = 1 ns (a) for 𝑘 = 42 and (b) for 𝑘 = 49. Positive elements are shown in red to yellow logarithmic color scale, negative elements are shown in white to blue logarithmic color scale. Considering Figure 11, it can be seen that the projected matrices are no longer purely diagonally dominant but resemble partially diagonal permutation matrices, as can be expected for Markov process showing both metastabilities and dominant cycles. Only 2-cycles can be seen, which lead to projected matrices 𝐊 ! having high entries symmetric to the diagonal. A 𝑘cycle or transposition of some set 𝑆 is a cyclic permutation of 𝑆, which cyclically maps the elements of some subset 𝑀 ⊆ 𝑆 with cardinality |𝑀| = 𝑘 to each other, while mapping all other elements to themselves. For example, the permutation 𝜎 = (1 2)(3)(4) that maps the sequence (1, 2, 3, 4) to (2, 1, 3, 4) is a 2-cycle of the set 𝑆 = {1,2,3,4} swapping the elements of 𝑀 = {1,2}

ACS Paragon Plus Environment

43

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 44 of 57

and keeping the other elements fixed. There are overall 7 dominant 2-cycles and 35, in case 𝑘 = 42, respectively 42, in case 𝑘 = 49, metastabilities clearly recognizable. Finally, it is worthwhile to consider the projected transition matrix in Figure 12, obtained with a TICA lag time 𝜏TICA = 0.5 ns and a sliding window lag time 𝜏 = 1 ns, of the EMC for 𝑘 = 49, as identified as an optimal number of states by the crispness and minPc indicators earlier.

Figure 12: Projected transition matrix of the EMC for a TICA lag time 𝜏TICA = 0.5 ns and a sliding window lag time 𝜏 = 1 ns for 𝑛! = 49. Positive elements are shown in red to yellow logarithmic color scale, negative elements are shown in white to blue logarithmic color scale. Considering Figure 12, there are 7 dominant 2-cycles and 35 metastabilities evident, as in Figure 11, only distinguished by the exact positioning in the matrix. Hence, it is evident that the core result or statement extracted from the EMC appears quite robust regarding the TICA lag time 𝜏TICA , which is an important feature of the proposed method. As a consequence, the

ACS Paragon Plus Environment

44

Page 45 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

advantage of G-PCCA applied on EMCs is given by the observation that unwanted sinks and isolated microstates, present in the full Markov chain, were removed by this matrix transform, while previously hidden dominant cycles showed. G-PCCA classifies states by mutual transition patterns, thereby identifying both metastable macrostates and dominant cycles. Since the TICA coordinates were scaled as suggested by Scherer et al.54 to get a kinetic map, i.e., obtaining coordinates that span a metric space were geometric and kinetic distances are proportional, the clustering of the data into microstates respects the kinetics of the process. A strong correlation between the assignment to a macrostate and the peptide structure (conformation) was observed in this study: Microstates that were clearly assigned to the same macrostate showed strongly matching conformations. This indicates a strong correlation between kinetics and structure. A detailed discussion and depiction of the characteristics of the metastable states and dominant cycles is beyond the scope of this paper and will be addressed in a follow-up publication.

4. Conclusions The presented Generalized Perron Cluster Cluster Analysis (G-PCCA) method enables the application of the popular robust Perron Cluster Cluster Analysis (PCCA+) on non-equilibrium systems, e.g., driven by an external force, with non-reversible transition matrices. By utilizing Schur vectors instead of eigenvectors, non-symmetric transition matrices with complex eigenvalues and eigenvectors can be treated following a procedure similar to the analysis of reversible transition matrices via PCCA+. Thus, G-PCCA extends Markov state modeling from reversible MSMs of metastable states to non-reversible MSMs of metastable states and dominant cycles. In addition, by utilizing Schur vectors, G-PCCA features increased robustness, since the computation of Schur vectors is numerically stable, while the computation of eigenvectors can be

ACS Paragon Plus Environment

45

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 46 of 57

very sensitive to rounding errors, if the eigenvalues are ill-conditioned.62,63 Moreover, we identified a new indicator minPc, quantifying the Markovianity of the projected dynamics. Combined with the well-known minChi and crispness indicators, minPc enables the selection of a number of well-separated states with projected dynamics described by a Markov chain. Via GPCCA, we modeled the irreversible conformational dynamics of Aβ40, forced by an oscillating electric field, unraveling not only metastable states but also several dominant 2-cycles. The method is not limited to non-equilibrium systems, driven by a periodic external force as exemplified in this study, but can be applied to any system, reversible or non-reversible, autonomous or non-autonomous, by application of the matching spatial or spatio-temporal discretization, as insinuated in section 2.1 and detailed by Fackeldey et al18. In this study, we exploited the periodicity of the studied process, driven by a periodic force, to exemplify MSM construction in case of a non-autonomous process. It would be worthwhile to examine the effect of the relation between the period and the system's internal timescales on the MSM. Since this a very complex question, it will be addressed in a future study. For the time being, we refer to the paper of Koltai et al.13 were this topic is addressed to some extent. Further, it would be interesting to investigate the model uncertainty resulting from rarely sampled regions in the context of non-reversible Markov chains, which - however - is beyond the scope of this paper. In the case of reversible Markov chains, this uncertainty has been discussed by Metzner et al.64.

ACS Paragon Plus Environment

46

Page 47 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

AUTHOR INFORMATION Corresponding Author *Email: [email protected] Author Contributions B.R. and M.W. wrote the paper. B.R. designed, set up and performed the simulations, analysis and modeling. B.R. designed and implemented the code used in this work, partially based on code and algorithms that M.W. and S.R. provided. S.R. and K.F. revised and complemented the manuscript. All authors have given approval to the final version of the manuscript. Funding Sources Simulations were performed on the Lichtenberg high performance computer of the TU Darmstadt. Notes The MATLAB Code implemented and used during the research for this paper is available upon request from B.R. The authors declare no competing financial interest. ACKNOWLEDGMENT B.R. is grateful to the University of Kassel for financial support granted to him within a PhD scholarship.

ACS Paragon Plus Environment

47

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 48 of 57

REFERENCES (1)

Schütte, C.; A., F.; Huisinga, W.; Deuflhard, P. A Direct Approach to Conformational Dynamics Based on Hybrid Monte Carlo. J. Comput. Phys. 1999, 151, 146–168.

(2)

Swope, W. C.; Pitera, J. W.; Suits, F. Describing Protein Folding Kinetics by Molecular Dynamics Simulations. 1. Theory. J. Phys. Chem. B 2004, 108 (21), 6571–6581.

(3)

Singhal, N.; Pande, V. S. Error Analysis and Efficient Sampling in Markovian State Models for Molecular Dynamics. J. Chem. Phys. 2005, 123 (20), 204909.

(4)

Noé, F.; Horenko, I.; Schütte, C.; Smith, J. C. Hierarchical Analysis of Conformational Dynamics in Biomolecules: Transition Networks of Metastable States. J. Chem. Phys. 2007, 126 (15), 155102.

(5)

Chodera, J. D.; Singhal, N.; Pande, V. S.; Dill, K. A.; Swope, W. C. Automatic Discovery of Metastable States for the Construction of Markov Models of Macromolecular Conformational Dynamics. J. Chem. Phys. 2007, 126 (15), 155101.

(6)

Buchete, N.-V.; Hummer, G. Coarse Master Equations for Peptide Folding Dynamics. J. Phys. Chem. B 2008, 112 (19), 6057–6069.

(7)

Chodera, J. D.; Noé, F. Markov State Models of Biomolecular Conformational Dynamics. Curr. Opin. Struct. Biol. 2014, 25, 135–144.

(8)

Chatterjee, A.; Voter, A. F. Accurate Acceleration of Kinetic Monte Carlo Simulations through the Modification of Rate Constants. J. Chem. Phys. 2010, 132 (19), 194101.

(9)

Prinz, J.-H.; Wu, H.; Sarich, M.; Keller, B.; Senne, M.; Held, M.; Chodera, J. D.; Schütte,

ACS Paragon Plus Environment

48

Page 49 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

C.; Noé, F. Markov Models of Molecular Kinetics: Generation and Validation. J. Chem. Phys. 2011, 134 (17), 174105. (10)

Wu, H.; Nüske, F.; Paul, F.; Klus, S.; Koltai, P.; Noé, F. Variational Koopman Models: Slow Collective Variables and Molecular Kinetics from Short off-Equilibrium Simulations. J. Chem. Phys. 2017, 146 (15), 154104.

(11)

Nüske, F.; Wu, H.; Prinz, J.; Wehmeyer, C.; Clementi, C.; Noé, F. Markov State Models from Short Non-Equilibrium simulations—Analysis and Correction of Estimation Bias. J. Chem. Phys. 2017, 146 (9), 94104.

(12)

Wang, H.; Schütte, C. Building Markov State Models for Periodically Driven NonEquilibrium Systems. J. Chem. Theory Comput. 2015, 11 (4), 1819–1831.

(13)

Koltai, P.; Ciccotti, G.; Schütte, C. On Metastability and Markov State Models for NonStationary Molecular Dynamics. J. Chem. Phys. 2016, 145 (17), 174103.

(14)

Conrad, N. D.; Weber, M.; Schütte, C. Finding Dominant Structures of Nonreversible Markov Processes. Multiscale Model. Simul. 2016, 14 (4), 1319–1340.

(15)

Deuflhard, P.; Weber, M. Robust Perron Cluster Analysis in Conformation Dynamics. Linear Algebra Appl. 2005, 398 (1–3), 161–184.

(16)

Golub, G. H.; Van Loan, C. F. The Unsymmetric Eigenvalue Problem. In Matrix Computations; Johns Hopkins University Press: Baltimore and London, 1996; pp 341– 342.

(17)

Chicone, C. Types of Differential Equations. In Ordinary Differential Equations with

ACS Paragon Plus Environment

49

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 50 of 57

Applications; Texts in Applied Mathematics; Springer New York: New York, NY, 1999; Vol. 34, p 4. (18)

Fackeldey, K.; Koltai, P.; Nevir, P.; Rust, H.; Schild, A.; Weber, M. From Metastable to Coherent Sets - Time-Discretization Schemes; ZIB-Report; 17–74; Zuse Insitute Berlin: Berlin, Germany, 2017.

(19)

Froyland, G.; Koltai, P. Estimating Long-Term Behavior of Periodically Driven Flows without Trajectory Integration. Nonlinearity 2017, 30 (5), 1948–1986.

(20)

Nasica-Labouze, J.; Nguyen, P. H.; Sterpone, F.; Berthoumieu, O.; Buchete, N.-V.; Coté, S.; De Simone, A.; Doig, A. J.; Faller, P.; Garcia, A.; et al. Amyloid β Protein and Alzheimer’s Disease: When Computer Simulations Complement Experimental Studies. Chem. Rev. 2015, 115 (9), 3518–3563.

(21)

Coles, M.; Bicknell, W.; Watson, A. A.; Fairlie, D. P.; Craik, D. J. Solution Structure of Amyloid Beta-Peptide(1-40) in a Water-Micelle Environment. Is the Membrane-Spanning Domain Where We Think It Is? Biochemistry 1998, 37 (97), 11064–11077.

(22)

Yang, M.; Teplow, D. B. Amyloid β-Protein Monomer Folding: Free-Energy Surfaces Reveal Alloform-Specific Differences. J. Mol. Biol. 2008, 384 (2), 450–464.

(23)

Sgourakis, N. G.; Merced-Serrano, M.; Boutsidis, C.; Drineas, P.; Du, Z.; Wang, C.; Garcia, A. E. Atomic-Level Characterization of the Ensemble of the Aβ(1–42) Monomer in Water Using Unbiased Molecular Dynamics Simulations and Spectral Algorithms. J. Mol. Biol. 2011, 405 (2), 570–583.

(24)

Wang, H.; Schütte, C.; Ciccotti, G.; Delle Site, L. Exploring the Conformational

ACS Paragon Plus Environment

50

Page 51 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Dynamics of Alanine Dipeptide in Solution Subjected to an External Electric Field: A Nonequilibrium Molecular Dynamics Simulation. J. Chem. Theory Comput. 2014, 10 (4), 1376–1386. (25)

Deuflhard, P.; Huisinga, W.; Fischer, A.; Schütte, C. Identification of Almost Invariant Aggregates in Reversible Nearly Uncoupled Markov Chains. Linear Algebra Appl. 2000, 315 (1–3), 39–59.

(26)

Floquet, G. Sur Les Équations Différentielles Linéaires À Coefficients Périodiques. Ann. Sci. l’École Norm. Supérieure 1883, 12, 47–88.

(27)

Schild, A. Electron Fluxes During Chemical Processes in the Electronic Ground State. Doctoral Thesis, Free University of Berlin, 2013.

(28)

Djurdjevac, N.; Sarich, M.; Schütte, C. Estimating the Eigenvalue Error of Markov State Models. Multiscale Model. Simul. 2012, 10 (1), 61–81.

(29)

Weber, M. A Subspace Approach to Molecular Markov State Models via a New Infinitesimal Generator. Habilitation Dissertation, Free University of Berlin, 2011.

(30)

Kube, S.; Weber, M. A Coarse Graining Method for the Identification of Transition Rates between Molecular Conformations. J. Chem. Phys. 2007, 126 (2), 24103.

(31)

Weber, M.; Kube, S. Preserving the Markov Property of Reduced Reversible Markov Chains. In AIP Conference Proceedings; 2008; Vol. 1048, pp 593–596.

(32)

Weber, M.; Fackeldey, K. G-PCCA: Spectral Clustering for Non-Reversible Markov Chains; ZIB-Report; 15–35; Zuse Insitute Berlin: Berlin, Germany, 2015.

ACS Paragon Plus Environment

51

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(33)

Page 52 of 57

Röblitz, S.; Weber, M. Fuzzy Spectral Clustering by PCCA+: Application to Markov State Models and Data Classification. Adv. Data Anal. Classif. 2013, 7 (2), 147–179.

(34)

Huisinga, W. Metastability of Markovian Systems: A Transfer Operator Based Approach in Application to Molecular Dynamics. Doctoral Thesis, Free University of Berlin, 2001.

(35)

Brandts, J. H. Matlab Code for Sorting Real Schur Forms. Numer. Linear Algebr. with Appl. 2002, 9 (3), 249–261.

(36)

Weber, M.; Rungsarityotin, W.; Schliep, A. An Indicator for the Number of Clusters: Using a Linear Map to Simplex Structure. In From Data and Information Analysis to Knowledge Engineering; Spiliopoulou, M., Kruse, R., Borgelt, C., Nürnberger, A., Gaul, W., Eds.; Springer: Berlin/Heidelberg, Germany, 2006; pp 103–110.

(37)

Nowak, U.; Weimann, L. Numerical Solution of Nonlinear (NL) Least Squares (S) Problems with Nonlinear Constraints (CON), Especially Designed for Numerically Sensitive Problems, v 2.3.2.; Zuse Insitute Berlin: Berlin, Germany, 2001.

(38)

Nowak, U.; Weimann, L. A Family of Newton Codes for Systems of Highly Nonlinear Equations; Technical Report; TR-91-10; Zuse Insitute Berlin: Berlin, 1991.

(39)

Weber, M. Meshless Methods in Conformation Dynamics. Doctoral Thesis, Free University of Berlin, 2006.

(40)

Kijima, M. Markov Processes for Stochastic Modeling, 1st ed.; Stochastic modeling series; Chapman & Hall: London, UK, 1997.

(41)

Pronk, S.; Pall, S.; Schulz, R.; Larsson, P.; Bjelkmar, P.; Apostolov, R.; Shirts, M. R.;

ACS Paragon Plus Environment

52

Page 53 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

Smith, J. C.; Kasson, P. M.; van der Spoel, D.; et al. GROMACS 4.5: A High-Throughput and Highly Parallel Open Source Molecular Simulation Toolkit. Bioinformatics 2013, 29 (7), 845–854. (42)

Lindorff-Larsen, K.; Piana, S.; Palmo, K.; Maragakis, P.; Klepeis, J. L.; Dror, R. O.; Shaw, D. E. Improved Side-Chain Torsion Potentials for the Amber ff99SB Protein Force Field. Proteins Struct. Funct. Bioinforma. 2010, 78 (8), 1950–1958.

(43)

English, N. J.; Solomentsev, G. Y.; O’Brien, P. Nonequilibrium Molecular Dynamics Study of Electric and Low-Frequency Microwave Fields on Hen Egg White Lysozyme. J. Chem. Phys. 2009, 131 (3), 35106.

(44)

Bussi, G.; Donadio, D.; Parrinello, M. Canonical Sampling through Velocity Rescaling. J. Chem. Phys. 2007, 126 (1), 14101.

(45)

Abraham, M. J.; Spoel, D. van der; Lindahl, E.; Hess, B. GROMACS User Manual Version 5.1.2; Royal Institute of Technology and Uppsala University: Uppsala, Sweden, 2016.

(46)

Abraham, M. J.; Murtola, T.; Schulz, R.; Páll, S.; Smith, J. C.; Hess, B.; Lindah, E.; Lindahl, E. Gromacs: High Performance Molecular Simulations through Multi-Level Parallelism from Laptops to Supercomputers. SoftwareX 2015, 1–2, 19–25.

(47)

Gallo, M. T.; Grant, B. J.; Teodoro, M. L.; Melton, J.; Cieplak, P.; Phillips, G. N.; Stec, B. Novel Procedure for Thermal Equilibration in Molecular Dynamics Simulation. Mol. Simul. 2009, 35 (5), 349–357.

(48)

Darden, T.; York, D.; Pedersen, L. Particle Mesh Ewald: An NŊlog(N) Method for Ewald

ACS Paragon Plus Environment

53

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 54 of 57

Sums in Large Systems. J. Chem. Phys. 1993, 98 (12), 10089. (49)

Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. A Smooth Particle Mesh Ewald Method. J. Chem. Phys. 1995, 103 (19), 8577.

(50)

Hess, B.; Bekker, H.; Berendsen, H. J. C.; Fraaije, J. G. E. M. LINCS: A Linear Constraint Solver for Molecular Simulations. J. Comput. Chem. 1997, 18 (12), 1463– 1472.

(51)

Hess, B. P-LINCS: A Parallel Linear Constraint Solver for Molecular Simulation. J. Chem. Theory Comput. 2008, 4 (1), 116–122.

(52)

Ryckaert, J.-P.; Ciccotti, G.; Berendsen, H. J. . Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of N-Alkanes. J. Comput. Phys. 1977, 23 (3), 327–341.

(53)

Pérez-Hernández, G.; Paul, F.; Giorgino, T.; De Fabritiis, G.; Noé, F. Identification of Slow Molecular Order Parameters for Markov Model Construction. J. Chem. Phys. 2013, 139 (1), 15102.

(54)

Scherer, M. K.; Trendelkamp-Schroer, B.; Paul, F.; Pérez-Hernández, G.; Hoffmann, M.; Plattner, N.; Wehmeyer, C.; Prinz, J.-H.; Noé, F. PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models. J. Chem. Theory Comput. 2015, 11 (11), 5525–5542.

(55)

Kabsch, W.; Sander, C. Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features. Biopolymers 1983, 22 (12), 2577–2637.

ACS Paragon Plus Environment

54

Page 55 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

(56)

McGibbon, R. T.; Beauchamp, K. A.; Harrigan, M. P.; Klein, C.; Swails, J. M.; Hernández, C. X.; Schwantes, C. R.; Wang, L.-P.; Lane, T. J.; Pande, V. S. MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories. Biophys. J. 2015, 109 (8), 1528–1532.

(57)

Lloyd, S. Least Squares Quantization in PCM. IEEE Trans. Inf. Theory 1982, 28 (2), 129– 137.

(58)

Bowman, G. R. An Overview and Practical Guide to Building Markov State Models. In An Introduction to Markov State Models and Their Application to Long Timescale Molecular Simulation; Bowman, G. R., Pande, V. S., Noé, F., Eds.; Advances in Experimental Medicine and Biology; Springer Netherlands: Dordrecht, Netherlands, 2014; Vol. 797, pp 7–22.

(59)

Noé,

F.

Statistical

inefficiency

of

Markov

model

count

matrices.

http://publications.imp.fu-berlin.de/1699/ (accessed Jan 22, 2018). (60)

Olubiyi, O. O.; Strodel, B. Structures of the Amyloid β-Peptides Aβ 1–40 and Aβ 1–42 as Influenced by pH and a D -Peptide. J. Phys. Chem. B 2012, 116 (10), 3280–3291.

(61)

Weber, M.; Galliat, T. Characterization of Transition States in Conformational Dynamics Using Fuzzy Sets; ZIB-Report; 02–12; Zuse Insitute Berlin: Berlin, Germany, 2002.

(62)

Golub, G. H.; Van Loan, C. F. The Unsymmetric Eigenvalue Problem. In Matrix Computations; Johns Hopkins University Press: Baltimore and London, 1996; pp 320– 328.

(63)

Saad, Y. Perturbation Theory and Error Analysis. In Numerical Methods for Large

ACS Paragon Plus Environment

55

Journal of Chemical Theory and Computation 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 56 of 57

Eigenvalue Problems; Society for Industrial and Applied Mathematics, 2011; pp 70–75. (64)

Metzner, P.; Weber, M.; Schütte, C. Observation Uncertainty in Reversible Markov Chains. Phys. Rev. E 2010, 82 (3), 31114.

ACS Paragon Plus Environment

56

Page 57 of 57 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Theory and Computation

For Table of contents use only:

ACS Paragon Plus Environment

57