

Computational Chemistry

A Coarse-Grained Molecular Dynamics Approach to the Study of the Intrinsically Disordered Protein α-Synuclein. Rafael Ramis, Joaquín Ortega-Castro, Rodrigo Casasnovas, Laura Mariño, Bartolome Vilanova, Miquel Adrover, and Juan Frau. J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.8b00921 • Publication Date (Web): 01 Apr 2019. Downloaded from http://pubs.acs.org on April 2, 2019


Journal of Chemical Information and Modeling

A Coarse-Grained Molecular Dynamics Approach to the Study of the Intrinsically Disordered Protein α-Synuclein

Rafael Ramis,1,2 Joaquín Ortega-Castro,1,2* Rodrigo Casasnovas,1,2 Laura Mariño,1,2 Bartolomé Vilanova,1,2 Miquel Adrover,1,2 and Juan Frau1,2

1Institut Universitari d’Investigació en Ciències de la Salut (IUNICS), Departament de Química, Universitat de les Illes Balears, 07122 Palma de Mallorca, Spain

2Institut d’Investigació Sanitària Illes Balears (IdISBa), 07120 Palma de Mallorca, Spain

*E-mail: [email protected]

Abstract

Intrinsically disordered proteins (IDPs) are not well described by a single 3D conformation but by an ensemble of them, which makes their structural characterization especially challenging, both experimentally and computationally. Most all-atom force fields are designed for folded proteins and yield overly compact IDP conformations. α-synuclein is a well-known IDP because of its relation to Parkinson's disease (PD). To understand its role in this disease at the molecular level, an efficient methodology is needed for the generation of conformational ensembles that are consistent with its known properties (in particular, with its dimensions) and that is readily extensible to post-translationally modified forms of the protein, commonly found in PD patients. Herein, we have contributed to this goal by performing explicit-solvent, microsecond-long Replica Exchange with Solute Scaling (REST2) simulations of α-synuclein with the coarse-grained force field SIRAH, and finding that a 30% increase of the default strength of protein-water interactions yields a much better reproduction of its radius of gyration. Other known properties of α-synuclein, such as chemical shifts, secondary structure content and long-range contacts, are also reproduced. Furthermore, we have simulated a glycated form of α-synuclein to suggest the extensibility of the method to its post-translationally modified forms. The computationally efficient REST2 methodology in combination with coarse-grained representations will facilitate the simulations of this relevant IDP and its modified forms, enabling a better understanding of their roles in disease and potentially leading to efficient therapies.

1. Introduction

α-synuclein is a protein expressed predominantly in the brain and localized mainly in the presynaptic terminals of neurons.1 This protein is well known for its involvement in a set of neurodegenerative diseases, of which Parkinson's disease (PD) is the main representative. It has been widely studied since it was found to be the main component of Lewy bodies (the hallmarks of PD),2 and a large body of evidence has been found that relates it to the disease.3–5 Its primary structure, shown in Figure 1A, is composed of 140 amino acid residues and is commonly divided into three domains: the membrane-binding N-terminal domain (residues 1-66), the aggregation-related non-amyloid component (NAC) domain (residues 67-96) and the acidic C-terminal domain (residues 97-140). Its exact function is not yet known, but some have been proposed, as recently reviewed.6 Numerous studies suggest that PD is related to the formation of neurotoxic α-synuclein aggregates, in particular, small soluble oligomers that precede the formation of insoluble amyloid fibrils.7–9 It is accepted that α-synuclein exists in solution as a monomer and that some factors, such as point mutations,10–12 its interactions with metal ions13 or post-translational modifications,14–16 stabilize certain aggregation-prone conformations and accelerate this process.17–19


Figure 1: Representation of the primary structure of α-synuclein (A) and of one of its 3D structures in solution (B).

The structure of the native α-synuclein monomer is highly dependent on its surroundings: in aqueous solution at physiological pH it shows a rapid exchange between multiple conformations, and adopts a partially folded structure when bound to membranes or to other proteins, small molecules or metal ions or at lower pH.20 Figure 1B shows a possible 3D structure of α-synuclein in solution. This high conformational variability has led to its classification as an intrinsically disordered protein (IDP). α-synuclein's N-terminal region adopts an α-helical conformation upon membrane binding, while the C-terminus remains unstructured.21–23 The protein's structure in solution is more extended than a typical globular protein of the same mass, but more compact than a completely disordered structure,24 due to the presence of long-range contacts between the C-terminal and both the N-terminal and the NAC domains.25 Mass-spectrometry studies have found α-synuclein to be acetylated at the Met1 residue in samples extracted from human tissues.17,26 No significant differences have been found between N-terminally acetylated and non-acetylated α-synuclein, although some NMR studies have suggested that the former has an increased α-helical propensity at the first 12 residues and a higher affinity for membranes.27–29

Due to the high conformational variability of IDPs in general and of α-synuclein in particular, they are challenging to characterize by using common biophysical techniques.30 Therefore, molecular dynamics (MD) simulations have become a useful complement to experimental methods in this field. Classical all-atom MD simulations have been used to study IDPs, but standard all-atom force fields have some drawbacks when dealing with this kind of protein. One of the most serious is that they overstabilize compact conformations, since they have been designed to simulate globular proteins.31–33 Some improvements have been proposed to ameliorate this flaw.34–36 The Amber ff03ws force field was obtained by a simple scaling of the short-range protein-water interactions in the parent force field Amber ff03w. With this adjustment, the overall dimensions of a number of IDPs are correctly reproduced. Additional studies have provided further evidence that stronger protein-water interactions might be the key to properly simulating IDPs.37,38

All-atom MD simulations require integration time steps short enough to accurately reproduce the vibration of the high-frequency bonds (typically 2 fs), and this limits their accessible time scales. Another problem when simulating IDPs is the high computational cost of including an explicit solvent, given that these proteins are usually quite extended and, therefore, large simulation boxes with a very high number of solvent particles are needed to avoid artifacts due to the protein interacting with its periodic images. A strategy that has been adopted to face these two limitations is the use of coarse-grained models, which simplify the representation of the system by considering only a few interaction centers (coarse-grained beads) on each residue.
These coarse-grained beads have much larger masses than individual atoms in all-atom models, so vibrational frequencies of bonds are much lower and much larger time steps are affordable. Moreover, coarse-grained solvents represent several all-atom solvent molecules with a single coarse-grained one, drastically reducing the number of particles and interactions in comparison to an all-atom representation. The most popular of those models is probably the Martini coarse-grained force field for proteins.39 However, it is not suitable for studying secondary structural changes since it explicitly imposes secondary structure. Many of these models either use an implicit solvent or are structurally biased toward native structures.

SIRAH40 is a recently developed, intermediate-resolution, structurally unbiased coarse-grained force field for proteins designed to be used with the explicit water model WT4,41 and allowing time steps of 20 fs. It was reported to provide a speedup of 2 orders of magnitude in comparison with fully atomistic systems with an equivalent number of particles.40 This force field uses a standard pairwise Hamiltonian, common to most MD simulation engines, which facilitates its implementation in a variety of codes. Coarse-grained beads are placed in the positions of real atoms; this mapping strategy is expected to make the development of modified amino acids easier,40 which will be useful to study post-translational modifications. The backbone is represented with three beads on the positions of nitrogen, α carbon and oxygen, and side chains are represented with zero, one or two beads, depending on the amino acid. SIRAH is explicitly parametrized to overcome some common issues of coarse-grained models, such as the use of a uniform dielectric constant (implicit solvent), the imposition of specific constraints to maintain secondary structure, the lack of long-range electrostatic interactions or the effect of ions, all of which are important for the correct description of IDPs. However, it was not explicitly meant to be used with this kind of protein. In this work, it is demonstrated that its default parameters cannot reproduce the experimental dimensions of monomeric α-synuclein, and the strategy used in developing the Amber ff03ws force field (namely, to scale the protein-water interactions) is translated to SIRAH, so that this weakness is fixed.
The corrected version of the force field obtained this way is then applied to simulate α-synuclein with all its lysines replaced by Nε-(carboxyethyl)lysine (CEL), one of the well-known products of the glycation of α-synuclein lysines by the highly reactive dicarbonyl compound methylglyoxal.42,43 Therefore, a methodology is obtained that can reliably and efficiently be applied to the study of this protein and its modifications.


2. Methodology

The SIRAH coarse-grained force field for proteins was used in combination with the WT4 explicit coarse-grained water model to perform molecular dynamics simulations of native α-synuclein with the aim of reproducing its experimental dimensions at the microsecond scale. The main idea behind the WT4 water model is to reproduce the roughly tetrahedral ordering of bulk water, due to the existence of hydrogen bonds between atomistic water molecules. Therefore, a WT4 molecule consists of four tetrahedrally connected beads that represent the four molecules on the vertices of a single tetrahedron. The force constants and equilibrium distances are set to reproduce the energy and length of typical water hydrogen bonds inside the same tetrahedron, while nonbonded parameters aim to reproduce hydrogen bonds between different tetrahedrons. Two of the four beads carry a partial positive charge (+0.41e) and the other two, the opposite partial negative charges (-0.41e). This allows the model to generate its own dielectric constant. The masses of WT4 beads (50 au) were set to fit the density of water.41

The combination of SIRAH and WT4 was also used to simulate α-synuclein with its 15 lysine residues replaced by the glycation product CEL and both systems were compared. To represent this non-standard residue in SIRAH, a carboxylate moiety (like the ones already present in aspartate and glutamate residues) was simply added to the bead corresponding to the lysine side-chain nitrogen, with a carbon bead between the two which stands for the methyl group in CEL. Figure S1 displays the atomistic structure of both lysine and CEL and Figure S2 shows the SIRAH atom types assigned to the beads in CEL. Tables S1 and S2 collect the point charges assigned to the CEL beads and the values for the missing bonded parameters, respectively.
Our simulations were started from the central structure (9AAC-522.pdb) of the α-synuclein ensemble deposited in the Protein Ensemble Database (pE-DB)44 (shown in Figure S3), whose radius of gyration (Rg) is 2.89 nm. This structure was the central one in the sense of having the largest number of neighbors when performing a cluster analysis with the method of Daura et al.45 using a 3.25-nm cutoff (high enough to have just one cluster). This pE-DB ensemble was obtained using all-atom molecular dynamics simulations with distance restraints derived from PRE-NMR data.46 The whole protein was mapped to a coarse-grained representation according to the standard SIRAH mapping. It was then placed in a rhombic dodecahedral box, which was then filled with WT4 molecules and NaW and ClW ions (coarse-grained representations of water, Na+ and Cl-, respectively) to achieve electroneutrality and a physiological concentration of salt (0.15 M). In order to improve and accelerate the conformational sampling, a variation of the replica exchange method known as Replica Exchange with Solute Scaling (REST2)47,48 was applied. Additional details on this method and on how it was applied in this work are given in the Supporting Information.

All systems were minimized using the steepest descent algorithm and then went through a 5-ns NVT equilibration, a 5-ns NPT equilibration and an NPT production run. The leap-frog integrator with a 20-fs time step was used throughout. Protein beads were constrained with the LINCS algorithm49 during the equilibration and no constraints were used during the minimization and production. The temperature was kept at 310 K with a velocity-rescale thermostat50 (with the protein and the solvent coupled separately) and the pressure at 1 bar with the Parrinello-Rahman barostat.51 τT for the thermostat was set to 1.0 ps during the equilibration phases and to 2.0 ps during the production. τP for the barostat was set to 10.0 ps during both the NPT equilibration and the production. Both nonbonded cutoffs (van der Waals and short-range electrostatics) were set to 1.2 nm. Long-range electrostatics was treated with the PME method52 with a 0.2 nm grid spacing during the equilibration and 0.25 nm during the production.
Nonbonded interactions were calculated using a 1.2-nm cutoff neighbor list, updated every 25 steps (in the production and the NPT equilibration) or 10 steps (in the NVT equilibration). Both energy and pressure dispersion corrections were applied. Periodic boundary conditions and the minimum image convention were used. Snapshots were collected every 1000 steps (20 ps). Exchanges between neighboring replicas were attempted with the same frequency as snapshot collection. All simulations and analyses were carried out with the GROMACS 2016.4 software53–57 patched with the PLUMED plug-in (version 2.4),58 unless otherwise indicated. Additional information about the scripts and GROMACS routines used for the analyses is provided in the Supporting Information.

3. Results and Discussion

In our attempt to reproduce the experimental dimensions of α-synuclein with the SIRAH force field, we found that its standard parametrization yields overly compact conformations. This indicates that the protein-protein and protein-water interactions are unbalanced and that the former are too strong compared to the latter, leading to the collapse of the protein. Consequently, we wondered whether the simple strategy of uniformly scaling the protein-water interactions could be applied to this coarse-grained force field. To answer this question, we used it to perform molecular dynamics simulations of α-synuclein with increasing values (from 1.00 to 1.30, in steps of 0.10) of a constant f multiplying the default ε Lennard-Jones parameter between the solvent atom type (WT, making up the WT4 water molecules) and all the protein atom types, leaving the remaining parameters unaltered. Plainly, larger f values imply stronger protein-water interactions. In the case of Amber ff03ws, an optimal value of f=1.10 was determined to improve the description of several IDPs.35 This way we sought to reproduce, with SIRAH, the experimental Rg of monomeric α-synuclein in aqueous solution at physiological pH, which is estimated by paramagnetic relaxation enhancement-based nuclear magnetic resonance (NMR+PRE), small-angle X-ray scattering (SAXS) and single molecule Förster resonance energy transfer (FRET) experiments to lie between 2.26 and 4.00 nm.36,59,60 Once the value of f that best reproduced the Rg distribution for α-synuclein was found, an additional simulation with that f was conducted with all α-synuclein lysines turned into the glycation product CEL.
Table S3 summarizes the composition of each system, the total simulated time (per replica), the number of replicas and the diameter of the rhombic dodecahedral simulation box. The numbers of replicas were enough for an efficient exchange between them and the simulation times were enough for the Rg distributions to be converged, as explained in the next section. The box diameters sufficed to avoid interactions of the protein with its periodic images, as the minimum distances between them were 5.8 nm, 5.5 nm, 4.2 nm, 3.1 nm and 2.1 nm for f=1.00, f=1.10, f=1.20, f=1.30 and CEL, respectively, while the nonbonded cutoffs were 1.2 nm.
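As a rough illustration of how a REST2 replica ladder can be laid out, the sketch below builds a set of effective temperatures and the corresponding solute scaling factors. The geometric spacing, the replica count of 32 and the function names are assumptions made for this example only; the 310 K to 450 K interval is the one quoted in this work, and the ladder actually used is described in the Supporting Information. In REST2, every replica runs at the same physical temperature and the solute-solute and solute-solvent energies are scaled by λ and √λ, respectively, with λ = T0/Teff.

```python
import math

def effective_temperatures(t_min, t_max, n_replicas):
    """Geometrically spaced effective temperatures (the spacing is an assumption)."""
    if n_replicas < 2:
        raise ValueError("need at least two replicas")
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** i for i in range(n_replicas)]

def rest2_scaling(t_min, t_eff):
    """Return (lam, sqrt(lam)): lam scales solute-solute terms,
    sqrt(lam) scales solute-solvent terms; solvent-solvent terms are unscaled."""
    lam = t_min / t_eff
    return lam, math.sqrt(lam)

# Hypothetical ladder: 32 replicas between 310 K and 450 K.
temps = effective_temperatures(310.0, 450.0, 32)
factors = [rest2_scaling(temps[0], t) for t in temps]
```

The first replica has λ = 1 and therefore samples the unscaled Hamiltonian, which is why analyses are normally carried out on that replica.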

3.1. Determination of f

Figures S4 and S5 show the trajectories of the first replica (at an effective temperature of 310 K) through the effective temperature space (spanning the 310 K - 450 K interval) and the potential energy histograms for each replica at each f value, respectively. As can be seen, there is a good overlap between adjacent replicas and the first replica explores all the temperature space several times during each simulation. Other replicas at low, intermediate and high effective temperatures do so as well (see Figure S6). For each f value, all replicas spend roughly the same amount of time at each effective temperature, since all the histograms (shown in Figure S7) roughly oscillate around a probability of 0.03 (the inverse of the number of effective temperatures). The exchange rates were between 0.20 and 0.27 for f=1.00 and f=1.20, between 0.20 and 0.26 for f=1.10 and between 0.23 and 0.30 for f=1.30. Exchange rates between 0.20 and 0.30 are generally considered acceptable in replica exchange simulations.61–63 All these facts led us to consider our numbers of replicas to be sufficient.

In order to assess the convergence of our simulations, the Rg histograms were computed using only the snapshots in the first halves of the trajectories and compared with those obtained from the second halves and from the whole trajectories. The values of Rg were calculated over the α-carbons of the protein. Before computing the distributions, the snapshots within the Rg correlation times (calculated as the integral of the Rg autocorrelation functions) were discarded at the beginning of each trajectory so as to start the analysis from the first snapshot not correlated with the initial one. For all four f values, these three Rg distributions (first half, second half and overall) had a good overlap (see Figure 2).
Note that the f=1.10 simulation is characterized by a sharp peak around 1.70 nm during the second half of the simulation, due to the protein collapsing to a compact state in all the replicas at about 1000 ns (an event also reflected in the trajectory of the first replica through effective temperature space, see the top right plot in Figure S4). In fact, if the last 400 ns of this simulation are removed, this peak becomes significantly less prominent, as seen in Figure S8. These Rg distributions were obtained using the trajectories at the lowest effective temperature (310 K). The demultiplexed trajectories (with varying effective temperatures starting at 310 K) showed qualitatively similar distributions (see Figure S9).
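The convergence analysis described above can be sketched in a few lines. This is a minimal illustration and not the scripts used in this work (those are described in the Supporting Information): it assumes equal weights for all α-carbons, evaluates the integral of the autocorrelation function as a simple sum truncated at its first non-positive value, and uses hypothetical function names.

```python
import math

def radius_of_gyration(coords):
    """Rg of equally weighted points (e.g. C-alpha positions), in the units of coords."""
    n = len(coords)
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    sq = sum(sum((p[k] - center[k]) ** 2 for k in range(3)) for p in coords)
    return math.sqrt(sq / n)

def integrated_correlation_time(series, dt):
    """Sum the normalized autocorrelation function over lags >= 1,
    stopping at the first non-positive value (an assumed truncation rule)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev) / n
    if var == 0.0:
        return 0.0
    tau = 0.0
    for lag in range(1, n):
        c = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / ((n - lag) * var)
        if c <= 0.0:
            break
        tau += c * dt
    return tau

def split_halves(series, n_discard):
    """Drop the first n_discard snapshots, then return (first half, second half)."""
    kept = series[n_discard:]
    half = len(kept) // 2
    return kept[:half], kept[half:]
```

With snapshots saved every 20 ps, `n_discard` would be the correlation time divided by 0.02 ns, rounded up; histograms of the two halves and of the full series can then be compared.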

Figure 2: Rg histograms for each f value, corresponding to the T=310 K ensemble. For f=1.00, they span the range between 1.50 nm and 3.00 nm with a maximum at around 2.00 nm (Rg=2.19 ± 0.34 nm). For f=1.10, Rg varies between 1.70 nm and 3.20 nm with maxima at about 1.80 nm and 2.50 nm (Rg=2.41 ± 0.42 nm). In the f=1.20 case, the three distributions are centered around 2.50 nm and vary in the 1.70 nm - 3.50 nm range (Rg=2.59 ± 0.44 nm) and for f=1.30, the histograms reach their peaks at about 3.00 nm and take values mainly between 2.00 nm and 5.00 nm (Rg=3.22 ± 0.69 nm). In each plot, the Rg distributions corresponding to the first half of the trajectory (purple line), to the second half (green line) and the overall distribution (orange thick line) are shown. The Rg correlation times were removed from the start of each trajectory before computing the distributions. The bin width used was 0.01 nm. Note that the scales for the x and y axes differ between the plots.

We also calculated the end-to-end distance (the distance between the α carbon of Met1 and that of Ala140) as a function of time for the four values of f (see Figure S10). The values obtained were 3.76 ± 2.06 nm, 4.43 ± 2.71 nm, 4.76 ± 2.41 nm and 6.21 ± 2.83 nm for f=1.00, f=1.10, f=1.20 and f=1.30 respectively (averages ± standard deviations). The amplitudes of the oscillations of these distances are approximately constant for all f values and their 100-ns running averages (black lines in Figure S10) become stabilized at about 200 ns, 400 ns and 200 ns for f=1.00, f=1.20 and f=1.30 respectively. The f=1.10 end-to-end distance starts decreasing at about 900 ns and keeps decreasing until the end of the simulation, presumably due to the aforementioned protein collapse, but it would otherwise be stabilized at about 500 ns.

The f=1.30 simulation is the first one that reproduces the Rg distribution of the pE-DB ensemble46 from which the initial structure was taken. Figure 3 compares our four distributions with that from the pE-DB ensemble. The f=1.00, f=1.10 and f=1.20 simulations sample too few structures above 3.00 nm and too many below 2.00 nm, which is also at odds with the experimental data available (gathered in Table 1), including a hydrodynamic radius of about 2.5 nm measured experimentally in our group by applying the Dynamic Light Scattering technique (unpublished data). The experimental SAXS profile for α-synuclein is also well reproduced with f=1.30 (and not with the other f values; the root mean square error decreases from 0.13 arbitrary units for f=1.00 to 0.08 arbitrary units for f=1.30, see Figure 4). The SAXS profile calculated by Piana et al.36 with the TIP4P-D water model also agrees with our f=1.30 profile, while that obtained with the TIP3P water model does not (Figure 4). These results could point to a systematic overestimation of Rg when calculating it by using the Guinier approximation (as was done in all SAXS experiments in Table 1, most of which yielded larger Rg values in comparison to the other techniques).
Indeed, the larger the value of Rg, the smaller the scattering angle interval where the approximation is valid, implying that the number of data points for the regression analysis is smaller and, therefore, the errors are larger.64 In IDPs, this Guinier region becomes even smaller due to the presence of many conformations with different shapes and sizes.65 Besides, the highly extended conformations are known to contribute more to the scattering intensity, yielding higher Rg estimates.60
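The shrinking of the Guinier region with increasing Rg can be made concrete with a small sketch. It fits ln I(q) = ln I0 − (Rg²/3)q² by least squares and iteratively restricts the fit to points with q·Rg below a limit. The limit of 1.3 is a common rule of thumb for compact particles and is an assumption here (smaller limits are often used for disordered chains); the function names and the noise-free synthetic profiles are illustrative only and are not data from this work.

```python
import math

def _slope(x, y):
    """Least-squares slope of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den

def guinier_rg(q, intensity, qrg_max=1.3, max_iter=50):
    """Estimate Rg from a Guinier fit; q must be sorted in ascending order.
    Returns (Rg, number of points inside the Guinier region)."""
    n_use = len(q)
    for _ in range(max_iter):
        x = [qi * qi for qi in q[:n_use]]
        y = [math.log(i_q) for i_q in intensity[:n_use]]
        slope = _slope(x, y)
        if slope >= 0.0:
            raise ValueError("non-negative Guinier slope")
        rg = math.sqrt(-3.0 * slope)
        n_new = sum(1 for qi in q if qi * rg <= qrg_max)
        if n_new < 3:
            raise ValueError("too few points in the Guinier region")
        if n_new == n_use:
            break
        n_use = n_new
    return rg, n_use

def synthetic_profile(q, rg):
    """Ideal Guinier intensity with I0 = 1 (illustrative, noise-free)."""
    return [math.exp(-(qi * rg) ** 2 / 3.0) for qi in q]

q = [0.01 * i for i in range(1, 101)]          # nm^-1
rg3, n3 = guinier_rg(q, synthetic_profile(q, 3.0))
rg4, n4 = guinier_rg(q, synthetic_profile(q, 4.0))
```

On the same q grid, the profile with Rg = 4.0 nm leaves fewer points inside the fitting window than the one with Rg = 3.0 nm, which is the effect described in the text.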


Figure 3: Comparison of the Rg histogram obtained from the α-synuclein pE-DB ensemble46 (purple line) with the ones obtained from our simulations (corresponding to the T=310 K ensemble) after removing the Rg correlation times from the start of each trajectory (green lines). The bin width used was 0.01 nm. The f=1.30 distribution is the first one reproducing the pE-DB one reasonably well. Note that the scale of the y axis for f=1.00 differs from the other plots.

Table 1: Experimentally Measured α-Synuclein Radii of Gyration at Neutral pH and Room Temperature with Different Techniques (a)

Technique                   Radius of gyration (Rg) (nm)
NMR+PRE                     2.47
SAXS                        4.0 ± 0.1
Single Molecule FRET        3.3 ± 0.3
SAXS                        4.0 ± 0.1
SAXS                        2.55
SAXS                        3.6 ± 0.1
NMR+PRE                     2.26
SAXS                        4.0 ± 0.24
Dynamic Light Scattering    ~2.5 (b)

(a) Values taken from the Supporting Information of Dibenedetto et al.66
(b) Hydrodynamic radius measured in our group (unpublished data).


These results validate our hypothesis that the standard SIRAH parameters do not provide an appropriate balance between protein-protein and protein-water interactions to efficiently sample the extended conformations of α-synuclein, but tend to overstabilize protein-protein contacts in relation to protein-water interactions. This is common to many classical all-atom force fields (as noted in the introduction), like those in the Amber, Gromos or CHARMM families.36 In fact, the values of the van der Waals parameters σ and ε between SIRAH backbone beads and WT4 molecules were specifically set outside the Lorentz-Berthelot rules with the intention of ensuring the correct reproduction of α-helices and, mainly, β-sheets, but this decreased the desolvation energy of proteins and increased their tendency to aggregate.40 This is probably the main cause for the overly compact structures obtained with f=1.00. Our results demonstrate that this specific weakness can be fixed by uniformly scaling the ε Lennard-Jones parameter of protein-water pairs by a factor of 1.30. The modified (σ, ε) parameters are included in the f130.itp file included in the Supporting Information.
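The uniform scaling described above amounts to a single multiplication per protein-water pair. The sketch below shows that operation on a small table of pair parameters. The table layout, the bead type names `P1` and `P2` and all numerical values are placeholders chosen for illustration, not SIRAH parameters; only the water type name WT and the factor 1.30 come from the text. The parameters actually used are those in the f130.itp file of the Supporting Information.

```python
def scale_protein_water_epsilon(pairs, protein_types, water_type="WT", f=1.30):
    """Return a copy of the pair table with epsilon multiplied by f
    for protein-water pairs only; sigma and all other pairs are unchanged.

    pairs maps (type_a, type_b) -> (sigma, epsilon).
    """
    scaled = {}
    for (a, b), (sigma, eps) in pairs.items():
        protein_water = (
            (a in protein_types and b == water_type)
            or (b in protein_types and a == water_type)
        )
        scaled[(a, b)] = (sigma, eps * f if protein_water else eps)
    return scaled

# Placeholder pair table (hypothetical type names and values).
example_pairs = {
    ("P1", "WT"): (0.40, 1.0),    # protein-water: scaled
    ("WT", "P2"): (0.41, 2.0),    # protein-water, reversed order: scaled
    ("P1", "P2"): (0.40, 0.5),    # protein-protein: unchanged
    ("WT", "WT"): (0.42, 0.55),   # water-water: unchanged
    ("NaW", "WT"): (0.58, 0.3),   # ion-water: unchanged
}
scaled_pairs = scale_protein_water_epsilon(example_pairs, {"P1", "P2"})
```

Passing the set of protein bead types explicitly keeps ion-water and water-water pairs out of the scaling, in line with leaving the remaining parameters unaltered.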


Figure 4: Comparison of the experimental SAXS profile obtained by Schwalbe et al.60 (upper plot) and the SAXS profiles calculated from the trajectories of Piana et al.36 (lower plot) with the profiles calculated from our trajectories (via the FoXS server67,68) for the different f values.

It should be mentioned that the pE-DB ensemble was obtained with much shorter simulations (of just 1.2 ns per replica with a total of 24 replicas) than those reported here, but ad hoc PRE-NMR derived restraints needed to be imposed to reproduce the experimentally measured dimensions of the protein. Our methodology has the advantage of being structurally unbiased and therefore applicable to predict the structural effects of α-synuclein post-translational modifications with minimal experimental data (i.e. only the protein sequence and knowledge of those modifications).

3.2. Further validation of the obtained ensemble

To further validate the ensemble obtained with f=1.30, the average content of secondary structure (α-helix, β-sheet and random coil) per residue was calculated.
Also, our trajectories were backmapped to an all-atom representation, from which the Cα, C, N and Hα chemical shifts were obtained with the SHIFTX2 program.69 Cluster analyses were conducted as well to visualize the contacts within and between domains in representative conformations of the protein, and contact maps were obtained for the first clusters.

From the secondary structure analysis it is evident that the protein is mainly unstructured (73.0 ± 6.8% random coil) but with some residual extended β-sheet structure (26.8 ± 6.8%) and a negligible amount of α-helix (0.2 ± 0.4%), which is consistent with the data available from circular dichroism studies on the α-synuclein monomer in solution at physiological pH and room temperature (< 2% α-helix, 30% β-sheet and 68% random coil70 or 3 ± 1% α-helix, 23 ± 8% β-sheet and 74 ± 10% random coil71). The f=1.00, f=1.10 and f=1.20 ensembles slightly overestimate the β-sheet content (and thus underestimate the random coil content). Table 2 displays the secondary structure populations for the four values of f. The transient extended β-sheet conformations are mainly localized along the first 110 residues (the N-terminal and NAC domains and the beginning of the C-terminal one); the last 30 residues are notably more unstructured (Figure 5, A and B), in line with the experimental observation that the N-terminal and NAC domains have higher secondary structure propensities than the C-terminal one.21 Similar plots for the f=1.00, f=1.10 and f=1.20 ensembles are given in Figures S11-S13 respectively; they show the same qualitative features as that for f=1.30 but, as already mentioned, the quantitative agreement with experimental data is best for f=1.30.

Table 2: Secondary Structure Populations for Each f Value.

          α-helix        β-sheet         random coil
f=1.00    0.4 ± 0.5%     37.8 ± 6.1%     61.8 ± 6.1%
f=1.10    0.5 ± 0.6%     35.6 ± 5.8%     63.8 ± 5.9%
f=1.20    0.3 ± 0.4%     33.0 ± 6.3%     66.7 ± 6.2%
f=1.30    0.2 ± 0.4%     26.8 ± 6.8%     73.0 ± 6.8%
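Populations of the kind listed in Table 2 can be obtained from a per-frame, per-residue secondary structure assignment. The sketch below is one plausible way to do this, assuming that the quoted uncertainties are standard deviations over frames (this is not stated in the text) and using hypothetical one-letter labels (H, E, C) and toy data rather than the output of the SIRAH tools script.

```python
import numpy as np


def ss_populations(assignments, labels=("H", "E", "C")):
    """Mean and standard deviation (in %) of each label over frames.

    assignments: array-like of shape (n_frames, n_residues) holding
    one-letter secondary structure labels.
    """
    a = np.asarray(assignments)
    populations = {}
    for label in labels:
        per_frame = (a == label).mean(axis=1) * 100.0  # % of residues per frame
        populations[label] = (float(per_frame.mean()), float(per_frame.std()))
    return populations


# Toy assignment for 4 frames of a 10-residue chain (hypothetical data).
frames = [
    list("CCEECCCCCC"),
    list("CCEEEECCCC"),
    list("CCCCCCCCCC"),
    list("CEECCCCCCH"),
]
pops = ss_populations(frames)
```

Because every residue carries exactly one label per frame, the three mean populations add up to 100%.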

Figure 5: Average secondary structure content per residue (A) and its time evolution (B) for the f=1.30 ensemble. Note that the secondary structure definition used is not DSSP, but the one provided by the dedicated script included in SIRAH tools,72 whose details are given in the Supporting Information.

Figure 6 shows the excellent agreement between the calculated chemical shifts of Cα, C, N and Hα and their experimental counterparts from the BMRB database (entry 6968), determined through NMR experiments.73 Indeed, as shown in Figure 7, the Pearson correlation coefficients between the experimental and the calculated chemical shifts were 0.999, 0.958, 0.942 and 0.931, respectively, for these nuclei. Our calculated shifts are also consistent with those obtained through NMR in our group (unpublished data), with similar correlation coefficients, and with those reported in another coarse-grained molecular dynamics study of α-synuclein.74 Still, our Hα shifts are underestimated, a fact that seems to be common to many classical all-atom force fields, to the point that it could be related to systematic errors in the experimental
data.75 Another point that should be considered is the error in the coordinates introduced by backmapping the coarse-grained structures before computing the shifts, which could contribute to the smaller coefficients for C, N and Hα (C and Hα are not present in the coarse-grained representation). Figures S14-S21 show similar plots for the f=1.00, f=1.10 and f=1.20 ensembles. As can be seen, the agreement with experimental data is very similar for these other f values, so it can be concluded that changing the parameter f has a minimal effect on the reproduction of backbone chemical shifts.
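The Pearson correlation coefficients quoted above compare, nucleus by nucleus, the calculated and experimental shifts of the same residues. A minimal sketch of that calculation is given below; the shift values are hypothetical and serve only to show the call, and the residue matching between the two data sets is assumed to have been done beforehand.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long 1D data sets."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.ndim != 1 or x.size < 2:
        raise ValueError("x and y must be 1D arrays of equal length (>= 2)")
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


# Hypothetical Cα shifts (ppm) for five residues, not taken from BMRB 6968.
calc_shifts = [55.2, 58.1, 52.4, 62.0, 45.3]
exp_shifts = [55.6, 57.9, 52.9, 61.5, 45.1]
r_calpha = pearson_r(calc_shifts, exp_shifts)
```

Note that a high correlation over nuclei whose shifts span a wide range (such as Cα) does not by itself guarantee small residue-level errors, which is why secondary shifts are a stricter test.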

Figure 6: Calculated (f=1.30) and experimental (BMRB database,73 entry 6968) Cα, C, N and Hα chemical shifts for α-synuclein.

Figure 7: Regression plots of experimental (BMRB database,73 entry 6968) over calculated (f=1.30) Cα, C, N and Hα chemical shifts for α-synuclein. The Pearson correlation coefficients are also indicated.

The cluster analysis conducted on the f=1.30 ensemble found 177 clusters, the first 16 of which represent more than 50% of all structures. For comparison, only the first 6 clusters of the f=1.00 ensemble are needed to represent more than 50% of it. Gromacs .gro files corresponding to the central structures of these clusters are given in zip format in the Supporting Information. As can be observed, the f=1.30 conformations are considerably more extended than the other ones. Among the first 16 clusters (for f=1.30), the first one shows the most compact structures according to the radius of gyration of its central structure (Rg < 2.2 nm, Figure S22, top left). In the other 15 clusters, the corresponding central structures show Rg values up to 3.7 nm, most of them lying between 2.4 and 3.0 nm. When comparing the radius of gyration of each individual domain relative to its maximum value, it becomes clear that the protein becomes increasingly extended from the N- to the C-terminal domain, with the NAC domain exhibiting intermediate relative extension (Figure S22, top right). The smaller compactness of the C-terminal
region is consistent with the known fact that it remains in an unfolded state even in the presence of micelles (unlike the N-terminal domain), which suggests its greater preference for extended conformations.21,22 None of the central structures of the clusters exhibit stable secondary structure elements such as α-helices or β-strands. However, some faint structural patterns can be outlined. In line with its compactness, in all the most populated clusters the N-terminal domain contains an average of 21 residues in bend or turn conformation. On the other hand, both the NAC and C-terminal domains show a lower average of 7 residues in such conformation (Figure S22, central left). The fraction of residues in bend or turn conformation is also higher in the N-terminal domain than in the other two (Figure S22, central right).

The most salient feature of the f=1.30 representative structures is the greater number of contacts within the N-terminal domain than between it and the other two (see the contact maps for the first four clusters in Figure 8). This is related to the presence of three or four loops stabilized mainly by backbone-backbone or side chain-backbone hydrogen bonds or Lys-Glu salt bridges. As an example, the central structure of the first cluster shows three loops comprising the Lys10-Ala19, Thr22-Ala30 (backbone-backbone hydrogen bonds) and Glu28-Lys43 (salt bridge) segments. Even though the interactions within the N-terminal domain predominate, the characteristic long-range interactions between the C-terminal domain and the other two (residues Pro120-Ala140 with residues Ala30-Leu100), identified in a PRE study of α-synuclein and suggested to shield it from aggregation,25 are clearly identifiable in the first four cluster representatives and in their contact maps (Figure 8). Similar contact maps for the f=1.00, f=1.10 and f=1.20 ensembles are given in Figures S23-S25. These three ensembles also display those kinds of interactions; the main effect of increasing the parameter f is to weaken the contacts without making them disappear, as reflected in the colored areas, which shrink as the value of f increases. The number of contacts observed for the entire protein decreases from the most to the least populated clusters (Figure S22, bottom left), which implies more exposure of the protein to the solvent. The relative number of contacts within the N-terminal
domain is much larger than that with the NAC or with the C-terminal domains (Figure S22, bottom left and right). On the other hand, the NAC domain forms a similar number of contacts within itself as with the other two domains. Finally, the C-terminal domain forms more contacts with the other two domains than within itself. As depicted in Figure 8, the NAC domain tends to form a β-hairpin loop with either the beginning of the C-terminus (more often) or the end of the N-terminus. For example, in the central structure of cluster 3, this loop is located at the Ala85-Asp98 region; in cluster 4, at the Glu83-Pro117 one. Even when this hairpin is not fully formed, the NAC domain is in a loop conformation, as is the case e.g. in clusters 1 and 2. These motifs are consistent with the proposed structural models for the NAC region in α-synuclein fibrils on the basis of all-atom MD simulations, characterized by the presence of three β-strands connected by two turns spanning the Val66-Ala69 and the Gln79-Ile88 regions, or two polymorphic β-arch-like structures with a Lys80-Gly86 turn region.76,77
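The comparison of contacts within and between domains can be sketched as follows. Everything specific in this example is an assumption made for illustration: the domain boundaries (1-60, 61-95 and 96-140, a commonly used definition for α-synuclein), the one-point-per-residue representation, the 0.8 nm cutoff and the minimum sequence separation. The contact definition actually used for Figure 8 and Figure S22 is not reproduced here.

```python
import numpy as np

# Assumed domain boundaries (1-based, inclusive), listed in sequence order.
DOMAINS = {"N-term": (1, 60), "NAC": (61, 95), "C-term": (96, 140)}


def contact_map(coords, cutoff=0.8, min_separation=3):
    """Boolean residue-residue contact map from (n_res, 3) coordinates in nm."""
    coords = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    idx = np.arange(len(coords))
    separation = np.abs(idx[:, None] - idx[None, :])
    return (dist < cutoff) & (separation >= min_separation)


def domain_contacts(cmap, domains=DOMAINS):
    """Count contacts within and between domains, each residue pair counted once."""
    upper = np.triu(np.asarray(cmap, dtype=bool), k=1)
    names = list(domains)
    counts = {}
    for i, a in enumerate(names):
        rows = slice(domains[a][0] - 1, domains[a][1])
        for b in names[i:]:
            cols = slice(domains[b][0] - 1, domains[b][1])
            counts[(a, b)] = int(upper[rows, cols].sum())
    return counts


# Random coordinates standing in for one conformation (hypothetical data).
rng = np.random.default_rng(0)
coords = rng.normal(scale=1.5, size=(140, 3))
cmap = contact_map(coords)
counts = domain_contacts(cmap)
```

Entries such as `counts[("N-term", "N-term")]` and `counts[("N-term", "C-term")]` then give the intra-domain and inter-domain totals that the text compares.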

Figure 8: Central structures of the four most populated clusters in α-synuclein (f=1.30) and their corresponding clusters’ contact maps. Contacts between the N-terminal and the C-terminal domains are highlighted inside a rectangle (both on the structures and on the contact maps) and the hairpin loops (or the loop-shaped regions) near the NAC domain are enclosed in ellipses.

Considering the low relative radius of gyration, the high number of bend and turn residues and the number of contacts, the general picture is that the N-terminal domain is mainly packed and more wrapped around itself than interacting with the other domains. The NAC and C-terminal domains are more extended and form strands with few or no tight turns. These strands establish interactions when crossing over the N-terminal domain but without wrapping it. The other type of pattern that is observed is that these long strands associate in pairs in a zipper fashion. The best example of this type of interaction is found in cluster 4 (Figure 8), where the NAC and C-terminal domains are paired and very extended, forming a large number of interactions with each other.

To compare our clusters with an experimentally determined α-synuclein structure, we have computed the root-mean-square deviations (RMSD) between the central structures of the first 16 clusters and each of the ten chains (monomers) A-J in the disordered domains (residues 1-28 and 100-140) of the recently determined solid-state NMR structure of a pathogenic α-synuclein fibril (PDB code 2N0A,78 Figure S26). The most remarkable finding was that chain G in the fibril was the most similar (and chain F, the least similar) to almost all 16 clusters when comparing residues 1-28. No such regularity was observed when comparing residues 100-140 but, in general, chains E and G were the ones with the lowest RMSD (Figure S27). An analysis of the Rg differences between each cluster representative and the ten chains (Figure S28) reveals that the N-terminal domain is usually more compact in our clusters than in the fibrils, while the C-terminal domain tends to be more extended. The RMSD results are explained by the fact that chains G and E have a more compact N-terminus (and chain E, a more extended C-terminus) in relation to the other chains.
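An RMSD comparison between a cluster representative and a fibril chain requires an optimal superposition of the two coordinate sets first. The sketch below uses the Kabsch algorithm for that step; it is offered as one standard way of computing such an RMSD, not as the procedure actually used for Figure S27, and the choice of atoms or beads to compare (here simply two matched (n, 3) arrays) is left open.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """Minimum RMSD between two matched (n, 3) coordinate sets after superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (n, 3)")
    Pc = P - P.mean(axis=0)          # remove translation
    Qc = Q - Q.mean(axis=0)
    H = Pc.T @ Qc                    # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])       # avoid improper rotations (reflections)
    R = Vt.T @ D @ U.T               # rotation that maps Pc onto Qc
    diff = (R @ Pc.T).T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))


# Hypothetical test case: Q is a rotated and translated copy of P.
rng = np.random.default_rng(1)
P = rng.normal(size=(28, 3))
theta = np.deg2rad(40.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta), np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
Q = P @ Rz.T + np.array([1.0, -2.0, 0.5])
rmsd_rigid = kabsch_rmsd(P, Q)
```

For a rigidly moved copy the RMSD is zero up to rounding, so any nonzero value between a cluster representative and a fibril chain reflects a genuine conformational difference over the compared residues.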
The modification of the SIRAH force field presented herein greatly enhances the sampling of the most extended conformations of α-synuclein in comparison to the unmodified version of the force field. Moreover, this does not come at the cost of worsening the description of other properties, like the global secondary structure content, the backbone chemical shifts or the tertiary contacts. Indeed, the f parameter
only affects the Lennard-Jones interaction potentials between protein and solvent beads; it does not interact directly with other parameters of the force field equations. Nevertheless, since the f parameter affects the strength of the protein-water interactions, and these compete with water-water and protein-protein interactions, other properties will be affected by the parameter f. Examples, although not applicable to α-synuclein or other IDPs, are the thermal stability of the folded structure and the protein melting point. For the same reason, the folding/unfolding kinetics will be affected by modifications of the parameter f. Other properties such as the residence time of the solvent molecules in the first solvation layers and the diffusivity of the protein will be affected as well.

It must also be kept in mind that some properties may not be well reproduced by this force field (even without modifying it). For example, secondary chemical shifts (where the expected random coil shifts for each residue have been subtracted) were computed for Cα and Cβ and expressed as their difference and, as can be seen in Figure S29, there is only a weak correlation between calculated and experimental values (obtained either from the BMRB database or from NMR in our group) and the root-mean-square errors (RMSE) are rather large (Table S4). On the other hand, since this is true for all four f values, this weak correlation cannot be attributed to our modification of the f parameter; in order to improve it, some other parameters (e.g. torsion potentials) would have to be explicitly tuned in the original force field. In spite of the weak correlation, it should be noted that our RMSE values are comparable to those reported in all-atom MD studies of the intrinsically disordered Aβ protein79 (0.65 and 0.89 ppm for the Cα secondary shifts of the 21-30 fragment of Aβ and the whole 1-42 Aβ, respectively, and 0.96 and 0.87 ppm for the Cβ secondary shifts of the same two systems respectively).
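The secondary-shift comparison described above can be written compactly: for each residue, the random coil value is subtracted from the observed (or calculated) shift, the Cα and Cβ secondary shifts are combined as their difference, and calculated and experimental sets are compared through the RMSE. The sketch below follows that recipe with hypothetical numbers; the random coil reference values are placeholders, not a published random coil library.

```python
import numpy as np


def secondary_shift_difference(ca, cb, ca_random_coil, cb_random_coil):
    """Per-residue ΔδCα - ΔδCβ, where Δδ = shift - random coil shift (ppm)."""
    d_ca = np.asarray(ca, dtype=float) - np.asarray(ca_random_coil, dtype=float)
    d_cb = np.asarray(cb, dtype=float) - np.asarray(cb_random_coil, dtype=float)
    return d_ca - d_cb


def rmse(calculated, experimental):
    """Root-mean-square error between two equally long data sets."""
    diff = np.asarray(calculated, dtype=float) - np.asarray(experimental, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Hypothetical values (ppm) for two residues.
ca_rc, cb_rc = [57.5, 56.0], [30.5, 32.5]
calc = secondary_shift_difference([58.0, 56.5], [30.0, 33.0], ca_rc, cb_rc)
exp = secondary_shift_difference([57.5, 56.0], [30.5, 32.5], ca_rc, cb_rc)
error = rmse(calc, exp)
```

Subtracting the random coil values removes the large residue-type dependence of the raw shifts, which is why a force field can correlate well on raw shifts and still correlate weakly on secondary shifts.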

3.3. Comparison between α-synuclein and α-synuclein-CEL

After choosing 1.30 as a suitable value for f, an additional simulation with all α-synuclein lysines replaced with CEL was conducted to analyze the effects of this modification on its conformational preferences. This specific post-translational
modification was especially convenient to simulate due to the availability of experimental data, obtained in our group, with which we could compare our results.

Figures S30 and S31 depict the potential energy histograms for each α-synuclein-CEL replica and the trajectory over the effective temperature space of the first replica, respectively. Figure S32 shows the trajectories of three additional replicas (at low, intermediate and high effective temperatures) and Figure S33 displays the effective temperature histograms. These figures show that there is a good overlap between neighboring replicas (with exchange rates between 0.24 and 0.33) and that the replicas visit all the effective temperatures several times during the simulation.

The results obtained through our simulations agree with those obtained experimentally (via dynamic light scattering and circular dichroism), suggesting that α-synuclein-CEL is more extended (with an Rg of 4.0 ± 0.8 nm compared to 3.2 ± 0.7 nm for the native protein, according to the simulations) and has a higher random-coil content (unpublished data). To visualize representative conformations of both α-synuclein and α-synuclein-CEL and find out what causes α-synuclein-CEL to sample more extended conformations, cluster analyses were performed and contact maps were obtained for each cluster. α-synuclein-CEL has a higher number of clusters (244 in comparison to 177). The first 16 α-synuclein clusters and the first 27 α-synuclein-CEL ones represent more than 50% of the total number of structures in each case. Gromacs .gro files corresponding to the central structures of these clusters are given in zip format in the Supporting Information.
As in the case of native α-synuclein, there is a greater number of contacts within the N-terminal domain of α-synuclein-CEL than between it and the other two (see the contact maps for the first four clusters in Figure 9), which also correspond to three or four loops stabilized by backbone-backbone or side chain-backbone hydrogen bonds or CEL(NH2+)-CEL(COO-) salt bridges. As an example, the central structure of the first α-synuclein-CEL cluster also shows three loops between Ser9-Ala19 (side chain-backbone hydrogen bond), Thr22-Ala30 (backbone-backbone hydrogen bond) and Val37-His50 (side chain-side chain hydrogen bond), and in the central structure of the third cluster, a CEL32(N+)-CEL58(COO-) salt bridge can be found.
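The Rg values used above to compare α-synuclein and α-synuclein-CEL are ensemble statistics of a per-conformation quantity. A minimal sketch of the mass-weighted radius of gyration and of its mean and standard deviation over a set of conformations is shown below; the coordinates are toy data, and the assumption that the quoted ± values are standard deviations over frames is ours.

```python
import numpy as np


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of one conformation ((n, 3) array)."""
    coords = np.asarray(coords, dtype=float)
    if masses is None:
        masses = np.ones(len(coords))
    masses = np.asarray(masses, dtype=float)
    center = np.average(coords, axis=0, weights=masses)
    sq_dist = ((coords - center) ** 2).sum(axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))


def ensemble_rg(frames, masses=None):
    """Mean and standard deviation of Rg over a sequence of conformations."""
    rg = np.array([radius_of_gyration(frame, masses) for frame in frames])
    return float(rg.mean()), float(rg.std())


# Toy ensemble: two particles at distance 2 and at distance 4 (arbitrary units).
frames = [
    [[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[-2.0, 0.0, 0.0], [2.0, 0.0, 0.0]],
]
rg_mean, rg_std = ensemble_rg(frames)
```

In a replica exchange setting such as the one used here, only the frames belonging to the unscaled (reference) replica would normally enter this average.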

In relation to the N-terminal-C-terminal interactions, the differences between α-synuclein and α-synuclein-CEL are evident from the visualization of the central structures of the first few clusters (Figure 9). In the case of α-synuclein, the first cluster representative shows lysines 6 to 45 interacting with aspartates and glutamates along the C-terminal domain, while lysines 58, 60 and 80 are located close to Glu83 in the NAC domain. On the other hand, in the case of α-synuclein-CEL, the CEL residues from 6 to 45 are mainly interacting with other N-terminal residues, CEL58 is close to glutamates 104 and 105, CEL 60, 96 and 97 are surrounding Asp98 and CEL80 is interacting via a hydrogen bond with the side chain of Tyr75. The C-terminal residues beyond glutamates 104 and 105 are away from the other two domains and exposed to the solvent; this separation between the C-terminal domain and the other two is the main reason why α-synuclein-CEL is more extended than α-synuclein.

The NAC domain, as also happens in native α-synuclein, tends to form a β-hairpin loop with either the beginning of the C-terminus (more often) or the end of the N-terminus. For example, in cluster 1, this loop is located at the Asn65-Ala78 region; in cluster 2, at the Val82-Asn103 one and in cluster 8, at the Lys60-Ala85 one. Even when this hairpin is not fully formed, the NAC domain is again in a loop conformation (as in native α-synuclein), as is the case e.g. in clusters 3 and 4.

Figure 9: Central structures of the four most populated clusters in α-synuclein (left) and α-synuclein-CEL (right) and their corresponding clusters’ contact maps. Contacts between the N-terminal and the C-terminal domains are highlighted inside a rectangle and the (hairpin) loops near the NAC domain are enclosed in ellipses.

3.5. Biological implications

α-synuclein aggregation has been related to several neurological disorders, among which PD is the most prevalent. The protein exists in solution initially as a monomer,
but several circumstances, including different post-translational modifications of its side chains (serine phosphorylation, methionine oxidation or tyrosine nitration, among others), point mutations (A53T, A30P, E46K, G51D, H50Q and A53E) or interactions with metal ions (especially Cu2+), may promote its conversion into oligomers or fully formed fibrils by stabilizing oligomerization or fibrillation-prone conformations respectively.12–15,80 Small spherical oligomers, and not fibrils, are believed to be the neurotoxic species.81 A detailed knowledge of the conformational behavior of the monomer under those different circumstances can thus provide mechanistic insights into the first stages of aggregation in each case and assist in the design of small molecules that can prevent the formation of neurotoxic oligomers already at these initial phases, potentially leading to efficient therapies against synucleinopathies.

Another post-translational modification affecting the propensity of α-synuclein to aggregate is glycation, whose main effect is to inhibit the formation of fibrils and to potentiate the formation of toxic small spherical oligomers.19,82,83 Our results show that, upon modification of lysines to form CEL, α-synuclein adopts much more extended conformations due to the separation between the N-terminal and the C-terminal domains. It has been proposed that long-range interactions between the C-terminal tail and both the N-terminal and the NAC domains in α-synuclein might have a protective effect against its aggregation,84 although this hypothesis has been shown not to be general. For example, the PD-related A30P, E46K and A53T mutations do not disrupt these interactions (or even enhance them), but at the same time accelerate the formation of α-synuclein aggregates.
For these mutations, changes in local secondary structure propensity or a decrease in the net charge of the protein were hypothesized to be the main factors triggering the aggregation process.85 In our simulations, the overall secondary structure content of α-synuclein-CEL is practically the same as that of α-synuclein. The NAC domain of both α-synuclein and α-synuclein-CEL tends to form β-hairpin loops due to hydrophobic contacts within itself, with the first residues in the C-terminus or the last ones in the N-terminus. Therefore, changes in local secondary structure propensity would not play a major role in the stabilization of small spherical oligomers and the inhibition of fibrillation characteristic of α-synuclein glycation. On the other hand, modification of lysines to form CEL would
cause a large increase in the magnitude of the net negative charge of the protein (from -9 to -24, as all 15 lysines are likely to be glycated82). Considering these facts, the formation of spherical oligomers characteristic of α-synuclein glycation could start by means of hydrophobic interactions between the exposed parts of the NAC loops, but the high net charge of glycated α-synuclein, in comparison to the unmodified protein, would cause considerable electrostatic repulsion between monomers or low-order oligomers that would perturb the formation of fully mature fibrils.
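The change in net charge quoted above follows from simple bookkeeping: each lysine side chain carries a charge of +1, whereas a CEL side chain is taken here to be net neutral (a protonated amine plus a carboxylate, as suggested by the CEL(NH2+)-CEL(COO-) notation used earlier). Under that assumption, which is ours, a short sketch of the arithmetic is:

```python
def net_charge_after_glycation(native_charge, n_lysines_modified,
                               lysine_charge=1, cel_charge=0):
    """Net charge after replacing lysines by CEL.

    cel_charge=0 is an assumption: the CEL side chain is treated as a
    zwitterion (protonated amine, deprotonated carboxylate).
    """
    return native_charge + n_lysines_modified * (cel_charge - lysine_charge)


# Values from the text: net charge -9 and 15 lysines, all modified.
charge_cel = net_charge_after_glycation(native_charge=-9, n_lysines_modified=15)
```

Each modified lysine therefore removes one positive charge, giving -9 - 15 = -24 when all 15 lysines are glycated.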

4. Conclusions

A reliable, computationally affordable methodology to explore the structural and dynamical features of the PD-related IDP α-synuclein, which can be readily extended to its glycated forms, was derived in this work. This is highly interesting, since protein glycation has been related to a number of diseases for which diabetes is a risk factor. Specifically, α-synuclein glycation has been detected in PD patients and in animal models of the disease, suggesting that it plays an essential role in its development. Therefore, a greater knowledge about its effects on the dynamics of the protein at the molecular level can provide some new insights to tackle the disease.

By performing microsecond-long, REST2 coarse-grained molecular dynamics simulations of α-synuclein employing the SIRAH force field, it was determined that its standard protein-protein interactions are too stabilized in relation to the protein-solvent ones to efficiently explore the most extended conformations of the protein, as is also the case with many classical all-atom force fields, designed for the study of folded proteins. In the particular case of α-synuclein, a uniform scaling of the standard ε Lennard-Jones parameter associated with the protein-water pairs by a factor of 1.30 solves this issue, allowing a much better reproduction of the experimental radius of gyration of the protein. Moreover, the analysis of local properties, like backbone chemical shifts or global secondary structure content, as well as the characteristic long-range contacts between α-synuclein domains obtained with this ε scaling, is also consistent with the available experimental data.

This slightly corrected version of the force field could potentially be used to predict how factors such as different post-translational modifications, point mutations or the presence of ions affect the dynamics and the conformational landscape of α-synuclein. In particular, it would be especially suitable for the study of the effects of those modifications on global dimensions, tertiary structure and random-coil content, but less appropriate for properties such as secondary chemical shifts. In addition, the coarse-grained REST2 methodology used herein allows the efficient inclusion of an explicit solvent and much longer time steps (and therefore much longer time scales) than all-atom simulations, and needs far fewer replicas than a standard temperature replica exchange. It is, therefore, cheap enough to properly simulate the computationally demanding α-synuclein, while still yielding valuable information.

Acknowledgements

This work was cofunded by the Ministerio de Economía y Competitividad (MINECO) and by the European Fund for Regional Development (FEDER) (CTQ2014-55835-R), and also by the Conselleria d'Educació, Cultura i Universitats (Ajuts a accions especials d'R+D AAEE49/2015). The authors are grateful to “Consorci de Serveis Universitaris de Catalunya (CSUC)”, the “Centro de Cálculo de Supercomputación de Galicia (CESGA)”, and the “Centre de Tecnologies de la Informació (CTI) de la UIB” for providing access to their computational facilities. R. R. acknowledges his PhD scholarship granted by the Spanish MECD within the FPU program (FPU16/00785). R. C. acknowledges a Margalida Comas-CAIB postdoctoral fellowship granted by the “Govern de les Illes Balears, Conselleria d'Innovació, Recerca i Turisme” (PD/11/2016). L. M. acknowledges her PhD scholarship granted by the Spanish MECD within the FPU program (FPU14/01131). Thanks are also due to Profs. Zweckstetter and Blackledge for providing their experimental SAXS data and to Profs. Piana and Shaw for providing their MD trajectories to recompute their SAXS data.

Supporting Information Available

Additional details on the REST2 method, scripts and GROMACS routines used for the analyses, additional details on the sirah_vmdtk.tcl script, Figures S1-S33 and Tables S1-S4.

References (1)

Maroteaux, L.; Campanelli, J. T.; Scheller, R. H. Synuclein: A Neuron-Specific Protein Localized to the Nucleus and Presynaptic Nerve Terminal. J. Neurosci. 1988, 8, 2804–2815.

(2)

Spillantini, M. G.; Schmidt, M. L.; Lee, V. M.-Y.; Trojanowski, J. Q.; Jakes, R.; Goedert, M. Alpha-Synuclein in Lewy Bodies. Nature 1997, 388, 839–840.

(3)

Kuwahara, T.; Koyama, A.; Gengyo-Ando, K.; Masuda, M.; Kowa, H.; Tsunoda, M.; Mitani, S.; Iwatsubo, T. Familial Parkinson Mutant α-Synuclein Causes Dopamine Neuron Dysfunction in Transgenic Caenorhabditis Elegans. J. Biol. Chem. 2006, 281, 334–340.

(4)

Auluck, P. K.; Caraveo, G.; Lindquist, S. α-Synuclein: Membrane Interactions and Toxicity in Parkinson’s Disease. Annu. Rev. Cell Dev. Biol. 2010, 26, 211–233.

(5)

van Rooijen, B. D.; Claessens, M. M. A. E.; Subramaniam, V. Membrane Interactions of Oligomeric Alpha-Synuclein: Potential Role in Parkinson’s Disease. Curr. Protein Pept. Sci. 2010, 11, 334–342.

(6)

Emamzadeh, F. N. Alpha-Synuclein Structure, Functions and Interactions. J. Res. Med. Sci. 2016, 21, 29/1-29/9.

(7)

Lashuel, H. A.; Petre, B. M.; Wall, J.; Simon, M.; Nowak, R. J.; Walz, T.; Lansbury, P. T. α-Synuclein, Especially the Parkinson’s Disease-Associated Mutants, Forms Pore-Like Annular and Tubular Protofibrils. J. Mol. Biol. 2002, 322, 1089–1102.

(8)

Lashuel, H. A.; Hartley, D.; Petre, B. M.; Walz, T.; Lansbury, P. T. Neurodegenerative Disease: Amyloid Pores from Pathogenic Mutations. Nature 2002, 418, 291.

(9)

Cookson, M. R. The Biochemistry of Parkinson’s Disease. Annu. Rev. Biochem. 29 ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 30 of 39

2005, 74, 29–52. (10)

Polymeropoulos, M. H.; Lavedan, C.; Leroy, E.; Ide, S. E.; Dehejia, A.; Dutra, A.; Pike, B.; Root, H.; Rubenstein, J.; Boyer, R.; Stenroos, E. S.; Chandrasekharappa, S.; Athanassiadou, A.; Papapetropoulos, T.; Johnson, W. G.; Lazzarini, A. M.; Duvoisin, R. C.; Di Iorio, G.; Golbe, L. I.; Nussbaum, R. L. Mutation in the αSynuclein Gene Identified in Families with Parkinson’s Disease. Science. 1997, 276, 2045–2047.

(11)

Krüger, R.; Kuhn, W.; Müller, T.; Woitalla, D.; Graeber, M.; Kössel, S.; Przuntek, H.; Epplen, J. T.; Schols, L.; Riess, O. Ala30Pro Mutation in the Gene Encoding αSynuclein in Parkinson’s Disease. Nat. Genet. 1998, 18, 106–108.

(12)

Zarranz, J. J.; Alegre, J.; Gómez-Esteban, J. C.; Lezcano, E.; Ros, R.; Ampuero, I.; Vidal, L.; Hoenicka, J.; Rodriguez, O.; Atarés, B.; Llorens, V.; Gomez Tortosa, E.; del Ser, T.; Muñoz, D. G.; de Yebenes, J. G. The New Mutation, E46K, of αSynuclein Causes Parkinson and Lewy Body Dementia. Ann. Neurol. 2004, 55, 164–173.

(13)

Rasia, R. M.; Bertoncini, C. W.; Marsh, D.; Hoyer, W.; Cherny, D.; Zweckstetter, M.; Griesinger, C.; Jovin, T. M.; Fernández, C. O. Structural Characterization of Copper(II) Binding to α-Synuclein: Insights into the Bioinorganic Chemistry of Parkinson’s Disease. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 4294–4299.

(14)

Giasson, B. I.; Duda, J. E.; Murray, I. V. J.; Chen, Q.; Souza, J. M.; Hurtig, H. I.; Ischiropoulos, H.; Trojanowski, J. Q.; Lee, V. M. Oxidative Damage Linked to Neurodegeneration by Selective α-Synuclein Nitration in Synucleinopathy Lesions. Science 2000, 290, 985–989.

(15)

Fujiwara, H.; Hasegawa, M.; Dohmae, N.; Kawashima, A.; Masliah, E.; Goldberg, M. S.; Shen, J.; Takio, K.; Iwatsubo, T. α-Synuclein Is Phosphorylated in Synucleinopathy Lesions. Nat. Cell Biol. 2002, 4, 160–164.

(16)

Miranda, H. V.; Outeiro, T. F. The Sour Side of Neurodegenerative Disorders: The Effects of Protein Glycation. J. Pathol. 2010, 221, 13–25.

(17)

Anderson, J. P.; Walker, D. E.; Goldstein, J. M.; de Laat, R.; Banducci, K.; Caccavello, R. J.; Barbour, R.; Huang, J.; Kling, K.; Lee, M.; Diep, L.; Keim, P. S.; Shen, X.; Chataway, T.; Schlossmacher, M. G.; Seubert, P.; Schenk, D.; Sinha, S.; Gai, W. P.; Chilcote, T. J. Phosphorylation of Ser-129 Is the Dominant Pathological Modification of α-Synuclein in Familial and Sporadic Lewy Body Disease. J. Biol. Chem. 2006, 281, 29739–29752.

(18)

Fredenburg, R. A.; Rospigliosi, C.; Meray, R. K.; Kessler, J. C.; Lashuel, H. A.; Eliezer, D.; Lansbury, P. T. The Impact of the E46K Mutation on the Properties of α-Synuclein in Its Monomeric and Oligomeric States. Biochemistry 2007, 46, 7107–7118.

(19)

Lee, D.; Park, C. W.; Paik, S. R.; Choi, K. Y. The Modification of α-Synuclein by Dicarbonyl Compounds Inhibits Its Fibril-Forming Process. Biochim. Biophys. Acta 2009, 1794, 421–430.

(20)

Uversky, V. N. A Protein-Chameleon: Conformational Plasticity of α-Synuclein, a Disordered Protein Involved in Neurodegenerative Disorders. J. Biomol. Struct. Dyn. 2003, 21, 211–234.

(21)

Eliezer, D.; Kutluay, E.; Bussell, R.; Browne, G. Conformational Properties of α-Synuclein in Its Free and Lipid-Associated States. J. Mol. Biol. 2001, 307, 1061–1073.

(22)

Ulmer, T. S.; Bax, A.; Cole, N. B.; Nussbaum, R. L. Structure and Dynamics of Micelle-Bound Human α-Synuclein. J. Biol. Chem. 2005, 280, 9595–9603.

(23)

Croke, R. L.; Sallum, C. O.; Watson, E.; Watt, E. D.; Alexandrescu, A. T. Hydrogen Exchange of Monomeric α-Synuclein Shows Unfolded Structure Persists at Physiological Temperature and Is Independent of Molecular Crowding in Escherichia Coli. Protein Sci. 2008, 17, 1434–1445.

(24)

Uversky, V. N.; Li, J.; Fink, A. L. Evidence for a Partially Folded Intermediate in Alpha-Synuclein Fibril Formation. J. Biol. Chem. 2001, 276, 10737–10744.

(25)

Dedmon, M. M.; Lindorff-Larsen, K.; Christodoulou, J.; Vendruscolo, M.; Dobson, C. M. Mapping Long-Range Interactions in α-Synuclein Using Spin-Label NMR and Ensemble Molecular Dynamics Simulations. J. Am. Chem. Soc. 2005, 127, 476–477.

(26)

Öhrfelt, A.; Zetterberg, H.; Andersson, K.; Persson, R.; Secic, D.; Brinkmalm, G.; Wallin, A.; Mulugeta, E.; Francis, P. T.; Vanmechelen, E.; Aarsland, D.; Ballard, C.; Blennow, K.; Westman-Brinkmalm, A. Identification of Novel α-Synuclein Isoforms in Human Brain Tissue by Using an Online NanoLC-ESI-FTICR-MS Method. Neurochem. Res. 2011, 36, 2029–2042.

(27)

Fauvet, B.; Fares, M.-B.; Samuel, F.; Dikiy, I.; Tandom, A.; Eliezer, D.; Lashuel, H. A. Characterization of Semisynthetic and Naturally Nα-Acetylated α-Synuclein in Vitro and in Intact Cells. J. Biol. Chem. 2012, 287, 28243–28262.

(28)

Kang, L.; Moriarty, G. M.; Woods, L. A.; Ashcroft, A. E.; Radford, S. E.; Baum, J. N-Terminal Acetylation of α-Synuclein Induces Increased Transient Helical Propensity and Decreased Aggregation Rates in the Intrinsically Disordered Monomer. Protein Sci. 2012, 21, 911–917.

(29)

Maltsev, A. S.; Ying, J.; Bax, A. Impact of N-Terminal Acetylation of α-Synuclein on Its Random Coil and Lipid Binding Properties. Biochemistry 2012, 51, 5004–5013.

(30)

Schor, M.; Mey, A. S. J. S.; MacPhee, C. E. Analytical Methods for Structural Ensembles and Dynamics of Intrinsically Disordered Proteins. Biophys. Rev. 2016, 8, 429–439.

(31)

Nettels, D.; Müller-Späth, S.; Küster, F.; Hofmann, H.; Haenni, D.; Rüegger, S.; Reymond, L.; Hoffmann, A.; Kubelka, J.; Heinz, B.; Gast, K.; Best, R. B.; Schuler, B. Single-Molecule Spectroscopy of the Temperature-Induced Collapse of Unfolded Proteins. Proc. Natl. Acad. Sci. U. S. A. 2009, 106, 20740–20745.

(32)

Piana, S.; Klepeis, J. L.; Shaw, D. Assessing the Accuracy of Physical Models Used in Protein-Folding Simulations: Quantitative Evidence from Long Molecular Dynamics Simulations. Curr. Opin. Struct. Biol. 2014, 24, 98–105.

(33)

Best, R. B. Computational and Theoretical Advances in Studies of Intrinsically Disordered Proteins. Curr. Opin. Struct. Biol. 2017, 42, 147–154.

(34)

Song, D.; Wang, W.; Ye, W.; Ji, D.; Luo, R.; Chen, H.-F. ff14IDPs Force Field Improving the Conformation Sampling of Intrinsically Disordered Proteins. Chem. Biol. Drug Des. 2017, 89, 5–15.

(35)

Best, R. B.; Zheng, W.; Mittal, J. Balanced Protein-Water Interactions Improve Properties of Disordered Proteins and Non-Specific Protein Association. J. Chem. Theory Comput. 2014, 10, 5113–5124.

(36)

Piana, S.; Donchev, A. G.; Robustelli, P.; Shaw, D. E. Water Dispersion Interactions Strongly Influence Simulated Structural Properties of Disordered Protein States. J. Phys. Chem. B 2015, 119, 5113–5123.

(37)

Henriques, J.; Cragnell, C.; Skepö, M. Molecular Dynamics Simulations of Intrinsically Disordered Proteins: Force Field Evaluation and Comparison with Experiment. J. Chem. Theory Comput. 2015, 11, 3420–3431.

(38)

Henriques, J.; Skepö, M. Molecular Dynamics Simulations of Intrinsically Disordered Proteins: On the Accuracy of the TIP4P-D Water Model and the Representativeness of Protein Disorder Models. J. Chem. Theory Comput. 2016, 12, 3407–3415.

(39)

Monticelli, L.; Kandasamy, S. K.; Periole, X.; Larson, R. G.; Tieleman, D. P.; Marrink, S.-J. The MARTINI Coarse-Grained Force Field: Extension to Proteins. J. Chem. Theory Comput. 2008, 4, 819–834.

(40)

Darré, L.; Machado, M. R.; Brandner, A. F.; González, H. C.; Ferreira, S.; Pantano, S. SIRAH: A Structurally Unbiased Coarse-Grained Force Field for Proteins with Aqueous Solvation and Long-Range Electrostatics. J. Chem. Theory Comput. 2015, 11, 723–739.

(41)

Darré, L.; Machado, M. R.; Dans, P. D.; Herrera, F. E.; Pantano, S. Another Coarse Grain Model for Aqueous Solvation: WAT FOUR? J. Chem. Theory Comput. 2010, 6, 3793–3807.

(42)

Ahmed, M. U.; Frye, E. B.; Degenhardt, T. P.; Thorpe, S. R.; Baynes, J. W. Nε-(Carboxyethyl)Lysine, a Product of the Chemical Modification of Proteins by Methylglyoxal, Increases with Age in Human Lens Proteins. Biochem. J. 1997, 324, 565–570.

(43)

Thornalley, P. J. Dicarbonyl Intermediates in the Maillard Reaction. Ann. N.Y. Acad. Sci. 2005, 1043, 111–117.

(44)

Varadi, M.; Kosol, S.; Lebrum, P.; Valentini, E.; Blackledge, M.; Dunker, A. K.; Felli, I. C.; Forman-Kay, J. D.; Kriwacki, R. W.; Pierattelli, R.; Sussman, J.; Svergun, D. I.; Uversky, V. N.; Vendruscolo, M.; Wishart, D.; Wright, P. E.; Tompa, P. pEDB: A Database of Structural Ensembles of Intrinsically Disordered and of Unfolded Proteins. Nucleic Acids Res. 2014, 42, D326–D335.

(45)

Daura, X.; Gademann, K.; Jaun, B.; Seebach, D.; van Gunsteren, W. F.; Mark, A. E. Peptide Folding: When Simulation Meets Experiment. Angew. Chem., Int. Ed. 1999, 38, 236–240.

(46)

Allison, J. R.; Rivers, R. C.; Christodoulou, J. C.; Vendruscolo, M.; Dobson, C. M. A Relationship between the Transient Structure in the Monomeric State and the Aggregation Propensities of α-Synuclein and β-Synuclein. Biochemistry 2014, 53, 7170–7183.

(47)

Wang, L.; Friesner, R. A.; Berne, B. J. Replica Exchange with Solute Scaling: A More Efficient Version of Replica Exchange with Solute Tempering (REST2). J. Phys. Chem. B 2011, 115, 9431–9438.

(48)

Su, L.; Cukier, R. I. Hamiltonian Replica Exchange Method Studies of a Leucine Zipper Dimer. J. Phys. Chem. B 2009, 113, 9595–9605.

(49)

Hess, B.; Bekker, H.; Berendsen, H. J. C.; Fraaije, J. G. E. M. LINCS: A Linear Constraint Solver for Molecular Simulations. J. Comput. Chem. 1997, 18, 1463–1472.

(50)

Bussi, G.; Donadio, D.; Parrinello, M. Canonical Sampling through Velocity Rescaling. J. Chem. Phys. 2007, 126, 14101.

(51)

Parrinello, M.; Rahman, A. Polymorphic Transitions in Single Crystals: A New Molecular Dynamics Method. J. Appl. Phys. (Melville, NY, U. S.) 1981, 52, 7182–7190.

(52)

Darden, T.; York, D.; Pedersen, L. Particle Mesh Ewald: An N Log(N) Method for Ewald Sums in Large Systems. J. Chem. Phys. 1993, 98, 10089–10092.

(53)

Abraham, M. J.; Murtola, T.; Schulz, R.; Páll, S.; Smith, J. C.; Hess, B.; Lindahl, E. GROMACS: High Performance Molecular Simulations through Multi-Level Parallelism from Laptops to Supercomputers. SoftwareX 2015, 1, 19–25.

(54)

Páll, S.; Abraham, M. J.; Kutzner, C.; Hess, B.; Lindahl, E. Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS. In Solving Software Challenges for Exascale; Markidis, S., Laure, E., Eds.; Springer, 2015; Vol. 8759, pp 3–27.

(55)

Hess, B.; Kutzner, C.; van der Spoel, D.; Lindahl, E. GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. J. Chem. Theory Comput. 2008, 4, 435–447.

(56)

van der Spoel, D.; Lindahl, E.; Hess, B.; Groenhof, G.; Mark, A. E.; Berendsen, H. J. C. GROMACS: Fast, Flexible and Free. J. Comput. Chem. 2005, 26, 1701–1718.

(57)

Berendsen, H. J. C.; van der Spoel, D.; van Drunen, R. GROMACS: A Message-Passing Parallel Molecular Dynamics Implementation. Comput. Phys. Commun. 1995, 91, 43–56.

(58)

Tribello, G. A.; Bonomi, M.; Branduardi, D.; Camilloni, C. PLUMED2: New Feathers for an Old Bird. Comput. Phys. Commun. 2014, 185, 604–613.

(59)

Morar, A. S.; Olteanu, A.; Young, G. B.; Pielak, G. J. Solvent-Induced Collapse of α-Synuclein and Acid-Denatured Cytochrome c. Protein Sci. 2001, 10, 2195–2199.

(60)

Schwalbe, M.; Ozenne, V.; Bilbow, S.; Jaremko, M.; Jaremko, L.; Gajda, M.; Jensen, M. R.; Biernat, J.; Becker, S.; Mandelkow, E.; Zweckstetter, M.; Blackledge, M. Predictive Atomic Resolution Descriptions of Intrinsically Disordered HTau40 and α-Synuclein in Solution from NMR and Small Angle Scattering. Structure 2014, 22, 238–249.

(61)

Wang, W.; Perovic, I.; Chittuluru, J.; Kaganovich, A.; Nguyen, L. T. T.; Liao, J.; Auclair, J. R.; Johnson, D.; Landeru, A.; Simorellis, A. K.; Ju, S.; Cookson, M. R.; Asturias, F. J.; Agar, J. N.; Webb, B. N.; Kang, C.; Ringe, D.; Petsko, G. A.; Pochapsky, T. C.; Hoang, Q. Q. A Soluble α-Synuclein Construct Forms a Dynamic Tetramer. Proc. Natl. Acad. Sci. U. S. A. 2011, 108, 17797–17802.

(62)

Moors, S. L. C.; Michielssens, S.; Ceulemans, A. Improved Replica Exchange Method for Native-State Protein Sampling. J. Chem. Theory Comput. 2011, 7, 231–237.

(63)

Liu, P.; Kim, B.; Friesner, R. A.; Berne, B. J. Replica Exchange with Solute Tempering: A Method for Sampling Biological Systems in Explicit Water. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 13749–13754.

(64)

Feigin, L. A.; Svergun, D. I. Structure Analysis by Small-Angle X-Ray and Neutron Scattering; Plenum Press: New York, 1987.

(65)

Receveur-Bréchot, V.; Durand, D. How Random Are Intrinsically Disordered Proteins? A Small Angle Scattering Perspective. Curr. Protein Pept. Sci. 2012, 13, 55–75.

(66)

Dibenedetto, D.; Rossetti, G.; Caliandro, R.; Carloni, P. A Molecular Dynamics Simulations-Based Interpretation of NMR Multidimensional Heteronuclear Spectra of Alpha-Synuclein/Dopamine Adducts. Biochemistry 2013, 52, 6672–6683.

(67)

Schneidman-Duhovny, D.; Hammel, M.; Tainer, J. A.; Sali, A. Accurate SAXS Profile Computation and Its Assessment by Contrast Variation Experiments. Biophys. J. 2013, 105, 962–974.

(68)

Schneidman-Duhovny, D.; Hammel, M.; Tainer, J. A.; Sali, A. FoXS, FoXSDock and MultiFoXS: Single-State and Multi-State Structural Modeling of Proteins and Their Complexes Based on SAXS Profiles. Nucleic Acids Res. 2016, 44, 424–429.

(69)

Han, B.; Liu, Y.; Ginzinger, S. W.; Wishart, D. S. SHIFTX2: Significantly Improved Protein Chemical Shift Prediction. J. Biomol. NMR 2011, 50, 43–57.

(70)

Weinreb, P. H.; Zhen, W.; Poon, A. W.; Conway, K. A.; Lansbury, P. T. NACP, A Protein Implicated in Alzheimer’s Disease and Learning, Is Natively Unfolded. Biochemistry 1996, 35, 13709–13715.

(71)

Davidson, W. S.; Jonas, A.; Clayton, D. F.; George, J. M. Stabilization of α-Synuclein Secondary Structure upon Binding to Synthetic Membranes. J. Biol. Chem. 1998, 273, 9443–9449.

(72)

Machado, M. R.; Pantano, S. SIRAH Tools: Mapping, Backmapping and Visualization of Coarse-Grained Models. Bioinformatics 2016, 32, 1568–1570.

(73)

Ulrich, E. L.; Akutsu, H.; Doreleijers, J. F.; Harano, Y.; Ioannidis, Y. E.; Lin, J.; Livny, M.; Mading, S.; Maziuk, D.; Miller, Z.; Nakatani, E.; Schulte, C.; Tolmie, D. E.; Wenger, R. K.; Yao, H.; Markley, J. L. BioMagResBank. Nucleic Acids Res. 2008, 36, 402–408.

(74)

Yu, H.; Han, W.; Ma, W.; Schulten, K. Transient β-Hairpin Formation in α-Synuclein Monomer Revealed by Coarse-Grained Molecular Dynamics Simulation. J. Chem. Phys. 2015, 143, 243115–243142.

(75)

Somavarapu, A. K.; Kepp, K. P. The Dependence of Amyloid-β Dynamics on Protein Force Fields and Water Models. ChemPhysChem 2015, 16, 3278–3289.

(76)

Pollock-Gagolashvili, M.; Miller, Y. Two Distinct Polymorphic Folding States of Self-Assembly of the Non-Amyloid-β Component Differ in the Arrangement of the Residues. ACS Chem. Neurosci. 2017, 8, 2613–2617.

(77)

Atsmon-Raz, Y.; Miller, Y. A Proposed Atomic Structure of the Self-Assembly of the Non-Amyloid-β Component of Human α-Synuclein As Derived by Computational Tools. J. Phys. Chem. B 2015, 119, 10005–10015.

(78)

Tuttle, M. D.; Comellas, G.; Nieuwkoop, A. J.; Covell, D. J.; Berthold, D. A.; Kloepper, K. D.; Courtney, J. M.; Kim, J. K.; Barclay, A. M.; Kendall, A.; Wan, W.; Stubbs, G.; Schwieters, C. D.; Lee, V. M.-Y.; George, J. M.; Rienstra, C. M. Solid-State NMR Structure of a Pathogenic Fibril of Full-Length Human α-Synuclein. Nat. Struct. Mol. Biol. 2016, 23, 409–415.

(79)

Ball, A. K.; Phillips, A. H.; Nerenberg, P. S.; Fawzi, N. L.; Wemmer, D. E.; Head-Gordon, T. Homogeneous and Heterogeneous Tertiary Structure Ensembles of Amyloid-β Peptides. Biochemistry 2011, 50, 7612–7628.

(80)

Hokenson, M. J.; Uversky, V. N.; Goers, J.; Yamin, G.; Munishkina, L. A.; Fink, A. L. Role of Individual Methionines in the Fibrillation of Methionine-Oxidized α-Synuclein. Biochemistry 2004, 43, 4621–4633.

(81)

Winner, B.; Jappelli, R.; Maji, S. K.; Desplats, P. A.; Boyer, L.; Aigner, S.; Hetzer, C.; Loher, T.; Vilar, M.; Campioni, S.; Tzitzilonis, C.; Soragni, A.; Jessberger, S.; Mira, H.; Consiglio, A.; Pham, E.; Masliah, E.; Gage, F. H.; Riek, R. In Vivo Demonstration That Alpha-Synuclein Oligomers Are Toxic. Proc. Natl. Acad. Sci. U. S. A. 2011, 108, 4194–4199.

(82)

Chen, L.; Wei, Y.; Wang, X.; He, R. Ribosylation Rapidly Induces α-Synuclein to Form Highly Cytotoxic Molten Globules of Advanced Glycation End Products. PLoS One 2010, 5, e9052.

(83)

Padmaraju, V.; Bhaskar, J. J.; Pradasa Rao, U. J. S.; Salimath, P. V. Role of Advanced Glycation on Aggregation and DNA Binding Properties of α-Synuclein. J. Alzheimer’s Dis. 2011, 24, 211–221.

(84)

Bertoncini, C. W.; Jung, Y.-S.; Fernandez, C. O.; Hoyer, W.; Griesinger, C.; Jovin, T. M.; Zweckstetter, M. Release of Long-Range Tertiary Interactions Potentiates Aggregation of Natively Unstructured α-Synuclein. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 1430–1435.

(85)

Rospigliosi, C. C.; McClendon, S.; Schmid, A. W.; Ramlall, T. F.; Barré, P.; Lashuel, H. A.; Eliezer, D. E46K Parkinson’s-Linked Mutation Enhances C-Terminal-to-N-Terminal Contacts in α-Synuclein. J. Mol. Biol. 2009, 388, 1022–1032.

TOC Graphic
