Partial Intrinsic Disorder Governs the Dengue Capsid Protein

May 24, 2018 - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), #07-01 Matrix, 30 Biopolis Street, Singapore 138...

Article

Partial Intrinsic Disorder Governs the Dengue Capsid Protein Conformational Ensemble

Priscilla L. S. Boon, Wuan Geok Saw, Xin Xiang Lim, Palur Venkata Raghuvamsi, Roland G Huber, Jan K Marzinek, Daniel A Holdbrook, Ganesh S Anand, Gerhard Grüber, and Peter J Bond

ACS Chem. Biol., Just Accepted Manuscript • DOI: 10.1021/acschembio.8b00231 • Publication Date (Web): 24 May 2018

Downloaded from http://pubs.acs.org on May 26, 2018


ACS Chemical Biology

Partial Intrinsic Disorder Governs the Dengue Capsid Protein Conformational Ensemble

Priscilla L. S. Boon2,3,4,†, Wuan Geok Saw1,†, Xin Xiang Lim3,†, Palur Venkata Raghuvamsi3, Roland G. Huber2, Jan K. Marzinek2,3, Daniel A. Holdbrook2, Ganesh S. Anand3, Gerhard Grüber1,*, Peter J. Bond2,3,*

1 School of Biological Sciences (SBS), Nanyang Technological University (NTU), 60 Nanyang Drive, Singapore 637551

2 Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), #07-01 Matrix, 30 Biopolis Street, Singapore 138671

3 Department of Biological Sciences (DBS), National University of Singapore (NUS), 14 Science Drive 4, Singapore 117543

4 NUS Graduate School of Integrated Science and Engineering, National University of Singapore, #05-01, 28 Medical Drive, Singapore 117456

† these authors contributed equally

* corresponding authors: [email protected] and [email protected]

keywords: partially ordered proteins, intrinsically disordered proteins (IDPs), flavivirus, dengue, Zika, molecular dynamics simulation, small angle X-ray scattering, amide hydrogen-deuterium exchange


ABSTRACT

The 11 kDa, positively charged dengue capsid protein (C protein) exists stably as a homodimer and co-localizes with the viral genome within mature viral particles. Its core is composed of four alpha helices encompassing a small hydrophobic patch that may interact with lipids, but approximately 20% of the protein at the N-terminus is intrinsically disordered, making it challenging to elucidate its conformational landscape. Here, we combine small-angle X-ray scattering (SAXS), amide hydrogen-deuterium exchange mass spectrometry (HDXMS), and atomic-resolution molecular dynamics (MD) simulations to probe the dynamics of dengue C proteins. We show that the use of MD force fields (FFs) optimized for intrinsically disordered proteins (IDPs) is necessary to capture their conformational landscape, and validate the computationally generated ensembles with reference to SAXS and HDXMS data. Representative ensembles of the C protein dimer are characterized by alternating, clamp-like exposure and occlusion of the internal hydrophobic patch, as well as by residual helical structure at the disordered N-terminus previously identified as a potential source of auto-inhibition. Such dynamics are likely to determine the multi-functionality of the C protein during the flavivirus life cycle, and hence impact design of novel antiviral compounds.


The flavivirus genus comprises a series of homologous mosquito-borne pathogens. Prominent members of the genus are West Nile virus (WNV), yellow fever virus, Japanese encephalitis virus and Zika virus (ZIKV), along with four serotypes of dengue virus (DENV) termed DENV-1 to DENV-4 1. Flaviviruses are enveloped, positive-sense single-stranded RNA viruses. The mature virion measures approximately 50 nm in diameter and contains three structural proteins: the envelope (E), membrane (M), and capsid (C) proteins. A total of 180 E proteins and 180 M proteins closely interact with a phospholipid bilayer envelope circumscribing the center of the virus, which contains the viral genome in complex with C protein 1–4. C protein is essential for encapsulation of the viral genome; in its absence, empty envelopes can be formed that undergo maturation but do not contain RNA 5. Whilst cryo-electron microscopy (cryo-EM) has enabled the structure of the virion envelope to be solved, the flexibility of the nucleocapsid core has precluded its visualization within the mature virus 1–4. A 9 Å cryo-EM structure of immature ZIKV revealed weak radial density just below the inner leaflet of the virion membrane, interpreted as a broken shell of C proteins 6. This contrasts with the idea that C proteins in enveloped viruses self-assemble in a heterogeneous manner with the RNA genome, and do not form an ordered shell 7. Regardless, detailed information on protein orientation within the flavivirus particle remains unavailable.

The mature, 11 kDa C protein consists of approximately 100 residues, folded into four α-helices (α1–α4), and forms a homodimer (Figure 1A) 8. The NMR structure of DENV-2 C protein (PDB ID: 1R6R) 8 reveals a dimeric structure that contains a well-folded core domain (residues 21–100) and a conformationally labile N-terminal region (residues 1–20), which is absent from the reported coordinates. Flavivirus proteins are expressed as part of a single polyprotein and are subsequently cleaved into the respective protein components. C protein is the N-terminal segment of the polyprotein and is connected to the precursor M protein by a short linker that is considered an ER translocation signal, which is not part of the mature protein. Mature C protein is highly basic with a net charge of approximately +22 at neutral pH, thus yielding an extremely high unit charge per molecular mass (~2/kDa). While overall highly charged, the C protein dimer encloses a hydrophobic patch formed by the α2-α2’ interface, sandwiched between α1 helices (Figure 1B).
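The charge density quoted above follows from simple arithmetic. A minimal sketch, using the rounded values from the text (the variable names are illustrative only):

```python
# Rounded values taken from the text; variable names are illustrative.
net_charge = 22      # approximate net charge of the mature C protein at neutral pH
mass_kda = 11.0      # approximate molecular mass of the mature C protein in kDa

charge_per_kda = net_charge / mass_kda
print(f"{charge_per_kda:.1f} unit charges per kDa")   # prints "2.0 unit charges per kDa"
```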

An important feature of the C protein is the coexistence of an ordered domain and an intrinsically disordered N-terminal tail which, combined with its high charge, confers multi-functionality 7. The high density of positively charged residues in two clusters on the N-terminal tails is essential for efficient viral particle formation in human and mosquito cells 9 and for mediation of RNA chaperone activity 10. Whilst the C protein has been shown to associate with membranes through its small hydrophobic patch 11, it also interacts with lipid droplets 12,13 and very low density lipoprotein (VLDL) in a potassium-dependent manner 14 via a conserved lipid-droplet-binding motif at the N-terminus 16. Disruption of the biogenesis of lipid droplets by drugs impairs viral particle formation 13, thus identifying this interaction as a potential target for antiviral therapy 15,16. A peptide derived from residues 14–23 at the N-terminus of DENV-2 C protein termed pep14–23 undergoes a conformational change from a random coil to an α-helix in the presence of anionic lipids, and blocks the hydrophobic patch from interacting with lipids 16. Combined with previous modelling efforts, this led to the region being identified as a hypothetical labile α-helix “α0”, which in the full-length protein adopts various structural arrangements that may inhibit lipid interactions by blocking access to the hydrophobic α2-α2’ interface 16.

Given the integral role of the C protein during the viral life cycle, including its enclosed hydrophobic patch and intrinsically disordered tails that are both potential targets for antiviral compounds, it is important to deepen our understanding of its dynamics. To this end, atomic-resolution MD simulations totaling >16 microseconds have been used to describe the dynamics of the full-length DENV C protein dimer from all four serotypes. In light of the emergence of specifically parameterized FFs developed to model the conformational dynamics of IDPs 17,18, comparative simulations were performed using three conventional FFs and two specialized for IDPs. In conjunction with SAXS and HDXMS, these simulations were used to assess FF suitability and to explore the dynamics of the partially disordered DENV C protein. Trajectories based on IDP FFs were found to be most suitable for describing the dynamics of the dimer in solution, and indicate how the N-termini of the core fold and disordered tails, and the labile “α0 helix”, may serve to regulate access to the internal hydrophobic patch during the viral life cycle.


RESULTS AND DISCUSSION

SAXS studies confirm full-length DENV-2 C protein exists as a dimer in solution. The full-length C protein from DENV-2 was analyzed in solution using SAXS. The experiments were performed immediately after the C protein eluted from the gel filtration column and had been concentrated. SAXS patterns were recorded at protein concentrations of 1.65, 4.05 and 7.3 mg ml⁻¹ (Supporting Information Figure S1A). The Guinier plots at low angles for all three concentrations appeared linear and confirmed good data quality with no indication of protein aggregation (Supporting Information Figure S1A, right inset). All three scattering patterns overlapped closely at very low scattering angles (Supporting Information Figure S1B), indicating no inter-particle interaction at any of the tested concentrations. Therefore, the primary data analysis was performed using the scattering pattern collected at 4.05 mg ml⁻¹, which had a moderate protein concentration with good signal-to-noise ratio. Using the Guinier approximation, the derived radius of gyration (Rg) of C protein was 2.50 ± 0.02 nm. Its distance distribution function, P(r) (Supporting Information Figure S1C), showed a bell-shaped curve from 0 to 5 nm, with a long tail pointing to a maximum particle dimension, Dmax, of 8.5 nm. The Rg value of 2.63 ± 0.01 nm extracted from the P(r) function, which takes the whole scattering curve into consideration, is in reasonable agreement with the one derived from the Guinier region (Supporting Information Table S1).
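The Guinier approximation mentioned above treats the low-angle intensity as I(q) ≈ I(0)·exp(−q²Rg²/3), so a straight-line fit of ln I against q² has slope −Rg²/3. The following is a minimal sketch of that fit on a synthetic, noise-free curve. The function name `guinier_fit` and the q-range are assumptions made for illustration, and this is not the software used in the study.

```python
import math

def guinier_fit(q, intensity):
    """Least-squares line through ln I versus q^2; returns (Rg, I0)."""
    x = [qi * qi for qi in q]
    y = [math.log(i) for i in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rg = math.sqrt(-3.0 * slope)          # slope = -Rg^2 / 3
    return rg, math.exp(intercept)

# Synthetic curve for a particle with Rg = 2.5 nm (illustrative, not measured data).
true_rg, true_i0 = 2.5, 100.0
q_values = [0.05 + 0.01 * k for k in range(40)]        # nm^-1, so q*Rg stays below ~1.1
curve = [true_i0 * math.exp(-(qi * true_rg) ** 2 / 3.0) for qi in q_values]

rg, i0 = guinier_fit(q_values, curve)
print(f"Rg = {rg:.2f} nm, I(0) = {i0:.1f}")
```

With real data the fit is normally restricted to the region where q·Rg is small (commonly below about 1.3), which is why the synthetic q-range here is kept short.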

Based on the Porod volume and DAMMIF excluded volume (see Supplementary Methods section) determined from the scattering pattern at a protein concentration of 4.05 mg ml⁻¹, the molecular mass (MM) of DENV-2 C protein was calculated to be 26.8 ± 5.3 and 29.1 ± 2.9 kDa, respectively. According to the protein sequence, the monomeric MM is 11.8 kDa, confirming that the protein exists as a dimer in solution, in agreement with previous NMR experiments 8,19 and with the lack of data in the literature to support formation of higher-order oligomers in the absence of RNA. This was further confirmed by comparing the theoretical scattering patterns of the monomeric and dimeric NMR structures (which lack the first ~20 residues at the N-terminus) with the experimental scattering pattern: a lower discrepancy (χ²) value was obtained for the dimeric form (χ² = 8.89) than for the monomeric form (χ² = 27.68) (Figure 1C).
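The dimer assignment can be checked by dividing each SAXS-derived mass by the sequence-derived monomer mass. A small sketch using the values quoted in the text (the helper `subunit_count` is a hypothetical name):

```python
MONOMER_MASS_KDA = 11.8   # sequence-derived monomer mass quoted in the text

def subunit_count(measured_mass_kda, monomer_mass_kda=MONOMER_MASS_KDA):
    """Ratio of measured to monomer mass, and the nearest whole number of subunits."""
    ratio = measured_mass_kda / monomer_mass_kda
    return ratio, round(ratio)

estimates = {"Porod volume": 26.8, "DAMMIF excluded volume": 29.1}
for method, mass in estimates.items():
    ratio, n = subunit_count(mass)
    print(f"{method}: {mass} kDa / {MONOMER_MASS_KDA} kDa = {ratio:.2f} -> {n} subunits")
```

Both ratios (about 2.3 and 2.5) round to two subunits, and both quoted uncertainties are large relative to the difference between the two estimates.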

Flexibility and ensemble formation of DENV-2 C protein dimer. Since the N-terminus of the DENV-2 C protein is likely disordered in solution, the SAXS data were further analyzed by considering the protein to be flexible. To qualitatively assess the particle state in solution, the normalized Kratky plot was created (Figure 1D) and compared to that of globular lysozyme. The plot exhibited a bell-shaped profile with its maximum shifted towards the right (Figure 1D), indicating that the recombinant protein is folded, and composed of a compact region, presumed to be the core protein domain, and a flexible part, likely contributed by the first 20 residues. To further characterize the flexibility of the C protein, the Ensemble Optimization Method (EOM) 20 was used to generate a random pool of independent full-length models of the C protein dimer, and to subsequently select an ensemble of conformers that best fit the experimental data. The optimal fit required a minimum of six conformations to describe the system (Supporting Information Figure S2). The selected ensembles exhibited a broad Rg distribution that ranged from 1.8 to 3.5 nm. A major distribution was evident between 2.4 nm and 3.1 nm (Figure 1E), which was larger than the Rg distribution of the random pool, showing that the C protein is highly flexible and extended. This was also reflected by the Rflex value, which quantifies flexibility, where the Rflex of the selected ensemble (88%) was higher than the randomness threshold of 82%. The quality of the ensemble solution was further confirmed by the control value Rσ = 1.24 (expected to be lower than 1.0 when ensemble Rflex < pool Rflex). Given that the NMR structure of the C protein (residues 21 to 100) has an Rg value of 1.71 nm (Figure 1E, inset), the flexible and extended conformation of the DENV-2 C protein may be attributed mainly to the first ~20 residues. This ensemble solution selected by EOM yielded a discrepancy value χ² of 2.28 (Figure 1F).
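For reference, the normalized (dimensionless) Kratky plot displays (q·Rg)²·I(q)/I(0) against q·Rg. For a compact particle that follows the Guinier law, this curve peaks at q·Rg = √3 ≈ 1.73 with a height of 3/e ≈ 1.10, and a maximum shifted to the right of that position points to added flexibility. The sketch below locates the reference peak numerically. It assumes an idealized Guinier-like particle rather than the measured lysozyme or C protein data.

```python
import math

def normalized_kratky(q_rg):
    """Dimensionless Kratky ordinate (q*Rg)^2 * I(q)/I(0) for a Guinier-like particle."""
    return q_rg ** 2 * math.exp(-q_rg ** 2 / 3.0)

grid = [0.01 * k for k in range(1, 601)]      # q*Rg from 0.01 to 6.00
peak_x = max(grid, key=normalized_kratky)
peak_y = normalized_kratky(peak_x)
print(f"peak at q*Rg = {peak_x:.2f}, height = {peak_y:.3f}")
```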

Simulations using IDP FFs best fit the SAXS ensemble. Interest in using simulations to probe the dynamics of IDPs is on the rise due to their functional importance in vivo 17,18,21–23. To establish the most appropriate FF to describe the dynamics of the partially ordered C protein dimer, we conducted a series of 1 µs simulation replicas using the amber14sb 24, charmm36 25, gromos96 54A7 26, amber03ws 17, and charmm36m 18 FFs. EOM was used to select an ensemble of structures from a pool of conformations generated from each of the simulations. Agreement with the SAXS data could be obtained by selecting structures from each FF ensemble via EOM, with χ² values ranging from 0.957 to 1.578 (Figure 2A) across the entire set of experimental values, also yielding curves with good agreement in the low scattering angle region representing the largest structural features (Figure 2A, inset). Agreement with the SAXS data could also be obtained by fitting a linear combination of theoretical intensities based on the raw simulation trajectories for each FF, with χ² ranging from 1.17 to 2.52; this similarly yielded curves with good agreement in the low scattering angle region (Supporting Information Figure S3), especially for amber03ws.
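To make the fitting idea concrete, the sketch below averages per-conformer intensity profiles into an ensemble curve and scores it against an "experimental" curve with a reduced χ² after least-squares scaling. All numbers are toy values, the function names are hypothetical, and the N − 1 normalization is an assumption. EOM itself uses a genetic algorithm to choose the conformers, which is not reproduced here.

```python
def ensemble_intensity(profiles, weights):
    """Weighted average of per-conformer intensity profiles (weights sum to 1)."""
    n_points = len(profiles[0])
    return [sum(w * p[k] for w, p in zip(weights, profiles)) for k in range(n_points)]

def reduced_chi2(i_exp, sigma, i_calc):
    """Reduced chi-square after least-squares scaling of the calculated curve."""
    scale = (sum(e * c / s ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
             / sum(c * c / s ** 2 for c, s in zip(i_calc, sigma)))
    residual = sum(((e - scale * c) / s) ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    return residual / (len(i_exp) - 1)

# Toy numbers only: two conformer profiles and an "experiment" equal to twice their 50:50 mix.
compact = [10.0, 8.0, 5.0, 2.0]
extended = [10.0, 6.0, 3.0, 1.0]
experiment = [20.0, 14.0, 8.0, 3.0]
errors = [0.5, 0.5, 0.5, 0.5]

mixed = ensemble_intensity([compact, extended], [0.5, 0.5])
print(reduced_chi2(experiment, errors, mixed))      # the mixture reproduces the curve
print(reduced_chi2(experiment, errors, compact))    # a single conformer fits worse
```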

When analyzing the compactness of the structural ensembles selected by EOM, most FFs yielded sizes that were decidedly too compact when compared with experiment, based on the calculated Rg values (Figure 2C). Only the amber03ws FF (lowest χ² value), which was specifically parameterized for IDPs, was able to produce an ensemble whose Rg range coincided with the experimentally determined value (Figure 2C). The Rg values of the amber03ws pool range from 2.1 nm to 3.0 nm (Figure 2D). The structures that were selected with EOM via the genetic algorithm that best fit the intensity curve had Rg values around 2.5 nm, corresponding to the Rg of the majority of the structures derived from the trajectory, indicating that amber03ws samples conformations that align well with the experimental data. The selected structures using amber03ws showed not only a qualitative improvement in fit, but also yielded a relatively uniform selection from the pool of MD-derived conformations (Figure 2D), providing confidence that this simulated ensemble is most able to represent the dynamics measured experimentally in solution, subject to the under-determined nature of the true ensemble.
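The Rg of a simulated conformer is the mass-weighted root-mean-square distance of its atoms from the center of mass. A minimal sketch on a toy four-bead chain follows. The coordinates are invented for illustration, and a SAXS-derived Rg also includes hydration-layer contributions that this purely geometric calculation ignores.

```python
import math

def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of a set of 3D points."""
    total = sum(masses)
    center = [sum(m * c[axis] for m, c in zip(masses, coords)) / total for axis in range(3)]
    weighted_sq = sum(m * sum((c[axis] - center[axis]) ** 2 for axis in range(3))
                      for m, c in zip(masses, coords))
    return math.sqrt(weighted_sq / total)

# Toy "conformers": four equal-mass beads on a line, spaced 1 nm (extended) or 0.5 nm (compact).
extended = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
compact = [(0.5 * x, y, z) for x, y, z in extended]
masses = [1.0] * 4

rg_extended = radius_of_gyration(extended, masses)
rg_compact = radius_of_gyration(compact, masses)
print(f"extended: {rg_extended:.3f} nm, compact: {rg_compact:.3f} nm")
```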

Calculation of the root-mean-square deviation (RMSD) (Supporting Information Figure S4) and per-residue root-mean-square fluctuations (RMSFs) (Supporting Information Figure S5) revealed that, irrespective of whether an IDP-optimized FF was used, the rigidly folded core of the C protein retained its structure and exhibited a common pattern of fluctuation, whilst blockwise analysis indicated convergence across FFs within ~400 ns of sampling (Supporting Information Figures S4–S6). Analyzing the end-to-end distances (Supporting Information Figure S7) of the N-termini (residues 1 to 20) of the DENV-2 C protein revealed a similar trend across the five FFs, with amber03ws producing the longest distances. The solvent accessibility (Supporting Information Figure S8) of the N-terminal residues of DENV-2 C protein was very similar for amber14sb, charmm36, charmm36m, and gromos96 54A7, while the amber03ws FF resulted in looser N-terminal tails with greater solvent accessibility. The propensity to form helices (Supporting Information Figure S9) in the N-terminal tails was least pronounced for amber03ws compared with the four other FFs. Comparing the distributions of backbone angles (phi and psi) for the N-terminal tails of the DENV-2 C protein (Supporting Information Figure S10) revealed that amber03ws has a broader distribution of angles in the “allowed” regions compared to the other four FFs. The charmm36m FF has a reduced propensity to form left-handed α-helices compared to charmm36, but the helicity is still more pronounced than in either of the two amber FFs. The gromos96 54A7 FF resulted in the most residues with angles in “disallowed” regions.
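As a reminder of what the per-residue RMSF measures, the sketch below computes it for a toy two-atom trajectory in which one atom is rigid and the other swings. It assumes the frames have already been superimposed on a reference. The function name `rmsf` and the coordinates are illustrative and unrelated to the simulation data.

```python
import math

def rmsf(trajectory):
    """Per-atom RMSF from a list of frames; each frame is a list of (x, y, z) tuples."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for atom in range(n_atoms):
        mean = [sum(frame[atom][axis] for frame in trajectory) / n_frames for axis in range(3)]
        msd = sum(sum((frame[atom][axis] - mean[axis]) ** 2 for axis in range(3))
                  for frame in trajectory) / n_frames
        result.append(math.sqrt(msd))
    return result

# Toy trajectory (pre-aligned): atom 0 is rigid, atom 1 swings by +/- 0.3 nm along x.
frames = [
    [(0.0, 0.0, 0.0), (1.3, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.7, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.3, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.7, 0.0, 0.0)],
]
fluctuations = rmsf(frames)
print([round(value, 3) for value in fluctuations])
```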

DENV C protein dimer exhibits opening-closing motions supported by the α0-α1 region. The DENV-2 C protein comprises four α-helices that form a stable core, with a conformationally labile N-terminal region including a hypothesized α0 helix (Figure 1A). In a previous study, the N-terminal regions were reconstructed as α-helices that stacked on top of the α1 helices 16. This model was used as a starting structure for simulations, and principal component analysis (PCA) was subsequently used to filter the trajectories, in order to identify the dominant motions of the DENV-2 C protein dimer. Based on the amber03ws FF, the major motions of the protein core may be described by the first two modes (at least ~50% of the total dynamics in all trajectories), which describe an opening and closing motion of the α1 helices (Figure 3A) that serves to occlude the central hydrophobic patch (Figure 1B). Based on the principal motions of the DENV-2 C protein core, the starting structure (PDB ID: 1R6R) is in an “open” state (Figure 3B), with the α1 helices spread apart exposing the hydrophobic patch. The hydrophobic patch became occluded within the first 400 ns of the simulation and subsequently remained in the “closed” state, as defined by the closed extreme of the second mode (Figure 3B).
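PCA of a trajectory diagonalizes the covariance matrix of the atomic coordinates, and the fraction of the total variance carried by each eigenvalue indicates how much of the motion a mode describes. The sketch below illustrates this on a toy two-coordinate "trajectory", using power iteration to find the leading mode. The data and function names are hypothetical, and production analyses would use a dedicated trajectory-analysis package.

```python
import math

def covariance_matrix(samples):
    """Covariance matrix of row-wise samples (each row holds one frame's coordinates)."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(row[j] for row in samples) / n for j in range(dim)]
    centered = [[row[j] - mean[j] for j in range(dim)] for row in samples]
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
            for i in range(dim)]

def leading_mode(cov, iterations=200):
    """Largest eigenvalue and eigenvector of a symmetric matrix by power iteration."""
    dim = len(cov)
    vec = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        new = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    eigenvalue = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim))
    return eigenvalue, vec

# Toy "trajectory": two coordinates that open and close together, plus small jitter.
frames = [(-2.0, -1.9), (-1.0, -1.1), (0.0, 0.1), (1.0, 0.9), (2.0, 2.0)]
cov = covariance_matrix(frames)
eigenvalue, mode = leading_mode(cov)
fraction = eigenvalue / (cov[0][0] + cov[1][1])     # share of the total variance
print(f"first mode explains {fraction:.1%} of the variance")
```

In this toy case nearly all variance lies along one concerted direction. In the simulations described above, the first two modes together account for at least about half of the total dynamics.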


In order to compare the dynamics of the C protein across the different serotypes of dengue, homology models of DENV-1, DENV-3 and DENV-4 C proteins were also simulated using the amber03ws FF. The proteins from the four serotypes share a pairwise sequence identity of 70% or more. A multiple sequence alignment shows that, overall, ZIKV and WNV are more closely related to DENV-4 and DENV-2, while DENV-1 and DENV-3 are more closely related to one another (Figure 4A). The core regions of all four serotypes of DENV exhibited similar RMSF profiles, with increased flexibility in the N-terminal tails (Supporting Information Figure S11). Tracking the secondary structural changes of this region along the trajectories for all four serotypes revealed rapid loss of α-helicity compared to the starting structure for the first ~10 residues (Figure 4B). Interestingly, DENV-2 and DENV-4 showed a more persistent pattern of α-helicity in the latter half of the N-terminal tails during the simulations, which may correspond to the hypothetical, labile α0 helix 16. In this region, Glu20 was observed to form an intermittent salt bridge with Arg23 (Supporting Information Figure S12), which may help to stabilize α0, whereas in DENV-1 and DENV-3, this residue is substituted by Ala or Val, respectively. Examining the principal motions of the DENV C protein dimer for the other serotypes revealed similar “collapsing dynamics” of the α1 helices leading to occlusion of the hydrophobic patch (Figure 4C). However, the hydrophobic patch of the DENV-4 C protein dimer was intermittently exposed over the simulation time, due to greater fluctuations of residues 1–40 (Supporting Information Figure S11). Overlaying the crystal structure of WNV C protein (PDB ID: 1SFK) with the NMR structure of DENV-2 C protein (PDB ID: 1R6R) shows that the α1 helix of WNV adopts a closed conformation, occluding its hydrophobic patch (Figure 4D). This indicates that a concerted approach of α1 (and potential α0) helices towards one another mediates access to the hydrophobic patch, irrespective of the previously proposed auto-inhibitory collapse of N-terminal tails on top of the cavity 16.
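Pairwise percentage identity can be read directly from an alignment by counting matching columns. The sketch below uses made-up aligned fragments, which are not the DENV capsid sequences, and assumes that columns with a gap in either sequence are excluded from the denominator. Other conventions exist and give slightly different values.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

# Made-up aligned fragments for illustration; these are NOT the real capsid sequences.
alignment = {
    "toy_A": "MKRAEGLV-K",
    "toy_B": "MKRAAGLVSK",
    "toy_C": "MRRVEGLV-R",
}
names = sorted(alignment)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        identity = percent_identity(alignment[first], alignment[second])
        print(first, second, round(identity, 1))
```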

HDX deuterium uptake correlates with simulated flexibility and indicates low dynamics in α0-α1 region. HDXMS was used to probe the dynamics of the DENV-2 C protein dimer in an aqueous environment. The closest correlation between average backbone amide hydrogen bond propensities and experimental deuterium uptake (Supporting Information Figure S13) was obtained for the amber03ws FF (Pearson correlation of ~0.8–0.9 across all measured time points). It should be noted that the deuterium exchange measurements reflect the raw experimental readout, unadjusted for the ~90% deuterium content under experimental conditions and the loss of deuterium due to back-exchange (~15%); our estimated correlations would likely be even stronger under 100% deuterium environments without such back-exchange. Relative fractional deuterium uptake (RFU) values for each peptide were mapped onto the EOM-selected structure of DENV-2 C protein taken from the amber03ws simulation pool which best fit the SAXS data (Figure 5B). Overlapping pepsin-proteolyzed peptides of DENV-2 C protein spanning the N-terminal region (residues 1–16) exhibited high RFUs (greater than 0.38), in agreement with the prediction from simulations. Consistently, peptides spanning α2 (residues 56–65), predicted to be part of the hydrophobic patch, exhibited low RFU values of ~0.1. Pepsin-proteolyzed peptides spanning the C-terminus also exhibited low RFU values and low RMSFs. The peptide spanning the hypothesized α0 helix (residues 16–28) exhibited RFU values intermediate between those of the core and N-terminal regions. Taken together, local dynamics captured by HDXMS corroborated the peaks in RMSF values and hydrogen-bonding patterns predicted by simulations, indicating that the N-termini of the C protein dimer exhibit high flexibility indicative of disorder, whereas peptides mapped to the core show low deuterium exchange, connected by a partially structured α0 region.
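The Pearson correlation used to compare simulation-derived quantities with deuterium uptake can be computed as below. The per-peptide values are invented for illustration and are not the measured data. The simulated descriptor is assumed here to be some measure of amide exposure, standing in for the hydrogen bond propensities analyzed in the text.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-peptide values, invented for illustration (not the measured data).
simulated_exposure = [0.9, 0.7, 0.4, 0.2, 0.1]
measured_uptake = [0.45, 0.40, 0.25, 0.12, 0.10]

r = pearson(simulated_exposure, measured_uptake)
print(f"Pearson r = {r:.2f}")
```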


CONCLUDING REMARKS

The DENV C proteins are partially ordered, i.e. they simultaneously consist of segments with a distinct fold and intrinsically disordered regions. This peculiar feature, thought to be crucial for their biological multi-functionality, makes their study uniquely challenging. In this work, we have combined a range of approaches to understand how the C protein dimers behave in solution. SAXS and HDX data were used to evaluate the ensembles generated by microsecond timescale MD sampling with a variety of FFs, thus enabling us to identify a simulation protocol that best reproduces the conformational dynamics of the system.

Structural ensemble assessment. We found that it is imperative to use an FF that is specifically optimized for intrinsically disordered ensembles to accurately describe the dynamics of a protein, even one that contains only ~10–20% disordered regions. In this case, the amber03ws FF yielded structural ensembles that are clearly in better agreement with the presented experimental evidence than the other tested FFs. The amber03ws FF is less biased towards helicity, and produces a greater solvent accessible area and slightly longer end-to-end distances for the disordered N-terminal tails of the C protein. The structure of DENV-2 C protein selected by SAXS from the simulation pool using amber03ws provides a better explanation for deuterium uptake than the NMR structure. The HDX data track well with the RMSF values derived from simulation and act as an experimental proxy for the backbone flexibility of the protein. A pool of randomly generated structures revealed a clear selection bias towards extended structures with an intermediate fit quality compared with the experimental SAXS data (Figures 1E, 1F). Selection of structures from a pool of MD-generated conformations using amber03ws showed not only a qualitative improvement in fit, but also yielded a relatively uniform selection from the pool (Figure 2D), giving us confidence that the MD-generated ensembles are most likely able to represent the structural ensembles present in solution. This therefore represents a powerful combination of methods, enabling investigation of full-length dynamic and/or disordered biomolecules in solution that are unlikely to crystallize, and whose evident plasticity may play a crucial role in function.

Opening-closing dynamics around the hydrophobic cavity. Analysis of the structural ensembles revealed interesting dynamics surrounding the hydrophobic patch formed by the α1- and α2-helices (Figure 1A, 1B). This peculiar feature of the DENV C protein was noted when the structure was first determined 8. Comparing the open nature of the DENV C protein dimer with the more occluded state of the homologous WNV structure 27, it is apparent that the DENV hydrophobic patch is unusually accessible in the NMR structure. It is known that the DENV C protein interacts with a variety of cellular lipid components, e.g. VLDL 14 or lipid droplets 12,13,15, which may be blocked by a small DENV C protein-derived peptide, pep14–23 16. Inhibiting this interaction reduces viral fitness, indicating that the interplay of C protein and cellular lipid components is a crucial aspect of the viral life cycle. In our studies we observed a clamp-like closing behavior of the C protein hydrophobic patch (Figure 3A, 3B), isolated by principal component analysis (Figure 3A). We postulate that this hydrophobic region may be exposed when the C protein is in contact with viral or host membranes, especially during the packaging and uncoating phases of the virus life cycle. This would make this interaction an attractive target for antiviral therapeutics targeting DENV. As the hydrophobic patch is highly conserved among flaviviruses (Figure 4A), a targeted compound is expected to show broad efficacy against this unique fold 8 in flaviviruses.

Local Dynamics. Whereas the global dynamics observed here are largely consistent between the four serotypes of DENV, differences in the behavior of the intrinsically disordered regions are apparent upon closer inspection. The closely related DENV-1 and DENV-3 C proteins retained significantly less residual structure in the disordered N-terminal tails during simulation compared to DENV-2 and DENV-4, whose proteins showed considerable retention of helicity in the latter half of the tails (Figure 4B). Examination of the multiple sequence alignment around residues 14 to 24 (corresponding to the α0 helix) revealed the intermittent presence of the Glu20-Arg23 salt bridge in the case of DENV-2 and DENV-4 C proteins, which was absent for DENV-1 and DENV-3 due to the substitution of Glu20 by Ala and Val, respectively. Interestingly, in WNV and ZIKV C proteins, this residue is substituted by Gly, which may further destabilize the α0 helix due to the poor helix-forming propensity of glycine; this is consistent with the more occluded state of the WNV experimental structure in comparison with DENV 27. The existence of a hypothetical, labile α0 helix 16 is thus supported by our simulations, and also corroborated by the minimal deuterium uptake measured in this region (Figure 5B), at least for DENV-2. The N-terminal tails of the DENV C protein are essential for efficient viral particle formation 9 as well as for recognition of different ligands 10,14,15,28 during the life cycle of the virus. Subtle differences in the local dynamics of this region may thus affect the function, survivability, and replicative fitness of the virus, and our observations therefore warrant further exploration, particularly in the context of antiviral drug development.
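The intermittent presence of a salt bridge such as Glu20-Arg23 can be quantified as the fraction of trajectory frames in which the two side chains are in contact. The following is a minimal sketch under stated assumptions: the 0.4 nm cutoff, the function names, and the toy coordinates are illustrative choices and are not taken from the analysis reported here.

```python
import math

def min_distance(group_a, group_b):
    """Smallest pairwise distance between two sets of (x, y, z) points."""
    return min(math.dist(a, b) for a in group_a for b in group_b)

def salt_bridge_occupancy(frames, cutoff_nm=0.4):
    """Fraction of frames in which the acid and base groups are within the cutoff.

    Each frame is a pair (acid_atoms, base_atoms), e.g. the carboxylate oxygens
    of a Glu and the guanidinium nitrogens of an Arg, in nm.
    The 0.4 nm cutoff is an assumed, commonly used value.
    """
    formed = [min_distance(acid, base) <= cutoff_nm for acid, base in frames]
    return sum(formed) / len(formed)

# Toy two-frame trajectory (hypothetical coordinates, nm):
# the bridge is formed in the first frame and broken in the second.
toy_frames = [
    ([(0.0, 0.0, 0.0)], [(0.3, 0.0, 0.0)]),
    ([(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)]),
]
print(salt_bridge_occupancy(toy_frames))
```

An occupancy well below 1 but above 0 would correspond to the intermittent behavior described above; comparing this value between serotypes is one simple way to express the difference numerically.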


METHODS

A complete description of the computational, HDXMS, and SAXS methods may be found in the Supporting Information. Briefly, the starting coordinates for homology models and subsequent simulations were based on the NMR structure of the DENV-2 C protein dimer (PDB ID: 1R6R) 8, with the missing N-termini modelled as described previously 16. All simulations were run using GROMACS 2016 29. Theoretical scattering curves were created using CRYSOL 30, and GAJOE 20 was used to select an ensemble of theoretical curves that best fit the experimental SAXS intensity curve. SAXS data of DENV-2 C protein were measured with the BRUKER NANOSTAR SAXS instrument with a setup as described recently 31. The X-rays are filtered through Montel mirrors and collimated by a two-pinhole system. The sample-to-detector distance was set at 0.67 m and the sample chamber and X-ray paths were evacuated. This setup covers a range of momentum transfer of 0.16 < q < 4 nm⁻¹ (q = 4π sin(θ)/λ, where 2θ is the scattering angle and λ is the X-ray wavelength) 31–33. SAXS experiments were carried out at 15 °C in buffer A (50 mM Tris/HCl, pH 7.5, 1 M NaCl) or buffer B (50 mM Tris/HCl, pH 7.5), using a sample volume of 40 µl in a vacuum-tight quartz capillary. The data were collected for 30 min and for each measurement a total of six frames at 5 min intervals were recorded. The scattered X-rays, detected by a two-dimensional area detector, were flood-field and spatially corrected. The flood-field correction rectifies the intensity distortions arising from non-uniform pixel-to-pixel sensitivity in the detector, using a radioactive source (⁵⁵Fe) (Bruker AXS, Germany). The spatial correction fixes the inherent geometrical pincushion distortion by placing a mask with a regular pattern before the detector and measuring the deviation from regularity in the detected image. The data were then converted to one-dimensional scattering as a function of momentum transfer by radial averaging using the built-in SAXS software (Bruker AXS, Germany), and normalized by the incident intensity and transmission of the sample using a strongly scattering glassy carbon of known X-ray transmission. The data were tested for possible radiation damage by comparing the six data frames and no changes were detected. The scattering of the buffer was subtracted and the difference curves were normalized by the concentration.
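The momentum-transfer definition and the final reduction step (buffer subtraction followed by concentration normalization) can be illustrated with a short sketch. This is not the instrument software: the function names, the toy intensities, and in particular the wavelength value are assumptions made for illustration, since the wavelength is not stated in this section.

```python
import math

def momentum_transfer(two_theta_deg, wavelength_nm):
    """Return q = 4*pi*sin(theta)/lambda in nm^-1, where 2*theta is the scattering angle."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_nm

def scattering_angle(q_inv_nm, wavelength_nm):
    """Invert the definition: scattering angle 2*theta (degrees) for a given q."""
    theta = math.asin(q_inv_nm * wavelength_nm / (4.0 * math.pi))
    return math.degrees(2.0 * theta)

def reduce_curve(sample, buffer, concentration):
    """Subtract buffer scattering point by point, then normalize by concentration."""
    if len(sample) != len(buffer):
        raise ValueError("sample and buffer must be on the same q grid")
    return [(s - b) / concentration for s, b in zip(sample, buffer)]

# ASSUMPTION: wavelength chosen only for illustration (not given in the text).
WAVELENGTH_NM = 0.154

# Scattering angles corresponding to the q range quoted in the text.
for q in (0.16, 4.0):
    angle = scattering_angle(q, WAVELENGTH_NM)
    print(f"q = {q} nm^-1 corresponds to 2theta = {angle:.3f} degrees")

# Toy intensities (hypothetical) on a shared q grid.
print(reduce_curve([10.0, 6.0], [2.0, 2.0], concentration=2.0))
```

Keeping the sample and buffer curves on an identical q grid, as enforced above, avoids the need for interpolation before subtraction.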


SUPPORTING INFORMATION

Supporting Information Available: This material includes full experimental details, along with Figures S1–S13 and Tables S1–S2. This material is available free of charge via the Internet (http://pubs.acs.org).

ACKNOWLEDGEMENTS

This research was supported by the Ministry of Education in Singapore (MOE AcRF Tier 3 Grant Number MOE2012-T3-1-008). The authors gratefully acknowledge computing resources provided by the National Supercomputing Center Singapore, www.nscc.sg. The authors declare no competing financial interests.

References

(1) Kuhn, R. J., Zhang, W., Rossmann, M. G., Pletnev, S. V., Corver, J., Lenches, E., Jones, C. T., Mukhopadhyay, S., Chipman, P. R., Strauss, E. G., Baker, T. S., and Strauss, J. H. (2002) Structure of Dengue Virus: Implications for Flavivirus Organization, Maturation, and Fusion. Cell 108, 717–725.

(2) Zhang, X., Ge, P., Yu, X., Brannan, J. M., Bi, G., Zhang, Q., Schein, S., and Zhou, Z. H. (2013) Cryo-EM structure of the mature dengue virus at 3.5-Å resolution. Nat. Struct. Mol. Biol. 20, 105–110.

(3) Kostyuchenko, V. A., Lim, E. X. Y. Y., Zhang, S., Fibriansah, G., Ng, T.-S., Ooi, J. S. G. G., Shi, J., and Lok, S.-M. (2016) Structure of the thermally stable Zika virus. Nature 533, 425–428.

(4) Sirohi, D., Chen, Z., Sun, L., Klose, T., Pierson, T. C., Rossmann, M. G., and Kuhn, R. J. (2016) The 3.8 Å resolution cryo-EM structure of Zika virus. Science 352, 467–470.

(5) Ferlenghi, I., Clarke, M., Ruttan, T., Allison, S. L., Schalich, J., Heinz, F. X., Harrison, S. C., Rey, F. A., and Fuller, S. D. (2001) Molecular Organization of a Recombinant Subviral Particle from Tick-Borne Encephalitis Virus. Mol. Cell 7, 593–602.

(6) Prasad, V. M., Miller, A. S., Klose, T., Sirohi, D., Buda, G., Jiang, W., Kuhn, R. J., and Rossmann, M. G. (2017) Structure of the immature Zika virus at 9 Å resolution. Nat. Struct. Mol. Biol. 24, 184–186.

(7) Freire, J. M., Santos, N. C., Veiga, A. S., Da Poian, A. T., and Castanho, M. A. R. B. (2015) Rethinking the capsid proteins of enveloped viruses: multifunctionality from genome packaging to genome transfection. FEBS J. 282, 2267–2278.

(8) Ma, L., Jones, C. T., Groesch, T. D., Kuhn, R. J., and Post, C. B. (2004) Solution structure of dengue virus capsid protein reveals another fold. Proc. Natl. Acad. Sci. 101, 3414–3419.

(9) Samsa, M. M., Mondotte, J. A., Caramelo, J. J., and Gamarnik, A. V. (2012) Uncoupling cis-Acting RNA Elements from Coding Sequences Revealed a Requirement of the N-Terminal Region of Dengue Virus Capsid Protein in Virus Particle Formation. J. Virol. 86, 1046–1058.

(10) Ivanyi-Nagy, R., Lavergne, J.-P., Gabus, C., Ficheux, D., and Darlix, J.-L. (2008) RNA chaperoning and intrinsic disorder in the core proteins of Flaviviridae. Nucleic Acids Res. 36, 712–725.

(11) Markoff, L., Falgout, B., and Chang, A. (1997) A Conserved Internal Hydrophobic Domain Mediates the Stable Membrane Integration of the Dengue Virus Capsid Protein. Virology 233, 105–117.

(12) Carvalho, F. A., Carneiro, F. A., Martins, I. C., Assunção-Miranda, I., Faustino, A. F., Pereira, R. M., Bozza, P. T., Castanho, M. A. R. B., Mohana-Borges, R., Da Poian, A. T., and Santos, N. C. (2012) Dengue Virus Capsid Protein Binding to Hepatic Lipid Droplets (LD) Is Potassium Ion Dependent and Is Mediated by LD Surface Proteins. J. Virol. 86, 2096–2108.

(13) Samsa, M. M., Mondotte, J. A., Iglesias, N. G., Assunção-Miranda, I., Barbosa-Lima, G., Da Poian, A. T., Bozza, P. T., and Gamarnik, A. V. (2009) Dengue Virus Capsid Protein Usurps Lipid Droplets for Viral Particle Formation. PLoS Pathog. (Diamond, M. S., Ed.) 5, e1000632.

(14) Faustino, A. F., Carvalho, F. A., Martins, I. C., Castanho, M. A. R. B., Mohana-Borges, R., Almeida, F. C. L., Da Poian, A. T., and Santos, N. C. (2014) Dengue virus capsid protein interacts specifically with very low-density lipoproteins. Nanomedicine Nanotechnology, Biol. Med. 10, 247–255.

(15) Martins, I. C., Gomes-Neto, F., Faustino, A. F., Carvalho, F. A., Carneiro, F. A., Bozza, P. T., Mohana-Borges, R., Castanho, M. A. R. B., Almeida, F. C. L., Santos, N. C., and Da Poian, A. T. (2012) The disordered N-terminal region of dengue virus capsid protein contains a lipid-droplet-binding motif. Biochem. J. 444, 405–415.

(16) Faustino, A. F., Guerra, G. M., Huber, R. G., Hollmann, A., Domingues, M. M., Barbosa, G. M., Enguita, F. J., Bond, P. J., Castanho, M. A. R. B., Da Poian, A. T., Almeida, F. C. L., Santos, N. C., and Martins, I. C. (2015) Understanding Dengue Virus Capsid Protein Disordered N-Terminus and pep14-23-Based Inhibition. ACS Chem. Biol. 10, 517–526.

(17) Best, R. B., Zheng, W., and Mittal, J. (2014) Balanced Protein–Water Interactions Improve Properties of Disordered Proteins and Non-Specific Protein Association. J. Chem. Theory Comput. 10, 5113–5124.

(18) Huang, J., Rauscher, S., Nawrocki, G., Ran, T., Feig, M., de Groot, B. L., Grubmüller, H., and MacKerell, A. D. (2016) CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nat. Methods 14, 71–73.

(19) Jones, C. T., Ma, L., Burgner, J. W., Groesch, T. D., Post, C. B., and Kuhn, R. J. (2003) Flavivirus Capsid Is a Dimeric Alpha-Helical Protein. J. Virol. 77, 7143–7149.

(20) Tria, G., Mertens, H. D. T., Kachala, M., and Svergun, D. I. (2015) Advanced ensemble modelling of flexible macromolecules using X-ray solution scattering. IUCrJ 2, 207–217.

(21) Chong, S.-H., Chatterjee, P., and Ham, S. (2017) Computer Simulations of Intrinsically Disordered Proteins. Annu. Rev. Phys. Chem. 68, 117–134.

(22) Best, R. B. (2017) Computational and theoretical advances in studies of intrinsically disordered proteins. Curr. Opin. Struct. Biol. 42, 147–154.

(23) Huang, J., and MacKerell, A. D. (2018) Force field development and simulations of intrinsically disordered proteins. Curr. Opin. Struct. Biol. 48, 40–48.

(24) Maier, J. A., Martinez, C., Kasavajhala, K., Wickstrom, L., Hauser, K. E., and Simmerling, C. (2015) ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB. J. Chem. Theory Comput. 11, 3696–3713.

(25) Huang, J., and MacKerell, A. D. (2013) CHARMM36 all-atom additive protein force field: Validation based on comparison to NMR data. J. Comput. Chem. 34, 2135–2145.

(26) Schmid, N., Eichenberger, A. P., Choutko, A., Riniker, S., Winger, M., Mark, A. E., and van Gunsteren, W. F. (2011) Definition and testing of the GROMOS force-field versions 54A7 and 54B7. Eur. Biophys. J. 40, 843–856.

(27) Dokland, T., Walsh, M., Mackenzie, J. M., Khromykh, A. A., Ee, K.-H., and Wang, S. (2004) West Nile Virus Core Protein. Structure 12, 1157–1163.

(28) Faustino, A. F., Martins, I. C., Carvalho, F. A., Castanho, M. A. R. B., Maurer-Stroh, S., and Santos, N. C. (2015) Understanding Dengue Virus Capsid Protein Interaction with Key Biological Targets. Sci. Rep. 5, 10592.

(29) Abraham, M. J., Murtola, T., Schulz, R., Páll, S., Smith, J. C., Hess, B., and Lindahl, E. (2015) GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1–2, 19–25.

(30) Svergun, D., Barberato, C., and Koch, M. H. J. (1995) CRYSOL – a Program to Evaluate X-ray Solution Scattering of Biological Macromolecules from Atomic Coordinates. J. Appl. Crystallogr. 28, 768–773.

(31) Balakrishna, A. M., Basak, S., Manimekalai, M. S. S., and Grüber, G. (2015) Crystal Structure of Subunits D and F in Complex Gives Insight into Energy Transmission of the Eukaryotic V-ATPase from Saccharomyces cerevisiae. J. Biol. Chem. 290, 3183–3196.

(32) Dip, P. V., Kamariah, N., Subramanian Manimekalai, M. S., Nartey, W., Balakrishna, A. M., Eisenhaber, F., Eisenhaber, B., and Grüber, G. (2014) Structure, mechanism and ensemble formation of the alkylhydroperoxide reductase subunits AhpC and AhpF from Escherichia coli. Acta Crystallogr. Sect. D Biol. Crystallogr. 70, 2848–2862.

(33) Tay, M. Y. F., Saw, W. G., Zhao, Y., Chan, K. W. K., Singh, D., Chong, Y., Forwood, J. K., Ooi, E. E., Grüber, G., Lescar, J., Luo, D., and Vasudevan, S. G. (2015) The C-terminal 50 Amino Acid Residues of Dengue NS3 Protein Are Important for NS3-NS5 Interaction and Viral Replication. J. Biol. Chem. 290, 2379–2394.

(34) Paramo, T., East, A., Garzón, D., Ulmschneider, M. B., and Bond, P. J. (2014) Efficient Characterization of Protein Cavities within Molecular Simulation Trajectories: trj_cavity. J. Chem. Theory Comput. 10, 2151–2164.

FIGURE LEGENDS

Figure 1. SAXS studies of DENV-2 C protein. (A) Schematic of DENV-2 C protein dimer with reconstructed N-terminal tails (red) in a “stacked” conformation. The core region comprises four alpha helices α1, α2, α3, and α4, colored orange, yellow, green, and blue, respectively. (B) Position of the hydrophobic patch (green surface) in the C protein dimer, assessed using trj_cavity 34. (C) Comparison of SAXS pattern of DENV-2 C protein (green circles) and theoretical scattering profiles of monomeric (red line) and dimeric (blue line) NMR structures (PDB ID: 1R6R). The dimeric NMR structure provides a better fit to the experimental scattering data (χ² of 8.89) than the monomeric NMR structure (χ² of 27.68). The NMR structures of C protein in (top) monomeric and (bottom) dimeric forms are shown inset. (D) The normalized Kratky plots for lysozyme (grey circles) and DENV-2 C protein (green circles). (E) The Rg distribution of selected ensembles from DENV-2 C protein was broad and positioned to the right of the random pool. The Rg calculated from the NMR structure was located outside the pool. (F) The calculated ensemble scattering profile from EOM (red line) fitted to the experimental scattering profile (green circles), with a discrepancy χ² of 2.28.
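A discrepancy of the kind quoted in panels C and F compares a calculated profile with the experimental one, point by point, in units of the experimental error. The sketch below uses one common form, a reduced χ² with a least-squares scale factor applied to the calculated intensities. It is an illustrative assumption and may differ in detail from the definitions implemented in CRYSOL or GAJOE; the function name and toy numbers are hypothetical.

```python
def scaled_chi2(i_exp, sigma, i_calc):
    """Reduced chi-squared between experimental and calculated intensities.

    chi2 = 1/(N - 1) * sum(((I_exp - c * I_calc) / sigma)**2),
    with c the least-squares scale factor for the calculated curve.
    Returns (chi2, c).
    """
    n = len(i_exp)
    if n < 2 or len(sigma) != n or len(i_calc) != n:
        raise ValueError("need at least two points and equal-length inputs")
    # Least-squares scale factor that best matches I_calc to I_exp.
    num = sum(e * c / s ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    den = sum((c / s) ** 2 for c, s in zip(i_calc, sigma))
    scale = num / den
    chi2 = sum(((e - scale * c) / s) ** 2
               for e, c, s in zip(i_exp, i_calc, sigma)) / (n - 1)
    return chi2, scale

# Toy data (hypothetical): the calculated curve differs only by a factor of 2,
# so the discrepancy after scaling is essentially zero.
chi2, scale = scaled_chi2([4.0, 2.0, 1.0], [0.1, 0.1, 0.1], [2.0, 1.0, 0.5])
print(chi2, scale)
```

Because the scale factor is fitted, only differences in the shape of the two curves contribute to the discrepancy.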

Figure 2. Structural ensembles of DENV-2 C protein. (A) Fit of SAXS data for various FFs, with the region of largest structural features shown inset. χ² values are indicated for each FF. The best fit was obtained using the ensemble generated with the amber03ws FF. (B) Three structures from the amber03ws generated pool (shades of blue) selected by a genetic algorithm that best fit the experimental SAXS data. The MD starting structure is indicated in light grey. (C) Colored lines show the Rg distributions during 1 µs of MD simulations using each FF. Only the ensemble generated using amber03ws overlaps with the SAXS determined value (indicated by vertical purple solid line), whereas the distributions for the other FFs coincide with the value calculated for the starting structure based on NMR (indicated by vertical dashed line). (D) The Rg distribution for DENV-2 C protein dimer of selected ensembles (blue) from the complete pool of structures derived from MD simulations using amber03ws (grey).
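Rg distributions such as those in panels C and D are built by computing a radius of gyration for every structure and binning the values. The following minimal sketch computes a mass-weighted geometric Rg from coordinates. It is an illustration only: the function names and toy inputs are assumptions, and a geometric Rg does not account for the hydration layer that contributes to an Rg determined from SAXS.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of one structure (same length unit as coords)."""
    if masses is None:
        masses = [1.0] * len(coords)  # equal weights if no masses are supplied
    total = sum(masses)
    # Center of mass, one component at a time.
    center = [sum(m * r[k] for m, r in zip(masses, coords)) / total for k in range(3)]
    # Mass-weighted mean squared distance from the center of mass.
    sq = sum(m * sum((r[k] - center[k]) ** 2 for k in range(3))
             for m, r in zip(masses, coords))
    return math.sqrt(sq / total)

def histogram(values, lo, hi, n_bins):
    """Count values into n_bins equal-width bins on [lo, hi]; outliers are skipped."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v <= hi:
            idx = min(int((v - lo) / width), n_bins - 1)  # put v == hi in the last bin
            counts[idx] += 1
    return counts

# Toy structure (hypothetical): two equal masses 2 units apart.
print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))
```

Applying `radius_of_gyration` to each frame of a trajectory and passing the results to `histogram` gives a distribution that can be compared between force fields or between a full pool and a selected sub-ensemble.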

Figure 3. Dynamics of DENV-2 C protein. (A) First and second principal motions of the dimeric core region, shown as arrows for motions greater than 1 nm. The flexible N-termini are omitted for clarity. (B) RMSD of the core region with respect to the open and closed states, for each frame in the trajectory. The open state is derived from the starting state based on NMR (PDB: 1R6R), and the closed state is derived from the extreme “closed” structure of the second principal motion, indicated alongside.
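The comparison in panel B amounts to computing, for every frame, one RMSD to an open reference and one to a closed reference. The sketch below shows this for coordinates that are assumed to be already superimposed on a common frame; no least-squares fitting is performed here, which a real analysis would normally include. The function names and toy coordinates are hypothetical.

```python
import math

def rmsd(coords, reference):
    """Root-mean-square deviation between two pre-aligned coordinate sets."""
    if len(coords) != len(reference):
        raise ValueError("coordinate sets must contain the same number of atoms")
    sq = sum(math.dist(a, b) ** 2 for a, b in zip(coords, reference))
    return math.sqrt(sq / len(coords))

def compare_to_states(frames, open_ref, closed_ref):
    """For each frame, return (label, rmsd_to_open, rmsd_to_closed).

    The label names the reference state with the smaller RMSD.
    ASSUMPTION: all frames and both references share one alignment.
    """
    results = []
    for frame in frames:
        d_open = rmsd(frame, open_ref)
        d_closed = rmsd(frame, closed_ref)
        label = "open" if d_open <= d_closed else "closed"
        results.append((label, d_open, d_closed))
    return results

# Toy two-atom "cavity" (hypothetical, nm): the second atom moves inwards on closing.
open_ref = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
closed_ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frames = [
    [(0.0, 0.0, 0.0), (1.9, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)],
]
for label, d_open, d_closed in compare_to_states(frames, open_ref, closed_ref):
    print(label, round(d_open, 3), round(d_closed, 3))
```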

Figure 4. Dynamics of DENV-1, DENV-3 and DENV-4 C proteins. (A) Multiple sequence alignment of C protein from DENV-1 to -4, along with ZIKV and WNV. (B) Secondary structural changes of the N-terminal tails (residues 1 to 20) of DENV-1 to -4 C proteins along the trajectory time. (C) First principal motions indicated by arrows for the core of each C protein dimer. (D) Comparison of DENV-2 C protein NMR structure (PDB ID: 1R6R) (grey) and WNV C protein X-ray structure (PDB ID: 1SFK) (salmon).

Figure 5. Flexibility and deuterium uptake of DENV-2 C protein. (A) Per-residue RMSF measured over the last 600 ns of each simulation. Shaded regions of the RMSFs represent one standard deviation derived from blockwise analysis (see Supporting Information Figure S8). Black bars indicate the span of the peptides used to probe deuterium uptake in HDXMS. (B) Overlay of deuterium uptake measurements from HDXMS onto the structure of DENV-2 C protein dimer, taken from the selected amber03ws MD simulation pool which best fit the SAXS data. Each peptide is shown as a colored segment (corresponding to relative fractional deuterium uptake (RFU)) in sausage representation, overlaid on top of the protein backbone shown in grey.
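A per-residue RMSF with a blockwise standard deviation, as shown in panel A, can be obtained by splitting the trajectory into consecutive blocks, computing the RMSF within each block, and then taking the mean and standard deviation across blocks. The sketch below illustrates this under stated assumptions: the frames are taken to be already superimposed, one representative atom per residue is used, and the number of blocks and all names are hypothetical rather than taken from the Supporting Information.

```python
import math
import statistics

def rmsf(frames):
    """Per-atom RMSF over a list of frames; each frame is a list of (x, y, z)."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    values = []
    for i in range(n_atoms):
        # Mean position of atom i over the frames.
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from its mean position.
        msd = sum(sum((f[i][k] - mean[k]) ** 2 for k in range(3))
                  for f in frames) / n_frames
        values.append(math.sqrt(msd))
    return values

def blockwise_rmsf(frames, n_blocks):
    """Mean and sample standard deviation of per-atom RMSF across consecutive blocks."""
    size = len(frames) // n_blocks
    if n_blocks < 2 or size < 2:
        raise ValueError("need at least two blocks of at least two frames each")
    per_block = [rmsf(frames[b * size:(b + 1) * size]) for b in range(n_blocks)]
    n_atoms = len(frames[0])
    means = [statistics.mean([blk[i] for blk in per_block]) for i in range(n_atoms)]
    sds = [statistics.stdev([blk[i] for blk in per_block]) for i in range(n_atoms)]
    return means, sds

# Toy trajectory (hypothetical): atom 0 oscillates by +/-1, atom 1 is static.
toy = [
    [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(-1.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(-1.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
]
print(rmsf(toy))
print(blockwise_rmsf(toy, n_blocks=2))
```

A large standard deviation across blocks for a given residue indicates that its flexibility estimate has not converged over the sampled time.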
