Dynamics of Zika Virus Capsid Protein in Solution: The Properties and

Apr 29, 2019 - We distinguished two dynamical regions in the ZIKVC IDR, a highly flexible N-terminal end and a transitional disordered region, indicat...

Article

Dynamics of Zika virus capsid protein in solution: properties and exposure of the hydrophobic cleft are controlled by the α-helix 1 sequence. Maria Agnese Morando, Glauce M. Barbosa, Christiane Cruz-Oliveira, Andrea T. Da Poian, and Fabio C.L. Almeida Biochemistry, Just Accepted Manuscript • DOI: 10.1021/acs.biochem.9b00194 • Publication Date (Web): 29 Apr 2019 Downloaded from http://pubs.acs.org on April 30, 2019



Biochemistry

Dynamics of Zika virus capsid protein in solution: properties and exposure of the hydrophobic cleft are controlled by the α-helix 1 sequence

Maria A. Morando†‡§, Glauce M. Barbosa†, Christine Cruz-Oliveira†‡, Andrea T. Da Poian†*, Fabio C.L. Almeida†‡*

†Institute of Medical Biochemistry Leopoldo De Meis, Program of Structural Biology, Federal University of Rio de Janeiro.

‡National Center for Structural Biology and Bioimaging (CENABIO) / National Center for Nuclear Magnetic Resonance (CNRMN), Federal University of Rio de Janeiro.

§Centro de Desenvolvimento de Tecnologia em Saúde, Fiocruz, Rio de Janeiro

*Correspondence to: F.C.L. Almeida, [email protected]; A.T. Da Poian, [email protected]

KEYWORDS: Zika virus, dynamics, capsid protein, NMR, hydrophobic cleft

ACS Paragon Plus Environment


ABSTRACT

Zika virus (ZIKV) has become an important public health concern since infection has been correlated with the development of microcephaly and other neurological disorders. Although the structure of the virion has been determined by cryo-EM, information on the nucleocapsid is lacking. We used NMR to determine the solution structure and dynamics of the full-length ZIKV capsid protein (ZIKVC). Although most of the protein is structured as described for the capsid proteins of dengue and West Nile viruses, as well as for the truncated ZIKVC (residues 23-98), here we show important differences in α-helix 1 and in the N-terminal intrinsically disordered region (IDR). We distinguished two dynamical regions in the ZIKVC IDR: a highly flexible N-terminal end and a transitional disordered region, indicating that the IDR contains ordered segments rather than being completely flexible. Because of its unique size and orientation, α-helix 1 partially occludes the protein hydrophobic cleft. Measurements of the dynamics of α-helix 1, and of the surface exposure and thermal susceptibility of each backbone amide 1H in the protein structure, revealed the occlusion of the hydrophobic cleft by α1/α1´ and supported the position uncertainty of α-helix 1. Based on the findings described here, we propose that the dynamics of ZIKVC structural elements accounts for a structure-driven regulation of the protein interaction with intracellular hydrophobic interfaces, which would affect the switches necessary for nucleocapsid assembly. Subtle differences in the sequence of α-helix 1 affect its size and orientation and the degree of exposure of the hydrophobic cleft, suggesting that α-helix 1 is a hotspot for evolutionary adaptation of flavivirus capsid proteins.


INTRODUCTION

Zika virus (ZIKV) is an arbovirus of the Flaviviridae family that recently caused a major outbreak in the Americas 1. ZIKV neurotropism has been demonstrated 2,3, and infection has been correlated with the development of microcephaly and other malformations associated with congenital ZIKV exposure 4,5, as well as with serious neurological disorders, such as Guillain-Barré syndrome, after adult infection 6,7. The structure of ZIKV has been determined by cryo-EM 8, revealing structural details of the envelope (E) and membrane (M) proteins, but lacking information on the virus nucleocapsid (NC). The ZIKV NC is formed by the 10.6 kb single-stranded positive-sense genomic RNA associated with the 104-amino acid capsid protein (ZIKVC). The structure of a truncated form of ZIKVC (residues 23-98) has been previously solved by X-ray crystallography 9. As found for the capsid proteins of dengue virus (DENVC) 10 and West Nile virus (WNVC) 11, ZIKVC shows a dimeric globular core formed by 8 intertwined α-helices (4 per monomer) and an N-terminal intrinsically disordered region (IDR). Flavivirus capsid proteins are highly basic, but the charges are not symmetrically distributed on the protein surface. On one face, they display a hydrophobic cleft, shown to mediate protein binding to intracellular hydrophobic interfaces, such as the endoplasmic reticulum and/or lipid droplets, during the virus replication cycle 12,13. The opposite face is a positively charged patch, which is a possible nucleic acid binding site. For DENVC, the N-terminal IDR is also involved in protein association with lipid droplets (LD) 14,15. The DENVC-LD interaction is essential for the formation of infectious particles 12, but the mechanism that drives the transition from the membrane-associated C protein dimers to the assembled NC is unknown.

In this work we used NMR to determine the structure and dynamics of the full-length ZIKVC in solution. We found that most of the protein globular region is structured similarly to what has been described for DENVC and WNVC, as well as for the crystal structure of the truncated ZIKVC, but remarkable differences occur for α-helix 1 and the N-terminal IDR. Since the ZIKVC IDR corresponds to ~1/3 of the protein, spanning residues 1 to 35, we used a number of NMR approaches to study its dynamical properties. Indeed, conformational equilibria can only be fully characterized in solution, and NMR spectroscopy is particularly powerful for characterizing regions involved in dynamics 16,17. We distinguished two dynamical regions in the ZIKVC IDR: a highly flexible N-terminal end and a transitional disordered region, indicating that the ZIKVC IDR contains ordered segments rather than being completely flexible. Measurements of the surface exposure and thermal susceptibility of each backbone amide 1H in the protein structure revealed the occlusion of the hydrophobic cleft by α1/α1´ and supported the α1 position uncertainty observed in the structural ensemble. Based on the findings described here, we propose that the dynamics of ZIKVC structural elements accounts for a structure-driven regulation of the protein interaction with intracellular hydrophobic interfaces, which would affect the switches necessary for NC assembly. We also suggest that α-helix 1 is a hotspot for evolutionary adaptation of flavivirus capsid proteins.

MATERIALS AND METHODS

Protein expression and purification

ZIKVC protein purification was adapted from the protocols used for DENV2C purification 10,14. The ZIKVC coding sequence (encoding amino acid residues 1-104, deposit #AMD16557.1) 18 was synthesized by Genscript, codon-optimized for E. coli, and sub-cloned into the pET3a vector using NdeI and BamHI restriction sites. Recombinant protein was expressed in E. coli BL21-DE3-pLysS overnight, at 30 °C, after induction with 0.5 mM IPTG. The uniformly 15N- and 15N/13C-labelled proteins were expressed in M9 minimal medium supplemented with 11 g/L of [15N]NH4Cl and 3 g/L of [13C6]-glucose, with 100 μg/mL ampicillin and 34 μg/mL chloramphenicol. After protein expression, cells were harvested by centrifugation at ~5000 g for 20 min at 4 °C. The pellet was resuspended in 40 mL of buffer containing 25 mM 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid (HEPES), 0.5 M NaCl, 1 mM 2,2',2'',2'''-(ethane-1,2-diyldinitrilo)tetraacetic acid (EDTA) and 5% (v/v) glycerol at pH 7.4, plus 1 mL of protease inhibitor cocktail (P8465, Sigma Aldrich). After centrifugation and ultrasonication (model VCX 130, Vibra-Cell), NaCl was added to the cell lysate to a final concentration of 2 M. The lysate was stirred for 1 h at 4 °C. After the precipitation, the lysate was ultracentrifuged at 70,400 g for 50 min at 4 °C. ZIKVC was soluble in the supernatant and was purified on a HiTrap Heparin 5 column coupled to an AKTA Start (GE Healthcare). A step gradient of increasing NaCl concentration (1.0 M, 1.5 M and 2 M) was employed. Fractions containing ZIKVC protein were confirmed by absorbance at 280 nm and 18% SDS-PAGE. Fractions were concentrated and stored at -20 °C.

NMR spectroscopy

The NMR sample contained 300 μM (dimer) of ZIKVC in 55 mM sodium phosphate buffer, 200 mM NaCl, pH 6.0, 10% D2O. NMR spectra were acquired at 35 °C on Bruker Avance III 600 MHz and Avance IIIHD 900 MHz spectrometers equipped with 15N/13C/1H triple-resonance probes (Bruker TXI). NMR spectra were processed with NMRPipe 19 and analyzed using CCPNmr Analysis software 20. The backbone assignment was obtained using non-uniform sampling (NUS) with multidimensional Poisson Gap scheduling. NMRPipe and iterative soft threshold (hmsIST) fast reconstruction of NMR data were used for processing 21. The protein backbone resonance assignments were achieved through analysis of the triple-resonance experiments HNCACB, CBCA(CO)NH, HN(CA)CO and HNCO 22,23. The aliphatic side-chain resonances were primarily assigned using the HBHA(CO)NH 24 experiment, with the help of complementary HCCH-TOCSY 25, 15N-edited NOESY-HSQC and 13C-edited NOESY-HSQC 26 experiments optimized for detection of either the aliphatic or the aromatic region. Assignments of aromatic side chains were obtained using NOEs between the aromatic protons and the βCH2 group in the 13C-NOESY-HSQC spectrum. We chose to work at 35 °C because of the better-quality spectra, especially in the regions involved in conformational exchange. The data were processed using NMRPipe 19 or TopSpin 3.1 (Bruker Biospin), while resonance assignments were achieved using the software CCPNMR Analysis 27. Distance restraints were derived from one 15N-NOESY-HSQC and two 13C-NOESY-HSQC spectra, the latter for aromatics and aliphatics. For the NOESY experiments we used NUS scheduling to increase resolution. We ran TALOS-N 28 for backbone chemical shift-based dihedral angle prediction. The predicted backbone dihedral angles φ and ψ of the residues involved in secondary structure were used as restraints for the structural calculations. Setup details of the NMR experiments are described in Table S1.

Structure calculation

Structure calculations of ZIKVC were performed iteratively using the ARIA 2.1 program, version 2.3 29,30, combined with CNS version 1.2 31, using the 15N-NOESY-HSQC and 13C-NOESY-HSQC datasets as the source of distance restraints. All data were converted to XML format. The backbone φ and ψ dihedral restraints were predicted based on analysis of 1HN, 15N, 1Hα, 13Cα, 13CO, and 13Cβ chemical shifts using the program TALOS-N 28,32.

Structures were calculated with CNS 1.2 using torsion angle simulated annealing. Several cycles of ARIA were performed using standard protocols. After each cycle, rejected restraints, side-chain assignments, NOEs and dihedral violations were analyzed. Finally, 400 conformers were calculated with ARIA/CNS and, among them, the 20 water-refined structures with the lowest energy were selected to represent the solution structural ensemble of ZIKVC. The structural ensemble was visualized and analyzed with Chimera and PyMOL. Quality validation was performed using PROCHECK 33 and the Protein Structure Validation Software suite (PSVS) (http://psvs-1_5-dev.nesg.org/).
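The final selection step described above (20 lowest-energy conformers out of 400) amounts to a simple ranking. The following is a minimal sketch of that ranking only, not the ARIA/CNS procedure itself; the function name and the randomly generated energies are hypothetical placeholders for values that would be read from the refinement output.

```python
import random


def select_lowest_energy(energies, n_best=20):
    """Return the indices of the n_best lowest-energy conformers, lowest first."""
    ranked = sorted(range(len(energies)), key=lambda i: energies[i])
    return ranked[:n_best]


# Hypothetical total energies (kcal/mol) for 400 conformers.
# Real values would come from the water-refinement output, not from random numbers.
random.seed(0)
energies = [random.gauss(-5000.0, 100.0) for _ in range(400)]

ensemble = select_lowest_energy(energies, n_best=20)
print(len(ensemble), "conformers selected")
```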

NMR relaxation parameters

15N backbone amide relaxation parameters (15N R1, 15N R2 and 1H-15N heteronuclear Nuclear Overhauser effect - NOE) were measured for a 15N-labeled ZIKVC sample (300 μM dimer, in 55 mM sodium phosphate buffer, 200 mM NaCl, 10% D2O, pH 6.0) using Bruker Avance IIIHD 500 (11.74 T, operating at 500.13 MHz) and Avance IIIHD 600 (14.09 T, operating at 600.03 MHz) spectrometers at 35 °C, with 1024 and 96 complex points in F2 (1H) and F1 (15N), respectively. The R1 and R2 data were collected with 11 accumulations per increment and the 1H-15N NOE with 16 accumulations per increment. R1 was measured with delays varying from 0.05 to 1 s. R2 was measured with delays varying from 17 to 170 ms. The experimental error was evaluated from the signal-to-noise ratio of the spectra displaying ~50% of the decay in signal. The 1H-15N NOE spectra were acquired with or without proton saturation for 8.0 s. The R1 and R2 values were obtained using the relaxation module of CCPNMR 20. The 1H-15N heteronuclear NOE values were determined using the ratio between the peak intensities in the spectra recorded with and without saturation. The data could not be fitted with the Lipari-Szabo model-free formalism 34 because of the lack of a hydrodynamic model of the protein, due to the long IDR. Thus, the raw data were interpreted along with the 15N CPMG relaxation dispersion experiments (CPMG-RD), also collected at two fields and two temperatures (Figure 2).
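As an illustration of the heteronuclear NOE calculation described above (intensity with saturation divided by intensity without saturation), a minimal sketch follows. The error propagation from spectral noise is an assumption added for illustration; the text only states that errors for the relaxation rates were evaluated from the signal-to-noise ratio. The function name and the numerical inputs are hypothetical.

```python
import math


def het_noe(i_sat, i_ref, noise_sat, noise_ref):
    """Return (NOE, error) for one amide cross-peak.

    NOE is the ratio of the peak intensity with proton saturation (i_sat)
    to the intensity without saturation (i_ref). The error is propagated
    from the noise of each spectrum (an assumption, not the authors' stated method).
    """
    noe = i_sat / i_ref
    rel_err = math.sqrt((noise_sat / i_sat) ** 2 + (noise_ref / i_ref) ** 2)
    return noe, abs(noe) * rel_err


# Hypothetical intensities and noise levels for a single residue.
noe, err = het_noe(i_sat=8.0e5, i_ref=1.0e6, noise_sat=1.0e4, noise_ref=1.0e4)
print(f"NOE = {noe:.2f} +/- {err:.3f}")
```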

15N relaxation dispersion measurements

15N Carr-Purcell-Meiboom-Gill relaxation dispersion (CPMG-RD) profiles for a 15N-labeled ZIKVC sample (300 μM dimer, in 55 mM sodium phosphate buffer, 200 mM NaCl, 10% D2O, pH 6.0) were recorded on Bruker Avance IIIHD 500 (11.74 T, operating at 500.13 MHz) and Avance IIIHD 600 (14.09 T, operating at 600.13 MHz) spectrometers, at two temperatures (30 and 35 °C), using a constant relaxation time of 30 ms in a 15N relaxation-compensated CPMG experiment comprising two CPMG elements of 15 ms 35. This results in the transverse relaxation of a mix of in-phase (Nx,y) and anti-phase (2Nx,yHz) coherences ($R_{2,\mathrm{eff}} = 0.5\,R_{2,\mathrm{eff}}^{\text{in-phase}} + 0.5\,R_{2,\mathrm{eff}}^{\text{anti-phase}}$). All 15N CPMG-RD experiments were performed with a constant relaxation time period Trelax of 30 ms and with CPMG frequencies νCPMG ranging from 66 to 1000 Hz. The CPMG-RD R2,eff(νCPMG) values were calculated from the peak intensities (I) in a series of two-dimensional (2D) 1H-15N correlation spectra recorded in an interleaved way at different CPMG frequencies νCPMG, using the following equation:

$$R_{2,\mathrm{eff}}(\nu_{\mathrm{CPMG}}) = -\frac{1}{T_{\mathrm{relax}}}\,\ln\!\left(\frac{I}{I_0}\right)$$

where I is the signal intensity in the spectra collected at Trelax = 30 ms and I0 is the signal intensity in the reference spectrum recorded at Trelax = 0. The experimental error in the R2,eff rates was estimated from the signal-to-noise ratio of each resonance:

$$\Delta R_{2,\mathrm{eff}}(\nu_{\mathrm{CPMG}}) = \frac{1}{T_{\mathrm{relax}}}\cdot\frac{1}{\mathrm{SIGNAL}/\mathrm{NOISE}}$$
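The two expressions above can be evaluated directly from peak intensities. The sketch below assumes the 30 ms constant relaxation time stated in the text; the function names, intensities and noise level are hypothetical and serve only to show the arithmetic.

```python
import math

T_RELAX = 0.030  # constant relaxation period in seconds (30 ms, as stated in the text)


def r2eff(intensity, ref_intensity, t_relax=T_RELAX):
    """Effective transverse relaxation rate (s^-1) at one CPMG frequency."""
    return -math.log(intensity / ref_intensity) / t_relax


def r2eff_error(intensity, noise, t_relax=T_RELAX):
    """Error in R2eff (s^-1) estimated from the signal-to-noise ratio of the peak."""
    signal_to_noise = intensity / noise
    return (1.0 / t_relax) * (1.0 / signal_to_noise)


# Hypothetical peak: half of the reference intensity remains after T_relax.
rate = r2eff(intensity=5.0e5, ref_intensity=1.0e6)   # ln(2) / 0.03, about 23.1 s^-1
rate_err = r2eff_error(intensity=5.0e5, noise=5.0e3)  # S/N of 100, about 0.33 s^-1
print(f"R2eff = {rate:.1f} +/- {rate_err:.2f} s^-1")
```

Repeating the calculation for each νCPMG value gives the dispersion profile of one residue.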


Amide chemical shift temperature coefficient

The amide 1HN chemical shift temperature coefficient of ZIKVC (200 μM dimer, in 55 mM sodium phosphate buffer, 200 mM NaCl, 10% D2O, pH 6.0) was determined by recording a series of two-dimensional 15N/1H HSQC spectra at 20, 25, 30, 35 and 40 °C, using a Bruker Avance 600 spectrometer (14.09 T, operating at 600.13 MHz). All spectra were referenced to the water signal at each temperature, processed in NMRPipe and analyzed in the CcpNmr Analysis software. The water chemical shift was referenced using 3-(trimethylsilyl)propane-1-sulfonic acid (DSS). The chemical shift values (δHN) of all residues at the different temperatures were plotted as a function of temperature. The resulting slope (dδHN/dT) of each curve was plotted for each residue.
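The slope described above can be obtained with an ordinary least-squares line through the five temperature points. This is a minimal sketch assuming a simple unweighted linear fit, which the text does not specify; the function name and the chemical shift values are hypothetical.

```python
def temperature_coefficient(temps_c, shifts_ppm):
    """Least-squares slope of chemical shift versus temperature, in ppb/K.

    A difference of 1 degree Celsius equals 1 K, so Celsius temperatures
    can be used directly for the slope.
    """
    n = len(temps_c)
    mean_t = sum(temps_c) / n
    mean_s = sum(shifts_ppm) / n
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(temps_c, shifts_ppm))
    sxx = sum((t - mean_t) ** 2 for t in temps_c)
    return 1000.0 * sxy / sxx  # ppm/K converted to ppb/K


temps = [20.0, 25.0, 30.0, 35.0, 40.0]       # temperatures used in the text
shifts = [8.300, 8.270, 8.240, 8.210, 8.180]  # hypothetical amide 1H shifts (ppm)
coeff = temperature_coefficient(temps, shifts)
print(f"temperature coefficient = {coeff:.1f} ppb/K")
```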

Solvent paramagnetic relaxation enhancement (sPRE)

Amide 1HN/15N cross-peak intensities for each residue of ZIKVC (200 μM dimer, in 55 mM sodium phosphate buffer, 200 mM NaCl, 10% D2O, pH 6.0) were determined by recording a series of two-dimensional 15N/1H HSQC spectra at different concentrations of gadolinium-diethylenetriamine pentaacetic acid (Gd-DTPA): 0, 0.5, 1.0, 2.0, 3.0 and 4.5 mM, at 25 °C, using a Bruker Avance 600 MHz spectrometer (operating at 14.09 T). All spectra were processed in NMRPipe and analyzed in the CcpNmr Analysis software. The intensity of each residue's 1HN/15N cross-peak (INH) was plotted as a function of the Gd-DTPA concentration, and the resulting slopes (INH [Gd]-1) were plotted for each residue. To obtain an accurate INH [Gd]-1, the points where the intensities approached zero (lack of linearity) were not taken into consideration, according to Figure S5.
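The slope calculation described above can be sketched as a linear fit restricted to the points that are still in the linear range. The cutoff used here to decide when an intensity "approaches zero" (10% of the intensity without Gd-DTPA) is a hypothetical criterion chosen for illustration; the text refers to Figure S5 for the actual selection. The function name and intensities are also hypothetical.

```python
def spre_slope(gd_mm, intensities, min_fraction=0.10):
    """Slope of peak intensity versus Gd-DTPA concentration (intensity per mM).

    Points whose intensity falls below min_fraction of the first (0 mM)
    intensity are excluded before the least-squares fit. The cutoff is an
    illustrative assumption, not the authors' stated criterion.
    """
    cutoff = min_fraction * intensities[0]
    points = [(c, i) for c, i in zip(gd_mm, intensities) if i >= cutoff]
    n = len(points)
    mean_c = sum(c for c, _ in points) / n
    mean_i = sum(i for _, i in points) / n
    sxy = sum((c - mean_c) * (i - mean_i) for c, i in points)
    sxx = sum((c - mean_c) ** 2 for c, _ in points)
    return sxy / sxx


gd = [0.0, 0.5, 1.0, 2.0, 3.0, 4.5]            # mM, concentrations from the text
peak = [100.0, 80.0, 60.0, 20.0, 2.0, 1.0]     # hypothetical intensities
slope = spre_slope(gd, peak)
print(f"sPRE slope = {slope:.1f} per mM")
```

A steeper (more negative) slope corresponds to a more solvent-exposed amide.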


Flavivirus sequence alignment and sequence clustering analysis

Protein-protein BLAST (BLASTP) using ZIKVC as the query was run against a database of non-redundant protein sequences of viruses from the Flaviviridae family (Taxid 11050). From the 20,000 best BLASTP hits (scored using the BLOSUM62 matrix), a representative sequence of the complete polyprotein of each flavivirus was selected, resulting in the following sequence IDs: AMD16557.1, YP_009227185.1, AFR66758.1, AEA72437.1, AFN43044.1, AER25364.2, ALE71321.1, AEV41145.1, ANK79133.1, AFP95929.1, AEI27244.1, AGV15509.1, AIU94743.1, AAV54504.1, ALK02500.1, ABW76844.2, AAX82481.1, NP_051124.1, AJE59927.1, YP_164264.1, YP_009350103.1, YP_009126874.1, ARR96288.1, AIU94744.1, AIJ19434.1, AIJ19433.1, AAV34157.1, YP_009328360.1, AGJ84083.1, YP_001040004.1, YP_002790882.1, AFK83757.1, ABW82078.1, AGE13481.1, ACW82911.1. Multiple sequence alignments of these sequences were carried out using ClustalW 2.1 (http://www.genome.jp/toolsbin/clustalw) 36, with BLOSUM 80, 62, 45 and 30 used as the matrices. A dendrogram was constructed using the simplified neighbor-joining method. Prediction of the secondary structures of the capsid proteins was performed using Jpred4 at the server http://www.compbio.dundee.ac.uk/jpred/ 37.

Remarks on the data interpretation

Interpretation of the relaxation dispersion parameters

CPMG-RD experiments were collected at two temperatures, 30 and 35 °C. The CPMG-RD profiles did not show the dispersion typical of an intermediate exchange regime, with R2eff decreasing toward the exchange-free R2eff (R2eff∞) in the range of νCPMG varying from 66 to 1000 s-1, as illustrated in Figures S3B and S4B. Rather, we observed an approximately constant R2eff. For most residues R2eff is approximately equal to R2eff∞, as illustrated in Figures S3C, S3D, S4C and S4D. For a few residues, those illustrated in Figures S3E and S4E, R2eff > R2eff∞. These relaxation dispersion profiles are typical of conformational exchange in a fast exchange regime (Figures S3 and S4). In this exchange regime it is not possible to obtain R2eff∞ from the CPMG-RD profile. To analyze and plot the data in Figure 2, we estimated R2eff∞ based on the expected overall behavior of R2eff for a dynamic region. The values interpreted as R2eff∞ are traced as a dotted line in Figures S3A and S4A. Only the 15NH groups whose R2eff values significantly exceeded the dotted lines (R2eff∞) were considered to be in conformational exchange in a fast exchange regime (microsecond timescale).
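The selection rule described above (R2eff significantly above the estimated exchange-free baseline) can be written as a simple filter. The text does not define "significantly", so the two-standard-error margin below is a hypothetical criterion, and the baseline values, which in this work were estimated from the expected overall behavior of R2eff, are hypothetical inputs as well.

```python
def residues_in_exchange(r2eff, r2eff_err, baseline, n_sigma=2.0):
    """Return residue numbers whose R2eff exceeds the baseline by more than
    n_sigma times the experimental error.

    r2eff, r2eff_err and baseline are dicts keyed by residue number (s^-1).
    The n_sigma margin is an illustrative assumption.
    """
    return [
        res
        for res in sorted(r2eff)
        if r2eff[res] - baseline[res] > n_sigma * r2eff_err[res]
    ]


# Hypothetical values for four residues.
r2eff_values = {3: 9.0, 4: 5.1, 36: 22.0, 50: 15.2}
errors = {3: 0.5, 4: 0.5, 36: 0.5, 50: 0.5}
baseline = {3: 5.0, 4: 5.0, 36: 15.0, 50: 15.0}

flagged = residues_in_exchange(r2eff_values, errors, baseline)
print("residues flagged as exchanging:", flagged)
```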

Interpretation of the amide chemical shift temperature coefficient

The amide hydrogen (HN) chemical shift temperature coefficient (dδHN/dT) reports on the thermal susceptibility of the amide hydrogen bonds HN-C´. dδHN/dT is correlated with the length of the hydrogen bond (rHNC´) and with the hydrogen bond J coupling 3hJNC´ 38. In proteins, rHNC´ increases with temperature due to the increase in thermal fluctuation 39,40. A simple empirical rule states that dδHN/dT > -5 ppb/K for HNs involved in secondary structure 41 (low thermal susceptibility), whereas residues hydrogen bonded with water present dδHN/dT < -5 ppb/K (high thermal susceptibility). The accepted reasoning is that the hydrogen bond with water is weaker and more expandable than the HN-C´ bond 40. However, dδHN/dT is considered poorer than HN exchange rates for depicting internal hydrogen bonds and, therefore, it is frequent to find residues involved in secondary structure with dδHN/dT < -5 ppb/K 38,42. The reason is that dδHN/dT senses local fluctuation, providing thermodynamic information on the local environment 43. Chemical shift is a function of the local environment of each nucleus, and consequently reflects all the degrees of freedom in which the nucleus is involved. For this reason, the thermal coefficient is fundamentally related to the local entropy 38.

Residues with high amide chemical shift thermal susceptibility (dδHN/dT < -5 ppb/K) make more expandable hydrogen bonds, which are weaker and present a smaller hydrogen bond J coupling 3hJNC´ 38. Therefore, an amide with dδHN/dT < -5 ppb/K may be considered a breaking point of a secondary structure or, when in a loop or IDR, more exposed to hydrogen bonding with water.

Residues with low amide chemical shift thermal susceptibility (dδHN/dT > -5 ppb/K) make less expandable hydrogen bonds, which are stronger and present a larger hydrogen bond J coupling 3hJNC´ 38. Therefore, an amide with dδHN/dT > -5 ppb/K is involved in a secondary-structure hydrogen bond. When in a loop or IDR, the amide may tend to form intramolecular hydrogen bonds and, thus, tend to increase order.
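The empirical -5 ppb/K rule discussed above can be expressed as a small classifier. This is a minimal sketch; the function name and example coefficients are hypothetical, and a value exactly equal to -5 ppb/K, which the rule does not cover, is arbitrarily assigned to the high-susceptibility class.

```python
THRESHOLD_PPB_PER_K = -5.0  # empirical cutoff quoted in the text


def thermal_susceptibility(coeff_ppb_per_k):
    """Classify an amide by its chemical shift temperature coefficient.

    'low'  : coefficient above -5 ppb/K (less expandable, stronger hydrogen bond)
    'high' : coefficient at or below -5 ppb/K (more expandable, weaker hydrogen bond)
    Treating exactly -5 ppb/K as 'high' is an arbitrary choice for this sketch.
    """
    return "low" if coeff_ppb_per_k > THRESHOLD_PPB_PER_K else "high"


# Hypothetical coefficients (ppb/K) keyed by residue number.
coefficients = {28: -2.5, 31: -3.9, 12: -7.0, 22: -8.4}
classes = {res: thermal_susceptibility(c) for res, c in coefficients.items()}
print(classes)
```

As the text notes, the classification is only indicative: residues in secondary structure are frequently found below the threshold.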

RESULTS

ZIKVC resonance assignment (Figure S1) and structure determination (Table 1 and Figure 1) were performed using 13C/15N double-labeled protein at pH 6.0. The ZIKVC structure shows a larger IDR when compared to the DENVC and WNVC structures 10,11. α-helix 1 (α1) spans residues 36 to 39, which makes the IDR at least 9 residues longer (residues 1 to 35). It is noteworthy that α1 shows a unique size and orientation, with α1/α1´ partially occluding the α2/α2´ hydrophobic cleft (Figure 1A). Among the 4 α-helices, α2/α2´, α3/α3´ and α4/α4´ form the intertwined dimeric structure dominated by quaternary contacts, while α1/α1´ displays a small position uncertainty (Figures 1A and 1B). Remarkable features are the presence of 3 π-stacking interactions among the conserved aromatic residues (Figures 1C and 1D) and of 6 salt bridges (Table 2 and Figures 1E and 1F), 2 more than found in DENVC and 6 more than in WNVC. The interchain salt bridge between R55 and D87´ (and between R55´ and D87) seems to be the most important for structural stabilization. Conversely, in the crystal structure of the truncated ZIKVC, a rotameric conformation of the R55 side chain makes the formation of this salt bridge unfavorable (Table 2). A salt bridge between K75 and E79, which is important to stabilize α-helix 4, is only observed in the ZIKVC solution structure described here. The lack of the K75/E79 salt bridge in the crystal structure could be explained by the fact that this region is involved in crystal contacts.

Table 1: Structural statistics of ZIKVC

| Number of restraints | |
|---|---|
| Total distance restraints | 1883 (unambiguous), 314 (ambiguous) |
| Intraresidue | 645 |
| Sequential (i, i+1) | 564 |
| Medium-range (i, i+2; i, i+3) | 304 |
| Long-range (> i, i+3), intersubunit | 150 |
| Long-range (> i, i+3), intrasubunit | 220 |
| Total dihedral angle restraints | 102 |
| Total hydrogen bond restraints | 74 |

| Violations | |
|---|---|
| Distance restraints > 0.5 Å | 0 |
| Dihedral restraints > 5° | 0 |
| Hydrogen bond restraints > 0.5 Å | 0 |

| Energy | |
|---|---|
| Geometry | -5112.83 (±15x10-1) kcal mol-1 |
| Distance restraints | 38.64 (±11) kcal mol-1 |

| rms deviation in NOE restraints and idealized geometry | |
|---|---|
| NOE, Å | 1.80 (±2.7) |
| Bond length, Å | 2.86 (±1.5) |
| Bond angles, ° | 0.46 (±1.9) |
| Improper dihedral angles, ° | 1.21 (±8.0) |

| Coordinate rms deviation from average structure | All heavy atoms, Å | Backbone atoms, Å |
|---|---|---|
| All 35-104 | 2.40 | 1.69 |
| 35-97 | 1.27 | 0.91 |
| 44-97 (α2, α3, α4) | 0.92 | 0.40 |

| PROCHECK Ramachandran analysis | Secondary structure | All residues |
|---|---|---|
| Most favored regions, % | 98.7 | 90 |
| Additional allowed regions, % | 1.3 | 7.5 |
| Generously allowed regions, % | 0 | 1.4 |
| Disallowed regions, % | 0 | 1.1 |
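As a quick internal consistency check of Table 1, the restraint categories appear to add up to the reported unambiguous total, and each Ramachandran column appears to sum to 100%. The sketch below only repeats the numbers from the table; the dictionary keys are labels chosen here for illustration.

```python
# Distance restraint categories as listed in Table 1.
restraints = {
    "intraresidue": 645,
    "sequential": 564,
    "medium_range": 304,
    "long_range_intersubunit": 150,
    "long_range_intrasubunit": 220,
}
total_unambiguous = sum(restraints.values())

# PROCHECK Ramachandran percentages: most favored, additionally allowed,
# generously allowed, disallowed.
ramachandran = {
    "secondary_structure": [98.7, 1.3, 0.0, 0.0],
    "all_residues": [90.0, 7.5, 1.4, 1.1],
}
column_sums = {name: round(sum(values), 6) for name, values in ramachandran.items()}

print(total_unambiguous)  # 1883, matching the reported unambiguous total
print(column_sums)
```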

Figure 1: Solution structure of ZIKVC globular domain. (A and B) Superposition of the 3 low-energy structures showing the position uncertainty of α-helix 1, in frontal view (A) and in top view (B). (C and D) Ribbon representation of the backbone of ZIKVC showing the side chains of the aromatic residues involved in the π-stacking interactions, in frontal view (C), with the black line representing the symmetry axis, and in lateral view (D). (E) Frontal and (F) lateral views showing the localization of the salt bridges of ZIKVC, as detailed in Table 2. Negatively charged residues are colored in red and positively charged residues in blue. Salt bridges R55/D87´ and R55´/D87 connect α-helices 2 and 4 of different subunits, contributing to the stabilization of the quaternary structure. Salt bridges R45´/E76´ and R45/E76 connect α-helices 2 and 4 of the same subunit, contributing to the stabilization of the tertiary structure. Salt bridges K75´/E79´ and K75/E79 stabilize α-helix 4.

Table 2: Salt bridges in ZIKV, DENV2 and WNV capsid proteins 44.

Solution ZIKVC structure

| Salt bridge | Solvent exposure (%) | Minimum / maximum distance (Å) | Average distance (Å) |
|---|---|---|---|
| R45´/E76´ | 55.1 / 46.1 | 2.7 / 10.1 | 5.39 |
| R45/E76 | 55.1 / 46.1 | 2.7 / 10.1 | 5.39 |
| R55´/D87 | 44.5 / 45.3 | 2.6 / 7.3 | 4.10 |
| R55/D87´ | 44.5 / 45.3 | 2.6 / 7.3 | 4.10 |
| K75´/E79´ | 53.3 / 46.6 | 2.7 / 6.0 | 4.67 |
| K75/E79 | 53.3 / 46.6 | 2.7 / 6.0 | 4.67 |

Crystal ZIKVC structure (PDB 5YGH)

| Salt bridge | Solvent exposure (%) | Distance (Å) |
|---|---|---|
| R45´/E76´ | 46.2 / 33.5 | 10.9 |
| R45/E76 | 46.2 / 33.5 | 11.3 |
| R55´/D87 | 58.9 / 27.2 | 6.1 |
| R55/D87´ | 58.9 / 27.2 | 6.3 |
| K75´/E79´ | 49.7 / 40.0 | 8.4 |
| K75/E79 | 49.7 / 40.0 | 11.0 |

DENV2C

| Salt bridge | Solvent exposure (%) | Distance (Å) |
|---|---|---|
| K45´/E87´ | 20.6 / 26.2 | 3.7 |
| K45/E87 | 20.6 / 26.2 | 3.7 |
| R55´/E87 | 40.2 / 26.2 | 2.8 |
| R55/E87´ | 40.2 / 26.2 | 2.8 |

WNVC

| Salt bridge | Solvent exposure (%) | Distance (Å) |
|---|---|---|
| None | - | - |


To describe the plasticity of the ZIKVC structure, especially regarding the structural behavior of α1/α1´ and the IDR, we analyzed its dynamics in solution. We measured the 15N relaxation parameters (Figure S2), which are typical of a dimer and allowed us to distinguish different dynamic regions: (i) a highly flexible disordered region, observed for residues 1 to 23; (ii) a transition disordered region, from residue 24 to 35; (iii) a globular ordered region, from residue 36 to 98; and (iv) a flexible C-terminal region (residues 99 to 104). We observed conformational exchange (increase in R2/R1), especially in α1 at residue G36, and in the loop between α1 and α2 (loop α1-α2) at residues G40 and G42. CPMG-RD experiments (Figures 2, S3 and S4) revealed conformational exchange in a fast regime (microsecond timescale motion, Figures S3B, S3E, S4B and S4E). We also observed conformational exchange in the IDR, indicating that some order occurs in this region: five residues (N3, S8, G9, G10 and N15) located in its highly flexible segment and two residues (G28 and K31) in the transition region showed increased R2eff. In the globular domain, the most evident dynamics on the microsecond timescale occurred at residue G36 in α1, and at residues G40 and G42 in loop α1-α2, supporting the position uncertainty of α1/α1´ observed in the structural ensemble. It is important to mention that, despite the conformational exchange observed for α1 and the adjacent loops, we did not observe disorder in α1. The relaxation parameters that report on high flexibility (R1, R2 and 1H-15N heteronuclear NOE) are similar for α1, α2, α3 and α4. Similar behavior was observed for the random coil index 32, which is on average 0.81, 0.87, 0.85 and 0.87 for α1, α2, α3 and α4, respectively.


Figure 2: Summary of the 15N CPMG-RD experiments. (A and D) R2eff as a function of residue number, acquired at 35 °C at 11.74 T and 14.09 T, respectively. (B and E) R2eff as a function of residue number, acquired at 30 °C at 11.74 T and 14.09 T, respectively. (C and F) 15N R2/R1 as a function of residue number, acquired at 35 °C at 11.74 T (Figure S2 shows the complete set of relaxation parameters of ZIKVC). The labels in A, B, D and E highlight residues in conformational exchange, according to the analyses shown in Figures S3 and S4. Asterisks refer to overlapped residues and + refers to prolines.
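For readers less familiar with relaxation dispersion, R2eff in constant-time CPMG experiments is commonly obtained from peak intensities as R2eff = -ln(I/I0)/T, where I0 is the intensity measured without the relaxation period and T is the constant relaxation delay. The Python sketch below assumes this standard definition. The source does not state the exact processing used here, and the function names are hypothetical.

```python
import math


def r2eff(intensity, reference_intensity, relax_delay_s):
    """Effective R2 (s^-1) from a constant-time CPMG peak intensity.

    Uses R2eff = -ln(I / I0) / T, with I0 the reference intensity and
    T the constant relaxation delay in seconds.
    """
    if intensity <= 0 or reference_intensity <= 0 or relax_delay_s <= 0:
        raise ValueError("intensities and delay must be positive")
    return -math.log(intensity / reference_intensity) / relax_delay_s


def dispersion_amplitude(r2eff_by_frequency):
    """R2eff at the lowest CPMG frequency minus R2eff at the highest.

    `r2eff_by_frequency` maps CPMG frequency (Hz) to R2eff (s^-1).
    A clearly positive amplitude suggests conformational exchange.
    """
    lowest = min(r2eff_by_frequency)
    highest = max(r2eff_by_frequency)
    return r2eff_by_frequency[lowest] - r2eff_by_frequency[highest]
```

A residue without exchange gives an essentially flat profile (amplitude near zero), whereas the labeled residues in Figure 2 would show R2eff decreasing as the CPMG frequency increases.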

To confirm the occlusion of the hydrophobic cleft by α1/α1´ and the occurrence of some order in the IDR, we measured the solvent exposure of each backbone amide by solvent paramagnetic relaxation enhancement (sPRE) (Figures 3A-C and S5). The globular region showed the smallest sPRE effect, with the C-terminal portion of α2/α2´ (50-ILAFLRF-56) being the least exposed region. This observation supports the view that α1´ and IDR´ occlude α2, while α1 and IDR occlude α2´ (Figures 3B and 3C). This partial occlusion of the hydrophobic cleft by α1/α1´ may have important biological consequences, since C protein binding to hydrophobic interfaces seems to mediate NC assembly (refs 14 and 15).


As expected, the C-terminus and the highly flexible region of the IDR (R12, V21 and A22) were the most exposed regions. In agreement with the 15N CPMG-RD experiments, the most protected residues of the IDR were those in conformational exchange: N3, S8, G9, G10 and N15 in the highly flexible segment, and G28 and K31 in the transition region, corroborating the existence of some degree of order in the ZIKVC IDR. We also measured the thermal susceptibility of the amide (HN) chemical shifts (dδHN/dT) (Figures 3D-F and S6; see Methods for details on the interpretation of this parameter). dδHN/dT gave additional information on the relative order of the IDR and on specific features of the globular region. Residues G10, G20, G28, G29, L30 and R32 show dδHN/dT > -5 ppb/K, indicating less expandable (stronger) hydrogen bonds, typical of HNs in secondary structure. This further supports some order in the IDR, with these amides being more likely to form intramolecular hydrogen bonds. By contrast, intermolecular hydrogen bonds with water are expected for a fully disordered polypeptide chain.
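The temperature coefficient is the slope of the amide proton chemical shift against temperature, and the -5 ppb/K cutoff quoted above separates amides likely to form intramolecular hydrogen bonds from those more likely bonded to water. The sketch below, in Python with NumPy, fits that slope by linear least squares and applies the cutoff. The function names, temperatures and chemical shift values are hypothetical and chosen only for illustration. They are not data from this study, and the authors' fitting procedure (see Methods) may differ.

```python
import numpy as np

THRESHOLD_PPB_PER_K = -5.0  # cutoff quoted in the text


def temperature_coefficient(temps_k, shifts_ppm):
    """Least-squares slope of HN chemical shift vs temperature, in ppb/K."""
    slope_ppm_per_k, _intercept = np.polyfit(temps_k, shifts_ppm, 1)
    return slope_ppm_per_k * 1000.0  # convert ppm/K to ppb/K


def likely_intramolecular_hbond(coeff_ppb_per_k):
    """True when the coefficient is less negative than the -5 ppb/K cutoff."""
    return coeff_ppb_per_k > THRESHOLD_PPB_PER_K


if __name__ == "__main__":
    temps = np.array([288.0, 293.0, 298.0, 303.0, 308.0])  # hypothetical, in K
    # Hypothetical shifts: one amide drifting by -3 ppb/K, another by -8 ppb/K.
    protected = 8.50 - 0.003 * (temps - 288.0)
    exposed = 8.30 - 0.008 * (temps - 288.0)
    for name, shifts in [("protected", protected), ("exposed", exposed)]:
        coeff = temperature_coefficient(temps, shifts)
        print(name, round(coeff, 2), likely_intramolecular_hbond(coeff))
```

A single linear fit assumes the shift changes linearly over the sampled temperature range. Curvature in real data would call for a closer look before applying the cutoff.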


Figure 3: Surface exposure and thermal susceptibility of the HN. (A) NH surface exposure (INH [Gd]⁻¹) as a function of residue number (data were derived from the Gd titration curves shown in Figure S5). (B and C) Worm representations of ZIKVC (frontal and top views, respectively). The thickness and colors represent the degree of NH exposure, with the low-exposure residues colored in blue (thin) and the high-exposure ones in red (thick). The region 50-ILAFLRF-56 is the most protected from solvent exposure. (D) Temperature dependence of the HN chemical shift (dδHN/dT) as a function of residue number. (E and F) Worm representations of ZIKVC (frontal and top views, respectively). The thickness and colors correlate with dδHN/dT, with the low-thermal-susceptibility residues colored in blue (thin) and the high-thermal-susceptibility ones in red (thick). Colorless residues represent lack of information. Asterisks refer to overlapped residues and + refers to prolines.

The thermal susceptibility measurements also enabled us to explain the smaller size of ZIKVC α-helix 1 (Figure 4). Residues K31 and L33 show weaker hydrogen bonds (dδHN/dT < -5 ppb/K), behaving as secondary structure breakers. P34, in the vicinity of these residues, contributes to the loss of helical structure in this region of α-helix 1. With these results we were able to map the key amino acid residues that make ZIKVC α-helix 1 different from those of the C proteins of other flaviviruses. They also explain the unique features of ZIKVC α-helix 1, which may be the driving force for the occlusion of the hydrophobic cleft. Remarkably, the presence of K31, L33 and P34 as helix breakers results in a smaller α-helix 1, changing its orientation and the degree of exposure of the hydrophobic cleft. It also leads to a longer IDR. In the globular region, most of the residues displaying dδHN/dT > -5 ppb/K are in secondary structure elements (Figures 3D-F). The exceptions A49, R93 and N96 (dδHN/dT