Article

Overall Structural Model of NS5A Protein from Hepatitis C Virus and Modulation by Mutations Confering Resistance of Virus Replication to Cyclosporin A. Aurelie Badillo, Veronique Brechot, Stephane Sarrazin, François-Xavier Cantrelle, Frederic Delolme, Marie-Laure Fogeron, Jennifer Molle, Roland Montserret, Anja Bockmann, Ralf Bartenschlager, Volker Lohmann, Guy Lippens, Sylvie Ricard-Blum, Xavier Hanoulle, and Francois Penin Biochemistry, Just Accepted Manuscript • Publication Date (Web): 23 May 2017 Downloaded from http://pubs.acs.org on May 24, 2017


Biochemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Overall Structural Model of NS5A Protein from Hepatitis C Virus and Modulation by Mutations Confering Resistance of Virus Replication to Cyclosporin A.

Aurelie Badillo1†, Véronique Receveur-Brechot2, Stéphane Sarrazin1, François-Xavier Cantrelle3, Frédéric Delolme1, Marie-Laure Fogeron1, Jennifer Molle1, Roland Montserret1, Anja Bockmann1, Ralf Bartenschlager4, Volker Lohmann4, Guy Lippens3#, Sylvie Ricard-Blum1§, Xavier Hanoulle3* and François Penin1*

1 Institut de Biologie et Chimie des Protéines, MMSB, UMR 5086, CNRS, Labex Ecofect, Université de Lyon, 69367 Lyon, France; 2 Aix Marseille Univ, CNRS, INSERM, Institut Paoli-Calmettes, CRCM, Marseille, France; 3 University of Lille, CNRS, UMR 8576, UGSF, Unité de Glycobiologie Structurale et Fonctionnelle, F 59 000 Lille, France; 4 Department of Infectious Diseases, Molecular Virology, University of Heidelberg, Im Neuenheimer Feld 345, 69120 Heidelberg, Germany.

Running head: Structural model of NS5A


ABSTRACT Hepatitis C virus (HCV) nonstructural protein 5A (NS5A) is an RNA-binding phosphoprotein composed of an N-terminal membrane anchor (AH), a structured domain 1 (D1) and two intrinsically disordered domains (D2 and D3). Knowledge of the functional architecture of this multifunctional protein remains limited. We report here that NS5A-D1D2D3 produced in a wheat germ cell-free system is obtained in a highly phosphorylated state. NMR analysis revealed that these phosphorylations do not change the disordered nature of the D2 and D3 domains but increase the number of conformers because phosphorylation is partial. By combining NMR and SAXS, we performed a comparative structural characterization of unphosphorylated recombinant D2 domains of the JFH1 (genotype 2a) and Con1 (genotype 1b) strains produced in E. coli. These analyses highlighted a higher intrinsic folding of the latter, revealing the variability of intrinsic conformations among HCV genotypes. We also investigated the effect of D2 mutations conferring resistance of HCV replication to cyclophilin A (CypA) inhibitors on the structure of the recombinant D2 Con1 mutants and on their binding to CypA. Although the resistance mutations D320E and R318W could induce some local and/or global folding perturbation, which could in turn affect the kinetics of conformer interconversion, they do not significantly affect the kinetics of the CypA/D2 interaction measured by surface plasmon resonance (SPR). The combination of all our data led us to build a model of the overall structure of NS5A, which provides a useful template for further investigations of the structural and functional features of this enigmatic protein.

INTRODUCTION Hepatitis C virus (HCV) infection is a leading cause of chronic hepatitis, liver cirrhosis and hepatocellular carcinoma, and a still increasing health burden worldwide (1). Direct-acting antivirals (DAAs) targeting the HCV NS3 protease, NS5B polymerase and NS5A phosphoprotein, used in all-oral combination therapies, now make it possible to eliminate the virus in the majority of infected patients, but major challenges in basic, translational and clinical research remain (2, 3). Of note, resistance to NS5A inhibitors poses an important clinical problem. In this context, detailed knowledge of the NS5A structure provides an essential framework for understanding its functions in viral replication as well as the molecular mechanism of action of DAAs targeting NS5A.

HCV is an enveloped, single-stranded positive-sense RNA virus classified in the Hepacivirus genus of the Flaviviridae family. On the basis of viral variability, HCV is classified into seven genotypes and numerous subtypes (4). The HCV RNA genome encodes a single polyprotein, which is processed both co- and post-translationally by viral and endoplasmic reticulum (ER)-bound cellular proteases to produce three structural proteins that build up the virus particle (Core, E1 and E2) and seven non-structural proteins (p7, NS2, NS3, NS4A, NS4B, NS5A and NS5B) (reviewed in (5, 6)). The viroporin p7 and the cysteine protease NS2 are required for the assembly of infectious HCV particles, while proteins NS3 to NS5B form a membrane-associated replicase complex catalyzing viral RNA replication (7). HCV proteins induce intracellular membrane rearrangements giving rise to replication factories that are composed primarily of ER-derived double membrane vesicles (DMVs) (8), which constitute the presumed sites of RNA replication (9).

NS5A is a 447 to 466 aa monotopic membrane-associated RNA-binding protein that plays an essential role in modulating HCV RNA replication and particle formation. The functions of NS5A remain to a large extent elusive. NS5A interaction with NS5B is critical for HCV RNA replication (10, 11) and it likely ensures the loading of the core protein with the viral RNA, triggering the formation of nucleocapsids (12). NS5A is produced as multiple phospho-variants, giving rise to a basal and a hyperphosphorylated form designated p56 and p58, respectively. The phosphorylation status is thought to coordinate different steps of the viral replication cycle, possibly via regulating interactions with replication- vs. assembly-specific host factors (reviewed in (13-15)).
In fact, NS5A was reported to have numerous cellular interaction partners (>100, (16)), including CypA (17) and phosphatidylinositol-4 kinase III alpha (PI4KIIIα) (18), which are essential for HCV replication. NS5A is anchored to the cytoplasmic side of the endoplasmic reticulum membrane via an amphipathic N-terminal α-helix (AH) (19, 20). Comparative sequence analyses and limited proteolysis of recombinant NS5A have defined, downstream of its N-terminal membrane anchor, three domains denoted D1, D2 and D3, separated by two low complexity sequences (LCS1 and LCS2) (21) (see Figure 1). Domains 1 and 2 are primarily involved in RNA replication whereas domain 3 is essential for viral assembly (22, 23).
Although the RNA-binding property of NS5A has been ascribed primarily to domain D1, D2 and D3 appear to contribute as well (24, 25). D3 is involved in the interaction with the core protein (26) and was recently shown to contain two regulatory determinants that orchestrate virus assembly: a serine cluster in the C-terminal region involved in the recruitment of replication complexes to core protein, and a basic cluster in the N-terminal region involved in the delivery of the RNA genome to core protein (12). D1, together with the amphipathic N-terminal α-helix, is responsible for DMV formation (27) and is also most likely the target of the highly potent inhibitor Daclatasvir (28), which blocks the biogenesis of membranous HCV replication factories (29). The first crystal structure of the D1 domain revealed a claw-like dimer with a basic groove that could accommodate either single- or double-stranded RNA (30). More recently, three other X-ray structures of the D1 domain revealed virtually identical monomer conformations, but distinct dimer organizations that have been proposed to form multimeric complexes (31, 32). According to one hypothesis, multiple NS5A dimers may form a 'basic railway' on intracellular membranes that would allow tethering as well as sliding of the viral RNA on intracellular membranes and coordination of its different fates during HCV replication (5, 7, 33, 34). In contrast to the well-structured D1, we and others found that the D2 and D3 domains are intrinsically disordered proteins (IDPs) (35-42). Moreover, we showed in recent NMR studies that both D2 and D3 domains interact with, and are substrates of, the peptidyl-prolyl cis/trans-isomerase (PPIase) activity of human CypA (35, 40). CypA is critical for HCV replication (17, 43, 44), and mutations conferring resistance of HCV replication to cyclophilin inhibitors (including Cyclosporin A (CsA) as well as the non-immunosuppressive CsA analogues Debio-025, NIM811, and SCY635 (45)) often map to the D2 and D3 domains (46-49).
Whether the binding properties and/or the PPIase activity of CypA are crucial remains to be confirmed. Indeed, both the CypA enzymatic activity (43, 50, 51) and its binding properties (48, 52-54), which are inherently tightly coupled, have been proposed to be mandatory for the virus. Recently we identified in the main CypA-binding region of NS5A-D2 (308-327) a small structural motif, a Pro314-Trp316 turn, which is essential for HCV RNA replication. The I315G NS5A-D2 mutant, in which the structural motif is absent, is not competent for replication and is altered for CypA binding. In contrast, the CypA PPIase activity toward Pro314 is higher for this mutant, suggesting the crucial involvement of CypA binding properties rather than its enzymatic activity (54). Nevertheless, the molecular mechanism by which NS5A mutations confer resistance to CypA inhibitors remains to be elucidated.

In the present work, we analyzed the structural properties and phosphorylation sites of NS5A-D1D2D3 produced in a wheat germ cell-free system by using NMR and mass spectrometry. We performed a comparative structural characterization of unphosphorylated recombinant NS5A domains D2 and D3 produced in E. coli by combining circular dichroism (CD), NMR and small angle X-ray scattering (SAXS) experiments to characterize the folding propensity and the overall conformational behavior of these intrinsically disordered domains from HCV strains Con1 (genotype 1b) and JFH1 (genotype 2a). Finally, we investigated the effect of D2 mutations conferring resistance of HCV replication to CypA inhibitors (denoted "resistance mutations") on the structural features of the recombinant D2 Con1 mutants R262Q, R318W, D320E, and R318W/D320E and on their binding to CypA. The combination of all these results allowed us to build a model for the overall structure of NS5A, which provides a template to further investigate the structural and functional features of this enigmatic protein.

EXPERIMENTAL SECTION Recombinant domains 2 (D2) and 3 (D3) and D2D3 of HCV NS5A protein from the JFH1 strain (accession number AB047639, genotype 2a), domains 2 and 3 of HCV NS5A protein from the Con1 strain (accession number AJ238799, genotype 1b), and CypA were expressed in E. coli and purified as detailed in Supporting Experimental Procedures. Unlabeled and 13C,15N-labeled NS5A D1 (aa 30-213) and D1D2D3 (aa 30-447) proteins from the Con1 strain were expressed in a wheat germ cell-free expression system (WGE-CF) and purified as detailed in Supporting Experimental Procedures.
Sample preparation and mass spectrometry analyses of phosphopeptides obtained from NS5A-D1D2D3 produced in WGE-CF after enzymatic digestion by trypsin, followed or not by Glu-C, are detailed in Supporting Experimental Procedures. Circular dichroism spectra were recorded on a Chirascan dichrograph and analyzed as detailed in Supporting Experimental Procedures. NMR spectra were acquired at 298 K on a Bruker 900 MHz NMR spectrometer equipped with a cryogenic triple resonance probe. NMR data collection and assignments, and the generation of structure models, are detailed in Supporting Experimental Procedures. SAXS experiments were performed on the SWING beamline at the Synchrotron SOLEIL, France. SAXS data collection and processing are detailed in Supporting Experimental Procedures. Surface Plasmon Resonance (SPR) binding assays were carried out in a Biacore T100, and the kinetic parameters ka and kd and the equilibrium dissociation constants were calculated using the BIAevaluation 3.0 software as detailed in Supporting Experimental Procedures.

RESULTS

Comparison of sequences and structural features of NS5A from Con1 and JFH1 strains. Figure 1 shows the alignment of NS5A sequences from the Con1 (genotype 1b) and JFH1 (genotype 2a) strains and summarizes their various structural features reported in the literature and in the present study. The comparison of invariant, very similar, and similar amino acids (labeled by asterisks, colons and dots, respectively) as well as of the numerous proline residues (colored orange) reveals an alternation of well-conserved and partially conserved regions, especially in the D2 and D3 domains. The conserved proline-tryptophan turn (54) is highlighted in yellow and the phosphorylated serine and threonine residues (this study, see below) are highlighted in green and magenta. Because of the high amino acid conservation between the D1 domains of Con1 and JFH1, the secondary structures deduced from the X-ray structure of D1 (indicated in blue letters) are likely conserved. In contrast, the transient secondary structures of D2 and D3 deduced from previous NMR studies (35, 39, 40, 54) show obvious differences between the Con1 and JFH1 strains (colored black and brown, respectively). To better understand the folding and the role of these homologous intrinsically disordered domains in the NS5A protein structure, their conformational features and folding propensities were further characterized experimentally by CD, NMR, ITC and SAXS under various physico-chemical conditions using purified recombinant proteins.

Production and characterization of NS5A samples from cell-free or bacterial systems.
Recombinant D2 and D3 domains from the JFH1 and Con1 strains, as well as D2 Con1 resistance mutants and D2D3 JFH1 domains, were efficiently overexpressed as soluble proteins in E. coli and successfully purified to homogeneity in mg amounts (see Figure S1). Production of NS5A-D1D2D3 (aa 30-447) from the Con1 strain in E. coli was inefficient, mainly due to proteolysis. To overcome this problem, D1D2D3 as well as D1 alone were expressed as fusion proteins containing a C-terminal Strep-tag II in the wheat germ cell-free system and purified by affinity chromatography on a Strep-Tactin column. This approach also allowed us to produce and purify 13C,15N-labeled D1 and D1D2D3 to homogeneity (see Figure S1) with a yield of 0.2 mg of purified protein per mL of wheat germ extract. It is worth mentioning that, in contrast to E. coli, wheat germ extracts contain endogenous protein kinases, resulting in the phosphorylation of D1D2D3 produced with this cell-free system (see characterization below). Analyses of the D1D2D3 and D1 Con1 domains by various biophysical methods including gel filtration chromatography, dynamic light scattering, small-angle X-ray scattering (SAXS), and electron microscopy showed that both proteins form concentration-dependent large soluble oligomeric assemblies (data not shown), as already reported (31). This behavior precluded further characterization by these techniques. However, the structure of the D1D2D3 and D1 domains could be investigated by circular dichroism spectroscopy (Figure 2) and NMR (Figure 3), while that of D2, D3, and D2D3 could be investigated by all of these methods.

CD analysis of NS5A-D1 and -D1D2D3 (Con1 strain) produced in cell-free system. Both D1 and D1D2D3 produced using the wheat germ cell-free system gave spectra typical of folded proteins, with a large positive band around 190 nm and a double minimum around 208 and 220 nm (Figure 2). As expected, the deconvolution of the D1 spectrum using the CDSSTR algorithm (55) yielded values of secondary structure content comparable to those deduced from its X-ray structures (PDB accession numbers 1ZH1, 3FQM and 3FQQ): 6% α-helix, 30% β-strand, 18% turns, and 44% unordered. The deconvolution of the D1D2D3 spectrum using the same method gave 7% α-helix, 28% β-strand, 16% turns and 48% unordered. According to these results, about 30 residues out of 428 would be helical in the D1D2D3 domain, while only 12 residues would be helical in the D1 domain. This difference indicates that the D2D3 domain contributes to the helical content of the D1D2D3 domain.
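The residue counts quoted above follow from multiplying the deconvoluted helix fraction by the construct length. The short sketch below reproduces that arithmetic. The D1D2D3 length of 428 is taken from the text; the D1 construct length is an assumption, obtained by adding to aa 30-213 the same 10 extra residues implied by the difference between 428 and the 418 residues of aa 30-447 (presumably tag and linker residues).

```python
def helical_residues(helix_fraction: float, n_residues: int) -> int:
    """Number of residues in helical conformation implied by a CD helix fraction."""
    return round(helix_fraction * n_residues)

# Construct length quoted in the text for D1D2D3.
D1D2D3_LENGTH = 428

# Assumption: the constructs carry the same extra (tag/linker) residues,
# inferred here as 428 minus the 418 residues of aa 30-447.
EXTRA_RESIDUES = D1D2D3_LENGTH - (447 - 30 + 1)
D1_LENGTH = (213 - 30 + 1) + EXTRA_RESIDUES

n_helix_d1d2d3 = helical_residues(0.07, D1D2D3_LENGTH)  # 7% alpha-helix
n_helix_d1 = helical_residues(0.06, D1_LENGTH)          # 6% alpha-helix

print(n_helix_d1d2d3, n_helix_d1, n_helix_d1d2d3 - n_helix_d1)
```

Under these assumptions the difference between the two counts is the number of helical residues attributable to D2D3.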


NMR analysis of phosphorylated NS5A-D1D2D3 (Con1 strain) produced in cell-free system. The 1H,15N-HSQC spectrum of NS5A-D1D2D3 (Figure 3A) displays roughly 200 resonances although the protein contains 430 residues. Thus, not all the NS5A-D1D2D3 residues can be observed by NMR spectroscopy in our experimental setup. All but about 15 resonances fall into a narrow proton chemical shift range limited to 1 ppm (from 7.55 to 8.55 ppm). This low level of dispersion suggests that these NMR resonances arise mainly from disordered residues.
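As an illustration of the dispersion criterion used above, the following minimal sketch sorts a peak list into resonances inside and outside the 7.55-8.55 ppm proton window. The window is the one quoted in the text; the peak list and the function names are hypothetical and only serve to show the bookkeeping.

```python
# Proton chemical shift window quoted in the text for poorly dispersed resonances.
DISORDERED_WINDOW_PPM = (7.55, 8.55)

def is_poorly_dispersed(h_ppm, window=DISORDERED_WINDOW_PPM):
    """True if the amide proton shift falls inside the narrow window."""
    low, high = window
    return low <= h_ppm <= high

def split_by_dispersion(peaks):
    """Split (1H ppm, 15N ppm) peaks into those inside and outside the window."""
    inside = [p for p in peaks if is_poorly_dispersed(p[0])]
    outside = [p for p in peaks if not is_poorly_dispersed(p[0])]
    return inside, outside

# Hypothetical peak list, not taken from the spectrum in Figure 3A.
example_peaks = [(8.12, 121.4), (7.90, 118.2), (8.40, 123.0), (9.05, 116.5)]
inside, outside = split_by_dispersion(example_peaks)
print(len(inside), "inside the window;", len(outside), "outside")
```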


The superimposition of the 1H,15N-HSQC spectrum of NS5A-D1D2D3 with those of the isolated domain 2 (NS5A-D2) (54) or domain 3 (NS5A-D3) (39) shows that most of the resonances from the NS5A-D1D2D3 spectrum overlay with NMR peaks from either the isolated NS5A-D2 or -D3 (see Figure S2). This indicates that most of the resonances observed in the NS5A-D1D2D3 spectrum correspond to residues from domains 2 and 3 and hence suggests that these domains behave in a similar way in the full-length protein and as isolated domains. The NMR resonances with amide proton chemical shift values higher than 8.55 ppm do not seem to correspond to disordered residues and thus may arise (i) from domain 1 of NS5A (NS5A-D1), for which prediction of the chemical shifts based on the crystal structure of the ordered domain D1 (PDB 1ZH1) leads to a spectrum that would extend from 7.1 ppm to 9.4 ppm (30, 31, 56), (ii) from the LCS1 and LCS2 linkers between the domains, or (iii) from phosphorylated residues. To unambiguously assign the amide proton resonances, we recorded three-dimensional 1H,15N,13C NMR spectra on a 15N,13C-labelled NS5A-D1D2D3 sample and used our product-plane method (57). Following this strategy, we assigned nearly all the backbone amide proton resonances of the 1H,15N-HSQC spectrum of NS5A-D1D2D3 as well as their corresponding 13Cα, 13Cβ and 13C' resonances (BMRB # 26702). None of the assigned resonances corresponds to residues from the linkers between the domains (LCS1 and LCS2) or from domain 1. The latter observation is consistent with our SAXS data showing that NS5A-D1 and NS5A-D1D2D3 form large protein assemblies in solution. Indeed, if domain 1 of NS5A is responsible for the formation of these large oligomers, then its NMR resonances would be broadened beyond detection. The use of NMR pulse sequences optimized for large proteins, such as TROSY (58), did not reveal additional resonances in the 1H,15N spectrum of NS5A-D1D2D3 (data not shown). Thus, all the amide resonances in the 1H,15N-HSQC spectrum of NS5A-D1D2D3 belong to either domain 2 or domain 3 (Figure 3). Nevertheless, NMR resonances from several regions of these domains were not visible in the spectrum of the full-length protein, such as resonances from regions 261-266 and 302-328 in domain 2 and from region 337-384 located in domain 3. These regions already show the lowest NMR intensity in the context of the isolated domains (Figure 3C and 3D). In IDPs, such low-intensity profiles often correlate with protein regions that have a reduced mobility or that undergo exchange processes. Indeed, in the case of NS5A, all these low-intensity regions have been associated with particular structural features. The 261-266 region
corresponds to residues that exhibit a propensity to adopt an α-helical structure (37, 54). The 302-328 region of NS5A contains the conserved proline-tryptophan turn that we recently identified as essential for HCV RNA replication (54) (cf. Figure 1). Finally, the N-terminus of domain 3, which is not visible in NS5A-D1D2D3, corresponds to the region 364-381 that exhibits a strong intrinsic propensity to adopt an amphipathic α-helical structure (40) (cf. Figure 1). Nevertheless, because no resonances were detected from regions 261-266 and 302-328 in domain 2 and from region 337-384 located in domain 3, one cannot exclude the possibility that these regions establish transient intramolecular interactions with the folded domain 1. Almost all amide proton resonances with a chemical shift value higher than 8.55 ppm were assigned to phosphorylated serine or threonine residues located in domain 3 (Figure 3A and 3B). Nine residues of NS5A-D1D2D3 have been identified as being at least partially phosphorylated when produced in the wheat germ cell-free system: pSer401, pSer408, pSer412, pSer415, pSer429, pSer432, pSer434, pThr435 and pSer437. These residues have higher chemical shift values for their amide proton and Cβ resonances (59). The secondary structure propensity (SSP) analysis (60) of the experimental 13Cα and 13Cβ chemical shifts (Figure 3, B-D) shows that in the context of the full-length NS5A-D1D2D3 protein, domains 2 and 3 remain mainly intrinsically disordered. Moreover, the phosphorylated residues in domain 3 favor a more extended conformation of this region, as seen from the more pronounced negative SSP scores (compare Figures 3B and 3D). In the 1H,15N-HSQC spectrum of NS5A-D1D2D3, aside from the main peaks corresponding to the major population, several low-intensity resonances are present due to minor populations. A first potential source of heterogeneity in the NS5A-D1D2D3 NMR spectrum corresponds to the numerous proline residues that can exist in both trans and cis conformations and for which the isomerization rate is rather slow (on the minute time scale). The residues in the vicinity of prolines may thus have two resonances in the 1H,15N-HSQC spectrum, one major (typically ~90% for IDPs) and one minor (typically ~10% for IDPs), when the neighboring proline residue is in the trans or cis conformation, respectively. For example, Ser396 gives two 1H,15N resonances at 8.37,117.06 ppm (81%) and 8.82,119.13 ppm (19%) that are most probably due to the conformational equilibrium of Pro397 (Figure 3A). A
similar observation was made for the Asp425 residue, which is located close to both Pro426 and Pro423 (data not shown). The second source of heterogeneity arises from both the combination and the level of phosphorylation on the Ser/Thr residues. In the region 398-416, the major population of NS5A-D1D2D3 contains three phosphorylation sites (pSer408, pSer412 and pSer415), but a minor population has also been identified for this region, in which only two phosphorylated residues are present: pSer401 and pSer408. Judging from the intensity of the NMR peaks, the major and minor populations in this region represent two-thirds and one-third of the protein conformers, respectively (Table 1).

Table 1. Phosphorylation sites and population heterogeneity in the 398-416 (top) and 427-441 (bottom) regions of NS5A-D1D2D3 revealed by NMR analyses.

Region 398-416
Residue(a)   Maj. (%)   Min. 1 (%)   Min. 2 (%)   Phosph. level (%)
D398         75         25           nd
Q399         78         22           nd
P400         nd         nd           nd
S401*        67         33           nd           33
D402         69         31           nd
D403         75         25           nd
G404         71         29           nd
D405         70         30           nd
A406         72         28           nd
G407         66         34           nd
S408*        57         43           nd           100
D409         57         43           nd
V410         44         56           nd
E411         56         44           nd
S412*        62         38           nd           62
Y413         66         34           nd
S414         71         29           nd           0
S415*        77         23           nd           77
M416         66         34           nd

Region 427-441
Residue(a)   Maj. (%)   Min. 1 (%)   Min. 2 (%)
D427         59         41           nd
L428         53         47           nd
S429*        63         37           nd
D430         69         31           nd
G431         67         33           nd
S432*        51         33           15
W433         41         24           35
S434*        45         33           22
T435*        43         23           25
V436         36         29           36
S437*        52         36           13
E438         58         42           nd
E439         40         60           nd
A440         54         46           nd
S441         58         42           nd

Phosph. level (%) for region 427-441, values in extracted order (column alignment not recoverable): 63, 77, 67, 64, 80, 0.

a Phosphorylation sites (bold underlined in the original table) are marked with an asterisk. nd, not detected.
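The two-thirds/one-third split quoted for the 398-416 region can be checked by averaging the per-residue populations of Table 1. The sketch below does this for the major and first minor populations. It assumes the row order read from the extracted table (D398 to M416, with P400 not detected); the variable and function names are illustrative only.

```python
# Per-residue populations (%) for region 398-416, D398 ... M416, as read from Table 1.
# None marks "nd" (P400 has no amide proton resonance).
maj_398_416 = [75, 78, None, 67, 69, 75, 71, 70, 72, 66, 57,
               57, 44, 56, 62, 66, 71, 77, 66]
min1_398_416 = [25, 22, None, 33, 31, 25, 29, 30, 28, 34, 43,
                43, 56, 44, 38, 34, 29, 23, 34]

def mean_detected(values):
    """Average over the residues for which a population was detected."""
    detected = [v for v in values if v is not None]
    return sum(detected) / len(detected)

mean_maj = mean_detected(maj_398_416)
mean_min1 = mean_detected(min1_398_416)
print(f"major: {mean_maj:.1f}%  minor: {mean_min1:.1f}%")
```

With these values the averages come out close to two-thirds and one-third, in line with the statement in the text.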

This ratio nearly corresponds to the phosphorylation level of Ser401, which is 33%. It is likely that the minor population in the 398-416 NS5A region is due to the phosphorylation of Ser401. In the region 427-441 of NS5A, there are two populations and even a third one for residues 432-437. In the major population, we identified five phosphorylated residues: pSer429, pSer432, pSer434, pThr435 and pSer437. The first minor population corresponds to the situation where no phosphorylation is present at all. For residues 432-437 we then identified a second minor population in which Ser432, Ser434, Thr435 and Ser437 are phosphorylated (as in the main population), which would reflect the absence of phosphorylation on Ser429 (Table 1). Indeed, the ratios between these two phosphorylated resonances (72%/28% on average) roughly correspond to the phosphorylation level of Ser429 (pSer429 63%/Ser429 37%). Globally, in the NS5A-D1D2D3 sample, the phosphorylation levels are 33%, 100%, 62%, 77%, 63%, 67%, 67%, 67% and 64% for the residues Ser401, Ser408, Ser412, Ser415, Ser429, Ser432, Ser434, Thr435 and Ser437, respectively. Our results are in line with a recent study showing that in vitro Casein Kinase 2 quickly phosphorylates S408 in NS5A-D3 whereas S401, S429 and S434 are modified more slowly and less efficiently (61). Despite the presence of numerous serine and threonine residues in LCS1-domain 2 and despite the fact that phosphorylations have already been reported in these regions (62, 63), we did not detect any phosphorylation there by NMR spectroscopy. In fact, as reported in the following section, phosphorylated residues were indeed identified in LCS1-domain 2 by mass spectrometry in the NS5A-D1D2D3 sample (Table 2 and Figure 1), indicating that their NMR signals were broadened beyond detection.

Mass spectrometry analysis of phosphorylated NS5A-D1D2D3 (Con1 strain) produced in cell-free system. To further analyze the number and nature of the phosphorylation sites, an unlabeled NS5A-D1D2D3 Con1 protein sample was directly analyzed by mass spectrometry after tryptic or tryptic + Glu-C digestion followed by TiO2 enrichment. ProteinPilot® database searching software, using the Paragon method with phosphorylation emphasis, was used to detect and identify the phosphorylated peptides. The sequence coverage of the protein was 66% and 12 phosphorylated peptides were identified (Table 2 and Supporting MS analyses). Manual validation of the spectra was performed based on the neutral loss of H3PO4 from the precursor ion and the assignment of major fragment ions to b- and y-ion series or to the corresponding neutral loss of H3PO4 from these ions. As an example, the MS/MS spectrum at m/z 648.97 (3+) of peptide 247-262 shows two N-terminal b-type daughter ion series: (i) one including the mass increment of 80 Da (b+HPO3), and (ii) the other after a loss of phosphoric acid (b-98 Da) after the phosphorylated residue (Supporting MS analyses).
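The relation between a precursor m/z, its charge and the neutral peptide mass, and the m/z shift expected for a neutral loss of H3PO4, can be sketched as follows. This is a generic calculation using standard monoisotopic masses, not the ProteinPilot procedure used in the study; the function names are illustrative.

```python
PROTON_MASS = 1.007276   # Da
H3PO4_MASS = 97.976896   # Da, monoisotopic (the "98 Da" neutral loss)
HPO3_MASS = 79.966331    # Da, monoisotopic (the "80 Da" phosphorylation increment)

def neutral_mass(mz: float, charge: int) -> float:
    """Neutral monoisotopic mass of an [M + zH]z+ ion."""
    return charge * (mz - PROTON_MASS)

def mz_after_neutral_loss(mz: float, charge: int, loss: float = H3PO4_MASS) -> float:
    """m/z of the same ion after losing a neutral fragment (charge unchanged)."""
    return mz - loss / charge

# Precursor of peptide 247-262 quoted in the text: m/z 648.97, charge 3+.
mass_247_262 = neutral_mass(648.97, 3)
loss_mz = mz_after_neutral_loss(648.97, 3)
print(f"neutral mass: {mass_247_262:.2f} Da, after H3PO4 loss: m/z {loss_mz:.2f}")
```

The neutral mass obtained this way is close to the theoretical mass listed for this peptide in Table 2 (1943.88 Da).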


Table 2. List of NS5A-D1D2D3 phosphopeptide sequences identified by mass spectrometry

| Positions | Peptide sequence a | Phospho-residues b | m/z precursor ion (charge) c | Theo. mass (∆m ppm) d | Enzyme + enrichment e |
|---|---|---|---|---|---|
| 221-240 | GpSPPSLASSSASQLSAPSLK (1P) | P-Ser222/2194 | 976.47 (2+) | 1950.93 (0.7) | Trypsin + TiO2 |
| 221-240 | GSPPSLASSpSASQLSAPSLK (1P) | P-Ser230/2202 | 976.47 (2+) | 1950.93 (1.3) | Trypsin + TiO2 |
| 247-262 | HDpSPDADLIEANLLWR (1P) | P-Ser249/2221 | 648.97 (3+) | 1943.88 (4.7) | Trypsin + TiO2 |
| 266-275 | GGNITRVEpSE (1P) | P-Ser274/2246 | 561.24 (2+) | 1140.48 (5.4) | Trypsin + GluC + TiO2 |
| 272-304 | VEpSENKVVILDSFEPLQAEEDEREVSVPAEILR (1P) | P-Ser274/2246 | 962.97 (4+) | 3847.88 (10.2) | Trypsin + TiO2 |
| 306-311 | pSRKFPR (1P) | P-Ser306/2278 | 435.72 (2+) | 869.43 (1.4) | Trypsin + TiO2 |
| 358-365 | KRpTVVLSE (1P) | P-Thr360/2332 | 506.27 (2+) | 1010.52 (2.3) | Trypsin + GluC + TiO2 |
| 360-374 | TVVLSESTVSpSALAE (1P) | P-Ser370/2342 | 786.87 (2+) | 1571.73 (10.4) | Trypsin + GluC + TiO2 |
| 360-378 | TVVLpSESTVSSALAELATK (1P) | P-Ser364/2336 | 662.7 (3+) | 1984.99 (6.8) | Trypsin + TiO2 |
| 360-378 | TVVLSEpSTVSSALAELATK (1P) | P-Ser368/2340 | 662.7 (3+) | 1984.99 (9.6) | Trypsin + TiO2 |
| 360-378 | TVVLSESTVSpSALAELATK (1P) | P-Ser370/2342 | 662.7 (3+) | 1984.99 (7.1) | Trypsin + TiO2 |
| 385-411 | SSAVDSGTATASPDQP(S)DDGDAG(S)DVE (1P) | P-Ser401/2373 or P-Ser408/2380 | 873.32 (3+) | 2616.98 (40.0) | Trypsin + GluC + TiO2 |

a Ser or Thr residues carrying a phosphorylation are in bold (prefixed with p). b Phospho-residues and positions in protein sequence. c Mass and charge of the precursor ion selected and fragmented in MS/MS mode. d Theoretical molecular weight in Da of the peptide and the difference in ppm between the theoretical and the experimental masses. e Enzyme(s) used alone or in combination for the digestion step.
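The relation between the precursor m/z, the charge and the peptide mass listed in Table 2 can be illustrated with a minimal Python sketch. It assumes singly protonated charge carriers ([M + zH]z+ ions) and standard monoisotopic masses; the function names are illustrative and are not taken from ProteinPilot. Because the tabulated m/z values are rounded to two decimals, the sketch reproduces the tabulated masses only to about 0.01 Da and cannot recover the reported ppm deviations.

```python
PROTON_MASS = 1.007276  # Da, mass of one proton (assumed charge carrier)
HPO3_MASS = 79.96633    # Da, monoisotopic mass added by one phosphorylation
H3PO4_MASS = 97.97690   # Da, monoisotopic mass of the phosphoric acid neutral loss


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass (Da) of a peptide observed as an [M + zH]z+ ion."""
    return charge * (mz - PROTON_MASS)


def ppm_error(experimental: float, theoretical: float) -> float:
    """Relative mass deviation in parts per million."""
    return (experimental - theoretical) / theoretical * 1e6


def neutral_loss_mz(mz: float, charge: int) -> float:
    """m/z of the same ion after losing one H3PO4 (charge unchanged)."""
    return mz - H3PO4_MASS / charge


if __name__ == "__main__":
    # Precursors of peptides 221-240 and 247-262 from Table 2
    for mz, z in [(976.47, 2), (648.97, 3)]:
        print(f"m/z {mz} ({z}+): M = {neutral_mass(mz, z):.2f} Da, "
              f"after H3PO4 loss m/z = {neutral_loss_mz(mz, z):.2f}")
```

The same neutral-loss shift divided by the fragment charge applies to the b- and y-ions used for manual validation.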

All twelve reported phosphorylated positions (11 pSer and 1 pThr) were unambiguously identified, except for peptide (385-411), where the MS/MS signals were too weak to assess the residue position unambiguously (Supporting MS analyses). Interestingly, except for pSer401 or pSer408, none of the other phosphorylated residues identified by NMR in the C-terminal part of domain D3 were observed by mass spectrometry (compare Tables 1 and 2). One explanation for the MS results could be the heterogeneity of the numerous phosphorylation sites in this region (see Figure 1), leading to a mixture of peptides of the same sequence carrying one or many phosphorylated residues. Moreover, this region contains numerous Glu-C cleavage sites (see Figure 1), leading to small peptide fragments that are difficult to characterize. Consequently, the quantity and/or the size of each phosphorylated peptide would be too low to be selected and/or detected by the MS system. Nevertheless, the combination of NMR and MS analyses allowed us to identify up to 19 phosphorylation sites in NS5A-D1D2D3 when produced in the wheat germ cell-free system.

Concentration- and temperature-dependent conformation of D2 and D3 domains. Previously characterized as mainly intrinsically disordered domains (35, 39, 40), NS5A D2 and D3 produced in E. coli (i.e., in an unphosphorylated form) showed a folding propensity that appeared to be very dependent on their concentration and on temperature. As reported in Figure 4 (solid lines), the CD spectra of diluted D2 and D3 domains at room temperature were typical of disordered proteins with a large negative peak centered at 200 nm, although the shoulders observed around 222 nm indicated the presence of some poorly defined structures. While increasing the concentration of each domain from 50 µM to 300-500 µM, we observed a strong reduction of the negative peak at 200 nm with a concomitant signal change in the 215-240 nm region. This behavior, which was fully reversible upon dilution/concentration (data not shown), is due to intermolecular interactions, as demonstrated by dilution ITC experiments (Figure S3). Concomitantly with the increase of temperature (from 25 to 90°C), a significant increase of negative ellipticity at 222 nm and a decrease at 205 nm were observed for the four domains (Figure 4E and 4F), indicating some folding. These changes are fully reversible upon cooling, and no aggregation was observed. These results illustrate the conformational plasticity of the D2 and D3 domains, which, depending on the concentration, could form soluble protein assemblies in dynamic equilibrium. They also indicate that the partial folding of the NS5A domains is induced by increasing concentration or temperature. This phenomenon, frequently observed in IDPs, could be attributed to an increase of hydrophobic interactions at higher temperatures, leading to a driving force for folding (64).

Folding propensity of NS5A D2 and D3 domains. To further probe the folding propensity of the D2 domains from Con1 and JFH1 strains, their secondary structure was investigated by CD under various stabilizing conditions with cosolvents (TFE) or detergents (SDS) that mimic the environment found in the hydrophobic core of globular proteins (for an overview see (65, 66) and references therein). TFE is well known to induce the folding of polypeptide chains that have an intrinsic propensity to adopt an α-helical structure by stabilizing short-range hydrogen bonds. In the presence of 50% TFE, the spectra of the D2 Con1 and D2 JFH1 domains exhibited distinct minima at 208 and 222 nm and a maximum at 192 nm, which are characteristic of helical folding (Figure 5A and 5B). The D2 JFH1 domain exhibits a limited propensity to fold compared with the D2 Con1 domain, as shown by its lower negative ellipticity value at 222 nm at all TFE concentrations tested (from 0 to 80%) (Figure 5C). Above 50% TFE, irrelevant folding could be induced, as is likely the case here for both domains (67). At 50% TFE, assuming that the CD signal at 222 nm was entirely due to α-helix (68, 69) and taking into account the length of the domains, one can estimate that up to 42 and 18 residues could adopt a helical conformation in the D2 Con1 and D2 JFH1 domains, respectively. This genotype-dependent difference was not observed for the D3 domains, which have a similar α-helical folding propensity in the presence of TFE (40).
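The kind of estimate described above, converting the ellipticity at 222 nm into a number of helical residues, can be sketched as follows. This is only an illustration under stated assumptions: the exact relation used in references (68, 69) is not given in the text, so the sketch simply divides the mean residue ellipticity by an assumed reference value for a fully helical chain (here -33000 deg cm2 dmol-1, a value that differs between calibrations). The function name and the input numbers are hypothetical and are not data from this study.

```python
def helical_residue_count(theta_222: float, n_residues: int,
                          theta_full_helix: float = -33000.0) -> float:
    """Estimate the number of helical residues from the CD signal at 222 nm.

    theta_222        mean residue ellipticity at 222 nm (deg cm^2 dmol^-1)
    n_residues       length of the polypeptide chain
    theta_full_helix ASSUMED ellipticity of a 100% helical chain (negative)
    """
    if theta_full_helix >= 0:
        raise ValueError("theta_full_helix must be negative")
    # Fraction of helix, assuming the 222 nm signal is entirely due to helix
    fraction = theta_222 / theta_full_helix
    # Clamp to the physically meaningful range [0, 1]
    fraction = max(0.0, min(1.0, fraction))
    return fraction * n_residues


if __name__ == "__main__":
    # Hypothetical input: a 100-residue domain with [theta]222 = -13200
    print(round(helical_residue_count(-13200.0, 100), 1))
```

Because the result scales linearly with the chosen reference ellipticity, such estimates are best read as upper bounds, in line with the "up to" wording used in the text.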

ACS Paragon Plus Environment

17

Biochemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 18 of 62

In the presence of SDS, CD spectra of the D2 and D3 domains again showed the typical spectrum of α-helical folding when tested under acidic conditions (Figure 6, A-D, dashed lines), especially in the case of the D2 Con1 domain (Figure 6B). Figure 6 (E-H) shows that the domains reached their maximum α-helical folding (monitored by the ellipticity at 222 nm) below pH 4.0 (but pH 2.0 for the D2 Con1 domain, Figure 6F) and were mainly unordered above this pH. The apparent pK of the helix-to-unordered transition was about 5.0, a value close to the pKas of the acidic side chains of Glu and Asp (4.5 and 4.6, respectively), which are abundant in D2 and D3 domains. Thus, the repulsive electrostatic interactions between the negatively charged groups of Glu and Asp could prevent α-helical folding of the domains.
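The link between pH and side-chain charge invoked here follows from the Henderson-Hasselbalch relation. The short sketch below uses the pKa values quoted in the text (4.5 for Glu, 4.6 for Asp) to compute the fraction of negatively charged carboxylates at a given pH. It treats each side chain as an isolated, independent acid, which is an assumption: pKa values in a real polypeptide or in SDS micelles can be shifted.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Fraction of an acidic group in its charged (deprotonated) form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    pkas = {"Glu": 4.5, "Asp": 4.6}  # values quoted in the text
    for ph in (2.0, 4.0, 5.0, 7.0):
        charged = ", ".join(
            f"{name} {fraction_deprotonated(ph, pka):.2f}"
            for name, pka in pkas.items()
        )
        print(f"pH {ph}: charged fraction -> {charged}")
```

Under these assumptions, almost all Glu and Asp side chains are neutral at pH 2.0 and almost all are charged at pH 7.0, with the steepest change around the pKa.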

Interestingly, for D2 Con1, in contrast to the other domains, the helix-to-unordered transition occurred over a larger pH range (possibly due to the presence of one His residue at position 340, pKa of 6.0) and the reversibility of the pH-induced changes required a longer time. This indicates the tendency of D2 Con1 to cluster and collapse by hydrophobic interactions, a behavior also observed during the NMR analyses (data not shown). These CD analyses in the presence of SDS at low pH again point to the higher propensity of the D2 Con1 domain to adopt an α-helical fold compared with the D2 JFH1 domain.

Localization of potential secondary structure elements in the D2 domains. Disorder predictors such as the VL-XT PONDR® algorithm (70) predicted a high degree of disorder in both D2 sequences of JFH1 and Con1 strains, with, however, lower scores for the first 15 N-terminal residues and also in the C-termini (Figure 7). The prediction between residues 300 and 315, and more generally for the C-terminal half, is clearly different for the D2 sequences from JFH1 and Con1 strains. To evaluate the structure of D2 domains at the residue level in aqueous conditions, we analyzed our experimental NMR 13Cα and 13Cβ chemical shifts (39, 54) with the SSP (Secondary Structure Propensities) program (60), developed to probe the propensities of IDPs to form secondary structures. Although both D2 JFH1 and D2 Con1 domains globally exhibit similar SSP patterns with two regions having α-helical propensity (aa 250-267 and aa 296-306, according to Con1 numbering), their corresponding positive SSP scores are clearly higher in D2 Con1. Indeed, the SSP scores for the two helical regions are 0.441 and 0.537, respectively, in D2 Con1, whereas they are 0.242 and 0.241 in D2 JFH1. Moreover, the gap of four residues observed in the D2 JFH1 helix sequence significantly reduces the size of the N-terminal helix. The backbone torsion angles predicted by the Talos+ program (71) from 1H and 13C NMR chemical shifts indicate several potential helical segments in addition to the two helices predicted by SSP in D2 Con1, but none in D2 JFH1 (Figure 7). Interestingly, the N-terminal helix 250-267 is quite well predicted by classical secondary structure prediction methods applied to the amino acid sequence, although it is reduced in genotype 2 strains, whereas the prediction of the second helix appears to be largely genotype dependent (see Figure S4). Altogether, these data highlight the larger α-helix folding propensity of D2 Con1 compared with D2 JFH1 and are fully consistent with the above CD data. This is also consistent with the fact that D2 Con1 could not be clearly classified as a coil or pre-molten globule IDP in the double CD wavelengths plot proposed by Uversky (72) (Figure S5). In contrast, D2 JFH1 and both D3 Con1 and JFH1 clearly behave as coil IDPs.

Evaluation of overall conformation of NS5A domains by SAXS. To obtain structural information about the overall shape and dimensions of each NS5A domain, we performed detailed SAXS analyses of protein samples eluted from an on-line gel filtration column. Of note, a single peak was obtained for each NS5A domain, indicating a homogeneous population of molecules. The experimental radii of gyration Rg of each domain reported in Table 3 were estimated by the conventional Guinier analysis (73) as well as by the indirect transform approach (GNOM program (74)) and Debye analysis (75), which provide a more accurate Rg value than Guinier analysis. The obtained Rg values were compared to the values expected for random-coil domains with the same number of residues, calculated according to the equation $R_g = 1.93 \times N^{0.598}$ (76, 77), where N represents the number of residues. This comparison indicates that all NS5A domains exhibit experimental Rg values close to the predicted random-coil values (Table 3).
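The random-coil reference value is straightforward to compute from the power law quoted above. The sketch below implements it together with its inverse; the prefactor and exponent are those given in the text, Rg is taken to be in Å as in Table 3, and the chain length used in the example is hypothetical rather than the actual length of any NS5A construct.

```python
def random_coil_rg(n_residues: float, r0: float = 1.93, nu: float = 0.598) -> float:
    """Radius of gyration (Angstrom) expected for a random coil of N residues."""
    return r0 * n_residues ** nu


def residues_for_rg(rg: float, r0: float = 1.93, nu: float = 0.598) -> float:
    """Inverse relation: chain length that yields a given random-coil Rg."""
    return (rg / r0) ** (1.0 / nu)


if __name__ == "__main__":
    n = 100  # hypothetical chain length, for illustration only
    rg = random_coil_rg(n)
    print(f"N = {n}: expected random-coil Rg = {rg:.1f} A")
    print(f"Rg = {rg:.1f} A corresponds to N = {residues_for_rg(rg):.0f}")
```

Comparing such a reference value with the experimental Rg is what allows a domain to be classified as coil-like or more compact.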

Table 3. SAXS biophysical parameters of NS5A D2, D3 and D2D3 domains

| Protein | MMCalc (kDa) a | MMExp (kDa) b | Rgth (Å) c | Rg (Å) d | Rg* (Å) e | Rg** (Å) f | Dmax (Å) |
|---|---|---|---|---|---|---|---|
| D2 JFH1 | 12.8 | 10.0 | 30.8 | 28.0 ± 0.4 | 29.5 | 29.9 ± 0.1 | 110 ± 5 |
| D2 Con1 | 12.4 | 18.8 | 31.4 | 30.9 ± 0.2 | 32.8 | 33.7 ± 0.1 | 130 ± 5 |
| D3 JFH1 | 11.9 | 9.0 | 33.6 | 33.0 ± 0.5 | 36.2 | 35.1 ± 0.1 | 130 ± 5 |
| D3 Con1 | 9.9 | 13.0 | 29.6 | 29.0 ± 0.5 | 31.1 | 31.3 ± 0.2 | 115 ± 5 |
| D2D3 JFH1 | 23.7 | 40.0 | 49.4 | 42.7 ± 0.4 | 49.7 | 47.5 ± 0.2 | 195 ± 5 |

a Theoretical molecular mass calculated from amino acid sequence. b Experimentally based molecular mass calculated from the scattering intensity extrapolated at zero angle I(0). c Theoretical radius of gyration of random coils with the same number of residues. d Radius of gyration, estimated from the Guinier plots. e Radius of gyration, estimated using the program GNOM. f Radius of gyration, estimated using the Debye law.

The pair-distance distribution functions P(r) deduced from the scattering intensities of NS5A domains display an asymmetric shape with an extended tail, with maximum dimensions Dmax of 110 Å, 130 Å, and 195 Å for D2, D3, and D2D3 JFH1, respectively (Figure 8B and Table 3), and 130 Å and 115 Å for D2 and D3 Con1, respectively (Table 3 and Figure S6). Relative to the number of residues in these domains, these Dmax values are indicative of extended, non-globular conformations. Taken together, these results clearly indicate that D2D3 JFH1 as well as the D2 and D3 domains from JFH1 and Con1 strains behave as random-coil polypeptides. It has been shown that the normalized Kratky plot representation of a SAXS scattering curve exhibits a typical bell shape with a clear maximum for globular proteins, either in the native state or in the molten globule state. In contrast, for a fully disordered protein or a premolten globule state, a sharp rise as a function of qRg is observed before reaching a plateau, sometimes followed by an increase, depending on the local rigidity of the chain (77). The Kratky plots of D2, D3, and D2D3 JFH1 (Figure 8A) as well as D2 and D3 Con1 (Figure S6) do not display bell-shaped curves and are typical of disordered proteins. However, a significant difference is observed between the curve shapes of the D2 and D3 domains for both strains. Indeed, D3 curves display significantly larger increases as a function of qRg than D2 curves, suggesting a more extended and rigid conformation for D3 than for D2. Interestingly, this difference is more pronounced between D2 and D3 of the Con1 strain. Moreover, the plateau of the D2 Con1 curve is more defined at large qRg values than that of D2 JFH1, indicating that D2 Con1 is less rigid than D2 JFH1 and could display short secondary structure elements.
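The two limiting shapes of the normalized Kratky plot described above can be reproduced with idealized models. The following sketch is not the analysis performed in this study: it assumes a Gaussian chain described by the Debye function for the disordered case, and a Guinier-type curve as a simple stand-in for a compact globular particle (an approximation that is strictly valid only at low q). Function names are illustrative.

```python
import math


def debye_form_factor(x: float) -> float:
    """Debye function I(q)/I(0) for a Gaussian chain, with x = (q*Rg)^2."""
    if x == 0.0:
        return 1.0
    return 2.0 * (math.exp(-x) + x - 1.0) / x ** 2


def kratky_gaussian_chain(q_rg: float) -> float:
    """Normalized Kratky value (qRg)^2 * I(q)/I(0) for a Gaussian chain."""
    x = q_rg ** 2
    return x * debye_form_factor(x)


def kratky_globular(q_rg: float) -> float:
    """Normalized Kratky value using a Guinier curve as a globular stand-in."""
    x = q_rg ** 2
    return x * math.exp(-x / 3.0)


if __name__ == "__main__":
    for q_rg in (0.5, 1.0, math.sqrt(3.0), 3.0, 5.0, 10.0):
        print(f"qRg = {q_rg:5.2f}  chain = {kratky_gaussian_chain(q_rg):.3f}  "
              f"globular = {kratky_globular(q_rg):.3f}")
```

In this idealization the globular curve is bell shaped, with its maximum at qRg = √3, whereas the chain curve rises toward a plateau of 2. A further rise beyond the plateau, as described for D3, is not captured by the Gaussian chain model and would require a model that includes local chain stiffness.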

The above data suggest that the NS5A domains are highly dynamic and exist in solution as interconverting conformers, which could adopt extended conformations. To obtain a more precise and complete picture of the NS5A domains, we combined experimental SAXS data with NMR data to describe the ensemble of conformers existing in solution. We used the program suite EOM (ensemble optimization method), which fits the SAXS data by selecting a set of structures out of a large pool of calculated structures representing the conformational space accessible to each flexible domain (see Experimental Section). NS5A domain structures were generated with the Xplor-NIH program (78) from dihedral angle constraints calculated with TALOS (71) from 1H and 13C chemical shifts, except for D2D3 JFH1, for which a complete set of NMR chemical shifts is not available. The EOM-optimized ensembles of conformers (Figure 9) show bimodal Rg distributions corresponding to compact and extended structures, as illustrated by representative conformer models. Compact structures are, however, much more represented than extended structures, suggesting a relative stability of compact conformers, which could transiently adopt an extended conformation. This is particularly well illustrated in the case of D2 Con1 (Figure 9C), for which the Rg distribution is almost monomodal compared with that of D2 JFH1 (Figure 9B). This behavior is likely related to the higher tendency of α-helix formation in D2 Con1 than in D2 JFH1, possibly in a cooperative manner. The larger average Rg value observed for D3 JFH1 compared with D3 Con1 (39 Å and 33 Å, respectively, Figure 9, D and E) is likely related to the 20 aa insertion in the former (see sequence alignments in Figure S4). The wide Rg distribution of D2D3 conformers compared with theoretical random structures (Figure 9A) indicates that this NS5A region could transiently adopt a largely extended conformation but conserve some compactness.

Structural analyses of D2 Con1 mutants bearing CsA resistance mutations. Coelmont et al. (47) reported that D2 Con1 mutations D320E, R318W and R318W+D320E conferred a 6.4-, 2.5-, and 10.1-fold resistance to the CypA inhibitor CsA, respectively (and 3.9-, 1.6-, and 3.8-fold resistance to the CypA inhibitor Alisporivir (Debio 025), respectively), as measured by HCV RNA replication with the Con1 subgenomic replicon. In contrast, mutation R262Q does not confer measurable resistance to these inhibitors. To investigate the possible effect of CsA resistance mutations on D2 Con1 structural features, point mutations R262Q, R318W, and D320E, as well as the double mutation R318W/D320E, were introduced in the D2 Con1 overexpression plasmid, and the corresponding mutated recombinant domains were purified and analyzed by CD and SAXS. The CD spectra of all D2 mutants at 25°C were comparable to wild-type, except for the D320E mutant, which exhibited a lower negative value of molar ellipticity around 222 nm (Figure 10A, dotted line), suggesting the loss of some folding. However, this mutant conserved its overall folding propensity induced by increasing temperature, as monitored at 222 nm (Figure 10B). In contrast, the double mutant R318W/D320E showed a higher propensity to fold with increasing temperature. This feature could be attributed to a stabilization of hydrophobic interactions due to the presence of the Trp residue at position 318. Intriguingly, this feature was not observed with the single mutant R318W (Figure 10B). Nevertheless, these data suggest that the resistance mutations D320E and R318W may induce some folding perturbation of the proline-tryptophan turn (see Figure 7 and Figure S4). The Kratky plot representation of X-ray scattering showed that D2 mutants globally behave like D2 wild-type and are typical of intrinsically disordered proteins (Figure 10C). However, the curve of the R318W mutant exhibits a significantly different shape, with a weak bump followed by a small increase as a function of qRg. Such features have been associated with the presence of residual structures in premolten globules (79), indicating that residual structures may be present in the R318W mutant. It is indeed possible that the W318 residue establishes additional hydrophobic interactions (possibly stacking) with W316, belonging to the PW-turn, but also with the neighboring aromatic Y321 as well as with the surrounding P319, P323 and P324.

The pair-distance distribution functions P(r) of the D2 mutants behave similarly to that of D2 wild-type (Figure 10D), i.e., they exhibit an asymmetric shape with an extended tail and maximum dimensions around 125-130 Å. The radii of gyration measured for the different mutants with different methods (see above) are summarized in Table 4. These results indicate that the Rg values of the D2 mutants are close to the predicted random-coil values. However, the Rg of mutant R318W is slightly higher than that of the wild-type, reflecting a wider conformation that might be due to some residual structure preventing the more compact conformation of the wild-type.

Table 4. Comparison of SAXS biophysical parameters of NS5A D2 resistance mutants

| Protein | MMCalc (kDa) a | MMExp (kDa) b | Rgth (Å) c | Rg (Å) d | Rg* (Å) e | Rg** (Å) f | Dmax (Å) |
|---|---|---|---|---|---|---|---|
| D2 Con1 | 12.4 | 18.8 | 31.4 | 30.9 | 32.8 | 33.6 | 130 ± 5 |
| D2 Con1-R318W | 12.5 | 16.0 | 31.4 | 32.6 | 34.0 | 35.0 | 130 ± 5 |
| D2 Con1-D320E | 12.4 | 10.6 | 31.4 | 30.1 | 31.2 | 32.9 | 125 ± 5 |
| D2 Con1-R262Q | 12.4 | 11.4 | 31.4 | 31.6 | 32.6 | 33.5 | 130 ± 5 |
| D2 Con1-R318W/D320E | 12.5 | - | 31.4 | 31.8 | 33.3 | 33.7 | 130 ± 5 |

For details, see legend of Table 3.

Taken together, the structural analyses of D2 Con1 mutants indicate that CsA resistance mutations D320E and R318W could induce some local folding perturbation, which could thus affect the kinetics of interconversion of conformers.

SPR binding assay of CypA to NS5A domains. To investigate the mechanism of NS5A resistance to cyclophilin inhibitors, we studied the direct interaction between CypA and NS5A domains as well as the effect of the D2 Con1 resistance mutations on this interaction. In order to preserve the flexibility of the disordered NS5A domains, we captured these His-tagged domains on an NTA sensor chip via their His-tag and then stabilized them covalently, taking advantage of the tag to orient the domains and of the stabilization step to prevent them from dissociating from the sensor chip surface (80). The sensorgrams obtained after injection of increasing CypA concentrations over immobilized D2 Con1 (Figure 11A) or D2 and D3 domains from JFH1 and Con1 (Figure S7) were best fitted to a heterogeneous ligand model. This model assumes that the domains immobilized on the sensor chip have two independent binding sites for CypA (roughly equivalent to two 1:1 models), leading to the calculation of two sets of rate and affinity constants in the micromolar range (Table 5). The existence of two putative binding sites suggested by the evaluation of SPR data is in agreement with the identification of two main CypA binding sites on NS5A-D2 by NMR spectroscopy (cf. Figure 1D in (11) and Figure 5 in (39)). These two main sites have also been identified by peptide mapping (81) and correspond to the region centered on the PW-turn motif and the C-terminus, respectively.

Table 5. Kinetic and affinity parameters of CypA binding to NS5A D2 and D3 domains a

| Ligand | ka1 (×10+3 M-1s-1) | kd1 (×10-3 s-1) | ka2 (×10+3 M-1s-1) | kd2 (×10-2 s-1) | KD1 ; KD2 (µM) |
|---|---|---|---|---|---|
| D2 Con1 | 3.2 | 2.0 | 3.4 | 1.4 | 0.6 ; 3.9 |
| D2 JFH1 | 1.1 | 2.2 | 17 | 9.7 | 0.2 ; 5.7 |
| D3 Con1 | 7.1 | 1.4 | 17 | 4.5 | 0.2 ; 2.7 |
| D3 JFH1 | n.d. | n.d. | n.d. | n.d. | n.d. |

a The sensorgrams were fitted to a heterogeneous ligand model. n.d., not determined because the SPR signal was too low to be measured accurately.
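For each of the two independent sites in the heterogeneous ligand model, the equilibrium dissociation constant follows from the rate constants as KD = kd/ka. The sketch below applies this relation to the D2 Con1 rate constants of Table 5. The function name is illustrative and this is not the fitting routine of the SPR evaluation software, which fits the full sensorgrams. Small differences from the tabulated KD values are expected because the published rate constants are rounded.

```python
def dissociation_constant_uM(ka_per_M_per_s: float, kd_per_s: float) -> float:
    """Equilibrium dissociation constant KD = kd / ka, returned in micromolar."""
    return kd_per_s / ka_per_M_per_s * 1e6


if __name__ == "__main__":
    # D2 Con1 rate constants from Table 5 (site 1 and site 2)
    sites = {
        "site 1": (3.2e3, 2.0e-3),  # ka1 in M^-1 s^-1, kd1 in s^-1
        "site 2": (3.4e3, 1.4e-2),  # ka2 in M^-1 s^-1, kd2 in s^-1
    }
    for name, (ka, kd) in sites.items():
        print(f"{name}: KD = {dissociation_constant_uM(ka, kd):.2f} uM")
```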

CsA inhibition of CypA totally prevented the binding to NS5A domains (data not shown), demonstrating that the catalytic site of CypA is involved in the interaction. This is consistent with our previous NMR studies showing that D2 and D3 domains are substrates for the peptidylprolyl isomerase activity of CypA (39, 40). It should be pointed out that CypA/NS5A domain interactions were characterized with the NS5A domains immobilized on the sensor chip. Indeed, no significant SPR signal was detected in the reverse orientation, although CypA was functional, as shown by its ability to bind CsA with high affinity (KD = 19.6 nM) under these experimental conditions (Figure S8). The CypA/NS5A complex was dissociated in running buffer, which preserved the conformation(s) of the immobilized D2 domains.

It should be mentioned that the interaction of CypA with the D2 and D3 domains induced only minor structural changes detectable by CD. Indeed, the experimental spectra of CypA-D2 or CypA-D3 complexes were close to those calculated by summing the CD spectrum of each species (Figure S9). This is consistent with the view that the cis/trans isomerization of proline residues catalyzed by CypA does not induce the formation of new conformers, but mainly accelerates the kinetics of conformer interconversion.

Binding of CypA to D2 Con1 resistance mutants. SPR experiments were performed with D2 Con1 mutants to evaluate whether resistance mutations could interfere with CypA binding (Figure 11). The sensorgrams for the binding of CypA to wild-type D2 and to each of the D2 mutants were nearly superimposable, as shown in Figure 11B for mutants R318W and D320E. Several concentrations of CypA were injected over the D2-R318W, D2-D320E, D2-R262Q and D2-R318W/D320E mutants (Figure S10) to calculate the rate constants and affinities of these interactions, which are reported in Table 6. Interestingly, these values were in the same range as those obtained with the wild-type D2 domain, showing that the mutations did not disrupt the binding of the D2 domain to CypA and that they do not significantly alter the kinetics and the affinity of the binding. Similar findings have been reported by Chatterji et al. (53).

Table 6. Kinetic and affinity parameters of CypA binding to NS5A D2 resistance mutants a

| Ligand | ka1 (×10+3 M-1s-1) | kd1 (×10-3 s-1) | ka2 (×10+3 M-1s-1) | kd2 (×10-2 s-1) | KD1 ; KD2 (µM) |
|---|---|---|---|---|---|
| D2 Con1 | 3.2 | 2.0 | 3.4 | 1.4 | 0.6 ; 4.0 |
| D2 Con1-D320E | 6.2 | 2.1 | 2.9 | 1.6 | 0.3 ; 5.4 |
| D2 Con1-R318W | 3.4 | 1.8 | 1.3 | 1.6 | 0.5 ; 12.0 |
| D2 Con1-R262Q | 3.9 | 2.0 | 2.9 | 8.6 | 0.5 ; 3.0 |
| D2 Con1-R318W/D320E | 4.4 | 3.7 | 4.9 | 1.1 | 0.9 ; 2.2 |

a The sensorgrams were fitted to a heterogeneous ligand model.

DISCUSSION The determination of the detailed structural features of NS5A is mandatory to understand its precise function(s) in the HCV life cycle and the mechanism of action of highly potent NS5A inhibitors that constitute a critical component of clinically used therapy of chronic HCV infection. Although significant advances have been achieved in recent years, the biochemical and structural analysis of this enigmatic protein remains very challenging. This is due to the difficulties to produce the full-length protein (notably

ACS Paragon Plus Environment

28

Page 29 of 62

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Biochemistry

because of its membrane binding properties), its high propensity to self-interaction, the intrinsically disordered nature of its D2 and D3 domains, and the various phosphorylated states of the protein. We report here that NS5A-D1D2D3 produced in a wheat germ cell-free system was phosphorylated by protein kinases present in the wheat germ extract. It is not very likely that the NS5A phosphorylation pattern obtained with the wheat germ cell-free system is identical to NS5A produced in authentic replication models for several reasons. First, NS5A generated upon viral replication typically gives rise to two phosphoisoforms, basal (p56) and hyperphosphorylated NS5A (p58, reviewed (13-15, 82), which can be separated by SDS-PAGE whereas only one was observed here using wheat germ extract. The distinct functions and precise phosphorylation patterns underlying the two isoforms are still enigmatic, but the synthesis of p58 required the expression of NS5A at least in the context of a NS3-5A polyprotein (83, 84, 85) and is influenced by NS5B expression (85). Second, NS5A expressed as a single protein seems not to be fully functional, likely due to a different phosphorylation pattern, as lethal mutations within NS5A cannot be rescued by transcomplementation of sole expression of NS5A (86). Third, numerous kinases have yet been identified to be involved in NS5A phosphorylation, including CKIα (87), CKII (22, 88, 89), Polo-like kinase (90), PKA (91) and even PI4KA (85). Genetic evidence suggests that p58 synthesis involves a sequential cascade of phosphorylation events within LCS I positions 221-240 (62), mainly based on CKIα (92), which might be regulated by priming and blocking phosphorylations involving other kinases (90, 93). While position 222 has been identified in many studies, including ours, positions 235, exclusively found in p58 (63), and 238 were only recently shown to be phosphorylated by MS (63). 
The fact that our study missed these residues might point to dysregulation of phosphorylation events in absence of the polyprotein context, or alternatively that our protein expression model might lack specific kinase isoforms. Overall, many potential NS5A phosphorylation sites with functional relevance have been suggested but only a very limited number have been unequivocally identified by MS (62, 94). The combination of NMR and MS analyses of NS5A-D1D2D3 produced in cell-free system allowed us now to identify up to 19 phosphorylation sites (Figure 1) including positions that have been identified earlier, like 222 (2194) found in all studies so far using mass spectrometry (62, 63, 82), and 360 (2332), which is phosphorylated by PKA in vitro and which is partially phosphorylated within cells (91). Moreover, our NMR data now ACS Paragon Plus Environment

29

Biochemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 30 of 62

show that only a sub-fraction of NS5A is indeed phosphorylated at many sites, with partial phosphorylation at most sites (Table 1). Partial phosphorylation might explain the difficulty in identifying numerous sites by both NMR and MS in our analysis, and might be a reason for the low number of phosphorylation sites identified in HCV replication models (62, 63, 94). Our data also provide evidence that positions 432, 434 and 437 (JFH1: 452, 454, 457) are indeed phosphorylated. These serine residues have been shown to regulate the interaction with core protein, thereby linking RNA replication to assembly of virions (22, 23, 26). Most interestingly, our study suggests several phosphorylation sites (#360, 364, 366, 370) close to a basic cluster (356-359) in the N-terminal region of the D3 domain, which was recently found to regulate NS5A-RNA-core interaction and envelopment of viral particles (12). These phosphorylations might regulate interaction process(es) by influencing the folding propensity of the putative molecular recognition element in this region (40). It will be interesting to analyze the functional role of these phosphorylation events in an authentic replication model in future studies using phosphoablatant and phosphomimetic mutations.

Overall, analysis of phosphorylation of full-length NS5A generated in a cell-free system is a promising technique to identify novel potential phosphorylation sites, particularly in D3. Although the phosphorylation pattern will not be authentic at all sites, it is so far the only model that allows NMR studies on almost full-length NS5A, and it will thereby be a valuable way to validate predicted phosphorylation sites based on reverse genetics. From a structural point of view, the present NMR analyses revealed that phosphorylations do not change the disordered nature of the D2 and D3 domains but multiply the number of conformers due to partial phosphorylations. In line with our results, Secci et al. have concluded that the phosphorylations pS401, pS408, pS429 and pS432 in NS5A-D3 do not influence its secondary structure content (61). Differential phosphorylation patterns could give rise to a huge number of NS5A phosphovariants probably serving various functions (91). Phosphorylations could regulate the interactions of NS5A with other viral or host factors both by influencing the folding of its molecular recognition elements and by the destruction of intermolecular interactions, or conversely by the formation of new interaction sites.
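To give a sense of scale for the "huge number" of phosphovariants mentioned above, the following minimal Python sketch enumerates every on/off phosphorylation pattern for the seven D3 positions named in this section (360, 364, 366, 370, 432, 434, 437). It assumes that each site is phosphorylated independently of the others, which is an illustrative simplification and not a result of this study; the function name `phosphovariants` is hypothetical.

```python
from itertools import product


def phosphovariants(sites):
    """Yield every combination of phosphorylated sites as a frozenset.

    Assumption (illustrative only): each site is either phosphorylated or
    not, independently of all other sites.
    """
    ordered = sorted(sites)
    for states in product((False, True), repeat=len(ordered)):
        yield frozenset(site for site, on in zip(ordered, states) if on)


# Con1 NS5A positions quoted in the text above.
d3_sites = [360, 364, 366, 370, 432, 434, 437]
variants = list(phosphovariants(d3_sites))
print(len(variants))  # 2**7 = 128 possible patterns
```

Under this assumption, seven partially occupied sites alone already allow 128 distinct species, which is in line with the multiplication of NMR-observable conformers discussed above.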

The folding propensity of purified recombinant unphosphorylated NS5A domains D2 and D3 from Con1 and JFH1 strains produced in bacteria and their overall conformational behavior were studied by
CD, NMR and SAXS. These techniques are highly complementary, as they provide, respectively, essential information about the state of folding and the secondary structure content, structural information at the atomic level, and the dimensions and distribution of protein conformers in solution. We confirmed that both D2 and D3 domains from Con1 and JFH1 strains are intrinsically disordered and exhibit coil-like conformations, although D2 Con1 exhibits some pre-molten globule-like properties due to its relatively high content of residual secondary structures. All D2 and D3 domains showed a strong folding propensity that depends on both protein concentration and temperature, indicating that they could form clusters of conformers in dynamic equilibrium through increased hydrophobic interactions and α-helix stabilization. Analyses of NMR chemical shifts (Figure 7) allowed the identification of two putative α-helical segments comprising amino acids 251-266 and 298-306, respectively. This finding is consistent with the identification of similar helices in the strain HC-J4 of genotype 1b (37, 41). Interestingly, several additional short helical segments were predicted in the D2 Con1 domain using the Talos+ and SSP programs from experimental NMR data and also by conventional secondary structure prediction methods (see Figure S4), but their identification as helices has to be confirmed. In contrast to D2 Con1, a poor helical folding propensity was observed for D2 JFH1. This is consistent with our previous NMR observations (39) and is likely related both to its higher proline content (15 vs 11 prolines in D2 Con1) and to the deletion of four residues in the first putative helix. In addition, while helix 251-266 is well predicted by bioinformatics methods in all genotypes except genotype 2, which includes the JFH1 strain (see Figure S4), the prediction of the second helix 298-306 appears to be highly genotype-dependent. Moreover, the ensemble of selected D2 conformers representing the experimental SAXS data in Figure 9 clearly shows a bimodal distribution for D2 JFH1 conformers, indicating a relative stability of both the compact and extended forms, which interconvert. In contrast, the distribution for D2 Con1 conformers is almost monomodal, indicating a good stability of compact conformers that transiently adopt extended conformations (Figure 9). The comparative structural characterization of the recombinant D2 JFH1 and D2 Con1 domains highlighted the higher α-helical folding propensity and compactness of the latter, which indicates the presence of extra intramolecular interactions. The PW-turn structural motif that we identified in NS5A-D2 (54) could also be involved by acting as a minimal folded core that could establish transient long-range intramolecular interactions.
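The bimodal versus monomodal distributions discussed above are distributions of the radius of gyration (Rg) over an ensemble of conformers. The Python sketch below shows, under simplifying assumptions, how Rg can be computed from coordinates and binned into a histogram. It treats all points as having equal mass, uses invented two-point "conformers" as toy input, and does not reproduce the ensemble selection step performed by EOM; the function names are hypothetical.

```python
import numpy as np


def radius_of_gyration(coords):
    """Rg of a set of points, assuming equal masses (a simplification)."""
    coords = np.asarray(coords, dtype=float)
    centroid = coords.mean(axis=0)
    squared_distances = ((coords - centroid) ** 2).sum(axis=1)
    return float(np.sqrt(squared_distances.mean()))


def rg_histogram(conformers, n_bins=10):
    """Histogram of Rg values over an ensemble of conformers."""
    rgs = np.array([radius_of_gyration(c) for c in conformers])
    counts, edges = np.histogram(rgs, bins=n_bins)
    return counts, edges


# Toy ensemble (not real NS5A coordinates): two compact and one extended form.
toy_pool = [
    [[0, 0, 0], [2, 0, 0]],  # Rg = 1
    [[0, 0, 0], [2, 0, 0]],  # Rg = 1
    [[0, 0, 0], [6, 0, 0]],  # Rg = 3
]
counts, edges = rg_histogram(toy_pool, n_bins=2)
print(counts, edges)
```

A distribution with two well-separated populated bins would correspond to the bimodal case described for D2 JFH1, whereas a single dominant bin would correspond to the nearly monomodal case described for D2 Con1.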

The comparison of the distribution of conformers for the D3 JFH1 and D3 Con1 domains indicates that compact structures are more represented than extended structures in the D3 Con1 domain, indicating a relative stability of both forms (Figure 9E). In contrast, the conformation of the D3 JFH1 domain appears to rapidly oscillate between compact and extended forms (Figure 9D). This behavior is likely related to the presence of a 20-aa proline-rich sequence insertion in the D3 JFH1 domain. In summary, our experimental data reveal the variability of intrinsic conformations of the D2 and D3 domains between HCV genotypes and raise the important question of the folding and interaction capabilities of these domains with viral and cell factors in the various genotypes, which likely play a role in the genotype-dependent pathogenesis of HCV infection. They also underscore the fact that one has to be cautious with some results of NS5A functional studies obtained with JFH1-based molecular tools, as they may not necessarily be representative of other genotypes, such as genotypes 1 and 3, which are the most prevalent worldwide.

All the structural data reported here, together with other available data, prompted us to build a tentative model of the overall structure of the NS5A protein anchored to the membrane (Figure 12), alongside host CypA, to provide a template to further investigate the structural and functional features of this enigmatic protein. This model was built for the NS5A sequence of the Con1 strain using available structural data (for details, see legend of Figure 12) and thus highlights all its predicted structured elements: the recently identified α-helix at the end of D1 (37, 41), the PW-turn (54) as well as the α-helical segments in D2 (this study) and the amphipathic helix at the beginning of D3 (40). All these putative molecular recognition elements could mediate, alone or in combination, essential interactions with viral or cell factors that could moreover be modulated by phosphorylations (see above) (91), and would explain the numerous host interacting proteins reported for the NS5A protein (16). It is also possible that several of these structured elements could interact to yield a more compact structure of the NS5A protein, but no significant interaction between the globular and disordered parts of NS5A is expected (42). Indeed, most NMR signals of isolated D2 and D3 can be superimposed on those of these domains in the NS5A-D1D2D3 spectrum (Figures 3 and S2), excluding of course the phosphorylation sites.
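Because the Con1 and JFH1 sequences differ by insertions and deletions (for example the 20-aa insertion in D3 JFH1 noted above), equivalent residues carry different numbers in the two strains. The Python sketch below shows one way such a correspondence could be derived from a gapped pairwise alignment. The two aligned strings are invented toy sequences, not the real NS5A sequences, and `alignment_position_map` is a hypothetical helper.

```python
def alignment_position_map(aligned_a, aligned_b, gap="-"):
    """Map 1-based residue numbers of sequence A onto sequence B.

    Both inputs are rows of the same pairwise alignment, so they must have
    equal length. Residues of A that face a gap in B get no entry.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    mapping = {}
    pos_a = pos_b = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a != gap:
            pos_a += 1
        if res_b != gap:
            pos_b += 1
        if res_a != gap and res_b != gap:
            mapping[pos_a] = pos_b
    return mapping


# Toy alignment (invented): B lacks one residue early and has a 2-residue insertion.
toy_a = "MSDW--SPE"
toy_b = "M-DWPPSPE"
mapping = alignment_position_map(toy_a, toy_b)
print(mapping)  # {1: 1, 3: 2, 4: 3, 5: 6, 6: 7, 7: 8}
```

In this toy case the numbering offset changes sign along the sequence: it is negative after the deletion and positive after the insertion, which mirrors the kind of shift seen when the same position is quoted in Con1 and JFH1 numbering.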

The two conformers of the D2D3 domain in Figure 12 were selected to illustrate either a relatively compact D2 domain and an extended D3 domain (left) or a relatively extended D2 domain and a compact D3 domain (right), according to the experimental SAXS data in Figure 9. One can wonder whether these extended conformations represent extreme situations, which likely exist transiently during the biosynthesis of NS5A molecules. The extended D2D3 domains look like the tentacles of an octopus exploring its environment to capture its prey. They could allow the recruitment of distant biological partners by the specific binding of molecular recognition element(s), alone or in combination. In vivo, the conformation of the D2D3 domains is likely modulated (i) by the increased local concentration due to the multimerization of NS5A at the surface of the ER membrane (27, 95), as suggested by the concentration-dependent changes observed for D2 and D3 in this study (Figure 4), (ii) by the RNA-binding properties of the D2 domain (24, 25), which are expected to modulate its conformation, and finally (iii) by the interaction of the acidic D2 or D3 domains with positively charged partner(s) such as the D1 basic groove, which may
induce folding into α-helix, as observed here at low pH (Figure 5). Nevertheless, further studies will be required to determine the functional architecture of this protein.

Although a correlation has been established between the CypA interaction sites and mutations in the D2 domain that confer resistance to cyclophilin inhibitors (47, 48, 52, 96, 97, 98), the underlying molecular mechanism has not yet been elucidated. D320E is to date the most common mutation conferring resistance to CypA inhibitors and has been identified in CsA-, NIM811-, Alisporivir-, and SCY 635-resistant HCV Con1 replicons (47, 52, 53, 96, 97, 99), as well as in CsA-resistant HCV JFH1 (D316E in JFH1) (48). Our structural analyses of the D2 Con1 recombinant mutants show that the D320E mutation modifies some existing structural constraints in wild-type D2, but does not significantly affect the overall conformational properties of this domain (Figure 10). This finding is consistent with the previous observation that, in the context of D2 Con1-derived synthetic 20-mer peptides, the D320E mutation triggers marked alterations of the W316 and A317 NMR chemical shifts (47). It is also in keeping with the more extended conformation, observed by CD and NMR, of a synthetic peptide carrying the D316E+Y317N double resistance mutation (denoted DEYN) of D2 from the JFH1 strain (48). The D320E mutation is located right after the PW-turn structural motif (54), which is mandatory for HCV RNA replication, so these two elements may be connected. The binding of CypA to the D2 domain does not significantly affect the ensemble of D2 conformers (see Figure S9). Moreover, neither the D320E mutation nor the other tested mutations have a measurable impact on the kinetics and affinity of CypA binding to the D2 domain measured by SPR. This finding corroborates previously reported pull-down experiments of CypA binding to the D320E mutant (47, 53, 99) and NMR CypA titration experiments of 20-mer peptides with or without the D320E mutation (47). We suggest that the role of CypA would be to allow the D2 domain to adopt alternative conformation(s) required for efficient interaction(s) of NS5A with some essential partner(s). It remains difficult, however, to identify which partner is affected, since proline isomerization may affect various features of the NS5A protein, including binding to the viral RNA (24, 51, 100), to the NS5B polymerase (10) or Core protein (26), and to multiple host proteins (16, 101, 102), as well as modulation of the IFN pathway (103).

In conclusion, the combination of CD, NMR and SAXS data obtained with recombinant NS5A domains allowed us to propose a model of the overall NS5A structure. Although further work will be
needed to determine the functional architecture of this intrinsically dynamic protein bound to the membrane, the present model allowed us to discuss various structural and functional features of this multifunctional protein. In addition, our structural and interaction analyses of the D2 mutants with CypA provide valuable new information towards the resolution of the molecular mechanism by which NS5A mutations confer resistance of HCV replication to CypA inhibitors.
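The SPR interaction analyses summarized above relied on fitting sensorgrams to a heterogeneous ligand model (see the legend of Figure 11). As a rough illustration of what such a model computes, the Python sketch below simulates the association phase as the sum of two independent 1:1 binding sites. This is a textbook form of the model and an assumption about its shape here; it is not the Biaevaluation implementation, and every rate constant and Rmax value in the example is hypothetical rather than taken from this study. Only the 2.5 µM analyte concentration is borrowed from the figure legend.

```python
import numpy as np


def one_site_association(t, conc, ka, kd, rmax):
    """Response of a single 1:1 site during analyte injection.

    t in s, conc in M, ka in 1/(M s), kd in 1/s, rmax in response units.
    """
    k_obs = ka * conc + kd                # observed association rate
    r_eq = rmax * ka * conc / k_obs       # steady-state response
    return r_eq * (1.0 - np.exp(-k_obs * t))


def heterogeneous_ligand_association(t, conc, sites):
    """Sum the responses of independent site classes given as (ka, kd, rmax)."""
    t = np.asarray(t, dtype=float)
    return sum(one_site_association(t, conc, *site) for site in sites)


# Hypothetical site parameters, chosen only to make the curve visible.
sites = [(1.0e5, 1.0e-2, 80.0), (1.0e4, 1.0e-3, 40.0)]
t = np.linspace(0.0, 120.0, 121)
response = heterogeneous_ligand_association(t, conc=2.5e-6, sites=sites)
print(response[0], response[-1])
```

Comparing wild-type and mutant surfaces then amounts to asking whether the fitted parameters of each site class differ, which is the comparison reported as showing no measurable effect of the tested mutations.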

FIGURE LEGENDS

Figure 1. Sequence alignment and structural features of NS5A from Con1 and JFH1 strains. NS5A sequences from Con1 and JFH1 strains of HCV subtypes 1b and 2a (GenBank™ accession numbers AJ238799 and AB047639, respectively) were aligned with Clustal W. Amino acids are numbered with respect to NS5A proteins and the HCV polyproteins for each strain (top rows in black for the Con1 strain, bottom rows in brown for the JFH1 strain). The limits of the amphipathic helix (AH), the D1, D2 and D3 domains, and the low-complexity sequences (LCS 1 and 2) are indicated according to (21). The color coding of domains corresponds to that used for the right subunit of NS5A in Figure 10. Identical, highly similar, and similar residues at each position are indicated with asterisks, colons, and dots, respectively, according to Clustal W conventions. Identical and similar residues are in grey while residues having no similarity are in black to highlight sequence positions and regions exhibiting clear amino acid differences. Gaps between sequences are indicated by hyphens. Proline residues are colored orange and cysteine residues involved in zinc binding are colored green (# 39, 57, 59, 80). The secondary structures (colored blue) of AH (1-31) and D1 (36-198) were deduced from NMR and X-ray structures (PDB entries 1R7E and 1ZH1, respectively; c, coil, no secondary structure assigned; s, bend; H, alpha helix; t, turn; b, beta bridge; E, beta strand). The predicted secondary structures of D2 and D3 deduced from reported NMR data (35, 39, 40, 54) using the Talos+ program (Talos Pred.; for details, see legend of Figure 12) are indicated in italics above the Con1 sequence (in black) and below the JFH1 sequence (in brown). The conserved proline-tryptophan turn (54) is highlighted in yellow (PDB entry: 2M5L; BMRB number: 19059). The phosphorylated serine and threonine residues observed in 13C,15N-labeled NS5A D1D2D3 by NMR are highlighted in magenta, while those observed by mass spectroscopy are highlighted in green when confirmed and cyan when probable (this study, see the text for details). Underlined residues correspond to regions exhibiting alternative conformations as observed by NMR.

Figure 2. Comparison of far-UV circular dichroism spectra of the purified NS5A domains D1 and D1D2D3 of the Con1 strain expressed in the wheat germ cell-free system. CD spectra of D1 and D1D2D3 were recorded in 10 mM sodium phosphate buffer, pH 7.4.

Figure 3. NMR analyses of phosphorylated NS5A-D1D2D3. The protein produced in the cell-free system was observed to be phosphorylated by endogenous protein kinases of the wheat germ extract. (A) 1H,15N-HSQC NMR spectrum of NS5A-D1D2D3 at 900 MHz. Regions corresponding to glycine, serine/threonine or phospho-serine/phospho-threonine residues are highlighted by gray, blue and red dashed lines, respectively. Assignments of residues that are phosphorylated (totally or partially) are indicated. Assignments with an asterisk correspond to NMR peaks coming from minor NS5A populations (see Table 1). (B) Secondary Structure Propensity (SSP) analysis of the 13C NMR chemical shifts of NS5A-D1D2D3. Values close to 0 correspond to fully disordered residues, whereas positive and negative scores represent helical propensities and extended regions, respectively. Light gray bars indicate that the corresponding residues have been assigned in the NS5A-D1D2D3 NMR spectrum in (A). The red points indicate the phosphorylated residues in the major population of NS5A and the red star indicates the residues that are phosphorylated only in a minor population. (C, D) Low-complexity sequences (LCS1 and LCS2) and domains 2 and 3 (D2, D3) of NS5A are highlighted in light grey, blue and purple, respectively. (C) NMR peak intensity profile of isolated NS5A-D2 and NS5A-D3 domains. The NMR intensities reported are those obtained on the isolated D2 or D3 domains alone at the same concentration. (D) Secondary Structure Propensity (SSP) analysis of the 13C NMR chemical shifts of isolated D2 and D3 domains. Values close to 0 correspond to fully disordered residues, whereas positive and negative scores represent helical propensities and extended regions, respectively.

Figure 4. Effect of protein concentration and temperature on the conformations of the NS5A D2 and D3 domains. CD spectra of D2 and D3 domains from strains JFH1 (A,C) and Con1 (B,D) recorded in 10 mM sodium phosphate buffer, pH 7.4 at different protein concentrations: 50 µM (solid line), 125 µM (large
dashed line), 300 and 500 µM (dashed line) for the D2 Con1 and the D3 JFH1 domains respectively; and 125 µM (solid line), 250 µM (large dashed line), and 500 µM (dashed line) for the D2 JFH1 (A) and D3 Con1 (D) domains. (E,F) Temperature-induced conformational changes of D2 (E) and D3 (F) domains from JFH1 (circles) and Con1 (squares) monitored by the molar ellipticity per residue at 205 nm (black symbols) and 222 nm (white symbols) from 25 to 90 °C. The protein concentration was 8 µM for all samples.

Figure 5. Comparison of TFE-induced folding propensity of D2 domains from Con1 and JFH1 strains. CD spectra of the D2 domain from JFH1 (A) and from Con1 (B) were recorded at 25 °C in 10 mM sodium phosphate buffer, pH 7.4 (H2O, solid line) or complemented with 50% (v/v) TFE (small dashed line). The difference spectra (large dashed lines) were obtained by subtracting the spectrum recorded in the presence of 50% TFE from that recorded in H2O. (C) The molar ellipticity per residue at 222 nm, assumed to represent the α-helix content in the D2 JFH1 (white circles) and D2 Con1 (black squares) domains, is plotted as a function of percent TFE.

Figure 6. SDS-induced folding propensity and pH-dependent conformation of the D2 and D3 domains. (A-D) CD spectra of the indicated domains were recorded at 25 °C in the presence of 100 mM SDS in 10 mM sodium phosphate buffer at pH 7.4 (solid line) and 2.0 (dashed line). (E-F) The molar ellipticity per residue at 222 nm, assumed to represent the α-helix content in the D2 and D3 domains, was monitored between pH 2.0 and 7.4.

Figure 7. Disorder predictions and helical propensity of D2 domains from Con1 and JFH1 strains. In this figure, the sequences of D2 Con1 (top) and D2 JFH1 (bottom) domains are kept aligned as shown in Figures 1 and S5. Disorder predictions were performed on the NS5A sequences using the PONDR software (solid curve). A disorder score above the 0.5 threshold indicates a disordered state. Results of the SSP analyses of the Cα and Cβ NMR chemical shifts of D2 Con1 and D2 JFH1 residues (BMRB codes 19055 and 16165, respectively) are indicated as gray bars. Positive SSP scores (right axes) correspond to helical propensity, whereas negative scores indicate extended regions. Talos pred., Talos secondary structure prediction based on residue backbone torsion angles predicted from NMR chemical shifts. H, helix; c, coil.
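The reading rules stated in the legend of Figure 7 (a disorder score above 0.5 indicates disorder; positive SSP scores indicate helical propensity) can be expressed as a short Python sketch. The score values below are invented for illustration, and the `cutoff` and `min_length` parameters used to call a helical segment are hypothetical choices, not criteria used in this study.

```python
def is_disordered(pondr_score, threshold=0.5):
    """Apply the 0.5 disorder threshold quoted in the legend."""
    return pondr_score > threshold


def helical_segments(ssp_scores, first_residue=1, cutoff=0.0, min_length=3):
    """Return (start, end) residue ranges where SSP stays above `cutoff`.

    `cutoff` and `min_length` are illustrative assumptions.
    """
    segments = []
    start = None
    for offset, score in enumerate(ssp_scores):
        if score > cutoff:
            if start is None:
                start = offset
        elif start is not None:
            if offset - start >= min_length:
                segments.append((first_residue + start, first_residue + offset - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(ssp_scores) - start >= min_length:
        segments.append((first_residue + start, first_residue + len(ssp_scores) - 1))
    return segments


# Invented per-residue SSP scores starting at residue 250.
toy_ssp = [0.0, 0.3, 0.4, 0.5, 0.2, -0.1, 0.1, -0.3, 0.2, 0.3, 0.4]
print(helical_segments(toy_ssp, first_residue=250))  # [(251, 254), (258, 260)]
```

The isolated positive score at residue 256 is not reported because it falls below the assumed minimum run length, which illustrates why short predicted helices need independent confirmation.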


Figure 8. Overall conformation of the NS5A domains assessed by SAXS measurements. (A) Normalized Kratky plot representation of the scattered intensity for the D2D3 JFH1 (black bold line), D2 JFH1 (thin line) and D3 JFH1 (grey line) domains. (B) Experimental distance distribution profile P(r) (plot of frequency versus distances (r) within the particle) calculated using GNOM for these domains.

Figure 9. Distributions of the radius of gyration (Rg) derived from the EOM analysis and representative structures of the NS5A domains. (A) D2D3 JFH1; (B) D2 JFH1; (C) D2 Con1; (D) D3 JFH1; (E) D3 Con1. For each domain, the Rg distribution of the initial pool of 1000 conformers calculated from NMR data is represented by the solid line and white area, while the Rg distribution of conformers that fit the SAXS data is represented by the dashed line and grey area. The compact and extended structures shown are representative of the low- and large-Rg conformers fitting the SAXS data. The ribbon representations were generated on the basis of the structure coordinates by using the Visual Molecular Dynamics program (VMD, (104)) and rendered with the POV-Ray software package. Alpha-helices are represented as cylinders.

Figure 10. Effect of CsA resistance mutations on the structural features of the D2 Con1 domain. (A,B) Far-UV CD spectra of the D2 wild-type domain (solid line) and D2 mutants R262Q (short dashed line), R318W (long dashed line), D320E (dotted line), and R318W/D320E (dotted dashed line) were recorded at 25 °C in 10 mM sodium phosphate buffer, pH 7.4. The conformational changes induced by temperature were followed at 222 nm between 25 °C and 90 °C (B). (C,D) SAXS analyses of the D2 wild-type domain and mutants. (C) Normalized Kratky plot representation of the scattered intensity for the D2 wild-type domain (bold grey line) and D2 mutants R262Q (black thin line), R318W (bold black line), D320E (thin dark grey line), and R318W/D320E (grey thin line). (D) Experimental distance distribution profile P(r) calculated using GNOM for D2 wild-type and mutants (symbols as in panels A and B).

Figure 11. (A) Sensorgrams resulting from the injection of various concentrations of CypA (0.5 to 2.5 µM) over the covalently stabilized wild-type D2 Con1 domain at a flow rate of 30 µL/min. The sensorgrams were fitted (dashed lines) to a heterogeneous ligand model (Biaevaluation software, floated Rmax during the fitting process). (B) Superimposition of sensorgrams resulting from the injection of 2.5 µM CypA over the covalently stabilized D2 Con1 wild-type domain (solid line) and D2-R318W (small
dashed line) and D2-D320E (large dashed line) mutants on the same sensor chip. All experiments were performed with 10 mM HEPES pH 7.4, 150 mM NaCl, 50 µM EDTA, 0.05% surfactant P20 as running buffer.

Figure 12. Tentative model of the NS5A protein in relation to the membrane. The structures and the membrane bilayer are shown at the same scale. The NS5A domain 1 dimer (PDB entry 1ZH1 (30), subunits colored in red and blue) and the N-terminal amphipathic α-helix in-plane membrane anchor (PDB entry 1R7E (20), helices colored in red and blue) are shown in relative position to a phospholipid membrane (adapted from (5, 30)). The membrane is represented as a simulated model of a 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) bilayer (http://moose.bio.ucalgary.ca). Polar heads and hydrophobic tails of phospholipids (stick structures) are light yellow and gray, respectively. Domain 2 (subunits colored in magenta and cyan) and domain 3 (subunits colored in yellow and blue) are in ribbon representation with putative α-helices predicted from NMR chemical shifts (Talos+) represented as cylinders. The low-complexity sequences (LCS, (21)) connecting D1 to D2 (LCS1) and D2 to D3 (LCS2) are colored in grey. The representative structures of D2 and D3 presented here combine experimental SAXS data with NMR data (see Figure 9). The model structures including the end of D1, LCS1, D2, LCS2, and D3 were generated from dihedral constraints calculated with the Talos+ program from 1H and 13C chemical shifts of the Con1 strain for the D2 and D3 domains (BMRB accession code 16166), and of the HC-J4 strain for the end of the D1 domain, LCS1 and LCS2 (BMRB accession code 17468; see Supplemental Figure S4 for similarities between the sequences of these two genotype 1b strains). The representative structures were connected to the domain 1 dimer structure using the Swiss PDB-viewer program (http://www.expasy.org/spdbv/) (105). A surface representation of CypA in complex with cyclosporin A (CsA) is shown on the left (PDB entry 1CWA). A putative structure for CypA and its main binding site in NS5A D2 is shown on the right. Note that all structures are drawn to scale to illustrate the length and flexibility of the 'arms' formed by D2 and D3 and the relative sizes of CypA and NS5A D1 domains. The figure was generated on the basis of the structure coordinates by using the Visual Molecular Dynamics program (VMD, (104)) and rendered with the POV-Ray software package.

ASSOCIATED CONTENT

Supporting Information

- Supporting Experimental Procedures (PDF)
- Supporting Figures including electrophoresis of purified NS5A domains used in this study (Figure S1), superimposition of the 1H,15N HSQC NMR spectrum of phosphorylated NS5A-D1D2D3 with unphosphorylated isolated D2 and/or D3 domain (Figure S2), ITC titrations of D2 and D3 dilution, and D2-D3 interaction analyses by ITC and CD (Figure S3), sequence analysis of NS5A D2D3 from various HCV genotypes (Figure S4), double CD wavelengths plot predicting the coil-like state of D2 and D3 (Figure S5), SAXS measurements of overall D2 and D3 conformations (Figure S6), SPR analyses of CypA interaction with D2, D3 (Figure S7) and CsA (Figure S8), CD analyses of CypA interaction with D2 and D3 (Figure S9), binding of CypA to D2 resistance mutants by SPR (Figure S10) (PDF)
- Mass Spectroscopy analyses of phosphorylated NS5A-D1D2D3 (PDF)

AUTHOR INFORMATION

Corresponding Authors
* E-mail: [email protected]. Fax: +33-3-20-43-65-55; [email protected]. Fax: +33-4-72-7226-04.

Present Addresses
† RD-Biotech, Recombinant Protein Unit, 3 rue Henri Baigue, F-25000 Besançon, France.
§ ICBMS, UMR 5246 CNRS - University Lyon 1, 43 Boulevard 11 novembre 1918, F-69622 Villeurbanne cedex, France.
# Laboratoire d'Ingénierie des Systèmes Biologiques et des Procédés, INSA, University of Toulouse, CNRS, INRA, Toulouse, France.

Author Contributions
The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

Funding Sources
The research was supported by grants from the Agence Nationale de Recherches sur le Sida et les Hépatites Virales (ANRS, A02007-2 and A02011-2, PhD fellowship to A.B. and Post-Doc fellowship to S.S., respectively), and the French National Agency for Research (ANR-11-JSV8-005, LABEX ECOFECT ANR-11-LABX-0048, and MAPPING project ANR-11-BINF-0003). Work of R.B. was supported by the Deutsche Forschungsgemeinschaft (SFB/TRR83, TP13). The NMR facility was supported by the European Community, the CNRS (TGIR RMN THC, FR-3050), the University of Lille, and the Région Nord-Pas-de-Calais (France).

Notes
The authors declare no competing financial interest. An oral presentation (O-9.14) of the main part of this work was given by A. Badillo at the 18th International Meeting on Hepatitis C and Related Viruses (Seattle, WA, USA, September 2011).

ACKNOWLEDGMENTS

The authors gratefully acknowledge Christophe Combet for euHCVdb bioinformatics support, Timothy L. Tellinghuisen for providing E. coli recombinant NS5A domain 1 for early SAXS experiments, and Simona Miron for her expertise in ITC experiments. SPR and CD experiments were performed using the technical facility of UMS 3444/US8 Gerland - Lyon Sud, Lyon, France. SAXS experiments were performed on the SWING beamline at SOLEIL Synchrotron, Saclay, France (proposal numbers 20100304 and 20120186). We are grateful to Pierre Roblin and Javier Perez for assistance and to the SOLEIL staff for smoothly running the facility.

ABBREVIATIONS

aa, amino acid(s); CD, circular dichroism; CsA, cyclosporin A; CypA, cyclophilin A; CSI, chemical shift index; DAA, direct-acting antiviral; EOM, ensemble optimization method; HCV, hepatitis C virus; HPLC, high-performance liquid chromatography; IDP, intrinsically disordered protein; ITC, isothermal titration calorimetry; HSQC, heteronuclear single-quantum correlation; LCS, low-complexity sequences; MS, mass spectroscopy; NMR, nuclear magnetic resonance; NS5A, nonstructural protein 5A; PPIase, peptidyl-prolyl cis/trans-isomerase; SAXS, small-angle X-ray scattering; SDS, sodium dodecyl sulfate;
SPR, surface plasmon resonance; SSP, secondary structure propensities; TFE, 2,2,2-trifluoroethanol; TOCSY, total correlation spectroscopy; WGE, wheat germ extract; WGE-CF, wheat germ extract cell-free system.

REFERENCES

1. Stanaway, J. D., Flaxman, A. D., Naghavi, M., Fitzmaurice, C., Vos, T., Abubakar, I., Abu-Raddad, L. J., Assadi, R., Bhala, N., Cowie, B., Forouzanfour, M. H., Groeger, J., Mohd Hanafiah, K., Jacobsen, K. H., James, S. L., MacLachlan, J., Malekzadeh, R., Martin, N. K., Mokdad, A. A., Mokdad, A. H., Murray, C. J., Plass, D., Rana, S., Rein, D. B., Richardus, J. H., Sanabria, J., Saylan, M., Shahraz, S., So, S., Vlassov, V. V., Weiderpass, E., Wiersma, S. T., Younis, M., Yu, C., El Sayed Zaki, M., and Cooke, G. S. (2016) The global burden of viral hepatitis from 1990 to 2013: findings from the Global Burden of Disease Study 2013. Lancet 388, 1081-8.
2. Pawlotsky, J. M. (2016) Hepatitis C Virus Resistance to Direct-Acting Antiviral Drugs in Interferon-Free Regimens. Gastroenterology 151, 70-86.
3. Moradpour, D., Grakoui, A., and Manns, M. P. (2016) Future landscape of hepatitis C research - Basic, translational and clinical perspectives. J Hepatol 65, S143-55.
4. Simmonds, P. (2013) The origin of hepatitis C virus. Curr Top Microbiol Immunol 369, 1-15.
5. Moradpour, D., Penin, F., and Rice, C. M. (2007) Replication of hepatitis C virus. Nat Rev Microbiol 5, 453-63.
6. Moradpour, D., and Penin, F. (2013) Hepatitis C virus proteins: from structure to function. Curr Top Microbiol Immunol 369, 113-42.
7. Bartenschlager, R., Penin, F., Lohmann, V., and Andre, P. (2011) Assembly of infectious hepatitis C virus particles. Trends Microbiol 19, 95-103.
8. Romero-Brey, I., Merz, A., Chiramel, A., Lee, J. Y., Chlanda, P., Haselman, U., Santarella-Mellwig, R., Habermann, A., Hoppe, S., Kallis, S., Walther, P., Antony, C., Krijnse-Locker, J., and Bartenschlager, R. (2012) Three-dimensional architecture and biogenesis of membrane structures associated with hepatitis C virus replication. PLoS Pathog 8, e1003056.
9. Paul, D., Hoppe, S., Saher, G., Krijnse-Locker, J., and Bartenschlager, R. (2013) Morphological and biochemical characterization of the membranous hepatitis C virus replication compartment. J Virol 87, 10612-27.
10. Shirota, Y., Luo, H., Qin, W., Kaneko, S., Yamashita, T., Kobayashi, K., and Murakami, S. (2002) Hepatitis C virus (HCV) NS5A binds RNA-dependent RNA polymerase (RdRP) NS5B and modulates RNA-dependent RNA polymerase activity. J Biol Chem 277, 11149-55.
11. Rosnoblet, C., Fritzinger, B., Legrand, D., Launay, H., Wieruszeski, J. M., Lippens, G., and Hanoulle, X. (2012) Hepatitis C virus NS5B and host cyclophilin A share a common binding site on NS5A. J Biol Chem 287, 44249-60.
12. Zayas, M., Long, G., Madan, V., and Bartenschlager, R. (2016) Coordination of Hepatitis C Virus Assembly by Distinct Regulatory Regions in Nonstructural Protein 5A. PLoS Pathog 12, e1005376.
13. Evans, M. J., Rice, C. M., and Goff, S. P. (2004) Phosphorylation of hepatitis C virus nonstructural protein 5A modulates its protein interactions and viral RNA replication. Proc Natl Acad Sci U S A 101, 13038-43.

14. Lohmann, V. (2013) Hepatitis C virus RNA replication. Curr Top Microbiol Immunol 369, 167-98.
15. Ross-Thriepland, D., and Harris, M. (2015) Hepatitis C virus NS5A: enigmatic but still promiscuous 10 years on! J Gen Virol 96, 727-38.
16. de Chassey, B., Navratil, V., Tafforeau, L., Hiet, M. S., Aublin-Gex, A., Agaugue, S., Meiffren, G., Pradezynski, F., Faria, B. F., Chantier, T., Le Breton, M., Pellet, J., Davoust, N., Mangeot, P. E., Chaboud, A., Penin, F., Jacob, Y., Vidalain, P. O., Vidal, M., Andre, P., Rabourdin-Combe, C., and Lotteau, V. (2008) Hepatitis C virus infection protein network. Mol Syst Biol 4, 230.
17. Kaul, A., Stauffer, S., Berger, C., Pertel, T., Schmitt, J., Kallis, S., Zayas, M., Lohmann, V., Luban, J., and Bartenschlager, R. (2009) Essential role of cyclophilin A for hepatitis C virus replication and virus production and possible link to polyprotein cleavage kinetics. PLoS Pathog 5, e1000546.
18. Reiss, S., Rebhan, I., Backes, P., Romero-Brey, I., Erfle, H., Matula, P., Kaderali, L., Poenisch, M., Blankenburg, H., Hiet, M. S., Longerich, T., Diehl, S., Ramirez, F., Balla, T., Rohr, K., Kaul, A., Buhler, S., Pepperkok, R., Lengauer, T., Albrecht, M., Eils, R., Schirmacher, P., Lohmann, V., and Bartenschlager, R. (2011) Recruitment and activation of a lipid kinase by hepatitis C virus NS5A is essential for integrity of the membranous replication compartment. Cell Host Microbe 9, 32-45.
19. Brass, V., Bieck, E., Montserret, R., Wolk, B., Hellings, J. A., Blum, H. E., Penin, F., and Moradpour, D. (2002) An amino-terminal amphipathic alpha-helix mediates membrane association of the hepatitis C virus nonstructural protein 5A. J Biol Chem 277, 8130-9.
20. Penin, F., Brass, V., Appel, N., Ramboarina, S., Montserret, R., Ficheux, D., Blum, H. E., Bartenschlager, R., and Moradpour, D. (2004) Structure and function of the membrane anchor domain of hepatitis C virus nonstructural protein 5A. J Biol Chem 279, 40835-43.
21. Tellinghuisen, T. L., Marcotrigiano, J., Gorbalenya, A. E., and Rice, C. M. (2004) The NS5A protein of hepatitis C virus is a zinc metalloprotein. J Biol Chem 279, 48576-87.
22. Tellinghuisen, T. L., Foss, K. L., and Treadaway, J. (2008) Regulation of hepatitis C virion production via phosphorylation of the NS5A protein. PLoS Pathog 4, e1000032.
23. Appel, N., Zayas, M., Miller, S., Krijnse-Locker, J., Schaller, T., Friebe, P., Kallis, S., Engel, U., and Bartenschlager, R. (2008) Essential role of domain III of nonstructural protein 5A for hepatitis C virus infectious particle assembly. PLoS Pathog 4, e1000035.
24. Foster, T. L., Belyaeva, T., Stonehouse, N. J., Pearson, A. R., and Harris, M. (2010) All three domains of the hepatitis C virus nonstructural NS5A protein contribute to RNA binding. J Virol 84, 9267-77.
25. Ngure, M., Issur, M., Shkriabai, N., Liu, H. W., Cosa, G., Kvaratskhelia, M., and Gotte, M. (2016) Interactions of the Disordered Domain II of Hepatitis C Virus NS5A with Cyclophilin A, NS5B, and Viral RNA Show Extensive Overlap. ACS Infect Dis 2, 839-51.
26. Masaki, T., Suzuki, R., Murakami, K., Aizaki, H., Ishii, K., Murayama, A., Date, T., Matsuura, Y., Miyamura, T., Wakita, T., and Suzuki, T. (2008) Interaction of hepatitis C virus nonstructural protein 5A with core protein is critical for the production of infectious virus particles. J Virol 82, 7964-76.
27. Romero-Brey, I., Berger, C., Kallis, S., Kolovou, A., Paul, D., Lohmann, V., and Bartenschlager, R. (2015) NS5A Domain 1 and Polyprotein Cleavage Kinetics Are Critical for Induction of Double-Membrane Vesicles Associated with Hepatitis C Virus Replication. MBio 6, e00759.
28. Gao, M., Nettles, R. E., Belema, M., Snyder, L. B., Nguyen, V. N., Fridell, R. A., Serrano-Wu, M. H., Langley, D. R., Sun, J. H., O'Boyle, D. R., 2nd, Lemm, J. A., Wang, C., Knipe, J. O., Chien, C., Colonno, R. J., Grasela, D. M., Meanwell, N. A., and Hamann, L. G. (2010) Chemical genetics strategy identifies an HCV NS5A inhibitor with a potent clinical effect. Nature 465, 96-100.
29. Berger, C., Romero-Brey, I., Radujkovic, D., Terreux, R., Zayas, M., Paul, D., Harak, C., Hoppe, S., Gao, M., Penin, F., Lohmann, V., and Bartenschlager, R. (2014) Daclatasvir-like inhibitors of NS5A block early biogenesis of hepatitis C virus-induced membranous replication factories, independent of RNA replication. Gastroenterology 147, 1094-105 e25.
30. Tellinghuisen, T. L., Marcotrigiano, J., and Rice, C. M. (2005) Structure of the zinc-binding domain of an essential component of the hepatitis C virus replicase. Nature 435, 374-9.
31. Love, R. A., Brodsky, O., Hickey, M. J., Wells, P. A., and Cronin, C. N. (2009) Crystal structure of a novel dimeric form of NS5A domain I protein from hepatitis C virus. J Virol 83, 4395-403.
32. Lambert, S. M., Langley, D. R., Garnett, J. A., Angell, R., Hedgethorne, K., Meanwell, N. A., and Matthews, S. J. (2014) The crystal structure of NS5A domain 1 from genotype 1a reveals new clues to the mechanism of action for dimeric HCV inhibitors. Protein Sci 23, 723-34.
33. Moradpour, D., Brass, V., and Penin, F. (2005) Function follows form: the structure of the N-terminal domain of HCV NS5A. Hepatology 42, 732-5.
34. Bartenschlager, R., Lohmann, V., and Penin, F. (2013) The molecular and structural basis of advanced antiviral therapy for hepatitis C virus infection. Nat Rev Microbiol 11, 482-96.
35. Hanoulle, X., Verdegem, D., Badillo, A., Wieruszeski, J. M., Penin, F., and Lippens, G. (2009) Domain 3 of non-structural protein 5A from hepatitis C virus is natively unfolded. Biochem Biophys Res Commun 381, 634-8.
36. Liang, Y., Ye, H., Kang, C. B., and Yoon, H. S. (2007) Domain 2 of nonstructural protein 5A (NS5A) of hepatitis C virus is natively unfolded. Biochemistry 46, 11550-8.
37. Feuerstein, S., Solyom, Z., Aladag, A., Favier, A., Schwarten, M., Hoffmann, S., Willbold, D., and Brutscher, B. (2012) Transient structure and SH3 interaction sites in an intrinsically disordered fragment of the hepatitis C virus protein NS5A. J Mol Biol 420, 310-23.
38. Hanoulle, X., Badillo, A., Verdegem, D., Penin, F., and Lippens, G. (2010) The domain 2 of the HCV NS5A protein is intrinsically unstructured. Protein Pept Lett 17, 1012-8.
39. Hanoulle, X., Badillo, A., Wieruszeski, J. M., Verdegem, D., Landrieu, I., Bartenschlager, R., Penin, F., and Lippens, G. (2009) Hepatitis C virus NS5A protein is a substrate for the peptidyl-prolyl cis/trans isomerase activity of cyclophilins A and B. J Biol Chem 284, 13589-601.
40. Verdegem, D., Badillo, A., Wieruszeski, J. M., Landrieu, I., Leroy, A., Bartenschlager, R., Penin, F., Lippens, G., and Hanoulle, X. (2011) Domain 3 of NS5A protein from the hepatitis C virus has intrinsic alpha-helical propensity and is a substrate of cyclophilin A. J Biol Chem 286, 20441-54.
41. Feuerstein, S., Solyom, Z., Aladag, A., Hoffmann, S., Willbold, D., and Brutscher, B. (2011) 1H, 13C, and 15N resonance assignment of a 179 residue fragment of hepatitis C virus non-structural protein 5A. Biomol NMR Assign 5, 241-3.
42. Solyom, Z., Ma, P., Schwarten, M., Bosco, M., Polidori, A., Durand, G., Willbold, D., and Brutscher, B. (2015) The Disordered Region of the HCV Protein NS5A: Conformational Dynamics, SH3 Binding, and Phosphorylation. Biophys J 109, 1483-96.
43. Chatterji, U., Bobardt, M., Selvarajah, S., Yang, F., Tang, H., Sakamoto, N., Vuagniaux, G., Parkinson, T., and Gallay, P. (2009) The isomerase active site of cyclophilin A is critical for hepatitis C virus replication. J Biol Chem 284, 16998-7005.
44. Yang, F., Robotham, J. M., Nelson, H. B., Irsigler, A., Kenworthy, R., and Tang, H. (2008) Cyclophilin A is an essential cofactor for hepatitis C virus infection and the principal mediator of cyclosporine resistance in vitro. J Virol 82, 5269-78.
45. Gallay, P. A. (2009) Cyclophilin inhibitors. Clin Liver Dis 13, 403-17.

46. Fernandes, F., Poole, D. S., Hoover, S., Middleton, R., Andrei, A. C., Gerstner, J., and Striker, R. (2007) Sensitivity of hepatitis C virus to cyclosporine A depends on nonstructural proteins NS5A and NS5B. Hepatology 46, 1026-33.
47. Coelmont, L., Hanoulle, X., Chatterji, U., Berger, C., Snoeck, J., Bobardt, M., Lim, P., Vliegen, I., Paeshuyse, J., Vuagniaux, G., Vandamme, A. M., Bartenschlager, R., Gallay, P., Lippens, G., and Neyts, J. (2010) DEB025 (Alisporivir) inhibits hepatitis C virus replication by preventing a cyclophilin A induced cis-trans isomerisation in domain II of NS5A. PLoS One 5, e13687.
48. Yang, F., Robotham, J. M., Grise, H., Frausto, S., Madan, V., Zayas, M., Bartenschlager, R., Robinson, M., Greenstein, A. E., Nag, A., Logan, T. M., Bienkiewicz, E., and Tang, H. (2010) A major determinant of cyclophilin dependence and cyclosporine susceptibility of hepatitis C virus identified by a genetic approach. PLoS Pathog 6, e1001118.
49. Garcia-Rivera, J. A., Bobardt, M., Chatterji, U., Hopkins, S., Gregory, M. A., Wilkinson, B., Lin, K., and Gallay, P. A. (2012) Multiple Mutations in HCV NS5A Domain II Are Required to Confer Significant Level of Resistance to Alisporivir. Antimicrob Agents Chemother 56, 5113-21.
50. Liu, Z., Yang, F., Robotham, J. M., and Tang, H. (2009) Critical role of cyclophilin A and its prolyl-peptidyl isomerase activity in the structure and function of the hepatitis C virus replication complex. J Virol 83, 6554-65.
51. Foster, T. L., Gallay, P., Stonehouse, N. J., and Harris, M. (2011) Cyclophilin A interacts with domain II of hepatitis C virus NS5A and stimulates RNA binding in an isomerase-dependent manner. J Virol 85, 7460-4.
52. Fernandes, F., Ansari, I. U., and Striker, R. (2010) Cyclosporine inhibits a direct interaction between cyclophilins and hepatitis C NS5A. PLoS One 5, e9815.
53. Chatterji, U., Lim, P., Bobardt, M. D., Wieland, S., Cordek, D. G., Vuagniaux, G., Chisari, F., Cameron, C. E., Targett-Adams, P., Parkinson, T., and Gallay, P. A. (2010) HCV resistance to cyclosporin A does not correlate with a resistance of the NS5A-cyclophilin A interaction to cyclophilin inhibitors. J Hepatol 53, 50-6.
54. Dujardin, M., Madan, V., Montserret, R., Ahuja, P., Huvent, I., Launay, H., Leroy, A., Bartenschlager, R., Penin, F., Lippens, G., and Hanoulle, X. (2015) A Proline-Tryptophan Turn in the Intrinsically Disordered Domain 2 of NS5A Protein Is Essential for Hepatitis C Virus RNA Replication. J Biol Chem 290, 19104-20.
55. Sreerama, N., and Woody, R. W. (2000) Estimation of protein secondary structure from circular dichroism spectra: comparison of CONTIN, SELCON, and CDSSTR methods with an expanded reference set. Anal Biochem 287, 252-60.
56. Han, B., Liu, Y., Ginzinger, S. W., and Wishart, D. S. (2011) SHIFTX2: significantly improved protein chemical shift prediction. J Biomol NMR 50, 43-57.
57. Verdegem, D., Dijkstra, K., Hanoulle, X., and Lippens, G. (2008) Graphical interpretation of Boolean operators for protein NMR assignments. J Biomol NMR 42, 11-21.
58. Nietlispach, D. (2005) Suppression of anti-TROSY lines in a sensitivity enhanced gradient selection TROSY scheme. J Biomol NMR 31, 161-6.
59. Theillet, F. X., Smet-Nocca, C., Liokatis, S., Thongwichian, R., Kosten, J., Yoon, M. K., Kriwacki, R. W., Landrieu, I., Lippens, G., and Selenko, P. (2012) Cell signaling, post-translational protein modifications and NMR spectroscopy. J Biomol NMR 54, 217-36.
60. Marsh, J. A., Singh, V. K., Jia, Z., and Forman-Kay, J. D. (2006) Sensitivity of secondary structure propensities to sequence differences between alpha- and gamma-synuclein: implications for fibrillation. Protein Sci 15, 2795-804.
61. Secci, E., Luchinat, E., and Banci, L. (2016) The Casein Kinase 2-Dependent Phosphorylation of NS5A Domain 3 from Hepatitis C Virus Followed by Time-Resolved NMR Spectroscopy. Chembiochem 17, 328-33.
62. Ross-Thriepland, D., and Harris, M. (2014) Insights into the complexity and functionality of hepatitis C virus NS5A phosphorylation. J Virol 88, 1421-32.
63. Chong, W. M., Hsu, S. C., Kao, W. T., Lo, C. W., Lee, K. Y., Shao, J. S., Chen, Y. H., Chang, J., Chen, S. S., and Yu, M. J. (2016) Phosphoproteomics Identified an NS5A Phosphorylation Site Involved in Hepatitis C Virus Replication. J Biol Chem 291, 3918-31.
64. Uversky, V. N. (2002) What does it mean to be natively unfolded? Eur J Biochem 269, 2-12.
65. Buck, M. (1998) Trifluoroethanol and colleagues: cosolvents come of age. Recent studies with peptides and proteins. Q Rev Biophys 31, 297-355.
66. Montserret, R., McLeish, M. J., Bockmann, A., Geourjon, C., and Penin, F. (2000) Involvement of electrostatic interactions in the mechanism of peptide folding induced by sodium dodecyl sulfate binding. Biochemistry 39, 8362-73.
67. Jasanoff, A., and Fersht, A. R. (1994) Quantitative determination of helical propensities from trifluoroethanol titration curves. Biochemistry 33, 2129-35.
68. Woody, R. C. (1985) Circular dichroism of peptides, Vol. 7, Academic Press, Inc., New York, N.Y.
69. Chen, Y. H., Yang, J. T., and Chau, K. H. (1974) Determination of the helix and beta form of proteins in aqueous solution by circular dichroism. Biochemistry 13, 3350-9.
70. Romero, P., Obradovic, Z., Li, X., Garner, E. C., Brown, C. J., and Dunker, A. K. (2001) Sequence complexity of disordered protein. Proteins 42, 38-48.
71. Shen, Y., Delaglio, F., Cornilescu, G., and Bax, A. (2009) TALOS+: a hybrid method for predicting protein backbone torsion angles from NMR chemical shifts. J Biomol NMR 44, 213-23.
72. Uversky, V. N. (2002) Natively unfolded proteins: a point where biology waits for physics. Protein Sci 11, 739-56.
73. Konarev, P. V., Volkov, V. V., Sokolova, A. V., Koch, M. H. J., and Svergun, D. I. (2003) PRIMUS: a Windows PC-based system for small-angle scattering data analysis. J Appl Crystallogr 36, 1277-82.
74. Svergun, D. I. (1992) Determination of the regularization parameter in indirect-transform methods using perceptual criteria. J Appl Crystallogr 25, 495-503.
75. Perez, J., Vachette, P., Russo, D., Desmadril, M., and Durand, D. (2001) Heat-induced unfolding of neocarzinostatin, a small all-beta protein investigated by small-angle X-ray scattering. J Mol Biol 308, 721-43.
76. Kohn, J. E., Millett, I. S., Jacob, J., Zagrovic, B., Dillon, T. M., Cingel, N., Dothager, R. S., Seifert, S., Thiyagarajan, P., Sosnick, T. R., Hasan, M. Z., Pande, V. S., Ruczinski, I., Doniach, S., and Plaxco, K. W. (2004) Random-coil behavior and the dimensions of chemically unfolded proteins. Proc Natl Acad Sci U S A 101, 12491-6.
77. Receveur-Brechot, V., and Durand, D. (2012) How random are intrinsically disordered proteins? A small angle scattering perspective. Curr Protein Pept Sci 13, 55-75.
78. Schwieters, C. D., Kuszewski, J. J., Tjandra, N., and Clore, G. M. (2003) The Xplor-NIH NMR molecular structure determination package. J Magn Reson 160, 65-73.
79. Receveur-Brechot, V., Bourhis, J. M., Uversky, V. N., Canard, B., and Longhi, S. (2006) Assessing protein disorder and induced folding. Proteins 62, 24-45.
80. Wear, M. A., Patterson, A., Malone, K., Dunsmore, C., Turner, N. J., and Walkinshaw, M. D. (2005) A surface plasmon resonance-based assay for small molecule inhibitors of human cyclophilin A. Anal Biochem 345, 214-26.
81. Grise, H., Frausto, S., Logan, T., and Tang, H. (2012) A conserved tandem cyclophilin-binding site in hepatitis C virus nonstructural protein 5A regulates Alisporivir susceptibility. J Virol 86, 4811-22.
82. Huang, Y., Staschke, K., De Francesco, R., and Tan, S. L. (2007) Phosphorylation of hepatitis C virus NS5A nonstructural protein: a new paradigm for phosphorylation-dependent viral RNA replication? Virology 364, 1-9.
83. Koch, J. O., and Bartenschlager, R. (1999) Modulation of hepatitis C virus NS5A hyperphosphorylation by nonstructural proteins NS3, NS4A, and NS4B. J Virol 73, 7138-46.
84. Neddermann, P., Clementi, A., and De Francesco, R. (1999) Hyperphosphorylation of the hepatitis C virus NS5A protein requires an active NS3 protease, NS4A, NS4B, and NS5A encoded on the same polyprotein. J Virol 73, 9984-91.
85. Reiss, S., Harak, C., Romero-Brey, I., Radujkovic, D., Klein, R., Ruggieri, A., Rebhan, I., Bartenschlager, R., and Lohmann, V. (2013) The lipid kinase phosphatidylinositol-4 kinase III alpha regulates the phosphorylation status of hepatitis C virus NS5A. PLoS Pathog 9, e1003359.
86. Appel, N., Herian, U., and Bartenschlager, R. (2005) Efficient rescue of hepatitis C virus RNA replication by trans-complementation with nonstructural protein 5A. J Virol 79, 896-909.
87. Quintavalle, M., Sambucini, S., Di Pietro, C., De Francesco, R., and Neddermann, P. (2006) The alpha isoform of protein kinase CKI is responsible for hepatitis C virus NS5A hyperphosphorylation. J Virol 80, 11305-12.
88. Kim, J., Lee, D., and Choe, J. (1999) Hepatitis C virus NS5A protein is phosphorylated by casein kinase II. Biochem Biophys Res Commun 257, 777-81.
89. Dal Pero, F., Di Maira, G., Marin, O., Bortoletto, G., Pinna, L. A., Alberti, A., Ruzzene, M., and Gerotto, M. (2007) Heterogeneity of CK2 phosphorylation sites in the NS5A protein of different hepatitis C virus genotypes. J Hepatol 47, 768-76.
90. Chen, Y. C., Su, W. C., Huang, J. Y., Chao, T. C., Jeng, K. S., Machida, K., and Lai, M. M. (2010) Polo-like kinase 1 is involved in hepatitis C virus replication by hyperphosphorylating NS5A. J Virol 84, 7983-93.
91. Cordek, D. G., Croom-Perez, T. J., Hwang, J., Hargittai, M. R., Subba-Reddy, C. V., Han, Q., Lodeiro, M. F., Ning, G., McCrory, T. S., Arnold, J. J., Koc, H., Lindenbach, B. D., Showalter, S. A., and Cameron, C. E. (2014) Expanding the proteome of an RNA virus by phosphorylation of an intrinsically disordered viral protein. J Biol Chem 289, 24397-416.
92. Masaki, T., Matsunaga, S., Takahashi, H., Nakashima, K., Kimura, Y., Ito, M., Matsuda, M., Murayama, A., Kato, T., Hirano, H., Endo, Y., Lemon, S. M., Wakita, T., Sawasaki, T., and Suzuki, T. (2014) Involvement of hepatitis C virus NS5A hyperphosphorylation mediated by casein kinase I-alpha in infectious virus production. J Virol 88, 7541-55.
93. Lee, K. Y., Chen, Y. H., Hsu, S. C., and Yu, M. J. (2016) Phosphorylation of Serine 235 of the Hepatitis C Virus Non-Structural Protein NS5A by Multiple Kinases. PLoS One 11, e0166763.
94. Lemay, K. L., Treadaway, J., Angulo, I., and Tellinghuisen, T. L. (2013) A hepatitis C virus NS5A phosphorylation site that regulates RNA replication. J Virol 87, 1255-60.
95. Bhattacharya, D., Ansari, I. H., Mehle, A., and Striker, R. (2012) Fluorescence Resonance Energy Transfer-Based Intracellular Assay for the Conformation of Hepatitis C Virus Drug Target NS5A. J Virol 86, 8277-86.
96. Goto, K., Watashi, K., Inoue, D., Hijikata, M., and Shimotohno, K. (2009) Identification of cellular and viral factors related to anti-hepatitis C virus activity of cyclophilin inhibitor. Cancer Sci 100, 1943-50.

97. Puyang, X., Poulin, D. L., Mathy, J. E., Anderson, L. J., Ma, S., Fang, Z., Zhu, S., Lin, K., Fujimoto, R., Compton, T., and Wiedmann, B. (2010) Mechanism of resistance of hepatitis C virus replicons to structurally distinct cyclophilin inhibitors. Antimicrob Agents Chemother 54, 1981-7.
98. Tellinghuisen, T. L., Foss, K. L., Treadaway, J. C., and Rice, C. M. (2008) Identification of residues required for RNA replication in domains II and III of the hepatitis C virus NS5A protein. J Virol 82, 1073-83.
99. Hopkins, S., Bobardt, M., Chatterji, U., Garcia-Rivera, J. A., Lim, P., and Gallay, P. A. (2012) The cyclophilin inhibitor SCY-635 disrupts hepatitis C virus NS5A-cyclophilin A complexes. Antimicrob Agents Chemother 56, 3888-97.
100. Huang, L., Hwang, J., Sharma, S. D., Hargittai, M. R., Chen, Y., Arnold, J. J., Raney, K. D., and Cameron, C. E. (2005) Hepatitis C virus nonstructural protein 5A (NS5A) is an RNA-binding protein. J Biol Chem 280, 36417-28.
101. Tellinghuisen, T. L., and Rice, C. M. (2002) Interaction between hepatitis C virus proteins and host cell factors. Curr Opin Microbiol 5, 419-27.
102. Polyak, S. J. (2003) Hepatitis C virus--cell interactions and their role in pathogenesis. Clin Liver Dis 7, 67-88.
103. Pawlotsky, J. M., and Germanidis, G. (1999) The non-structural 5A protein of hepatitis C virus. J Viral Hepat 6, 343-56.
104. Humphrey, W., Dalke, A., and Schulten, K. (1996) VMD: visual molecular dynamics. J Mol Graph 14, 33-8, 27-8.
105. Guex, N., and Peitsch, M. C. (1997) SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis 18, 2714-23.

FOR TABLE OF CONTENTS USE ONLY

Overall Structural Model of NS5A Protein from Hepatitis C Virus and Modulation by Mutations Confering Resistance of Virus Replication to Cyclosporin A.

Aurelie Badillo, Véronique Receveur-Brechot, Stéphane Sarrazin, François-Xavier Cantrelle, Frédéric Delolme, Marie-Laure Fogeron, Jennifer Molle, Roland Montserret, Anja Bockmann, Ralf Bartenschlager, Volker Lohmann, Guy Lippens, Sylvie Ricard-Blum, Xavier Hanoulle* and François Penin*

[Table of contents graphic. Labels: AH, D1, LCS1, D2, LCS2, D3, RNA binding groove, CypA, CypA-CsA, Cytosol, ER membrane, ER lumen.]

Figure 1

[Sequence alignment of NS5A from Con1-1b and JFH1-2a with secondary structure (Sec. struct.) and Talos predictions (Talos Pred.). Residue numbers are NS5A positions; NS5A residue 1 corresponds to polyprotein position 1973 in Con1 and 1977 in JFH1.]

Amphipathic helix and Domain 1 (residues 1-83)
Sec. struct.  cccsHHHHHHHHHHHHHHHHHHHHHcccscc cccccccbcscEEscEEEEEEcttscEEEEEEEttEEEEEccttsHHH
Con1-1b       SGSWLRDVWDWICTVLTDFKTWLQSKLLPRLPGVPFFSCQRGYKGVWRGDGIMQTTCPCGAQITGHVKNGSMRIVGPRTCSNT
JFH1-2a       SGSWLRDVWDWVCTILTDFKNWLTSKLFPKLPGLPFISCQKGYKGVWAGTGIMTTRCPCGANISGNVRLGSMRITGPKTCMNT

Domain 1 (residues 84-166)
Sec. struct.  HHccbcccttcbcscEEcccsccsEEEEEcssscEEEEEEEttEEEEEEEssttcbccsscccHHHssEEttEEccscccccc
Con1-1b       WHGTFPINAYTTGPCTPSPAPNYSRALWRVAAEEYVEVTRVGDFHYVTGMTTDNVKCPCQVPAPEFFTEVDGVRLHRYAPACK
JFH1-2a       WQGTFPINCYTEGQCAPKPPTNYKTAIWRVAASEYAEVTQHGSYSYVTGLTTDNLKIPCQLPSPEFFSWVDGVQIHRFAPTPK

Domain 1 and LCS1 (residues 167-250)
Sec. struct.  ccbcsscEEEEttEEEEttcbcttsccccscc
Talos Pred.   ccccccccccccHHHHHHHHcccccccccccccccccccccccccccccccc
Con1-1b       PLLREEVTFLVGLNQYLVGSQLPCEPEPDVAVLTSMLTDPSHITAETAKRRLARGSPPSLASSSASQLSAPSLKATCTTRHDSP
JFH1-2a       PFFRDEVSFCVGLNSYAVGSQLPCEPEPDADVLRSMLTDPPHITAETAARRLARGSPPSEASSSVSQLSAPSLRATCTTHSNTY
Talos Pred.   ccc

Domain 2 (Con1 residues 251-342; JFH1 residues 251-338)
Talos Pred.(#) cHHHHHHHHHHHHHccccccHHHHHHHHccccHcccccccHHHHHHccHHHHHHHHHccccsscssscttccccHHHHHcccHccccHcccc
Con1-1b        DADLIEANLLWRQEMGGNITRVESENKVVILDSFEPLQAEEDEREVSVPAEILRRSRKFPRAMPIWARPDYNPPLLESWKDPDYVPPVVHGC
JFH1-2a        DVDMVDANLL----MEGGVAQTEPESRVPVLDFLEPMAEEESDLEPSIPSECMLPRSGFPRALPAWARPDYNPPLVESWRRPDYQPPTVAGC
Talos Pred.    cccccccccc----cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc

LCS2 and Domain 3 (Con1 residues 343-420; JFH1 residues 339-420)
Talos Pred.   ccccccccccccccccccEEEccccccHHHHHHcccccc---cccccccccccccccccccccccc-ccccccccccccccc
Con1-1b       PLPPAKAPPIPPPRRKRTVVLSESTVSSALAELATKTFG---SSESSAVDSGTATASPDQPSDDGD-AGSDVESYSSMPPLE
JFH1-2a       ALPPPKKAPTPPPRRRRTVGLSESTISEALQQLAIKTFGQPPSSGDAGSSTGAGAAESGGPTSPGEPAPSETGSASSMPPLE
Talos Pred.   ccc cccccccccccHHHHHHccccccccccccccccccccccccccccccccccccccccccccccccc

Domain 3 (Con1 residues 421-447; JFH1 residues 421-466)
Talos Pred.   cccccccccc--------------------ccccccccccccccccc
Con1-1b       GEPGDPDLSD--------------------GSWSTVSEEASEDVVCC
JFH1-2a       GEPGDPDLESDQVELQPPPQGGGVAPGSGSGSWSTCSEED-DTTVCC
Talos Pred.   cccccccccccccEcccccccccccccccccccccccccc-cccccc

Figure 2

[Plot of ellipticity (deg·cm²·dmol⁻¹) versus wavelength (nm, 190 to 250) for D1 and D1D2D3.]

Figure 3

[Panels A to D. NMR spectrum regions labeled Gly, Ser/Thr, and pSer/pThr, with peak labels S396, S396*, S401, pS401*, pS408, pS408*, S412*, pS412, S414, S414*, S415*, pS415, S429*, pS429, S432*, pS432, S434*, pS434, pS434*, T435*, pT435, pT435*, S437*, pS437, pS437*, and Q454. Plots of SSP score and NMR intensity (×10⁶ a.u.) versus NS5A sequence (residues 240 to 440), spanning LCS1, Domain 2, LCS2, and Domain 3.]

Figure 4

[Panels A to F. Ellipticity (deg·cm²·dmol⁻¹) versus wavelength (nm, 200 to 260) for D2 JFH1, D2 Con1, D3 JFH1, and D3 Con1, and ellipticity at 222 nm and at 205 nm versus temperature (°C, 30 to 90) for the same constructs.]

Figure 5

[Panels A to C. Ellipticity (deg·cm²·dmol⁻¹) versus wavelength (nm, 200 to 240) for D2 JFH1 and D2 Con1 in H2O and in 50% TFE, with the corresponding difference spectra; ellipticity at 222 nm (deg·cm²·dmol⁻¹) versus trifluoroethanol (%, 0 to 80) for D2 JFH1 and D2 Con1.]

Figure 6

[Panels A to H. Ellipticity (deg·cm²·dmol⁻¹) versus wavelength (nm, 200 to 240) and ellipticity at 222 nm (deg·cm²·dmol⁻¹) versus pH (2 to 7) for D2 JFH1, D2 Con1, D3 JFH1, and D3 Con1.]

Figure 7

[Disorder score and SSP score plotted along the sequence for D2-Con1 (residue ticks 250 to 340) and D2-JFH1 (residue ticks 250 to 336), with the sequences and Talos predictions below.]

D2-Con1
Sequence     DSPDADLIEANLLWRQEMGGNITRVESENKVVILDSFEPLQAEEDEREVSVPAEILRRSRKFPRAMPIWARPDYNPPLLESWKDPDYVPPVVHGC
Talos pred.  ccccHHHHHHHHHHHHHccccccHHHHHHHHccccHcccccccHHHHHHccHHHHHHHHHccccccccccccccccccHHHHcccHccccccccH

D2-JFH1
Sequence     NTYDVDMVDANLL----MEGGVAQTEPESRVPVLDFLEPMAEEESDLEPSIPSECMLPRSGFPRALPAWARPDYNPPLVESWRRPDYQPPTVAGCALP
Talos pred.  ccccccccccccc----cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc

Figure 8

Panel A: I/I(0)·(qRg)² versus q·Rg for D2 JFH1, D3 JFH1, and D2D3 JFH1. Panel B: P(R) versus R (Å), 0 to 200, for the same three constructs.

Figure 9

Panels A to E (A, D2D3 JFH1; B, D2 JFH1; C, D2 Con1; D, D3 JFH1; E, D3 Con1): Frequency (0 to 0.12) versus Rg (Å), 0 to 160.

Figure 10

Curves in all panels: wt, R262Q, R318W, D320E, R318W/D320E. Panel A: Ellipticity (deg·cm²·dmol⁻¹) versus Wavelength (nm), 200 to 240. Panel B: Ellipticity at 222 nm (deg·cm²·dmol⁻¹) versus Temperature (°C), 20 to 90. Panel C: I/I(0)·(qRg)² versus q·Rg. Panel D: P(R) versus R (Å), 0 to 140.

Figure 11

Panels A and B: Response Units (RU) versus Time (s), 0 to 800.

Figure 12

Schematic model. Labels: D1, D2, D3 (positions 318, 320, 321), AH, LCS1, LCS2, RNA binding groove, CypA, CypA-CsA, Cytosol, ER lumen.

FOR TABLE OF CONTENTS USE ONLY

Overall Structural Model of NS5A Protein from Hepatitis C Virus and Modulation by Mutations Confering Resistance of Virus Replication to Cyclosporin A.

Aurelie Badillo, Véronique Receveur-Brechot, Stéphane Sarrazin, François-Xavier Cantrelle, Frédéric Delolme, Marie-Laure Fogeron, Jennifer Molle, Roland Montserret, Anja Bockmann, Ralf Bartenschlager, Volker Lohmann, Guy Lippens, Sylvie Ricard-Blum, Xavier Hanoulle* and François Penin*

Table of contents graphic. Labels: D1, D2, D3, AH, LCS1, LCS2, RNA binding groove, CypA, CypA-CsA, Cytosol, ER membrane, ER lumen.