Allosteric Signal Transduction in HIV-1 Restriction Factor SAMHD1

Allosteric Signal Transduction in HIV-1 Restriction Factor SAMHD1 proceeds via Reciprocal Handshake across Monomers. Kajwal Kumar Patra, Akash Bhattacharya, and Swati Bhattacharya J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.7b00279 • Publication Date (Web): 28 Sep 2017 Downloaded from http://pubs.acs.org on October 5, 2017

Journal of Chemical Information and Modeling is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Journal of Chemical Information and Modeling

Allosteric Signal Transduction in HIV-1 Restriction Factor SAMHD1 proceeds via Reciprocal Handshake across Monomers.

Authors: Kajwal Kumar Patra1, Akash Bhattacharya2 and Swati Bhattacharya1,3*

1 Department of Physics, Indian Institute of Technology Guwahati, Guwahati, Assam, India, 781039

2 Department of Biochemistry and Structural Biology, University of Texas Health Science Center at San Antonio, San Antonio, TX 78229-3900, U.S.A.

3 Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai, India, 400076

Correspondence*: [email protected]
Institution: Department of Chemical Engineering, Indian Institute of Technology Bombay, Mumbai, India, 400076
Phone: +918723840619
Keywords: SAMHD1; HIV-1 restriction factor; dNTP hydrolase; molecular dynamics simulations; allostery

ACS Paragon Plus Environment

Abstract

The sterile alpha motif and histidine-aspartate domain-containing protein 1 (SAMHD1), a human dNTP triphosphohydrolase, contributes to HIV-1 restriction in select terminally differentiated cells of the immune system. The catalytically active form of the protein is an allosterically triggered tetramer, whose HIV-1 restriction properties are attributed to its dNTP triphosphohydrolase activity. The tetramer itself is assembled by a GTP/dNTP combination. The enzyme acts by deoxynucleotide starvation, which is thought to prevent effective reverse transcription of the retroviral genome and hence to restrict HIV-1 propagation. HIV-2 and SIV have evolved defences against SAMHD1, underscoring its role in restriction. Previous studies have provided high-resolution structures of GTP/dNTP-bound enzyme complexes, but have not been able to provide information on dynamics. In this study, we have used correlation network analysis along with MD techniques to study the flow of allosteric information across the active complex. We have found evidence of a reciprocal allosteric “handshake” occurring across monomeric units. We have also uncovered a short linker region as the nexus for funnelling the regulatory signal from phosphorylation at T592 from the surface to the interior core of the protein.

Introduction

The sterile alpha motif and histidine-aspartate domain-containing protein 1 (SAMHD1) is a human triphosphohydrolase implicated in HIV-1 restriction in certain immune system cells [1–5]. Specifically, a GTP/dNTP combination induces the formation of catalytically active tetramers of SAMHD1, which then cleave the triphosphate group of dNTPs [6]. This lowers dNTP levels to as little as a few tens of nM, obstructing efficient reverse transcription and therefore hindering propagation of HIV-1 [7–9]. The lowered susceptibility to HIV-1 infection of certain myeloid cells and of resting CD4+ T cells is attributed to the enzymatic activity of SAMHD1 [10–14]. The HD domain of SAMHD1 (residues 115-626) has been shown to be necessary and sufficient for both tetramerization and triphosphohydrolase activity [15–20]. Studies have also suggested that SAMHD1 possesses an exonuclease activity, which may also be involved in HIV-1 restriction. However, the exonuclease activity has been disputed and is not considered a mechanism for antiviral resistance in myeloid cells [21–24]. Nevertheless, the importance of SAMHD1 in retroviral restriction is underscored by the fact that HIV-2/SIVmac/SIVsmm have evolved a defense against it, employing the protein Vpx to target SAMHD1 for proteasomal degradation [3,4,25]. Studying the enzymatic mechanism of SAMHD1 action is therefore of great importance, both for understanding the biophysical machinery and for understanding HIV/host interaction. The central enigma of retroviral restriction by SAMHD1 is how an enzyme so ostensibly inefficient manages to lower dNTP levels to far below its Km [26,27].

Structural studies using X-ray crystallography have been successful in explaining the dNTPase activity of SAMHD1. The current state of the art involves well-characterized assembled SAMHD1 tetramers bound to different combinations of assembly and substrate nucleotides [2,15,17,18,28]. Monomeric SAMHD1 has not been crystallized. The SAMHD1 fragment that is necessary and sufficient for catalysis, restriction and tetramerization is the HD domain (residues 115-626). Tetramerization is induced by a GTP/dNTP combination, with each tetramer possessing four GTP-Mg2+-dNTP cross bridges at the interface of three monomers. Crystal structures reveal that the HD domain consists of a major lobe (residues 115-373), a minor lobe (residues 376-450) and a C-terminal domain (CTD) (residues 455-599), which features a prominent antiparallel beta-sheet. The major lobe contains the catalytic site. The CTD contains Thr592, phosphorylation of which by cdk-1 has been suggested as an on/off switch for the enzyme. However, the effect of phosphorylation is not well understood: it does not affect the kcat and Km of the enzyme. It was suggested that phosphorylation leads to structural collapse, but further studies demonstrated that it merely leads to faster tetramer disassembly upon nucleotide turnover [27,29,30]. Thus, functional assays of SAMHD1 have yielded an incomplete picture, with the role of phosphorylation being especially opaque. What is missing, crucially, is a picture of SAMHD1 dynamics. Studies utilizing fluorescence spectroscopy and NMR have shed light on the enzymatic mechanism [27,29,30], but a molecular view of SAMHD1 dynamics remains elusive. While NMR-based approaches to structure-dynamics studies of high molecular weight systems do exist, they are very expensive [31]. To this end, we have used MD methods in this study to demonstrate the allosteric links in SAMHD1 and to uncover the flow of allosteric information across the assembled SAMHD1 tetramer.

The SAMHD1 tetramer crystallizes as a dimer of dimers. The dimer interface (with respect to PDB entry 4TNR) is defined between chains A and D and between chains B and C, and is similar to the crystal structure first solved by Goldstone and co-workers [2]. The tetramer interface is defined as that between chains A and C and between chains B and D. Tetramerization was seen to cause ordering of some 40 C-domain residues which had not displayed viable electron density in the original Goldstone dimeric structure [2,15,17]. In our previous study [32], we explored the consequences of nucleotide depletion for the structural rigidity and dynamics of the SAMHD1 tetramer. In this study, going beyond the phenomenological exploration of nucleotide depletion, we have uncovered the actual pathways of allosteric signal transduction in the assembled SAMHD1 tetramer. Analysis of cooperative effects in proteins using molecular dynamics simulations is well established. The technique relies upon cross-correlating atomic fluctuations and can be used to establish network models that reflect the flow of allosteric information across a protein system [33–37]. We have also used Principal Component Analysis to explore the dynamics in more detail. In this study, we have applied these ideas and developed a model of interactions involving communities of connected residues stretching across monomeric units of the assembled tetramer. We have found flow paths from the catalytic core to the surface of the protein. Our analysis has uncovered an intriguing reciprocal allosteric “handshake” which transmits information across the tetramer interface (of chains B and D and of chains A and C). This “handshake” takes the form of a co-moving community of residues which consists of almost the entire C-terminal domain (CTD) of a given monomer, except for a short loop region, and extends to the equivalent short loop region of the adjacent monomer. Further, we have identified that allosteric signals arising from residue T592, the target of cdk-1 phosphorylation, funnel from the protein surface in toward the core via a critical linker between the minor lobe and the CTD. We have also extended the correlation network analysis to the phosphomimetic variant, T592E. Thus, our current work contributes to the mechanistic understanding of SAMHD1 operating as a molecular machine.

These findings shed light not just on how this allosteric protein “breathes”, but also on information conduits from regulatory sites to the catalytic core of the protein, thus allowing us to build a more complete picture of the functioning of this highly complex molecular machine.

Methods

System Setup for MD simulations

The starting conformations of the explicit solvent simulations were based on the high-resolution crystallographic structure (PDB code 4TNR [28]) of the tetrameric SAMHD1 complex, with allosteric site 1 occupied by GTP and allosteric site 2 occupied by dATP. Three of the four catalytic sites in the crystal structure are occupied by dATP, while the fourth (in subunit A) is vacant. The initial structure was generated from the original crystal structure. The crystallographic waters were retained along with the Mg2+ ions coordinated at the allosteric sites. The unresolved portions of the loop (278-283) were inserted in the protein structures, whereas the missing N-terminal and C-terminal residues were ignored. The four R206 and N207 residues were mutated back to histidine and aspartate in accord with the sequence of wt SAMHD1 (Uniprot Q9Y3Z3-1). Disulfide bonds were introduced between residues 341 and 350. The assembled system was immersed in ~59000 pre-equilibrated TIP3P water molecules. Na+ and Cl- ions were added at random positions to bring the net charge of the system to zero. The system
consisted of ~210,000 atoms and measured 13×12×14 nm. The T592E mutant was created by replacing the threonine residue at position 592 with glutamic acid on all four monomers.

General MD methods

All simulations were performed with the NAMD 2.9 [38,39] package with the CHARMM31 [40,41] force fields. Analysis was performed with the Bio3D [42] package and VMD. All MD simulations were carried out using periodic boundary conditions and particle-mesh Ewald electrostatics [43] for long-range electrostatic calculations. The SETTLE [44] and RATTLE [45] algorithms were employed to constrain the covalent bonds involving hydrogen atoms. Operational parameters included a 2 fs timestep, while the cutoff and switching distances were set to 12 and 10 Å respectively. The systems were minimized for 3000 steps using the conjugate gradient method and then equilibrated in the NPT ensemble at 295 K for at least 5 ns, using Nosé-Hoover Langevin piston pressure control. Following equilibration, three independent sets of MD calculations, with a trajectory length of 100 ns each, were performed in the NVT ensemble for the wt system, with the temperature maintained at 295 K using the Langevin thermostat. The data were recorded at 10 ps intervals.

Trajectory Analysis and Correlation Network Construction

The correlation network construction and analysis were performed with the Bio3D package. To identify and characterize the coupled dynamics between the different parts of the SAMHD1 machinery, the residue-wise linear mutual information (LMI) was first calculated as

$$I_{ij} = \frac{1}{2}\left[\ln\det C_i + \ln\det C_j - \ln\det C_{ij}\right]$$

where $C_i$ is the covariance matrix for the displacement of the Cα atom of the ith residue and $C_{ij}$ is the pair covariance matrix for residues i and j. A consensus matrix of LMI values was calculated as an ensemble average over multiple 50 ns windows along
each of the three independent 100 ns trajectories for the wt system. Subsequently, the consensus matrix was pruned using a cutoff of 0.5.

Network Community Analysis

A weighted network graph illustrating the dynamic correlation of the protein was generated from the pruned consensus matrix. The network nodes represent the Cα atoms, connected through edges weighted by the negative logarithm of the LMI values. The network was further clustered into highly intra-correlated regions (communities) that are loosely coupled to the rest of the system [42,46,47]. Mirroring the tetrameric symmetry of the protein, the communities themselves were grouped into families, such that a family included similar communities from the different subunits of the protein.

Network Path Analysis

To examine the origin of distal site communications between specific residues, we performed an analysis of the optimal (shortest) and suboptimal (close to but longer than optimal) paths through possible edges [42,48]. An ensemble of five hundred paths, representing the distribution of possible modes of communication between the nodes, was collected for each pair of source/sink nodes. Additionally, the normalized node degeneracy, indicating the fraction of the total paths crossing each node, was calculated. Residues with high degeneracy were identified as important conduits in the communication network.

Principal Components Analysis (PCA)

We carried out PCA based on the calculation and diagonalization of the covariance matrix with elements $\sigma_{ij} = \langle (x_i - \langle x_i \rangle) \cdot (x_j - \langle x_j \rangle) \rangle$, where $x_i$ ($x_j$) refer to the Cartesian coordinates of the Cα atoms of the ith (jth) residue and $\langle x_i \rangle$ ($\langle x_j \rangle$) are their respective means calculated over the
configurations sampled in the trajectory. The eigenvectors of the matrix, referred to as principal components, define the principal modes onto which the trajectory is projected.

Fluorescence Polarization Methods

We investigated the assembly of the SAMHD1 tetramer by using the binding of fluorescent GTP to SAMHD1 as a reporter [29,30]. The polarization of a fluorophore is correlated with its rate of tumbling in solution: a small molecule tumbles rapidly and displays a low polarization, while a large molecule tumbles slowly and displays a high polarization. The binding of a small fluorophore to a large molecule thus results in a measurable increase in polarization. The reporter molecule chosen was gamma-(6-Aminohexyl)-GTP-ATTO-495 (Jena Biosciences Cat # NU-834-495), hereafter referred to as fluorescent GTP, or F-GTP. We tested the binding of F-GTP to different constructs of SAMHD1 as follows: 6 µM of protein was incubated with 20 nM F-GTP in a buffer of 50 mM TRIS, 50 mM NaCl, 5 mM MgCl2 and 2 mM DTT reducing agent at pH 8 for 20 minutes. Triplicate samples were set up in a 384-well plate (30 µl/sample) to be read on a Biotek Synergy 2 multimode plate reader. After incubation, 30 µl of buffer containing 20 nM F-GTP was injected into the control cells and 30 µl of buffer containing 20 nM F-GTP and 1000 µM dATP was injected into the sample cells. Fluorescence polarization readings were acquired at 15 second intervals for 7200 seconds.

Protein preparation and purification

Recombinant protein was expressed in BL21(DE3) cells and purified as described in Bhattacharya [29] and Wang [30].
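
As an illustration of the PCA procedure described in the Methods, the following Python sketch builds a covariance matrix from a trajectory, extracts its leading eigenvector and projects the frames onto it. It is a minimal stand-alone sketch, not the Bio3D routine used for the analysis: the toy trajectory, the function names and the use of power iteration in place of a full diagonalization are assumptions made purely for illustration, and each Cartesian coordinate is treated as a separate variable.

```python
import math

def covariance(frames):
    """Mean vector and covariance matrix of a trajectory (list of coordinate vectors)."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[k] for f in frames) / n for k in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in frames:
        dev = [f[k] - mean[k] for k in range(d)]
        for a in range(d):
            for b in range(d):
                cov[a][b] += dev[a] * dev[b] / n
    return mean, cov

def leading_mode(cov, iterations=500):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    d = len(cov)
    v = [1.0 / (k + 1) for k in range(d)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the eigenvalue for the converged vector.
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
    return eigenvalue, v

def project(frames, mean, mode):
    """Projection of every frame onto one mode."""
    return [sum((f[k] - mean[k]) * mode[k] for k in range(len(mode))) for f in frames]

def toy_trajectory(n_frames=1000):
    """Hypothetical two-coordinate trajectory dominated by motion along (1, 1)."""
    frames = []
    for t in range(n_frames):
        a = math.sin(0.1 * t)          # large collective motion
        b = 0.05 * math.cos(0.37 * t)  # small motion orthogonal to it
        frames.append([a + b, a - b])
    return frames

if __name__ == "__main__":
    frames = toy_trajectory()
    mean, cov = covariance(frames)
    eigenvalue, mode = leading_mode(cov)
    scores = project(frames, mean, mode)
    print("leading eigenvalue:", eigenvalue)
    print("leading mode:", mode)
```

For a real trajectory the coordinate vectors would hold the 3N Cartesian coordinates of the Cα atoms after superposition, and a full eigendecomposition would normally replace the power iteration so that several modes can be examined.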

Results

MD simulations in explicit solvent were carried out to characterize the internal dynamics of the SAMHD1 tetramer. Three independent 100 ns simulations of the complex were used in a conventional geometric analysis as well as in correlation network analysis to uncover the workings of the machinery. Previously [32], we computed the dynamic cross-correlation between residues based on Pearson’s coefficient, which revealed a complex matrix of allosteric cross talk, too complicated to decipher by mere visual inspection. Hence, in this study, we carried out a detailed analysis of the dynamical coordination between the different parts of the protein, based on network analysis of the correlation matrix. Additionally, we used linear mutual information as the basis for the correlation analysis in the present study, to avoid the undesired dependence on the direction of fluctuations that is present in Pearson’s coefficient. The present investigation is focused on two main objectives that were not studied previously: to explore the communication channels between adjacent chains, and to identify how allosteric signals from the surface site T592, known to be the target of phosphorylation by cdk-1, may propagate to the catalytic core. In addition, we have characterized the dynamical coordination of the two allosteric sites with the catalytic site. Through this study, we have identified “hot-spots”, or residues that play a critical role either in the structural integrity of the tetrameric complex or in allosteric signal transmission. A preliminary analysis of the stability of the tetramer, measured in terms of the backbone RMSD (root mean square deviation) with respect to the crystal structure (4TNR), is presented in the supporting information (Figure S1). The system was stable, with RMSD variations between 1.5 and 2.2 Å.
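
The difference between Pearson-based cross-correlation and linear mutual information can be made concrete with a small Python sketch. It evaluates the LMI from the determinants of the single-atom and pair covariance matrices and compares it with a dot-product-based Pearson coefficient for two hypothetical atoms whose motions are strongly coupled but mutually perpendicular. The synthetic data, the function names and the normalization r = sqrt(1 - exp(-2I/3)), which maps the LMI onto the interval from 0 to 1, are assumptions of this sketch; it is not the Bio3D implementation used in the study.

```python
import math
import random

def covariance(samples):
    """Covariance matrix of a list of equal-length sample vectors."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for s in samples:
        dev = [s[k] - mean[k] for k in range(d)]
        for a in range(d):
            for b in range(d):
                cov[a][b] += dev[a] * dev[b] / n
    return cov

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n, result = len(m), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if m[p][c] == 0.0:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            result = -result  # a row swap flips the sign
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return result

def linear_mutual_information(xi, xj):
    """Return (I_ij, r_ij) for two atoms given as lists of 3D coordinates."""
    joint = [list(a) + list(b) for a, b in zip(xi, xj)]  # 6-component samples
    info = 0.5 * (math.log(det(covariance(xi)))
                  + math.log(det(covariance(xj)))
                  - math.log(det(covariance(joint))))
    r = math.sqrt(max(0.0, 1.0 - math.exp(-2.0 * info / 3.0)))
    return info, r

def pearson_dcc(xi, xj):
    """Pearson-type cross-correlation built from the dot product of displacements."""
    n = len(xi)
    mi = [sum(x[k] for x in xi) / n for k in range(3)]
    mj = [sum(x[k] for x in xj) / n for k in range(3)]
    dot = sii = sjj = 0.0
    for a, b in zip(xi, xj):
        da = [a[k] - mi[k] for k in range(3)]
        db = [b[k] - mj[k] for k in range(3)]
        dot += sum(p * q for p, q in zip(da, db))
        sii += sum(p * p for p in da)
        sjj += sum(q * q for q in db)
    return dot / math.sqrt(sii * sjj)

def perpendicular_pair(n_frames=2000, seed=0):
    """Hypothetical atoms sharing one amplitude but moving along x and y respectively."""
    rng = random.Random(seed)
    xi, xj = [], []
    for _ in range(n_frames):
        a = rng.gauss(0.0, 1.0)  # shared collective amplitude
        noise = [rng.gauss(0.0, 0.1) for _ in range(6)]
        xi.append([a + noise[0], noise[1], noise[2]])
        xj.append([noise[3], a + noise[4], noise[5]])
    return xi, xj

if __name__ == "__main__":
    xi, xj = perpendicular_pair()
    print("Pearson:", pearson_dcc(xi, xj))
    print("LMI (I, r):", linear_mutual_information(xi, xj))
```

In this constructed case the dot product of the displacements averages out, so the Pearson coefficient is close to zero even though the two atoms move in concert, whereas the LMI-based coefficient remains high.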

Figure 1. (a) Snapshot of SAMHD1 with the chains in different colors (A: brown, B: green, C: pink and D: blue). The GTP and dATP molecules bound to allosteric sites 1 and 2 and to the catalytic sites are represented by red and purple spheres. (b) Consensus cross-correlation between Cα-Cα atoms of the entire protein. Boxes highlight regions of significant correlations.

Correlation Analysis of protein motions

Strong intra-chain correlations connect different structural elements near the allosteric pocket.

Figure 1(b) shows our results for the cross-correlations of the SAMHD1 tetrameric complex. Regions of significant correlation, obtained from the consensus correlation matrix, are highlighted by boxes. The main intra-chain correlations are indicated by the boxes R1, R2, R3 and R4 for chain A. Similar correlations in the other chains, though present, are not highlighted in the figure. Note that the features highlighted in the boxes are at least 5-10 times the standard error of the correlation coefficients, shown in the supporting information (Figure S19). Amino acid residues at the two allosteric sites exhibit high correlations (region R1). Although most of
these correlations are found in residues belonging to common structural elements (i.e. near off-diagonal elements), highly coupled motions are also observed for non-contiguous residues, such as R318 (located in the helix D309-G324) with I122 and D120 (Supplementary Figure S3a), P130 with L197 (Supplementary Figure S3b), H129 with Y257 (Supplementary Figure S3c) and Q140 with N248 (Supplementary Figure S3d). Concerted motion was also observed between residues of the two helices D440-Y450 and D383-F390 (region R2) (Supplementary Figure S4a). These two antiparallel helices form one wall of the allosteric pocket. Three proximal methionine residues, M239, M240 and M416, exhibit moderately coupled motion, as highlighted in region R3 (Supplementary Figure S4b). Several residues near the surface belonging to different structural elements are found to exhibit moderate correlations (region R4), such as W572 and E479 (Supplementary Figure S4c). Regions R5, R6, R7 and R8 indicate inter-chain correlations, which are discussed later. Only the correlations between residues of chain A and the other chains are highlighted by the boxes in Figure 1b, although similar correlations exist for the other chains as well. A comparison between the LMI matrices obtained from the separate trajectories is provided in the supporting material (Figure S2). Overall, the LMI matrices computed for the three independent trajectories are very similar to the consensus matrix. Although the correlation analysis in Figure 1(b) provides evidence for distal site communications within the tetrameric complex and indicates residues that are important for the structural stability of the complex, a more detailed analysis using network theory methods was undertaken to elucidate allosteric pathways and residues crucial for signal propagation, as described next.
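
The network construction outlined in the Methods (pruning the consensus LMI matrix at 0.5 and weighting each remaining edge by the negative logarithm of its LMI value) can be sketched in a few lines of Python. The five-node matrix below is a hypothetical toy example, not data from the simulations, and the function names are illustrative. Only the optimal path is computed here, using Dijkstra's algorithm; the suboptimal-path ensembles and node degeneracies reported in this work are not reproduced by this sketch.

```python
import heapq
import math

# Hypothetical symmetric LMI matrix for five residues (values between 0 and 1).
TOY_LMI = [
    [1.00, 0.90, 0.30, 0.60, 0.20],
    [0.90, 1.00, 0.80, 0.40, 0.10],
    [0.30, 0.80, 1.00, 0.70, 0.90],
    [0.60, 0.40, 0.70, 1.00, 0.55],
    [0.20, 0.10, 0.90, 0.55, 1.00],
]

def build_network(lmi, cutoff=0.5):
    """Adjacency dict keeping pairs with LMI >= cutoff; edge weight = -ln(LMI)."""
    n = len(lmi)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if lmi[i][j] >= cutoff:
                weight = -math.log(lmi[i][j])
                graph[i][j] = weight
                graph[j][i] = weight
    return graph

def optimal_path(graph, source, sink):
    """Shortest weighted path (Dijkstra); returns (path, length) or (None, inf)."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == sink:
            break
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if sink not in dist:
        return None, math.inf
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[sink]

if __name__ == "__main__":
    graph = build_network(TOY_LMI)
    path, length = optimal_path(graph, 0, 4)
    print("optimal path:", path, "length:", length)
```

Because the weights are negative logarithms, the length of a path equals the negative logarithm of the product of the LMI values along it, so the shortest path is the chain of most strongly correlated residues.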

Figure 2. (a) Representative snapshot showing chain B (in ribbon representation). The bound GTP and dATP molecules are represented as red and purple sticks and a Mg2+ ion is represented as a magenta sphere. The snapshot shows the molecules bound at the allosteric site in the foreground, with the catalytic site dATP behind. The segments belonging to communities C6, C7, C8, C9 and C10 are colored tan, grey, dark green, white and magenta respectively. (b) Optimal community network. The communities are labeled C1-C15. The edges of the communities of chains A and C (C1, C2, C3, C4, C5, C11, C12) are colored white, while those connecting communities of chains B and D (C6, C7, C8, C9, C10, C13, C14, C15) are colored blue. Communities C1/6/11/13, comprising family 1, are surrounded by a pink shaded box; they are the “floors” of the catalytic sites. Communities C3/8/15, comprising family 3, are the inside face of the catalytic site. Communities C2/7/12/14, comprising family 2, are shaded by triangles; they form the “roof” of the catalytic site. Communities C4/5/9/10, comprising family 4, are shaded by light green bars. The bars between C9 and C10 denote the allosteric handshake between them, as do the bars between C4 and C5. (c) The communities C9 (white) and C10 (magenta) shown in ribbon representation, enclosed by a transparent molecular surface coloured according to the monomer (green for chain B and blue for chain D). Note that the communities C9 and C10 include residues from both chains (B and D), as listed in Table 1.

Family  Community  Monomer  Residues
F1      C1         A        113-145, 163-164, 166-209, 232-354, 516-530
F1      C6         B        113-214, 231-353, 355, 424, 519-527
F1      C11        C        113-145, 152-161, 164-208, 232-357, 515-533, 535
F1      C13        D        113-131, 158-159, 165-204, 251-360, 508-537
F2      C2         A        146-162, 165, 210-231, 376-453
F2      C7         B        215-230, 376, 378-423, 425-453
F2      C12        C        146-151, 162-163, 209-231, 358-452
F2      C14        D        132-157, 160-164, 205-250, 376-452
F3      C3         A        355-375
F3      C8         B        354, 356-375, 377
F3      C15        D        361-375
F4      C4         A        454-515, 544-599
                   C        534, 536-542, 544
F4      C5         C        453-514, 543, 545-599
                   A        531-543
F4      C9         B        454-518, 544-599
                   D        538-541
F4      C10        D        453-507, 542-599
                   B        528-543

Table 1. List of the communities and their members.

SAMHD1 can be partitioned into co-moving communities of residues which reveal allosteric channels operating across the tetramer interface.

A correlation network was generated from the MD calculations as described in Methods. Subsequently, we applied community analysis to partition the network into highly intra-correlated clusters. Table 1 reports the total community partitioning of the SAMHD1
complex. Fifteen consistently correlated sectors (or communities) emerged from the community analysis using the consensus correlation matrix. Figure 2b represents the structure of the community partitioning (Table 1), with the communities represented by spheres. The analysis elucidated a common pattern in the tetrameric complex: all residues of any given monomer (chain) can be grouped into five communities that display concerted motion. Four of these communities include nodes (residues) belonging to a single monomer, whereas the fifth community spans two protein monomers, which mirror each other’s positions from adjacent monomeric units. The one exception was chain C, where the residues were partitioned into four communities instead of the expected five. We illustrate the community partition using monomer B as a representative example. Figure 2(a) presents a snapshot of chain B of the complex in ribbon representation, colored according to the communities. The names of the communities (C6-C10) are indicated alongside. The GTP and dATP molecules in the allosteric pocket are indicated by red and purple sticks. The dATP molecule bound to the catalytic site is in the background. Based on the similarity of the community structure in the four chains, we can identify four families of communities. The first family, labelled F1, contains the communities C1 (chain A), C6 (chain B), C11 (chain C) and C13 (chain D). The community C6, belonging to chain B (Figure 2a) and represented by tan colored ribbons, forms the bulk of the monomer, encompassing the allosteric pockets while also extending to the surface. C6 has three non-contiguous sectors, residues 113-214, 231-353 and 519-527. The structural basis for the coupling of these sectors is discussed later. The next family, F2, includes the communities C2, C7, C12 and C14 of the chains A-D respectively.
The C7 community, belonging to chain B and shown in grey in Figure 2a, has two sectors: a strip of residues 215 to 230 sandwiched between two parts of C6, and a larger
segment comprising the residues 378 to 453. Several of the catalytic site residues, e.g. D311 and R312, belong to the communities in the F2 family. It also includes the residue R451, which is a key residue at the allosteric site of the neighbouring monomer. The third family, F3, contains the communities C3, C8 and C15, which include the helix E355-A373. Notably, this community was absent in chain C, where the residues were included in the community C12 (part of the F2 family). The α-helix E355-A373, forming an independent community, C8, is depicted by a dark green ribbon in Figure 2a. Note that the E355-A373 helices from adjacent monomers (B-D and A-C) are coordinated at both ends by GTP molecules and interact with each other via hydrogen bonds (D361-R372 and N358-R372 of adjacent chains) that are essential to tetramer formation. In addition, Y374, at the terminus of the helix, is one of the key residues of the catalytic site. The fourth family (F4) contains communities C4, C5, C9 and C10, each of which includes residues from two chains. For example, C10 includes residues 528-543 from chain B and two segments from chain D, 453-507 and 542-599. Similarly, C9 includes residues 538-541 from chain D and the segments 454-518 and 544-599 from chain B. Thus, the communities C9 and C10 mirror each other in chains B and D. The corresponding communities of chains A and C, i.e. C4 and C5, have a similar structure. Figure 2c shows the reciprocal allosteric handshake between the communities C9 and C10 that bridge chains B and D. The communities of the F4 family primarily include the CTD residues.
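
The cross-chain membership of the F4 communities is easy to check programmatically. The following Python sketch encodes the F4 rows of Table 1 and answers the question "which community does residue N of chain X belong to?". The residue lists are taken from Table 1; the function and variable names are illustrative choices of this sketch.

```python
def parse_ranges(text):
    """Expand a residue list such as '454-518,544-599' into a set of integers."""
    residues = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-")
            residues.update(range(int(start), int(end) + 1))
        else:
            residues.add(int(part))
    return residues

# Family F4 rows of Table 1: community -> {chain: residue list}.
F4_COMMUNITIES = {
    "C4": {"A": "454-515,544-599", "C": "534,536-542,544"},
    "C5": {"C": "453-514,543,545-599", "A": "531-543"},
    "C9": {"B": "454-518,544-599", "D": "538-541"},
    "C10": {"D": "453-507,542-599", "B": "528-543"},
}

def build_lookup(communities):
    """Map (chain, residue number) -> community name."""
    lookup = {}
    for name, chains in communities.items():
        for chain, text in chains.items():
            for residue in parse_ranges(text):
                lookup[(chain, residue)] = name
    return lookup

def chains_spanned(communities, name):
    """Sorted list of the chains contributing residues to a community."""
    return sorted(communities[name])

if __name__ == "__main__":
    lookup = build_lookup(F4_COMMUNITIES)
    for key in [("B", 592), ("D", 592), ("B", 530), ("D", 540)]:
        print(key, "->", lookup[key])
```

With this lookup, T592 of chain B falls in C9 and T592 of chain D in C10, while the short loop of each chain (B 528-543, D 538-541) belongs to the community dominated by the partner chain, which is the reciprocal arrangement described in the text.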

A detailed comparison of the community structures obtained from the three independent trajectories is provided in Figure S6 of the supplementary information, in order to distinguish features arising from random fluctuations from those reflecting the underlying molecular mechanisms. The allosteric handshake between the chain pairs A-C and B-D was a consistent feature in all three trajectories, despite other minor differences.


Figure 3. Interchain Correlations. (a) Portions of the communities C9 and C10 represented as white and magenta ribbons. E547 (chain B) and Q539 (chain D) show direct interactions. (b) The Cα-Cα distance between E547 and Q539 of the pairs of chains A-C, C-A, B-D and D-B obtained from the MD simulations. (c) The allosteric pocket of chain A showing the bound GTP and dATP as red and purple sticks. The segment of chain A between V117 and R145 is represented as a blue ribbon while a portion of chain D is shown as a pink ribbon. The protein backbone is colored according to the community partitioning of Figure 2(c). The residues N119 (chain A) and F157, V156 (chain D) are represented as sticks. (d) Residues N328 and Q326 of the chains A and C are shown as sticks. Interactions between the N328 side chain and the Q326 backbone of adjacent monomers result in moderate correlations pinning the chains together. (e) Another view of the allosteric pocket showing the residues D137 (chain A) and R451 (chain D). V156 (chain D) lies in the background. The residues V156 and R451 are the two fingers of the pincer formation by which chain D encircles the allosteric site of chain A. (f) Adjacent antiparallel helices E355-A373 of chains A and C with the key residues forming hydrogen bonds. The residues are colored according to the communities.

Inter-chain Correlations

We next examine the interactions between the monomers that are essential to the integrity of the complex. Figure 3 shows representative images that illustrate how these couplings arise. Figure 3(a) and (b) correspond to region R5. Intermittent hydrogen bonds between the residues R528-D585, R531-N599 and E547-Q539 of the chain pairs A-C, C-A, B-D and D-B produce moderate correlations between the chains (see Figures S3d and S4 in the supporting information). These interactions lie at the heart of the family F4 communities (C4, C5, C9, C10). The correlations of region R7 are illustrated in Figure 3(c) and (e). Residues V156 and R451 of chain D, which interact with the allosteric-site-bound dATP and GTP respectively, are responsible for the dynamic coordination between chains A and D.
The region R6 correlations (Figure 1b) are due to residues N328 and Q326 of the chain pairs, as shown in Figure 3(d). Figure 3(f) shows the helices E355-A373 from adjacent monomers A and C, coordinated at both ends by dATP molecules. Interactions between R372 of one chain and N358/D361 of the adjacent chain are instrumental in holding the helices together. Stacking interactions between the H364 residues of the adjacent helices also contribute to their stability. In chains A, B and D, the major part of the helix E355-A373 forms an independent community (C3 in the case of chain A). This is not the case in chain C, where the helix is apportioned between the larger C12 and C11 communities.
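The inter-chain couplings discussed above are read off a residue-residue correlation map. As a rough illustration only, and not the authors' actual pipeline, the sketch below computes a normalized cross-correlation matrix of Cα displacements with NumPy. The input layout (frames × residues × 3), the function name `cross_correlation` and the synthetic three-residue trajectory are assumptions made for this example; a real analysis would first superpose the trajectory on a reference structure.

```python
import numpy as np

def cross_correlation(coords):
    """Normalized cross-correlation C_ij of atomic displacements.

    coords: array of shape (n_frames, n_residues, 3), assumed to be
    already superposed on a common reference (hypothetical layout).
    """
    disp = coords - coords.mean(axis=0)          # displacement from the mean position
    # <dr_i . dr_j>, averaged over frames
    cov = np.einsum("fik,fjk->ij", disp, disp) / coords.shape[0]
    norm = np.sqrt(np.diag(cov))                 # sqrt(<dr_i . dr_i>)
    return cov / np.outer(norm, norm)

# Synthetic toy data: residue 1 follows residue 0, residue 2 moves opposite to it.
rng = np.random.default_rng(0)
motion = rng.normal(size=(500, 3))
coords = np.stack([motion, motion, -motion], axis=1)
corr = cross_correlation(coords)
```

In this toy case the entry for residues 0 and 1 is fully correlated and the entry for residues 0 and 2 is fully anti-correlated; moderate inter-chain correlations of the kind described in the text would lie between these extremes.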

Identification of “hot-spots” which mediate information flow: the minor-lobe to CTD linker channels the phosphorylation signal to the protein core.

We calculated the optimal and suboptimal paths connecting several pairs of residues at key functional sites, as explained in Materials and Methods, to explore the connectivity and modes of signal transmission in the network. The resulting path ensemble reveals the diverse ways in which information can flow between specific node pairs and aids in the identification of key residues that can play an important role in the passage of the signal. Consider the transmission of the allosteric signal between T592 and H206 (both belonging to the same monomer) as an example of the pathways linking a surface site to the catalytic core (Figure 4a). Since the phosphorylation of T592 by cdk-1 has been suggested as a means of regulation of the protein, it is pertinent to probe the influence of T592 on the dynamics of the complex. The main connections involved are T592-I591 (of 3₁₀ helix I591-K595)-A588-I587 (of helix D583-A588)-Q567-A565 (of helix D558-N577)-V457 (of sheet K455-T460)-Y456-K455-F454-I381 (of helix H376-D394)-T384-M216 (of helix S214-R220)-S214-D207 (of the helix S192-R206)-H206. Figure 4a illustrates the pathways computed for chain D. The corresponding pathways for the other three monomers are presented in the supplementary information (Figures S7, S9 and S11, panels e and f). While the pathways in all four chains show remarkable similarities, there are a few differences. For the T592-H206 pair, in all four chains, a multitude of pathways were found near the surface that later converged to a single bundle towards the core. The residues N452-K455 were found to play a crucial role in all four chains. In chain D (Figure 4a), the links between F454/K455 and I381 were found to be critical to the funnelling of the signal from the branched pathways near the surface towards the core.

Figure 4b shows the pathways between the surface site T592 of chain D and the catalytic site H206 of the adjacent chain B. The most crucial inter-chain connection involved the residues V586 (chain D) and I530 (chain B), which belong to the community C10. Note that these residues are adjacent to D585 and R528 (Figure S4d in the Supplementary Material shows the corresponding residues of chains A and C), which are intermittently connected by hydrogen bonds. The main connections are D:T592-D:I591-D:A585-D:V586-B:I530-B:F520-B:C350-B:I349-B:E299-B:I300-B:V301-B:G203-B:L204-B:H206. The correlations between F520 and C350, which couple sequentially distant parts of the monomer, may be traced to the interactions between the proximal residues C522 and C350 (which are not connected by disulfide bonds in the simulations). Other important connections between sequentially distant residues included I349 (chain B, sheet E346-R352) with E299 (chain B, 3₁₀ helix P291-I300), and V301 with G203 (chain B, helix S192-R206).
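To make the idea of optimal and suboptimal paths concrete, here is a minimal sketch on a toy network. It assumes the common convention of weighting each edge by -log|C_ij|, where C_ij is the pairwise correlation; that convention, the `offset` parameter, the function names and the node labels are all assumptions for illustration, since the actual procedure is the one given in Materials and Methods. The last helper reports the fraction of paths crossing each node.

```python
import heapq
import math

def make_graph(correlations):
    """Undirected graph with edge weight -log|C_ij| (assumed convention)."""
    graph = {}
    for (a, b), c in correlations.items():
        w = -math.log(abs(c))
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph

def optimal_path(graph, source, sink):
    """Dijkstra shortest path; returns (length, path)."""
    heap = [(0.0, source, [source])]
    seen = set()
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node == sink:
            return dist, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, w in graph[node].items():
            if nxt not in seen:
                heapq.heappush(heap, (dist + w, nxt, path + [nxt]))
    return math.inf, []

def suboptimal_paths(graph, source, sink, offset):
    """All simple paths no longer than the optimal length plus `offset`."""
    limit = optimal_path(graph, source, sink)[0] + offset
    found = []

    def walk(node, dist, path):
        if dist > limit:            # prune branches that are already too long
            return
        if node == sink:
            found.append((dist, path))
            return
        for nxt, w in graph[node].items():
            if nxt not in path:     # keep paths simple (no revisits)
                walk(nxt, dist + w, path + [nxt])

    walk(source, 0.0, [source])
    return sorted(found)

def node_degeneracy(paths):
    """Fraction of the computed paths that cross each node."""
    counts = {}
    for _, path in paths:
        for node in path:
            counts[node] = counts.get(node, 0) + 1
    return {node: n / len(paths) for node, n in counts.items()}

# Hypothetical correlations between four made-up nodes.
toy = {("surface", "linker_a"): 0.9, ("linker_a", "core"): 0.9,
       ("surface", "linker_b"): 0.8, ("linker_b", "core"): 0.8,
       ("surface", "core"): 0.3}
graph = make_graph(toy)
best_length, best_path = optimal_path(graph, "surface", "core")
paths = suboptimal_paths(graph, "surface", "core", offset=0.3)
degeneracy = node_degeneracy(paths)
```

Here the weakly correlated direct edge is longer than either two-step route, so the optimal path runs through `linker_a` and the suboptimal ensemble adds the route through `linker_b`. The exhaustive enumeration is only practical for small graphs or tight offsets.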


Figure 4. The optimal and suboptimal pathways between key functional sites: (a) T592 and H206, both of chain D and (b) chain D:T592 and chain B:H206. The residues predicted to assist in propagating the allosteric signal between two residues of interest are depicted as orange spheres along suboptimal signalling pathways, depicted as magenta lines.


Figure 5. Inter-chain optimal and suboptimal pathways between key functional sites: (a) D137 (chain B) and Q375 (chain C), (b) D137 (chain B) and Q375 (chain D) and (c) D137 (chain B) and Q375 (chain A). The residues predicted to assist in propagating the allosteric signal between two residues of interest are depicted as orange spheres along suboptimal signalling pathways, depicted as cyan (panel a) or magenta (panels b and c) lines. The chains A, B, C and D are represented as brown, green, pink and blue ribbons respectively. In case of inter-chain links, the chain is indicated along with the residue identifier. (d) The residue-wise betweenness centrality for the SAMHD1 network. A grey background is used to highlight residues with high centrality.

Chain A             Chain B             Chain C             Chain D
Resid  Centrality   Resid  Centrality   Resid  Centrality   Resid  Centrality
137    232400       119    150295       152    123735       132    111753
325    160847       120    114617       157    143354       137    193977
326    178452       350    104482       328    117487       349    109510
451    196060       452    124790       329    120010       350    114358
452    237978       453    125120       350    100091       451    195953
453    237630       454    125490       382    116455       452    248611
454    237372       455    123920       447    102934       453    248308
455    234356       520    163366       452    116042       454    292309
456    112407       530    145348       453    116344       455    288803
457    112466       538    125572       454    116716       456    224243
-      -            547    110666       455    115138       457    225866
-      -            -      -            534    100163       530    123044
-      -            -      -            -      -            586    149846


Table 2. The residues with betweenness centrality greater than 100000 are listed for the four chains. A high value of centrality is found in the residues N452, L453, F454 and K455 in all four chains.

Several other pathways connecting the allosteric and catalytic site residues belonging to the same chain are presented and discussed in the supplementary information (Figures S7-S14). The key residues involved in the signal flow may be identified from the node degeneracies (the fraction of the total computed paths crossing a given node), presented in Figures S8, S10, S12 and S14 in the supporting information. Some of the key links identified in intra-chain communication pathways are I122-R318 (helix D309-G324), R145-R164 and M216-D383.

Given that tetramerization is essential to the enzymatic activity of SAMHD1 (the monomeric form is not known to possess triphosphohydrolase activity), it is necessary to understand the allosteric linkages between the chains, particularly between the catalytic and allosteric pockets. Figure 5 shows the communication channels between D137 of chain B (a key residue at Allosite 1 that forms hydrogen bonds with the GTP) and the catalytic site residue Q375 of the other chains, C, D and A respectively. Unsurprisingly, V156 of chain C is the key residue that bridges the two monomers (B and C). In contrast, the linkages between D137 (chain B) and Q375 (chain D) involve residues I530 (chain B) and V586 (chain D), both belonging to the C10 community (see Table 1). The reciprocal pathways between Q375 of chain B and the allosteric site D137 of chains A, C and D are shown in Figure S16 in the supplementary material. The same key residues interlinking the different chains, such as V586, I530, V156 and R451, are found in these paths even though intra-chain communication proceeds through alternate pathways. Note that the correlation between V586 and I530 of neighbouring chains was also observed to be crucial in the communications between the surface-exposed T592 residue of one chain and H206 in the core of the adjacent monomer (Figure 4b). An important revelation of the path analysis is that the passage of allosteric signals between the pairs of chains A-C and B-D primarily involves the communities of the family F4 (i.e. C4, C5, C9 and C10) rather than the E355-A373 helix (family F3). The communication between D137 of chain B and Q375 of chain A is tortuous, spanning three monomers (Figure 5c), indicating tenuous correlations. The key inter-chain connections include D137 (chain B)-V156 (chain C) and D361 (chain C)-H364 (chain A). The distributions of the path lengths corresponding to the source-sink pairs in Figure 5 are presented in the supplementary material (Figure S18). The longest path lengths are observed between the residues B:D137-A:Q375, while the shortest lengths are found between B:D137-C:Q375.

Node-centrality reveals minor-lobe to CTD linker to be an important allosteric channel.

The betweenness centrality, which gives the number of unique shortest paths crossing a node, is plotted for all the residues in Figure 5(d). The node centrality indicates the importance of the node in controlling the flow of information. Although some variation between the centrality of the nodes belonging to the four monomers is observed, there is an overall consistency between the centrality values of residues of all four chains. Table 2 reports the residues with betweenness centrality greater than 100000. In all four monomers, N452, L453, F454 and K455 were found to possess high centrality values. These residues were found to be crucial in channelling the signal pathways from the surface towards the catalytic core (see Figure 4a). In addition, the neighbouring residue R451, flanking the allosteric site, pins different chains together (D-A, A-D, B-C and C-B), as shown in Figure 3e. A high centrality value was also observed for C350 in three of the four chains.
Of the three proximal cysteine residues C350, C341 and C522, two are connected by a disulfide bond (C350 and C341). However, the path analysis reveals that the correlations between C350 and C522 play a significant role in transmitting information from the surface to the core residues. Note that C350 was found to be an important link in the communication pathways between T592 and H206 of neighbouring chains (Figure 4b).
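As an illustration of the centrality measure used above, the following sketch computes unnormalized betweenness with Brandes' algorithm. It is a simplification under stated assumptions: the graph is treated as unweighted, whereas the network analysed in the text may use weighted edges, and the seven-node toy graph with its labels is invented for the example.

```python
from collections import deque

def betweenness(adjacency):
    """Unnormalized betweenness (Brandes) for an unweighted, undirected graph."""
    score = {v: 0.0 for v in adjacency}
    for s in adjacency:
        order = []                              # nodes by non-decreasing distance from s
        preds = {v: [] for v in adjacency}      # predecessors on shortest paths
        sigma = {v: 0 for v in adjacency}       # number of shortest paths from s
        dist = {v: -1 for v in adjacency}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:                 # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:      # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):               # accumulate dependencies back to s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # every undirected pair was counted once from each endpoint
    return {v: c / 2 for v, c in score.items()}

# Toy graph: two triangles joined through a single linker node (hypothetical labels).
adjacency = {
    "a1": ["a2", "a3"], "a2": ["a1", "a3"], "a3": ["a1", "a2", "linker"],
    "linker": ["a3", "b1"],
    "b1": ["linker", "b2", "b3"], "b2": ["b1", "b3"], "b3": ["b1", "b2"],
}
centrality = betweenness(adjacency)
```

The linker node carries every shortest path between the two triangles and therefore has the highest score, which is the same qualitative signature that singles out a short connecting segment in a much larger residue network.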

D311 forms a crucial hydrogen bond with H167 that closes the catalytic pocket.

A representative snapshot showing the dATP bound to the catalytic site in one of the chains is presented in Figure 6(a). A closer inspection shows that the interaction between D311 and H167 is instrumental in maintaining the shape of the catalytic pocket. We analysed the shortest distance between the (OD1/OD2) atoms of D311 and the NH1 atom of H167. Figure 6b shows the variation of the distance with time as measured from the Set 1 MD simulations. The distance was found to be steady, with an average of 2.8 Å, in three of the four chains. The corresponding occurrence of the hydrogen bond was greater than 50% in three of the four monomers.
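A hydrogen-bond occurrence of the kind quoted above can be estimated from a distance time series. The sketch below is a simplified, distance-only version: the 3.5 Å cutoff, the absence of an angle criterion, the function names and the six synthetic frames are all assumptions for illustration and are not taken from the simulations.

```python
def shortest_distance(series_a, series_b):
    """Frame-wise minimum of two distance series, e.g. OD1 and OD2 to one partner atom."""
    return [min(a, b) for a, b in zip(series_a, series_b)]

def hbond_occupancy(distances, cutoff=3.5):
    """Fraction of frames whose distance (in Å) is within `cutoff` (assumed criterion)."""
    if not distances:
        raise ValueError("empty distance series")
    return sum(d <= cutoff for d in distances) / len(distances)

# Synthetic distances in Å, for illustration only.
od1 = [2.8, 2.9, 4.1, 3.0, 5.2, 2.7]
od2 = [3.9, 4.0, 3.2, 4.4, 4.8, 3.6]
nearest = shortest_distance(od1, od2)
occupancy = hbond_occupancy(nearest)
```

Taking the frame-wise minimum over the two carboxylate oxygens mirrors the "shortest distance" described in the text, so a bond that switches between OD1 and OD2 is still counted as present.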


Figure 6. (a) Snapshot showing dATP (purple sticks) bound to the catalytic site in chain D (blue ribbon). Proximal residues Y374, R312, D311, H206 and H167 are shown. (b) The residues R311, H123, I122 and D120 are represented as sticks. A double hydrogen bond between D120 and R318 pins the α-helix (D309-G324) to the beta hairpin. The residues R318 and I122 form an important conduit in the flow of information between the allosteric and the catalytic sites (see Figure 4b, pathways between N119-D311). (c) Distance between D311 (side-chain O) and H206 (N) obtained from MD simulations. Hydrogen bonds between D311 and H167 are crucial in maintaining the shape of the allosteric pocket. (d) The Cα-Cα distance between I122 and R318 of the four chains from Set 1 MD simulations.

Figure 7. (a) Fluorescence polarization of F-GTP bound to SAMHD1 WT with no addition of dNTP (black circles, reference) and after addition of dATP (red triangles). A rise in polarization indicates formation of tetramer and a fall indicates disassembly of tetramer. (b) Fluorescence polarization of F-GTP bound to SAMHD1 D311A without dNTP addition (black circles) and with dATP (red triangles). The much slower rise of polarization (half-life ~800 seconds) indicates that D311A-induced changes to the catalytic pocket have translated to a kinetic, but not an equilibrium, allosteric effect.

There is already experimental evidence that residues D311 and H167 are critical for catalysis.² Additionally, Tungler and co-workers⁴⁹ used Fluorescence Cross Correlation Spectroscopy (FCCS) to determine that the mutant H167Y is deficient in tetramerization. We have used dATP-induced fluorescence polarization time-course experiments to determine the effect of the mutation D311A on tetramerization ability. As reported in previous works by Bhattacharya and Wang,²⁹,³⁰ the D311A mutant is capable of tetramerization, but much more slowly than the wild-type protein. We tested tetramer assembly and disassembly kinetics for the following constructs of SAMHD1 114-626: wild type and D311A (Fig. 7). Reference samples of SAMHD1 WT, T592D, and D311A (which were not injected with dATP) maintained a fluorescence polarization of ~160-175 mP, corresponding to a baseline polarization of F-GTP bound to monomeric SAMHD1. Upon injection of 1000 µM dATP, the polarization of SAMHD1 WT increased to ~210 mP in less than ~100 seconds, indicating assembly of tetramer. The polarization returned to baseline after about 1700 seconds, indicating full disassembly of tetramer. The corresponding experiment for the D311A mutant showed a very slow rise from ~165 mP to ~210 mP over almost 1700 seconds. Thereafter, the polarization stayed at ~210 mP over the course of the 7200-second experiment. Thus, our experiments also confirm that the WT tetramer assembles completely in less than ~100 seconds and decays completely to monomer in about 1700 seconds. However, we see that the D311A mutant requires almost a full 1700 seconds to form the complete tetramer, and that it retains the tetramer form over the course of a long-duration (2 hr) experiment. Thus, the D311A mutation causes a significant change in the catalytic pocket which communicates itself all the way to the allosteric pocket and slows down the assembly of tetramer by an order of magnitude.

Perturbations to the community network

Figure 8. Path length distributions between the allosteric site residue D137 and the catalytic site residue Q375 belonging to adjacent chains. In each case, the green bars denote the original network while the red bars represent the perturbed network where specific edges were deleted. (a) B:D137 to D:Q375, linker 452-455 in chain D deleted, (b) D:D137 to B:Q375, linker 452-455 in chain B deleted, (c) B:D137 to D:Q375, all edges to B:C350 deleted, (d) D:D137 to B:Q375, all edges to D:C350 deleted, (e) B:D137 to D:Q375, edges between B:530/531 and D:V586 deleted and (f) D:D137 to B:Q375, edges between D:530/531 and B:V586 deleted.

To further test our hypothesis regarding the role of the “hot-spots” in signal transmission, we modified the community network (Figure 2b) by selectively deleting certain edges. Specifically, we probed the role of the linker 452-455, C350 (part of the putative redox switch⁵⁰) and the edges between V586 and I530/R531 of adjacent chains. The residues D137 of chain B and Q375 of chain D (and vice versa), belonging to the allosteric and the catalytic sites of neighbouring chains, were selected for the analysis (the paths are shown in Figure 5b). As shown in Figure 8(a-b), the deletion of the linker 452-455, i.e. the elimination of all edges connected to the specified residues, leads to weaker communication between the allosteric and catalytic site residues in neighbouring chains, as indicated by longer path lengths. Figure 8(c-d) shows that the deletion of edges connecting C350 (part of the redox switch) also leads to longer pathways, i.e. diminished communication between the active sites of adjacent chains. Finally, the deletion of the edges between B:530/531 and D:V586 was found to weaken the communication between the specified active sites of the neighbouring chains B and D. Figure 8(e-f) shows a significant shift in the path length histograms towards longer paths, supporting our hypothesis regarding interchain signal transmission. The shift in the path length distributions was more pronounced in the last example (Figure 8e-f) than in the first two (Figure 8a-d), highlighting the influence of the interchain V586-I530/R531 links in signal transmission between the allosteric and catalytic sites of adjacent chains.
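The two kinds of perturbation described above, removing every edge attached to chosen nodes and removing specific edges, can be sketched on a toy weighted graph as follows. The graph, its weights and the node labels are hypothetical, and only the single shortest path length is compared here, whereas the analysis in the text compares whole path length distributions.

```python
import heapq
import math

def shortest_length(graph, source, sink):
    """Dijkstra shortest path length; math.inf if the sink is unreachable."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == sink:
            return d
        if d > dist.get(node, math.inf):
            continue                      # stale heap entry
        for nxt, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nxt, math.inf):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return math.inf

def delete_node_edges(graph, nodes):
    """Copy of `graph` with every edge touching `nodes` removed (nodes stay, isolated)."""
    nodes = set(nodes)
    return {a: ({} if a in nodes else {b: w for b, w in nbrs.items() if b not in nodes})
            for a, nbrs in graph.items()}

def delete_edges(graph, edges):
    """Copy of `graph` with the listed undirected edges removed."""
    removed = {frozenset(e) for e in edges}
    return {a: {b: w for b, w in nbrs.items() if frozenset((a, b)) not in removed}
            for a, nbrs in graph.items()}

# Hypothetical network: a short route through "hub" and a longer detour via "x" and "y".
graph = {
    "allo": {"hub": 1.0, "x": 2.0},
    "hub": {"allo": 1.0, "cat": 1.0},
    "x": {"allo": 2.0, "y": 2.0},
    "y": {"x": 2.0, "cat": 2.0},
    "cat": {"hub": 1.0, "y": 2.0},
}
original = shortest_length(graph, "allo", "cat")
without_hub = shortest_length(delete_node_edges(graph, ["hub"]), "allo", "cat")
without_edge = shortest_length(delete_edges(graph, [("hub", "cat")]), "allo", "cat")
```

Both perturbations force the signal onto the detour, so the path length grows; a longer path after deletion is the signature read as weakened communication in the discussion above.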


Reciprocal Allosteric Handshake between Monomers A/C and B/D probed using PCA

To elucidate the overall patterns of motion in the SAMHD1 complex, we have employed Principal Components Analysis (PCA) on the Cα atoms of the complex. The analysis was performed in two ways. First, in order to focus on the “allosteric handshake” between the monomers, we performed PCA using only the C terminal domain of adjacent chains (i.e. considering A-C and B-D separately). Next, we carried out the analysis on the entire complex. In each case, three separate calculations were performed using the three independent trajectories. Figure 9 illustrates the results of the PCA performed using the CTD residues 455-599 of pairs of adjacent monomers (viz. A-C and B-D in separate calculations) on trajectory 3. Auxiliary Figures S20 and S21 in the supporting material show the corresponding plots for the other two MD trajectories. Although all the PCs are involved in the collective motion, the contribution to the total variance diminishes rapidly after the first few eigenvectors, as shown in Figure 9(e). The first four eigenvectors accounted for almost half of the total variance in all cases. An indication of collective motion of adjacent monomers is obtained from the simultaneous peaks in both monomers in the plot showing the residue-wise contribution to the first two PCs (Figure 9f-g). In the case of PC1, almost all sharp peaks are observed in the segments G464-I466, D506-E511, and the loop F520-I530. A similar trend is observed for the other two trajectories, as shown in the supplementary Figures S20 and S21. Two of the most dynamic parts of the CTD that contribute to the top two PCs are the segments 505-511 and 520-530. The flapping of the segment F520-I530 is particularly interesting for two reasons. First, it closes the gap between C522 and C341/C350. Recent experimental studies have indicated a redox switch operating between the three cysteine residues that can have important consequences for regulation of the system.⁵⁰ Second, the residue I530 is involved in inter-chain communication, as seen previously from the path analysis. The scatter plots in panels (j) and (k) of Figure 9 essentially give a two-dimensional representation of the conformational space occupied by the system. The gradual migration of the points in the PC1-PC2 scatter plots (Figure 9j,k) indicates that the protein is undergoing breathing motions while the tetramer stays intact. Supplementary Figure S22 shows the contribution of PC3 for the C-terminal domain corresponding to all three trajectories.

Figure 9. Principal Components Analysis of MD simulations performed using only the C terminal domain (CTD) of pairs of adjacent chains: A-C and B-D. Only the data from Trajectory 3 are presented in this figure. The interpolated structures representing motion along principal component 1 of chains A-C and B-D are shown in panels (a) and (b) respectively. The principal component 2 of the chains A-C and B-D is shown in panels (c) and (d). The residues with a high contribution to the PC are shown in stick representation. The protein backbone is colored according to the contribution to the PC (with blue indicating high contribution and red indicating invariant structure). Panel (e) presents the contribution of all calculated principal components to the total variance as a percentage. Black and red symbols denote the chain pairs B-D and A-C respectively. Panels (f-g) present the residue-wise contribution to the first and second principal components of the analysis performed on chains A-C. Panels (h-i) present the corresponding plots for chains B-D. Cross plots of the first two principal components are presented in (j-k). The color code represents the time evolution, with blue points marking the configurations early in the trajectories and red towards the end.

Figures S23 and S24 present the results of PCA performed using the entire tetrameric complex. As in the case of PCA performed using the CTD of pairs of monomers, the residue-wise contributions to the top three PCs (Figure S23) show the same segments in the CTD to be involved in the motion (G464-I466, D506-E511, and F520-I530). In addition, other segments in the core with high contributions can be identified: 186-190, 228-230 and smaller peaks between 300 and 350. For the top three PCs, the peaks were found in similar locations in all three trajectories. We ignored the loop 278-283 in the calculations since the segment was not resolved in the XRD structure (4TNR.pdb) and was found to have high fluctuations that dominated the analysis. However, sharp peaks at the residues flanking the segment, i.e. 277 and 284, were observed. Figure S24 shows the cross plot projecting the dynamics of the system on the top two principal components. The migration of points in the PC1-PC2 scatterplot along semi-circular or U-shaped paths, characteristic of diffusive behaviour in a shallow basin, is indicative of collective motions (i.e. breathing motions) in the protein complex, which nevertheless retains its quaternary structure.
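The three quantities discussed in the PCA sections (fraction of variance per component, residue-wise contribution to a component, and the PC1-PC2 projection) can be obtained from a singular value decomposition of the centred coordinates. The following NumPy sketch is illustrative only: the input layout, the function name `pca` and the synthetic four-atom trajectory are assumptions, and a real analysis would superpose the Cα coordinates before this step.

```python
import numpy as np

def pca(coords):
    """PCA of Cartesian coordinates.

    coords: array of shape (n_frames, n_atoms, 3), assumed already superposed.
    Returns (variance_fraction, per_atom_contribution, projections).
    """
    n_frames, n_atoms, _ = coords.shape
    x = coords.reshape(n_frames, n_atoms * 3)
    x = x - x.mean(axis=0)                       # centre each coordinate
    # Rows of vt are the principal components (unit eigenvectors of the covariance).
    _, s, vt = np.linalg.svd(x, full_matrices=False)
    variance = s ** 2 / (n_frames - 1)
    fraction = variance / variance.sum()
    # Per-atom contribution: squared x, y, z components of each eigenvector, summed.
    contribution = (vt.reshape(-1, n_atoms, 3) ** 2).sum(axis=2)
    projections = x @ vt.T                       # trajectory projected on every PC
    return fraction, contribution, projections

# Synthetic data: atom 0 makes a large motion along x, the rest only jitter.
rng = np.random.default_rng(1)
coords = 0.05 * rng.normal(size=(200, 4, 3))
coords[:, 0, 0] += 5.0 * rng.normal(size=200)
fraction, contribution, projections = pca(coords)
```

Plotting `projections[:, 0]` against `projections[:, 1]`, colored by frame index, would give a cross plot analogous to the PC1-PC2 scatter plots described above, and `contribution[0]` plays the role of the residue-wise contribution to PC1.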


Alterations in the communication channels between the CTD and the core observed in the phosphomimetic T592E mutant

In light of the important, yet insufficiently understood, regulatory role of the CTD residue Thr592, we extended the network analysis to the phosphomimetic variant T592E. In our previous study,³² MD simulations of the T592E mutant revealed local fluctuations near the site of the mutation that did not extend to the core of the complex. We subjected the two independent 100 ns trajectories to network analysis in order to examine whether more subtle effects on the correlated dynamics could be identified. Figure 10 depicts the community partitioning in the wt and the T592E mutant.

Figure 10. The community partitioning of the systems studied indicated by color coded horizontal bar. Each monomer (chain) is depicted by a bar that is colored according to the community. The color code of the extra communities not present in the wt system, but identified in the T592E variant are indicated by the box at the top.


In the wt complex (Figure 10a), a clear link between the CTD and the core is observed. The communities of Family F1 (i.e. C1, C6, C11 and C13) each had a section in the CTD (approx. residues 516-530), sandwiched between the communities of Family F4, that provided a connection between the surface and the core. This connection is eliminated in the T592E variant, i.e. the communication between the core (allosteric sites) and the surface is significantly weakened. Instead, the members of the F3 Family (C3, C8, C15 and the new community C18) are connected to the CTD. In the wt system, the E355-R372 helices, which form the communities of the F3 Family, were effectively dynamically decoupled from the other parts of the complex. This isolation is concomitant with their role as structural anchors, as described previously. In contrast, in the T592E variant, the communities of the F3 family have a segment extending to the CTD, indicating stronger dynamic coupling between the helices and the surface, which may be a precursor to the dissociation or loosening of the complex.


Figure 11. The path length distributions calculated for the wt system are compared to those of the T592E variant. Source-sink pairs considered include (a) the regulatory site residue 592 and the anchor-helix residue R372, (b) the regulatory site residue 592 and the catalytic site residue H206 of the same monomer, (c) residue 592 and H206 of adjacent monomers (A-C and B-D) and (d) the surface site C522 and the anchoring helix residue R372. The calculated suboptimal pathways between C522 and R372 in (e) the wt and (f) the T592E mutant.

To examine the link between the anchoring helices and the CTD further, we first considered the direct information flow between the surface residue 592 and R372 on the helix in the wt and the T592E mutant. In each case considered (Figure 11a), the path lengths increased in the T592E mutant, indicating weaker correlation between E592 and the helix sites. Since the network topology in the T592E variant (Figure 10) shows the helix E355-R372 to be connected to the surface strip 513-533 in three of the four chains, we next looked at the communication pathways between C522 and the helix. In this case (Figure 11d), the path lengths consistently decreased in the T592E mutant, indicating tighter communication between the two sites. The snapshots in Figure 11 (e,f) show that the paths involve the same nodes in both the wt and the T592E mutant. Thus, the correlation between C522 and the helix terminus R372 is stronger in the T592E mutant even though the pathway itself is unchanged.
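To illustrate how a path can keep the same nodes while its length drops, the sketch below computes the optimal path in a small weighted network. It assumes the common dynamical-network convention in which an edge between residues i and j carries the weight -log|C_ij|, so stronger correlation gives a shorter edge; that weighting, the intermediate node "X" and all correlation values are assumptions made for illustration. Only the single optimal path is computed here, not the full distribution of suboptimal paths shown in Figure 11.

```python
import heapq
import math


def build_graph(correlations):
    """Map {(node_a, node_b): C_ab} to an undirected graph with weights -log|C_ab|."""
    graph = {}
    for (a, b), c in correlations.items():
        weight = -math.log(abs(c))  # assumed weighting: stronger correlation, shorter edge
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight
    return graph


def shortest_path(graph, source, sink):
    """Dijkstra's algorithm: return (path length, node list) of the optimal path."""
    best = {source: 0.0}
    previous = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        dist, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == sink:
            break
        for neighbour, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(heap, (candidate, neighbour))
    if sink not in done:
        return math.inf, []
    path = [sink]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[sink], path[::-1]


# Hypothetical correlations; "X" stands for an unnamed intermediate residue.
wt_corr = {("C522", "X"): 0.5, ("X", "R372"): 0.5, ("C522", "R372"): 0.1}
mut_corr = {("C522", "X"): 0.7, ("X", "R372"): 0.7, ("C522", "R372"): 0.1}

wt_len, wt_path = shortest_path(build_graph(wt_corr), "C522", "R372")
mut_len, mut_path = shortest_path(build_graph(mut_corr), "C522", "R372")
print(wt_path, round(wt_len, 3))
print(mut_path, round(mut_len, 3))
```

In this toy case both systems route through the same intermediate node, but the higher correlations in the second network give a smaller summed weight, which is the pattern reported for the C522-R372 pair.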

To investigate other effects of the T592E mutation, we also considered the communication between the surface site 592 and the catalytic site H206 in the same monomer and in adjacent monomers (Figure 11 (b,c)). These correspond to the paths computed for the wt in Figure 4. The results here are more ambiguous, although a shift towards shorter paths is observed for inter-chain communication in the T592E mutant, suggesting stronger correlation between distant parts of the complex. It appears that the effect of T592E on the anchoring helices may trigger a loosening of the complex rather than a direct effect on the catalytic site. This is in line with results by Ivanov and co-workers, who have shown that the T592D mutation does not lead to collapse of secondary structure. Our results indicate that the phosphomimetic mutation affects signal transmission to the protein core, but the effect is subtle and may be connected to residue C522, which has been implicated in redox regulation. Figures S25-S26 in the supporting material, presenting the PCA of the T592E variant, show the first three PCs to be dominated by the fluctuations of the C-terminal residues 590-599. This is, again, in line with the results of Bhattacharya and Ivanov29, where the effect of the phosphomimetic mutation T592D was not found to propagate.
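As a rough illustration of what it means for a PC to be "dominated" by a few residues, the sketch below uses one common definition of per-residue contribution: the norm of that residue's x, y and z components in a unit-length eigenvector of the Cartesian covariance matrix. Whether this exact definition was used for Figures S25-S26 is an assumption, and the toy eigenvector is invented for the example.

```python
import math


def residue_contributions(eigvec):
    """Per-residue contribution: norm of each residue's (x, y, z) eigenvector components."""
    if len(eigvec) % 3:
        raise ValueError("eigenvector length must be a multiple of 3")
    return [math.sqrt(sum(c * c for c in eigvec[i:i + 3]))
            for i in range(0, len(eigvec), 3)]


# Hypothetical eigenvector for three residues, normalised to unit length.
raw = [0.1, 0.0, 0.0, 0.0, 0.2, 0.0, 0.6, 0.6, 0.3]
norm = math.sqrt(sum(c * c for c in raw))
pc1 = [c / norm for c in raw]

contrib = residue_contributions(pc1)
dominant = max(range(len(contrib)), key=contrib.__getitem__)
print("contributions:", [round(c, 3) for c in contrib])
print("dominant residue index:", dominant)
```

Because the eigenvector has unit length, the squared contributions sum to one, so a residue with a contribution close to one accounts for nearly all of the motion along that PC.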

Discussion

The existence of concerted motions between adjacent intrachain helices in SAMHD1 monomers, as detected by correlation analysis, is not unexpected. Indeed, the allosteric dATP is stabilized by beta strands 1 and 2, which are anchored at either end to helix 1 (129-137) and helix 12 (309-324). Helix 1, in particular, is “double-anchored” at both ends to helix 9 (248-257). Thus, the allosteric dATP, by its very presence, confers rigidity on the tertiary structure of the monomer that extends from the Allosite all the way to the surface. This is in line with conclusions from our recent study32. However, it is the presence of correlation groups (R5/6/7/8) that span monomeric units which yields interesting insights into how the different monomers “talk” to one another. Surface residue E547, for instance, belonging to beta strand 11 (546-554), is anchored to residue Q539 of the adjacent monomer. Thus, the allosteric sites are heavily “scaffolded” by a network of correlated residues.

However, more information can be extracted when we look beyond individual pairs of residues and start looking at co-moving groups of residues. Hence, an analysis of the community partitioning of residues begins to shed more light on how this protein functions. Taking chain B as an example, the bulk of the monomer is one co-moving bundle of helices, which we identify as community C6. This community forms one wall of the catalytic crevice that accommodates dNTPs, and indeed forms many of the stabilizing sidechain interactions that select and hold the dNTP in place, including those of the catalytically critical residues D311, H206 and D207. However, C6 makes only sparing contacts with the two allosteric sites. The “inside” face of the catalytic pocket is formed by community C8, which is essentially helix 14 (E355-A373). This community not only makes close contacts with the substrate dNTP (Y374), but also makes direct contacts with both allosteric pockets, with one end of the helix helping to stabilize the dNTP at one end (R372) and the other end stabilizing the dATP in that pocket (N358). Further, helix 14 (E355-A373), which forms this community, makes two hydrogen bonds (D361-R372 and N358-R372) at either end with the equivalent helix from the adjacent monomer. Thus, community C8 is responsible not just for inward allosteric communication from the allosteric pocket to the catalytic core, but also for outward communication to the adjacent monomer.

Community C7 forms the “roof” of the catalytic pocket and community C9 forms the outer wall. Viewed in isolation, community C9 can be readily seen to compose the bulk of the CTD of the SAMHD1 monomer (chain B). It is not surprising that this domain would be a co-moving entity. Looking closer, however, C9 actually comprises two large segments of the CTD but does not include the 25 linker residues that connect them. Surprisingly, it does reach across the tetramer interface and includes the corresponding linker residues from the adjacent monomer (in this case, chain D).
Thus, the two CTDs of adjacent monomers are involved in an exquisitely reciprocal allosteric handshake. It is also worth noting that this allosteric communication, as well as the hydrogen bonding between communities C3-C12 and C8-C15 (helix E355-A373) of opposing monomers, spans the chain A/C and chain B/D “tetramer” interface and not the chain A/D and chain B/C “dimer” interface. This might yield an elegant explanation for why the SAMHD1 dimer is catalytically inactive: the allosteric breathing motions that this protein undergoes are only truly seen in the tetrameric state. In that context, we performed an experimental case study in which we mutated the catalytic residue D311 to A. This catalytically dead mutant showed a kinetic, but not a thermodynamic, defect in tetramerization, with the assembly half-life increased by almost an order of magnitude. This is an extension of previous data reported by Bhattacharya29 and Wang30.

The next phase of analysis involved drilling down from identifying co-moving groups of residues to identifying specific pathways and nodes through which allosteric information travels. As expected, we found several pathways for information flow from the catalytic sites to the allosteric sites within a monomer. However, path analysis proved truly revelatory when we investigated the information flow across monomeric units. We had already found communities C9/C10 to be involved in an allosteric handshake across monomers. Path analysis revealed even more fine-grained results: residues I530 (chain B) and V586 (chain D) are vital nodes for information flow across monomers, which is largely transduced along surface pathways. These are located close to R528 (chain B) and D585 (chain D), which are involved in intermittent hydrogen bonding.

Finally, the question of SAMHD1 regulation by phosphorylation remains. Previous reports that this leads to structural collapse of the CTD have been disputed. Our results indicate that allosteric signals travelling from the surface-exposed cdk-1 phosphorylation site, T592, to the catalytic core converge and funnel through a critical bottleneck: N452-K455. In terms of the community partitioning table, this is the link between communities C7 and C9 (chain B). Structurally, this is the connector between the minor lobe and the CTD of the SAMHD1 monomer. Clearly, this short linker is of significant regulatory importance, as it transmits the effect of cdk-1 phosphorylation from the CTD to the entirety of the protein tertiary structure. This is also reflected in the high “centrality” of these residues. We suggest mutagenesis studies on the linker residues (452-455), both to study restriction defects and to assay in vitro dNTPase activity. However, we have yet to uncover the consequences of this regulatory signal. We know that phosphomimetic mutants of SAMHD1 are restriction incompetent, but they are dNTPase capable. Thus, this regulatory signal potentially triggers some other, hitherto undiscovered property of this protein, which may be an alternate HIV-1 restriction mechanism.

Conclusion

In a previous empirical study32, we explored the effect of GTP/dNTP occupancy and vacancy in the Allosites of the SAMHD1 tetramer. We uncovered several hot-spots for protein stability and also found that the T592E mutation leads to elevated local motions but does not break down the tertiary structure of SAMHD1 on the timescales investigated. In this study, we have delved into the mechanistic basis for the phenomenological observations of our previous study. In other words, we have revealed the structural linkages and communication channels that underpin the function of SAMHD1 as a molecular engine.

SAMHD1 is an enormously complex protein system. Since its discovery about 6 years ago, intense efforts have been made by immunologists, biochemists and structural biologists to uncover its secrets. Various plausible models have been proposed, involving, for instance, the formation of a monomer-dimer equilibrium, which can be driven further to a stable, activated tetramer by dNTPs. However, even the best structural biology toolkits cannot overcome the fact that this is a 245 kDa tetramer. NMR studies are extremely difficult, if not impossible, at this molecular weight. X-ray crystallography provides only still images of protein “eigenstates”; it registers dynamics only as the absence of electron density. Thus, there is a paucity of information to work with. This is where our molecular dynamics studies come in. Starting from high-resolution X-ray structures, we have uncovered pathways of allosteric communication which show how the monomeric units of the active tetramer communicate via a reciprocal “handshake”. We have also demonstrated that mutations at the catalytic site affect the kinetics of tetramer assembly (but not its equilibrium thermodynamics), and we have found the avenue taken by a phospho-regulatory signal to the core of the protein. PCA has revealed that the protein undergoes breathing motions while the quaternary structure stays intact.

However, we are still limited to relatively short MD trajectory timescales. The dynamics of this protein on the millisecond timescale remain unexplained. In that context, there is a need for more solution-state biophysical data, such as from real-time FPLC-Small Angle X-ray Scattering, which would reveal information about the average particle size. In conclusion, while our MD analysis has yielded insights that complement and expand X-ray and biochemistry-based models of SAMHD1 activity, we have just scratched the surface. Much remains to be done in terms of uncovering the regulatory mechanism, elucidating the role of nucleic acid binding and beyond.

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI:

Acknowledgements

This work was supported by the Seed Grant (IIT Bombay), Start-Up Grant (IIT Guwahati), Department of Biotechnology Grant No. BT/PR8311/BID/7/451/2013 (S.B.) and National PARAM Supercomputing Facility (S.B.). A.B. acknowledges Prof. Dmitri N. Ivanov at the University of Texas Health Science Center at San Antonio for use of lab facilities as well as useful discussions.

References

(1) Rice, G. I.; Bond, J.; Asipu, A.; Brunette, R. L.; Manfield, I. W.; Carr, I. M.; Fuller, J. C.; Jackson, R. M.; Lamb, T.; Briggs, T. A.; Ali, M.; Gornall, H.; Couthard, L. R.; Aeby, A.; Attard-Montalto, S. P.; Bertini, E.; Bodemer, C.; Brockmann, K.; Brueton, L. A.; Corry, P. C.; Desguerre, I.; Fazzi, E.; Cazorla, A. G.; Gener, B.; Hamel, B. C. J.; Heiberg, A.; Hunter, M.; van der Knaap, M. S.; Kumar, R.; Lagae, L.; Landrieu, P. G.; Lourenco, C. M.; Marom, D.; McDermott, M. F.; van der Merwe, W.; Orcesi, S.; Prendiville, J. S.; Rasmussen, M.; Shalev, S. A.; Soler, D. M.; Shinawi, M.; Spiegel, R.; Tan, T. Y.; Vanderver, A.; Wakeling, E. L.; Wassmer, E.; Whittaker, E.; Lebon, P.; Stetson, D. B.; Bonthron, D. T.; Crow, Y. J. Mutations Involved in Aicardi-Goutières Syndrome Implicate SAMHD1 as Regulator of the Innate Immune Response. Nat. Genet. 2009, 41, 829–832.

(2) Goldstone, D. C.; Ennis-Adeniran, V.; Hedden, J. J.; Groom, H. C.; Rice, G. I.; Christodoulou, E.; Walker, P. A.; Kelly, G.; Haire, L. F.; Yap, M. W.; de Carvalho, L. P.; Stoye, J. P.; Crow, Y. J.; Taylor, I. A.; Webb, M. HIV-1 Restriction Factor SAMHD1 Is a Deoxynucleoside Triphosphate Triphosphohydrolase. Nature 2011, 480, 379–382.

(3) Hrecka, K.; Hao, C.; Gierszewska, M.; Swanson, S. K.; Kesik-Brodacka, M.; Srivastava, S.; Florens, L.; Washburn, M. P.; Skowronski, J. Vpx Relieves Inhibition of HIV-1 Infection of Macrophages Mediated by the SAMHD1 Protein. Nature 2011, 474, 658–661.

(4) Laguette, N.; Sobhian, B.; Casartelli, N.; Ringeard, M.; Chable-Bessia, C.; Ségéral, E.; Yatim, A.; Emiliani, S.; Schwartz, O.; Benkirane, M. SAMHD1 Is the Dendritic- and Myeloid-Cell-Specific HIV-1 Restriction Factor Counteracted by Vpx. Nature 2011, 474, 654–657.

(5) Leavy, O. Antiviral Immunity: SAMHD1 — Stopping HIV in Its Tracks. Nat. Rev. Immunol. 2011, 11, 440–440.

(6) Diamond, T. L.; Roshal, M.; Jamburuthugoda, V. K.; Reynolds, H. M.; Merriam, A. R.; Lee, K. Y.; Balakrishnan, M.; Bambara, R. A.; Planelles, V.; Dewhurst, S.; Kim, B. Macrophage Tropism of HIV-1 Depends on Efficient Cellular dNTP Utilization by Reverse Transcriptase. J. Biol. Chem. 2004, 279, 51545–51553.

(7) Powell, R. D.; Holland, P. J.; Hollis, T.; Perrino, F. W. Aicardi-Goutières Syndrome Gene and HIV-1 Restriction Factor SAMHD1 Is a dGTP-Regulated Deoxynucleotide Triphosphohydrolase. J. Biol. Chem. 2011, 286, 43596–43600.

(8) Ayinde, D.; Casartelli, N.; Schwartz, O. Restricting HIV the SAMHD1 Way: Through Nucleotide Starvation. Nat. Rev. Microbiol. 2012, 10, 675–680.

(9) Kennedy, E. M.; Gavegnano, C.; Nguyen, L.; Slater, R.; Lucas, A.; Fromentin, E.; Schinazi, R. F.; Kim, B. Ribonucleoside Triphosphates as Substrate of Human Immunodeficiency Virus Type 1 Reverse Transcriptase in Human Macrophages. J. Biol. Chem. 2010, 285, 39380–39391.

(10) Baldauf, H.-M.; Pan, X.; Erikson, E.; Schmidt, S.; Daddacha, W.; Burggraf, M.; Schenkova, K.; Ambiel, I.; Wabnitz, G.; Gramberg, T. SAMHD1 Restricts HIV-1 Infection in Resting CD4+ T Cells. Nat. Med. 2012, 18, 1682–1689.

(11) Descours, B.; Cribier, A.; Chable-Bessia, C.; Ayinde, D.; Rice, G.; Crow, Y.; Yatim, A.; Schwartz, O.; Laguette, N.; Benkirane, M. SAMHD1 Restricts HIV-1 Reverse Transcription in Quiescent CD4+ T-Cells. Retrovirology 2012, 9, 87.

(12) Kim, B.; Nguyen, L. A.; Daddacha, W.; Hollenbaugh, J. A. Tight Interplay among SAMHD1 Protein Level, Cellular dNTP Levels, and HIV-1 Proviral DNA Synthesis Kinetics in Human Primary Monocyte-Derived Macrophages. J. Biol. Chem. 2012, 287, 21570–21574.

(13) Wu, L. SAMHD1: A New Contributor to HIV-1 Restriction in Resting CD4+ T-Cells. Retrovirology 2012, 9, 88.

(14) Hollenbaugh, J. A.; Gee, P.; Baker, J.; Daly, M. B.; Amie, S. M.; Tate, J.; Kasai, N.; Kanemura, Y.; Kim, D.-H.; Ward, B. M.; Koyanagi, Y.; Kim, B. Host Factor SAMHD1 Restricts DNA Viruses in Non-Dividing Myeloid Cells. PLoS Pathog. 2013, 9, e1003481.

(15) Ji, X.; Wu, Y.; Yan, J.; Mehrens, J.; Yang, H.; DeLucia, M.; Hao, C.; Gronenborn, A. M.; Skowronski, J.; Ahn, J.; Xiong, Y. Mechanism of Allosteric Activation of SAMHD1 by dGTP. Nat. Struct. Mol. Biol. 2013, 20, 1304–1309.

(16) Yan, J.; Kaur, S.; DeLucia, M.; Hao, C.; Mehrens, J.; Wang, C.; Golczak, M.; Palczewski, K.; Gronenborn, A. M.; Ahn, J.; Skowronski, J. Tetramerization of SAMHD1 Is Required for Biological Activity and Inhibition of HIV Infection. J. Biol. Chem. 2013, 288, 10406–10417.

(17) Zhu, C.; Gao, W.; Zhao, K.; Qin, X.; Zhang, Y.; Peng, X.; Zhang, L.; Dong, Y.; Zhang, W.; Li, P.; Wei, W.; Gong, Y.; Yu, X.-F. Structural Insight into dGTP-Dependent Activation of Tetrameric SAMHD1 Deoxynucleoside Triphosphate Triphosphohydrolase. Nat. Commun. 2013, 4, 829–832.

(18) Koharudin, L. M. I.; Wu, Y.; DeLucia, M.; Mehrens, J.; Gronenborn, A. M.; Ahn, J. Structural Basis of Allosteric Activation of Sterile α-Motif and Histidine-Aspartate Domain-Containing Protein 1 (SAMHD1) by Nucleoside Triphosphates. J. Biol. Chem. 2014, 289, 32617–32627.

(19) Miazzi, C.; Ferraro, P.; Pontarin, G.; Rampazzo, C.; Reichard, P.; Bianchi, V. Allosteric Regulation of the Human and Mouse Deoxyribonucleotide Triphosphohydrolase Sterile α-Motif/histidine-Aspartate Domain-Containing Protein 1 (SAMHD1). J. Biol. Chem. 2014, 289, 18339–18346.

(20) Zhu, C.-F.; Wei, W.; Peng, X.; Dong, Y.-H.; Gong, Y.; Yu, X.-F. The Mechanism of Substrate-Controlled Allosteric Regulation of SAMHD1 Activated by GTP. Acta Crystallogr. Sect. D Biol. Crystallogr. 2015, 71, 516–524.

(21) Beloglazova, N.; Flick, R.; Tchigvintsev, A.; Brown, G.; Popovic, A.; Nocek, B.; Yakunin, A. F. Nuclease Activity of the Human SAMHD1 Protein Implicated in the Aicardi-Goutieres Syndrome and HIV-1 Restriction. J. Biol. Chem. 2013, 288, 8101–8110.

(22) Ryoo, J.; Choi, J.; Oh, C.; Kim, S.; Seo, M.; Kim, S.-Y.; Seo, D.; Kim, J.; White, T. E.; Brandariz-Nuñez, A.; Diaz-Griffero, F.; Yun, C.-H.; Hollenbaugh, J. A.; Kim, B.; Baek, D.; Ahn, K. The Ribonuclease Activity of SAMHD1 Is Required for HIV-1 Restriction. Nat. Med. 2014, 20, 936–941.

(23) Choi, J.; Ryoo, J.; Oh, C.; Hwang, S.; Ahn, K. SAMHD1 Specifically Restricts Retroviruses through Its RNase Activity. Retrovirology 2015, 12, 46.

(24) Seamon, K. J.; Sun, Z.; Shlyakhtenko, L. S.; Lyubchenko, Y. L.; Stivers, J. T. SAMHD1 Is a Single-Stranded Nucleic Acid Binding Protein with No Active Site-Associated Nuclease Activity. Nucleic Acids Res. 2015, 1–14.

(25) DeLucia, M.; Mehrens, J.; Wu, Y.; Ahn, J. HIV-2 and SIVmac Accessory Virulence Factor Vpx down-Regulates SAMHD1 Enzyme Catalysis prior to Proteasome-Dependent Degradation. J. Biol. Chem. 2013, 288, 19116–19126.

(26) Hansen, E. C.; Seamon, K. J.; Cravens, S. L.; Stivers, J. T. GTP Activator and dNTP Substrates of HIV-1 Restriction Factor SAMHD1 Generate a Long-Lived Activated State. Proc. Natl. Acad. Sci. U. S. A. 2014, 111, E1843-51.

(27) Arnold, L. H.; Groom, H. C. T.; Kunzelmann, S.; Schwefel, D.; Caswell, S. J.; Ordonez, P.; Mann, M. C.; Rueschenbaum, S.; Goldstone, D. C.; Pennell, S.; Howell, S. A.; Stoye, J. P.; Webb, M.; Taylor, I. A.; Bishop, K. N. Phospho-Dependent Regulation of SAMHD1 Oligomerisation Couples Catalysis and Restriction. PLoS Pathog. 2015, 11, e1005194.

(28) Ji, X.; Tang, C.; Zhao, Q.; Wang, W.; Xiong, Y. Structural Basis of Cellular dNTP Regulation by SAMHD1. Proc. Natl. Acad. Sci. 2014, 111, E4305–E4314.

(29) Bhattacharya, A.; Wang, Z.; White, T.; Buffone, C.; Nguyen, L. A.; Shepard, C. N.; Kim, B.; Demeler, B.; Diaz-Griffero, F.; Ivanov, D. N. Effects of T592 Phosphomimetic Mutations on Tetramer Stability and dNTPase Activity of SAMHD1 Can Not Explain the Retroviral Restriction Defect. Sci. Rep. 2016, 6, 31353.

(30) Wang, Z.; Bhattacharya, A.; Villacorta, J.; Diaz-Griffero, F.; Ivanov, D. N. Allosteric Activation of SAMHD1 Protein by Deoxynucleotide Triphosphate (dNTP)-Dependent Tetramerization Requires dNTP Concentrations That Are Similar to dNTP Concentrations Observed in Cycling T Cells. J. Biol. Chem. 2016, 291, 21407–21413.

(31) Tugarinov, V.; Kay, L. E. An Isotope Labeling Strategy for Methyl TROSY Spectroscopy. J. Biomol. NMR 2004, 28, 165–172.

(32) Patra, K. K.; Bhattacharya, A.; Bhattacharya, S. Uncovering Allostery and Regulation in SAMHD1 through Molecular Dynamics Simulations. Proteins Struct. Funct. Bioinforma. 2017.

(33) Ghosh, A.; Vishveshwara, S. A Study of Communication Pathways in Methionyl-tRNA Synthetase by Molecular Dynamics Simulations and Structure Network Analysis. Proc. Natl. Acad. Sci. U. S. A. 2007, 104, 15711–15716.

(34) Kormos, B. L.; Baranger, A. M.; Beveridge, D. L. A Study of Collective Atomic Fluctuations and Cooperativity in the U1A–RNA Complex Based on Molecular Dynamics Simulations. J. Struct. Biol. 2007, 157, 500–513.

(35) Kormos, B. L.; Baranger, A. M.; Beveridge, D. L. Do Collective Atomic Fluctuations Account for Cooperative Effects? Molecular Dynamics Studies of the U1A−RNA Complex. J. Am. Chem. Soc. 2006, 128, 8992–8993.

(36) Scarabelli, G.; Grant, B. J. Kinesin-5 Allosteric Inhibitors Uncouple the Dynamics of Nucleotide, Microtubule, and Neck-Linker Binding Sites. Biophys. J. 2014, 107, 2204–2213.

(37) Rivalta, I.; Sultan, M. M.; Lee, N.-S.; Manley, G. A.; Loria, J. P.; Batista, V. S. Allosteric Pathways in Imidazole Glycerol Phosphate Synthase. Proc. Natl. Acad. Sci. U. S. A. 2012, 109, E1428-36.

(38) Phillips, J. C.; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid, E.; Villa, E.; Chipot, C.; Skeel, R. D.; Kalé, L. Scalable Molecular Dynamics with NAMD. J. Comput. Chem. 2005, 26, 1781–1802.

(39) Kalé, L.; Skeel, R.; Bhandarkar, M.; Brunner, R.; Gursoy, A.; Krawetz, N.; Phillips, J.; Shinozaki, A.; Varadarajan, K.; Schulten, K. NAMD2: Greater Scalability for Parallel Molecular Dynamics. J. Comput. Phys. 1999, 151, 283–312.

(40) Mackerell, A. D. Empirical Force Fields for Biological Macromolecules: Overview and Issues. J. Comput. Chem. 2004, 25, 1584–1604.

(41) Klauda, J. B.; Venable, R. M.; Freites, J. A.; O’Connor, J. W.; Tobias, D. J.; Mondragon-Ramirez, C.; Vorobyov, I.; MacKerell, A. D.; Pastor, R. W. Update of the CHARMM All-Atom Additive Force Field for Lipids: Validation on Six Lipid Types. J. Phys. Chem. B 2010, 114, 7830–7843.

(42) Grant, B. J.; Rodrigues, A. P. C.; ElSawy, K. M.; McCammon, J. A.; Caves, L. S. D. Bio3d: An R Package for the Comparative Analysis of Protein Structures. Bioinformatics 2006, 22, 2695–2696.

(43) Batcho, P. F.; Case, D. A.; Schlick, T. Optimized Particle-Mesh Ewald/multiple-Time Step Integration for Molecular Dynamics Simulations. J. Chem. Phys. 2001, 115, 4003–4018.

(44) Miyamoto, S.; Kollman, P. A. Settle: An Analytical Version of the SHAKE and RATTLE Algorithm for Rigid Water Models. J. Comput. Chem. 1992, 13, 952–962.

(45) Andersen, H. C. Rattle: A “Velocity” Version of the Shake Algorithm for Molecular Dynamics Calculations. J. Comput. Phys. 1983, 52, 24–34.

(46) Girvan, M.; Newman, M. E. J. Community Structure in Social and Biological Networks. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 7821–7826.

(47) Newman, M. E. J. Modularity and Community Structure in Networks. Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 8577–8582.

(48) Van Wart, A. T.; Durrant, J.; Votapka, L.; Amaro, R. E. Weighted Implementation of Suboptimal Paths (WISP): An Optimized Algorithm and Tool for Dynamical Network Analysis. J. Chem. Theory Comput. 2014, 10, 511–517.

(49) Tüngler, V.; Staroske, W.; Kind, B.; Dobrick, M.; Kretschmer, S.; Schmidt, F.; Krug, C.; Lorenz, M.; Chara, O.; Schwille, P.; Lee-Kirsch, M. A. Single-Stranded Nucleic Acids Promote SAMHD1 Complex Formation. J. Mol. Med. 2013, 91, 759–770.

(50) Mauney, C. H.; Rogers, L. C.; Harris, R. S.; Daniel, L. W.; Devarie-Baez, N. O.; Wu, H.; Furdui, C. M.; Poole, L. B.; Perrino, F. W.; Hollis, T. The SAMHD1 dNTP Triphosphohydrolase Is Controlled by a Redox Switch. Antioxid. Redox Signal. 2017, ars.2016.6888.

Abbreviations

SAMHD1: sterile alpha motif and HD domain-containing protein 1
HIV-1: human immunodeficiency virus-1
MD: molecular dynamics
dNTP: deoxynucleoside triphosphate
GTP: guanosine triphosphate
dATP: deoxyadenosine triphosphate
Catsite: catalytic site
Allosite: allosteric site
XRD: X-ray diffraction
wt: wild type
RMSD: root mean square deviation

TOC Image

[Figure residue, manuscript page 50: panels (a) and (b); correlation groups R1-R8 are labeled.]

[Figure residue, manuscript page 51: panels (a)-(c); communities C1-C15 are labeled, with C9 and C10 marked for chain B, together with dATP and GTP. Panel (c) labels residues T592, K544, E547 and Q539 of chains B and D.]

[Figure residue, manuscript page 52: panels (a)-(f); one plot shows the E547(CA)-Q539(CA) distance (Å) against time (ns) for the A-C, C-A, B-D and D-B pairs. The structural panels label communities, dATP, GTP and interface residues.]

[Figure residue, manuscript page 53: panel titles T592-H206 and D:T592-B:H206; labeled residues include F454, D383, M216, D:V586, B:I530, B:C350, B:E299 and H206.]

[Figure residue, manuscript page 54: panel titles B:D137 - C:Q375, B:D137 - D:Q375 and B:D137 - A:Q375; panel (d) plots Centrality against Residue No for chains A-D.]

[Figure residue, manuscript page 55: panels (a)-(d); plots of the D311(O)-H167(N) and I122(CA)-R318(CA) distances (Å) against time (ns) for chains A-D.]

[Figure residue, manuscript page 56: panels (a) and (b); Fluorescence Polarization against Time (seconds). Legend entries: WT, no dNTPs; WT, 500 dATP; D311A, no dNTPs; D311A, 500 dATP.]

[Figure residue, manuscript page 57: panels (a)-(f); histograms of Count against Path Length for B:D137-D:Q375 and D:D137-B:Q375 with the (452-455) linker deleted, without C350, and without the V586(I530/R531) edges.]

[Figure residue, manuscript page 58: panels (a)-(k); proportion of variance (%) against PCs, and Contribution to PC1 and PC2 against Residue Index for chains A and C and chains B and D (Trajectory 3). Panel labels include A-C PC1, A-C PC2, B-D PC1 and B-D PC2.]

[Figure residue, manuscript page 59: panels (a) wt and (b) T592E; community assignment (C1-C19) shown by Monomer (A-D) against Residue Index.]

[Figure residue, manuscript page 60: panels (a)-(f); distributions of frequency or Count against path length for wt and T592E, with panel titles (T/E)592-R372, (T/E)592-H206 Same Monomer, (T/E)592-H206 Adjacent Monomers and C522-R372 Same Monomer. Two structural panels, labeled wt and T592E, mark C522 and R372.]