
Elucidation of hydrogen bonding patterns in ligand-free, lactose- and glycerol-bound galectin-3C by neutron crystallography to guide drug design Francesco Manzoni, Johan Wallerstein, Tobias E. Schrader, Andreas Ostermann, Leighton Coates, Mikael Akke, Matthew P. Blakeley, Esko Oksanen, and Derek Thomas Logan J. Med. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.jmedchem.8b00081 • Publication Date (Web): 19 Apr 2018 Downloaded from http://pubs.acs.org on April 19, 2018


Journal of Medicinal Chemistry is published by the American Chemical Society, 1155 Sixteenth Street N.W., Washington, DC 20036. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Medicinal Chemistry

Elucidation of hydrogen bonding patterns in ligand-free, lactose- and glycerol-bound galectin-3C by neutron crystallography to guide drug design

Francesco Manzoni¹†, Johan Wallerstein², Tobias Schrader³, Andreas Ostermann⁴, Leighton Coates⁵, Mikael Akke², Matthew P. Blakeley⁶, Esko Oksanen¹,⁷ & Derek T. Logan¹

¹Dept. of Biochemistry & Structural Biology, Lund University, S-221 00 Lund, Sweden; ²Dept. of Biophysical Chemistry, Lund University, S-221 00 Lund, Sweden; ³Jülich Centre for Neutron Science (JCNS) at Heinz Maier-Leibnitz Zentrum (MLZ), Forschungszentrum Jülich GmbH, Lichtenbergstrasse 1, 85747 Garching, Germany; ⁴Heinz Maier-Leibnitz Zentrum (MLZ), Technische Universität München, Lichtenbergstrasse 1, 85748 Garching, Germany; ⁵Neutron Scattering Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, TN 37831, USA; ⁶Institut Laue-Langevin, 71, avenue des Martyrs, 38000, Grenoble, France; ⁷Instrument Division, European Spallation Source ERIC, Lund, Box 176, S-221 00 Lund, Sweden.

KEYWORDS: galectin-3, neutron crystallography, hydrogen bonding, protein-ligand interactions, carbohydrate recognition, drug design



†Francesco Manzoni passed away on 12th March 2017


ABSTRACT: The medically important drug target galectin-3 binds galactose-containing moieties on glycoproteins through an intricate pattern of hydrogen bonds to a largely polar, surface-exposed binding site. All successful inhibitors of galectin-3 to date have been based on mono- or disaccharide cores closely resembling natural ligands. A detailed understanding of the H-bonding networks in these natural ligands will provide an improved foundation for the design of novel inhibitors. Neutron crystallography is an ideal technique to reveal the geometry of hydrogen bonds, because the positions of hydrogen atoms are directly detected rather than being inferred from the positions of heavier atoms as in X-ray crystallography. We present three neutron crystal structures of the C-terminal carbohydrate recognition domain of galectin-3: the ligand-free form, the complex with the natural substrate lactose, and the complex with glycerol, which mimics important interactions made by lactose. The neutron crystal structures reveal unambiguously the exquisite fine-tuning of the hydrogen bonding pattern in the binding site to the natural disaccharide ligand. The ligand-free structure shows that most of these hydrogen bonds are preserved even when the polar groups of the ligand are replaced by water molecules. The protonation states of all histidine residues in the protein are also revealed and correlate well with NMR observations. The structures give a solid starting point for molecular dynamics simulations and computational estimates of ligand binding affinity that will inform future drug design.

INTRODUCTION

The specific binding of carbohydrate binding proteins (lectins) to glycoproteins or glycolipids directs a large variety of cellular processes, such as adhesion to other cells and trafficking of intracellular components. Hence there is great interest in understanding the detailed molecular mechanisms governing the binding specificity between lectins and their cognate carbohydrate chains. This binding generally involves many hydrogen bonds, whose directionality can be difficult to characterize, as well as many indirect and dynamic interactions via water molecules. Galectin-3 is a mammalian protein belonging to the galectin family, which has 15 members in mammals and is defined by affinity for glycans containing β-D-galactoside moieties¹. The protein is found both in the nucleus and the cytoplasm of cells, and it is also secreted extracellularly, where it interacts with β-galactoside-containing glycoproteins and glycolipids. Galectin-3 consists of two domains. The carbohydrate-recognition domain (CRD) is C-terminal, while the N-terminal domain is involved in oligomerization, probably through formation of a coiled-coil structure, a characteristic that is unique among the galectin family proteins¹. Recently, human galectin-3 has emerged as an interesting pharmaceutical target, owing to its involvement in various diseases, including inflammation, cancer proliferation and metastasis, and diabetes²⁻⁷. Many X-ray crystal structures of the C-terminal domain of galectin-3 (galectin-3C) in complex with synthetic ligands have been determined as a key tool in the structure-based design of new inhibitors⁸⁻¹⁶. The most successful candidates are derived from a core based on the natural substrate lactose, or the very similar dithiogalactoside core, and their sugar moieties conserve the same interactions with the protein residues as lactose. Enhanced affinity has instead been gained by the exploration of additional binding pockets without disturbing the disaccharide core. The structural basis for this inability to vary the core interactions is of fundamental interest.

In a wider context, we are aiming to understand the basic principles governing protein-ligand interactions, in terms of absolute and relative free energy differences upon ligand binding, entropy–enthalpy correlation and water dynamics, through the characterization of many novel, systematically varied synthetic inhibitors using a variety of structural and biophysical methods. A detailed experimentally-derived description of the geometry of the hydrogen bonds in the ligand binding site of galectin-3 can explain why it is so difficult to tamper with the disaccharide core. Experimentally determined hydrogen atom positions will provide an unambiguous foundation for molecular dynamics simulations, free energy perturbation calculations and quantum chemical calculations that inform drug design.

The relatively polar, surface-exposed ligand binding site of galectin-3 presents several residues that interact with the ligands through hydrogen bonds (Figure 1). Direct visualization of the H atoms using X-ray crystallography requires very high resolution data, usually beyond 1.1 Å¹⁷, and the visibility of H atoms depends strongly on the mobility of the heavy atoms to which they are attached, as measured by the atomic displacement parameters (ADPs)¹⁸. Even in very high-resolution crystal structures at < 0.7 Å resolution, only about half of the most ordered H atoms can typically be observed¹⁸. In contrast, neutron crystallography is a powerful experimental technique for providing this information even at more modest resolution. The hydrogen bonding patterns of water molecules are particularly hard to probe with X-ray crystallography, as they are often more mobile than the protein atoms, but an elucidation of water H-bonding is also fundamental to an understanding of the enthalpic and entropic effects of expulsion of ordered water molecules from the binding site. Here again, neutron crystallography is an invaluable tool.


Figure 1. Overview of the lactose binding site in galectin-3C. Galectin-3C is shown as a grey cartoon, with the side chains of important residues in the lactose recognition site shown as lines. Lactose is represented by sticks, with the carbon atoms colored yellow and oxygens colored red.

Recently, we presented a method to obtain large crystals of perdeuterated galectin-3C (up to 1.8 mm³) in various states, allowing collection of neutron diffraction data up to 1.6 Å resolution.¹⁹ Here, we present the first three neutron crystal structures of galectin-3C: unbound (apo) galectin-3C, and the complexes with the natural ligand lactose and the ligand mimic glycerol, which essentially corresponds to one half of a galactose molecule. These neutron structures provide novel information pinpointing the geometry and directionality of hydrogen bonds and the protonation states of titrating residues. The results are fundamental for our understanding of ligand binding to galectin-3 and will serve to guide future inhibitor design.


RESULTS

We have determined three neutron crystal structures of galectin-3C: the unbound form (apo, 1.75 Å) and the complexes with lactose (1.7 Å) and glycerol (1.7 Å). All three structures were determined from crystals produced at pD = 7.9 (pH 7.5). Data quality statistics for the lactose and glycerol complexes have been presented previously¹⁹ and statistics for the apo-galectin-3C dataset are given in Table 1. All structures were jointly refined against the neutron data and room-temperature X-ray data collected to much higher resolution from the same crystal. The room-temperature apo X-ray structure was also built and refined to the highest resolution of the X-ray data, since these extended to higher resolution (1.03 Å) than what we had achieved before (1.25 Å)²⁰. The refinement statistics are consistent with the resolution and data quality (Table 2)¹⁹. The backbone traces of the three structures are essentially identical and none of the protein side chains interacting with ligands differ in conformation between the three structures.

Hydrogen bonding in the lactose complex

The 2m|Fo|-D|Fc| X-ray map for the lactose complex at 1.1 Å resolution shows that the entire lactose molecule is well-ordered, consistent with earlier studies.²⁰ In the corresponding nuclear density map for the lactose complex, the galactose moiety is very well defined, but some parts of the glucose moiety are less clear due to a combination of higher mobility and cancellation effects from the aliphatic hydrogen atoms on lactose, which is not deuterated (Fig. 2). Nevertheless, the hydrogen-bonding network between the inward-facing hydroxyl groups of lactose and the protein is very clear, and the protonation states of interacting residues are unambiguously defined. The directionalities of the three key H-bonding groups on the inside of the lactose molecule can now be described (Fig. 2 and Fig. S1a). In the following discussion, -OH will be used to refer to the hydroxyl groups and -O when referring specifically to the oxygen atoms. The three most important polar groups are the 4-OH and 6-OH hydroxyl groups of galactose (GalOH4 and GalOH6) and the O3’ hydroxyl group of glucose (GlcOH3’). The GalO4 oxygen atom accepts an H-bond from NH2 of Arg162 (2.09 Å) and its hydrogen atom is directed towards NE2 of His158 (2.00 Å), which is singly protonated on ND1. GalO6 accepts an H-bond from the amino group of Asn174 (1.81 Å) and the hydrogen atom of GalO6H interacts with the carboxylate group of Glu184 (1.86 Å). GlcO3’ accepts one H-bond from atom NH1 of Arg162 (1.88 Å) and GlcOH3’ donates its H atom to the deprotonated carboxylate group of Glu184 (1.72 Å). All distances in parentheses are from the hydrogen atom to the acceptor atom. Finally, the O5 atom of galactose (GalO5), enclosed within the sugar ring, accepts an H-bond from atom NH2 of Arg162 (2.21 Å).

Two other polar hydrogen atoms on the outer face of lactose are clearly visible, although they do not form H-bonds directly with the protein, but rather with surrounding water molecules. In particular, we see that the outward-pointing GalO3 accepts an H-bond from a water molecule (1.96 Å), which simultaneously accepts an H-bond from NH2 of Arg144 (1.88 Å), thus bridging an interaction between lactose and the protein. GalOH3 donates its H-atom to a water molecule (2.09 Å), which interacts with other crystallographic water molecules further out of the binding site through an ordered H-bond network. The GlcOH2’ hydroxyl group is also clearly oriented, though it does not H-bond to any protein residue or water molecule. In contrast, nuclear density for the OH6’ group of glucose is not well-defined.

Interestingly, binding of lactose to the protein apparently leads to minor conformational strain within lactose. The geometry imposed by H-bonding to His158 puts the hydroxyl hydrogen atom in GalOH4 into an energetically unfavorable eclipsed conformation (dihedral angle 5°) with respect to the aliphatic hydrogen atom on C4 (Fig. 2b). In contrast, the GalOH6 and GlcOH3’ hydrogen atoms are almost completely staggered (dihedral angles 180° and 172° respectively). On the outer face of lactose, the GalOH3 group is in a gauche conformation (dihedral 43°) and the GalOH2 group is staggered (164°).
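The hydroxyl torsion angles quoted above (for example the eclipsed 5° H-C-O-H dihedral of GalOH4) can be recomputed from any four atomic positions. The following is a minimal sketch in Python, not the procedure used by the authors. The function names, the toy coordinates and the angular thresholds used to label a torsion as eclipsed, gauche or staggered are all illustrative assumptions.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle p0-p1-p2-p3 in degrees (range -180..180)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)          # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

def classify_torsion(angle_deg):
    """Label a torsion; the thresholds are illustrative assumptions."""
    a = abs(angle_deg)
    if a < 30:
        return "eclipsed"
    if a < 90:
        return "gauche"
    if a >= 150:
        return "staggered"
    return "intermediate"

# Toy H-C-O-H geometry (hypothetical coordinates, in Å), built so that the
# hydroxyl hydrogen lies 5 degrees away from the aliphatic hydrogen.
phi = math.radians(5.0)
h_on_c = (1.0, 0.0, 0.0)
c_atom = (0.0, 0.0, 0.0)
o_atom = (0.0, 0.0, 1.4)
h_on_o = (math.cos(phi), math.sin(phi), 1.7)

angle = dihedral_deg(h_on_c, c_atom, o_atom, h_on_o)
print(round(angle, 1), classify_torsion(angle))
```

Applied to the values reported in the text, such a classifier would label the 5° GalOH4 torsion as eclipsed, the 43° GalOH3 torsion as gauche, and the 164° to 180° torsions as staggered.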


Figure 2. a) The complex of perdeuterated galectin-3C with lactose at 1.7 Å resolution. The 2m|Fo|-D|Fc| nuclear density map, calculated on-the-fly in PyMOL, is shown as a grey mesh contoured at a level of 0.6 σ. The lactose molecule is shown as thick sticks; protein side chains and water molecules that interact with them and lactose are shown as thin sticks. Deuterium atoms are coloured yellow; hydrogen atoms on the non-perdeuterated lactose molecule are coloured white. Hydrogen bonds relevant for the discussion are shown as light blue dotted lines with distances between the H atom and the H-bond acceptor. This view is chosen to highlight the interactions made by the outer hydroxyl groups of lactose, as well as Arg162. b) Another view of the binding pocket, rotated approximately 90° around the horizontal axis with respect to panel a. This view is chosen to highlight the interactions of GalOH4, GalOH6 and GlcOH3’ with His158, Asn174, Glu184 and Arg186 on the inner side of the binding pocket. Arg162 is not shown for clarity.

Glycerol closely mimics the inner side of lactose

In the glycerol complex, the six heavy atoms of glycerol completely overlap the inner part of the galactose moiety of lactose (Fig. 3), as previously observed by us²⁰ and others²¹. However, the present study allows localization of the hydrogen atoms for the first time. Both aliphatic and polar D atoms of the d8-glycerol molecule are visible, so, unlike in the lactose complex, there are no cancellation effects from aliphatic H atoms. The interactions of glycerol with galectin-3C are essentially identical to those of galactose: O1 (equivalent to GalO4 in lactose) donates its H-atom to NE2 of His158 (1.84 Å) and accepts an H-bond from NH2 of Arg162 (2.14 Å); O3 (equivalent to GalO6) donates its H-atom to OE2 of Glu184 (1.82 Å) while accepting an H-bond from ND2 of Asn174 (1.88 Å), with the same geometry as in the lactose complex. Interestingly, the H-atom in the remaining hydroxyl group (HO2; equivalent to the ring oxygen GalO5 in lactose) has a very clear directionality, despite not donating an H-bond to any protein functional group or water molecule. This appears to be due to the fact that O2 accepts an H-bond from NH2 of Arg162 (2.17 Å), placing tight restraints on the orbital orientations in O2. The angle between HH22 on Arg162, O2 and HO2 on glycerol is 111°, consistent with very good overlap of the lone pairs on O2 with the σ orbitals of HH22 (Fig. 3a).

The position of GlcOH3’ in the lactose structure is occupied in the glycerol complex by a well-ordered water molecule, which accepts an H-bond from NH1 of Arg162 (1.86 Å) and donates one to OE2 of Glu184 (1.68 Å). This arrangement strongly mimics the interaction made by GlcOH3’.
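The two kinds of geometric quantity used in this section, hydrogen-to-acceptor distances and angles such as the 111° HH22···O2-HO2 angle, are straightforward to compute once hydrogen (deuterium) positions are available. The sketch below is a hedged illustration only: the helper names, the toy coordinates and the cutoff values in `looks_like_hbond` are assumptions made for the example and are not taken from the paper. It relies on `math.dist` and multi-argument `math.hypot`, which require Python 3.8 or later.

```python
import math

def distance(a, b):
    """Euclidean distance between two points (same unit as the input, e.g. Å)."""
    return math.dist(a, b)

def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees."""
    va = [x - v for x, v in zip(a, vertex)]
    vc = [x - v for x, v in zip(c, vertex)]
    cos_angle = sum(p * q for p, q in zip(va, vc)) / (math.hypot(*va) * math.hypot(*vc))
    cos_angle = max(-1.0, min(1.0, cos_angle))   # guard against rounding error
    return math.degrees(math.acos(cos_angle))

def looks_like_hbond(donor, hydrogen, acceptor, max_h_a=2.5, min_dha=120.0):
    """Simple geometric filter; both cutoffs are illustrative assumptions."""
    return (distance(hydrogen, acceptor) <= max_h_a
            and angle_deg(donor, hydrogen, acceptor) >= min_dha)

# Toy donor-H...acceptor arrangement (hypothetical coordinates, in Å)
donor = (0.0, 0.0, 0.0)
hydrogen = (1.0, 0.0, 0.0)
acceptor = (2.9, 0.3, 0.0)

print(round(distance(hydrogen, acceptor), 2))          # H...acceptor distance
print(round(angle_deg(donor, hydrogen, acceptor), 1))  # donor-H...acceptor angle
print(looks_like_hbond(donor, hydrogen, acceptor))
```

The same `angle_deg` helper, called with an arginine hydrogen, the acceptor oxygen as vertex and the hydroxyl hydrogen, gives the kind of angle at the acceptor that is quoted above for glycerol O2.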

Figure 3. a) Superimposed neutron crystal structures of the complexes of galectin-3C with glycerol at 1.7 Å resolution (solid) and lactose (transparent). The 2m|Fo|-D|Fc| nuclear density map, shown as a grey mesh, is contoured at a level of 0.8 σ. The glycerol molecule is shown as grey sticks. The lactose molecule is shown as a faint shadow for comparison. The perfect overlap between glycerol and parts of the galactose moiety of lactose is evident. b) View rotated approximately 90° around a horizontal axis, as in Fig. 1B. The color scheme is as in Fig. 1.

Three water molecules in the apo form replace key H-bonding moieties in lactose

In the apo (ligand-free) neutron crystal structure at 1.75 Å resolution, only three water molecules are clearly visible in the binding site in the 2m|Fo|-D|Fc| neutron map (Fig. 4). The water molecules are bound at the positions of the inner oxygen atoms of bound lactose. Again, this has been observed previously²⁰⁻²¹, but hydrogen atom positions are identified for the first time here using neutron diffraction. The GalOH4 position is occupied by W1, GalOH6 by W2 and GlcOH3’ by W3. The isotropic B-factors of the three water molecules are similar (52.0, 41.9 and 47.5 Å² respectively). At 1.03 Å, the full resolution of the X-ray data, a further three water molecules are observed at lower electron density levels (down to 0.6 σ) at the positions of GalC6, GalO1 and GalO5 (Fig. 4d). These three water molecules were also seen in our previous X-ray structure of apo-galectin-3C at 1.25 Å²⁰. The water at GalC6 appears to be a low-occupancy alternative conformation for W2. A further two water molecules were seen in the binding site in the 1.25 Å structure, but these may have been due to traces of lactose in the protein preparations, as noted previously.¹⁹,²¹ Here, we will describe only the waters visible in the nuclear density maps.

The H-bonding directionality of W2 is clearly determined (Fig. 4b), showing a nuclear density peak tilted towards Glu184, similarly to the hydroxyl group GalO6 of lactose. W2 accepts an H-bond from Asn174 (1.95 Å) and donates one to Glu184 (2.47 Å). W3 also shows clear nuclear density, with H-bond orientation similar to that in the glycerol and lactose complexes. It accepts an H-bond from Arg162 (1.90 Å) and donates one to Glu184 (1.81 Å; Fig 4a,b). The nuclear density for the deuterium atoms on W1 is weaker; nevertheless, it is possible to model an orientation in which a single, relatively long H-bond (2.33 Å) is made to His158 (Fig. 4a,b). The second proton appears disordered. This water molecule also accepts a hydrogen bond from Arg162 (2.16 Å), and it is clearly visible in the X-ray map (Fig. 4c), although it refines with significant anisotropy, being disordered in the imidazole plane of His158. Its lack of clarity in the nuclear density map may be due to a combination of this mobility and limitations in the neutron diffraction data.

Interestingly, the hydrogen bond of W2 rotates by about 45° away from OE2 compared to the two other structures, lengthening the H-bond from 1.90 Å in the lactose complex and 1.82 Å in the glycerol complex to 2.47 Å. Likewise, the H-bond of W3 rotates away from Glu184, but the H-bond distance stays approximately the same, at 1.81 Å compared to 1.72 Å in lactose. The H-bond orientation of the corresponding water molecule in the glycerol complex is more similar to that in lactose. This indicates higher conformational freedom in the mobile water molecules than in the ligands.
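To put the isotropic B-factors of the three water molecules on a more intuitive scale, they can be converted to root-mean-square displacements with the standard relation B = 8π²⟨u²⟩, where ⟨u²⟩ is the mean-square displacement along one direction. The short sketch below is an illustration added for orientation rather than part of the authors' analysis; the function name is hypothetical, and the conversion assumes purely isotropic, harmonic motion, which the anisotropy reported for W1 suggests is only an approximation.

```python
import math

def rms_displacement(b_iso):
    """Root-mean-square displacement (Å) along one direction for an isotropic B-factor (Å^2)."""
    return math.sqrt(b_iso / (8.0 * math.pi ** 2))

# Isotropic B-factors of W1, W2 and W3 as quoted in the text (Å^2)
water_b_factors = {"W1": 52.0, "W2": 41.9, "W3": 47.5}

for name, b_iso in water_b_factors.items():
    print(f"{name}: B = {b_iso:.1f} Å^2, rms displacement = {rms_displacement(b_iso):.2f} Å")
```

Under these assumptions all three waters have rms displacements of roughly 0.7 to 0.8 Å, which is in line with the statement that their B-factors are similar.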


Figure 4. Neutron and X-ray structures of ligand-free galectin-3C. The color scheme is as in Figs. 1 and 2. a) The 2m|Fo|-D|Fc| nuclear density map is contoured at a level of 0.7 σ. Water molecules in the binding site are shown as sticks, except for W1. b) View rotated approximately 90° around a horizontal axis, as in Fig. 1B. The lactose molecule is shown as a faint shadow for comparison. c) The 2m|Fo|-D|Fc| electron density map for the same region, for comparison. The resolution has been artificially limited to 1.8 Å. The ellipsoids show the anisotropic atomic displacement parameters for the water molecules derived from the X-ray data to 1.03 Å, at a probability level of 0.5. d) Electron density map at 1.03 Å, showing the extra water molecules at positions corresponding to lactose oxygen atoms found at higher resolution.

Protonation states of histidine residues – correlation between NMR and neutron observations

Histidine side chains have two nitrogen atoms that can be protonated, ND1 and NE2. The pKa values of these atoms are similar (6.6–7.0) and close to neutral pH. Therefore, there are three possible tautomeric states: the H–NE2 tautomer, which is the most stable non-charged state; the rare H–ND1 tautomer, which typically occurs only when stabilized by specific interactions; and the positively charged state, in which both NE2 and ND1 are protonated.²² It is necessary to determine the protonation state in order to understand function and to provide correct starting models for computational studies. The combination of neutron crystallography and NMR spectroscopy provides a unique opportunity to determine the protonation state of His residues in a protein. The NMR experiments show that the tautomers present in solution at pH 7.5 (the value used for crystallization was pD 7.9, equivalent to pH 7.5) agree well with those observed in the neutron structures (Table 3).

State | Method | His158 | His208 | His217 | His223
lactose | neutron | ND1 | NE2; ND1? | Both | NE2
glycerol | neutron | ND1 | NE2; ND1? | NE2 | NE2
apo | neutron | ND1 | NE2; ND1? | NE2 | NE2
lactose | NMR | ND1 (100%) | NE2 (major), ND1 (minor), charged (5%) | NE2 (major), ND1 (minor), charged (1%) | NE2 (100%)

Table 3. Protonated N atoms in the sidechains of the four His residues in galectin-3C according to the neutron structures and the NMR measurements.

There is complete agreement between all neutron structures and the NMR experiments for His158 in the carbohydrate binding site (Fig. S2a) and for His223 (Fig. S2b). Both methods unequivocally show that His158 is singly protonated on the ND1 atom, whereas His223 is singly protonated on NE2. The presence of this unusual H-ND1 tautomer is consistent with the fact that NE2 of His158 is a critical H-bond acceptor for HO4 of lactose. Furthermore, the rare tautomer is stabilized by a hydrogen bond to the unprotonated carboxylate side chain of Asp148. His223 is located far from the binding site. Its protonation state is determined by an H-bond from NE2 to the main-chain carbonyl group of Glu205 (Fig. S2b). This interaction holds together two loops at the end of the β-sandwich. ND1 of His223 is the H-bond acceptor for the end of a chain of hydrogen-bonded water molecules.

For His208, the NMR experiments show that its pKa is 6.20, which yields a population of the charged state of 5% at pH 7.5. Furthermore, both neutral tautomers are also present, with a major population of NE2 and a minor population of ND1. Thus, all three tautomers are present in solution. These observations are augmented by the neutron crystal structures, which show that this residue has two distinct conformations in all structures (Fig. S2d), which refine well with approximately equal occupancy. In both conformations, NE2 appears to be protonated, but the overlap of the two conformations prevents us from unambiguously determining the protonation state of the ND1 atom, or indeed which of the two possible orientations, related by a 180° flip, the side chains adopt. In one of the conformations, NE2 donates a hydrogen bond to the unprotonated carboxylate side chain of Glu205; in the other, it makes no clear interactions. ND1 makes no interactions with surrounding residues in either conformation.

For His217, the NMR results indicate the presence of all three tautomers: one major state with NE2 protonated, a lower-populated state with ND1 protonated, and a very small population (1%) of the charged state. In the lactose neutron crystal structure, His217 appears to be doubly protonated. If this structure is refined with His217 protonated only on the NE2 atom, a positive m|Fo|-D|Fc| difference density peak is seen at the position of the HD1 atom. However, the structure refines equally well with the deuterium atoms at full or half occupancy, thus the neutron data are not sufficient to quantify the relative populations. In contrast to His208, His217 accommodates the different tautomers without any changes in side chain orientation. The results for the apo and glycerol neutron structures differ from lactose: the residue appears to be protonated only on NE2. In the apo structure there are some indications of partial protonation of ND1 (Fig. S1c), which is also seen in the 0.86 Å X-ray structure at 100 K²⁰. In summary, the NMR results broadly agree with the distribution seen in the neutron structures in that the dominant species is protonated on NE2.
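The 5% charged population quoted for His208 follows from its pKa of 6.20 through the Henderson-Hasselbalch relation. A minimal sketch, assuming a single titrating site with ideal two-state behaviour: the function names are illustrative, and the +0.4 offset between pD and the pH-meter reading is the commonly used empirical correction, which is consistent with the pD 7.9 / pH 7.5 pair quoted in this paper but is not stated explicitly by the authors.

```python
def charged_fraction(pka, ph):
    """Fraction of a single titrating site that is protonated (charged) at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

def pd_from_ph_reading(ph_reading, offset=0.4):
    """Approximate pD from a pH-meter reading; the 0.4 offset is an assumed empirical correction."""
    return ph_reading + offset

# His208: pKa 6.20 from the NMR experiments, evaluated at pH 7.5
his208 = charged_fraction(pka=6.20, ph=7.5)
print(f"His208 charged fraction at pH 7.5: {his208:.1%}")  # close to the 5% quoted in the text

print(f"pD for a pH reading of 7.5: {pd_from_ph_reading(7.5):.1f}")
```

Because the charged fraction falls by a factor of about ten per pH unit above the pKa, small populations such as those in Table 3 are very sensitive to the exact pKa value.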


DISCUSSION

The three neutron crystal structures of galectin-3C presented here identify experimentally, for the first time, the positions of key hydrogen atoms in ligand complexes and in the apo protein. They thereby highlight the exquisite fine-tuning of H-bonding interactions in the carbohydrate binding site, which exploits all available interaction possibilities in the disaccharide core, and they provide experimental support for geometry that could only be inferred until now. Even in the highest-resolution X-ray structure obtained to date, the 0.86 Å structure of the lactose complex,20 the H-atom positions are much more ambiguous. The protonation state of His158 is discernible, and some of the aliphatic H-atoms on the galactose moiety are visible, but the only well-defined H-bond is the one between GalOH4 and His158. Thus, combining neutron crystallography with X-ray crystallography to very high resolution provides accurate heavy-atom positions while allowing unambiguous localization of hydrogen.

For the most part, the important H-bonds are achieved while maintaining an undistorted, low-energy conformation of the ligands, with gauche or staggered conformations around the C-O bond in the ligand hydroxyl groups. The exception is the H-bond made between GalOH4 and His158, in which the hydrogen atom is in an eclipsed geometry with respect to the other end of the C-O bond.

As previously observed,20-21 the non-natural ligand glycerol closely mimics the inner side of the more tightly bound of the two sugar moieties in lactose, namely galactose. The H-bonds made by the OH4 and OH6 groups of glycerol are remarkably similar to those made by GalOH4 and GalOH6 in lactose. In both cases they show high directionality, suggesting that it would be hard to replace these moieties with functional groups that make less directional interactions with the protein, such as halogens, while maintaining tight binding of synthetic inhibitors. The results confirm that inhibitor design has been wise to focus on exploitation of additional nearby binding pockets on the surface of galectin-3 rather than modification of the core carbohydrate binding moieties.

The lactose and glycerol complexes also pinpoint for the first time a critical H-bond donation from an arginine side chain (Arg162) to the cyclic oxygen atom O5 of the core recognition motif, namely the galactose moiety in lactose. Despite not H-bonding to any external group, the H atom on O4 of glycerol is highly directional, emphasizing the necessity to orient the p-orbitals of O4, and thus GalO5 in lactose, towards Arg162. This is a further demonstration of the detailed fine-tuning of protein side chain interactions to exploit every H-bonding possibility on the carbohydrate core. The neutron structures described here have highlighted the central role of Arg162 in maintaining the conserved H-bond network, as it participates in three highly directional H-bonds to lactose. These results are in agreement with NMR experiments, which show that the guanidine group of this residue is not more flexible than the backbone amide groups.9

There is excellent agreement between the protonation states of galectin-3C histidine residues observed by neutron crystallography and by NMR. Of these, the functionally most interesting is His158 in the binding site. Both methods agree that the rare ND1-protonated tautomer of His158 is crucial for the recognition of the galactose O4 hydroxyl group. This tautomer is stabilized by an H-bond from ND1 to the unprotonated Asp148.

In the unbound form of galectin-3C, three water molecules occupy the three important sites corresponding to the inner oxygen atoms of bound lactose, which are critical for ligand recognition. The neutron crystal structure shows that two of these water molecules, namely W2
and W3, form highly directional H-bonds with the protein. Interestingly, these waters, in particular W3, are both replaced in the natural ligands by moieties that are able to form a strong and directional hydrogen bond, while still keeping the interaction with the positive NH donor. It is likely that the release of these waters upon ligand binding is associated with a favorable change in entropy. W1, although located in the same position as the GalO4 oxygen atom, shows a less well-defined neutron density. This water molecule is not more mobile than the others, but anisotropic atomic displacement parameters indicate that it moves in the same plane as the ring of His158. This suggests that W1 prefers to form an electrostatic interaction with His158, perhaps sacrificing strict H-bond directionality to gain entropy. Similar observations of buried water molecules with high disorder have been reported previously and have been related to increased entropy upon binding, compared to bulk water.23 This result suggests that it might be advantageous not to occupy this water site with ligand oxygens, so as not to trigger an entropically unfavorable water release.

CONCLUSION

Taken together, the present results provide a detailed picture of the critical H-bonds involved in ligand binding for galectin-3. The high directional specificity of H-bonding interactions in the shallow, exposed binding site of galectin-3 explains the success of disaccharide templates (e.g. thiodigalactoside)8, 13-16, 24-25 as the basis for all potent galectin-3 inhibitors to date and suggests that it may be difficult to identify inhibitors with novel chemotypes through, e.g., fragment or high-throughput screening. Nonetheless, the results also suggest novel approaches to leverage subtle differences in the hydrogen bonding observed for waters bound to the apo protein and ligand oxygens in the complexes. Above all, the observed hydrogen positions can now be used as a benchmark for theoretical methods and to improve force fields for molecular modeling.

EXPERIMENTAL SECTION

Crystallization, data collection and refinement. Protein expression, perdeuteration, purification, crystallization, generation of the apo form, data collection and data quality statistics for the lactose and glycerol complexes have been described previously.19 A crystal of the apo protein measuring around 1.0 mm³ was mounted in a thick-walled quartz capillary (Vitrocom; from CM Scientific, Silsden, UK) with an internal diameter of 1.5 mm. A plug of deuterated crystallisation mother liquor was left at one end of the capillary. Data to 1.8 Å were collected at the LADI-III instrument of the Institut Laue-Langevin, Grenoble, France.26 Twenty-six images with 4 h exposures were taken in 3 different crystal orientations. The wavelength range was 2.74-3.57 Å. The crystal was rotated by 7° between exposures and the total data collection time was ~4.5 days. Data were indexed and integrated using LAUEGEN,27 wavelength-normalised using LSCALE28 and scaled and merged using SCALA.29 Room-temperature X-ray data were obtained to 1.03 Å resolution from the same crystal at beamline BM30 of the ESRF, Grenoble, France, using a beam size of 0.3 x 0.3 mm. Data were collected in two passes: first a low-dose pass to 1.3 Å with 250 x 1° rotations and an exposure time of 0.5 s, followed by a high-resolution pass to 1.02 Å with 260 x 1° rotations and an exposure time of 4 s. The X-ray data were integrated using XDS and merged using XSCALE.30 Merging of MTZ files and other file manipulations were done using programs from the CCP4 package.31 Data quality statistics are presented in Table 1.
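As a quick plausibility check on the data collection figures quoted above, the pure exposure times can be tallied from the stated image counts and exposure times. The short sketch below only multiplies numbers given in the text; the helper name `pass_summary` is hypothetical, and attributing the gap between the summed neutron exposure and the reported ~4.5 days to overhead (crystal reorientation, detector readout) is an assumption, not something the text states.

```python
# Neutron Laue data collection: values taken from the text above.
N_NEUTRON_IMAGES = 26
NEUTRON_EXPOSURE_H = 4.0

neutron_exposure_h = N_NEUTRON_IMAGES * NEUTRON_EXPOSURE_H   # pure exposure, hours
neutron_exposure_days = neutron_exposure_h / 24.0            # pure exposure, days


def pass_summary(n_frames, degrees_per_frame, exposure_s):
    """Return (total rotation in degrees, total exposure in seconds) for one X-ray pass."""
    return (n_frames * degrees_per_frame, n_frames * exposure_s)


# X-ray passes: values taken from the text above.
low_dose_pass = pass_summary(250, 1.0, 0.5)
high_res_pass = pass_summary(260, 1.0, 4.0)

print(f"neutron exposure: {neutron_exposure_h:.0f} h = {neutron_exposure_days:.2f} days")
print(f"low-dose X-ray pass: {low_dose_pass[0]:.0f} deg, {low_dose_pass[1]:.0f} s")
print(f"high-resolution X-ray pass: {high_res_pass[0]:.0f} deg, {high_res_pass[1]:.0f} s")
```

The summed neutron exposure comes to a little over four days, which is of the same order as the quoted total collection time.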


Table 1: X-ray and neutron diffraction data quality for the crystal of apo galectin-3C used in this study. Data statistics for the lactose and glycerol complexes have been presented previously.19 In summary, the lactose dataset is the result of combining two datasets from LADI-III and MaNDi at Oak Ridge National Laboratory, USA, and is exceptionally complete (95.8%) with good quality (Rpim = 5.2%) and multiplicity (13.0). The glycerol data are from the monochromatic instrument BIODIFF at FRM-II, Munich, and are 94.7% complete to 1.65 Å with Rpim = 8.7% and multiplicity 3.1. The glycerol dataset was limited to 1.7 Å for refinement.

| | neutrons | X-rays |
|---|---|---|
| instrument/beamline | LADI-III | ESRF BM30 |
| wavelength(s) (Å) | 3.35-4.35 | 0.98081 |
| detector | image plate | ADSC Q315r |
| resolution (Å) | 30–1.75 (1.78–1.75) | 28–1.03 (1.06–1.03) |
| space group | P212121 | P212121 |
| cell dimensions a, b, c (Å) | 37.2, 58.4, 63.8 | 37.2, 58.6, 64.1 |
| Rmerge (I) (%) | 13.3 (19.0) | 4.4 (235.7) |
| Rpim (I) (%) | 5.3 (11.1) | 1.2 (93.3) |
| CC(1/2) (I) (%) | 0.992 (0.944) | 100.0 (35.5) |
| mean I/σ(I) | 8.3 (4.6) | 32.2 (1.0) |
| completeness (%) | 82.1 (62.1) | 99.8 (99.2) |
| no. unique reflections | 11790 (485) | 70173 (5138) |
| multiplicity | 4.9 (3.1) | 14.0 (9.3) |
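For readers less familiar with the merging statistics in Table 1, the following minimal sketch shows how Rmerge and the multiplicity-corrected Rpim are conventionally computed from repeated intensity measurements of symmetry-equivalent reflections. It is an illustration only: the function name `merging_r_factors` and the toy intensities are hypothetical, reflections measured only once are skipped by assumption, and the programs used in this work (SCALA, XSCALE) apply their own weighting and outlier rejection.

```python
import math


def merging_r_factors(groups):
    """Compute (Rmerge, Rpim) from grouped intensity observations.

    groups maps a unique reflection index (h, k, l) to the list of
    intensities measured for that reflection and its symmetry mates.
    Rmerge = sum_hkl sum_i |I_i - <I>| / sum_hkl sum_i I_i
    Rpim   = sum_hkl sqrt(1/(n-1)) sum_i |I_i - <I>| / sum_hkl sum_i I_i
    """
    num_merge = 0.0
    num_pim = 0.0
    denom = 0.0
    for intensities in groups.values():
        n = len(intensities)
        if n < 2:
            continue  # assumption: singly measured reflections are excluded
        mean_i = sum(intensities) / n
        abs_dev = sum(abs(i - mean_i) for i in intensities)
        num_merge += abs_dev
        num_pim += math.sqrt(1.0 / (n - 1)) * abs_dev
        denom += sum(intensities)
    if denom == 0.0:
        raise ValueError("no reflection was measured more than once")
    return num_merge / denom, num_pim / denom


# Hypothetical toy data: two unique reflections with 3 and 2 observations.
toy = {(1, 0, 0): [100.0, 110.0, 90.0], (0, 1, 0): [50.0, 54.0]}
r_merge, r_pim = merging_r_factors(toy)
print(f"Rmerge = {r_merge:.4f}, Rpim = {r_pim:.4f}")
```

Because Rpim scales each reflection's spread by 1/sqrt(n-1), it falls as multiplicity rises, whereas Rmerge does not. This is why the high-multiplicity X-ray data in Table 1 show a much larger gap between the two statistics than the neutron data.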


All structures were refined jointly against neutron and X-ray data with phenix.refine.32 The starting model for all refinements was a lactose-bound galectin-3C X-ray structure at 0.86 Å resolution,20 omitting water molecules and the ligand. The ligand molecules and restraints were built using the eLBOW tool in Phenix.33 The structures, without water molecules but with the ligand (if present), were briefly refined using only X-ray data truncated to the highest resolution of the neutron data until the Rfree, Rfactor and geometry/atomic displacement parameter weight values converged. After this, joint refinement with X-ray and neutron data was done using phenix.refine.34 Oxygen atoms of water molecules were added manually by inspecting only the electron density maps in Coot.35 Waters were added if visible in the electron density map at the neutron data resolution, while those waters that were visible only at higher resolution were not added.

When satisfactory Rmodel and Rfree values were obtained, we added the remaining X-ray data to the highest possible resolution, which enabled refinement with anisotropic ADPs, even for the D atoms. This resulted in a reduction of the neutron Rfree values, demonstrating the validity of this approach. To avoid bias towards the X-ray data due to the much higher number of X-ray reflections, a careful screening of relative weights for the X-ray and neutron terms was carried out. We found that the use of anisotropic ADPs, made possible by using all available X-ray data to the highest possible resolution, produced the best models, with a significant improvement in the Rfree value for the neutron data to 1.7–1.8 Å compared to using only isotropic ADPs. An extensive screening of the relative X-ray and neutron weights for coordinate and ADP refinement showed that there was no risk of overfitting the neutron data, so we chose to weight X-ray and neutron data equally, as is done by default in phenix.refine. Deuterium atoms were added in all positions and refined as individual atoms with bond lengths restrained to ideal values from neutron crystal structures.

Finally, since the current apo structure is the highest-resolution one to date at room temperature, we also refined the model to the full resolution of the X-ray data, adding some water molecules that were not observed in the neutron maps. In this case, the deuterium atoms were refined in riding positions. Refinement statistics are presented in Table 2. The crystal structures, X-ray and neutron diffraction data have been deposited in the Protein Data Bank with accession IDs 6EYM (lactose complex), 6EXY (glycerol complex) and 6F2Q (ligand-free).
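The Rwork and Rfree values used above to monitor the refinement follow the conventional crystallographic R-factor, R = Σ||Fo| − k|Fc|| / Σ|Fo|, evaluated over the working and the free (cross-validation) reflections respectively. The sketch below is a simplified illustration with hypothetical amplitudes. It assumes a single overall scale factor k = Σ|Fo|/Σ|Fc| derived from the working set, whereas refinement programs such as phenix.refine apply more elaborate scaling. The function names are not part of any library.

```python
def r_factor(f_obs, f_calc, scale=None):
    """Conventional R-factor between observed and calculated amplitudes."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    if scale is None:
        # Simplifying assumption: one overall linear scale factor.
        scale = sum(f_obs) / sum(f_calc)
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by free flag and return (Rwork, Rfree).

    The scale factor is derived from the working set only, so the free
    reflections play no part in fitting.
    """
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if not free]
    free = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if free]
    if not work or not free:
        raise ValueError("both working and free sets must be non-empty")
    scale = sum(fo for fo, _ in work) / sum(fc for _, fc in work)
    r_work = r_factor([fo for fo, _ in work], [fc for _, fc in work], scale)
    r_free = r_factor([fo for fo, _ in free], [fc for _, fc in free], scale)
    return r_work, r_free


# Hypothetical amplitudes; the third reflection is flagged as "free".
fo = [10.0, 20.0, 30.0, 40.0]
fc = [11.0, 19.0, 31.0, 39.0]
print(r_factor(fo, fc))
print(r_work_and_free(fo, fc, [False, False, True, False]))
```

In a joint refinement this calculation is carried out separately for the X-ray and the neutron data, which is why Table 2 reports four R values per structure.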


Table 2: X-ray and neutron data and structure quality for four galectin-3C structures.

| | lactose | glycerol | apo | apo (X-ray only) |
|---|---|---|---|---|
| resolution range, neutron (Å) | 28.0-1.7 (1.83-1.7) | 48.4-1.7 (1.81-1.7) | 28.1-1.75 (1.94-1.75) | – |
| resolution range, X-ray (Å) | 32.0-1.1 (1.12-1.1) | 43.1-1.1 (1.12-1.1) | 29.3-1.03 (1.04-1.03) | 29.3-1.03 (1.04-1.03) |
| unit cell dimensions: a, b, c (Å) | 37.0, 58.4, 63.8 | 37.3, 58.4, 63.9 | 37.4, 58.6, 64.0 | 37.4, 58.6, 64.0 |
| Wilson B-factor (Å², from X-ray data) | 12.5 | 11.5 | 12.8 | 12.8 |
| no. of reflections used in refinement, neutron | 15194 (2778) | 15178 (2266) | 11090 (1695) | – |
| no. of reflections used in refinement, X-ray | 56662 (2773) | 54348 (2431) | 70166 | 70166 |
| no. of reflections in free set, neutron | 769 (135) | 782 (112) | 697 (106) | – |
| no. of reflections in free set, X-ray | 2815 (146) | 2751 (128) | 4639 (145) | 4639 (145) |
| Rwork (X-ray) | 0.107 (0.177) | 0.123 (0.253) | 0.128 (0.330) | 0.118 (0.339) |
| Rfree (X-ray) | 0.125 (0.194) | 0.135 (0.228) | 0.141 (0.330) | 0.129 (0.325) |
| Rwork (neutron) | 0.168 (0.253) | 0.152 (0.211) | 0.180 (0.238) | – |
| Rfree (neutron) | 0.211 (0.328) | 0.184 (0.261) | 0.210 (0.275) | – |
| no. of non-D atoms | 1364 | 1314 | 1276 | 1395 |
| no. of D atoms | 1441 | 1442 | 1347 | 1304 |
| macromolecules | 1253 | 1204 | 1199 | |
| ligands | lactose | glycerol | none | none |
| solvent | 111 | 110 | 77 | 142 |
| protein residues | 137 | 137 | 137 | 137 |
| rmsd bonds (Å) | 0.011 | 0.010 | 0.010 | 0.010 |
| rmsd angles (°) | 1.7 | 1.3 | 1.4 | 1.1 |
| Ramachandran favoured (%) | 97.1 | 97.8 | 97.8 | 97.8 |
| Ramachandran allowed (%) | 2.9 | 2.2 | 2.2 | 1.5 |
| Ramachandran outliers (%) | 0.0 | 0.0 | 0.0 | 0.7 |
| Rotamer outliers (%) | 0.0 | 1.5 | 0.0 | 1.4 |
| All-atom clashscore | 4.0 | 5.8 | 1.6 | 2.4 |


NMR spectroscopy and determination of histidine protonation states. The NMR samples contained 0.3 mM uniformly 13C/15N-labeled galectin-3C and 150 mM lactose in 5 mM HEPES buffer, originally at pH 7.4, with 5% D2O added for the field-frequency lock. Constant-time 1H–13C HSQC experiments were carried out at a static magnetic field strength of 11.7 T and a temperature of 301.15 K on an Agilent DirectDrive VNMRS spectrometer. Spectra were acquired with spectral widths of 8012 Hz and 3750 Hz in the 1H and 13C dimensions, sampled over 2240 and 62 points, and with the carriers positioned at 4.75 ppm and 125.9 ppm, respectively. NMR data were processed using NMRPipe36 and analyzed using CCPNMR Analysis 2.4.2.37 Spectra were acquired at 15 different pH values, ranging from 5.3 to 8.3. Two samples were used: one sample was titrated down in pH from 7.4 to 5.3 in 12 steps, whereas the other was titrated up to pH 8.3 in 3 steps. Acid (0.1 M HCl) or base (0.1 M NaOH) was added in aliquots of 1 µl to adjust the pH at each titration point. The variation in chemical shift of the histidine 13CE1 resonances (δobs) was monitored as a function of pH and fitted, using the Levenberg-Marquardt algorithm38 in MATLAB R2016 (MathWorks), to the following equation:

$$\delta_{obs} = \frac{\delta_{HA} + \delta_{A}\,10^{(pH - pK_a)}}{1 + 10^{(pH - pK_a)}}$$

where δHA and δA are the chemical shifts of the charged and non-charged states, respectively.
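To illustrate how a pKa can be extracted with the titration equation above, the following minimal sketch fits synthetic, noise-free data. It is not the authors' MATLAB procedure: instead of Levenberg-Marquardt it scans pKa on a grid and, for each trial pKa, solves the two limiting shifts in closed form by linear least squares, which is possible because the model is linear in δHA and δA once pKa is fixed. The function names and the numerical values (136.0 ppm, 138.5 ppm, pKa 6.8) are hypothetical.

```python
def titration_shift(ph, delta_ha, delta_a, pka):
    """Observed shift for fast exchange between charged (HA) and neutral (A) states."""
    ratio = 10.0 ** (ph - pka)  # population ratio [A]/[HA]
    return (delta_ha + delta_a * ratio) / (1.0 + ratio)


def fit_titration(ph_values, shifts, pka_min=4.0, pka_max=10.0, step=0.001):
    """Grid scan over pKa with closed-form least squares for delta_HA and delta_A."""
    best = None
    n_steps = int(round((pka_max - pka_min) / step))
    for k in range(n_steps + 1):
        pka = pka_min + k * step
        f = [1.0 / (1.0 + 10.0 ** (ph - pka)) for ph in ph_values]  # fraction HA
        g = [1.0 - fi for fi in f]                                   # fraction A
        sff = sum(fi * fi for fi in f)
        sgg = sum(gi * gi for gi in g)
        sfg = sum(fi * gi for fi, gi in zip(f, g))
        sfy = sum(fi * y for fi, y in zip(f, shifts))
        sgy = sum(gi * y for gi, y in zip(g, shifts))
        det = sff * sgg - sfg * sfg
        if abs(det) < 1e-12:
            continue  # the two limiting shifts cannot be separated at this pKa
        d_ha = (sfy * sgg - sgy * sfg) / det
        d_a = (sgy * sff - sfy * sfg) / det
        sse = sum((d_ha * fi + d_a * gi - y) ** 2 for fi, gi, y in zip(f, g, shifts))
        if best is None or sse < best[0]:
            best = (sse, d_ha, d_a, pka)
    if best is None:
        raise ValueError("no usable pKa value found in the scanned range")
    return best[1], best[2], best[3]


# Synthetic titration: 15 pH points from 5.3 to 8.3 with hypothetical parameters.
ph_points = [5.3 + i * (8.3 - 5.3) / 14 for i in range(15)]
observed = [titration_shift(ph, 136.0, 138.5, 6.8) for ph in ph_points]
print(fit_titration(ph_points, observed))
```

With real data the residual at the best grid point also gives a rough measure of how well a single-site titration model describes the resonance.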

δHA, δA, and pKa are free parameters of the fit.

ASSOCIATED CONTENT


Supplementary Figure S1: Neutron diffraction analysis of protonation state of His residues in galectin-3C.
Supplementary Figure S2: Schematic diagram of hydrogen bonding patterns in the three neutron crystal structures of galectin-3C.

AUTHOR INFORMATION

Corresponding author: *E-mail: [email protected]. Phone: +46 46 222 1443.

AUTHOR CONTRIBUTIONS

FM, EO, MA & DTL designed research; FM, JW, TS, AO, LC, MPB, EO & DTL carried out research and analyzed results; FM, EO, MA & DTL wrote the paper.

ACKNOWLEDGEMENTS

We thank Nicolas Coquelle (Institut Laue-Langevin) and Jean-Luc Ferrer (Institut de Biologie Structurale) for help with room-temperature X-ray data collection on the crystal of apo-galectin-3C at the BM30 beamline of the ESRF and Ulrich Weininger (now at Universität Halle-Wittenberg) for help with the NMR experiments. We thank the Institut Laue-Langevin for provision of instrument time at LADI-III. Research at the Spallation Neutron Source (SNS) at ORNL was sponsored by the Scientific User Facilities Division, Office of Basic Energy Sciences, U.S. Department of Energy. The work was partly funded by a PhD fellowship from the European Spallation Source ERIC to FM, by a grant from the Swedish Research Council (2016-04855) to DL and from the Knut and Alice Wallenberg Foundation (KAW 2013.0022) to DL, MA and EO. This paper is dedicated to the memory of Francesco Manzoni, who sadly passed away on 12th March 2017 at the age of 27.


ANCILLARY INFORMATION

PDB accession IDs: 6EYM (lactose complex), 6EXY (glycerol complex) and 6F2Q (ligand-free). The authors will release the atomic coordinates and experimental data upon article acceptance.

ABBREVIATIONS USED

ADP: atomic displacement parameter; CRD: carbohydrate recognition domain;


REFERENCES

1. Leffler, H.; Carlsson, S.; Hedlund, M.; Qian, Y.; Poirier, F. Introduction to galectins. Glycoconj J 2004, 19, 433-440.

2. Rabinovich, G. A.; Liu, F. T.; Hirashima, M.; Anderson, A. An emerging role for galectins in tuning the immune response: lessons from experimental models of inflammatory disease, autoimmunity and cancer. Scand J Immunol 2007, 66, 143-158.

3. Almkvist, J.; Karlsson, A. Galectins as inflammatory mediators. Glycoconj J 2004, 19, 575-581.

4. Dumic, J.; Dabelic, S.; Flogel, M. Galectin-3: an open-ended story. Biochim Biophys Acta 2006, 1760, 616-635.

5. Liu, F. T. Regulatory roles of galectins in the immune response. Int Arch Allergy Immunol 2005, 136, 385-400.

6. Liu, F. T.; Rabinovich, G. A. Galectins as modulators of tumour progression. Nat Rev Cancer 2005, 5, 29-41.

7. Pugliese, G.; Iacobini, C.; Ricci, C.; Blasetti Fantauzzi, C.; Menini, S. Galectin-3 in diabetic patients. Clin Chem Lab Med 2014, 52, 1413-1423.

8. Sörme, P.; Arnoux, P.; Kahl-Knutsson, B.; Leffler, H.; Rini, J. M.; Nilsson, U. J. Structural and thermodynamic studies on cation-pi interactions in lectin-ligand complexes: high-affinity galectin-3 inhibitors through fine-tuning of an arginine-arene interaction. J Am Chem Soc 2005, 127, 1737-1743.

9. Diehl, C.; Genheden, S.; Engström, O.; Delaine, T.; Håkansson, M.; Leffler, H.; Nilsson, U. J.; Ryde, U.; Akke, M. Protein flexibility and conformational entropy in ligand design targeting the carbohydrate recognition domain of galectin-3. J Am Chem Soc 2010, 132, 14577-14589.

10. Collins, P. M.; Öberg, C. T.; Leffler, H.; Nilsson, U. J.; Blanchard, H. Taloside inhibitors of galectin-1 and galectin-3. Chem Biol Drug Des 2012, 79, 339-346.

11. Bum-Erdene, K.; Gagarinov, I. A.; Collins, P. M.; Winger, M.; Pearson, A. G.; Wilson, J. C.; Leffler, H.; Nilsson, U. J.; Grice, I. D.; Blanchard, H. Investigation into the feasibility of thioditaloside as a novel scaffold for galectin-3-specific inhibitors. Chembiochem 2013, 14, 1331-1342.

12. Collins, P. M.; Bum-Erdene, K.; Yu, X.; Blanchard, H. Galectin-3 interactions with glycosphingolipids. J Mol Biol 2014, 426, 1439-1451.

13. Delaine, T.; Collins, P.; MacKinnon, A.; Sharma, G.; Stegmayr, J.; Rajput, V. K.; Mandal, S.; Cumpstey, I.; Larumbe, A.; Salameh, B. A.; Kahl-Knutsson, B.; van Hattum, H.; van Scherpenzeel, M.; Pieters, R. J.; Sethi, T.; Schambye, H.; Oredsson, S.; Leffler, H.; Blanchard, H.; Nilsson, U. J. Galectin-3-binding glycomimetics that strongly reduce bleomycin-induced lung fibrosis and modulate intracellular glycan recognition. Chembiochem 2016, 17, 1759-1770.

14. Hsieh, T. J.; Lin, H. Y.; Tu, Z.; Lin, T. C.; Wu, S. C.; Tseng, Y. Y.; Liu, F. T.; Hsu, S. T.; Lin, C. H. Dual thio-digalactoside-binding modes of human galectins as the structural basis for the design of potent and selective inhibitors. Sci Rep 2016, 6, 29457.

15. Atmanene, C.; Ronin, C.; Teletchea, S.; Gautier, F. M.; Djedaini-Pilard, F.; Ciesielski, F.; Vivat, V.; Grandjean, C. Biophysical and structural characterization of mono/di-arylated lactosamine derivatives interaction with human galectin-3. Biochem Biophys Res Commun 2017, 489, 281-286.

16. Peterson, K.; Kumar, R.; Stenstrom, O.; Verma, P.; Verma, P. R.; Hakansson, M.; Kahl-Knutsson, B.; Zetterberg, F.; Leffler, H.; Akke, M.; Logan, D. T.; Nilsson, U. J. Systematic tuning of fluoro-galectin-3 interactions provides thiodigalactoside derivatives with single-digit nM affinity and high selectivity. J Med Chem 2018, 61, 1164-1175.

17. Blakeley, M. P.; Mitschler, A.; Hazemann, I.; Meilleur, F.; Myles, D. A.; Podjarny, A. Comparison of hydrogen determination with X-ray and neutron crystallography in a human aldose reductase-inhibitor complex. Eur Biophys J 2006, 35, 577-583.

18. Howard, E. I.; Sanishvili, R.; Cachau, R. E.; Mitschler, A.; Chevrier, B.; Barth, P.; Lamour, V.; Van Zandt, M.; Sibley, E.; Bon, C.; Moras, D.; Schneider, T. R.; Joachimiak, A.; Podjarny, A. Ultrahigh resolution drug design I: details of interactions in human aldose reductase-inhibitor complex at 0.66 Å. Proteins 2004, 55, 792-804.

19. Manzoni, F.; Saraboji, K.; Sprenger, J.; Noresson, A.-L.; Nilsson, U.; Leffler, H.; Fisher, Z.; Schrader, T.; Ostermann, A.; Coates, L.; Blakeley, M. P.; Oksanen, E.; Logan, D. T. Perdeuteration, crystallisation, data collection and comparison of five neutron diffraction datasets of complexes of human galectin-3C. Acta Crystallogr D Biol Crystallogr 2016, 72, 1194-1202.

20. Saraboji, K.; Hakansson, M.; Genheden, S.; Diehl, C.; Qvist, J.; Weininger, U.; Nilsson, U. J.; Leffler, H.; Ryde, U.; Akke, M.; Logan, D. T. The carbohydrate-binding site in galectin-3 is preorganized to recognize a sugarlike framework of oxygens: ultra-high-resolution structures and water dynamics. Biochemistry 2012, 51, 296-306.

21. Collins, P. M.; Hidari, K. I.; Blanchard, H. Slow diffusion of lactose out of galectin-3 crystals monitored by X-ray crystallography: possible implications for ligand-exchange protocols. Acta Crystallogr D Biol Crystallogr 2007, 63, 415-419.

22. Reynolds, W. F.; Peat, I. R.; Freedman, M. H.; Lyerla, J. R., Jr. Determination of the tautomeric form of the imidazole ring of L-histidine in basic solution by carbon-13 magnetic resonance spectroscopy. J Am Chem Soc 1973, 95, 328-331.

23. Denisov, V. P.; Venu, K.; Peters, J.; Horlein, H. D.; Halle, B. Orientational disorder and entropy of water in protein cavities. J Phys Chem B 1997, 101, 9380-9389.

24. Salameh, B.; Cumpstey, I.; Sundin, A.; Leffler, H.; Nilsson, U. J. 1H-1,2,3-Triazol-1-yl galactose derivatives as high affinity galectin-3 inhibitors. Bioorg Med Chem 2010, 18, 5367-5378.

25. van Hattum, H.; Branderhorst, H. M.; Moret, E. E.; Nilsson, U. J.; Leffler, H.; Pieters, R. J. Tuning the preference of thiodigalactoside- and lactosamine-based ligands to galectin-3 over galectin-1. J Med Chem 2013, 56, 1350-1354.

26. Blakeley, M. P.; Teixeira, S. C. M.; Petit-Haertlein, I.; Hazemann, I.; Mitschler, A.; Haertlein, M.; Howard, E.; Podjarny, A. D. Neutron macromolecular crystallography with LADI-III. Acta Crystallogr D Biol Crystallogr 2010, 66, 1198-1205.

27. Campbell, J. W.; Hao, Q.; Harding, M. M.; Nguti, N. D.; Wilkinson, C. LAUEGEN version 6.0 and INTLDM. J Appl Crystallogr 1998, 31, 496-502.

28. Arzt, S.; Campbell, J. W.; Harding, M. M.; Hao, Q.; Helliwell, J. R. LSCALE - the new normalization, scaling and absorption correction program in the Daresbury Laue software suite. J Appl Crystallogr 1999, 32, 554-562.

29. Evans, P. R. Scaling and assessment of data quality. Acta Crystallogr D Biol Crystallogr 2006, 62, 72-82.

30. Kabsch, W. XDS. Acta Crystallogr D Biol Crystallogr 2010, 66, 125-132.

31. Winn, M. D.; Ballard, C. C.; Cowtan, K. D.; Dodson, E. J.; Emsley, P.; Evans, P. R.; Keegan, R. M.; Krissinel, E. B.; Leslie, A. G.; McCoy, A.; McNicholas, S. J.; Murshudov, G. N.; Pannu, N. S.; Potterton, E. A.; Powell, H. R.; Read, R. J.; Vagin, A.; Wilson, K. S. Overview of the CCP4 suite and current developments. Acta Crystallogr D Biol Crystallogr 2011, 67, 235-242.

32. Adams, P. D.; Afonine, P. V.; Bunkoczi, G.; Chen, V. B.; Davis, I. W.; Echols, N.; Headd, J. J.; Hung, L. W.; Kapral, G. J.; Grosse-Kunstleve, R. W.; McCoy, A. J.; Moriarty, N. W.; Oeffner, R.; Read, R. J.; Richardson, D. C.; Richardson, J. S.; Terwilliger, T. C.; Zwart, P. H. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr D Biol Crystallogr 2010, 66, 213-221.

33. Moriarty, N. W.; Grosse-Kunstleve, R. W.; Adams, P. D. electronic Ligand Builder and Optimization Workbench (eLBOW): a tool for ligand coordinate and restraint generation. Acta Crystallogr D Biol Crystallogr 2009, 65, 1074-1080.

34. Afonine, P. V.; Mustyakimov, M.; Grosse-Kunstleve, R. W.; Moriarty, N. W.; Langan, P.; Adams, P. D. Joint X-ray and neutron refinement with phenix.refine. Acta Crystallogr D Biol Crystallogr 2010, 66, 1153-1163.

35. Emsley, P.; Lohkamp, B.; Scott, W. G.; Cowtan, K. Features and development of Coot. Acta Crystallogr D Biol Crystallogr 2010, 66, 486-501.

36. Delaglio, F.; Grzesiek, S.; Vuister, G. W.; Zhu, G.; Pfeifer, J.; Bax, A. NMRPipe: A multidimensional spectral processing system based on UNIX pipes. J Biomol NMR 1995, 6, 277-293.

37. Vranken, W. F.; Boucher, W.; Stevens, T. J.; Fogh, R. H.; Pajon, A.; Llinas, P.; Ulrich, E. L.; Markley, J. L.; Ionides, J.; Laue, E. D. The CCPN data model for NMR spectroscopy: development of a software pipeline. Proteins 2005, 59, 687-696.

38. Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; Vetterling, W. T. Numerical Recipes. The Art of Scientific Computing. Cambridge University Press: Cambridge, 1986.


Figure 1. Overview of the lactose binding site in galectin-3C. Galectin-3C is shown as a grey cartoon, with the side chains of important residues in the lactose recognition site shown as lines. Lactose is represented by sticks, with the carbon atoms colored yellow and oxygens colored red.


Figure 2. a) The complex of perdeuterated galectin-3C with lactose at 1.7 Å resolution. The 2m|Fo|-D|Fc| nuclear density map, calculated on-the-fly in PyMOL, is shown as a grey mesh contoured at a level of 0.6 σ. The lactose molecule is shown as thick sticks; protein side chains and water molecules that interact with them and lactose are shown as thin sticks. Deuterium atoms are coloured yellow; hydrogen atoms on the non-perdeuterated lactose molecule are coloured white. Hydrogen bonds relevant for the discussion are shown as light blue dotted lines with distances between the H atom and the H-bond acceptor. This view is chosen to highlight the interactions made by the outer hydroxyl groups of lactose, as well as Arg162. b) Another view of the binding pocket, rotated approximately 90° around the horizontal axis with respect to panel a. This view is chosen to highlight the interactions of GalO4, GalO6 and GlcO3’ with His158, Gln174, Glu184 and Arg186 on the inner side of the binding pocket. Arg162 is not shown for clarity.
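The H···acceptor distances annotated in Figure 2, and the directionality of the H-bonds emphasised in the Discussion, can be quantified directly once hydrogen (deuterium) positions are known. The sketch below computes the H···A distance and the D–H···A angle from three Cartesian coordinates. The function name and the coordinates are hypothetical and are not taken from the deposited structures; an angle close to 180° corresponds to a highly directional H-bond.

```python
import math


def h_bond_geometry(donor, hydrogen, acceptor):
    """Return (H...A distance in Å, D-H...A angle in degrees) from xyz coordinates."""
    v_hd = [d - h for d, h in zip(donor, hydrogen)]     # vector H -> donor
    v_ha = [a - h for a, h in zip(acceptor, hydrogen)]  # vector H -> acceptor
    len_hd = math.sqrt(sum(c * c for c in v_hd))
    len_ha = math.sqrt(sum(c * c for c in v_ha))
    if len_hd == 0.0 or len_ha == 0.0:
        raise ValueError("hydrogen coincides with donor or acceptor")
    cos_angle = sum(p * q for p, q in zip(v_hd, v_ha)) / (len_hd * len_ha)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding errors
    return len_ha, math.degrees(math.acos(cos_angle))


# Hypothetical coordinates (Å): an almost linear N-D...O arrangement.
donor_n = (0.00, 0.00, 0.00)
deuterium = (1.01, 0.00, 0.00)
acceptor_o = (2.90, 0.25, 0.00)
dist, angle = h_bond_geometry(donor_n, deuterium, acceptor_o)
print(f"D...O distance = {dist:.2f} Å, N-D...O angle = {angle:.1f} deg")
```

Applied to refined neutron coordinates, the same calculation separates the nearly linear contacts from looser ones such as the interaction of W1 with His158.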


Figure 3. a) Superimposed neutron crystal structures of the complexes of galectin-3C with glycerol at 1.7 Å resolution (solid) and lactose (transparent). The 2m|Fo|-D|Fc| nuclear density map, shown as a grey mesh, is contoured at a level of 0.8 σ. The glycerol molecule is shown as grey sticks. The lactose molecule is shown as a faint shadow for comparison. The perfect overlap between glycerol and parts of the galactose moiety of lactose is evident. b) View rotated approximately 90° around a horizontal axis, as in Fig. 2b. The color scheme is as in Fig. 1.


Figure 4. Neutron and X-ray structures of ligand-free galectin-3C. The color scheme is as in Figs. 1 and 2. a) The 2m|Fo|-D|Fc| nuclear density map is contoured at a level of 0.7 σ. Water molecules in the binding site are shown as sticks, except for W1. b) View rotated approximately 90° around a horizontal axis, as in Fig. 2b. The lactose molecule is shown as a faint shadow for comparison. c) The 2m|Fo|-D|Fc| electron density map for the same region, for comparison. The resolution has been artificially limited to 1.8 Å. The ellipsoids show the anisotropic atomic displacement parameters for the water molecules derived from the X-ray data to 1.03 Å, at a probability level of 0.5. d) Electron density map at 1.03 Å, showing the extra water molecules at positions corresponding to lactose oxygen atoms found at higher resolution.
