Structural and Biophysical Characterization of Human EXTL3: Domain


Article

Structural and biophysical characterization of human EXTL3: domain organisation, glycosylation and solution structure Wael Awad, Sven Kjellstrom, Gabriel Svensson Birkedal, Katrin Mani, and Derek Thomas Logan Biochemistry, Just Accepted Manuscript • DOI: 10.1021/acs.biochem.7b00557 • Publication Date (Web): 18 Jan 2018 Downloaded from http://pubs.acs.org on January 19, 2018


Biochemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Biochemistry

Structural and biophysical characterization of human EXTL3: domain organisation, glycosylation and solution structure

Wael Awad a,b,§, Sven Kjellström a, Gabriel Svensson Birkedal c,§§, Katrin Mani c,* & Derek T. Logan a,*

a Biochemistry and Structural Biology, Centre for Molecular Protein Science, Dept. of Chemistry, Lund University, SE-221 00 Lund, Sweden.
b Department of Biophysics, Faculty of Science, Cairo University, 12316 Cairo, Egypt.
c Department of Experimental Medical Science, Division of Neuroscience, Glycobiology Group, Lund University, SE-221 00 Lund, Sweden.
§ Present address: Infection and Immunity Program and Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Clayton, Australia.
§§ Present address: Swedish Orphan Biovitrum AB, Tomtebodavägen 23A, 17165 Solna, Sweden.
* Authors to whom correspondence may be addressed: [email protected]; tel. +46 46 222 4077; [email protected]; tel. +46 46 222 1443.

Send proofs and reprints to [email protected]
Keywords: heparan sulfate synthesis, exostosin-like protein 3, N-glycosylation, SAXS, mass spectrometry.
Running title: Structural and biophysical characterization of human EXTL3


Abstract

Heparan sulfate proteoglycans (HSPGs) are proteins substituted with one or more heparan sulfate (HS) polysaccharides, found in abundance at cell surfaces. HS chains influence the activity of many biologically important molecules involved in cellular communication and signaling. The exostosin (EXT) proteins are glycosyltransferases in the Golgi apparatus that assemble HS chains on HSPGs. The EXTL3 enzyme mainly works as an initiator in HS biosynthesis. In this work, human lumenal N-glycosylated EXTL3 (EXTL3∆N) was cloned, expressed in human embryonic kidney cells and purified. Various biophysical and biochemical approaches were then employed to elucidate the N-glycosylation sites and the function of their attached N-glycans. Furthermore, the stability and conformation of the purified EXTL3∆N protein in solution were analyzed. Our data show that EXTL3∆N has N-glycans at least at two positions, Asn290 and Asn592, which seem to be critical for proper protein folding and/or release. EXTL3∆N is quite stable, as a high temperature (~59 °C) was required for denaturation. Deconvolution of the EXTL3∆N far-UV CD spectrum revealed a substantial fraction of β-sheets (25%) with a minor proportion of α-helices (14%) in the secondary structure. Solution small angle X-ray scattering and dynamic light scattering revealed an extended structure suggestive of a dimeric arrangement and consisting of two distinct regions, narrow and broad respectively. This is consistent with bioinformatics analyses suggesting a three-domain structure with two glycosyltransferase domains and a coiled-coil domain.


Introduction

Proteoglycans (PG) are proteins modified by covalent attachment of anionic glycosaminoglycan chains to the side chain oxygen atoms of specific serine residues in certain consensus sequences 1. One of the most widespread types of PG is the heparan sulfate proteoglycans (HSPGs), which carry one or more heparan sulfate (HS) chains. HSPGs interact with a wide variety of proteins, such as chemokines, growth factors, morphogens, extracellular matrix components and enzymes, via their HS chains and thereby control various biological processes 1, 2. In HSPGs the HS chains consist of repeating units of either glucuronic acid (GlcA) or iduronic acid (IdoA) and N-acetylglucosamine (GlcNAc), i.e. (GlcA/IdoA-GlcNAc)n, which are synthesized by membrane-bound glycosyltransferases in the Golgi apparatus. HS assembly is initiated by formation of a common tetrasaccharide linkage on the target serine residue of the PG core protein (GlcA-Gal-Gal-Xyl-Ser). Afterwards, GlcNAc is added to the tetrasaccharide linker and the HS chains are further elongated by addition of the corresponding repeating disaccharides, which become additionally modified by N-deacetylations/N-sulfations, epimerizations and O-sulfations, resulting in formation of HS chains with variable lengths and structures 2. In most HSPGs, synthesis of HS occurs at Ser-Gly consensus sequences that are flanked by a cluster of acidic residues and an adjacent tryptophan. Mutations in this region can give rise to more chondroitin sulfate than HS; it therefore seems likely that the protein core has a vital regulating role in HS assembly 3, 4. The synthesis of the HS backbone structure is mediated by glycosyltransferases of the exostosin protein family (EXTs), which initiate, elongate and terminate HS chain formation. Loss of function of these genes in mouse, zebrafish, D. melanogaster, C. elegans and human affects different cellular signaling processes and causes developmental deformities 5. Five members of the EXT family have been identified in mammals: EXT1, EXT2, EXTL1, EXTL2 and EXTL3.
The exact role of the EXT glycosyltransferases in HS biosynthesis remains relatively unclear. The first step, namely transfer of the first GlcNAc residue to the tetrasaccharide linkage region, is thought to be performed by the EXTL2 and/or EXTL3 glycosyltransferases. EXT1 and EXT2 seem to work together to extend and polymerize the HS chain by transferring alternating GlcA and GlcNAc residues to the growing polymer 5. EXTL2 has been described as an initiator of HS chain elongation; however, it is also involved in chain termination by transferring a GlcNAc to the phosphorylated tetrasaccharide linkage region 6. EXTL3 is a bifunctional enzyme which has been ascribed both GlcNAc transferase I and II (GlcNAc-TI and GlcNAc-TII) activities, which implies its participation in both HS initiation and elongation. So far, no GlcA transferase (GlcA-TII) activity has been shown for EXTL3. The C. elegans genome has only two HS synthesis enzymes, which have homology to EXT1 and EXTL3, indicating that the activities of these two enzymes are sufficient for the production of HS chains 5. Recent studies show that reduction of EXTL3 expression levels results in synthesis of longer HS chains, whereas EXTL3 overexpression has no clear effect on HS chain length 7. Furthermore, mutations in EXTL3 orthologues of D. melanogaster, C. elegans and Danio rerio resulted in reduced HS synthesis 8-10, and no HS was detected in 9-day-old mouse embryos lacking EXTL3 11. In addition, it was recently shown that inhibition of EXTL2 and EXTL3 expression using siRNAs results in a notable reduction in the mRNA levels of the corresponding genes and decreases the rates of GAG synthesis and storage 12.

All EXT proteins have a small membrane-spanning domain at their N-terminus that presumably links the proteins to the Golgi membrane, followed by variable-length catalytic domains with 7 cysteines that are conserved within the EXT family members 13, 14. EXTs have a ubiquitous expression pattern and, except for EXTL1, contain one or more conserved Asp-Xxx-Asp (DXD) motifs. The DXD motifs are typical for glycosyltransferases utilizing nucleotide-activated sugars as donor substrates, and these motifs are most likely involved in substrate recognition and/or catalysis 15. The structures and biochemical properties of EXT proteins have been poorly investigated, and so far the only crystal structures available are of the catalytic domain of mouse EXTL2 in the apo form and in complex with the donor substrates UDP-GlcNAc and UDP-GalNAc 16, giving insight into the mechanisms of GlcNAc transferases (GlcNAc-TI) in HS biosynthesis. EXTL2 is less than half the size of the other family members (~330 amino acids) and shows high sequence homology to the C-terminal region of EXT proteins, whereas EXTL3 (~919 residues) is the largest and contains an additional structurally uncharacterized region of ~580 residues that shows the GlcNAc-TII activity of the EXT proteins 17. Thus, EXTL3 provides an excellent model for obtaining a complete view of the structure-based mechanism of the glycosyltransferase activities catalysed by the EXT protein family.

In order to characterize the EXTL3 protein and determine its structure, we have expressed recombinant human N-glycosylated lumenal EXTL3 lacking the N-terminal transmembrane part (EXTL3∆N) in human embryonic kidney cells and subsequently purified it and investigated its biophysical and biochemical characteristics using methods such as mass spectrometry (MS), far-UV circular dichroism (CD) spectroscopy, differential scanning fluorimetry (DSF) and dynamic light scattering (DLS). Furthermore, small angle X-ray scattering (SAXS) was pursued to elucidate the low-resolution structure of EXTL3∆N in solution.

Experimental Procedures

Bioinformatics analysis

To predict the intrinsic disorder within human EXTL3, the servers IUPred 18, DISEMBL 19 and DISpro 20 were employed. Domain predictions were performed using GlobDoms 21, conserved domain database search 22, Ginzu 23, HHpred 24 and Dom-Pred 25. The CAZy database, which describes the families of structurally-related catalytic and carbohydrate-binding domains of enzymes that work on glycosidic bonds 26, was used to characterize the functional domains of EXTL3. The I-TASSER 27 and ROBETTA beta 28 servers were used for homology modelling of the different domains of EXTL3. The Phyre2 server 29 was used to find distant structural homologs of the GT47 domain. Phyre2 was run with default parameters, in intensive modeling mode.

Cloning, expression, identification and purification of human EXTL3∆N

The pCEP4-BM40-HisTEV expression vector 30, containing the sequences for an N-terminal BM40 secretion peptide followed by a 6His-tag and a TEV cleavage site, was used in this study. The cDNA sequence coding for the lumenal part of human EXTL3, amino acids 52-919, i.e. without the sequences for the N-terminal cytoplasmic region and the transmembrane helix, was amplified by PCR using the forward primer 5′-ATTAAGCTTACCACTCTGGATGAGGCTGATGAG-3′ and the reverse primer 5′-TTCTCGAGCTAGATGAACTTGAAGCACTTGG-3′. This created a DNA segment comprising base pairs 154-2760 of the coding sequence of human EXTL3, as in NCBI nucleotide database accession number BC006363. Restriction sites for cloning (HindIII, AAGCTT; XhoI, CTCGAG) were inserted at the 5′ end of the primers. The PCR product was digested with the restriction enzymes HindIII and XhoI and ligated into the HindIII/XhoI-digested pCEP4-BM40-HisTEV vector. The construct was verified by sequencing at Eurofins MWG Operon (Germany).

Transfection of 293 human embryonic kidney (HEK293) cells was performed according to Invitrogen's standard transfection protocol with Lipofectamine 2000. Stable HEK293 cell clones were generated using the limiting dilution technique by growing the cells in the selective antibiotic hygromycin B for several weeks. The clones expressing high amounts of EXTL3∆N were expanded and used in this study. The EXTL3∆N-expressing cells were grown in protein-free medium and the conditioned medium, collected every 3-4 days, was stored at −80 °C. EXTL3∆N was purified from ∼200 mL batches of conditioned medium using nickel affinity chromatography with a linear imidazole gradient and then subjected to gel filtration chromatography at room temperature using a Superdex 200 10/300 GL column (GE Healthcare) pre-equilibrated with two column volumes of 25 mM sodium phosphate buffer, pH 7.0. The purity of EXTL3∆N was assessed by SDS-PAGE. Finally, the protein was clarified and concentrated by ultrafiltration and the concentration was measured by the bicinchoninic acid (BCA) protein assay (Pierce Biotechnology, Rockford, IL) or by absorbance measurements at 280 nm using a Nanodrop spectrophotometer (Thermo Scientific).

Mass spectrometry analyses

In-gel digestion of protein SDS-PAGE bands using trypsin or chymotrypsin was performed 31. After extraction of bands from the gel, LC-MS/MS analysis was performed to verify the EXTL3∆N sequence. Data-dependent mass spectrometry experiments were performed with an EASY LC Nano Flow high-performance liquid chromatography (HPLC) system (Proxeon Biosystems, Odense, Denmark) connected to an LTQ-Orbitrap Velos Pro mass spectrometer (Thermo Fisher Scientific, Waltham, WA) equipped with a nano EASY-spray ion source (Proxeon Biosystems, Odense, Denmark). The chromatographic separation was performed at 40 °C on a 15 cm (75 µm i.d.) EASY-Spray column packed with 3 µm resin (Proxeon Biosystems, Odense, Denmark). The nano HPLC intelligent flow control gradient was 5–20% solvent B (0.1% (v/v) FA in acetonitrile) in solvent A (0.1% (v/v) FA in water) for 120 min, then 20–40% for 60 min, followed by an increase to 90% for 5 min. A flow rate of 300 nl/min was used throughout the gradient.
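The LC gradient just described can be written down as a small timetable, which makes the total run time and solvent consumption explicit. The following is a minimal sketch, not the instrument method file: it assumes that every step is a linear ramp and that the final step runs from 40% to 90% B over 5 min (the text could also be read as a hold at 90%). The names `SEGMENTS`, `percent_b` and `total_volume_ul` are illustrative only.

```python
# Gradient segments as (start %B, end %B, duration in min), read from the text.
# Assumption: the last step is a linear 40 -> 90 %B ramp lasting 5 min.
SEGMENTS = [
    (5.0, 20.0, 120.0),
    (20.0, 40.0, 60.0),
    (40.0, 90.0, 5.0),
]
FLOW_NL_PER_MIN = 300.0  # constant flow rate stated in the text


def percent_b(t_min, segments=SEGMENTS):
    """Return %B at time t_min, assuming linear ramps within each segment."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for start, end, duration in segments:
        if t_min <= elapsed + duration:
            frac = (t_min - elapsed) / duration
            return start + frac * (end - start)
        elapsed += duration
    # After the programme ends, hold the final composition.
    return segments[-1][1]


def total_runtime_min(segments=SEGMENTS):
    """Total programmed gradient time in minutes."""
    return sum(duration for _, _, duration in segments)


def total_volume_ul(segments=SEGMENTS, flow_nl_per_min=FLOW_NL_PER_MIN):
    """Total solvent volume delivered over the gradient, in microlitres."""
    return total_runtime_min(segments) * flow_nl_per_min / 1000.0


if __name__ == "__main__":
    for t in (0, 60, 120, 150, 180, 185):
        print(f"{t:>4} min: {percent_b(t):5.1f} %B")
    print("runtime (min):", total_runtime_min())
    print("solvent delivered (uL):", total_volume_ul())
```

Under these assumptions the programme lasts 185 min and delivers 55.5 µl of solvent per injection.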
An MS scan (400–1400 m/z) was recorded in the Orbitrap mass analyzer set at a resolution of 60 000 at 400 m/z, 1×10⁶ automatic gain control target and 500 ms maximum ion injection time. The MS scan was followed by data-dependent collision-induced dissociation MS/MS scans on the eight most intense multiply-charged ions in the LTQ at 500 signal threshold, 3 m/z isolation width, 10 ms activation time at 35 normalized collision energy and dynamic exclusion enabled for 60 seconds. The general mass spectrometric conditions were as follows: spray voltage 2.0 kV; no sheath or auxiliary gas flow; S-lens 60%; ion transfer tube temperature 275 °C. Raw data were processed by Mascot Distiller and searched against the Swiss-Prot database (release 11-Dec-2013, containing 541954 entries) with an in-house Mascot server. The search parameters for the Mascot searches were: Taxonomy: Homo sapiens; Enzyme: trypsin or chymotrypsin; Variable Modifications: oxidation (Met); Precursor Tolerance: 20 ppm; MS/MS Fragment Tolerance: 0.1 Da. Manual annotation of GlcNAc residues was performed.

Enzymatic treatments

The EXTL3∆N-producing cells were treated with protein-free medium containing 10 µM kifunensine for 24-48 h and EXTL3∆N protein was purified from the harvested medium using Ni-NTA chromatography. Enzymatic deglycosylation of 1 mg of high-mannose EXTL3 (in 25 mM sodium phosphate, pH 7.0) was carried out by treatment with 50 mU endoglycosidase H (EndoH) (New England Biolabs) overnight at 37 °C. The EndoH was then removed from the samples by repeating the Ni-NTA purification, and the deglycosylation efficiency was analyzed by SDS-PAGE.

Tunicamycin treatment and Western blot analysis

The EXTL3∆N-expressing cells were grown to 80% confluency in 6-well plates. The cells were washed three times with PBS and then incubated in protein-free medium containing increasing concentrations (0-3 µg/mL) of tunicamycin 32. After 3 hours, the medium was replaced with fresh medium containing the same concentrations of tunicamycin and the cells were incubated for another 24 h. Afterwards, the conditioned medium was harvested and the cells were lysed in radioimmunoprecipitation assay buffer (RIPA buffer: 0.1% (w/v) SDS, 1% (v/v) Triton X-100, 0.5% (w/v) sodium deoxycholate and 0.5 mM phenylmethyl sulfonyl fluoride in PBS) for 30 minutes at 4 °C. The lysates were then centrifuged at 14,000 rpm for 10 min at 4 °C and the supernatants were collected. The protein concentrations were determined by BCA assay and adjusted to 2.8 mg/mL for the conditioned medium and 2.0 mg/mL for cell lysates. For Western blot analysis, samples of 17.5 µl of the conditioned medium or 20 µg of the cell extracts were subjected to SDS-PAGE on 4−12% Bis-Tris gels and then transferred to polyvinylidene fluoride (PVDF) membranes. The membranes were then blocked by gentle shaking with 5% (w/v) non-fat milk in PBS containing 0.05% (v/v) Tween-20 for 1 h at room temperature. Afterwards, the membranes were treated with anti-His-tag primary antibody (tetra-His Ab) diluted 1:2000 in PBS with 0.05% (v/v) Tween-20 and incubated overnight at 4 °C. The next day, the membranes were washed extensively with PBS containing 0.05% (v/v) Tween-20 and then probed with secondary antibody (Amersham anti-mouse-horseradish peroxidase; 1:10 000 in PBS with 0.05% (v/v) Tween-20) for 2 h at room temperature. After extensive washing, the blots were developed using the Fujifilm LAS-1000 imaging system.


Circular dichroism spectroscopy

The EXTL3∆N protein was dialyzed overnight against 20 mM sodium phosphate buffer (pH 7.8) and clarified by centrifugation at 15,000 rpm for 20 min, and finally the protein concentration was adjusted to 0.1 mg/mL. Far-UV CD spectra (measurement range of 190–250 nm) were recorded using a Jasco J-810 circular dichroism spectropolarimeter equipped with a Peltier-thermostated cell holder. A 1 mm quartz cuvette was used and thermostated at 25 °C unless otherwise mentioned. The following parameters were employed: data pitch 0.1 nm; sensitivity 100 mdeg; response time 8 sec; bandwidth 1 nm; scanning speed 50 nm/min; accumulation of 3 scans. The spectrum of the corresponding buffer (dialysis buffer) was recorded and subtracted from the protein spectra. Far-UV CD spectra were recorded for EXTL3∆N from various protein batches. Mean residue ellipticity was calculated using 114.2 as the mean residue molecular weight of EXTL3∆N. Heat denaturation was studied using CD spectroscopy by heating the protein (0.1 mg/mL) in 20 mM sodium phosphate buffer (pH 7.8) from 25 to 90 °C at a ramp rate of 1 °C/min, and complete far-UV CD spectra were recorded at intervals of 5 °C. After that, the sample was cooled back to 25 °C and another far-UV CD spectrum was recorded with the standard parameters.

Dynamic light scattering

The protein sample homogeneity was assessed by dynamic light scattering (DLS) using the Zetasizer APS DLS system (Malvern Instruments Ltd., Malvern, UK). Briefly, 35 µl of protein samples were clarified by spinning down at 14,000 rpm for 30 min, then DLS measurements were performed at 25 °C for each sample. All the data were analyzed using the Zetasizer Nano software v7.03 to derive the hydrodynamic radius and apparent molecular weight. Temperature-dependent (from 25 to 90 °C) size measurements of EXTL3∆N (at ~1 mg/mL) were obtained with a 1 °C incremental temperature ramp and 120 sec equilibration time at each point.
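The hydrodynamic radius reported by DLS is conventionally obtained from the translational diffusion coefficient through the Stokes–Einstein relation, Rh = kB·T / (6π·η·D). The sketch below illustrates that relation only and is not the Zetasizer software's internal procedure. The viscosity default is that of pure water at 25 °C (0.89 mPa·s), used here as a stand-in for the buffer, and the diffusion coefficient in the usage example is a made-up value, not a measurement from this study.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def hydrodynamic_radius_nm(diff_coeff_m2_s, temp_c=25.0, viscosity_pa_s=0.89e-3):
    """Stokes-Einstein hydrodynamic radius in nm.

    diff_coeff_m2_s : translational diffusion coefficient in m^2/s
    temp_c          : temperature in degrees Celsius (25 C as in the text)
    viscosity_pa_s  : solvent viscosity in Pa*s (assumed: pure water at 25 C)
    """
    if diff_coeff_m2_s <= 0:
        raise ValueError("diffusion coefficient must be positive")
    temp_k = temp_c + 273.15
    radius_m = K_B * temp_k / (6.0 * math.pi * viscosity_pa_s * diff_coeff_m2_s)
    return radius_m * 1e9


if __name__ == "__main__":
    # Hypothetical diffusion coefficient, for illustration only.
    print(f"Rh = {hydrodynamic_radius_nm(4.0e-11):.2f} nm")
```

Because Rh scales inversely with D, the temperature ramp described above needs a temperature-dependent viscosity to be interpreted correctly; a fixed value is only adequate near 25 °C.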
All experiments were repeated at least twice.

Differential scanning fluorimetry

The thermal stability of the expressed EXTL3∆N proteins was determined by differential scanning fluorimetry (DSF) using the Mx3005P qPCR system (Stratagene), which monitors the thermal unfolding of proteins in the presence of a fluorescent dye 33. Clarified proteins were mixed with the dye SYPRO Orange (Life Technologies) at 1:1000-fold dilution, to a protein concentration of about 0.1 mg/mL. The samples were heated from 25 to 90 °C and fluorescence measurements were taken at 1 °C intervals. Thermal melting curves were plotted, and the instrument software and Excel were used to calculate the melting temperature (Tm) values according to published methods 33. Measurements were repeated at least twice for each buffer condition.

SAXS data collection and processing

Synchrotron radiation solution X-ray scattering data were collected on the BM29 beamline of ESRF, Grenoble, France 34 at wavelength 0.992 Å using a PILATUS 1M detector (DECTRIS), at a sample-to-detector distance of 2.8 m, covering a momentum transfer (Q) range of 0.05–5 nm⁻¹ (Q = 4π sinθ/λ, where 2θ is the scattering angle and λ is the wavelength) 34. Water measurements were used as reference for further measurements and to give preliminary estimates of the sample molecular weight using the known absolute scattering of water (I0,abs(water) = 1.632 × 10⁻² cm⁻¹ at 25 °C). Bovine serum albumin (BSA) samples were also used for the molecular weight calibration. Size exclusion chromatography in-line with SAXS (SEC-SAXS) was used to obtain a scattering data set from a monodisperse form of EXTL3∆N using an HPLC system (Viscotek GPCmax, Malvern Instruments) attached directly to the sample inlet valve of the sample changer at BM29. 100 µl of 5 mg/mL clarified EXTL3∆N samples were injected manually into a Superdex 200 10/300 GL column (GE Healthcare) pre-equilibrated with three column volumes of degassed 20 mM bicine, 150 mM NaCl, 3 mM DTT (pH 8.6). The proteins were eluted at a flow rate of 0.5 mL/min and passed through the 1.8 mm-diameter quartz capillary cell, and a scattering frame was collected every 2 sec, for a total of 1000 frames. The EDNA pipeline 35 provided a one-dimensional profile for each frame. All frames were compared with the initial frame and the first 300 frames were merged to generate the reference profile. Any subsequent frames that differed from the reference one were subtracted and then processed with ATSAS suite tools 36, calculating the forward scattering I(0) and radius of gyration Rg for each one. Five hundred and thirty frames with stable Rg values towards the end of the main elution peak were merged to provide a single averaged scattering profile corresponding to the protein dimer. To maximize the signal-to-noise ratio, the targeted frames were reprocessed manually. All data manipulations and processing were carried out using the PRIMUS software 37 using standard procedures. Determination of the I(0) and Rg of the samples was performed using the Guinier approximation at Q ≤ 1.3/Rg. Real-space Rg, excluded particle (Porod) volume and the maximum particle dimension Dmax were calculated from the pair distance distribution function P(r) using the program GNOM 38. Molecular weight was also estimated using the SAXS MoW2 server 39.
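The Guinier analysis mentioned above rests on the approximation ln I(Q) ≈ ln I(0) − (Rg²/3)·Q², valid at Q·Rg ≤ 1.3 in this work. As a minimal sketch (not the PRIMUS implementation), the code below fits a straight line to ln I versus Q² and shrinks the fitting window until it satisfies that limit. The function names and the synthetic data are illustrative assumptions; real data would also need error weighting and exclusion of the lowest-Q points affected by the beamstop.

```python
import math


def momentum_transfer(two_theta_rad, wavelength_nm):
    """Q = 4*pi*sin(theta)/lambda, where 2*theta is the scattering angle."""
    return 4.0 * math.pi * math.sin(two_theta_rad / 2.0) / wavelength_nm


def linear_fit(xs, ys):
    """Unweighted least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def guinier_fit(q, intensity, qrg_max=1.3, max_iter=50):
    """Estimate (Rg, I0, n_points) from data sorted by ascending Q.

    The window starts with all points and is re-chosen after each fit so
    that only points with Q*Rg <= qrg_max are used.
    """
    n_pts = len(q)
    rg = i0 = None
    for _ in range(max_iter):
        xs = [qi * qi for qi in q[:n_pts]]
        ys = [math.log(ii) for ii in intensity[:n_pts]]
        slope, intercept = linear_fit(xs, ys)
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; cannot derive Rg")
        rg = math.sqrt(-3.0 * slope)
        i0 = math.exp(intercept)
        new_n = sum(1 for qi in q if qi * rg <= qrg_max)
        if new_n < 3:
            raise ValueError("too few points inside the Guinier region")
        if new_n == n_pts:
            break
        n_pts = new_n
    return rg, i0, n_pts


if __name__ == "__main__":
    # Synthetic, noise-free Guinier curve (hypothetical Rg = 5 nm, I0 = 100).
    q = [0.05 + 0.01 * i for i in range(60)]          # nm^-1
    intensity = [100.0 * math.exp(-(5.0 ** 2) * qi ** 2 / 3.0) for qi in q]
    print(guinier_fit(q, intensity))
```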


Ab initio shape reconstructions

The P(r) plot was generated using GNOM including Q values up to 1.6 nm⁻¹, and its output was the basis for low-resolution ab initio shape reconstruction using simulated annealing optimization of a dummy atom set, as implemented in the program DAMMIN 40. The results of 20 independent DAMMIN runs were aligned and averaged using the DAMAVER program suite 41, by minimizing the normalized spatial discrepancy (NSD) between the models, selecting the most probable ones and building an averaged model. The mean NSD between the independent DAMMIN reconstructions provides a useful estimate of the reliability of the models. DAMFILT was used as a part of the DAMAVER suite to filter out the low bead occupancy positions and the loosely connected atoms from the averaged model, generating the most representative model that agrees with the particle excluded volume. The DAMSTART model, generated from the averaged model by DAMAVER, was used as the initial model for another slow-mode DAMMIN run. Graphical representations of the models were generated using the PyMOL Molecular Graphics System, Version 1.6 (Schrödinger, LLC, New York, NY). To test the dependence of the ab initio reconstruction of EXTL3∆N on the algorithm used to calculate it, the GASBOR program 42 was also used to estimate the ab initio envelope (20 independent runs), and the resulting envelopes were averaged using DAMAVER as described earlier. Both DAMMIN and GASBOR produced models with similar overall shapes (Figure S1).
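The P(r) function that underlies these reconstructions also gives the real-space radius of gyration through Rg² = ∫r²P(r)dr / (2∫P(r)dr), with Dmax being the distance at which P(r) falls to zero. The sketch below checks that relation numerically on the analytical P(r) of a uniform sphere, for which Rg = √(3/5)·R is known. It is a worked illustration only: GNOM obtains P(r) by regularized indirect Fourier transformation of the data, which is not reproduced here, and the sphere is a test object, not a model of EXTL3∆N.

```python
import math


def sphere_pr(r, diameter):
    """Unnormalized pair distance distribution of a uniform sphere."""
    if r < 0 or r > diameter:
        return 0.0
    x = r / diameter
    return x * x * (1.0 - 1.5 * x + 0.5 * x ** 3)


def trapezoid(xs, ys):
    """Trapezoidal integration of tabulated y(x)."""
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )


def real_space_rg(rs, prs):
    """Rg from tabulated P(r): Rg^2 = int(r^2 P) / (2 * int(P))."""
    numerator = trapezoid(rs, [r * r * p for r, p in zip(rs, prs)])
    denominator = trapezoid(rs, prs)
    return math.sqrt(numerator / (2.0 * denominator))


if __name__ == "__main__":
    dmax = 10.0  # nm, hypothetical sphere diameter
    rs = [dmax * i / 2000 for i in range(2001)]
    prs = [sphere_pr(r, dmax) for r in rs]
    print("numerical Rg :", real_space_rg(rs, prs))
    print("analytical Rg:", math.sqrt(3.0 / 5.0) * dmax / 2.0)
```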

Results

Bioinformatics analysis

Protein sequence analysis (Figure 1) suggests that human EXTL3 (919 residues) is made up of three different regions: a cytoplasmic region (Met1–Thr30), a helical transmembrane region (Trp31–Leu51) and a Golgi lumenal region (Thr52–Ile919). We analyzed the lumenal catalytic region of EXTL3 using different intrinsic disorder predictors, which indicated the presence of two disordered parts (residues 160-179 and 556-650) that presumably divide the lumenal region into at least two domains. Extensive database searches and analysis of a previously published sequence alignment of the exostosin family 5 predicted three structural domains in the lumenal part of EXTL3. The first is a coiled-coil region (residues 53-159). The second domain (approximately residues 172-455) belongs to glycosyl transferase family 47 (GT47) and is predicted to add β-1,4-GlcA residues to HS chains terminating in GlcNAc (GlcA-TII). The final, C-terminal domain (residues 655-904) belongs to glycosyl transferase family 64 (GT64) and is predicted to add α-1,4-GlcNAc residues to HS chains terminating in GlcA (GlcNAc-TII), according to the CAZy database 43. Structurally this domain is predicted to be very similar to the Rossmann fold seen in mouse EXTL2.

Furthermore, sequence analysis predicted the occurrence of 5 DXD motifs within the lumenal part of EXTL3. Of these, the sequence DDD between residues 744 and 746 is predicted to be involved in the GlcNAc-TI activity. By homology to mouse EXTL2, Asp744 will bind the donor sugar, Asp745 will bind the UDP ribose moiety and Asp746 will bind the catalytic metal. Using Robetta, we succeeded in generating a homology model for the C-terminal GT64 domain (residues 665-890) with 73% confidence (fraction of similar structure) based on the EXTL2 crystal structure template. The homology model confirms the conservation of other key residues found in mouse EXTL2, e.g. Glu832 and Asp833, involved in interactions with the donor sugar, Tyr670 that should stack with the UDP base, Asn723 that should H-bond to the UDP base and Arg672 that should form a salt bridge to the UDP phosphate groups. As in EXTL2, a disulfide bond between Cys831 and Cys879 reinforces the structure near the active site. In contrast to the GT64 domain, we failed to obtain 3-dimensional models for the other parts, at least for the entire domains. However, the GT47 domain is predicted to belong to the fold family GT-B, with two consecutive Rossmann fold domains 44. This is supported by results from the Phyre2 modelling server, which found 13 distant structural homologs for a ~120 residue segment near the end of the GT47 domain (residues ~413–535), resulting in models with over 90% confidence. All of the homologs belong to GT-B; two typical examples are 3RHZ, a glycosyltransferase required for glycosylation of serine-rich streptococcal adhesins (112 residues aligned, 20% identity), and 4W6Q, glycosyl transferase C from Streptococcus agalactiae (114 residues, 13% identity). The homologous sequences were all from the same region of the C-terminal nucleotide donor binding domains of these GT-B enzymes. This supports the prediction that the middle domain of EXTL3 most likely belongs to the GT-B fold family.
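Sequence motifs such as the DXD motifs discussed above, and the N-X-S/T sequons (X ≠ Pro) that define potential N-glycosylation sites, can be located with a simple pattern scan. The following is a minimal sketch of such a scan and is not the prediction pipeline used in this work. The test peptide is a made-up sequence, not part of EXTL3, and the `offset` parameter is an illustrative convenience for numbering a construct that does not start at residue 1.

```python
import re

# Lookahead patterns so that overlapping matches (e.g. in "DDDD") are all found.
DXD = re.compile(r"(?=(D[A-Z]D))")
SEQUON = re.compile(r"(?=(N[^P][ST]))")  # N-X-S/T with X != Pro


def find_motifs(sequence, pattern, offset=1):
    """Return [(residue_number, matched_text), ...].

    offset is the residue number of the first character of `sequence`
    (1 for a full-length protein, 52 for a construct starting at Thr52).
    """
    seq = sequence.upper()
    return [(m.start() + offset, m.group(1)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    toy = "MKDDDLNATPDADNPSG"  # hypothetical peptide, for illustration only
    print("DXD motifs:", find_motifs(toy, DXD))
    print("sequons   :", find_motifs(toy, SEQUON))
```

A sequon match only marks a candidate site; whether it is actually glycosylated has to be established experimentally, as done here by mass spectrometry.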


[Figure 1 image: bar diagram of human EXTL3 (residues 1–919) with the cytoplasmic domain starting at residue 1, the transmembrane helix at residue 31 and the lumenal region at residue 52. Prediction tracks: GlobDoms domain prediction, conserved domain database search, Ginzu domain prediction, HHpred, Dom-Pred, CAZy database. Legend: coiled coil; glycosyl transferase family 47; glycosyl transferase family 64; conserved Cys; predicted N-glycosylation site; DXD motif.]

Figure 1. Schematic representation of the human EXTL3 sequence with annotation of the cytoplasmic, transmembrane and lumenal regions. The predicted N-glycosylation sites, the 7 EXT-family conserved cysteines and the DXD motifs are displayed as green, yellow and orange lines respectively. The 3 predicted structurally distinct regions, namely the coiled coil and glycosyl transferase families 47 and 64, are shown as red, blue and violet bars respectively.

Expression, identification and purification of EXTL3∆N A cDNA for human EXTL3 protein (GenBank: BC006363.2) encoding residues 52-919, i.e. the lumenal part of human EXTL3 without the sequences for the N-terminal cytoplasmic region and the transmembrane helix, was cloned into the pCEP4-BM40-HisTEV mammalian expression vector and introduced into human embryonic kidney cells. The cells were cloned by limiting dilution under the selective pressure of hygromycin B to obtain stable cell clones expressing EXTL3∆N. These were grown to confluence, then maintained in protein-free medium for 3 to 4 days. The EXTL3∆N expression level was quantified by Coomassie staining of SDS-PAGE gels, and the clones with high expression of EXTL3∆N were selected and used in further studies. For protein purification, a batch of 200 mL of the conditioned medium was harvested and the EXTL3∆N protein was purified by Ni-NTA affinity chromatography and gel filtration, yielding up to 20 mg/L of conditioned medium. The purified protein was then analyzed by SDS-PAGE and yielded bands with molecular weights (MW) of ~125 and ≥300 kDa under non-reducing conditions and a single band of ~125 kDa under reducing conditions (Figure 2A). The ~300 kDa band appeared as a smear, suggesting the presence of large oligomers or aggregates. Almost no protein degradation was detected by SDS-PAGE. To verify the identity of the expressed proteins, protein bands excised from the gels were destained, reduced, alkylated 31 and then subjected to trypsin in-gel digestion overnight at 37 °C followed by LC-MS/MS analysis, which confirmed that all bands contained EXTL3∆N protein, with high sequence coverage (more than 70%) and acceptable MS scores (Figure 2B).

Figure 2. EXTL3 expression and identification. (A) SDS-PAGE analysis of EXTL3∆N proteins. Lanes 1 and 2: non-reduced and reduced glycosylated EXTL3∆N; lanes 3 and 4: non-reduced and reduced non-glycosylated EXTL3∆N respectively. (B) A typical band of EXTL3∆N was cut from a Coomassie-stained SDS-PAGE gel and subjected to proteolysis using trypsin followed by LC-MS/MS analysis. Peptides identified by mass spectrometry and subsequent database search are shown as black boxes and mapped on the EXTL3∆N schematic sequence. The other bands produced similar results.
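The sequence coverage quoted for the LC-MS/MS identification is the fraction of residues in the protein that fall inside at least one identified peptide. A minimal sketch of that calculation is given below; the function name, the toy sequence and the toy peptides are hypothetical and are not EXTL3 data, and real search engines may handle repeated or ambiguous peptide matches differently.

```python
def sequence_coverage(protein: str, peptides: list[str]) -> float:
    """Return the percentage of residues covered by at least one peptide."""
    covered = [False] * len(protein)
    for pep in peptides:
        start = protein.find(pep)
        while start != -1:
            # Mark every residue spanned by this occurrence of the peptide.
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Hypothetical toy sequence and peptides, for illustration only.
toy_protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
toy_peptides = ["TAYIAK", "QISFVK", "SHFSR", "LGLIEVQ"]
print(f"coverage = {sequence_coverage(toy_protein, toy_peptides):.1f}%")
```

Overlapping peptides are counted once per residue, so the value cannot exceed 100%.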

Enzymatic removal of N-glycans and annotation of N-glycosites using MS The primary sequence of EXTL3∆N reveals four possible N-glycosylation sites at Asn277, Asn290, Asn592 and Asn790 (Figure 1), based on the fact that N-glycans may only attach to Asn in the tripeptide consensus sequon Asn-Xxx-Ser/Thr/Cys, where Xxx can be any amino acid except proline 45. To investigate N-glycosylation of EXTL3∆N, we treated the EXTL3∆N-expressing cells with the plant alkaloid kifunensine, a potent inhibitor of mannosidase I enzymes, to produce primarily N-linked glycans of the high-mannose type that are sensitive to EndoH 46. The EXTL3∆N-expressing cells survived well upon treatment with up to 10 µM kifunensine for 24 h (data not shown), accompanied by a 50% increase in the level of protein expression, yielding up to 30 mg protein per litre of the conditioned medium. A batch of conditioned medium (200 mL) was harvested from EXTL3∆N-producing cells treated with 10 µM kifunensine and the protein was purified by Ni-NTA affinity chromatography, then subjected to EndoH treatment under native conditions overnight at 37 °C. EndoH was then removed by repeating the Ni-NTA purification and the deglycosylation efficiency was verified by Coomassie Blue-stained SDS-PAGE (Figure 2A). Two bands of EXTL3∆N with MW of ~125 and 300 kDa were detected on the gel under non-reducing conditions, whereas only one band with MW of ~125 kDa, corresponding to monomeric EXTL3∆N, appeared under reducing conditions. Both bands showed a reduction in MW of ~10 kDa after treatment with EndoH (Figure 2A; lanes 3 and 4), indicating deglycosylation upon kifunensine and EndoH treatments. Mapping of the N-glycosylation sites using MS was performed to identify the occupancy of the four possible N-glycosylation sites on EXTL3∆N. EndoH hydrolyses the glycosidic bond between the two innermost GlcNAc residues and leaves one GlcNAc with a mass of 203.079 Da attached to the Asn. This mass change can be detected in both MS and MS/MS patterns and can therefore be used for identifying the N-glycosites. LC-MS/MS analysis of both glycosylated and deglycosylated (EndoH-treated) trypsin-digested EXTL3∆N SDS-PAGE bands was performed. The LC-MS/MS analysis and database searches gave sequence coverage of 72% and 87% for glycosylated and deglycosylated EXTL3∆N respectively. Peptides containing Asn277 and Asn290 were identified when analyzing the mass list from the glycosylated EXTL3∆N, which is a preliminary indication that these asparagines might not be glycosylated. In contrast, no peptides containing Asn592 or Asn790 were detected in the glycosylated EXTL3∆N. Moreover, peptides containing Asn290 and Asn592 with extra masses of 203.079 Da were also found in the deglycosylated samples.
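The sequon rule used in this section to predict candidate N-glycosites (Asn-Xxx-Ser/Thr/Cys with Xxx not proline) can be sketched as a short pattern scan. The function name and the `offset` parameter are illustrative assumptions; the first test sequence is a hypothetical toy string, while the second is the NFTLTVTDFYR peptide reported in Figure 3, numbered here so that its first residue is Asn592.

```python
import re

# N, then any residue except P, then S, T or C. The zero-width lookahead
# allows overlapping sequons (e.g. "NNCS") to be reported individually.
SEQUON = re.compile(r"(?=N[^P][STC])")


def find_sequons(sequence: str, offset: int = 1) -> list[int]:
    """Return residue numbers of Asn in N-X-S/T/C sequons.

    `offset` is the residue number assigned to the first character.
    """
    return [m.start() + offset for m in SEQUON.finditer(sequence.upper())]


# Hypothetical toy sequence: NFT matches, NPS is excluded by the proline rule.
print(find_sequons("AANFTLTNPSQNNCSA"))
# Peptide from Figure 3, numbered from Asn592.
print(find_sequons("NFTLTVTDFYR", offset=592))
```

A match only marks a potential site; as the occupancy data in this section show, a predicted sequon may be unoccupied or only partially occupied.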
Subsequent MS/MS analysis and sequencing of the precursor fragment ions verified that Asn290 and Asn592 were indeed glycosylated (Figure 3). Taken together, these data demonstrate that Asn592 is invariably occupied with N-glycans, whereas Asn290 is variably occupied (i.e. present in both glycosylated and non-glycosylated states), as it was found both with and without the extra mass of 203.079 Da corresponding to GlcNAc. Regarding Asn790, no experimental data could be obtained for either the glycosylated or the deglycosylated samples, even after double digestion using trypsin and chymotrypsin. This may be due to suppression of the corresponding peptide mass ion by other, more abundant mass ions, or to incomplete deglycosylation.
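The 203.079 Da mass shift used to flag occupied sites can be reproduced from elemental composition. The sketch below assumes standard monoisotopic atomic masses and treats the GlcNAc left on Asn as a residue, i.e. free GlcNAc (C8H15NO6) minus one water; the helper name is illustrative.

```python
# Standard monoisotopic atomic masses in Da.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def monoisotopic_mass(composition: dict[str, int]) -> float:
    """Sum monoisotopic masses for an elemental composition."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


# Free GlcNAc is C8H15NO6; attachment through a glycosidic bond removes H2O,
# so the residue that remains on the asparagine is C8H13NO5.
GLCNAC_RESIDUE = {"C": 8, "H": 13, "N": 1, "O": 5}

delta = monoisotopic_mass(GLCNAC_RESIDUE)
print(f"GlcNAc residue mass shift: {delta:.3f} Da")
```

A tryptic peptide carrying one such residue is therefore expected at the unmodified peptide mass plus this increment, which is the signature searched for in the deglycosylated samples.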


Figure 3. Annotation of MS/MS pattern of the NFTLTVTDFYR peptide, containing the Asn592 glycosite, found in the EndoH deglycosylated EXTL3∆N sample. Collision-induced dissociation of the GlcNAc residue resulted in appearance of the intact peptide (without the GlcNAc residue) as the most intense fragment, as indicated by MH+ in the image. Most fragment ions, i.e. b- and y-ions, correspond to the fragmentation of the peptide without GlcNAc residue, whereas three fragments (encircled) are denoted as b-ions consisting of the N-terminal part of the peptide with the GlcNAc residue attached.

Effect of inhibition of N-glycosylation on EXTL3∆N secretion N-glycosylation affects different properties of glycoproteins, including their conformation, oligomerization, solubility, stability and quality 47. To explore the importance of N-glycosylation for the properties of soluble EXTL3, EXTL3∆N-producing cells were treated with tunicamycin, which prevents N-glycosylation by inhibiting the GlcNAc phosphotransferase enzyme that catalyzes the transfer of N-acetylglucosamine-1-phosphate from UDP-N-acetylglucosamine to dolichol phosphate in the first step of glycan synthesis 32. The EXTL3∆N-expressing cells were grown to 80% confluency, then incubated in protein-free medium with various concentrations of tunicamycin (0-3 µg/mL) for 24 h. The culture media were then harvested and the cells were extracted in RIPA buffer. Tunicamycin concentrations higher than 1.5 µg/mL induced cell death; however, lower concentrations showed ~90% cell survival. Furthermore, cell extracts of untreated and tunicamycin-treated cells (up to 1.5 µg/mL) contained similar total amounts of protein, indicating that tunicamycin treatment for 24 h at concentrations up to 1.5 µg/mL was not detrimental to the cells. The conditioned medium and cell extracts were then analyzed by Western blot using an anti-His-tag antibody (Figure 4A). The conditioned medium of untreated cells displayed a protein band of 125 kDa, corresponding in size to purified EXTL3∆N. Treatment with tunicamycin at concentrations of 0.5 µg/mL and higher resulted in a significant reduction in the size of the band, indicating inhibition of N-glycosylation. Non-glycosylated EXTL3∆N was no longer detected in the media of tunicamycin-treated (> 0.5 µg/mL) cells. Cell extracts of untreated and treated cells showed a similar reduction in the size of expressed EXTL3∆N. The protein amount in the bands was further quantified using the Gel-Pro Analyzer software (Figure 4B). This analysis indicated that in tunicamycin-treated cells almost no EXTL3∆N (0 to 5%) was secreted to the medium and that the non-glycosylated protein accumulated inside the cells. Taken together, it appears that N-glycosylation is critical for proper secretion of EXTL3∆N, although the experiments do not show whether the protein is properly folded or not.

[Figure 4 panels. (A) Western blot, lanes 1-8: conditioned medium and cell extract at 0, 0.5, 1 and 1.5 µg/mL tunicamycin, with pure glycosylated and deglycosylated EXTL3∆N and an α-tubulin loading control. (B) Bar chart of relative protein expression (%, axis 0-250) against tunicamycin concentration (0, 0.5, 1.0 and 1.5 µg/mL) for conditioned medium and cell extract.]

Figure 4. Inhibition of N-glycosylation by tunicamycin treatment. The EXTL3∆N-expressing cells were incubated with various concentrations of tunicamycin as described in the experimental procedures, and conditioned medium and cell extracts were probed using an anti-His-tag antibody in a Western blot analysis. (A) A representative Western blot image of two independent experiments is shown with Ni-NTA-purified EXTL3 included as a control in lane 9. Anti-α-tubulin was used as loading control. (B) The amount of EXTL3 protein in the Western blot was quantified using the Gel-Pro Analyzer software. The data are protein levels relative to the protein amounts in tunicamycin-untreated conditioned medium and cell extract.

Folding and stability of EXTL3∆N The EXTL3∆N secondary structure was predicted from the sequence using the Chou-Fasman algorithm, which estimated fractions of 23%, 45% and 23% for α-helical, β-sheet and irregular elements respectively. To inspect experimentally whether EXTL3∆N is folded and to estimate further its secondary structure content, far-UV CD spectra of EXTL3∆N protein in 20 mM sodium phosphate buffer (pH 7.8) were recorded at 25 °C. The spectrum exhibited a peak at 193 nm and a single trough at 210 nm, which are characteristic of a protein containing predominantly β-sheets (Figure 5A). The spectrum was compared by the program CDSSTR to a reference set of 43 proteins using the Dichroweb server 48, which estimated the secondary structure fractions in EXTL3∆N to be around 14% α-helical conformations, 25% β-sheet elements, 23% turns and 35% unordered structures.

[Figure 5 panels. (A) Far-UV CD spectra at 25 °C, 60 °C, 95 °C and after cooling to 25 °C; x-axis: wavelength (nm), 190-250; y-axis: [θ] (10^3 deg cm2 dmol-1). (B) Thermal denaturation; x-axis: temperature (°C), 20-90; y-axes: fluorescent signal (DSF, Tm = 55 °C) and hydrodynamic radius Rh (nm) (DLS, Tm = 59 °C).]

Figure 5. Folding and thermal stability of EXTL3∆N in solution. (A) Far-UV CD spectra of native EXTL3∆N protein in 20 mM sodium phosphate buffer (pH 7.8) at 25, 60 and 95 °C, and after cooling to 25 °C. (B) Heat denaturation of native EXTL3∆N in 20 mM bicine (pH 8.6), 150 mM NaCl and 3 mM DTT was monitored by DLS (red triangles) and DSF (green squares) measurements.

Furthermore, the thermal stability of EXTL3∆N was investigated using CD spectroscopy. The protein was heated at a rate of 1 °C/min and a far-UV CD spectrum (190-250 nm) was collected every 5 °C. A decrease of the ellipticity signal at 210 nm was evident until 60 °C, with little extra signal reduction occurring afterwards (Figure 5A). This indicates that unfolding of EXTL3∆N occurred around 60 °C. A far-UV CD spectrum was also recorded at 25 °C after the protein had been heated to 95 °C and cooled back to 25 °C. A proportion of the molar ellipticity at 193 nm and 210 nm was lost when the protein was heated to 95 °C and was not regained on cooling to 25 °C (Figure 5A). This indicates that heat denaturation of EXTL3∆N is irreversible, or at most partially reversible.

The size exclusion chromatography (SEC) profile of EXTL3∆N in 20 mM sodium phosphate buffer (pH 7.5) displayed a main elution peak corresponding to a protein dimer (see later) and a short shoulder of larger forms. The homogeneity of the eluted protein fractions from the main peak was assessed by DLS, which indicated a polydisperse population in the main peak with hydrodynamic radius Rh ~10.5 nm, expected MW = 793 ± 78 kDa and a high polydispersity index (PI) of ~30-40% (data not shown). This demonstrated the need to optimize the EXTL3∆N buffer before further structural studies. We examined the effect of 24 buffers with various buffer types, pH and salt contents using DSF and DLS 33. DSF is a rapid screening method to identify the stabilization effect of a new buffer relative to a reference buffer by the measurement of Tm 33, whereas DLS characterizes the solubility and monodispersity of the purified proteins in the tested buffer. The analyses revealed better stability and solubility of the expressed EXTL3∆N at higher pH (Rh = 7.8-8.9 nm and Tm = 50-60 °C), where bicine buffer (pH 8.3) conferred the best monodispersity (Rh ~7.8 nm) with moderate stability (Tm = 56 °C). Subsequently, the bicine pH was optimized and various additives (such as glycerol and DTT) were examined. The best buffer identified contained 20 mM bicine pH 8.6, 150 mM NaCl and 3 mM DTT, which gave a monodisperse DLS pattern with Rh of 7.1 ± 1.2 nm and PI of 18%. Under these conditions, EXTL3 exhibited sigmoidal melting curves using both DSF and DLS, with measured Tm of 55 °C and 59 °C respectively (Figure 5B). Therefore EXTL3 was purified using SEC in a buffer consisting of 20 mM bicine pH 8.6, 150 mM NaCl and 3 mM DTT before further experiments.
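For readers unfamiliar with how a Tm is read off a sigmoidal melting curve, the sketch below shows one common approach: taking the temperature at which the signal rises most steeply. The text does not state which procedure was used to obtain the 55 °C and 59 °C values, so this is an assumption for illustration; the curve is synthetic and noise-free, and all parameter values and function names are hypothetical.

```python
import math


def boltzmann(temp_c: float, low: float, high: float, tm: float, slope: float) -> float:
    """Two-state sigmoid often used to describe a thermal melting curve."""
    return low + (high - low) / (1.0 + math.exp((tm - temp_c) / slope))


def tm_from_derivative(temps: list[float], signal: list[float]) -> float:
    """Estimate Tm as the midpoint of the interval with the steepest rise."""
    steepest = max(
        range(len(temps) - 1),
        key=lambda i: (signal[i + 1] - signal[i]) / (temps[i + 1] - temps[i]),
    )
    return 0.5 * (temps[steepest] + temps[steepest + 1])


# Synthetic curve from 20 to 90 °C in 0.5 °C steps (illustrative parameters).
temps = [20.0 + 0.5 * i for i in range(141)]
signal = [boltzmann(t, 4000.0, 7000.0, 55.0, 2.0) for t in temps]

tm_estimate = tm_from_derivative(temps, signal)
print(f"estimated Tm = {tm_estimate:.2f} °C")
```

With real, noisy data the derivative is usually smoothed first, or the sigmoid is fitted directly; the finite-difference estimate here is limited to the resolution of the temperature grid.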

Structural studies using SAXS and DLS To elucidate the structure of EXTL3∆N protein in solution, SAXS data were collected for a monodisperse fraction of the purified glycosylated protein using SEC-SAXS (Table 1). The Rg and total scattering distribution of the individual frames are shown in Figure 6A. The 53 scattering frames from the second half of the main eluted peak, with a stable Rg of around 4.8 nm, were merged to provide a single averaged profile corresponding to the scattering of the monodisperse population, which was further analysed using PRIMUS (Figure 6B and Table 1). No sign of protein aggregation was observed in this fraction. Furthermore, the Kratky plot displayed a bell-shaped curve characteristic of folded proteins (Figure 6D). The estimated MW indicated the presence of an EXTL3∆N dimer in the main peak (Rg = 4.8 ± 0.047 nm and MW 241 kDa estimated from the Porod volume; Table 1). This corresponds to about twice the estimated molecular weight of a glycosylated monomer, suggesting that the species in solution is a dimer. The P(r) function was calculated from the scattering data using GNOM 38, to estimate the electronic distribution within the EXTL3∆N particle and its shape (Figure 6E). The P(r) profile of the EXTL3∆N dimer showed a main peak around 4.9 nm and a quite extended tail giving a secondary peak at large distances, with a maximum dimension (Dmax) of 16.8 nm, indicating a multi-domain structure, which is consistent with our prediction of two functional domains and a dimeric arrangement.

Table 1. SAXS data collection and model parameters

Parameter                                        EXTL3∆N dimer
Beamline                                         BM29, ESRF
Detector                                         Pilatus 1M
Beam size at sample (mm2)                        0.7 x 0.7
Wavelength (Å)                                   0.9919
q range (Å-1)                                    0.1-5
Loaded concentration (mg/mL)                     5
Structural parameters
  I(0) (from P(r))                               12.7
  Rg (Å) (from P(r))                             48.8
  I(0) (from Guinier)                            12.8 ± 0.09
  Rg (Å) (from Guinier)                          47.8 ± 0.47
  Dmax (Å)                                       168.3
  Porod volume estimate, Vp (Å3)                 409 750
  Average excluded volume, Vex (Å3)              468 184
Molecular mass (kDa)
  From I(0)                                      189.5
  From Porod volume                              241.3
  From SAXSMoW2                                  272.2
  From protein sequence                          102
  From protein sequence + N-glycans              112
Modelling parameters
  Symmetry                                       P1, P2
  χ2 of DAMMIN models, P1                        0.93
  DAMAVER (20 DAMMIN P1) NSD                     0.45 ± 0.01
  χ2 of DAMMIN models, P2                        0.92
  DAMAVER (20 DAMMIN P2) NSD                     0.39 ± 0.01

The homogeneity of the eluted protein fractions was assessed by DLS to obtain the Rh, without performing further concentration (Figure 6C). The volume distribution plots of the fractions from the second half of the EXTL3∆N SEC peak showed a single, monodisperse peak with PI of ~13% and Rh = 4.9 ± 0.7 nm for the smallest EXTL3∆N species. An estimation of the MW from the DLS measurements, assuming a globular protein, suggested 144.9 ± 87.1 kDa for the molecule. Given the limited accuracy of DLS experiments, the results are consistent within experimental error with the SAXS results that suggest a dimer in solution. The typical shape factor σ (the Rg/Rh ratio) for a globular protein is ~0.774; however, when molecules diverge from a globular shape towards an ellipsoidal one, σ increases, as the Rg becomes larger relative to Rh 49. EXTL3∆N has σ = 0.98, which is consistent with an extended structure for the EXTL3∆N dimer.
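The shape factor quoted above follows directly from the SAXS Rg and the DLS Rh. As a worked check, the sketch below recomputes σ from the reported values and compares it with the hard-sphere reference, for which Rg/Rh equals the square root of 3/5 (about 0.775, in line with the ~0.774 quoted in the text). The variable and function names are illustrative.

```python
import math

RG_NM = 4.8   # radius of gyration from SAXS, as reported in the text
RH_NM = 4.9   # hydrodynamic radius from DLS, as reported in the text


def shape_factor(rg_nm: float, rh_nm: float) -> float:
    """Return the dimensionless shape factor sigma = Rg / Rh."""
    return rg_nm / rh_nm


sigma = shape_factor(RG_NM, RH_NM)
sphere_reference = math.sqrt(3.0 / 5.0)  # homogeneous hard sphere

print(f"sigma = {sigma:.2f}, sphere reference = {sphere_reference:.3f}")
is_more_extended_than_sphere = sigma > sphere_reference
```

Because Rh carries an uncertainty of ± 0.7 nm, σ itself is uncertain by roughly 15%, so the comparison indicates a trend towards an elongated shape rather than a precise geometry.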


Figure 6. SEC-SAXS run of EXTL3∆N. The total scattering intensity (violet) and the calculated Rg (green) obtained for each frame of the eluted EXTL3∆N protein from a Superdex-200 column are shown in (A). The scattering images of stable Rg (red box) were merged and radially averaged to produce the final SAXS pattern shown in (B), with the fit for the DAMMIN model built with P1 symmetry (black line). (C) Normalized DLS size distribution plot of the EXTL3∆N dimer with annotated Rh. SAXS-derived Kratky and normalized P(r) plots of the EXTL3∆N dimer are shown in (D) and (E). (F) Ab initio averaged model of the EXTL3∆N dimer derived from the P(r) data, depicted as a surface volume. A detailed gallery of the individual envelopes is shown in Supporting Figure 1.

Ab-initio modelling The DAMMIN ab-initio shape reconstruction program 40 was used to generate 20 models of the EXTL3∆N dimer, using both P1 and P2 symmetry. The models showed very good structural convergence, as reflected by the low NSD of 0.45 ± 0.01 in P1 and 0.39 ± 0.01 in P2 (Table 1). There was good agreement between the individual models, as well as their averages, when comparing P1 and P2 symmetry (Table 1 and Supporting Figure 1). The EXTL3∆N ab-initio envelope revealed an extended structure consisting of two distinctive regions: a large one with maximum dimensions ~112 x 78 x 90 Å and a small one with maximum dimensions ~50 x 44 x 37 Å (Figure 6F). It seems likely that the narrow region at the base of the envelope as shown in Figure 6F corresponds to the coiled-coil domain, while the larger regions in the middle and upper parts of the envelope correspond to the GT47 and GT64 domains.

Discussion HS chains are complex, sulfated, unbranched polysaccharides found mainly at the cell surface, which bind to and affect the activity of various molecules such as growth factors and morphogens. They are thus involved in various cellular communications and interactions. Exostosin proteins (EXTs) are glycosyltransferases of the Golgi apparatus that assemble the HS on HSPG proteins. EXTL3 is the largest member of the EXT family, and it has been found to work as an initiator of HS chain biosynthesis 5, thus providing a good model for structural and mechanistic investigations of the EXT family. Human EXTL3 is inserted into the Golgi apparatus membrane by an N-terminal transmembrane helix of 20 residues, followed by an unstructured protein segment of ~18 residues, rich in charged amino acids, then around 80 residues of coiled-coil structure that may play a role in positioning and/or orienting the following catalytic domains on the Golgi membrane. Our bioinformatics analysis suggested that the lumenal region contains at least two functional domains with different activities: GlcA-TII activity (addition of β-1,4-GlcA residues) through the large GT47 domain and GlcNAc-TI activity (addition of α-1,4-GlcNAc residues) through the smaller GT64 domain. The glycosyltransferase activities of EXTL3∆N were not confirmed here, but soluble EXTL3 has previously been shown to have GlcNAc-TI and GlcNAc-TII activities 17. A GlcA-TII activity has never been reported for EXTL3.

In this investigation, human lumenal EXTL3∆N was cloned and expressed in human embryonic kidney cells. Correct protein expression was confirmed by Western blot and MS analysis, and the purified proteins were investigated using various biochemical and biophysical techniques to elucidate N-glycosylation sites and their impact on structure and expression, and to study the stability and conformation of the enzyme in solution.

EXTL3, like the other members of the exostosin family and other glycosyltransferases, has several characteristic DXD motifs that can be involved in catalysis. In the case of the GT64 domain of EXTL2 this occurs through binding of the UDP sugar donors via the catalytic metal Mn2+ 16. Of the five DXD motifs in EXTL3, only the one between residues 744-746 (in GT64) is present in all other family members except EXTL1 5. This motif is preceded by a cluster of hydrophobic residues characteristic of functional DXD motifs. The exact relationship of DXD motifs to GlcNAc-TII is, however, still unclear. Studies hitherto have not identified the DXD motifs or the functional domains involved, although the fact that the C-terminal domain belongs to GT64 suggests that the activity also resides there.

N-linked glycosylation of proteins does not occur at every potential glycosylation site, resulting in variability in the occupancy of the potential N-glycosites with N-glycans. This variation in site occupancy is a function of the site availability, substrate concentration and enzyme kinetics in the ER 50. The EXTL3 sequence reveals four possible N-glycosylation sites at Asn277, Asn290, Asn592 and Asn790. All the sites are highly evolutionarily conserved between EXTL3 orthologues, except for Asn277. The latter was found to be non-glycosylated in our hands, which is consistent with its lack of conservation. Further, the MS data showed that Asn290 and Asn592 are variably and invariably occupied with N-glycans respectively, whereas the data neither confirm nor disprove N-glycosylation on Asn790. Secretion of EXTL3∆N devoid of N-glycosylation via tunicamycin treatment was severely impaired. One explanation of this would be an effect on protein folding, where the protein does not fold to the active state in the absence of N-glycans. N-linked oligosaccharides might promote folding directly by stabilizing polypeptide structures, or indirectly by serving as specific recognition structures (a chaperone-like function) that allow the glycoproteins to interact with molecular chaperones resident in the ER during protein folding 51, 52. The improperly folded proteins are then either retained in the ER lumen or targeted for ER-associated degradation 53. However, our experiments are inconclusive on whether loss of secretion is due to improper folding.

We have shown in this study that EXTL3∆N is quite a stable protein, as high temperatures (~59 °C) were required to denature it. The lumenal EXTL3∆N catalytic domain contains 16 Cys residues, 7 of which are conserved within the EXT protein family. These may be involved in protein stabilization by forming disulfide bonds. Preliminary results from the heat denaturation of EXTL3∆N in the presence of reducing agent (DTT) suggest a destabilizing effect of reduction, with a decrease in Tm (~3 °C), which may indicate the presence of some disulfide bonds. Deconvolution of the far-UV CD spectrum of EXTL3∆N revealed a substantial fraction of β-sheet structure (25%), with 14% α-helical conformations, 23% turns and 35% disordered structure. In addition to the normal loops between secondary structure elements, the disordered regions might include the inter-domain linkers and the N-terminal His-tag. Further, we have shown that nearly no refolding occurred for the EXTL3∆N protein after cooling the thermally denatured protein to 25 °C. This indicates irreversible heat denaturation, which usually results from aggregation of thermally denatured protein.


To investigate the structural characteristics of EXTL3∆N in solution, we studied it using SEC-SAXS and DLS. Data analysis of the proteins eluted in the main SEC peak provided estimated MW, Rg, Rh, and σ consistent with an extended dimeric structure. The reconstructed ab-initio envelope of EXTL3∆N revealed a two-region structure, one narrow and one broad. We suggest that the narrow region may contain the coiled-coil region, the middle of the envelope the GT47 domain and the broader upper half the GT64 domain.

Acknowledgements We thank staff at the ESRF SAXS beamlines, in particular Martha Brennich, for help with SAXS data collection. The technical assistance of Sol Da Rocha is greatly appreciated. This work was supported by grants from the Swedish Research Council to DL (project no. 201107119) and KM (project no. K2015-66X-22693-01-4), as well as the Swedish Cancer Society (contract 17 0255), Alfred Österlund, Åhlens and Olle Engkvist Foundation to KM.

Supporting Information Figure S1: Gallery of the 20 individual DAMMIN reconstructions of EXTL3∆N.


References

[1] Couchman, J. R., and Pataki, C. A. (2012) An Introduction to Proteoglycans and Their Localization, J Histochem Cytochem 60, 885-897.
[2] Esko, J. D., Kimata, K., and Lindahl, U. (2009) Proteoglycans and Sulfated Glycosaminoglycans, In Essentials of Glycobiology (Varki, A., Cummings, R. D., Esko, J. D., Freeze, H. H., Stanley, P., Bertozzi, C. R., Hart, G. W., and Etzler, M. E., Eds.) 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor (NY).
[3] Chen, R. L., and Lander, A. D. (2001) Mechanisms underlying preferential assembly of heparan sulfate on glypican-1, J Biol Chem 276, 7507-7517.
[4] Zhang, L., David, G., and Esko, J. D. (1995) Repetitive Ser-Gly Sequences Enhance Heparan Sulfate Assembly in Proteoglycans, J Biol Chem 270, 27127-27135.
[5] Busse-Wicher, M., Wicher, K. B., and Kusche-Gullberg, M. (2014) The exostosin family: Proteins with many functions, Matrix Biol 35, 25-33.
[6] Nadanaka, S., Zhou, S., Kagiyama, S., Shoji, N., Sugahara, K., Sugihara, K., Asano, M., and Kitagawa, H. (2013) EXTL2, a Member of the EXT Family of Tumor Suppressors, Controls Glycosaminoglycan Biosynthesis in a Xylose Kinase-dependent Manner, J Biol Chem 288, 9321-9333.
[7] Busse, M., Feta, A., Presto, J., Wilen, M., Gronning, M., Kjellen, L., and Kusche-Gullberg, M. (2007) Contribution of EXT1, EXT2, and EXTL3 to heparan sulfate chain elongation, J Biol Chem 282, 32802-32810.
[8] Bornemann, D. J., Duncan, J. E., Staatz, W., Selleck, S., and Warrior, R. (2004) Abrogation of heparan sulfate synthesis in Drosophila disrupts the Wingless, Hedgehog and Decapentaplegic signaling pathways, Development 131, 1927-1938.
[9] Morio, H., Honda, Y., Toyoda, H., Nakajima, M., Kurosawa, H., and Shirasawa, T. (2003) EXT gene family member rib-2 is essential for embryonic development and heparan sulfate biosynthesis in Caenorhabditis elegans, Biochem Biophys Res Comm 301, 317-323.
[10] Lee, J. S., von der Hardt, S., Rusch, M. A., Stringer, S. E., Stickney, H. L., Talbot, W. S., Geisler, R., Nusslein-Volhard, C., Selleck, S. B., Chien, C. B., and Roehl, H. (2004) Axon sorting in the optic tract requires HSPG synthesis by ext2 (dackel) and extl3 (boxer), Neuron 44, 947-960.
[11] Takahashi, I., Noguchi, N., Nata, K., Yamada, S., Kaneiwa, T., Mizumoto, S., Ikeda, T., Sugihara, K., Asano, M., Yoshikawa, T., Yamauchi, A., Shervani, N. J., Uruno, A., Kato, I., Unno, M., Sugahara, K., Takasawa, S., Okamoto, H., and Sugawara, A. (2009) Important role of heparan sulfate in postnatal islet growth and insulin secretion, Biochem Biophys Res Comm 383, 113-118.
[12] Canals, I., Benetó, N., Cozar, M., Vilageliu, L., and Grinberg, D. (2015) EXTL2 and EXTL3 inhibition with siRNAs as a promising substrate reduction therapy for Sanfilippo C syndrome, Sci Rep 5, 13654.
[13] Zak, B. M., Crawford, B. E., and Esko, J. D. (2002) Hereditary multiple exostoses and heparan sulfate polymerization, Biochimica et Biophysica Acta (BBA) - General Subjects 1573, 346-355.
[14] Van Hul, W., Wuyts, W., Hendrickx, J., Speleman, F., Wauters, J., De Boulle, K., Van Roy, N., Bossuyt, P., and Willems, P. J. (1998) Identification of a Third EXT-like Gene (EXTL3) Belonging to the EXT Gene Family, Genomics 47, 230-237.
[15] Breton, C., Bettler, E., Joziasse, D. H., Geremia, R. A., and Imberty, A. (1998) Sequence-Function Relationships of Prokaryotic and Eukaryotic Galactosyltransferases, J Biochem 123, 1000-1009.


[16] Pedersen, L. C., Dong, J., Taniguchi, F., Kitagawa, H., Krahn, J. M., Pedersen, L. G., Sugahara, K., and Negishi, M. (2003) Crystal structure of an alpha 1,4-N-acetylhexosaminyltransferase (EXTL2), a member of the exostosin gene family involved in heparan sulfate biosynthesis, J Biol Chem 278, 14420-14428.
[17] Kim, B.-T., Kitagawa, H., Tamura, J., Saito, T., Kusche-Gullberg, M., Lindahl, U., and Sugahara, K. (2001) Human tumor suppressor EXT gene family members EXTL1 and EXTL3 encode α1,4-N-acetylglucosaminyltransferases that likely are involved in heparan sulfate/heparin biosynthesis, Proc Natl Acad Sci USA 98, 7176-7181.
[18] Dosztanyi, Z., Csizmok, V., Tompa, P., and Simon, I. (2005) IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content, Bioinformatics 21, 3433-3434.
[19] Linding, R., Jensen, L., Diella, F., Bork, P., Gibson, T., and Russell, R. (2003) Protein disorder prediction: implications for structural proteomics, Structure 11, 1453-1459.
[20] Cheng, J., Sweredoski, M., and Baldi, P. (2005) Accurate Prediction of Protein Disordered Regions by Mining Protein Structure Data, Data Min Knowl Disc 11, 213-222.
[21] Linding, R., Russell, R., Neduva, V., and Gibson, T. (2003) GlobPlot: Exploring protein sequences for globularity and disorder, Nucleic Acids Res 31, 3701-3708.
[22] Marchler-Bauer, A., Derbyshire, M. K., Gonzales, N. R., Lu, S., Chitsaz, F., Geer, L. Y., Geer, R. C., He, J., Gwadz, M., Hurwitz, D. I., Lanczycki, C. J., Lu, F., Marchler, G. H., Song, J. S., Thanki, N., Wang, Z., Yamashita, R. A., Zhang, D., Zheng, C., and Bryant, S. H. (2015) CDD: NCBI's conserved domain database, Nucleic Acids Res 43, D222-D226.
[23] Chivian, D., Kim, D. E., Malmström, L., Bradley, P., Robertson, T., Murphy, P., Strauss, C. E. M., Bonneau, R., Rohl, C. A., and Baker, D. (2003) Automated prediction of CASP-5 structures using the Robetta server, Proteins: Structure, Function, and Bioinformatics 53, 524-533.
[24] Söding, J., Biegert, A., and Lupas, A. N. (2005) The HHpred interactive server for protein homology detection and structure prediction, Nucleic Acids Res 33, W244-W248.
[25] Bryson, K., Cozzetto, D., and Jones, D. T. (2007) Computer-assisted protein domain boundary prediction using the Dom-Pred server, Curr Protein Pept Sc 8, 181-188.
[26] Lombard, V., Golaconda Ramulu, H., Drula, E., Coutinho, P. M., and Henrissat, B. (2014) The carbohydrate-active enzymes database (CAZy) in 2013, Nucleic Acids Res 42, D490-D495.
[27] Yang, J., Yan, R., Roy, A., Xu, D., Poisson, J., and Zhang, Y. (2015) The I-TASSER Suite: protein structure and function prediction, Nature methods 12, 7-8.
[28] Kim, D. E., Chivian, D., and Baker, D. (2004) Protein structure prediction and analysis using the Robetta server, Nucleic Acids Res 32, W526-W531.
[29] Kelley, L. A., Mezulis, S., Yates, C. M., Wass, M. N., and Sternberg, M. J. (2015) The Phyre2 web portal for protein modeling, prediction and analysis, Nat Protoc 10, 845-858.
[30] Svensson, G., Linse, S., and Mani, K. (2009) Chemical and thermal unfolding of glypican-1: protective effect of heparan sulfate against heat-induced irreversible aggregation, Biochemistry 48, 9994-10004.
[31] Shevchenko, A., Tomas, H., Havlis, J., Olsen, J. V., and Mann, M. (2007) In-gel digestion for mass spectrometric characterization of proteins and proteomes, Nat Protoc 1, 2856-2860.
[32] Elbein, A. D. The tunicamycins — useful tools for studies on glycoproteins, Trends Biochem Sci 6, 219-221.


[33] Niesen, F. H., Berglund, H., and Vedadi, M. (2007) The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability, Nat Protoc 2, 2212-2221.
[34] Pernot, P., Round, A., Barrett, R., De Maria Antolinos, A., Gobbo, A., Gordon, E., Huet, J., Kieffer, J., Lentini, M., Mattenet, M., Morawe, C., Mueller-Dieckmann, C., Ohlsson, S., Schmid, W., Surr, J., Theveneau, P., Zerrad, L., and McSweeney, S. (2013) Upgraded ESRF BM29 beamline for SAXS on macromolecules in solution, J Synchrotron Radiat 20, 660-664.
[35] Incardona, M. F., Bourenkov, G. P., Levik, K., Pieritz, R. A., Popov, A. N., and Svensson, O. (2009) EDNA: a framework for plugin-based applications applied to X-ray experiment online data analysis, J Synchrotron Radiat 16, 872-879.
[36] Petoukhov, M. V., Franke, D., Shkumatov, A. V., Tria, G., Kikhney, A. G., Gajda, M., Gorba, C., Mertens, H. D. T., Konarev, P. V., and Svergun, D. I. (2012) New developments in the ATSAS program package for small-angle scattering data analysis, J Appl Crystallogr 45, 342-350.
[37] Konarev, P. V., Volkov, V. V., Sokolova, A. V., Koch, M. H. J., and Svergun, D. I. (2003) PRIMUS: a Windows PC-based system for small-angle scattering data analysis, J Appl Crystallogr 36, 1277-1282.
[38] Svergun, D. I. (1992) Determination of the regularization parameter in indirect-transform methods using perceptual criteria, J Appl Crystallogr 25, 495-503.
[39] Fischer, H., de Oliveira Neto, M., Napolitano, H. B., Polikarpov, I., and Craievich, A. F. (2010) The molecular weight of proteins in solution can be determined from a single SAXS measurement on a relative scale, J Appl Crystallogr 43, 101-109.
[40] Svergun, D. I. (1999) Restoring low resolution structure of biological macromolecules from solution scattering using simulated annealing (vol 76, pg 2879, 1999), Biophys J 77, 2896-2896.
[41] Volkov, V. V., and Svergun, D. I. (2003) Uniqueness of ab initio shape determination in small-angle scattering, J Appl Crystallogr 36, 860-864.
[42] Svergun, D. I., Petoukhov, M. V., and Koch, M. H. J. (2001) Determination of domain structure of proteins from X-ray solution scattering, Biophys J 80, 2946-2953.
[43] Cantarel, B. L., Coutinho, P. M., Rancurel, C., Bernard, T., Lombard, V., and Henrissat, B. (2009) The Carbohydrate-Active EnZymes database (CAZy): an expert resource for Glycogenomics, Nucleic Acids Res 37, D233-238.
[44] Breton, C., Snajdrova, L., Jeanneau, C., Koca, J., and Imberty, A. (2006) Structures and mechanisms of glycosyltransferases, Glycobiology 16, 29R-37R.
[45] Bause, E. (1983) Structural requirements of N-glycosylation of proteins. Studies with proline peptides as conformational probes, Biochem J 209, 331-336.
[46] Chang, V. T., Crispin, M., Aricescu, A. R., Harvey, D. J., Nettleship, J. E., Fennelly, J. A., Yu, C., Boles, K. S., Evans, E. J., Stuart, D. I., Dwek, R. A., Jones, E. Y., Owens, R. J., and Davis, S. J. (2007) Glycoprotein structural genomics: solving the glycosylation problem, Structure 15, 267-273.
[47] Stanley, P., Schachter, H., and Taniguchi, N. (2009) N-Glycans, In Essentials of Glycobiology (Varki, A., Cummings, R. D., Esko, J. D., Freeze, H. H., Stanley, P., Bertozzi, C. R., Hart, G. W., and Etzler, M. E., Eds.) 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor (NY).
[48] Whitmore, L., and Wallace, B. A. (2008) Protein secondary structure analyses from circular dichroism spectroscopy: Methods and reference databases, Biopolymers 89, 392-400.


[49] Brewer, A. K., and Striegel, A. M. (2011) Characterizing the size, shape, and compactness of a polydisperse prolate ellipsoidal particle via quadruple-detector hydrodynamic chromatography, The Analyst 136, 515-519.
[50] Jones, J., Krag, S. S., and Betenbaugh, M. J. (2005) Controlling N-linked glycan site occupancy, Biochimica et Biophysica Acta (BBA) - General Subjects 1726, 121-137.
[51] Mitra, N., Sinha, S., Ramya, T. N. C., and Surolia, A. (2006) N-linked oligosaccharides as outfitters for glycoprotein folding, form and function, Trends Biochem Sci 31, 156-163.
[52] Tannous, A., Pisoni, G. B., Hebert, D. N., and Molinari, M. N-linked sugar-regulated protein folding and quality control in the ER, Seminars in Cell & Developmental Biology.
[53] Hammond, C., and Helenius, A. (1995) Quality control in the secretory pathway, Curr Opin Cell Biol 7, 523-529.


[Figure: schematic of EXTL3 domain organisation. Regions: cytoplasmic domain, transmembrane helix, lumenal region; residue markers 1, 31, 52 and 919. Annotation tracks: GlobDoms domain prediction, Conserved Domain Database search, Ginzu domain prediction, HHPred, Dom-Pred, CAZy database, coiled coil. Annotated domains: Glycosyl-Transferase Family 47 and Glycosyl-Transferase Family 64. Legend: conserved Cys, DXD motif, predicted N-glycosylation site.]


Figure 2. EXTL3 expression and identification. (A) SDS-PAGE analysis of EXTL3∆N proteins. Lanes 1 and 2: non-reduced and reduced glycosylated EXTL3∆N, respectively; lanes 3 and 4: non-reduced and reduced non-glycosylated EXTL3∆N, respectively. (B) A typical band of EXTL3∆N was cut from a Coomassie-stained SDS-PAGE gel and subjected to proteolysis with trypsin followed by LC-MS/MS analysis. Peptides identified by mass spectrometry and subsequent database search are shown as black boxes mapped onto the schematic EXTL3∆N sequence. The other bands produced similar results.


Figure 3. Annotation of the MS/MS pattern of the NFTLTVTDFYR peptide, containing the Asn-592 glycosite, found in the EndoH-deglycosylated EXTL3∆N sample. Collision-induced dissociation of the GlcNAc residue resulted in the appearance of the intact peptide (without the GlcNAc residue) as the most intense fragment, indicated by MH+ in the image. Most fragment ions, i.e. b- and y-ions, correspond to fragmentation of the peptide without the GlcNAc residue, whereas three fragments (encircled) are denoted as b-ions consisting of the N-terminal part of the peptide with the GlcNAc residue attached.
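The fragment assignment described in the Figure 3 caption can be illustrated with a minimal sketch that computes singly charged b- and y-ion m/z values for NFTLTVTDFYR, with and without one GlcNAc (HexNAc, +203.079 Da) on the N-terminal Asn. The function name `fragment_ions` and the use of standard monoisotopic residue masses are assumptions made for illustration; this is not the annotation software used for the figure.

```python
# Standard monoisotopic residue masses (Da) for the residues in NFTLTVTDFYR.
RESIDUE = {
    "N": 114.04293, "F": 147.06841, "T": 101.04768, "L": 113.08406,
    "V": 99.06841, "D": 115.02694, "Y": 163.06333, "R": 156.10111,
}
WATER = 18.010565
PROTON = 1.007276
HEXNAC = 203.07937  # a single GlcNAc residue (assumed left on Asn after Endo H)


def fragment_ions(peptide, mod_index=None, mod_mass=0.0):
    """Return (b_ions, y_ions, precursor) as singly charged m/z values.

    mod_index is the 0-based position of an optional modified residue
    (hypothetical helper argument, not from the original paper).
    """
    masses = [RESIDUE[aa] for aa in peptide]
    if mod_index is not None:
        masses[mod_index] += mod_mass
    n = len(masses)
    # b_i: first i residues plus a proton.
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    # y_i: last i residues plus water plus a proton.
    y_ions = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    precursor = sum(masses) + WATER + PROTON
    return b_ions, y_ions, precursor


PEPTIDE = "NFTLTVTDFYR"
b_plain, y_plain, mh_plain = fragment_ions(PEPTIDE)
b_glc, y_glc, mh_glc = fragment_ions(PEPTIDE, mod_index=0, mod_mass=HEXNAC)

for i, (plain, glc) in enumerate(zip(b_plain, b_glc), start=1):
    print(f"b{i}: {plain:.4f} (no GlcNAc)  {glc:.4f} (with GlcNAc)")
```

Because the only Asn is the first residue, every b-ion in this sketch shifts by the HexNAc mass when the sugar is retained, while y1 to y10 are unchanged. That is consistent with the caption's description of GlcNAc-bearing fragments as N-terminal b-ions.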


Figure 4. Inhibition of N-glycosylation by tunicamycin treatment. The EXTL3∆N-expressing cells were incubated with various concentrations of tunicamycin as described in the experimental procedures, and conditioned medium and cell extracts were probed with an anti-His-tag antibody in a Western blot analysis. (A) A representative Western blot image from two independent experiments is shown, with Ni-NTA-purified EXTL3 included as a control in lane 9. Anti-α-tubulin was used as a loading control. (B) The amount of EXTL3 protein in the Western blot was quantified using the Gel-Pro Analyzer software. The data are protein levels relative to the protein amounts in tunicamycin-untreated conditioned medium and cell extract.
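The normalisation described for panel (B) amounts to dividing each band intensity by the intensity of the untreated sample from the same fraction. A minimal sketch follows, assuming band intensities have already been exported from the densitometry software; the function name `relative_levels` and the numbers in the usage example are placeholders, not values from the paper.

```python
def relative_levels(intensities, untreated_dose=0.0):
    """Express band intensities relative to the untreated sample.

    intensities maps tunicamycin dose to band intensity (arbitrary units)
    for one fraction, e.g. conditioned medium or cell extract.
    """
    reference = intensities[untreated_dose]
    if reference <= 0:
        raise ValueError("untreated reference intensity must be positive")
    return {dose: value / reference for dose, value in intensities.items()}


# Placeholder intensities for illustration only (not measured data).
example_medium = {0.0: 200.0, 1.0: 100.0, 2.0: 50.0}
print(relative_levels(example_medium))
```

Each fraction is normalised against its own untreated lane, so conditioned medium and cell extract would be passed to the function separately.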


[Figure: thermal stability plots, panels (A) and (B). CD spectra recorded at 25 °C, 60 °C, 95 °C and after cooling to 25 °C; axes: [θ] (10³ deg cm² dmol⁻¹) versus wavelength (nm), 190-250 nm. Thermal unfolding curves: fluorescent signals (DSF, Tm = 55 °C) and hydrodynamic radius Rh (nm) (DLS, Tm = 59 °C) versus temperature (°C).]


Figure 6. SEC-SAXS run of EXTL3∆N. (A) The total scattering intensity (violet) and the calculated Rg (green) obtained for each frame of the EXTL3∆N protein eluted from a Superdex-200 column. (B) The scattering images with stable Rg (red box) were merged and radially averaged to produce the final SAXS pattern, shown with the fit for the DAMMIN model built with P1 symmetry (black line). (C) Normalized DLS size distribution plot of the EXTL3∆N dimer with annotated Rh. SAXS-derived Kratky and normalized P(r) plots of the EXTL3∆N dimer are shown in (D) and (E). (F) Ab initio averaged model of the EXTL3∆N dimer derived from the P(r) data, depicted as a surface volume. A detailed gallery of the individual envelopes is shown in Supplementary Figure 1.
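The frame selection described for panels (A) and (B) can be sketched as follows: starting from the frame of maximum total intensity, extend the window in both directions while Rg stays within a tolerance of the peak value, then average the selected frames. This sketch averages already-reduced 1D curves rather than merging detector images as described in the caption, and the function names, the 5% tolerance and the toy numbers are all assumptions for illustration, not the processing pipeline or data from the paper.

```python
def stable_rg_window(total_intensity, rg, tol=0.05):
    """Return (first, last) frame indices around the elution peak where
    Rg stays within a fractional tolerance of its value at the peak."""
    peak = max(range(len(total_intensity)), key=total_intensity.__getitem__)
    reference = rg[peak]
    first = last = peak
    while first > 0 and abs(rg[first - 1] - reference) <= tol * reference:
        first -= 1
    while last < len(rg) - 1 and abs(rg[last + 1] - reference) <= tol * reference:
        last += 1
    return first, last


def average_curves(curves, first, last):
    """Point-by-point mean of the 1D curves for frames first..last inclusive."""
    selected = curves[first:last + 1]
    return [sum(column) / len(selected) for column in zip(*selected)]


# Toy per-frame values for illustration only (not measured data).
toy_intensity = [1.0, 5.0, 9.0, 10.0, 8.0, 4.0, 1.0]
toy_rg = [60.0, 48.0, 45.5, 45.0, 44.8, 41.0, 30.0]
toy_curves = [[float(i), float(i) / 2] for i in range(7)]

first, last = stable_rg_window(toy_intensity, toy_rg)
print(first, last, average_curves(toy_curves, first, last))
```

Anchoring the window at the intensity peak keeps the selection contiguous, which avoids mixing frames from the leading and trailing edges of the elution profile where Rg drifts.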
