Structural, Kinetic, and Mechanistic Analysis of an Asymmetric 4

May 10, 2019 - A 4-oxalocrotonate tautomerase (4-OT) trimer has been isolated from Burkholderia lata and a kinetic, mechanistic, and structural analys...
0 downloads 0 Views 1MB Size
Subscriber access provided by SUNY PLATTSBURGH

Article

Structural, Kinetic, and Mechanistic Analysis of an Asymmetric 4-Oxalocrotonate Tautomerase Trimer Bert-Jan Baas, Brenda P. Medellin, Jake LeVieux, Marieke de Ruijter, Yan Jessie Zhang, Shoshana D. Brown, Eyal Akiva, Patricia Clement Babbitt, and Christian P. Whitman Biochemistry, Just Accepted Manuscript • DOI: 10.1021/acs.biochem.9b00303 • Publication Date (Web): 10 May 2019 Downloaded from http://pubs.acs.org on May 13, 2019

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 43 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Biochemistry

Structural, Kinetic, and Mechanistic Analysis of an Asymmetric 4-Oxalocrotonate Tautomerase Trimer

Bert-Jan Baasa, Brenda P. Medellinb, Jake A. LeVieuxb, Marieke de Ruijtera, Yan Jessie Zhangb,c,*, Shoshana D. Brownd, Eyal Akivad, Patricia C. Babbittd,e,f, and Christian P. Whitmana,c,*

aDivision

of Chemical Biology and Medicinal Chemistry, College of Pharmacy, bDepartment of

Molecular Biosciences, and cInstitute for Cellular and Molecular Biology, University of Texas, Austin, TX 78712, dDepartments of Bioengineering and Therapeutic Sciences, eDepartment of Pharmaceutical Chemistry, and the fQuantitative Biosciences Institute, University of California, San Francisco, 94158

ACS Paragon Plus Environment

Biochemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ABSTRACT A 4-oxalocrotonate tautomerase (4-OT) trimer has been isolated from Burkholderia lata and a kinetic, mechanistic, and structural analysis has been performed. The enzyme is the third described oligomer state for 4-OT along with a homo- and heterohexamer. The 4-OT trimer is part of a small subset of sequences (133 sequences) within the 4-OT subgroup of the tautomerase superfamily (TSF). The TSF has two distinct features: members are composed of a single  unit (homo- and heterohexamer) or two consecutively joined  units (trimer) and generally have a catalytic amino-terminal proline. The enzyme, designated fused 4-OT, functions as a 4OT where the active site groups (Pro-1, Arg-39, Arg-76, Phe-115, Arg-127) mirror those in the canonical 4-OT from Pseudomonas putida mt-2. Inactivation by 2-oxo-3-pentynoate suggests that Pro-1 of fused 4-OT has a low pKa enabling the prolyl nitrogen to function as a general base. A remarkable feature of the fused 4-OT is the absence of P3 rotational symmetry in the structure (1.5 Å resolution). The asymmetric arrangement of the trimer is not due to the fusion of the two  building blocks because an engineered “unfused” variant that breaks the covalent bond between the two units (to generate a heterohexamer) assumes the same asymmetric oligomerization state. It remains unknown how the different active site configurations contribute to the observed overall activities and whether the asymmetry has a biological purpose or role in the evolution of TSF members.

2

ACS Paragon Plus Environment

Page 2 of 43

Page 3 of 43 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Biochemistry

INTRODUCTION The canonical 4-oxalocrotonate tautomerase (4-OT) from Pseudomonas putida mt-2 converts 2-hydroxymuconate (1, 2-HM, Scheme 1) to 2-oxo-3-hexenedioate (2), which is one of the reactions in the bacterial meta-fission pathway for the degradation of aromatic hydrocarbons.1-3

The genes coding the enzymes in this pathway are grouped together on a plasmid (the TOL plasmid pWW0) and the presence of this plasmid allows the host bacteria to use simple alkylsubstituted aromatic hydrocarbons as sole sources of carbon and energy.2,3 A wealth of kinetic, mechanistic, and structural data has been collected for 4-OT, resulting in a reasonably good understanding of the catalytic mechanism as well as the roles of four key players in the mechanism (Pro-1, Arg-11, Arg-39, and Phe-50).4-8 4-OT is a homohexamer where each monomer is made up of 62 amino acids.9,10 A second oligomeric arrangement was described for a heterohexamer 4-OT (hh4-OT) from the thermophilic organism, Chloroflexus aurantiacus J-10-fl.11 The hh4-OT consists of three -dimers where both the - and -subunits have 72 amino acids. The genomic context suggests that the hh4-OT is part of a meta-fission pathway, and it has been shown to carry out

3

ACS Paragon Plus Environment

Biochemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

the reaction shown in Scheme 1.11 The key players in the hh4-OT mechanism (Pro-1, Arg-12, Arg-40, αTrp-51) correspond to those in 4-OT. The canonical 4-OT is the founding member of the tautomerase superfamily (TSF), which has two defining features: members are constructed using a  building block and generally have a catalytic N-terminal proline.12,13 The monomers in both the homo- and heterohexameric 4-OTs display the signature  motif as well as the N-terminal proline. Over the last few decades, the structural and catalytic properties of more than 30 TSF members have been determined. These studies defined the five major subgroups in the TSF, which are designated the 4-OT,12,13 5-(carboxymethyl)-2-hydroxymuconate isomerase,14 (CHMI), macrophage migration inhibitory factor (MIF),15 cis-3-chloroacrylic acid dehalogenase (cis-CaaD),16 and malonate semialdehyde decarboxylase (MSAD) subgroups.17 Members of the 4-OT subgroup are composed of a single  unit, whereas those of the other four subgroups are composed of two consecutively fused  units. The single  units generally make up homo- or heterohexamers and the fused  units result in trimers, each resulting in an overall highly similar quaternary arrangement of  units. This long-standing observation about the TSF was examined in a global context across the superfamily in a sequence similarity network (SSN).18 The analysis confirmed that the pattern generally prevails. Using pairwise sequence identity as a metric in an all-by-all comparison, more than 11,000 protein sequences sorted into the five known subgroups, where the 4-OT subgroup is the largest one (4472 protein sequences).18 The predominance of this structural theme suggests that a gene fusion event took place early in the evolution of the TSF, followed by the diversification of function that is seen today.

4

ACS Paragon Plus Environment

Page 4 of 43

Page 5 of 43 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Biochemistry

In the course of an analysis of the 4-OT subgroup, a subset of sequences double the length of that of the canonical 4-OTs was identified.18 These “fused” 4-OTs form a separate subgroup that still connects to the short 4-OTs. These proposed trimeric 4-OTs represent a third oligomeric structure identified for 4-OT with features similar to possible progenitors for other subgroups in the TSF (the great majority of which are of the double length represented by the fused 4-OTs). In view of the potential significance of the members in this subset, a kinetic, mechanistic, and structural analysis was carried out on a representative of the fused 4-OTs from Burkholderia lata. This analysis shows that the enzyme functions as a 4-OT, the overall structure is trimeric, where each monomer is made up of two consecutively fused  units, and the active site is nearly superimposable on that of the canonical 4-OT. However, one striking property of the fused 4-OT, whose structure was determined at 1.5 Å resolution, is that it lacks a three-fold rotational symmetry, resulting in an asymmetric arrangement of the three active sites. There is no other known TSF member like this, and, to the best of our knowledge, such an observation has not been previously reported. In order to investigate whether “fusing” the two  units had resulted in the asymmetric arrangement, an engineered variant was made where the covalent bond between the two units was broken. This “unfused” 4-OT (1.8 Å resolution) also assumes an asymmetric arrangement. The biological and evolutionary implications of the asymmetry are unknown.

5

ACS Paragon Plus Environment

Biochemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 43

Experimental procedures Materials. All chemicals, biochemical, and enzyme substrates were obtained from sources described

elsewhere.19,20

2-Oxo-3-pentynoate

(2-OP)

was

synthesized

as

reported.21

Chromatographic resins (DEAE Sepharose fast flow and Bio-Gel P60 gel beads) and columns (PD-10 Sephadex G-25M, Econo-Column, and Spectra/Chrom LC) were acquired from sources described elsewhere.19 The Amicon stirred cell concentrators and the ultrafiltration membranes (10,000 Da, MW cutoff) were purchased from EMD Millipore Inc. (Billerica, MA). InstantBlue for staining SDS-PAGE gels was obtained from C.B.S. Scientific Company, Inc. (Del Mar, CA). Bacterial Strains, Plasmids, and Growth Conditions. Escherichia coli strain BL21Gold(DE3) was obtained from Agilent Technologies (Santa Clara, CA). The gene encoding fused 4-OT from Burkholderia lata (ATCC 17760, UniProt accession: Q392K7) (designated “fused 4-OT”) was codon-optimized for expression in E. coli, synthesized, and cloned into the expression vectors pJ411 (pJ411-F4-OT) by DNA2.0, Inc., now ATUM (Newark, CA).19 Cells were grown overnight (~16 h) at 37 °C in Luria−Bertani (LB) media, supplemented with 25 mM Na2HPO4 and 25 mM KH2PO4, buffered to pH ~6.75, 5 mM Na2SO4, 2 mM MgSO4, and kanamycin (Kn, 30 μg/mL). General Methods. Steady-state kinetic assays were performed on an Agilent 8453 diodearray spectrophotometer at 22 °C. Nonlinear regression data analysis was performed using the program Grafit (Erithacus Software Ltd., Staines, U.K.). Protein concentrations were determined by the Waddell method.22 Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDSPAGE) was carried out on denaturing gels containing 12% polyacrylamide.23 The PCR was carried out on a Labnet MultiGene OptiMax Thermal Cycler (Labnet International, Inc., Edison,

6

ACS Paragon Plus Environment

Page 7 of 43 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Biochemistry

NJ). Electrospray ionization mass spectrometry was performed on an LCQ electrospray ion-trap mass spectrometer (Thermo, San Jose, CA), housed in the Institute for Cellular and Molecular Biology (ICMB) Protein and Metabolite Analysis Facility at the University of Texas. 1H NMR spectroscopic experiments to examine possible dehalogenase activities were carried out as described elsewhere.24 Light scattering experiments were carried out as described.25 Construction of the SSN for the Level 2 4-OT Subgroup Containing the Canonical 4-OT. The network was generated using an in-house version of the Pythoscape software,26 tailored for use with available hardware. All-by-all pairwise comparisons of 133 Level 2 4-OT subgroup sequences, as curated in the Structure-Function Linkage Database (SFLD) (Figure 1), were obtained using the BLAST algorithm. The nodes (each representing a sequence) are arranged using Prefuse force directed OpenCL layout provided by the Cytoscape software,27 weighted by BLAST bit score of the associated edges. Edges are drawn between two nodes only if the mean similarity between the two sequences is at least as significant as the bit score of 96 chosen to illustrate similarity relationships for this network. Construction of the P1A, R39A, R76A, R104A, and R127A Mutants of Fused 4-OT. The fused 4-OT mutants were generated by the PCR using the pJ411-F4-OT plasmid as a template. The cycling parameters were 95 oC for 5 min followed by 35 cycles of 98 oC for 30 s, 45 oC for 20 s, and 72 oC for 20 s, with a final elongation step of 72 oC for 10 min. The P1A and R127A mutations were generated by direct amplification of the entire gene using the indicated primers. For the P1A mutant, the forward primer, which contains an NdeI restriction site (underlined) and the desired mutation (in bold), was 5- ATAGCAGGTACATATGGCGACTCTTGAAGTC-3, and the reverse primer was the universal T7 terminator primer. For the R127A mutant, the forward primer was the universal T7 promoter primer. The reverse primer, which contains a

7

ACS Paragon Plus Environment

Biochemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

HindIII restriction site (underlined) and the desired mutation (in bold), was 5GTGATGTTATAAGCTTATCACGCGCCCAGTGCACG-3. The R39A, R76A and R104A mutations were generated by overlap extension PCR28 using the same cycling parameters, and the following primers in combination with the universal T7 promoter and terminator primers. The desired mutations are in bold. For the R39A mutant: 5- C GAG AGC GTC GCG GTG CTG CTG ACC-3 (R39A-F) and 5- GGT CAG CAG CAC CGC GAC GCT CTC G-3 (R39AR). For the R76A mutant: 5-G ATC GCG GGT GCG ACC GAC GAG-3 (R76A-F), and 5CTC GTC GGT CGC ACC CGC GAT C-3 (R76A-R). For the R104A mutant: 5- CAG GCG ACG GCG GTT ATG ATT-3 (R104A-F), and 5-AAT CAT AAC CGC CGT CGC CTG-3 (R104A-R). Each PCR reaction mixture was made up in the Thermo Phusion High-fidelity Master Mix to which was added 100 ng of template DNA, and 50 ng of each primer. After the PCR, the products and pET-24(a) vector were treated with the NdeI and HindIII restriction enzymes. Subsequently, the vector was dephosphorylated with alkaline phosphatase. After purification, each PCR product and vector were ligated for 2 h at room temperature using T4 DNA ligase. An aliquot of the ligation mixture was transformed into E. coli BL21(Gold) DE3 cells. Transformants were selected at 37 oC on LB/Kn plates. Plasmid DNA was isolated from several colonies, and subsequently sequenced to verify that no mutations had been introduced during the amplification of the gene. Construction of the Expression Vector for “Unfused” 4-OT. The gene for the “unfused” 4OT was generated using the PCR protocol described above where the pJ411-F4-OT plasmid was used as the template. The following primers were used to introduce a stop codon after the Leu-65 codon (bold, italics), a ribosomal binding site (bold), and start codon before the Pro-66 codon (bold, italics). The NdeI and HindIII restriction sites are underlined. The N-terminal β-α-β8

ACS Paragon Plus Environment

Page 8 of 43

Page 9 of 43 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Biochemistry

subunit (Pro-1 to Leu-65) of unfused 4-OT was amplified using the primers 5’-A TAG CAG GTA CAT ATG CCA ACT CTT GAA GTC TTT CTG CC-3’ and 5’- CAT AAA TTT TAC CTC CTT TCA CAG GCT CGG CGG TGC-3’. The C-terminal β-α-β-subunit (Pro-66 to Arg127) of unfused 4-OT was amplified using the primers 5’-G TGA AAG GAG GTA AAA TTT ATG CCG GTG ATC GTT GCG-3’ and 5’-G TGA TGT TAT AAG CTT TCA GCG GCC CAG TGC ACG GGC-3’. The resulting PCR products were gel-purified, and the full-length gene for unfused 4-OT was generated by the overlap extension PCR protocol (above) and the following primers as forward and reverse primers, respectively: 5’-A TAG CAG GTA CAT ATG CCA ACT CTT GAA GTC TTT CTG CC-3’ and 5’-G TGA TGT TAT AAG CTT TCA GCG GCC CAG TGC ACG GGC-3’. The resulting full-length gene for unfused 4-OT was cloned into the pJ411 expression vector as described above. Expression and Purification of Wild-type Fused 4-OT, Unfused 4-OT, and Fused 4-OT Mutants. The proteins were all purified based on the protocol described elsewhere for the cisCaaD homologue encoded by PputUW4_01749 in Pseudomonas sp. UW4 (UniProt K9NIA5, designated Ps01740).19 After growth, induction, cell disruption, and centrifugation, the clear supernatant was applied to a DEAE-Sepharose column (~15 mL bed volume), which had previously been equilibrated in buffer A (10 mM Na2HPO4 buffer, pH 7.3). The column was washed with buffer A (3  15 mL), and retained proteins were subsequently eluted with a linear gradient of buffer A made 500 mM in NaCl. Typically, the target proteins eluted around 50 mM NaCl. The fractions from the DEAE-Sepharose column containing the target protein were pooled, and concentrated to ~3 mL using an Amicon concentrator. This sample was subsequently applied to a size-exclusion column (Bio-Gel P60 beads, 500 mL bed volume), which had previously been equilibrated in buffer A, and proteins were eluted with buffer A (0.5 mL per

9

ACS Paragon Plus Environment

Biochemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 43

minute) at 22 oC. The protein-containing fractions (5 mL) were analyzed by SDS-PAGE, and those containing near-homogenous target protein were pooled, and concentrated to ~20 mg/mL using an Amicon concentrator. Aliquots were flash-frozen in liquid nitrogen and stored at -80 oC until further use. Covalent modification of Fused 4-OT and Unfused 4-OT by 2-Oxo-3-pentynoate (2-OP) and Mass Spectral Analysis. Stock solutions of 2-OP (60 mM) were prepared by dissolving the solid compound (4 mg) in 100 mM Na2HPO4 buffer, pH 9.2 (600 L, final pH ~7). To 900 L of a 1 mg/mL solution of protein in 50 mM Na2HPO4 buffer, pH 7.3, was added 100 L of the stock solution of 2-OP. The reaction mixtures were incubated at 22 oC for 90 min (< 1% activity remaining). The 2-OP containing reaction mixture was made 50 mM in NaBH4 (from a 1M stock solution in ultrapure water), and incubated at 22 oC for 3 h. Subsequently, the protein in the reaction mixtures was buffer-exchanged into 10 mM Na2HPO4 buffer (pH 7.3) using a PD-10 column, flash-frozen in liquid nitrogen, and stored at -80 oC until further use. The sample was analysed by ESI-MS according to previously published procedures.29 Steady State Kinetics. The assays were carried out at 22 oC in 10 mM Na2HPO4 buffer (pH 7.3)

using

2-hydroxymuconate

(2-HM),

5-(methyl)-2-hydroxymuconate

(5-Me-2-HM),

phenylenolpyruvate (PP), 2-hydroxy-2,4-pentadienoate (HPD), and 5-(carboxymethyl)-2hydroxymuconate (CHM) as substrates.19,20 Stock solutions of 2-HM (25 mM), 5-Me-2-HM (10 and 25 mM), PP (100 mM), HPD (17 and 50 mM), and CHM (17 and 50 mM) were prepared by dissolving the appropriate amount of the crystalline free acid in absolute ethanol. Substrate concentrations ranged from 13-300 μM for 2-HM, 40-450 μM for 5-Me-2-HM, 26-1030 μM for PP, 170-900 μM for HPD, and 85-770 μM for CHM. Fused 4-OT was found to be unstable at the dilute concentrations used in the assay for >1 h. Therefore, the enzyme was diluted to the assay 10

ACS Paragon Plus Environment

Page 11 of 43 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Biochemistry

concentration (0.0033 μM) directly before use from a 0.016 mg/mL stock solution, at which concentration it is stable for at least 24 h. For 5-Me-2-HM, HPD, and CHM, the enzyme was diluted directly to the assay concentration (0.018 μM, 0.058 μM, and 1.4 μM, respectively) from a ~20 mg/mL stock solution. With the exception of the R104A fused 4-OT, which was assayed the same way as wild-type fused 4-OT, the mutant enzymes were directly diluted to their assay concentrations (0.14-1.0 μM) from a 20 mg/mL stock solution, due to the low-level activity towards 2-HM. For unfused 4-OT, a 0.14 mg/mL or 0.7 mg/mL stock solution was prepared, respectively, to measure kinetics for 2-HM and PP (final enzyme assay concentrations were 0.011 μM and 0.194 μM, respectively). The enol-keto tautomerization of substrates was monitored at wavelengths reported elsewhere using published extinction coefficients with the following exceptions.20 The enol-keto tautomerization of 2-HM was monitored by following the decay of the enol form at 300 nm ( = 22.5  103 M-1 cm-1). The enol-keto tautomerization of PP was monitored by following the decay of the enol-form at 305 nm ( = 2800 M-1 cm-1).20 Initial rates were determined, plotted against the substrate concentration and fitted to the Michaelis-Menten equation using Grafit. Crystallization. Initial crystallization conditions for fused 4-OT and unfused 4-OT were identified using sparse-matrix screening with a Phoenix crystallization robotic system (Art Robbins Instruments). After systematic optimization of the hits using the vapor-diffusion sitting drop method, reproducible crystals grew within a few days. Fused 4-OT crystallized in conditions consisting of 0.1 M HEPES buffer at pH 7.5 and 28-33% PEG 8000 at room temperature at a concentration of ~9 mg/mL. Unfused 4-OT crystallized in conditions consisting of 200 mM magnesium acetate and 28% PEG 3550 at room temperature at a concentration of ~26 mg/mL. Crystals were cryoprotected in mother liquor containing 25-30% glycerol.

11

ACS Paragon Plus Environment

Biochemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Data collection, Processing, Structure Determination and Refinement. X-ray data for fused 4-OT were collected at the Advanced Light Source beamline 5.0.3 (ALS, Berkeley, CA) and the diffraction data for the unfused 4-OT were collected at Advanced Photon Source (APS) beamline 23-ID-D. The data sets were indexed, integrated, and scaled using HKL-2000.30 The structures were determined by molecular replacement (MR) using Phaser-MR31 and Autobuild32 from the PHENIX suite of programs. The Cg10062 monomer crystal structure (19.4% sequence identity, PDB entry 3N4G) was used as a search model for the initial phases for the fused 4-OT structure factors. The monomeric alanine truncated version of the canonical 4-OT was used as a search model for the initial estimate of the unfused 4-OT structure factors. Subsequently, using highresolution electron density maps, chain identity was assigned as the α-subunit (unfused 4-OT residues 1-65) and the β-subunit (unfused 4-OT 66-120). Structure refinement was performed using Phenix Refine.33 TLS parameter was included in the refinement of all structures.34 The final structures were evaluated during and after refinement using Molprobity.35 The refinement statistics for the structures are summarized in Table 1 (where F4-OT is fused 4-OT and uF4-OT is unfused 4-OT). All figures were prepared with PyMol (The PyMOL Molecular Graphics System, Version 1.8 Schrödinger, LLC.).36

12

ACS Paragon Plus Environment

Page 12 of 43

Page 13 of 43 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Biochemistry

13

ACS Paragon Plus Environment

Biochemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Results and Discussion Discovery of the Fused 4-OTs. The structure-function relationships of more than 11,000 members in the TSF were visualized in a sequence similarity network (SSN).18 In the course of the analysis of a similarity path connecting the 4-OT and cis-CaaD subgroups and the characterization of two linking sequences, a group of fused 4-OTs was identified within the 4OT subgroup, in that sequences in this subset are double that of the length of the short 4-OTs (~125 vs. 62 amino acids, respectively). Two representative sequences from this fused 4-OT subset were chosen and a preliminary characterization was reported.18 One sequence is from Burkholderia lata (strain ATCC 17760, UniProt accession: Q392K7) and the encoded protein is designated “fused 4-OT”. The other sequence is from Pusillimonas sp. (strain T7-7) (PT7_0534, UniProt accession F4GMX9) and the encoded protein is designated “Linker 2”.18 Both proteins are trimers, where each monomer is made up of two consecutively fused  units.18 The diversity of the 4-OT subgroup (4472 non-redundant sequences) in the previously reported SSN necessitated a further subgrouping of the main Level 1 subgroup to result in several Level 2 subgroups.18 An update of one of the Level 2 4-OT subgroups (the one that includes the canonical 4-OT) is reported herein (Figure S1) where an additional 776 sequences were added (to give a total of 1995 sequences). When the same network is color-coded by sequence length (less than 100 amino acids and more than 100 amino acids), three distinct clusters can be seen (Figure S2). A majority of sequences (133 sequences) falls into two of the clusters, identified by the “fused 4-OT” or “Linker 2” nodes (Figure 1). The third cluster (59 sequences) will be investigated at a later time. With a few exceptions (which might be due to a misannotated start codon), the remaining sequences are short ones (Figure S2).

14

ACS Paragon Plus Environment

Page 14 of 43

Page 15 of 43 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Biochemistry

Figure 1. Sequence similarity network showing the fused 4-OTs in the 4-OT subgroup. The SNN is part of the Level 2 4-OT subgroup containing the canonical 4-OT (Figures S1 and S2). The SSN represents 133 sequences with characteristics shown in Figure 2. Nodes representing the characterized sequences for “fused 4-OT” and “Linker 2” are colored and labeled. Edges are drawn between two nodes only if the mean similarity between the two sequences is at least as significant as the bit score of 96 chosen to illustrate similarity relationships for this network. Sequence Analysis. The alignment of the fused 4-OT and canonical 4-OT sequences

15

ACS Paragon Plus Environment

Biochemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

identified conserved and potential active site groups. As noted previously, the canonical 4-OT aligns best with the C-terminal half of fused 4-OT.18 Notably, fused 4-OT has Arg-76 (equivalent to Arg-11 in 4-OT), Arg-104 (equivalent to Arg-39 in 4-OT), Phe-115 (equivalent to Phe-50 in 4-OT), the GIGG region from 116-119 (equivalent to G51-G54 in 4-OT), and Arg-127 (equivalent to Arg-62 in 4-OT). In addition, fused 4-OT has the N-terminal proline (Pro-1) and Arg-39. The most significant observation is that Pro-66 (in the fused 4-OT) aligns with Pro-1 of the canonical 4-OT. Pro-66 is rarely conserved in TSF members and the positional conservation might reflect a gene fusion of two separate “short” 4-OTs and little divergence since the fusion. A multiple sequence alignment (MSA) of the 86 sequences in the fused 4-OT cluster and the 47 sequences in the Linker 2 cluster is visualized in two sequence logos (Figures 2A and 2B, respectively). The sequences in the fused 4-OT cluster (Figure 2A) are characterized by 4-OTlike sections joined in the middle by a linker-region, which results in a gap in the MSA. Although not identical in sequence, both 4-OT-like sections share similar conserved features (red asterisks). The N-terminal section shows the conserved N-terminal Pro characteristic of the TSF, a positively charged residue at positions 11 (His) and 39 (Arg), and a non-polar residue at position 50 followed by a GxGG-motif. This is similar to the core catalytic machinery, and the beta-hairpin motif important for the formation of the functional homohexamer structure, of the canonical 4-OT.9,13 The C-terminal section shows almost the same features, where two Argresidues (logo-positions 82 and 110), and a nonpolar residue (position 121) followed by a GxGG-motif, are conserved at equivalent positions relative to the N-terminal section. Worth noting, the “fused” 4-OTs have a conserved Pro-residue, 10 positions before the first conserved Arg-residue (at logo-position 72, red arrow). This would exactly have been the position of an Nterminal Pro, had this C-terminal section not been fused to the N-terminal section.

16

ACS Paragon Plus Environment

Page 16 of 43

Page 17 of 43 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Biochemistry

The sequences in the Linker 2 cluster (Figure 2B) have many of the same conserved features (Pro-1, His-11, Arg-39, a hydrophobic residue, Ile, at position 50, Arg-71, Arg-99 and a hydrophobic residue, Met, at position 110) as indicated by the red asterisks. In both N-terminal and C-terminal section, the GxGG motif is less prominent. There is no gap in the MSA and no obvious proline that might represent the start of the second 4-OT-like sequence. Linker 2 has diverged from the fused 4-OT.18

Figure 2. Sequence logos for the fused 4-OT and Linker 2 clusters. A) The sequence logo for the fused 4-OT cluster. The sequences of the “fused” 4-OTs (86 total) are characterized by two 4-OT-like sections joined in the middle by a linker-region, which results in a gap in the multiple sequence alignments (MSA). The red asterisks indicate the conserved features (described in the text) and the red arrow indicates a conserved proline residue (red arrow logoposition 72) that is 10 positions before the arginine at logo-position 82. This could represent the position of an N-terminal proline before fusion of two short 4-OTs in many of the sequences. B) The sequence logo for the Linker 2 cluster. The sequences show the conserved features

17

ACS Paragon Plus Environment

Biochemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(described in the text), but the positioning is not as striking. Moreover, there is not an obvious conservation of proline as the start of the C-terminal section (red arrow logo-position 61). The logos are graphical representations of multiple sequence alignments (MSAs) generated using WebLogo.37

Purification and Characterization of the Fused 4-OT. The fused 4-OT was overproduced in E. coli BL21(DE3) and purified to near homogeneity. The typical yield was ~100 mg of fused 4OT per liter of culture. ESI-MS analysis of the purified fused 4-OT showed a single major signal corresponding to a monoisotopic mass of 13,030 Da (calc. 13,161 Da) in the reconstructed mass spectrum. The mass difference of 131 Da between the theoretical and the observed mass indicates that the translation-initiating methionine was post-translationally removed, resulting in a mature protein (127 amino acids for the fused 4-OT) with an N-terminal proline.38 DLS analysis of the fused 4-OT showed a single peak (97.4% mass fraction) with a molar mass of 39,640 Da. This species corresponds best with the trimeric form of fused 4-OT (calculated mass of 39,090 Da). Kinetic Properties of the Fused 4-OT. The protein was screened for activity with known substrates for TSF members (Table 2). Of these substrates, the highest kcat/Km value is measured for 2-HM with lower values being measured for 5-methyl-2-HM, PP, and HPD (5.7-fold to 21.5fold lower). The lowest value is measured for CHM (1217-fold lower). The enzyme could not be saturated with CHM. It appears that the bulkier substituent at C5 (and/or the charge) substantially affects binding and activity.

18

ACS Paragon Plus Environment

Page 18 of 43

Page 19 of 43 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Biochemistry

The enzyme was also tested for its ability to process the cis- and trans-isomers of 3chloroacrylic acid.24 After 7 days, no product could be detected (by 1H NMR spectroscopy) when incubated with the cis-isomer. However, a trace amount of product (