
Article

An atypical α/β hydrolase fold revealed in the crystal structure of pimeloyl-acyl carrier protein methyl esterase BioG from Haemophilus influenzae

Jie Shi, Xinyun Cao, Yaozong Chen, John E. Cronan, and Zhihong Guo

Biochemistry, Just Accepted Manuscript • DOI: 10.1021/acs.biochem.6b00818 • Publication Date (Web): 07 Nov 2016

Downloaded from http://pubs.acs.org on November 8, 2016

Biochemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Biochemistry

An atypical α/β hydrolase fold revealed in the crystal structure of pimeloyl-acyl carrier protein methyl esterase BioG from Haemophilus influenzae

Jie Shi,†,Φ Xinyun Cao,‡,Φ Yaozong Chen,† John E. Cronan,*,‡,Ψ and Zhihong Guo*,†

† Department of Chemistry and State Key Lab for Molecular Neuroscience, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR, China.

Departments of MicrobiologyΨ and Biochemistry‡, University of Illinois, Urbana, Illinois 61801

Φ These authors contributed equally to this work

* To whom correspondence should be addressed: Zhihong Guo, Department of Chemistry and State Key Lab for Molecular Neuroscience, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong SAR, China. Tel: +852-2358 7352; Fax: +852-2356 1594; E-mail: [email protected]. Or John E. Cronan, Department of Microbiology, The School of Molecular and Cellular Biology, B103 Chemical and Life Sciences Laboratory, University of Illinois at Urbana-Champaign, 601 South Goodwin Avenue, Urbana, IL 61801, USA. Tel: +1-(217) 333-7919; Fax: +1-(217) 244-6697; E-mail: [email protected].

Funding Sources: This work was supported by GRF601413 and N_HKUST621/13 from the Research Grants Council of the Government of the Hong Kong Special Administrative Region (to ZG) and by NIH grant AI15650 (to JEC).

ABBREVIATIONS

ACP, acyl carrier protein; LB, Luria broth; IPTG, isopropyl-β-D-thiogalactopyranoside; MOPS, 3-(N-morpholino)propanesulfonic acid; TCEP, tris(2-carboxyethyl)phosphine; PEG, polyethylene glycol; SDS-PAGE, sodium dodecyl sulfate-polyacrylamide gel electrophoresis; PDB, Protein Data Bank; SAD, single-wavelength anomalous dispersion; CD, circular dichroism; DTT, 1,4-dithiothreitol; OD, optical density; rmsd, root-mean-square deviation.

ABSTRACT

Pimeloyl-acyl carrier protein (ACP) methyl esterase is an α/β-hydrolase that catalyzes the last biosynthetic step of pimeloyl-ACP, a key intermediate in biotin biosynthesis. Intriguingly, multiple non-homologous isofunctional forms of this enzyme that lack significant sequence identity are present in diverse bacteria. One such esterase, Escherichia coli BioH, has been shown to be a typical α/β-hydrolase fold enzyme. To gain further insights into the role of this step in biotin biosynthesis, we have solved the crystal structure of another widely distributed pimeloyl-ACP methyl esterase, H. influenzae BioG at 1.26 Å. The BioG structure is similar to the BioH structure and is composed of an α-helical lid domain and a core domain that contains a central seven-stranded β-pleated sheet. However, four of the six α-helices that flank both sides of the BioH core β-sheet are replaced with long loops in BioG, thus forming an unusual α/β-hydrolase fold. This structural variation results in a significantly decreased thermal stability of the enzyme. Nevertheless, the lid domain and the residues at the lid-core interface are well conserved between BioH and BioG, in which an analogous hydrophobic pocket for pimelate binding as well as similar ionic interactions with the ACP moiety are retained. Biochemical characterization of site-directed mutants of the residues hypothesized to interact with the ACP moiety supports a similar substrate interaction mode for the two enzymes. Consequently, these enzymes package the identical catalytic function under a considerably different protein surface.

Biotin, also known as vitamin H or vitamin B7, is an essential prosthetic cofactor that serves as the carboxyl group carrier in enzymatic carboxylation, decarboxylation and transcarboxylation reactions in all three domains of life (1-3). It is synthesized de novo in microorganisms and plants from the seven-carbon α,ω-dicarboxylate intermediate, pimelate, which is esterified with either CoA (pimeloyl-CoA) or acyl carrier protein (pimeloyl-ACP) (4). Conversion of this common pimeloyl thioester precursor to biotin involves four enzymes, BioF, BioA, BioD and BioB (Figure 1A), that are well conserved and have been extensively investigated for their structures and catalytic mechanisms (4, 5). Not surprisingly, biotin and its biosynthetic enzymes are essential for survival and indispensable for efficient virulence of many microbial pathogens such as Mycobacterium tuberculosis and Francisella tularensis (6-8). In contrast, biotin is not synthesized by animals but is acquired from food, implicating enzymes involved in biotin biosynthesis as potential drug targets. Indeed, potent inhibitors have been discovered for biotin biosynthetic enzymes and shown to be promising candidates for development of novel antibacterial agents (9, 10).

However, the pimeloyl thioester precursor has diverse origins in different organisms. Early feeding experiments showed that it is generated from exogenous pimelic acid in some fungi and bacteria such as the diphtheria bacillus and Phycomyces blakesleeanus (11-14), whereas it is synthesized de novo via a modified fatty acid synthetic pathway in many other biotin-producing microorganisms. Recently, the methyltransferase BioC was shown to hijack the bacterial type II fatty acid synthesis pathway to assemble the pimelate moiety (Figure 1A) (15, 16). BioC methylates the free carboxyl of malonyl-ACP to form an ester that is recognized as a substrate by the enzymes of type II fatty acid biosynthesis (17) and is elongated for two cycles with addition of four carbon atoms to the malonyl methyl ester moiety. After acyl chain elongation, the terminal methyl ester in the ACP-bound intermediate is hydrolyzed by an α/β-hydrolase fold esterase (e.g. E. coli BioH) to block chain elongation and release pimeloyl-ACP (15). To avoid excessive methylation of malonyl-ACP by BioC, which disrupts fatty acid biosynthesis (16), bioC is clustered with other biotin biosynthetic genes in a tightly controlled and widely distributed bio operon (18).

Intriguingly, the pimeloyl-ACP methyl esterases that decouple biotin precursor biosynthesis from fatty acid biosynthesis fall into several phylogenetically distinct subclades of the α/β-hydrolase fold superfamily in BioC-containing bacteria (18). Thus far, five groups of the esterases have been discovered, including the most prevalent variants represented by E. coli BioH and H. influenzae BioG, which are distributed widely among diverse bacteria. In contrast, some bacteria have organism-specific enzymes such as BioK in cyanobacteria, BioJ in Francisella species and BioV in Helicobacter species (19-21). All these esterases have been demonstrated to cleave pimeloyl-ACP methyl ester both by in vitro assay and by rescue of biotin synthesis in ∆bioH E. coli mutants. Nonetheless, the esterases from different groups share almost no sequence identity and their genes are not always subjected to the same transcriptional regulation as bioC (19).

E. coli BioH is the best studied pimeloyl-ACP methyl esterase; it has a typical α/β-hydrolase fold structure and shows promiscuity in that it hydrolyzes some non-ACP substrates (22, 23). The crystal structure of a binary complex of the inactive BioH S82A mutant and pimeloyl-ACP methyl ester has been determined at high resolution (PDB accession code 4ETW) to provide the structural basis for substrate recognition and catalysis (24). Arginine residues on the BioH helical lid domain bind the α2-helix of ACP through ionic interactions. A long, narrow channel allows access of the hydrophobic pimeloyl methyl ester linked to the phosphopantetheine group of the ACP to the BioH active site. The substrate terminal methyl ester is in close proximity to the Ser-His-Asp triad for hydrolysis. These studies demonstrate that the pimeloyl-ACP methyl esterase is the gatekeeping enzyme in pimeloyl-ACP biosynthesis that prevents further elongation by the fatty acid biosynthetic cycle (24).

It is not clear why multiple non-homologous isofunctional pimeloyl-ACP methyl esterases have appeared in the BioC-dependent pimeloyl-ACP biosynthetic pathway when BioH is sufficient to carry out the gatekeeping function. To better understand the biological role of these esterases, we determined the X-ray crystal structure of BioG from Haemophilus influenzae (Figure 1B) and compared this structure to that of E. coli BioH. In addition, we mutated the putative BioG-ACP interaction residues and determined the effects of these mutations on the catalytic activity.

EXPERIMENTAL PROCEDURES

Molecular cloning. The bioH gene was amplified from E. coli genomic DNA using primers BioH-F and BioH-R (Table 1) and ligated into pET-28a(+) (Novagen) between the NdeI and XhoI restriction sites for expression of the protein with an N-terminal hexahistidine tag. Plasmid pMAD23 encoding the H. influenzae bioG with a C-terminal hexahistidine tag (19) was the source of BioG. Site-directed mutagenesis was performed using plasmid pMAD23 as the PCR template with the primers given in Table 1 and the QuikChange mutagenesis kit (Stratagene). The residual template plasmid was removed by treatment with DpnI at 37 °C. The constructed plasmids pXC031-pXC037, each carrying one, two or three mutations within the BioG coding sequence (Table 2), were transformed into E. coli DH5α and their sequences were verified by DNA sequencing (ATGC, Inc.).

Protein purification. Overexpression and purification of BioH (E. coli BioH unless stated otherwise) and native BioG (H. influenzae BioG unless stated otherwise) for crystallization followed procedures described previously with minor modifications (20, 23). Expression of the selenomethionine (SeMet) BioG derivative was first attempted using the methionine-auxotroph E. coli B834 strain without success. The plasmid pMAD23 was then transformed into E. coli strain BL21(DE3) and the recombinant cells were grown in Luria broth (LB) containing 50 µg/mL kanamycin at 37 °C for expression of SeMet-BioG. After the OD600 reached 0.8, cells were gently harvested, washed and suspended in M9 minimal medium supplemented with 100 mg/L lysine, 100 mg/L threonine, 50 mg/L leucine, 50 mg/L isoleucine, 4 g/L glucose, 2 mM MgCl2, 0.1 mM CaCl2 and 50 mg/L selenomethionine. After another hour of growth, 0.2 mM IPTG was added to the culture to induce expression for 4 h. Like the unlabeled BioG, SeMet-BioG was purified to higher than 95% purity as indicated by SDS-PAGE and stored at -20 °C in 20 mM Tris-HCl (pH 8.0), 1 mM DTT and 10% glycerol. The success of selenomethionine substitution was confirmed by mass spectrometry. Protein concentrations were determined using a Coomassie Blue protein assay kit (Pierce).

For the enzymatic studies, strain BL21(DE3) carrying pMAD23 or pXC031-pXC037 encoding wild-type BioG or a mutant BioG (Table 2) was grown to an OD600 of 0.8 in LB-kanamycin medium at 37 °C followed by induction for 4 h by addition of 1 mM IPTG. The cells of a 500 mL culture of each strain were collected. All protein purifications and manipulations were performed at 4 °C or on ice. The cell pellets were resuspended in lysis buffer containing 20 mM MOPS (pH 8.0), 500 mM NaCl, and 10% glycerol, and lysed by multiple passages through a French Press. The soluble cell extract was collected and mixed with Ni-NTA resin (Qiagen) for 2 h. The resin was then loaded into a column and washed twice with 40 mM lysis buffer containing 30 mM imidazole. The column was eluted with 250 mM imidazole and protein fractions were collected. Protein purification was monitored by SDS-PAGE. The concentrated protein solutions were dialyzed overnight in dialysis buffer containing 25 mM MOPS, 10% glycerol, 1 mM TCEP and 0.2 M NaCl (pH 7.5) followed by flash freezing and storage at -80 °C.

Crystallization and data collection. Initial crystal screening was performed independently for native BioG and its selenomethionine derivative with several screening kits (Hampton Research) using the sitting drop vapor diffusion method in 96-well plates at 16 °C. Lead conditions were optimized by the hanging drop vapor diffusion method and native BioG single crystals were successfully grown in a protein solution at 16 mg/mL containing 0.2 M Mg(CH3COO)2, 0.1 M sodium cacodylate pH 6.5 and 25% PEG 8000, while the selenomethionine derivative crystals were harvested from a protein solution at 18 mg/mL containing 10% (v/v) 2-propanol, 0.1 M citric acid/sodium citrate (pH 5.6) and 22% PEG 4000. Both proteins crystallized within 4 days to a size larger than 100 × 50 × 50 µm. Single crystals were then mounted, cryo-protected with reservoir solution containing an additional 20% glycerol, flash-frozen in liquid N2 and exposed to 12.662 keV X-rays at the Shanghai Synchrotron Radiation Facility or the National Center for Protein Science Shanghai. The diffraction data were collected using the rotation method with an oscillation of 1 degree per frame. A total of 360 frames were collected for the selenomethionine-derived crystal and 180 frames were collected for the native crystal. Diffraction images were recorded with an ADSC Quantum 315R charge-coupled device detector and then indexed, integrated, and scaled in space group P2₁ using HKL2000 (25).

Structure determination and refinement. Experimental phases were determined from the anomalous data by AutoSol using the SAD method (26) and six Se atoms were unambiguously positioned in the asymmetric unit, suggesting the presence of two BioG molecules. An initial structural model was built by AutoBuild, resulting in construction of 408 out of 430 possible residues (27). After preliminary refinement using Coot and Phenix (28, 29), the model was employed to solve the native BioG structure using molecular replacement by Phaser (30). Further refinement was performed using Phenix with the merohedral twin law (-h, k, l) applied, as suggested by Xtriage (31), combined with manual building in Coot (28, 29). Diffraction and refinement statistics are summarized in Table 3.

Enzyme activity assays. E. coli ACP was expressed and purified as previously described (32). Assays of the BioG and BioG mutant proteins were performed using the protocol established for E. coli BioH with modifications (24). Each reaction contained 50 mM Tris-HCl (pH 7.0), 5% glycerol, 150 µM pimeloyl-ACP methyl ester and 5 nM BioG. The reaction was monitored at seven different time points over 20 min using the following protocol. A premix of buffer and pimeloyl-ACP methyl ester was incubated at 37 °C for 1 min without BioG. Each reaction was initiated by adding BioG. The reactions were stopped at different time points by adding an equal volume of 10 M urea and placed on dry ice. For analysis, the reaction mixtures were loaded onto 20% PAGE gels containing 2.5 M urea and run at 130 V for 2.5 h. Pimeloyl-ACP methyl ester was synthesized using the previously described protocol (24).

In vivo complementation assays. Strain STL243 (MG1655 ∆bioH::FRT ∆pcnB::cat, Table 1), described previously, was used for complementation. The strain was transformed with plasmid pET28b carrying either the wild-type bioG gene or a mutant gene. To prevent carryover of biotin, all plasmid-carrying strains were grown overnight at 30 °C on M9 medium plates containing 0.8% (wt/vol) glycerol and a minimal concentration of biotin (1.6 nM). The following day the cells were restreaked onto M9 medium plates of the same composition with or without biotin supplementation.

Circular dichroism spectroscopy. Prior to measurement, BioG and BioH were diluted to 0.5 mg/mL in 20 mM phosphate (pH 7.0) buffer containing 100 mM NaCl and transferred into a quartz cuvette with a 1 mm path length. The circular dichroism (CD) spectra were recorded on a Chirascan Spectrometer (Applied Photophysics) between 200 and 260 nm at temperatures ranging from 20 °C to 75 °C. Samples were incubated for 30 min at each temperature to achieve equilibrium.

Structural analysis and sequence alignment. PyMOL was used for structural analysis and graphic generation (33). Multimeric state and interfaces were calculated by PISA (34). The structure-based sequence alignment was constructed by PROMALS3D and the alignment figure was generated by ESPript 3.0 (35, 36). The electrostatic surface was analyzed by APBS and visualized in PyMOL (33, 37).
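
Several of the numbers quoted in the procedures above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the authors' workflow: it reads the stated X-ray energy as 12,662 eV (12.662 keV, close to the selenium K absorption edge) and converts it to a wavelength, computes the total crystal rotation from the frame counts, the fraction of residues built automatically, the number of Se sites per molecule implied by six sites in two molecules, and the substrate-to-enzyme molar ratio in the activity assay. All function and variable names are illustrative assumptions.

```python
# Illustrative cross-checks of values quoted in the Experimental Procedures.
# Function names are hypothetical; only the input numbers come from the text.

HC_KEV_ANGSTROM = 12.398419843  # Planck constant x speed of light, in keV*Angstrom


def wavelength_angstrom(energy_kev):
    """Convert an X-ray photon energy (keV) to a wavelength (Angstrom)."""
    return HC_KEV_ANGSTROM / energy_kev


def total_rotation_deg(n_frames, oscillation_deg):
    """Total crystal rotation covered by a rotation-method data set."""
    return n_frames * oscillation_deg


def substrate_per_enzyme(substrate_molar, enzyme_molar):
    """Molar ratio of substrate to enzyme in a reaction."""
    return substrate_molar / enzyme_molar


wavelength = wavelength_angstrom(12.662)      # energy read as 12.662 keV
semet_sweep = total_rotation_deg(360, 1.0)    # SeMet crystal: 360 frames x 1 degree
native_sweep = total_rotation_deg(180, 1.0)   # native crystal: 180 frames x 1 degree
built_fraction = 408 / 430                    # residues built / possible residues
se_sites_per_molecule = 6 / 2                 # six Se sites shared by two molecules
ratio = substrate_per_enzyme(150e-6, 5e-9)    # 150 uM substrate, 5 nM BioG

print(f"wavelength: {wavelength:.4f} Angstrom")
print(f"sweeps: SeMet {semet_sweep:.0f} deg, native {native_sweep:.0f} deg")
print(f"model completeness after autobuilding: {built_fraction:.1%}")
print(f"Se sites per molecule: {se_sites_per_molecule:.0f}")
print(f"substrate molecules per enzyme molecule: {ratio:.0f}")
```

The large substrate excess means each enzyme molecule must turn over many times for the reaction to approach completion, which is why a time course is informative under these conditions.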

RESULTS

Structure of H. influenzae BioG. The crystal structure of native BioG was determined using the model derived from the selenomethionine derivative because neither BioH nor its backbone structure could be successfully used as a search model for molecular replacement. This is not surprising because alignment of BioG and BioH shows only 19% identity even when several large gaps are allowed. Although native BioG and selenomethionine derivative crystals were obtained from two independent conditions, unexpected twinning was observed only in native crystals. The structure was finally refined to a resolution of 1.26 Å using the twin law (-h, k, l). In one asymmetric unit, two nearly identical monomers are arranged with an interface (Figure 2A) inadequate for dimer formation according to the PISA calculation, consistent with the monomer elution profile from size exclusion chromatography (19). Both chains lack the C-terminal residue and the hexahistidine tag due to poor electron density. As a member of the α/β-hydrolase family, BioG contains a core domain consisting of a seven-stranded β-sheet (β1 to β7) and a lid domain consisting of four α-helices (α1 to α4), similar to BioH (Figure 2A). The two proteins are superimposable with an rmsd of 3.727 Å over the Cα atoms of 130 residues (Figure 3A). A typical catalytic triad consisting of S65, H200, and D175 is located at the interface of the core and lid (Figure 4B). The lid domain of BioG extends from Y101 to Q157 and thus is smaller than the 65-residue BioH lid (K121 to T185), with shorter helices α2 and α3 (Figure 3A and Figure 5). The spatial arrangement of the four α-helices in BioG is also slightly altered in comparison to BioH, with its helix α4 extending significantly more outwards (Figure 3A). The main differences between BioG and BioH lie in the core domain. Although the central seven-stranded β-sheets are nearly identical, dramatic differences are found in the layers flanking both sides of the central β-sheet. In BioG, one of the flanking layers (annotated as the α layer) consists of only two normal α-helices (αA and αB) and long loops, while the other (annotated as the 3₁₀ layer) comprises merely loops containing three short 3₁₀-helices (η1 to η3) (Figures 2B and 3B). Altogether, four of the six α-helices flanking the central β-sheet in the canonical α/β-hydrolase fold of BioH are replaced by long loops with or without a 3₁₀-helix in BioG, resulting in an atypical α/β-hydrolase fold for the latter. Because the helix-replacing loops in BioG are significantly shorter than the corresponding helices in the canonical α/β-hydrolase fold of BioH, BioG proteins are on average 19% shorter than the BioH homologues. Alignment of the primary sequence of H. influenzae BioG with its orthologues shows little conservation in the flanking loops (Figure 5), suggesting high variability of these regions.

Structural modeling of BioG-substrate interactions. To understand how BioG interacts with its substrate, pimeloyl-ACP methyl ester was manually modeled into the BioG active site by superimposing BioG with the structure of the complex of the substrate with BioH S82A. As shown in Figure 4, the substrate interacts similarly with BioG and BioH, with most of the contact amino acid residues conserved as listed in Table 4. BioG contains three positively charged residues (K118, K127, and R132) at positions corresponding to three of the four positively charged residues on the surface of BioH (Figure 4C) that were previously shown to form salt bridges with the α2-helix of ACP and allow docking of the ACP moiety (24). Since loss of a single positively charged BioH residue did not impair interaction with ACP (24), the three identified positively charged residues in BioG (Figure 4C) seemed likely to form similar salt bridges with the ACP α2-helix upon substrate docking at the lid-core interface, although the side chain orientations differ from those of BioH. More importantly, the hydrophobic pocket for binding and positioning of the pimeloyl moiety at the active site is also conserved between BioG and BioH. It is composed of a large group of hydrophobic amino acid residues as shown in Figure 4D and Table 4, most of which are well conserved among BioG and BioH orthologues (Figure 5). According to the structure-based sequence alignment (Figure 5), M66 rather than L151 of helix α4 of BioG takes the position of the BioH L181 side chain in forming the hydrophobic pocket (Figure 4D). In addition, the side chains of F111 of BioH and its corresponding BioG residue F178 point in opposite directions without significantly changing the shape of the substrate binding pocket (Figure 4D). Moreover, one hydrophobic residue, F128, which interacts with and also positions the thioester function of the substrate in BioH (Figure 4D), is not found in BioG. This lack of hydrophobic interaction at the thioester function is compensated in BioG by the polar interaction of the thioester carbonyl group with the side chain of T107 (Figure 4D). Thus, although minor differences are present at the active sites of the two enzymes, they do not appear to alter the catalytic mechanism. Indeed, as found in BioH, the terminal methyl ester is located very close to and well positioned for nucleophilic attack by S65 of the catalytic triad, with a short distance of 2.6 Å between the carbon atom of the carbonyl group and Oγ of S65 (Figure 4B). The structural modeling and comparison indicate that BioG and BioH possess equivalent active sites and thus share the same catalytic mechanism, consistent with their previously reported similar activities (19).
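
Two geometric quantities are quoted in the structural comparison above: an interatomic distance (carbonyl carbon to Ser Oγ) and a Cα rmsd between superposed structures. The sketch below shows how each is computed from Cartesian coordinates. It is illustrative only: the coordinates are made up (they are not taken from the BioG or BioH models), the function names are hypothetical, and the rmsd function assumes the two coordinate sets have already been superposed and paired atom by atom, so it does not perform the superposition itself.

```python
import math


def distance(atom_a, atom_b):
    """Euclidean distance between two (x, y, z) positions in Angstrom."""
    return math.dist(atom_a, atom_b)


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation over paired, already superposed atoms."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Made-up positions chosen only to give a round number; NOT model coordinates.
ser_og = (0.0, 0.0, 0.0)      # hypothetical Ser O-gamma position
carbonyl_c = (1.0, 2.4, 0.0)  # hypothetical ester carbonyl carbon position
attack_distance = distance(ser_og, carbonyl_c)

# Made-up C-alpha traces: the second is the first shifted by (3, 4, 0).
trace_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
trace_b = [(x + 3.0, y + 4.0, z) for x, y, z in trace_a]
trace_rmsd = rmsd(trace_a, trace_b)

print(f"distance: {attack_distance:.2f} Angstrom")
print(f"rmsd: {trace_rmsd:.2f} Angstrom")
```

In practice the pairing of residues comes from a structural alignment, so an rmsd is only meaningful together with the number of atom pairs it was computed over, as reported in the text.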

Mutation of BioG surface residues results in decreased esterase activity in vitro. The BioG crystal structure (Figure 4C) suggested that there are three basic residues (K118, K127 and R132) at the surface that could interact with the acidic α2 helix of the ACP moiety of pimeloyl-ACP. To test this premise these BioG residues were individually replaced with alanine. Genes encoding the double and triple mutant alanine for basic residue mutations were also constructed. To test the enzymatic activity of the wild type and alanine substitution BioG mutant proteins, pimeloyl-ACP methyl ester was synthesized from E. coli ACP and the wild type and mutant BioG proteins were purified. E coli ACP was used rather than H. influenza ACP because the conditions that give the differing gel mobilities for pimeloyl-ACP methyl ester and pimeloyl-ACP used to assay activity were well established (15, 19). The behavior of ACP species on these partially denaturing gels is subtle. Separation must be obtained empirically by varying the urea concentration and can vary markedly with small changes in ACP structure (38). H. influenzae ACP is 83% identical to E. coli ACP and the α2-helices of the two proteins have identical sequences.

Moreover, there is only one

mismatch in the 25 residues that flank the α2-helices (9 residues upstream, 16 residues downstream). Given that the α2-helix is the site of the great bulk of enzyme-ACP interactions (39) (including all of those with BioH), E. coli ACP seemed an acceptable surrogate for the native protein. The pimeloyl-ACP product can be distinguished from the pimeloyl-ACP methyl ester substrate by its slower migration in a conformationally sensitive urea-PAGE gel system (32). In a 20 min assay, the wild type BioG and single substitution mutants (K118A, K127A and R132A) showed similar activities. Production of pimeloyl-ACP by these BioG proteins could be observed in as little as 30 s and the reactions approached completion in about 15 min. In contrast the activities of the double mutant BioGs were significantly reduced. The reactions remained incomplete at 20 min although pimeloyl-ACP hydrolysis could be seen at the 30 s

13 Environment ACS Paragon Plus

Biochemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

time point. The BioG triple mutant protein was virtually inactive; only trace amounts of pimeloyl-ACP appeared in 15 and 20 min incubations (Figure 6A). The gel shift assays indicated that BioG residues K118, K127 and R132 are responsible for almost all of binding affinity of BioG for the ACP moiety of pimeloyl-ACP methyl ester. Note that these assays cannot give Michaels-Menton values because detection of the cleavage reaction requires that an appreciable fraction of the substrate be converted to product. Densitometry is problematical because separation quality and background vary from gel to gel. For these reasons, we have chosen the time course of the reaction as the most appropriate means to compare the activities of the mutant and wild type proteins. Indeed, the BioG proteins that show no or very slow pimeloyl-ACP production grow poorly or fail to grow in the absence of biotin whereas the more active proteins support strong growth (Figure 6B). Loss of the BioG basic residues results in decreased rates of biotin biosynthesis. In order to test the physiological relevance of loss of the BioG basic residues the abilities of these proteins to restore biotin synthesis to an E. coli ∆bioH ∆pcnB strain were assayed. The BioG proteins were expressed in plasmid pET28b, a phage T7 expression vector. However, the host strain lacked the phage polymerase and thus E. coli RNA polymerase transcribed the genes (40) under regulation by the LacI repressor. The ∆pcnB mutation was included to decrease the copy number of the BioG plasmids (pMS421 is unaffected) (41) and hence this host strain allowed sensitive detection of the in vivo activities of the BioG mutant proteins. We assayed the in vivo effects of the mutations by colony formation on minimal medium plates that lacked biotin. 
The strain expressing the protein that lacks all three basic residues failed to form single colonies in the absence of biotin, and the strain expressing one of the doubly mutant proteins, K118A/K127A, showed compromised growth (Figure 6B).

Thermal stabilities of BioG and BioH. Relative to the α-helices of BioH, the helix-replacing loops (with or without a 3₁₀-helix) in BioG form fewer polar and nonpolar interactions with the central β-sheet. This consideration prompted us to compare the stabilities of these two proteins. The energy of interaction between the three structural layers in BioG was calculated by PISA and compared with that in BioH. As shown in Table 5, the area of the interface between the 3₁₀ layer and the rest of BioG is 1190 Å², which is smaller than the corresponding interface area of 1512 Å² in BioH. Consequently, the total stabilization energy for the central β-sheet is -47.1 kcal/mol in BioG, which is smaller in magnitude than the total stabilization energy (-58.7 kcal/mol) for BioH. To test the in silico results, the thermal stabilities of BioG and BioH were determined by measuring thermal unfolding monitored by circular dichroism (CD) spectroscopy. The temperature-dependent CD spectra (Figure 7A) showed an obvious loss of secondary structure with increasing temperature for both BioG and BioH. As both unfolding curves indicated irreversible transitions, a two-state approximation (42) was used to fit the temperature-dependent plot of Θ₂₂₂, which indicates the content of α-helices in the protein solution and is normalized as a measure of the population of the native, folded protein in the unfolding experiments. The fits gave adjusted R² values of 0.997 and 0.995 for BioG and BioH, respectively. The melting point (TM) was determined to be 50.4 ± 0.2 °C for BioG and 58.3 ± 0.1 °C for BioH (Figure 7B), a difference of 7.9 °C, showing that BioG is indeed significantly less stable than BioH. Because the cap domains of BioG and BioH closely resemble each other, the stability difference is attributed to the difference in the interaction between the central β-sheet and the flanking sub-structures, demonstrating that replacement of the BioH β-sheet-stabilizing α-helices with long loops in BioG results in decreased structural stability.
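To make the two-state analysis concrete, the sketch below evaluates the native fraction as a function of temperature using the van't Hoff form of a two-state model. The melting temperatures are the fitted values reported above; the unfolding enthalpy `DELTA_H` is a purely illustrative placeholder that is not reported in this manuscript, and the function name is hypothetical. Because the observed transitions were irreversible, such a curve should be read as an apparent description of the melt rather than a thermodynamic one.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)

# Fitted melting temperatures reported in the text (deg C).
TM_BIOG = 50.4
TM_BIOH = 58.3

# ASSUMPTION: illustrative van't Hoff enthalpy (kcal/mol), not a value from the paper.
DELTA_H = 100.0


def fraction_folded(temp_c, tm_c, delta_h=DELTA_H):
    """Two-state native fraction at temp_c (deg C) for a midpoint tm_c (deg C)."""
    t = temp_c + 273.15
    tm = tm_c + 273.15
    # Unfolding equilibrium constant: K = exp[(dH/R) * (1/Tm - 1/T)].
    k_unfold = math.exp((delta_h / R) * (1.0 / tm - 1.0 / t))
    return 1.0 / (1.0 + k_unfold)


for temp in (40.0, 50.4, 58.3, 70.0):
    print(temp,
          round(fraction_folded(temp, TM_BIOG), 3),
          round(fraction_folded(temp, TM_BIOH), 3))
print("TM difference (BioH - BioG):", round(TM_BIOH - TM_BIOG, 1), "deg C")
```

By construction the native fraction is exactly 0.5 at the midpoint, so at any given temperature the protein with the lower TM (BioG) is predicted to be less folded than BioH when both are assigned the same assumed enthalpy.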

DISCUSSION

BioG is a pimeloyl-ACP methyl esterase responsible for a critical step in the biotin biosynthetic pathway of a diverse group of bacteria (20). It is remotely related to other non-homologous isofunctional enzymes such as BioH, BioK, BioJ, and BioV (18-21), and its orthologues are classified as a subfamily of the α/β-hydrolase fold superfamily together with the Pfam Domain of Unknown Function 452 (DUF452) (43). In this study, we solved the structure of H. influenzae BioG at a resolution of 1.26 Å. Structural searches with either PDBeFold (44) or Dali (45) identified BioH as the closest structural neighbor of BioG, indicating that these two proteins share a common evolutionary origin. Structural comparisons found that the BioG active site closely resembles that of BioH, including a similar hydrophobic pocket for binding and positioning of the pimeloyl methyl ester moiety for nucleophilic attack by the conserved catalytic triad. BioG has a positively charged docking site for the ACP moiety of the substrate that is similar to that of BioH (Figures 4 and 5). As mentioned above, the ACPs of H. influenzae and E. coli are highly homologous (83% identical) and the α2-helices are identical. Hence all of the ACP residues that interact with BioH are conserved. The ACP docking site was functionally verified by site-directed mutagenesis and both in vivo and in vitro activity assays (Figure 6). The function of S62 in catalysis was previously demonstrated (19). However, BioG presents an atypical α/β-hydrolase fold structure formed by replacing four of the six α-helices flanking both sides of the central β-sheet in the canonical α/β-hydrolase fold structure of BioH with long loops that may contain a short 3₁₀-helix (Figure 2). These structural differences in BioG occur far away from the active site and result in decreased thermal stability in comparison to BioH (Figure 7). The atypical α/β-hydrolase fold structure gives BioG a protein surface distinct from that of BioH (Figure 8A), although the two proteins are superimposable with an rmsd of 3.73 Å.
However, there are two conserved surface areas with a similar pattern of electrostatic potential distribution (Figures 8B and 8C), each of which appears suitable to interact with another protein. The first is a positively charged area composed of K118, K127 and R132 in BioG and R138, R142, R155, and R159 in BioH (Figure 4C and Figure 8B), which have been experimentally confirmed to form the ACP docking site in this study (Figure 6) and a previous study (24), respectively. Although a few of these positive residues, such as K127 and R132 of BioG, are poorly conserved (Figure 5), analogous residues are available in close vicinity to form the ACP docking site, which is thus believed to be a conserved feature of all BioG and BioH orthologues. The second area lies on the opposite side of the protein from the first (Figure 8A) and consists of a deep, positively charged dent at the core surrounded by a wide, negatively charged belt covering an extensive surface region (Figure 8C). Residues contributing to the negative belt are mainly from loops 1 and 2 in BioG and from αA and αB in BioH, while the residues contributing to the positive core are from the core domain in both proteins (Figure 8B, legend). Although these residues are poorly conserved, many similarly charged residues are found in the orthologues in the close vicinity of the positions corresponding to the residues identified in H. influenzae BioG and E. coli BioH; they are not aligned in multiple sequence alignments of the respective orthologues, probably because of the intrinsically low level of sequence conservation (Figure 8D). This second surface area with a similar electrostatic potential distribution is thus believed to be conserved discretely among BioG and BioH orthologues. Nonetheless, this surface area exhibits significant differences in size and shape as well as charge distribution between BioG and BioH. Notably, the central positively charged dent is significantly larger in BioH because of the involvement of a long connection loop between β3 and α2, the equivalent of which is very short and contributes little to the protein surface in BioG (Figure 8C).
Another major difference lies in the size and shape of the negative belt, which is due in large part to the atypical α/β-hydrolase fold structure of BioG, with four long loops replacing four of the six β-sheet-stabilizing helices in the canonical structure of BioH. In BioG, helix-replacing loops 3 and 4 are not involved in the formation of any conserved protein surface, and loop 4 is completely absent in some BioG orthologues such as the one from M. hirudinis (Figure 5). Since these loops are far from the active site, the functional implication of their replacement of the helices is unknown, except that they may accommodate the changes in loop 1 and loop 2. The presence of a potentially conserved protein surface with a similar electrostatic distribution pattern but distinct shape and size suggests that BioG and BioH may interact with another protein, although no such protein has been reported. The putative protein should ideally aid the gatekeeping function of the esterases to provide protein-level control for the exit of pimeloyl-ACP methyl ester from the type II fatty acid biosynthetic cycles. While this speculated protein-protein interaction awaits verification, it is consistent with the experimental observation that overexpression of BioH impairs biotin biosynthesis (46). In this hypothetical regulatory scheme, the size, shape and charge distribution of the interaction interface on the esterase have to be complementary to the partner protein, which may be specifically conserved among a subset of bacteria. To guarantee structural complementarity, BioH orthologues have to adopt the canonical α/β-hydrolase fold structure, whereas BioG orthologues have to adopt the atypical α/β-hydrolase fold structure to accommodate the interacting protein, even at the expense of structural stability. This structural constraint on the esterase thus provides a rationale for the existence of BioG and BioH orthologues as two phylogenetically distinct groups of non-homologous isofunctional enzymes. Alternatively, the phylogenetic and structural segregation of BioG and BioH may have no functional implication according to the neutral theory of molecular evolution (47), which assumes that random and neutral mutations are enriched and fixed through genetic drift (48) and that the resulting proteins keep full functionality despite pronounced structural change.
This is consistent with the complete exchangeability between BioH and BioG (19). It is also consistent with the presence of many proline residues in the helix-replacing loop 1 of BioG (Figure 5 and Figure 8D). However, this assumption is unable to explain the significant decrease in stability of BioG in the absence of a functional advantage over BioH.

In summary, we have presented structural and biochemical evidence that BioG adopts an atypical α/β-hydrolase fold structure with decreased stability but has the same mode of substrate recognition and the same catalytic function as the non-homologous isofunctional BioH, which has a canonical α/β-hydrolase fold structure. Structural analysis has led to the identification of a potential binding site for a new protein-protein interaction, with a similar pattern of electrostatic potential but a distinct size and shape on the surfaces of BioG and BioH, suggesting that BioG and BioH may engage in regulatory control of pimeloyl-ACP biosynthesis by interacting with other proteins. These structural and functional insights may eventually lead to an understanding of the phylogenetic diversity of the pimeloyl-ACP methyl esterase and its role in biotin biosynthesis.

ACKNOWLEDGEMENTS

We thank Dr. Herman Ho-Yung Sung for help with in-house crystal testing, and the Shanghai Synchrotron Radiation Facility (SSRF) and the National Center for Protein Science Shanghai (NCPSS) for access to beamlines BL17U and BL19U and for on-site technical support.

REFERENCES

1. Attwood, P. V., and Wallace, J. C. (2002) Chemical and catalytic mechanisms of carboxyl transfer reactions in biotin-dependent enzymes. Acc. Chem. Res. 35, 113–120.
2. Waldrop, G. L., Holden, H. M., and St Maurice, M. (2012) The enzymes of biotin dependent CO₂ metabolism: what structures reveal about their reaction mechanisms. Protein Sci. 21, 1597–1619.

3. Tong, L. (2013) Structure and function of biotin-dependent carboxylases. Cell. Mol. Life Sci. 70, 863–891.
4. Lin, S., and Cronan, J. E. (2011) Closing in on complete pathways of biotin biosynthesis. Mol. Biosyst. 7, 1811–1821.
5. Jarrett, J. T. (2015) The biosynthesis of thiol- and thioether-containing cofactors and secondary metabolites catalyzed by radical S-adenosylmethionine enzymes. J. Biol. Chem. 290, 3972–3979.
6. Park, S. W., Klotzsche, M., Wilson, D. J., Boshoff, H. I., Eoh, H., Manjunatha, U., Blumenthal, A., Rhee, K., Barry, C. E., Aldrich, C. C., Ehrt, S., and Schnappinger, D. (2011) Evaluating the sensitivity of Mycobacterium tuberculosis to biotin deprivation using regulated gene expression. PLoS Pathog. 7, e1002264.
7. Napier, B. A., Meyer, L., Bina, J. E., Miller, M. A., Sjostedt, A., and Weiss, D. S. (2012) Link between intraphagosomal biotin and rapid phagosomal escape in Francisella. Proc. Natl. Acad. Sci. U. S. A. 109, 18084–18089.
8. Yu, J., Niu, C., Wang, D. C., Li, M., Teo, W. S., Sun, G., Wang, J. P., Liu, J., and Gao, Q. A. (2011) MMAR_2770, a new enzyme involved in biotin biosynthesis, is essential for the growth of Mycobacterium marinum in macrophages and zebrafish. Microbes Infect. 13, 33–41.
9. Zlitni, S., Ferruccio, L. F., and Brown, E. D. (2013) Metabolic suppression identifies new antibacterial inhibitors under nutrient limitation. Nat. Chem. Biol. 9, 796–804.
10. Park, S. W., Casalena, D. E., Wilson, D. J., Dai, R., Nag, P. P., Liu, F., Boyce, J. P., Bittker, J. A., Schreiber, S. L., Finzel, B. C., Schnappinger, D., and Aldrich, C. C. (2015) Target-based identification of whole-cell active inhibitors of biotin biosynthesis in Mycobacterium tuberculosis. Chem. Biol. 22, 76–86.

11. Mueller, J. H. (1937) Pimelic acid as a growth accessory for the diphtheria bacillus. J. Biol. Chem. 119, 121–131.
12. du Vigneaud, V., Dittmer, K., Hague, E., and Long, B. (1942) The growth-stimulating effect of biotin for the diphtheria bacillus in the absence of pimelic acid. Science 96, 186–187.
13. Ogata, K., Tochikura, T., Iwahara, S., Takasawa, S., Ikushima, K., Nishimura, A., and Kikichi, M. (1965) Studies on biosynthesis of biotin by microorganisms. II. Identification of biotin-vitamers accumulated by various microorganisms. Agr. Biol. Chem. (Tokyo) 29, 895–901.
14. Eisenberg, M. A. (1963) Biotin biosynthesis I. Biotin yields and biotin vitamers in cultures of Phycomyces blakesleeanus. J. Bacteriol. 86, 673–680.
15. Lin, S., Hanson, R. E., and Cronan, J. E. (2010) Biotin synthesis begins by hijacking the fatty acid synthetic pathway. Nat. Chem. Biol. 6, 682–688.
16. Lin, S., and Cronan, J. E. (2012) The BioC O-methyltransferase catalyzes methyl esterification of malonyl-acyl carrier protein, an essential step in biotin synthesis. J. Biol. Chem. 287, 37010–37020.
17. White, S. W., Zheng, J., Zhang, Y.-M., and Rock, C. O. (2005) The structural biology of type II fatty acid biosynthesis. Annu. Rev. Biochem. 74, 791–831.
18. Rodionov, D. A., Mironov, A. A., and Gelfand, M. S. (2002) Conservation of the biotin regulon and the BirA regulatory signal in eubacteria and archaea. Genome Res. 12, 1507–1516.
19. Shapiro, M. M., Chakravartty, V., and Cronan, J. E. (2012) Remarkable diversity in the enzymes catalyzing the last step in synthesis of the pimelate moiety of biotin. PLoS ONE 7, e49440.

20. Feng, Y. J., Napier, B. A., Manandhar, M., Henke, S. K., Weiss, D. S., and Cronan, J. E. (2014) A Francisella virulence factor catalyses an essential reaction of biotin synthesis. Mol. Microbiol. 91, 300–314.
21. Bi, H. K., Zhu, L., Jia, J., and Cronan, J. E. (2016) A biotin biosynthesis gene restricted to Helicobacter. Sci. Rep. 6, 21162.
22. Xie, X., Wong, W. W., and Tang, Y. (2007) Improving simvastatin bioconversion in Escherichia coli by deletion of bioH. Metab. Eng. 9, 379–386.
23. Kwon, M. A., Kim, H. S., Oh, J. Y., Song, B. K., and Song, J. K. (2009) Gene cloning, expression, and characterization of a new carboxylesterase from Serratia sp. SES-01: comparison with Escherichia coli BioHe enzyme. J. Microbiol. Biotechnol. 19, 147–154.
24. Agarwal, V., Lin, S., Lukk, T., Nair, S. K., and Cronan, J. E. (2012) Structure of the enzyme-acyl carrier protein (ACP) substrate gatekeeper complex required for biotin synthesis. Proc. Natl. Acad. Sci. U. S. A. 109, 17406–17411.
25. Otwinowski, Z., and Minor, W. (1997) Processing of X-ray diffraction data collected in oscillation mode. Macromol. Crystallogr. A 276, 307–326.
26. Terwilliger, T. C., Adams, P. D., Read, R. J., McCoy, A. J., Moriarty, N. W., Grosse-Kunstleve, R. W., Afonine, P. V., Zwart, P. H., and Hung, L. W. (2009) Decision-making in structure solution using Bayesian estimates of map quality: the PHENIX AutoSol wizard. Acta Crystallogr. D Biol. Crystallogr. 65, 582–601.
27. Terwilliger, T. C., Grosse-Kunstleve, R. W., Afonine, P. V., Moriarty, N. W., Zwart, P. H., Hung, L. W., Read, R. J., and Adams, P. D. (2008) Iterative model building, structure refinement and density modification with the PHENIX AutoBuild wizard. Acta Crystallogr. D Biol. Crystallogr. 64, 61–69.
28. Emsley, P., Lohkamp, B., Scott, W. G., and Cowtan, K. (2010) Features and development of Coot. Acta Crystallogr. D Biol. Crystallogr. 66, 486–501.

29. Adams, P. D., Afonine, P. V., Bunkoczi, G., Chen, V. B., Davis, I. W., Echols, N., Headd, J. J., Hung, L. W., Kapral, G. J., Grosse-Kunstleve, R. W., McCoy, A. J., Moriarty, N. W., Oeffner, R., Read, R. J., Richardson, D. C., Richardson, J. S., Terwilliger, T. C., and Zwart, P. H. (2010) PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D Biol. Crystallogr. 66, 213–221.
30. McCoy, A. J., Grosse-Kunstleve, R. W., Adams, P. D., Winn, M. D., Storoni, L. C., and Read, R. J. (2007) Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674.
31. Zwart, P. H., Grosse-Kunstleve, R. W., and Adams, P. D. (2005) Xtriage and Fest: automatic assessment of X-ray data and substructure structure factor estimation. CCP4 newsletter Winter, Contribution 7.
32. Cronan, J. E., and Thomas, J. (2009) Bacterial fatty acid synthesis and its relationships with polyketide synthetic pathways. Methods Enzymol. 459, 395–433.
33. DeLano, W. L. (2002) The PyMOL Molecular Graphics System, Schrödinger, LLC, New York.
34. Krissinel, E., and Henrick, K. (2007) Inference of macromolecular assemblies from crystalline state. J. Mol. Biol. 372, 774–797.
35. Pei, J. M., Kim, B. H., and Grishin, N. V. (2008) PROMALS3D: a tool for multiple protein sequence and structure alignments. Nucleic Acids Res. 36, 2295–2300.
36. Robert, X., and Gouet, P. (2014) Deciphering key features in protein structures with the new ENDscript server. Nucleic Acids Res. 42, W320–W324.
37. Baker, N. A., Sept, D., Joseph, S., Holst, M. J., and McCammon, J. A. (2001) Electrostatics of nanosystems: application to microtubules and the ribosome. Proc. Natl. Acad. Sci. U. S. A. 98, 10037–10041.
38. De Lay, N. R., and Cronan, J. E. (2007) In vivo functional analyses of the type II acyl carrier proteins of fatty acid biosynthesis. J. Biol. Chem. 282, 20319–20328.

39. Cronan, J. E. (2014) The chain-flipping mechanism of ACP (acyl carrier protein)-dependent enzymes appears universal. Biochem. J. 460, 157–163.
40. Somerville, R. L., Shieh, T. L., Hagewood, B., and Cui, J. S. (1991) Gene expression from multicopy T7 promoter vectors proceeds at single copy rates in the absence of T7 RNA polymerase. Biochem. Biophys. Res. Commun. 181, 1056–1062.
41. Masters, M., Colloms, M. D., Oliver, I. R., He, L., Macnaughton, E. J., and Charters, Y. (1993) The pcnB gene of Escherichia coli, which is required for ColE1 copy number maintenance, is dispensable. J. Bacteriol. 175, 4405–4413.
42. Greenfield, N. J. (2006) Using circular dichroism collected as a function of temperature to determine the thermodynamics of protein unfolding and binding interactions. Nat. Protoc. 1, 2527–2535.
43. Finn, R. D., Coggill, P., Eberhardt, R. Y., Eddy, S. R., Mistry, J., Mitchell, A. L., Potter, S. C., Punta, M., Qureshi, M., Sangrador-Vegas, A., Salazar, G. A., Tate, J., and Bateman, A. (2016) The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 44, D279–D285.
44. Krissinel, E., and Henrick, K. (2004) Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Crystallogr. D Biol. Crystallogr. 60, 2256–2268.
45. Holm, L., and Rosenstrom, P. (2010) Dali server: conservation mapping in 3D. Nucleic Acids Res. 38, W545–W549.
46. Koga, N., Kishimoto, J., Haze, S. I., and Ifuku, O. (1996) Analysis of the bioH gene of Escherichia coli and its effect on biotin productivity. J. Ferment. Bioeng. 81, 482–487.
47. Kimura, M. (1968) Evolutionary rate at the molecular level. Nature 217, 624–626.
48. Nei, M. (2005) Selectionism and neutralism in molecular evolution. Mol. Biol. Evol. 22, 2318–2342.

Table 1. Oligonucleotides (5’-3’)

Name    Description                 Sequence
BioG-F  BioG Forward                GGAATTCCATATGAATAACATCTGGTGGC
BioG-R  BioG Reverse                CCGCTCGAGTGCGGCCGCAAGCTTGTCG
XC168   K118A Forward               ATCTCACAGAAAATACCCGTTTAGCATTTGAACGCAGAATCTGTGGC
XC169   K118A Reverse               GCCACAGATTCTGCGTTCAAATGCTAAACGGGTATTTTCTGTGAGAT
XC170   K127A Reverse               TTGAACGCAGAATCTGTGGCGATGCAGCATCTTTTGAACGTTAC
XC171   K127A Forward               GTAACGTTCAAAAGATGCTGCATCGCCACAGATTCTGCGTTCAA
XC172   R132A Forward               GGCGATAAAGCATCTTTTGAAGCTTACCAATTATTTCCAGCCCG
XC173   K118A/K127A Forward         ATACCCGTTTAGCATTTGAACGCAGAATCTGTGGCGATGCAGCATCTTT
XC174   K118A/K127A Reverse         AAAGATGCTGCATCGCCACAGATTCTGCGTTCAAATGCTAAACGGGTAT
XC175   K127A/R132A Forward         CGCAGAATCTGTGGCGATGCAGCATCTTTTGAAGCTTACCAATTATTTC
XC176   K127A/R132A Reverse         GAAATAATTGGTAAGCTTCAAAAGATGCTGCATCGCCACAGATTCTGCG
XC177   K118A/R132A Forward         TTAGCATTTGAACGCAGAATCTGTGGCGATAAAGCATCTTTTGAAGCTTA
XC178   K118A/R132A Reverse         TAAGCTTCAAAAGATGCTTTATCGCCACAGATTCTGCGTTCAAATGCTAA
XC179   K118A/K127A/R132A Forward   TTAGCATTTGAACGCAGAATCTGTGGCGATGCAGCATCTTTTGAAGCTTA
XC180   K118A/K127A/R132A Reverse   TAAGCTTCAAAAGATGCTGCATCGCCACAGATTCTGCGTTCAAATGCTAA
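Each mutagenic primer pair in Table 1 is expected to consist of two exactly complementary strands. The sketch below checks this for the K118A pair (XC168/XC169), with both sequences copied from the table; the helper name `reverse_complement` is hypothetical and is not part of the original methods.

```python
# Watson-Crick complement for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# K118A primer pair from Table 1.
XC168 = "ATCTCACAGAAAATACCCGTTTAGCATTTGAACGCAGAATCTGTGGC"
XC169 = "GCCACAGATTCTGCGTTCAAATGCTAAACGGGTATTTTCTGTGAGAT"

# True when the two primers anneal to each other over their full length.
print(reverse_complement(XC168) == XC169)
```

The same check can be applied to any other forward/reverse pair in the table by swapping in the two sequences.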

Table 2. E. coli strains and plasmids

Strains     Relevant Genotype or Description                   Reference or Derivation
DH5α        - ∆(argF lacZ)U169 Ф80 ∆(lacZ)M15 recA1 endA1      Lab stock
BL21(DE3)   E. coli B ompT hsdSB gal dcm (DE3)                 Lab stock
STL243      ∆bioH::FRT ∆pcnB::cat of MG1655                    1
MG1655      E. coli K-12 wild type                             Lab stock

Plasmids    Relevant Genotype or Description                   Reference or Derivation
pET28b      T7 promoter expression vector, KanR                Novagen
pMAD23      pET28b+ encoding His6-BioG                         4
pXC031      pET28b+ encoding His6-BioG K118A                   This study
pXC032      pET28b+ encoding His6-BioG K127A                   This study
pXC033      pET28b+ encoding His6-BioG R132A                   This study
pXC034      pET28b+ encoding His6-BioG K118A/K127A             This study
pXC035      pET28b+ encoding His6-BioG K118A/R132A             This study
pXC036      pET28b+ encoding His6-BioG K127A/R132A             This study
pXC037      pET28b+ encoding His6-BioG K118A/K127A/R132A       This study

Table 3. Data collection and refinement statisticsᵃ

                              SeMet BioG                    Native BioG
PDB code                      5H3B                          5GNG
Wavelength (Å)                0.97918                       0.97918
Resolution range (Å)          32.45 - 1.49 (1.55 - 1.49)    30.39 - 1.26 (1.31 - 1.26)
Space group                   P 1 21 1                      P 1 21 1
Unit cell a, b, c (Å)         44.154, 69.471, 72.034        45.95, 67.83, 67.99
Unit cell α, β, γ (°)         90, 93.15, 90                 90, 90.01, 90
Total reflections             518390                        204144
Unique reflections            70407                         105604
Multiplicity                  7.4 (6.5)                     1.9 (1.9)
Completeness (%)              99.60 (96.38)                 93.88 (90.48)
Mean I/sigma(I)               19.5 (2.7)                    13.25 (2.34)
CC1/2 (%)                     99.8 (87.3)                   99.6 (67.3)
R-measure                     0.051 (0.75)                  0.034 (0.33)
R-work                        0.1469 (0.2231)               0.1446 (0.2780)
R-free                        0.1816 (0.3069)               0.1809 (0.3069)
Non-hydrogen atoms            3995                          4272
  Macromolecules              3515                          3551
  Ligands                     10                            -
  Water                       470                           721
Protein residues              428                           428
RMS bonds (Å)                 0.008                         0.006
RMS angles (°)                1.08                          0.86
Ramachandran favored (%)      98                            97
Ramachandran outliers (%)     0                             0
Clashscore                    2.32                          5.20
Average B-factor              25.30                         14.10
  Macromolecules              23.90                         12.10
  Ligands                     18.80                         -
  Solvent                     35.50                         23.90

ᵃ Statistics for the highest-resolution shell are shown in parentheses.
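As a quick internal consistency check on Table 3, the overall multiplicity should equal the number of total reflections divided by the number of unique reflections. The sketch below recomputes it from the tabulated counts; the dictionary layout and the function name are illustrative only.

```python
# Reflection counts and reported overall multiplicity from Table 3.
DATASETS = {
    "SeMet BioG (5H3B)": {"total": 518390, "unique": 70407, "reported": 7.4},
    "Native BioG (5GNG)": {"total": 204144, "unique": 105604, "reported": 1.9},
}


def multiplicity(total, unique):
    """Average number of observations per unique reflection."""
    return total / unique


for name, d in DATASETS.items():
    value = multiplicity(d["total"], d["unique"])
    print(f"{name}: computed {value:.2f}, reported {d['reported']}")
```

Rounded to one decimal place, the recomputed values agree with the multiplicities listed for both data sets.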

Table 4. Important residues of BioG and BioH.

                      BioH     BioG
Hydrophobic pocket    W22      W20
                      L83      -
                      F111     F178ᵃ
                      I120     I99
                      V124     I103
                      L125     F104
                      F128     T107
                      F143     F119
                      L146     I123
                      M149     -
                      L180     L148
                      L183     M66ᵃ
                      L209     I177
Catalytic triad       S82      S65
                      D207     D175
                      H235     H200
Ionic contacts        R138     -
                      R142     K118
                      R155     K127
                      R159     R132

ᵃ Residues identified from structural superposition rather than sequence alignment.

Table 5. Calculated interface areas and free energy for stabilization of the central β-sheet in BioG and BioH.

Fig. 3B           BioG left   BioH left   BioG right   BioH right
Interface (Å²)    1190        1512        2156         2035
∆G (kcal/mol)     -18.6       -27.6       -28.5        -31.2
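The total stabilization energies quoted in the Results are the sums of the two per-layer values in Table 5. The sketch below redoes that arithmetic; the data layout and function name are illustrative. For BioH the tabulated values sum to -58.8 kcal/mol, which differs by 0.1 kcal/mol from the -58.7 kcal/mol quoted in the text, presumably because the tabulated values were rounded before summation.

```python
# Per-layer values from Table 5: (interface area in A^2, dG in kcal/mol).
TABLE5 = {
    "BioG": {"left": (1190, -18.6), "right": (2156, -28.5)},
    "BioH": {"left": (1512, -27.6), "right": (2035, -31.2)},
}


def total_dg(protein):
    """Sum the left and right flanking-layer energies for one protein."""
    layers = TABLE5[protein]
    return round(layers["left"][1] + layers["right"][1], 1)


for protein in TABLE5:
    print(protein, total_dg(protein), "kcal/mol")
```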

Figure Legends

Figure 1. Biotin biosynthesis in BioC-containing bacteria. (A) Biosynthesis of pimeloyl-ACP via a modified fatty acid biosynthetic pathway. The release of the pimeloyl intermediate from the fatty acid synthesis cycle is highlighted by a green box. (B) Differing biotin biosynthetic gene organization in E. coli and H. influenzae. The E. coli bioH gene is located far from the bio operon genes (about 40% of the genome).

Figure 2. Crystal structure of BioG. (A) Schematic representation of the overall structure of BioG. One subunit is colored green, while the other is colored by secondary structure elements in individual domains. α-Helices are colored cyan in the lid domain. In the core domain, the α-helices, β-sheets and 3₁₀-helices are colored red, yellow, and magenta, respectively. The loops are colored grey in both domains. (B) Three 3₁₀-helices in the long loops of one flanking layer. The 3₁₀-helices are shown in stick representation with green carbon atoms. Blue mesh indicates the omit |Fo-Fc| electron density map contoured at 2σ.

Figure 3. Structural differences between BioG and BioH. (A) Stereo diagram of superimposed BioG (green) and BioH (PDB accession code 1M33, grey). Outward-pointing α4 in the BioG lid domain is circled with black dashes, while the shorter α2 to α3 are circled with violet dashes. (B) Replacement of four β-sheet-stabilizing α-helices in BioH (grey) with four long loops in BioG (green). The α-helices or α-helix-replacing loops (red) in BioG consist of the following residues: 46-53 (loop 2), 66-75 (αA), 160-163 (loop 3), 180-188 (αB), 24-33 (loop 1) and 201-214 (loop 4). The corresponding α-helices (red) of BioH contain the following residues: 61-72 (αB), 83-94 (αC), 190-194 (αD), 212-221 (αE), 25-40 (αA) and 236-256 (αF).

Figure 4. The active sites of BioG and BioH are similar. (A) Superposition of BioG (green) with binary complex of BioH S82A (grey) with pimeloyl-ACP methyl ester (blue). The pimeloyl methyl ester moiety of the substrate is represented in sticks. (B) A close-up view of the modeled methyl ester and the catalytic triad. The arrow indicates the distance in Å of Oγ of the nucleophilic serine to carbonyl carbon atom in the methyl ester function. (C) Similar ionic interactions between ACP and the enzymes. (D) Similar hydrophobic pockets for binding of the pimeloyl methyl ester moiety of the ACP-linked substrate. The pimelate carbon chain is numbered from the methyl ester end. In all panels, previously identified BioH residues and their corresponding residues in BioG are shown in sticks with grey labels in BioH and with green labels in BioG. Dashed lines in black indicate hydrogen bonds or ionic interactions between ACP and BioH. Gold dashed lines link the alpha carbons of corresponding pairs of residues in (D) except F111/F178 and L183/M66. Figure 5. Structure-based sequence alignment of BioG orthologues and E. coli BioH. 20 BioG orthologues from Uniprot are aligned and 6 representative sequences with identity across 20% to 50% are selected and shown here. Residues forming the catalytic triad, the hydrophobic channel, and docking site for ACP are highlighted by circles ( ), triangles (

),

and diamonds ( ), respectively. Residues are numbered according to H. influenzae BioG. Secondary structures of BioG are placed above the sequences, while those of BioH are placed below the sequences with the same labels as in Figures 2 and 3. Figure 6. Effects of substitution of alanine for the BioG basic residues at positions 118, 127, and 132 on BioG activity in vitro and in vivo. (A) Assay in vitro. Conversion of the BioG pimeloyl-ACP methyl ester substrate to the pimeloyl-ACP product was assayed in a conformational sensitive electrophoretic mobility shift assay (Material and Methods). The pimeloyl-ACP product migrates more slowly than the pimeloyl-ACP methyl ester substrate

32 Environment ACS Paragon Plus

Page 33 of 43

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Biochemistry

because the carboxyl group of pimeloyl-ACP destabilizes the ACP structure and thereby increases the effective radius of the protein. The reactions contained 150 µM methylpimeloyl-ACP plus 5 nM enzyme and were followed for 20 min. The single-substitution mutants K118A, K127A and R132A showed activities similar to that of wild-type BioG, whereas the double mutants showed significantly reduced activities. The activity of the triple mutant was nearly abolished. (B) Activities of the BioG mutant proteins in vivo. The in vivo assays were performed using derivatives of an E. coli ∆bioH ∆pcnB host strain carrying one of the bioG plasmids, in medium lacking biotin. Colonies formed on plates lacking biotin. The strain streaked in each sector is identified by its number. Growth of all strains was indistinguishable from that of the wild-type strain MG1655 in the presence of 4 nM biotin.

Figure 7. Thermal denaturation monitored by CD spectroscopy. (A) Far-UV CD spectra of BioG (left) and BioH (right), with arrows indicating the direction of the temperature change. (B) Normalized Θ222 values for BioG (left) and BioH (right) are plotted against temperature and fitted to a two-state (native to unfolded) model. The midpoints of denaturation are marked.

Figure 8. Comparison of the electrostatic potential surfaces of BioG and BioH. (A) Overall electrostatic potential surfaces. The enzymes are shown in the same views as in Figure 3B, and the areas circled in black and purple correspond to the views in (B) and (C), respectively. (B) Positively charged surfaces for docking of the acidic ACP. A close-up view of the positive patch is provided in half transparency, with the relevant basic residues shown as sticks. (C) Another surface area with similar electrostatic potential patterns but distinct size and charge distribution. The positive core consists of K2 and K4 from β1 in BioG; and of N3 from the N-terminus, Q7 and K9 from β1, and R52 and R54 from a long core domain loop in BioH. Residues contributing to the negative belt are E109, N110, E113, E131, Q134, D25, N28, E34, Q45, D46, D50, D52, D142, and Q146 in BioG; and D134, D135, Q136, Q137, E141, E172, D174, E27, D33, E34, E35, D63, E66, Q70 and D74 in BioH. Half-transparent surfaces are shown next to the respective electrostatic surfaces, with the contributing negatively charged secondary structures labeled. A close-up view is provided alongside the electrostatic potential map to show the morphological difference in the connection loop (purple) from β3 to loop 2 in BioG and to αB in BioH. The helix-replacing loop 1 and loop 2 in BioG and the corresponding αA and αB in BioH are colored yellow. (D) Partial sequence alignments of BioG and BioH orthologues. The aligned sequences correspond to the residues contributing to the electrostatic potential distribution patterns identified in (C) in BioG or BioH. In the alignments, acidic residues contributing to the negatively charged surface are labeled red, and proline residues in the helix-replacing loop 1 of BioG are labeled green. In (A) to (C), the electrostatic potential was calculated with APBS in units of kBT/ec and colored according to the scale given in (C).
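The two-state fit mentioned in the Figure 7 legend can be illustrated with a minimal sketch. The legend does not give the fitting equation, so the version below is an assumption: it treats the normalized Θ222 signal as the fraction unfolded (0 to 1), assumes a temperature-independent unfolding enthalpy, and recovers the denaturation midpoint by a van't Hoff linear regression of ln K on 1/T. The function names and the synthetic parameters (a 330 K midpoint and a 400 kJ/mol enthalpy) are hypothetical and are not values from the paper.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def fraction_unfolded(temp_k, tm_k, dh_m):
    """Two-state fraction unfolded at temp_k (K).

    Assumes a temperature-independent unfolding enthalpy dh_m (J/mol),
    i.e. no heat-capacity change; tm_k is the denaturation midpoint (K).
    """
    return 1.0 / (1.0 + math.exp((dh_m / R) * (1.0 / temp_k - 1.0 / tm_k)))


def fit_two_state(temps_k, f_unfolded, lo=0.05, hi=0.95):
    """Van't Hoff fit: least-squares line through ln K versus 1/T.

    Only points inside the transition region (lo < f < hi) are used,
    because ln(f / (1 - f)) is ill-conditioned near the baselines.
    Returns (tm_k, dh_m).
    """
    xs, ys = [], []
    for t, f in zip(temps_k, f_unfolded):
        if lo < f < hi:
            xs.append(1.0 / t)
            ys.append(math.log(f / (1.0 - f)))  # ln K, with K = [U]/[N]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points in the transition region")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                 # equals -dH/R
    intercept = mean_y - slope * mean_x  # equals dH/(R*Tm)
    dh_m = -slope * R
    tm_k = -slope / intercept
    return tm_k, dh_m


if __name__ == "__main__":
    # Synthetic, noise-free melting curve (hypothetical parameters).
    temps = [293.0 + 2.0 * i for i in range(36)]
    signal = [fraction_unfolded(t, 330.0, 4.0e5) for t in temps]
    tm, dh = fit_two_state(temps, signal)
    print(f"Tm = {tm:.2f} K, dH = {dh / 1000.0:.1f} kJ/mol")
```

With real CD data, sloping native and unfolded baselines usually have to be fitted together with the transition, which calls for a nonlinear least-squares routine rather than this linearized form.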


Fig. 1


Fig. 2


Fig. 3


Fig. 4


Fig. 5


Fig. 6


Fig. 7


Fig. 8


Graphic for the Table of Contents: Jie Shi, Xinyun Cao, Yaozong Chen, John E. Cronan*, and Zhihong Guo*. An atypical α/β hydrolase fold revealed in the crystal structure of pimeloyl-acyl carrier protein methyl esterase BioG from Haemophilus influenzae.
