Interdomain and Intermodule Organization in ... - ACS Publications

Jun 13, 2016 - ABSTRACT: Nonribosomal peptide synthetases are large, complex multidomain enzymes ... Nonribosomal peptides (NRPs) are a complex and...

Article

Interdomain and intermodule organization in epimerization domain containing nonribosomal peptide synthetases Wei-Hung Chen, Kunhua Li, Naga Sandhya Guntaka, and Steven D. Bruner ACS Chem. Biol., Just Accepted Manuscript • DOI: 10.1021/acschembio.6b00332 • Publication Date (Web): 13 Jun 2016 Downloaded from http://pubs.acs.org on June 14, 2016


ACS Chemical Biology is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


ACS Chemical Biology

Interdomain and intermodule organization in epimerization domain containing nonribosomal peptide synthetases

Wei-Hung Chen, Kunhua Li, Nagasandhya Guntaka and Steven D. Bruner*

* Department of Chemistry, University of Florida, P.O. Box 117200, Gainesville, FL 32611, email: [email protected]

Abstract

Nonribosomal peptide synthetases are large, complex multi-domain enzymes responsible for the biosynthesis of a wide range of peptidic natural products. Inherent to synthetase chemistry is the thioester-templated mechanism that relies on protein/protein interactions and interdomain dynamics. Several questions related to structure and mechanism remain to be addressed, including the incorporation of accessory domains and inter-module interactions. The inclusion of nonproteinogenic D-amino acids into peptide frameworks is a common and important modification for bioactive nonribosomal peptides. Epimerization domains, embedded in nonribosomal peptide synthetase assembly lines, catalyze the L- to D-amino acid conversion. Here we report the structure of the epimerization domain/peptidyl carrier protein didomain construct from the first module of the cyclic peptide antibiotic gramicidin synthetase. Both holo (phosphopantetheine post-translationally modified) and apo structures were determined, each representing catalytically relevant conformations of the two domains. The structures provide insight into domain-domain recognition and substrate delivery during the assembly-line process, as well as the structural organization of homologous condensation domains, canonical players in all synthetase modules.
Introduction

Nonribosomal peptides (NRPs) are a complex and structurally diverse family of natural products produced primarily by bacteria and fungi using large multi-domain enzyme machines.1–3 NRP secondary metabolites exhibit a wide range of biological activities that can be attributed to the large structural variability made possible by the modular biosynthetic pathways.4 The activities of NRPs range from regulation of cell metabolism (for example, iron chelation and transport) to antibiotic, antitumor, immunosuppressive, or cytotoxic action, and many NRPs are exploited as therapeutics.5–7 The biosynthesis of NRPs is carried out by nonribosomal peptide synthetases (NRPSs), large, multidomain enzymes commonly arranged in co-linear assemblies of repeating modules.1,4,8 The linked domains orchestrate multiple and diverse chemical reactions in a highly coordinated manner. The basic logic of NRPS chemistry is well established.4,9 A minimum of three domains are necessary to extend a growing peptide chain by one amino acid building block: the adenylation (A), condensation (C), and peptidyl carrier protein (PCP) domains. The initial step of chain extension is the selection and activation of a specific amino acid by the A domain, which uses ATP to convert the amino acid to an aminoacyl-AMP intermediate, followed by transfer of the activated amino acid onto the thiol moiety of the phosphopantetheine (Ppant) arm of the post-translationally modified PCP domain, forming an aminoacyl-S-Ppant-PCP. A condensation domain of the downstream module next couples the amino group of donor substrate with the


upstream thioester, forming the amide bond and releasing a free Ppant arm. The peptide grows by downstream transfer along the assembly-line machinery through repeating modules. Oligomerization is commonly started by an A-PCP initiation module and terminated by a C-terminal thioesterase (Te) domain that cleaves the peptide from the machinery by hydrolysis or macrocyclization.10 Complementing the core domains common to all NRPSs, additional auxiliary domains embedded in the assembly line catalyze various modifications of the peptide. These further chemical modifications contribute significantly to the structural diversity of NRPs, adding functional diversity and in vivo stability.11 The modifications occur during peptide synthesis and are performed by domains such as epimerization (E), cyclization (Cy)/oxidation (Ox) and methyltransferase (Mt), responsible for the incorporation of D-amino acids, heterocyclic rings, and N-methylated residues, respectively. Incorporation of nonproteinogenic D-amino acids in the peptide framework is, in particular, a common modification for bioactive nonribosomal peptides.12 Enantiomeric (D-) amino acids impart to the natural product framework unique conformations necessary for biological activity and retard peptide degradation by common L-amino acid-specific proteases.13 For a small number of amino acids, inversion of the stereochemistry at the α-carbon can be provided by external racemases, followed by incorporation into an NRP (alanine racemase is a prominent example), but more frequently assembly-line epimerization (E) domains embedded in an NRPS module generate the D-amino acids found in the natural products.14 Previous studies on NRPS E domains have elucidated key catalytic residues, substrate specificity, and the timing of epimerization in the assembly line.15–18 Mutational analysis of the E domain in the initiation module of the gramicidin S synthetase (GrsA_PheATE) predicted that the
cofactor-independent epimerization reaction proceeds by a deprotonation/reprotonation mechanism, most likely through a peptidyl enolate intermediate.19 It was also reported that the GrsA_PheATE module could activate and epimerize noncognate amino acid substrates, but with much lower efficiency,18 and it was predicted that the E domain serves as a gatekeeper for incorporation of D-Phe into gramicidin S. Furthermore, it was determined that the E domain (located at the C-terminus of NRPS modules) contributes to the intermodular transfer of correct intermediates to downstream modules.20 In a standard NRPS elongation module, the aminoacyl substrates are first formed by the upstream C domain, followed by epimerization by the E domain. The E domain in an initiation module, in contrast, lacks an upstream C domain; it directly epimerizes the aminoacyl substrate, which is then condensed by the downstream module. E domains show significant homology to C domains based on sequence alignment, with similar active-site motifs and predicted secondary structure.21 The structure of an isolated epimerization domain from the initiation module of tyrocidine synthetase was solved and supported the predicted structural similarity with C domains.22 Based on the structure, a conserved glutamate residue unique to E domains was found adjacent to an active-site histidine (common to both E and C domains) and was predicted to be part of the catalytic apparatus responsible for the acid-base reaction. Despite the previous biochemical and structural studies, several questions related to E domain substrate recognition and binding and the detailed chemical mechanism remain to be addressed.


A unique and mechanistically interesting aspect of NRPS assembly lines is the overall structure and dynamics of the multidomain/multimodular machine.23,24 General questions related to domain orientation and the flux of Ppant-tethered intermediates through the large assembly remain largely unanswered. These issues of structure and mechanism have direct relevance to efforts to engineer NRPSs through domain manipulation and rearrangement to produce novel compounds; although many studies have shown success, generally applicable approaches are not known.25,26 The structures of individual NRPS domains are well established, with multiple known structures of A, PCP and C domains.21,27–30 There are a limited number of structures of multidomain NRPS fragments and intact modules.31–35 Recently, structures of intact modules of ‘holo’-NRPS fragments, bearing the fundamental prosthetic group Ppant, were solved.36,37 The conformational changes of individual domains in the intact modules allow the carrier domain to transfer intermediates between domains while inducing partner domains to adopt their catalytically competent conformations. Additionally, the role of interdomain linker regions in the assembly-line mechanism is underexplored. Well-defined linker regions in the structure of the terminal module of surfactin synthetase (SrfA-C) indicate that linker flexibility and length help organize the domains in an NRPS module.31 The linker region connecting the PCP and downstream C domain of the di-domain PCP-C construct from tyrocidine synthetase also shows a high degree of conformational flexibility.35 In general, the flexible linker allows the PCP to interact with other catalytic domains.
The structural information on the initiation module fragment (F-A-PCP) of linear gramicidin synthetase (LgrA) shows that the PCP domain translocates by 61 Å and reorients to bind different domains through considerable conformational pliability.37 Although the linker between the A and PCP domains is relatively short (typically only 9-11 amino acids long), the thioesterase domains in terminal modules (SrfA-C, holo AB3403, and EntF) were observed in various positions relative to the condensation-adenylation di-domain.31,36 The diversity of linker sequence and flexibility contributes to the difficulties in generalizing function. In addition, details related to intramodule communication within NRPS biosynthetic machinery are incomplete. Several studies examining intermodule communication, however, have described distinct communication domains (COM domains), providing promising strategies for reprogramming biosynthetic complexes.38,39,40 To address issues of structure and function, research strategies using functional probes utilizing structurally restricted carrier domains suitable for X-ray crystallographic analysis have provided useful structural insights.32,34,37,39,40 In this study, we describe the structures of both the holo and apo forms of the PCP-E didomain of GrsA, the first module of the synthetase of the cyclic peptide antibiotic gramicidin S. The biosynthesis of gramicidin S in Bacillus brevis is catalyzed by the NRPSs GrsA and GrsB, in which five modules incorporate ten amino acids that are cyclized into the final cyclic peptide (Figure 1). The E domain of GrsA is responsible for incorporating the two D-phenylalanines into the product. The precise mode by which gramicidin S interacts with cell membranes, leading to its cytotoxicity, is not entirely clear; however, the unique antiparallel β-sheet conformation of the product is a result of the presence of the D-amino acids and is essential for its antibiotic activity.
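The modular logic described above can be sketched as a small data model. This is only an illustration, not a description of the enzymology: the GrsA domain order (A-PCP-E) comes from the text, whereas the GrsB residue order (Pro, Val, Orn, Leu) and the domain strings assigned to the GrsB modules are assumptions based on the known gramicidin S sequence. The names `Module`, `ASSEMBLY_LINE`, `build_pentapeptide` and `gramicidin_s` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    protein: str    # synthetase that carries the module
    residue: str    # amino acid selected by the module's A domain
    domains: tuple  # domain order within the module


# GrsA (A-PCP-E) is taken from the text. The GrsB entries are assumptions
# made for illustration only.
ASSEMBLY_LINE = (
    Module("GrsA", "Phe", ("A", "PCP", "E")),
    Module("GrsB", "Pro", ("C", "A", "PCP")),
    Module("GrsB", "Val", ("C", "A", "PCP")),
    Module("GrsB", "Orn", ("C", "A", "PCP")),
    Module("GrsB", "Leu", ("C", "A", "PCP", "Te")),
)


def build_pentapeptide(modules):
    """Walk the assembly line; a module with an E domain yields a D-residue."""
    chain = []
    for module in modules:
        configuration = "D" if "E" in module.domains else "L"
        chain.append(f"{configuration}-{module.residue}")
    return chain


def gramicidin_s(modules):
    """Two pentapeptide copies joined head to tail (ring closure not modelled)."""
    monomer = build_pentapeptide(modules)
    return monomer + monomer


if __name__ == "__main__":
    print(" - ".join(gramicidin_s(ASSEMBLY_LINE)))
```

In this toy model the single E domain of GrsA accounts for both D-phenylalanines of the final product, because the pentapeptide is produced twice before cyclization.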
Here, structures of the GrsA didomain PCP-E fragment (holo and apo) are determined by X-ray crystallography at 1.9 Å and 2.3 Å resolution. Our work provides a structure of the fragment in a conformation directly relevant to the chemistry of the epimerization domain and also provides a structural basis for the inter-module PCP-C interaction that is a key component of all NRPS


assembly lines. From the structures, novel details are apparent regarding the interactions between catalytic residues and the Ppant prosthetic group, and regarding the linker-domain and domain-domain interfaces. These results provide insight into the catalytic mechanism and the domain-domain recognition of fundamental components of NRPS megaenzymes.

Figure 1. Gramicidin S synthetase assembly line. a) Cartoon representation of the NRPS enzymes GrsA and GrsB. The amino acid building blocks are selected and activated by A domains (adenylation domain, blue; activated amino acid shown in subscript). PCP domains (peptidyl carrier protein, orange) use the 4'-phosphopantetheine post-translational modification to transport acyl intermediates. C domains (condensation, green) catalyze peptide bond formation between acyl-S-PCP intermediates of adjacent modules. The E domain (epimerization domain, grey) produces D-phenylalanine (the epimerized stereocenter is highlighted in red) through epimerization at the α-carbon. The di-domain fragment described in this study is shown in an orange box. b) Gramicidin S is released from the termination module by a cyclization reaction of the dimeric pentapeptide catalyzed by the Te domain (thioesterase domain, brown).

Methods


Cloning, expression and purification of the GrsA_PCP-E di-domain protein. The grsA_PCP-E coding sequence was amplified by PCR from the genomic DNA of B. brevis (ATCC 9999) using the primers grsA_PCPE_538Fw and grsA_PCPE_1098Rv (see Supplementary Table 1). The PCR product was purified by agarose gel electrophoresis and gel extraction (QIAquick Gel Extraction kit, QIAgen), and ligated into pET28a (Novagen) using the NcoI and EcoRI restriction sites to create plasmid pET28a_grsA_PCPE. The plasmid was transformed into Escherichia coli BL21(DE3) and the cells were grown at 37°C in 6 L of Luria broth (LB) medium with 50 µg/mL kanamycin. Cultures were grown to an optical density at 600 nm of 0.3–0.4, and then cooled to 18°C. Expression was induced by the addition of 250 µM isopropyl β-D-1-thiogalactopyranoside (IPTG) and continued at 18°C for 16 h. Cultures were then harvested by centrifugation at 3,500 rpm for 20 min, resuspended in 20 mL of 500 mM NaCl, 20 mM Tris-HCl (pH 7.5), and lysed at 14,000 psi in a nitrogen-pressure microfluidizer cell (M-110L Pneumatic). The lysate was clarified by centrifugation at 10,000 rpm for 20 min. The supernatant was incubated for 1 h with 2 mL of Ni-NTA agarose resin (QIAgen) at 4°C. The resin was washed with 2 × 25 mL of 500 mM NaCl, 20 mM Tris-HCl pH 7.5, 25 mM imidazole, and the protein was then eluted with 3 × 3 mL of 500 mM NaCl, 20 mM Tris-HCl pH 7.5, 250 mM imidazole. Eluted fractions were pooled and dialyzed against 50 mM NaCl, 20 mM Tris-HCl pH 7.5, 1 mM β-mercaptoethanol (βME) for 12 h. Further purification was performed by ion exchange chromatography (HiTrap-Q, GE Healthcare) followed by size-exclusion chromatography (HiLoad 16/60 SuperDex S-200, GE Healthcare). For ion exchange, the gradient was 0-80% buffer B over 30 min at a flow rate of 2 mL/min (buffer A: 50 mM Tris-HCl pH 7.5, 1 mM βME; buffer B: 1 M NaCl, 20 mM Tris-HCl pH 7.5, 1 mM βME). The buffer used for gel filtration was 100 mM NaCl, 20 mM Tris-HCl pH 7.5, 1 mM βME.
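As a rough guide to the ion-exchange step, the salt concentration reaching the column at any point of the linear gradient can be estimated from the buffer compositions. The sketch below assumes that buffer A contains no NaCl (none is listed), that mixing is ideal, and that system dead volume can be ignored; the function names are hypothetical.

```python
def percent_b(t_min, start=0.0, end=80.0, duration_min=30.0):
    """Percentage of buffer B at time t for a linear gradient (0-80% B over 30 min)."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time lies outside the gradient")
    return start + (end - start) * t_min / duration_min


def nacl_mM(t_min, nacl_a_mM=0.0, nacl_b_mM=1000.0):
    """NaCl concentration delivered at time t.

    Assumes buffer A has 0 mM NaCl (none is listed) and buffer B has 1 M NaCl.
    """
    fraction_b = percent_b(t_min) / 100.0
    return nacl_a_mM * (1.0 - fraction_b) + nacl_b_mM * fraction_b


def gradient_volume_mL(duration_min=30.0, flow_mL_per_min=2.0):
    """Total volume delivered over the gradient."""
    return duration_min * flow_mL_per_min


if __name__ == "__main__":
    for t in (0, 10, 20, 30):
        print(f"{t:>2} min: {percent_b(t):5.1f}% B, {nacl_mM(t):6.1f} mM NaCl")
    print(f"gradient volume: {gradient_volume_mL():.0f} mL")
```

Under these assumptions the gradient spans 0 to 800 mM NaCl in 60 mL, so the salt concentration at which a peak elutes can be read off from its retention time.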
Fractions containing apo GrsA_PCP-E were pooled and concentrated to a final concentration of 20 mg/mL.

In vitro preparation of phosphopantetheinylated (holo) GrsA_PCP-E. Enzymatic reactions of PPTase were modified from previous studies.41 The reaction mixtures contained 50 µM apo GrsA_PCP-E, 1.5 µM B. subtilis PPTase (Sfp), 1 mM dithiothreitol (DTT), 10 mM MgCl2, and 50 mM Tris-HCl pH 8.0, in a final volume of 3.0 mL. The reactions were initiated by addition of 100 µM coenzyme A and incubated at 25°C for 1.5 h. The reaction mixture was centrifuged at 3,500 rpm for 10 min, and the supernatant was applied to size-exclusion chromatography (as above) to separate holo GrsA_PCP-E from Sfp. The quality of the holo GrsA_PCP-E protein was assessed by SDS-PAGE and size-exclusion chromatography; further analysis of the holo conjugate was performed by mass spectrometry (MALDI-TOF MS). The holo GrsA_PCP-E was collected and concentrated to a final concentration of 20 mg/mL.

Crystallization and data collection. Protein crystallization was performed using the vapor diffusion method in the sitting-drop format. Apo GrsA_PCP-E, at 8 mg/mL in 100 mM NaCl, 50 mM Tris-HCl pH 7.5, and 5% v/v glycerol, was screened against several commercially available matrix crystallization screens. Thin sheet crystals were obtained in a condition containing 0.15 M KBr, 30% w/v PEG MME-2000, and 100 mM MES pH 6.0. Optimization of precipitant and pH, along with microseeding, was performed by vapor diffusion in the hanging-drop format (24-well VDX crystallization plate, Hampton Research) at 293 K. Drops of 2 µL protein (12 mg/mL) plus 2 µL of precipitant were equilibrated against 1 mL of reservoir solution. The resultant rod-shaped single crystals were obtained in a final condition that contained 0.2 M


KBr, 22% w/v PEG MME-5000, 5% v/v glycerol, and 100 mM Tris-HCl pH 7.5. Crystals of suitable size were harvested and flash frozen in liquid nitrogen with an additional 10% v/v glycerol as cryoprotectant. Using analogous methods, the holo GrsA_PCP-E protein, also at 8 mg/mL in 100 mM NaCl, 50 mM Tris-HCl pH 7.5, and 5% v/v glycerol, was screened separately for crystallization. Thin needle-shaped crystals were obtained in a condition containing 0.2 M MgCl2, 25% w/v PEG 3350, and 100 mM HEPES pH 7.5, and optimization produced plate-shaped crystals in 0.2 M MgCl2, 18% w/v PEG 3350, 5% v/v glycerol, and 100 mM sodium cacodylate pH 6.5.

Structure determination and crystallographic refinement. The apo and holo GrsA_PCP-E crystal diffraction data sets were collected on beamlines 21-ID-F and 21-ID-G of the Life Science Collaborative Access Team (LS-CAT) facility at the Advanced Photon Source (APS) of Argonne National Laboratory at a wavelength of 0.9786 Å and 100 K. Data images from a single crystal were integrated using the XDS program package42 and then scaled with AIMLESS from the CCP4 suite43 in space groups P1 (apo) and C2 (holo) (see Table 1). Apo GrsA_PCP-E crystals diffracted X-rays to 1.8 Å, while the holo crystals diffracted to 2.4 Å. The initial phase calculation for apo GrsA_PCP-E was carried out using PHASER from the PHENIX suite with the TycA E domain (PDB 2XHG) as the molecular replacement model.44 Model building was initiated with SHELXe45 and completed manually using COOT46. Final coordinates were refined with PHENIX.REFINE. For the phase calculation of holo GrsA_PCP-E, PHASER from the PHENIX suite was used with the apo GrsA structure as the molecular replacement search model. The phosphopantetheine arm was built manually using COOT upon careful inspection of electron-density maps and refined using PHENIX.REFINE. The quality of the models was evaluated using sigma-weighted, simulated annealing composite omit maps and by monitoring the free R parameters.
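The data-collection parameters quoted above can be related through two standard relations: the photon energy E = hc/λ and Bragg's law λ = 2d·sin θ. The sketch below is a minimal illustration using the wavelength and resolution limits stated in the text; the function names are hypothetical and detector geometry is not considered.

```python
import math

HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV·Å


def photon_energy_keV(wavelength_A):
    """Photon energy corresponding to an X-ray wavelength given in Å."""
    return HC_KEV_ANGSTROM / wavelength_A


def bragg_two_theta_deg(d_spacing_A, wavelength_A):
    """Scattering angle 2θ (degrees) for a given d-spacing, from λ = 2d·sin θ."""
    sin_theta = wavelength_A / (2.0 * d_spacing_A)
    if sin_theta > 1.0:
        raise ValueError("d-spacing is not reachable at this wavelength")
    return 2.0 * math.degrees(math.asin(sin_theta))


if __name__ == "__main__":
    wavelength = 0.9786  # Å, as stated in the text
    print(f"photon energy: {photon_energy_keV(wavelength):.3f} keV")
    for label, d in (("apo", 1.8), ("holo", 2.4)):
        print(f"{label}: 2θ at {d} Å = {bragg_two_theta_deg(d, wavelength):.1f}°")
```

The higher-resolution apo data set therefore extends to a larger scattering angle than the holo data set at the same wavelength.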
Structural illustrations were prepared with PyMOL47 and LIGPLOT.48

Site-directed mutagenesis and diketopiperazine formation assay. The grsA and tycB_ProCAT coding sequences were amplified by PCR from the genomic DNA of B. brevis (ATCC 9999) and B. parabrevis (ATCC 8185), respectively. Mutations were made in full-length GrsA using the QuikChange mutagenesis kit (Agilent). The oligonucleotide primers used are shown in Supplementary Table 1. For the production of the post-translationally modified modular proteins, a plasmid containing the gene for the Sfp phosphopantetheinyl transferase (pSU20_sfp)41 and the constructed plasmid were co-expressed in E. coli BL21(DE3). Proteins were expressed and purified following the procedures described above. The diketopiperazine (DKP) (D-Phe-L-Pro) formation assay was performed as previously described with minor modifications.49,50 Briefly, reactions were carried out with GrsA (WT or GrsA mutants) along with the TycB_ProCAT enzyme. Reactions were incubated at 37°C with 250 mM Tris-HCl pH 7.5, 10 mM MgCl2, 100 mM NaCl, 5 mM ATP, 1 mM TCEP, 500 µM proline, 5 µM of each enzyme and 500 µM phenylalanine. The 250 µL reaction was initiated by adding L-phenylalanine. Reaction aliquots were quenched with 35 µL quench buffer (25% acetonitrile and 10% TFA) at 20, 40, 80, and 200 minutes. The collected samples were heated to 95°C for 5 min and centrifuged at 14,000 rpm for 15 min. The supernatant was applied to a C18 reverse-phase column (Vydac 218TP54, 300 Å, 5 µm, 4.6 mm i.d. x 250 mm) using a Shimadzu SPD UFLC system. Separation was performed with a linear gradient of acetonitrile (0.1% TFA) in water


(0.1% TFA) at a flow rate of 1.0 mL/min (gradient program: 2 min 2% acetonitrile, 15 min 2-60% acetonitrile, 5 min 100% acetonitrile, 5 min 2% acetonitrile). DKP formation was monitored by absorbance at 220 nm with a diode array detector, and product formation was confirmed using synthetic DKP standards.

Accession codes. The atomic coordinates and structure factors have been deposited in the Protein Data Bank with accession numbers 5ISW for apo GrsA_PCP-E and 5ISX for holo GrsA_PCP-E.
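The MALDI-TOF check of the holo conjugate described in the Methods relies on the mass added when Sfp transfers the 4'-phosphopantetheine group from coenzyme A to the conserved serine. A minimal sketch of that arithmetic follows, assuming average atomic masses and a net addition of C11H21N2O6PS (phosphopantetheine, C11H23N2O7PS, minus water). The apo mass used in the usage example is a hypothetical placeholder, since no measured value is given in the text.

```python
# Average atomic masses (Da), rounded standard values.
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "P": 30.974, "S": 32.06}

# Net atoms added to the serine side chain on phosphopantetheinylation
# (assumption: C11H23N2O7PS minus H2O).
PPANT_ADDED = {"C": 11, "H": 21, "N": 2, "O": 6, "P": 1, "S": 1}


def formula_mass(formula):
    """Average mass of an elemental composition given as {element: count}."""
    return sum(AVERAGE_MASS[element] * count for element, count in formula.items())


def expected_holo_mass(apo_mass_da):
    """Expected holo mass for a singly modified carrier protein."""
    return apo_mass_da + formula_mass(PPANT_ADDED)


if __name__ == "__main__":
    print(f"Ppant mass shift: {formula_mass(PPANT_ADDED):.2f} Da")
    hypothetical_apo_mass = 60000.0  # placeholder value, not taken from the paper
    print(f"expected holo mass: {expected_holo_mass(hypothetical_apo_mass):.1f} Da")
```

A shift of roughly 340 Da between the apo and holo spectra is therefore the signature expected for complete modification.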

Results and discussion

Structure determination of the apo and holo PCP-E di-domain fragments. The epimerization domain from the initiation module of B. brevis gramicidin S synthetase (Figure 1, Grs) was chosen as a model to investigate the structure and mechanism of E domain interactions and chemistry. A designed fragment containing the last two domains (PCP-E) of the first module, GrsA (domain order: A-PCP-E), was cloned starting in the A-PCP linker region (residue Ala538 of GrsA) and extending through the end of the E domain, as determined by sequence alignment and comparison to the structure of the standalone E domain from TycA.22 The construct was overexpressed in E. coli and purified to homogeneity. Crystals of apo GrsA_PCP-E belong to the space group P1 with one polypeptide chain per asymmetric unit. The structure was phased by molecular replacement using the TycA E domain (PDB: 2XHG) as the initial model, and the di-domain structure was built and refined to 1.8 Å resolution. To generate the post-translationally modified (Ppant) holo PCP-E fragment, the apo protein was incubated with B. subtilis PPTase (Sfp) and coenzyme A for 2 h to allow for efficient Ppant formation. The modified GrsA_PCP-E was purified, and mass-spectral analysis (MALDI) indicated quantitative generation of the holo conjugate (Supplementary Figure 1). The holo construct crystallized under a different condition than the apo construct, in the C2 space group with two monomers per asymmetric unit. The structure was phased by molecular replacement with the apo GrsA_PCP-E structure and refined to 2.4 Å. Interestingly, despite efforts, holo PCP-E loaded with D- or L-Phe failed to crystallize under these conditions, as did apo PCP-E loaded with the nonhydrolyzable amide analog32 of the Phe substrate.

Table 1. Crystallographic Data of GrsA_PCP-E

| GrsA_PCP-E | Apo | Holo |
|---|---|---|
| Data collection | | |
| Wavelength (Å) | 0.9786 | 0.9786 |
| Space group | P1 | C121 |
| a, b, c (Å) | 53.8, 57.5, 61.3 | 224.8, 58.0, 90.9 |
| α, β, γ (deg) | 65.8, 65.5, 84.9 | 90, 100.6, 90 |
| Resolution (Å) | 37.48 - 1.75 (1.78 - 1.75)* | 38.97 - 2.34 (2.42 - 2.34)* |
| Total reflections | 312758 (16953) | 372347 (38094) |
| Unique reflections | 59562 (3187) | 49011 (4921) |
| Completeness (%) | 97.4 (96.2) | 99.3 (94.6) |
| Multiplicity | 6.0 (5.9) | 7.6 (7.7) |
| Mean I/σ(I) | 18.02 (2.05) | 16.75 (3.38) |
| Wilson B-factor (Å²) | 15.9 | 30.1 |
| Rmerge | 0.046 (0.34) | 0.126 (0.69) |
| Rmeas | 0.051 | 0.136 |
| CC1/2 | 0.999 (0.945) | 0.998 (0.895) |
| Refinement | | |
| Rwork / Rfree | 0.169 / 0.201 | 0.183 / 0.220 |
| No. of atoms, total | 4947 | 9236 |
| No. of atoms, protein | 4471 | 8750 |
| No. of atoms, ligand | 6 | 50 |
| No. of atoms, water | 470 | 436 |
| Protein residues | 539 | 1058 |
| RMS deviations, bonds (Å) | 0.009 | 0.004 |
| RMS deviations, angles (deg) | 1.23 | 0.70 |
| Ramachandran favored (%) | 98 | 97 |
| Ramachandran outliers (%) | 0.6 | 0.19 |
| Clashscore | 2.0 | 4.2 |
| Average B-factor, protein (Å²) | 21.40 | 39.20 |
| Average B-factor, water (Å²) | 20.10 | 34.60 |
| Average B-factor, ligands (Å²) | 28.10 | 34.50 |
| PDB ID | 5ISW | 5ISX |

*Values in parentheses are for the highest-resolution shell.
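The cell dimensions in Table 1 can be turned into unit-cell volumes with the general triclinic formula V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2·cos α·cos β·cos γ), which reduces to abc·sin β for the monoclinic holo cell. The sketch below is a minimal illustration using the tabulated values; the function and constant names are hypothetical.

```python
import math


def cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Unit-cell volume (Å^3) from cell edges (Å) and angles (degrees)."""
    cos_a, cos_b, cos_g = (
        math.cos(math.radians(angle)) for angle in (alpha_deg, beta_deg, gamma_deg)
    )
    return a * b * c * math.sqrt(
        1.0 - cos_a**2 - cos_b**2 - cos_g**2 + 2.0 * cos_a * cos_b * cos_g
    )


# Cell parameters as listed in Table 1.
APO_CELL = (53.8, 57.5, 61.3, 65.8, 65.5, 84.9)      # space group P1
HOLO_CELL = (224.8, 58.0, 90.9, 90.0, 100.6, 90.0)   # space group C2

if __name__ == "__main__":
    print(f"apo  cell volume: {cell_volume(*APO_CELL):,.0f} Å^3")
    print(f"holo cell volume: {cell_volume(*HOLO_CELL):,.0f} Å^3")
```

Dividing such a volume by the number of molecules in the cell and by the protein mass would give the Matthews coefficient; that step is left out here because the construct mass is not stated in the text.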

Overall structure of apo and holo PCP-E. Despite being crystallized under different conditions and having distinct cell parameters, the overall structures of the apo and holo didomains are very similar, with an rmsd of 0.6 Å for the 520 Cα positions; notable differences are found in loop regions and around the Ppant binding channel. Of note, Eα6 at the beginning of the E domain's second subdomain shifts ~1 Å away from the first subdomain in the holo structure (Figure 2). Additionally, Eβ8, Eβ9 and Eα9 move slightly closer to the PCP domain to accommodate the Ppant arm in the pocket (Supplementary Figure 2). In both constructs the two domains are well resolved in the electron density maps (Supplementary Figure 3), along with the 20 amino acid interdomain linker. The PCP domains are oriented relative to the E domains in a conformation relevant to delivery of a substrate to the E domain active site, as outlined below. Overall, the PCP-E proteins measure approximately 62 × 53 × 57 Å, and the linker joining the domains is embedded in a groove along the surface of the E domain (Figure 2A). The small carrier PCP domain is well defined in the crystallographic structure (residues Ile536 to Tyr607) and resembles the described A/H conformational state.28 This conformational state of the PCP domain is found in all X-ray structures of NRPS fragments solved to date, regardless of whether the domain is in the apo or holo form or interacts with an enzyme domain in trans or in cis.30 The PCP domain is positioned on the E domain at the convex opening of the two V-shaped subdomains. As is common,


the PCP domain comprises a compact unit of four α-helices, and the conserved serine that is post-translationally modified with Ppant resides at the N-terminal end of the second α-helix,31 with the arm extending toward the center of the V-shaped cavity of the E domain. The well-defined linker region (Ile608-Thr627) within the PCP-E di-domain fragment allows insight into how a long linker participates in domain-domain recognition in an NRPS module. Seventeen of the twenty linker residues (Ser611-Thr627) are bound in a distinct groove on the E domain surface, while three solvent-accessible residues (Ile608-Asp610, immediately C-terminal to the PCP domain) are dissociated from the E domain surface, which may facilitate adjustment of the PCP domain orientation toward the E domain active pocket. The E domain of GrsA is defined in the structure from Pro628 to Leu1071. In common with other E/C domains embedded in NRPS assemblies, the GrsA E domain contains two subdomains (Pro628-Asn802 and Ser803-Leu1067) that belong to the chloramphenicol acetyltransferase (CAT) fold,51 and the predicted active-site pocket is located in the center of the interface between the two subdomains. A superposition of the GrsA E domain with the TycA E domain (PDB code: 2XHG)22 shows significant structural similarity, with an overall rmsd of 2.3 Å for 440 Cα positions. As mentioned, E and C domains share a similar structural fold, and likely a similar enzyme mechanism as a result of the conserved HHxxxDxxSW catalytic motif. Structural comparison with the TycC6 PCP-C crystal structure (PDB code: 2JGP),35 a homologous di-domain construct, gives an rmsd of 4.3 Å for 384 Cα positions between the GrsA_PCP-E and TycC6 PCP-C di-domains (Supplementary Figure 4). However, superposition of just the E and C domains gives a more modest rmsd of 2.6 Å for 319 Cα positions.
The C domain of TycC and the E domain of TycA have more open structures, hinged at the V-shaped subdomains, than the GrsA E domain. Upon interaction with the PCP domain, the two subdomains of the GrsA E domain are brought closer together and create a distinct channel for the Ppant arm, leading to the predicted active residues (His753 and Glu892) at the bottom of the channel.
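The rmsd values quoted above summarize how far equivalent Cα atoms lie from each other after superposition. A minimal sketch of the calculation follows, assuming the two coordinate sets are already superimposed and matched atom for atom; the optimal fitting step performed by superposition programs is not included, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between two matched lists of (x, y, z) points.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates for illustration only, not taken from the deposited models.
    reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    shifted = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
    print(f"rmsd: {rmsd(reference, shifted):.2f} Å")
```

Because the value depends on which atoms are paired, an rmsd is only meaningful together with the number of Cα positions compared, as reported in the text.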


Figure 2. Overview of the GrsA_PCP-E di-domain fragment. a) The overall V-shaped structure of the E subdomains, characteristic of the chloramphenicol acetyltransferase (CAT) fold, is shown in grey, the PCP domain is shown in yellow, the linker connecting the PCP and E domains is colored orange, and the bridge region is colored blue. The electron density around the Ppant arm (Fo-Fc map displayed at the 1.6σ contour level) is shown as green mesh. b) Detailed view of the Ppant arm (in ball-and-stick representation) located at the center of the V-shaped E domain. The proposed catalytic residues, H753 and E892, are shown in red. A highly conserved tyrosine residue (Y976) interacts with the gem-dimethyl group of the Ppant arm. c) Schematic of the major interactions of GrsA_PCP-E residues with the Ppant arm; the hydrogen bonds with H894, K945, and Y976 are shown as blue dashed lines, and the hydrophobic pocket is created by I574, I866, and Y976.

It is commonly an experimental challenge to obtain ordered crystal structures of NRPS fragments. The inherent flexibility of PCP domains in relation to the catalytic domains is a key and necessary feature of the assembly lines. Likewise, it is uncommon to obtain crystals of Ppant-loaded holo constructs, and the terminal thiol of Ppant is susceptible to oxidation and modification. The GrsA_PCP-E construct has a relatively long, 20 amino acid linker between the two domains, and the previous structure of a similar apo PCP-C construct yielded a didomain conformation not relevant to an active conformation of the C domain.35 We were able to obtain an ordered catalytically active configuration (composed of the complete PCP, linker and

ACS Paragon Plus Environment

Page 11 of 21

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Chemical Biology

E domain) for both apo and holo protein fragments, each under different crystallization conditions. The active site of the GrsA epimerization domain. A distinct channel containing the Ppant arm of GrsA_PCP-E is formed by the CAT-like subdomains of the E domain together with the PCP domain. It is bordered by the termed bridge region22 which is an extended twisted loop (Figure 2A). Based on previous mutagenesis studies on the epimerization domain of GrsA16 and the TycA standalone E domain structure,22 two highly conservative residues, His753 and Glu892, located in the cavity between the subdomains, were predicted to be involved in catalytic mechanism through a general base/general acid mechanism of catalysis (Figure 2B and Supplementary Figure 5). His753 is highly conserved in all C domains of NRPSs located within an HHxxxDxxSW catalytic motif and has been implicated as playing a key role in catalysis.19 In our structure, the imidazole sidechain of His753 points directly at the sulfur atom of the Ppant arm. In addition, Tyr976 (backbone amide NH), His894 (backbone amide carbonyl), and Lys945 (to the phosphate group) provide hydrogen bonding type interactions to the Ppant arm and the sidechains of Tyr976, Ile866 and Ile574 form a hydrophobic pocket that complements the gemdimethyl moiety of the Ppant arm (Figure 2C). Moreover, α-helix 4 (Eα4 in Figure 2A) is common in all E/C domain structures, pointing at the end of Ppant arm, possibly providing positive electrostatic stabilization of the enolate intermediate through the dipole moment of the α-helix.22,52 Residues R896, D757 and S760 interact, each other through hydrogen bonds, to hold α-helix 4 in a predicted catalytically assisting position. 
Indeed, kinetic analysis of E domain mutants of GrsA revealed that epimerization activity was highly reduced in H753A, E892A, Y976A, D757S, and R896A mutants.51 In the structure of the GrsA_PCP-E di-domain presented here, all of these residues are well defined and likely contribute to the catalytic process. As mentioned, constructs of D- or L-Phe-loaded holo PCP-E failed to crystallize under the described conditions or through screening. This could suggest that, with substrate bound, the catalytically competent enzyme has increased dynamics or that one of the products induces a change in overall conformation.

The PCP-E di-domain interface. It has been proposed that the interdomain linker plays a critical role in regulating the domain-domain interactions involved in the chemistry and dynamics of NRPS assemblies.53 In the Tyc6 PCP-C di-domain structure, the 18-amino acid peptide linker between the PCP and C domains exhibits considerable conformational flexibility, and the final seven amino acids dissociate from the C domain and do not interact with either domain. In the presented structure, the linker connecting the PCP and E domains forms extensive, ordered interactions along the E domain (Figure 3A). The majority of these interactions are charged or polar, forming specific contacts with the protein surface of the E domain. Two notable salt bridges are formed by Arg613/Asp788 and Arg614/Glu785, and together these two arginines provide an anchor-like electrostatic interaction that localizes the linker region on the E domain (Figure 3A1). However, the linker region also has several hydrophobic residues buried inside the interface. A comparison of the interdomain linker regions within the five-module gramicidin S synthetase shows that the PCP-C linkers have significant homology to those in other PCP-C fragments (Supplementary Figure 6), whereas there is no apparent sequence similarity between PCP-E and PCP-C linkers. The PCP-E interface is created by the first subdomain of the E domain and the linker, suggesting that the specific topology of the interface plays a key role in determining binding recognition and in guiding the PCP domain toward the E domain active site opening.

To assess the role of residues in the linker and di-domain interface regions, we assayed NRPS module function using the first two assembly-line modules of gramicidin synthetase in the formation of diketopiperazines (DKPs).49,50 The residues described above as participating in interface interactions were probed. Relative observed rates were used to compare the wild-type complex with the mutants (Supplementary Table 2). All mutants had similar kobs for total DKP formation, suggesting that these mutations are structurally nonintrusive to the overall multidomain structure. Importantly, the mutations do not obstruct recognition of the PCP domain by the downstream condensation domain (Supplementary Figure 7). However, in contrast to the preferred formation of the D-Phe-L-Pro DKP (over the diastereomeric L-Phe-L-Pro DKP) by wild-type GrsA, the double mutant E785R/D788R produced ~20% L-Phe-L-Pro DKP (Supplementary Figure 7B). The double mutation may interfere with linker recognition by the E domain, so that the loaded PCP domain could instead skip epimerization and advance directly to the downstream C domain, forming an L-Phe-L-Pro DKP. As mentioned, the PCP domains (holo and apo) of GrsA are in an A/H-like state of the carrier domain (Figure 3B). The “universal recognition helix,”41 second and third helices of the PCP domain provide the majority of the interactions with the partner E domain (Figure 3B1).
The PCP/E interface encompasses ~898 Å² of total buried surface area, not including the Ppant arm (calculated with the PDBePISA server).54 Previous mutagenesis studies on the second and third helices of the EntB and EntF PCP domains resulted in reduced or no activity.55 Of the residues corresponding to these important positions in the EntB PCP domain, one homologous residue in the GrsA PCP domain, Thr592, forms a hydrogen bond with Glu898 (2.8 Å distance) on the floor loop of the E domain. In addition, approximately one third of the buried surface (310 Å²) is contributed by this floor loop (Gly893-Thr913) and the second and third helices of the PCP domain. This suggests that the extended floor loop participates in directing the Pα2 orientation into a catalytically competent PCP-E domain interaction. The similar, catalytically active configuration of both apo and holo GrsA PCP-E fragments suggests that the Ppant arm makes only a minor contribution to the overall arrangement, both in stabilizing the crystal form and possibly in the solution architecture necessary for function. The interfacial contact area is concealed in a channel created by the PCP and E domains, resulting in a slightly closer conformation of the PCP and E domains. However, calculations of the interface surface do show that the holo-PCP/E domain interface provides more solvation free energy upon formation than the apo structure (Supplementary Table 3), with the linker providing a significant proportion of the interface area. In addition, the low calculated P-value of the PCP-linker/E interface indicates a high relative degree of hydrophobicity, implying that the interface surface can be interaction-specific.
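As a quick arithmetic check of the interface figures quoted above, the following minimal Python sketch computes the share of the ~898 Å² buried surface represented by the 310 Å² floor-loop/helix contact, and the length of the floor loop from its residue range. The function names are illustrative assumptions and are not part of PDBePISA or any other tool; only the numeric inputs come from the text.

```python
def buried_fraction(part_area, total_area):
    """Return part_area / total_area for two buried surface areas (same units)."""
    if total_area <= 0:
        raise ValueError("total_area must be positive")
    return part_area / total_area


def inclusive_length(first_residue, last_residue):
    """Number of residues in an inclusive residue range such as Gly893-Thr913."""
    if last_residue < first_residue:
        raise ValueError("last_residue must not precede first_residue")
    return last_residue - first_residue + 1


# Values quoted in the text (areas in square angstroms).
TOTAL_INTERFACE_AREA = 898.0
FLOOR_LOOP_HELIX_AREA = 310.0

fraction = buried_fraction(FLOOR_LOOP_HELIX_AREA, TOTAL_INTERFACE_AREA)
floor_loop_length = inclusive_length(893, 913)

print(f"floor loop and PCP helices: {fraction:.1%} of the buried surface")
print(f"floor loop length: {floor_loop_length} residues")
```

The computed share, roughly 0.35, is consistent with the "approximately one third" stated above, and the Gly893-Thr913 range corresponds to 21 residues.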


Figure 3. Overview of the GrsA_PCP-E surface and domain/domain interactions. The E domain is shown in grey, the PCP domain in yellow, and the linker in orange. a) The interface between the linker and the E domain (green). (A1) Detailed view of the contact area created by the E domain and linker regions. The linker is shown in cartoon representation, with positively charged arginines shown as blue sticks. Negatively charged residues on the E domain are colored red. b) The interface between the PCP and E domains. The contact area on the E domain is colored blue, and that on the PCP domain, red. (B1) Detailed view of the interface created by the E and PCP domains. The PCP domain is shown in cartoon representation with key interacting residues colored red.

Comparison of NRPS epimerization and condensation domains. As introduced, there is significant sequence and structural similarity between E domains and the ubiquitous peptide bond-forming C domains of NRPSs. Indeed, some C domains are dual-functional, able to perform both epimerization and amide bond formation.56 Based on phylogenetic analysis and the comparison of sequence motifs found in C, E, dual E/C and heterocyclization domains,57 the function of E domains was predicted to result from divergent evolution events occurring on duplicated C domains in an NRPS assembly line. There are, however, several notable differences between the structure of E domains and that of the homologous C domains (Figure 4).35 Comparing the GrsA_PCP-E and TycC_PCP-C di-domain structures, the ‘floor loop’ is four residues longer in the E domain than in the C domain, and the loop extends from the E2 subdomain to reach Eα5 in the E1 subdomain through hydrophobic interactions with Trp792, Leu900, and Ile905. In addition, the bridge region of the GrsA E domain is longer and more flexible than that of the TycC C domain. Consequently, the bridge region twists to fill the cleft created by the two subdomains, as well as to block the outlet for a donor-site Ppant substrate. As GrsA is a chain-initiating module of the NRPS (A-PCP-E), accommodation of a long-chain polypeptide is likely not a structural priority. A comparison of E domains located at distinct positions in NRPS assembly lines shows that internal elongation modules (C-A-PCP-E) contain a shorter bridge region in the E domain (Supplementary Figure 8). Moreover, Trp632/Trp911 is a highly conserved pair that is rarely found in C domains (Supplementary Figure 9). The two tryptophans are located at the bottom of the Ppant pocket; along with several nearby aromatic residues (Phe638, Trp644, Tyr1015, and Phe1016), these aromatic side chains create a hydrophobic pocket to accommodate the natural phenylalanine substrate. Interestingly, in the GrsA E domain, this conserved tryptophan pair and an additional tryptophan, Trp644, bring together the end of the floor loop and the first loop of the E domain, resulting in closure of the acceptor site channel (Figure 5C). In other E domains (with distinct substrate specificity), the third tryptophan is commonly replaced by phenylalanine or tyrosine.
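One way to put a number on statements such as "a highly conserved pair that is rarely found in C domains" is to tally the residues found at a given alignment column in each domain class. The Python sketch below does this for a toy alignment; the sequences, the column index and the function names are hypothetical placeholders invented for illustration, not data from Supplementary Figure 9.

```python
from collections import Counter


def column_counts(aligned_seqs, column):
    """Count the residues (or gap characters) at one 0-based alignment column."""
    if not aligned_seqs:
        raise ValueError("at least one aligned sequence is required")
    if len({len(seq) for seq in aligned_seqs}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    return Counter(seq[column] for seq in aligned_seqs)


def residue_fraction(aligned_seqs, column, residue):
    """Fraction of sequences carrying `residue` at the given column."""
    counts = column_counts(aligned_seqs, column)
    return counts[residue] / len(aligned_seqs)


# Hypothetical aligned fragments, for illustration only.
toy_e_domains = ["GAWLT", "GSWLT", "GAWIT", "GAWLS"]
toy_c_domains = ["GAFLT", "GSLLT", "GAWIT", "GAYLS"]
COLUMN = 2  # 0-based column of the position of interest

e_frac = residue_fraction(toy_e_domains, COLUMN, "W")
c_frac = residue_fraction(toy_c_domains, COLUMN, "W")
print(f"Trp at column {COLUMN}: E domains {e_frac:.2f}, C domains {c_frac:.2f}")
```

With a real alignment, the same tally applied to the columns of Trp632, Trp911 and Trp644 would also show how often the third position carries phenylalanine or tyrosine instead of tryptophan.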


Figure 4. Sequence alignment (Clustal Omega) and secondary structure of GrsA_PCP-E with other available E/C domain structures. The conserved catalytic residues are highlighted in red. The conserved hydrophobic pocket residues (in E domains, and the corresponding residues in C domains) are marked with orange boxes.


Figure 5. Structural model of a five-domain (PCP-C-A-PCP-Te) intermodule fragment. a) Superposition of the C/E domains from the PCP-E and C-A-PCP-Te structures (this study and PDB code 4ZXH, respectively) generates a structural model for a five-domain (PCP-C-A-PCP-Te) intermodule region. The PCP, A and Te domains are shown in surface representation, colored orange, purple and blue, respectively. The E/C domain is shown in cartoon representation. b) Detailed view of the active site regions in the superposition of the GrsA E domain and the C domain from AB3403, showing the Ppant arms from the donor (orange) and acceptor (yellow) sites. The E and C domains are colored grey and light blue, respectively. c) The three tryptophans in the GrsA E domain, shown as spheres, may result in closure of the acceptor site tunnel. d) Predicted PCP delivery cycle: the LPCP domain loads an amino acid, the APCP domain serves as a peptidyl chain acceptor, and the DPCP domain serves as a peptidyl chain donor.

Carrier domains (in PKS and NRPS systems) communicate through complex protein/protein interactions with diverse catalytic domains within the same module and also with upstream and downstream domains (Figure 5D). Understanding the movement of carrier domains and their recognition of partner domains is a critical step toward establishing a complete mechanistic model of NRPS assembly lines. It is known that NRP synthetases utilize conformational freedom to facilitate domain/domain communication.23 There is limited information on the interactions of carrier domains with the E/C domain donor site and on substrate recognition and catalysis by E domains. The presented structure of the GrsA_PCP-E fragment supplies a picture of a PCP domain (PCPD state in Figure 5D) interacting with the downstream module in a catalytically productive orientation, with the natural Ppant arm bound in the catalytic pocket. By superposition of the E/C domains from the PCP-E structure and the C-A-PCP-Te structure (Figure 5A), a five-domain model for an intermodule peptide chain elongation process was created. The detailed view of the active site pocket (Figure 5B) shows that the two Ppant arms, from the acceptor and donor PCP domains, lie within the E/C domain and converge near the highly conserved motif (HHxxxDG). The reasonable distance (3-4 Å) between each Ppant thiol group and the second histidine of the conserved motif suggests that the model is in a catalytically productive orientation.

The structures of the PCP-E di-domain fragment provide the first insight into the detailed interactions involved in E domain and Ppant arm binding, and elaborate on the interface of a docked PCP domain with the E domain. The clearly defined interface, with each region resolved in the crystal structure, informs the NRPS catalytic model with respect to both domain-domain recognition and linker-domain interaction. Given the structural similarity of C and E domains and the uncertainty of the C domain catalytic mechanism, the structural details revealed in this study provide valuable information on the mechanism of the homologous peptide condensation chemistry. Additionally, the new conformation formed by the PCP domain and its downstream domain illustrates the free movement of the PCP domain in adapting to and communicating with diverse domains in the NRP biosynthetic process.
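The sequence and geometric criteria used above, namely the position of the second histidine in an HHxxxDG motif and a 3-4 Å thiol-to-histidine distance, can be expressed as a short Python sketch. The sequence fragment and the atomic coordinates below are hypothetical values chosen for illustration and are not taken from the deposited structures; the function names are likewise assumptions.

```python
import math
import re

# HHxxxDG: two histidines, any three residues, then Asp-Gly.
# The lookahead reports overlapping occurrences as well.
MOTIF = re.compile(r"(?=HH.{3}DG)")


def second_his_positions(sequence):
    """Return 1-based positions of the second histidine of each HHxxxDG match."""
    return [match.start() + 2 for match in MOTIF.finditer(sequence)]


def distance(atom_a, atom_b):
    """Euclidean distance between two (x, y, z) coordinates, in angstroms."""
    return math.dist(atom_a, atom_b)


def within_window(value, low=3.0, high=4.0):
    """True if a distance falls inside the 3-4 angstrom window quoted in the text."""
    return low <= value <= high


# Hypothetical sequence fragment containing one HHxxxDG motif.
toy_sequence = "MKTAHHILVDGWSAL"
print("second His at position(s):", second_his_positions(toy_sequence))

# Hypothetical coordinates for a Ppant thiol sulfur and a histidine nitrogen.
thiol_sulfur = (0.0, 0.0, 0.0)
his_nitrogen = (2.0, 2.0, 2.0)
separation = distance(thiol_sulfur, his_nitrogen)
print(f"S to N distance: {separation:.2f} angstroms,",
      "inside window" if within_window(separation) else "outside window")
```

A distance window is only a coarse filter: it flags geometries compatible with catalysis but says nothing about angles or protonation states, which would need to be examined in the model itself.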

ASSOCIATED CONTENT
Supporting Information
The Supporting Information is available free of charge on the website.
Supporting Figures 1-9
Supporting Tables 1-3

AUTHOR INFORMATION
Corresponding Author
* E-mail: [email protected]. Department of Chemistry, University of Florida, P.O. Box 117200, Gainesville, FL 32611. Tel. 352-392-0525.
The authors declare no competing financial interest.


ACKNOWLEDGMENTS
We thank the National Science Foundation (award #1411991) for supporting this research. We gratefully thank Z. Wawrzak and the staff of the Life Science Collaborative Access Team (LSCAT) facility at the Advanced Photon Source (APS) of Argonne National Laboratory for access and assistance with X-ray data collection. We thank K. Basso (Mass Spectrometry Service Facility, Department of Chemistry, University of Florida) for assistance with the mass spectrometric analysis of holo GrsA_PCP-E and M. Burkart for gifts of plasmids.

REFERENCES
(1) Fischbach, M. A., and Walsh, C. T. (2006) Assembly-line enzymology for polyketide and nonribosomal peptide antibiotics: logic, machinery, and mechanisms. Chem. Rev. 106, 3468–3496.
(2) Strieker, M., Tanović, A., and Marahiel, M. A. (2010) Nonribosomal peptide synthetases: structures and dynamics. Curr. Opin. Struct. Biol. 20, 234–240.
(3) Meier, J. L., and Burkart, M. D. (2009) The chemical biology of modular biosynthetic enzymes. Chem. Soc. Rev. 38, 2012–2045.
(4) Finking, R., and Marahiel, M. A. (2004) Biosynthesis of nonribosomal peptides. Annu. Rev. Microbiol. 58, 453–488.
(5) Raymond, K. N., Dertz, E. A., and Kim, S. S. (2003) Enterobactin: an archetype for microbial iron transport. Proc. Natl. Acad. Sci. U. S. A. 100, 3584–3588.
(6) Walsh, C. T. (2000) Molecular mechanisms that confer antibacterial drug resistance. Nature 406, 775–781.
(7) Wandersman, C., and Delepelaire, P. (2004) Bacterial iron sources: from siderophores to hemophores. Annu. Rev. Microbiol. 58, 611–647.
(8) Kratzschmar, J., Krause, M., and Marahiel, M. A. (1989) Gramicidin-S biosynthesis operon containing the structural genes grsA and grsB has an open reading frame encoding a protein homologous to fatty-acid thioesterases. J. Bacteriol. 171, 5422–5429.
(9) Marahiel, M. A., Stachelhaus, T., and Mootz, H. D. (1997) Modular peptide synthetases involved in nonribosomal peptide synthesis. Chem. Rev. 97, 2651–2674.
(10) Mootz, H. D., Schwarzer, D., and Marahiel, M. A. (2002) Ways of assembling complex natural products on modular nonribosomal peptide synthetases. ChemBioChem 3, 490–504.
(11) Hur, G. H., Vickery, C. R., and Burkart, M. D. (2012) Explorations of catalytic domains in non-ribosomal peptide synthetase enzymology. Nat. Prod. Rep. 29, 1074–1098.
(12) Peypoux, F., Bonmatin, J. M., and Wallach, J. (1999) Recent trends in the biochemistry of surfactin. Appl. Microbiol. Biotechnol. 51, 553–563.
(13) Radkov, A. D., and Moe, L. A. (2014) Bacterial synthesis of D-amino acids. Appl. Microbiol. Biotechnol. 98, 5363–5374.
(14) Cava, F., Lam, H., de Pedro, M. A., and Waldor, M. K. (2011) Emerging knowledge of regulatory roles of D-amino acids in bacteria. Cell. Mol. Life Sci. 68, 817–831.
(15) Linne, U., Doekel, S., and Marahiel, M. A. (2001) Portability of epimerization domain and role of peptidyl carrier protein on epimerization activity in nonribosomal peptide synthetases. Biochemistry 40, 15824–15834.
(16) Stachelhaus, T., and Walsh, C. T. (2000) Mutational analysis of the epimerization domain in the initiation module PheATE of gramicidin S synthetase. Biochemistry 39, 5775–5787.
(17) Tanner, M. E. (2002) Understanding Nature’s strategies for enzyme-catalyzed racemization and epimerization racemization at an activated center. Acc. Chem. Res. 35, 237–246.
(18) Luo, L., Burkart, M. D., Stachelhaus, T., and Walsh, C. T. (2001) Substrate recognition and selection by the initiation module PheATE of gramicidin S synthetase. J. Am. Chem. Soc. 123, 11208–11218.
(19) Stachelhaus, T., and Walsh, C. T. (2000) Mutational analysis of the epimerization domain in the initiation module PheATE of gramicidin S synthetase. Biochemistry 39, 5775–5787.
(20) Linne, U., and Marahiel, M. A. (2000) Control of directionality in nonribosomal peptide synthesis: role of the condensation domain in preventing misinitiation and timing of epimerization. Biochemistry 39, 10439–10447.
(21) Keating, T. A., Marshall, C. G., Walsh, C. T., and Keating, A. E. (2002) The structure of VibH represents nonribosomal peptide synthetase condensation, cyclization and epimerization domains. Nat. Struct. Biol. 9, 522–526.
(22) Samel, S. A., Czodrowski, P., and Essen, L. O. (2014) Structure of the epimerization domain of tyrocidine synthetase A. Acta Crystallogr. Sect. D Biol. Crystallogr. 70, 1442–1452.
(23) Weissman, K. J. (2015) The structural biology of biosynthetic megaenzymes. Nat. Chem. Biol. 11, 660–670.
(24) Strieker, M., Tanović, A., and Marahiel, M. A. (2010) Nonribosomal peptide synthetases: structures and dynamics. Curr. Opin. Struct. Biol. 20, 234–240.
(25) Winter, J. M., and Tang, Y. (2014) Natural products: getting a handle on peptides. Nat. Chem. 6, 1037–1038.
(26) Online, V. A., Winn, M., Fyans, J. K., and Zhuo, Y. (2015) Recent advances in engineering nonribosomal peptide assembly lines. Nat. Prod. Rep. 33, 317–347.
(27) Conti, E., Stachelhaus, T., Marahiel, M. A., and Brick, P. (1997) Structural basis for the activation of phenylalanine in the non-ribosomal biosynthesis of gramicidin S. EMBO J. 16, 4174–4183.
(28) Koglin, A., Mofid, M. R., Löhr, F., Schäfer, B., Rogov, V. V., Blum, M. M., Mittag, T., Marahiel, M. A., Bernhard, F., and Dötsch, V. (2006) Conformational switches modulate protein interactions in peptide antibiotic synthetases. Science 312, 273–276.
(29) Protein, C., Volkman, B. F., Zhang, Q., Debabov, D. V., Rivera, E., Kresheck, G. C., and Neuhaus, F. C. (2001) Biosynthesis of D-alanyl-lipoteichoic acid: the tertiary structure of apo D-alanyl carrier protein. Biochemistry 40, 7964–7972.
(30) Lohman, J. R., Ma, M., Cuff, M. E., Bigelow, L., Bearden, J., Babnigg, G., Joachimiak, A., Phillips, G. N., and Shen, B. (2014) The crystal structure of BlmI as a model for nonribosomal peptide synthetase peptidyl carrier proteins. Proteins 82, 1210–1218.
(31) Tanovic, A., Samel, S. A., Essen, L. O., and Marahiel, M. A. (2008) Crystal structure of the termination module of a nonribosomal peptide synthetase. Science 321, 659–663.
(32) Liu, Y., Zheng, T., and Bruner, S. D. (2011) Structural basis for phosphopantetheinyl carrier domain interactions in the terminal module of nonribosomal peptide synthetases. Chem. Biol. 18, 1482–1488.
(33) Sundlov, J. A., Shi, C., Wilson, D. J., Aldrich, C. C., and Gulick, A. M. (2012) Structural and functional investigation of the intermolecular interaction between NRPS adenylation and carrier protein domains. Chem. Biol. 19, 188–198.
(34) Mitchell, C. A., Shi, C., Aldrich, C. C., and Gulick, A. M. (2012) Structure of PA1221, a nonribosomal peptide synthetase containing adenylation and peptidyl carrier protein domains. Biochemistry 51, 3252–3263.
(35) Samel, S. A., Schoenafinger, G., Knappe, T. A., and Marahiel, M. A. (2007) Structural and functional insights into a peptide bond-forming bidomain from a nonribosomal peptide synthetase. Structure 15, 781–792.
(36) Drake, E. J., Miller, B. R., Shi, C., Tarrasch, J. T., Sundlov, J. A., Allen, C. L., Skiniotis, G., Aldrich, C. C., and Gulick, A. M. (2016) Structures of two distinct conformations of holo nonribosomal peptide synthetases. Nature 529, 235–238.
(37) Reimer, J. M., Aloise, M. N., Harrison, P. M., and Schmeing, T. M. (2016) Synthetic cycle of the initiation module of a formylating nonribosomal peptide synthetase. Nature 529, 239–242.
(38) Hahn, M., and Stachelhaus, T. (2004) Selective interaction between nonribosomal peptide synthetases is facilitated by short communication-mediating domains. Proc. Natl. Acad. Sci. U. S. A. 101, 15585–15590.
(39) Sundlov, J. A., Shi, C., Wilson, D. J., Aldrich, C. C., and Gulick, A. M. (2012) Structural and functional investigation of the intermolecular interaction between NRPS adenylation and carrier protein domains. Chem. Biol. 19, 188–198.
(40) Hur, G. H., Meier, J. L., Baskin, J., Codelli, J. A., Bertozzi, C. R., Marahiel, M. A., and Burkart, M. D. (2009) Crosslinking studies of protein–protein interactions in nonribosomal peptide biosynthesis. Chem. Biol. 16, 372–381.
(41) Quadri, L. E. N., Weinreb, P. H., Lei, M., Nakano, M. M., Zuber, P., and Walsh, C. T. (1998) Characterization of Sfp, a Bacillus subtilis phosphopantetheinyl transferase for peptidyl carrier protein domains in peptide synthetases. Biochemistry 37, 1585–1595.
(42) Kabsch, W. (2010) XDS. Acta Crystallogr. Sect. D Biol. Crystallogr. 66, 125–132.
(43) Winn, M. D., Ballard, C. C., Cowtan, K. D., Dodson, E. J., Emsley, P., Evans, P. R., Keegan, R. M., Krissinel, E. B., Leslie, A. G. W., McCoy, A., McNicholas, S. J., Murshudov, G. N., Pannu, N. S., Potterton, E. A., Powell, H. R., Read, R. J., Vagin, A., and Wilson, K. S. (2011) Overview of the CCP4 suite and current developments. Acta Crystallogr. Sect. D Biol. Crystallogr. 67, 235–242.
(44) Adams, P. D., Afonine, P. V., Bunkóczi, G., Chen, V. B., Davis, I. W., Echols, N., Headd, J. J., Hung, L. W., Kapral, G. J., Grosse-Kunstleve, R. W., McCoy, A. J., Moriarty, N. W., Oeffner, R., Read, R. J., Richardson, D. C., Richardson, J. S., Terwilliger, T. C., and Zwart, P. H. (2010) PHENIX: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. Sect. D Biol. Crystallogr. 66, 213–221.
(45) Sheldrick, G. M. (2010) Experimental phasing with SHELXC/D/E: combining chain tracing with density modification. Acta Crystallogr. Sect. D Biol. Crystallogr. 66, 479–485.
(46) Emsley, P., and Cowtan, K. (2004) Coot: model-building tools for molecular graphics. Acta Crystallogr. Sect. D Biol. Crystallogr. 60, 2126–2132.
(47) The PyMOL Molecular Graphics System, Version 1.7.4, Schrödinger, LLC.
(48) Laskowski, R. A., and Swindells, M. B. (2011) LigPlot+: multiple ligand-protein interaction diagrams for drug discovery. J. Chem. Inf. Model. 51, 2778–2786.
(49) Bergendahl, V., Linne, U., and Marahiel, M. A. (2002) Mutational analysis of the C-domain in nonribosomal peptide synthesis. Eur. J. Biochem. 269, 620–629.
(50) Kries, H., Wachtel, R., Pabst, A., Wanner, B., Niquille, D., and Hilvert, D. (2014) Reprogramming nonribosomal peptide synthetases for “clickable” amino acids. Angew. Chem. Int. Ed. 53, 10105–10108.
(51) Leslie, A. G., Moody, P. C., and Shaw, W. V. (1988) Structure of chloramphenicol acetyltransferase at 1.75-angstrom resolution. Proc. Natl. Acad. Sci. U. S. A. 85, 4133–4137.
(52) Nishina, Y., Sato, K., Tamaoki, H., Tanaka, T., Miura, R., and Shiga, K. (2003) Molecular mechanism of the drop in the pKa of a substrate analog bound to medium-chain acyl-CoA dehydrogenase: implications for substrate activation. Biochem. J. 134, 835–842.
(53) Koglin, A., and Walsh, C. T. (2009) Structural insights into nonribosomal peptide enzymatic assembly lines. Nat. Prod. Rep. 26, 987–1000.
(54) Krissinel, E., and Henrick, K. (2007) Inference of macromolecular assemblies from crystalline state. J. Mol. Biol. 372, 774–797.
(55) Drake, E. J., Nicolai, D. A., and Gulick, A. M. (2006) Structure of the EntB multidomain nonribosomal peptide synthetase and functional analysis of its interaction with the EntE adenylation domain. Chem. Biol. 13, 409–419.
(56) Vaillancourt, H., Balibar, C. J., Walsh, C. T., and Vaillancourt, F. H. (2005) Generation of D amino acid residues in assembly of arthrofactin by dual condensation/epimerization domains. Chem. Biol. 12, 1189–1200.
(57) Rausch, C., Hoof, I., Weber, T., Wohlleben, W., and Huson, D. H. (2007) Phylogenetic analysis of condensation domains in NRPS sheds light on their functional evolution. BMC Evol. Biol. 7, 78.
