Article pubs.acs.org/jnp

Elucidation of Gephyronic Acid Biosynthetic Pathway Revealed Unexpected SAM-Dependent Methylations

Jeanette Young,† D. Cole Stevens,† Rory Carmichael,‡ John Tan,‡ Shwan Rachid,⊥ Christopher N. Boddy,§ Rolf Müller,⊥ and Richard E. Taylor*,†

† Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States
‡ Genomics and Bioinformatics Core Facility, University of Notre Dame, Notre Dame, Indiana, United States
§ Departments of Chemistry and Biology, University of Ottawa, Ottawa, Ontario, Canada
⊥ Helmholtz-Institut für Pharmazeutische Forschung Infektionsforschung and Pharmazeutische Biotechnologie, Universität des Saarlandes, Saarbrücken, Germany

Supporting Information

ABSTRACT: Gephyronic acid, a cytostatic polyketide produced by the myxobacterium Cystobacter violaceus Cb vi76, exhibits potent and selective eukaryotic protein synthesis inhibition. Next-generation sequencing of the C. violaceus genome revealed five type I polyketide synthases and post-PKS tailoring enzymes including an O-methyltransferase and a cytochrome P450 monooxygenase. Seven methyltransferase (MT) domains embedded within the PKS subunits were found to install the methyl branches throughout the gephyronic acid skeleton. A rare loading domain from the GNAT superfamily also contains an embedded MT domain that catalyzes the in situ production of an isobutyryl starter unit. Phylogenetic analysis identified new motifs that distinguish MT domains located in PKS pathways with in cis acyltransferase (AT) domains from MT domains located in PKS pathways with trans AT enzymes. The identification of the gene cluster sets the stage for the generation of a heterologous expression system, which will allow further investigation of selective eukaryotic protein synthesis inhibitors through the generation of gephyronic acid analogues.



Gephyronic acid, an equilibrating but chromatographically separable mixture of structural isomers, was isolated at the Helmholtz Center for Infection Research (HZI) by Sasse, Höfle, and Reichenbach. Its structure was reported in 1995 without assignment of absolute or relative stereochemical configuration.1 In preliminary studies, gephyronic acid elicited a cytostatic effect through the inhibition of eukaryotic protein synthesis and is a potential new lead for cancer chemotherapy. Over the past few years, an international collaborative effort led by the Taylor lab proposed a structural reassignment and full stereochemical assignment, both of which were independently confirmed by total synthesis (Figure 1b).2 While the synthetic route provided the necessary material for additional biological studies and ready access to several analogues, it remains insufficient for the production of large quantities of this biologically interesting polyketide.3 Gephyronic acid is produced by two slow-growing myxobacteria: Archangium gephyra strain Ar3895 and Cystobacter violaceus strain Cb vi76. Myxobacteria have demonstrated their remarkable usefulness as both a rich source of novel polyketides and a platform for biosynthetic process engineering.4 As a first step in the development of a heterologous expression system for production of gephyronic acid, we report the sequencing, annotation, and confirmation of the PKS gene cluster from C. violaceus.

RESULTS AND DISCUSSION

Identification and Sequence Analysis of the Gephyronic Acid Biosynthetic Gene Cluster. The genome of C. violaceus Cb vi76 was sequenced using a Roche 454 Genome Sequencer FLX instrument. At 15-fold read coverage, the reads were assembled into 83 contigs with an average size of 151 446 base pairs. Given the reported structural assignment of gephyronic acid, the draft microbial genome was screened for gene clusters encoding PKSs. Nucleotide sequence analysis utilizing antiSMASH identified a 52 kb cluster containing five PKS genes (Figure 1a).5 This cluster contains five contiguous type I PKS-encoding genes, gphF through gphJ, bordered by a set of genes that encode additional polyketide processing enzymes (Table 1). A putative O-methyltransferase, encoded by gphA, is located ∼4 kb upstream from gphF. Open reading frames encoding a cytochrome P450 monooxygenase, gphK, and an FAD-dependent oxidoreductase, gphL, are located directly downstream of gphJ. The modular arrangement of the PKS-encoded proteins (Figure 2) correlates well with the overall gephyronic acid structure, with a few unexpected elements including presumably inactive AT domains in modules 2 and 4 and a presumably inactive dehydratase (DH) domain in module 6. The initial PKS enzyme GphF lacks a conventional AT in the loading

© 2013 American Chemical Society and American Society of Pharmacognosy

Received: July 31, 2013 Published: December 3, 2013

dx.doi.org/10.1021/np400629v | J. Nat. Prod. 2013, 76, 2269−2276

Journal of Natural Products


Figure 1. Organization of the gephyronic acid biosynthetic gene cluster and structures of gephyronic acid. (a) A 52 kb gene cluster encodes for five polyketide synthases (light gray) that are surrounded by post-PKS processing enzymes (black), regulatory proteins (dark gray), and proteins with an unknown function (white). (b) Structures of gephyronic acid: keto and hemiketal.
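As a rough sanity check on the assembly figures quoted in the Results (83 contigs averaging 151 446 bp at 15-fold coverage), the following sketch estimates the draft genome size and the share of it occupied by the 52 kb gph cluster. The variable names are illustrative, and the estimate ignores any gaps between contigs.

```python
# Back-of-the-envelope assembly statistics from the values quoted in the text.
CONTIGS = 83                # number of assembled contigs
MEAN_CONTIG_BP = 151_446    # average contig size in base pairs
COVERAGE = 15               # reported fold read coverage
CLUSTER_BP = 52_000         # approximate size of the gph cluster (52 kb)

# Total assembled sequence, assuming the mean applies to all contigs.
draft_genome_bp = CONTIGS * MEAN_CONTIG_BP
# Approximate amount of raw read data implied by the coverage.
raw_sequence_bp = draft_genome_bp * COVERAGE
# Fraction of the draft genome taken up by the gph cluster.
cluster_fraction = CLUSTER_BP / draft_genome_bp

print(f"draft genome: {draft_genome_bp / 1e6:.2f} Mb")
print(f"raw sequence at {COVERAGE}x: {raw_sequence_bp / 1e6:.0f} Mb")
print(f"gph cluster share: {cluster_fraction:.2%}")
```

Under these assumptions the draft assembly comes to roughly 12.6 Mb, so the gph cluster accounts for well under 1% of the genome.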

domain. Instead, this module contains a GCN5-related N-acetyltransferase (GNAT) domain6 also observed in the loading domains of CurA (curacin A),7 OnnB (onnamide),8 and RhiA (rhizoxin) (Supporting Information Figure S1).9 Biochemical characterization of the CurA GNAT showed that it catalyzed a rapid decarboxylation of malonyl-CoA followed by direct S-acetyl transfer from acetyl-CoA to adjacent acyl carrier protein (ACP) domains.10 On the basis of the structure of gephyronic acid, isobutyrate would be the expected starter unit derived from valine. While isobutyrate starter units are known, as observed in the biosynthesis of myxalamid B from Myxococcus xanthus and Stigmatella aurantiaca,11 incorporation occurs through an AT-mediated process with isobutyryl-CoA. Because GNAT domains have not been characterized to decarboxylate dimethylmalonyl-CoA or load isobutyryl-CoA onto ACP domains, we hypothesized that an MT domain might be present. A conserved MT domain was indeed identified, utilizing the Pfam protein family database, directly upstream from the GNAT domain.12 A related loading domain responsible for the monomethylation of an acetyl-ACP to provide propionyl-ACP has been characterized previously in the biosynthesis of saxitoxin.13 Here, we propose a more likely scenario where the starter unit for gephyronic acid synthesis is malonyl-CoA. Dimethylation would then be followed by decarboxylation to generate the isobutyryl starter unit. Further support for this novel loading domain was provided by isotopic labeling (vide infra).

Sequence analysis revealed that the remaining ACP domains should function as expected. The ACP domains present in the gene cluster all displayed the conserved serine residue required to tether the 4-phosphopantetheinyl arm necessary for thioester formation on the ACP domain.14 All of the observed ketosynthase (KS) domains are likely to be active, showing strong conservation for the DtaCSSsL motif.15 Multiple sequence alignments for each domain from the gephyronic acid biosynthetic gene cluster are presented in Figure S2.

Upon further investigation of the domain composition of the gephyronic acid PKS pathway, it was determined that all of the methyl substituents located at the even-numbered carbons of the molecule are incorporated by discrete MT domains embedded in the PKS loading module and modules 1−6. In modular type I PKS pathways, methyl residues at α-carbons are often derived in bacteria from methylmalonyl-CoA extender units selected by AT domains.9 At position 200 of AT domains, the presence of a Phe is associated with malonyl-CoA selectivity, whereas Ser is observed in cases of methylmalonyl-CoA selectivity.16 At position 200, all AT domains in the gph pathway contain a Phe, suggesting malonyl-CoA selectivity (Figure S2). However, the AT domains of modules 2 and 4 lack the highly conserved active site GHSxG motif. The active site serine is essential for substrate attachment; therefore, AT2 and AT4 are presumed to be inactive (ATo).17 Modules 2 and 4 may obtain malonyl-CoA extender units from a malonyl acyltransferase (MAT) or an active discrete trans AT present in the C. violaceus genome, as observed in the biosynthesis of stigmatellin and neocarzilin.18 Investigation of the upstream and downstream regions surrounding the gephyronic acid pathway did not identify a potential trans AT or a specific MAT responsible for substrate selection in modules 2 and 4. Alternatively, the malonyl-CoA extender units could be loaded by an adjacent AT domain, similar to the mechanism identified in yersiniabactin biosynthesis, where the adenylation (A) domain from module 2 loads the peptidyl carrier proteins from adjacent “A-less” modules.19

The methyl groups at C-4, C-6, C-8, C-10, C-12, and C-14 are attributed to S-adenosylmethionine (SAM)-dependent methylations catalyzed by the embedded C-MT domains in all of the modules in the PKS pathway excluding module 7. The gem-dimethyl group at C-6 installed by module 5 is attributed to consecutive SAM-dependent methylations at the α-carbon of a β-ketothioester intermediate. Dimethylation by an MT domain embedded in a PKS module has been observed in the bryostatin, pederin, and epothilone biosynthetic pathways.20 Analysis of the phylogenetic relationships of the gephyronic acid MT domains with MT domains of characterized polyketide pathways led to the identification of new signature motifs. Each MT domain encoded in the gph PKS pathway was aligned with MTs from known bacterial type I PKSs. It was anticipated that the MT domain responsible for the installation of the gem-dimethyl moiety at C-6 would cluster with other MTs responsible for gem-dimethyl installation, such as MT4 and MT9 from bryostatin,20a MT2 from pederin,20b and MT7 from epothilone.20c A phylogenetic tree constructed from this alignment showed that MT5 from gephyronic acid clustered only with MT7 from epothilone. Bryostatin and pederin MT domains were grouped separately (Figure S3). Examination of the tree, however, clearly showed that MT domains from pathways with embedded in cis AT domains clustered away from those with trans AT enzymes. While MT sequences are often highly divergent, there are three highly conserved motifs (I, II, III) found in C-MTs that are essential for catalysis.21a−c A multiple sequence alignment of MT domains (Figure 3) revealed these characteristic motifs and


also identified new motifs that distinguish MTs from pathways with in cis AT domains from those with trans AT enzymes. At position 165, a conserved WxD motif was observed in MT domains from in cis AT PKSs, whereas F is observed in MT domains from trans AT PKSs. At position 223, MT domains from trans AT PKS pathways possess a conserved GQQ and lack the conserved PGG of motif III. There are, however, a few exceptions to the new sequence motifs identified in this analysis. The MT domains of the leinamycin and disorazole trans AT PKS pathways21d,e lack both the in cis AT PKS conserved WxD motif and the trans AT PKS conserved GQQ motif. Our analysis supports that embedded modular MT domains from in cis AT PKSs show significant differences in their conserved sequences when compared to MT domains from trans AT PKSs. The observation that bacterial type I PKS MTs cluster primarily based on modular structure is significantly different from the results observed with type II MT enzymes, which have been shown to cluster according to the site of methylation.22

Ketoreductase (KR) domains are present in all elongation modules within the gephyronic acid PKS with the exception of module 5. Each KR domain possesses the typical NADPH-binding site GxGxxGxxxA motif as well as the semiconserved catalytic tetrad K−S−Y−N, with a His substitution for Tyr in modules 1−4.23 A similar substitution was reported in JamJ from the jamaicamide PKS pathway, in which the KR domain motif K−S−H−N from JamJ maintains functionality.24 Therefore, the KR domains of modules 1−4 are expected to be active. All KR domains observed in the pathway are classified as B-type, except KR3, which appears to contain conserved residues for both A- and B-type KRs (Figure S4).25 B-type KRs deliver their hydrides to the si face of the β-keto group, generating S-configured alcohols. This correlated well with the experimentally confirmed S configuration of the alcohols at C-3, C-5, and C-11 in gephyronic acid.2a

Sequential modification of installed β-keto groups to alkene moieties was predicted for modules 1, 2, and 4. DH domains have been shown to possess a characteristic signature HxxxGxxxxP active site motif containing a catalytic histidine residue. This residue is required to interact with the α-hydrogen for deprotonation and initiation of the elimination.25 DH domains from modules 1, 2, and 4 lack this conserved sequence motif, but show conservation of the catalytic histidine residue (Figure S2); based on the gephyronic acid structure, we expect that they maintain their full activity. Module 1 is predicted to have atypical DH activity.26 The atypical β,γ-alkene, reminiscent of unique structural features found in the polyketides from marine organisms,27 myriaporones and tedanolides, may also be installed by GphF. Alkene generation is typically observed at the α,β position during polyketide biosynthesis through the action of a DH domain, with installation at the β,γ position occurring through β,γ-dehydration or migration from the α,β position by a noncanonical dehydratase (DH*) “shift module” as observed in rhizoxin biosynthesis.28 Bioinformatic analysis of the gephyronic acid biosynthetic pathway did not provide evidence for the presence of a DH* “shift module”, suggesting that the Δ15,16-alkene is a result of sequential chain elongation, β-keto processing, and β,γ-dehydration in module 1 of GphF. The structure of gephyronic acid contains an epoxide at C-12,-13 that is likely derived from the post-PKS oxidation of a trisubstituted alkene with E geometry. Sequence analysis of the DH domain from module 2 predicts the catalysis of B-type reduction, leading to the requisite E configuration at this

Table 1. Putative Functions of Proteins within the Gephyronic Acid Biosynthetic Gene Cluster and the Surrounding Area

PKS Portion of the Gephyronic Acid Gene Cluster
protein (gene) | size aa (bp) | proposed function (protein domains with their position in the sequence)
GphF (gphF) | 4927 (14 783) | MT (270−414), GNAT (415−713), ACP (714−784), KS (822−1234), AT (1342−1636), DH (1702−1862), MT (2129−2335), KR (2536−2709), ACP (2819−2891), KS (2924−3338), AT (3474−3664), DH (3665−3833), MT (4079−4295), KR (4514−4698), ACP (4795−4867)
GphG (gphG) | 4174 (12 524) | KS (39−461), AT (571−869), MT (1079−1290), KR (1472−1654), ACP (1759−1829), KS (1856−2270), AT (2338−2573), DH (2573−2739), MT (2998−3212), ER (3435−3744), KR (3758−3938), ACP (4055−4127)
GphH (gphH) | 1637 (4913) | KS (37−459), AT (561−869), MT (1094−1310), ACP (1532−1604)
GphI (gphI) | 2253 (6761) | KS (35−467), AT (568−855), DH (926−1092), MT (1405−1619), KR (1848−2026), ACP (2130−2202)
GphJ (gphJ) | 1809 (5429) | KS (40−461), AT (571−871), KR (1160−1336), ACP (1432−1503), TE (1559−1807)

Proteins Encoded Upstream of gphF and Downstream of gphJ
protein (gene) | size aa (bp) | proposed function of the similar protein | sequence similarity to source | similarity/identity (amino acids) | accession number of the similar protein
GphA (gphA) | 279 (839) | methyltransferase | Stigmatella aurantiaca | 61%/46% (272) | ZP_01459596.1
GphB (gphB) | 372 (1118) | phospholipase | Myxococcus xanthus | 61%/41% (355) | YP_632470.1
GphC (gphC) | 440 (1322) | Ser/Thr protein kinase | Sorangium cellulosum | 60%/40% (445) | ADZ24984.1
GphD (gphD) | 224 (734) | tetR transcription factor | Saccharophagus degradans | 48%/23% (193) | YP_529352.1
GphE (gphE) | 163 (521) | hypothetical protein | Pseudomonas aeruginosa | 42%/29% (115) | ZP_07792632.1
GphK (gphK) | 427 (1283) | cytochrome P450 | Cystobacter fuscus | 53%/35% (475) | WP_002628736.1
GphL (gphL) | 416 (1250) | FAD-dependent oxidoreductase | Stigmatella aurantiaca | 52%/39% (454) | CCD27752.1
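The domain strings in Table 1 can be cross-checked against the statements made in the text (seven embedded MT domains, no KR in module 5, no MT in module 7). The sketch below is one way to do that; it assumes that each KS domain opens a new elongation module and that the domains preceding the first KS form the loading module. The function and variable names are illustrative.

```python
# Domain order per PKS protein, transcribed from Table 1 (positions omitted).
PKS_DOMAINS = {
    "GphF": "MT GNAT ACP KS AT DH MT KR ACP KS AT DH MT KR ACP",
    "GphG": "KS AT MT KR ACP KS AT DH MT ER KR ACP",
    "GphH": "KS AT MT ACP",
    "GphI": "KS AT DH MT KR ACP",
    "GphJ": "KS AT KR ACP TE",
}


def split_modules(pks_domains):
    """Return a list of modules; index 0 is the loading module.

    Assumption: every KS domain starts a new elongation module, and the
    domains before the first KS make up the loading module.
    """
    modules = []
    for protein, domains in pks_domains.items():
        for domain in domains.split():
            if domain == "KS" or not modules:
                modules.append({"protein": protein, "domains": []})
            modules[-1]["domains"].append(domain)
    return modules


modules = split_modules(PKS_DOMAINS)
# Elongation modules paired with their module number (1, 2, ...).
elongation = list(enumerate(modules))[1:]

mt_count = sum(m["domains"].count("MT") for m in modules)
no_kr = [i for i, m in elongation if "KR" not in m["domains"]]
no_mt = [i for i, m in elongation if "MT" not in m["domains"]]
with_dh = [i for i, m in elongation if "DH" in m["domains"]]

print("embedded MT domains:", mt_count)
print("elongation modules without KR:", no_kr)
print("elongation modules without MT:", no_mt)
print("elongation modules with DH:", with_dh)
```

Splitting at KS boundaries is a simplification, but it reproduces the module numbering used in the text for this cluster.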


Figure 2. Model of gephyronic acid biosynthesis. PKS domains: GNAT (N-acetyltransferase), ACP (acyl carrier protein), KS (ketosynthase), AT (acyltransferase), ATo (inactive acyltransferase), DH (dehydratase), DHo (inactive dehydratase), MT (methyltransferase), ER (enoylreductase), KR (ketoreductase). The AT domains of modules 2 and 4 are inactive (ATo). The DH domain of module 6 is presumed to be inactive.
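The motif rule described earlier for MT domains (WxD at alignment position 165 in in cis AT pathways, GQQ at position 223 in trans AT pathways) can be written as a small classifier. This is only a sketch: the function name and the example residue strings are hypothetical, the inputs are assumed to be three-residue windows already excised from a multiple sequence alignment, and domains lacking both motifs (as reported for leinamycin and disorazole) are left unassigned.

```python
import re


def classify_mt(site_165: str, site_223: str) -> str:
    """Classify an MT domain from two three-residue alignment windows.

    site_165: residues at the WxD position (hypothetical input format).
    site_223: residues at the GQQ/PGG position (hypothetical input format).
    """
    has_wxd = re.fullmatch(r"W.D", site_165) is not None
    has_gqq = site_223 == "GQQ"
    if has_wxd and not has_gqq:
        return "cis-AT"
    if has_gqq and not has_wxd:
        return "trans-AT"
    # Neither motif, or conflicting motifs: do not force an assignment.
    return "unassigned"


# Made-up windows for illustration only, not taken from Figure 3.
print(classify_mt("WLD", "PGG"))
print(classify_mt("FLA", "GQQ"))
print(classify_mt("FLA", "PGG"))
```

Returning "unassigned" rather than guessing keeps the exceptions noted in the text visible instead of misclassifying them.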

Figure 3. Methyltransferase signature motifs. Multiple sequence alignment of MT domain signature motifs made in ClustalX from BRY, bryostatin, EF032014; CUR, curacin A, AY652953; EPO, epothilone, AF2108431.1; GPH, gephyronic acid; JAM, jamaicamide, 46486672; MCY, microcystin, AF183408.1; NDA, nodularin, AY210783.2; ONN, onnamide, AY688304.2; PEDF, pederin, AY328023.1; PEDI, pederin, AY426537.1; RHI, rhizoxin, AM411073.1; TaI, antibiotic TA, CP000113.1, PKS gene clusters. New signature motifs were identified (boxed) to distinguish between MT domains from pathways with trans AT enzymes and pathways with in cis AT domains. (Adapted from Figure 3, Ligon et al.21b)

position. Most surprisingly, a DH domain with the conserved motif was also observed in module 6, although elimination is not observed at C-5 in the final structure. A similar feature was identified in spirangien biosynthesis, where DH4 is presumed inactive due to the hydroxy participation in spiroketal formation.29

The only ER domain present in the gene cluster, in module 4, possesses the conserved LxHxxxGGVGxxAxxxA active site motif characteristic of ER domains (Figure S2).7 Recently, the molecular basis of enoyl reduction stereochemistry has been correlated to conserved motifs in the amino acid sequence of ER domains. At position 52 of ER domains, the occurrence of a conserved Tyr is associated with the installation of a 2S-methyl branch, whereas a Val is observed in cases of 2R-methyl branch installation.30 Sequence analysis of ER4 position 53 revealed a conserved Val, which would dictate an R configuration at C-8 of gephyronic acid. The experimentally confirmed configuration of this center is S.2 As the ER stereogenic prediction model does not correctly predict the stereochemical outcome of this reduction, it is likely that beyond the conserved residue at position 52, additional residues participate in stereocontrol of the enol protonation to yield the S configuration of C-8 in gephyronic acid.30 Alternatively, the observed C-8 configuration could arise from incorporation of the methyl substituent during the enoyl reduction. The timing of the C-methylation is currently unknown.

The structure of gephyronic acid contains two functional groups, a methyl ether at C-5 and a C-12/C-13 epoxide, that are likely installed by post-PKS tailoring enzymes. The tailoring chemistry to install the C-5 methyl ether is likely catalyzed by GphA. GphA displays the typical SAM-binding motif as observed from SpiB and SpiK, O-methyltransferases in spirangien biosynthesis.21a−c,29 We suspect that GphK, a member of the cytochrome P450 superfamily, is responsible for epoxidation of the C-12−C-13 trisubstituted olefin installed by module 3 of GphF. This is corroborated by the post-PKS epoxidation of an olefin catalyzed by the P450 enzyme EpoK during epothilone biosynthesis.20b However, as exemplified by TamL from tirandamycin biosynthesis, an epoxide may be installed via a co-dependent process involving both the cytochrome P450 and FAD-dependent monooxygenases.31 It is thus possible that GphK in conjunction with GphL, a monooxygenase FAD-binding protein, may be required for installation of the C-12−C-13 epoxide. Further experiments to fully elucidate the exact function of each of these proteins in gephyronic acid biosynthesis are currently under way.

Analysis of Embedded MT Domains through Isotope Labeling Experiments. Initially, we considered valine to be the source of the starting isobutyryl unit. Unfortunately, incubation experiments in the presence of buffered valine showed significant growth inhibition. However, the activity of the PKS-embedded SAM-dependent MT domains, including the unusual MT within the loading domain, was determined through labeling experiments using (methyl-13C)-L-methionine, the metabolic precursor of SAM. This experimental approach was used to determine the origin of the methyl branches in rhizopodin from Stigmatella aurantiaca.32 Gephyronic acid has a molecular mass of 470.32 (elemental composition: C26H46O7) and includes a total of 10 proposed SAM-derived methyl


Figure 4. LC-MS analysis of C. violaceus Cb vi76 extracts from (methyl-13C)-L-methionine feeding experiments. (a) Extracted m/z = 469.44 ion chromatogram and MS spectra from C. violaceus extract. (b) Extracted m/z = 479.52 ion chromatogram and MS spectra from C. violaceus (methyl-13C)-L-methionine feeding experiment extract.
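The mass shift expected from the labeling experiment follows from the elemental composition (C26H46O7) and the number of proposed SAM-derived methyl groups. The sketch below recomputes the monoisotopic mass and the m/z of the [M−H]− ion with 0, 9, and 10 13C labels; the isotope masses are standard values, the function names are illustrative, and a singly charged ion is assumed because the data were acquired in negative electrospray mode.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727646      # mass of a proton, removed to form [M-H]-
DELTA_13C = 1.00335484   # mass difference between 13C and 12C

FORMULA = {"C": 26, "H": 46, "O": 7}  # gephyronic acid, C26H46O7

# SAM-derived methyl groups proposed in the text.
METHYLS = {
    "loading domain (dimethylation)": 2,
    "C-methyl branches (modules 1-6, two from module 5)": 7,
    "O-methyl (GphA)": 1,
}


def neutral_mass(formula):
    """Monoisotopic mass of the unlabeled neutral molecule."""
    return sum(MASS[element] * count for element, count in formula.items())


def mz_negative(formula, n_labels=0):
    """m/z of the singly charged [M-H]- ion carrying n 13C labels."""
    return neutral_mass(formula) - PROTON + n_labels * DELTA_13C


total_labels = sum(METHYLS.values())
for n in (0, 9, total_labels):
    print(f"{n:2d} labels: m/z {mz_negative(FORMULA, n):.2f}")
```

The quantity of interest is the difference between labeled and unlabeled ions, about 10 m/z units for full labeling, which matches the spacing of the two extracted ion values in Figure 4; the absolute values computed here differ from those in the caption by roughly 0.1 to 0.2.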

groups (2 × loading domain, 7 × C- and 1 × O-methyl). During incubation experiments with (methyl-13C)-L-methionine, the observed molecular mass of gephyronic acid is thus expected to increase by 10 Da. Total labeling with methionine can be difficult to achieve due to competition with unlabeled methionine present in the growth medium.32 Thus, to achieve a high rate of labeled methionine incorporation, the labeled substrate was supplemented over the course of 5 days after the culture reached stationary phase. The mass spectrum of the isolated polyketide from C. violaceus extracts fed with (methyl-13C)-L-methionine showed MS signals for labeled gephyronic acid with increases in molecular mass of both +9 and +10 Da (Figure 4). In addition, there were equivalent signal intensities for unlabeled gephyronic acid present in C. violaceus extracts. A similar distribution of intensities for labeled and unlabeled products was observed in related (methyl-13C)-L-methionine labeling experiments conducted to investigate rhizopodin biosynthesis in S. aurantiaca.32 Together these results suggest that unlabeled product may have been produced prior to substrate feeding, and isotopic methionine incorporation occurs rapidly in myxobacteria. These data are consistent with the proposal that all methyl branches in the gephyronic acid structure originate from S-adenosyl-L-methionine and are installed by the SAM-dependent MT domains suggested by the sequence analysis. Of particular importance, these data support that C-methyl branches are not installed from methylmalonyl-CoA, further supporting the malonyl-CoA selectivity of the AT domains proposed from the sequence analysis. These results also suggest that the GNAT domain loads malonyl-CoA onto the ACPL followed by SAM-dependent dimethylation catalyzed by the MTL domain to construct the isobutyryl starter unit.

Gene Inactivation to Confirm Gephyronic Acid Biosynthetic Pathway. To confirm the involvement of gphF-gphJ in the biosynthesis of gephyronic acid, inactivation of the PKS through gene disruption was performed. A 1500 bp fragment specific to the linker region between modules 3 and 4 through the KS and AT domains of module 4 (gphG) was inserted into the chromosome of C. violaceus through homologous recombination. Electroporation of plasmid DNA containing the 1500 bp fragment from gphG and conferring hygromycin resistance resulted in a C. violaceus gphG::HygR mutant. Selection of hygromycin-resistant mutants was conducted on VY2 agar plates supplemented with hygromycin. To confirm disruption of gphG, the hygromycin resistance cassette was amplified from gDNA isolated from hygromycin-resistant strains (Figure S5). In addition to confirmation of chromosomal insertion, LC-MS analyses of organic extracts from C. violaceus gphG::HygR were conducted to confirm the abolished production of gephyronic acid (Figure 5). The absence of gephyronic acid production from C. violaceus gphG::HygR confirms that gphG is essential for gephyronic acid biosynthesis and supports our identification of the gephyronic acid biosynthetic pathway.


Figure 5. LC-MS comparison of C. violaceus wild type and gphG::HygR mutant extracts. (a) Extracted m/z = 469.44 ion chromatograms from C. violaceus extract. (b) Extracted m/z = 469.44 ion chromatograms from C. violaceus gphG::HygR extract. (c) MS spectra for C. violaceus extract, retention time = 4.47. (d) MS spectra for C. violaceus gphG::HygR mutant extract, retention time = 4.47.

Recent investigations of the structurally similar eukaryotic protein synthesis inhibitors myriaporone 3/4, tedanolide, and gephyronic acid have demonstrated their promising chemotherapeutic potential. With the current difficulties characterizing biosynthetic pathways from marine organisms, the terrestrial gephyronic acid gene cluster represented the most accessible source of biosynthetic information concerning these classes of lead compounds. The gephyronic acid biosynthetic pathway was identified in the native host C. violaceus through the use of next-generation genome sequencing, isotopic feeding experiments, and gene inactivation. We provide direct evidence that gphG, in concert with the upstream and downstream genes gphF, gphH, gphI, and gphJ, is responsible for gephyronic acid production. Tailoring enzymes GphA and GphK (and potentially, GphL) are proposed to be responsible for C-5 methyl ether formation and C-12,-13 epoxide generation. Unexpectedly, seven MT domains embedded in the gephyronic acid PKS pathway were identified and found to install the methyl branches throughout the gephyronic acid skeleton, including the isobutyryl starter unit. Identification of the gene cluster is the key step in the generation of a heterologous expression system, which will provide an alternative source of the natural product and a fermentation-based system for precursor-directed biosynthesis of gephyronic acid analogues.

EXPERIMENTAL PROCEDURES

Bacterial Strains and Culture Conditions. Escherichia coli TOP10 and DH5α strains for plasmid preparation were grown in LB broth at 37 °C. C. violaceus Cb vi76 and its mutants were cultured in VY2 liquid media at 180 rpm or on agar plates at 30 °C. VY2 medium consists of 0.5% baker’s yeast, 0.1% HEPES, 0.14% CaCl2·2H2O, and 1.5% agar, pH 7.2. After autoclaving, sterile solutions of 20% (w/v) glucose and 0.0005% (w/v) vitamin B12 were added to final concentrations of 0.2% and 0.000 005%, respectively. The media were supplemented with hygromycin B (50 μg/mL) when necessary.

DNA Preparations. Genomic DNA was isolated from C. violaceus Cb vi76 using the E.Z.N.A. bacterial DNA kit−spin protocol (Omega Biotek). Cell clumps were collected from agar plates and gently crushed with a glass homogenizer. Cells were harvested by centrifugation at 4 °C (13000g, 5 min), suspended in 5 mL of washing buffer (5 mM HEPES, 0.5 mM CaCl2, pH adjusted to 7.2 using 1 M NaOH), and then centrifuged (13000g, 5 min). After centrifugation, cells were suspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 7.5). DNA was then extracted and purified using the established protocol for the bacterial DNA kit. Plasmid DNA was purified using the E.Z.N.A. bacterial plasmid kit (Omega Biotek).

Inactivation Constructs. An internal fragment (∼1500 bp) from the targeted gene (gphG) was amplified using Primestar HS DNA polymerase (Takara Bio Inc.) with specific forward and reverse primers. Primers JY02F 5′−CGCACGCGAGACGGATGACGCGGT−3′ and JY02R 5′−TCTGCTGGGCGGCGCAGACGC−3′ were used to provide the PCR fragment. The Mastercycler gradient thermal cycler (Eppendorf) was used for the PCR reaction. DMSO was added to the reaction mixture to a final concentration of 4% v/v. DNA template concentrations were used according to the manufacturer’s protocol. Conditions for amplification were as follows: initial denaturation, 3 min at 98 °C; denaturation, 10 s at 98 °C; annealing, 15 s at 70 °C; extension, 90 s at 72 °C (−1 °C/cycle for 10 cycles); denaturation, 10 s at 98 °C; annealing, 15 s at 55 °C; extension, 90 s at 72 °C (15 cycles); final extension at 68 °C for 10 min; hold at 4 °C. The PCR product was gel purified using the Wizard PCR Preps DNA purification system vacuum manifold protocol (Promega). The gel-purified PCR fragment was cloned into pCR-Blunt (Zero Blunt PCR cloning kit, Invitrogen) following the


fed to the culture. Cells were allowed to continue swarming the plate until day 12. The entire agar plate including media and bacteria was harvested for extraction on day 12. Sequence Analysis. High-throughput sequencing services were performed on a Roche 454 genome sequencer FLX instrument at the Notre Dame Genomics Core Facility, which utilizes Roche’s Titanium chemistry platform. Contigs were assembled utilizing the 454 Newbler Assembler software (version 2.3). Nucleotide sequence analysis was performed on a concatenated FASTA file linking all 83 contigs, which were generated through whole genome sequencing and assembly. The DNA sequence was screened for PKS- and NRPS-coding sequencings using antiSMASH (version 1.0).5 The entire gph gene cluster was found encoded on a single contig. Sequencing data were assembled and edited using Artemis.33 All sequence similarity searches were carried out on the amino acid level in the GenBank database with the BLAST program. Pfam HMM searches were also used for loading module domain characterization.12 Sequence alignments were carried out with ClustalX (version 2.1).34 Computation of the phylogenetic tree was performed using NJplot. Bootstrap values were calculated at 100 trials.

manufacturer’s protocol, generating plasmid pJY01. The resulting plasmid was then sequenced. For insertion of hygromycin resistance cassette into the inactivation construct, pMycoMar-HYG was digested with HindIII (New England Biolabs), and the 2081 bp fragment containing the hygromycin resistance gene was gel purified using the manufacturer’s protocol for the E.Z.N.A. gel extraction kit (Omega Biotek). Plasmid pJY01 was linearized with HindIII and gel purified using the E.Z.N.A. gel extraction kit and treated with calf intestinal alkaline phosphatase to prevent religation. The 2081 bp HindIII fragment was ligated into linearized pJY01 with T4 ligase (2 000 000 units/μL) at 4 °C. The resulting ligation was transformed using the standard protocol for DH5α chemical competent cells (Invitrogen). The resulting plasmid pJY09 contained a fragment of gphG and hygromycin resistance. pJY09 was introduced into C. violaceus Cb vi76 by electroporation using the Eppendorf Eporator electroporation system. C. violaceus was cultivated at 30 °C in VY2 media for 10 days to achieve an optimal cell density. Cell clumps were washed in washing buffer (5 mM HEPES, 0.5 mM CaCl2, pH adjusted to 7.2 using 1 M NaOH) followed by 3−5 additional washes using 10% (v/v) glycerol. Cells were suspended in 10% (v/v) glycerol in preparation for electroporation. All washes and suspensions were performed at 4 °C. Plasmid DNA (1−3 μg) and 100 μL of the C. violaceus cell suspension were mixed and transferred into an ice-cold 2 mm electroporation cuvette. A range of electroporation voltages was explored, and 1000 V was found to be optimum. VY2 media (350 μL) was added to the cells immediately after electroporation. Cells were recovered at 30 °C for 4 h and then transferred to VY2 agar plates supplemented with 50 μg/mL of hygromycin. 
Before incubation, Pol03 soft agar (0.3% Probion (Hoechst, single cell protein), 0.3% soluble starch, 0.2% MgSO4·7H2O, 0.05% CaCl2·2H2O, 1.19% HEPES, 0.5% (BD Difco) agar) was cooled to 37 °C, then poured directly on top of the cell suspension.31 Transformant plates were incubated at 30 °C for 5−10 days or until mutant colonies became visible. Integration of the plasmid into C. violaceus mutants was verified by PCR. Phusion High-Fidelity DNA polymerase PCR master mix with GC buffer (New England Biolabs) and the MyCycler thermal cycler (Bio-Rad) were used for all PCR reactions. DMSO was added to the reaction mixture to a final concentration of 3% v/v. DNA template concentrations were used according to the manufacturer’s protocol. Plasmids pMycoMar-HYG and pJY09 were utilized as control DNA samples to verify hygromycin insertion into C. violaceus. The primers used to amplify the hygromycin resistance insertion in mutant samples were degenerate oligonucleotides pSUBHyg-int-up: 5′−ACCGGTGATACCACGATACTA−3′ and pSUBHyg-int-dn: 5′−CTGCACATCCATATCGCCA−3′. Conditions for amplification were as follows: initial denaturation, 30 s at 98 °C; denaturation, 10 s at 98 °C; annealing, 30 s at 60 °C; extension, 90 s at 72 °C (−1 °C/cycle for 10 cycles); denaturation, 10 s at 98 °C; annealing, 30 s at 55 °C; extension, 90 s at 72 °C (15 cycles); final extension at 72 °C for 10 min; hold at 4 °C. A 500 bp fragment was expected for mutant chromosomal DNA samples that contained hygromycin resistance.

Analysis of Secondary Metabolite Production in C. violaceus Cb vi76. Secondary metabolites were extracted from cells with EtOAc (3 × 20 mL) overnight. Extracts were dried using rotary evaporation and suspended in 500 μL of HPLC grade MeOH.
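For readers who wish to reproduce the PCR cycling program used to verify hygromycin insertion, the two-stage schedule given above can be written out step by step. The short Python sketch below does so under one stated assumption: that the −1 °C/cycle decrement applies to the annealing temperature of the first 10 cycles, starting from 60 °C. The class and function names are hypothetical, and ramp times and the final 4 °C hold are not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def build_program() -> List[Step]:
    """Expand the two-stage cycling conditions into an ordered list of steps."""
    steps = [Step("initial denaturation", 98.0, 30)]
    # Stage 1: 10 cycles. ASSUMPTION: annealing starts at 60 °C and drops
    # by 1 °C per cycle (60, 59, ..., 51).
    for cycle in range(10):
        steps.append(Step("denaturation", 98.0, 10))
        steps.append(Step("annealing", 60.0 - cycle, 30))
        steps.append(Step("extension", 72.0, 90))
    # Stage 2: 15 cycles with a fixed 55 °C annealing temperature.
    for _ in range(15):
        steps.append(Step("denaturation", 98.0, 10))
        steps.append(Step("annealing", 55.0, 30))
        steps.append(Step("extension", 72.0, 90))
    steps.append(Step("final extension", 72.0, 600))
    return steps


def total_seconds(steps: List[Step]) -> int:
    """Sum the programmed hold times (ramping between temperatures excluded)."""
    return sum(step.seconds for step in steps)


if __name__ == "__main__":
    program = build_program()
    for step in program:
        print(f"{step.name:22s} {step.temp_c:5.1f} °C {step.seconds:4d} s")
    print("programmed time (s):", total_seconds(program))
```

Writing the program out this way makes the per-cycle annealing temperatures explicit, which is useful when transferring the conditions to a thermal cycler that requires each stage to be entered separately.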
The LC-MS analyses were carried out on a Waters ZQ instrument consisting of an Alliance HT chromatography module, a 2996 photodiode array detector, and a Micromass ZQ mass spectrometer, using a 3 × 50 mm, 5 μm Pro C18 YMC reverse-phase column (Waters). Mobile phases: 10 mM ammonium acetate in HPLC grade H2O (A) and HPLC grade CH3CN (B). Gradient 1 was run from 5% to 80% of B in 10 min at 0.7 mL/min, and gradient 2 was run from 5% to 80% of B in 15 min at 0.7 mL/min. The MS electrospray source was operated at a capillary voltage of 3.5 kV and a desolvation temperature of 300 °C. Data sets were acquired in negative electrospray ionization (ESI) mode with a scan range from 200 to 1000 m/z.

Isotope Labeling Experiments. Experiments with (S-methyl-13C)-L-methionine (Cambridge Isotope Laboratories) were performed on VY2 agar plates. C. violaceus was grown for approximately 4 days to allow cells to reach the stationary phase, in which they start producing secondary metabolites. On days 4−8, 20 mg of (S-methyl-13C)-L-methionine filter sterilized in 500 μL of dH2O was



ASSOCIATED CONTENT

Supporting Information

Supplementary figures are provided detailing sequence alignments for the GNATL domain, key motifs for gph gene products, a dendrogram for PKS gene clusters including the gephyronic acid PKS, sequence alignments for KR domains, and PCR confirmation of C. violaceus gphG::HygR chromosomal insertion. This material is available free of charge via the Internet at http://pubs.acs.org.

Accession Codes

Sequencing data are accessible at GenBank under accession number KF479198.



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

R.E.T. acknowledges generous financial support from the National Science Foundation (CHE-0924351) and the Eck Institute for Global Health Genomics and Bioinformatics Pilot Project Program at the University of Notre Dame. We would also like to thank Dr. F. Sasse and B. Hinkelmann for help in cultivating Cystobacter violaceus Cb vi76 and the Helmholtz Center for Infection Research (HZI) for supplying purified gephyronic acid.



REFERENCES

(1) Sasse, F.; Steinmetz, H.; Höfle, G.; Reichenbach, H. J. Antibiot. 1995, 48, 21−25.
(2) (a) Nicolas, L.; Anderl, T.; Sasse, F.; Steinmetz, H.; Jansen, R.; Höfle, G.; Laschat, S.; Taylor, R. E. Angew. Chem., Int. Ed. 2011, 50, 938−941. (b) Anderl, T.; Nicolas, L.; Munkemer, J.; Baro, A.; Sasse, F.; Steinmetz, H.; Jansen, R.; Höfle, G.; Taylor, R. E.; Laschat, S. Angew. Chem., Int. Ed. 2011, 50, 942−945.
(3) Anderl, T.; Nicolas, L.; Muenkemer, J.; Muthukumar, Y.; Baro, A.; Frey, W.; Sasse, F.; Taylor, R. E.; Laschat, S. Eur. J. Org. Chem. 2011, 36, 7294−7307.
(4) Wenzel, S.; Müller, R. Mol. Biosyst. 2009, 5, 567−574.
(5) Medema, M. H.; Blin, K.; Cimermancic, P.; de Jager, V.; Zakrzewski, P.; Fischbach, M. A.; Weber, T.; Takano, E.; Breitling, R. Nucleic Acids Res. 2011, 39, W339−W346.

dx.doi.org/10.1021/np400629v | J. Nat. Prod. 2013, 76, 2269−2276


(6) Neuwald, A.; Landsman, D. Trends Biochem. Sci. 1997, 22, 154−155.
(7) Chang, Z.; Sitachitta, N.; Rossi, J. V.; Roberts, M. A.; Flatt, P. M.; Jia, J.; Sherman, D. H.; Gerwick, W. H. J. Nat. Prod. 2004, 67, 1356−1367.
(8) Piel, J.; Hui, D.; Wen, G.; Butzke, D.; Platzer, M.; Fusetani, N.; Matsunaga, S. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 16222−16227.
(9) Partida-Martinez, L. P.; Hertweck, C. ChemBioChem 2006, 8, 41−45.
(10) Gu, L.; Geders, T. W.; Wang, B.; Gerwick, W. H.; Håkansson, K.; Smith, J. L.; Sherman, D. H. Science 2007, 318, 970−974.
(11) Bode, H. B.; Meiser, P.; Klefisch, T.; Cortina, N. S.; Krug, D.; Göhring, A.; Schwär, G.; Mahmud, T.; Elnakady, Y. A.; Müller, R. ChemBioChem 2007, 8, 2139−2144.
(12) Punta, M.; Coggill, P. C.; Eberhardt, R. Y.; Mistry, J.; Tate, J.; Boursnell, C.; Pang, N.; Forslund, K.; Ceric, G.; Clements, J.; Heger, A.; Holm, L.; Sonnhammer, E. L.; Eddy, S. R.; Bateman, A.; Finn, R. D. Nucleic Acids Res. 2012, 40, D290−D301.
(13) Kellman, R.; Mihali, T. K.; Jeon, Y. J.; Pickford, R.; Pomati, F.; Neilan, B. A. Appl. Environ. Microbiol. 2008, 74, 4044−4053.
(14) Flugel, R. S.; Hwangbo, Y.; Lambalot, R. H.; Cronan, J. E., Jr.; Walsh, C. T. J. Biol. Chem. 2000, 275, 959−68.
(15) Molnár, I.; Schupp, T.; Ono, M.; Zirkle, R.; Milnamow, M.; Nowak-Thompson, B.; Engel, N.; Toupet, C.; Stratmann, A.; Cyr, D. D.; Gorlach, J.; Mayo, J. M.; Hu, A.; Goff, S.; Schmid, J.; Ligon, J. M. Chem. Biol. 2000, 7, 97−109.
(16) Yadav, G.; Gokhale, R.; Mohanty, B. J. Mol. Biol. 2003, 328, 335−363.
(17) Keatinge-Clay, A. T. Nat. Prod. Rep. 2012, 29, 1050−1073.
(18) (a) Gaitatzis, N.; Silakowski, B.; Kunze, B.; Nordsiek, G.; Blöcker, H.; Höfle, G.; Müller, R. J. Biol. Chem. 2002, 277, 13082−13090. (b) Otsuka, M.; Ichinose, K.; Fujii, I.; Ebizuka, Y. Antimicrob. Agents Chemother. 2004, 48, 3468−3476.
(19) Miller, D.; Luo, L.; Hillson, N.; Keating, T.; Walsh, C. Chem. Biol. 2002, 9, 333−344.
(20) (a) Sudek, S.; Lopanik, N. B.; Waggoner, L. E.; Hildebrand, M.; Anderson, C.; Liu, H.; Patel, A.; Sherman, D. H.; Haygood, M. G. J. Nat. Prod. 2007, 70, 67−74. (b) Piel, J. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 14002−14007. (c) Julien, B.; Shah, S.; Ziermann, R.; Goldman, R.; Katz, L.; Khosla, C. Gene 2000, 249, 153−160.
(21) (a) Kagan, R. M.; Clarke, S. Arch. Biochem. Biophys. 1994, 310, 417−427. (b) Ligon, J.; Hill, S.; Beck, J.; Zirkle, R.; Molnár, I.; Zawodny, J.; Money, S.; Schupp, T. Gene 2002, 285, 257−267. (c) Ansari, M. Z.; Sharma, J.; Gokhale, R. S.; Mohanty, D. BMC Bioinf. 2008, 9, 454. (d) Cheng, Y. Q.; Tang, G. L.; Shen, B. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 3149−3154. (e) Carvalho, R.; Reid, R.; Viswanathan, N.; Gramajo, H.; Julien, B. Gene 2005, 359, 91−98.
(22) Ishida, K.; Fritzsche, K.; Hertweck, C. J. Am. Chem. Soc. 2007, 129, 12648−12649.
(23) (a) Tang, L.; Yoon, Y. J.; Choi, C.; Hutchinson, C. R. Gene 1998, 216, 255−265. (b) Reid, R.; Piagentini, M.; Rodriguez, E.; Ashley, G.; Viswanathan, N.; Carney, J.; Santi, D. V.; Hutchinson, C. R.; McDaniel, R. Biochemistry 2003, 42, 72−79.
(24) Edwards, D.; Marquez, B. L.; Nogle, L. M.; McPhail, K.; Goeger, D. E.; Roberts, M. A.; Gerwick, W. H. Chem. Biol. 2004, 11, 817−833.
(25) Keatinge-Clay, A. J. Mol. Biol. 2008, 384, 941−953.
(26) Caffrey, P. ChemBioChem 2003, 4, 654−657.
(27) Taylor, R. E. Nat. Prod. Rep. 2008, 25, 854−861.
(28) Kusebauch, B.; Busch, B.; Scherlach, K.; Roth, M.; Hertweck, C. Angew. Chem., Int. Ed. 2010, 49, 1460−1464.
(29) Frank, B.; Knauber, J.; Steinmetz, H.; Scharfe, M.; Blöcker, H.; Beyer, S.; Müller, R. Chem. Biol. 2007, 14, 221−233.
(30) Kwan, D. H.; Sun, Y.; Schulz, F.; Hong, H.; Popovic, B.; Sim-Stark, J. C.; Haydock, S. F.; Leadlay, P. F. Chem. Biol. 2008, 15, 1231−1240.
(31) Carlson, J. C.; Li, S.; Gunatilleke, S. S.; Anzai, Y.; Burr, D. A.; Podust, L. M.; Sherman, D. H. Nat. Chem. 2011, 3, 628−633.
(32) Pistorius, D.; Müller, R. ChemBioChem 2012, 13, 416−426.

(33) Rutherford, K.; Parkhill, J.; Crook, J.; Horsnell, T.; Rice, P.; Rajandream, M. A.; Barrell, B. Bioinformatics 2000, 16, 944−945.
(34) Larkin, M. A.; Blackshields, G.; Brown, N. P.; Chenna, R.; McGettigan, P. A.; McWilliam, H.; Valentin, F.; Wallace, I. M.; Wilm, A.; Lopez, R.; Thompson, J. D.; Gibson, T. J.; Higgins, D. G. Bioinformatics 2007, 23, 2947−2948.
