
Article

Chemical Diversification Based on Substrate Promiscuity of a Standalone Adenylation Domain in a Reconstituted NRPS System Mengyi Zhu, Lijuan Wang, and Jing He ACS Chem. Biol., Just Accepted Manuscript • DOI: 10.1021/acschembio.8b00938 • Publication Date (Web): 23 Jan 2019 Downloaded from http://pubs.acs.org on January 24, 2019



ACS Chemical Biology

Chemical Diversification Based on Substrate Promiscuity of a Standalone Adenylation Domain in a Reconstituted NRPS System

Mengyi Zhu,† Lijuan Wang,‡ Jing He*,†

†State Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China

‡CAS Key Laboratory of Tropical Marine Bio-resources and Ecology, RNAM Center for Marine Microbiology, Guangdong Key Laboratory of Marine Materia Medica, South China Sea Institute of Oceanology, Chinese Academy of Sciences, 164 West Xingang Road, Guangzhou 510301, P. R. China

ABSTRACT: A nonribosomal peptide synthetase (NRPS) assembly line (sfa) in Streptomyces thioluteus that directs the formation of the diisonitrile chalkophore SF2768 (1) has been characterized by heterologous expression and directed gene knock-outs. Herein, differential metabolic analysis of the heterologous expression strain and the original host led to the isolation of an SF2768 analog (2, a by-product of sfa) that possesses N-isovaleryl rather than 3-isocyanobutyryl side chains. The proposed biosynthetic logic of sfa and the structural difference between 1 and 2 suggested substrate promiscuity of the adenylate-forming enzyme SfaB. Further substrate scope investigation of SfaB and a successfully reconstituted NRPS system including a four-enzyme cascade enabled incorporation of diverse carboxylic acid building blocks into peptide scaffolds, and 30 unnatural products were thus generated. This structural diversification strategy based on substrate flexibility of the adenylation domain and in vitro reconstitution can be applied to other adenylation-priming pathways, thus providing a supplementary method for diversity-oriented total synthesis. Additionally, the biocatalytic process of the putative lysine δ-hydroxylase SfaE was validated through the derivatization of two key aldehyde intermediates (2a and 2b), thereby expanding the toolkit of enzymatic C-H bond activation.

INTRODUCTION

Nonribosomal peptide synthetases, megaenzymes composed of domain-containing modules, biosynthesize a wide range of bioactive natural products. A typical nonribosomal peptide synthetase (NRPS) module generally obeys the following rules: an adenylation domain (A) activates and loads carboxylate substrates onto the distal thiol of a peptidyl carrier protein (PCP)/thiolation domain (T) via an ATP-consuming process; a condensation domain (C) catalyzes amide bond formation between the PCP-tethered thioesters. The enzyme-bound peptide chain is subsequently released by a thioesterase (TE) or reductase (R) domain located at the C-terminus of the last assembly module for further modification to afford structural diversity of the final product1-3. As a “gatekeeper,” the A domain initiates the NRPS assembly line by chemically activating appropriate candidate carboxylates such as proteinogenic and nonproteinogenic amino acids4, aryl acids5-7, and fatty acids8, 9 and forming the corresponding acyl-adenylates for upcoming thioesterification. Because the bioactivities of nonribosomal peptides (NRPs) are highly dependent on building blocks with various functional groups and chemical properties, investigations aimed at the substrate specificity of the A domain are of particular importance for understanding and engineering. Over the years, a series of studies regarding A-domain substrate specificity has been carried out, encompassing bioinformatic-guided substrate prediction10-12 and rational substrate profile alteration13-19, thereby paving the way for purposeful structure diversification via reprogramming the “NRPS code.”

Recently, the biosynthetic gene cluster of the diisonitrile natural product SF2768, an NRP harboring 3-isocyanobutyryl side chains, was determined20 (sfa, GenBank KY427327, Table S1). The standalone adenylate-forming enzyme SfaB, which is encoded in this pathway, was proposed to activate a unique substrate, 3-isocyanobutanoic acid, and mediate its covalent attachment to the type II peptidyl carrier protein (PCP)21 SfaC. Herein, a new analog (2) featuring N-isovaleryl side chains was found to be coproduced with SF2768 (1) in the same heterologous expression strain by metabolic profile comparison and chemical structure elucidation. Knocking out sfaB abolished the production of both 1 and 2, implying that SfaB may adenylate 3-isocyanobutanoic acid and isovaleric acid for the biosynthesis of 1 and 2, respectively, in a promiscuous manner.

Figure 1. (A) Biosynthetic pathway of SF2768 and its analogs. (B) A new SF2768 analog, 2, was discovered by metabolic comparison between the heterologous strain S. lividans::p13C and the control carrying the empty vector pJTU2554. The cosmid p13C, derived from pJTU2554, contains the intact sfa biosynthetic gene cluster. (C) Metabolic profiles of S. lividans::p13C and its derivative mutants ΔsfaA-E20. Legend: C, condensation domain; A, adenylation domain; PCP, peptidyl carrier protein; R, reductase domain; α-KG, α-ketoglutaric acid; AMP, adenosine monophosphate; PPi, pyrophosphate.

ACS Paragon Plus Environment


Afterward, we successfully conducted a one-pot enzymatic synthesis of 2 with a reconstituted NRPS system comprising SfaB (A, ATY72527.1), SfaC (PCP, ATY72526.1), SfaD (NRPS, ATY72525.1), and SfaE (hydroxylase, ATY72542.1), thus introducing isovaleric acid into the peptide scaffold by SfaB in vitro and completely clarifying the biosynthesis of 2. To further investigate the substrate scope of SfaB, a panel of structurally diverse carboxylic acids (75 in total), including amino acids, aryl acids, and fatty acids, was examined in vitro by a colorimetric method. The results revealed that SfaB is a versatile acyl-AMP ligase that prefers short-chain fatty acids (SCFAs) and their substituted analogs. Thereafter, we added each candidate carboxylic acid separately to the established NRPS system, and 16 of the 75 carboxylic acids were efficiently incorporated into the NRP skeleton to form unnatural compounds, hence demonstrating a practical method for NRP diversification that takes full advantage of the substrate flexibility of an adenylation domain coupled with in vitro multienzyme synthesis. We believe that, without any modification at the protein level, this strategy can be readily applied to additional biosynthetic pathways initiated by adenylate-forming enzymes to release their potential for natural product diversification.

RESULTS AND DISCUSSION

Discovery of a New Metabolite Based on Untargeted Metabolomics

Fermentation broths of the heterologous strain Streptomyces lividans::p13C, which contains the entire sfa cluster (Figure 1A), and of the control strain S. lividans::pJTU2554 were subjected to high-resolution electrospray ionization mass spectrometry (HR-ESI-MS) in triplicate. The acquired MS data were subsequently analyzed by the web-based platform XCMS22 (Figure 1B). A peak at m/z 315.2277 (2, [M + H]+, calculated for C16H30N2O4) was observed coexisting with SF2768 (1, [M + H]+, m/z 337.1870, C16H24N4O4) and was absent in the control strain. This new peak was considered to represent a homolog of SF2768 due to their similar molecular formulas and identical peak splitting patterns, which likely reflect pyran anomers (Figure 1B, 1 in blue and 2 in orange). Large-scale fermentation and compound purification led to the structural elucidation of 2, an SF2768 congener that possesses N-isovaleryl moieties. The metabolic profiles of S. lividans::p13C and five mutants20 indicated that the enzymes SfaB-E are necessary for the biosynthesis of 2 (Figure 1C); disruption of the isonitrile synthase-encoding gene sfaA has no effect on the production of compound 2, which is consistent with its chemical structure. Hence, a thorough investigation into the biosynthesis of 2 could shed more light on the remainder of the pathway encoded by sfa, especially on the acyl-AMP ligase SfaB and the tailoring enzyme SfaE.

In Vitro Reconstitution of the Biosynthetic Pathway of Compound 2

We successfully reconstituted the biosynthetic pathway that converts L-lysine and isovaleric acid to compound 2. According to bioinformatic analysis and the proposed pathway, we expressed SfaC and SfaD in a widely utilized engineered strain, Escherichia coli BAP123, to convert the apo-proteins into their holo forms by 4’-phosphopantetheinylation, while the other two enzymes were expressed in E. coli BL21 (DE3) (Figure S1). We then mixed the His-tagged proteins and cofactors, whose concentrations had been empirically determined, in a Tris-HCl-buffered system (initial concentrations: 3 mM ATP, 10 mM MgCl2, 1.5 mM NADPH, 10 μM FeSO4, 5 mM α-KG, and 2 mM DTT; 1 μM SfaB, SfaB:SfaC:SfaD:SfaE = 1:3:3:1). After incubation at 30 °C for 6 h, the reaction was stopped and analyzed by HR-ESI-MS. Successful de novo synthesis of compound 2 was determined by comparing the retention times and peak patterns of the product and the standard (Figure 2A). To further characterize this reaction and maximize its production, we employed a stepwise optimization of the enzyme ratios and cofactor concentrations (Figure 2B).

Figure 2. In vitro reconstitution of the biosynthetic pathway of compound 2 via a four-enzyme cascade. (A) Absence of SfaE, an Fe2+/α-KG-dependent hydroxylase, from the reconstituted system leads to the synthesis of an acyclic congener, 2’. The reaction without NADPH is presented as the representative negative control for clarity. (B) Titration of substrates and cofactors: (i) to (iii), optimization of the enzyme ratios; (iv) to (ix), optimization of cofactor concentrations. The yield of compound 2 is represented by ion intensity. Data are presented as the average of two experiments.

First, we changed the concentration of the PCP SfaC from 0 to 30 μM while keeping the other enzymes at 1 μM, and the results revealed that the discrete PCP enhanced the yield of 2 in a dose-dependent manner (Figure 2B, i). We therefore selected 30 μM SfaC for further investigations to ensure a sufficient supply of the thiol carrier protein for shuttling the acyl extender unit. Then, we varied the concentration of the NRPS SfaD from 0 to 10 μM, and the best production was achieved when 2.5 μM SfaD was present (Figure 2B, ii). The concentration of the last component, SfaE, was set to 2.5 μM as a compromise between improving production and reducing the cost of enzymes (Figure 2B, iii). The final enzymatic proportion was determined to be 1:30:2.5:2.5 (SfaB:SfaC:SfaD:SfaE) and was subsequently employed to determine the best cofactor content by varying the initial concentration of each individual cofactor. The electron donor preference was also determined by titrating the reduced cofactors NADH and NADPH into the system (Figure 2B, iv-ix). Finally, the system was optimized to 1 μM SfaB, 30 μM SfaC, 2.5 μM SfaD, 2.5 μM SfaE, 8 mM ATP, 2.5 mM MgCl2, 1 mM NADPH, 2.5 mM DTT, 10 μM FeSO4, and 30 mM α-KG to maximize the production of 2. Notably, a putative analog peak, 2’, at m/z 301.2486 ([M + H]+, calculated for C16H32N2O3) accompanied the generation of 2 in the in vitro reaction. Removing the putative hydroxylase SfaE and its cofactors Fe2+ and α-KG from the optimized reaction terminated the biosynthesis of 2, whereas the production of the new peak was retained. Product isolation from a scaled-up reaction (20 mL) and subsequent 1D and 2D NMR analyses established the acyclic structure of 2’ (Figure 2A).
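The [M + H]+ values quoted for 1, 2, and 2’ can be checked directly from their molecular formulas. The following Python sketch is an illustration only and is not part of the reported workflow (molecular formulas in this study were calculated with Agilent MassHunter). The function names are hypothetical, and the monoisotopic masses are standard reference values rounded to six decimals.

```python
import re

# Monoisotopic masses (u) of the lightest stable isotopes (rounded reference values).
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074, "O": 15.994915}
PROTON = 1.007276  # mass of H+ (a hydrogen atom minus one electron)


def parse_formula(formula):
    """Parse a simple formula such as 'C16H30N2O4' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def mz_protonated(formula):
    """Return the theoretical m/z of the singly protonated ion [M + H]+."""
    neutral = sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())
    return neutral + PROTON


# Formulas taken from the text: SF2768 (1), analog 2, and the acyclic congener 2'.
for label, formula in [("1", "C16H24N4O4"), ("2", "C16H30N2O4"), ("2'", "C16H32N2O3")]:
    print(f"{label}: {formula} -> [M + H]+ = {mz_protonated(formula):.4f}")
```

Under these assumptions the computed values agree with the m/z values given in the text to within 0.001.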
Two-electron Reduction Process Validated by Aldehyde Derivatization

Theoretically, the R domain of NRPS or PKS usually catalyzes 2e− or 4e− reductive release to afford the final product as an aldehyde or an alcohol2, 24-27, respectively, and 4e− reduction is conducted by two consecutive [2+2]e− reductive steps through a nonprocessive mechanism because the aldehyde product needs to be discharged from the active site pocket to facilitate cofactor exchange before reloading28. The result of a BLASTp search against the UniProtKB/Swiss-Prot database implied that SfaE mediates hydroxylation at the peptidic skeleton with the requirement of Fe2+ and α-KG (best hit: identity 31%, aspartyl/asparaginyl beta-hydroxylase, Q8BSY0.1, Mus musculus).

Figure 3. (A) PFBHA derivatization of the putative aldehyde intermediates 2a and 2b. Aldehyde 2a is hydroxylated by SfaE to afford 2b, followed by spontaneous hemiacetal formation. (B) LC-MS detection of the aldehyde 2a and PFBO derivatives 2a’ and 2b’ in the reconstituted NRPS system.

Figure 4. In vitro biosynthesis of SF2768 analogs through incorporation of different building blocks. (A)(i) Investigation of the substrate specificity of SfaB toward various carboxylic acids. The background (velocity of a no-enzyme control) was subtracted in each case. The method for colorimetric detection of free PPi is illustrated in the box. “+” indicates that the activity of SfaB toward the corresponding acid was validated by LC-MS. BTPPACl: bis(triphenylphosphoranylidene) ammonium chloride. (ii) Relative abundance of the SF2768 analogs produced via the reconstituted NRPS (SfaB-SfaE) system. The results were calculated by normalizing the summed integrated area of both hemiacetal (blue) and acyclic (yellow) congeners in each case. Error bars represent standard deviations (n = 3). (B) LC-MS analysis of the SF2768 analogs biosynthesized in this study. The products are illustrated by EIC (extracted ion chromatogram) overlays of different colors (blue, cyclic hemiacetals; yellow, ring-opened species). An asterisk marks an unknown substance with an m/z value identical to that of the corresponding numbered product. (C) The molecular formulas and theoretical mass-to-charge ratios of the products. (D) Key 2D NMR spectroscopic data of the representative compounds purified in this study.

Omitting SfaE from the reaction led to the ring-opened product 2’, implying that SfaE might exhibit hydroxylation activity toward it. However, no reaction was observed when we incubated 2’ with SfaE and the cofactors (data not shown), indicating that the alcohol 2’ was not the substrate. Accordingly, we proposed that the free aldehyde released from the R domain of SfaD was favored by SfaE, and that two aldehyde intermediates, the released and the hydroxylated ones, should be detectable (Figure 3A). A peak at m/z 299.2329 (2a, [M + H]+, calculated for C16H30N2O3, Figure 3B, blue), which was consistent with the presumed released aldehyde, was observed in the reconstituted NRPS system through HR-ESI-MS. Because 2a could not be isolated and NMR data were therefore lacking, the proposed assignment needed to be supported by derivatization. The aldehyde-selective reagent O-(2,3,4,5,6-pentafluorobenzyl)hydroxylamine (PFBHA) was added to the optimized reconstituted system, and the reaction was terminated for HR-ESI-MS detection after 3 h of incubation. Two peaks with the expected masses of the pentafluorobenzyloxime (PFBO) derivatives 2a’ and 2b’ were detected, and the original aldehyde peak 2a was decreased in the derivatized sample. This result is in line with the scenario that SfaE triggers ring closure by installing a hydroxyl group at the reduced lysine residue of 2a, causing the generation of the hemiacetal species 2.

Substrate Promiscuity of the Adenylate-forming Enzyme SfaB and the Resulting NRP Chemical Diversification

SfaB exhibited a flexible substrate profile in vivo, including at least two acids, 3-isocyanobutanoic acid and isovaleric acid, which inspired us to seek further details regarding the substrate scope of SfaB. We applied a facile method based on direct colorimetric detection of the PPi liberated in the adenylation process to characterize the adenylating activity of SfaB against 75 commercially available acids (Figure 4A).
The results showed that (i) SfaB preferentially activates SCFAs (C4-C7) and their functionalized homologs, including alkenyl, alkynyl, chlorinated and brominated acids; (ii) aromatic, amino, carboxy, and hydroxy substituted SCFAs are not well adenylated by SfaB; and (iii) α-substituted acids are not preferred. To more accurately describe the activity of SfaB, we determined the kinetic parameters of SfaB against different substrates based on quantification of the released AMP by LC-HR-MS. The results are generally consistent with what we obtained via the colorimetric method, revealing that SfaB can efficiently activate 23 out of the 75 carboxylic acids (Table 1, Figure S34).

Table 1. Steady-state kinetic parameters of SfaB.

substrate                   Km (mM)      kcat (min-1)    kcat/Km (mM-1 min-1)
acetate                     8.47±2.74    0.017±0.003     0.002
propionic acid              1.93±0.58    0.014±0.002     0.007
butyric acid                0.31±0.05    0.012±0.000     0.038
valeric acid                0.45±0.06    0.014±0.001     0.031
isovaleric acid             0.22±0.05    0.005±0.000     0.022
hexanoic acid               1.73±0.72    0.123±0.023     0.070
3-methylpentanoic acid      0.45±0.13    0.009±0.001     0.019
4-methylpentanoic acid      1.34±0.38    0.025±0.002     0.018
heptanoic acid              1.06±0.31    0.126±0.012     0.119
octanoic acid               1.05±0.29    0.009±0.001     0.008
3-chloropropionic acid      0.24±0.15    0.007±0.001     0.028
3-chlorobutyric acid        0.11±0.04    0.005±0.000     0.049
4-chlorobutyric acid        0.30±0.06    0.008±0.000     0.026
5-chlorovaleric acid        0.32±0.04    0.016±0.001     0.050
3-bromopropionic acid       0.51±0.12    0.006±0.000     0.013
3-bromobutyric acid         0.14±0.03    0.004±0.000     0.030
4-bromobutyric acid         1.94±0.95    0.004±0.001     0.002
5-bromovaleric acid         0.61±0.11    0.016±0.001     0.026
3-mercaptopropionic acid    0.22±0.05    0.009±0.001     0.039
4-pentynoic acid            0.34±0.09    0.007±0.000     0.022
5-hexynoic acid             0.88±0.13    0.014±0.001     0.016
4-pentenoic acid            0.45±0.07    0.011±0.000     0.025
5-hexenoic acid             0.89±0.29    0.017±0.002     0.019
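The catalytic efficiencies in Table 1 follow from the ratio kcat/Km, and the same two parameters give the expected initial velocity under the Michaelis-Menten model, v = kcat[E][S]/(Km + [S]). The short Python sketch below is an illustration only: it recomputes kcat/Km for a subset of the tabulated substrates, and the function names are hypothetical. Small differences from the reported values are expected, presumably because the published ratios were derived from unrounded fit parameters.

```python
# (Km in mM, kcat in min^-1, reported kcat/Km in mM^-1 min^-1), copied from Table 1.
TABLE1 = {
    "acetate": (8.47, 0.017, 0.002),
    "butyric acid": (0.31, 0.012, 0.038),
    "isovaleric acid": (0.22, 0.005, 0.022),
    "hexanoic acid": (1.73, 0.123, 0.070),
    "heptanoic acid": (1.06, 0.126, 0.119),
    "3-chlorobutyric acid": (0.11, 0.005, 0.049),
}


def catalytic_efficiency(km_mM, kcat_per_min):
    """kcat/Km in mM^-1 min^-1."""
    return kcat_per_min / km_mM


def mm_rate(s_mM, km_mM, kcat_per_min, enzyme_uM):
    """Michaelis-Menten initial velocity in uM product per minute."""
    return kcat_per_min * enzyme_uM * s_mM / (km_mM + s_mM)


for name, (km, kcat, reported) in TABLE1.items():
    recomputed = catalytic_efficiency(km, kcat)
    print(f"{name:22s} recomputed {recomputed:.3f}   reported {reported:.3f}")

# Substrate with the highest recomputed efficiency within this subset.
best = max(TABLE1, key=lambda n: catalytic_efficiency(*TABLE1[n][:2]))
print("highest kcat/Km:", best)
```

At a substrate concentration equal to Km, `mm_rate` returns half of the maximal velocity kcat[E], which is a quick sanity check on any fitted parameter pair.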

Next, we individually added all 75 substrate acids to the reconstituted NRPS reaction, attempting to introduce different acids into the peptide chain. Only 16 substrates were incorporated into the final product (Figure 4A, Figure S2), and both hemiacetal and acyclic species were detected in each case by HR-ESI-MS (Figure 4B and C). None of these unnatural products (3-17 and 3’-17’) was detected in the fermentation broth of the heterologous strain S. lividans::p13C or the mutant S. lividans::ΔsfaE. In fact, on the basis of coupling in vitro biosynthesis and enzymatic versatility, it is potentially feasible to forge additional unnatural products by broadening the substrate pool or, as in our case, utilizing different precursor acids in one reaction. To verify the identity of the final products generated from the reconstituted NRPS system, we carried out a large-scale biocatalytic reaction29 coupled with an NADPH regeneration system. Three novel representative acyclic variants, 4’ (2.5 mg), 9’ (1.2 mg), and 14’ (0.5 mg), were thus biosynthesized and purified from a 20 mL reaction, and their structures were confirmed by 1D and 2D NMR (Figure 4D). Notably, some acids in the pool, such as heptanoic acid and octanoic acid, were well adenylated and transferred to the PCP (Figure S5) but failed to undergo amidation through the C domain, implying an interdomain communication between SfaB and the condensation domain of SfaD for substrate selection. Indeed, components other than the A domain in the NRPS assembly line, such as the C domain30, 31, E (epimerization) domain32, and TE33, 34, act as additional gatekeepers in different instances. In line with those observations, our results provide a good example of the resultant product being constrained by tandem gatekeeping behavior of the A-C cascade.

DISCUSSION

Adenylation is a pivotal event in biochemical processes such as ribosomal and nonribosomal peptide synthesis, acyl- and aryl-CoA formation, and NRPS-independent siderophore synthesis35.
Natural biosynthetic enzymes, including adenylate-forming enzymes, often display biocatalytic versatility toward structurally diverse substrates to access chemical diversity. Empirically, compared to the A domain embedded within multimodular megaenzymes, the standalone domain, including fatty acyl- or aryl-AMP ligases, usually possesses broader substrate specificity, thereby introducing a wider range of priming units with reactive chemical moieties into natural product skeletons (Table 2).

Table 2. Representative standalone A domains and the corresponding substrates. (The substrate column consists of chemical structure drawings that are not recoverable from the extracted text.)

product             protein     reference
isoindolinomycin    IdmB21      36
kutzneride          KtzN        37
thiocoraline        TioN        38
streptothricin      Orf19       39
heronamide          HerJ        40
leinamycin          LnmQ        41
INLP                ScoC        42
yersiniabactin      YbtE        43
bacillibactin       DhbE        44
A33853              BomL        45
congocidine         Cgc3*       46
daptomycin          DptE        47
echinocandin B      EcdI        9

In vitro reconstitution is a powerful tool for gaining deep insight into biosynthetic pathways because manipulating a multienzymatic pathway abstracted from a complicated cellular system sheds light on reaction conditions, rate-limiting steps, and cofactor requirements, thus enabling rational engineering. Moreover, the cost of the product isolation step is reduced because fewer cell metabolites are involved. In our case, the chemical diversity of the natural product SF2768 was enhanced on the basis of in vitro reconstitution and a substrate profile survey of a standalone A domain. Although the compounds we isolated (2’, 4’, 9’, and 14’) did not show significant activity against Staphylococcus aureus ATCC 25923, Pseudomonas aeruginosa ATCC 27853, Meloidogyne incognita, or breast cancer MDA-MB-231 cells48, applying this strategy to diversify the molecular architectures of other adenylation-priming lead compounds remains of interest. Presently, miniaturized high-throughput chemical synthesis for constructing compound libraries for screening is a mature field49, 50, and we hope that in vitro multienzyme synthesis will be similarly automated in the near future, thus offering an optional approach for organic chemists to construct natural product-like libraries.

In summary, we completely elucidated the biosynthesis of SF2768 analog 2 via a reconstituted NRPS pathway containing four enzymes. Unlike previous studies using engineering approaches at the protein level to alter substrate specificity, we conducted a thorough investigation of the substrate range of the standalone acyl-AMP ligase SfaB with a commercially available substrate pool. Then, 16 acids were successfully introduced into peptide scaffolds through in vitro reconstitution, generating diversified unnatural compounds. More significantly, this approach based on substrate profiling of the adenylation domain can be readily applied to additional biosynthetic pathways, contributing to the exploration of untapped chemical diversity. Additionally, the Fe2+/α-KG-dependent hydroxylase SfaE was verified to facilitate ring closure of the aldehyde intermediate through hydroxylation and subsequent hemiacetal formation, thereby increasing the chemical complexity of the final product as well.

EXPERIMENTAL SECTION


Materials and Methods. Strains, plasmids, and primers are listed in Tables S2 and S3. General methods were as described previously51 unless otherwise noted. Molecular formulas of the compounds in this study were calculated with the Agilent MassHunter workstation. 1H, 13C, and 2D NMR spectra of compounds were collected on a Bruker AVANCE III 600 MHz spectrometer (Table S4).

General Analytical Procedures.

LC/LC-MS method A (analysis). An Agilent 6540 UHD accurate-mass Q-TOF LC/MS system (coupled with an Agilent 1260 binary pump LC) and a reverse phase column (Agilent Zorbax Eclipse Plus C18 Narrow Bore RR, 2.1×150 mm, 3.5 micron) were used. Solvent A, H2O (0.1 % formic acid); solvent B, MeCN. The elution gradient was 0-10 min 5-40 % A, 10-11 min 40-5 % A, 11-15 min 5 % A. Flow rate was 0.3 mL/min.

LC/LC-MS method B (analysis). The Agilent 6540 and a reverse phase column were utilized (same as method A). Solvent A, H2O (0.1 % formic acid); solvent B, MeCN. The elution gradient was 0-10 min 5-40 % A, 10-15 min 40-90 % A, 15-20 min 90 % A, 20-20.01 min 90-5 % A, 20.01-25 min 5 % A. Flow rate was 0.3 mL/min.

LC/LC-MS method C (semi-preparation). An Agilent 1260 binary pump LC system and a reverse phase semipreparative column (YMC pack ODS-A, 250×10 mm, s-5 μm, 12 nm) were used. UV-vis detector wavelength was 214 nm. The detailed elution gradient and retention time of the target compounds are described in each case.

Detection and isolation of SF2768 analog 2. S. lividans::p13C and S. lividans::pJTU2554 were constructed previously20. They were cultured in YD liquid medium (5 g yeast extract, 10 g maltose, 4 g glucose, 2 g MgCl2, 1.5 g CaCl2, and 1 L H2O) at 30 °C for 7 days. The filtered fermentation broth was injected into a liquid chromatography mass spectrometry (LC-MS) instrument for analysis, and the online platform XCMS was utilized to compare the triplicate LC-MS data (cloud plot parameters: p-value < 0.05, fold change > 10.0, retention time > 5 min, m/z range > 150). Compound 2 was isolated by a 20 L fermentation and LC-MS-guided fractionation. In detail, S. lividans::p13C was grown on YD agar (YD liquid medium supplemented with 25 g agar) at 30 °C for 7 days. The culture was diced and extracted twice with 40 L EtOAc, shaken in flasks at 100 rpm for 2 h (30 °C), and subsequently filtered. The EtOAc was removed on a rotary evaporator. The crude extract was loaded onto an alumina column and eluted with ethanol (100 mL), followed by evaporation to dryness in vacuo. The residue was subjected to a silica gel column and eluted successively with CHCl3 (100 mL) and a mixture of CHCl3-MeOH (9:1 v/v, 100 mL). The latter fraction was collected, evaporated, and purified again on a Sephadex LH-20 column (10% MeOH, 2 mL/min). Fractions 6-10 (5 mL each) were combined, lyophilized, dissolved in MeOH, and subjected to semipreparative high-performance liquid chromatography (HPLC) with 20% MeCN at a flow rate of 2 mL/min (method C). Compound 2 (1.2 mg) was obtained by collecting the peak at 30 min.
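The cloud plot thresholds quoted in the detection step above (p-value < 0.05, fold change > 10.0, retention time > 5 min, m/z > 150) amount to a simple filter over a feature table. The Python sketch below illustrates that filter outside XCMS; it does not use or reproduce the XCMS API. The `Feature` class and `passes_thresholds` function are hypothetical names, and apart from the two m/z values reported for 1 and 2, every number in the example table is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    mz: float           # mass-to-charge ratio
    rt_min: float       # retention time in minutes
    p_value: float      # significance of the between-group difference
    fold_change: float  # intensity ratio, expression strain vs. control


def passes_thresholds(f, p_max=0.05, fc_min=10.0, rt_floor=5.0, mz_floor=150.0):
    """Apply the four cut-offs quoted in the text (all strict inequalities)."""
    return (f.p_value < p_max and f.fold_change > fc_min
            and f.rt_min > rt_floor and f.mz > mz_floor)


# Hypothetical feature table: only the m/z values of 1 and 2 come from the text;
# retention times, p-values, and fold changes are invented.
features = [
    Feature(315.2277, 8.2, 0.001, 250.0),  # compound 2
    Feature(337.1870, 7.5, 0.002, 400.0),  # SF2768 (1)
    Feature(203.0526, 2.1, 0.010, 30.0),   # elutes before 5 min
    Feature(420.1500, 9.0, 0.200, 15.0),   # not significant
    Feature(140.0700, 6.0, 0.001, 50.0),   # below the m/z floor
]

selected = [f for f in features if passes_thresholds(f)]
for f in selected:
    print(f"m/z {f.mz:.4f} at {f.rt_min:.1f} min")
```

Only features that satisfy all four criteria survive, which is how a strain-specific peak such as that of compound 2 would stand out against the empty-vector control.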

Stepwise Optimization of the in Vitro Reconstitution of Compound 2. SfaB-, SfaC-, SfaD-, and SfaE-encoding genes were amplified from the gDNA of Streptomyces thioluteus DSM 40027 by PCR with KOD Plus DNA polymerase (TOYOBO). The purified PCR products were inserted into pET-28a using NdeI and HindIII restriction sites. Protein expression was carried out in E. coli BL21 (DE3) (for SfaB and SfaE) and E. coli BAP123 (for SfaC and SfaD). His-tagged proteins were overexpressed and purified by nickel-affinity chromatography after induction with 0.1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) at 16 °C overnight. SfaC was detected by Tricine-SDS-PAGE52. For LC-MS analysis, compound 2 was synthesized in a 50 μL reconstituted system that contained 50 mM Tris-HCl (pH = 8.0), 1 μM SfaB, 3 μM SfaC, 3 μM SfaD, 1 μM SfaE, 6 mM isovaleric acid, 3 mM L-lysine, 3 mM ATP, 10 mM MgCl2, 1.5 mM NADPH, 2 mM DTT, 10 μM FeSO4, and 5 mM α-KG (initial system) or 50 mM Tris-HCl (pH = 8.0), 1 μM SfaB, 30 μM SfaC, 2.5 μM SfaD, 2.5 μM SfaE, 6 mM isovaleric acid, 3 mM L-lysine, 8 mM ATP, 2.5 mM MgCl2, 1 mM NADPH, 2.5 mM DTT, 10 μM FeSO4, and 30 mM α-KG (optimized system). The reactions were initiated by adding ATP and NADPH, incubated at 30 °C for 6 h, and stopped by adding 50 μL cold MeCN, followed by LC-MS analysis after filtration (method A).

Detection of the Derivatized Aldehyde Intermediates. The aldehyde-selective reagent PFBHA was dissolved in MeOH at a final concentration of 20 mg/mL. Then, 2.5 µL of the PFBHA solution was added to the optimized reconstituted system (100 μL) of compound 2. The reaction was incubated at 30 °C for 3 h and stopped by adding 100 μL cold MeCN, followed by LC-HR-MS analysis after filtration (method B). The reconstituted system without PFBHA was used as a control.

Colorimetric Detection of the Adenylation Activity of SfaB. The SfaB substrate specificity assay was performed as described53, unless otherwise noted.
The reaction was conducted in a 50 μL system consisting of 50 mM Tris-HCl (pH = 8.0), 10 μM SfaB, 5 mM ATP (Sigma Aldrich), 5 mM MgCl2, 32 mM hydroxylamine, and 6 mM carboxylic acid. The reactions were incubated at 22 °C for 30 min. All reagents should be freshly prepared. The reaction components should be mixed thoroughly on ice and then transferred to a 22 °C bath. A total of 75 carboxylic acids were tested.

Kinetic Parameter Determination of SfaB. The reactions were conducted in a 50 μL system consisting of 50 mM Tris-HCl (pH = 8.0), 18 μM SfaB, 10 mM ATP (Sigma Aldrich), 10 mM MgCl2, 32 mM hydroxylamine, and carboxylic acid (variable concentration). The reactions were incubated at 22 °C for 30 min and stopped by adding 50 μL of cold MeCN. The concentration of released AMP was monitored by LC-HR-MS (method A). Kinetic parameters were calculated with GraphPad Prism 7.0.

Biosynthesis of SF2768 Analogs by Incorporating Different Acid Building Blocks in the Reconstituted NRPS System. In vitro reconstitution was performed in 50 μL reactions containing 50 mM Tris-HCl (pH = 8.0), 1 μM SfaB, 30 μM SfaC, 2.5 μM SfaD, 2.5 μM SfaE, 6 mM carboxylic acid (75 carboxylic acids were tested individually), 3 mM L-lysine, 8 mM ATP, 2.5 mM MgCl2, 1 mM NADPH, 2.5 mM DTT, 10 μM FeSO4, and 30 mM α-KG. The reactions were initiated by adding ATP and NADPH, incubated at 30 °C for 6 h, and stopped by adding 50 μL of cold MeCN. After filtration, the samples were analyzed by LC-MS (method A). To detect the acyl-PCPs in the reconstituted system, the reactions were filtered and subjected directly to LC-HR-MS without adding MeCN to precipitate proteins. The deconvoluted mass was determined with MassHunter Qualitative Analysis (Agilent).

Large-scale Enzymatic Preparation of Analogs 2’, 4’, 9’, and 14’. Large-scale enzymatic reactions were performed in a 20 mL volume that contained 50 mM Tris-HCl (pH = 8.0), 1 μM SfaB, 30 μM SfaC, 2.5 μM SfaD, 6 mM carboxylic acid (isovaleric acid for 2’, valeric acid for 4’, 3-chlorobutyric acid for 9’, 4-pentynoic acid for 14’), 3 mM L-lysine, 8 mM ATP, 2.5 mM MgCl2, 1 mM NADP, 6 mM glucose-6-phosphate, 0.5 μM glucose-6-phosphate dehydrogenase (cloned from E. coli BL21), and 2.5 mM DTT. The reactions were incubated at 30 °C for 6 h and lyophilized. The residue of each reaction was dissolved in MeOH, filtered, and subjected to semipreparative HPLC (method C). Solvent A was H2O and solvent B was MeCN, with isocratic elution at a flow rate of 2 mL/min (2’: 30% MeCN, retention time = 16 min; 4’: 30% MeCN, retention time = 18 min; 9’: 30% MeCN, retention time = 15 min; 14’: 15% MeCN, retention time = 24 min).
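The reaction compositions in this section are given as final concentrations. As a rough bench aid, the sketch below converts the small-molecule components of the optimized 50 μL system into pipetting volumes using C1V1 = C2V2. Only the final concentrations are taken from the text. The stock concentrations and the helper names `pipetting_volumes` and `remaining_volume` are hypothetical placeholders, and the enzymes are left out because their stock concentrations are not reported.

```python
# Final concentrations (mM) of the small-molecule components of the
# optimized 50 uL system, as listed in the text.
FINAL_MM = {
    "Tris-HCl (pH 8.0)": 50.0,
    "isovaleric acid": 6.0,
    "L-lysine": 3.0,
    "ATP": 8.0,
    "MgCl2": 2.5,
    "NADPH": 1.0,
    "DTT": 2.5,
    "FeSO4": 0.010,
    "alpha-KG": 30.0,
}

# HYPOTHETICAL stock concentrations (mM); replace with the real stocks.
STOCK_MM = {
    "Tris-HCl (pH 8.0)": 1000.0,
    "isovaleric acid": 100.0,
    "L-lysine": 100.0,
    "ATP": 100.0,
    "MgCl2": 100.0,
    "NADPH": 50.0,
    "DTT": 100.0,
    "FeSO4": 1.0,
    "alpha-KG": 500.0,
}


def pipetting_volumes(final_mm, stock_mm, total_ul):
    """Return {component: volume in uL} from C1 * V1 = C2 * V2."""
    volumes = {}
    for name, c_final in final_mm.items():
        c_stock = stock_mm[name]
        if c_stock < c_final:
            raise ValueError(f"stock of {name} is too dilute")
        volumes[name] = c_final * total_ul / c_stock
    return volumes


def remaining_volume(volumes, total_ul):
    """Volume (uL) left for enzymes and water after the small molecules."""
    left = total_ul - sum(volumes.values())
    if left < 0:
        raise ValueError("stocks are too dilute for this reaction volume")
    return left


if __name__ == "__main__":
    vols = pipetting_volumes(FINAL_MM, STOCK_MM, total_ul=50.0)
    for name, v in vols.items():
        print(f"{name:18s} {v:6.2f} uL")
    print(f"left for enzymes/water: {remaining_volume(vols, 50.0):.2f} uL")
```

Because ATP and NADPH initiate the reaction, their volumes would be added last. The same function can be called with another `total_ul`, but the large-scale preparation uses a different component list (NADP with a glucose-6-phosphate regeneration system, no SfaE), so its dictionary would need to be adapted.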

ASSOCIATED CONTENT Supporting Information. The Supporting Information is available free of charge on the ACS Publications website at DOI: Material lists, spectroscopic data (PDF)

AUTHOR INFORMATION Corresponding Author *[email protected]

ORCID Mengyi Zhu: 0000-0001-8508-4466 Lijuan Wang: 0000-0002-2348-6950 Jing He: 0000-0003-1392-6040

Notes The authors declare no competing financial interest.

ACKNOWLEDGMENT This work was supported by the National Natural Science Foundation of China (31870089), the Natural Science Foundation for Distinguished Young Scholars of Hubei Province of China (No. 2018CFA069), the Fundamental Research Funds for the Central Universities (No. 2662018PY053), and the Open Project Program of Guangdong Key Laboratory of Marine Materia Medica (LMM2018-4). M. Zhu thanks T. Liu (Wuhan University) for training in enzymatic assays. We thank R. Zhao (National Center for Nanoscience and Technology, Beijing) for performing the cell toxicity assay, D. Dai (College of Life Science and Technology, Huazhong Agricultural University) for performing the Meloidogyne incognita killing assay, and X. Zhang (College of Science, Huazhong Agricultural University) for NMR data collection.

REFERENCES
(1) Koglin, A.; Walsh, C. T., Structural insights into nonribosomal peptide enzymatic assembly lines. Nat. Prod. Rep. 2009, 26, 987-1000.
(2) Du, L.; Lou, L., PKS and NRPS release mechanisms. Nat. Prod. Rep. 2010, 27, 255-78.
(3) Marahiel, M. A., A structural model for multimodular NRPS assembly lines. Nat. Prod. Rep. 2016, 33, 136-40.
(4) Sattely, E. S.; Fischbach, M. A.; Walsh, C. T., Total biosynthesis: in vitro reconstitution of polyketide and nonribosomal peptide pathways. Nat. Prod. Rep. 2008, 25, 757-93.
(5) Ehmann, D. E.; Shaw-Reid, C. A.; Losey, H. C.; Walsh, C. T., The EntF and EntE adenylation domains of Escherichia coli enterobactin synthetase: sequestration and selectivity in acyl-AMP transfers to thiolation domain cosubstrates. Proc. Natl. Acad. Sci. U. S. A. 2000, 97, 2509-14.
(6) Gaitatzis, N.; Kunze, B.; Muller, R., In vitro reconstitution of the myxochelin biosynthetic machinery of Stigmatella aurantiaca Sg a15: Biochemical characterization of a reductive release mechanism from nonribosomal peptide synthetases. Proc. Natl. Acad. Sci. U. S. A. 2001, 98, 11136-41.
(7) May, J. J.; Wendrich, T. M.; Marahiel, M. A., The dhb operon of Bacillus subtilis encodes the biosynthetic template for the catecholic siderophore 2,3-dihydroxybenzoate-glycine-threonine trimeric ester bacillibactin. J. Biol. Chem. 2001, 276, 7209-17.
(8) Wittmann, M.; Linne, U.; Pohlmann, V.; Marahiel, M. A., Role of DptE and DptF in the lipidation reaction of daptomycin. FEBS J. 2008, 275, 5343-54.
(9) Cacho, R. A.; Jiang, W.; Chooi, Y. H.; Walsh, C. T.; Tang, Y., Identification and characterization of the echinocandin B biosynthetic gene cluster from Emericella rugulosa NRRL 11440. J. Am. Chem. Soc. 2012, 134, 16781-90.
(10) Challis, G. L.; Ravel, J.; Townsend, C. A., Predictive, structure-based model of amino acid recognition by nonribosomal peptide synthetase adenylation domains. Chem. Biol. 2000, 7, 211-24.
(11) Rottig, M.; Medema, M. H.; Blin, K.; Weber, T.; Rausch, C.; Kohlbacher, O., NRPSpredictor2--a web server for predicting NRPS adenylation domain specificity. Nucleic Acids Res. 2011, 39, W362-7.
(12) Blin, K.; Wolf, T.; Chevrette, M. G.; Lu, X.; Schwalen, C. J.; Kautsar, S. A.; Suarez Duran, H. G.; de Los Santos, E. L. C.; Kim, H. U.; Nave, M.; Dickschat, J. S.; Mitchell, D. A.; Shelest, E.; Breitling, R.; Takano, E.; Lee, S. Y.; Weber, T.; Medema, M. H., antiSMASH 4.0-improvements in chemistry prediction and gene cluster boundary identification. Nucleic Acids Res. 2017, 45, W36-W41.
(13) Uguru, G. C.; Milne, C.; Borg, M.; Flett, F.; Smith, C. P.; Micklefield, J., Active-site modifications of adenylation domains lead to hydrolysis of upstream nonribosomal peptidyl thioester intermediates. J. Am. Chem. Soc. 2004, 126, 5032-3.
(14) Villiers, B.; Hollfelder, F., Directed evolution of a gatekeeper domain in nonribosomal peptide synthesis. Chem. Biol. 2011, 18, 1290-9.
(15) Thirlway, J.; Lewis, R.; Nunns, L.; Al Nakeeb, M.; Styles, M.; Struck, A. W.; Smith, C. P.; Micklefield, J., Introduction of a non-natural amino acid into a nonribosomal peptide antibiotic by modification of adenylation domain specificity. Angew. Chem., Int. Ed. Engl. 2012, 51, 7181-4.
(16) Zhang, K.; Nelson, K. M.; Bhuripanyo, K.; Grimes, K. D.; Zhao, B.; Aldrich, C. C.; Yin, J., Engineering the substrate specificity of the DhbE adenylation domain by yeast cell surface display. Chem. Biol. 2013, 20, 92-101.
(17) Wang, M.; Zhao, H., Characterization and Engineering of the Adenylation Domain of a NRPS-Like Protein: A Potential Biocatalyst for Aldehyde Generation. ACS Catal. 2014, 4, 1219-1225.
(18) Bian, X.; Plaza, A.; Yan, F.; Zhang, Y.; Muller, R., Rational and efficient site-directed mutagenesis of adenylation domain alters relative yields of luminmide derivatives in vivo. Biotechnol. Bioeng. 2015, 112, 1343-53.
(19) Kries, H.; Niquille, D. L.; Hilvert, D., A subdomain swap strategy for reengineering nonribosomal peptides. Chem. Biol. 2015, 22, 640-8.
(20) Wang, L.; Zhu, M.; Zhang, Q.; Zhang, X.; Yang, P.; Liu, Z.; Deng, Y.; Zhu, Y.; Huang, X.; Han, L.; Li, S.; He, J., Diisonitrile Natural Product SF2768 Functions As a Chalkophore That Mediates Copper Acquisition in Streptomyces thioluteus. ACS Chem. Biol. 2017, 12, 3067-3075.
(21) Du, L.; Shen, B., Identification and characterization of a type II peptidyl carrier protein from the bleomycin producer Streptomyces verticillus ATCC 15003. Chem. Biol. 1999, 6, 507-17.
(22) Tautenhahn, R.; Patti, G. J.; Rinehart, D.; Siuzdak, G., XCMS Online: a web-based platform to process untargeted metabolomic data. Anal. Chem. 2012, 84, 5035-9.
(23) Pfeifer, B. A.; Admiraal, S. J.; Gramajo, H.; Cane, D. E.; Khosla, C., Biosynthesis of complex polyketides in a metabolically engineered strain of E. coli. Science 2001, 291, 1790-2.
(24) Read, J. A.; Walsh, C. T., The lyngbyatoxin biosynthetic assembly line: chain release by four-electron reduction of a dipeptidyl thioester to the corresponding alcohol. J. Am. Chem. Soc. 2007, 129, 15762-3.
(25) Li, Y.; Weissman, K. J.; Muller, R., Myxochelin biosynthesis: direct evidence for two- and four-electron reduction of a carrier protein-bound thioester. J. Am. Chem. Soc. 2008, 130, 7554-5.
(26) Peng, H.; Wei, E.; Wang, J.; Zhang, Y.; Cheng, L.; Ma, H.; Deng, Z.; Qu, X., Deciphering Piperidine Formation in Polyketide-Derived Indolizidines Reveals a Thioester Reduction, Transamination, and Unusual Imine Reduction Process. ACS Chem. Biol. 2016, 11, 3278-3283.
(27) Awodi, U. R.; Ronan, J. L.; Masschelein, J.; de los Santos, E. L. C.; Challis, G. L., Thioester reduction and aldehyde transamination are universal steps in actinobacterial polyketide alkaloid biosynthesis. Chem. Sci. 2017, 8, 411-415.
(28) Chhabra, A.; Haque, A. S.; Pal, R. K.; Goyal, A.; Rai, R.; Joshi, S.; Panjikar, S.; Pasha, S.; Sankaranarayanan, R.; Gokhale, R. S., Nonprocessive [2 + 2]e- off-loading reductase domains from mycobacterial nonribosomal peptide synthetases. Proc. Natl. Acad. Sci. U. S. A. 2012, 109, 5681-6.
(29) We chose to remove SfaE and its cofactors from the preparation system because the retention times of the hemiacetal and acyclic congeners were very close in each case, which caused difficulties for separation.
(30) Belshaw, P. J.; Walsh, C. T.; Stachelhaus, T., Aminoacyl-CoAs as probes of condensation domain selectivity in nonribosomal peptide synthesis. Science 1999, 284, 486-9.
(31) Meyer, S.; Kehr, J. C.; Mainz, A.; Dehm, D.; Petras, D.; Sussmuth, R. D.; Dittmann, E., Biochemical Dissection of the Natural Diversification of Microcystin Provides Lessons for Synthetic Biology of NRPS. Cell Chem. Biol. 2016, 23, 462-71.
(32) Luo, L.; Burkart, M. D.; Stachelhaus, T.; Walsh, C. T., Substrate recognition and selection by the initiation module PheATE of gramicidin S synthetase. J. Am. Chem. Soc. 2001, 123, 11208-18.
(33) Trauger, J. W.; Kohli, R. M.; Walsh, C. T., Cyclization of backbone-substituted peptides catalyzed by the thioesterase domain from the tyrocidine nonribosomal peptide synthetase. Biochemistry 2001, 40, 7092-8.
(34) Gaudelli, N. M.; Townsend, C. A., Epimerization and substrate gating by a TE domain in beta-lactam antibiotic biosynthesis. Nat. Chem. Biol. 2014, 10, 251-8.
(35) Schmelz, S.; Naismith, J. H., Adenylate-forming enzymes. Curr. Opin. Struct. Biol. 2009, 19, 666-671.
(36) Thong, W. L.; Shin-Ya, K.; Nishiyama, M.; Kuzuyama, T., Discovery of an antibacterial isoindolinone-containing tetracyclic polyketide by cryptic gene activation and characterization of its biosynthetic gene cluster. ACS Chem. Biol. 2018, 13, 2615-2622.
(37) Strieker, M.; Nolan, E. M.; Walsh, C. T.; Marahiel, M. A., Stereospecific synthesis of threo- and erythro-beta-hydroxyglutamic acid during kutzneride biosynthesis. J. Am. Chem. Soc. 2009, 131, 13523-30.
(38) Al-Mestarihi, A. H.; Villamizar, G.; Fernandez, J.; Zolova, O. E.; Lombo, F.; Garneau-Tsodikova, S., Adenylation and S-methylation of cysteine by the bifunctional enzyme TioN in thiocoraline biosynthesis. J. Am. Chem. Soc. 2014, 136, 17350-4.
(39) Maruyama, C.; Toyoda, J.; Kato, Y.; Izumikawa, M.; Takagi, M.; Shin-ya, K.; Katano, H.; Utagawa, T.; Hamano, Y., A stand-alone adenylation domain forms amide bonds in streptothricin biosynthesis. Nat. Chem. Biol. 2012, 8, 791-7.
(40) Zhu, Y.; Zhang, W.; Chen, Y.; Yuan, C.; Zhang, H.; Zhang, G.; Ma, L.; Zhang, Q.; Tian, X.; Zhang, S.; Zhang, C., Characterization of Heronamide Biosynthesis Reveals a Tailoring Hydroxylase and Indicates Migrated Double Bonds. Chembiochem 2015, 16, 2086-93.
(41) Tang, G. L.; Cheng, Y. Q.; Shen, B., Leinamycin biosynthesis revealing unprecedented architectural complexity for a hybrid polyketide synthase and nonribosomal peptide synthetase. Chem. Biol. 2004, 11, 33-45.
(42) Harris, N.; Born, D.; Cai, W.; Huang, Y.; Martin, J.; Khalaf, R.; Drennan, C.; Zhang, W., Isonitrile Formation by a Non-heme Iron(II)-Dependent Oxidase/Decarboxylase. Angew. Chem., Int. Ed. Engl. 2018, 57, 9707-9710.
(43) Ahmadi, M. K.; Fawaz, S.; Jones, C. H.; Zhang, G.; Pfeifer, B. A., Total Biosynthesis and Diverse Applications of the Nonribosomal Peptide-Polyketide Siderophore Yersiniabactin. Appl. Environ. Microbiol. 2015, 81, 5290-8.
(44) May, J. J.; Kessler, N.; Marahiel, M. A.; Stubbs, M. T., Crystal structure of DhbE, an archetype for aryl acid activating domains of modular nonribosomal peptide synthetases. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 12120-5.
(45) Lv, M.; Zhao, J.; Deng, Z.; Yu, Y., Characterization of the Biosynthetic Gene Cluster for Benzoxazole Antibiotics A33853 Reveals Unusual Assembly Logic. Chem. Biol. 2015, 22, 1313-24.
(46) Al-Mestarihi, A. H.; Garzan, A.; Kim, J. M.; Garneau-Tsodikova, S., Enzymatic Evidence for a Revised Congocidine Biosynthetic Pathway. Chembiochem 2015, 16, 1307-13.
(47) Grunewald, J.; Sieber, S. A.; Mahlert, C.; Linne, U.; Marahiel, M. A., Synthesis and derivatization of daptomycin: a chemoenzymatic route to acidic lipopeptide antibiotics. J. Am. Chem. Soc. 2004, 126, 17025-31.
(48) 1 mL optimized in vitro reactions (four enzymes, 16 acids that can be incorporated were utilized individually) were lyophilized and then extracted by 100 μL MeOH (each reaction was estimated to generate 10 to 120 μg product). The extracts were subjected to the agar diffusion assays against S. aureus ATCC 25923 and P. aeruginosa ATCC 27853. No significant antimicrobial activity was observed after incubation at 30 °C for 12 h.
(49) Buitrago Santanilla, A.; Regalado, E. L.; Pereira, T.; Shevlin, M.; Bateman, K.; Campeau, L. C.; Schneeweis, J.; Berritt, S.; Shi, Z. C.; Nantermet, P.; Liu, Y.; Helmy, R.; Welch, C. J.; Vachal, P.; Davies, I. W.; Cernak, T.; Dreher, S. D., Organic chemistry. Nanomole-scale high-throughput chemistry for the synthesis of complex molecules. Science 2015, 347, 49-53.
(50) Gesmundo, N. J.; Sauvagnat, B.; Curran, P. J.; Richards, M. P.; Andrews, C. L.; Dandliker, P. J.; Cernak, T., Nanoscale synthesis and affinity ranking. Nature 2018, 557, 228-232.
(51) Kieser, T.; Bibb, M. J.; Buttner, M. J.; Chater, K. F.; Hopwood, D. A., Practical Streptomyces Genetics. The John Innes Foundation, Norwich 2000.
(52) Schagger, H., Tricine-SDS-PAGE. Nat. Protoc. 2006, 1, 16-22.
(53) Maruyama, C.; Niikura, H.; Takakuwa, M.; Katano, H.; Hamano, Y., Colorimetric Detection of the Adenylation Activity in Nonribosomal Peptide Synthetases. Methods Mol. Biol. 2016, 1401, 77-84.

Figure 1. (A) Biosynthetic pathway of SF2768 and its analogs. (B) A new SF2768 analog, 2, was discovered by comparing the metabolic profiles of the heterologous strain S. lividans::p13C and the control carrying the empty vector pJTU2554. The cosmid p13C, derived from pJTU2554, contains the intact sfa biosynthetic gene cluster. (C) Metabolic profiles of S. lividans::p13C and its derivative mutants ΔSfaA-E (ref 20). Legend: C, condensation domain; A, adenylation domain; PCP, peptidyl carrier protein; R, reductase domain; α-KG, α-ketoglutaric acid; AMP, adenosine monophosphate; PPi, pyrophosphate.

Figure 2. In vitro reconstitution of the biosynthetic pathway of compound 2 via a four-enzyme cascade. (A) Absence of SfaE, an Fe2+/α-KG-dependent hydroxylase, from the reconstituted system leads to the synthesis of an acyclic congener, 2’. The reaction without NADPH is presented as the representative negative control for clarity. (B) Titration of substrates and cofactors: (i) to (iii), optimization of the enzyme ratios; (iv) to (ix), optimization of cofactor concentrations. The yield of compound 2 is represented by ion intensity. Data are presented as the average of two experiments.

Figure 3. (A) PFBHA derivatization of the putative aldehyde intermediates 2a and 2b. Aldehyde 2a is hydroxylated by SfaE to afford 2b, followed by spontaneous hemiacetal formation. (B) LC-MS detection of the aldehyde 2a and PFBO derivatives 2a’ and 2b’ in the reconstituted NRPS system.

Figure 4. In vitro biosynthesis of SF2768 analogs through incorporation of different building blocks. (A) (i) Investigation of the substrate specificity of SfaB toward various carboxylic acids. The background (velocity of a no-enzyme control) was subtracted in each case. The method for colorimetric detection of free PPi is illustrated in the box. “+” indicates that the activity of SfaB toward the corresponding acid was validated by LC-MS. BTPPACl: bis(triphenylphosphoranylidene)ammonium chloride. (ii) Relative abundance of the SF2768 analogs produced via the reconstituted NRPS (SfaB-SfaE) system. The results were calculated by normalizing the summed integrated area of both hemiacetal (blue) and acyclic (yellow) congeners in each case. Error bars represent standard deviations (n = 3). (B) LC-MS analysis of the SF2768 analogs biosynthesized in this study. The products are illustrated by EIC (extracted ion chromatogram) overlays of different colors (blue, cyclic hemiacetals; yellow, ring-opened species). Asterisks indicate unknown substances with m/z values identical to those of the corresponding numbered products. (C) The molecular formulas and theoretical mass-to-charge ratios of the products. (D) Key 2D NMR spectroscopic data of the representative compounds purified in this study.
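Panel (A)(ii) reports relative abundance from summed integrated EIC areas. The sketch below shows one way such a normalization could be computed. It assumes that the hemiacetal and acyclic areas of each analog are divided by the largest summed area in the set, which the caption does not state explicitly. The peak areas are made-up placeholders rather than measured values, and `relative_abundance` is a hypothetical helper.

```python
def relative_abundance(areas):
    """Scale peak areas by the largest summed area in the set.

    areas: {analog: (hemiacetal_area, acyclic_area)}
    Returns {analog: (hemiacetal_fraction, acyclic_fraction)}.
    The choice of reference (largest summed area) is an assumption.
    """
    totals = {k: h + a for k, (h, a) in areas.items()}
    ref = max(totals.values())
    if ref <= 0:
        raise ValueError("no product detected")
    return {k: (h / ref, a / ref) for k, (h, a) in areas.items()}


# Placeholder integrated EIC areas (arbitrary units), not real data.
example = {"2": (8.0e6, 2.0e6), "4": (3.0e6, 2.0e6), "9": (1.0e6, 0.0)}
rel = relative_abundance(example)
for analog, (hemi, acyclic) in rel.items():
    print(f"analog {analog}: hemiacetal {hemi:.2f}, acyclic {acyclic:.2f}")
```

Keeping the two congeners as separate fractions preserves the stacked blue and yellow contributions while their sum gives the overall relative abundance of each analog.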

TOC graphic
