Structural Insights into the Free-Standing Condensation Enzyme


Structural Insights into the Free-standing Condensation Enzyme SgcC5 Catalyzing Ester Bond Formation in the Biosynthesis of the Enediyne Antitumor Antibiotic C-1027 Chin-Yuan Chang, Jeremy R. Lohman, Tingting Huang, Karolina Michalska, Lance Bigelow, Jeffrey D. Rudolf, Robert Jedrzejczak, Xiaohui Yan, Ming Ma, Gyorgy Babnigg, Andrzej Joachimiak, George N. Phillips, and Ben Shen Biochemistry, Just Accepted Manuscript • DOI: 10.1021/acs.biochem.8b00174 • Publication Date (Web): 13 Mar 2018 Downloaded from http://pubs.acs.org on March 15, 2018




Biochemistry

Structural Insights into the Free-standing Condensation Enzyme SgcC5 Catalyzing Ester Bond Formation in the Biosynthesis of the Enediyne Antitumor Antibiotic C-1027

Chin-Yuan Chang,† Jeremy R. Lohman,† Tingting Huang,† Karolina Michalska,‡ Lance Bigelow,‡ Jeffrey D. Rudolf,† Robert Jedrzejczak,‡ Xiaohui Yan,† Ming Ma,† Gyorgy Babnigg,‡,§ Andrzej Joachimiak,‡,§,||, George N. Phillips, Jr., and Ben Shen*,†,#,@



†Department of Chemistry, The Scripps Research Institute, Jupiter, FL 33458, United States; ‡Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, Argonne, IL 60439, United States; §Center for Structural Genomics of Infectious Diseases, University of Chicago, Chicago, IL 60637, United States; ||Structural Biology Center, Biosciences Division, Argonne National Laboratory, Argonne, IL 60439, United States; BioSciences at Rice and Department of Chemistry, Rice University, Houston, TX 77251, United States; #Department of Molecular Medicine, The Scripps Research Institute, Jupiter, FL 33458, United States; @Natural Products Library Initiative at The Scripps Research Institute, The Scripps Research Institute, Jupiter, FL 33458, United States

*To whom correspondence should be addressed: The Scripps Research Institute, 130 Scripps Way, #3A1, Jupiter, FL 33458; Tel: (561) 228-2456; Email: [email protected]

Running title: Crystal structure of SgcC5 in C-1027 biosynthesis


ACS Paragon Plus Environment


ABSTRACT C-1027 is a chromoprotein enediyne antitumor antibiotic, consisting of the CagA apo-protein and the C-1027 chromophore. The C-1027 chromophore features a nine-membered enediyne core appended with three peripheral moieties, including an (S)-3-chloro-5-hydroxy-β-tyrosine. In a convergent biosynthesis of the C-1027 chromophore, the (S)-3-chloro-5-hydroxy-β-tyrosine moiety is appended to the enediyne core by the free-standing condensation enzyme SgcC5. Unlike canonical condensation domains from the modular nonribosomal peptide synthetases that catalyze amide bond formation, SgcC5 catalyzes ester bond formation, as demonstrated in vitro, between SgcC2-tethered (S)-3-chloro-5-hydroxy-β-tyrosine and (R)-1-phenyl-1,2-ethanediol, a mimic of the enediyne core as an acceptor substrate. Here, we report that: (i) genes encoding SgcC5 homologues are widespread among both experimentally confirmed and bioinformatically predicted enediyne biosynthetic gene clusters, forming a new clade of condensation enzymes; (ii) SgcC5 shares a similar overall structure with the canonical condensation domains but forms a homodimer in solution, the active site of which is located in a cavity rather than a tunnel typically seen in condensation domains; and (iii) the catalytic histidine of SgcC5 activates the 2-hydroxyl group, while a hydrogen-bond network in SgcC5 prefers the (R)-enantiomer of the acceptor substrate, accounting for the regio- and stereospecific ester bond formation between SgcC2-tethered (S)-3-chloro-5-hydroxy-β-tyrosine and (R)-1-phenyl-1,2-ethanediol upon acid-base catalysis. These findings expand the catalytic repertoire and reveal new insights into the structure and mechanism of condensation enzymes.

Keywords: C-1027, enediyne, condensation enzyme, ester bond formation, SgcC5 structure


INTRODUCTION Nonribosomal peptides are widespread and structurally diverse natural products exhibiting a broad range of biological activities.1 Nonribosomal peptides are biosynthesized by modular nonribosomal peptide synthetases (NRPSs), with each module typically consisting of a minimum of three core domains: adenylation (A), peptidyl carrier protein (PCP), and condensation (C). The A domain selects and activates a specific amino acid, the PCP domain channels the activated amino acid or the growing peptide intermediate via the phosphopantetheine (P-pant) arm, and the C domain catalyzes amide bond formation between a PCP-tethered amino acid acceptor and a PCP-tethered growing peptide donor (Figure S1A).1 While the three core domains constitute minimal NRPS modules to afford the linear peptide backbone, NRPS modules can also harbor additional domains, such as cyclization (Cy), epimerization (E), methylation (Mt), oxidation (Ox), reduction (R), and thioesterase (TE) domains, giving rise to the vast structural diversity known for nonribosomal peptide natural products. Commonly known as large multifunctional proteins, NRPSs consist of either (i) multiple domains residing on the same polypeptide to constitute a functional module or (ii) discrete proteins where each protein is the functional equivalent of a single domain (i.e., a free-standing A, PCP, or C enzyme).1,2

C-1027 is a chromoprotein enediyne antitumor antibiotic produced by several Streptomyces species,3 consisting of the CagA apo-protein and the C-1027 chromophore. The C-1027 chromophore features a nine-membered enediyne core, a deoxyaminosugar, a benzoxazolinate, and an (S)-3-chloro-5-hydroxy-β-tyrosine (Figure 1).4 In our effort to study C-1027 biosynthesis in Streptomyces globisporus, we previously identified three genes, sgcC1, sgcC2, and sgcC5, within the C-1027 biosynthetic gene cluster, that encoded three free-standing A (SgcC1), PCP (SgcC2), and C (SgcC5) enzymes.4 We subsequently established that SgcC1, SgcC2, and SgcC5 constituted an NRPS module that was responsible for the biosynthesis and installation of the (S)-3-chloro-5-hydroxy-β-tyrosyl moiety onto the enediyne core of the C-1027 chromophore.5 We showed that SgcC1 specifically activated (S)-β-tyrosine and loaded it onto SgcC2.6 The SgcC2-tethered (S)-β-tyrosine then underwent regioselective chlorination and hydroxylation, by SgcC37 and SgcC,8 respectively, two additional discrete enzymes from the C-1027 biosynthetic machinery that act in trans only on an SgcC2-tethered substrate, to yield the SgcC2-tethered (S)-3-chloro-5-hydroxy-β-tyrosine. Using (R)-1-phenyl-1,2-ethanediol as a mimic of the enediyne core acceptor substrate, we finally demonstrated in vitro that SgcC5 indeed catalyzed the regio- and stereospecific condensation between the SgcC2-tethered (S)-3-chloro-5-hydroxy-β-tyrosine and (R)-1-phenyl-1,2-ethanediol, forming an ester bond (Figure 1).9 Remarkably, SgcC5 demonstrated significant substrate promiscuity towards both the SgcC2-tethered donor and the free acceptor in vivo10 and in vitro.5,9 As depicted in Figure 1, SgcC5 used both (S)-β-tyrosine and (S)-3-chloro-5-hydroxy-β-tyrosine as donor substrates, as long as they were tethered to SgcC2.9 SgcC5 was also capable of catalyzing amide bond formation between SgcC2-tethered (S)-3-chloro-5-hydroxy-β-tyrosine and (R)-2-amino-1-phenyl-1-ethanol, an alternative mimic of the enediyne core acceptor substrate bearing an amine at its C-2 position, exemplifying acceptor substrate promiscuity.5

Figure 1. The free-standing condensation enzyme SgcC5 catalyzing ester bond formation in the biosynthesis of the C-1027 chromophore. Path a: The proposed pathway for installation of the (S)-3-chloro-5-hydroxy-β-tyrosine moiety (red) onto the enediyne core (blue) by SgcC5 in a convergent biosynthesis of the C-1027 chromophore in S. globisporus.4 Path b: SgcC5 catalyzing both ester (X = O) and amide (X = NH) bond formation in vitro between the SgcC2-tethered (S)-β-tyrosine (R1 = R2 = H) or (S)-3-chloro-5-hydroxy-β-tyrosine (R1 = OH and R2 = Cl) (red) as a donor substrate and (R)-1-phenyl-1,2-ethanediol (X = O) or (R)-2-amino-1-phenyl-1-ethanol (X = NH) (blue), mimics of the enediyne core, as an acceptor substrate, respectively, showing substrate promiscuity towards both the SgcC2-tethered donors and the free acceptors.5


While the hallmark chemistry of NRPS is the C domain-catalyzed amide bond formation between a PCP-tethered amino acid acceptor and a PCP-tethered growing peptide donor (Figure S1A), C domains that catalyze ester bond formation are known but rare. For instance, the C domains located at the C terminus of RapP and FkbP from the rapamycin and FK520 biosynthetic machineries, respectively, catalyze intramolecular cyclization via ester bond formation between the PCP-tethered hybrid polyketide-peptide intermediate as the donor substrate and the –OH group at the distal end of the hybrid polyketide-peptide intermediate as the acceptor substrate (Figure S1B).11,12 The PCP-C didomain enzyme Fum14 from the fumonisin biosynthetic machinery catalyzes regioselective double ester bond formation between the PCP-tethered donor substrate and an acceptor substrate bearing multiple –OH and –NH2 groups (Figure S1C).13 While these studies demonstrate the catalytic versatility of the C domain to catalyze ester bond formation in natural product biosynthesis, deviating from the amide-forming chemistry known for canonical C domains, SgcC5 is the only C enzyme or domain known to date that catalyzes both ester and amide bond formation, providing an evolutionary link between amide- and ester-forming C enzymes.

Since the first X-ray structure of the free-standing C enzyme VibH was reported in 2002,14 a total of seven structures of C domains15-20 or free-standing C enzymes14 have been solved. While these studies have fundamentally advanced our current understanding of C enzyme structure and catalysis, the structure and molecular mechanism of an ester-forming C enzyme remain elusive. Here, we report (i) the widespread distribution of SgcC5 homologues among known or predicted 9-membered enediyne biosynthetic machineries, emerging as a distinct clade of C enzymes that specifically catalyze ester bond formation, and (ii) the X-ray structures of SgcC5, both in its apo form and in complex with (R)-1-(2-naphthyl)-1,2-ethanediol, a mimic of the enediyne core acceptor substrate. The molecular details for substrate binding, the regio- and stereospecific ester bond formation between the donor and acceptor substrates, and the mechanism of SgcC5 catalysis are discussed.

MATERIALS AND METHODS Bioinformatics Analysis to Search for SgcC5 Homologues. The amino acid sequences of the C domains utilized to generate the phylogenetic tree in a previous study,21 along with the sequences of the C enzymes from the four known and the 31 putative nine-membered enediyne biosynthetic machineries, as well as three CoA-dependent transferases, were collected to build a more comprehensive phylogenetic tree. A bootstrap consensus tree was generated by MEGA6 using the maximum likelihood method with a bootstrap test of 1000 replicates.22

Site-Directed Mutagenesis of sgcC5. Plasmids of the sgcC5 mutants, pBS1161 (H154A), pBS1162 (H154E), and pBS1163 (H154Q), were constructed by the QuikChange site-directed mutagenesis method, following the protocol provided by the manufacturer (Agilent Technologies) and using pBS1093 as a template.5 The primers used in this study are summarized in Table S1. The mutations were verified by DNA sequencing. Each of the mutant constructs was then transformed into E. coli BL21 (DE3) for gene expression and protein production.

Production and Purification of SgcC5 Wild-Type and Mutant Enzymes for Activity Assay. Protein production in E. coli BL21 (DE3) and purification of the recombinant proteins (wild-type and mutants) by Ni-NTA affinity chromatography were performed following the previously published procedures.5 The protein purity was verified by 12% SDS-PAGE, and the concentrations were determined from the absorbance at 280 nm using the calculated molar absorption coefficient (ε280 = 47,440 M−1 cm−1).23
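The concentration determination described above follows the Beer-Lambert law, c = A280 / (ε280 · l). The following is a minimal sketch of that arithmetic, not the authors' procedure: the function names, the 1 cm path length, the dilution factor, and the molecular weight used in the unit conversion are illustrative assumptions; only ε280 = 47,440 M−1 cm−1 is taken from the text.

```python
EPSILON_280 = 47_440.0  # M^-1 cm^-1, calculated coefficient quoted in the text


def molar_concentration(a280: float,
                        epsilon: float = EPSILON_280,
                        path_cm: float = 1.0,
                        dilution: float = 1.0) -> float:
    """Beer-Lambert law: c = A / (epsilon * l), scaled by a dilution factor.

    path_cm and dilution are assumptions; the text does not state them.
    Returns the concentration of the undiluted sample in mol/L.
    """
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 * dilution / (epsilon * path_cm)


def mg_per_ml(molar: float, mw_da: float) -> float:
    """Convert mol/L to mg/mL (numerically equal to g/L).

    mw_da must be supplied by the user; no molecular weight is given in the text.
    """
    return molar * mw_da


if __name__ == "__main__":
    c = molar_concentration(0.4744)          # hypothetical reading
    print(f"{c * 1e6:.1f} uM")               # micromolar concentration
```

A reading taken on a diluted sample can be corrected by passing the dilution factor, for example `molar_concentration(0.30, dilution=20.0)`.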

Activity Assay of SgcC5 Wild-Type and Mutant Enzymes. Preparation of the donor substrate SgcC2-tethered (S)-β-tyrosine and activity assay of SgcC5 wild-type or mutant enzymes using (R)-1-phenyl-1,2-ethanediol as the acceptor substrate followed the previously published procedures.5 The assay solution contained 200 μM apo-SgcC2, 1 mM CoA, 5 mM ATP, 2 mM Tris(2-carboxyethyl)phosphine hydrochloride, 12.5 mM MgCl2, 5 mM (S)-β-tyrosine, 10 μM Svp (phosphopantetheinyl transferase), and 10 μM SgcC1 in 75 mM Tris-HCl buffer (pH 7.5) and was incubated at 25 °C for 45 min. The acceptor substrate, 5 mM (R)-1-phenyl-1,2-ethanediol, and 1 μM SgcC5 (wild-type or mutants) were then added to initiate the condensation reaction at 25 °C for 10 min. The reactions were quenched by the addition of trifluoroacetic acid (TFA) to a final concentration of 16%, and the resulting solution was subjected to HPLC analysis. HPLC analysis was carried out on an Apollo C18 column (5 μm, 4.6 × 250 mm, Alltech Associates Inc.), using a 30 min linear gradient from 90% A (0.1% TFA)/10% B (0.1% TFA in acetonitrile) to 10% A/90% B at a flow rate of 1 mL/min with UV detection at 283 nm.5,9
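The HPLC gradient above is a simple linear ramp, so the mobile-phase composition at any time point can be computed directly. The sketch below is illustrative only: the function name is hypothetical, and holding the composition constant before 0 min and after 30 min is an assumption, since the text does not describe any hold or re-equilibration steps.

```python
def percent_b(t_min: float,
              start_b: float = 10.0,
              end_b: float = 90.0,
              duration_min: float = 30.0) -> float:
    """Percent solvent B at time t_min for a linear gradient.

    Defaults follow the gradient in the text (10% B to 90% B over 30 min).
    Values outside the gradient window are clamped (an assumption).
    """
    if duration_min <= 0:
        raise ValueError("gradient duration must be positive")
    fraction = min(max(t_min / duration_min, 0.0), 1.0)
    return start_b + (end_b - start_b) * fraction


def percent_a(t_min: float) -> float:
    """Percent solvent A is the remainder of the binary mixture."""
    return 100.0 - percent_b(t_min)


if __name__ == "__main__":
    for t in (0, 7.5, 15, 30):
        print(f"{t:>5} min: {percent_a(t):.1f}% A / {percent_b(t):.1f}% B")
```

At the midpoint of the run (15 min) the composition is 50% A/50% B.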

Gene Cloning and SeMet-Labeled Protein Production and Purification for Crystallization. The sgcC5 gene was amplified from the genomic DNA of Streptomyces globisporus by PCR using two primers, sgcC5-F and sgcC5-R (Table S1) and cloned into pMCSG7324 by ligation-independent procedures25 to yield pBS1164. The pBS1164 construct was transformed into E. coli BL21-Gold (DE3) for protein production. Production of SeMet-labeled SgcC5 was performed according to standard protocols.26 Briefly, the cells were cultured at 37 °C in 1 L of enriched M9 medium27 until OD600 = 1.0. After air-cooling the culture down at 4 °C for 60 min, inhibitory amino acids (25 mg/L each of L-valine, L-isoleucine, L-leucine, L-lysine, L-threonine, and L-phenylalanine), L-selenomethionine (SeMet) (50 mM/L), and isopropyl-β-D-thiogalactoside (IPTG) (1 mM/L) were added. The cells were incubated overnight at 18 °C, harvested and re-suspended in lysis buffer [500 mM NaCl, 5% (v/v) glycerol, 50 mM HEPES (pH 8.0), 20 mM imidazole, and 10 mM β-mercaptoethanol]. The cells were disrupted by sonication. The insoluble cellular material was removed by centrifugation. SeMet-labeled SgcC5 was purified using Ni-NTA affinity chromatography by the ÄKTAxpress system (GE Healthcare Life Sciences) and digested with recombinant His6-tagged TEV protease to remove the His6-tag. The pure protein was concentrated using Amicon Ultra-15 concentrators (Millipore) in 20 mM HEPES buffer (pH 8.0), 250 mM NaCl, and 2 mM dithiothreitol (DTT). The concentration of protein samples used for crystallization was 55.6 mg/mL. For the structure of SgcC5 in complex with (R)-1-(2-naphthyl)-1,2-ethanediol, unlabelled SgcC5 was produced and purified with the same method described above, except that the cells were cultured in lysogeny broth (LB) medium. The concentration of protein samples used for co-crystallization was 20.6 mg/mL.

Crystallization of SgcC5. SgcC5 was crystallized using the sitting-drop vapor-diffusion technique in 96-well CrystalQuick plates (Greiner Bio-one) prepared by a Mosquito liquid dispenser (TTP Labtech). For each screening condition, 0.4 µL of protein and 0.4 µL of crystallization formulation were mixed. The mixture was equilibrated against 140 µL of the reservoir in the well. The plates were incubated at 4 °C. The ligand-free SgcC5 (SgcC5apo) was crystallized using the crystallization formulation containing 1 M sodium citrate and 0.1 M sodium cacodylate/HCl, pH 6.5. To prepare SgcC5 in complex with an acceptor substrate, the protein solution was mixed with 4 mM (R)-1-phenyl-1,2-ethanediol (in DMSO) or (R)-1-(2-naphthyl)-1,2-ethanediol (in DMSO) and incubated at 4 °C. While no crystal of SgcC5 in complex with (R)-1-phenyl-1,2-ethanediol was obtained, SgcC5 in complex with (R)-1-(2-naphthyl)-1,2-ethanediol (SgcC5NE) was crystallized using the crystallization formulation containing 0.2 M Li2SO4, 2 M (NH4)2SO4, and 0.1 M 3-(cyclohexylamino)-1-propanesulfonic acid/NaOH, pH 10.5.

Data Collection and Structure Determination. The SgcC5apo and SgcC5NE crystals were transferred to cryoprotectant solutions containing 1.4 M sodium citrate and 28% sucrose, respectively, prior to X-ray diffraction. The diffraction data of SgcC5apo and SgcC5NE were collected at the Structural Biology Center 19-ID beamline of the Advanced Photon Source, Argonne National Laboratory. The diffraction data of SgcC5apo were collected at 100 K using a wavelength of 0.9793 Å with the ADSC QUANTUM 315r CCD detector. The diffraction data of SgcC5NE were collected at 100 K using a wavelength of 0.9792 Å with the same detector. The diffraction images were processed with the HKL3000 suite.28 Intensities were converted to structure factor amplitudes in the Ctruncate program29,30 from the CCP4 package.30 The SgcC5apo structure was solved by the single-wavelength anomalous diffraction (SAD) method. The initial protein model was built by ARP/wARP.31 Manual model rebuilding was carried out in COOT32 and crystallographic refinement was performed in BUSTER33 and subsequently in phenix.refine.34 The SgcC5NE structure was solved by molecular replacement (MR) with SgcC5apo as a search template. The model was manually rebuilt and refined with the same procedures as those for SgcC5apo. The final models have been validated by Molprobity.35 The data processing and refinement statistics are given in Table 1. The atomic coordinates and structure factors have been deposited in the Protein Data Bank under accession codes 4ZNM (SgcC5apo) and 4ZXW (SgcC5NE).

RESULTS Bioinformatics Analysis Revealing the Widespread Distribution of Genes Encoding SgcC5 Homologues in Enediyne Biosynthetic Gene Clusters. While several experimentally confirmed or bioinformatically predicted C domains that catalyze ester bond formation are now known,11-13 free-standing C enzymes that catalyze ester bond formation are rarely found in NRPS machineries. In our current effort to mine microbial genomes for enediyne natural products, we constructed an enediyne genome neighborhood network, including all experimentally confirmed and bioinformatically predicted enediyne biosynthetic gene clusters available from the NCBI and JGI databases.36,37 Surprisingly, genes encoding SgcC5 homologues are widely present in enediyne biosynthetic gene clusters, especially those encoding the nine-membered enediyne natural products. SgcC5 homologues were identified from the four known and the 31 putative nine-membered enediyne biosynthetic machineries, with 26–96% amino acid sequence identities among them (Figure S2). Specifically, MdpC5, KedY5, and SpoT10, homologues of SgcC5 from the maduropeptin (MDP), kedarcidin (KED), and sporolide (SPO) biosynthetic machineries, have been proposed previously to catalyze ester or amide bond formation, in mechanistic analogy to SgcC5 but with varying substrate specificity, for MDP,38 KED,39 and SPO40 biosynthesis, respectively (Figure S3).
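Pairwise sequence identity of the kind quoted above (26–96%) is computed from aligned sequences. The sketch below shows one common convention, counting identical residues over columns where neither sequence has a gap. It is an assumption that this matches the convention behind the reported values, since the text does not specify it; the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a pairwise or multiple alignment.

    Columns in which either sequence has a gap are ignored (one of several
    conventions in use; an assumption here).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real SgcC5 homologue sequences.
    print(round(percent_identity("HHAAADLV", "HHGAAD-V"), 1))
```

Other conventions divide by the alignment length or by the length of the shorter sequence, which can shift the reported percentage noticeably for distant homologues.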

Phylogenetic Analysis Revealing SgcC5 Homologues as a New Family of C enzymes. Based on a correlation between their sequence and function, C domains have been classified into six subtypes: (i) LCL domains catalyze peptide bond formation between an L-amino acid and a growing peptide ending with an L-amino acid, (ii) DCL domains catalyze peptide bond formation between an L-amino acid and a growing peptide ending with a D-amino acid, (iii) heterocyclization domains catalyze peptide bond formation and subsequent cyclization of cysteine, serine or threonine residues, (iv) epimerization domains convert the chirality of amino acids, (v) dual E/C domains catalyze both epimerization and condensation, and (vi) starter C domains acylate the first amino acid with a β-hydroxy-carboxylic acid.21 Phylogenetic analysis revealed that each of the various functional subtypes of the C domains could be grouped into individual clades.21 To understand the sequence-function evolution of SgcC5, we collected the amino acid sequences of the C domains utilized to generate the phylogenetic tree in a previous study,21 along with the sequences of the C enzymes from the four known and 31 putative nine-membered enediyne biosynthetic machineries, as well as three CoA-dependent transferases (structural homologues of SgcC5 that will be discussed below), to build a more comprehensive phylogenetic tree. The resulting phylogenetic tree shows that SgcC5 and its homologues are grouped in a clade that is distinct from the other C domain subtypes classified by the functional categories (Figure 2). Therefore, SgcC5 and its homologues emerge as a new subtype of C enzymes, members of which are all free-standing C enzymes mainly catalyzing ester bond formation, as exemplified by SgcC5.

Figure 2. Phylogenetic tree of selected C domains and SgcC5 homologues from enediyne biosynthetic machineries revealing a new family of C enzymes. The selected C domains are extracted from a previous study.21 A bootstrap consensus tree was generated by MEGA6 using the maximum likelihood method with a bootstrap test of 1000 replicates.22 The tree divides the C domains into seven clades: dual E/C, starter, heterocyclization, epimerization, LCL, DCL, and ester bond formation. The newly classified ester bond formation clade is highlighted with a blue background with SgcC5 highlighted by a red box. The CoA-transferases as a separate clade are highlighted with a yellow background. RapP and FkbP are highlighted with a green background.

SgcC5 and the CoA-dependent acyltransferases both catalyze ester bond formation; however, they employ different chemistries. SgcC2-tethered (S)-β-tyrosine and SgcC2-tethered (S)-3-chloro-5-hydroxy-β-tyrosine, the donor substrates of SgcC5, are PCP dependent (Figure 1), while the substrates of acyltransferases are CoA dependent. The phylogenetic tree clearly shows that they fall into two distinct clades (Figure 2). SgcC5 and its homologues are most similar to C domains in the LCL subtype, while the acyltransferases are most similar to C domains in the heterocyclization subtype. RapP and FkbP also catalyze ester bond formation. Since RapP and FkbP are modular proteins, instead of free-standing enzymes like SgcC5, RapP and FkbP cluster with modular proteins in the LCL clade, indicating that they are evolutionarily closer to the modular proteins (Figure 2).

Structure Solution and Refinement of SgcC5. Crystals of the ligand-free SgcC5 (SgcC5apo) belong to orthorhombic space group P212121 with unit cell dimensions a = 99.60, b = 104.54, and c = 108.46 Å. The crystal structure of SgcC5apo was determined by the single-wavelength anomalous dispersion (SAD) method. Two copies of the polypeptide chain were found and built in the asymmetric unit, corresponding to a solvent content of 55.8%. The two polypeptide chains were traced from residue 15 to residue 457 or 458 in chain A and B, respectively. Two Cl- and three Ca2+ ions were found in the asymmetric unit based on Fo-Fc regular and anomalous electron density maps. The final model of SgcC5apo was refined to a resolution of 2.00 Å with an R factor of 16.4% and an Rfree factor of 19.2%. Ramachandran analysis shows that the percentages of residues in favored, allowed, and disallowed regions were 97.8%, 2.2%, and 0%, respectively.
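The solvent content quoted above can be sanity-checked with the standard Matthews coefficient relation, V_M = V_cell / (Z · n · M) and solvent fraction ≈ 1 − 1.23/V_M. The sketch below is a rough check rather than the authors' calculation. The unit cell, space group (four symmetry operators for P212121), and two chains per asymmetric unit come from the text; the molecular weight of about 50.5 kDa is an assumption (roughly 459 residues at about 110 Da each), and the function name is hypothetical.

```python
def matthews_solvent_content(a: float, b: float, c: float,
                             n_sym_ops: int,
                             chains_per_asu: int,
                             mw_da: float) -> tuple[float, float]:
    """Return (V_M in A^3/Da, solvent fraction) for an orthorhombic cell.

    Cell volume is a*b*c, which is valid only when all cell angles are 90 degrees.
    The factor 1.23 is the conventional value for a typical protein density.
    """
    if min(a, b, c, mw_da) <= 0 or n_sym_ops <= 0 or chains_per_asu <= 0:
        raise ValueError("all inputs must be positive")
    cell_volume = a * b * c
    v_m = cell_volume / (n_sym_ops * chains_per_asu * mw_da)
    return v_m, 1.0 - 1.23 / v_m


if __name__ == "__main__":
    ASSUMED_MW_DA = 50_500.0  # assumption, not stated in the text
    v_m, solvent = matthews_solvent_content(99.60, 104.54, 108.46,
                                            n_sym_ops=4, chains_per_asu=2,
                                            mw_da=ASSUMED_MW_DA)
    print(f"V_M = {v_m:.2f} A^3/Da, solvent = {100 * solvent:.1f}%")
```

With the assumed molecular weight, the estimate lands close to the reported 55.8%; a different construct mass would shift the result slightly.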

Due to the unavailability of the natural enediyne core acceptor substrate, we initially used (R)-1-phenyl-1,2-ethanediol to screen for crystals of SgcC5 in complex with an acceptor substrate, given its demonstrated activity in the in vitro assay of SgcC5 (Figure 1).5 However, after multiple failed attempts, we switched to (R)-1-(2-naphthyl)-1,2-ethanediol as an alternative substrate mimic, fully aware of the pitfall that no activity was detected when it was used in the in vitro assay of SgcC5 under the same conditions (Figure S4).9 Crystals of SgcC5 in complex with (R)-1-(2-naphthyl)-1,2-ethanediol (SgcC5NE) were obtained by co-crystallization. SgcC5NE was crystallized in orthorhombic space group P212121 with unit cell dimensions a = 99.40, b = 105.30, and c = 108.17 Å. The asymmetric unit contains two chains forming a structure similar to SgcC5apo, corresponding to a solvent content of 55.9%. The structure of SgcC5NE was determined by the molecular replacement (MR) method using SgcC5apo as a search model. The polypeptide chains were traced from residue 15 to 457. Both (R)-1-(2-naphthyl)-1,2-ethanediol and sucrose (from the cryoprotectant solution) molecules were built into each monomer. In addition, the asymmetric unit contains five SO42- ions and one 3-(cyclohexylamino)-1-propanesulfonic acid molecule at the protein surface. The final model of SgcC5NE was refined to a resolution of 2.20 Å with an R factor of 16.1% and an Rfree factor of 20.8%. Ramachandran analysis shows that the percentages of residues in favored, allowed, and disallowed regions were 97.7%, 2.0%, and 0.2%, respectively. The crystallographic data and refinement statistics of SgcC5apo and SgcC5NE are summarized in Table 1.

Overall Structure of SgcC5 and Structural Similarity to NRPS C Domain Enzymes. The structures of SgcC5apo and SgcC5NE superimpose well, with a root mean square deviation (rmsd) of 0.28 Å for Cα atoms. The side chain conformations also show no significant changes between them, revealing that (R)-1-(2-naphthyl)-1,2-ethanediol, which mimics the enediyne core acceptor substrate, does not trigger a significant conformational change. In both SgcC5apo and SgcC5NE, the asymmetric unit contains a homodimer whose two chains are related by a non-crystallographic two-fold axis (Figure 3B). Consistent with these models, size exclusion chromatography analysis confirmed that SgcC5 is a dimer in solution (Figure S5), revealing that the dimerization is not a crystal-packing artifact.
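The rmsd value quoted above is the root mean square of the distances between equivalent Cα atoms after the two structures have been superposed. The sketch below computes only that final step and assumes the two coordinate lists are already superposed and matched atom for atom; the optimal superposition itself (for example by the Kabsch algorithm) is not shown, and the function name and toy coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def ca_rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """RMSD (same units as the input, e.g. angstroms) between matched atoms.

    Assumes coords_a[i] and coords_b[i] are equivalent atoms in structures
    that have already been superposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates, not taken from 4ZNM or 4ZXW.
    apo = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    complexed = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
    print(f"{ca_rmsd(apo, complexed):.3f}")
```

Because the squared deviations are averaged before the square root, a few large local differences can dominate the value, which is why a low overall rmsd is usually read together with a check of side chain conformations, as done in the text.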


Figure 3. The overall structure of SgcC5. (A) Ribbon diagram of the SgcC5 monomer. The N- and C-terminal subdomains are shown in light blue and deep blue, respectively. The conserved motif HHXXXDX14Y is shown in yellow, and the “floor loop” and the “bridging region” are shown in green and orange, respectively. The second histidine, His154, of the conserved motif is shown as spheres. (B) Ribbon diagram of the SgcC5 dimer. The 12-stranded β-sheet formed by the two polypeptide chains is highlighted with blue and red color. Glu57 and Arg61 of the respective β4 strands from two polypeptide chains participate in dimerization via electrostatic interactions.

The SgcC5 monomer shares a similar three-dimensional structure with typical C domains (Figure 3A). C domain structures adopt a pseudo-dimeric configuration comprising an N- and a C-terminal subdomain, with the fold of the CoA-dependent acyltransferase (CAT) superfamily.14,16,41,42 In SgcC5, the N-terminal subdomain (Met1–Ala200) is composed of five α-helices (α1–α5), a four-stranded β-sheet (β1 and β4–β6), and a β-hairpin (β2–β3). The C-terminal subdomain (Ala201–Ser459) is composed of six α-helices (α6–α11), a six-stranded β-sheet (β7–β11 and β14), and a β-hairpin (β12–β13). Two crossovers, the “floor loop” and the “bridging region”, contact the two subdomains (Figure 3A). The “floor loop” (Ser292–Asn309) consists of a loop between β9 and β10 with a short 3₁₀ helix. The “bridging region” (Val371–Asp402) comprises a loop between β11 and β14 with a β-hairpin (β12 and β13), which complements the four-stranded β-sheet in the N-terminal subdomain. The active site, which lines the V-shaped structure, is covered by the two crossovers. The C domain conserved motif HHXXXDX14Y (His153–Tyr173) from the N-terminal subdomain is located at the interface of the two subdomains, and the putative catalytic residue, His154, is located at the center of the V-shaped cavity (Figure 3A). In addition, SgcC5 forms a homodimer via electrostatic interactions between Glu57 and Arg61 and β-sheet interfaces between the respective N-terminal subdomains, in which the β4 strands participate in dimerization, resulting in a 12-stranded β-sheet formed by the two polypeptide chains (Figure 3B). Notably, functional modular NRPSs, as well as the free-standing C enzyme VibH,14 have been demonstrated to be monomeric,43 making SgcC5 the only known example that acts as a homodimer.
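The spacing encoded by the HHXXXDX14Y motif (two histidines, any three residues, an aspartate, any 14 residues, and a tyrosine, 21 residues in total, matching the His153–Tyr173 span) can be made concrete with a pattern search. The following is a minimal Python sketch, assuming a one-letter amino acid sequence as input; the function name `find_c_domain_motif` and the toy sequence are hypothetical and are not the SgcC5 sequence.

```python
import re

# HHXXXDX14Y: His-His, any 3 residues, Asp, any 14 residues, Tyr.
C_DOMAIN_MOTIF = re.compile(r"HH.{3}D.{14}Y")


def find_c_domain_motif(sequence):
    """Return (first, last) 1-based residue numbers of the first
    HHXXXDX14Y match in a one-letter sequence, or None if absent."""
    match = C_DOMAIN_MOTIF.search(sequence)
    if match is None:
        return None
    # match.start() is 0-based; match.end() is exclusive, so it already
    # equals the 1-based number of the last matched residue.
    return match.start() + 1, match.end()


# Hypothetical toy sequence built only to contain one motif instance.
toy = "MSTA" + "HH" + "AGG" + "D" + "G" + "A" * 13 + "Y" + "LKE"

print(find_c_domain_motif(toy))
```

In such a match the second histidine, the putative catalytic residue, is the residue immediately after the first reported position.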

Mutational Analysis Revealing an Essential Role of the Catalytic Histidine for Enzyme Catalysis. The SgcC5NE complex structure revealed that Phe23, Phe27, Phe291, Val367, Ile403, Pro404, Gly406, and Leu408 form a hydrophobic pocket to accommodate (R)-1-(2-naphthyl)-1,2-ethanediol, the 2-hydroxyl group of which forms a hydrogen bond with the second histidine of the C domain conserved motif HHXXXDX14Y (Figures 4B and 4C). In general, the second histidine of the C domain conserved motif acts as a general base to deprotonate the amino group of the acceptor substrate, thereby initiating the condensation reaction.44 To provide additional experimental data to support the catalytic role of the second histidine of the HHXXXDX14Y motif in SgcC5, we mutated His154 to alanine, glutamate, or glutamine by site-directed mutagenesis. SgcC1 and SgcC2, as well as the Svp phosphopantetheine transferase, were produced to synthesize the donor substrate (S)-β-tyrosyl-S-SgcC2 for the SgcC5 activity assay. The results revealed that each of the mutations (H154A, H154E, and H154Q) completely abolished enzymatic activity (Figures S4A and S4B), suggesting that His154 plays a critical role in enzyme catalysis and may act as a general base to catalyze ester or amide formation.


Figure 4. Putative active-site cavity of SgcC5. (A) Superposition of SgcC5 (PDB entry: 4ZXW, light blue) with the TqaA-T-C didomain complex [PDB entry: 5EJD, gray (C domain) and cyan (PCP domain)]. P-pant (brown sticks) of the TqaA-T-C didomain complex projected into SgcC5 enters the active-site cavity through the “floor loop” (orange) and the “bridging region” (green) in SgcC5. (B) Local view of the SgcC5 active-site cavity. The P-pant of the TqaA-T-C didomain complex was projected into SgcC5. The thiol group of P-pant (brown sticks) from the structural superposition points toward the catalytic histidine, His154. (R)-1-(2-Naphthyl)-1,2-ethanediol (NE) from SgcC5NE is colored in black. The HHXXXDX14Y motif is shown in yellow. (C) The hydrogen-bond network in the active site. The water molecules are shown as red spheres. The σA-weighted difference (mFo – DFc) omit map contoured at 3σ is shown in green. The hydrogen bonds are depicted by yellow dotted lines.

Discussion

SgcC5 Revealing an Unusual Structure Compared to Canonical C Domains. According to the Dali server for structural similarity searching,45 SgcC5 shows structural homology to C domains (TycC-C6,41 SrfA-C,42 CDA-C1,16,46 VibH,14 EntF-C,15 and TqaA-C20), CoA-dependent transferases (TRI3,47 TRI101,48 and HCT49), the NRPS E domain (TycA-E),50 and the NRPS X domain.51 Despite the low amino acid sequence identities (11–27%) (Figures S6 and S7), SgcC5 was found to share similar folds with each of these proteins (Figures S6 and S8). SgcC5 and its structural homologues all adopt a pseudo-dimeric fold and possess the HHXXXDX14Y motif (HXXXD in CoA-dependent transferases, SHXXXDX14Y in the NRPS E domain, and HRXXXDX14Y in the NRPS X domain), the floor loop, and the bridging region (Figures 3 and S6). Comparison of the structures of SgcC5 and its structural homologues shows that TycC-C6, SrfA-C, EntF-C, TqaA-C, VibH, TycA-E, and the X domain adopt open conformations, whereas SgcC5 adopts a closed conformation, as seen in CDA-C1.16 Superposition of the C-terminal subdomain of SgcC5 with that of the structural homologues CDA-C1 (closed conformation) and SrfA-C (open conformation) reveals that the N-terminal subdomain of SrfA-C shows a shift and rotation compared to that of SgcC5 and CDA-C1 (Figure 5A). Although it remains unclear whether C domains undergo conformational changes, computational analysis suggested that conformational changes in C domains are likely to occur in the catalytic cycle and may be important for peptide synthesis.16 Unlike the canonical C domain structures, SgcC5 possesses an extra sequence (Trp391–Gln397) (Figure S6) within the bridging region that forms a “central loop” connecting the α3 helix from the N-terminal subdomain and β8 and β15 from the C-terminal subdomain via hydrogen-bond networks and π-stacking interactions (Figure 5B). These interactions are present only in SgcC5 and may fix the structure in a closed conformation, regardless of whether substrates are bound (Figure 5).

Figure 5. Structural comparison of SgcC5 with selected homologues. (A) Superposition of the overall structures of SgcC5 (blue, PDB entry: 4ZNM), CDA-C1 (yellow, 4JN5), and SrfA-C (pink, 2VSQ). CDA-C1 and SrfA-C represent the closed and open conformations, respectively. (B) Local view of the interactions between the N- and C-terminal subdomains. The central loop (shown in magenta) forms a hydrogen-bond network with α3 from the N-terminal subdomain (shown in light blue) and β8 and β15 from the C-terminal subdomain (shown in deep blue). Trp391 from the central loop and His412 from the C-terminal subdomain form π-stacking interactions. The interactions are depicted by yellow dotted lines.

In general, C domains catalyze amide bond formation between a PCP-tethered acceptor substrate and a PCP-tethered donor substrate (Figure S1A). Therefore, the structures of C domains, such as TycC-C6, SrfA-C, and CDA-C1, contain active-site tunnels to accommodate the PCP-tethered substrates, with both the donor and acceptor substrates channeled by the P-pant arms. The active-site tunnel passes through the V-shaped structure, where the acceptor and donor substrates can reasonably be expected to approach and bind from opposite faces and meet at the catalytic histidine at the center of the active-site tunnel.16,41 Unlike the canonical C domains, SgcC5 represents the first structure of a C enzyme that features a substrate-binding cavity instead of a tunnel. The cavity in SgcC5 is open to bulk solvent at the donor side, between the floor loop and the bridging region. The cavity is blocked at the acceptor side by four bulky residues, Phe23, Phe27, Phe344, and Met355 (Figure 4B), which are highly conserved among the SgcC5 homologues from the known and predicted enediyne biosynthetic machineries (Figure S2). These structural features are consistent with findings that SgcC5 specifically catalyzes ester bond formation between a PCP-tethered donor substrate and a free acceptor substrate.5

Sequence and Structural Comparisons of SgcC5 with the Structural Homologues Revealing Key Residues for Enzyme Catalysis and Protein Folding. Key residues for enzyme catalysis and protein folding in C domains have been reported based on mutational analysis.14,44 Despite the low amino acid sequence identities, these key residues are highly conserved among SgcC5 and its structural homologues (Figure S6), as well as the SgcC5 homologues from the known and putative enediyne biosynthetic machineries (Figure S2). For enzyme catalysis, His154 (the catalytic histidine) and Ala159 were proposed to play important roles as a general base for deprotonation of the acceptor substrate and as an oxyanion hole for stabilization of the tetrahedral intermediate, respectively (Figure 6).44,50 On the basis of the structures of SgcC5 and related proteins, six residues were proposed to be involved in the folding of SgcC5: Arg70, Arg75, and His153 in the N-terminal subdomain and Trp213 in the C-terminal subdomain are located away from the active site,14,44 while Asp158 within the HHXXXDX14Y motif forms electrostatic interactions with Arg295 and lines the active-site cavity near the catalytic histidine His154 (Figure S9).14

The SgcC5NE Complex Structure Indicating the Binding Sites of the Acceptor and Donor Substrates. Our previous studies indicated that SgcC5 shows substrate promiscuity by recognizing a variety of SgcC2-tethered donor substrates and free acceptor substrates to catalyze regio- and stereospecific ester or amide bond formation (Figures 1 and S4).5,9 The SgcC5NE complex structure revealed that Phe23, Phe27, Phe291, Phe310, Leu24, Ile365, Val367, Ile403, Pro404, Gly406, and Leu408 form a highly hydrophobic pocket to accommodate the acceptor substrate (R)-1-(2-naphthyl)-1,2-ethanediol, which mimics the enediyne core structure (Figure 4B). These residues are highly conserved among the SgcC5 homologues from the known or putative enediyne biosynthetic machineries (Figure S2), implying that a hydrophobic environment is critical for accommodating the enediyne core.

The regio- and stereospecificity of SgcC5 toward the acceptor substrate is strict. SgcC5 catalyzes regiospecific ester bond formation at the 2-hydroxyl group of the acceptor substrate, corresponding to the 2-hydroxyl group of the proposed enediyne core acceptor substrate (Figure 1). The SgcC5NE complex structure shows that the distances between the catalytic histidine and the 2- and 1-hydroxyl groups of the acceptor substrate are 2.9 and 4.1 Å, respectively. This indicates that SgcC5 has a preference for catalyzing ester bond formation at the 2-hydroxyl group. Furthermore, SgcC5 is highly stereoselective for the (1R)-enantiomer of the acceptor substrate.9 A kinetic study of SgcC5 revealed similar Km values for both enantiomers, but a 180-fold higher kcat for the (1R)-enantiomer than for the (1S)-enantiomer of the acceptor substrate.5 The 2- and (1R)-hydroxyl groups of the acceptor substrate form a hydrogen-bond network with three water molecules that favorably orients both hydroxyl groups for substrate binding and catalysis (Figure 4C). The (1S)-enantiomer showed a binding affinity for SgcC5 similar to that of the (1R)-enantiomer; however, the hydrogen-bond network might be disrupted upon binding of the (1S)-enantiomer. Loss of the hydrogen-bond network might affect the enzyme catalysis, the substrate orientation, or both, and consequently result in a 180-fold decrease in kcat for the (1S)-enantiomer.
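The 180-fold difference in kcat can be translated into an apparent difference in activation free energy. The estimate below is only a rough sketch: it assumes that transition-state theory applies to both enantiomers with the same pre-exponential factor, and it takes T = 298 K as an assumed temperature, which is not a value reported in this section.

```latex
\[
\Delta\Delta G^{\ddagger}
  = RT \ln\frac{k_{\mathrm{cat}}^{(1R)}}{k_{\mathrm{cat}}^{(1S)}}
  = \left(1.987\times10^{-3}\ \mathrm{kcal\,mol^{-1}\,K^{-1}}\right)
    \left(298\ \mathrm{K}\right)\ln(180)
  \approx 3.1\ \mathrm{kcal\,mol^{-1}}
\]
```

Under these assumptions, the rate difference corresponds to roughly 3 kcal/mol, an energy comparable to that commonly associated with one or a few hydrogen bonds, which is in line with the proposed disruption of the hydrogen-bond network.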

SgcC5 is incapable of accepting a free donor substrate,5 indicating that the protein-protein interaction between SgcC5 and SgcC2 plays a major role in donor substrate recognition. Recently, the crystal structure of the holo-form of Penicillium aethiopicum TqaA (TqaA-T-C) in complex with its PCP-tethered donor substrate was reported (Figure S10A).20 By structural superposition of SgcC5 and SgcC2 (the SgcC2 homologue model was generated in our previous study)52 with TqaA-T-C (PDB entry: 5EJD), the SgcC5-SgcC2 interface and the donor substrate binding site of SgcC5 could be reasonably mapped (Figure S10). The crystal structure of TqaA-T-C shows that Asp3906 from the C domain and Arg3571 from the PCP domain form an ionic interaction.20 In contrast, the SgcC2-SgcC5 complex model suggests that ionic interactions could form between Glu37 from SgcC2 and Arg256 and Lys261 from SgcC5, which are highly conserved among the SgcC5 homologues from the enediyne biosynthetic machineries (Figure S2). In addition, structural superposition of SgcC5 and TqaA-T-C shows that the P-pant arm from TqaA-T-C is projected into a reasonable position in SgcC5, where the P-pant arm enters the active-site cavity through the floor loop and the bridging region at the donor side (Figures 4A and 4B), and the thiol group of the P-pant arm points toward the catalytic histidine (His154) (Figure 4B).

Structural and Mutational Analysis Supporting the Enzyme Catalytic Mechanism Depending on General Acid-Base Catalysis. The C domains and CoA-dependent transferases possess the HHXXXDX14Y and HXXXD motifs, respectively. The catalytic histidines from the two motifs have been demonstrated to play important roles in enzyme catalysis.14,44,49,53 In general, the catalytic histidine acts as a general base to deprotonate the amino or hydroxyl group of the acceptor substrate, thereby initiating the condensation reaction. Mutation of the catalytic histidine to alanine in C domains or CoA-dependent acyltransferases resulted in the reduction or complete loss of enzymatic activity. In contrast, the catalytic histidine in VibH has been shown to be unnecessary and can be substituted by other amino acids with little effect on catalysis.14 VibH catalyzes peptide bond formation between norspermidine (NSPD) and PCP-tethered 2,3-dihydroxybenzoate (PCP-DHB), where the primary amine group of NSPD may be deprotonated under the enzymatic reaction conditions, rendering the catalytic histidine superfluous as a general base for NSPD deprotonation.14


The active site of the SgcC5NE complex structure shows that Nδ of the catalytic histidine (His154) is protonated and interacts with the backbone carbonyls of His154 and Ala157, and Nε of His154 is within hydrogen-bonding distance (2.9 Å) of the 2-hydroxyl group of (R)-1-(2-naphthyl)-1,2-ethanediol (Figure S11). It remains unclear why activity was undetectable when SgcC5 was assayed using (R)-1-(2-naphthyl)-1,2-ethanediol as an alternative acceptor substrate (Figure S4C),9 although its poor solubility may have contributed to the failure to demonstrate ester bond-forming activity in vitro. Recently, the crystal structure of CDA-C1 in complex with a chemical probe, which acts as a reaction-competent substrate, was solved to understand the acceptor substrate binding site and catalytic mechanism.46 In the CDA-C1 complex structure, the α-amino group of the substrate occupies the same position, at the same distance from the Nε atom of the catalytic histidine, as the 2-hydroxyl group of the substrate in SgcC5NE (Figure S11), suggesting a similar catalytic mechanism for CDA-C1 and SgcC5. In the SgcC5NE complex structure, the catalytic histidine accepts a hydrogen bond from the 2-hydroxyl group of the acceptor substrate, which precludes the catalytic histidine from participating in transition-state stabilization as proposed for TycC-C6.41 Furthermore, the H154A, H154E, and H154Q mutations completely abolished enzymatic activity (Figures S4A and S4B). We therefore propose a catalytic mechanism in which SgcC5 promotes substrate deprotonation by the Nε of His154, generating a 2-hydroxyl nucleophile that initiates the condensation reaction leading to formation of the ester bond (Figure 6).

Figure 6. Proposed mechanism for SgcC5-catalyzed ester bond formation between a PCP-tethered donor substrate and a PCP-free acceptor substrate. (a) The acceptor substrate (enediyne core) is activated by the catalytic His154, which acts as a general base, deprotonating the 2-hydroxyl group of the acceptor substrate to form an oxyanion, which concomitantly attacks the carbonyl carbon of the PCP-tethered donor substrate, resulting in a tetrahedral intermediate that is stabilized by the oxyanion hole-forming residue Ala159. (b) The electrons on the oxyanion are pushed back to the carbonyl carbon and the product is subsequently released from the active site. The proton on His154 is transferred to the thiol group of the P-pant arm. X and Y stand for the side chain of (S)-3-chloro-5-hydroxy-β-tyrosine and the enediyne core, respectively.

ASSOCIATED CONTENT

Supporting Information

Primers and vectors used in this study (Table S1), selected C domains catalyzing ester bond formation (Figure S1), sequence alignment of SgcC5 with other homologues from known and predicted enediyne biosynthetic machineries (Figure S2), structures of the four known nine-membered enediynes (Figure S3), HPLC chromatograms of assays of ester bond formation between SgcC2-tethered (S)-β-tyrosine and (R)-1-phenyl-1,2-ethanediol catalyzed by SgcC5 wild-type and mutant enzymes (Figure S4), molecular weight estimation of SgcC5 by size exclusion chromatography (Figure S5), structure-based sequence alignment of SgcC5 with selected homologues (Figure S6), sequence alignment identity matrix among SgcC5 and selected C domains or enzymes, an E domain, and an X domain, as well as three CoA-transferases, whose structures are known (Figure S7), structural comparison of SgcC5 with its homologues (Figure S8), residues involved in folding of SgcC5 (Figure S9), putative binding interface between SgcC5 and SgcC2 (Figure S10), and acceptor substrate binding sites in SgcC5 and CDA-C1 (Figure S11).

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Telephone: (561) 228-2456. Fax: (561) 228-2472.

Funding

This work is supported in part by National Institutes of Health grants GM098248 (G.N.P.), GM109456 (G.N.P.), GM094585 (A.J.), CA078747 (B.S.), and GM115575 (B.S.). The use of Structural Biology Center beamlines at the Advanced Photon Source was supported in part by the U.S. Department of Energy, Office of Biological and Environmental Research, Grant DE-AC02-06CH11357 (A.J.). C.-Y.C. and J.D.R. were supported in part by postdoctoral fellowships from the Academia Sinica-The Scripps Research Institute Talent Development Program and the Arnold and Mabel Beckman Foundation, respectively.

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

We thank Dr. Y. Li, Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences, Beijing, China, for the S. globisporus wild-type strain. This is manuscript #29652 from The Scripps Research Institute.


REFERENCES

(1) Finking, R., and Marahiel, M. A. (2004) Biosynthesis of nonribosomal peptides. Annu. Rev. Microbiol. 58, 453-488.
(2) Du, L., and Shen, B. (1999) Identification and characterization of a type II peptidyl carrier protein from the bleomycin producer Streptomyces verticillus ATCC 15003. Chem. Biol. 6, 507-517.
(3) Yan, X., Hindra, Ge, H., Yang, D., Huang, T., Crnovcic, I., Chang, C. Y., Fang, S., Annaval, T., Zhu, X., Huang, Y., Zhao, L. X., Jiang, Y., Duan, Y., and Shen, B. (2018) Discovery of alternative producers of the enediyne antitumor antibiotic C-1027 with high titers. J. Nat. Prod. DOI: 10.1021/acs.jnatprod.7b01013.
(4) Liu, W., Christenson, S. D., Standage, S., and Shen, B. (2002) Biosynthesis of the enediyne antitumor antibiotic C-1027. Science 297, 1170-1173.
(5) Lin, S., Van Lanen, S. G., and Shen, B. (2009) A free-standing condensation enzyme catalyzing ester bond formation in C-1027 biosynthesis. Proc. Natl. Acad. Sci. U.S.A. 106, 4183-4188.
(6) Van Lanen, S. G., Lin, S., Dorrestein, P. C., Kelleher, N. L., and Shen, B. (2006) Substrate specificity of the adenylation enzyme SgcC1 involved in the biosynthesis of the enediyne antitumor antibiotic C-1027. J. Biol. Chem. 281, 29633-29640.
(7) Lin, S., Van Lanen, S. G., and Shen, B. (2007) Regiospecific chlorination of (S)-β-tyrosyl-S-carrier protein catalyzed by SgcC3 in the biosynthesis of the enediyne antitumor antibiotic C-1027. J. Am. Chem. Soc. 129, 12432-12438.
(8) Lin, S., Van Lanen, S. G., and Shen, B. (2008) Characterization of the two-component, FAD-dependent monooxygenase SgcC that requires carrier protein-tethered substrates for the biosynthesis of the enediyne antitumor antibiotic C-1027. J. Am. Chem. Soc. 130, 6616-6623.

(9) Lin, S., Huang, T., Horsman, G. P., Huang, S. X., Guo, X., and Shen, B. (2012) Specificity of the ester bond forming condensation enzyme SgcC5 in C-1027 biosynthesis. Org. Lett. 14, 2300-2303.
(10) Van Lanen, S. G., Dorrestein, P. C., Christenson, S. D., Liu, W., Ju, J., Kelleher, N. L., and Shen, B. (2005) Biosynthesis of the beta-amino acid moiety of the enediyne antitumor antibiotic C-1027 featuring beta-amino acyl-S-carrier protein intermediates. J. Am. Chem. Soc. 127, 11594-11595.
(11) Konig, A., Schwecke, T., Molnar, I., Bohm, G. A., Lowden, P. A., Staunton, J., and Leadlay, P. F. (1997) The pipecolate-incorporating enzyme for the biosynthesis of the immunosuppressant rapamycin-nucleotide sequence analysis, disruption and heterologous expression of rapP from Streptomyces hygroscopicus. Eur. J. Biochem. 247, 526-534.
(12) Gatto, G. J., Jr., McLoughlin, S. M., Kelleher, N. L., and Walsh, C. T. (2005) Elucidating the substrate specificity and condensation domain activity of FkbP, the FK520 pipecolate-incorporating enzyme. Biochemistry 44, 5993-6002.
(13) Zaleta-Rivera, K., Xu, C., Yu, F., Butchko, R. A., Proctor, R. H., Hidalgo-Lara, M. E., Raza, A., Dussault, P. H., and Du, L. (2006) A bidomain nonribosomal peptide synthetase encoded by FUM14 catalyzes the formation of tricarballylic esters in the biosynthesis of fumonisins. Biochemistry 45, 2561-2569.
(14) Keating, T. A., Marshall, C. G., Walsh, C. T., and Keating, A. E. (2002) The structure of VibH represents nonribosomal peptide synthetase condensation, cyclization and epimerization domains. Nat. Struct. Biol. 9, 522-526.
(15) Drake, E. J., Miller, B. R., Shi, C., Tarrasch, J. T., Sundlov, J. A., Allen, C. L., Skiniotis, G., Aldrich, C. C., and Gulick, A. M. (2016) Structures of two distinct conformations of holo-non-ribosomal peptide synthetases. Nature 529, 235-238.
(16) Bloudoff, K., Rodionov, D., and Schmeing, T. M. (2013) Crystal structures of the first condensation domain of CDA synthetase suggest conformational changes during the synthetic cycle of nonribosomal peptide synthetases. J. Mol. Biol. 425, 3137-3150.
(17) Bloudoff, K., Fage, C. D., Marahiel, M. A., and Schmeing, T. M. (2017) Structural and mutational analysis of the nonribosomal peptide synthetase heterocyclization domain provides insight into catalysis. Proc. Natl. Acad. Sci. U.S.A. 114, 95-100.
(18) Tarry, M., Haque, A. S., Bui, K. H., and Schmeing, T. M. (2017) X-ray crystallography and electron microscopy of cross- and multi-module nonribosomal peptide synthetase proteins reveal a flexible architecture. Structure 25, 783-793.
(19) Dowling, D. P., Kung, Y., Croft, A. K., Taghizadeh, K., Kelly, W. L., Walsh, C. T., and Drennan, C. L. (2016) Structural elements of an NRPS cyclization domain and its intermodule docking domain. Proc. Natl. Acad. Sci. U.S.A. 113, 12432-12437.
(20) Zhang, J., Liu, N., Cacho, R. A., Gong, Z., Liu, Z., Qin, W., Tang, C., Tang, Y., and Zhou, J. (2016) Structural basis of nonribosomal peptide macrocyclization in fungi. Nat. Chem. Biol. 12, 1001-1003.
(21) Rausch, C., Hoof, I., Weber, T., Wohlleben, W., and Huson, D. H. (2007) Phylogenetic analysis of condensation domains in NRPS sheds light on their functional evolution. BMC Evol. Biol. 7, 78.
(22) Tamura, K., Stecher, G., Peterson, D., Filipski, A., and Kumar, S. (2013) MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol. Biol. Evol. 30, 2725-2729.
(23) Gill, S. C., and von Hippel, P. H. (1989) Calculation of protein extinction coefficients from amino acid sequence data. Anal. Biochem. 182, 319-326.
(24) Eschenfeldt, W. H., Lucy, S., Millard, C. S., Joachimiak, A., and Mark, I. D. (2009) A family of LIC vectors for high-throughput cloning and purification of proteins. Methods Mol. Biol. 498, 105-115.
(25) Aslanidis, C., and de Jong, P. J. (1990) Ligation-independent cloning of PCR products (LIC-PCR). Nucleic Acids Res. 18, 6069-6074.
(26) Sreenath, H. K., Bingman, C. A., Buchan, B. W., Seder, K. D., Burns, B. T., Geetha, H. V., Jeon, W. B., Vojtik, F. C., Aceti, D. J., Frederick, R. O., Phillips, G. N., Jr., and Fox, B. G. (2005) Protocols for production of selenomethionine-labeled proteins in 2-L polyethylene terephthalate bottles using auto-induction medium. Protein Expr. Purif. 40, 256-267.
(27) Donnelly, M. I., Zhou, M., Millard, C. S., Clancy, S., Stols, L., Eschenfeldt, W. H., Collart, F. R., and Joachimiak, A. (2006) An expression vector tailored for large-scale, high-throughput purification of recombinant proteins. Protein Expr. Purif. 47, 446-454.
(28) Minor, W., Cymborowski, M., Otwinowski, Z., and Chruszcz, M. (2006) HKL-3000: the integration of data reduction and structure solution - from diffraction images to an initial model in minutes. Acta Crystallogr. D Biol. Crystallogr. 62, 859-866.
(29) Padilla, J. E., and Yeates, T. O. (2003) A statistic for local intensity differences: robustness to anisotropy and pseudo-centering and utility for detecting twinning. Acta Crystallogr. D Biol. Crystallogr. 59, 1124-1130.
(30) Winn, M. D., Ballard, C. C., Cowtan, K. D., Dodson, E. J., Emsley, P., Evans, P. R., Keegan, R. M., Krissinel, E. B., Leslie, A. G., McCoy, A., McNicholas, S. J., Murshudov, G. N., Pannu, N. S., Potterton, E. A., Powell, H. R., Read, R. J., Vagin, A., and Wilson, K. S. (2011) Overview of the CCP4 suite and current developments. Acta Crystallogr. D Biol. Crystallogr. 67, 235-242.
(31) Langer, G., Cohen, S. X., Lamzin, V. S., and Perrakis, A. (2008) Automated macromolecular model building for X-ray crystallography using ARP/wARP version 7. Nat. Protoc. 3, 1171-1179.
(32) Emsley, P., and Cowtan, K. (2004) Coot: model-building tools for molecular graphics. Acta Crystallogr. D Biol. Crystallogr. 60, 2126-2132.
(33) Bricogne, G., Blanc, E., Brandl, M., Flensburg, C., Keller, P., Paciorek, W., Roversi, P., Sharff, A., Smart, O. S., Vonrhein, C., and Womack, T. O. (2011) BUSTER, Version 2.10.0 Ed. Global Phasing Ltd., Cambridge, UK.
(34) DiMaio, F., Echols, N., Headd, J. J., Terwilliger, T. C., Adams, P. D., and Baker, D. (2013) Improved low-resolution crystallographic refinement with Phenix and Rosetta. Nat. Methods 10, 1102-1104.
(35) Davis, I. W., Murray, L. W., Richardson, J. S., and Richardson, D. C. (2004) MolProbity: structure validation and all-atom contact analysis for nucleic acids and their complexes. Nucleic Acids Res. 32, W615-W619.
(36) Rudolf, J. D., Yan, X., and Shen, B. (2015) Genome neighborhood network reveals insights into enediyne biosynthesis and facilitates prediction and prioritization for discovery. J. Ind. Microbiol. Biotechnol. 43, 261-276.
(37) Yan, X., Ge, H., Huang, T., Hindra, Yang, D., Teng, Q., Crnovcic, I., Li, X., Rudolf, J. D., Lohman, J. R., Gansemans, Y., Zhu, X., Huang, Y., Zhao, L. X., Jiang, Y., Van Nieuwerburgh, F., Rader, C., Duan, Y., and Shen, B. (2016) Strain Prioritization and Genome Mining for Enediyne Natural Products. mBio 7, e02104-16.
(38) Van Lanen, S. G., Oh, T. J., Liu, W., Wendt-Pienkowski, E., and Shen, B. (2007) Characterization of the maduropeptin biosynthetic gene cluster from Actinomadura madurae ATCC 39144 supporting a unifying paradigm for enediyne biosynthesis. J. Am. Chem. Soc. 129, 13082-13094.
(39) Lohman, J. R., Huang, S. X., Horsman, G. P., Dilfer, P. E., Huang, T., Chen, Y., Wendt-Pienkowski, E., and Shen, B. (2013) Cloning and sequencing of the kedarcidin biosynthetic gene cluster from Streptoalloteichus sp. ATCC 53650 revealing new insights into biosynthesis of the enediyne family of antitumor antibiotics. Mol. Biosyst. 9, 478-491.
(40) McGlinchey, R. P., Nett, M., and Moore, B. S. (2008) Unraveling the biosynthesis of the sporolide cyclohexenone building block. J. Am. Chem. Soc. 130, 2406-2407.
(41) Samel, S. A., Schoenafinger, G., Knappe, T. A., Marahiel, M. A., and Essen, L. O. (2007) Structural and functional insights into a peptide bond-forming bidomain from a nonribosomal peptide synthetase. Structure 15, 781-792.
(42) Tanovic, A., Samel, S. A., Essen, L. O., and Marahiel, M. A. (2008) Crystal structure of the termination module of a nonribosomal peptide synthetase. Science 321, 659-663.
(43) Sieber, S. A., Linne, U., Hillson, N. J., Roche, E., Walsh, C. T., and Marahiel, M. A. (2002) Evidence for a monomeric structure of nonribosomal peptide synthetases. Chem. Biol. 9, 997-1008.
(44) Bergendahl, V., Linne, U., and Marahiel, M. A. (2002) Mutational analysis of the C-domain in nonribosomal peptide synthesis. Eur. J. Biochem. 269, 620-629.
(45) Holm, L., Kaariainen, S., Rosenstrom, P., and Schenkel, A. (2008) Searching protein structure databases with DaliLite v.3. Bioinformatics 24, 2780-2781.
(46) Bloudoff, K., Alonzo, D. A., and Schmeing, T. M. (2016) Chemical probes allow structural insight into the condensation reaction of nonribosomal peptide synthetases. Cell Chem. Biol. 23, 331-339.
(47) Garvey, G. S., McCormick, S. P., Alexander, N. J., and Rayment, I. (2009) Structural and functional characterization of TRI3 trichothecene 15-O-acetyltransferase from Fusarium sporotrichioides. Protein Sci. 18, 747-761.
(48) Garvey, G. S., McCormick, S. P., and Rayment, I. (2008) Structural and functional characterization of the TRI101 trichothecene 3-O-acetyltransferase from Fusarium sporotrichioides and Fusarium graminearum: kinetic insights to combating Fusarium head blight. J. Biol. Chem. 283, 1660-1669.
(49) Walker, A. M., Hayes, R. P., Youn, B., Vermerris, W., Sattler, S. E., and Kang, C. (2013) Elucidation of the structure and reaction mechanism of sorghum hydroxycinnamoyltransferase and its structural relationship to other coenzyme A-dependent transferases and synthases. Plant Physiol. 162, 640-651.
(50) Samel, S. A., Czodrowski, P., and Essen, L. O. (2014) Structure of the epimerization domain of tyrocidine synthetase A. Acta Crystallogr. D Biol. Crystallogr. 70, 1442-1452.
(51) Haslinger, K., Peschke, M., Brieke, C., Maximowitsch, E., and Cryle, M. J. (2015) X-domain of peptide synthetases recruits oxygenases crucial for glycopeptide biosynthesis. Nature 521, 105-109.
(52) Chang, C. Y., Lohman, J. R., Cao, H., Tan, K., Rudolf, J. D., Ma, M., Xu, W., Bingman, C. A., Yennamalli, R. M., Bigelow, L., Babnigg, G., Yan, X., Joachimiak, A., Phillips, G. N., Jr., and Shen, B. (2016) Crystal structures of SgcE6 and SgcC, the two-component

ACS Paragon Plus Environment

Biochemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 30 of 32

monooxygenase that catalyzes hydroxylation of a carrier protein-tethered substrate during the biosynthesis of the enediyne antitumor antibiotic C-1027 in Streptomyces globisporus. Biochemistry 55, 5142-5154. (53) Lallemand, L. A., Zubieta, C., Lee, S. G., Wang, Y., Acajjaoui, S., Timmins, J., McSweeney, S., Jez, J. M., McCarthy, J. G., and McCarthy, A. A. (2012) A structural basis for the biosynthesis of the major chlorogenic acids found in coffee. Plant Physiol. 160, 249-260.

30

ACS Paragon Plus Environment

Page 31 of 32 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Biochemistry

Table 1. Data collection, phasing, and refinement statistics for structures

                            SgcC5apo                    SgcC5NE
Protein Data Bank entry     4ZNM                        4ZXW
Space group                 P212121                     P212121
Cell dimensions
  a, b, c (Å)               99.60, 104.54, 108.47       99.40, 105.30, 108.17
  α, β, γ (°)               90.00, 90.00, 90.00         90.00, 90.00, 90.00
Wavelength (Å)              0.9793                      0.9792
Resolutionᵃ (Å)             30.00-2.00 (2.03-2.00)      35.00-2.20 (2.24-2.20)
Rsymᵇ or Rmergeᶜ (%)        11.9 (77.3)                 10.9 (76.2)
CC1/2 (%)                   99.7 (80.8)                 99.7 (78.6)
I/σ(I)                      15.84 (2.11)                14.8 (2.36)
Completeness (%)            99.9 (99.2)                 99.8 (99.9)
Redundancy                  7.0 (6.3)                   4.5 (4.5)

Refinement
Resolution (Å)              30.00-2.00                  34.27-2.20
No. reflections             75434                       57547
Rwork/Rfree                 0.164/0.192                 0.161/0.208
Ramachandran plotᵈ (%)
  favored                   97.87                       97.74
  outliers                  0                           0.23
B-factors
  Protein                   35.7                        37.9
  Ligand/ion                31.6                        56.1
  NE, SUC                                               31.2, 27.8
  Water                     33.7                        32.9
R.m.s. deviations
  Bond lengths (Å)          0.013                       0.013
  Bond angles (°)           1.28                        1.39
Clashscoreᵈ                 2.33                        1.44

ᵃ Numbers in parentheses are values for the highest-resolution bin.

ᵇ $R_{\mathrm{merge}} = \sum_{hkl}\sum_{i} |I_i(hkl) - \bar{I}(hkl)| \,/\, \sum_{hkl}\sum_{i} I_i(hkl)$, where $I_i(hkl)$ is the $i$th observation of reflection $hkl$ and $\bar{I}(hkl)$ is the weighted average intensity for all observations $i$ of reflection $hkl$.

ᶜ $R_{\mathrm{meas}} = \sum_{hkl} [N/(N-1)]^{1/2} \sum_{i} |I_i(hkl) - \bar{I}(hkl)| \,/\, \sum_{hkl}\sum_{i} I_i(hkl)$.

ᵈ As defined by MolProbity.
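The merging statistics defined in the Table 1 footnotes can be illustrated with a short numerical sketch. The Python below is not the data-reduction software used for these structures; the function name `merging_r_factors` and the toy intensities are hypothetical. It assumes an unweighted mean in place of the weighted average intensity, takes N to be the number of observations of each reflection, and skips reflections measured only once, for which the N/(N − 1) factor is undefined.

```python
from collections import defaultdict
from math import sqrt


def merging_r_factors(observations):
    """Return (R_merge, R_meas) for an iterable of (hkl, intensity) pairs.

    Assumptions of this sketch: the mean intensity is unweighted, and
    reflections observed only once are left out of both sums.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    merge_num = 0.0  # sum over hkl of sum_i |I_i - mean|
    meas_num = 0.0   # same sum, scaled by sqrt(N / (N - 1)) per reflection
    denom = 0.0      # sum over hkl of sum_i I_i

    for intensities in groups.values():
        n = len(intensities)
        if n < 2:
            continue  # N/(N - 1) is undefined for a single observation
        mean = sum(intensities) / n
        deviation = sum(abs(i - mean) for i in intensities)
        merge_num += deviation
        meas_num += sqrt(n / (n - 1)) * deviation
        denom += sum(intensities)

    if denom == 0.0:
        raise ValueError("no multiply measured reflections with nonzero intensity")
    return merge_num / denom, meas_num / denom


if __name__ == "__main__":
    # Toy data: two reflections, measured two and three times.
    toy = [
        ((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
        ((0, 1, 0), 50.0), ((0, 1, 0), 50.0), ((0, 1, 0), 56.0),
    ]
    r_merge, r_meas = merging_r_factors(toy)
    print(f"R_merge = {r_merge:.4f}, R_meas = {r_meas:.4f}")
```

Because the factor [N/(N − 1)]^{1/2} is greater than 1 for every multiply measured reflection, R_meas is never smaller than R_merge on the same data, and the gap narrows as redundancy increases.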


TOC
