Article pubs.acs.org/OPRD

Combining Metabolic Pathway Design and Retrosynthetic Planning for the Design of a Novel Semisynthetic Manufacturing Scheme for Paclitaxel

Vikramaditya G. Yadav*

Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States

Supporting Information

ABSTRACT: Using the example of paclitaxel, this paper expounds how retrosynthetic analysis can be combined with metabolic pathway design to devise novel semisynthetic schemes for manufacturing natural products. The analytical framework presented herein leverages the latest developments in chem- and bioinformatics, metabolic engineering, and retrosynthetic planning, and the proposed schemes commence with the microbially aided synthesis of an advanced intermediate that is then recruited for target-oriented synthesis (TOS). A technoeconomic analysis of the scheme devised using bioretrosynthetic analysis for manufacturing paclitaxel suggests that the new process competes favorably with the current route for producing paclitaxel. Additionally, since TOS can access precise regions of chemical space, either a single molecule or a small assortment of molecules exhibiting minor variations on a chemical theme, bioretrosynthesis also doubles as a tool for drug discovery.



INTRODUCTION

Expenditures on R&D by the pharmaceutical industry have risen 6-fold over the past 3 decades.1 Yet, in this time, the number of new drugs approved per billion U.S. dollars spent on R&D has fallen 10-fold.2 Too many drug candidates are failing during clinical testing, and analyses suggest that a ‘target-rich, lead-poor’ imbalance is a major reason for the decline in productivity of drug research. Specifically, all of the 21,000-odd drug products that have ever been approved by the FDA perturb or interact with only 130 or so unique functional protein domains of the more than 10,000 domains that exist in the body,3 and only 6% of all new molecular entities that were approved between 1989 and 2000 targeted a previously undrugged protein domain. It appears that all the easy targets have already been exploited, and the industry’s current synthetic toolbox is perhaps unable to generate structures capable of binding to novel protein domains.4 These limitations, together with the increased emphasis placed by regulatory agencies on production of single enantiomers instead of racemic mixtures,5,6 and heightened restrictions on impurity levels and the use of solvents and raw materials7 have only exacerbated the industry’s productivity crisis. Nevertheless, while high-throughput screening of synthetic compound libraries typically yields just a single FDA-approved drug for every 100,000 molecules,8 as many as 20 of the 7,000-odd polyketides that have been screened thus far have been commercialized. The likelihood of success of natural products is nearly 300-fold higher than that of synthetic compounds, and logic dictates that drug companies ought to redouble their efforts in discovering and developing natural-product-based drugs.
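The "nearly 300-fold" figure can be checked from the two hit rates quoted above. The short sketch below does only that arithmetic; it assumes the polyketide numbers stand in for natural products as a whole, which is how the text appears to use them, and the variable names are illustrative.

```python
# Hit rates quoted in the text (commercialized compounds per compounds screened).
synthetic_hits, synthetic_screened = 1, 100_000
polyketide_hits, polyketide_screened = 20, 7_000

synthetic_rate = synthetic_hits / synthetic_screened
polyketide_rate = polyketide_hits / polyketide_screened

# Ratio of success likelihoods; this is the "nearly 300-fold" comparison.
fold_advantage = polyketide_rate / synthetic_rate
print(f"Polyketide hit rate is about {fold_advantage:.0f}-fold higher")
```

The ratio comes out a little below 300, consistent with the rounded figure in the text.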
Instead, natural products, not synthetic compounds, are the ones that are rapidly falling out of favor across the industry.9 Synthetic inaccessibility is the principal reason for the diminishing interest of pharmaceutical companies in natural products. After all, it is not mere coincidence that natural product-based drugs typically are more cyclized and bear a larger number of chiral centers, oxygenated substituents, and solvated hydrogen-bond donors and acceptors than synthetic pharmaceuticals.9 Correctly stitching together these structural attributes unavoidably increases the number of reaction and purification steps in a synthetic scheme, which often makes the production process uneconomical and impracticable at larger scales. For instance, total synthesis of paclitaxel, which possesses 11 stereocenters, comprises 51 reaction steps and has an overall yield of 0.03%.10,11 Additionally, since process development does not normally commence until the latter half of phase II clinical trials,12 companies are discouraged by the high likelihood that they will be unable to develop a feasible manufacturing process. The attrition of a candidate drug on account of an inability to develop a manufacturing process is quite a costly loss, and it is this fear of costly failures, more than anything else, that has turned the tide against natural products. Although natural products can also be procured by directly extracting the molecules from their native hosts or host tissues, this method too has its drawbacks.

The supply of source material from which the natural product is extracted is often unreliable and prone to seasonal and environmental variations,13 and an overwhelming majority of organisms simply cannot be cultured in the laboratory.14 Additionally, since the associations between natural products and their designated targets are quite strong and selective, there is never a need for the host to synthesize them in copious quantities.15 That six fully developed yew trees, which take about 200 years to grow, cumulatively produce no more than a single dose of the blockbuster anticancer drug paclitaxel illustrates just how low some yields are.16 The lower the biosynthetic yield, the greater is the demand for source material during manufacturing, and separation costs are also higher. For example, a significant proportion of the bulk price of paclitaxel, which in 2012 was estimated to be roughly $190,000 per kg, is attributed to separation costs.17 Increased procurement of source material, especially plant material, also raises the risk of damage to the biodiversity of the host’s natural habitat.18 Microbial metabolic engineering has been touted as a possible alternative for discovering and manufacturing natural products,19 and much of the confidence in this methodology stems from a decade-old body of work on the successful recruitment of genetically engineered microorganisms for the synthesis of complex polyketides.20,21 To the uninitiated, cellular metabolism is a highly connected and well-regulated network of a large number of biochemical reactions, and metabolic engineering can be defined as the modification of a cell’s metabolic network for increased production of a specific molecule. The practice is about two decades old,22 and the first decade of metabolic engineering centered on enhancing the production of native microbial metabolites such as ethanol, acetic acid, and several amino acids.23 These attempts to engineer microbial metabolism relied on deleting or overexpressing single or multiple genes based on knowledge of the stoichiometry, kinetics, and regulation of the pathway of interest. During the second decade of metabolic engineering, on the back of the sterling advances in microbial genetics and plant biotechnology, as well as the explosion in the volume of gene and protein data, metabolic engineers eventually ventured into the production of plant metabolites in microorganisms. This goal is generally achieved by introducing plant biosynthetic genes into the microorganism’s metabolic network, overexpressing genes controlling production of chemical precursors, and eliminating unnecessary native reactions to enhance yields.

By manipulating the metabolic network of microorganisms to express combinations, permutations, and mutations of natural product biosynthetic genes using standardized tools of genetic engineering, it is possible to synthesize fundamentally new chemical entities that could bind to previously undrugged protein domains. However, before this goal can be realized, it is imperative to overcome some of the constraints that regulate natural product biosynthetic pathways. Specifically, since most natural products are components of the immune and defense systems of their host, and since an organism’s evolutionary fitness is contingent on its ability to synthesize that otherwise rare molecule that can strongly bind to cellular targets in competing species, the vast majority of natural product biosynthetic pathways comprise enzymes that act on several substrates and/or catalyze the formation of multiple products.13,24 This implies that these pathways are actually highly branched and possibly synthesize several products. The paclitaxel biosynthetic pathway, for instance, generates well in excess of 100 products. As natural product biosynthetic pathways are quite long, the dissipation of intermediate molecules to competing chemical reactions at each step of the pathway amounts to inordinate losses in the overall yields. Enhancing the substrate and product fidelities of dissipative enzymes is now an urgent problem in metabolic engineering and biocatalysis. Unfortunately, the canonical methodology for modulating the activity of an enzyme, which involves selecting similar examples in nature as starting points for modification using mutagenesis or directed evolution, is inapplicable to the current scenario. More elaborate structure-guided approaches are required. However, not only is this a very long and involved process, but even in the event that the characteristics of every single enzyme in the pathway are eventually improved, it is uncertain whether simple microorganisms such as Escherichia coli, the most commonly employed host for metabolic engineering applications, will be able to express such a large collection of enzymes without incurring grave physiological stresses.

Diversity-oriented biosynthesis presents an entirely new challenge for metabolic engineers, and it is apparent that synthesizing an advanced intermediate that acts as a gateway molecule for target-oriented chemical synthesis (TOS), instead, is a more viable alternative. Not only would the number of enzymes to be incorporated into the pathway be significantly more tenable, but such a semisynthetic manufacturing process would also take advantage of the core competencies of both metabolic engineering and synthetic chemistry.

Special Issue: Biocatalysis 14
Received: December 10, 2013
Published: February 19, 2014
© 2014 American Chemical Society

dx.doi.org/10.1021/op4003505 | Org. Process Res. Dev. 2014, 18, 816−826

Organic Process Research & Development

Figure 1. Structures of ATT and paclitaxel and overview of the taxane carbon numbering scheme.


Scheme 1. Current understanding of the paclitaxel biosynthetic pathway

This paper reports results of work on the formalization of a conceptual framework to devise novel semisynthetic schemes for manufacturing natural products. Briefly, the reactive landscape of a natural product biosynthetic network is first analyzed using chemical pattern searching and substrate docking into in silico models of the active sites of the pathway’s enzymes. Pathway intermediates identified by the analysis are then subjected to retrosynthetic planning to reveal the intermediate that can be converted to the natural product most efficiently. The entire exercise of employing biosynthesis to generate the building blocks for retrosynthesis is labelled as ‘bioretrosynthesis’, and the utility of this approach is demonstrated herein by devising a novel semisynthetic manufacturing scheme for paclitaxel from the intermediate (1,2α,5α,7β,10β)-5-acetyloxy-1,2,7,10-tetrahydroxy-tax-4,11-diene (abbreviated as ATT, Figure 1). Using a technoeconomic analysis, it is shown that the semisynthetic manufacturing scheme for paclitaxel that has been devised herein competes quite favorably with the most routinely employed route25 for producing paclitaxel from 10-deacetylbaccatin III (10-DAB), another intermediate in the pathway. It is highly likely that the benefits derived from bioretrosynthesis will be even greater for those natural products for which existing manufacturing schemes are not nearly as efficient as they are in the case of paclitaxel. Significantly, bioretrosynthetic planning reveals the identities of the candidate enzymes to be recruited and evolved for assembly of the desired pathway.26

Moreover, since TOS can access precise regions of chemical space (defined as the set of structural features and physicochemical properties that a molecule could possess), either a single molecule or a small assortment of molecules exhibiting minor variations on a chemical theme, bioretrosynthesis could also be utilized as a tool for drug discovery.



METHODS

Current Understanding of Paclitaxel Biosynthesis. Paclitaxel and its co-products are commonly referred to as taxanes on account of their unique C20 scaffold. Taxane biosynthesis commences with the cyclization of the linear molecule geranylgeranyl pyrophosphate (GGPP) to produce taxa-4(5),11(12)-diene (or taxadiene) and taxa-4(20),11(12)-diene (or isotaxadiene).27 The former is then oxidized by a cytochrome P450 monooxygenase called taxadiene 5α-hydroxylase to form taxa-4(20),11(12)-dien-5α-ol (or taxadien-5α-ol) and 5(12)-oxa-3(11)-cyclotaxane (or OCT), among others.28−30 It is generally agreed that taxadien-5α-ol later undergoes a series of substitutions en route to forming paclitaxel, yet the exact sequence of these substitutions is hitherto unknown (Scheme 1).27

Figure 2. Oxidation of taxadiene by the enzyme taxadiene 5α-hydroxylase.

Mapping the Taxane Pathway. Investigations on Taxus cuspidata (Japanese yew) cell cultures had previously characterized several intermediates and biosynthetic genes in the taxane pathway.27,31−33 The reactive landscape can, therefore, be revealed by evaluating the binding affinities for every possible enzyme−substrate pair. However, since the extent of branching is typically higher in the early part of secondary metabolic pathways,15,34 it is helpful to differentiate enzymes and substrates that are involved in metabolic reactions that occur earlier in the pathway from those that participate later in the pathway, viz., map the pathway. This information can aid in greatly reducing the number of enzyme−substrate pairs whose binding affinities are being evaluated in silico. Graph theory and atom mapping have been previously utilized to discover novel or nonstandard metabolic pathways.35,36 However, since secondary metabolic pathways proceed using a structural core formation-diversification synthesis plan,37 neither of these approaches is likely to shed any light on pathway topology. In light of these facts, the pathway was mapped by analyzing chemical substitution patterns in the taxane metabolites. To this end, chemical structures of as many naturally derived taxanes as possible were first cataloged (results of the analysis of the taxane structures are summarized in Section S1 of the Supporting Information), and this pictorial information was then recast into a simple, mathematically interpretable vector representation wherein the length, column numbers, and elements of the vector represent the total number of carbon atoms in the molecule, their assigned IUPAC designations, and the type of substitution they bear, respectively. Bayesian probabilities were then applied to infer relationships between the taxane metabolites based on their substitution patterns. Additional details regarding the use of Bayesian probabilities for analyzing substitution patterns have been provided in Section S2 in the Supporting Information. The map of taxane metabolism as outputted by the substitution pattern finding is then validated in the second stage of the theoretical framework: computational enzyme catalysis.
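The vector encoding described above can be illustrated with a minimal sketch. The catalog below is a hypothetical toy set, not data from the paper, and the ordering rule is a simple conditional-probability heuristic standing in for the Bayesian analysis, whose actual formulation is given only in Section S2 of the Supporting Information. All function and variable names are assumptions made for illustration.

```python
from itertools import permutations

N_CARBONS = 20  # the taxane scaffold is C20; index k holds the substituent on carbon k+1

# Hypothetical toy catalog (NOT data from the paper): carbon number -> substituent.
CATALOG = {
    "toy_A": {5: "OH"},
    "toy_B": {5: "OH", 10: "OH"},
    "toy_C": {5: "OAc", 10: "OH"},
    "toy_D": {5: "OAc", 10: "OH", 2: "OH"},
}

def to_vector(substituents):
    """Encode a structure as a length-20 vector; 'H' marks an unsubstituted carbon."""
    return [substituents.get(c, "H") for c in range(1, N_CARBONS + 1)]

def conditional(vectors, given, target):
    """Estimate P(target carbon substituted | given carbon substituted) by counting."""
    with_given = [v for v in vectors if v[given - 1] != "H"]
    if not with_given:
        return 0.0
    return sum(v[target - 1] != "H" for v in with_given) / len(with_given)

def likely_precedes(vectors, positions):
    """Return pairs (i, j) where substitution at i plausibly precedes substitution at j."""
    pairs = []
    for i, j in permutations(positions, 2):
        # i is taken to precede j when seeing j implies i more strongly than the reverse.
        if conditional(vectors, j, i) > conditional(vectors, i, j):
            pairs.append((i, j))
    return pairs

vectors = [to_vector(s) for s in CATALOG.values()]
order = likely_precedes(vectors, [2, 5, 10])
print(order)
```

On this toy catalog the heuristic places substitution at C5 before C10 and C10 before C2. Ties between positions yield no pair, which loosely mirrors the situation where a temporal relation cannot be deduced from substitution patterns alone.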
Herein, bioinformatics and computational chemistry are utilized to study enzyme catalysis. To eliminate bias, enzyme annotations as they appear in the database of biological information maintained by the National Center for Biotechnology Information (NCBI) were ignored.

In Silico Investigation of the P450 Oxidation Cycle. A detailed, step-by-step description of the in silico methodology that was specifically developed for this study has been included in Section S3 in the Supporting Information. Briefly, since crystallographic structures of any taxane biosynthetic enzymes are not yet available, homology modeling38 was utilized to obtain precise representations of the enzymes and their active sites. Homology modeling, or comparative modeling, is a methodology that is commonly utilized by structural biologists to construct atomic-resolution models of enzymes. The method is akin to using a ball and string to model enzymes, wherein the string is the amino acid sequence of the enzyme for which no models exist, and the ball is an available template whose amino acid sequence closely resembles that of the string. By matching amino acids in both sequences, the string is wrapped around the ball to obtain a three-dimensional model, whose geometry is then further optimized using physicochemical constraints.

P450-catalyzed oxidations proved to be the single most important reaction group in the pathway. However, because P450 enzymes possess a heme prosthetic group whose geometry cannot be captured using homology modeling software, a novel computational approach had to be devised to investigate the P450 catalytic cycle. The homology models were first augmented with the heme prosthetic groups, the thiolate linkage between the heme iron and cysteinyl sulfur was then created, and finally, geometry of the heme moiety was optimized using the BFGS energy minimization algorithm. The average bond length of the thiolate linkage for the 16 homology models was 2.56 Å, which compares quite favorably with the average thiolate bond length of 2.47 Å for the templates used to develop the homology models. The taxane substrates identified in the mapping exercise were then docked into the active sites of the P450 models using AUTODOCK.39 Docking is a protocol in molecular modeling that predicts the most energetically favorable orientation of a substrate in the active site of an enzyme. However, docking provides only a static representation of the substrate in the active site at a particular instant in the reaction. Since the heme prosthetic group dynamically proceeds through multiple states during the catalytic cycle, a ‘snapshot’ methodology that assesses substrate poses at each stage of the P450 cycle and its eventual influence on the reactive pose, viz., the pose that the substrate adopts just prior to reaction, was implemented to draw rational conclusions regarding the reactive proclivities of the enzymes.
Since the diffusional trajectory of oxygen into the activated heme-containing active site greatly limits the translational and rotational degrees of freedom the now-destabilized substrate has in order that it may assume a more stable position, this constraint was utilized to refine the outputs of the modeling algorithm. The likeliest path that oxygen takes to diffuse into the active site was calculated using the software package CAVER.40 An overview of the model predictions for the oxidation of taxadiene by the enzyme taxadiene 5α-hydroxylase is provided in Figure 2. The modeling methodology identifies two possible reaction trajectories for the oxidation of taxadiene by the P450 hydroxylase. The first panel for both schemes represents binding of taxadiene to the activated heme-containing active site. The diffusion path of oxygen is highlighted in red in the second panel, and the reactive conformation that leads to substrate oxidation is depicted in the third panel. Both trajectories, however, predict competition between abstractions of the same 3 protons by the oxyferryl heme iron. These are, in their order of favorability, the protons attached to the 3-, 13α-, and 20-positions. The computational methodology for studying P450-mediated substrate oxidation was validated by predicting the reaction trajectories for the dihydroxylation of vitamin D3 by CYP105A1, which has been studied experimentally. The modeling methodology’s outputs compare favorably to the crystallographically determined substrate poses in both the activated heme and oxyferryl heme active sites of CYP105A1 (PDB codes: 2ZBZ and 3CV9).

Retrosynthetic Planning. The use of retrosynthetic planning identified the pathway intermediate that could serve as the best starting material for semisynthesis of paclitaxel. SciFinder SciPlanner, Reaxys, ARChem, and published literature10,25,41−49 were utilized for this exercise.

Figure 3. Early taxane biosynthetic pathway.

Figure 4. Predicted reaction landscape of the taxane biosynthetic pathway.

RESULTS

Pathway Map. Bayesian analysis of the substitution patterns revealed a putative map of taxane metabolism, wherein early stages almost exclusively comprise P450-mediated oxidations (Figure 3; a more detailed map is provided in Figure S6 in the Supporting Information). While substitutions at most carbon atoms on the taxane scaffold can be definitively ordered, the temporal relation between some functionalizations such as hydroxylations at the 2α-, 9α-, and 13α-positions cannot be deduced. Also, substitution patterns suggest that


oxetane ring formation is among the last modifications to occur on the scaffold. The suggestion that the early pathway almost exclusively consists of P450-mediated reactions is not all that surprising as this biosynthetic strategy has also been observed in other natural product biosynthetic pathways. As P450 monooxygenases catalyze C−H bond functionalization instead of acylation of an O−H bond that is mediated by acyltransferases, the former are more prone to promiscuity and nonselectivity, increasing chemical diversity within the pathway. Interestingly, studies on the production of paclitaxel by plant cell cultures of T. cuspidata have revealed that paclitaxel, on a weight basis, accounts for about a fifth of all taxanes produced. Accounting for variations in molecular weights, the map outputted by the substitution pattern finding exercise appears to mirror actual production of paclitaxel. Single amino acid mutations within the active site of P450 monooxygenases could make these reaction sites ‘plastic’, which according to the Jones-Firn Selection Hypothesis15,34,50 is a more robust and reliable way to generate diversity compared to the significant alterations that might be required to reorient a polyoxidized molecule in order to drive acylation at one hydroxyl group over another. Accordingly, the reactive landscape was subsequently investigated by assuming that the hydroxylations at the 5α- and 10β-positions precede the others and assuming all possible permutations of the 2α-, 9α-, and 13α-hydroxylations.

Reactive Landscape. The enzyme that has been annotated as taxadiene 5α-hydroxylase in the sequences derived from the Taxus cDNA library is reannotated as H16 in the computational assessment described herein. It is evident that taxadiene 5α-hydroxylase is a very promiscuous enzyme (Figure 4). Interestingly, none of the other 16 P450s exhibit 5α-hydroxylase activity. H6 is predicted to exhibit 10β-hydroxylase activity on taxadiene and 2α-hydroxylase activity on taxadien-5α,10β-diol, and the computational methodology corroborates its annotation in the NCBI database. H13, on the other hand, appears to catalyze stereo- and regiospecific 7β-hydroxylation of taxadiene but is annotated as a 10β-hydroxylase in the NCBI database. This dissonance merits further investigation but was not pursued in this study. H9, on the other hand, is predicted to be a taxadiene-5α-ol 10β-hydroxylase, and H4 is predicted to catalyze the 2α-hydroxylation of taxadiene-5α,10β,13α-triol. Additionally, the size of the active site (in Å3) does not necessarily correlate with promiscuity. For instance, H9, which is estimated to possess the second largest active site among the 16 P450 monooxygenases does not exhibit any activity (as predicted by our computational methodology) to substrates bearing substitutions at more than a single carbon atom on the taxane scaffold. The active site volumes of all P450 monooxygenases evaluated in this study have been listed in Table S1 in the Supporting Information.

Formulation of a New Route for Semisynthesis of Paclitaxel. Pathway intermediates that were identified via pathway mapping and assessment of the reactivity landscape within the taxane biosynthetic pathway were subjected to retrosynthetic analysis to devise a novel and efficient semisynthetic scheme for production of paclitaxel. This exercise revealed ATT to be the best starting material for the new manufacturing scheme (Scheme 2). On the basis of results of the analysis of the reactive landscape presented previously, expressing H16, H9, and H6 in a strain expressing taxadiene should, in theory, yield taxadiene-2α,5α,10β-triol. Of these enzymes, H16 might have to be significantly re-engineered in order to minimize its promiscuity towards taxanes other than taxadiene. H13, which catalyzes 7β-hydroxylation of taxadiene, and H4, which hydroxylates taxadiene-5α,10β,13α-triol at the 1-carbon, will also need to be sufficiently re-engineered to alter their substrate preference to taxadiene-2α,5α,10β-triol and its 7β- or 1-hydroxy derivative. Details about the individual steps in Scheme 2 have been provided in Section S5 in the Supporting Information.

Scheme 2. Proposed synthesis of paclitaxel from ATT (Δ). Reagents: (a) CH2Cl2; (b) Et3N, 2-acetoxyacetyl chloride, CH2Cl2; (c) (NH4)2Ce(NO3)6, CH3CN; (d) MeOH, Na2CO3; (e) TESCl, pyridine; (f) 4-DMAP, Et3N, benzoyl chloride, CH2Cl2; (g) CDI, NaH, DMF; (h) TESCl, pyridine; (i) H5IO6, CH3CN, PCC; (j) dioxane, MeOH, KOH; (k) quinuclidine, NMO, tBuOH, NaBH4, OsO4, THF; (l) 4-DMAP, CH3COCl, CH2Cl2; (m) 4-DMAP, MsCl, CH2Cl2; (n) K2CO3, MeOH; (o) HMPA, DIEA; (p) 4-DMAP, (CH3CO)2O, CH2Cl2; (q) PhLi, CH3COOH, THF, cyclohexane; (r) tBuOK, (PhSeO)2O, THF; (s) 4-DMAP, (CH3CO)2O, pyridine; (t) CH3COONa, PCC, NaBH4, benzene, MeOH; (u) NaH, THF; (v) MeOH, 32% HCl.

Scheme 3. Synthesis of paclitaxel from 10-DAB. Reagents: (a) CH2Cl2; (b) Et3N, 2-acetoxyacetyl chloride, CH2Cl2; (c) (NH4)2Ce(NO3)6, CH3CN; (d) MeOH, Na2CO3; (e) TESCl, pyridine; (f) 4-DMAP, Et3N, benzoyl chloride, CH2Cl2; (g) TESCl, pyridine; (h) CH3COCl, pyridine; (i) NaH, THF; (j) 32% HCl, MeOH.

Technoeconomic Analysis. One-liter fed-batch cultures of an engineered E. coli strain that expresses the two Taxus enzymes, GGPP synthase and taxadiene synthase, synthesize


Table 1. Total manufacturing costs for microbe-based (Scheme 2) and plant-based (Scheme 3) manufacturing schemes

itemized costs | plant-based | microbe-based
no. of operations | 11 | 23
identities of GMP-compliant steps | 3 (f, i, j) | 3 (f, u, v)
no. of personnel (n) | 10 | 14
total capital investment (C) | $40 × 10^6 | $960 × 10^6
raw material costs (R) | $15.7 × 10^6/kg | $14.7 × 10^6/kg
project lifetime | 10 years | 10 years
depreciation costs (D) | $40,000/kg | $960,000/kg
waste and treatment costs (W) | $3050/kg | $2,754,300/kg
operating labor costs (L_O = 300,000·n) | $3 × 10^6/yr | $4.2 × 10^6/yr
nonoperating labor costs (L_N = 0.6·L) | $1.8 × 10^6/yr | $2.52 × 10^6/yr
administration/overhead (O = 0.9·L) | $2.7 × 10^6/yr | $3.78 × 10^6/yr
supplies (P = 0.3·L) | $0.9 × 10^6/yr | $1.26 × 10^6/yr
maintenance (M = 0.02·C) | $0.8 × 10^6/yr | $19.2 × 10^6/yr
utilities (U = 0.01·C) | $0.4 × 10^6/yr | $9.6 × 10^6/yr
miscellaneous (MS = 0.01·C) | $0.4 × 10^6/yr | $9.6 × 10^6/yr
annual fixed costs (A = L_O + L_N + O + P + M + U + MS) | $10 × 10^6/yr | $50.16 × 10^6/yr
fixed costs (F = A/S) | $1 × 10^6/kg | $5 × 10^6/kg
total manufacturing costs (= R + D + W + F) | $16.74 × 10^6/kg | $23.41 × 10^6/kg
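The cost formulas listed in Table 1 can be replayed with a short sketch. It assumes that the unsubscripted L in the table's formulas denotes the operating labor cost L_O, a reading that reproduces the tabulated labor figures. The per-kilogram fixed cost F is taken directly from the table rather than recomputed, because the divisor S is not defined in this part of the paper. Function names are illustrative.

```python
def annual_fixed_costs(n_personnel, capital):
    """Annual fixed costs A using the factors listed in Table 1 (USD per year)."""
    labor_operating = 300_000 * n_personnel      # L_O = 300,000 * n
    labor_nonoperating = 0.6 * labor_operating   # L_N = 0.6 * L (L read as L_O; assumption)
    overhead = 0.9 * labor_operating             # O = 0.9 * L
    supplies = 0.3 * labor_operating             # P = 0.3 * L
    maintenance = 0.02 * capital                 # M = 0.02 * C
    utilities = 0.01 * capital                   # U = 0.01 * C
    miscellaneous = 0.01 * capital               # MS = 0.01 * C
    return (labor_operating + labor_nonoperating + overhead + supplies
            + maintenance + utilities + miscellaneous)

def total_manufacturing_cost(raw, depreciation, waste, fixed):
    """Total manufacturing cost per kg = R + D + W + F (USD per kg)."""
    return raw + depreciation + waste + fixed

# Inputs copied from Table 1; F values are the tabulated ones.
plant_annual = annual_fixed_costs(n_personnel=10, capital=40e6)
microbe_annual = annual_fixed_costs(n_personnel=14, capital=960e6)
plant_total = total_manufacturing_cost(15.7e6, 40_000, 3_050, 1e6)
microbe_total = total_manufacturing_cost(14.7e6, 960_000, 2_754_300, 5e6)

print(f"plant-based:   A = ${plant_annual / 1e6:.2f}M/yr, total = ${plant_total / 1e6:.2f}M/kg")
print(f"microbe-based: A = ${microbe_annual / 1e6:.2f}M/yr, total = ${microbe_total / 1e6:.2f}M/kg")
```

Keeping capital-linked items (M, U, MS) separate from labor-linked items makes it easy to see that the difference in annual fixed costs between the two schemes is driven almost entirely by the capital investment figure.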

nearly 1 g of taxadiene over 5 days.51 Production of ATT by this strain necessitates the expression of 6 additional enzymes (5 cytochrome P450 monooxygenase-reductase complexes and a single acetyltransferase), all of whose identities have been established in the preceding section. The global paclitaxel supply chain is populated by a little over a dozen API producers that supply the bulk drug to numerous formulators such as BMS or Mylan that then produce finished doses of the drug for distribution to various consumer channels. The current manufacturing scheme for paclitaxel is a 4-step synthetic scheme that utilizes 10-DAB that has been extracted from plant cell cultures of T. baccata (Scheme 3). The average yield of 10-DAB in T. baccata cell cultures is roughly 160 mg/g DCW,52,53 and the paclitaxel yield from 10-DAB is 44.7% (molar basis),25 which translates to an overall semisynthetic yield (assuming a plant water content of 80 wt %) of 2.25 wt %. Details about the individual steps in Scheme 3 have been provided in Section S5 in the Supporting Information. A paclitaxel yield of 2.25 wt % represents a >200-fold increase over the paclitaxel content in the bark needles of T. baccata, underscoring the importance of semisynthesis in the production of therapeutic natural products that are otherwise too minuscule to be directly extracted from their native hosts. The API manufacturers either produce 10-DAB internally or source the material from external providers and typically produce 50−100 kg of paclitaxel annually.17 In the economic assessment that follows, the annual production volume for the manufacturing facility is also assumed to be 100 kg, and the proposed semisynthetic process is compared with a plant cell fermentation process that produces 160 mg/g DCW of 10-DAB. The yield of 10-DAB can be increased either via plant metabolic engineering54,55 or through use of biocatalysts that cleave the side chains of numerous taxanes to form 10-DAB.
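The 2.25 wt % overall yield quoted above can be approximately reproduced from the stated inputs. The sketch below is one plausible reading of that calculation, not the author's own worksheet: it converts the molar yield to a mass yield using the molecular weights of paclitaxel and 10-DAB (values supplied here, not stated in the text) and then rescales from dry to wet biomass.

```python
# Molecular weights are supplied for this sketch; they are not given in the text.
MW_PACLITAXEL = 853.9  # g/mol, C47H51NO14
MW_10_DAB = 544.6      # g/mol, C29H36O10

def paclitaxel_wt_fraction(dab_g_per_g_dcw, molar_yield, water_fraction):
    """Paclitaxel obtained per gram of wet plant biomass (mass fraction)."""
    # Molar yield -> mass yield: each mole of 10-DAB gives a heavier mole of paclitaxel.
    mass_yield = molar_yield * MW_PACLITAXEL / MW_10_DAB
    paclitaxel_per_g_dcw = dab_g_per_g_dcw * mass_yield
    # Rescale from dry cell weight to wet weight using the assumed water content.
    return paclitaxel_per_g_dcw * (1.0 - water_fraction)

overall = paclitaxel_wt_fraction(dab_g_per_g_dcw=0.160, molar_yield=0.447, water_fraction=0.80)
print(f"overall semisynthetic yield: {100 * overall:.2f} wt %")
```

Under these assumptions the result lands within rounding distance of the 2.25 wt % quoted in the text.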
BMS, for instance, developed three enzymes, 13-taxolase, 10-deacetylase, and 7-xylosidase, that hydrolyze the side chains present at the 13-, 10-, and 7-positions, respectively, and demonstrated a 5.5−24-fold increase in the yield of 10-DAB in cultivars of T. hicksii and T. brevifolia.56−59 In fact, BMS and Phyton Biotech have co-commercialized a plant cell fermentation process for production of 10-DAB that is estimated to eliminate as much as 71,000 pounds of hazardous chemicals60,61 and produce up to 15 mg/L/day of taxanes.62 BMS has also replaced some of the synthetic transformations involved in the assembly of the C13 side chain of paclitaxel with enzyme-catalyzed reactions to improve yields and product selectivity. In one application, reaction d in Scheme 3 was replaced by a lipase-catalyzed reaction wherein the racemic acetate product of reaction c is stereoselectively resolved to yield the desired intermediate.63 Several lipases were tested and optimized for this purpose, and reaction yields and optical purities in excess of 96% and 99.5% were obtained. In a second application, steps a−f of Scheme 3 were entirely replaced by a microbe-based process wherein 2-keto-3-(N-benzylamino)-3-phenylpropionic acid ethyl ester was converted to (2R,3S)-(−)-N-benzoyl-3-phenylisoserine ethyl ester, which is then chemically coupled with baccatin to produce paclitaxel.64 Several variations of BMS’s microbe-based scheme for synthesis of the C13 phenylisoserine side chain now exist.65 Nevertheless, although the BMS-Phyton process has set the standard for industrial production of natural product APIs, it must be emphasized that separation costs still constitute a major fraction of the costs for plant cell-derived 10-DAB. The separation costs for microbial metabolic engineering, given the significantly smaller pool of products, are not anticipated to be as high as for plant cell-based processes, and the benefits derived from bioretrosynthesis will be even greater for those natural products for which existing manufacturing schemes are not nearly as efficient as they are in the case of the BMS-Phyton process. Since bulk chemical prices for many of the reagents, catalysts, and solvents that are required by the two semisynthetic schemes are not readily available, SciFinder Scholar was queried to identify the lowest price for all raw materials (Tables S2 and S3 in the Supporting Information).
The price of ATT was estimated by comparing the economics of plant and microbial cultivation. Briefly, the optical density of the taxadiene-producing E. coli cultures reaches a peak of ∼40 in 5-day, 1 L fed-batch cultures. Taxadiene production is approximately 1 g/L at the end of the production run. Assuming an equivalency between an optical density of 1.0 and a dry biomass titer of 0.3 g/L, the overall yield of taxadiene on a dry cell weight basis is estimated to be 0.083 g/g (or 1 g of taxadiene produced by 12 g DCW of cells in a 1 L batch). A 30% reduction in pathway flux was factored for every additional enzyme that is expressed within the pathway, which translates to a dry basis yield of 0.01 g/g for ATT. The doubling time of the engineered E. coli strains is estimated to be about 12 h, which is nearly 12-fold lower than the doubling time of T. baccata suspension cultures.66 Despite having a dry basis yield that is 16 times lower than that of 10-DAB, microbial production of ATT is nearly 270-fold more productive (on a 5-day basis) than the plant cultures that produce 10-DAB (assuming that plant cell fermentation runs for 21 days and produces 160 mg/g DCW at the end of the run67). Another important distinction between the two cultures is that the E. coli strains utilize glycerol as the principal carbon source, whereas the plant cells are cultivated using a combination of sucrose, glucose, and fructose as carbohydrate feedstocks. Although consumption of glycerol by the E. coli cultures is estimated to be roughly 4 times greater than the mono- and disaccharide demand by the suspension cultures of T. baccata, the media formulation costs for plant cell cultivation are greater owing to the higher unit price of the sugars as well as higher nutrition supplementation costs. Consequently, the productivity of the microbial cultures is conservatively estimated to be 300-fold greater than that of the plant suspension cultures, which amounts to each kilogram of ATT costing about $36,600. This means that the raw material costs for synthesizing paclitaxel from 10-DAB are a little over 6% greater than those of the competing semisynthetic process that commences from ATT. The total manufacturing costs for the microbe-based (Scheme 2) and plant-based (Scheme 3) processes are summarized in Table 1. The assumptions that were made for estimating the manufacturing costs are summarized in Section S6 in the Supporting Information.
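The yield arithmetic in the preceding paragraph can be retraced with a short calculation. The sketch below is illustrative only: the optical density, biomass equivalency, titer, and 30% flux penalty are taken from the text, whereas the number of additional enzymes (`N_EXTRA_ENZYMES = 6`) is an assumption chosen because it brings the result close to the quoted 0.01 g/g; the text does not state this number here. All function and variable names are hypothetical.

```python
# Back-of-the-envelope check of the dry-basis yield estimate.
OD_PEAK = 40.0          # peak optical density of the fed-batch culture (from the text)
DCW_PER_OD = 0.3        # g DCW per litre per unit of optical density (from the text)
TAXADIENE_TITER = 1.0   # g/L of taxadiene at the end of the run (from the text)
FLUX_RETAINED = 0.70    # 30% flux reduction per additional enzyme (from the text)
N_EXTRA_ENZYMES = 6     # ASSUMPTION: not stated in the text; reproduces ~0.01 g/g


def dry_biomass(od: float, dcw_per_od: float = DCW_PER_OD) -> float:
    """Dry cell weight titer (g/L) implied by an optical density."""
    return od * dcw_per_od


def yield_on_dcw(titer: float, biomass: float) -> float:
    """Product yield on a dry cell weight basis (g product per g DCW)."""
    return titer / biomass


def yield_after_flux_penalty(base_yield: float, n_enzymes: int,
                             retained: float = FLUX_RETAINED) -> float:
    """Apply a multiplicative flux penalty once per additional enzyme."""
    return base_yield * retained ** n_enzymes


biomass = dry_biomass(OD_PEAK)                                # 12 g/L
taxadiene_yield = yield_on_dcw(TAXADIENE_TITER, biomass)      # about 0.083 g/g
att_yield = yield_after_flux_penalty(taxadiene_yield, N_EXTRA_ENZYMES)

print(f"Dry biomass:     {biomass:.1f} g/L")
print(f"Taxadiene yield: {taxadiene_yield:.3f} g/g DCW")
print(f"ATT yield:       {att_yield:.4f} g/g DCW")
```

Treating the penalty as multiplicative per enzyme is one reading of "a 30% reduction in pathway flux for every additional enzyme"; an additive reading would give a different result and would not reproduce the quoted figure.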
The 40% difference between the unit prices of paclitaxel produced using the two schemes is simply an artifact of the high cost of benzeneseleninic acid anhydride, which alone accounts for 60% of the raw material cost for the microbial semisynthetic process. A larger demand for this reagent or a cheaper alternative could significantly reduce the raw material cost. For instance, a >60% reduction in the cost of benzeneseleninic acid anhydride makes the microbe-based process more economical than the plant-based one. Additionally, given the hazards and poor environmental quotient of using benzene and hexamethylphosphoramide as solvents in the proposed case, the final process that most pharmaceutical companies would eventually develop and approve would be more efficient, which would further improve the metrics of the microbe-based process. It appears that producing ATT at 0.0032 g/g DCW per day is not only technically feasible given the tools currently at our disposal but also cost-competitive with the semisynthetic plant-based process. The benefits derived from bioretrosynthesis will be even greater for those natural products for which existing manufacturing schemes are not nearly as efficient as they are in the case of paclitaxel.
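The sensitivity of the raw material cost to the price of a single dominant reagent can be illustrated with a one-line relation: if a reagent accounts for a share f of the raw material cost and its price falls by a fraction r, the raw material cost falls by f·r. The sketch below applies this to the 60% share quoted above. It covers the raw material component only and ignores the other contributions to the total manufacturing cost in Table 1, so it is not a reconstruction of the author's full comparison; the function name is hypothetical.

```python
def remaining_raw_material_cost(reagent_share: float, price_cut: float) -> float:
    """Fraction of the original raw material cost left after one reagent gets cheaper.

    reagent_share: fraction of raw material cost due to the reagent (0 to 1)
    price_cut:     fractional reduction in that reagent's price (0 to 1)
    """
    if not 0.0 <= reagent_share <= 1.0:
        raise ValueError("reagent_share must lie between 0 and 1")
    if not 0.0 <= price_cut <= 1.0:
        raise ValueError("price_cut must lie between 0 and 1")
    return 1.0 - reagent_share * price_cut


BSA_SHARE = 0.60  # benzeneseleninic acid anhydride share of raw material cost (from the text)

# A 60% price cut on a reagent that makes up 60% of the cost removes 36% of the
# raw material cost, leaving 64% of the original figure.
remaining = remaining_raw_material_cost(BSA_SHARE, 0.60)
print(f"Raw material cost remaining: {remaining:.0%}")
```

Under this simplification, the dominant reagent sets an upper bound on the savings: even a free supply of benzeneseleninic acid anhydride could remove at most 60% of the raw material cost.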

Figure 5. Overview of the experimental basis for target-oriented biosynthesis.

CONCLUSIONS The targeted biosynthesis of a single molecule is a 2-step process (Figure 5). The likely combination of enzymatic reactions that could produce the molecule of interest is identified in the first step. Next, the active sites of the selected enzymes are re-engineered to enhance substrate specificity, product selectivity, and turnover. The mutagenized enzymes must then be combined into a pathway, and enzyme expression must be suitably toggled in order to adjust their concentrations. Discriminatory expression equalizes enzyme turnovers and elevates flux through the pathway. Unfortunately, the production profiles of most enzymes in the taxane biosynthetic pathway are unknown. This greatly complicates the task of identifying the best candidates for mutagenesis. Still, it is believed that taxadiene needs to undergo at least 15 biotransformations en route to paclitaxel. This situation presents yet another quandary for metabolic engineers. Manipulating the metabolism of E. coli to accommodate the expression of such a large set of enzymes might not just be prohibitively long; it might very well be unfeasible owing to the physiological stress that might be induced within the host. Instead, synthesizing an advanced intermediate that acts as a gateway molecule for chemical synthesis is a more viable alternative. Not only would the number of enzymes to be incorporated into the pathway be significantly more tenable, but such a semisynthetic manufacturing process would also take advantage of the core competencies of both metabolic engineering and synthetic chemistry. Herein, the development and use of a computational methodology that incorporates chemical pattern searching and substrate docking into in silico models of the active sites of enzymes to assess the reactive landscape within the taxane biosynthetic pathway has been reported. Retrosynthetic planning is then utilized to identify ATT as the desired starting material for semisynthesis of paclitaxel, which in turn yields the candidates for mutagenesis.
Using a technoeconomic comparison between the existing plant-based semisynthetic process and the one that has been proposed herein, target titers and productivities for a strain expressing an optimal combination of P450 monooxygenases and acyltransferases have also been identified. The titers calculated herein serve as benchmarks for future protein and metabolic engineering activities.



ASSOCIATED CONTENT

S Supporting Information *

S1, structural catalog of the taxane library; S2, application of Bayesian probabilities to deduce temporal relationships between the taxane metabolites; S3, investigating enzyme catalysis in silico; S4, map of the taxane biosynthetic pathway; S5, raw material costs for the technoeconomic analysis; S6, assumptions made for calculating manufacturing costs; and references pertaining to this section. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

Present Address

†Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States.

Notes

The author declares no competing financial interest.

ACKNOWLEDGMENTS The author thanks Gregory Stephanopoulos, Kristala Prather, and Narendra Maheshri for their insightful comments and Orr Ravitz of SimBioSys Inc. for graciously providing access to use the ARChem Route Designer software. The author also acknowledges the invaluable comments provided by anonymous reviewers that have greatly improved the quality of this work and thanks the MIT Legatum Center for financial support.

REFERENCES

(1) Juliano, R. L. Sci. Public Policy 2013, 40, 393−405.
(2) Scannell, J. W.; Blanckley, A.; Boldon, H.; Warrington, B. Nat. Rev. Drug Discovery 2012, 11, 191−200.
(3) Overington, J. P.; Al-Lazikani, B.; Hopkins, A. L. Nat. Rev. Drug Discovery 2006, 5, 993−996.
(4) Roughley, S. D.; Jordan, A. M. J. Med. Chem. 2011, 54, 3451−3479.
(5) Stinson, S. C. Chem. Eng. News 2001, 79, 79−97.
(6) Collins, A. N.; Sheldrake, G. N.; Crosby, J. Chirality in Industry; Wiley: New York, 1992.
(7) European Medicines Agency, 2009.
(8) Li, J. W.; Vederas, J. C. Science 2009, 325, 161−165.
(9) Koehn, F. E.; Carter, G. T. Nat. Rev. Drug Discovery 2005, 4, 206−220.
(10) Nicolaou, K. C.; Yang, Z.; Liu, J. J.; Ueno, H.; Nantermet, P. G.; Guy, R. K.; Claiborne, C. F.; Renaud, J.; Couladouros, E. A.; Paulvannan, K.; Sorensen, E. J. Nature 1994, 367, 630−634.
(11) Taylor, R. E.; Chen, Y. Org. Lett. 2001, 3, 2221−2224.
(12) Federsel, H. J. Nat. Rev. Drug Discovery 2003, 2, 654−664.
(13) Firn, R. Nature’s Chemicals: The Natural Products that Shaped Our World; Oxford University Press: Oxford, U.K., 2010.
(14) Schloss, P. D.; Handelsman, J. Curr. Opin. Biotechnol. 2003, 14, 303−310.
(15) Firn, R. D.; Jones, C. G. Mol. Microbiol. 2000, 37, 989−994.
(16) Goodman, J.; Walsh, V. The Story of Taxol: Nature and Politics in the Pursuit of an Anti-Cancer Drug; Cambridge University Press: Cambridge, U.K., 2001.
(17) World Bulk Paclitaxel Market; Frost & Sullivan: New York, NY, 2001.
(18) Zhu, F.; Qin, C.; Tao, L.; Liu, X.; Shi, Z.; Ma, X.; Jia, J.; Tan, Y.; Cui, C.; Lin, J.; Tan, C.; Jiang, Y.; Chen, Y. Proc. Natl. Acad. Sci. U.S.A. 2011, 108, 12943−12948.
(19) Khosla, C.; Keasling, J. D. Nat. Rev. Drug Discovery 2003, 2, 1019−1025.
(20) Tang, L.; Shah, S.; Chung, L.; Carney, J.; Katz, L.; Khosla, C.; Julien, B. Science 2000, 287, 640−642.
(21) Pfeifer, B. A.; Admiraal, S. J.; Gramajo, H.; Cane, D. E.; Khosla, C. Science 2001, 291, 1790−1792.
(22) Bailey, J. E. Science 1991, 252, 1668−1675.
(23) Yadav, V. G.; Mey, M. d.; Lim, C. G.; Ajikumar, P. K.; Stephanopoulos, G. Metab. Eng. 2012, 14, 233−241.
(24) Fischbach, M. A.; Clardy, J. Nat. Chem. Biol. 2007, 3, 353−355.
(25) Kleemann, A.; Engel, J.; Kutscher, B.; Reichert, D. Pharmaceutical Substances: Syntheses, Patents, Applications; Georg Thieme Verlag: New York, NY, 2001; Vols. 1 and 2, pp 690−697, 1545−1551.
(26) Bachmann, B. O. Nat. Chem. Biol. 2010, 6, 390−393.
(27) Walker, K.; Croteau, R. Phytochemistry 2001, 58, 1−7.
(28) Hefner, J.; Rubenstein, S. M.; Ketchum, R. E. B.; Gibson, D. M.; Williams, R. M.; Croteau, R. Chem. Biol. 1996, 3, 479−489.
(29) Jennewein, S.; Long, R. M.; Williams, R. M.; Croteau, R. Chem. Biol. 2004, 11, 379−387.
(30) Yadav, V. G.; Stephanopoulos, G. J. Mol. Catal. B: Enzymatic 2013, Submitted for publication.
(31) Croteau, R.; Schoendorf, A.; Jennewein, S. U.S. Patent Application 2004/0236089, 2004.
(32) Ketchum, R. E. B.; Croteau, R. B. The Taxus metabolome and the elucidation of the taxol biosynthetic pathway in cell suspension cultures. In Plant Metabolomics; Saito, K., Dixon, R. A., Willmitzer, L., Eds.; Springer Verlag: Heidelberg, 2006; pp 291−309.
(33) Jennewein, S.; Wildung, M. R.; Chau, M.-D.; Walker, K.; Croteau, R. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 9149−9154.
(34) Firn, R. D.; Jones, C. G. Nat. Prod. Rep. 2003, 20, 382−391.
(35) Blum, T.; Kohlbacher, O. J. Comput. Biol. 2008, 15, 565−576.
(36) Heath, A. P.; Bennett, G. N.; Kavraki, L. E. Bioinformatics 2010, 26, 1548−1555.
(37) Godula, K.; Sames, D. Science 2006, 312, 67−72.
(38) Orry, A. J. W.; Abagyan, R. Homology Modeling: Methods and Protocols; Springer: New York, 2012.
(39) Trott, O.; Olson, A. J. J. Comput. Chem. 2010, 31, 455−461.
(40) Chovancová, E.; Pavelka, A.; Beneš, P.; Strnad, O.; Brezovský, J.; Kozlíková, B.; Gora, A.; Šustr, V.; Klvaňa, M.; Medek, P.; Biedermannová, L.; Sochor, J.; Damborský, J. PLoS Comp. Biol. 2012, DOI: 10.1371/journal.pcbi.1002708.
(41) Doi, T.; Fuse, S.; Miyamoto, S.; Nakai, K.; Sasuga, D.; Takahashi, T. Chem. Asian J. 2006, 1, 370−383.
(42) Fuse, S.; Machida, K.; Takahashi, T. Efficient Synthesis of Natural Products Aided by Automated Synthesizers and Microreactors. In New Strategies in Chemical Synthesis and Catalysis; Pignataro, B., Ed.; Wiley-VCH: Weinheim, 2012; pp 33−45.
(43) Nicolaou, K. C.; Nantermet, P. G.; Ueno, H.; Guy, R. K.; Couladouros, E. A.; Sorensen, E. J. J. Am. Chem. Soc. 1995, 117, 624−633.
(44) Nicolaou, K. C.; Liu, J.-J.; Yang, Z.; Ueno, H.; Sorensen, E. J.; Claiborne, C. F.; Guy, R. K.; Hwang, C.-K.; Nakada, M.; Nantermet, P. G. J. Am. Chem. Soc. 1995, 117, 634−644.
(45) Nicolaou, K. C.; Yang, Z.; Liu, J.-J.; Nantermet, P. G.; Claiborne, C. F.; Renaud, J.; Guy, R. K.; Shibayama, K. J. Am. Chem. Soc. 1995, 117, 645−652.
(46) Nicolaou, K. C.; Ueno, H.; Liu, J.-J.; Nantermet, P. G.; Yang, Z.; Renaud, J.; Paulvannan, K.; Chadha, R. J. Am. Chem. Soc. 1995, 117, 653−659.
(47) Danishefsky, S. J.; Masters, J. J.; Young, W. B.; Link, J. T.; Snyder, L. B.; Magee, T. V.; Jung, D. K.; Isaacs, R. C. A.; Bornmann, W. G.; Alaimo, C. A.; Coburn, C. A.; Di Grandi, M. J. J. Am. Chem. Soc. 1996, 118, 2843−2859.
(48) Greene, T. W.; Wuts, P. G. M. Protecting Groups in Organic Chemistry, 3rd ed.; John Wiley & Sons, Inc.: New York, NY, 1991.
(49) Kocienski, P. Protecting Groups, 3rd ed.; Georg Thieme Verlag: New York, NY, 2005.
(50) Firn, R. D.; Jones, C. G. J. Exp. Bot. 2009, 60, 719−726.
(51) Ajikumar, P. K.; Xiao, W. H.; Tyo, K. E.; Wang, Y.; Simeon, F.; Leonard, E.; Mucha, O.; Phon, T. H.; Pfeifer, B.; Stephanopoulos, G. Science 2010, 330, 70−74.
(52) Zocher, R.; Weckwerth, W.; Hacker, C.; Kammer, B.; Hornbogen, T.; Ewald, D. Biochem. Biophys. Res. Commun. 1996, 229, 16−20.
(53) Zarek, M.; Waligórski, P. Herba Pol. 2009, 55, 25−35.
(54) Nims, E.; Dubois, C. P.; Roberts, S. C.; Walker, E. L. Metab. Eng. 2006, 8, 385−394.
(55) Roberts, S. C. Nat. Chem. Biol. 2007, 3, 387−395.
(56) Nanduri, V. B.; Hanson, R. L.; LaPorte, T. L.; Patel, R. N.; Szarka, L. J. Biotechnol. Bioeng. 1995, 48, 547−550.
(57) Patel, R. N. Annu. Rev. Microbiol. 1998, 52, 361−395.
(58) Patel, R. N. Curr. Opin. Biotechnol. 2001, 12, 587−604.
(59) Hanson, R. L.; Wasylyk, J. M.; Nanduri, V. B.; Cazzulino, D. L.; Patel, R. N.; Szarka, L. J. J. Biol. Chem. 1994, 269, 22145−22149.
(60) http://bit.ly/1evhqwg.
(61) http://1.usa.gov/1jc1IKa.
(62) Bringi, V.; Kadkade, P. G.; Prince, C. L.; Roach, B. L. U.S. Patent Application US20130017582 A1.
(63) Patel, R. N.; Banerjee, A.; Ko, R. Y.; Howell, J. M.; Li, W. S.; Comezoglu, F. T.; Partyka, R. A.; Szarka, F. T. Biotechnol. Appl. Biochem. 1994, 20, 23−33.
(64) Patel, R. N.; Banerjee, A.; Howell, J. M.; McNamee, C. G.; Brozozowski, D. G.; Mirfakhrae, D.; Nanduri, V.; Thottathil, J. K.; Szarka, L. J. Tetrahedron: Asymmetry 1993, 4, 2069−2084.
(65) Feske, B. D.; Kaluzna, I. A.; Stewart, J. D. J. Org. Chem. 2005, 70, 9654−9657.
(66) Wickremesinhe, E. R. M.; Arteca, R. N. J. Plant Physiol. 1994, 144, 183−188.
(67) Srinivasan, V.; Pestchanker, L.; Moser, S.; Hirasuna, T. J.; Taticek, R. A.; Shuler, M. L. Biotechnol. Bioeng. 1995, 47, 666−676.


dx.doi.org/10.1021/op4003505 | Org. Process Res. Dev. 2014, 18, 816−826