Proteomics-Based Tools for Evaluation of Cell-Free Protein Synthesis

Article Cite This: Anal. Chem. XXXX, XXX, XXX-XXX

pubs.acs.org/ac

Gregory B. Hurst,*,† Keiji G. Asano,† Charles J. Doktycz,† Elliot J. Consoli,‡ William L. Doktycz,† Carmen M. Foster,‡ Jennifer L. Morrell-Falvey,‡ Robert F. Standaert,‡,§,⊥,∥ and Mitchel J. Doktycz*,‡

†Chemical Sciences Division, ‡Biosciences Division, §Biology & Soft Matter Division, and the ⊥Shull Wollan Center, a Joint Institute for Neutron Sciences, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, United States
∥University of Tennessee, Department of Biochemistry & Cellular and Molecular Biology, Knoxville, Tennessee 37996, United States

Supporting Information

ABSTRACT: Cell-free protein synthesis (CFPS) has the potential to produce enzymes, therapeutic agents, and other proteins, while circumventing difficulties associated with in vivo heterologous expression. However, the contents of the cell-free extracts used to carry out synthesis are generally not characterized, which hampers progress toward enhancing yield or functional activity of the target protein. We explored the utility of mass spectrometry (MS)-based proteomics for characterizing the bacterial extracts used for transcribing and translating gene sequences into proteins as well as the products of CFPS reactions. Full proteome experiments identified over 1000 proteins per reaction. The complete set of proteins necessary for transcription and translation were found, demonstrating the ability to define potential metabolic capabilities of the extract. Further, MS-based techniques allowed characterization of the CFPS product and provided insight into the synthesis reaction and potential functional activity of the product. These capabilities were demonstrated using two different CFPS products, the commonly used standard green fluorescent protein (GFP, 27 kDa) and the polyketide synthase DEBS1 (394 kDa). For the large, multidomain DEBS1, substantial premature termination of protein translation was observed. Additionally, MS/MS analysis, as part of a conventional full proteomics workflow, identified post-translational modifications, including the chromophore in GFP, as well as the three phosphopantetheinylation sites in DEBS1. A hypothesis-driven approach focused on these three sites identified that all were correctly modified for DEBS1 expressed in vivo but with less complete coverage for protein expressed in CFPS reactions. These post-translational modifications are essential for functional activity, and the ability to identify them with mass spectrometry is valuable for judging the success of the CFPS reaction. 
Collectively, the use of MS-based proteomics will prove advantageous for advancing the application of CFPS and related techniques.

Cell-free protein synthesis (CFPS) is applied increasingly for pharmaceutical production,1 protein discovery,2−4 metabolic engineering,5 and synthetic biology.6−9 Typically, CFPS employs transcriptional and translational machinery in vitro to build the desired protein product based on information encoded in a supplied DNA template. The ease and decreasing cost of preparing DNA of any desired sequence facilitates rapid exploration of protein targets. Commonly, for CFPS, the transcription−translation machinery is isolated from cells in the form of an extract.10 The ability to customize this extract, engineer the reaction conditions, and produce products that would potentially be toxic to the intact cell are just some of the beneficial attributes of CFPS.6,11 Further, CFPS can be implemented in a range of formats including continuous exchange systems and batch formats, in volumes ranging from subnanoliter to 100 L, to realize various application needs.1,12−15

Despite its versatility and ease of implementation, CFPS presents challenges in troubleshooting the reaction and its products. Cell extracts can be derived from a range of cell types and may contain both desirable and undesirable metabolic capabilities beyond transcription and translation. The actual components in a cellular extract are generally not well characterized. Unintended reactions in the extract can divert energy and metabolite resources away from protein synthesis. Therefore, significant efforts have gone into optimizing extract content16,17 and reaction conditions to enhance protein production.18−22 Other efforts to fuel protein synthesis23 or to prepare desired products24,25 have also taken advantage of metabolic pathways present in the cell extract. Enzymes that perform post-translational modification of the product protein are also essential. Lack of appropriate activity can result from the absence of needed modifications to the target protein. In general, the preparation and use of cell extracts have largely been guided by assumptions concerning the protein content of the extract.

Received: June 30, 2017 Accepted: October 2, 2017


DOI: 10.1021/acs.analchem.7b02555 Anal. Chem. XXXX, XXX, XXX−XXX


Analytical Chemistry

DEBS3. The plasmid pJL1-sfGFP encoding superfolder green fluorescent protein was a kind gift from the Dr. Michael Jewett laboratory (Northwestern University, Evanston, IL).

Protein Expression. For in vivo protein expression, pET26b-DEBS1TE was transformed into TB3 cells, and transformants were selected on LB agar with 50 μg/mL kanamycin. Cells were grown in 25 mL volumes at 37 °C to an optical density at 600 nm (OD600) = 0.6, at which point 1 mM isopropyl β-D-thiogalactopyranoside (IPTG) was added to the culture to induce expression of the DEBS1TE, hereafter referred to as DEBS1. After 4 h of induction, the cells were harvested by centrifugation and stored at −80 °C until lysis. For cell-free protein expression, S30 cell extract was prepared based on the method described by Liu et al.48 with slight modifications (see Supporting Information, section 1). For sfGFP and DEBS1 protein production in CFPS, 15-μL batch reactions were conducted for 20 h at 30 °C using the PANOx-P system developed by Jewett and Swartz (see Supporting Information, section 2).49,50

Sample Preparation for Proteomics. CFPS reaction products from extracts of E. coli TB3 or E. coli BL21 Star (DE3) cells were denatured by incubating in 6 M guanidinium chloride for 1 h at 60 °C, and allowed to cool to room temperature. Cysteines were reduced by incubation in 2 mM tris(2-carboxyethyl)phosphine hydrochloride (TCEP) for 20 min at room temperature, and carbamidomethylated by incubation in 10 mM iodoacetamide in the dark for 15 min. Samples were diluted with 5 volumes of digestion buffer (50 mM Tris-HCl, 10 mM CaCl2, pH 7.6), and the proteins were digested by adding trypsin at a 1:50 weight ratio (based on an estimate of 40−45 mg/mL protein in S30 extracts, and a dilution factor for CFPS reactions) with overnight incubation at 37 °C. An additional identical amount of trypsin was then added, with an additional 4 h incubation at 37 °C. Trypsin was inactivated by adding formic acid to a final concentration of 0.1%. Peptides were collected as the filtrate obtained by centrifuging samples through a 10 kDa molecular weight cutoff filter (Microcon YM10, Millipore, Billerica MA) for 20 min at 14 000 × g.

For in vivo expressed DEBS1, a TB3 cell pellet, ∼50 mg wet weight, was lysed by suspending in 150 μL of lysis buffer (10% SDS, 10 mM dithiothreitol) and incubating for 10 min at 60 °C. The lysate was centrifuged briefly to remove cellular debris. ∼90 μg of protein (RC DC Protein Assay kit, Bio-Rad Laboratories, Hercules CA) was applied to the filter unit of a filter-assisted sample preparation (FASP) protein digestion kit (Expedion, San Diego CA), and the manufacturer’s protocol was followed. For the digestion step of the FASP protocol, samples were incubated overnight at 37 °C with 4 μg trypsin in 75 μL of 50 mM ammonium bicarbonate. In-gel digestion of a band excised from 1D SDS−PAGE of lysate from TB3 cells used for DEBS1 production in vivo was performed following literature methods,51 with overnight trypsin digestion.

Full Proteomics Measurements. The results presented here reflect several independent experiments, performed over the course of two years, with different instrumentation configurations and sample amounts. The intention is not to compare the various experiments, but rather to highlight some common features. Briefly, trypsin digests were analyzed using two-dimensional liquid chromatography-tandem mass spectrometry,52 with details for specific extracts, lysates, and gel band analyses described in Supporting Information, section 4.

Proteomics has the potential to greatly facilitate the development and implementation of CFPS. As an established technique for characterizing protein complements of biological entities ranging from microbial cultures to higher organisms to communities, its many applications, as well as trade-offs in comprehensiveness versus quantitative accuracy, are well documented. 26−28 Two-dimensional gel electrophoresis (2DE) has been applied to CFPS systems, showing changes in numbers of protein spots on the gels and a decrease in total protein abundance over the time course of a CFPS reaction, as well as individual spots that were correlated with protein synthesis activity.29,30 With its ability to provide both differential abundance measurements and protein identifications at high-throughput as inherent features of the workflow, mass spectrometry (MS)-based proteomics26−28 offers the potential for compositional assessment of the protein complement of cellular fractions, as well as targeted characterization of newly synthesized proteins. Developments in MS characterization of post-translational modifications (PTMs) of proteins further permit studies of this important mechanism for regulating protein function31 and have been applied to characterization of CFPS-synthesized proteins that require specific PTMs for correct function.32 Here, we describe the application of MS-based proteomics to characterize the protein content of bacterial extracts used for CFPS, the completeness of the expressed protein, and the presence of PTMs in the expressed protein. We used the synthesis of green fluorescent protein (GFP) as an initial test system. This model protein derives its fluorescent properties from an intrinsic chromophore, which results from a posttranslational, intramolecular reaction accompanied by a decrease in net mass of the protein that can be discerned by mass spectrometry. 
Additionally, we studied the CFPS of a polyketide synthase (PKS), subunit 1 of 6-deoxyerythronolide B synthase (DEBS1). Polyketide synthases are large, posttranslationally modified, modular enzymes that synthesize a wide range of natural products.33 Because each module performs a separate step of a chemical synthesis, PKSs are being explored as tools for synthesizing potentially novel compounds.34,35 MS has been applied for characterizing active sites in PKSs (and related nonribosomal peptide synthases) through analysis of phosphopantetheine modification of the active-site serines that provide an attachment site for small-molecule substrates and intermediates of natural product synthesis.36−43 We applied and extended these MS tools to evaluate the production of full-length protein and the critical phosphopantetheinylated modifications of DEBS1.



METHODS

Chemicals and Reagents. All reagents and chemicals used in this study were purchased from Fisher Scientific (Pittsburgh, PA), Sigma-Aldrich (St. Louis, MO), Roche Life Sciences (Indianapolis, IN), Bio-Rad (Hercules, CA), Pierce (Rockford IL), or Promega (Madison WI). All restriction enzymes were purchased from Thermo Fisher Scientific.

Strains and Plasmids. E. coli strain TB3 was a kind gift from Dr. Blaine A. Pfeifer (State University of New York, Buffalo, NY) and has been described previously.44−46 The plasmid encoding DEBS1, pET26b-DEBS1TE, was a kind gift from the laboratory of Dr. Adrian Keatinge-Clay (University of Texas, Austin TX) and has also been described.47 In this construct, a truncated form of the DEBS1 enzyme is fused at its C-terminus to the terminal thioesterase (TE) domain of


Table 1. Proteome Coverage of Translation Factor Proteins^a

^a GFP results represent average of 2 measurements. DEBS1 results represent average of 12 measurements. A key to the color scheme is provided at right.

Hypothesis-Driven Proteomics Measurements. Aliquots from trypsin digests containing approximately 45 μg of peptides (from DEBS1 produced in vivo), or 65 μg of peptides (DEBS1 CFPS reactions) were analyzed using a hypothesis-driven mass spectrometry strategy designed to target phosphopantetheinylated peptides. A target list of parent ions was prepared, including predicted mass-to-charge ratio (m/z) values for tryptic peptides from the three DEBS1 sites predicted to contain the phosphopantetheine (PPant) modification (Ser500, Ser1974, and Ser3421), including both the unmodified form of the peptide, and the form bearing a PPant moiety with a carbamidomethyl modification of the terminal thiol resulting from the iodoacetamide reagent used in sample preparation. Each combination was included as ions with charge states of +1, +2, and +3. The method triggered an MS/MS scan upon observation of any of these targeted parent ions in an MS scan above a minimum signal intensity of 1000. The method further programmed acquisition of an MS3 scan to fragment the potential carbamidomethylated Pant ejection ion at m/z 318 (refs 40, 42) if observed in the MS/MS spectrum above a threshold intensity of 200. LC separations were performed using an 11-step SCX/reverse-phase program (Supporting Information, section 4).

Proteomics Data Analysis. For both full and hypothesis-driven data, peptide identifications were obtained from MS/MS spectra using the program Myrimatch (version 2.1.138),53 and protein identifications were assembled from peptide identifications using IDPicker, version 3.1.599 (Supporting Information, section 3).54 In addition to Myrimatch searches of the MS/MS spectra, MS3 spectra were extracted from raw files obtained through the hypothesis-driven approach using the program MSConvert in the ProteoWizard package (version 3.0.4098)55 and saved in mzXML format. MS3 spectra were grouped by similarity using the program MSCluster (version 2.00, release 20101018).56 A custom R script calculated the fraction of abundance represented by 3 diagnostic fragment ions (m/z 142, 166, and 184) in the MS3 spectra40,42 to facilitate identification of those likely arising from Pant ejection ions.
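The diagnostic-ion calculation described under Proteomics Data Analysis can be illustrated with a short sketch. This is not the authors' R script, which is not reproduced in the text; it is a minimal Python version under stated assumptions. The peak-list format (pairs of m/z and intensity), the function name `diagnostic_fraction`, and the ±0.5 m/z matching tolerance are all hypothetical choices made for illustration. Only the three nominal diagnostic m/z values come from the text.

```python
from typing import Iterable, Sequence, Tuple

# Nominal diagnostic fragment m/z values quoted in the text for Pant-like MS3 spectra.
DIAGNOSTIC_MZ: Tuple[float, ...] = (142.0, 166.0, 184.0)


def diagnostic_fraction(
    peaks: Sequence[Tuple[float, float]],
    targets: Iterable[float] = DIAGNOSTIC_MZ,
    tol: float = 0.5,  # assumed matching tolerance in m/z units, not from the paper
) -> float:
    """Return the fraction of total intensity carried by peaks near the target m/z values."""
    targets = tuple(targets)
    total = sum(intensity for _, intensity in peaks)
    if total <= 0:
        return 0.0  # empty or all-zero spectrum: no signal to apportion
    matched = sum(
        intensity
        for mz, intensity in peaks
        if any(abs(mz - t) <= tol for t in targets)
    )
    return matched / total


# Hypothetical MS3 peak list: three diagnostic peaks plus one unrelated peak.
example_spectrum = [(142.1, 30.0), (166.0, 20.0), (184.2, 25.0), (250.0, 25.0)]
example_fraction = diagnostic_fraction(example_spectrum)
```

A spectrum whose fraction is close to 1 is a candidate Pant-like spectrum; any cutoff applied to this value would be a further assumption, since none is stated in the text.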

RESULTS AND DISCUSSION

Full LC-MS/MS of Cell-Free Protein Synthesis Reactions. Table S1 lists the 1839 proteins identified by full proteome characterizations of CFPS reactions producing GFP and DEBS1 using extracts derived from BL21 Star (DE3) and TB3 cell lines, respectively, as well as TB3 cells used for in vivo expression of DEBS1. Table S2 lists tryptic peptides identified in these measurements. Detailed descriptions of the contents of these tables are provided in Supporting Information, Legend for Supplemental Table S1 and Legend for Supplemental Table S2.

The metabolic potential of CFPS extracts for performing transcription and translation functions required for protein synthesis was explored. To compile a “parts list” of proteins performing these functions, we used the composition of an alternative CFPS reaction mixture, the PURE system,57 which uses a defined mixture of purified proteins and other reagents, in contrast to the complete cellular extracts used for CFPS. Full proteome measurements provided a tool for evaluating our cell-free extracts for the presence of a minimally complete protein synthesis machinery. Table 1 shows the translation factors in this list, along with coverage from the full proteome measurements of cell-free synthesis reactions for GFP and DEBS1. Table S3 provides the complete parts list, including the 30S and 50S ribosomes, tRNA synthetases, and RNA polymerase. The proteins relevant to transcription and translation represent >40% of the total protein signal, calculated as normalized spectral abundance factor (NSAF),58 for these mixtures as detailed in Supporting Information, section 5. Collectively, proteomics detected nearly the complete set of proteins that constitutes the ribosome and the enzymes needed for transcription and translation. Of the 94 proteins involved in protein synthesis, all were identified in DEBS1 CFPS reactions, and all but two in GFP reactions (Table S3). The overall pathway is clearly functional as protein synthesis from exogenous DNA templates was successful (see below). The two proteins not detected in GFP reactions (50S ribosomal proteins L35, rpmI, and L36, rpmJ) may have been detected in the DEBS1 reactions because of the differences in total LC separation times, or because of the multiple replicate reactions analyzed in the latter case (2 for GFP, 12 for DEBS1); analysis of more replicates generally leads to identification of more proteins.59 L36 is also the smallest protein in the 50S ribosome, containing only 38 amino acids; bottom-up proteomics can be less sensitive to small proteins simply because they yield fewer possible tryptic peptides. L36 also contains a large number of trypsin cut sites (arginine and lysine residues), placing most of its fully tryptic peptides below the size range that would be identified in our workflow. Because of the very different LC configurations applied in LC-MS/MS of GFP and the DEBS1 CFPS reactions, one should not compare abundance or coverage differences at the individual protein level between GFP and DEBS1 in Table 1 and Table S3. Clearly, however, both reaction mixtures contained essentially complete complements of the transcription and translation machinery. The two different cell lines and resulting extracts that were examined reveal their protein synthesis potential and the possible influence of either engineered changes in genetic content of the organism or the procedures used to prepare the extract.
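The normalized spectral abundance factor (NSAF) used in this section to express protein signal can be sketched as follows. The standard definition divides each protein's spectral count by its length and then normalizes by the sum of those ratios over all identified proteins. The authors' exact calculation is given in their Supporting Information and is not reproduced here, so this Python version is only a minimal illustration. The function names and the two-protein example with made-up counts and lengths are hypothetical.

```python
from typing import Dict, Iterable


def nsaf(spectral_counts: Dict[str, float], lengths: Dict[str, int]) -> Dict[str, float]:
    """Compute NSAF_i = (SpC_i / L_i) / sum_j (SpC_j / L_j) for each protein."""
    # Spectral abundance factor: spectral count normalized by protein length.
    saf = {protein: spectral_counts[protein] / lengths[protein] for protein in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        return {protein: 0.0 for protein in saf}
    return {protein: value / total for protein, value in saf.items()}


def summed_fraction(nsaf_values: Dict[str, float], subset: Iterable[str]) -> float:
    """Sum NSAF over a subset of proteins, e.g. a transcription/translation parts list."""
    return sum(nsaf_values.get(protein, 0.0) for protein in subset)


# Hypothetical example: protein names, counts, and lengths are illustrative only.
counts = {"protA": 10, "protB": 20}
sizes = {"protA": 100, "protB": 400}
values = nsaf(counts, sizes)
```

Summing NSAF over a chosen set of proteins, as `summed_fraction` does, is one way to arrive at a statement such as the share of total protein signal attributable to the transcription and translation machinery.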


Potential enhancements to the reaction may be possible through increased representation, or removal, of particular components.60 Also, full proteomics can serve as a useful quality control measure for benchmarking extract preparations. In all, full proteomics provides a practical tool for interpreting and optimizing the metabolic capabilities of a cell extract, assessing the effects of extract preparation procedures and for identifying the metabolic capabilities of the mixture.

Characterization of GFP Produced by CFPS. In addition to information on metabolic capabilities of the extract, full proteomics can also provide information on the protein product synthesized in a CFPS reaction. As expected, GFP is produced in the CFPS reaction and detected in the proteome, constituting ∼3% of the protein as estimated by NSAF (Supplemental Table S1). Fluorescence-based measurements of GFP concentration yielded 747 μg/mL, in the range of typical CFPS yields of 1 mg/mL.61 GFP was not detected from LC-MS/MS analysis of an unreacted S30 extract, nor in a “mock” CFPS reaction with no added template DNA (data not shown). Figure 1 shows the resulting proteome sequence coverage for GFP. While N- and C-terminal regions of GFP are robustly represented by hundreds of spectra, most of the interior of the protein is represented at lower abundance. A small uncovered region at amino acid positions 107 and 108 (TR) would not be detected because it is below the Myrimatch minimum peptide length setting of 5 amino acids. Strong representation of the C-terminal region indicates that the CFPS reaction indeed produced predominantly the full-length GFP protein. In sum, the detected peptides covered 98% of the GFP protein sequence and confirmed the presence of the full-length product.

We further examined the full proteome results for evidence of the GFP tryptic peptide that bears the spontaneously formed, intramolecular cross-link required for formation of the GFP fluorophore.62 The lower panel of Figure 1 shows one of the tandem mass spectra assigned to this peptide by Myrimatch. While perhaps not convincing on its own, additional MS/MS spectra, along with the absence of any other identified protein with this modification, provided evidence for the posttranslational modification (Supporting Information, section 6, Figure S1). Interestingly, only about one-third of the tandem mass spectra representing the relevant peptide corresponded to the modified peptide, with the remaining two-thirds assigned to the unmodified version of the peptide. This could indicate improperly folded protein that does not lead to fluorophore formation, but is more likely due to an artifact resulting from sample preparation, or differences in ionization and detection between the modified and unmodified peptides. Others have reported much higher levels of active GFP based on correlations in protein yields as determined by radiolabeling and fluorescence.63 In our studies, no attempts were made to optimize protein activity. Future correlations with other analytical measurement techniques will help determine the quantitative accuracy of the MS spectral abundance measurements.
Nevertheless, detailed characterization of the reaction product by mass spectrometry showed the potential for confirmation of product production, validation of functional modification and insight into potential problems involving synthesis of functional protein.

Characterization of DEBS1 Produced by CFPS. The preparation of novel proteins, proteins with useful functions, and sequence variations of these targets are important applications of CFPS. Unlike GFP, the majority of potential synthesis targets are neither easily detected nor functionally assessed by fluorescence or UV−visible spectroscopy. Polyketide synthases (PKSs) are examples of important functional targets that are involved in the synthesis of a variety of secondary metabolites with potential medical applications. These proteins can be identified by genome sequence analysis, but determining their functional activities requires synthesizing a functional enzyme, complete with PTMs. DEBS1 is a model PKS that initiates the biosynthesis of the antibiotic erythromycin A. It is one of three large enzymes, termed megasynthetases, that work in an assembly line fashion to assemble propionyl units into the 21-carbon scaffold, 6-deoxyerythronolide B, that is subsequently modified into erythromycin A. DEBS1 has three catalytic domains, each containing a phosphopantetheinylated serine, that serve as attachment sites for substrates. Expressing and translating the gene for this >394 kDa megasynthetase in a cell-free system, and confirming the presence of essential PTMs, provide challenges for CFPS. Others have reported that heterologous expression of native DEBS1 does not result in a functional product.43 We explored the potential of proteomics measurements of DEBS1 in a CFPS system to suggest reasons for this difficulty. The product was not visible as a band at the expected size from 1D SDS-PAGE (data not shown), but was detected in

Figure 1. Tandem mass spectrometric characterization of sfGFP from proteome analysis of CFPS. Upper panel documents sequence coverage of sfGFP, shown as number of tandem mass spectra identified in tryptic peptides at each amino acid (AA) position. Scale bar at right indicates the number of tandem mass spectra identified for peptides covering each AA position. Green boxes outline the peptide containing the residues modified in formation of the GFP chromophore. The modification results in loss of 20 amu (H2O + H2) from the TYG tripeptide, denoted [TYG−20]. Carbamidomethyl cysteine is denoted by C*. Trypsin cut sites are shown as vertical black lines below the upper panel. The lower panel shows the annotated tandem mass spectrum of this peptide (z = 2, parent m/z = 1208), which reveals a partial sequence ladder of fragments that can be identified as b ions, which contain the original N terminus of the parent peptide, or y ions, which contain the original C terminus.64
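The trypsin cut sites marked in Figure 1, and the remark that the dipeptide TR falls below the minimum peptide length of 5 amino acids, can be illustrated with a small in-silico digest. The sketch below is an assumption-laden illustration rather than the search engine's actual behavior: it uses the common rule that trypsin cleaves after lysine (K) or arginine (R) except when the next residue is proline, allows no missed cleavages, and runs on a made-up toy sequence, not the sfGFP sequence. The function names are hypothetical.

```python
import re
from typing import List


def tryptic_peptides(sequence: str) -> List[str]:
    """Split a protein sequence after K or R, except when the next residue is P."""
    # Zero-width split points: preceded by K/R and not followed by P (Python 3.7+).
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [piece for piece in pieces if piece]  # drop the empty trailing piece, if any


def long_enough(peptides: List[str], min_length: int = 5) -> List[str]:
    """Keep peptides at or above a minimum length (5 is the setting quoted in the text)."""
    return [peptide for peptide in peptides if len(peptide) >= min_length]


# Toy sequence chosen for illustration only; it is not taken from the paper.
toy_sequence = "AAAKTRGGGGGKPLLR"
all_peptides = tryptic_peptides(toy_sequence)
searchable = long_enough(all_peptides)
```

In this toy case the short peptides AAAK and TR are discarded by the length filter, which is the same mechanism that leaves a small gap in coverage wherever cut sites fall close together, as noted for ribosomal protein L36.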


Figure 2. DEBS1 sequence coverage from full MS proteomics measurements. Scale bar at right indicates the number of tandem mass spectra identified for peptides covering each AA position (numbered N → C). Trypsin cut sites are shown as vertical lines below the plot. Red arrowheads indicate positions of the 3 active serines subject to phosphopantetheinylation; red boxes outline detected peptides containing the modified active site serines. TB3 WCL: whole cell lysate from TB3 cells expressing DEBS1-TE. Gel bands: average of two technical replicates of the in-gel digest of DEBS1 isolated from TB3 WCL.
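The per-residue coverage plotted in Figures 1 and 2 amounts to counting, at each amino acid position, the tandem mass spectra assigned to peptides that span that position. A minimal sketch of that bookkeeping follows. The input format (1-based inclusive start and end positions with a spectrum count per peptide), the function names, and the ten-residue example are hypothetical and are not drawn from the authors' software.

```python
from typing import Iterable, List, Tuple


def coverage_profile(protein_length: int, hits: Iterable[Tuple[int, int, int]]) -> List[int]:
    """Accumulate spectrum counts per residue from (start, end, n_spectra) peptide hits.

    Positions are 1-based and inclusive, numbered from the N-terminus.
    """
    profile = [0] * protein_length
    for start, end, n_spectra in hits:
        for index in range(start - 1, end):
            profile[index] += n_spectra
    return profile


def fraction_covered(profile: List[int]) -> float:
    """Fraction of residues covered by at least one spectrum."""
    return sum(1 for count in profile if count > 0) / len(profile)


def last_covered_position(profile: List[int]) -> int:
    """1-based position of the most C-terminal covered residue (0 if none)."""
    for index in range(len(profile) - 1, -1, -1):
        if profile[index] > 0:
            return index + 1
    return 0


# Hypothetical 10-residue protein with two overlapping peptide identifications.
example_profile = coverage_profile(10, [(1, 4, 3), (3, 6, 2)])
```

A profile that stops well short of the protein length, as `last_covered_position` would report, is the kind of pattern interpreted in the text as evidence of truncated product, subject to the caveats about peptide losses discussed there.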

full proteome analyses (Table S4). NSAF values suggested that DEBS1 represented up to 0.04% of the protein signal (Table S4). The number of spectra identified from tryptic peptides of DEBS1 ranged from relatively high (averaging ∼150) in reactions 1−4 to quite low (3 or 4) in reaction 5, but were relatively consistent between pairs of technical duplicates analyzed for each reaction. In contrast, total numbers of identified MS/MS spectra from all other proteins showed less variability, as expected since the reaction mixtures were prepared from a common extract. The overall numbers of proteins identified and the levels of selected individual proteins (other than DEBS1) were relatively constant across the reactions. The Pearson correlation coefficients between all pairs of proteome analyses, calculated for NSAF values for all proteins other than DEBS1, ranged from 0.86 to 0.99, with a median of 0.94. Figure S2a illustrates this reproducibility in measured protein abundances. In addition, the proteome analyses consistently identified the needed post-translational modification enzyme phosphopantetheinyl transferase, encoded by the gene sfp in the TB3 strain. Figure S2b and c further illustrates the higher variability for DEBS1 across reactions compared to 40 proteins with similar normalized spectrum count (i.e., spectrum count for a protein divided by total spectrum count across all proteins); DEBS1 clearly falls below its peers in reactions 5 and 6. This result, showing that the MS response was fairly uniform across protein components of the CFPS, increases confidence in the validity of the result that the DEBS1 content, and therefore reaction yield, was variable across these replicates. These results demonstrate the utility of full proteomics for characterizing CFPS reaction products without need for isolating or purifying the product.

Analyses of the MS/MS spectra from DEBS1 tryptic peptides revealed further details on the performance of the CFPS reaction. The spectrum counts across the DEBS1 sequence for the CFPS reactions are illustrated in Figure 2. For comparison, Figure 2 also shows the DEBS1 tryptic peptides detected by full proteome analyses of DEBS1 expressed in vivo in E. coli TB3. These latter preparations included both whole cell lysate from the TB3 cells, as well as gel bands excised from a region with no obvious visible band, but corresponding to the size of full-length DEBS1 from SDS-PAGE separation of the whole cell lysate. In general, DEBS1 produced in vivo yielded higher coverage of the sequence than did the CFPS reactions. In particular, there is only sparse DEBS1 coverage in both technical replicates of CFPS reaction 5. More striking, coverage of the C-terminal region of DEBS1 is nearly absent in the CFPS reactions. Relative to the DEBS1 produced in vivo, the density of peptide coverage falls off dramatically at ∼1700 residues from the N-terminus for CFPS reactions 1−4, and at ∼900 residues for reaction 6. More uniform coverage across DEBS1 produced in vivo argues against a bias in the proteome analysis for the N-terminal regions of DEBS1. On the other hand, tryptic digests of CFPS reactions were filtered through 10 kDa filters, while those from in vivo expression of DEBS1 were filtered through larger 30 kDa filters of the FASP kit, which, although unlikely, could have led to nonuniform losses of peptides from the former. Nevertheless, these observations suggest that incomplete proteins were the main product in these CFPS reactions. Even so, sporadic detection of tryptic fragments near the C-terminus, for example, the fragment spanning residues 3502−3543 (out of 3756), shows that CFPS produced some copies of nearly full-length protein. The failure to produce full-length protein consistently may be due to the production of incomplete transcripts or to premature termination of translation. In either case, the tryptic peptide coverage results suggest directions for refining the CFPS reaction that would typically require radiolabeling of the


product, PAGE analysis, and autoradiography to detect incomplete product.

Post-translational Modification of DEBS1 Produced by CFPS. Three serines in the DEBS1 construct, Ser500, Ser1974, and Ser3421, are predicted sites for phosphopantetheine (PPant) modification. The utility of MS characterization of PPant-modified DEBS modules35,38,43 and other megasynthetases36,37,39−42 has been reported. Full proteomics provided evidence for the tryptic peptides containing some of these sites, but with high variability across samples (Figure 2). We manually validated tandem mass spectra matched with a peptide containing the PPant modification on Ser1974 (Supporting Information section 7, Figure S3). Myrimatch identifications also confirmed the PPant modification on the active-site peptide that contains Ser500 in some of the CFPS reactions, but not the TB3 whole cell lysate (“Full proteomics” columns, Table S5). The peptide containing Ser3421 was not detected with the PPant modification in any sample. The full proteome results contained no spectra identified as non-PPant-modified forms of the three peptides containing the active site serines (Table S2). While providing information on large numbers of proteins, full proteomics can fail to acquire MS/MS spectra for peptides of low abundance because of the well-known sampling problem arising from simultaneous elution of more peptides than the mass spectrometer can fragment within the time frame of a chromatographic feature.59 We therefore turned to a hypothesis-driven proteomics approach to focus on critical PPant modifications to peptides containing active site serines in DEBS1.
A hypothesis-driven approach can provide data on peptides of low abundance that might not trigger acquisition of MS/MS spectra in a full proteome measurement, as demonstrated for selective identification of phosphorylated peptides.65 We designed such an approach to perform MS/MS only on a list of targeted parent masses drawn from the three DEBS1 tryptic peptides that contain active-site serines. The method further triggered MS3 acquisition on a candidate Pant ejection ion40,42 if it was detected in an MS/MS spectrum. Following data acquisition, we clustered similar MS3 spectra using MSCluster, and examined the clustered spectra most resembling the MS3 spectra of the Pant ejection ion described in the literature.40,42 Myrimatch searches of data from the hypothesis-driven approach identified MS/MS spectra corresponding to the DEBS1 active-site peptides.

This hypothesis-driven proteomics approach provided evidence for all three active-site peptides, each bearing the PPant modification, for DEBS1 produced in vivo. One cluster derived from m/z 318 ions in MS/MS spectra contained 41 individual MS3 spectra, and closely matched literature MS3 spectra40,42 of the carbamidomethylated Pant ion (Figure 3a, Table S5). Each of these 41 MS3 spectra arose from an MS/MS spectrum that in turn originated from a targeted parent m/z corresponding to one of the three active-site-containing tryptic peptides of DEBS1. Myrimatch searches identified 20 of these MS/MS spectra as phosphopantetheinylated DEBS1 active-site peptides. Figure 3b−d present examples of these MS/MS spectra; each shows a prominent Pant ejection ion at m/z 318, a fragment corresponding to loss of the Pant group, and b- and y-ions64 that confirmed the peptide sequence. All 41 of the MS3 spectra in the “Pant-like” cluster occurred during one of three narrow chromatographic elution time windows, each of which overlapped elution time windows of the MS/MS spectra that were matched by Myrimatch with one of the three active-site peptides. This concurrence suggests that all of the Pant-like MS3 spectra arose from the three PPant-modified tryptic fragments. In addition to the PPant-modified peptide, Myrimatch also identified some MS/MS spectra consistent with the unmodified version of the peptide containing Ser500. The hypothesis-driven proteomics approach therefore provided evidence that the DEBS1 produced in vivo contained all three active-site peptides, each bearing the phosphopantetheine modification.

Figure 3. Hypothesis-driven proteomics of DEBS1. (a) Spectrum clustered56 from 41 individual MS3 spectra characteristic of the m/z 318 Pant ion. Numbers and dashed lines indicate fragments reported for Pant-like MS3 spectra in the literature.40 (b−d) MS/MS spectra of PPant modified peptides containing: (b) Ser3421, parent m/z = 787.9 (z = 2); (c) Ser500, parent m/z = 892.0 (z = 2); (d) Ser1974, parent m/z = 867.6 (z = 3). The m/z 318 Pant fragment is marked in each spectrum with a red circle, and the ion corresponding to loss of the Pant ion from the parent is marked in each spectrum with an asterisk.
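The MS3 trigger described above depends on recognizing the m/z 318 ion in an MS/MS spectrum. The following is a minimal sketch of that check, not the acquisition software used in this work. It assumes commonly quoted monoisotopic values that are not taken from this article: 261.1267 for the singly charged Pant ejection ion with a free thiol, and 57.02146 for carbamidomethylation. The tolerance, the relative-intensity threshold, and the function names `has_pant_ion` and `charge_reduced_mz` are hypothetical.

```python
from typing import List, Tuple

# Assumed monoisotopic values (not from this article):
PANT_EJECTION_MZ = 261.1267   # singly charged Pant ejection ion, free thiol
CARBAMIDOMETHYL = 57.02146    # mass added by carbamidomethylation of the thiol
CAM_PANT_MZ = PANT_EJECTION_MZ + CARBAMIDOMETHYL  # about 318.15


def has_pant_ion(peaks: List[Tuple[float, float]], tol: float = 0.5,
                 min_rel_intensity: float = 0.05) -> bool:
    """Return True if an MS/MS peak list holds a candidate carbamidomethylated Pant ion.

    peaks is a list of (m/z, intensity) pairs; tol and min_rel_intensity are
    illustrative settings, not instrument parameters from the article.
    """
    if not peaks:
        return False
    base_peak = max(intensity for _, intensity in peaks)
    return any(
        abs(mz - CAM_PANT_MZ) <= tol and intensity >= min_rel_intensity * base_peak
        for mz, intensity in peaks
    )


def charge_reduced_mz(parent_mz: float, z: int, fragment_mz: float) -> float:
    """m/z of the complement left after a z-charged parent ejects a singly charged fragment."""
    if z < 2:
        raise ValueError("parent must carry at least two charges")
    return (parent_mz * z - fragment_mz) / (z - 1)


# Invented peak list for illustration.
spectrum = [(318.15, 800.0), (650.3, 1000.0)]
if has_pant_ion(spectrum):
    print("candidate Pant ejection ion found; an MS3 scan would be triggered")

# Expected complement for the Ser3421 peptide parent in Figure 3b (m/z 787.9, z = 2).
print(round(charge_reduced_mz(787.9, 2, CAM_PANT_MZ), 2))
```

The complement calculation follows only from mass and charge conservation, and it assumes the ejection ion carries exactly one charge.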


In contrast, hypothesis-driven proteomics showed that the DEBS1 active-site peptides were not uniformly detected across the CFPS reactions (Table S5). This observation is consistent with lower sequence coverage observed for the CFPS reactions in the full proteomics results described above and shown in Figure 2. The active-site peptide containing Ser500 yielded appreciable numbers of Pant-like MS3 spectra from only two CFPS reactions (32 spectra from reaction 2; 23 from reaction 4). Myrimatch identified MS/MS spectra matching this peptide only from reaction 2. Identification of five Pant-like MS3 spectra from this peptide in reaction 5 was surprising due to the low DEBS1 yield inferred from the full proteomics data (Figure 2). No Pant-like MS3 or MS/MS spectra were identified for the second active-site peptide containing Ser1974 in any of the CFPS reactions. The parent m/z for the third active-site peptide, which contains Ser3421, yielded small numbers of Pant-like MS3 spectra from reactions 2 and 5, but Myrimatch identified no confirmatory MS/MS spectra.

CONCLUSIONS

Realizing the full potential of CFPS requires the development and application of analytical tools to characterize the components and products of these complex reaction systems. Attributes of MS-based proteomics match effectively to the needs of CFPS. The largely unknown contents of the cell extract can be determined by conventional, full proteomics from a fraction of a 15 μL reaction mixture. As shown here, this analysis provides confirmation of the presence of the protein components required for transcription, translation and post-translational modification. The information from these experiments can benefit the refinement of extract preparation procedures for specific purposes, provide for quality control checks and allow assessment of the potential for carrying out various metabolic processes. For example, the ability to trace pathways related to other processes, such as energy transduction and metabolic conversions, should be possible. MS-based proteomics is also valuable for characterizing CFPS products and their functional potential. CFPS reactions producing the commonly used standard GFP and the polyketide synthase DEBS1 were demonstrated, and the peptide coverage maps resulting from proteomics analyses indicated the functional potential of the products. Full sequence coverage was observed for GFP. In the case of DEBS1, premature cessation of translation was evident in CFPS reactions when compared to the same sequence expressed in vivo. MS/MS evaluation, as part of a full proteomics workflow, provided further, needed insight into the functional potential of the CFPS product. Comparison of MS methods against other analytical methods, such as autoradiographic tracking of incomplete translation products, will benchmark these techniques in future work. The intramolecular cross-link that renders GFP fluorescent was identified, while the correct phosphopantetheinylated sites of DEBS1 could be detected but were not present consistently.

The DEBS1 analysis benefitted from implementing a hypothesis-driven proteomics approach that provided increased sensitivity to the critical modifications of active-site serines beyond that available from full proteomics. Overall, the combination of the full and hypothesis-driven proteomics approaches provided a detailed characterization of the active-site peptides, a perspective on the sequence coverage across the DEBS1 protein, and broader information on other proteins necessary for its synthesis. Through application of the analytical approach to both simple and complex examples, we have revealed details of the metabolic capabilities of cell-free extracts and, importantly, highlighted trouble spots in CFPS, including possible incomplete maturation of GFP, expression of truncated DEBS1, and inconsistent phosphopantetheinylation of DEBS1. Clearly, developing and implementing MS-based proteomics for CFPS requires a considerable investment in instrumentation and expertise, effective experimental designs, and critical interpretation of results. The potential return on this investment will be an improved ability to understand both the process and its products, which will greatly accelerate both development of CFPS technology and its application to specific problems.

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.analchem.7b02555.

Preparation of cell-free extracts, cell-free protein synthesis reactions, proteomics data analysis, full proteomics methods, proteome coverage of proteins involved in transcription and translation, tandem mass spectrometry supporting GFP post-translational modification, tandem mass spectrometry supporting DEBS1 post-translational modifications, proteome coverage of proteins involved in transcription and translation, summary of full proteomics of DEBS1 CFPS, identification of DEBS1-TE peptides containing active-site serines, tandem mass spectra of the GFP tryptic peptide containing the intramolecular crosslink, DEBS1 abundance compared to other proteins, tandem mass spectra and extracted ion chromatograms for modified DEBS1 peptides (PDF)
Complete listing of the proteins identified by full proteome characterizations of CFPS reactions producing GFP or DEBS1 (XLSX)
Tryptic peptides identified in full proteome characterizations of CFPS reactions producing GFP or DEBS1 (XLSX)



AUTHOR INFORMATION

Corresponding Authors

*E-mail: [email protected].
*E-mail: [email protected].

ORCID

Gregory B. Hurst: 0000-0002-7650-8009
Jennifer L. Morrell-Falvey: 0000-0002-9362-7528

Author Contributions

The manuscript was written through contributions of all authors.

Notes

This manuscript has been authored by UT-Battelle, LLC, under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

Research supported by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the U.S. Department of Energy (DOE) and by the U.S. DOE Office of Biological and Environmental Research, Genomic Science Program. This research was supported in part by appointments to the Higher Education Research Experiences Program at Oak Ridge National Laboratory. Oak Ridge National Laboratory is managed by UT-Battelle, LLC, for the US Department of Energy under Contract no. DE-AC05-00OR22725. We thank Dr. Blaine A. Pfeifer (State University of New York−Buffalo), Dr. Adrian Keatinge-Clay (University of Texas), and Dr. Michael Jewett (Northwestern University) for plasmids and for advice on preparing cell-free extracts, and Jason Chien for technical assistance.

REFERENCES

(1) Zawada, J. F.; Yin, G.; Steiner, A. R.; Yang, J. H.; Naresh, A.; Roy, S. M.; Gold, D. S.; Heinsohn, H. G.; Murray, C. J. Biotechnol. Bioeng. 2011, 108, 1570−1578.
(2) Sawasaki, T.; Ogasawara, T.; Morishita, R.; Endo, Y. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 14652−14657.
(3) Goshima, N.; Kawamura, Y.; Fukumoto, A.; Miura, A.; Honma, R.; Satoh, R.; Wakamatsu, A.; Yamamoto, J.; Kimura, K.; Nishikawa, T.; Andoh, T.; Iida, Y.; Ishikawa, K.; Ito, E.; Kagawa, N.; Kaminaga, C.; Kanehori, K.; Kawakami, B.; Kenmochi, K.; Kimura, R.; et al. Nat. Methods 2008, 5, 1011−1017.
(4) Horvatovich, P.; Vegvari, A.; Saul, J.; Park, J. G.; Qiu, J.; Syring, M.; Pirrotte, P.; Petritis, K.; Tegeler, T. J.; Aziz, M.; Fuentes, M.; Diez, P.; Gonzalez-Gonzalez, M.; Ibarrola, N.; Droste, C.; De Las Rivas, J.; Gil, C.; Clemente, F.; Hernaez, M. L.; Corrales, F. J.; et al. J. Proteome Res. 2015, 14, 3441−3451.
(5) Dudley, Q. M.; Karim, A. S.; Jewett, M. C. Biotechnol. J. 2015, 10, 69−82.
(6) Hodgman, C. E.; Jewett, M. C. Metab. Eng. 2012, 14, 261−269.
(7) Karig, D. K.; Iyer, S.; Simpson, M. L.; Doktycz, M. J. Nucleic Acids Res. 2012, 40, 3763−3774.
(8) Niederholtmeyer, H.; Stepanova, V.; Maerkl, S. J. Proc. Natl. Acad. Sci. U. S. A. 2013, 110, 15985−15990.
(9) Takahashi, M. K.; Hayes, C. A.; Chappell, J.; Sun, Z. Z.; Murray, R. M.; Noireaux, V.; Lucks, J. B. Methods 2015, 86, 60−72.
(10) Spirin, A. S., Swartz, J. R., Eds. Cell-Free Protein Synthesis: Methods and Protocols; Wiley-VCH: Weinheim, Germany, 2007.
(11) Kai, L.; Dotsch, V.; Kaldenhoff, R.; Bernhard, F. PLoS One 2013, 8, e56637.
(12) Khnouf, R.; Olivero, D.; Jin, S. G.; Fan, Z. H. Biotechnol. Prog. 2010, 26, 1590−1596.
(13) Siuti, P.; Retterer, S. T.; Doktycz, M. J. Lab Chip 2011, 11, 3523−3529.
(14) Timm, A. C.; Shankles, P. G.; Foster, C. M.; Doktycz, M. J.; Retterer, S. T. Small 2016, 12, 810−817.
(15) Penalber-Johnstone, C.; Ge, X. D.; Tran, K.; Selock, N.; Sardesai, N.; Gurramkonda, C.; Pilli, M.; Tolosa, M.; Tolosa, L.; Kostov, Y.; Frey, D. D.; Rao, G. Biotechnol. Bioeng. 2017, 114, 1478−1486.
(16) Michel-Reydellet, N.; Calhoun, K.; Swartz, J. Metab. Eng. 2004, 6, 197−203.
(17) Michel-Reydellet, N.; Woodrow, K.; Swartz, J. J. Mol. Microbiol. Biotechnol. 2005, 9, 26−34.
(18) Underwood, K. A.; Swartz, J. R.; Puglisi, J. D. Biotechnol. Bioeng. 2005, 91, 425−435.
(19) Iskakova, M. B.; Szaflarski, W.; Dreyfus, M.; Remme, J.; Nierhaus, K. H. Nucleic Acids Res. 2006, 34, e135.
(20) Jewett, M. C.; Calhoun, K. A.; Voloshin, A.; Wuu, J. J.; Swartz, J. R. Mol. Syst. Biol. 2008, 4, No. 220, DOI: 10.1038/msb.2008.57.
(21) Freischmidt, A.; Meysing, M.; Liss, M.; Wagner, R.; Kalbitzer, H. R.; Horn, G. J. Biotechnol. 2010, 150, 44−50.
(22) Caschera, F.; Noireaux, V. Biochimie 2014, 99, 162−168.
(23) Calhoun, K. A.; Swartz, J. R. Biotechnol. Bioeng. 2005, 90, 606−613.
(24) Bujara, M.; Schumperli, M.; Billerbeck, S.; Heinemann, M.; Panke, S. Biotechnol. Bioeng. 2010, 106, 376−389.
(25) Kay, J. E.; Jewett, M. C. Metab. Eng. 2015, 32, 133−142.
(26) Angel, T. E.; Aryal, U. K.; Hengel, S. M.; Baker, E. S.; Kelly, R. T.; Robinson, E. W.; Smith, R. D. Chem. Soc. Rev. 2012, 41, 3912−3928.
(27) Zhang, Y. Y.; Fonslow, B. R.; Shan, B.; Baek, M. C.; Yates, J. R. Chem. Rev. 2013, 113, 2343−2394.
(28) Aebersold, R.; Mann, M. Nature 2016, 537, 347−355.
(29) Schindler, P. T.; Baumann, S.; Reuss, M.; Siemann, M. Electrophoresis 2000, 21, 2606−2609.
(30) Lamla, T.; Mammeri, K.; Erdmann, V. A. Acta Biochim. Polonica 2001, 48, 453−465.
(31) Witze, E. S.; Old, W. M.; Resing, K. A.; Ahn, N. G. Nat. Methods 2007, 4, 798−806.
(32) Suzuki, T.; Moriya, K.; Nagatoshi, K.; Ota, Y.; Ezure, T.; Ando, E.; Tsunasawa, S.; Utsumi, T. Proteomics 2010, 10, 1780−1793.
(33) Fischbach, M. A.; Walsh, C. T. Chem. Rev. 2006, 106, 3468−3496.
(34) Weissman, K. J.; Leadlay, P. F. Nat. Rev. Microbiol. 2005, 3, 925−936.
(35) Kapur, S.; Lowry, B.; Yuzawa, S.; Kenthirapalan, S.; Chen, A. Y.; Cane, D. E.; Khosla, C. Proc. Natl. Acad. Sci. U. S. A. 2012, 109, 4110−4115.
(36) Stein, T.; Vater, J.; Kruft, V.; Wittmann-Liebold, B.; Franke, P.; Panico, M.; McDowell, R. M.; Morris, H. R. FEBS Lett. 1994, 340, 39−44.
(37) Weinreb, P. H.; Quadri, L. E. N.; Walsh, C. T.; Zuber, P. Biochemistry 1998, 37, 1575−1584.
(38) Schnarr, N. A.; Chen, A. Y.; Cane, D. E.; Khosla, C. Biochemistry 2005, 44, 11836−11842.
(39) Dorrestein, P. C.; Bumpus, S. B.; Calderone, C. T.; Garneau-Tsodikova, S.; Aron, Z. D.; Straight, P. D.; Kolter, R.; Walsh, C. T.; Kelleher, N. L. Biochemistry 2006, 45, 12756−12766.
(40) Meluzzi, D.; Zheng, W. H.; Hensler, M.; Nizet, V.; Dorrestein, P. C. Bioorg. Med. Chem. Lett. 2008, 18, 3107−3111.
(41) Lee, J. H.; Evans, B. S.; Li, G. Y.; Kelleher, N. L.; van der Donk, W. A. Biochemistry 2009, 48, 5054−5056.
(42) Meier, J. L.; Patel, A. D.; Niessen, S.; Meehan, M.; Kersten, R.; Yang, J. Y.; Rothmann, M.; Cravatt, B. F.; Dorrestein, P. C.; Burkart, M. D.; Bafna, V. J. Proteome Res. 2011, 10, 320−329.
(43) Lowry, B.; Robbins, T.; Weng, C. H.; O’Brien, R. V.; Cane, D. E.; Khosla, C. J. Am. Chem. Soc. 2013, 135, 16809−16812.
(44) Zhang, H. R.; Boghigian, B. A.; Pfeifer, B. A. Biotechnol. Bioeng. 2010, 105, 567−573.
(45) Zhang, H. R.; Wang, Y.; Wu, J. Q.; Skalina, K.; Pfeifer, B. A. Chem. Biol. 2010, 17, 1232−1240.
(46) Jiang, M.; Fang, L.; Pfeifer, B. A. Biotechnol. Prog. 2013, 29, 862−869.
(47) Enyeart, P. J.; Chirieleison, S. M.; Dao, M. N.; Perutka, J.; Quandt, E. M.; Yao, J.; Whitt, J. T.; Keatinge-Clay, A. T.; Lambowitz, A. M.; Ellington, A. D. Mol. Syst. Biol. 2013, 9, No. 685.
(48) Liu, D. V.; Zawada, J. F.; Swartz, J. R. Biotechnol. Prog. 2005, 21, 460−465.
(49) Jewett, M. C.; Swartz, J. R. Biotechnol. Bioeng. 2004, 87, 465−472.
(50) Jewett, M. C.; Swartz, J. R. Biotechnol. Bioeng. 2004, 86, 19−26.
(51) Shevchenko, A.; Tomas, H.; Havlis̆, J.; Olsen, J. V.; Mann, M. Nat. Protoc. 2007, 1, 2856−2860.
(52) McDonald, W. H.; Ohi, R.; Miyamoto, D. T.; Mitchison, T. J.; Yates, J. R. Int. J. Mass Spectrom. 2002, 219, 245−251.
(53) Tabb, D. L.; Fernando, C. G.; Chambers, M. C. J. Proteome Res. 2007, 6, 654−661.
(54) Ma, Z. Q.; Dasari, S.; Chambers, M. C.; Litton, M. D.; Sobecki, S. M.; Zimmerman, L. J.; Halvey, P. J.; Schilling, B.; Drake, P. M.; Gibson, B. W.; Tabb, D. L. J. Proteome Res. 2009, 8, 3872−3881.
(55) Chambers, M. C.; Maclean, B.; Burke, R.; Amodei, D.; Ruderman, D. L.; Neumann, S.; Gatto, L.; Fischer, B.; Pratt, B.; Egertson, J.; Hoff, K.; Kessner, D.; Tasman, N.; Shulman, N.; Frewen, B.; Baker, T. A.; Brusniak, M. Y.; Paulse, C.; Creasy, D.; Flashner, L.; et al. Nat. Biotechnol. 2012, 30, 918−920.
(56) Frank, A. M.; Monroe, M. E.; Shah, A. R.; Carver, J. J.; Bandeira, N.; Moore, R. J.; Anderson, G. A.; Smith, R. D.; Pevzner, P. A. Nat. Methods 2011, 8, 587−U101.
(57) Shimizu, Y.; Inoue, A.; Tomari, Y.; Suzuki, T.; Yokogawa, T.; Nishikawa, K.; Ueda, T. Nat. Biotechnol. 2001, 19, 751−755.
(58) Zybailov, B.; Mosley, A. L.; Sardiu, M. E.; Coleman, M. K.; Florens, L.; Washburn, M. P. J. Proteome Res. 2006, 5, 2339−2347.
(59) Liu, H. B.; Sadygov, R. G.; Yates, J. R. Anal. Chem. 2004, 76, 4193−4201.
(60) Li, J.; Gu, L.; Aach, J.; Church, G. M. PLoS One 2014, 9, No. e106232.
(61) Shin, J.; Noireaux, V. J. Biol. Eng. 2010, 4, 9.
(62) Pakhomov, A. A.; Martynov, V. I. Biochemistry (Moscow) 2009, 74, 250−259.
(63) Hong, S. H.; Ntai, I.; Haimovich, A. D.; Kelleher, N. L.; Isaacs, F. J.; Jewett, M. C. ACS Synth. Biol. 2014, 3, 398−409.
(64) Roepstorff, P.; Fohlman, J. Biomed. Mass Spectrom. 1984, 11, 601−601.
(65) Chang, E. J.; Archambault, V.; McLachlin, D. T.; Krutchinsky, A. N.; Chait, B. T. Anal. Chem. 2004, 76, 4472−4483.