Rapid Expression of Functional Genomic Libraries

Kim A. Woodrow, Isoken O. Airen, and James R. Swartz*

Department of Chemical Engineering, Stanford University, Stanford, California 94305-5025

Received December 13, 2005

Genomic-scale analysis of protein function is currently limited by the ability to rapidly express the enormous diversity of protein targets in their active form. We describe a method to construct transcriptionally active expression templates (ETs) in parallel using a single PCR step wherein the overlap-extension reaction for addition of transcription regulatory elements is separated from the amplification of the full-length product by using a GC-rich single primer. Over 90% of 55 diverse genomic targets were extended with T7 regulatory elements to form ETs in high yield and purity. The unpurified ETs directed protein expression using a cell-free protein synthesis (CFPS) system supplemented with cofactors and metal ions to activate a variety of enzymes. Higher activities were obtained in the modified CFPS reactions compared to standard reaction conditions. Protein purification was avoided because the expressed enzyme activity was significantly greater than the background activity associated with the cell extract. These improvements in the parallel synthesis of linear ETs combined with enhanced in vitro enzyme activation help to make CFPS systems more attractive platforms for high-throughput evaluation of protein function. Keywords: linear expression template • cell-free protein synthesis • proteomics • functional genomics • protein array • E-PCR

* To whom correspondence should be addressed. Tel, (650) 723-5398; fax, (650) 725-0555; e-mail, [email protected].

Journal of Proteome Research 2006, 5, 3288-3300. Published on Web 11/02/2006. 10.1021/pr050459y CCC: $33.50 © 2006 American Chemical Society

Introduction

Global analysis of protein function has relied mostly on strategies for preparing large numbers of recombinant proteins in vivo followed by their purification and characterization on microarrays. These approaches suffer from inadequate expression yields, inconsistent protein folding, and chemical and biological instability during purification and immobilization.1-3 Characterizing protein function using in vitro assays has also been limited by the availability of cofactors or other prosthetic groups necessary for some enzyme catalysis.1,4 An alternative is expression cloning, where vectors encoding different gene products are introduced into various hosts for activity screens.5,6 While this method enables functional genomic analysis within the cellular milieu, it has been restricted to gene products that produce discernible phenotypes and to proteins whose activities can be detected above the background activity of the host cell proteins. High-throughput methods for characterizing the structure and function of thousands of gene products will require improved strategies for parallel synthesis of expression templates and improved platforms for expressing active proteins.

A challenge facing many functional genomic platforms is the synthesis of expression templates (ETs) to direct the transcription and translation of the multitude of gene products. While not a limitation when expressing only a few genes, conventional cloning can be time-consuming and labor-intensive when used to express entire genomes. Cloning toxic gene products can be especially challenging due to the inability to tightly control expression from many commercial vectors. A method is needed for making ETs rapidly and reliably without using restriction enzyme digestion, ligation, transformation, bacterial propagation, or plasmid purification.

A number of methods have been described to generate linear templates for protein expression, but no single method has gained wide acceptance.7-13 To date, most of the linear ETs used to express genomic targets have been amplified from cDNA libraries subcloned into expression plasmids.13,14 This approach permits the use of generic primers homologous to the flanking regions of the vector, which greatly simplifies the PCR method but requires a prior gene cloning step. Using gene-specific primers to amplify targets from complex templates such as genomic DNA and then extending regulatory regions for transcription and translation has proven to be a greater challenge, but may be the most advantageous approach for high-throughput applications. To date, there are no published examples where this approach, known as expression PCR (E-PCR), is applied in parallel to a large number and a wide variety of gene targets. The few examples have been limited to gene targets with similar size and GC-content,15 or to libraries of linear ETs representing mutants of a single gene.16-21 In the single case where linear ETs were formed from several different gene targets, the expression elements were generated from multiple PCR steps using plasmid templates and at least one generic primer flanking the gene target.7,13,18,22-24

We have developed a reliable PCR procedure to rapidly generate higher and more consistent yields of linear ETs suitable for expressing genomic libraries. These transcriptionally active templates were constructed in two separate stages of a single PCR reaction (Figure 1). The first stage extends T7 regulatory elements onto the gene target, and the second stage uses a GC-rich single primer to amplify the extended template. In this manner, we were able to achieve high yield and specificity for the full-length expression element while eliminating the formation and accumulation of aberrant DNA products.

Figure 1. Principle for making linear expression templates using PCR. Linear expression elements are formed using two sequential PCR reactions. Gene targets are amplified in the first PCR by using gene-specific primers that add different extensions upstream and downstream of the gene sequence. The product from this first PCR is then combined with the dsDNA for the T7 promoter and terminator elements. In the subsequent PCR, these transcriptional regulatory elements anneal to the homologous extensions on the gene target and prime gene extension to form the full-length expression template. Addition of a GC-rich end primer amplifies the full-length product. Steps 2 and 3 are conducted in the same PCR reaction tube and are separated into discrete stages by addition of the GC-rich single primer and by increasing the annealing temperature after the first 10 cycles.

The technique was demonstrated by extending T7 regulatory elements onto a library of gene targets obtained directly from Escherichia coli genomic DNA. Fifty-two of the 55 gene targets were successfully extended and amplified to form their full-length template in high yield and purity. Only two of the 52 extended linear ETs also amplified a 0.4 kb contaminant, which was prevalent when using other methods and has been suggested to compete for RNA polymerase binding.18,25,26 The unpurified templates were shown to be transcriptionally active using an E. coli-based combined transcription-translation system.

The protein targets were expressed in parallel in batch reactions in microtiter plates using a cell-free protein synthesis (CFPS) system modified to include a variety of coenzymes and metal ions. These components are lost during cell extract preparation due to extensive dialysis and dilution, and they must therefore be reintroduced into the CFPS reaction for the correct assembly and activation of some protein targets.27-29 Enhancing the availability of these components beyond their physiological concentrations is also required to mature proteins being expressed in high yields. The standard CFPS reaction was supplemented with a multivitamin solution providing flavin adenine dinucleotide (FAD), thiamin, riboflavin, pyridoxal 5′-phosphate, biotin, lipoic acid, and coenzyme B12, as well as several trace metal ions.
The activity of several enzymes with available colorimetric assays was evaluated, and most showed greater than 80% of the expected activity. Several of these enzymes were nicotinamide or flavin enzymes, and others required metal ions for activity. Protein purification was avoided because the activity for each enzyme was detectable above the background activity associated with the cell extract. This array represents the largest and most diverse set of proteins to be expressed in parallel from linear DNA expression templates generated strictly by PCR. The ability to rapidly construct linear ETs and express properly folded and active proteins in this format should have broad implications for many functional genomic applications.

Experimental Section

Materials. Media components and chemical reagents were purchased from Sigma (St. Louis, MO) unless indicated otherwise. Vent DNA polymerase and dNTPs were obtained from New England Biolabs (NEB, Ipswich, MA). AccuPrime Pfx and Pfu Turbo DNA polymerases (DNAP) were purchased from Invitrogen (Carlsbad, CA) and Stratagene (La Jolla, CA), respectively. Restriction enzymes and calf intestinal alkaline phosphatase (CIP) were purchased from NEB. Oligonucleotides were synthesized by Operon Biotechnologies, Inc. (Huntsville, AL). Isolation and purification of DNA was performed using the QIAquick PCR or gel extraction kit (Qiagen Ltd., Valencia, CA).

Amplification of Open-Reading Frames (ORFs) from E. coli Genomic DNA. Genomic DNA (gDNA) was prepared from E. coli strain A19 according to standard procedures.30 The 55 ORFs (Table 2) were amplified using gene-specific primers designed with OligoPerfect Designer (Invitrogen, Carlsbad, CA). Each gene-specific sense primer was extended at the 5′-terminus with the sequence 5′-GTTTAACTT AAGAAGGAGA TATACAT-3′, whereas each gene-specific antisense primer was extended at the 5′-terminus with the sequence 5′-CAGCGGTGGC AGCAGCCAAC TCA-3′. The nucleotide additions at the 5′-termini introduce complementarity between the gene coding sequence and the transcription regulatory elements to be added later. Each ORF was amplified in a 100 µL PCR reaction containing 2.5 U AccuPrime Pfx DNAP, 1 µM each of the sense and antisense primers, 2 µg/mL gDNA, 300 µM dNTPs, and MgSO4 to a final concentration of 1.3 mM. The 55 gene targets were amplified in parallel beginning with a single incubation at 95 °C for 2 min, followed by 25 cycles of 95 °C for 15 s, 60 °C for 30 s, and 68 °C for 3 min. Difficult targets were amplified by lowering the annealing temperature to 55 °C and increasing the MgSO4 to 1.5 mM. Unless otherwise specified, the PCR products were purified using the QIAquick PCR purification kit according to the manufacturer’s instructions. Yields for each gene target were quantified densitometrically against a 2-log DNA molecular weight ladder using ethidium bromide-stained agarose gels (1.3%, w/v).

Table 1. General Considerations for Adding Transcriptional Regulatory Elements Using a GC-Rich Single Primer

step | purification | yields | other considerations
PCR I: amplification of ORF from gDNA | yes (solution/gel purify) | varies (0.2 ± 0.1 µM) | High magnesium concentration (1.3 mM) enhances formation of 0.1-0.2 kb aberrant amplicons, which will cause low yields of expression template in PCR II. Avoid by optimizing the PCR reaction or gel purifying the ORF after PCR I.
PCR II: extension and amplification of expression element | no (proceed to CFPS) | 50 ng/µL | Reagent composition: ORF, 8 ± 4 nM; T7 regulatory elements, 30 nM; SP3 single primer, 20 nM. Temperature cycle: 10× (95 °C/30 s, 57 °C/1 min, 72 °C/1 min per kb); add SP3 primer; 15× (95 °C/30 s, 67 °C/1 min, 72 °C/1 min per kb).
Amplification of T7 regulatory elements | yes (gel purify) | 0.5 µM | Solution purification of the T7 regulatory elements is not sufficient for obtaining adequate quality templates for the extension PCR.

Preparation of Bacteriophage T7 Promoter and Terminator Elements. The bacteriophage T7 promoter (PT7.sp3) and terminator elements (Term.sp3) were amplified from the pK7CAT plasmid.31 The 250 bp PT7.sp3 was amplified using the GC-rich FwdPT7 sense primer, 5′-ATGCAGGTCA TCCGAGGGGT TAACGAGTTC GCGGCCGCTT AGGCACCCCA GGCTTTAC-3′, and the RevPT7 antisense primer, 5′-CATATGTATA TCTCCTTCTT AAAGTTAAAC AAAATGATCT CTAGATCG AAACCGTTGT GGTCTC-3′. The 170 bp Term.sp3 was amplified using the FwdTERM sense primer, 5′-TGAGTTGGCT GCTGCCACCG CTG-3′, and the GC-rich RevTERM antisense primer, 5′-ATGCA GGTCATCCGA GGGGTTAACG AGTTCGACGA GCGTCAGCTT GCATGCCCTG CAGCT-3′. Underlined regions denote sequence complementarity to extensions flanking each gene coding sequence. The regulatory elements were amplified in a total volume of 50 µL by combining 1 µM each of the appropriate sense and antisense primers, 2 µg/mL pK7CAT, 1× ThermoPol reaction buffer [1 mM KCl, 1 mM (NH4)2SO4, 0.2 mM MgSO4, 0.01% Triton-X 100, and 2 mM Tris-Cl, pH 8.8], and 2 U Vent DNA polymerase. PT7.sp3 and Term.sp3 were amplified using the following temperature cycles: one cycle of 95 °C for 2 min, followed by 25 cycles of 95 °C for 30 s, 60 °C for 30 s, and 72 °C for 30 s.
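The overlap design described above can be checked computationally. The following is a minimal Python sketch, not part of the published protocol: it confirms that the 5′ extension added to each gene-specific antisense primer is the reverse complement of the FwdTERM primer, and it prepends the two extensions to a pair of gene-specific primers. The helper names and the gene-specific example sequences are hypothetical; the extension and FwdTERM sequences are copied from the text with the print-grouping spaces removed.

```python
# Sketch: check the primer/regulatory-element overlap described in the text.
# Helper names and the gene-specific example sequences are hypothetical.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# 5'->3' sequences copied from the text, with print-grouping spaces removed.
SENSE_EXTENSION = "GTTTAACTTAAGAAGGAGATATACAT"
ANTISENSE_EXTENSION = "CAGCGGTGGCAGCAGCCAACTCA"
FWD_TERM = "TGAGTTGGCTGCTGCCACCGCTG"


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def add_extensions(gene_sense, gene_antisense):
    """Prepend the overlap extensions to a pair of gene-specific primers."""
    return (SENSE_EXTENSION + gene_sense.upper(),
            ANTISENSE_EXTENSION + gene_antisense.upper())


# The antisense extension should pair with FwdTERM along its full length.
terminator_overlap_ok = reverse_complement(ANTISENSE_EXTENSION) == FWD_TERM

# Hypothetical gene-specific primer sequences, for illustration only.
sense_primer, antisense_primer = add_extensions(
    "ATGAAAGGCGGTAAACGT", "TTAGCCTTTCAGACGTTC")

print("terminator overlap complementary:", terminator_overlap_ok)
print("sense primer:", sense_primer)
print("antisense primer:", antisense_primer)
```

The same `reverse_complement` check could be applied to the sense extension and the 5′ end of RevPT7 once the exact sequences are confirmed against the primer order sheets.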
The PCR products were separated in a 2% (w/v) agarose gel, and the focused products were recovered using a gel extraction kit. The double-stranded DNA (dsDNA) products were eluted from the capture column using 50 µL of water, and the concentration and purity of both regulatory elements were determined by gel analysis and by measuring the absorbance at 260 and 280 nm.

Generating Linear Expression Templates Using PCR. The extension and GC-rich single primer amplification reactions were performed in a total volume of 50 µL by combining 5-10 nM purified ORF template from the first PCR, 30 nM each of PT7.sp3 and Term.sp3 (∼3 µL each of the PT7.sp3 and Term.sp3 preparations described above), 250 µM dNTPs, 1× ThermoPol reaction buffer, and 3 U Vent DNA polymerase. The thermal cycler was programmed to perform the following sequence: one incubation at 95 °C for 2 min; 10 cycles of 95 °C for 30 s, 57 °C for 1 min, and 72 °C for 3.5 min; and 15 cycles of 95 °C for 30 s, 67 °C for 1 min, and 72 °C for 3.5 min. After the first 10 cycles at the lower annealing temperature, 20 µM SP3 GC-rich primer (5′-ATGCAGGTCA TCCGAGGGGT T-3′) was added to each PCR reaction tube and mixed by inverting or gentle vortexing, and the remaining 15 temperature cycles were completed at the higher annealing temperature (Ta = 67 °C).

For comparison, extension of five gene targets was attempted using the RTS E. coli Linear Template Generation Set (LTGS, Roche Diagnostics, GmbH, Mannheim, Germany) and the Megaprimer extension procedure according to the published methods.22 For these methods, each ORF was amplified from gDNA using the same gene-specific primer sequence, but overlap regions were specified by the extension protocol being used. Amplified gene targets were used directly or purified using a QIAquick PCR purification column prior to the extension/amplification PCR reaction.

Cloning and DNA Sequencing. Five linear expression templates were subsequently cloned into the pK7 expression plasmid31 using standard procedures.30 Plasmids isolated from cultures of individual transformants were screened by NdeI/PstI digestion and further verified by DNA sequencing (Protein and Nucleic Acid Facilities, Stanford University, Stanford, CA).

In Vitro Protein Synthesis Using Linear Expression Templates. The NMR5 S30 cell extract has been described elsewhere22 and was used for in vitro protein expression studies with the linear ETs.
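The reagent amounts quoted under Generating Linear Expression Templates Using PCR mix molar (nM) and mass (ng/µL) units. The sketch below shows one way to convert between them and to compute pipetting volumes. It assumes the common approximation of 660 g/mol per base pair of dsDNA, which is not stated in the text, and the function names and the 20 ng/µL ORF stock are hypothetical. With the 0.5 µM regulatory-element stock from Table 1, a 30 nM final concentration in 50 µL corresponds to the ∼3 µL quoted in the text.

```python
# Sketch: dsDNA unit conversion and dilution volumes for the extension PCR.
# AVG_BP_MASS is a common approximation (assumption, not from the text).

AVG_BP_MASS = 660.0  # g/mol per base pair of double-stranded DNA


def dsdna_ng_per_ul_to_nM(conc_ng_per_ul, length_bp):
    """Convert a dsDNA mass concentration (ng/uL) to a molar one (nM)."""
    return conc_ng_per_ul * 1e6 / (length_bp * AVG_BP_MASS)


def volume_needed_ul(stock_nM, final_nM, reaction_ul):
    """Volume of stock (uL) giving final_nM in a reaction of reaction_ul."""
    if final_nM > stock_nM:
        raise ValueError("stock is too dilute for the requested concentration")
    return final_nM * reaction_ul / stock_nM


# T7 promoter/terminator stock at 0.5 uM (Table 1), 30 nM final in 50 uL.
t7_volume_ul = volume_needed_ul(stock_nM=500.0, final_nM=30.0, reaction_ul=50.0)

# Hypothetical purified ORF prep: 20 ng/uL of the 1185 bp tufA product,
# diluted to 8 nM (mid-range of the 5-10 nM quoted in the text).
orf_stock_nM = dsdna_ng_per_ul_to_nM(20.0, 1185)
orf_volume_ul = volume_needed_ul(orf_stock_nM, final_nM=8.0, reaction_ul=50.0)

print(f"T7 element volume: {t7_volume_ul:.1f} uL each")
print(f"ORF stock: {orf_stock_nM:.1f} nM, volume: {orf_volume_ul:.1f} uL")
```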
Batch CFPS reactions were conducted using a modified Cytomim system.31,32 A standard 15 µL CFPS reaction mixture included 1.2 mM ATP; 0.86 mM each of GTP, UTP, and CTP; 10 mM potassium phosphate (pH 7.2); 130 mM potassium glutamate; 10 mM ammonium glutamate; 8 mM magnesium glutamate; 34 µg/mL folinic acid; 171 µg/mL E. coli tRNA mixture; ∼100 ng PCR template; 100 µg/mL T7 RNA polymerase; 2 mM each of 20 unlabeled amino acids; 4.2 µM [14C]-leucine; 0.33 mM nicotinamide adenine dinucleotide; 0.26 mM coenzyme A; 4 mM sodium oxalate; 1.5 mM spermidine; 1 mM putrescine; and 0.24 vol of NMR5 S30 extract. Various concentrations of vitamins, cofactors, and trace metal ions were added to the standard CFPS reaction, and their concentrations were optimized for maximal production of a model protein, chloramphenicol acetyltransferase (CAT).

Elements already contained in the CFPS reaction included glutamate salts of magnesium, ammonium, and potassium along with the cofactors folinic acid, coenzyme A, and nicotinamide adenine dinucleotide (NAD). Vitamin and cofactor solutions were first prepared to stock concentrations of 100 mM. Thiamin, riboflavin, coenzyme B12, and flavin adenine dinucleotide were each dissolved in water, whereas pyridoxal 5′-phosphate was prepared in 1 M HCl and biotin in 2 M KOH. Lipoic acid was solubilized in 100% ethanol. Iron, copper, cobalt, manganese, zinc, molybdenum, and boron were introduced in the form of ferric(III) chloride or ferrous(II) ammonium sulfate, cupric(II) sulfate, cobalt(II) chloride, manganese(II) sulfate, zinc(II) sulfate, sodium molybdate(VI), and boric acid, respectively. Batch CFPS reactions were performed as described above using the pK7CAT expression plasmid (see Results and Discussion).

Table 2. Gene Targets from the E. coli Genome

protein | gene | base pairs | monomer MW (kDa) | multimeric state | extension products? | gel and lane position (Figure 4) | CFPS yields (µM)
Initiation Factor 3 (IF3) | infC | 543 | 20.6 | monomeric | yes | A 1, 2 | 14.0 ± 1.0
Elongation Factor Tu (EF-Tu) | tufA | 1185 | 43.3 | monomeric | yes | A 3, 4 | 4.0 ± 1.0
Elongation Factor Ts (EF-Ts) | tsf | 852 | 30.4 | heterotetrameric | yes | A 5, 6 | 12.0 ± 1.0
Release Factor 2 (RF2) | prfB | 1099 | 45.6 | monomeric | yes | A 7, 8 | 2.0 ± 1.0
Endonuclease I (EndA) | endA | 708 | 26.7 | monomeric [b] | yes | A 9, 10 | 4.0 ± 0.4
Arginine Decarboxylase (SpeA) | speA | 1977 | 73.9 | homotetrameric | yes | A 11, 12 | 2.0 ± 0.4
Heat Shock Protein (GrpE) | grpE | 594 | 21.8 | homodimeric | yes | A 13, 14 | 8.0 ± 0.3
Heat Shock Protein (DnaK) | dnaK | 1917 | 69.1 | monomeric | yes | A 15, 16 | 3.0 ± 0.4
Trigger Factor (Tig) | tig | 1299 | 46.1 | monomeric | yes | A 17, 18 | 3.0 ± 0.1
17 kDa Protein (Skp) | skp | 486 | 17.7 | homotetrameric | yes | A 19, 20 | 11.0 ± 0.3
Pyruvate Dehydrogenase (E1p) | aceE | 2664 | 99.7 | heteromultimeric | yes | A 21, 22 | 1.0 ± 0.3
Dihydrolipoamide Acetyltransferase (E2p) | aceF | 1893 | 66.1 | heteromultimeric | yes | A 23, 24 | 2.0 ± 1.0
Dihydrolipoate Dehydrogenase (E3p) | lpdA | 1425 | 50.7 | heteromultimeric | yes | A 25, 26 | 5.0 ± 0.3
ppGpp Synthetase I (RelA) | relA | 2235 | 83.9 | monomeric [b] | yes | A 27, 28 | 0.2 ± 0.1
ppGpp Synthetase II (SpoT) | spoT | 2112 | 79.3 | monomeric [b] | yes | B 1, 2 | 2.0 ± 0.1
Nucleoside Diphosphate Kinase | ndk | 432 | 15.5 | homotetrameric | yes | B 3, 4 | 10.0 ± 0.5
N Utilization Site G | nusG | 546 | 20.5 | monomeric [b] | yes | B 5, 6 | 9.0 ± 2.0
Transcription Elongation/Cleavage Factor (GreB) | greB | 513 | 19.9 | monomeric [b] | yes | B 7, 8 | 5.0 ± 1.0
L-Serine Deaminase II (LSD-2) | sdaB | 1368 | 48.8 | monomeric | yes | B 9, 10 | 4.0 ± 0.3
Endonuclease III (XthA) | xthA | 807 | 31.0 | monomeric | yes | B 11, 12 | 5.0 ± 0.3
Poly(A) Polymerase (PcnB) | pcnB | 1365 | 52.5 | monomeric | yes | B 13, 14 | 0.2 ± 0.1
Cytidine Deaminase (Cdd) [a] | cdd | 885 | 31.5 | homodimeric | yes | B 15, 16 | 7.0 ± 0.0
Thymidine Phosphorylase (DeoA) | deoA | 1323 | 47.2 | homodimeric | yes | B 17, 18 | 4.0 ± 1.0
Glutathione Reductase (GR) [a] | gor | 1353 | 48.8 | homodimeric | yes | B 19, 20 | 4.0 ± 0.4
Pyruvate Kinase F (PK) [a] | pykF | 1413 | 50.7 | homotetrameric | yes | B 21, 22 | 5.0 ± 0.3
Thioredoxin Reductase (TR) [a] | trxB | 966 | 34.6 | homodimeric | yes | B 23, 24 | 8.0 ± 0.1
UMP Kinase (UK) [a] | pyrH | 726 | 26.0 | homohexameric | yes | B 25, 26 | 6.0 ± 2.0
Peptidyl-tRNA Hydrolase | pth | 585 | 21.1 | monomeric | yes | B 27, 28 | 6.0 ± 0.1
Protease La (Lon) | lon | 2355 | 87.4 | homotetrameric | yes | C 1, 2 | 2.0 ± 0.3
Heat Shock Protein (DnaJ) | dnaJ | 1131 | 41.1 | homodimeric | yes [d] | C 3, 4 | 4.3 ± 0.3
Elongation Factor P | efp | 567 | 20.6 | monomeric [b] | yes | C 5, 6 | 12.0 ± 4.0
Ribosome Recycling Factor | frr | 558 | 20.6 | homotrimeric | yes | C 7, 8 | 0.0 ± 0.0
Signal Recognition Particle Protein | ffh | 1362 | 49.8 | monomeric | yes | C 9, 10 | 2.0 ± 0.2
Preprotein Translocase SecA Subunit | secA | 2706 | 102.0 | dimeric | yes | C 11, 12 | 0.3 ± 0.1
Preprotein Translocase SecY Subunit | secY | 1332 | 48.5 | heteromultimeric | yes | C 13, 14 | 5.0 ± 1.0
10 kDa Chaperonin | groS | 294 | 10.4 | homoheptameric | yes | C 15, 16 | 25.0 ± 1.0
60 kDa Chaperonin | groL | 1647 | 57.3 | heteromultimeric | yes | C 17, 18 | 2.0 ± 0.5
Cell Division Protein ftsA | ftsA | 3746 | 45.3 | monomeric | no | C 19, 20 | na
Cell Division Protein ftsL | ftsL | 366 | 13.6 | ND | yes | C 21, 22 | 5.0 ± 2.0
Cell Division Protein ftsQ | ftsQ | 1763 | 31.4 | ND | yes | C 23, 24 | 0.0 ± 0.0
Cell Division Protein ftsW | ftsW | 1245 | 46.0 | ND | yes | C 25, 26 | 2.0 ± 0.7
Cell Division Protein ftsZ | ftsZ | 1152 | 40.3 | dynamic | yes | C 27, 28 | 2.0 ± 0.7
FMN Reductase [a] | fre | 702 | 28.9 | monomeric | yes | D 1, 2 | 5.0 ± 1.0
Acetate Kinase | ackA | 1203 | 27.2 | homodimeric | yes | D 3, 4 | 8.0 ± 1.0
β-Glucuronidase [a] | uidA | 1812 | 146.2 | homotetrameric | yes | D 5, 6 | 4.0 ± 1.0
Acetyl-CoA Synthetase [a] | acs | 1959 | 143 | homodimeric [b] | yes | D 7, 8 | 0.4 ± 0.1
L-Asparaginase | ansA | 1017 | 274.5 | homotetrameric | yes | D 9, 10 | 1.0 ± 0.2
Catalase | katG | 2181 | 151.6 | homotetrameric | yes | D 11, 12 | 1.0 ± 0.1
G6P Dehydrogenase [a] | zwf | 1476 | 91.2 | homodimeric | yes | D 13, 14 | 2.0 ± 0.2
Glutamate Dehydrogenase | gdhA | 1344 | 46.9 | homohexameric | yes | D 15, 16 | 3.0 ± 0.2
Glycerol Kinase | glpK | 1509 | 101.5 | homotetrameric | yes | D 17, 18 | 3.0 ± 2.0
Malate Dehydrogenase [a] | mdh | 939 | 78.5 | homodimeric | yes | D 19, 20 | 6.0 ± 1.0
Dihydrofolate Reductase (DHFR) [a] | folA | 480 | 18.0 | monomeric | na [c] | | 13.0 ± 1.0
Chloramphenicol Acetyltransferase [a] | cat | 652 | 25 | homotrimeric | na [c] | | 11.3 ± 2.2
Tryptophanase (Tna) | tnaA | 1431 | 53.4 | homotetrameric | no | | na
RNase E (Rne) | rne | 3186 | 118.2 | homotetrameric | no | | na

[a] Activity confirmed using established assays (see Table 3). [b] Determined by homology comparison. [c] Functional templates were amplified directly from plasmids using FwdPT7 and RevTERM. [d] See Results and Discussion for modifications required to produce a functional ET for dnaJ.

Batch CFPS reactions were performed in tissue culture-treated, flat-bottom microtiter plates made of polystyrene (Becton Dickinson Labware, NJ). The plates were incubated with a cover in a humidity-controlled incubator at 37 °C for 5 h. Soluble protein was collected from the supernatant after centrifuging the CFPS reaction products for 15 min at 12 000g and 4 °C. The amount of total and soluble synthesized protein was estimated from measuring TCA-insoluble radioactivity using a liquid scintillation counter (Wallac 1450 Microbeta LSC, Perkin-Elmer) as described elsewhere.31,33

Enzymatic Assays. All spectrophotometric measurements were conducted using a Hewlett-Packard 8452A diode array spectrophotometer. Enzyme-specific activities were measured without purification and corrected for any detectable background activity from the cell extract. An equivalent volume of a CFPS reaction conducted without a DNA template was used to determine background activity for each assay. Specific activities were determined according to the published methods referenced in Table 3.34-41

SDS-PAGE and Autoradiography. CFPS reactions were analyzed by reducing SDS-PAGE. Precast gels and reagents were purchased from Invitrogen (CA). Samples (≤30 µg/mL) were denatured at 80 °C in loading buffer (1× LDS running buffer and 50 mM DTT). The samples were loaded onto a 12% (w/v) Bis-Tris precast gel and electrophoresed in MOPS/SDS running buffer containing NuPAGE antioxidant. SimplyBlue SafeStain was used to stain and fix the gels according to the manufacturer’s recommendations. The gels were dried using a gel dryer and exposed to photographic paper (Kodak, NY) for 2 days at room temperature to detect proteins that had incorporated radioactive 14C-leucine.

Table 3. Functional Protein Array

protein | wild-type specific activity [a] (U/mg) | reference | size and subunit composition | prosthetic group | standard CFPS total (µM) | standard CFPS soluble (µM) | standard CFPS active [b] (%) | supplemented CFPS total (µM) | supplemented CFPS soluble (µM) | supplemented CFPS active [b] (%)
Dihydrofolate Reductase (DHFR) | 53 | 35 | 18 kDa monomer | none | 14.4 ± 1.6 | 14.7 ± 0.0 | 89 ± 15 | 14.7 ± 2.4 | 10.9 ± 0.4 | 80 ± 17
FMN Reductase | 120 | 36 | 26 kDa monomer | none | 7.4 ± 0.8 | 5.6 ± 0.1 | 56 ± 3 | 4.7 ± 0.1 | 2 ± 0.4 | 80 ± 19
Glucose 6-phosphate Dehydrogenase (G6P) | 104 | 37 | 56 kDa monomer | NADP+ | 2.7 ± 1.0 | 3.2 ± 0.8 | 19 ± 15 | 1.4 ± 0.2 | 0.8 ± 0.0 | 61 ± 21
β-Glucuronidase (GUS) | 60 | 38 | 68.4 kDa monomer | none | 1.6 ± 3.0 | 0.5 ± 4.6 | 60 ± 7 | 2.1 ± 0.2 | 1.0 ± 0.0 | 84 ± 27
Malate Dehydrogenase (MDH) | 700 | 39 | 130 kDa dimer | none | 5.0 ± 0.7 | 4.2 ± 0.4 | 18 ± 4 | 3.8 ± 0.1 | 3.4 ± 0.1 | 49 ± 2
Acetate Kinase (AK) | 172 | | 86 kDa dimer | Mg2+ | 8.1 ± 1.4 | 7.5 ± 2.7 | 93 ± 34 | 7.5 ± 0.1 | 5.9 ± 0.5 | 99 ± 13
Cytidine Deaminase | 250 | 40 | 63 kDa dimer | Zn2+ | 5.4 ± 0.3 | 5.1 ± 1 | 34 ± 10 | 6.7 ± 0.3 | 5.1 ± 1 | 68 ± 14
Thioredoxin Reductase (TR) | 39 | 27 | 138 kDa dimer | FAD | 7.6 ± 1.5 | 6.2 ± 0.9 | 36 ± 24 | 5.6 ± 1.5 | 3.8 ± 1.1 | 80 ± 7
Glutathione Reductase (GR) | 31 | | 195 kDa dimer | FAD | 4.2 ± 1.0 | 4.1 ± 0.1 | 16 ± 12 | 3.9 ± 0.4 | 1.3 ± 0.5 | 82 ± 11
Chloramphenicol Acetyltransferase (CAT) | 125 | | 25 kDa trimer | none | 9.2 ± 2.6 | 7.8 ± 1.0 | 85 ± 22 | 9.5 ± 0.1 | 5 ± 0.5 | 91 ± 32
Pyruvate Kinase | 35 | 34 | 203 kDa tetramer | Mg2+, K+ | 5.5 ± 0.9 | 5.4 ± 0.7 | 92 ± 23 | 4.9 ± 0.5 | 3.6 ± 0.4 | 99 ± 18
UMP Kinase | 128 | 41 | 155 kDa hexamer | none | 5.8 ± 1.8 | 5.5 ± 0.7 | 20 ± 10 | 4.5 ± 0.5 | 3.2 ± 1.1 | 52 ± 10

[a] A unit is defined as micromoles of substrate catalyzed per minute. [b] Determination of active fraction is based on published specific activities and on total soluble accumulated protein.
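Footnote b of Table 3 and the Enzymatic Assays paragraph together imply a simple calculation for the active fraction: subtract the no-template background, then divide by the activity expected if all soluble protein had the published specific activity. The sketch below is one plausible reading of that calculation, not the authors' stated formula. It assumes that the soluble concentration is expressed per monomer and that activities are measured in U per mL of CFPS reaction; the function names and the measured values are hypothetical.

```python
# Sketch: active fraction from background-corrected activity (assumed formula).

def protein_mg_per_ml(soluble_uM, monomer_kDa):
    """Convert a molar protein concentration to mg/mL.

    1 uM of a 1 kDa protein is 1e-3 mg/mL.
    """
    return soluble_uM * monomer_kDa * 1e-3


def active_fraction(measured_U_per_ml, background_U_per_ml,
                    soluble_uM, monomer_kDa, published_U_per_mg):
    """Fraction of soluble protein that is active, relative to the published
    specific activity, after subtracting the no-template background."""
    expected = protein_mg_per_ml(soluble_uM, monomer_kDa) * published_U_per_mg
    if expected <= 0:
        raise ValueError("expected activity must be positive")
    return (measured_U_per_ml - background_U_per_ml) / expected


# DHFR inputs from Table 3 (supplemented CFPS): 10.9 uM soluble, 18 kDa,
# 53 U/mg. The measured and background activities are hypothetical.
fraction = active_fraction(measured_U_per_ml=8.5, background_U_per_ml=0.2,
                           soluble_uM=10.9, monomer_kDa=18.0,
                           published_U_per_mg=53.0)
print(f"active fraction: {fraction:.0%}")
```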

Results and Discussion

Preparation of Transcriptionally-Active Linear ETs. The procedure for making transcriptionally active linear ETs is outlined in Figure 1. Each gene target was amplified from gDNA using two gene-specific primers that also installed unique overlap regions flanking the gene coding sequence. Various DNA polymerases can be used for this initial amplification as long as primer-dimers are minimized (Table 1). We found that the prevalence of a 0.1 kb product, which was attributed to primer-dimer formation, increased as a function of increasing magnesium concentration in the PCR reaction. This 0.1 kb product led to aberrant product amplification in the subsequent PCR and prevented formation of the full-length ET. Of the three DNA polymerases evaluated, AccuPrime Pfx DNAP achieved the highest yields and specificity over a range of target sizes, performed best under lower magnesium concentration, and successfully amplified all 55 gene targets examined in this study. Vent DNAP was used for all other PCR reactions conducted in this work unless indicated otherwise. Although fidelity and speed are important criteria for selecting a polymerase, we also considered polymerase cost as a significant factor in designing expression platforms for whole genomes. In terms of cost alone, Vent DNAP is one of the least expensive polymerases per unit activity. In addition, Vent DNAP is ranked among the top polymerases for fidelity, second only to Pfu DNAP.52

The commercially available RTS E. coli LTGS and the published Megaprimer extension method14 illustrate the underlying principle of E-PCR, which is to use a single temperature cycle to extend transcriptional regulatory elements onto a gene target and simultaneously amplify the full-length product using two different end-primers. Both approaches share a similar strategy for adding transcriptional regulatory elements onto PCR-amplified gene targets, but use different overlap regions to direct the extension reaction. We examined both methods for their ability to extend T7 regulatory elements onto five gene targets: infC, 543 bp; tufA, 1185 bp; tsf, 852 bp; lon, 2355 bp; and prfB, 1099 bp. We were unable to amplify the prfB coding sequence from gDNA using the gene-specific primers containing the LTGS overlap regions. The four gene amplification products were used without purification and were combined with the transcription regulatory elements and end-primers included with the LTGS. Introduction of the LTGS T7 promoter and terminator elements adds approximately 330 bp to each gene target. Full-length ETs were observed for only two of the four genes, and yields were less than 10 ng/µL in each case. The LTGS method produced two PCR product bands for infC, which suggested that the promoter and terminator elements may not be equally extending onto the gene target (Figure 2, lanes 2 and 3). For tufA, PCR amplification of the gene led to an aberrant 0.8 kb product (Figure 2, lane 4) that appeared to be extended with the LTGS transcription regulatory elements (Figure 2, lane 5). This result suggested that amplicons other than the gene target could promote extension and lead to generation of other expression elements. Extension products were not observed for either tsf or lon. In all four cases, the dominant band was an aberrant amplicon migrating at 0.4 kb.

Figure 2. Gene extension with transcriptional regulatory elements using the RTS Linear Template Generation Set (Roche). The RTS E. coli Linear Template Generation Set (Roche Applied Science) was used according to the manufacturer’s recommendations to extend T7 regulatory elements onto four E. coli open-reading frames. The ORFs (-) were amplified from genomic DNA by PCR and loaded onto a lane adjacent to their extension product (+). The introduction of T7 regulatory elements added approximately 330 bp to each ORF. Asterisk (*) denotes correct product band. See Table 2 for size of each ORF.

The Megaprimer extension method was only moderately more successful than the LTGS in extending T7 regulatory elements onto the same five gene targets (Figure 3). Only three of these targets were successfully extended with T7 regulatory elements and amplified to yields of ≥50 ng/µL. All extension products, however, showed significant contamination with aberrant amplicons (Figure 3A). The quality of the ETs was improved by purifying the gene products prior to the Megaprimer extension/amplification PCR. Purification led to full-length extension products for all gene targets and a decrease in nonspecific priming events (Figure 3B), but the same 0.4 kb band seen with the LTGS continued to be strongly amplified. Variable results were observed when we used the Megaprimer protocol to extend T7 regulatory elements onto eight other gene targets with lengths between 0.5 and 3 kb. Although we observed ETs for all eight ORFs, only six targets produced yields ≥50 ng/µL, and we continued to observe significant nonspecific amplification of other DNA products of various sizes, including the 0.4 kb band. In summary, both the LTGS and Megaprimer method showed low target specificity and failed to produce the linear ETs in high yield or purity. The dominant product was often a 0.4 kb contaminant that has been suggested to compete for RNA polymerase binding.18,26,42

We developed a new PCR procedure with the aim of rapidly generating higher and more consistent yields of linear ETs that could be used for expressing genomic libraries. The method produces the full-length ET in two separate stages of a single PCR reaction (Figure 1). The bacteriophage T7 promoter and terminator elements were first amplified in high yield from the pK7CAT expression vector.31 Although agarose gel analysis did not indicate the presence of other product bands, optimal results were obtained when both regulatory elements were purified by gel extraction (Table 1). Simply purifying the regulatory elements with a QIAquick PCR purification kit resulted in aberrant product formation and little or no full-length product from the extension/amplification PCR (data not shown). On the basis of the enhanced quality of linear ETs formed with purified ORFs when using the Megaprimer procedure, the QIAquick PCR spin-columns were used to rapidly purify all amplified genes prior to the extension/amplification PCR (Table 1). We expect that this purification can easily be automated by using a 96-well format and a programmable liquid handling device. The extension PCR is performed during the first stage (10 cycles) of the subsequent PCR, wherein PT7.sp3 and Term.sp3 anneal to their respective complementary regions on the gene target and prime extension of the gene coding sequence to form the full-length ET. The first stage is performed in the absence of end-primers to ensure that the only priming event is extension of the gene target. After 10 cycles, the GC-rich single primer (SP3) is added and the annealing temperature is raised to 67 °C. The last stage (15 cycles) is conducted at a higher annealing temperature to avoid nonspecific amplification of aberrant amplicons. Fifty-two of 55 gene targets amplified directly from E. coli genomic DNA (gDNA) were successfully extended and amplified to form a full-length ET (Figure 4).
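The two-stage program described above can be written down as data. The following Python sketch encodes the stages from the Experimental Section and Table 1 and totals the programmed hold time; the class and function names are hypothetical. Table 1 gives the extension step as 1 min per kb, whereas the Experimental Section uses a fixed 3.5 min for all targets, which under this rule corresponds to a 3.5 kb product. The one-minute minimum extension is an assumption of the sketch, not a value from the text.

```python
# Sketch: the two-stage extension/amplification program as a data structure.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


@dataclass(frozen=True)
class Stage:
    name: str
    cycles: int
    steps: tuple
    add_before: str = ""  # reagent to add before this stage starts


def extension_seconds(product_bp, sec_per_kb=60):
    """Extension time at 1 min per kb (Table 1), rounded up to whole seconds.

    The floor of one minute is an assumption of this sketch.
    """
    return max(sec_per_kb, -(-product_bp * sec_per_kb // 1000))


def build_program(product_bp):
    ext = extension_seconds(product_bp)
    return [
        Stage("initial denaturation", 1, (Step(95, 120),)),
        # Stage 1: regulatory elements prime extension; no end primer present.
        Stage("overlap extension", 10,
              (Step(95, 30), Step(57, 60), Step(72, ext))),
        # Stage 2: SP3 is added and the annealing temperature is raised.
        Stage("single-primer amplification", 15,
              (Step(95, 30), Step(67, 60), Step(72, ext)),
              add_before="SP3 primer"),
    ]


def total_minutes(program):
    """Programmed hold time only; ramping and the pause for SP3 are ignored."""
    return sum(st.cycles * sum(s.seconds for s in st.steps)
               for st in program) / 60


program = build_program(3500)  # reproduces the fixed 3.5 min extension
for stage in program:
    note = f" (add {stage.add_before} first)" if stage.add_before else ""
    print(f"{stage.cycles:>2} x {stage.name}{note}")
print("total programmed time (min):", total_minutes(program))
```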
The method was highly reproducible (n ≥ 5) and achieved high product purity and product yield. Although we did not attempt to rigorously troubleshoot the extension and amplification of the three failed targets, we were able to form a functional template for dnaJ by using the weak product formed from the second PCR as a template to repeat the extension and amplification reaction (Figure 4C, lanes 3 and 4). Of the 52 targets, six were larger than 2 kb, and unlike other methods that require specific optimization for successful extension and amplification of large gene targets, our procedure did not show any obvious correlation between gene size and ability to form a functional ET (Table 1). Only two of the 52 extended ETs produced the 0.4 kb contaminant that was prevalent with the other methods. We did not observe any nonspecific amplification of aberrant products, and in the case where more than one product was formed during the genomic PCR, we observed that the extension and amplification reaction was specific for the gene target (Figure 4B, lanes 3 and 4). Restriction sites located within the promoter and terminator regions were used to clone the ETs into a plasmid expression vector to confirm the integrity of the linear ET. Five ETs were successfully cloned into the pK7 expression plasmid, and the transcription regulatory regions (PT7, RBS, and terminator) and gene coding sequence were verified by DNA sequencing (data not shown).

Figure 3. Gene extension with transcriptional regulatory elements using the Megaprimer extension protocol.22 The Megaprimer extension protocol was performed according to the procedures in the Experimental Section. (A) The ORF product was used according to the Megaprimer extension method. (B) The ORF PCR product was purified from solution using QIAquick columns prior to the extension reaction. (C) ORF PCR products of varying sizes were purified using a QIAquick PCR spin column and then extended using the Megaprimer protocol. The ORF (-) is loaded in a lane adjacent to its respective extension product (+) on a 1.3% (w/v) agarose gel stained with ethidium bromide. The introduction of T7 regulatory elements added approximately 370 bp to each ORF. See Table 1 for the size of each ORF.

Figure 4. Extension and amplification of transcriptionally active PCR templates using GC-rich single primers. The single-primer extension protocol was performed according to the procedures in the Experimental Section. T7 regulatory elements were extended onto 55 gene targets and then amplified using a GC-rich single primer. The ORFs (-) were loaded in a lane adjacent to their respective extension products (+) and visualized on a 1.3% (w/v) agarose gel stained with ethidium bromide. The introduction of T7 regulatory elements adds approximately 420 bp to each ORF. Arrows indicate extension products that are difficult to visualize or lanes in which more than one band appears. See Table 1 for lane descriptions in each panel.

Effect of Coenzyme and Trace Metal Ion Supplementation to CFPS. Defined medium recipes for moderate-cell density E. coli fermentations were used to select coenzymes and trace metal ions for supplementation to the standard CFPS reaction.43 Initial concentrations of each vitamin, cofactor, and metal ion were selected based upon previous results from CFPS expression of complex proteins.27,28 These components were further optimized to avoid inhibition of protein synthesis. The coenzymes folinic acid, coenzyme A, and NAD are present in the standard CFPS reaction. Vitamins and cofactors that were supplemented included FAD, thiamin, riboflavin, pyridoxal 5′-phosphate, biotin, lipoic acid, and coenzyme B12. Stock solutions containing these seven coenzymes were prepared as described in the Experimental Section and then combined into a multivitamin solution to provide each component at a final concentration of 10 µM in the CFPS reaction. The initial target concentration of 10 µM was based on the optimal concentration of FAD needed to activate glutathione reductase and thioredoxin reductase, two flavin enzymes expressed previously using CFPS.27 Supplementing the standard CFPS reaction with 10 µM of the multivitamin solution did not inhibit the expression of CAT but instead enhanced protein yields by 10% (Figure 5A). Doubling the multivitamin concentration did not inhibit CAT expression but did not further improve protein yields (data not shown).

Trace metal ions were supplemented into the CFPS reaction in the form of ferric chloride, cupric sulfate, cobalt chloride, manganese sulfate, zinc sulfate, and sodium molybdate. Main group elements, except for boron, are already present in the standard CFPS reaction and include glutamate salts of magnesium(II) and potassium(I). Boron was supplemented in the form of boric acid. All trace metal ions and boron were combined in a cocktail and added to a final concentration of 250 µM each in the CFPS reaction, which is based on the concentration of ferrous ammonium sulfate needed for the CFPS activation of ferredoxin, an iron-sulfur protein.28 CAT total protein yields dropped from ∼15 µM to less than 2 µM after supplementation with the metal cocktail (Figure 5A). Combined addition of the multivitamin solution and the metal cocktail also produced low CAT yields. The individual trace metal ions and boron were added independently to identify elements that negatively impacted CFPS yields. Supplementing the standard CFPS with iron, cobalt, molybdenum, and boron to a final concentration of 250 µM did not decrease CAT yields (Figure 5B). In fact, boric acid moderately improved CAT yields compared to the standard reaction. This result may be due to the influence of pH on CFPS yields rather than the benefits of boron itself.44 Addition of copper, manganese, and zinc negatively influenced protein yields. Copper sulfate decreased CAT yields by almost 90%, and manganese sulfate and zinc sulfate decreased protein yields by 20-30% compared to the standard reaction conditions. Copper ions have several critical roles in cellular function, but excess copper can result in toxicity due to Cu-catalyzed formation of highly reactive hydroxyl radicals that can lead to DNA and RNA damage.45 Copper ions can also bind with high affinity to partially folded proteins and catalyze auto-oxidation of lipids and proteins as well as nucleic acids.46

This transition metal ion has also been implicated in altering the glutathione-redox balance in the cell,47 which may lead to the inactivation of proteins important in transcription and translation. Lowering the CFPS concentration of copper, manganese, and zinc markedly improved protein synthesis yields (Figure 5C). Total protein yields for CAT increased 10-fold by reducing the copper sulfate concentration from 250 to 60 µM. Reducing the concentration of manganese(II) and zinc(II) by the same magnitude recovered protein yields within 10% of those achieved without metal supplementation. On the basis of these results, the metal cocktail was reformulated to reintroduce iron, cobalt, molybdenum, and boron to a final concentration of 250 µM, copper to a final concentration of 60 µM, and manganese sulfate and zinc sulfate to a final concentration of 30 µM. The concentrations of manganese sulfate and zinc sulfate were reduced further, to 30 µM, in an attempt to fully recover protein yields. This new metal cocktail improved CAT expression compared to the original 250 µM formulation, but the overall yields were still 20% lower than without supplementation. Reducing the final concentration of the modified metal cocktail formulation (Fe/Co/Mo/B = 250 µM, Cu = 60 µM, and Mn/Zn = 30 µM) alone or in combination with citrate, which acts as a weak chelator, did not return CAT yields to the value observed under standard CFPS conditions (data not shown).
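The reformulated cocktail can be summarized as a small lookup table, with the volume of each stock following from C1V1 = C2V2. The sketch below is only illustrative: the final concentrations come from the text, while the 10 mM stock concentration, the 15 µL reaction volume used in the example call, and the function name `stock_volume_ul` are assumptions made for the example and are not taken from the protocol.

```python
# Final concentrations (µM) of the reformulated trace metal cocktail, as stated in the text.
FINAL_UM = {
    "iron": 250.0,
    "cobalt": 250.0,
    "molybdenum": 250.0,
    "boron": 250.0,
    "copper": 60.0,
    "manganese": 30.0,
    "zinc": 30.0,
}


def stock_volume_ul(final_um: float, stock_um: float, reaction_ul: float) -> float:
    """Volume of stock (µL) that brings the reaction to final_um, from C1*V1 = C2*V2."""
    if stock_um <= final_um:
        raise ValueError("stock must be more concentrated than the target")
    return final_um * reaction_ul / stock_um


# ASSUMPTIONS for illustration only: 10 mM stocks and a 15 µL reaction.
HYPOTHETICAL_STOCK_UM = 10_000.0
HYPOTHETICAL_REACTION_UL = 15.0

for metal, final_um in FINAL_UM.items():
    volume = stock_volume_ul(final_um, HYPOTHETICAL_STOCK_UM, HYPOTHETICAL_REACTION_UL)
    print(f"{metal:<11s} {final_um:6.1f} µM  ->  {volume:.3f} µL of stock")
```

With volumes this small, a premixed cocktail is more practical in a real setup than pipetting each stock individually, which matches the cocktail approach described in the text.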


Figure 5. Addition of trace metals, vitamins, and cofactors into the CFPS reaction for optimal accumulation of chloramphenicol acetyltransferase (CAT). (A) Initial supplementation of vitamins, cofactors, and trace metal ions showed CFPS inhibition by the metal ion cocktail. (B) Each trace metal was added separately to a final concentration of 250 µM to identify inhibitors of CFPS. (C) Copper(II), manganese(II), and zinc(II) were titrated to final concentrations of 250 µM (solid white), 120 µM (dots), and 60 µM (stripes) to examine CFPS inhibition. (D) The CFPS reaction was supplemented with a final trace metal formulation that provided iron (ferrous ammonium sulfate), cobalt(II), molybdenum(VI), and boric acid at a final concentration of 250 µM each; manganese(II) and zinc(II) at a final concentration of 30 µM each; and copper(II) at a final concentration of 60 µM. Vitamins and cofactors were added to a final concentration of 20 µM.

Sodium molybdate was removed from the metal cocktail formulation and added separately to the CFPS reaction, because we observed that it precipitated when combined with the other elements. Ferrous ammonium sulfate was also substituted for ferric chloride without impacting CFPS performance. These cumulative changes improved CAT yields to within 10% of yields obtained using standard conditions, and no further optimization was conducted.

Expression of a Functional Protein Array Using Linear ETs. The transcriptionally active ETs were used without purification to direct in vitro combined transcription-translation of our library. Reactions were conducted in the NMR5 S30 cell extract, which exhibits enhanced linear DNA stability due to mutations that removed the endA gene encoding endonuclease A and replaced the recCBD operon with the Red recombinase system from the λ bacteriophage.14 The standard CFPS reaction was also modified by addition of the multivitamin and trace metal ion supplementation as described above. Batch cell-free reactions for each template were performed in parallel using a total reaction volume of 15 µL in 96-well, flat-bottom microtiter plates. The volume we used in our 96-well plate batch reactions is based on the findings of Voloshin and Swartz,48 who observed that Cytomim CFPS reactions require a special geometry at scale-up volumes. They demonstrated that CFPS yields were affected by the geometry of the reaction volume and that optimal expression and folding were obtained by using a thin-film format.48 As such, we used a 15 µL reaction volume, which adequately formed a thin film in a 96-well, flat-bottom plate.


Figure 6. SDS-PAGE analysis of proteins expressed in a cell-free system from PCR templates. Cell-free protein synthesis reactions were conducted as described in Experimental Section. Aliquots were taken at the end of the reaction and analyzed by (A) reducing SDS-PAGE and (B) autoradiography. Top gels in (A) and (B): lane 1 and 15, Mark 12 protein standard (Invitrogen); lane 2, background CFPS; lane 3, GroES 10.4 kDa; lane 4, RF-2 45.6 kDa; lane 5, EF-P 20.6 kDa; lane 6, EfTs 30.4 kDa; lane 7, Lpd 50 kDa; lane 8, DeoA 47.2 kDa; lane 9, Pth 21.1 kDa; lane 10, PK 50.7 kDa; lane 11, IF3 20.6 kDa; lane 12, EfTu 43.3 kDa; lane 13, SpoT 79.3 kDa; lane 14, GreB 20.0 kDa. Bottom gels in (A) and (B): lane 1, background CFPS; lane 2, CAT 25 kDa; lane 3, Ffh 50.0 kDa; lane 4, Ndk 15.5 kDa; lane 5, TR 34.6 kDa; lane 6, Skp 17.6 kDa; lane 7, Grp 21.8 kDa; lane 8, Cdd 31.5 kDa; lane 9, AmpC 28.9 kDa; lane 10, SecY 48.5 kDa; lane 11, NusG 20.5 kDa; lane 12, GroEL 57.3 kDa; lane 13, DnaK 69.1 kDa; lane 14, Mark 12 protein standard.
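Molar yields such as the ∼15 µM reported for CAT can be related to mass concentrations using the molecular weights listed in the Figure 6 caption. The conversion is a unit identity (1 µM of a 1 kDa protein is 1 µg/mL); the function names below are hypothetical and the sketch is only a convenience for reading the yields, not part of the original analysis.

```python
def micromolar_to_ug_per_ml(conc_um: float, mw_kda: float) -> float:
    """Convert a molar yield (µM) to a mass concentration (µg/mL).

    1 µmol/L x 1000 g/mol = 1000 µg/L = 1 µg/mL, so µM x kDa gives µg/mL directly.
    """
    return conc_um * mw_kda


def ug_per_ml_to_micromolar(conc_ug_ml: float, mw_kda: float) -> float:
    """Inverse conversion: mass concentration (µg/mL) to molar yield (µM)."""
    if mw_kda <= 0:
        raise ValueError("molecular weight must be positive")
    return conc_ug_ml / mw_kda


# CAT: ~15 µM under standard conditions, 25 kDa per the Figure 6 caption.
print(micromolar_to_ug_per_ml(15.0, 25.0))  # 375.0
```

The same relation shows why small proteins can reach high molar yields while remaining modest in mass terms: a 10 kDa protein at 10 µM is only 100 µg/mL.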

We did not observe a decrease in protein yields in this format until the reaction volume increased to ∼50 µL. Fifteen microliter reaction volumes in round-bottom 96-well plates produced similar results, whereas V-bottom plates resulted in drastically lower protein yields. Batch reactions produced average total protein yields of 5.1 ± 4.5 µM across the entire protein array (Table 2). Almost 70% of the protein targets accumulated to concentrations greater than 3 µM. Most of the protein targets in the library were either monomeric or homodimeric, but 10 homotetrameric and two homohexameric proteins, as well as one homoheptameric protein, were also expressed. Seventeen of the 52 protein targets showed total protein accumulation of 200 µg/mL or greater, and two proteins, β-glucuronidase (GUS) and malate dehydrogenase (MDH), produced at least 500 µg/mL of protein after a 5 h incubation at 37 °C. Only five proteins expressed less than 50 µg/mL, and for two of these cases, the lower protein yield could be attributed to poor template quality. The reaction time at 37 °C can be decreased from 5 to 3 h without significantly reducing protein yields. Lower reaction temperatures such as 25 and 30 °C have been used to improve the folding of complex mammalian proteins that contain multiple disulfide bonds. In particular, Yang et al. have shown that a lowered reaction temperature improves the folding of a complex fusion protein comprised of GM-CSF and a subdomain of the 38C13 scFv.53 We did not examine the influence of lower temperature on protein folding in this work, since most of our targets were soluble cytoplasmic E. coli proteins with no disulfide bonds. Table 3 illustrates that we obtained high solubility at 37 °C for the protein targets that were measured.

Protein targets showing the highest CFPS yields were analyzed using SDS-PAGE and autoradiography (Figure 6). Stained SDS-PAGE gels showed that many protein targets were expressed to concentrations above the background proteins present in the cell-free system (Figure 6A), and autoradiography demonstrated that the linear expression templates directed the synthesis of a single product (Figure 6B). We determined the total, soluble, and active protein yields in the presence and absence of the supplementation. Although many proteins have functions that are not easily measured, several enzymes with available colorimetric assays were tested for activity. Protein purification was avoided because the activity for each enzyme could be detected above the background activity associated with the cell extract (Figure 7). Many of these enzymes were nicotinamide or flavin enzymes, and others required metal ions for activity. Total protein accumulation was similar between reactions prepared in the presence or absence of the vitamin, cofactor, and trace metal ion supplementation. The standard CFPS reaction produced a larger fraction of soluble protein, but the percentage of active protein was much smaller compared to reactions that received supplementation (Table 3). For example, cytidine deaminase (CD) requires Zn2+ for its activity, and twice as much active protein was produced in the supplemented CFPS reaction compared to the standard reaction. As expected, metalloenzymes requiring main group elements such as K+ or Mg2+ produced the same fraction of active protein under both conditions. Enzymes requiring nicotinamide or flavin prosthetic groups produced a significantly larger fraction of active protein when expressed in CFPS with supplementation. Monomeric proteins without prosthetic groups, such as dihydrofolate reductase (DHFR) and GUS, were fully active after CFPS expression. Monomeric enzymes requiring prosthetic groups for activity were greater than 50% active based on the soluble protein concentration. These enzymes include glucose 6-phosphate (G6P) dehydrogenase, which requires NADP+, and FMN reductase, which requires NADPH and riboflavin. The dimeric enzymes MDH, glutathione reductase (GR), and thioredoxin reductase (TR) were all active following CFPS expression (Table 3). Almost half of the soluble MDH showed activity, and we measured specific activities for TR and GR, which require FAD as a cofactor, that were comparable to published values.27 Expression of pyruvate kinase (PK), a homotetrameric allosteric enzyme requiring both monovalent potassium and divalent magnesium or manganese for activity, resulted in soluble protein yields of ∼4 µM, of which 99% was fully active.

Our results demonstrate the importance of supplementing the cell-free reactions with cofactors and metal ions. However, in this work, there is a tradeoff between activation and solubility for several proteins. The solubility, charge, and effective size of metals are determined largely by the complexed ion species, which depends on ligand availability. Since cells and, consequently, the CFPS reaction environment contain a large number of complexing ligands, predicting the chemistry of metals can be complicated. It should be noted that metal ion concentrations were selected here to optimize CAT total expression. It now appears that a further round of optimization may be beneficial in which vitamin and metal ion concentrations are chosen for the best production and activation of a panel of cofactor-dependent enzymes.

Figure 7. Enzymatic activity of target proteins in the background of CFPS. (A) β-Glucuronidase activity (open squares) was measured in the background of CFPS (solid squares) by following the hydrolysis of pNP-β-D-glucuronide at 405 nm. (B) Malate dehydrogenase activity (open squares) was measured in the background of CFPS (solid squares) by following the reduction of NAD+ accompanying the conversion of malate to oxaloacetate. (C) FMN reductase activity (open squares) was measured in the background of CFPS (solid squares) by following the oxidation of NADPH.

Conclusion


CFPS has an important role in the post-genomic discovery process because it offers a platform for multiplexed protein expression from linear DNA templates generated by PCR.13,29 Cell-free methods permit control of the reaction so that parameters such as pH, redox potential, ionic strength, and chemical composition can be adjusted for optimal protein expression and folding. Early challenges facing CFPS have been addressed, and the technology has evolved to more closely mimic in vivo conditions.49 This advance has enabled the use of glucose, glucose 6-phosphate, pyruvate, or glutamate as energy sources and NMPs as the initial nucleotide source, culminating in large cost reductions.44 Modifications of the source strain have produced more productive cell extracts,14,50 and CFPS has demonstrated its ability to express and fold a number of diverse and complex proteins.27,51 Consequently, CFPS has become an attractive platform for developing protein arrays for high-throughput functional analysis. However, the lack of adequate methods to rapidly generate expression templates for protein synthesis and the need to optimize reaction conditions to include small molecules and ions for more general activation of enzyme catalysis have limited its applications in the post-genomic era.

Many methods that use PCR to form expression templates perform the extension and amplification of the full-length product using a single temperature cycle. This study shows that these approaches lead to low product yields and nonspecific amplification of other DNA species, including a 0.4 kb product that may be a competitive inhibitor for RNA polymerase binding.26 Developers of the E-PCR method, which is the design principle for both the Roche LTGS and Megaprimer extension protocols, claim that only 20% of extension products will possess the 0.4 kb contaminant,26 but our results suggest that the 0.4 kb product may be more ubiquitous. Even if the contaminant were limited to 20% of the reactions, given the 4500 ORFs in the small E. coli genome, this would mean that ∼900 ORFs would require special optimization to eliminate this aberrant product. Some protocols recommend gel purification of the gene target prior to the extension/amplification PCR to help eliminate this recurring aberrant product. However, aberrant amplicons persisted even after the gene targets were amplified to high purity and even when the targets were purified after gene amplification. We examined several alternative strategies to eliminate aberrant product formation during ET synthesis. For example, we screened three different single primer sequences and tested multiple thermal cycle schemes. The best results were observed when the amplification mixture contained the fewest DNA species pairs with similar melting temperatures. PCR additives, various magnesium concentrations, and different pH values have also been suggested to minimize the formation of this prevalent and potentially problematic amplicon.26 Gel purification and rigorous troubleshooting, however, are both costly in terms of time and reagent consumption, and neither is an acceptable solution when developing a genome-wide application.
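The scaling argument above is simple arithmetic, sketched here for convenience. The 4500 ORF count and the 20% contamination rate are the figures quoted in the text, and the 2 of 52 comparison is the contaminant frequency reported in the Results for the single-primer method; the function name is hypothetical.

```python
def targets_needing_optimization(total_orfs: int, contaminated_fraction: float) -> int:
    """Expected number of ORFs whose ET would carry the aberrant product."""
    if not 0.0 <= contaminated_fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return round(total_orfs * contaminated_fraction)


# E-PCR estimate quoted in the text: 20% of ~4500 E. coli ORFs.
print(targets_needing_optimization(4500, 0.20))  # 900

# Frequency observed with the single-primer method: 2 of 52 extended ETs.
observed_fraction = 2 / 52
print(f"{observed_fraction:.1%}")                # 3.8%
```

Extrapolating the 3.8% figure to a whole genome would be speculative, since it rests on 52 targets, so the sketch reports it only as an observed frequency.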

The demand for high-throughput methods to generate templates for expressing genomic libraries led us to design a rapid and reproducible PCR method for making transcriptionally active ETs. We obtained superior yield and purity for the full-length ETs compared to other methods examined in this work. We demonstrated the reliability and reproducibility of this procedure by extending T7 regulatory elements onto more than 90% of 55 diverse genomic targets. Separating the extension PCR from the amplification PCR and using the GC-rich SP3 single primer to enable a higher annealing temperature have together enhanced product yield and minimized formation of aberrant products. The linear expression elements were also propagated by reamplifying with the GC-rich SP3 primer (data not shown). They can also be cloned into a plasmid using blunt-ended ligation or, as we have shown in this work, using specific restriction sites to orient the ORF with respect to transcriptional regulatory regions in the plasmid. The plasmids can then be used for sequencing, large-scale protein expression, or in vivo expression.

The linear DNA templates produced by this new method directed expression of their cognate protein in an in vitro combined transcription-translation system modified to include a variety of vitamins, cofactors, and metal ions. Since many enzymes require small molecules or ions to aid in folding and catalysis, the availability of these factors is a concern for developing in vitro protein arrays to effectively characterize protein function and biochemical activity. The protein targets were expressed in parallel from linear DNA templates using CFPS batch reactions in microtiter plates. The proteins ranged in size from 10 to over 100 kDa, and several proteins represent subunits of much larger multienzyme complexes. Most proteins accumulated between 2 and 5 µM, and several low molecular weight protein targets had yields much greater than 10 µM. Some of the target proteins could be visualized above the background proteins on a stained SDS-PAGE gel, and autoradiography demonstrated that each was exclusively expressed as a single product. All the enzymes that we examined exhibited measurable activity that was clearly greater than the background activity associated with the cell extract. The protein targets evaluated for function showed increased activity when produced in CFPS reactions receiving vitamin, cofactor, and metal ion supplementation. For the enzymes evaluated, the soluble product was at least 50% active, and eight of these enzymes accumulated more than 80% active protein. Despite this success, it appears that further optimization may be beneficial.

In conclusion, we have presented a new PCR method for rapid and reproducible synthesis of linear ETs. These linear ETs were generated in parallel and successfully directed protein expression using batch CFPS reactions modified to enhance in vitro concentrations of vitamins, cofactors, and metal ions to aid enzyme catalysis. This work should increase the role of CFPS systems as platforms for high-throughput evaluation of protein function.

Acknowledgment. This material is based upon work supported by the National Science Foundation under Grant No. 0132535. Kim A. Woodrow was partially supported by an NSF Graduate Research Fellowship. The authors would like to thank Keith W. Gneshin for designing the SP3 single primer used in this study.

References

(1) Emili, A. Q.; Cagney, G. Large-scale functional analysis using peptide or protein arrays. Nat. Biotechnol. 2000, 18, 393-397.
(2) Schweitzer, B.; Kingsmore, S. F. Measuring proteins on microarrays. Curr. Opin. Biotechnol. 2002, 13, 14-19.
(3) Kawahashi, Y.; et al. In vitro protein microarrays for detecting protein-protein interactions: Application of a new method for fluorescence labeling of proteins. Proteomics 2003, 3, 1236-1243.
(4) Saghatelian, A.; Cravatt, B. F. Global strategies to integrate the proteome and metabolome. Curr. Opin. Chem. Biol. 2005, 9, 62-68.
(5) Martzen, M. R.; et al. A biochemical genomics approach for identifying genes by the activity of their products. Science 1999, 286, 1153-1155.
(6) Zhu, H.; et al. Global analysis of protein activities using proteome chips. Science 2001, 293, 2101-2105.
(7) Nemetz, C. In Cell-Free Protein Expression; Swartz, J. R., Ed.; Springer: New York, 2003; pp 5-7.
(8) Olsnes, S.; et al. Formation of active diphtheria-toxin in vitro based on ligated fragments of cloned mutant-genes. J. Biol. Chem. 1989, 264, 12749-12751.
(9) Lesley, S. A.; Brow, M. A. D.; Burgess, R. R. Use of in vitro protein-synthesis from polymerase chain-generated templates to study interaction of Escherichia coli transcription factors with core RNA-polymerase and for epitope mapping of monoclonal antibodies. J. Biol. Chem. 1991, 266, 2632-2638.
(10) Xin, W.; Ma, J.; Huang, D. W. Assembly of linear functional expression elements with DNA fragments digested with asymmetric restriction endonucleases. Biotechnol. Lett. 2003, 25, 901-904.
(11) Xin, W.; Ma, J.; Huang, D. W. Construction of linear functional expression elements with DNA and RNA hybrid primers: A flexible and fast method for proteomics. Biotechnol. Lett. 2003, 25, 273-277.
(12) Sykes, K. F.; Johnston, S. A. Linear expression elements: A rapid, in vivo, method to screen for gene functions. Nat. Biotechnol. 1999, 17, 355-359.
(13) Sawasaki, T.; Ogasawara, T.; Morishita, R.; Endo, Y. A cell-free protein synthesis system for high-throughput proteomics. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 14652-14657.
(14) Michel-Reydellet, N.; Woodrow, K. A.; Swartz, J. R. Increasing PCR fragments stability and protein yields in a cell-free system with genetically modified Escherichia coli extracts. J. Mol. Microbiol. Biotechnol. 2005, 9, 26-34.
(15) Merk, H.; Meschkat, D.; Stiege, W. In Cell-Free Protein Expression; Swartz, J. R., Ed.; Springer: New York, 2003; pp 15-23.
(16) Betton, J. M. High throughput cloning and expression strategies for protein production. Biochimie 2004, 86, 601-605.

Journal of Proteome Research • Vol. 5, No. 12, 2006 3299

(17) Miyazaki-Imamura, C.; et al. Improvement of H2O2 stability of manganese peroxidase by combinatorial mutagenesis and high-throughput screening using in vitro expression with protein disulfide isomerase. Protein Eng. 2003, 16, 423-428.
(18) Ohuchi, S.; Nakano, H.; Yamane, T. In vitro method for the generation of protein libraries using PCR amplification of a single DNA molecule and coupled transcription/translation. Nucleic Acids Res. 1998, 26, 4339-4346.
(19) Rungpragayphan, S.; Haba, M.; Nakano, H.; Yamane, T. Rapid screening for affinity-improved scFvs by means of single-molecule-PCR-linked in vitro expression. J. Mol. Catal. B: Enzym. 2004, 28, 223-228.
(20) Rungpragayphan, S.; et al. High-throughput, cloning-independent protein library construction by combining single-molecule DNA amplification with in vitro expression. J. Mol. Biol. 2002, 318, 395-405.
(21) Rungpragayphan, S.; Nakano, H.; Yamane, T. PCR-linked in vitro expression: A novel system for high-throughput construction and screening of protein libraries. FEBS Lett. 2003, 540, 147-150.
(22) Michel-Reydellet, N.; Woodrow, K. A.; Swartz, J. R. Increasing PCR fragments stability and protein yields in a cell-free system with genetically modified Escherichia coli extracts. J. Mol. Microbiol. Biotechnol., in press.
(23) Norais, N.; et al. Combined automated PCR cloning, in vitro transcription/translation and two-dimensional electrophoresis for bacterial proteome analysis. Proteomics 2001, 1, 1378-1389.
(24) Resto, E.; Iida, A.; Vancleve, M. D.; Hecht, S. M. Amplification of protein expression in a cell free system. Nucleic Acids Res. 1992, 20, 5979-5983.
(25) Graentzdoerffer, A.; Nemetz, C. High-throughput expression-PCR using universal plasmid-specific primers. BioTechniques 2003, 34, 256-260.
(26) Graentzdoerffer, A.; Nemetz, C. In Cell-Free Protein Expression; Swartz, J. R., Ed.; Springer: New York, 2003.
(27) Knapp, K.; Swartz, J. Cell-free production of active E. coli thioredoxin reductase and glutathione reductase. FEBS Lett. 2004, 559, 66-70.
(28) Boyer, M. E.; Wang, C. W.; Swartz, J. R. Simultaneous expression and maturation of the iron-sulfur protein ferredoxin in a cell-free system. Biotechnol. Bioeng. 2006, 94, 128-138.
(29) Sawasaki, T.; et al. A bilayer cell-free protein synthesis system for high-throughput screening of gene products. FEBS Lett. 2002, 514, 102-105.
(30) Sambrook, J.; Fritsch, E. F.; Maniatis, T. Molecular cloning: A laboratory manual, 2nd ed.; Cold Spring Harbor Laboratory Press: Woodbury, NY, 1989.
(31) Swartz, J. R.; Jewett, M. C.; Woodrow, K. A. In Recombinant Gene Expression: Reviews and Protocols, 2nd ed.; Balbas, P., Lorence, A., Eds.; Humana Press: Totowa, NJ, 2004.
(32) Jewett, M. C.; Swartz, J. R. Rapid expression and purification of 100 nmol quantities of active protein using cell-free protein synthesis. Biotechnol. Prog. 2004, 20, 102-109.
(33) Calhoun, K. A.; Swartz, J. R. An economical method for cell-free protein synthesis using glucose and nucleoside monophosphates. Biotechnol. Prog. 2005, 21, 1146-1153.
(34) Valentini, G.; et al. The allosteric regulation of pyruvate kinases: A site-directed mutagenesis study. J. Biol. Chem. 2000, 275, 18145-18152.
(35) Mouat, M. F. Dihydrofolate influences the activity of Escherichia coli dihydrofolate reductase synthesised de novo. Int. J. Biochem. Cell Biol. 2000, 32, 327-337.


(36) Fieschi, F.; Niviere, V.; Frier, C.; Decout, J. L.; Fontecave, M. The mechanism and substrate specificity of the NADPH:flavin oxidoreductase from Escherichia coli. J. Biol. Chem. 1995, 270, 30392-30400.
(37) Banerjee, S.; Frankel, D. G. Glucose-6-phosphate dehydrogenase EC 1.1.1.49 from Escherichia coli and from a high level mutant. J. Bacteriol. 1972, 110, 155-160.
(38) Geddie, M. L.; Matsumura, I. Rapid evolution of β-glucuronidase specificity by saturation mutagenesis of an active site loop. J. Biol. Chem. 2004, 279, 26462-26468.
(39) Breiter, D. R.; Resnik, E.; Banaszak, L. J. Engineering the quaternary structure of an enzyme: Construction and analysis of a monomeric form of malate dehydrogenase from Escherichia coli. Protein Sci. 1994, 3, 2023-2032.
(40) Yang, C.; Carlow, D.; Wolfenden, R.; Short, S. A. Cloning and nucleotide-sequence of the Escherichia coli cytidine deaminase (Ccd) gene. Biochemistry 1992, 31, 4168-4174.
(41) Bucurenci, N.; et al. Mutational analysis of UMP kinase from Escherichia coli. J. Bacteriol. 1998, 180, 473-477.
(42) Nakano, H.; Kobayashi, K.; Ohuchi, S.; Sekiguchi, S.; Yamane, T. Single-step single-molecule PCR of DNA with a homo-priming sequence using a single primer and hot-startable DNA polymerase. J. Biosci. Bioeng. 2000, 90, 456-458.
(43) Zawada, J.; Swartz, J. Maintaining rapid growth in moderate-density Escherichia coli fermentations. Biotechnol. Bioeng. 2005, 89, 407-415.
(44) Calhoun, K. A.; Swartz, J. R. Energizing cell-free protein synthesis with glucose metabolism. Biotechnol. Bioeng. 2005, 90, 606-613.
(45) Kim, J. K.; Yamada, T.; Matsumoto, K. Copper cytotoxicity impairs DNA synthesis but not protein phosphorylation upon growth stimulation in LEC mutant rat. Res. Commun. Chem. Pathol. Pharmacol. 1994, 84, 363-366.
(46) Pufahl, R. A.; et al. Metal ion chaperone function of the soluble Cu(I) receptor Atx1. Science 1997, 278, 853-856.
(47) Garcia-Fernandez, A. J.; et al. Alterations of the glutathione-redox balance induced by metals in CHO-K1 cells. Comp. Biochem. Physiol., Part C: Pharmacol., Toxicol. 2002, 132, 365-373.
(48) Voloshin, A. M.; Swartz, J. R. Efficient and scalable method for scaling up cell free protein synthesis in batch mode. Biotechnol. Bioeng. 2005, 91, 516-521.
(49) Jewett, M. C.; Swartz, J. R. Mimicking the Escherichia coli cytoplasmic environment activates long-lived and efficient cell-free protein synthesis. Biotechnol. Bioeng. 2004, 86, 19-26.
(50) Michel-Reydellet, N.; Calhoun, K.; Swartz, J. Amino acid stabilization for cell-free protein synthesis by modification of the Escherichia coli genome. Metab. Eng. 2004, 6, 197-203.
(51) Yin, G.; Swartz, J. R. Enhancing multiple disulfide bonded protein folding in a cell-free system. Biotechnol. Bioeng. 2004, 86, 188-195.
(52) Borns, M. C.; Hogrefe, H. H. Herculase enhanced DNA polymerase delivers high fidelity and great performance. Strategies 2000, 13, 1-3.
(53) Yang, J. H.; Kanger, G.; Voloshin, A.; Levy, R.; Swartz, J. R. Expression of active murine granulocyte-macrophage colony-stimulating factor in an Escherichia coli cell-free system. Biotechnol. Bioeng. 2004, 20, 1689-1696.

PR050459Y