EcoExpress—Highly Efficient Construction and Expression of

Jun 26, 2016 - Here we developed a method named EcoExpress, which allows efficient construction of plasmids to express individual protein with user-de...
Research Article pubs.acs.org/synthbio

EcoExpressHighly Efficient Construction and Expression of Multicomponent Protein Complexes in Escherichia coli Yiran Qin,† Chang Tan,† Jiwei Lin,‡ Qin Qin,† Jianghaiyang He,† Qingyu Wu,† Yizhi Cai,§ Zhucheng Chen,*,† and Junbiao Dai*,† †

MOE Key Laboratory of Bioinformatics, MOE Key Laboratory of Industrial Biocatalysis and Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China ‡ Wuxi Qinglan Biotechnology Inc., Yixing, Jiangsu 214200, China § School of Biological Sciences, The King’s Buildings, University of Edinburgh, EH9 3BF Edinburgh, United Kingdom ABSTRACT: The bacterium Escherichia coli remains the leading host for protein expression in large quantity for the purpose of crystallization or other biochemical studies. However, expression of multicomponent protein complexes remains a challenge, and is often laborious and time-consuming. Here we developed a method named EcoExpress, which allows efficient construction of plasmids to express individual protein with user-defined epitope-tag, followed by one-pot assembly of a single vector to express the entire protein complex for copurification. A versatile set of vectors was designed to provide various choices to control the expression of a protein with different promoters, and to accept different number of components for coexpression. Using EcoExpress, we demonstrated that each subunit within a protein complex could be expressed individually or simultaneously, and the entire complex could be copurified. In addition, to overcome the decreased assembly efficiency with the increasing number of components, a novel oligonucleotides blocking method was designed and tested. EcoExpress provides the scientific community with a toolbox to rapidly investigate the function of protein complexes. KEYWORDS: protein expression, protein complex, coexpression, vector system, synthetic biology, cloning method

Special Issue: Synthetic Biology in Asia
Received: December 20, 2015
© XXXX American Chemical Society
ACS Synthetic Biology
DOI: 10.1021/acssynbio.5b00291 ACS Synth. Biol. XXXX, XXX, XXX−XXX

In a biological system, proteins usually do not function independently; instead, they interact with other proteins to form complexes that regulate various biological processes.1 Moreover, some proteins participate in different cellular processes by assembling with distinct components in different complexes.2,3 Therefore, one major focus in biology is to understand the function of these protein machines, which often requires the purified, multicomponent complex. In the literature, there are two common methods to achieve this goal. One is to isolate each individually expressed subunit and then reconstitute the complex in vitro.4 The other is to coexpress all components in the same heterologous host cell and isolate the full complex in one purification step.5 The limitations of the former strategy are that individual protein components may be insoluble after overexpression, and that components may not show the correct stoichiometry when reconstituted in vitro. Coexpression of the whole complex could potentially solve some of these problems, since the expression of one component might help the folding of others, as molecular chaperones do.6 In addition, coexpression can save time and reagents, since the entire complex is purified at the same time.

There are two common strategies to coexpress proteins in Escherichia coli. One is to use multiple plasmids.7,8 However, it is less preferred because it requires different compatible replicons and selective markers. The other is to clone all genes into a single vector as either polycistronic or monocistronic transcripts. In a polycistronic vector, all genes are controlled by a single, strong promoter but have distinct ribosome-binding sites (RBS).9−11 On the other hand, vectors such as pETDuet (Novagen) and pQLink12 put each gene under the control of a separate promoter, which was reported to improve the level of protein expression.13 Therefore, these types of coexpression vectors are more widely used. In general, one problem with these vectors is that they rely on traditional digestion-ligation procedures, so the genes of interest have to be inserted in multiple rounds. It therefore usually takes a substantial amount of time to clone five or six genes, which may greatly limit the progress of a project. Fortunately, with the development of synthetic biology, multiple DNA assembly methods have been devised, such as Gibson assembly14 and the Golden Gate method.15,16 These techniques greatly simplify DNA assembly, and have been applied to facilitate the assembly of transcription units or metabolic pathways.17−19

In this study, a protein coexpression system, which we call EcoExpress, was designed to facilitate rapid and economical cloning and expression of multicomponent protein complexes in E. coli. The system was deliberately based on the Golden Gate assembly method, which allows quasi-scarless assembly of multiple fragments into a single vector. In EcoExpress, each gene is cloned into an individual vector containing a promoter and an RBS, which allows it to be epitope-tagged, expressed and tested separately, if necessary. Subsequently, the individual components are mixed and assembled into an acceptor vector in a “one-pot” digestion-ligation reaction to construct the expression vector containing all members of the complex. We demonstrate that EcoExpress works efficiently and that the whole cloning and testing process can be completed within a week. To overcome the decrease in cloning efficiency as the number of components increases at the stage of assembling the coexpression vector, we developed a strategy named Oligonucleotides Blocking Cloning Method (OBCM), which facilitates the ligation of large linear DNA fragments and the assembly of coexpression vectors. In addition, we designed the entire vector system to be compatible with the BioBrick standard, which allows the reuse of existing parts. We also discuss the potential application of the system in other expression hosts.

Figure 1. Design and workflow of EcoExpress for coexpression of multicomponent protein complexes. (A) Schematic representation of the individual vectors. The transcription unit (TU) is highlighted in pink. The ccdB gene flanked by two BsaI sites is replaced by the open reading frame (ORF) of a protein. The two BsmBI sites flanking the entire TU are designed for constructing the coexpression vector. (B) Schematic representation of the acceptor vectors. (C) A two-step workflow to construct the coexpression vector for multicomponent protein complexes. In the first step, each ORF is amplified and incorporated into an individual vector, replacing the ccdB gene. All subunits in a complex can be cloned simultaneously. In the second step, the subunits of a complex are assembled into one acceptor vector to generate the coexpression construct. Both the individual vectors and the acceptor vector are the same as those in (A) and (B) and are presented as schematic backbones. (D) The flowchart to optimize the expression of a protein complex. Each subunit cloned into an individual vector can be tested independently. The formation of a protein complex can be optimized by changing the composition of individual subunits in one coexpression vector until the desired complex is obtained. P and S represent BioBrick prefix and suffix, respectively. lacOp: lac operator, RBS: ribosome binding site, tag: epitope tag, lacI: lac repressor, KanR: kanamycin resistance gene, AmpR: ampicillin resistance gene, Ori: origin of replication.



RESULTS AND DISCUSSION

Overall Design of EcoExpress for Coexpression of Multicomponent Protein Complexes. EcoExpress includes two sets of vectors: the individual vectors and the acceptor vectors for coexpression. The individual vectors are designed to include two recognition sites for each of two type IIS restriction enzymes, BsaI and BsmBI. The BsaI recognition sites allow the insertion of the protein coding sequence and are eliminated in the same step (Figure 1A and 1C). The overhangs are designed to be compatible with ORFs in Yeastfab,19 which could potentially allow us to express any yeast ORF. The two BsmBI recognition sites are used to release the monocistronic transcription units for the assembly of coexpression constructs. The presence of the ccdB gene (a DNA gyrase inhibitor20) between the two BsaI recognition sequences provides strong negative selection against vectors that failed to take up an insert and enables quick identification of positive clones. A similar design is adopted for the acceptor vectors, which use BsmBI to generate compatible overhangs to accept the ligated individual transcription units, replacing the ccdB gene (Figure 1B and 1C). The designed sequences were synthesized de novo and cloned into a pET-based backbone. In both sets of vectors, the BioBrick prefix and suffix flank the inserted DNA fragment, which offers the flexibility to use alternative cloning methods and to reuse pre-existing biological parts such as those available from the BioBrick Foundation.21

To use EcoExpress, a pair of primers, each containing a universal sequence plus a gene-specific sequence, is required for every gene. The amplified DNA fragments are purified and subjected to a “one-pot” reaction with the individual vector, restriction enzyme, ligase and buffer, in which the target DNA is inserted into the vector to generate the individual expression construct. This approach allows as many genes as required to be cloned at the same time, and greatly increases the throughput (Figure 1C). Next, a given combination of individual expression vectors and the corresponding acceptor vector are mixed, assembled similarly and confirmed by either colony PCR or restriction digestion to obtain the final coexpression vector (Figure 1C).

Coexpression of protein complexes containing multiple components is not a simple task and often suffers from a low success rate. To optimize expression, a standard workflow was developed for EcoExpress (Figure 1D). First, each subunit of a complex is assembled into an individual vector, in which it can be expressed and tested separately. At the same time, the subunits are also assembled into an acceptor vector for coexpression. After purification, if the desired complex is identified, it is subjected to further analysis. Otherwise, another round of optimization is carried out with modified components or a different combination of subunits, and the testing process is repeated. One big advantage of this workflow is that various derivatives of a subunit (truncated, tagged at the N- or C-terminus, or fused to different epitope tags) and multiple combinations of a complex (e.g., varying the number of subunits or substituting one subunit with homologous proteins) can be constructed and assessed simultaneously (Figure 1D). Compared with previous expression systems, EcoExpress allows high-throughput combinatorial assessment of a complex, which should improve the success rate of obtaining a functional protein complex.

Table 1. A Versatile Set of Vectors for EcoExpress

| name | promoter | N-terminal tag | protease cleavage site | terminator | selective marker | resistance marker | origin of replication |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pIAn (a) | T7 promoter | − | − | T7 terminator | ccdB | kanamycin | pBR322 origin |
| pIBn | T7 promoter | 6xHis | TEV | T7 terminator | ccdB | kanamycin | pBR322 origin |
| pICn | T7 promoter | 6xHis+SUMO | − | T7 terminator | ccdB | kanamycin | pBR322 origin |
| pIDn | T7 promoter | FLAG | TEV | T7 terminator | ccdB | kanamycin | pBR322 origin |
| pIEn | T7 promoter | FLAG | − | T7 terminator | ccdB | kanamycin | pBR322 origin |
| pIFn | T5 promoter | − | − | λ t0 terminator | ccdB | kanamycin | pBR322 origin |
| pIGn | T5 promoter | 6xHis | TEV | λ t0 terminator | ccdB | kanamycin | pBR322 origin |
| pIHn | T5 promoter | GST | TEV | λ t0 terminator | ccdB | kanamycin | pBR322 origin |
| pIIn | tac promoter | GST | TEV | − | ccdB | kanamycin | pBR322 origin |
| pIJn | tac promoter | MBP | Factor Xa | rrnB T1 terminator | ccdB | kanamycin | pBR322 origin |
| pAm (b) | − | − | − | − | ccdB | ampicillin | pBR322 origin |

(a) The letter n represents the position of the gene when it is assembled into the coexpression vector. (b) The letter m represents the number of individual expression vectors that the acceptor vector accommodates. Both n and m range from 1 to 8. The entire set of vectors will be deposited to Addgene for distribution.

Table 2. Standardized Prefix and Suffix Sequences for EcoExpress (a)

| vector | BsaI prefix | BsaI suffix | BsmBI prefix | BsmBI suffix |
| --- | --- | --- | --- | --- |
| pIX1 | GATGCGAGACC | GGTCTCCTAGC | CGTCTCATGGA | AGGCTGAGACG |
| pIX2 | GATGCGAGACC | GGTCTCCTAGC | CGTCTCAAGGC | TGCCTGAGACG |
| pIX3 | GATGCGAGACC | GGTCTCCTAGC | CGTCTCATGCC | CACTTGAGACG |
| pIX4 | GATGCGAGACC | GGTCTCCTAGC | CGTCTCACACT | GTCGTGAGACG |
| pIX5 | GATGCGAGACC | GGTCTCCTAGC | CGTCTCAGTCG | GCAGTGAGACG |
| pIX6 | GATGCGAGACC | GGTCTCCTAGC | CGTCTCAGCAG | CCTGTGAGACG |
| pIX7 | GATGCGAGACC | GGTCTCCTAGC | CGTCTCACCTG | CAGTTGAGACG |
| pIX8 | GATGCGAGACC | GGTCTCCTAGC | CGTCTCACAGT | TCACTGAGACG |
| pA1 | − | − | TGGAAGAGACG | CGTCTAAGGC |
| pA2 | − | − | TGGAAGAGACG | CGTCTATGCC |
| pA3 | − | − | TGGAAGAGACG | CGTCTACACT |
| pA4 | − | − | TGGAAGAGACG | CGTCTAGTCG |
| pA5 | − | − | TGGAAGAGACG | CGTCTAGCAG |
| pA6 | − | − | TGGAAGAGACG | CGTCTACCTG |
| pA7 | − | − | TGGAAGAGACG | CGTCTCACAGT |
| pA8 | − | − | TGGAAGAGACG | CGTCTCATCAC |

(a) Sequences in bold are recognition sites; underlined 4-base sequences are overhangs. All sequences are written 5′ to 3′ on the “top strand” of the final part.
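The positional logic of the BsmBI “designer overhangs” in Table 2 can be checked with a short script. The following is a minimal sketch, not part of the published method: it copies the BsmBI prefixes and suffixes of pIX1 to pIX8 from Table 2 and assumes that the 4-base overhang is the last four bases of each prefix and the first four bases of each suffix. The function names are illustrative only.

```python
# BsmBI prefix/suffix pairs of the individual vectors, copied from Table 2
# (written 5' to 3' on the top strand).
BSMBI_SITES = {
    "pIX1": ("CGTCTCATGGA", "AGGCTGAGACG"),
    "pIX2": ("CGTCTCAAGGC", "TGCCTGAGACG"),
    "pIX3": ("CGTCTCATGCC", "CACTTGAGACG"),
    "pIX4": ("CGTCTCACACT", "GTCGTGAGACG"),
    "pIX5": ("CGTCTCAGTCG", "GCAGTGAGACG"),
    "pIX6": ("CGTCTCAGCAG", "CCTGTGAGACG"),
    "pIX7": ("CGTCTCACCTG", "CAGTTGAGACG"),
    "pIX8": ("CGTCTCACAGT", "TCACTGAGACG"),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def prefix_overhang(prefix):
    """Assumed layout: CGTCTC + 1 spacer base + 4-base overhang."""
    return prefix[-4:]


def suffix_overhang(suffix):
    """Assumed layout: 4-base overhang + 1 spacer base + GAGACG."""
    return suffix[:4]


def ordered_names(sites):
    """Sort vector names by their position number (pIX1, pIX2, ...)."""
    return sorted(sites, key=lambda name: int(name[3:]))


def chain_is_consistent(sites):
    """True if the suffix overhang of position n equals the prefix overhang of n+1."""
    names = ordered_names(sites)
    return all(
        suffix_overhang(sites[left][1]) == prefix_overhang(sites[right][0])
        for left, right in zip(names, names[1:])
    )


def junction_overhangs(sites):
    """List every junction overhang from the first prefix to the last suffix."""
    names = ordered_names(sites)
    first = prefix_overhang(sites[names[0]][0])
    return [first] + [suffix_overhang(sites[name][1]) for name in names]


def overhangs_are_orthogonal(overhangs):
    """True if no overhang is repeated, palindromic, or the reverse complement of another."""
    unique = set(overhangs)
    if len(unique) != len(overhangs):
        return False
    return all(reverse_complement(o) not in unique for o in overhangs)


if __name__ == "__main__":
    print(chain_is_consistent(BSMBI_SITES))
    print(junction_overhangs(BSMBI_SITES))
    print(overhangs_are_orthogonal(junction_overhangs(BSMBI_SITES)))
```

Under these assumptions, each transcription unit can only ligate to its designated neighbor, which is what fixes the gene order in the coexpression vector.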


A Versatile Set of Vectors. As shown in Table 1, we constructed a series of vectors for EcoExpress, including individual vectors to express proteins separately (the pI vectors) and acceptor vectors to take multiple subunits for coexpression (the pA vectors). The pI vectors are named alphabetically to reflect the different combinations of promoter, epitope tag, protease cleavage site and terminator. Based on what is commonly used in the lab, we designed ten classes of pI vectors, i.e., pIA to pIJ. Within each class, the vectors are also numbered to indicate the position of the gene when it is assembled into the coexpression vector using a 4 bp “designer overhang” (Table 2). Therefore, we have a total of 80 individual vectors. The pA vectors are named pA1 to pA8, which allows us to assemble up to 8 individual expression vectors together. Both the individual and the acceptor vectors were designed with our own needs in mind, and they could in principle be expanded to satisfy additional requirements. Furthermore, different antibiotic resistance markers are used for individual and acceptor vectors, eliminating potential background resulting from undigested plasmids. In addition, the pBR322 origin is used for all plasmids, which can also be changed if necessary.

The Efficiency of Assembly into Individual Vectors and Application of Individual Vector for Protein Expression. The most critical factor of any cloning system is how simple and efficient the procedures are. The design of EcoExpress allows enzymatic digestion and ligation to be carried out in one step, greatly simplifying the cloning procedure. We next tested the efficiency of this strategy. Following the above-mentioned design, we first selected four proteins for expression in individual vectors. For all four genes, all 12 randomly isolated clones were correct, i.e., the assembly efficiency into the individual vector was 100% (Table 3). The protocol for the “one-pot” reaction is modified from our previous study,19 which improved the assembly efficiency of DNA fragments containing internal BsaI recognition sites (the detailed protocol is included in the Methods section), overcoming the compromised assembly efficiency caused by the presence of these sites in a previous study.17

Table 3. Efficiency of Gene Assembly into Individual Vectors

| gene | internal BsaI sites (a) | efficiency (b) |
| --- | --- | --- |
| Gene 1 | 0 | 12/12 (100%) |
| Gene 2 | 1 | 12/12 (100%) |
| Gene 3 | 1 | 12/12 (100%) |
| Gene 4 | 0 | 12/12 (100%) |

(a) The number of BsaI recognition sites within each ORF. (b) The efficiency is the percentage of randomly isolated clones that showed the expected PCR product in colony PCR. Twelve clones were tested for every gene.

Next, we tested whether the genes cloned into individual vectors could be expressed as expected. We chose two genes under the control of the T7 promoter with a lac operator. As shown in Figure 2A, in both cases, upon induction with isopropyl β-D-1-thiogalactopyranoside (IPTG), we could identify highly abundant target proteins using polyacrylamide gel electrophoresis (PAGE) stained with Coomassie blue, indicating that genes cloned into the individual vectors in EcoExpress can be expressed.

Purification of Protein Complex Using EcoExpress. Given the successful expression of the individual components and the design of various types of individual vectors, we tried to coexpress and purify two protein complexes as proof of principle. The first complex is composed of only two subunits, proteins M and N, which are known to interact with each other.22 We assembled proteins M and N into pIB2 and pIA1, respectively, which adds an N-terminal polyhistidine tag to protein M to facilitate purification. After binding, washing and elution from a nickel affinity column, both subunits were copurified successfully (Figure 2B). The second complex is much larger, comprising five components, i.e., subunits A to E. In order to generate a pure complex, we aimed to perform sequential purification by tagging two of the five components. We fused subunit C with a GST tag and subunit E with a His tag, while the other components were cloned into vectors without any tag (Figure 2C). The purification was performed using a nickel column first, followed by glutathione affinity chromatography. Figure 2D shows that the eluate from the nickel affinity column contained more proteins than expected, which is very common when purifying His-tagged proteins. After the glutathione column, the complex was much purer, with most of the contaminating proteins washed away. Taken together, these results show that EcoExpress allows the entire complex to be purified using either one or two tags. Further rounds of purification could be performed if a highly pure complex is required.

Figure 2. Expression of individual subunits and copurification of protein complexes using EcoExpress. (A) Two proteins with different molecular weights were assembled into individual vectors and transformed into E. coli strain Rosetta (DE3). The whole cell lysates were analyzed by polyacrylamide gel electrophoresis (PAGE) and proteins were visualized with Coomassie blue. The expression of both the large (56.65 kDa, X) and the small (22.04 kDa, Y) protein could be detected after induction. The target protein is indicated by the black triangles. M: protein molecular weight marker. (B) Coexpression of a complex consisting of two subunits. SubM and subN were assembled into pIB2 and pIA1, respectively. The complex was purified with nickel affinity beads and the product of each purification step was analyzed by PAGE followed by Coomassie blue staining. (C) Tagging strategy for a complex with five subunits. (D) Coexpression of a complex with five subunits. The complex was purified using a nickel affinity column, followed by glutathione affinity chromatography. The product of each purification step was analyzed by PAGE and visualized by Coomassie blue staining. P: total pellet; S: supernatant; FT: flow-through; W: wash fraction; E: eluate; M: protein molecular weight marker.

Improving the Assembly Efficiency of Coexpression Vector with an Oligonucleotides Blocking Cloning Method. While applying EcoExpress to multicomponent protein complexes, we found that the efficiency of assembly dropped as the number of subunits increased. As shown in Figure 3A, the assembly efficiency is nearly 90% when cloning a complex with only two subunits. When the number increased to 3 or 4, the efficiency decreased to around 40%. It becomes very difficult to obtain even one correct clone if six components are to be assembled into an acceptor vector. These results could largely limit the usage of EcoExpress, especially when the number of subunits in a complex is large.

One potential reason for the decreased efficiency might be the presence of vector backbones in the reaction, which could religate to the released DNA fragments and prevent them from assembling into the desired products. Therefore, we engineered each vector to contain two DNA sequences (22 bp and 21 bp in length, respectively) flanking the target DNA (Figure 3B), and named them the left sequence (LS) and right sequence (RS). The LS and RS contain distinct nucleotides and are released from the vectors together with the target DNA by the same enzyme, i.e., BsmBI. Consequently, after enzymatic treatment the vector backbones are further digested to generate two new overhangs, which are not compatible with any of those in the target DNAs. Thus, interference from the released vectors in the mixture is eliminated. The released LS and RS, however, are short double-stranded DNA fragments with two overhangs, which could either serve as a bridge to religate the target DNA to its original backbone or compete for the overhangs between the target DNAs. To eliminate them, two oligonucleotides (designated the blocking oligonucleotides LSM and RSM), which are complementary to LS and RS, respectively, are added in excess after digestion. An additional program (OBCM program, see Methods) is performed before ligation, in which the short double-stranded DNA fragments are denatured at high temperature and, during annealing, LSM and RSM form double-stranded DNA with LS and RS, respectively. Consequently, the other strand, which contains the original overhangs, remains single-stranded and can no longer compete for the overhangs in the target DNAs. We named this strategy the Oligonucleotides Blocking Cloning Method (OBCM).

To demonstrate the effectiveness of OBCM, we carried out five sets of reactions, varying the number of DNA fragments (each 1.6 kb in length) and the presence of internal restriction enzyme recognition sites, and analyzed the abundance of ligated products by gel electrophoresis (Figure 3C). To avoid inaccurate size estimation of circular molecules, we excluded the acceptor vectors from the reaction, so that only the formation of assembled linear DNA fragments was examined. In each set of reactions, the first lane shows the digested plasmids before ligation and the next two lanes compare the ligation products with or without OBCM. In each case, without OBCM the ligation products migrated much more slowly than the expected fragment and the amount of free vector decreased, indicating that some of the vectors had religated with the DNA fragments. In the presence of blocking oligonucleotides, on the other hand, multiple bands were formed, which represent the ligation products of different numbers of DNA fragments. For example, in the reaction with three DNA fragments, we could identify single fragments (∼1.6 kb), double fragments (∼3.2 kb) and three fragments together (∼4.8 kb). Similar results were obtained in reactions containing five or seven fragments. Furthermore, the presence of internal restriction sites, at least the one or two sites tested, had almost no detrimental effect on the ligation process. In summary, OBCM effectively prevents ligation between the vectors and inserts and thereby facilitates the assembly of the target fragments.

Next, we applied OBCM to the construction of coexpression vectors, inserting five and six fragments into one acceptor vector in the one-pot reaction, which normally proceeds at quite low efficiency with the previous protocol (Figure 3A). Twenty clones from the selective plates were randomly isolated and subjected to mini-prep and enzymatic digestion to confirm the assembled plasmids. In both cases, the fraction of correct clones increased, from 45% to 80% and from 25% to 55%, respectively (Figure 3D). This result indicates that OBCM can serve as a good tool to ensure efficient assembly and may also be applied in other cloning protocols.
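The relationship between the released sequences and the blocking oligonucleotides can be verified directly from the LS, RS, LSM and RSM sequences listed in Methods. The sketch below is only an illustrative check; the helper names are not from the paper. It tests that each blocking oligonucleotide is the exact reverse complement of its target and that LS and RS each carry a BsmBI recognition site (CGTCTC), consistent with their release by BsmBI.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Sequences copied from the Methods section (5' to 3').
LS = "ATCCGCAGTGTCTTGCGTCTCT"
RS = "GTTGGCAGTGACTCCGTCTCT"
LSM = "AGAGACGCAAGACACTGCGGAT"
RSM = "AGAGACGGAGTCACTGCCAAC"

BSMBI_SITE = "CGTCTC"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_blocking_oligo(target, blocker):
    """True if blocker can anneal to target along its full length."""
    return reverse_complement(target) == blocker


if __name__ == "__main__":
    for name, target, blocker in (("LS", LS, LSM), ("RS", RS, RSM)):
        print(name, len(target), "nt,",
              "blocked:", is_blocking_oligo(target, blocker),
              "BsmBI site:", BSMBI_SITE in target)
```

Because LSM and RSM pair with the full length of LS and RS, the complementary strands that carry the original overhangs are left without a partner after annealing, as described above.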


Figure 3. Improving the assembly efficiency of coexpression vectors. (A) The assembly efficiency of coexpression vectors declines as the number of subunits increases. For each complex, the efficiency was measured as the percentage of randomly isolated clones (eight clones) that showed the expected digestion pattern after mini-prep. The efficiency in the diagram is the mean value from three different complexes. Error bars represent the standard deviation. (B) Schematic presentation of OBCM to increase cloning efficiency. The red (left sequence, LS) and blue (right sequence, RS) segments are the small DNA sequences released from the vector after enzymatic digestion. The green and purple sequences are the oligonucleotides complementary to LS and RS, respectively. BsmBI recognition sites are highlighted in yellow. (C) The effect of OBCM on “one-pot” Golden Gate assembly. Three, five or seven DNA fragments (∼1.6 kb) were released from vectors and ligated to obtain the large products with or without OBCM. The first lane in each group is the same reaction as those in lanes two and three, but without ligase. The last two groups contain extra BsmBI recognition sequences in the DNA fragments, which result in several smaller fragments after digestion. The red triangles mark the target ligation products. M: 2-Log DNA ladder. (D) The efficiency of assembling multicomponent protein complexes with OBCM. Two complexes, with five and six subunits, respectively, were constructed into a coexpression vector with or without OBCM. Twenty clones were randomly isolated and analyzed by mini-prep and digestion.
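To put the efficiencies in Figure 3D into practical terms, one can estimate how many colonies need to be screened to find at least one correct clone. The following sketch is an illustration rather than an analysis from the paper: it assumes that each picked colony is correct independently with probability equal to the reported efficiency, and the 95% confidence level is an arbitrary choice.

```python
import math


def colonies_to_screen(efficiency, confidence=0.95):
    """Smallest n such that P(at least one correct clone among n) >= confidence.

    Assumes independent colonies, each correct with probability `efficiency`,
    so that P(no correct clone among n) = (1 - efficiency) ** n.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be in (0, 1)")
    if efficiency == 1:
        return 1
    return math.ceil(math.log(1 - confidence) / math.log(1 - efficiency))


if __name__ == "__main__":
    # Efficiencies reported for five and six subunits, without and with OBCM.
    for label, eff in (("5 subunits, no OBCM", 0.45),
                       ("5 subunits, OBCM", 0.80),
                       ("6 subunits, no OBCM", 0.25),
                       ("6 subunits, OBCM", 0.55)):
        print(label, colonies_to_screen(eff))
```

Under this simple model, the improvement with OBCM translates into noticeably fewer mini-preps per construct, especially for the six-subunit complex.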

Additional Applications for EcoExpress. Although we propose using EcoExpress to assemble multiple proteins for expression in E. coli, the system could also be applied to other protein expression vector systems. We have constructed a series of baculovirus vectors based on pFastBac following the same design principle, which could decrease the number of transfections needed to express multiple genes simultaneously.

METHODS

Construction of EcoExpress Vectors. For individual vectors, the promoters, tags and terminators were mainly derived from commonly used vectors such as pET (Novagen) and pGEX (GE Healthcare). The backbones of individual vectors and acceptor vectors were derived from pET-28b and pET-15b, respectively.

Construction of Individual Expression Vector for Each Subunit. To construct the plasmids to express each subunit, PCR was performed under standard conditions using high-fidelity polymerase to amplify the DNA fragment containing the entire open reading frame. Primers were designed to incorporate a BsaI recognition sequence at both ends. The “one-pot” reaction was performed using purified PCR product (100 ng) and individual vector (40 ng) in a 10 μL mixture containing 4 U BsaI-HF (NEB), 0.5 U T4 DNA ligase (Thermo Scientific), 1 μg BSA (NEB) and 1× T4 DNA ligase buffer (Thermo Scientific). The mixture was incubated in a thermocycler under the following conditions: 37 °C for 5 min followed by five cycles of 37 °C for 5 min and 25 °C for 5 min. Next, 0.5 U T4 DNA ligase (Thermo Scientific) was added and the mixture was kept at 25 °C for 60 min before transforming bacteria.
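Because the two cloning steps rely on BsaI and BsmBI digestion, it can be useful to count internal recognition sites in an ORF before cloning, as reported for BsaI in Table 3. The following is a minimal sketch under stated assumptions: the recognition sequences GGTCTC (BsaI) and CGTCTC (BsmBI) are searched on both strands, and the example ORF is a made-up toy sequence, not one of the genes used in this study.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Recognition sequences of the two type IIS enzymes used by EcoExpress.
RECOGNITION_SITES = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_sites(sequence, site):
    """Count occurrences of a recognition site on either strand."""
    sequence = sequence.upper()
    count = 0
    # The reverse complement on the top strand is a site on the bottom strand.
    for motif in {site, reverse_complement(site)}:
        start = sequence.find(motif)
        while start != -1:
            count += 1
            start = sequence.find(motif, start + 1)
    return count


def internal_sites(orf):
    """Return the number of internal sites per enzyme for one ORF."""
    return {enzyme: count_sites(orf, site)
            for enzyme, site in RECOGNITION_SITES.items()}


if __name__ == "__main__":
    # Hypothetical toy ORF used only to exercise the function.
    toy_orf = "ATGGGTCTCAAACGTCTCTTTGAGACCTAA"
    print(internal_sites(toy_orf))
```

An ORF with a nonzero count is not necessarily a problem for the protocol described here, but it flags constructs for which the final ligation step without active enzyme matters most.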


Coexpression Vector Assembly. To assemble the coexpression vector, individual expression plasmids, each equimolar to the acceptor vector, were mixed with 150 ng of acceptor vector in a 10 μL reaction mixture containing 5 U BsmBI (NEB) and 1× NEBuffer 3.1, and incubated at 55 °C for 1 h followed by 80 °C for 20 min to inactivate the enzyme. Subsequently, 1 U T4 DNA ligase, 1.5 μL of 10× reaction buffer (50% (w/v) PEG4000 (Thermo Scientific), 10 mM ATP and 50 mM DTT) and ultrapure water were added to a final volume of 15 μL. The mixture was incubated at 25 °C for 3 h and then at 70 °C for 5 min before transforming bacteria.

Oligonucleotides Blocking. To apply OBCM in coexpression vector assembly, an additional step was performed after BsmBI digestion and before adding the ligase: the 10 μL reaction mixture containing 5 U BsmBI (NEB) and 1× NEBuffer 3.1 was incubated at 55 °C for 1 h, and the OBCM program was carried out after the addition of the blocking oligonucleotides (LSM and RSM, to a final concentration of 1 μM): 83 °C for 6 min; 80 °C for 3 min; then 75 °C, 70 °C, 65 °C, 60 °C, 55 °C and 50 °C for 90 s each. Subsequently, 1 U T4 DNA ligase, 1.5 μL of 10× reaction buffer (50% (w/v) PEG4000 (Thermo Scientific), 10 mM ATP and 50 mM DTT) and ultrapure water were added to a final volume of 15 μL. The mixture was incubated at 25 °C for 3 h and then at 70 °C for 5 min before transforming bacteria.

LS: 5′-ATCCGCAGTGTCTTGCGTCTCT-3′
RS: 5′-GTTGGCAGTGACTCCGTCTCT-3′
LSM: 5′-AGAGACGCAAGACACTGCGGAT-3′
RSM: 5′-AGAGACGGAGTCACTGCCAAC-3′

For ligation of DNA fragments, 200 ng of each plasmid, 10 U BsmBI and 1× NEBuffer 3.1 were mixed in a 10 μL reaction volume and incubated at 55 °C for 3 h. The OBCM program above was performed either with or without the addition of blocking oligonucleotides. One unit of T4 DNA ligase, 1.5 μL of 10× buffer (50% (w/v) PEG4000, 10 mM ATP and 50 mM DTT) and ultrapure water were added to a final volume of 15 μL, and the mixture was incubated at 16 °C overnight followed by heat inactivation at 70 °C for 5 min before gel electrophoresis.

Test of Protein Expression. To test the expression of a subunit in the individual expression plasmid, the bacterial cells were cultivated in 1 mL of Luria−Bertani medium with 50 μg/mL kanamycin and 34 μg/mL chloramphenicol at 37 °C for 5 h. A 500 μL aliquot was taken, IPTG was added to a final concentration of 0.5 mM, and the aliquot was cultivated at 37 °C for another 4 h. The cell pellet was resuspended in 100 μL of SDS sample buffer and 4 μL of soluble supernatant was analyzed by SDS-PAGE.

Copurification of Protein Complexes. Bacterial cells (Rosetta-DE3) containing the coexpression plasmid were cultivated in 100 mL of Luria−Bertani medium with 100 μg/mL ampicillin and 34 μg/mL chloramphenicol in a 37 °C shaker for 4 h. Cells were then diluted into 8 L of fresh medium and cultured at 37 °C until the OD600 reached 0.6. The culture was cooled to 18 °C for 2 h and IPTG was added to a final concentration of 0.5 mM. The cells were kept in an 18 °C shaker overnight. After centrifugation, the cell pellet was resuspended in lysis buffer containing 10% glycerol, 25 mM Tris-HCl (pH 8.5) and 250 mM NaCl. After the addition of PMSF to a final concentration of 1 mM, the cells were lysed with a high-pressure cell disruptor. The soluble supernatant was collected after centrifugation and loaded onto a Ni-NTA column (GE). After washing with 10% glycerol, 25 mM Tris-HCl (pH 8.5), 500 mM NaCl and 25 mM imidazole, the proteins were eluted with 10% glycerol, 25 mM Tris-HCl (pH 8.5), 250 mM NaCl and 250 mM imidazole. For sequential purification, the eluate was loaded onto a glutathione sepharose column and washed with 10% glycerol, 25 mM Tris-HCl (pH 8.5) and 500 mM NaCl. The proteins were eluted with 10% glycerol, 25 mM Tris-HCl (pH 8.5), 250 mM NaCl and 25 mM GSH.
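The Coexpression Vector Assembly step calls for individual expression plasmids in equal molar amounts to 150 ng of acceptor vector. Since equal moles of double-stranded DNA scale with plasmid length, the required mass of each plasmid can be estimated as sketched below. The plasmid sizes are hypothetical placeholders, not values from the paper, and the calculation assumes the same average mass per base pair for all plasmids.

```python
def equimolar_mass_ng(reference_ng, reference_bp, plasmid_bp):
    """Mass (ng) of a plasmid containing as many molecules as the reference.

    For double-stranded DNA with the same average mass per base pair,
    moles are proportional to mass / length, so equal moles means
    mass_plasmid / plasmid_bp == reference_ng / reference_bp.
    """
    if reference_bp <= 0 or plasmid_bp <= 0:
        raise ValueError("plasmid sizes must be positive")
    return reference_ng * plasmid_bp / reference_bp


if __name__ == "__main__":
    ACCEPTOR_NG = 150.0   # amount of acceptor vector stated in Methods
    ACCEPTOR_BP = 6000    # hypothetical acceptor vector size
    # Hypothetical sizes of individual expression plasmids (backbone + ORF).
    individual_bp = {"subunit_A": 6500, "subunit_B": 7200, "subunit_C": 8000}
    for name, size in individual_bp.items():
        mass = equimolar_mass_ng(ACCEPTOR_NG, ACCEPTOR_BP, size)
        print(f"{name}: {mass:.1f} ng")
```

With real plasmid sizes substituted, the same function gives the input mass for each subunit, and the summed volume can be checked against the 10 μL reaction.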



AUTHOR INFORMATION

Corresponding Authors

*Phone: 86-10-62796096. E-mail: zhucheng_chen@tsinghua.edu.cn.
*Phone: 86-10-62796190. E-mail: [email protected]. cn.

Author Contributions

Y.Q. and J.D. wrote the manuscript. Y.Q. created the figures. Y.Q., C.T., Q.Q., J.H. contributed experimental data. J.L. designed the OBCM. Q.W., Y.C., Z.C. and J.D. designed the entire project. The work was performed in the lab of Z.C., Q.W. and J.D.

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

This work was supported by National Science Foundation of China [31471254], Ph.D. Programs Foundation of Ministry of Education of China [20110002120055] and Research Fund for the Doctoral Program of Higher Education of China [20120002110022].



REFERENCES

(1) Eibauer, M., Pellanda, M., Turgay, Y., Dubrovsky, A., Wild, A., and Medalia, O. (2015) Structure and gating of the nuclear pore complex. Nat. Commun. 6, 7532.
(2) Alqarni, S. S. M., Murthy, A., Zhang, W., Przewloka, M. R., Silva, A. P. G., Watson, A. A., Lejon, S., Pei, X. Y., Smits, A. H., Kloet, S. L., Wang, H., Shepherd, N. E., Stokes, P. H., Blobel, G. A., Vermeulen, M., Glover, D. M., Mackay, J. P., and Laue, E. D. (2014) Insight into the architecture of the NuRD complex: Structure of the RbAp48-MTA1 subcomplex. J. Biol. Chem. 289, 21844−21855.
(3) Hoek, M., and Stillman, B. (2003) Chromatin assembly factor 1 is essential and couples chromatin assembly to DNA replication in vivo. Proc. Natl. Acad. Sci. U. S. A. 100, 12183−8.
(4) Dyer, P. N., Edayathumangalam, R. S., White, C. L., Bao, Y., Chakravarthy, S., Muthurajan, U. M., and Luger, K. (2004) Reconstitution of Nucleosome Core Particles from Recombinant Histones and DNA. Methods Enzymol. 375, 23−44.
(5) Zhou, L., Zhou, Y., Hang, J., Wan, R., Lu, G., Yan, C., and Shi, Y. (2014) Crystal structure and biochemical analysis of the heptameric Lsm1−7 complex. Cell Res. 24, 497−500.
(6) Chen, Y.-J., and Inouye, M. (2008) The intramolecular chaperone-mediated protein folding. Curr. Opin. Struct. Biol. 18, 765−70.
(7) Busso, D., Peleg, Y., Heidebrecht, T., Romier, C., Jacobovitch, Y., Dantes, A., Salim, L., Troesch, E., Schuetz, A., Heinemann, U., Folkers, G. E., Geerlof, A., Wilmanns, M., Polewacz, A., Quedenau, C., Büssow, K., Adamson, R., Blagova, E., Walton, J., Cartwright, J. L., Bird, L. E., Owens, R. J., Berrow, N. S., Wilson, K. S., Sussman, J. L., Perrakis, A., and Celie, P. H. N. (2011) Expression of protein complexes using multiple Escherichia coli protein co-expression systems: a benchmarking study. J. Struct. Biol. 175, 159−70.
(8) Johnston, K., Clements, A., Venkataramani, R. N., Trievel, R. C., and Marmorstein, R. (2000) Coexpression of proteins in bacteria using T7-based expression plasmids: expression of heteromeric cell-cycle and transcriptional regulatory complexes. Protein Expression Purif. 20, 435−43.
(9) Tan, S., Kern, R. C., and Selleck, W. (2005) The pST44 polycistronic expression system for producing protein complexes in Escherichia coli. Protein Expression Purif. 40, 385−95.
(10) Tan, S. (2001) A modular polycistronic expression system for overexpressing protein complexes in Escherichia coli. Protein Expression Purif. 21, 224−34.
(11) Romier, C., Ben Jelloul, M., Albeck, S., Buchwald, G., Busso, D., Celie, P. H. N., Christodoulou, E., De Marco, V., van Gerwen, S., Knipscheer, P., Lebbink, J. H., Notenboom, V., Poterszman, A., Rochel, N., Cohen, S. X., Unger, T., Sussman, J. L., Moras, D., Sixma, T. K., and Perrakis, A. (2006) Co-expression of protein complexes in prokaryotic and eukaryotic hosts: experimental procedures, database tracking and case studies. Acta Crystallogr., Sect. D: Biol. Crystallogr. 62, 1232−42.
(12) Scheich, C., Kümmel, D., Soumailakakis, D., Heinemann, U., and Büssow, K. (2007) Vectors for co-expression of an unrestricted number of proteins. Nucleic Acids Res. 35, e43.
(13) Kim, K., Kim, H., Lee, K., Han, W., and Yi, M. (2004) Two-promoter vector is highly efficient for overproduction of protein complexes. Protein Sci. 13, 1698−1703.
(14) Gibson, D. G., Young, L., Chuang, R., Venter, J. C., Hutchison, C. A., III, and Smith, H. O. (2009) Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6, 12−16.
(15) Engler, C., Gruetzner, R., Kandzia, R., and Marillonnet, S. (2009) Golden gate shuffling: a one-pot DNA shuffling method based on type IIs restriction enzymes. PLoS One 4, e5553.
(16) Engler, C., Kandzia, R., and Marillonnet, S. (2008) A one pot, one step, precision cloning method with high throughput capability. PLoS One 3, e3647.
(17) Agmon, N., Mitchell, L. A., Cai, Y., Ikushima, S., Chuang, J., Zheng, A., Choi, W.-J., Martin, J. A., Caravelli, K., Stracquadanio, G., and Boeke, J. D. (2015) Yeast Golden Gate (yGG) for the Efficient Assembly of S. cerevisiae Transcription Units. ACS Synth. Biol. 4, 853.
(18) Mitchell, L. A., Chuang, J., Agmon, N., Khunsriraksakul, C., Phillips, N. A., Cai, Y., Truong, D. M., Veerakumar, A., Wang, Y., Mayorga, M., Blomquist, P., Sadda, P., Trueheart, J., and Boeke, J. D. (2015) Versatile genetic assembly system (VEGAS) to assemble pathways for expression in S. cerevisiae. Nucleic Acids Res. 43, 6620−6630.
(19) Guo, Y., Dong, J., Zhou, T., Auxillos, J., Li, T., Zhang, W., Wang, L., Shen, Y., Luo, Y., Zheng, Y., Lin, J., Chen, G.-Q., Wu, Q., Cai, Y., and Dai, J. (2015) YeastFab: the design and construction of standard biological parts for metabolic engineering in Saccharomyces cerevisiae. Nucleic Acids Res. 43, 1−14.
(20) El Bahassi, M., O’Dea, M. H., Allali, N., Messens, J., Gellert, M., and Couturier, M. (1999) Interactions of CcdB with DNA Gyrase. J. Biol. Chem. 274, 10936−10944.
(21) Shetty, R. P., Endy, D., and Knight, T. F. (2008) Engineering BioBrick vectors from BioBrick parts. J. Biol. Eng. 2, 5.
(22) Gavin, A.-C., Bösche, M., Krause, R., Grandi, P., Marzioch, M., Bauer, A., Schultz, J., Rick, J. M., Michon, A.-M., Cruciat, C.-M., Remor, M., Höfert, C., Schelder, M., Brajenovic, M., Ruffner, H., Merino, A., Klein, K., Hudak, M., Dickson, D., Rudi, T., Gnau, V., Bauch, A., Bastuck, S., Huhse, B., Leutwein, C., Heurtier, M.-A., Copley, R. R., Edelmann, A., Querfurth, E., Rybin, V., Drewes, G., Raida, M., Bouwmeester, T., Bork, P., Seraphin, B., Kuster, B., Neubauer, G., and Superti-Furga, G. (2002) Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 415, 141−147.
