Bioconjugate Chem. 2006, 17, 610−617


Rapid Production of Functionalized Recombinant Proteins: Marrying Ligation Independent Cloning and in Vitro Protein Ligation

Susanna Kushnir,†,‡ Yoann Marsac,†,‡ Reinhard Breitling,§ Igor Granovsky,# Vera Brok-Volchanskaya,# Roger S. Goody,‡ Christian F. W. Becker,‡ and Kirill Alexandrov*,‡,§

Department of Physical Biochemistry, Max-Planck-Institute for Molecular Physiology, Otto-Hahn-Strasse 11, 44227 Dortmund, Germany; Jena Bioscience GmbH, Löbstedter Strasse 78, 07749 Jena, Germany; and Department of Molecular Biology, Institute of Biochemistry and Physiology of Microorganisms, Russian Academy of Sciences, Institutskaya str. 5, Pushchino 142290, Moscow Region, Russia.

Received November 7, 2005; Revised Manuscript Received April 5, 2006

Functional genomics and proteomics have been very active fields since the sequencing of several genomes was completed. To assign a physiological role to the newly discovered coding genes with unknown function, new generic methods for protein production, purification, and targeted functionalization are needed. This work presents a new vector, pCYSLIC, that allows rapid generation of Escherichia coli expression constructs via ligation-independent cloning (LIC). The vector is designed to facilitate protein purification by either Ni-NTA or GSH affinity chromatography. Subsequent proteolytic removal of affinity tags liberates an N-terminal cysteine residue that is then used for covalent modification of the target protein with different biophysical probes via protein ligation. The described system has been tested on 36 mammalian Rab GTPases, and it was demonstrated that recombinant GTPases produced with pCYSLIC could be efficiently modified with fluorescein or biotin in vitro. Finally, LIC was compared with the recently developed In-Fusion cloning method, and it was demonstrated that In-Fusion provides superior flexibility in choice of expression vector. By applying In-Fusion cloning, Rab6A GTPase with an N-terminal cysteine residue (Cys-Rab6A) was generated using the unmodified pET30a vector and TVMV protease.

INTRODUCTION Sequencing of several pro- and eukaryotic genomes and the subsequent advent of structural and functional genomics projects have created a need for a spectrum of new protein analysis methods. Although traditional biochemistry includes investigation of a few selected proteins often studied by individually tailored assays, the postgenomic era requires the study of hundreds of proteins simultaneously. This calls for standardization and automation of protein generation and handling methods. The obligatory first step in protein research, namely, in vivo protein expression, was greatly facilitated by the development of rapid cloning techniques based on recombination systems such as Gateway or ligation-independent cloning (LIC) (1). Both of these systems are employed to almost equal extent in high-throughput cloning projects because, whereas the LIC approach relies on cheap generic reagents, the more expensive Gateway method is simpler and faster when genes must be moved among several vectors (2). A disadvantage shared by both methods is the necessity to design vectors carrying recombination sites, which often leads to the introduction of unwanted sequences to or elimination of wanted sequences from the open reading frame. A newly introduced system called In-Fusion has the potential for avoiding this problem. In this system an undisclosed enzyme recombines linearized vector with a PCR product flanked by homology regions of 15 nucleotides to 5′ and 3′ ends of the vector (3). The combination of rapid cloning approaches with various tag systems allows rapid isolation of resulting recombinant proteins for subsequent structural or functional investigations (4). Whereas in structural studies proteins are directly used for crystallization or NMR studies, the functional assays require further derivatization of proteins to provide a generic and easily detectable readout for their activity. Although the methods applied vary widely, two modifications dominate: immobilization and covalent labeling with affinity groups, fluorophores, or isotopes. Although the molecular toolbox currently available to researchers is filled with reporter groups suitable for protein research, the methods for their conjugation to polypeptides are far less developed. This is largely due to the lack of chemistries that enable the modification of proteins at defined sites with a predictable outcome. The two most commonly used approaches that are based on chemical coupling to sulfhydryl groups or to primary amines result in different labeling patterns on different proteins as a consequence of the unique content and distributions of cysteine and lysine residues in a given protein. A recently developed approach referred to as expressed protein ligation (EPL) provides a possible solution to this problem (5). In this approach proteins are expressed in C-terminal fusion with an intein domain and an affinity tag. The resulting fusion proteins can be separated from the proteins of the expression host on an affinity matrix. Treatment of the immobilized protein with a high concentration of thiol leads to the cleavage of the peptide bond between intein and target protein. The cleaved protein carries a thioester group on the C terminus that can be coupled to a peptide (or, in fact, any molecule) bearing a cysteine at its N terminus according to the method of native chemical ligation to generate a native peptide bond at the coupling site (Figure 1) (6).

* Corresponding author [telephone (+49)-0231-1332356; fax (+49)0231-1331651; e-mail [email protected]]. † Both authors contributed equally. ‡ Max-Planck-Institute for Molecular Physiology. § Jena Bioscience GmbH. # Russian Academy of Sciences.
Although this approach was successfully used for protein engineering, its shortcomings are related to the necessity of expressing a large fusion protein that may influence the solubility and folding of the target protein and the different efficiencies of intein splicing due to the influence of the flanking

10.1021/bc050320d CCC: $33.50 © 2006 American Chemical Society Published on Web 04/27/2006

Expression, Purification, and Labeling of Recombinant Proteins

Bioconjugate Chem., Vol. 17, No. 3, 2006 611

Figure 1. (A) Mechanism of the protein labeling reaction in analogy to native chemical ligation. (B) Thioester labeling reagents: biotin benzylthioester 1, biotin mercaptoacetic acid thioester 2, 5(6)-(mercaptoacetic acid)fluorescein thioester 3.

residues of the target protein (7). An alternative configuration was proposed whereby a thioester-conjugated functionality such as a fluorophore is coupled onto the N-terminal cysteine of a recombinant protein (8). However, generation of recombinant proteins with N-terminal cysteine is not straightforward. In some cases cysteine on position 2 becomes N-terminal upon methionine cleavage by aminopeptidase of the expression host, although the efficiency varies among proteins (9). In an alternative configuration an N-terminal cysteine is created by self-cleavage of an intein domain fused N-terminally to the target protein. As in the case of C-terminally fused intein domains, cleavage does not always occur with the desired efficiency. Finally, the N-terminal cysteine residue can be generated by proteolytic cleavage of a properly engineered protease cleavage site. Among proteases that were shown to tolerate cysteine at the +1 position of the cleavage site are factor Xa, PreScission protease, and TEV protease (10, 11). The latter protease became particularly popular due to its high processivity, selectivity, and relative ease of recombinant production, which promoted interest in other proteases of plant viruses such as TVMV (12). In this work we compare LIC and In-Fusion cloning for rapid production of recombinant proteins with N-terminal cysteine mediated by TEV and TVMV proteases.

EXPERIMENTAL PROCEDURES Construction of pCYSLIC Vector. The His-GST tag fusion gene was amplified using oligonucleotides Xba-GAT_for TAGATCCTCTAGAAATAATTTTGTTTAACTTTAAGAAGGAG and TEV_LIC_rev TTATTATCCCATATGGGTATACATTACCCAATAATATTGAAAATAAAGATTTTCATCAGCCATACTTAC that encode the TEV protease recognition sequence and XbaI and SspI restriction sites. The resulting PCR product was digested with XbaI and SspI and ligated into plasmid pMCSG7 linearized with the same enzymes (13). The integrity of the GST ORF and LIC site of the resulting pCYSLIC vector was verified by sequencing using T7 promoter and T7 terminator primers. Cloning of Rab6A into pCYSLIC Vector. To amplify Rab6A ORF, oligonucleotide primers 5′-AAAATCTTTATTTTCAATGCATGTCCACGGGCGGAGACT-3′ and 5′-TTATCCACTTCCAATGTTAGCAGGAACAGCCTCCTTCA-3′ were used. The primers are complementary to the 5′- or 3′-ends of

the ORF, respectively, at 3′-ends and to sequences flanking the SspI site of the pCYSLIC vector at 5′-ends (underlined). Both the 663 bp PCR fragment and the SspI-digested pCYSLIC vector were purified by electrophoresis in 1% agarose gel and isolated with the QIAquick gel extraction kit (Qiagen). Five hundred nanograms of vector DNA and 50 ng of the PCR fragment were treated with 0.7 unit of T4 DNA polymerase in the presence of 2.5 mM dGTP and 2.5 mM dCTP, respectively, at 22 °C for 30 min. The enzyme was inactivated by heating at 75 °C for 20 min. Thirty nanograms of vector DNA and 10 ng of PCR fragment were mixed in a total volume of 3 µL (molar ratio of vector to fragment approximately 1:3) and incubated at 22 °C for 10 min. Then 1 µL of 25 mM EDTA was added, and incubation was continued for another 10 min. Escherichia coli Top10 competent cells were transformed with the above mixture, followed by selection on LB agar supplemented with 60 µg/mL ampicillin. Ten recombinant clones were picked and screened by PCR with primers complementary to the vector sequences upstream and downstream of the insert. All 10 screened clones contained the insert of interest. The integrity of the open reading frame was verified by sequencing. In-Fusion Cloning of TVMV-Rab6A into the pET30a Vector. The open reading frame of Rab6A was amplified with the primers TVMV_Rab6_p30_dir AAGGCCATGGCTGATGAGACCGTTCGCTTTCAGTGCATGTCCACGGGCGGAGACTTCG and TVMV_Rab6_p30_rev GAATTCGGATCCGATTCAGCAGGAACAGCCTCCTTCAC containing the 15 bp region of homology to the pET30a vector upstream and downstream of the EcoRV site. The forward primer (TVMV_Rab6_p30_dir) also contains the recognition sequence for TVMV protease, where the S residue in the sequence E-T-V-R-F-Q-S is substituted by cysteine. The PCR product was gel purified and mixed in 1:1 molar ratio with pET30a vector linearized by the EcoRV enzyme and incubated with In-Fusion reaction mix (BD Bioscience) for 30 min at room temperature.
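The vector-to-fragment molar ratio of roughly 1:3 quoted for the LIC annealing step can be checked with a short calculation. The sketch below is illustrative only: the pCYSLIC length of 6000 bp is an assumption (the vector size is not stated in the text), as is the average mass of 650 g/mol per base pair, and the function names are hypothetical.

```python
# Sketch: molar ratio of insert to vector in the LIC annealing mix.
# Assumptions (not from the paper): vector length ~6000 bp, 650 g/mol per bp.

BP_MASS_G_PER_MOL = 650.0  # approximate average mass of one dsDNA base pair


def pmol_dsdna(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA (ng) into picomoles."""
    return mass_ng * 1000.0 / (length_bp * BP_MASS_G_PER_MOL)


def insert_per_vector(vector_ng: float, vector_bp: int,
                      insert_ng: float, insert_bp: int) -> float:
    """Moles of insert per mole of vector."""
    return pmol_dsdna(insert_ng, insert_bp) / pmol_dsdna(vector_ng, vector_bp)


ASSUMED_VECTOR_BP = 6000  # hypothetical; the pCYSLIC size is not given in the text

# 30 ng of vector and 10 ng of the 663 bp Rab6A fragment, as in the protocol.
ratio = insert_per_vector(30.0, ASSUMED_VECTOR_BP, 10.0, 663)
print(f"insert : vector = {ratio:.2f} : 1")
```

Under these assumptions the ratio comes out close to 3 inserts per vector, consistent with the stated 1:3. Because the per-base-pair mass cancels, the result depends only on the assumed vector length.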
The reaction mixture was transformed into FusionBlue Competent Cells, and recombinant clones were selected on LB agar plates supplemented with 30 mg/L kanamycin. Cloning, Expression, and Purification of Mammalian RabGTPases. For cloning of a library of mammalian RabGTPases the oligonucleotides were designed to start with the sequence AAAATCTTTATTTTCAATGC followed by the nucleotides encoding the target protein. The antisense primers


start with the sequence AATAGGTGAAGGTTAC followed by the complement of stop codon and the C terminus of the target protein. The cDNAs for mammalian RabGTPases were obtained from RZPD GmbH and used as templates for PCR reactions. PCR products were then treated with T4 polymerase and annealed with the pCYSLIC vector. The mixture was transformed into E. coli Top10 competent cells, and the positive clones were identified as described above. The sequence of the inserts was verified by sequencing with specific primers. Test protein expressions were performed in BL21(DE3)RIL cells (Stratagene) cultured in 2 mL of LB medium supplemented with 100 µg/mL ampicillin and 34 µg/mL chloramphenicol. The cells were grown at 37 °C to OD600 = 0.5, and protein expression was induced by the addition of 1 mM IPTG for 4 h at 37 or 20 °C. To determine the amount and solubility of the recombinant protein, the cells were disrupted by sonication and cell debris was removed by centrifugation. The total cellular lysates, the cell debris, and supernatants were analyzed by SDS-PAGE followed by Coomassie Blue staining. For large-scale purification cultures were pregrown to OD600 = 0.5 in 1-10 L of LB, and protein expression was induced by the addition of 0.1 mM IPTG for 12 h at 20 °C. Cells were resuspended in buffer A containing 50 mM NaH2PO4, pH 8.0, 0.3 M NaCl, 1 mM PMSF, and 2 mM BME and lysed using a fluidizer (Microfluidics Corp.). Clarified lysates were loaded onto a 5 mL Hi-Trap Ni column (Amersham) equilibrated with buffer A. The protein was eluted with a gradient of 10-150 mM imidazole in buffer A. The fractions containing recombinant protein were pooled, and TEV or TVMV protease was added at 1:20 molar ratio. The resulting mixture was dialyzed against protease cleavage buffer: 25 mM NaH2PO4, pH 8.0, 2 mM EDTA, and 2 mM BME for 12 h at 4 °C.
Next, the excess EDTA was quenched by the addition of excess MgCl2, and cleaved 6His-GST-tag, His-tagged protease, and impurities were removed from the dialyzed mixture by passing it over an Ni-NTA column. The collected flow-through was concentrated in Centricon 10 concentrators up to ≈8 mL and loaded onto a gel filtration column (Superdex-75) equilibrated with 25 mM Hepes, pH 7.2, 40 mM NaCl, 2 mM MgCl2, 10 µM GDP, and 2 mM DTE. The fractions were concentrated with Centriprep 10, frozen, and stored at -80 °C. Expression vectors pRK1035_TVMV and pRK1037_TEV were a generous gift of D. Waugh. Expression and purification of the enzymes were performed essentially as described previously (12). Biotin and Fluorescein Thioester Synthesis. Biotin benzylthioester 1 was prepared according to a procedure described by Schuler and Pannell (8). To a cooled mixture (0 °C) of 2.0 M trimethylaluminum (10 mL; 20 mmol) in heptane and anhydrous dichloromethane (40 mL) was slowly added benzyl mercaptan (2.35 mL; 20 mmol). The reaction mixture was allowed to warm to room temperature over a 20-30 min period. Twenty milliliters of this solution was added to a solution of biotin succinimidyl ester (110 mg; 0.322 mmol) in anhydrous DMF (5 mL). After 2 h of stirring at room temperature, the solvents were removed and the remaining yellow oil was dissolved in dichloromethane. Wet silica gel was added to the solution to quench the excess aluminum complex. After evaporation, the crude product was purified by flash chromatography (CHCl3/MeOH, 95:5). Fractions containing the biotin benzylthioester were pooled, evaporated, and dried under vacuum to give a white powder (109.5 mg; 0.312 mmol; 97% yield): 1H NMR (CDCl3, 400 MHz) δ 7.24-7.14 (m, 5H, Ph), 4.45-4.41 (m, 1H), 4.23-4.20 (m, 1H), 4.04 (s, 2H), 3.08-3.03 (m, 1H), 2.83 (dd, 1H, J = 4.8 Hz, J = 12.8 Hz), 2.65 (dd, 1H, J = 0.4 Hz, J = 12.8 Hz), 2.52 (t, 2H, J = 7.4 Hz), 1.70-1.55 (m, 4H), 1.44-1.31 (m, 2H); 13C NMR (CDCl3, 100

Kushnir et al.

MHz) δ 198.7, 163.7, 137.6-127.1, 61.9, 60.1, 55.3, 43.3, 40.5, 33.1, 28.1, 25.4; MALDI-TOF MS, calcd for C17H22N2O2S2 (M + H)+ 351.5067, found 351.5827. Synthesis of the water-soluble biotin 2 and 5(6)-carboxyfluorescein 3 thioesters was performed as described by Tolbert et al. (14), using tert-butyl mercaptoacetate (15). D-Biotin thioester 2: 1H NMR ((CD3)2CO, 400 MHz) δ 5.95 (br s, 1H, NH), 5.89 (br s, 1H, NH), 3.85 (dd, 1H, J = 4.8 Hz, J = 8.0 Hz), 3.68 (ddd, 1H, J = 5.2 Hz, J = 8.0 Hz), 3.22 (s, 2H), 2.64 (ddd, 1H, J = 4.8 Hz, J = 6.4 Hz, J = 8.0 Hz), 2.37 (dd, 1H, J = 5.2 Hz, J = 12.4 Hz), 2.17 (t, 2H, J = 7.4 Hz), 2.12 (d, 1H, J = 12.4 Hz), 1.20-1.10 (m, 4H), 1.06-0.85 (m, 2H); MALDI-TOF MS, calcd for C12H18N2O4S2 (M + H)+ 319.0786, found 319.2774. 5-(and -6)-carboxyfluorescein thioester 3: 1H NMR (CDCl3, 400 MHz) δ 8.74 (s, 1H), 8.73 (s, 1H), 8.10 (d, 1H), 8.09 (d, 1H), 7.27 (s, 1H), 7.25 (s, 1H), 6.88 (d, 4H), 6.63-6.58 (m, 4H), 6.56-6.50 (m, 4H), 3.84 (s, 4H); MALDI-TOF MS, calcd for C23H14O8S (M + H)+ 451.0487, found 451.2801. Labeling of Peptides and Proteins with Biotin and Fluorescein Thioesters. Peptide Labeling. The RBD¹ peptide (5 mg) was dissolved in 375 µL of a buffer containing 6 M GdmCl and 100 mM sodium phosphate, pH 7.5, and 4% thiophenol was added. The biotin benzylthioester 1 was dissolved in DMF at a concentration of 15 mM, and then 50 µL was added to the peptide solution. The reaction mixture was vortexed and incubated at room temperature. The reaction progress was monitored by analytical reversed phase HPLC. Protein Labeling. A solution of Cys-Rab6A protein (8.0 µL; 531 µM) in 50 mM NaPi, 1 mM MgCl2, and 100 µM GDP, pH 7.5, was mixed with 30 mM TCEP, 200 mM MesNa in 100 mM NaPi buffer (4.5 µL), and a 5 mM solution of water-soluble thioester 2 or 3 in 100 mM NaPi buffer (1.5 µL) at pH 8.0. The reaction mixture was vortexed and incubated at room temperature (22 °C).
In the case of the 5(6)-carboxyfluorescein thioester, 1 µL of 3 mM labeling reagent 3 in 100 mM NaPi buffer at pH 8.0 was added every 8 h. In the case of the biotin thioester, 1 µL of 3 mM labeling reagent 2 in 100 mM NaPi buffer at pH 8.0 was added once after 12 h. HPLC and Mass Spectrometry. HPLC analysis of peptides and proteins was performed with an analytical RP-C4 column (Vydac) at a flow rate of 1 mL/min over 30 min with a gradient from 5 to 65% (v/v) buffer B (acetonitrile, 0.08% TFA) in buffer A (water, 0.1% TFA). Peptide and protein masses were determined by electrospray ionization mass spectrometry on an LCQ Advantage Max (Finnigan) operating in positive ion mode.
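The starting composition of the protein labeling reaction follows from simple dilution arithmetic. The sketch below assumes that the concentrations quoted in the protocol (531 µM Cys-Rab6A, 30 mM TCEP, 200 mM MesNa, 5 mM thioester) are stock concentrations and that volumes are additive; it ignores the later 1 µL additions of labeling reagent. The helper name is hypothetical.

```python
# Sketch: initial concentrations in the Cys-Rab6A labeling mix.
# Assumption: quoted concentrations are stocks and volumes are additive.

PROTEIN_UL = 8.0      # Cys-Rab6A solution
REDUCTANT_UL = 4.5    # TCEP + MesNa in one solution
THIOESTER_UL = 1.5    # thioester 2 or 3
TOTAL_UL = PROTEIN_UL + REDUCTANT_UL + THIOESTER_UL


def final_conc(stock_conc: float, stock_vol_ul: float) -> float:
    """Concentration after dilution into the complete reaction volume."""
    return stock_conc * stock_vol_ul / TOTAL_UL


# All concentrations in micromolar.
final_um = {
    "Cys-Rab6A": final_conc(531.0, PROTEIN_UL),
    "TCEP": final_conc(30_000.0, REDUCTANT_UL),
    "MesNa": final_conc(200_000.0, REDUCTANT_UL),
    "thioester": final_conc(5_000.0, THIOESTER_UL),
}

# Molar excess of labeling reagent over protein at the start of the reaction.
thioester_excess = final_um["thioester"] / final_um["Cys-Rab6A"]

for name, conc in final_um.items():
    print(f"{name}: {conc:.1f} uM")
print(f"thioester excess over protein: {thioester_excess:.2f}-fold")
```

Under these assumptions the reagent starts at less than a 2-fold molar excess over the protein, which may be one reason the protocol replenishes it during the 24 h incubation.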

RESULTS AND DISCUSSION Construction of pCYSLIC Vector and Preparation of Recombinant Cys-Rab6A GTPase. To design a generic vector suitable for rapid construction of a large number of expression constructs that yield recombinant polypeptides selectively derivatizable with a range of chemical entities, we attempted to combine recent advances in molecular cloning and protein chemistry. The proposed vector should contain sequence(s) coding for one or two affinity tags operably linked to a site for LIC, which should allow rapid in-frame cloning of PCR-amplified genes. The expressed polypeptide should be readily isolatable by, preferably, a single-step chromatography and then be amenable to site-directed modification with a range of chemical groups. The functional assembly that satisfies the above-described requirements is presented in Figure 2. The gene of interest is cloned via LIC after a tandem of the 6His-GST tag and the TEV protease cleavage site. Use of the tandem tag

¹ Amino acids 96-131 from the Ras-binding domain of cRaf-1 carrying a C-terminal His6 tag.


Figure 2. Principal scheme for generic generation of N-terminal cysteine in recombinant proteins and its subsequent labeling via in vitro protein ligation: step 1, isolation of the fusion protein on the affinity matrix; step 2, cleavage of the fusion protein with TEV protease and separation of the target protein; step 3, ligation of the thioester-tagged functional group to the N-terminal cysteine of the target protein.

Figure 3. (A) Map of pCYSLIC vector. The unique restriction sites before and after the 6-His-GST fusion gene are shown. (B) Organization of the LIC site of pCYSLIC vector and the PCR product and the flowchart of cloning reaction. The TEV protease cleavage site is boxed.
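The overhang lengths produced in the LIC reaction can be estimated directly from the Rab6A primer sequences given under Experimental Procedures. The sketch below uses a deliberately simplified model of T4 DNA polymerase, stated here as an assumption: the 3′→5′ exonuclease trims the strand opposite the primer until it reaches the first position where the single supplied dNTP can be reincorporated. The function name is hypothetical.

```python
# Sketch: length of the single-stranded overhang left by T4 DNA polymerase
# in a LIC reaction. Simplified model (an assumption of this sketch): the
# 3'->5' exonuclease removes bases from the 3' end of the strand opposite the
# primer until it reaches the first base matching the single dNTP supplied.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def lic_overhang_length(primer_5_to_3: str, dntp_base: str) -> int:
    """Number of 5'-terminal primer bases left single-stranded."""
    # The opposite strand carries `dntp_base` wherever the primer strand
    # carries its complement, so chew-back stops at that primer position.
    stop_base = COMPLEMENT[dntp_base.upper()]
    return primer_5_to_3.upper().index(stop_base)


# Rab6A primers as listed under Experimental Procedures.
RAB6A_FWD = "AAAATCTTTATTTTCAATGCATGTCCACGGGCGGAGACT"
RAB6A_REV = "TTATCCACTTCCAATGTTAGCAGGAACAGCCTCCTTCA"

# PCR products are treated with T4 polymerase in the presence of dCTP.
fwd_overhang = lic_overhang_length(RAB6A_FWD, "C")
rev_overhang = lic_overhang_length(RAB6A_REV, "C")
print(fwd_overhang, rev_overhang)
```

With this model both insert overhangs fall in the 14-18 nucleotide range quoted for the method, because neither primer tail contains a G before the start of the gene-specific region.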

allows two-step affinity purification if the target protein is not sufficiently pure after the first purification step. The TEV protease cleavage site is modified in such a way that upon cleavage a cysteine is left on the N terminus of the resulting polypeptide that can be used for ligation of molecules bearing the reactive thioester group. To construct such a vector, we combined E. coli expression vector pGATEV, which contains tandem 6His-GST tags followed by a TEV protease cleavage site (16), and pMCSG7 vector that combines a LIC cloning site with a 6His tag removable via TEV cleavage (13). The resulting

Figure 4. Expression and purification of the Cys-Rab6A using pCYSLIC vector. Aliquots of E. coli culture, cell lysates, and purification steps were resolved on 15% SDS-PAGE and stained with Coomassie Blue stain. The positions of fusion protein and cleavage products are indicated by arrows.

vector was termed pCYSLIC (Figure 3A). For cloning of a gene of interest into this vector the backbone is digested with the blunt-cutting SspI enzyme and treated with T4 polymerase in the presence of dGTP (Figure 3B). This results in the formation of long (14-18 nucleotides) overhangs that are then used for capturing the insert. The gene of interest is PCR amplified with the primers containing a sequence complementary to the regions flanking the SspI site of the pCYSLIC (Figure 3B). The PCR products are treated with T4 polymerase in the presence of dCTP leading to the formation of single-stranded regions complementary to those of the backbone. The backbone and the insert are mixed and transformed into the E. coli strain. The resulting clones are screened by colony PCR for the presence of the desired insert. To test the vector we PCR amplified the ORF of human GTPase Rab6A and cloned it using the above-described approach into the pCYSLIC vector. The efficient cloning depended on the high quality of backbone linearization with the SspI enzyme, which was achieved by increasing the enzyme to DNA ratio as described before (13) (Experimental Procedures). Following the optimization a cloning efficiency of >90% was routinely achieved. The resulting pCYSLIC-Rab6A vector was introduced into E. coli BL21(DE3) RIL strain, and the suspension culture was induced with IPTG as described under Experimental Procedures. As can be seen in Figure 4, 6His-GST-Rab6A was highly overexpressed and found predominantly in the soluble fraction. Chromatography of the supernatant on Ni-NTA resin resulted in >70% pure 6His-GST-Rab6A

Figure 5. ESI-MS spectra of cleaved and labeled Rab6A protein: (A) deconvoluted mass spectrum of Cys-Rab6A, calculated molecular mass 23695.9 Da; (B) deconvoluted mass spectrum of Cys-Rab6A after 24 h of reaction with 1 at room temperature in a non-denaturing buffer (20 mM sodium phosphate, 1 mM MgCl2, 100 µM GDP, pH 7.5) and 5-10% of DMF; (C) deconvoluted mass spectrum of Cys-Rab6A labeled with biotin thioester 2 after 24 h at 22 °C; (D) deconvoluted mass spectrum of Cys-Rab6A labeled with 5(6)-carboxyfluorescein thioester 3 after 24 h at 22 °C; (E) deconvoluted mass spectrum of Cys-Rab6A labeled with biotin thioester 2 after 3 h of hydroxylamine (50 mM) treatment; (F) deconvoluted mass spectrum of Cys-Rab6A labeled with fluorescein thioester 3 after 3 h of hydroxylamine (50 mM) treatment.
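The calculated masses reported in the text for unlabeled, monolabeled, and dilabeled Cys-Rab6A can be cross-checked for internal consistency. The sketch below takes the mass added per label as the difference between the monolabeled and unlabeled calculated masses (an inference from the reported numbers, not a value stated by the authors) and lists how far each observed mass lies from the calculated one.

```python
# Sketch: consistency check on the reported Cys-Rab6A masses (all in Da).
# The per-label mass shift is inferred from the reported calculated values.

UNLABELED_CALC = 23695.9

# (label, number of labels) -> (calculated mass, observed mass)
reported = {
    ("biotin 2", 1): (23922.2, 23922.0),
    ("biotin 2", 2): (24148.5, 24146.0),
    ("fluorescein 3", 1): (24054.2, 24052.0),
    ("fluorescein 3", 2): (24412.5, 24411.0),
}


def label_shift(label: str) -> float:
    """Mass added by one label, from the monolabeled calculated mass."""
    return reported[(label, 1)][0] - UNLABELED_CALC


def predicted_mass(label: str, n_labels: int) -> float:
    """Expected mass if every label adds the same shift."""
    return UNLABELED_CALC + n_labels * label_shift(label)


for (label, n), (calc, obs) in reported.items():
    pred = predicted_mass(label, n)
    print(f"{label} x{n}: predicted {pred:.1f}, calculated {calc:.1f}, "
          f"observed - calculated = {obs - calc:+.1f}")
```

On these numbers each biotin adds about 226.3 Da and each carboxyfluorescein about 358.3 Da, the dilabeled calculated masses equal the unlabeled mass plus two such shifts, and every observed mass lies within about 2.5 Da of its calculated value.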


protein. To remove the tags and to liberate the N-terminal cysteine, the sample was dialyzed against 25 mM NaPi, pH 8.0, 2 mM EDTA, and 2 mM BME at 4 °C in the presence of TEV protease. The cleavage reaction was monitored by SDS-PAGE and typically reached >80% completion after 12 h. The cleaved 6His-GST fusion and TEV protease were removed by passing the mixture over Ni-NTA resin (Figure 4). The obtained Rab6 was >80% pure and migrated on the size exclusion chromatography as a single peak with an apparent molecular mass of ≈25 kDa, indicating that the protein was properly folded (not shown). Synthesis of and Labeling with Biotin and Fluorescein Thioesters. Having developed an expression system for rapid generation of pure proteins competent for in vitro chemical ligation, we devised a strategy for their rapid and site-selective modification with various labels. This labeling procedure is based on the nucleophilic attack of the cysteine thiol group on the thioester moiety of the labeling molecules and subsequent formation of an amide bond, as described for linking peptide segments via native chemical ligation (6) (Figure 1). However, reaction conditions and obtainable yields depend heavily on the nature of the label and the protein used. We sought to develop a general procedure for the synthesis of labeling reagents carrying a thioester and to identify conditions that ensure site-specific monolabeling of peptides/proteins in high yields without side reactions. In a first set of experiments we used a biotin benzylthioester 1 that was synthesized following a two-step reaction procedure presented first by Schuler and Pannell for conversion of a fluorescent dye into its thioester derivative (8). The benzylthioester 1 was obtained at very high yield (97%) from a biotin succinimidyl ester derivative.
An aluminum complex, prepared from a stoichiometric mixture of trimethylaluminum and benzyl mercaptan, was allowed to react with the biotin succinimidyl ester at room temperature, generating 1 (Figure 1). Initially this water-insoluble compound was applied in organic solvents such as DMF, DMSO, THF, or dioxane to ligation reactions with Rab6A (Cys-Rab6A calculated mass = 23695.9 Da, ESI-MS observed mass = 23694.0 Da). To avoid precipitation of 1, a concentration of 5-10% DMF was used during our initial experiments, reacting Rab6A and 1 under denaturing or native conditions in the presence of 4% thiophenol. Mass spectrometry indicated the formation of a side product (protein mass + 28 Da) and of the expected biotin labeled protein (calculated mass = 23922.2 Da, ESI-MS observed mass = 23926.0 Da; Figure 5). Optimization of reaction conditions [longer reaction times, use of MesNa (2-mercaptoethanesulfonic acid, sodium salt) instead of thiophenol] did not lead to complete conversion of Rab6A to the biotinylated species, possibly indicating a modification of the N-terminal cysteine residue of Rab6A. Such a modification would prevent any reaction with 1. The replacement of DMF by other organic solvents such as DMSO, THF, or dioxane led to a decrease in labeling efficiency, probably due to less efficient solubilization of 1. However, the occurrence of the nonreactive side product was prevented. To exclude the possibility that the observed phenomenon was protein-specific, a similar series of experiments was performed with the 42 amino acid peptide RBD96-131/His6-tag, which contains an N-terminal cysteine (17). The results corroborated those obtained with the Rab6A protein, including the low efficiency of the reaction, under either denaturing (6 M GdmCl) or native conditions, and the formation of a side product (peptide mass + 28). To overcome the problems encountered with the largely water-insoluble compound 1, we decided to prepare water-soluble thioester labeling reagents.
Synthetic access to such compounds has been established by Tolbert et al. (14), using tert-butyl mercaptoacetate (15), and we obtained water-soluble biotin and 5(6)-carboxyfluorescein thioesters with yields of


Figure 6. SDS-PAGE analysis of Rab6A labeling with 5(6)-carboxyfluorescein: lane 1, protein marker; lane 2, Cys-Rab6A; lane 3, fluorescein-Rab6A; lane 4, fluorescence scan of Cys-Rab6A labeled at λex = 473 nm and λem = 510 nm.

Figure 7. SDS-PAGE analysis of six Rab proteins before and after ligation with 5(6)-carboxyfluorescein thioester. The reactions were carried out and analyzed as in Figure 4. The bottom panel represents the fluorescent scan of the gel, whereas the top panel shows the same gel stained with Coomassie Blue. The high molecular weight fluorescent bands represent GTPase dimers.

between 22 and 45% following their procedure. In the first ligation experiment using 1.78 mM 3 and a 370 µM solution of Rab6A in the presence of β-mercaptoethanol (1 mM) and 2-mercaptoethanesulfonic acid (25 mM), we observed low labeling yields of Rab6A and the occurrence of a small fraction of doubly labeled Rab6A protein. The latter was rationalized as thioester exchange between cysteine side chains in the protein and the labeling reagents 2 and 3. Different reaction conditions using various amounts of MesNa as an inducer of thioester exchange, TCEP as an additional reducing agent, or a combination of both reagents were tested without significant reduction in unspecific labeling. Such multiple labeling essentially defeats the potential advantages of using native chemical ligation for protein labeling. Optimization of the labeling conditions to suppress the unwanted multiple labeling revealed a marked cysteine reactivity difference of biotin and carboxyfluorescein thioesters. We observed that regular additions, approximately every 8 h for carboxyfluorescein thioester 3 (final concentration = 1 mM) and once after 12 h for biotin thioester 2 (final concentration = 1 mM), produced almost only monolabeled protein within 24 h. The reaction progress was monitored by LC-MS, and the respective deconvoluted mass spectra showed in each case the formation of the monolabeled protein as the major product (calculated molecular mass = 23922.2 Da for Rab6A + 2 and 24054.2 Da for Rab6A + 3; observed molecular mass =


Figure 8. Use of In-Fusion cloning for generation of the 6His-TVMV-Rab6A fusion protein: (A) multicloning site of the pET30A vector (the EcoRV site used for vector linearization is marked by an arrow); (B) flowchart of cloning reaction and design of the PCR product; (C) expression in E. coli, purification and cleavage of 6His-TVMV-Rab6A with TVMV protease; (D) ligation of TVMV-processed Cys-Rab6A protein with 5(6)-carboxyfluorescein thioester. The samples were processed as in Figure 7.
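The 15 bp homology arms used for In-Fusion cloning can be read straight from the two primers listed under Experimental Procedures. The sketch below joins the forward arm to the reverse complement of the reverse arm and checks that the result recreates an EcoRV site (GATATC) exactly at the arm boundary. The reconstructed junction is an inference from the primer sequences alone; it has not been compared against a pET30a sequence file.

```python
# Sketch: reconstruct the vector sequence around the EcoRV cut site from the
# 15 bp homology arms of the In-Fusion primers (inference from primers only).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


TVMV_RAB6_DIR = "AAGGCCATGGCTGATGAGACCGTTCGCTTTCAGTGCATGTCCACGGGCGGAGACTTCG"
TVMV_RAB6_REV = "GAATTCGGATCCGATTCAGCAGGAACAGCCTCCTTCAC"
HOMOLOGY_BP = 15
ECORV_SITE = "GATATC"  # EcoRV cuts blunt in the middle: GAT|ATC

upstream_arm = TVMV_RAB6_DIR[:HOMOLOGY_BP]
downstream_arm = revcomp(TVMV_RAB6_REV[:HOMOLOGY_BP])
junction = upstream_arm + downstream_arm

# Position of the blunt cut within the reconstructed junction.
cut_position = junction.find(ECORV_SITE) + len(ECORV_SITE) // 2

print(junction)
print("EcoRV cut falls at the arm boundary:", cut_position == HOMOLOGY_BP)
```

If the two arms did not meet at the cut site, the check would fail, so this is a quick sanity test that can be applied to primers designed for other linearization sites.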

23922.0 and 24052.0 Da, respectively), as well as the impurity already contained in the starting material and a small peak belonging to the dilabeled protein (calculated molecular mass = 24148.5 Da with 2 and 24412.5 Da with 3; observed molecular mass = 24146.0 and 24411.0 Da, respectively; Figure 5). In cases when the dilabeled product interferes with the intended use of the labeled protein, the additional label that is attached to the protein via a thioester bond can be efficiently removed by treatment with hydroxylamine (50 mM in aqueous buffer, 3 h). Cloning, Expression, Purification, and Labeling of Mammalian Rab GTPases. To establish the general applicability of the designed vector and the labeling procedure on a larger set of test proteins, we chose to clone a subset of mammalian RabGTPases into the pCYSLIC vector. We amplified 39 ORFs of RabGTPases and cloned them into the pCYSLIC vector. The cloning procedure was very robust with typical cloning efficiency of >90%. The resulting clones were verified by sequencing and tested for expression in BL21(DE3)RIL strains at 37 and 20 °C (see Supporting Information). Approximately 50% of the proteins were either completely or partially soluble at 20 °C. We purified a subset composed of Rab3A, Rab3B, Rab4A, Rab19, and Rab27 to >70% purity by Ni-NTA

chromatography followed by subsequent cleavage of GST tag with TEV protease. Incubation of the purified proteins with carboxyfluorescein thioester resulted in the formation of a fluorescent product easily detectable by SDS-PAGE. Labeling of most proteins resulted in a slight (≈0.5 kDa) decrease in electrophoretic mobility of the ligation product, which suggests that the majority of the protein was labeled efficiently (Figure 6). Proteins retained their solubility, indicating that labeling does not significantly affect protein folding or solubility. Use of In-Fusion Cloning for Production of Proteins Containing N-Terminal Cysteine. The described approach based on LIC cloning is both sufficiently inexpensive and flexible to be adapted to an automated cloning platform. However, similar to most recombination-based cloning methods, it precludes the use of vectors not adapted to the chosen cloning method. Ideally one would like to be able to use any selection of vectors, particularly in cases when the protein displays folding or solubility problems. The recently introduced In-Fusion cloning procedure potentially provides a way to clone a PCR fragment into any linearized vector. We decided to test whether this approach could also be used for the rapid generation of the recombinant proteins with N-terminal cysteines. To this end, we chose a pET30a vector that features an N-terminal 6His tag

Bioconjugate Chem., Vol. 17, No. 3, 2006 617

Expression, Purification, and Labeling of Recombinant Proteins

and is commonly used for the expression of recombinant proteins. We decided to incorporate into recombinant Rab6A a cleavage site for TVMV polymerase, which has not been previously used for the generation of an N-terminal cysteine residue. Hence, we included a modified recognition sequence of TVMV polymerase into the primers, which would result in substitution of the serine in the recognition sequence ETVRFQS by cysteine (cleavage normally occurs between Q and S). The primer design and recombination cloning procedure are depicted in Figure 8A-C and described under Experimental Procedures. The recombinant clones were readily obtained, and the fusion protein was expressed in E. coli, purified, and cleaved with recombinant TVMV protease under conditions used for TEV protease (Figure 8C). The cleavage was very efficient, demonstrating that TVMV could well tolerate SfC substitution in the cleavage site. Following a second purification by Ni-NTA, the protein was already >85% pure and could be used directly for downstream applications. To ensure that the generated N-terminal cysteine was functional in the ligation reaction, we incubated the recombinant protein with carboxyfluorescein thioester as described above. As can be seen in Figure 8D, this resulted in specific labeling of Rab6A.
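The cleavage-site modification described above can be pictured with a minimal Python sketch. It only models the sequence logic stated in the text (serine in ETVRFQS replaced by cysteine, cleavage between Q and the following residue). The tag, linker, and target residues are hypothetical placeholders and do not reproduce the actual pET30a or Rab6A sequences or the primers shown in Figure 8.

```python
# Illustrative model of the cleavage-site change described in the text.
# The tag, linker, and target residues below are hypothetical placeholders,
# not the real pET30a or Rab6A sequences.

TVMV_SITE = "ETVRFQS"  # recognition sequence quoted in the text


def cysteine_variant(site):
    """Return the site with its last residue (S) replaced by cysteine."""
    return site[:-1] + "C"


def cleave(fusion, site):
    """Cut before the last residue of the site, i.e. between Q and S (or C)."""
    start = fusion.find(site)
    if start == -1:
        raise ValueError("cleavage site not found in fusion sequence")
    cut = start + len(site) - 1
    return fusion[:cut], fusion[cut:]


modified_site = cysteine_variant(TVMV_SITE)
fusion = "MHHHHHH" + "GS" + modified_site + "MSAGG"  # placeholder construct
released_tag, target = cleave(fusion, modified_site)

print(modified_site)  # ETVRFQC
print(target[0])      # C
```

In this toy model the released fragment begins with cysteine, which corresponds to the N-terminal residue that the text reports as reactive toward the carboxyfluorescein thioester.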

CONCLUSIONS In the current work we present a new E. coli expression vector, pCYSLIC, that allows rapid cloning of PCR fragments via ligation-independent cloning in frame with the 6His-GST tag. The affinity tags were easily removed with TEV protease, leaving an N-terminal cysteine that was subsequently used to label the proteins with carboxyfluorescein and biotin via an in vitro protein ligation reaction. The developed combination of inexpensive rapid cloning, purification, and functionalization procedures provides an advance in protein expression and derivatization. We compared this method to the recently developed In-Fusion cloning approach and demonstrated that, although more expensive, In-Fusion cloning provides superior flexibility in the choice of cloning vector compared with other cloning methods. We also showed that TVMV protease can be employed for the generation of recombinant proteins with an N-terminal cysteine.

ACKNOWLEDGMENT We acknowledge N. Lupilova, S. Thuns, and M. Terbeck for invaluable technical assistance. The pMCSG7 vector was kindly provided by M. I. Donnelly. We are grateful to F. Barr for the cDNAs of Rab6A and to D. Waugh for the gift of expression vectors for TVMV and TEV proteases. This work was supported in part by Grant DFG AL 484/7-2 to K.A.

Supporting Information Available: Expression and solubility test of Rab GTPases cloned and expressed in pCYSLIC vector. This material is available free of charge via the Internet at http://pubs.acs.org.

LITERATURE CITED

(1) Yokoyama, S. (2003) Protein expression systems for structural genomics and proteomics. Curr. Opin. Chem. Biol. 7, 39-43.
(2) Hartley, J. L., Temple, G. F., and Brasch, M. A. (2000) DNA cloning using in vitro site-specific recombination. Genome Res. 10, 1788-1795.
(3) Benoit, R. M., Wilhelm, R. N., Scherer-Becker, D., and Ostermeier, C. (2006) An improved method for fast, robust, and seamless integration of DNA fragments into multiple plasmids. Protein Express. Purif. 45, 66-71.
(4) Waugh, D. S. (2005) Making the most of affinity tags. Trends Biotechnol. 23, 316-320.
(5) Muir, T. W. (2003) Semisynthesis of proteins by expressed protein ligation. Annu. Rev. Biochem. 72, 249-289.
(6) Dawson, P. E., Muir, T. W., Clark-Lewis, I., and Kent, S. B. (1994) Synthesis of proteins by native chemical ligation. Science 266, 776-779.
(7) Zhang, A., Gonzalez, S. M., Cantor, E. J., and Chong, S. (2001) Construction of a mini-intein fusion system to allow both direct monitoring of soluble protein expression and rapid purification of target proteins. Gene 275, 241-252.
(8) Schuler, B., and Pannell, L. K. (2002) Specific labeling of polypeptides at amino-terminal cysteine residues using Cy5-benzyl thioester. Bioconjugate Chem. 13, 1039-1043.
(9) Gentle, I. E., De Souza, D. P., and Baca, M. (2004) Direct production of proteins with N-terminal cysteine for site-specific conjugation. Bioconjugate Chem. 15, 658-663.
(10) Cotton, G. J., and Muir, T. W. (2000) Generation of a dual-labeled fluorescence biosensor for Crk-II phosphorylation using solid-phase expressed protein ligation. Chem. Biol. 7, 253-261.
(11) Tolbert, T. J., and Wong, C. H. (2002) New methods for proteomic research: Preparation of proteins with N-terminal cysteines for labeling and conjugation. Angew. Chem. Int. Ed. 41, 2171-2174.
(12) Tozser, J., Tropea, J. E., Cherry, S., Bagossi, P., Copeland, T. D., Wlodawer, A., and Waugh, D. S. (2005) Comparison of the substrate specificity of two potyvirus proteases. FEBS J. 272, 514-523.
(13) Stols, L., Gu, M., Dieckman, L., Raffen, R., Collart, F. R., and Donnelly, M. I. (2002) A new vector for high-throughput, ligation-independent cloning encoding a tobacco etch virus protease cleavage site. Protein Express. Purif. 25, 8-15.
(14) Tolbert, T. J., Franke, D., and Wong, C. H. (2005) A new strategy for glycoprotein synthesis: ligation of synthetic glycopeptides with truncated proteins expressed in E. coli as TEV protease cleavable fusion protein. Bioorg. Med. Chem. 13, 909-915.
(15) Woulfe, S. R., and Miller, M. J. (1986) The synthesis of substituted [[3(S)-(acylamino)-2-oxo-1-azetidinyl]thio]acetic acids. J. Org. Chem. 51, 3133-3139.
(16) Kalinin, A., Thoma, N. H., Iakovenko, A., Heinemann, I., Rostkova, E., Constantinescu, A. T., and Alexandrov, K. (2001) Expression of mammalian geranylgeranyltransferase type-II in Escherichia coli and its application for in vitro prenylation of Rab proteins. Protein Express. Purif. 22, 84-91.
(17) Becker, C. F. W., Hunter, C. L., Seidel, R. P., Kent, S. B. H., Goody, R. S., and Engelhard, M. (2001) A sensitive fluorescence monitor for the detection of activated Ras: total chemical synthesis of site-specifically labeled Ras binding domain of c-Raf1 immobilized on a surface. Chem. Biol. 8, 243-252.

BC050320D