Protein Splicing-Based Reconstitution of Split Green Fluorescent

Anal. Chem. 2001, 73, 5866-5874

Protein Splicing-Based Reconstitution of Split Green Fluorescent Protein for Monitoring Protein-Protein Interactions in Bacteria: Improved Sensitivity and Reduced Screening Time

Takeaki Ozawa,†,‡ Mizue Takeuchi,‡ Asami Kaihara,†,‡ Moritoshi Sato,†,‡ and Yoshio Umezawa*,†,‡

Department of Chemistry, School of Science, The University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113-0033, Japan, and Japan Science and Technology Corporation (JST), Tokyo, Japan

In this research, an improved detection system is described that allows easy in vivo screening and selection of functional interactions between two proteins in bacteria. We earlier reported a new concept for detecting protein-protein interactions based on reconstitution of split enhanced green fluorescent protein (EGFP) by protein splicing (Ozawa, T.; et al. Anal. Chem. 2000, 72, 5151-5157): two putative interacting proteins are genetically fused to the split VDE inteins, which are linked directly to the N- and C-terminal halves of the split EGFP. Association of the interacting proteins results in functional complementation of VDE and a protein-splicing reaction that leads to formation of the EGFP fluorophore. This technique simplified detection of protein interactions, but because of the low splicing efficiency of the VDE intein, its sensitivity and screening time were insufficient for detecting protein interactions directly in living cells. In this paper, we have explored the use of the DnaE split intein from Synechocystis sp. PCC6803 for intracellular reconstitution of the split EGFP. We examined the efficiency of fluorophore formation by preparing four different split-EGFP types, among which EGFP dissected between positions 157 and 158 showed the strongest fluorescence intensity upon protein interactions. The time required for the formation of EGFP after protein interactions was only 4 h, as compared to 3 days with the VDE intein. The protein interactions were thereby detected by an in vivo selection and screening assay in Escherichia coli on Luria broth agar plates. This improvement permits versatile designs of screening procedures either for ligands that bind to particular proteins or for molecules or mutations that block particular interactions between two proteins of interest.

Many processes in biology are mediated by noncovalently associated multienzyme complexes.
Examples include the assembly of enzymes and other protein homodimers and heterodimers that play important roles in the regulation of intracellular transport pathways, gene expression, receptor-ligand interactions, and in the therapeutic or toxic effects of administered drugs. To increase our understanding of these biological processes, several techniques have been developed for examining the interactions between proteins within cells. The yeast two-hybrid system is a powerful method for the in vivo analysis of protein-protein interactions.1,2 This system facilitates the identification of potential protein-protein interactions and has been proposed as a method for the generation of protein interaction maps.3-5 The limitation of the two-hybrid technique is that the set of detectable protein interactions is restricted to those that occur in the nucleus in proximity to the reporter gene. To overcome this limitation, several techniques have been developed for identifying and studying protein-protein interactions, including the Sos-recruitment system,6-9 the split ubiquitin system,10-12 and the protein complementation assay systems using Escherichia coli β-galactosidase,13,14 mouse dihydrofolate reductase,15,16 and Bordetella pertussis CyaA adenylate cyclase.17,18 These systems are well-suited for assaying interactions between cytoplasmic and membrane-proximal proteins, but they can be utilized only in appropriately engineered cells or are prone to false positive signals. Because of this, development of general methods that allow detection of interactions between any two proteins is of interest for biochemical and biophysical studies.

We have recently proposed a new concept for analysis of interactions between two proteins of interest.19 The concept was based on the protein splicing system of the VDE intein (VMA1-derived endonuclease). An interaction between the proteins triggers the splicing reaction, which links the two attached EGFP halves with a peptide bond. Reconstitution of EGFP can be monitored by its fluorescence. The advantage of this technique is that, unlike other protein interaction assays, it does not require that the interactions take place near the cell nucleus and reporter genes or that an enzyme substrate be present. The detection limit of the split EGFP system, however, still had to be improved for direct in vivo measurements in living cells: the splicing efficiency with VDE was relatively low, and as a result, it took more than 3 days to detect fluorescent signals. For a wider application of this method, a protein splicing-based split luciferase system has been developed. In this system, the VDE intein was replaced by dnaE derived from the cyanobacterium Synechocystis sp. strain PCC6803, and firefly luciferase was used as an optical probe. This split luciferase system enabled the monitoring of phosphorylation of proteins in mammalian cells as a result of highly sensitive detection of luminescence from the luciferase enzyme.20 To detect the luminescence activity, however, the cells had to be lysed with a detergent, and luciferin had to be added, together with ATP, to the lysate as the substrate.

* To whom correspondence should be addressed. Phone: +81-3-5841-4351. Fax: +81-3-5841-8349. E-mail: [email protected].
† The University of Tokyo.
‡ Japan Science and Technology Corporation.

(1) Fields, S.; Song, O. Nature 1989, 340, 245-246.
(2) Chien, C. T.; Bartel, P. L.; Sternglanz, R.; Fields, S. Proc. Natl. Acad. Sci. U.S.A. 1991, 88, 9578-9582.
(3) Flores, A.; Briand, J. F.; Gadal, O.; Andrau, J. C.; Rubbi, L.; Mullem, V.; Boschiero, C.; Goussot, M.; Marck, C.; Carles, C.; Thuriaux, P.; Sentenac, A.; Werner, M. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 7815-7820.
(4) Ito, T.; Tashiro, K.; Muta, S.; Ozawa, R.; Chiba, T.; Nishizawa, M.; Yamamoto, K.; Kuhara, S.; Sakaki, Y. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 1143-1147.
(5) Walhout, A. J.; Sordella, R.; Lu, X.; Hartley, J. L.; Temple, G. F.; Brasch, M. A.; Thierry-Mieg, N.; Vidal, M. Science 2000, 287, 116-122.
(6) Aronheim, A. Biochem. Pharmacol. 2000, 60, 1009-1013.
(7) Aronheim, A. Nucleic Acids Res. 1997, 25, 3373-3374.
(8) Aronheim, A.; Zandi, E.; Hennemann, H.; Elledge, S. J.; Karin, M. Mol. Cell. Biol. 1997, 17, 3094-3102.
(9) Broder, Y. C.; Katz, S.; Aronheim, A. Curr. Biol. 1998, 8, 1121-1124.
(10) Dunnwald, M.; Varshavsky, A.; Johnsson, N. Mol. Biol. Cell 1999, 10, 329-344.
(11) Johnsson, N.; Varshavsky, A. Proc. Natl. Acad. Sci. U.S.A. 1994, 91, 10340-10344.
(12) Stagljar, I.; Korostensky, C.; Johnsson, N.; Heesen, S. Proc. Natl. Acad. Sci. U.S.A. 1998, 95, 5187-5192.
(13) Blakely, B. T.; Rossi, F. M.; Tillotson, B.; Palmer, M.; Estelles, A.; Blau, H. M. Nat. Biotechnol. 2000, 18, 218-222.
(14) Rossi, F.; Charlton, C. A.; Blau, H. M. Proc. Natl. Acad. Sci. U.S.A. 1997, 94, 8405-8410.
(15) Remy, I.; Wilson, I. A.; Michnick, S. W. Science 1999, 283, 990-993.
(16) Remy, I.; Michnick, S. W. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 5394-5399.

10.1021/ac010717k CCC: $20.00 © 2001 American Chemical Society. Published on Web 11/08/2001

5866 Analytical Chemistry, Vol. 73, No. 24, December 15, 2001
Although the split luciferase system requires such cumbersome handling, its high sensitivity makes it suitable for quantitatively measuring protein-protein interactions in mammalian cells that are triggered by hormones or pharmaceutical samples. If highly sensitive detection can be achieved in the split EGFP system, which requires neither an enzymatic substrate nor cell lysis, it will become more useful for high-throughput screening or selection of interacting proteins. We describe herein a highly sensitive split EGFP system for detecting protein-protein interactions in E. coli, based on protein splicing with the dnaE intein, that allows bacterial clonal selection. The concept is shown in Figure 1. The dnaE intein was chosen from several known inteins because of its special features: (i) It is composed of a total of 159 amino acids, which is shorter than VDE by 90 amino acids.21 (ii) The solubility of dnaE is higher than that of VDE, which is known to form inclusion bodies upon expression in E. coli.22 (iii) For efficient splicing to occur, the requirement for amino acid sequences around the splicing junctions is less stringent for dnaE than for VDE.23-25 (iv) Unlike artificially split VDE, the N- and C-terminal domains of dnaE self-associate by nature and then undergo a splicing reaction that ligates the exteins (trans splicing).21 Our goal was to develop a strategy that allows detection of any protein-protein interactions in vivo, such as in specific cell types, organisms, or living animals. We show herein how the split EGFP system can be used in a simple genetic screen in E. coli for selection of specific clones expressing a particular protein that interacts with its target.

Figure 1. Principle of the present split-EGFP system. The N-terminal half of dnaE (dnaEn) and the C-terminal half of dnaE (dnaEc) are connected with the N- and C-terminal halves of EGFP (N-EGFP and C-EGFP), respectively. An interacting pair, protein A and protein B, are linked to the opposite ends of the dnaE halves. Interaction between protein A and protein B accelerates the folding of N- and C-dnaE, and protein splicing results. The N- and C-terminal halves of EGFP are linked together by a normal peptide bond to yield correctly folded EGFP, in which the fluorophore is formed.

(17) Ladant, D.; Karimova, G. Res. Microbiol. 2000, 151, 711-720.
(18) Karimova, G.; Pidoux, J.; Ullmann, A.; Ladant, D. Proc. Natl. Acad. Sci. U.S.A. 1998, 95, 5752-5756.
(19) Ozawa, T.; Nogami, S.; Sato, M.; Ohya, Y.; Umezawa, Y. Anal. Chem. 2000, 72, 5151-5157.
(20) Ozawa, T.; Kaihara, A.; Sato, M.; Tachihara, K.; Umezawa, Y. Anal. Chem. 2001, 73, 2516-2521.
(21) Wu, H.; Hu, Z.; Liu, X.-Q. Proc. Natl. Acad. Sci. U.S.A. 1998, 95, 9226-9231.
(22) Kawasaki, M.; Makino, S.; Matsuzawa, H.; Satow, Y.; Ohya, Y.; Anraku, Y. Biochem. Biophys. Res. Commun. 1996, 222, 827-832.
(23) Kawasaki, M.; Nogami, S.; Satow, Y.; Ohya, Y.; Anraku, Y. J. Biol. Chem. 1997, 272, 15668-15674.
(24) Nogami, S.; Satow, Y.; Ohya, Y.; Anraku, Y. Genetics 1997, 147, 73-85.
(25) Evans, T. C.; Martin, D.; Kolly, R.; Panne, D.; Sun, L.; Ghosh, I.; Chen, L.; Benner, J.; Liu, X.-Q.; Xu, M.-Q. J. Biol. Chem. 2000, 275, 9091-9094.

EXPERIMENTAL PROCEDURES

Materials. All reagents used were of the highest available purity. DNA-modifying enzymes were from Takara Biomedicals (Tokyo, Japan). E. coli strain DH5α was used for subcloning. For protein expression, E. coli strain BL21(DE3)pLysS was transformed with the appropriate DNA constructs.

General Procedures. The crude extracts from E. coli were electrophoresed on 12-15% SDS-PAGE gels with a protein marker. Protein concentrations were determined by the Bradford assay (Bio-Rad, Hercules, CA). All PCR fragments were sequenced with an ABI310 genetic analyzer (Applied Biosystems, Foster City, CA).

Protein Expression. Detailed protocols of the present plasmid construction are available upon request. Each cDNA shown in Figure 2 was introduced into the NdeI and BamHI sites of pET15b (Novagen, Madison, WI). All of the constructs include a cassette consisting of (translation termination codon)-(Shine-Dalgarno sequence)-(translation initiation codon).26 The resulting plasmid is essentially a two-gene operon with the first gene encoding the


Figure 2. Schematic structures of major new constructs. His6 indicates the polyhistidine tag MAHHHHHHHHHHSSAHIGARH. The sequences of the N_ and C_linkers are ASNNGNGRNG and GNNGGNNDV, respectively. Dashed lines indicate the Shine-Dalgarno sequence. CaM is Xenopus calmodulin, and M13 is the CaM-binding peptide derived from skeletal muscle myosin light-chain kinase. Arrows in pETm157(/) indicate the annealing positions of the two primers, dnaEn1-F and dnaEc1-R, for colony PCR.

N-terminal halves of EGFP and dnaE and with the second gene encoding the C-terminal halves of EGFP and dnaE. The plasmid was introduced into E. coli strain BL21(DE3)pLysS with an Electro Cell Manipulator 600 (BTX, San Diego, CA), and the cells were grown to an OD600 of 0.5-0.7 in 10 mL of liquid Luria broth (LB) medium containing 100 µg/mL ampicillin and 30 µg/mL chloramphenicol. Expression of proteins in the bacteria was induced with 0.5 mM isopropyl β-D-thiogalactopyranoside (IPTG), and the bacteria were allowed to express recombinant proteins at 25 °C. Crude cell extracts were prepared at a given time by pelleting the cells at 8000g for 5 min and resuspending them in 1.0 mL of a PBS buffer (150 mM NaCl, 3.0 mM KCl, 10 mM phosphate buffer, pH 7.2). The cells were lysed with a tip sonicator, and the lysates were centrifuged at 15000g for 10 min at 4 °C. The resulting supernatants were used for the fluorescence measurements and immunoblot analysis.

(26) Wu, H.; Xu, M.-Q.; Liu, X.-Q. Biochim. Biophys. Acta 1998, 1387, 422-432.

Fluorescence Spectra Measurements. The crude cell extracts in the PBS buffer were placed in a 600-µL cuvette, and their fluorescence was measured using a JASCO-750 spectrofluorometer (Japan Spectroscopic Co., Tokyo, Japan). Emission spectra were recorded at an excitation wavelength of 470 nm. The measurements of the spectra for each sample were repeated three times, including the protein expression and sample preparation processes. Even though the sample solutions were prepared under the same conditions, the magnitudes of the fluorescence intensities differed because of small changes in bacterial growth. For the measured fluorescence intensities to be comparable from one sample preparation to another, the intensities of each spectrum were normalized according to the equation $F = F_{\mathrm{obs}} / W_{\mathrm{tot}}$, where $F_{\mathrm{obs}}$ is the observed fluorescence intensity and $W_{\mathrm{tot}}$ is the total weight of proteins expressed in the bacteria. Hereafter, the normalized fluorescence intensity ($F$) is simply referred to as the fluorescence intensity, and its unit is defined as an arbitrary unit (AU).

Immunoblot Analysis. The crude cell extracts (20 µL) were mixed with the same volume of 2× SDS-PAGE sample buffer containing 2 mM EDTA, 2% SDS, 20% glycerol, 0.02% bromophenol blue, and 100 mM Tris/HCl (pH 6.8). The samples were boiled for 10 min, subjected to SDS-PAGE using 15% polyacrylamide gels, and transferred to a nitrocellulose membrane. The membrane was probed with anti-His tag polyclonal antibodies (Santa Cruz Biotechnology, Santa Cruz, CA) or anti-GFP monoclonal antibodies (Roche Molecular Biochemicals, Mannheim, Germany) and then with alkaline phosphatase-labeled anti-rabbit or anti-mouse antibodies. The secondary antibodies were visualized with an LAS-1000plus image analyzer (Fujifilm Co., Tokyo, Japan) equipped with a CDP-Star chemiluminescence system (New England Biolabs, Inc., Beverly, MA). The molecular size of each band was assessed using a standard rainbow marker (Amersham Pharmacia Biotech Ltd., Buckinghamshire, U.K.).

Visualization of E. coli on LB Agar Plates and Fluorescence Measurements. Each plasmid was transformed into E. coli strain BL21(DE3)pLysS, and the cells were grown on an LB agar plate containing 100 µM ampicillin, 30 µg/mL chloramphenicol, and 1.0 mM IPTG. After 12-16 h in culture, the fluorescence of the transformed E. coli colonies on the plate was quantitatively assessed by exposing the plate to long-wavelength (470 nm) excitation light from a blue LED (LAS-1000plus, Fujifilm). The fluorescence emission was detected by a cooled CCD equipped with an emission filter (530DF30).
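As an illustration of the normalization F = Fobs/Wtot described under Fluorescence Spectra Measurements, the following minimal Python sketch divides each observed intensity by the total protein weight of its preparation. The function names and all numerical values are hypothetical and serve only as an example; the unit of Wtot is not stated in the text and is assumed here to be the same for every preparation.

```python
from typing import List


def normalize_intensity(f_obs: float, w_tot: float) -> float:
    """Return F = Fobs / Wtot for a single intensity reading.

    f_obs: observed fluorescence intensity (instrument units).
    w_tot: total weight of expressed protein in the sample
           (unit assumed identical across preparations).
    """
    if w_tot <= 0:
        raise ValueError("total protein weight must be positive")
    return f_obs / w_tot


def normalize_spectrum(intensities: List[float], w_tot: float) -> List[float]:
    """Apply the same normalization to every point of an emission spectrum."""
    return [normalize_intensity(value, w_tot) for value in intensities]


if __name__ == "__main__":
    # Hypothetical readings: two preparations of the same construct that
    # differ only in how much protein the culture produced.
    prep_a = normalize_spectrum([40.0, 120.0, 80.0], w_tot=2.0)
    prep_b = normalize_spectrum([60.0, 180.0, 120.0], w_tot=3.0)
    print(prep_a)
    print(prep_b)
```

With these made-up numbers the two preparations give the same normalized spectrum, which is the situation the normalization is meant to produce when only the amount of expressed protein differs.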
The obtained images of the bacterial colonies were analyzed with ImageGauge v.3.41 software (Fujifilm).

Colony PCR. Optimized reaction mixtures for colony PCR contained 1× PCR gold buffer, 2.0 mM MgCl2, 200 µM of each deoxynucleotide triphosphate, 0.2 µM of each of the two primers, and 1.25 units of AmpliTaq Gold (Applied Biosystems, Foster City, CA) in 10 µL of PCR reaction mixture. The primer sequences were 5′-AAGCTTTGGCACCGAAATTTTA-3′ (dnaEn1-F: sense strand) and 5′-TGGCTAGGAGAAAATTATGGTCT-3′ (dnaEc1-R: antisense strand). The positions at which the primer pair anneals to the constructs are shown in Figure 2. Colonies ∼1 mm in diameter were picked up with a sterilized toothpick and directly transferred to PCR tubes as DNA templates. The thermal cycle program, run on a T gradient thermocycler (Biometra, Göttingen, Germany), was as follows: one cycle of 94 °C for 10 min; 60 cycles of 94 °C for 30 s, 55 °C for 30 s, and 72 °C for 1.5 min; and then incubation at 72 °C for 5 min.

RESULTS AND DISCUSSION

Design of Fusion Protein Test System for Monitoring Interactions. We have recently used the VDE intein to demonstrate the method for monitoring protein-protein interactions based on protein splicing.19 The interactions triggered the refolding of split VDE to induce the protein splicing, and the ligated EGFP thereby folded correctly to yield the EGFP fluorophore. The fluorescence from the EGFP thus formed was detected in vitro. It was, however, found to be difficult to detect the fluorescence from

Table 1. Wavelength Peaks of Emission Spectra of the GFP Variants(a)

protein       mutations           excitation (nm)   emission (nm)
EGFP          S65T                488               511
m144EGFP      N144C               488               511
m157EGFP      K156Y, Q157C        488               510
ins157EGFP    Q157-KFAEYC-K158    488               510
m224EGFP      F223Y, V224C        489               510

(a) Wavelengths of excitation and emission maxima are given in nanometers. The ins157EGFP protein has six additional amino acids inserted between residues 157 and 158.

single bacterial colonies in vivo on LB or M9 agar plates, even though the bacteria were illuminated with a high-power Ar laser at 488 nm. We reasoned that if the formation of the parent EGFP were accelerated, it would be possible to obtain higher fluorescence signals. To achieve this, the pair of VDE intein fragments was replaced by the pair from dnaE derived from Synechocystis sp. PCC6803, which includes the domains necessary for trans splicing. Although a pair of VDE fragments did not undergo the splicing reaction without protein-protein interactions, the dnaE intein was found to splice in trans to some extent by a self-catalytic process, even without additional protein-protein interactions. We found earlier with luciferase that if the desired protein interaction occurred, the splicing of dnaE was facilitated, and thereby, a higher sensitivity of the luminescence signals resulted.20

Upon using the dnaE intein in the present split EGFP system, we had to investigate the optimum amino acid sequence around the splicing junction to achieve an efficient splicing reaction while keeping the spectral properties of EGFP intact. The optimum amino acid sequence around the splicing junction is specific to each of EGFP and luciferase. It may be possible to introduce or substitute several amino acids to meet the demands of an efficient splicing reaction, although only three point mutations could be introduced in the luciferase system while retaining its activity. Refolding of the ligated EGFP also seemed to be an important factor for highly sensitive detection; whether the ligated EGFP can fold correctly may depend on the dissection point of EGFP. In the previous split-EGFP system with VDE, EGFP was dissected just at the end of the sixth β sheet of EGFP, because formation of a β-sheet strand was required at the splicing junction, and the structure of the N-terminal half seemed to have a stable conformation.
In contrast to VDE, the requirement for secondary or tertiary structures at the splicing junction of dnaE was not clear. We, therefore, investigated the differences in the fluorescence intensities of four EGFP mutants, each of which was formed from N- and C-EGFP fragments split at a different location, as shown in Figure 2.

Mutational Analysis of EGFP. To test whether EGFP tolerates the several point mutations required for the protein splicing to occur, a mutational analysis was performed. Four EGFP mutants, m144EGFP, m157EGFP, ins157EGFP, and m224EGFP, were expressed in E. coli, and the fluorescence spectra of the E. coli lysates were examined (Table 1). The excitation and emission maxima of these mutants were ∼488-489 and ∼510-511 nm, respectively. These were essentially identical to those of the parent EGFP. This result indicates that the several point mutations introduced into the parent EGFP did not affect its fluorescence spectra.


Figure 3. Emission spectra of crude extracts of E. coli carrying plasmids (a) pETm144(/) and pETm144(C/M), (b) pETm157(/) and pETm157(C/M), (c) pETins157(/) and pETins157(C/M), and (d) pETm224(/) and pETm224(C/M). Spectra for the plasmids NVC∆SDlinker(/) and NVC∆SDlinker(C/M), in which dnaE was replaced by VDE, are included in panel (c) (gray line). Excitation was at 470 nm with a 5.0-nm bandwidth. The emission bandwidth was 5.0 nm. Spectra for all mutants were recorded with the same protein concentrations and gains, so that their amplitudes are comparable across constructs.

Analysis of Fluorescence Intensity for Each Fusion Construct. We next examined whether split EGFPs with dnaE inteins work as a probe for protein-protein interactions. The mutants shown in Table 1 were split into N- and C-EGFP fragments, which were directly linked to N- and C-terminal dnaE, respectively (Figure 2). To ensure that the N- and C-terminal halves of dnaE could be spatially proximal when particular protein interactions occurred, flexible peptide linkers containing Gly-Asn repeats were inserted in frame just after the end of N-terminal dnaE and just before the start of C-terminal dnaE. Calmodulin (CaM) and its target peptide, M13, were chosen as a model interacting pair because CaM was known to bind to M13 with a high affinity (Kd = 1 nM), and the structure of the CaM-M13 complex was well-resolved by NMR.27,28 Each recombinant plasmid was introduced into E. coli to produce the corresponding fusion proteins. To induce the splicing event in the bacteria, IPTG was added to the medium, and the proteins were expressed for 12 h at 25 °C. Fluorescence spectra of the lysates of E. coli carrying each plasmid are shown in Figure 3.

In the case of E. coli containing plasmid pETm144(/), a small change in the fluorescence spectrum was observed (Figure 3a). Coexpression of CaM and M13, which were linked, respectively, to N-EGFP-dnaEn and C-EGFP-dnaEc, increased the fluorescence intensity with its emission maximum at 510 nm (Figure 3a). The intensity was 3.0 times higher than that in the absence of CaM and M13. Upon expression of the fusion protein including the m157 mutant in the absence of CaM and M13, a small change in the fluorescence was observed, whereas the presence of CaM and M13 resulted in 3.8 times higher fluorescence intensity than that in the absence of CaM and M13 (Figure 3b). The maximum change in the fluorescence was obtained when pETins157(C/M) was transformed into the bacteria. The fluorescence intensity for pETins157(C/M) was 80 times higher than that obtained in the crude extract carrying pET_NVC∆SDlinker(C/M) (gray line in Figure 3c), a construct composed of VDE as the intein connected with EGFP split between amino acid residues 129 and 130 (see ref 19 for more information). In contrast, the fluorescence at 510 nm changed little in the case of pETm224(/) and pETm224(C/M) (Figure 3d).

The dissection point of EGFP was found to be important for obtaining higher fluorescence signals: EGFP dissected between positions 143 and 144 or between 224 and 225 resulted in only a small change in the fluorescence. Only the construct dissected at a surface loop between residues 157 and 158 generated the strong fluorescence intensity described above. The strong fluorescence intensities for pETm157(C/M) and pETins157(C/M) demonstrate that efficient reconstitution of the N- and C-terminal halves of EGFP occurred to form its fluorophore. Considering these facts, we concluded that the N- and C-terminal halves of EGFP dissected between residues 157 and 158 and directly linked, respectively, to the split dnaE inteins work best as an optical probe for detecting interactions between proteins.

(27) Ikura, M.; Clore, G. M.; Gronenborn, A. M.; Zhu, G.; Klee, C. B.; Bax, A. Science 1992, 256, 632-638.
(28) Porumb, T.; Yau, P.; Harvey, T. S.; Ikura, M. Protein Eng. 1994, 7, 109-115.

Identification of the Splicing Products. To examine whether the enhanced fluorescence at 510 nm indeed originated from the ligation due to protein splicing, the identity of the splicing products was confirmed by western blot analysis with anti-His6 and anti-GFP antibodies, which are specific for the His6 tag connected to N-EGFP and for C-EGFP, respectively. The results are shown in Figure 4a. In the cases of pETm144(C/M) and pETm157(C/M),

Table 2. Time-Dependent Increases in the Fluorescence Intensity by Protein Splicing(a)

                      fluorescence intensity (a.u.)
time (h)   pETm157(/)    pETm157(C/M)   pETins157(/)   pETins157(C/M)
1          5.7 ± 0.5     6.4 ± 0.6      8.9 ± 1.0      6.3 ± 0.7
2          7.5 ± 0.2     11.4 ± 0.7     18.3 ± 3.5     23.2 ± 0.3
4          11.9 ± 0.4    16.6 ± 1.3     18.9 ± 3.4     38.3 ± 0.5
8          29.5 ± 1.9    73.9 ± 9.9     24.4 ± 2.1     74.2 ± 8.0
12         34.6 ± 1.3    73.4 ± 4.7     38.2 ± 2.5     92.2 ± 3.2
24         41.9 ± 2.1    88.8 ± 7.6     37.4 ± 1.1     107.2 ± 1.3

(a) The incubations of the bacteria carrying each plasmid were performed at 25 °C for up to 12 h, and the temperature was then changed to 4 °C to avoid bacterial overgrowth.
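To make the time course in Table 2 easier to compare across constructs, the following Python sketch computes, for each time point, the ratio of the fluorescence intensity with CaM and M13 (C/M) to that without them (/). It is only an illustrative calculation: the assignment of the four value columns to the four plasmids follows the order in which they are listed in the table, the helper names are hypothetical, and the uncertainty of each ratio is propagated to first order under the assumption that the two measurements are independent, which the text does not state.

```python
import math

# Time points (h) and (mean, uncertainty) pairs transcribed from Table 2 (a.u.).
TIMES_H = [1, 2, 4, 8, 12, 24]
TABLE2 = {
    "pETm157(/)": [(5.7, 0.5), (7.5, 0.2), (11.9, 0.4),
                   (29.5, 1.9), (34.6, 1.3), (41.9, 2.1)],
    "pETm157(C/M)": [(6.4, 0.6), (11.4, 0.7), (16.6, 1.3),
                     (73.9, 9.9), (73.4, 4.7), (88.8, 7.6)],
    "pETins157(/)": [(8.9, 1.0), (18.3, 3.5), (18.9, 3.4),
                     (24.4, 2.1), (38.2, 2.5), (37.4, 1.1)],
    "pETins157(C/M)": [(6.3, 0.7), (23.2, 0.3), (38.3, 0.5),
                       (74.2, 8.0), (92.2, 3.2), (107.2, 1.3)],
}


def fold_enhancement(with_pair, without_pair):
    """Return (ratio, uncertainty) of two (mean, uncertainty) measurements.

    The uncertainty uses first-order propagation and assumes that the two
    measurements are independent (an assumption made for this sketch).
    """
    (a, sigma_a), (b, sigma_b) = with_pair, without_pair
    ratio = a / b
    sigma = ratio * math.sqrt((sigma_a / a) ** 2 + (sigma_b / b) ** 2)
    return ratio, sigma


def fold_series(construct):
    """Fold enhancement at every time point for 'pETm157' or 'pETins157'."""
    with_cm = TABLE2[construct + "(C/M)"]
    without_cm = TABLE2[construct + "(/)"]
    return [fold_enhancement(w, wo) for w, wo in zip(with_cm, without_cm)]


if __name__ == "__main__":
    for construct in ("pETm157", "pETins157"):
        print(construct)
        for t, (ratio, sigma) in zip(TIMES_H, fold_series(construct)):
            print(f"  {t:>2} h: {ratio:.2f} +/- {sigma:.2f}")
```

Under these assumptions the ratio for the ins157 construct exceeds that for the m157 construct at 4 h, in line with the earlier onset of fluorescence described for pETins157(C/M).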

Figure 4. In vivo trans-splicing of dnaE intein induced by the CaM-M13 interaction. Proteins expressed in E. coli were analyzed on a 15% SDS-PAGE gel. Western blotting was done with anti-His (lanes 1-3) and anti-GFP (lanes 4-6 and GFP) antibodies. Each transformant carried the plasmids (a) pETm144(C/M) (lanes 1 and 4), pETm157(C/M) (lanes 2 and 5), and pETins157(C/M) (lanes 3 and 6); (b) pETm144(/) (lanes 1 and 4), pETm157(/) (lanes 2 and 5), and pETins157(/) (lanes 3 and 6). Purified EGFP protein was used as the control for lane GFP.

major components specifically recognized by the anti-His6 antibody were 50 kDa, 30 kDa, and 25 kDa proteins, the sizes of which were estimated with a standard molecular marker (data not shown). The 50 kDa and 25 kDa proteins were the expected sizes of N-EGFP-dnaEn (30 kDa) plus CaM (17 kDa) and of EGFP (25 kDa), respectively. The 25 kDa protein was found to react with the anti-GFP antibody, which confirmed that the CaM-M13 interaction induced the splicing reaction to go to completion and ligate the N- and C-EGFP fragments, thereby forming mature EGFP. The other major component, the 30 kDa protein, was also recognized by the anti-GFP antibody. Its size was almost the same as the calculated mass of the unspliced C-terminal precursor protein plus the N-terminal half of EGFP. In the case of pETins157(C/M), the expressed protein products were the same in size as those for pETm157(C/M), except that the 25 kDa protein, corresponding to the size of the ligated EGFP, was completely absent. In the control experiment, the products after protein expression for pETm144(/), pETm157(/), and pETins157(/) were analyzed (Figure 4b). Most of the products were the unspliced proteins N-EGFP-dnaEn (30 kDa) and dnaEc-C-EGFP (15 kDa), together with EGFP ligated by protein splicing. The amounts of the ligated EGFP were small in comparison to the unspliced precursors.

The observation that all the crude extracts of E. coli carrying pETm144(C/M), pETm157(C/M), and pETins157(C/M) included the 30 kDa protein implies the occurrence of a distinct intermediate product in the splicing reaction. The exact mechanism of the splicing reaction of dnaE remains to be worked out. The reaction steps in the splicing process of VDE derived from Saccharomyces cerevisiae have been demonstrated, however, by analyzing intermediates and side products that accumulated as a result of amino acid substitution.29,30 Since the several amino acid residues requisite for the splicing reaction in the VDE system are known to be the same as those in the dnaE system,21,23-25 similar protein-splicing mechanisms are assumed to occur in dnaE as well (see Figure 5): step 1, N-S acyl rearrangement involving the Cys that constitutes the splicing junction at the N-terminus; step 2, transesterification involving the Cys that constitutes the junction at the C-terminus; and step 3, peptide cleavage coupled to succinimide formation involving the Asn located just before the Cys at the C-terminus. It is unlikely that spontaneous N-S acyl rearrangements in step 1 occur under physiological conditions, because they are induced, in general, under strongly acidic conditions. The equilibrium position of the N-S acyl rearrangement in step 1 was, however, found to be larger than that of a typical peptide bond,29 suggesting that amino acid residues in close proximity to the reaction center help to drive the N-S acyl rearrangement. This notion was supported by crystallographic analysis.31 The imidazole ring of His79 is in position to protonate the amide nitrogen of Cys1, thereby promoting the formation of the thioester.

Assuming this splicing mechanism, the occurrence of the above-described 30 kDa protein was expected as a branched intermediate. The existence of such a branched protein was, in fact, found by western blot analysis (Figure 4a); both anti-His and anti-GFP antibodies revealed the 30 kDa protein. We, therefore, concluded that after step 2, the rearrangement of the thioester intermediate yielded the branched molecule, M13-dnaEc-(N-EGFP)C-EGFP, that formed the EGFP fluorophore (Figure 5). The fact that the 30 kDa component of the splicing product for pETins157(C/M) was a branched intermediate and that no spliced-out EGFP was formed indicates that the splicing reaction stopped halfway, just after step 2 shown in Figure 5.
Upon protein-protein interactions, protein B (in Figure 5) labeled with the C-terminal half of dnaE-C-EGFP underwent an N-S acyl shift and transesterification to form the branched intermediate, which carried the EGFP fluorophore and remained uncleaved; thereby, only protein B that had interacted with protein A was fluorescently tagged with EGFP. Protein B thus labeled with EGFP can reveal its localization or time-dependent distribution in eukaryotic cells and will report on the fate of protein B in the cells after interaction with protein A.

(29) Chong, S.; Williams, K. S.; Wotkowicz, C.; Xu, M.-Q. J. Biol. Chem. 1998, 273, 10567-10577.
(30) Chong, S.; Shao, Y.; Paulus, H.; Benner, J.; Perler, F. B.; Xu, M.-Q. J. Biol. Chem. 1996, 271, 22159-22168.
(31) Poland, B. W.; Xu, M. Q.; Quiocho, F. A. J. Biol. Chem. 2000, 275, 16408-16413.


Figure 5. Proposed mechanism for protein splicing involving dnaE intein. Upon interaction of protein A and protein B, dnaEn and dnaEc are in close proximity to induce an N-S acyl shift (step 1) at the N-EGFP-dnaEn junction and to produce a thioester intermediate. This intermediate undergoes transesterification (step 2) at the dnaEc-C-EGFP junction to form a branched intermediate. Split EGFP in the branched intermediate refolds and forms EGFP fluorophore. The branched intermediate cleaves dnaEc intein (step 3) and is subject to an S-N acyl shift to generate GFP protein.

Estimation of the Reaction Time for Forming the EGFP Fluorophore. To investigate the time-dependent changes in the fluorescence intensity, E. coli carrying each plasmid containing the m157 mutant or the ins157 mutant was pelleted, and the fluorescence intensity of its crude extract was measured at 510 nm (Table 2). IPTG was added at time zero to the LB medium containing the E. coli. When the fluorescence of lysates of the bacteria carrying plasmids pETm157(/) and pETm157(C/M) was measured, the fluorescence intensity did not change for 4 h, and thereafter, it gradually increased. The rate of increase in the fluorescence intensity for pETm157(C/M) was 2.5 times faster than for pETm157(/). In the case of pETins157(/), the magnitude and rate of increase in the fluorescence intensities were almost the same as those for pETm157(/). In the presence of CaM and M13, however, the fluorescence intensities increased for 2 h upon IPTG induction. The rate of formation of the EGFP fluorophore upon protein-protein interactions is an important factor in using the present probe molecules for in vivo monitoring of the protein interactions. The split EGFP system with the VDE intein did not fulfill the need for sensitive detection of the fluorescence for this purpose; it took more than 3 days to obtain enough fluorescence signals.19 The present system required only ∼2-4 h to form the EGFP structure and emit the fluorescence. This improved rate of fluorophore formation may enable in vivo detection of any particular interactions between proteins in both eukaryotic and prokaryotic cells.

Figure 6. Bacterial screening for interacting proteins. (A) Fluorescence images of bacterial colonies on LB agar plates. BL21(DE3)pLysS cells were cotransformed with a mixture of plasmids pETm157(/) and pETm157(C/M); plated on LB agar plates containing 1.0 mM IPTG, 0.1 mM ampicillin, and 0.1 mM chloramphenicol; and incubated for 16 h at 37 °C. Inset: An expanded image of bright and dark bacterial colonies on LB agar plates. (B) A fluorescence image after elimination of background fluorescence. The background fluorescence of the LB agar and plastic plates was eliminated with the ImageGauge software. (C) Results of colony PCR amplification with primer pair dnaEn1-F-dnaEc1-R. Colony PCR products were electrophoresed on a 2% agarose gel. Lanes: M, molecular size marker (φX174 DNA/HaeIII); 1-6, bright colonies in A; 7-12, dark colonies in A; a, plasmid pETm157(C/M) (control); b, plasmid pETm157(/) (control).

Screening for in Vivo Protein-Protein Interactions by Using Formation of EGFP. One of the practical applications of the present split EGFP method is the identification of interacting partners by a simple genetic test, such as screening and selection on LB agar plates. To mimic the screening procedure, we mixed plasmids pETm157(/) and pETm157(C/M) in a molar ratio of 1:1 and cotransformed this mixture into BL21(DE3)pLysS. The transformants were plated on LB agar plates containing 1.0 mM IPTG. All of the cotransformed colonies were fluorescent (panel A in Figure 6). Around 50% of the colonies exhibited strong fluorescence, and the rest of the colonies fluoresced weakly. Upon elimination of autofluorescence from the LB medium and the plastic plate, half of the total colonies remained fluorescent (panel B in Figure 6). To examine whether this selection could be used to identify interacting proteins among an excess of noninteracting ones, a colony-PCR approach with the primers dnaEn1-F and dnaEc1-R was employed. The predicted products, amplified from pETm157(/) and pETm157(C/M), were 561 bp and 1071 bp, respectively. A very specific PCR product of ∼500 bp was amplified from the dark colonies, whereas a product of ∼1000 bp was amplified from the bright colonies (panel C in Figure 6). These results indicate that bacteria expressing specific interacting proteins fused to the two optical probes could be selected among a large number of irrelevant clones.
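The assignment of colonies from their colony-PCR band sizes can be sketched in a few lines. This is only an illustration under stated assumptions: the predicted sizes (561 bp and 1071 bp) are taken from the text, whereas the function name `classify_amplicon` and the 15% relative tolerance are hypothetical choices made here to accommodate sizes read approximately from an agarose gel.

```python
# Predicted colony-PCR product sizes (bp) for primer pair dnaEn1-F / dnaEc1-R,
# as stated in the text.
PREDICTED_BP = {
    "pETm157(/)": 561,
    "pETm157(C/M)": 1071,
}


def classify_amplicon(observed_bp, tolerance=0.15):
    """Assign an observed band size to the plasmid with the closest predicted size.

    Returns the plasmid name, or None when the closest predicted size differs
    from the observed size by more than `tolerance` (a relative deviation;
    the 15% default is an assumption, not a value from the paper).
    """
    name, predicted = min(
        PREDICTED_BP.items(), key=lambda item: abs(item[1] - observed_bp)
    )
    if abs(predicted - observed_bp) / predicted <= tolerance:
        return name
    return None


if __name__ == "__main__":
    # Approximate band sizes reported for dark and bright colonies.
    for label, size in (("dark colony", 500), ("bright colony", 1000)):
        print(label, size, "bp ->", classify_amplicon(size))
```

With the approximate band sizes reported above, the ∼500 bp product of the dark colonies is assigned to pETm157(/) and the ∼1000 bp product of the bright colonies to pETm157(C/M), in line with the interpretation of panel C in Figure 6.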
Several bacterial one- and two-hybrid systems have been proposed that share a common logic: protein-protein interactions induce transcriptional activation of a reporter gene and, thereby, produce a signal protein that accumulates in the bacteria.18,32 In contrast to these earlier methods, the present approach involves the reconstitution of EGFP without reporter genes; the protein-protein interaction under study does not need to take place in the vicinity of the transcription machinery. Hence, interactions that occur either in the cytosol or at the inner-membrane level can be screened. The selection system also permits a single-step isolation of the candidates in an in vivo context. With these advantages, the present system thus enables a simple genetic selection in bacteria of specific clones expressing particular interacting protein partners.

(32) Joung, J. K.; Ramm, E. I.; Pabo, C. O. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 7382-7387.

In conclusion, we demonstrated a highly sensitive and rapid split EGFP system for analyzing protein-protein interactions in vivo, which can be used to select and identify interacting protein partners. The dissection of EGFP between residues 157 and 158, counted from the N-terminal, was found to exhibit the highest fluorescence sensitivity when the N- and C-terminal halves of the split EGFP were ligated by protein splicing. The highly sensitive detection enabled bacterial screening and selection for interacting proteins. This method provides a rapid selection with bacteria in vivo and may be useful for high-throughput analysis and automation in a single-step selection. We envision that our genetic selection method will provide a powerful, broadly applicable tool for identifying and characterizing protein-protein interactions. It is also conceivable that this system can further expand the utility of analyzing interactions between proteins not only in prokaryotic and eukaryotic cells but also in organisms or living animals.

ACKNOWLEDGMENT

The authors thank Prof. S. Tabata, Kazusa DNA Research Institute, Chiba, Japan, for the gift of the PCC6803 genome. This work was financially supported by Core Research for Evolutional Science and Technology (CREST) of Japan Science and Technology (JST) and by grants for Scientific Research by the Ministry of Education, Science, and Culture, Japan.

Received for review June 26, 2001. Accepted September 17, 2001.

AC010717K