SEPTEMBER/OCTOBER 1997 Volume 8, Number 5 © Copyright 1997 by the American Chemical Society
COMMUNICATIONS
Identification of Preferred Distamycin-DNA Binding Sites by the Combinatorial Method REPSA
Paul Hardenbol,† Jo C. Wang, and Michael W. Van Dyke*
Department of Tumor Biology, The University of Texas M. D. Anderson Cancer Center, Houston, Texas 77030. Received April 21, 1997X
The combinatorial method restriction endonuclease protection, selection, and amplification (REPSA) was used to determine the preferred duplex DNA binding sites of the peptide N-methylpyrrolecarboxamide antibiotic distamycin A. After 12 rounds of REPSA, several sequences were identified that bound distamycin with an apparent affinity of 2-20 nM. Among these, the highest-affinity sites averaged 10 bp in length, suggesting that these sites may be occupied by multiple, cooperatively interacting distamycin molecules. Presently, REPSA is the only combinatorial approach that allows the identification of preferred DNA targets for small molecule ligands at physiologically relevant concentrations in solution. As such, it should prove useful in the design and screening of sequence-specific DNA-binding molecules.
* Address correspondence to this author at Department of Tumor Biology, Box 79, The University of Texas M. D. Anderson Cancer Center, 1515 Holcombe Blvd., Houston, TX 77030. Telephone: (713) 792-8954. Fax: (713) 794-4784. E-mail: [email protected].
† Present address: Department of Biochemistry, Stanford University, Stanford, CA 94305-5307.
X Abstract published in Advance ACS Abstracts, August 15, 1997.

Many drugs important in anticancer chemotherapy bind duplex DNA noncovalently and with some sequence specificity (1). Their mechanisms of action, while not fully elucidated, are thought to rely on the strength and selectivity of these interactions. Considerable efforts have been made to identify the preferred binding sites of these small molecules, primarily employing either chemical or enzymatic cleavage protection methods (2, 3). Such “footprinting” methods, while capable of defining a drug-binding site down to base pair resolution, typically only survey 100-200 bp per experiment. Surveys of several million base pairs, however, are required to identify their hypothetical best binding sites, those presumably recognized at physiological drug concentrations. To survey large numbers of potential binding sites, combinatorial methods involving multiple rounds of ligand-nucleic acid complex selection and PCR amplification are usually employed. These methods have succeeded in identifying preferred duplex DNA binding sites for many proteins and triplex-forming oligonucleotides (4, 5). The methods all relied on a physical separation of ligand-DNA complexes from free DNAs, either as a result of altered chemical properties of the ligand-DNA complex (e.g., increased mass-to-charge ratio and reduced electrophoretic mobility) or through the use of affinity methods (e.g., ligand-specific antibodies). However, such methods cannot be used to selectively isolate drug-DNA complexes; thus, no combinatorial method has as yet been available for their study.

We have recently described a combinatorial method,
618 Bioconjugate Chem., Vol. 8, No. 5, 1997
Hardenbol et al.
Figure 1. Structure of the DNA-binding antibiotic distamycin A.
restriction endonuclease protection, selection, and amplification (REPSA), for the identification of preferred nucleic acid and protein binding sites on duplex DNA (6). Unlike conventional methods, REPSA relies on the inhibition of an enzymatic process, DNA cleavage by a type IIS restriction endonuclease (IISRE), in selecting complexed DNAs. Thus, potentially, any ligand capable of being investigated by footprinting should also be amenable to analysis by REPSA.

As a test of REPSA’s applicability to the identification of preferred small molecule-DNA binding sites, the method was used to investigate the sites recognized by the natural product antibiotic distamycin A. Distamycin A, a tripeptide tri-N-methylpyrrolecarboxamide (Figure 1), binds as a 1:1 ligand-DNA complex in the minor groove of duplex DNA, preferentially at AT-rich sequences 4-5 bp in length (7, 8). For a series of (A/T)4 sequences, distamycin demonstrated binding affinities ranging from 0.1 to 1.0 µM (9). Numerous studies have shown that distamycin effectively inhibits DNA cleavage by both nonspecific endonucleases and restriction endonucleases (reviewed in ref 10) and that higher-affinity, multimeric distamycin binding sites exist in heterogeneous DNA (11). Taken together, these findings suggest that distamycin would provide an ideal test for REPSA.

A round of REPSA involves three steps: complex formation between a ligand and a subpopulation of DNAs, preferential cleavage of the uncomplexed DNAs, and PCR amplification of the uncleaved templates (Figure 2). Selection of bound DNAs is made possible through the use of IISREs, restriction endonucleases that cleave duplex DNA not at a specific sequence but rather at a fixed distance from their recognition sites (12). Given the proper ligand/IISRE combination, it is possible to efficiently cleave uncomplexed DNAs while maintaining the complexed DNAs intact.
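The three steps of a round can be illustrated with a toy calculation. The Python sketch below is not part of the original study: it assumes a simple 1:1 binding isotherm with ligand in excess, a hypothetical IISRE cleavage efficiency of 90%, and a made-up pool of only two affinity classes. The names `repsa_round`, `run_repsa`, and `cleavage_efficiency` are illustrative.

```python
def repsa_round(pool, ligand_nM, cleavage_efficiency=0.9):
    """Apply one idealized REPSA round to a pool.

    pool maps a dissociation constant (nM) to the fraction of templates
    with that affinity. All parameter values are illustrative assumptions.
    """
    survivors = {}
    for kd_nM, fraction in pool.items():
        # Step 1: complex formation (assumed 1:1 isotherm, ligand in excess).
        bound = ligand_nM / (ligand_nM + kd_nM)
        # Step 2: the IISRE cuts unbound templates with the given efficiency.
        uncut = bound + (1.0 - bound) * (1.0 - cleavage_efficiency)
        survivors[kd_nM] = fraction * uncut
    # Step 3: PCR amplifies only uncut templates; modeled as renormalization.
    total = sum(survivors.values())
    return {kd_nM: amount / total for kd_nM, amount in survivors.items()}


def run_repsa(pool, ligand_nM, rounds):
    """Return the pool composition after each round (index 0 = start)."""
    history = [dict(pool)]
    for _ in range(rounds):
        history.append(repsa_round(history[-1], ligand_nM))
    return history


if __name__ == "__main__":
    # Hypothetical pool: 0.1% tight binders (5 nM), 99.9% weak (1000 nM).
    start = {5.0: 0.001, 1000.0: 0.999}
    for n, state in enumerate(run_repsa(start, ligand_nM=100.0, rounds=12)):
        print(f"round {n:2d}: tight-binder fraction = {state[5.0]:.4f}")
```

With these made-up numbers, the tight-binder fraction rises from 0.1% to well above 99% within 12 rounds. A real selection would depend on the actual cleavage efficiency, PCR bias, and binding behavior.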
These intact DNAs then serve as templates for PCR amplification, thereby increasing the representation of these sequences within the pool. These steps can then be repeated for a number of rounds, until a desired population of ligand-binding sequences emerges.

Central to REPSA is the design of a selection template, which allows the probing of a region of randomized sequence by IISREs. The selection template used in this analysis, ST3, is shown schematically in Figure 3. ST3 incorporates a central 21 bp cassette of randomized sequence, flanked by defined sequences containing different IISRE recognition sites. This length of random sequence is many times that of an individual distamycin binding site (5 bp). Thus, ST3 allows screening for multimeric distamycin binding sites, which are believed to have higher affinities (11). The flanking sequences were designed to allow interconversion of the IISRE sites, from BpmI to BsgI or Eco57I, depending on the primers used. Such a design provides greater flexibility with this template and allows the rotation of IISREs in different REPSA rounds, which can be essential for preventing artifact selection (e.g., selection of sequences bound by proteins present in the commercial restriction endonuclease preparations) (6). Each of the IISREs has an identical span (16 nucleotides in the top strand, 14 nucleotides in the bottom strand) between binding and cleavage sites. Also, having identical IISRE sites on both flanks increased the efficiency of template probing by some of the less efficient IISREs (e.g., BsgI) (6).

Figure 2. Flow chart for the combinatorial method restriction endonuclease protection, selection, and amplification (REPSA). The tricyclic ligand represents the DNA-binding drug distamycin A; a type IIS restriction endonuclease is represented by a scissors (nonspecific cleaving domains) attached to gray ovals (sequence-specific DNA-binding domains).

Figure 3. Design of the selection template, ST3, used for the identification of preferred distamycin-DNA binding sites by REPSA. Locations of restriction endonuclease binding (brackets) and cleavage (arrows) sites are indicated. Long horizontal arrows correspond to the sequences of the PCR amplimers indicated. N represents random nucleotides.

To initiate REPSA, 1 ng of ST3 was incubated (30 min, 25 °C) with 0.1 µM distamycin in 10 µL of a buffer suitable for subsequent IISRE cleavage. The amount of selection template used (25 fmol) was optimal for subsequent manipulations and provided a good representation of all possible 17-mer sequences; the concentration of distamycin corresponded to its lowest dissociation constant reported for binding to a 4 bp A/T-rich sequence (9). After drug binding, 2 units of BpmI (New England Biolabs) was added and cleavage allowed to proceed for an additional 30 min at 37 °C. Afterward, the entire
Table 1. REPSA-Selected Distamycin Binding Sequences
clone | sequence (a) | Kapp (nM) (b)
(a) Sequences obtained following 12 rounds of REPSA with 0.1 µM distamycin. Distamycin binding sites, as determined by DNase I footprinting, are underlined.
(b) Concentrations of distamycin that afforded 50% DNase I cleavage protection.
mixture was added to a standard PCR reaction mixture (6) containing 200 ng each of BsgI(L) and BsgI(R) primers and amplified for 6, 9, and 12 cycles with a profile of 94 °C for 1 min of denaturation, 25 °C for 3 min of annealing, and 50 °C for 2 min of elongation. The annealing step was added to facilitate hybridization between the BsgI primers and the BpmI site-containing ST3, given the two base mismatches. After PCR, an aliquot of each reaction mixture was analyzed by PAGE to determine relative levels of amplification; the remainder was purified by phenol extraction and ultrafiltration, essentially as previously described (6). Selections were performed with BpmI (rounds 1, 4, 7, and 10), BsgI (rounds 2, 5, 8, and 11), or Eco57I (rounds 3, 6, 9, and 12), with the appropriate amplimers being used in the preceding round of PCR.

After 12 rounds of REPSA, the selected DNAs were subcloned into pUC19 and individual clones were sequenced. Sequences of 15 clones containing the entire 21 bp cassette were obtained (Table 1). These selected sequences were substantially enriched in A’s and T’s with respect to the starting distribution (83.2% versus 58.5%). Using a χ² analysis, we identified possible consensus distamycin binding sequences. Among all possible 4, 5, and 6 bp combinations, only the sequences TATA, ATATA, and AATTAT occurred with substantially higher than expected frequencies (P < 0.05). Most notably, the 9 bp sequence ATAAATTAT was found twice among these 15 sequences (P < 0.01), suggesting this could be a highly preferred distamycin binding site under our experimental conditions.

To better ascertain the exact distamycin binding sites and their relative binding affinities, a DNase I footprinting analysis was performed. An autoradiogram of a footprinting analysis for clone 2 (Table 1) is shown in Figure 4. DNase I cleavage protection was first observed when 3 nM distamycin was present, with cleavage protection occurring over an 11 bp region centered on the
Figure 4. DNase I footprinting analysis of distamycin A binding to REPSA-selected clone 2. A 160 bp footprinting probe was generated by PCR using the pUC sequencing primers TGTTGTGTGGAATTGTG and CAAGGCGATTAAGTTGG, the latter being end-labeled with [γ-32P]ATP and T4 kinase. Footprinting reactions were performed essentially as previously described, except with the addition of 50 mM KCl in the binding reaction (13). The extents of the DNase I protection afforded by 3 and 30 nM distamycin are indicated at the right of the figure.
sequence ATAAATTAT (Table 1, underlined). At a 10-fold higher distamycin concentration, cleavage protection extended over the entire 21 bp cassette. Note that, under these conditions, the more extensive cleavage protection most likely reflected the binding of multiple distamycin molecules to the DNA. Using densitometry, it was possible to estimate the concentration of distamycin necessary to afford 50% DNase I protection. This value corresponded to the apparent binding affinity of distamycin for this sequence. Similar footprinting analyses were performed for the other REPSA-selected sequences; initial distamycin-binding sites (underlined) and binding affinities are presented in Table 1. From this analysis, we found that the distamycin binding affinity of these sequences ranged from 2 to 20 nM and that the initial protections typically incorporated the consensus sequences identified by statistical means. Notably, these binding affinities are 10-100-fold greater than those previously described for most distamycin binding sites (9), indicative of the preferred nature of REPSA-selected sequences. Also note that these initial binding sites are
larger than those typically described for distamycin (averaging 10 bp for the highest-affinity sites). This suggests that these preferred sites may actually be bound by multiple, cooperatively interacting distamycin molecules. Alternatively, these sites could be composed of several overlapping subsites with the sum of their protections yielding an apparently higher-affinity site. However, the relatively uniform DNase I protection observed throughout each site would argue against the latter hypothesis.

These studies demonstrate that REPSA can be used to identify the preferred DNA-binding sequences of small molecule ligands such as distamycin. Binding sites in the nanomolar range were obtained, and possible consensus sequences were identified. The observed clustering of binding affinities (13 of the 15 sequences had affinities in the range of 2-8 nM) could reflect the actual limit for distamycin binding, though it may instead reflect a limitation of the selection conditions used (e.g., 100 nM final distamycin concentration). Further REPSA with this selected population at a reduced distamycin concentration should help to address this question. Similarly, identification of a single consensus sequence was not possible with this limited data set. Upon identification of a selected population possessing maximal affinity, and with a sufficiently large number of sequences, it should be possible to define a best binding sequence for this molecule. Nonetheless, the sequences identified by REPSA are a considerable improvement over previously defined sequences and should be of great utility in many future studies on this class of molecules (e.g., for better understanding of DNA recognition by minor groove-binding ligands; to test for direct regulation of specific gene transcription by DNA-binding drugs).
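The possibility that 100 nM distamycin limits discrimination among nanomolar sites can be illustrated with a back-of-the-envelope calculation. The sketch below assumes a simple 1:1 binding isotherm with ligand in excess, which cooperatively bound multimeric sites may not obey. It uses the apparent affinities quoted in the text (2, 8, and 20 nM) only as example inputs. The function name `fraction_bound` and the 1 nM comparison concentration are hypothetical.

```python
def fraction_bound(ligand_nM, kd_nM):
    """Fractional site occupancy for 1:1 binding with ligand in excess."""
    return ligand_nM / (ligand_nM + kd_nM)


if __name__ == "__main__":
    example_kds_nM = (2.0, 8.0, 20.0)  # apparent affinities used as examples
    for ligand_nM in (100.0, 1.0):     # 1 nM is an assumed comparison point
        occupancy = {kd: fraction_bound(ligand_nM, kd) for kd in example_kds_nM}
        # Ratio of occupancies: how strongly the tightest site is favored.
        advantage = occupancy[2.0] / occupancy[20.0]
        print(f"[distamycin] = {ligand_nM:5.1f} nM")
        for kd, value in occupancy.items():
            print(f"  Kd = {kd:4.1f} nM -> fraction bound = {value:.3f}")
        print(f"  2 nM vs 20 nM occupancy ratio = {advantage:.2f}")
```

Under these assumptions, a 2 nM site is only about 1.2 times as occupied as a 20 nM site at 100 nM ligand, whereas at 1 nM the ratio is 7. This is consistent with the suggestion that selection at a reduced distamycin concentration would be more stringent.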
Beyond being the only combinatorial approach suitable for investigating drug-binding sites on duplex DNA, REPSA has the additional benefit of being suitable for studies with complex mixtures of relatively uncharacterized ligands. This fact was demonstrated in our REPSA investigation of purine-motif triplex-forming sequences, in which consensus sequences for proteins present in the IISRE preparation were also identified (6). Thus, it is conceivable that a combinatorial method like REPSA could be used to survey a library of compounds and identify a series of preferred binding sites. These selected sequences could then be back-selected against an array of these compounds, thereby identifying the sequence recognized by a particular compound. Indeed, it may be possible to use a combinatorially derived series of compounds to ascertain the range of sequences recognized by a series of related compounds. Such a two-dimensional combinatorial study could be especially useful in the development of small molecules with defined sequence specificities, e.g., the N-methylpyrrole/2-methylimidazole polyamides (14, 15), these being potentially useful for catalyzing gene replacements or for pharmacological regulation of specific gene expression.

ACKNOWLEDGMENT
We thank Christine Hoover and Jing Shen for their participation in this project and Michèle Sawadogo for critical reading of the manuscript. This research was supported by grants from the Welch Foundation (G-1199) and the American Cancer Society (RPG-97-028-01-DHP) and by a Physicians’ Referral Service research award. P.H. was the recipient of a National Research Service Award predoctoral traineeship from the National Cancer Institute (T32 CA60440).
LITERATURE CITED
(1) Neidle, S., and Waring, M. J. (1993) Molecular Aspects of Anticancer Drug-DNA Interactions, Macmillan, London.
(2) Van Dyke, M. W., Hertzberg, R. P., and Dervan, P. B. (1982) Map of distamycin, netropsin, and actinomycin binding sites on heterogeneous DNA: DNA cleavage inhibition patterns with methidiumpropyl-EDTA·Fe(II). Proc. Natl. Acad. Sci. U.S.A. 79, 5470-5474.
(3) Lane, M. J., Dabrowiak, J. C., and Vournakis, J. N. (1983) Sequence specificity of actinomycin D and netropsin binding to pBR322 DNA analyzed by protection from DNase I. Proc. Natl. Acad. Sci. U.S.A. 80, 3260-3264.
(4) Szostak, J. W. (1992) In vitro genetics. Trends Biochem. Sci. 17, 89-93.
(5) Wright, W. E., and Funk, W. D. (1993) CASTing for multicomponent DNA-binding complexes. Trends Biochem. Sci. 18, 77-80.
(6) Hardenbol, P., and Van Dyke, M. W. (1996) Sequence specificity of triplex DNA formation: analysis by a combinatorial approach restriction endonuclease protection selection and amplification. Proc. Natl. Acad. Sci. U.S.A. 93, 2811-2816.
(7) Coll, M., Frederick, C. A., Wang, A. H. J., and Rich, A. (1987) A bifurcated hydrogen-bonded conformation in the d(A·T) base pairs of the DNA dodecamer d(CGCAAATTTGCG) and its complex with distamycin. Proc. Natl. Acad. Sci. U.S.A. 84, 8385-8389.
(8) Pelton, J. G., and Wemmer, D. E. (1988) Structural modeling of the distamycin A-d(CGCGAATTCGCG)2 complex using 2D NMR and molecular mechanics. Biochemistry 27, 8088-8096.
(9) Abu-Daya, A., Brown, P. M., and Fox, K. R. (1995) DNA sequence preferences of several AT-selective minor groove binding ligands. Nucleic Acids Res. 23, 3385-3392.
(10) Dabrowiak, J. C. (1983) Sequence specificity of drug-DNA interactions. Life Sci. 32, 2915-2931.
(11) Samuelson, P., Jansen, K., and Kubista, M. (1994) Long-range interactions between DNA-bound ligands. J. Mol. Recognit. 7, 233-241.
(12) Szybalski, W., Kim, S. C., Hasan, N., and Podhajska, A. J. (1991) Class-IIS restriction enzymes - a review. Gene 100, 13-26.
(13) Musso, M., and Van Dyke, M. W. (1995) Polyamine effects on purine-purine-pyrimidine triple helix formation by phosphodiester and phosphorothioate oligodeoxyribonucleotides. Nucleic Acids Res. 23, 2320-2327.
(14) Geierstanger, B. H., Mrksich, M., Dervan, P. B., and Wemmer, D. E. (1996) Extending the recognition site of designed minor groove binding molecules. Nat. Struct. Biol. 3, 321-324.
(15) Trauger, J. W., Baird, E. E., and Dervan, P. B. (1996) Recognition of DNA by designed ligands at subnanomolar concentrations. Nature 382, 559-561.