Recognizing DNA Structures - C&EN Global Enterprise (ACS

SPECIAL REPORT

Recognizing DNA Structures

Jacqueline K. Barton, Columbia University

September 26, 1988 C&EN

Opening illustration (labels only): triple strands, hairpin, cruciforms, left-handed sites, bends, supercoils, single-stranded loops.

It is easy to imagine the DNA double helix as described by James D. Watson and Francis Crick. The sugar phosphate backbones of the two strands in the double-stranded helix wind around, as in a spiral staircase, with the four nucleotide bases that hydrogen-bond one to another (guanine to cytosine, adenine to thymine) making up the steps. But this model is only a rough sketch of the structure of DNA. DNA is actually a polyanion, with one negative charge for each nucleotide, or a charge of -2 for each step along the helix. The helical column is, therefore, awash in negative charge. The nucleotide bases are in the interior of this column, yet molecules on the outside are able to recognize specific sequences of bases and bind with fidelity at precise locations.

That many molecules can recognize sites on DNA that correspond to specific base sequences is well known. A clear example is the class of enzymes called restriction endonucleases, discovered by Werner Arber, Hamilton Smith, and Daniel Nathans in the late 1970s. Restriction enzymes have played a key role in the biotechnology revolution in the past decade. These enzymes are able to seek out and bind to a specific four- to eight-base-pair sequence of DNA, recognize that particular sequence in the presence of all others, and then cleave each of the two strands of DNA at the bound site. With these molecular scissors, scientists can snip out almost at will a particular segment of DNA, and then, using oligonucleotide synthetic methods and a battery of other specific enzymatic tools for splicing genes, prepare a new segment of DNA and insert it into the gene of interest. As tools for molecular biology, these enzymes have been essential; as chemical illustrations of molecular recognition, they are remarkable.

There are many other examples of sequence-specific DNA-binding proteins. Some, including the restriction endonucleases, play key roles in the defense of healthy cells against invading organisms. Some DNA-binding proteins regulate the expression of the genetic information, amplifying or perhaps repressing the transcription of a gene by RNA polymerases. Others are involved in the repair and replication of a cell's own DNA, and still others, found in viruses, allow these invading organisms to insert their own DNA into that of the host cell, thereby providing for their own survival and expression.

What these macromolecules share is an exquisite ability to recognize and distinguish specific sequences along the DNA polymer. The extent of this discrimination is remarkable. For restriction enzymes that recognize six-base-pair sites, for example, the ratio of specific-site binding to nonspecific association with DNA is commonly 10,000 to 1.

Small molecules, too, both natural products and fully synthetic molecules, are being discovered that bind to specific DNA sites. They are finding application both in pharmacological design and as tools for molecular biology. Understanding how to target a synthetic molecule or natural product derivative so that it will bind to a particular site on a long DNA helix (to a specific 15-base sequence, for example, on a 100 million base chromosome) would be extremely useful. Drugs might then be more rationally designed to bind selectively and damage or activate a gene of interest.

Additionally, many scientists are working to develop technologies to map all the genes of the human genome. For such an enormous project, reagents will be needed to recognize and snip DNA into discrete, manageable chunks for mapping. Our current palette of restriction enzymes is too small and, for the most part, not sufficiently specific for such work.

Sequence-specific DNA binding raises many intriguing questions for chemists: How, chemically, might this sequence discrimination work? What are the factors governing recognition? Can these factors be mimicked? Can we design molecules with still greater discrimination?

One factor important in accounting for site specificity involves hydrogen bonding interactions. Along the helical spiral of DNA are two grooves, called major and minor, which have distinctive widths and depths. (If the helix is extended and unwound, so that it looks like a two-dimensional ladder, the major groove is the front of the ladder, and the minor groove, containing connections to the sugar, the back.) Along the floor of the DNA grooves are the base pairs with donor and acceptor sites for hydrogen bonding that may be read and distinguished by molecules that bind to DNA. It has become clear that there is no simple code for reading of the bases by peptide side chains on the DNA-binding proteins, but hydrogen bonding interactions surely must play some part in the recognition process.

Another factor that is becoming increasingly important to consider is that the view of DNA as a regular, right-handed helical column may in itself be somewhat oversimplified. Along the DNA strands there are bends, kinks, loops, left-handed elements: a range of structural variations. Spectroscopic methods and x-ray crystallographic studies of newly synthesized oligonucleotides are leading to detailed structural pictures of different conformations of DNA. Chemical techniques are providing strategies to look along the DNA strands, and we are finding that DNA is actually quite polymorphic.

Figure (labels: A-DNA, B-DNA, Z-DNA, each with major and minor grooves marked): The three general families of DNA double helices are shown here. Both A- and B-DNAs are right-handed helical structures, whereas Z-DNA spirals in a left-handed sense. B-DNA has distinctive major and minor grooves, of particular widths and depths. A-DNA has a very shallow and wide minor groove. The Z helix, long and slender, has a shallow, almost convex major groove, and a very narrow minor groove. These drawings are based on oligonucleotide crystal structures determined, respectively, by Olga Kennard and coworkers at Cambridge University, Richard E. Dickerson and coworkers at UCLA, and Alexander Rich and coworkers at MIT. The red spheres represent oxygen atoms, the light blue are phosphorus atoms, green are carbon atoms, and dark blue are nitrogen atoms. Hydrogen atoms are not shown. This picture and all the other computer graphics in this article were obtained using the program Macromodel, written by W. Clark Still and coworkers at Columbia University.

This report examines these variations in structure, the diversity of conformations of double-stranded DNA, the techniques being used to elucidate them, and the question of whether these conformations are mere test-tube curiosities or actually exist in functioning cells. Variations in local DNA conformation may be important elements in the targeting of specific DNA sites. Perhaps they are important also in thinking chemically about the regulation and expression of the information contained along the strand.
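The introduction's remarks on specificity (four- to eight-base-pair restriction sites, and a single 15-base target on a 100 million base chromosome) can be put in rough numerical terms. The sketch below is illustrative only: it assumes a random sequence with equal base frequencies, which real genomes do not have, and the function names are hypothetical. The EcoRI recognition sequence GAATTC is used as the example site, searched for in the dodecamer sequence CGCGAATTCGCG discussed later in the article.

```python
def expected_sites(site_length: int, genome_length: int) -> float:
    """Expected count of one specific site in a random sequence.

    Assumes equal frequencies of A, C, G, and T (an idealization).
    """
    return genome_length / 4 ** site_length


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def find_sites(dna: str, site: str) -> list[int]:
    """0-based start positions of every occurrence of site in dna."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


# A six-base site is expected about once per 4**6 = 4,096 bases.
six_cutter = expected_sites(6, 4096)

# A 15-base site is expected less than once in 100 million bases,
# so it can in principle mark a unique location.
fifteen_mer = expected_sites(15, 100_000_000)

# The EcoRI site reads the same on both strands (it is palindromic).
ecori_is_palindrome = reverse_complement("GAATTC") == "GAATTC"

# Locate the EcoRI site in the dodecamer.
hits = find_sites("CGCGAATTCGCG", "GAATTC")
```

Under these assumptions a six-base site recurs too often to cut a chromosome into a few large pieces, which is one way to read the article's point that the current palette of restriction enzymes is not sufficiently specific for genome mapping.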

DNA's many structures

Since the double helical structure of DNA was first described by Watson and Crick in 1953, it has been clear that double-stranded polynucleotides ought to be able to adopt a wide family of conformations. Most of the true high-resolution structural information about DNA conformations, however, comes from a small collection of crystal structures of oligonucleotides determined over the past 10 years.

Earlier fiber diffraction studies in natural and synthetic polymers led to the identification of A, B, C, and D forms of double helical DNA, with modified versions of each. These right-handed helical structures differ in the pitch of the helix, its diameter, the tilt of the bases relative to the helical axis, and the dimensions of the DNA grooves. In 1979, the full three-dimensional crystal structure of a double-stranded complex made of two six-unit chains of alternating cytosine and guanosine nucleotides was determined by Alexander Rich and coworkers at Massachusetts Institute of Technology. This structure revealed an even more striking variation in conformation: a left-handed helical structure called Z-DNA. Earlier spectroscopic studies in solution had led investigators to consider such a conformation, but it was only identified and characterized in detail with the MIT crystal structure.

The helical conformations of DNA can now be classified into three general families: the A, B, and Z forms (C and D forms may be considered modified versions of A- and B-DNA). Although each conformation involves a helix made up of two antiparallel polymer strands with the bases paired through Watson-Crick hydrogen bonding, the overall shapes of the helices are quite different.

B-DNA, the predominant form, is a regular, right-handed helix with base pairs oriented essentially perpendicular to the helix axis and distinctive major and minor grooves of well-defined width and depth. Our clearest view of the B conformation is derived from the crystal structure of a dodecamer determined by Richard E. Dickerson and coworkers at the University of California, Los Angeles, in 1980.

The A-helical form is seen in fibers at low hydration and is the predominant conformation of DNA-RNA hybrids and double-stranded RNA segments. Several crystal structures of oligonucleotides in the A conformation have been determined. The shape of this helix can most easily be understood by thinking about the chemical constraints of an RNA. To minimize steric constraints on the 2'-hydroxyl group of ribose, the sugar ring puckers into the C-3'-endo conformation, which leads to the tilting of the base pairs with respect to the helix axis by as much as 20° and movement outward of the base pairs from the center of the helix. As a result, the major groove is deepened, narrowed, and made virtually inaccessible to binding by other molecules. The surface structure of the A-form helix becomes defined instead by the topology of its shallow and wide minor groove. (The terms major and minor describe the size of the grooves only in B-DNA.)

Z-DNA is not a left-handed version of either the A or B helices. The Z-DNA helix zigzags: It alternates both in the puckering of the sugar ring (between A-like and B-like forms) and in the angle of the base about the glycosidic bond (between anti and syn forms). The 180° rotation of the bases about the glycosidic bond accounts for the left-handed chirality of this helix, certainly its most distinctive feature. Z-DNA can be best considered as a long, slender helix with a wide and shallow, almost convex, major groove and a minor groove pinched down into a narrow crevice.

Crystal structures of oligonucleotides occurring in each of these conformations reveal additional details about structural variations that perhaps differ from the common view of the DNA double helix. The two heterocyclic bases paired one to another do not reside in the same plane or stack neatly perpendicular to the helix axis. Instead, there is a propeller twisting of the bases one to another and a roll of the bases relative to the axis of the helix. These features may also vary with sequence, though the data from the few crystal structures obtained to date do not provide enough samples to know for sure.

Christopher R. Calladine at Cambridge University has developed rules that relate the extent of propeller twisting to sequence and consequent structure. Moving down the strand in the 5' to 3' direction, for example, if a purine stretch is followed by a pyrimidine stretch, a clash is likely to ensue in the major groove, one that may substantially open up the minor groove. Such local variations in the structure of the helix may help distinguish protein binding sites along the strand.

Structure of double helix differs in A-, B-, and Z-DNA

DNA form                                      A            B                        Z
Helical sense                                 Right        Right                    Left
Base pairs per repeating unit                 1            1                        2
Mean base pairs per turn of helix             10.7         10                       12
Base inclination from normal to helix axis    19°          -1°                      -9°
Rise per base pair along the helix axis       2.3 Å        3.3-3.4 Å                3.8 Å
Pitch per turn of helix                       24.6 Å       33.2 Å                   45.6 Å
Sugar pucker conformation                     C-3'-endo    O-1'-endo to C-2'-endo   C-2'-endo at cytosine; C-2'-exo (C-3'-endo) to C-1'-exo at guanine
Glycosyl angle conformation                   Anti         Anti                     Anti at cytosine, syn at guanine

DNA's helical conformation depends on conformation of each nucleotide in the polymer

The sugar rings of each nucleotide in a DNA strand may be puckered in a variety of ways. Two extreme sugar pucker conformations are C-2'-endo, seen commonly in B-form polymers, and C-3'-endo, found frequently in A-form structures. Such variations can lead to a change in orientation of the base pairs with respect to the helix axis, resulting, for example, in a tilting and outward movement of the bases with respect to the center of the helix.

The orientation of the bases with respect to the sugar rings can also lead to dramatic changes in the gross structure of the helix. This variation is determined in part by the rotation of the base about the glycosidic bond. Most frequently, bases are positioned anti with respect to the sugar. A rotation of the base 180° about the glycosidic bond leads to the conversion to the syn conformation. In the Z form of alternating purine-pyrimidine polymers, the pyrimidines are anti to the sugar rings, but the purines are syn. The consequence, a left-handed, zigzagging helix, is dramatic.

(Diagram label: syn-adenine.)

Some unusual structures

Sometimes quite gross variations in structure occur along the strand. These have not been characterized crystallographically, so an understanding of these structures is more uncertain. In fact, their detection and characterization have been based more on their site-specific recognition by DNA enzymes or small chemical reagents. Homopurine-homopyrimidine stretches, for example, when torsionally wound, tend to adopt a distinct conformation that makes the site hypersensitive to the enzyme S1 nuclease. This enzyme usually cleaves single-stranded regions. Proposed structures for S1 hypersensitive sites (none of which are single-stranded) include left-handed segments, A-like conformations, and even looped-back triple-stranded structures, called H-form.

Another distinct conformation of interest is the cruciform. Palindromic sequences, which have inverted repeated elements, have the potential to extrude into intrastrand hydrogen-bonded loops, and do so when torsionally constrained. At least four bases are needed to form the stem of the loop. The single-stranded tip of the loop provides an excellent target for single-strand-specific nucleases, and these have been used to detect cruciform formation. David M. J. Lilley and coworkers at the University of Dundee in Scotland have employed elegant experiments to characterize the kinetics and thermodynamics of cruciform extrusion, which seem to depend as much on the sequences that flank the cruciform as on the sequence in it.

Although cruciforms are usually drawn as two-dimensional crosses, information about their three-dimensional structure is sketchy at best. Some assistance in this regard may be coming from the laboratory of Nadrian C. Seeman at New York University, who has synthesized oligonucleotide sequences that are constrained by hydrogen bonding considerations to adopt the central four-strand junction of a cruciform. This junction may be a critical structural intermediate in genetic recombination.

Another illustration of DNA polymorphism can be seen in the extraordinary lace pattern of kinetoplast DNA found in the mitochondria of a small protozoan parasite called a trypanosome. The DNA in these kinetoplasts, much of which is not transcribed, appears to be packaged into an intricate pattern of small and large circles and represents a striking example of sequence-directed DNA polymorphism. The circles are encoded not by bound proteins but by the DNA itself, which contains sequences that direct the bending of DNA. Recent experiments show that segments containing five or six contiguous adenine bases cause DNA to bend; three adenine bases in a row may even be sufficient. Whether the flanking sequences are important is still in dispute. Other sequences may also promote DNA bending. Now that it is clear that DNA need not be straight, a flurry of activity to investigate alternative possibilities has begun.

Kinetoplast DNA fragments that contain frequent and periodically spaced adenine stretches migrate more slowly in a polyacrylamide gel than would be expected based on the size of the fragments. Donald M. Crothers and coworkers at Yale University have synthesized a series of oligonucleotides with pairs of sequences containing five and six adenines in a row. These putative bending sequences are separated by variable distances, and the researchers measure the effect on gel electrophoretic mobilities of these different spacings. When the bent DNA segments are separated by a full helical period, the bends "add up" and produce a global curvature to the DNA fragments that leads to decreased gel electrophoretic mobility. Similarly, adenine sequences separated by one half of a helical period straighten out the DNA and lead to gel electrophoretic mobility that is similar to that of DNAs lacking the bending sequences.
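The phasing argument in the gel experiments can be illustrated with a small vector-sum sketch. This is not the analysis used by the Yale group; it is a simplified model offered only to show why spacing matters. It assumes each adenine stretch contributes a small bend of fixed size whose direction rotates around the helix axis with position, a helical repeat of 10.5 base pairs per turn (an assumed value; the article's table lists 10 for B-DNA), and a bend of 20° per stretch (the figure quoted later in the article for adenine stretches). The function name `net_bend` is hypothetical.

```python
import cmath
import math


def net_bend(positions, bend_deg=20.0, helical_repeat=10.5):
    """Approximate net bend (degrees) from several small bends.

    Each bend is treated as a vector in the plane perpendicular to
    the helix axis. Its direction advances by one full turn for every
    helical_repeat base pairs, so the bend at position p (in base
    pairs) points along the angle 2*pi*p/helical_repeat. Adding the
    vectors is a small-angle approximation, not an exact geometry.
    """
    total = sum(
        bend_deg * cmath.exp(2j * math.pi * p / helical_repeat)
        for p in positions
    )
    return abs(total)


# Two bends one full helical period apart point the same way and add.
in_phase = net_bend([0, 10.5])

# Two bends half a period apart point in opposite directions and cancel.
out_of_phase = net_bend([0, 5.25])
```

In this toy model the in-phase pair gives twice the single bend and the out-of-phase pair gives essentially none, which matches the qualitative pattern described above: slower gel migration for full-period spacing and near-normal migration for half-period spacing.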

The detailed structure and function of bent DNA have become topics of hot debate. In some proposed models the bending is localized at the junction between two different conformations. Other models consider the entire adenine stretch to be a smoothly curving segment. Such bending sequences might have important biological function. Some researchers have suggested, for example, that the higher-order folding of DNA in chromatin may be encoded in sequences such as these.

A recent crystal structure of an oligonucleotide containing a string of adenines, by Aaron Klug and coworkers at Cambridge University, shows bending at the ends of the adenine stretch, but not of the magnitude nor in the direction that biochemical experiments would predict. This crystal structure does show some interesting structural features of adenine stretches, such as an array of bifurcated hydrogen bonds between severely propeller-twisted adenine and thymine residues.

(Diagram label: anti-adenine.)

The way the bases stack and hydrogen-bond one to another also leads to substantial variations in gross conformation. To maximize stacking interactions between neighboring rings in the helix, a propeller twist occurs in bases paired one to another. Different bases show differing extents of propeller twisting, depending in part on the bases that flank them in a particular DNA sequence.

Christopher R. Calladine at Cambridge University points out that such propeller twisting can lead to steric clashes between neighboring base pairs. A cross-chain clash between two purines would occur in the minor groove of B-DNA at pyrimidine-purine junctions. The opposite sequence (purine to pyrimidine) would lead to a comparable purine-purine clash in the major groove.

The bases may also pair one to another in ways that differ from the way originally proposed by Watson and Crick. Adenine (A) and uracil (U), and guanine (G) and cytosine (C), can pair in two different schemes: Watson-Crick and Hoogsteen pairing. Triple-helix formation occurs in the major groove when a third pyrimidine strand attaches by Hoogsteen pairing to a homopyrimidine-homopurine Watson-Crick base-paired double strand.

Drug interactions alter DNA's shape

Drugs that bind to DNA can also induce local variations in conformation, and in targeting sites along the DNA strand with new chemotherapeutic agents, the susceptibility of different sequences to such conformational changes must be kept in mind.

The simplest example of such drug-induced conformational variation is probably intercalation, first described by Leonard S. Lerman at the University of Colorado in 1961. Flat, aromatic, heterocyclic moieties can insert and stack in between the base pairs of the DNA helix. This noncovalent stacking interaction, in which the drug is held rigidly perpendicular to the helix axis, requires that the DNA helix unwind to separate the base pairs to accommodate the drug. Intercalating drugs, because they unwind DNA at the site of binding, interfere with the action of DNA-binding enzymes, in particular DNA topoisomerases, the enzymes that alter the degree of supercoiling, or twisting, of DNA. Once they are bound along the strand, such intercalating agents resemble a base pair and can lead to frame-shift mutations. Many natural products that act on DNA and are used as antibiotics, antiviral agents, and antitumor drugs bind to DNA at least in part through intercalation.

An excellent example of drug-induced DNA polymorphism is apparent in the crystal structure, determined by Andrew H. J. Wang, Alexander Rich, and coworkers at MIT, of triostin A bound to the self-complementary DNA hexamer d(CGTACG). Triostin A, which is a quinoxaline antibiotic, is a cyclic depsipeptide containing two planar quinoxaline rings with the potential for simultaneous intercalation of both rings. A close derivative of triostin A, echinomycin, is currently in clinical trials for the treatment of human tumors. The crystal structure of triostin A bound to the oligonucleotide shows intercalation of both quinoxaline rings, but it also shows a surprising added conformational transformation. The central adenine residues are flipped 180° about the glycosidic bond into the syn conformation. As a consequence, the adenine and thymine bases are no longer held together through Watson-Crick hydrogen bonding but are instead paired through Hoogsteen hydrogen bonds. This change allows the close packing of the triostin A molecule into the minor groove of the newly shaped DNA helix.

Hoogsteen base pairing also provides the basis for the binding of a third DNA strand in the major groove of an A-like double helix to form a triple helix of DNA. Triple-helix formation, through Hoogsteen base pairing of a homopyrimidine oligonucleotide to a homopurine-homopyrimidine stretch of the DNA strand, has been exploited effectively by Peter B. Dervan and coworkers at California Institute of Technology to target oligonucleotides armed with DNA-cleaving functionalities to specific sites along the strand.

Figure: Kinetoplast DNA from the organism Crithidia fasciculata forms an extraordinary network of loops and circles, as seen in this electron micrograph taken by Paul T. Englund of Johns Hopkins University. This pattern of DNA structures, certainly not a long, columnar helix, appears to be encoded by the DNA sequence itself. Its biological function is still unclear.

Certain DNA sequences are likely to adopt unusual conformations

Many unusual structures that occur along a DNA strand have not been structurally characterized in much detail. Nonetheless, hyperreactivity of certain DNA sequences with enzymes and their anomalous mobilities in gel electrophoresis experiments make it apparent that some sequences have a marked propensity to adopt altered conformations. These structures tend to be found flanking biologically interesting regions on the genome, which suggests that they may serve some biological function.

Homopurine-homopyrimidine sequences, though clearly double-stranded, are hypersensitive to reaction with S1 nuclease, an enzyme that ordinarily cleaves DNA at sites where it is single-stranded. Models for the hypersensitive sites include A-form helices and triply stranded loops. In contrast, alternating purine-pyrimidine sequences tend to adopt the Z conformation.

Palindromic sequences are those in which the sequence of bases is the same when it is read in the 5' to 3' direction of each strand. Such sequences may extrude out from the normal duplex to form intrastrand hydrogen-bonded segments called cruciforms. Such regions are hypersensitive to a variety of enzymes in both the single-stranded loop and lower stem region. The usual two-dimensional representation of a cruciform is likely to be a gross oversimplification of the three-dimensional, folded structure of DNA at a cruciform site.

Bent DNA has received considerable attention recently. Models for the bending of DNA at stretches of adenines include those where the bend occurs at the junctions between distinct conformations and those where the adenine stretch acts as the smoothly curving segment.

DNA-binding antitumor drugs also can cause DNA bending. Using the same gel electrophoretic techniques he used to characterize the bending of DNA at adenine stretches, Crothers, along with Stephen J. Lippard of MIT, and coworkers examined how the antitumor drug cisplatin bound to DNA affects gel mobility, which, in turn, is a measure of changes in gross DNA structure. The major adduct formed when the drug binds to DNA is cis-diammineplatinum coordinated to neighboring guanine residues. When the platinum-bound guanine residues are separated by 15 or 16 bases from a string of adenines, maximum curvature results.

Under these conditions, bending by the platinum into the major groove is in phase with bending at the adenines into the minor groove. Thus, the platinum appears to bend the DNA into the major groove at an angle that is about twice the 20° bending caused by the adenines in the minor groove.

Proteins bound to sites along the DNA strand also can induce conformational variations. John M. Rosenberg and coworkers at the University of Pittsburgh have used x-ray crystallography to determine the structure of the restriction endonuclease EcoRI bound to the self-complementary oligonucleotide d(pCGCGAATTCGCG)2, which contains the recognition site for this restriction enzyme. This structure is currently the most highly resolved view of a protein site-specifically bound to DNA. The protein binding occurs primarily in the major groove, and the bound protein enfolds the DNA with what are described as peptide arms wrapped above and below the recognition site. Near the recognition site, several peptide residues are aligned to facilitate hydrogen bonding to DNA bases.

Hydrogen bonding interactions may not be sufficient to specify uniquely the sequence that the protein recognizes. Conformational variations may also come into play. The DNA is unwound at the center of the helix, generating a kink in the helix structure, and additional kinks, termed neo-II kinks, are evident at the sites where the arms of the protein extend about the DNA. The DNA is, therefore, distorted from its B conformation in several regions. The propensity of different sequences to undergo such structural variations, either with or without bound protein, may be important to recognition by the protein of the site along the strand.

Supercoils are an important feature of constrained DNAs

DNA, as a long, flexible polymer, is governed by the same topological constraints that lead to the coiling up of rubber bands, telephone cords, and balls of string. Whenever the ends of a DNA double helix cannot rotate freely (as, for example, in a closed circular plasmid or in regions of DNA complexed with proteins) the total winding of the DNA is necessarily constant. For such torsionally constrained DNAs, the winding may be apportioned between the twist of the helix, which depends on the duplex winding or pitch of the helix, and the writhe, which refers to the number of supercoils, or turns in the tertiary structure, of the helix.

Consider two closed circular DNAs with the same total winding. In the first, the duplex is in the B form and is negatively supercoiled; that is, its supercoils have a left-handed direction. In the second, the DNA contains no supercoils, but a 22-base pair stretch has been converted into the Z form, which, because it twists in a left-handed direction rather than the right-handed twist of the B form, decreases the helical twist. This decrease in twist leads to the relaxation of the negative supercoils.

Because the total winding is constant in a constrained helix, local conformational changes that alter the pitch of the double helix cause equal and opposite effects in the supercoiling of the DNA. In the same way, torsional strain brought on by high negative supercoiling of DNA can be relieved by local conformational transitions, such as the extrusion of cruciforms or a B to Z transition. These transitions alter the local helical twist, thereby lessening some of the associated helical writhe.

James C. Wang and coworkers at Harvard University have developed sensitive, two-dimensional gel electrophoretic methods to study the consequences of DNA supercoiling. Recently, Wang, Leroy F. Liu of Johns Hopkins University, and their coworkers showed that DNA transcription in the bacterial cell generates positively and negatively supercoiled domains in DNA. Because the DNA must unwind to accommodate the RNA polymerase that transcribes it, a positively supercoiled region is produced in front of the polymerase and a negatively supercoiled region behind it. Such local regions of high negative superhelicity could promote transitions to non-B-DNA conformations in vivo. These transitions might serve to modulate transcription.
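The bookkeeping behind the supercoiling discussion (total winding held constant and shared between twist and writhe) can be made concrete with a short calculation. The sketch below is a back-of-the-envelope estimate rather than a method from the article. It assumes the helical repeats listed in the article's table (10 base pairs per turn for B-DNA, 12 for Z-DNA), counts right-handed turns as positive, ignores the junctions between B and Z segments, and uses hypothetical function names. The total winding is what topologists call the linking number, Lk = Tw + Wr.

```python
def twist_change_b_to_z(n_bp, bp_per_turn_b=10.0, bp_per_turn_z=12.0):
    """Change in twist (in turns) when n_bp base pairs flip from B to Z.

    Right-handed turns are counted as positive and left-handed turns
    as negative. B/Z junction effects are ignored (an assumption).
    """
    twist_before = n_bp / bp_per_turn_b   # right-handed B-form turns
    twist_after = -n_bp / bp_per_turn_z   # left-handed Z-form turns
    return twist_after - twist_before


def writhe_change(delta_twist):
    """With the total winding fixed (Lk = Tw + Wr), dWr = -dTw."""
    return -delta_twist


# The 22-base pair stretch used as the example in the text above.
d_twist_22 = twist_change_b_to_z(22)
d_writhe_22 = writhe_change(d_twist_22)

# A 40-base pair site, for comparison.
d_writhe_40 = writhe_change(twist_change_b_to_z(40))
```

With these assumed values, flipping 22 base pairs lowers the twist by about four turns, so about four negative supercoils are relaxed; a 40-base pair site would relax roughly seven. Changes of this size shift the gel mobility of a closed circle, which is why electrophoresis can report on a local B to Z transition.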

A chemist's view of the DNA polymer The characterization of local variations in structure along the DNA strand offers an exciting challenge to the chemical community. It represents an example of the general problem of how to characterize single sites on any polymer, rather than bulk properties of the polymer, or how to examine specific defects on a sur­ face rather than general surface features. Nuclear mag­ netic resonance techniques and crystallographic char­ acterizations of oligonucleotides are extremely impor­ tant in providing detailed three-dimensional structures of different conformations, but what is required, in addition, is information about these structures in the context of the DNA strand. We need to know what sequences tend to adopt particular conformations, September 26, 1988 C&EN

37

Special Report

Binding of triostin A to DNA distorts double helix The crystal structure of triostin A bound to an octanucleotide provides an ex­ ample of substantial drug-Induced con­ formation change in DNA. The molecu­ lar structure of this antitumor antibiot­ ic, a cyclic peptide derivative, is shown at right. On the far right is a skeletal drawing of a portion of the DNA octamer-triostin complex viewed from the major groove of the DNA double helix. The triostin A is shown in color. Its quinoxaline rings intercalate be­ tween DNA base pairs. The interaction is further stabilized by hydrogen bonds between the drug and the cytosineguanine base pairs. The DNA double helix is significantly distorted and, per­ haps most interesting, the guanine-cytosine and adenine-thymine base pairs flanking the quinoxaline rings are in the Hoogsteen geometry.

what the length requirements may be, and the impor­ tance of flanking sequences. Techniques need to be developed to examine particular sites along the strand. An important difference between free oligonucleo­ tides and the same sequence of bases embedded in a DNA strand is that the sequence within the strand is constrained by the neighboring sugar-phosphate chain whereas a short oligonucleotide has its ends free. These constraints are amplified when the DNA is closed into a double-stranded circle. In a closed circular form, DNA is topologically constrained in much the same way that a rubber band or cord is: The total winding— that is, the twisting of the strands one about the other and the writhing of the helical segments into coils of coils, or supercoils—is a constant as long as the circle is closed. These torsional constraints affect local conforma­ tions, and measurement of the supercoiling that results provides one method to gauge local variations in struc­ ture. If, for example, a 40-base pair site on a closed circular DNA undergoes a change from a right-handed to left-handed conformation, a change in twisting of the DNA duplex has to result, leading to an equal and opposite change in the number of DNA supercoils in the full circle. This change in supercoiling is easily detected using gel electrophoresis. James C. Wang at Harvard University has developed a very sensitive two-dimensional gel electrophoretic assay that mea­ sures Z-DNA and cruciform DNA formation by taking advantage of these topological variations. Chemical assays coupled to high-resolution gel elec­ trophoretic techniques have been critical in marking conformational variations along the DNA strand. Brian H. Johnston and Rich at MIT and Winship Herr of Cold 38

September 26, 1988 C&EN

Spring Harbor Laboratory independently found that diethylpyrocarbonate (DEP) is an effective reagent to mark bases along the strand that have flipped into the syn conformation. For example, residues that have undergone transition into the Z conformation have increased accessibility to reagents in the shallow major groove of Z-DNA and become hyperreactive to carbethoxylation by DEP. Subsequent treatment with piperidine leads to strand cleavage at the site of modification, and, based on DNA-sequencing methods, the exact location on the strand where the DNA breaks may be determined using denaturing polyacrylamide gel electrophoresis. Analogous schemes have been effective in marking sites of cruciform formation, where, for example, the single-stranded loop becomes hyperreactive to osmium tetroxide.

Even ferrous ethylenediaminetetraacetate, Fe(EDTA)²⁻, provides a powerful reagent to examine DNA structure. Thomas D. Tullius and coworkers at Johns Hopkins University have shown that hydroxyl radicals, generated free in solution upon reduction of hydrogen peroxide by Fe(EDTA)²⁻, provide a probe of sugar hydrogen atom accessibility on the DNA helix. Using this technique of hydroxyl radical footprinting, Tullius has prepared detailed snapshots of bent DNA sites and protein-DNA complexes.

My laboratory has focused on the development of chemical probes that are targeted to particular conformations at local sites on the DNA polymer. Based on considerations of symmetry and shape, a series of transition metal complexes that recognize and, when photoactivated, cleave DNA at distinct conformations along the strand has been developed.

We found, first, that chiral tris(phenanthroline) metal complexes show enantiomeric discrimination in binding to right-handed B-DNA. Not surprisingly, being able to specify the chirality of the DNA-binding molecule can be useful in targeting particular DNA sites through the formation of specific diastereomers with DNA. In the case of the tris(phenanthroline) metal complexes, the right-handed propellerlike structure of the metal complex is favored for intercalation into right-handed helices. Intercalation of one of the phenanthroline ligands leaves the other two aligned along the right-handed groove of the DNA. By contrast, intercalation of one ligand of the left-handed isomer into the right-handed helix leads to steric repulsions between the sugar-phosphate backbone and the ancillary ligands, which are disposed contrary to the helical groove.

Since this discrimination is based on matching the symmetry of the metal complex to that of the DNA groove, we were able to use it to develop a probe to recognize Z-DNA, a helix with left-handed chirality. The Λ isomer of tris(4,7-diphenylphenanthroline)ruthenium(II), Λ-Ru(DIP)₃²⁺, is a left-handed propeller structure that is bulkier than the parent phenanthroline complex and has the wrong symmetry to fit easily into right-handed DNA. It is instead a useful spectroscopic probe for Z-DNA. By switching the metal in the complex to cobalt(III), we transformed a spectroscopic probe for conformation into a reactive probe that targets conformations along the strand for photochemical cleavage. The cobalt complex Λ-Co(DIP)₃³⁺, when photoactivated, cleaves Z-DNA sites; in fact, it cleaves any distinct conformations that are sufficiently unwound to accommodate the bulky left-handed complex.

Shape as well as symmetry can be used to target sites

along the strand. A probe for the A conformation was developed by Houng Yau Mei in my laboratory by matching the shape of the metal complex to that of the shallow A-form minor groove. Tris(3,4,7,8-tetramethylphenanthroline)ruthenium(II), Ru(TMP)₃²⁺, binds cooperatively against the surface of A-form polynucleotides, but it is simply too large to bind against the well-defined groove of B-DNA. When photoactivated, this complex cleaves A-like conformations in a reaction mediated by singlet oxygen. In addition, Mindy R. Kirshenbaum and Roger Tribolet in my laboratory have recently found that tris(4,7-diphenylphenanthroline)rhodium, Rh(DIP)₃³⁺, may be used to mark cruciform sites, another distinctive shape along the strand. Photoactivation of this complex leads to cleavage of both DNA strands near the cruciform stem.

Such molecular probes, when linked to appropriate DNA cleaving groups, may provide the basis for developing synthetic restriction enzymes. Ru(DIP)₂Macro, a complex prepared in my lab by Lena A. Basile, is a first attempt at making such an enzyme. The core of the molecule is the Ru(DIP)₃²⁺ complex. Tethered onto it through sulfonamide linkages are two polyamine arms that can chelate additional metal ions and target their reactivity to the DNA's sugar-phosphate backbone. When any of several nonredox-active metal ions are chelated to the ruthenium complex, hydrolytic cleavage (albeit inefficient) of DNA occurs. Basile and Adrienne L. Raphael have shown that supercoiled DNA cleaved by the complex in the presence of Zn²⁺, Cd²⁺, or Pb²⁺ may be religated enzymatically, just as DNA would be that had been cleaved by a true restriction enzyme. Determining exactly where the complex binds and cleaves DNA and achieving efficiencies comparable to those of a true enzyme are goals still to be accomplished, however.

With this repertoire of chemical probes, we may now begin to visualize where variations in structure along the polymer occur and to direct chemistry to those sites. Targeting DNA sites with distinct conformations actually can be achieved with a striking degree of site specificity. Such recognition of structures, and therefore indirectly of the sequences that encode the structures, may be an important element in biological recognition of sites along the strand.

Do these distinctive conformations exist in vivo?

An important aspect in the search for and characterization of DNA polymorphism is the question of whether these distinctive conformations exist in living cells and whether they fulfill biological functions. One can argue that since polynucleotides do sometimes adopt altered conformations, nature must have figured out how to take advantage of them. Recent studies have shown that under physiological conditions, negative (that is, left-handed) supercoiling can promote conformational transitions in local regions of DNA to non-B forms.

Nonetheless, altered DNA conformations have been best characterized under conditions that are distinctly nonphysiological. Z-DNA, for example, was first characterized in a long synthetic polymer in solutions containing 4 M sodium chloride, and A-DNA is best detected in aqueous solutions containing high percentages of ethanol. One might, then, reasonably ask whether these unusual structures are simply an artifact of the strange environments to which chemists have subjected DNA.

Detecting altered conformations in vivo using chemical methods is a difficult challenge, but one that many laboratories are taking up. More recent excitement, however, has focused on biochemical methods that are being used to demonstrate double-stranded non-B-DNA structures in the well-studied bacterium, Escherichia coli.

Robert D. Wells and his coworkers at the University of Alabama, Birmingham, have shown clearly that left-handed DNA can exist in living cells. They took advantage of the observation that DNA sites are not methylated by sequence-specific methylases if the sites are near or in a left-handed helix. E. coli cells were cotransformed with a series of recombinant plasmid DNAs containing potentially Z-forming regions of various lengths and base composition as well as plasmids containing a gene for a temperature-sensitive EcoRI methylase. Under conditions where the methylase produced in the cell actively methylated sequences distant from Z-forming sites, methylation was inhibited at sites neighboring long alternating C-G inserts. Data from these experiments could furthermore be correlated with the differing capabilities of the inserted sequences to undergo transition into the Z conformation in vitro, as assayed using two-dimensional gel electrophoresis.

A similar biochemical approach was used by Nikos Panayotatos and Annick Fontaine at Biogen to demonstrate the presence of a native cruciform structure in E. coli. Like S1 nuclease, T7 endonuclease cleaves cruciforms at the single-stranded loop; under conditions where the cruciform is not extruded, no cleavage of the DNA by the endonuclease is found. Induction of this endonuclease in E. coli by a plasmid containing a potential cruciform region led to the cleavage of the plasmid at the site where the endonuclease cleaves oligonucleotides.

These experiments illustrate elegant genetic strategies to probe DNA structure and point out how enzymatic probes may be powerfully applied in living cells. Most important, they make clear that DNA polymorphism does occur in the cell.
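The two-dimensional gel assays cited in this section rest on the topological constraint described earlier for closed circular DNA. As a rough worked example (not taken from the report itself), consider the 40-base-pair right-handed to left-handed transition mentioned above. The figures used here are assumptions: textbook helical repeats of about 10.5 base pairs per turn for B-DNA and 12 base pairs per turn for Z-DNA, with the junctions between the two forms ignored. The symbols Lk (linking number), Tw (twist), and Wr (writhe) are standard notation introduced only for this illustration.

```latex
\begin{align*}
Lk &= Tw + Wr, \qquad \Delta Lk = 0 \quad \text{(circle stays closed)} \\
\Delta Tw &\approx \underbrace{-\frac{40}{12}}_{\text{Z form, left-handed}}
            \;-\; \underbrace{\frac{40}{10.5}}_{\text{B form, right-handed}}
            \approx -3.3 - 3.8 = -7.1 \\
\Delta Wr &= -\Delta Tw \approx +7.1
\end{align*}
```

Under these assumptions, the flip of a single 40-base-pair site would relax roughly seven negative supercoils in the circle, which suggests why the shift is readily seen on a gel.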

Biological function of DNA polymorphism

Because we are only now demonstrating that DNA actually assumes a variety of structures in living cells, we are surely far from understanding the biological functions of these structures. Some interesting clues have been uncovered, however, by determining where on natural DNAs, rather than on DNAs constructed with synthetic inserts and spliced together using recombinant DNA technology, conformationally distinct sites occur.

Long alternating purine-pyrimidine stretches, for example, which have the propensity to undergo transitions into the Z conformation, seem to be hot spots for genetic recombination. It is apparent also that homopurine-homopyrimidine stretches that are hypersensitive to S1 nuclease occur in the 5'-flanking region of eukaryotic promoters. Experiments using other nucleases have indicated particular hypersensitivities at gene termination sites. Cheng-Hsilin Hsieh and Jack D. Griffith of the University of North Carolina Medical School at Chapel Hill have seen by electron microscopy a sharp, sequence-

Biochemical techniques show DNA can adopt the Z form in cells

Although experiments in many laboratories have demonstrated that DNA is capable of adopting a variety of conformations in the test tube, the question of whether it actually does so in living cells has remained a matter of hot debate. Evidence that DNA can adopt a Z-like conformation in cells comes from recent experiments in the bacterium Escherichia coli by biochemist Robert D. Wells and coworkers at the University of Alabama, Birmingham.

Earlier work had shown that certain enzymes that normally act on DNA will do so only when their binding site on the DNA is in a right-handed, B conformation. Among these is the enzyme EcoRI methylase, which recognizes double-stranded DNA at the sequence 5'-dGAATTC-3' (the same sequence recognized by the EcoRI restriction enzyme) and adds a methyl group to an adenine residue.

The researchers prepared a series of recombinant DNA plasmids containing the gene for the methylase and a second inserted segment, either a repeating (TG)n or (CG)n sequence. Both of these alternating purine-pyrimidine sequences are known from test tube experiments to be capable of adopting the Z conformation. The plasmids were inserted into E. coli cells, which were grown into colonies, and the resulting cells assayed for methylation at the appropriate position along the DNA. If the cells contained sites that adopt the Z conformation at or near the EcoRI binding site, these sites would not be methylated by the enzyme and could be distinguished from the same sites in control cells lacking DNA sequences that are especially prone to adopting the Z conformation.

The researchers tried six different arrangements of the EcoRI binding site and the potential Z-forming regions within the E. coli plasmid. For two of them (those with the binding site immediately adjacent to an alternating GC region or with the binding site in the center of such a region), the methylation level was about half that of experimental controls. That level of inhibition is considered substantial and, as the researchers themselves say, "shows that left-handed DNA can exist in plasmids in E. coli."
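As a rough illustration of how the plasmid arrangements in this sidebar could be described computationally (this is not the researchers' method), the Python sketch below locates EcoRI recognition sites (GAATTC) and stretches of strictly alternating purines and pyrimidines, such as (CG)n or (TG)n, and then reports how many bases separate a site from the nearest stretch. The function names, the 8-base minimum run length, and the example sequences are hypothetical choices made for this example. Whether a real insert adopts the Z form depends on supercoiling and solution conditions that a sequence scan cannot capture.

```python
import re

ECORI_SITE = "GAATTC"   # recognition sequence quoted in the sidebar
PURINES = set("AG")     # A and G are purines; C and T are pyrimidines


def find_ecori_sites(seq):
    """Return the 0-based start position of every GAATTC in seq."""
    return [m.start() for m in re.finditer(ECORI_SITE, seq.upper())]


def alternating_runs(seq, min_len=8):
    """Return half-open (start, end) spans where purines and pyrimidines
    strictly alternate for at least min_len bases (min_len is an arbitrary
    illustrative cutoff, not a value from the article)."""
    seq = seq.upper()
    runs = []
    start = 0
    for i in range(1, len(seq) + 1):
        # A run ends at the end of the sequence, or where two neighboring
        # bases fall in the same class (both purine or both pyrimidine).
        if i == len(seq) or (seq[i] in PURINES) == (seq[i - 1] in PURINES):
            if i - start >= min_len:
                runs.append((start, i))
            start = i
    return runs


def gap_to_nearest_run(site_start, runs, site_len=len(ECORI_SITE)):
    """Bases between a site and the closest run; 0 means the two are
    adjacent or overlapping. Returns None when there are no runs."""
    site_end = site_start + site_len
    gaps = [max(0, run_start - site_end, site_start - run_end)
            for run_start, run_end in runs]
    return min(gaps) if gaps else None


if __name__ == "__main__":
    # Hypothetical insert: a (CG)6 tract immediately followed by an EcoRI site.
    example = "TTT" + "CG" * 6 + "GAATTC" + "AAAA"
    runs = alternating_runs(example)
    for site in find_ecori_sites(example):
        print(site, runs, gap_to_nearest_run(site, runs))
```

A gap of zero corresponds to the "immediately adjacent" or "in the center" arrangements for which the sidebar reports reduced methylation; larger gaps correspond to sites more distant from the potential Z-forming region.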

EcoRI methylase can distinguish the left-handed Z form of DNA, where it is inactive, from the right-handed B form. In the B form, the enzyme methylates the DNA at adenine.

Models (left and right) show the enantioselective binding of the Δ and Λ isomers of tris(phenanthroline) metal complexes to B-DNA by intercalation. With one phenanthroline ligand intercalated (stacked parallel to the base pairs and pointed into the picture), the two nonintercalated ligands of the Δ isomer lie along the right-handed groove of the DNA. For the Λ isomer, however, intercalation of one ligand leads to steric clashes between the two nonintercalated ligands and the sugar-phosphate backbone. Because the isomer is left-handed, its ligands are arranged contrary to the right-handed DNA helical groove. The models are based on a series of photophysical experiments in collaboration with Nicholas J. Turro at Columbia, as well as nuclear magnetic resonance and equilibrium binding experiments, that indicate the preferred intercalation of the Δ isomer into B-DNA. Although the Λ isomers bind poorly to right-handed DNA by intercalation, they bind avidly with left-handed Z-DNA, and thus can be used to locate left-handed conformations along the DNA molecule.

directed curve at the replication and transcription termini of the DNA of a simian virus called SV40. Experiments in my lab by Adrienne L. Raphael and Barbara C. Muller mapping SV40 DNA, using the Λ isomer of tris(4,7-diphenylphenanthroline)cobalt(III), reveal an intriguing correlation between sites bordering coding regions on the viral genome and sites targeted by the metal complex. In fact, a host of experiments, using both enzymatic and chemical probes, all seem to show that DNA conformational heterogeneity is prevalent in those segments that border genetic coding regions.

What are the distinct conformations in these biologically interesting regions along the genome? Probably a

whole variety of conformations are encoded in these regions, and quite likely no unique role will be apparent for any particular one of them. Just as an alpha helix is a structural motif in many proteins where it serves a host of different functions, Z-form sites and A-form sites in DNA may represent structural features needed for many functions as well.

What these distinct conformations share, however, is an increased ability to be recognized, both by enzymes and by small molecules. We have found at Columbia, for example, that the binding site for a DNA regulatory protein that affects transcription on the SV40 genome is in a distinctive conformation that can be distinguished by our conformation-specific metal complexes. Perhaps the association between DNA polymorphism and borders of coding regions really reflects the correlation of such distinct conformations with places where proteins bind. Information to encode the binding of DNA regulatory proteins is surely contained at the level of the primary DNA sequence, but it may also be contained, in part, at the level of sequence-directed DNA structure. It is appealing to consider that conformational variations might modulate or even specify the binding of proteins that regulate gene expression. Exploring the structural interactions of proteins and DNA as they bind to one another is only beginning, but in this recognition process, the various conformations of DNA may play an integral part.

Figure labels: mutant EcoRI methylase gene, plasmid vector, Escherichia coli, B-DNA, Z-DNA.

An EcoRI binding site and a sequence known to easily adopt the Z conformation are inserted into a plasmid vector. In another plasmid vector, the gene for the temperature-sensitive EcoRI methylase enzyme is inserted.

Both plasmids are introduced into Escherichia coli cells, which are then cultured at high temperature, where the EcoRI methylase is inactive.

After culturing, the temperature is lowered and the methylase enzyme selectively methylates EcoRI sites located in B, but not Z, conformations along the DNA. Finding significantly reduced levels of methylation in those cells where the EcoRI site is in a region that easily adopts the Z form confirms that this form can exist in vivo.

Models show the possibilities for groove-binding by Λ-tris(tetramethylphenanthroline)Ru(II) with B-DNA (left) and A-DNA (right). This complex tends to bind to the surface of polynucleotides. With the A-form polymer, which has a minor groove that is wide, shallow, and accessible, cooperative binding occurs between the complex and the helix. With a B-form helix, however, the groove is too narrow and deep for surface binding of the complex and little is detected. This shape discrimination makes the complex a probe for A-DNA.

Future directions for research

Although many conformations of DNA have yet to be well-characterized structurally, enzymatic and new chemical probes are providing clues to the variations in structure along the strand, where and under what conditions these variations may occur, and, perhaps most important, that these local variations in conformation in fact occur in the cell. However, the rules that govern structural variation still must be determined. What are the sequence and environmental requirements that determine whether a particular conformation is adopted in the context of the strand? What role is played by sequences flanking the region of interest? A host of chemical as well as genetic experiments will be needed before we can look along a primary sequence of DNA and understand its local secondary structure.

What roles are played by this rich polymorphism? Distinctly altered conformations provide targets for the specific binding of other molecules. DNA regulatory proteins as well as small molecules may be targeted to conformationally distinct sites along the strand. One way to achieve sequence-specific recognition of DNA is through such sequence-directed, conformation-specific recognition. Taking advantage of the conformational heterogeneity of DNA will be an important element in the design of drugs directed to DNA sites and in the development of new tools for molecular biology. Furthermore, experiments thus far are providing hints that these distinct conformations are important biologically: Nature may exploit these changes in the shape of the polymer to recognize sites along the DNA strand.

Understanding the structure and the structural variations of DNA is, in the end, a chemical problem, one that we are now beginning to tackle and one that surely underlies the biological expression of genetic information.

Jacqueline K. Barton is a professor of inorganic and biophysical chemistry at Columbia University. A native New Yorker, she was awarded a B.A. degree summa cum laude at Barnard College in 1974 and a Ph.D. degree in inorganic chemistry at Columbia in 1979. She did postdoctoral studies in biophysics at Bell Laboratories and Yale University before becoming an assistant professor of chemistry and biochemistry at Hunter College, City University of New York. She returned to Columbia in 1983. Barton is interested in designing simple molecular probes to explore variations in structure and conformation along the DNA helix. Using chiral inorganic complexes as spectroscopic and chemical tools, she has developed complexes that recognize specific DNA conformations and bind to and cleave DNA at these locations. With these molecular tools, she examines the heterogeneity of DNA structures and their importance in gene expression. Her numerous awards include, in 1985, the Alan T. Waterman Award of the National Science Foundation, given annually to the outstanding young scientist in the U.S., ACS's Eli Lilly Award in Biological Chemistry in 1987, and the 1988 ACS Award in Pure Chemistry.

Reprints of this C&EN special report will be available in black and white at $5.00 per copy. For 10 or more copies, $3.00 per copy. Send requests to: Distribution, Room 210, American Chemical Society, 1155 16th St., N.W., Washington, D.C. 20036. On orders of $20 or less, please send check or money order with request.