The Significance of Hydrogen Bonds in Biological Structures

California Association of Chemistry Teachers

A. L. McClellan

Chevron Research Company, Richmond, California 94802

Molecular biology is creating a revolution in our understanding of the atomic aspects of "living molecules." We will pick only small parts of the field to draw attention to the important place held by H bonds, whose properties and importance in other fields have been discussed elsewhere in THIS JOURNAL (1). In particular, the lack of stringent geometric requirements and the low energy of an individual H bond will come up repeatedly. In general plan, we will take up the nucleic acids (structure and method of action); the proteins produced at the direction of the nucleic acids, both structural proteins and enzymes; and, finally, the destructive viruses. Since each of these substances has been the subject of book-length treatment, only selected examples are intended here.

Nucleic Acids

Nucleic acids have been known since 1868, when Miescher analyzed material from the cell nucleus. Davidson's review (2) is comprehensive. These complex molecules are made of three parts: phosphate groups linking sugar groups, each of which has a purine or pyrimidine base attached to it. There are many hundreds of the units bonded covalently into a long chain. Figure 1 shows the unit. The sugar is ribose in ribonucleic acid (RNA) and deoxyribose in deoxyribonucleic acid (DNA). The only difference is that DNA has H, instead of OH, in the rectangle in Figure 1. Each unit of the chain has only a few choices for the base group. Adenine, guanine, thymine, and cytosine occur most frequently in DNA. Thymine is replaced by uracil in RNA. These molecules will be illustrated in a moment.

The long strand just described usually does not occur singly. Rather, it is H bonded to another complementary strand in the famous double helix suggested by Watson and Crick (3). Figure 2 contains a schematic representation and an end view of one helix. The important aspect for our discussion is the H bond arrangement holding these units together. In Figure 2a, the H bonds are represented by the vertical lines. Figure 3 gives more detail. Note, first, that in each case the H bonded pair is made up of one purine and one pyrimidine. This combination causes

Figure 1. A unit of a ribonucleic acid strand (base, sugar, and phosphate).

Figure 2. The helical double-strand model of DNA. Reproduced from the article by P. Löwdin, "Advances in Quantum Chemistry," 2, 235 (1965), Academic Press, New York, by permission.

Volume 44, Number 9, September 1967 / 547

Figure 3. The H bonding arrangements in the nucleotide base pairs adenine (A) with thymine (T) and guanine (G) with cytosine (C). Asterisks show where sugar groups are attached.

the diameter of the helix (20 Å; Fig. 2b) to be constant. Second, the pairings shown are the most favorable. They permit the maximum number of H bonds and avoid any repulsions, such as would exist between the two C=O groups if cytosine came close to thymine. From the Boltzmann factor, exp(-ΔH/RT), and an average H bond energy of 4 kcal/mole, we can calculate the ratio of bonding for two molecules that can H bond compared to two that cannot. The H bonded pair is favored by a factor of the order of a thousand at room temperature. Two or three H bonds per pair increase the favorable ratio and explain the specificity of the base pairs.

It is this specificity of H bonding that maintains what is called the genetic code. The genetic code contains the information that determines the species, characteristics, and functioning of the cell. The code is formed by the order in which bases occur along the strand shown in Figure 2a. A molecular code of this kind can contain a tremendous store of information and is extremely compact. Suppose there are 100,000 units (Fig. 1) in a single-strand DNA molecule, a value giving a molecular weight of about 3.6 × 10^7. With a choice of one base out of four for each unit, there are 4^100,000 or about 10^60,000 different combinations. Such a number is roughly equivalent to multiplying Avogadro's number by itself 2500 times! Camras (4) has estimated that the genetic code stores about 10^20 information bits per cm^3. Such a value is much higher (nearly 10^10 times!) than any available storage method outside of living things. We shall see later how some of these different combinations can have important consequences.

The exact nature of the code and how it is "read" by the parts of the living cell are matters that are now being unraveled. For our purpose, the central point is the vital role played by H bonds.
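Two of the numerical estimates in this passage can be reproduced in a few lines. The sketch below is only an illustration: the function names are hypothetical, the H bond energies of a pair are assumed to add (so that two or three bonds simply multiply the exponent), and room temperature is taken as 298.15 K. With the 4 kcal/mole figure a single bond gives a ratio somewhat under 10^3; the result is sensitive to the energy assumed, and 5 kcal/mole would give several thousand.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal/(mol*K)

def boltzmann_ratio(bond_energy_kcal, n_bonds=1, temp_k=298.15):
    """Ratio exp(n*E/RT) favoring a pair held by n H bonds over an unbonded pair."""
    return math.exp(n_bonds * bond_energy_kcal / (R_KCAL * temp_k))

def log10_sequence_count(n_units, n_bases=4):
    """log10 of n_bases**n_units, computed without building the huge integer."""
    return n_units * math.log10(n_bases)

# Assumption: bond energies are additive within one base pair.
single = boltzmann_ratio(4.0)             # one H bond of 4 kcal/mole
double = boltzmann_ratio(4.0, n_bonds=2)  # two bonds, as in the A-T pair
triple = boltzmann_ratio(4.0, n_bonds=3)  # three bonds, as in the G-C pair

# Number of base sequences for a 100,000-unit strand, as a power of ten,
# and the same number expressed as a power of Avogadro's number.
exponent = log10_sequence_count(100_000)
avogadro_powers = exponent / math.log10(6.022e23)

print(f"1 bond: {single:.3g}  2 bonds: {double:.3g}  3 bonds: {triple:.3g}")
print(f"4**100000 is about 10**{exponent:.0f}")
print(f"equivalent to Avogadro's number raised to the power {avogadro_powers:.0f}")
```

Working with logarithms avoids constructing a 60,000-digit integer, and the last line recovers the "2500 times" comparison made in the text.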
The matching requirements for H bonds between the bases are sufficiently strong that species retain their uniqueness through thousands or millions of reproduction cycles. Each time a cell splits, the DNA double helix must be duplicated so that both the old and new cell can have a copy of this crucial template. The splitting process is thought to occur in a fashion analogous to a rope coming apart by twisting. In the molecular case, however, the strands do not remain single after unraveling. They replicate themselves. That is, a strand forms which is complementary to each of the original strands. Hydrogen bonds are a part of each process. The disengaging of the two original strands is possible, partly as a consequence of the low H bond energy between the bases. For example, a mild temperature, a bit below 100°C, breaks all the H bonds in DNA, and the helix is converted into two single strands. Second, the complementary character of the forming strands is

Journal of Chemical Education

Figure 4. Tautomeric forms of adenine (amino and imino) and thymine (keto and enol).

ensured by the matching shown in Figure 3. The separated strands will reform the helix as the solution is slowly cooled.

Occasionally, indeed rather rarely, there is a change in the order of the bases in the DNA molecule for a particular species. How does this come about? One suggestion (5) involves the H bonding arrangement. The bases illustrated earlier are shown in the keto and amino forms. However, they can also exist in enol and imine forms obtained by moving protons; see Figure 4. Although the tautomers are rare, they may play an important role in mutation. To understand the suggestion, let us represent the bases and their pairing in another way. Figure 3 can be simplified to

Suppose now some event causes a shift or, better, a double shift of protons in one pair. This is shown as

in which the asterisks are to emphasize that new tautomeric forms are present. Suppose further that the bases come apart in the condition just produced. Figure 5 illustrates what could happen. The tautomeric change causes the production of a new sequence of bases; it changes the "coded message" contained in the molecule.

The Functions of DNA

In a chemist's terms, a cell is just a large series of reactions. These reactions are catalyzed in a great number of instances by protein molecules. An important function of DNA is the production of these catalytic protein molecules and, also, protein for structural parts of the cell. By what means does this molecule carry and convey the information necessary for this process? The exact answer is not yet known, but some ideas are coming out of current research work.

Sequence         Normal Bases    Tautomeric Bases
Original         XXXX A XXXX     XXXX A XXXX
Tautomerism      -               XXXX A* XXXX
Complementary    XXXX T XXXX     XXXX C XXXX
New              XXXX A XXXX     XXXX G XXXX

(In each column the strand splits and replicates between the original and the complementary sequence, and again between the complementary and the new sequence.)

Figure 5. The consequences of a tautomeric change in nucleotide base structure.
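The sequence of events in Figure 5 can be traced with a short sketch. It is an illustration under stated assumptions, not a model of real replication: the names are hypothetical, strands are paired position by position with no regard to strand direction, and only the A* to C pairing is taken from Figure 5. The other entries in the tautomer table are assumed by analogy.

```python
# Normal pairing, as in Figure 3.
NORMAL_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Pairing of the rare tautomers (marked *). Only A* -> C appears in Figure 5;
# the remaining three entries are assumptions made for illustration.
TAUTOMER_PAIR = {"A*": "C", "T*": "G", "G*": "T", "C*": "A"}
PAIRING = {**NORMAL_PAIR, **TAUTOMER_PAIR}

def replicate(strand):
    """Build the complementary strand, position by position (direction ignored)."""
    return [PAIRING[base] for base in strand]

original = list("GATTACA")

# A double proton shift converts the A at index 1 into its tautomer A*.
shifted = original.copy()
shifted[1] = "A*"

daughter = replicate(shifted)        # complement made while the tautomer is present
granddaughter = replicate(daughter)  # next generation, built from normal bases only

print("original     :", "".join(original))
print("daughter     :", "".join(daughter))
print("granddaughter:", "".join(granddaughter))
```

Without the tautomeric shift, two rounds of replication return the original sequence. With it, the A at the shifted position is permanently replaced by G, which is the change in the "coded message" described in the text.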

In addition to just reproducing itself, the large DNA molecule appears to carry, along its length, certain sections which have instructions for preparing various proteins. To make use of the information, the double-stranded DNA molecule is thought to separate over a part of its length and then to build up an RNA molecule from nucleotides in the reaction mixture we call the cell. It makes up one RNA molecule whose molecular weight is only a few thousand. This molecule is called messenger RNA because it carries from one place to another the information which was specifically determined by the portion of DNA that was copied. The messenger RNA combines with a part of the cell called a ribosome, where protein will be synthesized. Concurrently, other bits of RNA have been formed at other places on the DNA molecule. They are called transfer RNA and have the job of converting information from the messenger RNA into a given protein chain. Apparently, this job is accomplished by each transfer RNA attaching to a specific amino acid, sometimes by H bonding, sometimes via hydrophobic bonds, and then folding itself into a configuration which can become H bonded to the messenger RNA. By this latter attachment, the amino acid is brought into the proper place to react with another which is already part of a peptide chain. Subsequently, a different transfer RNA molecule brings up still another amino acid and the chain grows.

Figure 6 summarizes the entire process in a schematic way. Through the center, the diagram shows replication of the DNA, manufacture of messenger RNA, and its attachment to the ribosome. Along the left are drawings of the transfer RNA attaching first to the amino acid and then to the messenger RNA. At the right, the finished protein is represented. Note the H bonding indicated by the matching of short stubs on the various RNA molecules. In particular, the joining of messenger and transfer RNA's is interesting. You notice there are three pairs of stubs for each transfer RNA. These three stubs represent bases, as in DNA, and the code has now been worked out for each amino acid. That is, we now know which sequences of these bases will be present on the RNA part if the attached amino acid is serine, which others will attach a transfer RNA molecule carrying glycine, and so on.

The role of H bonding does not end here. The structures of a few individual transfer RNA molecules are known and are dependent on H bonding. Figure 7 shows the two RNA's (8) that bring alanine and tyrosine to the ribosome for attachment to the peptide. Other transfer RNA's have a similar cloverleaf structure, which is not a random affair. In the vertical and horizontal portions, the bases written directly opposite each other can H bond. It is suggested that they do so, forming short sections of double helices, and that this arrangement helps generate the specificity. The three bases at the bottom of the lower loop are the codon, or coded base triplet, that attaches to the messenger RNA.

Figure 6. The process of protein synthesis. Reproduced with permission from "Horizons in Biochemistry," Academic Press, New York, p. 105.

Figure 7. Some possible transfer RNA structures (alanine and tyrosine transfer RNA). Copyright 1966 by American Association for the Advancement of Science, used with permission.

Some Proteins
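Every protein discussed in this section is assembled by the triplet matching between messenger and transfer RNA. As a rough sketch only, that matching can be written as a lookup: each three-base group on the messenger selects the transfer RNA whose own triplet is complementary to it (usually called the anticodon), and that transfer RNA delivers its amino acid. The names below are hypothetical. The four triplet assignments come from the standard genetic code rather than from this article, and the pairing is done position by position, ignoring strand direction and any loosening of the pairing rule at the third base.

```python
# Base pairing between two RNA strands (uracil takes the place of thymine).
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# A small subset of the standard genetic code: messenger triplet -> amino acid.
CODON_TABLE = {
    "UCU": "serine",
    "GGU": "glycine",
    "GCU": "alanine",
    "UAC": "tyrosine",
}

def anticodon(codon):
    """Triplet on the transfer RNA that H bonds to the given messenger triplet."""
    return "".join(RNA_PAIR[base] for base in codon)

# Each transfer RNA is identified by its triplet and carries one amino acid.
TRNA_POOL = {anticodon(codon): amino for codon, amino in CODON_TABLE.items()}

def translate(messenger):
    """Read the messenger three bases at a time and collect the amino acids."""
    peptide = []
    for start in range(0, len(messenger) - len(messenger) % 3, 3):
        triplet = messenger[start:start + 3]
        # The transfer RNA is chosen by base pairing, not by direct lookup.
        peptide.append(TRNA_POOL[anticodon(triplet)])
    return peptide

print(translate("GCUUACUCUGGU"))
```

The lookup goes through `anticodon` on purpose: the amino acid is selected only because the two triplets can pair, which is the role the text assigns to H bonding.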

The products of the process just outlined are proteins. Some are used for structural members of the body and some are enzymes for catalyzing reactions in the cell. The several structural arrangements adopted by both fibrous and globular proteins are influenced by H bonds. Figure 8 presents the α-helix proposed by Pauling and Branson (6). The peptide chain is held in a helical form by N-H...O bonds more or less parallel to the spiral axis. The α-helix is fairly well known now, having been shown to occur in a number of proteins, at least in the crystalline form.

Figure 8. The alpha helix, showing the amino acid residues.

The two best known globular proteins are myoglobin and hemoglobin. There are a great many similarities between these two. Hemoglobin is a fairly loose assembly of four parts, each very much like myoglobin. In myoglobin, there is a continuous peptide chain of 150 residues. The chain has several α-helical sections and other regions in which it bends so that an overall spheroidal shape is obtained. This arrangement also allows a fairly good separation of the polar and H bonding side groups from those which are hydrophobic. The latter, frequently alkyl groups, are in the interior of the spheroid, where they add some stabilizing forces due to van der Waals attraction. The H bonding groups, mainly on the surface, can contribute stability by means of bonds to the solvent. The heme group is a planar portion fitting into a cleft in the spheroid.

Hemoglobin is made up of four parts, each folded in much the same way as myoglobin. The four parts are not identical; two have 141 amino acid residues in what is called the α-chain, and the other two, β-chains, have 146 residues. The exact order of the amino acids in each chain is known, both for normal hemoglobin and for numerous variants. These variants are the result of mutations in the DNA molecule, causing it to pass along incorrect information to the RNA molecules and giving, finally, the incorrect protein.

One variant hemoglobin is at the root of sickle-cell anemia. The red blood cells are not the normal disc shape but are elongated and frequently curved into arcs or sickles. Such cells do not perform the oxygen-carrying role, and the disease is often fatal. This malady is of interest for several reasons. First, it is an excellent example of what Pauling termed a "molecular disease." The difficulty lies in the hemoglobin molecule itself, not in a foreign agent entering the body. Second, the molecular defect is very small: each β-chain has one amino acid replaced. A glutamic acid residue is replaced by valine in sickle-cell hemoglobin. Third, a very recent proposal to explain the mechanism of sickle formation involves H bonds.

We can understand the general aspects of the proposal (7) in this way. The variant molecule has valine as the sixth residue from one end of each β-chain. Murayama suggests that a hydrophobic bond between this variant valine and another valine, normally present in the first position of the β-chain, forms a loop. Some other groups then attract by H bonding across this loop and produce a keylike projection on each β-chain. These projections can then fit into some sockets in the α-chain portion of another hemoglobin molecule. The combination still has "keys" at one end and "sockets" at the other, so other variant hemoglobin molecules attach, and a rod or sickle-shaped object forms. Perhaps the remedy can be found in blocking the valine-valine interaction and interfering with the H bonding ability of the other groups forming the "key."

An Enzyme Example


Let us move on to some other molecules which occur in the cell: the enzymes. Figure 9 shows a particular enzyme (lysozyme) which has the job of breaking down portions of the bacterial cell wall made up partly of polysaccharides. The enzyme is a protein chain coiled and folded so as to leave a distinct valley in the molecule (9). In this valley, seven sites are identified by letters. By testing dimer, trimer, and tetramer saccharide molecules, research workers have identified where the substrate attaches to the enzyme so that breaking the covalent bond is facilitated and the destruction of cell wall material is catalyzed. Not only is the structure of the enzyme maintained in part by hydrogen bonding, but the substrate molecule is held in contact with the enzyme at the proper place, in part by eight hydrogen bonds.

Figure 9. The structure and action of lysozyme. Reprinted from New Scientist with permission.

Some Virus Examples

The structures of two plant virus molecules are known with fair completeness. Figure 10 shows the arrangement deduced for turnip yellow mosaic virus (TYMV), which causes crippling mosaic patterns on the leaves of turnip, cabbage, and other plants. TYMV, like other viruses, is made up of many protein units clustered around, and attached to, an RNA molecule. The RNA is foreign to the plant host and, when it reaches a cell, causes unwanted or harmful protein substances to form. In Figure 10, part of the 180 protein subunits are visible in the icosahedral pattern they form. Each protein subunit contains 190 ± 1 amino acids and has a molecular weight of about 20,000. The RNA is interlaced among the subunits.

There are three effects of H bonds in this structure. Two are the same as we have seen before: to help determine the structure within each subunit and to assist stability by interaction between solvent and each subunit. The third interaction is H bonding between protein subunits to help form the icosahedron in Figure 10. It is significant that the 190 ± 1 amino acids include an unusually high fraction of hydroxy amino acids which could supply H bonding capacity. The subunit also contains a more than normal proportion of acids with hydrophobic bonding ability. They also help stabilize the regular structure.

The last example is tobacco mosaic virus (TMV). This virus, also shown in Figure 10, is a large one. Its molecular weight is 40,000,000, and it measures about 3000 Å in length by 150 Å in diameter. It too has protein subunits, in this case in a helical array with the RNA present in spiral form. It is held together in much the same way as the TYMV.

The use of subunits, which are associated by H bonds and other weak forces, has two advantages. First, it allows the protein "overcoat" to come apart easily by breaking the weak bonds. Second, it conserves the amount of genetic code information required, since there need be instructions for building only a short protein rather than one large enough to account for the entire protein portion.

Figure 10. The structures of turnip yellow mosaic virus (a, TYMV) and tobacco mosaic virus (b, TMV). In TYMV the RNA is hidden inside the icosahedron of protein subunits.

Summary

H bonds have considerable significance in biological chemistry. The frequent occurrence of OH and NH2 groups to donate hydrogens, plus C=O groups to act as sources of electrons, means many H bonds can form. These H bonds help determine the shape exhibited by nucleic acids and proteins. They are also important in enzymes and viruses. The genetic code, or order of bases in DNA, is maintained through millions of replications by the fit guaranteed by H bonds between bases. The transfer of information from DNA to the protein during synthesis is also accomplished by H bond matching of base pairs. The mode of action of enzymes is dependent on their structure and on the presence of special sites. Both factors usually involve H bonds.

Literature Cited

(General) WATSON, J. D., "Molecular Biology of the Gene," W. A. Benjamin, Inc., New York, 1965.
(1) FERGUSON, L. N., J. CHEM. EDUC., 33, 267 (1956); GORMAN, M., J. CHEM. EDUC., 33, 468 (1956), 34, 304 (1957); HUGGINS, M. L., J. CHEM. EDUC., 34, 480 (1957).
(2) DAVIDSON, J. N., "The Biochemistry of Nucleic Acids" (4th Ed.), Methuen, London, 1960.
(3) WATSON, J. D., AND CRICK, F. H. C., Nature, 171, 737, 964 (1953).
(4) CAMRAS, M., IEEE Spectrum, July 1965, p. 98.
(5) LÖWDIN, PER-OLOF, Advances in Quantum Chemistry, 2, 238 (1965).
(6) PAULING, L., AND BRANSON, H. R., Proc. Natl. Acad. Sci. U.S., 37, 205 (1951).
(7) MURAYAMA, M., Science, 153, 145 (1966).
(8) MADISON, J. T., EVERETT, G. A., AND KUNG, H., Science, 153, 531 (1966).
(9) PHILLIPS, D. C., Scientific American, 78 (November 1966).
