SPECIAL REPORT
POLYMERASE CHAIN REACTION
Norman Arnheim, University of Southern California, and Corey H. Levenson, Cetus Corp.
October 1, 1990 C&EN
The polymerase chain reaction (PCR) is an in-vitro method of amplifying DNA sequences. Beginning with DNA of any origin (bacterial, viral, plant, or animal), PCR can increase the amount of a DNA sequence hundreds of millions to billions of times. The procedure, which can amplify a targeted sequence even when it makes up less than one part in a million of the total initial sample, was developed by Cetus Corp. scientists Kary B. Mullis, Randall K. Saiki, Stephen J. Scharf, Fred A. Faloona, Glenn Horn, Henry A. Erlich, and Norman Arnheim in 1984 and 1985.
PCR is an enzymatic process that is carried out in discrete cycles of amplification, each of which can double the amount of target DNA in the sample. Thus, n cycles can produce 2^n times as much target as was present to begin with.
The genomes of higher organisms can contain on the order of 10^9 nucleotides. A 100-base pair target DNA thus represents less than one millionth of an organism's total DNA. A key feature of the development of PCR was finding how to amplify a rare target sequence while minimizing amplification of the numerically overwhelming nontarget sequences. PCR conditions have now been identified that produce virtually pure DNA product.
PCR already has had an impact on molecular biology, human genetics, infectious and genetic disease diagnosis, forensic science, and evolutionary biology. The reason is threefold. In the first place, it reduces the difficulty of isolating and manipulating specific DNA sequences. Previously, gene isolation depended completely upon the lengthy and usually technically difficult process of traditional cloning using recombinant DNA techniques. Cloning relies on in-vivo replication of a target sequence integrated into a cloning vector in a host organism. The simplification of DNA sequence isolation by in-vitro methods using PCR has made gene analysis accessible to scientists who have little training in molecular biology or biochemistry.
In addition, PCR makes it possible to study biological problems without giving thought to the amount of biological material available. Thus, even DNA sequences present in an individual cell can be studied, providing information that would have been thought impossible to obtain previously. Finally, the ease of carrying out PCR, as well as its speed and sensitivity, make it ideally suited for application to a wide variety of problems.
It is not necessary to know the nucleotide sequence of a target in order to amplify it using PCR, but the sequence of a small stretch of nucleotides on each side of the target must be known. These flanking sequences are used to design two synthetic single-stranded oligonucleotides, usually 20 nucleotides in length, that will serve as primers. The sequence of these primers is chosen so that each has base pair complementarity with its respective flanking sequence.
PCR begins by denaturing the double-stranded target DNA, followed by annealing the primers (one for each strand) to the sequences flanking the target. Each primer forms a duplex with its flanking sequence so that the 3' hydroxyl end of the primer faces the target sequence. Addition of a DNA polymerase and deoxynucleoside triphosphates causes a new DNA strand to form beginning at the primer and extending across the target sequence, thereby making copies of the target. These steps (DNA denaturation, primer annealing, and DNA polymerase extension) represent one PCR cycle; each step is carried out at an appropriate temperature.
[Figure labels: Target area; Heat; Primer; Polymerase]
If the extension product of a primer is long enough, it will include the sequence complementary to the primer at the other end of the target sequence. Thus, each new extension product can act as a template for the next cycle. It is this fact, first recognized by Mullis, that leads to the exponential increase in PCR product with each cycle. If the amount of target exactly doubles with each cycle, as few as 20 cycles will generate about a million times more target sequence than originally was present.
Amplification by PCR is extremely rapid. Twenty-five cycles can be carried out in just over one hour. Beginning with 1 µg of total human DNA, which contains about 300,000 copies of each unique sequence, 25 cycles of PCR can generate up to several micrograms of a specific product (on the order of a few picomoles). The size of the final product is equal to the sum of the lengths of the two primers and the length of the target sequence that stretches between them. PCR products up to several thousand base pairs long can be routinely synthesized.
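The arithmetic in this passage can be checked with a short Python sketch. It is only an illustration: the per-cycle efficiency parameter (1.0 means exact doubling), the haploid genome size of about 3 × 10^9 base pairs, the average mass of 650 g/mol per base pair, and all function names are assumptions made for the example, not figures or methods from the article.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # assumed average molar mass of one base pair


def fold_amplification(cycles, efficiency=1.0):
    """Fold increase in target after `cycles` rounds.

    efficiency=1.0 models exact doubling per cycle; smaller values
    model cycles in which only part of the target is copied.
    """
    return (1.0 + efficiency) ** cycles


def genome_copies(dna_grams, genome_bp=3.0e9):
    """Approximate number of haploid genome copies in a DNA sample.

    genome_bp is an assumed haploid human genome size.
    """
    grams_per_genome = genome_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return dna_grams / grams_per_genome


def product_length(primer1_len, primer2_len, target_len):
    """Product size: both primers plus the target stretch between them."""
    return primer1_len + primer2_len + target_len


if __name__ == "__main__":
    print(f"20 cycles, perfect doubling: {fold_amplification(20):,.0f}-fold")
    print(f"genome copies in 1 microgram: {genome_copies(1e-6):,.0f}")
    print(f"product length: {product_length(20, 20, 100)} bp")
```

Under these assumptions, 1 µg of human DNA works out to roughly 300,000 genome copies, in line with the figure quoted above. Lowering `efficiency` to 0.8 gives an amplification on the order of 10^5 after 20 cycles, well short of a millionfold.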
Applications to molecular biology
PCR has been used to simplify a number of techniques in molecular biology, including cloning, sequencing, and modifying specific nucleic acid sequences. For example, molecular biologists frequently wish to compare a normal cloned gene with an uncloned mutant form of the gene. Using PCR, the comparison can be carried out without laboriously cloning the altered form. DNA sequence information obtained from the normal gene can be used to synthesize the primers and allow PCR amplification of the gene from mutant cells. The mutant PCR product can be subjected directly to DNA sequencing protocols without the additional subcloning efforts usually required to sequence a larger cloned DNA fragment.
The study of RNA molecules has also been enhanced by PCR approaches. RNA molecules are first converted to so-called complementary DNA (cDNA) by the retroviral enzyme reverse transcriptase. This synthesized DNA template can then be amplified by conventional
[Figure labels: Copied DNA; Products]
A target piece of DNA, representing less than one millionth of the total DNA of a higher organism, is multiplied selectively and exponentially by the polymerase chain reaction. Heating denatures the double-stranded DNA of the target to give two single-stranded DNA templates. Site-specific primers, designed to complement the base pairs of the DNA region flanking the target, anneal to these regions. Polymerase enzyme then catalyzes synthesis of two pieces of double-stranded DNA, each identical to the original target DNA (copied DNA). Each of the newly synthesized products can serve as templates for primer annealing and extensions (next cycle). Following repeated cycles of PCR, the product mixture consists primarily of double-stranded fragments of discrete length (products). Assuming perfect efficiency for each cycle, as few as 20 cycles would produce a millionfold increase in the original target.
PCR methods. In this way correlations can be made between the levels of rare messenger RNA molecules and the developmental progress of cells.
Another widely used technique in molecular biology is to alter DNA segments in vitro in order to study structure-function relationships. Barbara Hummel and Russell G. Higuchi at the University of California, Berkeley, and Saiki at Cetus showed a few years ago that a known DNA sequence can be modified using a PCR primer differing at one or several positions from the target sequence. Successful PCR does not require perfect complementarity of a primer to the flanking sequence, so a primer containing a base pair substitution, an insertion, or a deletion can be incorporated into the product. Since all of the PCR products contain the primer, they will all be modified at the same position, and their function can be compared with that of the unmodified sequence. Numerous variations on this idea have been made recently by scientists in many laboratories so that a mutation in any position in any DNA fragment can now be readily introduced.
Over the past several years, molecular biologists have been studying in great detail the nature of protein-DNA interactions. Although many experiments can be carried out in vitro, it is important to determine whether the same interactions occur in living cells. A few years ago, George Church and Walter Gilbert at Harvard University developed a method known as genomic footprinting that can determine which individual guanine bases in a DNA segment in vivo interact with a protein and thereby are protected from dimethylsulfate methylation. Only methylated guanosine sites can be cleaved by piperidine. However, this method is technically demanding and requires large amounts of DNA to achieve appropriate sensitivity. Recently, Paul Mueller and Barbara Wold at California Institute of Technology have modified the Harvard procedure by incorporating a PCR step. This innovation increases the amount of each of the piperidine-cleaved genomic fragments, making it easier to determine which fragments are missing and therefore which guanines were protected from methylation by bound protein. This approach holds great promise for studying protein-DNA interactions in living cells.
[Figure: PCR can be used to modify DNA sequences]
Appending DNA sequences. Although polymerase chain reaction primers are designed to be perfectly complementary to the DNA sequence flanking the target, the efficiency of PCR is unaffected if an unrelated DNA sequence is appended to the 5' end of the primer. These unrelated sequences become incorporated into the final PCR product. The technique can be used to add any DNA sequence to either or both ends of a known target, providing a useful way to incorporate different DNA elements for experimental studies.
Altering DNA sequences. PCR primers can also be made that have internal mismatches with the flanking sequences. If the primer is designed to substitute one nucleotide for another (shown above left), then the PCR products will all have this substitution, in this case a cytosine-guanine base pair in place of the original thymidine-adenine base pair. Alternatively, the primer can be designed to be missing several nucleotides complementary to those in the target strand (shown above right), which leads to PCR products with the same deletion. In this example, the sequence ATG (and its complement, TAC) is deleted from PCR product because the primer lacks the TAC sequence.
Prenatal diagnosis of genetic disease
Families that have had children with a lethal genetic disease can be aided by techniques that can deduce the
genetic makeup of the early fetus in a subsequent pregnancy. Prenatal diagnosis makes it possible for parents to make an informed reproductive choice when it is certain that the fetus will be affected.
PCR is now playing an increasingly important role in the diagnosis of genetic diseases where the molecular nature of the genetic defect is known. In such cases, PCR primers can be designed to flank the region of the gene which, in its mutated form, causes the genetic disease. In fact, the original development of PCR was aimed at enhancing the speed and sensitivity of prenatal diagnosis of one such disease, sickle cell anemia. Using PCR, the diagnosis takes less than one day, compared with several weeks using previous technology. Scientists throughout the world have applied PCR to the prenatal diagnosis of a large number of genetic diseases including phenylketonuria, β-thalassemia, muscular dystrophy, and, just recently, cystic fibrosis.
In order to carry out any prenatal diagnosis test, a sample of fetal DNA needs to be analyzed to determine whether it contains the mutated form (called an allele) of the gene that causes the disease. Because PCR amplifies the DNA fragment that contains the region that may be mutated, numerous new methods have recently become available for distinguishing between the normal and mutant alleles in the PCR product. Previously, tests were limited to those that used the most sensitive detection methods because the amount of fetal material available was often very small. With PCR, less sensitive but more convenient or automatable detection systems can now be utilized.
One current method to distinguish between alleles, originally developed by Bruce Wallace and colleagues at the City of Hope Medical Research Center, Duarte, Calif., uses allele-specific oligomer probes (ASOs). A collaboration between Erlich's laboratory at Cetus and Haig Kazazian's at Johns Hopkins Medical School has
incorporated PCR into the technique. ASOs are short DNA segments that are identical in sequence except for a single base difference that reflects the difference between normal and mutant alleles. Under the appropriate DNA hybridization conditions, these probes can recognize a single base difference between two otherwise identical DNA sequences. Probes can be labeled radioactively or with a variety of nonradioactive reporter molecules. Labeled probes are then used to test a PCR sample for the presence of the disease-causing allele. The presence or absence of several different disease-causing genes can readily be determined in a single sample. Honghua Li in Arnheim's laboratory at the University of Southern California, in collaboration with Erlich's group, demonstrated that a DNA sequence present in a single molecule can be amplified to a detectable level. Such sensitivity allows the analysis of DNA sequences present even in a single cell and adds a new dimension to predicting genetic diseases, since it can allow prenatal diagnosis on eggs fertilized in vitro before they are even implanted in the womb. In such studies, a single diploid cell (blastomere) taken at a very early stage of development (4-, 6-, or 8-cell stage) is the target of DNA analysis. Studies on single human blastomeres have recently been carried out by Alan Handyside and colleagues at Hammersmith Hospital in London, and normal pregnancies have resulted after the biopsied embryos were implanted.
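As a rough illustration of the allele-specific oligomer idea described above, the following Python sketch treats hybridization under stringent conditions as an exact string match between a probe and the amplified product. That is a deliberate simplification of real hybridization behavior, and the probe sequences, constant names, and function names are all invented for the example.

```python
# Hypothetical probes that differ at a single base (A versus T).
NORMAL_PROBE = "ACGTTGACCTGAGGAGTCA"
MUTANT_PROBE = "ACGTTGACCTGTGGAGTCA"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (uppercase A, C, G, T)."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_binds(probe, product):
    """Toy model: the probe 'hybridizes' only on a perfect match to either strand."""
    return probe in product or reverse_complement(probe) in product


def call_genotype(products, normal_probe=NORMAL_PROBE, mutant_probe=MUTANT_PROBE):
    """Classify a sample from its PCR product sequences (one per allele present)."""
    normal = any(probe_binds(normal_probe, p) for p in products)
    mutant = any(probe_binds(mutant_probe, p) for p in products)
    if normal and mutant:
        return "carrier"
    if mutant:
        return "affected"
    if normal:
        return "normal"
    return "no signal"


if __name__ == "__main__":
    # Invented flanking sequence around the probed region.
    normal_product = "GGATCC" + NORMAL_PROBE + "TTAGCA"
    mutant_product = "GGATCC" + MUTANT_PROBE + "TTAGCA"
    print(call_genotype([normal_product, mutant_product]))
```

In this toy model a sample that yields both product types is called a carrier, because each probe finds its own perfect match.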
The human genome project
The worldwide effort now under way to map and sequence the human genome eventually will produce the complete nucleotide sequence of every human chromosome. This project is expected to provide information that will significantly enhance human health care. Some of the major discoveries of the past few years—
[Figure: DNA from one cell can test for genetic disease. Labels: AA; Aa; aa; Implant]
PCR's ability to amplify a DNA sequence from a single cell to a detectable level can be used for prenatal diagnosis of recessive genetic diseases. Eggs fertilized in vitro from parents who both are known to carry a mutant gene that can lead to fatal genetic disease are grown through two or three cell divisions. A single cell at this very early stage in development—called a blastomere—can be analyzed using PCR to determine whether it has inherited the mutant gene (shown here in red) from both parents. Blastomeres containing the normal gene can be implanted in the womb.
such as the chromosomal location of the genes causing muscular dystrophy, Huntington's disease, and cystic fibrosis—have been made as a result of mapping and sequencing efforts. These and similar discoveries in the future will have enormous public health consequences.
The first phase of the human genome project will be construction of a fine-structure genetic map of each chromosome. The construction of such human genetic maps has depended upon carrying out family studies. The techniques used in family studies, however, place a limit on the resolution to which genes can be ordered with respect to one another. Genetic markers that are very close together are very hard to order because human family sizes are generally small, making it difficult to obtain the large number of offspring that are required for statistical reliability. A new method for genetic mapping based upon PCR analysis of DNA sequences present in single sperm cells, recently developed by Arnheim's laboratory, appears able to increase the resolution of mapping by an order of magnitude.
The second phase of the human genome project will be to construct a physical map in which individual fragments of DNA are ordered with respect to one another along each chromosome in preparation for sequencing. Recently, members of an advisory committee to the project proposed that PCR can play an important role in this physical mapping process. By providing a simple and rapid way of determining whether a particular sequence in one DNA fragment is found in another fragment, PCR will help determine whether the two pieces of DNA are adjacent to one another in one of the chromosomes.
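To illustrate why sample size limits genetic mapping resolution, here is a hedged Python sketch of estimating a recombination fraction from single-sperm typing results. It assumes a donor heterozygous at two loci with known phase (parental haplotypes "AB" and "ab"), ignores typing errors, and uses a simple binomial standard error. The function names and counts are hypothetical and are not taken from the article.

```python
import math


def recombination_fraction(haplotypes, parental=("AB", "ab")):
    """Fraction of typed sperm carrying a non-parental (recombinant) haplotype."""
    typed = len(haplotypes)
    if typed == 0:
        raise ValueError("no sperm were typed")
    recombinants = sum(1 for h in haplotypes if h not in parental)
    return recombinants / typed


def standard_error(fraction, n):
    """Binomial standard error of a fraction estimated from n observations."""
    return math.sqrt(fraction * (1.0 - fraction) / n)


if __name__ == "__main__":
    # Invented counts: 1,000 sperm typed, 20 of them recombinant.
    sperm = ["AB"] * 480 + ["ab"] * 500 + ["Ab"] * 11 + ["aB"] * 9
    r = recombination_fraction(sperm)
    print(f"recombination fraction: {r:.3f}")
    print(f"standard error with 1,000 sperm: {standard_error(r, len(sperm)):.4f}")
    print(f"standard error with 20 offspring: {standard_error(r, 20):.4f}")
```

With these invented numbers, the uncertainty from 20 offspring is larger than the 2% fraction being estimated, while typing 1,000 sperm brings it down to a fraction of a percent. This is the sense in which large samples are needed to order closely spaced markers.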
Retrospective analysis of human tissue
Pathologists have been storing both normal and diseased human tissues since the turn of the century. The most common archival specimens are formalin-fixed, paraffin-embedded tissues. The worldwide collection of these archival tissues could be valuable for the study of the association of biological agents (viral, bacterial, or parasitic) or cellular DNA lesions with disease. DNA can be extracted from paraffin-embedded tissue, but archival material is not always suitable for standard DNA analysis because the DNA slowly degrades and most standard analytical techniques for studying DNA sequences require high molecular weight, undegraded DNA. The DNA target size for PCR analysis, however, can be very small, since it is restricted to a short stretch between primers.
In 1987 Darryl Shibata in Arnheim's laboratory showed that even after 40 years of storage, a 5-µm section taken from paraffin-embedded tissues from a cervical carcinoma patient still shows evidence of the presence of a virus known to be associated with that cancer. An advantage of using small slices of the embedded tissue is that its histology can be studied by light microscopic analysis using the thin sections immediately adjacent to the one analyzed by PCR.
This archival analysis allows immediate testing of hypotheses linking the presence or absence of specific DNA sequences with a particular disease, rather than requiring long prospective studies. In retrospective
studies, the clinical outcome is already known from the pathological material itself. Many years might be needed, for example, to study a significant number of cases of a rare form of cancer in order to examine the possibility that a virus plays a causative role in the cancer's development. With access to archival materials, however, all of the cases collected over the past 20 or 30 years could be examined immediately. With such applications in mind, the research potential of medical archival material for future generations might be significantly enhanced by finding tissue preservation techniques that maximize DNA integrity. One interesting application of the retrospective analysis of human tissues using PCR was carried out by Manuel Perucho's group at the State University of New York, Stony Brook, and Shibata. Their study was aimed at understanding the frequency of a specific gene mutation in human pancreatic tumors. One class of genes which, when mutated, contributes to tumor formation
[Figure: DNA studied from stored tissue samples. Labels: Tissue embedded in paraffin block; Thin section 1; Thin section 2; PCR analysis]
Diseased tissue, stored from biopsies, can be studied using the polymerase chain reaction to test hypotheses linking biological agents or DNA lesions with particular diseases. DNA fragments large enough to contain the region of interest are extracted from a single thin section of paraffin-embedded tissue, and the DNA amplified by PCR. Microscopic examination of the immediately adjacent thin section confirms the presence of the diseased tissue. DNA slowly degrades in such pathological samples, but since PCR requires only a relatively short segment of intact DNA, the technique can be used on samples that have been stored for decades.
are members of the ras family. Mutated ras genes have been found in cells of different tumor types by many groups throughout the world. However, these oncogenes are usually present in only a fraction of tumors of any particular type. No class of spontaneous human tumors had previously been reported in which the majority of tumors contains activated ras oncogenes. The researchers combined PCR analysis of archival formalin-fixed pancreatic tumor specimens with microscopic examination of adjacent paraffin sections, which allowed accurate selection of both normal and neoplastic tissue for analysis. They identified a mutated c-k-ras gene in over 90% of the pancreatic tumors. Studies by Johannes L. Bos and his colleagues at the State University of Leiden in the Netherlands and K. Grunewald and colleagues of the University of Ulm in West Germany also find that the vast majority of pancreatic tumors contain a mutated c-k-ras gene. This information may be useful in understanding the origin and perhaps treatment of this usually incurable cancer.
Diagnosing other diseases
Because of its ability to amplify a specific sequence present in a complex mixture of DNAs, PCR was immediately applied to the detection of disease-causing viruses. A collaborative effort between the laboratories of John J. Sninsky at Cetus and Bernard J. Poiesz at the State University of New York, Syracuse, was the first to apply PCR to the detection of the AIDS virus. The blood test currently used for human immunodeficiency virus (HIV) does not detect the virus itself, but instead confirms the presence in an individual of antibodies that the body produces against HIV-1 proteins. The PCR-based test detects the nucleic acid (DNA or RNA) of the virus itself, even before the immune system has mounted an antibody response. Such direct detection has important uses in evaluating possible AIDS therapies, where the goal is to reduce the amount of virus present in individuals already infected and, therefore, antibody-positive.
Scientists throughout the world are also using PCR to study the presence of human papilloma virus strains in cervical tissues in further exploration of the already known association between certain viral strains and cervical carcinoma. In addition, PCR has been applied to the detection of cytomegalovirus and measles, Epstein-Barr, hepatitis, and herpes viruses.
PCR's ability to detect a rare sequence present in a complex mixture of DNA has also been applied to cancer diagnosis. Leukemia patients receive therapy until their disease is in remission and no more cancer cells can be detected. Physicians want to start therapy again if the patient has a relapse and leukemia cells are again detected. At present, the reappearance of the disease is detected by microscopic observation of a blood smear for the presence of leukemia cells or by relatively insensitive molecular biological tests, which can take several weeks.
For a certain form of leukemia characterized by a chromosomal translocation that has been defined at the DNA level, PCR was shown by Ming-Sheng Lee at the University of Texas System Cancer Center and Stanley
Korsmeyer's group at Washington University in Saint Louis to detect the presence of very small numbers of leukemia cells undetectable by other methods. Because the DNA translocation can be detected by PCR analysis, the presence of leukemia cells can be identified quite early during a relapse, thereby allowing optimal therapeutic regimens to be devised.
Revealing evolutionary origins
Molecular evolutionists have been limited to reconstructing evolutionary history by comparing DNA sequences obtained from species living today. Standard molecular analysis requires large amounts of sample and DNA that is highly intact. Since these are not necessary for PCR-based studies, it's now possible to analyze samples from long-dead individuals that exist in museum collections, including those from species that are extinct.
Scientists in Allan Wilson's laboratory at the University of California, Berkeley, have used PCR to amplify DNAs from the quagga, an extinct zebralike animal whose only remains consist of bits of dried skin. In another study by scientists at Berkeley and the University of Florida, DNA sequences from a 7000-year-old mummified human brain were analyzed. Most remarkably, a collaboration between Michael T. Clegg and colleagues at the University of California, Riverside; David E. Giannasi at the University of Georgia; Charles J. Smiley at the University of Idaho; and Gerard Zurawski at DNAX Research Institute in Palo Alto shows that DNA sequence information could be obtained by carrying out PCR on DNA extracted from a 17 to 20 million-year-old plant fossil. It seems clear that a new field of molecular
Cetus researcher loads samples into machine that performs polymerase chain reaction
PCR amplifies rare target sequences, dramatically changing DNA probe technology
The polymerase chain reaction has dramatically changed the use and potential for commercialization of DNA probe technology. In diagnostics, for example, prior to PCR, the most rapid, convenient, and inexpensive methods to detect a pathogen were based on looking for antibodies present in a patient or detecting a pathogen-specific protein directly, using an immunoassay. Nucleic acid-based tests were not used because they required isolation of the DNA or RNA from a biologically complex sample, technically sophisticated manipulations, and the use of specific detection probes labeled with short-lived radioactive isotopes in order to achieve the necessary sensitivity.
PCR has significantly altered this situation. Because PCR amplifies the rare target sequence in a complex mixture, short-lived radioisotopes are not required to obtain sensitivity. Detection methods such as enzyme-based assays, chemiluminescence, and fluorescence energy transfer can be used. Finding evidence of a pathogen is no longer a matter of looking for a needle in a haystack; after PCR, the sample is predominantly composed of needles.
One method to detect PCR product is a dot blot format. The product is spotted on and attached to a solid support, such as a nylon membrane, and exposed to a labeled, single-stranded DNA probe that will specifically anneal to the PCR product. The probe can be labeled radioactively or with molecules capable of being detected by fluorescence or chemiluminescence. Detection of label in the region of the spot signifies the presence of pathogen in the biological sample.
Because PCR synthesizes more copies of the pathogen sequence than were present in the sample originally, it's possible to label the pathogen-specific PCR product itself. A pathogen-specific DNA sequence can be labeled nonradioactively during PCR in two different ways. In one, the PCR primer is labeled. Since all PCR products come from primer extension, labeling the primer also labels all products. Such labels must be designed not to interfere with the ability of the primer to anneal to target or to be extended by the polymerase. The label also must be thermostable. The most frequently used labels of this type are fluorescent dyes, such as fluorescein or rhodamine, or the vitamin biotin. These labels typically are attached by a short linker arm to the 5' end of the primer or to one of its purine or pyrimidine bases. Such labels can be incorporated during the synthesis of the primer, or the oligonucleotide can be modified following synthesis.
Another way to label PCR product is to include modified nucleoside triphosphates in the PCR reaction. A number of modified triphosphates are suitable substrates for polymerase and can act as a
paleontology and archeology is emerging that will provide fundamental information on evolution.
PCR in the courtroom
During the past 10 years, extensive human genetic variation has been uncovered at the DNA level. This discovery has provided DNA markers, which can be used to help identify individuals in applications such as forensics. Studying individual differences at the DNA level is accomplished by DNA "fingerprinting," a technique originally developed by Alec J. Jeffreys and colleagues at the University of Leicester in England using the traditional molecular biological tools of Southern blotting and hybridization. While this method has the capability of high-resolution identification, it requires relatively large amounts of undegraded DNA. Very often, samples found at crime scenes contain only small amounts of degraded DNA, which makes them unsuitable for analysis by these methods. For this reason, analysis of DNA samples using PCR
template in future cycles once incorporated into product. For example, a deoxyuridine triphosphate containing a biotinylated sidechain at the 5 position of the uracil ring will be recognized by polymerase as TTP and incorporated into product opposite adenosines in the template, albeit at a reduced efficiency. Even thermolabile, modified triphosphates can be incorporated into product if they are introduced into the reaction mix during the extension portion of the last cycle.
Once appropriately labeled PCR product is produced, it can be detected conveniently by "capturing" the pathogen-specific product with a nonlabeled, pathogen-specific probe. The capture probe must be specific to the PCR target and can be attached to a solid support. This so-called reverse dot blot procedure was developed by Randall K. Saiki and colleagues at Cetus Corp. After thoroughly washing the support to get rid of excess labeled primers or modified triphosphates, the presence of labeled PCR product can be detected by a number of methods. In the case of product labeled with biotin, the product will bind very tightly to the bacterial protein streptavidin. If the latter is introduced as a streptavidin-enzyme complex, a colored precipitate will be produced upon addition of the appropriate enzyme substrate, as shown in the diagram. Alternative methods use streptavidin
technology is beginning to be introduced as evidence in criminal cases. A collaboration between Higuchi, now at Cetus, Erlich, and George F. Sensabaugh and coworkers at the University of California, Berkeley, shows that DNA analysis of a gene in the human major histocompatibility complex can be made using a single human hair. Adaptation of additional genetic loci to the PCR format will increase the statistical significance of finding DNA identity between a suspect and material found at a crime scene. However, even analysis of only a single genetic locus is sufficient to exclude a suspect if the samples do not match.
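The statistical point above can be sketched in a few lines of Python. The sketch assumes that the loci are inherited independently and that the population frequency of the observed genotype at each locus is known, so the chance that an unrelated person matches at every locus is the product of the per-locus frequencies. The frequencies, locus names, and function names below are invented placeholders, not data from the article.

```python
from math import prod


def random_match_probability(genotype_frequencies):
    """Chance that an unrelated person matches at every locus.

    Assumes independent loci; each entry is the population frequency
    of the observed genotype at one locus.
    """
    for f in genotype_frequencies:
        if not 0.0 < f <= 1.0:
            raise ValueError("frequencies must be in the interval (0, 1]")
    return prod(genotype_frequencies)


def is_excluded(suspect_profile, evidence_profile):
    """A mismatch at any locus typed in both samples is enough to exclude."""
    shared = suspect_profile.keys() & evidence_profile.keys()
    return any(suspect_profile[locus] != evidence_profile[locus] for locus in shared)


if __name__ == "__main__":
    # Hypothetical genotype frequencies at one, two, and three loci.
    for freqs in ([0.1], [0.1, 0.05], [0.1, 0.05, 0.2]):
        print(len(freqs), "loci:", random_match_probability(freqs))
    # Hypothetical profiles that differ at the single typed locus.
    print(is_excluded({"locus1": "1,3"}, {"locus1": "1,4"}))
```

Each additional independent locus multiplies the match probability by another factor below one, which is why typing more loci strengthens a reported match, while a single mismatch already excludes.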
A chemist's view of PCR
PCR is used primarily as an analytical tool and, just as it is not necessary to understand quantum mechanics to operate a spectrophotometer, PCR can be carried out without fully understanding the molecular dynamics that underlie the reaction. Indeed, part of the universal appeal of PCR is its seeming simplicity and the ease
[Figure labels: Biotin label on primer; Biotin label on nucleotide; PCR; Denaturation and "capture"; Nylon strip with four different pathogen-specific capture probes; Top view of nylon strip. Legend: Biotin; Primer; Nucleotide; Streptavidin-enzyme complex; S = Noncolored substrate; P = Colored product precipitate]
with which the process may be automated using broadly applicable protocols. PCR, however, is a complex phenomenon.
The original development of PCR required careful attention to a number of problems inherent in nucleic acid interactions. The primers need to be long enough to overcome the statistical likelihood of their sequence occurring randomly in the nontarget DNA. A length of 20 nucleotides is usually sufficient. The primer concentration must be high enough so the primers anneal to the single-stranded target faster than the target reanneals to its complementary strand. The specificity of the primer to interact with template rather than nontarget DNA is dependent on temperature and salt concentration, and appropriate conditions must be determined empirically. Finally, the conditions of the reaction must also be compatible with full activity of the polymerase.
The reaction itself frequently is carried out in a 100-µL volume, contained in a 0.5-mL polypropylene tube. The reaction mixture typically consists of the
alone to capture a biotin-labeled PCR product. For example, if PCR is carried out with one primer labeled with biotin and the other with a fluorescent dye, then the streptavidin-captured product will fluoresce. The capture approach allows simultaneous detection of different diagnostic sequences. For example, a number of different biotin-labeled primer sets, each for a specific pathogen, could be added to the same sample. The total PCR product could then be added to a single solid support, containing distinct regions, each with a capture probe specific for the PCR product of one of the pathogens. After the support is washed stringently and streptavidin-enzyme complex is added, the presence of any one of the pathogens would be indicated by color production in one of the specific regions. Detection of multiple pathogens using the original dot blot procedure involves many more manipulations, since a different aliquot of unlabeled PCR product must be tested with each labeled, pathogen-specific probe in a separate hybridization experiment. In the reverse dot blot procedure, one hybridization step could conceivably test for 10 to 20 different pathogens. The importance of probe technology opens up many opportunities for the development of ways to design and attach novel reporter groups. This area will remain a challenge for chemists as automation of PCR technology continues during the next few years.
DNA sample containing the target, 20 nanomoles of each of the four deoxynucleoside triphosphates (dATP, dCTP, dGTP, and TTP), 10 to 100 picomoles of each primer, the appropriate salts and buffers, and DNA polymerase. The solution is heated to 95 °C for 15 seconds to denature the DNA, cooled to 54 °C for 15 seconds to permit primer annealing, and kept for 30 seconds at 72 °C to allow for primer extension by the polymerase. These three steps constitute one PCR cycle. The solution is overlaid with mineral oil to prevent evaporation, thus allowing more rapid thermal equilibration of the reaction mixture and eliminating increases in the concentration of reagents during the course of the reaction. The interactions between the many components in the reaction are incompletely understood, so it is conceivable that additional ingredients will be found that enhance the efficiency and specificity of PCR. A PCR experiment is frequently described in terms of discrete cycles in which the amount of product is assumed to double with each cycle. Observation shows,
however, less than twofold increases per cycle for the initial cycles, followed eventually by a plateau, wherein the product concentration levels off. Thus, the average efficiency of the reaction per cycle is less than 100%. There are several reasons why product concentration eventually reaches a plateau. As double-stranded product begins to accumulate, competition increases between annealing of template to primer and reannealing of the complementary template strands to one another. Also, during the early stages of the reaction, much more enzyme is present than target; later, when the target is in excess, not enough enzyme may be present to fully extend all primed templates in a reasonable time period. The major contributors to the cost of PCR are the polymerase and primers. The polymerase currently used costs about $1.50 per analysis. Synthetic oligonucleotide primers made commercially by automated DNA synthesis cost about $2.00 per pair per analysis. PCR may be used even more extensively once cheaper ways of preparing enzyme and synthesizing primers are developed. Although PCR has been employed in an ever-increasing variety of contexts, the reagents and reaction conditions have undergone very little fundamental change. This is probably attributable to the inherent robustness of the reaction and the tediousness of extensive optimization experiments. However, in 1988 a fundamental improvement of the method was made by introducing a thermostable DNA polymerase from Thermus aquaticus (Taq). This improvement was devised by David H. Gelfand's and Erlich's groups at Cetus and Jane Gitschier's laboratory at the University of California, San Francisco. The Taq polymerase was later cloned and characterized by Gelfand's group.
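The sub-twofold yield per cycle described above can be made concrete with a small calculation. The following is a minimal sketch, assuming a constant per-cycle efficiency (a simplification; the text notes that efficiency changes and eventually falls to a plateau, which this sketch does not model). The function names and the example efficiency of 0.8 are illustrative assumptions, not values from the report.

```python
import math


def fold_amplification(cycles, efficiency=1.0):
    """Fold increase in target after `cycles` PCR cycles.

    Assumes (hypothetically) a constant per-cycle efficiency between 0 and 1,
    where 1.0 means the target doubles every cycle.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


def cycles_needed(fold, efficiency):
    """Smallest whole number of cycles giving at least `fold` amplification."""
    if fold < 1.0 or not 0.0 < efficiency <= 1.0:
        raise ValueError("fold must be >= 1 and efficiency in (0, 1]")
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))


if __name__ == "__main__":
    # Compare ideal doubling with an assumed 80% efficiency over 30 cycles.
    for eff in (1.0, 0.8):
        print(eff, fold_amplification(30, eff), cycles_needed(1e6, eff))
```

Under these assumptions, a modest drop in efficiency compounds over many cycles, so reaching a given fold amplification takes noticeably more cycles than ideal doubling would suggest.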
The original PCR method used a DNA polymerase from the bacterium Escherichia coli at its optimal temperature of 37 °C. The heat-stable Taq polymerase allows primer annealing and extension to be carried out at significantly higher temperatures, which reduces imperfect annealing and subsequent nonspecific amplification, resulting in substantially purer product. An added benefit is that the heat stability of the Taq polymerase eliminates the need to add fresh enzyme to the reaction after the DNA denaturation step in each cycle, as is required with the E. coli enzyme. Development of the heat-stable enzyme led to the introduction of automated PCR machines that have contributed significantly to the rapid application of this technology by the scientific community. The enzymology of Taq polymerase has been studied by Gelfand's laboratory and Thomas A. Kunkel at the National Institute of Environmental Health Sciences. A key question being addressed is the extent to which the enzyme is processive or distributive. A completely processive polymerase will extend the primer to the end of the template before dissociating from the newly formed duplex, whereas a distributive polymerase will extend one primer incompletely, dissociate, and find another primed template to extend. Although both mechanisms can eventually lead to a fully extended primer, a more processive enzyme produces a longer primer extension in a given time period. The possibility of using heat-stable polymerases from other sources, including genetically engineered variants, is also being investigated at Cetus and in other laboratories. At present, PCR products on the order of a few thousand base pairs can be made routinely. Efforts are under way to make these products significantly longer. Finally, the discovery of any additional activities of the polymerase (such as the ability to degrade
Population's genetic variability leads to forensic use of PCR

At the scene of a murder, hair samples are found that do not appear to be those of the victim. These samples can be studied using a polymerase chain reaction test that identifies their genetic makeup at the major histocompatibility complex locus known as DQ-alpha. This locus exists in six distinguishable forms, called alleles. The victim is typed as having alleles 2 and 3 and the evidence hair sample has alleles 2 and 4. Three suspects are arrested. Their genetic makeup at the DQ-alpha locus is also determined. Suspects A and B can be excluded with certainty, because their genetic makeup is different from that of the evidence hair sample. Suspect C, however, has the same alleles of the DQ-alpha locus as the hair
sample and, therefore, is included among those individuals who could have committed the crime. Suspect C may just by chance have the same genetic makeup as the person whose hair was found, or may, in fact, have been at the crime scene. A statistical likelihood that the suspect and probable killer have the same genetic makeup at the DQ-alpha locus just by chance can be calculated based on the relative frequencies of alleles 2 and 4 in the population, as well as additional considerations. The lower the probability, the higher the chance that the probable killer and suspect are the same person. The range of probabilities for identity due to random chance using information on the DQ-alpha locus alone lies between 1 in 10 and 1 in
100. Using PCR to gain information on additional genetic loci could lower these probabilities significantly. However, DNA data are just one piece of evidence used by prosecutors or defense counsel in presenting their case. The ultimate court decision must be based on all the available evidence. Ed Blake of Forensic Science Associates in Emeryville, Calif., in collaboration with Russell G. Higuchi and Henry A. Erlich at Cetus, has tested samples for about 200 criminal investigations using PCR. So far, these data have been allowed to be introduced into 17 court proceedings. In a number of these cases, special hearings were required to determine the acceptability of introducing this new forensic method into criminal proceedings.
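The chance-match calculation mentioned in this sidebar can be sketched as follows. This is only an illustration under stated assumptions: it uses Hardy-Weinberg proportions and treats loci as independent, which stand in for the "additional considerations" the text refers to and may not hold in a real population. The allele frequencies below are hypothetical placeholders, not measured DQ-alpha frequencies.

```python
from math import prod


def n_genotypes(n_alleles):
    """Number of distinguishable genotypes at a locus with n_alleles alleles."""
    return n_alleles * (n_alleles + 1) // 2


def heterozygote_frequency(p, q):
    """Expected frequency of a heterozygous genotype (assumes Hardy-Weinberg)."""
    return 2.0 * p * q


def combined_match_probability(per_locus_frequencies):
    """Chance match across several loci, assuming the loci are independent."""
    return prod(per_locus_frequencies)


if __name__ == "__main__":
    # Hypothetical frequencies for alleles 2 and 4 (illustrative only).
    freq_2, freq_4 = 0.15, 0.30
    one_locus = heterozygote_frequency(freq_2, freq_4)
    print("genotypes at a six-allele locus:", n_genotypes(6))
    print("single-locus chance match:", one_locus)
    # Adding a second, hypothetical locus with genotype frequency 0.05.
    print("two-locus chance match:", combined_match_probability([one_locus, 0.05]))
```

With these placeholder frequencies the single-locus figure falls within the 1 in 10 to 1 in 100 range quoted above, and each additional independent locus multiplies the probability downward.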
Applications of PCR to industrial problems

An exciting potential application of the polymerase chain reaction, recently proposed by Gavin Dollinger of Cetus Corp., concerns a broader industrial use of PCR. The idea is to use nucleic acids (or their analogs) to tag materials one would like to be able to trace. The material would be tagged by adding a double-stranded nucleic acid (or analog) at a low enough level that it would not interfere with the performance of the material. Using PCR, one could later amplify the tag from a small amount of material and use it to trace the product back to its original manufacturer. Two attributes of PCR make it promising for this type of tagging: First, it can amplify very small amounts of initial target (even a single molecule) to easily detectable quantities, and in addition, the information-carrying capacity of nucleic acids (even short ones) is sufficiently high to enable the tagging molecule to encode a variety of types of data (such as manufacturer, product, and lot number). Thus, about $10^{60}$ ($4^{100}$) different sequences can be encoded by a DNA segment 100 base pairs long. The method potentially could be used to label materials such as petroleum products, pharmaceuticals, explosives, or industrial wastes. The technical challenge of the approach is to ensure that the nucleic acids or analogs are not rendered unamplifiable by the material being tagged. The tagging molecule must be compatible with, and recoverable from, the material being tagged. PCR is unlikely to ever be used to produce pieces of DNA in large (multigram) quantities because of the high cost of PCR reagents. However, in most molecular biological applications, a little DNA of defined sequence can go a long way. Microgram quantities of the nucleic acid sequence that comprises a particular gene can be cloned into a self-replicating expression vector (such as a bacterial, fungal, or mammalian cell line) to produce many grams of a desired protein.
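The information capacity of a nucleic acid tag, mentioned earlier in this sidebar, can be illustrated with a short sketch. It assumes a simple base-4 mapping of an integer identifier onto the letters A, C, G, and T; this mapping and the function names are hypothetical and are not the scheme proposed in the report. A practical tag would also need flanking primer sites and would have to survive in the tagged material, neither of which is modeled here.

```python
BASES = "ACGT"  # assumed digit order for the base-4 encoding (illustrative)


def encode_tag(value, length):
    """Encode a non-negative integer as a DNA sequence of fixed length."""
    if value < 0 or value >= 4 ** length:
        raise ValueError("value does not fit in a tag of this length")
    letters = []
    for _ in range(length):
        value, digit = divmod(value, 4)
        letters.append(BASES[digit])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(letters))


def decode_tag(sequence):
    """Recover the integer encoded by encode_tag."""
    value = 0
    for base in sequence:
        value = value * 4 + BASES.index(base)
    return value


if __name__ == "__main__":
    lot_number = 123456  # hypothetical identifier
    tag = encode_tag(lot_number, 20)
    print(tag, decode_tag(tag))
    print("sequences available in 100 bases:", 4 ** 100)
```

Each base carries two bits under this mapping, which is why even a short segment offers far more distinct sequences than any realistic set of manufacturers, products, and lot numbers would need.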
PCR's ability to produce any desired DNA sequences
primers as well as extend them) is obviously critical to a more complete understanding of the reaction. Primer extension is probably going on all through the PCR cycle. The thermostable polymerase has measurable activity at room temperature and, therefore, can be expected to be active during the annealing as well as the polymerization steps of the cycle. This lower temperature activity has at least two implications:
• Primer extension at nontarget sites can occur at lower and less stringent temperatures.
• A two-temperature protocol, in which denaturation of the target occurs at high temperature, followed by simultaneous primer annealing and extension at an intermediate temperature, may in some cases improve specificity.
PCR occurs in a kinetically complex milieu; polymerase is being heat inactivated and triphosphates are thermally degraded at a finite rate each cycle. The concentration of both double-stranded target and single-stranded primer is changing continuously throughout
[Figure: Tankers containing "tagged" oil; oil slick; oil sample recovered from slick; target DNA extracted, amplified by PCR, and sequenced; source of oil leak identified.]
will have broad application in industries that use genetic engineering to produce new products. By complementing traditional techniques of amino acid and nucleotide substitution, PCR could accelerate the production of new materials, such as novel forms of cloned proteins with more desirable properties (muteins), antibodies that have been engineered to act as enzymes (abzymes), and RNA molecules with catalytic cleaving activities (ribozymes) toward specific RNA targets.
the reaction. The nature of the double-stranded target changes during the course of the reaction as well. If a segment of genomic DNA is being amplified, the target sequences at first are found in very long stretches of DNA. As the reaction proceeds, most of the target is found in relatively short PCR products. Thus, the requirements for full denaturation of the DNA may vary over the course of the reaction. As the reaction proceeds, higher temperatures are required to separate the DNA strands of the product as a result of its increasing concentration. PCR offers many challenging investigative opportunities for chemists concerned with elucidating specific molecular interactions. The optimization of PCR yields, as well as the application of the technique to novel targets or particularly challenging clinical specimens, requires a more thorough understanding of the complex chemical interactions upon which the reaction depends. Why do some primers work better than others and lead to greater specificity? Why are some regions of
nucleic acid more easily amplified than others? What effect does an unusual local secondary structure, such as a hairpin loop in the template, have on the ability of the reaction to amplify a DNA segment? How does the presence of a polymerase affect the stability of target-primer complexes? What makes an enzyme, such as Taq polymerase, heat stable? By focusing attention on questions such as these, the steadily increasing use of PCR may significantly influence fundamental research in several areas of biochemistry, such as nucleic acid structure, protein-nucleic acid interaction, and protein thermostability.
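The earlier statement in this section that a primer of about 20 nucleotides is long enough to be statistically unlikely in nontarget DNA can be checked with a rough estimate. The sketch below assumes a random genome with the four bases equally frequent and counts both strands; real genomes are not random, so the numbers are indicative only. The function name and the default of two strands are assumptions made for illustration.

```python
def expected_random_matches(primer_length, genome_length, strands=2):
    """Expected number of exact chance matches to one primer.

    Assumes (hypothetically) a random sequence with equal base frequencies,
    so any given site matches with probability 1 / 4**primer_length.
    """
    if primer_length < 1 or genome_length < 0:
        raise ValueError("lengths must be positive")
    positions = max(genome_length - primer_length + 1, 0)
    return strands * positions / 4 ** primer_length


if __name__ == "__main__":
    genome = 10 ** 9  # order of magnitude quoted in the report for higher organisms
    for length in (10, 12, 16, 20):
        print(length, expected_random_matches(length, genome))
```

Under these assumptions a 12-nucleotide primer would be expected to match by chance many times in a genome of this size, whereas a 20-nucleotide primer would be expected to match far less than once, which is consistent with the length recommended in the text.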
PCR pitfalls

A number of problems can be encountered in carrying out PCR experiments. For example, gel electrophoresis and ethidium bromide staining techniques reveal that DNA fragments with unexpected molecular weights often can be produced. These products can arise through enzymatic extension of primers that have annealed imperfectly to nontarget sites. Nontarget sequences that anneal to only one primer can become extended once each cycle but cannot serve as template for the other primer in subsequent cycles. As a result they increase in concentration linearly, for example by 20-fold after 20 PCR cycles. However, on rare occasions, a nontarget extension product will also anneal nonspecifically to the second primer. The extension product resulting from this second mismatched annealing will contain the sequence information from both primers. Such molecules, just like target sequences, will be exponentially amplified in subsequent cycles. The more cycles of PCR that are carried out, the more likely it is that such rare nonspecific priming events will occur.
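The contrast between linear and exponential accumulation described above can be put in numbers. This is a minimal sketch assuming ideal behavior (one extension per template per cycle and perfect doubling); the function names are illustrative, and the cycle at which a two-primer artifact first appears is a hypothetical input.

```python
def single_primer_fold(cycles):
    """Fold increase for a site primed by only one primer.

    Follows the report's figure of a 20-fold increase after 20 cycles:
    one extension product is added per cycle, so growth is linear.
    """
    return cycles


def two_primer_fold(cycles):
    """Fold increase for a product primed from both ends (ideal doubling)."""
    return 2 ** cycles


def late_artifact_fold(total_cycles, first_cycle):
    """Fold increase of an artifact that first forms in cycle `first_cycle`.

    Assumes it doubles in every remaining cycle after it appears.
    """
    if not 0 <= first_cycle <= total_cycles:
        raise ValueError("first_cycle must lie within the run")
    return 2 ** (total_cycles - first_cycle)


if __name__ == "__main__":
    n = 20
    print("single-primer product:", single_primer_fold(n))
    print("target:", two_primer_fold(n))
    print("artifact arising in cycle 10:", late_artifact_fold(n, 10))
```

Even an artifact that appears halfway through the run can outgrow all of the linearly accumulating products, which is one way to read the warning that more cycles make rare nonspecific priming events matter more.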
Another commonly encountered artifact is the formation of a product known as primer dimer. The characteristic feature of this undesirable by-product is its length (usually about 40 base pairs), which is approximately the sum of the lengths of the two primers. Primer dimers can arise in PCR in the absence of DNA template. Sequence analysis of the DNA of some of these fragments indicates that the fragments are, in fact, composed of the two primer sequences oriented with their 3' hydroxyl ends together, with one or a few intervening bases. Rare, nontemplate-directed addition of bases to the 3' end of the primers could eventually lead to complementary ends that would form a substrate for the Taq polymerase. Primer dimer usually is seen in PCR experiments involving high cycle numbers, such as those necessitated by rare targets. When primers are inadvertently designed with partially complementary 3'-termini, primer dimer formation appears to be more frequent. The amount of primer dimer formed is inversely proportional to the amount of the target PCR product because of enzyme competition between the various DNA molecules being replicated. Efficient primer dimer amplification reduces the sensitivity of specific target analysis. Another PCR problem is caused by the Taq polymerase, which lacks the 3' to 5' exonuclease "proofreading" activity of some polymerases and occasionally incorporates the wrong base into the PCR product. A number of published studies estimate the rate of misincorporation as one in 10,000 to one in 200,000 nucleotides synthesized per PCR cycle. The fraction of PCR product that will contain such a mutation depends on the number of bases in the target molecule, the rate of misincorporation per base per cycle, and the number of cycles. Michael Krawczak, Jochen Reiss, and their colleagues at the University of Göttingen, West Germany, have recently developed a mathematically precise theory to estimate the frequency of molecules without misincorporations in PCR. Assuming the highest misincorporation rate, this theory predicts that for a 100-base pair target, after 20 cycles, 95% of the PCR product molecules will be unaltered by misincorporation. With a lower misincorporation rate, of course, the fraction of unaltered product becomes greater. Most of the altered molecules will not contain exactly the same mutation. The consequences of Taq misincorporations need to be considered for each application. For most analyses, these misincorporations have relatively little effect, but the interpretation of some experiments may be sensitive to this error. Finally, the exquisite sensitivity of PCR is one of its greatest advantages, yet it can also be a source of significant experimental error.
Since the method can amplify targets that are present in extremely small amounts, it is very important not to contaminate experimental samples with minute amounts of PCR product from previously performed experiments. In experiments using human DNA targets, the investigator's own DNA present in dead skin, hair, or other sources must also be excluded. PCR requires special precautions to reduce the formation of aerosols and other routes of contamination. Typical steps include the use of positive displacement pipets with disposable tips and of separate hoods or laboratory areas for the preparation and subsequent analysis of PCR product.
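The dependence of misincorporation on target length, error rate, and cycle number, discussed earlier in this section, can be explored with a simplified model. The sketch below is not the Krawczak and Reiss theory. It assumes perfect doubling, so that the number of copying events in the history of a final strand follows a binomial distribution, and it assumes errors are independent at each base. Because of these simplifications it will not reproduce the 95% figure quoted above exactly; at the highest quoted rate it gives roughly 90%.

```python
def fraction_unaltered(target_length, error_rate, cycles):
    """Fraction of final strands carrying no misincorporation.

    Simplified model (an assumption, not the published theory): every strand
    is copied once per cycle, so a final strand has been through k copying
    events with probability C(cycles, k) / 2**cycles. A single copy is
    error-free with probability q = (1 - error_rate) ** target_length, and
    summing q**k over the binomial distribution gives ((1 + q) / 2) ** cycles.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    error_free_copy = (1.0 - error_rate) ** target_length
    return ((1.0 + error_free_copy) / 2.0) ** cycles


if __name__ == "__main__":
    # The two ends of the misincorporation range quoted in the report.
    for rate in (1 / 10_000, 1 / 200_000):
        print(rate, fraction_unaltered(100, rate, 20))
```

The qualitative behavior matches the text: a lower error rate, a shorter target, or fewer cycles each raise the fraction of unaltered product.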
Suggested Readings
White, T., Arnheim, N., Erlich, H., "The Polymerase Chain Reaction," Trends in Genetics, 5, 179 (1989).
Arnheim, N., White, T., Rainey, W., "Application of the Polymerase Chain Reaction to Organismal and Population Biology," Bioscience, 40, 174 (1990).
Arnheim, N., "The Polymerase Chain Reaction," in "Genetic Engineering, Principles and Methods," Vol. 12, p. 115, Plenum Press, New York, 1990.
Gibbs, R. A., "DNA Amplification by the Polymerase Chain Reaction," Analytical Chemistry, 62, 202 (1990).
Erlich, H., editor, "PCR Technology: Principles and Applications for DNA Amplification," Stockton Press, New York, 1989.
Innis, M. A., et al., editors, "PCR Protocols, a Guide to Methods and Applications," Academic Press, New York, 1990.
Mullis, K. B., "The Unusual Origin of the Polymerase Chain Reaction," Scientific American, April 1990, page 56.

Norman Arnheim, professor of molecular biology at the University of Southern California, received a Ph.D. degree in genetics at the University of California, Berkeley, in 1965. Following postgraduate research in biochemistry, he taught in the biochemistry department of the State University of New York, Stony Brook, until 1983. In 1984 and 1985, he was a senior scientist and head of the human genetics laboratory at Cetus Corp., where he was one of the research team that developed the polymerase chain reaction. He has been at USC since 1985. His research focuses on the application of PCR to study human genetics, molecular biology, and disease diagnosis.
Corey H. Levenson is a senior scientist and associate director of the chemistry department at Cetus. After receiving a Ph.D. in pharmaceutical chemistry from the University of California, San Francisco, in 1981, he joined Cetus as a member of Kary Mullis' group. Since 1984, he has been manager of the nucleic acid chemistry laboratory where he conducts research on nucleic acid structure and dynamics, DNA-drug interactions, and the diagnostic and therapeutic applications of synthetic oligonucleotides and their analogs.
Reprints of this C&EN special report are available in black and white at $10 per copy. For orders of 50 or more, subtract 30% from the total order cost. On orders of $50 or less, please send check or money order with request. Send orders to: Distribution, Room 210, American Chemical Society, 1155—16th St., N.W., Washington, D.C. 20036.