SCIENCE

Molecular Biologists Backing Effort To Map Entire Human Genome

Support grows for 15-year multimillion-dollar project to delineate the genes on each chromosome, but question of funding remains unsettled

Pamela S. Zurer, C&EN Washington
March 14, 1988 C&EN

Two years after molecular biologists began debating the merits of a massive project designed to map and sequence the human genome, it seems a consensus has been reached. "It's obvious we're going to go ahead with it," says James D. Watson, director of Cold Spring Harbor Laboratory. "The real question is: How will it be managed?"

Watson and the 14 other prestigious biologists who served on a National Research Council (NRC) committee last month gave the project the biology establishment's stamp of approval. Their report recommends that a 15-year, $200-million-a-year effort to map all human genes should begin immediately (C&EN, Feb. 15, page 5).

NRC recommendations on human genome project
• Fund human genome project at $200 million per year for about 15 years.
• Map human genome now, but postpone massive sequencing until better technologies are available.
• Stress technology development, especially for DNA sequencing.
• Include model organisms, such as yeast, drosophila, mice.
• Encourage both small research labs and larger multidisciplinary centers; avoid a single large production center.
• Make funding decisions through peer review process.
• Designate a major federal agency to lead project with advice from a scientific advisory board.
• Encourage international collaboration.
• Establish centralized facilities for storing and exchanging data and materials.

The goal of mapping and sequencing the human genome (that is, the complete array of DNA contained in the chromosomes) is to create a resource to be used in ongoing research. "Understanding the human genome is a tool to facilitate understanding human beings," says Bruce M. Alberts, professor of biochemistry at the University of California, San Francisco, and head of the NRC committee. "Biologists would use the maps the way writers use a dictionary."

The words in a dictionary of the genome are individual genes, of which humans have about 100,000 distributed over 23 pairs of chromosomes. The letters that make up the words are the nucleotide bases of DNA molecules; a single gene may contain from 1000 to more than 100,000 nucleotides. Human chromosomes contain a total of about 3 billion base pairs, but only about 5% of them are part of genes. The function of most of the other 95% simply isn't known.

Mapping the genome refers to pinpointing the location of genes and other features of interest on particular chromosomes. Sequencing (that is, determining the order of nucleotides on the chromosomes) is sometimes referred to as the ultimate map.

Knowing where on a specific chromosome a given gene is located would be useful in understanding, diagnosing, and
treating genetic diseases, Alberts notes. The information could help in deciphering how genes are turned on and off, crucial events in normal differentiation and in cancer. Comparing the human genome with that of other organisms would help in understanding evolution.

It was the idea of sequencing all 3 billion bases of the genome, at a cost estimated to come to about $3 billion and thousands of hours of labor, that made many molecular biologists balk when the idea was first proposed in early 1986. One of the most articulate critics has been David L. Baltimore, director of the Whitehead Institute for Biomedical Research and professor of biology at Massachusetts Institute of Technology. "The genome project seems a ploy to raise money, a project justified by its public relations value, not its scientific value," he said last month in his keynote address to the annual meeting of the American Association for the Advancement of Science (AAAS).

Part of the skeptics' uneasiness stems from the involvement of the Department of Energy (DOE), an agency not often linked with biological research. It was Charles P. DeLisi, however, then director of DOE's office of health and environmental research and now at Mount Sinai School of Medicine, who two years ago began pushing the goal of charting the human genome. DeLisi says that DOE's interest in the project arises naturally from its long-standing commitment to understanding the biological effects of nuclear radiation. He points out that the agency's national laboratories already have the facilities and computational support necessary for such a large-scale undertaking.

Cynics view DOE's fascination with the human genome as a budget-boosting tactic. But whatever the initial motivation, the idea of undertaking such a special project (the biologists' equivalent of the physicists' Superconducting Super Collider) spread quickly.

Baltimore and other critics are not convinced the human genome sequence is crucial to molecular biology today. They fear that vital small-scale research may be sacrificed to such large-scale ("moon shot," as Baltimore says) projects. They warn also of a loss of quality control that could result if a large bureaucracy beyond the reach of the peer review process carries out the human genome initiative.

But many critics have softened their views in the past year, in large part because the project as envisioned by the NRC panel would be carried out by small- and medium-sized laboratories whose work would be subject to peer review. They would concentrate initially on gene mapping and technology development (natural outgrowths of current research) and only turn to large-scale sequencing as the process became much more time- and cost-efficient.

One such doubter turned supporter is David Botstein, formerly professor of genetics at MIT, who recently moved to Genentech Corp. "We reached a consensus position in preparing the [NRC] report," Botstein, an NRC panel member, says. "Extreme views have been modified on both sides."

[Photo: Among the 15 members on the NRC panel were (clockwise from left) Caltech's Hood, Genentech's Botstein, and Columbia University's Cantor.]

Even the question of who will run the show is well on the way to being resolved. The NRC report advises designating one federal agency to lead the program, but the committee couldn't agree on which it should recommend to be in charge: DOE, the National Science Foundation, or the National Institutes of Health (NIH), perhaps the most obvious choice. "DOE indicated it wants to do it," said Watson at last month's AAAS meeting. "I think NIH could be the lead agency, but we perceive a lack of enthusiasm at NIH. A number of us are waiting to see what NIH's position is."

Now, NIH has signaled its readiness to plunge in. NIH director James B. Wyngaarden in early March told a Congressional hearing that he was setting up a new office of the human genome with its own associate director and a permanent advisory committee. An ad-hoc committee, made up largely of the same elite biologists who served on the NRC panel, already has given Wyngaarden its wish list of how it would like to proceed over the next five years.

Despite the growing consensus among molecular biologists, however, there remains the question of whether Congress will come up with the money. The NRC panel points out that $200 million annually is only about 3% of what the federal
government currently spends on biological research. The committee is adamant, however, that the human genome project be paid for with new funds, not by sacrificing ongoing programs. "The special effort shouldn't replace what we do now," Botstein says. "The research that we are doing now is the only way to know what the sequence means."

DOE has requested $18 million for human genome activities in fiscal year 1989, up from $12 million in the current fiscal year. And NIH has asked for $28 million in 1989 for research on the genome of humans and other complex organisms. Together, their budget requests for the next fiscal year total $46 million, a healthy piece of change but a long way from $200 million.

Yet the NRC panel members are optimistic the funds will materialize. "I think Congress is very enthusiastic about this project," says Charles R. Cantor, professor and chairman of genetics and development at the Columbia University college of physicians and surgeons. "I have testified before the Senate and I was amazed at how knowledgeable they were."

Indeed, several members of Congress, including Sen. Pete V. Domenici (R.-N.M.), Sen. Edward M. Kennedy (D.-Mass.), and Sen. Lawton Chiles (D.-Fla.), have introduced legislation concerning the human genome project.

Du Pont's DNA sequencer employs fluorescent nucleotide analogs

Du Pont's new automated instrument for sequencing DNA chains represents the sort of technological advance that the National Academy of Sciences says is needed throughout the human genome project. The sequencing system, which the company began shipping last month, centers on some clever chemistry that allows computerization of a process that had been painfully slow. The Du Pont sequencer has been deemed "elegant" by Leroy E. Hood, professor of biology at California Institute of Technology and one of the developers of a competing automated system marketed by Applied Biosystems Inc. (ABI) of Foster City, Calif.

The automated systems are modifications of the widely used enzymic sequencing technique that was developed in 1977 by Frederick Sanger and coworkers at the Medical Research Council in Cambridge, England. The enzyme in question is DNA polymerase, which uses a single strand of DNA as a template to make the complementary strand of the double helix from the four nucleotides deoxyadenosine 5'-triphosphate (dATP), deoxyguanosine 5'-triphosphate (dGTP), deoxythymidine 5'-triphosphate (dTTP), and deoxycytidine 5'-triphosphate (dCTP).

In the Sanger enzymic sequencing method, a single strand of the DNA chain whose sequence is to be determined is used as the template. But instead of complete complementary strands being synthesized, the process is interrupted by chain-terminating nucleotides that block further growth.

Four different reactions are carried out. Each employs a primer DNA sequence and radiolabeled dATP, dGTP, dTTP, and dCTP, but uses a different chain-terminating analog that corresponds to one of the four nucleotides. The chain terminators are 2',3'-dideoxynucleotides that lack the 3'-hydroxyl group needed to form a phosphodiester bond. Therefore, whenever the enzyme incorporates a chain terminator instead of a normal nucleotide, the growth of the DNA chain is arrested.

The resulting products from each reaction are a mixture of DNA fragments of varying lengths. For example, the reaction to which the dideoxy analog of dATP is added produces a family of oligonucleotides that all end with the dideoxy analog of A. The mixtures are separated by polyacrylamide gel electrophoresis, a technique that can resolve DNA fragments differing in length by a single nucleotide. The bands on the gel are visualized by autoradiography, a slow process that takes days. Each band is a DNA fragment ending with a known base, whose length can be determined by the distance it travels in the gel. An experienced worker can interpret the DNA sequence from the gel pattern.

The Du Pont system also uses DNA polymerase and chain terminators to make a family of partial complementary copies of the DNA to be sequenced. Du Pont's dideoxy chain terminators, however, are tagged with fluorescein dyes [Science, 238, 336 (1987)]. Each emits light of a slightly different wavelength when excited by an argon laser.

Because the chain terminators can be distinguished by their emission spectra, the Du Pont system combines all four in one pot rather than using separate reaction chambers. The resulting mixture of DNA fragments is separated on a single lane of a polyacrylamide gel. The identity of the nucleotide terminating each band on the gel then is determined by its characteristic fluorescent emission.

Using fluorescence rather than autoradiography to visualize the bands allows the electrophoresis gels to be read as soon as they are run. The Du Pont sequencer employs a scanning system that can read up to 12 lanes of a gel at a time; that is, 12 different DNA chains can be sequenced at once. The data are fed to a microcomputer that reconstructs the DNA sequences according to the order of appearance of each fluorescence signal.

The ABI instrument also makes use of fluorescent tags, not in the chain terminators, like Du Pont's, but in the short primer oligonucleotides needed to initiate DNA synthesis. Du Pont asserts its system allows more flexibility and minimizes errors. Nonetheless, ABI's system also capitalizes on computer-controlled data acquisition and analysis for speed.

Indeed, both firms claim their sequencers can identify about 10,000 nucleotides per day. In contrast to the average of about 50,000 nucleotides per year that the National Research Council estimates a skilled worker could identify using the established Sanger methods, that speed is remarkable.

Several different kinds of maps can be made of the human genome, just as there can be road, political, or geographical maps of a continent. "There are a number of kinds of maps, reflecting a number of purposes and a number of technical approaches," says Raymond L. White of Howard Hughes Medical Institute at the University of Utah Medical School.

White's research centers on mapping through genetic, or family, linkages. Such maps are made by measuring how often various traits are inherited together within families. The first gene to be mapped through genetic linkage, in 1911, was the gene responsible for color blindness. By tracking the appearance of the problem among males and females in families, researchers tied the responsible gene to the X-chromosome. Similarly, more than 100 other sex-linked genes have since been assigned to the X-chromosome.

"Mapping through family linkage permits mapping of genes that cause human disease, but are otherwise biochemically indistinguishable," says White. Among others, genes for Huntington's disease, neurofibromatosis, and Duchenne muscular dystrophy have been mapped through that approach.

White and other genetic-linkage map makers rely on what they call genetic markers to assign locations to genes. The markers are innocuous variations in DNA that can be detected using restriction enzymes (enzymes that cleave DNA in specific locations) or other DNA probes. The mapping rests on the axiom that if two traits are inherited together frequently, they must be relatively close to each other on a chromosome. That is, if all the members of a family who develop a genetic disease also share a certain genetic marker, the gene responsible for the disease must be near that marker.

White's group has been focusing on variable number tandem repeat markers, regions in DNA with short repeating units where the number of repeats varies from person to person. The most widely used markers are called restriction fragment length polymorphisms (RFLPs). These are variations in DNA sequences that cause deviations from the products that restriction enzymes normally would produce.
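The linkage axiom above, that a disease gene lies near any marker that consistently travels with the disease through a family, can be illustrated with a small Python sketch. The family data, the marker names, and the simple match-counting score are all hypothetical assumptions made for illustration; they are not taken from White's work, and real linkage analysis relies on pedigree-based statistics rather than a raw match fraction.

```python
from typing import Dict, List, Tuple


def co_inheritance(disease: List[bool], marker: List[bool]) -> float:
    """Fraction of family members whose marker status matches disease status."""
    if len(disease) != len(marker):
        raise ValueError("one observation per family member is required")
    matches = sum(d == m for d, m in zip(disease, marker))
    return matches / len(disease)


def rank_markers(
    disease: List[bool], markers: Dict[str, List[bool]]
) -> List[Tuple[str, float]]:
    """Sort markers so those most often inherited with the disease come first."""
    scores = {name: co_inheritance(disease, calls) for name, calls in markers.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical eight-person family: True means "has the disease" or
# "carries the marker variant".
affected = [True, True, False, True, False, False, True, False]
markers = {
    "marker_A": [True, True, False, True, False, False, True, False],
    "marker_B": [True, False, False, True, True, False, True, False],
    "marker_C": [False, True, True, False, True, False, True, False],
}

ranking = rank_markers(affected, markers)
for name, score in ranking:
    print(f"{name}: inherited with the disease in {score:.0%} of family members")
```

In this made-up family, marker_A tracks the disease in every member, so under the axiom it would be the best candidate for lying near the disease gene; the other two markers match less often and would be treated as more distant or unlinked.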
[Figure: Sanger enzymic method compared with the Du Pont modification. In both panels, the double-stranded DNA to be sequenced (example strand ACGTATGTCA, complement TGCATACAGT) is separated into single strands, and one strand is copied using DNA polymerase, DNA primer, dATP, dCTP, dGTP, and dTTP. Sanger enzymic method: separate reactions with ddATP, ddCTP, ddGTP, and ddTTP give fragments ending with A, C, G, and T; these are resolved by parallel gel electrophoresis and autoradiography, and the sequence is read from the gel. Du Pont modification: a single reaction with A*, C*, G*, and T* is followed by single-lane gel electrophoresis and fluorescence scanning, and the sequence is computed by the instrument. In each gel, bands run from slowest moving (longest fragments) to fastest moving (shortest fragments). Notes: ddATP, ddCTP, ddGTP, and ddTTP are chain-stopping dideoxynucleotide triphosphates. A*, C*, G*, and T* are fluorescein-labeled chain-stopping dideoxynucleotide triphosphates.]
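The chain-termination scheme described in the sequencing sidebar can be sketched in Python under simplifying assumptions: the primer is ignored, strand orientation is not modeled, and every possible terminated fragment is assumed to show up as a band. The function names are illustrative only, and the template is the ten-base example strand shown in the figure.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complementary_strand(template):
    """Bases the polymerase adds opposite the template, in order of synthesis."""
    return "".join(COMPLEMENT[base] for base in template)


def sanger_reactions(new_strand):
    """Four separate reactions, one per dideoxy terminator.

    Each reaction yields the lengths of the fragments that stop at its base.
    """
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read bands from fastest moving (shortest) to slowest moving (longest)."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


def one_pot_reaction(new_strand):
    """Single-lane variant: every band carries a dye naming its last base."""
    return list(enumerate(new_strand, start=1))


template = "ACGTATGTCA"
new_strand = complementary_strand(template)

# Four-lane Sanger readout: the lane tells the base, the position tells the order.
lanes = sanger_reactions(new_strand)
four_lane_read = read_gel(lanes)

# One-lane fluorescent readout: the dye color tells the base.
single_lane_read = "".join(dye for _, dye in sorted(one_pot_reaction(new_strand)))

print(lanes)
print(four_lane_read, single_lane_read)
```

Both readouts recover the same complementary sequence. The difference the sidebar stresses is where the base identity comes from: the lane in the four-reaction method, the fluorescent tag in the one-pot method.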
Genetic-linkage mappers are trying to identify markers at closely spaced intervals throughout the human genome, so that a disease gene can be quickly pinned down to a fairly narrow region. In fact, last fall scientists from Collaborative Research Inc., Bedford, Mass., announced they had completed a set of markers that completely spanned the human genome [Cell, 51, 319 (1987)]. White, however, thinks the Collaborative map is not dense enough to be called complete. Both groups are continuing to search for additional markers.

"With a high-density set of markers, we should be able to attack not only rare genetic diseases, but more common complex disorders such as heart disease, hypertension, and cancer," White says.

A second type of mapping is physical mapping, in which chemical behavior of DNA is used to place landmarks along the chromosomes. "The landmarks are specific sequences of nucleotides that are attacked by very specific [restriction] enzymes," says Maynard V. Olson, professor of genetics at Washington University's school of medicine. Some of those landmarks are RFLPs, which can be used to connect the physical map of the genome to the genetic-linkage map.

A given restriction enzyme cuts DNA at particular sites into an assembly of fragments. The trick at that point is to reconstruct the order in which the pieces are linked in the intact chromosome.

A method known as fingerprinting can be used to order the DNA fragments. A computer program, given certain characteristics of the fragments such as a pattern of cutting by a set of restriction enzymes, searches for overlaps between them. Another ordering technique uses small DNA pieces called linking probes that will stick to fragments from either side of a site where they were cut by a restriction enzyme.

The larger the DNA pieces, the fewer there are and the easier they are to place in order. But very large DNA segments are fragile and difficult to handle. However, Cantor and his coworkers several years ago developed techniques of manipulating large pieces of DNA in a solid matrix that supports and protects them. The Columbia researchers also developed a technique called pulsed field electrophoresis that can separate large chunks of DNA.

Such technical advances are critical to the entire human genome project, the NRC committee finds, and many more are needed. "Far too few people are doing this sort of technological research," says Leroy E. Hood, professor of biology at California Institute of Technology. Hood's own research includes development of instrumentation for biotechnology.

Among the projects in Hood's lab is the development of new computer chips in collaboration with TRW. The researchers want to design a whole series of specialized chips that can search for homologies in DNA sequences, extract patterns from sequences, incorporate mapping algorithms for both physical and linkage maps, and meld all the data into a useful data bank.

"If the endeavor is carried out appropriately, it will stimulate young people to take developing technology seriously," Hood says. "Given the right support, technology development will accelerate in the next 10 years."
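The fingerprinting idea described in the physical-mapping discussion, in which a program looks for overlaps between fragments from their restriction-cutting patterns, might look roughly like the following Python sketch. Everything specific here is an assumption made for illustration: each fragment's fingerprint is reduced to a set of hypothetical band sizes, two fragments are called overlapping when they share at least two sizes, and the fragments are assumed to form a single unbranched chain. It is not a description of any program the researchers quoted here actually use.

```python
from itertools import combinations


def overlapping_pairs(fingerprints, min_shared=2):
    """Map each pair of fragments to the number of band sizes they share."""
    pairs = {}
    for a, b in combinations(sorted(fingerprints), 2):
        shared = len(fingerprints[a] & fingerprints[b])
        if shared >= min_shared:
            pairs[(a, b)] = shared
    return pairs


def order_fragments(fingerprints, min_shared=2):
    """Walk from one end of the chain of overlaps to the other."""
    neighbors = {name: set() for name in fingerprints}
    for a, b in overlapping_pairs(fingerprints, min_shared):
        neighbors[a].add(b)
        neighbors[b].add(a)
    # A chain end is a fragment that overlaps exactly one other fragment.
    ends = sorted(name for name, adj in neighbors.items() if len(adj) == 1)
    if not ends:
        return []
    order = [ends[0]]
    current = ends[0]
    while True:
        candidates = sorted(n for n in neighbors[current] if n not in order)
        if not candidates:
            return order
        current = candidates[0]
        order.append(current)


# Hypothetical fingerprints: band sizes (in base pairs) per cloned fragment,
# listed in scrambled order.
fingerprints = {
    "clone3": {4800, 950, 7300, 310},
    "clone1": {1200, 3400, 560, 2100},
    "clone4": {7300, 310, 1500, 8800},
    "clone2": {2100, 560, 4800, 950},
}

pairs = overlapping_pairs(fingerprints)
order = order_fragments(fingerprints)
print(pairs)
print(order)
```

The sketch recovers the chain clone1, clone2, clone3, clone4 from the shared band sizes alone. It also hints at why larger pieces help, as the article notes: fewer fragments mean fewer pairwise comparisons and fewer chances for a spurious match.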
Academy of Engineering elects new members

The National Academy of Engineering has elected 85 engineers to academy membership and seven as foreign associates, including one who was honored posthumously. This brings total U.S. membership to 1417, and the number of foreign associates to 118.

Academy membership honors those who have made "important contributions to engineering theory and practice, including significant contributions to the literature of engineering," or those who have demonstrated "unusual accomplishment in new and developing fields of technology." Among the new members, those whose activities have been in chemically related areas include:

Richard C. Alkire, professor and head of the department of chemical engineering, University of Illinois, Urbana. For imaginative research on engineering aspects of electrodeposition and corrosion and for leadership in electrochemical engineering.

Howard K. Birnbaum, professor of physical metallurgy, University of Illinois, Urbana. For exceptional work on the effect of hydrogen and hydrogen embrittlement on properties of metals.

Kenneth B. Bischoff, Unidel Professor of Biomedical and Chemical Engineering, University of Delaware, Newark. For excellence in research and education in chemical reaction engineering and in biomedical engineering.

Leroy L. Chang, manager, quantum structures, IBM Thomas J. Watson Research Center, Yorktown Heights, N.Y. For pioneering work in superlattice heterostructures.

Praveen Chaudhari, vice president for science, and director for physical sciences, IBM Thomas J. Watson Research Center. For contributions to materials science and engineering and to the advancement of electronic materials.

H. Ted Davis, professor and head of the department of chemical engineering and materials science, University of Minnesota, Minneapolis. For leadership in applying chemical physics and in uniting chemical engineering and materials science teaching and research.

John M. Googin, senior corporate fellow, development division, Martin Marietta Energy Systems, Oak Ridge, Tenn. For outstanding contributions in uranium processing, uranium isotopic enrichment, and lithium processing.

George E. Keller II, corporate research fellow, R&D technical center, Union Carbide, South Charleston, W.Va. For invention and insightful analysis of novel separation processes.

Lester C. Krogh, vice president, R&D, 3M Co., St. Paul. For contributions to the development and application of unique materials and for leadership of innovative research.

Robert M. Nerem, professor and chairman of the department of biomechanical engineering, Georgia Institute of Technology, Atlanta. For biomedical engineering leadership through major contributions to the understanding of dynamics of blood flow and blood vessels in health and disease.

Donald R. Paul, holder of the Melvin H. Gertz Regents Chair in chemical engineering and director of the Center for Polymer Research, University of Texas, Austin. For outstanding research contributions on polymeric materials and for leadership in chemical engineering education.

Robert A. Rapp, professor of metallurgical engineering, Ohio State University, Columbus. For outstanding work on solid-state electrochemistry, corrosion, and oxidation.

David A. Thompson, fellow, IBM Thomas J. Watson Research Center. For pioneering work in magnetics technology for data storage products, including the invention of thin-film and magnetoresistive devices.

Julia R. Weertman, professor and chairman, department of materials science and engineering, Technological Institute, Northwestern University, Evanston, Ill. For exceptional research on failure mechanisms in high-temperature alloys.

Forman A. Williams, Robert H. Goddard Professor, department of mechanical and aerospace engineering, Princeton University. For contributions to the advancement of combustion and flame theory.