news of the week
CLOSING IN ON THE HUMAN GENOME

Celera Genomics completes rough draft of human gene sequence

At a news conference at its headquarters in Rockville, Md., last week, Celera Genomics announced that it has completed a draft sequence covering almost the entire human genome. The sequencing effort, launched just last September, used a shotgun approach to analyze millions of random DNA fragments generated by shattering the genome's complement of chromosomes.

"Our statistical analysis and various methods of examining the genome indicate that we now have over 97% of all human genes in our database," observed J. Craig Venter, the company's president and chief scientific officer. Stock in the firm, which is owned by PE Corp., shot up 29% on the day of the announcement to close at $241 per share. It has since eased back somewhat, but the price has tripled since the end of November because investors hope that genome data will lead to new drugs and disease treatments.

Celera's shotgun strategy has produced 10 million "very high quality sequences" that account for about 81% of the 3.18 billion base pairs in the human genome, Venter says. By incorporating sequence information available from the publicly funded Human Genome Project (HGP), the company was able to raise that figure to 90% of the complete genome.

"The mathematics that Craig is using, as I see it, are probably correct," says Bruce A. Roe, a professor of chemistry and biochemistry at the University of Oklahoma, Norman. But whether Celera's claim to have a rough draft "is 'the truth' is anybody's conjecture," he contends, because the data are not publicly available. Roe is part of the international HGP team that announced the
complete sequence of chromosome 22 last month (C&EN, Dec. 6, 1999, page 9).

Results from the 10-year-old HGP are freely accessible through public databases such as GenBank (http://www.ncbi.nlm.gov/genome/seq), which is updated daily. Access to Celera's database, which is updated roughly every two weeks, requires a subscription that costs $5 million per year for five years. So far, Celera's only subscribers are from the pharmaceutical industry, which, Venter noted, uses the database extensively. This year, the company plans to open its database, at a different charge, to subscribers in academia and research institutions. Celera is applying for provisional patents on some gene sequences, a thorny issue for some researchers.

Celera still must assemble the genome; that is, it must put all the individual random sequences into correct order. Some researchers are skeptical that the undertaking can succeed.

"It's important to have an area of a chromosome sequenced several times, not only to make sure you're accurate, but also for assembling the sequence," emphasizes Cathy Yarbrough, director of communications and public liaison for the National Institutes of Health's National Human Genome Research Institute, which funds three of HGP's five major sequencing efforts. "The working draft of
the Human Genome Project requires four to five times' coverage," she says.

Currently, Celera averages less than two times' coverage of the genome. Statistically, that's "way too little to do a decent assembly," says Ian Dunham, a senior research fellow at the Sanger Centre, Cambridge, England, and another member of the chromosome 22 team. "You'll have some pieces with two times' coverage and some with 10 times' coverage, and you'll clearly have some that have no coverage."

But Venter believes sequencing algorithms developed at Celera combined with further shotgun sequencing and HGP data will provide adequate coverage for assembling the genome by summer.

However, Randall W. Scott, president and chief scientific officer of Incyte Pharmaceuticals, a genomics firm based in Palo Alto, Calif., insists that "even the very best computer algorithms vary in predicting sequences." Last week, Incyte announced that an estimated 95% of genes in the human genome are represented in that company's database.

"Incyte [too] wants to sell you its database for hundreds of thousands of dollars," notes Roe, who is concerned that there's so much hype about the human genome that "people are spending lots of money to get access" to commercial databases, which could increase the price of drugs.

The HGP will produce a working draft of the human genome by spring, Yarbrough says, and the genome's sequence will be assembled by 2003. As with chromosome 22, the HGP approach uses a clone-by-clone sequencing strategy that enables researchers to construct a "physical map" of the genome.

"Both the working draft of the genome and the final version will be announced through a scientific paper," as was the case for chromosome 22, Yarbrough points out. The sequences for chromosomes 7, 20, and 21 are almost ready, she says.
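The coverage argument quoted above from Yarbrough and Dunham can be illustrated with a rough calculation. The sketch below assumes an idealized model in which shotgun fragments land uniformly at random on the genome, so the number of times a given base is sequenced follows a Poisson distribution whose mean is the average coverage. That model, and the function names `fraction_at_depth` and `fraction_uncovered`, are assumptions made here for illustration; the article does not say which statistics Celera or the HGP teams actually use.

```python
import math


def fraction_at_depth(mean_coverage: float, depth: int) -> float:
    """Expected fraction of bases sequenced exactly `depth` times.

    Assumes an idealized Poisson model of random shotgun sampling
    (an illustrative assumption, not a method described in the article).
    """
    return math.exp(-mean_coverage) * mean_coverage ** depth / math.factorial(depth)


def fraction_uncovered(mean_coverage: float) -> float:
    """Expected fraction of bases never sequenced (depth 0) under the same model."""
    return fraction_at_depth(mean_coverage, 0)


if __name__ == "__main__":
    # Compare twofold mean coverage with the four- to fivefold figure
    # Yarbrough cites for the HGP working draft.
    for mean_coverage in (2.0, 4.0, 5.0):
        missed = fraction_uncovered(mean_coverage)
        print(f"mean coverage {mean_coverage:.0f}x: about {missed:.1%} of bases unsampled")
```

Under this simplified model, roughly 13.5% of bases would go unsampled at twofold mean coverage, compared with under 1% at fivefold, which is consistent with Dunham's remark that some regions will have no coverage at all. Real genomes contain repeats and cloning biases that the model ignores, so the numbers are indicative only.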
Once it has assembled the complete human genome sequence and has finished analyzing the data, Celera says, it will publish a scientific study and make the basic consensus sequence freely available.

"But nothing will be absolutely finished probably anytime this century in terms of a complete understanding of this information," Venter says. "Over half the genes being discovered are new to scientists. It's going to take decades and decades and decades of research to understand this vast quantity of information."

Mairin Brennan and Pamela Zurer

JANUARY 17, 2000 C&EN