DNA Sequencing on a Chip - ACS Publications

Focus

DNA Sequencing on a Chip

Compact arrays of probes may be used for ultrafast DNA sequencing if fabrication and interpretation problems can be solved

Analytical Chemistry, Vol. 67, No. 5, March 1, 1995, 201 A

Many gene researchers would admit that the daunting scope of the Human Genome Project makes it too big to complete in less than a few decades using current technology alone. Faster methods than are currently available will be needed to sequence the billions of base pairs of DNA in the human genome in a reasonable number of years and without exhausting available funds. However, solutions to the project's technical challenges are already beginning to surface. A new sequencing technology in which DNA binds or "hybridizes" to an array of oligonucleotides on a silicon chip appears to be promising as a high-throughput alternative to conventional sequencing and is currently under development at several companies and research consortia in the United States.

In conventional gel-based methods, agents that cleave and label DNA at one of the four nucleotide bases (adenine, thymine, guanine, or cytosine) are added to four aliquots of a DNA sample. In each cycle of these reactions, part of the newly cleaved DNA is dye- or radiolabeled and protected from further degradation. Each aliquot ends up as a mixture of labeled DNA fragments of successive lengths, representing the positions of A, T, G, or C, respectively, in the original strand. The fragments are separated by length in four adjacent lanes of an electrophoretic gel, and the sequence can easily be read up the resulting "ladder" of separated bands that are staggered across the A, T, G, and C lanes. Modern gel-based methods can be used to sequence DNA strands up to 1000 bases long in several hours and to sequence more than one sample at a time, but they are still quite slow and cumbersome.

Hyseq (Sunnyvale, CA) claims its DNA sequencing systems will be able to sequence thousands of base pairs of DNA in minutes per sample on disposable silicon chips an inch or less on a side. The eventual cost of sequencing, according to the company, will be pennies per base pair as opposed to the current gel-sequencing cost of several dollars per base pair. Beckman Instruments (Fullerton, CA) and Affymetrix (Santa Clara, CA), which are also exploring the use of oligonucleotide arrays for de novo gene sequencing, are concentrating on developing dedicated chips using the basic technique for applications such as resequencing for confirmation, monitoring gene expression, and physical gene mapping.
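The ladder-reading step of the conventional gel method described above can be sketched in a few lines. This is only an illustration under idealized assumptions: every fragment length from 1 to n is taken to appear in exactly one of the four lanes, and the function name `read_ladder` and the lane data are hypothetical, not taken from any real instrument.

```python
def read_ladder(lanes):
    """Read a sequence from four gel lanes.

    `lanes` maps a base (A, T, G, C) to the fragment lengths seen in that lane.
    Idealized assumption: each length 1..n shows up in exactly one lane.
    """
    # Collect every band as (fragment length, lane) and sort shortest first,
    # which corresponds to reading up the ladder.
    bands = sorted((length, base) for base, lengths in lanes.items() for length in lengths)
    lengths = [length for length, _ in bands]
    if lengths != list(range(1, len(bands) + 1)):
        raise ValueError("missing or duplicated band; ladder cannot be read")
    return "".join(base for _, base in bands)


# Hypothetical band positions for a six-base fragment.
example_lanes = {"A": [1, 4], "T": [3], "G": [2, 6], "C": [5]}
print(read_ladder(example_lanes))  # AGTACG
```

The sort over fragment lengths is the step that electrophoresis performs physically, which is why the separation time dominates the conventional method.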

Sequencing by hybridization

The use of oligonucleotide hybridization for interrogating DNA sequences has been proposed by a number of groups around the world in the past five years. Former Yugoslavian researchers Radoje Drmanac and Radomir Crkvenjakov were awarded a U.S. patent for "sequencing by hybridization" (SBH) while working at Argonne National Laboratory in the early 1990s and have since moved to Hyseq.

Simple hybridization assays, in which a sample DNA strand binds with a dye- or radio-labeled "probe" strand that has a complementary sequence, are used for positive identification of known genes or mutations such as the one responsible for cystic fibrosis. These genetic tests, although powerful clinical tools, give simple yes/no information about the gene but tell nothing about its specific sequence.

Sequencing chips take these assays a step further. Instead of being hybridized to a single long probe, sample DNA is hybridized to a comprehensive series of short (6-20 bases long and usually all the same length) oligonucleotide strands with all the different combinations of A, T, C, and G. Of necessity, many of these probes have overlapping, or "nested," sequences. By looking at the sequence overlaps of the probes where the sample DNA hybridizes successfully, the sequence of the sample DNA can be reconstructed.

The sequencing chip method is expected to be much faster than conventional sequencing because there's no need for lengthy electrophoretic separation and because the single hybridization cycle takes < 10 min. The sample DNA can be the product of amplification, a recombinant gene extracted from a bacterial or viral culture, or a mixture of digested gene fragments. The sample is passed over the chip, heated quickly to separate it into single strands for hybridization with the single-stranded probes, cooled, and washed.

Immobilizing the probes as a 2D array (Figure 1) allows them to be tracked individually in much greater numbers than would be possible in solution, even if there were thousands of different dye labels for the thousands of probes. Light-based signal transduction can be accomplished either by dye-labeling the DNA fragments before hybridization or by adding a colored or fluorescent reagent afterward that selectively binds the double-stranded probe-DNA hybrids.

Figure 1. Screening for a genetic mutation with a DNA probe array. (a) Probe sequences (red) overlap to match target DNA sequence (blue). (b) Normal or "wild-type" DNA (left) hybridizes only with probes designed to match the normal sequence, but a mutation in one base (right) causes the target DNA to hybridize with a probe for the abnormal sequence. (c) Normal and mutant target DNA bound (blue-filled squares) to probes on array. (Adapted with permission of Houston Advanced Research Center)

Several types of detectors are being tested for the sequencing chips. Affymetrix is working with Hewlett Packard and Molecular Dynamics to develop fluorescence detectors for its chips. Through its nine-member "Genosensor Consortium," which also includes the Houston Advanced Research Center (HARC), Beckman is evaluating a silicon chip that serves both as the array substrate and as the transducer, based on surface changes in electrical permittivity wherever the sample DNA hybridizes to a probe. The consortium is also using disposable glass coverslips as chips that can be placed directly over a 2D charge-coupled device (CCD) for discrete light detection in each pixel. Hyseq, according to company president and CEO Lewis Gruber, is likely to use CCDs as well for its proposed "superchip," an array composed of thousands of probe arrays that supposedly will have an area of 1 in².

Interpretation
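As a concrete picture of what interpreting a hybridization pattern involves, the overlap-based reconstruction of a sample sequence can be sketched as follows. This is a minimal sketch under idealized assumptions (perfect hybridization, no mismatches, and a chip that reports only which probes bound), not any company's algorithm. The function names `spectrum` and `reconstruct` and the example target sequence are hypothetical.

```python
def spectrum(target, k):
    """Set of k-base probes an idealized chip would report for `target`."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def reconstruct(probes, k):
    """Chain probes by their (k-1)-base overlaps; return None when ambiguous."""
    by_prefix = {}
    for probe in probes:
        by_prefix.setdefault(probe[:k - 1], []).append(probe)
    suffixes = {probe[1:] for probe in probes}
    # In the unambiguous case, exactly one probe has a prefix that is not
    # the suffix of some other probe: that probe starts the sequence.
    starts = [p for p in probes if p[:k - 1] not in suffixes]
    if len(starts) != 1:
        return None
    sequence, used = starts[0], {starts[0]}
    while True:
        overlap = sequence[-(k - 1):]
        candidates = [p for p in by_prefix.get(overlap, []) if p not in used]
        if not candidates:
            break
        if len(candidates) > 1:
            return None  # a repeated overlap gives two possible continuations
        sequence += candidates[0][-1]
        used.add(candidates[0])
    # Every probe that hybridized must be accounted for.
    return sequence if len(used) == len(probes) else None


target = "CATGCGATGTT"  # hypothetical sample sequence
print(reconstruct(spectrum(target, 5), 5))  # CATGCGATGTT
print(reconstruct(spectrum(target, 4), 4))  # None
```

In this toy example the 4-base probe set cannot resolve the target because the three-base overlap ATG occurs twice, while the 5-base set reconstructs it without ambiguity.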

Signal crosstalk caused by stray light between neighboring points of the array is not a significant problem at current probe densities, say researchers at the three companies. In addition, the individual points are separated by hydrophobic dividers or given enough space between them to keep any moderately long DNA fragment from binding to more than one array point.

However, SBH does suffer from several forms of interpretive error, says Ken Beattie of HARC. Double occurrences of a given short sequence are always possible in a long DNA fragment, although the likelihood of confusion from these duplicate regions decreases with the use of longer oligonucleotide probe sequences.

More pernicious are the problems with probe-DNA mismatches. In general, match accuracy can be enhanced by using probes of a different length and adjusting hybridization and washing conditions for greater stringency. But because A and T bind each other more weakly than G and C during hybridization, a false match in a GC-rich region of the DNA will often give a stronger signal than a true AT match. One way to get around these problems is to know where the GC- or AT-rich regions are and run them separately under appropriately tailored conditions. For an unknown sequence, of course, this solution isn't possible, so researchers in the field are trying to develop secondary hybridization schemes and computer sequence interpretation algorithms that can compensate for these problems. Gruber says that Hyseq's chemistry, which uses both chip-bound and free probes for hybridization to the sample DNA fragments, eliminates the chance of a mismatch.

The hybridization strategy can also be used in custom arrays with specifically tailored probes for confirmation or determination of mutations in a known sequence. As Robert Lipshutz of Affymetrix points out, sequencing chip technology is much more reliable and easier to use for these smaller scale "resequencing" assays than it is for initial sequencing of an unknown. If you know the sequence of a "wild type" or healthy gene and know where the point mutation is, he says, you can use a much smaller set of oligonucleotide probes to bracket the region of interest and either screen DNA samples for the defective gene or identify the base substitution responsible for the mutation. Because the chips can accommodate such a large number of probes, Lipshutz says, thousands of genetic point mutations could be screened for on the same chip. Affymetrix and Beckman are pursuing these more limited applications of sequencing chips as their first commercial products while they troubleshoot formats for full sequencing.

Synthesizer madness

Although the basic idea behind the DNA sequencing chips seems simple at first glance, a comprehensive set of 8-base oligonucleotides contains > 65,000 probes. For 10-base probes, the number rises to more than a million. Putting SBH to work requires a way to synthesize the full probe set, apply it to the silicon substrate with no mistakes, and detect all of the individual DNA-probe hybrids on the array without false matches.

Beattie says another member of the Genosensor Consortium, Genosys Biotechnologies (The Woodlands, TX), has succeeded in using a "segmented synthesis" method his group developed to produce several hundred short probes simultaneously. "Basically, you have a stack of synthesis wafers [substrates] that you process simultaneously and shuffle them around between base addition cycles to vary the sequences being formed on them," he says. With cycle times under 10 min, he estimates that 200-300 10-base probes can be synthesized this way in a few hours. Although that still means months of synthesis time for a full probe set, Beattie still considers his method cost effective. "In the development stage," he says, "the cost is quite high for probe synthesis; it probably costs about $1 million to synthesize, perform quality control, and provide storage and access for all of the octamer probes. However, one synthesis is enough to produce up to millions of chips."

Spotting the problem

Laying out a 2D array of probes, as Drmanac and Crkvenjakov did for their prototype assays at Argonne, may be straightforward on a series of standard microwell plates. But at only 96 probes per plate, the total package is too large to produce commercially. Despite the "chip wars" reports in the public media claiming that the companies can deposit millions of probes on a 1-cm² chip, current packing densities are at best thousands per chip. Although Lipshutz says Affymetrix has demonstrated placement of all 65,000+ 8-base probes on a 1.28 cm × 1.28 cm chip, the other two companies can currently immobilize up to ~4000 per chip.

At this point, the three companies are exploring different fabrication technologies that can place the probes in precisely the right location without chemical degradation or contamination. Beattie says the Genosensor Consortium is using a dispenser, constructed by member company MicroFab Technologies (Plano, TX), that works along the lines of an ordinary inkjet printer. The dispenser can spray individual dots of the probe solutions 50-100 μm in diameter onto the substrate, but Beattie adds, "We're not to the point of making thousands of points in an array." HARC is independently exploring the use of a proprietary probe implanter, under development at Accelerator Technologies (College Station, TX), that has a 2D array of piezoelectric tips for depositing a large number of probe solutions on a chip simultaneously.
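The probe-set sizes and packing density quoted so far can be checked with a little arithmetic. The sketch below assumes a square grid of equal-sized features covering the whole chip, which is an assumption made here for illustration; the actual layouts are not described. The function names `probe_set_size` and `feature_pitch_um` are hypothetical.

```python
import math

BASES = 4  # A, T, C, and G


def probe_set_size(length):
    """Number of distinct probes of a given length: one choice of base per position."""
    return BASES ** length


def feature_pitch_um(n_probes, chip_side_cm):
    """Center-to-center spacing in micrometers, assuming a square grid over the whole chip."""
    per_side = math.isqrt(n_probes)
    if per_side * per_side < n_probes:
        per_side += 1  # round up so every probe gets a grid position
    return chip_side_cm * 10_000 / per_side  # 1 cm = 10,000 micrometers


print(probe_set_size(8))   # 65536
print(probe_set_size(10))  # 1048576
print(round(feature_pitch_um(probe_set_size(8), 1.28), 1))  # 50.0
```

Under these assumptions the full 8-base set forms a 256 × 256 grid, and fitting it on a 1.28-cm chip implies a spacing of about 50 μm, the lower end of the 50-100 μm dot size cited for the inkjet-style dispenser.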


Gruber says Hyseq will use robotics to spot individual probe solutions on the chips. But Hyseq has another way of multiplying the number of probe sequences. After sample DNA is added to a chip covered with a series of probe arrays, labeled secondary probe solutions and a ligating agent are added to each chip for hybridization. If the fixed probe on the sequencing chip and the one in solution match adjacent sequences on the sample DNA, the two probes are chemically ligated, and both will remain on the sequencing chip after washing. Gruber says this sandwich format strengthens the accuracy of the match and eliminates sequencing errors. It could square the effective number and double the effective length of probes in the set without having to develop high-density probe deposition methods or synthesize the total set of probes.

On-chip synthesis

Both Beckman and Hyseq are trying to solve the problem of attaching the probes to the chips after synthesis. Affymetrix, on the other hand, synthesizes all the different probes directly on the chips by using photolithography. The chips are covered with a series of masks that permit light inactivation of the photolabile protective endgroups on selected probes, which then receive the next A, T, C, or G. Only 60 masks are needed to generate any given set of 15-base probes in 10 h directly on a silicon wafer, and only 32 would be needed to make the 8-base set. The prototype system deposits 16,000 probes on a chip with ~100-μm resolution, Lipshutz says, but the company is trying to bring the resolution down to ~10 μm and pack the probes in closer. Beckman is also exploring on-chip synthesis using physical masks.

Process control is easier with the mask system than for postsynthesis deposition, says Lipshutz. Borrowing from the semiconductor industry wafer process, the prototype mask system is designed to make 16 chips at a time. One chip on the wafer can be tested with a known DNA sequence to determine wafer quality.

To market

Lipshutz and Beckman vice president James Osborne predict that their companies' first commercial offerings are likely to be limited-scale sequencing chip formats for working with known genes and mutations. Clinical sequencing chips will have to undergo the medical device premarket approval process at the U.S. Food and Drug Administration, which will take some time, but for research use, these sequencing chips could be on the market within two to three years. Full sequencing formats still need more work but could be available within five to seven years.

The fact that Hyseq holds the U.S. patent to SBH itself and Beckman holds a broad patent on the use of oligonucleotide arrays could spell legal disputes for the three companies when these products go on the market, depending on how broadly "sequencing by hybridization" can be defined under Hyseq's patent. However, the three companies all have their own fabrication methods and in some ways have complementary strengths. And because SBH sequencing chips have yet to become foolproof, the companies may not turn the overlap in base methodology into a battleground.

At this point, there is still a sense of possible cooperation among the companies, and the community of SBH researchers is growing. "Every two years since 1991, there's been an international workshop on SBH with increasing numbers of attendees," Beattie says. He adds that sequencing chip technology should be applicable in agriculture and environmental applications as well as in clinical genetic assays. Other variations, such as antibody arrays, reversed schemes with the DNA fragments immobilized on the array, and flow-through capillary formats made of porous silicon, may also be on the way. The number of potential applications and the value of high-throughput DNA sequencing make it important that sequencing chip technology not be shut down or tied up in court before anyone gets a chance to use it, he says. "Our hope is to try to get as many groups together as possible."

Deborah Noble