Ultra-High-Speed DNA Sequencing Using Capillary Electrophoresis

Anal. Chem. 1995, 67, 3676-3680

Ultra-High-Speed DNA Sequencing Using Capillary Electrophoresis Chips

Adam T. Woolley and Richard A. Mathies*

Department of Chemistry, University of California, Berkeley, California 94720

DNA sequencing has been performed on microfabricated capillary electrophoresis chips. DNA separations were achieved in 50 × 8 µm cross-section channels microfabricated in a 2 in. × 3 in. glass sandwich structure using a denaturing 9% T, 0% C polyacrylamide sieving medium. DNA sequencing fragment ladders were produced and fluorescently labeled using the recently developed energy transfer dye-labeled primers. Sequencing extension fragments were separated to ~433 bases in only 10 min using a one-color detection system and an effective separation distance of only 3.5 cm. Using a four-color labeling and detection format, DNA sequencing with 97% accuracy and single-base resolution to ~150 bases was achieved in only 540 s. A resolution of greater than 0.5 was obtained out to 200 bases for both the one- and four-color separations. The prospects for enhancing the resolution and sensitivity of these chip separations are discussed. This work establishes the feasibility of high-speed, high-throughput DNA sequencing using capillary array electrophoresis chips.

Capillary electrophoresis (CE) is a powerful technique for DNA analysis, which has been applied to restriction fragment sizing, PCR product analysis, forensic identification, and DNA sequencing (ref 1). CE separations are much faster than those in slab gels because higher electric fields can be applied; however, conventional CE has the disadvantage that it allows the analysis of only one sample or lane at a time. Our group has addressed this issue by developing capillary array electrophoresis (CAE; ref 2), in which separations are performed in a bundle of parallel silica capillaries, and we have demonstrated the use of CAE for DNA restriction fragment sizing (refs 3-5) and short tandem repeat analysis (ref 6). In CAE, the lane width is reduced from the ~1 mm typical for slab gels to ~100 µm.
Further miniaturization of electrophoretic separations to increase the number of lanes, as well as the speed and throughput of the separations, will be necessary to meet the needs of the Human Genome Project (refs 7, 8). Therefore, we have recently focused on the use of microfabrication techniques to produce miniaturized capillary arrays.

Dedicated to our friend and colleague Dr. Huiping Zhu (1959-1995).

(1) Monnig, C. A.; Kennedy, R. T. Anal. Chem. 1994, 66, 280R-314R.
(2) Mathies, R. A.; Huang, X. C. Nature 1992, 359, 167-169.
(3) Huang, X. C.; Quesada, M. A.; Mathies, R. A. Anal. Chem. 1992, 64, 967-972.
(4) Huang, X. C.; Quesada, M. A.; Mathies, R. A. Anal. Chem. 1992, 64, 2149-2154.
(5) Clark, S. M.; Mathies, R. A. Anal. Biochem. 1993, 215, 163-170.
(6) Wang, Y.; Ju, J.; Carpenter, B. A.; Atherton, J. M.; Sensabaugh, G. F.; Mathies, R. A. Anal. Chem. 1995, 67, 1197-1203.
(7) Hunkapiller, T.; Kaiser, R. J.; Koop, B. F.; Hood, L. Science 1991, 254, 59-67.
(8) Smith, L. M. Science 1993, 262, 530-532.

Analytical Chemistry, Vol. 67, No. 20, October 15, 1995

The use of microfabrication to produce electrophoretic separation capillaries was first introduced in 1992 by Manz and Harrison (ref 9). Subsequently, capillary zone electrophoresis separations of fluorescent dyes (ref 10) and of fluorescently labeled amino acids (refs 11, 12) were performed in individual microfabricated capillaries on glass chips. Our own efforts have been directed toward producing microfabricated capillary arrays on chips and using them for high-resolution separations of DNA restriction fragments and PCR products (ref 13). Separations of small, fluorescently labeled phosphorothioate oligonucleotides have also recently been performed on a microfabricated chip (ref 14). These results suggest that if the sensitivity and resolution were enhanced, it might even be possible to perform DNA sequencing on chips.

To enhance the sensitivity of these separations, we have exploited the recently developed energy transfer dye-labeled sequencing primers (refs 15, 16) and employed confocal fluorescence detection. In addition, techniques for filling microfabricated channels with denaturing polyacrylamide matrices, loading DNA sequencing samples on a chip, injecting the samples, and performing sequencing separations have been developed. These studies show that we can achieve single-base resolution of DNA sequencing samples to ~200 bases on chips in only 10 min separations. The prospects for increasing the read lengths on CE chips to ~500 bases are also discussed. The demonstration of high-speed DNA sequencing on CE chips is the first step toward miniaturizing the entire DNA sequencing procedure on microfabricated devices.

EXPERIMENTAL SECTION

Sequencing Sample Preparation. DNA sequencing samples were generated using standard dideoxy sequencing chemistry and energy transfer dye-labeled primers (refs 15, 16); the other reagents used to prepare the sequencing samples were obtained from Amersham Life Science (Cleveland, OH). The DNA sequencing fragments used in the one-color separations were made using 2.4 pmol of the fluorescently labeled F10F primer (see ref 15 for nomenclature) and 10 µg of single-stranded M13mp18 DNA template, and they were terminated using ddATP. The F10F primers are labeled with carboxyfluorescein (FAM) at the 5' end, with a second FAM attached to the tenth nucleotide from the 5' end of the primer on a modified T residue. The samples used in the four-color DNA sequencing experiments were generated with 4.8 pmol of fluorescently labeled primer and 4.8 µg of single-stranded M13mp18 DNA template for each of the four reaction mixtures. The primers used for the four-color reactions were F10F, F10J, F10T, and F10R. These primers are labeled at the 5' end as described above and have either FAM, 2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein (JOE), N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA), or 6-carboxy-X-rhodamine (ROX) attached to the tenth nucleotide from the 5' end of the primer on a modified T residue. For the four-color experiments, the mixtures were pooled, and then for both the one- and four-color experiments, the DNA fragments were precipitated, washed with ethanol, and resuspended in 2 µL of 95% formamide/2.5 mM EDTA. The samples were denatured at 90 °C for 2 min and immediately placed on ice prior to injection (ref 16).

(9) Manz, A.; Harrison, D. J.; Verpoorte, E. M. J.; Fettinger, J. C.; Paulus, A.; Ludi, H.; Widmer, H. M. J. Chromatogr. 1992, 593, 253-258.
(10) Jacobson, S. C.; Hergenroeder, R.; Koutny, L. B.; Warmack, R. J.; Ramsey, J. M. Anal. Chem. 1994, 66, 1107-1113.
(11) Harrison, D. J.; Fluri, K.; Seiler, K.; Fan, Z.; Effenhauser, C. S.; Manz, A. Science 1993, 261, 895-897.
(12) Effenhauser, C. S.; Manz, A.; Widmer, H. M. Anal. Chem. 1993, 65, 2637-2642.
(13) Woolley, A. T.; Mathies, R. A. Proc. Natl. Acad. Sci. U.S.A. 1994, 91, 11348-11352.
(14) Effenhauser, C. S.; Paulus, A.; Manz, A.; Widmer, H. M. Anal. Chem. 1994, 66, 2949-2953.
(15) Ju, J.; Ruan, C.; Fuller, C. W.; Glazer, A. N.; Mathies, R. A. Proc. Natl. Acad. Sci. U.S.A. 1995, 92, 4347-4351.
(16) Ju, J.; Kheterpal, I.; Scherer, J. R.; Ruan, C.; Fuller, C. W.; Glazer, A. N.; Mathies, R. A. Anal. Biochem., in press.

0003-2700/95/0367-3676$9.00/0 © 1995 American Chemical Society

Figure 1. Schematic diagram of the electrophoresis chip indicating the injection procedure. The injection channel connects reservoirs 1 and 3, and the separation channel connects reservoirs 2 and 4. In the inject mode, a field is applied between reservoirs 1 and 3, causing the DNA to migrate through the gel-filled intersection toward reservoir 1. In the run mode, a field is applied between reservoirs 2 and 4, causing the DNA fragments in the intersection region to migrate toward reservoir 4 through the gel in the separation channel. The actual devices had 15 electrophoresis systems integrated on each chip.

Figure 2. One-color DNA sequencing fragment separation on a CE chip. (top) Electropherogram of M13mp18 A sequencing fragments generated with the primer F10F. (bottom) Expanded view of the peaks corresponding to the first 100 bases, demonstrating single-base resolution. The sample was injected by electrophoresing it through the cross channel at 170 V/cm for 60 s and separated at 250 V/cm in a 9% T, 0% C polyacrylamide-filled channel. Excitation was at 488 nm, and fluorescence from 515 to 545 nm was detected. The effective separation length of the 50 µm wide and 8 µm deep channel was 3.5 cm.

Electrophoresis Procedures. Electrophoresis chips were fabricated and the channel surfaces were derivatized using [γ-(methacryloxy)propyl]trimethoxysilane as described previously. An aqueous solution of acrylamide (9% T, 0% C) in 45 mM Tris/45 mM borate/1 mM EDTA/8.3 M urea (pH 8.3) was filtered with a 0.2-µm pore diameter filter (Millipore, Bedford, MA) and then degassed under vacuum for 1 h. Polymerization was initiated by adding 2.5 µL of 10% ammonium persulfate and 1.5 µL of N,N,N',N'-tetramethylethylenediamine (TEMED) to a 1-mL aliquot of acrylamide solution, which was then drawn into all the channels by placing the solution in reservoir 4 and applying vacuum to the other reservoirs (see Figure 1 for labeling). The acrylamide was allowed to polymerize overnight at 4 °C, and the channels were pre-electrophoresed at 100 V/cm for 15 min prior to use.
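The gel recipe above can be turned into per-aliquot quantities with a few lines of arithmetic. The sketch below is illustrative only and rests on assumptions the text does not state: %T is read as grams of total monomer per 100 mL, %C as the crosslinker's percentage of that monomer, the 10% ammonium persulfate stock as weight/volume, TEMED as added neat, and all volumes as additive. The function names are hypothetical.

```python
def monomer_masses_mg(percent_t, percent_c, volume_ml):
    """Return (acrylamide_mg, crosslinker_mg) for a gel of the given %T and %C.

    Assumes %T = g of total monomer per 100 mL and %C = crosslinker as a
    percentage of total monomer (the usual convention; not stated in the text).
    """
    total_mg = percent_t * 10.0 * volume_ml  # 1% (w/v) corresponds to 10 mg/mL
    crosslinker_mg = total_mg * percent_c / 100.0
    return total_mg - crosslinker_mg, crosslinker_mg


def final_percent(stock_percent, added_ul, total_ul):
    """Final concentration (%) of a stock after dilution into total_ul."""
    return stock_percent * added_ul / total_ul


# Quantities from the text: 1-mL aliquot, 2.5 uL of 10% APS, 1.5 uL of TEMED.
aliquot_ul = 1000.0
aps_ul, temed_ul = 2.5, 1.5
total_ul = aliquot_ul + aps_ul + temed_ul  # assumption: volumes are additive

acrylamide_mg, crosslinker_mg = monomer_masses_mg(9.0, 0.0, 1.0)
aps_final = final_percent(10.0, aps_ul, total_ul)       # % (w/v), assumed
temed_final = final_percent(100.0, temed_ul, total_ul)  # % (v/v), neat TEMED assumed

print(f"acrylamide: {acrylamide_mg:.0f} mg, crosslinker: {crosslinker_mg:.0f} mg")
print(f"APS: {aps_final:.3f}% (w/v), TEMED: {temed_final:.2f}% (v/v)")
```

Under these assumptions a 1-mL aliquot holds 90 mg of acrylamide and no crosslinker, so the 9% T, 0% C matrix is a linear (un-cross-linked) polyacrylamide.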
To perform an injection, reservoir 3 was rinsed with 95% formamide, and then 1.0 µL of sequencing sample was pipetted into the reservoir. The other three reservoirs were filled with 45 mM Tris/45 mM borate/1 mM EDTA (pH 8.3); to establish electrical contacts on the chip, wires were inserted into the cutoff pipet tips which formed the four reservoirs. In the one-color sequencing separation, the sample was "plug" injected (ref 13) by applying 170 V/cm between reservoirs 1 and 3 for 60 s and run by applying 250 V/cm between reservoirs 2 and 4. In the four-color sequencing separation, the sample was injected for 30 s in the same manner and run at 200 V/cm.

Instrumentation. The detection system used for the one-color sequencing separations has been described previously; the instrumentation used for the four-color sequencing separations has also been described. Briefly, the 488-nm line from an argon ion laser was focused within the channel using a 20×, NA 0.5 objective. Fluorescence was collected by the objective and passed through a series of dichroic filters to divide the fluorescent signal into four spectral regions (510-540, 545-570, 570-590, and 590-660 nm). The fluorescent signal in the first three spectral regions was filtered by a band-pass filter; the fourth region was filtered with a long-pass filter; a 100-µm confocal pinhole spatially filtered the fluorescence in all four spectral channels before photomultiplier detection. The analog output from the photomultipliers was filtered with a low-pass filter having a 0.2-s time constant and sampled at 10 Hz with a 16-bit ADC board (NB-MIO-16XL-18, National Instruments, Austin, TX) controlled by a program written in LabVIEW running on a Macintosh IIci. Additionally, a 660-nm short-pass filter was placed before the longest wavelength detector to reduce background fluorescence from the glass. The raw data were Fourier transformed, the high-frequency (>0.5 Hz) noise was removed, and then an inverse Fourier transform was performed. After background subtraction, the smoothed data were transformed using a multicomponent matrix transformation (ref 17).
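The two data-processing steps just described, Fourier low-pass smoothing and the multicomponent matrix transformation, can be sketched as follows. This is a minimal illustration and not the authors' LabVIEW program: a direct O(n²) transform stands in for a fast Fourier transform, background subtraction is omitted, and the calibration matrix `CROSSTALK` contains hypothetical values, since the measured single-fragment signals are not given in the text. Only the 10-Hz sampling rate and 0.5-Hz cutoff are taken from the source.

```python
import cmath
import math

SAMPLE_RATE_HZ = 10.0  # from the text: detector output sampled at 10 Hz
CUTOFF_HZ = 0.5        # from the text: noise above 0.5 Hz is removed


def dft(samples):
    """Direct O(n^2) discrete Fourier transform; adequate for short traces."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Inverse of dft(); returns complex samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def low_pass(samples, sample_rate_hz=SAMPLE_RATE_HZ, cutoff_hz=CUTOFF_HZ):
    """Zero every Fourier component above cutoff_hz, then transform back."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins above n/2 hold negative frequencies, so fold them back.
        frequency_hz = min(k, n - k) * sample_rate_hz / n
        if frequency_hz > cutoff_hz:
            spectrum[k] = 0.0
    return [value.real for value in inverse_dft(spectrum)]


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    rows = [list(map(float, row)) + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(rows[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (rows[r][n] - tail) / rows[r][r]
    return x


# HYPOTHETICAL calibration: column j is the signal that one unit of dye j
# produces in the four spectral channels (rows). Real values would come from
# known single-fragment peaks, which are not given in the text.
CROSSTALK = [
    [1.00, 0.30, 0.05, 0.01],
    [0.40, 1.00, 0.25, 0.05],
    [0.10, 0.35, 1.00, 0.30],
    [0.02, 0.10, 0.45, 1.00],
]


def unmix(channel_signals, crosstalk=CROSSTALK):
    """Convert four channel intensities at one time point to dye amounts."""
    return solve_linear(crosstalk, channel_signals)


# Example: a slow 0.1-Hz signal plus 2-Hz noise, 10 s of data at 10 Hz.
times = [i / SAMPLE_RATE_HZ for i in range(100)]
raw = [math.sin(2 * math.pi * 0.1 * t) + 0.3 * math.sin(2 * math.pi * 2.0 * t)
       for t in times]
smoothed = low_pass(raw)  # the 2-Hz component is removed
```

Solving the linear system at each time point is one way to realize a matrix transformation of this kind; inverting the calibration matrix once and multiplying would be equivalent for a fixed matrix.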
(17) Smith, L. M.; Kaiser, R. J.; Sanders, J. Z.; Hood, L. E. Methods Enzymol. 1987, 155, 260-301.

Figure 3. Analyzed four-color DNA sequencing data from M13mp18 DNA labeled using the F10F, F10J, F10T, and F10R energy transfer primers and separated on a CE chip. The raw fluorescence data were transformed as described in the Experimental Section to present a plot of relative concentration of each of the labeled DNA fragments as a function of time. The bases are identified by the color of the peaks: blue (F10F), C; green (F10J), T; black (F10T), G; red (F10R), A. The three bases that were not called (T at 397 s, T at 441 s, and C at 484 s) and a C incorrectly called at 422 s are indicated by a black asterisk. Weak signal from the C- and T-terminated peaks was most likely the cause of these errors. The sample was injected by electrophoresing it through the cross channel at 170 V/cm for 30 s and separated at 200 V/cm in a 9% T, 0% C polyacrylamide-filled channel. Excitation was at 488 nm, and detection was at 510-540 nm for C, 545-570 nm for T, 570-590 nm for G, and 590-660 nm for A. The effective separation length of the 50 µm wide and 8 µm deep channel was 3.5 cm.

The transformation matrix was formed using the fluorescence
signals in each spectral channel from known single DNA fragment peaks terminated at C, T, G, and A nucleotides. Transforming the data from the four spectral channels yields information about the relative concentrations of the four sets of dye-labeled DNA fragments (ref 16). The plot of these concentration data as a function of time was corrected for the mobility differences of the dye-labeled fragments by adding 2.2 s to the migration times of the fragments labeled with F10T and F10R, yielding the analyzed four-color sequencing profile.

Safety Considerations. Microfabricated CE chips could be mass produced inexpensively, so a prefilled, disposable chip would minimize the user's exposure to hazardous chemicals such as acrylamide, a neurotoxin and carcinogen. Furthermore, the small volumes of solutions required to fill microfabricated channels decrease the quantities of all reagents used (hazardous or not). The short length of these microfabricated channels allows separations at 200 V/cm with a lower applied voltage (1 kV), reducing the hazard of electrical shock.

RESULTS AND DISCUSSION

Figure 2 presents a separation of F10F-labeled, ddATP-terminated DNA sequencing fragments on a CE chip. The fragments were separated in a channel 50 µm wide and 8 µm deep
with a distance from injection to detection of only 3.5 cm. The peaks are visible starting at ~100 s, and all the peaks have been detected by 800 s. The peak corresponding to 433 bases after the primer is detected at 602 s. The lower portion of Figure 2 presents an expanded view of the peaks corresponding to the first 100 bases, separated in ~4 min. Single-base resolution was obtained throughout this region: for example, bases 23 and 24, bases 75 and 76, and bases 82 and 83 are all resolved.

Figure 3 presents a four-color sequencing run using a CE chip. These data have been analyzed by applying a matrix transformation to the fluorescence data to correct for cross talk. Single-base resolution was obtained throughout the region shown: there were four errors in the base calls, or 97% accuracy out to 147 bases, with a separation time of just 9 min. The signals for the C- and T-terminated fragments became too weak to distinguish from the background noise beyond this region, so the sequence could be read only as far as shown. However, the signal strength for the A- and G-terminated fragments was strong enough to distinguish peaks to ~400 bases at a separation time of 15 min.

Figure 4 presents an analysis of the resolution in the four-color sequencing separation. The figure shows a plot of resolution as a function of base number, with a least-squares exponential fit to the data. A resolution of at least 0.5 between adjacent peaks is

Figure 4. Plot of resolution as a function of base number for the four-color sequencing run. The solid line is a least-squares fit of an exponential function to the data ($y = 1.0762\,e^{-0.0038503x}$, $R = 0.94$). Resolution was normalized to single-base spacing when the bases analyzed were not adjacent. Peak parameters for the resolution calculations were determined by fitting a Gaussian function to peaks in the raw data. The resolution of two peaks was calculated by taking the difference of the migration times and dividing by 4 times the mean of the variances of the peaks. The resolution of nonadjacent bases was normalized by dividing the calculated resolution by the base spacing between the fragments.
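The resolution calculation and the exponential fit quoted in the Figure 4 caption can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the Gaussian width parameter is taken to be a standard deviation, which keeps the resolution dimensionless; the caption itself says "variances", so this reading is an assumption. The fit coefficients are the ones given in the caption.

```python
import math

# Fit parameters quoted in the Figure 4 caption: y = FIT_A * exp(-FIT_K * x).
FIT_A = 1.0762
FIT_K = 0.0038503


def peak_resolution(t1, width1, t2, width2, base_spacing=1):
    """Resolution of two Gaussian peaks, normalized to single-base spacing.

    t1 and t2 are migration times; width1 and width2 are the Gaussian width
    parameters, treated here as standard deviations (an assumption).
    """
    mean_width = (width1 + width2) / 2.0
    return abs(t2 - t1) / (4.0 * mean_width) / base_spacing


def fitted_resolution(base_number):
    """Resolution predicted by the exponential fit at a given base number."""
    return FIT_A * math.exp(-FIT_K * base_number)


def base_number_at(resolution):
    """Invert the fit: base number at which the fitted curve equals resolution."""
    return math.log(FIT_A / resolution) / FIT_K


print(round(base_number_at(0.5)))  # 199
```

With the caption's coefficients, the fitted curve crosses a resolution of 0.5 at about base 199, in line with the roughly 200-base limit reported for this run.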

necessary to obtain peak information from the data (ref 18); from the least-squares fit, the resolution in this run was above 0.5 out to 200 bases. Similar results were obtained for the one-color run, with a resolution greater than 0.5 out to 215 bases. The maximum number of theoretical plates for a band was $1.1 \times 10^{6}$, corresponding to $3.1 \times 10^{7}$ plates/m.

We have demonstrated that high-speed DNA sequencing can be performed using microfabricated CE chips. DNA sequencing with 97% accuracy has been demonstrated, and single-base resolution to ≥200 bases has been demonstrated in only 7 min (1700 bases/h per lane), using an effective separation distance of only 3.5 cm. DNA sequencing fragments as long as 433 bases can be detected in ~10 min separations (2400 bases/h per lane). By comparison, DNA sequencing using slab gel electrophoresis yields 500 bases of sequence in ~8-10 h (50-60 bases/h per lane), and sequencing with capillary electrophoresis yields 500 bases of sequence in ~1-2 h (250-500 bases/h per lane). This comparison makes it clear that the use of microfabricated capillaries may lead to a significant improvement in DNA sequencing technology.

The ability to read the sequence past ~200 bases was limited in these experiments by the signal in the first two spectral channels. The raw signal in these channels was a factor of ~2 less than in the other channels because the sensitivity of these two photomultipliers was less than that of the others by a factor of 2-3. The dyes that primarily fluoresce in the wavelength range collected in the first two spectral channels, F10F and F10J, were used to label the C- and T-terminated fragments, hence the lower sensitivity for these fragments. The length of the read for these fragments would have been approximately the same as for the A- and G-terminated fragments (~400 bases) if a 2-fold higher concentration of the C- and T-terminated fragments had been used in this run or if the detection sensitivity were improved. In order to achieve adequate signal, 12 times more primer and DNA template were used for sequencing on the CE chip than for conventional capillaries (ref 16). The reduced sensitivity with chips is due to the smaller injection and detection volumes; for example, the cross-sectional area of a 50 × 8 µm channel is 20 times smaller than that of a 100-µm-i.d. capillary. The detection sensitivity could be improved and the amount of reagents could be reduced through several modifications to the experiments. For example,