
Anal. Chem. 2000, 72, 3129-3137

Optimization of High-Performance DNA Sequencing on Short Microfabricated Electrophoretic Devices

Oscar Salas-Solano,† Dieter Schmalzing,† Lance Koutny,† Scott Buonocore,† Aram Adourian,† Paul Matsudaira,‡ and Dan Ehrlich*,†

Whitehead Institute for Biomedical Research, 9 Cambridge Center, Cambridge, Massachusetts 02142, and Department of Biology and Division of Bioengineering and Environmental Health, Massachusetts Institute of Technology, Cambridge, Massachusetts 02142

We have examined the parametric performance of short microfabricated electrophoresis devices that operate with a replaceable linear poly(acrylamide) (LPA) solution for the application of DNA sequencing. A systematic study is presented of the dependence of selectivity, separation efficiency, and resolution of sequencing fragments on buffer composition, LPA concentration, LPA composition, microdevice temperature, electric field, and device length. A specific optimization is made for DNA sequencing on 11.5-cm devices. Using a separation matrix composed of 3.0% (w/w) 10 MDa plus 1.0% (w/w) 50 kDa LPA, elevated microdevice temperature (50 °C), and 200 V/cm, high-speed DNA sequencing of 580 bases on standard M13mp18 was obtained in only 18 min with a base-calling accuracy of 98.5%. Read lengths of 640 bases at 98.5% accuracy were achieved in ∼30 min by reducing the electric field strength to 125 V/cm. We believe that this constitutes matrix-limited performance for microdevices of this length using LPA sieving matrix and this buffer chemistry. In addition, it was confirmed that shorter devices are rather impractical for production sequencing applications when LPA is used as the sieving matrix.

The demand for DNA sequencing has dramatically increased due to the needs of the Human Genome Project (HGP) and will continue to grow in the foreseeable future.1-3 Capillary electrophoresis (CE) with replaceable polymer solutions will be the dominant technique to meet the near-term HGP goals because of high electrophoresis speed4-8 and, most importantly, 24-h unattended operation through automated polymer replacement and sample injection.9,10 At present, commercial capillary array electrophoresis (CAE) sequencers deliver ∼500-650 bases per capillary in ∼3 h.11 This technology has now been scaled to near its practical limits, and it is widely expected that next-generation instruments will employ similar separation technology but in a planar microfabricated device format. Schmalzing et al. have reported genotyping analysis on microfabricated devices 10-100-fold faster than capillary or slab gel electrophoresis, respectively.12 Moreover, the recent application of microfabrication technologies to develop electrophoretic devices with 96 lanes for ds-DNA separations showed the potential of increasing the throughput of DNA analysis by orders of magnitude13 through the combination of massive parallelism and fast assay time.

DNA sequencing of 200 bases in 10 min on 3.5-cm microfabricated devices using linear poly(acrylamide) (LPA) was first demonstrated by Woolley and Mathies.14 Since then, the performance of these devices in DNA sequencing has improved significantly. Recently, four-color separations of 500 bases in 20 min using standard M13mp18 as template have been reported.15 Our group has demonstrated the application of 11.5-cm microdevices to analyze real-world four-color DNA sequencing samples from human chromosome 17, obtaining average read lengths between 460 and 505 bases in less than 25 min.16

The goal of the present work is to investigate important parameters that affect the performance of microfabricated devices for DNA sequencing using replaceable LPA solutions. For the microdevice, the effect of buffer composition, LPA concentration and composition, temperature, electric field, and microdevice effective length on selectivity, separation efficiency, and resolution of DNA sequencing fragments has been studied. Importantly, the optimal experimental conditions to achieve high-speed and high-performance DNA sequencing analysis, using the material system (matrix and buffer) known to be of highest performance for CE,4 have been established for 11.5-cm microfabricated devices.

* Corresponding author: (e-mail) [email protected]; (phone) (617) 258-7283; (fax) (617) 258-7663.
† Whitehead Institute for Biomedical Research.
‡ Massachusetts Institute of Technology.
(1) Pennisi, E. Science 1999, 283, 1822-1823.
(2) Collins, F.; Patrinos, A.; Jordan, E.; Chakravarti, A.; Gesteland, R.; Walters, L. Science 1998, 282, 682-689.
(3) Marshall, E.; Pennisi, E. Science 1998, 280, 994-995.
(4) Carrilho, E.; Ruiz-Martinez, M. C.; Berka, J.; Smirnov, I.; Goetzinger, W.; Miller, A. W.; Brady, D.; Karger, B. L. Anal. Chem. 1996, 68, 3305-3313.
(5) Kheterpal, I.; Scherer, J. R.; Clark, S. M.; Radhakrishnan, A.; Ju, J. Y.; Ginther, C. L.; Sensabaugh, G. F.; Mathies, R. A. Electrophoresis 1996, 17, 1852-1859.
(6) Fang, Y.; Zhang, J. Z.; Hou, J. Y.; Lu, H.; Dovichi, N. J. Electrophoresis 1996, 17, 1436-1442.
(7) Salas-Solano, O.; Carrilho, E.; Kotler, L.; Miller, A.; Goetzinger, W.; Sosic, Z.; Karger, B. Anal. Chem. 1998, 70, 3996-4003.
(8) Madabhushi, R. S. Electrophoresis 1998, 19, 224-230.
(9) Venter, J. C.; Adams, M. D.; Sutton, G. G.; Kerlavage, A. R.; Smith, H. O.; Hunkapiller, M. Science 1998, 280, 1540-1542.
(10) Kheterpal, I.; Mathies, R. Anal. Chem. 1999, 71, 31A-37A.
(11) Mullikin, J. C.; McMurray, A. A. Science 1999, 283, 1867-1868.
(12) Schmalzing, D.; Koutny, L.; Adourian, A.; Belgrader, P.; Matsudaira, P.; Ehrlich, D. Proc. Natl. Acad. Sci. U.S.A. 1997, 94, 10273-8.
(13) Shi, Y.; Simpson, P. C.; Scherer, J. R.; Wexler, D.; Skibola, C.; Smith, M. T.; Mathies, R. A. Anal. Chem. 1999, 71, 5354-5361.
(14) Woolley, A.; Mathies, R. Anal. Chem. 1995, 67, 3676-3680.
(15) Liu, S. R.; Shi, Y. N.; Ja, W. W.; Mathies, R. A. Anal. Chem. 1999, 71, 566-573.
(16) Schmalzing, D.; Tsao, N.; Lance, K.; Chisholm, D.; Srivastava, A.; Adourian, A.; Linton, L.; McEwan, P.; Matsudaira, P.; Ehrlich, D. Genome Res. 1999, 9, 853-858.

10.1021/ac000055j CCC: $19.00 Published on Web 05/27/2000

© 2000 American Chemical Society

Analytical Chemistry, Vol. 72, No. 14, July 15, 2000 3129

EXPERIMENTAL SECTION

Instrumentation. The design of the microdevice apparatus and the laser-induced fluorescence (LIF) detection system has been described previously.16

Micromachining. Electrophoretic microdevices were fabricated from 150-mm-diameter fused-silica wafers (Hoya, Tokyo, Japan) using techniques described previously.17 The device structure was a simple cross. The separation channel was 11.5 cm in length, and the three side channels had a length of 0.5 cm. The structure was 40 µm deep and 90 µm wide at the top. The side channels forming the injector were offset lengthwise by 150 µm, forming a double tee. Glass reservoirs (Ace Glass, Vineland, NJ) of 50-µL volume were affixed around the channel access holes to hold sample and buffer using optical adhesive according to the manufacturer’s specification (Nordland Products, New Brunswick, NJ). Channel surfaces were coated with LPA using a modified Hjerten procedure.18

LPA Separation Matrixes. High-viscosity-average molecular mass LPA (∼10 MDa) powder was synthesized in-house on the basis of the procedure described by Goetzinger et al.19 Low-viscosity-average molecular mass LPA (50 kDa) was prepared using the procedure described by Salas-Solano et al.7 The viscosity-average molecular mass of the prepared LPA was determined by intrinsic viscosity measurements of the polymer at 25 °C using the Mark-Houwink equation.20,21 LPA solutions were prepared with 1× TBE (90 mM Tris/64.6 mM boric acid/2.5 mM EDTA) with 3.5 M urea/30% (v/v) formamide or with 1× TTE (50 mM Tris/50 mM TAPS/2 mM EDTA) with 7 M urea.
The solutions were ready for use after 3 days of slow stirring in a glass jar.

Microdevice Electrophoresis. Between runs, the separation and cross channels of the microfabricated device were simultaneously refilled with fresh LPA separation matrix from the anodic end of the separation channel using a gastight syringe attached to a mechanical fixture. The electrophoresis buffer, composed of 1× TBE or 1× TTE, was also changed after each run. The devices were pre-electrophoresed for 5 min at 200 V/cm through the separation channel, with the buffer and sample vials floating. The DNA sequencing samples were loaded by applying 340 V/cm across the loading channel, with the anodic and cathodic buffer reservoirs floating. Leakage of excess sample from the loading channel into the separation channel during the run was prevented with a small electric field (∼20 V/cm) applied to both halves of the loading channel.

(17) Koutny, L. B.; Schmalzing, D.; Taylor, T. A.; Fuchs, M. Anal. Chem. 1996, 68, 18-22.
(18) Hjerten, S. J. Chromatogr. 1985, 347, 191-198.
(19) Goetzinger, W.; Kotler, L.; Carrilho, E.; Ruiz-Martinez, M. C.; Salas-Solano, O.; Karger, B. L. Electrophoresis 1998, 19, 242-248.
(20) Baade, W.; Reichert, K. Eur. Polym. J. 1984, 20, 505-512.
(21) Munk, P.; Aminabhavi, T.; Williams, P.; Hoffman, D.; Chmelir, M. Macromolecules 1980, 13, 871-877.
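The Mark-Houwink relation mentioned for the LPA molecular mass determination has the general form [η] = K·M^a. The sketch below simply inverts it for M. The function name, the argument names, and the numerical constants in the usage lines are illustrative assumptions: the paper does not state the K and a values it used, so they must be taken from the cited literature for the relevant solvent and temperature.

```python
def viscosity_average_mass(intrinsic_viscosity, K, a):
    """Invert the Mark-Houwink equation [eta] = K * M**a for M.

    intrinsic_viscosity and K must use the same viscosity units;
    K and a are polymer/solvent/temperature specific and are NOT
    given in the paper (the caller has to supply them).
    """
    return (intrinsic_viscosity / K) ** (1.0 / a)


# Placeholder constants, for illustration only.
K_demo, a_demo = 0.01, 0.8
eta_demo = K_demo * (1.0e7) ** a_demo   # intrinsic viscosity of a 10 MDa chain under these placeholders
print(viscosity_average_mass(eta_demo, K_demo, a_demo))
```

Because a is typically below 1, a given relative error in the measured intrinsic viscosity is amplified by a factor of 1/a in the estimated molecular mass.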


DNA Sequencing Chemistries. Most DNA sequencing reactions were conducted using standard cycle sequencing chemistry with AmpliTaq-FS and BigDye-labeled (-21) M13 universal primers (Applied Biosystems/Perkin-Elmer Corp., Foster City, CA) and M13mp18 as template (New England Biolabs, Beverly, MA). Other sequencing reactions were performed using the Thermosequenase cycle sequencing kit with 7-deaza-dGTP and DYEnamic energy-transfer (ET) M13 (-21) primers (Amersham Life Science, Cleveland, OH). A total of 0.4 µg of template DNA was used per sample (0.1 µg of M13mp18 per single-color reaction). Cycle sequencing was performed on a Genius thermocycler (Techne, Duxford, Cambridge, U.K.) and consisted of 15 cycles of 10 s at 95 °C, 5 s at 50 °C, and 1 min at 70 °C, followed by 15 cycles of 10 s at 95 °C and 1 min at 70 °C.

Purification of Sequencing Samples. The DNA sequencing samples were desalted using Centri-Sep spin columns (Princeton Separations, Adelphia, NJ) or ethanol precipitation according to the standard ABI procedures.22 The spin columns were hydrated for at least 30 min by adding 800 µL of deionized water. The interstitial volume was excluded by spinning the columns for 3 min at 3000 rpm. The sequencing sample, diluted in 40 µL with deionized water, was then placed on the column and spun for 3 min at 3000 rpm. The resulting volume of the sample was diluted to 50 µL with deionized water. A 10-µL aliquot was then pipetted onto the electrophoretic device.

Chemicals. Acrylamide, N,N,N′,N′-tetramethylethylenediamine, ammonium persulfate, and urea were purchased from ICN Biomedicals, Inc. (Aurora, OH). TRIS, TAPS, and EDTA were from Sigma (St. Louis, MO). All chemicals were either electrophoresis or analytical grade, and no further purification was performed. Span 80 emulsifier and petroleum special with a boiling range from 180 to 220 °C were purchased from Fluka Chemicals (Buchs, Switzerland). Water used in all reactions and solutions was deionized to 18.2-MΩ grade with a Milli-Q purification system (Millipore, Worcester, MA).

Data Analysis. The C-traces of four-color DNA sequencing reactions using (-21M13) forward BigDye primers were used for data analysis. The C-traces were selected since they do not contain cross-talk and single isolated peaks can easily be found over the entire range of fragment sizes. The selected peaks were bp 78, 154, 212, 289, 337, 417, 468, 535, 609, 683, 753, and 832. The raw data were used for calculations without any prior software treatment (e.g., smoothing, artificial peak narrowing, etc.).

Selectivity Measurements. From the resulting electropherograms, the migration time of the sequencing fragments was plotted versus base number and curve-fitted with a third-order polynomial, using Microcal Origin 6.0 software (Microcal Software Inc., Northampton, MA). The fitted values were used to calculate selectivity, defined as

$$\left|\frac{\Delta\mu}{\mu_{av}}\right| = 2\left|\frac{t_{m1} - t_{m2}}{t_{m2} + t_{m1}}\right| \qquad (1)$$

where ∆µ is the mobility difference of two adjacent fragments, µav is the mobility average of these fragments, and tm1 and tm2 stand for their migration times.

(22) Applied Biosystems, Perkin-Elmer. Protocol P/N 402113, 1998.
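The selectivity calculation described above (third-order polynomial fit of migration time versus base number, then eq 1 on the fitted values) can be sketched as follows. This is a minimal illustration using NumPy rather than the Origin software used by the authors; the function names and the migration times are hypothetical, and only the list of base numbers comes from the text.

```python
import numpy as np


def fit_migration_times(bases, times, order=3):
    """Fit migration time vs. base number with a polynomial (third order, as in the text)."""
    return np.poly1d(np.polyfit(bases, times, order))


def selectivity(fit, base):
    """Eq 1 for the adjacent fragments `base` and `base + 1`, using fitted migration times."""
    t1, t2 = fit(base), fit(base + 1)
    return 2.0 * abs(t1 - t2) / (t1 + t2)


# Base numbers of the isolated C-trace peaks listed under Data Analysis.
bases = np.array([78, 154, 212, 289, 337, 417, 468, 535, 609, 683, 753, 832])
# Hypothetical migration times in minutes (illustrative values only, not measured data).
times = 8.0 + 0.02 * bases - 5.0e-6 * bases**2

fit = fit_migration_times(bases, times)
for b in (100, 300, 600):
    print(b, selectivity(fit, b))
```

Working from the fitted curve rather than from raw peak positions gives a selectivity value for every base number, even where no isolated peak is available in the electropherogram.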

Figure 1. Effect of the buffer composition on DNA sequencing analysis on a microfabricated electrophoretic device. (A) 1× TBE/3.5 M urea and 30% (v/v) formamide, (B) 1× TTE/7 M urea. Samples: four-color sequencing reactions using BigDye-labeled primers. Electrophoretic conditions: 11.5/12 cm device length, 150 µm offset cross-injector, 40 µm channel depth; 3.0% (w/w) LPA (10 MDa) dissolved in the respective buffer. The sequencing samples were loaded at a constant field of 300 V/cm for 3 min and electrophoresed at 200 V/cm at 40 °C.

Separation Efficiency Measurements. Efficiency (N) was calculated with the following equation:

$$N = 5.5\left(\frac{t_m}{w_{0.5}}\right)^2 \qquad (2)$$

where tm is the migration time for the DNA sequencing fragment of interest and w0.5 is the width of the peak at half-height. Resolution Calculations. The resolution Rs for two adjacent peaks was calculated as

$$R_s = \frac{1}{4}\left|\frac{\Delta\mu}{\mu_{av}}\right|\sqrt{N} \qquad (3)$$
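As a quick numerical check of how the efficiency and resolution definitions combine, the sketch below evaluates N from a peak's migration time and half-height width and then Rs from the selectivity and N. The function names and the peak values are hypothetical and chosen only to land near the 0.5 single-base resolution criterion used later in the text.

```python
import math


def plate_number(t_m, w_half):
    """Efficiency: N = 5.5 * (t_m / w_0.5)**2, with both arguments in the same time units."""
    return 5.5 * (t_m / w_half) ** 2


def resolution(selectivity, n_plates):
    """Resolution: Rs = (1/4) * |dmu/mu_av| * sqrt(N)."""
    return 0.25 * abs(selectivity) * math.sqrt(n_plates)


# Hypothetical peak: migration time 1200 s, width at half-height 2.4 s.
n = plate_number(1200.0, 2.4)
# Hypothetical selectivity between this fragment and its neighbor.
rs = resolution(0.0018, n)
print(f"N = {n:.3g}, Rs = {rs:.2f}")
```

Since Rs depends on the square root of N, a fourfold gain in efficiency is needed to double the resolution at fixed selectivity.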

Base-Calling. The microdevice data were collected using custom software written in HPVEE (Hewlett-Packard, San Jose, CA). The four-color sequencing data were processed using the Trout base-caller. Only four-color ET primer samples were used since our version of Trout is currently best optimized for this type of sequencing chemistry. Trout is available on the WICGR ftp site (genome.wi.mit.edu) in the directory distribution/software/trout. Documentation is provided with the program.

RESULTS AND DISCUSSION

Optimization of Separation Conditions. (1) Buffer Composition. When we used LPA solutions dissolved in 1× TBE, 7 M urea, and 30% (v/v) formamide, it was observed that the separation of the first 50-100 sequencing fragments was negatively affected by peak distortions. Figure 1A shows a gap between the primer peak (21 bases long) and the DNA fragment 70 bases in length. In addition, the signal heights of fragments 78 and 84 bases long had increased significantly. These phenomena have also been reported for slab gel electrophoresis.23 They were attributed to the presence of glycerol in the sequencing sample, which forms a charged glycerol-boric acid complex that, by migrating through the separation channel, could affect the local electric field and result in peak distortions. To avoid glycerol complexation, we replaced borate by TAPS and tested the buffer composed of 1× TTE (with 7 M urea), since it has been successfully used in DNA sequencing by CE.7,24 As observed in Figure 1B, using this buffer system, both gaps and peak distortion disappeared and uniform peak profiles were obtained. The buffer was adopted for the rest of the work.

(2) Effect of Polymer Concentration. LPA solutions with concentrations of 2.0, 3.0, and 4.0% (w/w) were prepared using LPA powder with a weight-average molecular mass of ∼10 MDa. LPA was chosen based on its high sequencing performance demonstrated in CE4,7,25 and microdevice electrophoresis.15,16,26 Using these separation matrixes, four-color sequencing reactions with BigDye-labeled primers were analyzed on 11.5-cm microdevices at 200 V/cm and 40 °C. Figure 2A shows selectivity as a function of DNA fragment size for the different LPA concentrations. The inset presents migration time versus fragment size. It can be seen that the selectivity of DNA fragments shorter than 450 bases was enhanced upon increasing the concentration of LPA, whereas the selectivity of DNA fragments longer than 450 bases decreased. The latter effect can be interpreted as a shift in the DNA separation mechanism to a mode of migration involving reptation.27-29 The same behavior has also been observed in DNA sequencing by CE.7,30,31 The inset of Figure 2A shows that all DNA fragments migrated proportionally slower with the increase of the separation matrix concentration.
The separation efficiency for fragments shorter than 200 bases was found to be nearly independent of the LPA concentration (data not shown). For longer fragments, a significant improvement in efficiency was observed upon increasing the LPA concentration. From the same set of experiments, we calculated resolution as a function of DNA fragment size (Figure 2B). It can be seen that sequencing performance improved upon increasing LPA concentration. The separation matrixes with 3.0 and 4.0% (w/w) 10 MDa LPA showed single-base resolution above 0.5 for a total of 380 and 430 bases, respectively, whereas 2.0% (w/w) 10 MDa LPA gave only a total of 200 bases. When processed by the software Trout, sequencing runs using 2.0, 3.0, and 4.0% (w/w) 10 MDa LPA achieved read lengths of approximately 250, 400, and 460 bases, respectively, at an accuracy of 99%. This behavior was opposite to sequencing results reported for capillary columns with 30-cm effective length, where read lengths increased when LPA concentration was reduced.4,7 This confirms the important statement that optimization of LPA concentration strongly depends on the separation distance.26 For example, the 2.0% (w/w) 10 MDa LPA, which is not very efficient on 11.5-cm-long devices, has been demonstrated to provide the longest reads (e.g., 1000 bases) in 30-cm-long capillaries.7 The 3.0 and 4.0% (w/w) 10 MDa LPA solutions were chosen to study the effect of other electrophoretic parameters on DNA sequencing fragments on 11.5-cm microfabricated devices.

Figure 2. (A) Separation selectivity (∆µ/µav) and (B) resolution as a function of DNA fragment size for LPA solutions of 2.0% (w/w) 10 MDa (■), 3.0% (w/w) 10 MDa (●), and 4.0% (w/w) 10 MDa (▲). Migration time versus base number for the different LPA matrixes is shown in the inset. Buffer was 1× TTE. Sample and other electrophoretic conditions as in Figure 1.

(3) Device Temperature. It has been shown by several groups that elevated temperature increases the read length in DNA sequencing by CE.6,7,32 Higher temperatures also result in faster separations and reduce compressions.6,32 First, we investigated the selectivity and efficiency of 3.0 and 4.0% (w/w) 10 MDa LPA as a function of the microdevice temperature, which was varied from 40 to 60 °C in 10 °C increments. The selectivity of sequencing fragments 40-650 bases in length did not change upon increasing the temperature for both LPA concentrations (data not shown). The efficiency of separation of DNA sequencing fragments at various temperatures using the 4.0% (w/w) 10 MDa LPA solution is shown in Figure 3. At any temperature, the peak efficiency was lowest at the beginning of the electropherogram but increased at 40 and 50 °C with DNA fragment sizes up to a maximum at ∼550 bases. For fragments longer than 550 bases, the efficiency gradually decreased. Interestingly, the efficiency of all DNA fragments was significantly higher at 50 °C than at 40 °C. This effect has also been reported in DNA sequencing by CE, and it was attributed to an increase in the elasticity of the interactions between DNA and LPA fibers.7,33 A further increase in temperature to 60 °C significantly reduced the efficiency of fragments longer than 200 bases. It is possible that at this temperature the positive effects of higher thermal energy of the sequencing fragments are counterbalanced by the loss in efficiency due to an increase of the dynamic nature of the entangled LPA solution. Similar results were obtained for the separation matrix containing 3.0% (w/w) 10 MDa LPA (data not shown). We concluded that, for the separation matrixes containing 3.0 and 4.0% 10 MDa LPA, the optimum temperature was 50 °C.

(4) Effect of Electric Field Strength. The electric field strength study was guided by the findings in DNA sequencing by CE, showing that maximum resolution was obtained in the range from 100 to 200 V/cm.5-7,25,34 The selectivity of DNA fragments shorter than 300 bases did not change significantly when the electric field was increased from 100 to 200 V/cm in 25 V/cm increments. However, the selectivity of DNA fragments longer than 300 bases decreased with increasing electric field strength. This latter behavior was expected since the onset of orientation with reptation should be shifted toward smaller DNA sizes when the separation voltage is increased.4,28,35,36

(23) Tong, X. C.; Smith, L. M. J. DNA Sequence 1993, 43, 151-162.
(24) Ruiz-Martinez, M. C.; Salas-Solano, O.; Carrilho, E.; Kotler, L.; Karger, B. L. Anal. Chem. 1998, 70, 1516-1527.
(25) Quesada, M. A. Curr. Opin. Biotechnol. 1997, 8, 82-93.
(26) Schmalzing, D.; Adourian, A.; Koutny, L.; Ziaugra, L.; Matsudaira, P.; Ehrlich, D. Anal. Chem. 1998, 70, 2303-2310.
(27) Duke, T.; Viovy, J. L. Phys. Rev. E 1994, 49, 2408-2416.
(28) Slater, G. W.; Mayer, P.; Grossman, P. D. Electrophoresis 1995, 16, 75-83.
(29) Heller, C. Electrophoresis 1999, 20, 1962-1977.
(30) Mitnik, L.; Salome, L.; Viovy, J. L.; Heller, C. J. Chromatogr., A 1995, 710, 309-321.
(31) Barron, A.; Sunada, W.; Blanch, H. Electrophoresis 1996, 17, 744-757.
(32) Kleparnik, K.; Foret, F.; Berka, J.; Goetzinger, W.; Miller, A. W.; Karger, B. L. Electrophoresis 1996, 17, 1860-1866.
(33) Gibson, T.; Sepaniak, M. J. J. Chromatogr., B 1997, 695, 103-111.
(34) Kim, Y.; Yeung, E. S. J. Chromatogr., A 1997, 781, 315-325.
(35) Luckey, J. A.; Smith, L. M. Anal. Chem. 1993, 65, 2841-2850.
(36) Dovichi, N. J. Electrophoresis 1997, 18, 2393-2399.

Figure 3. Number of theoretical plates as a function of DNA fragment length at different device temperatures: 40 (■), 50 (●), and 60 °C (▲). The LPA solution was 4.0% (w/w) 10 MDa and the electrophoretic runs were performed using an 11.5-cm-long device at 200 V/cm. Other electrophoretic conditions as in Figure 2.

Figure 4. Resolution as a function of DNA fragment length for different electric fields: 100 (▼), 125 (▲), 150 (●), and 200 (■) V/cm. The LPA solution was 4.0% (w/w) 10 MDa on an 11.5-cm-long device at 50 °C. Samples and other electrophoretic conditions as in Figure 2.

The effect of the electric field strength on the separation efficiency of DNA sequencing fragments was investigated next (data not shown). We observed that the number of theoretical plates (N) for sequencing fragments in the range from 40 to 650 bases increased when the electric field was increased from 100 to 200 V/cm. Under diffusion-limited conditions, an increase in electric field leads to shorter analysis times and reduced diffusional broadening.28,33,37

Figure 4 presents resolution as a function of DNA fragment size at different electric field strengths for the 4.0% (w/w) 10 MDa LPA at 50 °C. In the range from ∼40 to 500 bases, resolution is directly proportional to voltage. At each field strength, resolution reached a maximum at ∼200 bases, beyond which each resolution curve declined with a different slope. The decrease in resolution at 100 and 125 V/cm was more gradual than for 150 and 200 V/cm due to the more pronounced stretching of longer DNA fragments at higher electric fields. The curves crossed each other at ∼500 bases, and the highest resolution was then achieved at 125 V/cm. A single-base resolution of 0.5 is commonly set as a minimum requirement for accurate DNA sequencing.15,26,36,38 On the basis of this criterion, Figure 4 shows two optimal electric field strengths for DNA sequencing on 11.5-cm-long microfabricated devices using 4.0% (w/w) 10 MDa LPA and 50 °C. High-speed, high-efficiency DNA sequencing analysis of 500 bases (R ≥ 0.5) was obtained in less than 20 min at 200 V/cm. A separation of 550 bases with R ≥ 0.5 was then achieved at 125 V/cm at the expense of a longer analysis time (data not shown). The separation matrix containing 3.0% (w/w) 10 MDa LPA provided separation of only 420 bases with R ≥ 0.5 at 50 °C and 125 V/cm, due to the lower selectivity and efficiency at this concentration.

(5) Microdevice Effective Length. After studying separation buffer, polymer composition, temperature, and electric field, the next step was to investigate the possibility of decreasing the analysis time for the conditions described above (4.0% w/w 10 MDa LPA, 50 °C, and 125 V/cm) by reducing the effective length of the separation channel from 11.5 to 6 cm. The selectivity of sequencing fragments did not change upon reducing the effective length (data not shown). However, as shown in Figure 5A, much lower separation efficiencies were obtained upon reducing the effective length from 11.5 to 6 cm for all DNA sequencing fragments. As a result, the range of DNA sequencing fragments with R ≥ 0.5 was significantly reduced from 550 to less than 200, as depicted in Figure 5B. Consequently, read lengths of less than 250 bases were obtained using 4.0% (w/w) 10 MDa LPA, 50 °C, 125 V/cm, and an effective length of 6 cm (data not shown). These results suggest that, for practical applications of microfabricated electrophoretic devices for DNA sequencing, effective lengths longer than 6 cm should be used. A recent publication by Backhouse et al. came to a similar conclusion.39 Reports have demonstrated that an increase of the effective column length in DNA analysis by CE, under diffusion-limited conditions, resulted in higher separation efficiencies and, therefore, better resolution and longer read length.25,34,37

(37) Karger, A. E. Electrophoresis 1996, 17, 144-151.
(38) Grossman, P. D. J. Chromatogr., A 1994, 663, 219-227.
(39) Backhouse, C.; Caamano, M.; Oaks, F.; Nordman, E.; Carrillo, A.; Johnson, B.; Bay, S. Electrophoresis 2000, 21, 150-156.
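The length dependence discussed under Microdevice Effective Length can be illustrated with the textbook diffusion-limited plate-count expression. This is an idealized sketch, not an analysis taken from the paper: it assumes longitudinal diffusion is the only source of band broadening and that selectivity does not depend on length (as reported above). The symbols L (effective length), D (fragment diffusion coefficient), σ² (band variance), µ (mobility), E (field), and t (migration time) are introduced here for illustration.

$$\begin{aligned}
N &= \frac{L^2}{\sigma^2}, \qquad \sigma^2 = 2Dt, \qquad t = \frac{L}{\mu E}
\quad\Longrightarrow\quad N = \frac{\mu E L}{2D} \\
R_s &= \frac{1}{4}\left|\frac{\Delta\mu}{\mu_{av}}\right|\sqrt{N} \;\propto\; \sqrt{L} \\
\frac{R_s(6\ \mathrm{cm})}{R_s(11.5\ \mathrm{cm})} &\approx \sqrt{\frac{6}{11.5}} \approx 0.72
\end{aligned}$$

Under these assumptions, shortening the channel from 11.5 to 6 cm at fixed field would cost roughly 28% of the resolution for every fragment. Broadening contributions that do not scale with length, such as the injection plug width, would make the loss on a real device larger than this estimate.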


Figure 5. (A) Number of theoretical plates and (B) resolution as a function of DNA fragment size for different effective lengths: 6 (■) and 11.5 (●) cm. Electrophoretic conditions: 4.0% (w/w) 10 MDa LPA, 50 °C, and 125 V/cm. Samples and other electrophoretic conditions as in Figure 1.

(6) Separation Matrix Composition. It has been described that mixtures of short-chain and long-chain polymers improve the separation performance of DNA sequencing fragments by CE.7,34 We investigated the effect of mixing high weight-average molecular mass LPA (∼10 MDa) with low weight-average molecular mass LPA of ∼50 kDa on selectivity, efficiency, and resolution. Different weight ratios of short- and long-chain LPA at a total concentration of 4.0% (w/w) were studied at 50 °C and 125 V/cm (optimized electrophoretic parameters). Figure 6A shows the selectivity of DNA sequencing fragments as a function of fragment size for different LPA mixtures. As expected, the selectivity decreased with fragment size for the three mixtures. Upon increasing the fraction of short-chain LPA in the separation matrix, the selectivity of DNA fragments shorter than ∼550 bases decreased. In the region of 600-650 bases the curves crossed each other, and beyond this point the selectivity of sequencing fragments increased with a higher content of 50 kDa LPA in the separation matrix. Importantly, with an increasing fraction of short LPA chains, the selectivity for long DNA fragments decreased more slowly. The inset of Figure 6A shows that all DNA fragments migrated faster when the fraction of short LPA chains in the separation matrix was increased.

The efficiencies of separation of DNA sequencing fragments for the LPA mixtures are shown in Table 1. We can see that the efficiency of the first 470 bases did not depend on the composition of the separation matrixes studied. However, a larger fraction of long LPA chains provided higher separation efficiencies for DNA fragments longer than 500 bases. The improvement in efficiency at high base numbers with increasing polymer weight-average molecular mass has also been reported in DNA sequencing by CE.37,40,41

Figure 6B describes resolution as a function of fragment size for the three different LPA mixtures. The graph shows that the higher the content of high molecular weight LPA, the higher was the resolution in the range of 25-500 bases. This is because both selectivity and efficiency improved with the percentage of 10 MDa LPA present in the separation matrix. The solution of 3.0% (w/w) 10 MDa plus 1.0% (w/w) 50 kDa LPA showed the highest resolution above ∼550 bases, most likely caused by its superior selectivity in this region. The lowest resolution was found for 2.0% (w/w) 10 MDa plus 2.0% (w/w) 50 kDa LPA, reaching in the best case a resolution of only 0.6.

The 4.0% (w/w) 10 MDa LPA and the 3.0% (w/w) 10 MDa plus 1.0% (w/w) 50 kDa LPA solutions showed comparable read lengths of more than 600 bases at 98.5% accuracy at 125 V/cm and 50 °C on an 11.5-cm-long device (see below). However, the run time was reduced by using the mixed LPA solution (see inset Figure 6A). On the other hand, 2.0% (w/w) 10 MDa plus 2.0% (w/w) 50 kDa LPA showed a read length of only 350 bases at the same experimental conditions, making it rather impractical for many DNA sequencing applications.

When the same resolution evaluation was performed on a device of 6.0 cm in length, a maximum read length of only 250 bases was found with the 3.0% (w/w) 10 MDa LPA and 1.0% (w/w) 50 kDa LPA at 125 V/cm and 50 °C. This finding indicates that the read length of the 6.0-cm-long devices cannot be further improved by using the mixed LPA solutions studied here.

Figure 6. (A) Separation selectivity (∆µ/µav) and (B) resolution as a function of DNA fragment length for LPA solutions of different composition: 4.0% (w/w) 10 MDa (■), 3.0% (w/w) 10 MDa + 1.0% (w/w) 50 kDa (●), and 2.0% (w/w) 10 MDa LPA + 2.0% (w/w) 50 kDa LPA (▲). The plots of migration time versus base number for the different LPA solutions are shown in the inset. Electrophoretic conditions: 11.5-cm-long device, 125 V/cm, and 50 °C. Samples and other experimental conditions as in Figure 2.

Table 1. Effect of LPA Composition on DNA Sequencing Data^a

                                             efficiency^b (×10^6)
LPA composition                              base 78    base 468    base 545    base 832
4.0% (w/w) 10 MDa                            0.60       1.21        1.55        1.50^c
3.0% (w/w) 10 MDa + 1.0% (w/w) 50 kDa        0.61       1.20        1.58        ^c
2.0% (w/w) 10 MDa + 2.0% (w/w) 50 kDa        0.61       1.18        1.40        1.20

a Samples: dye-labeled primer cycle sequencing using ABI Prism BigDye (-21) primers on ssM13mp18. Electrophoretic conditions as in Figure 6. b The RSD for the number of theoretical plates was 7% (n = 3) at each experimental condition. c Only one value (1.50) is available at base 832 for these two compositions.

DNA Sequencing under Optimal Conditions. All the parameters discussed above were combined to examine DNA sequencing of a standard M13mp18 template using (-21) dye-labeled ET primers and Thermosequenase. The raw data were processed by Trout base-calling software, and the sequence was compared with the known sequence of M13mp18 using Seqman II expert sequence analysis software. As seen in Table 2, the average read length (n = 10) on a microfabricated device of 11.5 cm in length filled with a separation matrix containing 3.0% (w/w) 10 MDa LPA and 1.0% (w/w) 50 kDa LPA at 200 V/cm and 50 °C was found to be 580 bases at 98.5% accuracy in ∼18 min. The RSD of migration time was ∼3%. Longer read lengths were achieved by reducing the electric field from 200 to 125 V/cm. In this case, 640 bases (n = 10) could be sequenced at an accuracy of 98.5% in less than 30 min. Trout increased the read length by ∼50 bases beyond what would be expected using the minimum resolution criterion of 0.5. It is important to indicate that similar read lengths have been reported by CE using 50-cm-long capillary columns but with analysis times between 2 and 3 h.8,11,42 Taken together, these results demonstrate the practicality of 11.5-cm microfabricated devices for DNA sequencing.

Table 2. Read Length^a at Electrophoretic Conditions Optimized for 11.5-cm Microfabricated Electrophoretic Devices

electric field (V/cm)    analysis time^b (min)    read length at 98.5% accuracy^b
200                      17.1 (base 565)          580
125                      27.8 (base 624)          640

a In this study, read length was defined as the largest number of consecutive base calls starting with the fragment 33 bases in length using DNA standard reactions on an M13mp18 template. Experimental conditions as in Figure 7. The reference sequence was the published M13mp18 sequence.43 b The RSDs (n = 10) for migration time and read lengths were 3 and 5%, respectively.

Figure 7 presents the electrophoretic separation of four-color sequencing reaction products for fragments from 33 to 710 bases. Two errors occurred in the first 60 bases due to strong compressions. No further errors were found until base 572, where an error occurred due to another compression. The number of errors in the read length increases significantly after ∼700 bases. The base calling yielded a read length of 640 bases with 98.5% accuracy in ∼30 min.

Figure 7. Electrophoretic separation of DNA sequencing fragments using ssM13mp18 as template, ET-labeled universal (-21) primer, and Thermosequenase using an 11.5-cm microfabricated device at the optimal electrophoretic conditions for long reads: 3.0% (w/w) 10 MDa + 1.0% (w/w) 50 kDa LPA, 50 °C, and 125 V/cm.

(40) Wu, C. H.; Quesada, M. A.; Schneider, D. K.; Farinato, R.; Studier, F. W.; Chu, B. Electrophoresis 1996, 17, 1103-1109.
(41) Heller, C. Electrophoresis 1999, 20, 1978-1986.
(42) Zhang, Y.; Tan, H.; Yeung, E. S. Anal. Chem. 1999, 71, 5018-5025.
(43) Ebright, R.; Dong, Q.; Messing, J. Gene 1992, 114, 81-83.
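The read-length-at-accuracy figures quoted for the optimized runs can be related to per-base calls with a short sketch. The paper does not describe how the accuracy threshold was applied, so the interpretation below (longest run of calls from the first called fragment whose cumulative accuracy stays at or above the threshold) is an assumption, and the function name and example data are hypothetical.

```python
def read_length(correct, min_accuracy=0.985):
    """Longest prefix of base calls whose overall accuracy is >= min_accuracy.

    `correct` holds one boolean per base call, in order, starting from the
    first called fragment (True means the call matches the reference).
    This is one possible reading of "read length at 98.5% accuracy".
    """
    best = 0
    n_correct = 0
    for i, ok in enumerate(correct, start=1):
        n_correct += bool(ok)
        if n_correct / i >= min_accuracy:
            best = i
    return best


# Hypothetical run of 700 calls with miscalls at made-up positions.
errors = {12, 40, 540, 600, 610, 620, 630, 641, 642, 643, 644, 645, 646}
calls = [i not in errors for i in range(1, 701)]
print(read_length(calls))
```

Under this reading, a 640-base read at 98.5% accuracy tolerates at most 9 miscalled bases (640 × 0.015 = 9.6).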


CONCLUSIONS

This systematic study demonstrates that numerous electrophoretic parameters may interdependently affect the performance of microfabricated devices for the separation of DNA sequencing fragments using replaceable LPA solutions. Importantly, the results show that the separation principles governing DNA sequencing by microdevice and capillary electrophoresis are essentially the same. As predicted,26 devices shorter than 10 cm in length were found to be rather impractical for most sequencing applications when LPA is used as the sieving matrix. Under optimized conditions for long reads (3.0% (w/w) 10 MDa LPA and 1.0% (w/w) 50 kDa LPA, 50 °C, and 125 V/cm), we achieved on 11.5-cm-long microfabricated electrophoretic devices read lengths of 640 bases in ∼30 min with a base-calling accuracy of 98.5%. Even longer read lengths will require devices with long channels (e.g., 30-40 cm in length). Such long-channel chips are under development in our laboratory for de novo sequencing, where long read lengths dramatically reduce sequence assembly costs.

ACKNOWLEDGMENT

This work was supported by the National Institutes of Health (NIH) under Grant HG01389 and by the Air Force Office of Scientific Research (Grant F49620-98-1-0235).

Received for review January 13, 2000. Accepted April 11, 2000.

AC000055J
