Anal. Chem. 1998, 70, 1516-1527
A Sample Purification Method for Rugged and High-Performance DNA Sequencing by Capillary Electrophoresis Using Replaceable Polymer Solutions. A. Development of the Cleanup Protocol
Marie C. Ruiz-Martinez,† Oscar Salas-Solano, Emanuel Carrilho,‡ Lev Kotler, and Barry L. Karger*
Barnett Institute and Department of Chemistry, Northeastern University, Boston, Massachusetts 02115
A method for the cleanup of Sanger DNA sequencing reaction products for capillary electrophoresis analysis with replaceable polymer solutions has been developed. A poly(ether sulfone) ultrafiltration membrane pretreated with linear polyacrylamide was first used to remove template DNA from the sequencing samples. Then, gel filtration in a spin column format (two columns per sample) was employed to decrease the concentration of salts below 10 µM in the sample solution. The method was very reproducible and increased the injected amount of the sequencing fragments 10-50-fold compared to traditional cleanup protocols. Using M13mp18 as template, the resulting cleaned-up single DNA sequencing fragments could routinely be separated to more than 1000 bases with a base-calling accuracy of at least 99% for 800 bases. The method is simple and universal and can be easily automated. In the following paper, a systematic study to determine quantitatively the effects of the sample solution components such as high-mobility ions (e.g., chloride and dideoxynucleotides) and template DNA on the injected amount and separation efficiency of the sequencing fragments is presented. DNA sequencing technologies are under rapid development due to the needs of the Human Genome Project and beyond to accelerate the process of gene discovery and therapy. Modern applications of DNA sequencing have imposed stringent demands on reliability, speed, and throughput for DNA sequencers. Recent reports have demonstrated the potential of capillary electrophoresis (CE) for DNA sequencing given the inherent speed of separation and ease of automation.1-3 Multiple capillary arrays4-10
and microchips11-13 are under investigation to provide high-throughput analysis of DNA fragments. Relative to cross-linked gel columns, the use of replaceable polymer solutions to achieve size separation of ssDNA fragments has increased the lifetime of the columns and eliminated the requirements of gel pouring and casting.14 In addition, improvements in the preparation of the matrix composition have led to sequencing over 1000 bases per run.1 Nevertheless, problems in ruggedness remain, mainly due to insufficient cleanup of the sequencing samples. For example, CE systems for DNA sequencing have been reported to produce declining current during operation and contain spike signals due to the formation and trapping of gas bubbles.15,16 Such effects, when they occur, decrease the resolution and accuracy of DNA sequencing by CE, and subsequently, the resulting performance of a high-throughput capillary array instrument could be seriously limited. Indeed, the critical importance of sample cleanup for the successful operation of CE for separation of DNA sequencing reaction products has not been sufficiently emphasized. In contrast to slab gel electrophoresis, sequencing fragments are introduced into the capillary column using electrokinetic injection, which provides focusing of the single-stranded DNA fragments at the head of the column.3,17 However, electrokinetic
* To whom correspondence should be addressed: (e-mail) bakarger@lynx.neu.edu.
† Department of Energy Human Genome Project Distinguished Postdoctoral Fellow. Current address: Curagen Corp., 322 E. Main St., Branford, CT 06504.
‡ Current address: University of São Paulo, IQSC/DQFM 13560-970, São Carlos S.P., Brazil.
(1) Carrilho, E.; Ruiz-Martinez, M. C.; Berka, J.; Smirnov, I.; Goetzinger, W.; Miller, A. W.; Brady, D.; Karger, B. L. Anal. Chem. 1996, 68, 3305-3313.
(2) Tan, H.; Yeung, E. S. Anal. Chem. 1997, 69, 664-674.
(3) Swerdlow, H.; Jones, B. J.; Witter, C. T. Anal. Chem. 1997, 69, 848-855.
(4) Ueno, K.; Yeung, E. S. Anal. Chem. 1994, 66, 1424-1431.
(5) Kheterpal, I.; Scherer, J. R.; Clark, S.; Radhakrishnan, A.; Ju, J.; Ginter, C. L.; Sensabaugh, G. F.; Mathies, R. A. Electrophoresis 1996, 17, 1852-1859.
(6) Dovichi, N.; Zhang, J.; Rong, J.; Rong, L.; Bay, S.; Roos, P.; Voss, K.; Dellinger, S. DOE Human Genome Program Contractor-Grantee Workshop V, Santa Fe, NM, 1996.
(7) Bashkin, J.; Roach, D.; Leong, J.; Bartosiewicz, M.; Barker, D.; Johnston, R. G. J. Capillary Electrophor. 1996, 3, 61-68.
(8) Quesada, M. A.; Zhang, S. Electrophoresis 1996, 17, 1841-1851.
(9) Anazawa, T.; Takahashi, S.; Kambara, H. Anal. Chem. 1996, 68, 2699-2704.
(10) Carrilho, E.; Miller, A.; Ruiz-Martinez, M. C.; Kotler, L.; Kesilman, J.; Karger, B. L. Proc. SPIE 1997, 2985A, 4-18.
(11) Jacobson, S. C.; Koutny, L. B.; Hergenroder, R.; Moore, A. W.; Ramsey, J. M. Anal. Chem. 1994, 66, 3472-3476.
(12) Woolley, A. T.; Mathies, R. A. Anal. Chem. 1995, 67, 3676-3680.
(13) Schmalzing, D.; Koutny, L.; Adourian, A.; Belgrader, P.; Matsudaira, P.; Ehrlich, D. Proc. Natl. Acad. Sci. U.S.A. 1997, 94, 10273-10278.
(14) Ruiz-Martinez, M. C.; Berka, J.; Belenkii, A.; Foret, F.; Miller, A. W.; Karger, B. L. Anal. Chem. 1993, 65, 2851-2858.
(15) Swerdlow, H.; Dew-Jager, K. E.; Brady, K.; Grey, R.; Dovichi, N. J.; Gesteland, R. Electrophoresis 1992, 13, 475-483.
(16) Karger, A. E. Electrophoresis 1996, 17, 144-151.
1516 Analytical Chemistry, Vol. 70, No. 8, April 15, 1998
S0003-2700(97)01143-8 CCC: $15.00
© 1998 American Chemical Society Published on Web 03/18/1998
injection is biased toward high electrophoretic mobility ions,18-20 such as chloride and dideoxynucleotides present in the sequencing reaction solution, relative to the DNA fragments and double-stranded DNA. To increase the amount of DNA injected into the capillary column, an effective removal of these small ionic species is required. A traditional sample preparation scheme for both slab gel electrophoresis and CE consists of desalting DNA sequencing samples by ethanol precipitation, followed by reconstitution of the DNA fragments and template in a mixture of formamide-0.5 M EDTA (49:1) prior to loading or injection.18,21 Although widely used, this method has been found to be neither reproducible in terms of DNA recovery nor easily automated.2,22,23 Other approaches for desalting sequencing samples such as solid-phase purification with magnetic beads24 and gel filtration in both spin column25 and microtiter plate formats26 have been developed for automated DNA sequencers.
In addition to small ionic species, template DNA has also been shown to interfere with the analysis of sequencing fragments in both thin slab gels24,27 and capillary columns.15 Upon injection of the sequencing reaction mixture, we have also observed a constant decay in current and significant deterioration in the separation performance of the column when template DNA was present in the sample solution.28 However, at present, template DNA removal is not always included in the purification schemes of DNA sequencing reaction products for CE analysis.3,18,29 In early work, the effect of the presence of template DNA in the sequencing samples was removed by trimming off a portion of the injection end of the capillary after the electrokinetic injection.30 This method is not acceptable for an automated multiple capillary DNA sequencing instrument.
In another case, a modified uracil-containing template was enzymatically nicked after the Sanger reaction.27 Others have used magnetic beads to remove the template for thin slab gel analysis.24 Here, both salt and template were removed in one step; however, this procedure also required chemical modification of the primer or template DNA. In another approach, the template is retarded in a polymer solution added to the sequencing sample, and DNA sequencing fragments are introduced into the capillary column.31 As to be
(17) Cohen, A. S.; Najarian, D. R.; Paulus, A.; Guttman, A.; Smith, J. A.; Karger, B. L. Proc. Natl. Acad. Sci. U.S.A. 1988, 85, 9660-9663.
(18) Figeys, D.; Ahmadzedeh, H.; Arriaga, E.; Dovichi, N. J. J. Chromatogr., A 1996, 744, 325-331.
(19) Kleparnik, K.; Garner, M.; Bocek, P. J. Chromatogr., A 1996, 698, 375-383.
(20) Schwartz, H. E.; Ulfelder, K.; Sunzeri, F. J.; Busch, M. P.; Brownlee, R. G. J. Chromatogr. 1991, 559, 267-283.
(21) Sambrook, J.; Fritsch, E. F.; Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY, 1989; section 9.49.
(22) Talbot, D. Amicon Appl. Note 1996, 387, 11-15.
(23) Hilderman, D.; Muller, D. Biotechniques 1997, 22, 878-879.
(24) Tong, X. C.; Smith, L. M. Anal. Chem. 1992, 64, 2672-2677.
(25) Devaney, J.; Marino, M.; Williams, P.; Weaver, K.; S., D. R.; Turner, K.; Belgrader, P. Appl. Theor. Electrophor. 1996, 6, 11-14.
(26) Wang, K.; Gan, L.; Boysen, C.; Hood, L. Anal. Biochem. 1995, 226, 85-90.
(27) Swerdlow, H.; Dew-Jager, K.; Gesteland, R. F. Biotechniques 1994, 16, 684-693.
(28) Salas-Solano, O.; Ruiz-Martinez, M. C.; Carrilho, E.; Kotler, L.; Karger, B. L. Anal. Chem. 1998, 70, 1528-1535.
(29) Wu, C. H.; Quesada, M. A.; Schneider, D. K.; Farinato, R.; Studier, F. W.; Chu, B. Electrophoresis 1996, 17, 1103-1109.
(30) Swerdlow, H.; Gesteland, R. Nucleic Acids Res. 1990, 18, 1415-1419.
(31) Johnson, B. F. Int. Patent GOIN 27/47, C12Q 1/68, 1995.
discussed later, only a limited amount of DNA sequencing fragments could be injected with this procedure.
In the present work, a sample cleanup procedure has been developed to remove template DNA followed by desalting to the level of 5-10 µM salt. A poly(ether sulfone) ultrafiltration membrane, pretreated with a solution of linear polyacrylamide (LPA) to minimize adsorption of the DNA sequencing fragments, was used to eliminate template DNA (as determined by agarose gel electrophoresis) from the reaction mix. The use of ultrafiltration membranes is a simple and cost-effective means of removal of circular DNA vectors from the sequencing reaction products. Then, two prewashed spin columns with filtration gel were used to remove salts, buffer components, and nucleotides. The desalting effectiveness was quantitatively determined by a CE-indirect UV method. The cleanup procedure was shown to be highly reproducible and compatible with a variety of sequencing chemistries and templates. The protocol can be easily automated using a laboratory robot equipped with either pressure or spin force procedures and will therefore be suitable for future incorporation into a fully automated DNA sequencing system. The accompanying paper describes in detail the effects of small ionic impurities (e.g., chloride and deoxynucleotides) and template DNA on the injected amount and separation performance of sequencing reaction products.28
EXPERIMENTAL SECTION
Instrumentation. The basic design of a single capillary instrument with laser excitation has been previously described.1,32 Briefly, the light from a 5-mW multiline argon ion laser (model 5490-ASL-00, Ion Laser Technologies, Salt Lake City, UT) was passed through an interference filter (model 52660, Oriel, Stamford, CT) to isolate the 514-nm line. For transfer dye primer sequencing chemistry, a different interference filter was employed (model 52630, Oriel) for isolation of the 488-nm line.
A mirror (Newport, Fountain Valley, CA), positioned at 45° relative to the lens then reflected the laser emission. The fluorescence emission was collected with a 40× microscope objective (numerical aperture 0.65) (model 13600, Oriel) and focused onto a spectrograph (Jarrell-Ash Division/Fisher Scientific, Waltham, MA). The light was transmitted through two holographic notch filters (model HNPF-514.5 for the 514-nm line or Notch Plus for the 488-nm line, Kaiser Optical System, Ann Arbor, MI) to block scattered laser illumination and then reflected on a 600 groove/mm grating (Jarrell-Ash). The diffracted image was detected using an intensified photodiode array (Model 1461, EG&G Princeton Applied Research, Princeton, NJ), operated at 4 °C. The fluorescence spectra of the labeled sequencing fragments were acquired from 500 to 660 nm. A total of 640 diodes, electronically integrated in 160 groups of 4 diodes each, were employed to collect the emission spectra. A second instrument was used for a single-wavelength detection system, as previously described.14 The interference filter allowed the emission of a single labeled dye primer to be selectively observed, and detection was accomplished with a photomultiplier tube. The data acquisition was performed through a DT2802 board (Data Translation, Marlborough, MA) using (32) Ruiz-Martinez, M. C.; Carrilho, E.; Berka, J.; Kieleczawa, J.; Miller, A. W.; Foret, F.; Carson, S.; Karger, B. L. Biotechniques 1996, 20, 1058-1069.
Chrom Perfect for Windows (Justice Innovations, Mountain View, CA) software with an IBM-compatible personal computer. The base-calling routines were previously described.32
Capillary Columns and Sieving Matrixes. The capillaries of 75-µm i.d., 365-µm o.d. (Polymicro Technologies, Phoenix, AZ) used to separate sequencing reaction products were covalently coated in our laboratory with poly(vinyl alcohol) (PVA).33 The specific lengths and electrophoretic conditions are listed in the figure captions. High-molecular-weight LPA solutions, as previously described, were used to separate the sequencing reaction products.1 For this work, 2% (w/w) LPA solution (with an average molecular weight in excess of 5.5 MDa) dissolved in 50 mM TRIS-50 mM TAPS-2 mM EDTA-7 M urea buffer was utilized. The polymer solution in the capillary was replaced after each run, and the voltage was applied for 5 min before injection. The template-free samples were injected without preliminary heating.
DNA Sequencing Chemistries. Most sequencing reactions were performed using standard cycle sequencing chemistry with AmpliTaq-FS and labeled M13 (-21) universal primers (Applied Biosystems/Perkin-Elmer, Foster City, CA) on an M13mp18 single-stranded template (New England Biolabs, Beverly, MA) and a pGEM3Zf(+) double-stranded template (Applied Biosystems). Other DNA sequencing reactions were performed using the Thermosequenase cycle sequencing kit with 7-deaza-dGTP and DYEnamic energy-transfer (ET) M13 (-40) primers (Amersham Life Science, Cleveland, OH). According to manufacturers’ protocols, 0.3-0.6 µg of template was used for all four color reactions. The temperature-cycling protocol for both sequencing chemistries, performed on a PTC100 instrument (MJ Research, Watertown, MA), was 15 cycles of 10 s at 95 °C, 5 s at 50 °C, and 1 min at 70 °C, followed by 15 cycles of 10 s at 95 °C and 1 min at 70 °C. The samples were then heated for 5 min at 100 °C in order to inactivate the enzymes prior to cleanup.
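As a rough planning aid for anyone scripting this step, the cycling protocol above can be tallied as follows. This is only a sketch: the dictionary layout and function names are illustrative assumptions, and ramp times between temperatures (which depend on the instrument) are not included.

```python
# Hold-time tally for the two-stage cycling protocol quoted in the text.
# Assumption: only programmed hold times are counted; ramp times are ignored.

STAGE_1 = {"cycles": 15, "steps_s": [10, 5, 60]}  # 95 °C, 50 °C, 70 °C
STAGE_2 = {"cycles": 15, "steps_s": [10, 60]}     # 95 °C, 70 °C
ENZYME_INACTIVATION_S = 5 * 60                    # 5 min at 100 °C


def stage_seconds(stage):
    """Total programmed hold time of one stage, in seconds."""
    return stage["cycles"] * sum(stage["steps_s"])


def total_hold_seconds():
    """Hold time of both cycling stages plus the final inactivation step."""
    return stage_seconds(STAGE_1) + stage_seconds(STAGE_2) + ENZYME_INACTIVATION_S


if __name__ == "__main__":
    print(f"stage 1: {stage_seconds(STAGE_1)} s")
    print(f"stage 2: {stage_seconds(STAGE_2)} s")
    print(f"total  : {total_hold_seconds() / 60:.2f} min (excluding ramps)")
```

The actual wall-clock time on a given thermal cycler will be longer because of heating and cooling ramps.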
Quantitation of Matrix Components in Sample Solutions. (1) Chloride Determination. A CE-indirect UV method34 was modified to determine the concentration of chloride in the sequencing samples. The capillary column (DB-1) (J&W Scientific, Folsom, CA) was 100-µm i.d., 365-µm o.d. The buffer reservoirs were carefully positioned at the same height to avoid flow inside the column. On-column UV detection at 274 nm was performed using a multiwavelength UV detector (ThermoQuest, Mountain View, CA). The separations were achieved at constant electric field using a high-voltage power supply (Spellman, Plainview, NY), with the cathodic end grounded. The data acquisition was performed through a DT2802 board using Chrom Perfect for Windows (Justice Innovations) with an IBM-compatible computer. The background electrolyte was 20 mM TRIS-5 mM acetic acid-5 mM potassium chromate. This solution was prepared daily from stock solutions and filtered through a 0.2-µm filter (Gelman Sciences, Ann Arbor, MI) prior to use. The buffer was flushed after each analysis run. A calibration curve was constructed from sodium chloride standard solutions with concentrations between 1.0 and 500 µM. The electrophoretic conditions were as follows: capillary effective length 20 cm, total length 35 cm, electric field 200 V/cm, and hydrodynamic injection at a
(33) Goetzinger, W.; Karger, B. L. Int. Patent WO 96/23220, 1996.
(34) Rhemrev-Boom, M. M. J. Chromatogr., A 1994, 680, 675-684.
height of 10 cm for 40 s. For each analysis, 5 µL of sequencing sample was required. (2) Template Determination. The presence of template DNA after membrane ultrafiltration was determined by agarose slab gel electrophoresis. The whole DNA sequencing reaction sample was loaded onto the agarose slab gel (1% w/w), prepared from Seakem GTG agarose (FMC, Rockland, ME), and after 1 h of electrophoresis, the gel was stained in a 0.5 µg/L solution of ethidium bromide. A 2-µL aliquot of ssM13mp18 with a concentration of 0.25 µg/µL was used as a positive control. DNA Sequencing Reaction Products Purification. A purification protocol was developed for removal of both the template DNA and the small ions from the sample. (1) Template DNA Removal. ssM13mp18 and ds pBR327 with a Q9 mouse DNA insert templates were removed using an ultrafiltration membrane of poly(ether sulfone), with a molecular weight cutoff (MWCO) of 300 000 in a spin column format (Pall Filtron, Northborough, MA). To suppress nonspecific binding, the columns were treated with a solution of 0.005% (w/v) linear polyacrylamide (average molecular weight 700 000-1 000 000) (Polysciences, Warrington, PA). An aliquot of 500 µL of this LPA solution was placed on the column and spun in a centrifuge (radius 7.4 cm) (model 5415C, Eppendorf, Westbury, NY) for 15 min at 3000 rpm. The polymer solution remaining in the collection receptacle of the spin column was discarded. The four single color sequencing reactions (i.e., for each dye-labeled primer) were pooled together, dried under vacuum, and reconstituted in 120 µL of 30% (w/w) 1-propanol (100% by gas chromatography, J. T. Baker, Philipsburg, NJ). The samples were then heated at 95 °C for 3 min, cooled in ice-water, immediately placed in the spin ultrafiltration columns, and spun at 7000 rpm for 15 min. The 1-propanol was used to prevent rehybridization of the fragments to template DNA after the denaturation step. 
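Because the spin conditions above are given in rpm for one specific rotor, transferring them to another centrifuge or to a robot requires converting to relative centrifugal force. The sketch below uses the standard conversion RCF = 1.118 × 10⁻⁵ · r(cm) · rpm². It assumes that the quoted 7.4-cm radius is the effective radius at the membrane, which the text does not state explicitly; the function names are illustrative.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) for a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(target_rcf, radius_cm):
    """Speed (rpm) needed to reach target_rcf on a rotor of a different radius."""
    return math.sqrt(target_rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius = 7.4  # cm, as quoted for the centrifuge in the text
    for label, speed in [("LPA pretreatment", 3000), ("template removal", 7000)]:
        print(f"{label}: {speed} rpm is about {rcf(speed, radius):.0f} x g")
    # Hypothetical second rotor with a 10-cm radius, for illustration only.
    print(f"equivalent speed on a 10-cm rotor: "
          f"{rpm_for_rcf(rcf(7000, radius), 10.0):.0f} rpm")
```

Under these assumptions the two spins correspond to roughly 745 × g and 4050 × g.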
The filtrate was dried under vacuum and then dissolved in 50 µL of deionized water. pGEM3Zf(+), which is shorter than many other templates, was removed using a poly(ether sulfone) ultrafiltration membrane with a MWCO of 100 000 (Pall Filtron), pretreated with a solution of 0.005% (w/w) LPA (700 000-1 000 000) as well. An aliquot of 500 µL of this LPA solution was spun in the above centrifuge for 15 min at 3000 rpm. The sequencing samples prepared with this template were treated similarly to those made from M13mp18 and pBR327 with insert. After heating at 95 °C for 3 min and cooling in ice-water, the samples were placed on the ultrafiltration membranes and spun at 12 000 rpm for 10 min. The filtrate was then dried under vacuum and dissolved in 50 µL of deionized water for the desalting step. The procedure to remove template DNA from sequencing samples performed using Thermosequenase and ET primer (Amersham Life Science) was slightly modified to suppress problems of recovery due to the presence of detergents such as Nonidet P-40 in these samples (see Results and Discussion). The samples were dried and then dissolved in 500 µL of the 30% (w/w) 1-propanol solution instead of 120 µL, as described above.
(2) Desalting of Sequencing Samples. In this step, the sequencing samples, with no template DNA, were desalted using Centri-Sep spin columns (Princeton Separations, Adelphia, NJ). The columns were hydrated for 30 min by adding 800 µL of deionized water. The interstitial volume was excluded by spinning
the columns for 3 min at 3000 rpm, and the columns were washed with five aliquots of distilled water, each of 600 µL. The sequencing samples were then placed on the columns and spun for 3 min at 3000 rpm. The procedure was repeated using the second column, and the resulting volume of the sample was 50 µL. A 10-µL aliquot was diluted with 15 µL of deionized water, and injection was performed from the aliquot without preheating. The specific injection conditions are described in the figure captions. It is important to note that the cleaned-up sequencing samples reconstituted in deionized water were kept at -20 °C. Analysis of these purified samples after one month showed results similar to those from freshly prepared samples, indicating no degradation of the DNA fragments.
Chemicals. Acrylamide, N,N,N′,N′-tetramethylethylenediamine (TEMED), ammonium persulfate, and urea were purchased from ICN Biomedicals (Aurora, OH), and TRIS, TAPS, and EDTA were from Sigma (St. Louis, MO). All chemicals were either electrophoresis or analytical grade, and no further purification was performed. The water used in all reactions and solutions was deionized (18.2 MΩ) with a Milli-Q water purification system (Millipore, Worcester, MA).
Toxicity. Acrylamide monomer is a neurotoxin and should be handled with care, avoiding skin contact.
RESULTS AND DISCUSSION
For capillary electrophoretic separation of DNA sequencing products, the fragments are introduced into the columns by means of electrokinetic injection. As noted previously, the presence of small ionic species can significantly decrease the amount of DNA introduced into the capillary columns.18-20 Template DNA present in the sequencing reactions can be deleterious to the separation of the fragments using a cross-linked gel15 or replaceable polymer solution (see following paper28).
The goal of this work was to develop a purification scheme that was simple, cost-effective, and easily automatable for a multiple capillary array instrument. The method had to be compatible with a variety of sequencing chemistries and DNA templates and to provide a constant sample composition for rugged DNA sequencing analysis by CE. Therefore, a cleanup protocol was developed for complete template DNA removal from the sequencing samples and for reproducible decrease of the chloride concentration below 10 µM. For purposes of clarity, the procedure for desalting of the sequencing samples will be discussed first, followed by the template removal protocol.
Primer Injection Studies. Initially, the influence of salt (chloride concentration) and solvent on the amount of DNA injected, using a solution of JOE-labeled primer as a single-fragment standard, was evaluated. The primer, dissolved at a concentration of 1 × 10⁻⁹ M in different solvents, and with various chloride concentrations, was electrokinetically injected for 10 s at a constant electric field of 200 V/cm into a capillary column filled with a polymer solution (2% w/w LPA, 50 mM TRIS/TAPS, 2 mM EDTA, and 7 M urea). After electrophoresis at 200 V/cm, the peak area of the primer was determined for each experimental condition, to assess the influence of solvent and salt concentration of sample solutions on the amount of DNA sequencing fragments injected. Table 1 presents the results of this study. It can be seen that a significantly greater amount of the primer was electrokinetically
Table 1. Effect of the Solvent and Salt Concentration on the Electrokinetic Injection of a 1 × 10⁻⁹ M Solution of JOE-Labeled Primer(a)

solvent                                  salt type and concn              normalized peak area for JOE-labeled primer
water                                    none                             100
formamide                                none(b)                          7
formamide (heated for 2 min at 95 °C)    products of formamide decomp     3
water                                    MgCl2 1 µM                       97
water                                    MgCl2 100 µM                     35
water                                    MgCl2 10 mM                      0
water                                    EDTA 5 µM                        100
water                                    EDTA 50 µM                       57

(a) See text for experimental conditions. (b) Formamide was deionized.
injected from deionized water than from formamide. Indeed, upon heating formamide to 95 °C, a common step in denaturation, a further decrease in injected amount of primer was observed. There may be two main reasons for these results: first, DNA could be more effectively ionized in pure water than in formamide since the solvation of the counterion may be greater for water.35 Second, the appearance of ionic species in the latter solvent may occur as a result of formamide decomposition into formic acid and ammonia upon heating.36 Traditionally, as noted above, the loading buffer for DNA sequencing samples in slab gel electrophoresis and CE is a mixture of formamide-0.5 M EDTA (49:1).18,21 In the case of slab gel electrophoresis, formamide, which is more dense than the running buffer, helps to maintain the sample at the bottom of each well. Because there is no density requirement for such a solvent in CE, formamide does not offer this advantage in this mode. As can be seen from Table 1, DNA samples should preferably be dissolved in water for loading the largest possible amount of DNA by electrokinetic injection.
The clear influence of chloride concentration in the sample solution on the amount of DNA injected is also seen in Table 1. At 2 µM chloride (1 µM MgCl2), the level of JOE-labeled primer was comparable to that in deionized water. Upon an increase in the chloride concentration in the sample to 200 µM, the fluorescence signal of the primer decreased by 60%, and at 20 mM chloride, no signal was observed. A 40% decrease in the peak area was also found when the concentration of EDTA was increased from 5 to 50 µM. The results in Table 1 demonstrate the requirement of desalting the sequencing sample to a chloride concentration in the low-micromolar range.
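The arithmetic linking the Table 1 entries to the chloride levels and percentage decreases quoted above can be sketched as follows. The only chemistry assumed is that each MgCl2 contributes two chloride ions; the function and variable names are illustrative. Note that the text quotes rounded decreases (60% and 40%), whereas the tabulated areas give 65% and 43%.

```python
# Water entries of Table 1: (salt, concentration in µM, normalized peak area).
TABLE_1_WATER = [
    ("none", 0.0, 100),
    ("MgCl2", 1.0, 97),
    ("MgCl2", 100.0, 35),
    ("MgCl2", 10_000.0, 0),
    ("EDTA", 5.0, 100),
    ("EDTA", 50.0, 57),
]


def chloride_um(mgcl2_um):
    """Chloride concentration (µM) delivered by a given MgCl2 concentration."""
    return 2.0 * mgcl2_um


def percent_decrease(area, reference=100.0):
    """Loss of normalized peak area relative to a reference, in percent."""
    return 100.0 * (reference - area) / reference


if __name__ == "__main__":
    for salt, conc, area in TABLE_1_WATER:
        if salt == "MgCl2":
            print(f"{chloride_um(conc):>8.0f} µM chloride: "
                  f"signal down {percent_decrease(area):.0f}%")
        elif salt == "EDTA":
            print(f"{conc:>8.0f} µM EDTA    : "
                  f"signal down {percent_decrease(area):.0f}%")
```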
Importantly, low ionic strength aqueous solutions provide sufficient denaturing ability to minimize rehybridization of single-stranded DNA;37 therefore, heating formamide to 95 °C was found unnecessary for denaturation of the sequencing reaction products for CE analysis. Determination of Residual Chloride Concentration in Sequencing Samples Using CE-Indirect UV. For quantitation (35) Leving, L.; Gordon, J. A.; Jencks, W. P. Biochemistry 1963, 2, 168-175. (36) Carey, F. A.; Sundberg, R. J. Advanced Organic Chemistry; Plenum Press: New York, 1993; pp 473-475. (37) Doty, P.; Boedtker, H.; Fresco, J. R.; Haselkorn, R.; Litt, M. Proc. Natl. Acad. Sci. U.S.A. 1959, 45, 482-487.
of the residual amount of chloride present in the sequencing sample solutions, we selected CE-indirect UV analysis because the method is fast, requires only a small amount of sample, and has detection limits in the low-micromolar range.38 Chromate buffer was used as background electrolyte to detect chloride.34 Figure 1 shows a typical electropherogram for the separation of anions from a cycle sequencing sample, with the chloride peak being observed at 3 min. Calibration of the CE-indirect UV method was linear in the range from 1 to 500 µM chloride (r² = 0.9996), with an RSD of 4% and a detection limit (S/N = 3) of 0.2 µM. Using co-injection of standards, it was also determined that the tailing peak with a retention time of ∼6 min corresponded to the nonresolved di- and deoxynucleotides. Although the concentration of these ionic species was not quantified by this method, a relative indication of the presence of these anions could nevertheless be obtained from the electropherogram.
Two methods for desalting of DNA sequencing fragments were examined: gel filtration in a spin column format and ethanol precipitation. For gel filtration, Centri-Sep and Sephacryl SH-200 (Pharmacia, Piscataway, NJ) spin columns were evaluated. Using the CE-indirect UV method, it was found that the average chloride concentration of dye-labeled primer cycle sequencing samples (n = 5), purified with either spin column, was reduced from an original 100 mM to ∼250 µM. However, only a small portion of DNA could be injected at this chloride level in the sample solution (Table 1). On the other hand, when ethanol precipitation was used, the chloride concentration in five sequencing samples was found to be reduced on average to 100 µM. However, the amount of chloride in the samples was quite variable, ranging from 20 to 230 µM. This cleanup procedure is also difficult to automate and could lead to significant losses of DNA,2,22,23 thus seriously affecting the ruggedness of DNA sequencing analysis.
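The linear calibration described above for the CE-indirect UV method can be illustrated with an ordinary least-squares fit. This is a generic sketch, not the authors' data treatment: the peak areas below are hypothetical placeholders (the paper reports only the range, r², RSD, and detection limit), and the function names are illustrative.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept; also returns r^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def concentration_um(peak_area, slope, intercept):
    """Invert the calibration line to estimate chloride concentration (µM)."""
    return (peak_area - intercept) / slope


if __name__ == "__main__":
    # Standards spanning the calibrated range quoted in the text (1-500 µM).
    standards_um = [1, 10, 50, 100, 250, 500]
    # HYPOTHETICAL peak areas (arbitrary units), for illustration only.
    hypothetical_areas = [3.1, 21.0, 100.8, 202.5, 499.0, 1001.2]
    slope, intercept, r2 = fit_line(standards_um, hypothetical_areas)
    print(f"slope={slope:.4f}, intercept={intercept:.4f}, r2={r2:.5f}")
    print(f"unknown with area 50.0: "
          f"{concentration_um(50.0, slope, intercept):.1f} µM")
```

An unknown whose peak area falls outside the calibrated range should be diluted or re-measured rather than extrapolated.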
Returning to the gel filtration method for desalting, a second Centri-Sep column was employed in an attempt to decrease the concentration of chloride below 250 µM, but no significant reduction in chloride concentration was obtained. In a blank run, 50 µL (the approximate volume of the cycle sequencing samples) of deionized water was spun through the column instead of the DNA sample. Surprisingly, it was found that the filtrate had a concentration of 230 µM chloride, similar to the value obtained after the desalting of the sequencing samples. After consecutive washes of the Centri-Sep column with up to five column volumes of water, the chloride concentration in the filtrate decreased to ∼2 µM. These blank experiments demonstrated that ionic impurities in the resin beads significantly contributed to the ionic strength of the final purified samples. It is interesting to note that it has also been reported by others that some spin columns for PCR product purification can add salt to the samples.25 On the basis of these results, each Centri-Sep column was washed extensively with deionized water prior to desalting the dye-primer cycle sequencing sample, and the final chloride concentration in the purified samples was determined to be between 25 and 50 µM. This result showed that the capacity or resolution of a single Centri-sep column was not sufficient for thorough desalting. Indeed, when two prewashed columns per sample were used sequentially one after another, the residual (38) Foret, F.; Fanali, S.; Ossicini, L. J. Chromatogr. 1989, 470, 299-308.
Figure 1. CZE separation of anions present in DNA sequencing samples. Electrophoretic conditions: capillary column DB-1, 100-µm i.d., 365-µm o.d. (J&W), capillary effective length 20 cm, total length 35 cm. Constant electric field 200 V/cm (25 µA), background electrolyte 20 mM TRIS-5 mM acetic acid-5 mM potassium chromate. The sample was introduced into the column by hydrodynamic injection for 40 s at a height of 10 cm. The polarity of the detector was inverted to observe positive peaks.
chloride concentration in the samples was reduced another 10-fold. More than 50 cycle sequencing samples were examined using this double desalting approach, and the final chloride concentration for all these samples was found to be between 2 and 10 µM. We considered this level of salt content as the practical lower limit that may be reached in routine operation using prewashed gel filtration columns. Should larger capacity gel filtration spin columns become available, desalting could be done in a single step. With this effective desalting of the sequencing samples, we were able to use only one-fifth (10 µL) of the total sample and still achieve a high fluorescence signal for the sequencing fragments. As indicated in Table 1 and discussed in the accompanying paper,28 the injected amount of sequencing fragments at these levels of chloride content should be close to that from pure water. Finally, it should be noted that after effective desalting, no peak for the nucleotides was observed on the CE-indirect UV electropherogram. As a result, in the sequencing reaction conducted with dye-labeled dideoxy terminators, the sequence was readable from the third base after the primer (data not shown). Therefore, the gel filtration step also reduced di- and deoxynucleotides to undetectable levels.
Template DNA Removal. After the sequencing samples were desalted to a residual chloride concentration of ∼5 µM, it was observed that the resolution of the sequencing fragments was significantly decreased when the sample was injected for longer than 3 s at 200 V/cm (data not shown). This decrease, more pronounced for fragments longer than 300 bases, probably was caused by declining current during the electrophoretic run.
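The chloride levels reported for the successive desalting variants can be compared with a small fold-reduction calculation. The sketch below simply restates the concentrations given in the text as (low, high) ranges in µM; the dictionary keys and function name are illustrative assumptions.

```python
# Chloride levels quoted in the text, as (low, high) ranges in µM.
CHLORIDE_UM = {
    "untreated reaction": (100_000.0, 100_000.0),  # ~100 mM
    "one unwashed column": (250.0, 250.0),         # ~250 µM
    "one prewashed column": (25.0, 50.0),
    "two prewashed columns": (2.0, 10.0),
}


def fold_reduction(before, after):
    """Smallest and largest fold reduction between two (low, high) ranges."""
    return before[0] / after[1], before[1] / after[0]


if __name__ == "__main__":
    start = CHLORIDE_UM["untreated reaction"]
    for name, level in CHLORIDE_UM.items():
        if name == "untreated reaction":
            continue
        low, high = fold_reduction(start, level)
        print(f"{name}: {low:,.0f}- to {high:,.0f}-fold below the raw reaction")
    low, high = fold_reduction(CHLORIDE_UM["one prewashed column"],
                               CHLORIDE_UM["two prewashed columns"])
    print(f"second prewashed column adds {low:.1f}- to {high:.0f}-fold")
```

The range computed for the second prewashed column (2.5- to 25-fold) brackets the roughly 10-fold additional reduction stated in the text.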
Such phenomena have previously been reported for CE analysis using cross-linked gels, and it was suggested that the presence of template DNA in the sample was responsible for this behavior.15 In addition, template DNA may adversely affect the stability of capillary gels and the separation of the sequencing fragments by CE.3,18 However, as noted earlier, despite these results, template
DNA removal has not universally been included in the sample purification schemes for CE analysis. When the concentration of small ions is in the low-micromolar range, electrokinetic injection would preferentially load DNA sequencing fragments and template DNA, assuming that all DNA molecules have similar free mobility in solution. Therefore, relatively more DNA template molecules could be introduced into the capillary column from such sequencing samples compared to those with high chloride content. This increased amount of injected template can be deleterious to the separation performance of sequencing fragments and to the ruggedness of DNA sequencing by CE.28 Since low salt concentration is important for optimum amounts of DNA injected, it can be concluded that the template should be removed from the DNA sequencing samples. In this work, ultrafiltration was evaluated as a means to remove template DNA, because this methodology does not require chemical modifications of the DNA, and it should thus be generally applicable to different sequencing chemistries. The use of ultrafiltration membranes to remove template DNA from the reaction mix is reasonable due to the difference in molecular size between the sequencing fragments (1 kb and shorter) and template DNA (>∼5.0 kb in the case of M13mp18, which is also circular).
(1) Ultrafiltration Membrane Molecular Weight Cutoff and Chemical Composition. In this work, membranes with three different pore sizes were selected: 0.01 (100K), 0.03 (300K), and 0.1 µm, to test their ability to retain template DNA (M13mp18 and pGEM3Zf(+)) from sequencing samples. In addition, a variety of commercially available membranes made from hydrophilic materials such as cellulose, cellulose acetate, poly(vinyl difluoride), poly(ether sulfone), and polysulfone were examined.
In a preliminary study, mixtures of either M13mp18 or pGEM3Zf(+) and denatured PCR products in deionized water were passed through a variety of membranes to determine their ability to retain the template while passing the sequencing reaction products. The effectiveness of template removal was assessed by agarose slab gel analysis. A positive control was used to determine the removal of the template, and the intensity of the fluorescent dye, ethidium bromide, was selected as an indication of the recovery of the denatured PCR products. The membranes with a 0.1-µm pore size were found to be ineffective in retaining either template. In the case of the 0.03-µm pore size membranes (300K MWCO), pGEM3Zf(+) was able to pass through the membranes, while M13mp18 was partially retained. However, with the 0.01-µm (100K MWCO) membranes, both templates were effectively trapped. With respect to membrane material, the highest recovery of the denatured PCR products was found with the poly(ether sulfone) membranes (data not shown), and membranes of such material with MWCO of 100K were chosen for further model experiments. Four-color sequencing reactions were prepared with labeled primers on a ssM13mp18 template using AmpliTaq-FS DNA polymerase. The template was removed from three out of the four samples with the 100K membranes. After the final desalting (chloride concentration ∼5 µM), one-fifth of each sample was used for the injection into the capillary column. The intensity of the fluorescence signal of several DNA fragments was employed as a qualitative measure of the recovery of the DNA sequencing
reaction products after template removal. However, for all three samples injected, only a small peak corresponding to the primer was observed (see Figure 2A). The fourth reaction, only desalted with no template removal, was used as a positive control for the ultrafiltration process. The control sample showed a high fluorescence signal for the sequencing fragments, which suggested that the enzymatic reaction was successful. It was then concluded that the sequencing fragments from all three samples were adsorbed to the poly(ether sulfone) membrane. To prevent nonspecific adsorption, several different types of agents have been tested.21 It was decided to test LPA, since this polymer was used as the matrix for separation of the sequencing fragments. The recovery of the fragments after removing the template with the 100K poly(ether sulfone) membrane pretreated with 500 µL of 0.01% (w/w) LPA (700 000-1 000 000) is qualitatively shown in Figure 2B. A comparison of panels A and B of Figure 2 indicates that the LPA solution can effectively deactivate the ultrafiltration membrane. Adsorption of the polymer at the pore entrance of the membranes could alter the MWCO and thus reduce the passage of smaller molecules through the membrane.39 Indeed, Figure 2C shows that decreasing the concentration of LPA for the pretreatment to 0.005% (w/w) increased the recovery of the sequencing fragments. On the other hand, at LPA concentrations below 0.005% (w/w), the recovery of the sequencing fragments was again reduced (Figure 2D). These low-concentration LPA solutions likely left nonspecific binding sites exposed, which adsorbed the sequencing fragments and thus reduced the fluorescence signal. The deactivated 100K MWCO poly(ether sulfone) membranes were very effective in removal of both pGEM3Zf(+) and M13mp18 templates. However, due to the cutoff of the 100K membranes, a decrease in peak height with fragment length (above 600 bases) was seen in the electrophoretic run.
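As a rough aid for comparing the pretreatment conditions of Figure 2, the sketch below estimates the mass of LPA brought into contact with the membrane at each concentration. It assumes a solution density of about 1 g/mL (reasonable for a very dilute aqueous polymer solution, but not stated in the text) and that the same 500 µL volume was applied at every concentration, which the text states only for the 0.01% (w/w) case. The function name is illustrative.

```python
def lpa_mass_ug(volume_ul: float, conc_pct_w_w: float,
                density_g_per_ml: float = 1.0) -> float:
    """Approximate polymer mass (in µg) contained in a dilute w/w solution.

    density_g_per_ml is an assumption (about 1 g/mL for dilute aqueous LPA).
    """
    # 1 µL of a 1 g/mL solution weighs 1 mg.
    solution_mass_mg = volume_ul * density_g_per_ml
    # Convert percent (w/w) to a mass fraction, then mg to µg.
    return solution_mass_mg * (conc_pct_w_w / 100.0) * 1000.0


# Pretreatment concentrations from panels B-D of Figure 2;
# the 500 µL volume is assumed to apply to all three.
for panel, pct in (("B", 0.01), ("C", 0.005), ("D", 0.002)):
    print(f"Figure 2{panel}: {pct}% (w/w) -> ~{lpa_mass_ug(500, pct):.0f} µg LPA")
```

On these assumptions the three pretreatments correspond to about 50, 25, and 10 µg of polymer, which gives a sense of how little LPA separates effective deactivation from incomplete coverage.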
The 300K MWCO membranes were a better alternative to solve this problem in the case of M13mp18 or larger templates. As mentioned earlier, the 300K MWCO poly(ether sulfone) membranes without polymer pretreatment were not capable of fully retaining M13mp18 DNA. However, after the pretreatment of these membranes with the LPA solution (0.005% w/w, 700 000-1 000 000), M13mp18 was reproducibly removed from the sequencing samples, as determined by agarose slab gel electrophoresis (data not shown). The recovery of the longer fragments (as measured for fragments 600 and 842 bases in length) was ∼25% higher than that obtained with the 100K MWCO (data not shown). For smaller templates such as pGEM3Zf(+), the 100K MWCO membrane can successfully be used, despite the lower recovery of the longer fragments. Nevertheless, most templates are larger and would be successfully removed by the pretreated 300K MWCO membranes. For example, templates typically used in sequencing have an insert of genomic DNA a few kilobase pairs long and are thus close to or even exceed M13mp18 DNA in size. Indeed, we used these 300K MWCO membranes for template removal from the sequencing samples made on ds pBR327 (3273 bp) with a Q9 mouse DNA insert (the total size of this template was higher than 18 kbp).
(39) McGregor, W. C. Membrane Separations in Biotechnology; Marcel Dekker: New York, 1986; pp 1-34.
Analytical Chemistry, Vol. 70, No. 8, April 15, 1998
Figure 2. Electropherograms of dye-labeled primer cycle sequencing samples prepared with AmpliTaq-FS and passed through poly(ether sulfone) ultrafiltration membranes (100K MWCO) pretreated with LPA (700 000-1 000 000) solutions: (A) no pretreatment; (B) 0.01% (w/w); (C) 0.005% (w/w); (D) 0.002% (w/w). The fluorescence signal corresponds to the cytidine-terminated sequence of M13mp18. Electrophoretic conditions: capillary length 15 cm, total length 25 cm, 75-µm-i.d., 365-µm-o.d. coated capillary (poly(vinyl alcohol)); polymer concentration 2% (w/w) LPA (MW> 5 500 000), running buffer 50 mM TRIS-50 mM TAPS-2 mM EDTA. The samples were injected at a constant electric field of 100 V/cm for 20 s and electrophoresed at 200 V/cm (8.0 µA) and at room temperature.
Assuming that a concentration of 0.005% (w/w) LPA (700 000-1 000 000) is optimal for the membrane pretreatment, the effect of the sample collection spin force on the recovery of the sequencing fragments was next evaluated. It was determined that centrifugation at a speed of 7000 rpm in a microcentrifuge with a fixed angle rotor (radius 7.4 cm) resulted in the highest fluorescence signal of the sequencing reaction products. This signal was ∼30% and ∼40% lower after spinning the DNA samples at 5000 and 3000 rpm, respectively. It should be mentioned that other workers have used untreated ultrafiltration membranes to remove template DNA (M13mp18) from sequencing reactions for analysis by matrix-assisted laser desorption/ionization (MALDI) mass spectrometry.40 However, they were interested in the recovery of short fragments (below 100 bases long), and therefore, deactivation of the membrane was not required.
(40) Mouradian, S.; Rank, D. R.; Smith, L. M. Rapid Commun. Mass Spectrom. 1996, 10, 1475-1478.
(2) Effect of Surfactants on the Recovery of Sequencing Fragments. The above cleanup method was optimized and validated using DNA sequencing fragments obtained by performing cycle sequencing reactions with the dye-primer cycle sequencing kit with AmpliTaq-FS (Perkin-Elmer/Applied Biosystems) and either M13mp18 or pGEM3Zf(+) as a template. Subsequently, the purification protocol was applied to remove M13mp18 and pGEM3Zf(+) from sequencing samples prepared with the Thermosequenase cycle sequencing kit and ET primers (both from Amersham Life Science). At first, no fluorescence signal for the sequencing fragments was obtained, similar to that in Figure 2A. The electropherogram of a purified dye-labeled
Figure 3. Electropherograms of dye-labeled primer cycle sequencing samples passed through poly(ether sulfone) ultrafiltration membranes (300K MWCO) deactivated with 0.005% (w/w) LPA (700 000-1 000 000): (A) sample prepared with AmpliTaq-FS cycle sequencing (ABI); (B) sample prepared with AmpliTaq-FS cycle sequencing and dissolved in 0.5% (w/w) Nonidet P-40; (C) sample prepared with AmpliTaq-FS cycle sequencing and dissolved in 0.5% (w/w) Tween-20. The fluorescence signal corresponds to the C-terminated sequence of M13mp18. Electrophoretic conditions as in Figure 2.
primer cycle sequencing sample prepared with AmpliTaq-FS used as a positive control was, however, quite strong (see Figure 3A). These surprising results suggested that the different chemical composition of the Thermosequenase kit relative to the AmpliTaq-FS kit might be affecting the ultrafiltration process during template removal. A comparison of these two commercial kits showed that the Amersham kit contained detergents, such as Nonidet P-40 and Tween-20, both at concentrations of ∼0.5% (w/w), whereas such surfactants were not present in the ABI kit.41 To assess the effect of the presence of Nonidet P-40 and Tween-20 in the reaction mixture on the membrane performance, these surfactants were added separately to dye-primer cycle sequencing samples prepared with AmpliTaq-FS, using the M13mp18 template and the 300K MWCO pretreated membranes. The detergents were added to obtain final concentrations ranging from
0.02% to 0.5% (w/w) in the sample, values approximately equal to their concentrations in the Thermosequenase kits. The presence of Nonidet P-40 in the AmpliTaq-FS sequencing sample at a concentration of 0.02% (w/w) decreased the peak area of the sequencing fragments by ∼40%, and at a concentration of 0.5% (w/w) of Nonidet P-40, the peak areas of the sequencing fragments decreased by ∼90% compared to that obtained when this surfactant is absent (Figure 3B). These results suggested that Nonidet P-40 affected the free transport of the sequencing fragments through the deactivated 300K poly(ether sulfone) membrane. This surfactant may be predominantly in the form of aggregates since the detergent concentration in the sequencing mixture was ∼6 times higher than the critical micelle concentration (cmc).42 These aggregates could behave as globular proteins and possibly clog the membrane pores, thus affecting the flow of
(41) Perkin-Elmer Corp./Applied Biosystems, Protocol P/N 402113, 1997.
(42) Shimizu, K.; Iwatsuru, M. Chem. Pharm. Bull. 1990, 38, 1353-1358.
DNA through the membrane.43 Additionally, hydrophobic interactions between the ssDNA and the aggregates could modify the molecular structure of the ssDNA molecules, affecting their transport during the ultrafiltration process.44 If aggregates of Nonidet P-40 were indeed the cause of the lower recovery of the sequencing fragments, the most practical way to increase recovery would be to establish experimental conditions under which micelle formation is avoided. We then reduced the concentration of Nonidet P-40 below the cmc by increasing the volume of the 30% (w/w) 1-propanol solution from 120 µL to the maximum volume of the Nanosep microconcentrator (500 µL). We found that with this dilution procedure for template removal, the recovery of the DNA sequencing fragments from the samples prepared with the Thermosequenase kit was as good as that obtained with the AmpliTaq-FS cycle sequencing samples (data not shown), and thus the problem was overcome. Dilution to 500 µL would significantly lower the concentration of surfactant, and the larger amount of 1-propanol would simultaneously increase the cmc.45 Interestingly, the presence of Tween-20 did not appear to have a deleterious effect on the ultrafiltration of the sequencing fragments using the 300K membranes. At a surfactant concentration of 0.5% (w/w) in the reaction mixture, the fluorescence signal of the DNA sequencing reaction products decreased by only ∼10% relative to that obtained without Tween-20 (Figure 3C). The influence of Tween-20 on the 100K MWCO membranes was not determined. In the manual cleanup procedure, drying the sequencing sample after template removal and subsequent desalting with prewashed gel filtration columns require ∼1 h in total. Automation of these steps in a microtiter plate format using fast evaporation and vacuum-driven gel filtration can reduce this time.
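Because rpm values transfer between centrifuges only when the rotor radius is the same, the collection speeds quoted earlier for the pretreated membranes (3000, 5000, and 7000 rpm at a 7.4 cm radius) are easier to reproduce or automate when expressed as relative centrifugal force. The sketch below applies the standard relation RCF = ω²r/g. It assumes that the stated 7.4 cm is the effective radius at the membrane, which the text does not specify, and the function name is illustrative.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2, conventional standard value


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (in multiples of g) for a rotor speed and radius."""
    omega = 2.0 * math.pi * rpm / 60.0      # angular velocity in rad/s
    radius_m = radius_cm / 100.0            # cm -> m
    return omega ** 2 * radius_m / STANDARD_GRAVITY


# Spin speeds compared in the text; the radius is taken as the effective radius (assumption).
for rpm in (3000, 5000, 7000):
    print(f"{rpm} rpm at 7.4 cm -> ~{rcf_from_rpm(rpm, 7.4):.0f} x g")
```

Under these assumptions, the 7000 rpm setting that gave the highest signal corresponds to roughly 4000 × g.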
Studies currently in progress suggest that poly(ether sulfone) membranes with 30K or less MWCO may provide comparable results for desalting of single-stranded DNA sequencing reaction products (i.e., a salt concentration of 10 µM or less), without requiring the intermediate drying step. It is interesting to note that ultrafiltration has been used previously for desalting of double-stranded PCR products for CE analysis20,25 and short DNA sequencing fragments analyzed by MALDI.40
DNA Sequencing with Long Reads. The influence of the new cleanup procedure on the separation of DNA sequencing fragments using replaceable LPA solutions was next analyzed. Figure 4 presents an electrophoretic separation of purified dye-labeled cycle sequencing reaction products prepared on ssM13mp18 using AmpliTaq-FS. The electropherogram shows a sequence for fragments from 16 bases after the primer to ∼1100 bases. The sequence was read by previously described base-calling software.1 As indicated in the figure, eight errors occurred in the first 30 bases called; no other errors were found until base 765. It is important to note that two undercalls in this region (bases 95 and 311) resulted from base-caller limitations. Both peaks corresponding to these bases are clearly seen in the appropriate positions of the electropherogram and were identified without prior knowledge of the sequence. (See also the legend to Figure 4.) The base calling yielded a read length of 800 bases with 99% accuracy in ∼75 min and 900 bases with 97.5% accuracy in ∼80 min. As discussed in the following paper,28 the CE run utilized injection at a low field, i.e., 25 V/cm. Similar sequence read lengths are routinely obtained with samples cleaned up with the new purification protocol.
(43) Amicon Catalog 1997, 11-12.
(44) Ramsey, R. S.; Kerchner, G. A.; Cadet, J. J. High Resolut. Chromatogr. 1994, 17, 4-8.
(45) Weinberger, R. Practical Capillary Electrophoresis; Academic Press: New York, 1993; pp 147-187.
Figures of Merit for the Purification Protocol. A series of five dye-labeled primer cycle sequencing samples (single dye, C-terminated) with M13mp18 as template were prepared, purified, and analyzed by a single researcher within the same day to assess the repeatability of the cleanup protocol. The variability of peak area and separation efficiency of fragments 289, 555, and 842 bases long was determined, and both factors were found to be within 5% RSD for all three fragments, using the five samples. Therefore, the cleanup protocol was shown to provide consistent results on a sample-to-sample basis. Next, several workers prepared, purified, and analyzed various (n = 15) dye-labeled cycle sequencing reactions over a two-week period. The peak area and separation efficiency for the fragments 289, 555, and 842 bases long were again used to characterize this figure of merit. For simplicity, Table 2 presents the statistical evaluation of the sequencing data for the fragment 555 bases in length. The RSD values were found to be 10% and 7% for peak area and separation efficiency, respectively. Similar results were found for fragments 289 and 842 bases long. As further shown in Table 2, the average base-calling accuracy obtained for the first 800 bases was 99% with 0.3% RSD, and the average accuracy for the first 900 bases of the sequence was 97.5% with 0.5% RSD.
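The figures of merit above are relative standard deviations and base-calling accuracies. A minimal sketch of both calculations follows. The peak-area values are made-up placeholders rather than data from Table 2, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def percent_rsd(values):
    """Relative standard deviation in percent, using the sample (n - 1) estimator."""
    return 100.0 * stdev(values) / mean(values)


def base_calling_accuracy(bases_called, errors):
    """Percentage of correctly called bases over a given read length."""
    return 100.0 * (1.0 - errors / bases_called)


# Hypothetical peak areas (arbitrary units) for one fragment across five samples.
hypothetical_peak_areas = [102.0, 95.0, 110.0, 98.0, 105.0]
print(f"peak-area RSD: {percent_rsd(hypothetical_peak_areas):.1f}%")

# Eight miscalls in an 800-base read correspond to 99% accuracy.
print(f"accuracy: {base_calling_accuracy(800, 8):.1f}%")
```

For scale, the reported 99% accuracy over 800 bases is equivalent to eight miscalled bases in the read.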
We next compared the quality and variability of the data obtained from DNA sequencing samples cleaned by the new purification protocol vs samples purified with established ethanol precipitation protocols using 20 dye-primer cycle sequencing samples with M13mp18 as the template. Five of these samples were purified with the new cleanup protocol, and the other 15 samples were desalted by ethanol precipitation. Five samples out of the latter 15 were dried under vacuum and reconstituted in a mixture of formamide-0.5 M EDTA (traditional ethanol precipitation). Another five ethanol-precipitated samples were dried and dissolved in deionized water, and the final five samples were resuspended in the template suppression reagent, as suggested by the manufacturer.41 The peak area and separation efficiency of the fragments 289, 555, and 842 bases long were measured after the analysis of each sample. Table 3 shows that the peak area for samples cleaned by the traditional ethanol precipitation procedure was ∼50-fold lower than that obtained from samples purified with the new protocol. This result was probably due to the fact that (a) the salt concentration in the sample solution after ethanol precipitation could be up to 230 µM (cf.