Anal. Chem. 1997, 69, 3915-3920
Single-Molecule Detection of Specific Nucleic Acid Sequences in Unamplified Genomic DNA
Alonso Castro*
Los Alamos National Laboratory, MS-D454, Los Alamos, New Mexico 87545
John G. K. Williams
Pioneer Hi-Bred International, Johnston, Iowa 50131
A new technique is described for the rapid detection of specific nucleic acid sequences in unamplified DNA samples. The method consists of using two nucleic acid probes complementary to different sites on a target DNA sequence. The two probes are each labeled with different fluorescent dyes. When mixed with a sample containing the target DNA, the two probes hybridize to their respective binding sites on the same target DNA molecule. The sample is then analyzed by a laser-based ultrasensitive fluorescence system capable of detecting single fluorescent molecules at two different wavelength channels simultaneously. Since the probes are bound to the same target DNA molecule, their signals appear simultaneously. Thus, coincident detection of both dyes provides the necessary specificity to detect an unamplified, single-copy target DNA molecule in a homogeneous assay. If the target is not present, only uncorrelated events originating from free probes will be observed at either channel. Phage λ DNA in a background of salmon genomic DNA was detected as a two-dye coincident signal at a relative concentration of one λ molecule per salmon genome. In a control sample, cleavage of the λ DNA between the two probe binding sites eliminated the coincident signals. In a second experiment, a single-copy transgene was detected in maize. Detection parameters and possible future applications to genetic analysis are discussed. Current methods for DNA detection usually require enzymatic amplification of the target sequence prior to analysis. The polymerase chain reaction (PCR), for example, selectively increases the concentration of the target sequence relative to unrelated sequences, thus enhancing both the specificity and sensitivity of the assay. 
Amplification methods, however, may introduce ambiguities resulting from contamination, from variability in amplification efficiency, and from other mechanisms not fully understood (1-6). Another common method for identification of specific DNA sequences is the Southern blot. In this procedure, the DNA sample is cleaved with a restriction enzyme, size-separated by gel electrophoresis, and transferred from the gel to a nitrocellulose filter. Detection is then accomplished by adding a hybridization probe. Despite its popularity, Southern blotting suffers from some limitations, mainly because it involves a series of manually intensive procedures that cannot be run unattended and cannot be readily automated; casting gels, applying samples, and running and subsequently staining the gels are time-consuming tasks susceptible to poor quantitative accuracy and poor reproducibility. In most cases, in order to improve sensitivity, a radioisotope needs to be incorporated into the probe, which raises safety and environmental concerns. There is, therefore, a need for an economical, automatable method for the analysis of small amounts of unamplified DNA in the high-throughput analytical laboratory.
Recent developments in laser-based detection of fluorescent molecules have made possible the implementation of very sensitive techniques for biochemical analysis. Single DNA molecules labeled with multiple fluorescent dyes have been observed by imaging microscopy (7, 8), as well as sized (9, 10) and electrophoretically separated in capillaries (11, 12). Detection of specific DNA sequences, however, requires the use of sequence-specific probes tagged with a fluorophore. Drossman (13), for example, has used multibranched dendrimer probes containing several hundred fluorophores each to detect DNA targets with high sensitivity in pure samples. Using fluorescence correlation spectroscopy, Rigler (14, 15) has observed single-molecule hybridization by measuring the change in diffusion coefficient of a tagged probe when bound to a larger target. All previous work on the detection of specific DNA sequences has utilized pure samples consisting of only the target sequence.
(1) Bej, A. K.; Mahbubani, M. H.; Atlas, R. M. Crit. Rev. Biochem. Biophys. 1991, 26, 301-334.
(2) Peccoud, J.; Jacob, C. Biophys. J. 1996, 71, 101-108.
(3) Reiss, J.; Krawczak, M.; Schloesser, M.; Wagner, M.; Cooper, D. Nucleic Acids Res. 1990, 18, 973-978.
(4) Schmidt, T.; Hummel, S.; Herrmann, B. Naturwissenschaften 1995, 82, 423-431.
(5) Taranger, J.; Trollfors, B.; Lind, L.; Zackrisson, G.; Belingholmquist, K. Pediatr. Infect. Dis. J. 1994, 13, 936-937.
(6) Wilke, W. W.; Sutton, L. D.; Jones, R. N. Clin. Chem. 1995, 41, 622-623.
S0003-2700(97)00389-2 CCC: $14.00 © 1997 American Chemical Society
However, the direct detection of a specific sequence in a background of unrelated sequences has not yet been demonstrated. A high level of specificity is necessary in order to apply single-molecule detection methods to the genetic analysis of unamplified genomic DNA samples.
Here, we describe a new method for the rapid, direct detection of specific nucleic acid sequences in biological samples. The detection scheme involves the use of two nucleic acid probes complementary to the DNA target. Peptide nucleic acids (PNA) were used instead of DNA probes because PNAs exhibit stronger binding to DNA targets (16). The two PNA probes, each labeled with a different fluorescent dye, are hybridized to the sample. If a target molecule having both probe binding sites is present, then a complex should form between the target and the two probes. The sample is then analyzed by a laser-based ultrasensitive fluorescence system capable of detecting single fluorescent molecules at two different wavelengths simultaneously in a flow cell. Since the probes are bound to the same DNA target fragment, their signals will appear at the same time. Thus, the simultaneous detection of the two probes signifies the presence of a target molecule (Figure 1). When there is no target present, the probes will emit signals that are not coincident in time. Noncoincident signals will result from unhybridized probes, from targets hybridized to only one probe, or from nonspecifically bound probes.
(7) Perkins, T. T.; Quake, S. R.; Smith, D. E.; Chu, S. Science 1994, 264, 822-826.
(8) Larson, R. G.; Perkins, T. T.; Smith, D. E.; Chu, S. Phys. Rev. E 1997, 55, 1794-1797.
(9) Castro, A.; Fairfield, F. R.; Shera, E. B. Anal. Chem. 1993, 65, 849-852.
(10) Goodwin, P. M.; Johnson, M. E.; Martin, J. C.; Ambrose, W. P.; Marrone, B. L.; Jett, J. H.; Keller, R. A. Nucleic Acids Res. 1993, 21, 803-806.
(11) Castro, A.; Shera, E. B. Anal. Chem. 1995, 67, 3181-3186.
(12) Haab, B. B.; Mathies, R. A. Anal. Chem. 1995, 67, 3253-3260.
(13) Brau, D.; Miller, D.; Nilsen, T.; Drossman, H. Abstracts of Papers, 213th National Meeting of the American Chemical Society, American Chemical Society: Washington, DC, 1997; 29-ANYL (Drossman, H. personal communication).
(14) Eigen, M.; Rigler, R. Proc. Natl. Acad. Sci. U.S.A. 1994, 91, 5740-5747.
(15) Kinjo, M.; Rigler, R. Nucleic Acids Res. 1995, 23, 1795-1799.
(16) Orum, H.; Nielsen, P.; Jorgensen, M.; Larsson, C.; Stanley, C.; Koch, T. Biotechniques 1995, 19, 472-480.
(17) Hinegardner, R.; Rosen, E. Am. Nat. 1972, 106, 621-644.
Analytical Chemistry, Vol. 69, No. 19, October 1, 1997 3915
Table 1. Nucleotide Sequence of the PNA Probes (a)
probe    sequence
λ1       5′-TAT TTG ACG TGG TTT-Lys-R6G
λ2       5′-BTR-O-GCC-TCC-ACG-CAC-GTT
(a) A, C, G, and T are the four DNA bases, R6G is Rhodamine-6G, BTR is Bodipy-TR, Lys is lysine, and O is the linker arm H₂N-CH₂CH₂-O-CH₂CH₂-O-CH₂-COOH. Lysine and the linker arm each provide a primary amine for coupling to dye molecules functionalized with an NHS ester.
Figure 1. Schematic diagram of the two-probe coincidence DNA homogeneous assay.
EXPERIMENTAL SECTION
DNA. Salmon testes DNA having a haploid genome size of 3 × 10⁹ bp (17) was obtained from Sigma (St. Louis, MO). Phage λ DNA, 48 502 bp in length, was purchased from New England Biolabs (Beverly, MA). A sample of the λ DNA was cut with HaeIII under conditions specified by the enzyme vendor (Boehringer-Mannheim, Indianapolis, IN). Maize DNA was isolated from two maize plants: a Pioneer inbred PHI-P38, and the same inbred transformed with a Bacillus thuringiensis (BT) toxin gene. The BT gene (18) was present at one copy per haploid genome, as determined by Southern blot analysis (data not shown). DNA was prepared using a CTAB extraction procedure (19) from 1.6 g of lyophilized leaf tissue. The DNA (1 mg) was then treated with proteinase K (180 µg/mL) in 0.25% sodium dodecyl sulfate (SDS) at 50 °C for 20 h. The SDS was removed by dialysis against Tris-EDTA (TE) buffer (10 mM Tris-HCl/0.1 mM EDTA). The purified DNA was precipitated with ethanol and dissolved in TE buffer at a concentration of 5 mg/mL.
Probes. PNA probes were purchased from PerSeptive Biosystems, Inc. (Framingham, MA). Probes λ1 and λ2 (Table 1) hybridize to adjacent sites four nucleotides apart on the same strand of phage λ DNA. There is a HaeIII cleavage site between the two probe binding sites, at position 48 425 in the λ DNA sequence. Probes BT1-R6G and BT2-BTR (15-mers) hybridize at two positions 620 nucleotides apart in the sequence of the BT gene (18). The fluorescent dyes Rhodamine-6G (R6G) and Bodipy-TR (BTR) were purchased from Molecular Probes, Inc. (Eugene, OR) and coupled to the probes by the PNA manufacturer. These two dyes were chosen because they exhibit good absorption at the excitation wavelength and have well-separated emission profiles. Probes labeled with R6G were dissolved at a concentration of 50 µM in 10 mM Tris-Cl pH 7.5/0.1 mM NaEDTA. BTR probes were dissolved in the same buffer plus 7 M urea. Addition of urea facilitated dissolution of the BTR probes but was not needed for the more soluble R6G probes. Probe mixes of λ1 plus λ2, or BT1 plus BT2, were prepared by diluting 1 µL of each probe in 2.5 mL of water, so that the final concentration of each probe was 20 nM. The probe mix was vortexed and placed in a -20 °C freezer until use.
Hybridization. One microliter of λ DNA (1.7 µg/mL) was mixed with 10 µL of salmon DNA (10 mg/mL) and 89 µL of a 10 mM Tris-Cl pH 7.5/0.1 mM NaEDTA/7 M urea buffer in a 1.5-mL polypropylene microcentrifuge tube, and the resultant mixture was then incubated in a sand heating block for 4 min at 95 °C to denature the DNA sample.
The tube was removed from the heating block and cooled at room temperature for at least 5 min. A 0.5-µL sample of the λ1 + λ2 probe mix was added, and the sample was incubated in the dark at 20 °C for 16 h. Concentrations in the hybridization samples were 5 × 10⁻¹³ M salmon genome, 5 × 10⁻¹³ M λ DNA, 1 × 10⁻¹⁰ M each probe, and 6.2 M urea. Samples were stored at -20 °C and were diluted 100-fold into 5 mL of distilled water for single-molecule detection analysis. Maize DNA was hybridized to the BT probes under the same conditions, except the concentration of the maize DNA was 2.5 × 10⁻¹³ M.
(18) Armstrong, C. L.; Parker, G. B.; Pershing, J. C.; Brown, S. M.; Sanders, P. R.; Duncan, D. R.; Stone, T.; Dean, D. A.; Deboer, D. L.; Hart, A. R.; Howe, A. R.; Morrish, F. M.; Pajeau, M. E.; Petersen, W. L.; Reich, B. J.; Rodriguez, R.; Santino, C. G.; Sate, S. J.; Schuler, W.; Sims, S. R.; Stehling, S.; Tarochione, L. J.; Fromm, M. E. Crop Sci. 1995, 35, 550-557.
(19) Saghai-Maroof, M. A.; Soliman, K. M.; Jorgensen, R. A.; Allard, R. W. Proc. Natl. Acad. Sci. U.S.A. 1984, 81, 8014-8018.
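The concentrations quoted for the hybridization samples can be checked from the stated volumes and stock concentrations. The following is a minimal sketch of that arithmetic, not part of the original protocol. It assumes an average mass of 650 g/mol per base pair of double-stranded DNA (a common rule of thumb that does not appear in the text), and all function and variable names are illustrative.

```python
# Worked check of the hybridization concentrations (illustrative sketch).
# ASSUMPTION: 650 g/mol per base pair; this value is not given in the text.
BP_MASS_G_PER_MOL = 650.0


def molar_from_mass(mass_g_per_l, length_bp):
    """Convert a DNA mass concentration (g/L) to molarity (mol/L)."""
    return mass_g_per_l / (length_bp * BP_MASS_G_PER_MOL)


def after_dilution(stock_conc, stock_volume, total_volume):
    """Concentration after placing stock_volume into total_volume (same units)."""
    return stock_conc * stock_volume / total_volume


# Total hybridization volume in µL: λ DNA + salmon DNA + buffer + probe mix.
TOTAL_UL = 1 + 10 + 89 + 0.5

lambda_stock = molar_from_mass(1.7e-3, 48_502)  # 1.7 µg/mL = 1.7e-3 g/L
salmon_stock = molar_from_mass(10.0, 3e9)       # 10 mg/mL = 10 g/L

lambda_final = after_dilution(lambda_stock, 1, TOTAL_UL)
salmon_final = after_dilution(salmon_stock, 10, TOTAL_UL)

# Probe mix: 1 µL of 50 µM stock diluted in 2.5 mL (2500 µL) of water.
probe_mix = after_dilution(50e-6, 1, 2500)
probe_final = after_dilution(probe_mix, 0.5, TOTAL_UL)

# Urea enters only with the 89 µL of 7 M urea buffer.
urea_final = after_dilution(7.0, 89, TOTAL_UL)

# The 100-fold dilution into water gives the concentration at the detector.
lambda_at_detector = lambda_final / 100

print(f"lambda DNA:  {lambda_final:.1e} M")
print(f"salmon DNA:  {salmon_final:.1e} M")
print(f"probe mix:   {probe_mix:.1e} M")
print(f"each probe:  {probe_final:.1e} M")
print(f"urea:        {urea_final:.1f} M")
print(f"at detector: {lambda_at_detector:.1e} M")
```

Under the stated assumption, the results agree with the rounded values in the text: about 5 × 10⁻¹³ M for both genomes, 20 nM for the probe mix, about 1 × 10⁻¹⁰ M for each probe, and 6.2 M urea.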
Figure 2. Schematic diagram of the experimental setup for two-channel coincidence single-molecule detection.
Instrument. The system consists of a modified version of the apparatus used in previous single-molecule detection experiments (9, 11) (Figure 2). The frequency-doubled output (532 nm) of a mode-locked Spectra-Physics Model 3800 Nd:YAG laser producing 70-ps pulses at an 82-MHz repetition rate was used as the excitation source. At this wavelength, the extinction coefficients of R6G and BTR are 72 000 and 30 000 M⁻¹ cm⁻¹, respectively. The laser output was attenuated to 2-5 mW and focused by a 6× microscope objective into the 0.8 × 0.8 mm I.D. square cross-section capillary cell (Vitrodynamics) to yield a 10-µm spot (1/e² value, as determined by the scanning knife-edge method (20)). A syringe pump (Harvard Apparatus) was used to pump the sample through the capillary cell at a rate of 200 µL/h, which translates to an average linear flow velocity of 87 µm/s. As individual molecules move through the laser beam, repeated excitation-emission cycles produce a fluorescence photon burst. The apparatus incorporates two detection channels which independently detect the photon bursts from each of the two dyes. Fluorescence in each channel is collected by a 40× 0.85 NA microscope objective and spatially filtered by a 0.4 × 0.4 mm square slit, which defines a 10 × 10 µm detection area. The light is then spectrally filtered by 30-nm-bandwidth eight-cavity interference filters (Omega, RDF series). The filter band-pass on each channel was chosen such that there is sufficient overlap with the emission spectrum of the corresponding dye but negligible overlap with the spectrum of the other dye. Light detection was accomplished by EG&G single-photon avalanche photodiodes.
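The flow and optical figures quoted for the instrument follow from simple geometry. The sketch below reproduces them; it is illustrative only. It uses the average velocity over the capillary cross section and ignores the velocity profile and diffusion, so the transit time is an order-of-magnitude estimate. The function and variable names are hypothetical.

```python
# Geometry check for the flow cell and detection zone (illustrative sketch).


def linear_velocity_um_per_s(flow_ul_per_h, side_mm):
    """Average linear velocity in a square capillary of inner side side_mm."""
    area_mm2 = side_mm ** 2
    mm_per_h = flow_ul_per_h / area_mm2  # 1 µL equals 1 mm^3
    return mm_per_h * 1000.0 / 3600.0    # mm/h converted to µm/s


velocity_um_s = linear_velocity_um_per_s(200, 0.8)

# Approximate time for a molecule to cross the 10-µm laser spot.
transit_ms = 10.0 / velocity_um_s * 1000.0

# Side of the detection area: the 0.4-mm slit demagnified by the 40x objective.
detection_side_um = 0.4 * 1000.0 / 40.0

print(f"average velocity: {velocity_um_s:.1f} um/s")
print(f"beam transit:     {transit_ms:.0f} ms")
print(f"detection side:   {detection_side_um:.0f} um")
```

The velocity comes out near the quoted 87 µm/s and the detection area matches the stated 10 × 10 µm. On this estimate a burst spans roughly a hundred of the 1-ms counting intervals.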
Each detector output signal is analyzed by independent time-correlated single-photon-counting electronics under computer control (9, 11). The detection electronics reject Raman and Rayleigh scattering by using a time-gated window set such that only delayed fluorescence photons are detected, thus increasing the signal-to-noise ratio (SNR) of single-molecule detection (21). Fluorescence data from each channel were collected in 1-ms intervals, for a total running time of 819.2 s, and saved for later analysis.
(20) Suzaki, Y.; Tachibana, A. Appl. Opt. 1975, 14, 2809-2810.
(21) Shera, E. B.; Seitzinger, N. K.; Davis, L. M.; Keller, R. A.; Soper, S. A. Chem. Phys. Lett. 1990, 174, 553-557.
Data Analysis. We have used two different methods to analyze the raw data files for coincident signals. One of them is the cross-correlation analysis, where the two raw data sets $g_j$ and $h_k$ are subjected to the following formula:

$$\mathrm{Corr}(g,h)_j \equiv \sum_{k=0}^{N-1} g_{j+k}\, h_k \qquad \text{for } j = -(N-1),\ -(N-2),\ \ldots,\ -1,\ 0,\ 1,\ \ldots,\ N-1$$

where N is the total number of data points. The correlation will be large at some value of j if the first data set (g) resembles the second data set (h) at some lag time j. Therefore, whenever the sample contains a DNA target simultaneously hybridized to the two probes, the coincident detection of the two fluorescent dyes produces bursts coincident in time, which contribute to the cross-correlation peak at zero time.
Alternatively, the coincident detection of the two probes was accomplished by a burst identification routine. In this case, bursts in the data sets are identified above a certain threshold, and their appearance times, j and k, are recorded. Then, the time lags between a burst in the first data set and neighboring bursts in the second data set (j - k) are calculated, and a histogram of these times is constructed. This routine also yields a peak at zero time with an area proportional to the total number of targets counted.
The first method of data analysis generally yields larger signal-to-noise ratios because all of the bursts are included in the analysis, even those weaker bursts produced by molecules traveling through the edges of the Gaussian laser beam. The second method yields smaller, narrower histograms, since weaker bursts are neglected, and the arrival time of larger bursts can be determined more accurately. This method should prove more useful when our instrument is modified to interrogate all of the molecules present in the sample, allowing for the direct quantitation of the targets present in the sample.
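The cross-correlation described under Data Analysis can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis code: the function name is hypothetical, the sum is evaluated only for lags within a chosen window, and terms whose index falls outside the record are skipped. The synthetic data use single-bin bursts on a random background, which is a deliberate simplification of real bursts that span many 1-ms bins.

```python
import random


def cross_correlation(g, h, max_lag):
    """Return {j: sum_k g[j+k] * h[k]} for lags j in [-max_lag, max_lag].

    Terms whose index j+k falls outside the record are skipped.
    """
    n = len(g)
    if len(h) != n:
        raise ValueError("both channels must have the same number of bins")
    corr = {}
    for j in range(-max_lag, max_lag + 1):
        k_lo = max(0, -j)     # keeps j + k >= 0
        k_hi = min(n, n - j)  # keeps j + k <= n - 1
        corr[j] = sum(g[j + k] * h[k] for k in range(k_lo, k_hi))
    return corr


# Synthetic two-channel record (ASSUMED values, chosen only for illustration).
rng = random.Random(0)
n_bins = 5000
g = [rng.randint(0, 1) for _ in range(n_bins)]  # background, channel 1
h = [rng.randint(0, 1) for _ in range(n_bins)]  # background, channel 2

for t in rng.sample(range(n_bins), 40):  # targets: bursts in both channels
    g[t] += 20
    h[t] += 20
for t in rng.sample(range(n_bins), 40):  # free probe, channel 1 only
    g[t] += 20
for t in rng.sample(range(n_bins), 40):  # free probe, channel 2 only
    h[t] += 20

corr = cross_correlation(g, h, max_lag=50)
peak_lag = max(corr, key=corr.get)
print("lag of the largest correlation value:", peak_lag)
```

Coincident bursts add their products at lag zero, while uncorrelated bursts spread across all lags, which is why the peak sits at zero only when doubly labeled targets are present. For a full 819.2-s record at 1-ms resolution, a direct sum over all lags would be slow, so a limited lag window (or an FFT-based method) would be the practical choice.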
Figure 3. Detection of intact λ DNA. (a, b) Raw fluorescence signals for the R6G and BTR channels, respectively. Each fluorescence burst represents the passage of a molecule through the laser beam. Two representative coincident bursts are indicated by dashed lines. The data were binned in 4-ms intervals for better visualization. (c) Cross-correlation analysis (only 1200 ms around zero are shown).
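The burst identification routine described under Data Analysis can be sketched in the same spirit. The text does not say how a burst's appearance time is defined or how the threshold is chosen, so this sketch assumes the appearance time is the first bin of each run above a fixed threshold. All names and the toy data are hypothetical.

```python
from bisect import bisect_left, bisect_right


def burst_times(counts, threshold):
    """Return the first bin index of each run of bins above threshold.

    ASSUMPTION: the appearance time of a burst is taken as its first bin.
    """
    times = []
    previous_above = False
    for i, c in enumerate(counts):
        above = c > threshold
        if above and not previous_above:
            times.append(i)
        previous_above = above
    return times


def lag_histogram(times_g, times_h, max_lag):
    """Histogram of lags (j - k) between bursts in g and nearby bursts in h.

    times_h must be sorted in ascending order, as returned by burst_times.
    """
    hist = {lag: 0 for lag in range(-max_lag, max_lag + 1)}
    for j in times_g:
        lo = bisect_left(times_h, j - max_lag)
        hi = bisect_right(times_h, j + max_lag)
        for k in times_h[lo:hi]:
            hist[j - k] += 1
    return hist


# Toy record: one coincident pair at bin 2 and one near miss (bins 7 and 8).
g = [0, 0, 9, 8, 0, 0, 0, 7, 0, 0]
h = [0, 0, 8, 0, 0, 0, 0, 0, 9, 0]

times_g = burst_times(g, threshold=5)
times_h = burst_times(h, threshold=5)
hist = lag_histogram(times_g, times_h, max_lag=3)
print(times_g, times_h, hist)
```

The count in the zero-lag bin plays the role of the peak area described in the text. With real data, the tolerance for calling two bursts coincident would have to allow for the finite burst width.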
RESULTS
Detection of Intact λ DNA. We used two probes, λ1 and λ2, labeled with different fluorescent dyes, R6G and BTR, to tag the λ DNA target sequence. Single molecules of the probe-target complex are expected to be identified by the coincident detection of both dyes, while unbound probes or single probes bound to randomly occurring complementary sequences (22) should appear as independent signals. The sample was prepared by mixing λ and salmon DNA in a genomic ratio of 1:1. The salmon DNA provides a background of unrelated sequences to test specificity for the λ target. The hybridized sample was loaded into the system at a final concentration of 5 fM. Parts a and b of Figure 3 show the raw data obtained. It is apparent that some signals occurred in only one of the channels (independent signals), while others occurred simultaneously in both channels (coincident signals). Cross-correlation analysis yields a large peak at zero lag time (Figure 3c). A control experiment described in the next paragraph shows that the coincident signals above background are due to the binding of two probe molecules to one complementary target sequence, and not to nonspecific probe-DNA complexes or to PNA probe complexes unrelated to the presence of DNA.
(22) Chen, J. W.; Cohen, A. S.; Karger, B. L. J. Chromatogr. 1991, 559, 295-305.
Analysis of Cleaved λ DNA. To see whether the coincident signals of Figure 3c indicate the presence of specific probe-target complexes, we repeated the experiment under identical conditions, except that the λ target was cut between the two probe binding sites prior to mixing with the salmon DNA and hybridizing to the probe. Probes should still bind to target sequences as before, but coincident signals should be eliminated because the binding sites would now be on independent target fragments. On the other hand, if the coincident signals were due to nonspecific complexes, cleavage of the target DNA should have no effect on the occurrence of coincident signals. Photon bursts were recorded (Figure 4a,b) and analyzed as before. No coincident events are revealed by cross-correlation analysis (Figure 4c), which shows that the coincident signals seen with intact λ DNA are due to specific probe-target complexes. Moreover, these results confirm that the instrument truly detects single molecules, since it is otherwise impossible to statistically explain the elimination of coincident signals when the target is cut. We conclude that coincidence detection of two probes provides the specificity required to identify single-copy target sequences in complex samples.
Figure 4. Control experiment for the analysis of λ DNA in Figure 3. No coincident signals were found when the target was cleaved between the two probe binding sites. (a, b) Raw fluorescence signals for the R6G and BTR channels, respectively. (c) Cross-correlation analysis (only 1200 ms around zero are shown).
Detection of a Single-Copy Gene in a Transformed Maize Plant. The experimental results of Figures 3 and 4 were obtained with a synthetic sample comprising a mixture of λ and salmon DNA. To confirm these results with natural samples, we prepared DNA from a homozygous transformed maize plant containing one copy of a BT (B. thuringiensis toxin) transgene per haploid maize genome (3 × 10⁹ bp). Instead of using cleaved DNA for a negative control as in the λ DNA experiments, here the control is an isogenic plant lacking the BT gene. Photon bursts were recorded as before, and the data were analyzed by the burst identification routine. Coincident signals were observed only with the BT-positive sample (Figure 5a) and not with the negative control (Figure 5b). This example illustrates that coincidence single-molecule detection can identify a single-copy transgene integrated in a chromosome of a complex genome.
Figure 5. Detection of a single-copy BT transgene in a maize genomic sample. Burst identification analysis for the BT-positive (a) and BT-negative (b) samples.
DISCUSSION
Sensitivity. The high molecular weight of genomes makes it impractical to use commercial capillary electrophoresis (CE) instruments because of the excessive DNA mass concentrations required. Fluorescence detectors found on typical CE instruments lack the sensitivity required for the direct detection of unamplified target sequences in complex genomic DNA samples. This is why it has not been possible to analyze unamplified DNA samples by probe hybridization followed by capillary electrophoresis (23, 24). For example, a limit of detection of 10⁻¹² M dye molecules, as found in current CE instrumentation, means that if every target DNA sequence in the test genome were hybridized to a fluorescent probe, the sample would need to be at the same concentration in genomes, 10⁻¹² M. This corresponds to a mass concentration of 2 mg/mL for a human or maize genome of 3 × 10⁹ bp, which is too high for practical use. Therefore, a greater detection sensitivity, such as that demonstrated by the present experiments, is required to enable the use of lower DNA mass concentrations.
(23) Perry-O'Keefe, H.; Yao, X. W.; Coull, J. M.; Fuchs, M.; Egholm, M. Proc. Natl. Acad. Sci. U.S.A. 1996, 93, 14670-14675.
(24) Carlsson, C.; Jonsson, M.; Norden, B.; Dulay, M. T.; Zare, R. N.; Noolandi, J.; Nielsen, P. E.; Tsui, L. C. Nature 1996, 380, 6571.
Analysis Time. The BT data corresponding to Figure 5 were acquired in 819.2 s and yielded a cross-correlation with a SNR of 60. Shorter acquisition times would decrease the peak height at zero time, because fewer coincident events would be counted. The acquisition time can be reduced until the limit of detection is reached, that is, where the peak height is a minimum of 3 standard deviations above the noise level. Since the SNR scales as the square root of time, only a few seconds of data should be sufficient to detect the BT gene at the limit of detection. In fact, SNRs greater than 3 were obtained for data segments ranging from 4- to 8-s duration. Even shorter acquisition times could be achieved by eliminating unbound probes prior to analysis, which would reduce the random background due to independent signals, thus improving the SNR. Acquisition times could be further optimized by adjusting the sample concentration: excessively high concentrations produce a steady-state fluorescence and make it impossible to detect single molecules, whereas low concentrations extend acquisition time because molecules are counted infrequently. Thus, the analysis time would be optimized at sample concentrations somewhere between these two extremes.
Detection Zone Geometry. We have employed a pulsed laser with time-gated electronics to detect single fluorescent dye molecules in a focused laser beam approximately 1 pL in volume (dimensions 10 × 10 × 10 µm³). The time-gated detection technique suppresses noise due to Rayleigh and Raman light scattering (21). Another way to diminish noise is to illuminate an even smaller volume, since the signal remains the same for a single molecule, but the background signal decreases with the detection volume. For example, a continuous-wave laser has been used for single-molecule excitation (14, 15) by reducing the volume of the detection zone to 0.25 fL (0.4 µm diameter × 2 µm long). In a flowing sample, the time (t) required to count a given number of molecules (N) is inversely proportional to the cross-sectional area of the laser beam perpendicular to the flow (σ):
$$t = \frac{N}{C \sigma v}$$

where C is the concentration and v is the linear flow velocity. At a given linear flow velocity, it is necessary to compensate for smaller beam areas by a proportional increase in sample concentration, in order to maintain the same counting rate. If our setup, with its relatively large beam area, were replaced by the CW setup described above (14, 15) (100 µm² vs 0.08 µm²), it would be necessary to increase the concentration of maize DNA 1250-fold, from 10 to 12 500 µg/mL. The latter concentration is too high for routine use. In the present experiments, we have not attempted to optimize the efficiency with which the sample solution volume is used. For example, the capillary tube is much larger than the small laser beam spot and therefore most sample molecules pass by undetected. A smaller capillary, larger beam spots, or a sheath flow hydrodynamic focusing system (9, 25) would make more efficient use of the sample. A combination of these parameters will be incorporated in the next generation of our instrument. The laser beam thickness, parallel to the flow axis, is also an important parameter to consider. Excessively thick beams may force the use of excessively low sample concentrations in order to avoid the simultaneous presence of two independent probe molecules in the beam. A flat beam, narrow with respect to the flow, but of large perpendicular area is preferred in order to provide maximum flexibility in selecting the sample concentration. Wide, flat beams can be obtained by using crossed cylindrical lenses (9, 25).
(25) Steen, H. B. In Flow cytometry and sorting; Melamed, M. R., Lindmo, T., Mendelsohn, M. L., Eds.; Wiley-Liss: New York, 1990; Chapter 2.
PNA Hybridization. In both the λ and maize experiments, we hybridized PNA probes to single-stranded DNA under DNA-denaturing conditions (in 6.2 M urea, Figures 3-5). These conditions were chosen because eventually we want to combine the present technique with single-molecule electrophoresis (SME) (11). In SME, the sizes of DNA fragments are determined by measuring the electrophoretic migration time through two sequential detection zones. In this application, denaturing conditions are required in order to eliminate DNA secondary structures, so that the electrophoretic velocity of DNA depends on its molecular weight alone and not on its particular nucleotide sequence. Furthermore, when hybridization is conducted under DNA-denaturing conditions, the inhibition of PNA probe binding by DNA secondary structures is reduced or eliminated (16). Finally, the use of electroneutral PNA probes facilitates the removal of unbound probes prior to analysis, which should reduce noise and shorten the analysis time.
Counting Error and Internal Genetic Standards. In principle, the expected number of targets detected in an experiment can be calculated from the molecular velocity, the laser beam geometry, and the genomic concentration. In practice, however, this calculation is difficult to carry out with precision and consistency, because of the uncertainty in the measurement of these parameters. The actual molecular velocity, for example, cannot be readily deduced from the volumetric flow rate, since the linear velocity varies across the capillary due to hydrodynamic shear. Thus, this velocity also depends on the position of the beam in the capillary, which is undefined. Determination of the molecular velocity from the burst width is also impractical, since diffusion and photodestruction affect the relationship between transit time and burst width.
Secondly, the detection volume does not have distinct boundaries, since it is defined by a Gaussian laser beam, and it is difficult to estimate at what point in the decaying Gaussian intensity profile the average molecule will produce an above-background burst. Finally, the actual genomic concentration is not known with absolute certainty in a DNA extract. All of the above uncertainties emphasize the importance of using an internal genetic standard to accurately quantitate the number of gene copies per genome. By determining the ratio of the target sequence to an internal standard (e.g., a gene known to be present in the analyzed genome and of known copy number), and by detecting both the target and standard side by side in the same sample, it is possible to determine gene dosage without knowing the mass concentration of the DNA sample or the detection parameters. Use of internal genetic standards would be facilitated if the detection instrument were arranged to distinguish at least three dyes A, B, and C. One dye pair AB would identify the target, and another dye pair AC would identify the standard, so that both could be counted together in the same analytical test. The instrument could be programmed to count until a preset level of statistical confidence is achieved. Three-color detection would also facilitate the use of molecular weight standards for determining target sizes by single-molecule electrophoresis (11).
Quantitative Precision. In particle counting experiments, such as those reported here, the standard deviation error in the amount of analyte is equal to the square root of the number of particles detected. This phenomenon has been called molecular shot noise, and it has been argued that it reduces the value of ultrasensitive detection in some analytical tests (26). However, a relative precision of 10%, for example, would require the detection of 100 molecules, which is still many orders of magnitude below the limit of detection for currently available fluorescence-based instrumentation. Furthermore, when acquiring quantitative data in genetic analysis, we often need to determine only whether a target sequence is present or absent and whether it is heterozygous (present in one chromosome only) or homozygous (in both chromosomes). Since the analytical result is usually constrained by the rules of Mendelian inheritance to 0, 1, or 2 gene copies, a true quantitative analysis can be accomplished with adequate precision at lower particle counts.
(26) Chen, D. Y.; Dovichi, N. J. Anal. Chem. 1996, 68, 690-696.
Future Applications. By combining the present technique with single-molecule electrophoresis, it will be possible to determine both the quantity and molecular weight of specific target DNA molecules in complex samples, without the need for DNA amplification. The simplicity of the assay chemistry (probe hybridization in solution phase under DNA-denaturing conditions) promises reliability. The high level of target specificity demonstrated in the present experiments suggests that single-molecule detection coupled with single-molecule electrophoresis could be used successfully for many applications in analytical genetics. The inheritance of genes and chromosome segments could be tracked using DNA markers such as restriction fragment length polymorphisms (RFLPs) or tandem repeats. Single-nucleotide differences could be detected by probe hybridization as has been shown for PNA probes (24). Finally, specific mRNA transcripts could be counted and sized. We are presently building an instrument that will both count and size specific DNA target sequences. This instrument will be used to evaluate the reliability of the assay process and to determine the overall operating cost.
The high sensitivity of the method means that sample size and reagent use are minimal, which should result in significant cost savings relative to existing analytical methods. Ultimately, assay reliability and low operating costs, combined with high sensitivity, may be the primary advantages of using single-molecule detection methods in the analytical laboratory.
CONCLUSIONS
Detection of single-copy genes within a complex genome by coincidence single-molecule detection has been demonstrated. The analysis of a labeled sample can be completed within a few seconds. The present technique provides a basis for developing an instrument capable not only of counting but also of determining the size of specific DNA fragments in unamplified genomic DNA samples.
ACKNOWLEDGMENT
The authors thank Brooks Shera (Los Alamos National Laboratory) and Michael Egholm (PerSeptive Biosystems) for helpful discussions, and Bruce Orman (Pioneer Hi-Bred) for support.
Received for review April 11, 1997. Accepted July 16, 1997.
AC970389H
Abstract published in Advance ACS Abstracts, September 1, 1997.