Anal. Chem. 2001, 73, 2117-2125

Electrospray Ionization Mass Spectrometry-Based Genotyping: An Approach for Identification of Single Nucleotide Polymorphisms

Sheng Zhang, Colleen K. Van Pelt, and Gary A. Schultz*

Advion BioSciences, Inc., 30 Brown Road, Ithaca, New York 14850

The high frequency of single nucleotide polymorphisms (SNPs) in the human genome makes them ideal genetic markers for mapping, diagnosing disease-related alleles, and identifying SNPs that contribute to drug response differences between individuals. Here we report a novel assay utilizing a single nucleotide primer extension (SNuPE) and electrospray ionization mass spectrometry (ESI-MS) detection for the analysis of SNPs. In contrast to most SNuPE genotyping technologies that detect the extended primer product, the novel Survivor assay detects the unreacted dideoxynucleotides (ddNTPs) remaining or surviving in solution following a SNuPE. This assay involves a simple analysis of the same four ddNTP analytes, regardless of the SNP being investigated, and either single- or double-stranded DNA can be used to genotype a SNP, without any labeling requirements of the ddNTPs or oligonucleotide primers. We have tested and blindly validated the Survivor assay by genotyping the C/T SNP at -857 of the human TNFα promoter gene. The results obtained are in agreement with the control sequencing data. The results demonstrate that the homogeneous Survivor assay with ESI-MS detection offers advantages in simplicity, accuracy, specificity, and sensitivity. Additional advantages of the method include enhanced hybridization efficiencies in this solution-phase assay and the elimination of immobilized primers for the isolation of single-stranded DNA. With a one-well reaction and an automation platform being developed, the Survivor assay provides a powerful new tool for large-scale SNP analysis and screening.
Single nucleotide polymorphisms (SNPs) are the most frequent type of variation in the human genome, with an estimated frequency of one to two polymorphic nucleotides per kilobase.1-3 They are present at high density, widely distributed in the genome and stably inherited, making them ideal genetic markers in linkage and association genetic studies for genome mapping, medical diagnostics, and identity testing.3,4 As the Human Genome Project nears a complete sequence of the human genome, there is mounting interest focused on the detection and analysis of SNPs for disease susceptibility and variable drug response.5 The discovery of SNPs that are responsible for individual differences in drug response drives the pharmaceutical industry to envision its proprietary products enhancing drug discovery and development processes by linking drug response to patient genotype through genomics and diagnostic testing.6-8 Such applications would require the large-scale screening of thousands of different SNPs in thousands of samples. As the number of identified and mapped SNPs rapidly increases (http://snp.cshl.org), there is an increasing demand for SNP genotyping technologies that are simple, rapid, cost-effective, and readily amenable to automation for high throughput analyses.

A number of methods have been developed for efficient SNP genotyping including allele-specific hybridization,4 allele-specific PCR,9 endonuclease cleavage,10 pyrosequencing,11 and polymerase-mediated single nucleotide primer extension (SNuPE), or minisequencing.2,12 In SNuPE, which is the most widely used method for SNP analysis, once a DNA fragment containing the SNP of interest is amplified, an oligonucleotide primer anneals to the template sequence immediately adjacent to the polymorphic site. A DNA polymerase then extends the primer by a single dideoxynucleotide (ddNTP) before the extended primer is analyzed.

* To whom correspondence should be addressed. Phone: 607-257-0183 x18. Fax: 607-257-0359. E-mail: [email protected].
(1) Brookes, A. J. Gene 1999, 234, 177-86.
(2) Landegren, U.; Nilsson, M.; Kwok, P. Y. Genome Res. 1998, 8, 769-76.
(3) Schafer, A. J.; Hawkins, J. R. Nat. Biotechnol. 1998, 16, 33-9.
(4) Wang, D. G.; Fan, J. B.; Siao, C. J.; Berno, A.; Young, P.; Sapolsky, R.; Ghandour, G.; Perkins, N.; Winchester, E.; Spencer, J.; Kruglyak, L.; Stein, L.; Hsie, L.; Topaloglou, T.; Hubbell, E.; Robinson, E.; Mittmann, M.; Morris, M. S.; Shen, N.; Kilburn, D.; Rioux, J.; Nusbaum, C.; Rozen, S.; Hudson, T. J.; Lander, E. S.; et al. Science 1998, 280, 1077-82.
(5) McCarthy, J. J.; Hilfiker, R. Nat. Biotechnol. 2000, 18, 505-8.
(6) Mancinelli, L.; Cronin, M.; Sadee, W. AAPS Pharmsci. 2000, 2, article 4.
(7) Nebert, D. W. Clin. Genet. 1999, 56, 247-58.
(8) Head, S. R.; Parikh, K.; Rogers, Y. H.; Bishai, W.; Goelet, P.; Boyce-Jacino, M. T. Mol. Cell Probes 1999, 13, 81-7.
(9) Morin, P. A.; Saiz, R.; Monjazeb, A. Biotechniques 1999, 27, 538-40, 542, 544 passim.
(10) Lyamichev, V.; Mast, A. L.; Hall, J. G.; Prudent, J. R.; Kaiser, M. W.; Takova, T.; Kwiatkowski, R. W.; Sander, T. J.; de Arruda, M.; Arco, D. A.; Neri, B. P.; Brow, M. A. Nat. Biotechnol. 1999, 17, 292-6.
(11) Ahmadian, A.; Gharizadeh, B.; Gustafsson, A. C.; Sterky, F.; Nyren, P.; Uhlen, M.; Lundeberg, J. Anal. Biochem. 2000, 280, 103-10.
(12) Syvanen, A. C. Hum. Mutat. 1999, 13, 1-10.
(13) Kuppuswamy, M. N.; Hoffmann, J. W.; Kasper, C. K.; Spitzer, S. G.; Groce, S. L.; Bajaj, S. P. Proc. Natl. Acad. Sci. U.S.A. 1991, 88, 1143-7.
(14) Nyren, P.; Pettersson, B.; Uhlen, M. Anal. Biochem. 1993, 208, 171-5.

10.1021/ac001549j CCC: $20.00 Published on Web 03/31/2001
© 2001 American Chemical Society
Analytical Chemistry, Vol. 73, No. 9, May 1, 2001 2117

Since the first introduction of SNuPE for the identification of genetic disease,13 several new detection methods have been developed, including luminous detection,14 colorimetric ELISA,15 gel-based
fluorescent assays,16 homogeneous fluorescent detection,17 flow cytometry-based assays,18 HPLC analysis,19 and time-of-flight mass spectrometry.20-22 Among these detection systems, matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry has proven to be a promising tool for the high-throughput screening of SNPs.21

Electrospray ionization mass spectrometry (ESI-MS) is a rapid, sensitive, selective, and accurate analytical tool for the quantitative measurement of small compounds. The technique is a concentration-sensitive detection method and is readily amenable to miniaturization.23 We have sought to explore the potential of combining SNuPE with ESI-MS detection by developing the novel Survivor assay. In contrast to all existing SNuPE genotyping technologies that detect the extended primer product, the Survivor assay detects the unreacted ddNTPs remaining or surviving in solution following a SNuPE. Here we report the Survivor assay and demonstrate that it can be used for rapid, facile, and accurate SNP genotyping.

EXPERIMENTAL SECTION

Materials. Unless otherwise indicated, all chemicals were purchased from Sigma. High Expand DNA polymerase was obtained from Boehringer Mannheim (Indianapolis, IN), and HotStar DNA polymerase was from Qiagen (Valencia, CA). Thermosequenase, ddNTPs, and deoxynucleotides (dNTPs) were purchased from Amersham (Piscataway, NJ). Immobilized iminodiacetic acid gel (TSK HW-65F) was obtained from Pierce (Rockford, IL), and Microcon-50 filter units were from Millipore (Bedford, MA). Human genomic DNA samples were obtained from F. Hoffmann-La Roche (Basel, Switzerland), who acquired them from the Coriell Institute of Medical Research (Camden, NJ).

Oligonucleotides and Target Sequences. All oligonucleotide primers and synthetic target templates used in this study were synthesized by the Cornell University BioResource Center (Ithaca, NY).
Three model systems having different polymorphisms were chosen to demonstrate the present methodology. The first model system included a universal primer M13/pUC reverse sequence #1233 (5′ AGCGGATAACAATTTCACACAGGA 3′) used as a SNP primer, and four 33mer synthetic target templates with the sequence 5′ CCCCTGTNTCCTGTGTGAAATTGTTATCCGCTC 3′. The four target sequences differed from one another only at the polymorphic N site (underlined in the original), which carried an A, G, C, or T base. The sequence complementary to the #1233 primer (italicized in the original) is the 24-base stretch immediately 3′ of N, TCCTGTGTGAAATTGTTATCCGCT.

(15) Nikiforov, T. T.; Rendle, R. B.; Goelet, P.; Rogers, Y. H.; Kotewicz, M. L.; Anderson, S.; Trainor, G. L.; Knapp, M. R. Nucleic Acids Res. 1994, 22, 4167-75.
(16) Pastinen, T.; Partanen, J.; Syvanen, A. C. Clin. Chem. 1996, 42, 1391-7.
(17) Chen, X.; Kwok, P. Y. Genet. Anal. 1999, 14, 157-63.
(18) Cai, H.; White, P. S.; Torney, D.; Deshpande, A.; Wang, Z.; Marrone, B.; Nolan, J. P. Genomics 2000, 66, 135-43.
(19) Hoogendoorn, B.; Owen, M. J.; Oefner, P. J.; Williams, N.; Austin, J.; O’Donovan, M. C. Hum. Genet. 1999, 104, 89-93.
(20) Haff, L. A.; Smirnov, I. P. Genome Res. 1997, 7, 378-88.
(21) Griffin, T. J.; Smith, L. M. Trends Biotechnol. 2000, 18, 77-84.
(22) Sauer, S.; Lechner, D.; Berlin, K.; Lehrach, H.; Escary, J. L.; Fox, N.; Gut, I. G. Nucleic Acids Res. 2000, 28, E13.
(23) Schultz, G. A.; Corso, T. N.; Prosser, S. J.; Zhang, S. Anal. Chem. 2000, 72, 4058-63.

In the second model system, the target DNA was the 384bp PCR product that is a partial sequence of the Escherichia coli pheA
gene coding for the C-terminus of P-protein (GenBank M10431). One internal mutagenic primer (W338Ipd) and a flanking primer (P-reverse) were used as forward and reverse primers, respectively, for PCR amplification. Six primers annealing to different regions of the 384bp DNA fragment were used as SNP primers. The detailed sequences of the 384bp PCR product and its SNP primers are shown in Figure 1. The third test model was human genomic DNA with a C/T polymorphism at position 857 upstream of the tumor necrosis factor alpha (TNFα) gene24 (GenBank M16441). The amplification primers were TNFα-857F (5′ AAGCAAAGGAGAAGCTGAGA 3′) and TNFα-857R (5′ CTTAAACGTCCCCTGTATTC 3′). The SNP primer used for this polymorphic site was TNF851pp2 (5′ CTACATGGCCCTGTCTTC 3′).

PCR Amplification and Sequencing of Amplified DNA. A 384bp PCR product of the pheA gene was amplified using W338Ipd and P-reverse primers. The 50-µL reaction volume contained 1× High Expand reaction buffer with 2 mM magnesium chloride, 200 µM of each dNTP, 0.5 µM of each primer, 2 units of High Expand polymerase, and 5 ng of template plasmid pJS1 containing the wild-type pheA gene25 or 5 ng of template plasmid pSZ87 containing a pheA mutant C374A gene.26 The amplification reaction was thermal-cycled in an Applied Biosystems (Foster City, CA) 96-well GeneAmp PCR System 9700 for 35 cycles, with each cycle composed of 95 °C for 30 s, 60 °C for 60 s, and 72 °C for 60 s. The amplification of a 279bp region of the human TNFα promoter was set up in a 50-µL volume containing 50 ng of genomic DNA, 1× High Expand reaction buffer containing 2 mM magnesium chloride, 200 µM of each dNTP, 0.5 µM of each primer, and 2 units of HotStar Taq polymerase. The reaction was performed using an initial activation step of 15 min at 95 °C, followed by 5 cycles of 95 °C for 60 s, 53 °C for 30 s, and 72 °C for 60 s, and then 30 cycles of 95 °C for 60 s, 50 °C for 30 s, and 72 °C for 60 s.
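The two cycling programs above can be written down as data, which makes their block structure explicit and lets the programmed hold time be totalled. The following Python sketch is illustrative only: the `Step` and `Stage` names and the `hold_time_s` helper are assumptions introduced here, not part of the original protocol, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    """One temperature hold (hypothetical helper type)."""
    temp_c: float
    seconds: int


@dataclass(frozen=True)
class Stage:
    """A group of steps repeated `cycles` times (hypothetical helper type)."""
    steps: Tuple[Step, ...]
    cycles: int = 1


def hold_time_s(program):
    """Total programmed hold time in seconds, ignoring ramp times."""
    return sum(stage.cycles * sum(step.seconds for step in stage.steps)
               for stage in program)


# pheA amplification: 35 cycles of 95 C/30 s, 60 C/60 s, 72 C/60 s
PHEA_PROGRAM = (
    Stage((Step(95, 30), Step(60, 60), Step(72, 60)), cycles=35),
)

# TNF-alpha promoter amplification: 15 min activation, then 5 + 30 cycles
TNFA_PROGRAM = (
    Stage((Step(95, 15 * 60),)),
    Stage((Step(95, 60), Step(53, 30), Step(72, 60)), cycles=5),
    Stage((Step(95, 60), Step(50, 30), Step(72, 60)), cycles=30),
)

for name, program in (("pheA", PHEA_PROGRAM), ("TNF-alpha", TNFA_PROGRAM)):
    print(name, hold_time_s(program) / 60.0, "min")
```

Under these assumptions the pheA program holds for 5250 s (87.5 min) and the TNFα program for 6150 s (102.5 min), excluding ramping.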
All 279bp DNA fragments amplified from 13 human genomic DNA samples were sequenced on an ABI model 377A DNA sequencer by the Cornell University BioResource Center.

Preparation of PCR Products for SNuPE. The amplified PCR fragments were purified through Microcon-50 filter units. After loading the PCR reaction into the filter unit, the unit was centrifuged at 10 000g for 30 s and then washed three times with 400 µL of 20 mM ammonium acetate, pH 8.2, and centrifuged at 10 000g for 3 min. The purified PCR products were collected, quantified spectrophotometrically (OD260 nm), and used in the subsequent SNuPE reaction.

SNuPE. To simplify the SNuPE, a synthetic oligonucleotide, template A (5′ CCCCTGTATCCTGTGTGAAATTGTTATCCGCTC 3′ 33mer), corresponding to the flanking region of the polyrestriction sites of pUC18/19 plasmid, was used as a target template. The universal primer #1233, which is a complement to the above synthetic template, was used as the SNP primer.

(24) Hohjoh, H.; Nakayama, T.; Ohashi, J.; Miyagawa, T.; Tanaka, H.; Akaza, T.; Honda, Y.; Juji, T.; Tokunaga, K. Tissue Antigens 1999, 54, 138-45.
(25) Zhang, S.; Pohnert, G.; Kongsaeree, P.; Wilson, D. B.; Clardy, J.; Ganem, B. J. Biol. Chem. 1998, 273, 6248-53.
(26) Pohnert, G.; Zhang, S.; Husain, A.; Wilson, D. B.; Ganem, B. Biochemistry 1999, 38, 12212-7.

Figure 1. Sequences of the double-stranded 384bp PCR product, the amplification primers, and the polymorphism detection primers. The mutagenic bases in each primer are italicized, and bases mismatched to 384bp DNA are underlined. The polymorphic bases for each detection primer are also shown, and the ddNTP bases that are expected to be consumed are in bold in the target sequence. The binding site of each primer to the target DNA sequence and the direction of DNA synthesis are indicated. Two 384bp templates, which differed by three bases, were used in this study. The sequence variation between the wild-type and the mutant is provided in boxes, and the site of sequence variation is indicated by shading.

The reaction was set up in a total volume of 50 µL, which was composed of 25 mM ammonium acetate, pH 9.3; 1 µM of each ddNTP; 2 mM magnesium acetate; 0.05 µM template A; and 1
unit of Thermosequenase (Amersham). The #1233 primer was varied at concentrations of 0, 1, 2, 3, and 4 µM in the reaction. The reaction mixture was subjected to 25 thermal cycles, with each cycle consisting of 95 °C for 30 s, 60 °C for 60 s, and 72 °C for 60 s. To mimic a heterozygous sample, two 33-mer synthetic templates were combined at a total concentration of 0.05 µM (0.025 µM each) to replace the single homogeneous template, and the SNuPE was performed as indicated above with a #1233 primer concentration of 4 µM. Controls were consistently analyzed alongside the test samples; each control lacked one essential component (SNP primer, template, or enzyme) required for a successful SNuPE reaction. For those reactions having amplified DNA as templates, the reaction mixture contained, except as noted, 4 µM SNP primer; 1 µM ddNTPs; 0.05 µM double-stranded PCR product; 25 mM ammonium acetate, pH 9.3, with 2 mM magnesium acetate; and 1-2 units of Thermosequenase. The 50-µL reaction mixture was thermally cycled 30 times, with each cycle consisting of 95 °C for 30 s, 63 °C for 60 s, and 72 °C for 30 s. In the SNuPE for the polymorphism located at -857 in the TNFα promoter gene, TNF851pp2 was used as the SNP primer, and the annealing temperature was 38 °C. All of the reactions were run in duplicate or triplicate.

To optimize the SNuPE conditions for efficient incorporation of ddNTPs into the SNP primer, a series of SNuPE reactions were performed by varying both the single-stranded template A concentration from 5 to 100 nM and the 384bp double-stranded DNA concentration from 5 to 150 nM. In addition to varying the template concentration, the number of thermal cycles was varied between 10 and 60 cycles for every concentration of template, with each cycle consisting of 95 °C for 30 s, 60 °C for 60 s, and 72 °C for 60 s.

Figure 2. Schematic diagram of the SNuPE-based Survivor assay. A SNP primer anneals, as indicated, to a PCR product containing a polymorphic site having either a G/G homozygous genotype or an A/G heterozygous genotype. In the presence of ddNTPs, DNA polymerase, and excess SNP primer, the free ddNTP bases extend the SNP primer by a single base that is complementary to the polymorphic site. Following the SNuPE, the reaction solution is subjected to ESI-MS analysis, which detects the unreacted ddNTPs remaining in solution. The bases that were consumed in the SNuPE are identified, and the SNP bases are subsequently determined.

The optimization experiments were carried out under
the above conditions with a constant 4 µM SNP primer. The #1233 SNP primer was used with template A, and the T366pd SNP primer was used with the 384bp PCR product.

Sample Preparation and Reconstitution. The extended reaction samples were passed through an in-house packed micro metal chelating gel column (immobilized iminodiacetic acid gel, IDA gel) equilibrated with 20 mM ammonium acetate, pH 8.2, to remove magnesium from the reaction mixture. Once the reaction solution eluted, a 50-µL aliquot of equilibration buffer was used to wash the column. The combined effluent was then evaporated to dryness. After the samples were reconstituted in 50 µL of distilled water, the unreacted ddNTPs were detected by ESI-MS. The micro IDA columns were regenerated by first washing them with 50 mM EDTA with 0.1 M NaCl and then reequilibrating them with
20 mM ammonium acetate, pH 8.2.

ESI-MS Analysis. The resulting samples were analyzed by ESI coupled to a triple quadrupole Micromass Quattro II (Cheshire, U.K.) mass spectrometer. A mobile phase composition of 1:1 methanol:water with 0.1% acetic acid was used at a flow rate of 150 µL/min. At least three 10-µL injections were made for each sample via flow injection analysis. The mass spectrometer was equipped with a Z-spray source and operated in negative ion MS/MS selected reaction monitoring (SRM) mode. The Z-spray desolvation temperature and capillary voltage were 400 °C and 3000 V, respectively. The collision energy was 35 V, and the dwell time for each transition was 200 ms.

Table 1. Summary of the Peak Area Ratios of SNuPE Samples Containing Homogeneous and Heterogeneous Single-Stranded DNA Template

                        peak area ratios
sample                  C/T        C/A        C/G        T/A        T/G        A/G
                        370/385    370/394    370/410    385/394    385/410    394/410

control, no template    1.17       0.975      1.39       0.905      1.29       1.42
template A              12.4       1.00       1.38       0.0800     0.115      1.38
template C              1.24       1.06       9.99       0.860      8.10       9.41
template G              0.180      0.170      0.245      0.880      1.35       1.48
template T              1.05       7.41       1.48       7.06       1.41       0.200

control, no template    0.860      0.970      1.24       1.14       1.45       1.28
template A + C          8.79       1.25       8.71       0.150      1.01       7.01
template A + G          1.21       0.150      0.210      0.130      0.175      1.38
template A + T          7.19       5.75       1.58       0.825      0.225      0.280
template C + G          0.135      0.155      0.990      1.12       7.33       6.55
template C + T          1.26       6.55       8.08       5.23       6.42       1.25
template G + T          0.170      1.31       0.280      7.55       1.63       0.220

The following SRM transitions were monitored for each of the bases: ddCTP, m/z 370.1 → 79.0; ddTTP, m/z 385.1 → 79.0; ddATP, m/z 394.1
→ 79.0; ddGTP, m/z 410.1 → 79.0. The relative concentration of the ddNTPs in each sample was compared to a nonextended reaction control to which no SNP primer was added. The base(s) complementary to the ddNTPs consumed during the SNuPE can be assigned as the SNP base for both homozygous and heterozygous alleles on the basis of the relative ion responses for each of the four ddNTPs. To determine the SNP bases by ESI-MS detection, the selected ion transition chromatograms were plotted. The peaks corresponding to the ddNTPs were integrated, and the areas that were obtained were then used to calculate area ratios. The differences between the area ratios of the test and the control samples were calculated, and these differences were used to determine the consumed ddNTPs in the SNuPE. Furthermore, by mathematically normalizing the area ratios for the samples to those of the control, the percent of ddNTPs that are consumed following a SNuPE can be calculated. Peak area ratios were used to determine the SNP bases instead of the raw area values themselves in order to increase the precision and accuracy of the results.

RESULTS

Principle of the Survivor Assay. A flow diagram of our novel method of SNP detection, which uses ESI-MS to detect the unreacted ddNTPs following a SNuPE instead of the extended primer itself, is shown in Figure 2. The oligonucleotide primer is present in the SNuPE reaction in a molar excess relative to the ddNTP concentration. During the reaction, the oligonucleotide primer first anneals to the target region of the PCR-amplified genomic DNA template; then, catalyzed by DNA polymerase, the oligonucleotide primer is extended by a single ddNTP base that is complementary to the template immediately adjacent to the 3′ end of the primer. Measuring the concentration of unreacted ddNTPs remaining in solution following the SNuPE identifies the ddNTP that is consumed.
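The data reduction described in the Experimental Section, in which peak area ratios of a test sample are compared with those of a control, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: the function names, the choice of ddGTP as the common denominator, and the normalization to the least-depleted ddNTP are assumptions made here, and the resulting percentages are ratio-based estimates that need not equal the values the paper derives from peak areas. The 30% threshold is the consumption criterion quoted in the paper, and the numbers are copied from Table 1.

```python
# Illustrative sketch only; names and normalization scheme are assumptions.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
BASES = ("C", "T", "A", "G")


def relative_levels(ratios):
    """Express each ddNTP relative to ddGTP from the C/G, T/G, and A/G ratios."""
    return {"C": ratios["C/G"], "T": ratios["T/G"], "A": ratios["A/G"], "G": 1.0}


def percent_consumed(test, control):
    """Estimate percent consumption of each ddNTP from test vs control ratios."""
    test_rel = relative_levels(test)
    ctrl_rel = relative_levels(control)
    survival = {b: test_rel[b] / ctrl_rel[b] for b in BASES}
    # Assumption: the least-depleted ddNTP was not consumed at all.
    reference = max(survival.values())
    return {b: 100.0 * (1.0 - survival[b] / reference) for b in BASES}


def call_snp_bases(test, control, threshold=30.0):
    """Return the template (SNP) bases: complements of the consumed ddNTPs."""
    consumed = percent_consumed(test, control)
    return sorted(COMPLEMENT[b] for b in BASES if consumed[b] > threshold)


# Peak area ratios copied from Table 1 (single-stranded templates).
CONTROL_HOMO = {"C/G": 1.39, "T/G": 1.29, "A/G": 1.42}
TEMPLATE_A = {"C/G": 1.38, "T/G": 0.115, "A/G": 1.38}
CONTROL_HET = {"C/G": 1.24, "T/G": 1.45, "A/G": 1.28}
TEMPLATE_A_C = {"C/G": 8.71, "T/G": 1.01, "A/G": 7.01}

print(call_snp_bases(TEMPLATE_A, CONTROL_HOMO))    # homozygous example
print(call_snp_bases(TEMPLATE_A_C, CONTROL_HET))   # heterozygous example
```

With the Table 1 values, the sketch assigns base A to the template A sample and bases A and C to the template A + C sample. Only three of the six tabulated ratios are used; the remaining three are redundant under this scheme and could serve as a consistency check.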
The concentrations of the unreacted ddNTPs in a sample are compared to those of a control in order to identify the particular SNP base. For homozygous SNPs, only one base is substantially consumed, whereas for heterozygous SNPs, two bases are consumed essentially equally during the SNuPE.

Survivor Assay Using Synthetic Oligonucleotides as Templates. The Survivor assay was initially tested in a model system using synthetic oligos as templates.

Figure 3. MS/MS mass spectra of the remaining free ddNTPs following SNuPE reactions using homogeneous single-stranded DNA as template. The templates were named for their SNP bases. It is apparent in each of the samples that the base that is complementary to the SNP base decreases in concentration, as predicted.

The SNuPE reaction solution in the Survivor assay must contain an excess of SNP primer with
a limited concentration of ddNTPs. The concentration of ddNTPs must be limited in order for the consumed base(s) to be detected in a statistically significant manner. The ratio of SNP primer to ddNTPs was optimized in the SNuPE. As the SNP primer concentration was increased through 1, 2, 3, and 4 µM in the reactions, the consumption of ddTTP increased progressively to 44, 64, 84, and 87%, respectively, as a result of its incorporation into the primer. It was determined from these results that 4 µM of SNP primer was the optimal concentration. Figure 3 shows the mass spectra of five reactions having homogeneous templates. The corresponding ddTTP, ddGTP, ddCTP, and ddATP were consumed by 80-90% in the reactions with template A, C, G, and T, respectively. These data demonstrate that the Survivor assay can unambiguously determine all possible homozygous SNP bases. A summary of the average peak area ratios obtained for these samples is shown in Table 1.

Figure 4. MS/MS mass spectra of the remaining free ddNTPs following SNuPE reactions using heterogeneous single-stranded DNA as template. The two bases complementary to the two SNP bases present were observed to decrease in concentration.

Figure 4 displays spectra of reactions containing heterogeneous templates. Panels I to VII correspond to a control sample with no
template, template A + C, template A + G, template A + T, template C + G, template C + T, and template G + T, respectively. In each sample, both of the bases expected to decrease in concentration did, in fact, decrease, with each base consumed by 80-90% of its initial concentration, despite the fact that only one-half of the amount of each template was present. This result reveals that all of the possible combinations for heterozygous polymorphisms can be easily identified by this assay and, in addition, that the 0.025 µM of template with 25 thermal cycles used in these SNuPE reactions is in kinetic excess for efficient incorporation of the free ddNTPs. Table 1 shows the average peak area ratios obtained in these SNuPE reactions. The peak area ratios of the four ddNTPs allow for the detection of changes in the relative concentrations of the bases and, consequently, permit the genotyping of a locus. Furthermore, the relative standard deviation (RSD) of the peak area ratio data for three injections of each sample and its duplicate, such that n = 6, was < 15%, which suggests that genotyping SNPs by detecting free ddNTPs is reproducible.

Survivor Assay Using Amplified Double-Stranded DNA of Partial pheA Gene as Template. The model system described previously consisted of a single-stranded DNA target sequence; however, from a practical standpoint, double-stranded DNA will be encountered more often. A potential problem for using double-stranded DNA is the reannealing of the two complementary strands that could compete with the SNP primer and, thereby, lower the efficiency of the SNuPE. To determine whether the Survivor assay is applicable to double-stranded DNA, a 384bp amplified PCR product of a partial E. coli pheA gene with several primers and a mutant gene available from previous work26 was initially used as the template in a SNuPE. The results from the SNuPE reactions with the 384bp double-stranded DNA template are displayed in Table 2. By comparing the peak area ratios of the control sample to the four different SNP primer reactions, as
displayed in Table 2, the SNP bases can be unambiguously identified using double-stranded DNA as a template. The signal intensities of the bases that are expected to be consumed using the various SNP primers were observed to decrease by > 60% in each of the cases. For example, the primer W338Ipd has the SNP base T, and the concentration of only ddATP was dramatically reduced, while the other ddNTP bases remained unchanged. Table 2 also provides the standard deviation of the peak area ratios for three injections of each sample analyzed in triplicate, such that n = 9 for 384bp amplified double-stranded DNA template in homogeneous reactions. Once again, the RSD of each sample and its triplicate was < 15%. Consequently, the current analysis technique can unambiguously identify the polymorphic bases in double-stranded DNA for homozygous samples.

To investigate the ability of the Survivor assay to detect heterogeneous polymorphic bases, an equimolar mixture of 384bp wild-type and C374A mutant DNA, also shown in Figure 1, was used as a template, and two additional SNP primers, T366pd and V383pu, were used in the assays. The resulting peak area ratios are shown in Table 3 for the T366pd and V383pu SNuPE reactions. In the SNuPE reactions involving the T366pd SNP primer, A is the SNP base for the wild-type template, and C is the SNP base for the C374A mutant template. The expected decrease in intensity of ddTTP and ddGTP was observed for the wild-type and mutant templates, respectively. When an equimolar mixture of wild-type and mutant templates was used, both polymorphic bases A and C were identified by the peak area ratios of ddTTP and ddGTP being affected. It was calculated that ddTTP and ddGTP were consumed by approximately 52 and 62%, respectively, when both templates were present equally in the SNuPE. The V383pu primer reaction reveals similar results, as shown in Table 3.
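Because Table 2 reports each mean peak area ratio together with its standard deviation, the RSD criterion quoted above can be checked directly. The short Python sketch below does this for one row of the table; the `rsd_percent` helper and the dictionary layout are assumptions introduced for illustration, while the numbers are copied from the "P-reverse and mutant" row.

```python
# (mean, standard deviation) pairs from the "P-reverse and mutant" row of Table 2.
P_REVERSE_MUTANT = {
    "C/T": (0.214, 0.030),
    "C/A": (0.307, 0.043),
    "C/G": (0.544, 0.070),
    "T/A": (1.43, 0.06),
    "T/G": (2.55, 0.09),
    "A/G": (1.78, 0.06),
}


def rsd_percent(mean, std_dev):
    """Relative standard deviation, expressed as a percentage of the mean."""
    return 100.0 * std_dev / mean


rsd = {name: rsd_percent(mean, sd) for name, (mean, sd) in P_REVERSE_MUTANT.items()}
worst = max(rsd, key=rsd.get)

for name, value in rsd.items():
    print(f"{name}: {value:.1f}%")
print("largest RSD:", worst)
```

For this row the largest RSD is about 14%, for the C/T ratio, which is within the < 15% figure stated in the text. The ratios with the smallest means carry the largest relative scatter.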
In the SNuPE reaction of the Survivor assay, both the template concentration and the number of thermal cycles are important for adequate incorporation of free ddNTPs into unextended primers. To characterize the effect of both the template concentration and the thermal cycle number on the incorporation efficiency, the assays were performed at various concentrations of both single- and double-stranded DNA and for various numbers of thermal cycles ranging from 10 to 60. It was determined through these optimization studies that there is a large difference in the ddNTP incorporation rate between SNuPE reactions containing single-stranded DNA template and those containing double-stranded PCR product as template. When single-stranded DNA was used as a template, the following cases permitted the ddNTP to be consumed by at least 30% in the SNuPE reactions, thereby allowing the genotype to be scored accurately by ESI-MS: 10 nM template for 20 cycles, 20 nM template for 10 cycles, or 5 nM for 30 cycles. These results are shown in Figure 5A. When double-stranded DNA was used as template, 5 nM template for 30 cycles permitted accurate scoring, as shown in Figure 5B.

Survivor Assay for C/T Polymorphism of TNFα Promoter Gene. The potential of the Survivor assay was further investigated using the published polymorphism in the TNFα promoter gene at position -857. This polymorphic site was recently found to be associated with human narcolepsy.24 The 279bp fragments amplified from 13 different human genomic samples in this study were sequenced and coded to provide a blind study.

Table 2. Summary of the Mean Peak Area Ratios ± Standard Deviation of Several Homogeneous Double-Stranded DNA Samples

                         mean peak area ratios ± std dev
sample                   C/T              C/A              C/G              T/A              T/G              A/G
                         370/385          370/394          370/410          385/394          385/410          394/410

control, no enzyme       0.569 ± 0.029    0.871 ± 0.069    1.33 ± 0.06      1.53 ± 0.07      2.33 ± 0.08      1.53 ± 0.07
W338Ipd and wild-type    0.499 ± 0.015    2.14 ± 0.07      1.13 ± 0.07      4.29 ± 0.14      2.26 ± 0.10      0.529 ± 0.029
C374Spu and wild-type    0.663 ± 0.019    0.957 ± 0.024    4.75 ± 0.20      1.44 ± 0.02      7.17 ± 0.44      4.97 ± 0.28
P-reverse and mutant     0.214 ± 0.030    0.307 ± 0.043    0.544 ± 0.070    1.43 ± 0.06      2.55 ± 0.09      1.78 ± 0.06
C374Apd and wild-type    0.674 ± 0.020    3.29 ± 0.20      1.67 ± 0.13      4.89 ± 0.43      2.49 ± 0.24      0.510 ± 0.033

Table 3. Summary of the Peak Area Ratios of SNuPE Samples Containing Homogeneous and Heterogeneous Double-Stranded DNA Templates

                        peak area ratios
sample                  C/T        C/A        C/G        T/A        T/G        A/G
                        370/385    370/394    370/410    385/394    385/410    394/410

control, no template    0.890      0.947      1.85       1.06       2.08       1.96
T366pd and wild-type    2.47       0.919      1.62       0.372      0.657      1.77
V383pu and wild-type    0.843      3.02       1.54       3.58       1.83       0.510
T366pd and mutant       0.832      0.751      3.32       0.901      3.99       4.43
V383pu and mutant       0.856      0.892      6.23       1.04       7.29       6.99
T366pd and mixture      1.46       0.897      2.84       0.616      1.94       3.16
V383pu and mixture      0.844      1.52       4.74       1.80       5.61       3.11

The SNuPE
reactions were set up as described in the Experimental Section, with 4 µM of TNF851pp2 as the SNP primer, which annealed immediately downstream of the polymorphic site. The genotyping results obtained from the current method, shown in Table 4, were compared to the sequence data of the 13 amplified DNA fragments used in our Survivor assay. The sequencing results shown in column 3 of Table 4 indicate that the Survivor assay scored all 13 of the samples correctly. These results demonstrate the feasibility of using ESI-MS/MS for SNP genotyping by monitoring unreacted ddNTPs remaining in solution following a SNuPE. No difficulties were encountered in the analysis of these double-stranded genomic DNA samples, and 100% accuracy was achieved.

DISCUSSION

A novel Survivor assay composed of a SNuPE coupled to ESI-MS detection for SNP genotyping is reported. The assay detects the unreacted ddNTPs following a SNuPE instead of the extended primers that other conventional SNuPE methods detect.14-22,27 The key to the Survivor assay is to attain the maximum rate of ddNTP incorporation into the SNP primer. This is necessary to achieve adequate concentration differences between the ddNTPs in the control samples, in which no extension occurred, and the test samples, in which a SNP primer was extended by a single base, for the ESI-MS detection method to identify a SNP base. Sufficient concentration differences between the control and test samples can be ensured by limiting the initial concentration of ddNTPs in the SNuPE reaction while supplying the SNP primer in excess. The SNP primer must be added in excess in order to minimize the effects from the continual hybridization competition between extended and unextended primers. If this competition always favors an unextended primer, then the maximum rate of ddNTP incorporation into the primer will be achieved.

(27) Chen, J.; Iannone, M. A.; Li, M. S.; Taylor, J. D.; Rivers, P.; Nelsen, A. J.; Slentz-Kesler, K. A.; Roses, A.; Weiner, M. P. Genome Res. 2000, 10, 549-57.

From the ESI-MS data, the SNP base can be identified by considering the peak areas corresponding to the ddNTPs, obtained by integrating the selected ion transition chromatograms. The peak area ratio data provided in Tables 1-3 indicate that when a ddNTP is incorporated into a SNP primer, the peak area ratios involving that particular ddNTP vary dramatically from the peak area ratios of the control. Conversely, if a ddNTP in a test sample was not incorporated into a SNP primer, then the peak area ratios involving this base are similar to those of the control. In addition to considering the differences in peak area ratios between control and test samples, the percent of ddNTP consumed can also be calculated. Here, the peak areas of the test sample ddNTPs are normalized to the peak areas of the control sample ddNTPs, and a percent consumption is calculated. It has been determined that if the difference between the peak area ratios of the control and test samples is >40%, which corresponds to a base being consumed by 30% in the SNuPE reaction, then the difference is statistically significant, and a SNP base is identified. The general acceptance criterion for similar ESI-MS analyses of small compounds in the pharmaceutical community is that the accuracy and precision of quality control samples must be within 15% (28). The data provided in Table 2 show that the RSD did not exceed 15% and was typically <10%. Consequently, if the assay identifies a SNP base only when the base is consumed by >30%, SNP bases will be assigned with a high degree of confidence and accuracy.

(28) Shah, V. P.; Midha, K. K.; Dighe, S.; McGilveray, I. J.; Skelly, J. P.; Yacobi, K. A.; Layloff, T.; Viswanathan, C. T.; Cook, C. E.; McDowall, R. D.; Pittman, K. A.; Spector, S. J. Pharm. Sci. 1992, 81, 309-312.

The SNuPE efficiency is lower when double-stranded DNA template is present in the SNuPE reaction than when single-stranded DNA template is present, as shown in Figure 5. This can be explained by considering the competition that takes place, in a SNuPE reaction containing double-stranded DNA template, between the SNP primer and the complementary strand to hybridize to the target strand. When only single-stranded template is present, the competition is nonexistent and, consequently, the SNuPE efficiency is higher. This competition is why the maximum incorporation efficiency is obtained at 50 nM of double-stranded DNA template, as observed in Figure 5B, using the SNuPE conditions provided. At higher concentrations of double-stranded DNA, the excess template makes self-annealing of the template more probable than hybridization of the SNP primer to the target strand. As one would expect, increasing the SNP primer concentration from 4 to 6 µM increases the incorporation efficiency of reactions containing a high concentration of double-stranded DNA template (data not shown). Although the incorporation efficiency for double-stranded DNA template is lower than for single-stranded DNA, the SNuPE efficiency was sufficient for a SNP base to be accurately assigned using only 5 nM double-stranded template. Because typical PCR amplifications produce from 10⁻⁸ to 10⁻⁷ M PCR product (29), the Survivor assay can confidently and unambiguously assign a SNP base from double-stranded DNA template using 20 to 30 primer extension thermal cycles.

Figure 5. ESI-MS-based primer extension reaction genotyping dependence on single-stranded (A) and double-stranded (B) DNA template concentrations and cycle numbers. The reactions were performed at various concentrations of the synthetic single-stranded template A (A) or the 384-bp double-stranded template (B) with various numbers of thermal cycles. The other reaction reagents remained constant, as described in the Experimental Section.

Table 4. Results from a Blind Validation of the Polymorphism at -857 in the TNFα Promoter Gene in Human Genomic DNA Samples

genomic DNA sample #    Survivor assay results    sequence data
1                       C/C                       C/C
2                       C/C                       C/C
3                       C/T                       C/T
4                       T/T                       T/T
5                       T/T                       T/T
6                       C/C                       C/C
7                       C/C                       C/C
8                       C/C                       C/C
9                       C/C                       C/C
10                      C/T                       C/T
11                      C/T                       C/T
12                      C/C                       C/C
13                      C/C                       C/C

As expected, the ion intensity of only one base in the SNuPE reaction solution decreases for a homozygous sample, but the ion intensities for two bases decrease approximately equally for a heterozygous sample.

It was observed that some SNP sites produced a reproducibly larger percentage of consumed ddNTPs than others. For example, 80% of ddGTP was consumed using the SNP primer V383pu, but only 62% of ddCTP was consumed using the P-reverse SNP primer. This observation can be explained by the fact that the same SNuPE conditions were used for these reactions; because the primers have different sequences and melting temperature (Tm) values, variations in primer hybridization could account for the inconsistent signal intensity that was observed between different samples. In addition, a slight variation in the incorporation efficiency of two ddNTPs into the same SNP primer in heterozygous samples was observed. For instance, using the V383pu primer and the mixture of 384-bp wild-type and C374A mutant templates, 58% of ddATP was consumed, and 68% of ddGTP was consumed. This finding suggests that variation in the enzyme kinetics between the four ddNTPs could also contribute to the inconsistent incorporation rates observed; however, as long as >30% of a particular base is consumed, a SNP base can be confidently assigned. Consequently, the assay is in no way compromised by this variation in incorporation efficiency.

It was also apparent that some degradation of the ddNTPs occurs during the SNuPE reaction as a result of both the thermal cycles and the enzyme; therefore, the most reproducible results were obtained when either the SNP primer or the template, instead of the enzyme, was omitted from the SNuPE reaction mixture of the control sample. However, the overall amount of ddNTP degradation and the variation of ddNTP degradation are small.
Consequently, this degradation does not interfere with the assay so long as the control samples are subjected to the same number of thermal cycles as the test samples.

(29) Mathieu-Daude, F.; Welsh, J.; Vogt, T.; McClelland, M. Nucleic Acids Res. 1996, 24, 2080-6.

Detection of the unreacted or free solution concentrations of the four ddNTPs offers several advantages over previously described systems and conventional methods. One advantage is that MS methods are very selective and sensitive when detecting low-molecular-weight molecules. Consequently, the selectivity and sensitivity of our ESI-MS method are enhanced, as compared to other methods that detect the extended primers. Second, by detecting the relative concentrations of the free ddNTPs in solution, any SNP can be identified by quantifying the same four compounds. This greatly simplifies the detection technology that is required to identify SNPs. Additionally, quantification of free ddNTPs following SNuPE reactions is readily amenable to computer software analysis, and the data storage requirements for the SRM acquisition of only four compounds are far less than those of a full-scan MALDI-TOF acquisition (21). An SRM experiment that collects data for 5 s generates a data file of approximately 5 KB in size, as compared to an average full-scan MALDI-TOF acquisition, which generates data files on the order of 100-1000 KB. Another advantage that the Survivor assay offers is that neither labeled primers nor labeled ddNTPs are required. Furthermore, either single- or double-stranded DNA can be used as a template, thus eliminating both the need to separate the strands of DNA and any immobilization steps. In addition, the hybridization efficiency in solution is known to be greater than that of hybridization on a surface (30). Finally, ESI-MS is a concentration-sensitive method of detection and is readily amenable to miniaturization via nanoelectrospray (23).

A disadvantage of the Survivor assay is its inherent inability to multiplex, or to perform multiple SNP analyses simultaneously, which other genotyping methods can perform (31, 32). However, with the addition of NTP as an internal standard, a duplex (2-fold multiplex) reaction is possible. This would require the two SNPs being investigated simultaneously to be chosen such that no potential for overlapping genotypes exists. The peak area ratios of ddNTP to NTP would be used to score SNP bases in this duplex analysis. Although the assay is unable to perform a higher order of multiplexing, this disadvantage can be overcome by sample preparation that is readily amenable to automation and by the rapid analysis speed that ESI-MS offers.

Although the approach described here involves the off-line purification of PCR products to serve as the template in the SNuPE, a single-well reaction strategy, which couples the PCR amplification to the SNuPE and omits any off-line PCR product purification steps, is currently being developed. This single-well assay has been successfully performed using a 5-µL reaction volume. In addition, an automation platform is being developed which will enable the Survivor assay to score SNPs in a high-throughput fashion. Future work will investigate the potential of the assay to distinguish an allele in a pool of samples.

In summary, a novel SNuPE-based Survivor assay has been developed. It offers a means for rapid, sensitive, and selective scoring of SNPs using ESI-MS detection. This assay offers the simple detection of the same four free ddNTPs for the analysis of any SNP, and labeling of neither the primers nor the ddNTPs is necessary. Other advantages of this solution-phase assay include enhanced hybridization efficiencies and the elimination of immobilized primers to obtain single-stranded DNA. In addition to the inherent benefits of the assay, further advantages are introduced by the rapid, sensitive, selective, and accurate ESI-MS detection method, which the Survivor assay uses to quantitate the unreacted ddNTPs. Furthermore, the assay is easily amenable to automated computer software identification of SNP bases and has small data storage requirements, as compared to those of MALDI-TOF. With a one-well format and an automation platform in development, the Survivor assay is an attractive approach to the large-scale, high-throughput screening of SNPs for genetic analysis.
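The scoring rule described in the Discussion (normalize each test-sample ddNTP peak area to the control, compute the percent consumed, and assign a base only when more than 30% is consumed) lends itself to a short script. The following is a minimal sketch of that rule, not the authors' software. The function names and the peak areas are hypothetical; the areas were chosen only so that ddATP and ddGTP come out as 58% and 68% consumed, the heterozygous example quoted above.

```python
from typing import Dict, List

# Threshold quoted in the text: a base is scored only if >30% of its ddNTP is consumed.
CONSUMPTION_THRESHOLD_PCT = 30.0


def percent_consumed(control_areas: Dict[str, float],
                     test_areas: Dict[str, float]) -> Dict[str, float]:
    """Normalize each test-sample peak area to the control and return % consumed."""
    return {
        ddntp: 100.0 * (1.0 - test_areas[ddntp] / control_areas[ddntp])
        for ddntp in control_areas
    }


def incorporated_ddntps(consumed: Dict[str, float],
                        threshold: float = CONSUMPTION_THRESHOLD_PCT) -> List[str]:
    """Return the ddNTPs whose consumption exceeds the threshold, sorted by name."""
    return sorted(d for d, pct in consumed.items() if pct > threshold)


def zygosity(incorporated: List[str]) -> str:
    """One consumed ddNTP -> homozygous; two -> heterozygous; otherwise no call."""
    return {1: "homozygous", 2: "heterozygous"}.get(len(incorporated), "no call")


# Hypothetical peak areas (arbitrary units), NOT measured data.
control = {"ddATP": 1000.0, "ddCTP": 1000.0, "ddGTP": 1000.0, "ddTTP": 1000.0}
test = {"ddATP": 420.0, "ddCTP": 980.0, "ddGTP": 320.0, "ddTTP": 1010.0}

consumed = percent_consumed(control, test)
called = incorporated_ddntps(consumed)
print({d: round(p, 1) for d, p in consumed.items()})
print(called, zygosity(called))
```

The sketch reports which ddNTPs were incorporated; translating those into the reported allele depends on which strand the SNP primer anneals to, which is left out here. As a rough consistency check, and assuming only one ddNTP changes, a 30% drop in one peak area changes a ratio with that ddNTP in the denominator by 1/0.7 - 1, or about 43%, which is in line with the >40% ratio difference quoted in the Discussion.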

(30) Zammatteo, N.; Alexandre, I.; Ernest, I.; Le, L.; Brancart, F.; Remacle, J. Anal. Biochem. 1997, 253, 180-9.
(31) Armstrong, B.; Stewart, M.; Mazumder, A. Cytometry 2000, 40, 102-8.
(32) Ross, P.; Hall, L.; Smirnov, I.; Haff, L. Nat. Biotechnol. 1998, 16, 1347-51.

Received for review December 29, 2000. Accepted February 25, 2001.

ACKNOWLEDGMENT

We acknowledge F. Hoffmann-La Roche for providing the human genomic DNA samples, and thank Professors Jack Henion, David Wilson, and Joseph Calvo for reviewing the manuscript and for their helpful comments.

AC001549J

Analytical Chemistry, Vol. 73, No. 9, May 1, 2001
