Anal. Chem. 1999, 71, 1454-1459
Determination of Nearest Neighbors in Nucleic Acids by Mass Spectrometry
Jef Rozenski† and James A. McCloskey*,†,‡
Department of Medicinal Chemistry and Department of Biochemistry, University of Utah, Salt Lake City, Utah 84112
The identification of nearest-neighbor residues in nucleic acids provides useful constraints on establishment of base composition and sequence and is potentially applicable to a range of structural problems involving synthetic and natural polynucleotides. A new approach to this problem using electrospray ionization tandem mass spectrometry is based on measurement of precursor-product relationships derived from small fragment ions produced in the high-pressure ionization (“nozzle-skimmer”) region of the instrument. Measured mass values of dinucleotide or other fragments, which give rise to mononucleotide ions N formed in the collision cell and transmitted by the second mass analyzer, establish the identities of residues adjacent to N. The technique is applicable to RNA and DNA, whether modified or not, and is demonstrated using modified residues in nucleic acids up to the size of intact tRNA (76-mer). By monitoring of selected ion reaction channels, the method has been extended to LC/MS and to nearest-neighbor determinations directly in oligonucleotide mixtures.

The determination of nearest neighbors was a widely used method in early studies of DNA sequence and biosynthesis and for measurement of “average” dinucleotide frequencies.1 The technique is based on polymerase-mediated incorporation of specific 5′-³²P-labeled dNTP nucleotides, usually in four parallel experiments, into the growing polynucleotide chain complementary to the DNA of interest. Nuclease digestion to 3′-mononucleotides thus positions the label on the 5′-residue adjacent to the original dNTP, and the ³²P-labeled nucleotide is then identified using 2D-TLC. The method was later expanded to analysis of RNA and found application in tRNA studies, where it was used in conjunction with the conserved structures and locations of many nucleotide modifications to derive tRNA sequences from patterns of nearest neighbors (e.g., refs 2 and 3).
* Corresponding author: (phone) 801-581-5581; (e-mail) james.mccloskey@m.cc.utah.edu.
† Department of Medicinal Chemistry.
‡ Department of Biochemistry.
(1) Josse, J.; Kaiser, A. D.; Kornberg, A. J. Biol. Chem. 1961, 236, 864-875.
(2) Mazzara, G. P.; Seidman, J. G.; McClain, W. H.; Yesian, H.; Abelson, J.; Guthrie, C. J. Biol. Chem. 1977, 252, 8245-8253.
(3) Guthrie, C.; Scholla, C. A.; Yesian, H.; Abelson, J. Nucleic Acids Res. 1978, 5, 1833-1844.
1454 Analytical Chemistry, Vol. 71, No. 7, April 1, 1999

Although this approach could in principle be applied to contemporary problems of characterizing chemically synthesized DNA or RNA, the practicality of the method is limited by several factors. Because the
classical nearest-neighbor analysis is an indirect method, that is, the analyte is in fact a copy of the nucleic acid of interest, the complementary product must reflect the composition of the original nucleic acid. The latter requirement cannot be met in the case of modified nucleic acids, because the complementary nucleotide incorporated reflects only an unmodified residue in the template. For example, a modified A could correctly direct incorporation of U, or incorrectly of G, but in neither case does the nearest-neighbor analysis permit recognition of a modified A. Alternatively, the presence of a modification can terminate chain elongation altogether. Finally, classical nearest-neighbor analysis is not suitable for short oligonucleotides, owing to the requirement for a primer, or for application to mixtures.

We report here a novel approach for nearest-neighbor determination using electrospray ionization and tandem mass spectrometry, based on the analysis of fragment ions of the nucleic acid formed in the ionization region of the mass spectrometer (by so-called “nozzle-skimmer fragmentation”). These fragment ions directly indicate the nearest neighbors of any selected nucleotide, whether or not it is modified. The approach is applicable to RNA and DNA, and directly to components of oligonucleotide mixtures through the use of LC/MS/MS. Examples and leading references to analogous approaches based on nozzle-skimmer fragmentation used in conjunction with LC/MS for selective detection of protein modifications such as phosphorylation, glycosylation, and acylation are given in earlier summaries by Bean et al.4 and Annan and Carr.5

EXPERIMENTAL SECTION
Materials. Synthetic DNA and RNA oligonucleotides were prepared at the University of Utah Protein/DNA Synthesis Facility on an Applied Biosystems model 394 instrument, using standard phosphoramidite technology.
Procedures for deblocking of 2′-OH of RNA and of final sample purification were as described.6 Escherichia coli tRNAVal, ∼95% purity, was obtained from Subriden (Rolling Bay, WA), and treated to remove cations as earlier detailed.7

(4) Bean, M. F.; Annan, R. S.; Hemling, M. E.; Mentzer, M.; Huddleston, M. J.; Carr, S. A. In Techniques in Protein Chemistry VI; Crabb, J., Ed.; Academic Press: New York, 1995; pp 107-116.
(5) Annan, R. S.; Carr, S. A. J. Protein Chem. 1997, 16, 391-402.
(6) Ni, J.; Pomerantz, S. C.; Rozenski, J.; Zhang, Y.; McCloskey, J. A. Anal. Chem. 1996, 68, 1989-1999.
(7) Limbach, P. A.; Crain, P. F.; McCloskey, J. A. J. Am. Soc. Mass Spectrom. 1995, 6, 27-39.
10.1021/ac9812431 CCC: $18.00 © 1999 American Chemical Society. Published on Web 03/04/1999

Mass Spectrometry. Measurements made using simple sample infusion (all data except in Figure 5) were carried out using a Sciex (Concord, ON, Canada) API III+ instrument, operated with electrospray ionization in negative ion mode. A homemade microspray device was used, consisting of a fused-silica capillary (inner diameter 20 µm, outer diameter 90 µm, length 5 mm) glued inside a fused-silica capillary (inner diameter 180 µm, outer diameter 350 µm, length 30 mm) with epoxy glue (ITW Devcon, Danvers, MA). The latter was then assembled on a stainless steel union onto which the high voltage was applied. Nebulizing gas was not used; a countercurrent flow of ultrapure nitrogen gas (200-300 mL/min) heated to 60 °C was used for solvent declustering of ions. RNA and DNA samples (5 pmol/µL) dissolved in acetonitrile/water (1:1 v/v) were infused using a Harvard model 22 syringe pump at a flow rate of 300-500 nL/min. The needle voltage was set to -2000 to -2200 V, the orifice sampling voltage to -120 V, and the collision energy to 10-20 eV. Argon was used as collision gas at a thickness of ∼2.5 × 10¹⁵ atoms/cm². The first quadrupole mass analyzer (MS-1) was set to 50% peak resolution for singly charged ions at m/z 800. MS-2 (second quadrupole mass analyzer) was set to the mass of the nucleoside anhydrophosphate (N>p) for each nucleotide of interest. The precursor ion spectra were recorded over the range m/z 550-900 by scanning MS-1, with the dwell time set to 3 ms and the mass interval to 0.1 m/z. Depending on signal intensities, 10-50 spectra were accumulated in multichannel analysis (MCA) mode, corresponding to an acquisition time of 2-9 min and a sample consumption of 5-23 pmol.

LC/MS analysis (data in Figure 5) was performed using a Hewlett-Packard 1090 liquid chromatograph interfaced directly to a Fisons Quattro II mass spectrometer (Micromass, Beverly, MA), equipped with a standard electrospray ionization (ESI) source.
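For the infusion measurements, the quoted acquisition time and sample consumption can be checked against the scan settings. The sketch below is only a rough consistency check and not part of the published procedure: the function names are hypothetical, and it assumes that each scan steps through the full m/z 550-900 range with no inter-scan overhead and that the flow rate is constant.

```python
# Rough consistency check of the infusion acquisition figures.
# Assumptions (not stated in the text): no inter-scan overhead, and a
# constant concentration and flow rate during acquisition.

def scan_time_s(mz_start, mz_stop, step, dwell_ms):
    """Time for one MS-1 scan, stepping from mz_start to mz_stop inclusive."""
    n_points = round((mz_stop - mz_start) / step) + 1
    return n_points * dwell_ms / 1000.0

def acquisition_min(n_scans, per_scan_s):
    """Total acquisition time in minutes for n_scans accumulated scans."""
    return n_scans * per_scan_s / 60.0

def consumed_pmol(conc_pmol_per_ul, flow_nl_per_min, minutes):
    """Sample consumed at a given concentration and infusion flow rate."""
    return conc_pmol_per_ul * (flow_nl_per_min / 1000.0) * minutes

per_scan = scan_time_s(550, 900, 0.1, 3)   # 3501 points at 3 ms each
t_short = acquisition_min(10, per_scan)    # 10 accumulated spectra
t_long = acquisition_min(50, per_scan)     # 50 accumulated spectra

print(f"one scan: {per_scan:.3f} s")
print(f"10-50 scans: {t_short:.2f}-{t_long:.2f} min")
print(f"consumption over 2-9 min at 500 nL/min: "
      f"{consumed_pmol(5, 500, 2):.1f}-{consumed_pmol(5, 500, 9):.1f} pmol")
```

Under these assumptions one scan takes about 10.5 s, so 10-50 accumulated scans take about 1.75-8.75 min, in line with the quoted 2-9 min. At 5 pmol/µL and the upper flow rate of 500 nL/min, 2-9 min corresponds to 5-22.5 pmol, consistent with the quoted 5-23 pmol.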
Chromatographic solvents and parameters were used as described earlier for LC/MS analysis of oligonucleotide mixtures.8 For nearest-neighbor determination, the cone voltage was set to 150 V and the collision energy to 30 eV. For multiple reaction monitoring (MRM) data acquisition, the dwell time was set to 0.2 s/channel. In the same run, an additional scan function recorded a normal mass spectrum from m/z 300 to 700 in 0.6 s. This resulted in a total cycle time of 3.8 s.

(8) Felden, B.; Hanawa, K.; Atkins, J. F.; Himeno, H.; Muto, A.; Gesteland, R. F.; McCloskey, J. A.; Crain, P. F. EMBO J. 1998, 17, 3188-3196.
(9) McLuckey, S. A.; Habibi-Goudarzi, S. J. Am. Chem. Soc. 1993, 115, 12085-12095.
(10) Nordhoff, E.; Kirpekar, F.; Roepstorff, P. Mass Spectrom. Rev. 1996, 15, 67-138.

RESULTS AND DISCUSSION

Figure 1. Principle of nearest-neighbor determination by mass spectrometry. Upon ESI, polynucleotides containing the residue of interest N* dissociate to dinucleotide ions in the ion source region, which are identified by further dissociation in the gas collision cell to mononucleotide species N*>p. The neighbors of nucleotide residue N* are inferred from masses of the dinucleotide ions which contain the adjacent residues N1 and N2.

Table 1. Mass Value Relationships for Unmodified Nucleotide Nearest Neighbors N1 or N2

DNA(a)                    RNA(a)
N1 or N2   mass value     N1 or N2   mass value
dC         350            C          366
dT         365            U          367
dA         374            A          390
dG         390            G          406

(a) Mass values to be added to the mass of neutral nucleoside N* to provide masses of negatively charged precursor ion of the type N1pN*>p or N*pN2>p (see Figure 1).

Principle of the Nearest-Neighbor Determination. The gas-phase dissociation of polycharged oligonucleotides produces numerous types of ions in the lower m/z region of the mass spectrum, including those resulting from single cleavage of the polynucleotide chain9 which can in some cases be used for construction of “mass ladders” for sequencing.6,10 In addition, small fragment ions are formed by multiple backbone cleavages along the chain, such as dimer species N1pN*>p and N*pN2>p (see Figure 1), where N* represents any residue of interest and >p corresponds in mass to an anhydro- or cyclic phosphate of undefined structure. The identification of nucleotides adjacent to N* is based on recognition of both possible N*-containing dimeric
precursor ions, selectively detected by their further dissociation to form N*>p in the collision cell located between the two mass analyzers. Masses of the specific precursor (parent) ions of N*>p are established by scanning the first mass analyzer (MS-1), with the second mass analyzer fixed to transmit the ion of interest (N*>p) to the detector. Identities of the neighbors N1 and N2 are then inferred from the mass values of these precursor ions giving rise to N*>p. Note that only the identities of N1 and N2 and not their sequence positions relative to N* are determined.

Other small fragment ions11 might be used for measurement of N*>p precursors and thus may indirectly bear nearest-neighbor information. These include 3′ sequence ions (w series12), 5′ sequence ions (a-base series12), and various internal fragments resulting from double cleavage of the chain.13 We find that in the case of nozzle-skimmer fragmentation, however, the dinucleotide ion species (Figure 1) are among the most abundant of the smaller dissociation products that are precursors of N*>p and lend themselves to straightforward data interpretation for the purpose of nearest-neighbor identification. The interpretation of data is aided by the fixed nature of allowable mass relationships between N* and the dimer ions. These mass relationships are tabulated in Table 1 for unmodified nearest neighbors in DNA and RNA. Changes in the mass relationships can be readily calculated for cases in which the masses of the neighbors differ from the four common values.

(11) Crain, P. F.; Gregson, J. M.; McCloskey, J. A.; Nelson, C. C.; Peltier, J. M.; Phillips, D. R.; Pomerantz, S. C.; Reddy, D. M. In Mass Spectrometry in the Biological Sciences; Burlingame, A. L., Carr, S. A., Eds.; Humana Press: Totowa, NJ, 1996; pp 497-517.
(12) McLuckey, S. A.; Van Berkel, G. J.; Glish, G. L. J. Am. Soc. Mass Spectrom. 1992, 3, 60-70.
(13) Lotz, R.; Gerster, M.; Bayer, E. Rapid Commun. Mass Spectrom. 1998, 12, 389-397.

Figure 2. Determination of nearest neighbors of each of the four residues in d(AGTC) by measurement of the ion precursors of (A) dC>p (m/z 288), (B) dT>p (m/z 303), (C) dA>p (m/z 312), and (D) dG>p (m/z 328). M²⁻ denotes the oligonucleotide molecular ion (M - 2H)²⁻, and >p designates an anhydro nucleotide residue equivalent in mass to cyclic phosphate or dehydrophosphate. Sample introduced by simple infusion.

Nearest-Neighbor Measurements in Oligonucleotides. The appearance of data for scanned mass spectra (MS-1 scans, MS-2
fixed at the mass value of N*>p) is shown in Figure 2 for each of the four residues of the simple tetranucleotide d(AGTC). In panels A and B of Figure 2, the m/z 610 precursor ion corresponds to an allowable mass value for an ion of the type (pN1pN2)⁻ (ion w2 in the usual nomenclature12 representing the first two residues from the 3′ terminus), in which N1 and N2 are C and T or vice versa. The m/z 739 ions (panels C and D) are assigned as N1pN2pfuranyl, or N1pN2pf, ions (i.e., a3 - B3 ions12) formed by cleavage between the third and fourth residues with loss of the third residue base and thus representing identities of the first two residues from the 5′ end. N1 and N2 are therefore A and G or vice versa. Both of these backbone cleavage products, m/z 610 and 739, are common sequence-related ions12 and provide structure information in addition to the two main nearest-neighbor determinants shown in Figure 1. Data from mass spectra acquired in a similar fashion for four additional sequence isomers (d(CGTA), d(TCGA), d(TCAG), d(CAGT)) are tabulated in Table 2.

Nucleotides in terminal positions (residue C in Figure 2A, residue A in Figure 2C) have only one neighbor (T and G, respectively) and thus each shows only one dinucleotide precursor. In panel C the chain cleavage product a3 - B3 further confirms that G is a neighbor of A and that A is at the 5′ terminus. The two center residues (Figure 2B,D) each show two dinucleotide precursor ions, indicating the neighbors of G to be A and T and the neighbors of T to be G and C. Larger mass precursor ions, m/z 610 in panels A and B and m/z 739 in panels C and D, corroborate the nearest-neighbor assignments and constrain the total sequence to 5′-d(AGTC)-3′. In all four mass spectra, the doubly charged molecular ion is observed, the mass value of which in this simple example requires the nucleotide composition to consist of one residue each of A, G, T, and C. Similarly, mass spectra from the four tetranucleotide isomers in Table 2 show correct nearest-neighbor assignments for every residue and permit deduction of the total sequence in each case.

Extension of the method to 8-mers and to RNA was studied using the structurally related models CGAGCUCG and d(CGAGCTCG). In both cases, mass spectra recorded for each of the four principal N* residues (Table 3) showed all correct nearest-neighbor assignments, made from dinucleotide precursor ions. Similar to the tetranucleotide data in Figure 2 and Table 2, the identities of the first two residues from each terminus are revealed by w2 (3′ terminus) and a3 - B3 (5′ terminus) precursor ions for the N* residues that reside in these positions. For example, in the mass spectrum of CGAGCUCG which was used to designate the precursor ions of G>p (Table 3), m/z 667 corresponds to an allowable value of pN1pG or pGpN2, in which N1 and N2 = C. Similarly, the m/z 763 ion represents an allowable value for the 5′-terminus dimer N1pGpf or GpN2pf, where N1 and N2 = C. Taken together, these data indicate that the two terminal positions at both ends of the molecule are occupied by G and C.

A nearest-neighbor test can be made for modified residues by adjusting the mass setting for MS-2 to the value of monocharged N*>p. In the nearest-neighbor mass spectra from the 10-mer d(CCAGGCmACGC) in which the sixth residue is the sugar-methylated ribonucleoside 2′-O-methylcytidine (Table 3), the value of N*>p becomes 318.
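The expected precursor masses for any N*, modified or not, follow directly from Table 1. A minimal sketch of that arithmetic is given here. The nominal nucleoside masses and the offset of 61 between a nucleoside and its N*>p ion are not listed in the text; they are inferred from the N*>p values quoted in the Figure 2 caption and above (for example, dC>p at m/z 288 and Cm>p at m/z 318). The function names are hypothetical.

```python
# Mass increments from Table 1: added to the mass of neutral nucleoside N*
# to give the m/z of the singly charged precursor N1pN*>p or N*pN2>p.
TABLE1 = {
    "DNA": {"dC": 350, "dT": 365, "dA": 374, "dG": 390},
    "RNA": {"C": 366, "U": 367, "A": 390, "G": 406},
}

# Assumption: nominal nucleoside masses are inferred as (m/z of N*>p) - 61,
# e.g. dC>p at m/z 288 gives 227 and Cm>p at m/z 318 gives 257.
ANHYDROPHOSPHATE_OFFSET = 61
NUCLEOSIDE_MASS = {"dC": 227, "dT": 242, "dA": 251, "dG": 267, "Cm": 257}

def product_ion_mz(n_star):
    """m/z of N*>p, the value at which MS-2 is fixed."""
    return NUCLEOSIDE_MASS[n_star] + ANHYDROPHOSPHATE_OFFSET

def precursor_table(n_star, kind="DNA"):
    """Expected dinucleotide precursor m/z for each unmodified neighbor."""
    base = NUCLEOSIDE_MASS[n_star]
    return {neighbor: base + inc for neighbor, inc in TABLE1[kind].items()}

def assign_neighbors(n_star, observed_mz, kind="DNA", tol=0.5):
    """Neighbors whose expected precursor m/z matches an observed value."""
    expected = precursor_table(n_star, kind)
    return sorted(
        neighbor
        for neighbor, mz in expected.items()
        if any(abs(mz - obs) <= tol for obs in observed_mz)
    )

print(product_ion_mz("Cm"))                 # MS-2 setting for Cm
print(precursor_table("Cm"))                # allowable MS-1 values
print(assign_neighbors("Cm", [631, 647]))   # neighbors implied by two precursors
```

For Cm in a DNA context this gives allowable precursor values of m/z 607, 622, 631, and 647 for dC, dT, dA, and dG neighbors, respectively. The tolerance `tol` is an arbitrary illustrative choice, not an instrument specification.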
The precursor ions of m/z 318 established by scanning the first mass analyzer demonstrate the neighbors of Cm to be dG (from m/z 647) and dA (from m/z 631) but not dC or Cm. The mass values observed for w2 (m/z 635) in two of the four mass spectra imply that dC and dG occupy the two 3′-terminal residues, and the assignment of m/z 675 as an allowable a3 - B3 ion when N* = dC correctly implies that adjacent dC residues are at the 5′ terminus. Similarly, no w2 or a3 - B3 precursors are found when N* = Cm, because Cm does not occur in the first two positions at either terminus.

The mass spectra of additional model oligonucleotides of progressively longer chain lengths (data not shown) indicated that the diagnostic pathway represented in Figure 1 still occurred throughout the chain. This result was unexpected in view of the fact that single-cleavage product ions produced by CID using a conventional gas cell tend to decrease very greatly toward the center of the chain when the number of residues exceeds about 14-17 (ref 6). On the other hand, several factors work strongly to promote extensive oligonucleotide dissociation in the nozzle-skimmer region of the mass spectrometer. First is the fact that multicollision conditions exist, with correspondingly relatively short mean free paths.14 In addition, the multicharged nature of oligonucleotides during and following desolvation results in higher collision cross sections and greater fragmentation,9 an effect well documented for polyprotonated peptides.15 The lessened chain length effect observed for the reactions represented in Figure 1 is illustrated in Figure 3 for a 40-mer which contains a single dT in the center of the chain. The precursors of dT>p clearly demonstrate the nearest neighbors to be dC (from m/z 592) and dA (m/z 616). A test for residues adjacent to dA (Figure 3B) shows the neighbors to be dT (one occurrence at residue 21) and dG (five occurrences of -GA- and -AG- sequence isomers). The ratio of ion abundances for m/z 616 and 641 qualitatively reflects these rates of occurrence.

(14) Voyksner, R. D.; Pack, T. Rapid Commun. Mass Spectrom. 1991, 5, 263-268.
(15) Katta, V.; Chowdhury, S. K.; Chait, B. T. Anal. Chem. 1991, 63, 174-178.
(16) Sprinzl, M.; Horn, C.; Brown, M.; Ioudovitch, A.; Steinberg, S. Nucleic Acids Res. 1998, 26, 148-153.

Table 2. Nearest-Neighbor Mass Spectra from Four Isomeric Oligodeoxynucleotides N1 or N2 nearest neighbor, m/z (rel abund)a compd
N*
d(CGTA)
dC dT dA dG dC dT dA dG dC dT dA dG dC dT dA dG
d(TCGA)
d(TCAG)
d(CAGT)
C
T
A
G
616 (25), 634 (100)b
617 (100), 715 (49)c 632 (38)
616 (51), 634 (100)b 632 (13) 592 (41), 690 (100)c
617 (30), 715 (100)c
617 (30)
592 (100), 690 (15)c 641 (100), 659 (84)b 641 (76), 659 (100)b 601 (63)
617 (20) 592 (35), 690 (100)c 592 (100), 690 (16)c 601 (88)
641 (100), 659 (51)b 641 (100), 659 (62)b 601 (100), 699 (17)c 632 (77), 650 (100)b 641 (40)
601 (100), 699 (79)c 632 (51), 650 (100)b
641 (37)
a Mass values listed correspond to precursor ions N1pN*>p or N*pN2>p of each N*>p product ion, except as noted by footnotes b and c. Relative abundance values are normalized to the most abundant signal listed. b w2 ion (see text). c a3-B3 ion (see text).
Table 3. Nearest-Neighbor Mass Spectra from DNA and RNA Oligodeoxynucleotides N1 or N2 nearest neighbor, m/z (rel abund)a compd CGAGCUCG
N* C U A G
C
dC dT dA dG dC Cm dA dG
G 649 (80), 667 (27),b 763 (30)c
610 (100)
673 (100) 649 (100), 667 (27),b 763 (30)c
673 (52) dT
dA
dG 617 (100), 635 (84),b 715 (37)c
592 (45) 592 (100)
641 (100) 617 (64), 635 (27)b, 715 (100)c dC
d(CCAGGCmACGC)d
A
610 (100)
dC d(CGAGCTCG)
U
641 (43) Cm
577 (50), 675 (66)c 601 (68) 617 (86), 635 (100)b
dA 601 (39) 631 (100)
631 (100) 647 (41)
dG 617 (53), 635 (100) b 647 (70) 641 (35)
641 (34)
a Mass values listed correspond to precursor ions N1pN*>p or N*pN2>p of each N*>p product ion, except as noted by footnotes b and c. Relative abundance values are normalized to the most abundant signal reported. b w2 ion (see text). c a3 - B3 ion (see text). d Cm, nucleotide residue containing the ribonucleoside 2′-O-methylcytidine.

The largest molecule examined in the present study was the natural polymer tRNAVal from E. coli, a 76-mer of Mr 24 681 with significant double-stranded secondary structure in solution. Representative results are shown in Figure 4, to test for neighbors of the anticodon nucleoside uridine 5-oxyacetic acid (symbol cmo5U; see inset secondary structure16) Mr 318, and for dihydrouridine
(symbol D), Mr 246, both of which occur once in the sequence. The precursor ions of cmo5U>p are determined to represent U (from m/z 635) and A (from m/z 708), in agreement with the reported tRNA structure,17 which contains the anticodon loop sequence 32-C-U-cmo5U-A-C-m6A-A-38. Similarly, the mass spectrum recorded to test for neighbors of dihydrouridine (Figure 4B) shows ions corresponding to C (from m/z 612) and G (m/z 652) in accord with the sequence element 16-GDG-18 (ref 17). The abundant ion m/z 634 corresponds to a water-loss product of DpG>p, a process often observed in spectra from RNA, suggesting the participation of 2′-OH in the dehydration reaction. The presence of other ion precursors of D>p in Figure 4B is notable, but they do not detract from the dinucleotide assignments used to establish nearest-neighbor identities because they do not correspond to allowable mass values for ions of the type NpD>p, where N = A or U (m/z 636 or 613, respectively), which are the two remaining possibilities. In similar fashion, correct nearest-neighbor signals were obtained for the remaining four modified nucleotide residues whose masses differ from those of the four major nucleotides (data not shown). Pseudouridine (Ψ) is an isomer of uridine and so is not distinguished from U in the reactions represented in Figure 1. Also notable, especially in larger polynucleotides, is the relative insensitivity of nearest-neighbor reactions to the presence of cation adducts. These adduct ions can obscure the molecular ion region in the conventional mass spectrum.

(17) Kimura, F.; Harada, F.; Nishimura, S. Biochemistry 1971, 10, 3277-3283.

Figure 3. Determination of the nearest neighbors of (A) dT and (B) dA in a 40-mer oligodeoxynucleotide by measurement of the ion precursors of dT>p (m/z 303) and dA>p (m/z 312), respectively. Sample introduced by simple infusion.

Figure 4. Mass spectrum from intact E. coli tRNAVal for determination of the nearest neighbors of the single residues of (A) uridine 5-oxyacetic acid and (B) dihydrouridine. Inset: tRNA secondary structure.17 Abbreviations: cmo5U, uridine 5-oxyacetic acid; D, dihydrouridine; s4U, 4-thiouridine; m7G, 7-methylguanosine; m5U, 5-methyluridine; Ψ, pseudouridine.

Figure 5. Nearest-neighbor assignments in an unresolved mixture of five isomeric oligodeoxynucleotides using LC/MS, by selected ion monitoring of the four possible neighbors for each of the four residues (A) dC, (B) dT, (C) dA, and (D) dG and (E) UV detection (260 nm) chromatogram. Elution orders were established from the nearest-neighbor data: 1, d(CAGT); 2, d(CGTA); 3, d(TCAG); 4, d(AGTC); 5, d(TCGA).

Nearest-Neighbor Determination in Mixtures of Oligonucleotides Using LC/MS/MS. Methods for the analysis of oligonucleotides by ESI-LC/MS have in general lacked one or both of the required attributes for a practical HPLC solvent: good
chromatographic properties and efficient electrospray ionization. The recently developed solvent system based on hexafluoro-2-propanol18 fulfills these requirements for DNA,18 DNA-phosphorothioates,19 and RNA8 and thus in principle forms the basis for application of the nearest-neighbor analysis to nucleic acid mixtures. Because the mass values of diagnostic dinucleotide ions are fixed for any desired value of N* (listed in Table 1), the assay is amenable to simple selected ion monitoring for the four common (unmodified) nucleotides or to modified neighbors assuming their molecular mass values are known.

The approach is illustrated in Figure 5 for nearest-neighbor analysis of a mixture of five DNA sequence isomers which are incompletely resolved chromatographically and for which the elution orders were a priori not known. The data shown were collected from one LC/MS experiment, in which MS-2 was set sequentially within each scan cycle to represent the four desired values of N*, for each of which MS-1 was set to the four possible dinucleotide values of NpN*>p (see Figure 1). Therefore, detector responses for all 16 channels monitored were recorded every 3.8 s, permitting construction of ion chromatograms for comparison with the UV detection channel as shown in Figure 5. Although the UV detection chromatogram is ambiguous regarding the number of components present in the mixture, the ion profiles indicate five constituents, with oligonucleotide 1 as the only resolved component. Each of the four data sets (panels A-D) contains one “no response” track corresponding to the N* being tested, indicating the absence of sequences such as ...dCpdC... or ...dApdA... and requiring each oligonucleotide to consist of the composition {dC,dA,dT,dG} (a fact that would be independently determined by simple molecular mass measurements). Using the 15.35-min eluant (peak 2) as an example, the nearest-neighbor correlations show dC and dA to have only one neighbor each, namely, dG and dT, respectively, identifying them as terminal residues.
The neighbors of dT for this component are dA and dG, and those of dG are dC and dT. With no prior knowledge of the components of the mixture, the sequence of component 2 is constrained to either d(CGTA) or d(ATGC), while in terms of ordering the five components as a mixture of known structures, the second eluant is readily identified from these data as d(CGTA). Similar relationships of sequence and elution order are deduced for the remaining four oligonucleotides. In the case of longer sequences, multiple possibilities for overall sequences will result, although the nearest-neighbor identities still place significant constraints on the identity of each oligonucleotide component. This conclusion would pertain to one of the most important applications of the method, that in which the sequence environment of a modified residue, rather than of the four common nucleotides, is at issue.
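The deduction for component 2 can be sketched as a small enumeration: list every ordering of the known composition and keep those whose neighbor sets match the observed ones. This is an illustrative sketch, not the authors' procedure; the function names are hypothetical, and single letters stand for the corresponding deoxynucleotides.

```python
from itertools import permutations

def neighbors_in(seq):
    """Map each residue of seq to the set of residues adjacent to it."""
    result = {}
    for i, residue in enumerate(seq):
        adjacent = set()
        if i > 0:
            adjacent.add(seq[i - 1])
        if i < len(seq) - 1:
            adjacent.add(seq[i + 1])
        result.setdefault(residue, set()).update(adjacent)
    return result

def consistent_sequences(composition, observed):
    """All orderings of composition whose neighbor sets equal the observed ones."""
    hits = set()
    for perm in permutations(composition):
        if neighbors_in(perm) == observed:
            hits.add("".join(perm))
    return sorted(hits)

# Nearest-neighbor data for component 2 (15.35-min peak), as read in the text.
observed_peak2 = {
    "C": {"G"},
    "A": {"T"},
    "T": {"A", "G"},
    "G": {"C", "T"},
}

# With no prior knowledge, only the composition is used.
print(consistent_sequences("CATG", observed_peak2))

# With the five mixture components known, only one candidate matches.
known = ["CAGT", "CGTA", "TCAG", "AGTC", "TCGA"]
print([s for s in known if neighbors_in(s) == observed_peak2])
```

The first call returns the two orderings CGTA and ATGC, which the neighbor data alone cannot distinguish because they carry no 5′ to 3′ direction; the second leaves only CGTA. For longer sequences, or compositions with repeated residues, the same enumeration returns more candidates, which is the ambiguity noted above.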
(18) Apffel, A.; Chakel, J. A.; Fischer, S.; Lichtenwalter, K.; Hancock, W. S. Anal. Chem. 1997, 69, 1320-1325.
(19) Griffey, R. H.; Greig, M. J.; Gaus, H. J.; Liu, K.; Monteith, D.; Winniman, M.; Cummins, L. L. J. Mass Spectrom. 1997, 32, 305-313.
Received for review November 11, 1998. Accepted December 29, 1998.
CONCLUSIONS
The method described is conceptually straightforward and is particularly advantageous in several respects. First, it can be applied to a structurally wide variety of synthetic or natural nucleic acids, without prior knowledge of the chromatographic properties of the component of interest, either alone or in any sequence context. Through the use of LC/MS, the technique can in principle be applied to complex mixtures of oligonucleotides, does not involve use of radioisotopes, and is relatively rapid. The method takes advantage of the numerous collisional fragmentation products of nucleic acids formed in the high-pressure ionization region of the electrospray ion source, leading to the prospect of further exploration of the various reaction pathways and subsequent development of additional structural probes using analogous approaches.

ACKNOWLEDGMENT
This work was supported by N.I.H. Grant GM21584 and by a gift from NeXstar Pharmaceuticals. The University of Utah DNA and Peptide Core Facility is supported by N.I.H. Cancer Center Support Grant CA42014. The authors are indebted to Pamela Crain for insightful discussion and a critical reading of the manuscript.