
Bioconjugate Chem. 2003, 14 (6), 1298−1306

Aryldiazomethanes for Universal Labeling of Nucleic Acids and Analysis on DNA Chips

Ali Laayoun,*,† Mitsuharu Kotera,*,‡ Isabelle Sothier,† Emmanuelle Trévisiol,‡ Eloy Bernal-Méndez,† Cécile Bourget,‡ Lionel Menou,† Jean Lhomme,‡ and Alain Troesch†

BioMérieux, Advanced Technology/Molecular Diagnostics, Chemin de l’Orme - 69280 Marcy l’Etoile, France, and LEDSS, UMR5616 CNRS - BP53, Université Joseph Fourier, 38041 Grenoble, France. Received July 31, 2003; Revised Manuscript Received October 3, 2003

DNA and RNA labeling and detection are key steps in nucleic acid-based technologies, used in medical research and molecular diagnostics. We report here the synthesis, reactivity, and potential of a new type of labeling molecule, m-(N-Biotinoylamino)phenylmethyldiazomethane (m-BioPMDAM), that reacts selectively and efficiently with phosphates in nucleotide monomers, oligonucleotides, DNA, and RNA. This molecule contains a biotin as detectable unit and a diazomethyl function as reactive moiety. We demonstrate that this label fulfills the requirements of stability, solubility, reactivity, and selectivity for hybridization-based analysis and especially for detection on high-density DNA chips.

INTRODUCTION

In recent years, a vast amount of genetic information has been obtained from numerous genome sequencing projects. The availability of these data has boosted the development and use of nucleic acid-based technologies in biomedical research and in vitro diagnostics to identify and quantify organisms that are present in biological samples, study genome expression levels, and detect genetic mutations and polymorphisms. A typical nucleic acid-based assay usually comprises different steps, i.e., target isolation, enzymatic amplification, and detection by hybridization onto specific complementary probe(s). The analysis of target/probe hybrids is a key step in this process and generally requires the use of reporter molecules to specifically label the nucleic acids. Depending on the test format, those reporter molecules are incorporated either in the target (during the enzymatic amplification or postsynthesis) or in the detection probes. It is essential that these labeling methods do not disrupt the base pairing capability which is critical for preserving specificity during enzymatic incorporation and the subsequent hybridization detection step. This requirement is particularly important for microarrays or DNA chips, which represent one of the most promising methods for assessing genetic diversity at a large scale and a high resolution. This technology is based on the hybridization of labeled nucleic acid targets onto large sets of oligonucleotide probes (1, 2).

Enzymatic incorporation of the label, either during target amplification or postsynthetically, represents the most widely used labeling method in DNA chip analysis (3-5) but presents some drawbacks. The amplified product may become altered in the presence of modified, labeled nucleotides. Also, postsynthetic labeling methods make use of additional enzymatic steps which require precise calibration of the enzyme activity to achieve a reproducible labeling yield. Moreover, because the enzymes used depend on the target type (DNA or RNA), no specific method is universally applicable to all nucleic acids.

A more convenient alternative is to label the target chemically and a variety of chemical methods for direct modification and derivatization of nucleic acids with functional groups have been proposed (6-10). Chemical labeling generally occurs at the bases, thereby altering the hybridization properties. We have previously described a strategy for labeling RNA molecules on their terminal phosphate (11, 12). This chemical labeling takes advantage of the fragmentation step, which is required for the hybridization to high-density DNA chips (3), to attach the reporter molecule 5-(bromomethyl)fluorescein (5-BMF) to the 3′-phosphate of cleaved RNA fragments. In our model experiments, we reported higher labeling densities than with enzymatic incorporation of labeled ribonucleotides (11). The hybridization efficiency was shown to be preserved, but labeling yields on single- and double-stranded DNA were poor (13).

We now report a new type of labeling reagent, m-BioPMDAM (3) (m-(N-Biotinoylamino)phenylmethyldiazomethane), that selectively and more efficiently reacts on the phosphates of nucleic acid sequences. The molecule includes in its structure a biotin unit covalently linked to the reactive phenyldiazomethane moiety. Its synthesis is described. In a model study we show that m-BioPMDAM reacts selectively on the phosphate of a series of 3′-nucleotide monophosphates, and it can be used to label oligonucleotides and long DNA and RNA sequences, prior to their analysis on high-density DNA chips.

EXPERIMENTAL PROCEDURES

* To whom correspondence should be addressed. A.L.: Advanced Technology/Molecular Diagnostics, BioMérieux, S.A., Chemin de l’Orme - 69280 Marcy l’Etoile, France. Tel: +33 478 875 337. Fax: +33 478 875 340. E-mail: ali.laayoun@eu.biomerieux.com.
† BioMérieux.
‡ Université Joseph Fourier.

10.1021/bc0341371 CCC: $25.00 © 2003 American Chemical Society. Published on Web 10/31/2003

All commercially available chemical reagents and solvents were used without purification. D-biotin was purchased from Lancaster Synthesis (Windham, NH) or Avocado Research Chemicals (Heysham, UK). MnO2 and Celite-545 were purchased from Merck (Fontenay-sous-Bois, France), molecular sieves (3 Å, powder) from Acros Organics (Noisy-le-Grand, France), boric acid from Eurobio (Les Ulis, France), and 3′-GMP from ICN Biomedicals (Costa Mesa, CA). All other nucleotides were from Sigma (Saint Quentin-Fallavier, France).

TLC: Merck silica gel 60 F254 plates. Preparative reverse phase column chromatography: Merck LiChroprep RP 18 silica gel (40-63 µm). Mp: Electrothermal Serie IA9100 apparatus, uncorrected. UV: Perkin-Lambda 15 UV/VIS. IR: Perkin-Elmer Impact 400 spectrophotometer. NMR: Bruker AC 200, Avance 300 spectrometers. For 1H NMR, CDCl3 as solvent, δH = 7.24, DMSO-d6 as solvent, δH = 2.49; for 13C NMR, CDCl3 as solvent, δC = 77.5, DMSO-d6 as solvent, δC = 39.5. Mass spectra: DelsiNermag R10-10 or VG Platform (Micromass) spectrometers.

Synthesis of m-BioPMDAM (3). 3-(N-Biotinoylamino)acetophenone (1). To a solution of D-biotin (1.0 g, 4.1 mmol) in dry DMF (45 mL) cooled to 0 °C under argon were added successively N-methylmorpholine (590 µL, 5.33 mmol) and isobutyl chloroformate (840 µL, 6.60 mmol). The solution was stirred for 30 min, and then 3-aminoacetophenone (824 mg, 6.10 mmol) and N-methylmorpholine (480 µL, 4.35 mmol) in 10 mL of DMF were added. The solution was stirred at 0 °C for 2 h, and then the solvent was removed under vacuum. The residue was dissolved in 3 mL of MeOH, and then 50 mL of water were added. The resulting precipitate was filtered, washed successively with water, CH2Cl2, and ether, and dried to give 1.2 g of crude 1. Recrystallization from MeOH-water gave 1 (1.01 g, 70%) as a white powder. Mp 145 °C; IR (KBr): ν = 3280, 2931, 2857, 1691, 1590, 1540, 1487, 1434, 1298, 1266 cm-1; 1H NMR (300 MHz, DMSO-d6): δ = 1.3-1.7 (m, 6H), 2.33 (t, J = 8 Hz, 2H), 2.55 (s, 3H), 2.58 (d, J = 12 Hz, 1H), 2.83 (dd, J = 12 and 5 Hz, 1H), 3.13 (m, 1H), 4.15 (m, 1H), 4.31 (m, 1H), 6.34 (s, 1H), 6.41 (s, 1H), 7.44 (t, J = 8 Hz, 1H), 7.64 (d, J = 8 Hz, 1H), 7.85 (d, J = 8 Hz, 1H), 8.17 (s, 1H), 10.05 (s, 1H).
13C NMR (50 MHz, DMSO-d6): δ = 197.5, 171.3, 162.6, 139.5, 137.1, 128.9, 123.3, 122.8, 118.0, 60.9, 59.0, 55.2, 39.6, 36.0, 28.0, 27.9, 26.5, 24.9; MS (FAB/glycerol): m/z 362 [M + H]+.

3-(N-Biotinoylamino)acetophenone Hydrazone (2). To a solution of 1 (500 mg, 1.38 mmol) in absolute ethanol (8 mL) was added 200 µL (4.15 mmol) of hydrazine monohydrate. The solution was refluxed for 2 h. After cooling, the white precipitate was filtered, washed with water and ether, and then dried to give 2 (385 mg, 74%) as a white powder. Mp 185 °C; IR (KBr): ν = 3298, 2931, 2857, 1698, 1665, 1626, 1541, 1494, 1470, 1446, 1330, 1265 cm-1; 1H NMR (300 MHz, DMSO-d6): δ = 1.3-1.7 (m, 6H), 1.98 (s, 3H), 2.26 (t, J = 8 Hz, 2H), 2.56 (d, J = 12 Hz, 1H), 2.81 (dd, J = 12 and 5 Hz, 1H), 3.11 (m, 1H), 4.13 (m, 1H), 4.29 (m, 1H), 6.39 (s, 3H), 6.42 (s, 1H), 7.22 (m, 2H), 7.50 (d, J = 8 Hz, 1H), 7.84 (s, 1H), 9.82 (s, 1H). 13C NMR (75 MHz, DMSO-d6): δ = 171.6, 163.1, 142.2, 140.6, 139.5, 128.7, 120.0, 118.2, 115.9, 61.4, 59.6, 55.8, 40.2, 36.6, 28.6, 28.4, 25.5, 11.7; MS (FAB/glycerol): m/z 376 [M + H]+.

3-(N-Biotinoylamino)phenylmethyldiazomethane (3, m-BioPMDAM). To a solution of 2 (180 mg, 0.48 mmol) in DMF (2 mL) was added MnO2 (340 mg, 3.9 mmol). After 30 min of stirring at room temperature, the reaction mixture was filtered through a pad of Celite (0.5 cm thickness) and molecular sieves 3 Å (powder, 0.5 cm thickness). The reaction mixture was concentrated in vacuo to 0.5 mL, and then 5 mL of ether was added. The resulting precipitate was filtered, washed with ether, and dried to give 3 (170 mg, 95%) as a pink powder. Mp 160 °C; IR (KBr): ν = 3278, 2935, 2859, 2038, 1704, 1666, 1605, 1577, 1536, 1458, 1430, 1263 cm-1; 1H NMR (300 MHz, DMSO-d6): δ = 1.3-1.7 (m, 6H), 2.11 (s, 3H), 2.28 (t, J = 8 Hz, 2H), 2.57 (d, J = 12 Hz, 1H), 2.81 (dd, J = 12 and 5 Hz, 1H), 3.11 (m, 1H), 4.13 (m, 1H), 4.29 (m, 1H), 6.33 (s, 1H), 6.41 (s, 1H), 6.60 (m, 1H), 7.25 (m, 3H), 9.84 (s, 1H). 13C NMR (75 MHz, DMSO-d6): δ = 171.6, 163.1, 140.5, 132.8, 129.7, 116.4, 114.6, 111.7, 61.4, 59.6, 55.7, 51.7, 40.2, 36.6, 28.6, 28.4, 25.4, 9.8; MS (FAB/thioglycerol): m/z 346 [M + H - N2]+.

Stability of m-BioPMDAM (3). The purity of 3 was evaluated by 1H NMR in DMSO-d6 from the integration ratio between the multiplet signals at 6.60 ppm and at 4.13 ppm (relaxation delay: 10 s). The initial purity of m-BioPMDAM (3) prepared above was 80-90%. The stability was evaluated in the same manner using a DMSO-d6 solution of 3 (30 mM).

Alkylation of Nucleotide Monomers with m-BioPMDAM (3). Analytical. Each nucleotide monomer (0.04 mM), 3′-UMP, 3′-AMP, 3′-CMP, or 3′-GMP, was incubated with 2 mM m-BioPMDAM in a mixture of DMSO:CH3CN:H2O (1:3:1, vol/vol/vol) containing 2 mM H3BO3 (pH 7.3) at 60 °C. Reaction mixtures (250 µL) were then washed with CH2Cl2, and alkylation rates were estimated using capillary electrophoresis.

Capillary Electrophoresis (CE) Procedure. CE experiments were performed with a Beckman P/ACE 5000 capillary electrophoresis instrument (Beckman Coulter, Fullerton, CA). An untreated fused silica capillary (75 µm × 50 cm) was used. The applied voltage was 30 kV (normal polarity), and the capillary temperature was maintained at 23 °C. The electropherograms were recorded at 254 nm. Borate buffer solution (0.1 M, pH 8.3) was prepared from boric acid by adjusting the pH with NaOH solution and was filtered through a 0.2 µm filter. Samples were injected by pressure (0.5 psi, 5 s). Before each run, the capillary was regenerated by flushing successively with NaOH solution (0.1 N, 2 min), pure water (2 min), and borate buffer (2 min) under pressure (20 psi).

Preparative Run (4).
A mixture of 9.3 mg (21 µmol) of 3′-UMP (disodium salt, tetrahydrate), 2 mL of H3BO3 solution (0.1 M), 2 mL of CH3CN, 6 mL of methanol, and 75 mg (200 µmol) of m-BioPMDAM (3) was stirred at room temperature for 2.5 h. To the resulting mixture were added CH2Cl2 (30 mL) and H2O (3 mL). The aqueous phase was separated, washed twice more with CH2Cl2 (30 mL), and then concentrated. The crude product was purified by reverse-phase silica gel chromatography (LiChroprep RP 18 silica gel, 40-63 µm, eluant: 0-30% MeOH/H2O) to afford 10 mg (69%) of solid 4. The designations “a” and “b” in the 1H NMR data refer to the two sets of signals for two diastereoisomers. 1H NMR (300 MHz, D2O): δ = 1.35-1.75 (m, 6H, CH2-biotin), 1.46 and 1.48 (2d, J = 7 Hz, 3H, CH3), 2.36 and 2.37 (2t, J = 7 Hz, CH2-biotin), 2.70 (d, J = 13 Hz, 1H, CH2-biotin), 2.92 and 2.93 (2dd, J = 13 and 5 Hz, 1H, CH2-biotin), 3.28 (m, 1H, CH-biotin), 3.33 (m, 1H, Ha-2′), 3.47 and 3.57 (ABX, J = 13, 4, and 2 Hz, 2H, Hb-5′,5′′), 3.69 and 3.80 (ABX, J = 13, 4, and 3 Hz, 2H, Ha-5′,5′′), 3.84 (m, 1H, Hb-4′), 3.99 (t, J = 5 Hz, 1H, Hb-2′), 4.06 (m, 1H, Ha-4′), 4.10 (m, 1H, Ha-3′), 4.19 (m, 1H, Hb-3′), 4.36 (m, 1H, CH-biotin), 4.52 (m, 1H, CH-biotin), 5.23 (m, 1H, CHOP), 5.52 (d, J = 3 Hz, 1H, Ha-1′), 5.62 (d, J = 5 Hz, 1H, Hb-1′), 5.75 and 5.76 (2d, J = 8 Hz, 1H, H-5), 7.13-7.22 (m, 1H), 7.24-7.34 (m, 2H), 7.44 (s broad, 1H), 7.65 and 7.68 (2d, J = 8 Hz, 1H, H-6); 31P NMR (81 MHz, D2O): δ = -0.38, -0.52; ESI-MS: m/z 670 [M + H]+.

Adducts 5-7. These adducts were obtained by the same procedure as described above for 4.

Adduct 5: yield 12 mg (60%). 1H NMR (300 MHz, D2O): δ = 1.3-1.7 (m, 6H, CH2-biotin), 1.45 and 1.48 (2d, J = 7 Hz, 3H, CH3), 2.37 (t, J = 7 Hz, CH2-biotin), 2.70 (d, J = 13 Hz, 1H, CH2-biotin), 2.92 and 2.93 (2dd, J = 13 and 5 Hz, 1H, CH2-biotin), 3.20 (m, 1H, Ha-2′), 3.27 (m, 1H, CH-biotin), 3.47 and 3.60 (ABX, J = 13, 4 and 2 Hz, 2H, Hb-5′,5′′), 3.69 and 3.82 (ABX, J = 13, 4 and 2 Hz, 2H, Ha-5′,5′′), 3.86 (m, 1H, Hb-4′), 3.93 (t, J = 5 Hz, 1H, Hb-2′), 4.04 (m, 2H, Ha-3′ and Ha-4′), 4.13 (m, 1H, Hb-3′), 4.34 (m, 1H, CH-biotin), 4.52 (m, 1H, CH-biotin), 5.22 (m, 1H, CHOP), 5.52 (d, J = 3 Hz, 1H, Ha-1′), 5.65 (d, J = 5 Hz, 1H, Hb-1′), 5.90 and 5.91 (2d, J = 8 Hz, 1H, H-5), 7.10-7.20 (m, 1H), 7.21-7.35 (m, 2H), 7.43 and 7.46 (2s broad, 1H), 7.58 and 7.61 (2d, J = 8 Hz, 1H, H-6); 31P NMR (121 MHz, D2O): δ = -0.38, -0.49; ESI-MS: m/z 669 [M + H]+.

Adduct 6: yield 14 mg (74%). 1H NMR (300 MHz, D2O): δ = 1.20-1.63 (m, 6H, CH2-biotin), 1.45 and 1.49 (2d, J = 7 Hz, 3H, CH3), 2.27 and 2.29 (2t, J = 7 Hz, CH2-biotin), 2.64 and 2.66 (2d, J = 13 Hz, 1H, CH2-biotin), 2.83 and 2.85 (2dd, J = 13 and 5 Hz, 1H, CH2-biotin), 3.34 (m, 1H, CH-biotin), 3.47 and 3.62 (ABX, J = 13, 3 and 2 Hz, 2H, Hb-5′,5′′), 3.70 and 3.83 (ABX, J = 13, 4 and 2 Hz, 2H, Ha-5′,5′′), 3.80 (t, J = 4 Hz, 1H, Ha-2′), 3.96 (m, 1H, Hb-4′), 4.18 (m, 1H, Ha-4′), 4.24 (m, 1H, CH-biotin), 4.36 (m, 1H, Ha-3′), 4.41-4.50 (m, 3H, Hb-3′, CH-biotin, Hb-2′), 5.24 (m, 1H, CHOP), 5.62 (d, J = 5 Hz, 1H, Hb-1′), 5.65 (d, J = 4 Hz, 1H, Ha-1′), 7.06-7.20 (m, 2H), 7.28 (m, 1H), 7.34 and 7.42 (2s broad, 1H), 8.09 and 8.09 (2s, 1H, H-8), 8.12 (s, 1H, H-2); 31P NMR (121 MHz, D2O): δ = -0.27, -0.49; ESI-MS: m/z 693 [M + H]+.

Adduct 7: yield 10 mg (65%).
1H NMR (300 MHz, D2O): δ = 1.24-1.69 (m, 6H, CH2-biotin), 1.46 and 1.49 (2d, J = 7 Hz, 3H, CH3), 2.27 and 2.30 (2t, J = 7 Hz, CH2-biotin), 2.64 and 2.67 (2d, J = 13 Hz, 1H, CH2-biotin), 2.84 and 2.85 (2dd, J = 13 and 4 Hz, 1H, CH2-biotin), 3.17 (m, 1H, CH-biotin), 3.50 and 3.62 (ABX, J = 13, 4 and 2 Hz, 2H, Hb-5′,5′′), 3.70 and 3.82 (ABX, J = 13, 4 and 2 Hz, 2H, Ha-5′,5′′), 3.71 (m, 1H, Ha-2′), 3.95 (m, 1H, Hb-4′), 4.13 (m, 1H, Ha-4′), 4.27 (m, 1H, CH-biotin), 4.37 (t, J = 5 Hz, 1H, Hb-2′), 4.45 (m, 1H, Ha-3′), 4.48 (m, 1H, CH-biotin), 4.51 (m, 1H, Hb-3′), 5.24 (m, 1H, CHOP), 5.50 (d, J = 3 Hz, 1H, Ha-1′), 5.51 (d, J = 5 Hz, 1H, Hb-1′), 7.12-7.23 (m, 2H), 7.26-7.29 (m, 1H), 7.38 and 7.49 (2s broad, 1H), 7.74 (s, 1H, H-8); 31P NMR (121 MHz, D2O): δ = -0.23, -0.34; ESI-MS: m/z 709 [M + H]+.

Evaluation of Diazo Labeling of Oligonucleotides. The three oligonucleotide sequences, fragments of the Mycobacterium tuberculosis (Mtb) 16S rRNA sequence, were 5′-ACACCCTCTCAGGCCGGCTACCCGTCGTCGCCTTGGTAGGCC-3′, 5′-CCGTCGTCGCCTTGGTAGGCCGTCACCCCACCAACAAGCT-3′, and 5′-GTCACCCCACCAACAAGCTGATAGGCCGCGGGCTCATCCCACACCG-3′. These DNA oligomers, bearing fluorescein at their 5′-end, were obtained from Eurogentec (Seraing, Belgium). Five picomoles of each oligonucleotide were treated with 2, 5, 10, and 20 mM m-BioPMDAM at 95 °C, in pure water, for 10 min. After the labeled targets were hybridized to the Mycobacterium DNA chip (15), the 5′-fluorescein label was detected and the emitted signal was denoted S1. The chip was then stained for 10 min at room temperature with 6 µL (6 µg) of an anti-biotin antibody (Rockland Immunochemical, Gilbertsville, PA) labeled with fluoresceins (two labels per antibody) and diluted in 0.05 M Tris pH 7, 0.5 M NaCl, 0.02% Tween-20, 500 µg/mL BSA. After it was washed, the chip was scanned and the registered signal was denoted S2. In all cases, the background was subtracted. The difference between the two signal intensities (taking into account the number of fluoresceins per anti-biotin antibody) was used to estimate the number of biotin labels per oligonucleotide.

Nucleic Acid Targets Preparation. Preparation and amplification of the Mtb 16S rRNA locus was carried out from freshly grown colonies, according to the procedure described by Troesch et al. (15) with the following modifications. PCR amplification was carried out in a 50 µL reaction volume using the Fast Start Taq DNA polymerase (Roche Molecular Biochemicals, Mannheim, Germany), 200 µM of each deoxyribonucleotide triphosphate (Promega, Madison, WI), and 0.3 µM of primers. PCR was performed in a Perkin-Elmer 9700 thermal cycler with an initial denaturation step at 94 °C for 5 min and cycling conditions of 94 °C for 30 s, 68 °C for 30 s, 72 °C for 45 s for 35 cycles, and 72 °C for 7 min for the last cycle. The promoter-tagged PCR amplicons were used for generating single-stranded RNA targets by in vitro transcription. Each 20-µL reaction mixture contained 8 µL of PCR product (approximately 50 ng). Transcription was carried out using the in vitro transcription kit MEGAscript (Ambion, Austin, TX). The reaction was performed at 37 °C for 2 h. Genomic DNA from Mtb was extracted from a freshly grown liquid culture by a universal lysis protocol (18) and using Genomic-tip 500/G (Qiagen, Venlo, The Netherlands), following the manufacturer’s instructions.

Labeling of RNA. Labeling During Cleavage (LDC). Five microliters (0.04 µg) of in vitro transcripts was incubated at 60 °C for 10 min in labeling buffer (I) containing 30 mM imidazole, 5 mM MnCl2, 2 mM m-BioPMDAM or at 60 °C for 30 min in labeling buffer (II) containing 6 mM imidazole, 60 mM MnCl2, 2 mM 5-(bromomethyl)fluorescein (5-BMF) (Molecular Probes, Eugene, OR).

Labeling Plus Cleavage (LPC).
Conditions were the same as for the LDC protocol, except that labeling and cleavage were done sequentially: 10 min at 60 °C for labeling with 2 mM m-BioPMDAM, followed by 10 min at 60 °C for cleavage with 30 mM imidazole, 5 mM MnCl2, or 10 min at 60 °C for labeling with 2 mM 5-BMF, followed by 10 min at 60 °C for cleavage with 6 mM imidazole, 60 mM MnCl2.

Labeling of DNA. Labeling During Cleavage (LDC). Ten microliters of DNA amplicons (0.75 µg) were incubated at 95 °C for 10 min in 3 mM HCl, 10 mM m-BioPMDAM or in 3 mM HCl, 10 mM 5-BMF.

Labeling Plus Cleavage (LPC). For LPC of DNA, conditions were the same as for LDC, except that labeling and fragmentation were done sequentially: 10 min at 95 °C for labeling with 10 mM m-BioPMDAM or 10 mM 5-BMF, followed by 10 min at 95 °C for cleavage in 3 mM HCl. Labeling of 10 µL of DNA amplicons, in the LPC procedure, with Biotin-Chem-Link distributed by Roche Molecular Biochemicals (Mannheim, Germany) and “Label IT Biotin” reagent from Mirus (Madison, WI) was carried out according to the supplier’s instructions. The fragmentation was achieved in 3 mM HCl, 10 min at 95 °C.

Labeling Plus Cleavage of Genomic DNA. gDNA of Mtb (10 µg) was labeled in 30 mM m-BioPMDAM at 95 °C for 25 min and then fragmented by 10 mM HCl at 95 °C for 5 min.

Purification of Labeled Targets. For all labeling protocols, unreacted label was removed prior to the hybridization step by means of silica membrane purification, 6S Qiavac columns (Qiagen, Venlo, The Netherlands), according to the manufacturer’s instructions.

Probe Array Hybridization and Analysis. The Mycobacterium DNA chip (bioMerieux, Marcy-l’Etoile, France) is divided into 23000 specific 35 by 35 µm synthesis sites, over a 5.25- by 5.25-mm area. The four-probe interrogation strategy, similar to that described by Troesch et al. (15), was used to identify sequence variation in 16S rRNA and rpoB loci. Briefly, for every base interrogated within a given reference sequence, four probes of equal lengths are synthesized on the chip (usually 20mers). Those four probes are identical except at the interrogation position (centrally located within the probe), thus representing the perfect hybridization match and the three possible mismatches. Base calls are determined by comparing the signal intensity of the labeled target for the four probes. The DNA chips used in the study were manufactured by Affymetrix (Santa Clara, CA).

One hundred microliters of the labeled and purified nucleic acid fragments were added to 400 µL of hybridization buffer (15), containing 0.9 M NaCl, 60 mM NaH2PO4 pH 7.4, 6 mM EDTA, 0.05% Triton X-100, 3 M betaine, 5 mM DTAB (dodecyl trimethylammonium bromide), and denatured at 95 °C for 10 min. DNA fragments labeled with “Label IT® Biotin” reagent were denatured with denaturation buffers according to the supplier’s protocol. Hybridization onto the Mycobacterium DNA chip was performed at 45 °C for 30 min, and the chip was washed twice in 0.45 M NaCl, 30 mM NaH2PO4, 3 mM EDTA pH 7.4, 0.005% Triton X-100 at 30 °C (15). For biotinylated fragments, the DNA chip was stained for 10 min with a solution made of 600 µL of 0.05 M Tris pH 7, 0.5 M NaCl, 0.02% Tween-20, 500 µg/mL BSA and 6 µL of Streptavidin-R-Phycoerythrin conjugate (300 µg/mL; DakoCytomation, Glostrup, Denmark). After a washing step in 0.9 M NaCl, 0.06 M NaH2PO4, 6 mM EDTA pH 7.4, 0.05% Triton X-100, the fluorescent signal emitted by the target bound to the DNA chip was detected by a GeneArray scanner (Agilent, Palo Alto, CA), at a pixel resolution of 3 µm. Probe array cell intensities, nucleotide base calls, and reports were generated by functions available on the GeneChip software (Affymetrix, Santa Clara, CA).

Figure 1. IR (a) and 1H NMR (b) spectra of m-BioPMDAM (3).

Scheme 1

Figure 2. Capillary electrophoresis (CE) study of the alkylation of the nucleotide monophosphates with m-BioPMDAM (3).

RESULTS AND DISCUSSION
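The base-call (BC) percentages reported in this section derive from the four-probe interrogation strategy described in Experimental Procedures. The following is a minimal Python sketch of that idea, not the GeneChip software's actual algorithm: the function names, the `min_ratio` ambiguity threshold, and the example intensities are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

BASES = "ACGT"


def call_base(intensities: Dict[str, float], min_ratio: float = 1.2) -> str:
    """Call the base whose probe gives the strongest signal at one position.

    `intensities` maps each of the four probe bases to its background-corrected
    signal. `min_ratio` is a hypothetical threshold (not taken from the paper):
    if the best probe is not at least this many times brighter than the
    runner-up, the position is reported as undetermined ("n").
    """
    ranked = sorted(BASES, key=lambda base: intensities[base], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if intensities[best] <= 0:
        return "n"
    if intensities[best] < min_ratio * intensities[runner_up]:
        return "n"
    return best


def base_call_percentage(probe_sets: List[Dict[str, float]], reference: str) -> float:
    """Percent of interrogated positions whose call matches the reference."""
    if not reference or len(probe_sets) != len(reference):
        raise ValueError("one probe set is needed per reference position")
    calls = [call_base(probe_set) for probe_set in probe_sets]
    matches = sum(call == ref for call, ref in zip(calls, reference))
    return 100.0 * matches / len(reference)


# Made-up intensities for a three-base reference "ACG".
example = [
    {"A": 100.0, "C": 10.0, "G": 12.0, "T": 8.0},  # clear A
    {"A": 5.0, "C": 90.0, "G": 7.0, "T": 6.0},     # clear C
    {"A": 50.0, "C": 48.0, "G": 3.0, "T": 2.0},    # A and C too close: "n"
]
bc = base_call_percentage(example, "ACG")
print(f"base call: {bc:.1f}%")
```

In this toy case two of the three positions are called correctly, so poor discrimination between the match and mismatch probes lowers the BC percentage even when the overall signal is high.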

Most electrophilic reagents (halides, sulfates, sulfonates, epoxides, etc.) react with nucleotides and nucleic acids essentially at the oxygen or nitrogen nucleophilic sites of the nucleobases (14). Design of the labeling reagent m-BioPMDAM was based on preliminary model studies which had shown that aryldiazomethanes did react at the phosphate group with nucleotide monophosphates (unpublished results). m-BioPMDAM (3) was thus prepared; it includes in its structure the reactive phenylmethyldiazomethane unit (PMDAM) covalently linked through an amide bond to the biotin (Bio) moiety.

Synthesis of the m-BioPMDAM Reagent (3). m-BioPMDAM 3 was prepared from commercial 3-aminoacetophenone and D-biotin with a 49% global yield (Scheme 1). Coupling of the two entities was accomplished in DMF using isobutyl chloroformate as coupling reagent in the presence of N-methylmorpholine (70% yield). The methyl ketone function of the resulting conjugate 1 was transformed into the hydrazone 2 by warming with hydrazine monohydrate in ethanol in 74% yield. MnO2 oxidation of 2 yielded the diazo reagent m-BioPMDAM 3 as a pink powder that was characterized notably by the 1H and 13C NMR spectra (Figure 1b). It exhibited in the IR the characteristic vibration at 2038 cm-1 (Figure 1a).

Simple alkyldiazoalkanes are unstable compounds. Substitution by aryl and electron-attracting groups generally increases the stability of the diazo function. It was thus essential to test the stability of the m-BioPMDAM reagent. Stability in solution was evaluated by 1H NMR spectroscopy. The half-life of m-BioPMDAM in DMSO-d6 solution (30 mM) at room temperature is t1/2 = 3 days.

Reaction of m-BioPMDAM with 3′-Nucleotide Monophosphates. To study the site selectivity of the reaction of m-BioPMDAM, we examined its reaction with representative nucleotide phosphates that incorporate the four different nucleobases, i.e., 3′-AMP, 3′-GMP, 3′-CMP, 3′-UMP. Reactions were most conveniently monitored at the analytical stage using capillary electrophoresis. To dissolve both the charged hydrophilic nucleotides and the lipophilic m-BioPMDAM reagent, a homogeneous mixture of solvents (DMSO:CH3CN:H2O) was used. The reaction was run under buffered conditions at pH 7.3 using a large excess of m-BioPMDAM label. Figure 2 shows the evolution of the reactions run at 60 °C with the different nucleotides as monitored by capillary electrophoresis. The reaction profiles are quite similar for the four nucleotides examined. A single major (or sole) reaction product was formed, and the kinetics of the reactions were quite comparable. In the experimental conditions used, the time required for 50% transformation of all nucleotides was about 15 min, and transformation was complete after 130 min.

The reaction products were identified by running the reactions at the preparative scale (20 µmol-scale). They were isolated and characterized by their analytical and spectroscopic data. The adducts were obtained in 70% average yield after purification, and they exhibited in 1H NMR spectra all peaks corresponding to the indicated structures (Figure 3).

Figure 3. 1H NMR spectra of the alkylation products with m-BioPMDAM (3) (a) of the 3′-UMP, (b) of the 3′-CMP, (c) of the 3′-AMP, and (d) of the 3′-GMP.
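The yield, half-life, and conversion figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes, which the paper does not state, that decomposition of the reagent in DMSO and alkylation under a large excess of reagent both follow simple first-order kinetics; the function names are illustrative only.

```python
import math
from typing import Iterable


def overall_yield(step_yields: Iterable[float]) -> float:
    """Overall yield of a linear synthesis as the product of the step yields."""
    return math.prod(step_yields)


def fraction_remaining(elapsed: float, half_life: float) -> float:
    """First-order decay (assumed): fraction left after `elapsed`.

    `elapsed` and `half_life` must be in the same time unit.
    """
    return 0.5 ** (elapsed / half_life)


def fraction_converted(elapsed: float, half_time: float) -> float:
    """Pseudo-first-order conversion (assumed) given the 50% conversion time."""
    return 1.0 - fraction_remaining(elapsed, half_time)


# Step yields from the text: coupling (70%), hydrazone (74%), oxidation (95%).
synthesis = overall_yield([0.70, 0.74, 0.95])
# Reagent left after 1 day in DMSO-d6, using the reported t1/2 of 3 days.
left_after_1_day = fraction_remaining(1.0, 3.0)
# Nucleotide converted after 130 min, using the reported 15 min for 50%.
converted_130_min = fraction_converted(130.0, 15.0)

print(f"overall yield: {synthesis:.1%}")
print(f"reagent left after 1 day: {left_after_1_day:.1%}")
print(f"conversion at 130 min: {converted_130_min:.2%}")
```

Under these assumptions the product of the three step yields is about 49%, consistent with the global yield quoted in the text, and more than 99% conversion is expected at 130 min, in line with the observation that transformation was complete by then.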
We note the presence of two diastereoisomers in a 1/1 ratio as reflected by the splitting of the sugar protons (H1′, H2′, H3′, H4′, H5′-5′′). The CH proton vicinal to the phosphate appears at 5.2 ppm. The proton-decoupled phosphorus NMR spectra equally indicate the presence of two diastereoisomers giving rise to two peaks:at δ ) -0.23 and -0.34 ppm for the adduct with 3′-GMP, to be compared with the δ ) +4.0 ppm value observed for 3′-GMP nucleotide. The same observation was done for all adducts. Reaction of m-BioPMDAM with Oligonucleotides. Reaction of m-BioPMDAM with three synthetic oligodeoxyribonucleotides, fragments of a 16S rRNA Mycobacterium tuberculosis (Mtb) sequence containing a fluorescein label at their 5′-end, was studied. A mix of the three oligonucleotides was labeled with different concentrations of m-BioPMDAM, purified, and then hybridized onto a Mycobacterium DNA chip (15). Median intensity of fluorescence signals emitted by the hybridized oligonucleotides, before and after a “posthybridization” staining step (with anti-biotin antibody bearing fluoresceins), were measured (see Experimental Procedures). The comparison of the two intensity values showed that approximately 3 of every 10 phosphates were labeled in our model experiments, suggesting that selective alkylation by m-BioPMDAM occurred on internucleotidic phosphates as well. This is substantially more than the one label per nucleic acid fragment indicated as the best yield in prior work with 5-(bromomethyl)fluorescein and iodoacetamido-fluorescein as labeling reagents (11-13). Labeling of RNA and DNA with m-BioPMDAM. We further evaluated the labeling efficiency on longer DNA and RNA targets generated by enzymatic amplifi-

Laayoun et al. Table 1. Chip Analysis of DNA Amplicons and RNA Transcripts, from 16S rRNA M. tuberculosis Locus, Labeled with 5-BMF and m-BioPMDAM chip resultsa

labeling procedure RNA

LDCb LPCc

DNA

LDC LPC

m-BioPMDAM 5-BMF m-BioPMDAM 5-BMF m-BioPMDAM 5-BMF m-BioPMDAM 5-BMF

BC (%)

median intensity (rfu)

signal/ background

97.3 95.7 96.8 90.8 99.5 62.7 100 0

20968 11058 13402 9312 2538 157 4700 18

43.0 31.0 30.0 13.0 4.5 1.5 10.7 0.2

a Chip results are given in terms of base call (BC) percentage (percent of homology between the experimentally derived sequence and the reference sequence tiled on the array), median signal intensity (relative fluorescence unit), and signal/background ratios. b LDC: labeling during cleavage. c LPC: labeling plus cleavage.

Table 2. Chip Analysis of DNA Amplicons Labeled with m-BioPMDAM and Two Existing Chemical Methods chip resultsa

labeling procedure DNA

LPCb

m-BioPMDAM Biotin-Chem-Link Label IT Biotin

BC (%)

median intensity (rfu)

signal/ background

98.4 85.4 38.4

4274 1175 87

13.0 3.2 3.5

a Chip results are given in terms of base call (BC) percentage (percent of homology between the experimentally derived sequence and the reference sequence tiled on the array), median signal intensity (relative fluorescence unit), and signal/background ratios. b LPC: labeling plus cleavage.

cation (16S rDNA hypervariable region of Mtb, 202 nt) (15), prior to their hybridization to the Mycobacterium DNA chip. Cleavage of labeled DNA and RNA targets into smaller fragments is a prerequisite for improving the uniformity and specificity of hybridization onto the DNA chip (3). RNA targets were cleaved with Mn2+ and imidazole (16). The DNA targets were fragmented by acidic treatment that promoted random depurination and cleavage (17). Labeling was accomplished during or before the cleavage step, and the labeled fragments were then hybridized to the Mycobacterium DNA chip. Basecalling, that measures the percent of homology between the experimentally derived sequence and the ref 202 nt sequence tiled on the array, was accurate for all experiments conducted with m-BioPMDAM. The signal intensities were strong and the signal/background ratios high, all significantly better than with 5-(bromomethyl)fluorescein, determined for comparison (Table 1). This was especially true for DNA target labeling which was poor with the latter compound but gave good results with m-BioPMDAM. DNA amplicons labeling was also evaluated using two commercially available chemistries, used to label nucleic acids for DNA array analysis (Table 2). “Biotin-ChemLink” is based on the use of monofunctional-platinium derivatives (10), that bind to nitrogen atom in guanine and adenine, whereas “Label IT Biotin” reagent uses a nondisclosed chemistry to covalently attach the label to nucleic bases in DNA. With these two technologies, targeting the nucleobases, signal intensities, and base call percentages were significantly lower compared to those obtained with phosphate labeling via diazo chemistry (Table 2). This is especially true in GC-rich region, as depicted in Figure 4. The data clearly corroborates that

Diazo Labeling for Chip Analysis of Nucleic Acids

Bioconjugate Chem., Vol. 14, No. 6, 2003 1305

Figure 4. Analysis of a GC-rich region of 16S rRNA Mycobacterium tuberculosis (5′CCGCGGCC). Targets were labeled with m-BioPMDAM (A), Biotin-Chem-Link (B), or Label IT Biotin (C) and hybridized to a Mycobacterium DNA chip. Signal intensities for the four different probes at each interrogation position are shown on the y axis, and base-calling results are shown on the x axis. Underlined: incorrect nucleotide calls, n: undetermined. The graphs highlight the hybridization specificity and the good discrimination obtained with m-BioPMDAM labeled targets.

the label attachment on nucleic bases, via a covalent or noncovalent binding, alters the base-pairing efficiency and hybridization specificity, in particular in DNA chip applications requiring sequence calling at high resolution. m-BioPMDAM was also tested in other DNA chip applications requiring such resolution, like detection of antibiotic-resistant mutations in Mtb and of anti-retroviral drug mutations in HIV-1 (data not shown) or typing of Staphylococcus aureus strains (19), and the results confirmed the base-pairing specificity. Moreover, labeling was equally efficient for cleaved or uncleaved targets. This suggests that diazo labeling can be used for other detection systems that do not require target fragmentation before hybridization, e.g., low-density probes detection formats. The potential of diazomethyl chemistry for labeling natural (not enzymatically amplified) nucleic acid targets was evaluated by labeling genomic DNA (gDNA) extracted from Mtb. Hybridization of directly labeled genomic DNA to high-density DNA chips would represent a simple mean to perform genome-wide scale analysis. SDS-PAGE showed that gDNA molecules were labeled (Figure 5). Indeed, a slight gel-shift was observed, after m-BioPMDAM treatment (lane 2) due to the neutralization of negative charges on the phosphodiester groups. Addition of neutravidin to the labeled sample shifted Mtb gDNA strongly toward higher molecular weights (lane 3). Interestingly, this shift was also observed with cleaved gDNA, showing that all fragments were labeled by biotin

Figure 5. Mtb gDNA labeled with m-BioPMDAM (3). Products were analyzed by SDS-PAGE and ethidium bromide staining. Lane 1: sheared gDNA, not treated with m-BioPMDAM. Lane 2: gDNA labeled with m-BioPMDAM. Lane 3: 10 µL of labeled gDNA in lane 2 incubated for 5 min with 5 µL of 2 mg/mL neutravidin (Pierce) prior to electrophoresis. Lane 4: gDNA labeled with m-BioPMDAM (lane 2) and fragmented by 10 mM HCl at 95 °C for 5 min. Lane 5: 10 µL of labeled and cleaved gDNA in lane 4 incubated with 5 µL of 2 mg/mL neutravidin for 10 min.

(lanes 4 and 5). The base-calling accuracy obtained after analysis of the fragments on the chip (>90%, data not shown) shows that diazo labeling can be used to attach a detectable tag to a natural nucleic acid (the entire Mtb genome, in our experiment) before DNA chip analysis.
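To make the notion of base-calling accuracy concrete, the following minimal Python sketch shows one way a call could be derived from the four probe intensities at each interrogation position and then scored against a reference sequence. It is an illustration under stated assumptions, not the algorithm used with the Mycobacterium DNA chip: the function names, the rule of calling the strongest probe, the `min_ratio` threshold of 1.2, and the intensity values are all hypothetical.

```python
# Illustrative sketch only. The calling rule, the 1.2 ratio threshold, and the
# intensity values are assumptions, not the chip's actual analysis method.
BASES = "ACGT"


def call_base(intensities, min_ratio=1.2):
    """Return the base whose probe gives the strongest signal.

    `intensities` holds four signals in the order A, C, G, T. The call is
    'n' (undetermined) when the best signal is not positive or does not
    exceed the second-best signal by at least `min_ratio`.
    """
    ranked = sorted(zip(intensities, BASES), reverse=True)
    (best, base), (second, _) = ranked[0], ranked[1]
    if best <= 0 or best < min_ratio * second:
        return "n"
    return base


def call_sequence(intensity_table, min_ratio=1.2):
    """Call every interrogation position and join the calls into a string."""
    return "".join(call_base(row, min_ratio) for row in intensity_table)


def base_call_accuracy(called, reference):
    """Percentage of positions whose call matches the reference base."""
    if len(called) != len(reference):
        raise ValueError("called and reference sequences differ in length")
    correct = sum(c == r for c, r in zip(called, reference))
    return 100.0 * correct / len(reference)


# Made-up intensities (order A, C, G, T) for the region 5'-CCGCGGCC.
reference = "CCGCGGCC"
intensities = [
    (120, 980, 150, 90),
    (100, 1040, 130, 110),
    (90, 160, 870, 120),
    (110, 930, 140, 100),
    (95, 150, 900, 105),
    (130, 400, 420, 115),  # C and G signals too close: undetermined
    (105, 990, 160, 95),
    (115, 1010, 145, 100),
]

called = call_sequence(intensities)
accuracy = base_call_accuracy(called, reference)
print(called, accuracy)
```

In this toy example, an undetermined position counts against accuracy in the same way as an incorrect call; whether undetermined calls are scored that way in the reported percentages is not stated in the text.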

CONCLUSION

The m-BioPMDAM reagent described here has been shown to react selectively with phosphates in nucleotide monomers and polynucleotides and has been successfully used to label RNA and DNA targets of different sequence composition and length. The labeling reagent is stable and can be provided as a ready-to-use component. The highly reactive diazo function is used to obtain the rapid (15-30 min, depending on target type) and selective alkylation of both internucleotidic and terminal phosphates, with labeling of approximately 3 of every 10 phosphates in our model experiments. This represents a very high labeling density with no negative impact on base-pairing efficiency (as shown by the high base-calling percentage obtained in DNA chip analysis). Moreover, diazo labeling chemistry is not sensitive to buffer components or contaminants in a nucleic acid sample; the targets are labeled as such without any prior purification step (whereas purification is necessary with the approach using monofunctional platinum derivatives). The final conjugate is very stable, even under heat-denaturing conditions and long (overnight) hybridization times.

The labeling method described here requires a posthybridization staining step to introduce the fluorescent dye (via a streptavidin-fluorescent label conjugate). This step could be avoided, thus simplifying the experimental protocol, by using a diazo-compatible dye, where the fluorescent moiety is directly attached to the diazo reactive group. Our work on development of such fluorescent diazo compounds is currently in progress. Overall, these features make diazo labeling particularly well-suited for conducting extensive analysis on DNA chips, such as whole-genome RNA and DNA analysis and parallel testing of numerous targets in routine diagnosis.

ACKNOWLEDGMENT

We thank C. Tora for technical assistance in the synthesis of the diazo compounds and F. Telles for her contribution to the gDNA experiments. E.B.M. acknowledges a Marie Curie Industry Host Fellowship (Quality of Life Program).

LITERATURE CITED

(1) Ramsay, G. (1998) DNA Chips: State-of-the-art. Nature Biotechnol. 16, 40-44.
(2) Lockhart, D. J., and Winzeler, E. A. (2000) Genomics, gene expression and DNA arrays. Nature 405, 827-836.
(3) Chee, M., Yang, R., Hubbel, E., Berno, A., Huang, X. C., Stern, D., Winkler, J., Lockhart, D. J., Morris, M. S., and Fodor, S. P. A. (1996) Accessing genetic information with high-density DNA arrays. Science 274, 610-614.
(4) Cho, R. J., Fromont-Racine, M., Wodicka, L., Feierbach, B., Stearns, T., Legrain, P., Lockhart, D. J., and Davis, R. W. (1998) Parallel analysis of genetic selections using whole genome oligonucleotide arrays. Proc. Natl. Acad. Sci. U.S.A. 95, 3752-3757.
(5) Rosenow, C., Saxena, R. M., Durst, M., and Gingeras, T. R. (2001) Prokaryotic RNA preparation methods useful for high-density array analysis: comparison of two approaches. Nucleic Acids Res. 29, e112.
(6) O’donnel, M. J., and McLaughlin, L. W. (1996) Reporter groups for the analysis of nucleic acid structure. In Bioorganic Chemistry: Nucleic Acids (Hecht, S. M., Ed.) pp 216-243, Oxford University Press, New York.
(7) Kessler, C. (1995) Methods for nonradioactive labeling of nucleic acids. In Nonisotopic probing, blotting, and sequencing (Kricka, L. J., Ed.) pp 41-109, Academic Press, San Diego.
(8) Hermanson, G. T. (1996) Nucleic acid and oligonucleotides modification and conjugation. In Bioconjugate Techniques (Hermanson, G. T., Ed.) pp 639-671, Academic Press, San Diego.
(9) Hoevel, T., and Kubbies, M. (2002) Nonradioactive labeling and detection of mRNAs hybridized onto nucleic acid cDNA arrays. Methods Mol. Biol. 185, 417-423.
(10) van Belkum, A., Linkels, E., Jelsma, T., Houthoff, H. J., van den Berg, F., and Quint, W. (1993) Application of a new, universal DNA labeling system in the PCR mediated diagnoses of Chlamydia trachomatis and human papillomavirus type 16 infection in cervical smears. J. Virol. Methods 45, 189-200.
(11) Monnot, V., Tora, C., Lopez, S., Menou, L., and Laayoun, A. (2001) Labeling during cleavage (LDC), a new labeling approach for RNA. Nucleosides, Nucleotides Nucleic Acids 20, 1177-1179.
(12) Laayoun, A. (2002) Process for labeling a ribonucleic acid, and labeled RNA fragments which are obtained thereby. United States Patent 6,376,179.
(13) Browne, K. A. (2002) Metal ion-catalyzed nucleic acid alkylation and fragmentation. J. Am. Chem. Soc. 124, 7950-7962.
(14) Mountzouris, J., and Hurley, L. H. (1996) Small Molecule-DNA Interaction. In Bioorganic Chemistry: Nucleic Acids (Hecht, S. M., Ed.) pp 288-323, Oxford University Press, New York.
(15) Troesch, A., Nguyen, H., Miyada, C. G., Desvarenne, S., Gingeras, T. R., Kaplan, P. M., Cros, P., and Mabilat, C. (1999) Mycobacterium species identification and rifampin resistance testing with high-density DNA probe arrays. J. Clin. Microbiol. 37, 49-55.
(16) Breslow, R., and Xu, R. (1993) Recognition and catalysis in nucleic acid chemistry. Proc. Natl. Acad. Sci. U.S.A. 90, 1201-1207.
(17) Lhomme, J., Constant, J. F., and Demeunynck, M. (2000) Abasic DNA structure, reactivity, and recognition. Biopolymers 52, 65-83.
(18) Broyer, P., Cleuziat, P., Colin, B., Jaravel, C., and Santoro, L. (2000) Improved device and method for lysis of microorganisms. PCT WO 00/05338.
(19) van Leeuwen, W., Jay, C., Snijders, S., Durin, N., Lacroix, B., Verbrugh, H., Enright, M., Troesch, A., and van Belkum, A. (2003) Staphylococcus aureus multilocus sequence typing with DNA array technology. J. Clin. Microbiol. 41, 3323-3326.
