Anal. Chem. 2006, 78, 7610-7615
MChip: A Tool for Influenza Surveillance
Erica D. Dawson,† Chad L. Moore,† James A. Smagala,† Daniela M. Dankbar,† Martin Mehlmann,† Michael B. Townsend,† Catherine B. Smith,‡ Nancy J. Cox,‡ Robert D. Kuchta,† and Kathy L. Rowlen*,†,§
Department of Chemistry and Biochemistry, UCB 215, University of Colorado at Boulder, Boulder, Colorado 80309; Influenza Division, The Centers for Disease Control and Prevention, 1600 Clifton Road, Atlanta, Georgia 30333; and InDevR, LLC, 2100 Central Avenue, Suite 106, Boulder, Colorado 80301
The design and characterization of a low-density microarray for subtyping influenza A is presented. The microarray consisted of 15 distinct oligonucleotides designed to target only the matrix gene segment of influenza A. An artificial neural network was utilized to automate microarray image interpretation. The neural network was trained to recognize fluorescence image patterns for 68 known influenza viruses and subsequently used to identify 53 unknowns in a blind study that included 39 human patient samples and 14 negative control samples. The assay exhibited a clinical sensitivity of 95% and clinical specificity of 92%.
Addressing current public and scientific concern over the possible emergence of a pandemic strain of influenza requires significant advances in diagnostic tools for rapid typing and subtyping of influenza viruses, as recently highlighted (1). Three “types” of influenza virus exist (A, B, C). Virus type is defined by differences in the matrix proteins and nucleoprotein. Influenza A viruses are of greatest concern since they are responsible for an average of 36 000 influenza-related deaths in the United States each year (2). Type A influenza viruses are traditionally “subtyped” according to the antigenic characteristics (antigenicity) of two surface glycoproteins, hemagglutinin (HA) and neuraminidase (NA). Antigenicity is determined by virus reactivity with antibodies produced against reference strains of influenza (3). Since there are 16 identified HA subtypes (e.g., H1, H3, H5) and 9 NA subtypes (e.g., N1, N2), there are 144 possible combinations of HA and NA and therefore 144 theoretically possible subtypes (e.g., H5N1). In practice, however, functional cooperativity between gene segments permits only certain combinations to replicate effectively (4). Currently, there are two dominant influenza A subtypes circulating in the human population, H3N2 and H1N1. Subtype determination (“subtyping”) is essential for tracking emerging viruses and for designing appropriate influenza vaccines (1). The current “gold standard” for complete influenza A subtyping (determination of both HA and NA) involves virus replication in egg or tissue culture followed by a hemagglutination inhibition assay. This method is tedious and requires several days, with the analysis time often extended to several weeks for antigenically novel viruses (3).
Antigenic “drift”, in which mutations in the genome lead to changes in antigenic properties, is a continual process for influenza. “Successful” influenza A viruses evolve such that the surface glycoproteins evade the immune system; thus, antigenic drift results in annual influenza outbreaks and epidemics (4). Influenza is a single-stranded RNA virus whose genome consists of 8 distinct segments that code for 11 proteins (5, 6). The HA and NA genes evolve rapidly at 6.7 × 10⁻³ and 3.2 × 10⁻³ nucleotide substitutions per nucleotide per year, respectively, and are genetically diverse between subtypes (4). The gene segments that code for internal proteins, including the matrix gene segment (M), evolve more slowly with ∼1 × 10⁻³ nucleotide substitutions per nucleotide per year (7). Due to their relatively high genetic conservation, the matrix gene and nucleoprotein gene segments are used for typing influenza (8).
Influenza diagnostic methods based on reverse transcription-polymerase chain reaction (RT-PCR) and real-time RT-PCR (RRT-PCR) are currently available for typing or as presumptive tests for a specific subtype (e.g., H5N1) (9-17). RRT-PCR-based typing of influenza uses a single primer pair that targets a short, well-conserved portion of the M gene segment (9, 10, 15, 17). RRT-PCR-based presumptive tests for a specific subtype use a single primer pair that targets a portion of the HA gene that is well-conserved only for a particular subtype (11, 15). However, a major limitation of these singleplex PCR-based presumptive tests in terms of influenza surveillance is that they do not provide complete subtype information and are limited to a specific subtype. For example, if the patient sample is negative for an H5, no additional information is gained. While a presumptive test could be useful during a pandemic, in a prepandemic phase such as the current situation, tracking which subtypes exist in the population and how they are changing is vital information (18).
Microarrays have emerged as useful tools for studying complex biological systems due to their high degree of multiplexing capability (19). Several studies have examined the utility of diagnostic microarrays for influenza detection and subtyping (2, 20-23), but all have relied upon a multiple gene approach including HA and NA targets. While these studies provided proof of concept for microarray subtyping of influenza, the primary limitation was the necessity of amplifying multiple genes. For example, our previous work toward the development of an influenza diagnostic microarray utilized DNA capture probes designed to be complementary to portions of the M gene segment for typing and to HA and NA genes for subtyping (22, 23). With a multiple gene approach, the method sensitivity was less than ideal due to a relatively high failure rate in the multiplex RT-PCR used to amplify the genes (24).
In this work, the rather surprising discovery that a single, highly conserved gene segment could be used in an inexpensive microarray format to provide complete subtype information for influenza A is described. The potential impact of such a simple and effective test for influenza A subtyping is tremendous.
* To whom correspondence should be addressed. E-mail: rowlen@colorado.edu.
† University of Colorado at Boulder.
‡ The Centers for Disease Control and Prevention.
§ InDevR, LLC.
10.1021/ac061739f CCC: $33.50 © 2006 American Chemical Society. Published on Web 10/18/2006
7610 Analytical Chemistry, Vol. 78, No. 22, November 15, 2006
(1) Lu, P. S. Science 2006, 312, 337.
(2) Li, J.; Chen, S.; Evans, D. H. J. Clin. Microbiol. 2001, 39, 696-704.
(3) Taubenberger, J. K.; Layne, S. P. Mol. Diagn. 2001, 6, 291-305.
(4) Steinhauer, D. A.; Skehel, J. J. Ann. Rev. Genet. 2002, 36, 305-332.
(5) Chen, W.; Calvo, P. A.; Malide, D.; Gibbs, J.; Schubert, U.; Bacik, I.; Basta, S.; O’Neill, R.; Schickli, J.; Palese, P.; Henklein, P.; Bennink, J. R.; Yewdell, J. W. Nat. Med. 2001, 7, 1306-1312.
(6) Lamb, R. A. In Genetics of Influenza Viruses; Palese, P., Kingsbury, D. W., Eds.; Springer: New York, 1983; pp 26-69.
(7) Ito, T.; Gorman, O. T.; Kawaoka, Y.; Bean, W. J.; Webster, R. G. J. Virol. 1991, 65, 5491-5498.
(8) Knipe, D. M.; Howley, P. M. Field’s Virology, 4th ed.; Lippincott, Williams and Wilkins: Philadelphia, 2001.
(9) DiTrani, L.; Bedini, B.; Donatelli, I.; Campitelli, L.; Chiappini, B.; DeMarco, M. A.; Delogu, M.; Buonavoglia, C.; Vaccari, G. BMC Infect. Dis. 2006, 6, 87.
(10) Fouchier, R. A. M.; Bestebroer, T. M.; Herfst, S.; Van Der Kemp, L.; Rimmelzwaan, G. F.; Osterhaus, A. D. M. E. J. Clin. Microbiol. 2000, 38, 4096-4101.
(11) Ng, L. F. P.; Barr, I.; Nguyen, T.; Noor, S. M.; Tan, R. S.-P.; Agathe, L. V.; Gupta, S.; Khalil, H.; To, T. L.; Hassan, S. S.; Ren, E.-C. BMC Infect. Dis. 2006, 6, 40.
(12) Payungporn, S.; Chutinimitkul, S.; Chaisingh, A.; Damrongwantanapokin, S.; Buranathai, C.; Amonsin, A.; Theamboonlers, A.; Poovorawan, Y. J. Virol. Methods 2006, 131, 143-147.
(13) Payungporn, S.; Phakdeewirot, P.; Chutinimitkul, S.; Theamboonlers, A.; Keawcharoen, J.; Oraveerakul, K.; Amonsin, A.; Poovorawan, Y. Viral Immunol. 2004, 17, 588-593.
(14) Poddar, S. K. J. Virol. Methods 2002, 99, 63-70.
(15) Spackman, E.; Senne, D. A.; Myers, T. J.; Bulaga, L. L.; Garber, L. P.; Perdue, M. L.; Lohman, K.; Daum, L. T.; Suarez, D. L. J. Clin. Microbiol. 2002, 40, 3256-3260.
(16) Wei, H.-L.; Bai, G.-R.; Mweene, A. S.; Zhou, Y.-C.; Cong, Y.-L.; Pu, J.; Wang, S.; Kida, H.; Liu, J.-H. Virus Genes 2006, 32, 261-267.
(17) Whiley, D. M.; Sloots, T. P. Diagn. Microbiol. Infect. Dis. 2005, 53, 335-337.
(18) Department of Health and Human Services. HHS Pandemic Influenza Plan; August 23, 2006. Vol. 2006.
(19) Blalock, E. M., Ed. A Beginner’s Guide to Microarrays; Kluwer Academic Publishers: Boston, 2003.
(20) Kessler, N.; Ferraris, O.; Palmer, K.; Marsh, W.; Steel, A. J. Clin. Microbiol. 2004, 42, 2173-2185.
(21) Sengupta, S.; Onodera, K.; Lai, A.; Melcher, U. J. Clin. Microbiol. 2003, 41, 4542-4550.
(22) Mehlmann, M.; Dawson, E. D.; Townsend, M. B.; Smagala, J. A.; Moore, C. L.; Smith, C. B.; Cox, N. J.; Kuchta, R. D.; Rowlen, K. L. J. Clin. Microbiol. 2006, 44, 2857-2862.
(23) Townsend, M. B.; Dawson, E. D.; Mehlmann, M.; Smagala, J. A.; Dankbar, D. M.; Moore, C. L.; Smith, C. B.; Cox, N. J.; Kuchta, R. D.; Rowlen, K. L. J. Clin. Microbiol. 2006, 44, 2863-2871.
(24) Markoulatos, P.; Siafakas, N.; Moncany, M. J. Clin. Lab. Anal. 2002, 16, 47-51.
EXPERIMENTAL SECTION
Sequence Selection. The sequence selection method for capture and label pairs was adapted from the method of Mehlmann et al. (22) Briefly, M gene segment sequences for a variety of subtypes of influenza A were compiled using the publicly available online database from Los Alamos National Laboratory (LANL) (25) and information at the Centers for Disease Control and Prevention (CDC). Subtype-specific databases were created for H1N1, H1N2, H3N2, H5N1, H3N8, and H9N2. Although influenza A virus samples for only the H3N2 and H1N1 subtypes are discussed here, the other subtypes were included in anticipation of further testing with a variety of influenza A viral subtypes. These subtype-specific databases were further divided by host species, and the resulting subtype- and host-specific databases (e.g., human H1N1) were mined for conserved regions using the ConFind algorithm (26). Briefly, ConFind is a tool for mining conserved regions from aligned sequence files that provides robust handling when alignments contain missing or incomplete sequence information. ConFind allows user-defined limits for minimum length, maximum positional variability, exceptions to the positional variability, and number of sequences containing nonambiguous characters at each position. The conserved regions identified were then used to design appropriate “capture” and “label” sequence pairs, each between 16 and 25 nucleotides (nt) in length. Approximately 78 possible sequence pairs were identified over all of the hosts and subtypes examined. The number of mismatches between the designed probe and the target sequence was determined, and sequences were chosen that were anticipated to be broadly reactive with all influenza subtypes or with viruses of a specific host species or subtype (e.g., all avian viruses, only H3N2 viruses, etc.).
Cross-Reactivity Experiments. Capture and label pairs were checked for cross-reactivity by conducting six replicate hybridizations of only fluorophore-conjugated label sequences (in the absence of target influenza genomes) under otherwise identical conditions. Where signals on the microarray occurred (signal is defined here as a mean S/N > 3 on a majority of hybridized slides), the capture and corresponding label sequence were removed and not used further.
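The conserved-region mining step in Sequence Selection can be pictured with a short sketch. The code below is not the ConFind implementation; it is a simplified illustration that assumes zero positional variability (all unambiguous bases in a column must agree) and omits ConFind's exceptions parameter. The function names, the set of ambiguous characters, and the toy alignment are hypothetical.

```python
from typing import List, Tuple

# Characters treated as missing or ambiguous (an assumption for this sketch).
AMBIGUOUS = set("-N?")


def column_is_conserved(column: str, min_informative: int) -> bool:
    """A column is conserved when enough sequences carry an unambiguous
    base at that position and all of those bases are identical."""
    bases = [b for b in column.upper() if b not in AMBIGUOUS]
    return len(bases) >= min_informative and len(set(bases)) == 1


def conserved_regions(alignment: List[str], min_length: int,
                      min_informative: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges of conserved runs that
    are at least min_length columns long."""
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")
    regions = []
    run_start = None
    for pos in range(width + 1):  # the extra step closes a trailing run
        conserved = pos < width and column_is_conserved(
            "".join(seq[pos] for seq in alignment), min_informative)
        if conserved and run_start is None:
            run_start = pos
        elif not conserved and run_start is not None:
            if pos - run_start >= min_length:
                regions.append((run_start, pos))
            run_start = None
    return regions


# Toy alignment (made up): one variable column and one missing base.
toy_alignment = [
    "ACGTACGTTTGA",
    "ACGTACGATTGA",
    "ACGTNCGTTTGA",
]
print(conserved_regions(toy_alignment, min_length=4, min_informative=2))
```

In an actual design workflow the minimum length would be tied to the 16-25 nt probe length stated above, and each candidate region would still need the mismatch and cross-reactivity checks described in the text.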
This sequence selection process resulted in 15 useful capture and label pairs (sequences are available upon request to not-for-profit enterprises for research use only from the University of Colorado Technology Transfer Office).
Samples. RNA from influenza A viral isolates of known subtypes representing human, avian, equine, canine, and swine hosts was used to characterize the microarray. Amplified RNA was generated at the CDC as detailed by Townsend et al. (23) Briefly, viral RNA from each isolate was amplified by RT, PCR, and runoff transcription with the PCR product as the template. Reverse transcription was performed with SuperScript II reverse transcriptase (Invitrogen Corp., Carlsbad, CA) using the reverse primer for the M gene segment described by Zou (27). PCR of the M gene segment was then performed with primers for the matrix gene segment described by Zou (27), with the 5′ primer including a T7 promoter. Transcription was then performed utilizing T7 RNA polymerase (Invitrogen Corp.). In addition, patient samples known to be positive for influenza A (throat swabs and nasopharyngeal swabs) were provided by the Colorado Department of Public Health and Environment (CDPHE) and by the Air Force Institute for Operational Health (AFIOH) Virology and Bacteriology Laboratories (Brooks City-Base, TX). Several negative control samples were also provided by the CDC. These influenza A negative controls included true negatives, samples positive for influenza B, and samples positive for other respiratory pathogens such as the SARS CoV causing severe acute respiratory syndrome, human parainfluenza virus type 3 (hPIV3), respiratory syncytial virus (RSV), and human metapneumovirus (hMPV).
Microarray Slide Preparation. Capture oligonucleotides (Operon Biotechnologies, Inc., Huntsville, AL) had a 5′-amino-C6 modification for covalent attachment to aldehyde-modified glass (VALS Vantage Aldehyde Slides, CEL Associates, Inc., Pearland, TX). Saline-sodium citrate (SSC) and saline-sodium phosphate-EDTA (SSPE) buffers were purchased as 20× solutions from G Biosciences (St. Louis, MO). Electrophoresis grade (99%) sodium dodecyl sulfate (SDS) was purchased from Sigma, and N-lauroylsarcosine sodium salt solution (sarkosyl) was purchased from Fluka BioChemika (>97%, HPLC). Capture oligos were spotted at 10 µM from a spotting solution containing 3× SSC, 50 mM phosphate (pH 7.5), and 0.005% sarkosyl. Spotting was achieved with a MicroGrid arrayer (Genomic Solutions Inc., Ann Arbor, MI) using two 100-µm-diameter solid core pins (spot diameter ∼100-140 µm). The relative humidity was held at 70% during spotting. Between each solution spotted, a wash cycle of two 5-s exposures to static 18-MΩ water followed by 5 s in recirculating 18-MΩ water was repeated 3 times. Slides were placed in a 100% relative humidity environment for 18 h and subsequently washed according to the following procedure: (1) 5 min in 4× SSPE/0.1% SDS, (2) 5-min shake in 4× SSPE, (3) 5-min shake in 18-MΩ water, and (4) 5-min shake in 18-MΩ water at 90-100 °C.
(25) Macken, C.; Lu, H.; Goodman, J.; Boykin, L. In Options for the Control of Influenza IV; Osterhaus, A. D. M. E., Cox, N., Hampson, A. W., Eds.; Elsevier Science: Amsterdam, 2001; pp 103-106.
(26) Smagala, J. A.; Dawson, E. D.; Mehlmann, M.; Townsend, M. B.; Kuchta, R. D.; Rowlen, K. L. Bioinformatics 2005, 21, 4420-4422.
(27) Zou, S. J. Clin. Microbiol. 1997, 35, 2623-2627.
RNA Extraction, Fragmentation, and Hybridization.
Viral RNA was extracted from patient samples and amplified as described by Townsend et al. (23) Amplified RNA (4 µL) was fragmented by adding 1 µL of 5× fragmentation buffer (200 mM Tris-acetate, 500 mM potassium acetate, and 150 mM magnesium acetate, pH 8.4) and incubating for 25 min at 75 °C as described by Mehlmann et al. (28) The hybridization was performed immediately after fragmentation as described by Townsend et al. (23) Briefly, 15 µL of quenching/hybridization buffer was added to the fragmented RNA solution (final concentrations in the 20-µL solution were 4× SSPE (600 mM NaCl, 40 mM NaH2PO4, 4 mM EDTA, pH 7.0), 30 mM EDTA, 2.5× Denhardt’s solution, 30% deionized formamide, 200 nM in each of 15 appropriate Quasar 570-modified “label” oligonucleotides (Biosearch Technologies, Novato, CA), and 5 nM of a positive control Quasar 570-modified label oligonucleotide). Microarray hybridizations were carried out for 2 h at room temperature in a humidified environment.
Figure 1. Schematic of the two-step hybridization fluorescence detection scheme utilized. A solution containing the fluorophore-conjugated DNA label sequence and the target RNA of interest is hybridized to the microarray containing immobilized “capture” DNA.
Microarray Imaging and Analysis. Slides were scanned using a VersArray ChipReader scanner (Bio-Rad Laboratories, Hercules, CA) with 532-nm detection, laser power of 60%, PMT sensitivity of 700 V, and 5-µm resolution. Fluorescence images were analyzed using VersArray Analyzer software, version 4.5 (Bio-Rad Laboratories). The positive control spots (e.g., capture sequence and corresponding hybridized label sequence) were used as an internal check on hybridization efficiency. The intensity values were extracted for each spot in the triplicate set for each capture sequence, and the mean signal for each triplicate set was subsequently calculated. To ensure a robust analysis by the neural network, relative signal values, rather than absolute intensities, were used as input. For a given microarray image, the sequence with the highest mean intensity was used to define the maximum signal, which was scaled to 100. The mean intensities for all other sequences were scaled proportionally relative to 100.
(28) Mehlmann, M.; Townsend, M. B.; Stears, R. L.; Kuchta, R. D.; Rowlen, K. L. Anal. Biochem. 2005, 347, 316-323.
Development and Validation of an Artificial Neural Network (ANN). The commercially available software package EasyNN-Plus (www.easynn.com) was used to develop an ANN for automated image interpretation. The ANN was trained using a feed-forward method with weighted back-propagation. The learning rate and momentum were both optimized by the software. The ANN had 16 inputs, 4 outputs, and a single hidden layer of 11 nodes. For each sample, 15 of the input values were the normalized fluorescence intensities for the 15 capture sequences. The last input was the maximum mean intensity of the 15 capture sequences (MAX). The MAX input was added to help differentiate negative samples from influenza A positive samples, as the MAX value for negative samples should be very low. The four output categories were H3N2, H1N1, H5N1, and negative, and the known outputs were entered as either 1 or 0, designating true or false. Microarray results from 50 samples of known subtype (H3N2, H1N1, H5N1) and 10 samples known to be negative for influenza A were selected to train the ANN. Of these 60 samples, 48 were used for training and 12 were used for validation. The number of training cycles was set to 1001. During the automated validation step, known samples were used to test neural network performance. Learning was deemed complete only once the subtype of all 12 validating examples was correctly assigned and the output scores were within 5% of the expected value, which was set to 1.
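The construction of the 16 ANN inputs described above can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: the function name `ann_inputs` and the example intensities are hypothetical, and the sketch passes MAX as the raw maximum mean intensity because the text does not say whether that value was rescaled before being fed to the network.

```python
from statistics import mean
from typing import List, Sequence


def ann_inputs(triplicates: Sequence[Sequence[float]]) -> List[float]:
    """Build the 16-element input vector for one microarray image:
    15 relative signals (brightest capture sequence scaled to 100)
    followed by the maximum mean intensity itself (MAX)."""
    if len(triplicates) != 15:
        raise ValueError("expected spot intensities for 15 capture sequences")
    means = [mean(spots) for spots in triplicates]
    max_mean = max(means)
    if max_mean <= 0:
        raise ValueError("maximum mean intensity must be positive")
    relative = [100.0 * m / max_mean for m in means]
    return relative + [max_mean]


# Made-up intensities: one bright probe, one at half intensity, 13 dim probes.
example = [(1000.0, 1100.0, 900.0), (500.0, 500.0, 500.0)] + [(100.0, 100.0, 100.0)] * 13
vector = ann_inputs(example)
print(len(vector), vector[:3], vector[-1])
```

Scaling to the brightest probe removes slide-to-slide differences in overall brightness, while the separate MAX entry keeps the absolute signal level that the text says helps distinguish negative samples.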
The trained neural network was then used to determine the subtypes for 53 unknown samples in a blind study, with a threshold output score of 0.75 used for subtype assignment.
RESULTS AND DISCUSSION
Assay Overview. The subtyping assay presented here is based on a simple low-density microarray that was designed to target several portions of a single gene segment, M. Hereafter the microarray will be referred to as MChip. As outlined in Figure 1, the detection strategy involved capturing amplified, fragmented RNA with 15 distinct, short oligonucleotides (∼16-25-mer) immobilized in an array format on a solid support and labeling the hybridization events with the corresponding 15 fluorophore-tagged oligonucleotides (i.e., a sandwich assay). Due to typically low viral titers in infected patients (
Figure 2. Microarray layout for 15 M gene capture sequences with positive control sequences (closed symbols) as internal standards and capture sequences spotted in triplicate (open symbols) shown in (A). Fluorescence images showing typical patterns for (B) H3N2, (C) H1N1, and (D) H5N1 viral subtypes. Brighter white spots indicate higher fluorescence intensity.
Figure 3. Gel electrophoresis of multiplex RT-PCR products for two influenza A positive samples. Figure adapted from Townsend et al. (23) Left lane contains DNA ladder. Middle lane (C6) indicates amplification of all three target genes, whereas right lane (B5) indicates amplification of only the M gene segment.
0.75 are highlighted. Checkmarks indicate correct determination by the ANN and “X” indicates either incorrect assignment or no output score of >0.75.
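The assignment rule (a category is assigned only when its output score exceeds 0.75) can be sketched as below. The function name and the example scores are hypothetical, and choosing the highest-scoring category when more than one exceeds the threshold is an assumption, since the text does not describe that case.

```python
from typing import Dict, Optional

THRESHOLD = 0.75  # output score required for an assignment, as stated in the text


def assign_category(scores: Dict[str, float],
                    threshold: float = THRESHOLD) -> Optional[str]:
    """Return the best-scoring category if its score exceeds the threshold;
    otherwise return None (no assignment)."""
    best = max(scores, key=scores.get)
    return best if scores[best] > threshold else None


# Illustrative scores only; these are not values from the study.
clear_call = {"H3N2": 0.97, "H1N1": 0.02, "H5N1": 0.01, "negative": 0.03}
no_call = {"H3N2": 0.55, "H1N1": 0.40, "H5N1": 0.02, "negative": 0.10}
print(assign_category(clear_call))
print(assign_category(no_call))
```

A sample for which the function returns None corresponds to the "no output score of >0.75" outcome marked with an “X”.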
been used previously to diagnose and predict cancer types (32, 33). The ANN was trained to recognize the patterns associated with each subtype using influenza A virus samples of known subtype, as described in detail in the Experimental Section. Briefly, normalized input data were provided for a set of known samples called the “training set”. By providing the known outputs for the training set, in this case the viral subtypes, the ANN software learned to associate a particular pattern of relative fluorescence intensities with a specific output. Once the patterns for the training set were established, data for unknown samples were supplied as input. The ANN provided an assignment score (scaled from 0 to 1) for each of the four output categories: H1N1, H3N2, H5N1, and negative. As described in the Experimental Section, a category score higher than 0.75 was used to make an assignment. It should be noted that while results for A/H5N1 viral isolates were included in the training set, patient samples infected with A/H5N1 were not included in the set of unknowns as they are not readily available (and also require BSL3+ handling). The present study focused on the identification of unknown patient samples; however, validation of MChip with A/H5N1 viral isolates is the focus of a separate study (data not shown).
It is important to note that the viral load in patient samples can vary between 10² and 10⁹ viruses/mL, depending on a number of patient variables, including the time since infection and age (8). Since a significant number of randomly selected patient samples of each subtype were tested in this study, the data set represents a wide range of RNA concentrations and patient sample conditions. Consequently, the study inherently tested the robustness of the method for influenza subtype identification under “real-world” conditions.
(32) Chatterjee, M.; Mohapatra, S.; Ionan, A.; Bawa, G.; Ali-Fehmi, R.; Wang, X.; Nowak, J.; Ye, B.; Nahhas, F. A.; Lu, K.; Witkin, S. S.; Fishman, D.; Munkarah, A.; Morris, R.; Levin, N. K.; Shirley, N. N.; Tromp, G.; Abrams, J.; Draghici, S.; Tainsky, M. A. Cancer Res. 2006, 66, 1181-1190.
(33) Khan, J.; Wei, J. S.; Ringner, M.; Saal, L. H.; Ladanyi, M.; Westermann, F.; Berthold, F.; Schwab, M.; Antonescu, C. R.; Peterson, C.; Meltzer, P. S. Nat. Med. 2001, 7, 673-679.
(34) World Health Organization. Recommended laboratory tests to identify avian influenza A in specimens from humans; June 12, 2005. Available from http://www.who.int/csr/disease/avian_influenza/guidelines/avian_labtests2.pdf.
(35) Loong, T.-W. BMJ 2003, 327, 716-719.
Table 1 summarizes the ANN assignments for the 53 patient samples analyzed. Note that in some cases the “known” consisted of only partial subtype (i.e., samples from the CDPHE were partially subtyped using an immunofluorescence assay designed to probe only the antigenicity of the HA protein (34)). For the MChip and artificial neural network assay, 50 of 53 samples were correctly identified and fully subtyped (for influenza A) or identified as negative. There was one false positive result and two false negative results. It should be noted that negative samples (either true negatives or other pathogens that cause influenza-like illnesses, e.g., RSV) were defined as “influenza A negative” and were not further characterized. The relevant measures of performance for a diagnostic assay include clinical sensitivity and specificity. The clinical sensitivity is defined as TP/(TP + FN), and specificity is defined as TN/(TN + FP), where TP is the number of true positive results, FN is the number of false negative results, TN is the number of true negative results, and FP is the number of false positive results (35).
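The two definitions above can be worked through numerically. The tallies in this sketch are an assumption: they are inferred from the totals stated in the article (39 patient samples and 14 negative controls among the 53 unknowns, with two false negatives and one false positive) and are not listed explicitly in the text. The function names are hypothetical.

```python
def clinical_sensitivity(tp: int, fn: int) -> float:
    """Sensitivity = TP / (TP + FN)."""
    return tp / (tp + fn)


def clinical_specificity(tn: int, fp: int) -> float:
    """Specificity = TN / (TN + FP)."""
    return tn / (tn + fp)


# Assumed tallies inferred from the reported totals (see the note above):
# 39 positives with 2 false negatives, 14 negatives with 1 false positive.
TP, FN = 37, 2
TN, FP = 13, 1

print(f"sensitivity = {100 * clinical_sensitivity(TP, FN):.1f}%")
print(f"specificity = {100 * clinical_specificity(TN, FP):.1f}%")
```

With these assumed counts the functions give about 94.9% and 92.9%, close to the reported sensitivity and specificity.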
Values for clinical sensitivity and specificity are normally reported as a percentage. Thus, the MChip assay resulted in a sensitivity of 95% and specificity of 92%. It is important to recall that the relative signal intensities of the 15 probe sequences were used as input in the ANN.
Comparison to Other Rapid Diagnostic Tests. A number of rapid diagnostic tests are currently available that are capable of detecting the presence of influenza A in patient samples. Although they are simple and require only ∼15 min of analysis time, they provide no subtype information. Overall, these rapid tests exhibit sensitivities of ∼70-75% and specificities of ∼90-95% (36). Thus, the MChip performed better, on average, than commercially available rapid tests and it provided complete subtype information. Other clinically relevant (non-point-of-care) diagnostics perform better but still do not provide complete subtype information. For example, a real-time RT-PCR method has been developed to identify influenza A/H5 infection within 1-2 h based on amplification of a portion of the HA gene (37). Clearly, the speed of this method for identifying specific A/H5 viruses could be extremely useful during an influenza pandemic. However, the RRT-PCR method is disadvantageous in that it relies on a set of primers designed to amplify a portion of the highly mutable HA gene of only a single H5 lineage. In addition, the method provides no subtype information for non-A/H5 influenza viruses. In comparison, the simple microarray described here currently requires longer analysis times (∼7 h) but provides full subtype information without resorting to often problematic multiplex PCR amplification procedures (23, 24).
Explanation for “Subtyping” Using Only the Matrix Gene Segment. The question of why the M gene segment can be used to infer the antigenic subtype may be addressed by considering the biological functions of the matrix proteins.
The M gene segment codes for two different proteins designated M1 and M2; M1 is the major internal structural component of the virion (7) with numerous functions in the viral life cycle (38), and M2 is a transmembrane protein with ion channel activity (38-40). The location of M1 in the virion implies that it interacts with the other viral surface proteins (HA, NA, M2), and experimental evidence supports this hypothesis (38). Obenauer et al. recently developed “proteotyping” to distinguish subtle differences between related sequences in a phylogenetic analysis (41). Specific pairings of HA and M gene proteotypes were found by identifying unique amino acid signatures within a single clade, suggesting that a change in one gene requires compensatory mutations in the other. In their large-scale complete genome sequencing effort for human influenza, Ghedin et al. noted numerous correlated changes within and between a variety of influenza proteins, including between HA and M1 (42). Both of these studies provided evidence for the coevolution of M and HA.
Despite the fact that M and HA coevolve, mutations in the M gene segment occur at a slower rate. At the nucleotide level, the average mutation rates for the M1 and M2 genes are 0.83 × 10⁻³ and 1.36 × 10⁻³ nucleotide substitutions per nucleotide per year, respectively, while the corresponding rate for HA is 6.7 × 10⁻³ nucleotide substitutions per nucleotide per year (4). At the protein level, Ito et al. have shown that M1 has evolved very little since the 1930s, with only 0.08 × 10⁻³ amino acid changes per residue per year (7). Since the M1 protein functions in multiple aspects of the viral life cycle, it is not surprising that this protein has a high degree of conservation. Interestingly, four of the five probe sequences that were broadly reactive on the microarray were designed to target the M1 coding region. From a diagnostics standpoint, influenza is often described as a “moving target” due to the high mutation rate of HA and NA. Utilizing the M gene segment for diagnostic purposes offers the significant benefit of relatively high conservation, which alleviates the need to continually design new probes.
(36) World Health Organization. WHO recommendations on the use of rapid testing for influenza diagnosis; July 11, 2006. Available from http://www.who.int/csr/disease/avian_influenza/guidelines/RapidTestInfluenza_web.pdf.
(37) Ng, E. K. O.; Cheng, P. K. C.; Ng, A. Y. Y.; Hoang, T. L.; Lim Wilina, W. L. Emerging Infect. Dis. 2005, 11, 1303-1305.
(38) Nayak, D. P.; Hui, E. K.-W.; Barman, S. Virus Res. 2004, 106, 147-165.
(39) Kelly, M. L.; Cook, J. A.; Brown-Augsburger, P.; Heinz, B. A.; Smith, M. C.; Pinto, L. H. FEBS Lett. 2003, 552, 61-67.
(40) Tang, Y.; Zaitseva, F.; Lamb, R. A.; Pinto, L. H. J. Biol. Chem. 2002, 277, 39880-39886.
(41) Obenauer, J. C.; Denson, J.; Mehta, P. K.; Su, X.; Mukatira, S.; Finkelstein, D. B.; Xu, X.; Wang, J.; Ma, J.; Fan, Y.; Rakestraw, K. M.; Webster, R. G.; Hoffmann, E.; Krauss, S. L.; Zheng, J.; Zhang, Z.; Naeve, C. W. Science 2006, 311, 1576-1580.
(42) Ghedin, E.; Sengamalay, N. A.; Shumway, M.; Zaborsky, J.; Feldblyum, T.; Subbu, V.; Spiro, D. J.; Sitz, J.; Koo, H.; Bolotov, P.; Dernovoy, D.; Tatusova, T.; Bao, Y.; St. George, K.; Taylor, J.; Lipman, D. J.; Fraser, C. M.; Taubenberger, J. K.; Salzberg, S. L. Nature 2005, 437, 1162-1166.
(43) Clayton, J. Nat. Methods 2005, 2, 621-626.
(44) The material cost of the assay including sample handling (RNA extraction, RT-PCR, etc.), slide printing, and sample processing is ∼$10 US.
Summary. This study represents the first reported use of a single gene to indirectly provide subtype for influenza A on a microarray. The entire microarray single-gene assay, from extraction of viral RNA from patient sample through identification, can be completed in less than 7 h.
In addition, with recent advances in lab-on-a-chip technology (43), it is likely that the total analysis time, from patient sample to an automated analysis of the image and subtyping result, can be reduced to less than 1 h. The tremendous diagnostic advantages of this single-gene microarray include elimination of multiplex PCR as well as robust and reliable amplification of the conserved M gene segment. Its relatively slow sequence evolution over time makes the M gene segment an attractive diagnostic target over currently utilized sequences that are based on highly mutable gene segments. These advantages, when combined with the low cost (44), simplicity, and high information content of the chip, make it a viable tool for enhancing global influenza surveillance efforts.
ACKNOWLEDGMENT
The authors gratefully acknowledge funding from NIH/NIAID (U01AI056528). E.D.D. acknowledges partial support from NSF. We also thank Patricia Young and the Colorado Department of Public Health and Environment (CDPHE) and Dr. Linda Canas and the Air Force Institute for Operational Health (AFIOH) Virology and Bacteriology Laboratories for providing patient samples.
Received for review September 15, 2006. Accepted October 4, 2006.
AC061739F