Development of Rapid Canine Fecal Source Identification PCR

Article pubs.acs.org/est

Development of Rapid Canine Fecal Source Identification PCR-Based Assays

Hyatt C. Green,† Karen M. White, Cathy A. Kelty, and Orin C. Shanks*

U.S. Environmental Protection Agency, Office of Research and Development, National Risk Management Research Laboratory, Cincinnati, Ohio 45268, United States

Supporting Information

ABSTRACT: The extent to which dogs contribute to aquatic fecal contamination is unknown despite the potential for zoonotic transfer of harmful human pathogens. We used genome fragment enrichment (GFE) to identify novel nonribosomal microbial genetic markers potentially useful for detecting dog fecal contamination with PCR-based methods in environmental samples. Of the 679 sequences obtained from GFE, we used 84 for the development of PCR assays targeting putative canine-associated genetic markers. Twelve genetic markers were shown to be prevalent among dog fecal samples and were rarely found in other animals. Three assays, DG3, DG37, and DG72, performed best in terms of specificity and sensitivity and were used for the development of SYBR Green and TaqMan quantitative PCR (qPCR) assays. qPCR analysis of 244 fecal samples collected from a wide geographic range indicated that marker concentrations were below limits of detection in noncanine hosts. As a proof-of-concept, these markers were detected in urban stormwater samples, suggesting a future application of newly developed methods for water quality monitoring.



INTRODUCTION

Contamination of waterways with fecal material leads to the dissemination of pathogens, antibiotic-resistant bacteria, and excess nutrients that have serious impacts on human and environmental health. Major sources of contaminants include human-derived sources, such as sewage leaks or combined sewer overflows, and agricultural sources, such as confined animal feeding operations and manure amendment of agricultural fields. In some watersheds, birds, deer, and other wildlife may also contribute to aquatic fecal loads. Reduction of aquatic fecal loads and increased protection of human and environmental health depend on distinguishing which of the many possible contaminating sources impact a particular body of water.

However, the extent of aquatic contamination attributable to dogs is poorly understood. There are an estimated 69.9 million domesticated dogs in the United States alone.1 One study estimates that 39.1% of human pathogens also infect domestic animal hosts,2 meaning that dogs can be reservoirs for antibiotic-resistant Enterococci spp.3 and many human pathogens including antibiotic-resistant and pathogenic Escherichia coli,4,5 many Campylobacter spp.,6 and Giardia duodenalis7,8 in addition to many parasites.9 Despite the zoonotic potential presented by dogs, the management of dog waste is largely left up to voluntary owner responsibility, likely leading to a large proportion of dog fecal matter left in situ that can enter waterways via stormwater runoff. In a recent study, canine fecal pollution was reported to be a key yet manageable source of fecal contamination,10 further supporting the need for canine fecal source identification methods.

To date, the two most widely used canine source identification methods target members of the bacterial phylum Bacteroidetes to detect11 and, in some cases, quantify10,12,13 canine fecal contaminants in environmental samples. However, these methods target the same highly conserved 16S rRNA gene, which has been found at high concentrations in a wide range of hosts, including human sources, thereby greatly reducing the diagnostic potential of these methods.14 Because dogs cohabitate with humans and can share portions of their microbiota,15,16 the development of methods that can reliably differentiate human and canine contaminants using highly conserved ribosomal genes may be difficult.

Here we use genome fragment enrichment (GFE) as previously described17−19 to identify nonribosomal genetic sequences (markers) that are associated with canine fecal microbial communities. To investigate the distribution of candidate markers among different animal populations, we tested fecal samples from a wide host and geographic range. We report the development of 12 end point PCR assays and six quantitative assays for SYBR Green and TaqMan chemistries. The canine-associated markers were found in about 80% of domestic dog fecal samples and were rarely found in noncanine

This article not subject to U.S. Copyright. Published 2014 by the American Chemical Society

Received: May 30, 2014
Revised: August 28, 2014
Accepted: September 9, 2014
Published: September 9, 2014

dx.doi.org/10.1021/es502637b | Environ. Sci. Technol. 2014, 48, 11453−11461

Environmental Science & Technology

Article

Laboratories, Westbrook, ME) and qPCR analysis. Automated sampler blanks were made by rinsing unused sample bottles with 50 mL distilled water prior to filtration. Filtrates from 50 mL of each stormwater sample were collected on 0.4 μm polycarbonate filters (Whatman, GE Healthcare Life Sciences, Piscataway, NJ) and stored at −80 °C overnight prior to DNA extraction in a 15 mL tube. A filter blank was performed in the same manner except that 50 mL of molecular grade water was used instead of stormwater.

Total DNA Extraction and Quantification. Both fecal and stormwater samples were extracted using the DNA-EZ kit (GeneRite, North Brunswick, NJ) according to the manufacturer’s instructions with some modifications. For fecal samples, about 0.5 g wet weight of fecal material was added to 1 mL GITC buffer (5 M guanidine isothiocyanate, 100 mM EDTA [pH 8.0], 0.5% Sarkosyl) and vortexed to make fecal slurries. Four hundred microliters of elution buffer and 700 μL of fecal slurry were added to a bead mill tube and agitated at 6 m/s for 40 s using an MP FastPrep-24 instrument (MP Biomedicals, LLC, Solon, OH). After centrifuging bead mill tubes for 1 min at 14 000g, 500 μL of supernatant was combined with 1000 μL of binding buffer and vortexed. This mixture was then added to the DNA binding column and washed twice with wash buffer. DNA was eluted with 100 μL of warm elution buffer (∼55 °C) and stored at −20 °C until further analysis. DNA was extracted from water sample filtrates in the same way except that 500 μL of GeneRite lysis buffer and 12 ng of salmon sperm DNA (Sigma-Aldrich, St. Louis, MO) were added to the 15 mL tube with the filter instead of GITC, and no elution buffer was added to the bead mill tube, resulting in only 500 μL of lysate being added to the bead tube before agitating. One or two extraction blanks were included with every extraction batch by adding 700 μL of molecular grade water to the bead tube instead of fecal slurry or lysate, for a total of 22 extraction blanks.
Total DNA concentrations for each sample were estimated with the Quant-iT PicoGreen dsDNA Assay Kit (Life Technologies, Carlsbad, CA) on a SpectraMax Paradigm Multi-Mode Microplate Detection Platform (Molecular Devices, Sunnyvale, CA) according to the manufacturer’s instructions.

Genome Fragment Enrichment and Enriched Sequence Annotation. GFE was performed as previously described17−19 using dog fecal DNA as the target and swine fecal DNA as the blocker. Fragments were ligated into pCR4-TOPO plasmids and transformed into One Shot TOP10 cells (Life Technologies). Amplicon Sanger sequencing was performed on both strands by the dye-terminator method using an ABI PRISM 3730XL DNA Analyzer (Life Technologies). Vector sequence and low quality flanking regions were removed using the functions vectorstrip and trimseq in the EMBOSS suite.21 In some cases, this procedure resulted in discarding entire sequence fragments. Predictions of possible gene function and taxonomic association were obtained through the RAMMCAP22 via CAMERA23 and MG-RAST24 pipelines, respectively. Sequence dereplication was performed with CD-HIT using a 90% identity threshold.25 Identification of open reading frames (ORFs) was performed with Metagene.26 Marker selection by function was based on annotations using the clusters of orthologous groups (COG) database.27,28 Since database content has changed significantly since the time of initial fragment isolation and sequencing, more recent annotations of selected marker fragments using BLASTx and the NCBI Protein Reference

fecal samples tested in this study. As a proof-of-concept, newly developed TaqMan assays were tested on stormwater samples collected from an urban rain garden frequented by domestic dogs. Findings suggest that these new assays may be helpful for the identification and quantification of aquatic fecal contaminants originating from canines.



MATERIALS AND METHODS

Sample Collection. Fecal samples were collected over a wide geographic range by a large cohort. All sampling was done using sterile tubes and gloves, paying special attention to keeping individual samples separate. All canine fecal samples were obtained from household pets except those from Ohio, which were collected from the local rescue facility in Cincinnati. During transit from the field to the laboratory, samples were stored on ice and upon arrival stored at −80 °C until shipment to the EPA facility in Cincinnati, OH. The quantity, geographic origin, and host common names of the samples tested are listed in Table 1. Sewage influent samples were collected over a wide geographic range by collaborators (for sample description, see ref 20). As a proof-of-concept pilot study, automated samplers collected stormwater runoff from a local, urban rain garden known to be frequented by canines during a 27 h storm event (April 11−12, 2014). Water samples were retrieved within 8 h from the end of sampling and split for culture analysis of total coliforms and E. coli (MPN method, IDEXX

Table 1. Fecal DNA Extracts Used to Estimate Sensitivity, Specificity, and Host Contributions(a)

host (common name)    location
alpaca                NZ
cat                   NZ, OH
chicken               CA, KY, NZ
cow                   GA, NE, NZ, OH, WY
deer                  FL, NY, WA
dog                   CA, FL, OH, WY
elk                   WY
gull                  CA, WI
human                 OH, OR
pronghorn             WY
sheep                 NZ
swine                 GA
turkey                KY

n (end point), values in table order, blank cells not shown: 14 6 9 18 12 10 8 9 8 26 22 10 19 5 1 5 9 8
n (qPCR), values in table order, blank cells not shown: 8 3 14 6 9 4 18 12 10 12 10 8 10 5 24 22 19 7 1 5 10 10 9 8
canine fecal sample total: 66 (end point), 46 (qPCR)
noncanine fecal sample total: 133 (end point), 198 (qPCR)

(a) “n” denotes number of individual fecal samples used in each testing phase.


Table 2. Dog-Associated Assay Oligonucleotide Sequences

assay    forward (5′-3′)             reverse (5′-3′)               probe (5′-3′)
DG2      CCTACACGAAGGGCATCCAA        TCTGCATTAACCTCCGTCCG
DG3(a)   TTTTCAGCCCCGTTGTTTCG        TGAGCGGGCATGGTCATATT          [FAM] AGTCTACGCGGGCGTACT [MGB]
DG5      GGACCACTGCTTTGTCTTGCGACG    GTTTAACTAAAACACAGCTCTATGCA
DG29     GCAGGGGAGATGACAGAGAT        AGGCTGCATAATCCGCAAAA
DG37(a)  TTTTCTCCCACGGTCATCTG        CTTGGTTATGGGCGACATTG          [FAM] TTGAACGTTTAAAGGAGCAGGTGGCAG [TAMRA]
DG39     ACGCCTGAATGGTGTAAAGG        GTCGAAAGCCTTTTCAAGCG
DG46     ACTACAGCGGTAAGGGCAAT        CCCCGATGCCTATATGGTAG
DG68     CGGTGAGCTGATTGGTACAG        TGTCGGGTAATCAGGTAGGA
DG72(a)  GCAACTTGGTGAGGAAAAGG        TCCAGTATTTCCCGTCGTGT          [FAM] AAAGGTATTCCGCATGACTTCATCATCCGC [TAMRA]
DG74     CCCTTCGCTTCACTGATTTC        TTCAACGGGTTGTTCAGTCA
DG75     ACTCCTCTGGAGGCACTGAA        ACTGAACAACCCGTTGAAGG
DG80     ATGGCCAGCAATGGACTTAC        GATGCAAAGGTGCTGATGC

(a) End point, SYBR Green qPCR, and TaqMan qPCR assays were designed for these assays (DG3, DG37, and DG72; shown in bold in the original table). For all other assays only end point versions were designed.

each DNA sample. Reaction fluorescence from cycles 8 to 13 was used to establish baseline fluorescence, and a fluorescence threshold of 0.03 was used for all canine qPCR assays. A minimum of two no template control reactions were included with each qPCR instrument run. TaqMan qPCR assays HF183/BacR287,31 GenBac3, and Sketa22 (http://water.epa.gov/scitech/methods/cwa/bioindicators/biological_index.cfm#rapid) were used as previously described.

End Point PCR Assay Selection. To determine which candidate assays had fecal source identification potential, the 92 primer sets were tested with end point PCR in a two-step process. In the first step, candidate primer sets were challenged against two fecal DNA composites. The first, a canine fecal DNA composite (0.05 ng total DNA/μL = 0.1 ng total DNA/reaction) consisting of equal DNA masses from 20 canine individuals (10 each from Florida and Wyoming populations), was used to determine the presence or absence of candidate markers within dogs. The second, a DNA composite consisting of human (n = 6), goose (n = 4), cattle fed forage (n = 5), cattle fed processed grain (n = 5), and cattle fed unprocessed grain (n = 5) fecal DNA (0.625 ng total DNA/μL, 0.125 ng DNA/μL from each population), was used to determine the presence of markers in likely sources of contaminants other than dogs. For the first assay screening step only, the number of end point PCR amplification cycles was decreased from 35 to 30 when using canine fecal DNA as template to select for markers with high abundance in dogs.
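The template amounts quoted for the two composites follow from concentration multiplied by template volume. The sketch below works through that arithmetic, assuming the 2 μL template volume given for the PCR reactions in this study. The function names are illustrative only, and the per-reaction mass of the noncanine composite is derived here rather than stated in the text.

```python
import math

TEMPLATE_VOLUME_UL = 2.0  # template volume per reaction, as stated for the PCR reactions


def mass_per_reaction(conc_ng_per_ul, volume_ul=TEMPLATE_VOLUME_UL):
    """Total DNA mass (ng) delivered to one reaction."""
    return conc_ng_per_ul * volume_ul


def composite_concentration(per_population_ng_per_ul, n_populations):
    """Total concentration (ng/uL) of a composite with equal contributions."""
    return per_population_ng_per_ul * n_populations


# Canine composite: 0.05 ng/uL of total DNA.
canine_mass = mass_per_reaction(0.05)

# Noncanine composite: five populations at 0.125 ng/uL each.
noncanine_conc = composite_concentration(0.125, 5)
noncanine_mass = mass_per_reaction(noncanine_conc)  # derived, not quoted in the text

print(canine_mass, noncanine_conc, noncanine_mass)
```

The first two values reproduce the 0.1 ng/reaction and 0.625 ng/μL figures given in the text.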
Assays were discarded from further analysis if any of the following conditions were met: (1) an amplification product of the expected size was absent when using canine fecal DNA as template, (2) amplification products of any size were produced when the noncanine fecal DNA composite was used as template, or (3) amplification byproducts, such as primer dimerization molecules or other spurious PCR products noticeably different in size from the expected PCR product, were produced when using either fecal DNA composite as template. Primer sets that met these criteria (Table 2) were used for more rigorous end point PCR testing against fecal DNA extracts listed in Table 1. For this second step of screening, noncanine fecal DNA composites and individual canine fecal DNAs both consisting of 0.5 ng DNA μL −1 individual−1 (1 ng DNA reaction −1 individual−1) were used to further characterize marker distributions outside and within canine populations, respec-

Sequence database (accessed September 9, 2013) were conducted (SI Table S1).

Selection of Candidate Sequences for PCR-Based Assay Development. Sequences without statistically significant homology to database sequences or the order Bacteroidales (e-value ≤0.001) were not selected for marker development. Thirty-eight sequences distributed across all functional categories were selected for marker development, including sequences with hypothetical functions. Forty-six sequences attributed to Bacteroidales bacteria were randomly selected for primer design. In some cases, a single sequence was used to design more than one primer set. In all, 84 sequences were used to design 92 primer sets for further testing.

Oligonucleotide Design and Preparation. Unique regions in putative marker sequences were selected for primer design by comparison with existing sequence information. Primer-BLAST29 was used to perform in silico tests for specificity and to predict melting temperature (60 ± 2 °C). For each TaqMan qPCR assay, two probes were designed according to guidelines30 to target short sequence regions with high amino acid conservation. SYBR Green qPCR assays were created by incorporating each forward and reverse primer into a SYBR Green qPCR chemistry platform without any other modifications. The same was done with TaqMan qPCR assays, but with the addition of an internal probe and exclusion of SYBR Green Dye.

PCR and qPCR Amplifications. Takara Ex Taq Hot Start Version PCR reagents (Clontech Laboratories), 200 nM each primer, 4.0 ng bovine serum albumin (BSA; Sigma-Aldrich, St. Louis), 2 μL template DNA, and molecular grade water were used for all end point PCR reactions. PCR products were visualized using 2% agarose gels with lithium borate buffer and 1× GelStar (Lonza). All end point PCR reactions were run on a Tetrad 2 thermal cycler (Bio-Rad Laboratories) under the following conditions: 94 °C for 5 min and 35 cycles of 40 s at 94 °C, 1 min at 60 °C, and 30 s at 72 °C.
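The end point cycling conditions above can be written down as a small data structure. This is a sketch only: the step labels (denaturation, annealing, extension) are conventional names assumed here, not terms used in the text, and the computed time counts programmed holds only, ignoring instrument ramping.

```python
# Each step is (label, temperature in deg C, hold time in seconds).
# Labels are assumed conventional names; the source gives only temperatures and times.
INITIAL_STEP = ("initial denaturation", 94, 5 * 60)
CYCLE_STEPS = [
    ("denaturation", 94, 40),
    ("annealing", 60, 60),
    ("extension", 72, 30),
]


def total_hold_seconds(initial_step, cycle_steps, n_cycles):
    """Sum of programmed hold times, excluding ramp time between steps."""
    per_cycle = sum(seconds for _, _, seconds in cycle_steps)
    return initial_step[2] + n_cycles * per_cycle


standard_run = total_hold_seconds(INITIAL_STEP, CYCLE_STEPS, n_cycles=35)
reduced_run = total_hold_seconds(INITIAL_STEP, CYCLE_STEPS, n_cycles=30)
print(standard_run, reduced_run)
```

The 30-cycle variant corresponds to the reduced cycle number used in the first screening step with canine fecal DNA as template.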
For qPCR assays, 25 μL reactions consisted of 1× TaqMan Environmental Master Mix (Life Technologies), 5.0 ng BSA, 200−4000 nM each primer, 40−180 nM probe (TaqMan reactions only), 0.1× SYBR Green I Dye (SYBR Green reactions only; Life Technologies), 2 μL template DNA, and molecular grade water. All qPCR reactions were run for 40 cycles on a StepOnePlus qPCR instrument (Life Technologies) under default conditions. Four qPCR reaction replicates were run for


marker. The R36 package epitools37 was used to estimate binomial confidence intervals (function binom.exact).

Quantitative Data Analysis and Performance Parameters. The R package qpcR38 was used for sigmoidal model fitting (functions modlist and pcrbatch). Sigmoidal efficiency, defined as the cycle specific efficiency ($E_n$) at the second derivative of the fitted amplification curve, where $E_n = F_n / F_{n-1}$ and $F_n$ and $F_{n-1}$ are the cycle specific fluorescence values at cycles $n$ and $n-1$, respectively,32 was used to estimate amplification efficiency during assay optimization. In addition, separate amplification efficiency estimates were derived from standard curves using the equation $E = 10^{-1/\mathrm{slope}} - 1$, where “slope” is the slope of the pooled calibration curve. Resolution was estimated for standard curves as done previously.39 Briefly, the distance from the calibration curve regression line to the 95% confidence interval was estimated at each dilution, and the geometric mean of these distances indicates resolution. Mean marker concentrations across fecal or sewage samples were estimated by first assuming that samples whose marker concentrations were below limits of quantification (LOQ) contained a single marker copy before averaging.

Limit of detection (LOD) and LOQ were also used to help interpret experimental and environmental data. LOQ was defined as the lowest concentration of plasmid standard whose resulting quantification threshold (Cq) value fell within linearity of the fitted standard curve model where $R^2$ values were ≥0.97. CqLOQ was defined as the Cq value at the LOQ. For TaqMan reactions, the assay LOQ (10 plasmid copies/reaction, the lowest concentration tested) was also considered the assay LOD. Any set of replicate reactions whose melt curves indicated the intended PCR product size was considered to be in the detectable range, regardless of Cq values or the presence of side-product melt temperature (Tm) peaks.
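The two efficiency definitions given in this section can be illustrated with a short sketch. It is not the authors' R workflow: it substitutes a plain least-squares fit in Python, and the standard curve values are synthetic, generated to represent perfect doubling per cycle rather than taken from the study.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def efficiency_from_slope(slope):
    """E = 10^(-1/slope) - 1, with slope taken from Cq versus log10(copies)."""
    return 10 ** (-1.0 / slope) - 1.0


def cycle_efficiency(f_n, f_prev):
    """Cycle-specific efficiency E_n = F_n / F_(n-1)."""
    return f_n / f_prev


# Synthetic, idealized standard curve (NOT data from the study).
copies = [1e1, 1e2, 1e3, 1e4, 1e5]
ideal_slope = -1.0 / math.log10(2.0)  # slope when template doubles every cycle
cq = [38.0 + ideal_slope * (math.log10(c) - 1.0) for c in copies]

slope, intercept = fit_line([math.log10(c) for c in copies], cq)
eff = efficiency_from_slope(slope)
print(round(slope, 3), round(eff, 3))
```

Note that the two measures sit on different scales as defined in the text: perfect doubling gives a standard-curve efficiency E of 1 but a cycle-specific ratio E_n of 2.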
SYBR Green reactions with at least one replicate showing the expected Tm peak and Cq values within the LOQ were interpreted as quantifiable. Experimental Tm peaks within four standard deviations of the anticipated assay specific Tm peak (estimated using plasmid standards) were classified as target Tm peaks.
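The SYBR Green interpretation rules above can be sketched as a small classifier. Two readings are assumptions made for illustration: "Cq values within the LOQ" is taken to mean Cq ≤ CqLOQ, and the quantifiable replicate is taken to be one that also shows the target Tm peak. The melt temperatures and function names are hypothetical.

```python
from statistics import mean, stdev


def is_target_peak(tm, ref_mean, ref_sd, n_sd=4.0):
    """True if a melt peak lies within n_sd standard deviations of the reference Tm."""
    return abs(tm - ref_mean) <= n_sd * ref_sd


def classify_replicates(replicates, ref_mean, ref_sd, cq_loq):
    """Classify one sample from its replicate reactions.

    replicates: list of (tm_peaks, cq) pairs, where tm_peaks is a list of
    melt peak temperatures and cq may be None if no Cq was reported.
    """
    on_target = [
        cq for tm_peaks, cq in replicates
        if any(is_target_peak(tm, ref_mean, ref_sd) for tm in tm_peaks)
    ]
    if not on_target:
        return "not detected"
    # Assumption: "within the LOQ" means Cq at or below the Cq of the LOQ standard.
    if any(cq is not None and cq <= cq_loq for cq in on_target):
        return "quantifiable"
    return "detected, below LOQ"


# Hypothetical plasmid-standard melt peaks (deg C), for illustration only.
standard_tms = [80.1, 80.0, 79.9, 80.2, 79.8]
ref_mean, ref_sd = mean(standard_tms), stdev(standard_tms)

example = classify_replicates(
    [([80.3, 74.0], 33.0), ([75.0], 36.0)], ref_mean, ref_sd, cq_loq=35.0
)
print(example)
```

Side-product peaks (74.0 in the example) do not affect the call, mirroring the statement that detection holds regardless of side-product Tm peaks.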

tively. Markers that had the highest estimates of specificity and maintained high levels of sensitivity after this second phase of screening were used as genetic targets for the development of three qPCR assays. qPCR Assay Optimization and DNA Test Concentrations. qPCR assays were optimized for maximum specificity and sigmoidal efficiency32 by testing a range of primer (200− 4000 nM) and probe (40−180 nM) concentrations, as well as two different probe sequences for each TaqMan assay. Probes that resulted in the highest sigmoidal efficiency while maintaining specificity were selected for further method development. Noncanine fecal DNA composites and individual canine fecal DNA composites both consisting of 0.5 ng DNA μL−1 individual−1 (1 ng DNA reaction−1 individual−1) were used to estimate marker concentrations in a range of fecal sources by qPCR (Table 1). Sewage DNA extracts were normalized to 0.5 ng DNA/μL (1 ng DNA/reaction) before qPCR analysis with canine and HF183/BacR287 assays. Undiluted DNA extracts were used to estimate marker concentrations in stormwater samples with canine, HF183/BacR287, and Sketa22 assays. Standard Curve Generation. To generate standard curves, decimal dilutions of a custom plasmid construct (5e4 to 5e0 copies/μL) that contained target sequences for each canineassociated genetic marker (Integrated DNA Technologies, Coralville, IA) were used as template for each assay. Plasmid constructs prepared in the same fashion were used to generate standard curves for HF183/BacR287 and GenBac3. Data from six separately run standard curves (duplicate reactions at each concentration per run) were compiled by pooling data for each assay. This is similar to a “master” approach as previously described,33 but simple linear regression was substituted for Bayesian statistical modeling. Amplification Inhibition and DNA Recovery Controls. 
Kinetic outlier detection (KOD) was used to screen for amplification inhibition in all samples tested.34,35 Briefly, genomic (Sketa22 only, 1e0 to 1e-5 ng/reaction) or plasmid (all other assays, 1e5 to 1e1 copies/reaction) DNA dilutions were used to generate a KOD reference data set (SI Table S2). Then, reference and experimental fluorescence data were fit to a seven-parameter sigmoidal model, and the linear relationship between fractional cycle values at first and second derivative maxima was used to estimate normal and outlier amplification profiles. Reactions with τnorm values less than the 99.9% quantile using the χ² distribution with one degree of freedom (−15.13) were considered significantly inhibited, but samples were not excluded from analysis unless all replicates exceeded the threshold. For stormwater samples, both DNA recovery and amplification inhibition were estimated with the Sketa22 assay similar to methods described previously.34 Briefly, salmon sperm DNA was added to the sample before DNA extraction instead of E. coli AF504 GFP cells. Sketa22 amplification profiles were first checked for amplification inhibition using the methods described above. Amplification profiles that passed the inhibition threshold were then used to estimate total DNA recovery as a percentage of the total spiked salmon sperm DNA.

Qualitative Performance Parameters. Marker prevalence was defined as the proportion of individuals where a respective marker was detected by end point PCR. Specificity was defined as the proportion of noncanine samples testing negative for a canine marker. Sensitivity was defined as the proportion of canine samples testing positive for a canine



RESULTS

Description of Putative Canine-Associated DNA Sequences. Of 768 enriched sequences submitted for sequencing, 679 contained a low percentage (0−2%) of ambiguities after trimming. Sequence length ranged from 74 to 951 bp and averaged 386 bp. DNA clustering at the 90% identity level resulted in 585 unique sequence fragments (88.2% of DNA clusters contained only one sequence). Seven hundred forty-six putative ORFs were identified and had minimum, maximum, and average lengths of 20, 306, and 98 residues, respectively. Clustering of ORFs at the 90% identity level resulted in 646 unique ORFs (87.8% of ORF clusters contained only one sequence). The resulting enriched sequence fragments were highly diverse in sequence composition and annotated function. Overall, sequences had very low similarity to reference database sequences, which made functional annotations impossible in some cases. Only 31.6%, 35.2%, and 47.6% of the putative ORFs had significant matches when searched against TIGRFAM, PFAM, and COG databases, respectively, as of November 11, 2011 (e-value ≤1e-3). ORFs involved in translation, ribosomal structure, and biogenesis were the most abundant (Figure 1). ORFs involved in signal transduction,


Figure 1. Open reading frames identified from GFE. Color scale signifies the mean amino acid identity of each open reading frame to its top hit in the database.

Table 3. Estimates of Specificity and Sensitivity for 12 Canine-Associated End Point PCR Assays(a)

         specificity                                  sensitivity
assay    num. pos. (n = 133)  estimate  LCI(b)  UCI   num. pos. (n = 66)  estimate  LCI   UCI
DG2      2                    0.98      0.95    1.00  52                  0.79      0.67  0.88
DG3(a)   0                    1.00      0.97    1.00  51                  0.77      0.65  0.87
DG5      5                    0.96      0.91    0.99  51                  0.77      0.65  0.87
DG29     15                   0.89      0.82    0.94  63                  0.95      0.87  0.99
DG37(a)  0                    1.00      0.97    1.00  56                  0.85      0.74  0.92
DG39     8                    0.94      0.88    0.97  48                  0.73      0.60  0.83
DG46     3                    0.98      0.94    1.00  48                  0.73      0.60  0.83
DG68     0                    1.00      0.97    1.00  51                  0.77      0.65  0.87
DG72(a)  0                    1.00      0.97    1.00  50                  0.76      0.64  0.85
DG74     3                    0.98      0.94    1.00  52                  0.79      0.67  0.88
DG75     1                    0.99      0.96    1.00  52                  0.79      0.67  0.88
DG80     7                    0.95      0.89    0.98  48                  0.73      0.60  0.83

(a) Based on these results, the marked assays (DG3, DG37, and DG72; shown in bold in the original table) were chosen for the design of qPCR assays. (b) LCI and UCI are 95% lower and upper confidence intervals, respectively, based on a binomial distribution.
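The confidence limits in Table 3 are exact binomial intervals. As a rough cross-check, the sketch below implements the Clopper-Pearson interval with the Python standard library, using bisection on the binomial tail probabilities. It is an independent illustration, not the epitools binom.exact code used in the study, and the function names are chosen here.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _solve(f, target, increasing, tol=1e-10):
    """Bisection on [0, 1] for f(p) = target, with f monotone in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_ci(x, n, conf=0.95):
    """Clopper-Pearson interval for x successes out of n trials."""
    alpha = 1.0 - conf
    if x == 0:
        lower = 0.0
    else:
        # P(X >= x | p) increases with p; the lower limit sets it to alpha / 2.
        lower = _solve(lambda p: 1.0 - binom_cdf(x - 1, n, p), alpha / 2.0, True)
    if x == n:
        upper = 1.0
    else:
        # P(X <= x | p) decreases with p; the upper limit sets it to alpha / 2.
        upper = _solve(lambda p: binom_cdf(x, n, p), alpha / 2.0, False)
    return lower, upper


# DG3 specificity: 0 positives among 133 noncanine samples, so 133 of 133 negative.
spec_lo, spec_hi = exact_ci(133, 133)
# DG2 sensitivity: 52 positives among 66 canine samples.
sens_lo, sens_hi = exact_ci(52, 66)
print(round(spec_lo, 2), spec_hi, round(sens_lo, 2), round(sens_hi, 2))
```

For the all-negative case, the lower limit reduces to 0.025^(1/133), about 0.973, which agrees with the 0.97 reported for DG3, DG37, DG68, and DG72.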

defense, and cell wall/membrane/envelope or ORFs with unknown function shared the least identity with amino acid database sequences. Because some sequences shared high similarity to proteins previously reported in bacterial genome sequencing projects, taxonomic annotation was possible in these instances. Approximately 61% and 33% of ORFs with a significant taxonomic association were associated with Bacteroidetes and Firmicutes phyla, respectively. One ORF was associated with C. jejuni subsp. jejuni ferric receptor cfrA (65% identity, e-value = 8e-15). Three other ORFs with unknown function were associated with Saccharomyces cerevisiae (60−62% identity, e-value = 8e-42 to 1e-8), a common dog food additive. One fragment was annotated as a Canis lupus familiaris reverse transcriptase homologue (91% identity, e-value = 2e-40).

Distribution and Identification of Canine-Associated PCR Markers. Out of 92 end point assays tested, 12 amplified a single expected PCR product from a canine fecal DNA composite and did not amplify PCR products from a noncanine fecal DNA preparation. Secondary screening of these 12 assays using 198 fecal DNA extracts demonstrated that all 12 markers exhibited high levels of specificity, ranging from 89% to 100%, and sensitivity, ranging from 73% to 95% (Table 3). Notable exceptions included DG29, which was found in the highest proportion of canine individuals but was also found in one human and all 14 Ohio felines. Among noncanine populations, markers were most prevalent in the Ohio feline population, which cohabitated with Ohio dogs. No false positives were observed with chicken, cow, deer, elk, or swine DNA extracts. Dog-associated markers were completely absent in all human fecal samples tested except for DG29 (1 of 6 samples positive). The average prevalence of canine-associated markers across all four dog populations ranged from 43.8% (n = 4, St. Petersburg, FL) to 99.6% (n = 22, Cincinnati, OH).




Development of DG3, DG37, and DG72 qPCR Assays. The top performing end point PCR assays, DG3, DG37, and DG72, were adapted to two qPCR platforms, SYBR Green and TaqMan qPCR. Optimal concentrations for each assay were 1400 nM for each primer and 100 nM for the probe. Amplification of plasmid DNA dilutions showed that the assays have high reproducibility and resolution (SI Table S3).

Concentrations of Markers in Fecal, Sewage, and Storm Water DNA Extracts. DG3, DG37, and DG72 marker concentrations estimated from the analysis of 244 fecal DNA extracts were high in dogs (SI Figure S1) and below the LOD ( 0.2). Canine markers were found at quantifiable concentrations in 10 out of 48 sewage influent samples and at a single-sample maximum concentration of 48.7 copies/ng sewage influent total DNA (SI Figure S2). In contrast, human-associated markers (HF183/BacR287 assay) were detected in all 48 samples at concentrations that were, on average, about 3 orders of magnitude higher than those of canine-associated markers. The prevalence of canine markers among urban stormwater samples was low, while human-associated markers were not detected (SI Table S4). In two samples, quantifiable mean canine marker concentrations were between 14.8 and 114.8 copies per reaction. In four instances representing three distinct samples, we found marker concentrations below the assay LOQ but above the assay LOD using SYBR Green canine assays. All three canine markers were found in the sample with the highest counts of E. coli, and no markers were found in samples with fewer than 144 E. coli MPN/100 mL (n = 8). Concentrations of molecular markers did not significantly correlate with counts of total coliforms or E. coli (p > 0.4).

Quality Controls. Overall, sigmoidal fitting and KOD of qPCR amplification profiles from the control assays Sketa22, GenBac3, and HF183/BacR287 indicated the absence of amplification inhibition in all DNA extracts (SI Figure S3). Amplification profiles produced by the canine fecal source identification assays themselves provided further evidence that quantification accuracy was not greatly affected by PCR inhibitors. Of the 100 no template controls (NTCs), none produced amplicons of the anticipated size. Although seven SYBR Green NTCs had Cq values less than 40 (ranging from 37.4 to 39.5), the presence of contaminants was excluded based on melt curve analysis. Cq values for all TaqMan NTC reactions were “undetermined”. The filter blank and extraction blanks contained no detectable molecular marker concentrations. DNA extraction recoveries from stormwater samples were roughly normally distributed within a range of 0.1−44.0%.


DISCUSSION

Fragment Enrichment Reveals Non-Ribosomal Canine-Associated Markers. Over 500 candidate canine-associated sequences were obtained using a competitive solution-phase hybridization method. Because humans and canines often cohabitate, canine-associated fecal source identification methods targeting 16S rRNA genes often cross-react with humans.14 In this study, we expanded the search for canine-associated genetic markers beyond 16S rRNA genes to include all functional genes present in a canine fecal microbial community. After testing 92 putative canine-associated markers, this metagenomic strategy led to the identification of 11 DNA markers present in canines but absent in human fecal samples tested in this study. The low success rate (11 of 92 markers not present in humans) supports previous notions that animal cohabitation can lead to the sharing of commensal bacteria15,16 and lower the number of discriminating genetic markers.

Trends in Canine-Associated Genetic Marker Host Distributions. Patterns in marker distributions uncovered by end point PCR among hosts suggest that marker distributions are determined by more than just their animal host taxonomy. The relatively high proportion of Ohio rescue facility canines that carried all markers may be attributable not only to the fact that these rescue animals cohabitated and were kept on the same diet, but also to the fact that the original GFE fragments were obtained from a canine fecal sample collected from this same facility months before. These two factors could explain why canine-associated markers were consistently prevalent among all individuals in this Ohio population. In contrast, the lower prevalence of markers among household pet canines in California, Florida, and Wyoming populations could be explained by the lack of cohabitation, dissimilar diets or antibiotic treatments, as well as exposures to different sources of commensal or environmental bacteria.
Although the exact taxonomic identity is unknown, annotations suggest that both DG3 and DG72 markers are associated with Bacteroides. This common annotation may be supported by the significant positive correlation between DG3 and DG72 marker concentrations among dogs and suggests that these assays target a similar population of bacteria or that the bacteria targeted by the assays share a close ecological niche in canine microbiotas. Much research shows that members of the genus Bacteroides dominate mammalian microbiotas (e.g., ref 40) and that canines are no exception.41 As we show here, high host marker concentrations can translate directly into a higher likelihood of detection in areas where canine fecal contaminants are thought to be present.

Despite cohabitation with dogs within the same facility, Ohio felines shared relatively few markers with their canine cohabitants. One exception was DG29, which was shared by all Ohio rescue cats and dogs. Many factors could explain this observation. Crystal structures of Bacillus SpoIIAA,42 an orthologue of the gene targeted by DG29, show that the assay targets a sequence region with multiple conserved domains thought to be involved in direct interaction with anti-sigma factors and the regulation of genome-wide expression. Given its importance in cellular regulation, it is possible that this sequence region has remained highly conserved throughout the divergence of Lachnospiraceae historically associated with Carnivora. It is also possible that bacteria with the DG29 marker have the capability to survive or

dx.doi.org/10.1021/es502637b | Environ. Sci. Technol. 2014, 48, 11453−11461

states by consistently testing positive in most dog fecal samples. Although a broad distribution of dog-associated genetic markers was observed in this study, additional testing is needed to confirm prevalence in local watersheds of interest. Experiments suggest that these genetic markers, especially DG3, DG37, and DG72, are dog-associated and that they may have future utility in environmental monitoring. This notion is further supported by the proof-of-concept on stormwater samples, which suggests that these genetic markers are probably not indigenous to urban environments and that they can persist long enough to be detectable by qPCR. However, in order to realize the full potential of these methods for fecal source identification applications, several issues remain to be addressed. More work investigating the correlation of these canine-associated markers with zoonotic pathogens and general fecal indicators (E. coli and enterococci) is needed to determine relevance to human health risk in waters impaired by canine fecal pollution. Additional research is also needed to characterize the environmental factors affecting the survival of bacteria harboring the genetic markers identified herein.

even colonize within new hosts upon transmission. Additional information on the promiscuity of the DG29 marker gene among host bacteria, as well as the distribution of these bacteria among other animal hosts, is needed to help explain the trends observed in this study.

Human Assay Required to Discriminate Between Canine and Sewage Contamination. Our goal was to develop fecal source identification assays that reliably distinguish canine fecal contamination from other common pollution sources; however, the identification of low levels of canine markers in sewage influent reinforces the idea that tests for multiple sources should be performed to prevent misinterpretation of results. Given that there were no detectable DG3, DG37, or DG72 markers in the human fecal extracts tested in this study, it is possible that the low levels of these markers in sewage are due to the presence of actual canine contaminants mixed with human waste, perhaps at the point of origin or from mixing with surface runoff. Regardless of the cause, these canine source identification assays should be used alongside sensitive human methods in order to reliably distinguish between these two sources. In cases where sewage is the source of detectable levels of canine markers, human markers would be expected to co-occur in the same DNA extracts at much higher levels.10

Improved Flexibility in Fecal Source Identification Assays. Despite the clear benefits molecular methods offer for the management of aquatic resources, their adoption into monitoring routines is slow because it requires dedicated laboratory work space, new equipment, and additional personnel training. Quantitative capabilities require even more equipment and reagent costs. For this reason, we have developed and demonstrated the utility of three canine source identification methods on three PCR-based platforms that range in application costs.
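The co-occurrence reasoning for sewage described earlier in this section can be sketched as a simple decision rule. The following is a minimal illustration only and is not part of the published method: the function name `interpret_sample`, the use of `None` to represent a result below the limit of detection, and the tenfold `sewage_ratio` threshold are all assumptions chosen for the example. The text states only that human markers would be expected at "much higher levels" and gives no numeric cutoff.

```python
from typing import Optional


def interpret_sample(canine_copies: Optional[float],
                     human_copies: Optional[float],
                     sewage_ratio: float = 10.0) -> str:
    """Interpret paired canine and human marker results from one DNA extract.

    Concentrations are in the same units (e.g., copies per reaction).
    None means the marker was below the limit of detection.
    sewage_ratio is a hypothetical cutoff, not a value from the study.
    """
    if canine_copies is None and human_copies is None:
        return "neither marker detected"
    if canine_copies is None:
        return "human marker only"
    if human_copies is None:
        return "canine marker only: consistent with a canine source"
    # Both markers detected: compare their relative levels.
    if human_copies >= sewage_ratio * canine_copies:
        return "human marker dominant: canine signal may originate from sewage"
    return "both markers detected: mixed or canine-dominated source"


if __name__ == "__main__":
    # Illustrative values only; these are not measurements from the study.
    print(interpret_sample(50.0, None))
    print(interpret_sample(50.0, 5000.0))
```

A rule of this kind only flags samples for closer review; an appropriate ratio would need to be established from paired measurements on local sewage and canine reference samples.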
While end point assays do not provide quantitative information and may not be as sensitive as qPCR assays in terms of LOD, they avoid the higher cost of qPCR thermal cycler instruments, proprietary reagents, and quantitative reference standards. Although qualitative identification by end point PCR is not a substitute for quantitative methods, it may provide enough information to identify major contaminant sources, initiate a mitigation strategy, and expedite protection of human health. Consideration of turn-around time, cost, and required LODs is important when deciding on a particular quantitative platform. SYBR Green-based assays use a less-expensive intercalating dye instead of an assay-specific fluorogenic oligonucleotide probe, which translates into higher signal intensity per amplicon for SYBR Green-based assays. This factor could explain the higher occurrence of positive stormwater samples for SYBR Green- versus TaqMan-based assays. Although additional time is needed to distinguish unintended amplification byproducts from an accumulation of intended amplicons through melt curve analysis (∼1 h per run), SYBR Green platforms may be ideal for laboratories hoping to detect very low concentrations of molecular markers (