
Identification of Individual Immobilized DNA Molecules by their Hybridization Kinetics Using Single-Molecule Fluorescence Imaging Eric M Peterson, and Joel M. Harris Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.7b04512 • Publication Date (Web): 26 Mar 2018 Downloaded from http://pubs.acs.org on March 26, 2018



Analytical Chemistry

Identification of Individual Immobilized DNA Molecules by their Hybridization Kinetics Using Single-Molecule Fluorescence Imaging

Eric M. Peterson and Joel M. Harris*

Department of Chemistry, University of Utah, 315 South 1400 East, Salt Lake City, Utah 84112-0850

[email protected]; 1-801-581-3585

ABSTRACT

Single-molecule fluorescence methods can count molecules without calibration, measure kinetics at equilibrium, and observe rare events that cannot be detected in an ensemble measurement. In this work, we employ total-internal-reflection fluorescence microscopy to monitor hybridization kinetics between individual spatially-resolved target-DNA molecules immobilized at a glass interface and fluorescently-labeled complementary probe DNA in free solution. Using super-resolution imaging, immobilized target DNA molecules are located with 36-nm precision, and their individual duplex formation and dissociation kinetics with labeled DNA probe strands are measured at site densities much greater than the diffraction limit. The purpose of this study is to evaluate uncertainties in identifying these individual target molecules based on their duplex dissociation kinetics, which can be used to distinguish target molecule sequences randomly immobilized in mixed-target samples. Hybridization kinetics of individual target molecules are determined by maximum-likelihood estimation (MLE) of their dissociation times from a sample of hybridization events at each target molecule. The dissociation-time distributions thus estimated are sufficiently narrow to allow kinetic discrimination of different target sequences. For example, a single-base thymine to guanine substitution on immobilized strands produces a 2.5-fold difference in dissociation rates of complementary probes, allowing for identification of individual target DNA molecules by their dissociation rates with 95% accuracy. This methodology represents a step toward high-density single-molecule DNA microarray sensors and a powerful tool to investigate kinetics of hybridization at surfaces at the molecular level, providing information that cannot be acquired in ensemble measurements.

ACS Paragon Plus Environment


INTRODUCTION

DNA microarray technology1,2 allows for rapid screening of specific oligonucleotide sequences associated with pathogen infections,3 disease states,4 or chromosomal abnormalities.5 To detect the dilute genetic material in samples with conventional microarray instrumentation, samples require time-consuming separations, DNA amplification, and labeling.6 Even with automation,3 enzymatic oligonucleotide amplification increases the time and material cost of microarray assays, and requires careful optimization to reduce PCR amplification bias7 or contamination. Microarray assays could be faster and cheaper without an amplification step but would necessitate extremely low detection limits, potentially down to individual molecules. A number of detection schemes have been used to detect DNA hybridization in situ using surface plasmon resonance (SPR),8,9 or average fluorescence intensity;10,11 however, these techniques require large quantities of material to detect a hybridization response. With the ability to address individual molecules, single-molecule sensors offer more than low detection limits: they can quantify analyte molecules without intensity calibration, report kinetics at equilibrium, and measure rare kinetic events that could not be detected in an ensemble measurement.12-14 One such technique, single-molecule fluorescence imaging, has been used to detect binding interactions between proteins and antibodies,13 aptamers and haptens,15 and nucleic acids at interfaces.16,17 Super-resolution fluorescence imaging allows detection of individual molecules with ~20-nm spatial- and 10-ms time-resolution,18,19 making it nearly ideal for measuring biorecognition kinetics. These techniques generally employ selective readout schemes to ensure that individual molecules detected result from specific interactions with probe molecules, rather than from nonspecific adsorption at the interface.
Some schemes restrict detection to nanofabricated structures at known locations, such as zero-mode waveguides,20-22 or gold nanostructures.23 Nonspecific adsorption can also be excluded by labeling immobilized capture sites combined with Förster resonance energy transfer (FRET),24,25 two-color colocalization,26,27 or labeled origami structures.17,28,29 This approach limits long-term observations, however, because the fluorescent label on the immobilized target can photobleach.30


Recently, Johnson-Buck31 and Su32 reported hybridization of individual DNA molecules immobilized randomly on surfaces and located ‘target’ DNA molecules from reversible hybridization with known complementary fluorescently-labeled ‘probes’ in free solution. By detecting multiple hybridization events at specific locations, nonspecific interactions can be excluded without requiring co-localization or FRET. The lifetimes of the individual hybridization events provide a direct measure of the DNA duplex association and dissociation rate constants, and by their ratio, the association constant, Ka.17,24 In contrast with ensemble techniques, single-molecule methods can measure kinetics without measuring binding isotherms, calibrating capture site density,33,34 or performing concentration-step experiments.35 We have assembled single-molecule microarrays and measured hybridization kinetics of individual immobilized DNA molecules to evaluate the uncertainties of identifying these spatially-resolved ‘target’ molecules based on their duplex dissociation kinetics with labeled ‘probe’ DNA molecules in solution and thereby determine whether they can be distinguished from other target sequences in mixed target samples. We employ silane chemistry to attach DNA anchoring strands, which in turn immobilize target ssDNA, on glass surfaces that are passivated against nonspecific adsorption. Individual hybridization events at target DNA are located using DNA-point accumulation for imaging in nanoscale topography (DNA-PAINT).17,36,37 This highly parallel method is ideal for sampling diverse molecular populations, in contrast with serial schemes that sample one molecule at a time.12 In addition, sub-diffraction spatial resolution (super-resolution) reduces the impact of nonspecific adsorption because target sites are located with greater precision, and the information density can be increased,12,13 because higher resolution allows more closely-spaced target sites to be addressed without overlap.
With high spatial resolution and low non-specific adsorption, we identify individual unlabeled immobilized target DNA molecules simultaneously by their kinetics of hybridization with different labeled DNA-probes in solution. The unlabeled immobilized-target molecules can be observed for long times (>24 hours) without any influence of photobleaching, which can lead to rapid loss of labeled-target sites in FRET or colocalization


assays. The labeled-probe DNA molecules exchange rapidly with the solution population, so their photobleaching does not significantly influence the results. The identification of individual target DNA molecules in a mixed random microarray proceeds by measurement of hybridization kinetics at each target DNA site on the surface. While duplex dissociations follow first-order kinetics, the challenge of determining duplex lifetimes of individual target molecules is the small sample of dissociation events (less than 40), too few for least-squares analysis of a histogram of dissociation times. To address this challenge, we employ a solution to an analogous problem of determining fluorescence lifetimes from small samples of single-photon arrival-times following pulsed-laser excitation.38 These arrival-times are drawn from an exponential-decay distribution, equivalent to the first-order dissociation kinetics of a DNA duplex. This problem has been addressed through application of a maximum-likelihood estimator (MLE)38 used to determine excited-state lifetimes of individual fluorescent molecules.39,40 We apply the same MLE to determine duplex dissociation times of individual target DNA molecules. The MLE is able to report the best estimate of the mean of the dissociation-time distribution from a sample of hybridization events at each target DNA molecule. The resulting MLE dissociation-times at each target molecule are compared with predictions of a Poisson-weighted Erlang distribution41 to evaluate the extent to which sampling statistics govern uncertainties in duplex dissociation times. Kinetic discrimination of DNA capture sites was tested on target sequences having a substitution, near their middle, of a T-A for a G-C base pair, resulting in a 2.5-fold decrease in dissociation times.
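The sampling argument in this paragraph can be illustrated with a small Monte Carlo sketch. This is not the authors' analysis code, and it rests on several assumptions that are ours: the function names are hypothetical, lifetimes are in arbitrary units with a 2.5-fold ratio, the mean number of events per site (10.9) is an illustrative value borrowed from the Poisson fit reported later in the paper, sites are classified with a simple geometric-mean threshold, and the estimator is the unbinned sample mean (which is the maximum-likelihood estimate of the mean of an exponential distribution) rather than the binned estimator of reference 38.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def mle_lifetime(times):
    """For unbinned exponential data the MLE of the mean lifetime is the sample mean."""
    return sum(times) / len(times)


def simulate_site(rng, tau, mean_events, min_events=2):
    """Simulate one site: a Poisson number of events with exponential durations."""
    k = 0
    while k < min_events:  # sites with too few events are discarded and redrawn
        k = poisson_sample(rng, mean_events)
    return mle_lifetime([rng.expovariate(1.0 / tau) for _ in range(k)])


def classification_accuracy(tau_short, tau_long, mean_events, n_sites=5000, seed=0):
    """Fraction of simulated sites assigned to the correct population."""
    rng = random.Random(seed)
    threshold = math.sqrt(tau_short * tau_long)  # assumed decision rule
    correct = 0
    for _ in range(n_sites):
        correct += simulate_site(rng, tau_short, mean_events) < threshold
        correct += simulate_site(rng, tau_long, mean_events) >= threshold
    return correct / (2 * n_sites)


if __name__ == "__main__":
    acc = classification_accuracy(tau_short=1.0, tau_long=2.5, mean_events=10.9)
    print(f"fraction of sites classified correctly: {acc:.3f}")
```

Raising `mean_events` narrows the distribution of estimated lifetimes and improves the classification, which is the sense in which sampling statistics limit identification of individual sites.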
We test whether this difference in duplex lifetime is sufficient to allow individual target molecules to be identified when probed with a mixture of labeled-probe DNA molecules that are complementary to the two immobilized target strands.

EXPERIMENTAL SECTION

Probe and Target Oligonucleotides. Single-stranded DNA was synthesized using solid-phase phosphoramidite chemistry (Glen Research) and cartridge-purified by the University of Utah HSC Core DNA synthesis facility. Probe DNA with covalently-attached Cy3 fluorescent labels was further purified using HPLC. “Anchor” ssDNA was synthesized with a 3’ alkyl


amine modification for covalent attachment with the glycidoxysilane-modified surface, with the sequence: 5’-NH2-C6H12-(T)20-3’. Two “target” DNA strands were used; both contain polyadenine sections to attach to anchor DNA, and their recognition regions differ by one nucleotide, with sequences for the “G-target”: 5’-(A)20 TTC GGT ATA GCC CAT, and the “T-target”: 5’-(A)20 TTC GGT ATA TCC CAT. Probe oligonucleotides have a Cy3 fluorescent label attached to the 5’ end through an 18-atom polyethylene-glycol linker, and are complementary with 10 bases on the 3’ end of the target sequences; “G-probe”: 5’ Cy3-(PEG)6-ATG GGC TAT A, and “T-probe”: 5’ Cy3-(PEG)6-ATG GGA TAT A.33 A “scrambled” sequence having nucleotide content similar to the probe DNA was used to investigate nonspecific adsorption, with sequence: 5’-Cy3-(PEG)6-AGT AGT AGAT. Immobilization of “anchor” DNA on glass and subsequent binding of ssDNA targets are described in Supporting Information, page S-2.

Single-molecule imaging. Target-ssDNA modified coverslips were loaded into a microfluidics flow cell on a modified Olympus IX-71 inverted microscope with through-the-objective TIRF illumination (Supporting Information, page S-4).33,34 Hybridization at the coverslip-solution interface was monitored from solutions of 2.5- to 15-nM probe ssDNA in 250-mM NaCl, 2.5-mM phosphate buffer (pH 8.0). Excitation intensity of ~4.0 mW (51 W cm⁻² during illumination, 2.5 W cm⁻² averaged over 2-s intervals) provided acceptable signal-to-noise and minimal photobleaching. Images were acquired using a charge-coupled-device camera, and an intensity-threshold algorithm42 that accounts for the high background due to the high concentration of labeled DNA in the evanescent wave was used to locate individual molecules. Each single-molecule point-spread-function was fit to a 2D Gaussian function to locate its position.
Spatial-temporal autocorrelation was used to correct lateral drift in the microscope stage using the single-molecule coordinates. Drift-corrected single-molecule coordinates were tracked within a detection radius of 80 nm in subsequent image frames to measure residence times. Each individual single-molecule “event” is then used to locate probe sites and measure hybridization kinetics. Details are provided in Supporting Information, pages S-5, S-8 through S-13.
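The frame-to-frame tracking step described above can be sketched as follows. This is a minimal illustration and not the authors' implementation (which is documented in their Supporting Information): the function name `track_events`, the data layout, and the 2-s frame interval (inferred here from the 2-s averaging interval quoted for the excitation) are assumptions, and the sketch does not bridge dark frames caused by photoblinking.

```python
import math


def track_events(frames, radius_nm=80.0, frame_interval_s=2.0):
    """Link localizations in consecutive frames into single-molecule events.

    frames: list (one entry per frame) of lists of (x, y) positions in nm.
    Returns a list of dicts with the mean position, the start frame, and the
    residence time (number of consecutive frames times the frame interval).
    """
    active, finished = [], []
    for frame_index, points in enumerate(frames):
        still_active = []
        unused = list(points)
        for event in active:
            last_x, last_y = event["points"][-1]
            # Nearest unclaimed localization to the event's last position.
            best = min(
                unused,
                key=lambda p: math.hypot(p[0] - last_x, p[1] - last_y),
                default=None,
            )
            if best is not None and math.hypot(best[0] - last_x, best[1] - last_y) <= radius_nm:
                event["points"].append(best)
                unused.remove(best)
                still_active.append(event)
            else:
                finished.append(event)  # molecule left: the event ends here
        for p in unused:  # unmatched localizations start new events
            still_active.append({"start": frame_index, "points": [p]})
        active = still_active
    finished.extend(active)  # events still present in the last frame

    events = []
    for event in finished:
        n = len(event["points"])
        events.append({
            "x": sum(p[0] for p in event["points"]) / n,
            "y": sum(p[1] for p in event["points"]) / n,
            "start_frame": event["start"],
            "n_frames": n,
            "residence_s": n * frame_interval_s,
        })
    return sorted(events, key=lambda e: e["start_frame"])
```

Each returned event carries one averaged position and one residence time, which are the two quantities the later analysis needs: positions to locate sites and residence times to measure kinetics.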


RESULTS AND DISCUSSION

Comparing nonspecific adsorption to hybridization of immobilized target DNA. In order to interrogate hybridization of individual target DNA molecules immobilized on a surface, one must ensure that the capture of labeled probe DNA is due to duplex formation, rather than nonspecific adsorption. To evaluate this issue, we compare surface-populations of probe DNA in the presence and absence of immobilized target DNA. Blank substrates containing only covalently-immobilized “anchor” ssDNA show very weak interactions with labeled probe DNA. Solutions of 10-nM labeled T-probe DNA produce an average of 53 molecules/image (Supporting Information, Figure S4). To quantify specific base-pairing interactions, target ssDNA with a poly-adenine (A20) linker is immobilized to T20 anchor DNA through hybridization. A mixture of 100-pM T-target DNA with 10-nM labeled T-probe DNA was flowed over anchoring substrates so that progress of probe-site immobilization could be monitored in situ. Target DNA molecules were allowed to anchor for ~2 min before rinsing the cell with buffer to remove unbound target DNA. The A20-T20 linker region is able to adopt a wide range of stable structures with varying numbers of A-T base-pairs, as discussed in Supporting Information, pages S2-S3. Any target-capture structures with fewer than 14 A-T base pairs would exhibit poor thermal stability, and would be washed away during the rinsing step and not detected in subsequent assays. A solution of 10-nM T-probe DNA in buffer was reintroduced to monitor hybridization interactions. Following target DNA immobilization, the population of probe DNA at the interface increased by 60-fold over the blank, indicating successful immobilization of target DNA.
To ensure that target-DNA does not increase nonspecific adsorption of probe-DNA, the cell was rinsed, and a 10-nM solution of probe DNA having a scrambled sequence with equivalent base composition was introduced; the population of scrambled-DNA observed was 42 molecules (Figure S4), equivalent to T-probe-DNA prior to immobilization of target-DNA.

Identifying hybridization events using super-resolution localization imaging. Although we observe a marked increase in population of probe-DNA after immobilizing target-DNA to the substrates, the appearance of probe-DNA does not equate to identifying target molecules on the substrate. To distinguish target molecule hybridization from nonspecific


adsorption, we rely on DNA hybridization being reversible near its melting temperature, so that target molecule sites can be identified by observing repeated hybridization events at specific locations. In previous work, identifying reversible hybridization to individual molecules was achieved by locating spots in fluorescence images that show intensity fluctuations.31,32 These diffraction-limited images allow molecule localization with ~500 nm precision, limiting the density of molecules that can be resolved and the ability to reject nearby nonspecific adsorption events. By using super-resolution imaging, we reduce the uncertainty of target-molecule localization to ~36-nm precision, allowing 100-fold greater resolvable target-site densities and a comparable improvement in selectivity against nonspecific adsorption. This precision is achieved by fitting an empirical 2-D Gaussian function to the optical point-spread-function (PSF) measured for each located single-molecule spot.19,43 The signal-to-noise ratio of single molecule spots can be as low as S/N=2.5 due to the fluorescence background from solution-phase molecules in the evanescent wave and the low excitation intensity needed to minimize photobleaching. The low S/N challenges the fitting of the PSF, where parameters may not converge.44-46 To improve the speed and likelihood of convergence, we fit a 2-D Gaussian function with only two nonlinear parameters, the centroid x- and y-coordinates, and two linear parameters, the peak and baseline intensities. The radius of the PSF is fixed to an average derived from thousands of single-molecule images (Supporting Information, Figure S5). To ensure accurate registration of target sites, X-Y drift of the microscope stage must be corrected.
Spatial drift can be corrected by using “fiducial markers,” intensely-fluorescent objects tracked on a separate color channel to avoid interfering with weak single-molecule fluorescence.47 Alternatively, repetitive hybridization at a collection of target sites can be used to track drift,29 eliminating the need for 2-color imaging. In this approach, probe hybridization locations are collected in maps versus time, and each map is spatially cross-correlated with the map from the first time acquisition. Probe DNA hybridization with identical target molecules at different times generates a peak in the cross-correlation function, the location of which is the spatial offset between the images, which can be used to correct the typical 50-300-nm drift in molecular coordinates over a 30-min acquisition (Supporting Information, Figure S7).
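The cross-correlation idea can be sketched directly on lists of localizations. The sketch below is an illustration under our own assumptions, not the procedure documented in the Supporting Information: the function names, the 20-nm bin size, and the 400-nm search window are hypothetical choices. It relies on the fact that the cross-correlation of two point maps, evaluated on a grid, is a histogram of all pairwise displacements, so the most populated bin marks the drift.

```python
from collections import Counter


def estimate_drift(reference, later, max_shift_nm=400.0, bin_nm=20.0):
    """Estimate (dx, dy) drift between two localization maps.

    reference, later: lists of (x, y) positions in nm from two time windows.
    The pairwise-displacement histogram is the grid-sampled cross-correlation
    of the two point maps; its peak bin gives the common offset.
    """
    counts = Counter()
    sums = {}
    for xr, yr in reference:
        for xl, yl in later:
            dx, dy = xl - xr, yl - yr
            if abs(dx) <= max_shift_nm and abs(dy) <= max_shift_nm:
                key = (round(dx / bin_nm), round(dy / bin_nm))
                counts[key] += 1
                sx, sy = sums.get(key, (0.0, 0.0))
                sums[key] = (sx + dx, sy + dy)
    if not counts:
        raise ValueError("no point pairs within max_shift_nm")
    peak, n = counts.most_common(1)[0]
    # Refine the estimate with the mean displacement inside the peak bin.
    return sums[peak][0] / n, sums[peak][1] / n


def correct_drift(points, drift):
    """Subtract an estimated drift from a list of (x, y) positions."""
    dx, dy = drift
    return [(x - dx, y - dy) for x, y in points]
```

The double loop scales with the product of the two map sizes, which is acceptable for a sketch; an FFT-based correlation of binned images would be the usual choice for large data sets.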


Organizing hybridization events to detect individual target molecules. The drift-corrected coordinates of probe DNA molecules in each video frame are organized into hybridization “events” by tracking nearby (within 100 nm) coordinates in sequential video frames to determine their locations. The tracking algorithm bridges brief dark states of 2 video frames or less to minimize the impact of photoblinking on the measured kinetics.33 A map of a small sample area of drift-corrected T-target hybridization events is shown as an inset in Figure 1. Hybridization events appear in small clusters, the sizes of which are governed by the precision in localization and drift correction. We have used pair autocorrelation analysis48 to estimate the localization precision; this method computes the overlap integral of a density map with itself versus lateral shift. The spatial decay of the pair-autocorrelation function (PACF) shows that hybridization events cluster with a Gaussian profile, corresponding to a localization uncertainty of σ = 17.7 ± 0.1 nm (Supporting Information, Figure S8). This positional uncertainty can be used to establish spatial criteria to combine hybridization events clustered around each target molecule; target sites are identified by finding at least 2 hybridization events within a conservative 4.5σ or 80-nm radius. Using this criterion, we locate 2934 hybridization sites in a 50-by-50-µm area, examples of which are plotted in Figure 1 (inset) as 80-nm radius circles. The target molecule density can also be checked with pair-autocorrelation analysis. If the number of hybridization events per target molecule follows Poisson statistics, then the amplitude of the autocorrelation function is inversely proportional to the density of probe molecule clusters.48 The PACF fit used to determine the localization precision (Supporting Information, page S-12) predicts a molecule density of 2,940 ± 20 molecules, which matches the target-site density determined by counting sites.
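The site-identification criterion (at least 2 events within a 4.5σ radius) can be sketched as a simple greedy grouping. This is an illustrative sketch, not the authors' algorithm: the function name `find_sites` and the seed-centered greedy strategy are our assumptions, and an event whose neighbors were already claimed by an earlier site may be left unassigned.

```python
import math

SIGMA_NM = 17.7              # localization uncertainty reported in the text
RADIUS_NM = 4.5 * SIGMA_NM   # 79.65 nm, rounded to 80 nm in the text


def find_sites(events, radius_nm=80.0, min_events=2):
    """Group event coordinates into candidate target sites.

    events: list of (x, y) drift-corrected event positions in nm.
    Returns (sites, unassigned): sites is a list of dicts holding a centroid
    and the member indices; unassigned lists the indices of events that have
    too few neighbors (spatially uncorrelated events).
    """
    remaining = set(range(len(events)))
    sites, unassigned = [], []
    for seed in range(len(events)):
        if seed not in remaining:
            continue  # already assigned to an earlier site
        sx, sy = events[seed]
        members = [
            i for i in remaining
            if math.hypot(events[i][0] - sx, events[i][1] - sy) <= radius_nm
        ]
        if len(members) >= min_events:
            cx = sum(events[i][0] for i in members) / len(members)
            cy = sum(events[i][1] for i in members) / len(members)
            sites.append({"x": cx, "y": cy, "members": sorted(members)})
            remaining.difference_update(members)
        else:
            unassigned.append(seed)
            remaining.discard(seed)
    return sites, unassigned
```

The distance search here is quadratic in the number of events; for the ~3 × 10⁴ events of a typical data set, a grid or tree-based neighbor search would be a natural replacement.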
Of the ~3 × 10⁴ hybridization events observed in a typical data set, 95.5% occur at hybridization sites that meet the criterion of 2 events within an 80-nm radius. Because the area occupied by these identified hybridization sites is very small (2.5% of the total sample area), one might expect that nonspecific adsorption in the remaining 97.5% of the sample area would dominate the spatially-uncorrelated events, which number ~1,400 in a typical data set. Measurement of the same probe DNA interacting with a blank sulfonated surface having only the anchor DNA


strands and no target DNA, however, yields only ~300 events in an equivalent experiment; this number represents ~20% of the uncorrelated binding events detected when complementary target strands are present. The 5-fold greater number of uncorrelated events when target strands are present suggests a population of target DNA that exhibits very weak probe interactions. Indeed, the histogram of the number of binding events at correlated target sites (Figure 1) shows two populations of capturing sites on the surface: a large (88%) population of sites with an average of 10.9 capture events per experiment, and a small (12%) population of sites that average only 2-3 events, which are of shorter duration (see below). Weak capture sites may result from chemically damaged target DNA, hindrance by overhanging capture or anchoring DNA, or target DNA interacting with surface defects, weakening their interactions with probe molecules. Sites with fewer capture events also correspond to short-duration, single-frame (≤2-s) probe-target residence times. We exclude weak capture sites by identifying locations where the majority of hybridization events survive for only a single frame. Although 14% of the target sites are excluded by this criterion, these sites account for only 5.9% of the hybridization events and only 1.8% of the detected probe population because they are less efficient at capturing probe molecules and their dissociation times are short. In an ensemble measurement, the signal from these anomalous target sites would be undetectable, which highlights the power of single-molecule imaging to observe weak interactions even in the presence of a dominant population of strongly binding sites. After eliminating weak-binding sites based on the above criteria, the histogram of hybridization events at the remaining sites is consistent with a Poisson distribution:

$$p(k, \lambda) = \frac{\lambda^{k} \exp(-\lambda)}{k!} \tag{1}$$

where the average number of visits per target molecule is λ=10.9 (Figure 1). The agreement of the results with a Poisson site-sampling model is consistent with a uniform probability of hybridization at these sites. Hybridization kinetics at individual target molecules. With hybridization events observed at spatially-resolved sites, we can investigate hybridization kinetics of individual target


molecules to determine the precision with which these kinetics can be determined. To test whether the stochastic on-off behavior is a result of reversible hybridization and not photoblinking,49,50 the kinetics at each target site were measured at varying solution concentrations of probe DNA where the association rate should increase with solution probe-DNA concentration. Hybridization was monitored for 30 to 90 min following exposure to each probe DNA concentration, ranging from 2.5 to 15 nM. Drift-corrected target DNA molecule sites were monitored within 80-nm radii, and only sites that were observed in every data set (over a 24-hour period) were analyzed. Of the initial 2521 target sites, 1551 sites appeared throughout all experiments, indicating a loss of 38%. Site-registration over time accounts for a ~3% loss, and sites that exhibit fewer than 2 events for one or more target concentrations (required for determining an association rate) represent a 4% loss. Most lost target sites are likely due to dissociation of partially-hybridized A20-T20 anchoring segments (see Supporting Information, page S-2). Evidence suggesting this mechanism is the appearance of a small number of new sites over time, possibly from target DNA detaching and being re-captured at a different location. From individual hybridization events with time stamps indicating their arrival and departure times, we generate a logical trace of the occupancy of each target molecule site over time; see examples in Figure 2A (inset), where the frequency of observed hybridization events increases with probe concentration while their average duration appears to be unaffected. The individual lifetimes that a site is occupied or unoccupied, τoff and τon respectively, can be used to determine average dissociation and association lifetimes, respectively, for each hybridization event at all target molecules.
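Extracting the occupied and unoccupied intervals from time-stamped events can be sketched as below. The function name `dwell_times` and the tuple layout are hypothetical; the naming follows the text, where the occupied time is τoff (the time until the probe dissociates) and the unoccupied time is τon (the wait until the next probe associates).

```python
def dwell_times(events):
    """Split one site's occupancy record into occupied and unoccupied intervals.

    events: list of (arrival_s, departure_s) time stamps for a single site.
    Returns (tau_off, tau_on): tau_off holds the occupied durations and tau_on
    holds the unoccupied gaps between consecutive events.
    """
    ordered = sorted(events)
    tau_off = [departure - arrival for arrival, departure in ordered]
    tau_on = [nxt[0] - cur[1] for cur, nxt in zip(ordered, ordered[1:])]
    return tau_off, tau_on


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


if __name__ == "__main__":
    # Three made-up events at one site, in seconds.
    site_events = [(0.0, 10.0), (40.0, 60.0), (100.0, 104.0)]
    tau_off, tau_on = dwell_times(site_events)
    print("mean occupied time:", mean(tau_off))
    print("mean unoccupied time:", mean(tau_on))
```

A site with k events yields k occupied intervals but only k − 1 unoccupied gaps, which is why at least two events are needed before an association time can be estimated.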
These are plotted in Figure 2 as cumulative histograms,51 where each point represents the number of on- or off-events that persist to that time and earlier. Histograms are well described by an exponential decay probability function:

$$p(\tau) = A \exp(-\tau / \bar{\tau}) \tag{2}$$

where average association or dissociation times, 𝜏̅on or 𝜏̅off, represent the inverse of the association or dissociation rates, kon and koff respectively. Dissociation times do not vary with solution concentration of probe DNA (Figure 2A), while association times decrease with increasing probe concentration (Figure 2B) from increasing target encounters with probe


molecules. The event rates from Figure 2 are plotted in Figure 3, where the slope of the concentration dependence gives the association-rate constant, kon = 1.35 ± 0.06 × 10⁶ M⁻¹ s⁻¹, and the dissociation-rate constant, independent of concentration, is koff = 0.070 ± 0.002 s⁻¹. The observed exponential-decay probabilities are consistent with uniform first-order and pseudo-first-order dissociation and association kinetics. However, these aggregated results from all target sites do not characterize hybridization kinetics at individual sites on the surface. The challenge of determining duplex lifetimes of individual target molecules arises from the small sample of dissociation events, typically 40 or less, too few for least-squares analysis of a histogram of dissociation times. An analogous problem has been addressed for determining fluorescence lifetimes from small samples of single-photon arrival-times following pulsed-laser excitation.38 These photon-arrival times are drawn from an exponential-decay distribution, equivalent to first-order dissociation of a DNA duplex after its formation. For fluorescence lifetime analysis, the problem of analyzing small samples of single-photon arrival times has been addressed through a maximum-likelihood estimator (MLE)38 applied to determining excited-state lifetimes of individual molecules.39,40 In the present case, we apply this MLE38 for determining both duplex dissociation and association times of individual target DNA molecules from a sample of hybridization events, modified to allow sampling to begin after decay of short-lived events and index-offset by 1/2 to report times at the bin-centers:

$$\hat{\tau} = \frac{T}{N}\left(\sum_{i=j}^{\infty} i\, n_{i}\right) - T\left(j - \tfrac{1}{2}\right) \tag{3}$$

where T is the bin width, N is the total number of events, n_i is the number of binding events in bin i, and j is the first-sampled bin. Distributions of 1551 target sites’ maximum-likelihood-estimated dissociation time, 𝜏̂off, and MLE association time, 𝜏̂on, are plotted in Figures 4A and B, respectively, for varying probe-DNA solution concentrations. The MLE-dissociation-times of 98% of target molecules are centered at 15 s, while a 2% fraction exhibit short dissociation


12 times, ~2.5-s; association-time histograms are single peaks, the centroids of which shift to shorter times with increasing solution concentration, as expected. Because the energetics of duplex dissociation are sensitive to DNA base content and sequence order,31 we can use characteristic probe-DNA dissociation times as a signature for identifying target DNA molecules. Hybridization at each target molecule is sparsely sampled, producing distributions of dissociation times (Figure 4A). For accurate identification, we must ensure that lifetimes of different target DNA sequences differ by more than the widths of their distributions. Furthermore, a statistical model is needed to determine whether these widths are limited by sampling of dissociation lifetimes or if there is additional heterogeneity in the dissociation kinetics. The Erlang distribution has been used to describe waiting times generated by a discrete number of time values sampled from an exponential distribution.41 It is a discrete version of a gamma distribution, previously used to describe stochastic protein expression in cells52 and single-molecule enzyme kinetics.53 For individual target molecules, the number, k, of measured dissociation events is not fixed, but follows a Poisson distribution (Figure 1). The distribution of MLE dissociation times, is represented by the sum of Erlang distributions evaluated for variable numbers of samples, k, drawn from a Poisson-probability-density function. This Poisson-weighted Erlang distribution is shown below for average lifetime, 𝜏̅, number of events sampled, k, and their mean number, λ:

$$
f(\hat{\tau}; \lambda, \bar{\tau}) = \sum_{k=k_{min}}^{k_{max}} p(k,\lambda)\, \frac{(k/\bar{\tau})^{k}\, \hat{\tau}^{\,k-1} \exp(-k\hat{\tau}/\bar{\tau})}{(k-1)!}
$$

(4)

where p(k,λ) is the Poisson probability for the number of events, k, from Equation 1. The lower bound of k is defined as the minimum number of events per site: 2 for dissociation times (single-event sites are dropped to exclude non-specific interactions, see above) and 1 for association times (an association lifetime is the interval between two hybridization events, so a site with k events yields k−1 samples). The data in Figure 4 were fit to Poisson-Erlang (P-E) distributions (Equation 4) and results are included in the figure. The average dissociation and association times from the fit are


listed in Table S1, converted to dissociation and association rates, and included in Figure 3. Rate constants from P-E modeling of individual target kinetics (k_off = 0.071 ± 0.002 s⁻¹ and k_on = (1.30 ± 0.05) × 10⁶ M⁻¹ s⁻¹) agree, within uncertainty, with those from fitting cumulative histograms of all hybridization events. The variance of the P-E lifetime distribution, σ_τ², can be predicted from the variance (second central moment) of the Erlang distribution, σ_τ² = τ̄²/k,41 weighted by the Poisson-sampling probabilities p(k,λ):

$$
\sigma_{\tau}^{2} = \sum_{k=k_{min}}^{k_{max}} p(k,\lambda)\, \frac{\bar{\tau}^{2}}{k}
$$
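The Poisson-weighted Erlang density (Equation 4) and its predicted variance (Equation 5) can be evaluated numerically. The following is a minimal Python sketch, not the authors' analysis code. The function names are illustrative, the mean event count `lam = 20` in the usage lines is a hypothetical value chosen only for demonstration, and the renormalization of the Poisson weights over the truncated range k_min to k_max is an assumption that the text does not state.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of k events when the mean number of events is lam."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def erlang_mean_pdf(tau_hat, k, tau_bar):
    """Density of the mean of k exponential lifetimes with true mean tau_bar.

    This is an Erlang density with shape k and rate k / tau_bar.
    """
    if tau_hat < 0:
        return 0.0
    rate = k / tau_bar
    if tau_hat == 0:
        return rate if k == 1 else 0.0
    log_pdf = (k * math.log(rate) + (k - 1) * math.log(tau_hat)
               - rate * tau_hat - math.lgamma(k))  # lgamma(k) = log((k-1)!)
    return math.exp(log_pdf)


def poisson_weights(lam, k_min, k_max):
    """Poisson weights on k_min..k_max, renormalized to sum to 1 (assumption)."""
    raw = {k: poisson_pmf(k, lam) for k in range(k_min, k_max + 1)}
    total = sum(raw.values())
    return {k: w / total for k, w in raw.items()}


def pe_pdf(tau_hat, lam, tau_bar, k_min=2, k_max=100):
    """Poisson-Erlang density of the estimated lifetime tau_hat (Equation 4)."""
    weights = poisson_weights(lam, k_min, k_max)
    return sum(w * erlang_mean_pdf(tau_hat, k, tau_bar)
               for k, w in weights.items())


def pe_variance(lam, tau_bar, k_min=2, k_max=100):
    """Predicted variance of the Poisson-Erlang distribution (Equation 5)."""
    weights = poisson_weights(lam, k_min, k_max)
    return sum(w * tau_bar ** 2 / k for k, w in weights.items())


if __name__ == "__main__":
    lam = 20.0      # hypothetical mean number of events per site
    tau_bar = 14.3  # T-target dissociation lifetime in seconds, from the text
    sigma = math.sqrt(pe_variance(lam, tau_bar))
    print(f"predicted standard deviation: {sigma:.2f} s")
    print(f"density at tau_bar: {pe_pdf(tau_bar, lam, tau_bar):.4f} per s")
```

Because every Erlang component has the same mean, τ̄, the variance of the mixture is simply the weighted sum of the component variances, which is why Equation 5 needs no additional between-component term.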

(5)

The standard deviations of the dissociation-time distributions (Figure 4A) average σ_τ ≈ 3.8 s, which is close to (~23% greater than) the value σ_τ ≈ 3.1 s predicted by Equation 5 from the average lifetime, τ̄_off, and the Poisson distribution of site visits, p(k,λ). This result indicates that the τ̂_off distributions of individual sites are governed primarily by the number of hybridization events sampled at each target molecule. The additional lifetime uncertainty likely arises from inhomogeneity in interactions between target DNA and substrate, or from small variations in experimental conditions, such as temperature. Because the dissociation-time uncertainties are largely described by P-E sampling statistics, we can conclude that the dissociation kinetics are relatively uniform among immobilized targets; this result suggests that dissociation lifetimes can serve as a metric for identifying individual target molecules based on their DNA sequences.

Identifying individual immobilized target DNA molecules. To evaluate uncertainties of identifying immobilized target molecules based on their dissociation kinetics, we immobilized mixtures of target DNA having different sequences. A “G-target” differs from the original T-target by one dT-to-dG base substitution near the middle of the recognition sequence. The 10-mer G-target probe exhibited significantly slower dissociation kinetics, τ̄_off = 130 ± 8 s (Supporting Information, Figure S10), compared to 14.3 ± 0.2 s for the T-target, due to the greater stability of a G-C versus a T-A base pair. Imaging conditions used to sample the short-lived T-target duplex resulted in photobleaching of the long-lived 10-mer G-probe duplex. To minimize photobleaching, G-targets were interrogated with a shorter 9-mer probe, which exhibits


shorter dissociation times, τ̄_off = 35 ± 1 s (Supporting Information S11), which are less influenced by photobleaching but more challenging to resolve kinetically because of the small (2.5-fold) difference in dissociation rates relative to the T-target duplex.

Mixtures of DNA targets were immobilized simultaneously from solutions of 330 pM T-target and 300 pM G-target, in equilibrium with a mixture of their respective labeled-probe DNA to monitor accumulation. The mixed-target ssDNA surface was imaged with a mixture of 5 nM T-probe and 5 nM G-probe DNA for 2 hours to locate target molecules. Target sites were identified and filtered to remove sites in close proximity to one another, avoiding overlap of neighboring point-spread functions (PSFs) (Supporting Information, Figure S6).

When using the two separate probes to identify immobilized target sites, mismatched probe-target hybridization can result in false identification. Mismatched base-pairing between T-target and G-probe (a T·C mismatch) was too weak to be detected reliably; however, hybridization between G-targets and T-probes (a G·A mismatch) was detectable, consistent with the expected duplex stability of mismatched base pairs.54,55 The mismatched dissociation kinetics are fast, k_off = 0.63 (±0.2) s⁻¹, and association rates are slow, k_on = 0.51 (±0.03) × 10⁶ M⁻¹ s⁻¹, resulting in an affinity constant 14-fold weaker than that of the fully complementary 9-mer G-target-probe duplex (Supporting Information, page S-19). To mitigate mismatched hybridization cross-talk, we exclude single-frame hybridization events (≤2 s). This filtering removes 92% of mismatched G-target to T-probe hybridization events. An example map of identified single-molecule target sites, with spatial histograms of hybridization events for both mixed and pure samples of their corresponding probes, is shown in Figure 5. The Poisson-Erlang model predicts that T-target and G-target strands can be identified by their dissociation times with