Reply to Comment on Accurate Data Process for Nanopore Analysis

Comment pubs.acs.org/ac

Zhen Gu,†,‡ Yi-Lun Ying,† Chan Cao,† Pingang He,‡ and Yi-Tao Long*,†

† Key Laboratory for Advanced Materials and Department of Chemistry, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, P. R. China
‡ Department of Chemistry, East China Normal University, 500 Dongchuan Road, Shanghai, 200241, P. R. China

Anal. Chem. 2015, 87, 907−913. DOI: 10.1021/ac5028758

Here, we carefully consider and reply to the comments raised by Dr. Dunbar regarding the accuracy of our data process for nanopore analysis. Our work focuses on locating current blockages by improving the accuracy of the local threshold approach and on evaluating the current amplitude by applying an integration method.1 In this response, we provide a more detailed description of the second-order-differential-based calibration (DBC) method. Moreover, we discuss the advantages of our proposed method by comparing it with conventional methods. Thanks to the simulation method provided in the comment, we further built a calibration method to decrease the errors in the evaluation of short blockages in nanopore studies. Our improved methods are useful for extracting information from blockages that are faster than the instrument response.

The pulse train model is a simplified model for processing the blockage current acquired in experiments.2−4 We agree with Dr. Dunbar that the current model should be used carefully in the part of nanopore signal analysis where the original shape of the current blockage may appear to be ramplike. Therefore, we consider that the pulse train model, and the data process based on it, is not appropriate for all nanopore experimental results. It can be used in experiments where the transition between the open-pore state and the blockage state is faster than the instrument response (up to 100 kHz bandwidth at the current stage5).

In our method, the DBC method is developed to precisely locate the region of the blockage, which largely eliminates the effect of random noise. Then, we adopt the criterion from Rant’s study to evaluate the dwell time.2 It should be noted that the DBC method is not limited to applications based on Rant’s criterion; it could be incorporated into many other dwell time criteria. As kindly proved in the comment, the unfiltered blockage and its filtered version have equal areas. In other words, the integrations of filtered blockage currents are hardly affected by the low-pass filter. Therefore, we adopted the integrated area in calculating the current amplitude from the attenuated events.

We appreciate that Dr. Dunbar provided good references and software for nanopore data analysis.6−8 The present methods for nanopore data analysis are listed, though not exhaustively, in Table 1. The OpenNanopore software extracts multilevel information from the blockages of solid-state nanopores by using the cumulative sums algorithm. The attenuation of the blockages induced by the low-pass filter is not a major concern in this software. For locating the blockage in the data process, OpenNanopore uses a localized threshold method. In contrast, the DBC method eliminates the errors originating from the setting of thresholds for blockages with unavoidable noise at the two edges, and the integrations of blockage amplitude we proposed are independent of the bandwidth. QuB, another software suggested in the comment, is a professional software for single-channel data analysis. It aims to simulate the kinetics of single molecules and offers advanced features in statistical analysis based on the hidden Markov model and the K-means algorithm. The studies mentioned in the comment (refs 10−12 in the comment) about the hidden Markov model were not used in analyzing real data for blockage detection7,8 or did not take the influence of the filter into consideration.6 However, our methods exhibit good performance in analyzing real data from α-hemolysin nanopore experiments. Thus, our method has an advantage among the present processes for nanopore data analysis.

Prior to the implementation of the DBC method, we applied a Fourier series to fit the experimental blockages, which showed good performance. Note that the role of the fitting process is only to smooth the signal data. The fitted function is not based on the circuit model of the system. Therefore, the fitted parameters are not relevant to the frequency response of the system. Here, we list the fitted parameters of the 4th-order Fourier series that were asked for in the comment (Table 2).

As a kind reminder, we noticed that the description of the conventional method for evaluating the dwell time might lead to a misunderstanding. In our paper, the conventional method uses Ps2 to Pe4 in the measurement of dwell time, while the DBC method uses Ps3 to Pe4. Briefly, both the conventional and DBC methods choose the same stop point, Pe4, but use different start points. In particular, the comment raised a concern about the “criterion” of the conventional method used in the comparison of dwell time. The dwell time definition of Ps2 and Pe4 is a conventional method originating from Rant’s study.2 The points Ps2 and Pe2 are located by using a threshold and the tracking back routine, which is similar to the two-threshold method.12 Both of these methods are widely used in automatic data processes for searching for the start and stop points of the blockage in nanopore analysis. However, the filter would affect the dwell time of blockages. Rant’s group proposed a modified criterion to measure the dwell time of a blockage by choosing “the last (or only) local minimum of the pulse before the signal starts to return” as the stop point to cut down the overestimation of dwell time.2 Here we adopted their definition as the conventional one. Therefore, the stop point is regarded as Pe4 for defining the conventional dwell time. The start point in their criterion is located as the last data point before the current drops below the baseline,

© XXXX American Chemical Society


DOI: 10.1021/acs.analchem.5b03225 Anal. Chem. XXXX, XXX, XXX−XXX


Analytical Chemistry

Table 1. Selected Methods Used in the Present Software for Nanopore Data Analysis

method | baseline | event detection | dwell time measurement | current amplitude measurement | attenuation calibration
OpenNanopore3 | Kalman filter | two thresholds | cumulative sums algorithm | average of pulse plateau | no
QuB9 | Kalman filter | threshold and Viterbi algorithm | Viterbi algorithm | average of pulse plateau | no
MOSAIC4 | single baseline | multithresholds | exponential fitting | exponential fitting | yes
Transalyzer analysis10 | moving window | two thresholds | full-width at half-maximum method | average of pulse plateau | no
PythIon11 | median of selected data | threshold and tracking back | two sides of blockage | average of pulse plateau | no
Rant’s group2 | moving window | threshold and tracking back | modified stop pointa | slope based method | yes
nanopore analysis1 | moving window | threshold and tracking back and DBC method | modified stop pointa | current integration method | yes

a Both the methods from Rant’s group and our group use the last (or only) local minimum of the blockage before the signal starts to return as the stop point for dwell time measurements.
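As a rough illustration of the entries in the last two rows of Table 1, the following minimal Python sketch locates one blockage with a threshold and a tracking back step, picks the stop point as the last local minimum before the signal returns, and estimates the amplitude from the integrated area. It is not the authors' implementation and does not include the DBC refinement. The function names, the noise-free toy trace, and the choice to divide the integrated area by the dwell time are all assumptions made for illustration.

```python
def locate_blockage(trace, baseline, threshold):
    """Return (start, stop, end) sample indices of the first blockage, or None.

    Assumes a downward, noise-free blockage that drops from `baseline`
    below `threshold` (hypothetical helper, for illustration only).
    """
    # Event detection: first sample that crosses the threshold.
    hit = next((i for i, x in enumerate(trace) if x < threshold), None)
    if hit is None:
        return None

    # Tracking back: walk left to the last sample still at the baseline.
    start = hit
    while start > 0 and trace[start] < baseline:
        start -= 1

    # Walk right to the first sample that has returned to the baseline.
    end = hit
    while end < len(trace) and trace[end] < baseline:
        end += 1

    # Modified stop point: step back down the rising edge until the samples
    # stop decreasing, i.e. the last local minimum before the return.
    stop = end - 1
    while stop > hit and trace[stop - 1] < trace[stop]:
        stop -= 1
    return start, stop, end


def blockage_amplitude(trace, baseline, start, stop, end, dt):
    """Estimate the blockage depth as integrated area divided by dwell time.

    Dividing by the dwell time is an assumption of this sketch.
    """
    # Area between the baseline and the trace over the whole event.
    area = sum(baseline - x for x in trace[start:end]) * dt
    dwell = (stop - start) * dt
    return area / dwell


# Toy trace: a smeared pulse whose area equals that of a rectangular
# pulse 5 samples wide and 50 units deep.
trace = [100, 100, 100, 80, 60, 50, 50, 50, 70, 90, 100, 100]
start, stop, end = locate_blockage(trace, baseline=100, threshold=75)
print(start, stop, end)  # 2 7 10
print(blockage_amplitude(trace, 100, start, stop, end, dt=1.0))  # 50.0
```

Because the area is taken over the whole event while the dwell time runs only from the start point to the stop point, any error in the dwell time carries over directly into the amplitude estimate. On real, noisy data the baseline comparisons above would need a tolerance or a smoothed signal.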

DBC method, as described in the comment. Therefore, in our study, the comparison between the conventional and DBC methods is suitable and acceptable.

Regarding the integration method, the comment raised a major concern that the estimation of the current amplitude compounds the error in the dwell time estimation. The comment recommended applying the high-order derivative of the fitted blockage curve to avoid introducing the error from dwell time, which is similar to the slope method. However, the slope method for estimating the blockage amplitude is also based on the dwell time.2 Since we adopt the same dwell time criterion as the slope method, the influence of dwell time on our estimation of current amplitude is acceptable.

One of the long-term goals in nanopore studies is to accurately acquire and distinguish short pulses, since the translocation speed of the analyte is too fast. A good example based on Matlab simulation is given in the comment to demonstrate the challenges in analyzing nanopore pulses with extremely short dwell times (10 and 20 μs). At present, very short blockages are seriously attenuated due to the limited 3−10 kHz bandwidth that is widely used in nanopore studies.13−18 On the other hand, the sampling rate (100−500 kHz)19 of commercial analog-to-digital converters may not meet the high requirement for recording the attenuated short blockages well. In our work, the widths of the pulses generated by the function generator are larger than 0.05 ms. We agree that the circuit we used for generating the ideal pulse may contribute other frequency responses that differ from those of the nanopore system.20 To generate and analyze the exact pulse shape of current blockages (