DataSet-Dependent Acquisition enables comprehensive tandem mass


Article

DataSet-Dependent Acquisition enables comprehensive tandem mass spectrometry coverage of complex samples. Corey David Broeckling, Emmy Hoyes, Keith Richardson, Jeffery M. Brown, and Jessica E. Prenni Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.8b00929 • Publication Date (Web): 30 May 2018 Downloaded from http://pubs.acs.org on May 30, 2018



Analytical Chemistry

DataSet-Dependent Acquisition enables comprehensive tandem mass spectrometry coverage of complex samples.

Corey D. Broeckling, Emmy Hoyes, Keith Richardson, Jeffery M. Brown, and Jessica E. Prenni.

Corey D. Broeckling, Proteomics and Metabolomics Facility, Colorado State University, C-121 Microbiology Building, 2021 Campus Delivery, Fort Collins, CO 80523, 970-491-2273, orcid.org/0000-0002-6158-827X, [email protected]

Emmy Hoyes, Waters Corporation, Altrincham Road, Wilmslow SK9 4AX, UK, [email protected]

Keith Richardson, Waters Corporation, Altrincham Road, Wilmslow SK9 4AX, UK, [email protected]

Jeffery M. Brown, Waters Corporation, Altrincham Road, Wilmslow SK9 4AX, UK, [email protected]

Jessica E. Prenni, Department of Horticulture, Colorado State University, 210 Shepardson, 1173 Campus Delivery, Fort Collins, CO 80523, 970-491-7050, orcid.org/0000-0002-0337-8450, [email protected]




Abstract: Tandem mass spectrometry (MS/MS) is an invaluable experimental tool for providing analytical data supporting the identification of small molecules and peptides in mass spectrometry based ‘omics’ experiments. Data dependent MS/MS (DDA) is a real time MS/MS acquisition strategy which is responsive to the signals detected in a given sample. However, in analysis of even moderately complex samples with state of the art instrumentation, the speed of MS/MS acquisition is insufficient to offer comprehensive MS/MS coverage of all detected molecules. Data independent approaches (DIA) offer greater MS/MS coverage, typically at the expense of selectivity and/or sensitivity. This report describes dataset dependent MS/MS (DsDA), a novel integration of MS1 data processing and target prioritization to enable comprehensive MS/MS sampling during the initial MS level experiment. This approach is guided by the premise that, in ‘omics’ experiments, individual injections are typically made as part of a larger set of samples and that feedback between data processing and data acquisition can allow approximately real time optimization of MS/MS acquisition parameters and nearly complete MS/MS sampling coverage. Using a combination of R, Proteowizard, XCMS, and WRENS software, this concept was implemented on a liquid chromatography coupled quadrupole time of flight mass spectrometer. The results illustrate comprehensive MS/MS coverage for a set of complex small molecule samples and demonstrate a strong improvement on traditional DDA.



Introduction: Mass spectrometry (MS) is an analytical tool for sensitively and selectively measuring the molecular composition of a sample, particularly when coupled to a liquid chromatography (LC) separation1. It simultaneously delivers data supporting the detection, quantification, and characterization of a molecule. Tandem mass spectrometry (MS/MS) can be used to improve MS selectivity or sensitivity. For example, targeted quantitative assays utilize MS/MS to improve selectivity, which can often improve sensitivity on low mass resolution (quadrupole) MS instruments. Alternatively, on a high mass resolution instrument, MS/MS can improve selectivity in annotation efforts, as MS/MS offers more annotation selectivity than does accurate mass alone. The two most common approaches for acquiring traditional MS/MS data are (1) targeted MS/MS, whereby a given precursor mass window is selected and fragmentation data are acquired over chromatographic time scales with or without interleaved full scan MS data, and (2) data dependent MS/MS acquisition (DDA), whereby the data acquisition software directs acquisition of MS/MS data based on real-time evaluation of MS level data. With DDA, fragmentation is performed only on signals that are known to exist in a recent MS level scan and that meet certain user-guided criteria 2–5. Traditional targeted MS/MS is typically used when sensitivity for a given target analyte or analytes needs to be maximized 6. DDA is generally used to increase annotation confidence in a non-targeted experiment. However, as non-targeted proteomics and metabolomics applications developed, the limitations of data dependent MS/MS were exposed; most notably, even highly optimized instrumentation settings do not enable truly comprehensive MS/MS coverage of detected MS features7,8. This limitation has spurred the development of several alternative approaches, described below.
To improve the efficiency of MS/MS data acquisition, Neumann et al9 developed MetShot, or ‘nearline’ acquisition, whereby processing and targeted method file creation were streamlined to enable feedback between the MS level data and relatively broadly targeted MS/MS acquisition. Montenegro-Burke et al developed an approach supported by XCMS online processing to enable efficient acquisition of targeted MS/MS data as the experiment proceeds10. With these tools, features of interest can be quickly recognized and targeted relatively efficiently, directing acquisition of MS/MS spectra to those of most immediate priority.

Several publications have described efforts to improve DDA coverage, hereafter referred to as ‘modified DDA’ approaches. These methods typically employ exclusion and inclusion lists11–14 with various levels of automation to improve depth of coverage for single samples. These approaches benefit from retaining canonical ‘data dependence’, enabling precursor selection to be responsive to signals that are eluting at that time, while enabling a responsiveness to previous injections which improves sampling depth.

Data independent acquisition (DIA) methods have developed in parallel to efforts to modify DDA. These have included methods such as shotgun collision induced dissociation15 (SCID), whereby all precursor ions are fragmented in-source simultaneously via elevated cone voltage. Later, the ability to fragment all ions in the collision cell without precursor ion selection was enabled16 and commercialized as MSE. SCID and MSE offer optimal fragment ion transmission: when all precursor ions are fragmented all the time, all product ions are transmitted to the detector all the time. However, these




approaches suffer from a lack of selectivity, as no precursor selection is used. Venable et al3 proposed and implemented a serially-stepped wide precursor isolation window approach, which was followed by the development of PAcIFIC17 and ultimately commercialized as SWATH18. These methods offer increased selectivity compared to SCID and MSE; however, this selectivity is gained at the expense of reduced ion transmission (sensitivity). In contrast to the stepped operation of SWATH-like methods, SONAR utilizes smooth scanning of the precursor ion window, enabling more precise matching of precursors to fragments19,20. With all quadrupole based precursor selection methods, there is an explicit negative relationship between selectivity and ion transmission: the more selective the method, the less sensitive it is. Further, since the acquisition is independent of the MS level data, some of the acquisition time will be spent on relatively intense ions which do not benefit from additional sensitivity or on mass regions which do not contain signal. The incorporation of a traveling wave21 or trapped22 ion mobility separation module into the ion path of a standard Q-TOF mass spectrometer upstream of the collision cell enables mobility based precursor ion selection, resulting in much greater sensitivity than quadrupole selection.

Both traditional DDA and DIA approaches aim to comprehensively characterize a sample by MS/MS. In ‘omics’ experiments, the individual sample almost always exists as a unit of the larger ‘sample set.’ Traditional DDA approaches do not take advantage of this, though some of the modified DDA approaches leverage this knowledge through exclusion and inclusion lists. The individual samples in the sample set are generally expected to be similar enough to be comparable to each other, but different enough to distinguish between treatment groups (e.g. biomarker discovery).
Data processing workflows such as the XCMS processing workflow23 in metabolomics or MS1 quantification methods in proteomics workflows24–26 explicitly assume that there are far more commonalities between samples than differences, and retention time alignment and feature grouping strategies rely on these similarities for confident signal matching across the dataset. We hypothesize that a precursor selection approach which is built not on the concept of an individual sample, but rather on the sample set, may enable comprehensive coverage of the full feature space by narrow precursor isolation window MS/MS. Such an approach, if feasible, would enable a wealth of qualitative (annotation) data to be acquired with every dataset, significantly improving our capacity to interpret results and formulate biological conclusions.

This paper describes an MS/MS precursor selection approach which offers more selectivity than DIA approaches, more feature coverage than DDA approaches, and enables the sensitivity to be selectively applied to ions most likely to benefit from it. This is accomplished by (1) locally processing data in approximately real time, (2) implementing a novel feature prioritization scheme reflecting quality of both the MS and MS/MS data for each feature, and (3) using the Waters Research Enabled Software (WRENS) program to enable continuous modification of an existing template MS/MS method throughout the dataset. This approach is called DataSet Dependent Acquisition (DsDA). The results presented here demonstrate comprehensive precursor selected MS/MS coverage at multiple collision energies over the course of a typical omics experiment. The ancillary benefits of this framework and its potential for further improving data quality are discussed.



Materials and Methods:

Samples: The dataset dependent data acquisition approach was tested on an artificially complex sample derived from bovine (Bos taurus) muscle tissue, green onion (Allium cepa) leaves, and barley (Hordeum vulgare) grain. While any sample type could have been used, these form a diverse assortment and can therefore be considered representative of sample types commonly analyzed using metabolomics methods. All samples were lyophilized prior to extraction.

Sample extraction: A complex extract was generated for each sample type listed above. 200 mg of sample was weighed into a 20 mL glass vial with a teflon lined cap. Each dry sample was extracted with 10 mL absolute methanol at 4°C with vigorous vortexing. Extracts were centrifuged for 30 minutes at 3000xg and 4°C, and the supernatant was collected. Extracts were stored at -20°C overnight before use. In the single sample experiment, these extracts were mixed at equal proportions, and the mixture was injected 20 times. In the multi-sample experiment, 33 unique ratios of the three extracts were used, ranging from 0-200 µL of each extract, in increments of 20 µL, each sample containing 400 µL of extract in total. A ‘QC’ sample was prepared by mixing all extracts at equal proportions. The ‘QC’ was injected every sixth injection, as might be employed in a standard metabolomics experiment. The multi-sample set serves as a surrogate for an authentic sample set, with the feature variance exaggerated to test the limits of the approach. The composition used for each sample is provided in supplementary table 1.

Ultra-high pressure liquid chromatography coupled time of flight mass spectrometry: Samples were analyzed using a Waters Acquity UPLC system.
Metabolites were separated on a Waters CSH phenyl-hexyl column (1.0 x 100 mm, 1.7 µm) using a gradient from solvent A (water + 0.1% formic acid + 2.0 mM ammonium hydroxide) to solvent B (acetonitrile + 0.1% formic acid). All solvents were Thermo-Fisher HPLC-MS grade. Injections were made in 100% A; the mobile phase was held at 100% A for 1 min, ramped to 98% B over 12 minutes, held at 98% B for 3 minutes, and then returned to starting conditions over 0.05 minutes and allowed to re-equilibrate for 3.95 minutes. Flow rate was constant at 200 µL/min. The column and samples were held at 65 °C and 6 °C, respectively. Column eluent was coupled via electrospray ionization to a Waters Xevo G2 Q-TOF mass spectrometer, alternating between MS and MS/MS scans. Three MS/MS approaches were used: traditional data dependent acquisition (DDA), DataSet Dependent Acquisition (DsDA), and DsDA+MaxDepth, described below. In each case, every 0.18 second MS scan was followed by exactly 4 MS/MS scans of 0.035 seconds each. Each scan was collected in positive ionization mode, ranging from 50-2000 m/z. In all methods, capillary voltage was held at 2200 V and cone voltage at 30 V. MS scans were collected at collision energy 6 V. For DDA, all MS/MS scans were collected using a collision energy of 20 V. DDA was triggered for the four most intense ions above two counts per second, and ions (within 100 ppm m/z) were excluded for 10 seconds after sampling. For DsDA methods, collision energies were randomly assigned, with values selected from 10, 20, 30, and 40 V. For DDA, the interscan delay was set to ‘auto’ (0.014 seconds), while for DsDA it was set to 0.03 seconds (see below for reasoning).
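The DDA rule described above (the four most intense ions above an intensity threshold, with a 10 second, 100 ppm exclusion after sampling) can be illustrated with a minimal Python sketch. This is not the vendor's acquisition logic: the class name `DdaSelector`, its fields, and the representation of a scan as a list of (m/z, intensity) pairs are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DdaSelector:
    """Toy top-N precursor picker with dynamic exclusion (illustrative only)."""
    top_n: int = 4               # MS/MS scans following each MS scan
    min_intensity: float = 2.0   # intensity threshold (counts per second)
    ppm_tol: float = 100.0       # exclusion window around a sampled m/z
    exclude_s: float = 10.0      # exclusion duration in seconds
    _excluded: list = field(default_factory=list)  # (mz, expiry_time) pairs

    def _is_excluded(self, mz, now):
        """True if mz lies within ppm_tol of a still-active exclusion."""
        return any(
            now < expiry and abs(mz - ex_mz) / ex_mz * 1e6 <= self.ppm_tol
            for ex_mz, expiry in self._excluded
        )

    def select(self, peaks, now):
        """Pick precursors from one MS scan.

        peaks: list of (mz, intensity) tuples; now: scan time in seconds.
        """
        # Forget exclusions that have expired.
        self._excluded = [(m, t) for m, t in self._excluded if t > now]
        candidates = [
            (mz, inten) for mz, inten in peaks
            if inten > self.min_intensity and not self._is_excluded(mz, now)
        ]
        # Most intense first, then keep the top N.
        candidates.sort(key=lambda p: p[1], reverse=True)
        chosen = [mz for mz, _ in candidates[: self.top_n]]
        # Exclude everything just sampled for the next exclude_s seconds.
        self._excluded.extend((mz, now + self.exclude_s) for mz in chosen)
        return chosen


selector = DdaSelector()
scan = [(100.0, 50.0), (200.0, 40.0), (300.0, 30.0),
        (400.0, 20.0), (500.0, 10.0), (600.0, 1.0)]
first = selector.select(scan, now=0.0)    # the four most intense ions
second = selector.select(scan, now=0.5)   # those four are now excluded
later = selector.select(scan, now=11.0)   # exclusions have expired
```

Because selection is driven only by the current scan and the exclusion list, the same intense ions are picked again once their exclusion expires, which is the behaviour that DsDA is designed to avoid across injections.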




DataSet Dependent Acquisition (DsDA) approach: The descriptions that follow utilize metabolomics language and approaches, but it should be noted that this approach is equally suited to proteomics or other applications where comprehensive qualitative MS/MS data is desired. The DsDA framework is designed to shift the scope of data-dependent acquisition from the injection to the sample set, which should enable oversampling of nearly all features by MS/MS. The DsDA approach was developed considering the following generalizations: 1) Metabolomics samples are routinely analyzed as part of a sample set, whereby the set of samples represent largely similar samples that are to be compared to each other. 2) The differences among samples are small relative to the similarities between them. 3) Data processing workflows typically discard the exceptionally rare features which are not observed with sufficient frequency. 4) Feature annotation does not require annotation of that feature in every sample, rather annotation of a feature from one sample can be propagated across samples. 5) When MS-level quantification is sufficient, negating the need for DIA-based MS/MS quantification, MS/MS need only support annotation. Working within this collection of generalizations, one can expand the concept of DDA’s ‘data dependency’ from single injections to sample sets, whereby any single sample can be considered an approximate representation of all other samples in the set. Once this approximation is acknowledged, technical constraints which limit coverage in DDA methods (sensitivity, selectivity, speed of acquisition8) become less troublesome. Once a given feature has sufficiently high-quality MS/MS data, the algorithm should begin sampling more deeply by diverting MS/MS time to a previously neglected precursor ion. 
If new data from subsequent injections improves MS feature quality, the precursor selection algorithm should be responsive to this change and prioritize MS/MS acquisition for that feature. Within this conceptual guideline, a reasonable approach to prioritizing MS/MS acquisition would take into account all the MS and MS/MS data from all existing injections, then design the MS/MS schedule for the next injection based on pre-existing data. Some very approximate calculations of the number of MS/MS events available as compared to the number of features in a complex sample suggest that once one views the feature sampling problem from a sample set perspective, there is ample MS/MS time available to sample every feature not only once, but many times. A graphical description supporting this logic is depicted in Figure S1 in supplemental materials.

Successfully sampling all features requires a precursor selection algorithm which responds to existing MS and MS/MS data, coupled to MS control software that can offer scan-by-scan control of the MS/MS acquisition for subsequent injections. The DsDA approach was designed to meet this need and was implemented using the following steps:

1) Acquire data using MassLynx acquisition control software. A sequence of injections is made which represents a sample set. The MS method consists of a repeating series of one MS scan followed by four MS/MS scans; this file serves as a template upon which scan-by-scan modifications can be made by WRENS (see below).
2) An R-based (R version 3.3.1) monitoring, processing, and control system was implemented. At the completion of a file acquisition, a call is made to the command line version of Proteowizard15 (v 3.0.10246), which converts from the native Waters .raw format to mzML format.
3) Once conversion of the raw data has completed, the XCMS (v 1.50.0) centWave27 algorithm is used for feature detection.
4) The mzR28 package (v 2.8.0) is used to catalog and evaluate all MS/MS events, and map them to the MS level features based on accurate mass and retention time.
5) A target prioritization score is generated for each MS feature which takes into account the quality of the MS feature and the quality of the MS/MS data representing that feature. Highest priority is assigned to features which have high quality MS feature scores and low MS/MS scores. Details of this scoring algorithm are provided below.
6) The prioritization scores are used to develop an MS/MS acquisition schedule for the next injection. The schedule is written as a csv file to disk.
7) WRENS software, running in a C# environment, reads the .csv schedule and submits this to the Q-TOF instrument, overwriting the hard-coded collision energy and precursor m/z in the template MassLynx method.
8) The cycle repeats with each injection until the sequence has finished.

The full R and C# WRENS code is provided as supplemental material. This workflow is depicted graphically in figure S2.

Feature alignment and retention time correction: Features detected in consecutive samples were considered to represent the same compound if their masses were within 20 ppm and their chromatographic peaks overlapped in retention time from one injection to the next. Upon alignment, the new mass of the feature is calculated as the intensity (peak area) weighted mean of the two measured masses. The new retention time of the feature is assigned as the most recent retention time of that feature, acknowledging that retention time is less stable than mass assignment and that retention time drift is typically non-random on our system. Retention times for all features not detected in the most recent injection were adjusted using a quadratic fit derived from all features that were detected in the most recent injection that mapped to features from previous injections.

Prioritization scoring: The feature prioritization scoring dictates which features are targeted for MS/MS acquisition in the next injection. The prioritization score itself is designed around the premise that the most important features to target in the next injection are those which have high quality MS data and low quality MS/MS data.
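Returning briefly to the feature alignment rule described above (masses within 20 ppm, overlapping retention time windows, peak area weighted mean mass, most recent retention time retained), a minimal Python sketch might look as follows. The dictionary keys (`mz`, `rt_min`, `rt_max`, `area`) and the function names are hypothetical, and accumulating peak area across injections is an assumption made here for illustration; the authors' R implementation is provided in their supplemental material and may differ. The quadratic retention time correction is not shown.

```python
def ppm_diff(mz_a, mz_b):
    """Absolute mass difference in parts per million, relative to mz_b."""
    return abs(mz_a - mz_b) / mz_b * 1e6


def rt_overlap(a, b):
    """True if the retention time windows of two features overlap."""
    return a["rt_min"] <= b["rt_max"] and b["rt_min"] <= a["rt_max"]


def align(known, new, ppm_tol=20.0):
    """Merge features from the newest injection into the running feature list."""
    merged = [dict(f) for f in known]
    n_known = len(merged)  # only match against features from earlier injections
    for feat in new:
        match = next(
            (k for k in merged[:n_known]
             if ppm_diff(feat["mz"], k["mz"]) <= ppm_tol and rt_overlap(feat, k)),
            None,
        )
        if match is None:
            merged.append(dict(feat))  # previously unseen feature
            continue
        total = match["area"] + feat["area"]
        # Peak area weighted mean of the two measured masses.
        match["mz"] = (match["mz"] * match["area"] + feat["mz"] * feat["area"]) / total
        # The most recent retention time replaces the old one.
        match["rt_min"], match["rt_max"] = feat["rt_min"], feat["rt_max"]
        # Assumption: summed area serves as the weight for later injections.
        match["area"] = total
    return merged


known = [{"mz": 200.0000, "rt_min": 60.0, "rt_max": 66.0, "area": 100.0}]
new = [
    {"mz": 200.0020, "rt_min": 62.0, "rt_max": 68.0, "area": 300.0},    # same compound
    {"mz": 200.0020, "rt_min": 100.0, "rt_max": 106.0, "area": 50.0},   # same mass, later peak
]
merged = align(known, new)
```

In this example the first new feature is 10 ppm from the known one and overlaps it in time, so the two are merged; the second has a matching mass but elutes later, so it is kept as a separate feature.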
As the discrepancy between MS and MS/MS data quality for a given feature diminishes, so should its prioritization score. The implemented scoring scheme is based on MS feature intensity, MS feature detection frequency, and summed intensity of all ions in the MS/MS spectrum. An MS quality score is calculated for each feature based on the frequency of observation and signal intensity, and an MS/MS quality score is generated based on sampling frequency and ranked signal intensity. Each of these values is calculated for each feature, and the priority is calculated as the average of the ranked MS intensity and detection frequency minus the ranked MS/MS intensity. Those features with the largest difference between MS and MS/MS quality are assigned the highest priority. The MaxDepth option in the R function is used to force sampling of features which have not been previously sampled. This method is used to increase the probability of sampling rare features, which the feature prioritization scheme is designed to devalue. It is used when absolute depth of sampling is required, but will necessarily take MS/MS time away from features that are more frequently observed.

MS/MS schedule generation: The interscan delay was manually set to 0.03 seconds for the DsDA template method. This was done to ensure that there was a rigorously defined relationship between scan index and chromatographic




retention time. The defined relationship between retention time and scan index, coupled with our retention time adjustment, enables accurate prediction of feature retention time for the next injection and therefore optimal scan index assignment for any given feature. In this way, the full schedule of the 17 minute acquisition method could be set in advance of the injection without any feedback from the MS system during the acquisition. This approach assumes predictable retention times, and most modern UPLC systems meet this criterion. Application of the DsDA approach to nano-flow regimes, which generate less reproducible elution times, may require modification of this approach. Features are assigned to empty MS/MS slots in the schedule using five rounds of filling, starting with the highest priority features and descending to the lowest. In this way, each feature has the opportunity for assignment in the next injection schedule, provided a slot is available within its expected retention time window. After five rounds of priority-based filling, any empty MS/MS slots are assigned to the highest priority feature with a retention time overlapping with the available slot. This scheme ensures that the highest priority features can be assigned nearest the peak apex, where signal intensity is likely to be highest.
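The prioritization scoring and slot filling described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' R code: ranks are scaled to the range 0 to 1, each feature receives at most one slot per round, and within a round a feature takes the free slot closest to its expected apex. The function names and the dictionary keys (`id`, `priority`, `rt`, `rt_min`, `rt_max`) are hypothetical.

```python
def rank_fraction(values):
    """Scale each value's rank to [0, 1]; ties share the lowest rank."""
    ordered = sorted(values)
    denom = max(len(values) - 1, 1)
    return [ordered.index(v) / denom for v in values]


def priority_scores(ms_intensity, detect_freq, msms_intensity):
    """Average of ranked MS intensity and detection frequency, minus ranked MS/MS intensity."""
    r_int = rank_fraction(ms_intensity)
    r_freq = rank_fraction(detect_freq)
    r_msms = rank_fraction(msms_intensity)
    return [(a + b) / 2 - c for a, b, c in zip(r_int, r_freq, r_msms)]


def fill_schedule(slot_times, features, rounds=5):
    """Assign feature ids to MS/MS slots; returns one id (or None) per slot."""
    schedule = [None] * len(slot_times)
    by_priority = sorted(features, key=lambda f: f["priority"], reverse=True)
    for _ in range(rounds):
        for feat in by_priority:
            free = [
                i for i, t in enumerate(slot_times)
                if schedule[i] is None and feat["rt_min"] <= t <= feat["rt_max"]
            ]
            if free:
                # Assumption: take the free slot nearest the expected apex.
                best = min(free, key=lambda i: abs(slot_times[i] - feat["rt"]))
                schedule[best] = feat["id"]
    # Remaining empty slots go to the highest priority overlapping feature.
    for i, t in enumerate(slot_times):
        if schedule[i] is None:
            for feat in by_priority:
                if feat["rt_min"] <= t <= feat["rt_max"]:
                    schedule[i] = feat["id"]
                    break
    return schedule


# Three features: weak and unsampled, intense and well sampled, moderate and unsampled.
scores = priority_scores(
    ms_intensity=[10, 100, 50],
    detect_freq=[0.2, 1.0, 0.6],
    msms_intensity=[0, 500, 0],
)

features = [
    {"id": "A", "priority": 0.9, "rt": 2.0, "rt_min": 1.5, "rt_max": 3.5},
    {"id": "B", "priority": 0.1, "rt": 3.0, "rt_min": 2.5, "rt_max": 4.5},
]
schedule = fill_schedule([1.0, 2.0, 3.0, 4.0], features)
```

In the scoring example, the intense feature that already has strong MS/MS data scores no higher than the weak one, while the moderately intense, frequently detected feature with no MS/MS data receives the highest priority. In the scheduling example, the slot at 1.0 s stays empty because no feature's retention time window covers it.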

Results: DsDA acquires more MS/MS spectra than DDA. In this experiment, utilizing a 17 minute LC method, DDA-based acquisition collected 1498 (+/-11) MS/MS scans. For the DsDA method, a fixed interscan delay was selected. Despite the longer interscan delay used for DsDA, DsDA methods resulted in 2165 (+/-0) MS/MS events in a 17 minute LC method. Thus, the prescheduling of MS/MS events before acquisition offers two clear benefits for DsDA. The first is that DsDA methods acquired approximately 45% more MS/MS spectra, despite the use of a long interscan delay. This is apparently due to the lack of any processing between scan events. While the nominal interscan delay was 0.014 seconds for the DDA method, in practice the median time between the MS scan and the first DDA MS/MS scan was 0.069 (+/- 0.004) seconds, while that for DsDA was 0.0290 (+/- 0.0006) seconds. Though beyond the scope of this proof-of-concept application, future iterations of this approach may be able to widen this gap even further by utilizing real time retention time data, thereby negating the requirement of a relatively long interscan delay to stabilize the relationship between scan index and retention time. The second benefit is that every MS/MS scan index has an essentially invariant retention time, enabling the assignment of a precursor ion to a specific MS/MS slot defined by scan index and retention time. This property is critical for the full acquisition scheduling that DsDA employs.
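The figure of approximately 45% follows directly from the two mean scan counts reported above:

$$
\frac{2165 - 1498}{1498} = \frac{667}{1498} \approx 0.445
$$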

DsDA enables comprehensive MS/MS coverage of common features through replicate analysis. The DsDA approach was implemented on a Waters Q-TOF instrument running WRENS software, which enables custom control of instrument parameters and continuous scan-by-scan modification of the template MS/MS method. This approach was first tested on a single complex sample injected 20 times. Three approaches were utilized, including standard DDA as well as the novel DsDA implementation with



or without a ‘MaxDepth’ option. The MaxDepth option was implemented every 5th injection, and forces sampling of features which have no assigned MS/MS spectra as a means of maximizing coverage. In 20 injections, DDA sampled approximately 30% of all features at least once, and about 55% of all reproducibly observed features (Table 1). Reproducibly observed features are defined as those that were detected in at least 20% of the 20 injections. DsDA based prioritization and targeting increased coverage to 46% of all features, and 87% of all reproducibly observed features, offering the promise of nearly comprehensive feature coverage. DsDA with the additional MaxDepth option resulted in sampling of 77% of all features, and 99% of all reproducibly observed features. These data indicate that by enabling feedback between processed data and instrument control, a more intelligent MS/MS sampling scheme enables comprehensive MS/MS sampling. This coverage is depicted graphically in Figure 1. While DDA samples the most intense features at a given elution time, DsDA offers increased depth (more black points lower on the y-axis), which is particularly valuable when feature density at that time is high. Very few features remain unsampled (red) in the bottom two panels, and those that are unsampled are at the lowest intensities. This demonstrates that the prioritization scheme and MS/MS scheduling are working as designed. An additional benefit of this prioritization scheme is that it enables the DDA bias of repeated sampling of the same high intensity features (in subsequent injections) to be inverted – rather than repeatedly sampling the most intense features across a dataset, additional MS/MS time is spent on lower intensity features more likely to benefit from additional sampling. This can be visualized by comparing the point size between panels A. and B. of figure 1. In panel A. 
(DDA), the points with high signal intensity (higher on the y-axis) have larger point sizes, indicating repeated sampling. In panel B. the points at the top of the plot have the smallest point sizes, and as the signal intensity drops the points get larger, reflecting an increasing number of times that those lower intensity features were sampled. This is valuable as the MS signal intensity limits the MS/MS signal intensity: lower levels of MS signal must generate lower levels of MS/MS signal. Thus, DsDA not only enables more spectra to be acquired than standard DDA, but enables the instrument to acquire more spectra for those features which benefit from additional sensitivity. The bottom panel (DsDA + MaxDepth) demonstrates the same trend, but the point sizes are slightly smaller (on average) than for DsDA, as the algorithm biases toward depth of coverage at the expense of oversampling of low intensity features.

DsDA triples the number of MS/MS spectra which map to features. DDA can respond in real time to signals detected in MS spectra, but analysis of a single spectrum in a chromatographically coupled method cannot recognize the chromatographic patterns that help distinguish signal from noise. As such, DDA risks spending valuable MS/MS time acquiring MS/MS spectra on MS level signals which do not reflect chromatographically eluting features, as defined by standard data processing tools. To understand the magnitude of this effect, the number of total MS/MS events mapped to XCMS detected features for DDA, DsDA, and DsDA + MaxDepth were tallied. These data are provided in the last row of table 1, and demonstrate that by prescheduling the MS/MS events based on expected features, the number of MS/MS events which map to a real feature in the dataset is approximately tripled as compared to DDA.

DsDA enables comprehensive MS/MS coverage for realistic sample sets.




The examples above provide a proof of concept demonstrating the efficacy of the algorithm for an artificial sample set of identical sample injections. The next test was to apply the algorithm to a more complex sample set, so a set of samples of varying composition was generated by adjusting the proportions of each of three complex sample extracts (see online supplementary material). None of the samples or injections were identical, except for the quality control samples, reflective of a real metabolomics dataset running a pooled quality control sample approach. This test is important for two reasons: 1) it more accurately reflects a typical metabolomics experiment and 2) variant concentrations of metabolites across samples favor DDA, as coeluting features with sufficient variation have the opportunity to be one of the top n features sampled by DDA MS/MS in subsequent samples. Figure 2 demonstrates that the patterns depicted in Figure 1 are largely recapitulated with a highly variant sample set with higher complexity. However, it should also be noted that in this highly complex (artificially so) system, complete feature coverage was not observed with 40 injections. For this sample set DDA, DsDA, and DsDA + MaxDepth sampled 47%, 72%, and 80% of all reproducibly observed features, respectively. A larger sample set or more MS/MS events per MS scan would offer more opportunity to achieve comprehensive coverage in this case. Examples demonstrating how this coverage can be used in interpretation of MS/MS data are provided in supplementary online material.

Discussion: This report describes an approach to MS/MS precursor ion selection which is dependent on acquired data, but rather than being dependent on any data in the current LC-MS injection, the dependency is purely based on all data acquired in all previous analyses from that sample set.
Rather than utilizing exclusion and inclusion lists to improve depth, as do most modified DDA approaches, DsDA utilizes a scan event schedule which, once set, is executed in full for that particular LC-MS run. The strengths and weaknesses of this approach are discussed below.

DDA approaches can respond to signals immediately. This is a strength when rare and unexpected signals are of interest. However, it is a detriment in typical comparative experiments, as rare signals do not pass into the final dataset due to filters applied during processing. Further, in the absence of chromatographic peak front recognition by the DDA algorithm, signals present in one or even several MS scans can trigger DDA precursor selection even if they do not represent a chromatographic peak (feature). These artefactual signals will likewise be absent from the final processed dataset, meaning the MS/MS time spent acquiring those data does not contribute to characterization of the features in the final dataset.

DIA approaches are, by definition, completely unresponsive to any data acquired. The strength of this approach is that all MS/MS data are acquired all the time, providing a completely unsupervised and comprehensive sampling of MS/MS fragments generated under the pre-selected collision energy profile. Every signal has chromatographic coverage: in the event that the MS-level signal is impure due to an isobaric or near-isobaric contaminating signal, a more selective fragment ion can be used for quantitation. The major disadvantages of DIA approaches are the reduction (or elimination) of precursor ion resolution compared to narrow isolation width MS/MS (DDA and DsDA included) and an implicit inverse relationship between sensitivity and selectivity. DIA precursor isolation widths are generally 'wide' compared to DDA or targeted MS/MS, requiring deconvolution to infer precursor-product relationships. While deconvolution can be applied, physical isolation of precursors is valuable and reduces the reliance on computational methods. Narrower DIA windows can be adopted, but doing so requires faster scanning and/or a reduced mass range to maintain sufficient duty cycle for chromatographic coverage, sacrificing sensitivity and breadth of coverage, respectively. Ion mobility based approaches hold great promise for increasing the sensitivity of DIA methods, because their trapping events prevent quadrupole filter based loss of ions; both traveling wave (16) and trapped ion mobility (22) separation offer excellent potential. However, at this time, ion mobility instrumentation is considerably more expensive than a more standard Q-TOF platform, and open source processing tools for mobility data remain underdeveloped.

DsDA can be likened to a local alternative to 'data-streaming' (10) coupled to a more comprehensive and versatile version of 'nearline acquisition' (9), in that rapid processing can guide (broadly) targeted acquisition methods. However, the DsDA approach described here offers advantages not realized in the previous methods. Nearline acquisition was not designed to obtain comprehensive MS/MS coverage, but rather to provide an efficient mechanism for acquiring relatively targeted MS/MS data. The coverage potentially available through the nearline acquisition approach is quite large, but utilizing a standard MS/MS method file imposes constraints on the number of targets that can be scheduled for a given run. In contrast, the DsDA method utilizes a vendor-supplied low-level instrument control utility which enables scan-by-scan control of acquisition for ultimate flexibility. We demonstrate the DsDA approach on a Waters instrument using Waters control software.
Performing this acquisition method on instruments from other vendors will require capable instrument control software for those instruments. The data-streaming (10) approach offered via XCMS Online automates conversion, upload, and processing, thereby allowing the user to know quickly which targets need supporting MS/MS data. That is, the efficiency of the data-streaming approach is in data analysis, and the efficiency gain of nearline acquisition (9) is primarily in generating methods to target the signals of interest. DsDA offers efficiency in each, as processing and MS/MS control are explicitly linked and automated.

In this proof-of-concept study, no efforts were made to group features to recognize isotopes, adducts, or in-source fragments. Despite the knowledge that there are vastly more features than compounds (29, 30), essentially complete MS/MS sampling of all features was achieved. While complete feature coverage will not always be the goal, it can be exceptionally useful in recognizing the relationships between features, and can therefore supplement deconvolution and feature grouping methods. Additionally, achieving complete coverage on a moderately complex sample demonstrates the potential of an approach in which data processing and data acquisition are highly integrated. Future implementations of the DsDA approach could integrate sufficiently fast isotope envelope recognition algorithms to further improve MS/MS data quality by focusing on the M0 isotope only.

Because the full method is prescheduled, data processing is temporally isolated from acquisition. DsDA therefore generates more MS/MS events than traditional DDA: acquisition time during the chromatographic gradient is maximized because no processing competes with it. DsDA also ensures that valuable MS/MS time is spent on the features which benefit from it most. Low signal intensity features that are frequently detected receive a disproportionate amount of MS/MS time, as the additional time improves MS/MS quality much more for these features than it would for high intensity features. Notably, this inverts the relationship obtained using traditional DDA approaches, in which high intensity features are preferentially sampled. An interesting implication of the DsDA approach is that the more injections are made in a particular sample set, the deeper the DsDA algorithm is able to delve, as each additional injection makes more MS/MS slots available.

DsDA also shares similarities with modified DDA approaches that use a combination of exclusion and inclusion lists. However, DsDA differs from them by offering much more flexibility in scheduling, utilizing a prioritization score rather than a binary decision to include or exclude a feature. Additionally, the DsDA approach demonstrated here offers multiple collision energies by default (an optional value or set of values in the R script), improving the likelihood of collecting an MS/MS spectrum which is highly informative or similar to an MS/MS database spectrum. Compared to DDA, DsDA was also shown to yield more acquired MS/MS spectra, more spectra acquired for low intensity features, and more spectra which map to detected features. This indicates that prescheduled methods offer benefits over real-time DDA methods with regard to duty cycle, though the magnitude of this benefit will clearly depend on data complexity, the DDA algorithm employed, and the computer performing the evaluations.

Future directions: DsDA, as presented here, is a tool by which comprehensive MS/MS coverage can be obtained. However, it can also be viewed as a novel framework supporting implementation of any number of customizations. For example, while this application utilized four predefined collision energies (10, 20, 30, and 40 V), nothing prevents finer refinement using a 1 V step size rather than 10 V.
While sampling at every CE is not likely to be fruitful, an algorithm which evaluates fragmentation at a default CE (20, for example) may be able to predict that fragmentation will be optimal at CE = 33. Once data have been acquired at that value, the predicted optimum can be refined (to CE = 31, for example). If a given precursor ion appears to have a second, interfering precursor ion, the isolation window can be adjusted on the next acquisition to physically increase selectivity and remove (or reduce) contributions of the contaminating product ions. In-source fragment ions might be recognized explicitly (as product ions of another precursor), and MS/MS data can be acquired and presented as MS3 fragmentation trees (31). Additionally, cone voltage can be used to further increase in-source fragmentation (as performed in shotgun CID (15)) to achieve higher level fragmentation tree data. In practice, the DsDA approach as described is best conceptualized as a near real-time feedback loop between data processing and data acquisition. Any mass spectrometric adjustment that an experienced user might employ can be implemented through appropriately designed acquisition software, taking care to avoid negatively impacting MS-level data quality. This tool, and the framework within which it was implemented, present an opportunity, not only for metabolomics research but also potentially for proteomics applications, to generate a more informative discovery dataset imbued with thorough MS/MS data, supporting more comprehensive and confident identification of detected signals and ultimately increasing the biological interpretability of mass spectrometry based 'omics' data.
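The prioritization idea described in the Discussion (a score rather than a binary include/exclude decision, weighted toward frequently detected, low intensity, undersampled features) can be illustrated with a minimal Python sketch. This is not the authors' R implementation: the scoring formula, the `Feature` fields, and the fixed retention time bins with a set number of MS/MS slots are all assumptions chosen only to make the concept concrete.

```python
import math
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    rt: float              # retention time in seconds
    mean_intensity: float  # mean MS-level intensity over previous injections
    detection_rate: float  # fraction of previous injections containing the feature
    n_msms: int            # MS/MS events already acquired for this feature


def priority(f: Feature) -> float:
    """Hypothetical score: rises with detection frequency, falls with
    intensity and with the number of MS/MS events already collected."""
    intensity_term = math.log10(f.mean_intensity + 10.0)  # always > 1
    return f.detection_rate / (intensity_term * (1 + f.n_msms))


def build_schedule(features, bin_width_s, slots_per_bin):
    """Group features into retention time bins and fill each bin's
    MS/MS slots with the highest-priority features eluting in it."""
    bins = {}
    for f in features:
        bins.setdefault(int(f.rt // bin_width_s), []).append(f)
    schedule = {}
    for index, members in bins.items():
        ranked = sorted(members, key=priority, reverse=True)
        schedule[index] = ranked[:slots_per_bin]
    return schedule


if __name__ == "__main__":
    demo = [
        Feature("a", 10.0, 1e6, 1.0, 3),  # intense, already well sampled
        Feature("b", 12.0, 1e3, 1.0, 0),  # weak, always detected, unsampled
        Feature("c", 14.0, 1e5, 0.1, 0),  # rarely detected
    ]
    for index, chosen in build_schedule(demo, 30.0, 2).items():
        print(index, [f.name for f in chosen])
```

In this toy scheme, rebuilding the schedule after every injection (with updated `n_msms` and `detection_rate`) lets lower-ranked features move up as higher-ranked ones accumulate spectra, which mirrors the observation that additional injections allow deeper sampling.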




Table 1. DsDA samples more comprehensively than DDA, and more MS/MS events are mapped to chromatographic peaks.

Method             % all features (a)    % reproducibly observed features (b)    mapped MS/MS (c)
standard DDA       29.6 %                54.8 %                                  50,147
DsDA               46.0 %                87 %                                    152,572
DsDA + MaxDepth    77 %                  98.7 %                                  152,059

a. The percentage of all detected features which were sampled by MS/MS at least once.
b. The percentage of all features detected in at least 20% of the injections that were sampled by MS/MS at least once.
c. The total number of MS/MS events (absolute counts) which map to a feature.
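The three quantities defined in the footnotes to Table 1 can be computed as in the following Python sketch. The function name, the input dictionaries, and the toy numbers in the usage example are illustrative assumptions; they are not taken from the authors' scripts or data.

```python
def coverage_summary(detected, sampled, n_injections, min_fraction=0.2):
    """Summarize MS/MS coverage of a feature table.

    detected: dict mapping feature id -> number of injections in which
              the feature was detected
    sampled:  dict mapping feature id -> number of MS/MS events mapped
              to the feature (missing ids count as zero)
    """
    all_ids = list(detected)
    if not all_ids:
        raise ValueError("no features supplied")

    # (a) features sampled by MS/MS at least once, as a share of all features
    hit = [fid for fid in all_ids if sampled.get(fid, 0) >= 1]

    # (b) same share, restricted to features detected in at least
    #     min_fraction of the injections
    reproducible = [
        fid for fid in all_ids
        if detected[fid] / n_injections >= min_fraction
    ]
    reproducible_hit = [fid for fid in reproducible if sampled.get(fid, 0) >= 1]

    # (c) absolute count of MS/MS events that map to any feature
    mapped = sum(sampled.get(fid, 0) for fid in all_ids)

    return {
        "pct_all": 100.0 * len(hit) / len(all_ids),
        "pct_reproducible": (
            100.0 * len(reproducible_hit) / len(reproducible)
            if reproducible else float("nan")
        ),
        "mapped_msms": mapped,
    }


if __name__ == "__main__":
    # Toy input: four features observed across 20 injections.
    detected = {"f1": 20, "f2": 10, "f3": 2, "f4": 4}
    sampled = {"f1": 5, "f3": 1}
    print(coverage_summary(detected, sampled, n_injections=20))
```

Keeping the reproducibility threshold as a parameter (`min_fraction`) makes it easy to see how sensitive the second column is to the 20% cutoff.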




Supporting Information Available: Figures S1 through S6, and additional files with the code used to execute DsDA as described. DsDA_20180116.cs: C# code executed in Wrens software. DsDA_function_20180116.R: R code used for file monitoring, calling ProteoWizard, data processing, and writing the schedule. DsDA_pos.exp: Waters MassLynx MS method used as the DsDA template file. DsDA_schedule.csv: template .csv file representing all scan events scheduled in the DsDA_pos.exp method. Each MS/MS event is filled in by the R function and executed by Wrens. Variant_Sample_Set.xls: description of how the variant sample set was created and used. Script files can also be found at github.com/cbroeckl/DsDA.

References:

(1) Niessen, W. M. A. J. Chromatogr. A 1999, 856 (1–2), 179–197.
(2) Yates, J. R. Trends Genet. 2000, 16 (1), 5–8.
(3) Venable, J. D.; Dong, M.-Q.; Wohlschlegel, J.; Dillin, A.; Yates, J. R. Nat. Methods 2004, 1 (1), 39–45.
(4) Li, L.; Masselon, C. D.; Anderson, G. A.; Paša-Tolić, L.; Lee, S.-W.; Shen, Y.; Zhao, R.; Lipton, M. S.; Conrads, T. P.; Tolić, N.; Smith, R. D. Anal. Chem. 2001, 73 (14), 3312–3322.
(5) Yates, J. R.; Eng, J. K.; McCormack, A. L. Anal. Chem. 1995, 67 (18), 3202–3210.
(6) Sandhu, C.; Hewel, J. A.; Badis, G.; Talukder, S.; Liu, J.; Hughes, T. R.; Emili, A. J. Proteome Res. 2008, 7 (4), 1529–1541.
(7) Mullard, G.; Allwood, J. W.; Weber, R.; Brown, M.; Begley, P.; Hollywood, K. A.; Jones, M.; Unwin, R. D.; Bishop, P. N.; Cooper, G. J. S.; Dunn, W. B. Metabolomics 2015, 11 (5), 1068–1080.
(8) Michalski, A.; Cox, J.; Mann, M. J. Proteome Res. 2011, 10 (4), 1785–1793.
(9) Neumann, S.; Thum, A.; Böttcher, C. Metabolomics 2013, 9 (S1), 84–91.
(10) Montenegro-Burke, J. R.; Aisporna, A. E.; Benton, H. P.; Rinehart, D.; Fang, M.; Huan, T.; Warth, B.; Warth, E.; Abe, B. T.; Ivanisevic, J.; Wolan, D. W.; Teyton, L.; Lairson, L.; Siuzdak, G. Anal. Chem. 2017, 89 (2), 1254–1259.
(11) Hoopmann, M. R.; Merrihew, G. E.; von Haller, P. D.; MacCoss, M. J. J. Proteome Res. 2009, 8 (4), 1870–1875.
(12) Koelmel, J. P.; Kroeger, N. M.; Gill, E. L.; Ulmer, C. Z.; Bowden, J. A.; Patterson, R. E.; Yost, R. A.; Garrett, T. J. J. Am. Soc. Mass Spectrom. 2017, 28 (5), 908–917.
(13) Kreimer, S.; Belov, M. E.; Danielson, W. F.; Levitsky, L. I.; Gorshkov, M. V.; Karger, B. L.; Ivanov, A. R. J. Proteome Res. 2016, 15 (10), 3563–3573.
(14) Rudomin, E. L.; Carr, S. A.; Jaffe, J. D. J. Proteome Res. 2009, 8 (6), 3154–3160.
(15) Purvine, S.; Eppel, J.-T.; Yi, E. C.; Goodlett, D. R. Proteomics 2003, 3 (6), 847–850.
(16) Silva, J. C.; Gorenstein, M. V.; Li, G.-Z.; Vissers, J. P. C.; Geromanos, S. J. Mol. Cell. Proteomics 2006, 5 (1), 144–156.
(17) Panchaud, A.; Scherl, A.; Shaffer, S. A.; von Haller, P. D.; Kulasekara, H. D.; Miller, S. I.; Goodlett, D. R. Anal. Chem. 2009, 81 (15), 6481–6488.
(18) Gillet, L. C.; Navarro, P.; Tate, S.; Röst, H.; Selevsek, N.; Reiter, L.; Bonner, R.; Aebersold, R. Mol. Cell. Proteomics 2012, 11 (6), O111.016717.
(19) Juvvadi, P. R.; Moseley, M. A.; Hughes, C. J.; Soderblom, E. J.; Lennon, S.; Perkins, S. R.; Thompson, J. W.; Geromanos, S. J.; Wildgoose, J.; Richardson, K.; Langridge, J. I.; Vissers, J. P. C.; Steinbach, W. J. J. Proteome Res. 2018, 17 (2), 780–793.
(20) Moseley, M. A.; Hughes, C. J.; Juvvadi, P. R.; Soderblom, E. J.; Lennon, S.; Perkins, S. R.; Thompson, J. W.; Steinbach, W. J.; Geromanos, S. J.; Wildgoose, J.; Langridge, J. I.; Richardson, K.; Vissers, J. P. C. J. Proteome Res. 2018, 17 (2), 770–779.
(21) Geromanos, S. J.; Hughes, C.; Ciavarini, S.; Vissers, J. P. C.; Langridge, J. I. Anal. Bioanal. Chem. 2012, 404 (4), 1127–1139.
(22) Meier, F.; Beck, S.; Grassl, N.; Lubeck, M.; Park, M. A.; Raether, O.; Mann, M. J. Proteome Res. 2015, 14 (12), 5378–5387.
(23) Smith, C. A.; Want, E. J.; O'Maille, G.; Abagyan, R.; Siuzdak, G. Anal. Chem. 2006, 78 (3), 779–787.
(24) Addona, T. A.; Shi, X.; Keshishian, H.; Mani, D. R.; Burgess, M.; Gillette, M. A.; Clauser, K. R.; Shen, D.; Lewis, G. D.; Farrell, L. A.; Fifer, M. A.; Sabatine, M. S.; Gerszten, R. E.; Carr, S. A. Nat. Biotechnol. 2011, 29 (7), 635–643.
(25) Cox, J.; Mann, M. Nat. Biotechnol. 2008, 26 (12), 1367–1372.
(26) Mortensen, P.; Gouw, J. W.; Olsen, J. V.; Ong, S.-E.; Rigbolt, K. T. G.; Bunkenborg, J.; Cox, J.; Foster, L. J.; Heck, A. J. R.; Blagoev, B.; Andersen, J. S.; Mann, M. J. Proteome Res. 2010, 9 (1), 393–403.
(27) Tautenhahn, R.; Böttcher, C.; Neumann, S. BMC Bioinformatics 2008, 9, 504.
(28) Ruttkies, C.; Schymanski, E. L.; Wolf, S.; Hollender, J.; Neumann, S. J. Cheminform. 2016, 8, 3.
(29) Broeckling, C. D.; Ganna, A.; Layer, M.; Brown, K.; Sutton, B.; Ingelsson, E.; Peers, G.; Prenni, J. E. Anal. Chem. 2016, 88 (18), 9226–9234.
(30) Mahieu, N. G.; Patti, G. J. Anal. Chem. 2017, 89 (19), 10397–10406.
(31) Rasche, F.; Scheubert, K.; Hufsky, F.; Zichner, T.; Kai, M.; Svatoš, A.; Böcker, S. Anal. Chem. 2012, 84 (7), 3417–3426.






Figure 1. A complex small molecule sample was generated and injected 20 times using each of standard DDA (A), DsDA (B), or DsDA with the MaxDepth option enabled (C). Each panel contains a scatterplot where each point represents an XCMS feature. All black points are plotted with sizes proportional to the number of times they were sampled for MS/MS (see legend). All points colored red remained unsampled at the end of the 20 injection sequence.






Figure 2. A variant set of 33 complex samples and seven invariant QC samples was generated and analyzed in 40 injections using each of standard DDA (A), DsDA (B), or DsDA with the MaxDepth option enabled (C). Each panel contains a scatterplot where each point represents an XCMS feature. All black points are plotted with sizes proportional to the number of times they were sampled for MS/MS (see legend). All points colored red remained unsampled at the end of the 40 injection sequence.

