Scanning Quadrupole Data-Independent ... - ACS Publications


Article

Scanning Quadrupole Data Independent Acquisition – Part A. Qualitative and Quantitative Characterization

M. Arthur Moseley, Christopher J. Hughes, Praveen R. Juvvadi, Erik J. Soderblom, Sarah Lennon, Simon R. Perkins, J. Will Thompson, William J. Steinbach, Scott J. Geromanos, Jason Wildgoose, James I. Langridge, Keith Richardson, and Johannes P.C. Vissers

J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.7b00464 • Publication Date (Web): 13 Sep 2017

Downloaded from http://pubs.acs.org on September 14, 2017

Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Journal of Proteome Research -1-

Title
Scanning Quadrupole Data Independent Acquisition – Part A. Qualitative and Quantitative Characterization

Authors
M. Arthur Moseley1,ǂ, Christopher J. Hughes2,ǂ, Praveen R. Juvvadi3,ǂ, Erik J. Soderblom1,ǂ, Sarah Lennon2, Simon R. Perkins4, J. Will Thompson1, William J. Steinbach3,5, Scott J. Geromanos6, Jason Wildgoose2, James I. Langridge2, Keith Richardson2,ǂ, Johannes P.C. Vissers2,ǂ,*

1. Proteomics and Metabolomics Shared Resource Center for Genomic and Computational Biology, Duke University Medical Center, Durham, NC

2. Waters Corporation, Wilmslow, United Kingdom

3. Division of Pediatric Infectious Diseases, Department of Pediatrics, Duke University Medical Center, Durham, NC

4. Institute of Integrative Biology, University of Liverpool, Liverpool, United Kingdom

5. Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC

6. Waters Corporation, Milford, MA

ǂ these authors have contributed equally to this work

* to whom correspondence should be addressed ([email protected])

ACS Paragon Plus Environment

Abstract
A novel data-independent acquisition (DIA) method incorporating a scanning quadrupole in front of a collision cell and orthogonal acceleration time-of-flight mass analyzer is described. The method has been characterized for the qualitative and quantitative label-free proteomic analysis of typical complex biological samples. The principle of the scanning quadrupole DIA method is discussed, and analytical instrument characteristics, such as the quadrupole transmission width, scan/integration time, and chromatographic separation, have been optimized in relation to sample complexity for a number of model proteomes of varying complexity and dynamic range, including human plasma, cell lines, and bacteria. In addition, the technological merits over existing DIA approaches are described and contrasted. The qualitative and semi-quantitative performance of the method is illustrated for the analysis of relatively simple protein digest mixtures and a well-characterised human cell line sample using untargeted and targeted search strategies. Finally, the results from a human cell line were compared against publicly available data that used similar chromatographic conditions but were acquired with DDA technology and alternative mass analyzer systems. Qualitative comparison showed excellent concordance of results, with over 90% overlap of the detected proteins.

Keywords: label-free quantitation, data-independent acquisition, scanning quadrupole.

Abbreviations: DIA, DDA, oa-TOF

Introduction
Data-independent acquisition (DIA) is an emerging quantitative omics profiling technique that is rapidly gaining popularity due to its comprehensive and unbiased sampling of precursor ions compared to data-dependent acquisition (DDA). A number of variants have been proposed, all of which have non-biased precursor selection, or the lack thereof, in common. Moreover, DIA-based approaches in general aim to increase the detectable dynamic range and coverage of the MS analysis, and to improve the precision and accuracy of relative or absolute protein quantification by systematic sampling of the peptide precursor m/z space. A recent review by Chapman et al. [1] provides a comprehensive overview, including classification based on technology and/or principle. The most well-known and prominent variants, in chronological order, are probably the original multiplexed acquisition method described by Masselon et al. [2], the implementation of the method on a time-of-flight mass spectrometer by Purvine et al. [3], the disclosure of DIA using sequential, relatively small m/z isolation windows by Venable et al. [4], the alternating scanning DIA technology first described by Silva et al. [5], followed by an extension of the method that includes ion mobility (IM) separation to increase selectivity and sensitivity by Rodriguez-Suarez et al. [6], and adaptations of the method proposed by Venable et al. [4], such as sequential window acquisition of all theoretical fragment-ion spectra by Gillet et al. [7] and Fourier transform-all reaction monitoring by Weisbrod et al. [8].

The increased popularity of DIA is marked by the development and increased availability of dedicated informatics tools to analyse the various types of multiplexed DIA schemes. In a recent review, Bilbao et al. [9] examined the different schemes to facilitate the discussion of the concepts related to DIA data processing, and provided a comprehensive overview of available software implementations for the identification and quantification of DIA data. Well-known principal examples include the DIA-specific search algorithm developed by Li et al. [10], which, since mixture spectra are generated by design, uses an iterative search process and physicochemical peptide and protein properties to assign product ions to precursor ions; the computational workflow presented by Tsou et al. [11], which detects precursor and fragment chromatographic features and assembles them into pseudo-tandem MS spectra that can be searched with conventional DDA database-searching and protein-inference tools; and the targeted data analysis approaches proposed by Gillet et al. [7] and Weisbrod et al. [8], whereby extracted ion chromatograms or peptide fragmentation patterns, respectively, are used to detect, identify and quantify query (library) peptides. Egertson et al. [12] described a de-multiplexing approach to decrease chemical noise, which is inherently increased in DIA data, and thereby increase DIA data processing selectivity. To distinguish between the peptide identification types, the terms ‘spectrum-centric’ and ‘peptide-centric’ analysis were introduced [13]: in the former, spectra are most commonly interpreted using database search approaches, whereas the latter tests directly for the presence or absence of query (library) peptides.

DIA methods are limited by reduced precursor selectivity when compared to DDA-based techniques, as they typically use a five- to ten-fold wider isolation window or do not apply any precursor ion selection. For example, with a stepped quadrupole DIA approach, static quadrupole isolation windows are sequentially cycled through to cover the whole m/z range of interest [7], with ions passed on for fragmentation, mass separation and detection. No decisions are required on which ions to select or fragment, and quadrupole isolation provides selectivity over a broadband DIA approach. However, this increase in selectivity comes at the price of mass spectrometer duty cycle, as this is a serial approach, and as such wider or variable m/z windows have been used to gain sensitivity. This approach has been primarily used for targeted data extraction, where the data are probed for specific combinations of fragment ions of the peptides of interest. The limitations of DIA approaches depend on geometry/configuration as well as method, but include both duty cycle and serial acquisition limitations, ultimately resulting in reduced sensitivity. In practice, the latter is often not a limitation for most applications, as it can be countered by a higher loading of protein lysate or extract. However, improvements in technology, next to unbiased and complete sampling, will contribute to broader acceptance of the methodology and afford more comprehensive analysis and understanding of complex biological samples.

Here, the principles, advantages and application of a novel scanning quadrupole based DIA method will be discussed and presented, as well as possible customisation of the acquisition method. The technical performance of the method will be highlighted and its qualitative and quantitative performance shown. Moreover, it will be demonstrated that the obtained data are applicable to both search and library based proteomics analysis strategies, and how the results compare with and complement current methods.

Experimental Conditions

Protein digestion
E. coli digestion standard (Waters Corporation), a four protein (Alcohol Dehydrogenase (yeast), Glycogen Phosphorylase (rabbit), Enolase 1 (yeast) and Bovine Serum Albumin) digest mixture (Waters Corporation), and pre-digested extracts from human K562 cells (Promega) and human HeLa cells (Thermo Scientific Pierce, Waltham, MA) were resuspended in a 10% (v/v) aqueous acetonitrile solution and diluted with aqueous 0.1% (v/v) formic acid to an intermediate concentration of 1 or 2 µg/µl. Undepleted and unfractionated human plasma samples were donated by the Department of Cardiovascular Sciences, NIHR Leicester Cardiovascular Biomedical Research Unit, Glenfield Hospital, Leicester, UK and digested as previously described [14].

LC-MS configuration
LC separations were performed using a nanoACQUITY system (Waters Corporation), equipped with a Symmetry C18 5 µm, 2 cm x 180 µm precolumn and an HSS T3 C18 1.8 µm, 20 cm x 75 µm analytical column. The samples were transferred with aqueous 0.1% (v/v) formic acid to the precolumn at a flow rate of 5 µl/min. Mobile phase A was water containing 0.1% (v/v) formic acid, whilst mobile phase B was acetonitrile containing 0.1% (v/v) formic acid. The peptides were eluted from the precolumn to the analytical column and separated with a gradient of 5 to 40% mobile phase B over 90 min at a flow rate of 300 nl/min. The analytical column temperature was maintained at 35ºC. The lock mass compound, [Glu1]-Fibrinopeptide B (200 fmol/µl), was delivered at 600 nl/min to the reference sprayer of the source of the mass spectrometer.

Mass spectrometric analysis of tryptic peptides was performed using a Xevo G2-XS QTOF mass spectrometer (Waters Corporation, Wilmslow, United Kingdom). The mass spectrometer was operated with a resolution of 35,000 FWHM and all analyses were performed in positive mode ESI. The ion source block temperature and capillary voltage were set to 100ºC and 3.2 kV, respectively. The time-of-flight (TOF) mass analyzer of the mass spectrometer was externally calibrated with a NaCsI mixture from m/z 50 to 1990. LC-MS data were collected in a novel data independent mode of acquisition (SONAR). In this acquisition mode the quadrupole was continuously scanned from m/z 400 to 900, with a quadrupole transmission width of approximately 24 Da. The oa-TOF records mass spectra as the quadrupole scans and stores these MS data into two hundred discrete bins. Two data functions (modes) are acquired in an alternating fashion, differing only in the collision energy applied to the gas cell. In the low energy MS1 mode, data are collected at a constant gas cell collision energy of 6 eV. In the elevated energy MS2 mode, the gas cell collision energy is ramped from 14 eV to 40 eV (per unit charge). As such, the resulting data contain both precursor ions and all associated fragment ions. The spectral acquisition time in each mode was 0.5 s with a 0.02 s interscan delay. The reference sprayer was sampled every 60 s and the data were lock mass corrected post-acquisition.

Library creation and targeted searching
Targeted data analysis/library searches were conducted with development software [15,16] using an in-house developed retention time normalized library representing 25,719 replicating non-modified (variable) identified peptides, mapping to 6003 proteins/4994 protein groups, based on the analysis of a 120 µg HeLa sample using a basic pH multiple fraction concatenation strategy (96 fractions concatenated into 12 samples reanalyzed by acidic pH reversed phase DDA LC-MS) [17,18]. The first dimension separation was conducted with an I-Class ACQUITY/Fractionation Manager system (Waters Corporation) equipped with a BEH C17 2.1 mm x 100 mm column operated at 0.5 ml/min. Mobile phase A was 10 mM ammonium formate (pH 10), whilst mobile phase B was 10% 10 mM ammonium formate (pH 10) and 90% acetonitrile, using a gradient of 4 to 35% mobile phase B over 50 min. The second dimension reversed phase LC-MS DDA method has been previously described [19]. Linear regression correlation was applied to correct for retention time differences between the DDA library and the scanning quadrupole DIA data, i.e. to normalize the chromatographic content/information of the library data.

Data processing and untargeted searching
SONAR quadrupole scanning DIA data were processed using ProteinLynx Global Server (PLGS) v3.0.2 (Waters Corporation) using optimized threshold and search parameters. Additional qualitative analysis was performed with Skyline v3.5 (University of Washington, Seattle, WA) using libraries derived from PLGS protein database searches. ISOQuant was applied for the integrated quantitative analysis of data derived from multiple LC-MS runs (http://www.isoquant.net) [20]. Protein and peptide identifications were obtained by searching the Homo sapiens UniProt database (20,161 reviewed entries, release 2016_10). Additional qualitative analysis was performed with Scaffold v4.7.1 (Proteome Software, Portland, OR) using the results from PLGS database searches. The results have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository [21] with dataset identifier PXD005869. The processing and search parameters are summarized in Supplementary Table 1.
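The linear retention time correction mentioned under library creation can be illustrated with a minimal sketch. It assumes a plain least-squares line fitted to peptides observed in both the DDA library and the DIA run; the function names and the anchor retention times are hypothetical and are not taken from the development software used in this work.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def normalize_rt(library_rt, slope, intercept):
    """Map library retention times onto the DIA run's time scale."""
    return [slope * t + intercept for t in library_rt]


# Hypothetical anchor peptides seen in both data sets (retention times in min).
library_anchor_rt = [12.0, 25.5, 40.2, 61.8, 80.1]
dia_anchor_rt = [1.05 * t + 2.0 for t in library_anchor_rt]

slope, intercept = fit_line(library_anchor_rt, dia_anchor_rt)
predicted = normalize_rt([30.0], slope, intercept)
```

In practice the fit would presumably be made on confidently identified peptides only, and a tolerance around each predicted retention time would be applied during library matching.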

Results and Discussion

Principle
In this study, a scanning quadrupole data independent acquisition (DIA) method, termed SONAR, is described, implemented on a hybrid quadrupole orthogonal acceleration time-of-flight (oa-TOF) mass spectrometry platform enabled with broadband DIA [22,23]. A schematic of the ion optics of the instrument is shown in Figure 1 (A). The low-resolution quadrupole mass filter of the first mass analyser is scanned repeatedly and both precursor MS1 and product ion MS2 data are acquired at spectral rates approaching 2000 spectra per second using the oa-TOF mass analyser. This method therefore produces high duty-cycle, unbiased two-dimensional MS data, as will be explained in more detail in the following sections. In the current configuration of the method, the quadrupole is typically set to transmit a 10 to 35 m/z unit window (Th), which is continuously and repetitively scanned with a 0.1 to 1 second cycle time over a user selected m/z range. For the majority of the data and results presented in this paper, the transmission window was set to 24 m/z units and the scan range to m/z 400 to 900. At the end of each quadrupole cycle the instrument was switched between a post-quadrupole fragmentation mode and a non-fragmentation mode. The acquisition system is configured to profile scanning quadrupole separations by adding individual TOF spectra (pushes) incrementally into a buffer containing 200 memory locations or “bins”. Each bin consists of a mass spectrum labeled with a different average quadrupole position. The pusher period is determined by the TOF mode and mass range, and is typically around 60 to 70 µs. In normal use, data are added to the buffer in a cyclic fashion and at least 10 cycles are usually added before it is read out and stored. Data from several consecutive pushes are added to the same spectral bin in the buffer before moving on to the next bin.
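The buffer arrangement described above can be sketched numerically. The example below is illustrative only: it assumes a 1 s quadrupole cycle, a 70 µs pusher period, a linear scan from m/z 400 to 900, and that each bin is labeled with the centre of its share of the scan range. All function names are hypothetical and do not correspond to the instrument control software.

```python
N_BINS = 200  # number of memory locations in the acquisition buffer


def pushes_per_bin(cycle_time_s, pusher_period_s, n_bins=N_BINS):
    """Whole number of TOF pushes that fit into one bin's share of a cycle."""
    return int(cycle_time_s / n_bins / pusher_period_s)


def bin_for_push(push_index, per_bin, n_bins=N_BINS):
    """Bin receiving a given push; the buffer is filled cyclically."""
    return (push_index // per_bin) % n_bins


def bin_centre_mz(bin_index, scan_start=400.0, scan_end=900.0, n_bins=N_BINS):
    """Average quadrupole position of a bin, assuming a linear scan."""
    step = (scan_end - scan_start) / n_bins
    return scan_start + (bin_index + 0.5) * step


# Assumed values: 1 s quadrupole cycle, 70 microsecond pusher period.
per_bin = pushes_per_bin(1.0, 70e-6)
first_bin_centre = bin_centre_mz(0)
last_bin_centre = bin_centre_mz(N_BINS - 1)
```

Under these assumptions, consecutive bins are spaced 2.5 m/z units apart, so a 24 m/z transmission window spans roughly ten neighbouring bins.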
The number of pushes per bin is set so that each bin spans 1/200 of the quadrupole cycle time (there is no inter-scan delay between pushes). In the described experiment, the quadrupole cycle time was chosen to be about 1 s, so each bin spanned about 5 ms and the number of pushes added to each bin was about 70. The whole arrangement is shown schematically in Figure 1 (B). This setup produces nested two-dimensional MS data sets that can be viewed using multidimensional analysis software, as illustrated in Figure 1 (C). Within these distributions, the horizontal axis represents the centre of the quadrupole transmission window while the vertical axis is the m/z value recorded by the oa-TOF. Within the MS1 precursor data (Figure 1 (C) (i)), a largely diagonal structure represents the precursor ions transmitted by the quadrupole and recorded by the oa-TOF. Some fragmentation at low m/z is also visible in this log-intensity heat map. In the product ion MS2 data (Figure 1 (C) (ii)), the residual diagonal structure corresponds to un-fragmented precursor, but the additional scatter above and below this line arises from fragmentation. Using software tools developed to extract drift plots from IMS-MS data, reconstructed quadrupole mass spectra can be extracted for a given TOF m/z and retention time (tr), as shown by the two right hand panes, (iii) and (iv), in Figure 1 (C) for the precursor MS1 and product ion MS2 data, respectively. In the current experiment, fragmentation is induced post-quadrupole, so the reconstructed spectra should be identical for a precursor and its fragments, limited only by ion statistics. This opens up the possibility of precursor and fragment alignment with a tolerance much tighter than the quadrupole window.

To investigate the accuracy of the precursor m/z assignment within the scanning quadrupole dimension, two isotope features were examined for the annotated fragment y-ions of an abundant peptide, as shown in Figure 1 (D). The average calculated precursor m/z value and uncertainty was 783.7 ± 3.0. The theoretical m/z for the 2+ charge state of this peptide is 783.9. In this case, the m/z of the precursor was therefore determined to within approximately 12.5% of the quadrupole peak width (3.0 of 24 m/z units).

Since the two-dimensional data produced by this method are stored using the same format as ion mobility enabled DIA experiments ((U)(H)DMSE) [6,15,24], the scanning quadrupole DIA data can be processed and searched directly using existing commercial [10] and open source software tools [11] for discovery type experiments, through mzML conversion or by reading/importing the raw data directly. However, the data can also be used for targeted data analysis, as will be demonstrated in the next section, using either proprietary or public reference spectral libraries. The additional selectivity afforded by scanning the quadrupole is illustrated in Figure 2 and Supplementary Figure 1.
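The assignment of a precursor m/z from a reconstructed quadrupole profile can be sketched as an intensity-weighted centroid. This is a simplified illustration, not the procedure used to produce Figure 1 (D): the profile below is a hypothetical, symmetric set of bin intensities on a 2.5 m/z bin spacing, and only the 3.0 m/z uncertainty and the 24 m/z window are taken from the text.

```python
def quad_centroid(bin_centres, intensities):
    """Intensity-weighted mean quadrupole position of a reconstructed profile."""
    total = sum(intensities)
    return sum(c * i for c, i in zip(bin_centres, intensities)) / total


def fraction_of_window(uncertainty_mz, window_mz):
    """Precursor m/z uncertainty expressed as a fraction of the window width."""
    return uncertainty_mz / window_mz


# Hypothetical profile of one fragment ion across nine neighbouring bins.
centres = [783.75 + 2.5 * k for k in range(-4, 5)]
profile = [1.0, 5.0, 9.0, 10.0, 10.0, 10.0, 9.0, 5.0, 1.0]

centroid = quad_centroid(centres, profile)

# Values reported in the text: +/- 3.0 m/z for a 24 m/z transmission window.
relative = fraction_of_window(3.0, 24.0)
```

Because a precursor and its fragments share the same quadrupole profile, comparing centroids of this kind is one conceivable way to pair fragments with precursors at a tolerance tighter than the window itself.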
The results shown in Figure 2 represent the Skyline open source informatics interpretation of a number of MS1 and MS2 spectra for one of the annotated peptides. The left hand side 2D ion maps of the (A) and (B) panel sets, MS1 and MS2 data respectively, disregard the quadrupole filtering of the data by Skyline, imitating broadband DIA data. The right hand side 2D ion maps of the (A) and (B) panels demonstrate the effect of quadrupole isolation using the prediction option of the software, with the average quadrupole position shown by the rectangular band(s). This, in turn, affords reduced complexity in the MS1 and MS2 spectra, which can be readily annotated, as shown by the unfiltered and scanning quadrupole filtered spectra, top and bottom respectively, in the (C) and (D) panels of Figure 2. An alternative view of the multi-dimensional nature of the data is shown in Supplementary Figure 1, representing 20 s of chromatographic product ion data of a 250 ng on-column load of HeLa cell protein extract separated using a 90 min reversed phase gradient. Panel A shows the deconvoluted ion detections with accurate m/z TOF detection as a function of quadrupole m/z position. Panel B illustrates that within a single quadrupole position multiple product ion series originating from multiple precursors were detected by the processing software, from which, as shown in panel C, product ion spectra can be derived for qualitative or quantitative purposes.

This example highlights an important fundamental difference between stepped and scanning quadrupole based DIA approaches. Since each precursor and product ion is associated with a specific quadrupole profile, scanning quadrupole DIA data can be peak detected in the quadrupole domain, as previously shown in Figures 1 (C) (iii) and 1 (C) (iv). Moreover, since the binned data overlap and are off-set, dependent upon the m/z scan range and quadrupole isolation width, scanning quadrupole DIA is inherently more specific than stepped quadrupole DIA, as certain precursor and product ion data will reside in unique bins. In addition, the method is more amenable to applications that require a high sampling rate, such as fast LC or CE separations, which can be achieved by acquiring DIA data at very fast quadrupole scan rates. Higher throughput/duty cycle metabolomics/lipidomics applications using 0.1 s quadrupole scan times will be presented elsewhere, highlighting the importance of peak sampling frequency and its effect on quantitative precision [25].

Characteristics – duty cycle, speed, precision
Analytical parameters and sample properties such as quadrupole transmission width, sample complexity, throughput, dynamic range, and acquisition speed have to be considered and optimized for a given assay type. Examples of the dependency between acquisition parameters and sample complexity are shown in Figure 3. The effect of duty cycle time on the relative number of protein identifications using a fixed scanning quadrupole transmission window for two different proteomes, i.e. E. coli and a human cell line, is shown in panels (A) and (B), using two relatively short gradient acquisitions of 30 and 45 min, respectively. Regardless of the complexity of the proteome, longer gradients provided a higher number of protein identifications.
However, the maxima are at different scan times, with the results suggesting a proteome complexity dependency. The effect of transmission window on the number of protein identifications for undepleted human plasma and a human cell line sample is illustrated in panel (C) of Figure 3. Here, for both samples, the amount loaded on-column, gradient length, quadrupole scan time and oa-TOF integration time were all kept constant. A dependency on specificity can be observed, with the number of protein identifications maximizing at a transmission window of approximately 23 to 28 m/z wide. The effect of the gradient length at fixed load for two different quadrupole scan and oa-TOF integration times on the number of identified human cell line proteins is shown in Figure 3 (D). In this instance, predominantly due to sample complexity and dynamic range, the highest observed identification rate was found with a 90 min separation, for both quadrupole scan/oa-TOF integration times investigated. The absolute protein group identification numbers are shown in Supplementary Figure 2, with the highest observed identification rate reported per experiment (panel) and sample type. Note that these figures of merit represent untargeted search results and that only the results shown in Figure 3 (D) can be directly compared with the detailed characterization of the human cell line results described in the following section. The Figure 3 (A), (B) and (C) panel results are relative references, where one or more parameters were varied while the others, although not always optimal in terms of load for a given sample type, were kept constant. Quantitative precision was found to be around 10 to 15% at both the protein and peptide detection level, which will be discussed in greater detail in the context of a biological study in part B of the manuscript, ‘Application to the Analysis of the Calcineurin-Dependent Drug-Treated Aspergillus fumigatus Proteome’, and in section ‘Characterization’.

In summary, for a given application, method optimization is required that takes into account the discussed parameters and sample properties; however, once determined, these parameters are constant and can be repeatedly applied. For all experiments and results discussed in the following sections, a 24 m/z transmission window, a 90 min reversed phase gradient, a quadrupole scan range of m/z 400 to 900, and a 0.5 s quadrupole scan time were used. The configuration affords different modes of operation whereby, for example, MS2 acquisition time is increased at the expense of MS1 acquisition time in order to improve MS2 duty cycle and sensitivity. To address the loss in MS1 sensitivity, a non-scanning, wide transmission experiment could be conducted, as this experiment would only require confirmation of the (accurate mass) presence of a precursor. However, since the applied analysis software expects equally time spaced MS1 and MS2 experiments, these modes of acquisition have thus far not been implemented.
Compared to broadband DIA methods, specificity is increased by the scanning quadrupole. However, sensitivity is negatively impacted by the serial nature of the scanning quadrupole, and the overall reduced duty cycle results in a reduction of sensitivity of approximately 4 to 5 times. The combination of a fast scanning quadrupole with a fast TOF acquisition system allows quadrupole scans to be completed in less than 0.1 s [25]. In contrast, oa-TOF based stepping quadrupole DIA cycles typically last for 1 to 2 s. Additionally, a precursor quadrupole profile is available for every fragment, which can be centroided to give a precursor m/z value to a small fraction of the quadrupole transmission window (typically ~1/10), thereby affording greatly increased selectivity compared to stepping quadrupole based DIA methods at the same sensitivity or, alternatively, greatly increased sensitivity at the same specificity. Ion mobility assisted DIA methods benefit from sensitivity gains due to the parallel nature of the technique, which nests ion mobility with TOF mass spectrometry [26-28]. Therefore, ion mobility based parallel DIA approaches are more sensitive than serial stepping and scanning quadrupole DIA methods [2-4,7], but their resolving power is reduced compared to quadrupole isolation. A more detailed discussion and comparison of broadband DIA, stepping and scanning quadrupole DIA, and mobility assisted DIA is, however, beyond the scope of this manuscript and will be presented elsewhere [25].

Characterization
The precision and accuracy of the scanning quadrupole DIA acquisition method were initially assessed by analyzing a four protein mixture injected at two levels differing by a factor of two. The concentrations of the four proteins within the sample(s) vary by a factor of 16, i.e. lowest vs. highest molar amount, and span a molecular weight range of 37 to 97 kDa. The data were normalized to one of the median abundant proteins; hence, the expected protein ratio values should be equal to one with minimal variance. Since sample complexity is reduced, the LC-MS experiments were conducted using a shortened 30 min reversed phase gradient and reduced quadrupole scan and TOF integration times of 0.25 s. The results are graphically summarized in Supplementary Figure 3, illustrating that the obtained protein ratio values are well within 5% of the expected values, with one exception that was still within 10% of the expected value. This exception was caused by a reduced number of identified peptides due to the lower Mw (smaller number of detectable peptides) and lower amount injected on-column for the protein of interest. The 95% confidence interval, i.e. relative protein abundance ratio error, which takes the contribution of the individual peptide abundances into account [29], was nevertheless similar to that of the other three quantified proteins and equaled on average 0.03.
This error estimate and precision compare favorably with previously published results for label-free LC-MS peptide and protein quantitation using a broadband based DIA technique [29,30], especially considering that a reduced scan time and fast scanning quadrupole isolation were employed within the acquisition schema. A more detailed qualitative and semi-quantitative characterization of the method was conducted with three technical replicates of 0.5, 1.0, and 1.5 µg of HeLa human cell line sample extract loaded on-column, with the results summarized in Table 1. Untargeted searching of the data, using 1% protein and peptide FDR cut-off values [24], identified 3170 protein groups experiment wide. Identification yields increased moderately with injected quantity. The experiment-wide probabilistic interpretation of the untargeted search results (data not shown) detected a similar number of proteins, i.e. 3363, with a ProteinProphet FDR equal to 1% [31]. This analysis also allowed assessment of the average MS1 abundance variation (%CV) across technical replicates, which equaled 11.8% for all peptides (total number), 10.4% for sequence unique peptides, and 9.5% for the peptides used in Hi(n)


quantitation for the 1 µg load experiment, with similar figures of merit for the two other HeLa protein digest amounts analyzed. The targeted library search approach, using a sample specific library, yielded a higher number of protein identifications and detected 4074 protein groups in total, although, as explained below, with somewhat reduced repeatability. For untargeted searching of the data, the detection reproducibility across all LC-MS experiments and amounts injected on-column, as afforded by experiment-wide processing, was better than 85% and ranged from 96% to 98% for the technical replicates, as graphically summarized in the Venn bar graphs in the right hand side panel (E) of Figure 4. For the targeted searches, based on sample centric analysis, experiment wide protein identification replication was better than 89% and technical sample replication ranged from 88% to 89%. These slightly reduced numbers vs. the protein replication figures of merit of the untargeted search are explained by the differences in the applied approaches, i.e. sample centric targeted vs. experiment wide centric untargeted searches. In experiment wide analysis, a peptide is identified within one or more samples and the presence of its precursor and product ions is then confirmed in the other samples of the whole experiment, whereas with sample centric search approaches this inter-sample type of comparison across the complete experimental design is not conducted. This also explains the relatively high increase in the number of identified protein groups when the search results of the different analyzed on-column amounts were combined, compared to the results of the individual experiments. An example of a peptide that was not identified using the untargeted search strategy but that could be detected using a targeted search is shown in Supplementary Figure 4.
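A replication figure of this kind can be computed from per-replicate identification lists. The sketch below assumes one possible definition, namely the fraction of all protein groups identified in any replicate that is identified in every replicate; the definition actually used for Table 1 is not stated in this section, and the accession labels and the function name `replication_rate` are hypothetical.

```python
def replication_rate(replicates):
    """Fraction of all identified protein groups (union over replicates)
    that is identified in every replicate (intersection)."""
    sets = [set(replicate) for replicate in replicates]
    if not sets:
        return 0.0
    union = set().union(*sets)
    common = set.intersection(*sets)
    return len(common) / len(union) if union else 0.0


# Hypothetical identification lists for three technical replicates.
replicate_1 = ["P1", "P2", "P3", "P4"]
replicate_2 = ["P1", "P2", "P3", "P5"]
replicate_3 = ["P1", "P2", "P3", "P4"]

rate = replication_rate([replicate_1, replicate_2, replicate_3])
print(f"replication = {100 * rate:.1f}%")
```

Under this definition, an experiment-wide search that confirms a peptide across samples before reporting it will tend to give a higher rate than a sample centric search, which is consistent with the trend described above.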
The left hand side of Figure 4 illustrates the distribution of the observed sequence coverage (A) and number of peptides (B) detected as a function of dynamic range in concentration (µmol/mol values), using a Hi(n) quantitation approach to estimate abundances [32], as well as the normalized and average quantitative response (C) expressed as normalized slope values ((µmol target protein/mol total protein)/amount target protein vs. amount total protein). Shown inset are the median values, illustrating an increase in amino acid sequence coverage and number of peptides as a function of concentration dynamic range bin, and linear behavior for the majority (89%) of the identified proteins for the three amounts injected on-column. However, the signal response per se does not have to increase proportionally with amount to exhibit a positive linear response, which accounts for a portion of the negative normalized slopes. The regression analysis results shown in Supplementary Figure 5 illustrate the correlation coefficients of the amount responses shown in Figure 4 (C). The correlation was found to be positive for 99% of the data for four of the five µmol/mol orders investigated, and positive for 95% of the data for the fifth, middle order.
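The quantities discussed above can be sketched in a few lines. The example assumes that a Hi(n) response is the mean intensity of the n most intense peptides of a protein, that a relative molar amount in µmol/mol is obtained by dividing each Hi(n) response by the sum over all proteins, and that the amount response is summarized by an ordinary least-squares slope and a Pearson correlation coefficient. It does not reproduce the exact slope normalization used for Figure 4 (C), and all intensities and function names are hypothetical.

```python
from math import sqrt


def hi_n(peptide_intensities, n=3):
    """Mean intensity of the n most intense peptides of one protein."""
    top = sorted(peptide_intensities, reverse=True)[:n]
    return sum(top) / len(top)


def relative_molar_amounts(protein_peptides, n=3):
    """Hi(n) response of each protein as a share of the summed response,
    expressed in µmol/mol (assumed normalization)."""
    responses = {p: hi_n(v, n) for p, v in protein_peptides.items()}
    total = sum(responses.values())
    return {p: 1e6 * r / total for p, r in responses.items()}


def slope_and_correlation(x, y):
    """Ordinary least-squares slope of y on x and Pearson correlation."""
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(y) / count
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    denominator = sqrt(sxx * syy)
    correlation = sxy / denominator if denominator > 0 else 0.0
    return slope, correlation


# Hypothetical peptide intensities for two proteins in one injection.
sample = {"protein_A": [30.0, 30.0, 30.0, 2.0], "protein_B": [10.0, 10.0, 10.0]}
amounts = relative_molar_amounts(sample, n=3)

# Hypothetical Hi(3) responses of one protein at 0.5, 1.0, and 1.5 µg loads.
loads = [0.5, 1.0, 1.5]
responses = [10.0, 20.0, 30.0]
slope, correlation = slope_and_correlation(loads, responses)

print(amounts)
print(f"slope = {slope:.1f}, r = {correlation:.2f}")
```

With only three on-column amounts per protein, a correlation coefficient is a coarse indicator, so the sign of the slope and of the correlation are the more robust readouts in a sketch of this kind.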


In terms of replication, i.e. detection consistency between replicates, similar trends were noted at the (non-redundant) peptide level, with on average 14,594 peptides, based on replicating annotated deconvoluted/charge state reduced features, identified with an experiment-wide normalized intensity %CV value of 4.3%. The overall median amino acid sequence coverage was 48% and the average number of peptides detected per protein was 8 for the untargeted searches. For the targeted searches, the fraction of detectable amino acid sequence coverage and number of peptides, i.e. library coverage, equaled 74% and 65%, respectively, for the scanning quadrupole DIA data. A summary, contrasting both identification methods, is provided in Table 1, with a total of 4342 protein groups identified by combining the results of both search strategies. Note that the queried library includes peptides beyond the m/z scan range of the applied scanning quadrupole DIA method, limiting the number of unique peptides that can be targeted; hence, the increase in sequence coverage through a combined search. The advantages and limitations of both approaches have been discussed elsewhere [11,33]. Here, only the use of scanning quadrupole DIA data for targeted and untargeted searches is described. In terms of identified proteins and peptides, both methods compare and intersect well with the qualitative results from previously reported studies [34-36]. However, the samples under study are not completely identical, hampering detailed comparative analysis. Moreover, the libraries used in these reference studies are not directly applicable to the sample, data, and results presented in this paper; hence, gains in proteome coverage are to be expected with the improvement of libraries and annotation approaches [16].
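For reference, a percent coefficient of variation across technical replicates can be computed as follows. This is a minimal sketch that assumes the sample standard deviation (n - 1 denominator) and an unweighted average over peptides; whether the reported values were obtained in exactly this way is not stated here. The intensities and the function names `percent_cv` and `average_percent_cv` are hypothetical.

```python
from statistics import mean, stdev


def percent_cv(values):
    """Coefficient of variation in percent, using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


def average_percent_cv(intensity_table):
    """Unweighted mean %CV over all peptides with at least two
    replicate measurements and a positive mean intensity."""
    cvs = [
        percent_cv(values)
        for values in intensity_table.values()
        if len(values) >= 2 and mean(values) > 0
    ]
    return mean(cvs)


# Hypothetical normalized intensities of three peptides in three replicates.
intensities = {
    "peptide_1": [90.0, 100.0, 110.0],
    "peptide_2": [190.0, 200.0, 210.0],
    "peptide_3": [50.0],  # single measurement, skipped
}

print(f"average %CV = {average_percent_cv(intensities):.1f}")
```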
However, the results from the library data used for the targeted searches and a publicly available data set (PXD001441), representing two independently acquired fractionated peptide versions, using similar chromatographic conditions as used for the HeLa sample under study, but acquired with DDA technology and alternative mass analyzer systems, were used for a qualitative comparison to assess basic performance. The results are graphically summarized in Figure 5. In summary, as shown in panel (A) of Figure 5, 92% of the proteins detected by scanning quadrupole DIA were found to be present in the DDA library and 81% within the reference DDA data set. Overall, 97% of the proteins detected were found to be identified by the two DDA experiments combined. It is anticipated that greater detection coverage can be achieved by the use of more comprehensive libraries or the combination of libraries. The additional proteins identified by the two DDA methods, shown in panel (B) of Figure 5, mainly stem, as expected from peptide centric fractionation strategies, from identifications associated with a relatively low coverage and number of identified peptides that were not detected by the analysis of unfractionated HeLa using scanning quadrupole DIA.
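The overlap figures quoted above are fractions of the DIA identifications that are also present in a reference set. A minimal sketch of that calculation, using hypothetical accession labels and a hypothetical function name `coverage`, is given below; it is not the comparison workflow used to generate Figure 5.

```python
def coverage(detected, reference):
    """Fraction of the detected proteins that is also present in the reference set."""
    detected = set(detected)
    if not detected:
        raise ValueError("no detected proteins to compare")
    return len(detected & set(reference)) / len(detected)


# Hypothetical protein identifications.
dia = {"A", "B", "C", "D", "E"}
dda_library = {"A", "B", "C", "D", "X"}
dda_public = {"A", "B", "C", "E", "Y"}

print(f"in DDA library:   {100 * coverage(dia, dda_library):.0f}%")
print(f"in public DDA:    {100 * coverage(dia, dda_public):.0f}%")
print(f"in combined DDA:  {100 * coverage(dia, dda_library | dda_public):.0f}%")
```

Because the denominator is the DIA result only, the proteins found exclusively by the fractionated DDA experiments do not enter these percentages and have to be examined separately.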


The selectivity of the method is highlighted by the example spectra shown in the top right hand side (D) of Figure 4, with extracted product ion chromatograms (F) and quadrupole scanning profiles (G) in the lower right hand side. These exemplify once more that the multidimensional character of the data and the similarity of the product ion profiles in the retention time and quadrupole m/z separation domains also hold for more complex peptide mixtures and could potentially be used in identification/scoring [37,38] or precursor/product ion correlation schema [10,11]. In terms of extracting chromatograms, this means that both dimensions need to be considered as well, ultimately affording the best signal-to-noise.

Conclusions

The work presented here describes the development and implementation of a scanning quadrupole data independent acquisition (DIA) strategy, termed SONAR. The data acquisition method has been evaluated for proteomic applications and systematically characterized for biological samples of varying complexity, from human plasma and cell lines to bacterial cell lysates. Parameters including duty cycle, speed, sensitivity, and specificity have been evaluated from both a qualitative and quantitative perspective for bottom up proteomics analysis. In addition, the utility of data from this technique for both targeted and untargeted search strategies has been shown, demonstrating the applicability of this novel type of DIA data to both strategies and their complementary nature. An inherent feature of the scanning quadrupole DIA technique is the limited number of parameters to be defined prior to analysis, and the work contained here has allowed for the definition of default parameters for analyzing proteome samples without the requirement to create sample specific libraries or to rely on public repositories.
Acknowledgement

Donald Jones of Glenfield Hospital (Department of Cardiovascular Sciences and NIHR Leicester Cardiovascular Biomedical Research Unit) and the University of Leicester (Department of Cancer Studies) is kindly acknowledged for the donation of plasma samples.

Conflict of Interest

CJH, SL, SJG, JW, JIL, KR, and JPCV are employed by Waters Corporation, which operates in the field covered by the article. The remaining authors declare no competing financial interests.


Supporting Information

The following files are available free of charge at the ACS website http://pubs.acs.org:

Table S1: Scanning quadrupole DIA and DDA processing and search parameters.

Figure S1: Limited HeLa load scanning quadrupole DIA LC-MS data analysis.

Figure S2: Highest observed identification rate from untargeted search results per experiment and sample type, optimizing scanning quadrupole DIA acquisition parameters and gradient lengths.

Figure S3: Log2 relative profile abundance profiles scanning quadrupole DIA.

Figure S4: Targeted library search identification example.

Figure S5: Quantitative precision summary of scanning quadrupole DIA analysis of HeLa human cell line sample showing the correlation, Hi(n) response vs. amount injected, as a function of dynamic range.

Captions

Figure 1. (A) MS instrument optics/configuration, shown inset is the transmission profile as a function of time, (B) collision energy profiles quadrupole as a function of experiment type and time (and average quadrupole position in bins), (C) nested two dimensional MS (quadrupole m/z vs. TOF m/z) MS1 and MS2 data sets, including typical quadrupole transmission profiles for fragments of DFNVGGYIQAVLDR (PYGM_RABIT) eluting at 23.7 min, and (D) quadrupole extracted TOF MS1 and MS2 spectra for the peptide shown in panel (C) following 2D peak (precursor m/z, and fragment m/z) detection of the scanning quadrupole DIA data (the extraction width used in this case was about 10% of the quadrupole peak width).

Figure 2. 2D m/z quadrupole vs. m/z TOF distributions and reconstructed spectra, showing the aggregate of all average quadrupole positions and a single average quadrupole position 2D MS1 and MS2 distributions, (A) and (B), respectively, and aggregate and single average quadrupole position MS1 and MS2 spectra, (C) and (D) respectively, for SADTLWGIQK (LDHA_HUMAN) eluting at 61.7 min.

Figure 3. Scanning quadrupole DIA acquisition parameter and gradient optimisation examples for the normalised number of identified protein groups for (A) E. coli - squares = 30 min gradient and circles = 45 min gradient; (B) human cell line - squares = 30 min gradient


and circles = 45 min gradient; (C) squares = human cell line and circles = human undepleted plasma; (D) human cell line - squares = 0.3 s scan time and circles = 0.5 s scan time. Average (n = 3) relative identification values are shown with an average technical variation across all experiments smaller than 5%.

Figure 4. Graphical qualitative identification summary of scanning quadrupole DIA analysis of HeLa human cell line sample. Left to right and clockwise: amino acid sequence coverage as a function of dynamic range (µmol/mol) (A), number of peptides as a function of dynamic range (µmol/mol) (B), quantitative response (normalised slope ((µmol/mol)/amount) vs. amount) as a function of dynamic range (µmol/mol) (C), shown inset are median values, DIA product ion spectrum (D), Venn distribution protein identifications (grey = common; black = condition or condition pair unique) (E), product ion extracted chromatograms (F), and product ion scanning quadrupole profiles (G).

Figure 5. Qualitative comparison scanning quadrupole DIA (unfractionated) results vs. library DDA (concatenated (96/12) off-line 2D basic pH reversed phase fractionated and identification replication filtered) and public DDA (6 fraction off-line 2D basic pH reversed phase fractionated) HeLa LC-MS data. (A) proteins and protein groups in parentheses, and (B) number of identified peptides and amino acid sequence coverage (black = scanning quadrupole DIA, dark grey = reference DDA data, and light grey = DDA library).

References

1. Chapman, J. D.; Goodlett, D. R.; Masselon, C. D. Multiplexed and data-independent tandem mass spectrometry for global proteome profiling. Mass Spec. Rev. 2014, 33, 452-470.

2. Masselon, C. D.; Anderson, G. A.; Harkewicz, R.; Bruce, J. E.; Pasa-Tolic, L.; Smith, R. D. Accurate mass multiplexed tandem mass spectrometry for high-throughput polypeptide identification from mixtures. Anal. Chem. 2000, 72, 1918-1924.

3. Purvine, S.; Eppel, J. T.; Yi, E. C.; Goodlett, D. R. Shotgun collision-induced dissociation of peptides using a time of flight mass analyzer. Proteomics. 2003, 3, 847-850.

4. Venable, J. D.; Dong, M. Q.; Wohlschlegel, J.; Dillin, A.; Yates, J. R. Automated approach for quantitative analysis of complex peptide mixtures from tandem mass spectra. Nat. Methods. 2004, 1, 39-45.

5. Silva, J. C.; Denny, R.; Dorschel, C. A.; Gorenstein, M.; Kass, I. J.; Li, G. Z.; McKenna, T.; Nold, M. J.; Richardson, K.; Young, P.; Geromanos, S. J. Quantitative proteomic analysis by accurate mass retention time pairs. Anal. Chem. 2005, 77, 2187-2200.


6. Rodriguez-Suarez, E.; Hughes, C.; Gethings, L.; Giles, K.; Wildgoose, J. L.; Stapels, M. E.; Fadgen, K.; Geromanos, S. J.; Vissers, J. P.; Elortza, F.; Langridge, J. I. An Ion Mobility Assisted Data Independent LC-MS Strategy for the Analysis of Complex Biological Samples. Current Anal. Chem. 2013, 9, 199-211.

7. Gillet, L. C.; Navarro, P.; Tate, S.; Röst, H.; Selevsek, N.; Reiter, L.; Bonner, R.; Aebersold, R. Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Mol. Cell. Proteomics. 2012, 11, O111.016717.

8. Weisbrod, C. R.; Eng, J. K.; Hoopmann, M. R.; Baker, T.; Bruce, J. E. Accurate peptide fragment mass analysis: multiplexed peptide identification and quantification. J. Proteome Res. 2012, 11, 1621-1632.

9. Bilbao, A.; Varesio, E.; Luban, J.; Strambio-De-Castillia, C.; Hopfgartner, G.; Müller, M.; Lisacek, F. Processing strategies and software solutions for data-independent acquisition in mass spectrometry. Proteomics. 2015, 15, 964-980.

10. Li, G. Z.; Vissers, J. P.; Silva, J. C.; Golick, D.; Gorenstein, M. V.; Geromanos, S. J. Database searching and accounting of multiplexed precursor and product ion spectra from the data independent analysis of simple and complex peptide mixtures. Proteomics. 2009, 9, 1696-1719.

11. Tsou, C. C.; Avtonomov, D.; Larsen, B.; Tucholska, M.; Choi, H.; Gingras, A. C.; Nesvizhskii, A. I. DIA-Umpire: comprehensive computational framework for data-independent acquisition proteomics. Nat. Methods. 2015, 12, 258-264.

12. Egertson, J. D.; Kuehn, A.; Merrihew, G. E.; Bateman, N. W.; MacLean, B. X.; Ting, Y. S.; Canterbury, J. D.; Marsh, D. M.; Kellmann, M.; Zabrouskov, V.; Wu, C. C.; MacCoss, M. J. Multiplexed MS/MS for improved data-independent acquisition. Nat. Methods. 2013, 10, 744-746.

13. Ting, Y. S.; Egertson, J. D.; Payne, S. H.; Kim, S.; MacLean, B.; Käll, L.; Aebersold, R.; Smith, R. D.; Noble, W. S.; MacCoss, M. J. Peptide-Centric Proteome Analysis: An Alternative Strategy for the Analysis of Tandem Mass Spectrometry Data. Mol. Cell. Proteomics. 2015, 14, 2301-2307.

14. Daly, C. E.; Ng, L. L.; Hakimi, A.; Willingale, R.; Jones, D. J. Qualitative and quantitative characterization of plasma proteins when incorporating traveling wave ion mobility into a liquid chromatography-mass spectrometry workflow for biomarker discovery: use of product ion quantitation as an alternative data analysis tool for label free quantitation. Anal. Chem. 2014, 86, 1972-1979.

15. Thalassinos, K.; Vissers, J. P.; Tenzer, S.; Levin, Y.; Thompson, J. W.; Daniel, D.; Mann, D.; DeLong, M. R.; Moseley, M. A.; America, A. H.; Ottens, A. K.; Cavey, G. S.; Efstathiou, G.; Scrivens, J. H.; Langridge, J. I.; Geromanos, S. J. Design and application of a data-independent precursor and product ion repository. J. Am. Soc. Mass Spectrom. 2012, 23, 1808-1820.


16. Williams, B. J.; Ciavarini, S. J.; Devlin, C.; Cohn, S. M.; Xie, R.; Vissers, J. P.; Martin, L. B.; Caswell, A.; Langridge, J. I.; Geromanos, S. J. Multi-mode acquisition (MMA): An MS/MS acquisition strategy for maximizing selectivity, specificity and sensitivity of DIA product ion spectra. Proteomics. 2016, 16, 2284-2301.

17. Wang, Y.; Yang, F.; Gritsenko, M. A.; Wang, Y.; Clauss, T.; Liu, T.; Shen, Y.; Monroe, M. E.; Lopez-Ferrer, D.; Reno, T.; Moore, R. J.; Klemke, R. L.; Camp, D. G.; Smith, R. D. Reversed-phase chromatography with multiple fraction concatenation strategy for proteome profiling of human MCF10A cells. Proteomics. 2011, 11, 2019-2026.

18. Yang, F.; Shen, Y.; Camp, D. G.; Smith, R. D. High pH reversed-phase chromatography with fraction concatenation as an alternative to strong-cation exchange chromatography for two-dimensional proteomic analysis. Expert Rev. Proteomics. 2012, 9, 129-134.

19. Juvvadi, P. R.; Ma, Y.; Richards, A. D.; Soderblom, E. J.; Moseley, M. A.; Lamoth, F.; Steinbach, W. J. Identification and mutational analyses of phosphorylation sites of the calcineurin-binding protein CbpA and the identification of domains required for calcineurin binding in Aspergillus fumigatus. Frontiers in Microbiology. 2015, 6, 175.

20. Distler, U.; Kuharev, J.; Navarro, P.; Tenzer, S. Label-free quantification in ion mobility-enhanced data-independent acquisition proteomics. Nat. Protoc. 2016, 11, 795-812.

21. Jones, P.; Côté, R. G.; Martens, L.; Quinn, A. F.; Taylor, C. F.; Derache, W.; Hermjakob, H.; Apweiler, R. PRIDE: a public repository of protein and peptide identifications for the proteomics community. Nucleic Acids Res. 2006, 34, D659-D663.

22. Bateman, R. H.; Carruthers, R.; Hoyes, J. B.; Jones, C.; Langridge, J. I.; Millar, A.; Vissers, J. P. A novel precursor ion discovery method on a hybrid quadrupole orthogonal acceleration time-of-flight (Q-TOF) mass spectrometer for studying protein phosphorylation. J. Am. Soc. Mass Spectrom. 2002, 13, 792-803.

23. Geromanos, S. J.; Vissers, J. P.; Silva, J. C.; Dorschel, C. A.; Li, G. Z.; Gorenstein, M. V.; Bateman, R. H.; Langridge, J. I. The detection, correlation, and comparison of peptide precursor and product ions from data independent LC-MS with data dependant LC-MS/MS. Proteomics. 2009, 9, 1683-1695.

24. Distler, U.; Kuharev, J.; Navarro, P.; Levin, Y.; Schild, H.; Tenzer, S. Drift time-specific collision energies enable deep-coverage data-independent acquisition proteomics. Nat. Methods. 2014, 11, 167-170.

25. Gethings, L. A.; Vissers, J. P.; Richardson, K.; Wildgoose, J. L.; Langridge, J. I. Lipid profiling of complex biological mixtures by liquid chromatography mass spectrometry using a novel scanning quadrupole data independent acquisition strategy. Rapid Commun. Mass Spectrom. 2017, doi: 10.1002/rcm.7941.

26. Giles, K.; Pringle, S. D.; Worthington, K. R.; Little, D.; Wildgoose, J. L.; Bateman, R. H. Applications of a travelling wave-based radio-frequency-only stacked ring ion guide. Rapid Commun. Mass Spectrom. 2004, 18, 2401-2414.

27. Giles, K.; Williams, J. P.; Campuzano, I. Enhancements in travelling wave ion mobility resolution. Rapid Commun. Mass Spectrom. 2011, 25, 1559-1566.


28. Pringle, S. D.; Giles, K.; Wildgoose, J. L.; Williams, J. P.; Slade, S. E.; Thalassinos, K.; Bateman, R. H.; Bowers, M. T.; Scrivens, J. An Investigation of the Mobility Separation of Some Peptide and Protein Ions Using a New Hybrid Quadrupole/Travelling Wave IMS/OA-ToF Instrument. Int. J. Mass Spectrom. 2007, 261, 1-12.

29. Richardson, K.; Denny, R.; Hughes, C.; Skilling, J.; Sikora, J.; Dadlez, M.; Manteca, A.; Jung, H. R.; Jensen, O. N.; Redeker, V.; Melki, R.; Langridge, J. I.; Vissers, J. P. A probabilistic framework for peptide and protein quantification from data-dependent and data-independent LC-MS proteomics experiments. OMICS. 2012, 16, 468-482.

30. Kramer, G.; Woolerton, Y.; van Straalen, J. P.; Vissers, J. P.; Dekker, N.; Langridge, J. I.; Beynon, R. J.; Speijer, D.; Sturk, A.; Aerts, J. M. Accuracy and Reproducibility in Quantification of Plasma Protein Concentrations by Mass Spectrometry without the Use of Isotopic Standards. PLoS One. 2015, 10, e0140097.

31. Searle, B. C. Scaffold: a bioinformatic tool for validating MS/MS-based proteomic studies. Proteomics. 2010, 10, 1265-1269.

32. Silva, J. C.; Gorenstein, M. V.; Li, G. Z.; Vissers, J. P.; Geromanos, S. J. Absolute quantification of proteins by LCMSE: a virtue of parallel MS acquisition. Mol. Cell. Proteomics. 2006, 5, 144-156.

33. Wang, J.; Tucholska, M.; Knight, J. D.; Lambert, J. P.; Tate, S.; Larsen, B.; Gingras, A. C.; Bandeira, N. MSPLIT-DIA: sensitive peptide identification for data-independent acquisition. Nat. Methods. 2015, 12, 1106-1108.

34. Rosenberger, G.; Koh, C. C.; Guo, T.; Röst, H. L.; Kouvonen, P.; Collins, B. C.; Heusel, M.; Liu, Y.; Caron, E.; Vichalkovski, A.; Faini, M.; Schubert, O. T.; Faridi, P.; Ebhardt, H. A.; Matondo, M.; Lam, H.; Bader, S. L.; Campbell, D. S.; Deutsch, E. W.; Moritz, R. L.; Tate, S.; Aebersold, R. A repository of assays to quantify 10,000 human proteins by SWATH-MS. Sci. Data. 2014, 1, 140031.

35. Navarro, P.; Kuharev, J.; Gillet, L. C.; Bernhardt, O. M.; MacLean, B.; Röst, H. L.; Tate, S. A.; Tsou, C. C.; Reiter, L.; Distler, U.; Rosenberger, G.; Perez-Riverol, Y.; Nesvizhskii, A. I.; Aebersold, R.; Tenzer, S. A multicenter study benchmarks software tools for label-free proteome quantification. Nat. Biotechnol. 2016, 34, 1130-1136.

36. Collins, B. C.; Hunter, C. L.; Liu, Y.; Schilling, B.; Rosenberger, G.; Bader, S. L.; Chan, D. W.; Gibson, B. W.; Gingras, A. C.; Held, J. M.; Hirayama-Kurogi, M.; Hou, G.; Krisp, C.; Larsen, B.; Lin, L.; Liu, S.; Molloy, M. P.; Moritz, R. L.; Ohtsuki, S.; Schlapbach, R.; Selevsek, N.; Thomas, S. N.; Tzeng, S. C.; Zhang, H.; Aebersold, R. Multi-laboratory assessment of reproducibility qualitative and quantitative performance of SWATH-mass spectrometry. Nat. Commun. 2017, 8, 291.

37. Röst, H. L.; Rosenberger, G.; Navarro, P.; Gillet, L.; Miladinović, S. M.; Schubert, O. T.; Wolski, W.; Collins, B. C.; Malmström, J.; Malmström, L.; Aebersold, R. OpenSWATH enables automated, targeted analysis of data-independent acquisition MS data. Nat. Biotechnol. 2014, 32, 219-223.


38. Reiter, L.; Rinner, O.; Picotti, P.; Hüttenhain, R.; Beck, M.; Brusniak, M. Y.; Hengartner, M. O.; Aebersold, R. mProphet: automated data processing and statistical validation for large-scale SRM experiments. Nat. Methods. 2011, 8, 430-435.


Table 1. Qualitative analysis of HeLa cell lysate by scanning quadrupole DIA using untargeted and targeted search approaches.

search type                  untargeted† [24]    targetedǂ [16,17]    combined
protein groups°, 0.5 µg      2940±18 (96.3%)     3196±14 (88.3%)      3940
protein groups°, 1.0 µg      3001±6 (97.7%)      3360±11 (89.6%)      4044
protein groups°, 1.5 µg      3011±15 (97.2%)     3400±13 (89.8%)      4063
protein groups°, combined    3170 (85.6%)        4074 (89.5%)         4342
unique peptides              14,594*             17,021               25,178

† integrated, co-detection based experiment-wide centric analysis; ǂ sample centric analysis; ° replication in parentheses; * replicating annotated deconvoluted/charge state reduced features


Table of Contents (TOC) graphic For TOC only:


Figure 1. [image: panels (A) to (D), with sub-panels (i) to (iv)]


Figure 2. [image: panels (A) to (D)]


Figure 3. [image: panel titles (A) E. coli, 30 and 45 min gradient; (B) human cell line, 30 and 45 min gradient; (C) human cell line and undepleted human plasma; (D) human cell line, 0.25 and 0.5 s scan time]


Figure 4. [image: panels (A) to (G). Panels (A) to (C) are binned by dynamic range: 0 - 2, 2 - 20, 20 - 200, 200 - 2000, and 2000 - 20000 µmol/mol; axes: sequence coverage (%), # peptides, and normalized slope. Inset median values as extracted: sequence coverage 24.0, 30.9, 47.6, 57.7, 65.2; # peptides 3, 6, 14, 13, 59; normalized slope 0E0, 1.1E-6, 4.6E-6, -6.2E-7, 2.1E4]


Figure 5. [image: panels (A) and (B)]
