Scanning Quadrupole Data-Independent Acquisition, Part A

Article pubs.acs.org/jpr

Cite This: J. Proteome Res. 2018, 17, 770−779

Scanning Quadrupole Data-Independent Acquisition, Part A: Qualitative and Quantitative Characterization

M. Arthur Moseley,†,▽ Christopher J. Hughes,‡,▽ Praveen R. Juvvadi,§,▽ Erik J. Soderblom,†,▽ Sarah Lennon,‡ Simon R. Perkins,∥ J. Will Thompson,† William J. Steinbach,§,⊥ Scott J. Geromanos,# Jason Wildgoose,‡ James I. Langridge,‡ Keith Richardson,‡,▽ and Johannes P. C. Vissers*,‡,▽




†Proteomics and Metabolomics Shared Resource, Center for Genomic and Computational Biology, Duke University Medical Center, Durham, North Carolina 27710, United States
‡Waters Corporation, Wilmslow SK9 4AX, United Kingdom
§Division of Pediatric Infectious Diseases, Department of Pediatrics, Duke University Medical Center, Durham, North Carolina 27710, United States
∥Institute of Integrative Biology, University of Liverpool, Liverpool L69 3BX, United Kingdom
⊥Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, United States
#Waters Corporation, Milford, Massachusetts 01757, United States

Supporting Information

ABSTRACT: A novel data-independent acquisition (DIA) method incorporating a scanning quadrupole in front of a collision cell and orthogonal acceleration time-of-flight mass analyzer is described. The method has been characterized for the qualitative and quantitative label-free proteomic analysis of complex biological samples. The principle of the scanning quadrupole DIA method is discussed, and analytical instrument characteristics, such as the quadrupole transmission width, scan/integration time, and chromatographic separation, have been optimized in relation to sample complexity for a number of different model proteomes of varying complexity and dynamic range including human plasma, cell lines, and bacteria. In addition, the technological merits over existing DIA approaches are described and contrasted. The qualitative and semiquantitative performance of the method is illustrated for the analysis of relatively simple protein digest mixtures and a well-characterized human cell line sample using untargeted and targeted search strategies. Finally, the results from a human cell line were compared against publicly available data that used similar chromatographic conditions but were acquired with DDA technology and alternative mass analyzer systems. Qualitative comparison showed excellent concordance of results with >90% overlap of the detected proteins.

KEYWORDS: label-free quantitation, data-independent acquisition, scanning quadrupole



INTRODUCTION

Data-independent acquisition (DIA) is an emerging quantitative omics profiling technique that is rapidly gaining popularity due to its comprehensive and unbiased sampling of precursor ions compared to data-dependent acquisition (DDA). A number of variants have been proposed, all of which have unbiased precursor selection, or no precursor selection at all, in common. Moreover, DIA-based approaches, in general, are aimed at increasing the detectable dynamic range and coverage of the MS analysis and improving the precision and accuracy of relative or absolute protein quantification by systematic sampling of the peptide precursor m/z space. A recent review by Chapman et al.1 provides a comprehensive overview, including classification based on technology or principle. The most well-known and prominent ones, in chronological order, are the original multiplexed acquisition method described by Masselon et al.,2 the implementation of the method on a time-of-flight mass spectrometer by Purvine et al.,3 the disclosure of DIA using sequential, relatively small m/z isolation windows by Venable et al.,4 the alternating scanning DIA technology first described by Silva et al.,5 followed by an extension of the method that includes ion mobility (IM) separation to increase selectivity and sensitivity by Rodriguez-Suarez et al.,6 and adaptations of the method proposed by Venable et al.,4 such as sequential window acquisition of all theoretical fragment-ion spectra by Gillet et al.7 and Fourier transform all reaction monitoring by Weisbrod et al.8

© 2017 American Chemical Society

Received: July 18, 2017 Published: September 13, 2017

DOI: 10.1021/acs.jproteome.7b00464 J. Proteome Res. 2018, 17, 770−779

Journal of Proteome Research



The increased popularity of DIA is marked by the development and availability of dedicated informatics tools to analyze the various types of multiplexed DIA schemes. In a recent review, Bilbao et al.9 examined the different schemes to facilitate the discussion of the concepts related to DIA data processing and provided a comprehensive overview of available software implementations for the identification and quantification of DIA data. Well-known examples include the DIA-specific search algorithm developed by Li et al.,10 which accounts for the mixture spectra that DIA generates by design and uses an iterative search process and physicochemical peptide and protein properties to assign product to precursor ions; the computational workflow presented by Tsou et al.,11 which detects precursor and fragment chromatographic features and assembles them into pseudotandem MS spectra that can be searched with conventional DDA database-searching and protein-inference tools; and the targeted data analysis approaches proposed by Gillet et al.7 and Weisbrod et al.,8 whereby extracted ion chromatograms or peptide fragmentation patterns, respectively, are used to detect, identify, and quantify query (library) peptides. Egertson et al.12 described a demultiplexing approach that reduces the chemical noise, which is inherently higher in DIA data, and thereby increases DIA data processing selectivity. To distinguish between the peptide identification types, the terms “spectrum-centric” and “peptide-centric” analysis were introduced:13 in the former, spectra are most commonly interpreted using database search approaches, whereas the latter tests directly for the presence or absence of query (library) peptides. DIA methods are typically limited by reduced precursor selectivity when compared with DDA-based techniques, as they use a 5- to 10-fold wider isolation window or do not apply any precursor ion selection.
For example, with a stepped quadrupole DIA approach, static quadrupole isolation windows are sequentially cycled through to cover the whole m/z range of interest,7 with ions passed for fragmentation, mass separation, and detection. Decisions are not required on which ions to select or fragment, and quadrupole isolation provides selectivity over a broadband DIA approach. However, this increase in selectivity comes at the price of mass spectrometer duty cycle, as the approach is serial, and as such wider or variable m/z windows have been used to gain sensitivity. This method has been primarily used for targeted data extraction, where the data are probed for specific combinations of fragment ions of the peptides of interest. The limitations of DIA approaches are geometry/configuration- as well as method-dependent but include both duty cycle and serial acquisition limitations, ultimately resulting in reduced sensitivity. In practice, reduced sensitivity is often not a limitation for most applications, as it can be countered by a higher loading of protein lysate or extract. However, improvements in technology, next to unbiased and complete sampling, will contribute to broader acceptance of the methodology and afford more comprehensive analysis and understanding of complex biological samples. Here, the principles, advantages, and application of a novel scanning-quadrupole-based DIA method are discussed, together with possible customization of the acquisition method. The technical performance of the method is highlighted and its qualitative and quantitative performance shown. Moreover, it is demonstrated that the obtained data are applicable to both search- and library-based proteomics analysis strategies, and the results are compared with, and shown to complement, those of current methods.
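The serial nature of the stepped quadrupole approach discussed in the Introduction can be illustrated with a small sketch. It tiles an m/z range with contiguous, fixed-width isolation windows and sums the per-window dwell times into a cycle time. The function names, the 25 m/z window width, and the 0.1 s dwell time are hypothetical values chosen for illustration only; they are not taken from any of the cited methods.

```python
def stepped_windows(mz_start, mz_end, width):
    """Tile [mz_start, mz_end] with contiguous, non-overlapping isolation windows."""
    if width <= 0 or mz_end <= mz_start:
        raise ValueError("need width > 0 and mz_end > mz_start")
    windows = []
    lower = mz_start
    while lower < mz_end:
        upper = min(lower + width, mz_end)  # the last window may be narrower
        windows.append((lower, upper))
        lower = upper
    return windows


def serial_cycle_time_s(n_windows, dwell_s):
    """Windows are visited one after another, so the cycle time grows linearly."""
    return n_windows * dwell_s


# Hypothetical example: m/z 400 to 900 covered with 25 m/z windows, 0.1 s each.
windows = stepped_windows(400.0, 900.0, 25.0)
cycle = serial_cycle_time_s(len(windows), dwell_s=0.1)
print(len(windows), "windows,", round(cycle, 2), "s per cycle")
```

Halving the window width in this sketch doubles the number of windows and, at fixed dwell time, the cycle time, which is the selectivity versus duty cycle trade-off described in the text.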


EXPERIMENTAL CONDITIONS

Protein Digestion

E. coli digestion standard (Waters Corporation, Milford, MA), a four-protein (alcohol dehydrogenase (yeast), glycogen phosphorylase (rabbit), enolase 1 (yeast), and bovine serum albumin) digest mixture (Waters Corporation), and predigested extracts from human K562 cells (Promega, Madison, WI) and human HeLa cells (Thermo Scientific Pierce, Waltham, MA) were resuspended in a 10% (v/v) aqueous acetonitrile solution and diluted with aqueous 0.1% (v/v) formic acid to an intermediate concentration of 1 or 2 μg/μL. Undepleted and unfractionated human plasma samples were donated by the Department of Cardiovascular Sciences, NIHR Leicester Cardiovascular Biomedical Research Unit, Glenfield Hospital, Leicester, U.K. and digested as previously described.14

LC−MS Configuration

LC separations were performed using a nanoACQUITY system (Waters Corporation) equipped with a Symmetry C18 5 μm, 2 cm × 180 μm precolumn and an HSS T3 C18 1.8 μm, 20 cm × 75 μm analytical column. The samples were transferred with aqueous 0.1% (v/v) formic acid to the precolumn at a flow rate of 5 μL/min. Mobile phase A was water containing 0.1% (v/v) formic acid, while mobile phase B was acetonitrile containing 0.1% (v/v) formic acid. The peptides were eluted from the precolumn to the analytical column and separated with a gradient of 5−40% mobile phase B over 90 min at a flow rate of 300 nL/min. The analytical column temperature was maintained at 35 °C. The lock mass compound, [Glu1]Fibrinopeptide B (200 fmol/μL), was delivered at 600 nL/min to the reference sprayer of the source of the mass spectrometer.

Mass-spectrometric analysis of tryptic peptides was performed using a Xevo G2-XS QTOF mass spectrometer (Waters Corporation, Wilmslow, United Kingdom). The mass spectrometer was operated with a resolution of 35 000 FWHM, and all analyses were performed in positive-mode ESI. The ion source block temperature and capillary voltage were set to 100 °C and 3.2 kV, respectively. The time-of-flight (TOF) mass analyzer of the mass spectrometer was externally calibrated with a NaCsI mixture from m/z 50 to 1990. LC−MS data were collected in a novel data-independent mode of acquisition (SONAR). In this acquisition mode the quadrupole was continuously scanned between m/z 400 and 900, with a quadrupole transmission width of ∼24 Da. The oa-TOF records mass spectra as the quadrupole scans and stores these MS data into 200 discrete bins. Two data functions (modes) are acquired in alternating fashion, differing only in the collision energy applied to the gas cell. In the low-energy MS1 mode, data are collected at a constant gas cell collision energy of 6 eV. In the elevated-energy MS2 mode, the gas cell collision energy is ramped from 14 to 40 eV (per unit charge).
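The acquisition geometry given above (scan range m/z 400 to 900, ∼24 Da transmission width, 200 bins) fixes a few derived quantities that are useful when reasoning about the data. The following is a minimal sketch of that arithmetic, assuming a linear quadrupole scan and evenly spaced bins (assumptions made here for illustration) and using the pusher period of roughly 60 to 70 μs quoted in the description of the acquisition system. The function name and the dictionary keys are hypothetical.

```python
def sonar_bin_layout(mz_start=400.0, mz_end=900.0, n_bins=200,
                     window_width=24.0, scan_time_s=0.5,
                     pusher_period_us=70.0):
    """Derive per-bin quantities for one quadrupole scan.

    Assumes (for illustration) a linear scan and evenly spaced bins.
    """
    bin_spacing_mz = (mz_end - mz_start) / n_bins       # m/z step between bin centers
    dwell_per_bin_s = scan_time_s / n_bins              # time spent per bin per scan
    pushes_per_bin = dwell_per_bin_s / (pusher_period_us * 1e-6)
    bins_per_window = window_width / bin_spacing_mz     # bins in which one precursor is seen
    return {
        "bin_spacing_mz": bin_spacing_mz,
        "dwell_per_bin_s": dwell_per_bin_s,
        "pushes_per_bin": pushes_per_bin,
        "bins_per_window": bins_per_window,
    }


# A ~1 s quadrupole cycle with a 70 us pusher period gives about 70 pushes per bin.
layout = sonar_bin_layout(scan_time_s=1.0, pusher_period_us=70.0)
for key, value in layout.items():
    print(f"{key}: {value:.4g}")
```

Under these assumptions each bin advances the quadrupole by 2.5 m/z, so a precursor transmitted through a ∼24 Da window appears in roughly 10 consecutive bins, which is what gives every ion a profile in the quadrupole dimension.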
Because the two functions alternate between low and elevated collision energy, the resulting data contain both precursor ions and all associated fragment ions. The spectral acquisition time in each mode was 0.5 s with a 0.02 s interscan delay. The reference sprayer was sampled every 60 s, and the data were lock-mass corrected postacquisition.

Library Creation and Targeted Searching

Targeted data analysis/library searches were conducted with development software15,16 using an in-house developed retention time normalized library representing 25 719 replicating nonmodified peptides, mapping to 6003 proteins/4994 protein groups, based on the analysis of a 120 μg HeLa sample using a basic pH multiple fraction concatenation strategy (96 fractions concatenated into 12 samples reanalyzed by acidic pH reversed-phase DDA LC−MS).17,18 The first dimension separation was conducted with an I-Class ACQUITY/Fractionation Manager system (Waters Corporation) equipped with a BEH C18 1.7 μm, 2.1 mm × 100 mm column operated at 0.5 mL/min. Mobile phase A was 10 mM ammonium formate (pH 10), while mobile phase B was 10% 10 mM ammonium formate (pH 10) and 90% acetonitrile, using a gradient of 4 to 35% mobile phase B over 50 min. The second dimension reversed-phase LC−MS DDA method has been previously described.19 Linear regression correlation was applied to correct for retention time differences, that is, normalize chromatographic content/information on the library data, between the DDA library and scanning quadrupole DIA data.

Figure 1. (A) MS instrument optics/configuration; shown inset is the transmission profile as a function of time. (B) Quadrupole collision energy profiles as a function of experiment type and time (and average quadrupole position in bins). (C) Nested 2D MS (quadrupole m/z vs TOF m/z) MS1 and MS2 data sets, including typical quadrupole transmission profiles for fragments of DFNVGGYIQAVLDR (PYGM_RABIT) eluting at 23.7 min. (D) Quadrupole extracted TOF MS1 and MS2 spectra for the peptide shown in panel C following 2D peak (precursor m/z and fragment m/z) detection of the scanning quadrupole DIA data. The extraction width used in this case was ∼10% of the quadrupole peak width.

Additional qualitative analysis was performed with Scaffold v4.7.1 (Proteome Software, Portland, OR) using the results from PLGS database searches. The results have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository21 with data set identifier PXD005869. The processing and search parameters are summarized in Supplementary Table 1.
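The linear regression used to correct retention time differences between the DDA library and the scanning quadrupole DIA data can be sketched as an ordinary least-squares fit on peptides observed in both data sets. This is an illustrative sketch only, not the development software referenced in the text; the function names and the anchor retention times are hypothetical.

```python
def fit_rt_map(library_rt, observed_rt):
    """Least-squares fit of observed_rt = slope * library_rt + intercept."""
    n = len(library_rt)
    if n < 2 or n != len(observed_rt):
        raise ValueError("need two or more paired retention times")
    mean_x = sum(library_rt) / n
    mean_y = sum(observed_rt) / n
    sxx = sum((x - mean_x) ** 2 for x in library_rt)
    if sxx == 0:
        raise ValueError("library retention times must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(library_rt, observed_rt))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def normalize_rt(library_rt, slope, intercept):
    """Map library retention times onto the time scale of the DIA run."""
    return [slope * rt + intercept for rt in library_rt]


# Hypothetical anchor peptides seen in both the library and the DIA run (minutes).
lib = [10.0, 20.0, 30.0, 40.0]
obs = [12.0, 22.5, 33.0, 43.5]
slope, intercept = fit_rt_map(lib, obs)
predicted = normalize_rt([25.0], slope, intercept)
print(round(slope, 3), round(intercept, 3), [round(p, 2) for p in predicted])
```

A straight-line fit is appropriate only when both separations use comparable gradients; strongly nonlinear retention time shifts would call for a different model.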



RESULTS AND DISCUSSION

Principle

In this study, a scanning quadrupole DIA method is described, termed SONAR, implemented on a hybrid quadrupole orthogonal acceleration time-of-flight (oa-TOF) mass spectrometry platform enabled with broadband DIA.22,23 A schematic of the ion optics of the instrument is shown in Figure 1A. The low-resolution quadrupole mass filter of the first mass analyzer is scanned repeatedly, and both precursor MS1 and product ion MS2 data are acquired at spectral rates approaching 2000 spectra per second using the oa-TOF mass analyzer. This method therefore produces high duty-cycle, unbiased 2D MS data, as will be explained in more detail in the following sections. In the current configuration of the method, the quadrupole is typically set to transmit a 10 to 35 m/z unit window (Th), which is continuously and repetitively scanned with a 0.1 to 1 s cycle time over a user-selected m/z range. For the majority of

Data Processing and Untargeted Searching

SONAR quadrupole scanning DIA data were processed using ProteinLynx Global Server (PLGS) v3.0.2 (Waters Corporation) using optimized threshold and search parameters. Additional qualitative analysis was performed with Skyline v3.5 (University of Washington, Seattle, WA) using libraries derived from PLGS protein database searches. ISOQuant was applied for the integrated quantitative analysis of data derived from multiple LC−MS runs (http://www.isoquant.net).20 Protein and peptide identifications were obtained by searching Homo sapiens UniProt (20 161 reviewed entries, release 2016_10).


Figure 2. 2D m/z quadrupole versus m/z TOF distributions and reconstructed spectra, showing the aggregate of all average quadrupole positions and a single average quadrupole position 2D MS1 and MS2 distributions, (A) and (B), respectively, and aggregate and single average quadrupole position MS1 and MS2 spectra, (C) and (D), respectively, for SADTLWGIQK (LDHA_HUMAN) eluting at 61.7 min.

precursor, but the additional scatter above and below this line arises from fragmentation. Using software tools developed to extract drift plots from IMS−MS data, reconstructed quadrupole mass spectra can be extracted for a given TOF m/z and retention time (tr), as shown by the two right-hand panels, (iii) and (iv), in Figure 1C for the precursor MS1 and product ion MS2 data, respectively. In the current experiment, fragmentation is induced postquadrupole, so the reconstructed spectra should be limited only by ion statistics and be identical for a precursor and its fragments. This opens up the possibility of precursor and fragment alignment with a tolerance much tighter than the quadrupole window. To investigate the accuracy of the precursor m/z assignment within the scanning quadrupole dimension, two isotope features were examined for the annotated fragment y-ions of an abundant peptide, as shown in Figure 1D. The average calculated precursor m/z value and uncertainty was 783.7 ± 3.0. The theoretical m/z for the 2+ charge state of this peptide is 783.9. In this case, the m/z of the precursor was therefore determined to within ∼12.5% of the quadrupole peak width. Because the 2D data produced by this method are stored using the same format as ion mobility enabled DIA experiments ((U)(H)DMSE),6,15,24 the scanning quadrupole DIA data can be processed and searched directly using existing commercial10 and open source software tools11 for discovery type experiments through mzML conversion or by reading/importing the raw data directly. However, the data can also be used for targeted data analysis, as will be demonstrated in the next section, using proprietary or public reference spectral libraries. The additional selectivity afforded by scanning the quadrupole is illustrated in Figure 2 and Supplementary Figure 1. The results shown in Figure 2 represent the Skyline open source

the data and results presented in this paper, the transmission window was set to a 24 m/z unit window and the scan range to m/z 400 to 900. At the end of each quadrupole cycle the instrument was switched between a postquadrupole fragmentation mode and a nonfragmentation mode. The acquisition system is configured to profile scanning quadrupole separations by adding individual TOF spectra (pushes) incrementally into a buffer containing 200 memory locations or “bins”. Each bin consists of a mass spectrum labeled with a different average quadrupole position. The pusher period is determined by the TOF mode and mass range and is typically around 60 to 70 μs. In normal use, data are added to the buffer in a cyclic fashion and at least 10 cycles are usually added before it is read out and stored. Data from several consecutive pushes are added to the same spectral bin in the buffer before moving on to the next bin. The number of pushes per bin is set so that each bin covers 1/200 of the quadrupole cycle time (there is no interscan delay between pushes). In the described experiment, the quadrupole cycle time was chosen to be ∼1 s, so the number of pushes added to each bin was ∼70. The whole arrangement is shown schematically in Figure 1B. This setup produces nested 2D MS data sets that can be viewed using multidimensional analysis software, as illustrated in Figure 1C. Within these distributions, the horizontal axis represents the center of the quadrupole transmission window while the vertical axis is the m/z value recorded by the oa-TOF. Within the MS1 precursor data (Figure 1C (i)), a largely diagonal structure represents the precursor ions transmitted by the quadrupole and recorded by the oa-TOF. Some fragmentation at low m/z is also visible in this log-intensity heat map. In the product ion MS2 data (Figure 1C (ii)), the residual diagonal structure corresponds to unfragmented


Figure 3. Scanning quadrupole DIA acquisition parameter and gradient optimization examples for the normalized number of identified protein groups for (A) E. coli, squares = 30 min gradient and circles = 45 min gradient; (B) human cell line, squares = 30 min gradient and circles = 45 min gradient; (C) squares, human cell line; circles, human undepleted plasma; and (D) human cell line, squares = 0.3 s scan time and circles = 0.5 s scan time. Average (n = 3) relative identification values are shown with an average technical variation across all experiments smaller than 5%.

Characteristics: Duty Cycle, Speed, and Precision

informatics interpretation of a number of MS1 and MS2 spectra for one of the annotated peptides. The left-hand side 2D ion maps of the (A) and (B) panel sets, MS1 and MS2 data, respectively, disregard the quadrupole filtering of the data by Skyline, imitating broadband DIA data. The right-hand side 2D ion maps of the (A) and (B) panels demonstrate the effect of quadrupole isolation using the prediction option of the software, with the average quadrupole position shown by the rectangular band(s). This, in turn, affords reduced complexity in the MS1 and MS2 spectra that can be readily annotated, as shown by the unfiltered and scanning quadrupole filtered spectra, top and bottom, respectively, shown in Figures 2C and 2D. An alternative view of the multidimensional nature of the data is shown in Supplementary Figure 1, representing 20 s of chromatographic product ion data of a HeLa cell protein extract load of 250 ng on-column separated using a 90 min reversed-phase gradient. Panel A shows the deconvoluted ion detections with accurate m/z TOF detection as a function of quadrupole m/z position. Panel B illustrates that within a single quadrupole position multiple product ion series originating from multiple precursors were detected by the processing software from which, as shown in panel C, product ion spectra can be derived for qualitative or quantitative purposes. This example highlights an important fundamental difference between stepped and scanning quadrupole-based DIA approaches. Because each precursor and product ion is associated with a specific quadrupole profile, scanning quadrupole DIA data can be peak detected in the quadrupole domain, as previously shown in Figure 1C (iii) and (iv). Moreover, because the binned data overlap and are offset, dependent on the m/z scan range and quadrupole isolation width, scanning quadrupole DIA is inherently more specific than stepped quadrupole DIA, as certain precursor and product ion data will reside in unique bins. In addition, the method is more amenable to applications that require a high sampling rate, such as fast LC or CE separations, which can be achieved by acquiring DIA data at very fast quadrupole scan rates. Higher throughput/duty cycle metabolomics/lipidomics applications using 0.1 s quadrupole scan times will be presented elsewhere, highlighting the importance of peak sampling frequency and its effect on quantitative precision.25
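Peak detection in the quadrupole domain, as described above, can be illustrated with an intensity-weighted centroid of the reconstructed quadrupole profile of a single ion. The sketch below is a simplified stand-in and not the algorithm used by PLGS, Skyline, or any other software named in the text; the function names, the bin positions, the intensities, and the 1 m/z matching tolerance are hypothetical.

```python
def quad_profile_centroid(bin_centers, intensities):
    """Intensity-weighted centroid and spread of a quadrupole-domain profile."""
    total = sum(intensities)
    if total <= 0 or len(bin_centers) != len(intensities):
        raise ValueError("need matching lists with positive total intensity")
    centroid = sum(c * i for c, i in zip(bin_centers, intensities)) / total
    variance = sum(i * (c - centroid) ** 2
                   for c, i in zip(bin_centers, intensities)) / total
    return centroid, variance ** 0.5


def share_precursor(centroid_a, centroid_b, tolerance_mz):
    """A fragment is a candidate for a precursor if their centroids agree."""
    return abs(centroid_a - centroid_b) <= tolerance_mz


# Hypothetical profiles sampled at 2.5 m/z bin spacing (quadrupole positions).
bins = [778.9, 781.4, 783.9, 786.4, 788.9]
precursor_profile = [1.0, 4.0, 6.0, 4.0, 1.0]
fragment_profile = [2.0, 8.0, 12.0, 8.0, 2.0]   # same shape, different abundance

p_centroid, p_width = quad_profile_centroid(bins, precursor_profile)
f_centroid, _ = quad_profile_centroid(bins, fragment_profile)
print(round(p_centroid, 2), round(p_width, 2),
      share_precursor(p_centroid, f_centroid, tolerance_mz=1.0))
```

Because fragmentation occurs after the quadrupole, a fragment inherits the quadrupole profile of its precursor, so comparing centroids allows matching at a tolerance much narrower than the transmission window itself.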

Analytical parameters and sample properties such as quadrupole transmission width, sample complexity, throughput, dynamic range, and acquisition speed have to be considered and optimized for a given assay type. Examples of the dependency between acquisition parameters and sample complexity are shown in Figure 3. The effect of duty cycle time on the relative number of protein identifications using a fixed scanning quadrupole transmission window for two different proteomes, that is, E. coli and a human cell line, is shown in panels A and B using two relatively short gradient acquisitions of 30 and 45 min. Regardless of the complexity of the proteome, longer gradients provided a higher number of protein identifications. However, the maxima are at different scan times, with the results suggesting a proteome complexity dependency. The effect of transmission window on the number of protein identifications for undepleted human plasma and a human cell line sample is illustrated in Figure 3C. Here, for both samples, the amount loaded on-column, gradient length, quadrupole scan time, and oa-TOF integration time were all kept constant. A dependency on specificity can be observed, with the number of protein identifications maximizing at a transmission window of approximately 23 to 28 m/z wide. The effect of the gradient length at fixed load for two different quadrupole scan and oa-TOF integration times on the number of identified human cell line proteins is shown in Figure 3D. In this instance, predominantly due to sample complexity and dynamic range, the highest observed identification rate was found with a 90 min separation for both quadrupole scan/oa-TOF integration times investigated. The absolute protein group identification numbers are shown in Supplementary Figure 1, with the highest observed identification rate reported per experiment (panel) and sample type.
Note that these figures of merit represent untargeted search results and that only the results shown in Figure 3D can be directly compared with the detailed characterization of the human cell line results described in the following section. The Figure 3A−C panel results are relative references, where one or more parameters were varied, but the others, although not always optimal in terms of load for a given sample type, were kept constant. Quantitative precision was found to be around 10−15% at both the protein and peptide detection levels, which will be discussed in greater detail in the context of a biological study in Part B of the manuscript, “Application to the Analysis of the Calcineurin-Interacting Proteins during Treatment of Aspergillus fumigatus with Azole and Echinocandin Antifungal Drugs”,26 and the Characterization section. In summary, for a given application, method optimization is required that takes into account the discussed parameters and sample properties; however, once determined, these parameters are constant and can be repeatedly applied. For all experiments and results discussed in the following sections, a 24 m/z transmission window, a 90 min reversed-phase gradient, a quadrupole scan range of m/z 400 to 900, and a 0.5 s quadrupole scan time were used. The configuration affords different modes of operation whereby, for example, MS2 acquisition time is increased at the expense of MS1 acquisition time to improve MS2 duty cycle and sensitivity. To address the loss in MS1 sensitivity, a nonscanning, wide transmission experiment could be conducted, as this experiment would only require the confirmation of the (accurate mass) presence of a precursor. However, because the applied analysis software expects equally time-spaced MS1 and MS2 experiments, these modes of acquisition have thus far not been implemented. Compared with broadband DIA methods, specificity is increased by the scanning quadrupole. However, sensitivity is negatively impacted by the serial nature of the scanning quadrupole, and the overall reduced duty cycle results in an approximately four- to five-fold reduction in sensitivity. The combination of a fast scanning quadrupole with a fast TOF acquisition system allows quadrupole scans to be completed in