MS Spectra


Article

Intensity-Independent Noise Filtering in FT MS and FT MS/MS Spectra for Shotgun Lipidomics Kai Schuhmann, Henrik Thomas, Jacobo Miranda Ackerman, Konstantin O. Nagornov, Yury O. Tsybin, and Andrej Shevchenko Anal. Chem., Just Accepted Manuscript • Publication Date (Web): 01 Jun 2017 Downloaded from http://pubs.acs.org on June 2, 2017


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Analytical Chemistry

Intensity-Independent Noise Filtering in FT MS and FT MS/MS Spectra for Shotgun Lipidomics

Kai Schuhmann1, Henrik Thomas1,3, Jacobo Miranda Ackerman1,3, Konstantin O. Nagornov2, Yury O. Tsybin2 and Andrej Shevchenko1,4

1 MPI of Molecular Cell Biology and Genetics, 01307 Dresden, Germany

2 Spectroswiss, EPFL Innovation Park, 1015 Lausanne, Switzerland

3 HT and JMA contributed equally to this work

4 corresponding author: [email protected]

ACS Paragon Plus Environment


Shotgun lipidomics relies on the direct infusion of total lipid extracts into a high resolution tandem mass spectrometer. A single shotgun analysis produces several hundred densely populated FT MS and FT MS/MS spectra, each of which might comprise thousands of peaks, although only a very small percentage of those belong to lipids. Eliminating noise by adjusting a minimal peak intensity threshold is biased and inefficient since lipid species and classes vary in their natural abundance and ionization capacity. We developed a method of peak intensity-independent noise filtering in shotgun FT MS and FT MS/MS spectra that capitalizes on the stable composition of the infused analyte, which leads to consistent time-independent detection of its bona fide components. Repetition rate filtering relies on a single quantitative measure of the reproducibility of peak detection, irrespective of absolute intensities, masses or assumed elemental compositions. In comparative experiments it removed more than 95% of signals detectable in shotgun spectra without compromising the accuracy and scope of lipid identification and quantification. It also accelerated spectra processing by 15-fold and increased the number of simultaneously processed spectra by ca. 500-fold, hence eliminating the major bottleneck in high-throughput bottom-up shotgun lipidomics.


Shotgun mass spectrometry provides a rapid quantitative snapshot of the composition of complex mixtures of diverse biomolecules and is widely applied in lipidomics (reviewed in refs.1,2). It relies on direct infusion of total lipid extracts from cells, tissues or entire model organisms into a tandem mass spectrometer followed by the acquisition of MS and MS/MS spectra, most commonly by precursor and neutral loss scanning on triple quadrupole mass spectrometers or by data-dependent MS/MS (MSn) on quadrupole – time-of-flight3, linear ion trap – Orbitrap4 or quadrupole – Orbitrap – linear ion trap5 hybrid instruments. Shotgun lipidomics powered by the high spectra acquisition rate, mass resolution and accuracy of hybrid Orbitrap mass spectrometers offers several analytical advantages4,6 such as vastly simplified sample preparation, high throughput, broad coverage of lipid classes2 and robust quantification of lipids in large-scale epidemiological screens7-9. A typical sample injection rate of ca. 200 nL/min delivered by a robotic nanoflow ion source10 offers ample time to acquire tandem mass spectra from a few microliters of samples containing low-micromolar concentrations of lipids in both polarity modes with maximal mass resolution and under varying normalized collision energy11. In a typical bottom-up shotgun lipidomics workflow MS/MS spectra are acquired within the entire range of expected precursor masses using a small (typically, 1 Da) isolation window centered at monoisotopic peaks of targeted precursors4. The experiment yields a comprehensive dataset comprising all fragments produced from all ionizable lipid precursors, including currently unknown or unexpected molecules12,13.

However, shotgun spectra are exceedingly complex. A shotgun dataset acquired on a high resolution mass spectrometer and consisting of 10 to 100 FT MS and 100 to 1000 FT MS/MS spectra might comprise more than one million individual peaks, although only a few hundred of those might belong to lipids14. FT MS and FT MS/MS spectra acquired on Orbitrap tandem mass spectrometers in reduced profile mode result from extensive computational post-processing, including baseline correction and noise reduction, of the original full profile mass spectra. However, they still contain an overwhelming number of noise peaks that slow down lipid identification or could even terminate processing of batches of spectra because of a fatal shortage of RAM. Last but not least, noise peaks are a major source of false positive identifications and quantification biases. LipidXplorer software partly reduced the dataset size by averaging and aligning m/z of peaks in related spectra (e.g. in MS/MS spectra acquired from the same precursor in different shotgun experiments); however, the vast majority of saved peaks did not contribute to lipid identifications15.

One practical approach would be to eliminate all low abundant peaks falling below some arbitrarily selected or pre-computed16 threshold and assume that most of them belong to noise of diverse origin. This, however, biases the interpretation and might even discard low abundant or poorly ionized, yet biologically interesting lipids. In shotgun lipidomics the intensity of useful peaks may differ drastically, and a dynamic range of better than 1000-fold is usually required for a confident interpretation of FT MS and FT MS/MS spectra4,17. While background usually remains stable during a few successive runs, it might vary strongly between batches of samples or even between individual MS/MS spectra acquired within the same batch. Typically the intensity threshold has to be manually adjusted for each batch and sometimes also for individual spectra, even though this compromises the integrity of full dataset interpretation. A safer strategy would be to focus only on lipids consistently detectable in all samples9,18, yet this reduces the lipidome coverage and is a risky strategy for biomarker discovery because it might disregard bona fide molecules only detectable in a small subset of all samples.


We therefore reasoned that rapid and accurate interpretation of high resolution shotgun FT MS and FT MS/MS spectra requires eliminating noise peaks by a method that does not consider their intensities. Here we present an intensity-independent strategy that takes advantage of a fundamental feature common to all shotgun acquisitions: the composition of the infused analyte does not change with time, and this should lead to consistent time-independent detection of its bona fide components. We demonstrate that by assessing the rate at which peaks are detected in a series of successively acquired scans we could eliminate over 95% of noise peaks without compromising the accuracy and scope of lipid identification and quantification.

EXPERIMENTAL SECTION

Annotation of lipid species was according to ref.19. Acronyms: PE: 1,2-diacyl-sn-glycero-3-phosphoethanolamine; PC: 1,2-diacyl-sn-glycero-3-phosphocholine; PC O-: 1-alkyl,2-acylglycerophosphocholine.

Chemicals and lipid standards. Standards of synthetic PE and bovine heart PC extract were purchased from Avanti Polar Lipids (Alabaster, AL). Methanol, isopropanol, water, and ammonium formate were purchased from Sigma-Aldrich/Merck (Darmstadt, Germany) and were of LC-MS or Chromasolv/LiChrosolv grade. Chloroform (HPLC grade) was purchased from Rathburn Chemicals (Walkerburn, UK).

Shotgun mass spectrometry. The synthetic standards of PE 12:0/13:0, PE 18:2/18:2, PE 18:0/18:2 and PE 18:0/18:0 at concentrations within the range of 0.2 nM to 2 µM, or bovine heart PC with a total concentration of 2 µM, were reconstituted in a mixture of isopropanol/methanol/chloroform (4:2:1; v/v/v) containing 7.5 mM ammonium formate.


Shotgun lipidomics was performed on a Q Exactive hybrid quadrupole – Orbitrap tandem mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) equipped with a Triversa Nanomate robotic nanoflow electrospray ion source (Advion BioScience, Ithaca, NY) using spraying chips with a nozzle diameter of 4.1 µm. The ion source was controlled by Chipsoft 8.1.0 software. Spraying voltage and gas backpressure were set to 1.25 kV and 0.95 psi, respectively. Ion transfer capillary temperature was set to 200 ºC and S-lens RF level to 50%. Target mass resolution at m/z 200 (Rm/z=200) was set to 140 000 for both FT MS and FT MS/MS spectra. For acquiring FT MS spectra, automated gain control (AGC) was set to 3 × 10^6, maximum injection time to 500 ms and the mass range to m/z 400 to 1200. FT MS scans were calibrated on-line using the lock mass function11. In FT MS/MS experiments we applied a precursor isolation window of 1 Da, normalized collision energy of 22.5%, AGC of 2 × 10^4 and maximum filling time of 1 s. Target masses for MS/MS were specified in the inclusion list. All spectra were acquired in reduced profile mode.

Repetition rate filtering. Shotgun FT MS and FT MS/MS spectra were subjected to repetition rate filtering (RRF) using the open source software PeakStrainer, developed in-house as a Python-based application with graphical user and command-line interfaces. The software is available as a single file installation for Microsoft Windows 7 and also as Python source code at https://doi.org/10.17617/1.47, along with the installation guide and tutorial. In this work PeakStrainer was used on a 64-bit Windows 7 desktop computer having 2 x 3.2 GHz processors and 32 GB RAM. PeakStrainer processed full shotgun (FT MS and FT MS/MS) experiments acquired in reduced profile mode and saved as *.raw files; filtered spectra datasets were output as *.mzXML files and directly imported into LipidXplorer software for lipid identification and quantification14,15.

Benchmarking of repetition rate filtering. To determine the impact of RRF on lipid quantification, FT MS spectra were acquired from a dilution series of PE standards. While the concentration of PE 12:0/13:0 (internal standard) was maintained at 0.5 µM, concentrations of PE 18:2/18:2, PE 18:0/18:2 and PE 18:0/18:0 were varied within the range of 2 µM to 0.2 nM. To benchmark the impact of RRF on lipid identification we analyzed the extract of bovine heart PC by data-independent FT MS/MS and identified PC and PC O- species by LipidXplorer as described in refs.11,14. In RRF-processed spectra lipids were identified without applying a peak intensity threshold. For comparison, the same but unfiltered spectra in *.mzXML format were interpreted by LipidXplorer. Peak intensity thresholds were applied while importing the spectra into the MasterScan relational database as detailed in ref.15. Molecular fragmentation query language (MFQL) queries specifying the combinations of fragments that identified molecular species of lipids, and other search settings (e.g. mass tolerance and resolution), were the same in both experiments.

RESULTS AND DISCUSSION

Noise in shotgun FT MS and FT MS/MS spectra. In a typical shotgun experiment each spectrum is acquired within a specified period of time (say, 30 s) in successive short (e.g. 1 s) time intervals termed scans. Scans are stored in a *.raw file and then averaged into a single representative spectrum14. During averaging, unique masses (solely for presentation clarity, we will further refer to masses m instead of m/z) falling into small mass intervals (bins) with a pre-computed, mass resolution-dependent width (typically, a few mDa) are replaced by a single intensity-averaged mass and an averaged intensity, which remain associated with this bin. In this way, an averaged spectrum is composed of averaged masses and intensities of binned peaks, while individual scans are disregarded. We note that there is no restriction on the number of peaks averaged within each bin: bins with a few peaks and bins with just one peak are both retained in the averaged spectrum. Although averaging decreases the overall number of peaks, it is not intended to distinguish peaks of bona fide analytes from noise.

We took a detailed look at noise peaks in FT MS/MS spectra acquired in reduced profile mode from formate adducts [M+HCOO]- of three PC / PC O- precursors, whose abundances differed by ca. 200-fold (Figure 1A). Each FT MS/MS spectrum was acquired in 11 scans. However, instead of averaging scans, we plotted intensities of all peaks detected in all scans and, for presentation clarity, disregarded peaks of bona fide fragments20.
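The scan averaging described at the start of this section can be sketched in a few lines of Python. This is only an illustration and not the LipidXplorer implementation: the function name `average_scans`, the fixed `bin_width` and the choice to average intensities over the peaks in a bin (rather than over all scans) are assumptions made for the example.

```python
from collections import defaultdict


def average_scans(scans, bin_width=0.005):
    """Merge the peaks of several scans into one averaged spectrum.

    scans     -- list of scans; each scan is a list of (mass, intensity) tuples
    bin_width -- bin width in Da (assumed fixed here; the text describes a
                 width that depends on mass resolution)
    Returns a list of (averaged mass, averaged intensity, peak count) tuples,
    one per occupied bin, sorted by mass.
    """
    bins = defaultdict(list)
    for scan in scans:
        for mass, intensity in scan:
            bins[int(mass // bin_width)].append((mass, intensity))

    averaged = []
    for key in sorted(bins):
        peaks = bins[key]
        total_intensity = sum(intensity for _, intensity in peaks)
        # Intensity-weighted mean mass of all peaks that fell into this bin.
        mean_mass = sum(m * i for m, i in peaks) / total_intensity
        # Assumption: intensity is averaged over the peaks in the bin,
        # not over the total number of scans.
        mean_intensity = total_intensity / len(peaks)
        averaged.append((mean_mass, mean_intensity, len(peaks)))
    return averaged


# Two scans: one peak near 500.12 seen in both scans, one peak seen only once.
scans = [
    [(500.123, 100.0), (300.456, 5.0)],
    [(500.124, 300.0)],
]
spectrum = average_scans(scans, bin_width=0.01)
```

Note that the single-scan peak at 300.456 survives averaging unchanged, which illustrates why averaging alone does not separate analyte peaks from noise.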

Figure 1. Intensity and repetition rate of noise peaks in FT MS/MS spectra of PC and PC O- formate adducts. Panel A: absolute intensity of noise peaks is proportional to the abundance of fragmented precursors and also changes with m/z. Panel B: the number of 1mDa bins plotted against their occupancy rate (in %). Irrespectively of the absolute intensity of filling peaks the vast majority of bins (> 90%) were only occupied in one out of 11 scans (occupancy rate 70%) rate (inset in Figure 1B). We therefore reasoned that the repetition rate with which peaks were detected in successively acquired scans could serve as an intensity-independent criterion discriminating noise peaks from peaks of bona fide analytes. However, even 100% repetition rate does not necessarily mean that these peaks belong to authentic lipid precursors in FT MS or fragments in FT MS/MS spectra.
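A bin-occupancy histogram of the kind plotted in Figure 1B could be computed roughly as follows. This is a minimal sketch under stated assumptions: the function name `occupancy_histogram` is hypothetical, occupancy is counted here as the number of scans in which a bin holds at least one peak, and the toy data are invented for illustration.

```python
from collections import Counter


def occupancy_histogram(scans, bin_width=0.001):
    """Return {occupancy rate in %: number of bins with that rate}.

    scans     -- list of scans; each scan is a list of (mass, intensity) tuples
    bin_width -- bin width in Da (1 mDa, as in Figure 1B)
    Peak intensities are deliberately ignored.
    """
    n_scans = len(scans)
    scans_per_bin = Counter()
    for scan in scans:
        # A bin counts once per scan, however many peaks it holds in that scan.
        occupied = {int(mass // bin_width) for mass, _ in scan}
        scans_per_bin.update(occupied)
    histogram = Counter(round(100 * n / n_scans) for n in scans_per_bin.values())
    return dict(sorted(histogram.items()))


# Four toy scans: one peak near 500.12 recurs in every scan,
# three other peaks each appear in a single scan only.
scans = [
    [(500.123, 1.0e5), (210.111, 50.0)],
    [(500.124, 1.0e5), (333.333, 40.0)],
    [(500.125, 9.0e4)],
    [(500.126, 1.1e5), (720.555, 30.0)],
]
histogram = occupancy_histogram(scans, bin_width=0.01)
```

In this toy example three bins have 25% occupancy and one bin has 100% occupancy, mirroring on a small scale the skew toward single-scan bins described for Figure 1B.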


Figure 2. Workflow of repetition rate filtering exemplified using the FT MS/MS spectrum of PC O- 34:3 ([M+HCOO]-; m/z 786.5677) acquired in 11 scans. Panel A: FT MS/MS spectrum presented as a collection of data points (bin mass; averaged peak intensity) that are shown as black dots. Vertical bars in red are occupancy rates of corresponding bins (in %). Panel B: one peak (inset at the left) and three peaks (inset at the right) having a repetition rate below 70%. These peaks were removed. Panels C and D: peaks corresponding to [M-CH3-FA18:2] and [M-CH3]- fragments falling into five adjacent bins have 100% repetition rate. These peaks were retained. Panel E: the same spectrum as in panel A upon repetition rate filtering with the threshold of 70% (shown as dashed line in panel A).

The workflow and proof-of-principle of repetition rate filtering. FT MS and FT MS/MS spectra were extracted from the *.raw files as centroided peaks with respective m/z, intensity and resolution values. Peaks pertinent to each scan were combined into one spectrum based on the scan header and filtered as described below.

To exemplify the RRF workflow, let us consider the FT MS/MS spectrum of PC O- 34:3 ([M+HCOO]-; m/z 786.5677) that was acquired in 11 scans under the target mass resolution of Rm/z=200 = 140 000 (Figure 2A). Its RRF proceeded in two steps. First, the entire mass range was split into 10 mDa bins, the occupancy of each bin was counted and the content of bins occupied by a single peak was disregarded (but, at this point, not discarded). Next, the remaining peaks were re-binned using a smaller (e.g. 1 mDa) bin size. Each bin that was occupied by two or more peaks was considered together with its two adjacent bins on either side if its m/z was above 200. Bins with m/z lower than 200 were considered individually. In each individual bin (< m/z 200) or in five (2+1+2) combined bins (> m/z 200) we counted the total number of peaks, now also including the single peaks disregarded in the first round. We only saved bins occupied by seven or more peaks in 11 acquired scans: effectively, this was equivalent to applying an occupancy threshold of 70%.

In Figure 2 the bin in panel B (inset at the left) was disregarded already at the first step since it contained only one peak within the vicinity of 10 mDa. Another bin shown in panel B (inset at the right) had m/z < 200 and contained only three peaks, which were discarded since the bin's occupancy was below 70%. Five combined bins in panels C and D contained 11 signals each; this was equivalent to 100% occupancy and therefore peaks in these bins were saved. Note that here binning was only used for computing the repetition rates of peaks with closely related masses and had no impact on subsequent spectra processing.

The number of bins combined at the second step of filtering is, in principle, user-defined and could be adjusted according to the actual mass resolution of the employed instrument. Here we exemplified the filtering scheme by combining 5 x 1 mDa bins because this approximately matched the half-height width of averaged peaks within the range of m/z 400 to 800 in the spectra acquired under the target mass resolution of Rm/z=200 = 140 000. Bins with m/z < 200 were considered individually because the half-height width of low molecular weight peaks was close to 1 mDa. A more elaborate version of our open source software PeakStrainer employs the same occupancy counting algorithm; however, it implements a more flexible scheme of calculating the width of merged bins depending on the actual resolution at the bin's mass. Therefore it could process shotgun spectra acquired at different target mass resolutions and/or mass ranges without adjusting basic settings of RRF.

For each FT MS/MS spectrum the number of acquired scans could be read from the scan header. Depending on the precursor abundance and AGC-adjusted Orbitrap filling time it might slightly vary for different precursors. PeakStrainer independently determines the actual number of acquired scans by counting the number of most abundant peaks falling into merged bins (e.g. 5 x 1 mDa bins in Figure 2 C, D) because they are usually detected with 100% repetition rate. This helped to amend errors in scan counting, especially in case of unstable or accidentally discontinued spraying.

We underscore that filtering does not substitute for subsequent processing of spectra, such as peak averaging, and does not compute or in any way affect the exact masses as reported in the "final" averaged spectrum. Also, RRF does not consider peak intensity or mass accuracy, nor does it rely on any structure-related input, e.g. expected fragment ions, elemental composition or isotopic ratios. Therefore it can be applied to any shotgun FT MS and FT MS/MS experiment targeting any molecules of interest. In this work, lipids in RRF-processed spectra were identified by LipidXplorer software14,21.

RRF applied to the FT MS/MS spectrum in Figure 2A using the threshold of 70% repetition rate retained only 12 (1.1%) unique peaks (Figure 2E). Accurate masses of five of those peaks matched major fragments of PC O- 34:3 despite a ca. 1000-fold difference in their intensities. The other seven peaks could not be unequivocally assigned to its structure and might originate from chemical background or co-fragmented lipid species. This, however, had no impact on the confidence of PC O- 34:3 identification. We therefore concluded that RRF eliminated a large number of noise peaks in a fully unbiased and peak intensity-independent manner, while all known characteristic fragments were retained.
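The two-step filtering scheme described in this section can be sketched as below. This is a simplified illustration written from the prose description, not the PeakStrainer source code: the function name `repetition_rate_filter`, its parameters, the fixed bin widths and the rounding of the threshold (70% of 11 scans taken as 7 peaks, as in the text) are assumptions, and the masses in the toy data are invented.

```python
from collections import Counter


def repetition_rate_filter(scans, threshold_percent=70, coarse=0.010,
                           fine=0.001, half_window=2, window_above=200.0):
    """Keep peaks whose (merged) fine bins are filled often enough.

    scans -- list of scans; each scan is a list of (mass, intensity) tuples.
    Intensities are carried along but never used for the decision.
    """
    n_scans = len(scans)
    min_count = (threshold_percent * n_scans) // 100  # 70% of 11 scans -> 7
    peaks = [(m, i) for scan in scans for m, i in scan]

    # Step 1: 10 mDa bins; peaks alone in their bin are set aside (not deleted).
    coarse_count = Counter(int(m // coarse) for m, _ in peaks)
    candidates = [(m, i) for m, i in peaks if coarse_count[int(m // coarse)] > 1]

    # Step 2: 1 mDa bins; only bins holding >= 2 candidate peaks are evaluated,
    # but occupancy is counted over all peaks, including those set aside.
    seed_count = Counter(int(m // fine) for m, _ in candidates)
    all_count = Counter(int(m // fine) for m, _ in peaks)

    kept_bins = set()
    for b, n in seed_count.items():
        if n < 2:
            continue
        # Above m/z 200 merge 2+1+2 bins; below, use the bin on its own.
        w = half_window if b * fine > window_above else 0
        window = range(b - w, b + w + 1)
        if sum(all_count[k] for k in window) >= min_count:
            kept_bins.update(window)
    return sorted((m, i) for m, i in peaks if int(m // fine) in kept_bins)


# Toy data set of 11 scans (all masses and intensities are made up).
scans = []
for k in range(11):
    scan = [
        (279.2335 + (k % 3 - 1) * 0.0012, 5000.0),  # recurring, spread over 3 bins
        (168.0426, 10.0),                           # recurring but very weak
        (300.0005 + k * 1.2345, 1000.0),            # intense, new mass in each scan
    ]
    if k < 3:
        scan.append((150.0555, 800.0))              # present in 3 of 11 scans only
    scans.append(scan)

kept = repetition_rate_filter(scans)
```

In this example the weak but reproducible peak at m/z 168.0426 is retained, whereas the far more intense peaks that appear in one or three scans are removed. The recurring peak near m/z 279.23 passes only because adjacent 1 mDa bins are merged; evaluated individually, none of its three bins would reach seven peaks.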


Validation and benchmarking of repetition rate filtering. To validate and benchmark RRF we addressed two major questions: i) whether RRF affected lipid quantification, particularly at low concentrations of the analyte, by removing still quantifiable peaks and ii) whether RRF outperformed the conventional method of eliminating noise peaks by adjusting the minimal peak intensity threshold.

Figure 3. RRF corroborates the accurate shotgun quantification. Black line plot shows the intensities of precursor peaks in the dilution series of PE 36:0, PE 36:2 and PE 36:4 standards (all detected as [M-H]-). Unfilled and filled circles stand for peak intensities in, respectively, unfiltered and filtered spectra (repetition rate cut-off 70%). Red line plot shows repetition rates of the same PE peaks. At the repetition rate 99%) contained species whose peaks were detected with S/N >10. However, in this dataset RRF retained twice (82 vs 40) as many PC / PC O- species compared to an alternative method of minimal peaks intensity adjustments.


Figure 4. Comparison of the alternative methods of spectrum noise reduction: repetition rate filtering vs applying a minimal peak intensity threshold. The number of identified PC / PC O- species and the total number of peaks retained after dataset processing under the specified peak intensity (panel A) and repetition rate (panel B) thresholds. Cumulative distribution of signal-to-noise ratios of PC / PC O- peaks identified after intensity threshold adjustment (panel C) and repetition rate filtering (panel D).


By removing poorly repeated signals from FT MS and FT MS/MS spectra RRF dramatically accelerated processing of shotgun datasets. Importing the above *.raw file acquired from the PC / PC O- extract and converted to *.mzXML into a MasterScan database by LipidXplorer required 33 s and 0.8 GB RAM on a desktop PC having 2 x 3.3 GHz processor(s) and 32 GB RAM. However, both the required time and memory scale linearly with the number of *.raw files submitted as a batch: importing 15 such *.raw files without applying over-restrictive (and, arguably, biased) settings could max out the RAM. Also, an inflated size of a MasterScan slowed down the lipid identification. Identification of PC and PC O- species required ca. 45 s in a MasterScan built from one *.raw file and 24.3 min for 15 *.raw files (ca. 97 s per file) having the same number of spectra. From the same *.raw file RRF eliminated 96.5% of the total number of peaks; it was imported in as little as 4 s and required no (< 0.01 GB) extra RAM. The identification of PC / PC O- species in a single filtered file required ca. 1 s, 6 s for 10 files (= 0.6 s per file) and 13 s for 40 files (= 0.3 s per file). Altogether, RRF accelerated the import of spectra and identification of lipids by more than 15-fold, and the absence of significant memory restrictions enabled simultaneous processing of batches of more than 1000 files containing full (FT MS and FT MS/MS) shotgun experiments.

In summary, in shotgun datasets RRF reduced the total number of peaks by more than 95% and dramatically accelerated lipid identification without compromising the lipidome coverage and quantification accuracy. Compared to peak intensity threshold adjustments, RRF produced spectra datasets of better quality that contained more peaks of bona fide analytes and fewer peaks of noise. Importantly, all spectra could be processed under the same repetition rate threshold (here 70%) without iterative adjustment of peak intensity cut-offs, which requires many laborious intermediate interpretations.
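The timing figures quoted above can be cross-checked with simple arithmetic. The snippet below uses only the numbers given in the text; reading the "more than 15-fold" acceleration as the ratio of combined import plus identification time for a single file is an assumption of this sketch, since the text does not spell out how the figure was derived.

```python
# Timings quoted in the text, in seconds, for a single *.raw file.
import_unfiltered, import_filtered = 33.0, 4.0
identify_unfiltered, identify_filtered = 45.0, 1.0

# Per-file identification times in batches.
per_file_unfiltered_batch = 24.3 * 60 / 15  # 15 unfiltered files in 24.3 min
per_file_filtered_10 = 6 / 10               # 10 filtered files in 6 s
per_file_filtered_40 = 13 / 40              # 40 filtered files in 13 s

# Speed-up factors for a single file.
speedup_import = import_unfiltered / import_filtered
speedup_identify = identify_unfiltered / identify_filtered
# Assumption: the overall figure combines import and identification.
speedup_total = (import_unfiltered + identify_unfiltered) / (
    import_filtered + identify_filtered
)

print(f"per-file identification, unfiltered batch: {per_file_unfiltered_batch:.1f} s")
print(f"import speed-up: {speedup_import:.2f}-fold")
print(f"identification speed-up: {speedup_identify:.0f}-fold")
print(f"combined speed-up: {speedup_total:.1f}-fold")
```

Under this reading the combined speed-up for one file is 78 s / 5 s = 15.6-fold, consistent with the "more than 15-fold" stated in the text; the advantage grows further in batches because the per-file time of unfiltered data increases while that of filtered data decreases.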


CONCLUSIONS AND OUTLOOK

Repetition rate filtering is a computationally simple, efficient and generic procedure that eliminates a major bottleneck in bottom-up shotgun lipidomics by high resolution mass spectrometry. While spectra acquisition and processing have been automated and are steered by a few transparent settings, selecting appropriate peak intensity thresholds remained error-prone and biased, and required considerable shotgun lipidomics expertise. It was typically addressed by monitoring the trade-off between the lipidome coverage and levels of apparent false positive hits in several iterative test runs, each of which might easily take a few hours. It also limited the analytical advantages brought by high resolution mass spectrometry since over-restrictive peak intensity thresholds were required only to keep the dataset size within computationally manageable limits. These efforts, however, did not improve lipid identification and consumed computational resources in vain.

RRF takes advantage of a stable composition of the electrosprayed analyte, an inherent feature of shotgun mass spectrometry. It does not rely on particular structural properties of the analyte (e.g. expected masses or isotopic patterns), or on pre-selected signal intensity and signal-to-noise levels. In a consistent and controlled manner it eliminated >95% of noise peaks in both FT MS and FT MS/MS spectra and dramatically reduced the total size of shotgun datasets. RRF does not interfere with retained signals and therefore does not compromise mass resolution and accuracy. In contrast to minimal signal intensity threshold adjustments, it does not impair lipid quantification or lipidome coverage.

In the future it would be interesting to test if an appropriately adjusted RRF approach could be applied to high resolution shotgun spectra acquired in the full profile mode, or on mass spectrometers having lower mass resolution, e.g. QqTOFs or even triple quadrupoles. While the current work focused only on shotgun lipidomics, it would be intriguing to re-screen RRF-processed spectra datasets and check if other classes of hydrophobic biomolecules were co-extracted along with the bulk of lipids. Both shotgun acquisition of spectra and RRF processing are unbiased and, conceivably, their symbiosis would assist in the systematic discovery of novel lipids and structurally related molecules12.

ACKNOWLEDGEMENTS

Work in the AS laboratory was supported by the Max Planck Gesellschaft and by grants TRR83 (Project A17) from the Deutsche Forschungsgemeinschaft (DFG) and the LiSyM program and LIFS Unit of the de.NBI Consortium from the Bundesministerium für Bildung und Forschung (BMBF).


REFERENCES

(1) Wang, C.; Wang, M.; Han, X. Mol Biosyst 2015, 11, 698-713.
(2) Han, X.; Yang, K.; Gross, R. W. Mass Spectrom Rev 2012, 31, 134-178.
(3) Schwudke, D.; Oegema, J.; Burton, L.; Entchev, E.; Hannich, J. T.; Ejsing, C. S.; Kurzchalia, T.; Shevchenko, A. Anal Chem 2006, 78, 585-595.
(4) Schuhmann, K.; Herzog, R.; Schwudke, D.; Metelmann-Strupat, W.; Bornstein, S. R.; Shevchenko, A. Anal Chem 2011, 83, 5480-5487.
(5) Almeida, R.; Pauling, J. K.; Sokol, E.; Hannibal-Bach, H. K.; Ejsing, C. S. J Am Soc Mass Spectrom 2015, 26, 133-148.
(6) Schwudke, D.; Schuhmann, K.; Herzog, R.; Bornstein, S. R.; Shevchenko, A. Cold Spring Harbor Perspect Biol 2011, 3, a004614.
(7) Surma, M. A.; Herzog, R.; Vasilj, A.; Klose, C.; Christinat, N.; Morin-Rivron, D.; Simons, K.; Masoodi, M.; Sampaio, J. L. Eur J Lipid Sci Technol 2015, 117, 1540-1549.
(8) Heiskanen, L. A.; Suoniemi, M.; Ta, H. X.; Tarasov, K.; Ekroos, K. Anal Chem 2013, 85, 8757-8763.
(9) Sales, S.; Graessler, J.; Ciucci, S.; Al-Atrib, R.; Vihervaara, T.; Schuhmann, K.; Kauhanen, D.; Sysi-Aho, M.; Bornstein, S. R.; Bickle, M.; Cannistraci, C. V.; Ekroos, K.; Shevchenko, A. Sci Rep 2016, 6, 27710.
(10) Kameoka, J.; Craighead, H. G.; Zhang, H.; Henion, J. Anal Chem 2001, 73, 1935-1941.
(11) Schuhmann, K.; Almeida, R.; Baumert, M.; Herzog, R.; Bornstein, S. R.; Shevchenko, A. J Mass Spectrom 2012, 47, 96-104.
(12) Papan, C.; Penkov, S.; Herzog, R.; Thiele, C.; Kurzchalia, T.; Shevchenko, A. Anal Chem 2014, 86, 2703-2710.
(13) Penkov, S.; Mende, F.; Zagoriy, V.; Erkut, C.; Martin, R.; Passler, U.; Schuhmann, K.; Schwudke, D.; Gruner, M.; Mantler, J.; Reichert-Muller, T.; Shevchenko, A.; Knolker, H. J.; Kurzchalia, T. V. Angew Chem Int Ed Engl 2010, 49, 9430-9435.
(14) Herzog, R.; Schwudke, D.; Schuhmann, K.; Sampaio, J. L.; Bornstein, S. R.; Schroeder, M.; Shevchenko, A. Genome Biol 2011, 12, R8.
(15) Herzog, R.; Schwudke, D.; Shevchenko, A. Curr Protoc Bioinformatics 2013, 43, 14 12 11-30.
(16) Zhurov, K. O.; Kozhinov, A. N.; Fornelli, L.; Tsybin, Y. O. Anal Chem 2014, 86, 3308-3316.
(17) Ejsing, C. S.; Sampaio, J. L.; Surendranath, V.; Duchoslav, E.; Ekroos, K.; Klemm, R. W.; Simons, K.; Shevchenko, A. Proc Natl Acad Sci U S A 2009, 106, 2136-2141.
(18) Zeigerer, A.; Bogorad, R. L.; Sharma, K.; Gilleron, J.; Seifert, S.; Sales, S.; Berndt, N.; Bulik, S.; Marsico, G.; D'Souza, R. C.; Lakshmanaperumal, N.; Meganathan, K.; Natarajan, K.; Sachinidis, A.; Dahl, A.; Holzhutter, H. G.; Shevchenko, A.; Mann, M.; Koteliansky, V.; Zerial, M. Cell Rep 2015, 11, 884-892.
(19) Liebisch, G.; Vizcaino, J. A.; Kofeler, H.; Trotzmuller, M.; Griffiths, W. J.; Schmitz, G.; Spener, F.; Wakelam, M. J. J Lipid Res 2013, 54, 1523-1530.
(20) Ekroos, K.; Ejsing, C. S.; Bahr, U.; Karas, M.; Simons, K.; Shevchenko, A. J Lipid Res 2003, 44, 2181-2192.
(21) Herzog, R.; Schuhmann, K.; Schwudke, D.; Sampaio, J. L.; Bornstein, S. R.; Schroeder, M.; Shevchenko, A. PLoS One 2012, 7, e29851.
(22) Simoneit, B. R.; Medeiros, P. M.; Didyk, B. M. Environ Sci Technol 2005, 39, 6961-6970.
