Article
Optimization of acquisition and data-processing parameters for improved proteomic quantification by SWATH mass spectrometry Shanshan Li, Qichen Cao, Weidi Xiao, Yufeng Guo, Yunfei Yang, Xiaoxiao Duan, and Wenqing Shui J. Proteome Res., Just Accepted Manuscript • Publication Date (Web): 20 Dec 2016 Downloaded from http://pubs.acs.org on December 20, 2016
Optimization of acquisition and data-processing parameters for improved proteomic quantification by SWATH mass spectrometry
Shanshan Li1,#, Qichen Cao2,#,*, Weidi Xiao3, Yufeng Guo2, Yunfei Yang2, Xiaoxiao Duan3, Wenqing Shui1,*
1 iHuman Institute, ShanghaiTech University, Shanghai 201210, China
2 Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
3 College of Life Sciences, Nankai University, Tianjin 300071, China
#These authors contributed equally to this work
*To whom correspondence should be addressed:
Qichen Cao, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China; Tel: 86-22-24828768; email: cao_qc@tib.cas.cn
Wenqing Shui, iHuman Institute, ShanghaiTech University, Shanghai 201210, China; Tel: 86-21-20685595; email: shuiwq@shanghaitech.edu.cn
Abstract
Proteomic analysis with data independent acquisition (DIA) approaches represented by the SWATH technique has gained intense interest in recent years because DIA is able to overcome the intrinsic weakness of conventional data dependent acquisition (DDA) methods and afford higher throughput and reproducibility for proteome-wide quantification. Although the raw mass spectrometry (MS) data quality and the data-mining workflow conceivably influence the throughput, accuracy and consistency of SWATH-based proteomic quantification, a systematic evaluation and optimization of the acquisition and data-processing parameters for SWATH MS analysis is still lacking. Herein, we evaluated the impact of major acquisition parameters such as the precursor mass range, isolation window width and accumulation time as well as the data-processing variables including peak extraction criteria and spectral library selection on SWATH performance. Fine tuning these interdependent parameters can further improve the throughput and accuracy of SWATH quantification compared to the original setting adopted in most SWATH proteomic studies. Furthermore, we compared the effectiveness of two widely used peak extraction software tools, PeakView and Spectronaut, in discovery of differentially expressed proteins in a biological context. We believe our work will contribute to a deeper understanding of the critical factors in SWATH MS experiments and help researchers optimize their SWATH parameters and workflows depending on the sample type, available instrument and software.
Key words: SWATH, DIA, acquisition parameters, data processing, spectral library, proteomic quantification
Introduction
The ultimate goal of proteomics is qualitative and quantitative profiling of the full repertoire of proteins with sufficient accuracy and consistency. As technologies in mass spectrometry continue advancing, thousands of protein constituents in complex biological samples can be identified and quantified unambiguously.1-3 Proteomic analysis with data independent acquisition (DIA) approaches represented by the sequential window acquisition of all theoretical fragment ion spectra (SWATH) technique has gained intense interest in recent years because DIA is able to overcome the intrinsic weakness of conventional data dependent acquisition (DDA) methods and afford higher throughput and reproducibility for proteome-wide quantification.4, 5
SWATH mass spectrometry (MS) analysis records the complete fragment ion traces for all peptides detectable within specific precursor mass windows, thereby overcoming the stochastic, intensity-driven selection of precursors in DDA-based MS analysis. In fact, SWATH MS maintains the major advantages of multiple reaction monitoring (MRM)-based targeted approaches such as a high degree of specificity, reproducibility and sensitivity,6 yet it substantially improves throughput and coverage of protein quantification compared to MRM.5 An increasing number of studies have demonstrated the great potential of SWATH MS in large-scale quantitative proteomic research including interrogation of dynamics of the human interactome,7, 8 comprehensive mapping of mouse tissue proteome9, 10 and human plasma proteome11 as well as determination of genome-wide absolute protein concentrations.12, 13
SWATH MS operates within a space of interdependent acquisition parameters, including precursor mass range, precursor isolation window width, accumulation time of the product ion scan, and total duty cycle, which collectively influence the intensity and specificity of fragment ion peaks as well as the throughput, accuracy and consistency of peak quantification. In the original publication of the SWATH technique, the authors provided an acquisition parameter set (condition 1 in Table 1) based mainly on their extensive experience in DDA and MRM experiments as well as computational simulation of fragment ion interferences.5 Ever since then, these instrument settings have been adopted in many proteomic studies using the SWATH approach.4, 7, 8, 10, 11, 13-15 Because of the interconnections between these acquisition variables, such as the obvious tradeoff of the isolation window width with the accumulation time/cycling rate,5 we consider it necessary to explore the acquisition parameters in detail and understand their distinct influences on SWATH MS performance.
Fragment ion chromatogram extraction against a reference spectral library is an essential step in most SWATH MS data processing workflows. The depth and quality of the spectral library have a significant impact on the outcome of proteomic quantification. Very recently, Wu JX et al. evaluated locally generated and online repository-based libraries for their effects on SWATH quantification using the commercial software PeakView for SWATH peak extraction.16 In addition to PeakView, an array of open-access software tools such as Spectronaut, OpenSWATH, DIA-Umpire, and Skyline has been developed to process SWATH and other types of DIA datasets with more flexibility and openness than the commercial software.4, 12, 17-21 Among them, Spectronaut has emerged as a popular DIA data mining tool because it can utilize spectral libraries generated from raw data acquired on various instrument platforms and achieve accurate RT calibration using defined spike-in peptide standards.22 However, the variables in fragment ion peak extraction and spectral library construction have not been extensively investigated.
In this work, we systematically evaluated the impact of major acquisition parameters such as the precursor mass range, isolation window width and accumulation time as well as the data-processing variables including peak extraction criteria and spectral library selection on SWATH performance. By analyzing yeast proteomic samples serially diluted in a complex digestion background, we assessed the throughput and accuracy of proteins and peptides quantified by SWATH MS analysis under different conditions. Furthermore, we compared two workflows using PeakView or Spectronaut for fragment ion peak extraction and assessed their performance in discovery of differentially expressed proteins in yeast cells upon heat shock stress. We anticipate our work will contribute to a deeper understanding of the critical factors in SWATH MS experiments and help researchers optimize their SWATH parameters and workflows depending on the sample type, available instrument and software.
MATERIALS AND METHODS
Yeast Culture, Protein Extraction, Digestion and Fractionation
Cell culture media and media supplements were all purchased from Invitrogen (Carlsbad, CA, USA). All the other chemical materials were purchased from Sigma (St. Louis, MO, USA). Saccharomyces cerevisiae BY4743 strain was grown at 30 °C in YPD medium until it reached mid-exponential phase. The yeast culture was centrifuged at 1500 × g for 5 min at 4 °C. The cell pellets were washed 3 times with cold PBS to remove the medium. Then the cell pellets were resuspended in lysis buffer of 8 M urea, 100 mM NH4HCO3, 5 mM DTT and protease inhibitor cocktail (Roche, Mannheim, Germany). The cells were disrupted by glass beads and centrifuged at 15000 × g for 20 min at 4 °C. The protein concentration of the supernatant was determined by Bradford Protein Assay Kit. Yeast proteins (~1 mg) were reduced with 10 mM DTT at 37 °C for 4 h and alkylated with 40 mM iodoacetamide at room temperature in darkness for 40 min. An additional 10 mM DTT was added to quench excess iodoacetamide, followed by incubation at 37 °C for 30 min. Samples were then diluted with 100 mM NH4HCO3 to a final concentration of 1.0 M urea and proteins were digested with sequencing grade modified trypsin (Promega, Madison, USA) at an enzyme:protein ratio of 1:100 (w/w) at 37 °C for 4 h. The same amount of trypsin was added again and the mixture was incubated at 37 °C overnight. Digestion was terminated by adding 1% FA and then the peptides were desalted with C18 cartridges (Waters, Milford, USA) and dried by speed vacuum.
21
To build a reference spectral library for SWATH-MS analysis, the yeast peptide
22
sample was pre-fractionated by high-pH RPLC with a Durashll-C18 column (C18, 3 µm
23
resin, 4.6 mm × 250 mm, Agela, China) on the Nexera UHPLC system (SHIMADZU,
24
Japan). The peptides were dissolved in mobile phase A (water-acetonitrile-NH4OH = 98: 7
ACS Paragon Plus Environment
Journal of Proteome Research
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
1
2: 0.014, v/v/v). The peptides were eluted by mobile phase B (water-acetonitrile-NH4OH
2
= 2: 98: 0.014, v/v/v) using a gradient of 5-40 min 8-18%; 40-62 min 18-32%; 62-64 min
3
32-95%. The fractions were collected and pooled into 15 fractions before lyophilization
4
under vacuum. Before LC-MS analysis, all peptide samples were spiked with the
5
retention time standard peptides iRT-Kit (Biognosys, Schlieren, Switzerland)22 according
6
to the manufacture instruction.
Dilution of Yeast Proteomic Samples with E. coli Total Digests
To assess the SWATH-MS quantification accuracy, the yeast total cell digest was mixed with Escherichia coli total cell digest, which simulates a highly complex proteomic matrix. The E. coli K-12 strain was grown to 1.0 OD in M9 minimal medium. Protein extraction and digestion were conducted with the same procedure as described above for yeast total digest preparation. The yeast peptide samples were spiked into E. coli total digest with 2X, 5X and 10X dilution factors, which gave rise to expected fold changes of 0.5, 0.2 and 0.1 for the serially diluted yeast samples vs the undiluted sample. Each diluted and undiluted yeast proteomic sample was injected into 1D nanoLC-MS in duplicate for SWATH MS analysis.
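The expected fold changes quoted above are simply the reciprocals of the dilution factors. The short Python sketch below makes that arithmetic explicit. The `relative_error` helper is an assumed, commonly used definition (absolute deviation divided by the expected ratio) added only for illustration; the function names are hypothetical and not taken from any software used in this work.

```python
def expected_fold_change(dilution_factor: float) -> float:
    """Expected diluted/undiluted ratio for a given dilution factor."""
    if dilution_factor <= 0:
        raise ValueError("dilution factor must be positive")
    return 1.0 / dilution_factor


def relative_error(measured_ratio: float, expected_ratio: float) -> float:
    """Assumed definition: |measured - expected| / expected."""
    return abs(measured_ratio - expected_ratio) / expected_ratio


if __name__ == "__main__":
    # Dilution factors used for the yeast-in-E. coli samples.
    for factor in (2, 5, 10):
        print(f"{factor}X dilution -> expected ratio {expected_fold_change(factor):.2f}")
```

With this definition, a protein measured at a ratio of 0.3 in the 5X sample would deviate from the expected 0.2 by roughly 50%.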
NanoLC-MS/MS Setup
All the nanoLC-MS/MS runs in this work were performed on an Eksigent NanoLC connected to a TripleTOF 5600 mass spectrometer (AB SCIEX, Concord, Ontario) with a nano-electrospray ionization source. Each proteomic sample (~2 µg) was loaded onto a C18 trap column (10 mm × 100 µm, 5 µm, C18 resin) using an isocratic 98% Buffer A (2% acetonitrile, 0.1% FA) and 2% Buffer B (98% acetonitrile, 0.1% FA) at a flow rate of 2 µL/min. Then peptides were separated on a nanoLC column (150 mm × 75 µm) packed with C18-AQ 3 µm C18 resin (Dr. Maisch GmbH, Germany) at a flow rate of 300 nL/min with an elution gradient of 0-1 min, 5% Buffer B; 1-55 min, 5-24% B; 55-70 min, 24-36% B; 70-85 min, 36-80% B. The mass spectrometer was operated in the positive ion mode. In the shotgun experiment of pre-fractionated yeast digest samples, information-dependent acquisition (IDA) was implemented using a “top 40” method. Specifically, a 250 ms survey scan was performed in the m/z range of 350-1500, and the top 40 ions above the intensity threshold of 120 counts were selected for subsequent MS/MS scans with an accumulation time of 50 ms. In the SWATH experiment, a 100 ms survey scan was performed in the m/z range of 350-1500, followed by a series of consecutive SWATH scans. Key parameters such as the SWATH scan mass range, isolation window width and accumulation time, specified in Table 1, were individually evaluated.
Database Search and Spectral Library Generation
For extensive spectral library generation, 15 off-line RPLC fractions from yeast protein digestion were analyzed on the TripleTOF 5600 mass spectrometer in the IDA mode. Each IDA file was searched with the Mascot (v2.5.1, Matrix Science), MaxQuant (v1.5.0.30),23 ProteinPilot (v4.5, AB SCIEX)24 or X!Tandem25 search engine built in SearchGUI software (v2.1.4)26 against the SGD protein sequence database (release 09 Nov. 2011, containing 6771 entries). Trypsin was set as the specific enzyme and up to two missed cleavages per peptide were allowed. Carbamidomethylation on cysteine was set as a fixed modification; oxidation on methionine and acetylation on the protein N-terminus were set as variable modifications. For Mascot, MaxQuant and X!Tandem searches, precursor ion mass tolerance was set to 20 ppm and fragment ion tolerance was 0.05 Da. For the ProteinPilot search, the “Thorough ID” mode was selected, which automatically adjusts the mass tolerance to fit the high-resolution MS and MS/MS data. The corresponding database search result files were imported to Skyline software19 to generate the spectral library with a cut-off score of 0.99 to ensure confident spectral assignment.
SWATH MS Data Extraction and Statistical Analysis
Peak extraction of the SWATH data was performed using either the Spectronaut software (ver 8.0, Biognosys, Switzerland) or the SWATH micro App embedded in PeakView (ver 2.0, AB SCIEX, USA).27 SWATH data were processed with default settings in Spectronaut.4 Reference peptides from the iRT-Kit (Biognosys) spiked into each sample were used to calibrate the retention time of extracted peptide peaks using Spectronaut. Peptide identification results were filtered with a q-value < 0.01, which controlled the estimated peptide FDR below 1% using the error rate algorithm originally from mProphet.28 PeakView was also used for peptide peak extraction with the following parameters: 75 ppm m/z tolerance for the targeted transition, six peptides selected per protein, six transitions selected per peptide, peptide identification FDR < 1%, and excluding shared peptides. RT calibration was also performed based on iRT peptide elution profiles in PeakView using the SWATH App module (ver 2.0).
After peak extraction with either Spectronaut or PeakView, the MS2 ion peak areas of SWATH-quantified peptides were summed for individual proteins and exported to calculate the protein peak areas. In the data analysis for diluted vs undiluted yeast proteomic samples, the protein fold change was calculated based on protein peak areas from the pair of samples in comparison. For statistical analysis of the SWATH dataset from the yeast heat shock experiment, the peak extraction output data matrix from either Spectronaut or PeakView was imported into MSstats (v2.3.5) for data normalization and relative protein quantification.29 Proteins with a fold change > 1.5 and statistical p-value < 0.05 estimated by MSstats were regarded as differentially expressed under heat shock condition vs control.
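The roll-up and thresholding steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input layout (protein, peptide, MS2 peak area) is hypothetical, the fold-change cutoff is assumed to apply in both directions (up- and down-regulation), and the p-values are taken as given rather than computed, since in this work they come from MSstats.

```python
from collections import defaultdict


def protein_areas(peptide_rows):
    """Sum peptide MS2 peak areas per protein.

    peptide_rows: iterable of (protein_id, peptide_id, ms2_peak_area).
    """
    totals = defaultdict(float)
    for protein_id, _peptide_id, area in peptide_rows:
        totals[protein_id] += area
    return dict(totals)


def fold_changes(areas_sample, areas_reference):
    """Protein-level ratio sample/reference for proteins found in both."""
    shared = areas_sample.keys() & areas_reference.keys()
    return {
        protein: areas_sample[protein] / areas_reference[protein]
        for protein in shared
        if areas_reference[protein] > 0
    }


def is_differential(fold_change, p_value, fc_cutoff=1.5, alpha=0.05):
    """Assumption: the 1.5-fold cutoff is symmetric (>1.5 or <1/1.5)."""
    changed = fold_change > fc_cutoff or fold_change < 1.0 / fc_cutoff
    return changed and p_value < alpha
```

For example, a protein whose summed peak area halves between two samples has a fold change of 0.5, which passes the assumed symmetric cutoff provided its p-value is below 0.05.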
Yeast Heat Shock Experiment
An industrial yeast strain ScY01 was first grown at 30 °C to early exponential phase in YPD medium. An equal volume of pre-warmed YPD medium (70 °C) was added to the culture, resulting in an instantaneous shift of culture temperature from 30 °C to 50 °C, and cells were exposed to the heat stress at 50 °C for 30 min. A control experiment was carried out by culturing yeast cells at 30 °C constantly. Cells were collected by centrifugation at 1500 × g for 5 min at 4 °C and the cell pellets were washed by PBS three times. The following protein extraction and digestion procedure was the same as described above. Two biological replicates, each with two process replicates, were implemented for both the control and heat shock samples. Protein digests from individual samples were separately analyzed on AB 5600 TripleTOF MS in SWATH acquisition mode. The optimized instrument parameters specified as “condition 5” in Table 1 were applied. SWATH peak extraction was separately performed using Spectronaut and PeakView, both with the criteria of peptide transition FDR 1 unique peptide per protein.
RESULTS AND DISCUSSION
Impact of SWATH MS Major Acquisition Parameters
We first evaluated the influence of tuning major SWATH MS acquisition parameters on the total number of proteins and peptides quantified as well as the precision of quantification. The possible influence of several instrument parameters on general SWATH MS performance was briefly discussed in a recent publication.30 We anticipate that the best identification and quantification results can be obtained only when the major acquisition parameters (i.e. precursor mass range, Q1 isolation window width, accumulation time for product ion scans, and cycle time) are well adapted to both the targeted analytes and the instrument performance. To this end, six different combinations of these parameters listed in Table 1 were applied in SWATH analysis of a yeast proteomic sample using a TripleTOF 5600 mass spectrometer (complete quantification results are summarized in Table S1). In this study, we relied on protein identification results from DDA experiments on the pre-fractionated peptide samples from the same yeast total digest to construct a spectral library for SWATH data extraction.
Condition 1 represents the conventional SWATH MS settings, which divide the mass range from 400 to 1200 m/z into 32 consecutive Q1 isolation windows with 25 amu per window.4, 5 Each product ion scan takes 100 ms and the total cycle time is 3.35 s, which allows collection of enough data points across most peptide chromatographic peaks. Under condition 2, the precursor mass range was narrowed to 350-1000 m/z and the accumulation time of each product ion scan was increased while the cycle time was kept constant. These changes resulted in quantification of 14% more proteins and peptide precursors compared to the initial setting (Figure 1A).
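The numbers quoted for condition 1 can be checked with a simple back-of-the-envelope calculation, sketched below in Python. The model is an idealised assumption: the cycle time is taken as one survey scan (100 ms, as stated in the Methods) plus one product ion scan per isolation window, with no instrument overhead. The function names are hypothetical.

```python
import math


def n_isolation_windows(mz_start: float, mz_end: float, window_width: float) -> int:
    """Number of consecutive, non-overlapping Q1 windows covering the range."""
    return math.ceil((mz_end - mz_start) / window_width)


def approx_cycle_time_s(n_windows: int, accumulation_ms: float,
                        survey_ms: float = 100.0) -> float:
    """Idealised cycle time: survey scan plus all product ion scans.

    Assumption: no switching or settling overhead is modelled.
    """
    return (survey_ms + n_windows * accumulation_ms) / 1000.0


if __name__ == "__main__":
    windows = n_isolation_windows(400, 1200, 25)
    cycle = approx_cycle_time_s(windows, accumulation_ms=100)
    print(f"{windows} windows, idealised cycle time {cycle:.2f} s")
```

Under these assumptions the conventional setting gives 32 windows and an idealised cycle time of about 3.3 s, slightly below the 3.35 s reported above; the small difference presumably reflects overhead that this sketch does not model.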
We attributed the gain in quantification throughput mostly to the increased accumulation time, which directly improves the MS/MS spectral quality. The increase of accumulation time in the product ion scan is made possible by the reduced number of isolation windows due to the narrowed mass range. It is noteworthy that the adjusted mass range did not compromise the total number of protein and peptide identifications in the SWATH experiment, as we found that the majority of identifications from a DDA experiment of the yeast total digest using the same instrument were concentrated in this region (Figure S1A). In other words, skipping the high mass range of 1000-1200 m/z allowed for an enhanced product ion scan for each isolation window while not affecting the cycling rate, leading to increased quantification coverage as well as slightly improved precision (Figure 1A and 1B). However, when we analyzed DDA data from mouse brain tissue or HeLa cells acquired on two instruments from other vendors,31, 32 many more identifications were based on precursors detected in the high mass range (Figure S1B, S1C). Therefore, optimization of the precursor mass range is sample and instrument specific. According to our DDA data, we employed this new precursor mass range of 350-1000 m/z in the subsequent experiments and condition 2 was regarded as a benchmark for performance comparison.
We next assessed the impact of Q1 isolation window width on SWATH MS performance. In principle, large isolation windows allow for faster cycling rates across a defined precursor scan range. However, large isolation windows increase the number of precursors concurrently fragmented in the respective window, introducing more ion interference. In addition, the presence of fragment ion interference could influence the mass accuracy, resolution and signal intensity of targeted fragment ions in SWATH spectra.27 When we narrowed the isolation window from the original 25 amu to 15 amu or 10 amu while still fixing the cycle time around 3.3 s, many more isolation windows were needed across the mass range of 350-1000 m/z and each one was allotted a shorter accumulation time (conditions 3 and 4 in Table 1). Interestingly, these adjustments further increased the total number of quantified proteins by 7.1% and 10.4% for conditions 3 and 4 respectively compared to condition 2. Notably, these new conditions did not impair the precision of quantification, mainly because of the constant cycle time (Figure 1B). These results indicate that the reduction of interferences in the product ion spectra due to the smaller isolation window outweighed the reduction of accumulation time, causing a net enhancement of SWATH data quality. Recently, the technique of variable Q1 isolation windows has been introduced to SWATH MS and it allows more flexible optimization of the SWATH isolation window within different segments of the precursor mass range so as to acquire deeper coverage for proteome-wide quantification.33
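The tradeoff between window width and accumulation time at a fixed cycle time can be illustrated with a rough budget calculation. The Python sketch below assumes an idealised cycle consisting of a 100 ms survey scan plus equal-length product ion scans with no overhead, so the values it produces are estimates and need not match the actual settings listed in Table 1. The function name is hypothetical.

```python
import math


def accumulation_budget_ms(mz_start: float, mz_end: float, window_width: float,
                           cycle_time_s: float, survey_ms: float = 100.0):
    """Return (number of windows, accumulation time per window in ms).

    Assumption: the cycle is the survey scan plus equal-length product
    ion scans, with no instrument overhead.
    """
    n_windows = math.ceil((mz_end - mz_start) / window_width)
    per_window_ms = (cycle_time_s * 1000.0 - survey_ms) / n_windows
    return n_windows, per_window_ms


if __name__ == "__main__":
    # Window widths examined over the 350-1000 m/z precursor range.
    for width in (25, 15, 10):
        n, ms = accumulation_budget_ms(350, 1000, width, cycle_time_s=3.3)
        print(f"{width} amu windows: {n} windows, ~{ms:.0f} ms each")
```

The sketch shows why narrower windows force shorter product ion scans when the cycle time is held constant: the same time budget is divided among more windows.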
Considering that the cycle time significantly affects data point sampling across chromatographic peaks and thus the precision of quantification, we then modified this parameter to achieve faster cycling rates (conditions 5 and 6). It turned out that a cycle time of 2.7 s, which is shorter than the conventional setting (3.3-3.4 s), afforded the best precision among the six conditions examined here (Figure 1B). Compared to condition 2, the median CV of peptide peak quantification in condition 5 dropped from 8.3% to 6.1%, and the median number of data points across peaks increased from 8.6 to 11.0. However, the even faster cycling rate in condition 6 deteriorated the precision of SWATH quantification, as shown by the median CV of peak area rising to 10.7% (Figure 1A, 1B). It should also be noted that in our study the shorter cycle time is connected to a reduced accumulation time of product ion scans (80 or 50 ms) as the isolation window setting was not much changed. Interestingly, the reduced accumulation time in condition 5 did not have a negative effect on the quantification throughput (Figure 1A). Instead, slightly more proteins and peptides were measured in condition 5 vs condition 2 (Figure 1A). However, the unmatched combination of shorter accumulation time and wider isolation window in condition 6 would affect fragment ion spectral quality and peptide transition extraction, which ultimately lowered both quantification throughput and precision (Figure 1). Scatter plots of CV relative to ion intensity under different conditions are shown in Figure S2.
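The two quantities discussed above, the coefficient of variation (CV) of replicate peak areas and the number of data points sampled across a chromatographic peak, can be sketched as follows. Both helpers are illustrative assumptions: the CV uses the sample standard deviation (the exact estimator used in this work is not stated here), and the points-per-peak estimate simply divides a hypothetical peak width by the cycle time.

```python
import statistics


def cv_percent(peak_areas):
    """CV (%) of replicate peak areas, using the sample standard deviation."""
    mean = statistics.mean(peak_areas)
    if mean == 0:
        raise ValueError("mean peak area is zero; CV is undefined")
    return 100.0 * statistics.stdev(peak_areas) / mean


def points_per_peak(peak_width_s: float, cycle_time_s: float) -> float:
    """Approximate number of data points across a peak of given width."""
    return peak_width_s / cycle_time_s


if __name__ == "__main__":
    # Hypothetical duplicate injections of one peptide.
    print(f"CV = {cv_percent([9.0e4, 1.1e5]):.1f}%")
    # Hypothetical 30 s wide peak sampled at two cycle times.
    for cycle in (3.3, 2.7):
        print(f"cycle {cycle} s -> ~{points_per_peak(30.0, cycle):.1f} points")
```

The second helper makes the dependence explicit: for a given peak width, shortening the cycle time raises the number of points per peak proportionally, which is consistent in direction with the increase reported for condition 5.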
Taken together, we obtained an optimal combination of major acquisition parameters (condition 5) in which precursor mass range, accumulation time and cycle time are significantly modified compared to the conventional settings adopted in previous studies using the TripleTOF 5600 mass spectrometer (condition 1). These modifications led to an increase of the total number of quantified proteins from 1983 to 2297, and a reduction of the median CV of peptide peak area from 10.5% to 6.1%. Our strategy of combined adjustment of interdependent instrument parameters can be extended to other MS instrument platforms with faster scan speed, better sensitivity and higher mass resolving power to accomplish optimal SWATH MS performance.
SWATH MS Analysis of Yeast Proteomic Samples Diluted in E. coli Digest Background
To assess the capability of SWATH MS in profiling of various proteins in a complex matrix, we diluted a proteomic sample of the yeast total digest in E. coli digest background by a factor of 2, 5 and 10. The initial (undiluted) proteomic sample and three diluted samples were analyzed by SWATH MS using the optimal instrument parameter set discussed above (i.e. condition 5 in Table 1). Each sample was injected with equal loading in technical duplicate. The spectral library generated earlier from pre-fractionated yeast proteomic samples was used here for detection of proteins and peptides from the SWATH dataset.
The number of yeast proteins and peptides that could be identified and quantified by SWATH analysis decreased gradually from the initial sample to the diluted samples as the dilution factor increased (Figure 2A). In total, 14007 peptide precursors corresponding to 2297 proteins were measured in the initial yeast total digest, whereas only 1901 precursors corresponding to 535 yeast proteins were measured in the 10-fold diluted sample. It becomes increasingly challenging to extract and identify less abundant yeast peptides serially diluted with E. coli total digests. Surprisingly, according to the copy number estimation given by Ghaemmaghami et al.,34 the cellular abundances of yeast proteins detected by SWATH MS from the initial sample and the serially diluted samples all spanned a wide dynamic range of four orders of magnitude (from 100 to 1E6 copies/cell) (Figure 2B). Furthermore, the quantification precision for yeast proteomic samples was not much affected by the E. coli digest background. More than 70% of yeast proteins were quantified with a CV below 20% in both the initial and diluted samples (Figure 2C). We then used this reference dataset to assess how changes in data-processing parameters affect the quantification throughput and accuracy of SWATH MS analysis.
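The precision criterion used above (CV below 20% across replicate injections) can be illustrated with a short sketch. The protein names and peak areas are made-up placeholders, not values from this study, and the helper names `cv_percent` and `fraction_below` are hypothetical.

```python
from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of variation (%) = sample standard deviation / mean * 100."""
    return stdev(values) / mean(values) * 100.0

def fraction_below(cvs, threshold=20.0):
    """Fraction of proteins whose CV% is below the threshold."""
    return sum(cv < threshold for cv in cvs) / len(cvs)

# Hypothetical peak areas from replicate injections (illustrative numbers only).
replicate_areas = {
    "protA": [100.0, 110.0],
    "protB": [200.0, 100.0],
    "protC": [50.0, 52.0],
    "protD": [80.0, 79.0],
}

cvs = {name: cv_percent(areas) for name, areas in replicate_areas.items()}
share = fraction_below(list(cvs.values()))
print(f"{share:.0%} of proteins have CV < 20%")
```

In this toy input three of the four proteins fall below the 20% cutoff; a cumulative plot of the same CV values would correspond to the kind of curve described for Figure 2C.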
Impact of Peptide Peak Extraction Criteria
We investigated the influence of peptide extraction criteria in Spectronaut software on the total number of yeast proteins quantified in the E. coli digest background as well as on the quantification accuracy of SWATH analysis. Quantitative comparison between the diluted yeast proteomic samples and the initial sample should yield theoretical ratios of 0.5, 0.2 and 0.1, respectively. Using default settings in Spectronaut with a single filter of q-value < 0.01 for control of peptide transition FDR, 1598, 999 and 618 proteins were quantified in the yeast proteomic samples diluted 2-, 5- and 10-fold with E. coli digests (complete quantification results are summarized in Table S1). Although the median protein ratio determined for each sample was close to the theoretical value, a number of outliers far from the expected ratio were observed (Figure 3A). In fact, 22.2%-43.9% of proteins from the three diluted samples showed a relative error above 50% (Table S2). These outliers mostly derive from peptides with poor quantification reproducibility or with low-quality extracted ion spectra, as reflected by the Cscore given by Spectronaut (Figure S3). Because peptides carrying variable modifications are routinely removed when developing a quantification assay (e.g. S/MRM), we then filtered out these modified peptides extracted from the SWATH dataset. Surprisingly, this treatment did not significantly reduce the number of outliers for protein quantification (Figure 3B).
Next we changed the criteria to require at least two unique peptides per protein for quantification, as adopted in many proteomic studies. As a result, 12.8%-20.5% of proteins from the three diluted samples were found to have a relative error above 50%, indicating a substantial increase in SWATH quantification accuracy (Figure 3C, Table S2). Combining the filters to exclude modified peptides and retain proteins with more than one peptide assignment did not further improve the accuracy of quantification (Figure 3D, Table S1). Although requiring at least two peptides per protein reduced the coverage of quantifiable proteins and peptide precursors to varying extents (Figure 3E), we still recommend implementing this stringent criterion for the benefit of quantification accuracy.
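As a minimal sketch of the accuracy metric used in this section, the theoretical ratio for an n-fold dilution is 1/n, and the relative error of a measured ratio is its deviation from that value expressed as a percentage. The measured ratios and peptide counts below are invented for illustration, the function names are hypothetical, and the exact calculation applied to the Spectronaut output may differ.

```python
def theoretical_ratio(dilution_factor):
    """Expected diluted/initial ratio for an n-fold dilution."""
    return 1.0 / dilution_factor

def relative_error(measured, expected):
    """Relative error (%) of a measured ratio with respect to the expected ratio."""
    return abs(measured - expected) / expected * 100.0

def outlier_fraction(proteins, dilution_factor, min_peptides=1, max_error=50.0):
    """Fraction of retained proteins whose relative error exceeds max_error."""
    expected = theoretical_ratio(dilution_factor)
    kept = [p for p in proteins if p["n_peptides"] >= min_peptides]
    if not kept:
        return 0.0
    outliers = [p for p in kept if relative_error(p["ratio"], expected) > max_error]
    return len(outliers) / len(kept)

# Hypothetical protein-level ratios for a 10-fold diluted sample (illustrative only).
proteins = [
    {"name": "protA", "ratio": 0.11, "n_peptides": 5},
    {"name": "protB", "ratio": 0.09, "n_peptides": 3},
    {"name": "protC", "ratio": 0.30, "n_peptides": 1},
    {"name": "protD", "ratio": 0.12, "n_peptides": 2},
    {"name": "protE", "ratio": 0.02, "n_peptides": 1},
]

# Theoretical ratios for the 2-, 5- and 10-fold dilutions.
print([theoretical_ratio(n) for n in (2, 5, 10)])
# Outlier fraction with the single q-value filter (all proteins retained).
print(outlier_fraction(proteins, 10))
# Outlier fraction when at least two peptides per protein are required.
print(outlier_fraction(proteins, 10, min_peptides=2))
```

In this toy input both outliers are single-peptide proteins, so the two-peptide requirement removes them at the cost of retaining fewer proteins, which mirrors the trade-off between coverage and accuracy discussed above.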
Impact of Spectral Library and Data Processing Software
To ensure specificity of fragment ion extraction from SWATH datasets, a previously generated spectral library is needed to provide the information matrix of peptide sequence, retention time, peptide-fragment transitions and fragment ion intensity. In a detailed protocol for building high-quality spectral libraries by Schubert et al.,35 the authors suggested pre-fractionation of peptide samples and combining the output of multiple search engines to maximize the number of PSMs and enhance discrimination between true and false assignments. In the recently published work by Wu et al., the spectral library generated on the same sample using the same type of instrument led to detection of the highest number of differential proteins with the lowest false positive rate.16 However, this study did not investigate the value of using multiple search engines to build extended spectral libraries.
In this work, we first constructed two separate spectral libraries based on protein identifications for the pre-fractionated yeast proteomic samples, either from the Mascot search alone or by pooling results from four search engines (Mascot, MaxQuant, ProteinPilot and X!Tandem). A third spectral library, available in Spectronaut, was built on the public data repository of yeast proteomes. It is noteworthy that all three spectral libraries contained data from iRT peptide standards spiked into every sample for accurate RT calibration, which ensures the specificity of fragment ion peak extraction.4 In total, 27,462 peptides corresponding to 4145 proteins were included in the Mascot library; 33,097 peptides corresponding to 4315 proteins were in the multi-engine library; and 34,029 peptides corresponding to 3421 proteins were available in the public data library. The aforementioned yeast dilution reference dataset was processed against the three different spectral libraries using Spectronaut, requiring at least two peptides per protein. Use of either the multi-engine library or the public data library increased the total number of quantified proteins to different extents for the three diluted yeast samples compared to use of the Mascot library (Figure 4).
However, the quantification accuracy was impaired when using the multi-engine and public data libraries, given that more outliers with abnormal ratios were present (Figure 4A-C). In particular, the percentage of yeast proteins in the 10-fold diluted sample whose fold changes had a relative error > 50% increased from 20% to over 26% when changing from the Mascot library to the multi-engine or public repository-derived library (Table S3). We speculate that combining identification outputs from multiple engines expanded the search space and may have introduced more errors in SWATH spectral annotation as well as in the subsequent fragment ion peak extraction. Thus a stringent FDR control method such as Mayu36 or BiblioSpec37 could be exploited to improve the quality of the multi-engine and public data libraries and minimize false assignment of SWATH spectra. In addition, Wu et al. reported that extending the assay library by integrating spectra from a public data repository largely compromises the accuracy and precision of SWATH quantification16. Other software tools for creating combined spectral libraries from different sources may be further evaluated to increase the depth and quality of assay libraries12, 35, 38. Nevertheless, the inferior quantification performance with the public data library also observed in our study emphasizes the importance of building the library from shotgun proteomic data acquired on the same instrument system as the SWATH experiment, even though precursor RT can be strictly calibrated with peptide standards.
Apart from optimizing data-processing parameters in Spectronaut, we also tested the widely used commercial software PeakView in the analysis of the same SWATH dataset for the yeast dilution samples. Fragment ion peaks were extracted using the SWATH micro App 2.0 built into PeakView with criteria similar to the Spectronaut settings (see Methods for details). The spectral library was constructed from the identification output of ProteinPilot, which contained 150791 peptides corresponding to 3673 proteins. Surprisingly, SWATH peak extraction with PeakView against this single library gave rise to significantly fewer proteins (Figure 4E). Only 361, 210 and 117 proteins were quantified in the 2-, 5- and 10-fold diluted yeast samples, respectively, compared to 1082, 609 and 337 proteins quantified by Spectronaut against the Mascot library. Swapping the spectral libraries between the two software tools led to the same conclusion: a greater number of yeast proteins were quantified when extracting SWATH peaks with Spectronaut than with PeakView (Figure 4E).
Notably, higher accuracy of protein quantification was obtained from SWATH data processed by PeakView than by Spectronaut (Figure 4, Table S3). Thus the two software tools showed complementary features in terms of quantification throughput and accuracy for SWATH data mining.
Quantification of Expression Changes in the Yeast Proteome upon Heat Shock
We next employed the SWATH MS technique to quantify the expression changes in the S. cerevisiae proteome induced by heat stress. The heat shock response of yeast cells, which activates multiple fundamental stress response programs, has been characterized at the proteomic level using stable isotope labeling-based quantification techniques39, 40. Herein we analyzed the control and heat-stressed proteomes of an industrial yeast strain by SWATH MS with the optimal instrument and data-processing parameters discovered above (see Methods for details). Given that the two software tools, Spectronaut and PeakView, displayed complementary strengths in SWATH quantification performance, we processed this SWATH dataset using both for peak extraction under the same criteria (i.e. FDR 1 peptide/protein). Ion peak areas exported from Spectronaut and PeakView were normalized and subjected to statistical analysis by MSstats.29
Using the two data-mining workflows, we measured a total of 1323 and 663 proteins across all four replicates of the heat-shock and control samples based on peak extraction by Spectronaut and PeakView, respectively. Among the quantified proteins, 100 were found to be differentially expressed with a fold change > 1.5 (p-value < 0.05) in heat shock response vs control with the Spectronaut workflow, whereas 43 differential proteins were discovered by the PeakView workflow. The overlap of up- and down-regulated proteins revealed by the two workflows is summarized in Figure 5.
Although Spectronaut is more sensitive in detecting differentially expressed proteins, PeakView still captures a handful of differential proteins that would have been overlooked if relying on Spectronaut alone. Considering the relatively high accuracy of PeakView in quantifying the expected protein fold changes in the diluted yeast samples (Figure 4), we pooled the differential proteins reported by the two software tools that met our selection criteria. Yet it should be noted that the FDR of quantification could be increased by combining results from two workflows. Among 117 non-redundant differential proteins revealed by our SWATH MS analysis, only 27 were also reported in the previous study to be significantly changed under heat shock stress39. This considerable difference in proteomic profiles may be attributed to multiple factors including distinct strain backgrounds, different stress conditions and different MS quantification techniques employed.
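The selection and pooling step can be sketched as follows. The protein names, fold changes and p-values are invented placeholders; the thresholds (fold change > 1.5, p-value < 0.05) follow the text, whereas treating down-regulation as a fold change below 1/1.5 is an assumption of this sketch, and the function names are hypothetical.

```python
def is_differential(fold_change, p_value, fc_cutoff=1.5, p_cutoff=0.05):
    """True if the protein changes by more than fc_cutoff (up or down) with p < p_cutoff."""
    changed = fold_change > fc_cutoff or fold_change < 1.0 / fc_cutoff
    return changed and p_value < p_cutoff

def select(results):
    """Return the set of protein names passing the selection criteria."""
    return {name for name, (fc, p) in results.items() if is_differential(fc, p)}

# Hypothetical (fold change, p-value) pairs per workflow; illustrative values only.
workflow_a = {"P1": (2.0, 0.01), "P2": (0.5, 0.03), "P3": (1.2, 0.01), "P4": (1.8, 0.20)}
workflow_b = {"P1": (1.9, 0.02), "P3": (1.6, 0.04), "P5": (0.4, 0.001)}

hits_a = select(workflow_a)
hits_b = select(workflow_b)
pooled = hits_a | hits_b   # non-redundant union of both workflows
shared = hits_a & hits_b   # proteins reported by both workflows

print(sorted(pooled), sorted(shared))

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
assert len(pooled) == len(hits_a) + len(hits_b) - len(shared)
```

If the 117 non-redundant proteins are simply the union of the 100 and 43 differential proteins from the two workflows, the same inclusion-exclusion identity would imply 100 + 43 - 117 = 26 shared proteins; this figure is an inference from the reported counts and is not stated explicitly in the text.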
Functional classification was then performed on the differentially expressed proteins detected by both workflows (Figure 5). A panel of proteins involved in protein folding/degradation was up-regulated and those involved in electron transport/energy generation were down-regulated during the heat shock response of the industrial yeast strain. The complete dataset and in-depth characterization of the differential proteins will be presented elsewhere. In summary, we recommend pooling the results given by the two widely used SWATH data extraction software tools to cover as many differentially regulated proteins as possible for functional studies.
Conclusions
The present study has evaluated the impact of multiple parameters in the raw data acquisition and data-processing workflow on SWATH MS performance so as to strike a balance between the throughput, accuracy and reproducibility of proteomic quantification.
Our work indicates that several acquisition variables, including precursor mass range, MS2 accumulation time, isolation window width and cycle time, affect quantification performance in an interdependent manner. In addition, special attention needs to be paid to the SWATH peak extraction criteria and the software chosen for SWATH data processing so as to acquire complete and reliable quantification results. For SWATH experiments performed on the ABSciex TripleTOF 5600 instrument, we provide an optimal set of acquisition parameters and recommend two complementary workflows for data mining and differential protein selection. However, it should be noted that the optimized parameter set is sample and instrument specific. Our study mainly provides experimental evidence that combined optimization of these parameters can improve SWATH quantification performance, and we recommend that researchers optimize their own parameters and workflows depending on the sample type, available instrument and software.
Acknowledgement
This work was supported by grants from the Bairenjihua Program of the Chinese Academy of Sciences, the National Natural Science Foundation of China (No. 31401150 and 21505151) and the Key Projects in Tianjin Science & Technology Pillar Program (No. 14ZCZDSY00062).
Conflict of Interest Disclosure
The authors declare no competing financial interests.
REFERENCES
1. Chick, J.M.; Munger, S.C.; Simecek, P.; Huttlin, E.L.; Choi, K.; Gatti, D.M.; Raghupathy, N.; Svenson, K.L.; Churchill, G.A.; Gygi, S.P. Defining the consequences of genetic variation on a proteome-wide scale. Nature 2016, 534, 500-505.
2. Mertins, P.; Mani, D.R.; Ruggles, K.V.; Gillette, M.A.; Clauser, K.R.; Wang, P.; Wang, X.; Qiao, J.W.; Cao, S.; Petralia, F.; Kawaler, E.; Mundt, F.; Krug, K.; Tu, Z.; Lei, J.T.; Gatza, M.L.; Wilkerson, M.; Perou, C.M.; Yellapantula, V.; Huang, K.L.; Lin, C.; McLellan, M.D.; Yan, P.; Davies, S.R.; Townsend, R.R.; Skates, S.J.; Wang, J.; Zhang, B.; Kinsinger, C.R.; Mesri, M.; Rodriguez, H.; Ding, L.; Paulovich, A.G.; Fenyo, D.; Ellis, M.J.; Carr, S.A.; Nci, C. Proteogenomics connects somatic mutations to signalling in breast cancer. Nature 2016, 534, 55-62.
3. Kusebauch, U.; Campbell, D.S.; Deutsch, E.W.; Chu, C.S.; Spicer, D.A.; Brusniak, M.Y.; Slagel, J.; Sun, Z.; Stevens, J.; Grimes, B.; Shteynberg, D.; Hoopmann, M.R.; Blattmann, P.; Ratushny, A.V.; Rinner, O.; Picotti, P.; Carapito, C.; Huang, C.Y.; Kapousouz, M.; Lam, H.; Tran, T.; Demir, E.; Aitchison, J.D.; Sander, C.; Hood, L.; Aebersold, R.; Moritz, R.L. Human SRMAtlas: A Resource of Targeted Assays to Quantify the Complete Human Proteome. Cell 2016, 166, 766-778.
4. Selevsek, N.; Chang, C.Y.; Gillet, L.C.; Navarro, P.; Bernhardt, O.M.; Reiter, L.; Cheng, L.Y.; Vitek, O.; Aebersold, R. Reproducible and consistent quantification of the Saccharomyces cerevisiae proteome by SWATH-mass spectrometry. Mol Cell Proteomics 2015, 14, 739-749.
5. Gillet, L.C.; Navarro, P.; Tate, S.; Rost, H.; Selevsek, N.; Reiter, L.; Bonner, R.; Aebersold, R. Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Mol Cell Proteomics 2012, 11, O111 016717.
6. Lange, V.; Picotti, P.; Domon, B.; Aebersold, R. Selected reaction monitoring for quantitative proteomics: a tutorial. Mol Syst Biol 2008, 4, 222.
7. Collins, B.C.; Gillet, L.C.; Rosenberger, G.; Rost, H.L.; Vichalkovski, A.; Gstaiger, M.; Aebersold, R. Quantifying protein interaction dynamics by SWATH mass spectrometry: application to the 14-3-3 system. Nat Methods 2013, 10, 1246-1253.
8. Lambert, J.P.; Ivosev, G.; Couzens, A.L.; Larsen, B.; Taipale, M.; Lin, Z.Y.; Zhong, Q.; Lindquist, S.; Vidal, M.; Aebersold, R.; Pawson, T.; Bonner, R.; Tate, S.; Gingras, A.C. Mapping differential interactomes by affinity purification coupled with data-independent mass spectrometry acquisition. Nat Methods 2013, 10, 1239-1245.
9. Bruderer, R.; Bernhardt, O.M.; Gandhi, T.; Miladinovic, S.M.; Cheng, L.Y.; Messner, S.; Ehrenberger, T.; Zanotelli, V.; Butscheid, Y.; Escher, C.; Vitek, O.; Rinner, O.; Reiter, L. Extending the limits of quantitative proteome profiling with data-independent acquisition and application to acetaminophen-treated three-dimensional liver microtissues. Mol Cell Proteomics 2015, 14, 1400-1410.
10. Guo, T.; Kouvonen, P.; Koh, C.C.; Gillet, L.C.; Wolski, W.E.; Rost, H.L.; Rosenberger, G.; Collins, B.C.; Blum, L.C.; Gillessen, S.; Joerger, M.; Jochum, W.; Aebersold, R. Rapid mass spectrometric conversion of tissue biopsy samples into permanent quantitative digital proteome maps. Nat Med 2015, 21, 407-413.
11. Liu, Y.; Buil, A.; Collins, B.C.; Gillet, L.C.; Blum, L.C.; Cheng, L.Y.; Vitek, O.; Mouritsen, J.; Lachance, G.; Spector, T.D.; Dermitzakis, E.T.; Aebersold, R. Quantitative variability of 342 plasma proteins in a human twin population. Mol Syst Biol 2015, 11, 786.
12. Rost, H.L.; Rosenberger, G.; Navarro, P.; Gillet, L.; Miladinovic, S.M.; Schubert, O.T.; Wolski, W.; Collins, B.C.; Malmstrom, J.; Malmstrom, L.; Aebersold, R. OpenSWATH enables automated, targeted analysis of data-independent acquisition MS data. Nat Biotechnol 2014, 32, 219-223.
13. Schubert, O.T.; Ludwig, C.; Kogadeeva, M.; Zimmermann, M.; Rosenberger, G.; Gengenbacher, M.; Gillet, L.C.; Collins, B.C.; Rost, H.L.; Kaufmann, S.H.; Sauer, U.; Aebersold, R. Absolute Proteome Composition and Dynamics during Dormancy and Resuscitation of Mycobacterium tuberculosis. Cell Host Microbe 2015, 18, 96-108.
14. Loke, M.F.; Ng, C.G.; Vilashni, Y.; Lim, J.; Ho, B. Understanding the dimorphic lifestyles of human gastric pathogen Helicobacter pylori using the SWATH-based proteomics approach. Sci Rep 2016, 6, 26784.
15. Tang, X.; Meng, Q.; Gao, J.; Zhang, S.; Zhang, H.; Zhang, M. Label-free Quantitative Analysis of Changes in Broiler Liver Proteins under Heat Stress using SWATH-MS Technology. Sci Rep 2015, 5, 15119.
16. Wu, J.X.; Song, X.; Pascovici, D.; Zaw, T.; Care, N.; Krisp, C.; Molloy, M.P. SWATH Mass Spectrometry Performance Using Extended Peptide MS/MS Assay Libraries. Mol Cell Proteomics 2016, 15, 2501-2514.
17. Tsou, C.C.; Avtonomov, D.; Larsen, B.; Tucholska, M.; Choi, H.; Gingras, A.C.; Nesvizhskii, A.I. DIA-Umpire: comprehensive computational framework for data-independent acquisition proteomics. Nat Methods 2015, 12, 258-264, 257 p following 264.
18. Rardin, M.J.; Schilling, B.; Cheng, L.Y.; MacLean, B.X.; Sorensen, D.J.; Sahu, A.K.; MacCoss, M.J.; Vitek, O.; Gibson, B.W. MS1 Peptide Ion Intensity Chromatograms in MS2 (SWATH) Data Independent Acquisitions. Improving Post Acquisition Analysis of Proteomic Experiments. Mol Cell Proteomics 2015, 14, 2405-2419.
19. MacLean, B.; Tomazela, D.M.; Shulman, N.; Chambers, M.; Finney, G.L.; Frewen, B.; Kern, R.; Tabb, D.L.; Liebler, D.C.; MacCoss, M.J. Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 2010, 26, 966-968.
20. Teleman, J.; Rost, H.L.; Rosenberger, G.; Schmitt, U.; Malmstrom, L.; Malmstrom, J.; Levander, F. DIANA--algorithmic improvements for analysis of data-independent acquisition MS data. Bioinformatics 2015, 31, 555-562.
21. Wang, J.; Tucholska, M.; Knight, J.D.; Lambert, J.P.; Tate, S.; Larsen, B.; Gingras, A.C.; Bandeira, N. MSPLIT-DIA: sensitive peptide identification for data-independent acquisition. Nat Methods 2015, 12, 1106-1108.
22. Escher, C.; Reiter, L.; MacLean, B.; Ossola, R.; Herzog, F.; Chilton, J.; MacCoss, M.J.; Rinner, O. Using iRT, a normalized retention time for more targeted measurement of peptides. Proteomics 2012, 12, 1111-1121.
23. Cox, J.; Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 2008, 26, 1367-1372.
24. Shilov, I.V.; Seymour, S.L.; Patel, A.A.; Loboda, A.; Tang, W.H.; Keating, S.P.; Hunter, C.L.; Nuwaysir, L.M.; Schaeffer, D.A. The Paragon Algorithm, a next generation search engine that uses sequence temperature values and feature probabilities to identify peptides from tandem mass spectra. Mol Cell Proteomics 2007, 6, 1638-1655.
25. Craig, R.; Beavis, R.C. TANDEM: matching proteins with tandem mass spectra. Bioinformatics 2004, 20, 1466-1467.
26. Vaudel, M.; Barsnes, H.; Berven, F.S.; Sickmann, A.; Martens, L. SearchGUI: An open-source graphical user interface for simultaneous OMSSA and X!Tandem searches. Proteomics 2011, 11, 996-999.
27. Nesvizhskii, A.I. Protein identification by tandem mass spectrometry and sequence database searching. Methods Mol Biol 2007, 367, 87-119.
28. Reiter, L.; Rinner, O.; Picotti, P.; Huttenhain, R.; Beck, M.; Brusniak, M.Y.; Hengartner, M.O.; Aebersold, R. mProphet: automated data processing and statistical validation for large-scale SRM experiments. Nat Methods 2011, 8, 430-435.
29. Choi, M.; Chang, C.Y.; Clough, T.; Broudy, D.; Killeen, T.; MacLean, B.; Vitek, O. MSstats: an R package for statistical analysis of quantitative mass spectrometry-based proteomic experiments. Bioinformatics 2014, 30, 2524-2526.
30. Simburger, J.M.; Dettmer, K.; Oefner, P.J.; Reinders, J. Optimizing the SWATH-MS-workflow for label-free proteomics. J Proteomics 2016, 145, 137-140.
31. Sharma, K.; Schmitt, S.; Bergner, C.G.; Tyanova, S.; Kannaiyan, N.; Manrique-Hoyos, N.; Kongi, K.; Cantuti, L.; Hanisch, U.K.; Philips, M.A.; Rossner, M.J.; Mann, M.; Simons, M. Cell type- and brain region-resolved mouse brain proteome. Nat Neurosci 2015, 18, 1819-1831.
32. Beck, S.; Michalski, A.; Raether, O.; Lubeck, M.; Kaspar, S.; Goedecke, N.; Baessmann, C.; Hornburg, D.; Meier, F.; Paron, I.; Kulak, N.A.; Cox, J.; Mann, M. The Impact II, a Very High-Resolution Quadrupole Time-of-Flight Instrument (QTOF) for Deep Shotgun Proteomics. Mol Cell Proteomics 2015, 14, 2014-2029.
33. Zhang, Y.; Bilbao, A.; Bruderer, T.; Luban, J.; Strambio-De-Castillia, C.; Lisacek, F.; Hopfgartner, G.; Varesio, E. The Use of Variable Q1 Isolation Windows Improves Selectivity in LC-SWATH-MS Acquisition. J Proteome Res 2015, 14, 4359-4371.
34. Ghaemmaghami, S.; Huh, W.K.; Bower, K.; Howson, R.W.; Belle, A.; Dephoure, N.; O'Shea, E.K.; Weissman, J.S. Global analysis of protein expression in yeast. Nature 2003, 425, 737-741.
35. Schubert, O.T.; Gillet, L.C.; Collins, B.C.; Navarro, P.; Rosenberger, G.; Wolski, W.E.; Lam, H.; Amodei, D.; Mallick, P.; MacLean, B.; Aebersold, R. Building high-quality assay libraries for targeted analysis of SWATH MS data. Nat Protoc 2015, 10, 426-441.
36. Reiter, L.; Claassen, M.; Schrimpf, S.P.; Jovanovic, M.; Schmidt, A.; Buhmann, J.M.; Hengartner, M.O.; Aebersold, R. Protein identification false discovery rates for very large proteomics data sets generated by tandem mass spectrometry. Mol Cell Proteomics 2009, 8, 2405-2417.
37. Frewen, B.; MacCoss, M.J. Using BiblioSpec for creating and searching tandem MS peptide libraries. Curr Protoc Bioinformatics 2007, Chapter 13, Unit 13 17.
38. Zi, J.; Zhang, S.; Zhou, R.; Zhou, B.; Xu, S.; Hou, G.; Tan, F.; Wen, B.; Wang, Q.; Lin, L.; Liu, S. Expansion of the ion library for mining SWATH-MS data through fractionation proteomics. Anal Chem 2014, 86, 7242-7246.
39. Nagaraj, N.; Kulak, N.A.; Cox, J.; Neuhauser, N.; Mayr, K.; Hoerning, O.; Vorm, O.; Mann, M. System-wide perturbation analysis with nearly complete coverage of the yeast proteome by single-shot ultra HPLC runs on a bench top Orbitrap. Mol Cell Proteomics 2012, 11, M111 013722.
40. Shui, W.; Xiong, Y.; Xiao, W.; Qi, X.; Zhang, Y.; Lin, Y.; Guo, Y.; Zhang, Z.; Wang, Q.; Ma, Y. Understanding the Mechanism of Thermotolerance Distinct From Heat Shock Response Through Proteomic Analysis of Industrial Strains of Saccharomyces cerevisiae. Mol Cell Proteomics 2015, 14, 1885-1897.
Table 1. SWATH MS acquisition parameters evaluated in this study
Condition   Mass range (m/z)   SWATH window width (amu)   SWATH window number   Accumulation time (ms)   Cycle time (s)
1           400-1200           25                         32                    100                      3.35
2           350-1000           25                         26                    125                      3.4
3           350-1000           15                         44                    70                       3.23
4           350-1000           10                         65                    50                       3.38
5           350-1000           20                         33                    80                       2.7
6           350-1000           20                         33                    50                       1.8
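The window numbers in Table 1 are consistent with dividing the precursor mass range by the window width and rounding up, and the product of window number and accumulation time accounts for most of each cycle time. The sketch below checks this arithmetic on the tabulated values. Reading the remainder of each cycle as the MS1 survey scan plus instrument overhead is an assumption rather than something stated in the table, any overlap between adjacent windows is ignored, and the function names are hypothetical.

```python
import math

# (condition, m/z start, m/z end, window width in amu,
#  window number, accumulation time in ms, cycle time in s), copied from Table 1.
TABLE_1 = [
    (1, 400, 1200, 25, 32, 100, 3.35),
    (2, 350, 1000, 25, 26, 125, 3.4),
    (3, 350, 1000, 15, 44, 70, 3.23),
    (4, 350, 1000, 10, 65, 50, 3.38),
    (5, 350, 1000, 20, 33, 80, 2.7),
    (6, 350, 1000, 20, 33, 50, 1.8),
]

def window_count(mz_start, mz_end, width):
    """Number of fixed-width windows needed to cover the mass range."""
    return math.ceil((mz_end - mz_start) / width)

def ms2_time_s(n_windows, accumulation_ms):
    """Total MS2 accumulation time per cycle, in seconds."""
    return n_windows * accumulation_ms / 1000.0

for cond, lo, hi, width, n_win, acc_ms, cycle_s in TABLE_1:
    n_calc = window_count(lo, hi, width)
    ms2 = ms2_time_s(n_win, acc_ms)
    print(f"condition {cond}: {n_calc} windows (table: {n_win}), "
          f"MS2 time {ms2:.2f} s within a {cycle_s} s cycle")
```

For condition 5, for example, 33 windows of 80 ms give 2.64 s of MS2 accumulation within the 2.7 s cycle, which illustrates how window width, accumulation time and cycle time are coupled.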
Figure Legends
Figure 1. Impact of major acquisition parameters on proteomic quantification by SWATH analysis. (A) The total number of proteins and peptide precursors quantified by SWATH analysis of a yeast total digest. Different conditions of data acquisition are specified in Table 1. (B) The median number of data points across each fragment ion peak (DP/peak) and the median CV% of peak quantification under different data acquisition conditions.
Figure 2. SWATH MS analysis of yeast proteomic samples diluted in E. coli digest background. (A) The total number of proteins and peptide precursors quantified by SWATH MS analysis of the initial (undiluted) yeast proteomic sample and the same sample serially diluted with E. coli total digest by 2, 5 and 10 fold. The optimal condition 5 in Table 1 was employed for data acquisition. (B) Estimated cellular abundance distribution of yeast proteins measured in the initial and diluted samples. (C) The cumulative curve of CV% of yeast protein quantification in the initial and diluted samples across triplicate measurement.
Figure 3. Impact of peptide peak extraction criteria on proteomic quantification by SWATH MS analysis. The distribution of relative ratios determined for individual yeast proteins in a specific diluted sample vs the initial sample is shown in the boxplots. Median protein ratios are indicated next to the boxplots. Fragment ion peaks were extracted by Spectronaut with different criteria: (A) peptide transition q-value < 0.01; (B) q-value
50% in diluted yeast proteomic samples vs the initial sample.
Table S3 Percentage of proteins with fold changes of relative error > 50% in diluted yeast proteomic samples vs the initial sample.
for TOC only
Image courtesy of Shanshan Li (author), Copyright 2016