Multiparameter Optimization of Two Common Proteomics


Article

Multi-parameter optimization of two common proteomics quantification methods for quantifying low-abundance proteins Chengqian Zhang, Zhaomei Shi, Ying Han, Yan Ren, and Piliang Hao J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.8b00769 • Publication Date (Web): 05 Nov 2018 Downloaded from http://pubs.acs.org on November 5, 2018


Journal of Proteome Research

Multi-parameter optimization of two common proteomics quantification methods for quantifying low-abundance proteins

Chengqian Zhang,1# Zhaomei Shi,1# Ying Han,1 Yan Ren*2,3 and Piliang Hao*1

1 School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai 201210, China

2 BGI-Shenzhen, Beishan Industrial Zone 11th building, Yantian District, Shenzhen, Guangdong 518083, China

3 China National GeneBank, BGI-Shenzhen, Jinsha Road, Shenzhen, 518120, China

# These authors contributed equally to this work.

*Corresponding author Yan REN Tel: (+86)755-36307403; Email: [email protected]

Piliang HAO Tel: (+86)21-20685416, Fax: (+86)21-20685430 Email: [email protected]


Abstract

Quantitative proteomics has been extensively applied for decades to screen for differentially regulated proteins in various research areas, but its sensitivity and accuracy have been a bottleneck for many applications. Every step in the proteomics workflow can potentially affect the quantification of low-abundance proteins, but a systematic evaluation of these effects has not yet been carried out. In this work, to improve the sensitivity and accuracy of label-free quantification and tandem mass tags (TMT) labeling in quantifying low-abundance proteins, multi-parameter optimization was carried out using a complex 2-proteome artificial sample mixture for a series of steps from sample preparation to data analysis. The parameters examined were the desalting of peptides, peptide injection amount for LC-MS/MS, MS1 resolution, the length of the LC-MS/MS gradient, AGC targets, ion accumulation time, MS2 resolution, precursor co-isolation threshold, data analysis software, statistical calculation methods and protein fold changes, and the best settings for each parameter were defined. Suitable cutoffs for detecting low-abundance proteins with at least 1.5-fold and 2-fold changes were identified for the label-free and TMT methods, respectively. The use of optimized parameters will significantly improve the overall performance of quantitative proteomics in quantifying low-abundance proteins, and thus promote its application in other research areas.

Keywords: quantitative proteomics, label-free quantification, tandem mass tags, low-abundance proteins, mass spectrometry


Introduction

Quantitative proteomics using high-resolution mass spectrometry has been widely applied to screen for differentially regulated proteins across cell, tissue and plasma samples in order to study the molecular mechanisms of biological processes, discover drug targets, identify disease biomarkers, and so on.(1-4) Although thousands of papers have been published on biomarker discovery using quantitative proteomics, few biomarker candidates have made the transition to the clinic.(5) The relatively low sensitivity and accuracy of quantitative proteomics in quantifying low-abundance proteins could be one of the reasons, and it has proved to be a bottleneck for many other applications, even though shortlisted biomarker candidates are generally cross-validated using orthogonal methods such as Western blot and ELISA. It is thus pressing to improve the sensitivity and accuracy of quantitative proteomics. In recent years, with the rapid development of mass spectrometers with high resolution and high scan speed, it has become feasible to produce high-quality LC-MS/MS data for accurate protein quantification on a proteome-wide scale. Label-free quantification based on precursor ion intensity and stable isotope labeling methods, such as isobaric tag for relative and absolute quantification (iTRAQ) and tandem mass tags (TMT),(6, 7) are the two most commonly used relative quantification strategies in published studies.(8-10) Many studies have sought to improve the accuracy of both label-free quantification and stable isotope labeling methods,(11-17) but most of them focus on comparing different label-free quantification strategies or on reducing the ratio compression problem in stable isotope labeling methods. In fact, every step in the proteomics workflow can potentially affect the sensitivity and accuracy of protein quantification. For example, different laboratories may use different sample preparation methods, LC-MS/MS data acquisition methods, data analysis software and statistical calculation methods, which may have considerable effects on protein quantification. In addition, the major aim of quantitative proteomics is to identify differentially regulated proteins across multiple samples in order to find biomarker candidates or study their biological functions. Therefore, it is essential to conduct a systematic study of how these parameters affect the sensitivity and accuracy of quantitative proteomics. The identification and quantification of low-abundance proteins from complex samples have always been challenging in shotgun proteomics because of undersampling and ion suppression effects from co-eluting peptides of high abundance.(18, 19) In this work, complex 2-proteome artificial sample mixtures, in which E. coli peptides were spiked into human or mouse peptides at ratios of 1:10, 1:20, 1:30, 1:40 and 1:60, were used to optimize the parameters of quantitative proteomics for identifying and quantifying low-abundance proteins with known ratios using both label-free and stable isotope labeling methods. We evaluated various parameters throughout the shotgun proteomics workflow, including different C18 cartridges or tips used for desalting, peptide injection amount for LC-MS/MS, MS1 resolution, the length of the LC-MS/MS gradient, AGC targets, ion accumulation time, MS2 resolution, precursor co-isolation threshold, data analysis software, statistical calculation methods and protein fold change, and identified the best settings for them.

EXPERIMENTAL PROCEDURES

Chemicals and Reagents

Urea (U0631) and ammonium bicarbonate (ABC) were purchased from Sigma-Aldrich (St. Louis, MO). EDTA-free protease inhibitor cocktail tablets (05892791001) were obtained from Roche (Basel, Switzerland). TMT 6-Plex kit was purchased from Thermo Fisher Scientific. All other materials were purchased from Sigma-Aldrich unless specified otherwise.

Sample Preparation of human 293T cells, mouse 4T1 cells and E. coli cells

About five million 293T cells, five million 4T1 cells, 10 mg of E. coli cells, and 1:10 and 1:20 mixtures of E. coli and 293T cells were each resuspended in 200 µL lysis buffer (8 M urea, 50 mM ABC, 1 mM DTT) with protease inhibitor added (10 ml/tablet, 05892791001, Roche). The suspension was sonicated on ice three times for 10 s each. The protein concentration of the lysates was then determined by the bicinchoninic acid (BCA) assay. About 1 mg of lysate was reduced with 5 mM DTT at 37 °C for 2 h and alkylated with 20 mM iodoacetamide for 45 min in the dark. After the urea concentration was diluted to 1 M with 50 mM ABC, trypsin (V5111, Promega) was added at a weight ratio of 1:50, and the mixture was incubated for 16 h at 37 °C. The reactions were stopped by adding 20% formic acid (FA) until a pH < 2 was reached. Peptides were desalted with ZipTip® C18 (ZTC18S09, Merck/Millipore), Stagetip (SP301, Thermo Scientific), MonoSpin® C18 (5010-21701, SHIMADZU-GL) or Sep-Pak C18 (WAT054955, Waters) according to the manufacturers' instructions, and dried in vacuum.
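The dilution and enzyme amounts in this protocol follow from simple arithmetic, sketched below in Python. The sketch assumes that the 1:50 weight ratio is trypsin to protein and that the diluent contributes no urea; the lysate volume is a hypothetical value, since the text gives the protein amount but not the digested volume, and the function names are illustrative only.

```python
def diluent_volume(c_start, c_final, v_start):
    """Volume of diluent needed to bring v_start at c_start down to c_final.

    Uses C1 * V1 = C2 * V2 and assumes the diluent contains none of the solute.
    """
    if not 0 < c_final <= c_start:
        raise ValueError("c_final must be positive and no larger than c_start")
    v_total = c_start * v_start / c_final
    return v_total - v_start


def trypsin_mass(protein_mass_ug, protein_parts_per_enzyme=50):
    """Trypsin mass for a 1:protein_parts_per_enzyme (w/w) enzyme-to-protein ratio."""
    return protein_mass_ug / protein_parts_per_enzyme


# Hypothetical lysate volume (assumption): the text states about 1 mg of
# lysate protein but not the volume that was digested.
lysate_volume_ul = 100.0

# Diluting 8 M urea to 1 M requires seven volumes of 50 mM ABC per volume of lysate.
print(diluent_volume(8.0, 1.0, lysate_volume_ul))

# Trypsin needed for 1 mg (1000 µg) of protein at 1:50 (w/w).
print(trypsin_mass(1000.0))
```

Under these assumptions, every volume of lysate in 8 M urea needs seven volumes of diluent, and 1 mg of protein needs 20 µg of trypsin.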

TMT Labeling and High-pH Reverse Phase (Hp-RP) Fractionation

Complex 2-proteome artificial peptide mixtures were generated by spiking E. coli peptides into peptides from 293T or 4T1 cell lysates at ratios of 1:20, 1:30 and 1:40, respectively. The 2-proteome peptide mixtures with different concentrations of E. coli peptides were labeled using the 6-plex TMT kit according to the manufacturer’s protocol. Hp-RP fractionation was done as previously described with slight modifications.(20) Briefly, the labeled peptides were dissolved in 100 μl buffer A (2% ACN, pH adjusted to 10 with ammonium hydroxide), injected completely with an autosampler and fractionated using an XBridge C18 column (4.6×250 mm, 5 μm, 130 Å; Waters, Milford, MA) on an Ultimate 3000 UPLC system monitored at 280 nm. Sixty fractions were collected with an 85 min gradient of 5% buffer B (98% ACN, pH adjusted to 10 with ammonium hydroxide) for 2 min, 5%–8% B for 5 min, 8%–18% B for 28 min, 18%–36% B for 25 min, 36%–95% B for 6 min, 95% B for 4 min, 95%–5% B for 5 min, followed by 10 min at 5% B at a flow rate of 1 ml/min. The fractions were then dried in vacuum, pooled into 20 fractions as described, and redissolved in 0.1% FA for LC–MS/MS analysis.
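A gradient program like the Hp-RP one above is easy to mistype, so it can help to encode it as a table and check it. The following Python sketch uses the segment durations and buffer B percentages stated in the text; the data layout and function names are illustrative assumptions, and linear interpolation within each segment is assumed.

```python
# Each segment: (duration in min, %B at start, %B at end), as listed in the text.
HP_RP_GRADIENT = [
    (2, 5, 5),
    (5, 5, 8),
    (28, 8, 18),
    (25, 18, 36),
    (6, 36, 95),
    (4, 95, 95),
    (5, 95, 5),
    (10, 5, 5),
]


def total_minutes(segments):
    """Total run time of the gradient program."""
    return sum(duration for duration, _, _ in segments)


def is_continuous(segments):
    """True if every segment starts at the %B where the previous one ended."""
    return all(prev[2] == nxt[1] for prev, nxt in zip(segments, segments[1:]))


def percent_b_at(segments, t):
    """%B at time t (min), assuming linear change within each segment."""
    if t < 0:
        raise ValueError("t must be non-negative")
    elapsed = 0.0
    for duration, start, end in segments:
        if t <= elapsed + duration:
            return start + (end - start) * (t - elapsed) / duration
        elapsed += duration
    raise ValueError("t is beyond the end of the gradient")


print(total_minutes(HP_RP_GRADIENT))     # matches the stated 85 min
print(is_continuous(HP_RP_GRADIENT))
print(percent_b_at(HP_RP_GRADIENT, 21))  # midway through the 8%-18% B segment
```

The segment durations sum to the stated 85 min and each segment begins where the previous one ends, so the program as written is internally consistent.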

LC-MS/MS

Peptides were separated and analyzed on an Easy-nLC 1000 system coupled to a Q Exactive HF (Thermo Scientific). About 1 µg of peptides was separated on a home-made column (75 µm × 15 cm) packed with C18 AQ (5 µm, 300 Å, Michrom BioResources, Auburn, CA, USA) at a flow rate of 300 nL/min. Mobile phase A (0.1% formic acid in 2% ACN) and mobile phase B (0.1% formic acid in 98% ACN) were used to establish a 60 min gradient comprising 2 min of 5% B, 40 min of 5-26% B, 5 min of 26-30% B, 1 min of 30-35% B, 2 min of 35-90% B, 10 min of 90% B, 0.5 min of 80%-5% B and 5.5 min of 5% B. For the 120 min gradient, 2 min of 2%-4% B, 90 min of 4-22% B, 10 min of 22-30% B, 7 min of 30-45% B, 1 min of 45-90% B and 10 min of 90% B were used. Peptides were then ionized by electrospray at 2.1 kV. A full MS spectrum (375-1400 m/z range) was acquired at a resolution of 120,000 at m/z 200 and a maximum ion accumulation time of 20 ms. Dynamic exclusion was set to 30 s. Resolution for HCD MS/MS spectra was set to 30,000 at m/z 200. The AGC targets for MS1 and MS2 were set to 3E6 and 1E5, respectively. The 20 most intense ions above a threshold of 1.7E4 counts were selected for fragmentation by HCD with a maximum ion accumulation time of 60 ms. MS2 isolation widths of 1.2 m/z and 1.6 m/z units were used for TMT samples and label-free samples, respectively. Singly charged ions and ions with unassigned charge states were excluded from MS/MS. For HCD, normalized collision energy was set to 25% for label-free methods and 30% for TMT methods.

Data Analysis

The raw data were processed using MaxQuant with the integrated Andromeda search engine (v.1.5.4.1) and Proteome Discoverer software (PD) (version 2.1.0.81, Thermo Scientific). The UniProt E. coli protein database (release 2017_01, 4306 sequences), together with either the mouse protein database (release 2016_07, 49863 sequences) or the human protein database (release 2016_07, 70630 sequences), was used for database searches of label-free samples. The UniProt E. coli protein database (release 2017_01, 4306 sequences) and the human protein database (release 2016_07, 70630 sequences) were used for database searches of TMT-labeled samples. Trypsin/P was set as the enzyme, and two missed cleavage sites were allowed. Mass tolerance was set to 5 ppm for precursor ions and 0.02 Da for fragment ions. Carbamidomethylation on Cys was specified as the fixed modification, and oxidation (M), deamidation (NQ) and acetylation (Protein N-term) were set as variable modifications. False discovery rate (FDR) thresholds for protein, peptide and modification site were specified at 1%. Minimum peptide length was set at 7. For the TMT quantification method, TMT-6plex was selected. For the label-free quantification method using MaxQuant, the options “Second peptides”, “Match between runs” and “Dependent peptides” were enabled.(21, 22) All other parameters were set to default values for both software packages. A Student's t-test was used to verify the significance of the differences in each comparison. Principal Component Analysis (PCA) was done using MetaboAnalyst (http://www.metaboanalyst.ca/).(23)
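For readers unfamiliar with the statistical test mentioned above, here is a minimal Python sketch of a two-sample Student's t-test with pooled variance, applied to one protein measured in triplicate in two samples. It is not the authors' script: the intensity values, the use of log2-transformed intensities, and the 0.05 two-sided significance level are illustrative assumptions. The critical value 2.776 is the standard two-sided 5% value of the t distribution with 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic (equal variances assumed) and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical log2 intensities of one protein in two samples (triplicates).
sample_a = [20.0, 20.1, 19.9]
sample_b = [21.0, 21.1, 20.9]

t_stat, df = student_t(sample_a, sample_b)

# Two-sided critical value for alpha = 0.05 and df = 4 (illustrative cutoff).
T_CRIT_DF4 = 2.776
significant = df == 4 and abs(t_stat) > T_CRIT_DF4

# Fold change of B over A recovered from the difference of log2 means.
fold_change = 2 ** (mean(sample_b) - mean(sample_a))

print(t_stat, df, significant, fold_change)
```

With these made-up values the difference is significant and corresponds to a fold change of about 2. With real data, a statistics library that returns exact p-values would normally be preferable to a fixed critical value.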

RESULTS AND DISCUSSION

In this work, complex 2-proteome artificial sample mixtures were used to evaluate the sensitivity and accuracy of quantitative proteomics for identifying and quantifying low-abundance proteins using both label-free and TMT labeling methods. The 2-proteome artificial mixtures were generated by spiking E. coli peptides into human or mouse peptides at ratios of 1:20, 1:30, 1:40 and 1:60, respectively. E. coli proteins were used to mimic relatively low-abundance proteins with known ratios of 1:2 or 1:1.5 between different samples, while human or mouse proteins were constant across samples. Therefore, all E. coli proteins detected as differential between the samples are true positives, and all human or mouse proteins detected as differential are false positives. The quantitative accuracy was calculated by dividing the number of true positives by the total number of detected differential proteins. The evaluated parameters cover the workflow of quantitative proteomics and include (1) desalting process and peptide injection amount for LC-MS/MS, (2) MS1 resolution and the length of the LC-MS/MS gradient, (3) AGC targets and ion accumulation time, (4) MS2 resolution, (5) precursor co-isolation threshold, (6) data analysis software, (7) statistical calculation methods and (8) protein fold changes. See Figure 1 for the detailed flowchart. The mass spectrometry proteomics data have been deposited in iProX (http://www.iprox.org) with the project ID IPX0001235000. They can be downloaded using the following webpage and password (http://www.iprox.org/page/PSV023.html;?url=1541063864875jWUB, Password: OiLr).
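The benchmark logic described here can be sketched in a few lines of Python. The sketch assumes that the host (human or mouse) peptide amount is the same in both samples being compared, so the expected E. coli fold change is simply the ratio of the two spike-in dilutions. The function names, organism labels and the example protein list are hypothetical.

```python
def expected_fold_change(host_parts_a, host_parts_b):
    """Expected E. coli fold change of sample A over sample B.

    A 1:20 spike is passed as 20 and a 1:40 spike as 40; the host amount
    is assumed to be identical in both samples.
    """
    return host_parts_b / host_parts_a


def quantitative_accuracy(differential_organisms):
    """True positives divided by all proteins detected as differential.

    Only E. coli proteins count as true positives, because host proteins
    do not change between samples.
    """
    if not differential_organisms:
        raise ValueError("no differential proteins were detected")
    true_positives = sum(1 for org in differential_organisms if org == "ecoli")
    return true_positives / len(differential_organisms)


# Pairs of spike-in ratios give the known 2-fold and 1.5-fold changes.
print(expected_fold_change(20, 40))  # 1:20 vs 1:40
print(expected_fold_change(20, 30))  # 1:20 vs 1:30

# Hypothetical result list: organism of each protein called differential.
calls = ["ecoli"] * 18 + ["human"] * 2
print(quantitative_accuracy(calls))
```

In this made-up example, 18 of the 20 proteins called differential are from E. coli, giving a quantitative accuracy of 0.9.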

1. Desalting process and peptide injection amount for LC-MS/MS

In shotgun proteomics, it is necessary to desalt peptides before LC-MS/MS, since proteins are generally digested in buffers containing denaturing reagents, which may affect peptide ionization and block the analytical column. The choice of improper desalting tips or cartridges may result in peptide loss, which affects the identification and quantification of low-abundance proteins. Here, four commonly used C18 desalting tips/cartridges, namely ZipTip C18, Stagetip C18, MonoSpin C18 and Sep-Pak C18, were evaluated for protein identification and label-free quantification using the tryptic digests of 1:10 and 1:20 mixtures of E. coli and 293T cells. As shown in Figure 2A, the peptides identified from the same sample desalted using the four different tips/cartridges show a certain degree of overlap, and the highest number of unique peptides was identified from Stagetip, indicating that the C18 material used in Stagetip may be different from that in the other 3 tips/cartridges. In fact, 5% FA can be used for Stagetip, whereas 0.1% TFA or 5% acetic acid is used for the other tips/cartridges according to the manufacturers' instructions. In addition, we evaluated whether the desalting process affected label-free quantification. Principal Component Analysis (PCA) of the MaxQuant LFQ intensity values for all proteins identified from the 4 different desalting tips/cartridges shows that Sep-Pak C18 and MonoSpin C18 are quite similar, whereas Stagetip differs markedly from the other 3 tips/cartridges (Figure 2B). This agrees with the peptide identification result above. The large difference between Stagetip and the other three desalting tips/cartridges in peptide identification and protein quantification may be due to the use of a different C18 material in Stagetip. Therefore, the same desalting tips/cartridges should be used throughout a protein quantification experiment. We analyzed the pI, GRAVY (grand average of hydropathy) value and molecular weight of the peptides identified from each desalting method and found that they were nearly identical, but smaller, more hydrophilic peptides tended to be identified when the peptide injection amount decreased (Table 1). Different desalting tips/cartridges have different loading capacities. We evaluated the effect of 10-fold overloading of desalting tips/cartridges on peptide and protein identification and found that it had no significant effect (Supplemental Figure 1), and overloading did not result in more peptide recovery. To determine whether the use of different desalting tips/cartridges affected the label-free quantification of low-abundance E. coli proteins, the 1:10 and 1:20 peptide mixtures of E. coli and 293T cells were analyzed by LC-MS/MS and compared in triplicate, and only proteins with 1.5 fold changes and P