Missing Value Monitoring Enhances the Robustness in Proteomics Quantitation


Article

Missing value Monitoring enhances the robustness in proteomics quantitation
Vittoria Matafora, Andrea Corno, Andrea Ciliberto, and Angela Bachi
J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.6b01056 • Publication Date (Web): 10 Mar 2017
Downloaded from http://pubs.acs.org on March 11, 2017


Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Proteome Research

Missing value Monitoring enhances the robustness in proteomics quantitation

Vittoria Matafora§, Andrea Corno§, Andrea Ciliberto§ and Angela Bachi*§

§ IFOM, the FIRC Institute of Molecular Oncology, Milan, Italy

KEYWORDS: Missing values, proteomics, mass spectrometry, data-independent acquisition (DIA), data-dependent acquisition (DDA), Missing value Monitoring workflow (MvM), cell cycle.

ABSTRACT

In global proteomic analysis, protein abundance is estimated to span from millions to fewer than 100 copies per cell. The challenge of protein quantitation by classic shotgun proteomic techniques lies in the missing values for peptides belonging to low-abundance proteins, which lower run-to-run reproducibility and affect downstream statistical analysis. Here, we present a new analytical workflow, MvM (Missing value Monitoring), able to recover quantitation of missing values generated by shotgun analysis. In particular, we used confident DDA quantitation only for proteins measured in all the runs, while we filled the missing values with DIA analysis using the library previously generated in DDA. We analyzed cell cycle regulated proteins, as they are low abundance proteins with highly dynamic expression levels. Indeed, we found that cell cycle related proteins are the major components of the missing values-rich proteome. Using the MvM workflow, we doubled the number of robustly quantified cell cycle related proteins and reduced the number of missing values, achieving robust quantitation for proteins above ~50 molecules per cell. MvM yields lower quantification variance among replicates for low abundance proteins than DDA analysis, demonstrating the potential of this novel workflow to measure low abundance, dynamically regulated proteins.
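The selection rule summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes one plausible reading of the rule: a protein quantified by DDA in every run keeps its DDA values, while a protein with any missing DDA value is quantified entirely from DIA. The function `mvm_merge`, the `QuantTable` layout, and the toy intensities are all hypothetical names and values introduced only for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical layout: protein -> one intensity per run; None marks a missing value.
QuantTable = Dict[str, List[Optional[float]]]


def is_complete(values: List[Optional[float]]) -> bool:
    """True when the protein was quantified in every run."""
    return all(v is not None for v in values)


def mvm_merge(dda: QuantTable, dia: QuantTable) -> Dict[str, dict]:
    """Keep complete DDA rows; fall back to complete DIA rows otherwise.

    Assumption (not stated in the source): DDA and DIA intensities are not
    mixed within one protein, so a DIA rescue replaces the whole row.
    """
    merged = {}
    for protein, dda_values in dda.items():
        dia_values = dia.get(protein)
        if is_complete(dda_values):
            merged[protein] = {"source": "DDA", "values": list(dda_values)}
        elif dia_values is not None and is_complete(dia_values):
            merged[protein] = {"source": "DIA", "values": list(dia_values)}
        else:
            # Still incomplete after DIA: flag it rather than impute.
            merged[protein] = {"source": "unresolved", "values": list(dda_values)}
    return merged


# Toy example with three runs (made-up numbers, arbitrary units).
dda_table = {
    "protein_A": [1.0e7, 1.1e7, 0.9e7],   # complete in DDA
    "protein_B": [2.0e5, None, 2.2e5],    # one missing value in DDA
    "protein_C": [None, None, 1.0e4],     # mostly missing in DDA
}
dia_table = {
    "protein_B": [3.1e5, 3.0e5, 3.3e5],   # complete in DIA
    "protein_C": [1.2e4, None, 1.1e4],    # still incomplete in DIA
}

result = mvm_merge(dda_table, dia_table)
for name, row in result.items():
    print(name, row["source"], row["values"])
```

In this sketch proteins that remain incomplete after DIA are flagged rather than imputed, which is a design choice of the example and not a statement about how the workflow handles them.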

INTRODUCTION

Traditional label-free proteomics methods use data-dependent acquisition (DDA) to measure proteome differences among multiple biological states. In DDA, a hybrid mass spectrometer first performs a survey scan and then selects the most abundant precursor ions for fragmentation.1 This is the most efficient approach in terms of protein identification and quantitation across multiple experiments; however, its high reproducibility among replicates holds mainly for high abundance proteins, while low abundance proteins suffer from the so-called “missing value problem”.2,3 Indeed, although latest-generation mass spectrometers are increasingly fast, the majority of the precursor ions detected in MS are not selected for MS/MS, as common MS methods select the 12 most abundant ions per cycle (top-12), so that individual peptides fail to be identified in some of the replicates. It has been estimated that only one third of the features detected across experiments is selected for fragmentation.2 Notably, the stochastic nature of precursor selection mostly affects low abundance proteins, as their identification relies on only a few peptides.4 The presence of missing values in low-abundance proteins affects downstream statistical analysis, which requires complete datasets5; as a result, the missing value problem is emerging as the Achilles' heel of the DDA approach.

Recently, a new DDA analytical method, DeMix-Q, has been suggested to address this issue.6 DeMix-Q introduces a novel scoring scheme, based on covariation of peptide abundances, for quality control in “peptide identity propagation” across multiple experiments. This method reduces quantification variation, its only limitation being co-eluting precursors or isobaric species, whose quantitation remains difficult. In contrast to DDA, targeted methods partially overcome this limitation by searching for transition ions of a predefined set of peptides along the entire LC gradient. These methods, though, require a priori knowledge of the precursor ions and prior optimization of transition detection.

More recently, data-independent acquisition (DIA) has proven to reduce the occurrence of missing values.7-10 Unlike DDA, DIA does not select individual precursor ions for fragmentation: all the precursors falling within a series of predefined m/z isolation windows are fragmented.11 DIA displays superior reproducibility, overcoming the stochastic nature of classic shotgun analysis,9 but presents several limitations, such as increased spectral complexity and the requirement of a DDA-derived spectral library for peptide identification.

Here, we propose a novel workflow named Missing value Monitoring (MvM), able to improve quantitation of low abundance proteins. MvM combines the enhanced protein identification capability of DDA with the increased peptide coverage of DIA, mending the weakness of the shotgun method. As discussed above, even though new-generation mass spectrometers have reached high identification rates, allowing thousands of proteins to be identified in a single-shot experiment, a parallel improvement in quantitation robustness has not yet been achieved because of the incompleteness of data among replicates.
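The contrast between the two acquisition modes can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not a model of the instrument. The precursor names, intensities, and the m/z value are hypothetical, and a toy top-3 selection stands in for the top-12 method. The DIA scan range (490-910 m/z) and window width (20 m/z) are the values reported in the Experimental Section.

```python
from typing import Dict, List, Optional, Tuple

Window = Tuple[float, float]


def select_top_n(survey_scan: Dict[str, float], n: int) -> List[str]:
    """DDA-style choice: the n most intense precursors of one survey scan."""
    return sorted(survey_scan, key=survey_scan.get, reverse=True)[:n]


def dia_windows(mz_start: float, mz_end: float, width: float) -> List[Window]:
    """DIA-style choice: contiguous isolation windows tiling the scan range."""
    windows = []
    lower = mz_start
    while lower < mz_end:
        windows.append((lower, min(lower + width, mz_end)))
        lower += width
    return windows


def covering_window(mz: float, windows: List[Window]) -> Optional[Window]:
    """Return the isolation window containing mz, or None if out of range."""
    for lower, upper in windows:
        if lower <= mz < upper:
            return (lower, upper)
    return None


# Hypothetical survey-scan intensities for the same sample in two runs.
# The two low-abundance precursors swap rank because of small fluctuations.
run_a = {"abundant_1": 9.0e7, "abundant_2": 5.0e7, "low_x": 2.1e6, "low_y": 2.0e6}
run_b = {"abundant_1": 8.8e7, "abundant_2": 5.2e7, "low_x": 1.9e6, "low_y": 2.2e6}

picked_a = select_top_n(run_a, n=3)
picked_b = select_top_n(run_b, n=3)

# Precursors fragmented in only one of the two runs: each is a missing value.
missing = set(picked_a) ^ set(picked_b)
print("selected in one run only:", sorted(missing))

# In DIA every precursor inside the scan range falls in some window in every run.
windows = dia_windows(490.0, 910.0, 20.0)
low_x_mz = 612.3  # hypothetical precursor m/z
print(len(windows), "windows;", "low_x is covered by", covering_window(low_x_mz, windows))
```

The abundant precursors are selected in both runs, while the two low-abundance ones are each fragmented in a single run, which is the pattern behind the missing value problem. Under DIA the same precursor is fragmented in every run because its window is acquired regardless of intensity.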

Our workflow attempts to address this issue, focusing on the optimization of quantitation rather than identification. Specifically, we first ran a shotgun analysis and used high-confidence DDA quantitation for the proteins identified in all the runs, whereas for the proteins poorly quantified among the replicates we supplied the missing values by data-independent acquisition, using the library previously generated in the DDA analysis. To prove the validity of our workflow, we applied MvM to challenging biological samples. We used the budding yeast Saccharomyces cerevisiae as a model system for the analysis of the cell cycle regulated proteome.12 In particular, we compared the proteins expressed in yeast arrested in mitosis, upon overexpression of the mitotic spindle checkpoint component MAD2 or nocodazole treatment, with the proteins expressed in cells arrested in G1 phase by alpha-factor treatment.13,14 Our method enabled robust quantitation, lowering quantification variation for high as well as low abundance proteins, while maintaining the high proteome coverage typical of DDA.

EXPERIMENTAL SECTION

Protein Lysate Preparation

The lysate was prepared by collecting 20 mL of cell culture, which was centrifuged at 4000 rpm for 2 min at room temperature. The supernatant was discarded, and the pellet was resuspended in 1 mL of 100 mM Tris/HCl pH 7.6. The sample was vortexed, transferred into a 2 mL Eppendorf tube and centrifuged at 13200 rpm for 3 min at 4 °C. The supernatant was removed and the sample was rapidly cooled in dry ice plus denatured alcohol. The sample was then stored at -80 °C. Once thawed, the sample was resuspended in 80 µL of 100 mM Tris pH 7.6, 100 mM dithiothreitol, 5% SDS and heated at 95 °C for 5 min. For complete lysis, glass beads were added and the sample was vortexed for 10 min at room temperature. Forty microliters of 100 mM Tris/HCl pH 7.6, 100 mM dithiothreitol, 5% SDS were then added. The sample was transferred into a new Eppendorf tube and centrifuged at 13200 rpm for 5 min at room temperature. The supernatant was transferred into a new Eppendorf tube and stored at -20 °C.


Protein Digests for MS Analysis

Yeast samples were digested in solution with trypsin using the FASP protocol15 with spin ultrafiltration units with a nominal molecular weight cutoff of 30 kDa. Briefly, 120 µg of yeast proteins were added to 200 µL of 8 M urea in 0.1 M Tris/HCl, pH 8.5 (UA buffer) and the samples were transferred to YM-30 Microcon filters (Cat No. MRCF0R030, Millipore). Samples were centrifuged at 14000×g for 15 min. The filters were then washed three times with 400 µL of UA buffer, followed by centrifugation at 14000×g for 15 min. The proteins were reduced with 0.01 M DTT in UA buffer for 30 min at room temperature. The filters were then washed twice with 400 µL of UA buffer, followed by centrifugation at 14000×g for 15 min. Proteins were then alkylated by adding 100 µL of 0.05 M iodoacetamide in 8 M urea, and the samples were incubated in the dark for 5 min. Filters were washed twice with 100 µL of UA buffer and then twice with 100 µL of 40 mM NH4HCO3. Finally, 4 µg of trypsin were added in 95 µL of 40 mM NH4HCO3 plus 10 µL of 120 mM CaCl2. Samples were incubated overnight at 37 °C, then 1 µg of trypsin was added for a further 3 h of incubation. Released peptides were collected by centrifugation and purified on a C18 StageTip (Proxeon Biosystems, Denmark).16 For the analysis of technical replicates, the peptide concentrate was divided into four independent samples.

Mass Spectrometry Analysis

For both DDA and DIA, 1 µg of digested sample was injected onto a quadrupole Orbitrap Q Exactive HF mass spectrometer (Thermo Scientific). Peptide separation was achieved on a linear gradient from 95% solvent A (2% acetonitrile, 0.1% formic acid) to 55% solvent B (80% acetonitrile, 0.1% formic acid) over 222 min and from 55 to 100% solvent B in 3 min, at a constant flow rate of 0.25 µL/min, on a UHPLC Easy-nLC 1000 (Thermo Scientific). The LC system was connected to a 23-cm fused-silica emitter of 75 µm inner diameter (New Objective, Inc., Woburn, MA, USA), packed in-house with ReproSil-Pur C18-AQ 1.9 µm beads (Dr. Maisch GmbH, Ammerbuch, Germany) using a high-pressure bomb loader (Proxeon, Odense, Denmark).

The mass spectrometer was operated in DDA mode with dynamic exclusion enabled (exclusion duration of 15 seconds), MS1 resolution of 70,000 at m/z 200, MS1 automatic gain control target of 3 × 10⁶, MS1 maximum fill time of 60 ms, MS2 resolution of 17,500, MS2 automatic gain control target of 1 × 10⁵, MS2 maximum fill time of 60 ms, and MS2 normalized collision energy of 25. For each cycle, one full MS1 scan over the range 300-1650 m/z was followed by 12 MS2 scans using an isolation window size of 2.0 m/z.

In data-independent mode, the mass spectrometer was operated with an MS1 scan at a resolution of 35,000 at m/z 200, automatic gain control target of 1 × 10⁶, and scan range of 490-910 m/z, followed by a DIA scan with a loop count of 10. DIA settings were as follows: window size of 20 m/z, resolution of 17,500, automatic gain control target of 1 × 10⁶ and normalized collision energy of 30.

All the proteomic data (raw files, total proteins and peptides identified with relative intensities, and search parameters) have been deposited in the PeptideAtlas repository (PASS00828).

Database Search and Spectral Library Construction

MS analysis was performed as reported previously.17 Raw MS files were processed with MaxQuant software (1.5.1.0)18 making use of the Andromeda search engine.19 MS/MS peak lists were searched against the UniProtKB/Swiss-Prot yeast complete proteome protein sequence database (release 2014, 6643 entries), using trypsin specificity with up to two missed cleavages allowed. Searches were performed selecting alkylation of cysteine by carbamidomethylation as fixed modification, and oxidation of methionine and N-terminal acetylation as variable modifications. Mass tolerance was set to 5 ppm and 10 ppm for parent and fragment ions, respectively. A reverse decoy database was generated within Andromeda and the False Discovery Rate (FDR) was set to