
Article

A systematic evaluation of the use of human plasma and serum for mass spectrometry-based shotgun proteomics JIAYI LAN, Antonio Núñez Galindo, James Doecke, Christopher Fowler, Ralph N. Martins, Stephanie R. Rainey-Smith, Ornella Cominetti, and Loïc Dayon J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.7b00788 • Publication Date (Web): 16 Feb 2018 Downloaded from http://pubs.acs.org on February 18, 2018


Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Proteome Research

A systematic evaluation of the use of human plasma and serum for mass spectrometry-based shotgun proteomics

Jiayi Lan1, Antonio Núñez Galindo1, James Doecke2, Christopher Fowler3, Ralph N. Martins4, 5, Stephanie R. Rainey-Smith4, Ornella Cominetti1 and Loïc Dayon1, *

1 Proteomics, Nestlé Institute of Health Sciences, Lausanne, Switzerland

2 CSIRO Health and Biosecurity/Australian E-Health Research Centre, Brisbane, Queensland, Australia

3 The Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Parkville, Victoria, Australia

4 Centre of Excellence for Alzheimer's Disease Research and Care, School of Medical and Health Sciences, Edith Cowan University, Joondalup, Western Australia, Australia

5 Department of Biomedical Sciences, Macquarie University, New South Wales, Australia

*Corresponding author: Tel: +41 21 632 6114; Fax: +41 21 632 6499 E-mail address: [email protected]


ACS Paragon Plus Environment


Abstract

Over the last two decades, EDTA-plasma has been used as the preferred sample matrix for human blood proteomic profiling. Serum has also been employed widely. Only a few studies have assessed the difference and relevance of the proteome profiles obtained from plasma samples, such as EDTA-plasma or lithium-heparin-plasma, and serum. A more complete evaluation of the use of EDTA-plasma, heparin-plasma, and serum would greatly expand the comprehensiveness of shotgun proteomics of blood samples. In this study, we evaluated the use of heparin-plasma with respect to EDTA-plasma and serum to profile blood proteomes using a scalable automated proteomic pipeline (ASAP2). The use of plasma and serum for mass spectrometry-based shotgun proteomics was first tested with commercial pooled samples. The proteome coverage consistency and the quantitative performance were compared. Further, protein measurements in EDTA-plasma and heparin-plasma samples were comparatively studied using matched sample pairs from 20 individuals from the Australian Imaging, Biomarkers and Lifestyle (AIBL) Study. Herein, we identified 442 proteins in common between EDTA-plasma and heparin-plasma samples. Overall agreement of the relative protein quantification between the sample pairs demonstrated that shotgun proteomics using workflows such as the ASAP2 is suitable in analyzing heparin-plasma, and that such sample type may be considered in large-scale clinical research studies. Moreover, the partial proteome coverage overlaps (e.g., about 70%) showed that measures from heparin-plasma could be complementary to those obtained from EDTA-plasma.


1. Introduction

Profiling of proteomes from human blood samples is key for clinical research and biomarker discovery. Human blood is the most complex human-derived biofluid for proteomic analysis, and can be considered as an assembly of tissue proteomes.1 From human whole blood as the parent source, several types of samples can be derived using different collection procedures. One approach consists of preventing the coagulation of blood, using anticoagulants such as ethylenediaminetetraacetic acid (EDTA) or lithium-heparin, and collecting the supernatant plasma after centrifugation. An alternative method allows blood samples to clot, and serum is obtained by discarding the clot after centrifugation. Nonetheless, serum is often considered to be a possibly biased sample type for proteomics as ex vivo clotting procedures can lead to the neo-generation of peptides and uncontrollable co-precipitation of some proteins.2 In plasma samples, proteins like fibrinogen and other clotting factors are preserved, thus presenting higher protein content than serum samples.3 Plasma was recommended over a decade ago by The Specimens Committee4 as the preferable blood sample for proteomic analysis, due to the lesser extent of ex vivo degradation compared to serum.5, 6 Since in heparin-plasma enzymatic activity is preserved and in vitro proteolysis processes may continue,1, 7 EDTA was selected by The Human Proteome Organization (HUPO),8 and was previously reported,2 as the most suitable anticoagulant for plasma collection in proteomic analysis. The influences of blood collection procedures were summarized by Bowen and co-workers.3, 9, 10 As a matter of fact, very few studies have assessed the differences of proteomic profiles between EDTA-plasma, heparin-plasma, and serum;4, 7, 11-15 consequently, the evaluation specific to shotgun proteomics within larger cohorts is still lacking. Considering the large number of existing clinical studies and biobanks that collected alternatives to EDTA-plasma such as heparin-plasma or serum, a systematic evaluation of the use of heparin-plasma and serum samples for mass spectrometry (MS)-based shotgun proteomics is of great value.


We initially developed a scalable automated proteomics pipeline (ASAP2) for analyzing EDTA-plasma samples, with the purpose of discovering plasma protein biomarkers within large-scale clinical research studies.16, 17 In order to perform proteomics on large clinical cohorts of people, proteomic workflows need to be sufficiently accurate and reproducible throughout the analysis of all samples. The precision, accuracy, linearity, robustness and throughput of ASAP2 have been assessed17 based on EDTA-plasma samples, permitting its deployment across large-scale studies. ASAP2 is a bottom-up MS-based proteomics workflow that uses tandem mass tag (TMT)18, 19 for relative quantification of proteins between samples. The ASAP2 pipeline consists first of removing 14 highly abundant proteins in plasma with immuno-affinity liquid chromatography (LC), followed by a buffer exchange step. The rest of the workflow is automated in a 96-well plate format, and includes steps of: (i) reduction, alkylation, enzymatic digestion; (ii) TMT labeling and pooling; (iii) reversed-phase (RP) solid-phase extraction (SPE) purification; and (iv) strong cation exchange (SCX) SPE purification. Finally, the processed samples are analyzed with RP-LC tandem MS (MS/MS). Expanding the range of use of ASAP2 to heparin-plasma and serum as well as benchmarking results from such samples with the preferred EDTA-plasma will be valuable for blood proteomics and clinical research.

Herein, we examined whether blood sample collection procedures for EDTA-plasma, lithium-heparin-plasma, and serum induced any differences in: (i) the proteome coverage; (ii) the quantitative precision and accuracy; and (iii) the inter-day and intra-day repeatability. We further assessed the qualitative and quantitative differences in EDTA-plasma versus heparin-plasma proteomic profiles in a well-established clinical research study, using matched samples from 20 older individuals.

2. Experimental Section

2.1 Human samples


Matching commercial human EDTA-plasma, lithium-heparin-plasma, and serum samples, pooled from several individuals, were purchased from Analytical Biological Services (Wilmington, DE). Plasma and serum donors were previously qualified as routine whole blood donors. Venous punctures from four healthy donors (male, 32-40 years old) were performed in the morning without a fasting period. After the blood was drawn into the different tubes (10 mL Vacutainer from Becton Dickinson, Franklin Lakes, NJ) for plasma and serum, these were capped, properly labeled, and thoroughly mixed. The tubes were then placed in the quarantine refrigerator or freezer per end-user’s specifications. For the plasma samples, after collection into tubes containing anticoagulants (either EDTA or lithium-heparin), these were centrifuged in a refrigerated unit at 5000 × g for 15 min at 4 °C. Then, samples were carefully removed from the centrifuge so as not to re-suspend cells, and placed on the plasma extractor unit for final extraction. For the serum samples, no anticoagulant was used in the collection tubes. Immediately after collection, the serum tubes were transferred to a cold sterile pack in order to avoid clotting. Then, the supernatant was left to clot for up to 48 h at room temperature, after which time, samples were put back into the centrifuge and spun down at 5000 × g for 20 min at 4 °C. Finally, the serum was separated from the coagulated fraction, frozen, and stored until further use. Frozen pooled plasma and serum samples were sent for proteomic analysis. After thawing, 40 μL aliquots were made and stored at -80 °C before the proteomic experiments.

In addition, 20 human matched plasma sample pairs (i.e., 20 EDTA-plasma and 20 lithium-heparin-plasma samples) were obtained from the Australian Imaging, Biomarkers and Lifestyle (AIBL) Study of ageing consortium (https://aibl.csiro.au/). The AIBL Study was approved by the local Ethics Committees. In the present report, we included 10 cognitively healthy volunteers and 10 patients diagnosed with Alzheimer’s disease; 12 females and 8 males, aged from 62 to 88 years. Blood was collected using the vacuum method of venipuncture from the antecubital vein. Twenty-one gauge 0.75 Winged Infusion Sets were used, with the attachment of the multi-adapter to aid in blood collection. Fasting blood samples were collected into tubes containing EDTA with prostaglandin E1 where quiescence of platelets was required (+PGE1, 33.3 ng/mL, Sapphire Biosciences, Waterloo, Australia) and tubes containing liquid lithium-heparin. The tubes were placed on a rocker for 45 to 60 min following blood collection. The tubes were then spun at 200 × g at 20 °C for 10 min, with no brake, in an Allegra™ X-15R centrifuge (Beckman Coulter, Brea, CA). Following the spin, tubes were removed from the centrifuge, and the platelet-rich plasma supernatants were transferred separately into fresh tubes. The EDTA+PGE1 and lithium-heparin tubes were then spun again, this time at 800 × g for 15 min at 20 °C with the brake on, before the resultant supernatant was transferred to a new tube. Plasma was aliquoted into 250 µL and 500 µL volumes in Cryobank vials (Nalge Nunc International, Rochester, NY). Samples were snap-frozen, prior to being transferred to a liquid nitrogen facility for long-term storage. Frozen samples were sent for proteomic analysis. After thawing, 40 μL aliquots were made and stored at -80 °C before the proteomic experiments.

2.2 Materials

Iodoacetamide (IAA), tris(2-carboxyethyl)phosphine hydrochloride (TCEP), triethylammonium hydrogen carbonate buffer (TEAB) (1 M, pH 8.5), sodium dodecyl sulfate (SDS), and β-lactoglobulin (LACB) from bovine milk were purchased from Sigma (St. Louis, MO). Formic acid (FA, 99%) and CH3CN were from BDH (VWR International, Ltd., Poole, UK). Hydroxylamine solution 50 wt% in H2O (99.999%) was acquired from Aldrich (Milwaukee, WI). H2O (18.2 MΩ·cm at 25 °C) was obtained from a Milli-Q apparatus (Millipore, Billerica, MA). Trifluoroacetic acid Uvasol was sourced from Merck Millipore (Billerica, MA). The 6-plex TMT isobaric label kits were purchased from Thermo Scientific (Rockford, IL). Sequencing-grade modified trypsin/Lys-C was procured from Promega (Madison, WI). For immuno-affinity depletion of 14 highly abundant proteins from human biological fluids, multiple affinity removal system (MARS) columns, depletion Buffer A, and depletion Buffer B were obtained from Agilent Technologies (Wilmington, DE). Oasis HLB cartridges (1 cm3, 30 mg) were acquired from Waters (Milford, MA). Strata-X 33u Polymeric RP and Strata-X-C 33u Polymeric SCX cartridges (30 mg/1 mL) were from Phenomenex (Torrance, CA).

2.3 Sample preparation

Volumes of 5-30 µL of human blood sample were diluted in depletion Buffer A containing 0.0134 mg·mL⁻¹ LACB according to the experimental design (see Experimental Sections 2.5-2.7). All the samples were diluted appropriately with Buffer A up to a final volume of 120 µL. Fourteen abundant plasma proteins were depleted, following the manufacturer's instructions, with MARS columns and high-performance LC systems (Thermo Scientific, San Jose, CA) equipped with an HTC-PAL fraction collector (CTC Analytics, Zwingen, Switzerland). After immuno-depletion, samples were snap-frozen and stored at -80 °C until further use. Then, buffer exchange was performed with Polymeric RP cartridges mounted on a 96-hole holder and a vacuum manifold, as previously described.17, 20 Samples were subsequently evaporated with a vacuum centrifuge (Thermo Scientific) and stored at -80 °C. After thawing, the samples were re-suspended in TEAB 100 mM with SDS (0.1%), reduced with TCEP (about 1 mM in solution), alkylated with IAA (about 7.5 mM in solution), digested with trypsin/Lys-C (protease to protein ratio of about 1:20, with incubation at 37 °C overnight), and 6-plex TMT labeled. After reaction quenching with hydroxylamine, sets of six differentially labeled samples were pooled. SPE purifications (Oasis HLB and SCX) were performed on a 4-channel Microlab Star liquid handler (Hamilton, Bonaduz, Switzerland) according to a previously reported protocol.17, 20 The pooled 6-plex TMT-labeled samples were then evaporated to dryness before storage at -80 °C.

2.4 RP-LC MS/MS

The dried samples were dissolved in 500 µL H2O/CH3CN/FA 96.9/3/0.1 for RP-LC MS/MS. RP-LC MS/MS was performed with an Orbitrap Fusion Lumos Tribrid mass spectrometer and an Ultimate 3000 RSLC nano system (Thermo Scientific). Proteolytic peptides (injection of 5 µL of sample) were trapped on an Acclaim PepMap 75 µm × 2 cm (C18, 3 µm, 100 Å) pre-column, and separated on an Acclaim PepMap RSLC 75 µm × 50 cm (C18, 2 µm, 100 Å) column (Thermo Scientific) coupled to a stainless steel nanobore emitter (40 mm, OD 1/32”) (Thermo Scientific). The column was heated to 50 °C using a PRSO-V1 column oven (Sonation, Biberach, Germany). Peptide separation was performed with a gradient of mobile phase C (H2O/CH3CN/FA 97.9/2/0.1) and D (H2O/CH3CN/FA 19.92/80/0.08): from 6.3% to 11% D over 12 min, from 11% to 25.5% D over 117 min and from 25.5% to 40% D over 28 min, with final elution (98% D) and equilibration (6.3% D) for a further 23 min. The flow rate was 220 nL·min⁻¹ with a total analysis time of 180 min.

Data were acquired using a data-dependent method. A positive ion spray voltage of 1700 V and a transfer tube temperature of 275 °C were set up. For MS survey scans in profile mode, the Orbitrap resolution was 120000 at m/z = 200 (automatic gain control (AGC) target of 2 × 10⁵) with an m/z scan range from 300 to 1500, RF lens set at 30%, and maximum injection time of 100 ms. For MS/MS with higher-energy collisional dissociation (HCD) at 35% of the normalized collision energy, the AGC target was set to 1 × 10⁵ (isolation width of 0.7 in the quadrupole), with a resolution of 30000 at m/z = 200, first mass at m/z = 100, and a maximum injection time of 105 ms with the Orbitrap acquiring in profile mode. A duty cycle time of 3 s (top speed mode) was used to determine the number of precursor ions to be selected for HCD-based MS/MS. Ions were injected for all available parallelizable time. Dynamic exclusion was set for 60 s within a ± 10 ppm window. A lock mass of m/z = 445.1200 was used.
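As a quick consistency check on the gradient program described above, the segment durations can be summed and compared with the stated total analysis time. The sketch below is illustrative only: the segment values are taken from the text, while the table layout and function names are assumptions made for this example and do not come from any instrument method file.

```python
# Gradient segments as stated in Section 2.4:
# (description, start %D, end %D, duration in min).
# The final segment covers elution at 98% D plus re-equilibration at 6.3% D,
# which the text reports only as a combined 23 min, so no slope is given for it.
GRADIENT_SEGMENTS = [
    ("6.3% to 11% D", 6.3, 11.0, 12),
    ("11% to 25.5% D", 11.0, 25.5, 117),
    ("25.5% to 40% D", 25.5, 40.0, 28),
    ("elution (98% D) and equilibration (6.3% D)", None, None, 23),
]


def total_run_time(segments):
    """Sum the durations (min) of all gradient segments."""
    return sum(duration for _, _, _, duration in segments)


def segment_slopes(segments):
    """Return the %D change per minute for each linear segment."""
    return {
        name: (end - start) / duration
        for name, start, end, duration in segments
        if start is not None and end is not None
    }


if __name__ == "__main__":
    print("Total analysis time (min):", total_run_time(GRADIENT_SEGMENTS))  # 180
    for name, slope in segment_slopes(GRADIENT_SEGMENTS).items():
        print(f"{name}: {slope:.3f} %D per min")
```

The sum of 12 + 117 + 28 + 23 min reproduces the 180 min total analysis time quoted in the text.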

2.5 Proteome coverage and quantitative precision and accuracy for EDTA-plasma, heparin-plasma, and serum

To systematically assess the ASAP2 workflow with EDTA-plasma, heparin-plasma, and serum samples, three experiments were designed, namely the Qualitative-Quantitative study (Qual-Quant study, ❶), the repeatability study (❷) and the comparative AIBL study (❸). The Qual-Quant study (❶) assessed the overall qualitative performance and quantitative result precision, accuracy, and linearity, when applying ASAP2 separately on commercial matched pools of EDTA-plasma, heparin-plasma, and serum. In this experiment, blood samples were prepared with different dilution factors (DFs) (i.e., DF = 4, 4, 6, 8, 12, and 24) in depletion Buffer A containing 0.0134 mg·mL⁻¹ LACB as an internal protein reference (see Experimental Section 2.3); all the diluted samples had equal final volumes of 120 µL before depletion (Figure 1 (a)). In total, twelve 6-plex TMT experiments were performed in a 96-well plate, corresponding to quadruplicate sample preparations of EDTA-plasma, heparin-plasma, and serum (i.e., four 6-plex TMT experiments for each sample type). Each sample was analyzed in duplicate with RP-LC MS/MS.
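The dilution design above can be sketched numerically. The example below derives the sample volume used at each DF from the 120 µL final volume and the fold-change expected relative to a DF = 4 sample. It is a minimal sketch under one stated assumption: that the reference TMT channel carries a DF = 4 sample, so that the second DF = 4 channel has an expected ratio of 1. The assignment of DFs to specific TMT channels is not given in this section, and the function names are illustrative.

```python
import math

FINAL_VOLUME_UL = 120.0                   # final volume before depletion (from the text)
DILUTION_FACTORS = [4, 4, 6, 8, 12, 24]   # the six DFs of one 6-plex TMT experiment
REFERENCE_DF = 4                          # assumption: the reference channel holds a DF = 4 sample


def sample_volume_ul(df, final_volume_ul=FINAL_VOLUME_UL):
    """Volume of blood sample (µL) needed to reach the given dilution factor."""
    return final_volume_ul / df


def expected_ratio(df, reference_df=REFERENCE_DF):
    """Expected abundance ratio of a sample at `df` versus the reference sample."""
    return reference_df / df


volumes = [sample_volume_ul(df) for df in DILUTION_FACTORS]
ratios = [expected_ratio(df) for df in DILUTION_FACTORS]
log2_ratios = [math.log2(r) for r in ratios]

if __name__ == "__main__":
    for df, vol, ratio, log2r in zip(DILUTION_FACTORS, volumes, ratios, log2_ratios):
        print(f"DF={df:>2}: {vol:4.0f} µL, expected ratio {ratio:.3f}, log2 {log2r:+.3f}")
```

The resulting volumes (30, 30, 20, 15, 10, and 5 µL) match the 5-30 µL range given in Experimental Section 2.3.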

2.6 Inter-day and intra-day variability for EDTA-plasma, heparin-plasma, and serum

The repeatability study (❷) aimed at studying inter- and intra-day variances for the three different types of commercial matched pooled samples. Based on previous results, DF = 4 was set as the optimal parameter for all sample types, and blood samples were prepared in different batches (Figure 1 (b)). From day 1 to day 3, one 6-plex TMT experiment was performed per day and included two identical pools of EDTA-plasma, two identical pools of heparin-plasma, and two identical pools of serum samples in order to directly compare the three sample types using the TMT technology. On day 4, three groups of samples were prepared in order to assess the intra-day variance. The samples were prepared as described in Experimental Section 2.3. Each sample was analyzed in duplicate with RP-LC MS/MS.

2.7 Comparison of EDTA-plasma and heparin-plasma proteomes in a small cohort of samples of the Australian Imaging, Biomarkers and Lifestyle Study

The comparative analysis using samples from the AIBL Study (❸) was aimed at qualitatively and quantitatively comparing the proteomic profiles of EDTA-plasma and heparin-plasma using 20 pairs of samples (from 20 individuals). In this experiment, DF = 4 was again set as the optimal parameter for both sample types. The 20 pairs of EDTA-plasma and heparin-plasma matched samples were treated on the same 96-well plate after randomization, and separated into two groups according to the sample type (Figure 1 (c)). Two pools of samples were generated to be included in each 6-plex TMT experiment as biological references and quality control (QC) samples. The first pool was made from 50 µL of each individual EDTA-plasma sample, and the other was made from 50 µL of each individual heparin-plasma sample. For further QC, we introduced into the plate an extra TMT 6-plex containing only pooled samples (data not shown). Samples were prepared as described before. A total of twelve TMT 6-plex experiments was analyzed in duplicate with RP-LC MS/MS.

2.8 Data processing

Proteome Discoverer (version 1.4, Thermo Scientific) was used as the data processing interface. Identification was performed against the human UniProtKB/Swiss-Prot database (10/2015 release) including the LACB sequence (20198 sequences in total). Mascot21 (version 2.4.2, Matrix Science, London, UK) was used as the search engine. Variable amino acid modifications were oxidized methionine, deamidated asparagine/glutamine, and 6-plex TMT-labeled peptide amino terminus (+229.163 Da); 6-plex TMT-labeled lysine (+229.163 Da) was set as fixed modification as well as carbamidomethylation of cysteine. Trypsin was selected as the proteolytic enzyme, with a maximum of two potential missed cleavages. Peptide and fragment ion tolerances were set to 10 ppm and 0.02 Da, respectively. All Mascot results files were loaded separately into Scaffold Q+S 4.7.2 (Proteome Software, Portland, OR) to be further searched with X! Tandem (version CYCLONE (2010.12.01.1)). Both peptide and protein false discovery rates were fixed at 1% maximum, with a two-unique-peptide criterion to report protein identification. TMT quantitative values were exported from Scaffold as log2 of the protein ratio fold change, that is, mean log2 after isotopic purity correction but without normalization applied between samples and experiments. The intensities of the TMT reporter-ions at m/z = 126 (i.e., i126) were used as reference (i.e., as the denominators for ratio calculations). Only the matched spectra and proteins consistently quantified across replicates were considered for quantitative assessment (i.e., precision, accuracy, linearity, and repeatability). The coefficients of variation (CVs) were calculated for the relative abundances at the protein level, as shown in Supporting Information files Interday_CV.xlsx and Intraday_CV.xlsx. For the comparative AIBL study, the proteins quantified in both EDTA-plasma and heparin-plasma for at least one individual were kept for the comparison. Spectral counting data of 6-plex TMT-labeled samples were also extracted from Scaffold.

The MS proteomic data of the commercial pooled samples have been deposited to the ProteomeXchange Consortium22 (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository23 with the dataset identifiers PXD007846 and 10.6019/PXD007846. The Commission cantonale d'éthique de la recherche sur l'être humain (CER-VD) has not yet granted the right to deposit MS proteomic data from individuals participating in the AIBL Study.

Calculations and statistics were performed and figures generated with Excel 2016 (Microsoft, Redmond, WA) and R (https://www.R-project.org/). In particular, the R routine ‘bland.altman.plot()’ from the package BlandAltmanLeh was used to generate the Bland-Altman plot. Qlucore Omics Explorer (Qlucore, Lund, Sweden) was used to produce the principal component analysis (PCA) plot and heatmap, as well as the hierarchical clustering of both samples and proteins.
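To make the quantities named in this section concrete, the following Python sketch shows how log2 ratios against the 126 reporter channel, a protein-level CV, and Bland-Altman agreement statistics could be computed. It is not the authors' code: they report using Scaffold, Excel, and the R routine bland.altman.plot(). The intensities are hypothetical, the CV is assumed here to be computed on linear-scale ratios with the sample standard deviation, and the limits of agreement use the conventional bias ± 1.96 SD.

```python
import math
import statistics


def log2_ratios(reporter_intensities, reference_channel="126"):
    """Return log2(i_n / i_reference) for every non-reference TMT channel."""
    reference = reporter_intensities[reference_channel]
    return {
        channel: math.log2(intensity / reference)
        for channel, intensity in reporter_intensities.items()
        if channel != reference_channel
    }


def cv_percent(log2_values):
    """CV (%) of replicate measurements given as log2 ratios.

    Assumption: values are converted back to the linear scale first and the
    sample standard deviation (n - 1) is used.
    """
    linear = [2.0 ** value for value in log2_values]
    return 100.0 * statistics.stdev(linear) / statistics.mean(linear)


def bland_altman_limits(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    differences = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(differences)
    spread = 1.96 * statistics.stdev(differences)
    return bias, bias - spread, bias + spread


if __name__ == "__main__":
    # Hypothetical reporter-ion intensities for one protein (not real data).
    example = {"126": 1000.0, "127": 1000.0, "128": 500.0}
    print(log2_ratios(example))          # {'127': 0.0, '128': -1.0}
    print(cv_percent([0.0, 0.1, -0.1]))  # CV of three hypothetical replicates
    print(bland_altman_limits([0.2, 0.5, -0.1], [0.1, 0.6, 0.0]))
```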

3. Results and Discussions

In this study, the use of EDTA-plasma, heparin-plasma and serum for MS-based shotgun proteomics was first evaluated using commercial matched pooled samples. The proteome coverage, overall quantitative precision and accuracy, linearity and variability obtained with EDTA-plasma, heparin-plasma, and serum were compared against one another. Further, the proteomic profiles obtained from 20 pairs of EDTA-plasma and heparin-plasma samples (from the AIBL Study) were compared.

3.1 Proteome coverage in commercial matched pools of EDTA-plasma, heparin-plasma, and serum

One of the key challenges in proteomics is to obtain a comprehensive and reproducible proteome coverage across samples analyzed in a study. The impact of blood sample types analyzed with ASAP2 on proteome coverages and their consistency among different experiments were assessed (Figure 1; ❶). We used commercial matched pools of EDTA-plasma, heparin-plasma, and serum samples. In EDTA-plasma, heparin-plasma, and serum, 342 identified proteins (IDs) ± 27, 341 IDs ± 25, and 350 IDs ± 13 were found respectively per LC MS analysis. In total, when combining experiments and LC MS analyses (see Figure 1 (a) and Experimental Section), 532, 532, and 510 proteins were identified in EDTA-plasma, heparin-plasma, and serum, respectively. In total, 747 proteins were identified when finally combining the results from all blood samples. As shown in Figure 2 (a), 47.8% (357 proteins) of the total IDs were detected in all sample types. Therefore, 52.2% of total IDs were detectable in only one or two types of samples (100, 92, and 85 IDs were unique to EDTA-plasma, heparin-plasma, and serum, respectively, while 113 IDs were common to two sample types). This observation suggested the detectability of those proteins was influenced by the sample collection procedures, beyond sample preparation variation and the stochastic nature of the mass spectrometry data-dependent acquisition. Serum showed a 4.3% lower number of total IDs (i.e., 510) when compared with the plasma samples. The protein overlaps between serum and both of the plasma samples were marginally smaller (51.8% and 52.9% for EDTA-plasma and heparin-plasma, respectively) than the overlap between both plasma samples (53.8%). The slightly lower IDs in serum may be mainly due to the coagulation process, as proteins including platelet factor 4, fibrinogen gamma chain, and fibrinogen like protein 1, were absent from serum, as expected.

The qualitative reproducibility of proteome coverages using proteomic workflows is also an essential element for data completeness in large-scale clinical studies. The occurrences of measured proteins in both technical and instrumental replicates were compared as shown in Figure 2 (b) for each of the sample types. Most proteins were identified either in all replicates (i.e., 45.1%, 46.6%, and 51.0% of IDs, for EDTA-plasma, heparin-plasma, and serum, respectively) or in only one of them (i.e., 24.1%, 26.3%, 18.8%, for EDTA-plasma, heparin-plasma, and serum, respectively). These observations were in line with those of common proteomic workflows based on data-dependent acquisition where a set of proteins is consistently identified (mostly the most abundant proteins) and another set sporadically detected (mostly the less abundant proteins). For EDTA-plasma and heparin-plasma respectively, 272 proteins and 266 proteins were consistently found in at least 7 of the 8 replicates. The use of serum induced a small increase of the qualitative reproducibility as 281 proteins were found in at least 7 of the 8 replicates. Considering EDTA-plasma as the benchmark (based on our previous experience17 and HUPO recommendation8), both heparin-plasma and serum showed similar or slightly higher reproducibility in terms of proteome coverage. The three sample types shared 230 common IDs, which were consistently identified across all experiments and replicates, indicating that those proteins and derived peptides are lost neither during sample collection and sample preparation nor during LC MS analysis. Thus, these 230 proteins are robust enough to be selected for further comparisons between sample types. Finally, the proteome profiles grouped perfectly by sample type in an unsupervised manner, as shown in the PCA plot in Figure 3 (a). A hierarchical clustering was then applied on the total spectrum counting data of each 6-plex TMT experiment (Figure 3 (b)). These clustering results and dendrogram confirmed that the proteome profiles of serum samples are more distinguishable from both EDTA-plasma and heparin-plasma samples.
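The identification counts reported in this section are internally consistent, which can be verified with a few lines of arithmetic. The sketch below uses only the totals quoted in the text (747 IDs overall; 532, 532, and 510 per sample type; 100, 92, and 85 unique IDs; 357 IDs in all three; 113 IDs in exactly two). The three pairwise-only counts are not reported in the text; they are derived here from those totals, and the overlap percentages are computed relative to the 747 total IDs, which is the denominator that reproduces the quoted 53.8%, 51.8%, and 52.9%.

```python
TOTAL_IDS = 747
PER_TYPE = {"EDTA": 532, "heparin": 532, "serum": 510}
UNIQUE = {"EDTA": 100, "heparin": 92, "serum": 85}
IN_ALL_THREE = 357
IN_EXACTLY_TWO = 113


def pairwise_only_counts(per_type, unique, in_all_three):
    """Derive how many IDs are shared by exactly two sample types, per pair."""
    # For each type: IDs shared with exactly one other type.
    shared = {k: per_type[k] - unique[k] - in_all_three for k in per_type}
    # Each pairwise-only ID is counted twice across the three types.
    total_pairwise = sum(shared.values()) // 2
    return {
        ("EDTA", "heparin"): total_pairwise - shared["serum"],
        ("EDTA", "serum"): total_pairwise - shared["heparin"],
        ("heparin", "serum"): total_pairwise - shared["EDTA"],
    }


def overlap_percent(pair_only, in_all_three=IN_ALL_THREE, total=TOTAL_IDS):
    """Overlap between two sample types as a percentage of all IDs."""
    return round(100.0 * (pair_only + in_all_three) / total, 1)


pairs = pairwise_only_counts(PER_TYPE, UNIQUE, IN_ALL_THREE)
overlaps = {pair: overlap_percent(count) for pair, count in pairs.items()}

if __name__ == "__main__":
    print(sum(UNIQUE.values()) + IN_EXACTLY_TWO + IN_ALL_THREE)  # 747
    print(pairs)
    print(overlaps)
```

Under these totals, the derived pairwise-only counts are 45 (EDTA-plasma and heparin-plasma), 30 (EDTA-plasma and serum), and 38 (heparin-plasma and serum), which sum to the reported 113.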

3.2 Quantitative precision and accuracy in commercial matched pools of EDTA-plasma, heparin-plasma, and serum

The second key criterion of our systematic evaluation was the quantitative precision and accuracy of the analytical workflow for the different blood sample types. The ASAP2 approach is a bottom-up MS-based proteomic workflow that uses isobaric TMT for relative quantification of proteins, and its quantitative precision and accuracy might be affected by, for instance, matrix effects, the initial protein concentration in the experimental samples, and sample integrity. Hence, we used the quantitative data obtained with the TMT technology to assess the quantitative performances of ASAP2
with heparin-plasma and serum as compared with those obtained with EDTA-plasma in a calibration experiment using different DFs, as shown in the Qual-Quant study (Figure 1 (a); ❶). The log2(i_n/i_126) distributions of matched raw spectral data were plotted (Supporting Figure S1 (a-c)) and descriptive statistics reported (Supporting Table S1), showing the ability of the ASAP2 workflow to determine concentration differences in all three types of samples. At DF = 4 or 6, the log2(i_n/i_126) distributions presented the best comparative statistics, with the highest kurtosis, the sharpest peak shape, and the values closest to the expected fold-change ratios of 1 and 0.66, respectively (Supporting Table S1). When samples were diluted with higher DF, the measurements became less accurate: the distributions broadened and the mean values shifted away from the expected values. The normalized distributions obtained at DF = 4 were compared between sample types, as shown in Supporting Figure S1 (d). In conclusion, the quantitative precision and accuracy were equivalent between EDTA-plasma, heparin-plasma, and serum, thus confirming the suitability of those sample types for quantitative proteomic analysis using the ASAP2 workflow. To further evaluate the linearity of the response, calibration curves were plotted using the theoretical sample volumes versus the measured sample volumes (see Experimental Section 2.5). The measured volumes were recalculated from the relative abundances obtained with TMT and MS and the total spiked volumes (see Supporting Information). The linear correlations were checked, taking iteratively quantified proteins as a sub-dataset (i.e., 302, 308, and 348 proteins for EDTA-plasma, heparin-plasma, and serum, respectively). As shown in Figure 4 (a-c), the measured sample volumes correlated linearly with the theoretical volumes, with R² = 0.9955 for EDTA-plasma, R² = 0.9959 for heparin-plasma, and R² = 0.9911 for serum.
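
The calibration check can be sketched as follows, as an illustration only. The theoretical volumes (30, 30, 20, 15, 10, and 5 µL) are those given in the Figure 4 caption; the relative abundances are hypothetical, and `measured_volumes` encodes one plausible reading of "recalculated from the relative abundances and the total spiked volumes" (a proportional split of the total volume), not the exact formula of the Supporting Information. The DF = 4 channel is assumed to be the reference.

```python
def expected_ratio(df, reference_df=4):
    """Expected reporter-ion ratio versus the reference channel (assumed to be DF = 4)."""
    return reference_df / df


def measured_volumes(abundances, total_volume):
    """Split the total spiked volume in proportion to the relative abundances (assumed formula)."""
    total = sum(abundances)
    return [total_volume * a / total for a in abundances]


def linear_fit(x, y):
    """Ordinary least-squares slope, intercept, and R^2 for paired values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


theoretical = [30, 30, 20, 15, 10, 5]  # µL, from the Figure 4 caption
abundances = [1.00, 0.98, 0.70, 0.52, 0.35, 0.20]  # hypothetical values for one protein
measured = measured_volumes(abundances, sum(theoretical))
slope, intercept, r2 = linear_fit(theoretical, measured)
print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r2:.4f}")
```

With this convention, `expected_ratio(6)` gives 4/6, i.e., the expected fold-change ratio of about 0.66 quoted for DF = 6.
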
Except for the measurements obtained at DF = 24, the CVs of the measurements were below 20% (averaged CV of 18.5%, 16.9%, and 17.4% for EDTA-plasma, heparin-plasma, and serum, respectively). These results indicated sufficient proteomic quantitative performance for both plasma sample types and serum. Additionally, the inter- and intra-day variabilities of the ASAP2 workflow were examined using the same commercial pooled samples to study the repeatability (❷)
of the measurements and their compatibility with large-scale studies. Six 6-plex TMT experiments were performed on different days according to Figure 1 (b) (see also Experimental Section 2.6). To assess both inter- and intra-day repeatability, CV values were calculated for protein relative abundance as shown in Supporting Figure S2. Subgroups of common IDs were taken as datasets (i.e., 339 proteins for the inter-day study and 384 proteins for the intra-day study). Median CVs were calculated based on the subgroups for all sample types (Supporting Table S2). Within the subgroup dataset, the median inter-day CV values were 14.1%, 9.5%, and 13.4% for EDTA-plasma, heparin-plasma, and serum, respectively. Median intra-day CV values were 16.0%, 7.3%, and 10.9%, respectively. In this evaluation, both heparin-plasma and serum samples showed lower variability compared with EDTA-plasma in both inter-day and intra-day studies. As three types of samples were mixed in one 6-plex TMT experiment (Figure 1 (b); ❷), the variances of protein abundance for each sample type can be used for comparison only and not as absolute figures. Thus, the obtained CVs showed the relative degree of variance between the three sample types, and were higher than the CVs measured previously in the single sample type repeatability study for EDTA-plasma.17 Throughout the evaluation with commercial pooled samples, considering EDTA-plasma as a well-accepted benchmark for proteomic studies, both analyses of heparin-plasma and serum samples showed similar consistency and compatibility with large-scale studies in terms of qualitative and quantitative performances (i.e., precision, accuracy, linearity, and repeatability). From this perspective, we therefore concluded that EDTA-plasma, heparin-plasma, and serum samples are suitable for shotgun proteomics applied in clinical research. Nonetheless, reproducibility of the sample collection was not assessed here and should be considered further.
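
As a minimal sketch of the repeatability metric, the per-protein CV and the median CV over a subgroup of common IDs could be computed as below. The function names and toy abundances are hypothetical, and the sketch assumes the sample standard deviation (n - 1 denominator); the text does not state which estimator was used.

```python
from statistics import mean, median, stdev


def cv_percent(values):
    """Coefficient of variation in percent, using the sample standard deviation (assumption)."""
    return 100.0 * stdev(values) / mean(values)


def median_cv(abundances_by_protein):
    """Median of the per-protein CVs over a subgroup of commonly quantified proteins."""
    return median(cv_percent(v) for v in abundances_by_protein.values())


# Hypothetical relative abundances of three proteins across three repeated experiments.
toy = {
    "protein_A": [9.0, 10.0, 11.0],
    "protein_B": [8.0, 10.0, 12.0],
    "protein_C": [10.0, 10.0, 10.0],
}
print(median_cv(toy))
```

Using the median rather than the mean of the per-protein CVs limits the influence of a few poorly quantified proteins on the summary value.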

3.3 Proteomic profiles of EDTA-plasma and heparin-plasma in a small cohort of samples of the Australian Imaging, Biomarkers and Lifestyle Study

The proteomic profiles of plasma samples were further compared in terms of relative quantification of proteins using matched EDTA-plasma and heparin-plasma
samples collected from 20 individuals (❸; see Figure 1 (c) and Experimental Section 2.7). The proteomic profiles of the commercial pooled plasma samples were shown to be more similar to each other than to that of serum (in this study but also in the work by Hsieh et al.13). We excluded serum from this more comprehensive quantitative comparison (matched serum samples were operationally unavailable) and therefore focused on comparing the plasma sample types. As shown in Supporting Figure S3 (a), the proteome coverages obtained from individual samples in this cohort of subjects were consistent with those obtained from the commercial sample sources. In EDTA-plasma and heparin-plasma, 343 ± 27 and 340 ± 20 IDs were found per LC MS analysis, respectively. In total, when combining experiments and LC MS analyses, 532 and 546 proteins were identified in EDTA-plasma (i.e., 60.7% shared IDs with the previous commercial EDTA-plasma pool) and heparin-plasma (i.e., 60.7% shared IDs with the previous commercial heparin-plasma pool), respectively. Overall, 636 proteins were identified among all samples (442 proteins in common between EDTA-plasma and heparin-plasma samples). The occurrences of the measured proteins in all patients were compared between EDTA-plasma and heparin-plasma samples, as shown in Supporting Figure S3 (b). For EDTA-plasma and heparin-plasma, 63.2% (i.e., 284 proteins) and 59.1% (i.e., 277 proteins) of total IDs, respectively, were consistently found in at least 80% of the patients. Compared with EDTA-plasma, heparin-plasma showed slightly lower consistency in terms of proteome coverage (see orange curve in Supporting Figure S3 (b)). The log2(i_n/i_126) distributions of matched raw spectral data were also plotted and compared with those of the commercial pooled samples (Supporting Figure S1 (d)). The median CVs measured in the AIBL sample dataset (i.e., 273 proteins commonly quantified among all LC MS analyses) were 18.9% for EDTA-plasma and 18.3% for heparin-plasma.
The ratio fold change distributions were broader in the clinical samples than in the commercial pooled samples (Supporting Figure S1 (d)). As the measurements in the clinical samples included both technical and biological variability, standard deviations (SDs) were indeed higher than in the commercial
pooled samples (Supporting Table S1).24 Consistently, in both commercial and clinical samples, measurements in heparin-plasma provided lower variance than in EDTA-plasma. This observation corroborated the results reported by Ilies et al.,14 indicating that heparin-plasma could be more suitable than EDTA-plasma for blood proteomics in terms of protein quantitative precision. On the scatter plot of protein fold change ratios between EDTA-plasma and heparin-plasma sample pairs (Supporting Figure S4), the dots aligned with the diagonal, indicating a linear correlation between measurements in EDTA-plasma and heparin-plasma. In the Bland-Altman plot of Figure 5, the difference between the EDTA-plasma and heparin-plasma measurements was plotted against their mean; each dot represents a pair of measurements for one protein in one individual. Overall, 95.8% of the data points fell within the confidence interval (i.e., mean ± 2 SDs), and the mean difference of the dataset was located at 0, indicating that there was no bias in the measurements towards one sample type. From a statistical point of view, EDTA-plasma and heparin-plasma could therefore be used interchangeably, meaning that comparable proteomic results could be obtained using both sample matrices. Some proteins, however, exhibited different results in EDTA-plasma and heparin-plasma. Proteins with divergent measurements between the sample types in at least half of the cases were coronin-1A (3/4 data points outside of the confidence interval), thymosin beta-10 (2/4), hemoglobin subunit alpha (2/4), ferritin light chain (4/4), InaD-like protein (9/12), alpha-actinin-1 (2/4), keratin, type II cytoskeletal 5 (6/8), and vasodilator-stimulated phosphoprotein (3/4).
Although none of these proteins was consistently quantified in the 20 patients in both EDTA-plasma and heparin-plasma, this observation suggests that, even if most quantitative measurements in the two plasma sample types can roughly be considered interchangeable, the two sample types may not be completely equivalent for biomarker discovery.
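
The Bland-Altman comparison can be sketched as follows. This is an illustration under stated assumptions, not the analysis code of the study: the function names and toy values are hypothetical, the limits are taken as the mean difference ± 2 sample SDs, and the inputs are assumed to be paired fold-change measurements (one pair per protein and individual).

```python
from statistics import mean, stdev


def bland_altman(a, b, n_sd=2.0):
    """Per-pair means and differences, bias, limits, and the fraction of points within the limits."""
    diffs = [x - y for x, y in zip(a, b)]
    means = [(x + y) / 2 for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    lower, upper = bias - n_sd * sd, bias + n_sd * sd
    inside = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return {"means": means, "diffs": diffs, "bias": bias,
            "lower": lower, "upper": upper, "fraction_inside": inside}


def outside_fraction_by_protein(labels, diffs, lower, upper):
    """Fraction of data points outside [lower, upper] for each protein label."""
    totals, outside = {}, {}
    for label, d in zip(labels, diffs):
        totals[label] = totals.get(label, 0) + 1
        if not lower <= d <= upper:
            outside[label] = outside.get(label, 0) + 1
    return {label: outside.get(label, 0) / n for label, n in totals.items()}


# Hypothetical paired measurements (e.g., log2 fold changes) in the two plasma types.
edta = [1.0, 2.0, 3.0, 4.0]
heparin = [1.1, 1.9, 3.1, 3.9]
result = bland_altman(edta, heparin)
print(round(result["bias"], 3), result["fraction_inside"])
```

Grouping the points outside the limits by protein, as in `outside_fraction_by_protein`, is one way to flag proteins whose measurements diverge between the two sample types in a given fraction of cases.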

4. Conclusions

In this study, the proteomic analysis of human heparinized plasma and serum
samples with the ASAP2 workflow, a highly automated shotgun proteomic workflow using isobaric labeling, was evaluated qualitatively and quantitatively. No significant matrix effects were found that would interfere with the analytical pipeline and preclude the use of heparin-plasma and serum in future investigations. The analytical figures of merit were similar to those obtained previously17 and herein with EDTA-plasma as sample source. Therefore, we concluded that ASAP2 can also be applied to heparin-plasma and serum and that those sample types may be technically suitable for discovery proteomics in blood. EDTA-plasma and heparin-plasma showed a large similarity in terms of qualitative and quantitative results in a small clinical cohort. The results obtained from both plasma sample types were statistically aligned. However, during the discovery and verification of biomarkers, we recommend focusing on a single sample type to keep the study consistent and to maximize the comparability between measurements. The possibility of analyzing two types of plasma, as well as serum, is also attractive and valuable as it increases the comprehensiveness of the proteome profiles. The limitations of this comparative study need to be acknowledged. First, the commercial pooled samples were quite different from the AIBL Study samples in terms of demographic and clinical characteristics, as well as their collection procedures, such as fasting versus non-fasting conditions, processing protocols, and storage. This may have affected the results, although we observed overall consistency in our conclusions. Finally, we investigated EDTA-plasma and heparin-plasma samples from 10 cognitively healthy volunteers and 10 patients diagnosed with Alzheimer’s disease from the AIBL Study. The choice of the cohort might also have affected our results and interpretations but was driven by some of our future proteomic investigations in the field.

Supporting Information

Supporting Information Figures_Tables - Log2(i_n/i_126) distribution of matched raw
spectral data. Descriptive statistics of the datasets. CV values per protein of inter-day, intra-day and AIBL datasets. Median CVs of inter-day, intra-day and AIBL datasets. Proteome coverage obtained with AIBL EDTA-plasma and heparin-plasma samples. Scatter plot of the protein fold change ratios in matched EDTA-plasma and heparin-plasma samples from the AIBL Study.
TableS1_descriptive_statistics - Descriptive statistics of the datasets.
Commercial_sample_protein_IDs - List of proteins identified in commercial matched pools of EDTA-plasma, heparin-plasma, and serum.
Commercial_sample_calibration_curve - Relative quantification of proteins in commercial matched pools of EDTA-plasma, heparin-plasma, and serum in a calibration experiment.
Intraday_CV - CVs of intra-day experiments on commercial matched pools of EDTA-plasma, heparin-plasma, and serum.
Interday_CV - CVs of inter-day experiments on commercial matched pools of EDTA-plasma, heparin-plasma, and serum.

Acknowledgments

We thank the Australian Imaging, Biomarkers and Lifestyle Study Research Group for providing human samples. We thank Larry Ward, Polina Mironova, Ondine Walter, Sofia Moco, India Severin, Pascal Steiner, Aline Bichsel, and Gene L. Bowman for their continuous support and very fruitful discussions.

References

1. Anderson, N. L.; Anderson, N. G. The human plasma proteome: history, character, and diagnostic prospects. Mol. Cell. Proteomics 2002, 1 (11), 845-867.
2. Surinova, S.; Schiess, R.; Hüttenhain, R.; Cerciello, F.; Wollscheid, B.; Aebersold, R. On the development of plasma protein biomarkers. J. Proteome Res. 2011, 10 (1), 5-16.
3. Bowen, R. A. R.; Remaley, A. T. Interferences from blood collection tube components on clinical chemistry assays. Biochem. Med. 2014, 24 (1), 31-44.
4. Rai, A. J.; Gelfand, C. A.; Haywood, B. C.; Warunek, D. J.; Yi, J.; Schuchard, M. D.; Mehigh, R. J.; Cockrill, S. L.; Scott, G. B. I.; Tammen, H.; Schulz-Knappe, P.; Speicher, D. W.; Vitzthum, F.; Haab, B. B.; Siet, G.; Chan, D. W. HUPO Plasma Proteome Project specimen collection and handling: Towards the standardization of parameters for plasma proteome samples. Proteomics 2005, 5 (13), 3262-3277.
5. Misek, D. E.; Kuick, R.; Wang, H.; Galchev, V.; Deng, B.; Zhao, R.; Tra, J.; Pisano, M. R.; Amunugama, R.; Allen, D.; Walker, A. K.; Strahler, J. R.; Andrews, P.; Omenn, G. S.; Hanash, S. M. A wide range of protein isoforms in serum and plasma uncovered by a quantitative intact protein analysis system. Proteomics 2005, 5 (13), 3343-3352.
6. Tammen, H.; Schulte, I.; Hess, R.; Menzel, C.; Kellmann, M.; Mohring, T.; Schulz-Knappe, P. Peptidomic analysis of human blood specimens: Comparison between plasma specimens and serum by differential peptide display. Proteomics 2005, 5 (13), 3414-3422.
7. Jambunathan, K.; Galande, A. K. Sample collection in clinical proteomics-Proteolytic activity profile of serum and plasma. Proteomics Clin. Appl. 2014, 8 (5-6), 299-307.
8. Omenn, G. S.; States, D. J.; Adamski, M.; Blackwell, T. W.; Menon, R.; Hermjakob, H.; Apweiler, R.; Haab, B. B.; Simpson, R. J.; Eddes, J. S.; Kapp, E. A.; Moritz, R. L.; Chan, D. W.; Rai, A. J.; Admon, A.; Aebersold, R.; Eng, J.; Hancock, W. S.; Hefta, S. A.; Meyer, H.; Paik, Y. K.; Yoo, J. S.; Ping, P.; Pounds, J.; Adkins, J.; Qian, X.; Wang, R.; Wasinger, V.; Wu, C. Y.; Zhao, X.; Zeng, R.; Archakov, A.; Tsugita, A.; Beer, I.; Pandey, A.; Pisano, M.; Andrews, P.; Tammen, H.; Speicher, D. W.; Hanash, S. M. Overview of the HUPO Plasma Proteome Project: Results from the pilot phase with 35 collaborating laboratories and multiple analytical groups, generating a core dataset of 3020 proteins and a publicly-available database. Proteomics 2005, 5 (13), 3226-3245.
9. Bowen, R. A. R.; Adcock, D. M. Blood collection tubes as medical devices: The potential to affect assays and proposed verification and validation processes for the clinical laboratory. Clin. Biochem. 2016, 49 (18), 1321-1330.
10. Bowen, R. A. R.; Hortin, G. L.; Csako, G.; Otañez, O. H.; Remaley, A. T. Impact of blood collection devices on clinical chemistry assays. Clin. Biochem. 2010, 43 (1-2), 4-25.
11. Dupin, M.; Fortin, T.; Larue-Triolet, A.; Surault, I.; Beaulieu, C.; Gouel-Chéron, A.; Allaouchiche, B.; Asehnoune, K.; Roquilly, A.; Venet, F.; Monneret, G.; Lacoux, X.; Roitsch, C. A.; Pachot, A.; Charrier, J. P.; Pons, S. Impact of serum and plasma matrices on the titration of human inflammatory biomarkers using analytically validated SRM assays. J. Proteome Res. 2016, 15 (8), 2366-2378.
12. Haab, B. B.; Geierstanger, B. H.; Michailidis, G.; Vitzthum, F.; Forrester, S.; Okon, R.; Saviranta, P.; Brinker, A.; Sorette, M.; Perlee, L.; Suresh, S.; Drwal, C.; Adkins, J. N.; Omenn, G. S. Immunoassay and antibody microarray analysis of the HUPO Plasma Proteome Project reference specimens: Systematic variation between sample types and calibration of mass spectrometry data. Proteomics 2005, 5 (13), 3278-3291.
13. Hsieh, S. Y.; Chen, R. K.; Pan, Y. H.; Lee, H. L. Systematical evaluation of the effects of sample collection procedures on low-molecular-weight serum/plasma proteome profiling. Proteomics 2006, 6 (10), 3189-3198.
14. Ilies, M.; Iuga, C. A.; Loghin, F.; Dhople, V. M.; Thiele, T.; Völker, U.; Hammer, E. Impact of blood sample collection methods on blood protein profiling studies. Clin. Chim. Acta 2017, 471, 128-134.
15. Randall, S. A.; McKay, M. J.; Molloy, M. P. Evaluation of blood collection tubes using selected reaction monitoring MS: Implications for proteomic biomarker studies. Proteomics 2010, 10 (10), 2050-2056.
16. Cominetti, O.; Núñez Galindo, A.; Corthésy, J.; Oller Moreno, S.; Irincheeva, I.; Valsesia, A.; Astrup, A.; Saris, W. H. M.; Hager, J.; Kussmann, M.; Dayon, L. Proteomic Biomarker Discovery in 1000 Human Plasma Samples with Mass Spectrometry. J. Proteome Res. 2016, 15 (2), 389-399.
17. Dayon, L.; Núñez Galindo, A.; Corthésy, J.; Cominetti, O.; Kussmann, M. Comprehensive and scalable highly automated MS-based proteomic workflow for clinical biomarker discovery in human plasma. J. Proteome Res. 2014, 13 (8), 3837-3845.
18. Dayon, L.; Hainard, A.; Licker, V.; Turck, N.; Kuhn, K.; Hochstrasser, D. F.; Burkhard, P. R.; Sanchez, J. C. Relative quantification of proteins in human cerebrospinal fluids by MS/MS using 6-plex isobaric tags. Anal. Chem. 2008, 80 (8), 2921-2931.
19. Thompson, A.; Schäfer, J.; Kuhn, K.; Kienle, S.; Schwarz, J.; Schmidt, G.; Neumann, T.; Hamon, C. Tandem mass tags: A novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal. Chem. 2003, 75 (8), 1895-1904.
20. Núñez Galindo, A.; Kussmann, M.; Dayon, L. Proteomics of Cerebrospinal Fluid: Throughput and Robustness Using a Scalable Automated Analysis Pipeline for Biomarker Discovery. Anal. Chem. 2015, 87 (21), 10755-10761.
21. Perkins, D. N.; Pappin, D. J. C.; Creasy, D. M.; Cottrell, J. S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 1999, 20 (18), 3551-3567.
22. Vizcaíno, J. A.; Deutsch, E. W.; Wang, R.; Csordas, A.; Reisinger, F.; Ríos, D.; Dianes, J. A.; Sun, Z.; Farrah, T.; Bandeira, N.; Binz, P. A.; Xenarios, I.; Eisenacher, M.; Mayer, G.; Gatto, L.; Campos, A.; Chalkley, R. J.; Kraus, H. J.; Albar, J. P.; Martinez-Bartolomé, S.; Apweiler, R.; Omenn, G. S.; Martens, L.; Jones, A. R.; Hermjakob, H. ProteomeXchange provides globally coordinated proteomics data submission and dissemination. Nat. Biotechnol. 2014, 32 (3), 223-226.
23. Vizcaíno, J. A.; Côté, R. G.; Csordas, A.; Dianes, J. A.; Fabregat, A.; Foster, J. M.; Griss, J.; Alpi, E.; Birim, M.; Contell, J.; O'Kelly, G.; Schoenegger, A.; Ovelleiro, D.; Pérez-Riverol, Y.; Reisinger, F.; Ríos, D.; Wang, R.; Hermjakob, H. The Proteomics Identifications (PRIDE) database and associated tools: Status in 2013. Nucleic Acids Res. 2013, 41 (D1), D1063-D1069.
24. Liu, Y.; Buil, A.; Collins, B. C.; Gillet, L. C. J.; Blum, L. C.; Cheng, L. Y.; Vitek, O.; Mouritsen, J.; Lachance, G.; Spector, T. D.; Dermitzakis, E. T.; Aebersold, R. Quantitative variability of 342 plasma proteins in a human twin population. Mol. Syst. Biol. 2015, 11 (2).

Figure Captions

Figure 1. (a) Experimental design for assessment of qualitative and quantitative performance (Qual-Quant; ❶). Each spot represents one individual sample in a 6-plex TMT experiment. The grey rectangle exemplifies the general scheme for sample pooling after labelling. The darkness of the color and the number on each spot indicate the DF used. Each 6-plex TMT experiment was performed in quadruplicate, and each was analyzed in duplicate with RP-LC MS/MS. (b) Experimental design for assessment of inter-day and intra-day repeatability (❷); 30 µL of sample was used for each sample type (i.e., EDTA-plasma, heparin-plasma, and serum), corresponding to DF = 4. (c) Experimental design for comparison of EDTA-plasma and heparin-plasma proteomic measurements in a cohort of 20 individuals (❸). Samples were collected by the AIBL Study consortium.

Figure 2. (a) Proteome coverage obtained from commercial matched pools of EDTA-plasma, heparin-plasma, and serum. (b) Reproducibility of IDs among 8 replicates (four technical sample replicates, each with two instrumental replicates). Proteins identified in at least 7 replicates were considered consistent IDs. (c) Proteome coverage overlap between the consistent IDs in EDTA-plasma, heparin-plasma, and serum. All data correspond to results obtained from the experimental design presented in Figure 1(a) (❶).

Figure 3. (a) Unsupervised PCA plot and (b) hierarchical clustering of commercial pooled samples (i.e., EDTA-plasma, heparin-plasma, and serum); data correspond to results obtained from the experimental design presented in Figure 1(a) (❶). The y-axis of the hierarchical clustering shows proteins. For each protein, data were obtained from spectral counting.

Figure 4. Calibration curves obtained in (a) EDTA-plasma, (b) heparin-plasma, and (c) serum samples. DF of 4, 4, 6, 8, 12, and 24 were used, corresponding to initial volumes of biofluid of 30, 30, 20, 15, 10, and 5 µL, respectively. Statistics of regression for each sample type are shown in (d). The measured volumes were recalculated from the measured relative quantitative abundances of the proteins obtained with MS and the total spiked volumes. All data correspond to results obtained from the experimental design presented in Figure 1(a) (❶).

Figure 5. Bland-Altman plot comparing the protein fold change ratios in EDTA-plasma and heparin-plasma samples of the AIBL Study (experiment ❸). Each dot represents one protein in one individual measured in both EDTA-plasma and heparin-plasma samples. The mean difference, located at 0, and the limits at 2 SDs from the mean are indicated with dashed lines.
