Publication Date (Web): January 31, 2019. Copyright © 2019 American Chemical Society. Cite this:J. Proteome Res. XXXX, XXX, XXX-XXX ...

Article

Proteomes of paired human cerebrospinal fluid and plasma: Relation to blood-brain barrier permeability in older adults

Loïc Dayon, Ornella Cominetti, Jerome Wojcik, Antonio Núñez Galindo, Aikaterini Oikonomidi, Hugues Henry, Eugenia Migliavacca, Martin Kussmann, Gene Bowman, and Julius Popp

J. Proteome Res., Just Accepted Manuscript • Publication Date (Web): 31 Jan 2019



Journal of Proteome Research

Proteomes of paired human cerebrospinal fluid and plasma: Relation to blood-brain barrier permeability in older adults

Loïc Dayon1, *, Ornella Cominetti1, Jérôme Wojcik2, Antonio Núñez Galindo1, Aikaterini Oikonomidi3, Hugues Henry4, Eugenia Migliavacca1, Martin Kussmann1, $, Gene L. Bowman1, #, and Julius Popp3, 5

1 Nestlé Institute of Health Sciences, 1015 Lausanne, Switzerland
2 Precision for Medicine, 1202 Geneva, Switzerland
3 CHUV, Old Age Psychiatry, Department of Psychiatry, 1011 Lausanne, Switzerland
4 CHUV, Department of Laboratories, 1011 Lausanne, Switzerland
5 HUG, Geriatric Psychiatry, Department of Mental Health and Psychiatry, 1226 Geneva, Switzerland
$ Current affiliation: Liggins Institute, University of Auckland, Auckland 1142, New Zealand
# Current affiliation: Institute for Aging Research, Hebrew Senior Life, Department of Medicine, Division of Gerontology, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, MA 02131, USA

Correspondence to: *Nestlé Institute of Health Sciences, EPFL Innovation Park, Bâtiment H, 1015 Lausanne, Switzerland; Email: [email protected], Phone: +41 21 632 6114, Fax: +41 21 632 6499


Abbreviations

A1BG: alpha-1B-glycoprotein
Aβ1–42: β-amyloid 1-42
AD: Alzheimer’s disease
ADL: activities of daily living
AFAM: afamin
ANOVA: analysis of variance
APOE: apolipoprotein E gene
AUC: area under the curve
BBB: blood-brain barrier
BMI: body mass index
CBPB2: carboxypeptidase B2
CDR: clinical dementia rating
CNS: central nervous system
CO4A: complement C4-A
CO4B: complement C4-B
CO9: complement component C9
CRP: C-reactive protein
CSF: cerebrospinal fluid
DCD: dermcidin
emPAI: exponentially modified protein abundance index
FA: formic acid
FETUA: alpha-2-HS-glycoprotein
FETUB: fetuin-B
FHR1: complement factor H-related protein 1
FHR2: complement factor H-related protein 2
HELZ: probable helicase with zinc finger domain
HRG: histidine-rich glycoprotein
IAA: iodoacetamide
KNG1: kininogen-1
LACB: β-lactoglobulin
LASSO: least absolute shrinkage and selection operator
LBP: lipopolysaccharide-binding protein
LC: liquid chromatography
LTQ-OT: linear ion trap-Orbitrap
MARS: multiple affinity removal system
MCI: mild cognitive impairment
MMSE: mini-mental state examination
MS: mass spectrometry
MS/MS: tandem MS
PGRP2: N-acetylmuramoyl-L-alanine amidase
PHLD: phosphatidylinositol-glycan-specific phospholipase D
PLMN: plasminogen
PRAP1: proline-rich acidic protein 1
PRDX2: peroxiredoxin-2
PRG4: proteoglycan 4
P-tau: hyperphosphorylated tau
P-tau 181: tau phosphorylated at threonine 181
R: correlation coefficient
ROC: receiver operating characteristic
RP: reversed-phase
SAMP: serum amyloid P-component
SCX: strong cation-exchange
SPE: solid-phase extraction
TCEP: tris(2-carboxyethyl)phosphine hydrochloride
TMT: tandem mass tag
TTHY: transthyretin
VTDB: vitamin D-binding protein


Abstract

The systems-level relationship between the proteomes of cerebrospinal fluid (CSF) and plasma has not been comprehensively described so far. Recently developed shotgun proteomic workflows allow deeper characterization of body-fluid proteomes in much larger sample sizes. We deployed state-of-the-art mass spectrometry-based proteomics on paired CSF and plasma samples from 120 older volunteers with and without cognitive impairment to comprehensively characterize compartmental proteome differences and the relationships between both body fluids. We further assessed the influence of blood-brain barrier (BBB) integrity and tested the hypothesis that BBB breakdown can be identified from CSF and plasma proteome alterations in non-demented elders. We quantified 790 proteins in CSF and 422 proteins in plasma; 255 of these proteins were identified in both compartments. Pearson correlation identified 28 proteins whose levels were associated between CSF and plasma. BBB integrity, as defined with the CSF/serum albumin index, influenced 76 CSF/plasma protein ratios. In least absolute shrinkage and selection operator models, CSF and plasma proteins improved the identification of BBB impairment. In conclusion, we provide here a first comprehensive draft map of the interacting human CSF and plasma proteomes, in view of their complex and dynamic compositions and the influence of the BBB.

Keywords BBB; Blood; Brain; Cerebrospinal fluid; Circulation; Clinical proteomics; CSF; Mass spectrometry; Nervous system; Serum/plasma


Introduction

Cerebrospinal fluid (CSF) is a fluid proximal to the central nervous system (CNS), which comprises the brain and spinal cord. The CSF offers a protective cushion against mechanical forces and a sink for byproducts that reflect brain metabolism. CSF is maintained primarily by the ultrafiltration of arterial blood at the choroid plexus, and at any time about 20-30% of CSF proteins are thought to be derived specifically from the brain.1, 2 CSF protein analysis is used in the diagnosis of tumors, bleeding, inflammation, and injury.3 CSF peptides and proteins (i.e., total tau (tau), hyperphosphorylated tau (P-tau), and β-amyloid 1-42 (Aβ1–42)) can complement clinical examination with objective evidence to improve the diagnosis of Alzheimer’s disease (AD).4, 5 In 2010, Schutzer et al. established an initial database including 2630 proteins contained in “normal” human CSF.6 A few years later, Guldbrandsen et al. released the CSF Proteome Resource composed of 3081 proteins, also comparing the proteome coverage of CSF and plasma.7 Zhang et al. also contributed to the CSF protein map by reporting 2513 proteins in normal CSF,8 and Macron et al. recently reported in-depth characterizations of the human CSF proteome.9, 10

Blood is easier to sample than CSF: sampling can be repeated more often and the amount of material is less limited. Therefore, many efforts have been made to discover blood-based biomarkers for neurological disorders, in particular proteins.11-14 Protein biomarker candidates from the circulating blood are likely to prove useful for the diagnosis and prognosis of CNS disorders as well as for the monitoring of drug effects on the CNS. The questions to be addressed include what part of the peripheral blood proteome reflects CSF proteins and whether the relationships between CSF and peripheral blood proteins depend on factors such as blood-brain barrier (BBB) function, among others.

Proteomic analysis of blood plasma is a challenging task, mainly due to the presence of a few proteins at very high concentrations; as a result, protein concentrations in plasma span 10-12 orders of magnitude. In recent years, however, mass spectrometry (MS)-based proteomics of plasma has regained momentum15, 16 following several technological improvements in sample preparation, MS instrumentation, and data processing and analysis.17-19 Those developments offer additional perspectives for biomarker discovery, as the proximity of blood to many tissues makes it an obvious source of candidate biomarkers.

The relationship between blood and CSF composition, including molecules retained in the CNS by the BBB and by the blood-CSF barrier (formed predominantly by the epithelium of the choroid plexus), is of particular interest. Brain-enriched proteins in CSF have been previously identified,1 but the literature characterizing the compartmentalization differences between CSF and plasma proteomes is very limited.20 However, studies of specific CSF and serum/plasma proteins (e.g., albumin and IgG) are abundant, and those proteins are used clinically.21 For instance, the CSF/serum albumin index (i.e., [CSF albumin] / [serum albumin] × 100) and the CSF IgG index have been studied extensively to assess BBB integrity22, 23 and CNS inflammatory disorders, respectively.24

The aims of the study were to deploy state-of-the-art MS-based proteomics to characterize CSF and plasma proteomes in paired samples and to decipher their relationship to BBB permeability in older adults.
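As a minimal sketch, the CSF/serum albumin index quoted above can be written out as a small function. The formula is applied exactly as given in the text; the concentration units are not stated in this section, so the inputs are assumed to be in whatever units the cited protocol (ref 22) prescribes. The function names and the `scale` parameter are hypothetical, and the 9.0 cutoff is the a priori threshold quoted in the Study Design.

```python
# Hypothetical helper names; the formula is taken as written in the text.
BBB_IMPAIRMENT_CUTOFF = 9.0  # a priori threshold quoted in the Study Design


def albumin_index(csf_albumin: float, serum_albumin: float, scale: float = 100.0) -> float:
    """Return [CSF albumin] / [serum albumin] x scale.

    Units are an assumption: both inputs must follow the convention of the
    cited protocol so that the result is on the scale of the 9.0 cutoff.
    """
    if serum_albumin <= 0:
        raise ValueError("serum albumin must be positive")
    return csf_albumin / serum_albumin * scale


def is_bbb_impaired(index: float, cutoff: float = BBB_IMPAIRMENT_CUTOFF) -> bool:
    """Flag BBB impairment when the index is at or above the cutoff."""
    return index >= cutoff
```

Keeping the scaling factor as a parameter makes the unit convention explicit rather than hiding it in the arithmetic.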

Experimental Section

Study Design

One hundred and twenty community-dwelling participants were included in this study, of whom 48 were cognitively healthy volunteers and 72 had mild cognitive impairment (MCI);25 16 subjects presented BBB impairment, defined a priori as previously described (i.e., CSF/serum albumin index ≥ 9.0) (see Table 1).22 Diagnosis of MCI or dementia was based on neuropsychological and clinical evaluation, and was made by a consensus conference of psychiatrists and/or neurologists, and neuropsychologists prior to inclusion in the study. The participants with cognitive impairment were recruited among outpatients who were referred to the Memory Clinics, Departments of Psychiatry, and Department of Clinical Neurosciences, University Hospitals of Lausanne (Switzerland). They had no major psychiatric disorders, substance abuse, or severe or unstable physical illness that may contribute to cognitive impairment, had a clinical dementia rating (CDR)26 score > 0, and met the clinical diagnostic criteria for MCI27 or AD mild dementia according to the recommendations from the National Institute on Aging and Alzheimer’s Association.28 In the current study, 9 subjects met criteria for probable AD dementia. As there is a clinical continuum between MCI and mild dementia, and the participants with cognitive impairment were patients from memory clinics recruited in the same way irrespective of MCI or mild dementia classification, these subjects were grouped together and labeled as cognitively impaired with CDR > 0.

The control subjects were recruited through journal announcements or word of mouth and had no history, symptoms, or signs of relevant psychiatric or neurologic disease and no cognitive impairment (CDR = 0).

All participants underwent a comprehensive clinical and neuropsychological evaluation, structural brain imaging, and venous and lumbar punctures.25 Magnetic resonance imaging and computerized tomography scans were used to exclude cerebral pathologies possibly interfering with cognitive performance. Neuropsychological tests were used to assess cognitive performance in the domains of memory,29 language, and visuo-constructive functions. The mini-mental state examination (MMSE)30 was used to assess participants’ global cognitive performance.
Depression and anxiety were assessed using the hospital anxiety and depression scale.31 The psychosocial and functional assessments included the activities of daily living (ADL) and instrumental ADL, the neuropsychiatric inventory questionnaire, and the informant questionnaire on cognitive decline in the elderly,32 and were completed by the family members of the participants. All tests and scales are validated and widely used in the field. The institutional ethical committee of the University Hospitals of Lausanne approved the clinical protocol (No. 171/2013), and all participants or their legally authorized representatives signed written informed consent.

Sample Collection

Venous and lumbar punctures were performed between 8:30 and 9:30 am after overnight fasting. For lumbar puncture, a standardized technique with a 22 gauge “atraumatic” spinal needle and a sitting or lying position was applied.33 A volume of 10-12 mL of CSF was collected in polypropylene tubes. Routine cell count and protein quantification were performed. Remaining CSF was frozen in aliquots (500 μL) no later than 1 h after collection and stored at -80 °C without thawing until experiment and assay. Sample centrifugation to remove blood cells from the CSF was not part of the collection protocol. Considering this limitation, we excluded from this study all subjects with visually hemorrhagic CSF samples and subjects with a CSF cell count of more than 5 cells·µL⁻¹. Blood was drawn into EDTA K3-containing S-Monovette tubes (Sarstedt, Nümbrecht, Germany). After a maximum of 20-30 min on ice, the tubes were centrifuged at 3000 rpm for 12 min at 6 °C. Plasma was aliquoted in 350 μL volumes into polypropylene tubes 2-5 min after centrifugation, frozen no later than 1 h after collection, and stored at -80 °C. Plasma aliquots were not thawed before the proteomic experiment.

Materials

Iodoacetamide (IAA), tris(2-carboxyethyl)phosphine hydrochloride (TCEP), triethylammonium hydrogen carbonate buffer 1 M pH = 8.5, sodium dodecyl sulfate, and β-lactoglobulin (UniProtKB/Swiss-Prot entry name: LACB) from bovine milk were purchased from Sigma (St. Louis, MO, USA). Formic acid (FA, 99%) and CH3CN were from BDH (VWR International Ltd., Poole, UK). Hydroxylamine solution 50 wt % in H2O (99.999%) was acquired from Aldrich (Milwaukee, WI, USA). H2O (18.2 MΩ·cm at 25 °C) was obtained from a Milli-Q apparatus (Millipore, Billerica, MA, USA). Trifluoroacetic acid Uvasol® was sourced from Merck Millipore (Billerica, MA, USA). The 6-plex tandem mass tags (TMTs)34 were purchased from Thermo Scientific (Rockford, IL, USA). Sequencing grade modified Lys-C/trypsin was procured from Promega (Madison, WI, USA). For immuno-affinity depletion of 14 abundant human proteins, multiple affinity removal system (MARS) columns, Buffer A, and Buffer B were obtained from Agilent Technologies (Wilmington, DE, USA). Oasis HLB cartridges (1cc, 30 mg) were acquired from Waters (Milford, MA, USA) and Strata-X 33u Polymeric reversed-phase (RP) and Strata-XC 33u Polymeric strong cation-exchange (SCX) solid-phase extraction (SPE) cartridges (30 mg/1 mL) from Phenomenex (Torrance, CA, USA).

Sample Preparation

The study design is presented in Figure S1 of the Supporting Information. Randomization of the samples was based on CDR, gender, and age; the positions of the CSF and plasma samples on the experimental plates are given in Tables S1A and S1B of the Supporting Information, respectively. CSF and plasma samples were prepared as previously described.35, 36 A volume of 400 µL of CSF sample was evaporated with a vacuum centrifuge (for 12 samples, this volume was not available; different volumes were therefore taken and correction factors were subsequently applied). The dried CSF samples were diluted in 125 µL of depletion Buffer A containing 0.00965 mg·mL⁻¹ LACB. A volume of 30 µL of plasma sample was diluted in 90 µL of Buffer A containing 0.0134 mg·mL⁻¹ LACB. All samples were filtered using a 0.22 µm filter plate (Millipore). Immuno-affinity depletion was performed by removing 14 highly abundant proteins from the 100 µL filtered CSF and plasma sample solutions.
Samples were depleted with MARS columns, following the manufacturer’s instructions and using high performance liquid chromatography (LC) systems (Thermo Scientific, San Jose, CA, USA) equipped with HTC-PAL (CTC Analytics, Zwingen, Switzerland) fraction collectors. After immuno-depletion, samples were snap-frozen and stored at -80 °C. Buffer exchange was performed with RP cartridges mounted on a 96-hole holder and a vacuum manifold as previously described.35 Samples were subsequently evaporated with a vacuum centrifuge (Thermo Scientific) and stored at -80 °C. Reduction with TCEP, alkylation with IAA, digestion with Lys-C/trypsin, TMT 6-plex labeling, sample pooling, and SPE purification (Oasis HLB and SCX) were performed on a 4-channel Microlab Star liquid handler (Hamilton, Bonaduz, Switzerland) according to a previously reported protocol.35 The pooled 6-plex TMT-labeled samples were then evaporated to dryness before storage at -80 °C.

Reversed-Phase Liquid Chromatography Mass Spectrometry

The CSF and plasma samples were dissolved in 200 µL and 500 µL H2O/CH3CN/FA 96.9/3/0.1, respectively, for RP-LC tandem MS (MS/MS). RP-LC MS/MS was performed with a hybrid linear ion trap-Orbitrap (LTQ-OT) Elite and an Ultimate 3000 RSLC nano system (Thermo Scientific) as recently described.35 Proteolytic peptides (injection of 5 µL of sample) were trapped on an Acclaim PepMap 75 µm × 2 cm (C18, 3 µm, 100 Å) pre-column and separated on an Acclaim PepMap RSLC 75 µm × 50 cm (C18, 2 µm, 100 Å) column (Thermo Scientific) coupled to a stainless steel nanobore emitter (40 mm, OD 1/32”) mounted on a Nanospray Flex Ion Source (Thermo Scientific). The analytical separation was run for 150 min using a gradient that reached 30% of CH3CN after 140 min and 80% of CH3CN after 150 min at a flow rate of 220 nL·min⁻¹. For MS survey scans, the OT resolution was 120000 (ion population of 1 × 10⁶) with an m/z window from 300 to 1500.
For MS/MS with higher-energy collisional dissociation at 35% of the normalized collision energy, the ion population was set to 1 × 10⁵ (isolation width of 2), with a resolution of 15000, first mass at m/z = 100, and a maximum injection time of 250 ms in the OT. A maximum of 10 (most intense) precursor ions were selected for MS/MS. Dynamic exclusion was set for 60 s within a ± 5 ppm window. A lock mass of m/z = 445.1200 was used. Each sample was analyzed in duplicate.

Mass Spectrometry Data Analyses

Proteome Discoverer (version 1.4, Thermo Scientific) was used as the data analysis interface. Identification was performed against the human UniProtKB/Swiss-Prot database (08/12/2014 release) including the LACB sequence (20194 sequences in total). Mascot (version 2.4.2, Matrix Science, London, UK) was used. Variable amino acid modifications were oxidized methionine, deamidated asparagine/glutamine, and 6-plex TMT-labeled peptide amino terminus (+ 229.163 Da). 6-plex TMT-labeled lysine (+ 229.163 Da) and carbamidomethylation of cysteine were set as fixed modifications. Trypsin was selected as the proteolytic enzyme, with a maximum of two potential missed cleavages. Peptide and fragment ion tolerances were set to 10 ppm and 0.02 Da, respectively. All Mascot result files were loaded into Scaffold Q+S 4.4.1.1 (Proteome Software, Portland, OR, USA) to be further searched with X! Tandem (version CYCLONE (2010.12.01.1), the GPM, http://thegpm.org/). Both peptide and protein FDRs were fixed at 1% maximum, with a criterion of two unique peptides to report a protein identification.

Quantitative values were exported from Scaffold Q+S as log2 of the protein ratio fold changes with respect to their measurements in the biological reference, i.e., mean log2 values after isotopic purity correction but without normalization applied between samples and experiments. The biological references were a pool of all individual CSF (used for the analysis of CSF samples; see Figure S1) or plasma (used for the analysis of plasma samples; see Figure S1) samples.
In each TMT experiment, two biological references were included (i.e., two CSF pools for the analysis of CSF samples and two plasma pools for the analysis of plasma samples) and labeled with 6-plex TMT reporter ions at m/z = 126 and 131, allowing protein ratio fold changes to be calculated successively with respect to each channel; this experimental design mitigated the risk that, for instance, sample preparation failed for one reference channel. After confirming a good agreement between the calculation results, data from both replicates calculated independently with both biological references were averaged to generate one single data matrix for each of the body fluids (selected quality metrics are given in Figure S2). In summary, for a given protein,

$$\text{protein ratio fold change}_i = 2^{\left[\log_2(I_i/I_{126})_{\mathrm{Rep1}} + \log_2(I_i/I_{131})_{\mathrm{Rep1}} + \log_2(I_i/I_{126})_{\mathrm{Rep2}} + \log_2(I_i/I_{131})_{\mathrm{Rep2}}\right]/4}$$

where $I_i$ is, for convenience, the so-called “intensity” of the protein in sample i, labeled with one of the 6-plex TMT reporter ions at m/z = 127, 128, 129, or 130; $I_{126}$ and $I_{131}$ are the “intensities” of the protein in the biological references; and Rep refers to the instrumental replicate.

CSF Tau, P-tau 181, Aβ1-42, CSF/Serum Albumin Index, and APOE Genotyping

CSF total tau, tau phosphorylated at threonine 181 (P-tau 181), and Aβ1-42 concentrations were measured using commercially available ELISA kits (Fujirebio, Ghent, Belgium) (Table S2). Those measurements were done as a routine laboratory assay at the Department of Laboratories of the University Hospitals of Lausanne. CSF albumin and plasma albumin were quantified using immunoturbidimetry on the Tina-quant Albumin generation 2 (Roche Diagnostics, Rotkreuz, Switzerland). The CSF/serum albumin index was derived as previously described22 (Table S2). DNA was extracted from whole blood using the QIAsymphony DSP DNA Kit (Qiagen, Hombrechtikon, Switzerland). The single nucleotide variants rs429358 and rs7412 were genotyped using the TaqMan assays C___3084793_20 and C____904973_10, respectively (Thermo Fisher Scientific, Waltham, MA, USA).

Statistical Pearson Correlation and Bioinformatics Analysis

Correlation analysis was performed on protein ratio fold changes (see above) using Pearson’s statistics and Bonferroni correction for multiple comparisons.
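The fold-change definition given earlier and the correlation test just described can be sketched together as follows. This is an illustrative reading of the text, not the authors' code: the function names are hypothetical, the input layout (one tuple of reporter "intensities" per instrumental replicate) is an assumption, and because the section does not say how p-values were computed, the sketch swaps in the Fisher z approximation so that it runs on the Python standard library alone.

```python
import math
from statistics import NormalDist


def protein_ratio_fold_change(replicates):
    """Average the log2 ratios against both reference channels over replicates.

    replicates: iterable of (I_i, I_126, I_131) tuples, one per instrumental
    replicate (hypothetical input layout).
    """
    log_ratios = []
    for i_sample, i_126, i_131 in replicates:
        log_ratios.append(math.log2(i_sample / i_126))
        log_ratios.append(math.log2(i_sample / i_131))
    return 2 ** (sum(log_ratios) / len(log_ratios))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value via the Fisher z approximation (an assumption here)."""
    r = max(min(r, 1 - 1e-12), -1 + 1e-12)  # keep atanh finite at |r| = 1
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With one p-value per protein, `bonferroni` scales each by the number of proteins tested, so a protein passes only if its adjusted value stays at or below 0.05.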
After quality checks, 112 matched sample pairs were used for correlation analysis of 253 proteins (see Table S3 of the Supporting Information). Protein CSF/plasma ratios were calculated by dividing each protein ratio fold change in CSF by its protein ratio fold change in plasma, i.e.,

$$\text{protein CSF/plasma ratio}_i = \frac{\text{CSF protein ratio fold change}_i}{\text{plasma protein ratio fold change}_i}$$

Associations of protein ratio fold changes in CSF and plasma, and of their CSF/plasma ratios, with clinical variables (i.e., gender, age, years of education, BMI, APOE ε4 genotype, CDR, MMSE, and BBB impairment) were obtained by fitting a linear model for each protein using those selected variables as covariates and an analysis of variance (ANOVA) to assess significance. Proteins with greater than 10% missingness were excluded from the analysis. P-values were adjusted for multiple testing using the Bonferroni method. Several bioinformatics tools and resources were used for analysis and protein annotation (i.e., R version 3.2.4 (‘stats’ package and ‘circlize’ library) for generating associations between proteins and clinical variables and for plotting chord diagrams, and version 3.3.2 for the rest of the analyses (http://www.r-project.org/), the Database for Annotation, Visualization and Integrated Discovery (DAVID) 6.8,37 the UniProt tissue annotation database,38 the tissue-atlas,39 the Kyoto Encyclopedia of Genes and Genomes (KEGG) database,40 and Venny (http://bioinfogp.cnb.csic.es/tools/venny/)).

Statistical Analysis and Least Absolute Shrinkage and Selection Operator Classification Models

Proteins with greater than 5% missingness were excluded. Remaining missing data (5% or less per protein) were imputed by randomly drawing a value within the observed range of biomarker values, leaving 541 quality-controlled CSF proteins and 248 quality-controlled plasma proteins (Figure S2). Log2 of the protein ratio fold changes were scaled to a mean of 0 and a standard deviation of 1 prior to statistical analyses.
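The preprocessing steps just listed (missingness filter, random imputation within the observed range, scaling) can be sketched as below. Several details are assumptions rather than statements from the text: the draw is taken to be uniform between the observed minimum and maximum, the scaling uses the sample standard deviation (n − 1 denominator), and the function name and the dictionary-of-lists input with `None` for missing values are hypothetical.

```python
import random
import statistics


def preprocess(protein_values, max_missing=0.05, seed=0):
    """Filter, impute, and standardize log2 protein ratio fold changes.

    protein_values: dict mapping protein name -> list of log2 values,
    with None marking a missing measurement (hypothetical layout).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    cleaned = {}
    for name, values in protein_values.items():
        observed = [v for v in values if v is not None]
        n_missing = len(values) - len(observed)
        # Exclude proteins with more than max_missing (5%) missing values.
        if n_missing > max_missing * len(values) or len(observed) < 2:
            continue
        low, high = min(observed), max(observed)
        # Assumption: uniform draw within the observed range.
        filled = [rng.uniform(low, high) if v is None else v for v in values]
        mean = statistics.fmean(filled)
        sd = statistics.stdev(filled)  # sample SD, an assumption
        if sd == 0:
            continue  # a constant protein cannot be scaled
        cleaned[name] = [(v - mean) / sd for v in filled]
    return cleaned
```

Comparing counts (`n_missing > max_missing * len(values)`) rather than fractions avoids a floating-point surprise at the exact 5% boundary.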
A clustering effect was previously observed in the heatmap of the plasma proteome profiles;41 this effect, involving 37 proteins (of which 21 were included in the CSF and plasma proteome overlap) with enriched annotations such as exosome and platelet, was not associated with any of the available clinical covariates; importantly, those proteins were verified to be absent from all of the results presented herein (Figures 1, 2, and 3). Out of the 118 subjects with available BBB measures and after quality control, CSF and plasma proteomic data were available for 116 and 112 subjects, respectively, for the statistical analyses described below. Calculations and statistics were performed with R version 3.3.2 (http://www.r-project.org/).

Least absolute shrinkage and selection operator (LASSO) logistic regression42 selected the biomarkers that best predict BBB impairment, defined a priori as a CSF/serum albumin index ≥ 9.0. A reference model was initially generated, testing available variables that may contribute to predicting BBB impairment, to provide a benchmark for comparison with the models that included CSF and plasma proteins, separately. These inputs included age, gender, years of education, presence of the APOE ε4 allele, diabetes, hypercholesterolemia, CDR, and CSF tau, P-tau 181, and Aβ1-42. In addition to all variables used to make the reference model, CSF or plasma protein measurements were then included in building so-called best models. Our primary aim here was to identify proteins that contribute to improving classification models of BBB dysfunction independently of the presence of AD pathology; we therefore systematically built models that took into account the CSF protein markers of core AD pathology (i.e., CSF tau, P-tau 181, and Aβ1-42; see above). A 10-fold cross-validation process was performed for each LASSO analysis using the glmnet package,43 which allows estimating the confidence interval of the misclassification error for each value of the regularization parameter λ. The LASSO analyses were repeated 100 times (1000 times for the reference model). The models that minimized the upper limit of the cross-validated misclassification error confidence interval across the 100 runs with fewer than 20 features were selected.
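The selection rule stated above (lowest upper confidence limit of the cross-validated misclassification error, restricted to models with fewer than 20 features) can be sketched as a small filter-and-minimize step. The sketch assumes the cross-validation summaries have already been computed elsewhere; the `CvResult` record, its field names, and the choice of "mean plus one standard error" as the upper limit are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CvResult:
    """One (run, lambda) cross-validation summary; hypothetical record layout."""
    run: int
    lam: float        # regularization parameter lambda
    cv_error: float   # mean cross-validated misclassification error
    cv_se: float      # standard error of that mean
    n_features: int   # number of non-zero coefficients


def upper_limit(result: CvResult) -> float:
    # Assumption: the upper limit is the mean error plus one standard error.
    return result.cv_error + result.cv_se


def select_model(results, max_features=20):
    """Pick the candidate with the smallest upper limit among sparse models."""
    eligible = [r for r in results if r.n_features < max_features]
    if not eligible:
        raise ValueError("no model with fewer than max_features features")
    return min(eligible, key=upper_limit)
```

Minimizing the upper limit rather than the mean error penalizes candidates whose cross-validated error is low but poorly determined.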
The choice of the minimal cross-validation misclassification error was applied to reduce the risk of overfitting the data. The performance of the selected models was assessed by receiver operating characteristic (ROC) area under the curve (AUC) estimation using a bootstrap approach with 1000 iterations [44]. Results were compared visually and formally tested for significance against the reference model using ROC AUC [45] (the fitted values were predicted from the selected LASSO model at the selected λ that minimized the misclassification error; this method sums the protein measurements of the selected features weighted by their penalized coefficients in the LASSO model to obtain a value between 0 and 1; for each fitted value used as a cutoff threshold, the sensitivity and specificity were then computed to build the ROC curve) and accuracy using a McNemar test. The group differences for the CSF and plasma proteins selected in the best models were each graphically illustrated in boxplots and assessed using t-test statistics. As the tests were only applied to the proteins selected with LASSO, p-values obtained from these analyses were not corrected for multiple testing.

Data Availability

The MS data were deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) [46] via the PRIDE partner repository [47] and are available with the identifier PXD009589 (Username: [email protected]; Password: 71RvzbeX).
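The ROC construction described in the statistical analysis (each fitted value used in turn as a cutoff threshold) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation. The function names and toy data are hypothetical, the AUC uses the trapezoidal rule, and the confidence interval is a simple percentile bootstrap, which may differ from the method of the cited reference.

```python
import random


def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per distinct cutoff."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def bootstrap_auc_ci(scores, labels, n_iter=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    n = len(scores)
    if not 0 < sum(labels) < n:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    aucs = []
    while len(aucs) < n_iter:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lost one class
            aucs.append(auc(roc_points([scores[i] for i in idx], ys)))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_iter)]
    upper = aucs[int((1 - alpha / 2) * n_iter) - 1]
    return lower, upper


# Toy fitted values in [0, 1]; label 1 = impaired, 0 = intact (made up).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc(roc_points(toy_scores, toy_labels))
toy_ci = bootstrap_auc_ci(toy_scores, toy_labels)
```

For the toy data, 8 of the 9 impaired/intact pairs are ranked correctly, so the AUC is 8/9.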

Results

MS-Based Proteomics of 120 Paired CSF and Plasma Samples - Correlation of CSF and Plasma Proteins in Older Adults

We recently developed a highly automated MS-based proteomic workflow to enable relatively large sample-size studies of human body fluids in clinical research, in both CSF [36] and plasma [35] samples. This methodology meets standard criteria for accuracy, precision and robustness; it can be deployed in a variety of different tissues [48] and is biologically informative [19, 49]. It might therefore be particularly suited for applications in brain health research, where both CSF and plasma are commonly studied. In the present study (Figure S1), we analyzed 120 CSF and 120 plasma MS-based proteome profiles after abundant protein depletion within a month of laboratory work. In total, using isobaric labeling technology, we identified and quantified 790 and 422 proteins in CSF and plasma samples, respectively (see Tables S4 and S5). The proteome overlap between those matched CSF and plasma samples comprised 255 proteins (Figure S3). To the best of our knowledge, these proteomic measurements provide, for the first time, the possibility to more broadly compare human CSF and plasma proteomes in paired samples from a cohort of more than 100 individuals. Among the 255 proteins detected in both CSF and plasma, 253 proteins (see Experimental Section) were investigated to assess the existence of a correlation between their MS-based relative quantifications in each compartment (Table S3). We found 28 proteins (i.e., 11% of the commonly measured proteins) whose measured protein ratio fold changes correlated significantly between CSF and plasma (Figure 1A), with a Bonferroni-corrected p-value ≤ 0.05. C-reactive protein (CRP) presented by far the highest correlation (correlation coefficient R = 0.81), followed by complement factor H-related protein 2 and 1 (FHR2 and FHR1 with R = 0.66 and 0.65, respectively), lipopolysaccharide-binding protein (LBP with R = 0.64), and complement C4-A (CO4A with R = 0.63). After ranking the proteins in both CSF and plasma using their averaged exponentially modified protein abundance index (emPAI) [50] (see Table S3), we also found a significant correlation between the rankings. We then investigated the link of these 253 overlapping proteins in each compartment with clinical variables such as gender, age, years of education, body mass index (BMI), APOE ε4 genotype, cognitive function, and BBB impairment.
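The correlation screen above combines a per-protein correlation coefficient with a Bonferroni correction over all proteins tested. The sketch below illustrates both pieces in plain Python. It assumes a Pearson coefficient (the text only states "correlation coefficient R"), takes the raw p-values as given rather than computing them, and uses hypothetical function names and toy values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bonferroni(p_values):
    """Multiply each raw p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy log2 ratio fold changes for one protein in CSF and plasma (made up).
csf = [0.1, 0.4, -0.2, 0.8, 0.3]
plasma = [0.2, 0.5, -0.1, 0.9, 0.1]
r = pearson_r(csf, plasma)

# With 253 proteins tested, a corrected p-value <= 0.05 is equivalent to
# a raw p-value <= 0.05 / 253.
raw_threshold = 0.05 / 253
adjusted = bonferroni([0.0001, 0.01, 0.5])
```

With 253 tests, the raw per-protein threshold is about 1.98 × 10^-4, which is why only fairly strong correlations survive the correction.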
The chord diagram of relationships (Figure 1B) revealed that the majority of the highly significant associations (i.e., Bonferroni-corrected p-value ≤ 0.05) were related to the CSF proteins and BBB impairment (defined a priori as a CSF/serum albumin index ≥ 9.0; see Experimental Section). At that level, no clinical measure was associated with both a CSF protein and its plasma counterpart, even though the starting data were the common proteins detected in both matrices (see Table S3). The same analysis with calculated protein CSF/plasma ratios (see Experimental Section) also showed that the main associations were with BBB impairment (Figure 1C). We therefore decided to further investigate the influence of BBB integrity on our matched CSF and plasma proteomes.

CSF to Plasma Protein Ratios in Association with BBB Permeability

We hypothesized that the observed protein correlations between CSF and plasma may indeed be strongly influenced by the integrity of the BBB, which compartmentalizes and impacts the composition of circulating proteins in these body fluids. The clinical and biochemical characteristics of the cohort by BBB integrity are detailed in Table 1. The proportion of subjects with BBB impairment was notably lower among females than among males, in line with a recent report of lower integrity of the BBB in men in a large cohort of patients of different age groups [51], suggesting that this observation is primarily related to gender rather than to disease conditions. Correlation analyses between the calculated CSF/plasma ratio of each protein and the CSF/serum albumin index showed 76 proteins displaying a positive and significant (i.e., p-value ≤ 0.05 after correction for multiple testing) correlation (Figure 2A). The highest correlations were observed for kininogen-1 (KNG1, with R = 0.82), fetuin-B (FETUB, with R = 0.81), plasminogen (PLMN, with R = 0.79), N-acetylmuramoyl-L-alanine amidase (PGRP2, with R = 0.78), and alpha-1B-glycoprotein (A1BG, with R = 0.77) (Figure 2B).
Levels of significance were generally very high, as 36 proteins had a p-value ≤ 1.11 × 10^-13 after Bonferroni correction for multiple testing. In addition, many protein CSF/plasma ratios were correlated with each other, forming a main cluster (Cluster 1) of ratio associations, where clusters corresponded to component subgraphs of the network (Figure S4); another main cluster (Cluster 2) was identified, composed of 18 proteins whose CSF/plasma ratios were primarily not linked to the CSF/serum albumin index. We also checked the average value and variability of all protein CSF/plasma ratios calculated from the complete proteomic measurements (Figure 2C). From this, we confirmed that average values were centered on one (i.e., log2(CSF/plasma ratio) = 0), since relative protein quantifications were obtained independently in each body fluid (see Experimental Section). Only a few proteins had an average value that was not centered on one, but they presented higher variability and were not among the proteins mentioned before. Proteins whose CSF/plasma ratios changed according to the CSF/serum albumin index were mainly categorized as plasma proteins (74%) according to the UniProt tissue annotation database [38], the majority of which are expressed in the liver (84%) (Figure 2D). Among the 76 proteins previously identified, 29 pertained to the complement and coagulation cascades, as revealed by annotation enrichment analysis using the Database for Annotation, Visualization and Integrated Discovery (DAVID) resource [37] and the KEGG database [40]. Most of the proteins whose MS-based relative quantifications correlated between CSF and plasma (Figure 1A) displayed CSF/plasma ratios that were also associated with the CSF/serum albumin index (i.e., 22 of 28; Figure 2E) using our a priori-defined threshold (i.e., p-value ≤ 0.05 after Bonferroni correction, corresponding to R ≥ 0.3513 and 0.3531 for the two correlation analyses, respectively). This was not the case for only 6 proteins (i.e., CRP, FHR1, LBP, CO4A, proline-rich acidic protein 1 (PRAP1), and complement C4-B (CO4B)). Interestingly, most of those proteins (with the exception of CO4B) presented some of the strongest correlations (R > 0.5) between CSF and plasma (Figure 1A).
FHR2, histidine-rich glycoprotein (HRG), and complement component C9 (CO9) correlated well between CSF and plasma (Figure 1A; R > 0.5), as did their CSF/plasma ratios and the CSF/serum albumin index (Figure 2A; R > 0.5), yet in a contrasting order of R magnitude. Given all those relationships with the CSF/serum albumin index and the confirmed role of the BBB, we investigated whether CSF or plasma proteins could identify BBB impairment in older adults and add clinical value to our findings.

CSF and Plasma Proteomic Signatures of BBB Impairment in Older Adults

We used LASSO logistic regression (see Experimental Section) to build mathematical models able to classify BBB impairment, defined a priori as a CSF/serum albumin index ≥ 9.0. We first defined a reference model for classification of BBB impairment using available clinical measures; after LASSO selection, it included age, gender, years of education, CDR, and hypercholesterolemia. The diagnostic accuracy of the reference model was 86.44% (as compared to the accuracy of a majority class prediction [52] of 86.21%); the reference model therefore did not perform better than a majority-class assignment. The AUC of the ROC curve was 0.75 (95% confidence interval of [0.63-0.86]). CSF protein biomarkers were able to predict BBB impairment with high accuracy, i.e., a diagnostic accuracy of 98.21% (McNemar p-value of 0.0094, showing significant improvement against the reference model) and an AUC of the ROC curve of 0.99 [0.97-1.00] (p = 5.0 × 10^-5 as shown in Figure 3A). In total, 12 proteins in CSF were selected in this best classification model without addition of any clinical parameter.
In Figure 3B, testing each selected protein individually, significant group differences were observed for alpha-2-HS-glycoprotein (FETUA) (p = 2.5 × 10^-11), vitamin D-binding protein (VTDB) (p = 4.6 × 10^-11), A1BG (p = 6.5 × 10^-10), carboxypeptidase B2 (CBPB2) (p = 8.9 × 10^-8), serum amyloid P-component (SAMP) (p = 9.0 × 10^-8), afamin (AFAM) (p = 1.1 × 10^-7), probable helicase with zinc finger domain (HELZ) (p = 6.0 × 10^-7), transthyretin (TTHY) (p = 8.3 × 10^-3), and dermcidin (DCD) (p = 0.04). We noted that several of those proteins, such as FETUA, VTDB, A1BG, CBPB2, and AFAM, were highly correlated (R ≥ 0.82) with each other (Figure S5A). CSF FETUA, VTDB, A1BG, CBPB2, SAMP, and AFAM had already shown association with BBB impairment (Figure 1B). Their corresponding protein CSF/plasma ratios were associated with BBB impairment (Figure 1C) and the CSF/serum albumin index (Figure 2A). Interestingly, FETUA, SAMP, and AFAM levels in CSF correlated significantly with their respective plasma levels (Figure 1A). Plasma protein biomarkers were able to predict BBB impairment with a diagnostic accuracy of 91.38%, but without significant improvement with respect to the reference model (McNemar p-value of 0.0736). However, the AUC of the ROC curve was significantly improved to 0.92 [0.86-0.97] (p = 0.0057 as shown in Figure 3C). In total, 11 proteins in plasma were selected in this best model in addition to age and CSF tau. In Figure 3D, testing each selected protein individually, significant group differences were observed only for phosphatidylinositol-glycan-specific phospholipase D (PHLD) (p = 3.6 × 10^-4), proteoglycan 4 (PRG4) (p = 0.02), and peroxiredoxin-2 (PRDX2) (p = 0.03). This time, the selected proteins were not strongly correlated with each other (Figure S5B) and they had not previously been identified in relation to BBB impairment (Figure 1). PRDX2 was present in both the CSF and plasma protein-based classification models of BBB impairment (Figure S6), but a small significant group difference in PRDX2 levels was only observed in plasma.
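Two of the accuracy comparisons reported in this section, the majority-class baseline and the McNemar test between paired classifiers, can be sketched in plain Python. This is a hedged illustration. The exact binomial form of the McNemar test is used here as an assumption, since the text does not say which variant was applied. The discordant-pair counts are made up, and the denominator of 116 subjects in the baseline example is an illustrative assumption.

```python
from math import comb


def majority_class_accuracy(n_positive, n_total):
    """Accuracy obtained by always predicting the more frequent class."""
    return max(n_positive, n_total - n_positive) / n_total


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: subjects classified correctly by model A only
    c: subjects classified correctly by model B only
    Subjects on which both models agree do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# 16 impaired subjects is stated in the text; 116 subjects in total is an
# assumption made here for illustration.
baseline = majority_class_accuracy(16, 116)

# Hypothetical discordant counts, not taken from the study.
p_value = mcnemar_exact(1, 9)
```

Under these assumptions the baseline evaluates to 100/116, about 86.21%, which matches the majority-class accuracy quoted above. The toy McNemar example gives p ≈ 0.0215.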

Discussion

Measuring biomolecules in body fluids or tissues for diagnostic, monitoring and prognostic purposes is key to helping clinicians and guiding decision making for treatment, patient management, and the development of drugs and prevention strategies. The focus of many research works is to expand the pool of available biomarkers in precision medicine and personalized nutrition. However, the characterization of sample matrices taken from the human body is still not fully comprehensive, and knowledge of their fine molecular compositions, the relations between them, and their alterations in diseases is even more incomplete. Identification of specificity and commonality of those sample matrices at the molecular level may also help rationalize biomarker discovery and verification approaches, and translate their use to the clinical setting. In the field of neurological disorders in particular, while CSF sampling requires a rather invasive procedure via lumbar puncture, blood sampling remains minimally invasive and more easily repeatable. Identifying proxy measures of CSF biomolecules in blood samples such as plasma is therefore very attractive. In the present study, we focused on proteomes and used RP-LC MS/MS to analyze matched CSF and plasma samples from 120 elderly subjects with normal or impaired cognition. To the best of our knowledge, this kind of investigation has not previously been performed in a large enough cohort to infer statistically significant and robust associations. Several groups have reported on selected proteins such as leptin [53], amyloid peptides, and α-1 antichymotrypsin [54], for instance. Berven and co-workers performed shotgun proteomics in both CSF and plasma but only in five individuals [55]. Schwenk and co-workers measured paired CSF and plasma samples of hundreds of subjects using a 101-plex bead array but did not evidence correlations of protein levels between both fluids [56]. Additional studies investigated both the CSF and plasma proteomes in relation to pathologies but did not decipher their interactions specifically [57-59]. In our study, we found only 28 proteins out of 253 commonly measured with associated levels between CSF and plasma. Given the main origin of CSF as an ultrafiltrate of arterial blood produced by the choroid plexus, this 11% of associated proteins may appear rather low.
We did not infer a particular influence of protein size, as the average and median molecular masses for the 253 commonly measured proteins in CSF and plasma were 89046 and 52286 Da, respectively, while for the 28 proteins with revealed association between both fluids they were 66642 and 50963 Da, respectively. These results may well illustrate the active control of the CNS and confirm the tight regulation of the CSF composition. The 28 proteins whose levels were associated between CSF and plasma were mainly of plasma origin. According to the tissue-based map of the human proteome [39], none of the 28 proteins (Figure 1A) corresponded to genes with elevated expression in the brain that could be considered brain-specific. This observation may suggest that, in general, circulation in the CSF has little influence on the peripheral circulation. In particular, the clearance of proteins from the CSF into the blood may not significantly influence the concentration of the main blood proteins; indeed, total protein concentrations in human plasma and serum are more than two orders of magnitude higher than that in CSF. Therefore, one could hypothesize that one characteristic of blood-based biomarkers of neurological disorders should be their brain origin. However, it needs to be considered that MS-based proteomics is biased towards the detection and quantification of the most abundant proteins in each sample matrix. The number of associations might therefore have been influenced by the different levels of proteins in CSF and plasma and their specific concentrations in each of those. Confounding factors such as gender, age, years of education, BMI, APOE ε4 genotype, CDR, MMSE, and BBB impairment have been considered (Figure 1B-C) in order to ensure that the associations identified were not a product of the influence of these variables. CSF and blood are not in direct contact but are compartmentalized and separated by physical barriers collectively called the BBB. The BBB is composed of brain endothelial cells organized by tight junctions surrounding the capillary vessels in the brain [60], allowing the transfer of molecules through diffusion or more selective processes.
The CSF/plasma (or serum) protein concentration ratio is frequently used to assess the integrity of the BBB. In particular, the albumin ratio between the body fluids is commonly employed, because of the abundance of albumin in the blood. When albumin crosses over a leaking BBB from the blood side, an increased concentration of albumin can be detected in the CSF. The CSF/serum albumin index is a recognized proxy measure of BBB integrity [22, 23]. Our finding of a large number of associations (i.e., 76 in total) between protein CSF/plasma ratios from proteomics and the CSF/serum albumin index suggests a fairly similar global behavior for those proteins and albumin. This has not been shown before. We identified a particular cluster of proteins that indeed seem to exchange through the BBB similarly and could help characterize a leaking BBB. Determination of absolute concentrations for some of the proteins with CSF/plasma ratios correlated with the CSF/serum albumin index would be a next step to confirm or refute our observations. Such measurements may allow calculation of the true CSF/plasma protein “indexes” (and not only relative quantitative ratio measures (see Figure 2C)) and quantitative determination of the group of proteins that can cross the BBB. Interestingly, we observed a few proteins showing a strong relationship between the CSF and plasma where the BBB function does not seem to play a role to a relevant extent. Those proteins were CRP, FHR1, LBP, CO4A, PRAP1, and CO4B (Figure S4B). While CO4A and CO4B from the complement system are quite large proteins with molecular masses above 190000 Da (we might have only detected the presence of fragments in one or the other body fluid using shotgun proteomics), the specific behavior of those proteins may deserve further investigation. In particular, the inflammatory CRP that promotes agglutination, bacterial capsular swelling, phagocytosis, and complement fixation was very strongly correlated between CSF and plasma (Figure 1A); such a strong correlation between CRP in CSF and in blood was previously observed in Parkinson's disease patients and in a reference group [61], suggesting that particular inflammatory markers in the blood may mirror inflammation in the CNS. Taken together, our results highlight the complex CSF-blood dynamics of protein production and exchange occurring via multiple mechanisms.
We previously and separately studied the CSF [62] and plasma [41] proteome profiles of the present cohort of individuals in relation to biomarkers of AD pathology and identified quite different biomarker candidate panels relative to amyloid pathology, and markers of neuronal injury and tau hyperphosphorylation; in those studies, diagnostic models obtained with CSF proteins were in general statistically stronger than those obtained with plasma proteins. Nonetheless, because of their simplicity of sampling, blood-derived specimens such as plasma offer compelling perspectives for clinical use. Our present results therefore appear valuable for defining proxy measures in plasma that could substitute for CSF determinations. Beyond being informative on CSF and plasma similarities, our results also bring some evidence on protein compartment specificity. Both CSF and plasma proteins were able to identify BBB impairment in classification models. Twelve proteins in CSF alone were able to identify BBB impairment with improved performance with regard to a reference benchmark. Despite the selection of some alpha-glycoproteins and the link of VTDB and AFAM with the transport of vitamins, potentially across the BBB, it was not clear how to elaborate on the mechanisms involving those proteins in relation to BBB impairment. While they may warrant further research, such findings in CSF appear limited for routine application due to the necessity to perform lumbar puncture. However, 11 proteins in plasma also improved classification of BBB impairment. It is interesting that BBB impairment may be classified from plasma proteins. This suggests that lumbar puncture and CSF analysis may not be needed to determine BBB impairment. This finding may have high relevance for translation to clinical practice and clinical studies. Beyond the plasma proteins, age and CSF tau contributed to the obtained mathematical model (Figure 3C). While age was also present in the reference model used as a benchmark, CSF tau was not initially selected in the reference model.
As no significant difference was previously detected in BBB intact versus BBB impaired subjects in this cohort [63, 64] (see also Table 1), we may therefore hypothesize that plasma proteins and age could be determinants for predicting BBB impairment. In the present exploratory study, however, only 16 subjects had a BBB impairment, i.e., a CSF/serum albumin index ≥ 9, and our findings will need to be confirmed in an independent and larger population before further translation. An important next step would be to isolate the specific effects of BBB impairment on the CSF and plasma proteomes from the possible effects generated by various brain disorders. We utilized the CSF/serum albumin index here because it is the gold standard for assessment of BBB integrity in living subjects and it is a reproducible measurement over time in older adults [22]. The CSF/serum albumin index is a global measure of BBB integrity and does not determine the locus or mechanism by which albumin concentrates in the CSF as a sign of BBB breakdown.
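The impairment criterion used throughout (CSF/serum albumin index ≥ 9.0) can be written as a small Python sketch. The exact formula and units are given in the Experimental Section, which is not reproduced here. The sketch therefore assumes a common convention in which the index is CSF albumin in mg/L divided by serum albumin in g/L. The function names and example concentrations are hypothetical.

```python
BBB_IMPAIRMENT_THRESHOLD = 9.0  # cutoff stated in the text


def albumin_index(csf_albumin_mg_per_l, serum_albumin_g_per_l):
    """CSF/serum albumin index.

    Assumption: CSF albumin in mg/L divided by serum albumin in g/L.
    """
    if serum_albumin_g_per_l <= 0:
        raise ValueError("serum albumin must be positive")
    return csf_albumin_mg_per_l / serum_albumin_g_per_l


def is_bbb_impaired(index, threshold=BBB_IMPAIRMENT_THRESHOLD):
    """Apply the a priori cutoff: impaired when index >= threshold."""
    return index >= threshold


# Made-up concentrations for two hypothetical subjects.
intact_example = albumin_index(200.0, 40.0)    # 5.0
impaired_example = albumin_index(450.0, 40.0)  # 11.25
```

Because the cutoff is inclusive, an index of exactly 9.0 counts as impaired.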

Conclusions

In conclusion, for the first time, a comprehensive draft of matched CSF and plasma proteomes was obtained using MS-based shotgun proteomics. Profiling of more than 200 proteomes revealed previously unknown associations between CSF and peripheral circulation and confirmed the influence of BBB function on such relationships. Furthermore, a selection of candidate plasma proteins was able to classify BBB impairment with high diagnostic accuracy (> 90%), which, if confirmed, would indicate a high potential for the use of blood-based protein biomarkers to detect BBB impairment. The description of the interacting CSF and plasma proteomes provides an original resource for biomarker development for neurological disorders.

Supporting Information

Figure S1. Study design and proteome profiling workflow.
Figure S2. Quality metrics of CSF and plasma proteomic analyses.
Figure S3. Total proteome coverages in CSF and plasma samples and their overlap.
Figure S4. Network analysis of protein CSF/plasma ratios.
Figure S5. Pairwise correlation heatmap of selected CSF and plasma proteins.
Figure S6. Overlap between the CSF and plasma proteins for classification of BBB impairment.
Table S1. Randomization of CSF (A) and plasma (B) samples on the experimental plates based on CDR, gender and categorized age (four groups).
Table S2. CSF concentrations of total tau, P-tau 181, and Aβ1-42; CSF/serum albumin index measures.
Table S3. List of proteins in CSF and plasma to perform the data analyses.
Table S4. CSF protein quantification report exported from Scaffold Q+S as log2 of the protein ratio fold changes (reference is the CSF pool labeled with 6-plex TMT reporter-ions at m/z = 126 (A) and 131 (B)) for instrumental replicate 1 and replicate 2.
Table S5. Plasma protein quantification report exported from Scaffold Q+S as log2 of the protein ratio fold changes (reference is the plasma pool labeled with 6-plex TMT reporter-ions at m/z = 126 (A) and 131 (B)) for instrumental replicate 1 and replicate 2.

Acknowledgement

We thank John Corthésy and India Severin for their continuous support and very fruitful discussions. This study was supported by grants from the Swiss National Research Foundation to Dr. Popp (SNF 320030_141179) and funding from the Nestlé Institute of Health Sciences.

Author Contributions

Dr. Dayon - study concept and design, acquisition of data, supervision of data acquisition, analysis of data, interpretation of the analysis, and writing of the manuscript.
Dr. Cominetti - statistical analysis and critical revision of the manuscript.
Dr. Wojcik - study concept and design, statistical analysis plan, statistical analysis, drafting of the statistical analysis section, and critical revision of the manuscript.
Mr. Núñez Galindo - acquisition of data and critical revision of the manuscript.
Dr. Oikonomidi - acquisition of data and critical revision of the manuscript.
Dr. Henry - supervision of data acquisition and critical revision of the manuscript.
Dr. Migliavacca - statistical analysis plan and critical revision of the manuscript.
Prof. Kussmann - supervision of data acquisition and critical revision of the manuscript.
Dr. Bowman - study concept and design, statistical analysis plan, critical revision of the manuscript, and overall study supervision.
Dr. Popp - study concept and design, critical revision of the manuscript, and overall study supervision.
All authors have given approval to the final version of the manuscript.

Conflict of Interest Disclosure

Dr. Dayon, Dr. Cominetti, Mr. Núñez Galindo, and Dr. Migliavacca are employees of Nestlé Institute of Health Sciences. Dr. Wojcik is an employee and shareholder of Precision for Medicine and received consultation honoraria from Nestlé Institute of Health Sciences. Dr. Oikonomidi and Dr. Henry report no disclosures. Dr. Kussmann and Dr. Bowman were employees of Nestlé Institute of Health Sciences at the time of the research. Dr. Bowman is an unpaid scientific advisor of the H2020 EU-funded project PROPAG-AGEING that aims at identifying new molecular signatures for early diagnosis of neurodegenerative diseases, and receives research support from the NIH/NIA related to cognitive decline. Dr. Popp received consultation honoraria from Nestlé Institute of Health Sciences.

References

1. Begcevic, I.; Brinc, D.; Drabovich, A. P.; Batruch, I.; Diamandis, E. P. Identification of brain-enriched proteins in the cerebrospinal fluid proteome by LC-MS/MS profiling and mining of the Human Protein Atlas. Clin. Proteomics 2016, 13 (1), 11.
2. Fang, Q.; Strand, A.; Law, W.; Faca, V. M.; Fitzgibbon, M. P.; Hamel, N.; Houle, B.; Liu, X.; May, D. H.; Poschmann, G.; Roy, L.; Stühler, K.; Ying, W.; Zhang, J.; Zheng, Z.; Bergeron, J. J. M.; Hanash, S.; He, F.; Leavitt, B. R.; Meyer, H. E.; Qian, X.; McIntosh, M. W. Brain-specific proteins decline in the cerebrospinal fluid of humans with Huntington disease. Mol. Cell. Proteomics 2009, 8 (3), 451-466.
3. Deisenhammer, F.; Bartos, A.; Egg, R.; Gilhus, N. E.; Giovannoni, G.; Rauer, S.; Sellebjerg, F. Guidelines on routine cerebrospinal fluid analysis. Report from an EFNS task force. Eur. J. Neurol. 2006, 13 (9), 913-922.
4. Galasko, D. R.; Shaw, L. M. Alzheimer disease: CSF biomarkers for Alzheimer disease - approaching consensus. Nat. Rev. Neurol. 2017, 13 (3), 131-132.
5. Jack, C. R., Jr.; Bennett, D. A.; Blennow, K.; Carrillo, M. C.; Feldman, H. H.; Frisoni, G. B.; Hampel, H.; Jagust, W. J.; Johnson, K. A.; Knopman, D. S.; Petersen, R. C.; Scheltens, P.; Sperling, R. A.; Dubois, B. A/T/N: An unbiased descriptive classification scheme for Alzheimer disease biomarkers. Neurology 2016, 87 (5), 539-547.
6. Schutzer, S. E.; Liu, T.; Natelson, B. H.; Angel, T. E.; Schepmoes, A. A.; Purvine, S. O.; Hixson, K. K.; Lipton, M. S.; Camp, D. G.; Coyle, P. K.; Smith, R. D.; Bergquist, J. Establishing the proteome of normal human cerebrospinal fluid. PLoS One 2010, 5 (6), e10980.
7. Guldbrandsen, A.; Vethe, H.; Farag, Y.; Oveland, E.; Garberg, H.; Berle, M.; Myhr, K. M.; Opsahl, J. A.; Barsnes, H.; Berven, F. S. In-depth characterization of the cerebrospinal fluid (CSF) proteome displayed through the CSF proteome resource (CSF-PR). Mol. Cell. Proteomics 2014, 13 (11), 3152-3163.
8. Zhang, Y.; Guo, Z.; Zou, L.; Yang, Y.; Zhang, L.; Ji, N.; Shao, C.; Sun, W.; Wang, Y. A comprehensive map and functional annotation of the normal human cerebrospinal fluid proteome. J. Proteomics 2015, 119, 90-99.
9. Macron, C.; Lane, L.; Núñez Galindo, A.; Dayon, L. Deep Dive on the Proteome of Human Cerebrospinal Fluid: A Valuable Data Resource for Biomarker Discovery and Missing Protein Identification. J. Proteome Res. 2018, doi: 10.1021/acs.jproteome.8b00300.
10. Macron, C.; Lane, L.; Núñez Galindo, A.; Dayon, L. Identification of Missing Proteins in Normal Human Cerebrospinal Fluid. J. Proteome Res. 2018, doi: 10.1021/acs.jproteome.8b00194.
11. Chahine, L. M.; Stern, M. B.; Chen-Plotkin, A. Blood-based biomarkers for Parkinson's disease. Parkinsonism Relat. Disord. 2014, 20, S99-S103.
12. Di Battista, A. P.; Rhind, S. G.; Baker, A. J. Application of blood-based biomarkers in human mild traumatic brain injury. Front. Neurol. 2013, 4, 44.
13. Misra, S.; Kumar, A.; Kumar, P.; Yadav, A. K.; Mohania, D.; Pandit, A. K.; Prasad, K.; Vibha, D. Blood-based protein biomarkers for stroke differentiation: A systematic review. Proteomics Clin. Appl. 2017, 11 (9-10), 1700007.
14. O'Bryant, S. E.; Mielke, M. M.; Rissman, R. A.; Lista, S.; Vanderstichele, H.; Zetterberg, H.; Lewczuk, P.; Posner, H.; Hall, J.; Johnson, L.; Fong, Y. L.; Luthman, J.; Jeromin, A.; Batrla-Utermann, R.; Villarreal, A.; Britton, G.; Snyder, P. J.; Henriksen, K.; Grammas, P.; Gupta, V.; Martins, R.; Hampel, H. Blood-based biomarkers in Alzheimer disease: Current state of the science and a novel collaborative paradigm for advancing from discovery to clinic. Alzheimers Dement. 2017, 13 (1), 45-58.
15. Aebersold, R.; Mann, M. Mass-spectrometric exploration of proteome structure and function. Nature 2016, 537 (7620), 347-355.
16. Geyer, P. E.; Holdt, L. M.; Teupser, D.; Mann, M. Revisiting biomarker discovery by plasma proteomics. Mol. Syst. Biol. 2017, 13 (9), 942.
17. Geyer, P. E.; Wewer Albrechtsen, N. J.; Tyanova, S.; Grassl, N.; Iepsen, E. W.; Lundgren, J.; Madsbad, S.; Holst, J. J.; Torekov, S. S.; Mann, M. Proteomics reveals the effects of sustained weight loss on the human plasma proteome. Mol. Syst. Biol. 2016, 12 (12), 901.
18. Liu, Y.; Buil, A.; Collins, B. C.; Gillet, L. C. J.; Blum, L. C.; Cheng, L. Y.; Vitek, O.; Mouritsen, J.; Lachance, G.; Spector, T. D.; Dermitzakis, E. T.; Aebersold, R. Quantitative variability of 342 plasma proteins in a human twin population. Mol. Syst. Biol. 2015, 11 (2), 786.
19. Cominetti, O.; Núñez Galindo, A.; Corthésy, J.; Oller Moreno, S.; Irincheeva, I.; Valsesia, A.; Astrup, A.; Saris, W. H. M.; Hager, J.; Kussmann, M.; Dayon, L. Proteomic biomarker discovery in 1000 human plasma samples with mass spectrometry. J. Proteome Res. 2016, 15 (2), 389-399.
20. Kim, S.; Swaminathan, S.; Ngo, K.; Risacher, S.; Shen, L.; Foroud, T.; Shaw, L.; Trojanowski, J.; Soares, H.; Weiner, M.; Saykin, A. Relationship between CSF and plasma proteomic data in the ADNI-1 cohort. Alzheimers Dement. 2012, 8 (4), P271-P272.
21. Bowman, G. L.; Quinn, J. F. Alzheimer's disease and the blood-brain barrier: Past, present and future. Aging Health 2008, 4 (1), 47-57.

22. Bowman, G. L.; Kaye, J. A.; Moore, M.; Waichunas, D.; Carlson, N. E.; Quinn, J. F. Blood-brain barrier impairment in Alzheimer disease: Stability and functional significance. Neurology 2007, 68 (21), 1809-1814.
23. Tumani, H.; Hegen, H. CSF albumin: Albumin CSF/serum ratio (marker for blood-CSF barrier function). In Cerebrospinal Fluid in Clinical Neurology, Springer International Publishing: 2015; pp 111-114.
24. Link, H.; Tibbling, G. Principles of albumin and IgG analyses in neurological disorders. III. Evaluation of IgG synthesis within the central nervous system in multiple sclerosis. Scand. J. Clin. Lab. Invest. 1977, 37 (5), 397-401.
25. Popp, J.; Oikonomidi, A.; Tautvydaitė, D.; Dayon, L.; Bacher, M.; Migliavacca, E.; Henry, H.; Kirkland, R.; Severin, I.; Wojcik, J.; Bowman, G. L. Markers of neuroinflammation associated with Alzheimer's disease pathology in older adults. Brain Behav. Immun. 2017, 62, 203-211.
26. Morris, J. C. The clinical dementia rating (CDR): Current version and scoring rules. Neurology 1993, 43 (11), 2412-2414.
27. Winblad, B.; Palmer, K.; Kivipelto, M.; Jelic, V.; Fratiglioni, L.; Wahlund, L. O.; Nordberg, A.; Bäckman, L.; Albert, M.; Almkvist, O.; Arai, H.; Basun, H.; Blennow, K.; De Leon, M.; Decarli, C.; Erkinjuntti, T.; Giacobini, E.; Graff, C.; Hardy, J.; Jack, C.; Jorm, A.; Ritchie, K.; Van Duijn, C.; Visser, P.; Petersen, R. C. Mild cognitive impairment - Beyond controversies, towards a consensus: Report of the International Working Group on Mild Cognitive Impairment. J. Intern. Med. 2004, 256 (3), 240-246.
28. McKhann, G. M.; Knopman, D. S.; Chertkow, H.; Hyman, B. T.; Jack Jr, C. R.; Kawas, C. H.; Klunk, W. E.; Koroshetz, W. J.; Manly, J. J.; Mayeux, R.; Mohs, R. C.; Morris, J. C.; Rossor, M. N.; Scheltens, P.; Carrillo, M. C.; Thies, B.; Weintraub, S.; Phelps, C. H. The diagnosis of dementia due to Alzheimer's disease: Recommendations from the National Institute on Aging-Alzheimer's Association workgroups on diagnostic guidelines for Alzheimer's disease. Alzheimers Dement. 2011, 7 (3), 263-269.
29. Buschke, H.; Sliwinski, M. J.; Kuslansky, G.; Lipton, R. B. Diagnosis of early dementia by the Double Memory Test: Encoding specificity improves diagnostic sensitivity and specificity. Neurology 1997, 48 (4), 989-997.
30. Folstein, M. F.; Folstein, S. E.; McHugh, P. R. "Mini-mental state". A practical method for grading the cognitive state of patients for the clinician. J. Psychiatr. Res. 1975, 12 (3), 189-198.
31. Zigmond, A. S.; Snaith, R. P. The Hospital Anxiety and Depression Scale. Acta Psychiatr. Scand. 1983, 67 (6), 361-370.

32. Jorm, A. F.; Jacomb, P. A. The Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE): Socio-demographic correlates, reliability, validity and some norms. Psychol. Med. 1989, 19 (4), 1015-1022.
33. Popp, J.; Riad, M.; Freymann, K.; Jessen, F. Diagnostic lumbar puncture performed in the outpatient setting of a memory clinic: Frequency and risk factors of post-lumbar puncture headache. Nervenarzt 2007, 78 (5), 547-551.
34. Dayon, L.; Sanchez, J. C. Relative protein quantification by MS/MS using the tandem mass tag technology. In Methods in Molecular Biology, 2012; Vol. 893, pp 115-127.
35. Dayon, L.; Núñez Galindo, A.; Corthésy, J.; Cominetti, O.; Kussmann, M. Comprehensive and scalable highly automated MS-based proteomic workflow for clinical biomarker discovery in human plasma. J. Proteome Res. 2014, 13 (8), 3837-3845.
36. Núñez Galindo, A.; Kussmann, M.; Dayon, L. Proteomics of cerebrospinal fluid: Throughput and robustness using a scalable automated analysis pipeline for biomarker discovery. Anal. Chem. 2015, 87 (21), 10755-10761.
37. Huang, D. W.; Sherman, B. T.; Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2009, 4 (1), 44-57.
38. Bateman, A.; Martin, M. J.; O'Donovan, C.; Magrane, M.; Alpi, E.; Antunes, R.; Bely, B.; Bingley, M.; Bonilla, C.; Britto, R.; Bursteinas, B.; Bye-Ajee, H.; Cowley, A.; Da Silva, A.; De Giorgi, M.; Dogan, T.; Fazzini, F.; Castro, L. G.; Figueira, L.; Garmiri, P.; Georghiou, G.; Gonzalez, D.; Hatton-Ellis, E.; Li, W.; Liu, W.; Lopez, R.; Luo, J.; Lussi, Y.; MacDougall, A.; Nightingale, A.; Palka, B.; Pichler, K.; Poggioli, D.; Pundir, S.; Pureza, L.; Qi, G.; Rosanoff, S.; Saidi, R.; Sawford, T.; Shypitsyna, A.; Speretta, E.; Turner, E.; Tyagi, N.; Volynkin, V.; Wardell, T.; Warner, K.; Watkins, X.; Zaru, R.; Zellner, H.; Xenarios, I.; Bougueleret, L.; Bridge, A.; Poux, S.; Redaschi, N.; Aimo, L.; Argoud-Puy, G.; Auchincloss, A.; Axelsen, K.; Bansal, P.; Baratin, D.; Blatter, M. C.; Boeckmann, B.; Bolleman, J.; Boutet, E.; Breuza, L.; Casal-Casas, C.; De Castro, E.; Coudert, E.; Cuche, B.; Doche, M.; Dornevil, D.; Duvaud, S.; Estreicher, A.; Famiglietti, L.; Feuermann, M.; Gasteiger, E.; Gehant, S.; Gerritsen, V.; Gos, A.; Gruaz-Gumowski, N.; Hinz, U.; Hulo, C.; Jungo, F.; Keller, G.; Lara, V.; Lemercier, P.; Lieberherr, D.; Lombardot, T.; Martin, X.; Masson, P.; Morgat, A.; Neto, T.; Nouspikel, N.; Paesano, S.; Pedruzzi, I.; Pilbout, S.; Pozzato, M.; Pruess, M.; Rivoire, C.; Roechert, B.; Schneider, M.; Sigrist, C.; Sonesson, K.; Staehli, S.; Stutz, A.; Sundaram, S.; Tognolli, M.; Verbregue, L.; Veuthey, A. L.; Wu, C. H.; Arighi, C. N.; Arminski, L.; Chen, C.; Chen, Y.; Garavelli, J. S.; Huang, H.; Laiho, K.; McGarvey, P.; Natale, D. A.; Ross, K.; Vinayaka, C. R.; Wang, Q.; Wang, Y.; Yeh, L. S.; Zhang, J. UniProt: The universal protein knowledgebase. Nucleic Acids Res. 2017, 45 (D1), D158-D169.
39. Uhlén, M.; Fagerberg, L.; Hallström, B. M.; Lindskog, C.; Oksvold, P.; Mardinoglu, A.; Sivertsson, Å.; Kampf, C.; Sjöstedt, E.; Asplund, A.; Olsson, I.; Edlund, K.; Lundberg, E.; Navani, S.; Szigyarto, C. A. K.; Odeberg, J.; Djureinovic, D.; Takanen, J. O.; Hober, S.; Alm, T.; Edqvist, P. H.; Berling, H.; Tegel, H.; Mulder, J.; Rockberg, J.; Nilsson, P.; Schwenk, J. M.; Hamsten, M.; Von Feilitzen, K.; Forsberg, M.; Persson, L.; Johansson, F.; Zwahlen, M.; Von Heijne, G.; Nielsen, J.; Pontén, F. Tissue-based map of the human proteome. Science 2015, 347 (6220), 1260419.
40. Kanehisa, M.; Furumichi, M.; Tanabe, M.; Sato, Y.; Morishima, K. KEGG: New perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 2017, 45 (D1), D353-D361.
41. Dayon, L.; Wojcik, J.; Núñez Galindo, A.; Corthésy, J.; Cominetti, O.; Oikonomidi, A.; Henry, H.; Migliavacca, E.; Bowman, G. L.; Popp, J. Plasma proteomic profiles of cerebrospinal fluid-defined Alzheimer's disease pathology in older adults. J. Alzheimers Dis. 2017, 60 (4), 1641-1652.
42. Tibshirani, R. Regression shrinkage and selection via the lasso: A retrospective. J. R. Stat. Soc. Ser. B Stat. Methodol. 2011, 73 (3), 273-282.
43. Friedman, J.; Hastie, T.; Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Software 2010, 33 (1), 1-22.
44. Robin, X.; Turck, N.; Hainard, A.; Tiberti, N.; Lisacek, F.; Sanchez, J. C.; Müller, M. pROC: An open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinform. 2011, 12, 77.
45. DeLong, E. R.; DeLong, D. M.; Clarke-Pearson, D. L. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988, 44 (3), 837-845.
46. Deutsch, E. W.; Csordas, A.; Sun, Z.; Jarnuczak, A.; Perez-Riverol, Y.; Ternent, T.; Campbell, D. S.; Bernal-Llinares, M.; Okuda, S.; Kawano, S.; Moritz, R. L.; Carver, J. J.; Wang, M.; Ishihama, Y.; Bandeira, N.; Hermjakob, H.; Vizcaíno, J. A. The ProteomeXchange consortium in 2017: Supporting the cultural change in proteomics public data deposition. Nucleic Acids Res. 2017, 45 (D1), D1100-D1106.
47. Vizcaíno, J. A.; Csordas, A.; Del-Toro, N.; Dianes, J. A.; Griss, J.; Lavidas, I.; Mayer, G.; Perez-Riverol, Y.; Reisinger, F.; Ternent, T.; Xu, Q. W.; Wang, R.; Hermjakob, H. 2016 update of the PRIDE database and its related tools. Nucleic Acids Res. 2016, 44 (D1), D447-D456.
48. Lan, J.; Núñez Galindo, A.; Doecke, J.; Fowler, C.; Martins, R. N.; Rainey-Smith, S. R.; Cominetti, O.; Dayon, L. Systematic evaluation of the use of human plasma and serum for mass-spectrometry-based shotgun proteomics. J. Proteome Res. 2018, 17 (4), 1426-1435.
49. Oller Moreno, S.; Cominetti, O.; Núñez Galindo, A.; Irincheeva, I.; Corthésy, J.; Astrup, A.; Saris, W. H.; Hager, J.; Kussmann, M.; Dayon, L. The differential plasma proteome of obese and overweight individuals undergoing a nutritional weight loss and maintenance intervention. Proteomics Clin. Appl. 2018, 12 (1), 1600150.

50. Ishihama, Y.; Oda, Y.; Tabata, T.; Sato, T.; Nagasu, T.; Rappsilber, J.; Mann, M. Exponentially modified protein abundance index (emPAI) for estimation of absolute protein amount in proteomics by the number of sequenced peptides per protein. Mol. Cell. Proteomics 2005, 4 (9), 1265-1272.
51. Parrado-Fernández, C.; Blennow, K.; Hansson, M.; Leoni, V.; Cedazo-Minguez, A.; Björkhem, I. Evidence for sex difference in the CSF/plasma albumin ratio in ~20 000 patients and 335 healthy volunteers. J. Cell. Mol. Med. 2018.
52. Gauher, S. Is your Classification Model making lucky guesses? http://blog.revolutionanalytics.com/2016/03/classification-models.html. Accessed 28 November 2018.
53. Schwartz, M. W.; Peskind, E.; Raskind, M.; Boyko, E. J.; Porte Jr, D. Cerebrospinal fluid leptin levels: Relationship to plasma levels and to adiposity in humans. Nat. Med. 1996, 2 (5), 589-593.
54. Mehta, P. D.; Pirttila, T.; Patrick, B. A.; Barshatzky, M.; Mehta, S. P. Amyloid β protein 1-40 and 1-42 levels in matched cerebrospinal fluid and plasma from patients with Alzheimer disease. Neurosci. Lett. 2001, 304 (1-2), 102-106.
55. Aasebø, E.; Opsahl, J. A.; Bjørlykke, Y.; Myhr, K. M.; Kroksveen, A. C.; Berven, F. S. Effects of blood contamination and the rostro-caudal gradient on the human cerebrospinal fluid proteome. PLoS One 2014, 9 (3), e90429.
56. Byström, S.; Ayoglu, B.; Häggmark, A.; Mitsios, N.; Hong, M. G.; Drobin, K.; Forsström, B.; Fredolini, C.; Khademi, M.; Amor, S.; Uhlén, M.; Olsson, T.; Mulder, J.; Nilsson, P.; Schwenk, J. M. Affinity proteomic profiling of plasma, cerebrospinal fluid, and brain tissue within multiple sclerosis. J. Proteome Res. 2014, 13 (11), 4607-4619.
57. Gitau, E. N.; Kokwaro, G. O.; Karanja, H.; Newton, C. R. J. C.; Ward, S. A. Plasma and cerebrospinal proteomes from children with cerebral malaria differ from those of children with other encephalopathies. J. Infect. Dis. 2013, 208 (9), 1494-1503.
58. Richens, J. L.; Vere, K. A.; Light, R. A.; Soria, D.; Garibaldi, J.; Smith, A. D.; Warden, D.; Wilcock, G.; Bajaj, N.; Morgan, K.; O'Shea, P. Practical detection of a definitive biomarker panel for Alzheimer's disease; comparisons between matched plasma and cerebrospinal fluid. Int. J. Mol. Epidemiol. Genet. 2014, 5 (2), 53-70.
59. Westwood, S.; Liu, B.; Baird, A. L.; Anand, S.; Nevado-Holgado, A. J.; Newby, D.; Pikkarainen, M.; Hallikainen, M.; Kuusisto, J.; Streffer, J. R.; Novak, G.; Blennow, K.; Andreasson, U.; Zetterberg, H.; Smith, U.; Laakso, M.; Soininen, H.; Lovestone, S. The influence of insulin resistance on cerebrospinal fluid and plasma biomarkers of Alzheimer's pathology. Alzheimers Res. Ther. 2017, 9 (1), 31.

60. Marchi, N.; Cavaglia, M.; Fazio, V.; Bhudia, S.; Hallene, K.; Janigro, D. Peripheral markers of blood-brain barrier damage. Clin. Chim. Acta 2004, 342 (1-2), 1-12.
61. Lindqvist, D.; Hall, S.; Surova, Y.; Nielsen, H. M.; Janelidze, S.; Brundin, L.; Hansson, O. Cerebrospinal fluid inflammatory markers in Parkinson's disease - Associations with depression, fatigue, and cognitive impairment. Brain Behav. Immun. 2013, 33, 183-189.
62. Dayon, L.; Núñez Galindo, A.; Wojcik, J.; Cominetti, O.; Corthésy, J.; Oikonomidi, A.; Henry, H.; Kussmann, M.; Migliavacca, E.; Severin, I.; Bowman, G. L.; Popp, J. Alzheimer disease pathology and the cerebrospinal fluid proteome. Alzheimers Res. Ther. 2018, 10, 66.
63. Bowman, G.; Dayon, L.; Severin, I.; Tautvydaite, D.; Henry, H.; Oikonomidi, A.; Kirkland, R.; Migliavacca, E.; Wojcik, J.; Bacher, M.; Popp, J. A neuroinflammatory biomarker signature of blood-brain barrier impairment in older adults. Alzheimers Dement. 2016, 12 (7), P670.
64. Bowman, G. L.; Dayon, L.; Kirkland, R.; Wojcik, J.; Peyratout, G.; Severin, I.; Henry, H.; Oikonomidi, A.; Migliavacca, E.; Bacher, M.; Popp, J. Blood-brain barrier breakdown, neuroinflammation and cognitive decline in older adults. Alzheimers Dement. 2018, doi: 10.1016/j.jalz.2018.06.2857.

Table 1. Clinical and biochemical characteristics of the total cohort by BBB integrity (118/120 subjects with available BBB measures).

Characteristic                All (n = 118)    BBB intact (n = 102)    BBB impairment (n = 16)
Age, years¹                   70.2 (7.8)       69.8 (7.7)              72.8 (8.2)
Female, n (%)                 76 (64.4)        70 (68.6)               6 (37.5)*
BMI, kg/m²                    25.6 (4.6)       25.5 (4.7)              26.5 (3.9)
APOE ε4 carrier, n (%)        37 (31.3)        32 (31.3)               5 (31.2)
Education, years              12.4 (2.6)       12.5 (2.7)              11.9 (2.1)
MMSE                          26.9 (3.1)       27.3 (2.9)              24.8 (3.4)*
CDR = 0, n (%)²               48 (40.6)        45 (44.1)               3 (18.7)*
CDR = 0.5, n (%)²             61 (51.7)        51 (50.0)               10 (62.5)
CDR = 1.0, n (%)²             9 (7.6)          6 (5.9)                 3 (18.8)
HAD score                     10.4 (5.5)       10.7 (5.7)              8.8 (4.1)
Diabetes, n (%)               11 (9.3)         9 (8.8)                 2 (12.5)
Hypertension, n (%)           41 (34.8)        35 (34.3)               6 (37.5)
Hypercholesterolemia, n (%)   44 (37.6)        35 (34.6)               9 (56.2)
CSF Aβ1-42, pg/mL             841.5 (262.9)    836.0 (250.1)           876.6 (341.1)
CSF tau, pg/mL                369.5 (280.1)    356.2 (268.9)           454.4 (340.5)
CSF p-tau 181, pg/mL          61.9 (35.5)      61.5 (36.8)             64.0 (26.3)
CSF/serum albumin index³      6.1 (2.4)        5.4 (1.5)               10.7 (1.5)*

BMI, body mass index; APOE ε4, apolipoprotein E epsilon 4 allele; MMSE, mini-mental state examination; CDR, clinical dementia rating; HAD, hospital anxiety and depression score.
¹Mean and standard deviation (SD) unless denoted otherwise.
²CDR scores include 0 (n = 48), 0.5 (n = 61), and 1 (n = 9).
³CSF/serum albumin index = [CSF albumin (mg/L)] / [serum albumin (mg/L)] × 100.
*Significant differences between groups (p ≤ 0.05) using t-tests for continuous variables and binomial proportion tests for categorical variables.
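The index in the Table 1 footnote, together with the cut-off given in the Figure 3 legend (index ≥ 9 for impaired BBB), can be illustrated with a minimal Python sketch. The function names and the input values are hypothetical: the numbers are chosen only to exercise the arithmetic and are not measurements from the study. The formula is implemented exactly as printed in the footnote.

```python
# Minimal sketch: CSF/serum albumin index and BBB status.
# Assumptions: function names are illustrative; the formula follows the
# Table 1 footnote as printed; the cut-off (>= 9) is the one stated in the
# Figure 3 legend.

BBB_IMPAIRMENT_THRESHOLD = 9.0


def albumin_index(csf_albumin: float, serum_albumin: float) -> float:
    """Return [CSF albumin] / [serum albumin] x 100, both in the same unit."""
    if serum_albumin <= 0:
        raise ValueError("serum albumin must be positive")
    return 100.0 * csf_albumin / serum_albumin


def is_bbb_impaired(index: float,
                    threshold: float = BBB_IMPAIRMENT_THRESHOLD) -> bool:
    """Classify a subject as BBB-impaired when the index reaches the cut-off."""
    return index >= threshold


if __name__ == "__main__":
    # Made-up inputs, not study data.
    for csf, serum in [(27.0, 500.0), (36.0, 400.0)]:
        idx = albumin_index(csf, serum)
        status = "impaired" if is_bbb_impaired(idx) else "intact"
        print(f"index = {idx:.1f} -> BBB {status}")
```

Keeping the cut-off as a named constant makes it straightforward to repeat the grouping with a different threshold if another definition of BBB impairment is preferred.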

FIGURE LEGENDS

Figure 1. CSF and Plasma Proteins and their Common Interactome. A) Correlation of CSF proteins with their plasma counterparts. Only significant correlations with p-value ≤ 0.05 after Bonferroni correction for multiple testing were retained and displayed in the graph. B) Chord diagram of the associations between the overlapping CSF and plasma proteins and some clinical parameters. C) Chord diagram of the associations between the CSF/plasma protein ratios and some clinical parameters. In both chord diagrams, the connections represent the log2(1/p-value) and illustrate significant contribution to protein measurement variance; the figures display only significant covariates in relation to each protein, but all of the following clinical covariates were tested: gender, age, years of education, BMI, APOE ε4 genotype, CDR, MMSE, and BBB impairment; proteins with greater than 10% missingness were excluded from the analysis; the relationship between the proteins and the clinical variables was obtained by fitting a linear model for each protein using those selected variables as covariates, followed by an ANOVA to assess significance; p-values were adjusted for multiple testing using the Bonferroni method; only significant associations with p-value ≤ 0.05 were retained and displayed in the graph.

Figure 2. Protein CSF/plasma Ratios and their Relationship with BBB Permeability. A) Correlations of protein CSF/plasma ratios with CSF/serum albumin index. Proteins are ordered according to their R with the CSF/serum albumin index. The size of the circle represents the significance of this correlation (i.e., log10(1/p-value)). Only significant correlations with p-value ≤ 0.05 after Bonferroni correction for multiple testing were retained and displayed in the graph. B) Correlation of protein CSF/plasma ratios for KNG1, FETUAB, PLMN, PGRP2, and A1BG with CSF/serum albumin index. C) Protein CSF/plasma ratio distribution across the complete proteomic dataset. Relative protein quantification was obtained independently in both CSF and plasma using isobaric labeling technology. D) Tissue annotation obtained with DAVID bioinformatic resources using the UniProt tissue annotation database for the proteins whose CSF/plasma ratio correlated with CSF/serum albumin index. Significant enrichment is indicated with an asterisk; n.s. = nonsignificant. The background consisted of the 253 proteins commonly measured in CSF and plasma. E) Venn diagram of the correlated proteins between CSF and plasma (CSF versus plasma) and those whose CSF/plasma ratio correlated with the CSF/serum albumin index (CSF/plasma ratio versus CSF/serum albumin index).

Figure 3. Models for Prediction of CSF/Serum Albumin Index-Based BBB Impairment with CSF and Plasma Proteins. A) ROC curve of the model including CSF proteins for prediction of BBB impairment. The best model is composed of A1BG, AFAM, CBPB2, DCD, ENPP2, FETUA, HELZ, IBP7, PRDX2, SAMP, TTHY, and VTDB. B) Box-plots of selected CSF proteins according to BBB impairment. C) ROC curve of the model including plasma proteins for prediction of BBB impairment. The best model is composed of CERU, CO6, CO9, FA12, FETUB, HXA3 (which is not known to be circulating), PHLD, PRDX2, PRG4, S10A9, and ZA2G in addition to age and CSF tau. D) Box-plots of selected plasma proteins according to BBB impairment. In A) and C), the ROC curve of the reference model contains age, gender, years of education, CDR, and hypercholesterolemia. The opacity of the ROC curves is proportional to the accuracy of the models; the diamonds indicate the selected most accurate models; the p-values on the graphs indicate the significance of the differences in AUC. In B) and D), red dots refer to subjects with intact BBB (i.e., CSF/serum albumin index < 9), and black dots refer to subjects with impaired BBB (i.e., CSF/serum albumin index ≥ 9); relative protein ratio fold changes were used.
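The retention rule stated in the legends of Figures 1 and 2 (Bonferroni adjustment, keep p-value ≤ 0.05, order proteins by R, circle size given by log10(1/p-value)) can be sketched as follows. This is a hedged illustration, not the authors' analysis code: the function name, the protein labels P1 to P4, and all numbers are hypothetical. Two points are assumptions because the legend does not specify them: the ordering is taken as descending R, and the circle size is computed from the adjusted p-value.

```python
import math

# Minimal sketch of Bonferroni filtering for a set of correlation results.
# Assumptions: names and numbers are made up; ordering is descending R;
# the displayed significance uses the adjusted p-value.


def bonferroni_filter(results, alpha=0.05):
    """Keep correlations that remain significant after Bonferroni correction.

    results maps a protein label to a (r, raw_p) tuple. The number of tests
    is taken to be the number of entries in results.
    """
    n_tests = len(results)
    kept = []
    for protein, (r, raw_p) in results.items():
        # Bonferroni: multiply each raw p-value by the number of tests, cap at 1.
        p_adj = min(1.0, raw_p * n_tests)
        if p_adj <= alpha:
            # log10(1/p) is undefined for p == 0, so fall back to infinity.
            size = math.log10(1.0 / p_adj) if p_adj > 0 else math.inf
            kept.append({"protein": protein, "r": r,
                         "p_adj": p_adj, "size": size})
    kept.sort(key=lambda row: row["r"], reverse=True)
    return kept


# Hypothetical (R, raw p-value) pairs, not values from the study.
example = {
    "P1": (0.62, 1e-6),
    "P2": (0.30, 0.02),
    "P3": (0.45, 0.001),
    "P4": (-0.50, 0.0005),
}

if __name__ == "__main__":
    for row in bonferroni_filter(example):
        print(row["protein"], row["r"],
              f"{row['p_adj']:.2g}", f"{row['size']:.2f}")
```

In this made-up example, P2 has a raw p-value below 0.05 but is dropped once the value is multiplied by the four tests, which is the behavior the legends describe for correlations that do not survive correction.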

Figure 1.

Figure 2.

Proteomics of paired CSF and plasma

Figure 3.

For TOC Only.
