
Jun 24, 2013 - The aim of the study was to (i) devise an untargeted metabolomics methodology that reliably compares cerebrospinal fluid (CSF) from ALS...
Article pubs.acs.org/jpr

Metabolomics in Cerebrospinal Fluid of Patients with Amyotrophic Lateral Sclerosis: An Untargeted Approach via High-Resolution Mass Spectrometry

Hélène Blasco,*,†,‡,§ Philippe Corcia,†,‡,∥ Pierre-François Pradat,⊥ Cinzia Bocca,‡,▲ Paul H. Gordon,⊥ Charlotte Veyrat-Durebex,†,‡,§ Sylvie Mavel,†,‡ Lydie Nadal-Desbarats,†,‡,§,▲ Caroline Moreau,▽ David Devos,▽ Christian R. Andres,†,‡,§ and Patrick Emond†,‡,§,▲

† Unité 930, Institut National de la Santé et de la Recherche Médicale, 37044 Tours, France
‡ Université François-Rabelais, 37000 Tours, France
§ Laboratoire de Biochimie et Biologie Moléculaire and ∥ Centre SLA, Service de Neurologie, Centre Hospitalier Régional Universitaire de Tours, 37044 Tours, France
⊥ APHP, Fédération des Maladies du Système Nerveux, Centre Référent Maladie Rare SLA, Hôpital de la Pitié-Salpêtrière, 75013 Paris, France
▲ Programme Pluri-Formation “Analyse des Systèmes Biologiques”, Université François-Rabelais, 37032 Tours, France
▽ Service de Neurologie, Centre Hospitalier Régional Universitaire de Lille, 59037 Lille, France

Supporting Information

ABSTRACT: Amyotrophic lateral sclerosis (ALS) is characterized by the absence of reliable diagnostic biomarkers. The aim of the study was to (i) devise an untargeted metabolomics methodology that reliably compares cerebrospinal fluid (CSF) from ALS patients and controls by liquid chromatography coupled to high-resolution mass spectrometry (LC-HRMS); (ii) ascertain a metabolic signature of ALS by use of the LC-HRMS platform; and (iii) identify metabolites for use as diagnostic or pathophysiologic markers. We developed a method to analyze CSF components by UPLC coupled with a Q-Exactive mass spectrometer that uses electrospray ionization. Metabolomic profiles were created from the CSF obtained at diagnosis from ALS patients and patients with other neurological conditions. We performed multivariate analyses (OPLS-DA) and univariate analyses to assess the contribution of individual metabolites as well as compounds identified in other studies. Sixty-six CSF samples from ALS patients and 128 from controls were analyzed. Metabolome analysis correctly predicted the diagnosis of ALS in more than 80% of cases. OPLS-DA identified four features that discriminated between the diagnostic groups (p < 0.004). Our data demonstrate that untargeted metabolomics with LC-HRMS is a robust procedure to generate a specific metabolic profile for ALS from CSF and could be an important aid to the development of biomarkers for the disease.

KEYWORDS: amyotrophic lateral sclerosis, ALS, cerebrospinal fluid, CSF, biomarkers, LC-HRMS, metabolomics, mass spectrometry



INTRODUCTION

Amyotrophic lateral sclerosis (ALS), the most common adult-onset motor neuron disease, is characterized by degeneration of upper and lower motor neurons. Most people die within 3 years of disease onset.1 The identification of several gene mutations that cause familial ALS has contributed to understanding of the disease. Oxidative stress, mitochondrial dysfunction, glutamate-mediated excitotoxicity, cytoskeletal abnormalities, and protein aggregation2 contribute to motor neuron degeneration, but how these pathways interact in the complex pathophysiology of ALS awaits elucidation. Clinical heterogeneity and lack of biological tools explain the 9−13 month delay between the first symptoms and diagnosis.3 Better diagnostic sensitivity during a presymptomatic phase would enable subjects at genetic risk for ALS to participate in clinical trials.4 Markers of disease progression would add to understanding of disease physiology and make clinical studies more efficient. Targeted exploration of a few biomarkers in blood and cerebrospinal fluid (CSF)5−7 has helped study pathophysiological mechanisms, but no metabolite has proved useful for routine practice. The field of metabolomics is challenging because of the need to select one or several metabolites from a

© XXXX American Chemical Society

Received: April 22, 2013


dx.doi.org/10.1021/pr400376e | J. Proteome Res. XXXX, XXX, XXX−XXX

Journal of Proteome Research

Article

mixed 20 μL of CSF and 180 μL of methanol with vortexing for 10 min.26,27 We centrifuged the sample at 3000 rpm for 10 additional minutes and then transferred 150 μL of the supernatant into a 96-well plate. After nitrogen evaporation, the residue was dissolved in 200 μL of a 1:1 (v/v) mixture of acetonitrile/water. We prepared quality control samples (QC) from a pooled mixture of equal volumes of all samples. QCs followed the same preanalytic steps described above.
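The volumes reported above imply a fixed overall dilution of the CSF before injection. As a rough check, the arithmetic can be sketched as follows. The sketch assumes that volumes are additive and that evaporation and reconstitution recover the extract completely; neither assumption is stated in the text, and the function and variable names are illustrative only. The 5 μL injection volume is the one reported in the LC-HRMS section.

```python
def csf_dilution_factor(csf_ul=20.0, methanol_ul=180.0,
                        supernatant_ul=150.0, reconstitution_ul=200.0):
    """Return the overall dilution factor of CSF in the reconstituted extract.

    Assumes additive volumes and complete recovery after drying,
    which are simplifications not stated in the protocol.
    """
    total_ul = csf_ul + methanol_ul               # extraction mixture volume
    csf_fraction = csf_ul / total_ul              # share of CSF in the mixture (1:10)
    csf_equiv_ul = supernatant_ul * csf_fraction  # CSF-equivalent volume transferred
    return reconstitution_ul / csf_equiv_ul


factor = csf_dilution_factor()
injected_csf_equiv_ul = 5.0 / factor  # CSF equivalent contained in one 5 uL injection
print(round(factor, 2), round(injected_csf_equiv_ul, 3))
```

Under these assumptions, each injection carries well under 1 μL of CSF equivalent, which is consistent with the emphasis the authors place on instrument sensitivity.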

large range of concentrations, polarities, and masses.8,9 Just a few metabolomics studies in ALS have been based on high-throughput techniques10 such as high-performance liquid chromatography (HPLC) followed by electrochemical detection,11 high-resolution ¹H NMR spectroscopy,12−14 or gas chromatography coupled with mass spectrometry (GC-MS).15−17 The recent development of ultraperformance liquid chromatography (UPLC), which separates compounds with high selectivity,18,19 in parallel with progress made in mass spectrometry techniques such as the Orbitrap Fourier transform analyzer, has advanced the metabolomics approach.20 UPLC coupled with high-resolution mass spectrometry (HRMS), characterized by more efficient preanalysis time as well as high sensitivity and resolution, is now capable of analyzing comprehensive metabolite patterns. While metabolomics is being increasingly applied to various fields, the emerging LC-MS technologies have only rarely been used in health fields21 and never in ALS. The aim of this study was to assess the analytical robustness and potential usefulness of UPLC-HRMS in the search for a CSF metabolic signature in ALS. We compared the CSF metabolome between ALS patients and patients having other neurological diseases in order to examine biochemical factors that could be used as diagnostic biomarkers or to elucidate pathophysiological mechanisms.



Liquid Chromatography−High-Resolution Mass Spectrometry Analysis

LC-HRMS analysis was performed on a UPLC Ultimate 3000 system (Dionex) coupled to a Q-Exactive mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) and operated in the positive (ESI+) and negative (ESI−) electrospray ionization modes (one run for each mode). The system was controlled by Xcalibur 2.2 (Thermo Fisher Scientific). For both modes, the heated electrospray ionization (HESI) source used a spray voltage of 3 kV, capillary temperature of 325 °C, heater temperature of 350 °C, sheath gas flow of 25 arbitrary units (AU), auxiliary gas flow of 8 AU, spare gas flow of 3 AU, and tube lens voltage of 100 V. During the full-scan acquisition, which ranged from 66.7 to 1000 m/z, the instrument operated at 70 000 resolution (m/z = 200), with an automatic gain control (AGC) target of 5 × 10⁵ charges and a maximum injection time (IT) of 120 ms. For MS2 analyses, the isolation window was set at 0.4 m/z, and the instrument was operated at 35 000 resolution (m/z = 200), with an AGC target of 1 × 10⁵ charges, maximum IT of 250 ms, and general NCE of 35 eV; collision energy was set at 35% with a ramp of 50%, and N2 was used as collision gas.

The mass spectrometer was coupled with an Ultimate WPS-3000 UHPLC system (Dionex, Germering, Germany). Chromatography was carried out with a Phenomenex Kinetex 1.7 μm XB-C18 HPLC column (150 mm × 2.10 mm, 100 Å) kept at a temperature of 40 °C. A multistep gradient (preceded by a 3-min equilibration time) used mobile phase A, 0.1% formic acid in water, and mobile phase B, acetonitrile (ACN) acidified with 0.1% formic acid; the gradient operated at a flow rate of 0.3 mL/min over a run time of 17 min for the negative mode and 21 min for the positive mode. The UHPLC autosampler temperature was set at 4 °C, and the injection volume for each sample was 5 μL.
The samples were randomized before the preanalytical step so that the injection order was independent of clinical status. Nine QC samples were injected to equilibrate the system (three injections of three QC samples) before each analytic series, and one QC sample was injected after every 10 samples to monitor the reproducibility of the LC-HRMS (12 QC per 96-well plate).28
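The injection scheme described above (randomized sample order, nine conditioning QC injections, then one QC after every 10 samples) can be illustrated with a short sketch. This is only one plausible reading of the scheme, not the authors' acquisition script: the sample identifiers, the random seed, and the function name are hypothetical, and the sketch randomizes at the injection stage for simplicity.

```python
import random


def build_injection_sequence(sample_ids, n_equilibration_qc=9,
                             qc_interval=10, seed=0):
    """Return a randomized injection list with interleaved QC injections."""
    order = list(sample_ids)
    random.Random(seed).shuffle(order)       # order independent of clinical status
    sequence = ["QC"] * n_equilibration_qc   # conditioning injections before the series
    for i, sample in enumerate(order, start=1):
        sequence.append(sample)
        if i % qc_interval == 0:             # one QC after every 10 samples
            sequence.append("QC")
    return sequence


samples = [f"S{i:03d}" for i in range(1, 31)]  # 30 hypothetical sample IDs
seq = build_injection_sequence(samples)
print(len(seq), seq[:12])
```

Fixing the seed makes the run order reproducible, which helps when QC trends are later compared against injection position.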

EXPERIMENTAL SECTION

Patients and Controls

Cerebrospinal fluid samples were obtained at the time of diagnosis from patients at three French ALS centers (Tours, Lille, and Paris). Patients were included if they met El Escorial criteria for clinically definite or probable ALS.22 CSF analysis is done as part of the initial evaluation in all patients with ALS at these centers. The participants in this study gave informed consent. Patients from Lille and Paris gave consent in writing for analysis of their CSF samples; part of the sample was used in complementary analyses with LC-HRMS. In Tours, patients gave informed verbal consent for the CSF analyses, a process approved by the Persons Protection Committee, a board overseeing ethical aspects of research in Tours. Information collected from each patient included age, sex, site of onset, age at onset, and diagnosis. Site of onset was defined as bulbar or limb. Age at onset was taken as the time at which motor weakness was first noted by the patient. We compared clinical, demographic, and biological data for ALS patients among centers. The control group encompassed individuals with other neurological diseases that required lumbar puncture at the time of diagnosis: axonal peripheral neuropathy, non-ALS neurological and neurodegenerative diseases, chronic inflammatory demyelinating polyradiculoneuropathy, and multiple sclerosis. Tests of the CSF included standard bacteriological and biochemical analyses. CSF samples were stored in polypropylene tubes at −80 °C immediately after collection and until analysis.23,24 Before the experiments, samples were thawed, centrifuged at 3000 rpm for 5 min, and separated into two aliquots. No participant had been in a previous research study.14

Data Preprocessing

SIEVE software (Thermo Fisher Scientific) processed raw data for peak alignments and framing in one batch. This process produced a table of detected features aligned by time, with columns for sample retention time, m/z ratio, and intensity (i.e. peak area). We used a low cutoff intensity value for sensitivity considerations; the SIEVE framing algorithm considers all areas above the cutoff value as potential peaks, so we also inspected each frame visually to confirm the shape. Only those frames that had a Gaussian conformation were kept in the table.
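The feature table described above holds one peak area per frame and sample, and the study also reports normalizing each peak area to the total peak area of its chromatogram. A minimal sketch of that normalization is given below, assuming a single sample represented as a dictionary keyed by (retention time, m/z); the frames and areas are invented for illustration, and this is not the SIEVE implementation.

```python
def normalize_to_total_area(peak_areas):
    """Scale one sample's peak areas so that they sum to 1."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {frame: area / total for frame, area in peak_areas.items()}


# Hypothetical frames keyed by (retention time in min, m/z); areas are invented.
sample = {
    (1.25, 132.0768): 4.0e6,
    (3.80, 166.0863): 1.0e6,
    (7.10, 205.0972): 5.0e6,
}
normalized = normalize_to_total_area(sample)
print(normalized[(1.25, 132.0768)])
```

Total-area normalization compensates for global differences in injected amount or signal intensity, at the cost of making every feature depend on all the others; this may be one reason the authors analyzed normalized and non-normalized data separately.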

Sample Preparation

We prepared the samples by standard methods for untargeted metabolomics using LC-HRMS: unselective methods with rapid, minimal handling, steps to ensure high reproducibility, and incorporation of a metabolism quenching step.25 First, we


Table 1. ALS and Non-ALS Patient Characteristics^a

                        ALS patients (n = 66)  non-ALS patients (n = 128)
characteristic          group I                group II       group III      group IV       group V        group VI       p-value
no. of females (%)      24 (36.4)              13 (65)        17 (46)        6 (33.3)       12 (63.2)      16 (47.1)      0.17^b
no. of males (%)        42 (63.6)              7 (35)         20 (54)        12 (66.7)      7 (36.8)       18 (52.9)
age, years              63.4 ± 11.5            49.3 ± 12.2    67.9 ± 13.6    67.4 ± 9.9     64.2 ± 12.4    57.7 ± 16.7    0.37
blood glucose, mmol/L   5.67 ± 1.55            4.94 ± 0.80    5.47 ± 1.07    5.25 ± 0.98    5.73 ± 0.94    5.25 ± 1.10    0.92
CSF glucose, mmol/L     3.95 ± 0.44            3.91 ± 0.49    3.83 ± 0.42    4.04 ± 0.58    3.85 ± 0.4     3.83 ± 0.62    0.18
lactate, mmol/L         1.88 ± 0.28            1.96 ± 0.24    1.74 ± 0.21    1.84 ± 0.31    1.94 ± 0.35    1.75 ± 0.30    0.28
protein, mmol/L         0.52 ± 0.25            0.42 ± 0.14    0.58 ± 0.34    0.65 ± 0.19    0.50 ± 0.19    0.53 ± 0.23    0.19
leucocytes/mm³          1.02 ± 1.12            1.47 ± 2.34    1.12 ± 1.63    1.18 ± 1.63    0.8 ± 1.80     1.61 ± 2.86    0.45
red blood cells/mm³     125 ± 355              276 ± 930      102 ± 222      188 ± 889      222 ± 536      464 ± 882      0.55

^a Data are expressed as median (range). Group I, ALS; group II, multiple sclerosis; group III, axonal peripheral neuropathy; group IV, chronic inflammatory demyelinating polyradiculoneuropathy; group V, non-ALS neurodegenerative disease; group VI, other neurological diseases.
^b p-value for sex.

R² is defined as a fraction of the variance explained by a component. Cross-validation of R² gives Q², which represents the proportion of total variation predicted by a component. The quality of the models was described by the cumulative modeled variation in the X matrix (metabolites), R²X(cum); the cumulative modeled variation in the Y matrix (CSF samples), R²Y(cum); and the cross-validated predictive ability, Q²(cum). Models were rejected if there was complete overlap of Q² distributions [Q²(cum) < 0] or low classification rates [Q²(cum) < 0.05 and eigenvalues > 2]. We considered a model robust if Q² > 40% and R² > 50%, but these cutoff values need to be confirmed under biological conditions. The set of multiple models resulting from the cross-validation was used to calculate jackknife uncertainty measures. We fixed the maximum number of iterations at 200 to ensure convergence of the OPLS algorithm.32 A misclassification table was generated to show the proportion of correctly classified observations in the data set (ALS vs non-ALS diagnosis). Variable importance parameters (VIP) ranked the compounds according to their contribution to the model.
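The model-quality measures above can be made concrete with a small sketch. It assumes the conventional definitions, in which R2Y is one minus the residual sum of squares over the total sum of squares for fitted values and Q2 is the same quantity computed from cross-validated predictions; the text does not spell out the formulas used by the software, so this is an assumption. The class coding (1 = ALS, 0 = non-ALS), the 0.5 decision threshold, and all numerical values are invented for illustration.

```python
def explained_fraction(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares about the mean)."""
    mean_y = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot


def misclassification_table(y_true, y_pred, threshold=0.5):
    """Count (actual, predicted) class pairs for a 0/1 class variable."""
    table = {("ALS", "ALS"): 0, ("ALS", "non-ALS"): 0,
             ("non-ALS", "ALS"): 0, ("non-ALS", "non-ALS"): 0}
    for y, p in zip(y_true, y_pred):
        actual = "ALS" if y == 1 else "non-ALS"
        predicted = "ALS" if p >= threshold else "non-ALS"
        table[(actual, predicted)] += 1
    return table


def is_robust(r2, q2, r2_min=0.50, q2_min=0.40):
    """Apply the cutoffs quoted in the text (R2 > 50%, Q2 > 40%)."""
    return r2 > r2_min and q2 > q2_min


# Invented values: class labels, fitted values, cross-validated predictions.
y = [1, 1, 1, 0, 0, 0]
y_fit = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]
y_cv = [0.8, 0.6, 0.4, 0.3, 0.2, 0.5]

r2y = explained_fraction(y, y_fit)
q2 = explained_fraction(y, y_cv)
table = misclassification_table(y, y_cv)
print(round(r2y, 3), round(q2, 3), is_robust(r2y, q2), table)
```

In this toy case the fitted values explain the class variable well, but the cross-validated predictions fall just short of the Q2 cutoff, so the model would not count as robust under the stated criteria.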

We normalized each peak area to the total peak area of each chromatogram and then analyzed the non-normalized and normalized data separately.

Quality Control Analysis

QC samples were analyzed to assess reproducibility of the methods overall and between different phases (intra- and interplate reproducibility).28,29 The stability of mass accuracy and retention time was evaluated. Clustering of QC samples was assessed by principal component analysis (PCA) according to peak area data. We also generated control charts to evaluate trends in QC variability. These charts show the sample number from which the analytical system is equilibrated.

Multivariate Data Analysis

Following linear transformation, the preprocessed data sets were used as input for SIMCA-P+ version 12.0 (Umetrics, Umeå, Sweden). For unsupervised PCA,30 data were mean-centered and scaled to unit variance to identify similarities or differences between sample profiles. Spectral variation was reduced to a series of principal components (PCs), each representing correlated spectral changes, and summarized in a score plot. The PCs, new variables that are orthogonal to each other, explained progressively less variance in the data set. The PCs were displayed in a two-dimensional score plot, allowing visualization of the distribution and grouping of the samples in the new variable space. Score plots were visually inspected for grouping, trends, and outliers in the data. If outliers were detected in the distance to model plot (DModX), which is based on the residual variance model, they were rejected, and the PCA model was rebuilt.

Orthogonal partial least-squares discriminant analysis (OPLS-DA) evaluated variations in frame areas between groups: variation in the measured data was partitioned into two blocks by the program, one containing variations that correlated with the class identifier and the other containing variations that were orthogonal to the first block and thus did not contribute to discrimination between groups.31 Next, we created a score plot to visualize the OPLS-DA model and characterized the contribution of metabolites to the separation of classes using the loading plot and the contribution plot. OPLS-DA was cross-validated by withholding 1/7 of the samples in each of seven successive simulations, such that each sample was omitted once, in order to guard against overfitting. This approach meant that the OPLS-DA was built from one “predictive” component and two or more orthogonal components. Q² and R² assessed the robustness of the model.
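The sevenfold cross-validation described above, in which each sample is withheld exactly once, can be sketched as an index partition. The interleaved assignment of samples to groups is an assumption made for illustration; the text does not say how the software forms the groups. The total of 194 samples (66 ALS plus 128 controls) comes from the abstract and is used only to give the example a realistic size, without accounting for any rejected outliers.

```python
def seven_fold_rounds(n_samples, n_groups=7):
    """Yield (training indices, withheld indices) for each cross-validation round."""
    # Interleaved assignment: sample i goes to group i % n_groups (an assumption).
    groups = [list(range(g, n_samples, n_groups)) for g in range(n_groups)]
    for withheld in groups:
        held = set(withheld)
        training = [i for i in range(n_samples) if i not in held]
        yield training, withheld


n = 194  # 66 ALS + 128 control samples reported in the abstract
rounds = list(seven_fold_rounds(n))
withheld_counts = [len(w) for _, w in rounds]
print(withheld_counts)
```

Because the groups are disjoint and together cover every index, each sample contributes exactly one out-of-sample prediction, which is what a cross-validated statistic such as Q2 is built from.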

Univariate Data Analysis

Univariate analyses focused on the VIP obtained from the best models (data from ESI+ and ESI− mode acquisitions). VIP integrations performed with SIEVE were verified, and if necessary, the peaks were manually reintegrated by use of LCquan (Thermo Fisher Scientific). We checked that the coefficient of variation for all VIP was