
Article

Modelling Longitudinal Metabonomics and Microbiota Interactions in C57BL/6 mice fed a high fat diet Ivan Montoliu, Ornella Cominetti, Claire Laurence Boulange, Bernard Berger, Jay Siddharth, Jeremy K. Nicholson, and François-Pierre J. Martin Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.6b01343 • Publication Date (Web): 10 Jul 2016 Downloaded from http://pubs.acs.org on July 12, 2016


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Modelling Longitudinal Metabonomics and Microbiota Interactions in C57BL/6 mice fed a high fat diet

Running title: Longitudinal integration of Metabonomics and Microbiota

Ivan Montoliu,1,2# Ornella Cominetti,1# Claire L. Boulangé,2 Bernard Berger,3 Jay Siddharth,1 Jeremy Nicholson,2 François-Pierre J. Martin1

1Nestlé Institute of Health Sciences SA, EPFL Innovation Park, Building H, 1015 Lausanne, Switzerland; 2Department of Biomolecular Medicine, Division of Surgery, Oncology, Reproductive Biology and Anaesthetics, Faculty of Medicine, Imperial College London, Sir Alexander Fleming Building, South Kensington Campus, London SW7 2AZ, UK; 3Nestlé Research Center, Vers-chez-les-Blanc, CH-1000 Lausanne 26, Switzerland

# contributed equally to the work

* Corresponding authors: Ornella Cominetti [email protected], Ivan Montoliu [email protected], and Francois-Pierre Martin [email protected], Nestlé Institute of Health Sciences SA, EPFL Innovation Park, Building H, 1015 Lausanne, Switzerland


Abstract

Longitudinal studies typically aim to follow populations of subjects over time and are important for understanding the global evolution of biological processes. For longitudinal Omics data, the choice of appropriate modeling tools often depends on the overall objective of the study and on the constraints imposed by the data. Here, we report the use of multilevel simultaneous component analysis (MSCA), Orthogonal Projection on Latent Structures (OPLS) and regularized Canonical Correlation Analysis (rCCA) to study associations between specific longitudinal urine metabonomics data and microbiome data in a diet-induced obesity model using C57BL/6 mice. 1H NMR urine metabolic profiling was performed on samples collected weekly over a period of 13 weeks, and stool microbial composition was assessed using 16S rRNA gene sequencing at three specific time periods (baseline, first week response, end of study). MSCA and OPLS allowed us to explore longitudinal urine metabonomics data in relation to the dietary groups and to dietary effects on body weight. In addition, we report a data integration strategy based on regularized CCA and correlation analyses of urine metabonomics data and 16S rRNA gene sequencing data to investigate the functional relationships between metabolites and gut microbial composition. By breaking down the complexity of this dataset, this workflow allowed the most relevant patterns to be extracted to further explore physiological processes at an anthropometric, cellular and molecular level.


Keywords: High fat diet, PCA, OPLS, multilevel simultaneous component analysis (MSCA), metabonomics, microbiota, prebiotics, regularized Canonical Correlation Analysis (rCCA)


Introduction

The increasing etiological complexity due to the rising prevalence of multifactorial disorders, the lack of understanding of the molecular processes at play, and the need for disease prediction in asymptomatic conditions are some of the many challenges that systems biology is well-suited to address.1 Complex interactions between genetic, gut microbiota, and environmental factors,2-5 such as a high calorie diet and sedentary lifestyles,6 and even increased food energy supply,7,8 may play a role in the development of obesity. Major advances in metabonomics have provided novel insights into the metabolic processes involved in the onset of type-2 diabetes and insulin-resistance, and related changes in body composition and physiology.9-13 As with other omics technologies, the analysis of metabolic profiling14-16 data based on mass spectrometry and nuclear magnetic resonance (NMR) spectroscopy brings a number of challenges, such as the high-dimensional nature of Omics data. When it comes to longitudinal Omics data, i.e. one or more types of Omics data measured over time, the statistical analysis becomes even more challenging.17-19 However, longitudinal studies are key to understanding the global evolution of biological processes. Such studies typically aim to follow populations of subjects over time. The resulting time profiles can be clustered to identify subgroups or can be used for monitoring, forecasting and diagnostic purposes.20,21 Additional challenges when dealing with longitudinal data include auto-correlation of repeated measurements of the same variables, which is a limitation when trying to use certain techniques such as projection-based methods (e.g. Principal Component Analysis (PCA)22 and Partial Least Squares regression (PLS)23,24), which are well-suited to tackle high-dimensional datasets but do not take into account subjects' trajectories. However, considering the vast array of techniques and their specific advantages and limitations,1 the choice of the best adapted modeling tools will often depend on the overall objective of the study and on the constraints imposed by the data. In the present contribution, we report the use of multilevel simultaneous component analysis (MSCA), OPLS and regularized Canonical Correlation Analysis (rCCA) to help build associations between specific longitudinal metabotypes obtained from urine and temporal changes of the gut microbiome in diet-induced obesity models using C57BL/6 mice.

Isogenic C57BL/6 mice are well known to develop three specific metabolic phenotypes when fed a high fat (HF) diet, namely obese diabetic, lean diabetic and lean non-diabetic.25-27 In a recent study, we modulated the metabolic and nutritional status of C57BL/6 mice with high fat diets with and without prebiotics to explore the relationship between metabolic-microbial biomarkers and body weight, based on previous work.28-31 Prebiotics are non-digestible food ingredients, generally oligosaccharides, which modify the balance of the intestinal microbiota by stimulating the activity of health-beneficial bacteria.32 Here, we compared the effects of consuming a HF diet combined either with fructans (inulin and fructo-oligosaccharides), galactosyl-oligosaccharides (GOS), or an in-house preparation of bovine milk oligosaccharides (BMOS). Urine metabolic profiling was performed on samples collected weekly over a period of 13 weeks, and stool microbial composition was assessed using 16S rRNA gene sequencing at three specific time periods (baseline, first week response and end of study, Figure 1). In the present contribution, we report the use of MSCA and OPLS to explore longitudinal urine metabonomics data in relation to body weight and dietary intervention (Figure 1). MSCA is a generalization of the concept of ANOVA to the multivariate domain, where within- and between-subjects variance is analyzed separately.33,34 In addition, we report a data integration strategy based on regularized CCA and correlation analyses of urine metabonomics data and 16S rRNA gene sequencing data to explore functional relationships between metabolites and gut microbial composition (Figure 1). CCA is a generalization of multivariate linear regression to the case where more than one response variable is present, and the regularization step allows its use when collinearities are present in one or in both datasets.35 Such approaches are particularly well suited to problems with large numbers of variables and small numbers of samples. We exemplify how such multivariate methodologies can help in studying associations between specific longitudinal urine metabonomics data and microbiome data and in exploring similarities and specificities of the dietary effects.
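The separation of within- and between-subject variance that MSCA relies on can be illustrated with a minimal sketch. It is not the routine used in this work: the function names, the subject labels and the toy numbers are hypothetical, the design is assumed to be balanced, and only the variance-splitting step is shown (in MSCA, a component model such as PCA would then be fitted to each part).

```python
from collections import defaultdict


def column_means(rows):
    """Mean of each column for a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def msca_split(data, subjects):
    """Split centred data into between- and within-subject parts.

    data     : list of rows (one row per sample, one column per metabolite)
    subjects : subject label of each row (hypothetical identifiers)
    """
    grand = column_means(data)
    centred = [[x - g for x, g in zip(row, grand)] for row in data]

    # Collect the centred rows of each subject and average them.
    by_subject = defaultdict(list)
    for row, subject in zip(centred, subjects):
        by_subject[subject].append(row)
    subject_means = {s: column_means(rows) for s, rows in by_subject.items()}

    # Between part: each sample is replaced by its subject mean.
    between = [subject_means[s] for s in subjects]
    # Within part: deviation of each sample from its subject mean.
    within = [[x - m for x, m in zip(row, subject_means[s])]
              for row, s in zip(centred, subjects)]
    return centred, between, within


def sum_of_squares(rows):
    return sum(x * x for row in rows for x in row)


# Toy example (made-up numbers): 2 animals x 3 time points x 2 metabolites.
data = [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0],
        [7.0, 1.0], [8.0, 2.0], [9.0, 3.0]]
subjects = ["m1", "m1", "m1", "m2", "m2", "m2"]

centred, between, within = msca_split(data, subjects)
ss_total = sum_of_squares(centred)
ss_between = sum_of_squares(between)
ss_within = sum_of_squares(within)
print(ss_total, ss_between, ss_within)
```

In this toy case the between-animal and within-animal sums of squares add up to the total sum of squares, which is the ANOVA-like property that allows the two sources of variation to be modelled separately.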

Material and Methods

Animal handling procedure and sample preparation

The experiment was carried out under appropriate national guidelines at the Nestlé Research Center (NRC, Switzerland). All experiments were conducted according to local animal welfare policy and approved by Swiss governmental veterinary offices (authorisation number VD-2231). A total of 90 C57BL/6 male mice aged 6 weeks were permanently housed in individual cages under a 12h-12h light-dark regime and fed ad libitum throughout the experiment. During a first acclimatization period of 3 weeks, animals were fed a standard low fat (LF) diet (Diet D12450B, Research Diets, USA). Based on body weight and fasting blood glucose measured during the second week of acclimatization, animals were randomized into 6 groups of 15 mice each. At day 0 (at the end of the three weeks of acclimatization), one group was maintained on the LF diet to serve as reference (group A) and another group was fed a control high fat diet (HF, group B; Diet 12492, Research Diets, USA). The four remaining groups received HF diets modified in terms of their composition in simple and complex sugars. In order to keep the diets isocaloric, the energy from simple sugars was matched to that from sucrose, and the energy from oligosaccharides to that from maltodextrin. The 4 groups received either HF_S diet (HF containing glucose, lactose and galactose, group C), HF + GOS (group D), HF + BMOS (group E), or HF + FOS_IN diet (HF containing fructosyl-oligosaccharides and inulin, group F). All five HF diets were also supplemented with 10% cellulose so that stable pellets could be generated for food intake monitoring and diet management purposes. The HF + GOS diet contained 211 g of GOS (Vivinal-GOS, Borculo Domo Ingredients, Netherlands) per 4057 kcal, the HF + BMOS diet contained 140 g of fibre (composition reported previously30) per 4057 kcal, the HF + FOS_IN diet contained 100 g of fibre (30% FOS, 70% inulin, Beneo GmbH, Germany) per 4057 kcal, and the HF_S diet contained 35.1 g of dextrose, 32.3 g of lactose and 1.5 g of galactose per 4057 kcal. By experimental design, all diets were isocaloric and contained matched proportions of proteins, fat, carbohydrates and fibres. However, by nature the BMOS mix contains a greater proportion of free sugars than the GOS and FOS_IN fibre mixes. We therefore designed a HF_S diet that contained the same proportions of free sugars as the HF + BMOS diet. The three sources of fibre were selected based on scientific evidence of their beneficial impact on several features of metabolic syndrome, including body weight and body composition, glucose and lipid metabolism.30,31,36,37 In particular, we have previously characterized the impact of GOS and BMOS on host-gut-microbiota co-metabolism in various pre-clinical studies.30,31 All the diets were manufactured using the prebiotic mixes at Research Diets, USA. Mouse body weight and food consumption were measured weekly during the experiment. Urine and stool samples were collected weekly for each animal over the duration of the experiment (i.e. at days D-21, D-14, D-7 and D0 before the diet switch, and D7, D14, D21, D28, D35, D42, D49, D56, D63 and D70 after the diet switch, Figure 1). All the samples were snap-frozen at -80 °C on the day of collection and stored until analysis.


Urine collection was performed in the morning by gently pressing the abdomen.38 After 70 days of dietary intervention, mice were euthanized by anesthesia with isoflurane.

1H NMR spectroscopic analysis of urine

A volume of 10 µl of urine was homogenized in 50 µl of phosphate buffer solution (NaHPO4, 0.6M pH=7.0, 100% D2O) containing sodium azide (3 mM) and TSP (0.5 mM). After centrifugation, samples were transferred into 1.7 mm diameter NMR tubes using a syringe. 1H NMR spectra were then recorded on a 600 MHz Avance III Bruker NMR spectrometer (Bruker Biospin, Rheinstetten, Germany) operating at 600.13 MHz, by performing 64 scans of a standard sequence with 98K data-points at 300 K. Processing of 1H NMR spectra was carried out using the TOPSPIN 2.1 software package (Bruker Biospin, Rheinstetten, Germany). For each spectrum, an exponential function corresponding to a line broadening of 1 Hz and Fourier transformation were applied prior to automated phasing, baseline correction and calibration using the TSP signal at δ 0. The spectral data (from δ 0.2 to δ 9.5) were imported into Matlab software (version R2012b, The MathWorks Inc., Natick, MA) and normalized to total area after solvent peak removal. Poor quality or highly diluted spectra were discarded from the subsequent analysis. Based on previous metabonomics characterization of this animal model,29 intermediate metabolites from host-gut microbial co-metabolism, as well as from fatty acid β-oxidation, branched-chain amino acid oxidation, the Krebs cycle and nicotinamide adenine dinucleotide pathways, that were assignable on urine 1H NMR spectra were integrated.

16S rRNA gene sequencing of fecal microbiota

DNA material was extracted from fecal samples collected during the acclimatization period (day D-7), one week after the diet switch (D7), and at the end of the experiment (day D70), and the 16S rRNA genes of the corresponding bacterial communities were sequenced as previously described.39 The pyrosequencing dataset (454 FLX Titanium technology, Microsynth AG, Balgach, Switzerland) was analysed as described in the Mothur 454 SOP accessed in March 2014.40 Briefly, the samples were de-multiplexed and error correction was performed as per the SOP, and the sequences were aligned against SILVA release 102 from the Mothur website. The classification of the sequences was performed using RDP release 9 modified for use with the Mothur SOP. The entire analysis was performed using Mothur release 1.32.0 on a compute cluster running RHEL version 6. Using this analytical workflow, we used the genus level as the most relevant annotation level. A total of 297 microbial variables at genus level were identified, and 175 microbial variables were kept for analysis after data curation and QC.

Multivariate data analysis of urine metabonomics data

To extract phenotype-associated features from 1H NMR spectral data, three multivariate data analysis methodologies were used: Principal Component Analysis (PCA),22 Orthogonal Projection on Latent Structures (OPLS),34 and Multilevel Simultaneous Component Analysis (MSCA).41 In all analyses, spectroscopic data were centred and scaled to unit variance. PCA was used for exploratory data visualization and outlier detection purposes. OPLS models were used to model the relationship between spectroscopic and weight gain data. A variant of OPLS (OPLS-Discriminant Analysis)42 was used to highlight discriminant profiles between diets. The complexity of both OPLS models was determined by internal validation following a random cross-validation (CV) scheme using 10 CV groups. Longitudinal (multilevel) multivariate data were modelled using MSCA. The dimensionality of the model was set to two principal components for each source of variation (between and within individuals). PCA and OPLS models were built using the SIMCA P data analysis program (Umetrics, Sweden), while the MSCA routine was adapted to R43 from the original Matlab routine (http://www.bdagroup.nl/content/Downloads/software/software.php). Data handling, importing, pre-processing and visualization were performed using in-house routines also written in R.43

Integration of metabonomics and microbiome data

The R mixOmics44 library was employed for regularized CCA (rCCA)45 modeling of urine metabonomics data and microbiome data (16S rRNA gene sequencing data). The regularization parameters were determined using the tune.rcc routine with a 10-fold internal cross-validation step. The function cim (clustered image maps) was used to plot the regularized CCA similarity scores between the variables as heatmaps for all the different dietary groups. The order of the variables was obtained by applying hierarchical clustering (complete linkage, Euclidean distance) to the global data. Finally, the plotVar function was used to plot the variables and to observe strong correlations between variables (only variables with a correlation above 0.5 are plotted outside the inner circle).
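For reference, the optimization problem behind regularized CCA can be written in a commonly used ridge-type form, which is assumed here to correspond to the regularization tuned by tune.rcc. The symbols are introduced for illustration only and do not appear in the original text: X and Y denote the centred metabolite and microbiota data matrices, S_XX, S_YY and S_XY their sample covariance and cross-covariance matrices, a and b the canonical weight vectors, and λ1 and λ2 the two regularization parameters selected by cross-validation.

```latex
\rho_1 \;=\; \max_{a,\,b}\;
\frac{a^{\top} S_{XY}\, b}
     {\sqrt{a^{\top}\left(S_{XX} + \lambda_1 I\right) a}\;
      \sqrt{b^{\top}\left(S_{YY} + \lambda_2 I\right) b}}
```

Setting λ1 = λ2 = 0 recovers classical CCA, which is ill-posed when the number of variables exceeds the number of samples because S_XX and S_YY are then singular. Adding the ridge terms makes both matrices invertible, which is what allows the method to be applied to collinear, high-dimensional data.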

Results and discussion

Experimental design

In the present study, we compared, in C57BL/6 mice, the effects of consuming a HF diet with those of a HF diet combined with either FOS_IN (fructo-oligosaccharides and inulin), GOS, or BMOS. Urine and stool samples were collected for each animal 21 days (D-21), 14 days (D-14) and 7 days (D-7) before the diet switch, on the day of the switch from a low fat diet (day 0, D0), and weekly from 7 days (D7) to 70 days (D70) after the switch (Figure 1). Urine metabolic profiles (all time points) and stool microbiota profiles (D-7, D7, D70) were integrated longitudinally in relation to body weight and dietary intervention using a combination of MSCA, OPLS and rCCA methodologies.

Overview of urine 1H NMR metabolic profiles


Based on our previous urine metabonomic observations in C57BL/6 mice,29,31 and with the aim of reducing data dimensionality in a meaningful way, we selected 36 urinary metabolites for which representative 1H NMR signals were integrated across the 1170 urinary spectra. Relative quantitative data used for subsequent analysis were generated for acetate, acyl-carnitine, carnitine, cis-aconitate, citrate, succinate, oxaloacetate, α-ketoglutarate, fumarate, creatine, creatinine, glucuronate, guanidoacetate, hippurate, indoxyl-sulfate, isobutyrate, isoleucine, valine, leucine, isovalerylglycine, hexanoyl-glycine, vinylacetylglycine (putative assignment), α-keto-methylvalerate, α-ketoisovalerate, 4-hydroxyphenylacetate, phenylacetylglycine, N-acetyl-glycoprotein, taurine, trimethylamine (TMA), trimethylamine-N-oxide (TMAO), sucrose, tartrate, nicotinate, nicotinurate, N1-methyl-2-pyridone-5-carboxamide (2-PY) and N1-methyl-4-pyridone-3-carboxamide (4-PY). These metabolites are representative of central energy metabolism (Krebs cycle, fatty acid, branched-chain amino acid and NAD metabolic pathways) and host-gut bacterial co-metabolism (methylamines, aromatic amino acid metabolism).

Analysis of longitudinal urine metabonomic data in relation to dietary groups

The MSCA method was applied to explore the influence of diet and time on the variations in urinary metabolites over the duration of the study (i.e. 14 time points). MSCA modelling integrates two concepts, namely the splitting of variance sources as in ANOVA and a multivariate data projection onto a reduced subspace (PCA). The MSCA analysis therefore provides information on the variance between subjects (here associated with the individual response to the HF dietary intervention) and on additional sources of variability, here associated with the individual changes experienced over the experimental time.46 Such an approach appears suitable for evaluating longitudinal metabonomics data sets for both exploratory and predictive analyses. In the present contribution, we used MSCA for exploratory purposes by modelling both sources of variation by means of PCA. Due to the exploratory nature of this model, only the first two principal components were used (Figure 2). The analysis of the scores of the between-animal source of variance showed a major separation between animals receiving the LF diet and the other animals along the first component (43.61 % of explained variance), whilst the effects of prebiotics in HF diets could be observed along the second component (17.82 % of explained variance). In this score plot, each dot corresponds to one single animal, as expressed by its metabolic profiles summarized over all the time points simultaneously (Figure 2 A). Interestingly, the groups HF + GOS and HF + FOS_IN appeared in the same sub-space, suggesting a strong metabolic similarity, whilst groups HF and HF_S showed only partial overlap in this projection. The analysis of the related loadings showed a consistent structure in the variables belonging to the same metabolite or metabolic pathway (Figure 2 B). In particular, the main source of variation across the HF diet groups was related to central energy metabolic intermediates (e.g. citrate, succinate, α-ketoglutarate, oxaloacetate, fumarate, carnitine) and bacterial metabolites from aromatic amino acid metabolism (e.g. phenylacetylglycine, indoxylsulfate or p-hydroxyphenylacetate).

Analysis of longitudinal urine metabonomic data in relation to dietary groups and body weight

We have previously investigated the longitudinal metabolic response to high-fat diet-induced obesity in the C57BL/6 mouse strain, which is predisposed to insulin resistance (IR) and obesity.29 Based on the observation of a significant correlation between body weight after one week and final body weight, we previously explored the early metabolic signatures associated with the high phenotypic variability in the response to high fat-induced weight gain.
This previous analysis identified key host and gut microbial metabolic features associated with the susceptibility of the animals to gain more weight under diet-induced obesity (DIO). Since a main phenotypic effect of high fat diet intervention in C57BL/6 mice is increased body weight,29 we further explored the covariance between body weight and the metabolic profiles using OPLS approaches. Supplementary Figure S1 describes the body weight variations observed in the present experimental study. A first OPLS model generated from all metabonomics data with 1 predictive and 2 orthogonal components revealed statistically significant relationships between these parameters (Q2Y=0.54, R2Y=0.55, R2X=0.43). The most influential metabolites were TMAO, sucrose, oxaloacetate, indoxylsulfate, tartrate, hippurate, creatine, and cis-aconitate. The use of this multivariate regression approach implies that pseudo-replication effects are considered to have little or no impact. To check this assumption, MSCA was used to verify the relevance of the variables. Results from this model supported it by showing a good degree of overlap with the most influential variables in the OPLS model. However, major differences in longitudinal metabolic and body weight data over the course of the study in animals maintained on the LF diet were identified as a source of bias. To address this observation, we assessed the covariance between urine metabonomics data and body weight in a series of sub-models using OPLS (Supplementary Table 1). First, we assessed the metabolic association across the range of body weight changes under LF and HF control diets (HF and HF_S). Then, considering that the fluctuations in body weight are of smaller amplitude across the animals receiving HF diets, we further analysed these associations by combining two groups of animals, including a group of HF control diets for comparison. Due to the experimental design, further tests were included for the HF_S group with regard to the HF and HF + BMOS groups. The relationships with body weight were then assessed by grouping HF with LF groups, and HF_S with LF groups.
Then, groups of animals receiving HF diets combined with prebiotics were modelled separately with the HF group, and a specific model was generated for the HF_S and HF + BMOS groups. Overall, significant OPLS models could be generated for the 7 conditions of interest (Supplementary Table 1). Shared and Unique Structures plots47 were employed to compare different multivariate regression models that share data and have the same dimensionality. This graphical tool plots the correlation-scaled loading weights of one model against those of another to reveal possible deviations from the diagonal running through [-1, 0, 1]. The closer a variable is to the diagonal, the more similar its behaviour is in both models. Such plots therefore make the data easier to interpret and help identify variables that lie far from this diagonal and may therefore behave differently in one of the models. By choosing the right combination of models, one can use this approach to compare the changes due to an intervention with a baseline or control. In our case, for variable selection purposes, we introduced some minor modifications to the general procedure. OPLS regression model loadings (not correlation scaled) from both models were plotted, and jack-knife estimated coefficient confidence intervals were added to help select which variables show significant deviations between the models. These confidence intervals allow the overlap with the diagonal line, or even between variables, to be estimated. As in ordinary OPLS model interpretation, the importance of an individual variable is indicated by the absolute value of its coefficient. For instance, Figure 3 shows how we compared the loadings from two models (LF – HF and LF – HF_S). In general, metabolite-associated variables on the diagonal had similar importance in explaining body weight variance in the two models. With the aim of highlighting the more relevant variables, a threshold value of 0.2 was applied.
In the examples given, this selection corresponds mostly to values of Variable Importance in Projection48 >= 1.0 for the variable in either of the two models.
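The comparison of two models described above can be sketched in a few lines of code. This is an illustrative reading of the procedure, not the routine used in this work. It assumes that each variable comes with one loading and one symmetric confidence half-width per model, that the 0.2 threshold is applied to the larger absolute loading, and that a variable is considered shared when its confidence box touches the diagonal. The function name, the labels and the toy values are all hypothetical.

```python
def classify_variable(load_a, load_b, ci_a, ci_b, threshold=0.2):
    """Label one variable from its loadings in two models.

    load_a, load_b : loadings of the variable in model A and model B
    ci_a, ci_b     : confidence-interval half-widths (assumed symmetric)
    threshold      : minimum absolute loading for a variable to be kept
    """
    # Assumption: the threshold is applied to the larger absolute loading.
    if max(abs(load_a), abs(load_b)) < threshold:
        return "below threshold"
    # The box [a - ci_a, a + ci_a] x [b - ci_b, b + ci_b] touches the
    # diagonal (b = a) exactly when the two intervals overlap.
    low = max(load_a - ci_a, load_b - ci_b)
    high = min(load_a + ci_a, load_b + ci_b)
    return "shared" if low <= high else "model-specific"


# Made-up loadings and half-widths: (load_a, load_b, ci_a, ci_b).
toy = {
    "metabolite_1": (0.35, 0.30, 0.08, 0.07),
    "metabolite_2": (0.40, 0.05, 0.06, 0.05),
    "metabolite_3": (0.10, -0.05, 0.04, 0.04),
}
labels = {name: classify_variable(*values) for name, values in toy.items()}
for name, label in labels.items():
    print(name, label)
```

With these toy values, the first variable lies on the diagonal within its confidence intervals, the second deviates from it and would be flagged as specific to one model, and the third falls below the threshold and would not be highlighted.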


In both models, body weight variation correlated positively with creatine, tartrate, and carnitine, and negatively with TMA and TMAO. Despite these similarities, we also identified a specific positive correlation of oxaloacetate with body weight in the LF-HF model. In a similar fashion, we compared the contributions of the metabolites to body weight variations to assess prebiotic-specific effects (Supplementary Figures S2-4). Metabolic specificities are reported in Supplementary Table S2. Such analyses allowed the identification of metabolites that associate consistently with body weight variations, such as creatinine, nicotinate, nicotinurate, and tartrate. Some metabolites associated with body weight variations were exclusive to prebiotic supplementation with HF, such as oxaloacetate, phenylacetylglycine, and indoxylsulfate. Many metabolites showed associations dependent on dietary conditions, such as carnitine and hippurate, with some specific to the type of prebiotics (i.e. TMA and TMAO for BMOS, acyl-carnitine and sucrose for GOS, guanidoacetate for FOS_IN). This combination of MSCA, OPLS and graphical exploration of model specificities appeared to be a relevant approach to explore and identify the most influential variables that share similarities in relation to a dietary effect and key clinical endpoints of the intervention. In the present example, this analysis identified a set of host-gut microbial metabolites. Hereafter, we analyse the metabonomics data together with the microbial data to further explore the influence of the dietary interventions on gut microbial functionality.

Modelling relationships between urine metabolites and gut microbial composition over time

In previous studies, we observed a significant correlation between body weight after one week (D7) and final body weight (D70).29 Based on these observations, we could identify key host and gut microbial metabolic features that were associated with the susceptibility of the animals to gain more weight with DIO. These findings led us to focus, in the current study, on the host-gut microbial interactions in relation to body weight at D7 (i.e. one week after diet switch) and D70 (i.e. end of study). Since the population was randomized prior to diet switch, we also selected D-7 as baseline for both metabolic and microbial phenotypes. Urine 1H NMR metabolites and stool 16S rRNA gene sequencing data were both available for each animal at three timepoints of the experimental design and were subjected to CCA modelling. CCA maximizes the correlation between linear combinations of variables from two datasets. Neither the metabolomics nor the microbiota data are treated as response variables; instead, we seek correlations between the two datasets. After determining the regularization parameters, we found that 4 components were sufficient to explain the information in the dataset, which comprises 35 metabolites and 175 microbiota variables at each of the three time points. Figure 4 shows the projection of the data points into the joint canonical dimensions. The data clustered into three groups: (i) one higher in the second component, composed of HF and HF_S diets after 7 and 70 days (as expected from the randomization of animals and the single-housing conditions), (ii) one group of LF diet together with the different HF diets at -7 days, and (iii) one in the bottom-right side of the Figure, comprising HF-GOS, HF-BMOS and HF-FOS_IN diets at days 7 and 70 (Figure 4 A, B, C). These canonical components did not separate day 7 from day 70, suggesting that the changes in either the metabonomics or microbiota data were not significantly different between those time points, or that the differences were too small to be captured by such a method.
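The regularized CCA step described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the implementation used in the study: ridge terms are added to the within-set covariance matrices, the regularization values (`lam_x`, `lam_y`) are hypothetical placeholders because the text does not report them, and the data are synthetic with only the dimensions (35 metabolites, 175 microbiota variables, 4 components) taken from the text. The sample size of 60 is likewise an arbitrary choice for the example.

```python
import numpy as np

def _inv_sqrt(C):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(C)
    return (vecs / np.sqrt(vals)) @ vecs.T

def rcca(X, Y, lam_x, lam_y, n_components):
    """Ridge-regularized CCA; returns canonical correlations and variates."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = Xc.shape[0]
    # Ridge terms keep the covariance matrices invertible when p > n.
    Cxx = Xc.T @ Xc / (n - 1) + lam_x * np.eye(Xc.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + lam_y * np.eye(Yc.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    Kx, Ky = _inv_sqrt(Cxx), _inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the
    # (regularized) canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky, full_matrices=False)
    A = Kx @ U[:, :n_components]        # canonical weights for X
    B = Ky @ Vt[:n_components].T        # canonical weights for Y
    return s[:n_components], Xc @ A, Yc @ B

# Synthetic data sharing one latent factor (not study data).
rng = np.random.default_rng(1)
z = rng.normal(size=(60, 1))
X = z @ rng.normal(size=(1, 35)) + rng.normal(size=(60, 35))
Y = z @ rng.normal(size=(1, 175)) + rng.normal(size=(60, 175))
rho, x_variates, y_variates = rcca(X, Y, lam_x=0.1, lam_y=0.1, n_components=4)
```

Plotting pairs of columns of `x_variates` (or the average of `x_variates` and `y_variates`) would give a sample projection of the kind shown in Figure 4 A-C. In practice the regularization parameters are usually tuned, for example by cross-validation, rather than fixed in advance.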
Interestingly, the analysis of the covariance biplot highlighted some robust interactions between Flavonifractor spp and both indoxylsulfate and phenylacetylglycine, an effect common to animals receiving the high fat diets (Figure 4 D). Notably, these two metabolites were previously highlighted by the MSCA and OPLS model comparison as key metabolites related to dietary effect and changes in body weight. We further explored metabolite-microbial relationships through correlation analyses. Taking all data points into account, the initial data comprised 35 metabolites and 175 microbiota variables, which corresponds to over six thousand pair-wise comparisons (35 x 175 = 6,125) if every pair of inter-data variables is analysed, or over 36 thousand (6 x 6,125 = 36,750) when the 6 dietary groups are compared separately. To reduce this dimensionality, we first removed variables with zero standard deviation in at least one of the dietary groups. We then applied similarity score thresholds of 0.4 for metabolites and 0.5 for microbiota species, keeping only variables that reached the threshold for at least one comparison in at least one group. Such thresholds provided around 20 variables per dataset, which is a reasonable number of variables to inspect visually (Supplementary Figure S5 describes how the number of variables kept increases as the similarity score threshold is reduced). The dimensionality was thereby reduced to 22 metabolites and 27 microbiota variables. Overall visual inspection was conducted using heatmaps with colour gradients related to the similarity between the variables (a red cell represents a very high correlation between the corresponding metabolite and microbiota component; blue represents an anticorrelation between the variables; a white cell represents no similarity). The same order was kept for all the variables (metabolites in rows and microbiota components in columns) across the six dietary groups, allowing comparison of groups by directly contrasting the colours of the same cells in the respective heatmaps (Figure 5). Qualitative inspection of the group-specific correlation maps showed many shared structures among the three groups receiving HF diets combined with prebiotics.
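The variable-reduction step described above (pair counts, zero-variance removal, and thresholding) can be sketched as follows. This is an illustrative sketch only: it assumes that the similarity score is the absolute Pearson correlation, which the text does not state, and the helper names (`cross_corr`, `select_variables`) are hypothetical.

```python
import numpy as np

# Pair counts quoted in the text.
n_metabolites, n_taxa, n_groups = 35, 175, 6
pairs_pooled = n_metabolites * n_taxa       # all metabolite-taxon pairs
pairs_by_group = n_groups * pairs_pooled    # pairs when groups are kept separate

def cross_corr(X, Y):
    """Pearson correlation of every column of X with every column of Y."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Yz = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    return Xz.T @ Yz / (X.shape[0] - 1)

def select_variables(groups, thr_x=0.4, thr_y=0.5):
    """groups: list of (X, Y) arrays, one pair per dietary group.

    Returns column indices of X and Y variables that (a) vary in every
    group and (b) reach the threshold in at least one pair in at least
    one group. Assumes similarity = |Pearson r| (an assumption).
    """
    keep_x = np.all([X.std(axis=0) > 0 for X, _ in groups], axis=0)
    keep_y = np.all([Y.std(axis=0) > 0 for _, Y in groups], axis=0)
    # Best absolute correlation of each pair across groups.
    best = np.max(
        [np.abs(cross_corr(X[:, keep_x], Y[:, keep_y])) for X, Y in groups],
        axis=0,
    )
    idx_x = np.flatnonzero(keep_x)[best.max(axis=1) >= thr_x]
    idx_y = np.flatnonzero(keep_y)[best.max(axis=0) >= thr_y]
    return idx_x, idx_y
```

Keeping the returned indices in a fixed order for every group is what allows the per-group heatmaps to be compared cell by cell, as done in Figure 5.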
Interestingly, the networks generated in HF and HF_S animals looked remarkably different, suggesting that simple sugars influence the host-microbial metabolic interactions. Of particular interest, in HF animals, TMAO, TMA, carnitine, and taurine appeared as key metabolites whose variations over time were related to changes in several bacterial spp, when compared to LF animals. Moreover, tartrate and hippurate showed specific associations with Hydrogenoanaerobacterium, Lactococcus, and Odoribacter spp., and TMA with Lutispora bacteria. These patterns were not observed in HF_S animals, for which the analysis showed a less dense network that still differed from those of HF and LF animals. In the prebiotics groups, TMAO, TMA, indoxylsulfate, sucrose, and phenylacetylglycine showed consistent and dense correlations with several bacterial groups. However, the associations with Allobaculum previously highlighted with rCCA were not further explored with this approach because the similarity scores were just below the threshold of 0.4. If the similarity score threshold had been decreased to include Allobaculum, the number of microbiota features would have increased from 27 to 47, making the visual comparisons much more difficult to perform. Nevertheless, such analyses allowed us to explore in more detail the associations between Flavonifractor bacteria, indoxylsulfate, and phenylacetylglycine, which were positive and statistically significant in the HF + prebiotics groups, especially with BMOS. We previously discussed how Flavonifractor spp and several Clostridium spp may contribute to the production of phenylacetate derivatives and some short chain fatty acids through metabolism of dietary glycosylated flavonoids.49,50 Moreover, genome sequencing revealed that these spp fell within Clostridium cluster IV, which is linked to blood and urinary levels of indoxylsulfate,51,52 and to indoleamine 2,3-dioxygenase activity in colonic epithelial cells.53

Conclusions and outlook

Among the major challenges when integrating longitudinal omics data are the high-dimensional nature of the omics data, the longitudinal aspect of multivariate omics data, the integration of multiple datasets, and the mechanistic interpretation of the omics data. In the present example, the combination of longitudinal urine metabonomics data analysis and metabonomics-microbiome data integration could help identify a relevant longitudinal signature for two aromatic gut microbial metabolites, phenylacetylglycine and indoxylsulfate, which might be attributed to phenol and indole metabolism by Flavonifractor spp and might be specific to prebiotic modulation of the gut microbial ecology under a high fat diet regimen. This workflow makes it possible to break down the complexity of the data set and extract relevant patterns with which to further explore physiological processes at an anthropometric, cellular and molecular level. In this respect, as illustrated in the current example, a combination of sophisticated data modelling techniques is required. Projection methodologies such as PCA and OPLS work well for low n, high p datasets, but not for longitudinal data. However, when combined with other exploratory methods such as MSCA, which account for within- and between-subject variations, they can help in selecting more relevant variables of interest. Furthermore, the integration of different omics datasets could be achieved via rCCA and correlation analyses to study the covariance between the different matrices. rCCA could help in further selecting relevant associations between metabolites and microbial readouts, with which to further explore diet-gut microbial metabolic functional ecology.

Acknowledgments

The authors acknowledge the pre-clinical team at Nestle Research Center (NRC), Switzerland for conducting the study, Drs. Enea Rezzonico, Jason Chieh Chou and Norbert Sprenger for helpful discussions on gut bacterial and prebiotics metabolism, and Drs. Jean-Philippe Godin, Serge Rezzi, Martin Kussmann, and Sunil Kochhar for managerial support. The authors acknowledge the overall scientific discussion and inputs from Dr. Marc-Emmanuel Dumas and Prof. Elaine Holmes. The authors acknowledge Beneo GmbH and Borculo Domo Ingredients for provision of the Fructans and GOS, respectively.

Supporting Information

Supporting Information Available: 5 Supplementary Figures and 2 Supplementary Tables. This material is available free of charge via the Internet at http://pubs.acs.org.


References

(1) Sperisen, P.; Cominetti, O.; Martin, F. P. Frontiers in Molecular Biosciences 2015, 2, 44.
(2) Delzenne, N. M.; Neyrinck, A. M.; Backhed, F.; Cani, P. D. Nat Rev Endocrinol 2011, 7, 639-646.
(3) Backhed, F. Ann Nutr Metab 2011, 58 Suppl 2, 44-52.
(4) Turnbaugh, P. J.; Gordon, J. I. J Physiol 2009, 587, 4153-4158.
(5) Goodrich, J. K.; Waters, J. L.; Poole, A. C.; Sutter, J. L.; Koren, O.; Blekhman, R.; Beaumont, M.; Van Treuren, W.; Knight, R.; Bell, J. T.; Spector, T. D.; Clark, A. G.; Ley, R. E. Cell 2014, 159, 789-799.
(6) Bleich, S.; Cutler, D.; Murray, C.; Adams, A. Annu Rev Public Health 2008, 29, 273-295.
(7) Swinburn, B.; Sacks, G.; Ravussin, E. Am J Clin Nutr 2009, 90, 1453-1456.
(8) Vandevijvere, S.; Chow, C. C.; Hall, K. D.; Umali, E.; Swinburn, B. A. Bulletin of the World Health Organization 2015, 93, 446-456.
(9) Wahl, S.; Vogt, S.; Stuckler, F.; Krumsiek, J.; Bartel, J.; Kacprowski, T.; Schramm, K.; Carstensen, M.; Rathmann, W.; Roden, M.; Jourdan, C.; Kangas, A. J.; Soininen, P.; Ala-Korpela, M.; Nothlings, U.; Boeing, H.; Theis, F. J.; Meisinger, C.; Waldenberger, M.; Suhre, K.; Homuth, G.; Gieger, C.; Kastenmuller, G.; Illig, T.; Linseisen, J.; Peters, A.; Prokisch, H.; Herder, C.; Thorand, B.; Grallert, H. BMC Medicine 2015, 13, 282.
(10) Reinehr, T.; Wolters, B.; Knop, C.; Lass, N.; Hellmuth, C.; Harder, U.; Peissner, W.; Wahl, S.; Grallert, H.; Adamski, J.; Illig, T.; Prehn, C.; Yu, Z.; Wang-Sattler, R.; Koletzko, B. European Journal of Nutrition 2015, 54, 173-181.
(11) Fearnside, J. F.; Dumas, M. E.; Rothwell, A. R.; Wilder, S. P.; Cloarec, O.; Toye, A.; Blancher, C.; Holmes, E.; Tatoud, R.; Barton, R. H.; Scott, J.; Nicholson, J. K.; Gauguier, D. PLoS One 2008, 3, e1668.
(12) Yang, X.; Deignan, J. L.; Qi, H.; Zhu, J.; Qian, S.; Zhong, J.; Torosyan, G.; Majid, S.; Falkard, B.; Kleinhanz, R. R.; Karlsson, J.; Castellani, L. W.; Mumick, S.; Wang, K.; Xie, T.; Coon, M.; Zhang, C.; Estrada-Smith, D.; Farber, C. R.; Wang, S. S.; van Nas, A.; Ghazalpour, A.; Zhang, B.; Macneil, D. J.; Lamb, J. R.; Dipple, K. M.; Reitman, M. L.; Mehrabian, M.; Lum, P. Y.; Schadt, E. E.; Lusis, A. J.; Drake, T. A. Nat Genet 2009, 41, 415-423.
(13) Newgard, C. B.; An, J.; Bain, J. R.; Muehlbauer, M. J.; Stevens, R. D.; Lien, L. F.; Haqq, A. M.; Shah, S. H.; Arlotto, M.; Slentz, C. A.; Rochon, J.; Gallup, D.; Ilkayeva, O.; Wenner, B. R.; Yancy, W. S., Jr.; Eisenson, H.; Musante, G.; Surwit, R. S.; Millington, D. S.; Butler, M. D.; Svetkey, L. P. Cell Metab 2009, 9, 311-326.
(14) Smith, C. A.; Want, E. J.; O'Maille, G.; Abagyan, R.; Siuzdak, G. Analytical Chemistry 2006, 78, 779-787.
(15) Nicholson, J. K.; Lindon, J. C.; Holmes, E. Xenobiotica 1999, 29, 1181-1189.
(16) Fiehn, O. Plant Molecular Biology 2002, 48, 155-171.
(17) Dean, C. L., X; Neuhaus, J.; Wang, L.; Wu, L.; Yi, G. Workshop on Emerging Issues in the Analysis of Longitudinal Data; Banff International Research Station (BIRS), Banff, Alberta, 2009.
(18) Cominetti, O.; Collino, S.; Martin, F. P. Agro Food Industry Hi-Tech 2014, 25, 14-18.
(19) Stanberry, L.; Mias, G. I.; Haynes, W.; Higdon, R.; Snyder, M.; Kolker, E. Metabolites 2013, 3, 741-760.
(20) Liquet, B.; Le Cao, K. A.; Hocini, H.; Thiebaut, R. BMC Bioinformatics 2012, 13, 325.


(21) Albert, P. S.; Schisterman, E. F. Statistics in Medicine 2012, 31, 2457-2460.
(22) Jolliffe, I. T.; Springer-Verlag: New York, 2002.
(23) Wold, S.; Sjöström, M.; Eriksson, L. Chemometrics and Intelligent Laboratory Systems 2001, 58, 109-130.
(24) Geladi, P.; Kowalski, B. R. Analytica Chimica Acta 1986, 185, 1-17.
(25) Burcelin, R.; Crivelli, V.; Dacosta, A.; Roy-Tirelli, A.; Thorens, B. Am J Physiol Endocrinol Metab 2002, 282, E834-842.
(26) West, D. B.; Boozer, C. N.; Moody, D. L.; Atkinson, R. L. Am J Physiol 1992, 262, R1025-1032.
(27) Champy, M. F.; Selloum, M.; Zeitler, V.; Caradec, C.; Jung, B.; Rousseau, S.; Pouilly, L.; Sorg, T.; Auwerx, J. Mamm Genome 2008, 19, 318-331.
(28) Roberfroid, M. B. Br J Nutr 1998, 80, S197-S202.
(29) Boulange, C. L.; Claus, S. P.; Chou, C. J.; Collino, S.; Montoliu, I.; Kochhar, S.; Holmes, E.; Rezzi, S.; Nicholson, J. K.; Dumas, M. E.; Martin, F. P. J Proteome Res 2013.
(30) Martin, F. P.; Wang, Y.; Sprenger, N.; Yap, I. K.; Rezzi, S.; Ramadan, Z.; Pere-Trepat, E.; Rochat, F.; Cherbut, C.; van Bladeren, P.; Fay, L. B.; Kochhar, S.; Lindon, J. C.; Holmes, E.; Nicholson, J. K. Molecular Systems Biology 2008, 4, 205.
(31) Martin, F. P.; Sprenger, N.; Yap, I. K.; Wang, Y.; Bibiloni, R.; Rochat, F.; Rezzi, S.; Cherbut, C.; Kochhar, S.; Lindon, J. C.; Holmes, E.; Nicholson, J. K. J Proteome Res 2009, 8, 2090-2105.
(32) Roberfroid, M.; Gibson, G. R.; Hoyles, L.; McCartney, A. L.; Rastall, R.; Rowland, I.; Wolvers, D.; Watzl, B.; Szajewska, H.; Stahl, B.; Guarner, F.; Respondek, F.; Whelan, K.; Coxam, V.; Davicco, M. J.; Leotoing, L.; Wittrant, Y.; Delzenne, N. M.; Cani, P. D.; Neyrinck, A. M.; Meheust, A. Br J Nutr 2010, 104 Suppl 2, S1-63.
(33) Hansen, J. J.; Hoefsloot, H. C. J.; van der Greef, J.; Timmerman, M. E.; Smilde, A. K. Analytica Chimica Acta 2005, 530, 11.
(34) Trygg, J.; Wold, S. Journal of Chemometrics 2002, 16, 119-128.
(35) Soneson, C.; Lilljebjorn, H.; Fioretos, T.; Fontes, M. BMC Bioinformatics 2010, 11, 191.
(36) Liber, A.; Szajewska, H. Annals of Nutrition & Metabolism 2013, 63, 42-54.
(37) Rastall, R. A. Curr Opin Clin Nutr Metab Care 2013, 16, 675-678.
(38) Kurien, B. T.; Everds, N. E.; Scofield, R. H. Lab Anim 2004, 38, 333-361.
(39) Sanchez, M.; Darimont, C.; Drapeau, V.; Emady-Azar, S.; Lepage, M.; Rezzonico, E.; Ngom-Bru, C.; Berger, B.; Philippe, L.; Ammon-Zuffrey, C.; Leone, P.; Chevrier, G.; St-Amand, E.; Marette, A.; Dore, J.; Tremblay, A. Br J Nutr 2014, 111, 1507-1519.
(40) Schloss, P. D.; Gevers, D.; Westcott, S. L. PLoS One 2011, 6, e27310.
(41) Jansen, J. J.; Hoefsloot, H. C. J.; van der Greef, J.; Timmerman, M. E.; Smilde, A. K. Analytica Chimica Acta 2005, 530, 173-183.
(42) Bylesjö, M.; Rantalainen, M.; Cloarec, O.; Nicholson, J. K.; Holmes, E.; Trygg, J. Journal of Chemometrics 2006, 20, 341-351.
(43) 2011.
(44) Le Cao, K. A.; Gonzalez, I.; Dejean, S. Bioinformatics 2009, 25, 2855-2856.
(45) González, I.; Déjean, S.; Martin, P. G. P.; Baccini, A. 2008 2008, 23, 14.
(46) Schouteden, M.; Van Deun, K.; Pattyn, S.; Van Mechelen, I. Behavior Research Methods 2013, 45, 822-833.
(47) Wiklund, S.; Johansson, S.; Sjöström, L.; Mellerowicz, E. J.; Edlund, U.; Shockcor, J. P.; Gottfries, J.; Moritz, T.; Trygg, J. Analytical Chemistry 2008, 80, 115-122.


(48) Chong, I.; Jun, C.-H. Chemometrics and Intelligent Laboratory Systems 2005, 78, 103-112.
(49) Moco, S.; Martin, F. P.; Rezzi, S. J Proteome Res 2012.
(50) Carlier, J. P.; Bedora-Faure, M.; K'Ouas, G.; Alauzet, C.; Mory, F. International Journal of Systematic and Evolutionary Microbiology 2010, 60, 585-590.
(51) Wikoff, W. R.; Anfora, A. T.; Liu, J.; Schultz, P. G.; Lesley, S. A.; Peters, E. C.; Siuzdak, G. Proc Natl Acad Sci U S A 2009, 106, 3698-3703.
(52) Weber, D.; Oefner, P. J.; Hiergeist, A.; Koestler, J.; Gessner, A.; Weber, M.; Hahn, J.; Wolff, D.; Stammler, F.; Spang, R.; Herr, W.; Dettmer, K.; Holler, E. Blood 2015, 126, 1723-1728.
(53) Atarashi, K.; Tanoue, T.; Oshima, K.; Suda, W.; Nagano, Y.; Nishikawa, H.; Fukuda, S.; Saito, T.; Narushima, S.; Hase, K.; Kim, S.; Fritz, J. V.; Wilmes, P.; Ueha, S.; Matsushima, K.; Ohno, H.; Olle, B.; Sakaguchi, S.; Taniguchi, T.; Morita, H.; Hattori, M.; Honda, K. Nature 2013, 500, 232-236.


Figure Captions

Figure 1: Overview of the experimental design and the analytical strategy.

Figure 2: MSCA analysis of urinary metabolites collected from all animals over time. In the score plot (A), each dot corresponds to a single animal, as expressed by its metabolic profiles summarized from all the timepoints simultaneously. The analysis of scores of the between-animal source of variance showed a major separation of animals receiving the LF diet from the other animals along the first component (43.61% of explained variance), whilst the effects of prebiotics in HF diets could be observed along the second component (17.82% of explained variance). (B) The analysis of the related loadings showed a consistent structure in the variables belonging to the same metabolite or metabolic pathway.

Figure 3: PLS loading model comparison for LF and HF and HF_S diet. Metabolite-associated variables on the diagonal had similar importance in explaining body weight variance in the two models. A threshold value of 0.2 was applied to select the most influential variables. This selection corresponds mostly to VIP values >= 1.0 for the variable in at least one of the two models.

Figure 4: (A, B, C) Scatter plots of individual samples. Each sample is represented by a symbol on a 2-dimensional projection over rCCA components. Different dates are represented by different symbols, and different colours depict different diets. (D) The bottom-right plot corresponds to the variable plot of the rCCA model between microbiota and metabolites. Here both metabolites and microbiota variables are represented through their projections onto the planes defined by their respective canonical variates.

Figure 5: Heatmaps with colour gradients related to the similarity between the variables. A red cell represents a very high correlation between the corresponding metabolite and microbiota component; blue represents an anti-correlation between the variables; a white cell represents no similarity. The same order was kept for all the variables (metabolites in rows and microbiota components in columns) across the six dietary groups (A: LF diet; B: HF diet; C: HF_S diet; D: HF + GOS diet; E: HF + BMOS; F: HF + FOS_IN), allowing comparison of groups by directly contrasting the colours of the same cells in the respective heatmaps.


TOC Graphic
