Multivariate Modeling Strategy for Intercompartmental Analysis of Tissue and Plasma 1H NMR Spectrotypes
Ivan Montoliu,† François-Pierre J. Martin,† Sebastiano Collino, Serge Rezzi, and Sunil Kochhar*
BioAnalytical Science, Metabonomics & Biomarkers, Nestlé Research Center, P.O. Box 44, CH-1000 Lausanne 26, Switzerland
Received November 25, 2008
Multicompartmental metabolic profiling combined with multivariate data analysis offers a unique opportunity to explore the multidimensional metabolic relationships between various biological matrices. Here, we applied unsupervised chemometric methods for integrating 1H NMR metabolic profiles from mouse plasma, liver, pancreas, adrenal gland and kidney cortex matrices in order to infer intercompartments functional links. Principal Component Analysis (PCA) revealed metabolic differences between matrices but contained limited information on intercompartment metabolic relationships. Multiway PCA enabled the assessment of interindividual metabolic variability across multiple compartments in a single model and, therefore, metabolic correlations between different organs and circulating biofluids. However, this approach does not provide information on the relative contribution of one compartment to another. Integration of metabolic profiles using Multivariate Curve Resolution (MCR) and Parallel Factor Analysis (PARAFAC) methods provided an overview of functional relationships across matrices and enabled the characterization of compartment-specific metabolite signatures, the spectrotypes. In particular, the spectrotypes describe biochemical profiles specific or common to different biological compartments. Consequently, MCR-ALS and PARAFAC appeared to be better adapted for stepwise variable and compartment selection for further correlation analysis. Such a combination of chemometric techniques could provide new research avenues to assess the efficacy of drug or nutritional interventions on targeted organs. Keywords: Adrenal gland • Chemometrics • MCR-ALS • MPCA • HRMAS 1H NMR spectroscopy • Kidney • Liver • Intact tissue • Metabonomics • Pancreas • PARAFAC • PCA • Plasma
Introduction
One of the greatest challenges in modern biology is to understand how the changes in environmental conditions and lifestyle influence human genetics and physiology, and ultimately human ability to exploit new dietary resources.1 Modern nutrition research focuses on deciphering the molecular mechanisms involved in the individual response to dietary modulations, through the understanding of metabolic disorders and the efficacy of active ingredients, with the overall aim to provide health maintenance.2 This implies the development of analytical strategies based on the measurement of metabolites (metabonomics) to assess the effects of nutrition at both the organ-specific compartment and system levels.3–5 Metabonomics provides high-density data on the real end points of physiological regulatory processes in the form of compartmental and systemic metabolic profiles from tissue and biofluid samples. When associated with a well-defined physiological condition, metabolic profiles provide a snapshot of a functional phenotype, or metabotype, as a result of multiple interactions between metabolic pathways under the influence of environment, lifestyle, genetics, and microbial factors.6–8 A specific metabolic profile of a systemic biofluid, such as urine or plasma, reflects the overall metabolic status of an individual as the result of highly complex metabolic exchanges between diverse biological compartments, including organs, biofluids and microbial symbionts. Therefore, the understanding of regulatory metabolic processes of a complex living organism at the system level implies the assessment of spatiotemporal interorgan metabolic cross talks through the analysis of biofluids. This challenge can be addressed by studying the metabolic correspondences among tissue and biofluid metabolic profiles. 1H Nuclear Magnetic Resonance (NMR) spectroscopy is now a well-established approach for multicompartmental metabolic profiling of intact tissue and biofluid samples.9 The analysis of the metabolic data with sophisticated data mining techniques offers the potential to explore functional relationships among various biological compartments. Multicompartmental metabolic studies have been successfully applied to the diagnosis of toxicological10,11 and pathophysiological12 states, to assess the effects of nutritional interventions13–15 and to determine metabolic responses to stress.16,17

* To whom correspondence should be addressed. E-mail: [email protected]. Phone: +41 21 785 9336. † These authors contributed equally to this work. DOI: 10.1021/pr8010205. © 2009 American Chemical Society. Journal of Proteome Research, Vol. 8, No. 5, 2009, 2397–2406. Published on Web 03/24/2009.
research articles However, up to date, few studies evaluated the potential of different chemometric modeling methods for integrating metabolic correlation across multiple biofluids and tissues for the detection of intercompartment functional relationships. In the current contribution, we aimed at developing unsupervised chemometric methodology to model intercompartment metabolite relationships using an animal model that we have previously characterized using metabonomic analysis.13,14,18 We aimed to model with an unsupervised methodology the metabolic relationships between different biological compartments using 16 biological replicates studied under normal physiological conditions. However, the arrangement of spectroscopic data, which are dependent upon experimental and instrumental conditions, is an important determinant for information recovery. In the case of supervised data mining, such as regression applications, the consideration of these conditions might not be relevant to the final outcome. However, for unsupervised descriptive analysis, the organization of the data, here the metabolic profiles from different biological compartments, can be essential for a straightforward interpretation of the models. This process can be either handled through (i) appropriate unfolding of data matrices and application of PCA and Multivariate Curve Resolution (MCR), and/or (ii) preserving a high order structure with adapted methods such as Multiway PCA (M-PCA) and Parallel Factor Analysis (PARAFAC) (Figure 1). To infer intercompartmental metabolic relationships in a well-characterized mouse model, we have applied these methods for the unsupervised modeling of 1H NMR metabolic profiles from plasma, liver, pancreas, adrenal gland and kidney cortex samples.18
Material and Methods
Animal Handling. This study was conducted under the appropriate national guidelines at the Nestlé Research Center (Lausanne, Switzerland). A total of 16 C3H female germ-free mice (Charles River, France) were housed under the same environmental conditions. They were fed a standard semisynthetic irradiated rodent diet19 and received saline solution ad libitum. At 8 weeks of age, the germ-free mice were inoculated with a simplified model of a human baby microbiota (HBM) and were euthanized after 2 weeks. The preparation of the HBM was previously described.13 A total of 7 bacteria were isolated, namely, Escherichia coli, Bifidobacterium breve, Bifidobacterium longum, Staphylococcus epidermidis, Staphylococcus aureus, Clostridium perfringens, and Bacteroides distasonis, and they were mixed in equal amounts (approximately 10^10 cells/mL for each strain) for gavage. Bacterial cell mixtures were kept in frozen aliquots until use.

Sample Collection. Blood (100 µL) was collected into Li-heparin tubes and the plasma was obtained after centrifugation at 10 000g and 4 °C for 10 min. A section of liver was sampled in the same lobe for each animal; the pancreas and the two adrenal glands were also collected and snap-frozen for NMR spectroscopy. The kidneys were also dissected, and the cortex was separated from the medulla before being snap-frozen. Fluid and tissue samples were maintained at −80 °C prior to analysis.

1H NMR Spectroscopic Analysis. Plasma samples (100 µL) were prepared into a 5 mm NMR tube with 450 µL of saline solution containing 10% D2O. Intact tissue samples of liver, pancreas, and kidney cortex were bathed in ice-cold saline D2O solution. A portion of the tissue (∼15 mg) was packed into a zirconium oxide (ZrO2) 4 mm outer diameter rotor. Because of the limited size of the adrenal glands, tissue extracts were prepared from both glands following the previously described methods.20 Tissue samples were ground in 1 mL of acetonitrile/water (1:1). The supernatant containing the aqueous phase was collected, freeze-dried, and redissolved in 60 µL of D2O. Samples were spun for 10 min at 6000g, and 50 µL of the supernatant was then pipetted into 1.7 mm NMR tubes. 1H NMR spectra were acquired for each sample using a Bruker DRX 600 NMR spectrometer (Bruker Biospin, Rheinstetten, Germany) operating at 600.13 MHz. 1H NMR spectra of plasma, urine, and extracts of adrenal gland were acquired with a Bruker 5 mm TXI triple resonance probe at 298 K. 1H NMR spectra of intact tissues of liver, pancreas, and kidney cortex were acquired using a standard Bruker 4 mm high resolution MAS probe under magic-angle-spinning conditions at a spin rate of 5000 Hz.20 To minimize any time-dependent biochemical degradation, the temperature of the tissue samples was regulated at 283 K using cold N2 gas. For all the samples, an NMR spectrum was acquired using a standard one-dimensional pulse sequence with water suppression. In addition, for intact tissues and plasma samples, Carr-Purcell-Meiboom-Gill (CPMG) spin-echo spectra with water suppression were obtained. 1H NMR spectra were acquired and processed according to the parameters previously published.13

Chemometrics. Spectral data were reduced to 789 variables by integrating spectral intensity in segments (width in chemical shift δ, 0.005) corresponding to the regions δ 0.8–5.45 (excluding δ 4.5–5.19 containing the residual water signal). Chemometric analysis was performed on the standard NMR spectra for adrenal gland polar extract, and on the CPMG NMR spectra for plasma, liver, pancreas, and kidney cortex.

1. Principal Components Analysis and Multiway Principal Components Analysis.
Principal Components Analysis (PCA) provides a bilinear decomposition of the data matrix X (NMR variables) into a set of scores and loadings, with the former describing variance among samples and the latter the directions of maximum variable variance.21 The model is generated from the solution of the eigenvalue problem of the covariance matrix and provides an ordered factor decomposition of the data imposing an orthogonality constraint for each new factor (see Supporting Information).

$$X = TP^{T} + E$$

When high order data are involved, a particular case of PCA, Multiway Principal Components Analysis (MPCA), can also be used.22 In this model, the algorithm decomposes the initial N-order data array X (NMR variables) into a sum of outer products of score vectors (t_r) and their associated loading matrices (P_r), plus a residual array E. These residuals are minimized in a least-squares sense, and are considered to be associated with the nondeterministic part of the information (noise, etc.). As seen in the following equation, the systematic part of the information is considered to be represented by the outer product between t and P. This loading matrix contains information from two modes (commonly the second and the third mode).

$$\underline{X} = \sum_{r=1}^{R} t_r \otimes P_r + \underline{E}$$
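The two decompositions above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `pca` and `mpca`, the synthetic random array, and the choice of column centring are all hypothetical, and MPCA is computed here as ordinary PCA on the row-wise unfolded array, with each loading vector folded back into a (compartments × chemical shifts) matrix P_r.

```python
import numpy as np

def pca(X, n_components):
    """PCA by SVD of the column-centred matrix, X = T P^T + E."""
    Xc = X - X.mean(axis=0)                      # centring is an assumption of this sketch
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]   # scores
    P = Vt[:n_components].T                      # loadings (orthonormal columns)
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return T, P, explained

def mpca(X3, n_components):
    """MPCA on an (animals x shifts x compartments) array via row-wise unfolding."""
    n_animals, n_shifts, n_comp = X3.shape
    # One row per animal; the compartments are placed side by side.
    X_unf = X3.transpose(0, 2, 1).reshape(n_animals, n_comp * n_shifts)
    T, P, explained = pca(X_unf, n_components)
    # Fold each loading vector back into a (compartments x shifts) matrix P_r.
    P_folded = P.T.reshape(n_components, n_comp, n_shifts)
    return T, P_folded, explained

# Synthetic stand-in for the data: 16 animals, 50 spectral bins, 5 compartments.
rng = np.random.default_rng(0)
X3 = rng.random((16, 50, 5))
T, P_folded, explained = mpca(X3, 4)
```

Slicing `P_folded[r, c]` gives the part of the r-th loading that belongs to compartment c, which corresponds to the "cropping" of MPCA loadings by compartment.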
Intercompartmental Analysis of Tissue/Plasma 1H NMR Spectrotypes
Figure 1. Schematic of analytical strategy. (A) Multivariate Data Analysis workflow of bivariate data of five biological compartments. Column-wise arrangement of the data structure in Principal Component Analysis and Multivariate Curve Resolution (left); row-wise arrangement for Multiway Principal Component Analysis (upper right) and three-way data arrangement for PARAFAC (lower right). (B) Representation of biochemical profiles specific or common to different biological compartments as per the calculation of “pure” and “relational” spectrotypes.
According to the model, the t scores describe the variation of the samples considering all the information in the second and third modes. The P loadings on each factor describe the contribution of the variables from the second and third modes for each of the R factors (see Supporting Information). As in PCA, the decomposition is ordered by decreasing explained variance. For both PCA and MPCA, the number of principal components is often determined following common strategies, such as scree plots, validation tests and cross-validation results.

Figure 2. Typical aliphatic regions of 600 MHz 1H CPMG NMR spectra of plasma (A), liver (B), pancreas (C) and kidney cortex (D) and 1H NMR spectrum of adrenal gland extract (E) obtained from an HBM mouse. Key: Asn, asparagine; Asp, aspartate; Ala, alanine; Arg, arginine; Glu, glutamate; GPC, glycerophosphocholine; GSH, glutathione; Ile, isoleucine; Gly, glycine; Lip, lipids; Leu, leucine; Lys, lysine; Met, methionine; Pc, phosphocholine; Tau, taurine; TMAO, trimethylamine-N-oxide; Val, valine.

2. Multivariate Curve Resolution-Alternating Least Squares. Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS) provides a bilinear decomposition of the global response included in the original data matrix X into a set of factors (see Supporting Information).23,24 These are expressed as a combination of pure contributions C (contribution profiles) and S (spectral profiles). This decomposition can be expressed as:

$$X = CS^{T} + E$$

X corresponds to the set of 1H NMR spectra of the different biological compartments obtained from each animal, the contribution profiles C reflect the changes in the contribution of the different factors in each compartment and S is the matrix containing the spectral variables associated with each factor. The algorithm is initialized using estimations of the pure profiles (contribution (C) or spectral (S) profiles), for instance by evolving factor analysis (EFA).25 Assuming that most of the data variance expresses deterministic information (not noise), PCA can be used for a first estimation of the number of factors present in the data. However, solutions obtained by MCR-ALS are not unique. They present rotational and intensity ambiguities. To resolve these, a set of constraints is usually applied in the resolution of the Alternating Least Squares algorithm. These constraints are closely related to the physicochemical nature of the system studied and provide a better interpretability of the model. Because of the nature of the NMR metabolic profiles, non-negativity constraints are applied to both C and S. The ALS algorithm is extremely flexible and allows the appropriate ordering of the data matrices for their simultaneous resolution. To apply this, different augmentation schemes can be used: column-wise (CW), row-wise (RW) and column + row wise (CW+RW). In this work, a CW augmentation scheme has been applied to a set of I matrices corresponding to I compartments. Each matrix has a dimensionality of nR rows (individuals) and nC columns (chemical shifts). Briefly, CW augmentation consists in generating a data matrix X_CW that keeps the number of columns (nC) constant and stacks the I compartments consecutively one after the other. According to this ordering, the MCR-ALS model is given by the following equation.

$$X_{CW} = \begin{bmatrix} X_1 \\ \vdots \\ X_I \end{bmatrix} = \begin{bmatrix} C_1 \\ \vdots \\ C_I \end{bmatrix} S^{T} + \begin{bmatrix} E_1 \\ \vdots \\ E_I \end{bmatrix}$$
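The column-wise augmented MCR-ALS model can be illustrated with the following NumPy sketch. It is a simplified stand-in rather than the procedure used in this work: the names `mcr_als`, `X_cw` and `C_blocks` are hypothetical, the initial spectral estimate is random instead of coming from EFA, and non-negativity is imposed by clipping unconstrained least-squares solutions instead of by non-negative least squares. The stopping rule compares the standard deviation of the residuals between consecutive iterations, and spectral profiles are scaled to equal length.

```python
import numpy as np

def mcr_als(X, S0, max_iter=500, tol=1e-3):
    """Fit X ~ C S^T with C >= 0 and S >= 0 by alternating least squares."""
    S = S0 / np.linalg.norm(S0, axis=0)
    sd_prev = np.inf
    for _ in range(max_iter):
        # C-step: least squares for C given S, negatives clipped to zero.
        C = np.clip(X @ np.linalg.pinv(S.T), 0.0, None)
        # S-step: least squares for S given C, negatives clipped to zero.
        S = np.clip((np.linalg.pinv(C) @ X).T, 0.0, None)
        # Scale spectra to unit length and move the scale into C.
        norms = np.linalg.norm(S, axis=0)
        norms[norms == 0] = 1.0
        S, C = S / norms, C * norms
        # Stop when the residual standard deviation changes by less than tol (relative).
        sd = np.std(X - C @ S.T)
        if abs(sd_prev - sd) <= tol * sd:
            break
        sd_prev = sd
    return C, S

# Synthetic example: 5 compartments, 16 animals each, 40 spectral bins, 3 factors.
rng = np.random.default_rng(1)
n_animals, n_bins, n_comp, k = 16, 40, 5, 3
S_true = rng.random((n_bins, k))
blocks = [rng.random((n_animals, k)) @ S_true.T for _ in range(n_comp)]
X_cw = np.vstack(blocks)                       # column-wise augmentation: (80, 40)
C, S = mcr_als(X_cw, rng.random((n_bins, k)))
fit = 1.0 - np.sum((X_cw - C @ S.T) ** 2) / np.sum(X_cw ** 2)
C_blocks = np.split(C, n_comp, axis=0)         # one contribution block C_i per compartment
```

Because all compartments share the same S, each block in `C_blocks` shows how strongly every spectral profile contributes to the samples of that compartment.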
3. Parallel Factor Analysis (PARAFAC). This method, formerly known as CANDECOMP (CANonical DECOMPosition), provides a way to decompose high order data sets in a sum of outer products of n sets of loadings (see Supporting Information).26,27 Formally speaking, PARAFAC represents a constrained version of the Tucker3 model28 and it assumes that no interaction between factors is present, forcing the number of factors on each mode to be equal. These restrictions drive PARAFAC model to have less freedom than Tucker3 or MPCA, which usually provide significantly higher fit. Element-wise, for a three-way data set, PARAFAC model can be described by:
$$\underline{X}(i,j,k) = \sum_{f=1}^{F} a_{if}\, b_{jf}\, c_{kf} + \underline{E}(i,j,k)$$
In the model, $a_{if}$, $b_{jf}$ and $c_{kf}$ are the loadings of factor f on the first, second and third mode, respectively. Two properties of the model are its uniqueness and the fact that it is a non-nested model. Therefore, all factors are calculated simultaneously, and the solution depends on the total number of factors calculated. This property preserves the inner structure of the data. According to the nature of the data, additional restrictions, such as non-negativity and orthogonality, can be applied to all or some of the modes. As for MCR-ALS, the number of PARAFAC factors can be determined using a range of methods, namely, scree plots, cross-validation, study of the residuals, or the Core consistency test.29 Briefly, the Core consistency test gives an indication of the adequacy of the PARAFAC model by checking how well it fulfills the core restriction conditions imposed for this model.30
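The element-wise PARAFAC model can be fitted with a basic alternating least-squares loop, sketched below in NumPy. This is an illustrative, unconstrained version under stated assumptions: the names `khatri_rao` and `parafac_als` are hypothetical, the start is random, a fixed number of iterations replaces a convergence test, and the non-negativity constraints applied in this work to the spectral and compartment modes are not included.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: (J x F), (K x F) -> (J*K x F)."""
    F = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, F)

def parafac_als(X, n_factors, n_iter=200, seed=0):
    """Unconstrained PARAFAC of a three-way array, all factors updated together."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    A, B, C = (rng.random((n, n_factors)) for n in (I, J, K))
    X1 = X.reshape(I, J * K)                      # mode-1 unfolding
    X2 = X.transpose(1, 0, 2).reshape(J, I * K)   # mode-2 unfolding
    X3 = X.transpose(2, 0, 1).reshape(K, I * J)   # mode-3 unfolding
    for _ in range(n_iter):
        # Each mode is a linear least-squares problem given the other two.
        A = X1 @ np.linalg.pinv(khatri_rao(B, C).T)
        B = X2 @ np.linalg.pinv(khatri_rao(A, C).T)
        C = X3 @ np.linalg.pinv(khatri_rao(A, B).T)
    return A, B, C

# Synthetic rank-2 array: 16 animals x 30 spectral bins x 5 compartments.
rng = np.random.default_rng(2)
A0, B0, C0 = (rng.standard_normal((n, 2)) for n in (16, 30, 5))
X = np.einsum('if,jf,kf->ijk', A0, B0, C0)
A, B, C = parafac_als(X, 2)
X_hat = np.einsum('if,jf,kf->ijk', A, B, C)
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
```

Since every update uses all F columns at once, refitting with a different number of factors changes all loadings, which is the non-nested behavior described above.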
Results
1H NMR Spectroscopic Analysis of Systemic Fluid and Organ Matrices. Examples of typical 1H NMR spectra of adrenal gland polar extracts and 1H CPMG NMR spectra of blood plasma and intact tissues of liver, kidney cortex, and pancreas obtained from C3H mice colonized with HBM are shown in Figure 2. For the analysis of blood plasma and tissues, the CPMG NMR pulse sequence was used to attenuate the spectroscopic contributions of large macromolecular species and to favor observation of sharp signals arising from low molecular weight metabolites. Metabolite identification was achieved using an in-house reference compound database and literature data,18,31,32 and was confirmed by 2D homo- and heteronuclear NMR spectroscopy experiments. For each of the biological matrices, the biochemical profile contains a wide range of amino acids in addition to organic acids, sugars, sugar alcohols, and saturated and unsaturated fatty acids (Figure 2).

Overview of Inter- and Intracompartment Metabolic Relationships Using Principal Component Analysis. An initial data analysis was performed using PCA to assess biochemical similarities between the samples from different biological compartments through modeling the main sources of metabolic variations (Figure 3). The PCA model was generated using 5 principal components (PCs) explaining 45.1, 19.1, 14.4, 9.5, and 4.8% of the total variance, respectively. Each point in the scores plot represents the biochemical profile of an individual sample, and the biochemical components responsible for the differences between samples detected in the scores plot can be extracted from the corresponding loadings plot. The distribution of the biochemical profiles along the five PCs in the scores plots revealed a significant co-mapping of samples according to their biological origin. The PCA scores plots separated the pancreas and the kidney cortex from the other biological matrices along the first two PCs.
PC3 explained the metabolic differences between blood plasma and liver tissue samples (data not shown). In addition, PC4 described the metabolic variations specific to adrenal glands, while PC5 described the behavior of some of the kidney compartment samples (data not shown). The analysis of the PCA loadings highlighted metabolic discrepancies of the pancreas, which exhibited relatively higher levels of phosphocholine (Pc), glycerophosphocholine (GPC), glycine, alanine, glutamate and creatine and lower levels of lipids, dimethylamine (DMA) and glucose. Kidney cortex showed higher concentrations of choline, taurine, glutamate, trimethylamine-N-oxide (TMAO) and lower levels of lactate and glucose. Interpretation of PC3 loadings indicated relatively high amounts of lipids, taurine, glycogen, and Pc and low contents of amino acids, lactate, glucose, creatine and DMA in the liver when compared to plasma. Analysis of PC4 loadings indicated that adrenal gland extracts contained relatively more DMA, myo-inositol and alanine, and lower levels of lactate and glucose, when compared to other matrices.

Figure 3. PCA analysis on column-wise augmented matrix. Agglomeration of PCA scores according to biological compartment (top). Highlight of variables responsible for the highest differentiation between pancreas and kidney compartments (bottom). Key: see Figure 2; DMA, dimethylamine.

In addition, MPCA was explored to simultaneously integrate the metabolic profiles from diverse biological compartments for each individual animal (Figure 4). The model was generated using 4 PCs explaining 32.4, 19.7, 13.6 and 10.16% of the total variance. The MPCA scores plot revealed the discrimination of animal 15 along PC1, while a homogeneous dispersion of the animals was observed in the following PCs. The analysis of the loadings corresponding to the second and the third PCs describes metabolic discrepancies between individuals and therefore provides deeper insights into intercompartment metabolic relationships. Along PC2, high liver triglycerides were associated with high plasma lactate and free amino acids; elevated alanine and lipids in the adrenal glands; alanine, glutamate, creatine, Pc, betaine and taurine in the pancreas; and alanine, glutamate, taurine and choline in the kidney. These metabolic variations were correlated with the decrease of glucose in plasma; glucose and glycogen in the liver; DMA, creatine, taurine, myo- and scyllo-inositol, and betaine in the adrenal glands; lipids in the pancreas; and betaine and GPC in the kidney.
Figure 4. MPCA analysis. Scores and loadings of the individuals and their different compartments. Compartmental loadings obtained after cropping of MPCA model loadings. Key: see Figures 2 and 3.
Analysis of the third PC loadings provided additional metabolic information, as noted with the correlation between low lactate and high lipids in plasma; low glucose, glutamine, glycine, methionine, glutathione, alanine and high triglycerides in the liver; high DMA and low lactate, alanine, creatine and taurine in adrenal glands; low valine, alanine, glycine, Pc and betaine in the pancreas; and high GPC and betaine and low choline and glycine in kidney.

Modeling Matrix-Specific Metabolic Fingerprints and Compartment Metabolic Relationships Using MCR-ALS and PARAFAC Methods. A first estimation of the MCR-ALS compartment profiles was obtained from EFA of the CW augmented matrix. The number of relevant factors was determined to be 5 and was confirmed both by PCA and by the interpretation of the EFA results. Non-negativity constraints were applied by non-negative least-squares for both the contribution and the spectral profiles. Spectral profiles were normalized to have equal length. The stopping criterion of the algorithm was set to a difference of 0.1% between the standard deviations of the residuals of two consecutive iterations. Under these conditions, a 5-factor MCR-ALS model was obtained, explaining 88.3% of the variance expressed as sum of squares. After the MCR-ALS analysis, pure contribution profiles showed the distribution of individual metabolic profiles according to their inherent biochemical composition (Figure 5). Simultaneously, the metabolic changes responsible for the differences between samples can be extracted from the pure spectral profiles, or spectrotypes.33 Interestingly, MCR-ALS generated a spectrotype that represents the pure metabolite composition of each individual compartment. The profile of the blood plasma was characterized by lactate, branched-chain amino acids, alanine, glucose, GPC, creatine, citrate, N-acetyl-glycoprotein, lysine, arginine, glutamine, methionine and pyruvate. The spectrotype of the liver comprised signals from glucose, glycogen, triglycerides, and glutathione. The metabolic profile of adrenal glands was characterized by the occurrence of lactate, alanine, creatine, taurine, acetate, DMA, myo-inositol and glutathione. The spectrotype characteristic of the pancreas contained Pc, GPC, glycine, alanine, glutamate, betaine and creatine. Finally, the spectrotype for the kidney cortex was dominated by choline, taurine, TMAO, glutamate, alanine, lactate, and branched-chain amino acids. Another feature of the MCR-ALS analysis is that the pure compartment profiles pinpoint the contributions of pure spectrotypes in different biological compartments. In particular, the pure contribution profile showed that certain changes in the blood plasma spectrotype co-vary with certain changes in the liver spectrotype. Moreover, the pure compartment profile shows that the liver spectrotype also has a small but appreciable contribution to the metabolite composition of the adrenal glands and the kidneys. The kidney spectrotype was also present in more than one compartment, with a minor contribution to the pancreas metabolic profile.

Figure 5. MCR-ALS analysis. Pure contributions to compartment profiles (top). Normalized pure spectrotypes associated with each compartment (bottom). Key: see Figures 2 and 4; Nac, N-acetyl-glycoproteins.

Following a similar approach, a PARAFAC model was generated to obtain a set of spectrotypes representative of the different biological compartments and to model the intercompartment relationships. As for MCR-ALS, modes 2 (spectral) and 3 (compartments) were non-negatively constrained (Figure 6). The Core consistency test was applied to determine the number of factors, which was set at 4 PARAFAC factors. Under those conditions, a level of fit of 90.44% of explained variance was achieved. Analysis of the compartment mode loadings showed that a specific profile could be modeled for kidney cortex, pancreas, liver and blood plasma, whereas the adrenal metabolic profile could be expressed only by a combination of the four other factors. Moreover, each factor, except the first factor representative of the pancreas, had a significant contribution to the metabolic phenotype of one or two other biological compartments. The second factor, modeling blood plasma metabolite composition, also explained some parts of the liver and adrenal metabolic profiles; the third factor, characterizing the kidney profile, had a contribution to the metabolic composition of the pancreas and the adrenal gland; and the fourth factor, describing the liver metabolite content, also explained some of the adrenal changes. Less significant is the interpretation of the first mode loadings, corresponding to samples. These loadings show the influence of the individual samples in the global PARAFAC model. The interpretation of the spectrotype mode loadings showed that the pancreas spectrotype is composed of signals arising from lactate, valine, alanine, glutamate, creatine, glycine, taurine, betaine, Pc and GPC. The second spectrotype comprised lactate, glucose, creatine, DMA, glutamine, alanine, valine and GPC. The third spectrotype was dominated by lipids, branched-chain amino acids, alanine, glutamate, lactate, choline, TMAO, glycine and taurine. The fourth spectrotype was mainly characterized by lipids, glycogen, glucose, and glutathione.
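The percentages of explained variance quoted for the MCR-ALS and PARAFAC models are expressed in terms of sums of squares. A minimal sketch of one common convention is given below; it assumes, hypothetically, that the fit is computed as 100 × (1 − SS_residual / SS_data) on uncentred data, and the function name `percent_fit` is not taken from the source.

```python
import numpy as np

def percent_fit(X, X_hat):
    """Percentage of the data sum of squares reproduced by the model."""
    X = np.asarray(X, dtype=float)
    X_hat = np.asarray(X_hat, dtype=float)
    ss_res = np.sum((X - X_hat) ** 2)   # sum of squared residuals
    ss_tot = np.sum(X ** 2)             # uncentred total sum of squares (assumed convention)
    return 100.0 * (1.0 - ss_res / ss_tot)

# Tiny worked example: residual SS = 16, total SS = 25.
example = percent_fit([[3.0, 4.0]], [[3.0, 0.0]])
```

The same function applies to two-way and three-way arrays, so the MCR-ALS and PARAFAC fits can be compared on the same scale provided both are computed from the same preprocessed data.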
Discussion
Multicompartmental metabolic profiling combined with multivariate data analysis offers a unique opportunity to explore the deep and wide extent of functional relationships between various biological compartments. Moreover, it helps the identification of intercompartment metabolic contribution. Visual inspection of the 1H NMR spectra may reveal some compartment-specific qualitative and quantitative metabolite differences. For instance, liver, kidney cortex and pancreas contained prominent signals from betaine and choline metabolism intermediates. Also, liver and plasma both exhibited high concentrations of glucose and lipids, while adrenal gland extracts and pancreas were marked by high levels of myo- and scyllo-inositol. However, these qualitative observations are subject to interanimal variations and do not provide information on any intra- and intercompartment relationships. Hence, a more formal integration and comparison of the metabolic profiles requires application of sophisticated chemometric modeling methods. Here, we have applied a series of descriptive analyses to assess intercompartmental metabolic functional relationships. In such unsupervised modeling methods, the interpretability of the models is influenced by the data structure. The initial application of PCA on data structured with a column-wise augmentation scheme showed that intercompartment differences are stronger than interindividual variations. This was observed through the clear distribution of the samples according to their respective biological origin. Therefore, PCA provides a way to characterize the metabolic discrepancies between compartments. In terms of metabolite composition, pancreas and kidney cortex tissues, for instance, appeared as the most distinct biological compartments. However, PCA contains limited information on individual metabolism and intercompartment functional relationships. In addition, PCA loadings describe relative changes (positive and negative) of metabolite concentrations, a feature that makes the definition of compartment-specific metabolite patterns difficult.

Figure 6. PARAFAC analysis. First (individuals), second (spectra) and third (compartment) mode loadings.

With the use of MPCA, higher order information, such as the biological compartment, is introduced in the model. The MPCA outcome enables the assessment of interindividual metabolic variability across multiple compartments in a single model. Therefore, the relationships between the descriptors in the MPCA space (MPCA loadings plot) indicate correlations and anticorrelations between different compartments, from which the functional metabolic relationships between different organs and circulating biofluids can ultimately be inferred. This is exemplified here with the metabolic information on gluconeogenesis pathways collected across plasma and liver. Along PC3, a collinear increase of glucose level in plasma and liver was associated with a relative depletion of circulating free amino acids and lactate and an augmented level of hepatic glycogen. Additional correlations were revealed between plasma lipids and levels of triglycerides, glutathione and its precursors methionine, glutamine and glycine34 in liver. This observation may highlight the relationships between circulating lipids and their metabolism in the liver, with implication of a metabolic response to oxidative stress arising from mitochondrial β-oxidation in this mouse model.18,35 In addition, levels of glycine, Pc, betaine and choline were shown to be strongly interrelated in kidney and pancreas, as shown in PC3. These metabolites are biochemically linked by transmethylation pathways that occur primarily in these organs.36,37 As captured in the PC3 loading, these pathways are also implicated in lipid and hormone synthesis,36,38 which may explain their functional relationship with the overall lipid metabolism (Figure 4). Finally, PCs 2 and 3 indicated a strong functional link among various osmolytes in adrenal glands and kidneys, namely taurine, myo-inositol, scyllo-inositol, GPC and betaine,39,40 which described site-specific regulatory mechanisms for osmotic pressure. Thus, this multicompartmental top-down approach provided by MPCA offers a way forward to study the systemic biochemical profiles and regulation of metabolic reactions between different organs. However, this approach is limited to highlighting inter-related pathways in different compartments and does not provide information on the relative contribution of one compartment to another.

Alternatively, limitations of MPCA can be addressed with complementary MCR-ALS and PARAFAC analyses, which are applied here for the first time in a metabonomic description of intercompartmental functional relationships. MCR-ALS and PARAFAC provide a summary of both functional relationships between different biological matrices and compartment-specific metabolite signatures. In particular, integration of 1H NMR profiles using these approaches enables the characterization of numerical spectroscopic constructs, also called spectrotypes,33 that describe biochemical profiles specific or common to different biological compartments. MCR-ALS and PARAFAC spectrotypes provide matrix-specific fingerprints that summarize metabolite presence and proportion in a meaningful way. Another benefit of applying such approaches lies in the much simplified visualization of global system biochemical relationships in various biological compartments while keeping an easy interpretability of the obtained loadings.
Interpretation of MCR-ALS pure compartment profiles or the PARAFAC compartment mode offers a clear way to characterize metabolic relationships among different biological compartments. This possibility can be illustrated, for instance, by the MCR-ALS plasma spectrotype. In this case, the overlap of the plasma pure compartment profile with other compartments, such as liver, shows that the plasma spectrotype contributes partly to the liver metabolic variation (Figure 5). Such observations might describe multiorgan metabolic relationships and guide the selection of metabolites and compartments on which further correlation analyses could be performed. The PARAFAC model also summarizes the inherent metabolic relationships between the different compartments. The results indicate good convergence of spectrotypes and cross-compartmental metabolic interactions between PARAFAC and MCR-ALS, although the two techniques rely on distinct data structures and algorithms. This is well illustrated by the similarities between the pure spectrotypes of pancreas obtained from both techniques, with the exception of the contributions of lactate and valine, which were extracted only by PARAFAC. Unlike MCR-ALS, PARAFAC modeled interactions between plasma, liver and adrenal glands on one hand, and between kidney, pancreas and adrenal glands on the other hand (Figures 5 and 6). Therefore, PARAFAC provides a better representation of metabolic co-variations across different biological matrices, by which influential metabolites are summarized in a kind of “relational” spectrotype. The interpretation of these “relational” spectrotypes appears to be an efficient way to characterize metabolic co-linearity between different biological compartments. Nevertheless, in the present study, PARAFAC poorly characterized the uniqueness of the adrenal compartment as compared with MCR-ALS. To summarize, while MCR-ALS places more emphasis on the definition of “pure” spectrotypes, PARAFAC provides a better representation of metabolic co-variations between biological compartments through the calculation of “relational” spectrotypes. However, neither of these two methods can clearly filter the most influential metabolites, i.e., metabolites with a strong relationship and/or contribution to the metabolic profile of other compartments. For this purpose, MCR-ALS and PARAFAC models can be complemented by further correlation analysis of their spectrotype metabolic components. Indeed, MCR-ALS pure compartment profiles and the PARAFAC compartment mode can be used for objective selection of the most meaningful compartment combinations on which correlation analysis is performed. In addition, the analysis of the “pure” or “relational” spectrotypes provides a stepwise approach for variable selection for further correlation analysis. Such a combination of chemometric techniques would be well-suited for studying functional relationships between a large number of biological compartments.

Abbreviations: CPMG, Carr-Purcell-Meiboom-Gill; MCR-ALS, multivariate curve resolution-alternating least squares; MPCA, multiway principal component analysis; NMR, nuclear magnetic resonance; PARAFAC, parallel factor analysis; PCA, principal component analysis.
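To make the notion of a PARAFAC "compartment mode" more concrete, the following minimal sketch fits a trilinear model to a three-way array by alternating least squares. It assumes, purely for illustration, an array arranged as animals × compartments × spectral variables; the function name `parafac_als`, the random initialization, the fixed iteration count and the absence of any constraints are assumptions of this sketch and do not reproduce the settings used in this study.

```python
import numpy as np


def parafac_als(X, n_components, n_iter=200, seed=0):
    """Fit X[i, j, k] ~ sum_r A[i, r] * B[j, r] * C[k, r] by alternating least squares."""
    rng = np.random.default_rng(seed)
    n_i, n_j, n_k = X.shape
    A = rng.standard_normal((n_i, n_components))  # animal mode (assumed layout)
    B = rng.standard_normal((n_j, n_components))  # compartment mode
    C = rng.standard_normal((n_k, n_components))  # spectral mode
    norm_X = np.linalg.norm(X)
    errors = []
    for _ in range(n_iter):
        # Each update is the least-squares solution for one mode, the other two held fixed.
        A = np.einsum("ijk,jr,kr->ir", X, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum("ijk,ir,kr->jr", X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum("ijk,ir,jr->kr", X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
        X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
        errors.append(np.linalg.norm(X - X_hat) / norm_X)  # relative residual per sweep
    return A, B, C, errors


# Synthetic example: 6 animals, 5 compartments, 30 spectral variables, 2 components.
rng = np.random.default_rng(1)
A0 = rng.standard_normal((6, 2))
B0 = rng.standard_normal((5, 2))
C0 = rng.standard_normal((30, 2))
X = np.einsum("ir,jr,kr->ijk", A0, B0, C0)
A, B, C, errors = parafac_als(X, n_components=2, n_iter=50)
print("relative error after first and last sweep:", errors[0], errors[-1])
```

Under this assumed layout, each column of B describes how one component is shared across compartments, and the matching column of C is the associated spectral profile, i.e., the analogue of a "relational" spectrotype. Components are recovered only up to scaling, sign and ordering, so columns are usually normalized before interpretation.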
Acknowledgment. We thank John Newell, Monique Julita, Massimo Marchesini, Catherine Schwartz and Christophe Maubert for provision of the animal facilities and expertise.

Supporting Information Available: Appendix with chemometrics methods. This material is available free of charge via the Internet at http://pubs.acs.org.

References
(1) Dunne, C. Inflamm. Bowel Dis. 2001, 7, 136–145.
(2) Rezzi, S.; Ramadan, Z.; Fay, L. B.; Kochhar, S. J. Proteome Res. 2007, 6, 513–525.
(3) Rezzi, S.; Martin, F. P.; Kochhar, S. Ernst Schering Found. Symp. Proc. 2007, 251–264.
(4) Nicholson, J. K. Mol. Syst. Biol. 2006, 2, 52.
(5) Nicholson, J. K. J. Proteome Res. 2006, 5, 2067–2069.
(6) Nicholson, J. K.; Holmes, E.; Wilson, I. D. Nat. Rev. Microbiol. 2005, 3, 431–438.
(7) Kochhar, S.; Jacobs, D. M.; Ramadan, Z.; Berruex, F.; Fuerholz, A.; Fay, L. B. Anal. Biochem. 2006, 352, 274–281.
(8) Ellis, D. I.; Dunn, W. B.; Griffin, J. L.; Allwood, J. W.; Goodacre, R. Pharmacogenomics 2007, 8, 1243–1266.
(9) Nicholson, J. K.; Holmes, E.; Lindon, J. C.; Wilson, I. D. Nat. Biotechnol. 2004, 22, 1268–1274.
(10) Garrod, S.; Bollard, M. E.; Nicholls, A. W.; Connor, S. C.; Connelly, J.; Nicholson, J. K.; Holmes, E. Chem. Res. Toxicol. 2005, 18, 115–122.
(11) Garrod, S.; Humpfer, E.; Connor, S. C.; Connelly, J. C.; Spraul, M.; Nicholson, J. K.; Holmes, E. Magn. Reson. Med. 2001, 45, 781–790.
(12) Moreno, A.; Rey, M.; Montane, J. M.; Alonso, J.; Arus, C. NMR Biomed. 1993, 6, 111–118.
(13) Martin, F. P.; Wang, Y.; Sprenger, N.; Yap, I. K.; Lundstedt, T.; Lek, P.; Rezzi, S.; Ramadan, Z.; van Bladeren, P.; Fay, L. B.; Kochhar, S.; Lindon, J. C.; Holmes, E.; Nicholson, J. K. Mol. Syst. Biol. 2008, 4, 157.
(14) Martin, F.-P. J.; Wang, Y.; Sprenger, N.; Yap, I. K. S.; Rezzi, S.; Ramadan, Z.; Pere-Trepat, E.; Rochat, F.; Cherbut, C.; van Bladeren, P.; Fay, L. B.; Kochhar, S.; Lindon, J. C.; Holmes, E.; Nicholson, J. K. Mol. Syst. Biol. 2008, 4, 205.
(15) Martin, F. P.; Verdu, E. F.; Wang, Y.; Dumas, M. E.; Yap, I. K.; Cloarec, O.; Bergonzelli, G. E.; Corthesy-Theulaz, I.; Kochhar, S.; Holmes, E.; Lindon, J. C.; Collins, S. M.; Nicholson, J. K. J. Proteome Res. 2006, 5, 2185–2193.
Journal of Proteome Research • Vol. 8, No. 5, 2009 2405
(16) Wang, Y.; Tang, H.; Holmes, E.; Lindon, J. C.; Turini, M. E.; Sprenger, N.; Bergonzelli, G.; Fay, L. B.; Kochhar, S.; Nicholson, J. K. J. Proteome Res. 2005, 4, 1324–1329.
(17) Teague, C. R.; Dhabhar, F. S.; Barton, R. H.; Beckwith-Hall, B.; Powell, J.; Cobain, M.; Singer, B.; McEwen, B. S.; Lindon, J. C.; Nicholson, J. K.; Holmes, E. J. Proteome Res. 2007, 6, 2080–2093.
(18) Martin, F. P.; Dumas, M. E.; Wang, Y.; Legido-Quigley, C.; Yap, I. K.; Tang, H.; Zirah, S.; Murphy, G. M.; Cloarec, O.; Lindon, J. C.; Sprenger, N.; Fay, L. B.; Kochhar, S.; van Bladeren, P.; Holmes, E.; Nicholson, J. K. Mol. Syst. Biol. 2007, 3, 112.
(19) Reeves, P. G.; Nielsen, F. H.; Fahey, G. C., Jr. J. Nutr. 1993, 123, 1939–1951.
(20) Waters, N. J.; Garrod, S.; Farrant, R. D.; Haselden, J. N.; Connor, S. C.; Connelly, J.; Lindon, J. C.; Holmes, E.; Nicholson, J. K. Anal. Biochem. 2000, 282, 16–23.
(21) Wold, S.; Esbensen, K.; Geladi, P. Chemom. Intell. Lab. Syst. 1987, 2, 37–52.
(22) Wold, S.; Geladi, P.; Esbensen, K.; Ohman, J. J. Chemom. 1987, 1, 41–46.
(23) Tauler, R.; Smilde, A.; Kowalski, B. J. Chemom. 1995, 9, 31–58.
(24) Tauler, R.; Kowalski, B.; Fleming, S. Anal. Chem. 1993, 65, 2040–2047.
(25) Keller, H. R.; Massart, D. L. Chemom. Intell. Lab. Syst. 1992, 12, 209–224.
(26) Harshman, R. A. UCLA Work. Pap. Phonetics 1970, 16, 1.
(27) Carroll, J. D.; Chang, J. J. Psychometrika 1970, 35, 283.
(28) Bro, R. Chemom. Intell. Lab. Syst. 1997, 38, 149–171.
(29) Bro, R.; Kiers, H. A. L. J. Chemom. 2003, 17, 274–286.
(30) Smilde, A.; Bro, R.; Geladi, P. Multi-way Analysis: Applications in the Chemical Sciences; Wiley: Chichester, 2004.
(31) Fan, T. W. Prog. Nucl. Magn. Reson. Spectrosc. 1996, 28, 161–219.
(32) Nicholson, J. K.; Foxall, P. J.; Spraul, M.; Farrant, R. D.; Lindon, J. C. Anal. Chem. 1995, 67, 793–811.
(33) Richards, S. E.; Wang, Y.; Lawler, D.; Kochhar, S.; Holmes, E.; Lindon, J. C.; Nicholson, J. K. Anal. Chem. 2008, 80, 4876–4885.
(34) Roth, E.; Oehler, R.; Manhart, N.; Exner, R.; Wessner, B.; Strasser, E.; Spittler, A. Nutrition 2002, 18, 217–221.
(35) Wu, G.; Fang, Y. Z.; Yang, S.; Lupton, J. R.; Turner, N. D. J. Nutr. 2004, 134, 489–492.
(36) Craig, S. A. Am. J. Clin. Nutr. 2004, 80, 539–549.
(37) Olthof, M. R.; Brink, E. J.; Katan, M. B.; Verhoef, P. Am. J. Clin. Nutr. 2005, 82, 111–117.
(38) Wise, C. K.; Cooney, C. A.; Ali, S. F.; Poirier, L. A. J. Chromatogr., B: Biomed. Sci. Appl. 1997, 696, 145–152.
(39) Holub, B. Ann. Rev. Nutr. 1986, 6, 563–597.
(40) Yancey, P. H.; Clark, M. E.; Hand, S. C.; Bowlus, R. D.; Somero, G. N. Science 1982, 217, 1214–1222.
PR8010205