An Inflammatory Arthritis-Associated Metabolite Biomarker Pattern Revealed by 1H NMR Spectroscopy Aalim M. Weljie,#,† Reza Dowlatabadi,#,‡ B. Joan Miller,§ Hans J. Vogel,# and Frank R. Jirik*,§ Metabolomics Research Centre, Department of Biological Sciences, Department of Biochemistry and Molecular Biology, and the McCaig Institute for Bone and Joint Health, University of Calgary, Calgary, Alberta T2N 4N1, Canada, and Chenomx, Inc., Edmonton, Alberta T5K 2J1, Canada Received March 6, 2007
Rheumatoid arthritis, a debilitating, systemic inflammatory joint disease, is likely accompanied by alterations in circulating metabolites. Here, an 1H NMR spectroscopy-based metabolomics approach was developed to establish a metabolic ‘biomarker pattern’ in a model of rheumatoid arthritis, the K/BxN transgenic mouse. Sera obtained from arthritic K/BxN mice (N = 15) and a control population (N = 19) having the same genetic background, but lacking the arthritogenic T-cell receptor KRN transgene, were compared by 1H NMR spectroscopy. The method combines ultrafiltration to remove proteins from serum samples, quantitative ‘targeted profiling’ of known metabolites, pseudo-quantitative profiling of unknown resonances, a supervised O-PLS-DA pattern recognition analysis, and a metabolic-pathway based network analysis for interpretation of results. In total, 88 spectral features were profiled (59 metabolites and 29 unknown resonances). A highly significant subset of 18 spectral features (15 known compounds and 3 unknown resonances) was identified (p = 0.00075 using MANOVA) that we term a ‘metabolic bioprofile’. We identified metabolites relating to nucleic acid, amino acid, and fatty acid metabolism, as well as lipolysis, reactive oxygen species generation, and methylation. Pathway analysis suggested a shift from metabolites involved in numerous reactions (hub-metabolites) toward intermediates and metabolic endpoints associated with arthritis. The results attest to the metabolic complexity of systemic inflammation and to the power of the experimental approach for identifying a wide variety of disease-associated marker candidates. The diagnostic and prognostic implications of monitoring a spectrum of metabolic events simultaneously using serum samples are discussed with respect to the potential for individualized medicine.
Keywords: metabolite profiling • metabolomics • rheumatoid arthritis • K/BxN mice • inflammatory arthritis
Introduction Systemic diseases like rheumatoid arthritis (RA) are likely associated with changes in a complicated array of chemical reactions and metabolites that stem from a diverse set of metabolic pathways.1 These may be synovial joint specific, or related to systemic inflammation. Practically, genomic and ‘post-genomic’ methods, such as transcriptomics and proteomics, aim to monitor the co-regulation of genes, RNA transcripts, and proteins, in order to provide pathway-specific information about the etiology and/or progression of disease. For example, recent proteomic analysis of RA patients2 demonstrated an approach to disease diagnosis involving ‘fingerprinting’ of biomarkers. * Corresponding author. Dr. Frank R. Jirik, Department of Biochemistry and Molecular Biology, University of Calgary, 3330 Hospital Drive N.W., Calgary, Alberta, Canada T2N 4N1. Phone, (403) 220-8666; fax, (403) 2108127; e-mail,
[email protected]. # Metabolomics Research Centre, Department of Biological Sciences, University of Calgary. § Department of Biochemistry and Molecular Biology, The McCaig Institute for Bone and Joint Health, University of Calgary † Chenomx, Inc. ‡ Current address: Department of Medical Chemistry, Faculty of Pharmacy, Medical Science/University of Tehran, Iran.
Journal of Proteome Research 2007, 6, 3456-3464
Published on Web 08/15/2007
Metabolite-specific data provide a functional complement to these techniques and are advantageous for several reasons. First, because metabolite data are ‘downstream’ of variations in the genome, transcriptome, or proteome, they more directly reflect the phenotypic features of an organism. Clinically, this has been exploited for many years through numerous metabolite assays. Second, the number of endogenous human metabolites (∼2500) is estimated to be more than an order of magnitude lower than the number of genes, transcripts, or protein variants, making metabolite profiling tractable as a screening tool.3 Finally, metabolites and metabolic pathways are highly conserved among organisms,4,5 ensuring a solid basis for the use of model systems to investigate disease pathogenesis and response to treatment applicable to humans. NMR spectroscopy has been used to study synovial fluid (SF) and serum in RA patients6 and rodent air pouch inflammatory exudates,7 primarily with the aim of understanding the metabolic consequences of the hypoxic synovial environment. Similar results have been reported for canine osteoarthritis (OA) synovial fluid,8 indicative of a hypoxic environment undergoing lipolysis, and intriguingly, the ratio of synovial lactate to alanine has been suggested as a discriminatory feature between RA and OA.9
10.1021/pr070123j CCC: $37.00
2007 American Chemical Society
In general, this early work has been limited to qualitative or semiquantitative conclusions due to a dearth of tools suitable for analyzing data-rich complex biofluid mixtures. The emerging field of metabolomics aims to distill relevant information from feature-rich analytical data, such as via the use of NMR spectral data of biofluids. Typically, models are built using multivariate statistical methods, such as principal component analysis or cluster analysis.10 The resultant profiles represent ‘fingerprints’, defining spectral features of interest based on their diagnostic or prognostic value.11 This approach has recently been applied to examine urine samples of osteoarthritic guinea pigs and to study dietary influences on disease progression,12 and concluded that energy and purine metabolites were of major importance. ‘Targeted profiling’ represents a further advance in the quantitative analysis of 1H NMR data.13 The transformation of spectral data into concentration data also eliminates a number of factors which can confound multivariate analysis of spectral data, such as peak shifting due to pH and ionic strength variation in the samples, thus, yielding higher quality models.13,14 In the current work, a ‘biomarker pattern’ of metabolites was identified within the sera of arthritic K/BxN mice.15 This spontaneous, severe inflammatory arthritis shares a number of characteristics with rheumatoid arthritis, including leukocytic infiltration of joints, synovial hyperplasia with pannus formation, and effusions, as well as cartilage and bone destruction.16,17 When mice expressing the KRN T-cell receptor transgene (Tg) on a C57BL/6 background are crossed with non-obese diabetic (NOD) mice, all of the F1 offspring containing the KRN transgene develop a progressive symmetrical polyarthritis.16,17 The KRN T-cells recognize a peptide derived from the ubiquitously expressed glucose-6-phosphate isomerase (GPI) protein presented by the MHC class II molecule I-Agk, and the 
resulting T-helper cells stimulate the production of arthritogenic anti-GPI immunoglobulins. The IgG:GPI immune complexes that form within the joints lead to the generation of an inflammatory response requiring the participation of the C5a complement fragment, interleukin-1, mast cells, granulocytes, and macrophages.16,17 The latter two cell populations are primarily responsible for the synovitis and synovial hyperplasia that lead to progressive joint destruction in the K/BxN model. Although previous examinations of SF, nucleotide, and energy metabolites have shown classes of biomarkers potentially able to differentiate arthritic from non-arthritic animals, there may also exist arthritis-associated metabolites as yet uncharacterized by NMR spectroscopy. Measurement of a metabolite ‘biomarker pattern’ or ‘bioprofile’ would also allow a wider appreciation of the metabolic changes in arthritis owing to the observed shift from ‘hub’-metabolites to other intermediates and end points. We thus carried out an 1H NMR study of serum metabolites in the K/BxN mouse model of rheumatoid arthritis.
Materials and Methods Mice. K/BxN mice were obtained by crossing the KRN TCR hemizygous transgenic line that was fully backcrossed onto the C57BL/6 background (kindly provided by Dr. C. Benoist, Harvard Medical School)18 with the NOD/Lt inbred strain (Jackson Laboratory, Bar Harbor, ME). Control sera were obtained from age-matched KRN T-cell receptor transgene-negative littermate mice derived from the above-mentioned cross. The controls are thus designated as BxN and share the same genetic background as the experimental mice, except for the absence of the KRN transgene. Severely arthritic K/BxN mice, and BxN control mice, between 50 and 60 days of age, were bled by cardiac puncture. This procedure was carried out immediately following euthanization by carbon dioxide inhalation. Viral antibody-free animals were maintained in a barrier facility and in accordance with institutional and national guidelines for animal care and use.
NMR Analysis. Ultrafiltration of serum samples was carried out using a 3 kDa molecular weight cutoff filter with a 500 µL maximum volume (Pall Life Sciences) in order to separate higher molecular weight components from the metabolites of interest. Deuterium oxide (D2O) was added to the filtrate for a final sample volume of 600 µL and a final buffer concentration of 100 mM Na2HPO4/NaH2PO4 at pH 7.0 ± 0.05. DSS was added to final concentrations of 0.25-1.7 mM as an internal standard. 1D 1H spectra were acquired on a Bruker AVANCE 600 spectrometer equipped with a TXI probe using the standard noesypr1d pulse sequence. Parameters used were matched to the Chenomx (Edmonton, Alberta) NMR Suite library as previously described,13 with between 512 and 2048 transients during acquisition. 2D TOCSY and HSQC spectra were acquired on three representative samples of each type (arthritic and control) using standard sequences. Identification of metabolites was accomplished using 2D NMR methods and the compound library from Chenomx NMR Suite version 4.5. Metabolites were quantified using a targeted profiling approach as implemented in the Chenomx software. Custom library entries were created for unidentified resonances in order to carry them through the analysis for relative concentration comparison.
Statistical Analysis. The output concentration data from the targeted profiling analysis were normalized to the sum of all concentrations, excluding the two most concentrated metabolites, glucose and lactate, due to their disproportionate influence on normalization.19 Four samples were excluded based on strict criteria related to the NMR properties of the data (poor solvent suppression, shimming, etc.)
and, in one case, the presence of a number of spurious peaks in the spectrum. The final data set consisted of 34 total samples, 15 arthritic (10 female, 5 male) and 19 control (11 female, 8 male). The data were then trimmed to eliminate significant skewing (>2.5) in the concentration distributions, resulting in the removal of 8 measurements from the unknown resonance concentrations that represented 0.21% of the total data. These normalized and trimmed data were used for analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA) in GNU R version 2.3.0 (http://www.r-project.org/), and as input for pattern recognition (PR) analysis. SIMCA-P version 11.5 (Umetrics, Sweden) was used for PR multivariate analysis using supervised projection techniques (orthogonal partial least-squares discriminant analysis, O-PLS-DA) after mean-centering and autoscaling for equal metabolite weighting.13 The quality of the multivariate PR models is evaluated with the parameters R2 and Q2. R2 is the fraction of the total variation in the data explained by the model, ranging from 0 (no variation explained) to 1 (100% of the variation accounted for). Q2 is the cross-validated explained variation, obtained using a 7-fold cross-validation approach. As such, Q2 < 0 suggests a model with no predictive ability, and 0 < Q2 < 1 suggests some predictive character, with the reliability increasing as Q2 approaches 1. Further model validation was performed using a 7-fold cross-validation approach, whereby 5 samples were excluded from the model training stage. As the original model exhibited subpopulations, the cross-validation groups were assigned based on the scores from the original model to ensure similar samples were in different groups. The excluded samples were subsequently used for prediction, and a simple score was assigned (correct or incorrect).
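The normalization and autoscaling steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which used Chenomx output, R, and SIMCA-P); the function names and the concentrations are hypothetical, and the sketch assumes that glucose and lactate are left out of the normalization sum but are still divided by it.

```python
from statistics import mean, stdev

def normalize_to_sum(sample, exclude=("glucose", "lactate")):
    """Divide every concentration by the sum of the non-excluded concentrations.

    Assumption: excluded metabolites are omitted from the sum only;
    they are still reported relative to that sum.
    """
    total = sum(conc for name, conc in sample.items() if name not in exclude)
    return {name: conc / total for name, conc in sample.items()}

def autoscale(values):
    """Mean-center one metabolite across samples and scale to unit variance."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical concentrations in µM, for illustration only.
sample = {"glucose": 9000.0, "lactate": 4000.0,
          "glycine": 300.0, "uracil": 10.0, "xanthine": 20.0}
normalized = normalize_to_sum(sample)

# Autoscaling is applied per metabolite, across all samples.
glycine_across_samples = [0.91, 0.88, 0.95, 0.80]
scaled = autoscale(glycine_across_samples)
```

After autoscaling, each metabolite has zero mean and unit variance across samples, so low-concentration compounds carry the same weight in the model as abundant ones.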
Figure 1. Representative regions of the 1D proton NMR spectrum of the ultrafiltrate from an arthritic mouse. (A) Expansion of the aliphatic region from 0.80 to 4.2 ppm. The upper panel is an expansion of the highlighted region. (B) Expansion of the aromatic region from 7.15 to 8.45 ppm, which includes resonances from a number of compounds important to the analysis. Overlaid are the most extreme arthritic spectrum (gray line) and control spectrum (black line). Resonances from the most pronounced compounds are identified by the three-letter code for standard amino acids and as follows: 3Hb, 3-hydroxybutyrate; Ac, acetate; Aden, adenosine; Chol, choline; Cit, citrate; Citru, citrulline; Cr, creatine; csi, chemical shift indicator; For, formate; Glc, glucose; HXan, hypoxanthine; Lac, lactate; Me-His, 1-methylhistidine; ND, not determined; O-AC, O-acetylcarnitine; Orn, ornithine; Pyr, pyruvate; Suc, succinate; Tau, taurine; Xan, xanthine.
The choice of input variables for MANOVA analysis was based on the magnitude of the variable influence on projection (VIP) from SIMCA-P, where a VIP > 1 indicates contribution to the model. In addition, a further constraint was imposed such that the ANOVA p < 0.2. The default R test statistic was used (Pillai-Bartlett); however, as there is only one degree of freedom in the MANOVA response model (arthritic vs control), the possible test statistics are equivalent. Biochemical reactions involving the identified metabolites were identified through pathways annotated in the Kyoto Encyclopedia of Genes and Genomes (KEGG),20 with a super-network created of all pathways which involved 3 or more of the ‘bioprofile’ metabolites identified here (further details in Supporting Information). Graphical network analysis was conducted with Cytoscape v. 2.3.1 (http://www.cytoscape.org). Two types of networks were generated for visualization, pathway-ordered and degree-ordered. In the degree-ordered network, each node (metabolite or enzyme) is circularly sorted based on the total number of edges (chemical reactions) in which it participates. As a result, the most connected metabolites (or hub-metabolites) will be in the -y position, with the degree of ordering decreasing in a clockwise direction.
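The degree ordering described above amounts to counting the edges attached to each node and sorting nodes by that count. The following Python sketch shows the idea on a toy edge list; the network is hypothetical (not taken from KEGG), the function name is illustrative, and the alphabetical tie-break is an assumption rather than Cytoscape's behavior.

```python
from collections import Counter

def degree_order(edges):
    """Return (node, degree) pairs sorted by descending degree.

    Ties are broken alphabetically; this is an assumption made so the
    output is deterministic, not a documented Cytoscape rule.
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical toy network: each edge stands for one chemical reaction.
edges = [
    ("glutamate", "2-oxoglutarate"),
    ("glutamate", "serine"),
    ("glutamate", "glycine"),
    ("serine", "glycine"),
    ("uridine", "uracil"),
]
ordered = degree_order(edges)
# The first entry is the most connected node (the 'hub' in this toy example).
```

Placing the nodes around a circle in this order puts the hubs at the starting position and the least connected intermediates and endpoints at the far end.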
Results Metabolite Identification and Quantification. To develop an understanding of the arthritic process at a metabolite level, the profiles of sera obtained from K/BxN arthritic and control animals were examined by NMR spectroscopy. In total, 59 metabolites were identified and quantified using a ‘Targeted Profiling’ approach as indicated in Figures 1-4, and Supplementary Table 1 in Supporting Information. The mean measured concentrations ranged from 1.9 ± 0.2 µM (trigonelline) to 9694.0 ± 542.6 µM (glucose) for the total data set. Known compounds constituted between ∼93 and 95% of the total spectrum area. In addition, 29 weak unknown resonances were followed for a total of 88 spectral features, increasing the profile coverage to ∼95-98%. An exact estimate was hampered by 13C satellite peaks from high-concentration metabolites such as glucose and lactate. Assignment of unknown resonances was difficult owing to low signal-to-noise and overlap with more
Figure 2. Compounds identified to be statistically significantly different between the arthritic (N = 15) and control mice (N = 19) by ANOVA. Significance is indicated as very strong (****, p < 0.00003), strong (***, p < 0.01), standard (*, p < 0.05), weak (+, p < 0.10), and low (unmarked, p < 0.20). Red indicates down-regulated in the arthritic animals relative to control.
intense signals in 1D and 2D experiments. Differences in dilution from sample to sample were accounted for using a normalization procedure based on the sum of all determined concentrations as described in the Materials and Methods. Univariate Statistical Analysis of Candidate Biomarkers. To ascertain the metabolic differences between the arthritic and control animals, an ANOVA test was applied to all features in the normalized data set. Figure 2 shows the normalized changes from the normal state for all metabolites and resonances with p-values < 0.20. The mean uracil concentration in the arthritic animals was elevated 2.95-fold (p = 2.4 × 10^-5) over controls, while xanthine (p = 0.0044) and glycine (p = 0.0093) were decreased to 0.49-fold and 0.80-fold of control levels, respectively. Additional compounds showing statistically significant decreases (p < 0.05) include glycerol (0.85-fold), hypoxanthine (0.45-fold), and an unknown resonance at 2.48 ppm. TMAO levels increased 1.22-fold relative to the control level (p = 0.037). A number of other compounds were identified with weak statistical significance (0.05 < p < 0.2), including 2-hydroxybutyrate, 1-methylhistidine, 2-oxoglutarate, uridine, methionine, glutamate, serine, phenylalanine, taurine, choline, O-acetylcarnitine, and asparagine. In addition, a number of unknown resonances were identified at 5.17, 3.61, 5.68, and 1.11 ppm which were weakly significant in discriminating arthritic from control sera. Predictive Pattern Recognition Modeling Using Multivariate Projection Analysis and Validation. Univariate statistical analysis provides significant insight into the compounds responsible for the differences between the arthritic and control animals. Ultimately, however, the true value in the NMR analysis is the ability to quantify the signals from a number of
metabolites simultaneously, which exhibit a collinear response and do not conform to the univariate assumption of independence. As a result, the data set was analyzed using O-PLS-DA,21 a supervised discrimination extension to traditional principal component analysis, to build a model that demonstrates the predictive potential of the data. The model consisted of 3 components (1 principal component and 2 orthogonal components) with an explained variance R2 = 0.938 and cross-validated predictive ability Q2 = 0.547 (Figure 3A). In an O-PLS-DA model, the discriminatory variable (in this case arthritic vs control animals) is always the first component and is oriented along the x-axis. Notable features of the scores plot include the excellent discrimination between the arthritic and control sera, and the robustness of the model with respect to gender. In addition, several weak subpopulations were evident within the first orthogonal component along the y-axis; two are within the control group separated by the origin, and three within the arthritic animals. The arthritic samples are loosely clustered according to the experimental group in which they were acquired, with the three samples in the top right corner (Figure 3A) contaminated with a small amount of methanol. There is no clear differentiating factor for the two subpopulations from the control group. Two validation procedures were performed on the data to ensure that the discrimination in the model was not the result of ‘over-fitting’. First, a permutation test was conducted in which the class labels (arthritic or control) were randomly shuffled into 20 random combinations. The R2 and Q2 values were evaluated from 3-component O-PLS-DA models for each permuted data set.
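The logic of the label-permutation test can be sketched in a few lines of Python. This is a simplified stand-in, not the SIMCA-P procedure: instead of refitting an O-PLS-DA model and reading off R2 and Q2, the sketch uses the absolute difference in group means of a single hypothetical feature as the separation statistic. All names and values are illustrative assumptions.

```python
import random
from statistics import mean

def mean_difference(values, labels):
    """Separation statistic: absolute difference between the two group means."""
    arthritic = [v for v, lab in zip(values, labels) if lab == "arthritic"]
    control = [v for v, lab in zip(values, labels) if lab == "control"]
    return abs(mean(arthritic) - mean(control))

def permutation_statistics(values, labels, n_permutations=20, seed=0):
    """Recompute the statistic after randomly shuffling the class labels."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    stats = []
    for _ in range(n_permutations):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        stats.append(mean_difference(values, shuffled))
    return stats

# Hypothetical autoscaled values for one feature; not data from the study.
values = [3.1, 2.9, 3.3, 3.0, 1.0, 1.2, 0.9, 1.1]
labels = ["arthritic"] * 4 + ["control"] * 4

observed = mean_difference(values, labels)
permuted = permutation_statistics(values, labels)
```

If the statistic for the true labels exceeds the values obtained with shuffled labels, the observed separation is unlikely to be an artifact of fitting noise, which is the reasoning applied to R2 and Q2 in the text.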
Figure 3C demonstrates that the original model has the highest R2 and Q2 values, suggesting this is not the result of ‘over-fitting’ the data set.14 The slope of the R2 permutation group was only slightly positive, however, warranting further testing. The second test performed was a 7-fold cross validation, as described in Materials and Methods: 85% of samples were accurately predicted in this test, and 15% were incorrectly classified. Establishing a Metabolite Biomarker Pattern of Arthritis Using MANOVA Testing. To establish a set of compounds with the most influence on the model, a subset of the most important features was identified to generate a ‘bioprofile’, starting with compounds with univariate p-values < 0.20 (Figure 2). Figure 3B illustrates the positive relationship between the univariate F-statistic from ANOVA testing and the VIP from the PR analysis. The VIP is a measure of a variable’s contribution to both the R2 and Q2 of the model, with a higher value indicating larger influence; VIP > 1 is considered significant. The uracil VIP of 3.0 was the largest observed, and corresponded to the largest F-statistic (24.1). Also demonstrated in Figure 3B is the inverse relationship between the multivariate VIP and the univariate p-value from ANOVA (uracil p = 2.45 × 10^-5). The least significant feature considered in the subset was asparagine, p = 0.19. At this level, the VIP was still considered significant to the multivariate model at 1.21. The loadings plot of the O-PLS-DA model (Figure 4) demonstrated that the starting subset of compounds was important in the discrimination of arthritic from control animals. The statistical reliability of the determined compound subset was established by MANOVA testing. This analysis provides a measure of how (inversely) correlated changes in variables (e.g., up-regulation or down-regulation of multiple metabolites) can be significant, even if the individual metabolites are not statistically discriminatory by univariate testing.
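For readers who want to see how the univariate quantities referred to above are obtained, the following Python sketch computes a fold change and a one-way ANOVA F-statistic for two groups. The numbers are hypothetical and the function names are illustrative; the study itself used R for these calculations.

```python
from statistics import mean

def fold_change(arthritic, control):
    """Ratio of mean arthritic concentration to mean control concentration."""
    return mean(arthritic) / mean(control)

def anova_f(groups):
    """One-way ANOVA F-statistic: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical normalized concentrations for a single metabolite.
control = [1.0, 2.0, 3.0]
arthritic = [4.0, 5.0, 6.0]

fc = fold_change(arthritic, control)   # 5.0 / 2.0 = 2.5
f_stat = anova_f([arthritic, control])  # 13.5 / 1.0 = 13.5
```

A fold change above 1 corresponds to up-regulation in the arthritic group and below 1 to down-regulation; with only two groups, the F-statistic equals the square of the pooled-variance t-statistic.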
Table 1 illustrates two sets of features which exhibit the highest statistical significance between the arthritic and control animals. When all of the compounds with univariate ANOVA significance of
Figure 3. Scores plot from the multivariate O-PLS-DA model (A), relationship to univariate parameters (B), and model validation (C). (A) Each point represents a single serum spectrum, with the position determined by the contribution of the 88 tracked features. Variation between the arthritic and control sera is along the x-axis, while information orthogonal to the arthritic condition is in the y-axis. Gender is indicated as female (▲) or male (■). (B) Relationship between the VIP (x-axis) from multivariate O-PLS-DA analysis and the F-statistic (■, left y-axis) and p-value (light gray ▲, right y-axis) from univariate ANOVA analysis. (C) R2 is the explained variance, Q2 the predictive ability of the model, and correlation is the degree of overlap between the permuted and original models. The validation models exhibit poor predictive ability, as indicated by low Q2 values, supporting the conclusion that the original model is not overfit.14
Figure 4. Loadings plot from the O-PLS-DA analysis. Metabolites abbreviated as indicated in Figure 1 plus: 1-Mehis, 1-methylhistidine; 2-Hb, 2-hydroxybutyrate; 2-Oxoglut, 2-oxoglutarate; DCFA, dicarboxylic fatty acid, possibly suberate; Carn, carnitine; Cyt, cytidine; DMA, dimethylamine; Glyc, glycerol; Trig, trigonelline; PhChol, phosphocholine; SCFAn, short-chain fatty acid n; TMAO, trimethylamine N-oxide. Metabolites identified as significant with MANOVA testing are enlarged. Unknown resonances are indicated by the numerical ppm value or unmarked if not significant to the model. The axes are associated with the scores plot of Figure 3A; hence, metabolite concentrations on the left are higher in the normal sera as compared with the arthritic sera, and vice versa.
p < 0.20 are considered with the exception of asparagine, choline, O-acetylcarnitine, and the unknown resonances at 1.11 and 5.68 ppm, the MANOVA significance as a function of disease condition is very high, p = 0.00075, F-statistic = 14.2. If the entire subset is considered except for the unknown resonance at 5.68 ppm, the result remains significant with p = 0.039, F-statistic = 10.4.
Table 1. Significant Compounds and Resonances from Multivariate Analysis of Variance
p = 0.00075 | p = 0.039
Xanthine (a) | Xanthine
Glycine | Glycine
Glycerol | Glycerol
Unknown 2.48 | Unknown 2.48
Hypoxanthine | Hypoxanthine
1-Methylhistidine (b) | 1-Methylhistidine (b)
2-Oxoglutarate | 2-Oxoglutarate
Methionine | Methionine
Glutamate | Glutamate
 | Unknown 1.11 (c)
Serine | Serine
Phenylalanine | Phenylalanine
Taurine | Taurine
 | Asparagine (c)
 | O-Acetylcarnitine (c)
 | Choline (c)
Unknown 3.61 | Unknown 3.61
Unknown 5.17 | Unknown 5.17
2-Hydroxybutyrate | 2-Hydroxybutyrate
Uridine | Uridine
TMAO | TMAO
Uracil | Uracil
(a) Metabolites are ordered in ascending order of arthritic upregulation. (b) 1-Methylhistidine is tentatively assigned. (c) Metabolites/resonances unique to the second subset (shown in bold in the original table).
Discussion A metabolic ‘bioprofile’, consisting of predictive serum metabolite features from 1H NMR spectral data of the murine K/BxN model of arthritis, is presented in this study. The findings are generally consistent with previous NMR spectroscopy results examining arthritic biofluids.6-9,12,22 Significantly, the current work demonstrates the feasibility of using serum for diagnostic or prognostic testing. Other studies of arthritis have attempted to find biomarkers without SF analysis, including examination of guinea pig urine from an OA model system,12 and lipid ratio profiling using MS and 31P NMR spectroscopy.22 This latter study has demonstrated that serum analyses can yield early stage indicators of disease. One of the most useful advantages of the current method is that it enables comparisons between RA patient serum and non-arthritic serum, an approach that is not feasible in studies based on analysis of SF. Three of the metabolites presented here (uracil, xanthine, and glycine) may be considered classical candidate biomarkers due to their power to distinguish arthritic from control animals. There are, however, a number
Figure 5. Degree-ordered network analysis of the arthritic murine metabolite bioprofile. The arrow at the -y position denotes the highest degree of metabolite connectivity, and nodes are arranged in descending order in a counter-clockwise direction. Colors are assigned based on the fold change relative to the control metabolite (Figure 2), and size is proportional to the VIP (Figure 3B). Pathways were derived from KEGG20 and the figure was generated with Cytoscape.
Table 2. Potential Metabolic Pathways Indicated by Significant Metabolites
pathway | metabolites | references
Lipid, fatty acid and carbohydrate metabolism | Glycerol, Choline, 2-Hydroxybutyrate, Acetylcarnitine, TMAO | 22, 34, 35, 36
Energy metabolism | 2-Oxoglutarate, Acetylcarnitine | 37
Nucleic acid metabolism | Xanthine, Hypoxanthine, Uridine, Uracil, TMAO | 35, 23
Amino acid metabolism | Glutamate, Serine, Phenylalanine, Glycine, Methionine, Asparagine |
Reactive oxygen species (ROS) metabolism | Methionine, Glycine, Taurine, Xanthine, Hypoxanthine, TMAO, Acetylcarnitine | 24, 26, 27, 28, 29, 38
Methylation | Methionine | 35, 39
Macrophage response | Glycine (via glycine-gated chloride channels) | 30, 40, 41

of uncertainties about the specificity of such a limited analysis. To better understand the role of the ‘bioprofile’ metabolites, a network pathway analysis was conducted (see supplementary Table 1 and supplementary Figure 1 in Supporting Information) in order to complement traditional pathway analysis via KEGG pathways. Figure 5 demonstrates that a majority of metabolites identified were connected by a maximum of two chemical reactions. Intriguingly, a majority of metabolites down-regulated in the arthritic mice were much more highly connected into the network (so-called ‘hub’-metabolites5) than those that were up-regulated relative to controls. This suggested a shift away from central ‘hub’-metabolites (such as serine, glutamate, and 2-oxoglutarate) to other intermediates and metabolic endpoints (such as uracil and 2-hydroxybutyrate). The number of potential pathways involved highlights the complexity of the metabolic response to systemic inflammation. This result has implications beyond the system under study here, in that a quantitative understanding of such shifts may be critical to individualized medicine approaches. Taken together, the metabolites identified provide a multifactorial ‘picture’ of the inflammatory processes as shown in Table 2. While conclusively defining roles for each metabolite is not feasible based on the data from this study, it is nevertheless possible to generate some hypotheses as to why certain metabolites are either up- or down-regulated through an examination of existing literature and metabolic pathways known to be implicated in inflammatory responses. It appears that nucleic acid metabolism may be highly impacted in this inflammation model, with uracil and xanthine being two of the most prominent metabolites noted. In addition, hypoxanthine, TMAO, and uridine were all identified in the biomarker pattern. The changes in levels of these compounds suggest an alteration in purine and pyrimidine ratios, perhaps consistent with known processes in RA.23 Other compounds from the identified set are associated with oxidative stress, since xanthine and hypoxanthine have also been implicated in reactive oxygen species (ROS) generation via xanthine oxidase.24 Taurine chloramine is a sulfur-containing anti-oxidant which appears to down-regulate pro-inflammatory cytokine production,25 and is present at reduced levels in SF from RA patients.26 Other compounds such as glycine and methionine are important metabolites in homocysteine metabolism, the latter known to be altered in RA27 and which belongs to a thiol class of compounds implicated in free radical stress during inflammation.28,29 Glycine has also been suggested to have anti-inflammatory properties by acting on macrophage chloride channels to blunt cytokine release.30 A key feature of the rheumatoid joint is a hypoxic environment,31 consistent with NMR evidence for lipolysis in SF as compared to serum.6 In the current study, we noted a subtle decrease in the levels of glucose in the serum of K/BxN mice relative to controls, and increased levels of two ketone bodies, 2-hydroxybutyrate and 3-hydroxybutyrate. Differential levels of glycerol,
choline, and O-acetylcarnitine also provide a means of monitoring fatty acid and lipid metabolism. It is worth noting that the weights of a subset of animals differed significantly between the arthritic and control groups (see Supporting Information). Empirical observations suggested that the general mobility of the arthritic animals and, hence, access to food, was not overtly impaired, raising the possibility that lower weights in the K/BxN animals may have been due to proinflammatory cytokine-mediated anorexia and/or cachexia. It was not possible to build a reliable PLS model relating the metabolite data to weight for the subset of animals (N = 5) for which data were available. Specific metabolites tentatively identified which may be markers for weight-related phenomena and/or inflammatory response include 2-oxoglutarate, methionine, glutamate, uridine, and 1-methylhistidine; however, more extensive experimentation is required for confirmation. To our knowledge, previous NMR studies of serum or SF from arthritic sources have identified neither nucleic acid- nor oxidative stress-related metabolites. Our identification of these metabolites is due in large part to the sensitive experimental and statistical approaches we have employed. For example, ultrafiltration of the serum significantly attenuates broad protein resonances in the NMR spectra that can interfere with the detection of relatively weak metabolite signals. As a result, quantitative information is available for a number of these relatively low-intensity compounds through targeted profiling.
One disadvantage of ultrafiltration, however, is the loss of information about protein and higher molecular weight lipid components that have been found to be associated with arthritis by NMR spectroscopy.6,8 Another advantage of the methods described herein is the application of multivariate pattern recognition techniques to the targeted profiling data, which, through appropriate unit variance scaling functions,13 allows each compound or unknown resonance to be analyzed irrespective of its initial concentration. Recently, Cloarec et al. demonstrated that NMR spectral data preprocessed using unit-variance scaling and O-PLS-DA pattern recognition can mitigate the effects of chemical shift variability.32 In that study, a clever strategy was applied whereby the loadings plot displayed the covariance, colored by the correlation coefficients to the Y variable (in a 2-group discriminatory analysis). The spectral approach is useful as a rapid method to identify significant spectral features from the O-PLS-DA modeling, while providing equal weight to various spectral regions in modeling, similar to the equal weighting of compounds in the current study. The targeted profiling approach applied here is more intensive in terms of the analysis required (data fitting, unknown library parameterization, etc.), but has the advantage that spectral data are converted into compound concentration data. This results in simpler biological interpretation of the chemometric modeling (Figure 4), and allows the compound data to be used in a quantitative way, as
demonstrated by the ‘heatmap’ in Figure 2, and in the network visualization in Figure 5. Herein, we are also able to obtain reliable results for low concentration compounds and unknowns (Weljie, Newton, Vogel, manuscript in preparation) with a smaller sample size than recommended for statistical methods that rely on spectral analysis (e.g., groups of >20),32 which is important for sample-limited studies. Certainly, further investigation into the relative merits of both techniques is justified. Regardless of the method used to identify putative biomarkers in arthritis, the question arises as to the relevance of a given set of biomarkers to human disease. Since metabolism can be considered a ‘modular’ phenomenon,5 we predict that there will be overlap between the patterns of metabolism perturbed in human RA and those of the K/BxN model. Although the exact metabolites identified as potential ‘markers’ may change between species, it is likely that the metabolite classes involved will remain relatively constant. Initial studies on human rheumatoid populations provide some justification for this hypothesis (Weljie, Vogel, Eystathioy, unpublished observations). A similar conclusion was recently reached by Griffin and co-workers in the case of type-2 diabetes by comparing patient data with data derived from rat and mouse model systems.33 It is likely, however, that the metabolic manifestation of human disease will not be fully recapitulated by model systems, and consequently, considerable investigation will be required prior to the identification of biomarkers having clinically useful applications. In conclusion, we propose that monitoring of entire sets of metabolic features will be critical to characterizing disease processes. The targeted profiling and multivariate pattern recognition approach we have used herein permits the simultaneous monitoring of a large number of metabolites whose concentrations may vary by several orders of magnitude. 
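The role of unit-variance scaling in placing abundant and scarce metabolites on an equal footing, and the covariance/correlation pairing used for the loadings display of Cloarec et al.,32 can be illustrated with a minimal sketch. This is not the software used in this study: the concentration values, the class labels, and the function names `unit_variance_scale` and `covariance_and_correlation` are hypothetical and chosen only for illustration, and no O-PLS-DA model is fitted.

```python
import numpy as np


def unit_variance_scale(X):
    """Mean-center each column and divide by its sample standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def covariance_and_correlation(X, y):
    """Per-feature covariance and Pearson correlation with a class vector y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    cov = Xc.T @ yc / (X.shape[0] - 1)
    corr = cov / (X.std(axis=0, ddof=1) * y.std(ddof=1))
    return cov, corr


# Hypothetical concentrations (rows: animals; columns: two metabolites whose
# levels differ by several orders of magnitude). Values are made up.
X = np.array([
    [5000.0, 1.0],
    [5200.0, 1.2],
    [4800.0, 0.9],
    [6000.0, 2.0],
    [6300.0, 2.3],
    [5900.0, 2.1],
])
y = np.array([0, 0, 0, 1, 1, 1])  # illustrative labels: 0 = control, 1 = arthritic

X_uv = unit_variance_scale(X)
cov_raw, corr_raw = covariance_and_correlation(X, y)
cov_uv, corr_uv = covariance_and_correlation(X_uv, y)

print("raw covariance:   ", cov_raw)   # dominated by the abundant metabolite
print("scaled covariance:", cov_uv)    # comparable across metabolites
print("correlation:      ", corr_uv)   # unchanged by the scaling
```

In this toy example the raw covariance with class membership is dominated by the high-concentration metabolite, whereas after unit-variance scaling both metabolites contribute on a comparable scale; the correlation with the class variable is unaffected by the scaling.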
By using >20 features in the analysis, the confounding effects of changes to a subset of these metabolites, such as hub-metabolites, can be mitigated. Furthermore, the ability to track unknown NMR resonances provides relatively complete coverage of reliable NMR signals. Since metabolites can be regulated through a number of different pathways, assessing the entire constellation of metabolites and unknowns, rather than just one or two, has the potential to identify disease-specific biomarker patterns.
Abbreviations: ANOVA, analysis of variance; KEGG, Kyoto Encyclopedia of Genes and Genomes; MANOVA, multivariate analysis of variance; NMR, nuclear magnetic resonance; OA, osteoarthritis; O-PLS-DA, orthogonal partial least-squares discriminant analysis; PR, pattern recognition; RA, rheumatoid arthritis; SF, synovial fluid; VIP, variable influence on projection.
Acknowledgment. We would like to offer our sincere thanks to Dr. Vanina Zaremberg for useful discussions, and to Glen McInnis for acquisition of several spectra. Funding was provided by Genome Canada through the Genome Alberta Human Metabolome Project (H.J.V.), and by the Arthritis Society of Canada (F.R.J.). The equipment is funded by the Canada Foundation for Innovation and the Alberta Science and Research Authority. The Bio-NMR centre at the University of Calgary is maintained through funds provided by the Canadian Institutes of Health Research and the University of Calgary. H.J.V. holds a senior scientist award from the Alberta Heritage Foundation for Medical Research; A.M.W. was the recipient of
an Industrial Associate award from Alberta Ingenuity, and F.R.J. was the recipient of a Canada Research Chair award.
Supporting Information Available: Table of pathways identified by searching bioprofile metabolites in KEGG; figure of pathway-ordered merged network of pathways including 3 or more bioprofile metabolites; Cytoscape network file of merged network for further visualization (Cytoscape is freely available at www.cytoscape.org); figure of changes in body weight for selected arthritic and control animals. This material is available free of charge via the Internet at http://pubs.acs.org.
References
(1) Glocker, M. O.; Guthke, R.; Kekow, J.; Thiesen, H. J. Med. Res. Rev. 2006, 26, 63-87. (2) Gobezie, R.; Millett, P. J.; Sarracino, D. S.; Evans, C.; Thornhill, T. S. J. Am. Acad. Orthop. Surg. 2006, 14, 325-332. (3) Kell, D. B. Drug Discovery Today 2006, 11, 1085-1092. (4) Nikolaev, E. V.; Burgard, A. P.; Maranas, C. D. Biophys. J. 2005, 88, 37-49. (5) Ravasz, E.; Somera, A. L.; Mongru, D. A.; Oltvai, Z. N.; Barabasi, A. L. Science 2002, 297, 1551-1555. (6) Naughton, D.; Whelan, M.; Smith, E. C.; Williams, R.; Blake, D. R.; Grootveld, M. FEBS Lett. 1993, 317, 135-138. (7) Claxson, A.; Grootveld, M.; Chander, C.; Earl, J.; Haycock, P.; Mantle, M.; Williams, S. R.; Silwood, C. J.; Blake, D. R. Biochim. Biophys. Acta 1999, 1454, 57-70. (8) Damyanovich, A. Z.; Staples, J. R.; Chan, A. D.; Marshall, K. W. J. Orthop. Res. 1999, 17, 223-231. (9) Meshitsuka, S.; Yamazaki, E.; Inoue, M.; Hagino, H.; Teshima, R.; Yamamoto, K. Clin. Chim. Acta 1999, 281, 163-167. (10) Brindle, J. T.; Antti, H.; Holmes, E.; Tranter, G.; Nicholson, J. K.; Bethell, H. W.; Clarke, S.; Schofield, P. M.; McKilligin, E.; Mosedale, D. E.; Grainger, D. J. Nat. Med. 2002, 8, 1439-1444. (11) Clayton, T. A.; Lindon, J. C.; Cloarec, O.; Antti, H.; Charuel, C.; Hanton, G.; Provost, J. P.; Le Net, J. L.; Baker, D.; Walley, R. J.; Everett, J. R.; Nicholson, J. K. Nature 2006, 440, 1073-1077. (12) Lamers, R. J.; DeGroot, J.; Spies-Faber, E. J.; Jellema, R. H.; Kraus, V. B.; Verzijl, N.; TeKoppele, J. M.; Spijksma, G. 
K.; Vogels, J. T.; van der Greef, J.; van Nesselrooij, J. H. J. Nutr. 2003, 133, 1776-1780. (13) Weljie, A. M.; Newton, J.; Mercier, P.; Carlson, E.; Slupsky, C. M. Anal. Chem. 2006, 78, 4430-4442. (14) Chang, D.; Weljie, A. M.; Newton, J. Pac. Symp. Biocomput. 2007, in press. (15) Kouskoff, V.; Korganow, A. S.; Duchatelle, V.; Degott, C.; Benoist, C.; Mathis, D. Cell 1996, 87, 811-822. (16) Ditzel, H. J. Trends Mol. Med. 2004, 10, 40-45. (17) Kyburz, D.; Corr, M. Springer Semin. Immunopathol. 2003, 25, 79-90. (18) Korganow, A. S.; Ji, H.; Mangialaio, S.; Duchatelle, V.; Pelanda, R.; Martin, T.; Degott, C.; Kikutani, H.; Rajewsky, K.; Pasquali, J. L.; Benoist, C.; Mathis, D. Immunity 1999, 10, 451-461. (19) Dieterle, F.; Ross, A.; Schlotterbeck, G.; Senn, H. Anal. Chem. 2006, 78, 4281-4290. (20) Kanehisa, M.; Goto, S. Nucleic Acids Res. 2000, 28, 27-30. (21) Trygg, J.; Wold, S. J. Chemom. 2002, 16, 119-128. (22) Fuchs, B.; Schiller, J.; Wagner, U.; Hantzschel, H.; Arnold, K. Clin. Biochem. 2005, 38, 925-933. (23) Smolenska, Z.; Kaznowska, Z.; Zarowny, D.; Simmonds, H. A.; Smolenski, R. T. Rheumatology (Oxford) 1999, 38, 997-1002. (24) Lacy, F.; Gough, D. A.; Schmid-Schonbein, G. W. Free Radical Biol. Med. 1998, 25, 720-727. (25) Kontny, E.; Maslinski, W.; Marcinkiewicz, J. Adv. Exp. Med. Biol. 2003, 526, 329-340. (26) Kontny, E.; Wojtecka-Lukasik, E.; Rell-Bakalarska, K.; Dziewczopolski, W.; Maslinski, W.; Maslinski, S. Amino Acids 2002, 23, 415-418. (27) Roubenoff, R.; Dellaripa, P.; Nadeau, M. R.; Abad, L. W.; Muldoon, B. A.; Selhub, J.; Rosenberg, I. H. Arthritis Rheum. 1997, 40, 718-722. (28) Giustarini, D.; Lorenzini, S.; Rossi, R.; Chindamo, D.; Di Simplicio, P.; Marcolongo, R. Clin. Exp. Rheumatol. 2005, 23, 205-212. (29) Griffiths, H. R.; Aldred, S.; Dale, C.; Nakano, E.; Kitas, G. D.; Grant, M. G.; Nugent, D.; Taiwo, F. A.; Li, L.; Powers, H. J. Free Radical Biol. Med. 2006, 40, 488-500.
(30) Zhong, Z.; Wheeler, M. D.; Li, X.; Froh, M.; Schemmer, P.; Yin, M.; Bunzendaul, H.; Bradford, B.; Lemasters, J. J. Curr. Opin. Clin. Nutr. Metab. Care 2003, 6, 229-240. (31) Taylor, P. C.; Sivakumar, B. Curr. Opin. Rheumatol. 2005, 17, 293-298. (32) Cloarec, O.; Dumas, M. E.; Trygg, J.; Craig, A.; Barton, R. H.; Lindon, J. C.; Nicholson, J. K.; Holmes, E. Anal. Chem. 2005, 77, 517-526. (33) Salek, R. M.; Maguire, M. L.; Bentley, E.; Rubtsov, D. V.; Hough, T.; Cheeseman, M.; Nunez, D. J.; Sweatman, B. C.; Haselden, J. N.; Cox, R.; Connor, S. C.; Griffin, J. L. Physiol. Genomics 2007, 29, 99-108. (34) Zhang, A. Q.; Mitchell, S. C.; Smith, R. L. Food Chem. Toxicol. 1999, 37, 515-520. (35) Griffin, J. L.; Bonney, S. A.; Mann, C.; Hebbachi, A. M.; Gibbons, G. F.; Nicholson, J. K.; Shoulders, C. C.; Scott, J. Physiol. Genomics 2004, 17, 140-149.
(36) Baumeister, F. A.; Hack, A.; Busch, R. Klin. Padiatr. 2006, 218, 230-232. (37) Zuurveld, J. G.; Oosterhof, A.; Veerkamp, J. H.; van Moerkerk, H. T. Biochim. Biophys. Acta 1985, 844, 1-8. (38) Calabrese, V.; Giuffrida Stella, A. M.; Calvani, M.; Butterfield, D. A. J. Nutr. Biochem. 2006, 17, 73-88. (39) Bottiglieri, T. Am. J. Clin. Nutr. 2002, 76, 1151S-1157S. (40) Li, X.; Bradford, B. U.; Wheeler, M. D.; Stimpson, S. A.; Pink, H. M.; Brodie, T. A.; Schwab, J. H.; Thurman, R. G. Infect. Immun. 2001, 69, 5883-5891. (41) Wheeler, M. D.; Ikejema, K.; Enomoto, N.; Stacklewitz, R. F.; Seabra, V.; Zhong, Z.; Yin, M.; Schemmer, P.; Rose, M. L.; Rusyn, I.; Bradford, B.; Thurman, R. G. Cell. Mol. Life Sci. 1999, 56, 843-856.
PR070123J