Geometric Trajectory Analysis of Metabolic Responses To Toxicity

Metabonomics can be viewed as the process of defining multivariate metabolic ... We first proposed the use of metabolic trajectory analysis using mult...
0 downloads 0 Views 150KB Size
MAY 2004 VOLUME 17, NUMBER 5 © Copyright 2004 by the American Chemical Society

Articles Geometric Trajectory Analysis of Metabolic Responses To Toxicity Can Define Treatment Specific Profiles Hector C. Keun, Timothy M. D. Ebbels, Mary E. Bollard, Olaf Beckonert, Henrik Antti, Elaine Holmes, John C. Lindon, and Jeremy K. Nicholson* Biological Chemistry, Biomedical Sciences, Faculty of Medicine, Imperial College of Science, Technology and Medicine, London, SW7 2AZ, United Kingdom Received October 17, 2003

Metabonomics can be viewed as the process of defining multivariate metabolic trajectories that describe the systemic response of organisms to physiological perturbations through time. We have explored the hypothesis that the homothetic geometry of a metabolic trajectory, i.e., the metabolic response irrespective of baseline values and overall magnitude, defines the mode of response of the organism to treatment and is hence the key property when considering the similarity between two sets of measurements. A modeling strategy to test for homothetic geometry, called scaled-to-maximum, aligned, and reduced trajectories (SMART) analysis, is presented that together with principal components analysis (PCA) facilitates the visualization of multivariate response similarity and hence the interpretation of metabonomic data. Several examples of the utility of this approach from toxicological studies are presented as follows: interlaboratory variation in hydrazine response, CCl4 dose-response relationships, and interspecies comparison of bromobenzene toxicity. In each case, the homothetic trajectories hypothesis is shown to be an important concept for the successful multivariate modeling and interpretation of systemic metabolic change. Overall, geometric trajectory analysis based on a homothetic modeling strategy like SMART facilitates the amalgamation and comparison of metabonomic data sets and can improve the accuracy and precision of classification models based on metabolic profile data. Because interlaboratory variation, normal physiological variation, dose-response relationships, and interspecies differences are also key areas of concern in genomic and proteomic as well as metabonomic studies, the methods presented here may also have an impact on many other multilaboratory efforts to produce screenable “-omics” databases useful for gauging toxicity in safety assessment and drug discovery.

Introduction The rapid development of biochemical profiling technologies has made it possible, in principle, to observe the * To whom correspondence should be addressed. Tel: 44(0)20-75943230. Fax: 44(0)20-7594-3226. E-mail: [email protected].

“global” response of a living system in terms of mRNA synthesis (1), protein expression (2), or metabolic flux (3). These recent analytical advances have shifted the ratedetermining step in such studies to the interpretative stage of these so-called -omics approaches. Determining appropriate analyses for the highly variate data sets

10.1021/tx034212w CCC: $27.50 © 2004 American Chemical Society Published on Web 04/15/2004

580

Chem. Res. Toxicol., Vol. 17, No. 5, 2004

Keun et al.

Figure 1. (a) Two hypothetical metabolite concentration time courses and a corresponding trajectory plot. (b) A schematic representation of PCA.

generated remains a crucial but elusive goal, and in particular, there is a need for an effective framework to integrate these complementary views of systems biology. Metabonomics is concerned with characterizing the metabolic response of a living system to physiological perturbation or genetic intervention either through time or between two or more states (4). This aim distinguishes metabonomics from genomics or proteomics in that the focus is the end point of any treatment and not the potential for physiological response as indicated by differential gene regulation and protein expression. The practical implementation of metabonomics also sets it apart: frequently, the compositions of biofluids such as blood or urine are characterized, allowing a more direct view into the multiorgan, integrated response of a system (3). Tissue or cyto specific analysis is more typical of genomic, proteomic, and other metabolic profiling strategies. These features may allow metabonomics a key role in future efforts to integrate the -omics technologies. Recent important applications of the methodology include toxicology (3), clinical diagnosis (5), and functional genomics (6). The term trajectory means a path, especially when parametrized by time (7), and the aims of metabonomic investigation could be interpreted as the attempt to define the metabolic trajectory resulting from a particular intervention: the coordinate system defines the set of metabolites that is measured, and the position on the path indicates the current abundance or flux of the species of interest (Figure 1a). Visualized in this way, the direction and distance traveled indicate which metabolites are varying, whether these are increasing or decreasing, and to what extent are these changes across a particular period of time. In addition, relationships between the observed changes are readily seen, and such correlations are of particular interest since they may be a result of functional relationships within the living

system. As disease and toxic processes develop through time with onset, maximal damage, and regeneration phases, the metabolic responses to injury also exhibit dynamic variation, and monitoring these changes results in characteristic patterns for each type of pathological process. We first proposed the use of metabolic trajectory analysis using multivariate projection methods as a tool for studying toxicity (8, 9) and have since developed more sophisticated methods for exploring time-dependent metabolic behavior (10, 11). Typically, the number of metabolites detectable in urine is in the order of hundreds, and even using relatively insensitive analytical techniques such as NMR1 spectroscopy, many dozens of compounds can be identified. In addition, urinary NMR measurements are highly correlated due to both the structural and the underlying metabolic network relationships. For such complex data sets, pattern recognition procedures can facilitate the process of examining data, by automating aspects of data processing, extracting relevant and/or consistent features, and presenting them in a semiquantitative display. PCA (12) is one means by which the systematic (correlated) variation in highly multivariate data can be summarized by a few axes, the principal components (PCs), describing related changes (Figure 1b). The values defining the position of each sample in the new coordinate space are called scores and are linear combinations of the individual variables, in this case metabolite abundance or flux. The model loadings give the relative contributions of each parameter in calculating the scores. Crucially, each component is selected to describe the maximum variation not included in the previously calculated components (orthogonality). 1 Abbreviations: NMR, nuclear magnetic resonance; PCA, principal components analysis; TSP, sodium trimethylsilyl [2,2,3,3,-2H4]propionate; NOESY, nuclear Overhauser effect spectroscopy; SMART, scaled-tomaximum aligned reduced trajectories; HCBD, hexachlorobutadiene.

Geometric Trajectory Analysis of Metabolic Responses

Chem. Res. Toxicol., Vol. 17, No. 5, 2004 581

Figure 2. Standard trajectory analysis of urinary NMR data obtained after treatment with HCBD (2) and aurothiomalate (O). Error bars represent the standard error for each time point average.

A pervasive theme in metabolic trajectory analysis is judging when two or more responses are the same. Intuitively, we can appreciate that trajectories can have similar shapes even if there are clear differences in starting position and magnitude. The significance of direction in the trajectory analogy would suggest that geometry is important, but the implications of shape similarity alone in the context of physiological response have not generally been discussed. In the current work, we postulate a clear interpretation for the geometry of metabolic trajectories, based on empirical observations from several toxicity studies taken from the Consortium for Metabonomic Toxicology project (13). In addition, a scheme for effective geometry analysis of multivariate metabolic measurements is presented, using PCA. It is our belief that geometric trajectory analysis will dramatically benefit the interpretation of metabonomic data by allowing more effective comparison between data sets and thus facilitate the use of metabolic profiles in the classification of unknown treatments, such as in drug toxicity studies. These results also provide strong evidence for the specificity of metabolic response as characterized by NMR-based biofluid analysis. Homothetic Trajectories Hypothesis. The discriminatory power of urinary metabolic profiles in renal toxicity classification has been extensively reported (14, 15). Here, the results of metabonomic analysis of samples from two studies, one of aurothiomalate treatment and another of HCBD toxicity, show coincident metabolic trajectories (Figure 2). The power of trajectory analysis to immediately identify temporal structure and similarity is obviously apparent. Histopathological findings in both studies indicated renal tubular necrosis in all animals examined at 48 h, of marked severity for the aurothiomalate study and severe in the case of the HCBD study. Tubular regeneration was seen in all animals examined at 168 h, in agreement with the observation that the metabolic state, as viewed from the trajectory, approaches a predose position at later time points. Hence,

these observations support the notion that renal toxicity of comparable type and severity can produce similar metabolic responses. Note that this is despite the fact that the responses do not follow exactly the same time course: the study with the greater histopathological effect shows a faster onset as seen by comparing 8 and 24 h time points (Figure 2). This is reasonable since the treatments are chemically quite different and would be expected to have different pharmacodynamics. However, there is also a clear argument for our interpretation to be the same even if the two trajectories were displaced, perhaps by species differences in normal metabolism, or of different sizes, as might be expected from different levels of exposure to the same toxin. Shapes that are the same with respect to a translation and scalar enlargement or shrinkage are described as sharing homothetic geometry (7). Homothetic trajectories would share many important features such as the correlations in relative size and direction of metabolite changes. The key difference in considering homothetic geometry alone is that differences in the initial metabolic and in the overall magnitude of a treatment-related effect are removed from the interpretation process in addition to the exact time of events. Note that geometrically similar trajectories, which can be overlaid by a rotational transformation, would not share the same directions of metabolite changes and thus would not be expected to reflect similar system behavior. Several examples of homothetic and nonhomothetic trajectories and interpretation of these data are presented below and collectively support the notion that homothetic geometry between trajectories are indicative of a comparable mode of physiological response. Geometric Trajectory Analysis Using PCA. Because PCA attempts to describe the largest covariances in a data set, consistent differences in the average position or size of distribution between two clusters, e.g., treatment groups, within the data will bias the construction of the linear model to describe these differences at

582

Chem. Res. Toxicol., Vol. 17, No. 5, 2004

Figure 3. Testing for homothetic geometry using SMART analysis. Two trajectories that are homothetic (1) will coincide if corrected for different starting position (2), overall magnitude (3), and differing numbers of samples (4).

the cost of accurately modeling the treatment-related effects. In a similar way, differences in the number of samples between two clusters can also bias the analysis. To alleviate these difficulties, a modeling strategy was devised called SMART analysis, which aims to remove predose differences as gross magnitude differences in the data, prior to PCA of the metabolic profiles averaged across individuals at each time point. In practice, not all of the transformations may be necessary but each should be considered in the data analysis. SMART analysis is illustrated schematically in Figure 3. Perhaps the most useful property of this modeling strategy was that it could be used as a means to test the homothety of trajectories. Because the SMART analysis performs a linear transformation of each treatment group data (by translation and scaling without rotation), trajectories that become coincident as a result of the analysis must share homothetic geometry. The examples below demonstrate that without SMART or some similar approach the danger of misinterpreting the results of PCA is significant.

Experimental Procedures Animal Studies. Male 8-10 week old Sprague-Dawley rats or 12-16 week old B6C3F1 mice (Charles Rivers Laboratories) were housed in individual metabolism cages, and urine samples were collected daily from each animal. A standard diet, Purina chow 5002, was given to all animals, and free access to food and water was permitted throughout the study. A temperature of 21 ( 2 °C and a relative humidity of 55 ( 10% was

Keun et al. maintained with fluorescent lighting between 06.00 and 18.00 ( 1 h. Animals were randomly assigned to dose groups (vehicle only, treated; Table 1) with 10 rats or eight mice per group. All doses were administered at zero hours. Each dose group was split into two groups, one euthanized at 48 h postdose (end of day 2) and the other euthanized at 168 h postdose (end of day 7) for serum sampling and histopathological evaluation. For rats, urine samples were collected over ice and into 1 mL of 1% w/v sodium azide at between 24 h predose and 16 h predose and then at 0, 8, 24, 48, 72, 96, 120, 144, and 168 h postdose. For mice, one sample was collected for the 24 h predose period and one for the 24 h immediately after dosing. On collection, urine samples were centrifuged at ∼1200g for 10 min, to remove any solid debris, and were subsequently stored at -40 °C pending 1H NMR spectroscopic analysis. NMR Spectroscopy. A 200 µL amount of 0.2 M sodium phosphate buffer (pH 7.4) containing 1 mM TSP, ∼3 mM sodium azide, and 10% of D2O was added to 400 µL of each rat urine sample and 200 µL of each mouse urine. Mouse samples were diluted by the further addition of 200 µL of deionized water. All samples were recentrifuged at 1200g for 10 min to remove any solid debris. 1H NMR spectra were measured at 600 MHz and operating at 300 K using a flow-injection system (Bruker Biospin, Karlsruhe, Germany). During spectral acquisition, the water resonance was suppressed by using the first increment of a NOESY pulse sequence with irradiation during a 1 s relaxation delay and also during the 100 ms mixing time. For each sample, 64 transients were collected into 32K data points using a spectral width of 20.036 ppm. Prior to Fourier transformation, an exponential line-broadening factor of 1 Hz was applied to each free induction decay. The spectra were phased, baseline corrected, and referenced to TSP (δ 0.0) automatically using an in-house routine written in MATLAB (The MathWorks, Natick, MA). Data Reduction and Pattern Recognition. Each NMR spectrum was reduced to 245 variables, calculated by integrating regions of equal width (0.04 ppm) corresponding to the δ range 0.2-10.0 using an in-house routine in MATLAB. This simplified statistical analysis of the data and reduced the impact of small variations in chemical shift due to, for example, variation in pH or ionic strength. Some spectral regions were consistently excluded from the analysis. The regions δ 4.50-5.98 were deleted to remove any spurious effects of variability in the suppression of the water resonance and any cross-relaxation effects (mediated by chemical exchange of protons) on the urea signal. For hydrazine, HCBD, and bromobenzene data, further regions corresponding to xenobiotic metabolites were excluded from the analysis (Table 1). The regions around any remaining citrate resonances (δ 2.50-2.58 and δ 2.66-2.74) were merged by summing into two integrated regions to reduce further the effects of pH-dependent chemical shifts. While pharmacokinetic information and some endogenous profile information may be removed by these exclusions, the resulting analysis is thus better focused on reliably profiling metabolic change within the living system. Finally, to take account of large variations in urine concentration, all spectra thus reduced were then normalized to a constant integrated intensity of 100 units. Multivariate analysis was performed using SIMCA-P software (version 10.0, Umetrics AB, Umeå, Sweden). The data were mean-centered, in which the mean of each variable was sub-

Table 1. Dose Levels, Vehicles, Routes, and Regions Excluded from Pattern Recognition Analysis for Each of the Studies Discussed vehicle/route

dose (mg/kg)

hydrazine dihydrochloride sodium aurothiomalate HCBD bromobenzene (rat)

study compound

saline (0.9% w/v) 10 mL/kg p.o. saline (0.9% w/v) 10 mL/kg i.p. corn oil 10 mL/kg p.o. corn oil 10 mL/kg p.o.

30 100 200 1500

bromobenzene (mouse) carbon tetrachloride (low dose) carbon tetrachloride (high dose)

corn oil 10 mL/kg p.o. corn oil 10 mL/kg p.o. corn oil 10 mL/kg p.o.

750 640 3200

excluded regions (ppm) 2.74-2.94 and 2.46-2.66 8.06-7.9, 4.46-4.3, 3.78-3.58, 3.02-2.82, and 2.14-1.98

Geometric Trajectory Analysis of Metabolic Responses

Chem. Res. Toxicol., Vol. 17, No. 5, 2004 583

Figure 4. (a) Standard trajectory analysis of urinary NMR data obtained after treatment with hydrazine from identical studies repeated at different sites: study A (2) and study B (O). (b) SMART analysis of the same data. Error bars represent the standard error for each time point average. tracted from each element in a column of the data set. This is equivalent to subtracting the mean spectrum from each other spectrum in the data set and performing subsequent analyses on the mean-subtracted spectra. To visualize the underlying metabolic trajectories, PCA was conducted on the average spectra at each time point. All averages were calculated as median values or as means if less than four values were being averaged. SIMCA analysis was conducted in a similar fashion, with residual variance, which was calculated as the sum of the squares of the differences between the data model and the original data. SMART Analysis. The average spectrum from the 0 h measurement for each study was subtracted from all of the spectra for each study (predose subtraction or alignment). The vector magnitudes (the square root of the sum of the squares of the values) for each aligned spectrum were calculated, and the average magnitude for each time point of each dose group from each study was determined. The largest magnitude for each dose group from each study was used as an estimate for the overall magnitude of the treatment-related effect and used to scale each

dose group data to a common magnitude (scaling to maximum effect). Finally, analysis was conducted on the average spectra at each time point (reduction). All averages were calculated as median values or as means if less than four values were being averaged.

Results Assessing the Interlaboratory Reproducibility of Response to Hydrazine Treatment. Initial PCA analysis of average urinary profiles for two identical studies of hydrazine toxicity conducted at two different sites is shown in Figure 4a. The subpathological doses used (30 mg/kg) result in a relatively low response, so baseline differences in the metabolic profile between the two cohorts of rats used in the studies cause the trajectories to appear separated in space. The similar directions of response suggest that the metabolic changes are comparable. After SMART analysis, the trajectories appear

584

Chem. Res. Toxicol., Vol. 17, No. 5, 2004

Keun et al.

Figure 5. (a) Standard trajectory analysis of urinary NMR data obtained after treatment with bromobenzene from studies conducted in both the mouse (2) and the rat (O). (b) SMART analysis of the same data.

obviously coincident (Figure 4b), as we would expect between two replicate experiments designed to elicit the same physiological response. Differential Response to Bromobenzene Treatment between the Rat and the Mouse. The rat and the mouse differ considerably in their urinary composition as indicated by the separation of control data from bromobenzene studies from each species in PCA analysis (Figure 5a). Again, the trajectories defining the response to bromobenzene appear to traverse similar directions, suggesting that the metabolic response is of the same type. Histopathology indicated that all animals examined at 48 h showed evidence of centrilobular necrosis and hepatocellular vacuolization of comparable severity, supporting this view. However, after SMART analysis, the trajectories are clearly not coincident and describe very different physiological behavior (Figure 5b). Because the dose levels were different between the two studies (to compensate for differential susceptibility) and food intake was not measured, it is difficult to state confidently that

these observations reflect a genuine difference in metabolic regulation between the two species. What is apparent, however, is that geometric trajectory analysis can reveal differential physiological responses not obvious to examination by histopathology or standard trajectory analysis. Irrespective of the SMART procedure, the mouse trajectory remains open while the rat trajectory is closed, so information regarding different rates of progression is unaltered. Dose Dependence of the Metabolic Response in a Study of CCl4 Treatment. Metabolic trajectories for two treatment groups in a study of CCl4 hepatotoxicity are shown in Figure 6. Clearly, the high dose group exhibits a larger response than the low dose group, as might be expected, and takes a longer time period to recover. SMART analysis shows that the trajectories become more comparable, as we would expect to see if the homothetic trajectory hypothesis were true and if both dose levels were eliciting the same mode of physiological response. Apparent differences are visible how-

Geometric Trajectory Analysis of Metabolic Responses

Chem. Res. Toxicol., Vol. 17, No. 5, 2004 585

Figure 6. (a) Standard trajectory analysis of urinary NMR data obtained after treatment with carbon tetrachloride at two different dose levels: low dose (2) and high dose (O). (b) SMART analysis of the same data. Error bars represent the standard error for each time point average.

ever in the 72 and 96 h data for the high dose group, although it is possible that this is an artifact of the sampling time. There may be few data points in this region because of the greater rate of progression of the low dose group in traversing the metabolic trajectory rather than a genuine difference in response profile. While caution must be used in interpreting partial overlap of trajectories, the high coincidence of the trajectories in the first 48 h favors the conclusion that the mode of response is the same in the early onset period of toxicity. Classification by Soft Modeling of Class Analogy (SIMCA). In previous examples, the PCA models were constructed using the combined data of both treatments, i.e., all variation between the samples affects the derived components. Thus, by removing certain confounding sources of variation, the SMART procedure allows the geometry of response to dominate the model. Another common strategy is to model each group to be compared separately. For example, the SIMCA approach to classification is based on the examination of the residual unexplained variance of unknown test data when placed in the individual model space (defined by the model

loadings) for each known class (9, 12). SMART analysis does not affect the covariance within, and hence the PC space of, an individual treatment group if time point averaged data are modeled by PCA. Hence, contrasting model loadings can provide a scale-free and baselineindependent means of evaluating response geometry (16). However, model scores and residuals are more reliable parameters to compare than loadings if differential rates of response or only partial overlap of trajectories occurs, and these are altered by individual model SMART analysis (Figures 1, 3, and 5). Tables 2 and 3 show the effect of SMART analysis on the total residual matrix for three treatment groups already discussed: two replicate experiments of hydrazine toxicity (A and B) and exposure to aurothiomalate. The residuals have been normalized such that the residual of the group data from which a model is derived has unit value. Clearly, the residuals for the hydrazine A data in hydrazine B model space decrease after SMART (-2.03), and the same is true of the reciprocal arrangement (-0.22). This would be expected since both experiments were intentionally identical and is a result of the homothety of the metabolic trajectories generated. In

586

Chem. Res. Toxicol., Vol. 17, No. 5, 2004

Keun et al.

Table 2. SIMCA Analysis of Metabolic Trajectories from Three Studies before the Application of SMARTa normalized residuals pre-SMART hydrazine A samples hydrazine B samples aurothiomalate samples

hydrazine hydrazine aurothiomalate A model B model model 1.00 2.04 7.06

3.46 1.00 10.38

2.06 1.77 1.00

a Each PCA model was constructed with two components. Residuals were normalized relative to the total residual from the data used to generate a model.

Table 3. SIMCA Analysis of Metabolic Trajectories from Three Studies after the Application of SMARTa normalized residuals post-SMART hydrazine A samples hydrazine B samples aurothiomalate samples

hydrazine hydrazine aurothiomalate A model B model model 1.00 1.82 2.76

1.43 1.00 2.41

3.54 3.71 1.00

a Each PCA model was constructed with two components. Residuals were normalized relative to the total residual from the data used to generate a model.

contrast, the SMART procedure increases the residuals of both hydrazine group data when placed in the model space of the aurothiomalate group (+1.47, +1.93), while simultaneously decreasing the residual of aurothiomalate data placed in either hydrazine group model space (-4.30, -7.97). This asymmetry of effect is a direct consequence of lack of homothety between the metabolic trajectories compared, and it is our assertion that this lack of coincident geometry reflects the different toxicity, to the liver and kidney, respectively, of hydrazine and aurothiomalate. Typically, the residuals are considered on a sample-by-sample basis, so that a critical limit for model membership is defined by the limits of the distribution of residual values for the training data. If a normal distribution of residuals is assumed (which should be that case if all systematic variation is captured by the PCA model), then a 95% critical limit can be approximated by a residual value of two in normalized units. After SMART processing, residuals from the hydrazine data in hydrazine model space fall within this limit while the comparison of different toxins produces residuals outside this critical limit (Table 3). While important subtleties in interpreting the PCA-modeled trajectories remain, such as the correct number of components to use for each model, it can be seen how geometric trajectory analysis implemented using the SMART strategy facilitates the comparative study of endogenous metabolic response to toxins and could form the basis of a system to classify unknown treatments based only on prior metabonomic data.

Discussion Presented here is evidence supporting the use of geometric trajectory analysis in the interrogation of metabolic profile data such as those generated by metabonomics. The concept of homothetic trajectories was defined and assigned a qualitative interpretation: that such observations result from the same mode of physiological response. This hypothesis was tested using experimental data from studies where we might expect to observe similar metabolic responses. A modeling strategy, SMART analysis, designed to accurately test for coincident geometry (homothety) of trajectories using PCA, was used, and in each case, it was shown that

geometric trajectory analysis provided a superior test than standard trajectory analysis for determining whether two responses were the same. For the homothetic trajectories concept to provide a tangible benefit to the field in the long-term, it will be necessary to clarify, perhaps on a case-by-case basis (i) what is meant by the mode of response, (ii) what are the underlying processes generating this system property, and (iii) a quantitative criterion for coincident geometry. In the case of the PCA analyses of urinary NMR spectra presented here, the mode of response is thus defined as the pattern of covariation between different metabolite excretion rates. Relationships will exist between metabolites excretion due both to the fundamental physiological processes governing renal transport such as maintenance of biofluid pH, blood sugar, and ionic potential but also to the metabolic pathway interactions that generate the excretion products themselves. What is especially interesting is that a linearly scalable pattern of excretion can be observed despite the obvious complexity and nonlinearity of these governing processes. It remains a future challenge to model metabolic networks and attempt to generate this form of behavior. The SIMCA analyses shown represent one possible framework for a more quantitative analysis of geometry, but many other implementations could be used. For example, the full use of individual animal data might provide a more statistically rigorous definition of physiological “noise” in metabonomic data and thus a better criterion of homothety. While there is no intrinsic reason for data to be averaged across animals, it does benefit the assessment of overall magnitude and interstudy classification. However, it is also obvious that differential rates of progression will reduce the confidence with which averaged data can be interpreted, and so in cases where the time course is well-sampled, geometric trajectory analysis could be applied on an individual animal basis. One theoretical drawback of this approach is that two trajectories can share homothetic geometry even if the temporal order of events is completely reversed. In practice, this apparently paradoxical situation is unlikely to occur, since the relationships governing metabolite excretion exhibit some degree of independence to each other, although acid-base imbalance might produce such behavior in some urinary studies. One approach that takes the time of events specifically into account is to build a partial least squares regression model using time as the regression variable, an approach that has been described as batch analysis (10, 17). However, because it has been shown that similar toxicity can exhibit different rates of onset and recovery (Figure 2), the strategy is unlikely to be widely applicable unless combined with a method for synchronizing a series of time courses (18). Overall, geometric trajectory analysis based on a homothetic modeling strategy like SMART facilitates the amalgamation and comparison of metabonomic data sets and can improve the accuracy and precision of classification models based on metabolic profile data. The methods presented here may also further our understanding and confidence in interpreting and integrating biochemical data from many other technologies of the genomic revolution. Interlaboratory variation, normal physiological variation, dose-response relationships, and interspecies differences are also key areas of concern in genomic and proteomic as well as metabonomic studies. In particular, these issues are crucial to the validity of efforts

Geometric Trajectory Analysis of Metabolic Responses

to produce large-scale screenable databases for applications such as microbial taxonomy (19) and gauging toxicity in safety assessment and drug discovery (20, 21). The existence of such homothetic structures in physiology is itself an intriguing observation and in the long-term may provide new insight into the relationships between metabolic variation and systems biology in normal regulation and dysfunction.

Acknowledgment. We (H.C.K., T.M.D.E., M.E.B., and O.B.) acknowledge the members of the Consortium for Metabonomic Toxicology (Eli Lilly, Bristol Myers Squibb, Roche, Novo Nordisk, Pfizer, and Pharmacia) for financial support.

References (1) Lovett, R. A. (2000) Toxicogenomics. Toxicologists brace for genomics revolution. Science 289, 536-537. (2) Kennedy, S. (2002) The role of proteomics in toxicology: identification of biomarkers of toxicity by protein expression analysis. Biomarkers 7, 269-290. (3) Nicholson, J. K., Connelly, J., Lindon, J. C., and Holmes, E. (2002) Metabonomics: a platform for studying drug toxicity and gene function. Nat. Rev. Drug Discovery 1, 153-161. (4) Nicholson, J. K., Lindon, J. C., and Holmes, E. (1999) Metabonomics: understanding the metabolic responses of living systems to pathophysiological stimuli via multivariate statistical analysis of biological NMR spectroscopic data. Xenobiotica 29, 11811189. (5) Brindle, J. T., Antti, H., Holmes, E., Tranter, G., Nicholson, J. K., Bethell, H. W. L., Clarke, S., Schofield, P. M., McKilligin, E., Mosedale, D. E., and Grainger, D. J. (2002) Rapid and noninvasive diagnosis of the presence and severity of coronary heart disease using 1H NMR-based metabonomics. Nat. Med. 8, 1439-1444. (6) Gavaghan, C. L., Holmes, E., Lenz, E., Wilson, I. D., and Nicholson, J. K. (2000) An NMR-based metabonomic approach to investigate the biochemical consequences of genetic strain differences: application to the C57BL10J and Alpk: ApfCD mouse. FEBS Lett. 484, 169-174. (7) Borowski, E. J., and Borwein, J. M. (1989) Dictionary of Mathematics, Collins, Glasgow, UK. (8) Holmes, E., Bonner, F. W., Sweatman, B. C., Lindon, J. C., Beddell, C. R., Rahr, E., and Nicholson, J. K. (1992) Nuclear magnetic resonance spectroscopy and pattern recognition analysis of the biochemical processes associated with the progression of and recovery from nephrotoxic lesions in the rat induced by mercury(II) chloride and 2-bromoethanamine. Mol. Pharmacol. 42, 922-930.

Chem. Res. Toxicol., Vol. 17, No. 5, 2004 587 (9) Beckwith-Hall, B. M., Nicholson, J. K., Nicholls, A. W., Foxall, P. J., Lindon, J. C., Connor, S. C., Abdi, M., Connelly, J., and Holmes, E. (1998) Nuclear magnetic resonance spectroscopic and principal components analysis investigations into biochemical effects of three model hepatotoxins. Chem. Res. Toxicol. 11, 260272. (10) Antti, H., Bollard, M. E., Ebbels, T. M. D., Keun, H. C., Lindon, J. C., Nicholson, J. K., and Holmes, E. (2002) Batch statistical processing of 1H NMR-derived urinary spectral data. J. Chemom. 16, 1-9. (11) Ebbels, T., Keun, H., Beckonert, O., Antti, H., Bollard, M., Holmes, E., Lindon, J., and Nicholson, J. (2003) Toxicity classification from metabonomic data using a density superposition approach: CLOUDS. Anal. Chim. Acta 490, 109-122. (12) Wold, S., Esbensen, K., and Geladi, P. (1987) Principal component analysis. Chemom. Intell. Lab. 2, 37-52. (13) Lindon, J. C., Nicholson, J. K., Holmes, E., Antti, H., Bollard, M. E., Keun, H., Beckonert, O., Ebbels, T. M., Reily, M. D., and Robertson, D. (2003) Contemporary issues in toxicology the role of metabonomics in toxicology and its evaluation by the COMET project. Toxicol. App. Pharmacol. 187, 137-146. (14) Anthony, M. L., Sweatman, B. C., Beddell, C. R., Lindon, J. C., and Nicholson, J. K. (1994) Pattern recognition classification of the site of nephrotoxicity based on metabolic data derived from proton nuclear magnetic resonance spectra of urine. Mol. Pharmacol. 46, 199-211. (15) Holmes, E., Nicholson, J. K., Nicholls, A. W., Lindon, J. C., Connor, S. C., Polley, S., and Connelly, J. (1998) The identification of novel biomarkers of renal toxicity using automatic data reduction techniques and PCA of proton NMR spectra of urine. Chemom. Intell. Lab. 44, 245-255. (16) Benigni, R., and Giuliani, A. (1994) Quantitative modeling and biology: The multivariate approach. Am. J. Physiol. 266, R1697R1704. (17) Azmi, J., Griffin, J. L., Antti, H., Shore, R. F., Johansson, E., Nicholson, J. K., and Holmes, E. (2002) Metabolic trajectory characterisation of xenobiotic-induced hepatotoxic lesions using statistical batch processing of NMR data. Analyst 127, 271-276. (18) Kourti, T. (2003) Multivariate dynamic data modeling for analysis and statistical process control of batch processes, start-ups and grade transitions. J. Chemom. 17, 93-109. (19) Wilkes, J. G., Glover, K. L., Holcomb, M., Rafii, F., Cao, X., Sutherland, J. B., McCarthy, S. A., Letarte, S., and Bertrand, M. J. (2002) Defining and using microbial spectral databases. J. Am. Soc. Mass. Spectrom. 13, 875-887. (20) Boorman, G. A., Haseman, J. K., Waters, M. D., Hardisty, J. F., and Sills, R. C. Quality review procedures necessary for rodent pathology databases and toxicogenomic studies: the National Toxicology Program experience. Toxicol. Pathol. 30, 88-92. (21) Tennant, R. W. (2002) The National Center for Toxicogenomics: using new technologies to inform mechanistic toxicology. Environ. Health Perspect. 110, A8-A10.

TX034212W