Normalizing and Integrating Metabolomics Data | Analytical Chemistry

Nov 14, 2012 - DOI: 10.1016/bs.coac.2018.08.003. Anna C. Reisetter, Michael J. Muehlbauer, ... DOI: 10.1002/rcm.7777. Yu Guo, Zhongfeng Li, Xinfeng Li...
0 downloads 0 Views 852KB Size
Article pubs.acs.org/ac

Normalizing and Integrating Metabolomics Data Alysha M. De Livera,*,† Daniel A. Dias,† David De Souza,† Thusitha Rupasinghe,† James Pyke,† Dedreia Tull,† Ute Roessner,† Malcolm McConville,† and Terence P. Speed‡ †

Metabolomics Australia at The University of Melbourne, Melbourne, Victoria, Australia Bioinformatics Division, Walter and Eliza Hall Institute, Parkville, Victoria, Australia



S Supporting Information *

ABSTRACT: Metabolomics research often requires the use of multiple analytical platforms, batches of samples, and laboratories, any of which can introduce a component of unwanted variation. In addition, every experiment is subject to withinplatform and other experimental variation, which often includes unwanted biological variation. Such variation must be removed in order to focus on the biological information of interest. We present a broadly applicable method for the removal of unwanted variation arising from various sources for the identification of differentially abundant metabolites and, hence, for the systematic integration of data on the same quantities from different sources. We illustrate the versatility and the performance of the approach in four applications, and we show that it has several advantages over the existing normalization methods. etabolomics is a rapidly growing field in analytical biochemistry1 that combines both analytical and statistical methodologies to identify metabolites to answer various scientific questions of interest.2 Commonly used analytical platforms include nuclear magnetic resonance spectroscopy (NMR), gas chromatography/mass spectrometry (GC/MS), and liquid chromatography/mass spectrometry (LC/MS).3 A typical metabolomic data matrix consists of abundances of metabolites for a number of samples under different conditions (factors of interest), and the purpose of statistical analysis is often the identification of metabolites that are present in significantly different abundances between the factors of interest.4 Raw metabolomics data are subject to various forms of unwanted variation, which must be adequately dealt with in order to achieve the above goals.5 The unwanted variation seen in metabolomics data can occur due to both experimental and biological reasons. Unwanted experimental variation can arise from human error (e.g., sample extraction and preparation), within instrument variation (e.g., temperature changes within the instrument, sample degradation or loss of performance of the instrument during long run of samples), betweeninstrument variation, different batches, different laboratories, and different analytical platforms. Unknown biological variation (e.g., number of cells, concentration of biofluid) is also commonly encountered6 and sometimes confounded with the factors of interest, making it difficult to extract only the unwanted variation component. Another difficulty in capturing the unwanted variation is that both the biological and experimental interfering variation can be unobserved as well as observed. For example, in a long run of a large number of samples, it may not be possible to allocate a batch number to the samples, or the batch to which each sample belongs may not be known. Similarly to

M

© 2012 American Chemical Society

the biological variation, the unwanted variation may be observed, as in the weight or volume of the samples or the number of cells, and also may be unobserved, for instance, in the form of sizes of cells and concentration of biofluid. The purpose of normalization is to identify and remove both the observed and unobserved factors arising from various sources described above, and it forms an integral part of metabolomics data analysis, as the results obtained from subsequent data analysis can depend on it.7 The choice of an appropriate normalization method should consider several aspects. As well as being data- and experiment-dependent, it should also rely on the aims of the statistical analysis. In addition, the bias−variance trade-of f plays an important role, and care needs to be taken to ascertain the effectiveness of normalization. For example, even if a single internal standard perfectly describes the unwanted variation associated with the experiment, subtracting the log abundances of the single internal standard from the log abundances of each metabolite will lead to normalized values with higher variability, although the bias may be reduced. Further, the more complex normalization approaches that are able to remove more of the interfering variation than simple approaches may also remove part of the variation of interest. For these reasons, assessment of the effectiveness of a normalization method should not be based solely on the reduction of variability, such as the tightness of the replicates or smaller coefficients of variation. Similarly, an increase in the number of differentially Received: September 21, 2012 Accepted: November 14, 2012 Published: November 14, 2012 10768

dx.doi.org/10.1021/ac302748b | Anal. Chem. 2012, 84, 10768−10776

Analytical Chemistry

Article

instrumental variation has been noted by several authors in the recent literature.12−14 One approach is to choose different internal standards according to the proximity of retention times to certain metabolite classes12 and then use eq 1 to normalize within classes. Retention time, however, does not necessarily describe all chemical properties leading to unwanted variation.13 In another normalization approach, the authors select the optimal combination of multiple internal standards (NOMIS) using multiple linear regression.13 In this, they fit a linear model to log-transformed abundances of each metabolite with a design matrix consisting of columns corresponding to mean-centered internal standard log abundances and a column of ones representing an intercept term:

abundant metabolites found does not necessarily imply that a normalization approach improved the analysis. Depending on the association of the unwanted variation with the factors of interest, a good normalization approach can increase or decrease the number of differentially abundant metabolites.8 Hence, the optimal normalization method should be chosen carefully after consideration of the data, experimental design, statistical aims, and the balance of accuracy and precision ascertained through the use of auxiliary information. Several attempts have been made in the metabolomics literature over the years for handling some of the unwanted variation discussed above.9−14 Most of these methods utilize internal or external standard metabolites, which are added to each sample before and after extraction, respectively, for the normalization,11−14 and some are based on scaling factors on complete data set.9,10,15 In the next section, we discuss the existing normalization approaches and their inadequacies in handling most of the above forms of variation, and we present a versatile approach that has several benefits over existing approaches for the identification of differentially abundant metabolites. We then illustrate the versatility of the approach using four example data sets, each with a different form of variation, and we show that it performs better than several existing methods. Some concluding remarks and future research directions are presented in the last section.

(2)

yij̃ = yij − riδĵ

(3)

where μj is the mean abundance of the jth metabolite, ri (1 × nr) is an observed row vector with mean-centered log abundances of multiple internal standards, δj (nr × 1) is a column vector of coefficients, and eij are the errors. The estimated unwanted variation term is then removed to obtain normalized log abundances ỹij, which are then back-transformed to achieve normalized values on the original scale. In estimating the coefficients for the unwanted variation term δ̃ij , the model implicitly assumes that the mean-centered multiple internal standard abundances are orthogonal to the experimental factors of interest. If this assumption is violated, it is possible that the biological variation of vital interest may also be removed along with removal of the unwanted variation term. A recent paper14 argues that the unwanted variation implied by the internal standard is influenced by contamination from the rest of the metabolites, and hence the direct use of internal standards for normalizing can also remove the true signal of interest. The authors refer to this concept as cross contribution and present cross contribution compensating multiple standard normalization (CCMN). Their normalization consists of several steps. First, each metabolite is transformed to have mean 0 and variance 1 to give z-transformed log metabolite abundances y*ij . Then any direct dependence of the experimental factors on the abundances of the internal standards is removed via multiple linear regression with a design matrix consisting of observed factors of interest. This step assumes that the experimental factors of interest have direct influence on the internal standards and are independent of any unwanted variation implied by the internal standards. Any cross contribution occurring via other metabolites in this step is also ignored.14 Principal component analysis (PCA) is then performed on the residuals of the above regression model to obtain the row score vectors ti (1 × nt) for each sample, denoting the unwanted variation. Under the assumption that these score vectors are orthogonal to the factors of interest, the unwanted variation component is removed via multiple linear regression with the score vectors forming the design matrix:



METHODS Abundances of metabolites in a data matrix usually have a rightskewed distribution; hence a log transformation is often applied prior to statistical analysis.7 Let yij denote the abundance on a log scale of the jth metabolite in the ith sample for i = 1, ..., m and j = 1, ..., n, where m is the number of samples and n is the number of metabolites. In every metabolomics experiment, the yij are subject to various forms of unwanted variation as was explained in detail in the previous section, which need to be taken into account in order to extract biologically relevant information. The simplest of the normalization methods is the use of a single internal standard (SIS),11 and the normalized log abundances are obtained by subtracting the log abundance of the single internal standard from the log metabolite abundances in each sample: yij̃ = yij − yis

yij = μj + riδj + eij

(1)

where yis is the log abundance of the single internal standard. The SIS method implicitly assumes that every metabolite in a sample is subject to the same amount of unwanted variation, simply measured by a single internal standard, and is often found to be inadequate in removing unwanted variation. The variation in the abundance of an internal standard may not arise from the unwanted variation alone but can be dependent on its chemical properties.12 Other sources of variation can arise from inadequate chromatographical separation and ion suppression.13,14 As a result, it is sometimes seen in practice that multiple internal standards do not correlate strongly with each other. Hence, depending on which compound is used as the internal standard, the results obtained from subsequent statistical analysis for the identification of differentially abundant metabolites can vary. The use of multiple internal standards helps overcome some of the issues explained above, and in particular it leads to low variability of the normalized abundances.13 The importance of using more than one internal standard for measuring unwanted

yij* = tiρj + εij*

(4)

yij*̃ = yij* − tiρĵ

(5)

where ρj (nt × 1) is a column vector of coefficients, εij* are the errors, and ỹ*ij are the fitted values. The normalized metabolite 10769

dx.doi.org/10.1021/ac302748b | Anal. Chem. 2012, 84, 10768−10776

Analytical Chemistry

Article

may also be included in the model as additional covariates; however, we will show in our applications that the model is able to capture the observed variation that is treated as unobserved. Differentially abundant metabolites can be obtained by use of a certain cutoff limit on the fold changes or p-values. This model has been used for applications in several fields of research.8,16,17 We follow the most recent approach, and denote the model by the notation remove unwanted variation,8 RUV-2, where 2 stands for the two-step procedure described above. Note that we do not recommend subtracting the term ŵ iαj from eq 8, but we include ŵ i in the model with xi and refit to estimate βj and αj with all the data. Thus, we are not providing a global normalization adjustment (for example, for classification and clustering purposes).8 Instead, the model is to be used for the purpose of identifying differentially abundant metabolites as we explain in this paper. It is desirable to consider both variability and bias reduction approaches in choosing the number of unwanted factors k in the model. In every experiment, there are facts that are known beforehand, and these can be used to assess the performance of the model. For example, often results from previous literature may imply that some metabolites should be differentially abundant. The metabolites that are known not to be changing and have not been used as nonchangers in fitting the model can also be employed for this purpose, as these metabolites should have a log fold change of 0. Other examples can be the use of quality control samples such as pooled biological samples, where we know that the metabolite abundances in these samples should not be changing, and the use of standard samples in lipid profiling experiments where the metabolites have a known fold change. The consistency of results from different analytical techniques can also be used to evaluate the quality of the normalization approach. Although the approach presented here should not be used for classification or clustering, one can still use multivariate techniques such as principal component analysis (PCA) and hierarchical cluster analysis (HCA) to aid in visualizing both the biological factors of interest and unwanted factors accommodated by the model. Another useful visualization tool is the within-group or across-group relative log abundance (RLA) plots, which can be obtained by standardizing each metabolite by removing the median from each metabolite within or across groups, respectively. These scaled variables can then be visualized by use of box plots. Withingroup RLA plots can be used to assess the tightness of the replicates within groups; they should have a median of 0 and low variation around the median. Across-group RLA plots provide means of comparing the behavior of metabolites between groups. In addition, depending on the experiment, a histogram of p-values can also be useful. In the absence of any differentially abundant metabolites, an ideal histogram of p-values obtained from a fitted model should be uniformly distributed between 0 and 1.16 With the presence of some differentially abundant metabolites, the ideal histogram should be uniformly distributed but with a peak around 0.

abundances are then obtained by reversing the preprocessing steps. In addition to the above methods, two of the commonly used methods that do not use internal standards are the normalization to a total sum9 and normalization by the median10 of each sample. The total sum method scales each sample so that the sum of squares of all abundances in that sample equals 1. The median method scales each sample so that the median of the metabolite abundances in the sample equals 1: yij̃ = yij − median(yij : j = 1,..., n )

(6)

The use of the median method is found to be more practical than the sum method, especially in situations where several saturated abundances may be associated with some of the factors of interest. Here we present an approach for normalizing metabolite abundances for the identification of differentially abundant metabolites, which has several benefits over the existing approaches, which we discuss in detail in the next section. The main difference of the approach presented here is that instead of restricting only to internal or external standards, we use several, say nc, nonchanging metabolites in addition, to measure unwanted variation. These nonchanging metabolites should be present in the biological samples, exposed to the unwanted variation, and unassociated with the factors of interest. In metabolomics data, we can discover such nonchanging metabolites in various ways: The biological background to an experiment can provide insight into metabolites that are known beforehand not to change with the experimental factors. Also, we have found that metabolites that correlate with the internal standards are good candidates for nonchanging metabolites. In the absence of any reliable nonchanging metabolites in the biological samples, multiple internal or external standards can safely be used as the nonchangers. In this paper, we demonstrate each of these ideas in several applications. In addition, the contaminant metabolites obtained from running blank samples can also be candidates. As these nonchanging metabolites are unaffected by the biology of the experiment, we can safely assume that any variation seen in these metabolites is arising from unwanted interference alone. The unobserved unwanted variation component for each sample, wi, can be estimated by classical methods of factor analysis: yij = wiαj̃ + εij̃

j = 1, ..., nc

(7)

where nc is the number of nonchanging metabolites, wi (1 × k) is a row vector of unwanted factors, α̃ j (k × 1) is a column vector of coefficients, and ε̃ij is the error component. The number of unwanted factors k depends on the complexity of the unwanted variation and varies from experiment to experiment. We then fit a linear regression model to log abundances of each metabolite, using a design matrix that consists of factors of interest and unwanted variation factors estimated previously in eq 7 by use of nonchanging metabolites: yij = xi βj + wî αj + εij



DISCUSSION Scaling normalization methods9,10,15 are based on the selfaveraging property. For instance, it is assumed that an increase in the abundances of a group of metabolites is balanced by the decrease in abundances of metabolites in another group. However, in many practical applications, this property does not hold.13,14

(8)

where xi (1 × p) and ŵ i (1 × k) are row vectors indicating observed factors of interest and estimated factors of unwanted variation, and βj (p × 1) and αj (k × 1) are column vectors of coefficients corresponding to factors of interest and unwanted variation, respectively. The term εij represents the unobserved error component of the model. Observed unwanted variation 10770

dx.doi.org/10.1021/ac302748b | Anal. Chem. 2012, 84, 10768−10776

Analytical Chemistry

Article

to a total sum and normalization by the median should not be used for experiments where the self-averaging property of the metabolome is unlikely to hold, for example, in some time course experiments.

Normalization methods that are based on internal or external standards11−14 are limited in their ability to handle some of the unwanted variation. For example, the internal standards are not able to capture the unwanted variation that occurred in the process of sample preparation. In addition, unwanted biological variation, which is sometimes confounded with the factors of interest, is also missed by methods using internal or external standards. The advantages of the RUV-2 model over existing methods can be summarized as follows: First, it attempts to ensure that the biological factors of interest are not removed along with the unwanted variation. Second, the method can be applied to situations where multiple internal standards are not available. Third, it also allows for the possibility that any unwanted biological variation is also accommodated in the model. Fourth, it allows for the systematic integration of data from different sources on the same quantities. Finally, it deals with both observed and unobserved unwanted variation. The RUV-2 model requires that the data contain at least some metabolites that are not changing across the factors of interest. This is almost certainly true for any metabolomics experiment designed for the purpose of identifying differentially abundant metabolites. In the previous section, we discussed several ways of discovering these nonchanging metabolites. Depending on the data, care needs be taken in employing these methods. For example, the use of multiple internal standards or any contaminants in the blank samples as the only nonchanging metabolites will at best adjust for unwanted technical variation. For any unwanted biological variation to be accommodated, it is essential to have nonchanging metabolites that are present in the biological samples and are exposed to the same technical and other unwanted variation. In some experiments, it may happen that the technical variation is more prominent than the unwanted biological variation. In such situations, the use of metabolites that correlate highly with the internal standards as our only nonchanging metabolites may not provide sufficient information to correct for the unwanted biological variation. We may also come across experiments where background information on nonchanging metabolites is not available. A strategy for dealing with such experimental data is to fit the RUV-2 model in an iterative procedure. For example, we can use metabolites that highly correlate with the internal standards as an initial set of nonchanging metabolites to fit the RUV-2 model and rank the metabolites by use of adjusted p-values to identify a further set of nonchanging metabolites. Then the refined set of nonchanging metabolites can be used to refit the RUV-2 model. We must also carefully choose the number of unwanted variation factors k in the RUV-2 model. To do so, we presented several approaches in the previous section. Such approaches for selecting the number of unwanted variation factors provide the advantage of having more control over the experimental data compared to automated procedures for selecting model complexity such as the cross-validation method implemented in the CCMN model.14,18 Furthermore, we emphasis the fact that the choice of an appropriate normalization method must depend on the purpose of the experiment as well as the aim of the statistical analysis. For example, the normalized abundances obtained from the CCMN and RUV-2 methods, which use the biological factors of interest in the normalization approach, should not be used for the purpose of clustering or classification methods, where the factors of interest must be treated as unknown. Normalization



APPLICATIONS In the following applications, we demonstrate the usefulness of the RUV-2 approach by applying it to four different data sets with unwanted variation arising from different sources, namely, unwanted batch variation, platform variation, laboratory variation, and biological variation. In each application, estimates of log fold change are obtained from eq 8. When required for illustration purposes, RUV normalized values were obtained by subtracting the estimated unwanted variation component ŵ α̂ from eq 8. We assess the performance of the method by comparing it with relevant existing methods. Unwanted Batch Variation. Redestig et al.14 carried out a controlled GC/MS experiment to assess the performance of the metabolomics normalization approaches. The data consist of three different metabolite mixtures made up of 44 compounds. The first mixture had 12 replicates, and the second and the third had 15 replicates each. Each compound had a different concentration level in each mixture, either 0.05, 0.15, or 0.25 nmol/μL. A total of 42 samples were subjected to GC/ time-of-flight (TOF) MS in three batches with long time intervals in between, to introduce a strong component of unwanted variation. Nine samples were run in the first and the second batches, and 24 samples in the third batch. The concentration, composition, and other information regarding the data are given in the Supporting Information of Redestig et al.14 Only 35 metabolites out of the 44 compounds have been identified, and in addition, 11 isotope-labeled internal standards were run with each sample. Using this data, Redestig et al.14 showed that the CCMN and NOMIS methods outperform several popular metabolomics normalization approaches, including single internal standard normalization, retention index normalization, normalization to the sum of the chromatogram, and normalization by the median. In this application, we compared the performance of the best-performing CCMN and NOMIS methods with the RUV-2 approach. We fitted eq 8 to the data, using the 11 internal standards as the nonchanging metabolites (nc = 11), and we chose the value k = 7, by visual inspection of the biological and unwanted variation components of the model, using within-group RLA and PCA plots of the RUV-2 normalized data, as shown in Figures 4−6 in the Supporting Information. To obtain the within-group RLA plots, we standardized the normalized metabolite abundances by removing the median of the groups containing the replicate structure from each metabolite. The box plots of the samples of these standardized metabolite abundances should have a median close to 0, with low variation around the median and no visible variation between the replicate samples. For fitting both the CCMN and NOMIS methods, we used the crmn package for R.18 The design matrix used for both the CCMN and RUV-2 methods consisted of the replicate structure of the samples, and no information about the different batches or known concentration levels was used for normalization, treating this variation as unobserved. Figure 1a shows the first two principal components of the data before and after the different normalizations. The results for CCMN and NOMIS methods replicate those found in Redestig et al.14 Inspection of the PCA plots indicates that the 10771

dx.doi.org/10.1021/ac302748b | Anal. Chem. 2012, 84, 10768−10776

Analytical Chemistry

Article

Figure 1. Unwanted batch variation example showing (a) PCA plots of the data before (NONE) and after normalizations,(b) Within group RLA plots before and after normalizations, and (c) Box plots of the slopes obtained by regressing normalized log abundances of each metabolite on the log concentrations. The true value of these slopes should be one. Colors in the plots indicate different batches, and the characters indicate the concentration mixtures.

complexity in the CCMN method, we used the cross-validation procedure described by Redestig et al.14,18 However, these results may have been improved by employing strategies similar to those we have presented here for selecting the complexity of the model. For example, the use of a larger number of principal components in the CCMN model leads to PCA visualization that is comparable to those obtained by RUV-2, as shown in Figure 7 in the Supporting Information. Unwanted Platform Variation. A batch of fetal calf serum was spiked with a mix of 12 metabolites at concentration levels 0, 3, 6, 12.5, 25, and 50 (× 0.5, 2.5, or 5 nmol/10 μL), and three replicates were made at each concentration level. The samples were extracted by the same method in a controlled experiment, and the metabolites were detected by LC/MS and GC/MS (method is described by Temmerman et al.7) The full data set has 43 metabolites altogether: 24 measured by GC/MS only, 5 by LC/MS only, and 14 by both, including 12 changing and 31 unchanging. The baseline was removed by use of a nonlinear regression model fitted to each metabolite to focus on the trends due to the changing concentrations. The first row of Figure 2 shows some of the changing metabolites, indicating that there is a strong correlation between the LC/MS and GC/ MS abundances. The two sets of data need to be integrated for

unwanted variation is clearly evident in the raw data, mainly in the form of different batches. Plots of the principal component scores where the unwanted variation is removed via CCMN and NOMIS methods indicate that the batch effects have not been completely removed. The CCMN method shows clear clustering due to the first batch, while the NOMIS method shows strong variation. In contrast, the PCA plot for the data where the unwanted variation is removed by RUV-2 shows clustering by concentrations. Suppression of the unwanted batch variability by RUV model is also seen by the within-group RLA plots of the normalized data in Figure 1b, which also shows that batch variability is still visible in the CCMN and NOMIS normalized data. Since none of these normalization methods used the known concentration levels, the performance of the methods can be assessed by use of this information on the normalized data. One way of doing this is by regressing normalized log abundances of each metabolite on the log concentrations before and after normalization. As all the metabolites are changing with the concentrations, the true value of the slope obtained from each linear fit should equal 1. Among the normalization methods, the slope estimates obtained from the RUV-2 are the closest to the true value of 1, as seen in Figure 1c. In choosing the 10772

dx.doi.org/10.1021/ac302748b | Anal. Chem. 2012, 84, 10768−10776

Analytical Chemistry

Article

Figure 2. Unwanted platform variation example showing scatter plots of some of the changing metabolites in the first row, within-group RLA plots of the normalized data in the second row, HCA plots of the normalized data in the third row, and box plots of the slopes for changing and nonchanging metabolites in the fourth row. Colors indicate the platforms and the characters indicate the concentration levels.

plots of the RUV-2 normalized data. The platform variability was treated as unobserved. HCA plots for different values of k are shown in Figure 8 in the Supporting Information. The same linear model, excluding the unwanted variation component, was fitted to the SIS normalized data for comparison. The within-group RLA plots were obtained by standardizing each metabolite by removing the median of the groups containing only the concentration levels. For each concentration level, the sample-wise box plots of these standardized metabolite abundances should have median close to 0, low variation, and no variation due to platforms. The within-group RLA plots in the second row of Figure 2 show that the platform

the optimum identification of differentially abundant metabolites. For each platform, only one internal standard was available. In fitting the RUV-2 model, we did not use prior knowledge of nonchanging metabolites. Instead, we chose nc = 7 metabolites that had a moderate correlation of greater than 0.4 with the single internal standard in both platforms as the nonchanging metabolites, and these metabolites were excluded when the method was assessed. We fitted eq 8 to the log-transformed metabolite abundances with log concentrations as factors of interest and an intercept term. The number of unwanted variation factors was chosen as k = 3, using HCA and within-group RLA 10773

dx.doi.org/10.1021/ac302748b | Anal. Chem. 2012, 84, 10768−10776

Analytical Chemistry

Article

axenic amastigote stage (AA). Each of these stages has a different cell morphology and cell volume. Metabolites were harvested as previously described19 and analyzed by GC/MS,7 LC/triple-quadrupole (QQQ) MS,20 and tandem LC/QTOFMS. Six biological replicates of each parasite stage were prepared. The purpose of the statistical analysis was to find metabolites that are differentially abundant between the three stages. Unwanted biological variation in the form of differing cell sizes was one of the major issues with these data. Cells in the MLP and SP stages are larger than those in the AA stage. If the data are not appropriately normalized to accommodate the cell size variation, the selection of differentially abundant metabolites will be based on this confounded unwanted biological variation. None of the existing normalization methods based on internal standards was able to handle unwanted biological variation, as was explained in the previous section. Therefore, for these data, RUV-2 normalization was compared with median normalization. For each platform, the metabolites that had a correlation of greater than 0.7 with the internal standards and/or nonchanging metabolites in the biological samples known a priori were used to fit the RUV model. The number of unwanted variation factors was selected by visual inspection of the components in the RUV-2 model; in particular, we used within-group RLA plots. The replicate samples should behave similarly, with the box plots showing zero median and low variation around the median. For each analytical platform, the within-group RLA plots before and after normalizations are shown in Figure 10 in Supporting Information, showing that the RUV-2 normalization has reduced the variability within the replicate samples. Both the number of unwanted factors and the nonchanging metabolites differed from platform to platform; these are given in Table 2.

variation is still visible in the SIS normalized data. In the third row of Figure 2 the HCA plots show clear clustering due to biological variation in the RUV-2 normalized data, while the SIS normalized data cluster according to platform variation. The fourth row of Figure 2 shows box plots of the slopes obtained from fitted regression models, which should have a true value of 1 for changing metabolites and 0 for nonchanging metabolites. The medians of the box plots for RUV-2 normalization are closer to the true values. Unwanted Lab Variation. Eight replicates of a quality control (QC) mix and another eight replicates of the same mix with some metabolites in increased amounts (QC SPIKED) were run in three different Metabolomics Australia nodes (AWRI, Botany, and Bio21) at three different temperature settings (7, 15, and 25 °C) on four different GC-MS instruments. The data matrix consisted of 33 metabolites observed across all nodes and temperature settings and 185 samples, after exclusion of some samples that had not been run properly. In the QC SPIKED mix, 11 metabolites were in 3-fold amounts and one was in 5-fold amount relative to the QC mix, and the remaining metabolites were unchanged. For each data set, only one internal standard was available, and that was used to give the SIS normalization. Without using the known prior knowledge of the nonchanging metabolites, we chose nc = 9 metabolites that have a correlation of greater than 0.9 with the internal standards as nonchangers, and we selected the value of k = 6 using withingroup RLA and PCA plots of the RUV-2 normalized data. The nine nonchanging metabolites used in the model were excluded when the effectiveness of the normalization methods was assessed. The components of the fitted RUV-2 model are illustrated by PCA plots in Figure 9 in the Supporting Information. Table 1 contains two measures that can be used to compare the true fold changes with the estimated fold changes, namely,

Table 2. Unwanted Biological Variation Example: Number of Metabolites, Internal Standards, and RUV-2 Model Selected for Each Analytical Platform

Table 1. Unwanted Lab Variation Example: RMSE and MAD Values Obtained for Three Normalization Methods for Cross-Node Dataa MAD

GC/MS LC/amine LC/tandem

RMSE

metabolites

none

IS

RUV-2

none

IS

RUV-2

nonchanging 3-fold 5-fold

15 29 41

7 25 33

6 18 28

3 25 8

2 25 5

1 23 4

no. of metabolitess

no. of IS

nc

k

79 31 116

2 1 4

10 8 12

4 3 5

The performance of RUV-2 and median normalization methods was assessed by two strategies. First, we used prior biological knowledge concerning some metabolites. For example, the abundances of the metabolites in the glycolysis pathway should decrease from the MLP to the SP stage. However, in the raw data, an increase in these metabolite abundances is seen, reflecting the unwanted cell size variation. Figure 3 show these metabolites before and after normalizations for the GC/MS data. It is seen that the mediannormalized data are not corrected for this cell size variation, while in the RUV-2 normalized data, the expected biological trend can be seen. Second, we used the consistency of results obtained from individual analyses of the data from the three different platforms. A good normalization method should lead to consistent identification of differentially abundant metabolites from the three independent studies. For each study, pairwise comparisons were made by use of a contrast matrix for each fitted regression model, and the differentially abundant metabolites were identified by use of a BH-adjusted21 p-value cutoff of 0.05. For the two normalization methods, the number

a

The values are in hundreds, and the smallest value in each set of three is shown in boldface type.

root-mean-square error (RMSE) and median absolute deviation (MAD). The RMSE gives the average of the squared differences between true fold changes and estimated fold changes obtained before and after normalization, while the MAD gives the median of the absolute deviations of estimated fold changes from true fold changes. A suitable normalization procedure should generate low RMSE and MAD values, and the results indicate that RUV-2 has the lowest RMSE and MAD values for all changing and nonchanging metabolites. Unwanted Biological Variation. We investigated metabolite levels in three different developmental stages of Leishmania mexicana, a sandfly-transmitted parasitic protozoan that causes a spectrum of diseases in humans. The three stages comprised promastigote stages harvested in mid-logarithmic (MLP) and stationary (SP) growth phases and the cultured 10774

dx.doi.org/10.1021/ac302748b | Anal. Chem. 2012, 84, 10768−10776

Analytical Chemistry

Article

Figure 3. Unwanted biological variation example: Log abundances for the glycolysis pathway metabolites in the GC data before (NONE) and after (Median, RUV) normalizations.



of metabolites found to be giving consistent results across all platforms was expressed as a percentage of the total metabolites observed in more than one platform, and the results are shown in Table 3. It is seen that the RUV-2 model led to the most consistent results for all comparisons.

*E-mail [email protected]. Notes

The authors declare no competing financial interest.



Table 3. Unwanted Biological Variation Example: Number of Metabolites Giving Consistent Results Across All Platformsa method

MLP-SP (%)

MLP-AA (%)

SP-AA (%)

median RUV-2

34 65

37 56

41 56

REFERENCES

(1) Dettmer, K.; Hammock, B. D. Environ. Health Perspect. 2004, 112, A396. (2) Roessner, U.; Bowne, J. BioTechniques 2009, 46, 363−365. (3) Roessner, U.; Beckles, D. M. In Plant Metabolic Networks; Schwender, J., Ed.; Springer: New York: New York, 2009. (4) Roessner, U.; Nahid, A.; Chapman, B.; Hunter, A.; Bellgard, M. Comprehensive Biotechnology, 2nd ed.; Elsevier: Amsterdam, 2011; Vol. 1; pp 447−460. (5) Steuer, R.; Morgenthal, K.; Weckwerth, W.; Selbig, J. Methods Mol. Biol. 2007, 358, 105−126. (6) Craig, A.; Cloarec, O.; Holmes, E.; Nicholson, J. K.; Lindon, J. C. Anal. Chem. 2006, 78, 2262−2267. (7) Temmerman, L.; De Livera, A. M.; Bowne, J.; Sheedy, R. J.; Callahan, D. L.; Nahid, A.; De Souza, D. P.; Schoofs, L.; Tull, D. L.; McConville, M. J.; Roessner, U.; Wentworth, J. M. Metabolomics: Diabetics. J. Diabetes Metab. 2013 (upcoming special issue). (8) Gagnon-Bartsch, J. A.; Speed, T. P. Biostatistics 2012, 13, 539− 552. (9) Crawford, L. R.; Morrison, J. D. Anal. Chem. 1968, 40, 1464− 1469. (10) Wang, W.; Zhou, H.; Lin, H.; Roy, S.; Shaler, T. A.; Hill, L. R.; Norton, S.; Kumar, P.; Anderle, M.; Becker, C. H. Anal. Chem. 2003, 75, 4818−4826. (11) Gullberg, J.; Jonsson, P.; Nordström, A.; Sjöström, M.; Moritz, T. Anal. Biochem. 2004, 331, 283−295. (12) Bijlsma, S.; Bobeldijk, I.; Verheij, E. R.; Ramaker, R.; Kochhar, S.; Macdonald, I. A.; van Ommen, B.; Smilde, A. K. Anal. Chem. 2006, 78, 567−574. (13) Sysi-Aho, M.; Katajamaa, M.; Yetukuri, L.; Oresic, M. BMC Bioinf. 2007, 8, 93. (14) Redestig, H.; Fukushima, A.; Stenlund, H.; Moritz, T.; Arita, M.; Saito, K.; Kusano, M. Anal. Chem. 2009, 81, 7974−7980. (15) Scholz, M.; Gatzek, S.; Sterling, A.; Fiehn, O.; Selbig, J. Bioinformatics (Oxford, U.K.) 2004, 20, 2447−54. (16) Leek, J. T.; Storey, J. D. PLoS Genet. 2007, 3, 1724−1735. (17) Stegle, O.; Kannan, A.; Durbin, R.; Winn, J. Proc. 12th Annu. Int. Conf. Res. Comput. Mol. Biol. 2008, 411−422. (18) Redestig, H. Package crmn for R, 2012.

a

Expressed as a percentage of the total metabolites observed in more than one platform.



CONCLUSIONS We present an approach for normalizing metabolomics data that has several benefits over existing normalization methods. Some of the features of the approach are (i) it attempts to ensure that the biological variation is not removed in the normalizing step, (ii) the method can be applied to situations where internal standards may not be available, (iii) both unobserved and observed variation can be accommodated, and (iv) it allows for the systematic integration of data from different sources on the same quantities. The method was applied to four different data sets and was shown to provide better performance than the existing methods. The approach can be adapted to suit other metabolomics applications, such as for signal correction by use of quality control samples,22 obtaining concentration levels from standard samples in lipid profiling experiments, and combining GC/MS data sets with different derivatization chemistries for the optimal identification of differentially abundant metabolites. The R software for the methods used in this paper will be available in the metabolomics package for R.23



AUTHOR INFORMATION

Corresponding Author

ASSOCIATED CONTENT

S Supporting Information *

Seven additional figures as described in the text. This material is available free of charge via the Internet at http://pubs.acs.org. 10775

dx.doi.org/10.1021/ac302748b | Anal. Chem. 2012, 84, 10768−10776

Analytical Chemistry

Article

(19) Saunders, E. C.; De Souza, D. P.; Naderer, T.; Sernee, M. F.; Ralton, J. E.; Doyle, M. A.; Macrae, J.; Chambers, J.; Heng, J.; Nahid, A.; Likic, V.; Mcconville, M. J. Parasitology 2010, 137, 1303−1313. (20) Boughton, B.; Callahan, D.; Silva, C.; Bowne, J.; Nahid, A.; Rupasinghe, T.; Tull, D.; McConville, M.; Bacic, A.; Roessner, U. Anal. Chem. 2011, 83, 7523−7530. (21) Benjamini, Y.; Hochberg, Y. J. R. Statist. Soc., Ser. B 1995, 57, 289−300. (22) Dunn, W. B.; Broadhurst, D.; Begley, P.; Zelena, E.; FrancisMcIntyre, S.; Anderson, N.; Brown, M.; Knowles, J. D.; Halsall, A.; Haselden, J. N.; Nicholls, A. W.; Wilson, I. D.; Kell, D. B.; Goodacre, R. Nat. Protoc. 2011, 6, 1060−1083. (23) De Livera, A. M.; Bowne, J. Package metabolomics for R, 2012.

10776

dx.doi.org/10.1021/ac302748b | Anal. Chem. 2012, 84, 10768−10776