Strategy for Optimizing LC-MS Data Processing in Metabolomics: A Design of Experiments Approach

Mattias Eliasson,† Stefan Rännar,† Rasmus Madsen,† Magdalena A. Donten,†,‡ Emma Marsden-Edwards,§ Thomas Moritz,⊥ John P. Shockcor,§ Erik Johansson,¶ and Johan Trygg*,†

†Computational Life Science Cluster (CLiC), Department of Chemistry, Umeå University, SE-901 87 Umeå, Sweden
‡AcureOmics AB, Tvistevägen 48, 907 36 Umeå, Sweden
§Waters Corporation, Milford, Massachusetts 01757, United States
⊥Umeå Plant Science Centre, Department of Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences, SE-901 83 Umeå, Sweden
¶Umetrics AB, Box 7960, SE-907 19 Umeå, Sweden


ABSTRACT: A strategy for optimizing LC-MS metabolomics data processing is proposed. We applied this strategy to the XCMS open-source package, written in R, on both human and plant biology data. The strategy is a sequential design of experiments (DoE) based on a dilution series from a pooled sample and a measure of correlation between diluted concentrations and integrated peak areas. The reliability index metric, used to define peak quality, simultaneously favors reliable peaks and disfavors unreliable peaks using a weighted ratio between peaks with high and low response linearity. In the case studies, DoE optimization resulted in more than a 57% improvement in the reliability index compared with the default settings. The proposed strategy can be applied to any other data-processing software involving parameters to be tuned, e.g., MZmine 2. It can also be fully automated and used as a module in a complete metabolomics data-processing pipeline.

Studies in the field of metabolomics involve several steps in the progression from a hypothesis to a biological interpretation.1−3 These steps may include experimental planning, sampling, storage and pretreatment of samples, instrumental analysis, data processing,4 and multivariate statistical modeling, validation, and interpretation. The end result is highly dependent on how well each step in this chain of events is conducted, and its quality is strongly dependent on the weakest link. To be able to extract interpretable, reliable, and reproducible information, standardized protocols for most of these steps have been proposed.5 However, one stage that is often overlooked, and therefore left to be optimized solely on the basis of user experience in a trial-and-error fashion or simply run with the default settings, is the data-processing step. The quality of the results at this stage is often assessed by the quantity of detected peaks, without regard to the quality of individual peaks or the proportion of noisy peaks and other artifacts, which may in fact be completely unrelated to the actual samples or the underlying hypothesis. In untargeted ’omics analysis, the objective is to find as many potential biomarkers associated with the underlying hypothesis as possible, with relatively little a priori information at hand. Given this ambitious objective, having well-developed procedures in each step becomes essential. In the data-processing step, the task of optimizing the parameter settings is difficult because there is no easy and accurate way of assessing the



quality of an integrated peak without going through extensive statistical testing and investigating the variables from a biological perspective. This option is often not viable, since it requires identification and, to some extent, quantification, taking up both time and resources that are often not available at this stage. For processing metabolomics data, a wide variety of software is available, both commercial and open source.6 The data-processing pipeline in many of these software packages consists of several stages, from filtration and feature detection to alignment procedures and normalization. In each stage, different methods may be available that, in turn, have several parameters that can be varied in either a continuous or a discrete fashion. The quality of the processed data depends on the parameter settings. The number of peaks obtained from a certain set of parameters can range from a couple of hundred to several thousand. The key issue to realize is that more peaks do not necessarily mean better results, unless one can easily distinguish reliable peaks that correspond to real phenotype differences in the sampled species from unreliable, noisy artifacts and other unidentifiable peaks. Different parameter settings will not only give a different number of peaks but also the peaks themselves


that it enables the detection of interaction effects between parameters from different subprocedures, which in turn enhances the understanding of the software and may reveal more suitable settings than optimizing the subprocedures sequentially. The disadvantages are that simultaneous optimization can be very time-consuming, depending on the complexity of the software being optimized, and is neither easily extended nor scalable. For example, a global design with all parameters incorporated required 33 days of computing time. For these reasons, the preferred approach was to optimize each of the three XCMS subprocedures sequentially.

Protocol for Optimizing LC-MS Data Processing. A representative sample is required in our proposed DoE optimization approach. Using a pooled sample ensures that all metabolites are present during the optimization procedure. By diluting this pooled sample, we create a concentration vector that can be used to separate reliable peaks from unreliable ones, since we expect the relationship between peak area and sample concentration to be close to linear within the chosen concentration range. The optimization protocol used in this investigation is as follows:
(1) A small volume of each of the individual urine samples was mixed into a pooled sample.
(2) The pooled sample was split into two dilution series.
(3) All diluted samples were analyzed using UPLC-MS, generating raw data in a three-dimensional array (sample, retention time, m/z).
(4) The raw data were processed with XCMS using different parameter settings around the default parameter settings for subprocedure Ia (filter and identify peaks), within predetermined limits; this step is denoted DoE Ia:1. Each software setting generated a two-dimensional table (sample, peak area) of integrated peak areas. The settings for subprocedure II (match peaks across samples) and subprocedure III (retention-time correction) were kept at their default values throughout the optimization of this subprocedure.
(5) For each software setting and each peak, the squared correlation, r2, between peak area and the concentration vector was calculated.
(6) For each software setting, the number of reliable peaks, i.e., those with high correlations (r2 ≥ 0.9), and the number of unreliable peaks, i.e., those with low correlations (r2 ≤ 0.05), were counted, generating two response values for each setting.
(7) The quality of each processed data set and the corresponding parameter settings was then quantified by calculating the reliability index from the numbers of reliable and unreliable peaks according to the formula given in eq 1.

will be different. Their mass, retention time, and area will differ to some extent, making it difficult, or at least time-consuming, to compare and rank the quality of sets of peaks resulting from different parameter settings.

Design of experiments (DoE) is a tool commonly used to systematically induce controlled variation so that variable effects and interactions can be detected and separated from noise using statistical methods such as multiple linear regression (MLR) and partial least-squares (PLS).7−9 The main advantage of DoE is its ability to generate the most useful information while keeping the number of experiments low. DoE has previously been used to optimize procedures for collecting and storing cerebrospinal fluid (CSF) samples10 and in the sample preparation step to investigate how different extraction solvents, derivatization reagents, and physical conditions affect the extraction and derivatization of plant metabolites,11 human plasma,12 and mammalian cells.13 In the instrumental setup step, DoE has been used to investigate how instrumental parameters affect analyte response on electrospray ionization (ESI)-MS14 and LC-MS15 platforms.

In this study, a strategy for optimizing UPLC-MS data-processing parameters is proposed. This strategy is a sequential DoE approach aimed at evaluating and optimizing data-processing parameters. It is based on a dilution series from a pooled sample and a measure of correlation between diluted concentrations and integrated peak areas. The concept presented in this paper is not limited to any specific software. However, to facilitate running many data-processing runs, each with a different setting, software capable of automating this process is preferred. For this purpose, the software XCMS16 was chosen. XCMS is freely available under an open-source license and written in the R statistical language.17



METHOD

Design of Experiments in Optimizing XCMS Parameters. In this investigation, the main objective was to find the optimal data-processing parameter settings for the three selected XCMS processing subprocedures shown in Table 1.

Table 1. XCMS Subprocedures and Respective Parameters Selected for Optimization
subprocedure Ia, filter and identify peaks (matchedFilter): fwhm, max, snthr, step, steps, mzdiff, profmethod
subprocedure Ib, filter and identify peaks (centWave)18: ppm, peakwidth, snthresh, prefilter, mzCenterFun, integrate, mzdiff, fitgauss
subprocedure II, match peaks across samples: bw, mzwid, max
subprocedure III, retention-time correction: method, span, family
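For orientation, the sketch below shows roughly how the Table 1 parameters map onto calls in the xcms R package of that era (xcmsSet, group, retcor, groupval). The file vector name and the parameter values are placeholders around the defaults, not the optimized settings, and exact argument names should be checked against the xcms version in use.

```r
library(xcms)

# Subprocedure Ia: filter and identify peaks with matchedFilter
# (placeholder values near the defaults; the DoE varies these)
xset <- xcmsSet(files = dilution_series_files,   # assumed vector of raw-file paths
                method = "matchedFilter",
                fwhm = 30, max = 5, snthresh = 10,
                step = 0.1, steps = 2, mzdiff = 0.8,
                profmethod = "bin")

# Subprocedure II: match peaks across samples (defaults while Ia is being tuned)
xset <- group(xset, bw = 30, mzwid = 0.25, max = 50)

# Subprocedure III: retention-time correction (peakgroups: method, span, family)
xset <- retcor(xset, method = "peakgroups", span = 0.2, family = "gaussian")
xset <- group(xset, bw = 30, mzwid = 0.25, max = 50)   # regroup after alignment

# Integrated peak areas: one row per peak group, one column per sample
peak_areas <- groupval(xset, value = "into")
```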

reliability index = (number of reliable peaks)² / (number of unreliable peaks)  (1)
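Equation 1 can be computed directly from the table of integrated peak areas and the concentration vector of the dilution series. The following is a minimal sketch; the variable names (peak_areas, conc) and the guard against division by zero are assumptions, not part of the published scripts.

```r
# peak_areas: matrix of integrated areas, one row per peak, one column per diluted sample
# conc: relative concentration of each diluted sample (same order as the columns)
reliability_index <- function(peak_areas, conc,
                              r2_reliable = 0.9, r2_unreliable = 0.05) {
  # squared correlation between area and concentration for every peak (step 5)
  r2 <- apply(peak_areas, 1, function(area) cor(area, conc)^2)
  r2[is.na(r2)] <- 0                        # constant peaks count as unreliable
  n_reliable   <- sum(r2 >= r2_reliable)    # step 6: reliable peaks
  n_unreliable <- sum(r2 <= r2_unreliable)  # step 6: unreliable peaks
  n_reliable^2 / max(n_unreliable, 1)       # eq 1 (assumed guard against division by zero)
}
```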

The best software settings were then found by optimizing the reliability index.
(8) A new DoE (DoE Ia:2) was then constructed on the basis of the optimization results from step 7, and steps 4−7 were repeated using this new, adjusted DoE. The DoE loop was repeated until the reliability index did not improve further, resulting in an optimized setting for the peak-identification algorithm. If there is more than one subprocedure in the software, as in the present case, steps 4−7 are repeated for each subprocedure, using the parameter combination with the highest reliability index for the already optimized subprocedures.
(9) After the optimization of subprocedure Ia (filter and identify peaks) was completed, the parameter combination yielding the highest reliability index was selected, and subprocedure II (match peaks across samples) was subjected to the DoE loop, steps 4−7.
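Schematically, the DoE loop of steps 4−8 for one subprocedure can be organized as below. This is a simplified outline: make_design, run_xcms, fit_and_optimize, and adjust_limits are hypothetical placeholders for the design generation, XCMS processing, MLR/optimizer, and limit-adjustment steps described in the text, and the "close to the center point" stopping criterion is reduced here to "no further improvement".

```r
# Hypothetical outline of the sequential DoE optimization of one subprocedure.
optimize_subprocedure <- function(limits, fixed_settings, files, conc) {
  best <- list(index = -Inf, settings = NULL)
  repeat {
    design <- make_design(limits)                  # list of parameter settings, one per run
    scores <- sapply(design, function(setting) {
      areas <- run_xcms(files, c(setting, fixed_settings))  # step 4: process raw data
      reliability_index(areas, conc)                         # steps 5-7: score the run
    })
    opt <- fit_and_optimize(design, scores)        # MLR model + optimizer (e.g., MODDE)
    if (opt$index <= best$index) break             # stop when the index no longer improves
    best   <- opt
    limits <- adjust_limits(limits, opt$settings)  # step 8: recentre the design region
  }
  best$settings
}
```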


In total, this involved optimizing ten quantitative parameters, one qualitative four-level parameter, and two qualitative two-level parameters. With this many parameters, from separate procedures, there are many ways to apply the DoE methodology. One could choose to optimize the parameter settings by focusing on each subprocedure separately in a sequential fashion or to incorporate all parameters simultaneously in a larger design. Compared with sequential optimization, a design incorporating all parameters has its advantages. One advantage is


Figure 1. Strategy overview. The pooled sample is split and diluted into different concentrations. The samples in the dilution series are analyzed and then processed in XCMS according to the DoE. From the resulting data sets, one for each design point, the peaks are binned according to the correlation between peak area and the dilution factor of the samples in the dilution series. The correlation values for each data set are then condensed into a single quality measure, the reliability index. The DoE parameters are then optimized with respect to maximizing the reliability index, after which the DoE factor limits are adjusted in the direction of the indicated optimum. The optimization procedure is then rerun until the reliability index stops increasing.

During subprocedure III (retention-time correction), the parameter settings were kept at their defaults.
(10) The DoE loop, steps 4−7, was then repeated for subprocedure III.
Once the optimization has been completed, parameter settings yielding an optimized reliability index have been identified. The data from the individual samples are then processed with these optimized software settings. The optimization procedure is visualized in Figure 1.

Optimization Designs and Reliability Index. An optimization design including all parameters in subprocedure I was created.8 The choice of response in the parameter optimization will influence the results. Simply counting the detected peaks and driving the parameter settings toward this number will inevitably produce many false positives and add noise variables to the processed data. It is therefore important to include peak quality as a criterion in the optimization. In this study, we suggest using a dilution series of a representative sample (the pooled sample) and utilizing the concentration vector in the assessment of peak quality. All peaks that show a good linear correlation with the concentration vector are classified as reliable and could be potential biomarkers. The limit for good correlation selected here was an r2 between replicate peak areas and concentrations over the dilution series greater than 0.9. However, it was found that settings giving a high number of reliable peaks simultaneously gave a high number of noisy peaks, and this needed to be addressed. Noisy, or unreliable, peaks were defined as those having a negligible correlation (r2 ≤ 0.05) between area and concentration. By maximizing the ratio between the number of reliable and unreliable peaks, we found a region in the parameter space with a fairly large number of reliable peaks and a minimum of unreliable peaks and artifacts. The performance of several different ratios was investigated; the selected ratio gave a high total number of peaks compared with the other investigated options, and multivariate analysis is known to handle a certain amount of noise variables. The selected ratio was calculated according to the reliability index formula (eq 1).

Multiple linear regression (MLR) is used to determine the optimal parameter values with regard to the response (reliability index). This can be done manually or, as in this study, with an automated optimizer. The optimizer algorithm employed here is the one implemented in the software MODDE v9 (MKS Umetrics, www.umetrics.com). The output from the optimizer is the parameter-setting combination yielding the best response value (reliability index) for that subprocedure.

Updating the Optimization Design. The optimization design factor limits need to be adjusted if the optimum is indicated to lie outside the design region. The adjustment is made by taking the optimal settings indicated by the optimizer and determining, parameter by parameter, whether an optimum has been found within the predefined parameter limits. For each parameter, the step length is defined as the difference between the upper and lower parameter limits used in the design. The distance from the center point to the optimized parameter value, as a percentage of the step length, determines whether the parameter limits should be adjusted and a further optimization round performed. If every optimized parameter value is close to its center point, the parameters are fixed at the optimized values and one moves on to the optimization of the next subprocedure. A parameter is deemed to be close to its center point if it is less than 25% of its step length away from the center point. However, if the optimized value for one or more parameters is not close to the center point, a revised design is deemed necessary. The revised design uses the same design structure but new limits for all, or some of, the parameters. The parameter limits are adjusted according to how far the optimized value is from the center point. If the optimized value is within 1% of the step length from the lower or upper limit, then the limits are decreased or increased, respectively, by 25% of the step length. If the optimized value is more than 25% of the step length below the center point and more than 1% of the step length above the lower limit, then the


design, running the optimizer, and evaluating the optimal setting is repeated until the optimized parameters are all close to the design center point. The procedure is also terminated if the optimal reliability index of the adjusted design is lower than that of the previous design. When this happens, the settings from the design with the best reliability index are fixed, and the same type of optimization procedure is performed on the next subprocedure of the processing software.

Experimental Designs in the XCMS Investigation. DoE Ia: Filter and Identify Peaks (matchedFilter). In the XCMS application, subprocedure Ia (filter and identify peaks) has six quantitative factors and one qualitative factor with four settings. The parameters of this feature detection algorithm are mainly related to peak width, signal-to-noise ratio cutoff, and profile generation. They were all varied around their default values, which formed the center point. For optimization, at least three levels of every quantitative factor need to be investigated. This was accomplished using a reduced, balanced optimization design in 84 runs, including repetitions for selected settings. The complete design used for subprocedure I is visualized in Figure 3.

DoE Ib: Filter and Identify Peaks (centWave). Subprocedure Ib has eight parameters, of which three are qualitative: one with four levels and two with two levels. Of the remaining quantitative parameters, two have two settings each. We utilized a CCF8 optimization design in 146 runs.

DoE II: Match Peaks Across Samples. Subprocedure II was based on the grouping algorithm density. The design used for this step was a CCF design with three parameters, resulting in 16 experiments in total, including two center points.

DoE III: Retention Time Correction. In subprocedure III, the alignment algorithm peakgroups was used, which has two

limits are decreased by 10% of the step length. If the optimized value is more than 25% of the step length above the center point and more than 1% of the step length below the upper limit, then the limits are increased by 10% of the step length. For all parameters for which the optimized value is within 25% of the step length from the center point, the limits are unchanged in the adjusted design. Values for qualitative parameters are not adjusted but are repeated in the adjusted design. For discrete, quantitative factors, if the optimized value is found to be at the upper or lower limit, then the limits are increased or decreased by one full step, respectively. The adjustment of quantitative factor limits is illustrated in Figure 2 for a simple, two-factor design.
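These limit-adjustment rules translate directly into code. The sketch below handles a single quantitative parameter; the function name and structure are assumptions, but the 25%, 1%, and 10% thresholds follow the rules just described.

```r
# Adjust the (lower, upper) limits of one quantitative parameter, given the
# value indicated by the optimizer, following the rules described above.
adjust_limits_one <- function(lower, upper, optimum) {
  step   <- upper - lower                       # step length
  center <- (lower + upper) / 2
  if (abs(optimum - center) <= 0.25 * step) {
    return(c(lower, upper))                     # close to the center point: keep limits
  }
  if (optimum <= lower + 0.01 * step) {
    shift <- -0.25 * step                       # within 1% of the lower limit
  } else if (optimum >= upper - 0.01 * step) {
    shift <-  0.25 * step                       # within 1% of the upper limit
  } else if (optimum < center) {
    shift <- -0.10 * step                       # well below center, but not at the limit
  } else {
    shift <-  0.10 * step                       # well above center, but not at the limit
  }
  c(lower + shift, upper + shift)
}
```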

Figure 2. Optimization design adjustment by 25% in both factors, illustrated for a two-factor design.

The new updated design is then executed, and response values are recorded. A new MLR model is calculated, and the optimizer is run using the new model. Evaluation of the optimal parameter settings is performed again to decide whether or not a further adjustment is needed. This procedure of executing the

Figure 3. The 84-run design used for optimizing subprocedure Ia.

dx.doi.org/10.1021/ac301482k | Anal. Chem. 2012, 84, 6869−6876

Analytical Chemistry

Article

Figure 4. Reliability index optimization in the poplar tree plant study. The figure describes the progress of the optimization strategy, starting out from the default settings where the number of unreliable peaks is higher than the number of reliable peaks. Throughout the optimization, the number of reliable peaks increases, and before the optimization procedure is completed, the reliable peaks outnumber the unreliable peaks. The feature detection algorithm applied was centWave.

using extraction buffer, following the same procedure. All the evaporated samples were dissolved in 30 μL of MeOH/H2O (1:1, v/v) and analyzed by a Waters Acquity UPLC system coupled to an LCT Premier TOF mass spectrometer (Milford, MA, USA).20

Applying the Optimized Settings to the Poplar Tree Plant Sample Set. The settings obtained from the optimization procedure were thereafter used to process the sample set data. An OPLS-DA21 model was created between transgenic and wild-type hybrid aspen. The OPLS-DA model was subjected to 7-fold cross-validation to determine the suitable number of components. The number of significant peaks was measured by counting the number of OPLS-DA loading confidence intervals that did not overlap with zero. All confidence intervals were calculated by 7-fold jack-knifing.

Human Study. A single urine sample from a healthy volunteer (male, 29 years) was used as a substitute for a pooled sample. The sample was collected immediately prior to analysis, so no storage was needed. The sample was centrifuged for 5 min at 14 000 rpm (ambient temperature) to remove debris. The supernatant was split into two aliquots, and each was used for one dilution series. The two dilution series were identical, with dilutions of 1:0, 1:1, 1:2, 1:3, 1:4, 1:5, 1:6, 1:7, 1:8, 1:9, and 1:10 (urine/water). Samples were kept cold until analysis. Chromatography was performed on a Waters Acquity system coupled to a Waters Synapt Quadrupole Time-of-Flight (TOF) mass spectrometer (Milford, MA, USA) operated only in TOF mode. Five μL of the sample was injected onto a 2.1 × 100 mm, 1.7 μm C18 UPLC column. The gradient elution buffers were A (H2O, 0.1% formic acid) and B (acetonitrile, 0.1% formic acid), and the flow rate was 600 μL min−1. The column (held at 40 °C) was eluted with a linear gradient as follows: 0−1 min 0% B, 0−30% B from 1 to 5 min, hold at

qualitative factors with two levels each and one quantitative factor. The design used was a full factorial with four points situated at the mid-edges (for the quantitative factor), which were repeated, giving 16 experiments in total.

Global Optimization Design. A design incorporating all parameters was run for the matchedFilter method in the first subprocedure, for both the human and the poplar tree plant study. This CCF optimization design in 156 runs included 13 parameters in total, three of them qualitative: one with four levels and two with two levels.
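For reference, a face-centered central composite (CCF) design of the kind used here can be generated for the quantitative factors as sketched below (coded levels of -1, 0, and +1); replicated center points beyond those shown and any qualitative factors would be added on top of this. The function name is an assumption for illustration.

```r
# Face-centered central composite (CCF) design in coded units for k quantitative factors:
# full factorial corner points, face-centered axial points, and center points.
ccf_design <- function(k, n_center = 2) {
  corners <- as.matrix(expand.grid(rep(list(c(-1, 1)), k)))  # 2^k factorial points
  axial   <- matrix(0, nrow = 2 * k, ncol = k)               # face-centered star points
  for (i in seq_len(k)) axial[c(2 * i - 1, 2 * i), i] <- c(-1, 1)
  centers <- matrix(0, nrow = n_center, ncol = k)
  design  <- rbind(corners, axial, centers)
  colnames(design) <- paste0("x", seq_len(k))
  design
}

nrow(ccf_design(3))   # 8 + 6 + 2 = 16 runs, matching the DoE II design described above
```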



EXPERIMENTAL SECTION

Poplar Tree Plant Study. Xylem tissue was collected from transgenic LMP1::AtGA2ox2 overexpressors19 and wild-type hybrid aspens (P. tremula L. × P. tremuloides Michx., clone T89). After grinding in a mortar in liquid nitrogen, 10 mg of sample and 1 mL of extraction buffer (MeOH/H2O, 8:2, v/v) were added to an Eppendorf tube. For each genotype, 5 replicates were prepared. The samples were extracted in an MM400 vibration mill (Retsch GmbH & Co. KG, Haan, Germany) with carbon beads (3 mm diameter) for 3 min at a frequency of 30 Hz. After removal of the beads, the samples were centrifuged in a Mikro 220R centrifuge (Hettich Zentrifugen, Germany) for 10 min at 14 000 rpm at 4 °C. The collected supernatants for each genotype were pooled and mixed together (approximately 5 mL in total). Ten replicates were prepared by transferring 100 μL of the supernatant to a microvial and evaporating to dryness in a miVac Quattro Concentrator (Genvac, Ipswich, England). The rest of the supernatant was pooled and used for dilution. The supernatant was diluted with extraction buffer at the following ratios: 8:2, 6:4, 5:5, 4:6, and 2:8. For each concentration, 100 μL of solution was transferred to a microvial and evaporated to dryness in a miVac Quattro Concentrator (Genvac, Ipswich, England). A blank sample was prepared


30% B between 5 and 8 min, 30−95% B from 8 to 9 min, hold for 0.1 min, and thereafter return to 0% B until 11 min. The mobile phase was introduced into an electrospray ion source. The source temperature was 120 °C with a cone gas flow of 50 L hr−1, a desolvation temperature of 150 °C, and a desolvation gas flow of 200 L hr−1. The capillary voltage was set at 3 kV for positive ion mode, with a cone voltage of 50 V; the aperture voltage was 15 V, and the corona voltage was 15 V. For MS analysis, leucine enkephalin 12C [M + H] was used as the lock mass. Analysis was performed in V flight mode, and data were recorded in centroid mode in the m/z range of 50−1000 with a scan time of 0.08 s. Data were recorded from 0 to 10 min.



Figure 5. The reliability index from the dilution series correlates well with the number of significant peaks (i.e., potential biomarkers) in the poplar tree plant sample set, processed with parameter settings corresponding to the reliability index, using centWave as feature detection algorithm.

RESULTS

Poplar Tree Plant Study, centWave. DoE Ib: Filter and Identify Peaks, centWave (DoE Ib:1). After modeling the reliability index with MLR, the MODDE optimizer was run and the resulting optimal settings were interpreted. This indicated an optimum outside the investigated region, and therefore, the design limits were adjusted according to the protocol described above.

Adjusted Designs: Filter and Identify Peaks, centWave (DoE Ib:2−5). The next design, DoE Ib:2, was a copy of the first design but with factor limits adjusted according to the optimal settings indicated by the first model. The results still indicated an optimum outside the current design region, which led to a new iteration of the optimization procedure. In total, the centWave method in subprocedure Ib needed four adjusted designs before the reliability index stopped increasing. The settings from DoE Ib:4, giving the highest reliability index, were chosen for the optimization of subsequent subprocedures.

DoE II: Match Peaks Across Samples, centWave (DoE II:1−2). For subprocedure II, the reliability index decreased after the first adjusted design. The optimized settings, giving the highest reliability index, were selected from the initial design, DoE II:1, and were used for subprocedure II throughout the remainder of the optimization of the poplar tree plant data.

DoE III: Retention Time Correction (DoE III:1). The initial design of subprocedure III failed to find settings resulting in a higher reliability index. Since the highest reliability index in this design is equal to the highest in design DoE II:1, the default values of this subprocedure were chosen as the best settings.

Using the optimization strategy presented here with the default settings as center points, we were able to find optimized settings for each of the subprocedures so that the number of reliable peaks (r2 ≥ 0.9) increased from 5 to 442, and the number of unreliable peaks (r2 ≤ 0.05) increased from 72 to 339. The quality metric, the reliability index, increased from the default value of 0.3 to 670. The progression of the reliability index optimization can be seen in Figure 4.

The settings giving the highest reliability index from each design in the poplar tree plant study give a range of reliability index values of interest to investigate. By processing the individual samples using these settings, the correlation between the number of potential biomarkers and the reliability index can be measured (r > 0.99). This is illustrated in Figure 5.

Poplar Tree Plant Study, matchedFilter. DoE Ia: Filter and Identify Peaks, matchedFilter (DoE Ia:1−4). To optimize the first subprocedure for the poplar tree plant data set, three adjusted designs were necessary before the reliability index stopped increasing. The settings from the second adjusted

design, which resulted in the highest reliability index, were chosen for the designs of the subsequent subprocedures.

DoE II: Match Peaks Across Samples (DoE II:1−2). In this subprocedure optimization, the first adjusted design failed to find settings giving a higher reliability index than the initial design. The best settings for this subprocedure were therefore chosen from the initial design.

DoE III: Retention Time Correction (DoE III:1). The model for the third subprocedure was not reliable enough to make accurate predictions of the appropriate direction in which to adjust the parameter settings. Since the reliability index was not improved from the first design in the second subprocedure, the best settings from this subprocedure's initial design were chosen for the third subprocedure.

As can be seen in Figure 6, the parameter optimization in the poplar tree plant study using the feature detection algorithm

Figure 6. Reliability index optimization of the poplar tree plant study with matchedFilter used as feature detection algorithm.

matchedFilter in the first subprocedure resulted in an increase of the reliability index from 45 to 697. The improved settings for the matchedFilter method in the poplar tree plant study yielded 2016 significant peaks (potential biomarkers) out of 3123. The default settings for the same method gave 443 significant loadings out of 788. The optimization of the centWave method resulted in the number of significant loadings increasing from 41 out of 152 to 1651 out of a total of 2271.

Human Study. DoE Ia: Filter and Identify Peaks, matchedFilter (DoE Ia:1−3). In the human study, the


detection of matrix effects. However, the present study focuses primarily on peak detection, not on quantifying matrix effects. Therefore, we chose concentration ranges where, in our experience, the response is reasonably linear. The reliability index is related to the correlation between diluted concentrations and integrated peak areas of the processed data. Our results for both the human and the poplar tree plant study, using two high-resolution LC-MS techniques, showed at least a 57% improvement compared with the default settings. The reliability index correlates well with the number of significant peaks, i.e., potential biomarkers, in the processed data set of the original samples.

Two different DoE approaches have been used, sequential and global optimization. Sequential optimization performed better in some cases, but its real benefit is its flexibility, scalability, and significant reduction in calculation time. We therefore recommend the use of sequential optimization. This study employs design of experiments as the only optimization approach. Other methods based on genetic algorithms and simulated annealing would be interesting to evaluate, and ongoing research may also include other optimization algorithms. The strategy is described here using the popular XCMS package from Scripps, but with minor modifications to the protocol, it can be applied to any other data-processing software, e.g., MZmine 2. It can be fully automated; thus, it can be used as a module in an LC-MS data-processing pipeline.

optimization of subprocedure I required two adjusted designs before the reliability index started decreasing. The best parameter setting from DoE Ia:2 was used in subsequent designs.

DoE II: Match Peaks Across Samples (DoE II:1−3). The same protocol as for DoE Ia was used for subprocedure II (peak matching across samples). The initial design was run and then followed by two adjusted designs.

DoE III: Retention Time Correction (DoE III:1). In this subprocedure, no significant model could be found, and therefore, no adjusted designs were performed. The experimental setting giving the highest reliability index was selected as the optimal setting.

In the human study, we were able to increase our measure of quality from 530 to 832, requiring 6 days to complete the optimization. The reliability index optimization from the default settings through all designs and subprocedures for the human study is illustrated in Figure 7.



ASSOCIATED CONTENT

Supporting Information

Detailed description of XCMS data processing and parameters. Table S-1 with default settings for subprocedure Ia. Table S-2 with default settings for subprocedure Ib. Table S-3 with default settings for subprocedure II. Table S-4 with default settings for subprocedure III. Table S-5 with optimized settings for the poplar tree plant study using centWave as feature detection algorithm. Table S-6 with optimized settings for subprocedure II for the poplar tree plant study. Table S-7 with optimized settings for subprocedure III for the poplar tree plant study. Table S-8 with optimized settings for subprocedure Ia for the poplar tree plant study using matchedFilter as feature detection algorithm. Table S-9 with optimized settings for subprocedure II for the poplar tree plant study. Table S-10 with optimized settings for subprocedure III for the poplar tree plant study. Table S-11 with optimized settings for subprocedure Ia for the human study with matchedFilter used as feature detection algorithm. Table S-12 with optimized settings for subprocedure II for the human study. Table S-13 with optimized settings for subprocedure III for the human study. The R scripts used are available upon request from the corresponding author. This material is available free of charge via the Internet at http://pubs.acs.org.

Figure 7. Reliability index optimization for the human study using matchedFilter as feature detection algorithm.

Global Design. In the poplar tree plant study, the reliability index started to decrease already after the first adjusted design. The reliability index did increase from the default of 45 to 587; this was, however, not an improvement compared with sequential optimization where, for the same data set, the highest reliability index attained was 833. Since this optimization required only two designs, the time to completion was less than 1 day. The same strategy was applied to the human study, where the reliability index increased from 529 to 1295, a clear improvement compared with sequential optimization, which reached a maximum of 697. This optimization required a total of 33 days to complete. Eight significant interaction effects were found between parameters from different subprocedures.
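As an illustration of how such cross-subprocedure interactions can be screened, an MLR model with two-factor interaction terms can be fitted to the global-design results and its coefficient p-values inspected. The sketch below is a generic stand-in for the MODDE analysis; doe_results is a hypothetical data frame with one row per design run, one column per coded parameter, and the response reliability_index.

```r
# doe_results: hypothetical data frame with coded parameter columns plus reliability_index.
# Fit main effects and all two-factor interactions.
fit <- lm(reliability_index ~ .^2, data = doe_results)

coefs        <- summary(fit)$coefficients
interactions <- coefs[grepl(":", rownames(coefs)), , drop = FALSE]   # interaction terms only
significant  <- interactions[interactions[, "Pr(>|t|)"] < 0.05, , drop = FALSE]
significant   # interaction terms with a significant effect on the reliability index
```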



CONCLUSIONS

In the present study, a strategy for optimizing LC-MS metabolomics data processing is proposed. We have focused on the popular XCMS package, open source and written in R. Our proposed strategy is based on a sequential design of experiments (DoE) approach aimed at optimizing parameters with respect to an objective function, the reliability index. The optimization procedure is based on a simple dilution series from a pooled (or representative) sample from the studied sample set. Using a dilution series allows assessment of linearity throughout the analytical concentration range, and provided the pooled sample is representative of the sample series, this allows



AUTHOR INFORMATION

Corresponding Author

*Telephone: +46 90 786 6917. E-mail: Johan.Trygg@chem.umu.se.

Notes

The authors declare no competing financial interest.


ACKNOWLEDGMENTS

This research was supported by the Swedish Research Council (grant no. 2011-6044), Umetrics AB, and the Swedish national strategic e-science research program eSSENCE.



REFERENCES

(1) Brown, M.; Dunn, W. B.; Ellis, D. I.; Goodacre, R.; Handl, J.; Knowles, J. D.; O’Hagan, S.; Spasic, I.; Kell, D. B. Metabolomics 2005, 1, 39−51.
(2) Dunn, W. B. Phys. Biol. 2008, 5, article ID 011001.
(3) Krastanov, A. Biotechnol. Biotechnol. Equip. 2010, 24, 1537−1543.
(4) Boccard, J.; Veuthey, J. L.; Rudaz, S. J. Sep. Sci. 2010, 33, 290−304.
(5) Lindon, J. C.; Nicholson, J. K.; Holmes, E.; Keun, H. C.; Craig, A.; Pearce, J. T. M.; Bruce, S. J.; Hardy, N.; Sansone, S. A.; Antti, H.; Jonsson, P.; Daykin, C.; Navarange, M.; Beger, R. D.; Verheij, E. R.; Amberg, A.; Baunsgaard, D.; Cantor, G. H.; Lehman-McKeeman, L.; Earll, M.; Wold, S.; Johansson, E.; Haselden, J. N.; Kramer, K.; Thomas, C.; Lindberg, J.; Schuppe-Koistinen, I.; Wilson, I. D.; Reily, M. D.; Robertson, D. G.; Senn, H.; Krotzky, A.; Kochhar, S.; Powell, J.; van der Ouderaa, F.; Plumb, R.; Schaefer, H.; Spraul, M.; The Standard Metabolic Reporting Structures working group. Nat. Biotechnol. 2005, 23, 833−838.
(6) Katajamaa, M.; Oresic, M. J. Chromatogr., A 2007, 1158, 318−328.
(7) Box, G. E.; Hunter, W. G.; Hunter, J. S. Statistics for Experimenters: Design, Innovation and Discovery, 2nd ed.; Wiley: Hoboken, NJ, 2005.
(8) Eriksson, L.; Johansson, E.; Kettaneh-Wold, N.; Wikström, C.; Wold, S. Design of Experiments, Principles and Applications, 3rd ed.; Umetrics Academy: Umeå, 2008.
(9) Montgomery, D. C. Design and Analysis of Experiments, 7th ed.; Wiley: Hoboken, NJ, 2008.
(10) Wuolikainen, A.; Hedenstrom, M.; Moritz, T.; Marklund, S. L.; Antti, H.; Andersen, P. M. Amyotrophic Lateral Scler. 2009, 10, 229−U8.
(11) Gullberg, J.; Jonsson, P.; Nordstrom, A.; Sjostrom, M.; Moritz, T. Anal. Biochem. 2004, 331, 283−295.
(12) A, J.; Trygg, J.; Gullberg, J.; Johansson, A. I.; Jonsson, P.; Antti, H.; Marklund, S. L.; Moritz, T. Anal. Chem. 2005, 77, 8086−8094.
(13) Ritter, J. B.; Genzel, Y.; Reichl, U. Anal. Biochem. 2008, 373, 349−369.
(14) Raji, M. A.; Schug, K. A. Int. J. Mass Spectrom. 2009, 279, 100−106.
(15) Zhou, Y.; Song, J. Z.; Choi, F. F. K.; Wu, H. F.; Qiao, C. F.; Ding, L. S.; Gesang, S. L.; Xu, H. X. J. Chromatogr., A 2009, 1216, 7013−7023.
(16) Smith, C. A.; Want, E. J.; O’Maille, G.; Abagyan, R.; Siuzdak, G. Anal. Chem. 2006, 78, 779−787.
(17) R Development Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2010.
(18) Tautenhahn, R.; Bottcher, C.; Neumann, S. BMC Bioinf. 2008, 9, 504.
(19) Mauriat, M.; Sandberg, L. G.; Moritz, T. Plant J. 2011, 67, 805−816.
(20) Bruce, S. J.; Jonsson, P.; Antti, H.; Cloarec, O.; Trygg, J.; Marklund, S. L.; Moritz, T. Anal. Biochem. 2008, 372, 237−249.
(21) Bylesjo, M.; Rantalainen, M.; Cloarec, O.; Nicholson, J. K.; Holmes, E.; Trygg, J. J. Chemom. 2006, 20, 341−351.
