COMMENT pubs.acs.org/ac

Comment on “Optimized Preprocessing of Ultra-Performance Liquid Chromatography/Mass Spectrometry Urinary Metabolic Profiles for Improved Information Recovery”

Elia Mattarucchi†,‡ and Claude Guillou*,†

†European Commission, Joint Research Centre, Institute for Health and Consumer Protection, Systems Toxicology Unit, ISPRA (VA), Italy

In a recent paper, Veselkov and co-workers1 proposed some strategies to stabilize the technical variance within metabolic profiles acquired by UPLC/MS and to normalize signal intensities. The described methods are interesting, but in our opinion, they also raise a number of points that should be addressed:

(1) The authors asserted (Figure 1A) that the peaks with the highest intensities are progressively more variable. This is reasonable in absolute terms, but it should be clarified that the relative variation decreases, as argued in the cited works of Anderle et al.2 and Sysi-Aho et al.3 (Figures 8 and 5A, respectively). In particular, Anderle et al.2 reported that the coefficients of variation of the most intense peaks were almost constant, which seems to contrast with the observations of Veselkov et al.1

(2) The analyzed data sets are the final result of two distinct processes (i.e., the acquisition of metabolic profiles and peak integration). The authors suggested that the observed variance arises mainly from the acquisition step. However, in our opinion, peak integration itself represents an important source of variability and should also have been considered. Typical untargeted metabolic profiles are complex and irregular, and for the time being, no deconvolution software is capable of accurately integrating all the acquired peaks. In fact, the need to set common (i.e., “one size fits all”) integration parameters frequently reduces the efficacy of the extraction process.4 It also seems reasonable that the most intense and well-delineated peaks were well integrated; this should result in a reduced variance for these signals, which indeed differs from the observations of Veselkov et al.1

(3) In Figure 1A, only the highest signals showed an increased variance. The authors recognized in this finding a heteroscedastic structure, assuming that the noise was mainly multiplicative. However, different urine samples can show up to 15-fold changes in concentration, and the signals from the most abundant metabolites may simply have exceeded the dynamic range of the instrumentation (“saturation”), resulting in the observed variability.

(4) Missing data occur widely in mass spectrometry-derived metabolomics data sets, and they can have a biological and/or technical origin.5 Pharmacological treatments, as well as differences in diet and lifestyle, can cause them (especially in studies on human subjects, for which these factors are difficult to control). However, in our experience,6 missing data mainly originate from failures of the peak-picking process (see point 3) and from the presence of borderline metabolites (i.e., metabolites with concentration levels close to the detection threshold of the instrumentation used). In particular, biofluids characterized by variable levels of concentration (e.g., urine) are likely to contain numerous borderline metabolites that are detectable only in the most concentrated samples (and undetectable in the less concentrated ones), producing data sets with several gaps. Missing values have negative effects on further processing of the extracted data,6 and they cannot benefit from any normalization procedure because their numerical value is null (i.e., intensity = 0). The presence of null values can also hamper the application of the logarithm transformation introduced by Veselkov et al.1 to stabilize the variance. Veselkov et al.1 neither reported nor discussed the occurrence and the impact of missing values in their article; in our opinion, this should be discussed in the light of the above considerations and the work of Hrydziuszko and Viant.5

Many of the critical issues listed above are the consequence of the broad concentration differences between urine samples and the use of suboptimal extraction parameters, which are set to be valid along the whole metabolic profile. Metabolomics is a multiple-step process, and the outcome of each step is affected by the output of the preceding steps. In this context, the design of the preprocessing strategy should consider that the acquisition of metabolic profiles and peak integration (i.e., the previous steps) are frequently suboptimal compromises aiming at the measurement of thousands of metabolites at the same time. The negative effect of these compromises can be only partially compensated for by the use of sophisticated postacquisition methods which, on the other hand, may represent an additional source of variability.

In a recent study6 we aimed at addressing some of these issues. As a result, we proposed a simple preacquisition strategy based on the differential dilution of the urine samples (to reduce the concentration differences between the samples) and an increase in the scan time period (to improve the signal-to-noise ratio). The application of this straightforward procedure proved to have a positive effect on the selection of potential biomarkers, reducing the need for extensive postacquisition treatment of the extracted data sets.

© 2011 American Chemical Society
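A minimal numerical sketch may help to illustrate two of the points raised in this comment: the difference between absolute and relative variation under purely multiplicative noise, and the effect of null intensities on a logarithm transformation. The intensities, the noise level, and the function name `sd_and_cv` are hypothetical and chosen only for illustration; they are not taken from the data of Veselkov et al. or from our own study.

```python
import numpy as np


def sd_and_cv(replicates):
    """Return (standard deviation, coefficient of variation) of replicate intensities."""
    replicates = np.asarray(replicates, dtype=float)
    sd = replicates.std(ddof=1)
    return sd, sd / replicates.mean()


# Assumption: purely multiplicative noise, i.e. every replicate is the true
# intensity scaled by a random factor with the same spread for all peaks.
rng = np.random.default_rng(0)
noise = rng.lognormal(mean=0.0, sigma=0.1, size=200)

# Two hypothetical peaks: a weak one and an intense one.
low_sd, low_cv = sd_and_cv(100.0 * noise)
high_sd, high_cv = sd_and_cv(10_000.0 * noise)

print("weak peak:    SD =", low_sd, " CV =", low_cv)
print("intense peak: SD =", high_sd, " CV =", high_cv)

# A missing value stored as a null intensity has no finite logarithm.
intensities = np.array([0.0, 50.0, 5000.0])
with np.errstate(divide="ignore"):  # silence the divide-by-zero warning
    logged = np.log(intensities)

print("log-transformed intensities:", logged)
```

Under these assumptions the standard deviation scales with the intensity while the coefficient of variation does not, and the null intensity maps to negative infinity, so missing values would have to be handled before such a transformation is applied.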

AUTHOR INFORMATION

Corresponding Author

*Address: European Commission, IHCP, Systems Toxicology Unit, Via E Fermi, 21027 ISPRA (VA), Italy. Phone: +39 0332 785678. Fax: +39 0332 789303. E-mail: [email protected]. europa.eu.

Published: November 09, 2011

dx.doi.org/10.1021/ac202416r | Anal. Chem. 2011, 83, 9719–9720

Present Addresses

‡Universita’ dell’Insubria, Dipartimento di Scienze Biomediche Sperimentali e Cliniche, via H Dunant 5, 21100 Varese, Italy.

REFERENCES

(1) Veselkov, K. A.; Vingara, L. K.; Masson, P.; Robinette, S. L.; Want, E.; Li, J. V.; Barton, R. H.; Boursier-Neyret, C.; Walther, B.; Ebbels, T. M.; Pelczer, I.; Holmes, E.; Lindon, J. C.; Nicholson, J. K. Anal. Chem. 2011, 83, 5864–5872.
(2) Anderle, M.; Roy, S.; Lin, H.; Becker, C.; Joho, K. Bioinformatics 2004, 20, 3575–3582.
(3) Sysi-Aho, M.; Katajamaa, M.; Yetukuri, L.; Oresic, M. BMC Bioinf. 2007, 8, 93.
(4) Dunn, W. B.; Broadhurst, D.; Brown, M.; Baker, P. N.; Redman, C. W.; Kenny, L. C.; Kell, D. B. J. Chromatogr., B: Anal. Technol. Biomed. Life Sci. 2008, 871, 288–298.
(5) Hrydziuszko, O.; Viant, M. R. Metabolomics 2011, DOI: 10.1007/s11306-011-0366-4.
(6) Mattarucchi, E.; Guillou, C. Biomed. Chromatogr. 2011, DOI: 10.1002/bmc.1697.
