MetaboGroupS: A Group Entropy-based Web Platform for Evaluating Normalization Methods in Blood Metabolomics Data from Maintenance Hemodialysis Patients
Shisheng Wang, Xiaolei Chen, Dan Du, Wen Zheng, Liqiang Hu, Hao Yang, Jingqiu Cheng, and Meng Gong
Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.8b03065 • Publication Date (Web): 17 Aug 2018
Analytical Chemistry
MetaboGroupS: A Group Entropy-based Web Platform for Evaluating Normalization Methods in Blood Metabolomics Data from Maintenance Hemodialysis Patients

Shisheng Wang,†§ Xiaolei Chen,‡§ Dan Du,† Wen Zheng,† Liqiang Hu,† Hao Yang,† Jingqiu Cheng,† and Meng Gong*†

†West China-Washington Mitochondria and Metabolism Research Center; Key Lab of Transplant Engineering and Immunology, MOH, West China Hospital, Sichuan University, Chengdu, China
‡Department of Nephrology, West China Hospital of Sichuan University

ABSTRACT: Because signal variations in LC-MSn-based non-targeted metabolomics are inevitable and complicated, normalization of metabolite data is highly recommended to improve the accuracy of metabolic profiling and the discovery of potential biomarkers. Although various normalization methods have been developed and applied to these datasets, it is still difficult to assess their performance. Moreover, such methods are opaque and hard to choose among, especially for users without bioinformatics training. In this study, we present a powerful and user-friendly web platform, named MetaboGroupS, that compares and evaluates seven popular normalization methods and automatically recommends an optimal one to end users based on the group entropies of every sample data point. To examine and apply this tool, we analyzed a complex clinical human dataset from maintenance hemodialysis patients with erythropoietin resistance. Metabolite peaks (11027) were extracted from the experimental data and imported into the platform, and the entire analysis was completed sequentially within 5 minutes. To further test the performance and universality of MetaboGroupS, we analyzed two more published datasets, including a nuclear magnetic resonance (NMR) dataset, on this platform. The results suggested that a method yielding lower intra-group entropy and higher inter-group entropy is preferable. In addition, MetaboGroupS is convenient to operate and does not require computational expertise, which makes it accessible to scientists in many fields. MetaboGroupS is freely available at https://omicstools.shinyapps.io/MetaboGroupSapp/.
With the rapid development of instruments and the gradual expansion of related databases, mass spectrometry (MS) coupled with either liquid chromatography (LC-MS) or gas chromatography (GC-MS) is becoming a prevalent and powerful approach for identifying and quantifying the small-molecule metabolites involved in complex biological and disease processes in organisms,1-3 for example renal anemia, which is usually present in maintenance hemodialysis patients suffering from chronic kidney disease (CKD).4 Hundreds or even thousands of metabolites from bio-samples can be detected in one assay owing to high sensitivity and versatile selection capabilities.5-6 However, different forms of unwanted variation caused by biological or experimental processes may introduce significant systematic biases into the raw metabolomics data and invalidate downstream statistical inference.7 As a result, deciphering and visualizing these large-scale datasets is a formidable task that usually requires a sophisticated computational approach,8 which hinders the broader application of metabolomics in biomarker identification, pathological studies, and drug discovery. Generally, these unwanted variations appear as signal drift and batch effects, which are continually encountered in long-term metabolic profiling. To correct for signal drift and eliminate batch effects, repeated analysis of quality control (QC) samples, which are typically prepared by pooling aliquots of each sample, is widely used over the entire time period of large-scale studies.9-10 Statistical approaches are also applied to capture biases of arbitrary complexity and improve the overall differential profiles across datasets.

Normalization is now frequently regarded as a necessary part of data analysis.11-12 Two main types of normalization methods have been developed so far: 1) sample-centered normalization, such as median normalization8 and variance stabilizing normalization (VSN),13 which aims at correcting sample-to-sample concentration differences; 2) metabolite-centered normalization, of which QC sample-based support vector regression (QC-SVR)14 and QC-based robust locally estimated scatterplot smoothing (LOESS) signal correction (QC-RLSC)15 are representative, which primarily corrects batch-to-batch analytical variation. Most of these methods are integrated in published pipelines such as NOREVA,16 BatchQC,17 and MetaboAnalyst.18 However, most of these pipelines are either too obscure for users to operate or unable to indicate which normalization method is applicable.

Furthermore, entropy (common symbol: S), a fundamental quantity in thermodynamic systems, has been extended through information theory to statistics and machine learning.19-20 The classical formula is:

$S = k_B \ln \Omega$ (1)
Figure 1. Straightforward computation framework of MetaboGroupS.
where $k_B$ is the Boltzmann constant, equal to 1.38065 × 10^−23 J/K, and $\Omega$ is the number of possible microscopic configurations.21 This formula relates entropy to the number of ways in which the atoms or molecules of a thermodynamic system can be arranged.22 By analogy, the entropies of different sample groups can serve as natural metrics for describing the information content of a dataset processed by different normalization methods. In this work, we designed a powerful and comprehensive software tool, MetaboGroupS, which automatically calculates group entropies from the principal component analysis (PCA) score matrix of every dataset after normalization and comparatively evaluates the fitness of different normalization methods from several perspectives, in order to recommend the most appropriate normalization method for subsequent analysis. To date, MetaboGroupS implements seven current methods: median normalization,23 standard normalization,24 VSN,13 Remove Unwanted Variation-Random normalization (RUV-random),25 QC-SVR,14 EigenMS,26 and QC-RLSC,15 and additional methods can be integrated in the future. MetaboGroupS involves no complicated operations and requires only internet access. One experimental dataset obtained from maintenance hemodialysis patients with erythropoietin resistance and two published datasets31-32 were used to demonstrate the originality and applicability of this software. With this, we aim to enable scientists not engaged in bioinformatics to conveniently incorporate metabolomics analysis into their research programs, especially for clinical data analysis.

EXPERIMENTAL DETAILS

Software Implementation.
All functions in MetaboGroupS were written in R (https://www.r-project.org/)27 and the graphical user interface (GUI) was developed in Shiny.28 The platform was deployed on the free shinyapps.io server, which is supported by the RStudio team. Alternatively, we prepared a spare website, http://www.omicsolution.org/wukong/MetaboGroupS, where users can also analyze their data freely with no login requirement. The GUI contains three main parts (Figure S1): the module names, the parameter setting panel, and the results presentation panel. The seven frequently used normalization methods mentioned above and a group entropy algorithm were embedded into this software. Detailed operating instructions for MetaboGroupS are given in the supplementary notes.

Sample Collection and Preparation. A total of 61 maintenance hemodialysis patients in West China Hospital (Sichuan, China) were involved in this study. Patients with any of the following comorbidities relevant to secondary anemia were excluded: bleeding, acute infections, malignant tumors, hematological diseases, iron deficiency and malnutrition. The therapeutic effect of erythropoietin (EPO) on these patients was evaluated using the Erythropoietin Resistance Index (ERI), calculated as the weekly weight-adjusted dose of erythropoiesis-stimulating agents (ESA) divided by Hb level (g/L). According to the latest clinical records of these patients, the ERI values ranged from 0 to 36.28. The patients were divided into three groups based on ERI values. Group A (n = 22) with ERI 20. Fasting blood samples were collected with heparin sodium salt as anticoagulant and stored at 4°C for no more than 2 h. Plasma was separated by centrifuging whole blood at 1500 × g for 10 min at 4°C and then stored at −80°C. Each 0.2 ml plasma sample was added to 0.8 ml of a mixture of chloroform and methanol (2:1), vortexed for 1 min and then centrifuged at 13000 × g for 10 min at 4°C. The upper phase of each sample was collected for subsequent LC-MS analysis.

Liquid Chromatography/Mass Spectrometry Analysis. Biological sample preparation, data acquisition and processing followed the protocol described by Nikolic et al.29 LC-MS/MS data were acquired in positive ion mode and negative ion mode separately using a Xevo G2-XS Q-TOF mass spectrometer (Waters) controlled by Masslynx software (Waters, Version 4.1).
Chromatographic separation was carried out using an HSS T3 column (2.1 × 100 mm, 1.8 µm, Waters) on an ACQUITY UPLC I-Class system (Waters). The detailed chromatographic and mass spectrometer parameters are described in the supplementary information section (UPLC-QTOF-MS Parameters). Pooled QC samples, prepared by combining small aliquots (10 µl) of each sample, were used throughout the experiment as a process control to monitor LC and MS performance across sample runs, as recommended by Sangster et al.30 After data acquisition, peak intensities, mass-to-charge ratios (m/z), and retention times (RT) were extracted from the raw data using Progenesis QI (Waters, version 2.3.6275.47962).

Figure 2. Normalization results of experimental datasets. Within-group and across-group RLA plots of maintenance hemodialysis patients' blood samples (a) before (“No Normalization”) and after seven different normalization methods: (b) Median Normalization, (c) Standard Normalization, (d) VSN, (e) RUV-random, (f) SVR, (g) EigenMS, (h) QC-RLSC.

Table 1. Skewness range of original and log2-transformed intensities.
Group    Original          Log2
QC       [47.68, 58.07]    [0.62, 0.66]
A        [40.59, 62.36]    [0.42, 0.69]
B        [38.48, 67.40]    [0.38, 0.67]
C        [29.28, 60.96]    [0.41, 0.62]

Published datasets collected for verification. To test the performance and utility of MetaboGroupS, two more published datasets were collected: 1. MTBLS79,31 in which 48 metabolites were eventually extracted across 172 cardiac tissue samples, including 38 QC samples; 2. Wine data,32 which included the 1H nuclear magnetic resonance (NMR) spectra of
Figure 3. Comparison of various normalization methods based on PCA. (a) No Normalization, (b) Median Normalization, (c) Standard Normalization, (d) VSN, (e) RUV-random, (f) SVR, (g) EigenMS, (h) QC-RLSC. The different samples were color-visualized with 95% CI according to their group information.

Table 2. Entropies of each sample group, deduced from the data matrix processed with seven normalization methods.
Method        Group A   Group B   Group C   QC
None          2.632     3.028     2.673     2.303
Median        2.440     2.639     2.557     2.303
Standard      2.858     2.958     2.719     2.303
VSN           2.516     2.839     2.715     2.303
RUV-random    3.090     3.082     2.833     2.297
SVR           2.582     3.028     2.749     2.303
EigenMS       3.091     2.978     2.833     2.257
QC-RLSC       3.091     3.091     2.833     2.063
wines of different origin and color (red, white and rosé), which were preprocessed with the speaq package to obtain the feature matrix.33

Data preprocessing and normalization. The data uploaded at the entrance portal of MetaboGroupS, a sample-by-feature matrix, can be in xlsx, xls, csv or txt format, all of which are easily exported from Progenesis QI or similar tools.34 Subsequently, we set missing values (peak intensities of 0) to not available (NA) and removed features whose NA ratio was above 0.5 (50%). After that, imputation was performed with the k-Nearest Neighbor (KNN) algorithm35 and the coefficient of variation (CV) for each group of samples was calculated on the log2-transformed data; users can choose among four transformation modes (“Log2”, “Log”, “Log10”, “none”) in MetaboGroupS according to their own data. After preprocessing, the data were normalized using seven methods (median normalization,23 standard normalization,24 VSN,13 RUV-random,25 QC-SVR,14 EigenMS,26 and QC-RLSC15). Their syntax and implementation in R are described in Table S2.

Group Entropy Computation. Entropy and mutual information estimation have received close attention in information theory.36 On the basis of those principles, we define the group entropy as:

$S_k = -\sum_{i=1}^{g} \theta_{k,i} \log \theta_{k,i}$ (2)

where g is the replicate number of each group sample and $\theta_{k,i}$ are the bin frequencies of the PCA score distance matrix of the k-th group, which can be estimated with a James-Stein-type shrinkage estimator, as shown below:37

$\hat{S}_k^{\mathrm{Shrink}} = -\sum_{i=1}^{g} \hat{\theta}_{k,i}^{\mathrm{Shrink}} \log \hat{\theta}_{k,i}^{\mathrm{Shrink}}$ (3)
Therefore, it is necessary first to compute the PCA score matrix from each normalized dataset using the prcomp function and then to calculate the Euclidean distance matrix according to its definition.38 A straightforward computation scheme is presented in Figure 1.
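The computation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the MetaboGroupS implementation (which is written in R): the function names are hypothetical, the PCA scores are taken as given, and the binning rule (an equal-width histogram of pairwise distances with as many bins as samples) is an assumption, because the text does not state how the bins are formed. The shrinkage step uses a James-Stein-type frequency estimator with a uniform target.37

```python
import math
from itertools import combinations


def pairwise_distances(scores):
    """Euclidean distances between all pairs of PCA score vectors."""
    return [math.dist(a, b) for a, b in combinations(scores, 2)]


def bin_counts(values, n_bins):
    """Equal-width histogram counts (the binning rule is an assumption)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against identical values
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum
        counts[idx] += 1
    return counts


def shrinkage_entropy(counts):
    """Plug-in entropy of shrinkage-estimated bin frequencies (natural log)."""
    n = sum(counts)
    p = len(counts)
    target = 1.0 / p  # uniform shrinkage target
    ml = [c / n for c in counts]  # maximum-likelihood frequencies
    misfit = sum((f - target) ** 2 for f in ml)
    if n <= 1 or misfit == 0:
        lam = 1.0  # frequencies already equal the target
    else:
        lam = (1.0 - sum(f * f for f in ml)) / ((n - 1) * misfit)
        lam = max(0.0, min(1.0, lam))  # shrinkage intensity in [0, 1]
    shrunk = [lam * target + (1.0 - lam) * f for f in ml]
    return -sum(f * math.log(f) for f in shrunk if f > 0)


def group_entropy(scores):
    """Entropy of one group from its PCA scores (one vector per sample)."""
    distances = pairwise_distances(scores)
    return shrinkage_entropy(bin_counts(distances, len(scores)))


# Hypothetical scores of four samples on the first two principal components.
example_scores = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
example_entropy = group_entropy(example_scores)
```

With g bins the entropy cannot exceed ln g, which is reached when all bins are equally populated, so tightly and evenly clustered groups and widely scattered groups give different values under this scheme.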
Figure 4. Coefficient of variation of entropy in QC samples with respect to the other groups for no normalization (Original) and the seven normalization methods. The minimum is marked with a diamond symbol.
RESULTS AND DISCUSSION

Pre-Treatment of Entire Datasets. All preliminary peak information (intensities, m/z, RT) was extracted for the four groups of samples with Progenesis QI software. The data contained a total of 11027 peaks (Table S3). However, the raw intensity data cannot be processed directly because of missing values (NAs) and skewness,39 both of which should be inspected in advance. Missing values carry no valid information and frequently break the computational procedure, so features with a high rate of NAs should be deleted. In this work, we converted all 0s to NAs and then counted the NAs in each sample (Figure S2A) and for each feature (Figure S2B). The maximum rate of missing values per sample was approximately 49%, which indicates that the quantification of these clinical samples was acceptable. Subsequently, features with missing-value rates above 0.5 and CV above 0.3 (Figure S3) were removed, both cutoffs being arbitrary choices, and the remaining data, containing 5308 features (Table S4), were imputed with the KNN algorithm.35 Skewness usually makes data deviate from a normal distribution and thus affects the accuracy of subsequent computation;40 logarithmic transformation is commonly used to remove or reduce this influence in biomedical and psychosocial data analysis.41 Two boxplots illustrating the difference before and after transformation are shown in Figure S4. The skewness range across samples in each group (Table 1) revealed that the distribution of raw intensities was right-skewed, while the log2-transformed data were far less skewed than the original. For the two published datasets, the same manipulation and control standards were applied (data not shown). After these two basic steps, data quality generally meets the fundamental requirements and users can proceed to the following analysis.

Normalization Performance. The seven common normalization methods mentioned above were applied to the log2-transformed data. Normalized data should have a center value (median or mean) close to a constant and low variation around this center value within or across groups of samples. Relative log abundance (RLA) plots25 (Figure 2) were used to detect unwanted variation under the different conditions and to preliminarily evaluate the effectiveness of these methods. We therefore inspected the overall distribution of the data within each group. As illustrated in Figure 2A, the RLA plot without normalization shows that the within-group samples fluctuated because of uncontrollable experimental operations. The situation improved somewhat after normalization, with RUV-random and QC-RLSC appearing to perform better: RUV-random attempts to remove variation not of interest based on a linear mixed effects model, and QC-RLSC minimizes the impact of experimental or biological variation with a locally estimated scatterplot smoothing function. However, the RLA plots are still difficult to assess intuitively and are not precise enough to evaluate the suitability of different normalization methods.
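The RLA check described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, median normalization is taken here to mean subtracting each sample's median on the log scale, and RLA is taken as each feature's log abundance minus that feature's median across the samples considered. The numbers are made-up log2 intensities, not data from this study.

```python
from statistics import median


def median_normalize(log_data):
    """Subtract each sample's own median from its log-scale intensities."""
    return [[x - median(sample) for x in sample] for sample in log_data]


def rla(log_data):
    """Relative log abundance: remove each feature's median across samples."""
    n_features = len(log_data[0])
    feature_medians = [
        median(sample[j] for sample in log_data) for j in range(n_features)
    ]
    return [
        [sample[j] - feature_medians[j] for j in range(n_features)]
        for sample in log_data
    ]


# Rows are samples, columns are features (hypothetical log2 intensities).
# The samples differ only by a constant offset, e.g. a dilution effect.
log2_data = [
    [10.0, 12.0, 14.0],
    [11.0, 13.0, 15.0],
    [10.5, 12.5, 14.5],
]

# Per-sample RLA medians before normalization show the offsets.
rla_before = [median(row) for row in rla(log2_data)]

# After median normalization the RLA values are centered on zero.
rla_after = [median(row) for row in rla(median_normalize(log2_data))]
```

For a within-group RLA plot, `rla` would be applied to the samples of each group separately; for an across-group plot it would be applied to all samples at once. Boxplots of each returned row correspond to the panels of an RLA plot.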
Score plots of the first two or three principal components in PCA are also frequently used to visualize the efficiency of normalization methods, because they summarize the whole dataset and show the data groupings before and after normalization.42 Figure 3 (also Figures S5 and S6) displays the 2-dimensional PCA score plots for our experimental data and the two published datasets using different normalization methods. As Figure 3 illustrates, most of these approaches were ineffective on the complicated clinical sample data and indistinguishable from no normalization, except for EigenMS and QC-RLSC. After QC-RLSC normalization, the QC samples were clustered more tightly and the experimental group samples were spread apart from each other, although some samples still overlapped, which likely reflects the complexity of clinical samples. Correspondingly, the results for MTBLS79 (Figure S5) and the NMR data (Figure S6) showed similarly complex PCA score plots, which reminds analysts to choose a normalization method deliberately and carefully.

Method Selection Based on Group Entropy. For more accurate and convenient selection among normalization methods (rather than relying on visual judgment), we calculated the entropy of each sample group from the PCA score distance matrix with a James-Stein-type shrinkage estimator in MetaboGroupS. The group entropy takes the distribution and variation of samples within and across groups into consideration simultaneously. The lower the entropy within each group and the higher the entropy across groups, the better the model. Consequently, the normalization methods that removed unwanted analytical variation and retained the essential
ones of interest may outperform the other methods. However, this is not absolute and depends on the practical complexity of the experimental datasets. Table 2 summarizes the computed entropy of each group. The entropies from the Median, Standard, VSN and RUV-random methods were similar to those without normalization, especially in QC samples, which reflects inadequate correction of unwanted signal variation. By contrast, the entropy of QC samples after QC-RLSC normalization was the minimum (2.063) and, moreover, the score centers of the first two principal components from the QC-RLSC-normalized data were much further apart between sample groups than those from the other methods (Table S5). In addition, the CV of entropy in QC samples with respect to the other groups can provide a comprehensive method recommendation, as illustrated in Figure 4 and Figure S7. The minimal CV, obtained with the QC-RLSC method for our experimental dataset, is marked with a diamond symbol, which prompted us to choose QC-RLSC normalization for subsequent analysis of this clinical dataset. For the two public datasets, the recommended normalization methods were also presented to users (Figure S7); the optimal methods were not always the same, which further indicates that there is no single gold-standard method for all kinds of datasets and that users should choose a normalization method on the basis of their own data. Notably, the whole computation for the largest of the three test datasets was completed in approximately 5 minutes (Table S6), which shows that this tool is not time-consuming.
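The QC comparison drawn from Table 2 can be reproduced with a few lines of Python. This sketch covers only the QC-entropy ranking; it does not reproduce the CV-based recommendation of Figure 4, whose exact formula is not given in the text. The variable names are illustrative, and the values are copied from the QC column of Table 2.

```python
# QC-group entropies per normalization method, copied from Table 2.
qc_entropy = {
    "None": 2.303,
    "Median": 2.303,
    "Standard": 2.303,
    "VSN": 2.303,
    "RUV-random": 2.297,
    "SVR": 2.303,
    "EigenMS": 2.257,
    "QC-RLSC": 2.063,
}

# Rank methods from the lowest to the highest QC entropy; ties keep
# their order of appearance because sorted() is stable.
ranking = sorted(qc_entropy, key=qc_entropy.get)

# The method with the tightest QC cluster under this single criterion.
lowest_qc_method = ranking[0]
```

Under this single criterion the three lowest QC entropies belong to QC-RLSC, EigenMS and RUV-random, in that order, which agrees with the observation that the remaining methods leave the QC entropy unchanged from no normalization.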
CONCLUSIONS

In this work, we developed a free, user-friendly and powerful web platform, MetaboGroupS, that automatically selects an appropriate normalization method on the basis of group entropies for LC-MS/MS-based metabolomics data as well as NMR data.43 The group entropy computation reveals the differences between, and the effectiveness of, various normalization methods on complicated sample data, especially clinical data. Additionally, other omics data such as proteomics and genomics can be analyzed in the same way with this software because of the similarity of their data structures and complexity. Overall, MetaboGroupS is easy to use and time-saving for scientists or clinicians who are not omics specialists, and it is worth promoting for applications such as drug discovery and biomarker identification.
ASSOCIATED CONTENT Supporting Information The Supporting Information is available free of charge on the ACS Publications website. Additional information as mentioned in the text (PDF).
AUTHOR INFORMATION Corresponding Author *Email address:
[email protected].
Author Contributions
§These authors (Shisheng Wang and Xiaolei Chen) contributed equally. All authors have given approval to the final version of the manuscript.
Notes The authors declare no competing financial interest.
ACKNOWLEDGMENTS This work was supported by the Science and Technology Department of Sichuan Province (No. 2017HH0036 and No. 2018HH0028) and the National Natural Science Foundation of China (Grant No. 81102366). Particularly, we thank Dr. Chenpin Shen for sponsoring and configuring the spare network server.
REFERENCES
(1) Zhao, X.; Zeng, Z.; Chen, A.; Lu, X.; Zhao, C.; Hu, C.; Zhou, L.; Liu, X.; Wang, X.; Hou, X.; Ye, Y.; Xu, G. Anal Chem 2018. DOI: 10.1021/acs.analchem.8b01482.
(2) Weckwerth, W. Annual review of plant biology 2003, 54, 669-689.
(3) Gu, H.; Carroll, P. A.; Du, J.; Zhu, J.; Neto, F. C.; Eisenman, R. N.; Raftery, D. Angewandte Chemie International Edition 2016, 55, 15646-15650.
(4) Mikolas, E.; Kun, S.; Laczy, B.; Molnar, G. A.; Selley, E.; Koszegi, T.; Wittmann, I. Kidney & blood pressure research 2013, 38, 217-225.
(5) Fessenden, M. Nature 2016, 540, 153-155.
(6) Dunn, W. B.; Broadhurst, D.; Begley, P.; Zelena, E.; Francis-McIntyre, S.; Anderson, N.; Brown, M.; Knowles, J. D.; Halsall, A.; Haselden, J. N.; Nicholls, A. W.; Wilson, I. D.; Kell, D. B.; Goodacre, R.; Human Serum Metabolome Consortium. Nat Protoc 2011, 6, 1060-1083.
(7) Chen, J.; Zhang, P.; Lv, M.; Guo, H.; Huang, Y.; Zhang, Z.; Xu, F. Anal Chem 2017, 89, 5342-5348.
(8) Cambiaghi, A.; Ferrario, M.; Masseroli, M. Brief Bioinform 2017, 18, 498-510.
(9) Wehrens, R.; Hageman, J. A.; van Eeuwijk, F.; Kooke, R.; Flood, P. J.; Wijnker, E.; Keurentjes, J. J.; Lommen, A.; van Eekelen, H. D.; Hall, R. D.; Mumm, R.; de Vos, R. C. Metabolomics : Official journal of the Metabolomic Society 2016, 12, 88.
(10) Sanchez-Illana, A.; Pineiro-Ramos, J. D.; Sanjuan-Herraez, J. D.; Vento, M.; Quintas, G.; Kuligowski, J. Analytica chimica acta 2018, 1019, 38-48.
(11) Gagnebin, Y.; Tonoli, D.; Lescuyer, P.; Ponte, B.; de Seigneux, S.; Martin, P. Y.; Schappler, J.; Boccard, J.; Rudaz, S. Analytica chimica acta 2017, 955, 27-35.
(12) De Livera, A. M.; Dias, D. A.; De Souza, D.; Rupasinghe, T.; Pyke, J.; Tull, D.; Roessner, U.; McConville, M.; Speed, T. P. Anal Chem 2012, 84, 10768-10776.
(13) Veselkov, K. A.; Vingara, L. K.; Masson, P.; Robinette, S. L.; Want, E.; Li, J. V.; Barton, R. H.; Boursier-Neyret, C.; Walther, B.; Ebbels, T. M.; Pelczer, I.; Holmes, E.; Lindon, J. C.; Nicholson, J. K. Anal Chem 2011, 83, 5864-5872.
(14) Shen, X. T.; Gong, X. Y.; Cai, Y. P.; Guo, Y.; Tu, J.; Li, H.; Zhang, T.; Wang, J. L.; Xue, F. Z.; Zhu, Z. J. Metabolomics : Official journal of the Metabolomic Society 2016, 12. DOI: 10.1007/s11306-016-1026-5.
(15) Dudzik, D.; Barbas-Bernardos, C.; Garcia, A.; Barbas, C. Journal of pharmaceutical and biomedical analysis 2018, 147, 149-173.
(16) Li, B.; Tang, J.; Yang, Q.; Li, S.; Cui, X.; Li, Y.; Chen, Y.; Xue, W.; Li, X.; Zhu, F. Nucleic acids research 2017, 45, W162-W170.
(17) Manimaran, S.; Selby, H. M.; Okrah, K.; Ruberman, C.; Leek, J. T.; Quackenbush, J.; Haibe-Kains, B.; Bravo, H. C.; Johnson, W. E. Bioinformatics 2016, 32, 3836-3838.
(18) Xia, J.; Wishart, D. S. Nat Protoc 2011, 6, 743-760.
(19) Gilson, M.; Kouvaris, N. E.; Deco, G.; Zamora-Lopez, G. Physical review. E 2018, 97, 052301.
(20) Nemenman, I.; Bialek, W.; de Ruyter van Steveninck, R. Physical review. E, Statistical, nonlinear, and soft matter physics 2004, 69, 056111.
(21) Saha, A.; Lahiri, S.; Jayannavar, A. M. Physical review. E, Statistical, nonlinear, and soft matter physics 2009, 80, 011117.
(22) Truong, G. W.; Anstie, J. D.; May, E. F.; Stace, T. M.; Luiten, A. N. Nature communications 2015, 6, 8345.
(23) Delongchamp, R. R.; Velasco, C.; Razzaghi, M.; Harris, A.; Casciano, D. DNA and cell biology 2004, 23, 653-659.
(24) Boysen, A. K.; Heal, K. R.; Carlson, L. T.; Ingalls, A. E. Anal Chem 2018, 90, 1363-1369.
(25) De Livera, A. M.; Sysi-Aho, M.; Jacob, L.; Gagnon-Bartsch, J. A.; Castillo, S.; Simpson, J. A.; Speed, T. P. Anal Chem 2015, 87, 3606-3615.
(26) Karpievitch, Y. V.; Nikolic, S. B.; Wilson, R.; Sharman, J. E.; Edwards, L. M. PloS one 2014, 9, e116221.
(27) Ihaka, R.; Gentleman, R. Journal of computational and graphical statistics 1996, 5, 299-314.
(28) Chang, W.; Cheng, J.; Allaire, J.; Xie, Y.; McPherson, J. R package version 0.11 2015, 1, 106.
(29) Nikolic, S. B.; Wilson, R.; Hare, J. L.; Adams, M. J.; Edwards, L. M.; Sharman, J. E. Metabolomics : Official journal of the Metabolomic Society 2014, 10, 105-113.
(30) Sangster, T.; Major, H.; Plumb, R.; Wilson, A. J.; Wilson, I. D. The Analyst 2006, 131, 1075-1078.
(31) Kirwan, J. A.; Weber, R. J.; Broadhurst, D. I.; Viant, M. R. Scientific data 2014, 1, 140012.
(32) Larsen, F. H.; van den Berg, F.; Engelsen, S. B. Journal of Chemometrics: A Journal of the Chemometrics Society 2006, 20, 198-208.
(33) Vu, T. N.; Valkenborg, D.; Smets, K.; Verwaest, K. A.; Dommisse, R.; Lemiere, F.; Verschoren, A.; Goethals, B.; Laukens, K. Bmc Bioinformatics 2011, 12, 405.
(34) Lu, H.; Liang, Y.; Dunn, W. B.; Shen, H.; Kell, D. B. TrAC Trends in Analytical Chemistry 2008, 27, 215-227.
(35) Zhang, S. Journal of Systems and Software 2012, 85, 2541-2552.
(36) Paninski, L. Neural computation 2003, 15, 1191-1253.
(37) Hausser, J.; Strimmer, K. Journal of Machine Learning Research 2009, 10, 1469-1484.
(38) Deza, M. M.; Deza, E. In Encyclopedia of Distances; Springer, 2009, pp 1-583.
(39) Little, R. J. Journal of the American Statistical Association 1988, 83, 1198-1202.
(40) Cheung, D. W.; Lee, S. D.; Xiao, Y. IEEE Transactions on Knowledge and Data Engineering 2002, 14, 498-514.
(41) Altman, D. G.; Bland, J. M. Bmj 1996, 313, 1200. DOI: 10.1136/bmj.313.7066.1200.
(42) Wen, B.; Mei, Z.; Zeng, C.; Liu, S. Bmc Bioinformatics 2017, 18, 183.
(43) Pan, Z.; Gu, H.; Talaty, N.; Chen, H.; Shanaiah, N.; Hainline, B. E.; Cooks, R. G.; Raftery, D. Analytical and bioanalytical chemistry 2007, 387, 539-549.