
Article

Chemical Shifts to Metabolic Pathways: Identifying Metabolic Pathways Directly from a Single 2D NMR Spectrum

Abhinav Dubey, Annapoorni Rangarajan, Debnath Pal, and Hanudatta Sastry Atreya

Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.5b03082 • Publication Date (Web): 10 Nov 2015

Downloaded from http://pubs.acs.org on November 14, 2015


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Chemical Shifts to Metabolic Pathways: Identifying Metabolic Pathways Directly from a Single 2D NMR Spectrum

Abhinav Dubey1,2, Annapoorni Rangarajan3, Debnath Pal1,4,*, Hanudatta S. Atreya2,5,*

1 IISc Mathematics Initiative, Indian Institute of Science, Bangalore 560012, India

2 NMR Research Centre, Indian Institute of Science, Bangalore 560012, India

3 Molecular Reproduction, Development and Genetics, Indian Institute of Science, Bangalore 560012, India

4 Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore 560012, India

5 Solid State and Structural Chemistry Unit, Indian Institute of Science, Bangalore 560012, India

* Corresponding authors: Debnath Pal [[email protected]] Hanudatta S. Atreya [[email protected]]


Abstract

Identifying cellular processes in terms of metabolic pathways is one of the avowed goals of metabolomics studies. Currently this is done after the relevant metabolites are identified, to allow their mapping onto specific pathways. This task is daunting due to the complex nature of cellular processes and the difficulty in establishing the identity of individual metabolites. We propose here a new method, ChemSMP (Chemical Shifts to Metabolic Pathways), which facilitates rapid analysis by identifying the active metabolic pathways directly from chemical shifts obtained from a single two-dimensional (2D) [13C-1H] correlation NMR spectrum, without the need for identification and assignment of individual metabolites. ChemSMP uses a novel indexing and scoring system comprising a 'uniqueness score' and a 'coverage score'. Our method is demonstrated on metabolic pathway data from the Small Molecule Pathway Database (SMPDB) and chemical shifts from the Human Metabolome Database (HMDB). Benchmarks show that ChemSMP has a positive prediction rate of >90% in the presence of de-cluttered data and can sustain a rate of 60-70% even in the presence of noise such as deletions of peaks and chemical shift deviations. The method, tested on NMR data acquired for a mixture of 20 amino acids, shows a success rate of 93% in the correct recovery of pathways. When used on data obtained from the cell lysate of an unexplored oncogenic cell line, it revealed active metabolic pathways responsible for regulating the energy homeostasis of cancer cells. Our tool is thus expected to significantly enhance the analysis of NMR-based metabolomics data by reducing existing impediments.


Introduction

Metabolomics studies investigate biological processes at the molecular level1-4. Two commonly used analytical techniques for this purpose are mass spectrometry and nuclear magnetic resonance (NMR) spectroscopy 5. A common prerequisite for metabolomics studies is the identification and assignment of metabolites prior to mapping them to pathways6,7. This basic premise is implemented in some of the current software, such as PAPi 8 and MetPA 9. PAPi compares metabolic pathway activity across different samples by calculating an activity score. MetPA combines the complementary approaches of pathway topology, pathway enrichment analysis and statistical methods to rank metabolic pathways according to significance. However, both these methods require metabolites to be identified first and provided as input.

Identification of metabolites from chemical shifts has several challenges. First, in steady state certain metabolites in a metabolic pathway accumulate while others are present below the detection limit of NMR, giving rise to missing peaks in the spectra. Second, the chemical shift databases currently have non-uniform coverage of chemical shift data for metabolic pathways. Both these bottlenecks impede metabolite assignment and the consequent mapping to pathways. We propose here a new method, ChemSMP (Chemical Shifts to Metabolic Pathways), which identifies putative metabolic pathways active in the system directly from a single two-dimensional (2D) [13C, 1H] correlation NMR spectrum, without recourse to individual metabolite assignments. Thus, ChemSMP offers the distinct advantage of facilitating rapid analysis of spectra in the context of metabolic pathways, skipping the rate-limiting and potentially error-prone step of metabolite assignment.

In NMR-based metabolomics studies, a 2D [13C, 1H] Heteronuclear Single Quantum Coherence (HSQC)10 NMR experiment is typically recorded, which provides information on 13C and 1H chemical shift correlations with the added benefit of high chemical shift dispersion in the 13C dimension 11. Although 2D NMR experiments are less sensitive than 1D experiments, the additional information available from 1H-13C cross peaks is invaluable for spectral annotation/quantitation, avoiding the peak overlaps that usually clutter the 1D spectrum. Recent technological advances such as the development of cryogenically cooled probes12, microprobes 13, isotope enrichment and non-uniform sampling 14 can largely mitigate the sensitivity issues of 2D experiments 15, and newer technological developments are expected to eliminate the problem in times to come.

Using chemical shifts obtained from the 2D [13C, 1H] HSQC spectrum, we have devised a novel scoring system which avoids the bottlenecks described above to directly map chemical shifts to metabolic pathways. The underlying principle of ChemSMP is to first assume that all metabolic pathways are equally present for shared metabolites and then to rank them in order of the amount of evidence available for each pathway from the 2D spectrum. This approach avoids the time-consuming process of identifying individual metabolites, thereby speeding up the analysis. Our pipeline is the first method that demonstrates mapping of metabolic pathways directly from chemical shift values. We demonstrate the efficacy of ChemSMP through (i) simulations, (ii) tests conducted on an experimental sample comprising a mixture of 20 amino acids and (iii) lysate of a cancer cell line grown in media containing unlabeled or 13C-labeled glucose.

Materials and Methods

Metabolic Pathways data

Pathway data comprising the metabolites involved were downloaded from the Small Molecule Pathway Database v2.0 (SMPDB) 16,17. A total of 91 metabolic pathways (listed in Table S1 of Supporting Information, out of 93 archived in the database) were used, and the availability of the corresponding 2D [13C, 1H] chemical shifts of metabolites in the Human Metabolome Database v3.6 (HMDB) 18-20 was noted. This version of SMPDB had 315 metabolites for which chemical shift information is available in HMDB. Among the 91 pathways, ten have more than 50 metabolites listed. Further, many metabolites are shared among the pathways. For instance, 32 metabolites are common to more than 5 pathways. However, there are 182 metabolites distributed among 57 pathways (Table S1) that each belong to only one pathway and hence act as potential unique markers.

An overview of all the pathway and metabolite information is shown in Figure 1. The rows of the matrix (y-axis) are metabolic pathways arranged in decreasing order of the number of metabolites involved in them. The columns (x-axis) are the metabolites whose chemical shifts are available in HMDB. Thus, the (i, j)th entry of the matrix indicates whether the jth metabolite is present in the ith metabolic pathway. Above the matrix, a bar graph depicts the percentage of pathways to which a given metabolite belongs. The bar graph on the right-hand side of the matrix depicts the percentage of metabolites of a given pathway whose chemical shifts are available in HMDB. This plot clearly shows that the majority of pathways currently have chemical shift data available for less than 50% of their metabolites.
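The pathway-by-metabolite matrix and the notion of a unique marker can be illustrated with a minimal sketch. The dictionary `pathways` below is toy data with made-up pathway and metabolite names (it is not SMPDB content), and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical toy data: pathway name -> metabolites with known chemical shifts.
pathways = {
    "P1": {"m1", "m2", "m3"},
    "P2": {"m2", "m4"},
    "P3": {"m2", "m3", "m5"},
}

def membership_matrix(pathways):
    """Rows are pathways (largest first), columns are metabolites, entries are 0/1."""
    rows = sorted(pathways, key=lambda p: (-len(pathways[p]), p))
    cols = sorted(set().union(*pathways.values()))
    matrix = [[int(m in pathways[p]) for m in cols] for p in rows]
    return rows, cols, matrix

def unique_markers(pathways):
    """Metabolites that belong to exactly one pathway."""
    counts = Counter(m for members in pathways.values() for m in members)
    return {m for m, n in counts.items() if n == 1}

rows, cols, matrix = membership_matrix(pathways)
markers = unique_markers(pathways)
```

In this toy case `m2` is shared by all three pathways, while `m1`, `m4` and `m5` are unique markers of a single pathway each.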


Figure 1: Uniqueness and availability of chemical shifts of metabolites in HMDB across 91 metabolic pathways from SMPDB. The upper bar plot shows the percentage of pathways in which a given metabolite is found. This graph shows that a large number of metabolites are shared/common among the different pathways. The right bar plot illustrates that the current chemical shifts coverage of pathways in HMDB is less than 50% for majority of the metabolic pathways.

Amino acids mixture

A mixture of 20 amino acids (99% pure, purchased from HiMedia) was used for experimental validation of our method; the individual amino acids were mixed together to a final concentration of ~2 mM each in 100% 2H2O. The list of the 20 amino acids and their corresponding HMDB identity numbers is provided in Table S2 (Supporting Information).

Cancer cell lysate

Transformed cell lines from human foreskin fibroblasts carrying hTERT, LT antigen, ST antigen and Ras (+ST) 21 were used as a source of cancer cells. These cancer cell lines were cultured in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal bovine serum. For the 13C-labeled sample, cells were grown in DMEM (without glucose) supplemented exogenously with 13C-labeled glucose. The +ST cells were grown to ~3 million in a 60 mm culture dish. After removing the media from the culture dish, the cells were quenched in liquid N2 to stop metabolic activity. The cells were then scraped in 600 µl of CD3OD:D2O (4:1) mixture 22. Subsequently, the cells were lysed by sonication and centrifuged at 12000 rpm for 15 minutes, and the metabolites were collected from the supernatant of the CD3OD:D2O mixture.

NMR experiments

A 2D [13C, 1H] HSQC spectrum was acquired at 298 K for all three samples: (i) the amino acid mixture, (ii) +ST cell lysate with metabolites at natural abundance of 13C and (iii) 13C-labeled +ST cell lysate. The acquisition (NMR) and processing (NMR and ChemSMP) parameters used in all experiments are provided in Table S3 (Supporting Information). The spectra for the amino acid mixture and the +ST cell lysate containing 13C-enriched metabolites were acquired on a Bruker Avance NMR spectrometer operating at a 1H resonance frequency of 700 MHz and equipped with a cryogenic probe. The spectrum of the +ST cell lysate at natural abundance of 13C was acquired on a Bruker Avance NMR spectrometer operating at a 1H resonance frequency of 800 MHz and equipped with a cryogenic probe.

ChemSMP: Theory

The working of ChemSMP can be divided into two stages: (a) scoring the chemical shifts of the metabolites on the database side, based on the frequency of their occurrence in the different pathways, and (b) comparing the database-indexed shifts with the experimentally observed chemical shifts to compute scores for the possible association with metabolic pathways. In the first stage, the 2D [13C, 1H] chemical shifts of the metabolites of a pathway obtained from HMDB are indexed. Indexing involves dividing the 2D [13C, 1H] map into grids using pre-defined chemical shift limits and bin sizes for the 1H and 13C dimensions (Figure 2).
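The indexing step can be sketched as follows. This is a minimal illustration rather than the ChemSMP implementation: the parameter names follow the symbols of Figure 2 (hl, hu, ∆h, cl, cu, ∆c), but the chemical shift limits, the row-major 1-based numbering scheme and the function names are assumptions made here.

```python
import math

def make_grid(hl, hu, dh, cl, cu, dc):
    """Return a function mapping a (1H, 13C) shift pair to a grid index, or None."""
    n_h = int(round((hu - hl) / dh))  # number of bins along the 1H dimension
    n_c = int(round((cu - cl) / dc))  # number of bins along the 13C dimension

    def shift_to_index(h, c):
        if not (hl <= h < hu and cl <= c < cu):
            return None  # outside the defined region: the shift is not indexed
        col = min(int(math.floor((h - hl) / dh)), n_h - 1)
        row = min(int(math.floor((c - cl) / dc)), n_c - 1)
        return row * n_h + col + 1  # row-major, 1-based numbering (assumed scheme)

    return shift_to_index

# Assumed limits in ppm; the bin sizes lie within the ranges quoted in the text.
to_index = make_grid(hl=0.0, hu=10.0, dh=0.04, cl=0.0, cu=200.0, dc=0.4)
```

Two shifts that fall in the same bin, for example (3.05, 56.1) and (3.07, 56.3) ppm with the settings above, receive the same index, whereas a shift outside the limits returns `None`.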


Figure 2: Schematic of the algorithm behind ChemSMP. (a) Three hypothetical pathways involving seven 2D [13C, 1H] chemical shifts (labeled s1-s7). In this example, the three pathways share some of the chemical shifts. For example, pathways A and B have s2 and s4 in common. (b) The chemical shifts s1-s7 are indexed to the 2D grid, created using the defined parameters hl (lower limit on 1H chemical shift), hu (upper limit on 1H chemical shift), ∆h (bin size in the 1H dimension), cl (lower limit on 13C chemical shift), cu (upper limit on 13C chemical shift) and ∆c (bin size in the 13C dimension). Typically, an adjustable bin size of 0.03-0.05 ppm in 1H and 0.3-0.5 ppm in 13C is considered. Chemical shifts s4 and s5 share the same index number, and s7 is not indexed as it lies outside the defined region. (c) The indices are scored based on their frequency of occurrence in different pathways and later normalized such that the sum of all index scores of a given pathway is 1.0. A detailed derivation of the index scores is given in Text S1 (Supporting Information). For example, s1 belongs to index 2 and is shared by two pathways, A and C. Therefore, this index gets a score of 1/2. This score is normalized separately for pathways A and C. For normalization, the index score of a pathway is divided by the sum of all index scores for that pathway. All index scores are used for computing the coverage (PC) score. Index 9 is present in only one pathway and therefore it also contributes to the uniqueness (PU) score for pathway A. (d) An example of a 2D [13C, 1H] HSQC spectrum containing three peaks, which are indexed using the same criteria as used for indexing the chemical shifts of pathways in the database (shown in b). (e) Output of ChemSMP upon using the HSQC shown in (d) as input. The PC and PU scores are computed for pathways A, B and C using the values shown in (c). For example, indices 9 and 27 have peaks which point to pathway A, and hence its PC score is 3/7 + 3/14 = 9/14 ≈ 64.29%. The three pathways are then sorted based on their respective PC scores.

Typically, an adjustable grid size of 0.03-0.05 ppm in 1H and 0.3-0.5 ppm in 13C is considered. Each grid cell gets a unique number. The chemical shifts falling in a particular grid cell are assigned the corresponding index number; the indices derived from the database are called 'query indices'. In this manner, a 2D chemical shift coordinate is converted into an index number. Note that a given grid cell or index can point to chemical shifts of more than one metabolite from the same or different pathways (due to chemical shift overlap within the grid size). Based on this, an 'index score' is calculated as the reciprocal of the number of distinct pathways to which a given index corresponds. Thus, an index belonging to a single pathway gets the maximum score of 1 and one belonging to all pathways in SMPDB gets a score of 1/91. A pathway thus becomes a collection of indices and corresponding index scores. Next, these scores are normalized for each pathway such that the sum of all index scores of its metabolites is 1. The normalization ensures that a less represented pathway (i.e., one with fewer metabolites or less chemical shift information) is not penalized; without it, the score sum of such a pathway would be lower simply because fewer indices are involved. The process of indexing and score computation is illustrated in Figure 2 and Text S1 (Supporting Information) using an example of three hypothetical pathways. An index may also be unique to a given pathway, with no other pathway having chemical shifts of its metabolites in that grid cell. This gives a 'pathway uniqueness' (PU) score, which is the total number of query indices unique to that pathway.

In the next stage, the experimental peak-list obtained from the pre-calibrated 2D spectrum is converted into target indices using the method described above. Each target index (or grid cell) either has or does not have a peak within the specified grid size (tolerance) (Figure 2(d)). The target indices having peaks are then matched against the query indices (obtained from the database). If a target index matches a query index, this is taken as evidence that the pathway specified by the query index is present in the sample. For each such match, the corresponding index score is added to the 'pathway coverage' (PC) score of the corresponding pathway. The PC score of each pathway is therefore the sum of all its index scores matched by the experimental data. Thus, the more target indices match the query indices of a given pathway, the higher its PC score, which in turn reflects greater confidence that the pathway is present. Further, some target indices may correspond to a single pathway only. The number of such target indices increases the confidence that the particular pathway is observed and is called the 'pathway uniqueness' (PU) score, which is also computed. Thus, a list of pathways along with their PC and PU scores is created and sorted in descending order of PC score.
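The scoring described above can be sketched in a few lines, using exact fractions so that the numbers can be checked by hand. This is an illustrative sketch, not the ChemSMP code. The assignment of indices to pathways A, B and C below is a hypothetical reconstruction chosen to reproduce the values quoted in the Figure 2 caption (3/7 and 3/14 for pathway A); the index numbers 14, 31 and 40 and the function names are invented for illustration.

```python
from collections import defaultdict
from fractions import Fraction

def build_index_scores(pathway_indices):
    """Return normalized index scores per pathway and the pathways owning each index."""
    owners = defaultdict(set)
    for pathway, indices in pathway_indices.items():
        for idx in indices:
            owners[idx].add(pathway)
    scores = {}
    for pathway, indices in pathway_indices.items():
        # Raw index score: reciprocal of the number of pathways sharing the index.
        raw = {idx: Fraction(1, len(owners[idx])) for idx in indices}
        total = sum(raw.values())
        # Normalize so that the scores of each pathway sum to 1.
        scores[pathway] = {idx: s / total for idx, s in raw.items()}
    return scores, owners

def score_pathways(observed, scores, owners):
    """Return (pathway, PC, PU) tuples sorted by descending PC score."""
    results = []
    for pathway, idx_scores in scores.items():
        hits = [idx for idx in idx_scores if idx in observed]
        pc = sum((idx_scores[idx] for idx in hits), Fraction(0))
        pu = sum(1 for idx in hits if len(owners[idx]) == 1)
        results.append((pathway, pc, pu))
    return sorted(results, key=lambda r: (-r[1], -r[2], r[0]))

# Hypothetical query indices for the three pathways (assumed, see lead-in).
pathway_indices = {"A": {2, 9, 14, 27}, "B": {14, 27}, "C": {2, 14, 31}}
scores, owners = build_index_scores(pathway_indices)

# Three observed peaks: indices 9 and 27, plus one bin (40) matching no pathway.
ranking = score_pathways({9, 27, 40}, scores, owners)
```

Under these assumptions pathway A obtains PC = 3/7 + 3/14 = 9/14 with PU = 1 (from index 9), pathway B obtains PC = 3/5 with PU = 0, and pathway C obtains PC = 0.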

Benchmarking

The 91 metabolic pathways from SMPDB, along with the 2D [13C, 1H] HSQC chemical shifts of the metabolites involved in these pathways, were used to create input peak-lists for the simulation studies. In our simulated datasets we mimicked problems encountered in experimental data, such as variability in the number of active pathways, missing peaks and non-systematic deviations in chemical shifts. For this, ten different types of simulation datasets (A-J) were created. In each dataset, 2 to 9 pathways were chosen randomly from the 91 metabolic pathways, and input chemical shift lists (peak-lists) were created mimicking their 2D [13C, 1H] HSQC spectrum. Datasets A-E had 0%, 10%, 20%, 30% and 40% of the chemical shifts randomly deleted from the peak-lists, respectively. Datasets F-J, along with 0-40% random peak deletions, had noise added in the form of chemical shift deviations. The noise consisted of a random number between 0.005 and 0.015 ppm added to or subtracted from all 1H chemical shift values and, similarly, a random number between 0.05 and 0.15 ppm added to or subtracted from all 13C chemical shift values. These values correspond to half the grid size described above. Each input dataset had 10,000 instances for statistical averaging.
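A minimal sketch of how one simulated peak-list could be perturbed is given below. It is not the authors' simulation code: the function name is hypothetical, and drawing an independent deviation and sign for every peak is an assumption, since the text does not state whether the deviation is drawn once per dataset or once per peak.

```python
import random

def perturb_peaks(peaks, deletion_fraction, add_noise, rng):
    """Randomly delete a fraction of (1H, 13C) peaks and optionally shift the rest."""
    n_keep = len(peaks) - int(round(deletion_fraction * len(peaks)))
    keep = sorted(rng.sample(range(len(peaks)), n_keep))  # preserve peak order
    kept = [peaks[i] for i in keep]
    if not add_noise:
        return kept
    noisy = []
    for h, c in kept:
        # Deviation magnitudes follow the ranges in the text; the sign is random.
        dh = rng.choice((-1, 1)) * rng.uniform(0.005, 0.015)
        dc = rng.choice((-1, 1)) * rng.uniform(0.05, 0.15)
        noisy.append((h + dh, c + dc))
    return noisy

rng = random.Random(0)  # fixed seed so the example is repeatable
example_peaks = [(1.0 + 0.5 * i, 20.0 + 5.0 * i) for i in range(10)]
dataset_d_like = perturb_peaks(example_peaks, 0.30, add_noise=False, rng=rng)
dataset_f_like = perturb_peaks(example_peaks, 0.00, add_noise=True, rng=rng)
```

Here `dataset_d_like` corresponds to the 30% deletion setting without deviations and `dataset_f_like` to the setting with deviations but no deletions.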

Results and Discussion

ChemSMP uses 2D [13C, 1H] chemical shifts for all its computations. Therefore, one can either directly use an NMR peak-list obtained from any source or provide processed NMR spectra (Figure 3a). For directly processing the NMR spectrum, the 'nmrglue' package for Python 23 is used to pick peaks and provide input to the program in the required format. Figure S2 (Supporting Information) shows experimental spectra and the peaks picked using ChemSMP. The results obtained from ChemSMP are explained in Figure 3b, 3c. The output contains putative metabolic pathways along with their names, p-values (Text S2 of Supporting Information), and PC and PU scores. To verify the statistical method we used the chemical shifts of 'Tyrosine Metabolism' (SMP00006) as input. As shown in Text S2, the correct metabolic pathway received a 100% PC score and the lowest p-value. Pathways such as 'Androgen and Estrogen Metabolism' (SMP00074), which share some chemical shifts with the correct pathway, got the lowest PC scores and the highest p-values. Thus, a low PC score implies less statistical significance. In the detailed output, the list of metabolites and the ratio of peaks observed to total peaks expected for each of them is provided for all the listed pathways. This is followed by two columns listing the indices and the cumulative peak volume (if available) corresponding to each index. The computed indices are reproducible (a given chemical shift always maps to the same index) as long as the input parameters, i.e., the upper limit, lower limit and bin size in both dimensions, are the same. Thus one can also compare two pathways from different samples by comparing the peak volumes of common indices generated using the same parameters.


Figure 3: ChemSMP format for input and output. (a) ChemSMP takes as input either a processed 2D [13C, 1H] HSQC spectrum or a peak-list. The output is presented in two parts. (b) The first part presents a summary of output by listing the name and SMPDB ID of the putative metabolic pathways, in decreasing order of the coverage (PC) score. (c) In the second part, each pathway listed in the first part is described further in detail. The expanded list is provided to make informed decision for a listed metabolic pathway. The index number and cumulative peak volume belonging to that index is listed below. This pair of index and peak volume can be used to make quantitative comparisons of pathway activity across different samples. Region of interest is defined by ChemSMP input parameters hl, hu, cl and cu.

ChemSMP was evaluated on SMPDB, wherein the chemical shifts of metabolites in different pathways are linked directly to HMDB. The 91 metabolic pathways were shortlisted based on the annotation provided in the database (Table S1 of Supporting Information). About 983 metabolites make up these 91 metabolic pathways, and chemical shift data for 315 of them are currently available in HMDB. As shown in Figure 1, chemical shift information is not uniformly available across the different pathways. For pathways such as 'Arachidonic Acid Metabolism' (SMP00075), chemical shifts of only 7% of the metabolites are currently available, whereas 'Methylhistidine Metabolism' (SMP00715) has chemical shifts available for 75% of its metabolites. Overall, chemical shift information is known for an average of 41% of the metabolites of a given pathway. We therefore normalized the coverage score for each pathway to add up to 1.0 to deal with this incomplete and non-uniform representation of pathways with respect to the chemical shift data. A further strength of the ChemSMP scoring system is that it gives a low score to ambiguous peaks (i.e., peaks belonging to several pathways) and a high score to peaks that uniquely belong to a given pathway. The presence of a non-redundant metabolite in the spectrum bears unique information about a particular metabolic pathway. This information can be exploited because a large number of such metabolites (182) belong to only one metabolic pathway. Our method makes use of a scoring system which exploits this non-redundant information imparted by the presence of unique metabolites but at the same time does not penalize pathways for their low representation in the database.

As a first check of ChemSMP, peak-lists comprising the precise chemical shifts from a single pathway were given as input. The output for each of the 91 inputs had the correct pathway identified with a coverage score of 100% (Figure S1 Supporting Information). For some inputs, incorrect pathways also received a 100% coverage score. This was because they comprise metabolites that form a subset of those in other pathways. For example, SMP00468 ('Degradation of Superoxides') has 10 metabolites, but chemical shifts of only one of them are currently available in HMDB. This is a common metabolite (NADP) which occurs in several other pathways, and hence SMP00468 is identified together with those pathways. This is in fact desirable because with such limited evidence we cannot rule out the presence of such pathways. For practical purposes, however, one can de-clutter the pathway results by removing common metabolites from the analysis in an iterative manner, as is commonly done when analyzing network graphs of metabolites 24,25 (e.g., excluding ATP, AMP, ADP, NAD, NADH, NADP and Coenzyme A). Simulation results with a single pathway as input and neglecting the less discriminatory metabolite NADP are shown in Figure S1 (b). Eliminating all common metabolites in one shot may not be the best strategy. Removing the common metabolites one at a time, iteratively, in descending order of their occurrence allows interactive evaluation of the de-cluttered results, which may allow a clear distinction between correctly and incorrectly identified pathways. The detailed output results that map metabolites to peaks (Figure 3c) are therefore very useful for this purpose.

In practice a sample will have metabolites from several different metabolic pathways that are simultaneously active. At the same time, due to the low concentration of the intermediate metabolites of a pathway in steady state, the spectra will have missing peaks. In addition there could be non-systematic deviations of chemical shifts from the database values. In order to test the performance of ChemSMP under these conditions, datasets (A-J) mimicking such conditions were created (described above). The output of ChemSMP for these test conditions is presented in Figure 4. The simulation results were analyzed using the positive prediction rate (PPR), expressed as a percentage and defined as follows. Considering the top 'N' pathways listed in the output, which are ranked by PC and PU score, the PPR is the fraction of these top 'N' pathways that are correct. For example, if 3 pathways are considered in a simulation, the PPR for that simulation is the percentage of correct pathways among the first 3 listed pathways. Variable numbers of pathways (2-9) were used to generate the input peak-lists for the simulations.
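The PPR definition above can be written as a short function. This is a minimal sketch with hypothetical names; it assumes the ranked output is available as a list of pathway identifiers and that, when N is not given, it defaults to the number of pathways used to build the input, as in the 3-pathway example in the text.

```python
def positive_prediction_rate(ranked_pathways, true_pathways, top_n=None):
    """Percentage of the top-N ranked pathways that are truly present."""
    if top_n is None:
        top_n = len(true_pathways)
    top = ranked_pathways[:top_n]
    if not top:
        return 0.0
    true_positives = sum(1 for p in top if p in true_pathways)
    # len(top) equals true positives plus false positives among the top N.
    return 100.0 * true_positives / len(top)

# Hypothetical pathway identifiers: three pathways were used to build the input.
ranked = ["P1", "P7", "P2", "P9"]
truth = {"P1", "P2", "P3"}
ppr_top3 = positive_prediction_rate(ranked, truth)
ppr_top1 = positive_prediction_rate(ranked, truth, top_n=1)
```

In this example two of the first three listed pathways are correct, so the PPR over the top 3 is two thirds (about 66.7%), while the PPR for the first position alone is 100%.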


Figure 4: Results of simulations to test ChemSMP. Ten input datasets (A-J) were prepared for the simulations. Five of them (top) had 0-40% of the peaks deleted randomly, and the remaining five (bottom) had, in addition to peak deletions, noise added in the form of chemical shift deviations. The positive prediction rate (PPR) is calculated as the ratio of true positives to the sum of true positives and false positives. The averaged positive prediction rate at each position is plotted against the number of output pathways considered, in decreasing order of coverage score. The '0%' dataset with no deviations shows a uniform decrease in average PPR with increasing rank in the output list. (a) Results obtained when redundant metabolites are taken into account. (b) Simulation results on neglecting the redundant metabolites (AMP, ADP, ATP, NAD, NADH, NADP and Coenzyme A), which have less discriminatory ability (being common metabolites present in many pathways). This increases the PPR% significantly, from ~40% to > 60%.
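The iterative removal of common metabolites described in the text can be sketched as below. This is an illustrative sketch only: the function name, the toy pathways and the stopping rule (stop once no metabolite is shared by two or more pathways) are assumptions, not part of ChemSMP.

```python
from collections import Counter

def declutter_steps(pathways, max_removed):
    """Yield (removed_metabolite, reduced_pathways) after each single removal."""
    current = {p: set(members) for p, members in pathways.items()}
    for _ in range(max_removed):
        counts = Counter(m for members in current.values() for m in members)
        if not counts:
            return
        # Most widely shared metabolite first; ties are broken alphabetically.
        metabolite, n = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        if n < 2:
            return  # no shared metabolite is left
        current = {p: members - {metabolite} for p, members in current.items()}
        yield metabolite, current

# Hypothetical toy pathways; only the cofactor names are taken from the text.
toy = {"P1": {"NADP", "ATP", "x"}, "P2": {"NADP", "ATP"}, "P3": {"NADP", "y"}}
steps = list(declutter_steps(toy, max_removed=5))
removed_order = [metabolite for metabolite, _ in steps]
```

After each yielded step the reduced pathway definitions can be re-indexed and re-scored, so the ranking can be inspected before deciding whether to remove the next common metabolite.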

We plotted the average PPR value at each rank position over 10,000 simulations. Three important results emerge from these simulations. First, when there is no noise in terms of peak deletions and chemical shift deviations, the average PPR value for the first output pathway (i.e., the one with the highest PC score) is > 90%. This implies that the first pathway listed in the output is almost never a false positive. However, in the presence of noise (i.e., with the addition of chemical shift deviations and peak deletions), the PPR value decreases. Second, the PPR value remains nearly the same whether one pathway or multiple pathways are considered to be present in the system. This is useful because the number of pathways present in a given system is not known a priori. The objective of ChemSMP is to enumerate all pathways that are supported by chemical shifts from the spectrum and have non-zero coverage scores, and it lists them in decreasing order of supporting evidence. The uniqueness score adds value by enhancing the confidence in a pathway. In general, a reasonable coverage (PC) score has to be ensured before the uniqueness score is used. Thus, ChemSMP performs equally well independent of the number of pathways present. Third, when all 315 metabolites are considered, the average PPR% reaches a value of almost 40% (Figure 4a). However, when we ignore the common metabolites (such as ATP, AMP, ADP, NAD, NADH, NADP and Coenzyme A), which have less discriminatory ability, the average PPR% reaches 70% (Figure 4b). False positives arise from metabolic pathways that are ranked higher owing to common metabolites shared with the correct metabolic pathway. This is resolved by neglecting the common metabolites with less discriminatory power, which results in a higher PPR%. As mentioned above, although eliminating common metabolites may not always be the best strategy, removing them may allow a correctly identified pathway to be distinguished from an incorrectly identified one.

We also evaluated the effect of the bin size in the 1H and 13C dimensions on the robustness of the method (measured as PPR% from the simulated datasets). The PPR% for each rank was plotted against the 36 combinations of bin sizes, varying from 0.01 to 0.06 ppm in steps of 0.01 ppm in the 1H dimension and from 0.1 to 0.6 ppm in steps of 0.1 ppm in the 13C dimension (Figure S3 of Supporting Information). There is a drastic drop in PPR% when the bin size in 1H or 13C is smaller than the deviation added as noise to the chemical shifts. For example, a 0.01 ppm bin size in 1H cannot accommodate random deviations chosen between 0.005 and 0.015 ppm in the 1H chemical shifts. The reduction in PPR% with a larger bin size (owing to the decrease in specificity) is much smaller than that observed when the bin size is smaller than the deviation of the chemical shifts in either dimension. The simulation results suggest that it is advantageous to keep the bin size slightly larger than the expected deviations in both the 1H and 13C dimensions.

We further tested ChemSMP on an experimental dataset obtained by recording a 2D [13C, 1H] HSQC spectrum of a mixture of 20 amino acids. The peak-list was calibrated for systematic deviations before being given as input. On analyzing the metabolites involved in the pathways, we found that 42 out of 91 pathways could be responsible for the presence of amino acids in the sample. ChemSMP correctly identified 39 (93%) of these pathways, along with 7 (17%) incorrect pathways, which are attributed to non-systematic chemical shift deviations. The parameters used for ChemSMP are provided in Table S3 (Supporting Information). We also compared the coverage scores obtained by ChemSMP for the experimental dataset against the coverage scores expected when the database chemical shift values of the amino acids are used in the input peak-list (Figure 5). Interestingly, there were marked differences between the theoretical and experimental coverage scores for some pathways. This is in part because certain amino acids have multiple chemical shift values in the database that are not observed when the metabolites are at low concentration or in a different environment²⁶. For example, four chemical shifts are present for histidine in the HMDB database ((4.0037, 57.2847) for the α carbon and (3.2941, 30.2042), (3.2688, 30.2042) and (3.2076, 30.0802) for the β carbon) within the 0 - 100 ppm spectral width of 13C. However, in the experiment we observed only two chemical shifts ((3.929, 56.469) for the α carbon and (3.243, 28.688) for the β carbon) (Figure S4 of Supporting Information).
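The size of the mismatch for histidine can be checked directly from the values quoted above. The following Python sketch is only an illustration: the variable and function names are hypothetical, and the matching rule in `fits` (a deviation is accommodated when it does not exceed the bin size in either dimension) is an assumption made for this example, not a description of how ChemSMP bins peaks.

```python
# Histidine (1H, 13C) chemical shifts in ppm, copied from the text above.
DATABASE = {
    "alpha": [(4.0037, 57.2847)],
    "beta": [(3.2941, 30.2042), (3.2688, 30.2042), (3.2076, 30.0802)],
}
OBSERVED = {"alpha": (3.929, 56.469), "beta": (3.243, 28.688)}


def deviations(database, observed):
    """Return (carbon, |d1H|, |d13C|) for every database peak."""
    rows = []
    for carbon, peaks in database.items():
        obs_h, obs_c = observed[carbon]
        for db_h, db_c in peaks:
            rows.append((carbon, abs(db_h - obs_h), abs(db_c - obs_c)))
    return rows


def fits(dev_h, dev_c, bin_h, bin_c):
    """Assumed rule: both deviations must not exceed the bin size."""
    return dev_h <= bin_h and dev_c <= bin_c


rows = deviations(DATABASE, OBSERVED)
for carbon, dev_h, dev_c in rows:
    print(f"{carbon}: d1H = {dev_h:.4f} ppm, d13C = {dev_c:.4f} ppm")

# Simulation example from the text: a 0.015 ppm 1H deviation
# against 1H bin sizes of 0.01 ppm and 0.02 ppm.
print(fits(0.015, 0.0, bin_h=0.01, bin_c=0.1))
print(fits(0.015, 0.0, bin_h=0.02, bin_c=0.1))
```

With the quoted values, the 13C deviations are about 0.8 ppm for the α carbon and about 1.4 to 1.5 ppm for the β carbon. The text does not state whether the quoted experimental shifts are the values before or after calibration of the peak-list, so these differences should be read as indicative only.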

Such inconsistencies are responsible for the variation between the theoretical and experimental coverage scores.

We further compared the performance of ChemSMP with MetPA⁹. The latter requires the names of identified metabolites as input. MetPA identified 30 metabolic pathways from SMPDB (Table S4 of Supporting Information) when given the 20 amino acid names as input. ChemSMP identified 45 metabolic pathways using the 2D [13C, 1H] HSQC spectrum of the 20 amino acids. Twenty-six pathways were common to the output of both ChemSMP and MetPA. Of the four MetPA pathways missed by ChemSMP, two (‘Transcription/Translation’ (SMP00019) and ‘Valine, Leucine and Isoleucine degradation’ (SMP00032)) were not present in the database used by ChemSMP (Table S1 of Supporting Information), and the other two (‘Ubiquinone Biosynthesis’ (SMP00065) and ‘Thiamine Metabolism’ (SMP00076)), as per SMPDB, do not have amino acids in their pathway. Further, among the top 30 metabolic pathways listed by ChemSMP, 18 are common with MetPA. Interestingly, 14 metabolic pathways have amino acids but were not listed by MetPA, whereas ChemSMP missed only 3 metabolic pathways in SMPDB with amino acids.

We also tested the efficacy of ChemSMP on experimental data obtained from a cancer cell lysate to identify the metabolic pathways responsible for the metabolites observed in the NMR spectrum. We used samples from cells cultured on uniformly 13C labeled and unlabeled glucose. On excluding common metabolites such as ATP, AMP, ADP, NAD, NADH, NADP and Coenzyme A, ChemSMP enumerated 58 pathways with non-zero coverage score in the 13C labeled sample compared to 54 pathways in the unlabeled sample. However, 38 metabolic pathways are common to both. This shows that ChemSMP could identify 70% of the metabolic pathways correctly from less sensitive NMR samples prepared at natural abundance. The spectra of samples grown with 13C labeled glucose and unlabeled glucose are not identical, as shown in Figure 6a and 6b. The dissimilarity is small, and hence the differential pathways are the ones with low PC scores. In the case of cells grown on 13C labeled glucose, only metabolites arising from glucose are 13C labeled and thus rendered sensitive. However, in the cells grown on unlabeled glucose, all the metabolites are detected at natural abundance of 13C. Hence, the difference is essentially in the type of metabolites observed in the two samples.

SMPDB pathways are, however, classified only at the organism level and not at the organ, tissue or cellular level; therefore, some pathways such as ‘Glucose-alanine cycle’ (SMP00127) and ‘Lactose degradation’ (SMP00457) were observed in the output with relatively high coverage scores when the entire database was used. As our samples were obtained from cell lysate, we reanalyzed the data after excluding these and other irrelevant pathways from SMPDB and obtained refined results pertinent to the system of interest. Figure 6 shows the final top 15 pathways thus obtained and their coverage scores in the 13C labeled cell lysate and in the cell lysate with 13C at natural abundance. ChemSMP identified pathways such as ‘Mitochondrial Electron Transport Chain’ (SMP00355)²⁷, ‘Glycerol phosphate shuttle’ (SMP00124)²⁸, ‘Nucleotide Sugars metabolism’ (SMP00010)²⁹, ‘Pyruvate metabolism’ (SMP00060)³⁰ and ‘Warburg Effect’ (SMP00654)³¹ in the top fifteen in our sample. The role of these pathways is well documented in the literature²⁷⁻³¹. Further details on the evaluation of the pathways will be published elsewhere.
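The pathway counts reported above for the two comparisons (ChemSMP versus MetPA, and 13C labeled versus natural abundance lysate) can be cross-checked with simple arithmetic. The Python sketch below uses only the counts stated in the text; the function name and dictionary keys are hypothetical and chosen for this illustration.

```python
def overlap_summary(n_a, n_b, n_common):
    """Summarize the overlap between two pathway lists from their sizes."""
    return {
        "only_a": n_a - n_common,
        "only_b": n_b - n_common,
        "common_as_pct_of_a": 100.0 * n_common / n_a,
        "common_as_pct_of_b": 100.0 * n_common / n_b,
    }


# Amino acid mixture: ChemSMP (45 pathways) versus MetPA (30), 26 in common.
amino = overlap_summary(45, 30, 26)

# Cell lysate: 13C labeled (58 pathways) versus natural abundance (54),
# 38 in common.
lysate = overlap_summary(58, 54, 38)

print(amino["only_b"])  # MetPA pathways not listed by ChemSMP
print(round(lysate["common_as_pct_of_a"], 1))
print(round(lysate["common_as_pct_of_b"], 1))
```

The four MetPA pathways missed by ChemSMP follow from 30 - 26. For the cell lysate, 38 of 54 is about 70%, whereas 38 of 58 is about 66%; the 70% quoted in the text therefore appears to be relative to the 54 pathways of the natural abundance sample, although the text does not state the denominator explicitly.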

Figure 5: Testing ChemSMP on a mixture of amino acids. (a) Comparison of the percentage coverage scores of peak-lists generated from the database and from experiment for pathways that involve amino acids. Out of 42 metabolic pathways that involve amino acids, 39 were correctly identified by ChemSMP. The names of the pathways corresponding to their SMPDB IDs are listed in Table S1 of Supporting Information. The coverage scores for each pathway obtained from a peak-list generated using the chemical shifts in the database (i.e., theoretical) and from the experimental data are shown as black and grey bars, respectively. (b) 2D [13C, 1H] HSQC spectrum of the mixture of 20 amino acids.


Interestingly, these brief analyses highlight the importance of prior biochemical knowledge for any biologically meaningful analysis, given that the current versions of the databases are still not annotated to the extent desirable. ChemSMP can thus be used iteratively to refine the analysis and arrive at the correct result. The comprehensive details (including prospective metabolites) provided in the output help the user to interpret the results. The following are some of the search modifications that can be incorporated after taking cues from the output: variation in bin size, the list of metabolites to ignore, and the pathways to include in the database search. Nevertheless, inference from the analysis of samples such as biofluids, which contain metabolites from several metabolic pathways spread across a variety of tissues, may be limited owing to the ambiguity about where the active pathways originate. Therefore, application of ChemSMP to spectra from localized samples derived from homogenized tissues, cells or organelles is likely to be more useful.
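One of the search modifications listed above, the list of metabolites to ignore, can be illustrated with a toy example. The Python sketch below is not ChemSMP and does not reproduce its PC score: the pathway names, metabolite sets and the `toy_coverage` function are hypothetical, and coverage is assumed here to be simply the percentage of a pathway's metabolites that are observed.

```python
# Common metabolites to ignore (a subset of those named in the text).
COMMON = frozenset({"ATP", "ADP", "NADH"})

# Hypothetical pathways and metabolites, for illustration only.
PATHWAYS = {
    "pathway_A": {"ATP", "ADP", "NADH", "glucose", "pyruvate"},
    "pathway_B": {"ATP", "ADP", "NADH", "citrate", "succinate"},
}

# Metabolites assumed to be observed; they all belong to pathway_A.
OBSERVED = {"ATP", "ADP", "NADH", "glucose", "pyruvate"}


def toy_coverage(pathways, observed, ignore=frozenset()):
    """Percentage of each pathway's non-ignored metabolites that are observed."""
    scores = {}
    for name, members in pathways.items():
        kept = members - ignore
        if not kept:
            continue  # nothing left to score after ignoring metabolites
        scores[name] = 100.0 * len(kept & observed) / len(kept)
    return scores


with_common = toy_coverage(PATHWAYS, OBSERVED)
without_common = toy_coverage(PATHWAYS, OBSERVED, ignore=COMMON)
print(with_common)
print(without_common)
```

In this toy case the incorrect pathway_B obtains a non-zero score only through the shared common metabolites, and its score falls to zero once they are ignored, while the score of pathway_A is unchanged.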


Figure 6: The 2D [13C, 1H] HSQC spectrum of (a) +ST cell lysate with 13C at natural abundance and (b) +ST cell lysate from cells fed on uniformly 13C labeled glucose. (c) Comparison of the percentage coverage scores of the top 15 metabolic pathways between the two samples. The SMPDB ID is given, and the corresponding name of each pathway can be found in Table S1. The labeled sample has better coverage than the sample prepared at natural abundance. However, of the 58 metabolic pathways identified in the 13C labeled +ST cell sample, ChemSMP could identify 38 from the less sensitive +ST cell sample, whose 2D data was acquired at natural abundance of 13C.

Conclusion

ChemSMP is the first method to identify metabolic pathways directly from 2D [13C, 1H] chemical shifts. In the present study, SMPDB was chosen as the source of metabolic pathways because the metabolites listed in this database are linked to HMDB, where their chemical shift values are available. ChemSMP facilitates rapid metabolic pathway analysis of a single sample, which could be useful in the area of metabolic pathway engineering. It will also be useful for comparing the presence/absence or variation in activity of pathways across different samples. In such cases, the metabolic pathways of two samples can be compared quantitatively by calculating statistical measures such as the covariance between the peak volumes of the common indices shared by those samples. At present, roughly one third of the metabolites in the 91 metabolic pathways have chemical shift values in HMDB. As more chemical shifts are deposited in the database, the coverage and efficiency of ChemSMP will increase. The method can be ported to any other pathway database for which chemical shift data are also archived in a suitable repository. The ChemSMP software can be downloaded from http://nrc.iisc.ernet.in/hsa/gft.htm for free academic use.
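The covariance comparison mentioned in the Conclusion can be sketched as follows. This is a minimal Python illustration under stated assumptions: peak volumes are assumed to be stored per sample as a mapping from an index to a volume, the function name and the example data are hypothetical, and the sample covariance (divisor n - 1) is one possible choice that the text does not specify.

```python
def covariance_of_common_peaks(volumes_a, volumes_b):
    """Sample covariance of peak volumes over indices present in both samples."""
    common = sorted(set(volumes_a) & set(volumes_b))
    if len(common) < 2:
        raise ValueError("need at least two shared indices")
    xs = [volumes_a[i] for i in common]
    ys = [volumes_b[i] for i in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means, divided by n - 1.
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (len(common) - 1)


# Hypothetical peak volumes keyed by a hypothetical index.
sample_1 = {101: 2.0, 205: 4.0, 317: 6.0, 420: 1.0}
sample_2 = {101: 1.0, 205: 2.0, 317: 3.0, 999: 5.0}

cov = covariance_of_common_peaks(sample_1, sample_2)
print(cov)
```

Only the three indices shared by both samples (101, 205 and 317) enter the calculation; indices found in a single sample are left out.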


Acknowledgements

Manipa Saha is thanked for her generous support in cancer cell culture. We acknowledge the reviewers, whose suggestions helped in improving the manuscript. AD is supported by the interdisciplinary program under the DST Centre for Mathematical Biology. The facilities provided by the NMR Research Centre, supported by the Department of Science and Technology (DST), India, are gratefully acknowledged. DP gratefully acknowledges support from the Department of Biotechnology (DBT) for computing facilities. The authors declare no competing financial interest.

Supporting Information Available

Additional text on index score derivation and calculating statistical significance; four tables listing SMPDB pathways, amino acids in the mixture, acquisition and processing parameters for NMR experiments, and MetPA output; four figures showing simulations of a single-pathway query, peak picking, the effect of bin size on ChemSMP, and the histidine peak overlay. This information is available free of charge via the Internet at http://pubs.acs.org/.

References

(1) Patti, G. J.; Yanes, O.; Siuzdak, G. Nat Rev Mol Cell Biol 2012, 13, 263-269.
(2) Clayton, T. A.; Lindon, J. C.; Cloarec, O.; Antti, H.; Charuel, C.; Hanton, G.; Provost, J. P.; Le Net, J. L.; Baker, D.; Walley, R. J.; Everett, J. R.; Nicholson, J. K. Nature 2006, 440, 1073-1077.
(3) Van Dien, S.; Schilling, C. H. Mol Syst Biol 2006, 2, 2006.0035.
(4) Kell, D. B. Curr Opin Microbiol 2004, 7, 296-307.
(5) Larive, C. K.; Barding, G. A., Jr.; Dinges, M. M. Anal Chem 2015, 87, 133-146.
(6) Xia, J.; Bjorndahl, T. C.; Tang, P.; Wishart, D. S. BMC Bioinformatics 2008, 9, 507.
(7) Dubey, A.; Rangarajan, A.; Pal, D.; Atreya, H. S. Anal Chem 2015, 87, 7148-7155.
(8) Aggio, R. B.; Ruggiero, K.; Villas-Boas, S. G. Bioinformatics 2010, 26, 2969-2976.
(9) Xia, J.; Wishart, D. S. Bioinformatics 2010, 26, 2342-2344.
(10) Cavanagh, J.; Fairbrother, W. J.; Palmer, A. G., III; Rance, M.; Skelton, N. J. In Protein NMR Spectroscopy (Second Edition), Cavanagh, J.; Fairbrother, W. J.; Palmer, A. G.; Rance, M.; Skelton, N. J., Eds.; Academic Press: Burlington, 2007, pp 533-678.
(11) Guennec, A. L.; Giraudeau, P.; Caldarelli, S. Anal Chem 2014, 86, 5946-5954.
(12) Robosky, L. C.; Reily, M. D.; Avizonis, D. Anal Bioanal Chem 2007, 387, 529-532.
(13) Mukhopadhyay, R. Anal Chem 2007, 79, 7959-7963.
(14) Hyberts, S. G.; Takeuchi, K.; Wagner, G. J Am Chem Soc 2010, 132, 2145-2147.
(15) Lee, J. H.; Okuno, Y.; Cavagnero, S. J Magn Reson 2014, 241, 18-31.
(16) Jewison, T.; Su, Y.; Disfany, F. M.; Liang, Y.; Knox, C.; Maciejewski, A.; Poelzer, J.; Huynh, J.; Zhou, Y.; Arndt, D.; Djoumbou, Y.; Liu, Y.; Deng, L.; Guo, A. C.; Han, B.; Pon, A.; Wilson, M.; Rafatnia, S.; Liu, P.; Wishart, D. S. Nucleic Acids Res 2014, 42, D478-484.
(17) Frolkis, A.; Knox, C.; Lim, E.; Jewison, T.; Law, V.; Hau, D. D.; Liu, P.; Gautam, B.; Ly, S.; Guo, A. C.; Xia, J.; Liang, Y.; Shrivastava, S.; Wishart, D. S. Nucleic Acids Res 2010, 38, D480-487.
(18) Wishart, D. S.; Jewison, T.; Guo, A. C.; Wilson, M.; Knox, C.; Liu, Y.; Djoumbou, Y.; Mandal, R.; Aziat, F.; Dong, E.; Bouatra, S.; Sinelnikov, I.; Arndt, D.; Xia, J.; Liu, P.; Yallou, F.; Bjorndahl, T.; Perez-Pineiro, R.; Eisner, R.; Allen, F.; Neveu, V.; Greiner, R.; Scalbert, A. Nucleic Acids Res 2013, 41, D801-807.
(19) Wishart, D. S.; Knox, C.; Guo, A. C.; Eisner, R.; Young, N.; Gautam, B.; Hau, D. D.; Psychogios, N.; Dong, E.; Bouatra, S.; Mandal, R.; Sinelnikov, I.; Xia, J.; Jia, L.; Cruz, J. A.; Lim, E.; Sobsey, C. A.; Shrivastava, S.; Huang, P.; Liu, P.; Fang, L.; Peng, J.; Fradette, R.; Cheng, D.; Tzur, D.; Clements, M.; Lewis, A.; De Souza, A.; Zuniga, A.; Dawe, M.; Xiong, Y.; Clive, D.; Greiner, R.; Nazyrova, A.; Shaykhutdinov, R.; Li, L.; Vogel, H. J.; Forsythe, I. Nucleic Acids Res 2009, 37, D603-610.
(20) Wishart, D. S.; Tzur, D.; Knox, C.; Eisner, R.; Guo, A. C.; Young, N.; Cheng, D.; Jewell, K.; Arndt, D.; Sawhney, S.; Fung, C.; Nikolai, L.; Lewis, M.; Coutouly, M. A.; Forsythe, I.; Tang, P.; Shrivastava, S.; Jeroncic, K.; Stothard, P.; Amegbey, G.; Block, D.; Hau, D. D.; Wagner, J.; Miniaci, J.; Clements, M.; Gebremedhin, M.; Guo, N.; Zhang, Y.; Duggan, G. E.; Macinnis, G. D.; Weljie, A. M.; Dowlatabadi, R.; Bamforth, F.; Clive, D.; Greiner, R.; Li, L.; Marrie, T.; Sykes, B. D.; Vogel, H. J.; Querengesser, L. Nucleic Acids Res 2007, 35, D521-526.
(21) Kumar, S. H.; Rangarajan, A. J Virol 2009, 83, 8565-8574.
(22) Nagana Gowda, G. A.; Gowda, Y. N.; Raftery, D. Anal Chem 2015, 87, 706-715.
(23) Helmus, J. J.; Jaroniec, C. P. J Biomol NMR 2013, 55, 355-367.
(24) van Helden, J.; Wernisch, L.; Gilbert, D.; Wodak, S. J. Ernst Schering Res Found Workshop 2002, 245-274.
(25) Wagner, A.; Fell, D. A. Proc Biol Sci 2001, 268, 1803-1810.
(26) Bingol, K.; Bruschweiler-Li, L.; Li, D. W.; Bruschweiler, R. Anal Chem 2014, 86, 5494-5501.
(27) Wallace, D. C. Nat Rev Cancer 2012, 12, 685-698.
(28) MacDonald, M. J.; Warner, T. F.; Mertz, R. J. Cancer Res 1990, 50, 7203-7205.
(29) Lane, A. N.; Fan, T. W. Nucleic Acids Res 2015, 43, 2466-2485.
(30) Jones, R. G.; Thompson, C. B. Genes Dev 2009, 23, 537-548.
(31) Vander Heiden, M. G.; Cantley, L. C.; Thompson, C. B. Science 2009, 324, 1029-1033.

TOC Figure
