Metabolomics-Derived Prostate Cancer Biomarkers: Fact or Fiction

Article

Metabolomics Derived Prostate Cancer Biomarkers: Fact or Fiction Deepak Kumar, Ashish Gupta, Anil Mandhani, and Satya N Sankhwar J. Proteome Res., Just Accepted Manuscript • Publication Date (Web): 22 Jan 2015 Downloaded from http://pubs.acs.org on January 27, 2015

Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Journal of Proteome Research

Metabolomics Derived Prostate Cancer Biomarkers: Fact or Fiction

Deepak Kumar1,4, Ashish Gupta*1, Anil Mandhani2, Satya Narain Sankhwar3

1 Department of Metabolomics, Centre of Biomedical Research, SGPGIMS Campus, Lucknow, India
2 Department of Urology, Sanjay Gandhi Post Graduate Institute of Medical Sciences, Lucknow, India
3 Department of Urology, King George’s Medical University, Lucknow, India
4 Department of Applied Chemistry, Uttar Pradesh Technical University, Lucknow, India

Corresponding Author: Ashish Gupta*, PhD, Department of Metabolomics, Centre of Biomedical Research, SGPGIMS Campus, Raebareli Road, Lucknow 226014, UP, India. Phone: 91-522-2668700. Fax: 91-522-2668215. E-mail: [email protected]

Conflict of interest: The authors have declared that no conflict of interest exists.

Running head: metabolomics of prostate cancer

Abstract

Despite continuing research into biomarkers for the precise detection and grading of prostate cancer (PC), current indexes lack sensitivity and specificity. To search for PC biomarkers, we used proton nuclear magnetic resonance (1H NMR)-derived serum metabolomics. The study comprises 102 serum samples obtained from low-grade (LG, n=40) and high-grade (HG, n=30) PC cases and healthy controls (HC, n=32). 1H NMR-derived serum data were examined using principal component analysis and orthogonal partial least-squares discriminant analysis. The strength of the model was verified by internal cross-validation, using the same samples divided into a 70% training set and a 30% test set. Receiver operating characteristic (ROC) curve analysis was also performed. Serum metabolomics revealed that four biomarkers (alanine, pyruvate, glycine, and sarcosine) were able to accurately (ROC AUC 0.966) differentiate 90.2% of PC cases with 84.4% sensitivity and 92.9% specificity compared to HC. Similarly, three biomarkers (alanine, pyruvate, and glycine) were able to precisely (ROC AUC 0.978) discriminate 92.9% of LG from HG PC with 92.5% sensitivity and 93.3% specificity. The robustness of these biomarkers was confirmed by prediction of the test data set with >99% diagnostic precision for PC determination. These findings demonstrate that 1H NMR-based serum metabolomics is a promising approach for detecting and grading PC.

Key words: NMR-spectroscopy; OPLS-DA; PCA; prostate cancer; serum-metabolomics.

Introduction

Prostate cancer (PC) is the most frequently diagnosed malignancy in men worldwide. Progression of PC and mortality rates can be reduced by early detection. Various modalities, including digital rectal examination (DRE), serum prostate-specific antigen (PSA), and Gleason score (GS), are currently used in clinical practice but have intrinsic limitations [1-2]. DRE is a crude approach, and serum PSA measurement has several pitfalls, such as lack of accuracy, non-specificity, and frequent false positives leading to over-diagnosis of PC [1-4]. GS requires biopsy tissue samples obtained through highly invasive procedures; hence, a non- or less-invasive test for early identification of PC with high precision is urgently needed.

The metabolomics approach provides vital and promising biomarkers associated with urological cancers and cancers of other organs [5-8]. A recent study applied liquid and gas chromatography-based mass spectrometry (LC/GC-MS) metabolomics to prostate tissue and urine samples to identify PC and its progression [5]. In contrast, conflicting findings were reported using the same technique and the same biological samples [9-10]. To resolve the ambiguity surrounding the proposed biomarker, these contradictory findings should be re-evaluated using (a) a more robust and highly specific technique such as nuclear magnetic resonance (NMR) spectroscopy, (b) a different biological sample such as serum, and (c) multivariate statistical analysis. We suggest these three approaches for the following reasons: (a) NMR can be performed on native biofluid samples; (b) serum samples are not only less prone to be affected by exogenous factors but also show far less inter-individual variation [11], and a previous study indicates that a serum-derived mathematical model is highly robust [12]; and (c) multivariate statistical analysis extracts the important variables from the whole set of variables.

To date, only one study has examined serum sarcosine as a biomarker for PC, using a fluorometric assay [13]. A pitfall of this approach is that it can assess only one variable at a time through a color-forming reaction, which is time-consuming and labor-intensive and carries a small but significant probability of error. Hence, the following research questions remain to be explored: (1) Do low-grade (LG) and high-grade (HG) PC produce signature serum metabolic profiles? (2) Is serum metabolomics a sufficiently sensitive and reliable approach to differentiate LG and HG PC? (3) Is serum metabolomics as effective as the gold-standard GS approach for PC detection? To answer these questions, the present study aimed to evaluate whether NMR-derived serum metabolomics allows early identification of LG and HG PC progression and thereby provides proof-of-principle for the use of NMR spectroscopy of serum in screening for human PC.

Material & Methods

Patients and sample collection

This study was approved by the institutional ethics committee of the Centre of Biomedical Research, SGPGIMS, Lucknow, India. The study population comprised male patients >50 years of age and age-comparable healthy control (HC) subjects. Patients included in this study had not received any medication and had no comorbid conditions. Exclusion criteria included hormonal therapy, renal pathology, diabetes, any other malignancies, tuberculosis, endocrine disorders, and other conditions known to influence metabolic phenotype. Written informed consent was obtained from all participants before enrollment. The presence of tumors in the prostate was first assessed using serum PSA levels, abnormal DRE, and hypoechoic areas on transrectal ultrasound (TRUS). Transurethral
resected tissue specimens were collected for histopathological evaluation to determine LG or HG PC. Tumors were classified per World Health Organization guidelines. Venous blood samples were collected in the early morning, after an overnight fast, to minimize the effect of dietary factors and inter-individual variation on the metabolomic data. Blood samples (3.0 ml) were kept for 30 min in a vacutainer tube at room temperature for clotting. Clotted blood samples were centrifuged at 3000g at 4°C for 10 min to separate the serum supernatant, which was quickly stored at –80°C until the NMR experiments were conducted. NMR experiments were performed on a total of 102 sera from 40 patients with LG PC, 30 patients with HG PC, and 32 HC.

NMR Experiments

Serum samples were thawed at room temperature, and 400 µl of each serum sample was transferred to a 5 mm NMR tube. One-dimensional 1H NMR experiments were performed on a Bruker Avance III 800 MHz spectrometer with a 5 mm broad-band inverse probe-head and a Z-shielded gradient. A sealed capillary containing 25 µl of 0.75% (w/v) sodium salt of trimethylsilyl propionic acid (TSP), deuterated at the CH2 groups and dissolved in deuterium oxide, was inserted into each NMR tube. The capillary provided the deuterium lock, and TSP served as the chemical shift reference and as the standard signal for quantitative estimation of metabolites. 1H NMR spectra were acquired with water suppression using a one-dimensional single-pulse sequence and a Carr–Purcell–Meiboom–Gill (CPMG) sequence. The parameters were as follows: spectral sweep width, 12,820 Hz; data points, 65 K; pulse angle, 90°; total relaxation delay, 5 s; T2 filtering, echo time of 100 µs repeated 500 times, giving a total effective echo time of 50 ms; number of scans, 128; and line broadening, 0.3 Hz. Phase and baseline correction of all NMR data was performed using Bruker TopSpin software (version 2.1).
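The CPMG timing and sweep width quoted in the acquisition parameters can be checked with simple arithmetic. The following is a minimal Python sketch, not part of the original acquisition workflow; the function names are hypothetical, and the conversion assumes the usual relation that a width in Hz divided by the spectrometer frequency in MHz gives ppm.

```python
def cpmg_total_echo_ms(echo_time_us: float, n_repeats: int) -> float:
    """Total T2-filter duration in ms for an echo time (in µs) repeated n times."""
    return echo_time_us * n_repeats / 1000.0


def sweep_width_ppm(sweep_width_hz: float, spectrometer_mhz: float) -> float:
    """Convert a spectral sweep width from Hz to ppm (Hz divided by MHz)."""
    return sweep_width_hz / spectrometer_mhz


if __name__ == "__main__":
    # Values taken from the acquisition parameters listed in the text.
    print(cpmg_total_echo_ms(100, 500))  # 100 µs echo time, 500 repeats
    print(sweep_width_ppm(12820, 800))   # 12,820 Hz on an 800 MHz instrument
```

With the stated parameters the first call gives 50 ms, matching the total effective echo time in the text, and the second gives a sweep width of about 16 ppm.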

Statistical analysis

Each 1H NMR spectrum was calibrated to the methyl peak of alanine at 1.44 ppm. For statistical analysis, the NMR data were reduced using Bruker AMIX software (version 3.8.7, Bruker BioSpin, Germany). Entire NMR spectra were binned into 4K buckets of equal width (0.005 ppm). All spectra were reduced to the chemical shift region 0.5 to 8.5 ppm. The region 4.6 to 5.2 ppm was removed to eliminate residual water resonances. Peak areas were integrated using simple rectangular binning. To minimize the effect of differences in overall metabolite concentration between samples, each spectrum was normalized by dividing the integral of every bin by the sum of all integrals in the spectrum. The resulting data matrices were transferred into Microsoft Office Excel 2010.

For multivariate statistical analysis, the entire data matrix was imported into ‘The Unscrambler X’ software (version 10.0.1, Camo USA, Norway). Unsupervised principal component analysis (PCA) and supervised orthogonal partial least-squares discriminant analysis (OPLS-DA) were used to compare metabolic profiles and to assess grouping trends among the cohorts, respectively. To avoid over-fitting of the mathematical model, cross-validation was applied using a leave-one-out approach.

An internal cross-validation (ICV) approach was used to evaluate over-fitting of the OPLS-DA-derived model. For ICV, 70% of the complete data were randomly chosen, according to the Fisher and Yates statistical table, as the training set; the remaining 30% served as the test set. The OPLS-DA model was rebuilt using the training set. The ICV revealed the robustness of the model in the form of explained variance (R2) and the predictive ability of the OPLS-DA model (Q2). Predictions for the test data were assessed visually using the Y-predicted scatter plot.
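The data reduction described in this section (rectangular binning at 0.005 ppm over 0.5 to 8.5 ppm, removal of the 4.6 to 5.2 ppm water region, and normalization to the total integral) can be sketched as follows. This is an illustrative pure-Python stand-in and not the AMIX implementation; the function name `bin_and_normalize` and its arguments are hypothetical, and each bin area is approximated here by summing the intensities of the points that fall in that bin.

```python
def bin_and_normalize(ppm, intensity, lo=0.5, hi=8.5, width=0.005,
                      water=(4.6, 5.2)):
    """Rectangular binning followed by total-area normalization.

    ppm, intensity: equal-length sequences describing one spectrum.
    Returns one value per bin; the values sum to 1.
    Bins inside the excluded water region stay at 0.
    """
    # (8.5 - 0.5) / 0.005 = 1600 bins with the default arguments.
    n_bins = int(round((hi - lo) / width))
    areas = [0.0] * n_bins
    for x, y in zip(ppm, intensity):
        if x < lo or x >= hi:
            continue  # outside the retained chemical shift window
        if water[0] <= x <= water[1]:
            continue  # residual water region is discarded
        idx = min(int((x - lo) / width), n_bins - 1)
        areas[idx] += y
    total = sum(areas)
    if total == 0:
        raise ValueError("spectrum has no signal in the retained region")
    # Divide every bin integral by the sum of all bin integrals.
    return [a / total for a in areas]
```

Applying the function to every spectrum gives one row per sample; stacking the rows yields a data matrix of the kind that would then be passed to PCA or OPLS-DA.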
Vital biomarkers
were assigned with the help of S-plots and variable importance in projection (VIP) scores. To determine the differences between cohorts (LG vs HC, HG vs HC, and LG vs HG), pairwise statistical analysis was performed. R2 and Q2 values were obtained for each pairwise discriminant analysis. To assess clinical utility, receiver operating characteristic (ROC) curve analysis was performed to validate the robustness of the OPLS-DA model for discriminating between specific cohorts. ROC curve analysis of serum PSA was also performed for comparison with the OPLS-DA model. The correlations between PSA levels and metabolomics-derived biomarkers were analyzed with linear regression.

Results

Figure 1 shows 1D 1H NMR spectra of sera from HC, LG, and HG subjects. Clinicopathological information for the subjects in this study is listed in Table 1.

Multivariate statistical analysis

As a first step, PCA was performed on the AMIX-derived binned data; the PCA scores plots for the different group comparisons are shown in Figure 2. The PCA scores plot revealed that the HC cohort was well clustered and distinguished from the LG and HG cohorts (Figure 2A). Moreover, PCA revealed clustering of samples according to disease state. The LG PC serum samples clustered closely together, apart from the HG PC cohort; hence, the PCA scores plot not only distinguishes HC and PC samples but also shows significant potential to separate the LG and HG stages of PC with substantial sensitivity and specificity (Figure 2B-D). To obtain a more objective statistical analysis, OPLS-DA was applied to the complete data matrix; the OPLS-DA score plot showed well-separated, tightly clustered groups (Figure 2E). Similarly, pairwise analysis also
demonstrated well-separated, clustered OPLS-DA score plots (Figure 2F-H). Because OPLS-DA uses class information to maximize the separation among classes, this procedure helped to identify hidden and potential biomarkers related to specific groups. The robustness of the model, expressed as cumulative explained variance (R2), and its predictive ability (Q2) are shown in figure S1 E-H along with the corresponding OPLS-DA score plots. To guard against over-fitting of the OPLS-DA model, OPLS-DA was repeated on a randomly chosen 70% of the original data set, used as the training set. The remaining 30% of the original data set served as the test set. The ICV revealed the robustness of the model in the form of R2 and Q2 values along with the corresponding OPLS-DA score plots (Figure 3A-D). The strength of the OPLS-DA model was confirmed by prediction of the test set with >90.3% accuracy between HC and PC, 95.2% precision between HC and LG, 94.4% accuracy between HC and HG, and 99% precision between LG and HG (Figure 3). These outcomes support OPLS-DA as a feasible, minimally invasive method for detecting PC and its progression from LG to HG. The comparable performance on the training and test sets of the OPLS-DA model (Figure 3A-D) indicates that minimally invasive NMR-based serum metabolomics is as effective as the highly invasive and painful classical GS approach for detecting and grading PC.

Selection of biomarkers

The following six screening steps were carried out to identify important variables: (i) a covariance-correlation-based method was used to eliminate false-positive variables, and the important variables were extracted through S-plots using SIMCA-P software (version 8.0). Before generating the S-plots, Pareto scaling was applied to the AMIX-generated binned data. Figure 4
depicts the most perturbed variables among the whole data set in S-plots. Variables with higher p and p(corr) values played a major role in the formation of clusters among the groups and were considered potential variables; (ii) the VIP scores of these variables were screened. Variables with a VIP score >1 were the most influential and were retained for further processing; (iii) a Lachenbruch jack-knife method was applied to the variables selected at step (ii). Variables with a negative confidence interval were discarded in this step, and the next screening step was carried out on the remaining variables; (iv) to identify statistically significant differences among the cohorts in the spectral bins remaining after step (iii), a one-way ANOVA followed by a post-hoc Student–Newman–Keuls multiple comparisons test was performed, and variables with a significant difference (p < 0.01) were chosen. Only four variables met this threshold. Hence, these four variables were considered potential variables for discriminating among the HC, LG, and HG cohorts; (v) the characteristic spectral regions of these four selected variables were identified as sarcosine, glycine, pyruvate, and alanine using known chemical shift and coupling constant data for 1H NMR serum spectra. The following three approaches were applied to confirm the assignment of these biomarkers: spiking experiments in 1H NMR measurements (Figure 5), 1D 13C NMR experiments (Figure S1), and long-range COSY (2D 1H-1H) NMR experiments for sarcosine confirmation (Figure S2); (vi) the absolute concentration of these specific biomarkers was also measured using calibration curves of the standard compounds. The calibration curves were constructed by adding different concentrations of the standard compounds to control serum samples (Figure 5). The OPLS-DA model was re-evaluated using the absolute concentrations of the metabolites, and outcomes comparable to those stated in the multivariate statistical analysis section were obtained. The obtained absolute
measurements of these biomarkers were used to generate the box-and-whisker plots in Figure 5. Hidden and potential biomarkers were identified with this approach.

ROC analysis was also performed, not only to confirm the clinical utility of these biomarkers but also to identify the metabolites contributing most to the prediction of LG and HG PC. Absolute concentrations of individual biomarkers were used to generate discriminant predicted probability (DPP) scores, from which the ROC curves of individual biomarkers and their area under the curve (AUC) values were obtained (Table 2). DPP scores were also calculated using different combinations of the sum of absolute concentrations of these biomarkers for the various classifications. For each classification, eleven combinations were generated with the help of a mathematical approach (supplementary data and document). The outcomes of the best combinations are described in Table 3 and Figure S3 A-D. Table 3 indicates that the combination of alanine, pyruvate, glycine, and sarcosine offers the highest AUC, with 90.2% discrimination of PC (LG+HG) from HC (Figure S3 A). Alanine, glycine, and sarcosine alone provided the highest AUC, with 88.9% discrimination of LG from HC (Table 3 and Figure S3 B). The combination of glycine and sarcosine gave the highest AUC, with 98.4% differentiation between HG and HC (Table 3 and Figure S3 C). Moreover, the combination of alanine, pyruvate, and glycine was sufficient to distinguish LG from HG with 92.9% discrimination (Table 3 and Figure S3 D). Table 3 and figure S3 E-H also describe the outcome of ROC analysis using DPP scores of PSA levels for the different classifications. Statistical comparison between metabolomics and the clinical measure (PSA levels) (Table 3 and figure S3) reveals that the metabolomics-derived biomarkers perform much better than the clinical measure.
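One reading consistent with the count of eleven combinations is that they are all subsets of two or more of the four metabolites (6 pairs, 4 triples, and 1 quadruple). The sketch below enumerates them and computes a rank-based AUC for a score. It is an illustration only: using the raw sum of concentrations as the score is a simplifying assumption (the paper derives DPP scores, whose exact computation is given in its supplementary material), and the names `metabolite_combinations`, `sum_score`, and `auc` are hypothetical.

```python
from itertools import combinations

METABOLITES = ("alanine", "pyruvate", "glycine", "sarcosine")


def metabolite_combinations(names=METABOLITES):
    """All subsets of size >= 2: 6 pairs + 4 triples + 1 quadruple = 11."""
    return [c for k in range(2, len(names) + 1)
            for c in combinations(names, k)]


def sum_score(sample, combo):
    """Assumed stand-in for a DPP score: sum of the chosen concentrations."""
    return sum(sample[name] for name in combo)


def auc(pos_scores, neg_scores):
    """Rank-based AUC: chance that a positive outscores a negative.

    Ties between a positive and a negative count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

Given per-sample concentration dictionaries for two cohorts, one would loop over `metabolite_combinations()`, score each sample with `sum_score`, and compare the resulting `auc` values across combinations.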

Regression analysis revealed the following correlations: PSA vs. alanine (r = -0.008, p = 0.71), PSA vs. pyruvate (r = 0.16, p < 0.001), PSA vs. glycine (r = 0.09, p < 0.001), and PSA vs. sarcosine (r = 0.18, p < 0.001) (figure 6).

Discussion

This study discloses several important findings of NMR-based metabolomics of human serum samples. First, we determined that the NMR-derived fingerprint of the serum metabolic profile is able to accurately discriminate among the LG, HG, and HC cohorts. Second, OPLS-DA not only produces clusters of cohorts but also helps to disclose potential and hidden biomarkers from the complex NMR spectra of serum. Third, critical evaluation by ICV and absolute measurement of important metabolites using calibration curves supports the accuracy of the outcomes. Fourth, the ROC study of the biomarkers showed the clinical impact of the essential metabolites. Table 3 shows that, among the four important metabolites, certain combinations of a few biomarkers are sufficient to accurately categorize the LG, HG, and HC cohorts. Fifth, the increased level of sarcosine in PC serum samples agrees with a previous study [5] and contradicts the observations of other studies [9-10]. Sixth, the findings above address the following open questions: (a) LG and HG PC not only generate signature biomarkers, but these biomarkers can also be used to identify and discriminate between them; (b) NMR-based serum metabolomics is adequately sensitive and promising for differentiating LG and HG PC; (c) the substantial sensitivity and specificity of these biomarkers indicate that serum metabolomics is as effective as GS for PC detection. These findings demonstrate that NMR-derived serum metabolomics is sufficiently robust to detect subtle perturbations in circulating metabolites and thereby provide proof-of-principle for its use in detecting and grading PC.
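The sensitivity, specificity, and discrimination percentages reported in this study can be related to the cohort sizes (40 LG, 30 HG, 32 HC) with simple arithmetic. The sketch below computes per-group and overall correct-classification rates from counts. The hit counts used (27 of 32 and 65 of 70 for PC vs HC; 37 of 40 and 28 of 30 for LG vs HG) are inferred here because they reproduce the reported percentages; they are not stated in the paper, and the helper name `classification_rates` is hypothetical.

```python
def classification_rates(correct_a, total_a, correct_b, total_b):
    """Return (rate_a, rate_b, overall) as percentages correctly classified."""
    rate_a = 100.0 * correct_a / total_a
    rate_b = 100.0 * correct_b / total_b
    overall = 100.0 * (correct_a + correct_b) / (total_a + total_b)
    return rate_a, rate_b, overall


if __name__ == "__main__":
    # Group sizes come from the paper; the hit counts are an assumption
    # chosen because they reproduce the reported percentages.
    examples = {
        "PC vs HC": (27, 32, 65, 70),
        "LG vs HG": (37, 40, 28, 30),
    }
    for label, counts in examples.items():
        print(label, [round(r, 1) for r in classification_rates(*counts)])
```

Under these assumed counts the three rates come out as 84.4%, 92.9%, and 90.2% for PC vs HC, and 92.5%, 93.3%, and 92.9% for LG vs HG, which suggests that the reported 90.2% and 92.9% discrimination figures are overall correct-classification rates.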

The elevated level of serum alanine in LG PC compared to HG PC and HC may be associated with over-expression of the respective amino acid metabolism and suppression of the Krebs cycle under hypoxia. The increased level of alanine accords with the fact that the rate of amino acid metabolism is altered in PC [14] and that alanine was significantly higher in PC cell lines and biopsy tissue samples [14-15]. Earlier observations and our results both support the theory that glycolytic flux increases during tumor formation and that tumors have an increased need for protein synthesis [16]. A previous study reveals that oxidation of glutamate is possibly an important source of respiratory energy in uncontrolled cell proliferation [17]. The amino group of glutamate is predominantly transferred to pyruvate to form alanine, and the increased concentration of alanine is possibly a consequence of the demand for lipogenesis to support membrane biosynthesis in proliferating cancer cells [16]. Our observation of an increased level of pyruvate in PC cases compared to HC also supports the finding of increased transamination of pyruvate to alanine, which agrees with an earlier 13C hyperpolarized study [18].

The increased level of sarcosine in LG and HG PC compared to HC agrees with the observation that, in PC, the rate of methylation of glycine is much higher while the concomitant dehydrogenation and oxidation of sarcosine is comparatively lower, as reported in a recent study [19]. Additionally, the increased level of sarcosine observed in urine sediment samples of PC cases [5] agrees with our serum-based observations. A novel finding of our study is a decrease in glycine level in LG and HG PC compared to HC. This observation may support the proposed reduction in dehydrogenation and oxidation of sarcosine. Our observation that sarcosine is up-regulated in PC accords with an earlier study [5] and contradicts the observations of other studies [9-10]. These studies used LC/GC-MS techniques, in which sarcosine and alanine share the same parent and daughter ion pairs in MS,
causing difficulty in resolving these metabolites. Moreover, owing to several inherent limitations, this technique is still far from routine use in the clinical laboratory [20]. In contrast, the 1H NMR technique is able to detect well-resolved sarcosine and alanine signals in intact serum samples. Moreover, a recent study using a fluorescence-based sarcosine assay kit also reveals that serum sarcosine has a higher predictive capacity than total PSA [13]; this finding agrees with our NMR-based serum metabolomic observations. Compared with earlier studies [5,13], our study found that serum-based NMR-derived metabolomics is a considerably better approach because it provides improved AUC of ROC, sensitivity, and specificity using a combination of biomarkers to reveal the progression of LG to HG PC. In contrast to the sensitivity (75.0%) and specificity (36.7%) of serum PSA, ROC analysis of the OPLS-DA-derived outcomes gave 92.5% sensitivity and 93.3% specificity in differentiating LG and HG PC. Regression analysis also reveals that pyruvate, glycine, and sarcosine are promising biomarkers that show statistically significant (p < 0.001) correlations with the clinical measure (PSA).

Conclusion

In essence, this study demonstrates proof-of-principle that NMR-based serum metabolomics can be applied for precise detection and grading of PC. The outcome of minimally invasive serum metabolomics is as effective as tedious and highly invasive clinical examinations. Our proposed approach, direct correlation of serum metabolic profiles with the grading of PC, will eventually permit identification of the metabolic events that occur during the transformation of LG to HG PC and of alterations in the enzymatic activity of signature metabolic pathways, and it will provide a systematic path for exploring the hallmarks of PC progression. Moreover, this study may not only suggest the potential use of NMR-based serum metabolomics in probing the
progression of LG and HG PC but also supplement current clinical modalities with the aim of advancing the treatment approach and designing follow-up procedures. The present study demonstrates that metabolomics-derived prostate cancer biomarkers are not fiction.

Limitation of our study

Our study does not include benign prostatic hyperplasia cases, whose absence may be a source of bias. This study could not describe the possible sources of bias in sample collection between controls and patients. This study lacks an independent validation cohort. The differences in metabolites may be due to alterations in systemic metabolism consequent to malignancy or co-morbidities. This study lacks intra-class correlation coefficients for the biomarkers. In a future study, we may explore the effects of bias and confounding with a larger and less-controlled sample. Repeat samples may further strengthen the study.

Acknowledgements: Financial support was provided by the Department of Biotechnology, New Delhi, India (BT/PR6547/GBD/27/450/2012). We thank Dr. Sudhir Kumar Mandal, Department of Biostatistics, Centre of Biomedical Research, SGPGIMS Campus, Lucknow, for statistical guidance. The authors thank Prashant Singh Shakya for his assistance with sample preparation, NMR data acquisition, and analysis.

Supporting Information Available: This material is available free of charge via the Internet at http://pubs.acs.org.

Supporting Information content

Figure S1: 1D 13C NMR spectra of human serum samples: (A) healthy control, (B) prostate cancer patient.
Figure S2: Long-range COSY (1H-1H 2D) NMR experiment.
Figure S3: Comparison of AUC of ROC curves of different classifications.
Supplementary Supporting Information: Measured data, statistical analysis, etc. are available in an Excel sheet.



Table 1: Summary of clinical information for prostate cancer (PC) patients and healthy controls (HC). No attempt was made to correct for age and BMI because the analysis did not identify them as confounders. An unpaired t-test was applied for HC vs. PC comparisons, and a paired t-test was applied for LG vs. HG comparisons as well as for the batch analysis.

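| Characteristics | PC | HC | Significance level |
|---|---|---|---|
| No. of subjects | 70 | 32 | |
| Age (mean ± SD) | 63 ± 5 | 61 ± 3 | p = 0.13 (PC vs HC) |
| Age, low grade (LG) | 63 ± 6 | | |
| Age, high grade (HG) | 63 ± 4 | | p = 0.95 (LG vs HG) |
| Gender: male | 70 | 32 | |
| Gender: female | 0 | 0 | |
| Batch analysis: HC (n=32) | | n=17, n=15 | p = 0.35 (for HC) |
| Batch analysis: LG (n=40) | n=19, n=21 | | p = 0.54 (for LG) |
| Batch analysis: HG (n=30) | n=19, n=11 | | p = 0.65 (for HG) |
| Cancer grade: low grade (LG) | 40 (57%) | 0 | |
| Cancer grade: high grade (HG) | 30 (43%) | 0 | |
| BMI (mean ± SD) | 22.9 ± 3.6 | 22.9 ± 2.4 | p = 0.99 (PC vs HC) |
| BMI, low grade (LG) | 23.0 ± 3.6 | | |
| BMI, high grade (HG) | 22.8 ± 3.6 | | p = 0.67 (LG vs HG) |
| Prostate Specific Antigen | 30.6 ± 36.9 | 1.9 ± 0.6 | p = 0.0001 (PC vs HC) |
| Prostate Specific Antigen, low grade (LG) | 19.1 ± 16.9 | | |
| Prostate Specific Antigen, high grade (HG) | 45.9 ± 49.3 | | p = 0.01 (LG vs HG) |
| Digital Rectal Examination, low grade (LG) | Grade 1-2 | 0 | |
| Digital Rectal Examination, high grade (HG) | Grade 2-3 | 0 | |
| Gleason Score, low grade (LG) | 6-7 | 0 | |
| Gleason Score, high grade (HG) | 8-10 | 0 | |
| Medications | 0 | 0 | |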


Table 2: Chemical shift, multiplicity, resonance assignments, and AUC of ROC of the individual biomarkers obtained from the OPLS-DA model and S-plot.

| No. | Metabolite | Assignment | δ1H (ppm) | Multiplicity | δ13C (ppm) | AUC of ROC |
|---|---|---|---|---|---|---|
| 1 | Alanine | CH3 | 1.44 | d | 19.3 | 0.794 |
| 2 | Pyruvate | CH3 | 2.37 | s | 29.4 | 0.805 |
| 3 | Glycine | CH2 | 3.55 | s | 44.6 | 0.817 |
| 4 | Sarcosine | CH3 | 2.70 | s | 35.0 | 0.835 |

s, singlet; d, doublet.
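As an illustration of how an AUC of ROC for a single biomarker can be obtained, the following minimal sketch computes the AUC as the fraction of case/control pairs in which the case has the higher value (ties count as one half). This rank-based identity is standard, but it is not necessarily the procedure or software the authors used. The function name `roc_auc` and the concentration values are hypothetical and are not data from this study.

```python
def roc_auc(cases, controls):
    """AUC as the probability that a random case exceeds a random control.

    Ties contribute 0.5. Assumes higher values indicate disease; for a
    marker that is lower in cases, the result will be below 0.5.
    """
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical concentrations (micromolar), for illustration only.
pc_values = [42.0, 55.0, 61.0, 48.0, 70.0]
hc_values = [30.0, 44.0, 39.0, 50.0]

auc = roc_auc(pc_values, hc_values)
print(f"AUC = {auc:.2f}")
```

With these illustrative values, 17 of the 20 case/control pairs are ordered correctly, giving an AUC of 0.85.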


Table 3: Comparisons of accuracy, AUC of ROC, sensitivity and specificity based on discriminant predicted probability (DPP) scores of different combinations of various metabolites explained in Table 2 and clinical variables (PSA levels) for different classifications.

Biomarkers: ROC analysis of DPP scores of different combinations of biomarkers from Table 2. PSA: ROC analysis of DPP scores of PSA levels.

| Test result classification | Biomarker No.a | Biomarkers: Accuracy (%) | Biomarkers: AUC | Biomarkers: Sensitivity (%) | Biomarkers: Specificity (%) | PSA: Accuracy (%) | PSA: AUC | PSA: Sensitivity (%) | PSA: Specificity (%) |
|---|---|---|---|---|---|---|---|---|---|
| HC vs. LG+HG | 1+2+3+4 | 90.2 | 0.966 | 84.4 | 92.9 | 67.6 | 0.959 | 100.0 | 52.9 |
| HC vs. LG | 1+3+4 | 88.9 | 0.970 | 84.4 | 92.5 | 76.4 | 0.966 | 100.0 | 57.5 |
| HC vs. HG | 3+4 | 98.4 | 0.997 | 96.9 | 100 | 75.8 | 0.949 | 100.0 | 50.0 |
| LG vs. HG | 1+2+3 | 92.9 | 0.978 | 92.5 | 93.3 | 58.6 | 0.630 | 75.0 | 36.7 |

HC, healthy control; LG, low grade; HG, high grade.

a Biomarker No. corresponds to the metabolite number in Table 2.
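The accuracy values reported for the biomarker combinations can be cross-checked against the sensitivity, specificity, and group sizes (HC = 32, LG = 40, HG = 30; Table 1). The sketch below is a consistency check only. It assumes that the sensitivity column refers to the first-named group of each comparison and the specificity column to the second-named group, which is the reading under which the reported accuracies are reproduced. The function names are hypothetical.

```python
def correct_count(rate_percent, group_size):
    """Number of correctly classified subjects implied by a percentage rate."""
    return round(rate_percent / 100.0 * group_size)


def overall_accuracy(first_rate, n_first, second_rate, n_second):
    """Overall accuracy (%) from the per-group correct-classification rates."""
    correct = correct_count(first_rate, n_first) + correct_count(second_rate, n_second)
    return 100.0 * correct / (n_first + n_second)


# (comparison, sensitivity %, first-group size, specificity %, second-group size)
rows = [
    ("HC vs. LG+HG", 84.4, 32, 92.9, 70),
    ("HC vs. LG", 84.4, 32, 92.5, 40),
    ("HC vs. HG", 96.9, 32, 100.0, 30),
    ("LG vs. HG", 92.5, 40, 93.3, 30),
]

accuracies = {
    name: round(overall_accuracy(sens, n1, spec, n2), 1)
    for name, sens, n1, spec, n2 in rows
}
for name, value in accuracies.items():
    print(f"{name}: {value}%")
```

For example, 84.4% of 32 HC subjects (27) plus 92.9% of 70 PC subjects (65) gives 92 of 102 correct, or 90.2%, in agreement with the reported accuracy for HC vs. LG+HG.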


Figure 1: Histopathological analysis of prostate cancer tissue samples (magnification, ×40; hematoxylin and eosin stain) from the different cohorts, viz. (A) healthy control, (B) low-grade (Gleason score 5), and (C) high-grade (Gleason score 10) PC subjects, shown alongside the 1H NMR spectrum of the corresponding human serum sample. Key: 1, HDL + LDL + VLDL; 2, valine; 3, isoleucine; 4, lactate; 5, alanine; 6, acetate; 7, N-acetylglycoprotein; 8, acetone; 9, glutamate; 10, pyruvate; 11, glutamine; 12, citrate; 13, sarcosine; 14, creatinine; 15, malonate; 16, glycine; 17, choline; 18, threonine; 19, urea; 20, tyrosine; 21, histidine; and 22, phenylalanine.


Figure 2: Three-dimensional unsupervised PCA score plots: (A) HC vs LG + HG, (B) HC vs LG, (C) HC vs HG, and (D) LG vs HG. Three-dimensional supervised OPLS-DA score plots: (E) HC vs LG + HG, (F) HC vs LG, (G) HC vs HG, and (H) LG vs HG. The cumulative explained variance (R2) and the predictive ability of the OPLS-DA model (Q2) are shown. HC, healthy control (blue dot); LG, low grade (green circle); HG, high grade (red square).


Figure 3: Left: three-dimensional OPLS-DA score plots of the training set (70% of the original data) with R2 and Q2 values. Right: prediction of unknown samples (test set, 30% of the original data) using the OPLS-DA model derived from the training set. Predictions are based on a cut-off value of 1.5 for class membership and are shown as Y-predicted box plots; the Y-prediction scores are depicted as mean values with standard deviations. (A) HC vs LG + HG, (B) HC vs LG, (C) HC vs HG, and (D) LG vs HG. HC, healthy control (blue dot); LG, low grade (green circle); HG, high grade (red square).
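The training/test procedure described in this caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the 70/30 split is stratified by class and that the two classes are dummy-coded as 1 and 2, so that a Y-predicted value below the 1.5 cut-off is assigned to class 1 and any other value to class 2. The OPLS-DA model itself is not reproduced here, and the function names are hypothetical.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), splitting each class separately (assumed)."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        cut = round(train_fraction * len(idx))
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return sorted(train), sorted(test)


def assign_class(y_pred, cutoff=1.5):
    """Assign class 1 below the cut-off and class 2 otherwise (assumed coding)."""
    return 1 if y_pred < cutoff else 2


# Group sizes from Table 1: 32 HC (coded 1) and 70 PC (coded 2).
labels = [1] * 32 + [2] * 70
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), "training samples;", len(test_idx), "test samples")

# Hypothetical Y-predicted values for three test samples.
print([assign_class(y) for y in (1.1, 1.49, 1.8)])
```

Under these assumptions, the 102 subjects are divided into 71 training and 31 test samples.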


Figure 4: S-plots from the multivariate analysis of the different groups: (A) HC vs. LG+HG, (B) HC vs. LG, (C) HC vs. HG, and (D) LG vs. HG. The corresponding OPLS-DA score plots are shown in Figure 2E-H. The variables with the highest p and p(corr) values contribute most to the discrimination between the groups compared. HC, healthy control; LG, low grade; HG, high grade.
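In the commonly used S-plot formulation, each spectral variable is plotted by its covariance with the predictive score vector (p) against its correlation with that score vector (p(corr)). The sketch below illustrates this calculation for a single variable under that assumption; the scaling applied by the software used in this study is not stated here and may differ. The score and intensity values are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def s_plot_coordinates(scores, variable):
    """Return (p, p_corr): covariance and correlation of a variable with the scores."""
    cov = covariance(scores, variable)
    sd_scores = math.sqrt(covariance(scores, scores))
    sd_variable = math.sqrt(covariance(variable, variable))
    return cov, cov / (sd_scores * sd_variable)


# Hypothetical predictive scores and intensities of one spectral variable.
t = [-2.0, -1.0, 0.0, 1.0, 2.0]
x = [1.0, 2.0, 3.0, 4.0, 5.0]

p, p_corr = s_plot_coordinates(t, x)
print(f"p = {p:.2f}, p(corr) = {p_corr:.2f}")
```

Variables with large absolute values on both axes combine a large contribution with high reliability, which is why they are selected as candidate biomarkers.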


Figure 5: Assignment of vital metabolites by spiking experiments; calibration curves drawn with different concentrations of standard compounds added to control serum samples; and box plots denoting the 25th to 75th percentiles of the absolute concentrations (micromolar) of metabolites measured in LG, HG, and HC serum samples using the calibration curve equations. The median, or 50th percentile, is drawn as a black horizontal line inside the box. HC, healthy control; LG, low grade; HG, high grade PC. ***, p