Article pubs.acs.org/jpr

Low- and High-Grade Bladder Cancer Determination via Human Serum-Based Metabolomics Approach

Navneeta Bansal,† Ashish Gupta,*,‡ Nilay Mitash,§ Prashant Singh Shakya,‡ Anil Mandhani,§ Abbas Ali Mahdi,∥ Satya Narain Sankhwar,† and Sudhir Kumar Mandal⊥

† Department of Urology, King George’s Medical University, Lucknow 226003, India
‡ Department of Metabolomics, Centre of Biomedical Research, SGPGIMS Campus, Lucknow 226014, India
§ Department of Urology, Sanjay Gandhi Post Graduate Institute of Medical Sciences, Lucknow 226014, India
∥ Department of Biochemistry, King George’s Medical University, Lucknow 226003, India
⊥ Department of Biostatistic, Centre of Biomedical Research, SGPGIMS Campus, Lucknow 226014, India

Supporting Information

ABSTRACT: To address the shortcomings of urine cytology and cystoscopy for probing and grading urinary bladder cancer (BC), we applied 1H nuclear magnetic resonance (NMR) spectroscopy as a surrogate method for the identification of BC. This study includes 99 serum samples comprising low-grade (LG; n = 36) and high-grade (HG; n = 31) BC as well as healthy controls (HC; n = 32). 1H NMR-derived serum data were analyzed using orthogonal partial least-squares discriminant analysis (OPLS-DA). The validity of the OPLS-DA-derived model was confirmed using internal and external cross-validation. Internal validation was performed using the initial sample (n = 99) data set. External validation was performed on a new batch of suspected BC patients (n = 106) through a double-blind study. Receiver operating characteristic (ROC) curve analysis was also performed. OPLS-DA-derived serum metabolomics (six biomarkers; ROC, 0.99) discriminated 95% of BC cases from HC with 96% sensitivity and 94% specificity. Likewise, three biomarkers (ROC, 0.99) differentiated 98% of LG cases from HG with 97% sensitivity and 99% specificity. External validation revealed results comparable to those of the internal validation. 1H NMR-based serum metabolic screening appears to be a promising and less invasive approach for probing and grading BC in contrast to the highly invasive and painful cystoscopic approach for BC detection.

KEYWORDS: Urinary bladder cancer, serum metabolomics, NMR spectroscopy



INTRODUCTION

Urinary bladder cancer (BC) is the fifth most common cancer and is one of the leading causes of death worldwide.1 The presentation of BC may be either low-grade (LG) or high-grade (HG). LG is a non-muscle-invasive condition of BC; in contrast, HG is an aggressive condition that may be non-muscle-invasive or muscle-invasive. Even after treatment, the recurrence rate of BC remains high; thus, BC patients have to be under regular supervision for life using comprehensive urine examination protocols including cytology, imaging, and highly expensive and invasive cystoscopy.2−4 The gold-standard cystoscopic examination carries a small but definite risk of morbidity, with a chance of false-negative results.5−7

© 2013 American Chemical Society

The voided-urine cytology approach is the most common for detection of HG BC; however, this method is subjective and costly, and it has interobserver variability as well as poor sensitivity and specificity, especially for LG tumors.8−10 Although many recent studies have reported various urine-based protein biomarkers for the identification of BC,3,11−16 the accuracy of these biomarkers does not match that of cystoscopy.2,13,17 Hence, a noninvasive or less invasive test for early identification of BC with high sensitivity, high specificity, and low cost is urgently needed.

Received: August 21, 2013
Published: November 13, 2013

dx.doi.org/10.1021/pr400859w | J. Proteome Res. 2013, 12, 5839−5850

Journal of Proteome Research

Article

Urology, KGMU, and SGPGIMS, Lucknow. The study population included males >40 years of age with symptoms of hematuria, dysuria, and frequent urination. Patients included in this study had not received any treatment and had no comorbid conditions. Age- and sex-matched healthy controls (HC) were included. Exclusion criteria included renal pathology, urinary tract infections, diabetes, arthritis, any other malignancies, tuberculosis, endocrine disorders, drug abuse, and other conditions known to influence metabolic phenotype. All subjects gave written informed consent. In patients, the occurrence of tumors in the bladder was first appraised using cystoscopy. Subsequently, transurethral-resected tissue specimens were collected to perform histopathological evaluation to identify LG and HG BC. The adjacent noninvolved tissue samples were also collected from some of these patients and used as controls. The tissue samples were snap-frozen in liquid nitrogen and stored in a −80 °C freezer until histological processing. Tumors were classified as per the guidelines of the World Health Organization (WHO)/International Society of Urological Pathology (ISUP).39 To reduce the effect of dietary factors and interindividual variations on metabolic phenotype data, all patients and controls were on a restricted vegetarian diet (i.e., no meat or fish) for 48 h; subsequently, venous blood samples were drawn between 8:00 a.m. and 10:00 a.m. The blood was kept for 30 min in a vacutainer tube at room temperature for clotting. Clotted blood samples were centrifuged at 3000g at 4 °C for 10 min to separate the supernatant serum, which was rapidly stored at −80 °C until NMR analysis was performed. A total of 99 sera were collected from 36 patients with LG BC, 31 patients with HG BC, and 32 HC.

Metabolomic studies have growing applications in the identification of various diseases using different types of biofluids.18−21 This approach also offers potent and promising biomarkers associated with various cancers.22−25 Recent studies reveal the application of various techniques, such as high-performance liquid chromatography/mass spectrometry (HPLC/MS),26 liquid chromatography−mass spectrometry (LC−MS),27 gas chromatography−mass spectrometry (GC−MS),28 and proton nuclear magnetic resonance (1H NMR) spectroscopy,29 to identify BC patients using a urine sample and metabolomic approach. Very recently, urine samples from a canine model have also been used to differentiate between BC and control subjects.30 However, the differentiation of LG and HG BC has not been explored using a urine-derived metabolomics approach. To some extent, urine metabolomics is susceptible to a dilution factor (depending upon the amount of liquid intake) as well as a distinct cultural and severe dietary influence (vegetarian or nonvegetarian).31,32 These intrinsic limitations make urine a less suitable biofluid for differentiating LG and HG BC. In contrast to urine, serum-based metabolomics of BC may be a better choice because serum is less prone to be affected by exogenous factors and its intra- and interindividual variations are far smaller.31 Moreover, around 250 metabolites in serum are sufficiently reliable, both analytically and biologically, for potential use in building mathematical models of serotype.33 To date, only one report has revealed metabolic variations in serum in a BC context using NMR spectroscopy,34 which typically measures the relative intensities of various metabolites; however, that report did not explore the sensitivity and specificity of the biomarkers and also did not support its results with external validation.
Moreover, the relative intensities of various metabolites do not provide a complete picture of perturbed metabolism in BC. An absolute quantification of metabolites is more informative than relative intensities because it can detect changes in the pool size of metabolites. The pool size can also be further used for the flux measurement of various metabolites through their enzyme kinetics reactions, as flux is critically altered in disease conditions.35−38 Thus, an absolute quantification of metabolites promises more profound bioenergetic insights than the relative intensities. These indices have yet to be explored further. These information gaps raise several vital research questions. First, do the LG and HG forms of BC generate signature serum metabolic phenotypes? Second, is serum metabolomics an adequately sensitive and promising approach to differentiate LG and HG BC? Third, is serum metabolomics on par with the gold-standard approach of cystoscopy for BC detection? To answer these questions, the aim of the present study was to determine whether NMR-derived serum metabolomics would allow early prediction of LG and HG BC development and thereby suggest a proof-of-principle for the application of NMR spectroscopy on serum metabolomics in the diagnosis of human BC.



Histopathological Examinations

Within 1 week of storage, all tissue samples were fixed in 10% buffered formalin and embedded in paraffin wax for histopathology. Tissues were sliced at a thickness of 5 to 6 μm using a microtome followed by staining with hematoxylin and eosin (H & E) for pathological assessment by the Department of Pathology, KGMU. An average of 2−5 slices was examined for each tissue sample.

NMR Experiments

The NMR experiments were performed on a Bruker Avance 800 MHz spectrometer using a 5 mm broad-band inverse probe-head with a Z-shielded gradient. Serum samples were thawed at room temperature; 400 μL of each serum sample was placed in a 5 mm NMR tube. A coaxial insert containing 0.006 mg of trimethylsilyl propionic acid sodium salt (TSP) deuterated at CH2 groups was used for the deuterium lock, external reference, and standard signal for the absolute quantitative estimation of metabolites. For all of the specimens, 1D 1H NMR measurements were performed using a Carr−Purcell−Meiboom−Gill (CPMG) sequence with water suppression by presaturation at 25 °C. The parameters used were spectral sweep width, 16 500 Hz; data points, 64 K; pulse angle, 90°; total relaxation delay, 5 s; T2 filtering, echo time of 100 μs repeated 300 times, resulting in a total effective echo time of 30 ms; number of scans, 64; and line broadening, 0.3 Hz. Multivariate chemometric analysis was applied to the data generated from the CPMG sequence because this experiment yields spectra with a smooth baseline. Bruker TopSpin software (version 2.1) was used for the phase and baseline correction of all NMR spectra.
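As an illustration of how a reference signal such as TSP can support absolute quantification, the following minimal Python sketch scales a metabolite peak integral by the per-proton integral of the reference. It is not the authors' procedure: the function name, the argument names, the `calibration` factor (a placeholder for any correction needed for the coaxial-insert geometry), the example integrals, and the molar mass of 172.27 g/mol for the deuterated TSP sodium salt are all assumptions. Only the 0.006 mg of TSP and the 400 μL sample volume come from the text.

```python
# Sketch only: names, the calibration factor, and the example integrals are
# illustrative assumptions, not values or methods taken from the paper.

def absolute_concentration_mM(i_met, n_h_met, i_ref, n_h_ref,
                              ref_mass_mg, ref_molar_mass_g_mol,
                              sample_volume_uL, calibration=1.0):
    """Estimate a metabolite concentration (mmol/L) from integrated peak areas."""
    # Reference amount: mg / (g/mol) gives mmol; multiply by 1000 for micromoles.
    ref_umol = ref_mass_mg / ref_molar_mass_g_mol * 1000.0
    # Compare the per-proton intensity of the metabolite with that of the reference.
    per_proton_ratio = (i_met / n_h_met) / (i_ref / n_h_ref)
    met_umol = ref_umol * per_proton_ratio * calibration
    # umol / uL equals mol/L; multiply by 1000 for mmol/L.
    return met_umol / sample_volume_uL * 1000.0


# Hypothetical integrals for a one-proton metabolite signal and the
# nine-proton trimethylsilyl signal of TSP.
# 172.27 g/mol is an assumed molar mass; check the supplier's value.
conc = absolute_concentration_mM(
    i_met=2.0, n_h_met=1, i_ref=9.0, n_h_ref=9,
    ref_mass_mg=0.006, ref_molar_mass_g_mol=172.27,
    sample_volume_uL=400.0,
)
print(f"{conc:.4f} mM")
```

Dividing each integral by the number of contributing protons is what makes signals of different multiplicity comparable; any instrument- or geometry-specific correction would enter through the assumed `calibration` argument.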

METHODS

Patients and Sample Collection

The institutional review board and ethical committee of King George’s Medical University (KGMU) and Centre of Biomedical Research, Sanjay Gandhi Post Graduate Institute of Medical Sciences (SGPGIMS) campus, Lucknow, approved this study. Participants were recruited from the Department of


Figure 1. An overview of the workflow performed for the serum metabolic profiling of bladder cancer patients using NMR spectroscopy.

Chemometric Data Analysis

Each NMR spectrum was aligned to the methyl peak of alanine at 1.46 ppm. To facilitate statistical analysis, reduction of the NMR data was performed using Bruker AMIX software (version 3.8.7, Bruker BioSpin, Germany). NMR spectral data were binned to 4000 integrated regions with an equal width of 0.003 ppm. The spectral region of δ 0.3 to 10.0 ppm was used for analysis and exported to Microsoft Office Excel 2010. The region from about δ 4.5 to 5.0 ppm was deleted to eliminate residual water resonances. AMIX-software-generated binned data were used for statistical analysis. For normalization, each data point was divided by the sum of all data points present in the sample. The normalization process helps minimize variations caused by sample volume, preparation, and analysis.40 The whole data matrix was scaled to unit variance, which gives identical weight to all variables.41

To reduce the number of variables and select the most important ones that can discriminate among the cohorts under study, we performed one-way ANOVA followed by a post hoc Student−Newman−Keuls multiple-comparison test using GraphPad to identify significantly different spectral bins among the LG, HG, and HC cohorts. The variables with p < 0.05 were selected. The ensuing data matrices were used for multivariate statistical analyses using Unscrambler X software (version 10.0.1, Camo, Norway). The supervised orthogonal partial least-squares discriminant analysis (OPLS-DA) was applied to check grouping trends42 because this approach uses class membership information to attempt maximum segregation among different classes of observations and builds a model that can be used to detect potential biomarkers related to the discrimination among the LG, HG, and HC cohorts.
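The preprocessing steps named in this section (removal of the water region, normalization to the total spectral sum, and unit-variance scaling) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the AMIX, Excel, or Unscrambler workflow the study used. The function names and the toy four-bin spectra are hypothetical, and mean-centering before dividing by the standard deviation is an assumption about how unit-variance scaling was applied.

```python
from statistics import mean, stdev


def drop_water_region(ppm, spectrum, lo=4.5, hi=5.0):
    """Remove bins whose chemical shift lies in the residual water region."""
    return [v for p, v in zip(ppm, spectrum) if not (lo <= p <= hi)]


def sum_normalize(spectrum):
    """Divide each bin by the total intensity of its own spectrum."""
    total = sum(spectrum)
    return [v / total for v in spectrum]


def unit_variance_scale(matrix):
    """Mean-center each bin (column) and divide by its sample standard deviation."""
    columns = list(zip(*matrix))
    centers = [mean(c) for c in columns]
    spreads = [stdev(c) for c in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, centers, spreads)]
        for row in matrix
    ]


# Toy data: three hypothetical spectra with four bins each (not study data).
ppm = [0.9, 1.5, 4.7, 7.0]
raw = [
    [10.0, 30.0, 500.0, 60.0],
    [20.0, 20.0, 800.0, 60.0],
    [30.0, 40.0, 300.0, 30.0],
]

normalized = [sum_normalize(drop_water_region(ppm, row)) for row in raw]
scaled = unit_variance_scale(normalized)
```

After these steps every spectrum sums to 1 and every retained bin has zero mean and unit variance across samples, so intense and weak resonances contribute equally to the later multivariate model.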

Model Validation

To appraise overfitting of the OPLS-DA-derived model, internal cross-validation (ICV) was performed. For ICV, permutation tests with 100 iterations were performed. This rigorous test compares the goodness of fit of the original model with that of randomly permuted models. For permutation tests, all samples were randomly divided into two sets: 60% of the samples were used as the training set and 40% as the test set. For each iteration, the training set was 7-fold cross-validated and used to predict the class membership of the test-set samples. The


Figure 2. Histopathological analysis of urinary bladder tissue samples (20× magnification, hematoxylin and eosin stain) from (A) healthy control, (B) low-grade BC, and (C) high-grade BC patients is shown alongside the 1H NMR spectra of their corresponding human serum samples. Key: 1, HDL + LDL + VLDL; 2, valine; 3, isoleucine; 4, lactate; 5, alanine; 6, lysine; 7, proline; 8, acetate; 9, N-acetylglycoprotein; 10, glutamate; 11, pyruvate; 12, glutamine; 13, citrate; 14, hypotaurine; 15, dimethylamine; 16, creatinine; 17, malonate; 18, choline; 19, glycine; 20, tyrosine; 21, histidine; and 22, formate.

was exported to Microsoft Office Excel 2010, and the data were analyzed statistically as described above in the Chemometric Data Analysis section. After several steps of data mining, an OPLS-DA approach was used to appraise the grouping trends and clustering using samples from all new cases. The sensitivity and specificity of the grouping trends of each new batch of samples were also evaluated.
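For readers unfamiliar with the terms, sensitivity and specificity follow directly from the counts of correct and incorrect class assignments. The short Python sketch below shows the calculation; the function name and the seven example labels are hypothetical and are not data from this study.

```python
def diagnostic_metrics(true_labels, predicted_labels, positive="BC"):
    """Compute sensitivity, specificity, and accuracy for a two-class problem."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "sensitivity": tp / (tp + fn),   # fraction of true cases detected
        "specificity": tn / (tn + fp),   # fraction of controls correctly cleared
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical clinical diagnoses versus model predictions (illustration only).
truth = ["BC", "BC", "BC", "BC", "HC", "HC", "HC"]
model = ["BC", "BC", "BC", "HC", "HC", "HC", "BC"]
metrics = diagnostic_metrics(truth, model)
```

In a double-blind design such as the one described here, `truth` would correspond to the clinical evaluation held by the clinician and `model` to the class predicted from the serum spectra.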

rigorous ICV test provides model validity in the form of the explained variance results and the accuracy of prediction in the OPLS-DA model, R2 and Q2, respectively. Predictions of test samples were made with the help of a visual presentation of the Y-predicted scatter plot. Potential biomarkers were assigned on the basis of S-plot and variable-importance plots (VIPs). To determine the differences among various cohorts, such as LG versus HC, HG versus HC, and LG versus HG, pairwise discriminant analysis was also performed. R2 and Q2 values were also obtained for pairwise discriminant analysis. For clinical utility, receiver operating characteristic (ROC) curve analysis was also executed to verify the robustness of the OPLS-DA model for differentiating specific cohorts.
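The R2 and Q2 statistics mentioned above share one formula and differ only in which predictions are used: R2 is computed from the fitted values of the model, whereas Q2 is computed from predictions made for samples held out during cross-validation. The following Python sketch uses the standard definitions (one minus the residual sum of squares over the total sum of squares); the function name and the toy class-membership values are hypothetical and do not reproduce the software output reported in this study.

```python
def explained_fraction(y_true, y_pred):
    """Return 1 - (sum of squared errors) / (total sum of squares)."""
    y_mean = sum(y_true) / len(y_true)
    total_ss = sum((y - y_mean) ** 2 for y in y_true)
    error_ss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - error_ss / total_ss


# Hypothetical dummy class variable (0 = control, 1 = case), illustration only.
y = [0.0, 0.0, 1.0, 1.0]
y_fitted = [0.1, -0.1, 0.9, 1.1]      # values from a model fitted on all samples
y_cross_val = [0.2, -0.2, 0.8, 1.2]   # values predicted while each sample was held out

r2 = explained_fraction(y, y_fitted)     # goodness of fit
q2 = explained_fraction(y, y_cross_val)  # goodness of prediction
```

A Q2 that is close to R2 suggests that the model predicts held-out samples nearly as well as it fits the training data, whereas a large gap between the two would point to overfitting.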



RESULTS

The complete scheme for the analysis of participants as well as their metabolic profile employed in this study is shown in Figure 1. Figure 2A−C exhibits typical 1D 1H NMR-annotated spectra, providing an overview of the complete metabolic profile and chemical-shift assignments of different resonances in sera from HC, LG, and HG subjects. Histopathological examinations of the corresponding tissue samples are also shown alongside each NMR spectrum. Clinico-pathological data of all participants in this study are summarized in Table S1 (Supporting Information). Normal urinary bladder epithelium is multilayered and is composed of basal, intermediate, and very large surface cells that look like an umbrella. Cytologically, it demonstrated normal nuclear size and shape (Figure 2A). Low-grade BC exhibited fused and branching papillae. The cells were observed to be ordered and cohesive, with minimal crowding and

External Validation

To validate the predictive capability of the constructed model for known samples, a new double-blind study was designed that guards against the experimental bias of known samples. A new batch of suspected cases (n = 106) was registered comprising LG, HG, and HC. This new batch of cases was not included in the initial data set. It was an altogether separate group of subjects. Clinical evaluations of the new batch of cases were undisclosed to patients and members of the NMR research team. Only the clinician and statistician (different from the previous statistician) knew about the clinical evaluation reports of the new batch of cases. After binning and removal of water resonance, the whole NMR data from the new batch of cases


Figure 3. Score plots generated from OPLS-DA analysis of 1H NMR spectra of known samples: (A) HC vs LG + HG, (B) HC vs LG, (C) HC vs HG, and (D) LG vs HG. OPLS-DA analysis results of unknown samples (double-blind study): (E) HC vs LG + HG, (F) HC vs LG, (G) HC vs HG, and (H) LG vs HG. HC, healthy control; LG, low grade BC; and HG, high grade BC.

minimal loss of polarity. Cytologically, it demonstrated enlarged nuclei that are round-oval in shape with inconspicuous nucleoli. The majority of cells exhibited an umbrella-like shape (Figure 2B). High-grade BC exhibited fused and branching papillae. The cells were observed to be disordered and discohesive with frequent loss of polarity. Cytologically, it demonstrated enlarged nuclei that have moderate-to-marked pleomorphism and multiple prominent nucleoli as well as an absence of umbrella-like cells (Figure 2C).

Multivariate Statistical Approach

To determine the performance of the method, OPLS-DA was employed on all of the serum samples, and the results showed a clear separation and a tightly clustered pattern among the various cohorts in the OPLS-DA score plot (Figure 3A). Similarly, pairwise (HC vs LG, HC vs HG, and LG vs HG) analysis also revealed a well-differentiated and clustered pattern in the OPLS-DA score plots (Figure 3B−D). These well-differentiated and tightly clustered patterns verify that this approach was robust. The OPLS-DA approach uses class information to make the utmost discrimination among classes of observations; thus, this


Figure 4. Box plots denoting the 25th to 75th percentile for the concentrations of typical metabolites in serum samples from HC, LG, and HG subjects. The median, or 50th percentile, is drawn as a black horizontal line inside the box. ***, p < 0.001. Typical 1H NMR spectra of corresponding metabolites are shown at the top of the box plot for its respective group. Metabolites were assigned as DMA (s), 2.67 ppm; malonate (s), 3.12 ppm; lactate (q), 4.10 ppm; glutamine (m), 2.42 ppm; histidine (s), 7.01 ppm; and valine (d), 1.03 ppm. s, singlet; d, doublet; q, quartet; m, multiplet; HC, healthy control; LG, low-grade BC; and HG, high-grade BC.

approach assisted in determining and discovering latent and vital biomarkers related to specific cohorts. The OPLS-DA model restricted various vital biomarkers, showing presentation statistics of R2Y (cum) = 0.934 and Q2 (cum) = 0.916 among all cohorts. When OPLS-DA analysis was performed between HC and LG, a significant discrimination was obtained, with R2Y (cum) = 0.976 and Q2 (cum) = 0.947. Using OPLS-DA analysis, a clear-cut discrimination between HC and HG cohorts was evident, with R2Y (cum) = 0.964 and Q2 (cum) = 0.942. Significant discrimination was obtained between the LG and HG cohorts using OPLS-DA analysis, with R2Y (cum) = 0.913 and Q2 (cum) = 0.876. To evaluate the predictive capability of the constructed model for unknown samples, an external test using a new batch of double-blind samples was appraised. Comparable clustering trends were achieved using the OPLS-DA approach, which estimated the true predictive accuracy and robustness of the constructed model for unknown samples because the new batches of samples were not previously included in the OPLS-DA analysis (Figure 3E−H). Regarding the statistical analysis, the OPLS-DA model of the double-blind study showed statistics of R2Y (cum) = 0.927 and Q2 (cum) = 0.905 among all cohorts. OPLS-DA analysis of the double-blind study exhibited R2Y (cum) = 0.997 and Q2 (cum) = 0.976 when HC was compared with LG and exhibited R2Y (cum) = 0.958 and Q2 (cum) = 0.930 when HC was compared with HG. Excellent segregation with R2Y (cum) = 0.902 and Q2 (cum) = 0.863 was achieved between the LG and HG cohorts of the double-blind OPLS-DA analysis. The results confirm the excellent capability of OPLS-DA analysis to be used as a viable procedure for less invasive screening and grading of BC.

Selection and Identification of Biomarkers

To mine the intricate data and to select potential metabolites, we performed a careful screening procedure. First, to reduce the chance of false-positive metabolites, the covariance-correlation-based approach was applied, and significantly important metabolites were gleaned from the S-plot using SIMCA-P software (version 8.0). Data were subjected to Pareto scaling before applying S-plots. Among the data set, the most altered variables were involved in S-plot generation (Figure S1). The variables with higher p and p (corr) values were the vital biomarkers that had the greatest influence on cluster formation among the groups. Second, these biomarkers underwent a screening procedure for their VIP values. The most influential variables comprised VIP scores >1 and were chosen for a further selection procedure. Variables that contained VIP scores