Article pubs.acs.org/jnp

Nontargeted Metabolomics Approach for Age Differentiation and Structure Interpretation of Age-Dependent Key Constituents in Hairy Roots of Panax ginseng

Nahyun Kim,† Kemok Kim,† DongHyuk Lee,‡ Yoo-Soo Shin,§ Kyong-Hwan Bang,§ Seon-Woo Cha,§ Jae Won Lee,‡ Hyung-Kyoon Choi,⊥ Bang Yeon Hwang,∥ and Dongho Lee*,†

† School of Life Sciences and Biotechnology, Korea University, Seoul 136-713, Korea
‡ Department of Statistics, Korea University, Seoul 136-701, Korea
§ Department of Herbal Crop Research, National Institute of Horticultural and Herbal Science, Rural Development Administration, Eumseong 369-873, Korea
⊥ College of Pharmacy, Chung-Ang University, Seoul 156-756, Korea
∥ College of Pharmacy, Chungbuk National University, Cheongju 361-763, Korea

Supporting Information

ABSTRACT: The age of the ginseng plant is considered an important criterion for determining the quality of this species. For age differentiation and structure interpretation of age-dependent key constituents of Panax ginseng, hairy root (fine root) extracts aged from four to six years were analyzed using a nontargeted approach with ultraperformance liquid chromatography/quadrupole time-of-flight mass spectrometry (UPLC-QTOFMS). Various classification methods were compared to find the one that best describes ginseng age by selecting influential metabolites of different ages. Through the metabolite selection process, several age-dependent key constituents having the potential to be biomarkers were determined, and their structures were identified according to tandem mass spectrometry and accurate mass spectrometry by comparing them with an in-house ginsenoside library and with literature data. This proposed method applied to the hairy roots of P. ginseng showed an improved efficiency of age differentiation when compared to previous results on the main roots and increased the possibility of identifying key metabolites that can be used as biomarker candidates for quality assurance in ginseng.

© 2012 American Chemical Society and American Society of Pharmacognosy

Many contemporary drugs are based on chemicals that are obtained initially from natural products, and their use has a long history worldwide. About a half of all major drugs used today are derived directly from natural products that have one or more compounds extracted from plants or other organisms.1−3 Owing to positive outcomes after the use of traditional medicines, the interest of consumers and their demands for medicinal herbs have been growing gradually.4 To meet this increased demand, a comprehensive evaluation of medicinal plants is necessary to properly control their quality.5

Among plant drugs, Panax ginseng C.A. Meyer (Araliaceae) has been used as a highly valued medicinal herb for its tonic or adaptogenic effects. The major constituents of ginseng are ginsenosides, and their pharmacological activities have been widely investigated.5−7 From the roots of P. ginseng, more than 40 different ginsenosides have been isolated, which are mainly dammarane triterpenes with a trans-ring rigid tetracyclic skeleton, and are generally divided into several types according to the aglycone moieties: the protopanaxadiol (PPD) type with sugar moieties attached to C-3 and/or C-20; the protopanaxatriol (PPT) type with sugar moieties at C-6 and/or C-20; the oleanane type; and other PPD and PPT derivatives.7−9 Determining the presence and measuring the total content of ginsenosides is the most widely used method to standardize ginseng products.7,9 However, because the overall quality of ginseng is dependent on many factors, including geographical origin, species, age, and various environmental changes, adulteration and substitution of raw materials can interfere with the correct and proper use of the plant. Therefore, for effective quality control of ginseng age, a standardized method is required.

Various analytical instruments have been used to determine and examine diverse constituents in P. ginseng. Recent developments in metabolomics enable the direct determination of metabolites in different living systems. With the comprehensive detection of major and minor metabolites,

nontargeted metabolomics helps to provide an understanding of the biochemical status of plants in different systems by taking into account all information in the data sets.5,10 A challenge in metabolomics is the identification of key metabolites that account for variations among test samples. Accordingly, statistical analysis plays an important role to conduct metabolomics studies successfully not only to summarize and visualize a given data set but also to identify potential biomarkers.

Traditionally, NMR-based identification has been considered the best strategy for natural product structure determination; however, combining both LC and MS methods has emerged as a new possibility for online structural characterization and rapid identification of compounds in complex herbal extracts.8,11,12 In particular, combining UPLC with MS provides the sensitivity, selectivity, accuracy, and rapidity needed for the analysis of analytes at low concentrations in complex matrices.13 Multistage mass spectrometric analysis provides structural information that can be useful in the elucidation of metabolites. The identification efficiency of biomarkers has been increased based on delicate analyses, such as tandem mass (MS/MS) and accurate mass (HRMS), and the measurements of analytes together with available reference compounds.

Recently, we reported the use of the UPLC-QTOFMS-based metabolomic approach together with precise statistical analysis for age discrimination of P. ginseng main roots.14 The results showed that this approach could be used as a potential tool to standardize an age parameter for quality control of P. ginseng. In the present study, the same method for determining the age of P. ginseng was applied to its hairy roots (fine roots), and the results showed improved classification accuracy with less sample destruction using a minimal amount of hairy roots instead of the main roots. In addition, this allowed the identification of several key metabolites related to age variation by analyzing the structural information obtained using the UPLC-QTOFMS-based metabolomics technique, as described herein.

Received: July 17, 2012 Published: September 24, 2012

dx.doi.org/10.1021/np300499p | J. Nat. Prod. 2012, 75, 1777−1784

Journal of Natural Products

Article

Figure 1. PCA 2D score plot of Panax ginseng extracts of ages ranging from four to six years with detected metabolites: 4Y, 5Y, and 6Y represent four-, five-, and six-year-old ginseng, respectively.

RESULTS AND DISCUSSION

The chromatographic information from the nontargeted metabolite profiling of 30 P. ginseng hairy root samples in the negative-ionization mode of UPLC-QTOFMS was preprocessed to extract mass signals and align them by matching identical peaks from different samples and was tabulated subsequently to form a data matrix. The data matrix, composed of detected and quantified metabolites from the UPLC-QTOFMS analysis, was introduced to several steps of data preprocessing and treatment: missing value imputation, normalization, and outlier detection. Our previously reported approach to data treatment was applied in this study,14 and the final data set was obtained for subsequent multivariate analyses.

With the arrayed data set, principal component analysis (PCA) and hierarchical clustering analysis (HCA), the representative statistical analysis methods for clustering samples in a metabolomics study, were performed. As shown in Figure 1, the two-dimensional PCA result showed clear separation of the hairy root ages, explaining 48.04% of the total variance. HCA was performed to reveal the closeness between samples and groups. As shown in Figure 2, three different ages showed clear separations from one another: those of five and six years had merged into clusters to which roots of four years old had merged. These results showed that by using hairy root samples, ginseng ages can be differentiated using UPLC-QTOFMS analysis and simple multivariate analyses such as PCA and HCA. These results are quite different from those of the main root samples, which did not show any clear discrimination between samples of different age.

From the multiparametric sets of metabolomic data, refining metabolites that decide the grouping pattern and finding marker candidates are the major concerns when using metabolomics, especially the nontargeted approach. From the number of classification methods and by making a comparison within them, an appropriate classification method exhibiting sufficient performances (i.e., having the highest classification accuracy) can be suggested for the data set. Here, metabolite subsets were tested based on random forest (RF), prediction analysis of microarray (PAM), and partial least-squares-


metabolites for the classification allowed the reduction of the number of metabolites to a minimum, while keeping the accuracy to a maximum. Finally, 10 metabolites, selected from at least two or all three classification methods, were chosen as the key constituents for the differentiation of ginseng hairy root ages. As a first step for the structure study of suggested age-dependent key constituents, an in-house ginsenoside library consisting of 14 reference standards was established based on their mass spectrometric information, including retention time (tR), m/z, and MS/MS (Table S1, Supporting Information). In addition, all these standards were identified readily in the ginseng extract used (Figure S1, Supporting Information). Triterpene cores of ginsenosides have either three or four hydroxy groups, and sugars are attached at two positions among them: C-3 and/or C-20 in the PPD type and C-6 and/or C-20 in the PPT type. The number and types of sugars and their linkage sequences are combined to form a variety of structures. The 14 ginsenosides tested have different fragmentation patterns, and each observed ion provides information on the mass of aglycone and the sugar chains present in terms of the types of monosaccharide. The most common sugar residues evident in P. ginseng are hexoses (glucose), deoxyhexoses (rhamnose), and pentoses (arabinose and xylose). The metabolomic approach for age differentiation of P. ginseng revealed 10 potential markers, which are listed in Table 2. Through careful consideration of the suggested key constituents, the structures of several metabolites were assigned using LC-MS by combining the tR, m/z, HRMS, and MS/MS fragmentation pattern and then comparing the results obtained with those obtained for the reference compounds or literature data. The 10 marker ions were analyzed using a biplot, which shows the correlation among the samples and the metabolites within a single plot (Figure 3).
This helped identify the component that is exhibited more abundantly according to the ginseng age. As displayed in Figure 4, box plots of each marker candidate showed a difference in relative abundance for each age (the y-axis means the scaled relative intensity). On the basis of their abundance at different ages of ginseng extracts, they were divided into three groups: group I, abundant at four years of age; group II, abundant at five years of age; and group III, abundant at six years of age. Among 10 candidates, M42 and M100 were suggested as markers in four-year-old ginseng. Two ions detected in group I showed a strong correlation with ginsenosides, the major components in the extracts, and were identified as ginsenosides Re and Rd by comparing them to those in the in-house ginsenoside library containing tR, HRMS, and MS/MS information (Table S1). HRMS analysis was used to identify M42, m/z 945.5423 [M − H]− with the molecular formula C48H81O18, which was the same as the ginsenoside Re. Subsequently, the tR and m/z of M42 corresponded to those of ginsenoside Re. To confirm the structure, MS/MS was performed, and the fragment ions and sequence clearly matched those of the reference compound in the in-house library. Similarly, M100 was identified as ginsenoside Rd. During the structure study of M100, it was found that the detected ion for M100 (m/z 1419.5) was the adduct ion for the ginsenoside Rd. Even though the nature of the adduct ion could not be identified, M100 was considered as ginsenoside Rd by comparing the tR value and MS/MS fragmentation ions with the reference compound.

Figure 2. HCA dendrogram of Panax ginseng extracts of ages ranging from four to six years with detected metabolites.

discriminant analysis (PLS-DA), and the accuracy of each method was calculated and compared to decide the most appropriate to use.14−20 In addition, classification performance was measured through a 10-fold cross-validation repeated 50 times, which assures the validity of the methods. Further details about this metabolite selection process were described in a previous study.14 Table 1 shows the selected optimal number of metabolites and their cross-validation accuracy in each method. Using RF, PAM, and PLS-DA, 11, 25, and 11 metabolites out of 108 in total detected were selected, respectively, as the optimal number of metabolites for the classification. Through 10-fold cross-validation repeated 50 times, their accuracies were 1.000, meaning that the ages of P. ginseng were classified with 100% accuracy in this data set. Furthermore, an effort to find key

Table 1. Cross-Validation Accuracy of Each Age of Panax ginseng Ranging from Four to Six Years of Age Using Different Classification Methods

                       no. of selected  CV accuracy (n = 30)
classification method  metabolites      four years  five years  six years  mean
RF                     11               1.000       1.000       1.000      1.000
PAM                    25               1.000       1.000       1.000      1.000
PLS-DA                 11               1.000       1.000       1.000      1.000
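The validation scheme described here (10-fold cross-validation repeated 50 times on 30 samples) can be sketched in a few lines. The following is only an illustration under stated assumptions: the data are synthetic, the function names are hypothetical, and a plain nearest-centroid classifier stands in for RF, PAM, and PLS-DA, none of which are reimplemented here.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k, rng):
    """Split sample indices into k folds, spreading each class evenly."""
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    folds = [[] for _ in range(k)]
    for lab in sorted(by_class):
        idxs = by_class[lab]
        rng.shuffle(idxs)
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    return folds

def nearest_centroid_predict(train_X, train_y, x):
    """Assign x to the class whose mean training profile is closest.
    This is a stand-in classifier, not the RF/PAM/PLS-DA of the study."""
    sums, counts = {}, defaultdict(int)
    for row, lab in zip(train_X, train_y):
        if lab not in sums:
            sums[lab] = [0.0] * len(row)
        sums[lab] = [s + v for s, v in zip(sums[lab], row)]
        counts[lab] += 1
    best_lab, best_dist = None, float("inf")
    for lab in sorted(sums):
        centroid = [s / counts[lab] for s in sums[lab]]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_lab, best_dist = lab, dist
    return best_lab

def repeated_cv_accuracy(X, y, k=10, repeats=50, seed=0):
    """Fraction of held-out samples classified correctly over all repeats."""
    rng = random.Random(seed)
    correct = total = 0
    for _ in range(repeats):
        for fold in stratified_folds(y, k, rng):
            held_out = set(fold)
            train_X = [X[i] for i in range(len(X)) if i not in held_out]
            train_y = [y[i] for i in range(len(y)) if i not in held_out]
            for i in fold:
                correct += nearest_centroid_predict(train_X, train_y, X[i]) == y[i]
                total += 1
    return correct / total

# Synthetic stand-in data: 30 samples, 10 per age group, 3 made-up "metabolites"
_rng = random.Random(1)
ages = ["4Y"] * 10 + ["5Y"] * 10 + ["6Y"] * 10
offsets = {"4Y": 0.0, "5Y": 5.0, "6Y": 10.0}
profiles = [[offsets[a] + _rng.uniform(-1, 1) for _ in range(3)] for a in ages]
accuracy = repeated_cv_accuracy(profiles, ages)
```

Because the synthetic groups are well separated, every held-out sample is assigned correctly, which mirrors the form of the accuracies in Table 1 without reproducing the study's data.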


Table 2. Suggested Age-Dependent Key Constituents for Age Differentiation of Panax ginseng and Identified Structures Using UPLC-QTOFMS

group            no.  metabolite no.  identification            tR (min)  accurate mass  exact mass  mass error (ppm)  formula    MS/MS fragment ion (m/z)
I (four years)   1    M42             ginsenoside Re            3.76      945.5421       945.5423    −0.2              C48H82O18  945; 783; 637; 475
                 2    M100            ginsenoside Rd            13.71     945.5454       945.5423    3.3               C48H82O18  945; 783; 621; 459
II (five years)  3    M8              N.D.a                     0.69      447.1115                                                447; 151
                 4    M23             ginsenoside Rg2           8.70      783.4930       783.4895    4.5               C42H72O13  783; 637; 475
                 5    M26             N.D.                      5.67      577.2058                                                531; 193
                 6    M53             ginsenoside derivative 1  3.43      1077.5891      1077.5846   4.2               C53H90O22  1077; 945; 783; 637; 475
                 7    M61             ginsenoside derivative 2  17.30     1001.5315      1001.5321   −0.6              C50H82O20  957; 915; 783; 621; 459
III (six years)  8    M25             N.D.                      20.70     595.2811                                                595; 279; 152
                 9    M75             ginsenoside Rb1           9.89      1107.5911      1107.5951   −3.6              C54H92O23  1107; 1089; 945; 783; 621; 459
                 10   M95             malonyl-ginsenoside Rb1   10.53     1193.5956      1193.5955   0.1               C57H94O26  1149; 1107; 1089; 945; 783; 621; 459

a N.D.: not determined.
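The exact-mass and mass-error columns of Table 2 follow from simple arithmetic, sketched below. The function names are illustrative, and one detail is an assumption rather than something the text states: the [M − H]− value is computed by subtracting the mass of a hydrogen atom and neglecting the electron mass, a convention that happens to reproduce the tabulated exact masses.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}

def monoisotopic_mass(formula):
    """Sum monoisotopic masses for a simple formula such as 'C48H82O18'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total

def deprotonated_mz(formula):
    """[M - H]- value, neglecting the electron mass (assumed convention)."""
    return monoisotopic_mass(formula) - MONOISOTOPIC["H"]

def ppm_error(measured, theoretical):
    """Relative deviation of a measured m/z from theory, in parts per million."""
    return (measured - theoretical) / theoretical * 1e6

# M42 (ginsenoside Re): formula and accurate mass taken from Table 2
exact_re = deprotonated_mz("C48H82O18")
error_m42 = ppm_error(945.5421, round(exact_re, 4))
```

Applied to the other assigned rows, for example `ppm_error(945.5454, 945.5423)` for M100, the same two functions give values that agree with the tabulated errors after rounding to one decimal place.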

Figure 3. PCA biplot of Panax ginseng extracts of ages ranging from four to six years with key constituents: 4Y, 5Y, and 6Y represent four-, five-, and six-year-old ginseng, respectively.

As possible markers of five-year-old ginseng, five constituents were chosen. First, M23 was identified as ginsenoside Rg2 using the same method as for group I. By comparison of tR, HRMS, and MS/MS information with that in the in-house ginsenoside library, the structure of M23 was confirmed. Second, M53 was detected at m/z 1077.5891 [M − H]− in the UPLC-QTOFMS, and the HRMS result suggested a molecular formula of C53H90O22, which was the same as that of ginsenosides Rb2, Rb3, and Rc; however, the tR value and MS/MS fragmentation pattern of M53 were different from those of ginsenosides Rb2, Rb3, and Rc (Figures S2 and S3). The MS/MS result (Figure 5A) showed the fragment ion at m/z 945, indicating a loss of pentose linked to C-20. It is reported that the cleavage of the oligosaccharide residue happens initially at C-20 of the aglycone and then at C-3 or C-6, because the sugar moiety linked to C-20 is less stable than that linked to C-3 or C-6;

therefore, faster elimination occurs at lower energy levels.21 Subsequent fragment ions at m/z 783, 637, and 475 were formed by successive losses of glucose, rhamnose, and glucose, respectively, resulting in the PPT-type aglycone. From these results, it was presumed that M53 has a PPT-type aglycone with a glucose-rhamnose unit at C-6 and a glucose-pentose unit at C-20. The type of pentose, such as arabinose and xylose, has not been differentiated because, based on the current mass spectrometric information available, their low ion intensity makes it difficult to perform reliable MSn. This type of ginsenoside has not been previously reported in the roots of P. ginseng. Instead, it has been reported in floralginsenoside M, with an arabinose (furanose) unit, and floralginsenoside N, with an arabinose (pyranose), isolated from the flower buds of P. ginseng, and floralquinquenoside E, with a xylose unit, isolated from the flower buds of P. quinquefolium. Third, M61 showed


Figure 4. Box plots of identified marker candidates at different ages of Panax ginseng: group I (A); group II (B); group III (C).

an ion peak at m/z 1001.5315 [M − H]− in the UPLC-QTOFMS, and the HRMS observation suggested a molecular formula of C50H82O20. As shown in Figure 5B, the MS/MS exhibited typical fragmentation patterns of malonyl-ginsenosides. The deprotonated molecule of M61 showed a loss of CO2 (m/z 957) from the malonyl group and the loss of either AcOH (m/z 897) or (Ac−H) (m/z 915), which are two major fragment ions of malonyl-ginsenosides.8,22 In particular, the fragment ion at m/z 915 could be identified as a complementary ion of a malonyl residue. Subsequent fragment ions at m/z 783, 621, and 459 corresponded to the successive loss of the pentose and two glucose units, respectively, resulting in the PPD-type aglycone. On the basis of the cleavage order of the oligosaccharide residue,21 it could be predicted that the structure of M61 is a new malonyl-ginsenoside having a PPD-type aglycone with a malonyl-glucose unit at C-3 and a glucose-pentose unit at C-20. This type of ginsenoside with the presented sequence has not yet been reported; however, the position of the malonyl residue and the type of pentose need to

be confirmed for an unambiguous structure determination of M61. Finally, M8 and M26 showed relatively low abundance in the chromatogram and were considered minor components in the ginseng extract. With the information based on the suggested molecular formula within the limit of 20 ppm and the fragmentation patterns from literature data, it was not possible to characterize their structure because the results did not show specific fragment patterns related to ginsenosides. More in-depth study is needed to identify these structures. Three marker candidate molecules labeled M25, M75, and M95 in group III showed higher intensities in six-year-old samples than in four- and five-year-old samples. From the UPLC-QTOFMS analysis, M75 was found to be ginsenoside Rb1. With a tR, HRMS, and MS/MS comparison using the in-house ginsenoside library, it was possible to conclude the precise structure of M75. M95 was found at m/z 1193.5956 [M − H]− with a relatively high intensity in the extract and exhibited a good correlation with ginsenoside Rb1 based on a diverse MS analysis. The HRMS analysis suggested that the


Figure 5. Fragment ions observed in MS/MS analysis and suggested structure of M53 (A) and M61 (B) showing the types of sugars in each.
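The fragment-by-fragment reasoning used for M53 and M61 amounts to reading the mass difference between consecutive product ions as a neutral loss. The sketch below illustrates that bookkeeping with nominal masses only. The lookup table and function name are illustrative assumptions, and treating the listed ions as one linear loss chain is a simplification of real MS/MS interpretation.

```python
# Nominal neutral losses (Da) for the residues named in the text.
# The table is illustrative and deliberately small.
NEUTRAL_LOSSES = {
    42: "C2H2O (Ac-H)",
    44: "CO2",
    60: "AcOH",
    132: "pentose",
    146: "deoxyhexose",
    162: "hexose",
}

def annotate_losses(fragments):
    """Label the difference between each consecutive pair of nominal m/z
    values, assuming the ions form a single sequential loss chain."""
    labels = []
    for parent, product in zip(fragments, fragments[1:]):
        labels.append(NEUTRAL_LOSSES.get(parent - product, "unassigned"))
    return labels

# Nominal fragment series quoted for M53 and M61
m53_losses = annotate_losses([1077, 945, 783, 637, 475])
m61_losses = annotate_losses([1001, 957, 915, 783, 621, 459])
```

For M53 the chain reads pentose, hexose, deoxyhexose, hexose, in line with the pentose, glucose, rhamnose, glucose sequence given in the text. Nominal masses cannot distinguish arabinose from xylose, which matches the limitation the authors note.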

molecular formula of M95 was most likely C57H94O26 with a mass accuracy of 0.1 ppm, which was matched with ginsenoside Rb1, possessing an additional malonyl group. The major fragment ions of malonyl-ginsenoside Rb1 were observed from the MS/MS analysis of M95 as a loss of CO2 (m/z 1149) from the malonyl group and a loss of either AcOH (m/z 1089) or (Ac−H) (m/z 1107), as shown in Figure S4A. The successive fragment patterns showed a loss of the sugar moiety in sequence, which correlated with the fragment sequence of ginsenoside Rb1. To confirm the structure of M95, the method of Kite et al.23 was followed, describing the LC-MS analysis of malonyl-ginsenosides. By adopting the analytical condition of the reference, malonyl-ginsenoside Rb1 showed a retention time between the Rb1 and Rc eluting times, which corresponded with that of the reference data (Figure S4B). With this comprehensive approach, it was possible to identify and confirm the structure of M95, the key constituent in six-year-old samples, as malonyl-ginsenoside Rb1. Last, it was not possible to characterize the structure of M25 on the basis of the HRMS and MS/MS information available, in the same manner as M8 and M26 were characterized in group II. Additional studies need to be performed to complete this structure study.

Table 3. Calibration Curves and Content of Ginsenosides in Panax ginseng Extracts of Different Ages

                                                               content of ginsenosides (mg/g, %)
ginsenoside  biomarker (age)  calibration curve      r2     four years             five years             six years
Rb1          6                y = 34.629x + 24.244   0.998  6.719 ± 0.491 (0.672)  7.708 ± 0.907 (0.771)  8.825 ± 0.670 (0.883)
Rd           4                y = 48.084x + 22.399   0.999  0.060 ± 0.026 (0.006)  0.058 ± 0.054 (0.006)  0.031 ± 0.024 (0.003)
Re           4                y = 27.940x + 35.880   0.994  8.338 ± 1.464 (0.834)  7.454 ± 0.420 (0.745)  8.323 ± 0.940 (0.832)
Rg2          5                y = 59.606x + 93.807   0.995  0.197 ± 0.098 (0.020)  0.210 ± 0.062 (0.021)  0.169 ± 0.042 (0.017)
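How a linear calibration curve of the form y = ax + b is fitted and then inverted for quantification can be sketched as follows. This is a hedged illustration: the responses are generated noise-free from the Rb1 curve in Table 3 rather than measured, the units of x and y are not stated in the table and are left unspecified, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def concentration_from_response(response, slope, intercept):
    """Invert the calibration line to recover x from a detector response y."""
    return (response - intercept) / slope

def mg_per_g_to_percent(content_mg_per_g):
    """1 mg/g corresponds to 0.1% (w/w)."""
    return content_mg_per_g / 10.0

# Five calibration levels from the sample preparation text; the responses are
# synthetic, generated from the Rb1 curve in Table 3 with no noise added.
levels = [0.08, 0.4, 2.0, 10.0, 50.0]
responses = [34.629 * c + 24.244 for c in levels]
slope, intercept = fit_line(levels, responses)
recovered = concentration_from_response(responses[2], slope, intercept)
rb1_percent = mg_per_g_to_percent(6.719)
```

The last line reproduces the relationship between the two content figures reported per cell in Table 3: 6.719 mg/g of Rb1 at four years corresponds to roughly 0.672%.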

To identify the correlation of biomarker content between the model and biological samples, ginsenosides Rb1, Rd, Re, and Rg2, the identified marker constituents of six-, four-, four-, and five-year-old samples, respectively, were quantified and compared with ginseng root extracts of different ages. Calibration curves constructed for each of the reference standards showed good linearity with 0.994−0.999 of the correlation coefficients (Table 3). The ginsenoside Rb1 content increased with sample age, whereas the Rd and Re content decreased after year four. The ginsenoside Rg2 content was highest at five years of age. The ages exhibiting the highest contents of each ginsenoside were six, four, four, and five years for ginsenosides Rb1, Rd, Re, and Rg2, respectively, and these are reflected in the box plot results in Figure 4. Since the box plots of biomarkers indicated differences in their abundance for each age, the variations in the ages may influence ginsenoside biosynthesis pathways.8,24,25 Therefore, additional studies are necessary to understand the variation of ginsenoside content in ginseng extracts and its correlation with ginsenoside biosynthesis. In this study, the two major considerations concerning the metabolomics workflow were pinpointed as statistical analysis and marker identification. By using the proposed UPLC1782


acid in water (A) and 0.1% formic acid in acetonitrile (B). Separation was performed by gradient elution: 0.5 min 10% B, 2.5 min 30% B, 6 min 60% B, 9 min 90% B, 10.5 min 100% B for washing and 12 min 10% B to re-equilibrate the column with a flow rate of 500 μL/min. The column and sample managers were maintained at 35 and 15 °C, respectively, and the injection volume of each sample was 5 μL. Furthermore, to identify suggested markers, the UPLC elution conditions were modified and reoptimized considering sensitivity and resolution to detect the maximum number of ginsenosides in P. ginseng.27 The following gradient elution was applied: 0.5 min 10% B, 3.0 min 25% B, 10.0 min 30% B, 16.0 min 35% B, 18.0 min 50% B, 21.0 min 65% B, 23.0 min 100% B for washing and 25.0 min 10% B to re-equilibrate the column with a flow rate of 350 μL/min. The column and sample managers were maintained at 40 and 15 °C, respectively, and the injection volume of each sample was 5 μL. The mass spectrometer was operated in the negative-ion mode by scanning the m/z range of 200−1500. The optimized mass conditions were as follows: capillary voltage = 2800 V, cone voltage = 35 V, collision energy = 5 eV, source temperature = 100 °C, desolvation temperature = 250 °C, desolvation gas flow = 600 L/h. The cone voltage and collision energy were varied over the range 35−55 V and 30−50 eV, respectively, for MS/MS analysis to measure ginsenoside fragmentation ions. To ensure the accuracy and reproducibility of mass measurement, the LockSpray was operated using leucine-enkephalin as the independent reference compound at a concentration and flow rate of 500 pg/μL and 2 μL/min, respectively. Leucine-enkephalin was detected at m/z 554.2615 as a [M − H]− ion. For the high-resolution mass measurement, sodium formate was used to calibrate the mass system. MassLynx version 4.1 (Waters, Manchester, UK) was used for data acquisition and processing. 
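The two gradient programs above are lists of (time, %B) breakpoints. A small sketch of how the mobile-phase composition at an arbitrary time could be read from such a program is given below. It assumes linear ramps between breakpoints and a hold outside the programmed range, which the text does not state explicitly, and the names are illustrative.

```python
# (time in min, %B) breakpoints of the profiling gradient quoted in the text
PROFILING_GRADIENT = [(0.5, 10), (2.5, 30), (6.0, 60), (9.0, 90), (10.5, 100), (12.0, 10)]

def percent_b(t, gradient):
    """%B at time t, assuming linear ramps between breakpoints and a hold at
    the first or last composition outside the programmed range."""
    if t <= gradient[0][0]:
        return float(gradient[0][1])
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)
    return float(gradient[-1][1])

# Halfway through the first ramp (10% to 30% B between 0.5 and 2.5 min)
mid_ramp = percent_b(1.5, PROFILING_GRADIENT)
```

The same function accepts the longer marker-identification gradient if its breakpoints are supplied as a second list.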
For analysis of malonyl-ginsenoside Rb1, the marker for the six-year-old samples, a Thermo-Finnigan LC/MS/MS system consisting of a Surveyor autosampling LC system and LCQ Fleet (Thermo-Finnigan Scientific, San Jose, CA, USA) was used and compared with a reference from Kite et al.23 Chromatographic separation was performed using a Phenomenex Luna-C18(2) column (4.6 × 150 mm, 5 μm). The mobile phase consisted of water (A), acetonitrile (B), and methanol containing 5% acetic acid (C), with the following gradient elution (A/B/C): 0.0 min 60:20:20, 30.0 min 40:40:20, 31.0 min 0:80:20, 35.0 min 0:80:20, 37.0 min 60:20:20, and 45.0 min 60:20:20 with a flow rate of 1 mL/min. The injection volume was 10 μL. For mass spectrometric analysis, the source settings used for the ionization of malonyl-ginsenoside Rb1 were as follows: needle voltage, −4.5 kV; nebulizing and sheath nitrogen gas pressures, 80 and 20 psi; heated capillary temperature, 220 °C; capillary voltage, −26 V; tube lens offset voltage, 0 V. The ion trap was set to monitor ions over the m/z range of 200−1500. These conditions were optimized based on Kite et al.23

Data Analysis. The UPLC-QTOFMS data were processed as described by Kim et al.14 The peak finding, alignment, and filtering of raw data were performed using MarkerLynx XS Application Manager (Waters, Milford, MA, USA), and the parameters were set to the following conditions: retention time (tR) of 1−10.5 min, mass range from 200 to 1500 Da, mass tolerance of 0.05 ppm, intensity threshold of 250 counts, and noise elimination level of 6.00. The intensity threshold at 250 counts was higher than the default setting, which is normally 50−100 counts, to detect ions with higher intensity in the extracts.
The resulting three-dimensional matrix using tR, m/z, and intensities of all detected peaks was tabulated and exported for further statistical analyses: (i) data treatment for dealing with missing values and conversion to a proper data set for classification; (ii) selection of influential metabolites to decrease the sample size and improve data interpretability using three different classification methods such as RF, PAM, and PLS-DA; and (iii) multivariate analyses for visualization of grouping and clustering on the basis of the similarities and differences of the metabolic profiles. PCA and HCA were performed using the MarkerLynx XS Application Manager and R+ package (R Foundation for Statistical Computing, Vienna, Austria), respectively, for the

QTOFMS-based metabolomic approach together with precise statistical analysis, the ages of hairy roots from P. ginseng samples ranging from four to six years could be classified precisely using detected metabolites, and key metabolites were identified. Consequently, from in-depth structural studies of the biomarker candidates, seven key constituents were determined carefully, showing age-dependent variations in P. ginseng: ginsenosides Rb1, Rd, Re, and Rg2, malonyl-ginsenoside Rb1, and two ginsenoside derivatives including one new ginsenoside. These results showed an improved efficiency of age differentiation when compared to previous results on the main roots, which were clearly differentiated after the metabolite selection.14 It is noteworthy that without sample destruction and using a minimum amount of hairy roots of ginseng and with the identified markers, this nontargeted metabolomic approach can be applied to discriminate ginseng ages, which is an important criterion in evaluating the quality of this medicinal plant. Although long-term and more comprehensive studies are required to build the classification system and validate the biomarker candidates, the metabolomic approach is worthy of further exploration as an additional method for assessing the quality of other diverse medicinal plants.



EXPERIMENTAL SECTION

Plant Material. P. ginseng samples used for this study were cultivated at the National Institute of Crop Science (37°15′ N; 127°00′ E; alt. 24 m), Rural Development Administration, Suwon, Republic of Korea, based on the ginseng GAP standard cultivation guidelines,26 and 30 samples ranging from four to six years of age were harvested in September 2006. The voucher specimens have been deposited in the College of Life Sciences and Biotechnology (accession number KUDL2-2-61R-120R), Korea University, Seoul, Korea.

Standards and Chemicals. The standard ginsenosides Rb1, Rb2, Rb3, Rc, Rd, Re, Rf, Rg1, Rg2, Rg3, Rh1, and Rh2 were purchased from ChromaDex (Irvine, CA, USA), and ginsenosides Rg5 and Rk1 were obtained from VitroSys, Inc. (Seoul, Korea). HPLC-grade acetonitrile and methanol were purchased from Honeywell Burdick and Jackson (Muskegon, MI, USA), and distilled water was purified using an in-house system, aqua MAX-ultra system (Young Lin, Anyang, Korea). Leucine-enkephalin and sodium formate were purchased from Sigma-Aldrich (St. Louis, MO, USA), and formic acid was obtained from Duksan (Seoul, Korea). All chemicals used were of analytical grade, and all solvents were filtered through 0.2 μm membrane filters before analysis.

Sample Preparation. Each ginsenoside standard was prepared in 50% methanol. All stock and working solutions were stored at −20 °C before use. Stock solutions of ginsenosides Rb1, Rd, Re, and Rg2 were diluted to 0.08, 0.4, 2.0, 10.0, and 50.0 μg/mL to prepare the five-point calibration curves using the least-squares method; these were used to quantify compounds by UPLC-QTOFMS detection. The hairy roots of P. ginseng were cut, freeze-dried (Eyela, Tokyo, Japan), and powdered. For the analyses carried out, ginseng samples were prepared with an optimized method for detecting diverse ginseng metabolites based on a previous study.14 Samples were extracted ultrasonically with 70% MeOH and centrifuged at 12 000 rpm for 20 min. The supernatant was filtered through a 0.2 μm membrane filter, and the solvent was removed to dryness in vacuo. The residue was further diluted with 50% MeOH to obtain a final concentration of 2 mg/mL. Ten replicates of each sample group were prepared in order to obtain reliable results.

Liquid Chromatography−Mass Spectrometry Analysis. UPLC-QTOFMS was performed using an Acquity UPLC system (Waters, Milford, MA, USA) and a QTOF Micromass spectrometer (Waters, Manchester, UK). Chromatographic separation was performed with an Acquity UPLC BEH C18 column (i.d., 2.1 × 100 mm; particle size, 1.7 μm); the mobile phase consisted of 0.1% formic

dx.doi.org/10.1021/np300499p | J. Nat. Prod. 2012, 75, 1777−1784


interpretation of the variations among samples from different ages of ginseng.

Structure Assignment of Selected Age-Dependent Key Constituents. Among the selected metabolites, 10 potential biomarker candidates for discriminating ginseng age were suggested. By combining tR, m/z, HRMS, and MS/MS information with confirmation against standard compounds and literature data, the UPLC-QTOFMS analysis provided the structural information needed to assign these age-dependent key constituents.
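The accurate-mass part of this assignment step can be illustrated with a short sketch that compares a measured m/z against a small compound library within a ppm tolerance. Everything beyond the elemental compositions is an assumption for illustration: the four-entry `LIBRARY`, the 5 ppm tolerance, the choice of a deprotonated [M−H]− ion, and the function names are hypothetical and are not taken from the authors' in-house ginsenoside library or workflow.

```python
# Monoisotopic masses of the elements and the proton (standard values).
ELEMENT_MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON_MASS = 1.00727647

# Hypothetical mini-library: ginsenoside name -> elemental composition.
LIBRARY = {
    "Rb1": {"C": 54, "H": 92, "O": 23},
    "Rd": {"C": 48, "H": 82, "O": 18},
    "Re": {"C": 48, "H": 82, "O": 18},
    "Rg1": {"C": 42, "H": 72, "O": 14},
}


def monoisotopic_mass(composition):
    """Neutral monoisotopic mass from an element -> count mapping."""
    return sum(ELEMENT_MASS[el] * n for el, n in composition.items())


def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def candidates(measured_mz, tol_ppm=5.0):
    """Library entries whose assumed [M-H]- m/z lies within tol_ppm."""
    hits = []
    for name, composition in LIBRARY.items():
        # Assumption: singly charged, deprotonated ion.
        theoretical = monoisotopic_mass(composition) - PROTON_MASS
        if abs(ppm_error(measured_mz, theoretical)) <= tol_ppm:
            hits.append(name)
    return sorted(hits)


isomer_hits = candidates(945.5430)  # hypothetical measured m/z
```

Ginsenosides Rd and Re share the composition C48H82O18, so a query at their common m/z returns both names. Accurate mass alone cannot separate such isomers, which is why retention time and MS/MS fragmentation are combined with it in the assignment.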



ASSOCIATED CONTENT

Supporting Information

Additional figures to explain the structure exploration process of age-dependent key constituents are provided. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author

*Tel: +82-2-3290-3017. Fax: +82-2-953-0737. E-mail: [email protected].

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

This work was supported by BioGreen 21 (No. 20070501034007) and Agenda (No. 20120501030504), Rural Development Administration, Republic of Korea.

REFERENCES

(1) Gurib-Fakim, A. Mol. Aspects Med. 2006, 27, 1−93.
(2) Newman, D. J.; Cragg, G. M. J. Nat. Prod. 2012, 75, 311−335.
(3) Simpson, B. B.; Ogorzaly, M. C. Economic Botany: Plants in Our World; McGraw-Hill Inc.: New York, 2001; pp 262−267.
(4) Xie, G. X.; Ni, Y.; Su, M. M.; Zhang, Y. Y.; Zhao, A. H.; Gao, X. F.; Liu, Z.; Xiao, P. G.; Jia, W. Metabolomics 2008, 4, 248−260.
(5) Okada, T.; Afendi, M.; Takahashi, H.; Nakamura, K.; Kanaya, S. Curr. Comput.-Aided Drug Des. 2010, 6, 179−196.
(6) Attele, A. S.; Wu, J. A.; Yuan, C. S. Biochem. Pharmacol. 1999, 58, 1685−1693.
(7) Lu, J. M.; Yao, Q.; Chen, C. Curr. Vasc. Pharmacol. 2009, 7, 293−302.
(8) Qi, L. W.; Wang, C. Z.; Yuan, C. S. Nat. Prod. Rep. 2011, 28, 467−495.
(9) Fuzzati, N. J. Chromatogr., B 2004, 812, 119−133.
(10) Moco, S.; Vervoort, J.; Bino, R. J.; De Vos, R. C. H.; Bino, R. Trends Anal. Chem. 2007, 26, 855−866.
(11) Zhou, J. L.; Qi, L. W.; Li, P. J. Chromatogr. A 2009, 1216, 7582−7594.
(12) Qi, L. W.; Wen, X. D.; Cao, J.; Li, C. Y.; Li, P.; Yi, L.; Wang, Y. X.; Cheng, X. L.; Ge, X. X. Rapid Commun. Mass Spectrom. 2008, 22, 2493−2509.
(13) Guillarme, D.; Schappler, J.; Rudaz, S.; Veuthey, J. L. Trends Anal. Chem. 2010, 29, 15−27.
(14) Kim, N.; Kim, K.; Choi, B. Y.; Lee, D. H.; Shin, Y. S.; Bang, K. H.; Cha, S. W.; Lee, J. W.; Choi, H. K.; Jang, D. S.; Lee, D. J. Agric. Food Chem. 2011, 59, 10435−10441.
(15) Barker, M.; Rayens, W. J. Chemom. 2003, 17, 166−173.
(16) Breiman, L. Machine Learning 2001, 45, 5−32.
(17) Foulkes, A. S. Applied Statistical Genetics with R: for Population-Based Association Studies; Springer: New York, 2009; pp 181−186.
(18) Hastie, T.; Tibshirani, R.; Friedman, J. H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: New York, 2009; pp 651−654.
(19) Nguyen, D. V.; Rocke, D. M. Comput. Stat. Data Anal. 2004, 46, 407−425.
(20) Tibshirani, R.; Hastie, T.; Narasimhan, B.; Chu, G. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 6567−6572.
(21) Liu, Y.; Li, J.; He, J.; Abliz, Z.; Qu, J.; Yu, S.; Ma, S.; Liu, J.; Du, D. Rapid Commun. Mass Spectrom. 2009, 23, 667−679.
(22) Fuzzati, N.; Gabetta, B.; Jayakar, K.; Pace, R.; Peterlongo, F. J. Chromatogr. A 1999, 854, 69−79.
(23) Kite, G. C.; Howes, M. J. R.; Leon, C. J.; Simmonds, M. S. J. Rapid Commun. Mass Spectrom. 2003, 17, 238−244.
(24) Liang, Y.; Zhao, S. Plant Biol. 2008, 10, 415−421.
(25) Wang, J.; Gao, W.-Y.; Zhang, J.; Zuo, B.-M.; Zhang, L.-M.; Huang, L.-Q. Acta Physiol. Plant. 2012, 34, 397−403.
(26) National Institute of Crop Science. Ginseng GAP Standard Cultivation Guideline; Rural Development Administration: Suwon, Korea, 2009.
(27) Dan, M.; Su, M.; Gao, X.; Zhao, T.; Zhao, A.; Xie, G.; Qiu, Y.; Zhou, M.; Liu, Z.; Jia, W. Phytochemistry 2008, 69, 2237−2244.
