Article pubs.acs.org/JPCA
Solid Phase Excitation−Emission Fluorescence Method for the Classification of Complex Substances: Cortex Phellodendri and Other Traditional Chinese Medicines as Examples
Yao Gu,†,‡ Yongnian Ni,*,†,‡ and Serge Kokot§
† State Key Laboratory of Food Science and Technology, Nanchang University, Nanchang 330047, China
‡ Department of Chemistry, Nanchang University, Nanchang 330031, China
§ School of Chemistry, Physics and Mechanical Engineering, Science and Engineering Faculty, Queensland University of Technology, Brisbane 4001, Australia
ABSTRACT: A novel, simple, and direct fluorescence method for the analysis of complex substances and their potential substitutes has been researched and developed. Measurements involved excitation and emission (EEM) fluorescence spectra of powdered, complex medicinal herbs, Cortex Phellodendri Chinensis (CPC) and the similar Cortex Phellodendri Amurensis (CPA); these substances were compared with, and discriminated from, each other and the potential adulterants, Caulis mahoniae (CM) and David poplar bark (DPB). Different chemometrics methods were applied for resolution of the complex spectra, and the excitation spectra were found to be the most informative; only the rank-ordering PROMETHEE method was able to classify the samples with single ingredients (CPA, CPC, CM) or those with binary mixtures (CPA/CPC, CPA/CM, CPC/CM). Interestingly, it was essential to use the geometrical analysis for interactive aid (GAIA) display for a full understanding of the classification results. However, these two methods, like the other chemometrics models, were unable to classify composite spectral matrices consisting of data from samples of single ingredients and binary mixtures; this suggested that the excitation spectra of the different samples were very similar. Nevertheless, the method is useful for classification of single-ingredient samples and, separately, their binary mixtures; it may also be applied for similar classification work with other complex substances.
Received: June 20, 2012 Revised: August 21, 2012 Published: August 24, 2012
© 2012 American Chemical Society
The Journal of Physical Chemistry A
dx.doi.org/10.1021/jp306051w | J. Phys. Chem. A 2012, 116, 8949−8958
1. INTRODUCTION
Complex substances, composed of mixtures of many biological and chemical components or of natural complex materials, are common and require analysis, usually of the ingredients that are specifically active in the application in question. A common analytical approach is the extraction and purification of the sought-after analytes followed by analysis with, for example, chromatography¹ or electrophoresis², among others. These techniques have been commonly used for the analysis of Traditional Chinese Medicines (TCMs),³ and in particular, the chromatographic fingerprinting approach has been demonstrated to be very useful for assessing the quality of such herbal medicines. This technique is often able to provide detailed information about the ingredients of the TCMs, and qualitative and quantitative analysis of the composition of a TCM is possible provided suitable reference materials are available. However, chromatographic methods are generally complex, time-consuming, and costly, especially with respect to the solvents used for analyte extraction and in the chromatographic method itself. Direct analysis of samples without the need for sample preparation, and with reduced analytical and instrument maintenance costs, would be preferred.
One such approach involves fluorescence measurements from solid samples. Clearly, for this method, the analytes have to emit fluorescence, and provided this condition is met, sample preparation for the analysis of TCMs, some of which do fluoresce, is significantly simplified in comparison with that required for chromatography. In principle, the sample is suitably crushed and appropriately irradiated, the fluorescence is measured, and this information is suitably interpreted. However, in contrast to chromatography, this method is only potentially useful for discrimination of the TCMs rather than for the individual identification of many of the compounds present in such substances. Thus, in the context of sample discrimination, fluorescing TCMs, which contain fluorophores such as berberine, are convenient and useful complex substances to use for research and development of new analytical methods. In general, such measurements are known for their high levels of sensitivity and specificity, and these are important quality assurance attributes of analytical methodology. Apart from the useful attributes noted above, fluorescence methods are normally fast, require minimal
amounts of reagents, do not generate any significant residues, and are generally low cost. However, up until now, such methods have received limited attention and are usually recommended to be applied together with other methods, e.g., the combined use of solid phase fluorescence and immunological techniques,⁴ and other quantitative analyses that involved supporting matrices.⁵,⁶
The TCM selected for this study was Cortex Phellodendri (CP),⁷ which is well-known in biological and pharmacological applications and has been noted for its antimicrobial, antiplasmodial, and antidiarrheal effects. On the basis of its different origins, CP is divided into two categories: Cortex Phellodendri Chinensis (CPC), the dry bark of Phellodendron Chinense Schneid, which originates in southern China, and Cortex Phellodendri Amurensis (CPA), derived from the Amur corktree, which is found in northeastern China. The active ingredients of these TCMs are different kinds of alkaloid, particularly berberine. In general, the content of this compound is much higher in CPC than in CPA. The total alkaloid level in CPC samples is about 4.1% (mostly berberine), while in CPA it is about 1.5%.⁸ However, these two substances are believed to give the same clinical efficacy and are used interchangeably. Other alkaloids have been identified, e.g., palmatine and phellodendrine,⁹ and consequently, the effectiveness of CPC and CPA may show subtle differences in practice.¹⁰
Unfortunately, the very effectiveness of these TCMs gives rise to a market in adulterants. In this context, Caulis mahoniae (CM) and David poplar bark (DPB) are two common substitutes for CP. CM is derived from taxa of Berberidaceae, and is found in Folium Mahoniae or Folium Mahoniae Bealei. CM contains a considerable amount of berberine, and the clinical effect of CM as an antimicrobial substance is similar to that of CP.¹¹ However, unlike CPA and CPC, CM substances are toxic to some extent, which makes it imperative to discriminate them. Another common adulterant, DPB, has a similar appearance to CP, and the two can be readily confused. Thus, research and development of rapid, quantitative classification and discrimination methods for these substances is important.
The interpretation of high dimensional data requires the use of chemometrics methods of data analysis,¹² and in the context of excitation and emission (EEM) spectroscopy, parallel factor analysis (PARAFAC) has been used for the classification of estuarine water,¹³ and three-way principal component analysis (PCA) has been used for algae speciation.¹⁴ Other methods included soft independent modeling of class analogies (SIMCA),¹⁵ N-way partial least-squares (NPLS), and discriminant analysis (DA), as well as the multiway analytical data based approach, MOLMAP.¹⁶
The aims of this study were to:
1. research and develop an analytical method based on fluorescence spectral measurements of powdered samples for the classification of complex substances,
2. apply this method for the discrimination of well-known TCMs such as CP in its two common and similar forms, CPA and CPC, as well as a potential adulterant, CM, and
3. explore the possibilities of classifying mixtures of these substances such as CPA/CPC, CPA/CM, and CPC/CM.
It was anticipated that this task would be challenging because of the similarities of the CPA and CPC varieties. Consequently, several different chemometrics methods of data analysis, namely PCA, least squares-support vector machine (LS-SVM), PARAFAC, and the preference ranking organization method for enrichment of evaluations (PROMETHEE) together with geometrical analysis for interactive aid (GAIA), were chosen as possible multivariate methodologies to discriminate the different kinds of complex substance; these methods also represented different multivariate approaches, e.g., pattern recognition, prediction, and ranking.
Table 1. Composition Ratios of the 27 Mixture Samples
No.  CPC  CPA  |  No.  CPC  CM  |  No.  CPA  CM
55   5    1    |  64   1    5   |  73   1    5
56   5    2    |  65   2    5   |  74   2    5
57   5    3    |  66   3    5   |  75   3    5
58   5    4    |  67   4    5   |  76   4    5
59   5    5    |  68   5    5   |  77   5    5
60   4    5    |  69   5    4   |  78   5    4
61   3    5    |  70   5    3   |  79   5    3
62   2    5    |  71   5    2   |  80   5    2
63   1    5    |  72   5    1   |  81   5    1
2. EXPERIMENTAL SECTION
2.1. Apparatus and Software. All solid phase fluorescence measurements were carried out with a Hitachi F7000 spectrofluorimeter (Hitachi Co., Japan). A sample cell made from a metal frame and quartz glass windows (solid sample holder, model Hitachi 650−0161) was constructed to hold the powdered samples (sample thickness was set at 5 mm). This cell, filled with a sample, was positioned such that the surface of the quartz glass windows was set at 45° angles to the incident and the reflected light paths. The temperature for the experiments was maintained at 25.0 ± 0.5 °C. A high-speed TCM pulverizer (QE-100, Yili Instrument Co., Wuyi, China) was used for crushing the materials into powder, which was then sifted through a 0.3 mm-mesh sieve. All chemometrics methods were written in MATLAB (Version 6.5, Mathworks) on a Core 2 Duo microcomputer. The PCA program was based on the singular value decomposition (SVD) model, the LS-SVM algorithm was sourced from http://www.esat.kuleuven.be/sista/lssvmlab/,¹⁷ and the PARAFAC program was obtained from Bro’s work.¹⁸ The MCDM PROMETHEE and GAIA methods were obtained from the Decision Lab 2000 package.¹⁹
2.2. Sample Preparation. CP and other similar samples were obtained from different TCM outlets in China. They were stored under dry conditions at constant temperature to avoid degradation. Prior to measuring their fluorescence, the samples were crushed into powder for 2 min, and sifted through a 0.3 mm-mesh sieve.
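Section 2.1 states that the PCA program was based on the SVD model. The authors' MATLAB code is not given, so the following is only a minimal NumPy sketch of that idea. Mean-centering of the columns, the function name `pca_svd`, and the synthetic two-band "spectra" are assumptions made for illustration, not details taken from the study.

```python
import numpy as np

def pca_svd(X, n_components=2):
    """PCA of a samples-by-wavelengths matrix via SVD (columns mean-centered: an assumption)."""
    Xc = X - X.mean(axis=0)                       # center each wavelength channel
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # object coordinates on the PCs
    loadings = Vt[:n_components].T                    # variable contributions to the PCs
    explained = (S ** 2 / np.sum(S ** 2))[:n_components]  # fraction of variance per PC
    return scores, loadings, explained

# Synthetic example (hypothetical data): five "samples" built from two overlapping bands
wavelengths = np.linspace(475.0, 592.0, 40)
band_a = np.exp(-((wavelengths - 520.0) / 15.0) ** 2)
band_b = np.exp(-((wavelengths - 545.0) / 15.0) ** 2)
amounts = np.array([[1.0, 0.2], [0.8, 0.4], [0.5, 0.9], [0.3, 1.0], [0.6, 0.6]])
X = amounts @ np.vstack([band_a, band_b])

scores, loadings, explained = pca_svd(X, n_components=2)
```

Because the synthetic matrix is built from two bands only, two PCs reproduce the centered data exactly; real spectra would leave residual variance in higher PCs. A PC1 versus PC2 biplot is obtained by plotting the two columns of `scores` together with the rows of `loadings`.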
Figure 1. Physical appearance of the four types of TCM samples, CPC, CPA, CM, and DPB, under daylight (top row) and UV light (bottom row).
Figure 2. Excitation and emission fluorescence spectra of the four different TCM samples. Panels a, b, c, and d correspond to the spectra of the CPC, CPA, CM, and DPB samples, respectively. Peaks I, II, and III in the different panels had similar spectral wavelengths in both the excitation and emission responses.
The samples were numbered as follows: CPC (#1−18; total: 18), CPA (#19−33; total: 15), CM (#34−54; total: 21), DPB (not labeled, total: 6), mixtures CPC/CPA (#55−63; total: 9), CPC/CM (#64−72; total: 9), and CPA/CM (#73−81; total: 9). For the ratios of the mixtures in the last three sample sets, see Table 1.
2.3. Fluorescence Measurements and Data Collection. Appearances of the original, unprocessed samples were first compared in daylight and under UV light. Then the samples were sequentially examined in the quartz cell (section 2.1). The fluorescence spectrophotometer settings were excitation voltage: 400 V; slit band widths for both the excitation and emission monochromators: 10 nm; and scanning speed: 15 000 nm min⁻¹. Spectral regions sampled were excitation: 266−476 nm, and emission: 475−592 nm.
3. RESULTS AND DISCUSSION
3.1. Fluorescence Spectra of the Different TCM Varieties and Discrimination of the DPB Samples. The appearance of the four TCM species both in daylight and under UV light (wavelength 365 nm) was examined. In daylight, all four samples were various shades of brown (Figure 1, top row), very similar overall, and difficult to discriminate into groups representing the collected varieties. On the other hand, the observations under UV light (Figure 1, bottom row) suggested that the three types of samples, CPC, CPA, and CM, all showed green fluorescence, but the DPB ones did not. In general, all samples also appeared to emit slightly different colors of different intensities. However, such observations were insufficient to discriminate the varieties; this led to the question of whether quantitative measurements of the emitted fluorescence could lead to classification of the pure and mixed classes of the samples discussed above. Consequently, three-way EEM data were collected. However, EEM data analysis is often complicated because the data contain scattering effects.²⁰ These must be corrected or significantly reduced, and in this study, the two corresponding wavelength domains, which showed Rayleigh and Raman scattering, were regarded as missing values and consequently eliminated. This reduced the interference introduced by scattering.
The fluorescence spectra of the four varieties, CPC, CPA, CM, and DPB (Figure 2), showed that DPB could be readily discriminated from the other samples because it did not fluoresce, or its fluorescence was very low, as was evident during the direct UV irradiation experiments (Figure 1). Consequently, the DPB variety may be distinguished from the other three directly without any recourse to quantitative fluorimetry. However, the fluorescence of the remaining CPC, CPA, and CM varieties was similar: CPC, peak I (280 nm, 540 nm), peak II (350 nm, 540 nm), and peak III (440 nm, 540 nm); CPA, peak I (280 nm, 530 nm), peak II (350 nm, 530 nm), and peak III (440 nm, 530 nm); and CM, peak I (280 nm, 520 nm), peak II (350 nm, 520 nm), and peak III (440 nm, 520 nm). Thus, the varieties cannot be distinguished on the basis of their peak maxima or their spectral profiles, although there was some order in the intensity of the three peaks: for CPC, intensity of peak I > peak II > peak III; for CPA, peak I ≈ peak II > peak III; and for CM, peak I < peak II > peak III. Therefore, it appeared that the spectral profiles consisted of more than one peak and that more than one chemical constituent was present in the three varieties; furthermore, these different constituents probably had the same or very similar excitation and emission wavelengths. To study these phenomena further, a chemometrics analysis of the collected EEM spectral data was carried out.
3.2. PCA Analysis of the Two-Way Data for the Preliminary Classification of CPC, CPA, and CM. There are different ways to build models that enable pattern recognition or classification, but many of these are limited because they require two-way data as input.
Figure 3. PC1 versus PC2 biplots of the CPC, CPA, and CM samples (a) and of these samples together with their binary mixed samples (b) from the emission data; PC1 versus PC2 biplots of the CPC, CPA, and CM samples (c) and of these samples together with their binary mixed samples (d) from the excitation data. Labels of different color represent different samples: CPC (red squares), CPA (green squares), CM (blue squares), CPC/CPA (yellow triangles), CPC/CM (fuchsia triangles), and CPA/CM (cyan triangles).
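The scatter handling described in Section 3.1, in which the wavelength regions affected by scattering are treated as missing values, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the function name `mask_scatter`, the band half-width, the inclusion of the second-order Rayleigh line, and the optional Raman shift argument are all hypothetical choices, since the study does not report the exact regions removed.

```python
import numpy as np

def mask_scatter(eem, ex, em, half_width=10.0, raman_shift_cm=None):
    """Set scatter-affected cells of one EEM (rows: excitation, columns: emission) to NaN.

    half_width (nm) and raman_shift_cm (cm^-1) are assumed, user-chosen values.
    """
    ex_grid, em_grid = np.meshgrid(ex, em, indexing="ij")
    # First-order Rayleigh scatter lies along em = ex; second order along em = 2 * ex
    mask = np.abs(em_grid - ex_grid) <= half_width
    mask |= np.abs(em_grid - 2.0 * ex_grid) <= half_width
    if raman_shift_cm is not None:
        # Raman line: emission wavenumber = excitation wavenumber - shift (1 nm^-1 = 1e7 cm^-1)
        raman_em = 1.0 / (1.0 / ex_grid - raman_shift_cm * 1e-7)
        mask |= np.abs(em_grid - raman_em) <= half_width
    masked = np.array(eem, dtype=float, copy=True)
    masked[mask] = np.nan          # regarded as missing values
    return masked

# Grid matching the sampled regions (excitation 266-476 nm, emission 475-592 nm)
ex = np.arange(266.0, 477.0, 10.0)
em = np.arange(475.0, 593.0, 1.0)
eem = np.ones((ex.size, em.size))   # placeholder intensities, not measured data
masked = mask_scatter(eem, ex, em, half_width=10.0)
```

Methods that tolerate missing values (PARAFAC implementations commonly do) can use the masked array directly; for PCA or LS-SVM on two-way slices, the masked channels would have to be dropped or interpolated first.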
Thus, in this study the objects were analyzed in two different ways, which were based on different data dimensions: (i) the initial classification approach used the two-way response data, and (ii) further research of the classification model was carried out with the use of the three-way response data.²¹ The well-known pattern recognition method, PCA, reduces the number of data variables by finding new linear combinations of the variables, which are orthogonal to each other, i.e., the principal components (PCs), which account for the variance in the original variables. PCs reflect the relationships between objects, loadings, and scores. These relationships are commonly illustrated on PCX versus PCY biplots.²² In this study, PCA was applied to the emission spectral data (raw spectra; excitation at 350 nm; emission range 475−592 nm) and the excitation spectral data (raw spectra; emission at 525 nm; excitation range 280−460 nm).
The biplot for the spectral emission data (Figure 3a) accounted for 98.8% of data variance. When projected onto PC1, the objects were distributed along this PC with the CPC objects in a fairly tight cluster with negative scores, the CPA objects with negative and positive scores, and the CM objects in a tight cluster with positive scores; these three groups of objects were qualitatively separated from each other on this PC. On PC2, the CPC cluster (positive scores) was well separated from the CPA one (negative scores), and the CM objects (spread along PC2 with negative and positive scores) overlapped the CPA objects. This analysis suggested that, on the basis of emission spectra, the three types of object may be separated in the PC1−PC2 space, producing a roughly V-shaped distribution (Figure 3a). When the mixture samples were included (Figure 3b), the distribution of objects displayed a similar V-shaped pattern with the CPC, CPA, and CM groupings remaining in roughly the same positions as in Figure 3a. However, the mixed samples in the CPC/CPA group were clearly distributed across all three varieties, indicating that some samples of this mixture contained constituents that were similar to those of the three groups. The objects in the CPC/CM group fitted in roughly with the CM objects and possibly the CPA group; the CPA/CM objects overlay the CM group. Therefore, the PCA based on the emission spectra could not provide a separation of the binary mixtures.
However, the PCA based on the excitation spectra (Figure 3c) demonstrated an overlap of all samples on PC1 to a lesser or greater extent; a rough separation on PC2, in order, from
positive to negative scores, could be discerned: CM, CPA, and CPC. In the fourth biplot (Figure 3d), while the individual CM, CPA, and CPC objects formed a pattern similar to that in Figure 3c, it was difficult to discern a clear pattern for the binary mixture samples.
Thus, it is difficult to use PCA as a pattern recognition method for these complex samples, and another powerful chemometrics method, LS-SVM,¹⁷ was applied for the discrimination or classification of the TCM samples. LS-SVM facilitates linear and nonlinear multivariate calibration modeling and provides relatively fast processing. Determination of the optimal input feature subset, a proper kernel function, and optimum kernel parameters are the crucial elements for LS-SVM. Typical kernel functions are the polynomials and the Gaussian radial basis function (RBF).¹⁷ This method is a supervised classification procedure and requires the development of a calibration model, which can then be used for prediction. In this investigation, the samples were identified numerically according to their variety, i.e., CPC - 1, CPA - 2, and CM - 3, and the calibration set (total: 36 samples) included samples #1−12 (CPC), #19−28 (CPA), and #34−47 (CM), while the remaining 18 samples were assigned to the prediction set. The RBF was used as the kernel function in the model. To obtain the optimum LS-SVM model, it is necessary to optimize two parameters: the kernel parameter, σ² (square of the bandwidth), and the regularization parameter, γ, which controls the trade-off between the training error minimization and the smoothness of the estimated function.¹⁷ The method for extracting these two optimized parameters has been described in detail elsewhere,²³ and their values in this work were 0.08 (σ²) and 9.8 (γ).
When the optimized LS-SVM calibration models based on excitation and emission data, respectively, were applied for prediction of the individual varieties of the TCMs, the success rate was 100%, i.e., the three different varieties were all classified into their respective CPC, CPA, and CM groups. However, when the 27 mixture samples were also included, there was a total of 81 samples, of which 54 samples (#1−12 (CPC), #19−28 (CPA), #34−47 (CM), #55, 57, 59, 61, and 63 (CPC/CPA), #64, 66, 68, 70, and 72 (CPC/CM), and #73, 75, 77, 79, and 81 (CPA/CM)) were selected for the calibration set, and the remaining 27 samples were used for prediction. The two-way excitation or emission fluorescence spectral data of the calibration sets were used to build the LS-SVM models, and the prediction success rates were 83.3% and 44.4% for the excitation and emission data, respectively. The results indicated that LS-SVM, a supervised two-way calibration model, was unsatisfactory for discriminating the adulterated samples from the authentic ones. However, it should be noted that the excitation spectra performed much better than the emission measurements. Such an outcome prompted the trialing of the three-way models for classification.
Figure 4. Recovered fluorescence spectra with the use of the PARAFAC method. Each row corresponds to a different species, i.e., CPC, CPA, and CM, respectively; left column: recovered excitation spectra; right column: recovered emission spectra. Curves of the same color represent the same constituents, while curves of different color represent different constituents.
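The LS-SVM calibration described in Section 3.2 (numerical class codes 1, 2, and 3 as targets, an RBF kernel, and the two parameters σ² and γ) can be illustrated with a small NumPy sketch. This is not the LS-SVMlab toolbox code used in the study: the function names are hypothetical, the kernel is written here as exp(−‖x − z‖²/(2σ²)), which is an assumed convention, and the two-dimensional data are synthetic stand-ins for spectra. Only the values σ² = 0.08 and γ = 9.8 are taken from the text.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """Gaussian RBF kernel matrix; the 2*sigma2 scaling is an assumed convention."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma2))

def lssvm_fit(X, y, gamma, sigma2):
    """Solve the LS-SVM linear system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]          # bias b, support values alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma2):
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b

# Synthetic calibration set (hypothetical): three tight clusters coded 1, 2, 3
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
offsets = np.array([[0.0, 0.0], [0.02, 0.0], [0.0, 0.02], [-0.02, 0.0], [0.0, -0.02]])
X_train = np.vstack([c + offsets for c in centers])
y_train = np.repeat([1.0, 2.0, 3.0], offsets.shape[0])

b, alpha = lssvm_fit(X_train, y_train, gamma=9.8, sigma2=0.08)   # parameter values from the text
X_test = centers + 0.01
predicted_class = np.rint(lssvm_predict(X_train, b, alpha, X_test, sigma2=0.08)).astype(int)
```

Rounding the continuous output to the nearest class code is one simple way to turn the calibration model into a classifier; with real spectra, σ² and γ would need to be re-optimized, for example by cross-validation on the calibration set.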
Figure 5. PARAFAC classification results of the 54 samples; Panels a, b, and c correspond to the recovered excitation spectra, emission spectra, and the scores for different samples, respectively. Panel d represents the clustering of objects of the different samples.
3.3. Classification of the Mixtures with the Use of the Three-Way Model. When the collected data are in a multiway format, they generally have to be decomposed into two-way data before further study. PARAFAC¹⁸ is one of the commonly used algorithms for trilinear data decomposition, which can overcome the rotational freedom problem present with a bilinear algorithm;²⁴,²⁵ it has been used successfully with excitation and emission spectra collected from various mixtures of fluorophores.²⁶,²⁷ It can extract the excitation and the emission profiles of the main components as well as the concentration profiles, which correspond to the relative concentrations of each component.
In this investigation, PARAFAC was applied to decompose the three-way spectral data collected separately from the CPC, CPA, and CM samples. The number of fluorescent components was determined by the commonly applied core consistency method, which is discussed in detail elsewhere,²⁸ and the number of significant components was found to be 1, 3, and 2 in the CPC, CPA, and CM samples, respectively. The non-negativity constraint was applied to confine the model. The excitation and emission spectral profiles recovered by the PARAFAC model (Figure 4) reflect the number of fluorescing constituents in the three species. On this basis, for the excitation spectra, CPC has one constituent (labeled i), CPA has three (labeled i, ii, and iii), and CM has two (labeled i and ii). Constituent i appeared to be common to all three TCM varieties, constituent ii to two varieties, and constituent iii was present only in CPA. A similar conclusion was reached with the emission spectral profiles, as demonstrated in Figure 4b,d,f.
The fluorescence matrix of the three varieties was submitted to PARAFAC for analysis. The number of factors was set at three, and the recovered spectra of each constituent (i, ii, and iii) are shown in Figure 5a,b. The plot of the relative concentration values of each constituent versus the 54 samples (Figure 5c) displayed the distribution of the three constituents in each type of sample (CPC, CPA, and CM). Constituent i was dominant in the CPC samples, while constituents ii and iii were relatively minor contributors; for the CPA samples, the three constituents were intermixed along the relative concentration scale; for the CM samples, the constituents appeared to be in the order of constituent ii (the highest amount), followed by constituent i and then iii (the lowest). A biplot based on a data matrix that involved the relative concentrations of constituents i and ii (Figure 5d), and which was obtained on the basis of the three-way excitation or emission fluorescence data and PARAFAC modeling, indicated a reasonable grouping of the three types of sample, CM, CPA, and CPC. This display suggested a classification map when the objects were projected onto PC1. Similar satisfactory classification maps were not obtained when constituent iii was introduced into the data matrix, presumably because the CPC and CM samples did not contain this constituent to any extent.
When all 81 samples, i.e., those containing individual and mixed sample types and involving the three constituents, were submitted for analysis with the PARAFAC model, a similar constituent i versus constituent ii classification map was obtained (Figure 6). It is apparent that the positions of the three individual types of sample (CPC, CPA, and CM) are in
similar relative positions to those in Figure 5d. The CPA/CM mixture samples cluster mainly between the CM and the CPA groups; there are also four CPC/CM samples in the same region, possibly indicating that these samples contained high levels of the CM variety. Similarly, the CPC/CPA mixture samples mostly overlay the CPC and CPA groups. Sample #21, corresponding to CPA, lies near (0,0) in Figure 5d; this is also found in Figure 5c. In general, once again it is difficult to discriminate the mixed samples from each other and from the individual types of objects.
Figure 6. Plot of the concentration of the constituents from the 81 samples including the mixtures.
Table 2. PROMETHEE Rank-Order of the 54 Emission Spectral Objects Involving Single Ingredients and the Three Average Spectra
a No. = PROMETHEE rank. b S. = sample number. c Cate = category or variety of different samples. d Aver = average spectral object.
3.4. PROMETHEE and GAIA Analysis of Spectral Data. PROMETHEE and GAIA together form a powerful Multi-Criteria Decision-Making (MCDM) method²⁹ that can be applied to complex problems involving decision-making in general. PROMETHEE is a nonparametric, rank-ordering method, which produces quantitative indices, φ, that reflect the relative performance of objects in a group of samples; this is carried out on the basis of 100% of the information compiled in a matrix of the same (e.g., wavenumbers) or different (e.g., spectra, weight, length, cost, etc.) types of variable, each of which can be specifically modeled. GAIA produces a PCA biplot display, which usually reflects less than 100% of the data variance and which is obtained from a matrix that is derived from the PROMETHEE indices, φ.³⁰ It is often an invaluable tool, which provides information on the relationships of objects, variables, and objects-and-variables on the basis of their PROMETHEE decision ranking. In this sense, on a broader basis, this methodology can bring together information from disparate project members,³¹ and the impact of each such contribution can be assessed by these methods, for example, as in the Carmody et al. study.³² Thus, in comparison with other chemometrics methods of data analysis such as the ones applied in the previous sections, the combination of the two methods, PROMETHEE and GAIA, as well as the requirement to model each variable, offers a very powerful qualitative and quantitative approach for assessing the performance and classification of samples, and it is for these reasons that these methods were included in this study.
To build a PROMETHEE model for ranking objects,³³ three modeling properties have to be defined for each matrix variable or criterion: (1) a preference function according to which an object is compared to other objects; (2) the top-down or bottom-up ordering of objects on each criterion, i.e., “maximize” or “minimize”; this ensures that, in the context of a particular research problem, the objects are ranked in a preferred order; and (3) the weighting of each variable; for most scientific studies, this is generally left at the default value of 1, as was done in this study. The Gaussian preference function was chosen³⁰ because this function is the most suitable of the six available options in the PROMETHEE program to model sample distributions as measured by fluorescence spectroscopy and expressed as PC scores; the “maximize/minimize” attribute ensures that, for a given variable, the objects are rank-ordered either top-down, i.e., highest values are preferred, or bottom-up, i.e., lowest values are preferred, as appropriate for a particular experiment.²⁹
When PCs are used as variables instead of the actual measurements, and the objects’ scores are the matrix values, it should be noted that the scores are relative values, which can change in sign and/or value with the addition of even one extra object to the data matrix. Hence, it is necessary to assign a reference object to which all other objects may be related on each PC.³¹ Consequently, every object is compared to this reference object PC-by-PC. In this work, the average spectrum of the largest sample set, i.e., the average CM spectrum (AverCM), was used as the relative reference object. Higher score values were preferred, i.e., the objects with positive scores were preferred over those with negative scores for rank-ordering; the weighting of each PC criterion was set to 1. For the analysis of samples with one chemical component only, a matrix of 57
objects (54 single ingredient spectral objects + 3 average spectra; the latter were AverCPC, AverCPA, and AverCM) was submitted to PCA. For the analysis of samples with individual and several components, the matrix involved 84 spectral objects (54 individual + 27 mixtures + 3 average spectra (AverCPC, AverCPA, and AverCM)). As previously noted, in PROMETHEE, each object is characterized, on a relative basis, by a φ net ranking index value: the closer the values, the more similar the objects; the larger the difference in the values, the more different the objects. The overall value of the φ net ranking index ranges from the most positive to the most negative. Thus, on a relative basis, the larger the φ net ranking index range, the more spread out or different are the objects; however, it is possible that the objects between these extremes may group into one or more relatively tight clusters with small φ index ranges. Generally, a ±10% value of the overall φ range is considered as evidence of approximate discrimination between objects.³⁴
Table 3. PROMETHEE Rank-Order of the 54 Excitation Spectral Objects Involving Single Ingredients and the Three Average Spectra
a No. = PROMETHEE rank. b S. = sample number. c Cate = category or variety of different samples. d Aver = average spectral object.
Figure 7. GAIA biplot of all the samples involved: excitation data for CPC, CPA, and CM (a), and those with their mixtures (b); scores of the first three PCs from the scaled data were used as input variables; the GAIA display is a component 1 versus component 2 biplot explaining 100% of the input data variance. Labels are as in Figure 3.
In the analysis of single component samples, both the emission (Table 2) and excitation spectral data (Table 3), i.e., the CPC, CPA, and CM spectral objects, showed reasonable grouping of the three kinds of sample; this was particularly evident with the excitation results. The overall φ index range was much higher for the excitation data (|Δφ| = 1.43) than for the emission data (|Δφ| = 0.86), and in addition, some intermixing of CPA and CPC objects occurred for the emission data. This suggested that the excitation spectral data could discriminate the sample types better than the emission data, and as a result the latter were not discussed further in this context. In Table 3, the φ net ranking index range is 0.71 to −0.72. The CM objects (21 samples) have the highest φ net ranking values (ranks 1−22; φ index range: 0.71 to 0.29) and include the AverCM sample (rank 14; φ index = 0.51). At the other end of the net ranking range is the CPC group (18 samples; ranks 39−57; φ index range: −0.34 to −0.72) with the AverCPC object at rank 48 (φ index = −0.56). The 15 CPA spectral samples are between the two groups just discussed (ranks: 23−38; φ index range: 0.12 to −0.28), including the AverCPA sample (rank 32; φ index = −0.08). Thus, the three types of sample can be satisfactorily discriminated with the use of PROMETHEE modeling. The
dx.doi.org/10.1021/jp306051w | J. Phys. Chem. A 2012, 116, 8949−8958
The Journal of Physical Chemistry A
Article
Table 4. PROMETHEE Rank-Order of the 27 Excitation Spectral Objects Involving Mixed Ingredients, and the Three Average Spectra
a No. = PROMETHEE rank. b S. = sample number. c Cate = category or variety of different samples. d Aver = average spectrum of each sample type, i.e., Aver1 = CPC/CPA; Aver2 = CPC/CM; Aver3 = CPA/CM.
Figure 8. GAIA biplot (86.62% variance explained) of 27 mixed component samples and three averaged spectral samples; scores of the first four PCs from the normalized excitation data were used as input variables. Every mixed component sample is labeled with the corresponding components ratio. The average object of the CPC/CPA, CPC/CM, and CPA/CM spectral series are labeled as Aver1, Aver2, and Aver3.
order (Table 4), but it was essential to consider the GAIA biplot display before the various groupings could be discerned and the effect of the ratio values became apparent. The interpretation of the excitation spectra of the three TCMs with the use of the PROMETHEE ranking and GAIA display methods has clearly indicated that the fluorescence excitation spectra from the samples containing individual or mixed components of the three studied TCMs, when investigated separately as discussed above, may be successfully compared and discriminated. This work also suggested that it is possible to classify blind samples known to contain only a single component (CPC, CPA, or CM) with respect to a calibration set such as that in Table 3. Similarly, it should be possible to classify blind samples of mixed components (CPC/CPA, CPC/CM, or CPA/CM) with respect to a calibration set such as that in Table 4. However, further experiments have indicated that PROMETHEE ranking models that contain a combination of samples with individual and mixed components of the TCMs were unsuccessful for classification of any blind samples, even when the corresponding GAIA biplot was considered. This failure to classify the various groups of the TCM samples containing either individual or mixed components must result from the similarity of the excitation
corresponding GAIA biplot (Figure 7a) supported this observed sample classification. However, when the same PROMETHEE model was applied to the 27 samples with mixed components and three average mixture spectral objects, the result was more complex (Table 4). The φ net ranking index range from 0.45 to −0.23 was significantly narrower than that for the individual samples just discussed, which suggested that the spectral objects of the mixtures were more similar and, hence, more difficult to discriminate. However, when the ratios of the two components in each sample were considered, a complex rank order or grouping appeared to be present. Nevertheless, this was difficult to understand until the GAIA biplot was considered and the objects on it were labeled both with the ratio values and sample numbers (Figure 8a,b). These diagrams clearly indicated that the mixed component samples can indeed be discriminated, as illustrated, for example, by the CPC/CPA 5/1, 5/2, ..., 5/5 objects; these lie on approximately a straight line roughly parallel to the Component 1 axis, while the five CPC/CM 1/5, 2/5, ..., 5/5 objects lie roughly in a straight line with negative scores on Component 1 and Component 2. Other groupings can be clearly identified on the biplot, although some form clusters rather than straight lines. This information was embedded in the PROMETHEE rank
gates. In Methods Enzymol.; Fukuda, M., Eds.; Elsevier: Amsterdam, 2010; pp 241−264.
(6) Amelin, V. G.; Aleshin, N. S.; Abramenkova, O. I.; Nikolaev, Y. N.; Lomonosov, I. A. J. Anal. Chem. 2011, 66, 709−713.
(7) Okada, J.; Kawashima, Y. J. Pharmaceut. Soc. Jap. 1969, 89, 558−564.
(8) Liu, Y. M.; Sheu, S. J.; Chiou, S. H.; Chang, H. C.; Chen, Y. P. Planta Med. 1993, 59, 557−561.
(9) Guo, K. J.; Xu, S. F.; Yin, P.; Wang, W.; Song, X. Z.; Liu, F. H.; Xu, J. Q.; Zoccarato, I. J. Anim. Sci. 2011, 89, 3107−3115.
(10) Chen, M. L.; Xian, Y. F.; Ip, S. P.; Tsai, S. H.; Yang, J. Y.; Che, C. T. Planta Med. 2010, 76, 1530−1535.
(11) Journal of Chinese Herbal Medicine and Acupuncture: Databases of traditional Chinese medicine, herbs, acupuncture, qigong, prevention and longevity. http://pharmtao.com/blog2/tag/caulis-mahoniae/, accessed May 2012.
(12) Aloise, S.; Ruckebusch, C.; Blanchet, L.; Rehault, J.; Buntinx, G.; Huvenne, J. P. J. Phys. Chem. A 2008, 112, 224−231.
(13) Zandomeneghi, M.; Carbonaro, L.; Zandomeneghi, G. J. Agric. Food Chem. 2006, 54, 5214−5215.
(14) Henrion, R.; Henrion, G.; Bohme, M.; Behrendt, H. Fresen. J. Anal. Chem. 1997, 357, 522−526.
(15) Hall, G. J.; Kenny, J. E. Anal. Chim. Acta 2007, 581, 118−124.
(16) Ballabio, D.; Consonni, V.; Todeschini, R. Anal. Chim. Acta 2007, 605, 134−146.
(17) Suykens, J. A. K.; Van Gestel, T.; De Brabanter, J.; De Moor, B.; Vandewalle, J. Least Squares Support Vector Machines; World Scientific: Singapore, 2002.
(18) Bro, R. Chemom. Intell. Lab. Syst. 1997, 38, 149−171.
(19) Decision Lab 2000, executive edition; Visual Decision, Inc.: Montreal, Canada, 1999.
(20) Gonis, A.; Butler, W. H. Multiple Scattering in Solids; Springer: Berlin, 1999.
(21) Wentzell, P. D.; Nair, S. S.; Guy, R. D. Anal. Chem. 2001, 73, 1408−1415.
(22) Alves, M. R.; Oliveira, M. B. J. Chemom. 2003, 17, 594−602.
(23) Brereton, R. G.; Lloyd, G. R. Analyst 2010, 135, 230−267.
(24) Amigo, J. M.; Skov, T.; Bro, R. Chem. Rev. 2010, 110, 4582−4605.
(25) Li, Y. N.; Wu, H. L.; Nie, J. F.; Li, S. F.; Yu, Y. J.; Zhang, S. R.; Yu, R. Q. Anal. Methods 2009, 1, 115−122.
(26) Yamashita, Y.; Panton, A.; Mahaffey, C.; Jaffe, R. Ocean Dyn. 2011, 61, 569−579.
(27) Seredyńska-Sobecka, B.; Stedmon, C. A.; Boe-Hansen, R.; Waul, C. K.; Arvin, E. Water Res. 2011, 45, 2306−2314.
(28) Bro, R.; Kiers, H. A. L. J. Chemom. 2003, 17, 274−286.
(29) Kokot, S.; Ayoko, G. A. Chemometrics: Multi-Criteria Decision Making. In Encyclopaedia of Analytical Sciences, 2nd ed.; Worsfold, P. J., Townsend, A., Poole, C. F., Eds.; Elsevier: Oxford, U.K., 2005; Vol. 2, pp 40−45.
(30) Keller, H. R.; Massart, D. L.; Brans, J. P. Chemom. Intell. Lab. Syst. 1991, 11, 175−189.
(31) Figueira, J.; Greco, S.; Ehrgott, M. Multiple Criteria Decision Analysis: State of the Art Surveys; Springer Verlag: London, 2005.
(32) Carmody, O.; Frost, R.; Y., F.; Kokot, S. Surf. Sci. 2007, 601, 2066−2076.
(33) Brans, J. P. Decis. Support Syst. 1994, 12, 297−310.
(34) Ni, Y. N.; Mei, M. H.; Kokot, S. Chemom. Intell. Lab. Syst. 2011, 105, 147−156.
spectra of the individual and mixed samples. The binary mixture samples were scattered among the parent individual samples in the GAIA biplot (Figure 7b), and reflecting this, the φ indices of different samples were often found to be either the same or very similar. Two short rank-order sequences of samples with single and mixed components illustrate this: (1) CM: φ = 0.44; CM: φ = 0.43; CM/CPA: φ = 0.43 and (2) CPA: φ = −0.09; CPC/CM: φ = −0.09; CPC: φ = −0.11. Such similar results clearly complicate any discrimination, and probably explain why powerful classification methods such as LS-SVM and PARAFAC were also unsuccessful.
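The role of the φ net ranking index in this argument can be illustrated with a minimal Python sketch. It is not the Decision Lab implementation used in this work: the function names `net_flows` and `near_ties` are hypothetical, the "usual" (strict) preference function is an assumption because the preference functions of the study are not restated here, and the overall φ range used for the composite sequence is a stand-in value, since that range is not quoted in the text. The φ values themselves are those quoted above and for Table 3.

```python
from itertools import combinations


def net_flows(scores, weights=None):
    """PROMETHEE II net flows for objects scored on criteria to be maximized.

    scores: one row of criterion values per object.
    Assumption: the 'usual' preference function (1 if strictly better, else 0).
    """
    n = len(scores)
    m = len(scores[0])
    if weights is None:
        weights = [1.0 / m] * m  # equal weights, also an assumption
    phi = []
    for a in range(n):
        total = 0.0
        for b in range(n):
            if a == b:
                continue
            for j in range(m):
                d = scores[a][j] - scores[b][j]
                # P(a,b) - P(b,a) is +1, -1 or 0 for the usual function
                total += weights[j] * ((d > 0) - (d < 0))
        phi.append(total / (n - 1))
    return phi


def near_ties(phi, overall_range, fraction=0.10):
    """Pairs of labels whose phi values differ by less than
    fraction * overall_range (the approximate 10% discrimination guide)."""
    threshold = fraction * overall_range
    return [
        (a, b)
        for (a, pa), (b, pb) in combinations(phi.items(), 2)
        if abs(pa - pb) < threshold
    ]


# Hypothetical toy data: three objects, two criteria, clearly ordered.
print(net_flows([[3, 3], [2, 2], [1, 1]]))

# Average objects of the single-ingredient excitation model (Table 3 values).
averages = {"AverCM": 0.51, "AverCPA": -0.08, "AverCPC": -0.56}
print(near_ties(averages, overall_range=0.71 - (-0.72)))  # -> []

# Sequence (1) from the composite model; overall_range=1.0 is a stand-in.
sequence_1 = {"CM (i)": 0.44, "CM (ii)": 0.43, "CM/CPA": 0.43}
print(near_ties(sequence_1, overall_range=1.0))
```

Under these assumptions the three average objects of the single-ingredient model are separated by far more than 10% of the 1.43 range, whereas every pair in sequence (1) falls well inside the threshold, which is the numerical form of the statement that the composite model cannot separate single-ingredient from mixed samples.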
■
CONCLUSIONS
A novel and simple fluorescence method for analysis of complex, solid substances and their potential substitutes has been researched and developed. Excitation and emission fluorescence spectra from the powdered TCMs (CPA, CPC) and their adulterants (CM and DPB) were discriminated. Different chemometrics methods (PCA, PARAFAC, LS-SVM, PROMETHEE, and GAIA) were applied for the resolution of the complex spectra, and the excitation spectra were found to be the most informative; attempts to classify the substances were generally found to be difficult, and only the rank-ordering PROMETHEE method was able to discriminate the samples with single ingredients (CPA, CPC, CM) or those with binary mixtures (CPA/CPC, CPA/CM, CPC/CM). Importantly, it was essential to interpret the GAIA display for a full understanding of the classification results. Nevertheless, the PROMETHEE and GAIA methods, like the other chemometrics models, were unable to classify composite matrices consisting of data from samples with single ingredients and others with binary mixtures, suggesting that the excitation spectra of the different samples were very similar. However, the PROMETHEE and GAIA methods are useful for classification of single-ingredient samples and, separately, their binary mixtures.
■
AUTHOR INFORMATION
Corresponding Author
*Address: Department of Chemistry, Nanchang University, Nanchang 330031, China. Tel.: 86 791 3969500. Fax: 86 791 3969500. E-mail address: [email protected].
Notes
The authors declare no competing financial interest.
■
ACKNOWLEDGMENTS
The authors greatly appreciate the financial support from the Natural Science Foundation of China (NSFC-21065007) and the State Key Laboratory of Food Science and Technology of Nanchang University (SKLF-MB-201002 and SKLF-TS200919).
■
REFERENCES
(1) Liang, Y. Z.; Xie, P. S.; Chau, F. T. J. Sep. Sci. 2010, 33, 410−421.
(2) Roberto, G. J. Pharm. Biomed. Anal. 2011, 55, 775−801.
(3) Liang, Y. Z.; Xie, P. S.; Chan, K. Comb. Chem. High Throughput Screening 2010, 13 (10), 943−953.
(4) Park, E. K.; Jung, W. C.; Lee, H. J. Acta Vet. Hung. 2010, 58, 83−89.
(5) Leppanen, A.; Cummings, R. D. Fluorescence-based solid-phase assays to study glycan-binding protein interactions with glycoconju-