Anal. Chem. 2002, 74, 2493-2499
Fast Identification of Echinacea purpurea Dried Roots Using Near-Infrared Spectroscopy
Magali Laasonen,†,‡ Tuulikki Harmia-Pulkkinen,‡ Christine L. Simard,§ Erik Michiels,§ Markku Räsänen,| and Heikki Vuorela*,†
Department of Pharmacy, Division of Pharmacognosy, P.O. Box 56 (Viikinkaari 5E), FIN-00014 University of Helsinki, Finland, and Pharmia Oy, P.O. Box 387, FIN-00101 Helsinki, Finland
Near-infrared (NIR) reflectance spectroscopy was used to develop a fast identification method for Echinacea purpurea dried milled roots. Method development was carried out using a partial least-squares (PLS) algorithm and spectral pretreatment options. The aim of this qualitative analysis was to confirm the identity of E. purpurea and to detect the presence of fraud, i.e., samples adulterated or substituted by Echinacea angustifolia, Echinacea pallida, or Parthenium integrifolium. Specificity was demonstrated by testing a validation set against the method. A total of 10% of the E. purpurea batches (true samples) and 0% of the false samples from that validation set were misidentified by the method. The misidentification was due to the difference in particle size distribution of one E. purpurea batch compared to that of the other samples. Adulterated E. purpurea samples can be detected at a minimum adulteration level of 10%. This study demonstrates that NIR spectroscopy is a good tool for the fast identification of E. purpurea roots if the samples are milled using the same procedure as for the calibration samples. The method is robust with respect to the origin of the samples and can be used routinely by the pharmaceutical industry or herbal suppliers to avoid mislabeling errors or adulteration.

Echinacea species are currently among the 10 best-selling herbs in the United States and Germany.1-3 Echinacea preparations or medicinal products are frequently used for nonspecific stimulation of the immune system and to prevent or cure the common cold, flu, and several other diseases, e.g., coughs, bronchitis, or upper respiratory infections.4 However, one major concern when purchasing Echinacea is the practice of adulteration. It has been known for more than 10 years that commercial roots of Echinacea purpurea (L.) Moench are frequently adulterated by the roots of Parthenium integrifolium L.5 Moreover, mislabeling errors can easily occur between the different species of Echinacea, e.g., E. purpurea, E. angustifolia, and E. pallida.4,6 These frauds are serious problems that affect the reliability and efficiency of Echinacea commercial products.

To avoid these problems, extensive work has been carried out for many years to develop reliable and specific methods for the identification of Echinacea species. Traditional analytical methods, typically HPLC and TLC, aim at discriminating the different species on the basis of their biochemical content. However, Echinacea species, like other herbal drugs, contain a huge number of biochemical constituents, e.g., an essential oil, caffeic acid derivatives, flavonoids, polyacetylenes, alkylamides, alkaloids, and polysaccharides.7 Therefore, one of the main problems was,8 and is still,9-11 to find suitable identity markers. The second drawback of traditional chromatographic methods is their lack of rapidity: the fastest reported12 HPLC method enables the simultaneous analysis of hydrophilic and lipophilic compounds from Echinacea in a 20 min run, not including sample preparation.

Assurance of the quality of herbal material is a major challenge in the phytopharmaceutical and food industries because its content of active constituents varies according to a wide range of factors, e.g., species, location of growth, climate, age and harvesting season, and storage conditions. Regulations concerning herbal remedies have been recently updated by the adoption of two revised guidances13,14 that describe the requirements for the evaluation of herbal medicinal products. They state that identification of herbal drugs is one of the first tests to be applied in ensuring quality, safety, and efficacy of herbal medicinal products. Identification testing should be able to discriminate between related species and/or potential adulterants. This concept is fully applicable to E. purpurea (L.) Moench, the adulterant of which, Parthenium integrifolium L., is well-known.5

In this study, we have developed and validated a near-infrared (NIR) spectroscopic method for the fast identification of E. purpurea roots. The only sample preparation required is milling, and the analysis of a sample can be performed within 1 min. This method is thus about 20 times faster than the fastest HPLC method.12 This convenient, solvent-free method can be used by the phytopharmaceutical industry for identification testing of E. purpurea and to detect species confusion. The concept underlying this method is opposite to that of traditional methods: it does not provide information about the level or presence of active markers. The method is based on comparison of the NIR spectrum of a sample with a library containing true and false sample spectra.

NIR spectroscopy is a modern tool that has important advantages:15 it allows nondestructive analysis of solid samples, it requires very little or no sample preparation, and it enables very fast analysis. Furthermore, NIR spectroscopy has already demonstrated its capacity to determine the different types or species of herbal drugs,16-18 to screen for their geographical origins,19-21 or to quantify marker substances.17,22-25 As NIR has never been used for the identification of Echinacea, we chose E. purpurea roots as the target of our study.

† University of Helsinki. E-mail: [email protected]. Fax: +358-919159578.
‡ Pharmia Oy.
§ Present address: ABB Bomem Inc., 585 Charest Blvd, East Suite 300, Quebec, Canada G1K 9H4.
| Present address: Laboratory of Physical Chemistry, Department of Chemistry, P.O. Box 55 (A. I. Virtasen Aukio 1), FIN-00014 University of Helsinki, Finland.

10.1021/ac011108f CCC: $22.00 Published on Web 05/02/2002
© 2002 American Chemical Society
Analytical Chemistry, Vol. 74, No. 11, June 1, 2002 2493

(1) Brevoort, P. Pharm. News 1996, 3, 26-28.
(2) Tyler, V. E. Phytomedicines of Europe, Chemistry and Biological activity; Lawson, L. D., Bauer, R., Eds.; American Chemical Society: Washington, DC, 1998; Vol. 691, pp 2-12.
(3) Bauer, R. Phytomedicines of Europe, Chemistry and Biological activity; Lawson, L. D., Bauer, R., Eds.; American Chemical Society: Washington, DC, 1998; Vol. 691, pp 140-157.
(4) Percival, S. S. Biochem. Pharmacol. 2000, 60, 155-158.
(5) Bauer, R.; Wagner, H. Sci. Pharm. 1987, 55, 159-161.
(6) Bauer, R.; Reminger, P.; Wagner, H. Phytochemistry 1989, 28, 505-508.
(7) Bauer, R.; Wagner, H. Economic and Medicinal Plant Research; Wagner, H., Farnsworth, N. R., Eds.; Academic Press: New York, 1991; Chapter 2.
(8) Bauer, R.; Khan, I. A.; Wagner, H. Dtsch. Apoth. Ztg. 1987, 25, 1325-1330.
(9) Perry, N. B.; Burgess, E. J.; Leanne Glennie, V. J. Agric. Food Chem. 2001, 49, 1702-1706.
(10) Baum, B. R.; Mechanda, S.; Livesey, J. F.; Binns, S. E.; Arnason, J. T. Phytochemistry 2001, 56, 543-549.
(11) Bergeron, C.; Livesey, J. F.; Awang, D. V. C.; Arnason, J. T.; Rana, J.; Baum, B. R.; Letchamo, W. Phytochem. Anal. 2000, 11, 207-215.
(12) Laasonen, M.; Wennberg, T.; Harmia-Pulkkinen, T.; Vuorela, H. Planta Med., in press.
(13) Note for guidance on quality of herbal medicinal products; The European Agency for the Evaluation of Medicinal Products: London, 2000; CPMP/QWP/2819/00.
(14) Note for the Guidance on Specifications: Test Procedures and Acceptance Criteria for Herbal Drugs, Herbal Drug Preparations and Herbal Medicinal Products; The European Agency for the Evaluation of Medicinal Products: London, 2000; CPMP/QWP/2820/00.
(15) Osborne, B. G.; Fearn, T.; Hindle, P. H. Practical NIR Spectroscopy With Applications in Food and Beverage Analysis, 2nd ed.; Longman: Harlow, U.K., 1993; Chapter 2.
(16) Downey, G.; Boussion, J. J. Sci. Food Agric. 1996, 71, 41-47.
(17) Schultz, H.; Engelhardt, U. H.; Wegent, A.; Drews, H. H.; Lapczynski, S. J. Agric. Food Chem. 1999, 47, 5064-5067.
(18) Steuer, B.; Schultz, H.; Läger, E. Food Chem. 2001, 72, 113-117.
(19) Bertran, E.; Blanco, M.; Coello, J.; Itturiaga, H.; Maspoch, S.; Montoliu, I. J. Near Infrared Spectrosc. 2000, 8, 45-52.
(20) Woo, Y. A.; Kim, H. J.; Chung, H. Analyst 1999, 124, 1223-1226.
(21) Woo, Y. A.; Kim, H. J.; Cho, J.; Chung, H. J. Pharm. Biomed. Anal. 1999, 21, 407-413.
(22) Chen, Y.; Sorensen, L. K. Fresenius’ J. Anal. Chem. 2000, 367, 491-496.
(23) Ren, G.; Chen, F. J. Agric. Food Chem. 1999, 47, 2771-2775.
(24) Molt, K.; Zeyen, F.; Podpetschnig-Fopp, E. Pharmazie 1997, 52, 931-937.
(25) Kennedy, C. A.; Shelford, J. A.; Williams, P. C. Near Infrared Spectroscopy: Future Waves; Davies, A. M. C., Williams, P. C., Eds.; NIR Publications: Chichester, U.K., 1996; pp 524-530.

Table 1. Calibration and Validation Set Description and Specificity Results of the Identification Method of E. purpurea Dried Milled Roots
                                 true samples:   false samples:    false samples:   false samples:
sample identity                  E. purpurea     E. angustifolia   E. pallida       P. integrifolium   tot.
batches (spectra) in CSa         8 (16)          3 (6)             0 (0)            1 (2)              12
no. of suppliers in CSa          8               2                 0                1
batches (spectra) in VSa         10 (20)         20 (40)           10 (20)          1 (2)              41
no. of suppliers in VSa          5               5                 4                1
type I errors (% of spectra)     10                                                                    10
type I errors (% of batches)     10                                                                    10
type II errors (% of spectra)                    0                 0                0                  0
type II errors (% of batches)                    0                 0                0                  0
a CS and VS: calibration set and validation set, respectively.
EXPERIMENTAL SECTION
Sample Description. Sample collection was time-consuming because samples were obtained from a number of suppliers so that the calibration would be robust regardless of geographical origin. Dried roots of Echinacea purpurea, Echinacea angustifolia, Echinacea pallida, and Parthenium integrifolium were obtained from Heinrich Klenk (Schwebheim, Germany), Martin Bauer (Vestenbergsgreuth, Germany), Frantsilan Yrttitila (Kyröskoski, Finland), Alfred Galke (Gittelde, Germany), Richters (Goodwood, Ontario, Canada), Bioforce (Roggwil, Switzerland), Ian and Linda Grossard’s farm (Brandon, Manitoba, Canada), Medicinal herb farm (Kelowna, Canada), Nutrilite farm (Lakeview, CA), and Teardrop farm (Sterling, KS). In this study, dried roots of E. purpurea from the Asteraceae family were used as the true samples. False samples were E. pallida, E. angustifolia (two other Echinacea species), and P. integrifolium, another plant from the same Asteraceae family. Table 1 describes the number of samples collected for both true and false samples. All the samples were received in the form of dried roots (cut or entire roots), except for one sample of E. purpurea (called P1) that was already in powder form. Samples received as entire roots were cut into smaller pieces in a crusher and then milled to a powder using a grinder. Samples received in the form of cut, dried roots were directly milled to powder using the same grinder (Ika Labortechnik, type A10, Staufen, Germany). The amount of sample ground was 10 g, the grinding speed 20 000 r/min, and the grinding time 15 s. The particle size distribution of the ground samples was broad. The sample P1, received directly in milled form, was scanned without the grinding procedure. The particle size distribution of samples was evaluated by vibrational sieve analysis over the range of 90-1000 µm (Analyzette, Fritsch, Germany).
Samples ground with the grinding procedure described above had a particle size ranging from 90 to 1000 µm, whereas P1, received directly in milled form, had a narrow particle size distribution: approximately 125-355 µm. The drying procedures were often unknown and most probably varied from one supplier to another. To make sure that moisture was not an interfering factor in the discrimination of E. purpurea, the moisture content of the powder samples was determined using an infrared dryer (Sartorius Thermocontrol YTCOL, Sartorius GmbH, Göttingen, Germany). This dried the samples at 110 °C until the loss in weight was less than 0.1% during 50 s. The loss in weight was 7.7 ± 0.9% (n = 18), 8.0 ± 1.1% (n = 23), 7.2 ± 0.7% (n = 10), and 8.05 ± 0.2% (n = 2) for the E. purpurea, E. angustifolia, E. pallida, and P. integrifolium samples, respectively. As the confidence intervals of these values overlapped, there was no evidence of differences in the moisture content of the four plant samples. Nine adulterated samples were prepared by blending different amounts of adulterant samples with an E. purpurea powder sample. The nine laboratory-made samples contained E. purpurea and 5%, 10%, or 20% of milled adulterant: E. angustifolia (samples A5, A10, A20); E. pallida (samples P5, P10, P20); P. integrifolium (samples I5, I10, I20).
Reference Analysis. Before development of the NIR method, all samples were identified by reversed-phase HPLC to avoid using mislabeled samples. The HPLC method12 was developed using a computer-assisted optimization program, DryLab 2000 (LC Resources). This method was based on the quantitative determination of caffeic acid derivatives (echinacoside and cichoric acid) and of the main alkylamide (dodeca-2E,4E,8Z,10E/Z-tetraenoic acid isobutylamide) because they are known to be relevant for standardization purposes.7 The presence of chlorogenic acid and cynarine was also investigated.
Spectroscopic Measurements and Software. The spectra were recorded on an MB 160DX FT-NIR (Fourier transform near-infrared) spectrometer (ABB Bomem, Quebec, Canada) fitted with a quartz-halogen lamp and a cooled InAs detector. The configuration used a Powder Samplir reflectance accessory and the diffuse reflectance mode. The spectrometer was equipped with the software package from ABB Bomem, including Grams 32 version 4.04 for spectral acquisition, PLSplus/IQ version 3.03 for spectral processing and chemometric analysis, and AIRS (Advance Infrared Software) version 1.54 for routine qualitative analysis. PCA (principal component analysis) was performed using Systat version 9 (SPSS, Chicago, IL).
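As a rough illustration of the moisture comparison reported under Sample Description, the following Python sketch checks whether the four reported loss-on-drying intervals overlap pairwise. It assumes, for illustration only, that each reported plus-or-minus value can be read as the half-width of the interval being compared; the text does not say how the confidence intervals were computed, and the names `moisture`, `interval`, `intervals_overlap`, and `all_pairs_overlap` are hypothetical.

```python
from itertools import combinations

# Reported loss in weight (%) as (mean, half-width).
# Assumption for this sketch: the reported plus-or-minus value is used
# directly as the half-width of the interval.
moisture = {
    "E. purpurea": (7.7, 0.9),
    "E. angustifolia": (8.0, 1.1),
    "E. pallida": (7.2, 0.7),
    "P. integrifolium": (8.05, 0.2),
}


def interval(mean, half_width):
    """Return the (lower, upper) bounds of mean +/- half_width."""
    return mean - half_width, mean + half_width


def intervals_overlap(a, b):
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def all_pairs_overlap(data):
    """True when every pair of intervals in the data overlaps."""
    bounds = {name: interval(*values) for name, values in data.items()}
    return all(
        intervals_overlap(bounds[x], bounds[y])
        for x, y in combinations(bounds, 2)
    )


if __name__ == "__main__":
    print(all_pairs_overlap(moisture))
```

Under that reading, overlapping intervals for every pair of species would be consistent with the statement that moisture content does not separate the four plants.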
For spectral acquisition, the background was scanned from a Spectralon 99% Reflective Standard (Labsphere, North Sutton, NH) located on the beam of the Powder Samplir accessory. Each batch was sampled in duplicate and poured without tapping into a 20 mL borosilicate scintillation glass vial (Kimble Glass, Vineland, NJ). The vial containing about 5 mL of powder was placed on the beam of the Powder Samplir accessory, and the spectrum was recorded over the range 10000-4000 cm-1. Each spectrum was the average of 60 coadded interferograms at 16 cm-1 resolution.
RESULTS AND DISCUSSION
Feasibility Study. Prior to developing a method based on a partial least-squares (PLS) algorithm, spectral differences between species were investigated using raw spectra, second-derivative spectra, and principal component analysis (PCA). Figure 1 shows some typical raw spectra obtained with this data acquisition method. These NIR spectra show broad, overlapping absorption bands arising from overtones and combinations of fundamental molecular vibrations and are thus only a “blurred picture” of the chemical composition of the plants. Duplicate spectra from the 53 collected samples were preprocessed using a Savitzky-Golay second derivative to reduce baseline variation and enhance spectral features. Second-derivative spectra showed spectral differences among the four plants (Figure 2). For example, in the 4300-4100 cm-1 region, the E. purpurea spectra showed similarities with the P. integrifolium spectra and
Figure 1. Typical spectra of dried milled roots of P. integrifolium with a y-offset of 0.07 absorbance unit, E. angustifolia with a y-offset of 0.05 absorbance unit, E. pallida with a y-offset of -0.04 absorbance unit, and E. purpurea, over the range 10000-4000 cm-1.

Table 2. Vibrational Assignments14 of Absorption Bands over the Range 6170-4000 cm-1
wavenumbers/cm-1   assgnts
6170-5850          C-H stretching, first overtone
4390-4280          combination band: C-H stretching, C-H deformation
4260               combination band: CH2 symmetrical stretching, =CH2 deformation
4250               C-H deformation second overtone
4200               O-H deformation second overtone
4060-4000          combination band: C-H stretching, C-C stretching
E. angustifolia with E. pallida, respectively. In the 6100-5700 cm-1 region, the E. purpurea spectra were easier to distinguish from the P. integrifolium spectra. These regions are characteristic of various internal molecular vibrations, as described in Table 2, and represent the complexity of the plant chemical composition rather well. However, as spectral differences were found throughout almost the whole NIR range, the second-derivative spectra were used in PCA without spectral selection. PCA was performed on second-derivative spectra to investigate qualitative differences between the samples in the PC (principal component) space. After PCA processing, several combinations of two and three PCs were investigated. The best differentiation between E. purpurea and false samples was found with a plot of the unstandardized principal component scores using the first and third PCs in a two-dimensional plot and the first, third, and fourth PCs in a three-dimensional plot (Figure 3). The spectra of two E. purpurea batches (P1 and P2) were substantially different from the other true spectra (Figure 3). The difference for P2 could be explained by its HPLC profile, which differed from those of the other E. purpurea samples: its isobutylamide content was below the detection limit of the method. However, no chemical differences were found for P1. Its spectral characteristics in the PC space were probably due to its different physical properties. P1 had a narrower particle size distribution (125-355
Figure 2. Second derivative spectra over two spectral regions 4700-4300 cm-1 (a) and 5700-6100 cm-1 (b) from the four types of plant.
µm) than the other samples (90-1000 µm). It was in fact received directly in milled form, while the other samples were milled at reception according to the grinding procedure described above. Differentiation was also difficult between P. integrifolium and E. purpurea, although the former has a chemical composition different from that of the true samples. Despite the presence of some outliers in the PC space, a significant difference was found between the true and false samples. We therefore decided to enhance this difference by applying our existing knowledge of sample identity and developing a PLS method. In fact, the PLS algorithm has already proved its ability to perform discriminant analyses in the food and herbal industries but has never been used to classify Echinacea species.
Calibration and Validation Set Design. The samples were divided into a calibration set (CS) and a validation set (VS). The calibration set was used to develop the PLS model and to find the best discrimination criteria. The validation set contained
Figure 3. Factor score plots in PCA based on second derivative spectra from E. purpurea (●) and false samples E. angustifolia (▲), E. pallida (+), and P. integrifolium (△). The two-dimensional plot (a) is from factor scores one and two, and the three-dimensional plot (b) is from factor scores one, three, and four. P1 and P2 are spectra from two E. purpurea batches showing spectral features different from those of the other true samples.
duplicate spectra that were not used in the calibration set and was used to test the validity of the developed model. The division into CS and VS was performed so that both sets contained true and false samples. The spectral variation of the calibration set was also increased by including samples from various species, locations of growth, climates, and harvesting seasons. Once these conditions were fulfilled, the rest of the division was randomized. Samples received after the method development stage were added to the validation set. The calibration set (Table 1) contained duplicate spectra from eight E. purpurea batches and four false sample batches. The false samples of the calibration set did not include E. pallida because good discrimination could not be achieved between the true and false samples. The calibration set was constructed by assigning a binary coefficient to each spectrum according to its origin: true sample spectra (E. purpurea) received a coefficient of 100, and false sample spectra a coefficient of 0. The concept behind this identification method is that the PLS model uses spectra to predict
Table 3. Calibration Settings S1-S3a
param            S1                  S2                  S3
pretreatment     second deriv, SNV   second deriv, SNV   second deriv, MSC
region           whole NIR range     7500-4000 cm-1      7500-4000 cm-1
no. of factors   7                   7                   8
R2               0.974               0.982               0.985
rmsd             7.61                6.41                5.75
a SNV, standard normal variate; second deriv, second derivative; MSC, multiplicative scatter correction; rmsd, root-mean-square deviation; S1-S3, different settings tested for E. purpurea identification method.
“values” that are used as discrimination criteria. Predicted values within the acceptance range correspond to a true sample, whereas predicted values outside the acceptance range correspond to false samples. The acceptance range was defined by the mean predicted value (± 3 standard deviations (s)) calculated from the true calibration set samples. In the same way, the PLS model predicts factor scores that are also used as discrimination criteria. Their acceptance ranges are calculated from the true calibration set samples as described above.
Spectral Pretreatment and Region Selection. The quality of the calibration was optimized by choosing pretreatment options according to the following three criteria: (1) A minimum number of prediction outliers and spectral outliers in the calibration set must be obtained. The F test was applied to evaluate the statistical significance of the outliers and used as a criterion for rejecting outliers from the calibration set before calculating the final model. (2) A correlation coefficient (R2) higher than 0.9 must be reached between the spectra and prediction values. (3) A satisfactory root-mean-square deviation (rmsd) must be achieved. Various settings based on a PLS algorithm were developed from the same calibration set, and three of them gave satisfactory results (Table 3). These three types of setting (S1, S2, and S3) used a Savitzky-Golay second derivative and mean centering. S2 and S3 used an SNV (standard normal variate) correction and an MSC (multiplicative scatter correction),26 respectively. S2 and S3 also used a negative region selection based on the determination coefficient spectrum plot and on the standard deviation spectrum plot. The standard deviation plot was built by subtracting the average spectrum from every spectrum in the calibration set and then calculating the standard deviation at every wavelength. Regions on this plot that show large positive peaks are regions where the spectra vary considerably.
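The pretreatment steps named above (a Savitzky-Golay second derivative, the SNV correction, and the standard deviation spectrum used for region selection) can be sketched in Python with NumPy. This is a minimal illustration and not the PLSplus/IQ implementation used in the study: the window width and polynomial order are assumptions, since the text does not report them, and all function names are hypothetical. SciPy's `scipy.signal.savgol_filter` offers an equivalent, ready-made derivative filter.

```python
import math

import numpy as np


def savgol_coefficients(window, polyorder, deriv):
    """Savitzky-Golay weights for the deriv-th derivative at the window centre."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    # Least-squares polynomial fit: columns are offsets**0, offsets**1, ...
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row `deriv` of the pseudo-inverse gives the fitted coefficient of
    # offsets**deriv; multiplying by deriv! turns it into the derivative.
    return math.factorial(deriv) * np.linalg.pinv(design)[deriv]


def second_derivative(spectra, window=11, polyorder=2, spacing=1.0):
    """Second derivative of each row; edge points are dropped ('valid' mode).

    window and polyorder are illustrative defaults, not values from the study.
    """
    coeffs = savgol_coefficients(window, polyorder, deriv=2) / spacing**2
    # np.convolve flips its kernel, so reverse the weights to apply them as written.
    return np.array(
        [np.convolve(row, coeffs[::-1], mode="valid") for row in spectra]
    )


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) individually."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def std_spectrum(spectra):
    """Standard deviation at every wavelength after subtracting the average spectrum."""
    centered = spectra - spectra.mean(axis=0)
    return centered.std(axis=0, ddof=1)
```

Broad regions where `std_spectrum` stays small would be candidates for removal under the negative selection described in the text; the order in which the derivative and SNV steps are chained is left to the caller, because the text does not state it.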
Negative selection was performed by removing the broad regions having the poorest standard deviation and determination coefficient. No positive region selection based on the structure of chemical compounds could be performed because of the huge number of compounds present in the plants. The resulting selected region was 7500-4000 cm-1 for both S2 and S3. Some of the internal molecular vibrations involved in this region are described in Table 2. The use of S3 gave the best results in terms of rmsd and R2 (Table 3).
(26) Chaminade, P.; Baillet, A.; Ferrier, D. Analusis 1998, 26, 4, M33-M38.
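For illustration, the 7500-4000 cm-1 region selection and the multiplicative scatter correction used in setting S3 could look as follows in Python with NumPy. This is a sketch under assumptions: the mean spectrum is taken as the MSC reference, which the text does not specify, and `select_region` and `msc` are hypothetical names.

```python
import numpy as np


def select_region(wavenumbers, spectra, high=7500.0, low=4000.0):
    """Keep only the columns whose wavenumber lies within [low, high] cm-1."""
    mask = (wavenumbers <= high) & (wavenumbers >= low)
    return wavenumbers[mask], spectra[:, mask]


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum is regressed on the reference (spectrum = a + b * reference)
    and corrected as (spectrum - a) / b. Using the mean spectrum as the
    reference is an assumption made for this sketch.
    """
    if reference is None:
        reference = spectra.mean(axis=0)
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(reference, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected
```

Spectra that differ from the reference only by an additive offset and a multiplicative factor collapse onto the reference after this correction, which is the scatter-removal behavior MSC is meant to provide.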
However, S3 used a higher number of factors than S2 and was thus more susceptible to overfitting. S2 was the second type of setting that gave good results with a smaller number of factors and was, therefore, chosen to build the Echinacea identification method.
Data Analysis. The method was developed using the S2 settings and a PLS algorithm and constructed by cross-validation. The number of significant PLS factors was taken to be the smallest at which the prediction error sum of squares (PRESS) was not significantly different from the lowest PRESS value.27 The resulting number of factors was 7, which is relatively high and leads to a potential risk of overfitting the model, i.e., modeling the system noise. Nevertheless, model performance decreased when the number of factors was reduced, and therefore, the factor number was not modified.
Internal Validation and Discrimination Settings. Two discrimination criteria were chosen to discriminate false samples from true samples: the predicted value and the factor scores. The acceptance limits for each criterion were set at three standard deviations (s) around the mean value of the true samples from the calibration set. The reliability of these limits was evaluated by performing an internal validation, i.e., testing calibration samples against the method. In the case of misclassification of the spectra, the reasons for the failures must be investigated to decide whether the method has to be replaced or the acceptance criteria modified. The samples included in the calibration set were thus rescanned in duplicate, and the resulting 24 spectra were tested against the method (Table 1). As a result (Figure 4), two true spectra from two different batches were misclassified by the method, but these failures were only due to prediction values that were 2-5% higher than the upper acceptance limit. The averages of the duplicate spectra of these batches were nevertheless within the acceptance limits.
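The discrimination scheme described above (a PLS model trained on a 100/0 coding, with acceptance limits at the mean ± 3s of the true calibration samples) can be sketched as follows. This is a simplified NIPALS PLS1 written with NumPy for illustration, not the PLSplus/IQ algorithm used in the study; it covers only the predicted-value criterion, not the factor scores, and all function names are hypothetical.

```python
import numpy as np


def pls1_fit(X, y, n_factors):
    """Fit a single-response PLS model (NIPALS PLS1) on mean-centered data.

    X holds one pretreated spectrum per row; y holds the coded identity
    (100 for true samples, 0 for false samples in the scheme described above).
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    weights, loadings, inner = [], [], []
    for _ in range(n_factors):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)          # weight vector of unit length
        t = Xr @ w                      # factor scores
        tt = t @ t
        p = Xr.T @ t / tt               # spectral loading
        q = yr @ t / tt                 # regression of y on the scores
        Xr = Xr - np.outer(t, p)        # deflate X and y before the next factor
        yr = yr - q * t
        weights.append(w)
        loadings.append(p)
        inner.append(q)
    W, P, q = np.array(weights).T, np.array(loadings).T, np.array(inner)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef


def pls1_predict(model, X):
    """Predicted values for new spectra (rows of X)."""
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean


def acceptance_limits(true_predictions, k=3.0):
    """Mean +/- k standard deviations of the true calibration predictions."""
    mean = np.mean(true_predictions)
    s = np.std(true_predictions, ddof=1)
    return mean - k * s, mean + k * s


def is_accepted(value, limits):
    """True when a predicted value falls inside the acceptance range."""
    return limits[0] <= value <= limits[1]
```

In use, one would fit the model on the calibration spectra, compute `acceptance_limits` from the predictions of the true calibration samples only, and pass each new prediction to `is_accepted`. Widening `k` from 3 to 5 corresponds to the alternative 5s limit discussed in the text.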
Moreover, none of the false samples were misclassified. The possibility of setting the prediction value limit to 5s was investigated during the internal and external validation, but this led to misidentifications of false samples (Figure 4). It was therefore decided to keep the acceptance limits at 3s for all criteria.
External Validation. External validation tests batches that were not included in the calibration set against the identification method. Its purpose is to verify the applicability of the method for differentiating true samples from false samples and to evaluate the method specificity. A total of 41 batches (Table 1) from the validation set (VS) were tested, including both true (10 batches) and false samples (31 batches). The results of the validation are expressed in the form of a type I error28 (to reject a sample that is acceptable by other criteria) and a type II error (to accept a sample unacceptable by other criteria). Type I and II errors were calculated as follows:
type I error (%) = 100 × number of misclassified true batches (or spectra)/number of true batches (or spectra) in the validation set
type II error (%) = 100 × number of misclassified false batches (or spectra)/number of false batches (or spectra) in the validation set
Type I Error. A total of 9 of the 10 true VS samples investigated (Table 1) were correctly classified as E. purpurea, leading to a type I error of 10%. Figure 4 shows the prediction values
(27) Haaland, D. M.; Thomas, E. V. Anal. Chem. 1988, 60, 1193-1202.
(28) Svensson, O.; Josefson, M.; Langkilde, F. W. Appl. Spectrosc. 1997, 51, 1826-1835.
Figure 4. Prediction values of duplicate spectra recorded from calibration set (CS), validation set (VS), and synthetic adulterant samples. Prediction values were obtained by challenging spectra with a PLS model that was developed by setting a coefficient of 100 to the true sample spectra and a coefficient of 0 to the false sample spectra. Solid lines show the prediction acceptance limit at 3s, and the dashed lines show the acceptance limit at 5s. Spectra were recorded from true samples, E. purpurea (CS spectra 1-20, VS spectra 21-36), and false samples, E. angustifolia (CS spectra 37-42, VS spectra 43-82), E. pallida (VS spectra 83-102), and P. integrifolium (CS spectra 103-104, VS spectra 105-106). The laboratory-made adulterant samples were E. purpurea adulterated with 5%, 10%, and 20% E. angustifolia (spectra 107-108, 109-110, and 111-112, respectively), E. purpurea adulterated with 5%, 10%, and 20% E. pallida (spectra 113-114, 115-116, and 117-118, respectively), and E. purpurea adulterated with 5%, 10%, and 20% P. integrifolium (spectra 119-120, 121-122, and 123-124, respectively).
for the true spectra of both the calibration and validation sets. Misclassification of the failed sample was due to its prediction values, even though all the factor scores passed the acceptance limits. The prediction values from its duplicate spectra were respectively 5 and 10% higher than the upper acceptance limit at 3s, but their average fell within the acceptance limit at 5s. This confirms the internal validation results: the upper acceptance limit at 3s may not be representative enough of samples with high prediction values. The rejected sample was investigated to determine the reason for this failure. The investigation suggested that the sample was a physical outlier because it was the sample (P1) whose particle size distribution was narrower (125-355 µm) than that of the samples milled after reception (90-1000 µm). The P1 spectra already differed from the other spectra in the PCA (Figure 3). It is well-known that NIR spectra are influenced by the particle size of the sample,28,29 leading to possible misidentification of samples. In the case of our failed sample P1, the intensity over the entire spectral range was approximately 0.15 absorbance units lower than that of the other true samples, although the spectral shape was similar. Thus, P1 is most probably one more example of the sensitivity of spectra to the physical properties of samples. Nevertheless, more true samples will have to be analyzed to confirm whether the acceptance limits of the prediction values need to be optimized. To avoid physical outliers, the sample preparation procedure must always be the same as that for the calibration samples. If these future true samples are accepted by the NIR method, it will confirm that the model is specific to the sample identity but not robust against variations in particle size.
(29) Blanco, M.; Coello, J.; Eustaquio, A.; Iturriaga, H.; Maspoch, S. J. Pharm. Sci. 1999, 88, 551-556.
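As a worked illustration of the error rates defined under External Validation, the Python sketch below recomputes the batch-level and spectrum-level values from the counts in Table 1 and the text. The spectrum counts assume that both duplicate spectra of the rejected batch failed, as the text indicates; `error_rate` is a hypothetical helper name.

```python
def error_rate(misclassified, total):
    """Percentage of misclassified items (batches or spectra)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * misclassified / total


# Validation set counts from Table 1: 10 true batches (20 spectra) and
# 31 false batches (40 + 20 + 2 = 62 spectra).
type_i_batches = error_rate(1, 10)    # one true batch (P1) rejected
type_i_spectra = error_rate(2, 20)    # both of its duplicate spectra rejected
type_ii_batches = error_rate(0, 31)   # no false batch accepted
type_ii_spectra = error_rate(0, 62)   # no false spectrum accepted
```

These values correspond to the type I and type II rows of Table 1.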
Type II Error. A total of 31 batches of false VS samples were tested (Table 1). None of them was misclassified, leading to a type II error of 0%. Figure 4 confirms that the acceptance limits set during internal validation are well able to discriminate substituted samples from true samples. As an example, P. integrifolium spectra failed both the prediction value and first factor score criteria. The absence of type II error using a 1 min analysis method is very promising compared to the lack of speed of traditional chromatographic methods.
Limit of Detection. To investigate the possibility of detecting the presence of an adulterant in E. purpurea samples, duplicate spectra of laboratory-made adulterated samples were challenged with the developed method (Figure 4). As a result, E. purpurea samples contaminated with 20% of E. angustifolia (A20), E. pallida (P20), or P. integrifolium (I20) were rejected by the method. E. purpurea samples contaminated with 10% of each adulterant had prediction values close to the lower acceptance limit: A10 and P10 spectra were rejected, and one of the duplicate spectra of I10 was accepted by the method, the other being rejected. All samples with an adulteration content of 5% (A5, P5, I5) were accepted by the method as true samples. Therefore, the limit of detection of an adulterant is 10% in E. purpurea powder samples. Samples containing more than 10% of adulterant (E. angustifolia, E. pallida, or P. integrifolium) should be rejected as false samples by the method.
CONCLUSION
The NIR method has shown an excellent capacity to rapidly identify E. purpurea roots and to verify the absence of frequent substitutions. However, differences in the particle size distribution can cause rejection of true samples. Therefore, samples to be
analyzed have to be milled using the grinding procedure described above. Adulterated E. purpurea samples can be detected at a minimum of 10% adulteration. This identification method offers great improvements in speed and cost compared to traditional chromatographic methods. However, quantitative information on marker substances is also important for testing herbal drug quality. This NIR method therefore does not replace traditional analytical methods, but it can be used as an initial tool in the fight against fraud and can help to avoid testing substituted or adulterated samples by expensive and time-consuming chromatographic methods.
ACKNOWLEDGMENT We are grateful to Pharmia for supplying the FT-NIR spectrometer and to Heinrich Klenk, Martin Bauer, Frantsilan Yrttitila, Alfred Galke, Richters, Bioforce, Ian and Linda Grossart’s farm, Medicinal herb farm, Nutrilite farm, and Teardrop farm for supplying samples.
Received for review October 22, 2001. Accepted February 19, 2002. AC011108F