Anal. Chem. 1994, 66, 937-943

Assessment of the Quality of Latent Variable Calibrations Based on Monte Carlo Simulations

Hans R. Keller,* Jürgen Röttele, and Hermann Bartels

Central Analytical Department, Ciba Research Services, Ciba-Geigy Ltd., CH-4002 Basle, Switzerland

A general problem in multivariate calibration is the assessment of the prediction quality in advance, i.e., without extensive experimentation. Monte Carlo simulations are proposed to estimate the quality of latent variable calibrations and thereby to minimize experimental work and to save resources. Two different data sets from industrial practice illustrate that this approach can be applied successfully to very different problems. The first set consists of 100 samples of penicillin analyzed by near-IR spectroscopy. The second data set contains visible spectra of 37 samples of a dyestuff intermediate. Based on the knowledge of the noise in the calibration method, the noise in the reference method, the spectra of the analytes, the concentration range used for calibration, and the number of calibration samples, the prediction quality can be estimated. To characterize the quality of a multivariate calibration, the simultaneous use of criteria related to the standard error of prediction (SEP) and correlation-based criteria is recommended.

In the chemical industry, one can no longer rely on the quality control of the final product alone. To ensure the quality of a product, quality control is increasingly integrated into chemical production. The challenge is to obtain the necessary information quickly enough to allow corrective actions during the manufacturing process without slowing down production. Spectroscopic analytical techniques, such as ultraviolet-visible (UV-visible), infrared (IR), near-infrared (near-IR), and recently also Raman spectroscopy, combined with chemometrics are increasingly applied in process control, because specific information can be gathered very quickly. The corresponding spectra are generally affected by quality-relevant parameters, though mostly in a more or less unselective way.
Multivariate calibration allows deconvolution of spectra and hence the prediction of quality-relevant measures such as concentrations, concentration ratios, shades, and others. A major problem with multivariate calibration is the need for quite a large number of samples, the collection of which is often difficult and time consuming. In many cases, it would be highly desirable to first assess the chance of success of such a calibration project in order to save resources. Ideally, such a feasibility study should be based on only a few initial experiments and answer the question of whether a given problem can be solved or not. This article proposes Monte Carlo simulations as a solution to this problem and illustrates their benefit with examples from industrial practice. Two data sets from an industrial environment are analyzed. The first set consists of 100 samples of penicillin analyzed by near-IR spectroscopy. The second data set contains visible spectra of 37 samples of a dyestuff intermediate.

0003-2700/94/0366-0937$04.50/0 © 1994 American Chemical Society

THEORY

There are several techniques for multivariate calibration, the ones most frequently used being classical least squares regression (CLS), principal component regression (PCR) (refs 3, 4), and partial least squares regression (PLS) (refs 3, 6-8). In CLS, the measured signals X are considered to be a function of the sample concentrations C

$$X = CK \qquad (1)$$

This represents Beer's law and is also known as the K matrix method (refs 9, 10). For situations where the pure spectra of all constituents are not available, CLS cannot easily be applied because of collinearity problems. In such cases, inverse calibration methods of the form

$$C = XP \qquad (2)$$

are used instead (sometimes also termed the P matrix method; refs 9, 10). The most important among these inverse calibration methods are PCR and PLS. These two techniques rely on the concept of principal component analysis (PCA), which is a generally accepted method for data reduction. By means of PCA, the many measured variables are reduced to a much smaller set of abstract factors (also called principal components), which are linear combinations of the originally measured variables. The advantage of PCR and PLS is that these methods can be applied for calibration even if pure standards are not available. Excellent texts on multivariate calibration methods and factor analysis can be found in the literature (refs 6, 11, 12).

Like other analytical methods, multivariate calibration models also need to be validated to provide useful information. To validate a calibration model and to estimate its performance, it is generally accepted to proceed as follows. The available calibration data are divided into a calibration set (learning

(1) Blackburn, J. A. Anal. Chem. 1965, 37, 1000-1003.
(2) Haaland, D. M.; Easterling, R. G.; Vopicka, D. A. Appl. Spectrosc. 1985, 39, 73-84.
(3) Kowalski, B. R.; Seasholtz, M. B. J. Chemom. 1991, 5, 129-145.
(4) Erickson, C. L.; Lysaght, M. J.; Callis, J. B. Anal. Chem. 1992, 64, 1155A-1163A.
(5) Draper, N.; Smith, H. Applied Regression Analysis, 2nd ed.; Wiley: New York, 1981.
(6) Geladi, P.; Kowalski, B. R. Anal. Chim. Acta 1986, 185, 1-17.
(7) Haaland, D. M.; Thomas, E. V. Anal. Chem. 1988, 60, 1193-1202.
(8) Thomas, E. V.; Haaland, D. M. Anal. Chem. 1990, 62, 1091-1099.
(9) Brown, C. W.; Lynch, P. F.; Obremski, R. J.; Lavery, D. S. Anal. Chem. 1982, 54, 1472-1479.
(10) Kisner, H. J.; Brown, C. W.; Kavarnos, G. J. Anal. Chem. 1983, 55, 1703-1707.
(11) Martens, H.; Naes, T. Multivariate Calibration; Wiley: New York, 1989.
(12) Malinowski, E. R. Factor Analysis in Chemistry, 2nd ed.; Wiley: New York, 1991.

Analytical Chemistry, Vol. 66, No. 7, April 1, 1994 937

set), which is used to build a calibration model, and a validation set (test set), which is subsequently used to validate the model. Cross-validation is most frequently used to determine the appropriate number of factors and thereby to build a PCR or PLS calibration model (refs 3, 7-11, 13). The selected number of factors is then used to build the final calibration model. The criterion used most frequently to describe the prediction quality of a given calibration model is the standard error of prediction (SEP)

$$\mathrm{SEP} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - y_i^{*}\right)^{2}} \qquad (3)$$

where y_i is the measured concentration, y_i* is the predicted concentration, and n is the number of samples. SEP, which represents the average prediction error, can be determined for both the calibration and validation sets. Large differences in SEP between these two sets would indicate an inadequate calibration model, which can be due to unrepresentative samples in the calibration set.

In practice, however, it is not sufficient to get small SEP values. One will normally also require that the measured and predicted concentrations are directly related. Hence, a reasonable correlation between the measured and predicted concentrations is a prerequisite for a good calibration model. This can be understood as follows. In cases where concentration ranges are very narrow, as is usually the case when samples are taken from production, the main source of variation in the data is experimental noise. Thus, it is not possible to satisfactorily correlate measured and predicted concentrations, even if SEP values are small. In addition to the SEP, a correlation-based criterion such as the correlation coefficient r should therefore be used.

r = [right-hand side not recoverable from the source] (4)

where ȳ is the average of the measured concentrations.

For reasons of speed, time, or cost, one often tries to replace time-consuming application tests, separation techniques, or the like by direct multivariate methods. As the latter are often less precise, the problem is to determine whether these multivariate techniques are able to meet the requirements or not. The most informative way to do this is to make a feasibility study with experimental data. The difficulty with this approach is the need for quite a large number of samples. As error propagation is not yet fully understood for PCR and PLS, the general problem cannot be solved theoretically either. Only under certain assumptions can prediction errors be calculated analytically. A third possibility, which is applied in this article, is to solve the problem with computer simulations.

(13) Wold, S. Technometrics 1978, 20, 397-405.
(14) Höskuldsson, A. J. Chemom. 1988, 2, 211-228.

Computer simulations can be considered as a link between theory and experimental work. They have the advantage that the experimental parameters, such as error levels and number of substances, can be changed easily and results are obtained very quickly. Computer simulations can provide a deeper understanding of underlying mechanisms and are used frequently by many research groups for a wide variety of problems in PCA and related techniques. Provided reasonably realistic simulations are made, good estimates of real processes can be obtained with a minimum of time, cost, and effort. Still, one has to bear in mind that computer simulations must always be compared with real data.

The quality of multivariate calibration methods is affected by many parameters, such as the spectra of the individual compounds, the errors in concentration measurements, spectral measurement errors, the number of samples used for calibration, and the concentration range. These parameters will be examined in this study by means of computer simulations. Other parameters that also affect the quality of a calibration include experimental design, outliers, data pretreatment, wavelength range, and number of factors. The latter parameters are, however, beyond the scope of this article, and an extensive discussion can be found in the literature (refs 7, 11, 13, 18-20).

A single computer simulation will not necessarily provide the most representative results; hence, to get more realistic estimates, one runs many simulations. This technique is known as Monte Carlo simulation and will be applied here. For each set of simulated data and each calibration method, one calibration model is obtained, from which SEP values and r can be calculated for both the calibration and validation sets. As a Monte Carlo simulation provides a large number of these values, one may wish to condense these into the average SEP, SEP_avg, and the average r, r_avg

$$\mathrm{SEP}_{\mathrm{avg}} = \frac{1}{m}\sum_{i=1}^{m}\mathrm{SEP}_i \qquad (5)$$

where SEP_i is the SEP value obtained in simulation i and m is the number of simulations. Analogously,

$$r_{\mathrm{avg}} = \frac{1}{m}\sum_{i=1}^{m} r_i \qquad (6)$$

where r_i is the correlation coefficient obtained in simulation i.
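As a minimal sketch of these criteria (not the authors' Matlab program), the Python functions below compute SEP and r for one calibration and average per-simulation values. SEP is taken here with n in the denominator, and r is taken to be the ordinary Pearson correlation coefficient between measured and predicted values; both are assumptions, and all function names and numbers are hypothetical.

```python
import math

def sep(measured, predicted):
    """Root mean squared difference between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(measured, predicted)) / n)

def pearson_r(measured, predicted):
    """Pearson correlation between measured and predicted values (assumed form of r)."""
    n = len(measured)
    mean_y = sum(measured) / n
    mean_p = sum(predicted) / n
    sxy = sum((y - mean_y) * (p - mean_p) for y, p in zip(measured, predicted))
    sxx = sum((y - mean_y) ** 2 for y in measured)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    return sxy / math.sqrt(sxx * syy)

def average(values):
    """Mean over m simulations, as used for SEP_avg and r_avg."""
    return sum(values) / len(values)

# Hypothetical narrow range: errors of 0.1 that are unrelated to the measured values.
narrow_meas = [9.9, 10.0, 10.1, 10.0, 9.9, 10.1]
narrow_pred = [10.0, 9.9, 10.0, 10.1, 10.0, 10.0]
# Hypothetical wide range: errors of the same size on a much larger spread.
wide_meas = [5.0, 7.0, 9.0, 11.0, 13.0, 15.0]
wide_pred = [5.1, 6.9, 9.1, 10.9, 13.1, 14.9]

sep_narrow = sep(narrow_meas, narrow_pred)
r_narrow = pearson_r(narrow_meas, narrow_pred)
sep_wide = sep(wide_meas, wide_pred)
r_wide = pearson_r(wide_meas, wide_pred)

# Hypothetical per-simulation SEP values condensed into one average.
sep_avg = average([0.10, 0.12, 0.11])

print("narrow: SEP=%.3f r=%.3f" % (sep_narrow, r_narrow))
print("wide:   SEP=%.3f r=%.3f" % (sep_wide, r_wide))
```

Both toy data sets give the same SEP of 0.1, yet r is essentially zero for the narrow range and above 0.99 for the wide range, which is why SEP alone is not considered sufficient here.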

This paper will first illustrate the effects of different parameters on PCR and PLS. Furthermore, the results obtained from real experiments are compared with those from computer simulations to show the usefulness of Monte Carlo

(15) Keller, H. R.; Massart, D. L. Anal. Chim. Acta 1991, 246, 379-390.
(16) Gemperline, P. J.; Long, J. R.; Gregoriou, V. G. Anal. Chem. 1991, 63, 2313-2323.
(17) Gerritsen, M. J. P.; Faber, N. M.; van Rijn, M.; Vandeginste, B. G. M.; Kateman, G. Chemom. Intell. Lab. Syst. 1992, 12, 257-268.
(18) Seasholtz, M. B.; Kowalski, B. R. J. Chemom. 1992, 6, 103-111.
(19) Gemperline, P. J. J. Chemom. 1989, 3, 549-568.
(20) Brown, P. J. J. Chemom. 1992, 6, 151-161.

1. For both calibration set and validation set: generate uniformly distributed concentration values for all analytes (using random numbers).
2. For both calibration set and validation set: generate noise-free mixture spectra for all samples generated in step 1 (eq 7).
3. For both calibration set and validation set: add normally distributed noise to all concentration values created in step 1 and to all spectra created in step 2.
4. From the calibration set created in step 3: build a calibration model.
5. For both calibration set and validation set: based on the calibration model created in step 4, predict concentration values for all analytes and compare these with the ones generated in step 1.


Figure 1. Principle of a computer simulation.
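The five steps of Figure 1 can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the authors' Matlab program: CLS (eq 1) is swapped in for PCR and PLS to keep the linear algebra to 2 × 2 matrices, the pure spectra are hypothetical Gaussian bands, spectral noise is given as an absolute standard deviation, and all names and default values are invented for the example.

```python
import math
import random

def gaussian_band(center, width, n_points):
    """Hypothetical pure-compound spectrum: a single Gaussian band."""
    return [math.exp(-0.5 * ((j - center) / width) ** 2) for j in range(n_points)]

def inv2(m):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def cls_fit(C, X):
    """Least-squares estimate of K in X = C K for two analytes."""
    p = len(X[0])
    ctc = [[sum(c[a] * c[b] for c in C) for b in range(2)] for a in range(2)]
    ctx = [[sum(c[a] * x[j] for c, x in zip(C, X)) for j in range(p)] for a in range(2)]
    g = inv2(ctc)
    return [[g[a][0] * ctx[0][j] + g[a][1] * ctx[1][j] for j in range(p)] for a in range(2)]

def cls_predict(K, X):
    """Predict concentrations as x K' (K K')^-1 for every spectrum x."""
    g = inv2([[sum(u * v for u, v in zip(K[a], K[b])) for b in range(2)] for a in range(2)])
    preds = []
    for x in X:
        xk = [sum(u * v for u, v in zip(x, K[a])) for a in range(2)]
        preds.append([xk[0] * g[0][b] + xk[1] * g[1][b] for b in range(2)])
    return preds

def sep(y, y_hat):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))

def pearson_r(y, y_hat):
    my, mh = sum(y) / len(y), sum(y_hat) / len(y_hat)
    sxy = sum((a - my) * (b - mh) for a, b in zip(y, y_hat))
    return sxy / math.sqrt(sum((a - my) ** 2 for a in y) * sum((b - mh) ** 2 for b in y_hat))

def make_set(rng, K, n, conc_range, conc_sd, spec_sd):
    """Steps 1-3: true concentrations, noise-free spectra, then noisy copies of both."""
    c_true = [[rng.uniform(*conc_range) for _ in range(2)] for _ in range(n)]
    x_true = [[c[0] * k0 + c[1] * k1 for k0, k1 in zip(K[0], K[1])] for c in c_true]
    c_meas = [[v + rng.gauss(0.0, conc_sd) for v in c] for c in c_true]
    x_meas = [[v + rng.gauss(0.0, spec_sd) for v in x] for x in x_true]
    return c_true, c_meas, x_meas

def monte_carlo(conc_range, n_sim=20, n_cal=50, n_val=20, conc_sd=0.1, spec_sd=0.1, seed=1):
    """Return (SEP_avg, r_avg) for analyte 1 in the validation set."""
    rng = random.Random(seed)
    K = [gaussian_band(20, 6, 50), gaussian_band(30, 6, 50)]
    seps, rs = [], []
    for sim in range(n_sim):
        c_cal_true, c_cal, x_cal = make_set(rng, K, n_cal, conc_range, conc_sd, spec_sd)
        c_val_true, c_val, x_val = make_set(rng, K, n_val, conc_range, conc_sd, spec_sd)
        K_hat = cls_fit(c_cal, x_cal)        # step 4: model from the noisy calibration set
        pred = cls_predict(K_hat, x_val)     # step 5: predict the validation set
        y = [c[0] for c in c_val_true]       # compare with the values generated in step 1
        y_hat = [c[0] for c in pred]
        seps.append(sep(y, y_hat))
        rs.append(pearson_r(y, y_hat))
    return sum(seps) / n_sim, sum(rs) / n_sim

narrow = monte_carlo((9.0, 11.0))
wide = monte_carlo((5.0, 15.0))
print("narrow range: SEP_avg=%.3f r_avg=%.3f" % narrow)
print("wide range:   SEP_avg=%.3f r_avg=%.3f" % wide)
```

Changing `conc_sd`, `spec_sd`, `n_cal`, or the band positions corresponds to varying the reference error, the spectral error, the calibration set size, or the spectral similarity; a latent variable method would replace `cls_fit` and `cls_predict` without altering the surrounding loop.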

simulations for latent variable calibration methods. For cases where the pure spectra of the individual analytes are experimentally unavailable, they can be approximated for the comparative simulations by

$$X = CA \qquad (7)$$

$$A = (C'C)^{-1}C'X \qquad (8)$$

where X is the (samples × wavelengths) matrix of measured spectra, C the (samples × compounds) matrix of known concentrations, and A the (compounds × wavelengths) matrix of spectra.
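Equation 8 is the ordinary least-squares solution of eq 7 for A. The sketch below assumes only two compounds, so that the inverse of C'C can be written out by hand, and recovers the pure spectra from noise-free mixtures; the function name and all numbers are hypothetical.

```python
def estimate_pure_spectra(C, X):
    """Least-squares estimate A = (C'C)^-1 C'X for two compounds.

    C: one [c1, c2] pair per sample; X: one spectrum (list of values) per sample.
    """
    n_wl = len(X[0])
    # C'C is 2 x 2 and C'X is 2 x wavelengths.
    ctc = [[sum(c[a] * c[b] for c in C) for b in range(2)] for a in range(2)]
    ctx = [[sum(c[a] * x[j] for c, x in zip(C, X)) for j in range(n_wl)] for a in range(2)]
    det = ctc[0][0] * ctc[1][1] - ctc[0][1] * ctc[1][0]
    inv = [[ctc[1][1] / det, -ctc[0][1] / det],
           [-ctc[1][0] / det, ctc[0][0] / det]]
    return [[inv[a][0] * ctx[0][j] + inv[a][1] * ctx[1][j] for j in range(n_wl)]
            for a in range(2)]

# Hypothetical pure spectra (2 compounds x 3 wavelengths) and concentrations (4 samples).
A_true = [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]
C = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [1.0, 0.0]]
# Noise-free mixture spectra according to X = C A.
X = [[c[0] * A_true[0][j] + c[1] * A_true[1][j] for j in range(3)] for c in C]

A_est = estimate_pure_spectra(C, X)
max_error = max(abs(A_est[a][j] - A_true[a][j]) for a in range(2) for j in range(3))
print(max_error)
```

With noisy spectra, noisy reference values, or unmodeled constituents, the estimated spectra are only approximations of the true pure-compound spectra.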

EXPERIMENTAL SECTION

Monte Carlo Simulations. A computer simulation program for generation of multivariate data, calibration, and evaluation of the results was developed in Matlab (Version 4.0, The Mathworks Inc., Natick, MA). For each simulation a number of samples were created, each of them corresponding to concentration values for a defined number of analytes and a resulting mixture spectrum, which was generated as a linear combination of the underlying pure-compound spectra. To get more realistic results, both the simulated concentrations and the spectra were superimposed with normally distributed noise with zero mean. Mean-centered data were analyzed with PCR and PLS; the PCR factors were selected top down, as is commonly recommended (refs 11, 12). Figure 1 illustrates the simulation procedure in more detail. For each set of experimental conditions, this procedure was repeated many times, as specified below. All calculations were performed on a 486-processor-based IBM personal computer.

Purely Simulated Data Set. The first data set studied consisted of simulated mixtures of two more or less equally concentrated analytes with spectra of varying similarity (Figure 2). On the basis of 50 calibration samples, CLS, PCR, and PLS calibration models were generated with two factors, and their performances were evaluated using 20 validation samples. Such calibrations were repeated 25 times, and the results were evaluated using the SEP_avg and r_avg values. To study the effects of the concentration range, the number of samples used for calibration, the similarity of the spectra,

Figure 2. Spectra of the analytes used for simulation: (a, top) high spectral similarity (r = 0.998), (b, middle) moderate spectral similarity (r = 0.941), and (c, bottom) low spectral similarity (r = 0.616). Horizontal axis: wavelength [units], 0-100.

the errors in spectra, and the errors in concentrations, the Monte Carlo simulations were carried out with the parameters shown in Table 1.

Penicillin Data Set. The second data set consisted of 100 different penicillin-V sulfoxide-benzhydryl ester samples, which were recorded in routine measurement by different operators over several months. The content of the main product was analyzed by means of high-performance liquid chromatography (HPLC), and values in the range from 94.1 to 97.6% were reported. Water contents of 0.5-3.3%

Table 1. Parameters Used for Simulation Experiments

experiment                 spec error (%)   concn error   concn range (units)   spectral similarity (r)   calibr set size
standard                   1                0.1           9.0-11.0              0.941                     50
high spectral error        5                0.1           9.0-11.0              0.941                     50
high concn error           1                0.5           9.0-11.0              0.941                     50
small concn range          1                0.1           9.5-10.5              0.941                     50
large concn range          1                0.1           5.0-15.0              0.941                     50
large calibration set      1                0.1           9.0-11.0              0.941                     500
low spectral similarity    1                0.1           9.0-11.0              0.616                     50
high spectral similarity   1                0.1           9.0-11.0              0.998                     50

were reported using a Karl Fischer titration method, in which the samples were dissolved in a mixture of ethylene chloride-pyridine-methanol. A 701 KF Titrino instrument (Metrohm, Herisau, Switzerland) and KF reagent (Fluka, Buchs, Switzerland) were used for this analysis. Up to ~4% byproducts, which were not quantified individually, were also present in the samples. The precision of the two reference methods was estimated as 0.5% for the penicillin main component and 0.1% for water.

Near-IR spectra of the samples were recorded in diffuse reflectance mode with a fiber-optics probe from 4000 to 10 000 cm^-1 (1000-2500 nm) using an InfraProver FT near-IR spectrometer (Bran & Luebbe GmbH, Norderstedt, Germany) with a resolution of 12 cm^-1 (Figure 3). Due to experimental difficulties with the fiber-optics bundle below 4800 cm^-1 and uninformative signals above 7500 cm^-1, only the range of 4800-7500 cm^-1 (1333-2083 nm) was used (226 data points per spectrum). To overcome difficulties due to different grain size and sloping baselines, second-derivative spectra were used for quantification, as is generally recommended. The ICAP software (Version 3.0, Bran & Luebbe GmbH) was used for spectrometer control and data processing. The relative spectral error of the near-IR instrument was estimated to be 1%. Half of the data were used to build the calibration model and the other half for validation.

To test the performance of the simulation program developed here, this data set was also simulated using the same parameters. The pure-compound spectra of penicillin-V sulfoxide-benzhydryl ester and water were estimated by eq 8. Fifty calibration samples and 50 validation samples consisting of two analytes in the given concentration ranges were generated. These simulated concentrations and spectra were then superimposed with the specified amounts of normally distributed noise. Such simulations were repeated 100 times.

Dye Data Set. Another data set, parts of which have been published previously (ref 21), was also investigated. This data set consisted of samples of a dyestuff intermediate consisting mainly of two chemical species, the ratio of which determines the hue that had to be predicted. A total of 22 samples were taken from day-to-day production. Their final products resulted in dyeings with a hue difference from -0.2 to 0.9 ΔH unit (ref 22). Because of the poor performance of this calibration set, a further 15 samples with suitably altered concentration ratios and consequently large variations (from -0.2 to 3.9 ΔH units) were added in the second data set. The measured visible spectra of these production intermediates were correlated with

(21) Finneiser, K.; Röttele, J. Anal. Methods Instrum. 1993, 1, 97-103.
(22) Commission Internationale de l'Eclairage. Colorimetry, 2nd ed.; CIE: Vienna, 1986; No. 15.2.


Figure 3. (a) Measured spectra with 3.3% water and 95.7% penicillin-V sulfoxide-benzhydryl ester (solid line) and 1.3% water and 96.8% penicillin-V sulfoxide-benzhydryl ester (dashed line). (b) Corresponding second-derivative spectra in the selected spectral region. Horizontal axes: wavelength [nm].

the differences in hue (ΔH) observed with diffuse reflectance measurements, which were made on cotton materials that had been dyed with the final reactive dyestuff. This dyeing process is a difficult procedure and must be considered as a source of potential problems. The visible spectra of buffered solutions (pH 7.0) were measured over a wavelength range from 500 to 780 nm (141 data points per spectrum) using a Model 8452A diode-array spectrometer equipped with a Model 9030 workstation (Hewlett-Packard, Palo Alto, CA). Typical spectra are shown in Figure 4. A Pascal program, developed in our laboratory, was used for instrument control, acquisition, and conversion of the data to ASCII format. The error of hue is ~0.15 ΔH unit; the spectral error was estimated to be 0.1 mAU. More details are provided in ref 21.

The first subset consisted of 22 production samples, where the variation in hue was very limited. The second subset included all 22 production samples plus 15 samples with extreme variation in hue. In each subset, half of the samples were used for calibration, while the other half were employed for validation. PCR and PLS models were built using the ICAP software. The ΔH values for those samples with extreme variation in hue correlated almost perfectly (r = 0.998) with the relative


Figure 4. (a) Spectrum of an intermediate finally resulting in a dyeing conforming to type (solid line, ΔH = -0.04) and (b) spectrum of an intermediate leading to a considerably redder dyeing (dashed line, ΔH = 1.6).

amount of either of the two dyestuff intermediates. Computer simulations imitating the present data set were therefore performed by simulating ΔH values covering the experimentally observed range and correlating these to the concentrations, C, of the two analytes. According to eq 1, these C were then multiplied with the pure-compound spectra, which were experimentally available, to calculate the mixture spectra. The simulated ΔH values and the corresponding mixture spectra were then superimposed with the specified amounts of noise. For the first subset, 11 calibration samples and 11 validation samples were generated for each of the 100 simulations, while for the second subset, 19 calibration samples and 18 validation samples were generated for each of the 100 simulations.

RESULTS AND DISCUSSION

Purely Simulated Data Set. As very similar results were obtained for the two analytes, only the results for one analyte in the validation set are presented in Figure 5. From these graphs one can conclude that higher noise in the spectra increases SEP_avg and strongly decreases r_avg. Increased noise in the concentrations seriously increases SEP_avg and decreases r_avg markedly. Noise in the concentrations (errors in the reference method) is the factor that most seriously degrades the performance of multivariate calibration methods. These findings are in accordance with the results reported by others for a criterion related to SEP (ref 8). Increasing the concentration range strongly improves r_avg, whereas the effect on SEP_avg is less clear. In the case shown here, SEP_avg first increases with a larger concentration range but then decreases again somewhat. Other data sets showed that the effect of the concentration range on SEP_avg depends on the different noise levels and that no rule can be found. In the case shown here, augmenting the number of calibration samples slightly improves both SEP_avg and r_avg. Finally, the more similar the spectra, the worse SEP_avg and r_avg, because of the smaller spectral differences between the analytes. The calibration methods studied here appeared to perform equally well. The reason for this is that there were no unmodeled interferences present in this idealized study. These findings essentially confirm what could be expected from theory and compare favorably with results presented in the literature (ref 8). The main benefit of such computer simulations is that the effects of the different parameters can be assessed very easily and quickly and that predictions can be made for real data, as will be shown next.

Penicillin Data Set. Three factors were found to be optimal to describe the experimental data, and additional factors did not improve the concentration estimates. This is in agreement with the presence of the two main compounds and small amounts of impurities combined with minor effects due to experimental difficulties such as different grain size. The results obtained with the penicillin data set are summarized in Table 2. For the corresponding computer simulation, the same errors were used as reported for experimental data (see Experimental Section), but only two factors were used to evaluate the ideally simulated data.

Some interesting conclusions can be drawn from these results. First, it becomes apparent that SEP values alone cannot sufficiently describe the quality of a multivariate calibration. For simplicity, the correlation coefficient is used here as an additional criterion. While the SEP values may be considered acceptable for water and penicillin, the quality of the calibration for penicillin is unacceptable because of the extremely low r value. Hence, the multivariate calibration models built for this data set are useful for the determination of water only. The penicillin content, on the other hand, cannot be predicted satisfactorily. It is worthwhile to note that in the same set of data one of the analytes can be determined successfully, while the second cannot, even though the relative concentration of penicillin is much higher. These findings can be explained as follows. The intensity of the water bands in the spectrum is much higher than that of the penicillin compound. Additionally, the relative concentration range, e.g., expressed as range divided by the smallest concentration, is large for water. As a consequence, a good correlation between the predicted and measured concentrations can be obtained. For penicillin, however, the relative concentration range is much smaller and there is not sufficient variation in the spectra to obtain an acceptable calibration model for penicillin. To get a better calibration model for penicillin, the calibration range must be enlarged. This will be illustrated with the following data set.

The results shown in Table 2 also demonstrate that correct predictions can be made using Monte Carlo simulations. Taking into account that unmodeled compounds were present in the real samples, the results obtained with the computer simulations compare very well with those from real data. For water, the results obtained by simulation match the ones obtained from experiments. For penicillin, however, only the SEP can be estimated with sufficient precision, while the r_avg values obtained in the Monte Carlo simulation overestimate the experimentally observed r. This can be understood from the ideal assumptions made for the simulations, where the data were considered to be strictly linear and where no interferents were present. Since at least the latter is not correct for the experimental data, the computer simulation provides too optimistic an estimate of r_avg. Bearing this in mind, one must expect the real r to be smaller than the one predicted from an idealized Monte Carlo simulation. To further confirm


Figure 5. Effect of different parameters on SEP_avg and r_avg: (a) noise in spectra (spectral error), (b) noise in concentrations (concentration error), (c) concentration range, (d) size of calibration set, and (e) spectral similarity.

Table 2. Results Obtained from Real and Simulated Penicillin Data (a)

               water                              penicillin
               calibration      validation        calibration      validation
               SEP     r        SEP     r         SEP     r        SEP     r
PCR (exptl)    0.100   0.988    0.111   0.986     0.519   0.529    0.623   0.646
PLS (exptl)    0.111   0.987    0.112   0.986     0.516   0.537    0.629   0.634
PCR (simul)    0.100   0.992    0.106   0.992     0.642   0.830    0.658   0.832
PLS (simul)    0.100   0.992    0.106   0.992     0.639   0.832    0.658   0.832

(a) For simulated data, SEP_avg and r_avg are reported as estimates of the SEP and r values.

the correctness of the results obtained with Monte Carlo simulations, the ruggedness of these results was tested by systematically varying the error levels for the penicillin and water concentrations as well as the spectral noise. In all cases, good predictions (in terms of SEP_avg and r_avg) could be made for water, while poor correlations were obtained in virtually all cases for penicillin. This confirms that the results shown in Table 2 are representative.

Dye Data Set. Provided there was no interference in the dye data, where the two analytes add up to 100%, there would be only one degree of freedom and, consequently, one factor would suffice to describe the dye data. In the two subsets of the dye data, however, three factors were found to optimally describe the spectra. This can be explained by the presence of byproducts and turbidity of the samples. The results obtained with the two subsets of the dye data are summarized in Table 3. The first subset, consisting of samples from normal production, did not lead to an acceptable correlation of the measured and predicted ΔH values. Increasing the variability

Table 3. Results Obtained from Real and Simulated Dye Data (a)

               small range of ΔH                  large range of ΔH
               calibration      validation        calibration      validation
               SEP     r        SEP     r         SEP     r        SEP     r
PCR (exptl)    0.117   0.875    0.140   0.901     0.133   0.994    0.138   0.994
PLS (exptl)    0.112   0.887    0.122   0.906     0.130   0.994    0.137   0.994
PCR (simul)    0.146   0.920    0.151   0.917     0.146   0.993    0.151   0.993
PLS (simul)    0.146   0.920    0.151   0.917     0.146   0.993    0.151   0.993

(a) For simulated data, SEP_avg and r_avg are reported as estimates of the SEP and r values.

in the data by addition of samples with larger differences in ΔH, however, markedly improved r while SEP was affected only little. The data set presented here is uncommon in that, first, the mixture was a very simple one and, second, the error in the dependent variable, ΔH, was much larger than the error in the spectra. This situation, where the multivariate analytical method, i.e., UV-visible spectrometry, is much more precise than the currently used reference method, is not very common in multivariate calibration. As the two analytes add up to 100% in the simulated data, there was only one degree of freedom and consequently only one factor was used. As shown in Table 3, the results obtained from computer simulations compare favorably with the ones obtained from experimental data.

Interestingly, correct predictions could be made for very different data sets with Monte Carlo simulations. The dye data set differs from the penicillin data not only with respect to chemistry and application but also with respect to the relative precision of the multivariate analytical instrument employed, as near-IR systems are known to be much less precise than UV-visible spectrophotometers. Thus, successful Monte Carlo simulations are not restricted to a very specific problem. For all data sets analyzed in this study, PLS and PCR give equal results, because the two methods are very similar. However, this equivalence is not necessarily the case for all data sets, and one calibration method may outperform the other in certain circumstances (ref 8). One has to accept that there is no multivariate calibration method that can be considered optimal in all cases.
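The one-degree-of-freedom argument used for the simulated dye data can be checked numerically. In the sketch below, which uses hypothetical four-point pure spectra and hypothetical concentrations, two analytes that add up to 100% give mean-centered, noise-free spectra that are all multiples of the single difference spectrum k1 - k2, so one factor describes them completely.

```python
import random

rng = random.Random(0)

# Hypothetical pure spectra of the two analytes at four wavelengths.
k1 = [0.2, 0.8, 1.0, 0.4]
k2 = [0.9, 0.5, 0.1, 0.3]

# Closure constraint: c2 = 100 - c1 for every sample.
c1 = [rng.uniform(40.0, 60.0) for _ in range(10)]
X = [[c * a + (100.0 - c) * b for a, b in zip(k1, k2)] for c in c1]

# Mean-center the spectra and the concentration of analyte 1.
n = len(X)
mean_x = [sum(row[j] for row in X) / n for j in range(len(k1))]
mean_c = sum(c1) / n
diff = [a - b for a, b in zip(k1, k2)]

# Every centered spectrum should equal (c1 - mean_c) * (k1 - k2).
max_residual = max(
    abs((row[j] - mean_x[j]) - (c - mean_c) * diff[j])
    for row, c in zip(X, c1)
    for j in range(len(k1))
)
print(max_residual)
```

Byproducts or turbidity, as reported for the experimental dye data, add further independent sources of variation, which is consistent with more than one factor being needed there.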

CONCLUSIONS

Based on the knowledge of experimental parameters, Monte Carlo simulations were found to correctly predict the quality of latent variable calibrations for two very different data sets from industrial practice. As shown in this article, the advocated approach is not restricted to a specific problem or analytical technique. Studying a specific multivariate calibration problem with computer simulations can answer the question of whether it is at all possible to solve a given problem. If a Monte Carlo simulation predicts that the problem can be solved within the specifications, it appears worthwhile to collect samples and build a calibration model. Still, one must bear in mind that the simulations described here will provide the best possible results that can be obtained in practice. Hence, the results obtained tend to be too optimistic in certain cases and must therefore be considered with some caution. On the other hand, if a Monte Carlo simulation predicts that the calibration will not meet the requirements, there is probably no use in collecting data and spending resources. In such situations, application of Monte Carlo simulations to latent variable problems could lead to considerable savings in time, money, and material.

While the literature generally focuses on quality criteria that are similar to SEP, this article shows the importance of a correlation-based criterion such as r. In practice, a calibration will not be considered useful unless there is also sufficient correlation (e.g., r > 0.9) between the measured and predicted results. Whatever calibration method is used, it is therefore crucial that both SEP or related criteria and correlation-based criteria be applied to qualify a calibration model.

Received for review October 8, 1993. Accepted January 3, 1994.*

* Abstract published in Advance ACS Abstracts, February 15, 1994.
