

Anal. Chem. 1993, 65, 1301-1303

CORRESPONDENCE

Temperature-Compensating Calibration Transfer for Near-Infrared Filter Instruments

Yongdong Wang† and Bruce R. Kowalski*

Center for Process Analytical Chemistry and Department of Chemistry, BG-10, University of Washington, Seattle, Washington 98195

The real-world application of multivariate calibration methods, such as principal component regression (PCR) and partial least squares (PLS), often requires that a calibration model built on one instrument be transferred to another, e.g., from an instrument located in the laboratory to one on an industrial process. In a previous publication, several multivariate instrument standardization methods were presented and tested (1). Of these methods, piecewise direct standardization (PDS) has been shown to provide the best performance (1, 2). PDS proceeds by calculating a banded diagonal matrix F from the responses of a few standard samples (called the transfer set) measured on both instruments, such that

$$R_1 = R_2 F \qquad (1)$$

where R1 and R2 are the matrices containing responses in the rows from the first and the second instruments, respectively. The nonzero elements in each column of F are calculated from a series of small multivariate regressions. The strongly banded diagonal structure of F reflects the fact that most responses measured in analytical chemistry are continuous functions of the response channels. Therefore, when the response channel on the second instrument is shifted with respect to that on the first instrument, information about the shift is most likely to be found in a restricted local region on the second instrument. As a result, a linear combination of the responses in this local region is needed to reconstruct the response on the first instrument.

Besides multivariate continuous responses, multivariate calibration has been applied to discrete responses obtained from filter spectrometers (3), sensor arrays (4), etc. Where discrete responses are concerned, the banded diagonal structure of F no longer holds and a modification of PDS is needed.

Among the various factors that can cause differences between R1 and R2, temperature variation is important in an industrial process analysis environment, where an on-line process analyzer is used to monitor the chemical process in real time. The temperature effect can be so dramatic that either a separate calibration model is needed for each temperature or a general calibration model is needed to account for the temperature effect and cover the entire temperature range. Both approaches require the measurement of a large calibration set covering many different temperatures and pose a severe burden for process analytical chemists. In this correspondence, an extension of PDS is presented to standardize the discrete responses measured on a filter spectrometer for sample sets run at different temperatures.

* To whom correspondence should be addressed.
† Current address: The Perkin Elmer Corp., 761 Main Ave., Norwalk, CT 06859-0284.
(1) Wang, Y.; Veltkamp, D. J.; Kowalski, B. R. Anal. Chem. 1991, 63, 2750.
(2) Wang, Y.; Lysaght, M. J.; Kowalski, B. R. Anal. Chem. 1992, 64, 562.
(3) Isaksson, T.; Naes, T. Appl. Spectrosc. 1988, 42, 1273.
(4) Carey, W. P.; Beebe, K. R.; Kowalski, B. R. Anal. Chem. 1987, 59, 1529.
0003-2700/93/0365-1301$04.00/0
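The PDS calculation behind eq 1 can be illustrated with a minimal Python sketch. It assumes that both instruments have the same number of channels, that each local regression is an ordinary least-squares fit (the original PDS work may use PCR or PLS for these small regressions), and that the window half-width `half_width` is a free parameter; the function name `pds_banded` and the synthetic data are hypothetical and do not come from the paper.

```python
import numpy as np

def pds_banded(R1, R2, half_width=1):
    """Estimate a banded F with R1 ~ R2 @ F from transfer-set responses.

    R1, R2: arrays of shape (n_samples, n_channels); rows are samples.
    half_width: assumed window half-width (channels on each side).
    """
    n_channels = R1.shape[1]
    F = np.zeros((n_channels, n_channels))
    for j in range(n_channels):
        lo = max(0, j - half_width)
        hi = min(n_channels, j + half_width + 1)
        # One small regression per channel: channel j of instrument 1
        # is fitted from the local window on instrument 2.
        b, *_ = np.linalg.lstsq(R2[:, lo:hi], R1[:, j], rcond=None)
        F[lo:hi, j] = b
    return F

# Synthetic check with made-up data (not from the paper).
rng = np.random.default_rng(0)
n_samples, n_channels = 6, 10
R2 = rng.normal(size=(n_samples, n_channels))
F_true = np.zeros((n_channels, n_channels))
for j in range(n_channels):
    lo, hi = max(0, j - 1), min(n_channels, j + 2)
    F_true[lo:hi, j] = rng.uniform(0.2, 1.0, size=hi - lo)
R1 = R2 @ F_true
F_est = pds_banded(R1, R2, half_width=1)
print(np.allclose(F_est, F_true))
```

Because every column of F is fitted from only a few neighboring channels, a handful of transfer samples can suffice, which is the practical appeal of the banded structure for continuous responses.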

THEORY

As mentioned in a previous paper (1), the instrumental variations considered in standardization are not limited to instrument-to-instrument differences. They can also arise from differences in sample physical conditions, e.g., differences in sample particle sizes, sample finishes, or sample temperatures. In the practice of process analytical chemistry, a situation is sometimes encountered where an on-line instrument is used to monitor a process stream with fluctuating temperature. Since any change in fundamental responses resulting from a temperature change can be regarded as a combination of a response channel shift and an intensity change, PDS is expected to be able to standardize the responses measured at one temperature to those of another, thus eliminating the need to calibrate at each temperature.

In the development of PDS, it was assumed that the instrument gives a continuous response curve with respect to response channels. Therefore, PDS is not directly applicable to the discrete responses obtained from a filter spectrometer or sensor array. Nonetheless, these discrete response channels are highly collinear for a given set of samples. This consideration leads to a modification of PDS for discrete responses. Instead of a banded diagonal matrix F, a sparse matrix F, in which every column contains a limited number of nonzero elements, is calculated such that a best fit is attained for eq 1. The problem of determining F is then to find, for every channel in R1, a linear combination of a subset of discrete response channels in R2 that provides the best fit, rather than a best linear combination of response channels in a local window as for continuous responses.
This can be accomplished by use of all possible regressions, in which all possible combinations of a certain number (k) of response channels in R2 are searched to obtain a best subset of response channels (filters) for each channel in R1. The rank of the calibration response matrix R1 provides a good estimate for k. Unfortunately, when the total number of discrete response channels exceeds a certain limit, say 20, this procedure becomes very time-consuming. Instead of using an iterative stagewise regression method, the same procedure used for selecting transfer samples, the subset selection proposed in ref 1, is used to first reduce the whole set of response channels to a manageable set (e.g., with fewer than 10 channels). The subset selection proceeds by selecting the channel with the highest leverage as the first channel and orthogonalizing the measurements on the remaining channels with respect to those on the channel just selected. This selection process is repeated until a specified number of channels have been included. Since the channels in this reduced set are those with the highest leverage, they should provide a compact representation of the whole set of response channels. In this case, all nonzero elements in the transformation matrix F may appear only in those rows corresponding to this reduced set of response channels.

© 1993 American Chemical Society

ANALYTICAL CHEMISTRY, VOL. 65, NO. 9, MAY 1, 1993

Figure 1. Near-IR filter spectra of 120 corn samples (at 45 °C). [Plot omitted; horizontal axis: wavelength (nm), 1400-2400.]

Figure 2. Mean-centered near-IR filter spectra of 120 corn samples (at 45 °C). [Plot omitted; horizontal axis: wavelength (nm), 1400-2400.]

Figure 3. Difference spectra of 20 samples measured at 45 and 30 °C. [Plot omitted; horizontal axis: wavelength (nm), 1400-2400.]
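The two steps described in the Theory section (reducing the channels by leverage with orthogonalization, then searching all k-channel subsets for each channel of R1) can be sketched in Python as follows. This is only an illustration under stated assumptions: the paper does not give a formula for channel leverage, so the sketch uses the squared loadings on the first `rank` singular vectors of the mean-centered responses; subsets are ranked by residual sum of squares from an ordinary least-squares fit; and the names `select_channels` and `discrete_pds`, the parameters, and the synthetic data are all hypothetical.

```python
from itertools import combinations
import numpy as np

def select_channels(R, n_select, rank):
    """Pick n_select high-leverage channels (columns of R), deflating each time.

    Assumes n_select does not exceed the rank of the mean-centered data.
    """
    X = R - R.mean(axis=0)  # mean-centering is an assumption of this sketch
    selected = []
    for _ in range(n_select):
        _, s, Vt = np.linalg.svd(X, full_matrices=False)
        r = min(rank, int(np.sum(s > 1e-10 * s[0])))
        leverage = np.sum(Vt[:r] ** 2, axis=0)  # assumed leverage definition
        if selected:
            leverage[selected] = -np.inf  # never pick a channel twice
        j = int(np.argmax(leverage))
        selected.append(j)
        x = X[:, j]
        # Orthogonalize every channel against the channel just selected.
        X = X - np.outer(x, x @ X) / (x @ x)
    return selected

def discrete_pds(R1, R2, candidates, k):
    """Sparse F with R1 ~ R2 @ F; each column uses the best k candidate channels."""
    F = np.zeros((R2.shape[1], R1.shape[1]))
    for j in range(R1.shape[1]):
        best_rss, best_cols, best_b = np.inf, None, None
        for subset in combinations(candidates, k):  # all possible regressions
            cols = list(subset)
            b, *_ = np.linalg.lstsq(R2[:, cols], R1[:, j], rcond=None)
            rss = float(np.sum((R1[:, j] - R2[:, cols] @ b) ** 2))
            if rss < best_rss:
                best_rss, best_cols, best_b = rss, cols, b
        F[best_cols, j] = best_b
    return F

# Synthetic check with made-up data (not from the paper).
rng = np.random.default_rng(1)
n_samples, n_channels, k = 12, 8, 2
R2 = rng.normal(size=(n_samples, n_channels))
F_true = np.zeros((n_channels, n_channels))
for j in range(n_channels):
    rows = rng.choice(n_channels, size=k, replace=False)
    F_true[rows, j] = rng.uniform(0.5, 1.5, size=k)
R1 = R2 @ F_true
F_all = discrete_pds(R1, R2, range(n_channels), k)
candidates = select_channels(R2, n_select=5, rank=3)
F_sub = discrete_pds(R1, R2, candidates, k)
```

The channel reduction matters because the search grows combinatorially: with 20 channels and k = 7 there are 77520 subsets to fit for each channel of R1, whereas a reduced set of 9 channels leaves only 36. Note also that when the number of transfer samples does not exceed k, every subset fits the transfer set exactly and the residual criterion used in this sketch can no longer discriminate between subsets.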

EXPERIMENTAL SECTION

Near-IR reflectance spectra of 120 corn samples (denoted as R1) were collected at 45 °C with a fixed filter spectrometer for the prediction of two analytes, denoted here as A and B. Figure 1 shows all 120 spectra, measured with 20 fixed filters. The compositional information is contained in very small sample-to-sample variations. Figure 2 shows all 120 spectra after mean-centering, which reveals the range of spectral variation between samples. If the spectral variation caused by sample temperature fluctuations is larger than that shown in Figure 2, it is expected that the predictive ability of the calibration model will be lost and a new calibration model will be needed, which requires the remeasurement of all 120 samples at every different temperature.

In order to study the calibration transfer from spectra measured at one temperature to another, 20 samples were selected from the above-mentioned calibration set and each was run at four different temperatures, 30, 50, 60, and 70 °C. Out of these 20 samples, a subset was selected (1) as transfer samples to standardize the spectra from every other temperature to 45 °C. The rest of the samples were used as a test set to calculate the standard error for prediction (SEP) as a measure of the standardization performance

$$\mathrm{SEP} = \left( \sum_{i=1}^{n} (c_i - \hat{c}_i)^2 / n \right)^{1/2}$$

where $c_i$ is the true concentration of a test sample, $\hat{c}_i$ is the estimated concentration of this sample, and $n$ is the total number of samples in the test set.
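As a small illustration of the SEP definition above, the following Python sketch computes the root of the mean squared prediction error over a test set. The function name `sep` and the concentration values are hypothetical and are not data from this study.

```python
import numpy as np

def sep(c_true, c_pred):
    """Standard error for prediction: sqrt(sum((c_i - c_hat_i)^2) / n)."""
    c_true = np.asarray(c_true, dtype=float)
    c_pred = np.asarray(c_pred, dtype=float)
    return float(np.sqrt(np.mean((c_true - c_pred) ** 2)))

# Made-up concentrations (in %), for illustration only.
c_true = [37.0, 38.0, 39.0]
c_pred = [37.1, 37.9, 39.2]
sep_demo = sep(c_true, c_pred)
print(round(sep_demo, 3))
```

Here the squared errors are 0.01, 0.01, and 0.04, so the SEP is the square root of 0.02, about 0.141.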

RESULTS AND DISCUSSION

Figure 3 shows the difference spectra of the 20 samples measured at 30 °C and at 45 °C. As can be seen, the spectral variations from this temperature change are far larger than the sample-to-sample variations on which the calibration model is based. When the calibration model built at one temperature is applied to samples collected at a different temperature, the prediction error is found to be larger than the analyte concentration range in the calibration set. Therefore, standardization is necessary in order to obtain acceptable prediction and avoid recalibration.

A calibration study at 45 °C by use of partial least squares and principal component regression indicates that PCR needs six to seven factors to achieve about the same standard error for prediction as PLS with four to five factors (Table I), which is a common phenomenon in near-IR spectroscopy. This result suggests that six or seven factors are needed to model the spectral variations contained in the calibration set while only four or five factors are directly related to the compositional variations. In light of this observation, k is first chosen as 4 (with four transfer samples) and then as 7 (with nine transfer samples). Two additional transfer samples (nine rather than seven) were included for the k = 7 case to avoid possible overfitting with all possible regressions, which becomes more of a problem as the number of variables increases.

When four transfer samples are used in standardization, the spectral differences between 30 and 45 °C are significantly reduced (Figure 4 as compared to Figure 3), indicating that most of the spectral differences have been compensated for through standardization. From the PLS prediction results listed in Table I, it can be seen that, for the prediction of analyte A, SEP from standardization at all temperatures is already comparable to that of full set recalibration, while the standardization SEP for analyte B is unacceptably high. This suggests that the predictive information for analyte B is more subtle than that for analyte A, and more transfer samples are needed to transfer this subtle information. This observation is consistent with the cross-validation results listed in Table I: the rank required for analyte B is always larger than that for analyte A, using either PLS or PCR.

Indeed, when the number of transfer samples is increased to nine and k is set to 7, the spectral difference is further reduced, as can be seen from Figure 5. While no statistically significant improvement in SEP (with PLS) is observed for analyte A (Table I), the standardization SEP (with PLS) for analyte B is improved at all temperatures, indicating that the subtle spectral information important to analyte B has become more visible after the correction of the temperature variation through standardization. However, it is also clear that some information cannot be recovered by standardization, possibly due to more complex spectral variations caused by the temperature variation.

Table I. SEP from Temperature Standardization

                                 % A          % B
concn range                      36.8-39.4    53.2-55.9
PLS CV (rank) at 45 °C           0.11 (4)     0.19 (5)
PCR CV (rank) at 45 °C           0.11 (6)     0.19 (7)
standardization (4 samples)
    30 °C                        0.12         1.71
    50 °C                        0.17         0.95
    60 °C                        0.15         0.65
    70 °C                        0.16         1.11
standardization (9 samples)
    30 °C                        0.16         0.36
    50 °C                        0.15         0.37
    60 °C                        0.18         0.33
    70 °C                        0.15         0.39

Figure 4. Difference spectra of 20 samples measured at 45 and 30 °C after standardization with four samples. [Plot omitted; horizontal axis: wavelength (nm).]

Figure 5. Difference spectra of 20 samples measured at 45 and 30 °C after standardization with nine samples. [Plot omitted; horizontal axis: wavelength (nm).]

CONCLUSIONS

This study has shown that it is possible to standardize between the discrete responses from filter spectrometers with some modification of PDS. In fact, we have modified PDS to standardize between continuous and discrete responses, e.g., between a scanning spectrometer and a filter spectrometer. The study indicates that temperature variation can be regarded as a special type of instrumental variation and that PDS can be applied to standardize between different temperatures. It is seen, however, that more transfer samples may be needed to attain acceptable prediction because of complicated temperature effects on fundamental spectral features. Another publication (5) demonstrated the possibility of utilizing generic standards instead of subset samples for standardizing instrumental differences. If the difference is caused by the temperature fluctuations studied in this paper, only real subset samples can be used as transfer samples, since the collinearity pattern among response channels differs from one set of samples to another. This collinearity pattern serves as the basis for discrete PDS.

ACKNOWLEDGMENT

This research was supported by the Center for Process Analytical Chemistry (CPAC), a National Science Foundation Industry/University Cooperative Research Center at the University of Washington. The authors thank Mr. Michael J. Blackburn and Dr. Robert T. Kean at Cargill, Inc., for providing the data set used in this study and many useful comments.

RECEIVED for review October 27, 1992. Accepted January 26, 1993.

(5) Wang, Y.; Kowalski, B. R. Appl. Spectrosc. 1992, 46, 764.