Anal. Chem. 1995,67,2379-2385
Additive Background Correction in Multivariate Dnstrument Standardization Z y i Wang, Thomas Dean, and Bruce R. Kowalski*
Center for Process Analytical Chemistry, Laboratory for Chemometncs, Department of Chemistty, BG- 10, University of Washington, Seattle, Washington 98195
Nonideal instruments give responses that do not follow Beer's law. Most spectrometers have some component of stray light that adds a structured, rather than constant or linear, background to absorbance spectra. When instrument standardization is performed using the multiplicative model introduced by Wang et al. (Anal. Chem. 1991, 63, 2750), the model does not explicitly include the additive term. This term, if uncorrected, can cause multivariate standardization to provide an incomplete transfer of calibration models between instruments. Using both simulation and real data, the current paper demonstrates the effect of the additive background term in multivariate standardization and provides an effective means by which such additive ditrerences can be corrected.
* The calibration transfer problem is ubiquitous in instrumental analysis. In most chromatography, the problem is to assign identities to peaks whose retention times vary as a function of the sample, the column age, and the temperature of the system, among other factors. In spectroscopy, concerns such as wavelength registration, instrument bandwidth, and stray light characteristics of spectrophotometers are among the myriad of effects that cause calibration models to work on one instrument, yet fail on another. Multivariate instrument standardizationsolves the calibration transfer problem by applying chemometric techniques to find a transformation that makes the measured response obtained from one instrument the same as that which would be obtained on a second instrument if the sample were measured on the second instrument. Four methods were presented in a recent paper,l all based on measurements of a small set of samples. All the methods allow full responses of the instruments to be utilized without restrictions being placed on the calibration model. Of the four methods, direct standardization (DS) and piecewise direct standardization (PDS) have shown the greatest potential. Neither method provides correction in the case where a significant baseline d ~ e r e n c ebetween instruments exists. This paper examines this particular limitation of both the DS and the PDS methods and presents an improvement in the proposed model. Since DS shares a common formulation with PDS, the mathematics are derived on the basis of the general DS method. In the following, DS and PDS with the improved algorithm are referred to as DSB and PDSB (direct and piecewise direct standardization with background correction), respectively. DS and DSB methods (1) Wang, Y.; Veltkamp, D.J.; Kowalski, B. R Anal. Chem. 1991,63, 2750.
0003-2700/95/0367-2379$9.00/0 0 1995 American Chemical Society
are compared using a simulated data set, while PDS and PDSB are used to demonstrate the background correction on an experimental data set. THEORY Throughout this paper, scalars are represented by italic lowercase letters, column vectors are represented by boldface lowercase letters, row vectors are represented by the transpose of a column vector, and matrices are represented by boldface capital letters. A bar over a vector or matrix indicates that the vector or matrix has been meancentered. The dimensions of the vectors are denoted as s (the number of samples for calibration), w (the number of measured wavelengths), and n (the number of samples for standardization). Second derivatives are estimated as second differences. Assume the response matrix (dimensioned s x w) for a full calibration set on the primary instrument is XI. The matrix SIis the response matrix of a small subset of samples in XI. The response matrix of the same subset measured on the secondary instrument is denoted as %. The response matrix of the full set of samples measured on the secondary instrument is represented as &. Original DS Method. In both DS and PDS, the response of an unknown sample on the secondary instrument, rM, is corrected by a transformation matrix, F, to be the response of that same sample on the first instrument, nun.The estimated response is as follows:
F is a square matrix dimensioned w x w. It is calculated using the transfer samples (SI,S2) as a solution of SI = S,F
(2)
This equation can be solved in the least-squares sense, giving
F = S$,
(3)
where S l is the generalized inverse of S2. As proposed in ref 1,
XZ, SI,SZ, and nunare projected onto the column space of XI calculated by singularvalue decomposition. Therefore, the scores obtained by projection are used in place of original spectra. Original PDS Method. The calculation of the transfer matrix, F,in PDS is different from that in DS in that it does not use the entire spectrum of all transfer samples to correct each wavelength on the second instrument. Instead it uses only those channels in Analytical Chemistty, Vol. 67, No. 14, July 15, 1995 2379
a local window around the channel of interest. This assumption is valid because with real instruments the response at a particular channel is going to be related to its neighbors but not to responses far removed, except in those cases where the response is correlated through the chemistry. For a vector of transfer responses, sl,i,measured at wavelength i on the primary instrument, a window from index i - j to i k is used to calculate the correction for responses, S Z , ~ .The vectors sl,i and s2,i refer to the respective ith columns of the matrices SI and SZ.Thus a small matrix dimensioned n x 0’ k + 1) is used to generate the regression vector
+
+
Given the regression vector, the future sample response, Plum,i, is estimated by
(5) For simplicity of application, the small regression vectors can be arranged along the diagonal of a correction matrix, F. Multiplication by the correction matrix gives a correction for each wavelength except the first j and last k, which can be extrapolated or truncated. Detailed descriptions of PDS and DS are given by Wang, Veltkamp, and Kowalskil Topics such as the choice of the values for i and j in PDS and the minimum number of samples required for DS are covered in that paper. Additive Background Problem. The above formulation provides only a multiplicative correction for differences between instruments. In many cases there are further, additive differences. An example of an additive difference would be if the source drifted slightly between measurements of background and sample with a single beam spectrometer such as a FT-near-IR Another example could occur with dual beam spectrophotonleters. The two beam paths are never identical, so there may be an element in the sample path that is different from the background path. If that element were different between two instruments, there would be an additive background problem. If the additive background is large enough, multiplicative corrections alone will not accomplish the correction. Background problems in instrument standardization are summarized by Workman and Coates.2 Two background correction methods were used in the standardization methods proposed by Wang et a1.l One approach involves taking the second derivative of the spectra on each instrument. The other method is to mean-center the responses from both instruments according to the grand mean of the first instrument. The first approach gives only moderate improvement unless the background is a line of constant slope on each instrument. This will rarely be the case because the background is almost always structured. The second approach is valid only if the backgrounds of both instruments are assumed to be nearly identical, in which case no significant additive difference exists anyway. Neither of these approaches corrects the general additive background difference problem. Proposed Background Correction Method (DSB). In contrast to the methods described above, the proposed method for background correction removes the additive term from the (2) Workman, J. J.; Coates, J. Spectroscopy 1993,8, 36.
2380 Analytical Chemistry, Vol. 67, No. 14, July 15, 1995
transfer standards prior to standardization,estimates the correct multiplicative model, and then estimates the additive correction required. The removal of the additive term is achieved by meancentering the transfer sample spectra prior to application of either DS or PDS. For the sake of completeness, the additive background correction will be presented in the framework of a calibration. The model is the “inverse”calibration model common in multivariate analysis. The specific method used to construct the model is not addressed because the standardization method corrects the spectra from one instrument to be like the spectra from a second instrument. Because the method corrects the spectra and not the model, the calibration model should be irrelevant to the success of the standardization. The inverse modeling methods differ in the way the regression vector 6 is estimated. In this case, the additive effect corrected by meancentering in calibration, as well as the additive background difference correction in standardization,will be explicitly specified. (1) Calibration, The model on the first instrument is constructed according to
where yl is the s x 1concentration vector of the analyte of interest, XI is the s x w response matrix, is the w x 1regression vector, and bl is the constant offset of the model. After meancentering,
71 = it,P
(7)
0
where 71and tlare the meancentered concentration vector and response matrix, respectively. An estimate of the regression vector, 6, is found from least-squares minimization to be
From eq 6,
where ylm is the mean of yl and xTmis the mean vector of XI. (2) Standardization. A w x 1 background vector, b,, is introduced to accommodate the additive background difference between the instruments:
+
S, = S2Fb lb;
(10)
where SIand SZare the n x w response matrices of the transfer subset samples. Fb is the w x w transformation matrix calculated with the background correction. The term lb: will be written as B,. An n x n centering matrix is defined by C, = I,, - l/nllT, where I,, is an n x n identity matrix. One of the properties of this matrix3is
X=C&
(11)
(3) Mardia, K. V.; Kent, J. T.; Bibby, J. M. Multivariate Analysis; Academic Press: London, 1979 p 462.
-.-.0.01
I
F.
J
3E 0.008 m e 0.006
92 0 . 0 0 4
F
Lactate
0.002
1000
1200
1400
1600
1800
2000
2200
2400
2600
Wavelength Figure 1. Pure spectra for casein, glucose, and lactate.
where X is any n C, gives
x
m matrix. Multiplying both sides of eq IO by
c,s1 = c,s,Fb + C,B, Because B, has identical rows, C,B, = B, = O,,.
(12) Therefore,
= s,Fb
(13)
pb --s+2%
(14)
s 1
By least squares,
1w Gl"O0U
100 QUI"
Figure 2. Mixture design with closure between three constituents.
Similarly, by multiplying both sides of eq 10 by I/,,lT,
'/,lTS1 = '/,lTS2Fb+ '/,lTb: Therefore, ST lm
- T - S2mFb + b:
(16)
where SI,,,and S2m are the mean vectors of matrices SIand respectively. Rearranging eq 16 gives
casein, glucose, and lactate were taken from a data set studied by Naes et al.4 ( F i i r e 1). A mixture design with closure among three constituents, as suggested by Naes et al.,4is shown in Figure 2. The open circles on the vertices of the triangle are the 66 calibration samples. The dots are the 55 prediction samples. The mixture spectra for both calibration and prediction samples were obtained by
SZ,
b, = slm- F;f9,, It should be noted that there is no d ~ e r e n c ebetween DSB and PDSB except in the way that the F matrix is calculated, so the derivation shown here is identical for both correction methods. (3)Prediction. After calibration and standardization, the concentration estimate for any spectrum, xzun,measured on the secondary instrument can be obtained using the calibration model developed on the primary instrument, as
It should be noted that, similar to DS, DSB also uses score values derived from the singular value decomposition instead of original spectra. EXPERIMENTAL SECTION
Simulated Data. To demonstrate the improvement of DSB over DS in cases with an additive background, pure spectra for
where yc,yg, and 3 are the concentrations of casein, glucose, and lactate, respectively. Due to the closure constraint, yc yg q j= 100. The vectors, K ,q, and w are the pure spectra for casein, glucose, and lactate, respectively. The simulation was done as linear combinationsof the pure component spectra with no added noise. The expected rank would be 2 with meancentering or 3 without meancentering. The prediction errors should be zero because there is no random noise component added to the simulated data. To simulate different instrument background effects, two baselines were added to the spectra, as shown in Figure 3. Baselines 1 and 2 were simulated to be the backgrounds of the primary and secondary instruments, respectively. Baseline 1was added to the spectra of calibration samples to obtain the response matrix, XI, from the primary instrument. Baseline 2 was added to the spectra of the test samples to obtain the response matrix, &. Due to the closure constraint, the rank of the response matrix XI is 2. Three subset spectra, SI,were selected from XI according
+ +
(4) Naes, T.; Isaksson, T.: Kowalski, B. R A n d . Chem. 1990,62,668.
Analytical Chemistry, Vol. 67, No. 14, July 15, 1995
2381
1.20 1.00
8
0.80
C
a
2
s:
0.60
s 4 0.40 0.20
I
Baseline 1
0.00 1000
1200
1400
1600
1800
2000
2200
2400
2600
Wavelength Figure 3. Baselines added in simulation and subset samples chosen for transfer. 1.8
1.6 1.4
2 e 2 a
0
1.2 1 . 0.8 0.6
9
0.4
9
-0.2 800
*
.
900
1000
1100
.
.
1200
1300
1400
1500
1600
Wavelength (nm) Figure 4. Mean spectra of transfer samples on instruments A and B and the difference between those means. The mean transfer sample from instrument B is significantly larger than that from instrument A. The arrow indicates the region used to determine tolerance values.
to the leverage at rank 2, as was proposed in ref 1. The subset spectra SIand Sa are shown in Figure 3 as well. All computations were done using the Matlab environment m e Math Works, Inc., Natick, MA). Near-Infrared Spectral Data. The data from the measurement of a set of 30 pseudo-gasoline samples measured in transmission mode, from 800 to 1600 nm with a 0.6667 nm sampling interval, on two separate scanning instruments from the same manufacturer were provided by Amoco Corp. and are described by Wang and Kowalski.5 The analytes consisted of three saturated hydrocarbons and two aromatics, designated S1, S2, S3, Al, and A2,respectively. All comparisons are made with a rank 8 partial least-squares (PLS)model constructed on the basis of the data from instrument A for each analyte. The rank of the model was determined by leave-one-out PLS cross-validation and is greater than 5 because of interactions between the analytes. Three transfer samples were chosen by highest leverage assuming a model rank of 8. The mean spectra for the transfer samples and the difference between the means are plotted in Figure 4. The difference is shown as the dashed line at the base of the plot. A PDS model based on the three transfer sample responses was calculated and used to (5) Wang, Y.; Kowalski, B. R. Appl. Spectrosc. 1992,46, 764.
2382 Analytical Chemistry, Vol. 67,No. 14, July 15, 1995
standardize the instrument B responses to those of instrument A. The window used was three channels wide, centered on the channel of interest. The rank of each small local regression vector was chosen by truncating all singular values with a value less than a predetermined tolerance. The tolerance was determined by taking the square root of the mean of the singular values of the covariance matrix of the transfer samples measured in a region where all samples were at baseline. In this case, the wavelength region between 856 and 861 nm was used. This is the same as the square root of the mean of the eigenvalues of the responses of the transfer samples in that region. The 27 samples not used for transfer were used as a test set. The PLS models were used to estimate the concentrationsof each of the test samples. The root-mean-squarederror of prediction (RMSEP) is reported for a variety of additive background correction methods. Figure 2 gives the standard deviations of the concentrations of the test samples and the root-mean-squared error of a rank 8 cross-validation (RMSECV) of the 30 samples measured on instrumentA. The root-mean-squared error of crossvalidation is calculated as
Table 1. Comparison of Direct Standardization Methods in Terms of Root-Mean-Square Error of Prediction for One Simulated Data Set, with Different Instrument Backgrounds.
concn range (% comp) std devn of concn RMSEP" DS DS w/second deriv DSB a
casein
glucose
lactate
3-93 24.7
2-92 24.7
5-95 24.7
10.37 2.15 0.0
26.8 1.94 0.0
37.14 4.08 0.0
The RMSEP results were obtained at rank 2, with three transfer
samples.
where 9 is the estimated concentration, yt is the true concentration, and N is the number of estimatedvalues. The formula is the same for RMSEP, except that the concentration estimate for RMSECV is obtained by building a model including all samples except the one to be estimated and then using the model to predict the concentration of the one left out. The concentration estimate for RMSEP is obtained by using an independent test set to build the calibration model. Because each sample is left out only once, RMSECV tends to give lower values than RMSEP. Table 2 shows the results of prediction using a variety of transfer and baseline correctionmethods. The table shows results of prediction without transfer, with only second derivatives and no standardization models, with standardization without background correction, with transfer with standardization after subtracting the mean of instrument A from all samples, with transfer using standardizationafter centering each instrument on the basis of the mean of the transfer samples, with transfer using standardization after background correction using second derivatives, and with transfer using standardization combining the subtraction of the mean of instrument A with second derivatives. RESULTS AND DISCUSSION Table 1 shows the standardization results from DS, DS with second derivatives, and DSB on the simulated data set. The RMSEP for the samples that are measured on the second instrument is zero for all three constituentsfrom the DSB method. Obviously, in such a case, DS is not able to achieve equivalent results due to the different instrument backgrounds. DS with second derivatives improves the prediction errors significantly over DS. However, it still cannot achieve zero prediction errors in the zero noise case. The improvement with the second derivative is due to the fact that the baselines are broad and mostly flat. The second derivative still could not give a perfect transferability because the baselines are not linear. An investigation of the RMSEPs in Table 2 shows that the additive background correction method gives an improvement over standardizationwithout correction. On average for the five analytes, PDSB gave results that were 2.3 times the RMSECV for a single instrument, compared to results that were 6.2 times the RMSECV using uncorrected PDS. Table 3 shows the squared correlation values6for standardizationwithout additive background correction, with additive background correction using second (6) Martens, H.; Naes, T.Multivariate Calibration by Data Compression. In NearInfrared Technology in the Agricultural and Food Industries; Williams, P., Noms, IC, Eds.; American Association of Cereal Chemists, Inc.: St. Paul, MN, 1987; p 70.
derivatives, and with correction based on mean centering the transfer samples. The squared correlation values, P,are calculated as
R2 = (C? - RMSEP)/C? In this equation, 0 is the standard deviation of the concentrations of the samples in the prediction set and not the uncertainty in the concentrations. The case where all the samples on each instrument are used to calculate the mean vector is included to show the improvement available by better determination of the mean response vector. The percent variance can be obtained from the squared correlation value by multiplying by 100. The improvement by using a better estimate of the mean is not very significant. The percent variance described &r second derivative background correction is significantly reduced compared to that with no correction in almost all cases except for A2,where the improvement is marginal. In the case of S1, second derivatives generated a predicted spread nearly twice as large as the standard deviation of the concentrations. Second derivatives as background correction for these data do not appear to be of any use. The RMSEPs for the case where the mean of all responses measured on instrumentA is subtracted as an additive background correction from both instruments show a degradation when the model is transferred. This is most likely because meancentering causes a loss of one rank in the instrument A data. Due to the difference in the additive backgrounds, this subtraction does not reduce the rank of the instrument B data. This loss of information in the primary instrument data limits the rank of the transfer model. Assuming that either system can be modeled by the same rank model, the reduced rank of the instrument A responses prevents proper spanning of the instrument B responses, which will be one rank higher. This means that there is not enough information in the instrument A data left to do a correction for the difference between the instruments. Because the data set was limited to only 30 samples, a true prediction value for the primary instrument is unavailable. Further, RMSECV values are generally smaller than RMSEP values, for reasons mentioned earlier. The PDSB prediction results compare favorably with the RMSECV on the primary instrument and indicate a successful transfer of the model to the second instrument. A final practical point regarding the implementation of PDS and PDSB is the choice of the tolerance value used in the algorithm. Because the algorithms generate a large number of regression models, a method for choosing optimal rank is required. In this study, a tolerance value was used. In building the principle components regression (PCR) models, all singular values found to be lower than the tolerance were disregarded. Figure 5 shows the effect of changing the tolerance on prediction error when using PDSB. In Figure 5, the rank was allowed to vary between 0 and 3, with all singular values less than the tolerance discarded in calculating the regression model. Allowing the rank to go as high as 2 or 3 allows PDSB to locally overfit the transfer data. If PDSB is being used, the maximum allowable rank is min(n - 2, m - Z), where tz is the number of transfer samples and m is the number of wavelengths in the window. There is a loss of one rank due to mean centering. If the rank is as high as min(n, m),then the model will span all variance in that domain and will overfit the data. If no additive correction is made Analytical Chemistry, Vol. 67, No. 74, July 75, 7995
2383
Table 2. Comparlson Standard Error of Prediction for Five Analytes in Pseudo-Gasoline Samples Using Seven Background Correction Methods
std devn of concn cross-validationon instrument A (rank 8) cross-validationon instrument A using second deriv spectra (rank 6) prediction of B from A no standardization no standardization,second deriv spectra standardization,no background correction standardization,with no background correction, second deriv spectra standardization,spectra centered by subtracting the mean of instrument A from both instruments standardization,spectra centered b subtracting the mean of instrument A from bo& instruments, second deriv spectra standardization,spectra separately centered using the means of the corresponding transfer sets
s1
s2
A1
A2
s3
6.50 0.46 0.65
5.41 0.14 0.22
6.44 0.08 0.20
3.58 0.14 0.36
9.12 0.41 0.50
10.7 31.57 3.46 6.40
1.3 11.15 0.70 1.71
1.5 7.17 0.39 1.10
3.0 10.60 0.82 0.63
8.5 23.17 3.10 6.61
12.01
4.75
2.16
4.10
9.03
5.29
3.43
2.16
5.41
2.25
0.66
0.31
0.30
0.38
0.58
Table 3. Squared Correlation ( P )after Standardization Using Three Addltive Background Correction Methods.
no correction of additive differences second deriv to remove additive differences mean of the transfer sets to remove additive differences mean of all samples to remove additive differences “The values are calculated as the ratio interest.
s1
s2
A1
A2
s3
0.7166 0.0305 0.9897 0.9941
0.9832 0.9001 0.9967 0.9982
0.9963 0.9708 0.9978 0.9988
0.9475 0.9690 0.9887 0.9951
0.8844 0.4747 0.9960 0.9980
(u2 - RMSEPZ)/u2,where u is
the standard deviation of the concentrations for the component of
12.00 10.00
8.00
R S
E P
6.00 4.00 2.00
0.00
6
5
4
3
2
1
-log Tolerance Figure 5. RMSEP of the five analytes as a function of the tolerance used in generating the PDSB model.
(PDS), the maximum rank is min(n - 1,m - 1). The minimum allowable rank is 0, meaning that both instruments give no response in that region for all transfer samples. CONCLUSIONS The problem of using an additive background correction before calculation of a multiplicative standardization model has been addressed, and methods to correct this problem have been examined. Both direct standardization and piecewise direct standardization showed significant improvements after providing a correction for additive background dBerences. The correction used consisted of meancentering each set of transfer samples to remove any constant baseline differences. Second derivatives would not achieve the same results because they remove only 2384 Analytical Chemistry, Vol. 67,No. 74, July 75, 1995
uniform or linear baselines and not the structured baselines most often seen in real instruments. The simulation results show that in the simplest case of no noise, a multiplicative model cannot accommodate an additive difference, but by using an additive background correction followed by the multiplicative correction, the error can be reduced to zero. The experimental results show that in a real application, correction of the constant background improves results tremendously. This points out an obvious next step in improving standardization methods. The additive correction has removed the baseline difference. The linear methods, direct and piecewise direct standardization, correct first-order differences. A nonlinear correction may make it possible to standardize instruments working
in the nonlinear regions of their detectors. The problem will be the same as it has always been in standardization: attaining enough samples to get enough data to model the differences between the instruments in question.
did much of the early work in standardization at the University of Washington. This work was supported by the Center for Process Analytical Chemistry. Received for review January 3, 1995. Accepted April 28,
ACKNOWLEDGMENT
The authors acknowledge the aid of Amoco Corp. and the chemometrics research group at the University of Washington. Further appreciation is extended to Dr. Yongdong Wang, who
1995.@ AC941253B Abstract published in Advance ACS Abstracts, June 15, 1995.
Analytical Chemistry, Vol. 67,No. 14, July 15, 1995
2385