Spectral Multivariate Calibration without Laboratory Prepared or

Dec 31, 2012 - An essential part to calibration is establishing the analyte calibration reference samples. These samples must characterize the sample ...
0 downloads 0 Views 1MB Size
Article pubs.acs.org/ac

Spectral Multivariate Calibration without Laboratory Prepared or Determined Reference Analyte Values Josh Ottaway,† Jeremy A. Farrell,† and John H. Kalivas*,† †

Department of Chemistry, Idaho State University, Pocatello, Idaho 83209, United States S Supporting Information *

ABSTRACT: An essential part to calibration is establishing the analyte calibration reference samples. These samples must characterize the sample matrix and measurement conditions (chemical, physical, instrumental, and environmental) of any sample to be predicted. Calibration usually requires measuring spectra for numerous reference samples in addition to determining the corresponding analyte reference values. Both tasks are typically time-consuming and costly. This paper reports on a method named pure component Tikhonov regularization (PCTR) that does not require laboratory prepared or determined reference values. Instead, an analyte pure component spectrum is used in conjunction with nonanalyte spectra for calibration. Nonanalyte spectra can be from different sources including pure component interference samples, blanks, and constant analyte samples. The approach is also applicable to calibration maintenance when the analyte pure component spectrum is measured in one set of conditions and nonanalyte spectra are measured in new conditions. The PCTR method balances the trade-offs between calibration model shrinkage and the degree of orthogonality to the nonanalyte content (model direction) in order to obtain accurate predictions. Using visible and near-infrared (NIR) spectral data sets, the PCTR results are comparable to those obtained using ridge regression (RR) with reference calibration sets. The flexibility of PCTR also allows including reference samples if such samples are available.

M

characterize the analyte and matrix effects, this process can be particularly time-consuming and costly. Therefore, an objective of multivariate calibration is to maintain low prediction errors and, simultaneously, limit the number of reference samples. While local modeling methods can accomplish this goal, a large library of previously measured samples with reference values is still needed.1,4,5 Additionally, such approaches require using sample selection algorithms based on a defined similarity measure(s) to form an X properly spanning the sample prediction space. Recent work has shown that it is possible to form a calibration model without reference samples.6−8 In these studies, the calibration model is formed using an analyte pure component spectrum and nonanalyte spectra, i.e., samples where the analyte amount is known to be zero. In earlier work,7 only samples of the anticipated matrix effects were used, and it was suggested that pure component interferent spectra would be useful if available. In more recent work,8 known pure component interferent spectra were included. However, in neither process is there a dynamic weighting scheme to obtain a balance between model prediction accuracy, model shrinkage (model vector size), and orthogonality to the nonanalyte matrix

ultivariate calibration relates the predictand y (the dependent variable representing chemical or physical properties) to predictor variables X (independent variables) by y = Xb + e

(1)

where y represents an m × 1 vector of quantitative analyte values for m reference samples, X signifies the m × n matrix of spectra measured at the n wavelengths, b symbolizes the n × 1 vector of model coefficients to be estimated, and e designates the m × 1 vector of normally distributed errors with mean zero and covariance matrix σ2I with I denoting the m × m identity matrix. The relationship described in eq 1 assumes y and X are column-wise mean centered or constrained to the origin and hence, no y-intercept term. The samples making up X and y need to characterize the sample matrix and measurement conditions (chemical, physical, instrumental, and environmental) of any sample to be predicted. Numerous processes exist to obtain the estimated regression vector symbolized as b̂. Common approaches in analytical chemistry are partial leastsquares (PLS), principal component regression (PCR), and ridge regression (RR).1−3 Once b̂ has been estimated, it can then be used to predict a future sample spectrum provided that sample matrix is part of the calibration domain. In order to accomplish a multivariate calibration using eq 1, sample spectra are measured for X and the corresponding reference values for y need to be determined. Because a large number of samples are commonly required to effectively © 2012 American Chemical Society

Received: September 17, 2012 Accepted: December 30, 2012 Published: December 31, 2012 1509

dx.doi.org/10.1021/ac302705m | Anal. Chem. 2013, 85, 1509−1516

Analytical Chemistry

Article

of ridge regression (RR) is a TR variant for the model expressed in eq 1 modified to (ignoring the e term)

effects (nonanalyte spectra are predicted by the model to have zero analyte). Proposed in this paper is a method named pure component Tikhonov regularization (PCTR) that forms models with the three way balance. It has been well noted that the calibration model vector size is important to model quality where the size is commonly characterized by the Euclidean length of the model vector b̂ denoted by ∥b̂∥2 for the L2 norm. Generally, for a given X and y data set, the greater the model size, the more complex and overfitted the calibration model for that particular model direction. Such a model is not useful for accurate predictions of new samples. If the model has too much shrinkage, too small in size, the model is underfitted, and, hence, such a model direction and size will also not predict accurately. This trade-off in model direction and shrinkage is characterized by the bias/ variance trade-off for the model prediction.3 As noted previously, calibration models are commonly estimated in analytical chemistry by biased methods such as PLS, RR, or PCR. These methods are termed biased as each requires selection of a meta−parameter (tuning parameter) to balance the bias/variance trade-off, i.e., the direction and magnitude are balanced by the respective tuning parameter value. The method of multiple linear regression (MLR) is often used to avoid the bias/variance trade-off. However, wavelengths (predictor variables) need to be selected due to the mathematical constraint on the least-squares solution. Wavelength selection (how many and which ones) can itself be considered a tuning parameter as the direction and magnitude of the resultant MLR model vector are determined by the specific wavelengths used. Numerous studies have been performed documenting the utility of obtaining orthogonality between the calibration model vector and the nonanalyte sample matrix and/or measurement conditions.9−14 Such matrix effects can be chemical, physical, instrumental, or environmental. Presented in this paper is PCTR that provides a flexible method to balance model size with the degree of model orthogonality to the nonanalyte space (the model direction) while simultaneously sustaining accurate predictions. To accomplish the balance, PCTR uses a pure component analyte spectrum and nonanalyte spectra. The flexibility of PCTR also allows including reference samples if such samples are available. Once a calibration model is formed, the model is designed for a particular set of instrumental, environmental, and sample matrix conditions. When these conditions deviate from the calibration conditions, it is unlikely the model will continue to accurately predict. Sample and measurement conditions can change due to a number of reasons such as changes in pH, particle size, new spectral interferent, new source or detector in an instrument, and/or temperature or humidity of the environment. This paper also demonstrates that it is possible to use PCTR for calibration maintenance by using a pure component analyte spectrum measured in the primary conditions with nonanalyte spectra measured in new secondary conditions. If available, reference samples measured in one or both conditions can also be used with PCTR. Calibration maintenance has been the subject of numerous studies and is well reviewed including variants of TR.15,16

⎛y⎞ ⎛ X ⎞ ⎜ ⎟ = ⎜ ⎟b ⎝ 0 ⎠ ⎝τ I⎠

(2)

that is also described as the minimization min(|| Xb − y||22 + τ 2 || b||22 )

(3)

with the solution b̂ = (Xt X + τ 2 I)−1Xt y

(4)

where X, y, b, and I are as defined in the Introduction for eq 1 (except I is now n × n), 0 denotes an n × 1 vector of zeros, τ represents the RR tuning parameter, the superscripts −1 and t symbolize the matrix inverse and transpose operations, respectively, and ∥•∥2 defines the vector Euclidean L2 norm that is a measure of the vector size or magnitude. The value of the tuning parameter balances the bias/variance trade-off with respect to the model vector direction and size. An extension of RR has been made that allows calibration maintenance using only a few samples measured in the new secondary conditions.16−18 This process is defined by the minimization of min(|| Xb − y||22 + τ 2 || b||22 + λ 2 || Mb − yM||22 )

(5)

with solution b̂ = (Xt X + τ 2 I + λ 2 Mt M)−1(Xt y + λ 2 Mt yM)

(6)

where M and yM are the new spectra and respective reference values, and λ denotes the tuning parameter for the weight given to the samples from the secondary conditions. Expression 5 and eq 6 can be considered RR with a second penalty in order to update the RR model to be predictive in the secondary condition. This TR process balances direction and size with two tuning parameters, and the second is selected to slightly favor the secondary condition.16−19 Expressions 3 and 5 can be modified two ways such that wavelength selection is now possible. Both approaches replace τ2∥b∥22 with different terms. In one case, the vector L1 norm τ∥b∥1 is used,16,18−21 and the other uses τ2∥Lb∥22 where L represents a diagonal matrix with the diagonal composed of the reciprocals of the absolute values of the coefficients from a full wavelength RR model vector b̂ (or another multivariate calibration model such as PLS)16,19,21 or the spectral noise structure.20,22 Pure Component TR (PCTR). For a measured spectrum x and assuming a linear Beer−Lambert law type relationship is valid, then x can be expressed as x t = ya kta + y tI KI + n t + r t

(7)

where ya and ka and are the analyte concentration and pure component analyte spectrum at unit concentration, respectively, yI and KI represent the interferent concentrations and pure component interferent spectra at unit concentration as rows in KI, n symbolizes the composite remaining sample matrix, instrumental, and environmental sources affecting x such as scatter, baseline shifts, background, temperature, etc. (components of x not due to the analyte or interferents), and r denotes the random spectral noise.



MATHEMATICS Tikhonov Regularization (TR) for Calibration or Maintenance. The diverse TR approaches to calibration and/or maintenance have been reviewed,16 and only those features needed to develop PCTR are described. The method 1510

dx.doi.org/10.1021/ac302705m | Anal. Chem. 2013, 85, 1509−1516

Analytical Chemistry

Article

definitions to NAS are possible.25 One is that part of a spectrum orthogonal to the nonanalyte (interference) space and the other is that part of a spectrum that is useful for prediction of the analyte. In the case of PCTR, the goal is to adjust the tuning parameters such that only that part of the pure component analyte spectrum useful for prediction is used in forming a model vector. As with Expressions 3 and 5, Expressions 9 and 11 can be modified in order to incorporate wavelength selection. Nonanalyte spectra can be obtained from many sources and are the most crucial part of the modeling process. One possible source of nonanalyte spectra are blanks, i.e., samples with varying degrees of matrix matching to prediction samples. However, in some quantitative analysis situations, samples without the analyte present are not possible, e.g., the development of a noninvasive glucose monitoring system where the glucose analyte is always present. Another form of nonanalyte spectra are spectra measured from samples with a constant analyte amount and varying degrees of nonanalyte spectral matrix effects. These sample spectra are then mean centered to remove the analyte contribution and can now be treated as nonanalyte spectra. If constant analyte samples are not available, it may be possible to use nearly constant analyte samples. Rather than mean centering such spectra, successive difference spectra could be used. This approach assumes that sample matrix compositions are different for the sample spectra being differenced. Nonanalyte spectra also include pure component interference spectra, simulated synthetic spectra, and mathematical functions modeling other spectral interferents such as scatter, drift, etc. In all these nonanalyte spectral formats, no reference values are needed.

Prediction of ya, represented as ŷa, is obtained by multiplying x expressed in eq 7 by an estimated model vector b̂. This is written as yâ = x t b̂ = ya ktab̂ + y tI KIb̂ + n t b̂ + r tb̂

(8)

Equation 8 characterizes four conditions that must be satisfied in order to obtain ŷa = ya. Specifically, 1. ktab̂ = 1, 2. KIb̂ = 0, 3. ntb̂ = 0, and 4. rtb̂ = 0. Unfortunately, not all conditions can be simultaneously satisfied with current modeling methods, and, hence, a compromise is needed. Similar to Expression 5, the method of PCTR incorporates and balances these trade-offs directly in the minimization expression min(|| ktab − 1||22 + τ 2 || b||22 + λ 2 || Nb − 0||22 )

(9)

with solution b̂ = (kakta + τ 2 I + λ 2 N tN)−1ka

(10)

where N represents spectra spanning the nonanalyte information (KI and n). Since the concentration of the pure component analyte spectrum ka is one, the last term is actually ka1. The goal of Expression 9 is to determine a model vector b that fulfills the four conditions where the fourth condition rtb̂ = 0 is realized by having a small ∥b̂∥2. Thus, with PCTR, there is also a trade-off between the orthogonality of the model vector to KI and n and shrinkage of the model vector, i.e., a balance between direction and size to obtain accurate predictions. For primary calibration, the nonanalyte information must span the primary measurement conditions. For calibration maintenance, the nonanalyte information must characterize the new secondary conditions that are different from the conditions when the pure component analyte spectrum was measured. It should be noted that rather than using pure component spectra at unit concentrations, pure component samples diluted with a solvent could be used. In this case, the 1 in Expression 9 would be changed to ya, the ka terms in Expression 9 and eq 10 would be changed to xa, and the last term in eq 10 would be changed from ka to xaya where xa is the pure component analyte spectrum at the concentration of ya. Regardless, the same model vectors in the same directions would be obtained. The tuning parameters would be adjusted for the concentration effect. It has been suggested that when TR in the format of Expression 5 has M = N, yM = 0, m > n (thereby dispensing with the τ regularization term), and λ approaches infinity, then it can be thought of as forming a calibration model based on net analyte signal (NAS) preprocessing.23 This concept has been mathematically derived between NAS preprocessing and TR when m > n or m ≤ n.24 Rewriting Expression 5 for TR incorporating NAS preprocessing in the general case results in min(|| Xb − y||22 + τ 2 || b||22 + λ 2 || Nb||22 )



EXPERIMENTAL SECTION Software. Matlab code written by the authors was used for all calculations and is available in the Supporting Information (SI). Data Sets. Inorganic. The data set consists of 29 samples containing a three component system of cobalt II, chromium III, and nickel II.26,27 Three of these samples are the pure components. The analyte for this paper is chromium II (Cr), and the other two components are considered interferents. Five replicates of each sample were originally measured from 350 to 650 at 2 nm intervals, with two replicate spectra later removed due to their appearance as outliers. The mean spectra of the replicates were used resulting in 26 spectra. For this study, spectra were reduced to range from 374 to 598 nm in order to remove excessive high noise regions. Pure component and mixture spectra are plotted in Figure S-1 of the SI. From the 26 spectra, six spectra were reserved for creating the nonanalyte spectral matrix N by using constant analyte samples. However, only three proved to be needed for PCTR when used with pure component interferent spectra. The remaining 20 spectra were randomly split into a 10 sample calibration set and a 10 sample validation set. The calibration spectra were used with RR to compare to PCTR. In all situations, the validation set is the same. Listed in Table S-1 of the SI are the respective concentrations. The reader should consult refs 26 and 27 for the exact experimental details. Temperature. The data set consists of 19 samples composed of water, ethanol, and 2-propanol, measured from 590 to 1091 at 1 nm intervals at 30, 40, 50, 60, and 70 °C.28,29 Respective pure component spectra are also measured at all temperatures, and all spectra were reduced to range from 850 to 1049 nm.

(11)

The relationship in Expression 11 converges to orthogonal NAS preprocessing as well as the preprocessing method of generalized least-squares (GLS) under specific conditions.24 The interrelationships are also true for Expression 9 where the pure component analyte spectrum replaces reference spectra in X. Thus, the TR variants in Expressions 9 and 11 allow simultaneous NAS preprocessing and model formation using, respectively, no calibration references or with reference samples. In both situations, a range in the degree of NAS preprocessing is possible from none (λ = 0) to complete orthogonal NAS preprocessing. It is considered that two 1511

dx.doi.org/10.1021/ac302705m | Anal. Chem. 2013, 85, 1509−1516

Analytical Chemistry

Article

Pure component and mixture spectra are plotted in Figure S-2 of the SI at 30, 50, and 70 °C. The analyte for this paper is ethanol with the other two components serving as interferents. The 19 samples were split into a 13 sample calibration set and a six sample validation set. Listed in Table S-2 of the SI are the respective concentrations. This split was used with RR for comparison to PCTR. Samples used to construct the nonanalyte spectral matrix N were from the 13 sample calibration set consisting of blanks and nearly constant analytes. The pure component interferent samples were also used in N for some studies. In all circumstances, the same six validation samples were used. Results are presented for primary calibration at 50 °C to predict the validation samples measured at 50 °C as well as for calibration maintenance with primary calibration at 30 °C to predict the validation samples measured at 70 °C. The reader should consult refs 28 and 29 for the exact experimental details.



RESULTS AND DISCUSSION The PCTR process is evaluated using several nonanalyte combinations for the nonanalyte spectral matrix N involving blanks, pure component interference, and constant analyte. Model selections (tuning parameters) and evaluations are based on several root-mean-square error (RMSE) values. These values consist of the RMSE for prediction of the pure component analyte sample (RMSEka), nonanalyte samples (RMSEN), and validation samples (RMSEV). Other model merits used for evaluations are the respective R2, slope, and intercept values from plotting validation sample predictions against actual values. The PCTR results are compared to RR results based on full calibrations with reference samples. All PCTR and RR tuning parameters are selected using the model merits noted above graphically displayed in 2-dimenional landscapes, L-curves, and convergence of model vector correlations. These model selection processes, and others, have been well described.3,16−19 The following discussion begins with the inorganic data set. Model merit landscapes for a specific nonanalyte situation are presented and graphical characterizations of tuning parameter trends are examined. Following this assessment, tables of results obtained using PCTR with different nonanalyte situations are compared to results from a full RR calibration. The other data set is similarly discussed using only the tabulated results. Primary Calibration. Inorganic. Plotted in Figure 1a−d are the respective landscapes of RMSEka, RMSEN, model vector L2 norm, and RMSEV for PCTR using three constant analyte spectra mean centered and the two interferent pure component spectra for the nonanalyte spectral matrix N. The constant analyte samples are the first three of Table S-1 in the SI where in actuality the second component is seen to also be constant. This aspect is further discussed in the evaluation of results based on different compositions for N and comparison to full calibration with reference samples using RR. The plotted landscapes in Figure 1a−d indicate there are wide ranges of λ and τ values that provide acceptable models, i.e., models providing reduced model L2 norms while also reducing both the RMSEka and RMSEN values. Thus, rather than selecting one model with the appropriate trade-offs, a consensus (ensemble) approach could be used to select a family of models. The final sample prediction value would then be a specified weighted combination. Such an approach was recently described30 but is not used in this study. Instead, methods previously noted are used to select a model.

Figure 1. PCTR landscapes for the inorganic data set with the nonanalyte spectral matrix N composed of three constant analyte spectra and the two pure component interferent spectra. (a) RMSEka, (b) RMSEN, (c) model vector L2 norm, and (d) RMSEV. Gray circled regions in (d) are models with lowest RMSEV values.

The RMSEV landscape plotted in Figure 1d reveals two regions providing the smallest, and apparently similar, RMSEV values. These regions are circled in gray and reside at the bottom of clifflike formations. In these two regions, trade-offs exist between model vector L2 norm values (size) and directions characterized by the RMSEka and RMSEN values. The intersection of the two regions in Figure 1d captures those models with λ values at approximately equal to 1.0. Landscape regions also exist where the RMSEV values are degraded. These regions generally parallel the same regions with reduced RMSEka values and increased RMSEN values or vice versa. Increases and decreases in the model L2 norms are also present. Thus, as previously emphasized, trade-offs exist, and a model(s) is sought with an acceptable bias/variance balance in conjunction with orthogonally to nonanalyte. 1512

dx.doi.org/10.1021/ac302705m | Anal. Chem. 2013, 85, 1509−1516

Analytical Chemistry

Article

Figure 1a−c shows that for the dashed circle region, any τ value in the range from 3e-7 to 0.2 can be used to form a potential L-curve for selecting a model, i.e., select a τ value in this range and plot model merits as λ varies. In this case, the RMSEka varies little, while the RMSEN varies in conjunction with the model L2 norm. Using τ = 0.032 and plotting RMSEN values against respective model L2 norms generates an L-curve with the selected model at λ = 0.36 in the corner of the L-curve (see Figure 2a). Table 1 contains model merit values for this selected model.

Conversely, Figure 1a−c shows that for the solid circle region, any λ value in the range from greater than 1.00 to 1e5 can be used to form a potential L-curve to select a model, i.e., select a λ value in this range and plot model merits as τ varies. In this case, the RMSEN now varies little (all models are orthogonal or nearly orthogonal to the nonanalyte spectral matrix N), while the RMSEka varies in conjunction with the model L2 norm. However, plotting model merits at any λ value in this range do not form L-curves. For example, from Figures 1a and 1c, it is observed that for any λ value in this range, the RMSEka and the model L2 norm values appear to be inversely correlated. Indeed, a plot of RMSEka against the model L2 norm at λ = 302 generates a line with R2 = 0.9998 (see Figures 2b and 2c). The large number of models in the lower right corner are converging and have the smallest RMSEka values with larger model L2 norms. This type of correlation and convergence behavior has been noted before in other TR calibration maintenance variants.18,19 In these situations, the model with the greatest successive correlation with the smallest model L2 norm is selected as the final prediction model. A similar correlation and convergence behavior is observed when plotting RMSEka against RMSEN using λ = 302. In this plot, the convergence of the same models in Figures 2b and 2c are close to the origin with RMSEka and RMSEN being the smallest. Both types of plots (Figures 2a and 2b) agree with the goal of minimizing Expression 9, and the model with the greatest successive correlation has the smallest model L2 norm in conjunction with small RMSEN and RMSEka values. Using these successive correlations, the model at τ = 0.15 from Figures 2b and 2c is selected. Table 1 contains model merit values for this selected model. In comparing the representative results in Table 1 from the two low RMSEV regions in Figure 1, there appears little difference between the results. The equivalent results are due to the large similarity between the models. When the actual model vectors are plotted (not shown) the models are essentially the same. The angle and Euclidean distance between the two model vectors are respectively, 0.33 degrees and 0.0057. In general, for the data sets described in this paper, the models in either circled regions of Figure 1d form comparable model merits. The process based on the correlation and convergence behavior is used to select models as it is more automatable compared to the L-curve approach. It is observed from Figure 1 that the PCTR results at λ = 1.00 would be similar to the PCTR results listed in Table 1 at λ = 302. The only difference between the two PCTR models would be the different τ values for the different λ values. However, with other data sets, the PCTR models at λ = 1.00 are not acceptable, and greater λ values are needed thereby

Figure 2. Model merit assessments for the inorganic data set from the landscapes in Figure 1 with the nonanalyte spectral matrix N composed of three constant analyte spectra and two pure component interference spectra. For all plots, the red circle indicates the selected model. (a) RMSEN against the model L2 norm at τ = 0.0032, (b) RMSEka against the model L2 norm at λ = 302, and (c) a zoom of part b.

Table 1. RR and PCTR Model Merits for the Inorganic Calibration Data Set calibration process

total number of samples

RMSEV

R2a

slopea

intercepta

∥b̂∥2

τ, λ

10 13 6

5.4e-5 4.6e-5 4.9e-5

0.999 0.999 0.999

1.0 1.0 1.0

3.8e-5 −3.0e-6 −3.0e-5

0.017 0.017 0.017

6

5.5e-5

0.999

1.0

−3.8e-6

0.018

0.018, 0.059, 0.032, 0.36 0.15, 302

4

6.9e-3

0.790

0.99

6.3e-3

0.015

RR full reference RR full reference with pure component analyte and interferents PCTR with pure component interferents and three constant analyte samples for N (model selected from Figure 2a) PCTR with pure component interferents and three constant analyte samples for N (model selected from Figure 2b) PCTR with three constant analyte samples for N a

0.0022, 302

Values are from plotting the predicted validation sample values against the corresponding validation reference values. 1513

dx.doi.org/10.1021/ac302705m | Anal. Chem. 2013, 85, 1509−1516

Analytical Chemistry

Article

Table 2. RR and PCTR Model Merits for the Temperature Calibration Data Set calibration process

total number of samples

RMSEV

R2a

13 16 6 4 11

0.037 0.026 0.13 0.15 0.035

0.995 0.995 0.986 0.982 0.993

RR full reference RR full reference with pure component analyte and interferents PCTR with pure component interferents and three blanks for N PCTR with three blanks for N PCTR with pure component interferents, three blanks, and five constant analytes for N a

slopea intercepta 0.95 1.1 1.26 1.27 1.1

0.052 −0.037 0.018 0.039 −0.060

∥b̂∥2

τ, λ

12.3 19.3 16.2 11.2 30.1

0.0057, 0.0045, 0.0147, 0.0046, 0.0022,

302 302 302

Values are from plotting the predicted validation sample values against the corresponding validation reference values.

shifting the λ values for acceptable models in the landscape plots. Other than this aspect, the noted trends just described for the landscapes and plots in Figures 1 and 2 are characteristic of other PCTR situations based on different nonanalyte matrices as well as for the other data set evaluated in this paper. For comparison to PCTR, tabulated in Table 1 are results from two full RR models based on two different reference sample situations. One is established on only reference samples, and the other is formed by using the same reference samples augmented with the pure component analyte and interferent spectra. Including the pure component spectra in RR with the reference samples appears to provide a small improvement in model merits. The listed values indicate that PCTR and RR are able to form comparable results. Not studied for this paper was using RR with reference samples and separating N from the reference samples as in Expression 11. As previously noted, this approach was studied in ref 24 for analysis of NAS, TR, and GLS. Results are also shown in Table 1 for the PCTR situation based on using only three constant analyte spectra for N. As previously noted, these samples (the first three in Table S-1 of the SI) also have one of the interferents at constant concentration. Thus, when the three spectra are mean centered to remove the analyte, one of the interferents is also removed. Removing the pure component interferent spectra from PCTR clearly degrades the ability of the PCTR to form an effective model vector insensitive to the interferents. Said another way, by including just one spectrum for the one interferent not present in the constant analyte spectra indicates the importance of including information for all the interferent effects as well as the potential power PCTR to predict well with limited samples provided the tuning parameter is selected correctly. Lacking from the pure component spectra is information characterizing interactions between the nonanalyte matrix and analyte. For example, assuming a linear additive relationship, then the situation for the pure component analyte spectrum can be algebraically expressed as k a = k̃ a + k m

Because similar results for the inorganic data set were obtained with the reference sample based calibration and with constant analyte samples, then if an analyte matrix effect exists for this data set, it was sufficiently captured. The spectroscopic situation for the inorganic data set is rather simple. In spite of this simple configuration, it was found that pure component interferent spectra were needed if the nonanalyte matrix N is limited in the number of samples and characterization of the nonanalyte sample matrix. Temperature. Presented first in Table 2 are the RR results using the full calibration set of reference values. As noted in Table S-2, this calibration set includes three blank samples. Similar to the inorganic data set, the results listed in Table 2 reveal that including pure component spectra with the reference samples provided a marginal improvement in the prediction error and model merits. Due to the unique design of the temperature data, it is possible to form several variations of the nonanalyte samples in N such as blanks, pure component interferents, and approximate constant analyte samples. Regardless of the nonanalyte design; respective model merit landscapes are not uniquely different from those in Figure 1 for the inorganic data set. Listed in Table 2 are results using PCTR with various designs of N. For the situation with only the three blank spectra in N, the results listed in Table 2 disclose that the PCTR model does not predict as well as a full RR calibration with the 13 reference samples. Due to the lack of any analyte information in the nonanalyte matrix, there is no ability to incorporate the interactions between the analyte and the nonanalyte portion of the sample matrix. In terms of eq 12, the km term and/or similar terms for N are probably significant. For the reference based RR calibration, using reference samples containing the analyte in the sample matrix allows for better predictions of the validation samples. Adding the two pure component interferent spectra to the nonanalyte spectral matrix N composed of only blanks allows PCTR to form a slightly better predicting model; the model still does not predict as well as RR with 13 reference samples or RR with 13 reference samples and the three pure component spectra. Thus, if pure component interferent spectra are available, these results indicate that the spectra only provide a small improvement when the number of samples are limited. Table 2 shows that by including five nearly constant analyte samples with PCTR, the model is able to reduce the prediction errors and improve the R2 value. Including the constant analyte samples to the nonanalyte matrix N already composed of the blanks and pure component interferent samples should allow PCTR to better orthogonalize to the sample matrix, most importantly, the matrix analyte interactions. While the PCTR results are still not as good as RR with a full calibration of

(12)

where km denotes the matrix affected contributions to the nonmatrix affected pure component analyte spectrum kã . If km is small, then from eq 8 it can be seen that the contribution from the product ktmb̂ to the predicted value ŷa should be negligible and ka = k̃a. A similar expression and reasoning can be made for nonanalyte spectra in N without the analyte such as pure component interferent spectra or blanks. If constant analyte spectra are used in N, then the matrix affected contributions should be fully represented if the remaining sample matrix varies. Another approach is to include reference spectra from mixtures of the analyte and sample matrix. In this case, the corresponding analyte reference values are needed. 1514

dx.doi.org/10.1021/ac302705m | Anal. Chem. 2013, 85, 1509−1516

Analytical Chemistry

Article

Table 3. RR and PCTR Maintenance Model Merits for the Temperature Calibration Data Set at 30 °C Predicting the Validation Set at 70 °C calibration process

total number of samples

RMSEV

R2a

RR full reference at 30 °C RR full reference at 70 °C PCTR with pure component analyte at 30 °C and three blanks, and five constant analytes for N at 70 °C

13 13 9

0.27 0.047 0.051

0.977 0.992 0.999

a

slopea intercepta 0.88 0.82 0.80

−0.23 0.086 0.041

∥b̂∥2

τ, λ

9.64 9.57 20.2

0.0083, 0.0083, 3.1e-4, 1.2

Values are from plotting the predicted validation sample values against the corresponding validation reference values.

method.31 The PCTR formats could also be modified similar to other TR processes that incorporate wavelength selection. It should be noted that other methods can be used to solve the analysis situation given to PCTR. For example PLS could be used by removing the τ tuning parameter and replacing it with the number of latent variables. However, it was recently noted that due to the limited number of samples used in PCTR, the calibration rank is correspondingly limited causing a substantially small number of latent variables to be formed.30 The discrete nature of PLS is overcome with PCTR from the continuous nature of the PCTR tuning parameters. With PCTR there is perhaps greater flexibility in balancing model shrinkage, prediction error, and orthogonality to the nonanalyte sample matrix. As with other calibration processes that require tuning parameters (wavelength selection for multiple linear regression (MLR), number of latent variables for PLS or PCR, and design schemes for neural networks or support vector machines), PCTR requires tuning parameter selection. The focus in this paper was to select a model. Recent promising work with TR has involved using a consensus approach that automatically selects a collection of models based on model merit values.30 In summary, work is needed on appropriate methods to determine good tuning parameter values that properly balance model vector directions with magnitudes while maintaining good predictive abilities. It should be noted that if a few reference samples are indeed available in conjunction with a few nonanalyte spectra and a pure component analyte spectrum, then the measured reference sample values can be augmented to the nonanalyte sample ⎛0⎞ ⎛N⎞ values N and 0 in Expression 9 to ⎜ ⎟ and ⎜ y ⎟ where M and ⎝M⎠ ⎝ M⎠

reference samples in conjunction with pure component spectra, this variant of the nonanalyte spectral matrix N with PCTR does obtain a comparable prediction error using two to five fewer spectra and zero reference values. This translates into a substantial time and cost savings with a small sacrifice to the prediction error. In summary, it was found that the combination of constant analyte, blanks, and pure component interferents were needed to obtain PCTR results most comparable to RR with reference samples. Different combinations of two of the three nonanalyte sample sources provided results similar to those presented in Table 2 for the combination of blanks with pure component interferent spectra. Calibration Maintenance. Temperature. When preforming calibration maintenance with the temperature data set, the results and trends are similar to the temperature primary calibration results. As the values in Table 3 show, model updating from 30 °C with PCTR performs similar to forming the calibration model in the secondary conditions at 70 °C. By including the constant analyte samples measured in the secondary conditions, the model appears to be orthogonalizing to the matrix analyte interaction, if the interaction is truly meaningful. Thus, PCTR seems to be a viable approach for calibration maintenance without reference samples demonstrating a strong potential to substantially reduce the time to form a calibration.



CONCLUSIONS By using a pure component analyte spectrum in conjunction with nonanalyte spectra, the Tikhonov regularization (TR) variant named PCTR in this paper has been shown to be effective for primary calibration modeling and for calibration maintenance when updating a primary model to new secondary conditions. For the data sets presented in this study, PCTR was able to obtain comparable prediction errors as that of a full calibration with reference samples. The PCTR approach was able to properly model the calibration situations with few samples. Not studied was including in the nonanalyte matrix N difference spectra of samples with constant or nearly constant analyte amounts or difference spectra of blanks. For example, using difference spectra of blanks composed of varying amounts of the nonanalyte sample matrix components should allow PCTR to better orthogonalize to the nonanalyte information. Using difference spectra in other TR calibration maintenance studies with reference samples has been shown to improve prediction results.17 The PCTR approach is generic and could be applied to other data sets with a pure component analyte spectrum and nonanalyte spectra. While not attempted in this study, methods are available in the literature for estimating a pure component analyte spectrum which would extend the applicability of the

yM denote the reference samples measured in the primary condition for primary calibration or measured in the secondary conditions for calibration maintenance. With such an approach, fewer reference samples should be needed compared to a full calibration and predictions errors could be reduced compared to just using nonanalyte samples.



ASSOCIATED CONTENT

S Supporting Information *

PCTR Matlab code, Figures S-1 and S-2, and Tables S-1 and S2. This material is available free of charge via the Internet at http://pubs.acs.org.



AUTHOR INFORMATION

Corresponding Author

*Fax: 208-282-4373. E-mail: [email protected]. Notes

The authors declare no competing financial interest. 1515

dx.doi.org/10.1021/ac302705m | Anal. Chem. 2013, 85, 1509−1516

Analytical Chemistry



Article

(31) Toiviainen, M.; Corona, F.; Paaso, J.; Teppola, P. J. Chemom. 2010, 24, 514−522.

ACKNOWLEDGMENTS This material is based upon work supported by the National Science Foundation under Grant No. CHE-1111053 (cofunded by MPS Chemistry and the OCI Venture Fund) and is gratefully acknowledged by the authors. A portion of this research was supported by University Faculty Research Committee Grant No. 1036 at Idaho State University, Pocatello, Idaho. The authors are grateful to the providers of the data sets.



REFERENCES

(1) Næs, T.; Isaksson, T.; Fern, T.; Davies, T. A User Friendly Guide to Multivariate Calibration and Classification; NIR Publications: Chichester, UK, 2002. (2) Hastie, T. J.; Tibshirani, R. J.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed.; Springer-Verlag: New York, 2009. (3) Kalivas, J. H. In Comprehensive Chemometrics: Chemical and Biochemical Data Analysis Vol. 3; Brown, S. D., Tauler, R., Walczak, B., Eds.; Elsevier: Amsterdam, 2009; pp 1−32. (4) Davies, T. Spectrosc. Eur. 1999, 11, 22−24. (5) Atkeson, C. G.; Moore, A. W.; Schaal, S. Artif. Intell. Rev. 1997, 11, 11−73. (6) Marbach, R. J. Biomed. Optics 2002, 7, 130−147. (7) Marbach, R. J. Near Infrared Spectrosc. 2005, 13, 241−254. (8) Boulet, J.-C.; Roger, J.-M. Anal. Chim. Acta 2010, 668, 130−136. (9) Zeaiter, M.; Roger, J.-M.; Bellon-Maurel, V. Trends Anal. Chem. 2005, 24, 437−445. (10) Gujral, P.; Amrhein, M.; Wise, B. M.; Bonvin, D. J. Chemom. 2010, 24, 534−543. (11) Brown, C. W.; Ridder, T. D. Appl. Spectrosc. 2005, 59, 787−803. (12) Brown, C. W. Anal. Chem. 2004, 76, 4364−4373. (13) Skibsted, E. T. S.; Boelens, H. F. M.; Westerhuis, J. A.; Smilde, A. K.; Broad, N. W.; Rees, D. R.; Witte, D. T. Anal. Chem. 2005, 77, 7103−7114. (14) Ruyken, M. M. A.; Visser, J. A.; Smilde, A. K. Anal. Chem. 1995, 67, 2170−2179. (15) Brown, S. D. In Comprehensive Chemometrics: Chemical and Biochemical Data Analysis Vol. 3, Brown, S. D., Tauler, R., Walczak, B., Eds.; Elsevier: Amsterdam, 2009; pp 345−377. (16) Kalivas, J. H. J. Chemom. 2012, 26, 218−230. (17) Kalivas, J. H.; Siano, G. G.; Andries, E.; Goicoechea, H. C. Appl. Spectrosc. 2009, 63, 800−809. (18) Kunz, M. R.; Kalivas, J. H.; Andries, E. Anal. Chem. 2010, 82, 3642−3649. (19) Kunz, M. R.; Ottaway, J.; Kalivas, J. H.; Georgiou, C. A.; Mousdis, G. A. J. Agric. Food Chem. 2011, 59, 1051−1057. (20) Stout, F.; Kalivas, J. H.; Héberger, K. Appl. Spectrosc. 2007, 61, 85−95. (21) Ottaway, J.; Kalivas, J. H.; Andries, E. Appl. Spectrosc. 2010, 64, 1388−1395. (22) Kalivas, J. H. Anal. Chim. Acta 2004, 505, 9−14. (23) Andrew, A.; Fearn, T. Chemom. Intell. Lab. Syst. 2004, 72, 51− 56. (24) Andries, E.; Kalivas, J. H. submitted to J. Chemom. (25) Ferre, J.; Faber, N. K. M. Chemom. Intell. Lab. Syst. 2003, 69, 123−136. (26) Wentzel, P. D.; Andrews, D. T.; Kowalski, B. R. Anal. Chem. 1997, 69, 2299−2311. (27) http://groupwentzell.chemistry.dal.ca/downloads.html (accessed 9/12/2012). (28) Wiülfert, F.; Kok, W. T.; Smilde, A. K. Anal. Chem. 1998, 70, 1761−1767. (29) http://www.models.life.ku.dk/datasets (accessed 9/12/2012). (30) Shahbazikhah, P.; Kalivas, J. H. Chemom. Intell. Lab. Syst. 2013, 120, 142−153. 1516

dx.doi.org/10.1021/ac302705m | Anal. Chem. 2013, 85, 1509−1516