Determination of optimal parameters for multicomponent analysis

Determination of optimal parameters for multicomponent analysis using the calibration matrix condition number. J. H. Kalivas. Anal. Chem. , 1986, 58 (...

Anal. Chem. 1986, 58, 989-992

AIDS FOR ANALYTICAL CHEMISTS

Determination of Optimal Parameters for Multicomponent Analysis Using the Calibration Matrix Condition Number

J. H. Kalivas

Department of Chemistry, Idaho State University, Pocatello, Idaho 83209

For analytical results, it is necessary to determine the optimum set of operating parameters influencing a chemical analysis. Solvent composition, pH, reaction time, and temperature are a few of the many factors that can be altered. Many techniques are available that will perform this task of optimization. However, the simultaneous methods, random design and factorial design (1), require a large number of experiments, while the single-factor sequential optimization technique is insufficient when the parameters are interdependent (2). Box and Wilson (3) used an evolutionary operation procedure to contend with the multifactor response surface in the optimization scheme. Unfortunately, this evolutionary operation procedure uses factorial designs and regression techniques requiring a significant number of experiments. The sequential simplex method of optimization as described and reviewed by Deming and Morgan (4) represents a much more efficient method of optimization for the multifactor response surface.

Most applications of sequential simplex have been oriented toward obtaining the optimum multifactor response surface for the analysis of a single-component chemical system. Sequential simplex has been used to optimize automated multicomponent chromatographic separations (5-7), while Leary et al. (8) have successfully used sequential simplex optimization to locate the optimal instrumental operating conditions for simultaneous multicomponent analysis using inductively coupled plasma spectrometry. The composite response function for use in their simplex design was represented by the sum of the reciprocals of the signal-to-background ratio (SBR) for each component at approximately the same concentration levels as expected in the samples.
Since then, Moore et al. (9) have added an ionization interference composite function to the simplex procedure. It was necessary to first optimize the composite response function for the SBR and then optimize (in this case minimize) the ionization effects. However, minimization of additional spectral interferences and matrix effects was not obtained. Also, there was some SBR loss after the ionization interferences were minimized. Deming and Morgan (10) have incorporated terms to correct for interfering absorption in the spectrophotometric analysis of calcium in blood serum. Their study was limited to the measurement of one component at only one wavelength.

To overcome these limitations for multicomponent optimization, an alternative consideration for determining the optimal operating parameters influencing a chemical analysis is error propagation. When spectral interferences and matrix effects are present for a multicomponent analysis procedure, theory states that measurement errors can actually be amplified to produce larger uncertainties for the estimated analyte concentrations (11). This error amplification can be represented by the condition number of the calibration matrix. The condition number of a square matrix is the product of its spectral norm with that of the inverse of the matrix. When the condition number of the matrix is large, the matrix is said to be ill-conditioned (11-13). Thus, the condition number of a calibration matrix provides conclusive information on the potential error to be encountered in estimating the analyte concentrations; the larger the condition number, the larger the potential error. It has been shown that a minimum in the condition number of the calibration matrix represents an optimization of the selectivity, precision, and accuracy for a given multicomponent analysis scheme (14). This study was limited to determining the optimal set of sensors for a particular multicomponent analysis. In a separate investigation, Wegscheider and Otto (15) acquired similar results. Additionally, it was found that the condition number of the calibration matrix could be used to determine the optimal complexing agent for spectrophotometric multicomponent analysis of trace metals. Frans et al. (16) have used the condition number of a matrix to assist in the error analysis for the resolution of overlapped liquid chromatographic peaks.

After a brief review of the theory of condition numbers, this paper will deal with the methodology and applicability of using error propagation, expressed through the condition number of the calibration matrix, for optimal multicomponent analysis using UV-vis spectrophotometry. The procedure is quite general and can be applied to most multicomponent methods of analysis in the presence of interferences and matrix effects.
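As a minimal numerical sketch of the definition given in the introduction (not part of the original work), the product of the spectral norm of a matrix and that of its inverse can be evaluated with NumPy. The two 2 × 2 matrices are hypothetical calibration matrices, chosen only to contrast well-resolved and strongly overlapping responses; the function name is likewise an assumption.

```python
import numpy as np

def condition_number(K):
    """cond(K) = ||K||_2 * ||K^-1||_2 for a square, invertible matrix K."""
    K = np.asarray(K, dtype=float)
    return np.linalg.norm(K, 2) * np.linalg.norm(np.linalg.inv(K), 2)

# Hypothetical calibration matrices (rows: components, columns: sensors).
K_resolved = np.array([[1.00, 0.10],
                       [0.10, 1.00]])    # little spectral overlap
K_overlapped = np.array([[1.00, 0.95],
                         [0.95, 1.00]])  # nearly collinear responses

cond_resolved = condition_number(K_resolved)
cond_overlapped = condition_number(K_overlapped)
print(f"resolved:   cond(K) = {cond_resolved:.2f}")
print(f"overlapped: cond(K) = {cond_overlapped:.2f}")
```

For these illustrative values the nearly collinear matrix gives a condition number of 39, against roughly 1.2 for the well-resolved one, which is the sense in which a large condition number signals an ill-conditioned calibration.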

THEORY

Fundamental spectrophotometric multicomponent analysis is based on Beer's law

$$\mathbf{r} = \mathbf{c}\mathbf{K} \qquad (1)$$

where $\mathbf{r}$ is a row vector of absorbances measured at $p$ sensors (wavelengths), $\mathbf{c}$ is the concentration row vector for the $r$ components, and $\mathbf{K}$ is the $r \times p$ matrix of linear response constants (calibration matrix) showing the contribution of each of the $r$ components to each of the $p$ sensors for a given mixture. If the K matrix is known, the solution for the concentrations of the components can be found by the use of multiple linear least squares (15, 17).

If it is not known whether matrix effects and spectral interferences exist in the multicomponent sample, a practical method of calibration is to use the generalized standard addition method (GSAM) (10, 13, 18). The GSAM is able to simultaneously correct for known spectral interferences, matrix effects, and drift. The GSAM requires that when there are $r$ analytes to be determined, the responses from $p$ sensors ($p \ge r$) be recorded before and after $n$ standard additions are made ($n \ge r$). The model in matrix notation is

$$\mathbf{R} = \mathbf{C}\mathbf{K} \qquad (2)$$

where $\mathbf{R}$ is the $(n + 1) \times p$ matrix of measured responses (initial readings plus readings after standard additions), with $n$ equal to the number of multiple standard additions made and $p$ equal to the number of sensors. $\mathbf{C}$ is the $(n + 1) \times r$ concentration (initial plus additions) matrix of the $r$ analytes, and $\mathbf{K}$ represents the $r \times p$ matrix of linear response constants.

© 1986 American Chemical Society
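The two least-squares steps implied by eq 1 and eq 2 can be sketched numerically as follows. This is an illustrative sketch rather than the author's program: all matrix values are made up and noise-free, and C is treated as fully known, whereas in an actual GSAM experiment the initial concentrations are the unknowns.

```python
import numpy as np

# Hypothetical calibration matrix K: r = 2 components, p = 4 sensors.
K_true = np.array([[0.80, 0.50, 0.20, 0.05],
                   [0.10, 0.30, 0.60, 0.90]])

# Hypothetical concentration matrix C, (n + 1) x r with n = 4 additions:
# the first row is the initial sample, later rows follow standard additions.
C = np.array([[1.0, 2.0],
              [1.5, 2.0],
              [2.0, 2.0],
              [2.0, 2.5],
              [2.0, 3.0]])

R = C @ K_true  # simulated noise-free responses (eq 2)

# Step 1: estimate K from R = C K by linear least squares.
K_est, *_ = np.linalg.lstsq(C, R, rcond=None)

# Step 2: with K known, estimate c from r = c K (eq 1).
# Transposing gives K^T c^T = r^T, the form lstsq expects.
r_sample = R[0]
c_est, *_ = np.linalg.lstsq(K_est.T, r_sample, rcond=None)

print(np.round(K_est, 3))
print(np.round(c_est, 3))
```

Because the simulated data contain no noise, both steps recover the assumed values exactly; with real absorbances the quality of the second step is governed by the conditioning of K.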


ANALYTICAL CHEMISTRY, VOL. 58, NO. 4, APRIL 1986

Recently Lorber derived the unbiased GSAM (UBGSAM), which allows for the "best linear unbiased estimate" of the unknown components (19). The basic equation to be solved in the UBGSAM is

$$\mathbf{Q} = \Delta\mathbf{C}\,\mathbf{K} \qquad (3)$$

where $\mathbf{Q}$ is the $(n + 1) \times p$ matrix of zero mean volume corrected responses and $\Delta\mathbf{C}$ is the $(n + 1) \times r$ matrix of zero mean concentrations. The $(n + 1)$th row in the $\Delta\mathbf{C}$ matrix now consists of zeros. For further information on $\mathbf{Q}$ and $\Delta\mathbf{C}$, ref 19 should be consulted. By use of the method of multiple linear least squares to solve for $\mathbf{K}$ in the presence of an overdetermined system ($n > r$), the generalized inverse of $\Delta\mathbf{C}$ (16) is used, resulting in

$$\mathbf{K} = (\Delta\mathbf{C}^{T}\Delta\mathbf{C})^{-1}\Delta\mathbf{C}^{T}\mathbf{Q} \qquad (4)$$

After calculation of the K matrix, the vector of initial analyte quantities, $\mathbf{c}_0$, is recovered by solving

$$\bar{\mathbf{q}} = (\mathbf{c}_0 + \bar{\mathbf{c}})\mathbf{K} \qquad (5)$$

where $\bar{\mathbf{q}}$ is the vector of mean responses and $\bar{\mathbf{c}}$ is the vector of mean added analyte concentrations.

The upper limit of the relative error for the estimated concentrations depends on the condition number of the calibration matrix and the relative errors in the data. Specifically, the upper bound for the relative error in the estimated concentrations ($\|\delta\mathbf{c}_0\|/\|\mathbf{c}_0\|$) is obtained by multiplying the condition number of the calibration matrix (cond(K)) by the sum of the relative errors due to absorbance ($\|\delta\mathbf{r}\|/\|\mathbf{r}\|$) and calibration ($\|\delta\mathbf{K}\|/\|\mathbf{K}\|$) (11-13). The Euclidean norm of a vector or matrix is signified by $\|\cdot\|$. This error propagation is expressed by

$$\frac{\|\delta\mathbf{c}_0\|}{\|\mathbf{c}_0\|} \le \mathrm{cond}(\mathbf{K})\left(\frac{\|\delta\mathbf{r}\|}{\|\mathbf{r}\|} + \frac{\|\delta\mathbf{K}\|}{\|\mathbf{K}\|}\right) \qquad (6)$$

The condition number of the calibration matrix, cond(K), is calculated from

$$\mathrm{cond}(\mathbf{K}) = \|\mathbf{K}\|\,\|\mathbf{K}^{-1}\| \qquad (7)$$

for a square K matrix ($p = r$). If the K matrix is rectangular ($p > r$, more sensors than analytes), its condition number is given by

$$\mathrm{cond}(\mathbf{K}) = [\mathrm{cond}(\mathbf{K}^{T}\mathbf{K})]^{1/2} \qquad (8)$$

For UV-vis spectrophotometry, it is well-known that the absorption spectrum depends on the experimental conditions that can be varied (i.e., pH, solvent, temperature). From eq 3 one can see that if the absorption spectrum is altered due to changes in the experimental conditions, then the calibration matrix is also influenced. Correspondingly, cond(K) can be increased or decreased depending on the influence of the experimental factor. A logical conclusion is that the experimental factors should be varied until the minimum cond(K) is reached. Thus, by varying the multifactor response surface until the minimum in cond(K) is found, one should have simultaneously optimized the multicomponent analysis procedure for maximum selectivity and sensitivity as well as for maximum accuracy and precision. Specifically, minimizing cond(K) will reduce the error propagation as shown in eq 6, thus allowing for optimal analysis.

Table I. Calibration Condition Numbers for Solvent Optimization

solvent    cond(K)
ethanol    290
heptane    2.1

EXPERIMENTAL SECTION

Apparatus and Computations. A Model 23 000 Varian spectrophotometer was used for all measurements. An Apple II+ was used to control the spectrophotometer in addition to providing data storage and graphics output. Measurement values were transferred to the Idaho State University HP-1000 computer for subsequent data analysis. Estimated standard deviations were obtained by using 50 Monte Carlo perturbations (20). For the Monte Carlo estimations a normal distribution of measurement error was assumed, with a standard deviation equal to a 1% relative standard deviation on the absorbance readings. A 1% relative standard deviation is the estimated precision of the spectrophotometer used.

Reagents. Stock solutions of organic components were prepared by use of reagent grade chemicals. Concentrations were 1.238 × 10^-3 g/mL phenol, 1.109 × 10^-1 g/mL acetaldehyde in heptane, and 1.122 × 10^-3 g/mL phenol, 9.431 × 10^-3 g/mL acetaldehyde in ethanol. Aqueous stock solutions of acid-base indicators were prepared using distilled, deionized water as described previously (16). Concentrations were 4.848 × 10^[exponent illegible] M chlorophenol red and 5.707 × 10^[exponent illegible] M phenol red. Buffer solutions at pH 4, 5, 6, 7, 8, and 9 were prepared using pH hydrion buffers (Micro Essential Laboratory, Inc., Brooklyn, NY) and distilled, deionized water. Stock solutions were used to prepare standards for additions and test sample mixtures.

Procedure. Solvent Optimization. Five test sample aliquots were prepared consisting of 1.00 mL of the acetaldehyde stock solution and 0.25 mL of the phenol stock solution. Standard additions consisted of one addition of a single analyte per test sample aliquot, with a total of two additions for each analyte. Volumes for each addition consisted of 0.25 mL and 0.50 mL of stock phenol and 1.00 mL and 2.00 mL of stock acetaldehyde. All solutions were diluted to a final volume of 25.00 mL with the respective solvents. For the initial calculation of cond(K), individual solutions of phenol and acetaldehyde were prepared consisting of 0.25 mL and 1.00 mL of the respective analyte diluted to a total volume of 25.00 mL with the appropriate solvent. Spectra were recorded at 1-nm intervals ranging from 255 to 330 nm, which contains the range of absorption for the two components. Calculations used absorbances measured at 5-nm intervals to allow for the advantage of the overdetermination of the chemical system (14).
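The Monte Carlo estimate of the standard deviations described under Apparatus and Computations can be sketched as below. This is a hedged illustration only: the calibration matrix, concentrations, random seed, and function name are assumptions, and a plain least-squares solution of Beer's law (r = cK) stands in for the full UBGSAM calculation. As in the text, 50 perturbations are drawn from a normal distribution with a 1% relative standard deviation on each absorbance reading.

```python
import numpy as np

def monte_carlo_rsd(K, r, n_trials=50, rel_sd=0.01, seed=0):
    """Perturb absorbances r with normal noise and re-solve r = c K each time.

    Returns the mean estimated concentrations and their relative standard
    deviations in percent, RSD = 100 * (standard deviation / mean estimate).
    """
    rng = np.random.default_rng(seed)
    K = np.asarray(K, dtype=float)
    r = np.asarray(r, dtype=float)
    estimates = np.empty((n_trials, K.shape[0]))
    for i in range(n_trials):
        # Noise standard deviation is rel_sd times each absorbance reading.
        r_noisy = r * (1.0 + rel_sd * rng.standard_normal(r.shape))
        estimates[i] = np.linalg.lstsq(K.T, r_noisy, rcond=None)[0]
    mean = estimates.mean(axis=0)
    sd = estimates.std(axis=0, ddof=1)
    return mean, 100.0 * sd / mean

# Hypothetical two-component, four-wavelength system.
K = np.array([[0.80, 0.50, 0.20, 0.05],
              [0.10, 0.30, 0.60, 0.90]])
c_true = np.array([1.0, 2.0])
r = c_true @ K  # noise-free absorbances from Beer's law

mean_c, rsd = monte_carlo_rsd(K, r)
print("mean estimates:", np.round(mean_c, 3))
print("RSD (%):", np.round(rsd, 2))
```

Repeating the run with a more strongly overlapped (higher condition number) K is a quick way to see the same 1% measurement noise translate into larger relative standard deviations on the estimated concentrations.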

pH Optimization. Five test sample aliquots were prepared consisting of 1.00 mL each of chlorophenol red and phenol red. The same addition procedure as above was used except that the volumes of additions were 0.26 mL and 0.50 mL for the two analytes. All solutions were diluted to a final volume of 25.00 mL with the respective stock buffer solutions. For the initial calculation of cond(K), individual solutions of chlorophenol red and phenol red consisting of 1.00 mL each were diluted to a total volume of 25.00 mL with the appropriate buffer. Spectra were recorded at 1-nm intervals ranging from 350 to 650 nm, which contains the range of absorption for the two components at the various pH levels. Calculations used absorbances measured at 5-nm intervals.

RESULTS AND DISCUSSION

Solvent Optimization. In order to first assess the optimal solvent (heptane or ethanol) for the analysis of the phenol-acetaldehyde sample, the absorption spectra of the pure standards were obtained at the concentrations indicated in the Experimental Section. These spectra are shown in Figure 1. On the basis of the appearance of the spectra one would intuitively choose heptane as the solvent for the analysis. This decision would be the result of the observation that in the nonpolar solvent heptane (Figure 1b), the vibrational structure is retained. In contrast, Figure 1a shows that in the polar solvent ethanol the vibrational structure is diminished considerably, causing less spectral resolution of the individual components. Therefore, one would expect a decrease in cond(K) when the solvent is changed from ethanol to heptane, arising from the increase in discrimination of the components. Presented in Table I are the condition numbers for the pure standards in the two solvents. The large reduction in cond(K) for heptane agrees with the decision based on the observed spectra. Thus, using heptane should minimize the error propagation as expressed by the cond(K) values and eq 6.

To test the applicability of using cond(K) as a predictor for the optimal solvent, separate analyses were run using the two solvents heptane and ethanol. The results of this study are listed in Table II. Upon inspection, it is noticed that the condition numbers are substantially greater for the mixtures than for the pure standard solutions. This is attributed to the band intensity ratio increasing relative to the standard additions of each component. This is in agreement with Otto et al. (15), where the general influences of different parameters on the condition number were studied. It was found that as the band intensity ratio increased the condition number increased. The general pattern of a decrease in cond(K) from ethanol to heptane for the test sample is still evident, thereby indicating heptane to be the better solvent. This is reflected in the reduction of the relative errors for the analysis in heptane as shown in Table II. The relative standard deviations are also presented in Table II. It is seen that the precision is improved when transferring the analysis to the solvent heptane, in addition to the increase in accuracy. The drop in cond(K) in proceeding from ethanol to heptane is caused by the increase in the vibrational structure of the absorption spectrum of each component. Thus it appears from this investigation that cond(K) can be used to optimize for the best solvent for multicomponent analysis in the presence of known interferences.

Table II. Comparison of Results for Solvents Ethanol and Heptane

solvent   component      true concn, g/mL   calcd concn, g/mL   relative error,(a) %   RSD(b)   cond(K)
ethanol   phenol         1.122 × 10^-5      1.284 × 10^-5       14                     5.23     2078
          acetaldehyde   3.772 × 10^-4      8.183 × 10^-4       114                    10.35
heptane   phenol         1.238 × 10^-5      1.269 × 10^-5       2.5                    6.32     1484
          acetaldehyde   4.437 × 10^-3      4.471 × 10^-3       0.77                   5.20

(a) Relative error = 100(|true - calcd|)/true. (b) Relative standard deviation = 100(standard deviation/calcd).

Table III. Calibration Condition Numbers for pH Optimization

pH   cond(K)
4    136
5    510
6    1.2
7    2.4
8    4.5
9    4.6

pH Optimization. Similar to the solvent study, absorption spectra for pure standards of chlorophenol red (CR) and phenol red (PR) were obtained at a range of pH values.
Figure 2 illustrates the measured absorption spectra, while Table III presents the respective condition numbers. Upon inspection of the spectra for pH 5 (largest condition number), the existence of collinearity at the lower wavelengths for CR and PR is observed. Additionally, there is a slight amount of absorption for CR at the higher wavelengths due to CR starting the color transition from yellow to purple (pH 5 to pH 6.6). The exceptionally high condition number at pH 5 depends on two factors: the collinearity of the data as indicated previously and the vast difference in the intensity ratio of the bands at the low and high wavelengths (15). At pH 4, Figure 2a shows only the presence of collinearity at the low end of the spectra, resulting in a lower condition number compared to pH 5. The substantial decrease in cond(K) when proceeding to pH 6 is a result of the increased band separation for the two components. The condition number increases gradually as the pH is increased from 6 to 9. Again, this pattern is attributable to the increase in the band ratio and the degree of collinearity as illustrated in Figure 2.

Table IV lists the results of the analysis study for test mixtures of CR and PR at three pH levels. As expected, the results are unacceptable at pH 4. The large errors for pH 4 were correctly predicted by the large condition number. Likewise, for pH 6 and 8 the smaller relative errors would be predicted by the smaller condition numbers. A t test at the 95% confidence level shows that the calculated molarities for CR and PR at pH 6 and 8 are significantly different from each other. Therefore, pH 6 is the optimal pH for the analysis based on the relative errors shown in Table IV. This conclusion is confirmed by cond(K).

It appears that the condition number can be used to determine a set of the optimal analysis parameters (pH, solvent, etc.) for accurate measurements. Since the analysis parameters may considerably influence the spectra (calibration matrix K) of the solution components and, at the same time, the shape of the spectra influences the precision and accuracy, minimizing the condition number of the K matrix allows one to obtain optimal multicomponent performance. By use of a minimum cond(K) as the criterion for the analysis parameters, all spectral interferences will have been minimized, allowing for improved selectivity.

Table IV. Comparison of Results for Different pH Levels

pH   component(a)   true molarity × 10^[?]   calcd molarity × 10^[?]   relative error, %   RSD     cond(K)
4    CR             1.939                    0.7500                    61                  3.24    26
     PR             2.238                    3.337                     46                  0.766
6    CR             1.939                    1.804                     7.5                 0.339   1.9
     PR             2.238                    2.096                     6.3                 0.213
8    CR             1.939                    1.789                     7.7                 0.351   4.1
     PR             2.238                    2.043                     8.7                 0.429

(a) CR, chlorophenol red; PR, phenol red.

Figure 1. Absorbance spectra of phenol (I) and acetaldehyde (II) in two solvents: (a) ethanol and (b) heptane.

Figure 2. Absorbance spectra of chlorophenol red (CR) and phenol red (PR) at various pH values, plotted as absorbance against wavelength (350 to 650 nm): (a) pH 4, (b) pH 5, (c) pH 6, (d) pH 7, (e) pH 8, (f) pH 9.

CONCLUSION

The optimization discussed above could easily have been performed by human pattern recognition. However, the proposed optimization procedure can be easily incorporated with a computer, allowing greater utilization of the computational and logic-based decision-making abilities of computers. Furthermore, the "intelligence" of computers can use cond(K) to optimize the analysis procedure. Finding the minimum in cond(K) will minimize spectral interference along with the matrix effects while allowing for increased accuracy and optimal sensitivity and selectivity. The results of this study suggest using cond(K) as a new objective function to be minimized by sequential simplex optimization for multicomponent analysis. The condition number should provide a single operative measure of the analysis that is representative of all components that are to be determined. Studies of this nature are currently under way.

ACKNOWLEDGMENT

I am grateful to John Trammel for his assistance in programming and data analysis.

LITERATURE CITED

(1) Kateman, G.; Pijpers, F. W. "Quality Control in Analytical Chemistry"; Wiley: New York, 1981.
(2) King, P. G.; Deming, S. N. Anal. Chem. 1974, 46, 1476-1481.
(3) Box, G. E. P.; Wilson, K. B. J. R. Stat. Soc., Ser. B 1951, 13, 1.
(4) Deming, S. N.; Morgan, S. L. Anal. Chem. 1973, 45, 278A-283A.
(5) Berridge, J. C. Analyst (London) 1984, 109, 291-293.
(6) Morgan, S. L.; Deming, S. N. J. Chromatogr. 1975, 112, 267.
(7) Morgan, S. L.; Jacques, C. A. J. Chromatogr. Sci. 1978, 16, 500.
(8) Leary, J. J.; Brookes, A. E.; Dorrzapf, A. F., Jr.; Golightly, D. W. Appl. Spectrosc. 1982, 36, 37-40.
(9) Moore, G. L.; Humphries-Cuff, P. J.; Watson, A. E. Spectrochim. Acta, Part B 1984, 39B, 915-929.
(10) Deming, S. N.; Morgan, C. L. "Chemometrics: Theory and Application"; Kowalski, B. R., Ed.; American Chemical Society: Washington, DC, 1977.
(11) Jochum, C.; Jochum, P.; Kowalski, B. R. Anal. Chem. 1981, 53, 85-92.
(12) Stewart, G. W. "Introduction to Matrix Computations"; Academic Press: New York, 1973.
(13) Jochum, P.; Schrott, E. L. Anal. Chim. Acta 1984, 157, 211-228.
(14) Kalivas, J. H. Anal. Chem. 1983, 55, 565-587.
(15) Otto, M.; Wegscheider, W. Anal. Chem. 1985, 57, 63-69.
(16) Frans, S. D.; McConnell, M. L.; Harris, J. M. Anal. Chem. 1985, 57, 1552-1559.
(17) Neter, J.; Wasserman, W. "Applied Linear Statistical Models"; Richard D. Irwin, Inc.: Homewood, IL, 1974; Chapter 6.
(18) Saxberg, B. E.; Kowalski, B. R. Anal. Chem. 1979, 51, 1031-1038.
(19) Lorber, A. Anal. Chem. 1985, 57, 954-955.
(20) Naylor, T. H.; Balintfy, J. L.; Burdick, D. S.; Chu, K. "Computer Simulation Techniques"; Wiley: New York, 1966; Chapter 4.

RECEIVED for review September 27, 1985. Accepted December 2, 1985. This research was supported in part by Grant 566 from the Faculty Research Committee, Idaho State University, Pocatello, ID, and Project SEED administered by the American Chemical Society.