Resolution of infrared spectra of mixtures by self-modeling curve


Anal. Chem. 1986, 58, 2622-2628

Resolution of Infrared Spectra of Mixtures by Self-Modeling Curve Resolution Using a Library of Reference Spectra with Simplex-Assisted Searching

David M. Mauro and Michael F. Delaney*

Department of Chemistry, Boston University, Boston, Massachusetts 02215

A recently proposed method for extending the mathematical technique of self-modeling curve resolution using a library of reference spectra has been modified to take advantage of the relative simplicity of simplex optimization techniques in higher dimensions when searching for minima and maxima in a response function. Surface contouring of the library search distance metric over the important eigenspace is replaced by simplex optimization, resulting in two distinct advantages. First, identification of solution points is accomplished more accurately, and uncertainties caused by resolution constraints in gridding and contouring are eliminated. Second, the extension of the simplex method to greater than three-component mixtures is comparatively straightforward, whereas visual inspection of surface contours in these dimensionalities is impossible. The method is shown to be comparatively rapid and relatively insensitive to noisy data.

The mathematical resolution of overlapped chromatographic peaks and the identification of unknown components in chemical mixtures have received considerable attention recently among analysts. Several approaches have been proposed (1). However, the common goal of all proposed methods has been to efficiently use all the available information resulting from an experiment to obtain the best estimate possible of the identities and amounts of compounds present. In the experimental situation where the mixture consists of unknown amounts of an unknown number of unknown compounds, the technique of self-modeling curve resolution (SMCR) (2) comes as close as is presently possible to resolving the mixture with a minimum number of assumptions.

In SMCR, the resultant principal eigenvectors from a principal components analysis of a multivariate, multifactor data set are used to define a k-dimensional coordinate system within which each data spectrum and each pure component spectrum are represented as points. The coordinate values of the points constitute a transformation vector by which the abstract eigenvectors can be rotated to physically real spectra. For a single spectrum in a two-component mixture, the following equation results:

$$\mathrm{PS} = aV_1 + bV_2 + e \tag{1}$$

where PS is the pure component spectrum, a and b are the coordinate values of the solution point, V_1 and V_2 are the first and second principal eigenvectors, respectively, and e is the error associated with the analysis. The dimensionality of the solution space is determined by the number of underlying pure components, and thus by the number of retained eigenvectors. In the two-component case, the number of significant eigenvectors is two (thus the dimensionality is two), and the component points lie on a two-dimensional (x-y) coordinate system. Normalization of the mixture spectra to unit area results in the projections lying on a straight line segment with the purest mixture spectra lying at each end of the line. The projections of the pure spectra also lie in the line, but their exact coordinates are unknown and must be estimated.

It follows that for a mixture of k components, the projections of normalized mixture spectra lie within a k-dimensional simplex, with the vertices of the simplex corresponding to the projections of the spectra of the pure underlying components. Thus for a three-component mixture, the projections lie on a plane, within a triangular boundary, on a three-axis coordinate system. The vertices of the triangular boundary are the projections of the pure underlying component spectra onto the coordinate system. The dimensionality of the solution space can be reduced further by "mean centering" (3). Here the mean intensity at each wavelength in the data set is subtracted from each mixture spectrum. The result is a k-dimensional simplex in a (k - 1)-dimensional coordinate system (Figure 1). Each pure spectrum, i, is then reconstructed from its projection location according to

$$\mathrm{PS}_i = a_i V_1 + b_i V_2 + M + e \tag{2}$$

where M is the vector containing the mean intensities at each wavelength. The coefficients a and b, which transform the abstract eigenvectors into the actual spectra, are unknown quantities that must be estimated.

Several methods have been proposed to estimate the transformation coefficients (2, 4-10). Lawton and Sylvestre introduced the idea of restricting the solution space with physically meaningful boundary conditions (2). These include a nonnegative intensity restriction and a nonnegative concentration restriction. While these boundary conditions greatly reduced the possible choices of the coefficients a and b, uncertainties remain. Several approaches to further restricting the choices of the transformation coefficients have been proposed. Meister described a method of attaining the largest possible simplex within the nonnegative intensity boundary constraint (4). This amounts to finding the most dissimilar spectra within the solution space. We have shown that in some cases this approach results in erroneous pure spectra (11). Vandeginste et al. proposed a method for two- or three-component resolution, based on a simplest spectrum or simplest chromatogram criterion (5). In this case, the two or three simplest spectra or chromatograms generated, based on the area/norm ratio of the spectrum or chromatogram, are considered the best estimates of the underlying component spectra. The resulting solution spectra are fine-tuned by visual inspection of the regenerated elution profiles in the liquid chromatography, multiwavelength UV experiment. It was not clear if the technique could be made general and extended to greater than three components. Nor was it clear that the method worked when the nonnegative intensity constraint was violated by noisy data. Kawata et al. also proposed a method utilizing a simplest pure spectra criterion, but only applied it to two-component mixtures (6).
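The reconstruction expressed by eq 1 and 2 is a linear combination of the retained eigenvectors plus the mean vector. The following is a minimal sketch of that step in Python, not the authors' FORTRAN-77 code; the function name `reconstruct_spectrum` and the four-channel toy eigenvectors are hypothetical, and the error term e is omitted.

```python
def reconstruct_spectrum(coords, eigenvectors, mean=None):
    """Return sum_j coords[j] * eigenvectors[j], plus the mean vector M if given.

    coords       -- coordinates of a point in the eigenspace (a_i, b_i, ...)
    eigenvectors -- retained principal eigenvectors (V_1, V_2, ...)
    mean         -- vector M of mean intensities (used when data were mean centered)
    """
    n_channels = len(eigenvectors[0])
    spectrum = list(mean) if mean is not None else [0.0] * n_channels
    for coefficient, vector in zip(coords, eigenvectors):
        for channel in range(n_channels):
            spectrum[channel] += coefficient * vector[channel]
    return spectrum


# Toy, made-up eigenvectors and mean vector (illustration only).
v1 = [0.5, 0.5, -0.5, -0.5]
v2 = [0.5, -0.5, 0.5, -0.5]
mean = [0.25, 0.25, 0.25, 0.25]

pure = reconstruct_spectrum((0.2, -0.1), (v1, v2), mean)
print(pure)  # approximately [0.30, 0.40, 0.10, 0.20]
```

Because the toy eigenvectors each sum to zero and the mean vector sums to one, the reconstructed spectrum keeps unit area, which mirrors the unit-area normalization used in the text.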
A fourth method of restricting the solution space has been proposed by Borgen and Kowalski (7) and independently by Kim (8). This method utilizes all the information present in the geometry of the boundary constraints to define even more limited regions of solution space containing the pure component projections. The

0003-2700/86/0358-2622$01.50/0 © 1986 American Chemical Society

ANALYTICAL CHEMISTRY, VOL. 58, NO. 13, NOVEMBER 1986

Figure 1. Projection/boundary diagram resulting from SMCR of a mixture of 1-heptanol, 1-octanol, and 1-nonanol. Projections of mixture spectra, projections of pure spectra (connected by a dotted line), inner boundary (dashed line), and closest outer boundaries (solid lines) are shown. Many of the outer boundary lines are outside the figure window.

technique is largely dependent upon the quality of the experimental data, since the calculation of boundary constraints utilizes only the information contained in the data. For the case of data containing some negative intensities, some outer boundaries would cross into the solution space, requiring further assumptions to correct the inconsistency. Also, when few mixtures have been acquired, as from a narrow GC-FTIR peak, it is possible that the mixture compositions will not adequately span a range of concentrations of the underlying components. It then becomes more difficult to determine the spectra of each component, since the mixture spectra do not closely resemble those of the components.

We have recently proposed a method by which the information contained in a library of reference spectra could be used to greatly narrow the solution space, and thus increase the certainty in the estimated pure component spectra (11). Good results were obtained with simulated two- and three-component mixtures of vapor-phase infrared spectra. The technique was also shown to give good results when data were noisy, when the nonnegativity constraint was violated, and when the spectrum of an underlying component was not present in the library. This paper presents the extension of the method to mixtures of up to five components. An improved heuristic is presented whereby a modified simplex program is used in conjunction with the library search program to greatly narrow the solution space, thus improving the estimates of pure component spectra.

EXPERIMENTAL SECTION

All programs were written in this laboratory in FORTRAN-77 and run on a Digital Equipment Corp. VAX 11/750. The EPA library of 3300 vapor-phase infrared spectra was used as a source of spectra for simulated mixture studies and as a reference for library searching routines. Each original full-intensity spectrum consisted of 1842 data points, sampled at 2 cm⁻¹ resolution from 4000 cm⁻¹ to 450 cm⁻¹. Each of the 3300 library spectra was reduced to a 231-dimensional spectrum by a combination of moving and boxcar averaging (12), which corresponds to a sampling interval of about 16 cm⁻¹. A 50-channel segment of each reduced spectrum that roughly corresponded to the fingerprint region of the original spectrum was then selected. The intensities, linear in absorbance units, were normalized such that each spectrum had unit area. In addition, the first 500 spectra in the library were separated and used as a "short" library. This resulted in a 500 spectra by 50 channels library used in program development and simulation studies.

Library spectra were selected and combined in various randomly selected proportions to generate spectra for mixture analysis. In some experiments normally distributed random noise was added to the mixture spectra to produce noisy spectra with a preselected signal to noise ratio (S/N). These data sets were then used as test sets for evaluation of the combined self-modeling curve resolution with library searching (SMCR-LS) performance. A typical data set consisted of ten randomly generated mixtures; however, in some instances, more than ten mixtures were generated. Some of the necessary matrix manipulations were performed using subroutines from the IMSL package (IMSL, Houston, TX).

The spectra used in the simulation studies are shown in Figure 2. They are 1-hexanol, 1-heptanol, 1-octanol, 1-nonanol, and 1-decanol, respectively. The spectra for these compounds are quite similar, and it was expected that resolution of mixtures of these spectra would represent a difficult case in actual spectral resolution problems. For the three-component mixture studies, 1-heptanol, 1-octanol, and 1-nonanol were used. Additional spectra for greater than three-component mixtures were added from the above list.

The gridding and contouring were done as described previously (11), with the following modifications: an 11 × 11 point grid was used, the spectra were normalized to unit area to match the normalization of the library spectra, and the 500 spectra library was used as the reference instead of the full 3300 spectra library. A modified simplex program was written and tested in this laboratory (13).
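The generation of simulated mixtures described above (unit-area library spectra combined in random proportions, with optional normally distributed noise) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the S/N convention used here (largest mixture intensity divided by the noise standard deviation) is an assumption, since the text does not define how S/N was computed.

```python
import random


def normalize_unit_area(spectrum):
    """Scale a spectrum so that its intensities sum to one (unit area)."""
    total = sum(spectrum)
    return [x / total for x in spectrum]


def random_mixture(pure_spectra, rng, snr=None):
    """Combine unit-area pure spectra in random proportions.

    Returns (fractions, mixture). If snr is given, Gaussian noise is added with
    standard deviation max(mixture) / snr (an assumed S/N definition).
    """
    weights = [rng.random() for _ in pure_spectra]
    total = sum(weights)
    fractions = [w / total for w in weights]
    n_channels = len(pure_spectra[0])
    mixture = [
        sum(f * s[ch] for f, s in zip(fractions, pure_spectra))
        for ch in range(n_channels)
    ]
    if snr is not None:
        sigma = max(mixture) / snr
        mixture = [x + rng.gauss(0.0, sigma) for x in mixture]
    return fractions, mixture


# Made-up four-channel "spectra" standing in for library entries.
rng = random.Random(0)
pure_a = normalize_unit_area([1.0, 2.0, 3.0, 4.0])
pure_b = normalize_unit_area([4.0, 3.0, 2.0, 1.0])
fractions, clean = random_mixture([pure_a, pure_b], rng)
_, noisy = random_mixture([pure_a, pure_b], rng, snr=20)
```

A noise-free mixture of unit-area spectra with fractions summing to one is itself unit area; with noise added, individual intensities can become negative, which is the situation the text describes as violating the nonnegativity constraint.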
The decision rules for simplex progression followed closely those described by Deming and Morgan (14). A test simplex of the appropriate dimensionality was determined (see discussion below). From the coordinates of each simplex vertex and the principal eigenvectors, a spectrum was reconstructed by using eq 2. The reconstructed spectrum was compared to the library of reference spectra, and the sum of squared differences between the test spectrum and each reference spectrum was calculated. The smallest search distance was then saved and used as the objective function in a simplex search for the solution points.

Ten randomly generated data sets of mixtures of the three test spectra described above were analyzed by the SMCR-LS method for a given value of a parameter of interest. The performance of the method was evaluated as percent correct, i.e., the number of runs where all component spectra were unambiguously identified vs. the total number of runs. In this way, a given parameter could be varied and the effect on the performance of the SMCR-LS method tabulated as a percentage related to resolution accuracy.

RESULTS AND DISCUSSION

The primary objective of the use of a combined simplex/library search routine is the attainment of the best estimate possible of the transformation coordinate points described above. Reconstructed pure spectra and concentration estimates will be the most accurate when the best estimates of the pure projections are used. Computer simulations are shown where a simplex optimization of a library comparison metric can accurately locate these pure projection points.

Three-Component Problem. The progression of a series of simplexes for a three-component mixture was plotted over a contour diagram for the mixture and is shown in Figure 3. The corresponding projection diagram containing the calculated outer boundaries is given in Figure 1.
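The library-search objective described in the Experimental Section (the smallest sum of squared differences between a reconstructed test spectrum and any reference spectrum) can be sketched in a few lines. The function name `search_distance` and the three-channel toy library are hypothetical; the real library held 500 or 3300 spectra of 50 channels each.

```python
def search_distance(test_spectrum, library):
    """Return (smallest sum of squared differences, name of best-matched spectrum)."""
    return min(
        (sum((t - r) ** 2 for t, r in zip(test_spectrum, reference)), name)
        for name, reference in library.items()
    )


# Made-up unit-area reference spectra (illustration only).
library = {
    "compound A": [0.50, 0.30, 0.20],
    "compound B": [0.20, 0.30, 0.50],
}

distance, match = search_distance([0.45, 0.30, 0.25], library)
print(match, distance)  # compound A, distance close to 0.005
```

In a simplex search, this distance would be evaluated at each vertex after reconstructing the spectrum for that vertex, so a value of zero corresponds to an exact match with a library member.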
For this mixture of heptanol, octanol, and nonanol, there is a large region between the inner and outer boundaries within which the solution points are found. Gridding and contouring greatly decreased the solution zones (Figure 4), but a significant uncertainty remains resulting from the gridding resolution (compare Figures 4 and 5). In Figure 4, an evenly spaced 25 × 25 point grid was used to generate the contour levels. Note


Figure 2. Vapor-phase infrared spectral fingerprint regions for the compounds used in this study: (a) 1-hexanol, (b) 1-heptanol, (c) 1-octanol, (d) 1-nonanol, (e) 1-decanol. Horizontal axis: wavenumber (1650 to 1050 cm⁻¹).

that even with this grid resolution the projections of the pure spectra (squares) do not lie centered within a minimum, but rather are offset a small amount. This condition is worsened when an 11 × 11 point grid is used (Figure 5). As the grid resolution decreases, the response minima center around a nearby grid point rather than the actual pure spectrum projection point associated with the true minimum. The simplex optimization procedure is not affected by the grid resolution problem, and as observed in Figure 3, each simplex run locates a pure spectrum point closely.

The addition of random noise to the data set has the effect of reducing the certainty of pure point location. This is indicated by greater comparison distances in the search routine and shallower minima surrounding the pure points in the contour diagram (11). Also, if random variation in the base line should cause one or more intensities to be less than zero, then outer boundaries can actually cross the inner boundaries, confounding the accurate estimation of the pure point using these restrictions only. Figure 6 is a projection diagram for a three-component mixture of heptanol, octanol, and nonanol where random noise was added to the data to give an S/N of 20. Note that an outer boundary line crosses the inner boundary zone, and in fact two of the pure points lie outside of the restricting boundary. Figure 7 shows the results of contouring and the simplex location of the pure points. There is a significant ambiguity in pure point location using contouring alone; however, the simplexes are seen to locate each of the three solution points accurately.
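The grid-resolution effect discussed above can be illustrated with a toy response surface. The sketch below compares the best point on an 11 × 11 grid with the result of a minimal variable-size simplex search (reflection, expansion, contraction, and shrink steps in the general style of Nelder and Mead). It is not the authors' modified simplex program, and the quadratic response function is a made-up stand-in for the library search distance.

```python
import math


def simplex_minimize(f, simplex, iterations=100):
    """Minimal variable-size simplex minimizer (illustrative sketch only)."""
    pts = [list(p) for p in simplex]
    n = len(pts) - 1  # number of dimensions
    for _ in range(iterations):
        pts.sort(key=f)
        best, worst = pts[0], pts[-1]
        centroid = [sum(p[i] for p in pts[:-1]) / n for i in range(n)]

        def along(t):
            # t = 1 reflects the worst vertex through the centroid, t = 2 expands,
            # t = 0.5 contracts outside, t = -0.5 contracts inside.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            pts[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(pts[-2]):
            pts[-1] = reflected
        else:
            contracted = along(0.5) if f(reflected) < f(worst) else along(-0.5)
            if f(contracted) < min(f(reflected), f(worst)):
                pts[-1] = contracted
            else:  # shrink every vertex halfway toward the best one
                pts = [best] + [[(b + x) / 2 for b, x in zip(best, p)] for p in pts[1:]]
    return min(pts, key=f)


true_point = (0.013, -0.004)  # made-up "pure point" location


def response(p):
    return (p[0] - true_point[0]) ** 2 + (p[1] - true_point[1]) ** 2


# Best point on an 11 x 11 grid spanning [-0.02, 0.02] on each axis.
axis = [-0.02 + 0.004 * i for i in range(11)]
grid_best = min(((x, y) for x in axis for y in axis), key=response)

start = [(0.0, 0.0), (0.008, 0.0), (0.004, 0.007)]
simplex_best = simplex_minimize(response, start)

grid_err = math.dist(grid_best, true_point)
simplex_err = math.dist(simplex_best, true_point)
```

On this toy surface the grid can do no better than its nearest node, whereas the simplex result is limited only by the number of iterations allowed.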


Figure 3. Example of the simplex location of pure points overlaid on a contour diagram generated by SMCR-LS of the data set used to generate Figure 1: (0) pure component projections; (A) mixture spectra projections.


Figure 5. A contour diagram generated using an 11 × 11 grid of data points. The pure component projections are indicated by squares.

Figure 6. Projection/boundary diagram for the three-component data set where random noise was added to the raw data to give S/N = 20.

Figure 4. A contour diagram generated using a 25 × 25 grid of data points. The pure component projections are indicated by squares.

Simplex Starting Location and Size. The starting location and size of each simplex were found to affect several important aspects of simplex performance. These include the number of required steps to reach an acceptable result and the probability that all pure points will be found with a minimum number of simplex starts. Although several possibilities can be imagined for simplex starting location, a method based on the most dissimilar mixture projections was used in this study. For a three-component mixture, the minimum number of simplex starts necessary to find all pure points is three. The mixture projections closest to the pure points represent mixtures most pure in one component. The mixture projection at the greatest Euclidean distance from the centroid is one such projection and was the first to be picked. A simplex progression beginning at this point was then allowed to locate a pure point. The mixture projection at the greatest Euclidean distance from the first pure point found represents the mixture spectrum with the least contribution from component one. A simplex progression beginning at this second point was allowed to locate a second pure point. The mixture spectrum that contains the smallest contribution from both pure component one and pure component two will, in general, be represented by the mixture projection at the greatest perpendicular distance from the line connecting pure point one and pure point two. A simplex progression was then begun from this point. Each additional starting point, necessary as the number of components increases beyond three, can be found in an analogous way. For example, the fourth starting point is that mixture projection at the greatest perpendicular distance from the plane formed by the three previously found pure points.

The starting size of the simplexes was also found to be important. Figure 8 shows the progression of three simplexes for a three-component mixture. The experiment was obviously not completely successful, however, as two simplexes located the same pure point. This occurred because the starting size

of the third simplex was large enough to allow it to reflect into the response well of a previously found pure point. Starting the simplex too small has the disadvantage of requiring too many steps to accurately locate a minimum, or the simplex may locate a local minimum for noisy data. An optimal value for the starting size of each simplex was determined by Monte Carlo experiments as described above. Each simplex in the three-component case was begun as an equilateral triangle. The length of each side of the triangle was calculated as some fraction of the range in the mixture projections. The Euclidean distance between each pair of mixture projections in the eigenspace was calculated. The range was taken as the greatest distance found. Figure 9 gives the number of random data sets where all underlying components were unambiguously identified vs. the total number of data sets analyzed (expressed as a percent), for various lengths of simplex sides. A length of 0.2 × range was considered optimal and was used in subsequent experiments.

Figure 7. Simplex location of the pure-component projections and contour diagram for the data illustrated in Figure 6.

Figure 8. Incomplete simplex location of the solution points for a mixture of 1-heptanol, 1-octanol, and 1-nonanol caused by an excessively large initial simplex.

Figure 9. Performance of SMCR-LS for ten random three-component mixtures at several initial simplex sizes: (0) no added noise to the data; (A) S/N = 20. The length of the initial simplex size equaled the factor shown times the range in mixture projections. Horizontal axis: simplex size factor.

Table I. Four-Component SMCR-LS Result

coordinates of selected simplex moves
      X          Y          Z      response function   best matched spectrum
   0.0134    -0.0026     0.0012        0.44E-4         1-nonanol
   0.0147     0.0013     0.0025        0.63E-5         1-nonanol
   0.0148     0.0035     0.0044        0.12E-5         1-nonanol
   0.0151     0.0034     0.0038        0.77E-6         1-nonanol
   0.0147     0.0034     0.0038        0.34E-6         1-nonanol
   0.0148     0.0036     0.0034        0.19E-6         1-nonanol
   0.0145     0.0035     0.0033        0.81E-7         1-nonanol
  -0.0096    -0.0029    -0.0031        0.31E-4         1-octanol
  -0.0109    -0.0043     0.0009        0.76E-5         1-octanol
  -0.0098    -0.0026     0.0023        0.16E-5         1-octanol
  -0.0081    -0.0034     0.0031        0.10E-5         1-octanol
  -0.0091    -0.0032     0.0025        0.12E-6         1-octanol
  -0.0088    -0.0033     0.0023        0.29E-8         1-octanol
  -0.0088    -0.0033     0.0024        0.71E-9         1-octanol
  -0.0133     0.0035    -0.0007        0.14E-4         1-heptanol
  -0.0094     0.0048    -0.0020        0.56E-6         1-heptanol
  -0.0103     0.0052    -0.0014        0.89E-6         1-heptanol
  -0.0098     0.0045    -0.0022        0.12E-6         1-heptanol
  -0.0099     0.0046    -0.0019        0.30E-7         1-heptanol
  -0.0100     0.0046    -0.0020        0.36E-7         1-heptanol
  -0.0099     0.0045    -0.0019        0.14E-7         1-heptanol
  -0.0099     0.0045    -0.0018        0.36E-8         1-heptanol
   0.0119    -0.0037    -0.0009        0.47E-4         1-decanol
   0.0079    -0.0051    -0.0023        0.87E-5         1-decanol
   0.0066    -0.0060    -0.0033        0.19E-5         1-decanol
   0.0061    -0.0070    -0.0033        0.56E-6         1-decanol
   0.0063    -0.0074    -0.0034        0.22E-6         1-decanol
   0.0062    -0.0073    -0.0036        0.15E-6         1-decanol
   0.0068    -0.0073    -0.0038        0.57E-7         1-decanol
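The starting-point heuristic and the range-based simplex size described under Simplex Starting Location and Size can be sketched for the two-dimensional (three-component) case as follows. The function names and the toy projections are hypothetical, and the orientation of the starting triangle relative to the chosen projection is an assumption, since the text specifies only that the triangle is equilateral with sides of 0.2 × range.

```python
import math


def first_start(projections):
    """Mixture projection farthest from the centroid of all projections."""
    centroid = [sum(c) / len(projections) for c in zip(*projections)]
    return max(projections, key=lambda p: math.dist(p, centroid))


def second_start(projections, pure1):
    """Mixture projection farthest from the first pure point found."""
    return max(projections, key=lambda p: math.dist(p, pure1))


def third_start(projections, pure1, pure2):
    """Mixture projection at the greatest perpendicular distance from the
    line through the first two pure points (two-dimensional case)."""
    (x1, y1), (x2, y2) = pure1, pure2
    length = math.dist(pure1, pure2)

    def perpendicular(p):
        return abs((x2 - x1) * (y1 - p[1]) - (x1 - p[0]) * (y2 - y1)) / length

    return max(projections, key=perpendicular)


def initial_triangle(start, projections, factor=0.2):
    """Equilateral triangle with side = factor * (largest pairwise distance).
    Placing `start` at one vertex is an assumed orientation."""
    span = max(math.dist(p, q) for p in projections for q in projections)
    side = factor * span
    x, y = start
    return [(x, y), (x + side, y), (x + side / 2, y + side * math.sqrt(3) / 2)]


# Made-up mixture projections; the starting points stand in for pure points here.
projections = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (1.0, 1.0), (2.0, 0.5)]
s1 = first_start(projections)
s2 = second_start(projections, s1)
s3 = third_start(projections, s1, s2)
triangle = initial_triangle(s1, projections)
```

In actual use, each start would be followed by a simplex run, and the pure point located by that run (rather than the starting projection itself) would be passed to the next selection step.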

Four- and Five-Component Problems. When the number of components in a mixture is greater than 3, graphical representation of the solution space is either very difficult or impossible. The success of simplex location of pure component points can be judged by the value of the response function (i.e., the squared distance metric in the library search) as the simplex moves toward a minimum in the response surface. Small values of the response function indicate a close match to a spectral library member (with a value of zero being an ideal match). Little change in the response function as the simplex progresses indicates that a minimum has been reached. In Table I, the response functions for a four-component mixture, as each of four simplexes progresses, are given. Four distinct minima were identified. In addition, the best

matched spectrum for each minimum corresponds to one of the mixture components. Only a few of the simplex moves are shown since many moves are exploratory and do not result in an improvement of the response function. At present, each simplex stops upon completion of a set number of program iterations (usually 10). It is sometimes possible to obtain even better pure point location with more iterations; however, this limit of accuracy was not routinely determined in this study. An analogous successful result for a five-component mixture is given in Table II. The result for a four-component mixture where normally distributed noise was added to give an S/N of 20 is given in Table III. A complete set of underlying components was identified, but with greater search distances than in the noise-free case (also observed in the three-component case). It is impossible to visually confirm the accuracy with which the simplexes located the pure component projection points, but it is reasonable to expect that, as in the three-component case, each final simplex vertex is a good estimate of those points since the correct spectral identities occur repeatedly toward the bottom of each list (in Table III), and the search distances (approximately …) are comparable to those resulting from successful three-component experiments.

Table II. Five-Component SMCR-LS Result

coordinates of selected simplex moves
      X           Y           Z           W       response function   best matched spectrum
   0.05408     0.01187     0.00173    -0.00087        0.14E-2         9-octadecen-1-ol
   0.04409     0.01438     0.00423    -0.00337        0.88E-3         1-nonanol
   0.03223     0.01258     0.00275    -0.00626        0.36E-3         1-nonanol
   0.02723     0.01394     0.00539    -0.00752        0.24E-3         1-nonanol
   0.02307     0.01326     0.01388    -0.00160        0.18E-3         1-nonanol
   0.01640     0.01290     0.01143     0.00032        0.78E-4         1-nonanol
   0.01333     0.01207     0.00872     0.00213        0.47E-4         1-nonanol
  -0.02641    -0.01331    -0.00032    -0.00326        0.45E-3         1-heptanol
  -0.01641    -0.01580    -0.00282    -0.00576        0.28E-3         1-octanol
  -0.00829    -0.01073    -0.00555    -0.00725        0.14E-3         1-octanol
  -0.01182    -0.00678     0.00211    -0.00920        0.87E-4         1-heptanol
  -0.00476    -0.00685     0.00166     0.00248        0.11E-4         1-octanol
  -0.00344    -0.00811    -0.00107     0.00115        0.30E-5         1-octanol
  -0.00394    -0.00800    -0.00019     0.00109        0.23E-5         1-octanol
  -0.01699     0.02260    -0.00675     0.00312        0.37E-3         1-hexanol
  -0.01949     0.01199    -0.00581     0.00405        0.13E-3         1-hexanol
  -0.01492     0.00065    -0.00684     0.00631        0.12E-3         1-hexanol
  -0.01360     0.00801    -0.00200    -0.00384        0.34E-4         1-hexanol
  -0.01301     0.00492    -0.00014    -0.00034        0.48E-5         1-hexanol
  -0.01296     0.00578    -0.00073     0.00029        0.26E-5         1-hexanol
  -0.01301     0.00523    -0.00009     0.00118        0.59E-6         1-hexanol
   0.03508    -0.01460    -0.01084    -0.00209        0.80E-3         9-octadecen-1-ol
   0.02508    -0.01709    -0.01334    -0.01014        0.55E-3         1-decanol
   0.01697    -0.01203    -0.01607    -0.00838        0.23E-3         1-decanol
   0.00779    -0.01245    -0.01152    -0.00221        0.23E-3         1-decanol
   0.01079    -0.00971    -0.01392    -0.00488        0.12E-3         1-decanol
   0.01089    -0.00329    -0.01327    -0.00459        0.46E-4         1-decanol
  -0.02733    -0.00915     0.00774    -0.00607        0.44E-3         1-heptanol
  -0.01733    -0.01165     0.01024    -0.01013        0.23E-3         1-heptanol
  -0.01109    -0.00962     0.00384    -0.00189        0.14E-3         1-heptanol
  -0.00793    -0.00663     0.00805    -0.00311        0.95E-4         1-heptanol
  -0.01331    -0.00395     0.00538    -0.00030        0.40E-4         1-heptanol
  -0.01039    -0.00198     0.00395    -0.00280        0.15E-4         1-heptanol

Table III. Four-Component SMCR-LS Result with Added Noise

coordinates of selected simplex moves
      X           Y           Z       response function   best matched spectrum
  -0.0172     -0.0116     -0.00269        0.19E-03        1-hexanol
  -0.0132      0.00632    -0.00934        0.16E-03        1-hexanol
  -0.00660     0.00101    -0.00978        0.12E-03        1-hexanol
  -0.00616     0.00278    -0.00343        0.37E-04        1-hexanol
  -0.00677     0.00596     0.00337        0.32E-04        1-hexanol
  -0.00944     0.00068    -0.00071        0.23E-04        1-hexanol
  -0.00864     0.00269    -0.00019        0.17E-04        1-hexanol
   0.0192     -0.00418     0.00091        0.19E-03        1-nonanol
   0.0272      0.00379    -0.00308        0.98E-04        1-nonanol
   0.0232      0.0118      0.00091        0.39E-04        1-nonanol
   0.0229      0.00931    -0.00180        0.22E-04        1-nonanol
   0.0194      0.0105      0.00068        0.21E-04        1-nonanol
   0.0197      0.00658    -0.00147        0.15E-04        1-nonanol
   0.0200      0.00771    -0.00176        0.12E-04        1-nonanol
  -0.00278    -0.0147      0.00007        0.77E-04        1-octanol
  -0.00189    -0.0152     -0.00657        0.52E-04        1-octanol
  -0.00101    -0.00763    -0.00126        0.35E-04        1-heptanol
   0.00342    -0.0125     -0.00258        0.31E-04        1-octanol
   0.00312    -0.00881    -0.00701        0.24E-04        1-octanol
   0.00058    -0.00943    -0.00307        0.19E-04        1-octanol
   0.00204    -0.00993    -0.00550        0.16E-04        1-octanol
   0.00189    -0.0101     -0.00485        0.15E-04        1-octanol
  -0.0155      0.0108     -0.00195        0.14E-03        1-hexanol
  -0.0129      0.00279    -0.00328        0.51E-04        1-hexanol
  -0.00637     0.00043     0.00267        0.38E-04        1-hexanol
  -0.00611    -0.00163     0.00049        0.37E-04        1-heptanol
  -0.00477    -0.00269    -0.00131        0.32E-04        1-heptanol
  -0.00489    -0.00385     0.00026        0.21E-04        1-heptanol

Figure 10. Attempted simplex location of the pure component projections for a mixture of 1-heptanol, 1-octanol, and 1-nonanol where the spectrum for 1-nonanol was removed from the reference library.

Spectra of Mixture Components Missing from the Library. An alternative method of utilizing the information contained in a library of reference spectra for component resolution has been described (15). Since target testing the library spectra has been found to determine the underlying components directly and more quickly than simplex-assisted searching, a comparison of the two methods was done. Results indicate that if all the spectra of the underlying components are present in the library, then target testing can quickly

identify the components. However, if one or more of the spectra for the underlying components are missing from the library, then no clear indication of those identities is possible. This has also been a concern for SMCR-LS performance. It was shown previously (11) that if the spectrum of a similar compound is present in the library, a response minimum can still be observed around the solution point. The spectrum for 1-nonanol was removed from the reference library and a random mixture of 1-heptanol, 1-octanol, and 1-nonanol was analyzed by SMCR-LS with simplex-assisted searching. For this mixture, removal of one of the component spectra from the reference library prohibited the direct location of its pure

point projection by searching (either by simplex or contouring) since, as observed in Figure 10, no significant minimum is apparent near the projection of 1-nonanol. However, three distinct minima were located by simplex for a mixture of o-cresol, p-cresol, and 2-bromo-p-cresol, where the spectrum for o-cresol was removed from the library. The spectra for compounds in this mixture were treated as described previously for the alcohols. Figure 11 shows the location of the three pure points by simplex, including the pure point for o-cresol, which shows little or no minimum contouring. There is some error observed in the location of the o-cresol point, as the simplex converges on a point close to, but not exactly coincident with, the projection of the o-cresol spectrum (indicated in the figure by a square). The best match to a spectrum in the library at this point was 2-ethylphenol. These results indicate that the spectra of the mixture components must be sufficiently distinct from each other, and that the spectrum of the compound most similar to the missing spectrum must also be distinct from the spectra of the remaining mixture components, to produce a local minimum in the library searching response function around the projection of the pure component's spectrum. Good estimates of the solution point for a compound whose spectrum is not in the reference library have been obtained for other mixtures of relatively similar spectra, and it is believed that this capability will break down only in the limit of mixtures of very similar spectra. Thus, even though the exact identity of the component with the missing spectrum cannot be determined, a close approximation can be found that can be used for subsequent qualitative and quantitative analyses.

Figure 11. Simplex location of the pure component projections for a mixture of o-cresol, p-cresol, and 2-bromo-p-cresol where the spectrum for o-cresol was removed from the reference library.

Library Search Requirements for Simplex Optimization. A savings in library search time was realized over contouring, as no more than 40 individual searches were necessary to locate a projection point by simplex, whereas hundreds of searches were found to be necessary to construct a grid of sufficient resolution to locate the projection points with similar accuracy (as indicated by the distance between the reconstructed test spectrum and the best matched library spectrum). For example, 625 searches would be necessary to construct a 25 × 25 point grid for contouring. Further, the examples described in this study indicate that, for a three-component mixture, the simplex method can locate each pure point with reasonable accuracy with fewer than 25 individual searches. More searches through the library were necessary as the number of mixture components increased. If the simplex search routine was allowed to cycle through the program 15 times, each projection point was accurately located with no more than 50 searches. A total of no more than 200 searches would then be necessary to accurately locate all the solution points for a four-component mixture. If a 25 × 25 × 25 grid could be used to locate these same four points in the three-dimensional solution space, a rapid increase in required searches as the dimensionality increases is immediately apparent. In some instances, fewer than 50 searches were necessary to locate a solution point by simplex for a four-component mixture with sufficient accuracy. Presently, the researcher must decide the accuracy with which each pure point should be located and balance that with the analysis time required to attain that accuracy.
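The search counts quoted above follow from simple arithmetic: a grid with g points per axis in a (k - 1)-dimensional solution space needs g^(k-1) library searches, while the simplex approach needs at most a fixed number of searches per pure point. A small sketch, with hypothetical function names, reproduces the figures in the text.

```python
def grid_searches(points_per_axis, n_components):
    """Library searches needed to fill a grid over the (k - 1)-dimensional space."""
    return points_per_axis ** (n_components - 1)


def simplex_searches(searches_per_point, n_components):
    """Upper bound on library searches when each pure point is located by simplex."""
    return searches_per_point * n_components


print(grid_searches(25, 3))     # 625 searches for a 25 x 25 grid
print(simplex_searches(25, 3))  # 75 searches at 25 per point, three components
print(grid_searches(25, 4))     # 15625 searches for a 25 x 25 x 25 grid
print(simplex_searches(50, 4))  # 200 searches at 50 per point, four components
```

The grid count grows geometrically with the number of components, whereas the simplex bound grows only linearly for a fixed number of searches per point.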

CONCLUSIONS

The use of a modified simplex to determine the response function minima, corresponding to the pure component projection points of infrared spectra of mixtures, extends the multicomponent curve resolution method to four and five components. Uncertainties in pure point location are reduced over those reported previously for contouring and over those apparent in other recently reported methods (4-8). It has been observed that the placement and size of initial simplexes, in addition to the number of iterations, were important factors affecting the performance of the algorithm. Improved performance is anticipated with the use of a more robust simplex algorithm (16).

LITERATURE CITED

(1) Delaney, M. F. Anal. Chem. 1984, 56, 261R.
(2) Lawton, W. H.; Sylvestre, E. A. Technometrics 1971, 14, 617.
(3) Osten, D. W.; Kowalski, B. R. Anal. Chem. 1984, 56, 991.
(4) Meister, A. Anal. Chim. Acta 1984, 161, 149.
(5) Vandeginste, B.; Essers, R.; Bosman, T.; Reijnen, J.; Kateman, G. Anal. Chem. 1985, 57, 971.
(6) Kawata, S.; Komeda, H.; Sasaki, K.; Minami, S. Appl. Spectrosc. 1985, 39, 610.
(7) Borgen, O. S.; Kowalski, B. R. Anal. Chim. Acta 1985, 174, 1.
(8) Kim, R. Ph.D. Thesis, Massachusetts Institute of Technology, 1985.
(9) Vandeginste, B.; Derks, W.; Kateman, G. Anal. Chim. Acta 1985, 173, 253.
(10) Gemperline, P. J. J. Chem. Inf. Comput. Sci. 1984, 24, 206.
(11) Delaney, M. F.; Mauro, D. M. Anal. Chim. Acta 1985, 172, 193.
(12) Warren, F. V., Jr.; Delaney, M. F. Appl. Spectrosc. 1983, 37, 172.
(13) Delaney, M. F.; Warren, F. V., Jr. J. Chem. Educ. 1981, 58, 646.
(14) Deming, S. N.; Morgan, S. L. Anal. Chem. 1973, 45, 278A.
(15) McCue, M.; Malinowski, E. R. Anal. Chim. Acta 1981, 133, 125.
(16) Parker, L. R., Jr.; Cave, M. R.; Barnes, R. M. Anal. Chim. Acta 1985, 175, 231.

RECEIVED for review March 10, 1986. Accepted June 9, 1986.