
Anal. Chem. 1991, 63, 980-985


Successive Average Orthogonalization of Spectral Data

Steven M. Donahue and Chris W. Brown*
Department of Chemistry, University of Rhode Island, Kingston, Rhode Island 02881

A novel method is presented for performing principal component analysis on a spectral data set. The method is fast, reliable, and conceptually easy to follow. The orthogonalization is completed by means of a series of averages. An initial pass through the data yields the first average or loading vector. In each succeeding pass, a new loading vector and the scores for the previously calculated vector are determined. The method has been applied to the analysis of broad-band UV spectra and to near-infrared gas-phase spectra. The results are both qualitatively and quantitatively comparable to standard methods such as the Jacobi transformation, Householder reduction, singular value decomposition, and nonlinear iterative partial least squares (NIPALS). In terms of computational efficiency, the successive average approach is bettered only by Gram-Schmidt orthogonalization, which does not provide the benefits of a principal component analysis.

INTRODUCTION
Factor analysis has proven to be an effective means of processing large amounts of data to find a limited number of vector representations to characterize the data (1-3). More properly, it is principal component analysis, rather than the more general factor analysis, that has found widespread use in spectroscopy. Using the terminology of Malinowski (1), let us consider the initial steps in a principal component regression (PCR) analysis. The first step is to construct the data matrix D. Typically, the number of rows in this matrix is equal to the number of spectra, with the number of columns equal to the number of data points in each spectrum. The data matrix can be written as the product of two matrices

D = RC

where R is the matrix of scores and C is the matrix of loading vectors.
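As an illustration of this notation only (the names, dimensions, and row-major layout below are assumptions for the sketch, not taken from the paper's software), a data matrix with one spectrum per row can be reconstructed from a scores matrix and a loadings matrix as follows:

```c
/* Minimal sketch of D = RC: nspec spectra of npts points each, stored
 * row-major. R is nspec x nvec (scores), C is nvec x npts (loading vectors).
 * All names and the row-major layout are illustrative assumptions. */
void reconstruct(const double *R, const double *C, double *D,
                 int nspec, int nvec, int npts)
{
    for (int i = 0; i < nspec; i++) {           /* one row of D per spectrum */
        for (int k = 0; k < npts; k++) {
            double sum = 0.0;
            for (int j = 0; j < nvec; j++)      /* D[i][k] = sum_j R[i][j]*C[j][k] */
                sum += R[i * nvec + j] * C[j * npts + k];
            D[i * npts + k] = sum;
        }
    }
}
```

With the full set of vectors the product reproduces D exactly; with fewer vectors it gives the least-squares approximation that principal component methods exploit.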


One method for determining the scores and loadings begins with the formation of the covariance matrix Z as

Z = DDt

where Dt is the transpose of the data matrix. The covariance matrix is then diagonalized,

(QtZQ)ij = δij λj

where δij is the Kronecker delta, which is equal to 1 when i = j and equal to 0 when i ≠ j. This is nothing more than an eigenanalysis where each λj is an eigenvalue and the eigenvectors are the columns of Q. The eigenvectors are the loading vectors, and the scores are nothing more than the projections of the original spectral data onto the vectors. The eigenvalues can be determined as the sum of the squares of the projections associated with each loading vector (i.e., the sum of the squares of a column in the scores matrix). When the number of spectra is less than the number of data points, the vectors can be expanded into a matrix V as

V = QtD

The vectors in V are abstract spectral representations and can be linearly combined (in the same manner as the original eigenvectors) to reproduce the original data. What is important to remember is that the vectors are orthogonal, with the first accounting for the greatest percentage of the variance in the data. Succeeding vectors account for the greatest percentage of the variance not represented by the preceding vectors.

There are a variety of methods designed to obtain the orthogonal vectors. The diagonalization of the covariance matrix as outlined above is generally accomplished by means of a series of Jacobi transformations or by a Householder reduction (4-7). In the Jacobi method, one element at a time is eliminated, the process iterating until diagonalization is complete. In a Householder reduction, a row and a column at a time are eliminated to bring the matrix to tridiagonal form. A QR factorization, which is iterative, is then applied to complete the diagonalization. It is possible, however, to operate on the data matrix directly, as is the case for singular value decomposition (SVD) and nonlinear iterative partial least squares (NIPALS). SVD begins with a Householder reduction to bring the data matrix to bidiagonal form. This is followed by an iterative QL factorization to yield the eigenvectors and eigenvalues. NIPALS (8) begins by taking one of the original data vectors (spectra) to be the initial orthogonal vector. It is then modified in an iterative fashion until some convergence criterion is met. The original data are made orthogonal to this vector, and the process is repeated until the required number of vectors have been determined.

Gram-Schmidt orthogonalization (9) is the simplest procedure for obtaining useful vectors. While it does not produce scores and loading vectors in the strictest sense, its simplicity and popularity require its consideration with the other methods. The initial step is to arbitrarily select one of the original data vectors (spectra) as the first orthogonal vector. It is then normalized, and all of the other spectra are made orthogonal to it. Then a second vector is arbitrarily selected from the remaining modified spectra. This one is also normalized, and all the modified spectra are made orthogonal to it as well. The process is repeated until the necessary number of vectors are calculated. The method can be improved by selecting the longest vector at each step rather than making arbitrary selections.
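For reference, a brief C sketch of the Gram-Schmidt procedure just described, using the longest-vector refinement; the array layout, names, and the absence of a zero-norm guard are illustrative assumptions rather than the authors' implementation:

```c
/* Hedged sketch of Gram-Schmidt orthogonalization of spectra: at each step
 * the longest remaining (modified) spectrum is normalized to become the next
 * orthogonal vector, and its projection is removed from every spectrum.
 * spectra: nspec x npts, modified in place; vectors: nvec x npts (output),
 * with nvec <= nspec. Row-major double arrays are assumed. */
#include <math.h>

static double dot(const double *a, const double *b, int n)
{
    double s = 0.0;
    for (int k = 0; k < n; k++) s += a[k] * b[k];
    return s;
}

void gram_schmidt(double *spectra, int nspec, int npts, double *vectors, int nvec)
{
    for (int v = 0; v < nvec; v++) {
        /* pick the longest remaining spectrum as the next basis vector */
        int best = 0;
        double bestlen = -1.0;
        for (int i = 0; i < nspec; i++) {
            double len = dot(&spectra[i * npts], &spectra[i * npts], npts);
            if (len > bestlen) { bestlen = len; best = i; }
        }
        double norm = sqrt(bestlen);
        for (int k = 0; k < npts; k++)
            vectors[v * npts + k] = spectra[best * npts + k] / norm;

        /* remove the new vector's contribution from every spectrum */
        for (int i = 0; i < nspec; i++) {
            double p = dot(&spectra[i * npts], &vectors[v * npts], npts);
            for (int k = 0; k < npts; k++)
                spectra[i * npts + k] -= p * vectors[v * npts + k];
        }
    }
}
```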

Figure 1. Flow chart describing the steps required to perform successive average orthogonalization for n spectra.

As one would expect, each method has its advantages and disadvantages. Gram-Schmidt, though computationally efficient, suffers from two important limitations. Since a Gram-Schmidt vector is derived from a single data vector, the method is more susceptible to the effects of outliers. Additionally, the Gram-Schmidt method is not as effective at extracting information from noise. Indeed, one of the major strengths of factor analysis is its ability to separate information from noise (real data plus imbedded errors from extractable errors (1)). The principal component methods (Jacobi, Householder, SVD, and NIPALS) differ in their precision and computation time. For quantitative work, the Jacobi approach has been the method of choice in our laboratory (10-12). While this method is slower than most, it is not prohibitive on the scale of most quantitative problems. It is, however, extremely stable, which is desirable for these types of problems. As we began to investigate more qualitative analyses, and consequently larger data sets, the need for a more computationally efficient method became apparent. Moreover, the complexity of the calculations makes it difficult if not impossible to run the process in reverse, something that may allow the extraction of pure component spectra from loading vectors derived from mixtures. Our efforts to find a computationally efficient and conceptually straightforward method led to the development of successive average orthogonalization.

EXPERIMENTAL SECTION
UV-visible spectra were measured in 1-cm cuvettes from 200 to 350 nm on a Beckman Instruments Model DU-7 spectrophotometer. All samples were dissolved in hexane at a v/v ratio of 1/3333. The hexane was Fisher Scientific HPLC grade. Other chemicals were Mallinckrodt organic reagent grade. Near-infrared spectra were measured on a Bio-Rad (Digilab Division) Model FTS-40 near-IR interferometer between 3500 and 10000 cm-1. For each spectrum, 512 scans were collected at a resolution of 4 cm-1. The source of the gases, the concentration ranges, the experimental conditions, and mixture spectra are given elsewhere (12). All of the spectra were transferred to an IBM-PC for processing. All of the software was written in C and operated on an IBM-PC or PC/AT using an 80(2)87 math coprocessor. The programs for SVD and the Householder reduction were taken from Numerical Recipes in C (13) and used with minor changes. All remaining software was written in-house.

METHOD
Successive average orthogonalization is, as implied by its name, a process that obtains orthogonal vectors by successively averaging and modifying the original data (spectra). Figure 1 is a flow chart describing the method. Computationally, there are many similarities between the Gram-Schmidt and the successive average methods. The differences, associated with averaging, are what allow the method to produce vectors comparable to those obtained by the Jacobi, Householder, SVD, and NIPALS approaches.


One might ask why the spectra are averaged and how this produces useful vectors. Consider the synthetic example presented in Figure 2. Part a of the figure shows 5 of a total of 11 two-component spectra, each containing 5% T noise. Each component is represented by a single Gaussian band. We know that the first vector must account for the greatest percentage of the variance in the data. This is nothing more than the average of the input spectra. The normalized average becomes the first loading vector, as shown in part c of the figure. Notice that the noise level in this vector has been reduced relative to the original spectra. The next step is to make all of the original spectra orthogonal to the first vector. The orthogonalization process is the same as that used in the Gram-Schmidt procedure. First, the dot product between the average spectrum and the spectrum to be orthogonalized is determined. Next, an amount equal to the dot product times the average spectrum is subtracted from the spectrum to be orthogonalized. Figure 2b shows what happens when this is done to the spectra of part a. Notice that all the useful information in the 50/50 mixture has been removed. Due to the concentrations of the data set (ranging from 100% of one component to 100% of the other component in 10% increments), this mixture corresponds to the average of all the spectra except for the noise level.
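A minimal sketch of this single orthogonalization step, assuming a normalized loading vector stored as an array of doubles (the function name and arguments are hypothetical, not from the authors' software):

```c
/* Remove the projection of one spectrum onto a normalized loading vector:
 * the dot product is the score, and score x loading is subtracted out. */
void make_orthogonal(double *spectrum, const double *loading, int npts)
{
    double score = 0.0;
    for (int k = 0; k < npts; k++)        /* dot product = score */
        score += spectrum[k] * loading[k];
    for (int k = 0; k < npts; k++)        /* subtract score x loading */
        spectrum[k] -= score * loading[k];
}
```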


Figure 2. Synthetic spectra of two components: (a, top) original spectra, (b, middle) spectra made orthogonal to first loading vector, (c, bottom) first and second loading vectors.

Having made all the spectra in the data set orthogonal to the first vector, the averaging process begins again using the modified spectra of part b. There is an additional consideration for the second and subsequent vectors. The modified spectra in part b contain the same information, but if we were to blindly average them, they would cancel each other out because they do not all point in the same direction. To begin the averaging process, a direction must be selected. The exact direction is not particularly important, so the direction of the first spectrum can be chosen as the starting point. To determine the direction of the second spectrum with respect to the first, the dot product between the two is calculated. If the dot product is positive, the two are added; otherwise, the second is subtracted from the first. For the remaining spectra, the dot product between the spectrum and the running average is determined. On the basis of the sign of the dot product, the spectrum is added to or subtracted from the average. When this process has been completed for all the spectra, the average is normalized and becomes the next loading vector. The bottom of Figure 2c shows the results of averaging the modified spectra of Figure 2b. The orthogonalization/averaging cycle can continue until the appropriate number of vectors has been calculated. The scores and eigenvalue associated with a loading vector can be determined during the calculation of the next vector. The sum of the squares of the dot products (scores) calculated during the orthogonalization process corresponds to the desired eigenvalue. Overall, the number of passes through the data is equal to the number of required vectors plus one. The first and last passes do not represent complete calculations. The first pass need not check for direction since the original spectra all point in the same direction. The last pass is needed only to calculate the dot products required for the final scores; no orthogonalization or direction checking is necessary.
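The description above maps onto a fairly short routine. The following C sketch is an illustration only, written under assumed conventions (row-major double arrays, one spectrum per row, hypothetical function and argument names); it is not the authors' published code, and it folds the score and eigenvalue bookkeeping into the same pass that computes each new vector rather than reproducing the extra final pass described in the text.

```c
/* Compact sketch of successive average orthogonalization. Spectra are stored
 * row-major (nspec x npts) and are modified in place. loadings must hold
 * nvec*npts doubles, scores nspec*nvec, eigenvalues nvec. */
#include <math.h>

static double dotp(const double *a, const double *b, int n)
{
    double s = 0.0;
    for (int k = 0; k < n; k++) s += a[k] * b[k];
    return s;
}

void successive_average(double *spectra, int nspec, int npts,
                        double *loadings, double *scores,
                        double *eigenvalues, int nvec)
{
    for (int v = 0; v < nvec; v++) {
        double *load = &loadings[v * npts];

        /* Running average of the (modified) spectra. The first spectrum sets
         * the direction; each later spectrum is added or subtracted according
         * to the sign of its dot product with the running average, so spectra
         * pointing in opposite directions do not cancel. The direction check
         * is unnecessary on the first pass (v == 0). */
        for (int k = 0; k < npts; k++)
            load[k] = spectra[k];
        for (int i = 1; i < nspec; i++) {
            double sign = 1.0;
            if (v > 0 && dotp(&spectra[i * npts], load, npts) < 0.0)
                sign = -1.0;
            for (int k = 0; k < npts; k++)
                load[k] += sign * spectra[i * npts + k];
        }

        /* Normalize; the 1/n factor of a true average is absorbed here, and
         * the normalized sum becomes the next loading vector. */
        double norm = sqrt(dotp(load, load, npts));
        for (int k = 0; k < npts; k++)
            load[k] /= norm;

        /* Project every spectrum onto the new vector (the dot products are
         * the scores), accumulate the eigenvalue as the sum of squared
         * scores, and subtract the projections so the spectra are orthogonal
         * to the new vector for the next pass. */
        eigenvalues[v] = 0.0;
        for (int i = 0; i < nspec; i++) {
            double s = dotp(&spectra[i * npts], load, npts);
            scores[i * nvec + v] = s;
            eigenvalues[v] += s * s;
            for (int k = 0; k < npts; k++)
                spectra[i * npts + k] -= s * load[k];
        }
    }
}
```

Because each pass produces one vector, the loop can simply be stopped after however many vectors are wanted, which mirrors the point made later about halting the calculation before the maximum number of vectors is reached.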

RESULTS AND DISCUSSION
Broad-Band UV Spectra. To demonstrate the method on real spectra, we applied it to nine ultraviolet spectra of m-xylene/toluene mixtures containing from 10 to 90% of each component in 10% increments. As one might expect, the spectra of the components are very similar and there are a number of highly overlapping bands. Five of the mixture spectra are shown in Figure 3. The first orthogonal vector is shown at the top of part d. As for the synthetic case, it is the same as the 50/50 mixture except that it is not as noisy. Making the original spectra orthogonal to the first average vector produces the spectra shown in part b. Averaging these modified spectra, while remembering to keep track of direction, yields the middle vector in part d. Making the spectra of part b orthogonal to the second average vector produces the modified spectra of part c. Although these spectra contain a greater percentage of noise, there is still some useful information. Averaging a third time produces the bottom vector of part d. The process can continue until nine vectors have been calculated.

The question is whether these are useful vectors that compare favorably with those derived from other methods. Figure 4 is a plot of the log of the eigenvalue vs the number of vectors for both the Jacobi and successive average methods. Figure 5 shows the vectors determined by the two methods. Note that the first vector was removed since it is identical within the precision of the computer for both methods. The first four vectors are virtually identical; the differences associated with the later vectors reflect differences in dealing with decreasing signal-to-noise in the modified spectra. When the noise level of the modified spectra increases, the dot product is not foolproof as an indicator of direction. This is not as unfortunate as it might seem, since vectors beyond this point are not likely to be useful. Figure 4 shows differences in the values at the fifth vector and beyond. If the vectors are used for quantitative analysis and the standard error is calculated during cross-validation as a function of the number of vectors, the optimum number is found to be five.


Figure 3. UV spectra of m-xylene and toluene: (a, top left) original spectra, (b, bottom left) spectra made orthogonal to first loading vector, (c, top right) spectra made orthogonal to second loading vector, (d, bottom right) first, second, and third loading vectors.


Figure 4. Log of eigenvalues vs vector number for Jacobi transformations and successive average.

The differences in the eigenvalues do not always appear as large differences in the vectors, as indicated by Figure 5. Only the fourth factor (for which the eigenvalues are close) shows a significant difference. The successive average vector appears to have more band information. Are the differences, however small, significant? To answer that question with respect to quantitative analysis, three two-component analyses were considered: m-xylene/toluene, m-xylene/p-xylene, and toluene/p-xylene. The spectra of the three compounds in these mixtures are shown elsewhere (14). In each case, nine mixtures containing 10-90% of each component were prepared. Cross-validation (leave-one-out method) was performed to determine a standard error of prediction (SEP). In addition to successive average, the data were subjected to analysis by Gram-Schmidt orthogonalization, Jacobi transformations, SVD, NIPALS, and Householder reduction. The results for the Jacobi, SVD, and NIPALS methods were identical within the precision of the computer and are shown together in Figure 6.


Figure 5. Comparison of the second, third, fourth, and fifth loading vectors obtained by Jacobi transformations and successive average (a, top; b, bottom).

There are small differences between the methods, which are a result of the precision of the methods and a function of how each deals with noise. At a 0.05 level of significance, the standard errors cannot be considered different.
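For readers who want to reproduce this kind of comparison, a minimal sketch of the SEP calculation after leave-one-out cross-validation is given below; the paper does not state the exact formula used, so the n-term denominator is an assumption, and the function name is hypothetical.

```c
/* Standard error of prediction from pooled leave-one-out residuals:
 * each sample is predicted by a model built from the remaining samples,
 * and the root-mean-square residual is reported. */
#include <math.h>

double sep(const double *predicted, const double *actual, int n)
{
    double ss = 0.0;
    for (int i = 0; i < n; i++) {
        double r = predicted[i] - actual[i];
        ss += r * r;                 /* accumulate squared residuals */
    }
    return sqrt(ss / n);
}
```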



Figure 6. Standard error of prediction for three two-component UV quantitative analyses (TL = toluene, MX = m-xylene, PX = p-xylene). Methods are Gram-Schmidt, Jacobi, SVD, NIPALS, Householder, and successive average. Jacobi, NIPALS, and SVD results are identical and are shown together.

Table I. Range of Concentrations of Simulated Natural Gas Samples

component         mol %
methane           84.1-95.4
ethane            1.12-15.23
propane           0.0-4.89
n-butane          0.17-0.43
isobutane         0.17-0.38
n-pentane         0.051-0.146
isopentane        0.050-0.152
carbon dioxide    0.053-1.696
nitrogen          0.0-11.65

Near-IR Gas-Phase Spectra. Twenty-five mixtures of simulated natural gas, containing methane, ethane, propane, and very small amounts of higher hydrocarbons, were subjected to cross-validation. The range of component concentrations is given in Table I. Figure 7 shows spectra of the three major components and the first six loading vectors as obtained by the successive average and Jacobi methods. Each pair of vectors is nearly identical down to the smallest noise spike. The results of cross-validation for spectra measured at 100 and 500 psi are given in Figure 8 for the prediction of BTU content. As for the UV spectra, the Jacobi, SVD, and NIPALS methods are grouped together. The differences in SEP between the methods are small. It should be pointed out that for the range of the energy content found in the data set the spectra do not change greatly; thus, we have a test of the methods where the input data are highly collinear. Even under these circumstances, the methods still produce similar results that cannot be considered statistically different.

Computational Efficiency. As a final comparison of the orthogonalization methods, computational cost was considered. For this test, infrared spectra were drawn from the EPA vapor-phase library by using a random number generator. Each method, particularly Gram-Schmidt, NIPALS, and successive average, is capable of calculating fewer than the maximum number of vectors, but the number of vectors to use is rarely known in advance when developing a method. Therefore, in each case, the data set was completely orthogonalized. The five methods included in the comparison are Gram-Schmidt, SVD, Jacobi, Householder, and successive average. NIPALS was also tested, but the number of operations was too great to be shown with the other methods. Even when calculating only a few vectors, it suffers from the fact that it is an iterative technique as compared to Gram-Schmidt and successive average, which directly calculate orthogonal vectors.


Figure 7. Near-infrared gas-phase spectra of natural gas mixtures: (a, top) spectra of methane, ethane, and propane; (b, bottom two) first through sixth loading vectors obtained by the Jacobi and successive average methods.


Figure 8. Standard error of prediction for the BTU content in natural gas mixtures at pressures of 100 and 500 psi. Methods are Gram-Schmidt, Jacobi, SVD, NIPALS, Householder, and successive average. Jacobi, NIPALS, and SVD results are identical and are shown together.

The spectra were preprocessed with a Fourier transform so that the methods could be tested under circumstances where the number of points was less than or greater than the number of spectra.



Figure 9. Number of multiplications/divisions: (a, top) vs number of spectra for 128 data points; (b, middle) vs number of data points for 25 spectra; (c, bottom) vs number of data points for 75 spectra.

The transform allows us to retain considerable spectral information and perform quantitative analysis (10, 12, 14) even when the transform is severely truncated. Figure 9a describes the effect of the number of spectra using the same number of points. Gram-Schmidt is the fastest, as it is for the other combinations of spectra and data points tested.


Next comes the successive average and Householder reduction, followed by SVD and Jacobi transformations. The size of the matrix to be decomposed has an especially strong effect on the latter two methods. Part b of the figure illustrates what happens when the number of points is increased while holding the number of spectra constant. In this case, the Householder is slightly faster than the average for larger numbers of data points, since the size of the matrix to be diagonalized is constant. This also explains the relatively flat curve for the Jacobi method. Since SVD operates on the original data matrix, the effect of the number of points is more dramatic. For a larger number of spectra (Figure 9c), a change in the number of points changes the size of the matrix to be diagonalized. As expected, this causes the number of operations for the Jacobi method to increase steeply. It also does the same for the Householder reduction, so that the successive average method is clearly faster and its number of operations does not rise as steeply.

Basically, successive average orthogonalization produces loading vectors like those found by principal component methods but with a computational approach and in a time frame closer to that of Gram-Schmidt orthogonalization. The successive average approach does not suffer from the limitations of Gram-Schmidt since all spectra are "factored" into the calculation of the vectors. The simplicity of the method makes it easy to implement. It also makes it easier than other principal component methods to stop the calculations before the maximum number of vectors is calculated. The absence of any iterative steps in the method is particularly advantageous.

LITERATURE CITED


(1) Malinowski, E. R.; Howery, D. G. Factor Analysis in Chemistry; Wiley: New York, 1980.
(2) Koenig, J. L.; Tovar Rodriquez, M. J. M. Appl. Spectrosc. 1981, 35, 543-548.
(3) Fredericks, P. M.; Lee, J. B.; Osborn, P. R.; Swinkels, D. A. J. Appl. Spectrosc. 1985, 39, 311-316.
(4) Jolliffe, I. T. Principal Component Analysis; Springer-Verlag: New York, 1986.
(5) Golub, G. H.; Van Loan, C. F. Matrix Computations; Johns Hopkins University Press: Baltimore, MD, 1983.
(6) Wilkinson, J. H.; Reinsch, C. Linear Algebra. Handbook for Automatic Computations; Springer-Verlag: New York, 1971; Vol. II.
(7) Lawson, C. L.; Hanson, R. J. Solving Least Squares Problems; Prentice-Hall: Englewood Cliffs, NJ, 1974.
(8) Geladi, P.; Kowalski, B. R. Anal. Chim. Acta 1986, 185, 1-17.
(9) Arfken, G. Mathematical Methods for Physicists; Academic Press: New York, 1970.
(10) Brown, C. W.; Obremski, R. J.; Anderson, P. Appl. Spectrosc. 1986, 40, 734-742.
(11) Brown, C. W.; Bump, E. A.; Obremski, R. J. Appl. Spectrosc. 1986, 40, 1023-1032.
(12) Donahue, S. M.; Brown, C. W.; Caputo, B.; Modell, M. D. Anal. Chem. 1988, 60, 1873-1878.
(13) Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; Vetterling, W. T. Numerical Recipes in C; Cambridge University Press: Cambridge, 1988.
(14) Donahue, S. M.; Brown, C. W.; Obremski, R. J. Appl. Spectrosc. 1988, 42, 353-358.

RECEIVED for review February 11, 1991. Accepted February 19, 1991.