

Anal. Chem. 1986, 58, 810-815

Rank Estimation of Excitation-Emission Matrices Using Frequency Analysis of Eigenvectors

Thomas M. Rossi and Isiah M. Warner*

Department of Chemistry, Emory University, Atlanta, Georgia 30322

The number of components in mixtures of fluorophores is determined by using a new method for matrix rank estimation of excitation-emission matrix (EEM) formatted data. Mixture EEMs are decomposed into a set of basis vectors (eigenvectors) using eigenvector analysis. These vectors are then Fourier transformed and their frequency distributions are used as a means of differentiating between primary (spectral) eigenvectors and secondary (noise) eigenvectors. Primary eigenvectors are found to have Fourier spectra weighted toward the lower frequency coefficients, whereas Fourier spectra of secondary eigenvectors are found to be weighted toward the high frequency coefficients. An empirical algorithm for rank estimation based on the frequency distributions of eigenvectors is compared to traditional rank estimation methods. Finally, the method developed in this study is applied to 12 blind coded EEMs of mixtures of polynuclear aromatic hydrocarbons of known composition.

The estimation of the number of components in mixture spectra is a problem which has received attention in the fields of infrared spectroscopy (1), mass spectrometry (2), fluorescence spectroscopy (3), and chromatography (4). The general approach taken in these studies is to form a matrix from a series of mixture spectra and then apply eigenvector analysis (or factor analysis) to the matrix to determine the number of components in the samples (5). Two main limitations are encountered under these conditions. First, the number of mixture spectra in the matrix must be greater than the number of components in these mixtures. This requirement is often satisfied by acquiring spectra of a number of unique mixture samples, all of which contain the same components but in independently different relative concentrations. This may be a severe constraint for an unknown sample containing a large number of components. Second, when random noise and error are present in the data, the number of eigenvectors necessary to completely describe the matrix will be equal to the limiting dimension of the matrix. Hence, statistical methods of distinguishing primary eigenvectors (i.e., those eigenvectors containing spectral information) from secondary eigenvectors (i.e., those eigenvectors arising due to noise) must be employed (5-8). In addition, some eigenvectors may have low signal/noise (S/N) and thus may be indistinguishable from noise. Because of these limitations, reliable results are seldom obtained for mixtures containing more than two or three components.

The limitation of having to examine a large number of unique mixture samples in order to obtain an adequate number of mixture spectra for component resolution may be avoided by choosing an analytical technique that will generate an appropriate matrix for mixture resolution from a single mixture sample. When the species of interest are fluorescent, the video fluorometer (VF) is capable of generating such a matrix (9-11). With this technique, matrix formatted data which contain both excitation and emission information about a sample are acquired. Each element of these excitation-emission matrices (EEMs) represents the fluorescence intensity of a sample at a unique excitation and emission wavelength. The two dimensions of selectivity in these data matrices make it possible to apply eigenvector analysis to mixtures of fluorophores without the need of examining more than one sample (11). The size of the EEMs is generally 64 × 64, allowing these matrices to be used for the examination of a relatively large number of fluorophores. However, when eigenanalysis is applied to a 64 × 64 element EEM, random error and noise will produce an EEM of rank 64. Hence, the problem of estimating the number of fluorescent components contributing to an EEM is reduced to one of differentiating between primary and secondary eigenvectors. Techniques for the solution of this problem are known as rank estimation methods.

Traditional statistical methods of rank estimation generally require that some assumptions be made as to the variance of each element in the data matrix (5). For example, the factor indicator function (7) requires that errors be evenly distributed throughout each element in the matrix. In many cases the success of the method depends on being able to estimate accurately this error (noise) associated with the matrix. Often, more than one statistical method is used and the results from the tests are averaged to produce a best guess of the number of components. The results of these tests tend to become less reliable as the number of components in the mixture increases.

In the present study a new method of differentiating between spectral and noise eigenvectors from EEMs is proposed. This method is based on the assumption that in fluorescence spectroscopy, spectral information is broad band (low frequency), whereas noise is narrow band (high frequency). This assumption has proven useful in spectral filtering studies (12). It is important to note, however, that in the present study the frequency differences between spectral information and random noise are used in a unique algorithm to determine the number of components in mixture samples. Specifically, criteria are developed for the determination of the number of true spectral eigenvectors in an EEM through empirical observations of the frequency distributions of Fourier transformed eigenvectors. The results of using this technique on ten test mixtures containing between one and six polynuclear aromatic hydrocarbons (PNAs) are compared to the results obtained by application of several statistical methods. Finally, the frequency distribution method is applied to 12 blind coded EEMs of PNA mixtures of various complexities.

© 1986 American Chemical Society

ANALYTICAL CHEMISTRY, VOL. 58, NO. 4, APRIL 1986

THEORY

Formal Statement of the Problem. The mathematical formulation of the EEM has been discussed elsewhere in the literature (11, 13, 14). However, a brief review will be useful here. By use of linear algebraic formalisms, the EEM of a single-component sample can be represented as a matrix, M, where

$$\mathbf{M} = \alpha\,\mathbf{x}\mathbf{y} \qquad (1)$$

$\alpha$ is a scalar, the magnitude of which is dependent on instrumental conditions and concentration of the fluorophore, and x and y are column and row vectors representing the excitation and emission spectra, respectively, of the fluorophore. For multicomponent samples, ignoring nonlinear fluorescence effects, the mixture matrix may be represented by a linear combination of the component matrices, i.e.

$$\mathbf{M}' = \sum_{i=1}^{r} \alpha_i\,\mathbf{x}_i\mathbf{y}_i \qquad (2)$$

where M' is the mixture matrix containing r components and the subscript i denotes the ith component of the mixture. It is evident from eq 2 that a mixture EEM can be described as a summation of the outer product of component spectral vectors. Moreover, it is possible to use the linear algebraic technique of eigenvector analysis to decompose an EEM into a set of linearly independent basis vectors (eigenvectors). The number of nonzero eigenvectors that are needed to completely describe a matrix is determined by the rank of the matrix. Thus, in addition to eq 2, it is possible to represent a mixture EEM as a sum of a set of eigenvectors, specifically

$$\mathbf{M}' = \sum_{i=1}^{r} \xi_i\,\mathbf{v}_i\mathbf{w}_i \qquad (3)$$

where $\xi_i$ is the eigenvalue of the ith eigenvector, the parameters $\mathbf{v}_i$ and $\mathbf{w}_i$ are the ith excitation and emission eigenvectors, respectively, and r is the rank of the matrix. The spectral vectors of eq 1 and 2 can be obtained from the eigenvectors by transforming the eigenvectors to a nonnegative vector space (11).
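As a minimal sketch of eq 2 and 3, the following Python builds a synthetic three-component EEM from outer products and counts its nonzero singular values. The Gaussian band shapes, their positions, the scalars, the noise level, and the helper names (`gaussian_band`, `mixture_eem`, `numerical_rank`) are illustrative assumptions, not data or code from this study. NumPy's singular value decomposition stands in for the eigenvector analysis, since the left and right singular vectors of a matrix are eigenvectors of its two Gram matrices. The second call illustrates the point made in the introduction that random noise raises the rank to the matrix dimension.

```python
import numpy as np

def gaussian_band(n, center, width):
    """Hypothetical smooth, non-negative spectrum sampled at n points."""
    channels = np.arange(n)
    return np.exp(-0.5 * ((channels - center) / width) ** 2)

def mixture_eem(alphas, excitations, emissions):
    """Eq 2: sum over components of alpha_i * (x_i outer y_i)."""
    eem = np.zeros((excitations[0].size, emissions[0].size))
    for alpha, x, y in zip(alphas, excitations, emissions):
        eem += alpha * np.outer(x, y)
    return eem

def numerical_rank(matrix, rel_tol=1e-10):
    """Count singular values larger than rel_tol times the largest one."""
    s = np.linalg.svd(matrix, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

n = 64
excitations = [gaussian_band(n, c, 5.0) for c in (15, 30, 45)]
emissions = [gaussian_band(n, c, 6.0) for c in (20, 35, 50)]
eem = mixture_eem([1.0, 0.8, 1.2], excitations, emissions)

# Assumed noise model: independent Gaussian noise on every element.
rng = np.random.default_rng(0)
noisy_eem = eem + 0.01 * rng.standard_normal(eem.shape)

print(numerical_rank(eem))        # noise-free: one nonzero term per component
print(numerical_rank(noisy_eem))  # with noise: limited only by the matrix size
```

The relative tolerance in `numerical_rank` is a design choice for floating-point data; it plays no role in the paper's argument.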

Assuming that none of the spectra of the mixture components is a linear combination of the other components and that there is no error in the data matrix, the rank of an EEM will be equal to the number of observable fluorescing components in the sample (11). Were both of these assumptions to hold true, the number of fluorophores contributing to an EEM could be determined simply by applying eigenvector analysis to the spectrum and counting the number of nonzero eigenvectors extracted from the matrix. However, the assumption of an absence of experimental error is never valid for a real sample. In the presence of random noise, the rank of an EEM becomes equal to the minimum dimension of the data matrix. Hence, for a 64 × 64 matrix, it will always be possible to extract 64 unique, nonzero eigenvectors. Thus, the true algebraic form of the data matrix, M', is given by

$$\mathbf{M}' = \sum_{i=1}^{r} \xi_i\,\mathbf{v}_i\mathbf{w}_i + \sum_{j=r+1}^{64} \xi_j\,\mathbf{v}_j\mathbf{w}_j \qquad (4)$$

where the first term represents primary eigenvectors, the second term represents secondary eigenvectors, and r is the rank of the matrix in the absence of noise. In eq 4, it is assumed that the eigenvalues are ordered such that $\xi_1 > \xi_2 > \xi_3 > \cdots > \xi_{64}$. In order to determine the number of fluorescing components in the EEM represented by eq 4, it is no longer sufficient to count the number of nonzero eigenvectors. Some method must be devised that will enable one to distinguish between the primary and secondary eigenvectors.

Rank Estimation by Frequency Analysis of Eigenvectors (REFAE). This method of rank estimation is based on the assumption that primary eigenvectors of EEMs will contain information weighted toward the low-frequency Fourier coefficients of the transformed eigenvectors, whereas secondary eigenvectors will contain predominantly higher frequency information. The frequency characteristics of eigenvectors can be determined by Fourier transformation of the vectors and performing some simple calculations in the frequency domain. The appropriate form of the Fourier transform equation for the present discussion is the discrete transformation given by

$$V(u) = \frac{1}{N} \sum_{x=0}^{N-1} v(x)\,\exp(-i 2\pi x u / N) \qquad (5)$$

for the forward transformation, and

$$v(x) = \sum_{u=0}^{N-1} V(u)\,\exp(i 2\pi x u / N) \qquad (6)$$

for the inverse transformation, where N is the number of elements in the eigenvector v, which represents the evenly sampled function v(x) (15). In eq 5 and 6, v(x) and V(u) constitute a Fourier transform pair where v(x) is referred to as a time domain function and V(u) is referred to as a frequency domain function. Although the intensities of the elements in an EEM are recorded simultaneously, an arbitrary sampling interval of 1 s will be given to the elements of the eigenvectors for the sake of convenience throughout the remainder of this discussion. Furthermore, the function V(u) is phase shifted so that u = (N/2) + 1 is the zero frequency coefficient. The complex function V(u) is frequently represented by the Fourier spectrum, $|V(u)|$, which is defined as

$$|V(u)| = \left[ V(u_{real})^2 + V(u_{imag})^2 \right]^{1/2} \qquad (7)$$

where $V(u_{real})$ and $V(u_{imag})$ represent the real and imaginary coefficients, respectively, of the function V(u). In order to determine the relative importance of any given frequency range of the Fourier spectrum for representing the eigenvector, v(x), it is convenient to define two more quantities. The first is the sum of the elements of $|V(u)|$, which is given by

$$T = \sum_{i=-f_{max}}^{f_{max}} |V(u_i)| \qquad (8)$$

where $f_{max}$ is the observed frequency range for the data and $|V(u)|$ contains N elements. The summation in eq 8 is equivalent to calculating the total area of the function $|V(u)|$. It is also convenient to calculate the area of a segment of $|V(u)|$ bounded by $\pm u_{lim}$. This can be calculated via the equation

$$A_{u_{lim}} = \sum_{i=-u_{lim}}^{u_{lim}} |V(u_i)| \qquad (9)$$

Note that eq 9 is a summation of the lower frequency portion of $|V(u)|$ bounded by $\pm u_{lim}$. The relative importance of this frequency region for reproducing the time domain eigenvector, v(x), can be expressed by calculating the percent of the total area of the function $|V(u)|$ which lies in the range $\pm u_{lim}$. This percentage is given by

$$\%A_{u_{lim}} = 100\,A_{u_{lim}} / T \qquad (10)$$

and will be referred to as the relative percent area of V(u) bounded by $u_{lim}$.

Now let us illustrate, by way of a hypothetical example, the utility of eq 10 for rank estimation. In this example it will be assumed that the initial data matrix is a square matrix of 64 columns and rows and that eigenvector analysis has been used to extract the ten eigenvectors with the largest associated eigenvalues from the matrix. Since a sampling interval of 1 s has been assumed, the relative percent area of the low frequency half of V(u) can be calculated by defining $u_{lim}$ = 0.25 Hz. Because of the expected differences in the frequency distributions of primary and secondary eigenvectors, one would expect $\%A_{0.25}$ for the primary vectors to be greater than $\%A_{0.25}$ for the secondary vectors. Figure 1 shows the expected relationship of $\%A_{0.25}$ with eigenvector number where eigenvectors have been ranked according to their relative eigenvalues. This type of graph will be referred to throughout the remainder of this study as a frequency distribution plot. In this hypothetical example, it is evident from the graphical analysis that the first three eigenvectors form a plateau and that the eigenvectors beyond the edge of the plateau have consistently lower values of $\%A_{0.25}$. Based on these data, the matrix yielded three primary eigenvectors and hence would be assigned a rank of 3. It would then be estimated that the EEM represented at least three fluorescent components.

Figure 1. Hypothetical example of rank estimation using the frequency distribution plot of a three-component EEM. [Plot of $\%A_{0.25}$ versus eigenvector number, 1-10.]

For a real EEM it is reasonable to assume that the success of the method would be dependent on the choice of frequency range to be included in the calculation of $\%A_{u_{lim}}$. Since the optimum choice of $u_{lim}$ will vary depending on the specific mixture being analyzed, it will be necessary to empirically define a frequency range useful for general application to EEMs. Furthermore, since the frequency characteristics of many fluorophores are quite similar, it may be possible to empirically define threshold values of $\%A_{u_{lim}}$ in the various frequency limits tested.

Statistical Methods of Rank Estimation. Four statistical methods of rank estimation have been evaluated in this study. The simplest of the methods will be referred to as the eigenvalue perturbation technique and has been described by Fukunaga (16). This is a technique by which the magnitudes of the eigenvalues associated with the various eigenvectors are used to determine the relative importance of each of the vectors in reproducing the true data matrix. The other three methods tested were the average error method (7), a $\chi^2$ test (5), and a standard error test (5). The reader is referred to the cited literature for details on these techniques.

The last three rank estimation methods discussed above all require an estimate of the error associated with each element in the EEM. If we assume that scattered light and other background interferences can be exactly subtracted from the matrix, the best estimate of error arises from a consideration of photon counting statistics (17). In the present study, therefore, the standard deviation of any element in an EEM is estimated as

$$\sigma_{ij} = (d_{ij})^{1/2} \qquad (11)$$

where $\sigma_{ij}$ is the standard deviation of the element, $d_{ij}$, of the EEM, and i and j denote row and column positions, respectively, in the EEM.

Table I. Components in Test Mixtures

Columns: mixture no.; 9,10-dimethylanthracene; pyrene; perylene; benzo[b]fluoranthene; benzo[k]fluoranthene; fluoranthene; benzo[a]pyrene; 9,10-dibromoanthracene; 2,3-benzanthracene; anthracene; 2-ethylanthracene.
[The + and - entries for mixtures 1-10 could not be recovered from the extracted layout.]

EXPERIMENTAL SECTION

Materials. Anthracene, 2-ethylanthracene, 9,10-dibromoanthracene, 9,10-dimethylanthracene, 2,3-benzanthracene, and benzo[a]pyrene were all purchased from Aldrich Chemical Co. (Milwaukee, WI). Fluoranthene, benzo[b]fluoranthene, and benzo[k]fluoranthene were purchased from Chem Service (West Chester, PA). Perylene was purchased from Sigma Chemical Co. (St. Louis, MO). All PNA solutions were prepared in glass-distilled cyclohexane (Burdick and Jackson Laboratories, Muskegon, MI).

Equipment. The video fluorometer (VF) used in this study has been described previously in the literature (9, 10). The VF was interfaced to a HP9845B desktop minicomputer (Hewlett-Packard, Palo Alto, CA). All data reduction routines were programmed in our laboratory using HPBASIC.

Procedure. Development of a REFAE Algorithm. Ten mixtures of PNAs, the compositions of which are summarized in Table I, were prepared in cyclohexane. Although specific concentrations were not recorded for each component in these mixtures, optical densities were maintained below 0.01 AU and approximately equal fluorescence contributions from each component were achieved. Furthermore, although mixtures one and two contained identical components, perylene was roughly twice as concentrated in mixture two. The first ten eigenvectors were extracted from the EEMs of each of these mixtures (11) and their Fourier spectra calculated via an FFT algorithm (18). Several frequency ranges were used to construct frequency distribution plots for the eigenvectors from each EEM. The six values of $u_{lim}$ tested were 0.258 Hz, 0.196 Hz, 0.133 Hz, 0.102 Hz, 0.055 Hz, and 0.024 Hz. By a comparison of the frequency distribution plots from this portion of the study to the hypothetical example in Figure 1, the optimum frequency limits to be used for real data were chosen.
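The quantities of eq 5 and 7-10, evaluated at the six frequency limits listed above, can be sketched in Python as follows. This is an illustrative reimplementation under stated assumptions, not the authors' HPBASIC routine: NumPy's FFT is used, the 1 s sampling interval is passed as `dt`, and the function names and the two test vectors (a smooth band and seeded random noise) are hypothetical.

```python
import numpy as np

U_LIMS_HZ = (0.258, 0.196, 0.133, 0.102, 0.055, 0.024)  # limits tested in the text

def fourier_spectrum(v):
    """|V(u)| per eq 5 and 7, shifted so zero frequency is the centre element."""
    v = np.asarray(v, dtype=float)
    # np.fft.fft is unnormalized; eq 5 carries a 1/N factor.
    V = np.fft.fftshift(np.fft.fft(v)) / v.size
    return np.abs(V)

def percent_area(v, u_lim, dt=1.0):
    """Relative percent area of |V(u)| within +/- u_lim (eq 8-10)."""
    v = np.asarray(v, dtype=float)
    spectrum = fourier_spectrum(v)
    freqs = np.fft.fftshift(np.fft.fftfreq(v.size, d=dt))
    total = spectrum.sum()                         # T, eq 8
    low = spectrum[np.abs(freqs) <= u_lim].sum()   # A_ulim, eq 9
    return 100.0 * low / total                     # %A_ulim, eq 10

n = 64
channels = np.arange(n)
smooth = np.exp(-0.5 * ((channels - 30) / 5.0) ** 2)   # hypothetical spectral shape
noise = np.random.default_rng(1).standard_normal(n)    # hypothetical noise vector

for u_lim in U_LIMS_HZ:
    print(f"{u_lim:5.3f} Hz  smooth {percent_area(smooth, u_lim):6.1f}%  "
          f"noise {percent_area(noise, u_lim):6.1f}%")
```

For N = 64, `fftshift` places the zero frequency coefficient at zero-based index 32, which is element (N/2) + 1 in one-based counting. The 1/N factor cancels in the ratio of eq 10, so it affects only the scale of the spectrum.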
In addition to the graphical analysis of the frequency distributions of these eigenvectors, threshold levels of $\%A_{u_{lim}}$ were established for the various frequency ranges. The threshold levels were tested for their utility in rank estimation in cases where frequency distribution plots yielded ambiguous results. The results from the experiments outlined above were used to establish a set of rules to be followed for rank estimation of unknown EEMs. This algorithm was followed without modification throughout the remainder of the study. Finally, the four statistical methods discussed in the theoretical section of this paper were each applied to these test mixture EEMs. The accuracy of the results of these statistical analyses was compared to the accuracy of the results of applying the frequency analysis algorithm to the same EEMs.

Application of REFAE to Blind Coded Samples. In this portion of the study, the REFAE algorithm developed as described above was applied to 12 blind coded samples, the compositions of which are given in Table II. The blind coded testing of the REFAE algorithm was important as a test of the utility of this empirically developed technique in the absence of experimental bias.

Table II. Blind Coded Mixture Composition (a)

Columns: mixture no. (1-12); ANT; 2,3-BA; PER; 9,10-DBA; B[k]F.
[The concentration entries could not be recovered from the extracted layout.]

(a) ANT, anthracene; 2,3-BA, 2,3-benzanthracene; PER, perylene; 9,10-DBA, 9,10-dibenzanthracene; B[k]F, benzo[k]fluoranthene. Concentrations are given in molarity. Note: line indicates that no amount of the component was added to the mixture.

Figure 2. Isometric projection of an EEM of a three-component mixture of PNAs.

Figure 3. Fourier spectra of the first six excitation eigenvectors extracted from the EEM shown in Figure 2. [Panels: Fourier spectrum (FSPEC) versus frequency (FREQ) for each eigenvector.]

Figure 4. Frequency distribution plots for the first ten excitation eigenvectors from the three-component mixture shown in Figure 2, with $u_{lim}$ values of 0.133 Hz (*) and 0.102 Hz (#). [Plot of $\%A_{u_{lim}}$ versus eigenvector number, 1-10.]

RESULTS AND DISCUSSION

Development of a REFAE Algorithm. The utility of the frequency characteristics of eigenvectors as a tool for rank estimation can be supported by an examination of Figures 2 and 3. Figure 2 is the isometric projection of the EEM of a three-component mixture (mixture 2 from Table I). The Fourier spectra of the first six excitation eigenvectors extracted from this matrix are shown in Figure 3. As expected, the Fourier spectra of the fourth through sixth eigenvectors are visibly broader than the Fourier spectra of the primary eigenvectors. The frequency distribution plots for the first ten eigenvectors extracted from this mixture are shown in Figure 4. Only the frequency ranges of 0.102 Hz and 0.133 Hz are shown in this figure. By a comparison of this figure to the hypothetical example given in Figure 1, it is evident that these frequency ranges are useful for graphical rank estimation for this EEM. Although this is only a single example, it supports the concepts proposed in the theoretical section of this paper.

After extraction of the first ten eigenvectors from all of the test mixtures and calculation of $\%A_{u_{lim}}$ values in all the frequency limits tested, it was possible to draw several conclusions. First, the frequency distribution plots for the excitation eigenvectors were usually very similar to the frequency distribution plots for the emission eigenvectors. Hence, in all further discussion only the excitation eigenvectors will be considered. Second, it was possible to define threshold levels of $\%A_{u_{lim}}$ for most of the frequency ranges tested. These thresholds are reported in Table III. Eigenvectors with $\%A_{u_{lim}}$ values below the thresholds were defined as secondary vectors. Finally, in general, the most accurate rank estimations could be achieved by a simultaneous consideration of both graphical analysis and threshold levels.

Table III. Threshold Values

frequency range, Hz    %A threshold, %
0.258                  80
0.196                  65
0.133                  46
0.102                  36
0.055                  25
0.024                  (a)

(a) No useful threshold defined.

Some typical data for the test mixtures are shown in Figure 5. This figure contains the frequency distribution plots for the first ten eigenvectors extracted from two of the test mixtures. Note that only data for $u_{lim}$ = 0.102 Hz and $u_{lim}$ = 0.133 Hz are presented. As a general rule, it was these frequency regions that resulted in the most informative graphical analyses. Furthermore, note that the threshold levels for these two frequency ranges are marked on the plots by broken horizontal lines. The first eigenvector to fall below the threshold was considered to be the first secondary eigenvector. For the three-component mixture, the graphical analysis was fairly accurate and the application of threshold values was useful mainly as a means of confirming the graphical analysis. However, the mixture shown in Figure 5b is an illustration of a case in which the graphical analysis of the frequency distributions of the eigenvectors was not sufficient to allow an estimation of the rank of the mixture matrix. In this case, the application of the threshold values from Table III is a necessity. In order to estimate the rank of this matrix and other test matrices, which resulted in ambiguous frequency distribution plots, threshold values from all the frequency ranges listed in Table III were applied.

Figure 5. Frequency distribution plots for (a) three-component and (b) two-component EEMs. Frequency limits used were (*) 0.133 Hz and (#) 0.102 Hz. Threshold values for these two frequency limits are indicated by the horizontal lines where (-. -) denotes the threshold for $u_{lim}$ = 0.133 Hz and (-----) denotes the 0.102 Hz threshold. [Panels a and b: $\%A_{u_{lim}}$ versus eigenvector number.]

The REFAE algorithm proposed based on the above experimental observations is composed of three major steps. First, the excitation eigenvectors are extracted from the unknown EEM and normalized to unity. Second, the vectors are Fourier transformed and $\%A_{u_{lim}}$ values are computed for each vector. Finally, frequency distribution plots of the eigenvectors are examined for plateaus and the threshold values listed in Table III are applied. The time limiting step in the process is the extraction of the eigenvectors. The frequency analysis of ten excitation vectors can be accomplished in less than 1 min, including construction of the frequency distribution plots. The results of applying this algorithm to the ten test mixtures are summarized in Table IV. It is evident from these data that the algorithm was successful in estimating the number of components contributing to the test EEMs in 90% of the cases.

Table IV. Estimated Numbers of Components in Test Mixtures

mixture no.    no. of components known    estimated no. of components (a), REFAE
1              3                          3*
2              3                          3*
3              3                          3*
4              2                          2*
5              2                          2*
6              2                          2*
7              3                          3*
8              4                          4*
9              5                          5*
10             6                          5

Stat1-Stat4 entries, in extraction order (row and column assignment lost): 3* 2 2 1 1 | 2 2 2 1 1 | 2* 2* 3* 2 3 3 3 3 3 3 4 | 3* 4 3* 4 2 2* | 2 | 2* 3 | 2* 2* | 5 5 | 2 | 3 3 6 | 5 6*

(a) Asterisk denotes a correctly estimated rank. Key: Stat1, eigenvalue perturbation; Stat2, average error method; Stat3, $\chi^2$ test; Stat4, standard error test.

Comparison of REFAE to Statistical Methods of Rank Estimation. The number of components in each of the test mixtures as estimated by the statistical methods are tabulated along with the REFAE results in Table IV. It is evident from these data that REFAE had a much higher success rate than the statistical methods.
It is also true that the statistical methods had a tendency to underestimate the number of components in the mixtures and that this defect became more pronounced as the number of components in the test mixtures was increased. The standard error of the eigenvector method is the only method that consistently overestimated the number of components in the test mixtures.

Application of REFAE to Blind Coded Samples. The results of applying REFAE to the 12 blind coded samples are summarized in Table V. The number of components contributing to these EEMs was correctly estimated in 75% of the cases. This success ratio is superior to the success ratios of the statistical methods as reported for the test mixtures. It was not possible, using REFAE, to estimate the ranks of blind coded mixtures number 9 and number 10. Since the application of the threshold values in the various frequency ranges as specified in Table III gave different results in each range, the data were considered to be too ambiguous to make any predictions as to the numbers of components contributing to the EEMs. Hence, when compared to traditional statistical methods of rank estimation, REFAE is seen to have the advantage of being "self-checking" for reliability.

Table V. Estimated Numbers of Components in Blind Coded Samples

mixture no.    no. of components known    no. of components estimated by REFAE
1              1                          1
2              2                          2
3              3                          3
4              3                          3
5              2                          2
6              4                          4
7              3                          3

Entries for mixtures 8-12, in extraction order (row and column assignment lost): 1 2 | 4 | 4 | 2 3 | 4 | 2 | 3
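The threshold part of the three-step REFAE procedure described in this section can be sketched in Python as follows. Several points are assumptions rather than the authors' implementation: NumPy's singular value decomposition stands in for the eigenvector extraction, the rows of the matrix are taken to index excitation wavelength, the pairing of thresholds with frequency limits is read from the flattened Table III (with no threshold for the narrowest range), the graphical plateau inspection is not reproduced, and the synthetic test EEM and all function names are hypothetical.

```python
import numpy as np

# Assumed pairing of frequency limit (Hz) -> %A threshold, read from Table III.
THRESHOLDS = {0.258: 80.0, 0.196: 65.0, 0.133: 46.0, 0.102: 36.0, 0.055: 25.0}

def percent_area(v, u_lim, dt=1.0):
    """Percent of the summed Fourier spectrum |V(u)| lying within +/- u_lim."""
    spectrum = np.abs(np.fft.fft(v))          # the 1/N factor cancels in the ratio
    freqs = np.fft.fftfreq(len(v), d=dt)
    return 100.0 * spectrum[np.abs(freqs) <= u_lim].sum() / spectrum.sum()

def refae_threshold_ranks(eem, thresholds=THRESHOLDS, n_vectors=10):
    """Return {u_lim: rank}, where rank counts the vectors that precede
    the first one whose percent area falls below the threshold."""
    # Step 1: leading excitation vectors; SVD returns them with unit norm.
    u, _, _ = np.linalg.svd(eem)
    vectors = u[:, :n_vectors].T
    ranks = {}
    for u_lim, threshold in thresholds.items():
        # Step 2: Fourier transform each vector and compute its percent area.
        areas = [percent_area(v, u_lim) for v in vectors]
        # Step 3: the first vector below threshold is the first secondary one.
        below = [k for k, area in enumerate(areas) if area < threshold]
        ranks[u_lim] = below[0] if below else n_vectors
    return ranks

def consensus_rank(ranks):
    """Return the common rank, or None when the frequency ranges disagree."""
    values = set(ranks.values())
    return values.pop() if len(values) == 1 else None

def band(n, center, width):
    """Hypothetical smooth spectrum used only to build the test matrix."""
    return np.exp(-0.5 * ((np.arange(n) - center) / width) ** 2)

# Hypothetical three-component test EEM with added random noise.
n = 64
eem = sum(np.outer(band(n, cx, 5.0), band(n, cy, 6.0))
          for cx, cy in ((15, 20), (30, 35), (45, 50)))
eem = eem + 0.01 * np.random.default_rng(0).standard_normal((n, n))

ranks = refae_threshold_ranks(eem)
print(ranks, consensus_rank(ranks))
```

Returning `None` when the frequency ranges disagree mirrors the "self-checking" behavior noted above, in which conflicting results across ranges are treated as too ambiguous for a rank estimate.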

CONCLUSION

Estimation of the number of components in unknown EEMs is feasible based on frequency analysis of the eigenvectors extracted from the EEMs. In test cases, the frequency analysis approach proved to be more reliable than traditional statistical rank estimation methods. The statistical methods tended to underestimate the number of components in the test mixtures and became very unreliable for mixtures containing three or more components. The REFAE method, however, was proven to be useful for predicting the number of components in mixtures containing up to five components. Furthermore, implementation of some of the statistical methods was very time-consuming. The frequency analysis method, on the other hand, required very small calculation times. A third advantage was also observed when using the frequency analysis method of rank estimation. That is, REFAE provides some insight as to the reliability of the rank estimation for any particular mixture. Specifically, when threshold levels are used, each of the frequency regions tested should result in the same estimated rank.

Finally, the use of REFAE is not theoretically restricted to applications involving EEM formatted data. The only criterion that must be met by matrix formatted data before REFAE would be applicable is that the primary eigenvectors extracted from these matrices should have frequency characteristics different from those of the secondary eigenvectors. Unlike many of the statistical methods of rank estimation, it is not necessary to know the error associated with each element, nor is it necessary to assume an even distribution of error throughout the matrix. Thus, REFAE should be applicable for the prediction of the number of components contributing to matrix formatted mass spectra, UV-vis spectra, and infrared spectra of mixtures.

ACKNOWLEDGMENT

The authors gratefully acknowledge the technical assistance of P. B. Oldham and S. L. Neal. P. B. Oldham was particularly helpful in evaluation of the REFAE.

LITERATURE CITED

(1) Rasmussen, G. T.; Isenhour, T. L.; Lowry, S. R.; Ritter, G. L. Anal. Chim. Acta 1978, 103, 213-221.
(2) Tway, P. C.; Cline Love, L. J.; Woodruff, H. B. Anal. Chim. Acta 1980, 117, 45-52.
(3) Gold, H. S.; Rasmussen, G. T.; Mercer-Smith, J. A.; Whitten, D. G.; Buck, R. P. Anal. Chim. Acta 1980, 122, 171-178.
(4) Herman, D. P.; Gonnord, M. F.; Guiochon, G. Anal. Chem. 1984, 56, 995-1003.
(5) Malinowski, E. R.; Howery, D. G. In "Factor Analysis in Chemistry"; Wiley: New York, 1980.
(6) Cattell, R. B. Educ. Psychol. Meas. 1958, 18, 791-838.
(7) Malinowski, E. R. Anal. Chem. 1977, 49, 612-617.
(8) Gabriel, K. R. Biometrika 1971, 58, 453-467.
(9) Johnson, D. B.; Gladden, J. G.; Callis, J. B.; Christian, G. D. Rev. Sci. Instrum. 1979, 50, 119-126.
(10) Warner, I. M.; Fogarty, M. P.; Shelly, D. C. Anal. Chim. Acta 1979, 109, 361-372.
(11) Warner, I. M. In "Contemporary Topics in Analytical and Clinical Chemistry"; Hercules, D. M., Hieftje, G. M., Snyder, L. R., Evenson, M. A., Eds.; Plenum: New York, 1982; Vol. 4, pp 75-140.
(12) Rossi, T. M.; Warner, I. M. Appl. Spectrosc. 1984, 38, 422-429.
(13) Warner, I. M.; Christian, G. D.; Davidson, E. R.; Callis, J. B. Anal. Chem. 1977, 49, 564-573.
(14) Ho, C.-N.; Christian, G. D.; Davidson, E. R. Anal. Chem. 1980, 52, 1071-1079.
(15) Bracewell, R. N. "The Fourier Transform and Its Applications"; McGraw-Hill: New York, 1978; Chapter 18.
(16) Fukunaga, K. "Introduction to Statistical Pattern Recognition"; Academic Press: New York, 1972; Chapter 8.
(17) Parker, C. A. "Photoluminescence of Solutions"; Elsevier: New York, 1968.
(18) Gonzalez, R. C.; Wintz, P. "Digital Image Processing"; Addison-Wesley: Reading, MA, 1977.

RECEIVED for review June 4, 1985. Resubmitted November 22, 1985. Accepted November 22, 1985. The study reported in this manuscript was supported in part by grants from the National Institutes of Health (AI19916) and the National Science Foundation (CHE-8210886). Isiah M. Warner is also grateful for support from an NSF Presidential Young Investigator Award (CHE-831675).