Anal. Chem. 1983, 55, 630-633
Factor Analysis for Separation of Pure Component Spectra from Mixture Spectra
Paul C. Gillette, Jerome B. Lando, and Jack L. Koenig*
Department of Macromolecular Science, Case Western Reserve University, Cleveland, Ohio 44106
The use of factor analysis to extract pure component spectra from a series of infrared mixture spectra is described. This technique requires that each pure component spectrum have a nonoverlapped characteristic peak. Through the use of the cross product of eigenspectra it is possible to automatically identify key frequencies in the eigenspectra. Normalization of spectra permits extension of the procedure to systems in which the pathlength is not fixed. Numerical examples, model spectra, and actual infrared spectra are used to illustrate the technique.
In two previous papers the use of factor analysis to improve the signal-to-noise ratio (1) and to analyze mixtures by searching spectral libraries (2) was discussed. One is often confronted with systems in which it is not possible to prepare or isolate pure components. This situation is especially common in the spectrometric analysis of polymers, where changes in conformation result in new crystal phases. The ratio method (3) provides one means of spectral separation, although its application is limited primarily to binary systems. During the course of factor analysis, abstract “eigenspectra” are computed which contain all of the spectral features necessary to reconstruct the spectra of the mixtures. These same abstract “eigenspectra” can be combined to form the spectra of the pure components which make up the mixtures (4-11).
EXPERIMENTAL SECTION
Instruments. Infrared spectra were collected on a Digilab FTS14 Fourier transform spectrometer equipped with a TGS detector using 10 scans of sample and reference at a resolution of 4 cm⁻¹. Spectra were transformed with zero filling by a factor of 1 with triangular apodization. Infrared spectra were then transferred to a DEC VAX 11/780 computer operating under VMS 3.0 for data processing.
Computer Programs. All calculations were performed by the FORTRAN 77 program FACANAL, which contains ca. 5000 lines of source code exclusive of graphics routines. This program also incorporates facilities for target testing (2) of known spectra using the EPA spectral library and improvement (1) of individual S/N ratios in mixture spectra. Through the use of PARAMETER statements in INCLUDE files, array dimensions may be readily changed to permit the analysis of high-resolution spectra or problems involving a large number of mixtures. Graphics output was displayed on a DEC VT100 with Retrographics (a terminal which emulates Tektronix 4010 series terminals) using Tektronix Plot10 software, with hardcopy supplied by a Printronix P300 dot matrix line printer.
THEORY
Factor analysis generates two quantities which are essential in extracting the spectra of the pure components: abstract eigenspectra, which contain all the spectral features necessary to construct the spectra of the pure components, and eigenvectors, which may be thought of as abstract representations of the concentrations of the pure components within the mixtures. To obtain the pure component spectra it is necessary to rotate the abstract concentrations to obtain “real” concentrations. Unfortunately, both quantities are abstract in the sense that it is not possible to directly obtain the pure component spectra from this information alone. For most spectral techniques one can place some restrictions upon the pure spectra; for example, in infrared spectrometry all absorbances must be greater than or equal to zero. The case in which each pure component spectrum has an absorbance which is unique to itself (i.e., a peak which is not overlapped) is a special one. If the positions of these peaks can be determined, then it is possible to set up a system of linear equations, the solution of which will provide the appropriate constants by which to multiply the abstract spectra to obtain pure component spectra.
Mathematically the preceding analysis is done by first constructing a covariance or correlation matrix as follows:
$$[C] = [D]^{t}[D] \qquad (1)$$
where [C] = covariance/correlation matrix and [D] = mixture spectra matrix. (A correlation matrix requires that each spectrum be normalized by dividing by the square root of the sum of the squared absorbances.) The problem can then be expressed as an eigenvalue problem
$$[C][E] = [E][L] \qquad (2)$$
where [C] = covariance/correlation matrix, [E] = eigenvector matrix, and [L] = eigenvalue matrix. (In this statement of the problem the eigenvectors are in columns of the [E] matrix.) Statistical methods which have been described elsewhere (1, 9, 12-14) permit one to establish the minimum number of eigenvalues/eigenvectors necessary to reconstruct the data matrix within the bounds of the experimental error. Eigenvalues/eigenvectors can then be assigned as being either “real” or “noise”. Abstract “eigenspectra” are constructed by using the data matrix and the “real” eigenvectors in accordance with eq 3
$$[A] = [D][E'] \qquad (3)$$
where [A] = abstract “eigenspectra” matrix, [D] = mixture spectra matrix, and [E’] = “real” eigenvector matrix. One desires to determine a transformation matrix [T] consisting of scaling coefficients by which to multiply the abstract eigenspectra to obtain the pure component spectra
$$[P] = [A][T] \qquad (4)$$
where [P] = spectra of the pure components, [A] = abstract “eigenspectra”, and [T] = transformation matrix. As mentioned previously, the restriction that each pure component spectrum contain a peak in a region where the other pure component spectra do not absorb enables one to devise a system of linear equations to calculate the transformation matrix [T]. Since every point in the calculated pure component spectra must be greater than or equal to zero, one has the following relationship for a binary system (6-8):
$$P_{ij} = A_{i1}T_{1j} + A_{i2}T_{2j} \geq 0 \qquad (5)$$
0003-2700/83/0355-0630$01.50/0 © 1983 American Chemical Society
ANALYTICAL CHEMISTRY, VOL. 55, NO. 4, APRIL 1983
Figure 1. Model spectra generated by forming linear combinations of two Lorentzian peaks.
Rearrangement of this equation permits one to establish ratio criteria in which the smallest minimum and positive ratios of the abstract eigenspectra establish boundary regions in which pure component spectra have negative absorbances. For spectra collected with a fixed-pathlength cell from a system in which one component evolves as another is depleted, the eigenvector pairs will fall on a straight line (8). The intersection of this line with the two boundary lines directly yields the coefficients by which to multiply the abstract spectra to obtain the pure component spectra.
Systems containing more than two pure components are somewhat more complicated, and graphical methods cannot be utilized. By forming the ratios of all pairs of abstract eigenspectra, one can determine the locations of the characteristic points in the pure spectra. (Various combinations will produce the same points.) One can then use a modified form of eq 4 to determine the transformation matrix by arbitrarily setting the absorbances of the pure spectra to a square matrix of Kronecker deltas (i.e., a diagonal matrix with ones on the diagonal) and using only the absorbances in the abstract eigenspectra at the characteristic points. Rearrangement of eq 3 and 4 leads one to eq 6, in which the mixture spectra are defined in terms of the pure components.
$$[D] = [P][T]^{-1}[E']^{t} \qquad (6)$$
where [D] = mixture spectra matrix, [P] = spectra of pure components matrix, [T] = transformation matrix, and [E’] = “real” eigenvector matrix. The product of the inverse of the transformation matrix and the real eigenvectors represents the scaling coefficients by which to multiply the pure component spectra to obtain the mixture spectra. As such it is clearly related to the concentrations. For systems with more than two pure components it is necessary to standardize these coefficients by dividing by their sum for each mixture to obtain the relative concentrations within the mixtures. In this way the percentage contribution of the pure components to each mixture may be calculated.
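The multicomponent procedure described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names `transformation_from_key_points` and `relative_concentrations`, the three-component test spectra `P_true`, and the concentrations `C_true` are hypothetical and do not come from the paper or from FACANAL. The characteristic points are assumed to be known already (rows 0, 2, and 4 of the test data).

```python
import numpy as np

def transformation_from_key_points(A, key_rows):
    """Solve A[key_rows] @ T = I (eq 4 restricted to the characteristic points)."""
    A_key = np.asarray(A, dtype=float)[key_rows, :]   # square: one row per component
    return np.linalg.inv(A_key)

def relative_concentrations(T, E_real):
    """Coefficients inv(T) @ E'^t, standardized so each mixture sums to 1."""
    coeffs = np.linalg.solve(T, np.asarray(E_real, dtype=float).T)
    return coeffs / coeffs.sum(axis=0, keepdims=True)

# Hypothetical three-component data (not from the paper): rows 0, 2 and 4 are
# non-overlapped characteristic points with unit absorbance.
P_true = np.array([[1.0, 0.0, 0.0],
                   [0.5, 0.3, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.4, 0.6],
                   [0.0, 0.0, 1.0]])
C_true = np.array([[0.6, 0.2, 0.1, 0.3],
                   [0.3, 0.5, 0.2, 0.3],
                   [0.1, 0.3, 0.7, 0.4]])      # each column (mixture) sums to 1
D = P_true @ C_true                            # mixture spectra stored in columns

eigvals, E = np.linalg.eigh(D.T @ D)           # eq 1 and 2; eigenvalues ascending
E_real = E[:, ::-1][:, :3]                     # keep the three largest ("real") factors
A = D @ E_real                                 # eq 3: abstract eigenspectra
T = transformation_from_key_points(A, [0, 2, 4])
P_est = A @ T                                  # eq 4: pure component spectra
conc = relative_concentrations(T, E_real)      # relative concentrations per mixture
```

Because the test spectra have unit absorbance at their own characteristic points, `P_est` and `conc` reproduce `P_true` and `C_true` here. With real data the pure spectra are recovered only up to the arbitrary scaling imposed by the Kronecker-delta choice, which is why the coefficients are standardized per mixture.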
(All of the preceding steps are illustrated in great detail on a very simple problem in the Appendix.)
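As a hedged companion to the Appendix, the binary, fixed-pathlength case can be traced numerically with the three mixture spectra given there. The sketch below is an assumption-laden illustration rather than the authors' FORTRAN code: the sign convention for the eigenvectors and the choice of the maximum and minimum of the cross product as the two characteristic points are simplifications made for this example.

```python
import numpy as np

# Mixture spectra from the Appendix: rows are frequencies, columns are mixtures.
D = np.array([[0.22, 0.26, 0.14],
              [0.40, 0.20, 0.80],
              [1.20, 1.60, 0.40],
              [0.14, 0.12, 0.18]])

C = D.T @ D                                   # eq 1: covariance matrix
eigvals, E = np.linalg.eigh(C)                # eq 2; eigh sorts eigenvalues ascending
E_real = E[:, ::-1][:, :2].copy()             # the two "real" eigenvectors
# Eigenvector signs are arbitrary; as an assumed convention, make the
# largest-magnitude entry of each eigenvector positive.
for k in range(E_real.shape[1]):
    if E_real[np.argmax(np.abs(E_real[:, k])), k] < 0:
        E_real[:, k] *= -1
A = D @ E_real                                # eq 3: abstract eigenspectra

cross = A[:, 0] * A[:, 1]                     # cross product of the two eigenspectra
key = [int(np.argmax(cross)), int(np.argmin(cross))]   # assumed characteristic points

# Straight line through the eigenvector pairs: E2 = m*E1 + b.
m, b = np.polyfit(E_real[:, 0], E_real[:, 1], 1)

# On a boundary line the spectrum E1*a1 + E2*a2 is zero at key frequency i,
# i.e. E2 = s*E1 with s = -a1[i]/a2[i]. Intersect each boundary with the line.
T = np.empty((2, 2))
for j, i in enumerate(key):
    s = -A[i, 0] / A[i, 1]
    e1 = b / (s - m)
    T[:, j] = (e1, s * e1)

P = A @ T                                     # eq 4: pure component spectra
conc = np.linalg.solve(T, E_real.T)           # inv(T) @ E'^t, one column per mixture
```

Each column of `P` is one extracted pure component spectrum, and each column of `conc` holds the two coefficients for one mixture. The column order depends on the sign convention chosen for the eigenvectors, so it should not be relied on.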
RESULTS AND DISCUSSION
As an illustration of pure component extraction by factor analysis, the four spectra in Figure 1 were generated by forming linear combinations of two Lorentzian peaks. Factor analysis of the covariance matrix indicated two pure components as anticipated. The two primary eigenvectors were then used to construct the abstract “eigenspectra” in Figure 2. A plot of the four ordered pairs of the two primary eigenvectors (Figure 3) clearly indicates a linear relationship between the eigenvectors. The points of intersection of this line with the boundary region yield the coefficients by which to multiply the abstract eigenspectra to obtain the pure component spectra. Depicted in Figure 4 are the spectra of the extracted pure components.
Figure 2. Abstract “eigenspectra” generated by using only two eigenvectors and Lorentzian model mixtures.
Figure 3. Plot of “real” eigenvectors; points in the shaded region would produce pure component spectra with negative intensities.
Figure 4. Extracted pure component spectra from Lorentzian model mixtures.
As an example of an application of the technique to actual spectral data, six mixtures of hexane and chloroform were prepared (Figure 5). To correct for differences in base lines, which would result in an incorrect number of components, a linear base line was removed as follows: A line was computed which passed through the minimum absorbances in the first and last 10% of the spectrum. This wedge was then subtracted and the entire spectrum shifted by a constant so that the minimum absorbance was 0. All forms of error analysis based on the magnitudes of the eigenvalues indicate two pure components. A plot of the primary eigenvectors generated by factor analysis of the covariance matrix fails to yield any apparent relationship between the eigenvectors (Figure 6). This is because the spectra are not on the same scale; a linear relationship would have been observed if a fixed path cell had been used.
Figure 5. FTIR spectra of hexane/chloroform mixtures.
Figure 6. Plot of “real” eigenvectors based on factor analysis of the covariance matrix for hexane/chloroform mixtures.
Figure 7. Plot of “real” eigenvectors based on factor analysis of the correlation matrix for hexane/chloroform mixtures.
Figure 8. Pure component spectra extracted from hexane/chloroform mixtures using transformation matrix based on Figure 7.
Rather than attempting to scale the spectra with the use of an internal thickness band, it is much easier simply to normalize each spectrum by dividing it by the square root of the sum of the squared intensities (i.e., factor analysis on a correlation matrix). A plot of the primary eigenvectors exhibits a quadratic relationship between the eigenvectors (Figure 7).
One encounters another problem in analyzing real spectral data: the aforementioned procedure for establishing regions which would produce spectra having negative intensities does not always work due to noise. In regions where both abstract eigenspectra are of low intensity, the ratio represents little more than random noise and as such can produce spurious results. In practice the cross product of the abstract eigenspectra can provide the location of the points (Figure 8) to bound the concentration curve. A least-squares quadratic of the form $E_1 = aE_2^2 + bE_2 + c$ is fitted to the eigenvector pairs. Simultaneous solution of this equation in conjunction with boundary conditions imposed by the ratios of the abstract eigenspectra at the points defined by extrema in the cross product leads to the coefficients by which to scale the abstract eigenspectra. Doing so produces two spectra which are virtually identical with those of chloroform and hexane (Figure 8). (The small anomaly in the spectrum of hexane at ca. 1220 cm⁻¹ can be attributed to nonlinearity in Beer’s law arising from the crude sample preparation used in collecting the infrared spectrum.)
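The two preprocessing steps described above (removal of a linear base line and normalization of each spectrum) might be sketched as follows. The function names, the `edge_fraction` parameter, and the synthetic Lorentzian test spectrum are assumptions introduced for illustration; the paper gives only the verbal description of the procedure.

```python
import numpy as np

def remove_linear_baseline(spectrum, edge_fraction=0.10):
    """Subtract a line through the minimum absorbances in the first and last
    `edge_fraction` of the spectrum, then shift so the minimum is 0."""
    y = np.asarray(spectrum, dtype=float)
    n = y.size
    k = max(1, int(round(edge_fraction * n)))      # points in each edge region
    i0 = int(np.argmin(y[:k]))                     # minimum in the leading edge
    i1 = n - k + int(np.argmin(y[n - k:]))         # minimum in the trailing edge
    slope = (y[i1] - y[i0]) / (i1 - i0)
    wedge = y[i0] + slope * (np.arange(n) - i0)    # the linear "wedge"
    corrected = y - wedge
    return corrected - corrected.min()

def normalize_columns(D):
    """Divide each spectrum (column) by the square root of its sum of squares,
    as required for factor analysis on a correlation matrix."""
    D = np.asarray(D, dtype=float)
    return D / np.sqrt((D ** 2).sum(axis=0, keepdims=True))

# Hypothetical test spectrum: one Lorentzian band on a sloping, offset base line.
x = np.linspace(1200.0, 1500.0, 301)               # assumed wavenumber axis
peak = 1.0 / (1.0 + ((x - 1350.0) / 5.0) ** 2)
raw = peak + 0.002 * (x - 1200.0) + 0.3
corrected = remove_linear_baseline(raw)

# The same spectrum at two different pathlengths becomes identical after normalization.
D_demo = np.column_stack([corrected, 2.5 * corrected])
D_norm = normalize_columns(D_demo)
```

Normalization removes the pathlength scale factor, which is why the eigenvector pairs no longer fall on a straight line and a quadratic is fitted instead.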
APPENDIX
To illustrate the extraction of pure component spectra from mixtures, we generated three spectra by forming linear combinations of two pure component spectra. The solution for the pure component spectra follows:
The covariance matrix is calculated as:
$$[D]^{t}[D] = [C]$$
$$\begin{bmatrix} 0.22 & 0.40 & 1.20 & 0.14 \\ 0.26 & 0.20 & 1.60 & 0.12 \\ 0.14 & 0.80 & 0.40 & 0.18 \end{bmatrix} \times \begin{bmatrix} 0.22 & 0.26 & 0.14 \\ 0.40 & 0.20 & 0.80 \\ 1.20 & 1.60 & 0.40 \\ 0.14 & 0.12 & 0.18 \end{bmatrix} = [C]$$
Solution of the eigenvalue problem $[C][E] = [E][L]$ yields the eigenvectors, of which the two “real” eigenvectors form the columns of [E’] below.
The abstract eigenspectra are then readily solved as:
$$[A] = [D][E']$$
$$\begin{bmatrix} 0.367 & 0.033 \\ 0.627 & 0.668 \\ 2.027 & -0.227 \\ 0.227 & 0.122 \end{bmatrix} = \begin{bmatrix} 0.22 & 0.26 & 0.14 \\ 0.40 & 0.20 & 0.80 \\ 1.20 & 1.60 & 0.40 \\ 0.14 & 0.12 & 0.18 \end{bmatrix} \begin{bmatrix} 0.596 & 0.039 \\ 0.745 & -0.400 \\ 0.300 & 0.916 \end{bmatrix}$$
Ratio of abstract 1/abstract 2:
$$[\,0.367/0.033 \quad 0.627/0.668 \quad 2.027/{-0.227} \quad 0.227/0.122\,] = [\,11.121 \quad 0.939 \quad -8.930 \quad 1.861\,]$$
Cross product of abstract 1 × abstract 2:
$$[\,0.367 \times 0.033 \quad 0.627 \times 0.668 \quad 2.027 \times (-0.227) \quad 0.227 \times 0.122\,] = [\,0.012 \quad 0.419 \quad -0.460 \quad 0.028\,]$$
Line passing through (0.745, -0.400), (0.596, 0.039), (0.300, 0.916): $E_2 = -2.956E_1 + 1.802$.
In this example both the ratio and cross product indicate that the ratios corresponding to the second and third elements should be used to bound the concentration line. The line segment defining all possible concentrations of the two pure components is defined by that portion of the line $E_2 = -2.956E_1 + 1.802$ which is bounded by $E_2 \leq 8.930E_1$ and $E_2 \geq -0.939E_1$. The transformation matrix [T] is solved for by simultaneous solution of $E_2 = -2.956E_1 + 1.802$ with $E_2 = -0.939E_1$ and $E_2 = 8.930E_1$:
$$[P] = [A][T]$$
$$\begin{bmatrix} 0.10 & 0.30 \\ 1.00 & 0.00 \\ 0.00 & 2.00 \\ 0.20 & 0.10 \end{bmatrix} = \begin{bmatrix} 0.367 & 0.033 \\ 0.627 & 0.668 \\ 2.027 & -0.227 \\ 0.227 & 0.122 \end{bmatrix} \begin{bmatrix} 0.151 & 0.893 \\ 1.354 & -0.839 \end{bmatrix}$$
The concentrations of these pure components in the starting mixtures can be directly computed as
$$[T]^{-1}[E']^{t}$$
LITERATURE CITED
(1) Gillette, Paul C.; Koenig, Jack L. Appl. Spectrosc. 1982, 36, 535-539.
(2) Gillette, Paul C.; Koenig, Jack L. Appl. Spectrosc. 1982, 36, 661-665.
(3) Koenig, Jack L.; D'Esposito, L.; Antoon, M. K. Appl. Spectrosc. 1977, 31, 292-295.
(4) Knorr, F. J.; Futrell, J. H. Anal. Chem. 1979, 51, 1236-1241.
(5) Ohta, N. Anal. Chem. 1973, 45, 553-557.
(6) Sylvester, E. A.; Lawton, W. H.; Maggio, M. S. Technometrics 1974, 16, 353-368.
(7) Lawton, W. H.; Sylvester, E. A. Technometrics 1971, 13, 617-633.
(8) Macnaughtan, D.; Rogers, L. B.; Wernimont, G. Anal. Chem. 1972, 44, 1421-1427.
(9) Malinowski, E. R.; Howery, D. G. “Factor Analysis in Chemistry”; Wiley: New York, 1980.
(10) Sharaf, Muhammad A.; Kowalski, Bruce R. Anal. Chem. 1982, 54, 1291-1296.
(11) Malinowski, Edmund R. Anal. Chim. Acta 1982, 134, 129-137.
(12) Malinowski, Edmund R. Anal. Chem. 1977, 49, 612-617.
(13) Malinowski, Edmund R. Anal. Chem. 1977, 49, 606-612.
(14) Malinowski, Edmund R. Anal. Chim. Acta 1978, 103, 339-354.
RECEIVED for review July 19, 1982. Accepted December 20, 1982. The authors express their gratitude to the National Science Foundation for support of this research under Grant DMR80-20245.
Anal. Chem. 1983, 55, 633-638
Abstract Factor Analysis of Solid-State Nuclear Magnetic Resonance Spectra
D. W. Kormos and J. S. Waugh*
Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
A method for component analysis of solid-state NMR spectra is described. By use of abstract factor analysis, AFA, simulated 13C powder and magic angle spinning (MAS) spectra were utilized to establish S/N levels necessary for successful delineation of factors. Chemical shift tensor information for p-dimethoxybenzene, PMB, was used in the study. The use of differential cross-polarization (CP) rates was suggested and demonstrated as an experimental means to produce factor analyzable “spin mixture” spectra. 31P CP-MAS spectra of octacalcium phosphate, OCP, were analyzed to reveal three components. Additional areas of application are noted.
Early work applying factor analysis to liquid-state NMR data focused on elucidating the number of physically significant factors which produce solvent shifts of solutes in solution. The work of Buckingham, Schaefer, and Schneider showed these shifts to be a linear sum of various contributions (1). Weiner, Malinowski, and Levinstone studied proton chemical shifts of substituted methanes in various media using factor analysis and concluded that three factors span the solvent-effect space (2). Using the published data of Abraham, Wileman, and Bedford, Malinowski found similar evidence for three factors for 19F shifts of organofluorine compounds in nonpolar solvents (3). Factor analysis has also been used to study solvent shifts for 13C and 29Si resonances of Me4Si (4) as well as 15N resonances in amides (5). Two or three factors are generally found and are broadly ascribed to gas-phase chemical shifts, anisotropic solvent susceptibilities, and van der Waals effects. Factor analysis has also been used in attempts to elucidate the variables which intrinsically control chemical shifts. Studies of 13C chemical shifts of aliphatic and aromatic halides suggest two principal factors and one smaller factor are needed to correlate collected data (6, 7).
In this paper, we investigate the use of AFA to extract component information present in solid-state NMR spectra which exhibit overlapping powder patterns or, in the case of magic angle sample spinning, overlapping sets of spinning sidebands. Results of AFA applied to model and experimental solid-state NMR spectra are presented. The model simulations which were employed to probe the practical limitations of the technique include powder and magic angle sample spinning 13C spectra for p-dimethoxybenzene, PMB. The four independent 13C chemical shift tensors of PMB were combined in varying proportions to create chemical shift “spin mixtures”. Random noise was systematically added to the spectra. In this way, the ability of AFA to predict the number of spectral components, or chemical shifts, underlying the spectra was assessed empirically as a function of signal to noise. We then