Factor analysis of mass spectra - ACS Publications

Table I. Determination of Silver in Rocks by RNAA

Sample    Found, ppm(a)      Literature values, ppm
SU-1      3.64 ± 0.85        4 (11); 4 (12)
PTO-1     0.405 ± 0.019      0.42 ± 0.04 (13)
SY-1      0.51 ± 0.07        0.4 (12)
AGV-1     0.21 ± 0.08        0.11 (11); 0.094 (6); 0.08 (7)
GSP-1     0.077 ± 0.046      0.10 (11); 0.084 (6); 0.075 (7)

(a) All values are an average of three determinations. The error limits are estimates of one standard deviation on a single analysis.
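The relative precision quoted in the text (better than 5% for PTO-1, about 60% for GSP-1) can be checked directly from the tabulated values. The short sketch below does that arithmetic; the pairing of each value with its sample name is an assumption inferred from the discussion, and the dictionary and function names are illustrative only.

```python
# Found values (ppm) and one-standard-deviation limits as read from Table I.
# The pairing of values with sample names is an assumption based on the text.
found = {
    "SU-1": (3.64, 0.85),
    "PTO-1": (0.405, 0.019),
    "SY-1": (0.51, 0.07),
    "AGV-1": (0.21, 0.08),
    "GSP-1": (0.077, 0.046),
}


def relative_sd_percent(mean, sd):
    """Relative standard deviation of a single determination, in percent."""
    return 100.0 * sd / mean


rsd = {name: relative_sd_percent(m, s) for name, (m, s) in found.items()}

for name, value in rsd.items():
    print(f"{name}: {value:.1f}%")
```

Under this pairing, the smallest relative standard deviation belongs to PTO-1 and the largest to GSP-1, which agrees with the range stated in the text.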

The only useful isotope of silver for radiochemical neutron activation analysis is ¹¹⁰ᵐAg, with a half-life of 255 days and several gamma rays, of which 0.658, 0.885, 0.937, 0.764, and 1.384 MeV are the most prominent (10). In this work, we have used the 0.658-MeV photopeak for abundance calculations. The possible interfering nuclear reactions are ¹¹⁰Cd(n,p)¹¹⁰ᵐAg and ¹¹³In(n,α)¹¹⁰ᵐAg. However, in view of (a) the low natural abundances of both ¹¹⁰Cd and ¹¹³In, (b) the poor cross sections of both these reactions compared with the ¹⁰⁹Ag(n,γ)¹¹⁰ᵐAg reaction, (c) the high thermal-to-fast neutron ratio in the irradiation positions used, and (d) the equal if not lower abundance levels of Cd and In in rocks compared with Ag, it does not seem likely that these reactions can interfere to any serious degree with the determination of silver. Self-absorption and self-shielding problems were kept to a minimum by using standards and samples with roughly the same amount of silver (4 μg).

Table I gives the results of analysis of three standard rocks and two standard ores distributed by American, Canadian, and South African geological organizations. Each value represents a mean of triplicate determinations. The absolute standard deviation for a single determination is included. Literature values, where available, are included for comparison. The agreement between our results and the cited values is fairly good. The first values given for SU-1, SY-1, AGV-1, and GSP-1 are either recommended, average, or approximate values given by Flanagan (11) and Sine et al. (12). The value given for the ore PTO-1 by Steele et al. (13) includes the techniques of emission spectrography, atomic absorption spectrometry, NAA, and spectrophotometry. The results range from 0.35 to 2 ppm, with a mean value of 0.42 ± 0.04 ppm. The NAA values average 0.40 ppm, which is in excellent agreement with our value of 0.41 ppm.
As mentioned before (8), part of the spread of results for silver and other noble metals may be due to their nonhomogeneous distribution in the rock samples, and there may not be a "true" value for these elements because of the sampling problems. Lillie (7) also found the largest errors in GSP-1 and G-2 in the determination of silver in a suite of rocks and attributes this to sampling error or nonhomogeneous distribution of silver in these rocks. Brunfelt and Steinnes (6) analyzed SU-1 and SY-1 for silver, but did not report the results because they indicated possible nonhomogeneous distribution of silver in these rocks; similar uneven distribution of gold in G-1 and W-1 rocks has been demonstrated by Fritze and Robertson (14). However, for AGV-1 and GSP-1, our results are in good agreement with those of Brunfelt and Steinnes (6) and Lillie (7).

The precision of our results varies from better than 5% for PTO-1 to 60% for GSP-1. The experimental detection limit was found to be 0.01 ppm with a 100-hour irradiation of 0.5 g of the sample.

In combination with the earlier method (8) for the determination of gold and five platinum metals, silver also could be determined at the same time. However, some difficulty might arise because, in the case of silver, long irradiations are desirable to give enough sensitivity, and the resultant high radioactivity due to major and minor elements will pose a health physics problem in processing for the short-lived platinum metal isotopes immediately after irradiation.

ACKNOWLEDGMENT

The authors are grateful to the personnel of the GTRR reactor, Georgia Institute of Technology, Atlanta, Ga., for performing the irradiations. They are also thankful to T. W. Steele of the South African National Institute of Metallurgy, Johannesburg, for providing generous amounts of PTO-1.

LITERATURE CITED

(1) A. V. Heyl, W. E. Hall, A. E. Weissenborn, H. K. Stager, W. P. Puffett, and B. L. Reed, "U.S. Mineral Resources", D. A. Brobst and W. A. Pratt, Ed., U.S. Govt. Printing Office, Washington, D.C., 1973, p 581.
(2) A. P. Vinogradov, Geokhimiya, 7, 641 (1962).
(3) E. N. Gilvert, S. S. Shatskaya, V. A. Mikhailov, and V. G. Torgor, J. Radioanal. Chem., 14, 279 (1973).
(4) D. E. Gillum and W. D. Ehmann, Radiochim. Acta, 16, 123 (1971).
(5) R. R. Keays, R. Ganapathy, J. C. Laul, V. Krahenbuhl, and J. W. Morgan, Anal. Chim. Acta, 72, 1 (1974).
(6) A. O. Brunfelt and E. Steinnes, Radiochem. Radioanal. Lett., 1, 219 (1969).
(7) E. G. Lillie, Anal. Chim. Acta, 75, 21 (1975).
(8) R. A. Nadkarni and G. H. Morrison, Anal. Chem., 46, 232 (1974).
(9) R. A. A. Muzzarelli and R. Rocchetti, Anal. Chim. Acta, 70, 465 (1974).
(10) C. M. Lederer, J. M. Hollander, and I. Perlman, "Table of Isotopes", 6th ed., John Wiley & Sons, New York, N.Y., 1967.
(11) F. J. Flanagan, Geochim. Cosmochim. Acta, 37, 1189 (1973).
(12) N. M. Sine, W. O. Taylor, G. R. Webber, and C. L. Lewis, Geochim. Cosmochim. Acta, 33, 121 (1969).
(13) T. W. Steele, J. Levin, and I. Copelowitz, NIM Report No. 1696 (1975).
(14) K. Fritze and R. Robertson, "Modern Trends in Activation Analysis", J. R. DeVoe, Ed., NBS Special Publication 312, Vol. II, 1279 (1969).

RECEIVED for review April 14, 1975. Accepted July 28, 1975.

Factor Analysis of Mass Spectra

J. B. Justice, Jr., and T. L. Isenhour

Department of Chemistry, University of North Carolina, Chapel Hill, N.C. 27514

ANALYTICAL CHEMISTRY, VOL. 47, NO. 13, NOVEMBER 1975

Linear pattern recognizers have been used to predict structural features in mass spectra (1). How linear the relationship between the mass spectra and the presence of various structural features actually is has not been determined. To study the relationship of functional group presence to linear variation of the data, a data set consisting of 630 mass spectra with elemental composition $\mathrm{C_{2-10}H_{2-22}O_{0-4}N_{0-2}}$ is decomposed into linearly independent dimensions, whose relationship to structural features in the data is measured for seven features. These features are phenyl, carbonyl, ether, hydroxyl, nitrogen, amino, and saturated hydrocarbon.

The linearly independent dimensions are determined by factor analysis, a multivariate statistical method for studying the nature of high dimensionality data from a large experimental data set. The dimensions are initially the eigenvector solution that diagonalizes a correlation matrix of the experimental variables (the mass spectra). This eigenvector solution may then be rotated to clarify the interpretation of the dimensions with respect to the masses.

Factor analysis has recently found interesting applications in the study of fundamental properties of solutes and stationary phases affecting chromatographic retention times (2-5), of solvent shifts in NMR (4-8), and of polarographic half-wave potentials (9).

EXPERIMENTAL

The data set consisted of 630 low resolution mass spectra with elemental composition $\mathrm{C_{2-10}H_{2-22}O_{0-4}N_{0-2}}$ taken from the American Petroleum Institute Research Project 44 Tables. The mass range covered 12 to 141, excluding mass positions containing fewer than ten peaks. The eleven masses excluded were 20, 21, 22, 23, 34, 35, 48, 132, 137, 138, and 139. The remaining 119 mass positions were used with 2% intensity resolution. The statistical programs were taken from the IBM FORTRAN IV Scientific Subroutine Package and run in double precision on an IBM 360/75. The seven functional group classes examined comprised 62 phenyl compounds, 76 carbonyls, 57 ethers, 30 hydroxyls, 81 nitrogen compounds, 58 primary and secondary amino compounds, and 89 saturated hydrocarbons. The categories were not necessarily mutually exclusive.

THEORY

An objective of factor analysis is to find a set of variables, fewer in number than the original set, which adequately describe or express the original set. What is "adequate" is determined by the experimenter. It may be sufficient to reproduce the data within experimental error or, less stringently, to reproduce major variations while ignoring smaller effects. One begins with a matrix, D, of observations, such as a collection of mass spectra, or a set of gas chromatographic retention indices of a series of solutes (2, 3). The original experimental data are standardized by subtracting the mean intensity and dividing by the standard deviation for each mass position. The standardized mass spectra were calculated by

$$S = \frac{I - \bar{I}}{\sigma} \qquad (1)$$

where $I$ = intensity of peak, $\bar{I}$ = mean intensity of mass position, and $\sigma$ = standard deviation of mass position. From the standardized data, a correlation matrix C is calculated which relates the variation in each mass position to all the other mass positions.

$$C = D^{T}D \qquad (2)$$

Having obtained an m × m correlation matrix, C, we would like to express the same information in fewer than m dimensions. This is done by representing the correlation matrix in an orthogonal coordinate system composed of linear combinations of the correlations in C, chosen so that the variance represented by C is distributed in as nonuniform a fashion as possible. In other words, the first coordinate axis, or linear combination, should contain the maximum variance possible on a single axis. The second linear combination should contain the second largest variance, while constrained to an axis orthogonal to the previous axis. Linear combinations expressing maximum variance are constructed until all the variance is accounted for. This orthogonal coordinate system is obtained by diagonalizing the correlation matrix to establish a set of eigenvalues, E, and a set of associated eigenvectors, B.

$$B^{-1}CB = E \qquad (3)$$

The square roots of the eigenvalues correspond to standard deviations and the eigenvalues themselves to variances along the associated eigenvector axes. Thus the correlation matrix can be reproduced by rearranging Equation 3 to yield

$$C = BEB^{-1} \qquad (4)$$

Since B is an orthogonal matrix, its inverse equals its transpose, so we may write

$$C = BEB^{T} \qquad (5)$$

Now Equation 5 is equivalent to

$$C = e_1 b_1 b_1^{T} + e_2 b_2 b_2^{T} + \cdots + e_m b_m b_m^{T} \qquad (6)$$

However, as the eigenvalues $e_i$ approach zero, little is gained by including the associated eigenvectors in the approximation of the correlation matrix. Needing fewer than m eigenvectors to reproduce the correlations in the original variables is the same as reducing the dimensionality of the space being studied, so that an information compression results from the re-expression of the variation in a more nearly optimal coordinate system than the original variables.

Interpretation of the eigenvector solution of the correlation matrix in terms of importance to variation in intensity of mass spectral lines can be improved if the eigenvector axes are rotated in space to maximize the kurtosis (flatness) in the distribution of coefficients within the eigenvectors. All factors with variance equal to or greater than the variance of the standardized mass positions, that is, eigenvalues equal to or greater than 1.0, were kept for comparison to functional group presence. These 42 factors, accounting for 73% of the total variance, were rotated orthogonally by a varimax rotation. After rotation, the significant masses in each factor (a factor is an eigenvector scaled by the square root of its eigenvalue) were determined by the magnitude of the mass coefficient in the vector. For example, the largest coefficients in the 26th factor were for masses 45 (0.74), 31 (0.33), and 19 (0.80). Because these masses are usually associated with oxygen, one would expect factor 26 to correlate with oxygen presence in the molecules studied. To test this and to correlate other factors with functional group presence in molecules, a method of ranking factors in relation to functional groups was used in which the dot product of a factor with each mass spectrum was calculated and the products summed over all mass spectra in a class. This was done for each factor.
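The sequence described above (standardize, form the correlation matrix, diagonalize, keep eigenvalues of at least 1.0, rotate) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' FORTRAN procedure: the data are synthetic, an explicit 1/n is included in the correlation matrix so that its diagonal is exactly 1, and the `varimax` helper is the commonly used SVD-based iteration written out here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the spectra matrix: 200 "spectra" x 12 "mass positions",
# generated from 3 hidden sources plus noise (hypothetical data, not the API set).
n_spectra, n_masses, n_sources = 200, 12, 3
sources = rng.normal(size=(n_spectra, n_sources))
mixing = rng.normal(size=(n_sources, n_masses))
intensities = sources @ mixing + 0.5 * rng.normal(size=(n_spectra, n_masses))

# Equation 1: standardize each mass position (column).
S = (intensities - intensities.mean(axis=0)) / intensities.std(axis=0)

# Equation 2, with an explicit 1/n (an assumption) so the diagonal of C is 1.
C = S.T @ S / n_spectra

# Equation 3: eigenvalues and orthonormal eigenvectors B of the symmetric matrix C.
eigvals, B = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]  # largest variance first
eigvals, B = eigvals[order], B[:, order]

# Equations 5 and 6: C is recovered from the full sum of e_i * b_i * b_i^T.
C_rebuilt = sum(e * np.outer(b, b) for e, b in zip(eigvals, B.T))

# Keep factors whose eigenvalue is at least 1.0, then scale each eigenvector
# by the square root of its eigenvalue to obtain factor loadings.
keep = eigvals >= 1.0
loadings = B[:, keep] * np.sqrt(eigvals[keep])
explained = eigvals[keep].sum() / eigvals.sum()


def varimax(L, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation (standard SVD-based iteration)."""
    p, k = L.shape
    R = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        grad = L.T @ (Lr**3 - Lr @ np.diag((Lr**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return L @ R, R


rotated, R = varimax(loadings)
print(f"{keep.sum()} factors kept, {explained:.0%} of total variance")
```

Because the variables are standardized, the square of a loading is the fraction of that mass position's variance carried by the factor, so a loading of 0.74 corresponds to roughly 0.55 of the variance. An orthogonal rotation redistributes this variance among the retained factors without changing the total for any mass position.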

$$R_{jk} = \frac{1}{N_k}\sum_{i=1}^{N_k} F_j \cdot S_i \qquad (7)$$

where $F_j$ = jth factor, $S_i$ = ith standardized mass spectrum in class k, and $N_k$ = number of spectra in class k. The results of Equation 7 are ranked according to absolute magnitude, and the masses most significant in the highest ranking factors are examined for each functional group.

RESULTS AND DISCUSSION

Factor-Functional Group Relations. The results of applying Equation 7 to seven functional groups are reported in Table I. For carbonyls, two factors, 18 and 34, were of major and approximately equal importance, while three major factors, 26, 40, and 41, were found for hydroxyls. Factor 26 was also a significant factor in ethers.
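The ranking step of Equation 7 can be sketched as follows. This is an illustrative example on made-up numbers: the factor matrix, the two classes, and the helper names (`make_class`, `factor_scores`, `rank_factors`) are all hypothetical, and the score is taken as the mean dot product of a factor with the standardized spectra of a class.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical inputs: 3 factors over 6 mass positions (rows are factors).
factors = np.array([
    [0.8, 0.7, 0.0, 0.0, 0.1, 0.0],    # factor 0 loads on masses 0 and 1
    [0.0, 0.1, -0.9, -0.6, 0.0, 0.0],  # factor 1 loads negatively on masses 2 and 3
    [0.0, 0.0, 0.1, 0.0, 0.7, 0.8],    # factor 2 loads on masses 4 and 5
])


def make_class(elevated, n=20):
    """Made-up standardized spectra with above-average intensity at given masses."""
    spectra = rng.normal(scale=0.3, size=(n, factors.shape[1]))
    spectra[:, elevated] += 1.5
    return spectra


classes = {"class_A": make_class([0, 1]), "class_B": make_class([2, 3])}


def factor_scores(factors, spectra):
    """Mean dot product of each factor with the spectra of one class."""
    return (spectra @ factors.T).mean(axis=0)


def rank_factors(factors, spectra):
    """Factor indices ordered by decreasing absolute score, plus the scores."""
    scores = factor_scores(factors, spectra)
    return np.argsort(-np.abs(scores)), scores


ranking = {}
for name, spectra in classes.items():
    order, scores = rank_factors(factors, spectra)
    ranking[name] = (order, scores)
    print(name, "top factor:", order[0], "score:", round(float(scores[order[0]]), 2))
```

In this toy case the top factor for `class_B` has a negative score even though the class has elevated intensity at masses 2 and 3, because the loadings on those masses are themselves negative. Ranking by absolute magnitude, as the text describes, keeps such factors from being overlooked.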


Table I. Mass Positions and Factors Associated with Seven Functional Groups

Functional group   Class size   Related factor   Relative importance(a) (Eq. 7)   Significant mass positions in factor(b)
Phenyl             62           15               -1.24                            120, 105
                                8                 1.19                            134, 133, 119
                                21               -0.86                            92, 91
                                1                -0.65                            117, 116, 90, 89
                                31                0.58                            129, 102
                                32                0.50                            122, 77
C=O                76           18                0.65                            88, 61, 60
                                34               -0.63                            74, 73, 28
C-O-C              57           26               -0.80                            73, 45, 31, 19
                                9                -0.64                            87, 75, 59, 47
OH                 33           26               -1.60                            73, 45, 31, 19
                                40               -1.00                            59, 33, 31
                                41               -0.94                            62, 32, 31
N                  81           10               -0.51                            123, 108, 18, 17
                                13               -0.47                            58
                                24                0.40                            76, 61, 46, 30, 17
NH2, NH            58           10               -0.72                            123, 108, 18, 17
                                4                -0.62                            98, 83, 70, 69, 56, 55, 42, 41, 39, 27
                                19                0.55                            72, 44
                                37                0.53                            96, 95, 94, 93
CnH2n+2            89           16                1.31                            113, 99, 71, 57, 43
                                23               -0.62                            57
                                25               -0.56                            85, 84, 43

(a) Note that a negative value in the fourth column does not necessarily imply that the masses associated with the factors are negatively correlated with the functional group. If the loadings themselves are negative, the double negative results in a positive correlation between the functional groups and the masses.
(b) Coefficient larger than 0.25.

The variation in mass spectral intensity of 81 nitrogen compounds is related to two factors, 10 and 24, while four factors, 10, 4, 19, and 37, were important to the 58 amines in the data set. Since factor 24 did not appear as a significant amine factor, an association with nitro compounds is indicated. Saturated hydrocarbons are most significantly related to factor 16, while factors 23 and 25 are less important.

Because the sum of products of factors and standardized spectra has been divided by the number of spectra in a class, the functional groups can be compared as to their relation to the linear dimensions obtained by factor analysis. Thus, factor 26 is the most highly correlated of any factor to any functional group, and is twice as important to hydroxyls as to ethers. In fact, the relative importance values in Table I indicate that all three hydroxyl factors are more related to hydroxyls than any factors are to ethers or carbonyls. Factor 10 is more important to amines than to nitrogen compounds in general (0.72 vs. 0.51), so that one associates the major mass positions of factor 10 with amines.

Functional Group/Mass Position Relations. Once the important factors are known for a given functional group, the functional group mass positions are directly obtainable as those masses with large coefficients in the factor. These coefficients are the square roots of the variance in a mass position attributable to a given factor. Figure 1 illustrates the factors, mass positions, and coefficients for oxygen functional groups. For example, factor 26 accounts for over half the variance in mass 45, 10% in mass 31, and 63% in mass 19. Inclusion of the second and third hydroxyl factors with the first accounts for 65% of the variance in mass 31, establishing a fairly strong relationship between mass 31 and the OH group.

Figure 1. Masses contributing greatest variance to oxygen factors. Factors are labeled in decreasing order of correlation with specific functional groups.

In Figure 1, the factors most related to the carbonyl group, the three most related to the hydroxyl group, and the two most related to the ether group are diagrammed as to the mass positions of importance in each factor. Thus, the factor most important for hydroxyl and ether groups has significant loadings for masses 73 (C4H9O), 45 (C2H5O), 31 (CH3O), and 19 (H3O). The second and third hydroxyl factors contain further contributions to the variance in mass 31 as well as 32 (CH4O), 33 (CH5O), 59 (C3H7O), and 62 (C2H6O2). The mass positions most prominent in the carboxyl factors were 88 (C4H8O2), 73 (C3H5O2), 61 (C2H5O2), and 60 (C2H4O2) for the primary factor, and 74 (C3H6O2), 73, and 28 (CO) for the second most related factor. The second ether factor contained 87 (C5H11O), 75 (C3H7O2), 59, and 47 (C2H7O).

In conclusion, this work indicates that factor analysis may be applied to the elucidation of the relationship of functional groups to mass spectra. The examination of a more complete data base consisting of several thousand compounds seems warranted.

LITERATURE CITED

(1) T. L. Isenhour and P. C. Jurs, Anal. Chem., 43 (10), 20A (1971).
(2) P. H. Weiner and D. G. Howery, Anal. Chem., 44, 1189 (1972).
(3) P. H. Weiner and D. G. Howery, Can. J. Chem., 50, 448 (1972).
(4) D. G. Howery, Anal. Chem., 46, 829 (1974).
(5) J. H. Kindsvater, P. H. Weiner, and T. J. Klingen, Anal. Chem., 46, 982 (1974).
(6) P. H. Weiner, E. R. Malinowski, and A. R. Levinstone, J. Phys. Chem., 74, 4537 (1970).
(7) P. H. Weiner and E. R. Malinowski, J. Phys. Chem., 75, 1207 (1971).
(8) P. H. Weiner and E. R. Malinowski, J. Phys. Chem., 75, 3160 (1971).
(9) D. G. Howery, Bull. Chem. Soc. Jpn., 45, 2643 (1972).

RECEIVED for review July 12, 1974. Accepted July 17, 1975.