Anal. Chem. 1983, 55, 1117-1121

Table II. CMC Determination of Detergent Mixtures

detergent mixture            ratio(a) by %   approx molar ratio   CMC, %
Triton X 114                                                      0.009
Triton X 114/Triton X 305    3:1             9:1                  0.009
Triton X 114/Triton X 305    1:1             3:1                  0.012
Triton X 114/Triton X 305    1:3             1:1                  0.012
Triton X 114/Triton X 305    1:19            3:19                 0.060
Triton X 305                                                      0.067

(a) Detergent solutions were prepared by mixing % (w/w) solutions of both detergents in the proportions indicated and then assayed as described. Molar determinations based on an average molecular weight for Triton X 114 of 536 daltons and for Triton X 305 of 1526 daltons.

suggests that the assay conditions do not significantly perturb the biophysical properties of the detergent solution.

DISCUSSION

The change in absorption spectrum upon transfer of CBBG from a hydrophilic to a hydrophobic environment allows measurement of detergent micelle formation. This change in spectrum, with a relative increase in A620 and decrease in A470 upon CBBG interaction with organic solvents or micelles, differs from that observed for CBBG bound to protein or in ionic detergent solutions. The CMC is readily determined as the concentration of detergent which initiates a change in the absorption spectrum at 620 nm and 470 nm. Subsequent addition of detergent increases only the micelle, not the monomer concentration of the detergent (6). This is corroborated by the linearity of change in A620 and A470 with detergent concentration above the CMC. The CBBG assay provides a rapid means of CMC determination for nonionic detergents. Unlike many other assays for CMC, the reagents and instruments required for the CBBG assay are readily available in most labs, the analysis does not require separation of two phase systems (as for a hydrophobic dye such as Orange OT (6)), and the change in CBBG absorption with micelle concentration can be followed at two wavelengths, facilitating the extrapolation. Determination of the detergent concentration of a solution above the CMC can also be obtained by using this assay. Nonionic detergents are increasingly being used in membrane research and such a routine colorimetric assay for CMC should be useful in determining an important physical chemical parameter for aqueous solutions of one or mixtures of multiple nonionic detergents.

Registry No. CBBG, 6104-58-1; Triton N, 9016-45-9; Triton X 114, 9036-19-5; Triton X 100, 9002-93-1; Lubrol Px, 9002-92-0.

LITERATURE CITED

(1) Gennis, R. B.; Jonas, A. Annu. Rev. Biophys. Bioeng. 1977, 6, 195-238.
(2) Gennis, R. B.; Strominger, J. L. J. Biol. Chem. 1976, 251, 1277-1282.
(3) Griffith, W. C. J. Soc. Cosmet. Chem. 1949, 1, 311.
(4) Egan, R. W.; Jones, M. A.; Lehninger, A. L. J. Biol. Chem. 1978, 251, 4442-4447.
(5) "McCutcheon's Detergents and Emulsifiers", North American edition; MC Publishing Co.: Glen Rock, NJ, 1980.
(6) Jain, M. K.; Wagner, R. C. "Introduction to Biological Membranes"; Wiley: New York, 1980; pp 66-70.
(7) Bradford, M. Anal. Biochem. 1978, 72, 248-254.
(8) Bio-Rad, Price List G, 1981.
(9) Chiang, J. Program of Molecular Pathology, Northeastern Ohio Universities College of Medicine, Rootstown, OH 44272.
(10) "Rohm and Haas Surfactants and Dispersants-Handbook of Physical Properties"; Rohm and Haas: Philadelphia, PA, 1978.

RECEIVED for review November 8, 1982. Accepted March 10, 1983. This investigation was supported by PHS Grant No. CA 28342 awarded by the National Cancer Institute, DHHS, and JFRA 36 from the American Cancer Society to K.S.R.

Search System for Infrared and Mass Spectra by Factor Analysis and Eigenvector Projection

S. S. Williams, R. B. Lam,¹ and T. L. Isenhour*

Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27514

A factor analysis technique has been used to compress a library of combined infrared and mass spectra. With this method, a 75% reduction in library size has been achieved with little loss in compound discrimination. A search system using the compressed library is described. Characteristics of the searches are demonstrated with intralibrary and GC/IR and GC/MS data.

The need for efficient search-identification methods in infrared spectrometry and mass spectrometry has become increasingly important. This is primarily a result of increased growth of spectral libraries and the rapid maturation of gas chromatography/mass spectrometry (GC/MS) and gas chromatography/infrared spectrometry (GC/IR). To date, many automated search algorithms have been developed which are capable of rapidly identifying compounds from infrared and mass spectra with a high degree of accuracy (1, 2). Recently several workers have reported the combined technique of gas chromatography/infrared spectrometry/mass spectrometry (GC/IR/MS) (3, 4). The method exploits the complementary nature of infrared and mass spectra, allowing more confidence in identifying individual components of mixtures (5). In GC/IR/MS work published to date, all searches applied to GC/IR/MS data have used separate IR and MS search algorithms. This approach to searching has the drawback that each search must individually generate results while ignoring a substantial portion of the available data. In addition, it is possible in some cases that the individual searches will not yield the same results. Finally, with infrared spectral libraries rapidly growing in size, and mass spectral libraries already commonly over 25 000 compounds, the time required to carry out two separate searches and rationalize the results becomes unacceptable.

¹ Current address: Foxboro Analytical, 140 Waters St., Norwalk, CT 06856.

0003-2700/83/0355-1117$01.50/0 © 1983 American Chemical Society

ANALYTICAL CHEMISTRY, VOL. 55, NO. 7, JUNE 1983

Figure 1. Data matrix used in factor analysis.

Figure 2. Result of abstract factor analysis on data matrix.

Figure 3. Factor analysis diagram for k cases with i variables in each case (data matrix, covariance matrix, matrix multiplication, linear factors).

This work extends the method of factor analysis and eigenvector projection, used by Hangac and Isenhour (6) for the compression and search of infrared spectra, to the problem of combining infrared and mass spectra. The factor analysis process identifies a vector space in which redundancies are effectively removed from the original spectrum. Eigenvector projection produces a new compressed library reflecting the nonredundant form of the combined spectra. A search system utilizing a vector comparison metric is built around the transformed and compressed result and is tested with intralibrary searches and searches of data from GC/IR and GC/MS.
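The compression and search outlined above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names `compress_library` and `search`, the use of the covariance matrix about the column means, the Euclidean distance as the vector comparison metric, and the random stand-in library are all assumptions made for the example.

```python
import numpy as np

def compress_library(library, n_keep):
    """Project library spectra (one per row) onto the n_keep leading eigenvectors."""
    mean = library.mean(axis=0)
    centered = library - mean
    # Covariance matrix of the spectral variables (columns); an assumed choice.
    cov = centered.T @ centered / (library.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    leading = np.argsort(eigvals)[::-1][:n_keep]    # indices of the largest eigenvalues
    basis = eigvecs[:, leading]
    return centered @ basis, basis, mean

def search(unknown, compressed, basis, mean, n_hits=3):
    """Return indices of the n_hits library entries closest to the unknown."""
    projected = (unknown - mean) @ basis            # same projection as the library
    # Euclidean distance is an assumed metric, used here for illustration only.
    distances = np.linalg.norm(compressed - projected, axis=1)
    return np.argsort(distances)[:n_hits]

# Hypothetical stand-in for a library of combined IR/MS spectra.
rng = np.random.default_rng(0)
library = rng.random((50, 40))
compressed, basis, mean = compress_library(library, n_keep=10)  # 40 -> 10 values per entry
hits = search(library[7], compressed, basis, mean)              # intralibrary search
print(hits)
```

Keeping 10 of 40 coordinates mirrors the 75% reduction quoted in the abstract; the appropriate number for a real library would have to be chosen from the eigenvalues.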

is performed on the correlation matrix to produce a set of i linear factors in the form of i eigenvectors with i eigenvalues. All of the information originally present in the spectral library is distributed to the i eigenvectors. This amounts to creating an alternate coordinate system into which the spectral information can be mapped. The original spectral coordinates are inefficient because information is spread over many axes of the coordinate system. As an example, an aromatic compound has multiple peaks and characteristic groupings in both the infrared and mass spectra. The new coordinate system is mutually orthogonal and thus more efficient for storing information than the spectral system; however, the new system is an abstract representation of the factor space and thus individual factors (eigenvectors) cannot be associated with specific functionalities. The data do contain all of the original information, and because the eigenanalysis attempts to span the factor space in the most efficient way possible, many of the i eigenvectors contain no significant information and may be eliminated. The number of eigenvectors j taken from the original i eigenvectors can be determined by the criterion of Kaiser (8). This assumes that eigenvectors with eigenvalues larger than the average eigenvalue are statistically significant, while eigenvalues smaller than the average indicate eigenvectors describing random variations in the data. Alternatively, the number of significant eigenvectors can be determined empirically through the use of target transformation
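The eigenanalysis of the correlation matrix and the Kaiser criterion described above can be illustrated with a short NumPy sketch. The helper names `kaiser_factors` and `project`, and the synthetic data built from three hidden factors, are hypothetical and serve only to show the mechanics; they are not taken from the paper.

```python
import numpy as np

def kaiser_factors(data):
    """Eigenanalysis of the correlation matrix with Kaiser's cutoff."""
    corr = np.corrcoef(data, rowvar=False)      # i x i correlation matrix of the variables
    eigvals, eigvecs = np.linalg.eigh(corr)     # symmetric input: orthonormal eigenvectors
    order = np.argsort(eigvals)[::-1]           # sort from largest to smallest eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    j = int(np.sum(eigvals > eigvals.mean()))   # keep eigenvalues above the average
    return eigvals, eigvecs, j

def project(data, eigvecs, n_keep):
    """Map standardized data onto the first n_keep eigenvectors."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    return z @ eigvecs[:, :n_keep]

# Synthetic example (an assumption): 12 variables driven by 3 hidden factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 12))
data = latent @ mixing + 0.05 * rng.normal(size=(200, 12))

eigvals, eigvecs, j = kaiser_factors(data)
scores = project(data, eigvecs, j)              # compressed representation
print(j, scores.shape)
```

Because the trace of a correlation matrix equals the number of variables, the average eigenvalue is 1, so the cutoff used here amounts to keeping eigenvalues greater than 1. Projecting onto all i eigenvectors and multiplying back by their transpose recovers the standardized data, which is the sense in which no information is lost before truncation.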

THEORY

The general goal of factor analysis is to remove redundant information from a large data set (7). A data matrix consisting of r rows and c columns (Figure 1) is formed, with each row signifying a particular observation or case and each column representing a specific variable determined for each case. Each data point is represented by $d_{ik}$, where i and k are the respective row and column indexes. Factor analysis seeks to form new matrices (Figure 2), the product of which is equal to the original matrix. An abstract solution is produced through an eigenanalysis of the data matrix; the eigenvectors describe the factor space, and the eigenvalues indicate the relative significance of the corresponding eigenvectors. The complete factor analysis includes the transformation of the abstract solution into a solution which can be interpreted by use of known physical or chemical relationships, though this is neither necessary nor always possible. The data matrix can be formed from spectra, with rows representing individual compounds and columns representing points along the abscissa. For a successful factor analysis each point must be expressible as a linear sum of terms:

$$d_{ik} = \sum_{n=1}^{j} r_{in} c_{nk} \tag{1}$$

where j