1328
Anal. Chem. 1989, 61,1328-1332
Computer-Assisted Prediction of Gas Chromatographic Retention Indices of Pyrazines David T. Stanton and Peter C. Jurs* 152 Davey Laboratory, Penn State University, University Park, Pennsylvania 16802
The gas-liquid chromatographic retention Indices for 107 substltuted pyrazlnes on OV-101 and Carbowax-2OM are successfully modeled with the aid of a computer and the ADAPT software system. Structural descriptors are calculated and multiple llnear regression analysls methods are used to generate model equatlons relating structural features to observed retentlon characteristics. Separate equatlons are found to be necessary in order to accurately predict the retentlon lndlces for the same set of compounds on the two different stationary phases. Comparisons are made between models generated for the two stationary phases, and descriptors that may encode dlfferences in solute interactions wlth stationary phases of dlfferlng polarlty are discussed.
INTRODUCTI 0N Alkoxy- and alkylthiopyrazines are known for their characteristic flavoring properties. These compounds have been found in roasted nuts and beans and a variety of other cooked vegetables ( I ) . They have been reported to possess a variety of odors such as that of bell pepper (2)and licorice-woody odor (3). In studies of these materials, identification of a compound is often accomplished on the basis of gas chromatographic peak comparisons with an authentic standard of the suspected material. However, it is not always possible to obtain samples of the pure standard materials for such comparisons. It is then desirable to develop methods for the prediction of retention characteristics of the unknown based on the structural features and chromatographic properties of other representative compounds on hand. Mihara and Enomoto have reported such a structure-property relationship for a set of substituted pyrazines in which retention index increments corresponding to various ring substituents were determined for a small series of substituents and the parent (unsubstituted) pyrazine (4). Retention index values for new compounds were then calculated by summing the necessary increment values for the substituents present. The method was later expanded to include additional substituents and to add a term that encoded the relative position of one substituent to others on the ring (5). In a related approach, Masuda and Mihara described the use of modified molecular connectivity indices (6) to predict gas chromatographic retention indices for a series of substituted pyrazines. These methods yield good results providing that the experimentally determined index increments for all the substituents for the unknowns involved are available. A primary disadvantage of such approaches is that retention indices cannot be predicted for compounds containing substituents other than those available in the table of index increments. In addition, little insight concerning the physical processes involved in the separation can be gained through the examination of the independent variables in the resulting equations. The goal of the present work is to employ computer-assisted methods to develop model equations relating structural features of substituted pyrazines to their observed retention indices and to learn how differences in the models may de-
scribe polarity differences between two stationary phases. The predictive models are generated through multiple linear regression analysis of calculated structural descriptors which encode topological, geometrical, electronic, and physical properties of the solute molecules in question. Such models are termed quantitative structure-retention relationships (QSRR). The derivation of such relationships and the definitions of several typical descriptors have been described by Kaliszan (7) and have been employed in studies related to that reported here (8, 9). An advantage of the QSRR approach is that the equations generated allow for the prediction of retention indices for compounds which are structurally similar to those used to develop the model but which were not specifically represented in the training set. Another advantage is that differences in models developed for the same compounds on stationary phases of differing polarity may provide insight as to which structural features of the molecules may be responsible for polar interactions with stationary phases. Also, use of a computer allows for the rapid and facile input of the structures of the compounds being studied, calculation and subsequent analysis of descriptors, and the generation and validation of the model regression equations.
EXPERIMENTAL SECTION The procedure used in the current study is outlined in the flow diagram shown in Figure 1. All computations were performed by using the Penn State University Chemistry Department’s PRIME 750 computer running the ADAPT software system (IO, 11).
The Data Set. The compounds involved in this study have the general structure I.
I R, : H, alkyl, alkoxy, alkylthio, aryloxy, arylthio, acetyl, chloro.
R, : H, alkyl, chloro. R, : H, alkyl.
R4 : H, alkyl.
The retention data for the compounds chromatographed on the OV-101and Carbowax-2OM stationary phases were taken from Mihara and Masuda (5) and are listed in Table I. Of the 114 pyrazines included in their study, two were not included in the present study. Of the remaining 112 compounds, two had not been analyzed on both columns. Both of these were retained but were only included in the modeling process for the stationary phase on which they had been analyzed. This yielded a final data set of 111 compounds for each stationary phase. Structure Entry. The hydrogen-suppressed structural diagrams of all the compounds were entered into the computer by sketching them on a graphics terminal, followed by storage in connection tables. Bond lengths and angles were corrected to
0003-2700/89/0361-1328$01.50/00 1989 American Chemical Society
ANALYTICAL CHEMISTRY, VOL. 61, NO. 13, JULY 1, 1989 2800
r
1850
It
1329
I
Structure
Retention Index Carbowax-2OM
2325
1375
I m
.
.. ... . I.
I
500
Analysis
,
i
I Model
1400
950
1850
2300
Retention Index
ov - 101 Figure 2. Plot of the correlation of retention indices for pyrazines on OV-101 and Carbowax-20M.
Validation
Figure 1. Procedure flow diagram for development of retention index prediction models.
standard values, and the structures were placed in energy-minimized conformations using molecular mechanics calculations. Descriptor Generation. A total of 85 separate molecular structure descriptors were calculated for the entire set of 112 pyrazines. These descriptors can be separated into four categories: topological, geometric, physical, and electronic. The topological descriptors included path counts and molecular connectivity indices. Geometric descriptors included shadow areas (12),length to breadth ratios, van der Waals volumes, surface area, and principal moments of inertia. Calculated physical property descriptors included molecular polarizability and molar refractivity. The electronic descriptors included the most positive and most negative u charges in the molecd and the submolecular polarity parameter (A) described by Kaliszan (7). Once the descriptors were generated, the process of objective descriptor (feature) selection was initiated. The descriptors were examined, and those exhibiting high pairwise correlations to other descriptors and those values that were identical for the majority of the data set or were predominantly zero were discarded. This was done because such descriptors encode little discriminating information. In the case of high pairwise correlations, those descriptors that were easy to calculate or that possessed direct physical meaning, were retained. The remaining 45 descriptors were then examined by using a vector space analysis method to select a subset of 30 to 35 descriptors that are the most orthogonal to each other. The process of orthogonalization is accomplished by selecting as the initial basis vector a descriptor that is highly correlated to the dependent variable. The next descriptor chosen is one that yields the largest projection angle to the initial basis vector, forming a plane in the vector space. The next descriptor is chosen that has the largest projection angle to the plane formed by the first two, and so on, until a subset of the desired size has been selected. This reduction prior to regression analysis was necessary to minimize information overlap between descriptors and to reduce the possibility of chance correlations (13). A limiting ratio of five objects (compounds) to each descriptor is typical. Since the feature selection procedure, for the most part, ignores the dependent variable, in this case the retention index, the same final subset of descriptors was available for regression analysis for each stationary phase. Regression Analysis. Variable selection was performed by using the method of multiple regression by progressive deletion (14). Linear models are formed by a stepwise addition of terms where inclusion of terms is based on F statistic values. A deletion process is then employed where each variable in the model is held out in turn and a model is generated by using the remaining pool of descriptors. Then each combination of two, three, and so on, descriptors are held out, followed by model generation. This combination of steps has the effect of uncovering potentially superior equations that may have been otherwise obscured by the existence of a descriptor which is highly correlated to the dependent variable. A final set of selected equations were then tested
11
IV
111
0
CH, V
Figure 3.
VI
VI1
Structural diagrams of the six compounds identified as
outliers. for stability and validity through a variety of statistical methods.
RESULTS AND DISCUSSION The retention indices for the two stationary phases were compared, yielding a correlation coefficient (R) of 0.951. A plot of the comparison of observed retention on Carbowax20M versus observed retention on OV-101 is given in Figure 2. Results of initial experiments identified six of the 112 compounds as outliers. The structures of the six compounds are shown in Figure 3. The criterion for identifying a compound as an outlier was that compound being flagged by three or more of six standard statistical tests used to detect outliers in regression analysis. These tests were (1) residual, (2) standardized residual, (3) Studentized residual, (4)leverage, (5) DFFITS, and (6) Cooks distance (15).The residual is the difference between the actual value and the value predicted by the regression equation. The standardized residual is the residual divided by the standard deviation of the regression equation. The Studentized residual is the residual of a prediction divided by its own standard deviation. Leverage allows for the determination of the influence of a point in determining the regression equation. DFFITS describes the difference in the fit of the equation caused by removal of a given observation, and Cook's distance describes the change in a model coefficient by the removal of a given point. Since it was undesirable to assign too much weight to any one of these tests, the criterion of any combination of three of the tests to indicate a given compound as an outlier was used. For each stationary phase, four compounds were identified as outliers, with two of the six compounds being outliers common to both phases. No specific reason could be determined to explain v h y these six compounds act as outliers. They are not structurally unique, and the final conformational energies, as calculated by using the molecular mechanics programs, were found to be reasonable as compared to the rest of the data set. The outliers were dropped from consideration in sub-
1330
ANALYTICAL CHEMISTRY, VOL. 61,
NO. 13, JULY
1, 1989
Table I. Experimentally Determined Retention Indices for Pyrazines on OV-101and Carbowax-20M'
OV-101 CW-2OM
name pyrazine methylpyrazine 2,3-dimethylpyrazine 2,5-dime thylpyrazine 2,6-dimethylpyrazine trimethylpyrazine tetramethylpyrazine ethylpyrazine
710 801 897 889 889 981 1067 894 2-ethyl-5-methylpyrazine 980 977 2-ethyl-6-methylpyrazine 2,5-dimethyl-3-ethylpyrazine 1059 1064 2,6-dimethyl-6-ethylpyrazine 1066 2,3-dimethyl-5-ethylpyrazine 2,3-diethylpyrazine 1065 1137 2,3-diethyl-5-methylpyrazine propylpyrazine 986 1072 2-methyl-3-propylpyrazine 1154 2,3-dimethyl-5-propylpyrazine 1142 2,5-dimethyl-3-propylpyrazine 1151 2,6-dimethyl-3-propylpyrazine isopropylpyrazine 949 1112 2,3-dimethyl-5-isopropylpyrazine 1088 butylpyrazine 1121 2-butyl-3-methylpyrazine 1184 3-butyl-2,5-dimethylpyrazine 1196 3-butyl-2,6-dimethylpyrazine 1254 5-butyl-2,3-dimethylpyrazine 1043 isobutylpyrazine 1200 2,3-dimethyl-5-isobutylpyrazine 2-isobutyl-3,5,6-trimethylpyrazine 1263 1040 sec-butylpyrazine 1194 5-sec-butyl-2,3-dimethylpyrazine 1192 pentylpyrazine 1352 2,3-dimethyl-5-pentylpyrazine 1157 isopentylpyrazine 1317 2,3-dimethyl-5-isopentylpyrazine 1151 (2-methylbutyl)pyrazineb 2,3-dimethyl-5-(2-methylbutyl)pyrazine 1306 2-(2-methylbutyl)-3,5,6-trimethylpyrazine 1363 1240 (2-methylpenty1)pyrazine ( 2-ethylpropyl)pyrazineb 1121 1133 (1-methylbuty1)pyrazine 1377 2,3-dimethyl-5-( 2-methylpenty1)pyrazine 1293 hexylpyrazine 1495 octylpyrazine 1546 2-methyl-3-octylpyrazine 2-methyl-5-(2-methylbutyl)-3-octylpyrazine 1923 2-methyl-6-(2-methylbutyl)-3-octylpyrazine 1962 877 methoxypyrazine 954 2-methoxy-3-methylpyrazine 2-methoxy-5-methylpyrazine 969 1037 3-ethyl-2-methoxypyrazine 1078 3-isopropyl-2-methoxypyrazine 1170 5-isopropyl-3-methyl-2-methoxypyrazine 5-sec-butyl-3-methyl-2-methoxypyrazine 1250 5-isobutyl-3-methyl-2-methoxypyrazine 1257 1362 3-methyl-2-methoxy-5-(2-methylbutyl)pyrazine
1179 1235 1309 1290 1300 1365 1439 1300 1357 1353 1400 1415 1421 1417 1459 1374 1438 1500 1474 1493 1316 1431 1474 1459 1487 1514 1600 1406 1525 1556 1394 1500 1575 1700 1530 1655 1527 1636 1661 1606 1449 1471 1710 1668 1845 1956 2200 2254 1306 1339 1358 1400 1400 1467 1536 1556 1664
name
OV-101 CW-2OM
1444 959 2-ethoxy-3-methylpyrazine 1029 2-ethoxy-5-methylpyrazine 1047 2-ethoxy-3-ethylpyrazine 1101 2-ethoxy-3-isopropylpyrazine 1143 1230 2-ethoxy-5-isopropyl-3-methylpyrazine 1314 2-ethoxy-5-isobutyl-3-methylpyrazine 1306 5-sec-butyl-2-ethoxy-3-methylpyrazine 1415 2-ethoxy-3-methyl-5-( 2-methylbuty1)pyrazine 2-methylpentyl)pyrazineC 0 2-ethoxy-3-methyl-5-( (methy1thio)pyrazine 1076 1151 3-methyl-2-(methylthio)pyrazine 5-methyl-2-(rnethy1thio)pyrazine" 1163 1237 3-ethyl-2-(methylthio)pyrazine 3-isopropyl-2-(methylthio)pyrazine 1273 5-isopropyl-3-methyl-2-(methylthio)pyrazine 1362 5-secbutyl-3-methyl-2-(methylthio)pyrazine 1441 5-isobutyl-3-methyl-2-(methylthio)pyrazine 1446 3-methyl-5-(2-methylbutyl)-2-(methylthio)pyrazine 1552 3-methyl-5-(2-methylpentyl)-2-(methylthio)pyrazine1638 (ethy1thio)pyrazine 1148 1215 2-ethylthio-3-methylpyrazine 1418 2-ethylthio-5-isopropyl-3-methylpyrazine 1494 5-sec-butyl-2-ethylthio-3-methylpyrazine 1496 2-ethylthio-5-isobutyl-3-methylpyrazine 1602 2-ethylthio-3-methyl-5-(2-methylbutyl)pyrazine 2-ethylthio-3-methyl-5-(2-methylpentyl)pyrazine 1686 1415 phenoxypyrazine 1465 2-methyl-3-phenoxypyrazine 5-isopropyl-3-methyl-2-phenoxypyrazine 1620 1694 5-sec-butyl-3-methyl-2-phenoxypyrazine 5-isobutyl-3-methyl-2-phenoxypyrazine 1706 1807 3-methyl-5-(2-methylbutyl)-2-phenoxypyrazine (pheny1thio)pyrazine 1606 3-methyl-2-(phenylthio)pyrazine 1658 5-isopropyl-3-methyl-2-(phenylthio)pyrazine 1806 5-sec-butyl-3-methyl-2-(phenylthio)pyrazine 1874 1882 5-isobutyl-3-methyl-2-(phenylthio)pyrazine 3-methyl-5-(2-methylbutyl)-2-(phenylthio)pyrazine 1985 3-methyl-5-(2-methylpentyl)-2-(phenylthio)pyrazine2064 acetylpyrazine 993 1061 2-acetyl-3-methylpyrazine 1093 2-acetyl-5-methylpyrazine 2-acetyl-6-methylpyrazine 1088 2-acetyl-3-ethylpyrazine 1138 1153 2-acetyl-3,5-dimethylpyrazine chloropyrazine 861 1032 2,3-dichloropyrazine 2-chloro-3-methylpyrazine 951 1044 2-chloro-3-ethylpyrazine 2-chloro-34sobutylpyrazine 1187 2-chloro-5-isopropyl-3-methylpyrazine 1173 5-sec-butyl-2-chloro-3-methylpyrazine 1256 1264 2-chloro-5-isobutyl-3-methylpyrazine 2-chloro-3-methyl-5-( 2-methy1butyl)pyrazine 1371 2-chloro-3-methyl-5-(2-methylpentyl)pyrazine 1456
3-methyl-2-methoxy-5-(2-methylpentyl)pyrazine
ethoxypyrazine
1737 1348 1385 1418 1439 1431 1500 1584 1566 1693 1771 1600 1616 0 1696 1692 1737 1800 1816 1941 2008 1635 1655 1769 1832 1843 1951 2026 2104 2103 2114 2173 2209 2301 2400 2399 2375 2430 2452 2569 2669 1571 1567 1625 1618 1617 1629 1351 1581 1399 1467 1575 1505 1577 1600 1710 1789
aTakenfrom ref 5. bCompoundsnot included in current study. CNoreported retention index on OV-101.dNo reported retention index on Carbowax-2OM. sequent work leaving a data set of 107 compounds for each stationary phase. The models generated were of the general form n
RI
(SP)=
bo
+ Ci bixi
The equations generated give the retention index for a given stationary phase as the intercept (bo) plus the sum of the product of the ith regression coefficient (bi)and the ith independent variable or descriptor (xi),where n is the number of independentvariables in the model. Development of the models proceeded by treating each stationaryphase separately.
Since the same final descriptor pool was available to the regression analysis procedure,it was of interest to see which of the descriptors were condsidered important in predicting the retention index values for each column type. The regression analysis procedure produced several reasonable equations. The choice of which equation to consider further was made by using three criteria: multiple correlation coefficient (R),standard error (s), and the number of descriptors in the model. An ideal model is one that has a high R value, a low standard error,and the fewest independentvariables. The best models found for each stationary phase are given below. The definition of each descriptor is given in Table 11.
ANALYTICAL CHEMISTRY, VOL.
Table 11. Definitions of the Descriptors Used in the Retention Index Prediction Models descriptor code ALLP-2 EDEN-1 FRAG-12 MOMI-1 MOMI-6 MPOL MPOS WTPT-1 WTPT-3 WTPT-4
definition number of paths in the molecule between the upper and lower limits (0-45) divided by the number of atoms in the molecule most negative approximated u electron density number of single bonds in the molecule first moment of inertia ratio of the second and third moments of inertia molecular polarizability most positive partial atomic charge in molecule molecular ID (sum of atomic IDS for the molecule") sum of atomic IDS for all heteroatoms in molecule sum of atomic IDS for oxygen atoms in molecule
@Theatomic ID is the sum of all path weights for a given atom (16).
Table 111. Correlation Matrix for the Descriptors in the Retention Index Prediction Model FRAG-12 EDEN-1 MOMI-6 WTPT-1 WTPT-3 WTPT-4
a. OV-101 1.000 0.045 1.000 0.769 0.164 1.000 0.673 -0.038 0.596 0.111 -0.703 -0.029 0.019 -0.802 -0.087
1.000 0.304 1.000 0.091 0.522 1.000
b. Carbowax-2OM
FRAG-12 1.000 ALLP-2 0.582 MOMI-1 0.856 MPOL 0.893 WTPT-1 0.679 MPOS -0.041
1.ooO 0.767 0.865
1.000 0.926 0.802
1.000 0.871 1.000 0.066 -0,055 -0.078 0.026 1.000 0.895
The coefficient of multiple determination (R2)indicates the amount of variance in the data set accounted for by the model. The standard error of the regression coefficient is given in each case, and n indicates the number of molecules involved in the regression analysis procedure.
ov-101: RI = [40.77 f 0.55(WTPT-1)][31.68 f 3.10(WTPT-4)] + [56.63 f 2.46(WTPT-3)] + [602.2 f 66.35(EDEN-1)] [7.22 f 0.84(MOMI-6)] [9.74 f 1.53(FRAG-12)] 16.08
+
R2 = 0.994
n = 107
+
s = 22.92
Carbowax-2OM:
RI [32.58 f 1.43(WTPT-1)] [86.23 f 4.57(ALLP-2)] - [213.5 f 6.9(FRAG-12)] [145.7 f 5.8(MPOL-l)] + [311.4 f 23.O(MPOS)] + f (9.30 X 10-3)(MOMI-1)]- 53.08 [(6.50 X
+
R2 = 0.986
n = 107
s = 36.33
The correlation matrix for the OV-101 model and the Carbowax-20M model is given in parts a and b of Table 111, respectively. The descriptors were chosen by the regression calculation in the order indicated in the models above and were based on the statistical significance of the descriptor and its utility in predicting the dependent variable. Examination of the models shows that although the sets of descriptors availabile in regression analysis were identical for both stationary phases, some different individual descriptors were found to be important in the prediction of the retention indices. There exists some similarity in what types of descriptors (i.e. topological, geometric, or electronic) were
61, NO. 13, JULY 1, 1989
1331
important. Even though the compounds in the dataset contained a diverse collection of substituents, only one of the descriptors (WTPT-4) encodes information specific to one type of substituent. The remainder are broader in scope. Both models contain a t least three topological descriptors. These encode information on the complexity and bonding within the molecules. The other descriptors present in the models are encoding geometric and electronic properties of the molecules, which presumably relate to the polar interactions between the solutes and the stationary phase. Predictions of retention indices for a diverse set of structures on more polar stationary phases have historically been difficult. This is evident in the somewhat poorer statistics found for the Carbowax-20M model. Even though the Carbowax-20M model contains more of the electronic/geometric types of descriptors than the OV-101 model, the descriptors used do not appear to encode enough information concerning the various polar interactions taking place during the separation. I t is apparent that additional work should be done in order to find descriptors that can improve predictions on such polar phases. An experiment was done where the experimentally determined retention index for the compounds on OV-101 were used as descriptors for prediction of retention on the more polar Carbowax-20M phase. It was postulated that any nonpolar interactions encoded in the retention indices for compounds on OV-101 could be the same as those for compounds on Carbowax-20M, and any additional structural descriptors indicated as significant by the regression analysis would therefore encode only the polar interactions. Although highly correlated, when other descriptors were available, the retention index on OV-101 is not significant in predicting retention on Carbowax-20M, and it is possible that the indices for the two phases are related in some nonlinear fashion. Similarly, when descriptors for one model were used for generating models for the other phase, the models obtained were generally poorer, indicating that the descriptors are encoding significantly different types of interactions, reinforcing the notion of modeling each type of stationary phase separately. The next step was to test the quality of the models. Several methods exist for this purpose. First, the interrelation of descriptors used in each of the two models was examined. As shown in parts a and b of Table 111, a few of the descriptors are fairly highly correlated. However, they are all below the cutoff value used in this study. A test for multicollinearities was then performed by holding each descriptor out in turn and using the remaining descriptors in the model equation as independent variables in a regression analysis attempting to predict the descriptors held out. Some fairly high multicollinearities were found, so an additional test was done to determine if these high multicollinearities affect the quality of the model equation. This was done by calculating the principal components for the data space, one for each descriptor present in the model. Principal components regression analysis was used to generate a new model equation, which was then compared to the original equation. This comparison showed the same standard error and coefficient of variance for both the original and principal components model. Also, all six principal components were retained in the regression analysis. These results indicate that any multicollinearities present in the original model do not adversely affect the quality of the equation. The final test of the models is to determine their ability to predict the retention indices of compounds not used to generate the predictive equations. Due to the diverse nature of the data set, the method of internal validation was chosen. The descriptors used in the generation of the predictive equations were taken from the models generated for the whole
1332
ANALYTICAL CHEMISTRY, VOL. 61, NO. 13, JULY 1, 1989
dataset. Next, 27 compounds were held out as a prediction set, while the remaining 80 compounds comprised the learning set. The prediction set compounds were chosen so as to give a subset that was statistically similar to the learning set. The learning sets for each stationary phase were then used to generate new model equations. These new equations were found to be similar to the original models. The new OV-101 model yielded an R2 value of 0.994 and a standard error of 21.44, while the new model for the Carbowax-20M data set gave an R2 value of 0.986 and a standard error of regression of 33.72. The new models were then used to predict the retention indices for the prediction set compounds. For the OV-101 dataset, the residual mean square (rms) error for the predictions was 32.42 (2.46%) with a correlation for predicted to experimental values of 0.996. The Carbowax-20M equation fared slightly worse with a rms error of 51.59 (2.90%) and a correlation between predicted and experimental retention indices of 0.993.
CONCLUSIONS The computer-assisted prediction of the retention indices for a diverse set of substituted pyrazines, based on structurally derived descriptors, has been successfully demonstrated. The ability to input, manipulate, and store structures and calculated descriptors by using systems such as ADAPT increases the facility and speed with which such a study can be accomplished. Examination of the models produced shows that the descriptors encode information related to the types of interactions that take place between the solute molecules and stationary phase during the separation process. It has also shown that different descriptors become important in such predictive models as the polarity of the stationary phase is changed, and therefore each phase should be modeled separately. The predictive ability of models generated in this fashion has been shown to be accurate within a few percent
of the mean retention index value, indicating that such models could be used in lieu of authentic standards for the partial identification of chromatographic peaks. The results also suggest the need for further investigation into the problem of improving predictions of retention indices for such diverse sets of compounds on the more polar phases.
ACKNOWLEDGMENT We are grateful to G. P. Sutton for his guidance in the area of regression analysis.
LITERATURE CITED (1) Barlln, G. B. The Pyrazines; Wiley-Interscience: New York, 1982. (2) Parliment, T. H.; Epstein, M. F. J. Agric. Food Chem. 1973, 21, 7 14-7 16. (3) Masuda,-H.; Misaku, Y.; Shibamoto, T. J. Agric. Food Chem. 1981, 29. 944-947. (4) Mihara, S.;Enomoto, N. J. Chromatogr. 1986, 324, 428-430. (5) Mihara, S.; Masuda, H. J. Chromarogr. 1987, 402, 309-317. (6) Masuda. H.; Mihara, S.J. Chromatogr. 1986, 366, 373-377. hic Rek (7) Kaiiszan, R. Quantitative S ~ u c r u r e - C h r o ~ r ~ r a p Retention tionships; Why-Interscience: New York, 1987. (8) Rohrbaugh, R. H.; Jurs, P. C. Anal. Chem. 1986, 58, 1210-1212. (9) Hasan, M. N.; Jurs, P. C. Anal. Chem. 1988, 60, 978-982. (10) Stuper, A. J.; Brugger, W. E.; Jurs, P. C. Computer AsslstedShrdles of Chemical Structure and Sioiogiml Function; Wiley-Interscience: New York, 1979. (11) Jurs, P. C . : Chou, J. T.; Yuan, M. I n Computer-AssistedDrug Design; Olson. E. C., Christoffersen, R. E., Eds.; American Chemical Society: Washington, DC, 1979; pp 103-129. (12) Rohrbaugh, R. H.; Jurs, P. C. Anal. Chim. Acta 1987, 199, 99-109. (13) Topliss, J. G.; Edwards, R. P. J. Med. Chem. 1979, 22, 1238-1244. (14) Small, G. W.; Jurs, P. C. Anal. Chem. 1983, 55, 1121-1127. (15) Belsley, D. A,; Kuh, E.; Welsch, R. E. Regression Diagnostics; Why: New York, 1980. (16) Randic, M. J. Chem. I n f . Compur. Sci. 1884, 24, 164-175.
-
RECEIVED for review Janllary 23,1989. Accepted April 3,1989. This work was supported by the National Science Foundation under Grant CHE85 03542. The PRIME-750 computer was purchased with partial financial support of the National Science Foundation.