Anal. Chem. 1990, 62, 2318-2323
Prediction of Gas and Liquid Chromatographic Retention Indices of Polyhalogenated Biphenyls
M. N. Hasan and P. C. Jurs*
Department of Chemistry, 152 Davey Laboratory, The Pennsylvania State University, University Park, Pennsylvania 16802
Gas chromatographic and liquid chromatographic retention indices of polyhalogenated biphenyl compounds are successfully modeled by using descriptors derived directly from the molecular structures. A five-variable regression equation with R² = 0.989 and relative standard deviation of 2.2% was generated for the GC retention indices of 53 PHBs. A five-variable regression equation with R² = 0.968 and relative standard deviation of 8.5% was generated for the LC retention indices of 53 PHBs. The descriptors found to be important are analyzed.
INTRODUCTION

Halogenated biphenyls are a group of chemical compounds formed by substituting hydrogen in biphenyl with one of the halogens. In theory, there are 209 possible congeners for each series of halogenated biphenyl. However, the compounds that have received the most attention from environmental chemists and toxicologists are the chlorinated biphenyls (PCBs). Their chemical inertness and other desirable physical properties have made the PCBs a versatile chemical product with many industrial applications (1). However, due to indiscriminate use and improper disposal, PCBs have become one of the most widely spread environmental contaminants. Because of their persistence, they are still considered a major threat to the environment, despite the fact that their production has long been banned. On the other hand, the other halogenated biphenyls such as polybrominated biphenyls (PBBs) are not as persistent as PCBs. Thus, their toxicology and environmental impact have not been studied as extensively as those of PCBs. However, an accidental poisoning of dairy cattle and other farm animals in Michigan (2, 3) led to more studies on the environmental effects of PBBs and on analytical methods for their determination (4, 5).

One of the most commonly used methods of analysis for polyhalogenated biphenyls in environmental samples is high-resolution gas chromatography coupled with an electron-capture detector or mass spectrometer. Liquid chromatography is not as popular as GC for analysis of polyhalogenated biphenyls, probably due to the lack of sufficiently sensitive detectors. Nevertheless, it is still considered an important analytical method in the overall determination of halogenated biphenyls, especially for sample cleanup or fractionation prior to GC analysis. A discussion of analytical methods for PCBs has been given by Erickson (6).

Recently, Mullin et al. (7) reported the synthesis of all 209 PCBs and their subsequent analysis using capillary GC. All but 11 pairs were completely separated. They noted some interesting trends in the elution order of PCB congeners, which is strongly influenced by the substitution pattern. Other reports made the same generalization about the elution order of PCBs (8-10). For example, compounds containing ortho substituents generally have shorter retention times than those lacking ortho substituents. Retention increases as the degree of coplanarity increases. Coincidentally, these so-called coplanar isomers have been shown to have higher toxicity (11, 12) than the nonplanar isomers. Other researchers also find a similar trend in the elution order of other halogenated biphenyls (4, 5).

Because of the high correlation between substitution pattern and retention characteristics of PCBs, a number of researchers have devised schemes for predicting the retention of PCBs from molecular structures (13-15). Some of these methods, however, were based on retention data measured on packed column GC. In our previous work on the prediction of retention characteristics of PCBs in capillary GC (16) we developed a very good equation for predicting retention using simple structural descriptors. It is desirable that a similar predictive equation be developed for all halogenated biphenyls.

Recently, Hofler et al. (17) described the separation of selected polyhalogenated biphenyls by both gas and liquid chromatography. In general, retention in both GC and LC increases in the order F < Cl < Br < I. In each series (except for fluorinated biphenyls) there is also a good correlation between retention indices and molecular surface area. They also noted that this correlation is more significant in gas chromatography than in liquid chromatography.

The objective of the present work was to develop regression models to predict the retention of halogenated biphenyls using descriptors derived from the molecular structures. This is an extension of the previous work on PCBs (16). This study also includes the development of a regression model for liquid chromatography using the same methodology.
EXPERIMENTAL SECTION

Data Set. The data set consisted of 56 polyhalogenated biphenyls including 13 fluorinated, 22 chlorinated, 18 brominated, and 3 iodinated biphenyls (Table I). The gas and liquid chromatographic retention data for these compounds were taken from Hofler et al. (17). The GC retention data were measured on a 25 m × 0.25 mm DB-210-CB capillary column with a film thickness of 0.2 μm. Retention characteristics of these compounds were expressed as Kovats retention indices. The retention index for one of the compounds, 3,3',4,4',5,5'-hexabromobiphenyl, was not available. Therefore, the GC retention data set consisted of 55 data points. For the liquid chromatographic data, retention data for all 56 compounds in the data set were obtained on an ODS (C18) column using 100% methanol as the mobile phase. A retention index scale as described by Hofler et al. (17) was used. As with the Kovats retention index, the retention index of a compound is found by interpolation between flanking n-alkane standards.

Entry and Storage of Molecular Structures. The molecular structures were entered into the computer system by drawing them on the screen of a graphics display terminal. They were stored as connection tables for further processing.

0003-2700/90/0362-2318$02.50/0 © 1990 American Chemical Society

ANALYTICAL CHEMISTRY, VOL. 62, NO. 21, NOVEMBER 1, 1990

Table I. Polyhalogenated Biphenyls in the Data Set

no.  compound
1    2-fluorobiphenyl
2    3-fluorobiphenyl
3    4-fluorobiphenyl
4    2,3-difluorobiphenyl
5    2,4-difluorobiphenyl
6    2,5-difluorobiphenyl
7    2,6-difluorobiphenyl
8    3,4-difluorobiphenyl
9    3,5-difluorobiphenyl
10   4,4'-difluorobiphenyl
11   2,2',4-trifluorobiphenyl
12   2,3,5,6-tetrafluorobiphenyl
13   decafluorobiphenyl
14   2-chlorobiphenyl
15   3-chlorobiphenyl
16   4-chlorobiphenyl
17   2,3-dichlorobiphenyl
18   2,4-dichlorobiphenyl
19   2,5-dichlorobiphenyl
20   2,6-dichlorobiphenyl
21   3,4-dichlorobiphenyl
22   4,4'-dichlorobiphenyl
23   2,3,4-trichlorobiphenyl
24   2,4,5-trichlorobiphenyl
25   2,4,6-trichlorobiphenyl
26   3,4,5-trichlorobiphenyl
27   2,3,4,5-tetrachlorobiphenyl
28   2,3,4,6-tetrachlorobiphenyl
29   2,3,5,6-tetrachlorobiphenyl
30   2,2',4,4'-tetrachlorobiphenyl
31   3,3',4,4'-tetrachlorobiphenyl
32   2,3,3',4,5-pentachlorobiphenyl
33   2,2',4,4',5,5'-hexachlorobiphenyl
34   2,2',4,4',6,6'-hexachlorobiphenyl
35   decachlorobiphenyl
36   2-bromobiphenyl
37   3-bromobiphenyl
38   4-bromobiphenyl
39   2,3-dibromobiphenyl
40   2,4-dibromobiphenyl
41   2,5-dibromobiphenyl
42   2,6-dibromobiphenyl
43   3,4-dibromobiphenyl
44   3,5-dibromobiphenyl
45   4,4'-dibromobiphenyl
46   2,4,5-tribromobiphenyl
47   2,4,6-tribromobiphenyl
48   3,4,5-tribromobiphenyl
49   2,3,4,5-tetrabromobiphenyl
50   2,3,4,5-tetrabromobiphenyl
51   3,3',4,4'-tetrabromobiphenyl
52   3,3',5,5'-tetrabromobiphenyl
53   3,3',4,4',5,5'-hexabromobiphenyl
54   2-iodobiphenyl
55   3-iodobiphenyl
56   4-iodobiphenyl

Table II. Structural Parameters for Biphenyl

parameter         exptl(a)   this method   MNDO
r(C1-C1')         1.507      1.500         1.485
r(C-C) ring       1.398      1.397         1.409
r(C-H)            1.102      1.090         1.090
∠C6C1C2           119.4      120.9         120.8
∠C1C2C3           119.4      119.0         120.4
torsional angle   44.4       44.0          76.1

(a) By electron diffraction (ref 18). All distances, r, are in Å and angles in degrees.

Table III. Structural Parameters for 2,2'-Dichlorobiphenyl

parameter         exptl(a)   this method   MNDO
r(C1-C1')         1.495      1.500         1.487
r(C-C) ring       1.398      1.407         1.409
r(C-Cl)           1.732      1.753         1.754
r(C-H)            1.095      1.090         1.090
∠C6C1C2           -(b)       120.3         117.6
∠C1C2C3           -(b)       118.7         121.5
torsional angle   73.5       74            88

(a) By electron diffraction (ref 19). All distances, r, are in Å and angles in degrees. (b) Not quoted.

Table IV. Torsional Angles for Polyhalogenated Biphenyls

                          torsional angle, deg
series                    nonortho   one ortho substituent   two ortho substituents
fluorinated biphenyls     44         56                      65
chlorinated biphenyls     44         61                      74
brominated biphenyls      44         63                      n.e.(a)
iodinated biphenyls       44         64                      n.e.(a)

(a) Not evaluated because there is no compound of this type in the data set.

Molecular Modeling. Since descriptors dependent on geometry were expected to be useful in modeling the retention properties of the compounds, it was necessary to obtain reasonable three-dimensional representations of the molecules. In previous studies involving PCBs (16), the molecules were modeled by using a force field molecular mechanics model builder with the torsional angles between the two phenyl rings being fixed using experimental values taken from the literature. It was expected that the torsional angle for the fluorinated, brominated, and iodinated biphenyls would vary slightly from the PCBs due to differences in size. Unfortunately, structural data for most of these compounds were not readily available. Thus, the torsional angle had to be estimated with the molecular mechanics modeling approach.

Various methods have been used in the past to estimate the torsional angles of polyhalogenated biphenyls, including electron diffraction (18, 19), X-ray diffraction (20), and photoelectron spectroscopy (21). In addition, theoretical methods using ab initio calculations (22) and molecular mechanics (MM2) (23, 24) have also been used to estimate the structural parameters of polyhalogenated biphenyls. The approach taken in this study was to fix the angle in such a way that the distance between the two substituents at the opposing ortho positions just exceeds the sum of the van der Waals radii of the two atoms. Although this may appear to be a simplistic approach for estimating the torsional angle, preliminary experiments using a number of test compounds indicated that the structural parameters such as bond lengths, atomic distances, and torsional angles obtained by using this method agreed well with experimental and theoretical data. As examples, structural parameters for biphenyl
(Table II) and 2,2'-dichlorobiphenyl (Table III) obtained by using this method are compared with structural data from the literature. Also shown are the results of molecular orbital calculations using MNDO (performed by using the MOPAC molecular orbital package, version 5.0). As can be seen from the two tables, the atomic distances and the bond angles obtained by using this method did not differ very much from the experimental data. On the other hand, the torsional angles suggested by the MNDO calculation were consistently larger than the experimental values. Therefore, the simple method was employed to estimate the torsional angles of all compounds in the data set.

The results from this investigation suggested that, as far as the torsional angle is concerned, the halogenated biphenyls can be classified into three groups, depending on the number of substituents at the ortho position. The first group consisted of compounds that are not substituted at the ortho position, the compounds of the second group have one ortho substituent, and the compounds of the third group have two substituents. Also, as expected, the torsional angle increases from F to I as the van der Waals radii increase. A summary of torsional angles obtained from this investigation is shown in Table IV, and these are the angles used throughout this study.

Descriptor Generation. The molecular structure descriptors used to represent the molecules were generated by using the ADAPT software system (25). The descriptors can be classified into four major groups: topological, geometrical, electronic, and physicochemical descriptors. Topological descriptors include fragment descriptors, molecular connectivity indices, substructure counts, and substructure environment descriptors. Geometric descriptors include principal moments of inertia, van der Waals volume and surface area, and shape parameters. Electronic descriptors include partial charges, dipole moments, etc. Calculated
physical property descriptors include calculated log P, molecular polarizability, etc.

Recently, Kier (26) has developed a set of topological indices based on a graph theoretical approach to encoding the shape of molecules. These kappa indices are defined as

$$ {}^{n}\kappa = C \, \frac{{}^{n}P_{\max} \; {}^{n}P_{\min}}{\left({}^{n}P_{i}\right)^{2}} $$

where ${}^{n}\kappa$ is the kappa index of order n, C is a constant (C = 2 for n = 1 and n = 2, and C = 4 for n = 3), ${}^{n}P_{\max}$ and ${}^{n}P_{\min}$ are the maximum and minimum number of paths of length n in a molecule with the same number of atoms as the target molecule, and ${}^{n}P_{i}$ is the actual number of paths of length n in the target molecule. As an example, consider the simple molecule 2-methylpentane. For this molecule, ${}^{2}P_{\min} = 4$ (obtained from the straight-chain graph), ${}^{2}P_{\max} = 10$ (obtained from the star graph), ${}^{2}P_{i} = 5$ (obtained from the structure itself), and C = 2, so ${}^{2}\kappa = 2(10)(4)/5^{2} = 3.2$. Also, ${}^{3}P_{\min} = 3$ (obtained from the straight-chain graph), ${}^{3}P_{\max} = 4$ (obtained from the graph of 2,3-dimethylbutane), ${}^{3}P_{i} = 3$ (obtained from the structure itself), and C = 4, so ${}^{3}\kappa = 4(4)(3)/3^{2} = 5.333$. For one of the members of the present data set, 2,3,3',4,5-pentachlorobiphenyl, ${}^{2}\kappa = 5.325$ and ${}^{3}\kappa = 2.713$. While these kappa indices are simply related to molecular structure, they are best calculated with software due to the tedious nature of the computation, especially for larger structures.

Shadow area descriptors (27) are cross-sectional areas of the three-dimensional molecular models of the structures projected onto three mutually perpendicular planes. The descriptor values are highly dependent on the orientation of the molecular structures prior to the calculation, so a systematic technique for structure orientation is an essential part of the development of these descriptors. Standardized shadow areas are computed by dividing the cross-sectional areas obtained above by the areas of the largest rectangle that can enclose the molecule in the relevant orientation.
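The 2-methylpentane kappa example above can be checked with a short script. This is a minimal sketch and not the ADAPT implementation: the function names are hypothetical, the path counts are obtained by brute-force enumeration on the hydrogen-suppressed graph, and the closed-form expressions for the minimum and maximum path counts are assumed here (they reproduce the values 4, 10, 3, and 4 quoted in the text for a six-atom molecule).

```python
def count_paths(adjacency, length):
    """Count simple paths containing `length` bonds in a hydrogen-suppressed graph."""
    def extend(node, visited, remaining):
        if remaining == 0:
            return 1
        return sum(
            extend(neighbor, visited | {neighbor}, remaining - 1)
            for neighbor in adjacency[node]
            if neighbor not in visited
        )

    # Each path is enumerated twice (once from each end), so halve the total.
    directed = sum(extend(start, frozenset([start]), length) for start in adjacency)
    return directed // 2


def path_limits(n_atoms, order):
    """Return (P_min, P_max) for a graph of n_atoms atoms (assumed closed forms)."""
    a = n_atoms
    if order == 2:
        return a - 2, (a - 1) * (a - 2) // 2  # straight chain, star graph
    if order == 3:
        p_max = (a - 2) ** 2 // 4 if a % 2 == 0 else (a - 1) * (a - 3) // 4
        return a - 3, p_max
    raise ValueError("only orders 2 and 3 are sketched here")


def kappa(adjacency, order):
    """Kappa index of order 2 or 3: C * P_max * P_min / P_i**2."""
    c = 2 if order == 2 else 4
    p_min, p_max = path_limits(len(adjacency), order)
    p_i = count_paths(adjacency, order)
    return c * p_max * p_min / p_i ** 2


# Carbon skeleton of 2-methylpentane: C1-C2(-C6)-C3-C4-C5
two_methylpentane = {1: [2], 2: [1, 3, 6], 3: [2, 4], 4: [3, 5], 5: [4], 6: [2]}

print(count_paths(two_methylpentane, 2))       # 5
print(round(kappa(two_methylpentane, 2), 3))   # 3.2
print(round(kappa(two_methylpentane, 3), 3))   # 5.333
```

Brute-force enumeration is adequate for molecules of this size; for the halogenated biphenyls the adjacency dictionary would simply list every non-hydrogen atom, including the halogens.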
A new set of electronic descriptors, capable of encoding the amount of partial charge on the atoms in a given molecule, was developed and applied to these structures. These charges are calculated by a variation of the method of Abraham and Smith (28). Essentially, it calculates the σ charges from atomic electronegativity and polarizability and the π charges from Hückel molecular orbital calculations. The σ and π charges were then added to give the total atomic charge for all atoms in the molecule. The method is iterative and recalculates the partial charges repeatedly until convergence is obtained. Partial charges for the most positive and most negative atoms were also calculated as descriptors. From a knowledge of the partial charge of each atom in the molecule and its molecular surface area, the charge-weighted surface area of the molecule can be calculated as a descriptor. In addition, descriptors encoding the total surface area of positively or negatively charged portions of the molecule can also be calculated. These types of descriptors are quite useful for modeling charge-dependent intermolecular interactions in some chemical environments (29).

After being generated, the descriptors were subjected to an objective feature selection procedure in which descriptors with insufficient variation or with high pairwise correlations with other descriptors were discarded. These procedures are objective in that the dependent variable values are never used in the analyses. The multicollinearity among the descriptors was reduced by selecting a pool of descriptors which gave maximum orthogonality. The descriptors were then used as independent variables in multiple linear regression analysis.

Regression Analysis. Two types of variable selection methods were used to develop regression equations for predicting the retention indices. They were stepwise regression (30) and leaps and bounds regression (31).
The stepwise routine was also run several times, where in each run two or more variables already in the equation were deleted, in order to expose potentially good models which might have been left out because of intercorrelations among the variables. The equations obtained were tested for their validity and robustness by using various standard statistical tests and plotting methods.
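The objective feature selection step described above (discarding descriptors with insufficient variation or high pairwise correlation, without looking at the retention indices) can be sketched as follows. This is an illustrative outline only: the function names, the thresholds `min_variance` and `max_abs_r`, and the descriptor values are all assumptions and are not taken from the ADAPT system.

```python
from math import sqrt


def variance(values):
    """Sample variance of a list of descriptor values."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def correlation(x, y):
    """Pearson correlation coefficient between two descriptor columns."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def objective_feature_selection(descriptors, min_variance=1e-6, max_abs_r=0.95):
    """Keep descriptors that vary and are not highly correlated with one already kept.

    The dependent variable (retention index) is never used, which is what
    makes the procedure "objective". Threshold values are assumed.
    """
    kept = []
    for name, values in descriptors.items():
        if variance(values) < min_variance:
            continue  # insufficient variation
        if any(abs(correlation(values, descriptors[k])) > max_abs_r for k in kept):
            continue  # redundant with a descriptor already in the pool
        kept.append(name)
    return kept


# Hypothetical descriptor values for six compounds.
example_descriptors = {
    "n_ortho": [0, 1, 2, 0, 1, 2],
    "n_halogen": [1, 1, 2, 2, 3, 4],
    "twice_n_halogen": [2, 2, 4, 4, 6, 8],  # perfectly correlated with n_halogen
    "constant": [1, 1, 1, 1, 1, 1],         # no variation
}

print(objective_feature_selection(example_descriptors))  # ['n_ortho', 'n_halogen']
```

The result depends on the order in which descriptors are examined, since the first member of a correlated pair is the one retained; a production procedure would need an explicit rule for that choice.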
RESULTS AND DISCUSSION Modeling of GC Retention Indices. A number of good models for modeling GC retention indices of polyhalogenated
[Figure residue: plot with vertical axis "Calculated RI"; the image and its caption were not recovered.]
Table V. Experimental and Calculated Retention Indices

               GC/RI                  LC/RI
compound(a)    exptl(b)   calcd(c)    exptl(b)   calcd(d)
1              1627       1611        367        358
2              1654       1711        382        377
3              1651       1709        370        420
4              1672       1626        355        365
5              1604       1602        304        306
6              1636       1586        295        288
7              1622       1597        223        191
8              1609       1683        347        382
9              1636       1703        382        314
10             1684       1709        227        298
11             1558       1575        228        264
12             1639       1587        212        150
13             1550       1570        278        295
14             1717       1699        442        424
15             1845       1844        578        497
16             1853       1896        521        489
17             1957       1903        540        630
18             1897       1887        605        585
19             1900       1866        559        583
20             1832       1851        415        467
21             2082       2049        706        678
22             2113       2079        583        629
23             2170       2097        732        789
24             2098       2018        817        753
25             1952       2007        700        658
26             2278       2189        891        865
27             2339       2202(e)     978        922
28             2209       2203        832        796
29             2203       2174        794        774
30             2241       2246        682        786
31             2632       2557        871        923
32             2670       2736        982        1007
33             2662       2627        977        1003
34             2784       2520(e)     916        832
35             3254       3295        1523       1508
36             1792       1827        480        484
37             1947       1944        633        571
38             1959       1997        606        557
39             2129       2151        624        699
40             2076       2079        728        694
41             2073       2035        669        706
42             1975       2005        494        615
43             2307       2335        761        787
44             2211       2195        882        766
45             2341       2288        703        759
46             2383       2387        923        887
47             2221       2276        832        792
48             2640       4602        1000       964
49             2792       2773        1053       1035
50             2630       2697        939        966
51             3124       3099        996        1097
52             2907       2933        1031       1025
53             -(f)       3722        1365       1323
54             1858       1913        483        -(g)
55             2062       2094        663        -(g)
56             2082       2096        640        -(g)
(a) The compounds are numbered as in Table I. (b) From ref 17. (c) Using eq 1. (d) Using eq 2. (e) Suspected outliers. (f) Not available. (g) Charge-weighted surface area descriptors were not available for these compounds.

of the variance inflation factors for the five variables, showed that no serious multicollinearity exists among the variables. The retention indices calculated by using eq 1 are listed in Table V. In general, the calculated values agree well with the observed RI, indicating high validity of the equation. Note that two observations, compounds 27 and 34, were not included in developing the equations because they were suspected outliers. Both observations had residuals of greater than 3 standard deviations, and one of them (compound 34) also has a high leverage on the regression line. No explanation is available as to why these two compounds have very high residuals. However, if included in developing the equations, these observations will pull the regression line away from the rest of the observations and undoubtedly will produce an inappropriate model for the system being studied. Furthermore, with the two observations left out, the standard error decreased by 20%.

Table VI. Statistics for GC Retention Index Equations Produced by Omitting One Halogen Group

group omitted   n    R²      s
fluorine        40   0.988   43.6
chlorine        33   0.992   41.0
bromine         36   0.986   49.4
iodine          50   0.990   45.4

In order to validate the equation, a number of internal validation procedures were performed. First, a jackknife analysis was performed in which one observation was left out at a time and a regression equation was developed with the remaining observations. When the predicted values were compared with the observed values, a root mean square error of 51 was obtained. This is not very large compared to the standard error of the full model, indicating the regression line is quite stable over the sample space. In the second validation experiment, the data set was divided into two sets, each containing half the number of observations. The division was performed by using the Duplex algorithm of Snee (32) to ensure that the two halves have similar statistical properties. A regression equation was fitted by using the 27 compounds in one of the sets, and it was used to predict the retention indices of the compounds in the second set. Although it was developed by using only half the number of observations, this equation still showed a high degree of statistical significance, comparable to that of the equation developed by using all the observations. This equation was then used to predict the retention indices of the remaining 26 compounds. For all 26 compounds, the observed retention indices were within the prediction interval of the predicted values. This clearly implies that the equation is stable over the sample space and can be used to predict the retention indices of compounds not included in developing the equation.

The majority of the descriptors which appeared in the GC retention index equations were geometric descriptors.
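The jackknife (leave-one-out) validation described above can be illustrated with a short sketch. To keep it self-contained it fits a one-descriptor straight line rather than the five-variable equation used in this work, and the descriptor and retention values are hypothetical; only the procedure (leave one observation out, refit, predict it, and pool the squared errors) follows the text.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return slope, mean_y - slope * mean_x


def jackknife_rmse(x, y):
    """Root mean square error of leave-one-out predictions."""
    squared_errors = []
    for i in range(len(x)):
        # Refit the model without observation i, then predict observation i.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        squared_errors.append((y[i] - (slope * x[i] + intercept)) ** 2)
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical surface areas and retention indices for eight compounds.
surface_area = [180.0, 195.0, 210.0, 224.0, 240.0, 255.0, 271.0, 286.0]
retention_index = [1620.0, 1710.0, 1850.0, 1905.0, 2080.0, 2170.0, 2340.0, 2410.0]

print(jackknife_rmse(surface_area, retention_index))
```

A jackknife error close to the standard error of the full fit suggests, as argued in the text, that no single observation dominates the regression.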
Only the path-3 connectivity index is corrected for the type of halogen substituents, and it was the first variable selected in the equation. In order to show the effects of any particular halogen group on the overall quality of the equation, several subsets were created by omitting one halogen group from the data set. Regression equations were then fitted by using each of these subsets and the statistics were compared with those of the original equation (Table VI). Judging from the values of R² and standard error of the resulting equations, none of the equations was significantly inferior to the original equation. This was also the case when individual coefficients were compared. Thus, it can be concluded that the overall equation was not affected by the type of halogen substitution.

Modeling of LC Retention Indices. Several equations were evaluated for the calculation of liquid chromatographic retention indices. One of the best equations is the following:

RI = -66511 (±3646) (fraction of positively charged SA)
     - 2469 (±455) (fraction of negatively charged SA)
     - 72.9 (±18.8) (number of ortho substituents)
     3351 (±954) (relative positive charge)
     - 15.8 (±7.0) (path-3 kappa index, ³κ)³
     840.2                                              (2)

n = 53    R² = 0.968    F(6,48) = 285    s = 55
The R² shows that almost 97% of the variance is accounted for by this equation, and the standard error is about 55/650 = 8.5% of the mean value of the RI. The correlation matrix of the descriptors in this equation showed no high pairwise correlations. None of the variance inflation factors for these descriptors were large. The statistics for this equation, however, were not as good as the ones for the GC retention indices. This is understandable since it is generally more difficult to model LC retention because more factors and interactions are involved.

The first variable in the equation is the surface area of positively charged atoms in the molecule divided by the total surface area. The values of this descriptor are affected by the number and type of substituents on the phenyl rings. The negative sign of the coefficient indicates that retention indices increase as the fraction of positively charged area in the molecule decreases. This is probably caused by an increase in the number of halogens in the molecule, since the halogens have negative partial charges. However, as observed by Hofler et al. (17), not every increment in the degree of substitution is accompanied by an increase in retention. The second variable, fraction of negatively charged surface area, probably served as a correction factor for this effect. The third variable is the number of halogens at the ortho position. The presence of substituents at this position causes a decrease in the retention indices. The relative positive charge descriptor is the charge of the most positive atom in the molecule divided by the total charge of the molecule. The final variable, ³κ, the path-3 kappa index, is a topological descriptor encoding the general shape of the molecules. The charge-weighted surface area descriptors consistently appeared in several potential equations evaluated.
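The charge-based descriptors named in eq 2 can be illustrated with a small sketch. It is not the ADAPT code: the function name, the input format, and the per-atom charges and surface areas are hypothetical. One interpretation is also assumed: because the net charge of a neutral molecule is zero, the "total charge" in the relative positive charge descriptor is taken here to be the sum of the positive partial charges.

```python
def charge_surface_descriptors(atoms):
    """Compute charge/surface-area fractions from per-atom data.

    `atoms` is a list of (partial_charge, surface_area) tuples, one per atom.
    """
    total_area = sum(area for _, area in atoms)
    positive_area = sum(area for charge, area in atoms if charge > 0)
    negative_area = sum(area for charge, area in atoms if charge < 0)
    positive_charges = [charge for charge, _ in atoms if charge > 0]
    return {
        # Surface area of positively charged atoms / total surface area.
        "fraction_positive_sa": positive_area / total_area,
        # Surface area of negatively charged atoms / total surface area.
        "fraction_negative_sa": negative_area / total_area,
        # Charge of the most positive atom / sum of positive charges (assumed).
        "relative_positive_charge": max(positive_charges) / sum(positive_charges),
    }


# Hypothetical three-atom fragment: (partial charge, exposed surface area).
example_atoms = [(0.10, 10.0), (0.05, 10.0), (-0.15, 20.0)]

for name, value in charge_surface_descriptors(example_atoms).items():
    print(name, round(value, 4))
```

Adding a halogen to such a list introduces an atom with negative partial charge and a sizable exposed area, which lowers the positively charged fraction; this is the trend the text associates with increasing retention.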
This was expected since molecular surface area has been shown to be an important descriptor in modeling solubilities of hydrophobic compounds. However, the retention indices of fluorinated biphenyls do not correlate well with the total surface area of the molecule. This is probably caused by interactions at specific sites with the solvent methanol. The increase in total area of the molecule does not compensate for the increase in partial negative charges. The charge-weighted surface area descriptors were probably more effective in encoding the presence of such interactions.

The descriptors contained in this equation are relatively complex. This is due to the complex nature of the interactions involved in retention in liquid chromatography. The charged partial surface area descriptors, which contribute in a major way to the success of this equation, have been shown in a number of structure-property relationship studies to encode important structural information (29).

The calculated retention indices are compared with experimental values in Table V. The retention indices for iodinated biphenyls were not calculated because parameters for the calculation of partial charges for iodine were not available. Therefore, only 53 data points were used to develop the equation. The plot of calculated versus observed retention indices is shown in Figure 2. As can be seen from the plot, two observations (compounds 35 and 53) have RI values isolated from the other data points. Although their residuals are small, these two compounds could be influential in the sense that they might pull the regression line away from the majority of the observations. To determine whether the two compounds were influential, both were temporarily excluded from the data set and a new equation was fitted by using the remaining observations. In the equation that resulted, the coefficients did not significantly differ from those in eq 2.
Therefore there is no reason to leave out the two observations. Furthermore, the inclusion of these two compounds in the data set will make the equation more useful for prediction purposes as they represent the only compounds containing six bromine and ten chlorine substituents.

Figure 2. Calculated versus experimental liquid chromatographic retention indices for 53 polyhalogenated biphenyl compounds using eq 2. [Plot not recovered; axes: calculated RI versus experimental RI.]

The residual plot for eq 2 showed a normal distribution and showed no evidence of nonconstancy of the error term. The last variable (path-3 kappa index) was raised to the third power in order to get a linear equation. When the original variable or its square was used, the residual plot suggested some curvature in the regression line. By use of the cubic transformation of the variable, the curvature disappeared, indicating that the transformation was appropriate.

The effects on the equation due to the identity of the halogens on a particular compound are encoded by the descriptors, but not in a straightforward way. In order to test the effects of any particular halogen group on the overall quality of the equation, several subsets were created by omitting one halogen group from the data set. Regression equations were then fitted using each of these subsets and the statistics were compared with those of the original equation (Table VII). The values of R² and standard errors for these equations show that the overall equation was not unduly affected by the type of halogen substitution.

Table VII. Statistics for LC Retention Index Equations Produced by Omitting One Halogen Group

group omitted   n    R²      s
fluorine        40   0.945   59
chlorine        31   0.972   57
bromine         35   0.974   51

Internal validation using the jackknife method resulted in a root mean square error of 63, which is not large compared to the standard error of the full model. When the data set was divided into two sets by the Duplex method and a new regression equation was developed by using one of the sets, a statistically strong equation was generated. This equation was then used to predict the retention indices of compounds not included in generating the equation. The results show that only one compound (44) has a predicted RI that deviates significantly from the observed value, while the remaining compounds have their experimental RI within the prediction interval of the predicted RI.
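The Duplex-style division of the data into two halves with similar coverage of descriptor space can be sketched as below. This is a simplified illustration rather than the published algorithm in full: it assumes plain Euclidean distances on descriptor vectors that have already been scaled, omits any orthonormalization step, and uses hypothetical function names and data.

```python
from math import dist


def farthest_pair(points, candidates):
    """Return the two candidate indices whose points are farthest apart."""
    return max(
        ((i, j) for i in candidates for j in candidates if i < j),
        key=lambda pair: dist(points[pair[0]], points[pair[1]]),
    )


def duplex_split(points):
    """Split point indices into two sets that each span the descriptor space."""
    remaining = set(range(len(points)))
    subsets = ([], [])
    # Seed each subset with the farthest-apart pair still available.
    for subset in subsets:
        i, j = farthest_pair(points, remaining)
        subset.extend([i, j])
        remaining -= {i, j}
    # Alternately give each subset the point farthest from its current members.
    turn = 0
    while remaining:
        subset = subsets[turn]
        pick = max(
            remaining,
            key=lambda c: min(dist(points[c], points[m]) for m in subset),
        )
        subset.append(pick)
        remaining.remove(pick)
        turn = 1 - turn
    return subsets


# Hypothetical two-descriptor coordinates for eight compounds.
example_points = [
    (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
    (2.0, 2.0), (3.0, 3.0), (0.5, 0.5), (2.5, 0.5),
]

first_half, second_half = duplex_split(example_points)
print(sorted(first_half), sorted(second_half))
```

One half would then be used to fit the regression equation and the other to test its predictions, as in the validation experiment described in the text.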
Thus it can be concluded that the equation developed by using the five variables was internally consistent and has a high probability of being able to predict the retention indices of compounds not included in developing the equation.

CONCLUSIONS

This study indicated that the methodology used previously to model GC retention characteristics of PCBs can also be applied to model the retention indices of other halogenated biphenyls. In addition, the LC retention indices of these compounds can also be modeled, although the model found was not as good as the one for GC retention indices. A set of more sophisticated descriptors might be able to support better models. A descriptor to encode the tendency to form hydrogen bonds is one example. With the availability of more retention data, these models can be improved by using more observations to develop the regression equations. Retention indices measured on other columns, i.e., those that have different polarities, should also be tested. Finally, the possibility of predicting elution order should also be considered since it would be of practical value to chromatographers.

LITERATURE CITED

(1) Hutzinger, O.; Safe, S.; Zitko, V. The Chemistry of PCBs; CRC Press: Boca Raton, FL, 1980.
(2) Carter, L. J. Science 1976, 192, 240-243.
(3) Robertson, L. W.; Chynoweth, D. P. Environment 1975, 17 (6), 25-27.
(4) de Kok, J. J.; de Kok, A.; Brinkman, J. A. Th.; Kok, R. M. J. Chromatogr. 1977, 142, 367-383.
(5) Robertson, L. W.; Safe, S. H.; Parkinson, A.; Pellizari, E.; Pochini, C.; Mullin, M. D. J. Agric. Food Chem. 1984, 32, 1107-1111.
(6) Erickson, M. D. Analytical Chemistry of PCBs; Butterworth: Stoneham, MA, 1986.
(7) Mullin, M. D.; Pochini, C. M.; McCrindle, S.; Romkes, M.; Safe, S. H.; Safe, L. M. Environ. Sci. Technol. 1984, 18, 468-476.
(8) Bush, B.; Murphy, M. J.; Connor, S.; Snow, J.; Barnard, E. J. Chromatogr. Sci. 1985, 23, 509-515.
(9) Onuska, F. I.; Terry, K. A. HRC & CC, J. High Resolut. Chromatogr. Chromatogr. Commun. 1988, 9, 671-675.
(10) Fischer, R.; Ballschmiter, K. Fresenius' Z. Anal. Chem. 1988, 332, 441-446.
(11) McFarland, V. A.; Clarke, J. U. Environ. Health Perspect. 1989, 81, 225-239.
(12) Parkinson, A.; Safe, S. In Polychlorinated Biphenyls (PCBs): Mammalian and Environmental Toxicology; Safe, S., Ed.; Springer-Verlag: Berlin, 1987; pp 49-75.
(13) Sissons, D.; Welti, D. J. Chromatogr. 1971, 60, 15-32.
(14) Robbat, A.; Xyrafas, G.; Marshall, D. Anal. Chem. 1988, 60, 982-985.
(15) Devillers, J. Fresenius' Z. Anal. Chem. 1988, 332, 61-62.
(16) Hasan, M. N.; Jurs, P. C. Anal. Chem. 1988, 60, 978-982.
(17) Hofler, F.; Melzer, H.; Möckel, J.; Robertson, L. W.; Anklam, E. J. Agric. Food Chem. 1988, 36, 961-965.
(18) Almenningen, A.; Bastiansen, O.; Fernholt, L.; Cyvin, B. N.; Cyvin, S. J.; Samdal, S. J. Mol. Struct. 1985, 128, 59-76.
(19) Romming, C.; Seip, H. M.; Aanesen Dymo, I.-M. Acta Chem. Scand., Ser. A 1974, 28, 507-514.
(20) Field, L. D.; Skelton, B. W.; Sternhell, S.; White, A. H. Aust. J. Chem. 1985, 38, 391-399.
(21) Dynes, J. J.; Baudais, F. L.; Boyd, R. K. Can. J. Chem. 1985, 63, 1292-1299.
(22) McKinney, J. D.; Gottschalk, K. E.; Pedersen, L. J. Mol. Struct. 1983, 104, 445-450.
(23) Tsuzuki, S.; Tanabe, K.; Nagawa, Y.; Nakanishi, H.; Osawa, E. J. Mol. Struct. 1988, 178, 277-285.
(24) Jaime, C.; Font, J. J. Mol. Struct. 1989, 195, 103-110.
(25) Stuper, A. J.; Brugger, W. E.; Jurs, P. C. Computer Assisted Studies of Chemical Structure and Biological Function; Wiley-Interscience: New York, 1979.
(26) Kier, L. B. Quant. Struct.-Act. Relat. 1985, 4, 109-116.
(27) Rohrbaugh, R. H.; Jurs, P. C. Anal. Chim. Acta 1987, 199, 99-109.
(28) Abraham, R. J.; Smith, P. E. J. Comput. Chem. 1987, 9, 288-297.
(29) Stanton, D. T.; Jurs, P. C. Unpublished results.
(30) Draper, N.; Smith, H. Applied Regression Analysis, 2nd ed.; Wiley-Interscience: New York, 1981; pp 307-312.
(31) Furnival, G. M.; Wilson, R. W., Jr. Technometrics 1974, 16, 499-511.
(32) Snee, R. D. Technometrics 1977, 19, 415-428.

RECEIVED for review April 16, 1990. Accepted July 30, 1990. This work was partially supported by the National Science Foundation Grant CHE-8815785. The Sun workstation was purchased with partial financial support of the National Science Foundation. The financial support of the University Teknologi Malaysia and the Malaysian Government is acknowledged.

Anal. Chem. 1990, 62, 2323-2329

Development and Use of Charged Partial Surface Area Structural Descriptors in Computer-Assisted Quantitative Structure-Property Relationship Studies
David T. Stanton and Peter C. Jurs*
Chemistry Department, Davey Laboratory, Penn State University, University Park, Pennsylvania 16802
Intermolecular interactions that are polar in nature contribute to observed physicochemical properties such as chromatographic retention and normal boiling point. However, these types of interactions are difficult to encode with structural parameters currently available for use in SPR studies. A new series of molecular structural parameters has been developed that combine molecular surface area and partial atomic charge information to form charged partial surface area (CPSA) descriptors. These descriptors have been shown to be useful in a variety of structure-property studies. The characteristics and properties of these parameters are discussed, and their use in several structure-property studies is described.
INTRODUCTION There are many applications of computer-assisted quantitative structure-property relationships (QSPR) that are of
value in analytical chemistry. Such relationships act as tools to augment experimental analytical techniques, allowing the chemist to extract additional information. These computer-assisted techniques can be used as an aid to understanding chemical processes such as chromatographic retention. They are also of value in the identification of materials when authentic standards are not available or quantities of available materials are limited. In addition, computer-assisted QSPRs can save time by allowing the analyst to quickly estimate a property for a given molecule which might be too time-consuming to measure experimentally. However, because of limitations inherent to methods used to study the relationship between molecular structure and physical property, QSPR studies often focus on sets of compounds that are relatively similar. Also, limitations associated with the parameters used in QSPR in the past have restricted the accuracy of predictions and the variety of properties that can be studied. Thus, it is of interest to expand the utility of QSPR to a larger variety of compounds, to increase accuracy, and to extend the use of
0003-2700/90/0362-2323$02.50/0 © 1990 American Chemical Society