Anal. Chem. 1995, 67, 4423-4430

Solvent-Dependent Regression Equations for the Prediction of Retention in Planar Chromatography

David Nurok,*,† Robert M. Kleyle,‡ Paul Hajdu,† Brenda Ellsworth,† Steven S. Myers,† Terrance M. Brogan,† and Kenneth B. Lipkowitz†

Departments of Chemistry and Mathematical Sciences, School of Science, Indiana University-Purdue University at Indianapolis, 402 North Blackford Street, Indianapolis, Indiana 46202

Robert C. Glen

Tripos Inc., 1699 South Hanley Road, St. Louis, Missouri 63144

Regression equations have been widely used to predict retention of solutes in both gas and liquid chromatography. The regression models are usually built using a set of potential solute descriptors such as dipole moment, polar and nonpolar surface areas, and quantum mechanical indexes. The book by Kaliszan1 contains a table with 44 references to such regression models. An alternative approach that has also been very successful has used models based on linear solvation energy relationships.2 Models with very high multiple correlation coefficients have been constructed by both approaches.

It has recently been demonstrated3 that it is also possible to construct regression models in which the properties of the solvent are used as descriptors to predict the retention or the separation quality of either a set of 15 steroids or the p-nitrobenzyl esters of a set of 15 dansyl amino acids. This was performed by using a series of mobile phases in which each member consisted of a fixed concentration of a strong solvent such as ethyl acetate, diluted with each of a series of weaker solvents such as toluene or chloroform. Such a series can span a large range of mobile phase strengths.4 The properties of the weak solvent that were used as descriptors include density, dipole moment, molar volume, and both saturated and unsaturated surface area. Both first- and second-order models can be constructed in which either average Rf or a suitable metric for separation is the dependent variable. The metric that was used is the IDF, which is defined in the section below on chromatographic relationships. The quality of the regression models for either dependent variable varies considerably, with the best of the linear models having an R² in the range 0.94-0.96, and the best of the second-order models having an R² in the range 0.95-0.98. The quality of a regression model is dependent on the concentration of the strong solvent. Mole fractions of 0.1 and 0.3 of the strong solvent were considered. The best regression models with the IDF as dependent variable occur at the lower mole fraction of the strong solvent, and the best models with the average Rf as the dependent variable occur at the higher mole fraction. The original communication did not consider the retention of individual solutes as a dependent variable. Such models are discussed below.

Dipole moment was found to be an important descriptor, and a limitation of the preliminary report was that literature values of dipole moment were not available for all of the solvents that were considered. The largest set of solvents for which dipole moment was available was a series of 18 weak solvents used for separating the p-nitrobenzyl esters of the dansyl amino acids with ethyl acetate as the common strong solvent. In the report below, the use of computed dipole moment is discussed for both first- and

† Department of Chemistry.
‡ Department of Mathematical Sciences.
(1) Kaliszan, R. Quantitative Structure-Chromatographic Retention Relationships; John Wiley and Sons: New York, 1987.
(2) Sadek, P. C.; Carr, P. W.; Doherty, R. M.; Kamlet, M. J.; Taft, R. W.; Abraham, M. H. Anal. Chem. 1985, 57, 2971-2978.
(3) Nurok, D.; Kleyle, R. M.; Lipkowitz, K. B.; Myers, S. S.; Kearns, M. L. Anal. Chem. 1993, 65, 3701-3707.
(4) Nurok, D.; Julian, L. A.; Uhegbu, C. E. Anal. Chem. 1991, 63, 1524-1529.

Both first- and second-order regression models are presented that relate retention, as the Rf of individual solutes, the log k' of individual solutes, or the average Rf of a mixture of solutes, to the properties of the weak solvent in each of a series of 25 binary mobile phases consisting of a specified concentration of ethyl acetate as a common strong solvent. The stepwise procedure is used for constructing these models, which are for either simulated or experimental separations on silica gel. Similar regression models are used to predict separation quality, defined by a suitable metric. A comparison of the forward and backward stepwise procedures finds that the former is the more reliable method for constructing these models. The solutes are either steroids or the p-nitrobenzyl esters of dansyl amino acids, and the solvent descriptors are density, dipole moment, molar volume, polarizability, saturated surface area, and unsaturated surface area. The quality of regression fits obtained with models using computed dipole moment is comparable to that obtained with models using experimental (literature) dipole moment. Both nonstandardized and standardized regression models are presented. The relative contribution of each descriptor to the variability in retention may be estimated from the latter models. A set of three descriptors (dipole moment, polarizability, and saturated surface area) predicts Rf for each of the amino acid derivatives at an ethyl acetate mole fraction of 0.30. A set of two descriptors (dipole moment and saturated surface area) predicts log k' for each of these compounds at an ethyl acetate mole fraction of 0.20. Such concordance in descriptors is not found in models predicting retention of individual steroids.

0003-2700/95/0367-4423$9.00/0 © 1995 American Chemical Society

Analytical Chemistry, Vol. 67, No. 23, December 1, 1995 4423

second-order models, for separating both steroids and the p-nitrobenzyl esters of the dansyl amino acids. The separation of both classes of solute in a series of 25 weak solvents, with ethyl acetate as the constant strong solvent, is also discussed. The report below also discusses the relative merits of the conventional (forward) stepwise and the backward stepwise regression techniques.

THE DESCRIPTORS

The values of experimental (i.e., literature) dipole moment, molar volume, density, and both nonpolar saturated and nonpolar unsaturated surface area are those used in ref 3. Hydrogen, carbon, and also halogen atoms are treated as nonpolar. The dipole moments were computed using the AM1 Hamiltonian implemented in MOPAC 6.0 (ref 5). The values of polarizability were computed from an algorithm based on a modification of Slater's rules.6

CHROMATOGRAPHIC RELATIONSHIPS

Soczewinski has reported,7 in a somewhat different format, the following linear relationship:

$$\log k_i' = a_i \log X_s + b_i \qquad (1)$$

where $k_i'$ is the capacity factor of solute $i$, $X_s$ is the mole fraction of the strong solvent in a binary mixture of a strong and weak solvent, and $a_i$ and $b_i$ are experimentally determined constants for each solute. The relationship between $R_f$ and capacity factor is given by

$$R_f = \frac{1}{1 + k'} \qquad (2)$$
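Equations 1 and 2 can be chained to predict $R_f$ at a given mole fraction of the strong solvent. The following is a minimal sketch of that calculation, not the authors' implementation. It assumes the logarithm in eq 1 is base 10, and the function names and the constants `A_DEMO` and `B_DEMO` are hypothetical values chosen only for illustration.

```python
import math


def capacity_factor(x_s: float, a: float, b: float) -> float:
    """Eq 1 solved for k': log10 k' = a * log10(X_s) + b (base 10 assumed)."""
    if not 0.0 < x_s <= 1.0:
        raise ValueError("mole fraction X_s must lie in (0, 1]")
    return 10.0 ** (a * math.log10(x_s) + b)


def rf_from_capacity_factor(k_prime: float) -> float:
    """Eq 2: R_f = 1 / (1 + k')."""
    return 1.0 / (1.0 + k_prime)


def predict_rf(x_s: float, a: float, b: float) -> float:
    """Chain eq 1 and eq 2 to predict R_f at mole fraction X_s."""
    return rf_from_capacity_factor(capacity_factor(x_s, a, b))


# Hypothetical constants for one solute (illustration only, not from the paper).
A_DEMO, B_DEMO = -1.2, -0.9

for x_s in (0.10, 0.20, 0.30):
    rf = predict_rf(x_s, A_DEMO, B_DEMO)
    print(f"X_s = {x_s:.2f} -> predicted R_f = {rf:.3f}")
```

With a negative slope, as in this hypothetical example, the predicted $R_f$ rises as the mole fraction of the strong solvent increases, because the capacity factor falls.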

Experiments in our laboratory indicate that predicted values of $R_f$ (using eqs 1 and 2) agree with experimental values to within 0.05 of an $R_f$ unit for $X_s$ values that are either within the range used for establishing the constants in eq 2 or below the range used for this purpose. The relationship can often be modestly extrapolated to larger $X_s$ values, but this must be performed with caution. The difference between two $R_f$ values, $\Delta R_f$, is

$$\Delta R_f = R_{f,1} - R_{f,2} \qquad (3)$$

and may be defined in terms of the corresponding capacity factors $k_1'$ and $k_2'$:

$$\Delta R_f = \frac{k_2' - k_1'}{(1 + k_1')(1 + k_2')} \qquad (4)$$
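The step from the definition of $\Delta R_f$ to its capacity-factor form follows by substituting eq 2 for each solute. A short worked derivation, assuming the difference is taken as solute 1 minus solute 2:

$$\Delta R_f = \frac{1}{1 + k_1'} - \frac{1}{1 + k_2'} = \frac{(1 + k_2') - (1 + k_1')}{(1 + k_1')(1 + k_2')} = \frac{k_2' - k_1'}{(1 + k_1')(1 + k_2')}$$

As a numerical check with illustrative values $k_1' = 1$ and $k_2' = 3$: the two $R_f$ values are 0.50 and 0.25, so $\Delta R_f = 0.25$, and the capacity-factor form gives $(3 - 1)/(2 \times 4) = 0.25$.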

$S_D$, the distance between two spot centers, is the product of $\Delta R_f$ and the path length, which is set at 7.3 cm in this report to correspond to the path length used for the experimental determination of the regression constants in eq 2. The IDF was arbitrarily selected as the metric for separation quality in this study. It has been used in various optimization studies in planar chromatography8-10 and is similar in form to a metric introduced by Gonnord and co-authors11 for a similar purpose. The metric is defined as

$$\mathrm{IDF} = \sum_{i=1}^{n} \frac{1}{S_{D_i}}$$

where $n$ is the number of neighboring solute pairs and $S_{D_i}$ is the distance between neighboring spot centers. Spots that are separated by