Prediction of Gas-Phase Reduced Ion Mobility Constants (K0)

Anal. Chem. 2004, 76, 5223-5229

Nosa Agbonkonkon,† H. Dennis Tolley,*,‡ Matthew C. Asplund,† Edgar D. Lee,† and Milton L. Lee*,†

Departments of Chemistry and Biochemistry, and Statistics, Brigham Young University, Provo, Utah 84602-5700

A method of predicting reduced ion mobility values, K0, for use in ion mobility spectrometry is described. While the method is very similar to a previously reported method based on a neural network, the method described in this paper uses a purely statistical regression approach. Furthermore, it has been applied to a wider class of compounds, including chemical agents. Various molecular parameters were evaluated in the predictive model to determine the qualitative dynamics that have the greatest effect on K0. An R2 value of 80.1% was obtained when calculated K0 values were plotted against measured K0 values for 162 compounds for which experimental K0 values were available. However, when chloroacetophenone and 3-xylyl bromide (3-methylbenzyl bromide) were removed from the set because of their large residual values, the predictability increased to an R2 value of 87.4%. This compares well with the value of 88.7% obtained in a regression step of a previous neural network study for a less diverse set of 168 compounds.

Interest in the use of ion mobility as a tool for the separation of mixtures of components has greatly increased in the past few years. Much of this interest has stemmed from the rapid detection of explosives, chemical agents, and toxic industrial chemicals1,2 and the dimensionality and speed that come from coupling ion mobility spectrometry (IMS) with time-of-flight mass spectrometry for high-throughput drug screening3 and proteomics studies.4-6 The features of IMS, including high speed, excellent detection limits, amenability to miniaturization, and ruggedness for field operation, make it an ideal analyzer in applications that require portability. These features have led to the expectation that handheld IMS instruments will become much more powerful than they are today for field detection of explosives, chemical agents, and other toxic chemicals.

* To whom correspondence should be addressed. E-mail: milton_lee@byu.edu; [email protected].
† Department of Chemistry and Biochemistry.
‡ Department of Statistics.

(1) Steiner, W. E.; Clowers, B. H.; Matz, L. M.; Siems, W. F.; Hill, H. H., Jr. Anal. Chem. 2002, 74, 4343-4352.
(2) Taylor, St. J.; Piper, L. J.; Connor, J. A.; FitzGerald, J.; Adams, J. H.; Harden, Ch. S.; Schoff, D. B.; Davis, D. M.; Ewing, R. G. IJIMS 1998, 1, 58-63.
(3) Collins, D. C.; Lee, M. L. Fresenius J. Anal. Chem. 2001, 369, 225-233.
(4) Myung, S.; Lee, Y. J.; Moon, M. H.; Taraszka, J.; Sowell, R.; Koeniger, S.; Hilderbrand, A. E.; Valentine, S. J.; Cherbas, L.; Cherbas, P.; Kaufmann, T. C.; Miller, D. F.; Mechref, Y.; Novotny, M. V.; Ewing, M. A.; Sporleder, C. R.; Clemmer, D. E. Anal. Chem. 2003, 75, 5137-5145.
(5) McLean, J. A.; Russell, D. H. J. Proteome Res. 2003, 2, 427-430.
(6) Purves, R. W.; Barnett, D. A.; Ells, B.; Guevremont, R. J. Am. Soc. Mass Spectrom. 2001, 12, 894-901.

10.1021/ac030403s CCC: $27.50 Published on Web 07/23/2004

© 2004 American Chemical Society

In this paper, we evaluate the possibility of predicting gas-phase reduced mobility values, K0, of ions based on multiple physical (e.g., topological and electronic) properties of the ions. The ability to predict K0 values is important for several reasons. First, few experimental K0 values have been published. Second, predicted K0 values provide an initial value for algorithms used to calibrate instrument-specific measurements. Finally, a prediction equation could provide valuable insights into the principal dynamics that have the greatest effects on ion mobilities.

There are two primary methods reported in the literature for determining or calculating gas-phase K0 values: a fitting procedure7 and a neural network (neural net) computer model.8,9 The latter model, developed by Jurs and co-workers, was able to predict K0 values with over 99% accuracy for a defined set of relatively simple compounds8 and a somewhat lower (i.e., 91.1%) accuracy for an expanded set containing more diverse compounds.9 The regression step of the Jurs fit to these data had an R2 of 88.7%; the neural net adjustments improved this fit to 91.1%. Despite this reasonably good level of predictability, the model has two shortcomings for practical use. First, because it is based on a trained neural network, it requires a training set of experimentally measured K0 values to hone its predictive accuracy. This training optimizes the parameters used in future predictions but limits extrapolation to new molecule types. A second shortcoming of neural nets, in general, is the difficulty often experienced in interpreting the prediction model. The weights and node links of neural nets are constructed to optimize predictions by specific criteria, but they do not give the user a method of relating inputs to predictions.
In constructing their neural net, Jurs and co-workers9 initially identified a set of six measures that had the greatest predictive capability for K0 when used as regression variables in a fitted multiple regression. We used their regression equation as a starting point for our prediction model.

Prediction Model. The gas-phase mobility of an ion, K, is determined from the drift velocity, vd, attained by the ion in a weak electric field, E, at atmospheric pressure:

$$v_d = KE \qquad (1)$$

(7) Eiceman, G. A.; Karpas, Z. Ion Mobility Spectrometry; CRC Press: Boca Raton, FL, 1994.
(8) Wessel, M. D.; Jurs, P. C. Anal. Chem. 1994, 66, 2480-2487.
(9) Wessel, M. D.; Sutter, J. M.; Jurs, P. C. Anal. Chem. 1996, 68, 4237-4243.

Analytical Chemistry, Vol. 76, No. 17, September 1, 2004 5223

The theory underlying IMS describes the motion of slow ions in gases. As an ion moves through a neutral bath (or buffer) gas under the influence of an external electric field, different forces act on it: namely, resistance encountered by the ion from the buffer gas molecules (electrostatic), geometric forces (form, size, structure), and diffusive forces arising from the concentration gradient and the influence of the electric field.7 These can be modeled by the equation

$$K = \frac{v_d}{E} = \frac{3e}{16N}\left[\frac{2\pi}{\mu k T_{\mathrm{eff}}}\right]^{1/2}\frac{1}{\Omega_D} \qquad (2)$$

where K, E, and vd are as defined above, e is the charge of the ion, N is the gas number density, µ is the reduced mass of the ion, k is the Boltzmann constant, Teff is the effective temperature of the ion, and ΩD is the collision cross section of the ion, which depends on the effective temperature. For the purpose of standardization, ion mobility values are typically reported as reduced mobilities, K0:

$$K_0 = K\left(\frac{273}{T}\right)\left(\frac{P}{760}\right) \qquad (3)$$

where T is the temperature (in kelvin) and P is the pressure (in Torr). The ion cross section, ΩD, is related to K as given by the expression in eq 2 above. The mobility of a polyatomic ion depends on its average collision cross section, ΩD, which in turn depends on various characteristics of the ion (i.e., size, shape, and charge distribution) and the neutral buffer gas molecules (i.e., size, shape, and dipole or quadrupole moments). Our prediction model is based on these cross section characteristics.

Six parameters or “descriptors” used by Jurs and co-workers were also used in our model; five were topological and one was electronic. These indexes (KAPA-3, V1, Qneg, 2SP2, NO, and WTPT-3) were manually calculated according to the equations reported by Jurs and co-workers,8,9 except for Qneg and WTPT-3, which are described below. According to Wessel and Jurs,8 KAPA-3 is an index that describes the molecular shape of compounds corrected for the presence of heteroatoms. V1 is a valence connectivity index that describes the molecular connectivity of the compound, taking into account the type of bond that connects each atom.9 2SP2 is a descriptor that accounts for the total number of sp2-hybridized carbon atoms that are attached to two other carbons and one hydrogen in the structure.9 NO is simply the number of oxygen atoms in the structure.

Qneg, the charge on the most negative atom, was calculated using a computer program (NWChem, version 4.5 on an IBM SP2 Power3).10 Computation of partial atomic charges is more difficult than that of other electrostatic quantities, such as dipole moment, because atomic charges cannot be measured, leaving no basis for comparison to experiment. It is well known that simple methods for calculating charges, such as Mulliken charges, are not reliable and are dependent on the basis set.
A recent article compared methods of calculating partial charges.11 We were interested in selecting a method that would provide consistent results over a range of atom types and molecules, could be correlated with known ion mobility data, and would use a readily accessible code so others could easily perform similar calculations. Fitting tests showed that electrostatic potential charges computed by the NWChem program provided consistent agreement with measured quantities. The geometries of the ions were optimized according to the Hartree-Fock ab initio method, with either a 3-21G* basis set (for first-row elements) or a LANL2DZ ECP basis set (for transition metals). Calculation of the partial charges was then performed according to the ESP electrostatic potential fitting method using either 6-311G* or LANL2DZ basis sets. These results provided maximum negative charges, which led to the best agreement between calculated and measured K0 values.

(10) High Performance Computational Chemistry Group, NWChem, A Computational Chemistry Package for Parallel Computers, Version 4.5, Pacific Northwest National Laboratory, Richland, WA 99352, 2003.
(11) De Proft, F.; Van Alsenoy, C.; Peeters, A.; Langenaeker, W.; Geerlings, P. J. Comput. Chem. 2002, 23, 1198-1209.

WTPT-3, the sum of weighted path lengths, accounts for the contribution of heteroatoms to the overall geometry and shape of the hydrocarbon skeleton of an ion. Following the lead of Jurs and co-workers, a pure hydrocarbon was assigned a value of zero. This was not entirely consistent with the original Randić index,12 which has a nonzero value for hydrocarbons. In this work, WTPT-3 was modified to follow the Randić algorithm12 more closely. Heteroatoms were assigned nonzero values, and the sum of weighted path calculations was carried out starting from the heteroatoms.

Initially, we tried a simple multiple regression model, which included the six variables plus an intercept, on the data used by Jurs and co-workers. This model worked well (R2 = 88.7%) for many organic ions, i.e., aliphatic hydrocarbons and certain aromatic and cyclic structures containing electronegative atoms such as O, N, halogens, and certain forms of S. However, the model in its current form is less predictive for mobilities of compounds containing P, As, and certain forms of S. Therefore, the ions of interest in this study (Table 1) were divided into eight groups, as listed in Table 2, and a specific index value was calculated for each of these groups.
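Equations 1-3 of the Prediction Model section translate directly into a few lines of code. The sketch below is illustrative only: the function names, the SI unit choices in the cross-section helper, and the example numbers (drift velocity, field strength, temperature, pressure) are assumptions made for this example and are not taken from the paper.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
K_BOLTZ = 1.380649e-23       # Boltzmann constant, J/K


def mobility(drift_velocity, field):
    """Eq 1 rearranged: K = vd / E."""
    return drift_velocity / field


def reduced_mobility(k, temp_kelvin, pressure_torr):
    """Eq 3: K0 = K (273/T) (P/760)."""
    return k * (273.0 / temp_kelvin) * (pressure_torr / 760.0)


def cross_section(k, number_density, reduced_mass, temp_eff, charge=E_CHARGE):
    """Eq 2 solved for the collision cross section (SI units throughout)."""
    prefactor = 3.0 * charge / (16.0 * number_density)
    thermal = math.sqrt(2.0 * math.pi / (reduced_mass * K_BOLTZ * temp_eff))
    return prefactor * thermal / k


# Hypothetical measurement: 500 cm/s drift velocity in a 250 V/cm field,
# which gives K directly in cm^2 V^-1 s^-1.
k = mobility(drift_velocity=500.0, field=250.0)
k0 = reduced_mobility(k, temp_kelvin=298.0, pressure_torr=700.0)
print(f"K = {k:.3f}, K0 = {k0:.3f} cm^2 V^-1 s^-1")
```

Because eq 2 makes K inversely proportional to the cross section, doubling K in `cross_section` halves the returned value, which is a quick sanity check on any implementation.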
(12) Randić, M. J. Chem. Inf. Comput. Sci. 1984, 24, 164-175.

RESULTS AND DISCUSSION

Although there is a reciprocal relationship between ion cross section and K0, as shown from eqs 3 and 4, we modeled the relationship between the various topological parameters representing ion cross section and observed K0 values as a linear relationship in order to gain insight into the principal dynamics that have the greatest effects on K0. The measured parameter values were multiplied individually by the coefficients listed in Tables 3 and 4 to derive a predicted value of the inverse of the cross section. The R2 value (80.1%) reported at the bottom of Table 3 is a measure of the goodness of fit of the model. The square root of this value (i.e., 0.895) is the correlation between the predicted and observed K0 values. Nearly the same level of predictability as attained by Jurs and co-workers was obtained in this study. However, we included a more extensive and diverse list of 162 ions. We observed that two compounds, chloroacetophenone and 3-xylyl bromide (3-methylbenzyl bromide), have very large residuals and, when they were taken out of the pool (Figure 2; see Figures 1-6), the predictability improved to an R2 value of 87.4%. As noted earlier, Jurs and co-workers attained a regression line with an R2 value of 88.7%, albeit for a less diverse set of ions.
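The statement that the square root of R2 equals the correlation between predicted and observed values holds for least-squares fits that include an intercept. The sketch below checks this numerically; the five (descriptor, K0) pairs are made-up illustration values, not data from Table 1, and the helper names are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def pearson(a, b):
    """Pearson correlation coefficient between two sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


# Made-up descriptor values and mobilities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
k0_observed = [2.10, 1.95, 1.90, 1.70, 1.62]

intercept, slope = fit_line(descriptor, k0_observed)
k0_fitted = [intercept + slope * d for d in descriptor]

r2 = r_squared(k0_observed, k0_fitted)
r = pearson(k0_fitted, k0_observed)
print(f"R^2 = {r2:.4f}, r^2 = {r * r:.4f}")

# The same relation applied to the two R2 values quoted in the text.
print(round(math.sqrt(0.801), 3), round(math.sqrt(0.874), 3))
```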

Table 1. Experimental and Predicted K0 Values Used in Developing the Prediction Model a

| compound | K0 exptl | K0 pred | stdized residual | ref |
|---|---|---|---|---|
| ammonia | 2.97 | 2.00 | 27.2 | 14 |
| carbonic dichloride | 2.77 | 2.72 | 1.111 | 13 |
| nitrochloroform (chloropicrin) | 2.70 | 2.74 | -0.674 | 13 |
| chloroacetophenone | 2.66 | 2.01 | 6.712 | 13 |
| hydrogen cyanide | 2.50 | 2.39 | 1.148 | 13 |
| cyanogen chloride | 2.50 | 2.59 | -0.993 | 13 |
| 3-xylyl bromide (3-methylbenzyl bromide) | 2.47 | 2.03 | 4.565 | 13 |
| acetone | 2.13 | 2.06 | 0.744 | 9 |
| ethanol | 2.06 | 2.04 | 0.209 | 9 |
| acetic acid | 2.03 | 2.03 | -0.007 | 9 |
| pentane | 2.02 | 2.05 | -0.296 | 9 |
| 2,2,3,3-tetramethylbutane | 2.02 | 1.93 | 0.926 | 9 |
| tert-butylamine | 2.00 | 1.92 | 0.799 | 9 |
| n-hexane | 2.00 | 1.93 | 0.690 | 9 |
| 2-butanone | 2.00 | 1.93 | 0.689 | 9 |
| ethyl ester | 1.98 | 1.95 | 0.290 | 9 |
| 2-propanol | 1.98 | 1.97 | 0.146 | 9 |
| 1,2,4,5-tetramethylbenzene | 1.98 | 1.76 | 2.243 | 9 |
| glutaraldehyde | 1.97 | 1.85 | 1.181 | 9 |
| ethylbenzene | 1.97 | 1.95 | 0.167 | 9 |
| o-xylene | 1.96 | 1.94 | 0.249 | 9 |
| m-xylene | 1.96 | 1.88 | 0.823 | 9 |
| cycloheptane | 1.96 | 1.95 | 0.148 | 9 |
| triethylamine | 1.95 | 1.96 | -0.137 | 9 |
| methylcyclohexane | 1.95 | 1.95 | 0.031 | 9 |
| 2,4-lutidine | 1.95 | 1.96 | -0.071 | 9 |
| propionic acid | 1.94 | 1.91 | 0.289 | 9 |
| ethylcyclopentane | 1.94 | 1.95 | -0.106 | 9 |
| benzene | 1.94 | 2.12 | -1.876 | 9 |
| propanol | 1.93 | 1.91 | 0.212 | 9 |
| propanal | 1.93 | 2.00 | -0.693 | 9 |
| ethyl acetate | 1.93 | 1.89 | 0.411 | 9 |
| 2-methylpropanal | 1.93 | 1.97 | -0.356 | 9 |
| n-butylamine | 1.92 | 1.85 | 0.686 | 9 |
| dimethyl methylphosphonate | 1.91 | 1.67 | 2.524 | 9 |
| 3-methylhexane | 1.91 | 1.88 | 0.351 | 9 |
| n-heptane | 1.90 | 1.87 | 0.346 | 9 |
| diisopropylamine | 1.90 | 1.79 | 1.167 | 9 |
| 2-methylhexane | 1.90 | 1.82 | 0.825 | 9 |
| butylbenzene | 1.89 | 1.79 | 1.008 | 9 |
| 3-pentanone | 1.88 | 1.91 | -0.265 | 9 |
| 2-pentanone | 1.88 | 1.84 | 0.386 | 9 |
| toluene | 1.87 | 1.88 | -0.087 | 9 |
| propyl methanoate | 1.87 | 1.91 | -0.410 | 9 |
| methyl isobutyrate | 1.87 | 1.90 | -0.339 | 9 |
| 1,3-dimethylcyclohexane | 1.87 | 1.82 | 0.470 | 9 |
| 1,2-dimethylcyclohexane | 1.87 | 1.85 | 0.192 | 9 |
| 1,2,4-trimethylbenzene | 1.87 | 1.82 | 0.523 | 9 |
| naphthalene | 1.86 | 1.85 | 0.083 | 9 |
| methyl butanoate | 1.86 | 1.87 | -0.058 | 9 |
| isobutyric acid | 1.86 | 1.89 | -0.262 | 9 |
| ethylcyclohexane | 1.86 | 1.89 | -0.347 | 9 |
| ethyl propanoate | 1.86 | 1.84 | 0.193 | 9 |
| di-n-propylamine | 1.86 | 1.80 | 0.564 | 9 |
| cumene | 1.86 | 1.87 | -0.139 | 9 |
| aniline | 1.86 | 1.87 | -0.076 | 9 |
| 2-ethoxyethyl acetate | 1.86 | 1.75 | 1.101 | 9 |
| 2-(2-hydroxyethyl)pyridine | 1.86 | 1.81 | 0.496 | 9 |
| propylbenzene | 1.85 | 1.86 | -0.139 | 9 |
| benzaldehyde | 1.85 | 1.93 | -0.770 | 9 |
| 2-butanol | 1.85 | 1.84 | 0.102 | 9 |
| isoamylamine | 1.84 | 1.78 | 0.662 | 9 |
| cyclohexanone | 1.84 | 1.85 | -0.054 | 9 |
| 3,3-dimethyl-2-butanol | 1.84 | 1.77 | 0.729 | 9 |
| 2-methyl-1-propanol | 1.84 | 1.88 | -0.376 | 9 |
| 2-ethyl-1-hexane | 1.84 | 1.67 | 1.672 | 9 |
| 2,4,6-trimethylpyridine | 1.84 | 1.87 | -0.305 | 9 |
| isopropyl acetate | 1.83 | 1.83 | -0.049 | 9 |
| cyclohexene | 1.83 | 1.98 | -1.517 | 9 |
| butanal | 1.83 | 1.94 | -1.091 | 1,9 |
| hexanol | 1.62 | 1.68 | -0.558 | 9 |
| dichloro-(2-chlorovinyl) arsine | 1.62 | 1.59 | 1.571 | 13 |
| methyl heptanoate | 1.60 | 1.61 | -0.103 | 9 |
| mesitylene | 1.60 | 1.80 | -2.018 | 9 |
| heptanal | 1.60 | 1.69 | -0.885 | 9 |
| 2-octanone | 1.60 | 1.60 | 0.030 | 9 |
| 2-methyl-2-propanol | 1.83 | 1.91 | -0.768 | 9 |
| chlorobenzene | 1.83 | 1.98 | -1.571 | 13 |
| methyl tert-valerate | 1.82 | 1.86 | -0.396 | 9 |
| cyclohexylamine | 1.82 | 1.71 | 1.190 | 9 |
| 2-methylcyclohexanone | 1.82 | 1.80 | 0.181 | 9 |
| 2,2-dimethylhexane | 1.82 | 1.76 | 0.629 | 9 |
| 2,2,4-trimethylpentane | 1.82 | 1.74 | 0.794 | 9 |
| N,N-dimethylaniline | 1.81 | 2.09 | -2.832 | 9 |
| 4-methylheptane | 1.81 | 1.78 | 0.340 | 9 |
| 3-methylbutanal | 1.81 | 1.87 | -0.607 | 9 |
| 2-methylbutanal | 1.81 | 1.92 | -1.076 | 9 |
| pentanal | 1.80 | 1.85 | -0.455 | 9 |
| o-toluidine | 1.80 | 1.85 | -0.515 | 9 |
| n-octane | 1.80 | 1.77 | 0.257 | 9 |
| methyl anteisovalerate | 1.80 | 1.81 | -0.147 | 9 |
| 2-methylheptane | 1.80 | 1.74 | 0.588 | 9 |
| n-amylamine | 1.79 | 1.76 | 0.270 | 9 |
| ethyl butanoate | 1.79 | 1.76 | 0.268 | 9 |
| dicyclopentadiene | 1.79 | 1.72 | 0.699 | 9 |
| 2-methyl-3-hexanone | 1.79 | 1.76 | 0.329 | 9 |
| 2-hexanone | 1.79 | 1.76 | 0.305 | 9 |
| isopropylcyclohexane | 1.78 | 1.79 | -0.097 | 9 |
| 3-pentanol | 1.78 | 1.83 | -0.522 | 9 |
| 2,5-dimethyl-1,5-hexadiene | 1.78 | 1.73 | 0.527 | 9 |
| propylcyclohexane | 1.77 | 1.79 | -0.217 | 9 |
| propyl propanoate | 1.77 | 1.77 | 0.040 | 9 |
| 3,5,5-trimethyl-1-hexene | 1.77 | 1.69 | 0.810 | 9 |
| p-cymene | 1.76 | 1.74 | 0.174 | 9 |
| 4-methylcyclohexanone | 1.76 | 1.78 | -0.240 | 9 |
| 4-heptanone | 1.76 | 1.73 | 0.258 | 9 |
| 3-methylcyclohexanone | 1.76 | 1.77 | -0.086 | 9 |
| tert-butyl acetate | 1.75 | 1.76 | -0.117 | 9 |
| sec-butylbenzene | 1.75 | 1.81 | -0.620 | 9 |
| sec-butyl acetate | 1.75 | 1.77 | -0.205 | 9 |
| isobutylbenzene | 1.75 | 1.81 | -0.603 | 9 |
| 3-heptanone | 1.75 | 1.71 | 0.385 | 9 |
| benzylamine | 1.74 | 1.91 | -1.699 | 9 |
| 2-ethylbutanoic acid methyl ester | 1.74 | 1.77 | -0.282 | 9 |
| 2-methylpentanal | 1.73 | 1.83 | -0.953 | 9 |
| n-nonane | 1.72 | 1.70 | 0.216 | 9 |
| isobutyl acetate | 1.72 | 1.76 | -0.370 | 9 |
| isoamyl alcohol | 1.72 | 1.78 | -0.607 | 9 |
| biphenylene | 1.72 | 1.80 | -0.839 | 9 |
| 3-methyl-5-hexanone | 1.72 | 1.73 | -0.060 | 9 |
| 2-methylpentanoic acid methyl ester | 1.72 | 1.75 | -0.258 | 9 |
| 2-methyl-1-butanol | 1.72 | 1.81 | -0.897 | 9 |
| propyl butanoate | 1.71 | 1.68 | 0.259 | 9 |
| 3-hexanol | 1.71 | 1.73 | -0.183 | 9 |
| N,N-diethylaniline | 1.70 | 1.88 | -1.811 | 9 |
| diisobutylamine | 1.70 | 1.75 | -0.525 | 9 |
| 5-methyl-2-hexanone | 1.70 | 1.70 | 0.008 | 9 |
| 3-octanone | 1.70 | 1.64 | 0.638 | 9 |
| tri-n-propylamine | 1.69 | 1.68 | 0.153 | 9 |
| hexanal | 1.69 | 1.78 | -0.857 | 9 |
| 2-hexanol | 1.69 | 1.67 | 0.167 | 9 |
| 2-heptanone | 1.69 | 1.61 | 0.844 | 9 |
| n-hexylamine | 1.68 | 1.68 | -0.040 | 9 |
| methyl hexanoate | 1.68 | 1.69 | -0.125 | 9 |
| isopropyl methylphosphonofluoridate | 1.68 | 1.66 | 0.208 | 13 |
| butylcyclohexane | 1.68 | 1.70 | -0.218 | 9 |
| butyl propanoate | 1.68 | 1.69 | -0.105 | 9 |
| 3-amyl acetate | 1.68 | 1.71 | -0.334 | 9 |
| tert-amyl alcohol | 1.67 | 1.76 | -0.860 | 9 |
| methyl isocaproate | 1.67 | 1.71 | -0.419 | 9 |
| di-n-butylamine | 1.67 | 1.66 | 0.086 | 9 |
| sec-amyl acetate | 1.66 | 1.70 | -0.398 | 9 |
| 2-ethyl-1-butanol | 1.66 | 1.84 | -1.796 | 9 |
| isoamyl acetate | 1.64 | 1.66 | -0.198 | 9 |
| n-decane | 1.63 | 1.61 | 0.199 | 9 |
| ethyl hexanoate | 1.63 | 1.60 | 0.297 | 9 |
| octanal | 1.51 | 1.61 | -0.978 | 9 |
| diethyl phthalate | 1.49 | 1.60 | -1.281 | 9 |
| octanol | 1.47 | 1.51 | -0.445 | 9 |
| ethyl octanoate | 1.47 | 1.43 | 0.362 | 9 |
| tri-n-butylamine | 1.46 | 1.54 | -0.798 | 9 |
| nonanal | 1.44 | 1.52 | -0.843 | 9 |
| p-diisopropylbenzene | 1.59 | 1.67 | -0.796 | 9 |
| n-heptylamine | 1.58 | 1.60 | -0.197 | 9 |
| ethyl N,N-dimethylphosphoroamidocyanidate | 1.57 | 1.74 | -1.905 | 13 |
| heptanol | 1.54 | 1.59 | -0.509 | 9 |
| pinacolyl methylphosphonofluoridate | 1.52 | 1.61 | -0.926 | 15 |
| nonanol | 1.40 | 1.43 | -0.273 | 9 |
| decanal | 1.38 | 1.44 | -0.658 | 9 |
| decanol | 1.34 | 1.35 | -0.081 | 9 |
| parathion | 1.27 | 1.10 | 2.228 | 14 |
| o-ethyl-S-(2-diisopropylaminoethyl)methylphosphonothiolate | 1.23 | 1.38 | -1.916 | 15 |

a This list of compounds includes those used by Jurs and co-workers,9 plus a number of others of current interest to the authors.
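The standardized residuals in Table 1 make it easy to screen for poorly predicted compounds. The sketch below flags rows whose standardized residual exceeds 2 in absolute value, the cutoff used later in the discussion of Table 6. The rows are a small excerpt copied from Table 1; the function name and the tuple layout are illustrative choices, not part of the paper.

```python
# (compound, experimental K0, predicted K0, standardized residual) from Table 1
ROWS = [
    ("chloroacetophenone", 2.66, 2.01, 6.712),
    ("acetone", 2.13, 2.06, 0.744),
    ("benzene", 1.94, 2.12, -1.876),
    ("dimethyl methylphosphonate", 1.91, 1.67, 2.524),
    ("mesitylene", 1.60, 1.80, -2.018),
    ("N,N-dimethylaniline", 1.81, 2.09, -2.832),
    ("parathion", 1.27, 1.10, 2.228),
]


def flag_outliers(rows, cutoff=2.0):
    """Return the names of compounds with |standardized residual| > cutoff."""
    return [name for name, _exptl, _pred, resid in rows if abs(resid) > cutoff]


for name in flag_outliers(ROWS):
    print(name)
```

Raising `cutoff` to 3.0 leaves only chloroacetophenone from this excerpt, which matches its status as one of the two compounds removed before the second fit.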

Table 2. Grouping of Compounds According to Structure

alkanes
alkanes with heteroatoms (mainly O and N)
cyclic alkanes
cyclic alkanes with heteroatoms (mainly O and N)
aromatic compounds
compounds containing As and Cl
compounds containing halogens, N, and O
compounds containing S, Cl, and O
compounds containing P, S, O, F, and N

Table 3. Estimated Coefficients of Parameters from Regression Analysis a,b

| predictor | coefficient | standard error | number of compounds |
|---|---|---|---|
| constant | 2.464 | 0.041 | all |
| V1 | -0.131 | 0.011 | all |
| NO | -0.131 | 0.014 | all |
| 2SP2 | -0.003 | 0.005 | all |
| QNEG | 0.401 | 0.040 | all |
| KAPA 3a c | -0.016 | 0.004 | 154 |
| KAPA 3b c | -0.253 | 0.027 | 8 |
| WTPT 1 d | 0.090 | 0.009 | 114 |
| WTPT-PSO d | 0.031 | 0.006 | 6 |
| WTPT 2 d | 0.128 | 0.016 | 42 |

Table 4. Estimated Coefficients of Parameters from Regression Analysis of the Data Used in Table 3 after Elimination of Data for Chloroacetophenone and 3-Xylyl Bromide a

| predictor | coefficient | standard error | number of compounds |
|---|---|---|---|
| constant | 2.457 | 0.030 | all |
| V1 | -0.130 | 0.008 | all |
| NO | -0.114 | 0.011 | all |
| 2SP2 | -0.010 | 0.004 | all |
| QNEG | 0.368 | 0.030 | all |
| KAPA 3a b | -0.016 | 0.003 | 152 |
| KAPA 3b b | -0.248 | 0.020 | 8 |
| WTPT 1 c | 0.073 | 0.006 | 112 |
| WTPT-PSO c | 0.025 | 0.005 | 6 |
| WTPT 2 c | 0.125 | 0.012 | 42 |

a R2 = 87.4%.
b KAPA 3a are KAPA 3 values for hydrocarbons, cyclic hydrocarbons, hydrocarbons with heteroatoms, cyclic hydrocarbons with heteroatoms, aromatics, and other compounds containing P, S, O, F, and N atoms. KAPA 3b are KAPA 3 values for all other compounds.
c WTPT 1 are WTPT values for hydrocarbons with heteroatoms, cyclic hydrocarbons with heteroatoms, and aromatics. WTPT-PSO are the WTPT values for compounds containing P, S, O, F, and N atoms. WTPT 2 are WTPT values for all other compounds.
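To make the use of the tabulated coefficients concrete, the sketch below evaluates the linear model with the Table 3 coefficients. It assumes that each compound contributes its KAPA 3 value through exactly one of the two KAPA 3 terms and its WTPT value through exactly one of the three WTPT terms, chosen by cluster membership. That reading of the table footnotes, the function and key names, and the example descriptor values are assumptions for illustration, not values or code from the paper.

```python
# Coefficients as listed in Table 3 (fit to all 162 compounds).
TABLE3 = {
    "constant": 2.464,
    "V1": -0.131,
    "NO": -0.131,
    "2SP2": -0.003,
    "QNEG": 0.401,
    "KAPA3a": -0.016,
    "KAPA3b": -0.253,
    "WTPT1": 0.090,
    "WTPT-PSO": 0.031,
    "WTPT2": 0.128,
}


def predict_k0(v1, n_oxygen, n_2sp2, qneg, kapa3, wtpt,
               kapa_cluster="KAPA3a", wtpt_cluster="WTPT1", coef=TABLE3):
    """Linear prediction of K0; the cluster arguments pick which slope applies."""
    return (coef["constant"]
            + coef["V1"] * v1
            + coef["NO"] * n_oxygen
            + coef["2SP2"] * n_2sp2
            + coef["QNEG"] * qneg
            + coef[kapa_cluster] * kapa3
            + coef[wtpt_cluster] * wtpt)


# Hypothetical descriptor values for a small oxygen-containing compound.
k0 = predict_k0(v1=1.5, n_oxygen=1, n_2sp2=0, qneg=-0.5, kapa3=2.0, wtpt=1.2)
print(f"predicted K0 = {k0:.3f}")
```

Passing a dictionary built from Table 4 as `coef` would give the prediction from the fit without the two outliers.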

(13) Sohn, H.; Steinhanses, J. IJIMS 1998, 1, 1-14.
(14) Shumate, C.; St Louis, R. H.; Hill, H. H., Jr. J. Chromatogr. 1986, 373, 141-173.
(15) Alder, J.; Döring, H. R.; Starrock, V.; Wülfing, E. Proc. 4th Int. Symp. Protection Against Chemical Warfare Agents, Stockholm, Sweden, 8-12 June, 1992; pp 175-180.

Footnotes to Table 3:
a Coefficient in second column implicitly defines 1/ΩD as given in eq 4.
b R2 = 80.1%.
c KAPA 3a are KAPA 3 values for hydrocarbons, cyclic hydrocarbons, hydrocarbons with heteroatoms, cyclic hydrocarbons with heteroatoms, aromatics, and other compounds containing P, S, O, F, and N atoms. KAPA 3b are KAPA 3 values for all others.
d WTPT 1 are WTPT values for hydrocarbons with heteroatoms, cyclic hydrocarbons with heteroatoms, and aromatics. WTPT-PSO are WTPT values for compounds containing P, S, O, F, and N atoms. WTPT 2 are the WTPT values for all other compounds.

Figure 1. Predicted K0 vs experimental K0 values using all compounds (Table 4 coefficients).

Figure 2. Predicted K0 vs experimental K0 values after omitting data for chloroacetophenone and 3-xylyl bromide (Table 5 coefficient).

Figure 3. Predicted K0 vs experimental K0 values for groups 1 and 2 of Table 2.

Figure 4. Predicted K0 vs experimental K0 values for groups 3 and 4 of Table 2.

The less than perfect predictability of K0 values is due to the limitations in the descriptors that are available for use in the model. The topological indexes used here were originally designed for hydrocarbons and only later corrected to account for the addition of halogens and heteroatoms such as O, N, P, and S. Thus, use of the same descriptors for compounds for which they were not designed would be expected to exhibit some limitations. In the predictions of Jurs and co-workers, the Randić index was modified from its original form. This modification was a data-centric adjustment, which means that the modification was based on actual ion mobility data and not entirely on topological considerations. This gave the index more predictive power but made it difficult to predict mobilities for compounds for which the training set was not applicable. The version of the Randić index used in this paper is readily obtainable for new compounds.

In this study we allowed the descriptors KAPA 3 and WTPT to be different for the eight different groups of compounds (Table 2). In the analyses, we discovered that the eight groups could empirically regroup into two clusters with respect to KAPA 3. The groups in the same cluster showed the same KAPA 3 effect on mobility. These clusters are identified in Tables 3 and 4. For a molecule in a particular cluster, the KAPA 3 values are as specified in Tables 3 and 4. The same is true for the WTPT index, except that the groups regroup into three clusters, which differ from the clusters identified for KAPA 3. Groups in the same cluster show the same effect of WTPT on mobility as described in Tables 3 and 4.

The estimate of the standard error, SE = 0.028, indicates that there is a significant level of uncertainty in the predictions. Note that this is not the standard error of prediction but the standard error of the fit. The standard error of prediction depends on the value of the index of the compound for which the mobility is to be predicted. Table 1 gives the observed and fitted values for each compound and also the standardized difference in fit. This standardized difference is the number of standard deviations, for the particular values of the regression variables, by which the predicted and observed values differ. For a few of these cases, values of this standardized difference that are much greater than 2 in absolute value indicate a very poor prediction.

Table 5 shows the R2 values for the first six groups of Table 2; calculation of the R2 values for the remaining three groups was not possible as experimental K0 values for the compounds in these groups were not available. As the table shows, our model is quite good for compounds in group 3, i.e., hydrocarbons with a heteroatom. Group 5 gave an R2 value of zero, which could partially be explained by the five compounds in Table 6 that gave high residual values. Groups 1, 3, and 6 of Table 5 have 16, 12, and 6 compounds; this could explain why they have low R2 values.

However, the true predictive power of this model can only be tested with more experimentally determined K0 values for compounds in the groups. Group 3 has a high R2 value, and there are 95 compounds in this group. Work is continuing in this area.

Accuracy of measured K0 values depends greatly on the experimental conditions, instrumentation, and reproducibility of measurements. Until recently, adequate care was not taken to ensure that mobilities were accurately measured. This is evident in the data in the literature, as there is considerable variability in K0 values even for frequently measured compounds. Although the variation may appear small to the casual observer, the observed level of uncertainty is too large for prediction with high certainty. A high level of variation can be seen from Table 7 using data for benzene, toluene, aniline, and 1-octanol taken at the same temperature.14 Some of these measurements were reported in the same publication. What is not apparent in these studies are the experimental conditions, with the exception of temperature. Ionization method and vapor pressure also play key roles in this regard. Clustering of primary product ions may also lead to variation in these measurements. Thus, introducing a chemical to standardize K0 measurements, as proposed by Eiceman et al.,17 will help to resolve some of these problems.

One other topological characteristic that would likely improve our ability to predict K0 is compound symmetry. From a physical and, perhaps, chemical point of view, the symmetry of a compound would affect its mobility and its ability to polarize the buffer gas.

Table 5. R2 Values for Different Groups a

| group | R2 |
|---|---|
| hydrocarbons | 0.562 |
| cyclic hydrocarbons | 0.569 |
| hydrocarbons with heteroatom | 0.920 |
| aromatic compounds | 0.342 |
| cyclic hydrocarbons with heteroatom | 0.000 |
| compounds containing P, S, O, F, and N | 0.448 |

a Experimentally measured K0 values were not available for compounds in three of the groups; thus, R2 values could not be calculated for them.

Table 6. Unusual Observations (Residual for Outliers)

| compound | residual |
|---|---|
| N,N-dimethylaniline | -2.83 |
| mesitylene | -2.02 |
| 1,2,4,5-tetramethylbenzene | 2.24 |
| chloroacetophenone | 6.71 |
| 3-xylyl bromide | 4.57 |
| parathion | 2.23 |
| dimethyl methylphosphonate | 2.52 |

Figure 5. Predicted K0 vs experimental K0 values for groups 5 and 6 of Table 2.

Figure 6. Predicted K0 vs experimental K0 values for groups 7 and 9 of Table 2.

Table 7. Data for Some Compounds That Show Variation in Measured K0 Values

| compound | K0 (cm2 V-1 s-1) | temp (°C) |
|---|---|---|
| benzene | 2.00 | 50 |
| benzene | 2.22 | 50 |
| benzene | 2.22 | 50 |
| benzene | 2.22 | 50 |
| benzene | 2.33 | 150 |
| benzene | 2.45 | 150 |
| benzene | 2.48 | 150 |
| benzene | 2.26 | 150 |
| benzene | 2.08 | 150 |
| 1-octanol | 0.98 | 22 |
| 1-octanol | 1.12 | 22 |
| 1-octanol | 1.21 | 22 |
| 1-octanol | 1.46 | 22 |
| 1-octanol | 1.53 | 22 |
| 1-octanol | 1.88 | 22 |
| aniline | 1.93 | 200 |
| aniline | 2.00 | 200 |
| aniline | 2.07 | 200 |
| aniline | 1.95 | 200 |
| aniline | 1.83 | 140 |
| aniline | 2.00 | 140 |
| aniline | 2.06 | 140 |

(16) Blanchard, W. C. Int. J. Mass Spectrom. Ion Processes 1989, 95, 199-210.
(17) Eiceman, G. A.; Nazarov, E. G.; Stone, J. A. Anal. Chim. Acta 2003, 493, 185-194.

ACKNOWLEDGMENT

We are extremely grateful to Professor Peter C. Jurs for kindly providing helpful discussion and information essential for completion of this work. A modified version of NWChem Version 4.5, as developed and distributed by Pacific Northwest National Laboratory, P.O. Box 999, Richland, WA 99352, and funded by the U.S. Department of Energy, was used to obtain some of these results.

APPENDIX

Equations Used To Calculate Topological Descriptors. KAPA 3 (Kier Shape Index)

$$^{3}\kappa = \frac{(A - 3)(A - 2)^{2}}{(^{3}P_i)^{2}} \qquad (A\ \text{is even})$$

$$^{3}\kappa = \frac{(A - 1)(A - 3)^{2}}{(^{3}P_i)^{2}} \qquad (A\ \text{is odd})$$

For heteroatom correction, the equation above was modified as follows:

$$^{3}\kappa = \frac{(A + R - 3)(A + R - 2)^{2}}{(^{3}P_i + R)^{2}} \qquad (A\ \text{is even})$$

$$^{3}\kappa = \frac{(A + R - 1)(A + R - 3)^{2}}{(^{3}P_i + R)^{2}} \qquad (A\ \text{is odd})$$

Table 9. Heteroatom Valence δ Values

Table 8. R Based on Covalent Radius

Rx = (rx/rCsp3) - 1

| atom (valence state) | rx (Å) a | Rx |
|---|---|---|
| C sp3 | 0.77 | 0 |
| C sp2 | 0.67 | -0.13 |
| C sp | 0.60 | -0.22 |
| N sp3 | 0.74 | -0.04 |
| N sp2 | 0.62 | -0.20 |
| N sp | 0.55 | -0.29 |
| O sp3 | 0.74 | -0.04 |
| O sp2 | 0.62 | -0.20 |
| F | 0.72 | -0.07 |
| P sp3 | 1.10 | 0.43 |
| P sp2 | 1.00 | 0.30 |
| S sp3 | 1.04 | 0.35 |
| S sp2 | 0.94 | 0.22 |
| Cl | 0.99 | 0.29 |
| Br | 1.14 | 0.48 |
| I | 1.33 | 0.73 |

a Schomaker, V.; Stevenson, D. P. J. Am. Chem. Soc. 1961, 63, 37.
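As a hedged illustration of the KAPA 3 formulas and the Table 8 modifier, the sketch below counts three-bond paths in a molecular graph and evaluates the shape index. The bond-list representation, the function names, the restriction to non-hydrogen atoms, the choice to sum Rx over all atoms to obtain R, and the two example molecules are assumptions made for this example rather than details confirmed by the paper.

```python
# Covalent radii (angstroms) for a few valence states, copied from Table 8.
RADIUS = {"C sp3": 0.77, "C sp2": 0.67, "N sp3": 0.74, "O sp3": 0.74, "Cl": 0.99}


def modifier(atom_types):
    """Sum of Rx = rx / r(C sp3) - 1 over the atoms (zero for sp3 carbons)."""
    return sum(RADIUS[t] / RADIUS["C sp3"] - 1.0 for t in atom_types)


def count_three_bond_paths(bonds):
    """Count paths made of three consecutive bonds; each bond is listed once."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    count = 0
    for b, c in bonds:                      # treat each bond as the middle one
        for a in neighbors[b] - {c}:
            for d in neighbors[c] - {b}:
                if a != d:                  # skip three-membered rings
                    count += 1
    return count


def kappa3(atom_types, bonds):
    """Third-order shape index with the heteroatom modifier R.

    Note: molecules with no three-bond paths and R = 0 would divide by zero.
    """
    a = len(atom_types)
    r = modifier(atom_types)
    p3 = count_three_bond_paths(bonds)
    if a % 2 == 0:
        return (a + r - 3) * (a + r - 2) ** 2 / (p3 + r) ** 2
    return (a + r - 1) * (a + r - 3) ** 2 / (p3 + r) ** 2


chain = [(0, 1), (1, 2), (2, 3), (3, 4)]
# n-Pentane: five sp3 carbons in a chain, so R = 0 and there are two paths.
pentane = kappa3(["C sp3"] * 5, chain)
# 1-Butanol skeleton C-C-C-C-O, same graph with one O sp3 atom.
butanol = kappa3(["C sp3"] * 4 + ["O sp3"], chain)
print(pentane, butanol)
```

For these two five-atom chains the odd-A formula reduces to 4 + R, so the oxygen atom lowers the index slightly relative to the pure hydrocarbon.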

In the KAPA 3 equations, A is the number of atoms in the molecule, 3Pi is the count of contiguous three-bond fragments derived from the molecule, and R is the heteroatom modifier for the non-carbon atom(s) in the molecule. (See Table 8.)

V1 (Path 1 Valence Connectivity). V1 is calculated from a graph theory representation of the molecule. For each vertex of the graph there is a vertex valence (number of bonds of the atom at the vertex), denoted δi for the ith vertex, and a set of connecting segments between the vertex and other vertexes, called edges. The variable V1, denoted 1χ in the literature, is given for hydrocarbons by

$$^{1}\chi = \sum_{s=1}^{N_e} (\delta_i \delta_j)_s^{-1/2}$$

where the sum is over all edges linking vertexes v1 and v2. The number of edges in the graph of the molecule is Ne. For heteroatoms the above formula is modified by changing δi to δiv, where δiv = Zv - hi, Zv is the valence of the atom in the molecule backbone structure, and hi is the number of hydrogen atoms surrounding that atom. For all other molecules, the V1 value is zero. (See Table 9.)

WTPT (Randić Connectivity ID Number). The Randić connectivity ID number is a measure of connectivity. It is formed by summing the connectivity weights. These weights are given by

$$w(p_l) = \prod_{i=1}^{l} w(e_i)$$

where w(ei) is the connectivity weight of the ith edge of the graph representation of the molecule. The Randić index is given by

$$ID = N + \frac{1}{2}\sum_{p_l} w(p_l)$$

where N is the number of atoms in the molecule, pl is the weighted path length corrected for heteroatoms using Table 2 above, and ei (i = 1, 2, ..., l) represents a set of weighted atoms making up the path.

Equation Used for Mobility Calculations. The regression equation for linear mobility is

Exptl K0 = 2.46 - 0.131 V1 - 0.132 NO - 0.00267 2SP2 + 0.401 QNEG - 0.253 KAPA*Kbar + 0.0905 NsumW + 0.0315 WTPT-PSOFN - 0.0162 NsumK + 0.129 WTPT*Wbar

The regression equation for linear mobility with outliers 154 and 157 removed is

mobility w/o out = 2.46 - 0.130 V1 - 0.115 NO - 0.0107 2SP2 + 0.368 QNEG - 0.248 KAPA*Kbar + 0.0730 NsumW + 0.0250 WTPT-PSOFN - 0.0163 NsumK + 0.125 WTPT*Wbar

Received for review December 4, 2003. Accepted April 30, 2004.

AC030403S
