
Energy & Fuels 2006, 20, 609-619


Prediction of Chromatographic Retention Times for Aromatic Hydrocarbons

Prasenjeet Ghosh,*,† Birbal Chawla,‡ Prasanna V. Joshi,§ and Stephen B. Jaffe†

ExxonMobil Process Research Laboratories, Paulsboro, New Jersey 08066, ExxonMobil Research and Engineering, Paulsboro, New Jersey 08066, and ExxonMobil Process Research Laboratories, Baton Rouge, Louisiana 70821

Received July 25, 2005. Revised Manuscript Received November 16, 2005

A quantitative model for predicting the retention times of various single- and multi-ring aromatic hydrocarbons (AH) and heteroatom-containing molecules in high-performance liquid chromatography (HPLC) is presented. The retention time behavior of 44 aromatic molecules containing one-, two-, and three-ring structures on a [3-(2,4-dinitroanilino)]propyl-silica column was investigated under subambient separation conditions. Substituent effects (their number, type, and position on the ring) on the retention behavior of these aromatic ring structures were also investigated. A quantitative structure-property relationship (QSPR) model elucidating the retention behavior of AH molecules was developed based on the obtained experimental data. It was found that the electronic properties of the molecule and its molecular geometry play a pivotal role in determining its retention in the HPLC column. The important electronic and geometric descriptors were identified using partial least squares analysis. The electronic properties of the molecule were quantified by its ionization potential and electron affinity computed from the optimized solute geometry using the PM3 energy function. The molecular geometry was characterized by the number of rings, molecular weight, and the valence connectivity indices. The mathematical form of the QSPR model was derived using genetic algorithms, and a log-linear form of the model was found to best correlate the HPLC retention time data with the solute molecular descriptors. The QSPR model was also used to predict the retention behavior of many other AH molecules that are commonly present in most heavy petroleum streams. The results obtained in this study enhance our understanding of the AH retention behavior and are of significance in predicting the identity of chromatographic peaks and quantifying the extent of overlap based solely on parameters derived from solute structure.

1. Introduction

Petroleum is a complex mixture of thousands of distinct molecular species, containing many different hydrocarbon- and heteroatom-containing molecules. The ability to analytically identify and quantitatively measure these molecules is critical for the successful design and the economic operation of commercial refining processes. A key analytical technology that enables this is high-performance liquid chromatography (HPLC) separation, where the different molecules are retained for different amounts of time in the chromatographic column according to their molecular structures. The ease of operation, versatility, and high separation capacity have made HPLC the analytical method of preference for molecular identification, particularly for the heavy petroleum streams. However, despite its many advantages, it is often difficult to correctly identify all the chromatographic peaks in the HPLC separation. This is particularly true for the polyaromatic hydrocarbons (PAH), where there might be significant chromatographic peak overlap between the different molecules, possibly resulting in complete molecular misidentification. We present in this article a better understanding of the retention behavior of different AHs and PAHs in the chromatographic column and the development of a quantitative model that predicts their retention times, thereby facilitating correct identification of the molecular structures.

* To whom all correspondence should be addressed. Email: [email protected].
† ExxonMobil Process Research Laboratories, Paulsboro, New Jersey.
‡ ExxonMobil Research and Engineering, Paulsboro, New Jersey.
§ ExxonMobil Process Research Laboratories, Baton Rouge.

The advent of advanced and inexpensive computational tools has brought a remarkable growth in the area of quantitative structure-property relationships (QSPR). QSPR employs various multivariate methods, ranging from simple regression models to detailed nonlinear neural network models, to correlate relevant properties as a function of features of the molecular structure (also called descriptors). These descriptors can be structural, topological, electronic, and/or geometric, depending on the problem at hand. There are many examples in the literature where a QSPR methodology was used to develop a predictive model for the retention behavior of different hydrocarbons in a chromatographic column. Some of the relevant work includes that of Collantes et al.,1 Ledesma and Wormat,2 Ferreira,3 Forgacs and Cserhati,4 and references therein, who all used different descriptors that encode specific aspects of the molecular structure to correlate the retention time behavior to molecular structure. The objective of the present work is similar to the cited literature, that is, to develop a QSPR model correlating the molecular descriptors derived from different AHs, including PAHs and heteroatom-containing molecules, to their retention time behavior in a chromatographic column. Specifically, our objectives are threefold: first, to identify and rank the molecular descriptors (in order of their relevance) that correlate to the retention time of the AH molecule in the column; second, to use these descriptors to develop a quantitative model to predict the retention time; and third, to use this predictive model to study the effect of various substituents (e.g., addition of alkyl side chains, addition of naphthenic rings, increasing the length of the biaryl bridge between aromatic molecules, etc.) on the retention behavior of AHs and PAHs.

(1) Collantes, E. R.; Tong, W.; Welsh, W. Use of Moment of Inertia in Comparative Molecular Field Analysis To Model Chromatographic Retention of Nonpolar Solutes. Anal. Chem. 1996, 68, 2038.
(2) Ledesma, E. B.; Wormat, M. J. QSRR Prediction of Chromatographic Retention of Ethynyl-Substituted PAH from Semiempirically Computed Solute Descriptors. Anal. Chem. 2000, 72, 5437.
(3) Ferreira, M. C. M. Polycyclic aromatic hydrocarbons: a QSPR study. Chemosphere 2001, 44, 125.
(4) Forgacs, E.; Cserhati, T. Use of PCA for studying the separation of pesticides on polyethylene-coated silica columns. J. Chromatogr., A 1998, 797, 33.

10.1021/ef0502305 CCC: $33.50 © 2006 American Chemical Society. Published on Web 01/07/2006

Energy & Fuels, Vol. 20, No. 2, 2006

2. Experimental Section

A number of different aromatic molecules containing one, two, and three aromatic rings were eluted on a [3-(2,4-dinitroanilino)]propyl-silica (DNAP-silica) column (known to be aromatic ring-class separation specific)5-11 under subambient temperature conditions. The HPLC system consisted of a quaternary solvent delivery system, a sample injector, and a photodiode-array detector. The commercially available HPLC grade mobile phases, n-pentane and methylene chloride, were used after overnight drying over freshly activated 4A molecular sieves (8-12 mesh). The photodiode-array detector was set to measure the spectra in the range of 190-400 nm at intervals of 1.3 nm every 25 s. A 1-mm path length was used for the UV/vis measurements. The chromatographic separation details are the subject of a separate publication. The aromatic molecules were injected mostly as pure compound solutions. However, in some cases, where the elution behavior was well-known, mixtures of 2-4 compounds were injected (e.g., a mixture of toluene, naphthalene, phenanthrene, and pyrene). Under the subambient chromatographic separation conditions, the aromatic molecules were eluted under increasing solvent polarity at a constant flow rate.

(5) Nondek, L.; Malek, J. Liquid Chromatography of Aromatic Hydrocarbons on a Chemically Bonded Stationary Phase of the Charge-Transfer Type. J. Chromatogr. 1978, 155, 187.
(6) Nondek, L.; Minarik, M.; Malek, J. Charge-Transfer Liquid Chromatography of Aromatic Hydrocarbons and Polyaryl Alkanes. J. Chromatogr. 1979, 178, 427.
(7) Nondek, L.; Ponec, R. Chemically Bonded Electron Acceptors as Stationary Phases in High-Performance Liquid Chromatography. J. Chromatogr. 1984, 294, 175.
(8) Nondek, L.; Minarik, M. Chromatographic Selectivity in Liquid Chromatography of Polycondensed Aromatic Hydrocarbons on 3-(2,4-Dinitroanilino)propyl-Silica. J. Chromatogr. 1985, 324, 261.
(9) Nondek, L. Liquid Chromatography on Chemically Bonded Electron Donors and Acceptors. J. Chromatogr. 1986, 373, 61.
(10) Grizzle, P. L.; Thomson, J. S. Liquid Chromatographic Separations of Aromatic Hydrocarbons with Chemically Bonded (2,4-Dinitroanilinopropyl)silica. Anal. Chem. 1982, 54, 1071.
(11) Thomson, J. S.; Reynolds, J. W. Separation of Aromatic Hydrocarbons Using Bonded-Phase Charge-Transfer Liquid Chromatography. Anal. Chem. 1984, 56, 2434.

Table 1. Experimental HPLC Retention Time Data of Various Molecules Considered in This Study (a)
(a) The numbers in parentheses reflect the retention time in minutes for these molecules in the HPLC column run under subambient conditions.

Figure 1. Percentage correlation of different molecular descriptors with retention time based on PLS analysis.

Figure 2. Variation of valence connectivity index of orders 1 and 2 with molecular structure. Notice the different isomers (at carbon numbers 4 and 6) highlighting the difference between (1)χv and (2)χv.

Molecules were classified into four different aromatic ring compound (ARC) classes, namely ARC-1, ARC-2, ARC-3, and ARC-4+, depending upon the times for which they were retained in the column. Specifically, a molecule was classified as ARC-1 (e.g., toluene with one aromatic ring) if its retention time was less than 120 min, ARC-2 (e.g., naphthalene) if it was between 120 and 330 min, and ARC-3 (e.g., phenanthrene) if it was between 330 and 480 min. The chromatographic column was back-flushed beyond 480 min, and consequently any molecule whose retention time was greater than 480 min was lumped into ARC-4+. The molecular basis for the ARC-class categorization of different molecules is the number of aromatic rings fused together without any methylene group(s) in the molecule, that is, molecules with one ring usually elute in ARC-1, molecules with two rings in ARC-2, and so on; however, there are some molecules with exceptional shape and geometry that do not follow this grouping. The measured chromatographic retention time data (in minutes) for the different molecules considered in this study are summarized in Table 1.
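The retention-time windows quoted above amount to a simple lookup. The following Python sketch is an illustration only: the function name `arc_class` is hypothetical, and the treatment of a time falling exactly on a boundary (assigned to the higher class) is an assumption, since the text does not say how boundary values were handled. The example times are the measured values listed for these molecules in Table 3.

```python
# Upper retention-time limits (minutes) of each ARC-class, as quoted in the text.
ARC_BOUNDARIES = [(120.0, "ARC-1"), (330.0, "ARC-2"), (480.0, "ARC-3")]


def arc_class(retention_time_min: float) -> str:
    """Return the ARC-class label for a retention time given in minutes.

    Assumption: each upper limit is exclusive, so a time of exactly
    120 min is placed in ARC-2. The text does not specify this.
    """
    if retention_time_min < 0:
        raise ValueError("retention time must be non-negative")
    for upper_limit, label in ARC_BOUNDARIES:
        if retention_time_min < upper_limit:
            return label
    # Anything retained beyond 480 min is back-flushed and lumped together.
    return "ARC-4+"


if __name__ == "__main__":
    measured = [
        ("toluene", 44),
        ("naphthalene", 174),
        ("phenanthrene", 385),
        ("benzonaphthothiophene", 507),
    ]
    for name, minutes in measured:
        print(f"{name}: {arc_class(minutes)}")
```

With these inputs the sketch places toluene in ARC-1, naphthalene in ARC-2, phenanthrene in ARC-3, and benzonaphthothiophene in ARC-4+.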

3. Calculation of Molecular Descriptors

The development of any predictive model requires a postulate of the underlying physical phenomenon and identification of the key physical variables. Retention in the chromatographic column is a molecular adsorption phenomenon where the different solute molecules adsorb to varying extents on the column depending upon the extent of interaction between the solute and the column solid phase. We believe the nature of this interaction to be a charged interaction where the electronic structure of the solute would be expected to play a critical role. Further, the strength of such an interaction is expected to be shape-specific, implying that certain molecular geometries would be more favored than others. Therefore, we assert that molecular parameters that characterize the electronic structure, shape, and geometry of the solute should be the most important parameters that control the retention of different solutes on the chromatographic column.

Table 2. Different Molecular Descriptors Considered for Characterizing the Electronic Structure and the Molecular Geometry of the Molecule

Table 3. Computation of Various Molecular Descriptors for the Different Molecules in This Study

| chemical sample | ionization potential (eV) | electron affinity (eV) | molecular weight | ring count | valence connectivity index (order 1) | valence connectivity index (order 2) | measured retention time (min) |
|---|---|---|---|---|---|---|---|
| 1,3,5-tri-tert-butylbenzene | 9.326 | -0.508 | 246.435 | 1 | 6.982 | 8.549 | 21 |
| 1,4-di-tert-butylbenzene | 9.283 | -0.407 | 190.328 | 1 | 5.321 | 6.077 | 24 |
| dioctylthiophene | 9.124 | 0.08 | 308.564 | 1 | 10.363 | 7.201 | 25 |
| neo-pentylbenzene | 9.485 | -0.364 | 148.247 | 1 | 4.118 | 4.223 | 32 |
| p-dicyclohexylbenzene | 9.323 | -0.337 | 242.403 | 3 | 8.032 | 6.337 | 35 |
| tetralin | 9.248 | -0.427 | 132.205 | 2 | 4.034 | 2.976 | 42 |
| nC18-benzene | 9.457 | -0.354 | 330.596 | 1 | 10.971 | 7.539 | 42 |
| n-hexylbenzene | 9.41 | -0.384 | 162.274 | 1 | 4.971 | 3.296 | 43 |
| toluene | 9.442 | -0.376 | 92.14 | 1 | 2.411 | 1.655 | 44 |
| mesitylene | 9.281 | -0.438 | 120.194 | 1 | 3.232 | 2.665 | 47 |
| 1,2,3,4-tetramethylbenzene | 9.015 | -0.423 | 134.221 | 1 | 3.661 | 2.949 | 110 |
| durene | 8.957 | -0.382 | 134.221 | 1 | 3.655 | 3.021 | 129 |
| 1,3-diphenylpropane | 9.473 | -0.296 | 196.291 | 2 | 5.528 | 3.825 | 136 |
| indene | 8.971 | 0.077 | 116.162 | 2 | 3.211 | 2.305 | 140 |
| dodecahydrotriphenylene | 8.856 | -0.509 | 240.388 | 4 | 8.121 | 6.493 | 145 |
| phenylhexadecyl sulfide | 8.557 | -0.09 | 334.602 | 1 | 11.096 | 7.926 | 146 |
| 1,2-di-p-tolylethane | 9.17 | -0.231 | 210.318 | 2 | 5.85 | 4.471 | 147 |
| 6-benzyltetrahydronaphthalene | 9.125 | -0.268 | 222.329 | 3 | 6.563 | 4.973 | 149 |
| biphenyl | 9.145 | 0.142 | 154.211 | 2 | 4.071 | 2.732 | 149 |
| 4,4′-dimethylbiphenyl | 8.867 | 0.131 | 182.265 | 2 | 4.893 | 3.732 | 164 |
| naphthalene | 8.836 | 0.408 | 128.173 | 2 | 3.405 | 2.347 | 174 |
| hexamethylbenzene | 8.865 | -0.479 | 162.274 | 1 | 4.5 | 3.75 | 181 |
| benzothiophene | 8.797 | 0.467 | 134.195 | 2 | 3.769 | 2.906 | 195 |
| 1,2,3,6,7,8-hexahydropyrene | 8.354 | 0.363 | 208.302 | 4 | 6.486 | 5.225 | 197 |
| phenyl disulfide | 9.234 | 2.154 | 218.331 | 2 | 6.546 | 5.069 | 199 |
| 2,2′-bithiophene | 8.977 | 1.016 | 166.255 | 2 | 4.8 | 3.85 | 203 |
| 2,6-dimethylnaphthalene | 8.671 | 0.341 | 156.227 | 2 | 4.226 | 3.354 | 222 |
| acenaphthene | 8.59 | 0.352 | 154.211 | 3 | 4.445 | 3.432 | 245 |
| dibenzofuran | 9.019 | 0.476 | 168.195 | 3 | 4.313 | 3.094 | 256 |
| fluorene | 8.842 | 0.335 | 166.222 | 3 | 4.612 | 3.491 | 262 |
| 2-(1-naphthylmethyl)-1,2,3,4-tetrahydronaphthalene | 8.763 | 0.454 | 272.389 | 4 | 7.934 | 6.22 | 334 |
| 1,2,3,4-tetrahydrophenanthrene | 8.616 | 0.35 | 182.265 | 3 | 5.445 | 4.131 | 336 |
| 1,2,3,4-tetrahydro-1,2-binaphthyl | 8.771 | 0.458 | 258.362 | 4 | 7.461 | 5.777 | 338 |
| nC6-dibenzothiophene | 8.511 | 0.596 | 268.416 | 3 | 8.101 | 6.323 | 339 |
| 2-benzylnaphthalene | 8.79 | 0.404 | 218.298 | 3 | 5.933 | 4.344 | 340 |
| acenaphthylene | 9.056 | 1.062 | 152.195 | 3 | 4.149 | 3.126 | 348 |
| 2-phenylnaphthalene | 8.743 | 0.525 | 204.271 | 3 | 5.476 | 3.928 | 358 |
| dibenzothiophene | 8.598 | 0.628 | 184.255 | 3 | 5.129 | 4.178 | 366 |
| 4,6-dimethyldibenzothiophene | 8.46 | 0.609 | 212.309 | 3 | 5.963 | 5.082 | 377 |
| phenanthrene | 8.741 | 0.535 | 178.233 | 3 | 4.815 | 3.508 | 385 |
| naphthothiophene | 8.787 | 0.839 | 184.255 | 3 | 5.18 | 4.067 | 418 |
| 4H-cyclopentaphenanthrene | 8.656 | 0.515 | 190.244 | 4 | 5.356 | 4.273 | 425 |
| 2,3-benzofluorene | 8.585 | 0.612 | 216.282 | 4 | 6.017 | 4.69 | 432 |
| benzonaphthothiophene | 8.456 | 0.876 | 234.315 | 4 | 6.54 | 5.314 | 507 |

A number of molecular descriptors that characterize various aspects of the electronic and molecular geometry of all the different molecules considered in this study were computed. The complete list of such descriptors is summarized in Table 2. Ionization potential (IP), electron affinity (EA), electron gap, and the dipole moment were used to quantify the electronic structure of each molecule, whereas molecular weight, physical dimensions such as length, breadth, width, and volume, total number of rings, and various topological indices such as the Wiener index and valence connectivity index of different orders were used to quantify geometric effects. Although the chemical formula of the molecule aids in calculating many geometrical descriptors such as molecular weight, number of rings, and the various topological indices, computation of the physical dimensions of the molecule and its electronic descriptors requires a more detailed quantum mechanical calculation. We used the CAChe commercial software12 to compute these molecular descriptors. The geometry of each molecule was optimized in MOPAC using the semiempirical PM3 Hamiltonian function.12 The geometry calculations were performed multiple times with small perturbations to the structure to ensure that the optimized structures were indeed at their global minimum. (The term global minimum is used loosely here and merely refers to the best minimum found over multiple iterations. In general, it is not possible to theoretically guarantee a global minimum for nonlinear functions except under special circumstances such as convexity, pseudoconvexity, etc.)

Figure 3. Comparison of predicted retention times against measured values. The molecules are categorized into different ARC-classes as indicated by different symbols. The 15% error bands are indicated by the dotted lines. The inset shows the ARC-1 results in detail.

Table 4. Parameter Values for Eq 1

| a0 | a1 | a2 | a3 | a4 | a5 | a6 |
|---|---|---|---|---|---|---|
| 5.1128 | -0.5236 | 0.1461 | 0.0118 | 0.0916 | -0.2232 | -0.1868 |

The descriptors listed in Table 2 convey information about the electronic structure and the molecular geometry of the molecule. However, not all of them are equally important to predict the retention time. Partial Least Squares (PLS) was used to identify the significant descriptors affecting retention time behavior. PLS is a mathematical algorithm that identifies the significant variables from a multidimensional independent variable space that exhibit the most correlation with the dependent variable space. It helps reduce the dimensionality of the problem with minimal loss of information. The results of the PLS analysis are presented in Figure 1, and only the most significant descriptors are shown here along with their relative contribution.
The contributions from the other descriptors were negligible, and consequently they are not shown in this figure. The PLS analysis identifies the following six descriptors to be the most significant ones: (a) EA, (b) IP, (c) molecular weight (MW), (d) number of rings (Rc), (e) valence connectivity index of order 1, (1)χv, and (f) valence connectivity index of order 2, (2)χv. Further, the results also indicate that EA and IP play the most dominant role, accounting for almost 60% of the correlation of molecular descriptors with retention time, while the molecular geometry plays a secondary role in controlling the retention behavior of different molecules in the HPLC column. These molecular descriptors are defined below:

(i) Ionization potential and electron affinity. IP is the energy required to completely remove an electron from an atom. As would be expected, this energy depends on the energy level of the orbital in which the electron resides and also depends on whether the energy level is full or half full. In contrast, the EA is the energy released when an atom picks up an electron.

(ii) Molecular weight. The total mass of 1 mol of the molecule (compound).

(iii) Total number of rings. The total number of rings, which includes all polygonal rings in the molecule. For example, the total number of rings in both naphthalene and tetralin is two, whereas it is three for dibenzothiophene.

(iv) Valence connectivity index of order 1 and 2.13 The valence connectivity index, typically represented by χ, is a graph theoretical index that characterizes the general topology and shape of the molecule. Specifically, it quantifies the skeletal makeup of the molecule and the degree and type of branching in it based on the connectivity matrix of the molecule. It distinguishes between the different types of atoms present in the molecule by taking into account the electronic environment of the valence shell of the different atoms. Depending on the exact definitions of the connectivity measure of the molecule, various valence connectivity indices of different orders may be defined.
(12) CAChe Modeling Software; Fujitsu Inc. http://www.cachesoftware.com/cache/index.shtml.
(13) Hall, L. H.; Kier, L. B. Issues in representation of molecular structure: The development of molecular connectivity. J. Mol. Graphics Modell. 2001, 20, 4.

Figure 4. Molecular structure of the different PAH cores considered in Table 5. Here, the number in parentheses refers to the index of the molecule in Table 5, not the retention time. For example, (3) here refers to core-3 in Table 5.

Figure 5. Prediction of retention times for phenanthrene homologous series. The two dotted lines indicate the ARC-3 region. A molecule is classified as ARC-3 if its retention time is between 330 min and 480 min.

The mathematical definition of the valence connectivity index and its different orders is summarized in the Appendix at the end of the article. Some general observations of these indices are presented in Figure 2, using benzene as the illustrative molecule. The figure shows the valence connectivity indices of order 1, (1)χv, and order 2, (2)χv, for benzene with its different substituents. Note that:
(a) Both (1)χv and (2)χv increase with increase in the length of the alkyl substituent.
(b) For a fixed number of substituents, (1)χv decreases with increased branching; in contrast, (2)χv increases with branching.
(c) For a fixed number of substituents and the same degree of branching, (1)χv decreases with increased methylation on the aromatic ring; in contrast, (2)χv increases.
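Indices of this kind can be computed directly from the hydrogen-suppressed molecular graph. The sketch below assumes the standard Kier-Hall definitions (ref 13), in which (1)χv sums (δi δj)^(-1/2) over all bonds and (2)χv sums (δi δj δk)^(-1/2) over all two-bond paths, with δv = 4 − h for a carbon atom bearing h hydrogens. The Appendix is not reproduced in this section, so agreement with it is assumed rather than confirmed. The function name `valence_chi` and the atom numbering are illustrative.

```python
from itertools import combinations
from math import sqrt


def valence_chi(delta_v, bonds):
    """Return ((1)chi_v, (2)chi_v) for a hydrogen-suppressed molecular graph.

    delta_v: valence delta value of each heavy atom
             (for carbon: 4 minus the number of attached hydrogens).
    bonds:   (i, j) index pairs of bonded heavy atoms.
    """
    neighbors = {i: set() for i in range(len(delta_v))}
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)

    # Order 1: one term per bond.
    chi1 = sum(1.0 / sqrt(delta_v[i] * delta_v[j]) for i, j in bonds)

    # Order 2: one term per two-bond path i-c-k, enumerated by central atom c.
    chi2 = 0.0
    for c, nbrs in neighbors.items():
        for i, k in combinations(sorted(nbrs), 2):
            chi2 += 1.0 / sqrt(delta_v[i] * delta_v[c] * delta_v[k])
    return chi1, chi2


# Benzene: six aromatic CH carbons (delta_v = 3) joined in a ring.
benzene_delta = [3] * 6
benzene_bonds = [(i, (i + 1) % 6) for i in range(6)]

# Toluene: ring atoms 0-5 (atom 0 carries the methyl group), atom 6 is CH3.
toluene_delta = [4, 3, 3, 3, 3, 3, 1]
toluene_bonds = benzene_bonds + [(0, 6)]

if __name__ == "__main__":
    for name, d, b in [("benzene", benzene_delta, benzene_bonds),
                       ("toluene", toluene_delta, toluene_bonds)]:
        chi1, chi2 = valence_chi(d, b)
        print(f"{name}: chi1 = {chi1:.3f}, chi2 = {chi2:.3f}")
```

Under these assumptions the sketch gives (1)χv = 2.000 and (2)χv = 1.155 for benzene, and 2.411 and 1.655 for toluene, which agree with the entries for core-3 in Table 5 and for toluene in Table 3.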

Table 5. Computation of Various Molecular Descriptors for the Different Core Structures and ARC-Class Prediction by the Model

| chemical sample | ionization potential (eV) | electron affinity (eV) | molecular weight | ring count (all rings) | valence connectivity index (order 1) | valence connectivity index (order 2) | predicted retention time (min) | predicted ARC-class (a) |
|---|---|---|---|---|---|---|---|---|
| core-1 | 9.249 | -0.432 | 186.296 | 3 | 6.028 | 4.889 | 53.69 | 1 |
| core-2 | 9.248 | -0.427 | 132.205 | 2 | 4.034 | 2.976 | 63.29 | 1 |
| core-3 | 9.751 | -0.396 | 78.113 | 1 | 2 | 1.155 | 40.21 | 1 |
| core-4 | 8.602 | 0.26 | 234.34 | 4 | 7.152 | 5.594 | 280.46 | 2 |
| core-5 | 9.125 | -0.268 | 222.329 | 3 | 6.563 | 4.973 | 128.97 | 2 |
| core-6 | 9.002 | -0.462 | 294.479 | 5 | 10.061 | 8.581 | 53.65 | 1 |
| core-7 | 9.298 | -0.413 | 240.388 | 4 | 8.011 | 6.794 | 43.74 | 1 |
| core-8 | 8.646 | 0.395 | 242.378 | 4 | 7.796 | 6.647 | 158.24 | 2 |
| core-9 | 8.609 | 0.398 | 188.287 | 3 | 5.809 | 4.69 | 197.67 | 2 |
| core-10 | 8.797 | 0.467 | 134.195 | 2 | 3.769 | 2.906 | 183.61 | 2 |
| core-11 | 8.75 | 0.475 | 204.271 | 3 | 5.482 | 3.893 | 441.11 | 3 |
| core-12 | 8.434 | 0.298 | 290.447 | 5 | 9.479 | 7.829 | 229.42 | 2 |
| core-13 | 8.633 | 0.351 | 236.356 | 4 | 7.438 | 6.044 | 209.37 | 2 |
| core-14 | 9.145 | 0.142 | 154.211 | 2 | 4.071 | 2.732 | 172.33 | 2 |
| core-15 | 8.616 | 0.35 | 182.265 | 3 | 5.445 | 4.131 | 250.96 | 2 |
| core-16 | 8.836 | 0.408 | 128.173 | 2 | 3.405 | 2.347 | 223.46 | 2 |
| core-17 | 8.342 | 0.706 | 256.346 | 5 | 7.571 | 6.074 | 658.00 | 4 |
| core-18 | 8.493 | 0.436 | 286.416 | 5 | 8.89 | 7.12 | 368.18 | 3 |
| core-19 | 8.725 | 1.044 | 202.255 | 4 | 5.565 | 4.255 | 527.63 | 4 |
| core-20 | 8.622 | 0.486 | 232.324 | 4 | 6.856 | 5.292 | 370.70 | 3 |
| core-21 | 8.651 | 0.511 | 204.271 | 4 | 5.856 | 4.596 | 378.83 | 3 |
| core-22 | 8.741 | 0.535 | 178.233 | 3 | 4.815 | 3.508 | 371.90 | 3 |
| core-23 | 8.652 | 0.504 | 374.558 | 6 | 11.702 | 9.712 | 338.43 | 3 |
| core-24 | 8.842 | 0.335 | 166.222 | 3 | 4.612 | 3.491 | 248.06 | 2 |
| core-25 | 8.377 | 0.507 | 292.438 | 5 | 9.21 | 7.725 | 334.24 | 3 |
| core-26 | 8.627 | 0.924 | 320.466 | 5 | 9.549 | 8.224 | 414.14 | 3 |
| core-27 | 8.439 | 0.569 | 238.347 | 4 | 7.17 | 5.965 | 356.87 | 3 |
| core-28 | 8.663 | 1.192 | 266.375 | 4 | 7.509 | 6.465 | 488.78 | 3 |
| core-29 | 8.598 | 0.628 | 184.255 | 3 | 5.129 | 4.178 | 342.75 | 3 |

(a) A molecule is classified as ARC-1 if retention time is less than 120 min, ARC-2 if it is between 120 and 330 min, ARC-3 if it is between 330 and 480 min, and ARC-4 if it is greater than 480 min.

Table 6. Effect of the Length of Biaryl Bridge on the Retention Time (a)
(a) The numbers correspond to the retention time in minutes.

Table 3 summarizes the computed values of these descriptors for the different molecules that were studied in this work (refer to Table 1 for their structures). Also included in this table is the measured retention time in minutes for these molecules in the HPLC column. Using these data, a QSPR model between the molecular descriptors and the retention time was developed, which is shown in eq 1. Equation 1 is a log-linear relationship, and although various other functional forms can also be considered, we found that this functional form yielded the highest correlation coefficient (i.e., R2 value) between the molecular descriptors and the retention time. The log-linear functional form was identified using genetic algorithms (specifically genetic function approximations), a stochastic search algorithm run to identify the most appropriate mathematical structure among the various alternative mathematical structures.

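$$\log_{10}\left[\frac{t_r}{t_o}\right] = a_0 + a_1(\mathrm{IP}) + a_2(\mathrm{EA}) + a_3(\mathrm{MW}) + a_4(R_c) + a_5\left({}^{(1)}\chi^{v}\right) + a_6\left({}^{(2)}\chi^{v}\right) \tag{1}$$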

In eq 1, tr is the retention time of the solute molecule, to is the retention time of a nonadsorbing solute molecule (i.e., a solute that does not have any measurable interaction with the column, e.g., olefins), and ai (i = 0-6) are the fitting parameters of the model to be regressed from the experimental data. Interestingly, the ratio [tr/to] used in eq 1 can be related to the partition coefficient of the solute molecule between the stationary and mobile phases on the chromatographic column. The partition coefficient is defined as the ratio of the number of moles of a solute molecule in the stationary adsorbed phase to the number of moles in the mobile phase and represents a widely used quantitative measure of separation. To see this, assume that ns and nm are the moles of a particular solute molecule in the stationary and mobile phase, respectively. Further, let us assume that ux denotes the average velocity with which the solute band moves in the column so that if L were the length of the chromatographic column, the observed retention time for the solute would be [L/ux]. Furthermore, if u denotes the interstitial velocity in the column, that is, the velocity with which a nonadsorbing solute molecule will move within the interstices of the column bed, and us the velocity of an adsorbed solute in the bed, mass balance yields

$$u_x = \frac{n_s u_s + n_m u}{n_s + n_m} = \frac{n_m u}{n_s + n_m} \tag{2}$$

Since an adsorbed solute is stationary, us = 0. Therefore, eq 2 may be rearranged as

$$t_r = \frac{L}{u_x} = \left(1 + \frac{n_s}{n_m}\right)\frac{L}{u} = \left(1 + \frac{n_s}{n_m}\right) t_o \tag{3}$$
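Equations 1 and 3 can be evaluated numerically. The sketch below is a minimal illustration, assuming the coefficient values listed in Table 4 and the benzene descriptors listed for core-3 in Table 5; the function names are hypothetical. Because the value of to is not given in this section, the sketch stops at the ratio tr/to and the corresponding ns/nm rather than an absolute retention time.

```python
# Coefficients a0..a6 as listed in Table 4.
A = (5.1128, -0.5236, 0.1461, 0.0118, 0.0916, -0.2232, -0.1868)


def log10_relative_retention(ip, ea, mw, rings, chi1, chi2):
    """Right-hand side of eq 1, i.e. log10(tr/to)."""
    a0, a1, a2, a3, a4, a5, a6 = A
    return (a0 + a1 * ip + a2 * ea + a3 * mw
            + a4 * rings + a5 * chi1 + a6 * chi2)


def stationary_to_mobile_ratio(relative_retention):
    """Invert eq 3: ns/nm = tr/to - 1."""
    return relative_retention - 1.0


# Benzene descriptors, taken from the core-3 row of Table 5.
benzene = dict(ip=9.751, ea=-0.396, mw=78.113, rings=1, chi1=2.0, chi2=1.155)

log_ratio = log10_relative_retention(**benzene)
ratio = 10.0 ** log_ratio            # tr/to
ns_over_nm = stationary_to_mobile_ratio(ratio)

print(f"log10(tr/to) = {log_ratio:.4f}")
print(f"tr/to        = {ratio:.3f}")
print(f"ns/nm        = {ns_over_nm:.3f}")
```

For benzene the right-hand side of eq 1 evaluates to about 0.3005, that is, a ratio tr/to close to 2. Multiplying the ratio by a measured to would give the absolute retention time.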

Equation 3 shows that the ratio [tr/to] is related to the ratio of the number of moles of the solute in the stationary phase to that in the mobile phase, that is, the partition coefficient of the solute.

4. Results and Discussion

To obtain the different ai parameters in eq 1, a least squares error minimization problem was solved, and the final optimized values are tabulated in Table 4. To ensure that the regressed parameters were adequately reliable and robust for the model to be used in an extrapolative predictive mode, the dataset was partitioned into a training set (90% of the samples) and a testing set (10% of the samples), and the errors were compared against many such partitions (total partitions considered = 200). Figure 3 compares the results of the model predictions against the experimental data set. The figure reveals a good quantitative agreement between the model-predicted and experimentally observed retention times with a correlation coefficient (R2) of 0.905 in the training set and 0.831 in the cross-validation set. The average error of the model is about 10% for ARC-1 molecules and about 15% for both ARC-2 and ARC-3 molecules. These results reveal that simple molecular descriptors that are computationally inexpensive to calculate can describe a complex phenomenon such as retention behavior.

Table 7. Prediction of Retention Times and ARC-Classes of the Different Alkylated Structures Resulting from the PAH Cores Shown in Figure 4 (a)
(a) The numbers correspond to the retention time in minutes. The elution ARC-class is shown in parentheses.

Table 8. Effect of Naphthenes for the Various Ring Structures (a)
(a) The numbers correspond to the retention time in minutes. The elution ARC-class is shown in parentheses.

The ability to quantitatively predict the retention time of different AHs and PAHs is certainly the most desirable requirement of the predictive model. However, more frequently, the question of interest is not so much the actual retention time for the AH or PAH molecule but rather which ARC-class the molecule will elute in. Therefore, the model was used to predict the ARC-class of typical PAH molecules present in heavy petroleum fractions. These molecules include most of the aromatic cores that are analytically measurable in the heavy petroleum fractions. Further, the effect of substituents on the PAH retention time behavior was also investigated using the QSPR model (e.g., addition of alkylated side chains, introduction of branching, or addition of naphthenic rings, etc.).

Figure 4 shows the 29 different PAH cores considered for this purpose, which include one- to five-membered rings and many heteroatom-containing molecules. Many of these PAH cores can seldom be obtained in quantities large enough to experimentally measure their retention time in the column, and hence a predictive model is required to quantify their retention behavior. Table 5 summarizes the computed molecular descriptors along with the predicted retention times for these PAH cores. On the basis of the retention times, the ARC-class assignments of these cores are also shown in the table.
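The repeated-partition check described at the start of this section can be sketched as follows. This is a deliberately reduced illustration and not the authors' procedure: it fits log10 of the measured retention time against a single descriptor (electron affinity, both columns taken from Table 3) by ordinary least squares instead of the six-descriptor model of eq 1. The function names, the random seed, and the use of mean absolute error in log10 units are all assumptions.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def repeated_holdout(xs, ys, n_partitions=200, test_fraction=0.1, seed=0):
    """Return the mean absolute test error of each random 90/10 partition."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_test = max(1, round(len(xs) * test_fraction))
    errors = []
    for _ in range(n_partitions):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        b0, b1 = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors.append(sum(abs(b0 + b1 * xs[i] - ys[i]) for i in test) / n_test)
    return errors


# Electron affinity (eV) and measured retention time (min), from Table 3.
EA = [-0.508, -0.407, 0.08, -0.364, -0.337, -0.427, -0.354, -0.384, -0.376,
      -0.438, -0.423, -0.382, -0.296, 0.077, -0.509, -0.09, -0.231, -0.268,
      0.142, 0.131, 0.408, -0.479, 0.467, 0.363, 2.154, 1.016, 0.341, 0.352,
      0.476, 0.335, 0.454, 0.35, 0.458, 0.596, 0.404, 1.062, 0.525, 0.628,
      0.609, 0.535, 0.839, 0.515, 0.612, 0.876]
RT = [21, 24, 25, 32, 35, 42, 42, 43, 44, 47, 110, 129, 136, 140, 145, 146,
      147, 149, 149, 164, 174, 181, 195, 197, 199, 203, 222, 245, 256, 262,
      334, 336, 338, 339, 340, 348, 358, 366, 377, 385, 418, 425, 432, 507]

log_rt = [math.log10(t) for t in RT]
errors = repeated_holdout(EA, log_rt)
print(f"{len(errors)} partitions, mean test error = "
      f"{sum(errors) / len(errors):.3f} log10 units")
```

Comparing the spread of the per-partition errors, rather than a single split, is what indicates whether the fitted parameters are stable with respect to the choice of training molecules.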
The actual molecular structure corresponding to each PAH core can be read from Figure 4, where its index is given in parentheses. For example, core-2 represents tetralin, core-3 represents benzene, and so on. The table shows that each PAH core is assigned to a unique ARC-class. However, interesting deviations from these class assignments can happen as different substituents are added to the basic PAH core. For instance, increasing the length of the biaryl bridge for core-5 (which is assigned to ARC-class 2; see Table 5) decreases the retention time. Table 6 shows this effect. For the biaryl bridge length containing one and two carbon atoms, the molecules elute in ARC-2. However, as the bridge becomes longer (e.g., four carbon atoms), the molecules elute in ARC-1, indicating that the two aromatic rings tend to behave independently of each other as the rings get separated by longer alkyl chains.

Table 9. Effect of the Position of Naphthenic Substitution on Naphthalene (a)
(a) The numbers correspond to the retention time in minutes. The elution ARC-class is shown in parentheses.

Next, we use the model to investigate the effect of side chains on the retention behavior of the different PAH cores by adding different alkyl side chains to each PAH core. A typical result for the phenanthrene core is presented in Figure 5. The figure shows the variation in the retention time of phenanthrene with increase in alkyl chain length. The retention time increases for the first few molecules containing methyl substituents and then drops as the length of the alkyl substitution increases. This suggests that alkylation with methyl groups increases the retention time of the molecule, while alkylation with longer chain groups such as propyl, butyl, pentyl, and so on progressively decreases it. Also, the initial steep decrease in the retention time with increasing alkylation eventually plateaus out as the alkyl chains grow longer. The effect may be rationalized as follows: The methyl group is an electron-donating group; consequently, alkylation with methyl groups leads to increased electron density on the aromatic ring, resulting in increased aromatic character of the ring and hence the longer retention time. In contrast, as the number of carbon atoms in the alkyl group increases (i.e., going from methyl to ethyl, propyl, butyl, and so on), the electron-donating ability of the group progressively decreases. In other words, increasing paraffinic character reduces electron donation, and hence the decrease in retention time with alkylation. Further, since the incremental decrease in electron-donating ability with the addition of each CH2 group in the alkyl chain decreases steeply initially, eventually leading to a plateau, a similar trend is observed in the retention behavior.
The increase or decrease in retention time with alkylation can have important consequences, as it suggests that the entire homologous series may not elute in the same ARC-class. For example, Figure 5 shows that although phenanthrene elutes in ARC-3, some molecules in the phenanthrene homologous series (specifically molecules 4, 5, and 6) are very close to the boundary of ARC-3 and ARC-4 and might elute in ARC-4, while the rest elute in ARC-3. Similar computations were performed for all the other PAH cores shown in Figure 4, and the results are summarized in Table 7. Each row in this table corresponds to a particular core structure and its homologous series. The numbers below each structure represent the predicted retention time (in minutes) and the corresponding ARC-class assignment (in parentheses). Although most of the PAH cores and their homologous series tend to elute in the same ARC-class, there are a few cores for which some of the molecules in the homologous series elute in a class different from that of the core. This happens for cores 8, 14, 20, 22, 25, and 27. These structures may be termed "overlapping structures" since they overlap across the different ARC-classes.

The effect of naphthenic rings on the retention time was also studied, and some typical results for a few PAH cores are illustrated in Table 8. Analogous to alkylation, the retention time initially increases and then gradually decreases with the addition of naphthenic rings. The position of the naphthenic ring attachment also has an effect on the final retention time. This is shown in Table 9 for naphtheno-naphthalene, where the retention times are quite different (214.85 and 288.03 min, respectively), depending on whether the second naphthenic ring is attached to the aromatic ring or to the first naphthenic ring.

Finally, the effect of replacing a paraffinic bond with an olefinic bond in the PAH core was also studied, and the results are illustrated in Table 10. The results show that the presence of an olefinic bond may move the molecule to the next ARC-class. For example, indene (with a retention time of 138 min) will elute in ARC-2, whereas indane (with a retention time of 68 min) will elute in ARC-1. Likewise, ethylbenzene (65 min) is ARC-1, whereas styrene (124 min) is ARC-2. Acenaphthylene (348 min) is at the border of ARC-2 and ARC-3, whereas its paraffinic equivalent acenaphthene (273 min) is strictly ARC-2.

Table 10. Prediction of Retention Times for Some Other Important Molecules^a

^a The numbers correspond to the retention time in minutes. The elution ARC-class is shown in parentheses.

5. Conclusions

The retention time behavior of various AH and PAH molecules was studied in this work. QSPR analysis was used to investigate relationships between the AH and PAH molecular structure and its chromatographic retention behavior. It was found that the electronic structure of the molecule and its molecular geometry controlled its retention on the column. A predictive quantitative model was developed based on the molecular descriptors that characterized the molecule's electronic structure and its geometry. The model was used to predict the retention time and investigate the ARC-class assignment for the various AHs and PAHs typically encountered in heavy petroleum streams. The model was also employed to investigate the effect of different substituents on the bare PAH cores. It was found that alkylation with methyl groups increased the retention time, while alkylation with longer chains (≥ two carbons) decreased it. Addition of naphthenic rings initially increased the retention time, followed by a gradual reduction of the retention time as more rings were added. The position of the naphthenic ring also affected the retention time of the molecule. Finally, addition of an olefinic bond always increased the retention time.

Acknowledgment. We wish to thank Clint Kennedy, Roland Saeger, and Larry Green for many useful discussions during the development of the model.

Appendix: Definition of Valence Connectivity Index

The valence connectivity index ($\chi^v$) is a topological index that encodes the skeletal geometry of the molecule using a graph theoretic approach with explicit accounting for the valence state of the different atoms. (A graph theoretic representation of a molecule is an abstraction where each atom is considered a node and each bond is considered an edge.) To compute $\chi^v$, we first need to define the valence delta value $\delta^v$, given by

$$\delta^v = \frac{Z^v - h}{Z - Z^v - 1} \tag{A.1}$$

$$Z^v = \sigma + \pi + n \tag{A.2}$$

where σ, π, and n are the numbers of valence sigma, pi, and lone-pair electrons in the atom, h is the count of bonded hydrogens, and Z is the atomic number. Thus, for instance, in alkanes, where each carbon is sp3 hybridized, the $\delta^v$ value of each carbon atom is 4 - h, where h = 3, 2, 1, or 0 depending on whether the carbon is primary, secondary, tertiary, or quaternary. The following table shows the computed $\delta^v$ values for a few selected examples.

Group        -CH3   -CH2-   =CH2   -NH2   >N-   =O   -F   -Cl    -S-
$\delta^v$   1      2       2      3      5     6    7    0.78   0.56
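
As a quick check on eqs A.1 and A.2, the short Python sketch below recomputes a few of the tabulated values. The function name `valence_delta` and the way the electron counts are passed in are illustrative choices, not part of the original work. The sigma count is assumed to include the electrons the atom contributes to its bonds with hydrogen.

```python
def valence_delta(sigma, pi, lone, h, Z):
    """Valence delta of an atom from eqs A.1 and A.2.

    sigma, pi, lone: valence sigma, pi, and lone-pair electron counts
    h: number of bonded hydrogens
    Z: atomic number
    """
    z_v = sigma + pi + lone           # eq A.2
    return (z_v - h) / (Z - z_v - 1)  # eq A.1


# Hypothetical helper table: (sigma, pi, lone, h, Z) for a few groups.
examples = {
    "-CH3": (4, 0, 0, 3, 6),
    "=CH2": (3, 1, 0, 2, 6),
    "-NH2": (3, 0, 2, 2, 7),
    "=O": (1, 1, 4, 0, 8),
    "-Cl": (1, 0, 6, 0, 17),
}

for group, args in examples.items():
    print(f"{group:5s} {valence_delta(*args):.2f}")
```

For second-row atoms the denominator Z - Z^v - 1 equals 1, so the expression reduces to Z^v - h; for chlorine it equals 9, which gives the fractional value 7/9 ≈ 0.78 listed in the table.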



Once the $\delta^v$ values are computed, the valence connectivity index of order 1 may be computed as

$${}^{1}\chi^{v} = \sum_{k} \left(\delta^v_i \, \delta^v_j\right)_k^{-1/2} \tag{A.3}$$

where the $\delta^v$ values are taken pairwise for all subgraphs having one edge (i.e., for each bond joining atoms i and j), and in general, for order t, as

$${}^{t}\chi^{v} = \sum_{l} \prod_{i} \left(\delta^v_i\right)_l^{-1/2} \tag{A.4}$$

where the $\delta^v$ values are taken t-tuplewise for all subgraphs with t edges.

EF0502305
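
The sums in eqs A.3 and A.4 can be illustrated with a minimal Python sketch. It is restricted to path-type subgraphs (simple paths with t edges), which is an assumption made here for brevity; eq A.4 as written covers all subgraphs with t edges. The function name `chi_path` and the hydrogen-suppressed atom and bond lists are hypothetical inputs, not taken from the original work.

```python
from math import prod, sqrt


def chi_path(deltas, bonds, t):
    """Path-type valence connectivity index of order t.

    deltas: valence delta value of each heavy atom, indexed from 0
    bonds: list of (i, j) index pairs, one per bond (graph edge)
    t: number of edges in each path; t = 1 reproduces eq A.3
    """
    neighbors = {i: set() for i in range(len(deltas))}
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)

    paths = set()

    def extend(path):
        if len(path) == t + 1:
            # Store each undirected path once, whichever end it started from.
            paths.add(min(tuple(path), tuple(reversed(path))))
            return
        for nxt in neighbors[path[-1]]:
            if nxt not in path:
                extend(path + [nxt])

    for start in range(len(deltas)):
        extend([start])

    # One term per path: the product of the delta values raised to -1/2.
    return sum(1.0 / sqrt(prod(deltas[i] for i in p)) for p in paths)


# n-Butane, CH3-CH2-CH2-CH3: delta values 1, 2, 2, 1 from the table above.
butane_deltas = [1, 2, 2, 1]
butane_bonds = [(0, 1), (1, 2), (2, 3)]
print(chi_path(butane_deltas, butane_bonds, 1))
```

For n-butane the order-1 sum has three terms, one per bond: 1/√2 + 1/2 + 1/√2, or about 1.914.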