Anal. Chem. 2008, 80, 461-473
Effect of First-Dimension Undersampling on Effective Peak Capacity in Comprehensive Two-Dimensional Separations
Joe M. Davis*
Department of Chemistry and Biochemistry, Southern Illinois University at Carbondale, Carbondale, Illinois 62901
Dwight R. Stoll and Peter W. Carr
Department of Chemistry, University of Minnesota, Minneapolis, Minnesota 55455
The objective of this work is to establish a means of correcting the theoretical maximum peak capacity of comprehensive two-dimensional (2D) separations to account for the deleterious effect of undersampling first-dimension peaks. Simulations of comprehensive 2D separations of hundreds of randomly distributed sample constituents were carried out, and 2D statistical overlap theory was used to calculate an effective first-dimension peak width based on the number of observed peaks in the simulated separations. The distinguishing feature of this work is the determination of the effective first-dimension peak width using the number of observed peaks in the entire 2D separation as the defining metric of performance. We find that the ratio of the average effective first-dimension peak width after sampling to its width prior to sampling (defined as ⟨β⟩) is a simple function of the ratio of the first-dimension sampling time (ts) to the first-dimension peak standard deviation prior to sampling (¹σ):

$$\langle\beta\rangle = \sqrt{1 + 0.21\,(t_s/{}^{1}\sigma)^{2}}$$

This is valid for 2D separations of constituents having either randomly distributed or weakly correlated retention times, over the range 0.2 ≤ ts/¹σ ≤ 16. The dependence of ⟨β⟩ on ts/¹σ from this expression is in qualitative agreement with previous work based on the effect of undersampling on the effective width of a single first-dimension peak, but predicts up to 35% more broadening of first-dimension peaks than is predicted by previous models. This simple expression and accurate estimation of the effect of undersampling first-dimension peaks should be very useful in making realistic corrections to theoretical 2D peak capacities, and in guiding the optimization of 2D separations.

* To whom correspondence should be addressed. Current address: 733 Schloss St., Wrightsville Beach, NC 28480. Telephone: 910 256 4235. E-mail: [email protected].
10.1021/ac071504j CCC: $40.75 Published on Web 12/13/2007
© 2008 American Chemical Society

Multidimensional chromatographic separations have attracted intense interest over the past decade based on the tremendous potential improvements in resolving power they offer over their
one-dimensional counterparts.1-3 Comprehensive two-dimensional gas chromatography (GC × GC) has been used successfully in a number of application areas including petrochemical, food, beverage, and essential oil analysis. An excellent series of reviews of most of the experimental and theoretical aspects of GC × GC appeared recently.4-7 Comprehensive two-dimensional liquid chromatography (LC × LC) has become a mainstay of proteomics research, and it is fast becoming widely used in the analysis of other complex samples of nonvolatile species including low molecular weight extracts from plants, pharmaceuticals, and polymers. Several reviews of all aspects of LC × LC have appeared recently, including one by some of the authors of this paper.8-12

The currently accepted definition of “comprehensive” 2D separations implies that the same fraction of every constituent of the sample being analyzed is subject to analysis by both modes of the 2D separation system.13 In nearly all reports of comprehensive 2D separations, everything that elutes from the first-dimension column is quantitatively transferred to the second-dimension column. Assuming this mode of operation, the chromatographer is immediately faced with the need to decide how frequently aliquots of the first-dimension effluent should be transferred to the second-dimension column. Theoretical guidance in making this decision was first provided by the seminal study of Murphy et al.,14 who simulated the effect of the sample acquisition rate and transfer process on the effective first-dimension peak width. The approach used by Murphy et al. was to model a sampled first-dimension Gaussian peak as a histogram-like profile of the average concentration within each sampling time. Their main conclusion (which was supported by experimental separations) was that serious loss (i.e., more than ∼25%) of resolution gained in the first dimension prior to the sampling process can be avoided if at least four samples are acquired during the time equivalent to the 8 ¹σ peak width of a typical first-dimension peak, where ¹σ is the peak standard deviation. This work was extended by Seeley,15 who studied the problem in situations where the duty cycle of the sample acquisition device is less than 100%; that is, in those cases where the fraction of constituents transferred from the first dimension to the second dimension is consistent over the entire first-dimension run, but less than 100%. The approach used by Seeley was to model the sampled first-dimension Gaussian peak as a series of digital pulses. Seeley’s results for the 100% duty cycle case fully corroborated the initial findings of Murphy et al., albeit by different means. As a result, the guideline that first-dimension peaks should be sampled at least three to four times across their 8 ¹σ width has been recognized by most practitioners, although seldom followed, particularly in LC × LC experiments. The number of fractions per 8 ¹σ width varies widely in the LC × LC literature, from ∼3 (ref 16) to as few as ∼0.5 (ref 17). We have used this range to guide the theoretical work in the present study. The unfortunate reality is that this guideline is quite difficult to follow in practice because of the slow speed (long duration) of second-dimension separations compared to the temporal widths of first-dimension peaks.

(1) Giddings, J. C. Anal. Chem. 1984, 56, 1258A-1260A, 1262A, 1264A.
(2) Guiochon, G.; Beaver, L. A.; Gonnord, M. F.; Siouffi, A. M.; Zakaria, M. J. Chromatogr. 1983, 255, 415-437.
(3) Karger, B. L.; Snyder, L. R.; Horvath, C. An Introduction to Separation Science; Wiley and Sons: New York, 1973.
(4) Adahchour, M.; Beens, J.; Vreuls, R. J. J.; Brinkman, U. A. T. Trends Anal. Chem. 2006, 25, 821-840.
(5) Adahchour, M.; Beens, J.; Vreuls, R. J. J.; Brinkman, U. A. T. Trends Anal. Chem. 2006, 25, 726-741.
(6) Adahchour, M.; Beens, J.; Vreuls, R. J. J.; Brinkman, U. A. T. Trends Anal. Chem. 2006, 25, 540-553.
(7) Adahchour, M.; Beens, J.; Vreuls, R. J. J.; Brinkman, U. A. T. Trends Anal. Chem. 2006, 25, 438-454.
(8) Dixon, S. P.; Pitfield, I. D.; Perrett, D. Biomed. Chromatogr. 2006, 20, 508-529.
(9) Issaq, H. J.; Chan, K. C.; Janini, G. M.; Conrads, T. P.; Veenstra, T. D. J. Chromatogr., B: Anal. Technol. Biomed. Life Sci. 2005, 817, 35-47.
(10) Shalliker, R. A.; Gray, M. J. Adv. Chromatogr. 2006, 44, 177-236.
(11) Shellie, R. A.; Haddad, P. R. Anal. Bioanal. Chem. 2006, 386, 405-415.
(12) Stoll, D. R.; Li, X.; Wang, X.; Carr, P. W.; Porter, S. E. G.; Rutan, S. C. J. Chromatogr., A 2007, 1168, 3-43.
(13) Schoenmakers, P.; Marriott, P.; Beens, J. LCGC North Am. 2003, 16, 335-336, 338-339.

Analytical Chemistry, Vol. 80, No. 2, January 15, 2008 461
For a variety of fundamental and practical reasons,12 this problem is not as serious in GC × GC as in LC × LC. Many practitioners have also recognized that when the guideline is not followed, the effective peak capacity of the resulting 2D separation can be only a fraction of the theoretical maximum peak capacity (see below). Unfortunately, no systematic means of correcting the limiting peak capacity has been adopted by the 2D separations community, resulting in more than a half-dozen different correction methods being reported. This obviously makes it very difficult to compare results across studies.

The history of the development of the theory of multidimensional separations was recently reviewed in detail by Schure,18 who pointed out that Karger et al.3 originally defined a theoretical maximum (or limiting) peak capacity nc,2D (they actually referred to it as “fraction capacity”) as the product of the peak capacities of the first- and second-dimension separations (denoted here as ¹nc and ²nc, respectively). Later, both Guiochon2 and Giddings1 elaborated on this idea, resulting in eq 1, which is widely used today.

$$n_{c,2D} = {}^{1}n_{c}\,{}^{2}n_{c} \tag{1}$$

(14) Murphy, R. E.; Schure, M. R.; Foley, J. P. Anal. Chem. 1998, 70, 1585-1594.
(15) Seeley, J. V. J. Chromatogr., A 2002, 962, 21-27.
(16) Stoll, D. R.; Cohen, J. D.; Carr, P. W. J. Chromatogr., A 2006, 1122, 123-137.
(17) Wagner, K.; Miliotis, T.; Marko-Varga, G.; Bischoff, R.; Unger, K. K. Anal. Chem. 2002, 74, 809-820.
(18) Schure, M. R. In Multidimensional Liquid Chromatography: Theory, Instrumentation and Applications; Cohen, S. A., Schure, M. R., Eds.; Wiley: New York, 2008.
(19) Horie, K.; Kimura, H.; Ikegami, T.; Iwatsuka, A.; Saad, N.; Fiehn, O.; Tanaka, N. Anal. Chem. 2007, 79, 3764-3770.
We point out that although eq 1 represents the most widely used approach for calculating 2D peak capacities, even Giddings called this expression “very approximate”. For the sake of comparing the results of the current study to previous work, we have chosen to use eq 1 as the basis of our discussion. Recently Horie et al.,19 based on the results of Seeley,15 presented a method of calculating an effective 2D peak capacity, which takes into account the deleterious effect of slow sampling. They estimated an effective 2D peak capacity (n′c,2D) using a form of eq 2
$$n'_{c,2D} = n_{c,2D}\,\frac{{}^{1}\sigma}{\langle\sigma\rangle} \tag{2}$$

where ⟨σ⟩ is the average standard deviation of peaks along the first-dimension axis as observed in 2D separations. We will refer to ⟨σ⟩/¹σ as the first-dimension peak broadening factor that results from excessive undersampling of first-dimension peaks.

The objective of this work is to establish another means of correcting the theoretical maximum peak capacity nc,2D to account for the deleterious effect of slow sampling of first-dimension peaks. However, we approach the problem in a manner very different from that of the work19 described above. For most analytical problems where 2D separations have historically been employed (i.e., petrochemical and proteomic analyses), the primary objective in optimizing the separation conditions is to maximize the number of observed peaks that can be produced in a given analysis time. This motivated us to use the number of observed peaks in 2D separations as the metric in establishing a means of correcting the 2D peak capacity nc,2D for undersampling.

With this in mind, the basic approach we adopt is to count observed peaks in two kinds of simulated 2D separations, with one of them also interpreted by theory. We first define the size of the separation spaces, and the standard deviations of average first- and second-dimension peaks (i.e., we assume peak width is independent of retention, as is observed under many conditions in temperature-programmed GC and gradient elution reversed-phase LC20). In particular, we set the first-dimension peak standard deviation equal to ¹σ. We then simulate a comprehensive 2D separation of a given number of randomly distributed sample constituents as it is actually done experimentally. That is, the signal due to peaks eluting from the first-dimension column is integrated for a designated sampling time ts, and the resulting integrated signal is used to simulate an “injection” of the collected aliquot into the second-dimension column.
We count the number of observed peaks in the resulting comprehensive 2D separation under this particular set of conditions. To measure the broadening of the first-dimension peak, the expected number of observed peaks in the simulation then is predicted from the previously established statistical overlap theory for 2D separations21 using the known simulation parameters (e.g., number of constituents, standard deviation of the second-dimension peak, etc.), with one exception. The exception is the standard deviation of the first-dimension peak, which is systematically increased beyond ¹σ in the prediction until the calculated number of observed peaks matches the simulated number of observed peaks. The resultant first-dimension peak standard deviation, denoted here as ⟨σ⟩, determines the average first-dimension broadening factor ⟨β⟩, where

$$\langle\beta\rangle = \langle\sigma\rangle/{}^{1}\sigma \tag{3}$$

(20) Snyder, L. R.; Dolan, J. W. High-Performance Gradient Elution: The Practical Application of the Linear-Solvent-Strength Model; Wiley and Sons: Hoboken, NJ, 2007.
(21) Liu, S.; Davis, J. M. J. Chromatogr., A 2006, 1126, 244-256.
Thus, ⟨β⟩ has a meaning similar to the ratio ⟨σ⟩/¹σ of Horie et al., although it is determined very differently. To validate the ⟨β⟩ determined as described above, we then simulate a second 2D separation in which everything is kept the same as in the first, except that the first-dimension standard deviation is set equal to ⟨σ⟩, instead of ¹σ, and the first dimension is sampled both adequately and very rapidly (i.e., typically more than 40 samples per first-dimension peak). We refer to this simulation as an “ideally sampled” 2D separation, in which the broadening of the first-dimension peak due to undersampling is negligible. The number of observed peaks in such a simulation then is compared to the number of observed peaks in the simulated comprehensive 2D separation, with the hypothesis that ⟨β⟩ is a valid measure of first-dimension broadening if the peak numbers are the same. Repeating these calculations for various values of the ratio of the sampling time ts relative to the first-dimension peak standard deviation ¹σ (i.e., different ts/¹σ) allows us to obtain the dependence of ⟨β⟩ on this ratio. Finally, we use ⟨β⟩ to correct the limiting 2D peak capacity using eq 4

$$n'_{c,2D} = \left({}^{1}n_{c}/\langle\beta\rangle\right)\,{}^{2}n_{c} \tag{4}$$
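As a rough illustration of how eq 4 might be applied, the sketch below combines it with the fitted expression for ⟨β⟩ quoted in the abstract. The function names and the example peak capacities (100 and 20) are assumptions chosen only for illustration; they do not come from the paper.

```python
import math


def broadening_factor(ts_over_sigma: float) -> float:
    """Average broadening factor <beta> from the expression quoted in the abstract.

    ts_over_sigma is the ratio of the sampling time to the first-dimension
    peak standard deviation prior to sampling.
    """
    # The abstract states the expression is valid for 0.2 <= ts/sigma <= 16.
    if not 0.2 <= ts_over_sigma <= 16.0:
        raise ValueError("expression reported only for 0.2 <= ts/sigma <= 16")
    return math.sqrt(1.0 + 0.21 * ts_over_sigma ** 2)


def effective_peak_capacity(n1: float, n2: float, ts_over_sigma: float) -> float:
    """Eq 4: divide the first-dimension peak capacity by <beta>, then multiply by the second."""
    return n1 / broadening_factor(ts_over_sigma) * n2


if __name__ == "__main__":
    n1, n2 = 100.0, 20.0  # hypothetical first- and second-dimension peak capacities
    for ratio in (0.5, 2.0, 8.0):
        beta = broadening_factor(ratio)
        print(f"ts/sigma = {ratio:4.1f}  <beta> = {beta:.3f}  "
              f"n'c,2D = {effective_peak_capacity(n1, n2, ratio):.0f}")
```

For ts equal to 2 ¹σ (four samples per 8 ¹σ width), the expression gives ⟨β⟩ of about 1.36, so under these assumptions the effective 2D peak capacity is roughly three-quarters of the product ¹nc ²nc.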
At this point, it is important to understand that the broadening factor ⟨β⟩ is obtained completely empirically, but is based upon the very pragmatic metric of the number of observed peaks (i.e., the entire ensemble of overlapping peaks in the 2D separation is taken into account). In contrast, the approach of Horie et al. bases the correction to nc,2D on the theoretical behavior of a single first-dimension peak. We find that the broadening factor determined by this empirical approach is in qualitative agreement with all of the previous work described above. This is expected because the effect of slow sampling on an ensemble of first-dimension peaks should not be fundamentally different from the effect on a single first-dimension peak. However, we find that when the entire ensemble is considered, ⟨β⟩ is up to 35% larger (depending on conditions; see Figure 3) than the broadening factor reported by Horie et al. Given the large additional increase in analysis time that must be made to achieve even a 20% increase in peak capacity,22 this quantitative difference in broadening factors is far from trivial (see below). We also find that ⟨β⟩ is rather constant over a wide range in the number of constituents (50-1000) in the simulated separations, and a large range in the aspect ratio of the 2D separations (i.e., an 8-fold range in the ratio of the peak capacities of the first and second dimensions). These findings strongly support previous studies of the sampling problem in a qualitative sense and provide a more accurate means of calculating effective 2D peak capacities, which are badly needed to guide the optimization of 2D separations and to compare various approaches to 2D separations, especially in LC × LC.

(22) Wang, X.; Barber, W. E.; Carr, P. W. J. Chromatogr., A 2006, 1107, 139-151.

THEORY

Terminology and Symbols. It is evident from the GC × GC and LC × LC literature that different terminology is used to describe nominally identical processes. The acquisition of aliquots of first-dimension effluent and subsequent transfer to the inlet of the second-dimension column is usually referred to as “sampling” in LC × LC and “modulation” in GC × GC. The process of sample collection and transfer is more consistent with the idea of sampling, or sample acquisition, in the language of signal processing, so we choose to refer to this process as “sampling” rather than “modulation”. However, we feel the conclusions reached in this work apply equally well to both GC × GC and LC × LC separations. There are several related terms: we use “sampling time” (ts) rather than “modulation period”, and the “modulation ratio” introduced by Khummueng et al.23 is equivalent to the ratio of the sampling time to the first-dimension 4σ peak width (before sampling), ts/(4 ¹σ). The notation described by Schoenmakers et al.13 for discussing different aspects of 2D separations has also largely been adopted here for consistency.

To avoid confusion, we use the word “peak” to describe the concentration distribution of a single constituent of a mixture. The word is generic; for example, we refer to first-dimension peaks, second-dimension peaks, or 2D peaks. In contrast, we use the words “observed peak”, as did Nagels et al.,24 to describe the concentration distribution that is detectable in separations and has a single maximum concentration, or maximum. Observed peaks may contain either single or multiple constituents. Finally, we use the word “profile” to describe various representations of sampled first-dimension peaks.

Assumptions.
We make five assumptions in our model. First, ideally sampled peaks are Gaussian in both dimensions. Second, all peaks in the same dimension have the same width, although peak widths may differ between dimensions. Third, the sampling device that acquires first-dimension effluent and transfers it to the second dimension behaves ideally and instantaneously transfers focused sample. Fourth, sampling has no effect on the second-dimension peak width or retention time. Fifth, we consider only sampling devices having a 100% duty cycle.

Broadening of First-Dimension Gaussian Peak due to Sampling. Figure 1a shows a first-dimension Gaussian peak having relative concentration c and retention time ¹tr. It is plotted against the reduced time, (t − ¹tr)/¹σ, where t is time. Also shown is a histogram-like profile of the average concentration sampled between time t and t + ts (in this case, ts is equal to 2 ¹σ). The histogram-like profile is identical to that derived by Murphy et al.14 The filled circles in Figure 1a represent the average concentrations of the first-dimension Gaussian. The discrete times associated with these concentrations are found at the centers of the histogram bars (i.e., at the average times over which samples are collected).

(23) Khummueng, W.; Harynuk, J.; Marriott, P. J. Anal. Chem. 2006, 78, 4578-4587.
(24) Nagels, L. J.; Creten, W. L.; Vanpeperstraete, P. M. Anal. Chem. 1983, 55, 216-220.
Figure 1. Relative concentration vs reduced time for unsampled Gaussian peaks and sampled profiles. Filled circles (●) represent the average concentration of sample that is injected into a second-dimension column. ts = 2 ¹σ unless specified otherwise. (a) Unsampled Gaussian peak (dashed curve) and histogram-like profile (solid line) representing average concentrations after integration of the Gaussian from t to t + ts. (b) Unsampled Gaussian peak (dashed curve) and profile (solid curve) resulting from the convolution of the Gaussian and a rectangle of height ts⁻¹ and width ts. (c) Interpolated profile formed by connecting the average concentrations shown in (a) with line segments. (d) Interpolated profiles (bold lines) having different starting times for the sampling process relative to the Gaussian peak maximum, and convolution (solid curve). β is the standard deviation of the profile in reduced time; here, ts = 8 ¹σ.
Each average concentration is calculated by integrating the Gaussian between t and t + ts, dividing by ts, and assigning the value to the time t + ts/2, as discussed in detail by Murphy et al.14 Alternatively, it can be calculated by sampling the function produced by convolving the Gaussian with a rectangle of height ts⁻¹ and temporal width ts.25 This is true, because the value of the convolution at time t + ts/2 is the area of the Gaussian between times t and t + ts, divided by ts (i.e., the average concentration). Figure 1b shows both the convolution and the original Gaussian peak in reduced time, with coincidence of the sampled concentrations in Figure 1a (represented by filled circles) and the convolution in Figure 1b.

The sampled concentrations can be interpreted in the same way as any digital signal. Any such signal can be recovered as an analog signal by convolving it with the appropriate sinc function,26 if the Nyquist theorem concerning the minimum sampling rate is satisfied. However, for undersampled signals, that is, signals obtained at large ts/¹σ, accurate recovery is not possible due to frequency aliasing. Consequently, no continuous function can accurately represent an undersampled first-dimension profile. In addition to the histogram-like profile in Figure 1a, the undersampled signal can be represented by the interpolated profile in Figure 1c, in which the sampled average concentrations, which are the same as in Figure 1a, are connected by line segments. Other profiles can be proposed, but none approaches legitimacy unless the convolution in Figure 1b is sampled rapidly. The interpolated profile has a special place in this study, which we explain later.

(25) Bracewell, R. N. The Fourier Transform and its Applications, 2nd ed.; McGraw-Hill: New York, 1986.
(26) Felinger, A. Data Analysis and Signal Processing in Chromatography; Elsevier: Amsterdam, 1998.

The temporal standard deviation of a peak along the first-dimension axis, ¹σc, in a comprehensive 2D separation is expressed here as

$${}^{1}\sigma_{c} = \beta\,{}^{1}\sigma \tag{5}$$
where β ≥ 1 is a function that corrects for broadening due to undersampling. Its value depends on ts/¹σ, the means of its calculation, and the phase between the retention time of the sampled peak and the time of the first sampled concentration of significance. The last dependence was addressed by Murphy et al.14 and Seeley.15 Figure 1d shows three interpolated profiles (represented by bold lines) resulting from sampling a Gaussian at ts/¹σ = 8, where it is evident that the dimensionless peak standard deviations β differ due to the differences in the starting time of sampling (also called phase) in each case. The convolution coincident with the sampled concentrations also is shown as a solid curve for reference.

Figure 2. Effect of sampling phase and sampling time on the distortion of first-dimension peaks that results from the sampling of two equal-size first-dimension peaks separated with Rs = 1 prior to sampling. Solid curves are unsampled Gaussians, open circles are sampled data, and dashed lines are interpolated profiles. (a) The sampling time is fixed at 2 ¹σ, and the time at which the sampling process is started relative to the retention time of the first peak prior to sampling (referred to as the sampling phase, φ) is systematically varied. (b) The sampling phase is fixed at −8 ¹σ, and the sampling time is systematically varied. These two sets of plots show the extreme local effect of the sampling phase and sampling time on the number of maximums that is observed, especially when ts > 2 ¹σ.

The plots in Figure 2 give a clearer picture of the pronounced effect of the sampling phase and sampling time on the distortion of first-dimension peaks. Figure 2a shows the extreme sensitivity of the first-dimension peak shape and thus resolution Rs to the sampling phase, even when the sampling rate satisfies the criterion suggested by Murphy et al.14 (ts ≤ 2 ¹σ). Note that although Rs = 1 in the first-dimension chromatogram prior to sampling, one sees only a single peak maximum when the sampling commences at a time corresponding to 2 or 4 σ units prior to the unsampled peak maximum (i.e., φ = −2 ¹σ or −4 ¹σ), but one sees two maximums when the sampling commences at −3 ¹σ or −5 ¹σ. Similarly, Figure 2b shows the effect of sampling time. It is obvious that when Rs = 1, undersampling severely distorts the peak shape and makes it very sensitive to the phase effect. It is thus not surprising that the loss of information represented in the deviation of β from unity depends strongly not only on the exact conditions of the chromatography (i.e., aspect ratio and sampling rate) but also on the amount of peak overlap, and that the dispersion in β increases with increasing ts when ts/¹σ > 2 (see Figure 3). Thus, the number of observed peaks in a comprehensive 2D separation depends on the interactions of the different sampled peaks in the entire separation and not just on the sampling characteristics of one peak. We emphasize that the results in Figure 2 are shown here, despite the extreme distortion of the first-dimension peak shape when ts > 2 ¹σ, because the literature indicates that such low sampling rates are commonly used.

In a complex comprehensive 2D separation, the β’s of the profiles along the first-dimension axis assume different values for different constituents of the sample due to different phase angles between the start of sampling and the unsampled peak maximum, regardless of ts/¹σ or means of their calculation. If sufficient numbers of peaks are sampled, then the average temporal standard deviation of the resultant profiles can be written as

$$\langle\sigma\rangle = \langle\beta\rangle\,{}^{1}\sigma \tag{6}$$

where ⟨β⟩ is the average of β across all constituents present in the separation (see eq 3).
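The phase averaging behind eq 6 can be sketched numerically for the histogram-like profile. This is a minimal sketch under stated assumptions, not the authors' software: the peak has unit standard deviation (so `ts` is already the ratio ts/¹σ), the sampling phase is drawn uniformly over one sampling period, and the function names are hypothetical. It treats each histogram bar as a uniform block of width ts, and it does not cover the interpolated profiles or the SOT-based determination.

```python
import math
import random
import statistics


def gaussian_cdf(x: float) -> float:
    """Cumulative distribution of a unit-variance Gaussian centred at zero."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def beta_histogram(ts: float, phase: float, span: float = 8.0) -> float:
    """Standard deviation (in units of sigma) of the histogram-like profile for one phase."""
    # Bin edges sit at phase + k*ts and cover at least [-span, +span] around the peak.
    k_lo = math.floor((-span - phase) / ts)
    k_hi = math.ceil((span - phase) / ts)
    areas, centres = [], []
    for k in range(k_lo, k_hi):
        left = phase + k * ts
        areas.append(gaussian_cdf(left + ts) - gaussian_cdf(left))  # area collected in this bin
        centres.append(left + 0.5 * ts)                             # time assigned to the bin
    total = sum(areas)
    mean = sum(a * c for a, c in zip(areas, centres)) / total
    var_centres = sum(a * (c - mean) ** 2 for a, c in zip(areas, centres)) / total
    # Each bar is flat over its width, which adds ts^2/12 to the variance.
    return math.sqrt(var_centres + ts ** 2 / 12.0)


def mean_beta(ts: float, n_phases: int = 1000, seed: int = 0):
    """Average and standard deviation of beta over uniformly random sampling phases."""
    rng = random.Random(seed)
    betas = [beta_histogram(ts, rng.uniform(0.0, ts)) for _ in range(n_phases)]
    return statistics.fmean(betas), statistics.pstdev(betas)


if __name__ == "__main__":
    for ratio in (0.2, 2.0, 8.0):
        avg, spread = mean_beta(ratio)
        print(f"ts/sigma = {ratio:4.1f}  mean beta = {avg:.3f}  spread = {spread:.3f}")
```

The spread returned alongside the mean is one way to look at the phase-dependent dispersion in β that the text describes for large ts/¹σ.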
Figure 3. Plots of ⟨β⟩ vs ts/¹σ. Solid line: ⟨β⟩ computed by Monte Carlo simulation of interpolated profiles; short dashes: one standard deviation of β; long dashes: ⟨β⟩ computed by Monte Carlo simulation of histogram-like profiles. Filled circles are ⟨β⟩ determined by SOT, and error bars are one standard deviation. The inset plot is a fit of eq 10 to ⟨β⟩ determined by SOT over the range 0.2 ≤ ts/¹σ ≤ 16.
Because the first-dimension peak capacity is inversely proportional to ¹σ,27 the average effective 2D peak capacity n′c,2D, corrected for undersampling of first-dimension peaks in a comprehensive 2D separation, is also inversely proportional to ⟨β⟩, as given in eq 4.

(27) Grushka, E. Anal. Chem. 1970, 42, 1142-1147.

Determination of ⟨β⟩. Here we discuss two independent approaches to determine ⟨β⟩. One entails interpretation by theory of the number of observed peaks in simulations of comprehensive 2D separations and depends on the behavior of a large ensemble of overlapping 2D peaks. The second entails a random sampling of a single first-dimension peak. As we will show, the two approaches lead to similar results, provided that the sampled peak profile is appropriately modeled.

Two-Dimensional Statistical Overlap Theory. ⟨β⟩ can be determined by interpreting simulated comprehensive 2D separations using 2D statistical overlap theory (SOT). SOT is used to
predict the average number p of observed peaks (i.e., peak maximums) in 2D separations containing (on average) m̄ randomly distributed constituents. The theory was corrected recently for various shortcomings21,28 and currently predicts accurate numbers of observed peaks for overlapping bi-Gaussian concentration distributions over a wide range of conditions. The equations for SOT are described in detail elsewhere;21,28 only relevant issues are discussed here. Briefly, p depends on m̄, the saturation α of the separation, the two standard deviations of a representative bi-Gaussian peak, and the durations of both separation dimensions. α is a metric of the fraction of constituents (p/m̄) that appear as observed peaks; it is inversely proportional to the 2D peak capacity and, in an ideally sampled 2D separation, is directly proportional to the first-dimension peak standard deviation ¹σ

$$\alpha \propto {}^{1}\sigma \tag{7}$$

For a comprehensive 2D separation, eq 7 must be modified with eq 6, such that

$$\alpha \propto \langle\sigma\rangle = \langle\beta\rangle\,{}^{1}\sigma \tag{8}$$

The modification is valid, because α is a measure of the average amount of overlap over the entire ensemble of 2D peaks. ⟨β⟩ can be determined by taking p as the average number of observed peaks in a large number of simulated comprehensive 2D separations containing (prior to sampling) randomly distributed 2D peaks. This is true, because all of the other parameters on which α depends can be specified exactly, leaving ⟨β⟩ as the only parameter to be determined. Two objections to this identification can be raised. First, sampled peaks do not produce bi-Gaussians and have distorted first-dimension profiles. Second, even though the ideally sampled simulation is based on randomly distributed retention times, the sampling process partially compromises the randomness by discretizing all first-dimension retention times between t and t + ts to the time t + ts/2. The second objection is the more substantive one, but we postpone its discussion until we report results below that show it is actually only a minor concern.

Monte Carlo Simulation. Another way to determine ⟨β⟩ is to repetitively randomly sample a Gaussian peak and interpret the sampled data. Specifically, the data so obtained are represented by either histogram-like or interpolated profiles; their dimensionless standard deviations (β) are computed and averaged to obtain ⟨β⟩. Their representation by a digital pulse sequence is unnecessary for a 100% duty cycle,15 since ⟨β⟩ for this type of representation is the same as that of histogram-like profiles. This procedure also allows the calculation of the distribution of β. For specific ts/¹σ values, ⟨β⟩ was determined previously from histogram-like profiles by Murphy et al., who incrementally changed the initial sampling time and averaged the resulting standard deviations.14 For 0 ≤ ts/¹σ ≤ 3, Seeley obtained a similar result using a digital pulse sequence.15 The calculation of ⟨β⟩ based on interpolated (not histogram-like) profiles appears to be unique to the present study.

(28) Davis, J. M. J. Sep. Sci. 2005, 28, 347-359.

Table 1. Parameters Used in Simulations of Comprehensive 2D Separations

set    ¹σr (a)    ²σr (b)    γ (c)
A      0.01       0.01       1
B      0.02       0.02       1
C      0.0025     0.005      2
D      0.005      0.01       2
E      0.01       0.005      2
F      0.01       0.02       2
G      0.02       0.01       2
H      0.0025     0.01       4
I      0.02       0.005      4
J      0.0025     0.02       8

(a) Reduced first-dimension peak standard deviation. (b) Reduced second-dimension peak standard deviation. (c) Ratio of the reduced standard deviations of the first- and second-dimension peaks, such that γ ≥ 1.
PROCEDURES

Simulations: Comprehensive 2D Separations. Software was written to mimic comprehensive 2D separations in a reduced square of unit area. Simulation parameters were the number of mixture constituents m, the reduced standard deviations ¹σr and ²σr of peaks in the first and second dimensions, and the reduced sampling time. The reduced variables and reason for their use are explained below. Table 1 gives 10 combinations of ¹σr and ²σr, identified as set A, set B, etc. Their ratio, γ, defined such that γ ≥ 1, is also reported. For each set, simulations were done for m equal to 50-400 (in increments of 50). Set A also included m values from 500 to 1000 (in increments of 100). Reduced sampling times were multiples of ¹σr with multipliers from 0.2 to 16. Reduced retention times were distributed randomly over the central 80% of both reduced dimensions of unit duration to avoid loss of peak maximums near boundaries. The concentrations of the first-dimension peaks were assigned by an exponential random number generator to mimic real-world mixtures.24,29-31 The first-dimension Gaussian peaks were calculated with a reduced time increment of 0.2 ¹σr between successive points. The first-dimension time axis was divided into intervals equaling the reduced sampling time, with the final interval smaller when necessary. In each interval, the average concentration of each first-dimension Gaussian peak was computed by numerical integration with Simpson’s rule over 501 points and assigned a first-dimension retention time equal to the interval’s center. Second-dimension Gaussian peaks conserving the first-dimension average concentrations were calculated, with a reduced time increment of 0.001 between successive points.
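The interval-averaging step described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function names, the unit-area Gaussian, and the closed-form check with the error function are assumptions introduced here. Only the division of the unit time axis into sampling intervals, the 501-point Simpson's rule, and the assignment to the interval center follow the description.

```python
import math


def gaussian(t: float, t_r: float, sigma: float) -> float:
    """Unit-area Gaussian peak with retention time t_r and standard deviation sigma."""
    return math.exp(-0.5 * ((t - t_r) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def simpson_average(f, a: float, b: float, n_points: int = 501) -> float:
    """Average of f over [a, b] by composite Simpson's rule (n_points must be odd)."""
    if n_points < 3 or n_points % 2 == 0:
        raise ValueError("Simpson's rule needs an odd number of points >= 3")
    h = (b - a) / (n_points - 1)
    total = f(a) + f(b)
    for i in range(1, n_points - 1):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return (total * h / 3.0) / (b - a)


def exact_average(a: float, b: float, t_r: float, sigma: float) -> float:
    """Closed-form interval average of the Gaussian, used here only as a check."""
    cdf = lambda x: 0.5 * (1.0 + math.erf((x - t_r) / (sigma * math.sqrt(2.0))))
    return (cdf(b) - cdf(a)) / (b - a)


def sampled_profile(t_r: float, sigma: float, ts: float, duration: float = 1.0):
    """Return (interval center, average concentration) pairs along the first dimension."""
    n_intervals = math.ceil(duration / ts - 1e-9)
    profile = []
    for k in range(n_intervals):
        start = k * ts
        stop = min((k + 1) * ts, duration)  # the final interval may be shorter
        avg = simpson_average(lambda t: gaussian(t, t_r, sigma), start, stop)
        profile.append((0.5 * (start + stop), avg))
    return profile


if __name__ == "__main__":
    # Hypothetical example: reduced sigma of 0.01 (set A) sampled at ts = 2 sigma.
    for center, avg in sampled_profile(t_r=0.5, sigma=0.01, ts=0.02)[23:27]:
        print(f"center = {center:.2f}  average concentration = {avg:.4f}")
```

Because each average is multiplied back by its interval width when it is "injected", the total area of the peak is conserved by this step, which is the property the second-dimension peaks are said to preserve.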
The number of observed peaks (maximums) in the resultant 2D concentration array was determined by counting points at which the local concentration exceeded that of the eight surrounding nearest neighbors (more elaborate determination schemes are possible, but this one has worked well with SOT and noiseless simulations21,32). The numbers of observed peaks in 50 simulations having different retention times were averaged.
(29) Dondi, F.; Kahie, Y. D.; Lodi, G.; Remelli, M.; Reschiglian, P.; Bighi, C. Anal. Chim. Acta 1986, 191, 261-273.
(30) Nagels, L. J.; Creten, W. L. Anal. Chem. 1985, 57, 2706-2711.
(31) Pietrogrande, M. C.; Pasti, L.; Dondi, F.; Rodriguez, M. H. B.; Diaz, M. A. C. J. High Resolut. Chromatogr. 1994, 17, 839-850.
(32) Shi, W.; Davis, J. M. Anal. Chem. 1993, 65, 482-492.
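The eight-neighbor counting rule can be sketched as follows. This is a hedged illustration in Python, not the original code: the helper `bi_gaussian_grid`, the grid size, and the example peak positions are hypothetical, and border points are simply skipped (the text keeps retention times in the central 80% of each dimension, so maxima are not expected there).

```python
import math


def bi_gaussian_grid(peaks, sigma1, sigma2, n1, n2):
    """Sum of bi-Gaussian peaks (amplitude, x0, y0) on an n1 x n2 grid over the unit square."""
    grid = [[0.0] * n2 for _ in range(n1)]
    for i in range(n1):
        x = i / (n1 - 1)
        for j in range(n2):
            y = j / (n2 - 1)
            grid[i][j] = sum(
                a * math.exp(-0.5 * ((x - x0) / sigma1) ** 2
                             - 0.5 * ((y - y0) / sigma2) ** 2)
                for a, x0, y0 in peaks)
    return grid


def count_peak_maxima(grid):
    """Count interior points strictly greater than all eight nearest neighbors."""
    n_rows, n_cols = len(grid), len(grid[0])
    count = 0
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            centre = grid[i][j]
            if all(centre > grid[i + di][j + dj]
                   for di in (-1, 0, 1) for dj in (-1, 0, 1)
                   if (di, dj) != (0, 0)):
                count += 1
    return count


# Two well-separated constituents give two maxima; two closely spaced ones fuse into one.
separated = bi_gaussian_grid([(1.0, 0.3, 0.3), (0.6, 0.7, 0.7)], 0.05, 0.05, 41, 41)
fused = bi_gaussian_grid([(1.0, 0.5, 0.5), (1.0, 0.52, 0.5)], 0.05, 0.05, 41, 41)
```

The strict inequality means flat regions (for example, baseline that has underflowed to zero) are never counted as peaks.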
Similar calculations were made using retention times taken from experimental LC × LC separations16,33 and retention times in two calculated GC × GC separations.21 The retention times were expressed as reduced coordinates and were fixed in simulations involving different values of ts/¹σ. Exponentially random concentrations were assigned to each first-dimension peak, producing slightly different numbers of observed peaks in each simulation. The numbers of observed peaks in 50 simulations were averaged. Ideally Sampled 2D Separations. In the reduced space, simulations were made of ideally sampled 2D separations of bi-Gaussian peaks having a first-dimension standard deviation equal to ⟨β⟩ ¹σr, with ⟨β⟩ determined by SOT or histogram-like profiles (see below). Increments between successive points in either dimension were 0.2 times the smaller of the two reduced standard deviations. The retention times were either randomly assigned or were set equal to those in LC × LC and GC × GC experiments. The bi-Gaussians were assigned exponentially random concentrations. The average number of observed peaks in 50 simulations was compared to the average number of observed peaks in the corresponding simulated comprehensive 2D separation. Determination of ⟨β⟩ by SOT. The average number of observed peaks p was predicted by SOT21,28 using parameters of the simulated comprehensive 2D separations. Here, m̄ equaled m, and the first-dimension standard deviation equaled the product of an empirical scalar and ¹σr. In a series of calculations, the scalar was incremented by 0.001, producing a family of p values. For each set of m, ¹σr, ²σr, and ts, two p values were found that bracketed the average number of observed peaks in the corresponding comprehensive 2D separation. The empirical scalar was linearly interpolated between them and finally called ⟨β⟩ (cf. eq 6). The corresponding saturation α also was interpolated.
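The bracketing and interpolation step can be sketched in Python as below. The SOT equation for p is not reproduced in this section, so the sketch takes the predictor as a callable; `toy_predict_p` is a hypothetical stand-in (a simple decaying exponential), not the SOT model, and the starting scalar of 1.0 is an assumption. Only the 0.001 increment and the linear interpolation between bracketing p values come from the text.

```python
import math


def interpolate_scalar(predict_p, p_target, start=1.0, step=0.001, max_steps=20000):
    """Scan the broadening scalar upward until predict_p brackets p_target,
    then interpolate linearly inside the bracket.

    predict_p must decrease as the scalar grows (broader first-dimension
    peaks give fewer observed peaks).
    """
    lo, p_lo = start, predict_p(start)
    if p_lo < p_target:
        raise ValueError("target exceeds the prediction at the starting scalar")
    for k in range(1, max_steps + 1):
        hi = start + k * step              # avoids accumulating round-off
        p_hi = predict_p(hi)
        if p_hi <= p_target:               # p_lo >= p_target >= p_hi: bracketed
            if p_lo == p_hi:
                return lo
            return lo + (hi - lo) * (p_lo - p_target) / (p_lo - p_hi)
        lo, p_lo = hi, p_hi
    raise ValueError("no bracket found within max_steps")


def toy_predict_p(scalar, m=200, k=0.3):
    """Hypothetical stand-in for the SOT prediction of p; NOT the SOT equation."""
    return m * math.exp(-k * scalar)


# Pretend the simulated comprehensive separation gave this average peak count.
p_observed = toy_predict_p(2.1033)
beta_estimate = interpolate_scalar(toy_predict_p, p_observed)
```

With a real SOT predictor in place of `toy_predict_p`, the returned scalar would play the role of ⟨β⟩ for one combination of m, ¹σr, ²σr, and ts.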
For each sampling time, the ⟨β⟩ values so determined were averaged for the entire family of m values and reduced standard deviations in Table 1 (70 values for ts/¹σ = 0.2; 86 values for other ts/¹σ values). Determination of ⟨β⟩ by Monte Carlo Simulation. For a specified ts/¹σ, a Gaussian peak of unit amplitude was sampled at a randomly chosen initial reduced time z = (t - ¹tr)/¹σ, but one smaller than -8z - ts/¹σ, and a final reduced time greater than 8z + ts/¹σ. The average concentration in a reduced sampling time (ts/¹σ) was calculated by numerically integrating the Gaussian between various reduced sampling times z and z + ts/¹σ using Simpson’s rule based on 301 points. The numerical error in this average concentration is negligible. The dimensionless variance β² of each profile was calculated by the exact equation for the second central moment of a distribution
$$\beta^{2} = \langle z^{2} c^{*} \rangle - \langle z c^{*} \rangle^{2} \qquad (9)$$
where c* is the continuous normalized concentration and the brackets represent values calculated as the sum of piecewise analytical integrals over the different reduced sampling times. In other words, β² was calculated analytically from the continuous profile, instead of computed numerically from the discretized data at the sampling points. The equations for β² are given as Supporting Information. The standard deviations of 50 000 randomly sampled profiles were so determined and averaged to get ⟨β⟩. The dispersion of β, and sometimes its distribution, also were calculated from the 50 000 values. Calculations were made for the ts/¹σ values from 0.2 to 16. Other Computational Issues. To confirm results, independent algorithms were written by two of us in Matlab version 7.0.4 and FORTRAN 90. Random number generators were built into Matlab, whereas RAN3 (ref 34) was used with FORTRAN (the latter also was used to randomly sample Gaussians). The analytical expression for the convolution was adapted from ref 35.
(33) Porter, S. E. G.; Stoll, D. R.; Rutan, S. C.; Carr, P. W.; Cohen, J. D. Anal. Chem. 2006, 78, 5559-5569.
RESULTS AND DISCUSSION Use of Reduced Coordinates. The ¹σr and ²σr values in Table 1 span 8- and 4-fold ranges, respectively, with a resultant range in nc,2D of 32. To make this large range even more relevant, separations were simulated in reduced coordinates. The reduced standard deviations ¹σr and ²σr are the temporal standard deviations ¹σ and ²σ, divided by the duration of the relevant dimension. Consequently, simulation results for ¹σr = 0.01 apply equally well to a first-dimension standard deviation and duration of 2 and 200 s (2/200 = 0.01), or of 8 and 800 s. A reduced sampling time of 4 ¹σr corresponds to 8 s in the former case and 32 s in the latter case. For simplicity, we drop the subscript “r” in discussing our findings (e.g., we use ts/¹σ to describe the factor by which ¹σ is multiplied to obtain the sampling time). Determination of ⟨β⟩ Values by Monte Carlo Simulation. Figure 3 is a plot of ⟨β⟩ versus ts/¹σ. The solid line was calculated from Monte Carlo simulation of interpolated profiles. The two finely dashed lines bracketing the solid line represent one standard deviation of β about the solid line. We observe that the dispersion in β increases with ts/¹σ due to increasing variability in the sampled peak profile (see Figure 1d). For ts/¹σ ≤ 2, the variation is negligible (RSD < 0.6). The distribution of β at large ts/¹σ depends on ts/¹σ and is highly asymmetric (results not shown).
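A minimal Python sketch of the Monte Carlo idea is given here for the simpler histogram-like profile only. It rests on several assumptions that are not from the paper: the second moment of a piecewise-constant profile is written with the elementary result that a uniform block of width w contributes w²/12 (the Supporting Information equations are not reproduced), interval areas come from the error function rather than Simpson's rule, and the sampled span of ±(8 + ts/¹σ) reduced units is one reading of the limits stated under Procedures. The function names are hypothetical.

```python
import math
import random
import statistics


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def beta_histogram(w, phase, z_span=8.0):
    """Reduced standard deviation of one histogram-like profile.

    w is ts/1sigma; phase (0 <= phase <= w) shifts the sampling grid
    relative to the peak maximum at z = 0.
    """
    k_min = math.floor((-z_span - w - phase) / w)
    k_max = math.ceil((z_span + w - phase) / w)
    mass = first = second = 0.0
    for k in range(k_min, k_max):
        left = phase + k * w
        area = normal_cdf(left + w) - normal_cdf(left)  # area eluting in this interval
        centre = left + 0.5 * w
        mass += area
        first += area * centre
        # a uniform block of width w adds w**2 / 12 to its own second moment
        second += area * (centre * centre + w * w / 12.0)
    mean = first / mass
    return math.sqrt(second / mass - mean * mean)


def monte_carlo_beta(w, n_profiles=50_000, seed=0):
    """Mean and standard deviation of beta over random sampling phases."""
    rng = random.Random(seed)
    betas = [beta_histogram(w, rng.uniform(0.0, w)) for _ in range(n_profiles)]
    return statistics.mean(betas), statistics.stdev(betas)
```

Calling `monte_carlo_beta(4.0, n_profiles=2000)` returns the average β and its spread for ts/¹σ = 4; the spread grows with ts/¹σ because the profile shape depends increasingly on the sampling phase. Interpolated profiles would need a different second-moment expression and are not covered by this sketch.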
The coarsely dashed line was calculated in the same fashion as the solid one, except that histogram-like profiles were used; it agrees with average values reported by Murphy et al.14 and Seeley15 for small values of ts/¹σ. Here, we consider a larger ts/¹σ range than did these authors. The results obtained for the two profiles differ by less than 4% when ts/¹σ ≤ 1. However, at large values of ts/¹σ (where most experimental work has actually been done, particularly in LC × LC), the result based on interpolated profiles is up to 35% larger than both prior results. The statistical significance of the difference is difficult to judge, because the β distribution for both profile types is highly asymmetric (however, the standard deviations of β for both profile types overlap when ts/¹σ > 3). The oscillations of both functions result from phase differences between the first-dimension retention time and the initial sampling time, as explained in detail by Murphy et al.14
(34) Press, W. H.; Teukolsky, S. A.; Vetterling, W.; Flannery, B. P. Numerical Recipes in FORTRAN; Cambridge University Press: New York, 1992.
(35) Crank, J. Mathematics of Diffusion; Clarendon Press: Oxford, 1979.
Analytical Chemistry, Vol. 80, No. 2, January 15, 2008
Figure 4. Plots of average numbers of observed peaks vs ts/¹σ for comprehensive 2D separations and ideally sampled 2D separations of bi-Gaussian peaks with first-dimension standard deviations equaling ⟨β⟩ ¹σ. Each of the panels (a)-(d) shows results for different 2D aspect ratios. Subsequent 3-tuples identify m, symbol for average numbers of observed peaks in comprehensive separations, and symbol for average numbers of observed peaks in ideally sampled separations with ⟨β⟩ determined by SOT: (100, ○, □), (200, ◇, ×), (300, +, △), and (400, ▽, right triangle with vertical edge at right). The right triangle with vertical edge at left in (a) (m = 100) and (d) (m = 400) represents average numbers of observed peaks in ideally sampled separations with ⟨β⟩ determined by histogram-like profiles.
Determination of ⟨β⟩ Values Using SOT. The filled points in Figure 3 are ⟨β⟩ values determined using SOT; the error bars denote one standard deviation. The data are precise, with RSDs that are less than 7.5 but increase with increasing ts/¹σ for reasons discussed below. They generally agree with the ⟨β⟩ values calculated from Monte Carlo simulations of the interpolated profiles. However, the agreement is not exact, indicating that interpolated profiles are not exact representations of sampled first-dimension peaks. As stated above, accurate continuous profiles cannot be recovered from undersampled first-dimension peaks. The inset plot in Figure 3 is a fit of
$$\langle\beta\rangle = \sqrt{1 + \kappa \left( \frac{t_s}{{}^{1}\sigma} \right)^{2}} \qquad (10)$$
to ⟨β⟩ values determined using SOT, with κ = 0.214 ± 0.010 as determined by weighted least-squares. The fit is excellent, with a reduced χ² of 0.101. Although eq 10 is empirical, it has the form expected for the convolution of a Gaussian peak of standard deviation ¹σ with an arbitrary filter of approximate duration ts, since the variance of a convolution is the sum of the variances of the convolved functions.25 For κ equal to 0.214, one such filter is a rectangle of duration 1.60 ts (the variance of a rectangle of duration ts is ts²/12). Although this specific filter has no physical basis in a comprehensive 2D separation, it complements our interpretation of first-dimension profiles as a sampling of the convolution of a Gaussian peak and a rectangle.
Validation of ⟨β⟩. Figure 4 shows plots of average numbers of observed peaks (i.e., peak maximums) versus ts/¹σ, as obtained with randomly distributed retention times, the selected reduced standard deviations in Table 1, and two types of simulation. The first type is composed of first-dimension Gaussian peaks having reduced standard deviation ¹σr, which were sampled at different values of ts and “injected” as second-dimension aliquots. The second results from ideally sampled bi-Gaussians having first-dimension reduced standard deviations equal to ⟨β⟩ ¹σr, with ⟨β⟩ determined using SOT. The second-dimension reduced standard deviation ²σr was the same for both types of simulation. All results show that the numbers of observed peaks agree well for both simulation methods. This implies that ⟨β⟩, as determined by SOT, is the appropriate correction for quantifying the undersampling-corrected peak capacity n′c,2D described in the introduction, with the number of observed peaks in a large ensemble of overlapping peaks as the defining metric. The number of observed peaks decreases with increasing ts/¹σ, showing that resolving power in comprehensive 2D separations is decreased by undersampling first-dimension peaks. In Figure 4a and b, the decreases are small because the standard deviations in both dimensions are small. As the peak standard deviations increase, the number of observed peaks decreases more rapidly. In Figure 4d, overlap is so severe that when ts/¹σ ≥ 8, both 100 and 400 constituents essentially produce the same number of observed peaks. An important finding, which will be examined fully elsewhere, is that the numbers of observed peaks are almost the same at any ts/¹σ when ¹σr and ²σr are interchanged (e.g., the numbers of observed peaks for sets D and E, and sets F and G, are almost the same for any ts/¹σ). We believe this will have important implications for optimization of 2D separations. Panels a and d in Figure 4 also show the number of observed peaks in ideally sampled separations where the first-dimension standard deviation was calculated as ⟨β⟩ ¹σ using ⟨β⟩ values computed from histogram-like profiles. In the m = 100 case in Figure 4a, there is little difference between the numbers of observed peaks in ideally sampled 2D separations using ⟨β⟩ values from SOT or histogram-like profiles; however, at large ts/¹σ the difference is noticeable. The most extreme case is shown in Figure 4d for m = 400 constituents, where the number of observed peaks in ideally sampled 2D separations using ⟨β⟩ from histogram-like profiles is much larger than the number of observed peaks in simulated comprehensive 2D separations with the same values of ¹σr, ²σr, and ts. These results emphasize the importance of using ⟨β⟩ values determined using SOT to accurately correct for the effect of undersampling of first-dimension peaks, when the number of observed peaks is the defining metric of performance. The practical consequences of the differences between results obtained using ⟨β⟩ values from SOT vs histogram-like profiles are discussed below.
Figure 5. Plots of weakly correlated 2D retention times from (a) LC × LC experiments16,33 and (b), (c) calculated GC × GC separations.21 (d) Plot of the average numbers of observed peaks vs ts/¹σ in simulated 2D separations using the retention coordinates shown in panels (a)-(c). Subsequent 2-tuples identify symbols for average numbers of observed peaks in comprehensive 2D separations and ideally sampled separations with ⟨β⟩ determined by SOT: LC × LC, (○, □); GC × GC, case 1, (◇, ×); GC × GC, case 2, (+, △). For LC × LC, ¹σ = 7.65 s, ²σ = 0.22 s; for GC × GC, case 1, ¹σ = 6 s, ²σ = 0.05 s; for GC × GC, case 2, ¹σ = 4 s, ²σ = 0.05 s.
Frequently comprehensive 2D separations are not composed of randomly distributed constituents; rather they contain constituents whose retention times are somewhat correlated.36 The ⟨β⟩ values determined by SOT apply to such separations as well. Panels a-c in Figure 5 show plots of first- and second-dimension retention times of actual LC × LC and calculated GC × GC separations. The 95 retention times in Figure 5a were obtained from extracts of corn leaf tissue, as determined by three-dimensional chemometric analysis.33 Although some of them may belong to observed peaks, each is treated here as the maximum of a single constituent peak. The 93 retention times of peaks in Figure 5b and c were calculated with gas chromatographic simulation software and reported previously.21 Inspection shows that all three retention time distributions are weakly correlated. Figure 5d is a graph identical to those in Figure 4, as calculated using temporal standard deviations cited in the figure caption. The numbers of observed peaks agree closely in all three cases for all ts/¹σ, indicating that ⟨β⟩ determined using SOT is valid even in comprehensive 2D separations where the retention mechanisms of the two separation modes are not perfectly orthogonal.
(36) Slonecker, P. J.; Li, X.; Ridgway, T. H.; Dorsey, J. G. Anal. Chem. 1996, 68, 682-689.
Interpretation of ⟨β⟩. As shown in Figure 3, the ⟨β⟩ determined by SOT closely resembles the ⟨β⟩ determined from the interpolated profiles. Although this profile cannot be rigorous for reasons discussed above, it is identical to the first-dimension profile in a comprehensive 2D separation, when displayed as a three-dimensional “wire frame” graph. In this graph, all first-dimension concentrations having the same second-dimension time are connected by line segments, and all second-dimension concentrations having the same first-dimension time are connected by line segments. The profile in Figure 1c simply represents a slice through the wire frame that parallels the first dimension. The most likely reason that sampled first-dimension Gaussian peaks behave more like interpolated profiles than histogram-like profiles is that SOT models the overlap of bi-Gaussian peaks, and interpolated profiles more closely resemble the first-dimension component of bi-Gaussians than do either histogram-like profiles or digital pulse sequences. However, the application of ⟨β⟩ values so determined is not limited by the assumptions of SOT (i.e., random distribution of retention times); the ⟨β⟩ values also work with weakly correlated retention times, as shown in Figure 5. The fact that these ⟨β⟩ values differ from those of Murphy et al.14 and Seeley15 does not mean the latter are wrong. The latter pertain to the effect of undersampling on the effective width of a single peak, whereas ours are appropriate to examine the effect of undersampling on the number of observed peaks in separations of large numbers of overlapping peaks having an exponential distribution of peak sizes. We think our results complement the original findings of Murphy et al. and Seeley. Indeed, we concur with their chief conclusion that the sampling rate used in comprehensive 2D separations should obey the inequality ts/¹σ ≤ 2 to avoid serious loss of resolving power due to undersampling of first-dimension peaks.
Figure 6. Plots of the percent difference between average numbers of observed peaks in ideally sampled 2D separations of bi-Gaussians using ⟨β⟩ values determined from either SOT or histogram-like profiles vs ts/¹σ. Symbols ○ and ▽ represent the smallest (Figure 4a, m = 100) and largest (Figure 4d, m = 400) differences observed out of all of the conditions of this study.
Effect of ⟨β⟩ on Number of Observed Peaks and n′c,2D. The impact of the ⟨β⟩ values determined in this work can be viewed from the perspective of the number of observed peaks that can be expected in a separation with known values of ¹σ, ²σ, ts, and m, depending on which ⟨β⟩ is used to do the calculation (or simulation).
Figure 6 shows the percentage difference in numbers of observed peaks counted in simulated ideally sampled 2D separations where the first-dimension peak standard deviation is corrected by ⟨β⟩, as determined in this work or the previous work of Murphy et al.14 and Seeley.15 The two cases shown represent the smallest and largest differences observed out of all of the conditions (i.e., variations in m, α, and γ) studied in this work. The differences in the number of observed peaks vary from about 0 to 40%, depending on the circumstances of the different separations; this has very significant practical consequences.
Figure 7. Plots of the ratio of sampling-corrected 2D peak capacity (n′c,2D) to the limiting theoretical 2D peak capacity (nc,2D) vs ts/¹σ and the number N of samples taken over the 8 ¹σ width of the unsampled peak, calculated using eqs 1 and 4, with ⟨β⟩ calculated using different approaches. (- - -) ⟨β⟩ is calculated using eq 11 for the continuous convolution of a Gaussian peak and a rectangular pulse. This is the simplest representation of the sampling process, and provides the least accurate estimate of n′c,2D. (solid line) ⟨β⟩ is calculated using the approach of Murphy et al.14 where a histogram-like profile is used to represent a single sampled first-dimension peak; this gives a more accurate estimate of n′c,2D than the continuous convolution representation. (bold line) ⟨β⟩ is calculated from eq 10. This ⟨β⟩ is determined by SOT, where the effect of first-dimension undersampling on the entire ensemble of peaks in a 2D separation is considered, and gives the most accurate estimate of n′c,2D. These plots clearly show the very serious loss of ideal 2D peak capacity when ts > 2 ¹σ. Inset is a plot of n′c,2D/nc,2D vs ts/¹σ for small ts/¹σ and large N.
Figure 7 is a plot of the ratio of the undersampling-corrected 2D peak capacity (n′c,2D) to the limiting theoretical 2D peak capacity (nc,2D) versus ts/¹σ, where each curve is calculated using eqs 1 and 4, but with ⟨β⟩ calculated using different approaches. The dashed line in Figure 7 was calculated using the simplest model of the undersampling of first-dimension peaks as a convolution of a Gaussian peak with a rectangular pulse, as described above and shown in Figure 1b. The ⟨β⟩ for this model of undersampling can be calculated using eq 11:
$$\langle\beta\rangle = \sqrt{1 + \frac{1}{12} \left( \frac{t_s}{{}^{1}\sigma} \right)^{2}} \qquad (11)$$
However, this calculation of ⟨β⟩ assumes ideal sampling of the convolved profile and, therefore, underestimates the degree of broadening when first-dimension peaks are actually undersampled. The representation of undersampled peaks by histogram-like profiles, as proposed by Murphy et al.14 and shown in Figure 1a, leads to the more accurate estimation of ⟨β⟩ shown by the solid curve in Figure 7. Finally, the bold curve in Figure 7 was calculated using eq 10, with ⟨β⟩ determined from SOT and subsequently validated in simulations of ideally sampled 2D separations. We believe this is an even more accurate estimation of ⟨β⟩, which is based on the effect of undersampling of large ensembles of peaks.
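To make the difference between the two closed-form models concrete, the short Python sketch below evaluates eq 10 (κ = 0.214) and eq 11 (κ = 1/12) side by side. The function and constant names are illustrative choices, not from the paper, and the histogram-like curve of Murphy et al. is omitted because it has no closed form in this section. The last line checks the equivalent rectangular filter quoted with eq 10: matching κ ts² to the variance w²/12 of a rectangle of duration w gives w = sqrt(12κ) ts.

```python
import math

KAPPA_SOT = 0.214          # eq 10, fitted to the SOT-derived values
KAPPA_RECT = 1.0 / 12.0    # eq 11, Gaussian convolved with a rectangle of duration ts


def broadening(ts_over_sigma, kappa):
    """Broadening factor sqrt(1 + kappa * (ts/1sigma)**2) shared by eqs 10 and 11."""
    return math.sqrt(1.0 + kappa * ts_over_sigma ** 2)


for ratio in (0.5, 1, 2, 4, 8, 16):
    b10 = broadening(ratio, KAPPA_SOT)
    b11 = broadening(ratio, KAPPA_RECT)
    print(f"ts/sigma={ratio:>4}: eq10={b10:.3f}  eq11={b11:.3f}  eq10/eq11={b10 / b11:.3f}")

# Duration (in units of ts) of the rectangle whose variance equals kappa * ts**2
equivalent_rectangle = math.sqrt(12.0 * KAPPA_SOT)
```

At ts/¹σ = 4, for example, eq 10 gives a broadening factor of about 2.10 whereas eq 11 gives about 1.53, which illustrates how strongly the ideal-sampling assumption underestimates the broadening.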
Figure 7 makes it very clear how much of the ideal 2D peak capacity is lost due to undersampling of first-dimension peaks. n′c,2D systematically decreases upon increasing ts/¹σ, with its maximum rate of decrease at ts/¹σ = 1.5. Less than 10% of the peak capacity is lost for ts/¹σ ≤ 1. For larger ts/¹σ, however, the losses rapidly become unacceptable, and one is forced to consider whether comprehensive 2D separations are truly better than optimized one-dimensional separations.37 A more intuitively useful form of eq 10 is given in eq 12, where N = 8 ¹σ/ts is the number of fractions of first-dimension effluent taken across the 8 ¹σ width of a first-dimension peak, as was defined by Murphy et al.14
$$\langle\beta\rangle = \sqrt{1 + \frac{13.7}{N^{2}}} \qquad (12)$$
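Equation 12 follows from eq 10 by substituting ts/¹σ = 8/N, so that κ(8/N)² = 64κ/N² ≈ 13.7/N² for κ = 0.214. The Python sketch below tabulates the fraction of peak capacity retained as a function of N. It assumes that n′c,2D/nc,2D = 1/⟨β⟩, i.e., that only the first-dimension peak capacity is reduced by the broadening factor; eqs 1 and 4 are not reproduced in this section, so this relation is an assumption, although it reproduces the losses quoted in the text. The function names are hypothetical.

```python
import math

KAPPA = 0.214  # fitted constant of eq 10


def beta_from_samples(n_samples):
    """eq 12: broadening factor for N samples across the 8-sigma peak width."""
    return math.sqrt(1.0 + KAPPA * 64.0 / n_samples ** 2)


def retained_fraction(n_samples):
    """Fraction of the ideal 2D peak capacity retained (assumes n'/n = 1/beta)."""
    return 1.0 / beta_from_samples(n_samples)


for n in (2, 3, 4, 5, 8, 16):
    lost = 100.0 * (1.0 - retained_fraction(n))
    print(f"N={n:>2}  ts/sigma={8 / n:4.1f}  beta={beta_from_samples(n):.3f}  lost={lost:4.1f}%")
```

Under this assumption, N = 8 (ts/¹σ = 1) loses a little under 10% of the peak capacity, N = 5 loses about 20%, and N = 4 loses slightly more than 25%.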
This form of the equation makes it obvious that at least five samples must be taken across the 8 ¹σ first-dimension peak width to avoid losing more than ∼25% of the peak capacity due to undersampling. An inset to Figure 7 shows more clearly how the ratio of the undersampling-corrected 2D peak capacity to the limiting theoretical 2D peak capacity asymptotically approaches unity as N increases. The ⟨β⟩ values determined using SOT are 2-35% larger (data not shown) than the ⟨β⟩ values calculated by Murphy et al.14 and Seeley15 over the range of 2 ≤ ts/¹σ ≤ 8 (where most experiments have been done), depending on the degree of saturation (i.e., number of sample constituents per unit of peak capacity) of the separation. If we take a 20% difference as a representative value, on the surface this does not seem too serious. However, if under a particular set of circumstances we were to predict conditions required to achieve an effective 2D peak capacity (n′c,2D) of 1000 using the method of Horie et al.,19 our results show that the true n′c,2D value would actually be more like 800. If we then ask what must be done experimentally to recover this 20% loss in effective peak capacity, the result is that the recovery will require much more than a 20% increase in analysis time. For example, consider the case of LC × LC where gradient elution is used in both dimensions; here recovering the 20% can be achieved by increasing the peak capacity of the first-dimension separation, which obviously will require a longer analysis time. Detailed studies of the optimization of peak capacity in gradient elution HPLC by Wang et al.22,38 show that when all operational parameters are optimized to maximize peak capacity (i.e., the best possible circumstances), peak capacity only increases with the square root of increasing analysis time (over a reasonable range in analysis time such as 10 min to 2 h).
In this case, improving the first-dimension peak capacity by 20% would require nearly a 45% increase in analysis time (1.2² = 1.44), which is obviously quite significant. The accuracy of eq 10 with respect to realistic estimates of effective 2D peak capacity should be useful in the optimization of real 2D separations to achieve specific peak capacity goals. If one realizes during the optimization process that the effect of undersampling on the 2D peak capacity is too severe, steps can be taken either to shorten the second-dimension analysis time to mitigate the undersampling effect or to improve the performance of the first-dimension separation (e.g., increase column length, increase analysis time, etc.) to recover some of the peak capacity lost due to excessive undersampling.
(37) Blumberg, L. M. J. Chromatogr., A 2003, 985, 29-38.
(38) Wang, X.; Stoll, D. R.; Schellinger, A. P.; Carr, P. W. Anal. Chem. 2006, 78, 3406-3416.
The implications for undersampled comprehensive 2D separations are clear. The 2D peak capacity is not the simple product of the first- and second-dimension peak capacities, as is commonly reported (i.e., via eq 1), but really must be significantly corrected for the reduction of the former due to undersampling of first-dimension peaks. In essence, the peak capacity of a comprehensive 2D separation is not an absolute, but instead depends on the relative magnitude of the sampling time. Over 50% of the ideal peak capacity is lost when ts/¹σ = 4, and the peak capacity loss becomes even more severe as ts/¹σ is increased further. The failure to correct the limiting peak capacities for excessive sampling times can lead to massively biased and misleading reports of resolving power of 2D separations and should be avoided. This is particularly true when comparing results from different laboratories. Critique of ⟨β⟩ Determination by SOT. Figure 8 shows graphs of ⟨β⟩ determined from SOT vs saturation (α) for various ts/¹σ ratios. Each symbol identifies a ⟨β⟩ value determined for a specific m and ¹σr, ²σr combination in Table 1 (the averages over all combinations are shown as filled circles in Figure 3). The solid and dashed lines are the average and standard deviation, respectively, of β determined from Monte Carlo simulations of interpolated profiles for single first-dimension peaks (the standard deviation of β in the ts/¹σ = 1 case is too small to see). The large variance in β values determined from Monte Carlo simulations of single peaks is due to the extreme sensitivity of the calculated standard deviation of interpolated profiles to exactly how the original Gaussian peak is sampled (see Figure 1d).
In other words, when ts > ¹σ, the phase of the sampling relative to the retention time of the peak prior to sampling has a profound effect on both the shape of the resulting interpolated profile and its calculated standard deviation. This is clearly evident from Figure 1d and Figure 2, where the sampling phase is systematically varied to show the effect on the shape of the resulting interpolated profiles. Although there is a downward drift in the ⟨β⟩ values determined from SOT with increasing α, this drift is not larger than the inherent variation in β due to the effect of sampling phase, particularly at large values of ts/¹σ, and therefore does not preclude the application of ⟨β⟩ over a wide range of conditions (0 ≤ α ≤ 1.5, 1 ≤ γ ≤ 8, 0.2 ≤ ts/¹σ ≤ 16). Nevertheless, the downward drift in ⟨β⟩ with increasing α is systematic; we believe this drift occurs not because ⟨β⟩ is inherently dependent on α, but rather because of limitations of the SOT model used to determine ⟨β⟩. A detailed analysis of these limitations is far beyond the scope of this paper, but we offer brief explanations of two particular limitations here. First, simulations of ideally sampled 2D separations show that ⟨β⟩ increases with increasing α, especially at large ts/¹σ, because overlap is very severe (e.g., for m = 400, ¹σr = ²σr = 0.02, and ts/¹σ = 16, only 13 peaks were observed, as shown in Figure 4d). SOT does not accurately account for a number of observed peaks this small, resulting in a ⟨β⟩ value that is higher than it should be. Second, simple simulations of comprehensive 2D separations using ellipses in two dimensions, rather than concentration profiles in three dimensions, to represent peaks show that ⟨β⟩ decreases rapidly above a critical saturation. This occurs at large values of ts/¹σ where discretization of the first-dimension retention times is very severe (i.e., there are few first-dimension points per peak; see Figure 2). In these cases, unusually large p values are observed, which in turn leads to values for ⟨β⟩ that are smaller than expected. The two errors largely cancel, but the second one dominates and produces the downward drift in ⟨β⟩ observed in Figure 8. A fortuitous consequence of this cancellation of errors is that the digitization of first-dimension retention times, which in principle should compromise estimation of ⟨β⟩ by SOT, does not have a serious effect.
Figure 8. Plots of ⟨β⟩ determined by SOT vs saturation (α) for ts/¹σ ratios of (a) 1, (b) 4, (c) 8, and (d) 12. Symbols refer to parametric combinations given in Table 1: set A (●), B (■), C (◆), D (▲), E (▼), F (○), G (□), H (◇), I (△), and J (▽). Multiple uses of the same symbol in each panel occur for different m values.
CONCLUSIONS In this work, we developed a means of correcting the limiting theoretical peak capacity of 2D separations to account for the very deleterious effect of undersampling peaks as they elute from the first-dimension column. An average first-dimension peak broadening factor (⟨β⟩) was determined using 2D statistical overlap theory and the number of observed peaks in simulated comprehensive 2D separations as the primary metric of separation performance. ⟨β⟩ can be calculated using the following simple expression that depends only upon the sampling time (ts) and the width of first-dimension peaks prior to sampling (¹σ):
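$$\langle\beta\rangle = \sqrt{1 + \kappa \left( \frac{t_s}{{}^{1}\sigma} \right)^{2}}; \qquad \kappa = 0.214 \pm 0.010, \quad \text{for } 0.2 \le t_s/{}^{1}\sigma \le 16$$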
The four principal conclusions of this study are as follows:
1. The ⟨β⟩ determined in this work by considering the effect of undersampling on the entire 2D separation increases upon increasing the ratio of the sampling time to the first-dimension peak standard deviation. This result is in qualitative agreement with prior work by Murphy et al.14 and Seeley,15 where the effect of undersampling on just a single first-dimension peak was considered.
2. The ⟨β⟩ values determined in this work are 2-35% larger (for practically relevant conditions) than the ⟨β⟩ values determined using the previous approaches of Murphy et al. and Seeley. These differences are very significant from a practical perspective, given the large increase in analysis time required to increase effective peak capacity.
3. The ⟨β⟩ values determined using the SOT approach are applicable over wide ranges of the number of sample constituents considered (50-1000), ratio of the reduced standard deviations of the 2D separation (1 ≤ γ ≤ 8), and saturation of the separation space (α ≤ ∼1.5), although the model begins to show signs of failure at high saturations when peak overlap is very severe.
4. The ⟨β⟩ values determined using SOT apply equally well in simulated separations of uniform randomly distributed sample constituents and in simulated separations using weakly correlated distributions of retention times obtained from previous LC × LC and GC × GC experiments.
In addition, two general observations can be made. First, the representation of sampled average concentrations of first-dimension peaks by the sampling of a convolution (as in Figure 1b) is quite general and should apply to any type of first-dimension peak, including ones that are not Gaussian. Second, ⟨β⟩ can be computed by an alternative, brute force approach, in which the standard deviations of first-dimension peaks in a simulated ideally sampled 2D separation are adjusted by trial and error to produce the same number of observed peaks as found in a simulated comprehensive 2D separation. This approach might be useful in cases of extreme retention time correlation. We believe the increased accuracy of ⟨β⟩ values and a simple empirical expression to calculate ⟨β⟩ that depends only on the ratio of the sampling time to the first-dimension peak standard deviation should ultimately be very useful in guiding the optimization of high peak capacity 2D separations.
ACKNOWLEDGMENT This work was supported by a grant from the National Institutes of Health (GM54585) and a Fellowship from the American Chemical Society Division of Analytical Chemistry to D.R.S.
SUPPORTING INFORMATION AVAILABLE A listing of the derivation of analytical equations for the dimensionless first-dimension standard deviation β of histogram-like and interpolated profiles. This material is available free of charge via the Internet at http://pubs.acs.org.
Received for review July 16, 2007. Accepted October 18, 2007. AC071504J