(15) D. VanHouweling, Office of Computer Services, Cornell University, Ithaca, N.Y. 14853.
RECEIVED for review March 23, 1977. Accepted July 1, 1977.
We are grateful to the Environmental Protection Agency for generous financial support, and to the National Science Foundation for partial funding of the DEC PDP-11/45 computer used.
Software Package to Collect and Process Radiogas Chromatographic Data
I. M. Campbell,* D. L. Doerfler, S. A. Donahey, R. Kadlec, E. L. McGandy, J. D. Naworal, C. P. Nulton, M. Venza-Raczka, and F. Wimberly
Department of Life Sciences, Faculty of Arts and Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15261
The program RADGLC collects and processes data emerging (a) from the proportional counter (PC) and the flame ionization detector (FID) of a standard radiogas chromatograph or (b) from the counter and total ion current (TIC) monitor of a combined radiogas chromatograph/mass spectrometer. Peaks are found in the FID/TIC data streams by first and second derivative analysis. Peak areas are later calculated by least squares optimization to a skewed Gaussian model. Counter data are collected in the integral mode and are processed in either the integral or the differential mode to yield measures of peak isotope content. Account is taken of the counter efficiency and the counting background. In the final report, the amounts of isotope associated with maxima in the counter output are calculated; mass and activity maxima are correlated through retention times. The use of RADGLC in monitoring in vivo radioisotope incorporation experiments is described.
Biological chemists have long sought an analytical device which, when provided with a crude radiolabeled mixture of biological origin, would separate the components in the mixture, would identify and quantitate each, and would measure the amount of radioisotope associated with each component. Very shortly after gas chromatography (GC) came into general laboratory use in the late 1950s, several groups realized that if a means could be found to assay radioactivity in a GC column effluent, one form of the desired analytical device would be at hand: a radiogas chromatograph (RGC). Karmen (1) has provided an excellent critical overview of the various techniques used to assay radioactivity in a GC effluent. In a simple RGC, compound identification depends on retention time or retention index measurements. Ambiguities associated with this method of structure assignment can be eliminated if a mass spectrometer is married to the system. Three forms of combined radiogas chromatograph/mass spectrometer (RGC/MS) have been described (2-4). For some reason, the general biochemical community has realized only slowly the potential of RGC-based methods of analysis in tackling problems in, for instance, secondary metabolite biosynthesis, drug detoxification, and intermediary metabolism control. In the past few years, however, papers using RGC-based methods have appeared with increasing frequency (e.g. 3-19). One reason for the relative lack of popularity of RGC-based methods may have been the embarrassingly large volume of raw data that such methods can produce if they are applied imaginatively. To cope with this problem, our laboratory has written a set of computer programs to collect data from an RGC and from various configurations of an RGC/MS, and to process those data into a form which is easily used by the biological chemist. The programs have been written in modular form and mostly in FORTRAN IV; they can, therefore, be adapted easily to the needs of any particular laboratory. In this paper, the program package RADGLC is described and examples of its use are given. RADGLC handles data produced by the counter module and the flame ionization detector of a simple RGC, or by the total ion current monitor and counter module of an RGC/MS. The program RADSIM, which handles an RGC/MS operating in the selected ion monitoring mode, will be described at a future time. Copies of RADGLC, together with full documentation, are available from the laboratory.
MATERIAL-HARDWARE REQUIREMENTS
In our laboratory, RADGLC operates with data obtained from (a) an RGC that consists of a Packard 7400 gas chromatograph and a Packard 894 gas flow proportional counter (PC), and (b) an RGC/MS that comprises an LKB 9000 combined GC/MS and a Packard 894 PC (3). The Packard 7400 GC is equipped with a hydrogen flame ionization detector (FID). The only modifications made to the basic, commercially available instrumentation were the replacement of the total ion current (TIC) preamplifier of the LKB 9000 by the circuitry shown in Figure 1A and the insertion of the voltage inversion amplifier shown in Figure 1B into the electrometer module of the RGC. Both amplifiers deliver a 0 to +10 volt signal to the computer. The computer is a Digital Equipment Corporation PDP-11/20, equipped with 20K words of core memory, a hardware multiply/divide unit, two RK-11 1.2M word removable disc memories, a DECtape unit, a programmable real time clock, a line printer, and a Tektronix 4010 computer display terminal. The interface to the RGC and RGC/MS is via a count register (PC) and a 15-bit analog to digital converter (FID and TIC). The latter is accessed through an eight-channel programmable multiplexor. The interface was custom built in 1971 (Analog Inc., Waltham, Mass.); since then, equivalent units have become available as stock commercial items. In our system, hard copy analog output is obtained either by feeding the FID/TIC signal and the ratemeter output of the PC, simultaneously with computer data collection, to a twin pen potentiometric recorder (e.g. Figure 2) or by transferring
ANALYTICAL CHEMISTRY, VOL. 49, NO. 12, OCTOBER 1977
Figure 1. Schematics of: (A) the replacement TIC preamplifier inserted into the LKB 9000 and (B) the voltage inversion amplifier inserted into the Packard model 878 electrometer. Both devices are available from Fairchild Semiconductor Components, Mountain View, Calif.
the digitized data via DECtapes to a DEC PDP-10 (University of Pittsburgh Computer Center) for plotting by Calcomp.
METHODS-DESCRIPTION OF RADGLC
Data Collection. Initializing RADGLC involves the operator supplying details of the GC temperature program required. These data are used to deduce the run time and data collection rate. The latter is calculated on the basis of storing 7000 pairs of fully processed FID/TIC and PC values. Net sampling rates in the range 2 to 10 points per second are typical. To give the operator maximum flexibility in extracting information from the sample, data are collected independently of any background or slope limit options that the operator may wish to establish in subsequent processing. Under normal conditions, data collection is triggered by the appearance of the GC solvent peak. This option can be overridden to permit collection to be started during an RGC run. Such an override proves useful where late-eluting components of a chromatogram are the sole interest and/or where accurate retention times are not required. The data collection subroutine is written in PDP-11 assembly language. Throughout the data collection, the ADC operates at its maximal rate of 33 kHz. Individual readings of the GC detector voltage are sampled at that rate and are averaged over a period that represents the net sampling interval minus a small time span needed to write out on disc the average GC value and the reading of the count register. The count register is read only once in each sampling interval, and only after the average GC value has been calculated and stored. Following each sampling, the switch register of the computer is interrogated; if any switch is on, data collection is terminated immediately irrespective of time elapsed.
Initial Processing of the GC Data.
As soon as data collection has been terminated, either because the prescribed time interval has passed or because the run has been ended prematurely from the switch register, the computer requests values for a limiting second derivative (LIMD2), for a GC background augmentation factor (BAF), and for a counter offset. The latter parameter allows the time frame of the counter data to be moved relative to that of the GC data to remove any errors caused by unequal path lengths to mass and radioactivity detectors. The magnitude of the offset can be determined by specifying a zero value in a standard run and measuring the retention time difference between the three radioactive peaks and mass peaks D, K, and M (Figure 2). The significance of LIMD2 and BAF will be considered below.

Figure 2. The analog output of an "ene-ol" standard. The lower trace is derived from the FID/TIC detector and indicates mass. The upper trace is derived from the rate meter of the PC and is a measure of radioactivity. Radioactivity is associated with peaks D, K, and M. Recording conditions: temperature program, hold 1 min at 135 °C, then to 255 °C at 10°/min for final hold of 1 min; electrometer setting, 1/16 × 10-'' A full scale; PC range, 2000 cpm full scale; PC time constant, 10 s; chart speed, 0.25 in./min.

In the initial pass through the GC data, the objective is to detect peaks and determine whether they are single isolated peaks or are components of a multiplet. Times (T) and signal strengths (H) when peaks begin (T1, H1), inflect (T2, H2), maximize (T3, H3), inflect again (T4, H4), and return to baseline (T5, H5), together with a peak type descriptor (TAG), are determined and stored. During this phase, the raw data are smoothed significantly. All GC data points used in calculations are nine-point running averages of raw data. Values of the first derivative (D1) used in decision making result from two successive nine-point running averages of a simple divided difference. Values of the second derivative (D2), also used in decision making, are single nine-point moving averages of simple divided differences between singly averaged first derivative values. This quite extensive smoothing does not adversely affect peak resolution or area measurements; it does reduce the incidence of spurious maxima caused by noise spikes. The peak detection algorithm depends on the fact that single isolated peaks, unresolved doublets and multiplets, frontside and backside shoulders, and noise spikes can all be discriminated on the basis of first (D1) and second (D2) derivative analysis and a knowledge of the value of the background signal (D0). The principal features of the program logic that deal with each of these peak forms are described in Table I.
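The smoothing scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the RADGLC code (which is FORTRAN IV and PDP-11 assembly): the function names are hypothetical, a forward difference stands in for the "simple divided difference", and the differences are assumed to be taken on the already smoothed signal.

```python
def running_mean(values, width=9):
    """Running average over `width` points; returns len(values) - width + 1 points."""
    if width < 1 or len(values) < width:
        return []
    window_sum = sum(values[:width])
    out = [window_sum / width]
    for i in range(width, len(values)):
        window_sum += values[i] - values[i - width]
        out.append(window_sum / width)
    return out


def divided_difference(values, dt):
    """Simple forward divided difference (assumed form of the paper's difference)."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def smoothed_derivatives(raw, dt, width=9):
    """Return (signal, d1, d2) smoothed as described in the text.

    signal: nine-point running average of the raw data
    d1:     two successive running averages of the divided difference
    d2:     one running average of the divided difference taken between
            singly averaged first-derivative values
    Note: each list is shorter than the last; index alignment is left to the caller.
    """
    signal = running_mean(raw, width)
    d1_single = running_mean(divided_difference(signal, dt), width)
    d1 = running_mean(d1_single, width)
    d2 = running_mean(divided_difference(d1_single, dt), width)
    return signal, d1, d2
```

For a linear ramp, the sketch returns a constant first derivative and a zero second derivative, which is a convenient sanity check before applying it to real detector data.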
Table I. Logic Used by RADGLC in Detecting Peaks and Identifying Their Form

GC peak feature:                 Start         First        Maximum   Second       End(b)               TAG
                                               inflection             inflection
defined by:(a)                   D1 > 0,       D2 ≤ LIMD2   D1 < 0    D2 ≥ LIMD2   Signal ≤ D0 + BAF
                                 D2 > LIMD2

As encountered in:
(a) Single isolated peak         F             F            F         F            F                    11111
(b) Unresolved doublet
    first peak                   F             F            F         F            NF                   11110
    second peak                  E             F            F         F            F                    21111
(c) Unresolved multiplets
    first peak                   F             F            F         F            NF                   11110
    middle peaks                 E             F            F         F            NF                   21110
    last peak                    E             F            F         F            F                    21111
(d) Frontside shoulders
    first shoulder               F             F            C         E            NF                   11320
    all intermediate shoulders   NF            F            C         E            NF                   01320
    main peak                    NF            F            F         F            F                    01111
(e) Rearside shoulders
    main peak                    F             F            F         E            NF                   11120
    all intermediate shoulders   NF            C            E         E            NF                   03220
    last shoulder                NF            C            E         F            F                    03211

(a) Abbreviations: D1, first derivative; D2, second derivative; D0, background signal (taken as the signal strength five data points before a peak group beginning); BAF, background augmentation factor; TAG, peak descriptor; F, precise values for H and T at the peak feature were found directly; NF, reverse of F; E, estimates for values of H and T were found directly; C, values of H and/or T at the peak feature were calculated by interpolation.
(b) An additional criterion to define a peak end is that the interval (T5 - T3) ≥ 4 × (T3 - T2).
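The TAG values in Table I appear to encode, digit by digit, how each of the five peak features was obtained. The sketch below is an inference from the table rather than a documented format: it assumes the digits run in the order start, first inflection, maximum, second inflection, end, with 0 = NF, 1 = F, 2 = E, and 3 = C. The digit 5 is included because the text later states that a truncated final peak is marked by setting the last digit of TAG to 5. The function name and the short labels are hypothetical.

```python
# Assumed digit-to-code mapping, inferred from the rows of Table I.
FEATURES = ("start", "first_inflection", "maximum", "second_inflection", "end")
CODES = {
    "0": "NF",   # not found
    "1": "F",    # found directly
    "2": "E",    # estimated
    "3": "C",    # calculated by interpolation
    "5": "LAST"  # end taken as last value collected (truncated final peak)
}


def decode_tag(tag):
    """Decode a TAG such as 11320 into a feature -> code dictionary.

    Leading zeros may be dropped in printed reports (e.g. 3211 for 03211),
    so the tag is left-padded to five digits before decoding.
    """
    digits = str(tag).zfill(5)
    if len(digits) != 5 or any(d not in CODES for d in digits):
        raise ValueError(f"unrecognized TAG: {tag!r}")
    return {feature: CODES[d] for feature, d in zip(FEATURES, digits)}
```

For example, `decode_tag("11320")` describes a first frontside shoulder: start and first inflection found, maximum interpolated, second inflection estimated, end not found.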
To prevent small fluctuations in the second derivative causing false triggers, D2 is always compared to the range ±LIMD2 when its sign needs to be sensed. If LIMD2, an operator-specified limit, is set to a small finite value, e.g. 40 mV/s/s, small fluctuations in D2 caused by noise tend to be filtered out; setting LIMD2 to zero allows data to be processed without this filtering. To account for peak broadening under isothermal conditions, LIMD2 is decreased logarithmically with time in isothermal sections of the chromatogram. Since fluctuations in D2 occur most frequently when the signal is close to background, a further criterion has been included: shoulders found when the signal strength is no more than what it was at the peak start plus a specific increment are neglected. Two incremental values are built into the program: 60 mV if in the initial dialogue the operator opted to neglect small peaks, 6 mV if this option was not exercised. As Table I shows, the sign change sequence in D1 and D2 defines peak forms. Not all sign transitions are allowable; the permitted transitions are as shown below (D1, D2).
[Diagram: permitted sign transitions between (D1, D2) states]

All illegal changes in sign are logged as noise spikes and the total count is reported at the end of the run. RADGLC gives the operator two options in dealing with noise spikes. If the operator opts to ignore noise spikes, RADGLC discounts the point that gave rise to the illegal sign change and proceeds undeterred; if the operator did not so opt, RADGLC halts processing of the current peak and begins the peak search routine anew. It has been our experience that noise spikes as defined above are encountered only with electrometer settings more sensitive than 1 × 10-" A, full scale. Since the end of data collection may not correspond with the end of the radiochromatogram, a partial peak can exist at the end of the data stream. If the maximum of that peak has not been reached, the peak is abandoned; if it has been reached, T5 and H5 are taken as the last values collected and this possible inaccuracy is labeled in TAG by setting its last digit to 5. Logic has been built into RADGLC to ensure that if the program's limit of 80 peaks is reached while a group is being analyzed, the log on the previous peaks in the group is closed appropriately before peak detection ceases.

After the peak detection is completed, peak retention times, values of T1-T5, H1-H5, and TAG are listed on the line printer together with a log of the number of points collected, the number of noise spikes and negative points, the sample collection rate, the number of groups identified, etc. (see Table II). For purposes of processing, RADGLC considers a single isolated peak as a group consisting of a single peak. Simultaneously, a plot of the GC and the integral or derivative of the PC data appears on the display screen. The PC trace has been offset to the specified degree to eliminate any inaccuracies caused by path-length difference to mass and radioactivity detectors. Each peak detected and, therefore, available for integration is indicated on the screen, as are the peak groups. Figure 3 is a representation of such a plot. The range of a group, as defined by the start of the first peak and the end of the last, is indicated by a range marker; tic marks extending below the range marker indicate maxima of individual peaks in the group (T3). A single isolated peak is indicated with a range marker with a single tic mark. At this point the complete file can be copied onto DECtape and transferred to the PDP-10 of the Computer Center for plotting on a Calcomp plotter.

Figure 3. Representation of the trace seen on display screen (or Calcomp plotter). The derivative form of the PC is shown. Horizontal axis: time (minutes).
Table II. Report Issued by RADGLC Following Completion of the Peak Detection Routine(a)

Peak     Time(c)                                 Intensity(c)                          Tag
No.(b)   Start   Infl.   Max.    Infl.   End     Start   Infl.   Max.    Infl.   End
 1        0.51    0.57    0.60    0.63    0.71    1709    6977   10000    6978   1197   11111
 2        1.49    1.59    1.63    1.67    1.80     428    4617    7229    4804    525   11111
 3        2.24    2.39    2.43    2.48    2.70     199    4038    6313    4301    298   11111
 4        2.86    2.94    2.99    3.04    0.00     209    2802    5174    3395      0   11120
 5        0.00    3.31    3.36    3.44    3.57       0    1859     324     264    174    3211
 6        3.85    3.95    4.00    4.06    0.00     148    3631    6214    4291      0   11110
 7        4.25    4.37    4.42    4.47    4.80     465    3651    5426    3756    248   21111
 8        4.91    5.03    5.08    5.14    0.00     201    3866    6329    4200      0   11110
 9        5.31    5.44    5.49    5.54    0.00     571    3963    5718    3874      0   21120
10        0.00    5.74    5.79    5.88    5.99       0    2138     402     292    197    3211
11        6.00    6.12    6.17    6.23    0.00     195    3594    6112    3939      0   11110
12        6.39    6.50    6.56    6.62    6.83     607    3084    4711    3197    297   21111
13        7.06    7.19    7.25    7.30    7.58     166    3506    5666    3711    266   11111
14        8.11    8.25    8.30    8.36    8.66     161    3796    6229    4255    261   11111
15        9.12    9.26    9.32    9.38    0.00     152    3134    5102    3437      0   11120
16        0.00    9.65    9.71    9.77    9.94       0    1868     299     261    180    3211
17       10.10   10.25   10.31   10.37   10.66     166    3164    5502    3691    266   11111

Number of groups = 11
Collection rate was 8.35 points per second
Limiting "D" 4.00 mV/s/s. "Background" 100. mV
Discrimination against small peaks in effect
0 Negative points. 18 Noise spikes
Number of points in the file is 7010 (14.0 mins)

(a) Sample was an ene-ol (Figure 2).
(b) Peaks numbered 1 through 17 correspond to A, B, C, D, b, E, F, G, H, c, I, J, K, L, M, f, and N, respectively.
(c) In minutes and millivolts, respectively.
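The ±LIMD2 comparison used when sensing the sign of D2 (the "Limiting D" value echoed in the report above) can be illustrated with a small dead-band function. This is a sketch under assumptions: the function names are hypothetical, and the counting helper simply tallies reversals between sensed signs; it does not reproduce RADGLC's table of permitted (D1, D2) transitions.

```python
def sense_sign(value, limit):
    """Return +1 or -1 when `value` lies outside the band ±limit, else 0.

    With limit = 0 every nonzero value is sensed, i.e. no filtering.
    """
    if value > limit:
        return 1
    if value < -limit:
        return -1
    return 0


def count_sign_changes(values, limit):
    """Count reversals between successive sensed (nonzero) signs."""
    changes = 0
    last = 0
    for value in values:
        sign = sense_sign(value, limit)
        if sign == 0:
            continue  # inside the dead band: treated as no information
        if last != 0 and sign != last:
            changes += 1
        last = sign
    return changes
```

With the hypothetical second-derivative values `[5.0, -1.0, 2.0, 6.0, -5.0]`, a limit of 4 senses a single reversal, whereas a limit of 0 senses three; the small excursions around zero are what the dead band is meant to suppress.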
Integration of Selected GC Peaks. Prior to integration the operator is given considerable latitude to modify the outcome of the computer's preliminary analysis of the data. The principal purpose for incorporating this latitude into RADGLC was not to subvert the process of automation and computer-based decision-making but to permit the operator some flexibility in handling the types of RGC trace that, experience has taught us, emerge from work with biological systems. Specifically:
(a) Any group(s) of peaks found in the preliminary pass can be eliminated from further consideration. This option is particularly useful where only certain areas of the total chromatogram are of interest, e.g. where only a few sections of the chromatogram contain radiolabeled components.
(b) The computer can be asked to re-examine sections of the data for mass peaks that it did not find in the initial pass. The range of each of the new groups must be specified together with approximate positions for the beginning (T1), inflection points (T2, T4), maximum (T3), and end (T5) of each peak desired to be in the group. Such specifications are made using the cursors of the Tektronix 4010.
(c) Groups that contain two or more peaks can be split into a larger number of subgroups for independent integration.
(d) Groups can be fused.
(e) The background found by the computer for any group can be redefined with the aid of the Tektronix cursors. This ensures that the best possible estimate of background level is fed to the integration routine.
(f) Peak(s) in a given group can be eliminated from consideration. This option is useful where a limited number of peaks in a group is of importance, or where it is desired to integrate as a single peak a peak on whose flanks shoulder(s) have been found.

The peak integration algorithm relies on a least squares minimization fit of the experimental data to skewed Gaussian peak forms. We elected to use such a classical peak form rather than relying on measured peak shapes as models because, in using the former, values for the skew and half-band width are obtained. Our experience is that these parameters, together with resolution measurements, allow continual evaluation of GC columns and splitter performance during a set of runs. The peak form given by Equation 1, due in part to Fraser and Suzuki (20), is employed. It expresses the signal strength at time t in terms of H0, the maximum signal strength; R0, the retention time; s, the skew (if s = 0, the peak is Gaussian); and σ, the parameter that for a pure Gaussian would be half the peak width measured at the inflection points. Any negative values of the logarithmic argument are set to zero. Groups containing up to and including 6 peaks can be handled simultaneously. The optimization logic is based on the work of Law and Bailey (21). As first approximations for H0, R0, and σ, respectively, the values H3, T3, and (T3 minus T2) that were obtained in the peak detection algorithm are taken. Arbitrarily, the parameter s is set to 0.001 initially. The process of least-squares refinement is repeated until the desired degree of fit is achieved. Currently, RADGLC assumes that the "best" fit has occurred when the absolute change in T3 and T2 from one iteration to the next is less than 1.2 s, the absolute change in the skew is less than 0.002, and the relative change in H3 is less than 0.4%. A limit of 20 successive iterations is set on the fitting procedure for each group. The operator is informed if the required degree of "goodness" of fit was not achieved in 20 iterations. Since at each stage of iteration the fitted curve and the experimental profile are superposed on the Tektronix screen, the process of fitting can be followed. Figure 4 shows three stages in the fitting procedure of a section of the "ene-ol" (3) standard shown in Figure 2. Provision is made through a switch to terminate the iteration procedure for a given group at any stage short of the optimal.

The various convergence assurances proposed by Law and Bailey have been employed (see reference 21, pages 194-196). The D test for the appropriate sign of combination of the elements of matrix b proves very valuable. In their test for rapid convergence, ΔS² - βS_L² ≥ 0, β is maintained at 0.25. Tests with β in the range 0.1 ≤ β ≤ 0.25 failed to reveal any significant effect on the fitting procedure. If submultiples of the elements of matrix b have to be used, the restriction factor, α (see reference 21), is set initially to 1.00 and is decreased by factors of two a maximum of ten times. In addition to these controls proposed by Law and Bailey, a constant watch is maintained that: (i) the position of no peak maximum (T3) in a group moves backward or forward by more than the original estimated σ for the peak, (ii) the position of the first inflection point of the first peak in a group never moves outwith the boundaries of the group, (iii) peak heights never become zero or less, (iv) the skew never exceeds ±2.5, and (v) the T2 of a peak in a group never becomes less than the T3 value of the preceding peak. If any of these events occurs, the offending parameter is set at the limiting value it tried to exceed, the operator is so informed, and iteration continues with the difference that changes in the offending parameter are not included in the test for goodness of fit.

The actual process of peak integration involves calculating the value of the peak-defining functions at each sampling point and summing the ordinates. At this juncture, the experimental data points for single peaks and peak groups are also summed and stored. The two areas are referred to respectively as the "fitted" and "experimental" areas. The final report (Table III) identifies peaks by peak and group number, and lists the fitted area for each peak (background subtracted), the total area, and the percentage of the total that each peak represents (%-mass). For groups, the percentage of the total group area that each component peak represents is also reported (%-gmass). The experimental area of each group is listed together with a computed area for each peak in the group. These latter values are obtained by dividing the total experimental area of the group in accord with the %-gmass data obtained for that group by the fitting process. %-Mass values for experimental areas are calculated; for multipeak groups, the values for component peaks are placed in parentheses to indicate their approximate nature. Percentage differences between experimental and fitted areas of each group are reported together with retention times, σ, and skew values.

Table III. Final Report of RADGLC-GC Data

                      Experimental           Fitted                       Retention
Group(a)  Peaks(b)    Area(c)    %-Mass(d)   Area(c)    %-Mass(d)         time(e)    Sigma(e)   Skew(f)   % Diff
 1        1-1          38 703.     5.59       40 285.     5.88             0.60      0.026      0.221     -4.09
 2        2-2          43 564.     6.29       43 235.     6.31             1.63      0.042      0.202      0.75
 3        3-3          47 835.     6.91       47 053.     6.87             2.43      0.052      0.240      1.63
 4        4-5          43 781.     6.33       44 948.     6.56                                  0.435     -2.66
          4            39 639.    (5.73)      40 695.     5.94 (90.54)     2.99      0.052
          5             4 142.    (0.60)       4 253.     0.62 (9.46)      3.32      0.280
 5        6-7         103 176.    14.91      100 918.    14.73                                  0.333      2.19
          6            53 607.    (7.75)      52 434.     7.65 (51.96)     4.00      0.056
          7            49 569.    (7.16)      48 484.     7.08 (48.04)     4.42      0.062
 6        8-10        110 520.    15.97      108 675.    15.86                                  0.256      1.67
          8            55 659.    (8.04)      54 730.     7.99 (50.36)     5.08      0.060
          9            52 413.    (7.57)      51 538.     7.52 (47.42)     5.49      0.064
          10            2 448.    (0.35)       2 407.     0.35 (2.21)      5.77      0.080
 7        11-12        96 669.    13.97       95 360.    13.92                                  0.243      1.35
          11           53 923.    (7.79)      53 192.     7.76 (55.78)     6.17      0.060
          12           42 746.    (6.18)      42 167.     6.15 (44.22)     6.56      0.064
 8        13-13        50 720.     7.33       49 919.     7.29             7.24      0.062      0.276      1.58
 9        14-14        56 932.     8.23       55 771.     8.14             8.30      0.060      0.253      2.04
10        15-16        49 525.     7.16       49 418.     7.21                                  0.220      0.22
          15           45 688.    (6.60)      45 589.     6.65 (92.25)     9.32      0.062
          16            3 837.    (0.55)       3 829.     0.56 (7.75)      9.67      0.316
11        17-17        50 687.     7.32       49 557.     7.23            10.31      0.062      0.254      2.23
Total                 692 112.   100.00      685 140.   100.00                                             1.01

(a) As defined in Table II, but subject to any group exclusion option the operator may have exercised.
(b) The symbol n-m indicates the range of integration; n-n indicates that peak n has been integrated as an isolated peak. Peak numbers correspond to those of Table II, but again are subject to any peak exclusion option exercised.
(c) In units of mV.min-1. Experimental areas for peaks in a multiplet are computed by dividing up the total group area in proportions dictated by %-gmass.
(d) The percentage of total area that each peak represents. Following the fitted %-mass value for a group and in parentheses are %-gmass values. These represent the proportion of total group area that each component peak represents.
(e) In minutes.
(f) Dimensionless but signed; constrained to be the same for each component in a multiplet.

Figure 4. Progress of the peak fitting process for the triplet of peaks G, H, and c (Figure 2). (A) Initial guess using data from the peak detection routine; (B) following one round of optimization; (C) the final fit following five rounds of optimization. Points are raw data; continuous lines are the generated curves.

Integration of the PC Data. Radioactivity integration is accomplished interactively through the display screen. The interactive mode was selected since PC data quality is not as good as GC data quality (see Figure 5). Integration can be performed with the data in the integral form in which it was collected (total counts), or as a first derivative with respect to time.
Data can be used raw or after varying, selectable degrees of smoothing. The operator is required to indicate at the beginning of the process whether the sample being examined is a standard or a live run. If the sample is an "ene-ol" standard, RADGLC requests the sample volume injected. Since the radioactivity level per µL of the standard is built into the program as a constant, the disintegration rate associated with each of the three radioactive alcohol peaks, and therefore the PC efficiency, can be calculated. If the sample is a live run, the PC efficiency needs to be entered. Whether the sample is a standard or a live run, the gas flow rate through the PC detector must be provided by the operator. In the integral mode the data, smoothed or otherwise, are plotted on the display screen. To make manipulation easier, the effect of a net background count of 24 cpm is subtracted from the data prior to plotting. Using the plotted data, the operator is required to specify with the Tektronix cursors pairs of points on the plot that can be used to define background, and up to 50 pairs of points that define sectors of the plot that are to be integrated. The program allows different backgrounds to be selected for each sector or permits the same background to be used for two or more sectors. In some ways the differential mode of counter data integration is more easily used, since it resembles the analog trace produced by a noncomputerized RGC, with which operators are familiar. The first derivative is calculated by a simple divided difference and is plotted directly on the display screen. The operator can select the range over which the differential is taken and can control the vertical scaling of the PC peaks. Integration is achieved by marking the beginning of the PC peak, its maximum, and its end with the Tektronix cursors. The count rate corresponding to the simple average of the peak beginning and its end is taken as the background value. Up to 50 individual peaks can be processed in a single pass.

The reports of both integral and differential integrations are similar (Table IV). Both are prefaced by a list of general characteristics about the run, to wit: the title, the gas flow rate, the extent of PC offset, and the degree of smoothing employed. For an ene-ol standard, the background level for each peak, the count rate associated with that peak (background subtracted), its retention time, and the calculated efficiency of the PC at that retention time are provided. The percentage standard deviation associated with 95% confidence in the count rate of each peak is also calculated. The mean background, mean efficiency, total count rate associated with all the peaks, and the proportion of that total which each peak represents (%-activity) are calculated. As a check on the integration process and its completeness, the total number of counts associated with all the peaks examined is subtracted from the total number of counts collected in the run. When this residual count is divided by the total run time, a residual count rate close in value to the mean background level should be obtained if all is well. The report for a "live" sample differs from that of a standard in that the disintegration rate associated with each peak replaces the percent efficiency entry. In a live run, the correlation between mass and radioactivity maxima is made by retention time comparisons by the operator. If an internal standard has been incorporated into the live run, provision has been made to identify its mass peak to the computer and thereafter allow RADGLC to calculate the other three SPACAL parameters (see reference 10): the relative specific activity, the mass, and activity ratios.

Table IV. Final Report of RADGLC-PC Data

PC INTEGRATION
The total number of counts = 1652. Flow = 60 cm3/min. PC offset = 0.19 min.
2.5 µL of ene-ol standard. PC data has been smoothed 4.

Integral Mode
Peak   Bkg cpm   Peak cpm(a)   % S(b)   Efficen   Time(c)
1      24.5      1234.         (10.)    76.       3.04
2      37.8      1103.         (10.)    68.       7.31
3      37.8      1040.         (11.)    64.       9.45
Mean background 33.3    Total peak cpm 3376    Mean efficiency 69.3%
Residual counts 527     Residual cpm 37.6
Peak   %-activity
1      36.5
2      32.7
3      30.8

Differential Mode
Slope determined over 82 points. Scale multiplied by 1
Peak   Bkg cpm   Peak cpm      % S      Efficen   Time
1      29.9      1228.         (10.)    76.       3.01
2      37.8      1099.         (10.)    68.       7.26
3      37.0      1049.         (10.)    65.       9.33
Mean background 34.9    Total peak cpm 3376    Mean efficiency 69.3%
Residual counts 527     Residual cpm 37.6
Peak   %-activity
1      36.4
2      32.6
3      31.1

(a) cpm = counts/minute.
(b) Percentage standard deviation of the measurement at 95% confidence.
(c) In minutes.

Table V. Comparison of the Effectiveness of a Pure and Skewed Gaussian in Fitting GC Peaks with RADGLC

              Percentage difference between experimental and fitted areas
Peak No.(a)   Pure Gaussian          Skewed Gaussian
1              2.31                   1.91
2              5.74                   1.25
3             11.59                   0.79
4             23.33                   0.20
5             13.05                   2.12
6              9.69                   2.18
7              7.44                   1.45
8-13          10.85; 5.75; 6.96      -0.02; 0.16; 1.12
14             5.95                   0.09

(a) The values were obtained by processing the same "ene-ol" standard. Peak numbers correspond to peaks A-N in Figure 2.
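The bookkeeping behind the PC report can be sketched as follows. This is an illustrative reading of the description above, not RADGLC code: all function names are hypothetical, and the 14.0 min run time used in the usage note is taken from the GC report of Table II on the assumption that both tables refer to the same run.

```python
def peak_background(rate_at_start, rate_at_end):
    """Background for a peak: simple average of the count rates (cpm)
    at the marked beginning and end of the peak."""
    return (rate_at_start + rate_at_end) / 2.0


def efficiency_percent(net_cpm, expected_dpm):
    """Counter efficiency for a standard: net count rate as a percentage
    of the known disintegration rate."""
    return 100.0 * net_cpm / expected_dpm


def percent_activity(net_cpms):
    """Proportion of the total peak count rate that each peak represents."""
    total = sum(net_cpms)
    return [100.0 * cpm / total for cpm in net_cpms]


def residual_cpm(total_counts, peak_counts, run_minutes):
    """Residual count rate: counts not assigned to any peak, divided by
    run time. It should lie close to the mean background if all is well."""
    return (total_counts - peak_counts) / run_minutes
```

Applied to the integral-mode figures of Table IV, `percent_activity([1234, 1103, 1040])` gives values that round to 36.5, 32.7, and 30.8, and a residual of 527 counts over 14.0 min gives about 37.6 cpm, in line with the reported residual count rate.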
RESULTS Superiority of the Skewed Gaussian Peak Form over the Simple Gaussian. In Table V are presented values obtained for the percentage difference between experimental and fitted data using both simple and skewed Gaussian peak forms. The same raw data were used in both cases. The conclusion is clear cut; the skewed Gaussian is superior. It is significant that the worst fit with the pure Gaussian occurred with peak D, n-tetradecanol. It is our experience that this peak of the “ene-ol” standard is most subject to skew; indeed, we use the extent of this skew most commonly as an index of GC column efficiency.
ANALYTICAL CHEMISTRY, VOL. 49, NO. 12, OCTOBER 1977
Influence of the Parameters LIMD2, BAF, and the Small Peak Rejection Option on the Operation of RADGLC.
To describe the influence of these three factors on the operation of RADGLC, the “ene-ol” trace shown in Figure 2 will be used. Note that in it there are fourteen well defined peaks, A through N. Peak A is off-scale. Minor peaks close to the base line are a through h. The experienced gas chromatograph user would most likely group these 22 peaks as follows: A, singlet; B + a, doublet; C, singlet; D + b, doublet; E + F, doublet; G + H + c, triplet; I + J, doublet; K + d, doublet; L + e, doublet; M + f, doublet; N, g, and h, all singlets. For many purposes, only the 14 major peaks (A through N) would be considered of significance. RADGLC, set with LIMD2 = 20 mV/s/s, the background = 100 mV, and with the small peak rejection option in operation, found the 14 major peaks and placed them in the expected 11 groups. If LIMD2 was dropped in value to 4 mV/s/s with all the other parameters maintained at their previous level, the 17 peaks A through N, b, c, and f were detected. Peak b was associated with peak D as a doublet; peak c was associated with peaks H and G as a triplet; and peak f was associated with peak M as a doublet. Further reduction of the value of LIMD2 had the effect of visualizing more of the smaller peaks. At values around 1.0 mV/s/s, however, RADGLC became unstable and picked up spurious shoulders on most of the large peaks. At 0.8 mV/s/s, for instance, such shoulders are found on peaks D, F, L, J, L, M, and N. Most of our work is done with LIMD2 set at 4.0 mV/s/s. Variation of BAF affects principally the number of peaks placed in a given group. With LIMD2 = 4.0 mV/s/s, the small peak rejection option in effect, and the background set at 100 mV, the 17 peaks detected are placed in 11 groups as described above. If all other parameters are held constant and the background is dropped to 20 mV, RADGLC places peaks E through J and c in a single, seven-peak group. Peaks K and L are associated as a doublet. The treatment of the other peaks remains the same.
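The qualitative behavior reported above (a lower LIMD2 reveals broader, smaller peaks; a lower background merges neighbors into larger groups) can be reproduced with a toy detector. The sketch below is an assumption-laden illustration, not the RADGLC algorithm: it treats LIMD2 as a threshold on the magnitude of the second derivative at a local maximum, treats BAF as a signal level that the trace must drop to or below before one group ends and the next begins, and uses a synthetic trace (`demo_trace`) invented for the example.

```python
import math


def find_peak_groups(signal, dt, limd2, baf):
    """Return groups of apex indices found in a uniformly sampled trace.

    signal : detector readings in mV, sampled every dt seconds
    limd2  : minimum curvature magnitude at an apex, in mV/s/s (assumed role)
    baf    : background level in mV (assumed role: apexes must exceed it, and
             neighbors are grouped while the trace between them stays above it)
    """
    apexes = []
    for i in range(1, len(signal) - 1):
        rising = signal[i] - signal[i - 1]
        falling = signal[i + 1] - signal[i]
        # Central-difference estimate of the second derivative.
        d2 = (signal[i + 1] - 2.0 * signal[i] + signal[i - 1]) / (dt * dt)
        if rising > 0 and falling <= 0 and d2 <= -limd2 and signal[i] > baf:
            apexes.append(i)

    groups = []
    for idx in apexes:
        if groups and min(signal[groups[-1][-1]:idx + 1]) > baf:
            groups[-1].append(idx)   # valley never reached the background
        else:
            groups.append([idx])     # trace returned to background: new group
    return groups


def demo_trace():
    """Synthetic trace: (height mV, center s, width s) for four Gaussian peaks."""
    peaks = [(1000.0, 20.0, 2.0), (800.0, 26.0, 2.0),
             (900.0, 60.0, 2.0), (150.0, 90.0, 4.0)]
    return [sum(h * math.exp(-((t - c) ** 2) / (2.0 * w * w)) for h, c, w in peaks)
            for t in range(120)]
```

On this synthetic trace, sampled once per second, a threshold of 20 mV/s/s with a 100-mV background finds the three sharp peaks and pairs the first two as a doublet; lowering the threshold to 4 mV/s/s also picks up the broad peak at 90 s; raising the background to 700 mV turns every detected peak into a singlet and loses the small one.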
If BAF is increased to 500 mV, all the minor peaks are lost and each of the major peaks is seen to be a singlet. Raising BAF to an artificially high level, 500 mV for instance, has proved a useful technique in preliminary processing of complex gas chromatograms. However, area values produced by such a procedure are underestimated. For example, a fitted area of 47663 units was found for peak N with a 500-mV background. The corresponding values for 100 and 20 mV backgrounds were 49557 and 50440, respectively. Recall, however, that in RADGLC there is provision for redefinition of the background by the operator prior to peak integration. This fact can be used to combat underestimation. With LIMD2 and BAF set at 4.0 mV/s/s and 100 mV, respectively, and the small peak rejection option in effect, 17 peaks were detected as described above. If the small peak rejection option is inactivated, three other peaks are detected: to wit, a, g, and h.
The Ability of RADGLC to Estimate the Area of Peaks That Have Topped Out. Despite the care of the operator in selecting sample size and GC electrometer setting, it sometimes happens that a major component in a biological sample will saturate the electrometer and so produce a square topped peak. Provided the peak is not so large that its inflection points as well as its maxima are lost, RADGLC will make an attempt to estimate the area. The estimate produced is of a much lower degree of accuracy than the normal fitting procedure but it can be useful. For example, in Figure 2, peak A topped out. The ratio of the experimental areas of peaks A and B was 88.1%. After fitting, the ratio became 93.2%. Running another sample in which both peaks were on scale, the ratio was found to be 96.6%.
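To illustrate why the flanks of a topped-out peak can still yield an area estimate, the sketch below fits a pure Gaussian to the unsaturated points only. It is a simplified stand-in and not the RADGLC procedure, which uses a skewed Gaussian model: here the logarithm of a Gaussian is a parabola in time, so an ordinary least-squares quadratic fit to the on-scale points recovers height, position, and width. The function name, the `ceiling` and `floor` arguments, and the synthetic data used to exercise it are assumptions.

```python
import math


def fit_clipped_gaussian(times, signal, ceiling, floor):
    """Estimate (height, center, sigma, area) of a Gaussian peak from the
    points lying strictly between floor and ceiling (the unsaturated flanks)."""
    pts = [(t, math.log(y)) for t, y in zip(times, signal) if floor < y < ceiling]
    if len(pts) < 3:
        raise ValueError("need at least three on-scale points")

    # Fit ln(y) = a + b*u + c*u**2 with u = t - t0 (centering aids conditioning).
    t0 = sum(t for t, _ in pts) / len(pts)
    s = [sum((t - t0) ** k for t, _ in pts) for k in range(5)]
    r = [sum(ly * (t - t0) ** k for t, ly in pts) for k in range(3)]
    m = [[s[0], s[1], s[2], r[0]],
         [s[1], s[2], s[3], r[1]],
         [s[2], s[3], s[4], r[2]]]

    # Solve the 3x3 normal equations by Gaussian elimination with pivoting.
    for i in range(3):
        p = max(range(i, 3), key=lambda k: abs(m[k][i]))
        m[i], m[p] = m[p], m[i]
        for k in range(i + 1, 3):
            f = m[k][i] / m[i][i]
            for j in range(i, 4):
                m[k][j] -= f * m[i][j]
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (m[i][3] - tail) / m[i][i]
    a, b, c = coef
    if c >= 0:
        raise ValueError("flank points do not describe a peak")

    sigma = math.sqrt(-1.0 / (2.0 * c))
    center = t0 - b / (2.0 * c)
    height = math.exp(a - b * b / (4.0 * c))
    area = height * sigma * math.sqrt(2.0 * math.pi)
    return height, center, sigma, area
```

The fit relies on enough of each flank remaining on scale, which mirrors the condition stated above that the inflection points must not be lost. Real peaks tail, so an estimate of this kind should be expected to be less accurate than a fit to an on-scale peak.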
Influence of the Degree of PC Data Smoothing on the Operation of RADGLC. Five levels of data smoothing are provided by RADGLC for PC data. Such versatility proves useful in dealing with samples wherein a range of isotope contents occurs in various areas of the chromatogram. Figure 5 illustrates this point. It was obtained by feeding [2-14C]acetate to a 62-h old culture of the fungus Penicillium brevicompactum, by making a methylated nonpolar soluble extract (see reference 3 for methodology), and by running the sample on our LKB 9000/Packard 894 RGC/MS. Figures 5A, B, and C were obtained with smoothing factors of 0, 2, and 5, respectively; the derivative being taken over 26 points in each case. Two zones of the chromatogram illustrate well the influence and value of smoothing (marked I and II in Figures 5A, B, and C). Note that in going from smoothing factor 0 to smoothing factor 2, the multiplets in the two zones become more distinct, thus making peak integration more accurate. As would be expected, continued smoothing eventually leads to a loss of resolution. Thus with smoothing factor 5, the multiplets in zones I and II are again diffuse. In routine work we find it useful to examine a given set of PC data with several smoothing factors, and use the optimal one for integration. From Figure 5, there can be seen an important advantage of using RADGLC to process PC data as opposed to relying on the analog trace produced by the PC rate meter. Experience has told us that the PC rate meter operates optimally with a time constant of 10 s. Figure 5D is the analog trace that was produced when the data that give rise to Figures 5A-C were being collected. Note particularly that the multiplet in zone I in Figure 5D is almost completely masked by the tail of the large preceding peak.
Examples of the Types of Biological Data to Which RADGLC Can Give Access.
The sample which gave rise to Figure 5 was obtained during an experiment designed to determine how intermediary and secondary metabolism in P. brevicompactum changed as the fungus developed through trophophase into idiophase (see reference 19 and citations therein for background). Since it is a bona fide “mixture of biological origin”, it can be used to illustrate the types of data that can be obtained quite automatically through use of RADGLC.
In its preliminary report, RADGLC indicated the presence of 33 GC peaks. The retention times of these peaks together with their normalized peak heights can be used as the coordinates of a point in 33-dimensional space which is characteristic of the small molecule metabolism of the fungus at the time the sample was taken. With the settings used to analyze the data (LIMD2 = 4.0 mV/s/s; BAF, 100 mV; discrimination against small peaks in effect), the 33 GC peaks were collected into 17 groups. The largest of these groups contained 18 peaks. Judicious use of the group splitting option allowed all the peaks to be integrated successfully with an overall goodness of fit of -2.45%. The %-mass table issued in the final GC report can also be used as a parameter characteristic of the state of fungal development at that point in time when the sample was collected. A total of 2988 counts had been collected during the run. Following correction for background count rate, 70.1% of the total count was associated with three major peaks, B, C, and F in Figure 5D, in the ratio 1:2.34:0.71. On the basis of retention times of PC and GC maxima, PC peaks B, C, and F were associated with GC peaks 11, 13, and 36. Mass spectra taken at the appropriate retention times established that mass and activity were associated with methyl palmitate (B and 11), a mixture of methyl stearate/oleate/linoleate (C and 13), and methyl mycophenolate (F and 36). Using the results of GC and PC integration, it could be calculated that the specific activity of methyl mycophenolate in the sample was 17 400 dpm/pmol and that there was 0.6 mg of it in the sample of fungal tissue at the time of sampling. Activity peak A had the correct retention time to be trimethyl citrate, a commonly prominent intermediary metabolite of P. brevicompactum. In this particular trace no corresponding peak of mass was seen. GC peaks can be attributed to each of the maxima in the multiplet D in the PC trace. The chemical nature of each of these cell constituents has not yet been determined, nor has the identity of the material whose pool of isotope gives rise to PC peak E.
[Figure 5: panels A-D (panel A labeled SMOOTHING 0); horizontal axis, TIME (MINUTES)]
Figure 5. Beneficial effects of smoothing PC data through software. All traces shown were obtained from the same run. Figures A, B, and C emerge from RADGLC; Figure D is the standard twin pen trace obtained from the TIC monitor and PC rate meter. The PC rate meter time constant was 10 s. RADGLC smooths PC data by calculating a running average. The range of that running average is given by the relationship: range = (4 × sampling rate × input integer) + 1
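The running-average smoothing described in the caption of Figure 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the RADGLC routine: the window is taken to be centered on each point and simply truncated at the ends of the record, the sampling rate is assumed to be an integer so that the range is an odd number of points, and the function names are hypothetical.

```python
def smoothing_range(sampling_rate, smoothing_factor):
    """Window length in points: (4 x sampling rate x input integer) + 1."""
    return 4 * sampling_rate * smoothing_factor + 1


def running_average(counts, window):
    """Centered running average; the window shrinks at the ends of the data
    (an assumed edge treatment, not one documented in the paper)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(counts)):
        lo = max(0, i - half)
        hi = min(len(counts), i + half + 1)
        smoothed.append(sum(counts[lo:hi]) / (hi - lo))
    return smoothed
```

A smoothing factor of 0 gives a range of 1, so the data pass through unchanged, which is consistent with the unsmoothed trace of Figure 5A. Larger factors widen the window, trading counting noise for resolution in the way Figures 5B and 5C show.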
DISCUSSION RADGLC was designed simply and exclusively to help resolve the data jam that necessarily ensues when methods based on radiogas chromatography are used to explore comprehensively the kinetics of small molecule metabolism through in vivo incorporation experiments. Conversion of raw GC and PC data as directly as possible into biologically significant parameters such as %-mass, %-activity, and specific activity was the principal objective. A major consideration was to permit a high degree of operator interaction with the data processing. Seen in this light, RADGLC is a success. As future publications from this laboratory will show, metabolism monitoring experiments can be tackled with the aid of RADGLC that would be impracticable if data had to be handled by manual triangulation.
This transpires not only because, all things being equal, RADGLC processing of a trace such as is shown in Figure 5 is 2-5 times faster than manual processing and provides final results in tabular form devoid of arithmetic errors, but also because RADGLC can deal with a trace which, because PC and/or GC peaks were off scale, could not be handled manually. One sample injection is usually all that is needed where RADGLC is being used; multiple injections would be needed in most cases where data were being processed manually. Notwithstanding, several improvements in the operation of RADGLC can be anticipated in the future. The availability of a larger core memory would allow larger groups of peaks to be considered en bloc in GC peak integration, a much more convenient situation. If a computer faster than the PDP 11/20 could be used, considerable savings in time would also be realized. A larger core working space would also eliminate the need to use the same value of the skew for each peak in a multiplet, as is our current practice. Apart from hardware size and speed, a major possible improvement would be the selection of a better model for the GC peak form. The current skewed Gaussian peak form, although far superior to the pure Gaussian peak form, does not match real GC peaks perfectly. As Figure 4 portrays, real GC peaks tend to tail more and be somewhat sharper. Not only would a better model allow more accurate areas to be obtained for ordinary peaks; it would also permit more realistic estimates to be made for off-scale peaks.
LITERATURE CITED
(1) A. Karmen, Methods Enzymol., 14, 465 (1969).
(2) D. C. Hobbs, American Society for Mass Spectrometry, Annual Meeting, San Francisco, Calif., 1970, Abstract 657.
(3) C. P. Nulton, J. D. Naworal, I. M. Campbell, and E. W. Grotzinger, Anal. Biochem., 75, 219 (1976).
(4) W. H. Braun, E. O. Madrid, and R. J. Karbowski, Anal. Chem., 48, 2284 (1976).
(5) W. C. Breckenridge and A. Kuksis, Lipids, 5, 342-352 (1970).
(6) D. C. Hobbs, Antimicrob. Ag. Chemother., 2, 272 (1972).
(7) M. Matucha, V. Svoboda, and E. Smolkova, J. Chromatogr., 91, 497 (1974).
(8) J. D. Mahon, K. Egle, and H. Fock, Can. J. Biochem., 53, 609 (1975).
(9) J. R. Scaife and G. A. Garton, Biochem. Soc. Trans., 3, 1011 (1975).
(10) J. R. B. Slayback, I. M. Campbell, and E. Farish, Anal. Biochem., 69, 140 (1975).
(11) A. Kuksis, N. Kovacevic, D. Lau, and M. Vranic, Fed. Proc., 34, 2238 (1975).
(12) K. K. Stanley and P. K. Tubbs, Biochem. J., 150, 77 (1975).
(13) H. J. G. M. Derks, F. A. J. Muskiet, and N. M. Drayer, Anal. Biochem., 72, 391 (1976).
(14) J. A. Lubkowitz and J. Galobardes, J. Environ. Sci. Health, B11, 49 (1976).
(15) H. Th. Schneider, B. P. Lisboa, and H. Breuer, Fresenius' Z. Anal. Chem., 279, 161 (1976).
(16) A. Hatanaka, T. Kajiwara, and J. Sekiya, Phytochemistry, 15, 1125 (1976).
(17) J. R. B. Slayback, I. M. Campbell, and M. H. Vaughan, Biochim. Biophys. Acta, 431, 217 (1976).
(18) J. R. B. Slayback and I. M. Campbell, Biochim. Biophys. Acta, 450, 33 (1976).
(19) C. P. Nulton and I. M. Campbell, Can. J. Microbiol., 23, 20 (1977).
(20) R. D. B. Fraser and E. Suzuki, Anal. Chem., 41, 37 (1969).
(21) V. J. Law and R. V. Bailey, Chem. Eng. Sci., 18, 189 (1963).
RECEIVED for review February 2, 1977. Accepted July 11, 1977. The authors acknowledge gratefully the financial support of the U.S. Public Health Service (RR-00273) and the University of Pittsburgh Medical Alumni Association (fellowship to S.D.).
Determination of Parts per Billion Levels of Electrodeposited Metals by Energy Dispersive X-ray Fluorescence Spectrometry John A. Boslett, Jr., Robert L. R. Towns,* Robert G. Megargle, and Karl H. Pearson Department of Chemistry, Cleveland State University, Cleveland, Ohio 44115
Thomas C. Furnas, Jr. Molecular Data Corporation, 2869 Scarborough Road, Cleveland Heights, Ohio 44118
Aqueous solutions of nickel(II), copper(II), and zinc(II) are quantitatively determined to the low part per billion level by energy dispersive x-ray fluorescence (XRF) spectrometry using selective potentiostatic electrodeposition as a preconcentration technique. Novel, cylindrical monochromators between the sample and detector of the XRF system reduce background levels due to scattering from the reflective electrode surface, and yield greatly improved signal-to-noise ratios. Linear calibration curves were obtained. Minimum detection limits are less than 20 ng for the metals studied.
X-ray fluorescence analysis of aqueous solutions is hampered by difficulties encountered in preparing suitable samples. In addition, elements of interest are frequently found at levels which are below the minimum detection limits of conventional x-ray fluorescence spectrometry (XRF). As a result, several investigators have employed enrichment techniques to extend the range of x-ray fluorescence analysis of ions present in solution. Methods of preconcentrating metal ions have included the use of ion-exchange resins (1, 2), ion-exchange resin impregnated paper (3), and chelating functional groups immobilized on suitable substrates (4). Elder, Perry, and Brady (5) have used ammonium-1-pyrrolidine dithiocarbamate as a precipitating agent to remove trace elements from environmental water samples. The precipitate was removed by filtering for subsequent energy dispersive XRF analysis. Vassos et al. (6) have used constant current electrodeposition of reducible metal ions upon a pyrolytic graphite rod to prepare samples for wavelength dispersive x-ray fluorescence analysis. Each of the methods has inherent problems, but, with certain limitations and with strict control over experimental conditions, preconcentration techniques have proved useful in analyzing solutions with metal concentrations in the part per million (ppm) range.
This research describes a technique by which trace amounts of the aqueous metal ions nickel(II), copper(II), and zinc(II) are preconcentrated on the end face of an ordinary spectrographic graphite rod by potentiostatic electrodeposition. The thin metal film that results from the electrodeposition is analyzed by energy dispersive x-ray fluorescence spectrometry. Controlled potential electrodeposition has the capability to selectively separate trace concentration metal ions from a solution that may contain interfering metal ions. Background due to scattering of the incident radiation from the reflective graphite substrate is minimized by the use of special cylindrical monochromators designed, built, and supplied by Molecular Data Corporation, 2869 Scarborough Road, Cleveland Heights, Ohio 44118. The estimated 80- to 100-fold reduction in background radiation attributable to use of these monochromators significantly improves the signal-to-noise ratio at all energies and dramatically lowers the minimum detection limit (7). The quantitative analysis of nickel(II), copper(II), and zinc(II) at the 2-100 part per billion (ppb) level from 120 mL of solution is reported.
EXPERIMENTAL Solutions. Stock solutions of zinc(II) acetate, copper(II) perchlorate, and nickel(II) chloride were prepared by dissolving reagent grade salts in distilled-deionized water. The stock solutions were standardized against dried primary standard disodium dihydrogen ethylenediaminetetraacetate dihydrate (Na2H2EDTA·2H2O) and determined to be 0.0996 F Zn(C2H3O2)2, 0.1039 F Cu(ClO4)2, and 0.0991 F NiCl2. Solutions containing trace levels of the metal ions were prepared by diluting microliter amounts of the stock solution in distilled-deionized water. Supporting electrolyte was prepared in concentrated form by dissolving reagent grade sodium acetate in distilled-deionized