Quantitative Analysis of Mixtures by Carbon-13 Nuclear Magnetic Resonance Spectrometry
Thomas H. Mareci and Katherine Niisfay Scott
Veterans Administration Hospital, and Department of Radiology, University of Florida, Gainesville, Florida 32610
Results for quantitative analysis of mixtures by carbon-13 magnetic resonance, without nuclear Overhauser enhancement suppression, are presented. Two systems which are not simple isomeric mixtures are considered. Mixtures of two steroids, 3β-acetoxy- and 3β-chlorocholesterol, and mixtures of two aromatic acids, 3,4-dihydroxyphenylpropionic and 3,4-dimethoxyphenylacetic acid, were quantified in varying concentrations from 2 to 98 mol % with an average RMS error of 1.6%. Spin-lattice relaxation times, effective transverse relaxation times, and nuclear Overhauser enhancements for these systems were determined. Based on these values, criteria were developed for the selection of resonances to be compared to yield accurate results.
With the advent of commercially available Fourier transform nuclear magnetic resonance, carbon-13 spectroscopy has become a practical tool in many laboratories. However, carbon-13 magnetic resonance (CMR) has several inherent problems that make quantitation difficult. The low natural abundance of carbon-13 nuclei and their low sensitivity relative to protons require signal averaging for observation. The length of the carbon relaxation times (0.1 to 100 s) increases the time required for observation, because the carbon spin system must be allowed to recover between sampling pulses to avoid saturation. Hence, a delay is usually inserted after the acquisition of data and before the next sampling pulse. Thus, for samples with moderate to low concentration, the time required to obtain adequate signal to noise may be many hours. The proton-decoupled carbon-13 spectrum provides the simplicity which makes CMR such a useful tool but introduces an additional intensity variation due to the nuclear Overhauser enhancement (NOE). Since the amount of enhancement depends on the dipolar relaxation of the carbon spin through carbon-hydrogen coupling, carbons which are not directly bonded to hydrogen and are far removed from any surrounding hydrogens will have little or no enhancement. Thus the variation in observed carbon resonance intensities due to NOE can be as great as a factor of 2.988. Two methods have been used to suppress NOE, yet these methods introduce additional problems. The first method involves gating the proton decoupler such that it is on only during data acquisition, collapsing the spin multiplets and suppressing the Overhauser enhancement. This pulse technique requires a delay between data acquisition and the next sampling pulse. The delay required can vary from 5 to 10 times the spin-lattice relaxation times for the observed carbon resonances (1).
This technique reduces signal to noise because of the suppression of NOE and lengthens experimental time by the amount of the additional delay required; therefore, when sample size or concentration problems require extensive signal averaging, the experimental time can become extremely long, thus prohibiting the use of this method.
Address correspondence to the Department of Radiology, University of Florida.
ANALYTICAL CHEMISTRY, VOL. 49, NO. 14, DECEMBER 1977
Secondly, NOE can be suppressed by the addition of paramagnetic materials to the sample solution. These paramagnetic species provide an additional mechanism for spin-lattice relaxation which, at the proper concentration of paramagnetic species, will predominate over the 13C-1H dipole-dipole relaxation, and thus no NOE will be observed. However, use of paramagnetic species presents several problems (2). When the dipole-dipole interaction is efficient enough to relax the carbons quickly (≤0.3 s), it will be impossible to suppress NOE effectively with relaxation agents without introducing considerable line broadening. This situation occurs with steroids in moderately concentrated solutions because the dipole-dipole interaction dominates relaxation and rotational motion is such that relaxation times are very short (~0.3 s). Also, the reagent can complex with particular substrate functional groups (e.g., OH, NH2), thus preferentially suppressing some NOE's over others. Both of the above techniques of NOE suppression reduce signal to noise, and the gated decoupler method greatly increases experimental time. The reduced spin-lattice relaxation time resulting from the addition of paramagnetic materials shortens experimental time; yet when the sample is to be recovered after analysis, the paramagnetic species must be separated from the sample. From the above considerations, it is apparent that both techniques for NOE suppression must be used with careful consideration, and in some cases neither technique would produce useful quantitative results. O'Neill and Pringuer (3) have found they could quantify a mixture of toluene diisocyanate isomers by comparing the resonances of the methyl carbons of each isomer without suppressing the NOE. Thus the problems inherent to the techniques for NOE suppression can be avoided. However, they assumed the relaxation times and NOE's to be very similar.
Hence, this technique is difficult to generalize to other isomeric mixtures or to mixtures of compounds which are not simply related through isomerism. To determine the conditions necessary for accurate results in quantifying mixtures by CMR, mixtures for two types of systems were prepared at known concentrations. Neither technique for NOE suppression was employed; therefore NOE measurements were carried out and the results considered to determine the validity of assumptions made from knowledge of molecular structure. Since the quantity of sample available in our work is limited, we measured spin-lattice relaxation times to determine the most efficient pulse method for accurate results. The conditions of computer definition of resonances were considered and the most accurate means of quantitation determined.
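The limiting enhancement factor of 2.988 quoted in this introduction can be checked from the standard expression for a purely dipolar, proton-decoupled carbon-13 spin in the extreme-narrowing limit, 1 + γ(1H)/(2γ(13C)). The following Python sketch is only an illustration: the gyromagnetic ratios are standard tabulated values, not numbers taken from this paper, and the function name `max_noe` is hypothetical.

```python
# Gyromagnetic ratios in rad s^-1 T^-1 (standard tabulated values; assumed, not from the paper).
GAMMA_1H = 267.522e6
GAMMA_13C = 67.283e6


def max_noe(gamma_observed: float = GAMMA_13C, gamma_saturated: float = GAMMA_1H) -> float:
    """Maximum ratio of decoupled to undecoupled intensity for purely dipolar
    relaxation in the extreme-narrowing limit: 1 + gamma_saturated / (2 * gamma_observed)."""
    return 1.0 + gamma_saturated / (2.0 * gamma_observed)


if __name__ == "__main__":
    # Prints the limiting 13C-{1H} enhancement factor to three decimals.
    print(f"Maximum 13C-{{1H}} NOE factor: {max_noe():.3f}")
```

A carbon with no dipolar contribution to its relaxation shows a factor of 1, so two resonances of equal population can differ in intensity by up to this factor.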
EXPERIMENTAL
Materials. 3,4-Dimethoxyphenylacetic acid (98% pure), 3,4-dihydroxyphenylpropionic acid (98% pure), and 3β-acetoxycholesterol (97% pure) were obtained from Aldrich Chemical Company. 3β-Chlorocholesterol was prepared from cholesterol and thionyl chloride using the method of Winstein and Kosower (4). A white solid, mp 96-8 °C (lit. mp 96-7 °C), resulted in 86% yield. The aromatic acids, 3,4-dimethoxyphenylacetic acid and 3,4-dihydroxyphenylpropionic acid, were combined in solution at varying relative molar concentrations of 50%-50%, 30%-70%, 10%-90%, 5%-95%, and 2%-98% with 3,4-dihydroxyphenylpropionic acid the lesser component. The steroids, 3β-acetoxycholesterol and 3β-chlorocholesterol, were combined in solution at the same varying concentrations as the aromatic acids with 3β-acetoxycholesterol the lesser component. Solutions of mixtures with a total molarity of 2.0 were made with a stock solution of deuteriochloroform which contained 5% (v/v) hexafluorobenzene as the field-frequency lock material and 10% (v/v) tetramethylsilane as a chemical shift reference. Spectra were obtained in 10-mm diameter sample tubes fitted with Teflon vortex plugs.
Instrumentation. A Bruker HX-90 spectrometer in the single coil configuration, equipped with a Nicolet 1083 computer, was used to obtain carbon-13 spectra at 22.628 MHz in the Fourier transform mode with complete carbon-hydrogen decoupling. For the mixtures of aromatic acids, free induction decays (FID's) for a spectral width of 4000 Hz were collected in 8192 data points at a repetition interval of 60 s. A tip angle of 30° was used (a 90° tip angle corresponds to a pulse length of 21 µs). Signal averaging was performed for 1024 scans, requiring a total time of 17.1 h. Fourier analysis was performed on the accumulated FID, giving a computer resolution of 0.98 Hz per data point. Line broadening was performed on each spectrum to give 4 to 5 computer data points above half maximum intensity for each resonance line. For the mixture of steroids, FID's for a spectral width of 5000 Hz were collected in 8192 data points at a repetition interval of 5 s. A tip angle of 40° was used and signal averaging was performed for 16 384 scans, requiring a total time of 22.8 h for all the mixtures except 2%-98%. A total of 32 768 scans were accumulated for the 2%-98% mixture, requiring 45.6 h. Fourier analysis gave a computer resolution of 1.22 Hz per data point.
Line broadening was also performed to give the same 4 to 5 computer data points above half maximum intensity as was used as a criterion for line widths for the aromatic acids. Spectrometer stability was determined by accumulating 10 spectra of 1024 scans each over a 14-h period. The standard deviation of the resonance intensity determination varied from 1.8 to 3.1% with an average deviation of 2.5%.
Quantitation. To quantify the concentration of each mixture, comparisons were made between four resonances from each molecule. The resultant percentages were averaged and the standard deviation of the four percentages was calculated.
Relaxation Time Measurements. Longitudinal relaxation times, T1's, were measured using the inversion-recovery technique (5) and calculated by a least-squares curve fit to the log of intensity ratios as a function of τ values. Effective transverse relaxation times, T2*'s, were determined from line widths. The line widths of the line-broadened spectra were measured, then the line broadening parameter was subtracted to give natural line widths.
Nuclear Overhauser Enhancement Measurements. NOE's were determined for each carbon resonance under consideration. Gated decoupling suppresses NOE (6), and the ratio of the peak areas with and without NOE suppression can be used to determine the enhancement factor. The peak area for carbon resonances completely decoupled from hydrogen by continuous broad-band decoupling was compared with the peak area for the same carbon resonance decoupled from hydrogen by gated decoupling of a single hydrogen resonance during the time of the sampling pulse and data acquisition. Gated single frequency decoupling was used because gated broad-band decoupling over a modulation sweep width necessary to cover the hydrogen spectrum did not provide complete decoupling. The modulation rate of our broad-band decoupler is 10 c/s, and during the gated decoupling sequence the decoupler is on for only 1.0 s.
Hence, the decoupling rate is insufficient to cover the necessary hydrogen sweep width of 400 Hz. Incomplete decoupling reduces the observed peak area by an amount proportional to the amount of residual coupling. Therefore, gated broad-band decoupling will introduce an error into the value of the observed NOE. Using the gated single frequency method, the error in our NOE measurements was determined to be ±0.3 or approximately 10%.
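The acquisition figures given under Instrumentation follow from simple arithmetic, which the short Python sketch below reproduces as a consistency check. It assumes that an FID of N real data points transforms to N/2 points across the spectral width (which matches the quoted 0.98 and 1.22 Hz per data point) and that tip angle scales linearly with pulse length from the stated 90° calibration of 21 µs; the function names are illustrative only.

```python
def total_time_hours(scans: int, repetition_s: float) -> float:
    """Total accumulation time for a given number of scans and repetition interval."""
    return scans * repetition_s / 3600.0


def digital_resolution_hz(spectral_width_hz: float, fid_points: int) -> float:
    """Hz per data point, assuming an FID of N points yields N/2 spectral points."""
    return spectral_width_hz / (fid_points / 2)


def pulse_length_us(tip_angle_deg: float, pulse_90_us: float = 21.0) -> float:
    """Pulse length for a tip angle, scaled linearly from the 90-degree pulse length."""
    return pulse_90_us * tip_angle_deg / 90.0


if __name__ == "__main__":
    # Aromatic acids: 1024 scans at 60 s, 4000 Hz in 8192 points, 30-degree pulse.
    print(total_time_hours(1024, 60.0), digital_resolution_hz(4000.0, 8192), pulse_length_us(30.0))
    # Steroids: 16 384 scans at 5 s, 5000 Hz in 8192 points, 40-degree pulse.
    print(total_time_hours(16384, 5.0), digital_resolution_hz(5000.0, 8192), pulse_length_us(40.0))
```

The same functions can be used to budget a new experiment, for example to see how doubling the number of scans for a dilute component doubles the total time.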
RESULTS AND DISCUSSION
Pulse Technique, Relaxation Times, and NOE. In order to obtain quantitative results for pulsed Fourier transform nuclear magnetic resonance, several factors must be considered. First, the strength and duration of the excitation pulse must be optimized to provide sufficient frequency range to cover the observed spectrum. Secondly, the pulse strength and repetition rate of the pulse sequence must also be optimized to provide efficient excitation of the spin system without the introduction of phase or intensity anomalies. Finally, careful consideration must be given to NOE, which can vary from carbon to carbon within a given spin system.
Consider the usual case of the spin system in a stationary magnetic field of strength $H_0$, subjected to a sequence of rectangular rf pulses oriented perpendicular to $H_0$ with frequency $\omega$, magnetic field strength $2H_1$, duration $\tau$, and period $T$. By examining the frequency spectrum of the rf pulse obtained by Fourier transformation (7), the condition for a deviation of less than 2% in applied rf power across an observed frequency range of 5000 Hz requires the pulse length, $\tau$, to be less than 25 µs. To minimize the pulse length, the output voltage of the rf amplifier should be increased with the practical limitation that the waveform retain a rectangular shape. This limit is approximately 4 µs for our system.
To optimize the pulse parameters $H_1$ and $\tau$, consider the spin system in a frame of reference rotating about $H_0$ with angular frequency $-\omega$. For the duration, $\tau$, of the pulse, the spins precess about an effective magnetic field of strength $H_{\mathrm{eff}}$. The steady-state response of the spin magnetization to a sequence of rf pulses is obtained in the Bloch formalism (8), and two cases of $H_1$ field strength have been treated previously. Ernst and Anderson (7) examined the situation where $H_1$ is very large, thus $H_{\mathrm{eff}} = H_1$, and the spins precess about $H_1$ with precession or "tip" angle, $\alpha$, given by

$$\alpha = \gamma H_1 \tau \qquad (1)$$

where $\gamma$ is the gyromagnetic ratio. They optimized $H_1$ and $\tau$ by determining the optimum $\alpha$ for maximum observed magnetization in the plane perpendicular to $H_0$. This value is dependent on the time between pulses, $T$, and the longitudinal and transverse relaxation times, $T_1$ and $T_2^*$, respectively. By averaging over the precession angle of the spin about the stationary field in the rotating frame, they determined the optimum value of the tip angle to be given by the condition

$$\cos \alpha_{\mathrm{opt}} = E_1 \qquad (2)$$

provided $T \gg T_2^*$, such that $(E_2^*)^2 \ll 1$, where $E_1 = \exp(-T/T_1)$ and $E_2^* = \exp(-T/T_2^*)$. [...] provided $T \gg T_2^*$, such that $E_2^* \ll 1$. Note that when $\beta \to 0$, this reduces to the above result, Equation 2, of Ernst and Anderson.
Jones and Sternlicht have also shown that for a given field strength, $H_1$, spins with equivalent $T_1$'s will have the same optimum $\tau$ values to within 5% for $\beta$ between 0° and 30°. Therefore, relative intensities will be insensitive to $\beta$ for $\beta \le 30°$. The above considerations indicate that the rf field strength, $H_1$, should be maximized, then the pulse length, $\tau$, adjusted to fulfill the optimum tip angle condition for efficient excitation of the spin system. In the case of adequate rf field strength, $\beta < 30°$, the condition of Equation 2 for optimum tip angle can be assumed to hold and can be applied to the spin system under consideration. For the offsets used in our experiments and the output parameters of our spectrometer, $\beta$ had a maximum value between 18° and 30°; therefore, Equation 2 could be applied.
Maximum signal to noise in a given experimental time is achieved by increasing the pulse repetition rate. Several factors limit the repetition rate, hence the interval between pulses, $T$, used to determine the optimum tip angle. There is the obvious instrumental limitation that $T$ must be long enough to allow sampling of the FID between pulses. However, phase and intensity anomalies can be introduced into the observed spectrum if $T$ is not long enough to allow sufficient relaxation. Phase distortion can come from two sources. First, if transverse relaxation is not complete between pulses, phase distortion will be introduced into the observed spectrum (10). Second, if the observed FID has not decayed sufficiently at the end of the observation time, a phase distortion will be introduced in the Fourier transformed spectrum. We applied an exponential multiplication factor to the FID to introduce line broadening. This increases the decay rate of the observed FID; therefore, this source of phase distortion did not present a problem. The final consideration in the choice of the pulse interval, $T$, is to avoid intensity anomalies due to differences in $T_1$'s.
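The optimum tip angle condition of Equation 2 and its effect on relative intensities can be explored numerically. The Python sketch below is an illustration under stated assumptions: it takes $E_1 = \exp(-T/T_1)$ and uses the standard steady-state expression for repeated pulses with fully decayed transverse magnetization, $M = M_0 (1 - E_1)\sin\alpha / (1 - E_1 \cos\alpha)$, which is not written out in this paper. The function names are hypothetical. It examines the choice, used in this section, of a pulse interval equal to three times the longest $T_1$, with steroid $T_1$ values taken from Table I.

```python
import math


def ernst_angle_deg(pulse_interval_s: float, t1_s: float) -> float:
    """Optimum tip angle from cos(alpha_opt) = exp(-T / T1) (Equation 2)."""
    return math.degrees(math.acos(math.exp(-pulse_interval_s / t1_s)))


def steady_state_signal(alpha_deg: float, pulse_interval_s: float, t1_s: float) -> float:
    """Steady-state transverse magnetization as a fraction of M0, assuming the
    transverse magnetization has fully decayed between pulses (T >> T2*)."""
    e1 = math.exp(-pulse_interval_s / t1_s)
    alpha = math.radians(alpha_deg)
    return (1.0 - e1) * math.sin(alpha) / (1.0 - e1 * math.cos(alpha))


def relative_variation(alpha_deg: float, pulse_interval_s: float, t1_values_s) -> float:
    """Fractional spread in steady-state intensity across a set of T1 values."""
    signals = [steady_state_signal(alpha_deg, pulse_interval_s, t1) for t1 in t1_values_s]
    return 1.0 - min(signals) / max(signals)


if __name__ == "__main__":
    t1_values = [0.23, 0.26, 0.33, 2.75]      # seconds, from Table I
    interval = 3.0 * max(t1_values)           # pulse interval of three times the longest T1
    alpha = ernst_angle_deg(interval, max(t1_values))
    print(f"tip angle {alpha:.1f} deg, intensity spread {relative_variation(alpha, interval, t1_values):.3f}")
```

Shortening the interval in this sketch shows how the resonances with the longest $T_1$ become progressively more saturated relative to the rest.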
Intensity anomalies due to instrumental techniques can be introduced into the observed spectrum from two sources. The first is insufficient longitudinal relaxation between pulses. If the longest $T_1$ in the spin system is used in the condition for optimum tip angle, Equation 2, the intensity variation for all spins will be less than 5% if $T$ is chosen to be three times the longest $T_1$ of the spin system. Hence, anomalies due to differences in $T_1$'s will be minimized with this choice of $T$. If the resonances compared have very similar chemical shifts and relaxation times, $T$ can be set equal to the minimum required for resolution, thus giving the maximum signal to noise per unit time. The resonances with the longest $T_1$'s will be the most saturated, but the comparison of resonances with similar $T_1$'s will still produce quantitative results. Even with incomplete knowledge of relaxation times for a particular system, an estimate can be made of the longest $T_1$ and this can be used to determine pulse conditions. The second source of intensity anomalies is a possible frequency-dependent attenuation of observed intensity due to filtering of the input signal at the computer analog-to-digital converter. The exact nature of this attenuation depends on the type of filters used and the choice of the frequency range filtered. Our computer uses low-pass 4-pole Butterworth filters which give an attenuation of -3 dB at the high end of the chosen frequency range. To avoid a frequency-dependent attenuation across the observed frequency range, the input filter setting must be chosen greater than the observed frequency range. For a 5000-Hz sweep width, we chose a filter setting of 7500 Hz, which gave the desired response across the frequency range without introducing additional noise. Finally, relative intensity variations due to differences of NOE must be considered. The mechanism for the development of the NOE is dipole-dipole relaxation between the
Table I. Spin-Lattice Relaxation Times, Effective Transverse Relaxation Times, and Nuclear Overhauser Enhancements for Selected Carbons of the Steroids

3β-Acetoxycholesterol
Carbons(a)   T1, s             T2*, s   NOE(c)
C[?]         0.26 ± 0.20(b)    0.12     2.9
C4           0.23 ± 0.13       0.080    2.9
C5           2.75 ± 0.21       0.18     3.0

3β-Chlorocholesterol
Carbons(a)   T1, s             T2*, s   NOE(c)
C2'          0.27 ± 0.10(b)    0.099    2.5
C3'          0.33 ± 0.04       0.12     2.6
C4'          0.26 ± 0.10       0.086    2.4
C5'          2.70 ± 0.05       0.18     3.1

(a) Carbon numbering as in Figure 2. (b) Standard deviation. (c) Values accurate to ±0.3.
observed carbon-13 nucleus and neighboring hydrogen nuclei when the hydrogen nuclei are decoupled from the carbon-13 nuclei by saturation of the hydrogen resonances. When carbon-13 relaxation is dominated by the dipolar mechanism, NOE attains its maximum value of 2.988. As the contribution from mechanisms other than dipole-dipole interaction becomes significant (11), the NOE is decreased, with the limit that no NOE is observed when these mechanisms dominate relaxation. When neither the dipole-dipole interaction nor any other mechanism dominates the relaxation, care must be taken in comparing carbon resonance intensities for quantitation. The dipolar contribution to relaxation rate (12) is dependent on the correlation time for the rotational motion of the carbon-hydrogen system and the distance from the carbon to the hydrogens. For molecules undergoing isotropic motion, the relative magnitude of the dipolar contribution for a particular carbon depends only on the number and distance of separation of the hydrogen nuclei that contribute to relaxation. For anisotropic molecular motion, the dipolar contribution also depends on the location of the carbon nucleus in the molecular structure. For moderate to large size molecules, the dipolar mechanism is expected to dominate relaxation (13). Therefore, molecular systems, such as steroids, should exhibit full NOE for the backbone carbons, where motion is approximately isotropic and the correlation times are such that spin rotation makes a negligible contribution (12). Table I shows the values of $T_1$, $T_2^*$, and NOE for selected carbons in the steroids considered here. The values of $T_1$ for 3β-chlorocholesterol are very similar to those reported previously (12). Within experimental error, all the carbons considered attain the maximum value of NOE, indicating that relaxation is dominated by the dipolar mechanism.
For small molecules, the dipolar mechanism may not dominate; therefore, care must be taken in resonance comparison. Table II contains $T_1$, $T_2^*$, and NOE values for the carbons of the aromatic acids considered here. The large values of $T_1$ and small values of NOE for the substituted ring carbons indicate a small contribution from the dipolar mechanism and a significant contribution from other mechanisms, which reduce the observed NOE. Therefore, for large molecules where the dipolar mechanism is expected to dominate or for small molecules where other mechanisms make a significant contribution, differences in observed NOE can be minimized by comparing carbons at similar locations in the molecular structures and with the same number and separation of hydrogen nuclei contributing to relaxation.
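The qualitative statement that a reduced NOE signals a reduced dipolar contribution can be made semi-quantitative with a standard relation that is not derived in this paper: in the extreme-narrowing limit the observed enhancement above unity scales with the dipolar fraction of the total relaxation rate, so that (NOE - 1)/1.988 is approximately $T_1/T_1^{DD}$. The Python sketch below applies this assumed relation to two entries of Table II; the function names are hypothetical, and with NOE values accurate only to ±0.3 the results are rough estimates.

```python
MAX_ENHANCEMENT = 1.988  # limiting value of (NOE - 1) for 13C-{1H}; standard value, assumed


def dipolar_fraction(noe: float) -> float:
    """Estimated fraction of the relaxation rate due to the dipolar mechanism,
    assuming (NOE - 1) = MAX_ENHANCEMENT * (dipolar rate / total rate)."""
    return (noe - 1.0) / MAX_ENHANCEMENT


def dipolar_t1_s(t1_s: float, noe: float) -> float:
    """Estimated dipolar relaxation time, T1 divided by the dipolar fraction."""
    return t1_s / dipolar_fraction(noe)


if __name__ == "__main__":
    # (label, T1 in s, NOE) taken from Table II for 3,4-dihydroxyphenylpropionic acid.
    for label, t1, noe in [("C3", 11.2, 1.6), ("C6", 1.2, 3.0)]:
        print(label, round(dipolar_fraction(noe), 2), round(dipolar_t1_s(t1, noe), 1))
```

Because of the measurement uncertainty, a fraction slightly above 1 (as for NOE = 3.0) should simply be read as full dipolar relaxation.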
Figure 1. CMR spectra of 30% 3,4-dihydroxyphenylpropionic acid and 70% 3,4-dimethoxyphenylacetic acid. 1062 scans. Lower spectrum: without line broadening. Upper spectrum: exponential multiplication with 4.0-Hz line broadening. The CD3 resonance of the solvent is shown at extreme right

Figure 2. CMR spectra of 30% 3β-acetoxycholesterol (R = OCOCH3) and 70% 3β-chlorocholesterol (R = Cl). Carbons of 3β-chlorocholesterol are designated by prime numbering. 11.5-K scans. Lower spectrum: without line broadening. Upper spectrum: exponential multiplication with 2.5-Hz line broadening
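The line-broadening values in these captions can be related to the criterion of 4 to 5 data points above half maximum intensity stated under Instrumentation. The Python sketch below is a rough estimate under stated assumptions: Lorentzian lines whose natural full width at half maximum is $1/(\pi T_2^*)$, exponential line broadening that adds linearly to that width, and an FID of N points giving N/2 spectral points. The function names are illustrative, and the $T_2^*$ value used is the Table I entry for C2' of 3β-chlorocholesterol.

```python
import math


def natural_linewidth_hz(t2_star_s: float) -> float:
    """Full width at half maximum of a Lorentzian line, 1 / (pi * T2*)."""
    return 1.0 / (math.pi * t2_star_s)


def points_above_half_max(t2_star_s: float, line_broadening_hz: float,
                          spectral_width_hz: float, fid_points: int) -> float:
    """Approximate number of spectral data points across the broadened line width."""
    resolution_hz = spectral_width_hz / (fid_points / 2)
    return (natural_linewidth_hz(t2_star_s) + line_broadening_hz) / resolution_hz


if __name__ == "__main__":
    # Steroid conditions: 5000 Hz in 8192 points, 2.5-Hz line broadening, T2* = 0.099 s.
    print(round(points_above_half_max(0.099, 2.5, 5000.0, 8192), 1))
    # The same line without any line broadening, for comparison.
    print(round(points_above_half_max(0.099, 0.0, 5000.0, 8192), 1))
```

Comparing the two printed values illustrates why the unbroadened lower spectra are more coarsely defined than the upper ones.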
Quantitation. In order to evaluate CMR as a technique for quantitation of mixtures, we prepared mixtures of known component concentrations for two types of systems. The systems were chosen to provide a range of relaxation times and NOE's (see Tables I and II) and to relate to molecules of biological importance. The first type of system prepared was a mixture of the aromatic acids, 3,4-dihydroxyphenylpropionic acid and 3,4-dimethoxyphenylacetic acid. The substituent positions on the aromatic ring for the two acids are the same (see Figure 1) but the substituents are quite different. Comparisons were made between carbons 3 and 6 of the aromatic ring, between carbonyl carbons, and between methylene carbon 8 adjacent to the carbonyl carbon of 3,4-dihydroxyphenylpropionic acid and methylene carbon 7 adjacent to the carbonyl carbon of 3,4-dimethoxyphenylacetic acid. These aromatic acids model the catecholamine neurotransmitters and related metabolic compounds studied in our laboratory. The second type of system was a mixture of steroids, 3β-acetoxycholesterol and 3β-chlorocholesterol. They differ only in the functional group at the 3β position but gave a large enough chemical shift difference in the resonances adjacent to the 3β position (see Figure 2) to provide for comparison between nonequivalent carbon resonances at the 2, 3, 4, and 5 positions. This system provides a good model for our work on the radiopharmaceutical 131I-19-iodocholest-5-en-3β-ol (14).

Table II. Spin-Lattice Relaxation Times, Effective Transverse Relaxation Times, and Nuclear Overhauser Enhancements for the Aromatic Acid Carbons

3,4-Dihydroxyphenylpropionic acid
Carbons(a)   T1, s             T2*, s(d)   NOE(e)
C=O          9.7 ± 0.15(c)     0.20        1.7
C1           15.9 ± 0.15                   2.0
C2(b)        1.2 ± 0.02                    2.7
C3           11.2 ± 0.11       0.32        1.6
C4           10.7 ± 0.05                   1.2
C5(b)        1.2 ± 0.02                    2.7
C6           1.2 ± 0.04        0.15        3.0
C8           1.7 ± 0.28        0.19        3.0

3,4-Dimethoxyphenylacetic acid
Carbons(a)   T1, s             T2*, s(d)   NOE(e)
C'=O         8.2 ± 0.05(c)     0.27        1.6
C1'          12.4 ± 0.04                   2.3
C2'          1.5 ± 0.03                    2.4
C3'          24.7 ± 0.02       0.24        1.9
C4'          21.9 ± 0.04                   2.0
C5'          2.1 ± 0.14                    2.6
C6'          1.6 ± 0.03        0.16        3.1
C7'          1.7 ± 0.27        0.14        2.5
OCH3         3.85 ± 0.35

(a) Carbon numbering as in Figure 1. (b) Carbons with same chemical shift. (c) Standard deviation. (d) Values reported for carbons considered in quantitation. (e) Values accurate to ±0.3.

The carbon resonances compared for quantitation were chosen for similarity in molecular environment and sufficient chemical shift nonequivalence such that no distortion is caused by resonance overlap. Assignments were made using chemical shifts and coupling constants; then resonances of carbons at similar locations in molecular structure and with the same number and separation of hydrogens were compared to quantify the component concentrations.
These criteria for choice of resonances for comparison do not limit the application of this method to simple isomeric mixtures. But they do require that compounds compared have the same general size and structure, such as the examples examined here. This is required to minimize the effect that differences in rotational motion would have on relaxation and, hence, the observed NOE. In the previous section it was pointed out that if dipole-dipole interactions dominate carbon-13 relaxation, then the observed NOE is independent of molecular motion and the number and separation of hydrogen nuclei surrounding the carbon-13 nucleus. However, if there is a contribution to relaxation from any mechanism other than dipole-dipole interaction, then there is also a dependence on the number and separation of hydrogens surrounding the carbon-13 nucleus. Therefore, it is necessary to compare carbon resonances which have the same number and separation of hydrogens to minimize possible differences in the dipolar contribution to NOE.

Figure 3. CMR spectrum of 5% 3β-acetoxycholesterol and 95% 3β-chlorocholesterol. 16-K scans, exponential multiplication with 2.5-Hz line broadening. Insets are shown at twofold horizontal expansion and fourfold vertical expansion

To quantify the component concentrations in a mixture, comparison can be made between relative peak heights or areas for a number of resonances, then the ratios averaged to determine the concentration of each component. These parameters can be determined in a number of ways. The peak heights and areas are generated as numerical results by our computer program (15). We found these results to depend critically on the signal to noise ratio and the baseline. Figures 2 and 3 exhibit a slight roll in the baseline which is perceived by the computer as additional noise, causing inaccuracies in the results. The resonances of the 5% component in Figure 3 were not detected by the computer program; therefore no results were generated for these resonances. This is because the program was unable to distinguish these resonances from the background noise level. To obtain accurate results from our program, it would have been necessary to have a noise level approaching zero, which is impractical with samples of limited concentrations. For peak height determinations, we relied on measurements of maximum peak height above average baseline from the plotted spectra. The minimum we could detect was a height above average baseline equal to the peak to peak noise. Using this criterion, we were able to detect the 2% component in the 2%-98% mixture of steroids. But we could not detect the 2% component in the 2%-98% mixture of aromatic acids in a reasonable length of time (24 h) because of the length of the spin-lattice relaxation times of the nonprotonated carbons of the aromatic rings. The areas were determined by planimetry or measurement of integrated intensity from an integral subroutine in our computer program.
All of these methods of quantitation are critically dependent on the line shape for accurate results. An integral Fourier transform of the FID gives a spectrum which is a continuous function of frequency. The resulting line shape is equivalent to the slow passage continuous wave line shape if the value of $E_2^* \ll 1$ (10). However, the computer-calculated Fourier transform of a FID defined by a discrete number of data points provides a spectrum which is a function of discrete frequencies. The definition of the resonance is determined by the number of data points used during the Fourier transform. Therefore, the resulting line shape is dependent on the memory block size used as well as the resonance line width. Carbon-13 line widths can be quite narrow, for example the substituted ring carbons of the aromatic acids. These resonances have line widths of approximately 1.0 Hz, which are defined by only 1 or 2 data points in a transformed spectrum of a 5000-Hz sweep width in 4096 data points. The resonance height and area will be distorted at this definition. This can be seen clearly by comparing the upper and lower spectra of Figure 1. Quantitation by comparing peak heights or areas at this definition will give inaccurate results. The resonance definition can be improved in a number of ways. The number of data points used in the transformation can be increased by the technique of zerofill (16).
This is limited to twice the initial block size, thus doubling the number of data points defining a resonance. Also the number of data points used to acquire the FID can be increased. This is limited only by the amount of memory available and the desired repetition rate. The final way of improving line-shape definition is the mathematical process of broadening a resonance by exponential multiplication of the FID before transformation. A multiplication by an exponential function with a negative time constant broadens the resonances in the transformed spectrum, thus effectively improving definition. Since our system is limited to 8192 data points for observing the FID, it was necessary to exponentially weight the observed FID to improve definition upon transformation. There are practical limits to the amount of line broadening applied. As line broadening increases, resolution decreases, which can cause overlap of close-lying resonances, as can be seen in Figure 2. Here the resonance of carbon 4 is very close to the multiple resonances of the other methylene carbons of the steroid backbone. In the upper spectrum, line broadening has been applied which obscures the baseline around carbon 4. If too much line broadening is applied, the resonance would become distorted and the intensity would not be accurate. Also, small resonances can be reduced in intensity such that they are not visible above the noise. We found the minimum definition for accurate results to be approximately 2 data points above half maximum intensity for area determination and 3 data points above half maximum intensity for peak height determination. We chose a definition of 4 to 5 data points above half maximum intensity as a criterion for broadening. The results for quantitation by this procedure are shown in Tables III and IV. Both systems were quantified by peak heights, giving good agreement with the actual composition. If peak heights are to be used for quantitation, comparison
Table III. Results for Quantitation of Mixtures of 3,4-Dihydroxyphenylpropionic Acid and 3,4-Dimethoxyphenylacetic Acid
Column headings: Composition(a); Relative peak heights; Relative peak areas(b)
Compositions: 50%-50%, 30%-70%, 10%-90%, 5%-95%, 2%-98%
Legible determined values, grouped in complementary pairs: 50.1 (±1.2) and 49.9; 30.2 (±2.1) and 69.8; [...] and 89.9; 4.8 (±2.7) and 95.2. Further values in parentheses, (±2.1), (±2.0), and (±1.5), accompany entries that are [...].
2%-98%: Not observed above noise level
(a) Known compositions; 3,4-dihydroxyphenylpropionic acid listed first then 3,4-dimethoxyphenylacetic acid. (b) Peak areas as determined by the integral subroutine. Value in parentheses is the relative standard deviation.
Table IV. Results for Quantitation of Mixtures of 3β-Acetoxycholesterol and 3β-Chlorocholesterol
Column headings: Composition(a); Relative peak heights; Relative peak areas(b)
Compositions: 50%-50%, 30%-70%, 10%-90%, 5%-95%, 2%-98%
Legible determined values, grouped in complementary pairs where both members survive: 7.7 (±0.8) and 92.3; 6.0 (±1.5) and 94.0; 5.6 (±2.0) and 94.4; 3.3 (±1.8) and 96.7; 1.6 (±0.2) and 98.4. Legible values without a surviving partner: 51.2 (±1.1); 48.2; 89.4. Further values in parentheses, (±3.7), (±2.5), (±1.7), and (±1.5), accompany entries that are [...].
(a) Known compositions; 3β-acetoxycholesterol listed first then 3β-chlorocholesterol. (b) Peak areas as determined by planimetry. Value in parentheses is the relative standard deviation.
must be made between resonances with similar $T_1$'s, NOE's, and $T_2^*$'s. The values reported in Tables I and II for these systems vary over a wide range, yet the values for the resonances that were compared are very similar. The aromatic acid peak areas were determined by the integral subroutine and the steroid peak areas by planimetry, also giving good agreement with the actual composition. Quantitation by peak areas requires only that resonances with similar $T_1$'s and NOE's be compared. If resonances are compared with similar chemical shifts, $T_1$'s, and NOE's, the experimental time can be reduced even further by decreasing the delay between pulses. To test this, two experiments were performed on the 50%-50% mixture of steroids. In the first experiment, the delay was set equal to the longest $T_1$ in the system, 2.75 s. Quantitation by peak heights and areas both gave quantitative results. The values by peak heights are 50.5 (±2.15), 49.5 (±2.15) and by areas are 51.7 (±3.74), 48.3 (±3.74). In the second experiment, the delay was set equal to the minimum required for resolution, 0.8192 s. This also gave quantitative results. The values by peak heights are 50.9 (±2.38), 49.1 (±2.38) and by areas are 50.6 (±4.28), 49.4 (±4.28). The values in parentheses are the standard deviations of the measurement. The resonances compared were chosen because they are similarly located in molecular structures of the same general size, and have the same number and separation of hydrogens.
2135
This ensures similarity in relaxation rates and, hence, observed NOE. This allows t h e procedure to be applied not only to isomers but also to many systems where other methods would be impractical. In large systems, such as steroids, proton magnetic resonance can prove too complex to yield quantitative results; yet this is when CMR becomes most useful. For, in general, as the size of the molecule increases, the relaxation times decrease and the CMR experimental time decreases making it a practical analytic tool. I n summary, this method overcomes problems present in previous methods of quantitation by CMR which required either very long experimentai times or sacrificing resolution. Our technique affords a method of quantifying mixtures which are not simply isomers in a reasonable length of time without expensive instrumentation or contamination of the sample by relaxation reagents.
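The comparison of matched resonances described in this section reduces to simple arithmetic. The following is a minimal sketch, not the authors' procedure: the peak intensities are hypothetical, averaging over several resonance pairs is an assumption, and the saturation model (steady state under repeated 90° pulses) is a textbook simplification that the text does not specify. The delay and T1 values are borrowed from the text purely for illustration.

```python
import math
from statistics import mean, stdev


def mole_percent(intensity_a, intensity_b):
    """Return (percent A, percent B) from one pair of comparable resonances."""
    total = intensity_a + intensity_b
    return 100.0 * intensity_a / total, 100.0 * intensity_b / total


def composition(pairs):
    """Average percent A over several resonance pairs, with its sample standard deviation."""
    percents_a = [mole_percent(a, b)[0] for a, b in pairs]
    return mean(percents_a), stdev(percents_a)


def saturation_factor(repetition_time, t1):
    """Assumed model: fraction of equilibrium signal recovered between 90-degree pulses."""
    return 1.0 - math.exp(-repetition_time / t1)


# Hypothetical peak heights (arbitrary units) for three matched resonance pairs.
pairs = [(51.0, 49.0), (50.0, 50.0), (52.0, 48.0)]
avg_a, sd_a = composition(pairs)

# When both resonances share the same T1, partial saturation scales both
# intensities by the same factor, so the factor cancels in the ratio.
f = saturation_factor(repetition_time=0.8192, t1=2.75)
short_delay = mole_percent(51.0 * f, 49.0 * f)

print(f"A = {avg_a:.1f}% (sd {sd_a:.1f}); with short delay: {short_delay[0]:.1f}%")
```

Under this model the computed composition is unchanged by shortening the delay only because the two T1's are equal; with unequal T1's the two intensities would be scaled differently and the ratio would be biased.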
LITERATURE CITED
(1) D. Canet, J. Magn. Reson., 23, 361-364 (1976).
(2) G. C. Levy and U. Edlund, J. Am. Chem. Soc., 97, 4482-4485 (1975).
(3) I. K. O'Neill and M. A. Pringuer, Org. Magn. Reson., 6, 398-399 (1974).
(4) S. Winstein and E. M. Kosower, J. Am. Chem. Soc., 81, 4399-4408 (1959).
(5) R. L. Vold, J. S. Waugh, M. P. Klein, and D. E. Phelps, J. Chem. Phys., 48, 3831-3832 (1968).
(6) R. Freeman, H. D. W. Hill, and R. Kaptein, J. Magn. Reson., 7, 327-329 (1972).
(7) R. R. Ernst and W. A. Anderson, Rev. Sci. Instrum., 37, 93-102 (1966).
(8) F. Bloch, Phys. Rev., 70, 460-527 (1946).
(9) D. E. Jones and H. Sternlicht, J. Magn. Reson., 6, 167-182 (1972).
(10) R. Freeman and H. D. W. Hill, J. Magn. Reson., 4, 366-383 (1971).
(11) K. F. Kuhlmann, D. M. Grant, and R. K. Harris, J. Chem. Phys., 52, 3439-3448 (1970).
(12) A. Allerhand, D. Doddrell, and R. Komoroski, J. Chem. Phys., 55, 189-198 (1971).
(13) S. Berger, F. R. Kreissl, D. M. Grant, and J. D. Roberts, J. Am. Chem. Soc., 97, 1805-1808 (1975).
(14) M. W. Couch, K. N. Scott, T. H. Mareci, and C. M. Williams, Steroids, 27, 451-458 (1976).
(15) J. W. Cooper, "An Introduction to Fourier Transform NMR and the Nicolet NIC-80 Data System", Nicolet Instrument Corp., Madison, Wis., 1974.
(16) T. C. Farrar and E. D. Becker, "Pulse and Fourier Transform NMR", Academic Press, New York, N.Y., 1971, p 75.
RECEIVED for review May 12, 1977. Accepted September 1, 1977. Financial support was provided by the Medical Research Service of the Veterans Administration.
Classification of Binary Carbon-13 Nuclear Magnetic Resonance Spectra
Charles L. Wilkins* and Thomas R. Brunner
Department of Chemistry, University of Nebraska-Lincoln, Lincoln, Nebraska 68588
A data base of 3782 binary coded proton noise-decoupled 13C nuclear magnetic resonance spectra has been used to develop Bayes, Maximum Likelihood, and Simplex Learning machine classifiers for the computer-assisted interpretation of such spectra. Comparison of the results of applying these classifiers to prediction of presence or absence of 24 structural features using over 3400 test spectra revealed that the Bayes classifier yielded best classification performance. Both the ease of development of Bayes classifiers for new categories and the levels of categorization reliability obtained recommend this method as an attractive component of an on-line spectrum interpretation system.
In previous papers, investigations of a number of heuristic pattern recognition strategies for the computer-assisted interpretation of nuclear magnetic resonance spectra were reported (1-5). The impetus for this work was our belief that the sensitivity of 13C chemical shifts to structure variation and the widespread availability of instrumentation for routine natural abundance 13C spectral measurements made this structure elucidation technique an imperative target of pattern recognition research. One of the attractive features of empirical approaches to data interpretation is the lack of a need for complete theoretical understanding of the phenomena which produce them, prior to interpretation of the data. Indeed, the computational methods employed here and previously have the potential for highlighting unrecognized patterns in the data (hence, pattern recognition) and directing the investigator's attention toward relationships which may enhance theoretical understanding. Our most ambitious research efforts involved the use of a set of 1767 spectra which comprised the largest such data set yet to be used as a test of pattern recognition interpretation of carbon NMR data (5).
Earlier work had relied, for training of categorizers, exclusively on a set of 500 spectra drawn from the Johnson and Jankowski collection (6) and was primarily devoted to studies of the merits of various data preprocessing techniques (1-3). In this paper, results made possible by the use of a new and larger carbon-13 NMR data base containing close to four thousand 13C NMR spectra are reported. The performance of two similarity measures, the maximum likelihood (7, 8) and a Bayes discriminant procedure (9), is compared with that of weight vectors derived using the simplex method (5, 10) for the detection of presence or absence of 24 structural features. We have found that Bayes discriminant functions, based upon the a priori peak occurrence probabilities derived from a set of 3782 binary coded spectra, perform at a satisfactorily high level to make them usable as primary components of a structure interpretation system. The computational ease of developing such decision functions and their overall performance levels appear, at this time, to make this new approach an excellent method for computer-assisted NMR data interpretation. Previously, a simplex optimization training procedure (10) was used to develop classifiers and resultant prediction performance was compared with that found for linear learning machine classifiers. As a result of examining close to 2000 unknown spectra for prediction of three common structural properties, we concluded that the simplex method, when used with binary coded data, showed promise as a superior route to the requisite high-reliability classifiers desirable for an on-line interpretation system (5). Subsequently, a report of pattern recognition analysis of 13C nuclear magnetic resonance spectra of a set of 43 norbornanes and another set of 12 polychlorinated diphenyl ethers has appeared (11).
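The idea of a Bayes discriminant built from peak occurrence probabilities in binary coded spectra can be illustrated with a small sketch. It is not the authors' formulation, which is not given in this part of the text: it assumes that the spectral bins are statistically independent (a naive Bayes model), adds Laplace smoothing as a convenience, and uses a hypothetical four-bin data set in which each spectrum is a list of 0/1 flags marking whether a peak falls in a chemical shift interval.

```python
import math


def train(spectra, labels, n_bins):
    """Estimate class priors and per-bin peak occurrence probabilities.

    Assumes both classes occur at least once in the training set.
    """
    model = {}
    for cls in (True, False):
        members = [s for s, y in zip(spectra, labels) if y == cls]
        # Laplace smoothing (an added assumption) keeps each probability
        # strictly between 0 and 1 so the logarithms below are defined.
        probs = [
            (sum(s[j] for s in members) + 1.0) / (len(members) + 2.0)
            for j in range(n_bins)
        ]
        model[cls] = (len(members) / len(spectra), probs)
    return model


def log_posterior(spectrum, prior, probs):
    """Log of prior times likelihood under the independent-bin assumption."""
    score = math.log(prior)
    for x, p in zip(spectrum, probs):
        score += math.log(p) if x else math.log(1.0 - p)
    return score


def predict(model, spectrum):
    """Return True if the structural feature is judged present."""
    present = log_posterior(spectrum, *model[True])
    absent = log_posterior(spectrum, *model[False])
    return present > absent


# Hypothetical training set: feature present in the first two spectra.
spectra = [[1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 1, 0, 1]]
labels = [True, True, False, False]
model = train(spectra, labels, n_bins=4)

print(predict(model, [1, 0, 0, 0]), predict(model, [0, 0, 0, 1]))
```

Training such a classifier requires only counting peak occurrences per class, which is consistent with the computational ease the text attributes to the Bayes approach.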
More recently, Munk and co-workers have completed a study of five different types of pattern classifiers for classification of nucleosides, carbohydrates, and steroids, using a data set of 2471 binary coded spectra drawn from the lit-