Computerized library search routine for comparing ultraviolet spectra


Anal. Chem. 1987, 59, 350-353


Computerized Library Search Routine for Comparing Ultraviolet Spectra of Drugs Separated by High-Performance Liquid Chromatography

Dennis W. Hill,* Thomas R. Kelley, and Karen J. Langner

College of Agriculture and Natural Resources, Microchemistry Laboratory U-193, Storrs, Connecticut 06268

A computerized library search routine for matching ultraviolet (UV) spectra of unknown drugs to UV spectra of reference drugs has been developed. Reference compounds were separated on an acid and/or basic high-performance liquid chromatographic solvent gradient system using an HP 1040A diode array spectrophotometric detector. Three hundred UV spectra were collected and stored in a reference data file. A presearch algorithm, combined with a spectral profile comparison algorithm, consistently resulted in accurate matching of unknown spectra to reference spectra contained in the drug UV library file. The efficiency and accuracy of the search routine were evaluated.

Recent availability of commercial diode array spectrophotometers has allowed rapid collection of ultraviolet (UV) and visible spectral profiles in digital form. These instruments, when interfaced with high-performance liquid chromatographs (HPLC), provide a powerful tool for the analysis of complex mixtures of compounds that are not amenable to gas chromatographic (GC) separation. Ultraviolet spectral data can provide structural information about compounds that have been separated on an HPLC system; however, interpretation of UV spectral data is more difficult than interpretation of mass or infrared spectral data. Comparison techniques for UV spectra traditionally utilized only a few points in the spectral profile to validate identifications. More precise evaluation of UV spectra is possible by utilizing a computerized library search routine that provides a point-by-point comparison of unknown spectra to reference spectra. This type of library search routine has been well established for mass spectra and infrared spectra (1-8), and has only recently been developed for use in evaluating ultraviolet spectra (9). The following is a presentation of a search algorithm designed to match UV spectra of unknown compounds to UV spectra in a reference library file, using an 8-bit computer and data generated by a Hewlett-Packard 1040A spectrophotometer.

SYSTEM DESIGN Ultraviolet spectra (200-402 nm) were collected and stored in the reference library at 2-nm resolution, which represents one nominal wavelength. Absorbance values at each wavelength were normalized to the area under the spectral curve as calculated by eq 1, where $N_i$ is the FTA (fraction of total

$$N_i = \frac{A_i}{\sum_{j=200}^{402} A_j} \tag{1}$$

absorbance) at wavelength i, $A_i$ is the absorbance at wavelength i, i is the individual wavelength, and j runs over the wavelengths in the spectral profile. Spectral data can be normalized by ratioing data points to the absorbance of the largest band in the profile or to the area under the profile curve. In poor-quality spectra, it is possible that the largest band may be due to spectral contamination or noise. When the largest point of absorbance is used as the normalization reference, any deviation in the absorbance of this one point can cause a shift in the normalization values of all points in the profile. If all the points in the profile are normalized to the sum of the absorbances under the spectral curve, slight deviations at a few points in the spectrum will have a minimal effect on the total area and therefore will not affect the individual normalized points as much.

A data file was created in which each record contains the following information for one library spectrum: normalized spectral elements, wavelengths of maximum absorbance (λmax) in the spectrum (up to five individual wavelengths), and the name of the compound. A separate index file was created containing 102 records, each representing a nominal λmax value. Each record contained the FTA values for all spectra with that λmax value and the location of those spectra in the data file.

The library search routine consists of the presearch algorithm and a search routine which utilizes a point-by-point comparison of the unknown spectrum to library spectra that meet the presearch criteria. During the presearch routine, the unknown spectrum is normalized and the FTA value at each λmax is determined. The index file is searched for library spectra that contain λmax values with FTA values which closely approximate those of the unknown spectrum. The operator determines how closely the λmax and FTA values of an unknown spectrum must match the values of a library spectrum by defining an acceptable range of values.
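The normalization of eq 1 and the index-based presearch can be sketched as follows. This is a minimal illustration under stated assumptions, not the routine that ran on the HP-85: the function names, the in-memory dictionary standing in for the index file, and the local-maximum rule used to locate bands are all hypothetical, and wavelengths are represented only by their position (0 to 101) in the 102-point profile.

```python
from collections import defaultdict

def normalize_fta(absorbances):
    """Eq 1: divide each absorbance by the sum over the whole profile."""
    total = sum(absorbances)
    if total <= 0:
        raise ValueError("spectrum has no net absorbance")
    return [a / total for a in absorbances]

def find_bands(fta, max_bands=5):
    """Return up to max_bands (point index, FTA) pairs at local maxima.

    Assumption: a band is any interior point higher than its left neighbour
    and at least as high as its right neighbour; the strongest bands are kept.
    """
    peaks = [(k, fta[k]) for k in range(1, len(fta) - 1)
             if fta[k] > fta[k - 1] and fta[k] >= fta[k + 1]]
    peaks.sort(key=lambda p: p[1], reverse=True)
    return peaks[:max_bands]

def build_index(library):
    """Map each nominal wavelength index to (FTA at band, library position)."""
    index = defaultdict(list)
    for position, absorbances in enumerate(library):
        for k, value in find_bands(normalize_fta(absorbances)):
            index[k].append((value, position))
    return index

def presearch(unknown, index, fta_window=0.001, wavelength_window=1):
    """Return library positions with at least one band matching the unknown."""
    hits = set()
    for k, value in find_bands(normalize_fta(unknown)):
        # Accept bands within +/- wavelength_window nominal wavelengths.
        for kk in range(k - wavelength_window, k + wavelength_window + 1):
            for ref_value, position in index.get(kk, []):
                if abs(ref_value - value) <= fta_window:
                    hits.add(position)
    return sorted(hits)
```

Because the unknown is normalized before comparison, a spectrum recorded at a different concentration maps to the same FTA values, which is what lets a fixed FTA window be used across injections. Real spectra would need some noise handling before a simple local-maximum rule like this one is reliable.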
Only those spectra that meet these presearch criteria are compared to the unknown spectrum by calculating the scaled sum of the differences between the unknown and reference spectra using eq 2, where M is the goodness of fit value (FIT), S is the response in the sample spectra at λi, and R is the response in the reference spectra at λi.
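A point-by-point comparison of this kind might look like the sketch below. The exact form of eq 2 is not available in this text, so the scaling here is an assumption: both spectra are FTA-normalized, the absolute differences are summed, and the result is mapped onto a 0 to 1000 scale so that identical profiles score 1000. The names `fit_value` and `rank_candidates` are hypothetical.

```python
def normalize_fta(absorbances):
    """Scale a spectrum so its points sum to 1 (fraction of total absorbance)."""
    total = sum(absorbances)
    if total <= 0:
        raise ValueError("spectrum has no net absorbance")
    return [a / total for a in absorbances]

def fit_value(sample, reference, scale=1000.0):
    """Assumed goodness of fit: scale * (1 - half the summed absolute difference).

    For two FTA-normalized spectra the summed absolute difference lies
    between 0 (identical) and 2 (no overlap), so the result lies in [0, scale].
    """
    if len(sample) != len(reference):
        raise ValueError("spectra must have the same number of points")
    s = normalize_fta(sample)
    r = normalize_fta(reference)
    difference = sum(abs(a - b) for a, b in zip(s, r))
    return scale * (1.0 - difference / 2.0)

def rank_candidates(sample, candidates):
    """Sort (name, spectrum) candidates from best to worst fit."""
    scored = [(fit_value(sample, spectrum), name) for name, spectrum in candidates]
    return sorted(scored, reverse=True)
```

In use, `rank_candidates` would be called only on the spectra that survive the presearch, which keeps the number of full-profile comparisons small.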

0003-2700/87/0359-0350$01.50/0 © 1987 American Chemical Society

ANALYTICAL CHEMISTRY, VOL. 59, NO. 2, JANUARY 1987


Figure 1. HPLC chromatogram of drug mixture representing 10 compounds with spectra containing λmax values most frequently seen in the spectral library. Compounds are eluted from a Du Pont Zorbax column using the acid solvent system described in the Experimental Section. Horizontal axis: time (min), 0 to 30.

Table I. Comparison of Drug Spectral Data at High Signal-to-Noise Levels and Low Signal-to-Noise Levels

                        1 µg^a            30 ng^a
compd                   FIT^b    S/N      FIT^b    S/N
acetaminophen           977      1417     954      132
m-aminobenzoic acid     976      902      950      99
p-aminobenzoic acid     939      472      912      64
chlorothiazide          989      1240     969      50
dyphylline              965      248      972      18
mefenamic acid          989               920      28
probenecid              986      331      925      17
salicylic acid          971      729      940      46
sulfadiazine            980      1240     959      27
sulindac                993      496      964      102

^a Analysis of variance calculated by Data Text routine (10) between 1 µg and 30 ng indicates there is a significant difference between these two groups at 0.002 level. ^b Mean of FIT values; N = 5.

Table II. UV Profile FIT Values of Alkylparaben Spectra to Library Spectra

alkylparaben      FIT      % CV^a
methylparaben     991      0.26
ethylparaben      992      0.13
propylparaben     994      0.07
butylparaben      993      0.11

^a N = 5.

EXPERIMENTAL SECTION The compounds used to generate the UV reference library were divided into two groups for reverse-phase HPLC separation: those eluting in an acid solvent system and those requiring a basic solvent system. Analysis was performed on a Waters HPLC system consisting of a Model 660 automated gradient controller, two Model 6000A solvent delivery systems, and a Model U6K injector. Acid and neutral compounds were eluted from a Du Pont Zorbax column (250 mm × 4.6 mm i.d.) at 31 °C. Solvent A was 0.1% (v/v) H3PO4, and solvent B was 0.1% (v/v) H3PO4 and 10% (v/v) H2O in CH3CN. A linear solvent gradient from 0% B/A to 100% B/A in 30 min was used with a flow rate of 2.0 mL/min. Basic and neutral compounds were separated on a Hamilton PRP-1 column (250 mm × 4.1 mm i.d.) at ambient temperature. Solvent A was 1.0% NH4OH, and solvent B was 1.0% NH4OH in CH3CN. A linear solvent gradient from 0% B/A to 100% B/A in 30 min was used with a flow rate of 2.0 mL/min. UV spectra were obtained with a Hewlett-Packard 1040A spectrophotometer controlled by a Hewlett-Packard 85 computer system equipped with a Hewlett-Packard 9135A storage unit and Hewlett-Packard 7470A printer/plotter.

Experiments were designed to test the efficiency of the search algorithms to choose correct spectra from the library. Ten compounds were chosen from the library with spectra containing λmax values most frequently occurring in the sample library of 300 spectra. A mixture of these drugs (acetaminophen, m-aminobenzoic acid, p-aminobenzoic acid, chlorothiazide, dyphylline, mefenamic acid, probenecid, salicylic acid, sulfadiazine, and sulindac) was prepared in methanol at a concentration of 500 µg/mL. Two microliters of this mixture (1 µg of each compound) was analyzed five times on the acid HPLC system. A representative chromatogram of this analysis is seen in Figure 1.

Ultraviolet spectra of each drug were collected, and the search routine was utilized to determine the FIT value (eq 2) of each compound compared to the corresponding spectrum in the UV library. Parameters for the initial search were an FTA window of 0.001 and a λmax window of ±1 nominal wavelength value. If the correct compound was not listed in the first five choices using these parameters, the FTA window was enlarged.

Three microliters of the drug mixture at a concentration of 10 µg/mL (30 ng of each compound) was analyzed five times on the acid HPLC system. Spectra generated at this concentration had low signal-to-noise ratios. The developed search routine was used on these spectra to determine the efficiency of the algorithm in matching correct spectra to poorly defined sample spectra.

In order to test the ability of the algorithm to distinguish between the spectra of compounds with similar profiles, 5 µL of a mixture of methylparaben, ethylparaben, propylparaben, and butylparaben (500 ng/µL) was analyzed five times on the acid HPLC system. The search routine was employed to determine the FIT value for each of the parabens to reference spectra.

Laboratory samples suspected of containing procaine were analyzed by the basic HPLC/UV system. Each sample was determined to contain procaine by thin-layer chromatography, gas chromatography-mass spectrometry, and/or gas chromatography-infrared analysis. The sample was made basic with NH4OH and extracted with chloroform. Residues were analyzed by HPLC on a PRP-1 or Novapak C18 column using an isocratic solvent system of 1% NH4OH/CH3CN (1:1) or 1% NH4OH/CH3CN (3:7), respectively. In each of the samples, a peak was observed in the chromatograms having the appropriate retention time for procaine. UV spectra of these compounds were submitted to the library search routine to determine the compound's identity.
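As a quick check on the quantities quoted in the Experimental Section, the injected mass per compound is the injection volume times the concentration. The first two lines reproduce the stated 1 µg and 30 ng; the paraben amount is not stated in the text and is simply the same arithmetic applied to the stated volume and concentration.

```latex
\begin{align*}
2\ \mu\text{L} \times 500\ \mu\text{g/mL}
  &= 2\times10^{-3}\ \text{mL} \times 500\ \mu\text{g/mL} = 1\ \mu\text{g} \\
3\ \mu\text{L} \times 10\ \mu\text{g/mL}
  &= 3\times10^{-3}\ \text{mL} \times 10\ \mu\text{g/mL} = 0.03\ \mu\text{g} = 30\ \text{ng} \\
5\ \mu\text{L} \times 500\ \text{ng}/\mu\text{L}
  &= 2500\ \text{ng} = 2.5\ \mu\text{g}
\end{align*}
```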

RESULTS AND DISCUSSION The match algorithm used to evaluate and compare the UV spectra for this study utilizes a simple comparative method. The experimental studies for this algorithm appear to prove its effectiveness for matching spectra out of a relatively small library (300 compounds). As the size of the library increases, other more complex algorithms will be evaluated to determine if it is necessary to upgrade the existing system in order to differentiate spectral profiles from a larger population.

Table I lists the average FIT values for the series of test compounds used to evaluate the search routine. For those compounds with high signal-to-noise (S/N) ratios, a fit to the library spectra of greater than 950 was usually obtained. The exception was p-aminobenzoic acid, which had an average FIT of 939. Dyphylline eluted very close to sulfadiazine (Figure 1) and in some analyses resulted in a spectrum that was partially contaminated with the sulfadiazine spectrum. This resulted in greater variability in the FIT values for the five dyphylline analyses. A statistical comparison of FIT values for high S/N to low S/N ratios shows that a consistently higher FIT value can be expected from clean spectra than from spectra with high noise interference or contamination. An analysis of variance (10) shows a significant difference between FIT values of high and low S/N ratios (Table I).

Table II lists the FIT values obtained for high S/N level spectra (S/N = 1200) of a homologous series of alkylparabens. The structures of these compounds and a representative spectrum of each are shown in Figure 2. When evaluated in the search routine, all of the sample spectra matched the respective library spectra with FIT values greater than 990. These values were reproducible with better than 0.3% coefficient of variation (Table II). In comparison of FIT values of the correct alkylparaben spectrum with the other spectra in the library, methylparaben fit the library spectrum of methylparaben the best in each of the five evaluations. Ethylparaben had a slightly better fit to the library spectrum of methylparaben during each of the five evaluations. The propylparaben test spectrum matched the library spectrum for propylparaben the best in all five evaluations, whereas the butylparaben spectra gave a best fit to the library butylparaben spectrum once and matched the propylparaben library spectrum the best in the other four evaluations.

Figure 2. UV spectra and structures of methylparaben, ethylparaben, propylparaben, and butylparaben. Horizontal axis: wavelength (nm).

UV spectra from 34 procaine-positive tissue and urine samples were subjected to the library search routine, and in each case the UV spectra for procaine and benzocaine were indicated as the best fits to the sample spectra. Procaine had an average fit value of 958, and benzocaine had an average fit value of 946. Procaine had a slightly better fit 31 out of 34 times to the library spectra than benzocaine, and in the other three instances procaine was chosen as the second-best fit. The similarities between the UV spectra of procaine, benzocaine, and butacaine can be seen in Figure 3. Closely related spectra give similar fit values for the profile match; however, the correct spectra usually have the first- or second-best fit. In actual samples, such as those that were analyzed for procaine, it is not sufficient to use these data alone to differentiate between two compounds. It does, however, significantly narrow the possibilities. In most cases, the retention times of possible compounds are sufficiently different to allow confidence in assigning a compound's identity.

Figure 3. UV spectra and structures of procaine, benzocaine, and butacaine. Horizontal axis: wavelength (nm).

Table III. Uniqueness of FTA Values^a

                                      FTA deviation^b
λmax, nm   no. of spectra in file     ±0.0001   ±0.001   ±0.002   ±0.004
203        59                         97^c      88^c     85^c     61^c
205        38                         95        95       96       89
281        26                         77        50       50       31
211        24                         100       92       75       67
245        23                         83        74       57       39
273        21                         90        71       57       29
207        18                         89        78       78       56
227        18                         100       89       78       78
243        18                         100       72       72       56
259        18                         78        61       50       44
279        18                         100       100      89       56

^a Efficiency of the presearch routine measured as percentage of spectra in a given λmax file that will have a unique FTA value within certain FTA deviations. ^b Variation around FTA value (window) being evaluated. ^c Percent of the spectra in the library at the given λmax value that have a unique FTA value when compared to all other spectra in the library with the same λmax value.

The efficiency of the presearch routine can be demonstrated by the data in Table III. The 11 most frequently occurring λmax values in the library are listed along with the number of spectra having that λmax value. By use of various FTA window allowances, the table indicates the percentage of correct spectra that will be exclusively chosen in the presearch routine. As the FTA window is broadened, the number of times only one spectrum meets the criteria decreases. The ability of the presearch algorithm to choose the one correct spectrum a high percentage of the time decreases the need to compare every spectrum in the library by the time-consuming point-by-point algorithm. The fraction of total absorbance (FTA) value may vary for the same compound from run to run. Variance in this number

could affect the ability of the presearch program to choose the correct library spectrum. The variance (3 standard deviations (SD)) data calculated for Table IV are presented to show the fluctuation of FTA values in the 10 test compounds. Dyphylline exhibited the most variance (3 SD = 0.0210) in FTA of the 10 compounds in one of its two λmax values (207 nm). The second band for dyphylline (275 nm), however, had a very low variance (3 SD = 0.0006) in FTA. The presearch routine is designed to examine up to five λmax and FTA values in a given spectrum. Test spectra are required to duplicate the λmax and FTA values of at least one band in a library spectrum to meet the criteria for being chosen by the presearch routine. Low variance in only one band of a multiband spectrum is required to ensure proper selectivity for the presearch routine. Therefore, use of an FTA window of 0.001 would assure that dyphylline was selected in the presearch routine because one of its bands (275 nm) had an FTA variance below 0.001. These data can also serve as an indicator for selecting the optimal FTA window to use for the presearch routine. Of the compounds tested for this purpose, 2 of the 10 would require opening the FTA window beyond 0.001.

Table IV. Variance of FTA Values^a

                       band 1                     band 2                     band 3                     band 4
compd                  λmax, nm  FTA     3 SD     λmax, nm  FTA     3 SD     λmax, nm  FTA     3 SD     λmax, nm  FTA     3 SD
acetaminophen          245       0.0322  0.0014
m-aminobenzoic acid    227       0.0590  0.0008   273       0.0044  0.0002
p-aminobenzoic acid    225       0.0345  0.0037   281       0.0182  0.0018
chlorothiazide         203       0.0315  0.0011   229       0.0484  0.0016   279       0.0183  0.0005
dyphylline             207       0.0571  0.0210   275       0.0220  0.0007
mefenamic acid         221       0.0352  0.0005   281       0.0099  0.0001             0.0080  0.0002
probenecid             227       0.0259  0.0012   251       0.0269  0.0008
salicylic acid         205       0.0890  0.0007   237       0.0221  0.0003   303       0.0101  0.0003
sulfadiazine           267       0.0222  0.0009
sulindac               227       0.0149  0.0003   259       0.0111  0.0006   285       0.0118  0.0002   329       0.0106  0.0006

^a N = 5.

Fell et al. (9) recently demonstrated the ability to perform library search routines on digitized UV spectra. In this method, a presearch routine relied on the retrieval of spectra that had λmax and λmin values similar to the unknown spectral values. The fit of unknown spectra to reference spectra was calculated on the "smoothed" spectra as the mean square root of the difference between the two spectra. This routine was tested on a limited spectral library of eight compounds and demonstrated the ability to distinguish between similar spectral profiles. The search routine described in this paper was tested on a library of approximately 300 spectra. The presearch routine narrows the number of possible spectra to approximately 20 or less, and in each of the test cases this subfile contained the correct spectrum. Additionally, an absolute difference algorithm on "nonsmoothed" spectra was sufficient to give correct relative comparison of spectra, including spectra that were relatively noisy.
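The reasoning about the FTA window can be made concrete with the 3 SD values of Table IV. The sketch below is illustrative only: the function name is hypothetical, and the assignment of band 2 to band 4 values to individual compounds follows our reading of the flattened table, so it should be treated as an assumption. A compound is taken to pass the presearch if at least one of its bands has a 3 SD below the window.

```python
# 3 SD of the FTA at each band (N = 5), as read from Table IV.
# Assumption: band-to-compound alignment follows our reading of the table.
THREE_SD = {
    "acetaminophen":       [0.0014],
    "m-aminobenzoic acid": [0.0008, 0.0002],
    "p-aminobenzoic acid": [0.0037, 0.0018],
    "chlorothiazide":      [0.0011, 0.0016, 0.0005],
    "dyphylline":          [0.0210, 0.0007],
    "mefenamic acid":      [0.0005, 0.0001, 0.0002],
    "probenecid":          [0.0012, 0.0008],
    "salicylic acid":      [0.0007, 0.0003, 0.0003],
    "sulfadiazine":        [0.0009],
    "sulindac":            [0.0003, 0.0006, 0.0002, 0.0006],
}

def compounds_needing_wider_window(three_sd, window=0.001):
    """List compounds for which no band has a 3 SD below the FTA window."""
    return sorted(name for name, bands in three_sd.items()
                  if min(bands) >= window)

if __name__ == "__main__":
    for name in compounds_needing_wider_window(THREE_SD):
        print(name, "needs an FTA window wider than 0.001")
```

With these values the function flags acetaminophen and p-aminobenzoic acid, which agrees with the statement that 2 of the 10 compounds would require opening the window beyond 0.001. Dyphylline passes despite its noisy 207-nm band because its second band is stable.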


Some structurally related compounds generate spectra that are very similar. The search routine has consistently distinguished between spectra of structurally related compounds, such as the alkylparabens when they differed by at least two methylene groups. This algorithm has been used in our laboratory for over a year and has consistently indicated the correct UV spectra for compounds tested that were contained in the library. We feel that this method will aid in the use of diode array UV detectors as qualitative instruments.

ACKNOWLEDGMENT We thank Albert Berrebi for conducting statistical analyses for the data presented in Table III.

Registry No. p-HOC6H4NHAc, 103-90-2; m-H2NC6H4CO2H, 99-05-8; p-H2NC6H4CO2H, 150-13-0; o-HOC6H4CO2H, 69-72-7; chlorothiazide, 58-94-6; mefenamic acid, 61-68-7; probenecid, 57-66-9; sulfadiazine, 68-35-9; sulindac, 38194-50-2; methylparaben, 99-76-3; ethylparaben, 120-47-8; propylparaben, 94-13-3; butylparaben, 94-26-8; dyphylline, 479-18-5.

LITERATURE CITED
(1) Hertz, H. S.; Hites, R. A.; Biemann, K. Anal. Chem. 1971, 43, 681-691.
(2) Lowry, S. R.; Huppler, D. A. Anal. Chem. 1983, 55, 1288-1291.
(3) Delaney, M. F.; Uden, P. C. Anal. Chem. 1979, 51, 1242-1249.
(4) Hanna, A.; Marshall, J. C.; Isenhour, T. L. J. Chromatogr. Sci. 1979, 17, 434-440.
(5) Azarraga, L. V.; Williams, R. R.; de Haseth, J. A. Appl. Spectrosc. 1981, 35, 468-469.
(6) de Haseth, J. A.; Azarraga, L. V. Anal. Chem. 1981, 53, 2292-2298.
(7) Hangac, G.; Wieboldt, R. C.; Lam, R. B.; Isenhour, T. L. Appl. Spectrosc. 1982, 36, 40-47.
(8) Erickson, M. D. Appl. Spectrosc. 1981, 35, 181-184.
(9) Fell, A. F.; Clark, B. J.; Scott, H. P. J. Chromatogr. 1984, 316, 423-440.
(10) Armor, D. J.; Couch, A. S. Data Text Primer; Collier Macmillan: London, 1972; pp 92-99.

RECEIVED for review January 14, 1986. Resubmitted August 25, 1986. Accepted September 3, 1986. This publication is recorded with the Storrs Agricultural Experiment Station as Scientific Contribution No. 1149.

Anal. Chem. 1987, 59, 353-358

Determination of Surface Polarity by Heterogeneous Gas-Solid Chromatography

Scott P. Boudreau and William T. Cooper*

Department of Chemistry, Florida State University, Tallahassee, Florida 32306-3006

A polarity scale for heterogeneous surfaces is proposed that uses energy distribution functions calculated from chromatographic retention data. The energy required to form a complete monolayer (€,,), obtained from the energy distribution function, has been chosen as the parameter best suited for describing the heat of adsorption on a heterogeneous surface. E,, values for chloroform (proton donor), pyridine (proton acceptor), and dichloromethane (dipole interactor) are used as parameters in the surface polarity scale that is analogous to the Rohrschneider and McReynolds scales that describe the polarity of gas-liquid chromatography stationary phases. Monolayer energies have also been used to construct a surface selectivity triangle. Results are presented for a variety of surfaces differing in the relative number of acidic, basic, and dipolar sites.

Table I. Material and Column Characteristics

material               column dimensions, cm   surface area, m²/g   weight, g   particle size (mesh)
kaolinite              60 × 0.4                3.9                  7.0606      100-200
silanized silica gel   26 × 0.4                180                  1.4865      100-200
alumina                23 × 0.4                70                   2.3719      100-200
silica gel             26 × 0.4                200                  1.4865      100-200

The importance of solid and particulate surfaces in chemical processes relevant in such diverse fields as analytical chemistry, environmental geochemistry, catalysis, and biomedical engineering has been recognized for some time. However, studies of these processes are often complicated by the inevitable chemical and physical heterogeneity of the surfaces.

0003-2700/87/0359-0353$01.50/0 © 1987 American Chemical Society