Anal. Chem. 2010, 82, 9127–9133
Identification of Multiple Impurities in a Pharmaceutical Matrix Using Preparative Gas Chromatography and Computer-Assisted Structure Elucidation

Anna Codina,* Robert W. Ryan, Richard Joyce, and Don S. Richards

Analytical Development, Pfizer Global Research and Development, Ramsgate Road, Sandwich, Kent, CT13 9NJ, U.K.

Gas chromatography (GC) with a preparative fraction collector (PFC) has been used to facilitate the identification of a number of volatile impurities at major and minor percentage levels in a pharmaceutical matrix by nuclear magnetic resonance spectroscopy (NMR) and mass spectrometry (MS). The trapping process was optimized using liquid sorbents, and the impurities were trapped directly into a deuterated solvent. Challenges related to the pharmaceutical matrix were overcome by derivatization with boron trifluoride in methanol and extraction with heptane, producing the methyl esters of the carboxylic acid impurities and main component. GC coupled to atmospheric pressure chemical ionization mass spectrometry (APCI-MS) with a time-of-flight (TOF) detector was used to acquire accurate mass and isotopic data for the impurities, leading to the determination of their molecular formulas (MF). One-dimensional (1D) and two-dimensional (2D) NMR experiments were also acquired to unambiguously determine the impurities' structure. The acquisition time of the latter experiments was minimized by using a high-resolution instrument equipped with a small (1.7 mm) cryogenic probe. The quality of the data was such that the structure of the impurities could be determined semiautomatically by using a computer-assisted structure elucidation (CASE) approach, even though the total amount of one of the isolated impurities was less than 60 nmol.
* To whom correspondence should be addressed. Fax: +441304652291. E-mail: [email protected].
(1) Ledauphin, J.; Saint-Clair, J.-F.; Lablanquie, O.; Guichard, H.; Founier, N.; Guichard, E.; Barillier, D. J. Agric. Food Chem. 2004, 52, 5124–5134.
(2) Eyres, G. T.; Urban, S.; Morrison, P. D.; Marriott, P. J. Anal. Chem. 2008, 80, 6293–6299.
(3) Ruhle, C.; Eyres, G. T.; Urban, S.; Dufour, J.-P.; Morrison, P. D.; Marriott, P. J. J. Chromatogr., A 2009, 1216, 5740–5747.
(4) Elyashberg, M. E.; Williams, A. J.; Martin, G. E. Prog. Nucl. Mag. Reson. Spectrosc. 2008, 53, 1–104.
(5) Nelson, D. B.; Munk, M. E.; Gash, K. B.; Herald, D. L. J. Org. Chem. 1969, 34, 3800–3805.
(6) ACD Labs. NMR Workbook program, http://www.acdlabs.com/products/adh/nmr/nmr_workbook, June 23, 2010.
(7) Bruker. Smart Formula 3D application note, www.bdal.de/uploads/media/LCGC_March_2008_Appl._Books.pdf, June 23, 2010.
(8) Zurek, G.; Krebs, I.; Goetz, S.; Scheible, H.; Laufer, S.; Kammerer, B.; Albrecht, W. LC-GC Eur. 2008, 31–33.
(9) Martin, G. E.; Hadden, C. E.; Russell, D. J.; Kaluzny, B. D.; Guido, J. E.; Duholke, W. K.; Stiemsma, B. A.; Thamann, T. J.; Crouch, R. C.; Blinov, K.; Elyashberg, M.; Martirosian, E. R.; Molodtsov, S. G.; Williams, A. J. J. Heterocycl. Chem. 2002, 39, 1241–1250.
(10) Blinov, K.; Elyashberg, M.; Martirosian, E. R.; Molodtsov, S. G.; Williams, A. J.; Tackie, A. N.; Sharaf, M. M. H.; Schiff, P. L., Jr.; Crouch, R. C.; Martin, G. E.; Hadden, C. E.; Guido, J. E.; Mills, K. A. Magn. Reson. Chem. 2003, 41, 577–584.
10.1021/ac102151g © 2010 American Chemical Society. Published on Web 10/13/2010.
Analytical Chemistry, Vol. 82, No. 21, November 1, 2010

Preparative gas chromatography (prep-GC) has been available for many years but has mainly found application in the academic, environmental, and flavors and fragrances sectors.1-3 Prep-GC for pharmaceutical structural elucidation is a relatively new application of this technique, and it has the potential to speed up the identification of unknown volatile impurities detected by GC by facilitating the acquisition of NMR data. Nevertheless, trapping volatile impurities eluting from a GC column is not a straightforward proposition, and early results using commercially available traps have shown that there is a great deal of scope for improving the technique. Our recent experience with volatile impurities demonstrated how complex and lengthy an investigation into the identity of structurally similar impurities can be. A resource-hungry cycle of chemistry-theory-experimentation is generally pursued in which, given enough time, experience, and expertise, a solution will eventually present itself. This cycle can be bypassed by enabling the trapping of a pure sample of the impurity or impurities, allowing the acquisition of empirical data followed by unambiguous identifications.

Computer-assisted structure elucidation (CASE) has been used for the last 40 years to elucidate molecular structures from experimental data.4,5 However, methods using 2D NMR and accurate mass MS data have usually been slow and far from fully automated. In our opinion, the bottlenecks of the process have been the peak picking of the 2D NMR experiments and the capacity to obtain a unique molecular formula (MF) from accurate mass measurements. Recently, software packages that allow rapid peak picking6 and MF determination7,8 have become available. Combining these with CASE significantly improves the speed of the elucidation of structures from raw NMR and MS data. We are of the opinion that, for relatively small organic molecules, the combined computational approach can nowadays be faster and more thorough than the expert spectroscopist. Thus, despite there being many instances in the literature describing CASE as an expert system to assist in solving difficult problems,9,10 we prefer to use it for the relatively routine investigations so that the expert
spectroscopist can focus on complex elucidations such as those involving mixtures, weak data, and/or several overlapping peaks.

Any automated approach requires data of an acceptable quality in order to differentiate signal (S) from noise (N). When the S/N ratio is very low, this is much more challenging for computers than for human beings. This, and the fact that NMR is intrinsically insensitive compared to MS or UV, has limited CASE to instances where the amount of material available is of the order of milligrams (micromoles). Recent advances in small cold probe technology11 have pushed the limit to the microgram (nanomole) range. The use of instrumentation equipped with these probes leads to a dramatic reduction in either the NMR acquisition time (compared to when standard room-temperature probes are used) or the amount of material needed. This is a key factor in reducing cycle times when isolation is required prior to elucidation.

We describe below the combined use of prep-GC, GC-atmospheric pressure chemical ionization-mass spectrometry (APCI-MS), NMR, and CASE for isolation and rapid elucidation of impurities at major and minor percentage levels in a pharmaceutical matrix.

EXPERIMENTAL SECTION

Chemicals and Materials. The following reagents were obtained from Sigma-Aldrich Company Ltd. (Dorset, England) and used as received: methanol-d4, heptane, and 10% boron trifluoride in methanol.

GC-EI-MS Analysis. Determinations were carried out on a 6890N series GC system coupled to a 5973N MSD analyzer (Agilent Technologies UK Ltd., Wokingham, U.K.). Injections were made via a programmable temperature vaporization (PTV) inlet onto a ZB-5MS column, 30 m × 0.25 mm i.d. × 1.0 µm film (Phenomenex Ltd.). Injections of 1 µL were made in hot (250 °C) split mode (ratio 50:1), with helium flowing through the column at 1 mL/min.

Derivatization.
A volume of 0.3 mL of 10% boron trifluoride in methanol was added to 100 mg of matrix, and the derivatization was carried out at 60 °C for 1 h. A volume of 1 mL of aqueous sodium carbonate was added, and the aqueous solution was extracted with 2 × 250 µL of heptane.

Prep-GC. Prep-GC was carried out on a 6890 GC system equipped with a flame ionization detector (FID, Agilent), a CIS 4 PTV (Gerstel GmbH & Co.KG, Mülheim an der Ruhr, Germany), and a DB5 column, 30 m × 0.53 mm × 5.00 µm film (Agilent). A total of 20 injections of 1 µL of the heptane extract were made. The PTV was at 20 °C for the injection, then ramped to 250 °C at 12 °C/s, and held at 250 °C for the duration of the run. The carrier gas was helium flowing at 4.5 mL/min. The compounds were isolated using a Gerstel PFC equipped with traps containing 200 µL of methanol-d4.

GC-APCI-MS. GC conditions were as for GC-EI-MS. APCI mass spectra were recorded on a Bruker MicroTOFQ mass spectrometer equipped with a Bruker GC-APCI interface. Data were acquired at a resolution of 6000 (m/z 205), and accurate mass calibration was achieved by external calibration of the data file using a calibration produced by the introduction of six known compounds, covering the mass range m/z 72-609, via the standard spray interface supplied with the APCI source. All compounds gave a [M + H]+ ion and considerable in-source fragmentation. The chemical formulas of the [M + H]+ ions were determined using Bruker SmartFormula software, which utilizes both the mass measurement and the isotope pattern to assign the formula. Formulas with mass measurement errors of less than 2 mDa are considered, and the quality of the isotopic match, represented by a millisigma value, is used to determine a unique formula. A millisigma value of less than 20 is considered an excellent match.

(11) Molinski, T. F. Nat. Prod. Rep. 2010, 27, 321–329.

NMR Spectroscopy. NMR spectra were recorded on a Bruker AVANCE III 600 spectrometer equipped with a 1.7 mm triple-resonance (1H/15N/13C) single-axis gradient cryogenic probe. Homonuclear experiments included 1D 1H and 2D (1H, 1H) double quantum filtered-correlation spectroscopy (DQF-COSY). Heteronuclear experiments included 2D 13C heteronuclear single quantum correlation (HSQC) and 2D 13C heteronuclear multiple bond correlation (HMBC). The 2D experiments were acquired with high resolution in the indirect dimension: 300 complex points (pt), resolution 83 Hz/pt for the HSQC; 400 pt, 94 Hz/pt for the HMBC; and 512 pt, 12 Hz/pt for the COSY. For the least sensitive sample, the resolution was slightly decreased because of the increased number of transients. In total, 200 pt were acquired for the HSQC (125 Hz/pt), 300 pt for the HMBC (126 Hz/pt), and 128 pt for the COSY (47 Hz/pt). All of the experiments were acquired at 303 K. Data were automatically processed and peak picked with NMR Workbook version 12.01, build 35344 (ACD/Labs). 1H chemical shifts were referenced to methanol-d3 at 3.31 ppm.

For quantitative purposes, the 1D 1H experiments for each of the derivatized impurities and the main band (MB) were acquired with the same volume (40 µL) and NMR parameters (1 scan, 1 s relaxation delay, receiver gain of 64, and 30° pulse). The 360° pulse was calibrated for one of the samples and kept constant.
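The indirect-dimension settings quoted above for the 2D experiments can be cross-checked with a little arithmetic. The sketch below assumes that the quoted resolution is simply the indirect spectral width divided by the number of complex points; that reading is an assumption, not something stated in the text, and the names `SETTINGS`, `spectral_width_hz`, and `implied_widths` are illustrative only.

```python
# Indirect-dimension settings quoted in the text: (complex points, Hz per point).
# Assumption: "resolution" = spectral width / number of complex points.
SETTINGS = {
    "HSQC": {"standard": (300, 83), "low_sensitivity": (200, 125)},
    "HMBC": {"standard": (400, 94), "low_sensitivity": (300, 126)},
    "COSY": {"standard": (512, 12), "low_sensitivity": (128, 47)},
}


def spectral_width_hz(points, hz_per_point):
    """Indirect spectral width implied by a (points, Hz/pt) pair."""
    return points * hz_per_point


def implied_widths(settings=SETTINGS):
    """Return {experiment: {condition: implied spectral width in Hz}}."""
    return {
        experiment: {
            condition: spectral_width_hz(*values)
            for condition, values in conditions.items()
        }
        for experiment, conditions in settings.items()
    }


if __name__ == "__main__":
    for experiment, widths in implied_widths().items():
        std = widths["standard"]
        low = widths["low_sensitivity"]
        change = 100.0 * (low - std) / std
        print(f"{experiment}: {std} Hz vs {low} Hz ({change:+.1f}% difference)")
```

Under this assumption, the two sets of settings imply nearly the same spectral window for each experiment, which is consistent with the coarser Hz/pt values for the least sensitive sample arising from fewer increments rather than from a different window.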
The tuning and matching were also kept constant. The data were processed and analyzed with MestReNova (6.1.1-6384). Baseline correction was performed using a third-order Bernstein polynomial fit.

CASE. Structure Elucidator version 12.01, build 33864 (ACD/Labs) was used. Build 33886 was used to generate the graphic for impurity C in Figure 4.

RESULTS AND DISCUSSION

Liquid Sorbent. A set of experiments using model compounds was designed to understand how prep-GC can be utilized most effectively in pharmaceutical research and development. Liquid sorbents were found to effectively trap the compounds of interest. The liquid sorbents were selected on the basis of their boiling point, polarity, and the availability and cost of deuterated analogues. The solvents evaluated were carbon tetrachloride, chloroform, dimethyl sulfoxide, acetonitrile, and methanol. These sorbents were tested for their effect on trapping efficiency and their suitability for use over an extended period of time in the PFC. Methanol was judged to be the best all-rounder and was therefore used as the liquid sorbent of choice. The use of a deuterated sorbent had the added benefit of facilitating isolation directly into a solvent suitable for NMR analysis. In doing so, a preconcentration step can be avoided, which is advantageous because such a step can lead to significant loss of the volatile impurities under investigation.

Isolation by Prep-GC. Three impurities (A, B, and C) had to be identified in a pharmaceutical matrix with a known main band (MB). The matrix was analyzed by GC-EI-MS (Figure 1). Chromatographic separation was achieved, and the probable molecular weight (MW) of each impurity was obtained from its EI mass spectrum. The impurity peaks were 2.8% (A), 7.7% (B), and 14.1% (C), calculated by peak area normalization. Initial attempts to carry out prep-GC required several injections, but the chromatography of the system deteriorated after only a few, resulting in the apparent disappearance of impurity A and an unresolved broad peak for MB and impurity B. Attempts to recover the system with inlet and column maintenance by injecting bis(trimethylsilyl)trifluoroacetamide had some positive effect, but the original chromatographic separation could not be reproduced. Since the chromatographic problems were thought to be caused by nonvolatile material present in the matrix, a cleaning step prior to isolation was designed. On the assumption that the compounds to be isolated had an acid group (from the MW and the MB structure), the matrix was treated with boron trifluoride in methanol to form methyl esters (less polar, more volatile species), which were then extracted with heptane, as described in the Experimental Section. The heptane solution was analyzed by GC-EI-MS, giving good chromatography and resolution of the derivatized impurities of interest (Figure 1), and was consequently used in the preparative system. An additional peak was also detected (peak D). These impurities were successfully trapped in 200 µL of methanol-d4. Trapping efficiencies of 63% were obtained for MB and 39, 63, and 54% for impurities A, B, and C, respectively. Figure 1 also shows the superimposed GC-EI-MS chromatograms for each of the isolated derivatized impurities. The concentration of the solutions containing each of the trapped impurities was roughly estimated to assess the feasibility of their structure elucidation. Assuming the same EI response factor for MB and the impurities, concentrations from 0.2 to 1.7 µg/µL were obtained. Consequently, for the least concentrated sample (impurity D), the total amount of impurity in the NMR tube (40 µL) will be in the region of 10 µg, which should enable the acquisition of 1D and 2D homonuclear (1H) and possibly heteronuclear (1H, 13C) NMR experiments on a reasonable time scale (2-3 days).

Figure 1. GC-EI total ion current (TIC) chromatogram of the pharmaceutical matrix before (top) and after (middle) derivatization and extraction, indicating impurities of interest, MB, and their MW based on EI-MS. Bottom, stacked GC-EI TIC chromatograms for each of the isolated impurities and MB.

Structure Elucidation and Quantification. The APCI mass spectra of MB and the four derivatized impurities (A, B, C, and D) are shown in Figure 2. Data for the [M + H]+ ions for all the components are given in Table 1, showing the assigned chemical formula and the high confidence in the assignment represented by the low mass error and millisigma values.
The APCI mass spectrum of derivatized MB (1) gave a [M + H]+ ion at m/z 205 and several fragments including m/z 173 ([M + H - CH3OH]+, error 0.1 mDa) and a substituted tropylium ion, m/z 145 (C11H13+, error 0.9 mDa). The MS data for the derivatized impurity A, which indicated a [M + H]+ ion formula of C9H11O2 and showed a tropylium ion at m/z 91
Figure 3. The 600 MHz 13C HSQC NMR spectrum of the derivatized impurity D in methanol-d4.
Figure 2. APCI+ mass spectra of the derivatized MB and impurities A, B, C, and D.

Table 1. Assigned Formulae of the [M + H]+ Ions for the Derivatized MB and Impurities A, B, C, and D

derivatized compound   measured [M + H]+ (Da)   assigned formula   error (mDa)   millisigma value
MB                     205.1222                 C13H17O2+          0.1           1.2
A                      151.0760                 C9H11O2+           0.6           2.8
B                      207.1382                 C13H19O2+          0.3           4.2
C                      205.1224                 C13H17O2+          0.1           12.3
D                      237.1500                 C14H21O3+          1.4           14.1
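The mass errors in Table 1 can be cross-checked by recomputing the monoisotopic m/z of each assigned cation and comparing it with the measured value. The following is a minimal sketch, not the SmartFormula procedure: the function names are illustrative, the isotope pattern (millisigma) scoring is not reproduced, and the atomic masses are standard monoisotopic values rounded to eight decimals.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes, and the electron mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
ELECTRON = 0.00054858

# Table 1: compound -> (measured [M + H]+ in Da, assigned cation formula).
TABLE_1 = {
    "MB": (205.1222, "C13H17O2"),
    "A": (151.0760, "C9H11O2"),
    "B": (207.1382, "C13H19O2"),
    "C": (205.1224, "C13H17O2"),
    "D": (237.1500, "C14H21O3"),
}


def cation_mass(formula):
    """Monoisotopic m/z of a singly charged cation with the given formula."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * int(count or 1)
    return total - ELECTRON  # one electron removed for the +1 charge


def mass_errors_mda(table=TABLE_1):
    """Absolute difference between measured and calculated m/z, in mDa."""
    return {
        name: abs(measured - cation_mass(formula)) * 1000.0
        for name, (measured, formula) in table.items()
    }


if __name__ == "__main__":
    for name, error in mass_errors_mda().items():
        print(f"{name}: {error:.2f} mDa")
```

Because the measured masses in Table 1 are rounded to four decimal places, the recomputed errors can differ from the tabulated ones by up to about 0.1 mDa; all of them remain below the 2 mDa acceptance window described in the Experimental Section.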
(C7H7+, error 2.1 mDa), strongly suggested structure 2. 1H and 13C HSQC NMR experiments were also acquired and confirmed the structure. The MS data were not sufficient to unambiguously determine the structure of the derivatized impurities B, C, and D. Therefore 1D (1H) and 2D [(1H,1H) COSY, 13C HSQC, and 13C HMBC] NMR experiments were acquired. A 600 MHz spectrometer equipped with a 1.7 mm cryogenic probe was used because of its high mass sensitivity. The quality of the data was such that it allowed the elucidation of the structures in a semiautomatic fashion by using CASE. The experiments were run with high resolution in the indirect dimension (as described in the Experimental Section) in order to avoid peak overlapping, decrease the number of ambiguous correlations, and therefore
facilitate the CASE. Figure 3 shows the 13C HSQC for the derivatized impurity D (10 µg), acquired in 8 h and 5 min. 1D and 2D NMR data for the derivatized impurities were automatically processed with NMR Workbook (ACD/Labs). A project was created for each of them. The 1H spectrum was manually peak picked, and the peaks were automatically transferred to the 2D spectra. The automatic 2D peak picking was visually inspected for peaks the program failed to pick and/or clusters of overlapping peaks where the program may have failed to pick the center of the peak. In most cases, the peak picking was good and required minimal human optimization. The NMR Workbook project containing the peak-picked NMR data was then opened in Structure Elucidator (ACD/Labs), and the molecular formula (obtained from GC-APCI-MS) was introduced. The data were checked, and some structural information was added in what we refer to as a “data grooming” step. For impurities B, C, and D, this consisted of (i) completing the 13C chemical shift table (merged) by adding information about the number of carbon atoms for each different carbon chemical shift and the number of protons directly attached to each of those carbon atoms, (ii) completing the 1H chemical shift table (merged) by adding the number of protons for each different proton chemical shift, and (iii) calibrating the 13C HMBC correlations so that correlations were mainly (>80%) 2-3 bonds, except in the case of impurity B, where all the correlations were deliberately set to 2-3 bonds. After the data were groomed, the automatically generated molecular connectivity diagram (MCD) was examined for contradictions. No contradictions were found, and therefore structures were calculated based on the MCD. No bonds were extended during the calculations. The generated structures were filtered by carbon assignment during the generation step (maximum match factor 4 and shift difference 20).
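The 2-3 bond HMBC constraint applied during data grooming can be pictured as a path-length test on a candidate structure. The sketch below illustrates that general idea only; it is not Structure Elucidator's algorithm, and the atom labels, the two toy C4H8O2 connectivities, and the correlation list are hypothetical data invented for the example, not values from this work.

```python
from collections import deque


def heavy_atom_distance(bonds, start, goal):
    """Shortest path length (in bonds) between two heavy atoms, via BFS."""
    neighbours = {}
    for a, b in bonds:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        atom, dist = queue.popleft()
        if atom == goal:
            return dist
        for nxt in neighbours.get(atom, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the two atoms are not connected


def hmbc_violations(bonds, correlations, allowed=(2, 3)):
    """List correlations whose H-to-C bond count falls outside `allowed`.

    Each correlation is (carbon bearing the proton, correlated carbon); the
    H-to-C bond count is the heavy-atom distance plus one for the H-C bond.
    """
    bad = []
    for h_carbon, target in correlations:
        dist = heavy_atom_distance(bonds, h_carbon, target)
        n_bonds = None if dist is None else dist + 1
        if n_bonds is None or not (allowed[0] <= n_bonds <= allowed[1]):
            bad.append((h_carbon, target, n_bonds))
    return bad


# Hypothetical toy data: carbons are labelled by made-up 13C shifts (ppm).
# Candidate X has methyl propanoate connectivity, candidate Y ethyl acetate.
CANDIDATE_X = [("C9", "C27"), ("C27", "C174"), ("C174", "O1"),
               ("C174", "O2"), ("O2", "C51")]
CANDIDATE_Y = [("C9", "C27"), ("C27", "O2"), ("O2", "C174"),
               ("C174", "O1"), ("C174", "C51")]
HMBC = [("C9", "C174"), ("C27", "C174"), ("C51", "C174"), ("C27", "C9")]

if __name__ == "__main__":
    print("X violations:", hmbc_violations(CANDIDATE_X, HMBC))
    print("Y violations:", hmbc_violations(CANDIDATE_Y, HMBC))
```

In this toy case, candidate Y is rejected because one correlation would have to span four bonds. A strict test of this kind corresponds to the setting described for impurity B; allowing a fraction of longer-range correlations, as was done for the other impurities, would require relaxing the `allowed` window for some of the correlations.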
For the structures that passed through the filter, 1H chemical shifts for each of the atoms were also predicted using artificial neural networks,12-14 and the average deviations between predicted and experimental chemical shifts for both 13C [dN(13C)] and 1H [dN(1H)] were obtained. The structure with the lowest combination of dN(13C) and dN(1H) was given as the most probable structure. The term “Best Structure” is used by the program. Further details about Structure Elucidator and/or CASE concepts and methodology are beyond the scope of this paper and can be found in the literature.4,15

(12) Meiler, J.; Maier, W.; Will, M.; Meusinger, R. J. Magn. Reson. 2002, 157, 242–252.
(13) Meiler, J.; Lefebvre, B.; Williams, A.; Hachey, M. J. Magn. Reson. 2004, 171, 1–3.

A total of 65 structures were calculated for the derivatized impurity B, and 13 passed the filter. The total calculation time was 1 s. Figure 4 shows the dN(13C) and dN(1H) for these 13 structures ranked by ascending dN(13C). The second ranked structure (RS II) was found to be the best one by the program. RS I, III, and IV also had very good and similar scores [dN(13C)