Improving the Performance of High-Precision ... - ACS Publications

Feb 29, 2016 - Extra Byte, Via Raffaello Sanzio 22C, Castano Primo I-20022, Italy. •S Supporting Information. ABSTRACT: Quantitative 1H NMR (qNMR) i...

Article

Improving the performance of high precision qNMR measurements by a double integration procedure in practical cases Torsten Schoenberger, Sonja Menges, Michael A. Bernstein, Manuel Perez, Felipe Seoane, Stanislav Sykora, and Carlos Cobas Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.5b04911 • Publication Date (Web): 29 Feb 2016 Downloaded from http://pubs.acs.org on March 3, 2016


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Analytical Chemistry

Improving the performance of high precision qNMR measurements by a double integration procedure in practical cases

Torsten Schoenberger,*,† S. Menges,† Michael A. Bernstein,‡ Manuel Pérez,‡ Felipe Seoane,‡ Stanislav Sýkora,§ Carlos Cobas‡

*,† Bundeskriminalamt - KT12, 65173, Wiesbaden, Germany [email protected]
‡ Mestrelab Research, S.L., Feliciano Barrera 9B − Baixo, 15706 Santiago de Compostela, Spain
§ Extra Byte, Castano Primo, Italy

ABSTRACT: Quantitative 1H NMR (qNMR) is a widely applied technique for compound concentration and purity determinations. The NMR spectrum will display signals from all species in the sample, and this is generally a strength of the method. The key spectral determination is the full and accurate determination of one or more signal areas. Accurate peak integration can be an issue when unrelated peaks resonate in an important integral region. We describe a “hybrid” approach to signal integration that provides an accurate estimation of signal area, removing the component(s) that may arise from unrelated peaks. This is achieved by using the most accurate integration method for the region, and removing unwanted contributions. The key to this performing well, and in almost all cases, is the use of areas from deconvolved peaks. We describe this process and show that it can be very successfully applied to cases where the highest precision is required, and for more common cases of NMR-based quantitation.

Quantitative NMR (qNMR) is a reliable and well-established method, principally for concentration and purity determination of single compounds and mixtures.1-9 The fundamental arguments supporting qNMR as a viable quantitation tool have been described by Malz.10 In recent years the general method was extended when several high-precision qNMR results were reported.11,12 Whilst conventional qNMR methodology can routinely deliver uncertainties of ca. 1.0-2.5 %, the high-precision approach extends this to ca. 0.1 %. In practice, the potential to apply high-precision qNMR is limited by “real-world” considerations such as the presence of impurity peaks that resonate in the main compound integration window. Impurity peaks at levels on the order of the 13C satellites of the main peak are sufficient to significantly increase uncertainties and compromise the high-precision analysis.

In the Forensic Science Institute of the Federal Criminal Police Office (“Bundeskriminalamt”, BKA), high-precision qNMR is routinely applied for the purity determination of analytical standards. These must be calibrated accurately so that they can be used for chromatographic methods with an uncertainty level of 2 % or higher. Since the purity value of these standards determined by NMR is the basis for all further analyses, the quantification has to be as precise as possible. Commercial standards of, e.g., synthetic drugs with reliable purity values are frequently not available, so they are taken from drug seizures and sometimes purified. In this way the purity can be accurately determined by qNMR, which can be seen as a very valuable quality assurance tool. Uncertainty values of less than 0.1 % for the measurement can be achieved by taking special precautions, such as adequate and consistent sample preparation procedures, sufficiently long relaxation delays during signal acquisition, good signal digitization, very high signal to noise ratios, and very accurate phase and baseline corrections.11,13 The influence of integration techniques is a major point of this article and is described in detail below.

From a data analysis and processing point of view, the ultimate goal in qNMR studies is to identify, quantify and separate all signal peaks from noise and artefacts, that is, to produce a reliable list of peaks with all their fundamental parameters such as chemical shifts and areas. The traditional way to obtain such a list consists essentially of two closely linked procedures that are usually applied separately, namely (i) peak picking and (ii) integration. Peak picking algorithms are typically based on the simple process of finding peak extrema, whereas integration is accomplished by calculation of the running sum, using a simple but generally effective approximation where rectangles are generated for each data point. The fundamental issues associated with effective peak integration have been described in the classical literature, and emphasized in the context of qNMR.9,10 Acquiring the data with sufficient data points, zero filling, apodization, and phase-, baseline- and drift correction are critical considerations. Effective drift- and baseline correction removes the need for adjustments to integral SLOPE and BIAS. For quantitation, this calculation is performed within specific frequency boundaries of the spectral regions over which numerical integration is required. Notably, the sum integration method is insensitive to lineshapes that deviate from expected theoretical norms, which may be a consequence of, e.g., poor optimization of the magnetic homogeneity (“shimming”). Whilst these methods are computationally very simple, fast, and yield reasonably good results

ACS Paragon Plus Environment


under good SNR and data point density conditions, conventional integration cannot disentangle overlapping peaks. Furthermore, even in the case of well resolved resonances, numerical integration can lead to a substantial underestimation of the area due to truncation of the long tails of the Lorentzian lines. This has been studied by Griffiths and Irving14 who showed that to achieve a maximum integration error of 0.1%, integration limits of 76 times the line width in both directions must be employed. This increases the likelihood that, under real measurement conditions, an integral region will contain a mixture of compound and impurity peaks (see Figure 1). This is therefore likely to compromise the correct quantitation of the compound peaks if the traditional running sum method is used, as the impurity peaks cannot be precisely factored out of the calculated integral value.
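The tail-truncation effect described above can be illustrated numerically. The following is a minimal sketch, assuming an ideal, isolated Lorentzian line on a flat baseline; the function names and the choice to express the integration window in units of the full width at half height (FWHM) are assumptions made here for illustration, so the numbers it prints need not coincide with those of the cited study.

```python
import math

def lorentzian(x, fwhm=1.0):
    """Unit-height Lorentzian centered at 0 with the given full width at half height."""
    hwhm = fwhm / 2.0
    return 1.0 / (1.0 + (x / hwhm) ** 2)

def running_sum_area(half_window, fwhm=1.0, points_per_fwhm=200):
    """Rectangle-rule (running sum) area over [-half_window, +half_window]."""
    dx = fwhm / points_per_fwhm
    n = int(round(half_window / dx))
    return sum(lorentzian(i * dx, fwhm) for i in range(-n, n + 1)) * dx

def captured_fraction(n_fwhm):
    """Analytic fraction of the total Lorentzian area inside +/- n_fwhm line widths."""
    return (2.0 / math.pi) * math.atan(2.0 * n_fwhm)

total_area = math.pi / 2.0  # analytic total area for fwhm = 1 (pi times the half width)
for n in (5, 20, 50):
    numeric = running_sum_area(n) / total_area
    print(f"+/-{n:>2} FWHM: running sum {numeric:.4f}, analytic {captured_fraction(n):.4f}")
```

The missing fraction shrinks only in inverse proportion to the window width, which is why wide integration limits, and therefore a higher chance of enclosing impurity peaks, are hard to avoid.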

Figure 1. Typical examples from routine purity determinations of analytical standards (synthetic drugs) at the BKA; compound signals (blue) overlaid by impurity signals (red). The real NMR signal line is green.

Alternative processing procedures have been developed to overcome this limitation and seek to determine peak positions, line widths, intensities and areas (see below). These can be divided into two main approaches depending on whether the data analysis is performed in the time- or frequency domain. Considering that the Fourier Transform is a linear transformation, both approaches should, theoretically, be equivalent. In practice, however, potential artifacts that resonate in a spectral region that is important for quantitation, and computational considerations may favor one approach over the other.15 Time domain procedures offer some convenient ways to address incomplete or corrupted data, thereby avoiding performance degradation caused both by baseline and truncation artifacts. Algorithms have been developed such as parametric Linear Prediction16, Filter Diagonalization Method (FDM)17, and Bayesian approaches18,19, exemplified by the recent practical application of CRAFT.20 This and the Matrix Pencil method21 are some of the time domain approaches that have been successfully applied in the field of high resolution NMR, whereas methods based on nonlinear optimization (i.e. Levenberg–Marquardt) such as VARPRO and AMARES22 have been widely used for the analysis of in vivo/vitro NMR, to provide a coarse overview.


Iterative frequency domain approaches, on the other hand, are also usually performed using a least-squares criterion in a way similar to the nonlinear time-domain methods. They have the advantage of an easier visual interpretation of the measured NMR signals and the fitting results, and are best suited for frequency selective analysis. Many frequency-domain methods solve the nonlinear least-squares problem by local optimization techniques, in particular using the Levenberg-Marquardt23 or Gauss-Newton24 algorithms.

All model-based approaches present significant problems when compared to the traditional integration of the area under the peaks of interest. Whilst the latter makes no assumptions concerning the lineshape of the signal and is therefore insensitive to its distortions (e.g. field inhomogeneity, chemical exchange and relaxation), time or frequency domain fitting algorithms are model based and any deviation from the assumed analytical function will therefore result in quantitation errors. A second noteworthy difficulty lies in the estimation of the number of signals. In the case of the simple running sum method there is no need to specify the number of peaks within the region of interest, whilst this is a crucial task with any fitting-based method. For example, in the case of time domain approaches, unequivocal determination of the exact number of total resonances present in the FID, also known as the model order problem, has significant consequences: too small a value of the model order results in information loss, while too large a value effectively incorporates more noise into the analysis, and generates spurious spectral features. This problem has been dealt with by a number of different criteria including information theory25, Froissart doublets26 and subjective or statistical threshold settings of singular values.27 Whilst promising results have been reported in the literature, it is often the case that these methods produce very good mathematical fits to the experimental data that may nevertheless be physically meaningless.

Similarly, in the frequency domain, line fitting or deconvolution of a set of spectral peaks (regardless of their exact shape) is always an ill-defined inverse mathematical problem. Intuitively, it is easy to see that any Lorentzian peak can be progressively better and better fitted using an increasing number of narrower distinct peaks, symmetrically distributed around the center of the fitted one. It is therefore very dangerous to allow an algorithm to carry out the fit and, among other parameters, determine the number of peaks. Increasing the number of peaks would eventually appear to perfectly fit any spectrum, including every little noise excursion. Obviously, such a nonsensical proliferation of peaks must not occur but, just as obviously, there is no purely mathematical criterion that prevents it. This problem has been solved using the Global Spectral Deconvolution (GSD) algorithm. A robust procedure identifies physically meaningful peaks by using a fast algorithm based on knowledge of the zeroth, first and second derivatives of the data (vide infra).

In this work, we present a hybrid integration approach that combines the best features of the traditional integration with those of line fitting via GSD. Firstly, all multiplets in a spectrum are integrated using the standard running sum method. Integrals are determined with the highest accuracy because this procedure is robust towards lineshape excursions from the presumed model (assuming the phase and baseline have been properly corrected). Frequency-domain deconvolution is then


used to account for the non-compound, intra-multiplet signals that should be discounted from the area. Deconvolution is usually carried out using GSD, an algorithm that overcomes many of the aforementioned line fitting limitations.

Global Spectral Deconvolution (GSD)

GSD is a fast and fully automatic algorithm for the detection and deconvolution of all spectral peaks present in a spectrum. Its main goal is to extract all resonances from the NMR spectrum in the presence of noise, baseline distortions, spikes and signals caused by factors other than molecules in the sample. A full technical coverage of this rather involved algorithm is beyond the scope of this article and only a high level description of GSD’s theory and implementation is provided here. More details can be found elsewhere.28 GSD is based on the localization of the singular points (minima, maxima and zero crossings) of the 0th, 1st, and 2nd derivative spectra. Those are calculated using an optimized Savitzky-Golay algorithm in which the polynomial order and number of points are determined automatically. Once those singular points are detected and flagged, they are analyzed to decide whether or not they correspond to true peaks. The detected peaks are then boxed to avoid any run-offs during subsequent fitting, followed by a rough estimation of each peak's parameters (frequency, linewidth, height). The process is completed by performing a number of fitting cycles aimed at refining all peak parameters. The output of the GSD algorithm is a peak table listing all the detected peaks, each with its four defining parameters (position, height, width, and kurtosis), its area, and extra columns reserved for subsequent GSD peak editing. Here the algorithm determines the peak classification, e.g. compound, solvent, reference, 13C-satellite, impurity, etc. The algorithm, despite (or rather thanks to) its complexity, is very fast. It can easily handle spectra with almost any number of peaks with different line widths (see Figure 2).
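The derivative-based detection step can be pictured with a small toy example. The sketch below is not the GSD algorithm: it assumes a fixed 5-point quadratic Savitzky-Golay second-derivative filter (GSD is described as choosing order and window automatically), a hypothetical relative threshold `rel_threshold`, and a noise-free synthetic spectrum of two Lorentzians. It only shows how minima of the smoothed second derivative flag candidate peak positions.

```python
def lorentzian(x, center, hwhm, height=1.0):
    """Lorentzian line with half width at half height `hwhm`."""
    return height / (1.0 + ((x - center) / hwhm) ** 2)

def second_derivative_sg5(y):
    """5-point quadratic Savitzky-Golay second derivative (unit point spacing)."""
    d2 = [0.0] * len(y)
    for i in range(2, len(y) - 2):
        d2[i] = (2 * y[i - 2] - y[i - 1] - 2 * y[i] - y[i + 1] + 2 * y[i + 2]) / 7.0
    return d2

def candidate_peaks(y, rel_threshold=0.1):
    """Indices where the second derivative has a sufficiently deep local minimum."""
    d2 = second_derivative_sg5(y)
    cutoff = rel_threshold * min(d2)  # negative: a fraction of the deepest minimum
    return [i for i in range(3, len(y) - 3)
            if d2[i] < cutoff and d2[i] < d2[i - 1] and d2[i] < d2[i + 1]]

# Synthetic spectrum: a strong line at point 60 and a weaker one at point 140.
spectrum = [lorentzian(i, 60, 3.0) + lorentzian(i, 140, 3.0, height=0.2)
            for i in range(200)]
peaks = candidate_peaks(spectrum)
print(peaks)
```

Because only curvature is inspected, a constant or slowly varying offset added to `spectrum` would leave the detected positions unchanged in this toy case.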

Figure 2. 1H spectrum of the medical product containing Fluoxetine in DMSO-d6. Signals with line-widths from 1.1 to 45 Hz were correctly picked and fitted in automation mode by GSD (full spectrum is shown in Supporting Information, Figure S8). The green trace is the experimental spectrum, whilst the blue peaks correspond to the deconvolved peaks found by GSD.

Even a very complex biofluid 1H NMR spectrum comprising 1000-2000 almost uniformly distributed peaks can be analyzed by GSD in a few seconds. GSD has also been parallelized, speeding up the algorithm by a factor close to the number of available cores in the computer CPU. The fact that GSD fits only the first and second derivatives of spectral peaks and essentially never refers to any point’s absolute value makes it insensitive to baseline distortions. Moreover, the use of the 2nd derivative enhances resolution. In Figure 3 GSD fully and automatically determines all compound resonances, also allowing the accurate measurement of the small 5-bond coupling between H-2 and H-5.

Figure 3. Resolution enhancement power offered by GSD. The green trace is the experimental spectrum, whilst the blue peaks correspond to the deconvolved peaks found by GSD. Only the aromatic signals are shown.

Lineshape flexibility

Since GSD is an optimization (fitting) algorithm it must make some assumptions about the shapes of the peaks. In some spectroscopies, peaks are relatively few and their shapes are rigorously Lorentzian. With NMR spectra, however, peak shapes are affected by many factors that cause the lineshapes to depart from the simple Lorentzian model.29 Some deconvolution algorithms try to make the lineshape more flexible by using a weighted mixture of Lorentzian and Gaussian shapes, or by using Voigt profiles (a convolution of a Lorentzian with a Gaussian). In both cases, the resulting shapes lie between a pure Lorentzian and a pure Gaussian, which is an insufficient range. To partially counter these problems and mitigate the unavoidable line shape imperfections, we have introduced the idea of a Generalized Lorentzian (GL) peak shape. A single extra parameter, which we call somewhat loosely the ‘kurtosis parameter’, describes the deviation of the peak shape from the Lorentzian. Briefly, the GL peak shape, though using just one ‘extra’ parameter, covers deviations from the Lorentzian about 3 times broader (in terms of kurtosis) and extending to both sides of the Lorentzian. This is illustrated in Figure 4 where the green line is pure Lorentzian, the red line is pure Gaussian (of the same half-height linewidth) and the gray lines are generalized Lorentzians. When a linear combination of the Lorentzian-Gaussian lineshapes is used, the result is essentially an interpolation between the green and red lines so that, as can be seen in Figure 4, the actual range covered by this composite lineshape is more limited compared to the proposed Generalized Lorentzian.


The GL thus provides much greater flexibility. At the same time because GL is a rational function of the frequency offset, its imaginary part and its integral are easy to evaluate using exact, explicit formulas. This is not the case for a Gaussian line shape.
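The rational form of the GL shape makes it straightforward to evaluate. The sketch below implements the functions of Eqs. [1] to [3] of this article as printed, with unit amplitude and x as the frequency offset; the Python function names are chosen here for illustration only, and the sample values of k are arbitrary.

```python
def lorentz(x):
    """L(x) of Eq. [3]: unit-height Lorentzian."""
    return 1.0 / (1.0 + x * x)

def g_shape(x):
    """G(x) of Eq. [2]: rational function mixed with the Lorentzian."""
    return (1.0 + x * x / 2.0) / (1.0 + x * x + x ** 4)

def generalized_lorentzian(x, k):
    """GL(x; k) of Eq. [1], with k the 'kurtosis parameter'."""
    return (1.0 - k) * lorentz(x) + k * g_shape(x)

for k in (-0.5, 0.0, 0.5, 1.0):
    center = generalized_lorentzian(0.0, k)
    half = generalized_lorentzian(1.0, k)
    wing = generalized_lorentzian(3.0, k)
    print(f"k = {k:+.1f}: GL(0) = {center:.3f}, GL(1) = {half:.3f}, GL(3) = {wing:.4f}")
```

With these definitions both L and G equal 1/2 at x = 1, so changing k leaves the half-height width untouched and only reshapes the line: the wings beyond x = 1 fall off faster than a Lorentzian for k > 0 and more slowly for k < 0.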

Figure 4. Lorentzian (green line) and Gaussian (red line) cover a very narrow range of shapes whilst the generalized Lorentzian used by GSD (gray lines) covers at least 3 times more.

Mathematically, the new proposed Generalized Lorentzian (GL) function is defined by [1], where the amplitude has been assumed to be 1 and x corresponds to the frequency offset:

$$\mathrm{GL}(x; k) = (1 - k)\,L(x) + k\,G(x)$$ [1]

where k is the “kurtosis parameter”, so called because it affects the peak’s kurtosis, and

$$G(x) = \frac{1 + x^2/2}{1 + x^2 + x^4}$$ [2]

$$L(x) = \frac{1}{1 + x^2} = \frac{1}{(x + i)(x - i)} = \mathrm{Re}\left(\frac{i}{x + i}\right)$$ [3]

GSD capability with varying line width, baseline artifacts, and phase errors

A complete mathematical analysis of these factors is beyond the scope of this article, but we analyze them in the context of typical parameters for 1H qNMR and representative experimental spectra. Baseline imperfections have no effect on GSD and this topic is therefore not investigated further.

Edited Sum Integration

As described above, the core integration method proposed in this article is based on the traditional running sum, but in order to overcome some of its limitations - mainly peak overlap arising from signals other than the compound of interest for each multiplet - this integration procedure is complemented by a second stage that exploits the power of GSD-derived areas. In principle, a naïve approach could consist of a simple subtraction of the impurity peaks’ areas calculated by GSD from the total area determined using the traditional integration method, that is:

$$A = Z - \sum_i P_i^{\mathrm{GSD}}$$ [4a]

where A is the new integral, Z is the running sum integral, and $\sum_i P_i^{\mathrm{GSD}}$ is the area of all impurity peaks determined by GSD.

However, this method is inherently flawed as it is an arithmetic operation that uses two measures determined by different methods and, as a result, they have different error estimates which may compromise the overall performance of the final calculation. A more robust approach, which is the one proposed in this work, employs a simple scaling factor to account for the contribution of the impurity peaks in the determination of multiplet areas. Thus, the hybrid integration method presented in this work is formulated as follows:

$$A = kZ$$ [4b]

where the scaling factor k (not to be confused with the kurtosis parameter of [1]) is defined as:

$$k = \frac{\sum(\mathrm{compound\_peaks})}{\sum(\mathrm{compound\_peaks} + \mathrm{impurity\_peaks})}$$ [5]

where ∑(compound_peaks) is the sum of all areas of those peaks that have been classified as compound and determined by GSD or line fitting, and ∑(compound_peaks + impurity_peaks) is the sum of all peak areas in the same region. Equation [4b] is mathematically more consistent than [4a], and more robust to errors. This is shown with the help of a simple example (see Figure 5).
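Equations [4a], [4b] and [5] translate into a few lines of code. The sketch below is an illustration only: the function names are chosen here, and the numbers (a compound area of 100, an impurity area of 5, and fitted areas that are both 2 % too high) are hypothetical values, not data from this study.

```python
def edited_sum_subtract(z, impurity_areas):
    """Eq. [4a]: subtract the fitted impurity areas from the running sum Z."""
    return z - sum(impurity_areas)

def edited_sum_scale(z, compound_areas, impurity_areas):
    """Eqs. [4b] and [5]: scale Z by the fitted compound share of the region."""
    compound = sum(compound_areas)
    k = compound / (compound + sum(impurity_areas))
    return k * z

# Hypothetical region: true areas, and a running sum that captures both exactly.
true_compound, true_impurity = 100.0, 5.0
z = true_compound + true_impurity

# Assume the line fit overestimates every fitted area by the same 2 %.
bias = 1.02
fitted_compound = [true_compound * bias]
fitted_impurity = [true_impurity * bias]

a_4a = edited_sum_subtract(z, fitted_impurity)
a_4b = edited_sum_scale(z, fitted_compound, fitted_impurity)
print(f"Eq. 4a: {a_4a:.3f}   Eq. 4b: {a_4b:.3f}   true: {true_compound:.3f}")
```

In this equal-bias situation the common factor cancels in the ratio of Eq. [5], so the scaled result returns the true area, whereas the subtraction of Eq. [4a] carries the fitting bias of the impurity area directly into the result.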

Figure 5. Plots of % error for Eq 4a (orange) and 4b (blue) vs the area of the impurity as a % of the compound peak. Considering fitting errors for the compound and impurity peaks: (a) the errors are about equal, (b) the impurity peak has a larger error (a more likely case), and (c) the compound peak has the larger error (a less likely case).

Consider an integral region having two signals, one corresponding to a compound peak of interest whilst the other is an impurity peak. If the integral of the compound peak is fixed and the amount of impurity peak is varied, several scenarios may arise depending on the relative errors resulting from GSD analysis (or more generally, from any deconvolution algorithm):

1. If the fitting error is the same or very similar for both the compound and impurity peak, both errors will virtually cancel using method 4b whilst the error will increase linearly with method 4a (see Figure 5a).
2. If the fitting error is larger in the impurity peak than in the compound peak (which is the most likely situation due to the lower SNR for the weaker impurity peak), then both errors also increase linearly, but method 4b will yield smaller errors (see Figure 5b).
3. Finally, when the fitting error is larger in the compound peak than in the impurity peak, the integration error will be larger with eq. 4b than with 4a, but they will be closer than in case 2 (see Figure 5c).

The simple case validates the choice of the correction factor as the preferred mathematical operation for Edited Sum integration.

Calculation of the combined uncertainty

Following the Gaussian error propagation law, the relative uncertainty of the corrected signal (u_Icorr,rel) is calculated as follows:

$$u_{I_{\mathrm{corr}},\mathrm{rel}} = \sqrt{(u_{I,\mathrm{rel}})^2 + (u_{f,\mathrm{rel}})^2 \cdot \left(\frac{1 - A}{A}\right)^2}$$ [6]

where:
u_I: uncertainty contribution of the sum integral value, e.g. 0.05 % for SN > 20,000
u_f: uncertainty contribution of the line-shape analysis to determine the relative component / impurity ratio of the integrated signal
A: component's share of the integrated signal (e.g. A = 0.99 corresponds to a 99 % component share)

Results and discussion

Validation of the line-shape analysis uncertainty contribution

The uncertainty contribution of the integration to NMR-based quantitation is known to be strongly dependent on the signal to noise ratio (SNR).23 An uncertainty level of less than 0.1 % can be achieved by taking several precautions, such as very accurate sample preparation using an ultra-micro balance, recycle delays longer than 7 × T1, and signal averaging to achieve a very high SNR (>10,000).11 The uncertainty contribution of Mnova’s GSD lineshape analysis was determined in this study. For this purpose, the 1D 1H NMR spectra of a total of 10 compounds were evaluated, all recorded under high-precision quantitative conditions (at least 10 mg sample weight, 500 MHz spectrometer frequency, ‘zg’ pulse sequence, 16 scans, 7.7 s acquisition time, at least 63 s relaxation delay). All substances were previously quantified and shown to have a purity > 98.5%. No impurity signals at significant levels could be recognized in the regions that were integrated for quantitation; any that were nevertheless included would have a negative effect on the overall precision and might account for part of the deviation determined here. Very different signals were evaluated, from singlets to complex multiplets having 30 peaks. The integrals were normalized and divided by the number of corresponding protons. Across all 57 analyzed signals of all 10 spectra an RSD of 1.88% was achieved (rounded up to 2 for the calculation above) using a spectral resolution of 0.065 Hz/pt (SW = 8503 Hz, acquisition size = 64K complex points, zero filled to a final FT size of 128K). Individual values for the 10 samples are given in Table 1. The spectra are given as Supporting Information in Figures S3-S7.

It is important to highlight the importance of adequate digital spectral resolution. For example, if the digital resolution is made twice as coarse (by using an FT size of 64K), the RSD increases significantly from 1.88% to 2.42%. Any further enhancement of the spectral resolution does not lead to more precise results. Ideally, for correct peak description, one should aim at having at least five digital points per peak above half height, yet higher values are desirable for an optimum performance of the line fitting algorithm.

Table 1: Overview of signals evaluated for the validation of the lineshape analysis uncertainty contribution.

Substance       Purity in %   Solvent       Number of signal ranges evaluated   RSD in %
5F-AKB48        98.73         acetone-d6    7                                   2.1
4-MEC           99.38         D2O           5                                   1.6
STS-135         99.00         acetone-d6    5                                   1.4
Acetylcodeine   99.15         acetone-d6    8                                   1.7
Heroin          99.31         acetone-d6    6                                   1.0
JWH-18          99.56         acetone-d6    6                                   0.9
2C-D            98.55         D2O           4                                   2.4
DPT             99.73         D2O           5                                   2.3
NM-2-AI         99.60         D2O           4                                   0.8
MDE             99.33         D2O           7                                   1.6
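With the line-shape contribution estimated from the RSD above, the combined uncertainty of Eq. [6] can be evaluated for the example values quoted in the text (u_I = 0.05 % for SN > 20,000, u_f = 2 %, A = 0.99). The sketch below assumes the usual root-sum-of-squares form of Gaussian error propagation; the function name is chosen here for illustration.

```python
import math

def combined_rel_uncertainty(u_i_rel, u_f_rel, a):
    """Eq. [6], assuming root-sum-of-squares propagation.

    u_i_rel: relative uncertainty of the sum integral (in %)
    u_f_rel: relative uncertainty of the line-shape analysis (in %)
    a:       compound share of the integrated signal (0 < a <= 1)
    """
    return math.sqrt(u_i_rel ** 2 + (u_f_rel * (1.0 - a) / a) ** 2)

# Example values taken from the text; all uncertainties are in percent.
u_example = combined_rel_uncertainty(0.05, 2.0, 0.99)
print(f"A = 0.99: combined relative uncertainty = {u_example:.4f} %")

# The line-shape term grows as the impurity share of the region increases.
for share in (0.999, 0.99, 0.95):
    print(share, round(combined_rel_uncertainty(0.05, 2.0, share), 4))
```

For a 1 % impurity share the line-shape term adds only about 0.02 % in quadrature, so the combined value stays close to the 0.05 % of the sum integral itself; the correction becomes the dominant term only for considerably larger impurity shares.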

The dependency of the integral determined by line-shape analysis on the SNR was also established. To this end, a new spectrum was produced by adding two baseline-noise-free spectra (where the baseline noise was set to zero programmatically), with different intensities and shifted with respect to each other, to a pure noise spectrum (Fig 6). The surrogate impurity peak(s), their closeness to the main compound peak(s), their relative levels, and the final SNR in the test spectrum were thereby varied and controlled.

Figure 6. Superposition of noise and 2 mutually shifted, noise-free spectra, here in the ratio 1:0.0625. The sum of the individual spectra was used as a new artificial spectrum to determine the influence of the SNR on the accuracy. Lower trace (blue) is the noise contribution; middle trace (maroon) the compound spectrum at higher intensity; top trace (green) the shifted compound spectrum at lower intensity.

The determined areas of the corresponding fitted signals were compared with the known intensity ratios of the superimposition. Forty signal groups with SNR from 20 (several times) up to 1600 for the smaller signals were evaluated in this way.
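The construction of such a test spectrum can be sketched as follows. This is a simplified stand-in, not the procedure used in the study: each "spectrum" is a single Lorentzian line rather than a measured noise-free compound spectrum, the noise is Gaussian white noise from the standard library, and all parameter names and default values (point count, shift, line width, noise level, seed) are hypothetical.

```python
import random

def lorentzian_line(n_points, center, hwhm, height=1.0):
    """Noise-free Lorentzian line sampled on an integer point grid."""
    return [height / (1.0 + ((i - center) / hwhm) ** 2) for i in range(n_points)]

def build_test_spectrum(n_points=2048, center=1000, shift=40, ratio=0.0625,
                        hwhm=4.0, noise_sd=0.001, seed=1):
    """Sum of a main line, a shifted and scaled copy, and pure noise."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    compound = lorentzian_line(n_points, center, hwhm)
    impurity = lorentzian_line(n_points, center + shift, hwhm, height=ratio)
    noise = [rng.gauss(0.0, noise_sd) for _ in range(n_points)]
    total = [c + m + e for c, m, e in zip(compound, impurity, noise)]
    return compound, impurity, noise, total

compound, impurity, noise, total = build_test_spectrum()
known_ratio = sum(impurity) / sum(compound)
print(f"known area ratio of the two noise-free components: {known_ratio:.4f}")
print(f"height of the smaller line relative to the noise SD: {max(impurity) / 0.001:.1f}")
```

Because the two components are generated separately, their area ratio is known exactly before the noise is added, which is what allows the fitted areas to be compared against a "real" value.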

Figure 7. Deviation of the area determined by line shape analysis to the theoretical, “real” value as a function of the SNR.

As shown in Figure 7 there is no dependency of the deviation on the SNR, even when very low values were evaluated. The mean value for the deviations is 1.2 % for an SNR ranging from 20 to 1600.

The influence of phase errors was investigated by taking 3 spectra of the validation set listed in Table 1. The spectra were evaluated by the usual sum and peak integration modes (the latter being line shape analysis). The phase was stepwise misaligned up to +4 degrees for PH0 and PH1 equally. This range was considered to be a realistic error in automation mode. All integral values (4-5 for each spectrum, see Table 1) were normalized to the number of corresponding nuclei. The relative standard deviation (RSD) was calculated. For qNMR samples the phase errors must be minimized if reliable integrals are to be obtained. We therefore started from this assumption and restricted our tests accordingly. In this phase error range we observed no influence of the phase error on the accuracy of the line shape analysis (solid lines in Figure 8). In contrast, the accuracy of the sum integration depends strongly on the phase error. Considering a single peak with arbitrary phase θ, the total area under the peak varies according to A = N cos(θ). On the other hand, whilst this simple analysis points towards a greater stability of GSD against phase errors, it is important to highlight the fact that the analysis did not take into account the many factors that might influence the results, including receiver noise, shimming distortions, peak overlaps, baseline distortions, and field noise. This would merit a more thorough study, which is beyond the scope of this work.
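The relation A = N cos(θ) can be checked numerically for an idealized case. The sketch below assumes a single isolated Lorentzian, a pure zero-order phase error, and an integration window symmetric about the peak; the first-order term (PH1), overlaps and noise are not modelled, and the sign convention chosen for the dispersion component is arbitrary. Function and parameter names are illustrative.

```python
import math

def phased_area(theta_deg, hwhm=2.0, half_window=2000):
    """Running-sum area of a Lorentzian with a zero-order phase error theta."""
    theta = math.radians(theta_deg)
    area = 0.0
    for i in range(-half_window, half_window + 1):
        u = i / hwhm
        absorption = 1.0 / (1.0 + u * u)   # even function of the offset
        dispersion = u / (1.0 + u * u)     # odd function of the offset
        area += math.cos(theta) * absorption + math.sin(theta) * dispersion
    return area

reference = phased_area(0.0)
for theta in (1, 2, 4):
    ratio = phased_area(theta) / reference
    print(f"{theta} deg: area ratio {ratio:.6f}, cos(theta) {math.cos(math.radians(theta)):.6f}")
```

The dispersion part integrates to zero over a symmetric window, leaving only the cos(θ) factor. At 4 degrees this factor is about 0.9976, a loss of roughly 0.24 % that is already larger than the 0.1 % uncertainty targeted for high-precision work.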

An extreme, limiting case occurs when ∆ν is so small that the impurity signal significantly overlaps with the main signal peak. The GSD algorithm can only model the impurity peaks if the main signal peak is also very well deconvolved. Limitations of the algorithm and practical considerations such as signal line-shape now play a much larger factor in the accuracy of the determination, and the GSD modeling of the small, impurity peak is compromised. We see in Fig. 9 that there are small but appreciable deviations between the ideal and experimental lineshapes. Under these circumstances an exact line fitting was obtained for the impurity levels of 3.125% and 1.560% (Figure 9, top). For the two examples in the bottom of Figure 9, the deviation is significantly greater than the calculated uncertainty contribution.

Figure 8. Comparison of the influence of phase errors on the accuracy of usual sum [S] integration (dotted lines) and peak [P] integration mode (line shape analysis) for 3 spectra of the validation set.

Check of the uncertainty of the Edited Sum function

The theoretical approach to the uncertainty calculation was verified in practice. Impurity signals with defined intensities were added at different positions in the frequency range bounded by the 13C satellites of the main signal. The main 1H NMR signal was marked as "Compound" peak type and the artificially added impurity signal as "Impurity". In this way the impurity signal’s contribution to the overall area estimate was theoretically set to zero, in the manner previously described as “Edited Sum integration” in this article. In the first step, the signal area to be integrated with the "Sum" integration method was normalized to 100% for the main signal plus the amount of the impurity signal (e.g. for a 1% impurity signal the integral was set to 101). In the second step, the signal area was recalculated using the "Edited Sum" function. Ideally, the impurity contribution to the overall integral should be discounted so that the integral is correctly determined to be 100%. In Table S12 we show that the practically determined deviation of the "Edited Sum" function is (at least for the two cases where ∆ν = 56 and 22 Hz) below the calculated uncertainty: no critical deviations are seen. The overlaid spectra of these two cases are shown in Figure S1.
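The two-step check described above can be written down as simple arithmetic. The following Python sketch is not the MestReNova implementation; it only assumes that Edited Sum equals the sum integral of the region minus the fitted areas of all peaks marked as "Impurity". The function and variable names, and the 2% fitting error used in the example, are illustrative assumptions.

```python
from typing import Sequence


def edited_sum(sum_integral: float, impurity_areas: Sequence[float]) -> float:
    """Sum integral of the region minus the fitted impurity peak areas."""
    return sum_integral - sum(impurity_areas)


# Step 1: main signal (100) plus a 1 % impurity gives a sum integral of 101.
sum_integral = 101.0

# Step 2: suppose the line fit recovers the impurity area with a +2 % relative
# error (hypothetical value), i.e. 1.02 instead of 1.00.
fitted_impurity = 1.0 * 1.02

corrected = edited_sum(sum_integral, [fitted_impurity])
deviation_percent = corrected - 100.0  # the ideal result is exactly 100
print(f"Edited Sum = {corrected:.2f}, deviation = {deviation_percent:+.2f} %")
```

Because only the small impurity area carries the fitting error, a 2% error on a 1% impurity shifts the corrected integral by only about 0.02 in this example.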

Figure 9. Artificially added impurity signals with different intensities; ∆ν (compound to impurity signal) = 4 Hz. Green: spectrum; red: fitted impurity signal; blue: fitted component signal.

As a rule of thumb, the amount of the impurity signal must be at least twice as large as the upper intersection with the fitted main signal if it is to be modeled correctly by GSD and Edited Sum is to be applied effectively.

Benefit of Edited Sum for standard precision quantification

In Figure 10 we show that Edited Sum integration can also be usefully applied to normal (non-high-precision) quantification. In this example, the methyl group signal of methamphetamine overlaps with the methyl signal of another, similar synthetic drug. The integral values were normalized to the number of protons of methamphetamine using another signal group without any impurity overlap. Whereas evaluation with the usual “Sum” integration would lead to a relative error of 22% (the relative amount of the impurity), the Edited Sum evaluation gave a nearly exact result (2.977 vs 3.000).
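The numbers quoted for this example can be turned into relative errors directly. In the Python sketch below, the "Sum" value is not taken from the article; it is inferred from the stated 22% relative error and is therefore an assumption, as are the function and variable names.

```python
def relative_error_percent(measured: float, expected: float) -> float:
    """Signed relative error of an integral, in percent."""
    return 100.0 * (measured - expected) / expected


expected = 3.000             # three methyl protons
sum_value = expected * 1.22  # assumed: the 22 % impurity share adds to the integral
edited_sum_value = 2.977     # value reported for Edited Sum

print(f"Sum:        {relative_error_percent(sum_value, expected):+.2f} %")
print(f"Edited Sum: {relative_error_percent(edited_sum_value, expected):+.2f} %")
```

On these figures the residual error of the Edited Sum result is below 1%, compared with 22% for plain sum integration.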

Table 2. Dependence of the corrected combined relative uncertainty uIcorr,rel on factor A (components share), with uI = 0.05% and uf = 2.0%

A       uIcorr,rel (%)
1.00    0.050
0.99    0.054
0.95    0.117
0.90    0.228
0.80    0.502
0.70    0.859
0.60    1.334
0.50    2.001

It can be seen that, with Edited Sum applied in this case, the high-precision evaluation remains valid up to an impurity share of ca. 10% (A = 0.90).

Figure 10. Comparison of evaluations using “Sum” (left) and “Edited Sum” (right). With “Edited Sum” the integral value is corrected by considering only the compound signals, not the impurity signals. Thus, nearly the exact theoretical integral value (3.000) is achieved.

“Edited Sum” can also be used in combination with “Sum” mode for different signals (just by defining one or more compound peaks and no impurity-type peaks). The method then provides the flexibility to make the best of sum integration and line fitting techniques and to get the most accurate results under all circumstances.

Uncertainty by using Edited Sum

The relative uncertainty uIcorr,rel achieved by using the Edited Sum function can be calculated via equation [6]. The uncertainty contributions from the two combined evaluation modes can be set as follows:

uI (sum integration for SNR > 20,000) = 0.05%
uf (line-shape analysis) = 2.0% (validated above)

Therefore, the combined uncertainty depends mainly on the factor A (components share). Some examples of the calculated combined uncertainty are given in Table 2.
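Equation [6] is not reproduced in this section. As a cross-check, the Python sketch below uses an assumed form, uIcorr,rel = sqrt(uI² + (uf (1 − A)/A)²), chosen because it reproduces the Table 2 values to three decimal places; it should be compared against equation [6] before being relied upon. Function and variable names are illustrative.

```python
import math


def combined_rel_uncertainty(a: float, u_sum: float = 0.05, u_fit: float = 2.0) -> float:
    """Assumed form of the combined relative uncertainty, in percent.

    a      -- share of the compound signal in the integrated region (0 < a <= 1)
    u_sum  -- relative uncertainty of sum integration, in percent
    u_fit  -- relative uncertainty of the line-shape analysis, in percent
    """
    # Assumed interpretation: impurity area relative to the compound area.
    impurity_ratio = (1.0 - a) / a
    return math.sqrt(u_sum ** 2 + (u_fit * impurity_ratio) ** 2)


# Values of A and uIcorr,rel as listed in Table 2.
table_2 = {1.00: 0.050, 0.99: 0.054, 0.95: 0.117, 0.90: 0.228,
           0.80: 0.502, 0.70: 0.859, 0.60: 1.334, 0.50: 2.001}

for a, tabulated in table_2.items():
    computed = combined_rel_uncertainty(a)
    print(f"A = {a:.2f}: computed {computed:.3f}, Table 2 {tabulated:.3f}")
```

With these inputs the sum-integration term dominates only very close to A = 1; as the impurity share grows, the line-fit term takes over and the combined value approaches uf at A = 0.50.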

Conclusions

High-precision compound quantitation using NMR spectroscopy is a desirable and useful experiment. Whilst the requisite precautions for signal acquisition can be met for all samples, extracting sufficiently reliable signal areas can be complicated or compromised by the presence of weak but significant impurity signals resonating in the exact regions of interest. A practical solution lies in a hybrid approach that combines conventional integration of the entire peak region with subtraction of the impurity peak integrals derived using the spectrum deconvolution procedure. Practically, this is applied automatically in the MestReNova software using the “Edited Sum” method. This exploits the GSD peak models described here and enables the required workflows for high-precision quantitation by minimizing the errors from human operators. qNMR under the highest precision requirements therefore becomes possible again, and the high-precision qNMR method can be applied to an almost unlimited variety of practical cases. Furthermore, the same principle can be effectively applied to cases where there is significant contamination, to generate sufficiently accurate compound integrals for conventional quantitation. In general, the Edited Sum function leads to improved uncertainty values as long as the partial uncertainty contribution of the sum integration is significantly below the uncertainty level of the line-shape analysis (GSD, here: 2.0%) and as long as impurities are still impurities and not “main components”.

Author Contributions

The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

Supporting Information Available

• Recipe for performing Edited Sum integration
• Additional information for the check of the uncertainty of the Edited Sum function
• Spectra used for the validation of the line shape analysis
• Complete range of the spectrum partly shown in Figure

This material is available free of charge via the Internet at http://pubs.acs.org.

