Article Cite This: Anal. Chem. 2017, 89, 1202-1211

pubs.acs.org/ac

Phase-Constrained Spectrum Deconvolution for Fourier Transform Mass Spectrometry

Dmitry Grinfeld,* Konstantin Aizikov, Arne Kreutzmann, Eugen Damoc, and Alexander Makarov

Thermo Fisher Scientific (Bremen), Hanna-Kunath Strasse 11, 28199 Bremen, Germany

Supporting Information

ABSTRACT: This Article introduces a new computationally efficient, noise-tolerant signal processing method, referred to as the phased spectrum deconvolution method (ΦSDM), designed for Fourier transform mass spectrometry (FT MS). ΦSDM produces interference-free mass spectra with resolution beyond the Fourier transform (FT) uncertainty limit. On the presumption that the oscillation phases are preserved, the method deconvolves an observed FT spectrum into a distribution of harmonic components bound to a fixed frequency grid, which is several times finer than that of FT. The approach is stable under noisy conditions, and the noise levels in the resulting spectra are lower than those of the original FT spectra. Although it requires more computational power than standard FT algorithms, ΦSDM runs in quasilinear time. The method was tested on both synthetic and experimental data and consistently demonstrated performance superior to FT-based methodologies, be it across the entire mass range or on a selected mass window of interest. ΦSDM promises substantial improvements in the spectral quality and the speed of FT MS instruments. It might also be beneficial for other spectroscopy approaches that require harmonic analysis for data processing.

In Fourier transform mass spectrometry (FT MS), ions are trapped in a combination of magnetic and electrostatic fields (Fourier transform ion cyclotron resonance, FT ICR)1,2 or solely in an electrostatic field (Orbitrap analyzer),3,4 where they undergo multiple oscillations. The induced-current signal is represented by a sum of quasiperiodic components, each originating from ions with a specific mass-to-charge ratio m/z that determines the oscillation frequency. The mass spectrum is conventionally associated with the Fourier transform (FT) of the transient sampled at 2N equidistant points during the acquisition time T. FT processing has an intrinsic frequency uncertainty δf = 1/T, which translates directly into a limit on mass resolution. Some interpolation techniques, such as zero-filling1,5,6 or the shifted basis (Savitski et al.7), were proposed to enhance mass accuracy through better peak centroiding. A spectrum parameter estimator based on resampling of the FT grid was considered by Aboutanios.8 Signal processing modifications for the case of a slowly drifting oscillation frequency were considered by Guan et al.9

One should note, however, that it is the spectral information content that imposes the fundamental limit on resolving power, and this content cannot be increased by interpolation or any other signal processing method. The spectral information content may be defined as Shannon's entropy with the opposite sign: Σ(c_n/σ) log2(c_n/σ), where c_n are the mathematical expectations of independent Fourier components and σ is the spectral noise. With the total signal-to-noise ratio (SNR), defined as Σ(c_n/σ), being fixed, the information content attains a higher value when the c_n are localized in peaks and clusters rather than uniformly distributed over the frequency. In view of this fact, the super-FT resolution of a sparse spectrum is attainable in principle.

Any additional information on the signal's specific properties is beneficial. For example, the ion oscillation phase is known in FT MS for any specific m/z through the mechanism of ion injection, as in the Orbitrap analyzer, or excitation, as in FT ICR. The phase data is routinely employed to obtain absorption spectra10−12 that provide 2-fold narrower peaks and a reduction of noise. Both magnitude and absorption FT spectra suffer, however, from the “interference” distortions produced by not-in-phase summation of the peak shoulders.13 In particular, the negative side lobes in the absorption spectra may suppress neighboring peaks. Baseline correction may also be needed.14

The above-mentioned shortcomings of FT spectra when utilized directly for mass analysis have encouraged researchers to look for alternative methods of transient analysis. Some of them are based on the maximum likelihood fit of a measured transient or its FT spectrum to a model signal containing a number of harmonic components whose parameters are to be found.15−17 This approach, although promising at first glance, faces the problem that a priori knowledge of the number of sought harmonic components is required, as well as close estimates of their frequencies.17 As this information is not generally available, the robustness of such methods is adversely affected. The mathematical reason for that is the fact that the

© 2016 American Chemical Society

Received: September 14, 2016 Accepted: December 16, 2016 Published: December 16, 2016

DOI: 10.1021/acs.analchem.6b03636 Anal. Chem. 2017, 89, 1202−1211

Article

Analytical Chemistry

discrepancy measure is not generally a convex function and has a multitude of local minima resulting from noise and Gibbs ripples.18 Methods based on Prony's theory evaluate not the peak frequencies directly but a set of autocorrelation coefficients, which substantially mitigates the nonconvexity problem. For instance, the simplest variants, such as linear prediction (LP),19−21 lead to a straightforward quadratic minimization problem. The frequencies are then found as roots of an algebraic equation. Unfortunately, the LP algorithms are unstable at high noise levels. A noise-resistant modification of the Prony method was also developed22 and adapted to FT MS data processing.23 Another LP-related approach is the filter diagonalization method (FDM), which was first introduced in NMR24,25 and later adapted to FT MS.26−29 Though all these methods are, in principle, capable of super-FT resolution and provide accurate frequency and abundance estimations, they suffer from low noise tolerance (which might partially be remedied by signal denoising30) and often prohibitive computational costs.

This manuscript introduces and evaluates the phased spectrum deconvolution method (ΦSDM), a super-FT resolving computational approach aimed at reconstruction of ion abundances in the frequency domain based on known oscillation phases and collisional decay constants. No presumptions about the spectral composition are required, which makes the method sufficiently versatile. The next two sections explain the ΦSDM principles and introduce the major notations. The following sections examine its efficiency and limitations under various conditions. A detailed mathematical description of the method and additional experiments are given in the Supporting Information.

FREQUENCY DISTRIBUTION VERSUS FT SPECTRUM

When oscillating in an analyzing trap, the ion species, each having a unique mass-to-charge ratio m_p/z_p and represented by Q^(p) charges, induce a signal on pick-up electrodes. The FT MS transient is a sum of harmonic components with their amplitudes proportional to Q^(p) and frequencies f^(p) = f(m_p/z_p) = const × (m_p/z_p)^s, where s = −1 for FT ICR or s = −1/2 for Orbitrap MS. In a nonideal analyzer, the oscillation frequency may also depend on orbital parameters such as amplitude and radius, turning the distribution of abundances into a continuous function A(f) in the frequency domain. This function is further referred to as the frequency distribution.

An important property of the frequency distribution is integrability. In other words, the summed-up intensity of harmonic oscillators between any two frequencies f1 and f2 is determined by the integral of A(f) in these limits. This property gives a simple recipe for spectrum interpretation: any local maximum of A(f) may be associated with an ion species, and its abundance is determined as the integral “under the peak”. This is similar to the way abundances are traditionally evaluated in beam-type instruments such as magnetic sector or time-of-flight mass spectrometers, but is quite different from the traditional interpretation of FT MS spectra, in which the peak height is the abundance measure.

For a given frequency distribution the transient is defined, in the complex form, by the integral

$$S(t) = \int A(f)\,\exp\{2\pi i f t + i\varphi(f) - \gamma(f)\,t\}\,df \qquad (1)$$

where the decay rate γ ≥ 0 and the oscillation phase φ are assumed to be known smooth functions of frequency (or, equivalently, m/z). The decay coefficient γ is normally associated with collisions.31−34 The exponential law of the collisional decay holds under very general assumptions, for example, that the probability of an ion's collision with a residual gas molecule is constant in time, and the first collision either fragments the ion or brings it irreversibly out of the coherent packet.31 The decay rate apparently depends on the residual gas concentration as well as the nature of the ions and their conformation. However, for the sake of simplicity, it might be convenient to assume the decay rate γ constant throughout the entire mass range. Though this is a rough approximation, it may be used at least for sufficiently short transients when the full decline in the course of acquisition, 1 − exp(−γT) ≈ γT, is small for all ions. For very short transients and good vacuum, when γT ≪ 1, the collisional decay may eventually be neglected, making the particular case γ = 0 of special importance.

Other physical sources of the transient decline are described in the literature, such as ion−ion scattering, field imperfections, and space charge. The ion−ion scattering is expected to produce a signal loss that is proportional to the square of the ion count; this effect is relatively small, however. Field imperfections,35,36 including the Coulombic field of the ion pool, are the major contributors to ion dephasing. Dephasing consists in the spreading out of ion packets of the same m/z because of slightly different oscillation frequencies,31 while the number of trapped ions is constant. Dephasing effectively broadens the frequency distribution A(f) and thus limits the achievable resolving power.

Space-charge repulsion between close masses also manifests itself in altering A(f), namely, in effective frequency synchronization dubbed the self-bunching and coalescence effects.37−41 Self-bunching counteracts dephasing and makes the frequency distribution peaks narrower, whereas the coalescence effect coerces ions of different species to oscillate with equal frequencies. Deterioration of the frequency distribution A(f) for physical reasons cannot be reversed by any signal processing means, which places these effects beyond the scope of this paper. It must be mentioned, however, that the transient model (1) still holds even if the function A(f) reflects the mass distribution with only a limited precision.

It is worth noting that the collisional decay rate γ is practically independent of the isotopic state regardless of its abundance in an isotopic cluster. However, a combination of the field imperfections and the space-charge effects may make the apparent peak decays different. Thus, the low abundant peaks suffer from both collisional and dephasing decay mechanisms, while dephasing of highly abundant peaks is efficiently counteracted by self-bunching.

An FT spectrum of the model transient (1) is a convolution

$$C(f) = \int A(f')\,\Psi(f - f', \gamma)\,e^{i\varphi(f')}\,df' \qquad (2)$$

of the frequency distribution A(f) and the kernel (a “peak shape” function)

$$\Psi(\Delta f, \gamma) = \frac{1 - e^{-(2\pi i \Delta f + \gamma)T}}{(2\pi i \Delta f + \gamma)T}, \quad \Delta f = f - f', \quad \Psi(0, 0) = 1 \qquad (3)$$

with a known phase factor exp(iφ). It is imperative to differentiate between the two functions of frequency: the continuous (infinitely zero-filled) FT spectrum C(f) and the frequency distribution A(f). The first is directly related to the


observed transient, that is, the instrument's read-out, while the latter describes the ion population, that is, the signal's source. The full width at half-maximum (fwhm) of the quasi-sinc kernel (3) is

$$W \approx \sqrt{1/T^{2} + \gamma^{2}}$$

so the frequency resolution of the FT spectrum is restricted by both the Fourier uncertainty δf = 1/T and the decay rate γ. It is worth mentioning that the effect of convolution is not limited to the loss of resolution but also compromises the FT spectrum's integrability, because the kernel decreases only slowly, asymptotically as O(1/Δf), and the integral of its magnitude |Ψ| does not converge. Therefore, the FT spectrum integral is not a measure of ion abundances.

Since the Fourier transform is conventionally used to obtain the mass spectrum, it is important to understand the conditions under which C(f) gives reliable information about the actual frequency distribution and when it fails to do so. Consider an “ideal” case in which the frequency distribution A(f) contains a peak centered at f* and separated from the neighboring peaks by empty intervals that are substantially wider than δf, that is, the acquisition time is long enough to resolve this peak unambiguously. On the other hand, the peak itself is narrow and occupies the frequency range f* ± Δf where Δf ≪ δf. So, if the peak has a fine structure, this structure is completely unresolved at the given acquisition time. Only if both conditions are met simultaneously is the FT spectrum's apex height

$$C^{*} = \max|C(f)| \approx D(\gamma)\,A^{*}, \quad A^{*} = \int_{f^{*}-\Delta f}^{f^{*}+\Delta f} A(f)\,df \qquad (4)$$

where A* is the actual peak abundance and

$$D(\gamma) \equiv \max_{f}|\Psi(f, \gamma)| = \Psi(0, \gamma) = \frac{1 - e^{-\gamma T}}{\gamma T} \qquad (5)$$

is the decay-related factor. Formulas 4 and 5 provide the basis to estimate the abundance as A* ≈ C*/D(γ). For peaks with similar decays, the FT peak heights reflect the abundances only up to a multiplicative constant. If the collisional decay rates are substantially different, the individual decay factors may be roughly estimated from the observed peak widths as 1/D ≅ T × W, which is accurate to better than 15% in the entire range of γ ≥ 0. This allows assessing42,43 the abundances as proportional to the product C* × W, which should not be confused, however, with an integral of |C| under the peak.

Such a simple interpretation does not hold, however, if a spectrum contains peaks separated by ∼δf or less, because the kernel Ψ varies substantially on this scale of Δf and its phase changes considerably. Then the FT spectrum either fails to resolve the peaks reliably or reports corrupted estimates of abundances and centroids.

FOURIER SPECTRUM DECONVOLUTION

The resolution and the accuracy may be considerably enhanced if the frequency distribution A(f) is restored from the FT spectrum by restating eqs 2 and 3 as a deconvolution problem to be solved for A(f). Note that completeness of the Fourier basis makes the collocation with the complex-valued discrete FT spectrum

$$c_n = C(f_n^{(FT)}), \quad f_n^{(FT)} = \frac{n}{T}, \quad n = 0 \ldots N - 1 \qquad (6)$$

sampled at N equidistant frequency points sufficient within the Nyquist band f < N/T. The frequency distribution should be discretized with enhanced resolution, for example, being sampled on a P-times refined frequency grid f_k = k/(PT) as a set of complex x_k

$$x_k = A(f_k)\,e^{i\varphi(f_k)}, \quad k = 0 \ldots PN - 1, \quad P > 1 \qquad (7)$$

each of them being a phase-shifted frequency distribution value. The discretization errors and noise are accounted for by converting the exact collocation problem to a problem of minimal discrepancy

$$\|\Psi x - c\|_2^2 = \sum_{n} \Big| c_n - \sum_{k} \Psi_{nk}\,x_k \Big|^{2} \to \min \qquad (8)$$

with the matrix elements

$$\Psi_{nk} = \Psi(f_n^{(FT)} - f_k, \gamma) = \frac{\exp(2\pi i k/P - \gamma T) - 1}{2\pi i (k/P - n) - \gamma T} \qquad (9)$$

In the special case of k = Pn and γ = 0 we define Ψ_{n,Pn} = 1 by continuity. It is very important that, in contrast to zero-filling and other interpolation techniques, where the refined bins are just linear combinations of known c_n, the refined grid introduced here is a domain for x_k sought as independent unknowns.

Non-negativity, by definition, of the frequency distribution A(f) fixes the phases of feasible x_k, namely as arg x_k = φ(f_k). However, as the phase calibration precision is limited by electronic jitter and noise, a relaxed phase constraint may be imposed as

$$|\arg x_k - \varphi(f_k)|_{\mathrm{mod}\ 2\pi} \le \Delta\varphi/2 \qquad (10)$$

where Δφ ≪ 1 defines a feasibility cone on the complex plane. The phase constraint (eq 10) does not break convexity of the optimization problem (eqs 8 and 9), which guarantees solution uniqueness and independence of the initial approximation, and allows efficient numerical computation.44

The deconvolution problem of interest belongs to the class of ill-posed problems and should be approached with caution. The phase constraint (eq 10) is the principal regularization factor to solve it. The discrete norm (eq 8) may also be augmented with further convex terms to aid its minimization; see SI for more details.

Since the phase restriction makes the problem generally nonlinear, an iterative solution is required. Hence another computational parameter appears, the number of iterations I. Nevertheless, the ΦSDM transformation remains linear with respect to any non-negative multiplier α ≥ 0

$$\Phi\mathrm{SDM}:\ \alpha \times c_n \to \alpha \times x_k \qquad (11)$$

which is important to maintain the dynamic range.

The iterative algorithm (see SI for details) results in a discrete set of non-negative amplitudes a_k = |x_k| ∈ ℝ₊ upon convergence. Each local maximum of {a_k} may be interpreted as a peak corresponding to an ion species. Using integrability of the frequency distribution, the peak intensity A* and the centroid f* may be evaluated as

$$A^{*} = \sum_{k=k_1}^{k_2} q[k]\,a_k, \quad f^{*} = \frac{1}{A^{*}} \sum_{k=k_1}^{k_2} q[k]\,a_k f_k \qquad (12)$$

where the summation spans between two neighboring local minima k1 and k2. The choice of weights q[k] = 1 if k1 < k < k2 and q[k1] = q[k2] = 1/2 evenly distributes these local minima between two adjacent peaks.
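The peak integration of eq 12 amounts to a weighted sum between neighboring local minima of the deconvolved histogram. The following is a minimal sketch, assuming the amplitudes a_k and grid frequencies f_k are available as arrays. The helper name `integrate_peaks` is hypothetical, and two details are assumptions not specified in the text: the outermost bins are also half-weighted, and a flat run of zeros is split at its last bin.

```python
import numpy as np

def integrate_peaks(a, f):
    """Return a list of (abundance, centroid) pairs following eq 12.

    a : non-negative amplitudes a_k on the refined grid
    f : grid frequencies f_k (same length as a)
    """
    a = np.asarray(a, dtype=float)
    f = np.asarray(f, dtype=float)
    n = len(a)

    # Segment boundaries: first bin, interior local minima, last bin.
    # Assumption: in a flat run, only the last bin counts as the minimum.
    minima = [0]
    for k in range(1, n - 1):
        if a[k] <= a[k - 1] and a[k] < a[k + 1]:
            minima.append(k)
    minima.append(n - 1)

    peaks = []
    for k1, k2 in zip(minima[:-1], minima[1:]):
        # Weights q[k]: 1 inside the segment, 1/2 at both bounding minima,
        # so a shared minimum is split evenly between adjacent peaks.
        q = np.ones(k2 - k1 + 1)
        q[0] = q[-1] = 0.5
        w = q * a[k1:k2 + 1]
        abundance = w.sum()
        if abundance > 0:  # skip empty segments
            centroid = (w * f[k1:k2 + 1]).sum() / abundance
            peaks.append((abundance, centroid))
    return peaks

# Two well-separated peaks on an integer-valued frequency grid
example = integrate_peaks([0, 0, 1, 2, 1, 0, 0, 0, 1, 3, 1, 0], np.arange(12))
for abundance, centroid in example:
    print(abundance, centroid)
```

Because every interior minimum contributes half of its amplitude to each neighbor, the abundances of all peaks add up to the total of the histogram whenever the two outermost bins are empty.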


Figure 1. Deconvoluted frequency distributions of a doublet (A, frequency separation Δf = 0.5 δf), triplet (B, Δf = 0.5 δf), and septet (C, Δf = 0.75 δf) for different numbers of iterations I. δf = 1/T is the FT spectrum bin.

More mathematical details are given in SI. The numerical methods of choice are the alternating direction method of multipliers (ADMM)45−48 with the Moreau−Yosida regularization49 and the Cooley−Tukey algorithm for the Fast Fourier Transform.50−52 It is also shown in SI that the computational cost per iteration scales as O(NP log2 NP) with the spectrum length N and the refinement factor P, which is close to that of P-times zero-filling of the FT spectrum in question. Because the computational cost scales up slowly, the algorithm can be applied to an entire spectrum without windowing, whereas algorithms that scale as N² or worse may be used only for a relatively short segment at once.
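The minimal-discrepancy problem of eqs 8−10 can be illustrated on a toy doublet. The sketch below is not the ADMM/FFT implementation named above: it swaps in plain projected gradient descent with a dense kernel matrix (cost O(N²P) per iteration rather than quasilinear). It further assumes φ ≡ 0, γ = 0, and the limit Δφ → 0, in which the phase constraint reduces to real, non-negative x_k. The function names `kernel_matrix` and `deconvolve_nonneg` are hypothetical.

```python
import numpy as np

def kernel_matrix(N, P, gamma_T=0.0):
    """Dense matrix of eq 9 with N FT bins and P refined bins per FT bin."""
    n = np.arange(N)[:, None]
    k = np.arange(N * P)[None, :]
    denom = 2j * np.pi * (k / P - n) - gamma_T
    numer = np.exp(2j * np.pi * k / P - gamma_T) - 1.0
    # Where the denominator vanishes (k = P*n, gamma = 0), use the limit 1.
    safe = np.where(denom == 0, 1.0, denom)
    return np.where(denom == 0, 1.0, numer / safe)

def deconvolve_nonneg(c, Psi, iterations=2000):
    """Minimize ||Psi x - c||^2 over real x >= 0 by projected gradient.

    Assumption: phase phi = 0 and a zero-width phase cone, so the
    feasible set of eq 10 is the non-negative real axis.
    """
    lipschitz = np.linalg.norm(Psi, 2) ** 2   # squared spectral norm
    x = np.zeros(Psi.shape[1])
    for _ in range(iterations):
        residual = Psi @ x - c
        grad = (Psi.conj().T @ residual).real  # gradient of half the norm
        x = np.maximum(x - grad / lipschitz, 0.0)  # step, then project
    return x

# Toy doublet: two unit peaks half an FT bin apart (bins 8.0 and 8.5)
N, P = 16, 8
Psi = kernel_matrix(N, P)
x_true = np.zeros(N * P)
x_true[8 * P] = 1.0
x_true[8 * P + P // 2] = 1.0
c = Psi @ x_true                     # noise-free model FT spectrum

x = deconvolve_nonneg(c, Psi)
print("relative residual:", np.linalg.norm(Psi @ x - c) ** 2 / np.linalg.norm(c) ** 2)
print("total abundance:", x.sum())
```

The projection step is what distinguishes this from zero-filling: the refined bins are independent unknowns, and only the non-negativity (phase) constraint regularizes the otherwise underdetermined system of N equations in PN unknowns.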



PERFORMANCE AND LIMITATIONS

TESTS ON SYNTHETIC SPECTRA

Performance of the method in terms of resolving power, frequency (or equivalently m/z) accuracy, and abundance accuracy was first evaluated on synthetic (model) transients containing clusters of peaks with the known phase φ ≡ 0, as assumed without loss of generality. The transients generated as

$$S(t) = \sum_{p} A^{(p)} \exp(2\pi i f^{(p)} t) \qquad (13)$$

were sampled in the time interval 0 ≤ t < T and Fourier-transformed to obtain model discrete spectra {c_n}. The parameters A^(p) and f^(p) are the model peak abundances and frequencies, respectively, while decay is neglected. Only the separation of peaks relative to the Fourier uncertainty is relevant; therefore, all frequencies are further expressed in FT bins δf = 1/T. The refinement factor P = 16 (unless stated otherwise) gives the number of refined frequency bins within an FT bin. The notation δf is hereafter reserved for the FT bin separation, while the frequency spacing between peaks in model and experimental spectra will be denoted as Δf.

Figure 1A demonstrates deconvolved frequency distributions after different numbers of iterations I on a sample doublet of peaks separated by Δf = 0.5 FT bin. The horizontal axis represents the frequency in FT bin units relative to a nonessential offset f0. Amplitudes of resolved peaks are shown in relative units above the histograms, as well as the total abundance of the cluster. The model (exact) values are given in parentheses. δf* provides an estimate of the peak centroid accuracy. After 20 iterations the frequency distribution already shows a ∼25% valley (gap) between the peaks, which becomes deeper on the subsequent iterations. Accuracies of amplitudes and centroids improve progressively. The doublet is baseline resolved after 100 iterations.

Figure 1B shows a similar test for a triplet of peaks with the abundances {0.5, 1.0, 0.5} rel. units, also separated by 0.5 FT bin. Although the triplet is not resolved after 50 iterations, the total intensity is determined accurately. The baseline resolution is achieved after 500 iterations.

Figure 1C shows frequency distributions for a model cluster of seven peaks with the intensities {0.2, 0.5, 0.8, 1.0, 0.8, 0.5, 0.2} rel. units separated by 3/4 FT bin. The individual peaks are resolved after 200 iterations, but the baseline resolution requires more than 1000 iterations. Generally, the larger the number of peaks in a cluster, the more iterations are required to resolve its individual components. This is yet another demonstration that the spectral sparseness is essential for super-FT resolution.

Figure 2. Definition of the resolution fidelity (A) and its dependence on the number of iterations as well as the separation between peaks in the doublet (B), triplet (C), and the septet (D). A fidelity of 100% corresponds to baseline resolution. Peak separation is given for each curve in the Fourier uncertainty units δf.

Figure 2 shows the resolution fidelity as a function of the iteration count and the peak separation in the interrogated multiplets. The fidelity is defined as the depth of the valley between two adjacent peaks relative to the height of the smaller one. With the number of iterations limited to I = 300, the 50% fidelity is achieved at the peak separations Δf ≈ 0.33 δf for the doublet, ∼0.5 δf for the triplet, and ∼0.75 δf for the septet.

Figure 3. Comparison of absorption mode FT spectra (blue dashed line, Hann apodization, 16x zero-filling) and ΦSDM spectra (histogram, P = 16, I = 300) for triplet models with different peak separations Δf. ΦSDM peak centroids and abundances are shown with red crosses. The black dots show model (true) frequencies and amplitudes.

Figure 3 compares the triplet ΦSDM spectrum to that of the absorption-mode FT. The peak separation is progressively reduced from 1.5 FT bins down to 0.25 FT bin. The determined peak abundances and centroids are shown with red crosses, while the black dots indicate the model values. For the peaks separated by Δf = 1.5 δf, both FT and ΦSDM spectra show good resolution and accuracy. When the separation is reduced down to 1.25 δf and further to 1.0 δf, the peaks are still resolved in the absorption-mode FT, but their abundances and centroids are substantially corrupted by the interference. Within the domain 0.5 δf ≤ Δf < δf the triplet is no longer resolved in the FT spectrum, and the apex reflects neither the amplitudes of the individual peaks nor the total abundance. In contrast, ΦSDM still shows reliable resolution and accuracy. Below a frequency separation of 0.25 δf, ΦSDM also fails to produce a resolved spectrum, though the total intensity of 2.0 rel. units is still preserved.

ΦSDM samples a continuous frequency distribution A(f) on a P-fold refined frequency grid, and the next tests investigate the accuracy of such discretization. Figure 4 demonstrates

resolution of the triplet with the abundances {0.5, 1.0, 0.5} rel. units and Δf = 0.5 δf on the 500th iteration for different refinement factors. When an FT bin comprises only P = 2 sampling points, the triplet is apparently unresolved. For P = 4, ΦSDM resolves the cluster as a triplet, yet it is impossible to determine individual centroids and intensities with a sufficient degree of confidence. The refinement factors P = 8...16 were found to be appropriate in most practical cases. Further refinement of the frequency grid results in only marginal improvement and might not be the best strategy since the computational burden grows with P.

Figure 4. Resolution of the triplet {0.5,1,0.5} with the separation Δf = 0.5 δf on the 500th iteration with refinement factors P = 2...64.

It is also important for the algorithm characterization to understand the limits of its stability under conditions in which the presumptions about either the phase or the collisional decay rate are not accurate enough. Numerical experiments described in the SI show no accuracy deterioration if the phase error lies within the half angle Δφ/2 of the phase constraint cone. The phase calibration accuracy may depend on the electronics and timing jitter; the practically reasonable value Δφ ∼ 0.1 is used hereafter. ΦSDM is also tolerant of deviations from the expected exponential decay over the entire transient duration, γT, below 15%. This condition normally holds for sufficiently short transients (T ≲ 250 ms) in spite of variations of γ between ion species.

PERFORMANCE UNDER NOISY CONDITIONS

Stability of ΦSDM under noisy conditions is an especially important question because the deconvolution problem (eqs 2 and 3) belongs to the class of ill-posed Fredholm integral equations of the first kind, whose solutions are known to be sensitive to small perturbations. In the following test, ΦSDM was evaluated on a model of normally distributed white noise with random phases. The FT spectrum components c_n were sampled as independent complex random numbers with the magnitude RMS σ = 1 (which may be arbitrary as ΦSDM is linear with respect to positive scaling). The deconvolution resulted in a sparse frequency distribution (see the inset in Figure 5) that has, on average, less than one nonzero a_k per FT bin (out of P = 16). The statistical distribution of nonzero a_k shows RMS ∼ 0.6 σ and is described by the empirical formula

$$\Pr(a_k > \xi\sigma \mid a_k > 0) \cong \exp(-1.53\,\xi - 0.85\,\xi^{2}) \qquad (14)$$

The distribution's exponential character allows establishing a noise-level cutoff for trustworthy data. For example, the probabilities that a noise-related nonzero a_k exceeds 1.5 × σ and 2 × σ are 1.5% and 0.15%, respectively.

Figure 5. ΦSDM response to the Gaussian noise with the RMS σ. The normalized distribution of the ΦSDM spectrum (black) is superimposed with an empirical relation (green), which is the normalized derivative of eq 14. The inset shows a fragment of the ΦSDM spectrum for the noise sample.

The next numerical experiment investigates the effects of noise on the method's resolving power and accuracy. A model triplet spectrum with one rel. unit amplitudes is corrupted by random white Gaussian noise with varied RMS σ and interrogated by ΦSDM. The noise is randomly sampled 1000 times for each combination of the peak separation Δf and the signal-to-noise ratio (SNR) defined as 1/σ. A test is considered a success if three conditions are satisfied simultaneously: (1) all three peaks are resolved, (2) the centroid errors do not exceed Δf/3, and (3) the amplitude errors are below 0.33 rel. units. The success rates are shown in Figure 6.

Figure 6. Statistical resolution success rate of ΦSDM as a function of SNR and peak separation in a model triplet. Number of iterations I = 5000.

Figure 7. Super-FT resolution of the fine isotopic structure of the MRFA peptide (experimental 128 ms transient obtained on Q Exactive HF mass spectrometer). Insets: (I) Theoretical stick spectra calculated from the natural isotope abundances and (II) ΦSDM spectrum in comparison with a theoretical high-resolution spectrum. Details are in the text.

In the case of Δf ≳ δf (which is also resolvable by the absorption FT), the noise is simply additive and all tests above a certain SNR are successful. It is different, however, in the super-FT resolution domain Δf ≲ δf, where the noise may significantly affect the ability of ΦSDM to resolve clusters. The percentage of successful tests becomes smaller as the frequency separation decreases, and the resolution limit is eventually determined by the noise level. It is noteworthy that the total abundance is affected only to the extent of the noise level, regardless of the method's ability to resolve the cluster (see corresponding tests in SI). The success rates shown in Figure 6 depend on a particular choice of acceptance criteria and the number of iterations. The results for other accuracy thresholds and iteration counts I may be found in SI.
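The empirical noise law of eq 14 can be turned into a practical cutoff by solving its quadratic exponent for ξ. The snippet below is a minimal sketch under that reading; the function names are hypothetical, and the coefficients 1.53 and 0.85 are simply those quoted in eq 14.

```python
import math

B1, B2 = 1.53, 0.85  # empirical coefficients of eq 14

def exceedance_probability(xi):
    """Pr(a_k > xi * sigma | a_k > 0) for a noise-only spectrum, eq 14."""
    return math.exp(-B1 * xi - B2 * xi ** 2)

def cutoff_for_probability(p):
    """Cutoff xi (in units of sigma) exceeded by noise with probability p.

    Solves B2*xi^2 + B1*xi + ln(p) = 0 for the non-negative root,
    valid for 0 < p <= 1.
    """
    return (-B1 + math.sqrt(B1 ** 2 - 4.0 * B2 * math.log(p))) / (2.0 * B2)

for xi in (1.5, 2.0):
    print(xi, exceedance_probability(xi))
print("cutoff for a 1% false-positive rate:", cutoff_for_probability(0.01))
```

Evaluating the formula at ξ = 1.5 and ξ = 2 gives values in line with the 1.5% and 0.15% quoted in the text; note that these probabilities are conditional on a_k being nonzero, which by itself happens less than once per FT bin.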



APPLICATION TO EXPERIMENTAL FT MS SPECTRA

Following the analysis on synthetic data, ΦSDM was evaluated on experimental transients acquired on a research-grade Q Exactive HF instrument with the high-field Orbitrap analyzer. Figure 7 demonstrates the fine structure of the MRFA peptide (524 Da, Z = 1+). The insets (I) show theoretical isotope abundances calculated by the Xcalibur software for three major isotopic clusters. The experimental transient acquisition time T = 128 ms results in the FT uncertainty δm ≅ 13 mDa (shown with blue arrows), which translates into the mass resolution R_FT = m/δm ≅ 40 000 and is insufficient to resolve the isotopic fine structure in the FT spectrum. The insets (II) show the ΦSDM distribution a_k after 300 iterations with the refinement factor P = 16. In contrast to FT, ΦSDM is capable of resolving many, though not all, details of the clusters.

To assess the effective resolving power, we compare ΦSDM spectra with Gaussian-blurred theoretical stick spectra. The fwhm 0.2 δm of the blur was chosen to mimic a mass resolving power five times that of FT (the red lines). The A1 isotopic cluster, which comprises five species with different abundances, is resolved by ΦSDM as two peaks, which are approximately one-half FT bin apart. The ΦSDM histogram follows the theoretical spectrum rather closely and thus demonstrates an effective resolving power of about 200 000. Further resolving of the fine structure becomes a challenge as the separation between the components is less than 0.25 FT bin. Both A2 and A3 isotopic clusters are resolved as triplets. They also demonstrate strong correlation between the theoretical high-resolution spectrum and the experimental ΦSDM histogram. It is noteworthy that the rightmost ΦSDM-resolved peaks in A2 and A3 look noticeably narrower than in the theoretical spectra, albeit preserving the total abundances with ∼0.35% accuracy. One cannot exclude the space-charge-induced coalescence of these peaks, which might result in frequency synchronization.

Figure 8A presents a T1 = 1024 ms transient acquired with the isolated ubiquitin charge state 11+. It reveals noticeable collisional decay, which manifests itself in the exponential transient decline with the rate γ ≈ 1.7 s−1. Magnitude FT spectra were calculated with Hann apodization and 16x zero-filling on both the entire transient and its first T2 = 32 ms fraction, as shown in Figure 8B. The entire transient's spectrum completely resolves the isotope structure, though the decay-related suppression is quite significant. The decay suppression may be estimated for the T1 transient with eq 5 as D1 ≈ 0.47, but it is hardly detectable at the transient length T2 (Figure 8A, inset), being only D2 ≈ 0.05. However, the FT spectrum of the short 32 ms transient is overwhelmingly suppressed by interference and no isotopic structure is resolved, since the spacing between the isotopes is only ∼0.94 FT bin. In contrast, ΦSDM (P = 16, I = 300) applied to the short T2 = 32 ms transient produces a baseline resolved isotopic picture shown in Figure 8C. A theoretical Gaussian-blurred spectrum with the effective fwhm resolving power 40 000 is superimposed for comparison. Moreover, the ΦSDM spectrum is suppressed neither by interference nor by the collisional decay.

Figure 8. Experimental high resolution Orbitrap data of the isolated Ubiquitin isotopic cluster Z = 11+. (A) Complete recorded transient and its first 32 ms segment, (B) corresponding magnitude FT spectra, and (C) ΦSDM processing of the first 32 ms segment only.

In some applications, only a limited m/z range may be of interest, and it might therefore be advantageous to apply ΦSDM to a spectral window with a smaller number of FT bins. Windowing, though not necessary for ΦSDM, may considerably speed up the processing. An example of such an application is tandem mass tags (TMT) MS,53,54 in which the reporter ions are concentrated in a relatively narrow mass range but provide quantitative information on abundances of the TMT-labeled species. Processing of experimental TMT 6-plex labeled yeast lysates is presented in Figure 9. The principal challenge consists

Figure 10. Comparison between the FT and ΦSDM applied to the TMT spectra with different SNR (vertical axis). The horizontal axis shows differences of abundances of all TMT peaks normalized to the highest peak (130.13 Da). The dashed lines enclose the 1 × σ and 2 × σ noise bands.
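The comparison plotted in Figure 10 can be sketched in a few lines. This is a minimal illustration only: it assumes that each spectrum is normalized to its own highest peak before the difference is taken, and it treats σ as a user-supplied number, since this section does not state how the noise band is derived. The abundance values are hypothetical and chosen only to exercise the functions.

```python
def normalized_abundances(abundances):
    """Scale a list of peak abundances so that the highest peak equals 1."""
    top = max(abundances)
    if top <= 0:
        raise ValueError("highest peak must be positive")
    return [a / top for a in abundances]


def abundance_differences(ft, phisdm):
    """Per-peak difference (PhiSDM minus FT) of the normalized abundances."""
    if len(ft) != len(phisdm):
        raise ValueError("both spectra must list the same peaks")
    ft_n = normalized_abundances(ft)
    ph_n = normalized_abundances(phisdm)
    return [p - f for f, p in zip(ft_n, ph_n)]


def inside_band(differences, sigma, width=2.0):
    """True if every difference lies within +/- width * sigma."""
    return all(abs(d) <= width * sigma for d in differences)


# Hypothetical reporter-ion abundances for one scan (arbitrary units).
ft_scan = [80.0, 95.0, 60.0, 100.0, 70.0, 90.0]
phisdm_scan = [20.25, 23.5, 15.2, 25.0, 17.3, 22.6]

diffs = abundance_differences(ft_scan, phisdm_scan)
print(inside_band(diffs, sigma=0.01))
```

Normalizing each spectrum separately makes the comparison independent of the absolute signal scale, which differs between a 128 ms and a 32 ms transient.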

SNRs of the ΦSDM spectra are plotted on the vertical axis, also related to the highest peak. One can see that the higher the SNR, the lower the discrepancy between the two signal processing methods. Both methods actually yield abundance ratios of the same quality because the differences remain inside the 2 × σ noise band. Yet, ΦSDM uses only one-quarter of the acquisition time required for FT, which will potentially enable up to 4× more MS/MS spectra per LC run and result in more proteins quantified.
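The TMT doublet separations quoted in this section (∼3.8 FT bins at 128 ms, ∼0.95 FT bin at 32 ms) and the 4× throughput estimate are linked by one proportionality. A peak spacing counted in FT bins scales linearly with the transient length, assuming the FT bin width is inversely proportional to the acquisition time. The sketch below reproduces that arithmetic; the function names are illustrative.

```python
def separation_in_bins(known_bins: float, known_transient_ms: float,
                       new_transient_ms: float) -> float:
    """Rescale a peak separation (in FT bins) to another transient length.

    Assumes the FT bin width is proportional to 1/T, so the same frequency
    spacing spans proportionally fewer bins on a shorter transient.
    """
    if known_transient_ms <= 0 or new_transient_ms <= 0:
        raise ValueError("transient lengths must be positive")
    return known_bins * new_transient_ms / known_transient_ms


def scan_rate_gain(long_transient_ms: float, short_transient_ms: float) -> float:
    """Upper bound on the gain in spectra per unit time from a shorter transient."""
    return long_transient_ms / short_transient_ms


# Values from the text: ~3.8 FT bins between the doublet peaks at 128 ms.
bins_at_32_ms = separation_in_bins(3.8, 128.0, 32.0)
gain = scan_rate_gain(128.0, 32.0)

print(f"separation at 32 ms: {bins_at_32_ms:.2f} FT bin")
print(f"potential scan-rate gain: {gain:.0f}x")
```

The gain is an upper bound because it ignores any per-scan overhead other than the transient acquisition itself.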



CONCLUSIONS

The method of phase-constrained spectrum deconvolution (ΦSDM) is implemented to speed up FT MS analysis by achieving higher mass resolution and obtaining more accurate mass-to-charge ratios and abundances on considerably shorter transients. Deconvolution of an FT spectrum into a distribution of abundances in the frequency domain (dubbed the frequency distribution) allows extraction of more useful information from a transient because of the direct relation of the frequency distribution to the harmonic components generated by oscillating ions. Approximation of the sought frequency distribution on a refined (e.g., 16-fold) frequency grid is used to achieve a resolving power far superior to that of FT methods. A practical limit of resolution is imposed by the actual information capacity of the transient and depends on its sparseness, as well as the noise level. The best super-FT resolution observed was under 1/4 of the FT uncertainty for peak doublets and sufficient signal-to-noise ratios.

The frequency distribution is integrable, which means that the peak abundance is equal to the sum of refined bins that constitute the peak. This property completely eliminates the interference errors which are typical of FT spectra and manifest themselves as mutual suppression of unresolved or barely resolved peaks. If a cluster of peaks is unresolved, ΦSDM reports a single broad peak, and the total abundance of the cluster is preserved.

An essential precondition of applying ΦSDM is the existence of reliable estimates for the signal phase and the collisional decay rate, both being smooth functions of m/z. Phase calibration is straightforward for Orbitrap spectra and is routine, although more difficult, for FT ICR spectra. The decay constants may also be estimated before applying ΦSDM, though the method is forgiving to the collisional decay in sufficiently short transients.
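The integrability property stated above can be illustrated with a small sketch: a peak abundance is read off by summing the contiguous refined bins that make up the peak. The peak-finding rule used here (runs of bins above a threshold) and the example distributions are assumptions made for illustration, not the authors' centroiding procedure.

```python
def integrate_peaks(distribution, threshold=0.0):
    """Sum contiguous runs of refined bins whose value exceeds `threshold`.

    Returns a list of (start_bin, stop_bin, abundance) tuples, where
    stop_bin is exclusive and abundance is the sum of the bins in the run.
    """
    peaks = []
    start = None
    for k, value in enumerate(distribution):
        if value > threshold:
            if start is None:
                start = k                      # a new peak begins here
        elif start is not None:
            peaks.append((start, k, sum(distribution[start:k])))
            start = None                       # the current peak has ended
    if start is not None:                      # peak touching the right edge
        peaks.append((start, len(distribution), sum(distribution[start:])))
    return peaks


# Hypothetical refined-grid distributions with the same total abundance (13).
resolved = [0.0, 1.0, 3.0, 1.0, 0.0, 2.0, 4.0, 2.0, 0.0]
unresolved = [0.0, 1.0, 3.0, 3.0, 4.0, 2.0, 0.0, 0.0, 0.0]

resolved_peaks = integrate_peaks(resolved)      # two separate peaks
unresolved_peaks = integrate_peaks(unresolved)  # one broad peak

print(resolved_peaks)
print(unresolved_peaks)
```

In the unresolved case the single broad run carries the summed abundance of both components, which mirrors the statement that the total abundance of an unresolved cluster is preserved.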
The use of shorter transients without loss of quality will result in a considerable throughput increase for both data-dependent and data-independent mass analysis. Application of ΦSDM to long transients may increase the resolution substantially. Some precautions should, however, be taken to determine exact decay rates and possible space-charge

Figure 9. Experimental TMT spectra: (A) absorption mode FT of a 128 ms transient and (B) ΦSDM centroided spectrum computed on a 32 ms transient.

in quantitatively reliable resolution of the doublets with the ∼6.3 mDa mass differences at 128 and 129 Da. If approached with the standard FT processing methods, this problem requires sufficiently long transients to guarantee baseline resolution of the doublets and prevent interference-related corruption of abundances. Figure 9A shows a single scan from an LC run with the 128 ms acquisition time that corresponds to ∼3.8 FT bins separation in the doublets, which is required for reliable accuracy (better than 1%) of the abundance ratios. A 32 ms transient provides, accordingly, only ∼0.95 FT bin separation in these doublets, which is too small to be resolved by FT-based methods. ΦSDM was applied to a window containing 1024 FT bins that encompass the reporter ion mass region (Figure 9B). As ΦSDM spectra do not suffer from interference, they show robust and quantitatively accurate resolution using 4-fold shorter transients.

Figure 10 compares relative abundances of the reporter ions calculated by the standard absorption-mode FT (128 ms) and ΦSDM (32 ms) for many transients obtained in an LC run. The horizontal axis shows differences between the peak abundances in ΦSDM and absorption-mode FT spectra, the values being normalized to the abundance of the highest peak at 130.13 Da. The total ion abundances vary greatly throughout the LC run, so that the spectral SNRs appear substantially different. The

effects on the oscillation phases. ΦSDM might also find applications in other Fourier transform spectroscopy methods, for example, NMR.



(17) Aushev, T.; Kozhinov, A. N.; Tsybin, Y. O. J. Am. Soc. Mass Spectrom. 2014, 25, 1263−1273.
(18) Hewitt, E.; Hewitt, R. E. Arch. Hist. Exact Sci. 1979, 21, 129−160.
(19) Farrar, T. C.; Elling, J. W.; Krahling, M. D. Anal. Chem. 1992, 64, 2770−2774.
(20) Guan, S.; Marshall, A. G. Anal. Chem. 1997, 69, 1156−1162.
(21) Rahbee, A. Int. J. Mass Spectrom. Ion Processes 1986, 72, 3−13.
(22) Osborne, M. R.; Smyth, G. K. SIAM J. Sci. Computing 1995, 16, 119−138.
(23) Aizikov, K.; Grinfeld, D. US Patent Appl. US20130311110(A1), 2013.
(24) Mandelshtam, V. A.; Taylor, H. S.; Shaka, A. J. J. Magn. Reson. 1998, 133, 304−312.
(25) Mandelshtam, V. A. Prog. Nucl. Magn. Reson. Spectrosc. 2001, 38, 159−196.
(26) Aizikov, K.; O’Connor, P. B. J. Am. Soc. Mass Spectrom. 2006, 17, 836−843.
(27) Kozhinov, A. N.; Tsybin, Y. O. Anal. Chem. 2012, 84, 2850−2856.
(28) Martini, B. R.; Aizikov, K.; Mandelshtam, V. A. Int. J. Mass Spectrom. 2014, 373, 1−14.
(29) Leach, F. E., III; Kharchenko, A.; Vladimirov, G.; Aizikov, E.; O’Connor, P. B.; Nikolaev, E.; Heeren, R. M. A.; Amster, I. J. Int. J. Mass Spectrom. 2012, 325−327, 19−24.
(30) Chiron, L.; van Agthoven, M. A.; Kieffer, B.; Rolando, C.; Delsuc, M.-A. Proc. Natl. Acad. Sci. U. S. A. 2014, 111, 1385−1390.
(31) Marshall, A. G.; Comisarow, M. B.; Parisod, G. J. Chem. Phys. 1979, 71, 4434−4444.
(32) Comisarow, M. B.; Marshall, A. G. J. Chem. Phys. 1976, 64, 110.
(33) Mao, L.; Chen, Y.; Xin, Y.; Chen, Y.; Zheng, L.; Kaiser, N. K.; Marshall, A. G.; Xu, W. Anal. Chem. 2015, 87, 4072−4075.
(34) Yang, F.; Voelkel, J. E.; Dearden, D. V. Anal. Chem. 2012, 84, 4851−4857.
(35) Aizikov, K.; Mathur, R.; O’Connor, P. B. J. Am. Soc. Mass Spectrom. 2009, 20, 247−256.
(36) Grinfeld, D.; Monastyrskiy, M.; Makarov, A. Microsc. Microanal. 2015, 21, 176−181.
(37) Chen, S.-P.; Comisarow, M. B. Rapid Commun. Mass Spectrom. 1991, 5, 450−455.
(38) Gorshkov, M. V.; Fornelli, L.; Tsybin, Y. O. Rapid Commun. Mass Spectrom. 2012, 26, 1711−1717.
(39) Mitchell, D. W.; Smith, R. D. Phys. Rev. E: Stat. Phys., Plasmas, Fluids, Relat. Interdiscip. Top. 1995, 52, 4366−4386.
(40) Bolotskikh, P. A.; Grinfeld, D. E.; Makarov, A. A.; Monastyrskiy, M. A. Nucl. Instrum. Methods Phys. Res., Sect. A 2011, 645, 146−152.
(41) Kharchenko, A.; Vladimirov, G.; Heeren, R. M.; Nikolaev, E. N. J. Am. Soc. Mass Spectrom. 2012, 23, 977−987.
(42) Jaitly, N.; Mayampurath, A.; Littlefield, K.; Adkins, J. N.; Anderson, G. A.; Smith, R. D. BMC Bioinf. 2009, 10 (1), 87.
(43) Maxwell, E.; Tan, Y.; Tan, Y.; Hu, H.; Benson, G.; Aizikov, K.; Conley, S.; Staples, G. O.; Slysz, G. W.; Smith, R. D.; Zaia, J. PLoS One 2012, 7, e45474.
(44) Boyd, S. P.; Vandenberghe, L. Convex Optimization; Cambridge University Press: New York, 2004.
(45) Gabay, D.; Mercier, B. Comp. and Math. Appl. 1976, 2, 17−40.
(46) Eckstein, J. J. Optimization Theory and Appl. 1994, 80, 39−62.
(47) Boyd, S.; Parikh, N.; Chu, E.; Peleato, B.; Eckstein, J. FNT in Machine Learning 2010, 3, 1−122.
(48) Parikh, N.; Boyd, S. FNT in Optimization 2014, 1, 127−239.
(49) Moreau, J. J. C. R. Acad. Sci. Paris Sér. A Math. 1962, 255, 2897−2899.
(50) Cooley, J. W.; Tukey, J. W. Math. Comput. 1965, 19, 297−301.
(51) Qian, Z.; Lu, C.; An, M.; Tolimieri, R. IEEE Trans. ASSP 1994, 42, 2835−2836.
(52) Hegland, M. Numerische Mathematik 1994, 68, 507−547.
(53) Thompson, A.; Schäfer, J.; Kuhn, K.; Kienle, S.; Schwarz, J.; Schmidt, G.; Neumann, T.; Hamon, C. Anal. Chem. 2003, 75, 1895−1904.

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.analchem.6b03636. Numerical algorithm details, tolerances to phase and decay errors, resolution success rates under varying numbers of iterations and different acceptance criteria, the total abundance success rate, and resolution success rates for a septet (PDF)



AUTHOR INFORMATION

Corresponding Author

*Phone +494215493138. E-mail: dmitry.grinfeld@thermofisher.com.

ORCID

Dmitry Grinfeld: 0000-0003-2261-4209

Notes

The authors declare the following competing financial interest(s): All authors are employees of Thermo Fisher Scientific, which manufactures and sells Orbitrap-based mass spectrometers.



ACKNOWLEDGMENTS

The authors are grateful to Dr. J. Smith, Dr. A. Schoen, O. Lange, E. Denisov, E. Couzijn (Thermo Fisher), and Prof. S. Boyd (Stanford University) for fruitful discussions. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 686547 (MSMed project).



REFERENCES

(1) Comisarow, M. B.; Marshall, A. G. Chem. Phys. Lett. 1974, 25, 282−283.
(2) Marshall, A. G.; Hendrickson, C. L.; Jackson, G. S. Mass Spectrom. Rev. 1998, 17, 1−35.
(3) Makarov, A. Anal. Chem. 2000, 72, 1156−1162.
(4) Perry, R. H.; Cooks, R. G.; Noll, R. Mass Spectrom. Rev. 2008, 27, 661−699.
(5) Bartholdi, E.; Ernst, R. R. J. Magn. Reson. 1973, 11, 9−19.
(6) Ebel, A.; Dreher, W.; Leibfritz, D. J. Magn. Reson. 2006, 182, 330−338.
(7) Savitski, M. M.; Ivonin, I. A.; Nielsen, M. L.; Zubarev, R. A.; Tsybin, Y. O.; Håkansson, P. J. Am. Soc. Mass Spectrom. 2004, 15, 457−461.
(8) Aboutanios, E. IEEE Instr. Meas. Magaz. 2011, 14, 8−14.
(9) Guan, S.; Wahl, M. C.; Marshall, A. G. Anal. Chem. 1993, 65, 3647−3653.
(10) Lange, O.; Damoc, E.; Wieghaus, A.; Makarov, A. Int. J. Mass Spectrom. 2014, 369, 16−22.
(11) Qi, Y.; Thompson, C. J.; Van Orden, S. L.; O’Connor, P. B. J. Am. Soc. Mass Spectrom. 2011, 22, 138−147.
(12) Qi, Y.; Barrow, M. P.; Li, H.; Meier, J. E.; Van Orden, S. L.; Thompson, C. J.; O’Connor, P. B. Anal. Chem. 2012, 84, 2923−2929.
(13) Rockwood, A. L.; Erve, J. C. J. Am. Soc. Mass Spectrom. 2014, 25, 2163−2176.
(14) Kilgour, D. P. A.; Wills, R.; Qi, Y.; O’Connor, P. B. Anal. Chem. 2013, 85, 3903−3911.
(15) Write, D. A.; Grothe, R. A. US Patent US8346487, 2013.
(16) Grothe, R. A. US Patent US8431886, 2013.

(54) Dayon, L.; Hainard, A.; Licker, V.; Turck, N.; Kuhn, K.; Hochstrasser, D. F.; Burkhard, P. R.; Sanchez, J.-C. Anal. Chem. 2008, 80, 2921−293.
