Compressively Sampled Two-Dimensional Infrared Spectroscopy That

Apr 2, 2017 - Jonathan J. Humston†, Ipshita Bhattacharya‡, Mathews Jacob‡, and Christopher M. Cheatum†. † Department of Chemistry, Universit...
0 downloads 8 Views 2MB Size
Article pubs.acs.org/JPCA

Compressively Sampled Two-Dimensional Infrared Spectroscopy That Preserves Line Shape Information Jonathan J. Humston,† Ipshita Bhattacharya,‡ Mathews Jacob,‡ and Christopher M. Cheatum*,† †

Department of Chemistry, University of Iowa, Iowa City, Iowa 52242, United States Department of Electrical and Computer Engineering, University of Iowa, Iowa City, Iowa 52242, United States



ABSTRACT: Two-dimensional infrared (2D IR) spectroscopy is a powerful tool to investigate molecular structures and dynamics on femtosecond to picosecond time scales and is applied to diverse systems. Current technologies allow for the acquisition of a single 2D IR spectrum in a few tens of milliseconds using a pulse shaper and an array detector, but demanding applications require spectra for many waiting times and involve considerable signal averaging, resulting in data acquisition times that can be many days or weeks of laboratory measurement time. Using compressive sampling, we show that we can reduce the time for collection of a 2D IR data set in a particularly demanding application from 8 to 2 days, a factor of 4×, without changing the apparatus and while accurately reproducing the line-shape information that is most relevant to this application. This result is a potent example of the potential of compressive sampling to enable challenging new applications of 2D IR.



lead to big differences in data acquisition times.6 Current technologies allow for the acquisition of a single 2D IR spectrum in a few tens of milliseconds using a pulse shaper and an array detector. For some applications, acquiring one spectrum is enough to answer the questions being asked, and improving the data acquisition time may not be important. Other applications, however, require spectra for many waiting times or may involve considerable signal averaging, and further reducing the time to acquire one spectrum can significantly reduce the overall measurement time. Previously, Dunbar et al. applied a form of compressive sampling to 2D IR.7 By scanning small, evenly spaced time points over a relatively short window, they determined peak positions and relative peak amplitudes with a reduction in acquisition time of a factor of 16. Unfortunately, their approach requires that the approximate frequencies, or at least the frequency splitting, must be known a priori to know where to place the sampling window. This method is further limited by the fact that it lacks the ability to reproduce line-shape information.

INTRODUCTION Long data acquisition times are a longstanding problem in many experimental methods and can limit the number of studies that are practicable, especially those involving timeevolving or unstable samples. Over the past decade, compressive sampling has reduced data acquisition time in diverse fields such as geophysics,1 medical imaging,2 computational biology,3 and astronomy.4 Compressive sampling requires only a fraction of the traditional number of measurements while yielding much of the same information as the fully sampled data. Here we introduce an implementation of compressive sampling to reduce the data acquisition time of two-dimensional infrared (2D IR) spectroscopy without distorting spectral peak line shapes. Natural signals often have an underlying structure that causes the signal or its coefficients in an appropriate transform domain (e.g., Fourier coefficients, finite differences) to be sparse, meaning that most coefficients have small values and there are few large coefficients. In traditional data compression, the fully sampled signal is transformed into a fixed basis and the significant coefficients are stored or transmitted. The compressed data can then be decompressed for viewing. This idea underlies methods such as JPEG compression of images. A massive data acquisition followed by the removal of most of the data, however, is inefficient. Compressive sampling aims to bypass the first step and to directly acquire the compressed signal with no a priori knowledge of the signal being measured. 2D IR spectroscopy is a multidimensional spectroscopic method used in the investigation of molecular structure and dynamics on femtosecond to picosecond time scales and has been applied to diverse systems from dilute solutions to solids to membranes.5 There are different approaches to building a 2D IR spectrometer, and different experimental designs can © 2017 American Chemical Society



EXPERIMENTAL METHODS We present 2D IR data collected in our lab measuring the antisymmetric stretch of the azide anion bound to formate dehydrogenase in a ternary complex with NAD+ to determine the frequency−frequency correlation function (FFCF).8 The full data set required 8−10 days of data acquisition because of the high density of waiting times and the extensive averaging necessary for the measurement. For each 2D IR spectrum, we scan τ1 from 0 to 4 ps, taking 24 fs steps in the rotating frame, Received: March 20, 2017 Published: April 2, 2017 3088

DOI: 10.1021/acs.jpca.7b01965 J. Phys. Chem. A 2017, 121, 3088−3093

Article

The Journal of Physical Chemistry A and use a four-pulse phase cycle to isolate the signal. We measure waiting times from 0 to 5 ps, taking 50 fs steps. To increase signal-to-noise, each waiting time is the average of ∼300 scans. Finally, we average the centerline slope (CLS) values from 20 to 25 replicate measurements for each waiting time. This data collection requires more than 500 million laser shots and with a 1 kHz repetition rate laser would take just under 6 days without missing a shot. In reality, it takes longer because it is necessary to occasionally reoptimize the signal over the course of such a long data collection period. Compressive sampling favors nonuniform undersampling such that it produces incoherent aliasing artifacts.9−11 We implement compressive sampling in the τ1 time delay, meaning we do not take evenly spaced time steps in τ1 but rather a random selection of time points from a normal distribution centered at τ1 = 0 fs. The generation of the spectrum from these undersampled data is an underdetermined problem. We use an algorithm known as total variation (TV)12 regularized recovery, which is widely used in image processing13,14 to recover the data. TV seeks to find a solution that matches the measured data, while minimizing the sum of absolute variation of the solution. Specifically, the regularization penalty is chosen as l1-norm of the finite differences of the signal (denoted by ∥ψx∥1). Note that l1-norm minimization promotes solutions with few large coefficients over many small coefficients. The above regularization penalty is minimized subject to the constraint that -sx − b 2 < ϵ, where b are the compressively sampled time domain measurements themselves, and -s is a 1D inverse Fourier transform along the reconstructed spectral dimension that contains the sampling pattern information needed to select the delay times that are actually measured. The fidelity factor, ϵ, constrains the reconstructed spectrum to the measured data within a certain experimental noise level. The above optimization problem is solved using an algorithm called as the Alternating Direction Method of Multipliers (ADMM).15 Notably, what we refer to as fully sampled is already undersampled using frequency shifting in the rotating frame using the pulse shaper.5 The oscillation at 2050 cm−1 has a period of 16 fs and must be sampled with time intervals of Fcrit = 2.7). At 8× and 10× compressions, however, the weighted sum of squared residuals for the model fit is barely better than that for a linear fit, indicating that by the highest levels of compression the line shape is sufficiently distorted that the correct parameter values cannot be determined from the data. Table 1 summarizes the CLS fit parameters for the fully sampled, 2×, and 4× data. The magnitude of the CLS varies with compression factor in a similar way that the CLS depends on the apodization.20 The exact value of the CLS is not critical if the time constants, frequency components, and relative amplitudes remain unaffected. Without correct absolute amplitudes, it is possible to extract the full FFCF, including homogeneous contribution and the absolute amplitudes of inhomogeneous frequency fluctuations, from the compressively sampled CLS decay by fitting to the FTIR spectrum. The amplitude of each term of the full FFCF is proportional to product of the fwhm of the FTIR spectrum and the square root of the relative amplitudes of the CLS,21 which are shown in the last three columns of Table 1. The square roots of the relative amplitudes at 2× and 4× compression are not all within the 95% confidence limit of the fully sampled measurement. However, all of the compressively sampled values are within 10% of the fully sampled values. At 4× compression, the time constants, frequency components, and relative amplitudes are remarkably close given that this is a first attempt at compressive sampling and we use an unrefined reconstruction algorithm. This is a reasonable compromise in a case where the measurement may not be feasible without compressive sampling. The appropriate compression factor at which to sample a new system will be unknown a priori. However, there are practical ways to determine the highest compression factor that still provides the required information. For example, one can fully measure only a few waiting times and compare those data to those obtained from compressively sampling. This will not add a significant amount of time to the measurement,

but it is not clear that they are the same oscillations as the first group. For a more quantitative analysis of these results, we fit the CLS decay to the following functional form CLS(T ) = A 0e−T / τ0 +



Ai e−T / τi × cos(2πωiT + ϕi)

i = 1,2

(2)

where we use a sum of exponential functions that includes damped cosines to model the oscillatory features in the decay. We have described the origin of the oscillatory features elsewhere,8 and here we focus on the ability of compressive sampling to resolve these oscillations and return the correct frequencies, time constants, and relative amplitude parameters. Figure 4B−I shows the fit parameters with 95% confidence intervals. Increasing compression results in increases in the errors on the parameters of both the high-frequency (blue) and low-frequency (red) oscillatory terms with the most substantial effect occurring between 4× and 6× compression. For both the low-frequency oscillatory term and the nonoscillating, slow exponential decay term there is a large decrease in amplitude between 4× and 6× compression. The amplitude of the oscillating terms have only small variations up to 4×, although the nonoscillating term has a clear downward trend. These changes show that the effects of increased compression on the CLS decay are not simply a multiplicative effect on the relative amplitude. Rather there is a differential impact on each of the three terms that contribute to the overall decay. In addition, we have fit to a predetermined model that includes two oscillations and a slow exponential decay. At the higher order compressions, however, this model is not justified based on the quality of the CLS data; therefore, this model yields parameters with large uncertainties. We have used this fit function because we know a priori what the functional form of the CLS decay is already from the fully sampled measurements. When actually using compressive sampling, however, we will not know the correct model a priori and must use statistical methods to determine the minimum appropriate model. At 6×, 3091

DOI: 10.1021/acs.jpca.7b01965 J. Phys. Chem. A 2017, 121, 3088−3093

Article

A2

0.43 ± 0.01 0.43 0.45

A1

0.36 ± 0.03 0.40 0.39 0.83 ± 0.02 0.81 0.80

A0 ω2 (cm )

9.9 ± 0.46 10.2 ± 0.63 10.3 ± 0.99 1.08 ± 0.17 1.12 ± 0.24 1.08 ± 0.34

τ2 (ps) A2

0.069 ± 0.01 0.065 ± 0.01 0.067 ± 0.01 24.0 ± 1.6 23.9 ± 2.0 23.3 ± 3.5

ω1 (cm ) τ1 (ps)

CONCLUSIONS On the basis of our results, it is clear that the acceptable level of compression depends on the required information. At 10× compression, which only measures 17 time points in τ1, the reconstructed peaks have comparable peak positions and S/N to the fully sampled data, but the line shapes are significantly distorted. For some applications this result would be entirely adequate. For many applications, however, the CLS and lineshape information are essential. On the basis of qualitative and quantitative inspection of the CLS decay curves and CLS decay fit parameters, most importantly the frequencies and time constants of the oscillations, compressive sampling can accelerate data acquisition of 2D IR spectra by a factor of at least 4× while still adequately reproducing line shapes, which is still a significant acceleration. With 4× compression, the 8 days required to collect the FDH data presented here can be reduced to 2 days, which is a substantial improvement. An area of continued study is to test the difference between compressively sampling 2D IR with and without frequency shifting. In the future, it may be more practical to talk about the number of measurements, which corresponds directly to laboratory time, rather than compression factor. For example, the same 42 time delay measurements that correspond to 4× compression in the current frequency shifted scheme are 12× compression if sampled in the true carrier frequency scheme. Additionally, when probing higher or lower vibrational frequencies, the Nyquist sampling rate scales proportionally and the same compression factor corresponds to different numbers of measurements. There ultimately will be a lower limit to the number of measurements required. For example, because fewer points are needed to fully sample lower vibrational frequencies to reproduce the relevant line-shape features, higher compression factors will not necessarily be obtained at lower vibrational frequencies. It is likely that even greater compression than shown here is possible. We have used a common reconstruction algorithm from image processing with essentially no optimization of the reconstruction parameters. In addition, it is possible to compressively sample in both τ1 and the waiting time, T. 2D reconstruction methods could leverage the redundancy of information in the 2D IR spectra at different waiting times. Using different masks at different waiting times and compressively sampling in both time delays, it is possible that we could realize an order of magnitude or more increase in the compression factor without distorting the line shapes significantly. Such possibilities could eventually enable us to collect the full FDH sample presented here in a matter of a few hours rather than days. Pulse shaping will be a key advantage in experimentally implementing compressive sampling methods. Taking uneven step sizes quickly is difficult with translation stages but is trivial with a pulse shaper.

0.64 ± 0.2 0.62 ± 0.3 0.63 ± 0.4

τ0 (ps)

80 ± 25 92 ± 47 72 ± 52

A0

0.26 ± 0.003 0.24 ± 0.004 0.21 ± 0.007

A1



0.050 ± 0.02 0.056 ± 0.03 0.048 ± 0.03

−1 −1

Table 1. CLS Parameters with Compressive Sampling

preserving the benefits of compressive sampling, and will provide a check to ensure adequate results. Additionally, one advantage of compressive sampling is that it is future-proof, meaning more data can be collected and added to the set. Furthermore, if a more robust and accurate algorithm specific to this application is developed, then the same data can be used to reconstruct spectra.



fully sampled 2× 4×

square root of relative amplitudes

The Journal of Physical Chemistry A

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. 3092

DOI: 10.1021/acs.jpca.7b01965 J. Phys. Chem. A 2017, 121, 3088−3093

Article

The Journal of Physical Chemistry A ORCID

(21) Kwak, K.; Park, S.; Finkelstein, I. J.; Fayer, M. D. FrequencyFrequency Correlation Functions and Apodization in Two-Dimensional Infrared Vibrational Echo Spectroscopy: A New Approach. J. Chem. Phys. 2007, 127, 124503.

Christopher M. Cheatum: 0000-0003-3881-3667 Notes

The authors declare no competing financial interest.

■ ■

ACKNOWLEDGMENTS This work was supported by funding from NSF CHE-1361765. REFERENCES

(1) Yao, H.; Shearer, P. M.; Gerstoft, P. Compressive Sensing of Frequency-Dependent Seismic Radiation from Subduction Zone Megathrust Ruptures. Proc. Natl. Acad. Sci. U. S. A. 2013, 110, 4512−4517. (2) Graff, C. G.; Sidky, E. Y. Compressive Sensing in Medical Imaging. Appl. Opt. 2015, 54, C23−C44. (3) Dai, W.; Sheikh, M. A.; Milenkovic, O.; Baraniuk, R. G. Compressive Sensing DNA Microarrays. EURASIP J. Bioinf. Syst. Biol. 2009, 2009, 1−12. (4) Bobin, J.; Starck, J.-L.; Ottensamer, R. Compressed Sensing in Astronomy. IEEE Journal Of Selected Topics in Signal Processing 2008, 2, 718−726. (5) Hamm, P.; Zanni, M. T. Concepts and Methods of 2D Infrared Spectroscopy; Cambridge University Press: New York, 2011. (6) Rock, W.; Li, Y.-L.; Pagano, P.; Cheatum, C. M. 2d Ir Spectroscopy Using Four-Wave Mixing, Pulse Shaping, and Ir Upconversion: A Quantitative Comparison. J. Phys. Chem. A 2013, 117, 6073−6083. (7) Dunbar, J. A.; Osborne, D. G.; Anna, J. M.; Kubarych, K. J. Accelerated 2d-Ir Using Compressed Sensing. J. Phys. Chem. Lett. 2013, 4, 2489−2492. (8) Pagano, P.; Guo, Q.; Kohen, A.; Cheatum, C. M. Oscillatory Enzyme Dynamics Revealed by Two-Dimensional Infrared Spectroscopy. J. Phys. Chem. Lett. 2016, 7, 2507−2511. (9) Sanders, J. N.; Saikin, S. K.; Mostame, S.; Andrade, X.; Widom, J. R.; Marcus, A. H.; Aspuru-Guzik, A. Compressed Sensing for Multidimensional Spectroscopy Experiments. J. Phys. Chem. Lett. 2012, 3, 2697−2702. (10) Lustig, M.; Donoho, D.; Pauly, J. M. Sparse Mri: The Application of Compressed Sensing for Rapid Mr Imaging. Magn. Reson. Med. 2007, 58, 1182−1195. (11) Candes, E. J.; Wakin, M. B. An Introduction to Compressive Sampling. IEEE Signal Processing Magazine 2008, 25, 21−30. (12) Rudin, L.; Osher, S.; Fatemi, E. Non-Linear Total Variation Noise Removal Algorithm. Phys. D 1992, 60, 259−268. (13) Strong, D.; Chan, T. Edge-Preserving and Scale-Dependent Properties of Total Variation Regularization. Inverse Problems 2003, 19, S165−S187. (14) Chambolle, A. A Journal of Mathematical Imaging and Vision 2004, 20, 89 DOI: 10.1023/B:JMIV.0000011325.36760.1e. (15) Eckstein, J.; Bertsekas, D. P. On the Douglas-Rachford Splitting Method and the Proximal Point Algorithm for Maximal Monotone Operators. Math. Program. 1992, 55, 293−318. (16) Hyberts, S. G.; Arthanari, H.; Wagner, G. Applications of NonUniform Sampling and Processing. Top. Curr. Chem. 2011, 316, 125− 148. (17) Hyberts, S. G.; Takeuchi, K.; Wagner, G. Poisson-Gap Sampling and Forward Maximum Entropy Reconstruction for Enhancing the Resolution and Sensitivity of Protein Nmr Data. J. Am. Chem. Soc. 2010, 132, 2145−2147. (18) Kwak, K.; Rosenfeld, D. E.; Fayer, M. D. Taking Apart the TwoDimensional Infrared Vibrational Echo Spectra: More Information and Elimination of Distortions. J. Chem. Phys. 2008, 128, 204505. (19) Guo, Q.; Pagano, P.; Li, Y.-L.; Kohen, A.; Cheatum, C. M. Line Shape Analysis of Two-Dimensional Infrared Spectra. J. Chem. Phys. 2015, 142, 212427. (20) Guo, Q.; Pagano, P.; Li, Y.-L.; Kohen, A.; Cheatum, C. M. Line Shape Analysis of Two-Dimensional Infrared Spectra. J. Chem. Phys. 2015, 142, 212427. 3093

DOI: 10.1021/acs.jpca.7b01965 J. Phys. Chem. A 2017, 121, 3088−3093