Letter pubs.acs.org/ac
Colors for Molecular Masses: Fusion of Spectroscopy and Mass Spectrometry for Identification of Biomolecules
Vladimir Kopysov,† Alexander Makarov,‡ and Oleg V. Boyarkin*,†
† Laboratoire de Chimie Physique Moléculaire, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
‡ Thermo Fisher Scientific, Hanna-Kunath Strasse 11, 28199 Bremen, Germany
Supporting Information
ABSTRACT: We present an approach that integrates ultraviolet (UV) photofragmentation spectroscopy of cold ions with high-resolution Orbitrap mass spectrometry (MS) and uses mathematical analysis of the recorded 2D data arrays for structural identification of biomolecules. The synergy of the two orthogonal techniques makes these arrays unique fingerprints of molecular ions, enabling their reliable identification. Using previously created libraries of fingerprints, the UV-MS approach was successfully applied for quantitative identification of exact isobaric molecules in their mixtures, which is one of the challenging cases for mass spectrometry. We also demonstrate how the UV and fragmentation mass spectra of unknown chemical components of a mixture can be recovered from its fingerprint even without the use of a library.
Received: March 2, 2015 Accepted: April 6, 2015
DOI: 10.1021/acs.analchem.5b00822 Anal. Chem. XXXX, XXX, XXX−XXX
© XXXX American Chemical Society

Structural identification of biomolecules remains at the core of life-science research on the fundamental level. Although analysis by state-of-the-art high-resolution mass spectrometry (HRMS) demonstrates impressively accurate determination of, for instance, peptide sequences, the identification of isobaric molecules (e.g., isomers) remains among the challenges of this technique.1,2 Coupling of MS to high-resolution LC or ion mobility (IM) often enables separation of sufficiently different isobaric species by tagging each of them with a retention or arrival time, respectively.3−5 The separation does not yet ensure a reliable identification of the isobars, because these tags are not fundamental to molecules and are sensitive to experimental conditions, which may not always be reproduced with sufficient accuracy.6−8 Optical spectroscopy, which is orthogonal to MS, LC, and IM separations, is another method of structural identification. When combined with cryogenic cooling of polyatomic ions, optical spectroscopy is capable of mapping out the ionic vibrational energy levels, which are characteristic of the 3D structure of the ions on a fundamental level.9−14

Here we present an approach to molecular structural identification that is based on integration of broadband HRMS with UV photofragmentation spectroscopy of cold ions. This fusion multiplies the selectivities of the two complementary techniques and produces highly specific two-dimensional (fragmentation yield vs UV wavelength and m/z) fingerprints of ionic structures. Once stored as libraries, these fingerprints can be used in further identifications of ions employing numerical algorithms of matrix analysis. Moreover, in certain cases the decomposition of the fingerprint of a mixture of different molecules allows for recovery of their UV and MS identities without the use of libraries. We demonstrate the capability of the approach to identify ions in mixtures of different singly protonated isobaric molecules, which are particularly difficult cases for MS even when coupled to LC or IM, and for which the electron capture/transfer dissociation method, for instance, is not applicable at all.

The physical foundation of the 2D UV-MS approach is the fact that UV fragmentation involves electronic excitation of ions and therefore may produce fragments specific to their structure, which are absent (or of low abundance) in the products of thermal-like dissociation.15,16 Compared with single-wavelength photofragmentation by VUV-UV light,17−20 in the 2D UV-MS approach the fragmentation patterns are recorded at several wavelengths within the absorption bands of UV chromophore groups. Vibrational resolution, achievable by cryogenic cooling of ions, makes these patterns sharply wavelength-dependent, adding a great number of details to the ionic fingerprints. This allows distinguishing very similar species, including isomers. The approach involves measuring the whole photofragmentation mass spectrum of cold ions at each wavelength of the UV dissociating laser while scanning it, followed by a mathematical treatment of the measured 2D data array to extract structural information (for details see the
Supporting Information). On the instrumentation side, we have coupled our cold ion spectroscopy instrument21 with the Exactive Orbitrap-based mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) by replacing the original ion source section of the mass spectrometer with a two-section octopole ion guide. The electrosprayed ions are preselected by a quadrupole ion filter, stored, and cooled in a linear cold trap, where they are fragmented by a UV laser pulse. The content of the cold trap is released, and both fragment and parent ions are guided into the Orbitrap mass analyzer. The 2D UV-MS spectra are measured by continuously recording the mass spectra on the Orbitrap while scanning the UV laser. The 2D spectra are converted into matrices and subsequently processed using MATLAB software. The rectified numerical fingerprints of isobaric ions are stored as a library of 2D data matrices and are then used for identification of the ions.

In order to decompose the UV-MS matrix of a mixture of isobaric molecules in a basis set of the respective library, we calculate the UV spectra at all the m/z values of the peaks found in the library. Then we project these spectra, as well as the spectra from the library, onto a common wavenumber scale. We normalize these spectra to the total number of ions in each MS scan. Thus, we obtain the matrix that corresponds to the measured sample and the matrices of the species from the library. The last step is to determine the coefficients of the linear combination of the library matrices that fits the matrix of the sample under study. We use the MATLAB built-in constrained linear least-squares solver with non-negativity constraints. Finally, we divide the calculated coefficients by their sum to find the relative concentrations of all library species in the sample.
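The final fitting step described above can be illustrated with a minimal sketch. It is not the authors' MATLAB code: it assumes the sample and library matrices have already been projected onto a common wavenumber scale and normalized per scan, and it swaps in SciPy's `scipy.optimize.nnls` as the non-negative least-squares solver. The function name `decompose_mixture` and the synthetic data are hypothetical.

```python
import numpy as np
from scipy.optimize import nnls


def decompose_mixture(sample, library):
    """Fit a mixture fingerprint as a non-negative combination of library fingerprints.

    sample  : 2D array (wavelengths x m/z peaks), assumed already normalized
    library : list of 2D arrays with the same shape as `sample`
    Returns the fitted coefficients divided by their sum.
    """
    # Flatten each library matrix into one column of the design matrix.
    basis = np.column_stack([np.asarray(m, dtype=float).ravel() for m in library])
    target = np.asarray(sample, dtype=float).ravel()
    # Non-negative least squares: minimize ||basis @ c - target|| with c >= 0.
    coeffs, _residual_norm = nnls(basis, target)
    total = coeffs.sum()
    if total == 0:
        raise ValueError("all coefficients are zero; sample does not match the library")
    return coeffs / total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for two library fingerprints (50 wavelengths x 8 fragments).
    lib = [rng.random((50, 8)) for _ in range(2)]
    mix = 0.3 * lib[0] + 0.7 * lib[1]
    print(decompose_mixture(mix, lib))
```

Flattening each 2D fingerprint into a vector turns the matrix fit into an ordinary linear least-squares problem with one unknown per library species, which is why a standard non-negative solver is sufficient here.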
We use the 2D UV-MS approach for the identification of the pentapeptide YAGFL with all naturally left-handed amino acids ([Ala2]-leucine enkephalin) and its stereoisomer [D-Ala2, D-Leu5]-enkephalin. It is challenging to distinguish these isomers, for instance, by MS alone because their fragmentation mass spectra are identical (Figure S1 in the Supporting Information) or by IM-MS because of the small difference in their collisional cross sections.3 In contrast to the MS spectra, the UV ones, obtained from the 2D fingerprints of the two species (Figure 1a), are strikingly different, enabling identification of the stereoisomers. We mixed the isomers in different relative concentrations in solution and measured the fingerprints of the mixtures. The corresponding data matrix of each mixture was represented as a linear combination of the two matrices of the isomers, and the two positive coefficients of the decomposition were varied using the least-squares algorithm to minimize the difference. The decompositions correctly identify the isomers, reproducing their relative concentrations in the solution with a root mean squared deviation (RMSD) of 5.4% (Figure 1b).

A more complex example is the identification of five isobaric phosphorylated peptides in their mixtures. Five fingerprints of singly protonated octapeptides TSAAATSY, which differ only by their single phosphorylation site (pT, pS, or pY), were measured and stored as a library (Figure S2 in the Supporting Information). Figure 2a shows a 2D plot of the fingerprint for a mixture of three library components. The decomposition of the matrices of 12 different test mixtures in the basis set of the five components correctly identifies the peptides in all our experiments and reproduces the relative concentrations of the peptides in the solutions with an RMSD of 6.5% (Figure 2b).

Figure 1. (a) UV spectra of two singly protonated stereoisomers of YAGFL peptide and their 1:1 mixture, plotted as slices of the respective UV-MS fingerprints at the fragment m/z = 462.23. For comparison, also depicted is the linear combination of the decomposed traces, taken with the derived coefficients of the decomposition. (b) Calculated relative concentrations of the two stereoisomers as a function of their relative solution concentrations (0−100%) for 12 different mixtures. The RMSD of 5.4% was calculated from the deviations of the calculations from the ideal, y = x, result (solid line). The two dashed lines show the interval of 99.7% confidence level (3 × RMSD) in calculating concentrations.

For comparison, a standard UPLC analysis could not unambiguously determine the number and the identity of the
components in a mixture of five peptides present in equal quantities (Figure S3 in the Supporting Information). An optimized LC analysis might separate these compounds at the price of losing universality but still may not ensure their unambiguous identification because of the known issues in reproducibility of retention times, including the phenomenon of peak inversion.7,8 Based on direct measurements of fundamental characteristics of ions, the UV-MS method is insensitive to moderate changes of the experimental conditions, making the fingerprints highly specific and reproducible.

An important issue for the method is whether it can be used for online identification, when the time for a fingerprint measurement is limited by the LC peak width to a few to tens of seconds. High-resolution fingerprints of library components, which we measure at thousands of UV wavelengths on a time scale of an hour, can only be obtained offline. A still specific fingerprint of a test mixture can be measured, however, utilizing only a few critical wavelengths, revealed by a preliminary analysis of the library. For the library of phosphopeptides considered above, the five UV absorption band origins give a minimum full set of the five critical wavelengths, which can be complemented each time by the wavelengths of nearby peaks. Figure 2c illustrates how the RMSD for the tested mixtures depends on the number of wavelengths retained in the truncated matrices of the mixtures and the library. A random choice of a few wavelengths results in unacceptably high RMSDs, which, however, converge close to the RMSD of the full decomposition for 100 wavelengths. Analysis employing a few but critical wavelengths drastically reduces the RMSD relative to results acquired with randomly selected wavelengths, approaching the RMSD of the full analysis at 10−20 wavelengths. An ultrahigh field Orbitrap analyzer, set to a moderate mass resolution of 6 × 10⁴, needs only about 0.15 s to complete a measurement.22 This duration corresponds to 3 s for measuring the fingerprint of a mixture at 20 wavelengths, suggesting that the rate of the UV-MS measurements can be sufficiently fast for online LC-MS identifications using previously measured libraries.

Figure 2. (a) UV-MS fingerprint for a 1:1:1 mixture of pTSA3TSY, TpSA3TSY, and TSA3TSpY singly protonated peptides. (b) Calculated relative concentrations of the five library isobars (dots) as a function of their relative solution concentrations (0−100%) for 12 different mixtures. The RMSD of 6.5% was determined from the deviations of the calculated concentrations from the solution concentrations (solid line). The two dashed lines show the interval of the 99.7% confidence level (3 × RMSD) in calculating concentrations. (c) RMSD values calculated for all the mixtures as in part b and plotted as a function of the number of wavelengths retained in the matrix decomposition (log scales). Each green dot represents an average over 100 randomly generated wavelength sets, and the error bar indicates the dispersion of the average for each set. Blue dots are the RMSDs calculated for the specific wavelengths related to the band origins of the five library peptides (Supporting Information).

Figure 3. (a) UV-MS fingerprint of the mixture of two exact isobaric peptides. (b) UV absorption spectra and (c) mass spectra of the two most important components (labeled as I and II), derived by factorization of the data matrix of the mixture (upper traces). For comparison, the experimental spectra obtained from the respective library fingerprints are shown below the derived traces. For graphical clarity, all spectra are normalized to their respective integral intensities.

When the library of isobaric compounds does not exist, the number of the components can be estimated and their UV and MS spectra can be derived from the fingerprint of the mixture using methods of matrix analysis, described in detail elsewhere.23−25 The UV-MS matrix of a mixture can be factorized, for instance, into two non-negative matrices, one of which represents the absorption spectra while the second matrix contains fragment mass spectra of mixture components. If k denotes the number of such components, then the n-by-m UV-MS matrix D can be factorized into an n-by-k matrix W of UV spectra and a k-by-m matrix H of mass spectra, so that their
matrix product WH is a lower-rank approximation to D. For a given k, the MATLAB built-in function “nnmf”, which employs the iterative alternating least-squares algorithm, performs such factorization. The number of important components is determined using the bi-cross-validation (BCV) approach;26 namely, the value of k that minimizes the BCV error is taken as an estimate of the number of components. It is fundamental to the approach that only components that differ both in absorption and in fragmentation patterns can be revealed mathematically.

Figure 3 illustrates such “blind” analysis for a mixture of two isobaric peptides, pTSA3TSY and TSA3pTSY, in which the same but differently located residues are phosphorylated. The use of non-negative matrix factorization and the bi-cross-validation approach (Supporting Information, Figure S5), indeed, suggests the most likely presence of two components in this mixture and recovers their individual UV and mass spectra (Figure 3b,c). The derived spectra match well the UV and mass spectra obtained by integration of the respective library fingerprints over wavelength and m/z, respectively. The relative contribution of each component to the measured UV-MS fingerprint is calculated as the sum of all elements of the matrix constructed as a product of the respective column and row of the derived UV and MS matrices, respectively. These contributions reflect individual photodissociation yields and concentrations of the mixed peptides. In order to estimate relative concentrations of the components, their relative contributions have to be normalized to the total integrals of the respective library fingerprints (i.e., to the sum of all elements of a matrix). Such normalization accounts for the total photofragmentation yields of the components. The estimated relative concentrations of the components, 52% and 48%, are in very good agreement with their 50:50% solution concentrations.
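The blind factorization D ≈ WH and the per-component contributions described above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure used in this work: instead of the alternating least-squares algorithm of MATLAB's “nnmf”, it uses Lee-Seung-style multiplicative updates written directly in NumPy, k is supplied by the caller rather than chosen by bi-cross-validation, and the function names `nmf` and `component_contributions` are hypothetical.

```python
import numpy as np


def nmf(D, k, n_iter=2000, seed=0, eps=1e-12):
    """Factorize a non-negative n-by-m matrix D into W (n-by-k) and H (k-by-m)."""
    D = np.asarray(D, dtype=float)
    rng = np.random.default_rng(seed)
    n, m = D.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ D) / (W.T @ W @ H + eps)
        W *= (D @ H.T) / (W @ H @ H.T + eps)
    return W, H


def component_contributions(W, H):
    """Fraction of the fingerprint carried by each rank-1 term W[:, i] * H[i, :]."""
    # The sum of all elements of an outer product equals the product of the sums.
    totals = W.sum(axis=0) * H.sum(axis=1)
    return totals / totals.sum()


if __name__ == "__main__":
    # Synthetic two-component fingerprint: two UV bands, two fragment patterns.
    wl = np.linspace(0.0, 1.0, 40)
    uv = np.stack([np.exp(-((wl - 0.3) / 0.05) ** 2),
                   np.exp(-((wl - 0.7) / 0.05) ** 2)], axis=1)
    ms = np.array([[1.0, 0.5, 0.0, 0.0, 0.2],
                   [0.0, 0.1, 1.0, 0.6, 0.2]])
    D = uv @ ms
    W, H = nmf(D, k=2)
    print(component_contributions(W, H))
```

As noted in the text, these contributions mix concentration with photofragmentation yield, so they would still need to be normalized to the total integrals of the corresponding library fingerprints before being read as relative concentrations.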
Similar results have been obtained for the decomposition of the fingerprint measured for a mixture of two identical peptides singly phosphorylated on two different serine residues (Figures S4 and S5 in the Supporting Information). Overall, the data presented here validate the use of the UV-MS approach for qualitative identification of unknown UV-absorbing components in their mixtures. The derived UV spectra may help in determining the chromophore groups in the revealed components, using, for instance, the positions of UV band origins in different aromatic residues.12 An analysis of the fragment mass spectra may further narrow the search for the suspected molecules or even fully identify them using available MS databases.

The exact limitations for applicability of the UV-MS approach are yet to be assessed. Attaining vibrational resolution in the UV spectra of species is the principal condition for the successful use of the UV-MS approach. This is unlikely to be the case, for instance, for even relatively small proteins (e.g., cytochrome c), whose spectra can be broadened by significant conformational heterogeneity and the rich vibrational structure of large species. The large size of proteins also reduces their UV fragmentation yield, making MS detection of the fragments difficult. In contrast, there are examples of vibrationally resolved UV spectra of cold protonated peptides containing up to 17 amino acids.27,28 These data suggest the applicability of the UV-MS technique, in particular, to tryptic peptides, which are on the same size scale.

In summary, we have presented a potentially broadly useful approach for structural identification of UV-absorbing molecules and demonstrated its application for library-based and blind identifications of exact isobaric ions. The hardware and operational conditions utilized for the approach are not specific to molecules and should be suitable for fingerprinting of any small to midsize UV-absorbing species compatible with MS analysis, such as aromatic peptides, drugs, metabolites, etc. Our analysis demonstrates that UV-MS fingerprinting also has the potential to be coupled to LC for online identifications in a broad range of applications.
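The timing argument behind the proposed LC coupling can be written out as a short worked calculation. The sketch below assumes one Orbitrap scan per UV wavelength at the quoted 0.15 s per scan and ignores any laser tuning or ion-accumulation overhead, which the text does not quantify; the function names and the example peak widths are hypothetical.

```python
def fingerprint_time(n_wavelengths, scan_time_s=0.15):
    """Acquisition time for one mass spectrum at each wavelength (overheads ignored)."""
    return n_wavelengths * scan_time_s


def fits_lc_peak(n_wavelengths, peak_width_s, scan_time_s=0.15):
    """True if the reduced fingerprint can be recorded within one LC peak width."""
    return fingerprint_time(n_wavelengths, scan_time_s) <= peak_width_s


if __name__ == "__main__":
    # 20 wavelengths at 0.15 s per scan gives about 3 s, as quoted in the text.
    print(fingerprint_time(20))
    print(fits_lc_peak(20, peak_width_s=5.0))
    print(fits_lc_peak(100, peak_width_s=10.0))
```

Under these assumptions a set of 10−20 critical wavelengths fits within an LC peak a few seconds wide, whereas about 100 wavelengths would need roughly 15 s.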
■ ASSOCIATED CONTENT
Supporting Information
Details of the experiment and the matrix analysis procedure, HCD and UV photofragmentation mass spectra, library of phosphopeptides, and results of LC analysis. This material is available free of charge via the Internet at http://pubs.acs.org.
■ AUTHOR INFORMATION
Corresponding Author
*E-mail: oleg.boiarkin@epfl.ch.
Notes
The authors declare no competing financial interest.
■ ACKNOWLEDGMENTS
We thank EPFL’s MS service (L. Menin) for LC measurements. The project was supported by the Fonds National Suisse (Grant 200021_146389/1).
■ REFERENCES
(1) He, F.; Emmett, M. R.; Håkansson, K.; Hendrickson, C. L.; Marshall, A. G. J. Proteome Res. 2003, 3, 61−67.
(2) Creese, A. J.; Smart, J.; Cooper, H. J. Anal. Chem. 2013, 85, 4836−4843.
(3) Wu, C.; Siems, W. F.; Klasmeier, J.; Hill, H. H. Anal. Chem. 2000, 72, 391−395.
(4) Winter, D.; Pipkorn, R.; Lehmann, W. D. J. Sep. Sci. 2009, 32, 1111−1119.
(5) Shvartsburg, A. A.; Creese, A. J.; Smith, R. D.; Cooper, H. J. Anal. Chem. 2011, 83, 6918−6923.
(6) Norbeck, A. D.; Monroe, M. E.; Adkins, J. N.; Anderson, K. K.; Daly, D. S.; Smith, R. D. J. Am. Soc. Mass Spectrom. 2005, 16, 1239−1249.
(7) Riddle, L. A.; Guiochon, G. Chromatographia 2006, 64, 121−127.
(8) Tarasova, I. A.; Perlova, T. Y.; Pridatchenko, M. L.; Goloborod’ko, A. A.; Levitsky, L. I.; Evreinov, V. V.; Guryca, V.; Masselon, C. D.; Gorshkov, A. V.; Gorshkov, M. V. J. Anal. Chem. 2012, 67, 1014−1025.
(9) Boyarkin, O. V.; Mercier, S. R.; Kamariotis, A.; Rizzo, T. R. J. Am. Chem. Soc. 2006, 128, 2816−2817.
(10) Nagornova, N. S.; Guglielmi, M.; Doemer, M.; Tavernelli, I.; Rothlisberger, U.; Rizzo, T. R.; Boyarkin, O. V. Angew. Chem., Int. Ed. 2011, 50, 5383−5386.
(11) Nagornova, N. S.; Rizzo, T. R.; Boyarkin, O. V. Science 2012, 336, 320−323.
(12) Garand, E.; Kamrath, M. Z.; Jordan, P. A.; Wolk, A. B.; Leavitt, C. M.; McCoy, A. B.; Miller, S. J.; Johnson, M. A. Science 2012, 335, 694−698.
(13) Gloaguen, E.; Loquais, Y.; Thomas, J. A.; Pratt, D. W.; Mons, M. J. Phys. Chem. B 2013, 117, 4945−4955.
(14) Feraud, G.; Dedonder, C.; Jouvet, C.; Inokuchi, Y.; Haino, T.; Sekiya, R.; Ebata, T. J. Phys. Chem. Lett. 2014, 5, 1236−1240.
(15) Tabarin, T.; Antoine, R.; Broyer, M.; Dugourd, P. Rapid Commun. Mass Spectrom. 2005, 19, 2883−2892.
(16) Kopysov, V.; Nagornova, N. S.; Boyarkin, O. V. J. Am. Chem. Soc. 2014, 136, 9288−9291.
(17) Brodbelt, J. S. Chem. Soc. Rev. 2014, 43, 2757−2783.
(18) Cui, W.; Thompson, M.; Reilly, J. J. Am. Soc. Mass Spectrom. 2005, 16, 1384−1398.
(19) Joly, L.; Antoine, R.; Broyer, M.; Dugourd, P.; Lemoine, J. J. Mass Spectrom. 2007, 42, 818−824.
(20) Yeh, G.; Sun, Q.; Meneses, C.; Julian, R. J. Am. Soc. Mass Spectrom. 2009, 20, 385−393.
(21) Rizzo, T. R.; Stearns, J. A.; Boyarkin, O. V. Int. Rev. Phys. Chem. 2009, 28, 481−515.
(22) Scheltema, R. A.; Hauschild, J.-P.; Lange, O.; Hornburg, D.; Denisov, E.; Damoc, E.; Kuehn, A.; Makarov, A.; Mann, M. Mol. Cell. Proteomics 2014, 13, 3698.
(23) Paatero, P.; Tapper, U. Environmetrics 1994, 5, 111−126.
(24) Lee, D. D.; Seung, H. S. Nature 1999, 401, 788−791.
(25) Berry, M. W.; Browne, M.; Langville, A. N.; Pauca, V. P.; Plemmons, R. J. Comput. Stat. Data Anal. 2007, 52, 155−173.
(26) Owen, A. B.; Perry, P. O. Ann. Appl. Stat. 2009, 3, 564−594.
(27) Guidi, M.; Lorenz, U. J.; Papadopoulos, G.; Boyarkin, O. V.; Rizzo, T. R. J. Phys. Chem. A 2009, 113, 797−799.
(28) Nagornova, N. S.; Rizzo, T. R.; Boyarkin, O. V. J. Am. Chem. Soc. 2010, 132, 4040−4041.