Anal. Chem. 2007, 79, 1702-1707
Technical Notes
Nonparametric Mass Calibration Using Hundreds of Internal Calibrants
Christopher H. Becker,* Praveen Kumar, Ted Jones, and Hua Lin
PPD, Inc., 1505 O’Brien Drive, Menlo Park, California 94025
In situations where many molecular ions (>100) can be identified to the level of their elemental composition, such as in proteomics, metabolomics, and glycomics, a final mass calibration is possible for every sample without reliance on any analytical description of instrument behavior. This is achieved by applying a nonparametric calibration curve determined from the difference between the observed, centroided m/z values of the known, internal calibrant molecular ions and the values calculated from their elemental compositions, over the m/z range. In the examples here, proteomic data are examined for two sets of samples of complex mixtures composed of tryptic peptides from human and mouse blood proteins, using high-resolution time-of-flight mass spectra from on-line liquid chromatography-mass spectrometry experiments. The resulting postcalibration median absolute mass errors and root-mean-square errors for peptides between 300 and 1100 m/z, over many samples, ranged from 3.1 to 4.4 ppm and from 5.2 to 6.9 ppm, respectively. The method may be applied to other types of mass spectrometers.
Many papers have been written on a multitude of aspects of mass calibration for a range of instrument types. In addition, instrument manufacturers provide high-quality calibration procedures with their equipment. Nevertheless, it is well-known that further improvements in mass accuracy are possible after calibration of the instrument, because small changes in instrument behavior with time, temperature, or sample load can lead to noticeable changes in mass accuracy. In the study of complex materials such as biological samples, relatively modest improvements in mass accuracy, say from 500 to 50 ppm, or from 50 to 5 ppm or less, can have major consequences for enabling identification of peptides, metabolites, or other molecules. This report focuses on time-of-flight (TOF) instrumentation, although the approach may be applicable to other types of mass spectrometers.
* Corresponding author. Phone: (650) 470-2386. Fax: (650) 470-2400. E-mail: [email protected].
(1) Cotter, R. J. Time-of-Flight Mass Spectrometry; American Chemical Society: Washington, DC, 1997.
1702 Analytical Chemistry, Vol. 79, No. 4, February 15, 2007
For a review of TOF instruments including
information on instrument calibration, see Cotter (1) and Guilhaus et al. (2). A variety of recent studies explore a range of calibration issues and approaches (3-8). Especially noteworthy is the work of Egelhofer et al. (5), who used the identification of peptides in a peptide mapping experiment as a means of recalibrating a spectrum and improving protein identification confidence. Other recent contributions include the investigation of the effect of signal strength on calibration (6), temporal effects (instrument drift) during data acquisition over the duration of a chromatographic run in a liquid chromatography-mass spectrometry (LC-MS) experiment (7), and a Bayesian probability analysis to derive a linear calibration correction using partial knowledge of sample composition with data from liquid chromatography-Fourier transform ion cyclotron resonance mass spectrometry (8).
This technical note adds two new aspects to the subject of mass spectrometric calibration: the use of large numbers (say, >100) of internal calibrants, combined with a nonparametric estimation of the calibration curve. Large numbers of previously, or at least tentatively, identified molecular ions, such as tryptic peptides, are used to perform a global recalibration so that as yet unidentified molecules can be calibrated to high accuracy over a broad mass range, thus making their identification more likely. The recalibration can also expose incorrect identifications from the initial pass at identification, and it can be used as part of a strategy to assess the final probability of correct molecular assignment. Furthermore, this process can be performed for individual samples, and the median or mean can then be computed across many similar samples to further improve mass and identification accuracies.
(2) Guilhaus, M.; Selby, D.; Mlynski, V. Mass Spectrom. Rev. 2000, 19, 65-107.
The approach reported here does not rely on any particular functional form for the calibration curve. This is possible when
using a large number of calibration points. Similarly, small deviations from predicted behavior may become apparent with this approach, which could be useful in understanding subtle instrumental effects.
(3) Michelsen, P.; Karlsson, A. A. Rapid Commun. Mass Spectrom. 1999, 13, 2146-2150.
(4) Hakansson, K.; Zubarev, R. A.; Hakansson, P.; Laiko, V.; Dodonov, A. F. Rev. Sci. Instrum. 2000, 71, 36-41.
(5) Egelhofer, V.; Bussow, K.; Luebbert, C.; Lehrach, H.; Nordhoff, E. Anal. Chem. 2000, 72, 2741-2750.
(6) Blom, K. F. Anal. Chem. 2001, 73, 715-719.
(7) Strittmatter, E. F.; Rodriguez, N.; Smith, R. D. Anal. Chem. 2003, 75, 460-468.
(8) Yanofsky, C. M.; Bell, A. W.; Lesimple, S.; Morales, F.; Lam, T. T.; Blakney, G. T.; Marshall, A. G.; Carrillo, B.; Lekpor, K.; Boismenu, D.; Kearney, R. E. Anal. Chem. 2005, 77, 7246-7254.
10.1021/ac061359u CCC: $37.00 © 2007 American Chemical Society. Published on Web 01/11/2007
In the approach here, a calibration curve is obtained by smoothing the difference in m/z between the experimentally observed peak positions and those predicted from elemental composition, as a function of m/z. The differences in m/z (in thomsons) are sometimes expressed in this report in millithomsons (mTh) as well as in parts-per-million (ppm).
Recent advances in proteomics and metabolomics have made routine the ability to identify hundreds or thousands of molecular ions in a sample, including their elemental composition (9-13). Achieving this level of identification requires an initial instrument calibration, plus prior tandem mass spectrometric experience with similar samples or a first-pass analysis of the samples in question. However, after this initial identification set has been established, identification uncertainties exist and many more ions remain to be identified or confirmed, which motivates optimal mass accuracy. A common biological sample is human or rodent serum or tissue, but the method is not limited to these, to biological samples in general, or to any specific type of mass spectrometer. Additionally, this global calibration approach is not limited to a one-dimensional LC-MS strategy, or even to LC-MS. Multidimensional LC-MS, matrix-assisted laser desorption/ionization, or other ionization methods may similarly be used.
EXPERIMENTAL METHODS
Sample Preparation and Analysis. For the examples of this technical note, two proteomic studies were performed with tryptic digests of a set of 30 human plasma samples and a set of 240 mouse serum samples. Proteins were denatured and disulfide bonds reduced and carboxymethylated before digestion.
A one-dimensional LC-MS experiment was performed with on-line capillary reversed-phase chromatography and electrospray ionization (ESI) (13-15). The HPLC system was manufactured by Agilent Technologies Inc. (Palo Alto, CA, model capillary 1100). The handling of these biological materials was performed in accordance with U.S. Department of Health and Human Services guidelines for level 2 laboratory biosafety, as found in Biosafety in Microbiological and Biomedical Laboratories, 4th ed., HHS Publication no. (CDC) 93-8395. The tryptic peptides were analyzed on high-resolution (R > 5000), orthogonal injection, ESI time-of-flight instruments manufactured by Waters Corp. (model LCT, Milford, MA) and Bruker BioSciences Corp. (model microTOF, Billerica, MA). These are not tandem mass spectrometers, although tandem mass spectrometers can also be the subject of this approach to calibration.
Initial Calibration. In these experiments, there were three stages of calibration for data obtained on these high-resolution TOF instruments. The first two stages provide enough accuracy to obtain a reliable list of calibrants needed for the final stage.
(9) Mann, M.; Wilm, M. Anal. Chem. 1994, 66, 4390-4399.
(10) Yates, J. R. 3rd; Eng, J. K.; McCormack, A. L.; Schieltz, D. Anal. Chem. 1995, 67, 1426-1436.
(11) Gygi, S. P.; Rist, B.; Gerber, S. A.; Turecek, F.; Gelb, M. H.; Aebersold, R. Nat. Biotechnol. 1999, 17, 994-999.
(12) Smith, R. D.; Anderson, G. A.; Lipton, M. S.; Pasa-Tolic, L.; Shen, Y.; Conrads, T. P.; Veenstra, T. D.; Udseth, H. R. Proteomics 2002, 2, 513-523.
First, instrument calibration is typically performed once every month using a formula of
$$(m/z)_1^{1/2} = A + Bx + Cx^2 + Dx^3 + Ex^4 + Fx^5$$
where $x = (m/z)_0^{1/2}$
and $(m/z)_0$ is the initial value of m/z reported by the instrument as provided by the manufacturer. The coefficients A-F are determined by infusing a mixed PEG200 and PEG600 solution and fitting the equation by linear regression using ∼30 protonated and sodiated molecular ions. Second, after data acquisition of samples, each data file (sample) is recalibrated using a phthalate impurity ion (391.285 Da) at one point in time, at the start of the LC-MS chromatographic elution. This is known as a single lock-mass correction. The formula used is
$$(m/z)_2 = (m/z)_1 \left(\frac{391.285}{K}\right)$$
where K is the observed m/z of the phthalate lock-mass ion.
Third, the final calibration stage for each sample, which uses many identified ions as internal calibrants and is the subject of this technical note, was performed as follows.
Identification of Internal Calibrants. A set of many (say, more than 100) calibrant molecules is at least tentatively identified in the present sample set, or in samples of similar type, with the identification including elemental composition so that highly accurate masses can be computed. These molecules are intrinsic to the samples and thus can be considered internal calibrants. The monoisotopic mass, that of the lowest-mass peak in the isotopic pattern, is used in all calculations. In this study, tryptic peptides of human or mouse blood proteins were identified on equipment other than that used to perform the accurate mass measurements. (The TOF instruments mentioned above were used to perform differential expression measurements, also known as profiling, on samples from multiple cohorts.) Identification of peptides was performed with an ion trap tandem mass spectrometer (Thermo Electron, model LTQ, Waltham, MA) or a Q-TOF tandem mass spectrometer (Waters Corp., model microQTOF) in combination with database comparison using the MASCOT program (Matrix Science, London, UK). Given a peptide sequence, a highly accurate computed mass and m/z (depending on the charge state z) are then readily generated from the elemental composition. In this study, the minimum Mascot score used for calibrant peaks was 35. Nearly identical chromatography was performed on all LC-MS systems, using the same model of HPLC system. Identifications were connected to the profiling data on the high-resolution TOF analyzers through a matching (mapping) of (i) the TOF-observed m/z with the m/z calculated from elemental composition and (ii) the retention index of the peaks observed on the TOF mass spectrometers and on the tandem mass spectrometers.
The retention index
The retention index (13) Wang, W.; Zhou, H.; Lin, H.; Roy, S.; Shaler, T. A.; Hill, L. R.; Norton, S.; Kumar, P.; Anderle, M.; Becker, C. H. Anal. Chem. 2003, 75, 4818-4826. (14) Roy, S. M.; Anderle, M.; Lin, H.; Becker, C. H. Int. J. Mass Spectrom. 2004, 238, 163-171. (15) Roy, S. M.; Becker, C. H. In Quantitative Proteomics by Mass Spectrometry; Sechi, S., Ed.; Humana Press: Totowa, NJ, 2007; Vol. 359, Chapter 6.
Analytical Chemistry, Vol. 79, No. 4, February 15, 2007
1703
Figure 1. Exemplary plot of a calibration curve (heavy points) for one sample (file) based on 1657 tryptic peptide molecular ions from digested human plasma proteins. Small data points are those from the individual calibrant molecular ions. The values of ∆m/z are based on observed centroided peak positions versus that calculated based on elemental composition.
is based on retention time and uses ∼25 preselected ions to calibrate slight shifts in retention time between instruments to adjust for small pump or column variations. In this study, there were only minor differences in retention times between the highresolution LC-TOF-MS profiling data and the LC-MS/MS data, and equivalent results could be obtained just by using retention time. Nominally identical chromatography systems are used for MS-only profiling and tandem mass spectrometers. The uncertainty in m/z for the initial identifications, and LC-MS with LCMS/MS data matching, is constrained as much as practical, but is further constrained in m/z by the final calibration step. The initial matching or linking requirement chosen for this report was (60 mTh for the mass uncertainty and ∼ (2 min for the retention time (1-h gradient). Having obtained a list of calibrant molecules for the samples, the calibration information consisting of computed m/z and retention index is then mapped onto each high-resolution mass spectral data file. There are at least two ways to do this. In this study, molecular ions common to all samples, called components, were first determined across all files.13-16 This allows software corrections such as chromatographic time warping13 to a common time base, if necessary. Then the calibration information is mapped onto the component values with small tolerances in m/z and retention time or index, and through the common components, each file is immediately also mapped. Alternatively, it is possible to calibrate each file independently before determining common components across files, but no use is then made of the computation machinery to identify common components. In either case, the result is a table of molecular ions with tentative elemental composition assignments with associated observed m/z as determined after the first two steps of calibration described above. Final Calibration. Each file (sample) is handled independently. 
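Before the final calibration can run, each file needs its table of linked calibrants. The linking rule stated above (observed m/z within ±60 mTh of the computed m/z, and retention time within about ±2 min) can be illustrated with a minimal sketch. This is not the software used in this work: the names `Calibrant`, `Peak`, and `link_calibrants` are hypothetical, retention time stands in for retention index, and choosing the closest peak in m/z when several qualify is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Calibrant:
    name: str
    mz_theoretical: float  # computed from elemental composition (Th)
    rt_min: float          # expected retention time (min)


@dataclass
class Peak:
    mz_observed: float     # after the first two calibration stages (Th)
    rt_min: float          # observed retention time (min)


def link_calibrants(peaks, calibrants, mz_tol=0.060, rt_tol=2.0):
    """Pair each calibrant with a peak inside both tolerance windows.

    mz_tol is in Th (0.060 Th = 60 mTh); rt_tol is in minutes.
    If several peaks qualify, the one closest in m/z is kept (assumption).
    """
    links = []
    for cal in calibrants:
        candidates = [
            p for p in peaks
            if abs(p.mz_observed - cal.mz_theoretical) <= mz_tol
            and abs(p.rt_min - cal.rt_min) <= rt_tol
        ]
        if candidates:
            best = min(candidates,
                       key=lambda p: abs(p.mz_observed - cal.mz_theoretical))
            links.append((cal, best))
    return links
```

Each returned pair supplies one observed value and one theoretical value for the error calculation in the final calibration stage. Calibrants with no peak inside both windows are simply left out for that file.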
(16) Anderle, M.; Roy, S.; Lin, H.; Becker, C.; Joho, K. Bioinformatics 2004, 20, 3575-3582.
In order to improve both the accuracy and precision of the m/z measurement, the calibrant peaks are centroided by fitting a single Gaussian function using only the top half of the peak (intensities above 50% of the maximum). Avoiding the lower signal part of the peaks minimizes error due to peak asymmetry. In the current spectra, the top half shows consistency and little asymmetry, and thus little error is expected from this simple fitting, although refinements of the fit remain possible in the future.
If we have n calibrant peaks, after centroiding we obtain n observed m/z values $u_1, u_2, \ldots, u_n$. We will assume the calibrant peaks are ordered so that the sequence $u_i$ is increasing in m/z. Using the elemental composition of each calibrant, we can compute the corresponding theoretical m/z values $t_1, t_2, \ldots, t_n$ and the errors $\delta_i = u_i - t_i$. At this stage, a scatter plot of $\delta_i$ versus $t_i$ shows a systematic dependence on m/z (see Figure 1). The purpose of the final recalibration is to remove this remaining error in m/z. We model the errors $\delta_i$ as an unknown smooth function of m/z plus a random error term
$$\delta_i = f(u_i) + e_i$$
where the random errors $e_i$ are independent with mean zero. In a scatter plot like Figure 1, the function f represents the calibration curve and $e_i$ accounts for the scatter about that curve. The unknown function f could be estimated by modeling it as a polynomial and fitting it by linear regression, but polynomial fits have the disadvantage that they may produce misleading results if they are inappropriate for the data. Polynomial fitting is also nonlocal, meaning that the position of the fitted calibration curve at a given point depends on every data point, including distant ones. For these reasons, we prefer to estimate the calibration curve using a locally fitted smoother.
Figure 2. Histograms of the mass error for the human plasma study: before and after final calibration for one file, for one file using only those calibrant peaks with very high identification confidence (Mascot scores greater than 60), and the result of averaging over 30 samples for all scores. Vertical dashed lines show the region of final m/z uncertainty for acceptance of peptide sequence identifications, set in this work at ±10 mTh.
There are many excellent choices for such smoothers, such as smoothing splines or the loess smoother (17). For the purposes of this paper, we used a running median smoother, as implemented in the function “runmed” from the open source R statistical package (18), largely because of its simplicity of implementation. This smoother computes estimated points $(u_i, f(u_i))$ on the calibration curve by taking medians of the measurement errors $\delta_i$ in a sliding window of odd width $2k + 1$:
$$f(u_i) = \mathrm{median}(\delta_{i-k}, \delta_{i-k+1}, \ldots, \delta_i, \ldots, \delta_{i+k-1}, \delta_{i+k})$$
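As an illustration of the sliding-window median above, the following minimal Python sketch computes the smoothed value at every calibrant. It is not the R `runmed` routine used in this work: the function name `running_median` is hypothetical, and the end handling is a simplifying assumption (the half-width shrinks so that the window stays symmetric, which leaves the first and last points at their raw values rather than applying Tukey's end rule).

```python
import statistics


def running_median(delta, k):
    """Running median of the errors delta with half-width k.

    delta must be ordered by increasing observed m/z. Near either end the
    half-width is reduced so the window stays symmetric about point i
    (assumption of this sketch; the two end points are left unsmoothed).
    """
    n = len(delta)
    smoothed = []
    for i in range(n):
        h = min(k, i, n - 1 - i)  # largest symmetric half-width that fits
        window = delta[i - h:i + h + 1]
        smoothed.append(statistics.median(window))
    return smoothed


# Errors in mTh with one gross outlier (50.0) that the median ignores.
errors = [1.0, 2.0, 50.0, 3.0, 4.0, 5.0, 6.0]
curve = running_median(errors, k=1)
```

With `k=1` the outlier at the third position is replaced by the median of its neighbors, which is the resistance to outliers that makes this smoother attractive for calibrant lists containing some misidentifications.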
For the value of k we used a value approximately equal to n/10, so the smoothing window covered ∼20% of the data. Near the first and last calibrant peaks, the value of k is reduced to maintain a symmetric window. At the very first and last points, $f(u_1)$ and $f(u_n)$ are estimated using Tukey’s end rule (18, 19). The calibration curve is completed by linear interpolation between the points obtained from the smoother.
The running median smoother is quite resistant to outliers. Nevertheless, we investigated a second-pass fit by removing any calibrant peaks whose distance $|\delta_i - f(u_i)|$ from the calibration curve is greater than 10 mTh and then reapplying the running median smoother without these outlier points. This filtering of calibrant peaks beyond a predetermined distance from the median smoother is relevant because an uncertainty cutoff is used in the identification process. In these studies, final identifications were not accepted if the molecular ion’s observed m/z was not within 10 mTh of that predicted from the sequence and elemental composition. For the set of calibration curves used in this report, the calibration curve obtained with the initial linking requirement of ±60 mTh was nearly superimposable on that from a second pass having a ±10 mTh acceptance, showing the median smoother’s resistance to outliers. To be clear, while a second pass may be used, the approach presented here should not be considered an intrinsically iterative technique.
(17) Hastie, T. J.; Tibshirani, R. Generalized Additive Models; Chapman and Hall: New York, 1990.
(18) R Development Core Team (2006). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.
(19) Tukey, J. W. Exploratory Data Analysis; Addison-Wesley: Reading, MA, 1977.
Having obtained the calibration curve, final recalibrated m/z values are computed by subtracting the estimated error from $(m/z)_2$:
$$(m/z)_3 = (m/z)_2 - f\big((m/z)_2\big)$$
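The subtraction above needs the calibration curve at arbitrary m/z values, not only at the calibrant positions. A minimal sketch of that step, assuming the smoothed points $(u_i, f(u_i))$ are already available, is given below. The function names `interpolate` and `recalibrate` are hypothetical, and holding the curve constant outside the calibrant range is an assumption of this sketch, since the text does not say how such values are treated.

```python
from bisect import bisect_right


def interpolate(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x.

    xs must be strictly increasing. Outside [xs[0], xs[-1]] the nearest end
    value is returned (assumption; extrapolation is not specified).
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_right(xs, x)  # xs[j - 1] <= x < xs[j]
    t = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return ys[j - 1] + t * (ys[j] - ys[j - 1])


def recalibrate(mz2_values, u, f_u):
    """Return stage-3 m/z values: (m/z)_2 minus the interpolated error."""
    return [mz - interpolate(mz, u, f_u) for mz in mz2_values]


# Three calibration points (m/z in Th, error in Th) and three peaks to correct.
u = [400.0, 600.0, 800.0]
f_u = [0.010, 0.020, 0.000]
mz3 = recalibrate([500.0, 700.0, 300.0], u, f_u)
```

In this toy example the peak at 500.0 receives a correction of 0.015 Th, halfway between the errors of its two neighboring calibration points.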
Outliers are generally a result of either misidentification or spectral interference. Spectral interferences for these complex samples are not rare and vary in degree, but they do not overwhelm the data. Additional checking for interferences can resolve many identifications.
RESULTS AND DISCUSSION
As part of differential expression (profiling) experiments (13-16), tryptic peptides were analyzed in two studies from numerous samples of human plasma and mouse serum. An exemplary calibration curve for one sample is shown in Figure 1 for human plasma tryptic peptides. In this case, 1657 calibrant molecular ions were employed. These calibrants actually represent a small subset of those possible for use in this one-dimensional chromatography study. Given the large number of potential calibrants available (∼6000/sample in these examples), filtering based on identification score and signal intensity was applied before their use. Both intense mass peaks (saturation effects) and weak mass peaks (low signal-to-noise ratios) were removed as possible calibrants. However, exactly how such filtering of an initial calibrant list is performed is not critical, given the large number of possible candidates and the general insensitivity of the calibration mathematics to outliers. One calibrant list is sufficient for all samples of a given type, such as human or mouse blood proteins.
Figure 1 is a typical result for a calibration curve, but the exact shape and behavior can vary substantially between samples, especially samples taken days or weeks apart or taken on different instruments. In a certain sense, the calibration curve’s deviation from a flat line through zero difference simply means the instrument is slightly out of calibration. Because the procedure calibrates many samples after acquisition, it is a practical solution.
After calibration by adjusting the m/z values of these calibrant (identified) peaks using the calibration curve, an evaluation of the method’s performance can be made. Histograms of the difference in m/z between observed and predicted values (∆m/z) before and after final calibration are shown for the human plasma calibrants in Figure 2 for one sample (file). Also shown in Figure 2 are results using only calibrant peaks with Mascot scores of >60, which leaves 641 of the 1657 peaks. This selection was performed as a test to see whether identification confidence was playing a significant role in the results for all calibrant peaks. Identification confidence does not appear to play a substantial role under these conditions. Also in Figure 2 is a histogram showing the results after averaging the ∆m/z across 30 samples for each component. Some small improvement from 1 to 30 samples is observed. Again, each file is individually calibrated. Averaging across samples (files) consists of taking the arithmetic mean of each molecular ion’s calibrated ∆m/z. The fact that only a small improvement is obtained when averaging over multiple files suggests that the residual error
Table 1. Summary of Median Errors and rmse of |∆m/z| for Calibrant Ions: Human Plasma Study
                        median error        rmse
                        mTh     ppm         mTh     ppm
1 file, scores >60      2.4     4.0         4.0     6.2
1 file, all scores      2.7     4.3         4.1     6.9
4 files                 2.6     4.1         4.0     6.8
16 files                2.5     4.1         4.0     6.8
30 files                2.5     4.1         4.0     6.7
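The two error metrics reported in Table 1 (the median of |∆m/z| and the rmse, each in mTh and in ppm), together with the per-ion averaging across files, can be illustrated with a minimal sketch. The function names `error_summary` and `mean_delta_per_ion` are hypothetical, and expressing ppm relative to the theoretical m/z of each ion is an assumption, since the text does not state the denominator used.

```python
import math
import statistics


def error_summary(mz_observed, mz_theoretical):
    """Median absolute error and rmse of delta m/z, in mTh and in ppm."""
    delta = [o - t for o, t in zip(mz_observed, mz_theoretical)]  # in Th
    mth = [1e3 * d for d in delta]
    # ppm taken relative to the theoretical m/z (assumption).
    ppm = [1e6 * d / t for d, t in zip(delta, mz_theoretical)]

    def rmse(values):
        return math.sqrt(sum(v * v for v in values) / len(values))

    return {
        "median_abs_mTh": statistics.median([abs(v) for v in mth]),
        "median_abs_ppm": statistics.median([abs(v) for v in ppm]),
        "rmse_mTh": rmse(mth),
        "rmse_ppm": rmse(ppm),
    }


def mean_delta_per_ion(delta_by_file):
    """Arithmetic mean of each ion's calibrated delta m/z across files.

    delta_by_file holds one list per file, with ions in the same order.
    """
    return [statistics.mean(per_ion) for per_ion in zip(*delta_by_file)]


summary = error_summary([500.001, 999.998], [500.0, 1000.0])
averaged = mean_delta_per_ion([[1.0, -2.0], [3.0, -4.0]])
```

Feeding the output of `mean_delta_per_ion` into the same median and rmse calculations corresponds to the multi-file rows of the table, under the assumptions stated above.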
Table 2. Summary of Median Errors and rmse of |∆m/z| for Calibrant Ions: Mouse Serum Study
                        median error        rmse
                        mTh     ppm         mTh     ppm
1 file (a)              2.7     4.4         4.1     6.5
4 files (a)             2.1     3.1         3.5     5.2
16 files (a)            2.2     3.3         3.5     5.3
240 files (a)           2.3     3.2         3.4     5.2
1 file (b)              2.8     4.2         4.2     6.5
4 files (b)             2.2     3.3         3.6     5.2
16 files (b)            2.1     3.2         3.4     5.2
240 files (b)           2.1     3.2         3.5     5.3
(a) Used in calibration. (b) Unused.
is correlated between samples. Examination of selected molecular ions showed a general consistency in the magnitude and direction of the error across samples. While misidentification is playing a small role, we suspect that spectral interferences occur with significant frequency for this complex type of sample, causing small systematic shifts. Further investigation is warranted here, especially with higher resolution instrumentation.
Two metrics of error are the median of the absolute value of calibrated ∆m/z and the root-mean-square error (rmse) for all of the molecular ions studied. Table 1 provides a summary of both of these results, listed in mTh and ppm, for the human plasma study for those peaks accepted as identified (within ±10 mTh of the predicted value). Again, the results for one file using all peaks are compared with those using only peaks with a score of >60. Results are also included after averaging over 4, 16, and 30 files, showing only minor improvement over multiple files.
For the mouse serum study, 1686 potential calibrants were divided into two groups by taking every other one in order of observed m/z. Half of those calibrants were used for calibration, and the other half were used to test the calibration and thus test for any bias in the system. Table 2 shows that the results were independent of whether the peak was chosen for calibration. Modest improvement with averaging across samples is again observed.
CONCLUSIONS
The results presented here offer a new approach to global recalibration of mass spectral data, where significant numbers of molecular ions are known or can readily be identified to the level of elemental composition, using only a partial calibration. While the examples are from the area of proteomics, the approach applies to metabolomics, glycomics, or any sample type. By using a nonparametric calibration curve, this method is not limited by assuming any analytical functional model of instrument behavior.
Furthermore, while the data presented here are from TOF mass spectrometers, the method may potentially be applied to other types of mass spectrometers of greater or lesser resolution and accuracy, and with other modes of ionization. Further improvements in calibration beyond that presented here undoubtedly are possible. Averaging over multiple samples will help nullify the influence of any temperature or temporal drift and does somewhat improve mass accuracy relative to use of only a single sample. Even higher resolution instrumentation should allow better understanding of spectral interferences from such complex samples and their influence on measurement error.
This method of mass calibration may be used in studies of molecular identification or differential expression (profiling). In profiling experiments, another advantage of this recalibration is the improved ability to determine which ions in one sample correspond to the same molecular identity in another sample. By tightening the constraint in m/z uncertainty, improvement is made in knowing which molecular ions with close chromatographic retention times are the same among many complex samples.
Received for review July 25, 2006. Accepted December 4, 2006.
AC061359U