Anal. Chem. 2007, 79, 8762-8768
Sequence-Specific Retention Calculator. A Family of Peptide Retention Time Prediction Algorithms in Reversed-Phase HPLC: Applicability to Various Chromatographic Conditions and Columns

Vic Spicer,† Andriy Yamchuk,† John Cortens,‡ Sandra Sousa,‡ Werner Ens,† Kenneth G. Standing,† John A. Wilkins,‡ and Oleg V. Krokhin*,‡
Department of Physics and Astronomy, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada, and Manitoba Centre for Proteomics and Systems Biology and Department of Internal Medicine, University of Manitoba, 799 JBRC, Winnipeg, MB, R3E 3P4, Canada
Separation selectivity of C18 reversed-phase columns from different manufacturers has been compared to evaluate the applicability of our sequence-specific retention calculator (SSRCalc) peptide retention prediction algorithms. Three versions of SSRCalc are currently in use, developed for 300-Å pore size sorbents (TFA as ion-pairing modifier, pH 2), 100-Å sorbents (TFA, pH 2), and 100-Å sorbents (pH 10); they have been applied to the separation of a randomly chosen mixture of tryptic peptides. The major factor affecting separation selectivity of C18 sorbents was found to be apparent pore size, while differences in end-capping chemistry have no significant impact. The introduction of embedded polar groups to the C18 functionality increases the retention of peptides containing hydrophobic amino acid residues with polar groups: Tyr and Trp. We also demonstrate that changing the ion-pairing modifier to formic or acetic acid significantly reduces the algorithm's predictive ability, so models developed for different eluent conditions cannot be compared directly to each other.

The development of soft ionization techniques such as MALDI¹ and ESI² in conjunction with new mass spectrometric equipment has driven growth in the field of analytical chemistry of peptides and proteins. Typical bottom-up LC-MS/MS proteomics experiments deal with complex mixtures of hundreds to thousands of proteins; proper analysis of the generated data includes an assessment of the probability of false-positive matches.³ This forces proteomics researchers, despite recent progress in mass spectrometry, to look into auxiliary information to improve the certainty of protein identification⁴⁻⁶ or to significantly reduce the time required for analysis.⁷ Peptide chromatographic retention time (RT) offers an excellent choice as an additional filtering criterion, as it is readily available and contains information dependent on the peptide composition and sequence.

This potential application has revitalized interest in the analytical community in the quarter-century-old field of peptide retention prediction. The classical approach was first postulated by Meek⁸ and used by a number of research groups, with some modifications added over the years.⁹⁻¹¹ Advances in this field in the 1980s and early 1990s were seriously hindered by the lack of adequate ionization techniques for biological molecules. This resulted in optimization sets of insufficient quality (mostly in terms of size), which were constructed using limited numbers of synthetic or purified peptides of biological origin. At present, every proteomics group utilizing LC-MS methods has access to virtually unlimited peptide sequence-retention time data sets. For example, Petritis et al.¹² recently demonstrated the collection of a training data set containing ∼345 000 entries for their algorithm development. Conversely, we have shown that a relatively small (∼2000 peptides) high-quality optimization data set combined with the introduction of a number of sequence-specific correction factors can also produce very accurate models (retention time vs hydrophobicity correlation up to an R² value of 0.98).¹³

* Corresponding author. Tel: (204) 789 3283. Fax: (204) 474 7622. E-mail: [email protected].
† Department of Physics and Astronomy.
‡ Manitoba Centre for Proteomics and Systems Biology and Department of Internal Medicine.

(1) Karas, M.; Hillenkamp, F. Anal. Chem. 1988, 60, 2299-2301.
(2) Fenn, J. B.; Mann, M.; Meng, C. K.; Wong, S. F.; Whitehouse, C. M. Science 1989, 246, 64-71.
(3) Nesvizhskii, A. I.; Keller, A.; Kolker, E.; Aebersold, R. Anal. Chem. 2003, 75, 4646-4658.
(4) Palmblad, M.; Ramström, M.; Markides, K. E.; Håkansson, P.; Bergquist, J. Anal. Chem. 2002, 74, 5826-5830.
(5) Strittmatter, E. F.; Kangas, L. J.; Petritis, K.; Mottaz, H. M.; Anderson, G. A.; Shen, Y.; Jacobs, J. M.; Camp, D. G., 2nd; Smith, R. D. J. Proteome Res. 2004, 3, 760-769.
(6) Cargile, B. J.; Bundy, J. L.; Freeman, T. W.; Stephenson, J. L. J. Proteome Res. 2004, 3, 112-119.
(7) Krokhin, O. V.; Ying, S.; Cortens, J. P.; Ghosh, D.; Spicer, V.; Ens, W.; Standing, K. G.; Beavis, R. C.; Wilkins, J. A. Anal. Chem. 2006, 78, 6265-6269.
(8) Meek, J. L. Proc. Natl. Acad. Sci. U.S.A. 1980, 77, 1632-1636.
(9) Browne, C. A.; Bennett, H. P. J.; Solomon, S. Anal. Biochem. 1982, 124, 201-208.
(10) Guo, D.; Mant, C. T.; Taneja, A. K.; Parker, J. M. R.; Hodges, R. S. J. Chromatogr. 1986, 359, 499-517.
(11) Mant, C. T.; Burke, T. W. L.; Black, J. A.; Hodges, R. S. J. Chromatogr. 1988, 458, 193-205.
(12) Petritis, K.; Kangas, L. J.; Yan, B.; Monroe, M. E.; Strittmatter, E. F.; Qian, W. J.; Adkins, J. N.; Moore, R. J.; Xu, Y.; Lipton, M. S.; Camp, D. G., 2nd; Smith, R. D. Anal. Chem. 2006, 78, 5026-5039.

Analytical Chemistry, Vol. 79, No. 22, November 15, 2007
10.1021/ac071474k CCC: $37.00
© 2007 American Chemical Society Published on Web 10/16/2007
New results derived from proteomics-driven RT prediction algorithms have provided a deeper understanding of the separation mechanisms in RP HPLC of peptides.¹⁴ At the same time, the increasing availability of data for model optimization and a continued interest in the development of proteomic methods are fueling the appearance of new models. Indeed, the development of new RT prediction models, or the modification of old ones, is now reported almost monthly.¹²,¹³,¹⁵⁻¹⁸ These developments raise the question of how to properly compare the algorithms and assess their applicability to various chromatographic conditions. The absence of criteria for comparison and the ambiguities created by applying algorithms to different LC systems (columns) threaten to slow progress in the field and should be addressed. Three primary factors affect the development and comparability of the various algorithms:

(i) Differences between the compositions of the mobile phases used. For most proteomics applications, this is the difference between trifluoroacetic acid (TFA) based (HPLC-MALDI) and formic acid (FA) based (HPLC-ESI) eluents. This parameter is well established, since optimal separation conditions of peptides on RP sorbents have been known for many years.¹⁹ It is important to note that the models developed for different ion-pairing modifiers may not be compatible.

(ii) Differences in the stationary phase resulting from the distinctive characteristics of RP sorbents such as pore size, chemistry of functional groups, purity of the silica matrix, etc. This variable is the hardest to take into account because of continuous changes in sorbent manufacturing over the past 25 years.

(iii) Variability in the nature of the species to be separated; an algorithm should be capable of dealing with changes associated with different enzymes, post-translational modifications, and chemical labeling such as with isotopically coded tags.
The latter set of variables affects peptide retention by changing the chemical properties of the residues. Parameters of algorithms applied to peptides generated by alternative enzymatic cleavage, or to chemically or post-translationally modified species, should be carefully adjusted as well.

We are taking preliminary steps to build algorithms that take into account basic sorbent parameters like pore size. Recently we demonstrated how peptide retention differs between 100- and 300-Å pore size C18 sorbents and how we can compensate for this.¹³ We have developed a sequence-specific retention calculator (SSRCalc) algorithm that is based on considerations of the ion pairing of peptides during RP HPLC.¹³,²⁰ The algorithm currently exists in two versions, for 100- and 300-Å pore size sorbents with TFA as modifier. A third version of the algorithm is now under development for 100-Å pore size C18 sorbent at pH 10. The current accuracy of the pH 10 model presented here is an R² value of ∼0.97, compared to 0.98 for the previous TFA models. The choice of column and chromatographic conditions in the latter model is based on a recent report of Gilar et al.²¹ They showed the great potential of reversed-phase separations at high pH as a first LC dimension in a 2D-HPLC analysis scheme. While the cation-exchange mode of peptide separation provides superior orthogonality, the better peak shape in RP at pH 10 makes it an excellent candidate for selection as a first dimension in 2D-HPLC analysis.

SSRCalc is now used by dozens of research laboratories and institutions. Typical questions and requests from these users fall into three categories: (1) Can TFA models be applied for peptide separations under FA conditions? (2) Is it possible to use these algorithms with different sorbents? (3) Will the algorithm work for modified or non-tryptic peptides? This paper addresses the first two questions by comparing the peptide separation selectivity for the ion-pairing modifier triad TFA-FA-acetic acid (AA) and for a number of C18 RP columns at pH 2 and 10.

(13) Krokhin, O. V. Anal. Chem. 2006, 78, 7785-7795.
(14) Tripet, B.; Cepeniene, D.; Kovacs, J. M.; Mant, C. T.; Krokhin, O. V.; Hodges, R. S. J. Chromatogr., A 2007, 1141 (2), 212-225.
(15) Shinoda, K.; Sugimoto, M.; Yachie, N.; Sugiyama, N.; Masuda, T.; Robert, M.; Soga, T.; Tomita, M. J. Proteome Res. 2006, 5, 3312-3317.
(16) Gorshkov, A. V.; Tarasova, I. A.; Evreinov, V. V.; Savitski, M. M.; Nielsen, M. L.; Zubarev, R. A.; Gorshkov, M. V. Anal. Chem. 2006, 78, 7770-7777.
(17) Put, R.; Daszykowski, M.; Baczek, T.; Vander Heyden, Y. J. Proteome Res. 2006, 5, 1618-1625.
(18) Klammer, A. A.; Yi, X.; Maccoss, M. J.; Noble, W. S. Anal. Chem. 2007, 79, 6111-6118.
(19) Mant, C. T.; Hodges, R. S. HPLC of Biological Macromolecules; Marcel Dekker: New York, 2002; pp 433-511.
(20) Krokhin, O. V.; Craig, R.; Spicer, V.; Ens, W.; Standing, K. G.; Beavis, R. C.; Wilkins, J. A. Mol. Cell. Proteomics 2004, 3, 908-919.

EXPERIMENTAL SECTION

Sample Preparation and HPLC Analysis. Sample preparation and the off-line microHPLC-MALDI MS setup were described in detail elsewhere.¹³,²² Briefly, protein mixtures were reduced, alkylated with iodoacetamide, dialyzed against ammonium bicarbonate solution, and digested with trypsin. Two samples were used for the present study: affinity-purified human merosin (peptide mixture 1) and a mixture of five standard proteins (human/bovine albumin/transferrin and horse myoglobin, referred to as peptide mixture 2). Merosin was purified from human placenta as described by Wewer et al.²³ Unless otherwise noted, digests were fractionated using linear 0.75% acetonitrile/min water-acetonitrile gradients (1% acetonitrile starting conditions, 3 µL/min flow rate) on a micro-Agilent 1100 Series system (Agilent Technologies, Wilmington, DE).
The compositions of eluents A and B were similar for the TFA-FA-AA experiments: 0.1% v/v ion-pairing modifier in water (A) and acetonitrile (B). For the pH 10 experiments, both eluent A (100% water) and eluent B (90:10 acetonitrile-water) contained 20 mM ammonium formate buffer. Samples (5 µL) containing 1-2 µg of the digests were injected directly onto micro-LC columns. The column effluent was mixed on-line with 2,5-dihydroxybenzoic acid (DHB) MALDI matrix solution (0.5 µL/min, 150 mg/mL DHB in methanol), deposited by a computer-controlled robot onto a movable metal target at 0.5-min intervals, air-dried, and submitted to MALDI MS (MS/MS) analysis.

The three versions of SSRCalc were developed on the following columns: 300 Å-TFA on a 150 µm × 150 mm column (home-packed Vydac 218 TP C18, 5 µm; Grace Vydac, Hesperia, CA); 100 Å-TFA on a 300 µm × 150 mm column (PepMap 100, 3 µm; LC Packings-Dionex, Sunnyvale, CA); and 100 Å pH 10 on a 150 µm × 150 mm column (home-packed XTerra C18, 5 µm; Waters, Milford, MA). These columns are referred to as standard throughout the text. Table 1 contains the complete list of the columns used in this study.

(21) Gilar, M.; Olivova, P.; Daly, A. E.; Gebler, J. C. Anal. Chem. 2005, 77, 6426-6434.
(22) Krokhin, O. V.; Ens, W.; Standing, K. G. J. Biomol. Tech. 2005, 16, 429-440.
(23) Wewer, U.; Albrechtsen, R.; Manthorpe, M.; Varon, S.; Engvall, E.; Ruoslahti, E. J. Biol. Chem. 1983, 258, 12654-12660.
Table 1. Stationary Phases Used in the Study and Correlation of Observed Retention Times vs Retention Times on the Standard Sorbents for SSRCalc Development (Vydac TP 218 300 Å TFA, PepMap 100 100 Å TFA, XTerra 100 Å pH 10)

| manufacturer, sorbent, size (µm × mm) | functionality, end-capping | pore size (Å), particle size (µm) | pH range | retention time correlation |
|---|---|---|---|---|
| Wide-Pore Phases, TFA | | | | |
| Agilent, Zorbax 300 SB, 300 × 150 | C18, no | 300, 5 | 1.0-8.0 | 0.9941 |
| Eksigent, ChromXP C18EP 300 Å, 300 × 150 | C18 with embedded polar groups, yes | 300, 3 | 1-10 | 0.9854 |
| Phenomenex, Jupiter 300, C18, 300 × 150 | C18, yes | 300, 5 | 1.5-10 | 0.9895 |
| GLScience, Inertsil WP300, 300 × 150 | C18, yes | 300, 5 | 2-7.5 | 0.9882 |
| Vydac TP218, 150 × 150 | polymerically bonded C18, yes | 300, 5 | 1.5-7.5 | 0.9997ᵃ (std column) |
| Waters, Symmetry 300, 300 × 150 | C18, yes | 300, 3.5 | 2-8 | 0.9899 |
| Small-Pore Phases, TFA | | | | |
| Dionex, PepMap 100, 300 × 150 | C18, yes | 100, 3.5 | 2-8 | 0.9998ᵃ (std column) |
| Eksigent, ChromXP C18EP 120 Å, 300 × 150 | C18 with embedded polar groups, yes | 120, 3 | 1-10 | 0.9944 |
| Eksigent, ChromXP C18CL 120 Å, 300 × 150 | C18, yes | 120, 3 | 2-9 | 0.9984 |
| Eksigent, ChromXP C18AQ 120 Å, 300 × 150 | C18, partially end-capped | 120, 3 | 2-9 | 0.9984 |
| Phenomenex, Luna C18(2), 300 × 150 | C18, yes | 100, 3 | 1.5-10 | 0.9991 |
| Phenomenex, Jupiter Proteo, 300 × 150 | C9, yes | 90, 4 | 1.5-10 | 0.9982 |
| Phenomenex, Synergi Hydro-RP, 300 × 150 | C18, polar end-capping | 80, 4 | 1.5-10 | 0.9979 |
| GLScience, Inertsil ODS-3, 300 × 150 | C18, yes | 100, 3 | 2-7.5 | 0.9991 |
| Waters, XTerra, 300 × 150 | C18, yes | 130, 3.5 | 2-10 | 0.998 |
| Small-Pore Phases, pH 10 | | | | |
| Waters, XTerra, 150 × 150 | C18, yes | 130, 5 | 2-10 | 0.9996ᵃ (std column) |
| Phenomenex, Luna C18(2) | C18, yes | 100, 3 | 1.5-10 | 0.954 |

ᵃ To estimate the accuracy of peptide assignment based solely on mass and retention time reproducibility, separations on standard columns were repeated two months after the original HPLC MS/MS runs. In all three cases, the correlations of measured retention times were found to be very close, 0.9996-0.9998.
The first set of experiments was designed to probe the differences in separation selectivity of peptides in mixture 1 using three groups of columns (Table 1). In the second experiment, peptide mixture 2 was fractionated on the same column (PepMap 100) using three different ion-pairing modifiers: FA, AA, and TFA, each at 0.1% v/v.

Mass Spectrometry and Data Treatment. Chromatographic fractions were analyzed by single mass spectrometry, with m/z range 560-5000 Da, and by tandem mass spectrometry (MS/MS) in the Manitoba/Sciex prototype MALDI quadrupole/TOF (QqTOF) mass spectrometer.²⁴ Orthogonal injection of ions from the quadrupole into the TOF section normally produces a mass resolving power of ∼10 000 full width at half-maximum and accuracy within a few millidaltons in the TOF spectra in both MS and MS/MS modes. An in-house software package, SMART (Search using Mass and Retention Time),²⁵ was used to assign peaks in the MS spectra of the fractions, identify proteins by peptide mass-RT fingerprint, choose candidates for MS/MS analysis, and confirm peptide identification. Detailed MS/MS analyses of the merosin digest were performed once for each set of experimental conditions: 100 Å-TFA, 300 Å-TFA, and 100 Å pH 10. In the remaining analyses, which used similar separation conditions, peptides were assigned by mass. MS/MS confirmation was restricted to cases of overlapping peaks or of peaks with close masses.

(24) Loboda, A. V.; Krutchinsky, A. N.; Bromirski, M.; Ens, W.; Standing, K. G. Rapid Commun. Mass Spectrom. 2000, 14, 1047-1057.
(25) Krokhin, O. V.; Spicer, V.; Ens, W.; Standing, K. G.; Wilkins, J. A. 54th ASMS Conference on Mass Spectrometry and Allied Topics, San Diego, CA, 2006; poster.
The differences in separation selectivity between stationary phases were investigated by plotting retention times on one column against retention times on another. These correlations are referred to as RT-RT plots henceforth.

RESULTS AND DISCUSSION

Current Accuracy of SSRCalc and Orthogonality of Three Different Separation Modes. Affinity-purified merosin was chosen as a typical example of the types of samples that researchers deal with on a daily basis. Apart from its major component, P24043 laminin subunit α-2 precursor (merosin heavy chain), it contains a number of laminin family members and other associated proteins. Protein identification in peptide mixture 1 resulted in the unambiguous assignment of 250-300 peptides from ∼15 proteins. The list of identified peptides along with their calculated hydrophobicities and retention times is given in Supporting Information 1.

Figure 1a-c shows correlations of retention time versus SSRCalc hydrophobicity for peptide mixture 1 (human merosin digest) for three different separation modes: 100 Å-TFA, 300 Å-TFA, and 100 Å pH 10, respectively. The pH 10 model was optimized in a fashion similar to the pH 2 algorithms, with a data set of ∼3000 peptides that were confidently identified by LC-MALDI MS/MS. As mentioned before, it is still under development and gives an R² value of ∼0.97 for the optimization set, which does not reach the accuracy of our more mature TFA-based algorithms. Despite the lower accuracy of the pH 10 model compared to the pH 2 ones, very close correlation values were obtained for the set of peptides used in this work.
Figure 1. Current accuracy of SSRCalc algorithms and orthogonality of three different separation modes. (a-c) Retention time vs calculated hydrophobicity correlations for peptide mixture 1 using the 100 Å TFA, 300 Å TFA, and 100 Å pH 10 models, respectively; (d-f) RT-RT plots for PepMap 100-Synergi Hydro-RP, PepMap 100-Vydac TP218, and PepMap 100-XTerra pH 10, respectively.
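The correlations quoted for Figure 1a-c are R² values of retention time against calculated hydrophobicity. As a minimal sketch of that calculation, assuming an ordinary least-squares line (the fitting procedure is not spelled out in the text), the snippet below fits the line and reports R². The function names and the six data points are hypothetical and serve only to make the example runnable; they are not measurements from this study.

```python
from statistics import mean


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(x, y):
    """Coefficient of determination of the least-squares line through (x, y)."""
    slope, intercept = linear_fit(x, y)
    my = mean(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical values (assumption): calculated hydrophobicity and observed RT in minutes.
hydrophobicity = [5.0, 12.0, 18.5, 25.0, 31.0, 40.0]
retention_min = [10.2, 17.9, 24.1, 31.5, 36.8, 46.9]

slope, intercept = linear_fit(hydrophobicity, retention_min)
print(f"RT = {slope:.3f} * H + {intercept:.3f}")
print(f"R^2 = {r_squared(hydrophobicity, retention_min):.4f}")
```

The fitted slope and intercept also convert a calculated hydrophobicity into a predicted retention time for a given gradient, which is how such a model would be used as a filtering criterion.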
The alterations that we made in the pH 10 model relate to the ion-pair separation mechanism and mostly reflect the changed charge state of particular amino acids at the increased pH. The amino groups of the basic residues His, Arg, and Lys and of the N-terminus are positively charged at pH 2, while at pH 10 they are neutral. This causes an increase in their apparent hydrophobicity. Conversely, Asp and Glu are neutral at pH 2, while they become negatively charged in pH 10 ammonium formate and carry a "cloud" of associated ammonium counterions. This significantly decreases the retention coefficients (Rc) of acidic residues under basic conditions. The respective Rc changes due to the pH alteration are shown in Table 2. Recently Kovacs et al.²⁶ demonstrated that changing the pH of the mobile phase from 2 to 7 affects the intrinsic hydrophobicity of residues carrying potentially charged side chains (Orn, Lys, His, Arg, Asp, Glu), which is consistent with our findings.

A charged N-terminal amino group significantly influences the hydrophobicity of the first few residues at acidic pH.²⁰ At pH 10, this influence is negligible, because the NH2 group becomes neutral. Table 2 shows the respective changes in retention coefficients for the first (N-terminal) residue (Rc1) of individual amino acids on going from pH 2 to pH 10 conditions. Thus, transfer of Leu from a position within a peptide to the N-terminus decreases its retention coefficient from 9.4 to 5.57 at pH 2, but has a relatively minor effect at pH 10 (8.60 to 8.21).

Table 2. Retention Coefficients of 20 Naturally Occurring Amino Acids for 100-Å Models at pH 2 (TFA)¹³ and pH 10 (Ammonium Formate)

| residue | Rc, 100 Å pH 2 | Rc1, 100 Å pH 2 | Rc, 100 Å pH 10 | Rc1, 100 Å pH 10 |
|---|---|---|---|---|
| W | 13.35 | 11.5 | 11.93 | 12.22 |
| F | 11.67 | 7.6 | 10.23 | 10.97 |
| L | 9.4 | 5.57 | 8.60 | 8.21 |
| I | 7.96 | 4.95 | 7.53 | 8.08 |
| M | 6.27 | 5.2 | 5.58 | 5.49 |
| V | 4.68 | 2.1 | 4.77 | 4.46 |
| Y | 5.35 | 4.3 | 4.84 | 4.92 |
| Cᵃ | 0.1 | 0.4 | 1.23 | -0.14 |
| P | 1.85 | 1.85ᵇ | 2.08 | 2.08ᵇ |
| A | 1.02 | -0.35 | 1.46 | 0.24 |
| E | 1 | 1 | -5.12 | -4.73 |
| T | 0.64 | 0.95 | 0.91 | 0.50 |
| D | 0.15 | 0.9 | -5.58 | -5.73 |
| Q | -0.6 | -0.5 | 0.21 | -1.68 |
| S | -0.14 | 1.1 | 0.66 | 0.21 |
| G | -0.35 | 0.15 | 0.34 | -0.52 |
| R | -2.55 | -1.4 | 3.16 | 3.10 |
| N | -0.95 | 1.2 | 0.06 | -0.91 |
| H | -3 | -1.4 | 0.73 | 0.75 |
| K | -3.4 | -1.85 | 2.09 | 2.16 |

ᵃ Retention coefficients for carbamidomethylated Cys. ᵇ Retention coefficient for Pro at the N-terminus was assigned equal to Rc due to the lack of N-terminal prolines in our data set.

(26) Kovacs, J. M.; Mant, C. T.; Hodges, R. S. Biopolymers 2006, 84, 283-297.

Figure 1d-f illustrates how RT-RT plots can be used to estimate both the applicability of prediction algorithms across various conditions and separation orthogonality. When the same peptides are separated under identical conditions (e.g., eluents, column), the RT-RT correlation should be close to 1; the two separation conditions are then nonorthogonal to each other. Conversely, when data points on an RT-RT plot are distributed randomly (i.e., no correlation), the separation conditions are orthogonal. Thus, the plot for a PepMap 100 column versus a Synergi Hydro-RP column, which have a similar matrix (C18) and pore size (Figure 1d), shows a very good correlation (R² 0.998). These two columns are very close in separation selectivity, which justifies the use of the PepMap 100 prediction model for peptides separated on Synergi Hydro-RP. However, their lack of orthogonality indicates that they would not be a useful pair of stationary phases for a 2D-HPLC scheme.

Changing the pore size of the sorbent while keeping the same chromatographic conditions lowers the accuracy of the algorithm and requires adjustment.¹³ Thus, the correlation of retention times between PepMap 100 and Vydac 218 TP is 0.978 (Figure 1e). These conditions show insufficient orthogonality as well. The effects of pH on the charge distribution within peptides make the retention patterns at pH 10 essentially orthogonal to those observed at pH 2, while maintaining a high separation efficiency. A dramatic change in the pH of the mobile phase (Figure 1f) alters the correlation profoundly, providing the basis for the effective application of pH 10 conditions as an orthogonal separation mode. The prediction model needs major modifications in this case.

Comparison of Separation Selectivities for Small-Pore C18 Phases. A diverse set of peptides identified in HPLC-MALDI MS runs on various columns (Supporting Information 1) provided important insights into the varying separation selectivity of the nine different sorbents in this group. Table 3 shows cross-correlations for all small-pore packing materials, ranging from 0.9994 down to 0.9904. Most of the columns showed very similar behavior, except for Eksigent ChromXP C18EP 120. A detailed inspection of the peptides deviating significantly in the RT (PepMap 100)-RT (ChromXP C18EP 120) correlation plot showed a large positive deviation of peptides carrying Tyr and, to some extent, Trp residues. While these two amino acids are hydrophobic, they also carry polar groups, which enables tyrosine/tryptophan-containing peptides to interact with both C18EP 120 functionalities (i.e., C18 and embedded polar groups). This property increases the retention times of peptides containing these residues.

Table 3. Cross-Correlations of Measured Retention Times for All Studied Sorbents

100-Å Pore Size Sorbents

| sorbent | PepMap100 | XTerra | ChromXP C18CL 120 | ChromXP C18AQ 120 | Jupiter Proteo | Luna C18(2) | Synergi | Inertsil ODS-3 | ChromXP C18EP 120 |
|---|---|---|---|---|---|---|---|---|---|
| PepMap100 | 0.9998ᵃ | 0.9979 | 0.9984 | 0.9984 | 0.9982 | 0.9991 | 0.998 | 0.999 | 0.9944 |
| XTerra | 0.9979 | 1 | 0.9991 | 0.999 | 0.9988 | 0.9988 | 0.9981 | 0.998 | 0.9904 |
| ChromXP C18CL 120 | 0.9984 | 0.9991 | 1 | 0.9995 | 0.9992 | 0.9994 | 0.999 | 0.9991 | 0.9908 |
| ChromXP C18AQ 120 | 0.9984 | 0.999 | 0.9995 | 1 | 0.9991 | 0.9993 | 0.9988 | 0.999 | 0.9912 |
| Jupiter Proteo | 0.9982 | 0.9988 | 0.9992 | 0.9991 | 1 | 0.9991 | 0.9986 | 0.9986 | 0.9913 |
| Luna C18(2) | 0.9991 | 0.9988 | 0.9994 | 0.9993 | 0.9991 | 1 | 0.9987 | 0.9993 | 0.9921 |
| Synergi | 0.998 | 0.9981 | 0.999 | 0.9988 | 0.9986 | 0.9987 | 1 | 0.999 | 0.9918 |
| Inertsil ODS-3 | 0.999 | 0.998 | 0.9991 | 0.999 | 0.9986 | 0.9993 | 0.999 | 1 | 0.9932 |
| ChromXP C18EP 120 | 0.9944 | 0.9904 | 0.9908 | 0.9912 | 0.9913 | 0.9921 | 0.9918 | 0.9932 | 1 |

300-Å Pore Size Sorbents

| sorbent | Vydac TP218 | ChromXP C18EP 300 | Zorbax SB 300 | Jupiter 300, C18 | Inertsil WP 300 | Symmetry 300 |
|---|---|---|---|---|---|---|
| Vydac TP218 | 0.9997ᵃ | 0.9855 | 0.994 | 0.9881 | 0.9875 | 0.9905 |
| ChromXP C18EP 300 | 0.9855 | 1 | 0.9902 | 0.9872 | 0.991 | 0.9905 |
| Zorbax SB 300 | 0.994 | 0.9902 | 1 | 0.9964 | 0.9959 | 0.997 |
| Jupiter 300, C18 | 0.9881 | 0.9872 | 0.9964 | 1 | 0.9961 | 0.996 |
| Inertsil WP 300 | 0.9875 | 0.991 | 0.9959 | 0.9961 | 1 | 0.9991 |
| Symmetry 300 | 0.9905 | 0.9905 | 0.997 | 0.996 | 0.9991 | 1 |

ᵃ Correlations against repeated runs two months apart; see Table 1.
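The pairwise RT-RT correlations reported in Table 3 can be reproduced in outline with a few lines of code. The sketch below assumes that each reported value is the squared Pearson correlation (equivalently, the R² of a straight-line fit) computed over the peptides observed on both columns; the text does not state this explicitly. The column names, function names, and retention times are hypothetical placeholders rather than data from this study.

```python
from itertools import combinations
from statistics import mean


def pearson_r2(x, y):
    """Squared Pearson correlation, equal to R^2 of a simple linear fit."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def cross_correlations(rt_by_column):
    """Return {(column_a, column_b): R^2} using only peptides seen on both columns."""
    table = {}
    for a, b in combinations(sorted(rt_by_column), 2):
        shared = sorted(set(rt_by_column[a]) & set(rt_by_column[b]))
        xs = [rt_by_column[a][p] for p in shared]
        ys = [rt_by_column[b][p] for p in shared]
        table[(a, b)] = pearson_r2(xs, ys)
    return table


# Hypothetical retention times in minutes (assumption, not measured values).
example_rt = {
    "column_A": {"ALELFR": 30.1, "YVFR": 18.4, "LIEIASR": 24.7, "TWVTLK": 26.3},
    "column_B": {"ALELFR": 30.9, "YVFR": 18.1, "LIEIASR": 25.2, "TWVTLK": 26.0},
    "column_C": {"ALELFR": 27.5, "YVFR": 21.0, "LIEIASR": 22.8, "LTIELEVR": 33.0},
}

for pair, r2 in cross_correlations(example_rt).items():
    print(pair, round(r2, 4))
```

Restricting each comparison to shared peptides matters in practice, because not every peptide is identified in every run; a value near 1 then indicates similar selectivity, while a low value indicates orthogonal conditions.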
The eight remaining columns in this category showed only minor variation in selectivity (Table 3). The respective RT-RT plots showed correlations of ∼0.999. Despite such minor differences, it was possible to draw certain conclusions about which parameter is the most important in determining the relative retention on columns of similar C18 chemistry. We have shown¹³ that the peptides most affected by apparent pore size are relatively small and hydrophobic. Because of their small size, they can penetrate easily into small pores and realize their hydrophobic potential. Decreasing the pore size increases the relative retention of such analytes, while increasing the pore size has the opposite effect. We chose several peptides with these properties and calculated their deviations from the respective RT (PepMap 100)-RT plots. Table 4 lists these measurements and shows clearly that the largest (positive) and smallest (negative) deviations in this group are characteristic of the sorbents with the smallest (Synergi Hydro-RP, 80 Å, Table 1) and largest (XTerra, 130 Å) pores, respectively. Therefore, apart from the bonding chemistry of the RP, pore size has the largest impact on separation selectivity. Calculating the average deviation of retention of the selected peptides (Table 4) from the respective RT-RT plots allows the eight sorbents to be ordered according to their pore sizes: Synergi Hydro-RP ≤ Inertsil ODS-3 < ChromXP C18CL 120 ≤ ChromXP C18AQ 120 < Luna C18(2) ≤ Jupiter Proteo ≤ PepMap 100 < XTerra. The calculated average deviation for the ChromXP C18EP 120 column was very high because of the previously described additional influence of Tyr/Trp residues. However, if one considers only peptides with hydrophobic nonpolar residues, then the indicated pore size of C18EP 120 is close to that of C18CL 120 and C18AQ 120.

Wide-Pore C18 Phases. Similar correlation plots for the 300-Å pore sorbents showed larger variations in R² values (Table 3), from 0.9855 to 0.9991.
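The deviation measurements described above for small hydrophobic peptides can be sketched as residuals from a straight line fitted to an RT-RT plot. The example assumes that the reference column (for instance PepMap 100) is on the x axis and that a positive deviation means a peptide elutes later on the test column than the fit predicts; the axis convention, the function names, and all retention times below are assumptions made for illustration only.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def deviations(rt_ref, rt_test, selected):
    """Residuals (min) of selected peptides from the RT(ref)-RT(test) fit.

    The line is fitted to all peptides shared by both runs; residuals are
    then reported only for the selected peptides present in both.
    """
    shared = sorted(set(rt_ref) & set(rt_test))
    slope, intercept = fit_line([rt_ref[p] for p in shared],
                                [rt_test[p] for p in shared])
    return {p: rt_test[p] - (slope * rt_ref[p] + intercept)
            for p in selected if p in rt_ref and p in rt_test}


# Hypothetical retention times in minutes (assumption, not measured values).
rt_reference = {"pepA": 10.0, "YVFR": 20.0, "pepC": 30.0, "pepD": 40.0}
rt_smaller_pores = {"pepA": 11.0, "YVFR": 23.0, "pepC": 31.0, "pepD": 41.0}

dev = deviations(rt_reference, rt_smaller_pores, ["YVFR"])
average = mean(dev.values())
print({p: round(d, 2) for p, d in dev.items()}, round(average, 2))
```

Averaging such residuals over a fixed set of small hydrophobic peptides gives one number per sorbent, which is the quantity used in the text to rank sorbents by apparent pore size.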
The average deviations in retention of the same 10 selected peptides from the RT (Vydac TP218)-RT plots were significantly positive, indicating an anomalously large pore size of Vydac TP218 among all wide-pore sorbents (Table 4). Based on the data for the 300-Å sorbents, the relative pore sizes increased in the following order: Inertsil WP300 < Jupiter 300, C18 < Symmetry 300 < Zorbax 300 ≪ Vydac TP218. Eksigent ChromXP C18EP 300, with embedded polar groups, displayed behavior similar to that of C18EP 120: peptides with Tyr/Trp residues exhibit anomalously high retention. Based on the retention of small hydrophobic peptides without Tyr/Trp, the ChromXP C18EP 300 pore size was estimated to be close to that of Zorbax 300 SB.

Table 4. Deviation (in Minutes) of Small Hydrophobic Peptide Retention Times from Their Respective RT (PepMap 100)-RT (Small Pores) and RT (Vydac TP218)-RT (Wide Pores) Plots

Small pores

| peptide | XTerra | CL120 | AQ120 | Jupiter Proteo | Luna C18(2) | Synergi | ODS-3 | EP120 |
|---|---|---|---|---|---|---|---|---|
| ALELFR | -0.16 | 0.35 | 0.29 | 0.35 | 0.31 | 0.92 | 0.49 | -0.55 |
| YAIYFEAR | -0.46 | -0.09 | 0.01 | -0.14 | -0.19 | 0.17 | 0.17 | 1.27ᵃ |
| SLGEFIK | -0.07 | 0.32 | 0.35 | 0.02 | 0.06 | -0.08 | 0.21 | -0.48 |
| YVFR | -0.54 | -0.11 | 0.1 | -0.24 | 0.02 | 0.29 | 0.31 | 1.1ᵃ |
| LLVVYPWTQR | -0.24 | -0.08 | -0.14 | -0.13 | -0.20 | 0.15 | 0.07 | 0.13ᵃ |
| LIEIASR | 0.1 | 0.53 | 0.23 | 0.41 | 0.57 | 0.44 | 0.36 | -0.27 |
| NLYFTDWK | -0.37 | -0.14 | -0.17 | -0.16 | -0.20 | 0.16 | 0.21 | 0.57ᵃ |
| TWVTLK | -0.3 | -0.06 | -0.02 | -0.03 | 0.16 | 0.13 | 0.01 | -0.36 |
| LTIELEVR | -0.13 | 0.19 | 0.26 | 0.38 | 0.06 | -0.04 | 0.18 | -0.06 |
| YYYALYELVVR | -0.13 | -0.16 | -0.21 | -0.31 | -0.15 | 0.19 | 0.16 | 1.84ᵃ |
| Average deviation | -0.23 | 0.07 | 0.07 | 0.02 | 0.04 | 0.23 | 0.22 | 0.32ᵃ |

Wide pores

| peptide | EP300 | Zorbax SB 300 | Jupiter 300 | WP 300 | Symmetry 300 |
|---|---|---|---|---|---|
| ALELFR | 0.97 | 1.32 | 1.99 | 2.35 | 2.03 |
| YAIYFEAR | 2.52ᵃ | 0.99 | 1.06 | 1.38 | 1.06 |
| SLGEFIK | 0.77 | 1.12 | 1.28 | 1.76 | 1.44 |
| YVFR | 2.75ᵃ | 1.39 | 1.98 | 2.49 | 1.84 |
| LLVVYPWTQR | 1.33ᵃ | 0.96 | 1.30 | 1.34 | 1.01 |
| LIEIASR | 0.26 | 0.38 | 0.53 | 1.03 | 0.92 |
| NLYFTDWK | 1.75ᵃ | 1.07 | 1.25 | 1.45 | 1.27 |
| TWVTLK | 1.27ᵃ | 1.1 | 1.61 | 1.99 | 1.68 |
| LTIELEVR | 0.23 | 0.13 | 1.41 | 0.37 | 0.39 |
| YYYALYELVVR | 2.70ᵃ | 0.96 | 1.05 | 1.22 | 0.78 |
| Average deviation | 1.45ᵃ | 0.94 | 1.35 | 1.54 | 1.24 |

ᵃ Note the anomalously high deviations for peptides carrying hydrophobic residues with polar groups on sorbents with embedded polar functionality.

Small-Pore C18 at pH 10. The selectivity of two sorbents was compared in this group, XTerra and Luna C18(2). An R² value of 0.954 was found for the peptide RT-RT correlation between these sorbents. The same pair showed a much better correlation at pH 2: R² ∼0.9988 (Table 3). To define the basis for such a difference, the list of peptides was inspected to identify the species with the largest deviations from the correlation plot. The largest increases in retention on Luna C18(2) compared to XTerra were found for small, relatively hydrophobic peptides without acidic Asp and Glu residues (e.g., AFLHVPAK, TWGVYR, LLVVYPWTQR, YVFR, HLLSPQR). The YVFR peptide was retained on Luna C18(2) for 0.5 and 5.8 min longer at pH 2 and 10, respectively, than expected based on its XTerra retention. The role of acidic residues was additionally highlighted by the fact that the 26 peptides with the greatest positive deviations did not contain Asp or Glu. Collectively, these observations indicate that RP separation selectivity at pH 10 is more sensitive to the pore size of the sorbent. This might be a consequence of the very polar character of the NH4+ counterion compared to trifluoroacetate in pH 2 mode. The polar NH4+ cations are operationally much bigger than TFA because they coordinate and polarize a larger number of water molecules. A similar effect causes a decrease in electrostatic interactions and, as a consequence, a lower ion-exchange affinity of small ions in ion-exchange chromatography.
As mentioned before, Asp and Glu (negatively charged at pH 10) are involved in ion-pair formation. This significantly increases the apparent size of the ion pairs for Asp/Glu-carrying peptides, preventing them from penetrating into the smaller Luna C18(2) pores and hence decreasing peptide retention.

Varying Ion-Pairing Modifiers at Acidic pH. The difference in selectivity provided by ion-pairing modifiers under acidic conditions was monitored using the same PepMap 100 column with three different mobile-phase additives: TFA, FA, and AA. The respective RT-RT correlation plots for peptide mixture 2 (Supporting Information 2) showed a significant change in separation selectivity under these different conditions. R² values were found to be 0.9373, 0.8691, and 0.9715 for the TFA-FA, TFA-AA, and FA-AA pairs, respectively. This indicates that altering the ion-pairing chemistry has a much greater effect on separation selectivity than sorbent pore size. Thus, a change from the Vydac TP218 sorbent with the largest pores to a typical 100-Å column such as PepMap 100, with TFA in the eluent, showed a 0.978 RT-RT correlation (Figure 1e). However, switching modifiers from TFA to FA reduced the correlation to below 0.94. This profound influence of ion-pairing modifiers highlights, once again, the importance of developing a proper understanding of the separation mechanisms: as the chemical properties of the counterions change, the resulting ion pairs change in size and hydrophobicity. This requires significant reconstruction of the retention prediction algorithms to maintain high prediction accuracy.

Compatibility of the Algorithms and Requirements for Proper Models Comparison. There is no better way to ensure optimal performance of a retention prediction algorithm than to apply it under the same conditions under which it was developed. When selecting an optimal algorithm for particular chromatographic conditions, priority should be given to the similarity of ion-pairing modifiers. For example, none of our attempts to apply SSRCalc TFA models to FA-separated peptides provided hydrophobicity versus retention time correlations better than 0.93. Changing sorbent pore size has a smaller effect. We showed¹³ that application of the Vydac TP218 300 Å TFA algorithm to PepMap 100 Å TFA data (and vice versa) reduces hydrophobicity versus retention time correlations from ∼0.98 to ∼0.95, which is consistent with an R² of 0.978 for a Vydac-PepMap RT-RT plot.
Such a change in separation selectivity was the result of the unusually large pore size of the Vydac 218TP sorbent. RT-RT correlation values between most of the wide-pore phases and the 100-Å ones are close to 0.99 (results not shown here); this indicates that the majority of the 300-Å sorbents are between Vydac 218 TP and PepMap 100 in terms of pore size and SSRCalc applicability. There were a couple of examples of alternative end-capping chemistry in our small-pore sorbents set: Synergi Hydro-RP with
polar end-capping and partially end-capped ChromXP C18AQ 120. However, we did not observe any significant impact of these alterations on peptide separation selectivity. On the other hand, the introduction of additional polar functionality significantly increases the retention of peptides containing hydrophobic residues with polar groups (Tyr and Trp). In attempting a comparison of newly developed algorithms against older ones, authors should be aware of the following pitfalls: the compatibility of ion-pairing modifiers, the similarity of pore sizes, and the bonding chemistry. It is also important to remember that minor differences in selectivity that are obvious when inspecting the respective RT-RT plots may not affect the overall accuracy of RT versus hydrophobicity correlations. This is true in particular for predictive models of lower accuracy, with R2 values of 0.90-0.94. This conclusion is based on the fact that the respective peptide retention time deviations from RT versus hydrophobicity plots are random, and the addition of an extra variable (such as a slightly different pore size) may even improve the prediction for a particular peptide. On the other hand, the application of high-accuracy algorithms like SSRCalc may be affected by even minor alterations in sorbent selectivity. The importance of these factors will increase with the development of new advanced algorithms, which will approach an R2 value of 0.99 or better for the RT versus hydrophobicity correlation.

CONCLUSIONS

The ongoing collection of high-quality peptide retention data sets using the HPLC-MALDI MS platform at the Manitoba Centre for Proteomics and Systems Biology, combined with a detailed study of the underlying retention mechanism, has resulted in the development of a series of SSRCalc algorithms for various chromatographic conditions (http://hs2.proteome.ca/SSRCalc/SSRCalc32.html). A recent addition to the SSRCalc family has been the first algorithm for RP peptide separation at pH 10, which was developed for XTerra packing material.
This opens up the possibility for effective use of peptide retention prediction in two dimensions for LC-MS proteomics analysis. The important issue of comparability of various prediction algorithms was addressed in this study by comparing separation selectivity in three SSRCalc modes: 300-Å pore size sorbents (TFA, pH 2), 100 Å (TFA, pH 2), and 100 Å (pH 10), using a wide range
of RP sorbents from different manufacturers. We found that apparent pore size is the major contributor to the variation in separation selectivity for sorbents with similar C18 chemistry. It is even possible to monitor relative sorbent pore size by observing the behavior of a selected group of peptides (small hydrophobic peptides) on different RP phases. In our earlier studies, we made essentially arbitrary choices of the sorbents for SSRCalc development, i.e., Vydac TP218 for wide pores and PepMap 100 for small pores. However, the current analysis of a number of sorbents from different vendors has given us the ability to reevaluate these initial choices. While PepMap 100 behaved very similarly to most of the 100-Å phases, the Vydac TP218 has an anomalously large pore size compared to the consensus of the 300-Å sorbents. Changing ion-pairing modifiers at acidic pH strongly affects the accuracy of prediction models by changing the chemical properties of the separated species, i.e., the peptides’ ion pairs. Retention times of peptides separated on the same column with TFA and FA as modifiers show a correlation with an R2 value of ∼0.94. Therefore, SSRCalc models developed for TFA conditions will not provide sufficient predictive accuracy for FA retention data and cannot be accurately compared against FA-based models.

ACKNOWLEDGMENT

The authors thank Remco Van Soest (Eksigent/Dionex), Wes Budakowski (K’Prime Technologies/Agilent), Kick van Lunenburg (Canadian Life Science/GLScience), Emmet Welch (Phenomenex), and Martin Gilar (Waters) for providing columns used in this study. This work was supported in part by grants from the Natural Sciences and Engineering Research Council of Canada (K.G.S., W.E.), Genome Canada (W.E.), and Canadian Institute for Health Research (J.A.W.).

SUPPORTING INFORMATION AVAILABLE

Additional information as noted in text. This material is available free of charge via the Internet at http://pubs.acs.org.
Received for review July 12, 2007. Accepted September 7, 2007. AC071474K