Anal. Chem. 2009, 81, 9522–9530
Peptide Retention Standards and Hydrophobicity Indexes in Reversed-Phase High-Performance Liquid Chromatography of Peptides

Oleg V. Krokhin*,† and Vic Spicer‡

Department of Internal Medicine, University of Manitoba, Manitoba Centre for Proteomics and Systems Biology, 799 JBRC, 715 McDermot Avenue, Winnipeg, MB, R3E 3P4, Canada, and Department of Physics and Astronomy, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada

The growing utility of peptide retention prediction in proteomics would benefit from the development of a universal peptide retention standard for better alignment of chromatographic data obtained using various liquid chromatography (LC) platforms. We describe a six-peptide mixture designed for this purpose; its members cover a wide range of hydrophobicity for the most popular modes of reversed-phase peptide high-performance liquid chromatography (HPLC): C18 sorbents with trifluoroacetic/formic acid as the ion-pairing modifier and separations at pH 10. We propose a hydrophobicity index (HI) describing the concentration of organic solvent (typically acetonitrile) that yields a retention factor of 10 under isocratic elution conditions for any peptide. This measure is a fundamental characteristic of peptide-sorbent interaction, depending only on the type of sorbent and the ion-pairing modifier used. Spiking a sample with a standard peptide mixture provides a measurement of the HI values of all detected species during gradient separation. In addition to alignment of data and calibration of chromatographic runs when peptide retention prediction protocols are used, these values obtained from proteomics experiments can be utilized directly in method development for large-scale preparative separations.

High-performance liquid chromatography (HPLC) of biological macromolecules has been a field of intensive research and inspiration for several generations of separation scientists.
Peptide LC separation science matured both theoretically and practically during the 1980s to 1990s.1-4 When recent advances in mass spectrometry led to significant breakthroughs in protein/peptide analysis (often regarded as the “proteomics era”), the chromatographic counterpart was ready to accommodate these new challenges. Indeed, the optimal separation conditions for peptide mixtures are common knowledge4 and are used with only slight variations in many proteomic laboratories.

The approaches to optimizing the basic parameters of LC separation techniques are very different between “classical” HPLC studies and modern proteomics applications. For LC specialists, peptide separation selectivity used to be the primary criterion, since detection options were usually limited to spectrophotometric methods and complete peak resolution was necessary. Now the use of the powerful mass spectrometry technique for detection allows the simultaneous identification of coeluting species, shifting the emphasis to providing an optimal sampling rate (peak capacity) for mass detectors, in addition to better separation efficiency. Despite some remaining fundamental questions of selectivity optimization, we believe that selectivity prediction will become an important part of proteomic protocols in the future.

Peptide retention (i.e., selectivity) prediction can be used as an additional filter to harden protein/peptide identification in bottom-up proteomic studies,5-7 to decrease the time required for analysis,8 and even to direct the choice of optimal separation systems for a particular sample.9 Recent developments in this area10-15 have been built on the earlier studies16-19 and fuelled by the abundant availability of peptide sequence-retention data sets. Recently there have been definite improvements in understanding the ion-pairing separation mechanism,10 its influence on the apparent hydrophobicity of amino acid side chains (especially at N- and C-termini),20 the effect of the residue position within the peptide chain,13 and the propensity to form helical structures.11 This information has resulted in significant improvements of peptide retention prediction accuracy: correlations with R² values of ∼0.98 have recently been demonstrated,11,13 while the correlation in the classical additive approach has never exceeded R² ∼0.93 on real proteomic samples.

* Corresponding author. Oleg Krokhin, phone: (204) 789 3283. Fax: (204) 480 1362. E-mail: [email protected].
† Department of Internal Medicine.
‡ Department of Physics and Astronomy.
10.1021/ac9016693 CCC: $40.75 2009 American Chemical Society. Published on Web 10/22/2009.
Analytical Chemistry, Vol. 81, No. 22, November 15, 2009
(1) Hearn, M. T. W., Ed. HPLC of Proteins, Peptides and Polynucleotides. Contemporary Topics and Applications; Wiley: New York, 1991.
(2) Snyder, L. R.; Glajch, J. L.; Kirkland, J. J. Practical HPLC Method Development; Wiley: New York, 1997.
(3) Mant, C. T.; Hodges, R. S. In HPLC of Biological Macromolecules; Marcel Dekker: New York, 2002; pp 433-511.
(4) Snyder, L. R.; Dolan, J. W. High-Performance Gradient Elution: The Practical Application of the Linear-Solvent-Strength Model; Wiley: New York, 2006.
(5) Palmblad, M.; Ramström, M.; Markides, K. E.; Håkansson, P.; Bergquist, J. Anal. Chem. 2002, 74, 5826–5830.
(6) Strittmatter, E. F.; Kangas, L. J.; Petritis, K.; Mottaz, H. M.; Anderson, G. A.; Shen, Y.; Jacobs, J. M.; Camp, D. G., 2nd; Smith, R. D. J. Proteome Res. 2004, 3, 760–769.
(7) Strittmatter, E. F.; Ferguson, P. L.; Tang, K.; Smith, R. D. J. Am. Soc. Mass Spectrom. 2003, 14, 980–991.
(8) Krokhin, O. V.; Ying, S.; Cortens, J. P.; Ghosh, D.; Spicer, V.; Ens, W.; Standing, K. G.; Beavis, R. C.; Wilkins, J. A. Anal. Chem. 2006, 78, 6265–6269.
(9) Dwivedi, R. C.; Spicer, V.; Harder, M.; Antonovici, M.; Ens, W.; Standing, K. G.; Wilkins, J. A.; Krokhin, O. V. Anal. Chem. 2008, 80, 7036–7042.
(10) Krokhin, O. V.; Craig, R.; Spicer, V.; Ens, W.; Standing, K. G.; Beavis, R. C.; Wilkins, J. A. Mol. Cell. Proteomics 2004, 3, 908–919.
(11) Krokhin, O. V. Anal. Chem. 2006, 78, 7785–7795.
(12) Shinoda, K.; Sugimoto, M.; Yachie, N.; Sugiyama, N.; Masuda, T.; Robert, M.; Soga, T.; Tomita, M. J. Proteome Res. 2006, 5, 3312–3317.
(13) Petritis, K.; Kangas, L. J.; Yan, B.; Monroe, M. E.; Strittmatter, E. F.; Qian, W. J.; Adkins, J. N.; Moore, R. J.; Xu, Y.; Lipton, M. S.; Camp, D. G., 2nd; Smith, R. D. Anal. Chem. 2006, 78, 5026–5039.
(14) Gorshkov, A. V.; Tarasova, I. A.; Evreinov, V. V.; Savitski, M. M.; Nielsen, M. L.; Zubarev, R. A.; Gorshkov, M. V. Anal. Chem. 2006, 78, 7770–7777.
(15) Klammer, A. A.; Yi, X.; MacCoss, M. J.; Noble, W. S. Anal. Chem. 2007, 79, 6111–6118.
(16) Meek, J. L. Proc. Natl. Acad. Sci. U.S.A. 1980, 77, 1632–1636.
(17) Browne, C. A.; Bennett, H. P. J.; Solomon, S. Anal. Biochem. 1982, 124, 201–208.

One might speculate that the virtually unlimited data sets being filled by mass spectrometry, tandem mass spectrometry (MS/MS) peptide identification would ultimately render predictive models unnecessary, as all retention characteristics for all possible peptides would eventually be experimentally determined. However, there are significant barriers preventing this from happening soon. First, although the majority of proteomic researchers deal only with the separation of tryptic peptides, large numbers of nontryptic species are formed in vivo, and even during the tryptic digestion; chemical and post-translational protein/peptide modifications add further complications. This makes it difficult to estimate the number of species one might deal with during proteomic experiments. Second, there are a large number of variations in mobile/stationary phases used in proteomic LC. There are many commercial suppliers of RP-separation media, and these products are the subject of ongoing improvements and modifications. There are several variations of mobile phase composition, such as electrospray ionization (ESI)- or matrix-assisted laser desorption ionization (MALDI)-compatible, or high pH RP for two-dimensional schemes.
Different mobile/stationary phase combinations require detailed comparison of separation selectivity, combining them in groups based on similarity and the collection of separate data sets.21 Third, it is unlikely that such a vast amount of information can be collected in one laboratory or even using one LC experimental platform. This will require collective efforts from the proteomic community, making the issue of a standardized alignment of LC-MS data a critically important task. Monitoring retention times (RT) of the redundant species present in different LC-MS runs is a common practice for LC data alignment and calibration of chromatographic systems. For example, Petritis et al. followed the retention of six peptides frequently observed in D. radiodurans and S. oneidensis in order to align 687 different LC-MS/MS runs.22 When dealing with the samples from nonrelated organisms, spiking the analyzed mixture with known peptides is required. In our own work, we have added human transferrin to all of the protein mixtures used to collect peptide retention data sets for optimization of our prediction model. After tryptic digestion, it produces ∼35 easily detectable peptides; we used them to draw
RT vs calculated hydrophobicity (H) linear plots, and to align the data in order to adjust for small changes in slopes and intercepts. More recently we have used a simpler digest of horse heart myoglobin for the same purpose.11 A similar approach was undertaken by Tarasova et al.23 utilizing a cytochrome C digest. We also proposed to use the extrapolated “ideal” (or measured) hydrophobicity values of these standard peptides on the combined RT vs H plots for the optimization data set.11 Finding the retention times of target peptides in each LC run gives a very robust chromatographic calibration and data alignment. An alternative approach is to use a number of peptides that have been confidently identified by MS/MS to plot RT vs H as a calibration.15,24 However, the accuracy of this procedure is dependent on the elimination of false-positive identifications, as well as the retention prediction accuracy for a particular set of peptides. Although these methods have given good results in many cases, we believe that the use of synthetic peptides is the preferred method for LC-MS alignment. Real digests of spiked proteins often contain products of nonspecific cleavage and extraneous impurities. On the other hand, synthetic peptides can be easily purified resulting in a calibration mixture that gives minimal interference with MS procedures. An optimal peptide retention standard (peptide mixture) should (i) uniformly cover a wide range of peptide hydrophobicities similar to real proteomic samples; (ii) be easily detectable by conventional MS equipment (e.g., have suitable molecular weights and ionization properties); (iii) not interfere with the MS/MS identification of the analyzed sample or create false-positive identifications; (iv) not contain peptides prone to chemical modifications (oxidation, deamidation, N-terminal cyclization, etc.). Guo et al. 
have developed a peptide retention standard for RP-HPLC of peptides that contains a mixture of five synthetic C-terminal amide decapeptides of the generic formula: [Ac-Arg-Gly-X-X-Gly-Leu-Gly-Leu-Gly-Lys-Amide].25 Sequence variations of [X-X] are [Gly-Gly], [Ala-Gly], [Val-Gly], and [Val-Val]. The fifth peptide contains the [Ala-Gly] sequence plus a free N-terminal amino group. Molecular weights of these species ranged between 883 and 995 Da, making them easily detectable by both ESI and MALDI MS techniques. The only disadvantage of this mixture is its narrow range of hydrophobicities, ∼6% on an acetonitrile percentage scale, whereas peptides typically elute within a ∼40% window. Recently, Eyers et al.26 developed a peptide standard to address the performance of LC-MS systems as a whole, including the separation and detection parts with emphasis on the latter. They generated the artificial protein QCAL1, which provides a set of 22 peptides of various sizes and hydrophobicities upon tryptic digestion.

(18) Guo, D.; Mant, C. T.; Taneja, A. K.; Parker, J. M. R.; Hodges, R. S. J. Chromatogr. 1986, 359, 499–517.
(19) Su, S. J.; Grego, B.; Niven, B.; Hearn, M. T. W. J. Liq. Chromatogr. 1981, 4, 1745–1753.
(20) Tripet, B.; Cepeniene, D.; Kovacs, J. M.; Mant, C. T.; Krokhin, O. V.; Hodges, R. S. J. Chromatogr., A 2007, 1141 (2), 212–225.
(21) Spicer, V.; Yamchuk, A.; Cortens, J.; Sousa, S.; Ens, W.; Standing, K. G.; Wilkins, J. A.; Krokhin, O. V. Anal. Chem. 2007, 79, 8762–8768.
(22) Petritis, K.; Kangas, L. J.; Ferguson, P. L.; Anderson, G. A.; Pasa-Tolic, L.; Lipton, M. S.; Auberry, K. J.; Strittmatter, E. F.; Shen, Y.; Zhao, R.; Smith, R. D. Anal. Chem. 2003, 75, 1039–1048.
(23) Tarasova, I. A.; Guryca, V.; Pridatchenko, M. L.; Gorshkov, A. V.; Kieffer-Jaquinod, S.; Evreinov, V. V.; Masselon, C. D.; Gorshkov, M. V. J. Chromatogr., B 2009, 877, 433–440.
(24) May, D.; Fitzgibbon, M.; Liu, Y.; Holzman, T.; Eng, J.; Kemp, C. J.; Whiteaker, J.; Paulovich, A.; McIntosh, M. J. Proteome Res. 2007, 6, 2685–2694.
(25) Guo, D.; Mant, C. T.; Taneja, A. K.; Hodges, R. S. J. Chromatogr. 1986, 359, 519–532.
(26) Eyers, C. E.; Simpson, D. E.; Wong, S. C. C.; Beynon, R. J.; Gaskell, S. J. J. Am. Soc. Mass Spectrom. 2008, 19, 1275–1280.

Another issue closely related to retention standard development is how the value of peptide hydrophobicity is expressed and used in chromatographic experiments. For example, our sequence-specific retention calculator (SSRCalc) model plots linear dependencies of RT vs calculated hydrophobicity, with the latter being a product of multiple factors reflecting the influence of amino acid sequence, pI, and the propensity to form helical structures.11 This parameter has no connection to any physical property of the separated species unless it is related to the hydrophobicity of standard peptide(s). An alternative way to represent retention prediction data is the use of normalized retention/elution time (NRT/NET) values.12,22,23 While this approach normalizes retentions to a set of known peptides, these values do not express real chromatographic properties of the species either, unless the standard peptides are well characterized.

Although the general concept of hydrophobicity of peptides and proteins is well understood, the field is known for its large number of hydrophobicity scales.27-29 Operationally, the role which the surrounding environment (i.e., pH, ionic strength, interacting partners, etc.) plays in protein interactions adds more difficulty to the quantitation of hydrophobicity. Nevertheless, peptide RP-HPLC has a chance to avoid this fate: interaction occurs on artificially and reproducibly manufactured hydrophobic surfaces under strictly controlled conditions of various ion-pairing environments. Therefore, the intelligent selection of compatible RP-HPLC systems (as discussed above) provides conditions for better standardization and development of universal hydrophobicity scales (at least on a peptide level). There is only one common variable in all these systems, the continually changing concentration of the organic solvent. This is the most attractive and natural parameter to express peptide hydrophobicity in RP-HPLC. More importantly, the acetonitrile percentage (ACN%) scale allows a simple transfer of the methods between gradient and isocratic modes.
Once the acetonitrile concentration at which a component of interest elutes from the column is determined, it can be used to design a preparative method that employs a very shallow gradient or even an isocratic separation. Hydrophobicity in ACN% units can be measured under both gradient and isocratic conditions. Thus, knowing the retention time, the gradient delay time and the slope of the gradient allows one to determine the acetonitrile concentration at which any particular component elutes from the column. For the peptide retention standard introduced by Guo et al.,25 the hydrophobicity of peptide no. 4 was assigned as 17.5 using this procedure. Such measurements under gradient conditions, however, may result in significant error due to the specific character of gradient elution.4 During RP-HPLC gradient elution, peptides start to move through the column with increasing speed. This acceleration is different for different peptides due to variation of the S values in the basic equation of linear-solvent-strength (LSS) theory: $\log k = \log k_w - S\phi$, where $k$ is the retention factor at organic solvent volume fraction $\phi$ ($\phi$ = ACN%/100) and $k_w$ is the retention factor at $\phi = 0$. Therefore, the use of isocratic elution is preferable for this purpose, as peptide retention will depend only on the type of sorbent, the nature and concentration of the ion-pairing modifier, and the temperature.

(27) Janin, J. Nature 1979, 277, 491–492.
(28) Kyte, J.; Doolittle, R. F. J. Mol. Biol. 1982, 157, 105–132.
(29) Cornette, J.; Cease, K. B.; Margalit, H.; Spouge, J. L.; Berzofsky, J. A.; DeLisi, C. J. Mol. Biol. 1987, 195, 659–685.
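The two calculations discussed above, reading an ACN% value off a linear gradient and evaluating the LSS relation under isocratic conditions, can be sketched numerically. This is a minimal illustration rather than the authors' procedure: the function names, the neglect of column dead time, and every numeric input are assumptions made for the example. The default target of k = 10 follows the HI definition given in the abstract.

```python
import math


def acn_at_elution(t_r, t_delay, start_acn, slope):
    """ACN% of a linear gradient at the moment a peak elutes.

    t_r        retention time (min)
    t_delay    gradient delay time (min)
    start_acn  ACN% at the start of the gradient
    slope      gradient slope (ACN% per min)
    Column dead time is ignored, which is a simplification.
    """
    elapsed = max(t_r - t_delay, 0.0)
    return start_acn + slope * elapsed


def lss_retention_factor(log_kw, s, acn_percent):
    """Isocratic retention factor k from log k = log kw - S * phi."""
    phi = acn_percent / 100.0
    return 10.0 ** (log_kw - s * phi)


def acn_for_retention_factor(log_kw, s, k_target=10.0):
    """ACN% at which the LSS model predicts k == k_target."""
    return 100.0 * (log_kw - math.log10(k_target)) / s


if __name__ == "__main__":
    # Hypothetical run: 5 min delay, gradient from 0% ACN at 0.75%/min.
    print(acn_at_elution(t_r=25.0, t_delay=5.0, start_acn=0.0, slope=0.75))
    # Hypothetical LSS parameters for one peptide.
    print(acn_for_retention_factor(log_kw=3.0, s=20.0))
```

Because S differs from peptide to peptide, two peptides with the same gradient elution ACN% can have different isocratic values from `acn_for_retention_factor`, which is the source of error the text attributes to gradient measurements.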
Valko and Slegel30 introduced a chromatographic hydrophobicity index measured under isocratic RP-HPLC conditions as a measure of the lipophilic character of separated compounds. The parameter was defined as the organic solvent concentration at which the log k value equals 0 for a particular analyte. A similar approach can be utilized for peptide RP-HPLC, once standard sets of the chromatographic conditions (stationary and mobile phases) for peptide separation are well-defined. Our recent studies of peptide separation selectivity using a variety of reversed-phase columns11,21 and ion-pairing conditions9,10 allowed us to define such sets, making it possible to unify the expression of peptide hydrophobicity in RP-HPLC. This technical note focuses on the design and study of the chromatographic behavior of a six-peptide standard mixture covering a wide range of hydrophobicities in different RP-HPLC modes: 100 and 300 Å C18 sorbents with trifluoroacetic acid, 100 Å with formic acid as ion-pairing modifiers, and C18 RP separation at pH 10. We propose the use of peptide hydrophobicity index values in an acetonitrile percentage scale measured under isocratic conditions in order to express peptide hydrophobicity in RP-HPLC.

(30) Valko, K.; Slegel, P. J. Chromatogr. 1993, 631, 49–61.

EXPERIMENTAL SECTION

Reagents. Deionized (18 MΩ) water and HPLC-grade acetonitrile were used for the preparation of eluents. All chemicals (including the purified proteins) were sourced from Sigma Chemicals (St. Louis, MO), unless otherwise noted. Sequencing-grade modified trypsin (Promega, Madison, WI) was used for digestion. The six peptides (referred to henceforth as P1-P6), LGGGGGGDGSR, LGGGGGGDFR, LLGGGGDFR, LLLGGDFR, LLLLDFR, LLLLLDFR, were custom synthesized by BioSynthesis Inc. (Lewisville, TX). The peptide retention standard S1-S5 described in the introductory material was purchased from Pierce (Rockford, IL).

Protein Digestion. A tryptic digest of a five-protein mixture [(human/bovine) transferrin and albumin, plus horse myoglobin] was used for the microLC-MALDI MS experiments. Proteins were reduced (10 mM dithiothreitol, 30 min, 57 °C), alkylated (50 mM iodoacetamide, 30 min in the dark at room temperature), dialyzed (100 mM NH4HCO3, 6 h, 7 kDa MWCO, Pierce), and digested with trypsin (1/50 enzyme/substrate weight ratio, 12 h, 37 °C). This protein digest was diluted prior to injection and spiked with a mixture of the six peptides to provide ∼1-2 pmol per injection for all components. A tryptic digest of four human proteins (transferrin, lactoferrin, albumin, and fibrinogen) was prepared in a similar way and used for nanoLC-ESI MS/MS analyses (∼100 fmol per injection).

Chromatography. Three different chromatographic setups were used: nano-, micro-, and normal flow. Microflow at 3 µL/min and normal-flow at 150 µL/min were performed utilizing a micro-Agilent 1100 series system (Agilent Technologies, Wilmington, DE) with two column sizes: 300 µm × 150 mm (micro) and 1 mm × 100 mm. The three stationary phases studied included Vydac 218 TP C18, 300 Å pore size, 5 µm (Grace Vydac, Hesperia, CA), Luna C18(2) 100 Å, 5 µm (Phenomenex, Torrance, CA), and XTerra C18, 120 Å, 5 µm (Waters, Milford, MA). Three different ion-pairing systems with linear water-acetonitrile gradients were employed: 0.1% trifluoroacetic acid (TFA) in both eluents for Vydac
Table 1. Amino Acid Sequences, m/z Values, Calculated/Measured SSRCalc Hydrophobicity for a 100 Å TFA Model, and HI Values for a Six-Peptide Standard Mixture Measured for Four Different Mobile/Stationary Phase Combinations

                                        P1           P2          P3         P4        P5       P6
peptide sequence                        LGGGGGGDGSR  LGGGGGGDFR  LLGGGGDFR  LLLGGDFR  LLLLDFR  LLLLLDFR
m/z (1+), MALDI                         889.413      892.428     891.469    890.51    889.551  1002.635
m/z (2+), ESI                           445.211      446.718     446.239    445.759   445.280  501.822
hydrophobicity, 100 Å TFA, calculated   3.25         17.68       21.5       29.7      39.66    42.73
hydrophobicity, 100 Å TFA, measured(a)  0.89         13.53       21.32      30.33     39.27    44.95
HI, 100 Å C18-TFA                       3.97         10.74       15.53      20.72     26.07    29.58
HI, 300 Å C18-TFA                       1.89         8.40        13.35      18.90     24.78    28.60
HI, 100 Å C18-FA
HI, 100 Å C18-pH 10
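Assigning HI values to other peptides in a spiked gradient run can be sketched as a straight-line calibration through the six standards. The HI values below are read from Table 1 on the assumption that they belong to the 100 Å C18-TFA column; the retention times, the function names, and the choice of an ordinary least-squares line are illustrative assumptions and not the authors' stated procedure.

```python
# HI values (ACN%) for P1-P6, read from Table 1 and assumed to be
# the 100 Å C18-TFA entries.
STANDARD_HI = [3.97, 10.74, 15.53, 20.72, 26.07, 29.58]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def calibrate(standard_rts, standard_his=STANDARD_HI):
    """Return a function mapping retention time (min) to HI (ACN%)."""
    slope, intercept = fit_line(standard_rts, standard_his)
    return lambda rt: slope * rt + intercept


if __name__ == "__main__":
    # Hypothetical retention times (min) of P1-P6 in one gradient run.
    example_rts = [9.8, 18.9, 25.2, 32.1, 39.3, 43.9]
    rt_to_hi = calibrate(example_rts)
    # Estimated HI for a hypothetical peptide eluting at 30 min.
    print(round(rt_to_hi(30.0), 2))
```

A single straight line is the simplest choice; if the standards showed curvature in a given run, a piecewise interpolation between neighbouring standards would be an alternative.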