Unifying Expression Scale for Peptide ... - ACS Publications


Article

A unifying expression scale for peptide hydrophobicity in proteomic RP HPLC experiments Marine Grigoryan, Dmitry Shamshurin, Victor Spicer, and Oleg V. Krokhin Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/ac402310t • Publication Date (Web): 15 Oct 2013 Downloaded from http://pubs.acs.org on October 20, 2013


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Analytical Chemistry

A unifying expression scale for peptide hydrophobicity in proteomic RP HPLC experiments

Marine Grigoryan1, Dmitry Shamshurin1, Victor Spicer1, Oleg V. Krokhin1,2*

AUTHOR ADDRESS
1Manitoba Centre for Proteomics and Systems Biology, 2Department of Internal Medicine, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg, R3E 3P4, Canada

ABSTRACT
As an initial step in our efforts to unify the expression of peptide retention times in proteomic LC-MS experiments, we aligned the chromatographic properties of a number of peptide retention standards against a collection of peptides commonly observed in proteomic experiments. The standard peptide mixtures and tryptic digests of samples of different origins were separated under identical chromatographic conditions, those most commonly employed in proteomics: 100 Å C18 sorbent with 0.1% formic acid as ion-pairing modifier. Following our original approach (Krokhin & Spicer, Analytical Chemistry 2009), the retention characteristics of these standards and of the collection of tryptic peptides were mapped into Hydrophobicity Index (HI) or acetonitrile percentage units. This scale allows for direct visualization of the chromatographic outcome of LC-MS acquisitions and monitoring of the performance of the gradient LC system, and simplifies


method development and inter-laboratory data alignment. Wide adoption of this approach would significantly aid understanding of the basic principles of gradient peptide RP-HPLC and solidify our collective efforts in acquiring confident peptide retention libraries, a key component in the development of targeted proteomic approaches.

INTRODUCTION

The combination of liquid chromatography and mass spectrometry is the method of choice for the majority of proteomic applications1. The mass spectrometry component has been rightfully considered the dominant element, providing unprecedented capabilities for the analysis of peptides and proteins; MS techniques have seen dramatic improvements in the past 15 years2,3. The chromatographic component of LC-MS protocols, mostly reversed-phase HPLC (RP HPLC), has remained in the shadow of the advancing MS techniques. However, it can provide a significant amount of auxiliary information encoded in peptide retention times. In an attempt to use this information for confident peptide identification4, and in the development of targeted quantitation assays5, several approaches have been developed for peptide retention prediction and the collection of retention data sets6-14.

Both RP HPLC and mass spectrometry are ultimately separation techniques, based on differences in analyte hydrophobicity and mass, respectively. While the fundamentals of accurate mass measurement and comparison to its calculated values are


extremely well developed, the same cannot be said of RP HPLC, where the prediction approaches developed to date still require significant improvement15. Reproducible measurement of peptide hydrophobicity is also a formidable task, as it is affected by multiple factors, including variations in flow rate, column size, accuracy of gradient delivery, and column temperature. Different LC-MS settings employ various gradient slopes, column packings and ion-pairing additives. While keeping identical LC conditions is the best recipe for reproducible measurement of retention times, the use of internal calibration (standard peptides) is the only viable approach to achieve accurate measurement and alignment of LC data.

The first peptide retention standards were developed in the 1980s to monitor the performance of RP-HPLC columns under different chromatographic conditions. Hodges and co-workers described a mixture of 5 synthetic peptides16, but it spans only ~7% acetonitrile on the RP HPLC retention scale17, making it of limited value for aligning the diversity of peptide mixtures encountered in proteomics. This group has continued with the design of retention standards to address differences in separation selectivity across different RP-HPLC columns18 and various modes of peptide separation.

The first applications of retention standards in the proteomic era were related to the collection of peptide retention datasets for optimization of retention prediction algorithms. Petritis et al.6 employed six peptides commonly observed in tryptic digests of D. radiodurans and S. oneidensis to align multiple LC-MS runs used for optimization of their peptide retention prediction model. Our work involved spiking samples with tryptic digests of human transferrin or horse myoglobin9, and later with a collection of designed synthetic peptides (Table 1)17 spanning a wide range of hydrophobicity.


Tarasova et al. compared the retention parameters of tryptic peptides from Cytochrome C on eight different C18 packing materials and proposed using this digest for alignment purposes19.

Retention standard peptides also permit monitoring of gradient LC system performance: reproducible retention values and peak shapes indicate consistent gradient delivery and separation characteristics. Quality control of both the MS and LC components is an even more challenging task, as attempted by Eyers et al.20 The authors proposed using an artificial protein, QCAL1, which generates a mixture of 22 peptides upon tryptic digestion (Table 1). These peptides were designed to test various parameters of MS performance, while also covering a wide range of hydrophobicities on the RP-HPLC scale.

The wide adoption of targeted quantitative proteomic approaches5 makes application of retention standards mandatory. The throughput of selected reaction monitoring (SRM) can be increased dramatically by “scheduling” of the transitions21, but this requires highly reproducible chromatographic separations and constant monitoring of chromatographic performance. To address this, we developed a mixture of 6 synthetic peptides spanning the entire hydrophobicity scale for typical tryptic species17. These peptides were also designed to permit their sampling in a single SRM transition by maintaining an almost constant peptide mass (GlyGly – Leu substitution) across the full range of hydrophobicities.

Standard mixtures of synthetic peptides were recently introduced by Thermo Fisher22 and Biognosys (iRT - Retention time normalization kit)23, containing 15 and 11 peptides, respectively. Thermo Fisher’s (henceforth TF) standards were designed using sequences of real tryptic peptides of Saccharomyces cerevisiae (Table 1)22, but with C-terminal Lys and Arg substituted with isotopically labeled Lys


(+8 Da) and Arg (+10 Da), preventing their interference with endogenous peptides during LC-MS/MS analysis. Escher et al.23 used sequences of tryptic peptides from Leptospira interrogans with the substitution of some amino acids (Table 1, iRT standard) to derive their eleven unique peptide sequences, which also exhibit a wide span of hydrophobicity values. This mixture was incorporated into Skyline24, the very popular platform for SRM assay development. Retention values (iRTs) are assigned to all 11 components of this mixture23 and used to transfer retention characteristics of targeted peptides between systems with different chromatographic settings, e.g. between discovery and targeted runs.

Independent of the set of peptides used, all retention alignments are based on linear correlation plots between observed retention and assigned hydrophobicity values. Predicted retention times can be used to adjust retention windows for scheduled SRM methods or post-acquisition SRM-type quantitation such as ABSciex’s SWATH25. Recently, dynamic or “on-the-fly” adjustment of retention time windows has also benefited from these standard peptides, yielding improvements in scheduled SRM26 or information dependent acquisition27 analysis via the exclusion of expected peptides belonging to previously identified proteins.

Given the value of retention time alignment using standard peptides, it is also important to note the absence of a consensus scale for expressing peptide hydrophobicity in proteomic experiments. Petritis et al.6 used a normalized elution time (NET; 0-1) scale to express peptide retention for their data collection and in the output of the retention prediction model. We use unitless “SSRCalc hydrophobicity” values as an expression for the outputs of the Sequence Specific Retention Calculator model9. iRT peptides were normalized


over a scale from 0 to 100, with the extrema belonging to the second and eleventh members of the mixture23. While these are all equally valid approaches, none of them links explicitly to the particular properties of RP-HPLC separation, and all require (admittedly minor) effort for cross comparisons.

To address these differences we proposed using acetonitrile concentration (HI, hydrophobicity index)17 to express peptide retention in RP HPLC. In earlier studies by Hodges and co-workers16, the retention coefficients for individual amino acids were already linked to acetonitrile concentrations, as all their separations were performed at a 1% per minute acetonitrile gradient. We measured retention of six designed sequences (P1-P6) under isocratic conditions to determine the acetonitrile concentration at which a particular member of our standard mixture has a retention factor (k) equal to 10. These values were determined for all six members of our standard using C18 100 Å sorbents with 0.1% TFA or 0.1% formic acid as ion-pairing agents, and for separation at pH 10. The outputs of the various SSRCalc models were mapped into HI units using a set of linear equations17, providing a very straightforward approach for expressing observed or predicted hydrophobicity values in proteomics experiments, with some immediate advantages over existing methods:

a) The retention time vs. HI plot for a typical bottom-up proteomic run provides information that can directly inform the design of RP-HPLC conditions. For example, the HI scale will indicate that most of the tryptic peptides elute between 0 and 30% acetonitrile when 0.1% formic acid is used as eluent additive. Thus, if the predicted HI value for targeted peptide(s) is below zero, it is unlikely that this component will be observed using injection via a trap column. Conversely, peptides with HI values >30%


acetonitrile are also unlikely to be observed, information that would normally be obtained through a preliminary diagnostic separation run.

b) Retention time vs. HI plots have slope values (min/acetonitrile%) reciprocal to the actual experimental gradient slope (acetonitrile%/min), allowing the monitoring of gradient delivery accuracy.

c) Working in physical units permits the representation of RP-HPLC processes in a way that is more intuitive for instructional purposes.

Here we attempt an alignment of existing peptide retention standards20, 22, 23 and some common tryptic digests (human, E. coli, S. cerevisiae) with our P1-P6 mixture17 under typical conditions used in proteomic LC-MS experiments. This will allow the assignment of HI values to all components of the retention standard mixtures and digests under investigation, facilitating inter-laboratory data conversion and analysis.

EXPERIMENTAL SECTION

Chemicals. All chemicals were sourced from Sigma Chemicals (St. Louis, MO), unless noted otherwise. HPLC-grade acetonitrile and de-ionized water were used for the preparation of eluents. QCAL protein sample was provided by PolyQuant GmbH (Bad Abbach, Germany). iRT and Thermo Fisher standard peptide mixtures were purchased from Biognosys (Zurich, Switzerland) and Thermo Fisher Scientific (Rockford, IL), respectively. The six in-house designed peptides (P1-P6) were custom synthesized by BioSynthesis Inc. (Lewisville, TX) and purified individually using RP HPLC. Sequencing-grade modified trypsin (Promega, Madison, WI) was used for digestion.
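Point (b) in the Introduction and the stated aim of assigning HI values to other standards both rest on a linear fit of retention time against HI for the reference peptides. The arithmetic can be illustrated with a minimal Python sketch. The HI values for P2 and P6 (4.58 and 21.59) are those quoted in this paper; the retention times, the function names and the 20.0 min query are hypothetical and were chosen only to mimic a 1% per minute gradient. In a real run all retained members of the P1-P6 mixture would presumably enter the fit.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# HI values (% acetonitrile) for P2 and P6 as quoted in the paper.
standards_hi = [4.58, 21.59]
# HYPOTHETICAL observed retention times (min) for the same two peptides.
standards_rt = [14.0, 31.01]

# Slope of the retention time vs. HI plot, in min per % acetonitrile.
slope, intercept = fit_line(standards_hi, standards_rt)

# Its reciprocal estimates the delivered gradient slope (% acetonitrile/min).
gradient_estimate = 1.0 / slope


def rt_to_hi(rt):
    """Map an observed retention time (min) onto the HI scale."""
    return (rt - intercept) / slope


# HI assigned to a hypothetical analyte eluting at 20.0 min.
hi_unknown = rt_to_hi(20.0)
```

Comparing `gradient_estimate` with the programmed gradient is one way to check gradient delivery, and `rt_to_hi` is the same mapping that would assign HI values to any other peptide observed in the run.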


Protein digestion. QCAL1 protein (0.5 mg/ml) was provided in 0.1% trifluoroacetic acid. Prior to digestion the pH was adjusted to ~8 using 200 mM ammonium bicarbonate. An aliquot containing ~50 µg of QCAL1 protein was reduced (10 mM dithiothreitol, 30 min, 57 °C) then alkylated (50 mM iodoacetamide, 30 min in the dark at room temperature). Excess iodoacetamide was quenched using dithiothreitol. The resulting mixture was digested with trypsin (1/50 enzyme/substrate weight ratio, 12 hours at 37 °C). Stock solutions of the TF and iRT standards were prepared according to the manufacturers’ procedures; the QCAL digest and TF/iRT mixtures were analysed separately. In both cases, stock solutions were diluted and spiked with the P1-P6 mixture to provide ~200-400 fmol injection of all components.

E. coli K12 DH5α (Invitrogen) cells transformed with pLKO.1 vector were cultivated in LB media with 100 µg/ml carbenicillin at 37 °C until an O.D. at 600 nm of 1.0 was reached. Cells were harvested by centrifugation at 4000 g for 15 min, resuspended and washed twice with PBS buffer.

S. cerevisiae strain S150 was grown overnight in YPD (1% (w/v) yeast extract, 2% peptone, 2% dextrose) medium at 30 °C28. Cells were harvested by centrifugation at 5000 g for 10 minutes, washed with PBS and stored at -20 °C prior to use.

Peripheral blood mononuclear cells (PBMC) were isolated from the whole blood of healthy donors by Ficoll gradient centrifugation. The cells were washed twice with PBS, resuspended in AIM V CTS Serum Free Medium (Invitrogen) and stimulated for 16 hours with phytohaemagglutinin P (5 µg/mL). Jurkat Clone E6-1, a human T lymphoblastoid cell line derived from an acute T cell leukemia, was obtained from ATCC


(TIB-152™). All human cell lines were maintained at 37 °C in a humidified 5% CO2 atmosphere. The PBMC line and Jurkat were cultured in RPMI 1640 medium supplemented with 10% FBS. The nonadherent lymphocytes were harvested by centrifugation (300 g for 5 minutes) and resuspended in fresh medium supplemented with 12.5 ng/mL IL-2. All cell lines were subcultured in fresh IL-2 containing media every 2-3 days. All samples were obtained with informed consent using a protocol approved by the University of Manitoba Research Ethics Board.

Tryptic digests of Jurkat, PBMC, E. coli and yeast cells were prepared using the FASP digestion procedure29. Protein amounts to be subjected to digestion were monitored using micro-BCA assay (Pierce, Rockford, IL). Digests were acidified with TFA and purified by RP-HPLC. Approximately 1000 ng of each digest, spiked with the standard P1-P6 peptides, was used for each LC-MS/MS acquisition.

HPLC-MS. A splitless nano-flow 2D LC Ultra system (Eksigent, Dublin, CA) with 10 µL sample injection via a 300 µm × 5 mm PepMap100 trap-column and a 100 µm × 200 mm analytical column packed with 5 µm Luna C18(2) (Phenomenex, Torrance, CA) was used for all RP-LC separations. Both eluents A (water) and B (acetonitrile) contained 0.1% formic acid as ion-pairing modifier. Three different linear gradients of 0.33, 0.66 and 1% acetonitrile per minute were used to separate the standard peptide mixtures. Tryptic digests were separated using the 0.66% gradient (0.5-35% acetonitrile) followed by a 5 min washing step with 90% buffer A and 8 min equilibration with the starting 0.5%, which corresponded to 65 minutes of LC-MS instrument time. All


separations were performed at room temperature.

A TripleTOF5600 mass spectrometer (ABSciex, Concord, ON) was used in standard MS/MS data-dependent acquisition mode. Survey MS spectra (250 ms, m/z 300-1500) were collected and followed by up to 20 MS/MS measurements on the most intense parent ions (300 counts/sec threshold, +2 to +4 charge state, m/z 100-1500 mass range for MS/MS, 100 ms each). Previously targeted parent ions were excluded from repetitive MS/MS acquisition for 12 sec (50 mDa mass tolerance). Raw spectra files were converted into Mascot Generic File format (MGF) for peptide/protein identification by X!tandem. The following search parameters were used: 20 ppm and 0.1 Da mass tolerance for parent and fragment ions, respectively; constant modification of Cys with iodoacetamide; no variable modification allowed; expectation value cut-off of log(e) < -3.

The LC-MS/MS data from the analysis of the induced pluripotent stem cell 201B7-P32 line were downloaded from the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org, data set identifier PXD000071)30. Each chromatogram in this study consisted of five consecutive MS/MS acquisitions (2 hours each) performed during a 10 h chromatographic separation on a 2-meter-long monolithic column. MGF files were created for each portion of a chromatogram, and then all were concatenated, encoding a cumulative MS/MS acquisition time.

Raw LC-MS/MS data from micro MudPIT analysis of S. cerevisiae by Webb et al.31 were downloaded from http://fields.scripps.edu/published/YeastJPR2013/ and converted to MGF format using the ProteoWizard software package32. In this case, 39 ammonium acetate salt steps (SCX) were applied in combination with short water/acetonitrile gradients (RP). Single MudPIT


acquisition was represented by 13 individual files, each containing three reversed-phase gradients. Similarly, individual MGF files were concatenated with the adjustment of cumulative acquisition time.

Retention time assignment and hydrophobicity calculation. Retention times for the standard peptide mixtures were determined manually using the XIC extraction tools in PeakView (ABSciex). Retention times for the tryptic digest samples were assigned to each non-redundant peptide as the intensity-weighted time average for the two most intense MS/MS spectra. Experimental retention values for both the standard mixtures and tryptic digests under investigation (Jurkat, PBMC, E. coli, S. cerevisiae) and tryptic fragments of the human/bovine albumin/transferrin/lactoferrin/fibrinogen mixtures33 are provided in Supporting Information. Peptides’ hydrophobicity in HI units was predicted using current versions of SSRCalc for formic acid conditions (http://hs2.proteome.ca/SSRCalc/SSRCalcX.html). HI values for all standard peptides and the 3000+ most abundant analytes in whole cell digests were determined via a least-squares linear regression fit using the HI values previously assigned to each member of the P1-P6 mixture17.

RESULTS AND DISCUSSION

RP-HPLC analysis of standard peptide mixtures to confirm identity in separation selectivity. The use of peptide retention standards assumes a linear dependence between retention times of analytes separated using similar chromatographic conditions. The developers of the iRT standard23 highlighted the exclusive character of iRT values provided in original publications fitting separations on C18 columns, using formic acid as


ion-pairing modifier and acetonitrile as organic solvent. iRT values were empirically assigned (0 and 100 values for peptides #2 and #11, respectively) to each component of the mixture using normalization between observed retention and the iRT scale. These values were used to predict retention of target species separated under different gradient conditions. Similarly, our efforts on developing SSRCalc predictive algorithms resulted in three different models corresponding to separations on C18 columns at pH 10 and at acidic pH with TFA or formic acid based eluents; three HI scales were introduced to reflect these differences17. Further interrogation of the compatibility between different RP-LC conditions might include a comparison of different C18 matrices, column temperature, and other parameters.

Accurate retention time alignment between different chromatographic systems requires similar separation selectivity. To establish this, we compared the retention of the three studied standards under our LC settings with reported literature data. The four cited publications used C18 100 Å resins, water-acetonitrile gradients and 0.1% formic acid as ion-pairing modifier (0.2% for Eyers et al.20). Figure S1 (Supporting Information) shows the retention time correlations between the literature data and observed peptide retention times using the 0.66% gradient in our system. Eyers et al.20 used a non-linear acetonitrile gradient in their publication, resulting in differences in peak shapes; this also yielded a characteristic S-shaped correlation plot between the literature data and our experiments performed using linear gradients (Figure S1a). The iRT values for the Biognosys standard and the retention times for TF were determined using linear gradients. Good linear correlations between literature values and our measurements confirmed similar separation selectivity. We find linear


correlations with R² values of 0.9982 and 0.9974 for TF and iRT standards, respectively (Figure S1b,c). The TF standard was designed using sequences of real tryptic peptides of S. cerevisiae. This gave us a chance to further confirm the similarity of separation selectivity. We found 13 out of 15 TF peptides in our LC-MS/MS run of the whole S. cerevisiae cell digest, showing a 0.9986 correlation with the literature values (Figure S1d).

Despite the common assumption of a linear correlation between retention values observed in two RP-LC systems, it is important to emphasize that alteration in gradient slope, column size or flow rate in otherwise identical systems will yield variations of separation selectivity. In some cases this might manifest as a reversal of retention order33,34. Variations in the slope (S) parameter from the basic equation for linear solvent strength theory35 were pinpointed as a major contributor to these variations. One pair of peptides (#10 and #11) from the TF standard provides a good example of this. The retention order in this pair is reversed (peptide #11 elutes prior to #10) for the shorter 1% and 0.66% gradients compared to the literature data at a 0.25% gradient, while a separation using a 0.33% gradient showed the retention order consistent with the literature data. Beyond the classical assumption that slope S increases with the size (molecular weight) of the molecule, our Sequence Specific Slope Calculator model33 predicts slope (S) values for #10 GILFVGSGVSGGEEGAR and #11 SFANQPLEVVYSK equal to 27.2 and 29.3, respectively. This is consistent with the general observation that peptides with higher S will show smaller retention under fast gradients. Efforts to enhance and apply predictive models for slope (S) values will undoubtedly become an important element in the transfer of retention times across chromatographic platforms in the future. For


example, in addition to our study33, Shinoda et al.36 used an artificial neural network to predict peptides’ parameters log(k0) and S for the linear solvent strength theory equation.

QCAL1 standard peptides. Figure 1 shows the LC-MS analysis of the QCAL1 digest spiked with the P1-P6 standard using a linear 1% per minute acetonitrile gradient. The retention time differences are equivalent to an acetonitrile percentage scale for this case. It should be noted that the most hydrophilic component of our standard mixture (P1) has an HI value less than 0, and is not retained on a C18 column under formic acid conditions17. P2 and P6 peptides have HI values of 4.58 and 21.59, respectively. This corresponds to ~17% acetonitrile or ~17 min on the chromatographic time scale. The difference between the least (Q18) and the most (Q14) hydrophobic QCAL1 peptides is ~14% acetonitrile. Table 1 summarizes the observed HI and m/z values for all members of the QCAL1 standard, except Q15, which was also not detected in the original publication.

As we noted before, the QCAL1 protein was designed to test the performance of both LC and MS20. We observed all MS features described by the authors: the challenging mass resolution of the Q1-Q7 pair; an exact 1:3:6 abundance ratio in the Q8-Q10 triad; and traceable Met oxidation profiles in Q12-Q14. The HI value of the most hydrophilic peptide, Q18, is ~6.9, which may be problematic for monitoring gradient development at low acetonitrile concentrations.

Peak splitting for peptides carrying a methionine sulfoxide residue in the hydrophobic face of an amphipathic helix is documented in the chromatographic literature37, but rarely observed in proteomic experiments. The Q12-Q14 members of the QCAL1 standard clearly show amphipathic character - the axial projection of


AVMDDFAAFVEK (Q12) indicates that Met(3) is located in the hydrophobic face of the helix (Figure 2a). This composition makes it prone to LC peak splitting following methionine oxidation. Indeed, the XIC profile of oxidized AVMDDFAAFVEK using a 1% per minute gradient may indicate the presence of 2 different components (wider peak for 679.82 m/z, Figure 2b), while the longer 0.33% gradient allows for partial resolution of two peptides with diastereomeric forms of methionine sulfoxide (Figure 2c). Both Q13 and Q14 showed similar behavior. These characteristic properties of Met-containing peptides in QCAL1 provide an additional tool to challenge the separation efficiency of an RP-HPLC system.

ThermoFisher and iRT standards. As we mentioned in the Introduction, these two mixtures were designed for applications to SRM-type experiments: complete coverage of the entire hydrophobicity scale was one of the priority goals during their design, and is very important for on-the-fly adjustments during a scheduled SRM procedure. Having early eluting components in the standard mixture helps with the early detection of a reference point and subsequent adjustment of the acquisition schedule26. Figure 3 shows the TIC trace of the P2-P6 – TF – iRT mixture and XIC profiles of its individual components. As shown in Table 1, the TF mixture has 3 peptides eluting prior to the P2 reference peptide with HI 4.58 (~2.2, 3.7, 4.1 acetonitrile %) and the iRT mixture has one (iRT-1, HI ~2.3). Both mixtures uniformly cover the entire chromatographic space within ~16 and 17% acetonitrile for TF and iRT, respectively.

HI values in Table 2 were derived using least-squares linear regression analysis taking the HI of P2-P6 determined under isocratic conditions17 as the reference points. These values correspond to a particular chromatographic setup and could be affected by a variety

of additional factors. As discussed before, variations in column size, flow rate, and gradient slope affect separation selectivity because the slope (S) value in the basic equation of linear solvent strength theory differs between peptides. The variations in column size and flow rate in nano-flow LC-MS applications are rather limited, while the gradient slope can be altered significantly to improve the resolution of complex peptide mixtures. Significant variation in separation selectivity (R² correlation between retention times < 0.99) is observed when the gradient slope is altered ~6-fold (ref 33). Taking the 0.66% gradient in Table 1 as a starting point, this corresponds to 0.11-0.66% acetonitrile per minute separations, which covers the vast majority of LC conditions in proteomic applications. Additional parameters that significantly alter chromatographic retention and HI assignment include the type of ion-pairing modifier, the column temperature, and the type of reversed-phase sorbent. To obtain HI values comparable to our measurements, these parameters should be used as recommended: 0.1% formic acid as the eluent additive and separation on C18 100 Å columns with standard end-capping chemistry at room temperature.

Expression of peptide hydrophobicity in proteomic experiments. The advantages of expressing peptide retention properties on an acetonitrile percentage (HI units) scale were highlighted in the Introduction; the most important is the ability to easily "visualize" the separation process and roughly estimate experimental conditions. Figure 4 shows the retention time prediction for all 52 members of the calibration mixtures and for the complex whole-cell digest mixture from Yamana et al. (ref 30). SSRCalc retention prediction showed a 0.969 correlation for the aligned collection of the standard peptides (Figure 4a). The reciprocal of the slope of this graph, 1/1.3612 = 0.735, shows some deviation

compared to the experimental setting of 0.66. The observed difference is caused by the random character of prediction errors for this relatively small (52 peptides) dataset. Plotting the same dependence using experimental HI values (Table 1) gives a nearly perfect calibration with an estimated gradient of 1/1.4758 = 0.678% acetonitrile per minute (Figure 4b).

Yamana et al. (ref 30) used a 2-meter-long silica-based monolithic C18 column for the single-shot bottom-up analysis of human induced pluripotent stem cells. Approximately 4 µg of tryptic digest was separated using a very shallow 0.0583% acetonitrile per minute gradient starting at 4%. Retention time prediction using SSRCalc HI values showed a ~0.94 correlation across 12,664 tryptic peptides with expectation values log(e) < -3 (Figure 4c). The reciprocal of the slope, 1/18.761 = 0.0533, is very close to the authors' experimental setting. The characteristic deviation from a linear correlation for hydrophilic peptides (HI < 5) is also consistent with the reported starting gradient conditions (4%), illustrating again how the use of an HI (acetonitrile %) scale helps to evaluate the chromatographic conditions and the quality of the RP-HPLC separation in any LC-MS experiment.

Application of the HI scale also allows monitoring of chromatographic performance in the reversed-phase mode of on-line 2D MudPIT (Figure 5). Each analysis reported for micro-MudPIT acquisition (ref 31) consisted of 39 salt steps in the SCX dimension, each followed by an 18.5 min (5-38.75% acetonitrile) reversed-phase gradient. Plotting retention time vs. SSRCalc HI yields 38 individual, evenly spaced dependencies of identical slope, indicating high reproducibility of gradient delivery (Figure 5a). The only deviation from the expected behavior is observed in step 1, likely due to lack of RP-packing

equilibration at the beginning of the analysis. Detailed inspection of the individual dependencies for steps 21-24 (Figure 5b) shows good retention prediction accuracy, with an average slope of 0.551. This corresponds to an experimental gradient slope of 1/0.551 = 1.814% ACN per minute, while 1.824% per minute was used in the actual experimental settings (ref 31).

We routinely employ the P1-P6 standard mixture to assess the quality of gradient delivery by spiking all samples analyzed at the Manitoba Centre for Proteomics and Systems Biology and monitoring their retention time vs. experimental HI plots. We consider an R² correlation of more than 0.99 and a slope within ±15% of the expected value (the reciprocal of the % acetonitrile per minute gradient in the instrument settings) as acceptable. Plotting similar dependencies for SSRCalc-predicted HI values of all confidently identified peptides (Figures 4a,c and 5) is less accurate due to the random character of peptide properties in each collection and the possibility of false positive identifications. Thus, analysis of the graph in Figure 4a shows that 3 peptides (labeled with an asterisk) out of 52 exhibit the maximal deviation from the expected prediction, significantly lowering the accuracy of retention prediction. All three of them belong to the QCAL1 standard and show very similar structural features (Q2-Q4, Table 1). This is an indication that the SSRCalc prediction model should be adjusted to better describe the retention behavior of these species. A similar situation is observed if a peptide collection has a significant number of amphipathic helical peptides: they are retained more strongly than predicted, causing significantly lower prediction accuracy (ref 15). The use of standard mixtures (Table 1) or other peptide libraries with measured HI values (Supplementary information) is the

preferable approach to monitor the performance of the HPLC system in proteomic LC-MS analyses.

Any hydrophobicity scale for any peptide mixture can be converted into HI units using a linear transformation similar to the one that maps SSRCalc outputs onto the ACN% scale. For example, plotting the HI values shown in Table 1 against iRT values yields simple linear equations for conversion from iRT into HI units and vice versa:

HI = 0.1376*iRT + 6.1224; iRT = 7.2512*HI – 44.296

New peptide calibration mixtures can be aligned in a similar fashion using one of the standards included in this study, as well as other peptides commonly observed in proteomics experiments, which are provided in the Supporting Information. The consistency of HI value measurements can be demonstrated by correlating the retention properties of identical peptides across different LC-MS runs. For example, our tryptic digest analyses of PBMC and Jurkat cells have 1231 common species with R² = 0.9995 across these two independent measurements. Similarly, any LC-MS analysis of human, E. coli, or S. cerevisiae digests will likely identify many tryptic peptides common to our datasets (Supporting Information), which in turn can serve to map the retention values of the entire analysis into HI units.
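The iRT-to-HI conversion quoted above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: the function names `hi_from_irt` and `irt_from_hi` are hypothetical, and the coefficients are simply copied from the two equations in the text. Note that the two published equations are not exact algebraic inverses of each other (1/0.1376 ≈ 7.267 rather than 7.2512), as would be expected if each was fitted as a separate regression, so a round trip returns a value close to, but not identical with, the starting one.

```python
# Coefficients copied from the two linear equations quoted in the text.
HI_SLOPE, HI_INTERCEPT = 0.1376, 6.1224      # HI  = 0.1376*iRT + 6.1224
IRT_SLOPE, IRT_INTERCEPT = 7.2512, -44.296   # iRT = 7.2512*HI  - 44.296


def hi_from_irt(irt: float) -> float:
    """Map a unitless iRT value onto the HI (acetonitrile %) scale."""
    return HI_SLOPE * irt + HI_INTERCEPT


def irt_from_hi(hi: float) -> float:
    """Map an HI value (acetonitrile %) back onto the iRT scale."""
    return IRT_SLOPE * hi + IRT_INTERCEPT


if __name__ == "__main__":
    # Literature iRT values of iRT-1, iRT-2 and iRT-11 from Table 1.
    for irt in (-24.92, 0.0, 100.0):
        hi = hi_from_irt(irt)
        print(f"iRT {irt:7.2f} -> HI {hi:5.2f} -> iRT {irt_from_hi(hi):7.2f}")
```

The same two-function pattern would apply to any other retention scale once a slope and intercept against HI have been established with a shared set of standards.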

CONCLUSIONS

The use of retention time in RP-HPLC systems has emerged as a new trend in the continuous development of proteomic protocols. Predicted retention times can be used to strengthen protein identification and to direct the development of targeted quantitation techniques such as SRM and SWATH. Ultimately, the best approach for predicting peptide retention is the experimental measurement of its chromatographic properties. Once a separation is performed under strictly controlled conditions in the presence of a known standard, the observed values can be mapped across to retention in other RP-HPLC systems. The collection of peptide retention properties will become a common practice, similar to the archiving of fragmentation patterns in the PeptideAtlas (ref 38) or GPMDB (ref 39) data repositories. Establishing a pool of standard peptides and consensus formats to represent peptide hydrophobicity is the first step in this direction. We aligned several recently introduced retention standards and determined their hydrophobicity index values expressed in acetonitrile % units. We also supplement this collection of commercially available standard mixtures with extended sets of tryptic peptides of different origin. Spiking peptide mixtures with standards prior to LC-MS analysis, or monitoring the retention of real tryptic peptides from the provided library, directly informs on the condition of the chromatographic component of the system and the accuracy of gradient delivery. We strongly believe that the acetonitrile percentage scale is the most useful and warrants adoption by the overall proteomic community. Our collection of peptides with carefully measured chromatographic properties might serve as a starting point for large-scale archiving of retention data.
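The gradient-delivery check described in the Results (an R² above 0.99 and a slope within ±15% of the reciprocal of the programmed gradient) can be sketched as follows. This is a minimal illustration under stated assumptions: `check_gradient` and its parameters are hypothetical names introduced here, and the retention times in the usage example are synthetic values generated from the Figure 4b fit (RT = 1.4758*HI + 11.356), not measured data.

```python
def check_gradient(hi_values, retention_times, programmed_gradient,
                   min_r2=0.99, slope_tolerance=0.15):
    """Fit RT = slope*HI + intercept and compare the slope with 1/gradient.

    programmed_gradient is in % acetonitrile per minute; the default
    thresholds are the acceptance criteria quoted in the text.
    """
    n = len(hi_values)
    mean_x = sum(hi_values) / n
    mean_y = sum(retention_times) / n
    sxx = sum((x - mean_x) ** 2 for x in hi_values)
    syy = sum((y - mean_y) ** 2 for y in retention_times)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(hi_values, retention_times))
    slope = sxy / sxx                              # minutes per % acetonitrile
    r2 = sxy * sxy / (sxx * syy)
    expected_slope = 1.0 / programmed_gradient
    deviation = (slope - expected_slope) / expected_slope
    return {
        "slope": slope,
        "r2": r2,
        "estimated_gradient": 1.0 / slope,         # % acetonitrile per minute
        "slope_deviation": deviation,
        "acceptable": r2 > min_r2 and abs(deviation) <= slope_tolerance,
    }


if __name__ == "__main__":
    p2_p6_hi = [4.58, 8.55, 13.08, 18.26, 21.59]   # experimental HI, Table 1
    # Synthetic retention times from the Figure 4b fit; NOT measured data.
    synthetic_rt = [1.4758 * hi + 11.356 for hi in p2_p6_hi]
    print(check_gradient(p2_p6_hi, synthetic_rt, programmed_gradient=0.66))
```

Returning the estimated gradient alongside the pass/fail flag keeps the result readable in the same acetonitrile % per minute units as the instrument method.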

REFERENCES

1. Aebersold, R.; Mann, M. Nature 2003, 422, 198-207.
2. Andrews, G. L.; Simons, B. L.; Young, J. B.; Hawkridge, A. M.; Muddiman, D. C. Anal. Chem. 2011, 83, 5442-5446.
3. Hu, Q.; Noll, R. J.; Li, H.; Makarov, A.; Hardman, M.; Graham Cooks, R. J. Mass. Spectrom. 2005, 40, 430-443.
4. Strittmatter, E. F.; Kangas, L. J.; Petritis, K.; Mottaz, H. M.; Anderson, G. A.; Shen, Y.; Jacobs, J. M.; Camp, D. G., 2nd; Smith, R. D. J. Proteome Res. 2004, 3, 760-769.
5. Lange, V.; Picotti, P.; Domon, B.; Aebersold, R. Mol. Syst. Biol. 2008, 4, 222.
6. Petritis, K.; Kangas, L. J.; Ferguson, P. L.; Anderson, G. A.; Pasa-Tolic, L.; Lipton, M. S.; Auberry, K. J.; Strittmatter, E. F.; Shen, Y.; Zhao, R.; Smith, R. D. Anal. Chem. 2003, 75, 1039-1048.
7. Krokhin, O. V.; Craig, R.; Spicer, V.; Ens, W.; Standing, K. G.; Beavis, R. C.; Wilkins, J. A. Mol. Cell. Proteomics 2004, 3, 908-919.
8. Petritis, K.; Kangas, L. J.; Yan, B.; Monroe, M. E.; Strittmatter, E. F.; Qian, W. J.; Adkins, J. N.; Moore, R. J.; Xu, Y.; Lipton, M. S.; Camp, D. G., 2nd; Smith, R. D. Anal. Chem. 2006, 78, 5026-5039.
9. Krokhin, O. V. Anal. Chem. 2006, 78, 7785-7795.
10. Gorshkov, A. V.; Tarasova, I. A.; Evreinov, V. V.; Savitski, M. M.; Nielsen, M. L.; Zubarev, R. A.; Gorshkov, M. V. Anal. Chem. 2006, 78, 7770-7777.
11. Kaliszan, R.; Baczek, T.; Cimochowska, A.; Juszczyk, P.; Wisniewska, K.; Grzonka, Z. Proteomics 2005, 5, 409-415.
12. Gilar, M.; Xie, H.; Jaworski, A. Anal. Chem. 2010, 82, 265-275.

13. Moruz, L.; Tomazela, D.; Kall, L. J. Proteome Res. 2010, 9, 5209-5216.
14. Klammer, A. A.; Yi, X.; MacCoss, M. J.; Noble, W. S. Anal. Chem. 2007, 79, 6111-6118.
15. Reimer, J.; Spicer, V.; Krokhin, O. V. J. Chromatogr. A 2012, 1256, 160-168.
16. Guo, D.; Mant, C. T.; Taneja, A. K.; Hodges, R. S. J. Chromatogr. 1986, 359, 519-532.
17. Krokhin, O. V.; Spicer, V. Anal. Chem. 2009, 81, 9522-9530.
18. Mant, C. T.; Hodges, R. S. J. Chromatogr. A 2012, 1230, 30-40.
19. Tarasova, I. A.; Guryca, V.; Pridatchenko, M. L.; Gorshkov, A. V.; Kieffer-Jaquinod, S.; Evreinov, V. V.; Masselon, C. D.; Gorshkov, M. V. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 2009, 877, 433-440.
20. Eyers, C. E.; Simpson, D. M.; Wong, S. C.; Beynon, R. J.; Gaskell, S. J. J. Am. Soc. Mass Spectrom. 2008, 19, 1275-1280.
21. Picotti, P.; Rinner, O.; Stallmach, R.; Dautel, F.; Farrah, T.; Domon, B.; Wenschuh, H.; Aebersold, R. Nat. Methods 2009, 7, 43-46.
22. Pierce Retention Time Calibration Mixture (Cat. No 88320 and 88321) Thermo Fisher Scientific Inc. 2011.
23. Escher, C.; Reiter, L.; MacLean, B.; Ossola, R.; Herzog, F.; Chilton, J.; MacCoss, M. J.; Rinner, O. Proteomics 2012, 12, 1111-1121.
24. MacLean, B.; Tomazela, D. M.; Shulman, N.; Chambers, M.; Finney, G. L.; Frewen, B.; Kern, R.; Tabb, D. L.; Liebler, D. C.; MacCoss, M. J. Bioinformatics 2010, 26, 966-968.
25. Gillet, L. C.; Navarro, P.; Tate, S.; Rost, H.; Selevsek, N.; Reiter, L.; Bonner, R.; Aebersold, R. Mol. Cell. Proteomics 2012, 11, O111 016717.

26. Gallien, S.; Peterman, S.; Kiyonami, R.; Souady, J.; Duriez, E.; Schoen, A.; Domon, B. Proteomics 2012, 12, 1122-1133.
27. McQueen, P.; Spicer, V.; Rydzak, T.; Sparling, R.; Levin, D.; Wilkins, J. A.; Krokhin, O. Proteomics 2012, 12, 1160-1169.
28. Young, M. J.; Court, D. A. Yeast 2008, 25, 903-912.
29. Wisniewski, J. R.; Zougman, A.; Nagaraj, N.; Mann, M. Nat. Methods 2009, 6, 359-362.
30. Yamana, R.; Iwasaki, M.; Wakabayashi, M.; Nakagawa, M.; Yamanaka, S.; Ishihama, Y. J. Proteome Res. 2013, 12, 214-221.
31. Webb, K. J.; Xu, T.; Park, S. K.; Yates, J. R., 3rd J. Proteome Res. 2013, 12, 2177-2184.
32. Chambers, M. C.; Maclean, B.; Burke, R.; Amodei, D.; Ruderman, D. L.; Neumann, S.; Gatto, L.; Fischer, B.; Pratt, B.; Egertson, J.; Hoff, K.; Kessner, D.; Tasman, N.; Shulman, N.; Frewen, B.; Baker, T. A.; Brusniak, M. Y.; Paulse, C.; Creasy, D.; Flashner, L.; Kani, K.; Moulding, C.; Seymour, S. L.; Nuwaysir, L. M.; Lefebvre, B.; Kuhlmann, F.; Roark, J.; Rainer, P.; Detlev, S.; Hemenway, T.; Huhmer, A.; Langridge, J.; Connolly, B.; Chadick, T.; Holly, K.; Eckels, J.; Deutsch, E. W.; Moritz, R. L.; Katz, J. E.; Agus, D. B.; MacCoss, M.; Tabb, D. L.; Mallick, P. Nat. Biotechnol. 2012, 30, 918-920.
33. Spicer, V.; Grigoryan, M.; Gotfrid, A.; Standing, K. G.; Krokhin, O. V. Anal. Chem. 2010, 82, 9678-9685.
34. Glaich, J. L.; Quarry, M. A.; Vasta, J. F.; Snyder, L. R. Anal. Chem. 1986, 58, 280-285.
35. Snyder, L. R.; Dolan, J. W. In High-Performance Gradient Elution: The Practical Application of the Linear-Solvent-Strength Model; Wiley: New York, 2006; pp 229-234.

36. Shinoda, K.; Tomita, M.; Ishihama, Y. Bioinformatics 2008, 24, 1590-1595.
37. Blondelle, S. E.; Perez-Paya, E.; Allicotti, G.; Forood, B.; Houghten, R. A. Biophys. J. 1995, 69, 604-611.
38. Deutsch, E. W.; Eng, J. K.; Zhang, H.; King, N. L.; Nesvizhskii, A. I.; Lin, B.; Lee, H.; Yi, E. C.; Ossola, R.; Aebersold, R. Proteomics 2005, 5, 3497-3500.
39. Fenyo, D.; Eriksson, J.; Beavis, R. Computational Biology, Methods in Molecular Biology, 2010, 673, 189-202.

ACKNOWLEDGMENTS

This work was supported by a grant from the Natural Sciences and Engineering Research Council of Canada (O.V.K.). The authors thank Mrs. P. Sauder, Dr. J.A. Wilkins (Manitoba Centre for Proteomics and Systems Biology) and Dr. D. Court (Department of Microbiology, University of Manitoba) for their assistance with Jurkat, PBMC, and S. cerevisiae cell culture. The authors declare no competing financial interest.

ABBREVIATIONS

RP HPLC – reversed-phase high pressure liquid chromatography; LC-MS – liquid chromatography-mass spectrometry; SRM – selected reaction monitoring; HI – hydrophobicity index; SSRCalc – Sequence-Specific Retention Calculator; PBMC – peripheral blood mononuclear cells; TFA – trifluoroacetic acid; ACN – acetonitrile; GPMDB – Global Proteome Machine Database.

Table 1. Comparison of literature retention data, predicted and experimental HI values for the standard mixtures under investigation.

Columns: peptide ID; sequence; m/z (b); retention, literature data (c); SSRCalc predicted HI (d); experimental HI (e).

ID      Sequence                    m/z (b)      Lit. (c)  Pred. HI (d)  Exp. HI (e)

P1-P6 standard (ref 17)
P2 (a)  LGGGGGGDFR                  446.717      4.58      6.03          4.58 (f)
P3      LLGGGGDFR                   446.238      8.55      8.81          8.55 (f)
P4      LLLGGDFR                    445.758      13.08     13.33         13.08 (f)
P5      LLLLDFR                     445.280      18.26     19.46         18.26 (f)
P6      LLLLLDFR                    501.822      21.59     22.44         21.59 (f)

QCAL1 standard (ref 20)
Q1      VFDEFKPLVEEPQNLIR           691.706(3+)  39.3      18.80         17.58
Q2      VFDEFKPLVKPEEPQNLIR         575.318(4+)  37.9      17.36         14.49
Q3      VFDEFKPLVKPEEKPQNLIR        607.342(4+)  34.6      15.48         12.50
Q4      VFDEFKPLVKPEEKPQNKPLIR      531.105(5+)  31.1      13.40         10.30
Q5      VFKPDEFKPLVKPEEKPQNKPLIR    480.280(6+)  29.6      10.57         8.96
Q6      VFKPDEFKPLVKPEEKPQNKPLIKPR  517.805(6+)  28.4      8.88          8.11
Q7      VFDEFQPLVEEPQNLIR           691.964(3+)  40.1      21.10         19.56
Q8      GVNDNEEGFFSAR               721.321      34        11.74         11.68
Q9      GGVNDNEEGFFSAR              749.832      33.8      12.48         11.57
Q10     GGGVNDNEEGFFSAR             778.342      33.7      11.96         11.46
Q11     GVNDNEEGFFSAK               707.318      33.1      11.44         11.22
Q12     AVMDDFAAFVEK                671.822      39.8      18.77         18.18
Q13     AVMMDDFAAFVEK               737.342      40.1      20.75         19.07
Q14     AVMMMDDFAAFVEK              802.862      40.5      23.22         20.76
Q15     GLVK                        208.648      n.d.      1.81          n.d.
Q16     FVVPR                       309.193      30.9      6.19          7.03
Q17     ALELFR                      374.722      34.7      11.66         12.63
Q18     IGDYAGIK                    418.730      27.1      6.82          6.88
Q19     EALDFFAR                    484.746      38.7      15.39         15.22
Q20     YLGYLEQLLR                  634.357      40.4      20.94         19.97
Q21     VLYPNDNFFEGK                721.852      38.3      15.18         14.51
Q22     LFTFHADICTLPDTEK            636.646(3+)  38.4      16.15         14.81

Thermo Fisher standard (ref 22)
TF-1    SSAAPPPPPR                  493.768      15.13     1.08          2.20
TF-2    GISNEGQNASIK                613.317      20.5      5.01          3.73
TF-3    HVLTSIGEK                   496.287      21.68     5.02          4.12
TF-4    DIPVPKPK                    451.283      27.87     6.08          5.40
TF-5    IGDYAGIK                    422.736      31.37     6.82          6.80
TF-6    TASEFDSAIAQDK               695.832      40.69     10.15         8.67
TF-7    SAAGAFGPELSR                586.800      43.6      9.84          9.56
TF-8    ELGQSGVDTYLQTK              773.896      50.17     11.39         10.95
TF-9    GLILVGGYGTR                 558.326      57.79     13.27         13.19
TF-10   GILFVGSGVSGGEEGAR           801.412      59.69     14.43         13.50
TF-11   SFANQPLEVVYSK               745.392      60.64     14.65         13.35
TF-12   LTILEELR                    498.802      68.55     15.81         15.66
TF-13   NGFILDGFPR                  573.303      73.44     17.36         16.72
TF-14   ELASGLSFPVGFK               680.374      76.63     17.74         17.20
TF-15   LSSEAPALFQFDLK              787.421      83.27     20.45         18.64

iRT standard (ref 23)
iRT-1   LGGNETQVR                   487.258      -24.92    1.87          2.32
iRT-2   AGGSSEPVTGLADK              644.823      0         7.36          6.02
iRT-3   VEATFGVDESANK               683.828      12.39     8.40          7.85
iRT-4   YILAGVESNK                  547.299      19.79     9.60          9.12
iRT-5   TPVISGGPYYER                669.839      28.71     10.82         10.06
iRT-6   TPVITGAPYYER                683.854      33.38     11.91         10.64
iRT-7   GDLDAASYYAPVR               699.339      42.26     12.67         12.44
iRT-8   DAVTPADFSEWSK               726.836      54.62     14.18         13.86
iRT-9   TGFIIDPGGVIR                622.854      70.52     15.49         15.87
iRT-10  GTFIIDPAAIVR                636.870      87.23     16.89         17.91
iRT-11  FLLQFGAQGSPLFK              776.930      100       20.63         19.56

(a) The P1 peptide from the P1-P6 mixture is not retained under the chromatographic conditions used (HI < 0) (ref 17).
(b) m/z values are for the most abundant charge state, (2+), unless specified otherwise.
(c) Retention values are given in the units reported in the corresponding references: HI (acetonitrile %) for the P1-P6 standard (ref 17), retention times (min) for the QCAL1 (ref 20) and TF (ref 22) standards, and unitless iRT values for the Biognosys standard (ref 23).
(d) Predicted SSRCalc hydrophobicities in HI (acetonitrile %) units.
(e) Experimental hydrophobicities in HI (acetonitrile %) units determined for the 0.66% gradient.
(f) Measured under isocratic conditions (ref 17).
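As a cross-check of how such an alignment works, the sketch below refits HI against iRT by ordinary least squares, using the 11 iRT peptides in Table 1 (literature iRT values and experimental HI values). The helper `fit_line` is a hypothetical name introduced here, and the code is an illustration rather than the authors' procedure; with these inputs the fitted coefficients should land close to the HI = 0.1376*iRT + 6.1224 relation quoted in the text.

```python
# (literature iRT, experimental HI) pairs for iRT-1 ... iRT-11, from Table 1.
IRT_PEPTIDES = [
    (-24.92, 2.32), (0.0, 6.02), (12.39, 7.85), (19.79, 9.12),
    (28.71, 10.06), (33.38, 10.64), (42.26, 12.44), (54.62, 13.86),
    (70.52, 15.87), (87.23, 17.91), (100.0, 19.56),
]


def fit_line(points):
    """Ordinary least-squares fit y = slope*x + intercept.

    Returns (slope, intercept, r2) for a list of (x, y) pairs.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy * sxy / (sxx * syy)
    return slope, intercept, r2


if __name__ == "__main__":
    slope, intercept, r2 = fit_line(IRT_PEPTIDES)
    print(f"HI = {slope:.4f}*iRT + {intercept:.4f} (R^2 = {r2:.4f})")
```

Swapping the order of each pair, so that iRT is regressed on HI, gives the reverse mapping; the two fits are separate regressions and are therefore not exact inverses of one another.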

FIGURE CAPTIONS

Figure 1. LC-MS analysis of QCAL1 digest spiked with P1-P6 standard peptides using a 1% per minute acetonitrile gradient. A – total ion chromatogram (TIC) profile; B – extracted ion chromatograms (XIC) for tryptic fragments of the QCAL1 protein (Q1-Q22); C – XIC profiles for the P2-P6 standard. Peptides are color coded according to their primary function in the designed standard (ref 20): Q1-Q7 (green) to test mass calibration and resolution in MS; Q8-Q11 (red, yellow) to monitor quantitation accuracy (linearity of response); Q12-Q22 (blue) to test chromatographic performance and methionine oxidation. Q10* is shown off scale due to high peak intensity: 2x compared to Q9; 6x compared to Q8.

Figure 2. RP-HPLC separation of Q12 peptide AVMDDFAAFVEK and its oxidation products. A – axial helical projection; hydrophobic residues are shown in black; B – XICs for non-modified m/z 671.82 (blue) and oxidized m/z 679.82 (pink) peptides using a 1% per minute acetonitrile gradient; C – XICs for the same components under a 0.33% gradient.

Figure 3. LC-MS analysis of TF and iRT standards spiked with P1-P6 peptides using a 1% per minute acetonitrile gradient. A – TIC profile; B – XIC for TF standard; C – XIC for iRT peptides; D – XIC profiles for P2-P6.

Figure 4. Retention time prediction using SSRCalc-predicted and experimental HI values. A – retention time vs. predicted HI values for 52 peptides from all four studied standards; * – Q2-Q4 peptides from the QCAL1 standard exhibiting large prediction errors; B – retention time vs. experimental HI values; C – retention time prediction for 12,664 tryptic peptides from Yamana et al. (ref 30).

Figure 5. Retention time prediction for the 39-step micro-MudPIT acquisition by Webb et al. (ref 31). A – retention time vs. predicted HI values for the collection of 11,492 nonredundant tryptic peptides; B – retention time vs. HI dependencies for salt steps #21-24.
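The arithmetic behind the Figure 5b comparison can be reproduced with a short Python sketch. The four slopes are those printed next to the fitted lines for salt steps 21-24 in Figure 5b, and the nominal gradient follows from the reported 5-38.75% acetonitrile ramp over 18.5 minutes; the variable names are illustrative only.

```python
# Slopes (min per % acetonitrile) of the four RT-vs-HI fits printed in Figure 5b.
step_slopes = [0.5521, 0.547, 0.5329, 0.5737]

mean_slope = sum(step_slopes) / len(step_slopes)
estimated_gradient = 1.0 / mean_slope            # % acetonitrile per minute

# Nominal gradient from the reported method: 5 -> 38.75 % acetonitrile in 18.5 min.
nominal_gradient = (38.75 - 5.0) / 18.5

relative_difference = (estimated_gradient - nominal_gradient) / nominal_gradient

print(f"mean slope         = {mean_slope:.4f} min per % ACN")
print(f"estimated gradient = {estimated_gradient:.3f} % ACN per min")
print(f"nominal gradient   = {nominal_gradient:.3f} % ACN per min")
print(f"difference         = {relative_difference:+.1%}")
```

The relative difference computed this way is the quantity that would be compared against the ±15% acceptance window used for routine quality control.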

[Figure 1: panels A-C. x-axis: Time, min; y-axis: Intensity (counts per second). Peak labels: Q1-Q22 (panel B) and P2-P6 (panel C).]

[Figure 2: (A) helical wheel projection of AVMDDFAAFVEK; (B) and (C) XIC traces for m/z 671.82 and m/z 679.82. x-axis: Retention time (min).]

[Figure 3: panels A-D. x-axis: Time, min; y-axis: Intensity (counts per second). Peak labels: TF-1 to TF-15 (panel B), iRT-1 to iRT-11 (panel C), and P2-P6 (panel D).]

[Figure 4: (A) retention time, 0.66% ACN gradient (min) vs. SSRCalc HI, predicted (ACN%); series P2-P6, TF, iRT, QCAL1; fit y = 1.3612x + 11.727, R² = 0.9689; three points marked with asterisks. (B) retention time, 0.66% ACN gradient (min) vs. HI, experimental (ACN%); fit y = 1.4758x + 11.356, R² = 1. (C) retention time (min) vs. SSRCalc HI, predicted (ACN%); fit y = 18.761x - 23.994, R² = 0.9403.]

[Figure 5: (A) retention time (min) vs. SSRCalc HI, predicted (ACN%). (B) retention time (min) vs. SSRCalc HI, predicted (ACN%) with four linear fits: RT = 0.5521*HI + 573.0, R² = 0.949; RT = 0.547*HI + 548.2, R² = 0.933; RT = 0.5329*HI + 523.6, R² = 0.941; RT = 0.5737*HI + 498.2, R² = 0.931.]
