Article

3D HPLC-MS with reversed-phase separation functionality in all three dimensions for large-scale bottom-up proteomics and peptide retention data collection
Victor Spicer, Peyman Ezzati, Haley Neustaeter, Ronald C Beavis, John A Wilkins, and Oleg V. Krokhin
Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.5b04567 • Publication Date (Web): 05 Feb 2016
Downloaded from http://pubs.acs.org on February 9, 2016

Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Analytical Chemistry

3D HPLC-MS with reversed-phase separation functionality in all three dimensions for large-scale bottom-up proteomics and peptide retention data collection

Vic Spicer1, Peyman Ezzati1, Haley Neustaeter2, Ronald C. Beavis3, John A. Wilkins1,4, Oleg V. Krokhin*1,4

AUTHOR ADDRESS

1 Manitoba Centre for Proteomics and Systems Biology, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg, Manitoba R3E 3P4, Canada

2 Department of Chemistry, University of Manitoba, 360 Parker Building, Winnipeg, Manitoba R3T 2N2, Canada

3 Department of Biochemistry & Medical Genetics, University of Manitoba, 792 JBRC, 715 McDermot Avenue, Winnipeg, Manitoba R3E 3P4, Canada

4 Department of Internal Medicine, University of Manitoba, 799 JBRC, 715 McDermot Avenue, Winnipeg, Manitoba R3E 3P4, Canada

Fax: (204) 480 1362, E-mail: [email protected]

ACS Paragon Plus Environment

ABSTRACT

The growing complexity of proteomics samples and the desire for deeper analysis drive the development of both better MS instruments and advanced multi-dimensional separation schemes. We applied 1D, 2D and 3D LC-MS/MS separation protocols (all of reversed-phase C18 functionality) to a tryptic digest of whole Jurkat cell lysate to estimate the depth of proteome coverage and to collect high-quality peptide retention information. We varied the pH of the eluent and the hydrophobicity of the ion-pairing modifier to achieve good separation orthogonality (efficient utilization of MS instrument time). All separation modes employed identical LC settings with formic acid based eluents in the last dimension. The 2D protocol used a high pH – low pH scheme with 21 concatenated fractions. In the 3D protocol, six concatenated fractions from the first dimension (C18, heptafluorobutyric acid) were analyzed using the identical 2D LC-MS procedure. This approach permitted a detailed evaluation of the analysis output when consuming 21x and 126x the analysis time and sample load of the 1D protocol. Acquisition over 189 hours of instrument time in 3D mode resulted in the identification of ~14,000 proteins and ~250,000 unique peptides. We estimated the dynamic range via peak intensity at the MS2 level as approximately 10^4.2, 10^5.6 and 10^6.2 for the 1D, 2D and 3D protocols, respectively. The uniform distribution of the number of acquired MS/MS, protein and peptide identifications across all 126 fractions and through the chromatographic timescale in the last LC-MS stage indicates good separation orthogonality. The protocol is scalable and is amenable to the use of peptide retention prediction in all dimensions. All these features make it a very good candidate for large-scale bottom-up proteomic runs that target both protein identification and the collection of peptide retention data sets for targeted quantitative applications.

INTRODUCTION

Liquid chromatography – mass spectrometry (LC-MS) on a peptide level remains the method of choice for proteomic applications1,2. Developments in both techniques have been driving the depth of proteome coverage, with most of the emphasis on the mass-spectrometry component of the protocol. The past decade has seen a dramatic increase in mass spectrometer acquisition speed and mass accuracy3,4, as well as the introduction of new fragmentation techniques5. The LC-component is somewhat lagging behind, and still relies on traditional C18 silica based stationary phases and formic/acetic acid based eluents. The introduction of efficient core-shell, UPLC packing materials,

wider applications of multidimensional separation techniques6, and new enrichment procedures targeting specific groups of peptides7 are among major trends in the advancement of separation science for proteomics applications. Multi-dimensional separation protocols have been developed to provide the optimal delivery of analytes into the mass spectrometer. Complex digests may contain millions of peptides, and there is no mass spectrometer capable of sampling MS/MS spectra for all these species within the few hours typical of one-dimensional runs. 2D LC-MS protocols now represent the mainstay of deep shotgun proteomic approaches. The most popular 2D method is based on a SCX-RP combination, and was introduced in 20018,9. Other combinations are based on SAX-RP10, RP-RP11, HILIC-RP12, etc. It is worth noting that the vast majority of 2D LC-MS methods utilize reversed-phase sorbents with formic/acetic acid ion pairing modifiers in the last separation dimension, giving good compatibility with ESI. The selection of the first dimension chemistry is dictated by its orthogonal character compared to the formic acid – RP systems, and by its separation efficiency – key parameters in determining the peak capacity of the system6. A recent comprehensive review6 on multidimensional peptide separation highlighted a number of trends in the field and concluded that “the simplest multidimensional system to construct is an RP– RP, with SCX–RP being not far behind” due to the wide choice of the RP stationary phases provided by column vendors. In our laboratory we have been using high pH – low pH RP-RP since 2006, following reports by Gilar et al.11 and Toll et al.13, that demonstrated high pH RP applications for silica- and polymer-based sorbents, respectively. We made this choice based on the superior separation efficiency of silica based RP sorbents, and the separation orthogonality this system can provide. 
Despite the fact that the separations in both dimensions are based on the hydrophobic properties of peptides, the effect of pH on the charge state and hydrophobicity of the side chains of some residues is profound. Following a detailed study of separation selectivity in this system, and the development of a peptide retention prediction model for pH 10 separations14, we introduced a pair-wise concatenation procedure that dramatically improved separation orthogonality15. Many research groups have incorporated this approach into their routine analysis16-18. Another way to modify the separation selectivity in RP systems is to vary the hydrophobicity of the ion-pairing modifier. It is known that using the more hydrophobic trifluoroacetic acid (TFA) instead of orthophosphoric acid increases peptide retention in proportion to the number of positively charged functional groups (N-terminus, side chains of Lys, Arg, His)19. Stephanowitz et al.20

introduced a TFA-formic acid 2D LC-MS separation system. However, the observed change in selectivity between the two modifiers required the concatenation of four fractions to provide sufficient orthogonality. Our preliminary studies showed that heptafluorobutyric acid (HFBA) may serve as a good candidate for the ion-pairing modifier in the first dimension: HFBA is more hydrophobic than TFA, and causes larger positive retention time shifts for peptides carrying multiple positively charged groups. It is interesting to note that reversed-phase separation at pH 10 also exhibits higher retention of basic peptides compared to formic acid. This “positive shift” is additionally complemented by a “negative shift” for peptides carrying acidic residues (Asp, Glu)21. Taken together, we expected the separation selectivity of C18-HFBA systems to be somewhere between C18-pH 10 and C18-formic acid. Usually 2D RP-RP protocols are performed in an off-line mode due to the necessity to reduce the acetonitrile concentration in the fractions from the first dimension. This makes them more labor intensive than online approaches, and prone to sample losses between the two separation steps. Thus, Magdeldin et al.22 concluded that off-line fractionation results in a ~45% loss of identifications compared to an online method. At the same time, drying down the fractions following the pH 10 RP completely eliminates the impact of high salt concentration on the overall system robustness and the reproducibility of retention times in the second dimension. The latter is of particular importance when 2D LC-MS is used for retention data collection. High quality retention data collection is crucial for the development of peptide retention prediction models and for targeted quantitative proteomics methods such as SRM23 and SWATH24. Currently three-dimensional LC-MS protocols are somewhat rare. F. Zhou et al.25 described an on-line 3D setup based on a RP (pH 10) – SAX – RP (pH 2) combination. 
Depending on the steps in acetonitrile (first dimension) and salt concentrations (second dimension), the authors analyzed from 19 up to 236 fractions (RP LC-MS runs) of whole cell digests of S. cerevisiae and E. coli. Later they applied the same methodology for protein quantitation using an iTRAQ approach. The 19 fractions from RP (pH 10) – SAX were analyzed in 10 hr long RP (pH 2) LC-MS runs in this case26. Recently Chu and co-workers27,28 demonstrated a similar on-line RP (pH 10) – SCX – RP (pH 2) approach using 8 elution steps in the RP (pH 10) dimension and 3 elution steps in the SCX dimension, with a total analysis time of 24 hr for whole cell lysates. Betancourt et al.29 used a SCX-RP (pH 10)-RP (pH 2) approach to test the ability to separate peptides from whole mouse embryonic fibroblast cell digests. The authors used reversible labeling of -NH2 groups to facilitate the SCX separation into 3 distinct groups based on

the number of charged residues. Boichenko et al.30 compared 5 different 2D setups for the analysis of human plasma digests, then used the best 2 options to construct a 3D combination: ERLIC-RP pH 10-RP pH 2. Interestingly, these research groups built their systems around RP – RP high pH – low pH systems, which further highlights the potential of this 2D platform. Recently H. Zhou et al.31 applied an off-line 3D (SCX-HILIC-RP) approach to analyze a human cancer cell phosphoproteome. The three SCX fractions were enriched using Ti4+-IMAC, then each was separated into 24 fractions by HILIC, followed by 2 hr RPLC-MS runs. The 144 hours of total instrument time yielded more than 22,000 unique phosphopeptide identifications in K562 cells. Loroch et al.32 used ERLIC-ERLIC-RP HPLC for the simultaneous analysis of the proteome and phosphoproteome of the same sample. For purposes of our discussion, we define 3D LC-MS methods as procedures that use three different separation chemistries and collect/analyze at least 2 fractions in each of them. Thus the method developed by H. Zhou et al.31 could be described as a 4D system (SCX-Ti4+-IMAC-HILIC-RP), but in their case the Ti4+-IMAC step was used for enrichment. Similarly, the COFRADIC approach uses an RP-RP-RP scheme, but the first two dimensions are identical and used for sample enrichment, with an additional step of chemical modification applied between them33. Gygi et al. described their SCX-Avidin-RP approach as being 3D34, but the biotin affinity column was used solely for enrichment of iCAT labeled peptides rather than as a mechanism for the separation. All reports of 3D LC-MS of complex proteomic samples have appeared in the past 4 years, highlighting the overall trend to achieve deep proteome coverage through better fractionation techniques. 
We pursued a dual purpose in this attempt to improve on our current 2D LC-MS capacity: to improve our analysis of proteins for systems biology studies, and to collect peptide retention time information for the development of our Sequence Specific Retention Calculator (SSRCalc) model35. Our retention database has accumulated data from 1D and 2D runs of ~30 different organisms (unpublished results). Due to its extensive coverage (~1 million unique tryptic peptides) we were able to discover detailed features of peptide retention in RP systems36. This data also provides solid support for the development of targeted quantitative procedures. The application of the high pH RP – low pH RP 2D LC-MS on our Triple TOF 5600 platform (~24 hrs instrument time) produces sufficient coverage for bacterial proteomes (up to 80%), but leaves many potentially expressed proteins and peptides unidentified for complex organisms: only ~35-40% coverage of the proteomes in human cell lines. Building on the successful implementation of 2D RP – RP we decided to extend it into a 3D format using another RP dimension.

EXPERIMENTAL SECTION

Chemicals. All chemicals were sourced from Sigma Chemicals (St-Louis, MO), unless noted otherwise. HPLC-grade acetonitrile and de-ionized water were used for the preparation of eluents. Sequencing-grade modified trypsin (Promega, Madison, WI) and 15 mL Amicon centrifugal filter units (Merck Millipore, Ireland) were used for digestion. HPLC-grade heptafluorobutyric acid was purchased from Thermo Fisher Scientific (Rockford, IL). Siliconized 1.5 mL vials (BioPlas, San Rafael, CA) were used for all sample preparation and fraction handling steps.

Cell culture and protein digestion. Jurkat Clone E6-1, a human T lymphoblastoid cell line derived from an acute T cell leukemia, was obtained from ATCC (TIB-152™). Cells were cultured in RPMI 1640 medium supplemented with 10% FBS at 37°C in a humidified 5% CO2 atmosphere. The non-adherent lymphocytes were harvested by centrifugation (300 g for 5 minutes) and washed with PBS. All samples were obtained with informed consent using a protocol approved by the University of Manitoba, Research Ethics Board. A tryptic digest of Jurkat cells was prepared using the scaled-up (15 mL filter units) FASP digestion procedure37. Protein amounts to be subjected to digestion were monitored using a micro-BCA assay (Pierce, Rockford, IL). The resulting digest was acidified with TFA and purified by RP SPE. Approximately 1 mg of the digest (determined by NanoDrop 2000, ThermoFisher) was sufficient for all LC-MS experiments.

HPLC-MS settings in last dimension. The 1D, 2D and 3D protocols used identical LC-MS settings in the last dimension with a 90 min acquisition time (Figure 1). A splitless nano-flow 2D LC Ultra system (Eksigent, Dublin, CA) was used to deliver a water/acetonitrile gradient at a 500 nL/min flow rate through a 100 µm × 200 mm analytical column packed with 3 µm Luna C18(2) (Phenomenex, Torrance, CA) at room temperature. 
Sample injection (~1 µg of peptides in 10 µL of buffer A, spiked with the P1-P6 peptide retention standard, ~200 fmole of each peptide) via a 300 µm × 5 mm PepMap100 trap-column (ThermoFisher) was used in all experiments. The gradient program included the following steps: a linear increase from 0.5 to 30% of buffer B (acetonitrile) in 78 min, a 5 min column wash with 90% B and 8

min system equilibration using starting conditions of 0.5% B. Both eluents A (water) and B (acetonitrile) contained 0.1 % formic acid as ion-pairing modifier.
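The last-dimension gradient program described above can be written down as a small piecewise function. This is only a sketch for illustration: the function name `percent_b` and the constants are hypothetical, and instantaneous step changes between the ramp, the wash and the equilibration are an assumption, since the text does not describe the transitions.

```python
# Sketch of the last-dimension gradient program (buffer B = acetonitrile).
# Step durations and %B values are taken from the text; instantaneous
# transitions between steps are an assumption made for this illustration.
RAMP_MIN, WASH_MIN, EQUIL_MIN = 78.0, 5.0, 8.0   # minutes
START_B, END_B, WASH_B = 0.5, 30.0, 90.0          # percent buffer B
TOTAL_MIN = RAMP_MIN + WASH_MIN + EQUIL_MIN


def percent_b(t_min: float) -> float:
    """Return the programmed %B at time t_min (minutes from injection)."""
    if not 0.0 <= t_min <= TOTAL_MIN:
        raise ValueError("time outside the gradient program")
    if t_min <= RAMP_MIN:
        # Linear increase from 0.5% to 30% B over 78 min.
        return START_B + (END_B - START_B) * t_min / RAMP_MIN
    if t_min <= RAMP_MIN + WASH_MIN:
        return WASH_B      # 5 min column wash
    return START_B         # 8 min re-equilibration at starting conditions


slope = (END_B - START_B) / RAMP_MIN   # %B per minute during the ramp
```

Under these assumptions the ramp slope is about 0.38% B per minute, and the three listed steps sum to 91 min.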

Figure 1. An overview of separation setups used for 1D, 2D and 3D LC-MS. A) 90 min 1D LC-MS with ~1 µg sample load; B) 2D LC-MS (high pH – low pH RP-RP) with 21 concatenated fractions and ~21 µg sample load; C) 3D LC-MS (low pH HFBA – high pH – low pH formic acid) with 6 concatenated fractions in the first and 21 concatenated fractions in the second dimension. In total 126 fractions (~126 µg sample load) were used for the final stage of LC-MS analysis.

Data-dependent acquisition on a TripleTOF 5600 mass spectrometer (Sciex, Concord, ON) was performed using the following settings: a 250 ms survey MS spectrum (m/z 300-1500) was followed by up to 20 MS/MS measurements on the most intense parent ions (300 counts/sec threshold, +2 to +4 charge state, m/z 100-1500 mass range for MS/MS, 100 ms each, high sensitivity mode). Previously targeted parent ions were excluded from repetitive MS/MS acquisition for 12 sec (50 mDa mass tolerance).

2D LC settings and the concatenation procedure were slightly modified compared to our original report15. An Agilent 1100 series LC system with UV detector (214 nm) and a 3 mm × 100 mm XTerra MS C18, 3.5 µm column (Waters, Ireland) was used for pH 10 separations. A 0.66% per minute acetonitrile gradient (0-40% acetonitrile) was delivered at a 300 µL/min flow rate. Both eluents A (water) and B (1:9 water:acetonitrile) contained 20 mM ammonium formate and were prepared by 1:10 dilution of a 200 mM ammonium solution adjusted to pH 10 with formic acid. A manual Rheodyne injector (Bensheim, Germany) with a 200 µL loop was used to deliver ~200 µg and ~120 µg of peptides in the 2D and 3D experiments, respectively. One-minute fractions were collected over the 7-55 min interval, concatenated

into 21 fractions as shown in Figure 2a, lyophilized and re-suspended in buffer A (0.1% formic acid in water) spiked with standard peptides38.
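The data-dependent acquisition settings given above (a 250 ms survey scan followed by up to 20 MS/MS scans of 100 ms each) set an upper bound on the number of MS/MS spectra a 90 min run can produce. The sketch below works through that arithmetic. It assumes every cycle triggers the full 20 MS/MS events and ignores any instrument overhead between scans, so it is an idealized ceiling and not a measured figure; the variable names are hypothetical. The 27,432 acquired MS/MS used for comparison is the 1D value reported in the Results.

```python
# Idealized duty-cycle arithmetic for the DDA method (integer milliseconds).
SURVEY_MS = 250        # survey MS scan time
MSMS_MS = 100          # time per MS/MS scan
MAX_MSMS = 20          # maximum MS/MS events per cycle
RUN_MIN = 90           # acquisition time per LC-MS run

cycle_ms = SURVEY_MS + MAX_MSMS * MSMS_MS          # full-cycle duration
cycles_per_run = RUN_MIN * 60 * 1000 // cycle_ms   # whole cycles in one run
msms_ceiling = cycles_per_run * MAX_MSMS           # assumed upper bound

# Observed 1D value from the Results, compared against the ceiling.
observed_1d_msms = 27_432
ceiling_fraction = observed_1d_msms / msms_ceiling
```

Under these assumptions a full cycle takes 2.25 s, giving 2,400 cycles and a ceiling of 48,000 MS/MS per run; the 1D run therefore used roughly 57% of that ceiling.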

Figure 2. Concatenation strategy and orthogonality of separation between first two dimensions in 3D mode. A) UV (214 nm) chromatogram for pH 10 separation (~200 µg of the digest) and the 21 fraction concatenation scheme; B) UV profile for RP-HFBA separation (~720 µg of the digest) and the 6 fraction concatenation scheme for the first dimension of 3D mode; C) six chromatograms (RP pH 10) for the concatenated fractions (panel B) using separation conditions identical to those in panel A: 3D-1 corresponds to combined fraction pools #1 and 7 from panel B, etc. Note a uniform distribution of UV signal in all six separations.

First-dimension separation for the 3D protocol used the same XTerra MS C18 column, a 300 µL/min flow rate and a 500 µL injection loop. The gradient program consisted of the following steps: a 5 min wash at starting conditions (100% buffer A, water with 0.1% HFBA), a 60 minute linear increase from 0 to 60% of buffer B (acetonitrile, 0.1% HFBA), and a 10 minute wash at 90% B. ~720 µg of the digest was injected; one-minute fractions were collected and concatenated into 6 fractions based on the UV signal at

214 nm (Figure 2b). These fractions were lyophilized and re-suspended in 200 µL of buffer A for the second dimension (20 mM ammonium formate in water, pH 10).

Data treatment and protein/peptide identification. Raw spectra files were converted into Mascot Generic File format (MGF) for peptide/protein identification by the X!Tandem search algorithm39. Seven combined MGF files (each containing the 21 MGFs of individual fractions) were created: one for the 2D-LC run and six for the 3D fractions. The following X!Tandem search parameters were used: 20 ppm and 50 ppm mass tolerance for parent and fragment ions, respectively; constant modification of Cys with iodoacetamide; the default set of post-translational modifications: oxidation of Met, Trp; N-terminal cyclization at Gln, Cys; N-terminal acetylation, phosphorylation (Ser, Thr, Tyr), deamidation (Asn and Gln); an expectation value cut-off of log(e) < –1 for both proteins and peptides.

RESULTS AND DISCUSSION

Selection of separation chemistry for additional separation dimension. We decided to use another RP functionality due to its superior separation efficiency compared to other peptide separation techniques. The third separation dimension needs sufficient orthogonality to both the pH 10 RP and pH 2 RP - formic acid modes. Since the separation mechanism is based on hydrophobic interactions in all 3 instances, such a selection represented a very challenging task. It is known, however, that peptide retention on C18 columns depends on both hydrophobic and ion-pairing interactions. Varying the hydrophobicity of ion-pairing modifiers is a traditional way to adjust separation selectivity in these systems. In the era of UV detection, the selection of ion-pairing agents was based on separation efficiency, selectivity and low absorptivity in the UV region. The perfluorinated carboxylic acids TFA and HFBA were selected as some of the best modifiers19. 
In contrast, LC-MS applications are dominated by formic and acetic acid based eluents due to better compatibility with ESI, despite the fact that they provide poorer retention of hydrophilic peptides and separation efficiency. For multi-dimensional off-line applications, however, poor ESI compatibility of perfluorinated carboxylic acids is not an issue. Guo et al.19 showed that application of more hydrophobic ion-pairing modifiers at acidic pH increases retention proportionally to the number of positively charged groups in peptide (side chains of Arg, Lys, His and N-terminal NH2 group). It was concluded that the retention on C18 columns increases by approximately 1% acetonitrile between orthophosphoric acid and TFA, and by 2 % per one charged group between TFA and HFBA19. We compared retention characteristics in the order of

FA-TFA-HFBA using our standard P1-P6 peptides38 and a Luna C18 column (not shown here). The average increase in retention of these peptides carrying 2 charges was found to be 3.5% and 6.8% per charged residue between the FA-TFA and FA-HFBA pairs. It should be noted that Guo et al.’s19 assumption of proportionality between the number of charged groups and retention is approximate. In fact the overall peptide hydrophobicity and many sequence specific factors will impact the magnitude of the retention shift35. This results in better orthogonality between HFBA – FA or TFA – FA than predicted based on a solely additive effect of peptide charge. Based on both the data reported by Stephanowitz et al.20 and our own observations, we concluded that switching between FA and TFA would not provide a sufficient change in separation selectivity. Therefore heptafluorobutyric acid was chosen for this study. HFBA was used as the ion-pairing modifier for the reversed-phase dimension in the original papers describing the SCX-RP MudPIT approach8,9. Another application40 employed C18-HFBA functionality to separate selected fractions from a C18-TFA chromatogram to identify peptides represented by the major histocompatibility molecule HLA-A2.1 that is recognized by melanoma specific CTLs. However, we did not find prior applications of HFBA-based RP separation for multi-dimensional LC-MS proteomic analysis in conjunction with pH 10 or formic acid based RP systems.
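The additive picture discussed above, in which the retention shift scales with the number of positively charged groups, can be sketched in a few lines. This is an illustration of the approximation only, not the SSRCalc model: the function names are hypothetical, the example sequences are made up, and the per-charge values (3.5 for FA to TFA, 6.8 for FA to HFBA, taken here to be in % acetonitrile) are the P1-P6 averages quoted in the text. As the text notes, real shifts also depend on overall hydrophobicity and sequence-specific factors.

```python
# Additive approximation: shift = (number of positive groups) x (shift per charge).
# Per-charge values are the P1-P6 averages from the text; treating them as
# % acetonitrile relative to formic acid is an assumption for this sketch.
SHIFT_PER_CHARGE = {"TFA": 3.5, "HFBA": 6.8}


def positive_groups(sequence: str) -> int:
    """Count the free N-terminal amine plus Lys (K), Arg (R) and His (H) side chains."""
    return 1 + sum(sequence.count(residue) for residue in "KRH")


def additive_shift(sequence: str, modifier: str) -> float:
    """Estimated retention shift versus formic acid under the additive model."""
    return positive_groups(sequence) * SHIFT_PER_CHARGE[modifier]


# Hypothetical example sequences (not the P1-P6 standards).
doubly_charged = "AAGK"     # N-terminus + Lys
triply_charged = "AHGLR"    # N-terminus + His + Arg
```

With these inputs the sketch gives a larger HFBA shift for the triply charged sequence than for the doubly charged one, which is the selectivity change the first dimension relies on.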

Concatenation procedures and separation orthogonality between RP-HFBA and RP-pH 10 systems. Figure 2a shows the UV profile for the C18 pH 10 separation of ~200 µg of Jurkat cell tryptic digest from the 2D experiment. The typical bell-shaped distribution was observed, which allows us to determine the concatenation strategy as described elsewhere15. We chose the time range between 10-50 min as the major portion of the chromatogram and concatenated 1-min fractions pair-wise: 11-31, 12-32… 30-50 (henceforth called fractions 2D-11…2D-30). Only a small portion of the early eluting peptides (before 10 minutes) had sufficient hydrophobicity to be retained in FA-based systems. The chromatogram’s “tail” (after 50 minutes) contains a small number of peptides as well, so these two regions of the chromatogram were concatenated into a single fraction: 2D-10. We assume that these 21 concatenated fractions (2D-10…2D-30) contain approximately equal amounts of material: ~10 µg each. In the last dimension of the 2D procedure we injected ~1/8 of each fraction, corresponding to ~1.2 µg. Approximately 720 µg of the digest was separated using the same XTerra MS C18 column with 0.1% HFBA as the ion-pairing modifier for the 3D experiment (Figure 2b). One-minute fractions were collected and concatenated into 6 pools based on UV signal intensity, as shown in Figure 2b. Following the same principle as in the pH 10 – FA system, we combined the early and middle fractions 1 – 7, 2 –

8, … 6 – 12. We estimated that the resulting fractions contained ~120 µg each, and performed six pH 10 separations under conditions identical to those described previously, illustrated in Figure 2c (3D-1…3D-6). All six chromatograms showed uniform UV profiles, a good indication of sufficient orthogonality between the RP HFBA and RP pH 10 systems. Similarly, 21 concatenated fractions were collected for each of the 6 chromatograms, giving ~6 µg of peptides in each. The resulting (6 x 21) 126 fractions (1/5 of each corresponding to ~1.2 µg) were subjected to the final stage of 90 min LC-MS with formic acid based eluents.
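The pair-wise concatenation described above can be sketched as pairing each fraction in the first half of a window with the fraction half a window later. The helper name `pairwise_pools` and the fraction numbering are assumptions made for illustration; the pooling of the chromatogram edges into fraction 2D-10 is represented only as one extra pool in the count.

```python
# Sketch of pair-wise concatenation: fraction i is pooled with fraction i + half,
# where half is half the number of fractions in the window.
def pairwise_pools(first: int, last: int) -> list[tuple[int, int]]:
    """Pair fractions first..last (inclusive) into early/late pools."""
    n = last - first + 1
    if n % 2:
        raise ValueError("window must contain an even number of fractions")
    half = n // 2
    return [(i, i + half) for i in range(first, first + half)]


# pH 10 dimension: 1-min fractions 11..50 cover the 10-50 min window.
ph10_pools = pairwise_pools(11, 50)
n_ph10 = len(ph10_pools) + 1          # +1 for the pooled edges (fraction 2D-10)

# HFBA dimension: 12 fraction pools combined into 6 (1-7, 2-8, ...).
hfba_pools = pairwise_pools(1, 12)

total_fractions = len(hfba_pools) * n_ph10     # final-stage LC-MS runs
instrument_hours = total_fractions * 90 / 60   # 90 min per run
```

With these inputs the sketch yields 20 paired pH 10 pools plus the edge pool (21 in total), 6 HFBA pools, 126 final fractions and 189 hours of 90 min runs, matching the totals stated in the text.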

A summary of protein and peptide identifications is shown in Table 1. One of the purposes of this study was to compare the 1D, 2D and 3D protocols with identical settings in the last LC-MS dimension, i.e. with a proportional 21- and 126-fold increase in the sample load and analysis time. The 90 minute 1D acquisition (averaged across 3 technical replicates) gave 27,432 acquired MS/MS, 18,849 identified peptides, 11,878 unique peptide and 2,535 protein IDs at an FPR of ~0.4%. Assuming perfect orthogonality between the two dimensions and identical sample load, one could expect a proportional increase in acquired MS/MS spectra (27,432 x 21 = 576,072) and the number of identified peptides (18,849 x 21 = 395,829) for the 2D run. The actual numbers of acquired MS/MS and identified peptides for the 2D run were found to be 89% and 85% of these hypothetical best values, indicating extremely good orthogonality of our high pH – low pH 2D protocol with concatenation. This was further reinforced by the uniform distribution of both acquired MS/MS and peptide identifications across all 21 fractions (Figure 3a), as well as within the MS/MS level chromatographic space of each of these fractions (Figure S1a). The number of unique peptide identifications was found to be 44% of the 21-fold scale-up of the 1D output, due to the increased peak intensity and the redundancy of MS/MS acquisitions. The 3D LC-MS experiment resulted in over 2.5 and 1.5 million acquired and identified MS/MS spectra, respectively. The number of MS/MS acquired in 3D was equal to 73% and 82% of the ideal scaled-up values from the 1D (x126) and 2D (x6) runs. Comparison of the less-than-unity contributions of the HFBA dimension into 2D (82%), and of the pH 10 dimension into 1D (89%), demonstrates that the acidic pH RP-HFBA system is less orthogonal to the FA separation dimension than the pH 10 separation dimension is, as we anticipated. 
The acquisition of more than 70% of the potential MS/MS in 3D indicates a very efficient use of separation space, as illustrated further in Figures 3b and S1b.
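The scale-up reasoning above is simple arithmetic and can be checked directly. The sketch below recomputes the ideal 21-fold and 126-fold values from the reported 1D counts and compares them with the reported percentages; the variable names are hypothetical, and all input numbers are taken from the text.

```python
# 1D results (90 min run, averaged over 3 technical replicates), from the text.
MSMS_1D = 27_432
IDENTIFIED_1D = 18_849
UNIQUE_PEPTIDES_1D = 11_878

# Ideal outputs if each added dimension were perfectly orthogonal.
ideal_2d_msms = MSMS_1D * 21
ideal_2d_identified = IDENTIFIED_1D * 21
ideal_2d_unique = UNIQUE_PEPTIDES_1D * 21
ideal_3d_msms = MSMS_1D * 126

# Reported 2D unique peptides, as a fraction of the ideal 21-fold scale-up.
UNIQUE_PEPTIDES_2D = 109_461
unique_fraction_2d = UNIQUE_PEPTIDES_2D / ideal_2d_unique

# The per-dimension MS/MS efficiencies multiply: pH 10 into 1D (89%) and
# HFBA into 2D (82%) together give the overall 3D-versus-1D efficiency.
overall_3d_efficiency = 0.89 * 0.82
expected_3d_msms = overall_3d_efficiency * ideal_3d_msms
```

The products come out to 576,072 ideal MS/MS and 395,829 ideal identified peptides for 2D, the unique-peptide fraction is about 44%, and 0.89 x 0.82 rounds to the reported 73%, which corresponds to roughly 2.5 million MS/MS in 3D.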

Overall, 109,461/8,757 and 251,166/14,230 unique peptides/proteins were identified in the 2D and 3D LC-MS runs, respectively.
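These counts, together with the 1D values given earlier (11,878 unique peptides, 2,535 proteins, 18,849 identified MS/MS), allow a quick look at how redundancy grows with fractionation depth. The sketch below is a plain ratio calculation; the dictionary layout and names are hypothetical.

```python
# Unique peptide and protein counts per protocol, as reported in the text.
RESULTS = {
    "1D": {"unique_peptides": 11_878, "proteins": 2_535},
    "2D": {"unique_peptides": 109_461, "proteins": 8_757},
    "3D": {"unique_peptides": 251_166, "proteins": 14_230},
}

# Average number of unique peptides supporting each protein identification.
peptides_per_protein = {
    mode: round(counts["unique_peptides"] / counts["proteins"], 1)
    for mode, counts in RESULTS.items()
}

# For 1D the identified MS/MS count is also given exactly, so the average
# number of identified spectra per unique peptide can be computed as well.
IDENTIFIED_MSMS_1D = 18_849
spectra_per_unique_peptide_1d = round(
    IDENTIFIED_MSMS_1D / RESULTS["1D"]["unique_peptides"], 2
)
```

The ratios rise from 4.7 to 12.5 to 17.7 unique peptides per protein, so the gain in protein identifications is slower than the gain in peptide identifications as fractionation deepens.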

Figure 3. Separation orthogonality in 2D and 3D LC-MS systems. The distribution of the number of acquired MS/MS (green), assigned MS/MS (blue) and unique peptide IDs (red) across 21 fractions in 2D (A) and 126 fractions in 3D LC-MS (B). Respective levels for 1D acquisition are shown as horizontal lines.

In estimating the separation orthogonality we assumed that 1D LC-MS/MS will generate the highest possible number of tandem spectra per unit of time compared to multi-dimensional runs. This assumption is based on the expectation that the non-fractionated digest will be the most complex mixture we are dealing with, thus providing a maximal load of MS/MS. In reality, these numbers may be affected by ionization suppression effects, the selection of the intensity threshold for triggering MS/MS acquisition, the duration of dynamic exclusion of previously fragmented species, and the amount of material used in each last-dimension LC-MS run, and thus should be treated with caution. Over the past few years we have analyzed over 30 different cell lines/organisms using 1D and 2D approaches and never observed higher MS/MS output for 2D runs. These acquisitions, however, were performed with strict control of sample load and of the performance of both LC and mass analyzers. Introducing metrics that account for the uniformity of the MS/MS distribution across the separation space in all dimensions will be necessary for a more accurate estimation of orthogonality. The results shown in Table 1 and Figure S2 follow the usual trends observed in proteomic shotgun approaches: diminishing returns in unique protein and peptide identifications per acquired MS/MS for deeper sample fractionation. Thus, each unique peptide assignment was based on an average of 1.59, 3.08 and 6 identified MS/MS for 1D, 2D and 3D, respectively. The corresponding numbers of unique peptide identifications per protein were 4.7, 12.5 and 17.7. As shown in Table 2,

deeper peptide fractionation leads to an increasing percentage of identified post-translationally modified peptides. Conceivably, these species have lower abundance (oxidation, deamidation, N-terminal cyclization) or yield less MS signal (phosphorylation, N-terminal cyclization) compared to the non-modified species. The fraction of modifications that may arise during sample handling (oxidation, deamidation, N-terminal cyclization) increases due to both better fractionation and continuing peptide modification during fraction collection. We found another difference, between N-terminal acetylation and phosphorylation (both occurring in vivo), which further highlights the features to be expected for deep peptide fractionation. Acetylation is a high-abundance modification: its percentage contribution slightly decreases despite a more than 10-fold absolute increase in the number of N-terminally acetylated peptides between 1D and 3D (203 vs. 2453 identifications). Conversely, phosphorylation is a non-stoichiometric modification that reduces the detection sensitivity of the modified peptide. Its contribution to the total number of peptide IDs thus increases continuously with better fractionation: 0.54, 1.51 and 3.74% for 1D, 2D and 3D.
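
The contrast between the two in-vivo modifications can be made explicit with a small calculation. The sketch below uses only the numbers quoted above; note that the acetylation figure is a fold change in absolute identifications, whereas the phosphorylation figure is a fold change in the share of all peptide IDs. Variable names are illustrative.

```python
# Absolute numbers of N-terminally acetylated peptide identifications.
acetylated_ids = {"1D": 203, "3D": 2453}

# Phosphopeptide contribution to all peptide IDs, in percent.
phospho_share_pct = {"1D": 0.54, "2D": 1.51, "3D": 3.74}

# Fold increase in the absolute count of acetylated peptides (1D -> 3D).
acetyl_fold = acetylated_ids["3D"] / acetylated_ids["1D"]

# Fold increase in the relative share of phosphopeptides (1D -> 3D).
phospho_share_fold = phospho_share_pct["3D"] / phospho_share_pct["1D"]

print(f"Acetylated peptide IDs: {acetyl_fold:.1f}-fold increase")
print(f"Phosphopeptide share:   {phospho_share_fold:.1f}-fold increase")
```

The acetylated count grows roughly 12-fold while its share of all IDs slightly declines, whereas the phosphopeptide share itself grows roughly 7-fold.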

Separation orthogonality in 2D and 3D LC-MS systems. We found the output of our analysis to be on par with the current state of the art reported in the literature. Considering our instrumental settings, in which we did not attempt to push the limits of either the mass spectrometer (a 5-year-old TripleTOF 5600 instrument, 100 ms MS/MS acquisition, non-aggressive 12 s exclusion window) or the chromatographic system (3-5 µm fully porous sorbents, rather low ~2200 psi operating pressure), we attribute the level of performance to the proper selection of separation chemistry and good orthogonality. To illustrate the latter, one should consider the uniformity of the distribution of MS/MS acquisitions, peptide IDs, and unique peptide and protein IDs across all dimensions. In the case of the 2D experiment, this uniformity is evident for the number of identified peptides in all 21 fractions in the first dimension (Figure 3a), and in the 5-minute-wide time bins across formic acid runs in the second dimension (Figure S1a). It should be noted that a uniform distribution between fractions in the first dimension is necessary, but not sufficient, to demonstrate system orthogonality: peptides could cover only a small portion of the time space in the last dimension yet exhibit the same number of IDs. Analyzing the deviations from the expected ideal profiles (1D runs, black bars in Figure S1a) helped to track the losses observed. The asterisks in Figure S1a indicate the incomplete load of MS/MS acquisition for hydrophobic peptides in early fractions and hydrophilic peptides in late fractions from the pH 10 separation, i.e. at the edges of the chromatogram.
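
One possible way to put a number on such uniformity is a normalized Shannon entropy of the identification counts per bin. This is an assumption made here for illustration and is not a metric used in this work. The sketch below returns 1.0 for a perfectly even distribution and 0.0 when all identifications fall into one bin; the per-fraction counts are hypothetical.

```python
import math


def normalized_entropy(counts):
    """Shannon entropy of a count distribution divided by log(number of bins).

    1.0 means identifications are spread evenly over all bins;
    0.0 means they all fall into a single bin.
    """
    total = sum(counts)
    if total <= 0 or len(counts) < 2:
        raise ValueError("need at least two bins and a positive total")
    # Empty bins contribute nothing to the entropy sum.
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(counts))


# Hypothetical unique-peptide counts per fraction (not data from this study).
even_fractions = [5000] * 21
skewed_fractions = [100] * 10 + [9000] + [100] * 10

print(f"even:   {normalized_entropy(even_fractions):.3f}")
print(f"skewed: {normalized_entropy(skewed_fractions):.3f}")
```

Because an even distribution across first-dimension fractions is necessary but not sufficient, the same function could be applied to the flattened grid of fraction by retention-time bin counts, so that clustering in the last dimension also lowers the score.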

Orthogonality of the whole 3D system is illustrated by the even distribution between the six fractions in the first dimension (Table 1), the 126 fractions in the second dimension (Figure 3b) and across the retention time scale in the third dimension (Figure S1b). Of note, the "borderline" fractions #1 and #6 in the HFBA dimension produced a number of spectra and identifications comparable to (but lower than) the middle fractions. While this may appear to contradict the relative amount of UV signal in pools 1 and 12 in Figure 2b, we performed such pooling in anticipation that a good portion of the early-eluting peptides in the HFBA dimension (pool 1) would not be retained in the FA system. It was also of interest to compare the orthogonality of some previously reported 3D LC-MS schemes, although in many cases information on peptide distributions in all dimensions was not provided. F. Zhou et al.25 performed their RP pH 10 – SAX – RP pH 2 separation in different formats ranging from 19 to 236 fractions; the distribution of unique peptide IDs across 37 RP pH 10 – SAX fractions was found to be satisfactory: 34 out of 37 fractions gave 100-300 identifications. The same approach was applied to the analysis of an iTRAQ-labeled whole cell digest in a 19-fraction RP pH 10 – SAX format26. In this case, 5 of the 19 fractions gave 50 to 100 times fewer identifications than the most populated one (see Figure 3 for comparison). It should be noted that this protocol was based on a high pH – low pH approach without concatenation, but with SAX steps in between. It is difficult to obtain superior orthogonality for two RP separations without concatenation. The 3D (SCX-HILIC-RP) procedure of H. Zhou et al.31 had a uniform distribution of phosphopeptide IDs across 3 SCX pools. The authors observed bell-shaped distributions for all 3 HILIC separations in the second dimension, with the middle fractions giving up to 1,800 IDs, while the edges of the HILIC profiles were much less populated.

Dynamic range of the analysis at MS2 level and accuracy of retention time assignment. We used non-aggressive settings for MS/MS acquisition on the TripleTOF 5600: 100 ms per tandem MS spectrum and a 12 s exclusion window for previously fragmented parent ions. The motivation was the acquisition of high-quality MS/MS information at a few points across a chromatographic peak. This would give better quantitative information at the MS2 level and help in assigning peak maxima for the retention data collection. The resulting distribution of the summed MS2 fragment intensity for unique peptides is shown in Figure 4a. Both 2D and 3D were capable of detecting peptides of slightly lower abundance compared to 1D: 10^3.2 and 10^3 vs. 10^3.4. The major expansion of the dynamic range was observed at the high-abundance portion of the scale: from 10^7.6 (1D) up to 10^8.8 and 10^9.2. The

observed increase of total dynamic range between 1D and 3D, from 4.2 to 6.2 orders of magnitude (100-fold), was very close to the expected 126-times difference in sample load.
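
The dynamic-range arithmetic can be reproduced in a few lines. The sketch below reads the reported intensity limits as powers of ten (10^3.4 to 10^7.6 for 1D, 10^3.2 to 10^8.8 for 2D, 10^3 to 10^9.2 for 3D) and compares the gain with the 126-fold difference in sample load; variable names are illustrative.

```python
import math

# log10 of summed MS2 fragment intensity: (lowest, highest) detected per mode.
log10_limits = {"1D": (3.4, 7.6), "2D": (3.2, 8.8), "3D": (3.0, 9.2)}

# Dynamic range in orders of magnitude for each mode.
dynamic_range = {mode: hi - lo for mode, (lo, hi) in log10_limits.items()}

# Gain between 1D and 3D, in orders of magnitude and as a fold change.
gain_orders = dynamic_range["3D"] - dynamic_range["1D"]
gain_fold = 10 ** gain_orders

# Expected gain if dynamic range scaled directly with the sample load.
load_orders = math.log10(126)

for mode, orders in dynamic_range.items():
    print(f"{mode}: {orders:.1f} orders of magnitude")
print(f"1D -> 3D gain: {gain_orders:.1f} orders ({gain_fold:.0f}-fold)")
print(f"126-fold load corresponds to {load_orders:.2f} orders")
```

The observed gain of 2.0 orders of magnitude sits just below the roughly 2.1 orders implied by the 126-fold increase in load.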

Figure 4. Dynamic range of MS2 acquisition on peptide level and chromatographic peak shape for abundant species in 1D, 2D and 3D analyses. A) Distribution of unique peptides’ total MS/MS fragments intensity for 1D (green), 2D (red) and 3D (blue); B) Extracted ion chromatograms for VTIAQGGVLPNIQAVLLPK (m/z = 966.096 (2+)) peptide in 1D, 2D fraction #10 and 3D-3 fraction #10 LC-MS runs.

We also found that the increase in peptide load leads to a deterioration of chromatographic profiles for the abundant species. Figure 4b shows extracted ion chromatograms for one of the most abundant peptides, VTIAQGGVLPNIQAVLLPK (m/z = 966.096 (2+)), in 1D and in its most intense fractions in the 2D and 3D acquisitions. MS/MS for all charge states of this peptide was acquired 62 (average of 3 replicates), 393 and 837 times in the three separation modes, respectively. This redundancy of MS/MS acquisitions is one of the major reasons for the diminishing output of unique peptides/proteins when deeper fractionation is used in conjunction with an increased sample load. We

recommend that retention data be collected in an exclusive, step-wise manner: the retention of the most abundant species should be extracted from 1D measurements, and these peptides should be excluded from retention assignment using 2D LC-MS data. Similarly, peptides detected in either the 1D or the 2D mode should not be part of the retention data collection at the 3D level.
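
The exclusive, step-wise assignment described above amounts to successive set differences. The following is a minimal sketch under stated assumptions: the function name, the data layout (a set of peptide sequences per mode) and all sequences other than VTIAQGGVLPNIQAVLLPK are hypothetical placeholders.

```python
def assign_retention_sources(ids_by_mode, order=("1D", "2D", "3D")):
    """Assign each peptide to the shallowest mode in which it was identified.

    Peptides already claimed by an earlier mode in `order` are removed
    from every later mode.
    """
    seen = set()
    exclusive = {}
    for mode in order:
        exclusive[mode] = set(ids_by_mode.get(mode, ())) - seen
        seen |= exclusive[mode]
    return exclusive


# Hypothetical identifications; only the first sequence appears in this work.
ids = {
    "1D": {"VTIAQGGVLPNIQAVLLPK", "SAMPLEK"},
    "2D": {"VTIAQGGVLPNIQAVLLPK", "SAMPLEK", "TESTK"},
    "3D": {"VTIAQGGVLPNIQAVLLPK", "TESTK", "DEMAR"},
}

exclusive = assign_retention_sources(ids)
for mode, peptides in exclusive.items():
    print(mode, sorted(peptides))
```

With this scheme the abundant peptide contributes a retention value only from the 1D run, where its chromatographic profile is least distorted by overloading.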

Filtering false positive identifications using peptide retention prediction is used routinely in our laboratory in both 1D and 2D LC-MS applications. This allows constant control of LC performance and of the quality of the retention data used for further improvement of the SSRCalc prediction model. Figures 5 and S3 show the application of the formic acid and pH 10 models to the 2D LC-MS data obtained in this work. Experimental retention times (formic acid dimension) for 88,049 non-modified tryptic peptides with Log (e) < -1 were plotted against predicted Hydrophobicity Index values in Figure 5 (HI, http://hs2.proteome.ca/SSRCalc/SSRCalcQ.html). The resulting correlation of R² ~ 0.96 (and R² ~ 0.94 for the current pH 10 model) provides solid support for the exclusion of “chromatographic” outliers (Figure 5a,b). Note that none of the peptides with high-confidence Log (e) < -3 values were excluded. While the gain in predictive accuracy of the current model over the version reported previously15 (R² 0.96 vs. 0.94) may seem insignificant, the data shown in Figure 5 cover a much more diverse set of peptides (~88,000 vs. ~15,000), include non-Arg/Lys-terminated (C-terminal) peptides, and sample a larger proportion of long peptides thanks to significant improvements in the detection sensitivity of newer mass spectrometers. At this point we did not address the development of an SSRCalc model for HFBA-based separations. However, a possible wider application of this separation mode in proteomics will likely prompt such development.
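
The general idea of retention-based filtering can be sketched as follows: fit a line to retention time versus predicted hydrophobicity, then flag peptides whose residual exceeds a tolerance. This is not the SSRCalc algorithm or the exact filter used here. The one-pass ordinary least-squares fit, the residual tolerance and the example values are all assumptions made for illustration; in practice the HI values would come from the SSRCalc predictor.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y, slope, intercept):
    """Coefficient of determination for the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def flag_outliers(x, y, slope, intercept, max_residual):
    """True for points whose absolute residual exceeds max_residual."""
    return [abs(yi - (slope * xi + intercept)) > max_residual
            for xi, yi in zip(x, y)]


# Hypothetical HI values and retention times (min); the fifth point is a
# deliberately misplaced "chromatographic" outlier.
hi = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45]
rt = [10, 20, 30, 40, 90, 60, 70, 80, 90, 100]

slope, intercept = fit_line(hi, rt)
r2_before = r_squared(hi, rt, slope, intercept)
flags = flag_outliers(hi, rt, slope, intercept, max_residual=15.0)

# Refit on the retained points only.
kept_hi = [h for h, f in zip(hi, flags) if not f]
kept_rt = [t for t, f in zip(rt, flags) if not f]
slope2, intercept2 = fit_line(kept_hi, kept_rt)
r2_after = r_squared(kept_hi, kept_rt, slope2, intercept2)

print(f"R2 before filtering: {r2_before:.3f}, after: {r2_after:.3f}")
```

A filter of this kind could additionally be restricted to low-confidence identifications, which would be consistent with the observation that no peptides with Log (e) < -3 were excluded.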

Figure 5. Filtering false positive identifications using peptide retention prediction. A - peptide retention prediction (SSRCalc, formic acid conditions) for 88,049 non-modified tryptic peptides identified in 2D LC-MS/MS with log (e) < -1; B - peptide retention prediction for 87,804 peptides after retention prediction filtering in both dimensions (see also Figure S3).

CONCLUSIONS

In an attempt to increase the depth of proteome coverage for the analysis of whole human cell lysates, we expanded our high pH – low pH RP-RP 2D LC-MS system15 into a 3D format with reversed-phase separation chemistry in all 3 modes. This additional dimension is based on a C18 separation modality with heptafluorobutyric acid as the ion-pairing modifier, giving complementary orthogonality to both the pH 10 and formic acid separation conditions. We evaluated 2D and 3D orthogonality based on the uniformity of detectable feature distributions in all dimensions, as well as

total number of acquired MS/MS in comparison to a scaled-up output of a 1D LC-MS run. The 2D and 3D procedures utilized 89% and 73% of the available instrument MS/MS acquisition time, respectively. The proposed 3D analysis scheme offers increased depth of coverage at the protein and peptide level, allowing identification of a higher percentage of low-abundance peptides with PTMs. It resulted in the identification of over 14,000 proteins and 250,000 unique peptides across 189 hours of instrument time. Approximately 181,000 of these molecules are tryptic non-modified peptides, and their hydrophobicity index values were accurately assigned and incorporated into our SSRCalc retention database. The proposed protocol is scalable (longer/shorter gradients, larger/smaller number of fractions) and amenable to peptide retention prediction in all 3 dimensions. Furthermore, the MS and LC settings can be easily altered to exploit the peak performance of LC-MS methods: faster MS/MS (available on the same TripleTOF 5600 and newer platforms), more aggressive peak exclusion protocols, the use of UPLC or core-shell sorbents in all 3 dimensions, and non-linear gradients in the last LC-MS stage of analysis. The limitations of this method include increased time and sample consumption, the deterioration of chromatographic profiles of the most abundant peptides, and an increased proportion of unwanted peptide modifications due to fraction handling. The latter two issues will be of particular importance when applying multidimensional LC-MS to label-free quantitation of proteomic samples. The disproportionately high load of some analytes will result in skewed quantitative results for high-abundance peptides (due to MS analyzer saturation) and co-eluting low-abundance peptides (due to ionization suppression).
A decrease in relative MS signal upon deep off-line fractionation is expected for peptides prone to spontaneous chemical modifications: Met and Trp oxidation, deamidation (Asn-Gly, Asn-Ser, Gln-Gly sequences), and N-terminal cyclization (Gln, carbamidomethyl-Cys). How all of these will affect protein quantitation, which is based on quantitation of the constituent peptides, is a subject of ongoing studies in our laboratory. We started our protocol with ~6-times more digested material compared to the amount injected in the last LC-MS dimension. This was done in anticipation of irreversible sample loss (adsorption on plastic surfaces, etc.) during extensive fractionation. While these losses are hard to estimate, using larger starting amounts will help to minimize their effect on the quantitative and qualitative output of the analysis. Undoubtedly, the amount of available protein material dictates the selection of the analytical protocol in proteomics: application of 3D LC-MS is not recommended for low-µg samples. The procedure shown here was developed to ensure a uniform, efficient delivery of separated peptide

mixtures into the MS analyzer over an extended period of time. Its use will be most beneficial for extremely complex protein mixtures (tissues, meta-proteomic samples) when a sufficient amount of protein is available for the analysis.

Supporting Information

The MGF spectrum files and peptide/protein identification results are available from the University of California San Diego's MassIVE repository (massive.ucsd.edu) as entry MSV000079480. Additional information as noted in text. This material is available free of charge via the Internet at http://pubs.acs.org.

Notes

The authors declare no competing financial interests.

Acknowledgements

This study was supported by the Natural Sciences and Engineering Research Council of Canada Discovery Grant (RGPIN/355939-2011, O.V.K.).

Table 1. Summary of protein/peptide identification for 1D, 2D and 3D LC-MS/MS acquisitions.

Run | Instrument time (h) | Sample amount (µg) | MS/MS acquired | Peptide IDs | Unique peptide IDs
1D-1 | 1.5 | 1 | 27,391 | 18,862 | 11,844
1D-2 | 1.5 | 1 | 27,555 | 19,027 | 12,066
1D-3 | 1.5 | 1 | 27,349 | 18,659 | 11,725
2D | 31.5 | 21 | | |
3D-1 | 31.5 | | | |
3D-2 | 31.5 | | | |
3D-3 | 31.5 | | | |
3D-4 | 31.5 | | | |
3D-5 | 31.5 | | | |
3D-6 | 31.5 | | | |
3D (1-6)* | 189 | | | |

Proteins Log(e)