MALDI versus ESI: The Impact of the Ion Source on ... - ACS Publications


Article

MALDI versus ESI – the impact of the ion source on peptide identification Wiebke Maria Nadler, Dietmar Waidelich, Alexander Kerner, Sabrina Hanke, Regina Berg, Andreas Trumpp, and Christoph Roesli J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.6b00805 • Publication Date (Web): 08 Feb 2017 Downloaded from http://pubs.acs.org on February 8, 2017

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Proteome Research

MALDI versus ESI – the impact of the ion source on peptide identification

Wiebke Maria Nadler1, Dietmar Waidelich2, Alexander Kerner1, Sabrina Hanke1, Regina Berg3, Andreas Trumpp1 and Christoph Rösli*,1,4

1 German Cancer Research Center and HI-STEM gGmbH, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany

2 SCIEX Germany GmbH, Landwehrstraße 54, 64293 Darmstadt, Germany

3 Department of Chemistry, University of Zurich, Winterthurerstr. 190, 8057 Zurich, Switzerland

4 Current address: Novartis Pharma AG, Werk Klybeck, 4057 Basel, Switzerland

KEYWORDS: MALDI, ESI, Ionization, Proteomics, Peptide Modification

ABSTRACT For mass spectrometry-based proteomic analyses, electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI) are the commonly used ionization techniques. To investigate the influence of the ion source on peptide detection in large-scale proteomics, an optimized GeLC/MS workflow was developed and applied either with ESI/MS or with MALDI/MS for the proteomic analysis of different human cell lines of pancreatic origin. Statistical analysis of the resulting data set with more than 72000 peptides emphasized the complementary character of the two methods, as the percentage of peptides identified with both approaches was as low as 39 %. Significant differences between the resulting peptide sets were observed with respect to amino acid composition, charge-related parameters, hydrophobicity and modifications of the detected peptides and could be linked to factors governing the respective ion yields in ESI and MALDI.

INTRODUCTION

Developed more than thirty years ago, matrix-assisted laser desorption/ionization (MALDI)1 and electrospray ionization (ESI)2 remain the two most important techniques for the ionization of biomolecules in mass spectrometric applications. For MALDI, an analyte is embedded into a typically acidic matrix which heavily absorbs UV light. Excited by a short laser pulse, parts of the matrix heat rapidly and are vaporized/ionized together with the analyte.3 In ESI, an electric field is applied to an analyte solution flowing through a capillary. At the fine tip of the capillary, the liquid is emitted towards the counter electrode. As a result of solvent evaporation, the droplet size decreases, thus producing smaller droplets by Coulomb explosions and finally yielding charged ions. Both methods are characterized by unique strengths and limitations due to their fundamental differences in the process of ion generation.

Electrospray ionization offers a high degree of instrumental flexibility. Featuring a continuous generation of ions, ESI can easily be combined not only with liquid chromatography (LC) systems but likewise with various mass analyzers. The development of nanospray ESI sources by Wilm and Mann has furthermore greatly improved the sensitivity4 and paved the way for the technique’s predominance in quantitative large-scale proteomics. MALDI mass spectrometry, on the other hand, can capture the spatial component of a sample as illustrated by MALDI imaging applications.5 In addition, the technique allows for the reanalysis of a sample as a result of the decoupling of chromatographic separation and ionization. The lower time-efficiency of these instruments, however, has made MALDI mass spectrometry a niche application and restrained the development of the respective software tools and workflows for complex proteomics. As a consequence, MALDI/MS data sets with peptide and protein numbers comparable to state-of-the-art ESI/MS analyses have not been published to date. Recently, the fast-growing interest in MALDI imaging applications resulted in a number of technical advancements. For example, increased laser repetition rates have helped to substantially reduce sample analysis time.6

The impact of the ion source on a proteomic experiment has been the subject of previous investigations.7 However, most of these studies were performed on samples of low complexity or otherwise suffer from low identification numbers that might not adequately reflect the (physicochemical) diversity of a complex proteome. To representatively investigate the influence of the ion source on peptide detection in large-scale proteomics, we present a GeLC-MS workflow with optimized complexity reduction. This workflow was applied with either ESI/MS or MALDI/MS for the proteomic analysis of four different cell lines of human pancreatic origin. The resulting consistent and statistically sound dataset was exploited to compare composition, physicochemical properties and modifications of the detected peptides, but likewise to evaluate the general limitations shared by both techniques.

EXPERIMENTAL SECTION

Sample Preparation

Pancreatic ductal adenocarcinoma (PDAC) cell lines PACO2, PACO3 and PACO7 were kindly provided by E. Noll and M. Sprick (HI-STEM). Control cell line HPNE was purchased from ATCC (CRL-4023). All cell lines were cultured under serum-free conditions without antibiotics as described by Noll et al.8 Cells were lysed on flask with RIPA buffer (50 mM Tris HCl, 150 mM NaCl, 1 % NP-40, 0.5 % w/v sodium deoxycholate, 0.1 % w/v SDS, Protease Inhibitor, Roche) and lysates were left at -20 °C for at least 24 hours. Following homogenization (5 × 30 s on ice, IKA Ultra Turrax) and sonication (5 × 30 s, 30–40 % amplitude, Branson sonifier), the protein concentration of the clarified lysate was determined with a Pierce bicinchoninic acid assay.

The proteomic complexity was reduced by SDS-PAGE (Invitrogen NuPage system, 4–12 % Bis-Tris precast gels, 1.5 mm thickness, 10 wells, 50 µg protein per lane, reducing conditions). Electrophoretic separation was monitored by Coomassie-based staining (SimplyBlue Safe Stain, Life Technologies) to facilitate the slicing of the gel. Each lane was split into 13 gel fractions according to a specific pooling pattern (figure 1 a).

Tryptic digestion buffer (TDB, pH 8.0) was prepared with 50 mM Tris-HCl and 1 mM calcium chloride in water. Corresponding gel fractions from six parallel lanes were pooled, destained (50 % methanol in TDB), washed (TDB) and subsequently incubated for 30 min at 37 °C under agitation with 2 mM TCEP·HCl (tris(2-carboxyethyl)phosphine, solution in TDB). Following removal of the reduction solution, cysteine residues were carbamidomethylated with 20 mM iodoacetamide in TDB (30 min at room temperature (RT) in the dark with agitation). Samples were washed with water, TDB was added and samples were incubated for 10 min at RT. Gel slices were dehydrated in 80 % acetonitrile (ACN), dried in a vacuum concentrator, drenched with trypsin solution (Promega, 80 ng/µl in TDB) and covered with 400 µl of TDB for tryptic digest (overnight at 37 °C with agitation).

Samples were sonicated to extract the peptides from the gel slices and the supernatant was transferred to fresh Eppendorf tubes. The remaining gel slices were overlaid with 400 µl extraction solution (50 % ACN, 0.1 % trifluoroacetic acid (TFA)). After 20 minutes at RT with agitation, samples were sonicated for 5 min in a water bath and the supernatant was combined with the aqueous peptide solution. Samples were dried in a vacuum concentrator and resolubilized in 0.1 % TFA in water for desalting with Agilent Omix tips (elution with 75 % ACN, 0.1 % TFA in water). The eluate from each gel fraction was separated into five replicates and samples were dried in vacuo.

MALDI Analysis

Peptides from each gel fraction were solubilized in 5 % ACN, 0.1 % TFA in water for UHPLC-based separation (NanoAcquity ultra high pressure liquid chromatography system; C18 column 75 µm × 250 mm; 1.7 µm BEH130, both Waters) with a 90 min gradient (11–40 % ACN with 0.1 % TFA) and a flow rate of 350 nl/min. Approximately 3 µg peptide per run were loaded on column. A UHPLC- and syringe-pump-coupled spotting robot (SunChrom micro fraction collector/MALDI spotter with SunCollect software version 1.7.26) was programmed to mix the eluting peptides with α-cyano-4-hydroxycinnamic acid matrix (3 mg/ml CHCA in 80 % ACN) and standard spike-in peptides (delivered by a SunChrom micro syringe pump, 1 ml, Pico Plus, SunChrom, see Supporting Information table S-1) prior to spotting into 1200 fractions on a blank stainless steel MALDI target plate (SCIEX).

Mass spectrometric analysis was performed with a MALDI TOF/TOF™ 5800 instrument (SCIEX, TOF/TOF™ Series Explorer™ software version V4.1.0). Laser intensity was optimized individually before each batch submission or at least once per day. MS jobs for each run were submitted as a batch with MS reflector positive as operation mode, a mass range of 750–4000 Da, a stage velocity of 1000 µm/s, and a laser pulse rate of 400 Hz. The MS processing method supported peak smoothing with FFT and Poisson Denoise. Minimal S/N for peak detection was specified as 10, the local noise window width was set to 250 m/z and the minimal peak width at full width half maximum (FWHM) was 1 bin. Spike-in peptides were used for internal calibration with at least one matching peak (monoisotopic), a maximal outlier error of 10 ppm, a minimal S/N of 10 and a mass tolerance of +/- 0.3 m/z. MS spectra were accepted after 2000 shots.

MS/MS jobs of each MS run were submitted as a batch using LC precursor selection dynamic exit as job-wide interpretation method. This interpretation method uses the dynamic exit algorithm to select monoisotopic precursors for MS/MS. Based on MS intensity, up to 35 precursors were selected in a mass range of 750–4000 Da over the full retention time range. Minimum chromatogram peak width was required to be one fraction and fraction-to-fraction precursor mass tolerance was set to 200 ppm. MS/MS acquisition was performed with collision induced dissociation (air, medium gas pressure) and automatic acquisition control. Precursors were selected with a relative precursor mass window of 200 resolution (FWHM). Metastable suppressor was activated. MS/MS spectra of the weakest precursors (based on MS1 intensity) were acquired first. For MS/MS spectra, the laser pulse rate was 1000 Hz, the sample plate was moved with a stage velocity of 1200 µm/s and acceptance was reached after 3000 shots or if the final spectrum reached the desired high quality. The processing method for MS/MS spectra specified usage of the Savitzky-Golay method for peak smoothing with five points across peaks and fourth polynomial order. Minimal S/N for peak detection was 15, the local noise window width was set to 250 m/z and the minimal peak width at FWHM was 1.5 bins. The default setting was used for calibration.
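Two of the precursor selection rules above, the 200 ppm fraction-to-fraction mass tolerance and the acquisition of the weakest precursors first, can be illustrated with a minimal Python sketch. It is not the vendor's dynamic exit algorithm: the function names and the example values are hypothetical, and applying the limit of 35 precursors to one list of candidates is an assumption made for illustration.

```python
def ppm_error(observed_mz: float, reference_mz: float) -> float:
    """Signed mass deviation of observed_mz from reference_mz in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def same_precursor(mz_a: float, mz_b: float, tol_ppm: float = 200.0) -> bool:
    """True if two masses from adjacent fractions agree within tol_ppm."""
    return abs(ppm_error(mz_a, mz_b)) <= tol_ppm


def acquisition_order(precursors, max_precursors=35):
    """Keep the most intense precursors, then return them weakest first.

    precursors: list of (mz, ms1_intensity) tuples (hypothetical data).
    """
    strongest = sorted(precursors, key=lambda p: p[1], reverse=True)
    kept = strongest[:max_precursors]
    # Weak precursors are measured first, while sample is still plentiful.
    return sorted(kept, key=lambda p: p[1])


if __name__ == "__main__":
    print(same_precursor(1500.0, 1500.2))  # deviation of about 133 ppm
    print(acquisition_order([(1000.0, 50), (1200.0, 10), (1500.0, 30)], 2))
```

Measuring weak precursors first is a common rationale on MALDI targets, where each laser shot consumes material from the spot.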

ESI Analysis

Peptides from each gel fraction were solubilized in 3 % ACN, 0.1 % formic acid in water. For reverse phase separation of the peptides an Ekspert™ nano LC 400 system (Eksigent) was coupled to a TripleTOF® 5600+ system equipped with a NanoSpray® III ion source (both SCIEX, Analyst 1.6 TF software). A PepMap column (100 µm i.d. × 2 cm, Dionex) with 5 µm particle size was installed for trapping and an Acclaim PepMap RSLC C18 column (75 µm i.d. × 250 mm, 2 µm particle size, Thermo) was used for chromatographic separation. Peptides (~1.9 µg per LC run) were separated with a 120 min gradient (5–45 % ACN with 0.1 % FA) at a flow rate of 300 nl/min.

All spectra were acquired in positive ion mode. Mass ranges for MS and MS/MS were 400–1250 m/z and 200–1600 m/z, respectively. The accumulation time limit for injection of the ions into the TOF analyzer was set to 150.1 ms for MS scans and 65 ms for MS/MS mode. With a period cycle time of 5078 ms a total number of 8784 cycles was reached. The pulser frequency was 14.946 kHz. The 75 most intense ions of the MS scan meeting the IDA (information dependent acquisition) criteria stated below were selected for collision induced fragmentation with dynamic collision energy. MS/MS scans were triggered based on MS scans if the prospective precursor had a mass between 400 and 1250 amu, a charge state between +2 and +5 and an intensity >150 counts per second. Following selection for MS/MS, masses were excluded for twelve seconds (1/3 of chromatographic peak width) and peaks within a range of 6 Da were ignored. The mass tolerance was set to 50 mDa.
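The IDA criteria listed above amount to a filter followed by a top-75 selection, which the following Python sketch illustrates. It is a simplified reading of the text, not the acquisition software's logic: the function names and candidate values are hypothetical, the 400–1250 window is assumed to apply to m/z, and dynamic exclusion is left out. The sketch also sums the stated accumulation times, which gives about 5025 ms per cycle; the difference to the reported cycle time of 5078 ms is presumably instrument overhead.

```python
PROTON_MASS = 1.007276  # Da, mass of one proton

MS_ACCUMULATION_MS = 150.1   # MS survey scan, from the text
MSMS_ACCUMULATION_MS = 65.0  # each MS/MS scan, from the text
TOP_N = 75                   # precursors per cycle, from the text


def mz_from_mass(neutral_mass: float, charge: int) -> float:
    """m/z of a peptide with the given neutral mass and number of protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def passes_ida(mz: float, charge: int, intensity_cps: float) -> bool:
    """Apply the stated m/z, charge state and intensity criteria."""
    return 400.0 <= mz <= 1250.0 and 2 <= charge <= 5 and intensity_cps > 150.0


def select_precursors(candidates, top_n=TOP_N):
    """candidates: list of (mz, charge, intensity_cps) tuples (hypothetical)."""
    eligible = [c for c in candidates if passes_ida(*c)]
    return sorted(eligible, key=lambda c: c[2], reverse=True)[:top_n]


def accumulation_time_ms(top_n=TOP_N) -> float:
    """Sum of accumulation times for one survey scan plus top_n MS/MS scans."""
    return MS_ACCUMULATION_MS + top_n * MSMS_ACCUMULATION_MS


if __name__ == "__main__":
    # A 1000 Da peptide is inside the window at 2+ but below 400 m/z at 3+.
    for z in (2, 3):
        mz = mz_from_mass(1000.0, z)
        print(z, round(mz, 3), passes_ida(mz, z, 500.0))
    print(accumulation_time_ms())
```

The small loop at the end shows why a lower m/z limit interacts with charge state: the same peptide can be selectable at one charge state and invisible at the next.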

Database Search and Data Analysis

Resulting MS/MS data from both instruments were searched against a human database with 70101 protein entries from Swiss-Prot and trEMBL using the Paragon algorithm (ProteinPilot, 2012, version 4.5, SCIEX). The respective search methods (thorough search mode, 95 % confidence) specified gel-based tryptic digestion with carbamidomethylation and an ID focus on biological modifications and amino acid substitutions, hence varying only in the selected mass spectrometer (TripleTOF 5600 vs MALDI 5800). Proteotypicity of peptides was analyzed with the PepSir software developed in-house. PepSir is available online.9

Unless otherwise stated, molecular weights of peptides were computed using monoisotopic masses. The aliphatic index was calculated according to Atsushi.10 The isoelectric point (pI) of each peptide was estimated using the Expasy Compute pI/MW tool.11 The net charge of each peptide was computed for pH 1 as described by Moore,12 but without the factor for cysteine. The pKa values of the different amino acids are those reported by Nelson and Cox.13 The grand average of hydropathy (GRAVY) was assessed as described by Kyte and Doolittle.14 Sequence logos were generated according to Crooks et al.,15 using the WebLogo interface version 2.8.2.
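A net charge calculation of the kind described above can be sketched as a Henderson-Hasselbalch sum over the ionizable groups, with cysteine omitted. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the side-chain pKa values are the commonly quoted textbook figures, the terminal pKa values are stand-ins borrowed from free alanine, and the function name is hypothetical.

```python
# Side-chain pKa values (commonly quoted textbook figures; treat as assumptions).
PKA_POSITIVE = {"K": 10.53, "R": 12.48, "H": 6.00}
PKA_NEGATIVE = {"D": 3.65, "E": 4.25, "Y": 10.07}  # cysteine deliberately omitted

# Terminal groups: stand-in values taken from free alanine (assumption).
PKA_N_TERM = 9.69
PKA_C_TERM = 2.34


def net_charge(sequence: str, ph: float = 1.0) -> float:
    """Approximate net charge of a peptide at the given pH."""
    positive = [PKA_N_TERM] + [PKA_POSITIVE[a] for a in sequence if a in PKA_POSITIVE]
    negative = [PKA_C_TERM] + [PKA_NEGATIVE[a] for a in sequence if a in PKA_NEGATIVE]
    # Fraction of each basic group that is protonated (carries +1).
    charge = sum(1.0 / (1.0 + 10 ** (ph - pka)) for pka in positive)
    # Fraction of each acidic group that is deprotonated (carries -1).
    charge -= sum(1.0 / (1.0 + 10 ** (pka - ph)) for pka in negative)
    return charge


if __name__ == "__main__":
    # Peptide shown in figure 5 c; one basic residue plus the N-terminus.
    print(round(net_charge("AVEYLLMGIPGDR"), 2))
```

At pH 1 essentially every basic group is protonated and every acidic group is neutral, so the result is close to the number of basic residues plus one for the N-terminus. This is why the measure tracks the count of basic amino acids so closely.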

RESULTS & DISCUSSION

Setup of the GeLC-MS Workflow

The proteomes of three different human PDAC cell lines and a cell line derived from normal human pancreatic tissue were fractionated in a GeLC-based bottom-up approach. One technical replicate of each fraction was analyzed with a MALDI TOF/TOF™ 5800 instrument and another one with a TripleTOF® 5600+ system (figure 1 a). The ion source of the former instrument is a MALDI 1 kHz OptiBeam™ on-axis laser, while the TripleTOF® 5600+ system features a NanoSpray® III ion source. All necessary settings including amount of material, LC gradient and IDA criteria for mass spectrometric analysis were optimized independently for both instruments to obtain maximum identification numbers with each method. The proteomic analysis of the four cell lines in 52 GeLC-MS runs (4 cell lines × 13 gel fractions) required ~390 hours instrument time for MALDI-based analysis (~0.5 h per MS run and ~7 h per MS/MS run) compared to ~104 hours for the ESI setup.

By combination of the search results from both instruments, a total of 72974 peptides were identified with a confidence > 95 % (figure 1 b). The proteotypic peptides were assigned to 6517 different proteins. 5579 of these proteins were identified with at least two proteotypic peptides, whereas 938 were one-hit-wonders, i.e., proteins with a single identified proteotypic peptide. On peptide level, the overlap between the two ionization setups was as low as 39 %, thereby emphasizing the complementary character of the two ionization techniques. 18.6 % of the peptides were identified with MALDI/MS only, while 42.4 % were found solely with ESI/MS. On protein level, the differences were less pronounced; however, 19.9 % of the proteins were unique for ESI/MS and 6.6 % for MALDI/MS. In accordance with the observed high numbers of unique peptides, combination of the two different ionization methods increased the average sequence coverage from 21.7 % with ESI alone to 25 % (figure 1 c). A high sequence coverage is particularly important if different isoforms or otherwise highly conserved protein species are to be distinguished.
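The instrument times and peptide shares quoted in this section follow from simple arithmetic, reproduced in the short Python sketch below. The peptide counts are taken from the Venn diagram of figure 1 b; the variable names are chosen for illustration only.

```python
# Number of GeLC-MS runs: 4 cell lines x 13 gel fractions.
runs = 4 * 13

# MALDI: about 0.5 h per MS run plus about 7 h per MS/MS run.
maldi_hours = runs * (0.5 + 7.0)

# ESI: about 104 h in total, i.e. the time per run is
esi_hours_per_run = 104 / runs

# Peptide counts per category as given in figure 1 b.
peptides = {"MALDI only": 13579, "MALDI and ESI": 28473, "ESI only": 30922}
total_peptides = sum(peptides.values())
shares = {k: round(100 * v / total_peptides, 1) for k, v in peptides.items()}

if __name__ == "__main__":
    print(runs, maldi_hours, esi_hours_per_run)
    print(total_peptides, shares)
```

The three categories add up to the 72974 peptides reported in the text, and the MALDI workflow needs roughly 3.75 times the instrument time of the ESI workflow for the same set of samples.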

Rationale and Quality Control of the Gel-Based Prefractionation Process

A uniform distribution with a constant, moderate degree of complexity, i.e., similar numbers of different peptides per LC run, typically increases the number of overall identifications. While this principle is generally acknowledged for optimization of LC gradients, it is frequently neglected for the preceding fractionation steps. The frequency of the log-transformed protein weights in the human proteome follows a normal distribution centered at approximately 4.42, corresponding to a molecular weight of 26085.55 Da (figure 2 a). Because the logarithmized molecular weight of a protein is roughly inversely proportional to its traveled distance during electrophoretic separation,16 the center of an SDS-PAGE gel contains a higher number of different proteins than its top or bottom area. The complex pooling pattern depicted in figure 1 a, in combination with a gradient gel, was designed to compensate for this fact and yielded a more homogeneous complexity distribution, i.e., comparable identification numbers across the different analysis runs (figure 2 b). For identification numbers per ion source see Supporting Information (figure S-1).

To monitor quality and technical reproducibility of the gel slicing process, the frequency of detected protein masses per gel fraction was visualized in a scatter plot (figure 2 c). As expected for the applied pooling pattern, the molecular weights increase in two clusters with ascending gel fraction. Direct comparison of the scatter dot plots suggests a high technical reproducibility of the process (Supporting Information figure S-2). To further investigate the prefractionation process, the resolving power of the SDS-PAGE system was evaluated (figure 2 d). Focussing of each protein is important to reach optimal peptide signal intensities in both MS and MS/MS analysis. Applying the described SDS-PAGE-based prefractionation strategy, approximately 85 % of peptides were found in at most two gel fractions. Minor differences between MALDI/MS and ESI/MS resulted from the different sensitivities of the instruments. Independent of the ionization technique, small proteins and protein products from low abundant genes were underrepresented in the dataset (Supporting Information figures S-4/S-5).
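The pooling pattern can be made concrete with a short Python sketch. The caption of figure 1 states that each lane was cut into 52 slices and that slices n, n + 1, n + 26 and n + 27 were pooled; the sketch assumes that n takes the odd values 1, 3, ..., 25 so that the 13 fractions cover all slices exactly once. That indexing and the function name are assumptions, not details given in the text.

```python
import math


def pooled_slices(fraction: int, n_fractions: int = 13) -> list:
    """Gel slices pooled into one fraction (fraction numbered 1..n_fractions).

    Slice 1 is taken to be the one closest to the dye front (assumption).
    """
    n = 2 * fraction - 1       # n = 1, 3, ..., 25
    offset = 2 * n_fractions   # 26 slices further up the lane
    return [n, n + 1, n + offset, n + offset + 1]


if __name__ == "__main__":
    for fraction in (1, 2, 13):
        print(fraction, pooled_slices(fraction))
    # Center of the log-transformed protein mass distribution quoted in the text.
    print(round(math.log10(26085.55), 2))
```

Each fraction combines two neighbouring slices from the lower half of the lane with two from the upper half, which is consistent with the two clusters of molecular weights seen per fraction in figure 2 c.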

Amino Acid Composition of Detected Peptides

Based on all proteotypic peptides identified per workflow, the detection frequency of each amino acid was computed relative to its frequency in the search database (figure 3 a). Major differences can be observed for a variety of amino acids, including all basic amino acids. Independent of the ion source, the experimental identification rates of cysteine, methionine and tryptophan were found to be at least 30 % lower than expected considering their representation in the search database. The mass spectrometric challenges arising from the cysteine residues' ability to form disulfide bonds are well described, but can be mitigated by deliberate selection of an adequate alkylating agent.17 The impaired recognition of methionine is most likely linked to both side reactions emerging during the carbamidomethylation of cysteine residues and partial oxidation of the thioether side chain. Finally, the low identification levels of tryptophan might be caused by artifacts from the prefractionation process. According to Perdivara et al., oxidation and dioxidation of tryptophan can occur during the GeLC process, but are less frequently recorded with shotgun preparations.18

In accordance with the findings of Seymour et al. and others,6a,7c,19 a profound difference between MALDI/MS and ESI/MS is observed for the peptides’ C-termini (figure 3 b). The serine protease trypsin selectively cleaves the peptide bond at the carboxyl side of the amino acids arginine and lysine and thus enables a direct comparison of the ion source-specific preferences with respect to the C-terminal amino acid. While >80 % of the peptides identified only with MALDI/MS contain a C-terminal arginine, the majority of peptides detected solely with ESI/MS features a lysine at this position. Although arginine and lysine are characterized by the highest gas-phase basicities of the 20 naturally occurring amino acids (1007 kJ mol⁻¹ and 952 kJ mol⁻¹, respectively),20 these values differ by 55 kJ mol⁻¹. Originating in the fundamentals of the matrix-assisted laser desorption/ionization process,21 gas-phase basicity is the most important analyte-dependent physical parameter governing ion yields in MALDI, and hence explains the positive bias for arginine with this technique.19
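Because trypsin cleaves after arginine and lysine, the C-terminal comparison reduces to counting the last residue of each identified peptide. The Python sketch below illustrates this with a deliberately simplified in silico digest: it ignores missed cleavages and the commonly applied exception for proline after the cleavage site, and the example sequences and function names are hypothetical.

```python
from collections import Counter


def tryptic_digest(protein: str) -> list:
    """Cleave after every K or R (simplified: no missed cleavages, no proline rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        if residue in "KR":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder of the protein
    return peptides


def c_terminal_shares(peptides) -> dict:
    """Fraction of peptides ending in each amino acid."""
    counts = Counter(p[-1] for p in peptides if p)
    total = sum(counts.values())
    return {aa: count / total for aa, count in counts.items()}


if __name__ == "__main__":
    print(tryptic_digest("AAKGGRTT"))
    # Hypothetical peptide set with an arginine bias, as seen for MALDI/MS.
    print(c_terminal_shares(["AAK", "GGR", "LLR", "VVR"]))
```

Applied to the MALDI-only and ESI-only peptide sets, such a count would yield the arginine and lysine shares discussed above.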

Hydrophobicity of Detected Peptides

While MALDI is largely controlled by the gas-phase basicity,22 ion yields in ESI are substantially influenced by the hydrophobicity of a molecule.23 Hydrophobic analytes can accumulate at the surface of a liquid/droplet in the Taylor cone emission process, and thus have a higher chance for mass spectrometric detection. To analyze if this dependence is mirrored in the obtained data, two different measures of the hydrophobicity were calculated for each peptide (figure 4 a, b). The aliphatic index describes the relative volume of a protein or peptide occupied by aliphatic side chains,10 while the grand average of hydropathy (GRAVY) is calculated based on the hydropathy values of all amino acids in a sequence.14 As expected based on the theoretical considerations, the frequency distributions of both aliphatic index and grand average of hydropathy were shifted towards higher values for peptides detected by ESI/MS.
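Both hydrophobicity measures are simple functions of the amino acid composition, as the following Python sketch shows. It uses the Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula, in which the mole percentages of alanine, valine, isoleucine and leucine are weighted with 1, 2.9, 3.9 and 3.9. The function names are illustrative, and whether the authors' scripts handled modified or non-standard residues differently is not known.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean hydropathy value over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    n = len(sequence)

    def mole_percent(aa: str) -> float:
        return 100.0 * sequence.count(aa) / n

    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


if __name__ == "__main__":
    peptide = "AVEYLLMGIPGDR"  # peptide shown in figure 5 c
    print(round(gravy(peptide), 3), round(aliphatic_index(peptide), 1))
```

Both functions return higher values for peptides rich in aliphatic residues, which is the direction of the shift reported for ESI/MS.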

Charge State Distribution of Peptides Identified by ESI/MS

The histograms of two additional physico-chemical properties, isoelectric point and net charge, were altered as a function of the ion source (figure 4 c, d). For both properties the frequency distribution of the peptides detected with MALDI/MS, but not with ESI/MS, was shifted towards higher values. As both measures are positively correlated with the number of basic amino acids, the dependence between the presence of these residues and the observed charge state in ESI was further investigated. Unlike in MALDI, where singly charged ions account for the vast majority of peptide ion species, doubly, triply or higher charged ions are typically generated by ESI.24 The charge state of a tryptic peptide ionized by electrospray increases with an additional basic amino acid (figure 4 e). Interestingly, this effect is independent of the type of basic residue, i.e. arginine, lysine and histidine similarly promote the acquisition of an additional charge. With higher charge state, however, peptides with a mass below a certain threshold cannot be detected (e.g. [...]

[Figure 1 b, Venn diagram. Proteins with > 1 proteotypic peptide (6517 with ≥ 1 proteotypic peptide): MALDI only 356 [6.4 %], MALDI and ESI 4114 [73.7 %], ESI only 1109 [19.9 %]. Peptides (72974, > 95 % confidence): MALDI only 13579 [18.6 %], MALDI and ESI 28473 [39 %], ESI only 30922 [42.4 %].]

Figure 1. a GeLC workflow for comparison of MALDI- and ESI-based mass spectrometry. Proteomic complexity was reduced by SDS-PAGE and each lane was cut into 52 gel slices for fractionation according to a specific pattern. For each cell line, gel slices n, n + 1, n + 26 and n + 27 were pooled with n being located proximal to the bromophenol blue dye front. Resulting MS/MS data from both instruments were searched using the Paragon algorithm with a human database. b Venn diagram with accumulated numbers of proteins (without one-hit-wonders) and peptides, identified in four human cell lines of pancreatic origin using GeLC-based MALDI/MS and ESI/MS analysis. For identification rates per cell line see Supporting Information (figures S-3/S-7). c Moving average of the sequence coverage after ranking identified proteins based on the sequence coverage reached by combination of the ionization methods. The moving average was calculated with a subset of 50 proteins and every fifth value is depicted. The average total sequence coverage is represented as a dotted line.


Figure 2. a Histogram of the logarithmized protein masses of the human search database (SwissProt, 70101 entries). Decadic logarithms of the masses were binned into 13 fractions. b Identified proteins per LC-run and pooled gel fraction regardless of the ion source (mean with SEM, n = 8). c Scatter dot plot showing the protein masses detected per gel fraction. Increase of the molecular weights in two clusters with ascending fraction reflects the pooling strategy as shown in the workflow. d Resolving power of the SDS-PAGE. Approximately 85 % of peptides are found in two or fewer fractions.

[Figure 3 b: sequence logos in three panels (ESI Only, ESI & MALDI, MALDI Only), each showing positions 1 to 5 from the N-terminus, an intermediate position X, and positions 5 to 1 towards the C-terminus.]

Figure 3. a Amino acid composition of peptides identified from cell lines with ESI- or MALDIbased GeLC/MS (mean with SD, n = 4). Duplicates were removed and only the main form was included for modified peptides. The percentage of each amino acid in a set is normalized to its natural occurrence (SwissProt, human w/o isoforms, tryptic in silico digest). Statistical significance was determined using repeated measures two-way analysis of variance and p-values were adjusted to account for multiple comparisons using the Sidak method (p = 0.05). Asterisks indicate statistical significance (p < 0.005). b Sequence logos for the peptide sets (≥ 10 residues) detected with MALDI/MS and ESI/MS instruments. The height of each stacked letter is proportional to the frequency of the corresponding amino acid for the respective sequence position. Positively and negatively charged amino acids are depicted in blue and red respectively. C-terminal position 1 causes the global K and R bias.


Figure 4. a, b Frequency distributions of aliphatic index and GRAVY of peptides identified by MALDI/MS and ESI/MS. c, d Frequency distributions of isoelectric point and net charge (calculated at pH 1) of peptides identified with the different techniques. e Histogram of the ESI charge state distribution of tryptic peptides containing additional polar basic amino acids. f Frequency of peptides with differing numbers of basic polar amino acids. g Frequency of the log-transformed peptide masses. h Occurrence of histidine in the different data sets relative to the frequency of the amino acid in the search database (mean with SD, n = 4). Two-tailed ratio paired t-test reported a significant difference for the two methods.


Figure 5. Influence of the ion source on modification detection. Note that samples were prepared without phosphatase inhibitor. For an overview on differential detection of phosphorylation, see ref 30. a The percentage of peptide modifications was normalized to the total number of peptide species identified with MALDI/MS or ESI/MS (mean with SD, n = 4). Significance was calculated using multiple t-tests, and p-values were adjusted using the Holm-Sidak method (α = 0.05). Asterisks indicate statistical significance (p < 0.05). b Scatter dot plot showing the ratio of the precursor intensities of peptides containing oxidized and non-oxidized methionine. Median with interquartile range was calculated based on precursor intensities of 92 peptide sets, each set consisting of MALDI/MS and ESI/MS spectra of the oxidized and non-oxidized form of a methionine-containing peptide. A two-tailed ratio paired t-test reported a p-value < 0.0001. c Exemplary MALDI- (upper panel) and ESI-based (lower panel) MS/MS spectra of peptide AVEYLLMGIPGDR with (left) and without (right) oxidation of the methionine. d Loss of methanesulfenic acid is the dominant fragmentation process for oxidized methionine under conditions of low proton mobility (ref 29).
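A ratio paired t-test, as named in the caption of Figure 5b, is commonly computed as an ordinary paired t-test on log-transformed values. The Python sketch below follows that assumption: for each peptide it compares the oxidized/non-oxidized intensity ratio from MALDI/MS with the one from ESI/MS. The function name and the example numbers are hypothetical, and the authors' software and settings are not specified here. Converting the t statistic into a p-value additionally requires the CDF of Student's t distribution from a statistics library.

```python
import math
from statistics import mean, stdev

def ratio_paired_t(maldi_ratios, esi_ratios):
    """Paired t statistic on log10 ratios (assumed form of a ratio paired t-test).

    Each input holds one oxidized/non-oxidized precursor intensity ratio per
    peptide, in matching order. Returns (t statistic, degrees of freedom,
    geometric mean of the MALDI/ESI ratio).
    """
    if len(maldi_ratios) != len(esi_ratios) or len(maldi_ratios) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    diffs = [math.log10(m) - math.log10(e)
             for m, e in zip(maldi_ratios, esi_ratios)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1, 10 ** mean(diffs)

if __name__ == "__main__":
    # Hypothetical ratios for three peptides, for illustration only.
    maldi = [0.1, 0.2, 0.4]
    esi = [1.0, 1.0, 1.0]
    print(ratio_paired_t(maldi, esi))
```

Working on the log scale makes the test symmetric with respect to fold changes, which suits intensity ratios that span orders of magnitude.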


Table 1. Impact of the ion source on peptide detection

Property | MALDI/MS | ESI/MS
Uniqueness, Peptides | 18.6 % | 42.2 %
Uniqueness, Proteins | 6.6 % | 19.9 %
Average Protein Sequence Coverage | 14.8 % | 21.7 %

Amino Acid Composition and Modification Detection
Overall Bias: C, M, W Disfavored
C-Terminus: R Highly Preferred; K Preferred
Polar Basic Amino Acids: Impact on Charge State; H Disfavored; Low Mass Peptides Disfavored
Modification Detection: Unmodified Species Preferred; Oxidized M Disfavored

Biophysical Properties (Averages), Unique / All Peptides | MALDI/MS | ESI/MS
Mass | 1265.5 Da (11.52 residues) / 1360.4 Da (12.52 residues) | 1412.9 Da (13.65 residues) / 1410.5 Da (13.34 residues)
GRAVY | -0.4 / -0.6 | -0.2 / -0.2
pI | 6.8 / 6.1 | 5.5 / 5.6
Net Charge | 2.8 / 2.5 | 2.3 / 2.4

AUTHOR INFORMATION

Corresponding Author
*[email protected], telephone 06221423901, fax 06221423902

Author Contributions
All authors listed on the manuscript have significantly contributed to the presented work and have read and agreed to the final manuscript.

Funding Sources
The project was funded by the German Cancer Research Center and the Dietmar Hopp Stiftung.

ACKNOWLEDGMENT

We thank Bernd Müller and Christian Baumann for useful discussions and kind support during mass spectrometric analyses at the SCIEX applications lab Darmstadt. Access to the PACO model system was kindly granted by Christian Eisen, Elisa Noll and Martin Sprick. We gratefully acknowledge financial support from the German Cancer Research Center and the Dietmar Hopp Stiftung.

ABBREVIATIONS

ACN, acetonitrile; ANOVA, analysis of variance; BCA, bicinchoninic acid; CHCA, α-cyano-4-hydroxycinnamic acid; ESI, electrospray ionization; FWHM, full width half maximum; GeLC, one-dimensional SDS-PAGE followed by LC; IDA, information dependent acquisition; LC, liquid chromatography; MALDI, matrix-assisted laser desorption/ionization; PDAC, pancreatic ductal adenocarcinoma; ppm, parts per million; RM, repeated measures; RT, room temperature; SD, standard deviation; SDS-PAGE, sodium dodecyl sulfate polyacrylamide gel electrophoresis; SEM, standard error of the mean; TCEP, tris(2-carboxyethyl)phosphine; TDB, trypsin digestion buffer; TFA, trifluoroacetic acid; UHPLC, ultra high pressure liquid chromatography.

REFERENCES

1. (a) Karas, M.; Bachmann, D.; Hillenkamp, F., Influence of the Wavelength in High-Irradiance Ultraviolet-Laser Desorption Mass-Spectrometry of Organic-Molecules. Anal. Chem. 1985, 57 (14), 2935-2939; (b) Karas, M.; Bachmann, D.; Bahr, U.; Hillenkamp, F., Matrix-Assisted Ultraviolet-Laser Desorption of Nonvolatile Compounds. Int. J. Mass Spectrom. Ion Processes 1987, 78, 53-68.
2. (a) Fenn, J. B.; Mann, M.; Meng, C. K.; Wong, S. F.; Whitehouse, C. M., Electrospray Ionization for Mass-Spectrometry of Large Biomolecules. Science 1989, 246 (4926), 64-71; (b) Aleksandrov, M. L.; Baram, G. I.; Gall, L. N.; Grachev, M. A.; Knorre, V. D.; Krasnov, N. V.; Kusner, Y. S.; Mirgorodskaya, O. A.; Nikolaev, V. I.; Shkurov, V. A., Application of a novel mass spectrometric method to sequencing of peptides. Bioorg. Khim. 1985, 11 (5), 705-708.
3. Dreisewerd, K., The desorption process in MALDI. Chem. Rev. 2003, 103 (2), 395-426.
4. Wilm, M. S.; Mann, M., Electrospray and Taylor-Cone Theory, Dole's Beam of Macromolecules at Last? Int. J. Mass Spectrom. Ion Processes 1994, 136 (2-3), 167-180.
5. Walch, A.; Rauser, S.; Deininger, S. O.; Hofler, H., MALDI imaging mass spectrometry for direct tissue analysis: a new frontier for molecular histology. Histochem. Cell Biol. 2008, 130 (3), 421-434.
6. (a) Seymour, S.; Booy, A.; Gundry, R.; Van Eyk, J.; Hunter, C., Assessing the Complementarities of MALDI and ESI for Protein Identification in Complex Mixtures. Technical Note, AB SCIEX 2010, 1-5; (b) McLean, J. A.; Russell, W. K.; Russell, D. H., A high repetition rate (1 kHz) microcrystal laser for high throughput atmospheric pressure MALDI-quadrupole-time-of-flight mass spectrometry. Anal. Chem. 2003, 75 (3), 648-654.
7. (a) Bodnar, W. M.; Blackburn, R. K.; Krise, J. M.; Moseley, M. A., Exploiting the complementary nature of LC/MALDI/MS/MS and LC/ESI/MS/MS for increased proteome coverage. J. Am. Soc. Mass Spectrom. 2003, 14 (9), 971-979; (b) Zhang, J.; Gao, M. X.; Tang, J.; Yang, P. Y.; Liu, Y. K.; Zhang, X. M., Improvements in protein identification confidence and proteome coverage for human liver proteome study by coupling a parallel mass spectrometry/mass spectrometry analysis with multi-dimensional chromatography separation. Anal. Chim. Acta 2006, 566 (2), 147-156; (c) Hessling, B.; Büttner, K.; Hecker, M.; Becher, D., Global Relative Quantification with Liquid Chromatography-Matrix-assisted Laser Desorption Ionization Time-of-flight (LC-MALDI-TOF): Cross-validation with LTQ-Orbitrap Proves Reliability and Reveals Complementary Ionization Preferences. Mol. Cell. Proteomics 2013, 12 (10), 2911-2920; (d) Person, M. D.; Lo, H. H.; Towndrow, K. M.; Jia, Z.; Monks, T. J.; Lau, S. S., Comparative identification of prostanoid inducible proteins by LC-ESI-MS/MS and MALDI-TOF mass spectrometry. Chem. Res. Toxicol. 2003, 16 (6), 757-767.
8. Noll, E. M.; Eisen, C.; Stenzinger, A.; Espinet, E.; Muckenhuber, A.; Klein, C.; Vogel, V.; Klaus, B.; Nadler, W.; Rösli, C.; Lutz, C.; Kulke, M.; Engelhardt, J.; Zickgraf, F. M.; Espinosa, O.; Schlesner, M.; Jiang, X.; Kopp-Schneider, A.; Neuhaus, P.; Bahra, M.; Sinn, B. V.; Eils, R.; Giese, N. A.; Hackert, T.; Strobel, O.; Werner, J.; Büchler, M. W.; Weichert, W.; Trumpp, A.; Sprick, M. R., CYP3A5 mediates basal and acquired therapy resistance in different subtypes of pancreatic ductal adenocarcinoma. Nat. Med. 2016.
9. Kerner, A. https://sourceforge.net/projects/pepsir/.
10. Ikai, A., Thermostability and aliphatic index of globular proteins. J. Biochem. 1980, 88 (6), 1895-1898.
11. (a) Gasteiger, E.; Hoogland, C.; Gattiker, A.; Duvaud, S. e.; Wilkins, M. R.; Appel, R. D.; Bairoch, A., Protein identification and analysis tools on the ExPASy server. Springer: 2005; (b) Bjellqvist, B.; Hughes, G. J.; Pasquali, C.; Paquet, N.; Ravier, F.; Sanchez, J. C.; Frutiger, S.; Hochstrasser, D., The focusing positions of polypeptides in immobilized pH gradients can be predicted from their amino acid sequences. Electrophoresis 1993, 14 (1), 1023-1031.
12. Moore, D. S., Amino acid and peptide net charges: a simple calculational procedure. Biochem. Educ. 1985, 13 (1), 10-11.
13. Nelson, D. L.; Cox, M. M., Amino acids, peptides, and proteins. In Lehninger Principles of Biochemistry, 6th ed.; Nelson, D. L.; Cox, M. M., Eds. Macmillan: 2013; pp 75-115.
14. Kyte, J.; Doolittle, R. F., A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 1982, 157 (1), 105-132.
15. Crooks, G. E.; Hon, G.; Chandonia, J.-M.; Brenner, S. E., WebLogo: a sequence logo generator. Genome Res. 2004, 14 (6), 1188-1190.
16. Helms, V., Protein-protein interaction networks - pairwise connectivity. In Principles of computational cell biology, Helms, V., Ed. John Wiley & Sons: 2008; pp 39-66.
17. Nadler, W.; Berg, R.; Walch, P.; Hanke, S.; Baalmann, M.; Kerner, A.; Trumpp, A.; Roesli, C., Ion source-dependent performance of 4-vinylpyridine, iodoacetamide, and N-maleoyl derivatives for the detection of cysteine-containing peptides in complex proteomics. Anal. Bioanal. Chem. 2015, 2055-2067.
18. Perdivara, I.; Deterding, L. J.; Przybylski, M.; Tomer, K. B., Mass spectrometric identification of oxidative modifications of tryptophan residues in proteins: chemical artifact or post-translational modification? J. Am. Soc. Mass Spectrom. 2010, 21 (7), 1114-1117.
19. Krause, E.; Wenschuh, H.; Jungblut, P. R., The dominance of arginine-containing peptides in MALDI-derived tryptic mass fingerprints of proteins. Anal. Chem. 1999, 71 (19), 4160-4165.
20. Bouchoux, G., Gas phase basicities of polyfunctional molecules. Part 3: Amino acids. Mass Spectrom. Rev. 2012, 31 (3), 391-435.
21. (a) Knochenmuss, R., Ion formation mechanisms in UV-MALDI. Analyst 2006, 131 (9), 966-986; (b) Karas, M.; Glückmann, M.; Schäfer, J., Ionization in matrix-assisted laser desorption/ionization: singly charged molecular ions are the lucky survivors. J. Mass Spectrom. 2000, 35 (1), 1-12.
22. Nishikaze, T.; Takayama, M., Cooperative effect of factors governing molecular ion yields in desorption/ionization mass spectrometry. Rapid Commun. Mass Spectrom. 2006, 20 (3), 376-382.
23. Wilm, M., Principles of electrospray ionization. Mol. Cell. Proteomics 2011, 10 (7), M111.009407.


24. Li, Y.; Cole, R. B., Charge State Distributions in Electrospray and MALDI. In Electrospray and MALDI Mass Spectrometry, 2nd ed.; Cole, R. B., Ed. John Wiley & Sons, Inc.: 2010; pp 491-534.
25. Levine, R. L.; Moskovitz, J.; Stadtman, E. R., Oxidation of methionine in proteins: roles in antioxidant defense and cellular regulation. IUBMB Life 2000, 50 (4-5), 301-307.
26. Konigsberg, W. H.; Steinman, H. M., Strategy and Methods of Sequence Analysis. In The Proteins, 3rd ed.; Neurath, H.; Hill, R. L., Eds. Academic Press: 1977; pp 1-178.
27. Mashima, R.; Nakanishi-Ueda, T.; Yamamoto, Y., Simultaneous determination of methionine sulfoxide and methionine in blood plasma using gas chromatography-mass spectrometry. Anal. Biochem. 2003, 313 (1), 28-33.
28. Lioe, H.; Richard, A.; Gronert, S.; Austin, A.; Reid, G. E., Experimental and theoretical proton affinities of methionine, methionine sulfoxide and their N- and C-terminal derivatives. Int. J. Mass Spectrom. 2007, 267 (1), 220-232.
29. Reid, G. E.; Roberts, K. D.; Kapp, E. A.; Simpson, R. J., Statistical and mechanistic approaches to understanding the gas-phase fragmentation behavior of methionine sulfoxide containing peptides. J. Proteome Res. 2004, 3 (4), 751-759.
30. Ruprecht, B.; Roesli, C.; Lemeer, S.; Kuster, B., MALDI-TOF and nESI Orbitrap MS/MS identify orthogonal parts of the phosphoproteome. Proteomics 2016, 16 (10), 1447-1456.


For TOC Only

The selection of the ion source for mass spectrometry-based proteomics affects the frequency distributions of various physicochemical properties of detected peptides, including hydrophobicity and charge-related parameters.
