Technical Note pubs.acs.org/jpr

Estimating Influence of Cofragmentation on Peptide Quantification and Identification in iTRAQ Experiments by Simulating Multiplexed Spectra

Honglan Li,† Kyu-Baek Hwang,† Dong-Gi Mun,‡ Hokeun Kim,‡ Hangyeore Lee,‡ Sang-Won Lee,‡ and Eunok Paek*,§

† School of Computer Science and Engineering, Soongsil University, Seoul 156-743, Republic of Korea
‡ Department of Chemistry, Research Institute for Natural Sciences, Korea University, Seoul 136-701, Republic of Korea
§ Department of Computer Science and Engineering, Hanyang University, Seoul 133-791, Republic of Korea

* Supporting Information

ABSTRACT: Isobaric tag-based quantification such as iTRAQ and TMT is a promising approach to mass spectrometry-based quantification in proteomics as it provides wide proteome coverage with greatly increased experimental throughput. However, it is known to suffer from inaccurate quantification and identification of a target peptide due to cofragmentation of multiple peptides, which likely leads to under-estimation of differentially expressed peptides (DEPs). A simple method of filtering out cofragmented spectra with less than 100% precursor isolation purity (PIP) would decrease the coverage of iTRAQ/TMT experiments. In order to estimate the impact of cofragmentation on quantification and identification of iTRAQ-labeled peptide samples, we generated multiplexed spectra with varying degrees of PIP by mixing the two MS/MS spectra of 100% PIP obtained in global proteome profiling experiments on gastric tumor−normal tissue pair proteomes labeled by 4-plex iTRAQ. Despite cofragmentation, the simulation experiments showed that more than 99% of multiplexed spectra with PIP greater than 80% were correctly identified by three different database search engines: MODa, MS-GF+, and Proteome Discoverer. Using the multiplexed spectra that have been correctly identified, we estimated the effect of cofragmentation on peptide quantification. In 74% of the multiplexed spectra, however, the cancer-to-normal expression ratio was compressed, and a fair number of spectra showed the “ratio inflation” phenomenon. On the basis of the estimated distribution of distortions on quantification, we were able to calculate cutoff values for DEP detection from cofragmented spectra, which were corrected according to a specific PIP and probability of type I (or type II) error. When we applied these corrected cutoff values to real cofragmented spectra with PIP larger than or equal to 70%, we were able to identify reliable DEPs by removing about 25% of DEPs, which are highly likely to be false positives. Our experimental results provide useful insight into the effect of cofragmentation on isobaric tag-based quantification methods. The simulation procedure as well as the corrected cutoff calculation method could be adopted for quantifying the effect of cofragmentation and reducing false positives (or false negatives) in the DEP identification with general quantification experiments based on isobaric labeling techniques.

KEYWORDS: peptide quantification, isobaric tag-based quantification, iTRAQ, differentially expressed peptides, cofragmentation, multiplexed spectra, simulation



INTRODUCTION

Protein quantification by mass spectrometry has become a widely adopted means in many areas of biomedical research due to its capabilities in systematic quantification in complex samples.1−5 Diverse approaches to mass spectrometry-based protein quantification include label-free methods,6−8 stable isotope labeling by/with amino acids in cell culture (SILAC),9 isobaric tags for relative and absolute quantification (iTRAQ),10−12 and tandem mass tag (TMT).13,14 Compared to label-free methods, isobaric tagging methods such as iTRAQ and TMT are less sensitive to experimental bias as peptides are being ionized and fragmented simultaneously in LC−MS/MS experiments. The iTRAQ and TMT methods have been applied to many studies such as biomarker discovery,15−17 clinical diagnosis,18 and general biological research19 as they are applicable to various in vitro proteome samples. Despite these advantages, the isobaric tag-based quantification methods have a common shortcoming in that precursors of different species with similar m/z could be co-isolated for MS/MS. The co-isolation and resulting cofragmentation would disturb accurate quantification of peptides by producing distorted reporter ion peaks. On average, such distortion is likely to cause the “ratio compression” phenomenon20 as the majority of peptides are not differentially expressed under different conditions.

© 2014 American Chemical Society

Received: January 19, 2014
Published: June 11, 2014

dx.doi.org/10.1021/pr500060d | J. Proteome Res. 2014, 13, 3488−3497


Approaches to solving this problem can be categorized into two groups. One is to focus on developing sophisticated experimental techniques to reduce the precursor co-isolation or its effect, such as employing extensive sample fractionation to reduce the degree of sample complexity,20−22 performing MS3 of an MS/MS fragment to avoid mixing of reporter ions,23 and gas-phase purification to mitigate the co-isolation problem.24 Although these methods have been shown to enhance the accuracy, they are either hardly generalizable due to their dependence on specific instruments or they often result in decreased experimental throughput. The other approach is to estimate the extent of precursor co-isolation and use this information to correct distorted quantification results. Vaudel and colleagues25 proposed a method for estimating error rate of iTRAQ experiments using a decoy sample. Savitski and colleagues26 devised an algorithm for correcting quantification results by precursor interference level. Recently, Sandberg and colleagues27 suggested a minimum cutoff level of precursor isolation interference for accurate quantification. In addition, Wuhr and colleagues28 demonstrated that the use of complement TMT fragment ion clusters (instead of reporter ions) could eliminate the effect of peptide interference. These approaches enabled more accurate quantification in isobaric tag-based methods. However, each method still has its own drawbacks such as the need for an additional decoy sample,25 the assumption that the reporter ion pattern of a cofragmented peptide is the same for all spectra,26 and the decrease in coverage of quantified peptides.27,28 In order to estimate the effect of precursor co-isolation in isobaric tag-based quantification without an additional decoy sample, we simulated a variety of cofragmented spectra from an iTRAQ experiment that compared tumor and normal tissue of human gastric cancer. Using these simulated multiplexed spectra, we investigated the effect of cofragmentation on peptide identification. Then, we statistically quantified the effect of precursor interference on peptide quantification. On the basis of the quantified effect, we developed a method for reliable identification of differentially expressed peptides (DEPs) from cofragmented MS/MS spectra. The developed method does not assume that the reporter ion pattern of cofragmented spectra is the same for all spectra. Rather, it circumvents the uncertainty associated with the reporter ion pattern resulting from cofragmentation using statistical inference. The proposed method minimizes the decrease in coverage by detecting reliable DEPs even from spectra largely affected by cofragmentation instead of excluding them from the analysis.

MATERIALS AND METHODS

Sample Preparation

Human gastric cancer tissues (cancer and adjacent normal) were pulverized into powder using a pulverizing apparatus (CP02 CryoPrep, Covaris, U.S.A.) and dissolved in lysis buffer (4% SDS and 0.1 M Tris-HCl, pH 7.6) through a focused-ultrasonicator (S220, Covaris, U.S.A.). The tissue lysate was digested through a modified FASP method as described before.29 A total of 3.2 mg of digested peptides (i.e., 1.6 mg normal and 1.6 mg cancer peptides) were labeled with 4-plex iTRAQ reagent (AB Sciex, Foster City, CA) according to the manufacturer’s instruction. Briefly, two sets of eight 100 μg normal peptide samples were labeled with eight units of 114 and 116 labels and two sets of eight 100 μg cancer peptide samples with eight units of 115 and 117 labels, respectively. Each of the 100 μg peptide samples (a total of 32 samples) was dissolved in 30 μL of dissolution buffer (500 mM TEAB pH 8.5) in one Eppendorf tube (total 32 tubes). Each of the 4-plex iTRAQ reagents was prepared by dissolving each unit of 4-plex iTRAQ in 70 μL of ethanol and transferred to each 100 μg sample tube. After 1 h incubation at room temperature for the labeling reaction, the unreacted reagent was hydrolyzed by adding 300 μL of 0.05% TFA and incubated for a further 30 min. The contents of all labeled samples were pooled into one tube and concentrated to 900 μL for the following fractionation.

The iTRAQ-labeled peptide sample was fractionated into 24 fractions by a basic pH reverse-phase fractionation method as described previously.30 Briefly, the fractionation was performed using an Agilent 1260 Infinity HPLC system (Agilent, Palo Alto, CA) equipped with an analytical column (4.6 mm × 250 mm, Xbridge C18, 5 μm) and a guard column (4.6 mm × 20 mm, Xbridge C18, 5 μm). LC solvents A and B were 10 mM TEAB in water (pH 7.5) and 10 mM TEAB in 90% ACN (pH 7.5), respectively. A 105 min gradient (0% solvent B for 10 min, from 0% to 5% solvent B over 10 min, from 5% to 35% solvent B over 60 min, from 35% to 70% over 15 min, and held at 70% for 10 min) was used for peptide separation. The flow rate was 0.5 mL/min. A fraction collector (G1364C, Agilent, Palo Alto, CA) equipped with a 96-well plate was used to collect 96 fractions of peptide samples by collecting the eluent every 1 min along the separation time into each well of the 96-well plate. The peptide fractions were divided into four sections (i.e., four sections of 24 early, 24 midearly, 24 midlate, and 24 late fractions). The 96 fractions were concatenated into 24 fractions by pooling four fractions, each from the four sections. The sample fractions were vacuum-dried and stored at −80 °C until LC−MS/MS analysis.

LC−MS/MS Experiments

The 24 fraction samples were each separated using a modified nanoACQUITY UPLC (Waters, Milford, MA), which consisted of two reverse-phase analytical columns (75 μm i.d. × 360 μm o.d., 100 cm, 3 μm i.d. and 300 Å pore C18 resin) and two trap columns (150 μm i.d. × 360 μm o.d., 3 cm, 3 μm i.d. and 300 Å pore C18 resin).31 The linear gradient was generated with LC solvents A (0.1% formic acid in water) and B (0.1% formic acid in ACN). For the LC−MS/MS analyses of the 24 fractionated samples, 10 μg of each peptide was injected, and a 180-min gradient at a flow rate of 300 nL/min was used (1% to 40% of solvent B over 160 min, 40% to 80% over 5 min, at 80% over 10 min, and at 1% for 5 min). The eluted peptides were ionized online through an in-house nano electrospray source equipped on a Q-Exactive orbitrap hybrid mass spectrometer (Thermo Scientific, Bremen, Germany). The electrospray was maintained at 2.4 kV. MS precursor scans (400−2000 Th) were acquired with a resolution of 70,000 (at 400 Th) and an automated gain control (AGC) target value of 1.0 × 10^6. Up to the 10 most abundant ions in a precursor scan were fragmented by higher-energy collisional dissociation (HCD) with normalized collision energy (NCE) of 30 and the isolation width of 1.6 Th. Tandem MS scans were acquired at a resolution of 17,500 (at 400 Th) with a fixed low m/z of 100 Th.

LC−MS/MS Data Analysis

All mass spectrometric data were analyzed using PE-MMR before database search (available at http://omics.pnl.gov/software/PEMMR.php) for precursor mass correction and refinement.32 The resultant MS/MS data were searched against a composite database of uniprot-human-reference (released in May 2013; 90,191 entries) and 179 common contaminants using MODa version 1.10 (http://prix.hanyang.ac.kr/download/moda.jsp),33 MS-GF+ v9387 (http://proteomics.ucsd.edu/software-tools/ms-gf/), and Proteome Discoverer version 1.3 (ThermoFisher Scientific, CA, U.S.A.), respectively. These tools were run with the following parameters: number of tryptic termini (NTT) = 1, the maximum number of missed cleavages = 3, fixed modifications at iTRAQ4Plex (N-term), iTRAQ4Plex (K), and carbamidomethyl (C). For MODa, we used 0.1 Da for both fragment and precursor ion tolerance. For MS-GF+, 10 ppm precursor mass tolerance and fragment mass tolerance were allowed. For Proteome Discoverer, 10 ppm precursor mass tolerance and 50 mmu of fragment mass tolerance were applied. For MODa, we used MODa_anal version 1.10 to filter peptide spectrum matches (PSMs) at FDR 1%. For MS-GF+, we used ComputeFDR to calculate the p-value for each spectrum, retaining PSMs with p-value < 0.01. Target decoy FDR ≤ 0.01 was applied for Proteome Discoverer.

Generation of Simulated Multiplexed Spectra

Figure 1 illustrates the overall workflow of generating and analyzing the multiplexed spectra. For every PSM within FDR 1%, the precursor isolation purity (PIP) value was calculated by dividing the precursor peak intensity, which is the summed intensity of the precursor isotope cluster, by the total peak intensity within the precursor isolation window.34 From the database search results, we selected PSMs of 100% PIP with all four iTRAQ reporter ion peaks. Only those PSMs commonly identified by all three peptide search tools were used for simulation. For the construction of a multiplexed spectrum, as shown in Figure 1, two different precursors that fall within an isolation window of ±0.8 Th were randomly selected, and their MS/MS spectra were mixed in a desired proportion, after normalization based on their total ion count (TIC). For mixing, reporter ion peak intensities from the two chosen spectra were binned by 0.005 Da, i.e., two reporter ion peaks in the 0.005 Da window were merged into a single peak. Then, all the peaks (including the four merged reporter ion peaks) were combined. Before merging (for reporter ions) or combining (for the other ions) peaks, each fragment ion peak was scaled appropriately for a given PIP value. When mixing two spectra, the one with the larger TIC value was regarded as being generated from the true peptide and was called “the original spectrum.” The other was designated as “the noise spectrum.” Then, the precursor m/z of a simulated multiplexed spectrum was set as the precursor m/z of the spectrum from the true peptide. To test the impact of precursor intensity on peptide identification and quantification, the PSMs satisfying the above criteria for simulation were divided into three bins according to their precursor intensity (low 15%, intermediate 70%, and high 15%), and simulated spectra were constructed using two arbitrary spectra from the same bin.
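The PIP calculation and the mixing step described above can be illustrated with a minimal Python sketch. It is not the authors' Java implementation. Several details are assumptions made only for illustration: spectra are lists of (m/z, intensity) pairs, the original and noise spectra are weighted by PIP and 1 − PIP after TIC normalization, the reporter region is taken as 113.5 to 117.5 m/z, and a merged reporter peak is placed at the intensity-weighted mean m/z. All function names are hypothetical.

```python
REPORTER_RANGE = (113.5, 117.5)  # assumed m/z span covering the 114-117 reporter ions
REPORTER_BIN = 0.005             # Da; merge window for reporter ion peaks (from the text)


def precursor_isolation_purity(cluster_intensities, window_intensities):
    """PIP: summed precursor isotope cluster intensity over total intensity in the isolation window."""
    total = sum(window_intensities)
    if total <= 0:
        raise ValueError("isolation window contains no signal")
    return sum(cluster_intensities) / total


def normalize_by_tic(peaks):
    """Scale intensities so that they sum to 1 (total ion count normalization)."""
    tic = sum(intensity for _, intensity in peaks)
    if tic <= 0:
        raise ValueError("spectrum has no signal")
    return [(mz, intensity / tic) for mz, intensity in peaks]


def is_reporter(mz):
    return REPORTER_RANGE[0] <= mz <= REPORTER_RANGE[1]


def mix_spectra(original, noise, pip):
    """Mix two spectra so that the original contributes a fraction `pip` of the signal."""
    if not 0.0 < pip < 1.0:
        raise ValueError("pip must lie strictly between 0 and 1")
    # Assumed scaling rule: weight PIP for the original, 1 - PIP for the noise spectrum.
    scaled = [(mz, i * pip) for mz, i in normalize_by_tic(original)]
    scaled += [(mz, i * (1.0 - pip)) for mz, i in normalize_by_tic(noise)]

    reporters = sorted(p for p in scaled if is_reporter(p[0]))
    others = [p for p in scaled if not is_reporter(p[0])]

    # Merge reporter peaks that fall within REPORTER_BIN of the previous merged peak.
    merged = []
    for mz, intensity in reporters:
        if merged and mz - merged[-1][0] <= REPORTER_BIN:
            prev_mz, prev_intensity = merged[-1]
            total = prev_intensity + intensity
            new_mz = (prev_mz * prev_intensity + mz * intensity) / total if total > 0 else prev_mz
            merged[-1] = (new_mz, total)
        else:
            merged.append((mz, intensity))

    # All remaining fragment peaks are simply combined with the merged reporter peaks.
    return sorted(merged + others)
```

For example, mixing a three-peak original spectrum with a three-peak noise spectrum at `pip=0.8` yields four peaks when the two pairs of reporter peaks each fall within 0.005 Da of one another, and the mixed intensities still sum to 1.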
We also investigated the effect of true expression ratio on the bias in peptide quantification and identification. For each PSM, cancer-to-normal expression ratio was calculated as ((I115 + I117)/(I114 + I116)), where I114, I115, I116, and I117 denote intensity values of the four iTRAQ reporter ions. Prior to the ratio calculation, all four iTRAQ reporter ion peaks were normalized by applying the isotopic purity correction factor.35 We grouped the PSMs for simulation by their expression ratio into six groups: [1, 2), [2, 3), [3, 5), [5, 10), [10, 20), and [20, +∞). When assigning each PSM to a specific group, its cancer-to-normal or normal-to-cancer ratio was considered. A multiplexed spectrum for a specific true expression ratio group was generated by mixing a spectrum sampled from that group with a “noise” spectrum with an arbitrary expression ratio value.

Figure 1. Overview of the workflow. First, we constructed simulated multiplexed spectra using noise-free spectra containing all four reporter ions. The spectra used for multiplexed spectra generation (original spectra) and the simulated multiplexed spectra were searched using multiple peptide identification tools. After peptide identification, we filtered out multiplexed spectra that were incorrectly identified. Then, cancer-to-normal log ratio values between the original and multiplexed spectra were compared for obtaining distributions of their differences. Finally, we determined differentially expressed peptide (DEP) identification cutoffs for multiplexed spectra given a specific precursor isolation purity (PIP) value and a specific probability of type I error. PSM: peptide spectrum match.

Peptide Identification and Quantification from Simulated Multiplexed Spectra

We applied MODa, MS-GF+, and Proteome Discoverer with the same parameter settings as described in the previous section (e.g., FDR 1%). After the peptide identification, the cancer-to-normal expression ratio was calculated for each multiplexed spectrum from its iTRAQ reporter ion peaks after applying the isotope purity correction factor.35

Determining Cutoff Values for Differentially Expressed Peptide Identification from Cofragmented Spectra

Figure 2. Peptide identification results from multiplexed spectra for MODa (A) and MS-GF+ (B). The x-axis shows precursor isolation purity values. The y-axis shows the number of identified peptide spectrum matches (PSMs). When a multiplexed spectrum was identified as the same peptide as its original spectrum, it was regarded as correct (“Correct ID” in the tables). “Incorrect ID (Noise)” means that a multiplexed spectrum was identified as its noise counterpart. Other incorrect identifications are denoted as “Incorrect ID (Others).” Numbers in parentheses are proportions with respect to the total number of identified PSMs.

As shown in Figure 1, we empirically quantified the effect of cofragmentation on identification of DEPs by simulating multiplexed spectra. Using the simulated spectra, the distribution of amounts of distortion in iTRAQ quantification, i.e., the degrees of ratio compression (or inflation), at a given PIP value were estimated. The amount of distortion was defined as a difference in cancer-to-normal expression ratio between a simulated multiplexed spectrum and the original spectrum used to generate it. Only the simulated spectra matched to the same peptide as their original spectra were used for the distribution estimation. On the basis of the estimated distributions of amounts of distortion at given PIP values, we calculated cutoff values for robust identification of DEPs so that we can control the number of false positives among the DEPs we determine. A false positive (type I error) in DEP identification corresponds to the case where a DEP identified from a simulated multiplexed spectrum is not identified (as differentially expressed) in the original spectrum. Given a specified type I error probability, cutoff values for DEPs were determined as follows. For each spectrum, the cancer-to-normal expression ratio was calculated as described in the previous section. Then, the logarithm (base two) was applied to each ratio. MR means log2-ratio of a multiplexed spectrum with PIP less than 100%. OR denotes log2-ratio of the corresponding original spectrum with 100% PIP. DR denotes the difference between OR and MR (i.e., OR − MR). Cutoff values for DEP determination from the original spectra are denoted as OC1 (lower) and OC2 (upper). In our experiments, we used −1 and 1 for OC1 and OC2, respectively, as they are cutoff values for 2-fold DEP identification. Cutoff values for multiplexed spectra of a specific PIP value are denoted as MC1 (lower) and MC2 (upper). Then, we calculated the probability of type I error when identifying DEPs from multiplexed spectra using MC1 and MC2 as follows.

$$
\begin{aligned}
& P(\mathrm{OR} \ge \mathrm{OC1},\ \mathrm{MR} < \mathrm{MC1}) + P(\mathrm{OR} \le \mathrm{OC2},\ \mathrm{MR} > \mathrm{MC2}) \\
&\quad = P(\mathrm{OR} - \mathrm{MR} > \mathrm{OC1} - \mathrm{MC1}) + P(\mathrm{OR} - \mathrm{MR} < \mathrm{OC2} - \mathrm{MC2}) \\
&\quad = P(\mathrm{DR} > \mathrm{OC1} - \mathrm{MC1}) + P(\mathrm{DR} < \mathrm{OC2} - \mathrm{MC2})
\end{aligned}
\tag{1}
$$

Given a specific type I error rate (e.g., p-value of 0.05), MC1 and MC2 were calculated by the above equation and used for robust DEP detection from cofragmented spectra. In a similar way, we could calculate MC1 and MC2 for a specific type II error (false negative) rate as shown in Supporting Information (SI). When calculating MC1 and MC2, we used the empirical cumulative distribution of DR, because the distribution of DR was not well-described by any parametric distributions such as normal, gamma, or log-normal (see Results and Discussion).

Comparative Evaluation of Different Methods for Differentially Expressed Peptide Identification

We used simulated multiplexed spectra for assessing the performance of the following methods for two-fold DEP identification: ratio correction using the signal-to-interference (S2I) measure proposed by Savitski and colleagues (RATIO_COR),26 conventional cutoff values −1 and 1 (CONV), and cutoff values corrected for a specific type I error probability (CUTOFF_COR) proposed in this manuscript. We evaluated the DEP identification performance by positive predictive value (PPV), sensitivity, and F1 score. The PPV is defined as (# of true positives)/(# of true positives + # of false positives) and denotes the proportion of true DEPs among the DEPs identified by a specific method. Sensitivity means the proportion of the true DEPs identified by a specific method and is defined as (# of true positives)/(# of true positives + # of false negatives). F1 score is a measure considering both PPV and sensitivity (the harmonic mean of PPV and sensitivity). Since the chromatogram information was unavailable for simulated multiplexed spectra, we could not calculate S2I. Instead, we used PIP when applying RATIO_COR. We examined the correlation between PIP and S2I using a TMT data set from the study by Ting and colleagues.23 It is observed that the two measures for precursor interference were highly correlated (Pearson’s correlation coefficient: 0.78) (SI, Figure 1).

Software

The software was implemented in the Java programming language (Java SE v1.6), and can be obtained upon request to the corresponding author.

RESULTS AND DISCUSSION

Peptide Identification and Multiplexed Spectra Generation

The number of MS/MS spectra obtained by our LC−MS/MS experiments was 4,255,821 (after PE-MMR). From these spectra, MODa, MS-GF+, and Proteome Discoverer identified 456,408, 520,074, and 329,688 PSMs, respectively. SI Tables 1−3 summarize the numbers of identified PSMs from the three peptide search tools, stratified by charge state and PIP. For all the tools, at least 83% of the identified PSMs were from the spectra of PIP greater than or equal to 50%. However, the proportion of PSMs from the spectra without cofragmentation (i.e., 100% PIP), which were identified by MODa, MS-GF+, and Proteome Discoverer, was 22.5%, 21.7%, and 26.1%, respectively. Thus, we would lose 74−78% of PSMs in our experiments, if we only focused on the PSMs with 100% PIP. The number of noise-free PSMs identified concordantly among MODa, MS-GF+, and Proteome Discoverer was 80,616. Among these, 70,413 PSMs with charge state +2/+3 and having four reporter ion peaks were used for simulating multiplexed spectra after being divided into three groups by their precursor intensity (see Materials and Methods). From each precursor intensity group, 2500 multiplexed spectra with a specific PIP value were generated (7500 spectra in total). We also divided the 70,413 PSMs by their cancer-to-normal or normal-to-cancer expression ratio (see Materials and Methods). The number of PSMs in each group is shown in Table 4 of the SI. More than 77% of the PSMs had an expression ratio smaller than 2. For the groups with expression ratios [1, 2) and [2, 3), 2500 multiplexed spectra were generated, respectively. The number of multiplexed spectra for the groups [3, 5), [5, 10), [10, 20), and [20, +∞) were 1000, 800, 300, and 50, respectively. The following PIP values were used in the experiments: 50%, 60%, 70%, 80%, 90%, 95%, and 99%.

Figure 3. Distributions of differences in log2 fold-change ratios (DR) between multiplexed and original spectra for MODa (A and C) and MS-GF+ (B and D). Distributions for ratio compressed multiplexed spectra (A and B) and ratio inflated multiplexed spectra (C and D) are shown separately. The violin plots show the distributions according to the precursor isolation purity of multiplexed spectra. The tables show means ± standard deviations.

Influence of Cofragmentation on Peptide Identification

We investigated the influence of cofragmentation on peptide identification using the simulated multiplexed spectra. Figure 2 shows changes in the number of identified PSMs within 1% FDR according to different PIP values for MODa and MS-GF+.


Table 1. Corrected Cutoff Values for Detection of Two-Fold Differentially Expressed Peptides (DEP) with Type I Error Rate 0.05 for MODa and MS-GF+

                                  precursor isolation purity
database tool   cutoff value(a)   50%     60%     70%     80%     90%     95%     99%
MODa            MC1               −1.91   −1.76   −1.59   −1.41   −1.22   −1.12   −1.02
MODa            MC2                2.00    1.79    1.63    1.45    1.25    1.13    1.03
MS-GF+          MC1               −1.93   −1.76   −1.58   −1.41   −1.22   −1.12   −1.02
MS-GF+          MC2                1.98    1.79    1.64    1.45    1.25    1.13    1.03

(a) MC1 and MC2 respectively denote lower and upper cutoff values for DEP detection.

We observed that the number of identified PSMs increased steadily as PIP values increased. For MODa, the proportion of identified PSMs was 99.1% (7,429/7,500) when PIP was greater than or equal to 70% (Figure 2A). MS-GF+ identified 99.9% (7,494/7,500) of the multiplexed spectra when PIP was greater than or equal to 50% (Figure 2B). Results for Proteome Discoverer were also similar and are shown in Figure 2A in the SI. These identification results were evaluated in terms of correctness. A PSM for a multiplexed spectrum was considered correct if it was the same as the PSM for the original spectrum used to generate it (see Materials and Methods). MODa and Proteome Discoverer correctly identified over 99% of spectra when PIP was greater than or equal to 80%. For MS-GF+, the proportion of correctly identified spectra was over 99% when PIP was greater than or equal to 70%. For MODa and MS-GF+, most of the incorrect identification results were matched to the noise spectra (see Materials and Methods). Thus, it seems that we could achieve an error rate lower than 1% if we selected peptide identifications from multiplexed spectra with PIP values greater than or equal to 80%. For peptide identification from multiplexed spectra, precursor intensity did not show any significant effect as shown in Figure 2B−D and Figure 3 in the SI, probably because we already chose well-identified spectra for generating multiplexed spectra (see Materials and Methods). We did not observe any distinct effect of the true expression ratio on peptide identification from the simulated multiplexed spectra (SI Figure 4).

compressed spectra increased as the true expression ratio increased. For the multiplexed spectra generated from an original spectrum of which the expression ratio is larger than 10, we did not observe any ratio inflation phenomenon. We quantified the ratio compression and inflation phenomena using our simulated multiplexed spectra. Figure 3 shows how the degree of ratio compression and ratio inflation distribution of DR (see Materials and Methods)varies according to the level of noise (i.e., PIP values). Means of DR values for ratio compression ranged from 0.068 to 0.001 (MODa) and from 0.071 to 0.001 (MS-GF+). Means of DR for ratio inflation ranged from −0.103 to −0.003 (MODa) and −0.099 to −0.003 (MS-GF+). These mean values became close to zero as the level of noise decreased. It means that the average effect of cofragmentation is inversely proportional to PIP as expected. Standard deviations also decreased as PIP increased. For multiplexed spectra with 50% PIP, standard deviations of DR for ratio compression and ratio inflation were 0.8 and 0.4, respectively, for both MODa and MS-GF+. When PIP was 99%, standard deviations were 0.038 (ratio compression) and 0.013 (ratio inflation). The DR distribution for ratio compression was wider than that for ratio inflation, suggesting that the effect of cofragmentation is generally larger for ratio compression than for ratio inflation. Results for Proteome Discoverer were also similar and are shown in Figure 6 of the SI. We examined the effect of the true expression ratio on the degree of ratio compression and inflation using the simulated multiplexed spectra which have been correctly identified by MODa. Figures 7 and 8 in the SI show the distribution of DR for ratio compressed and ratio inflated multiplexed spectra, respectively. For ratio compressed spectra, variances of DR distributions were proportional to the true expression ratio. 
On the contrary, variances of DR distributions for ratio-inflated spectra steadily decreased as the true expression ratio increased. This discrepancy seems to be due to the fact that most spectra in our data set showed cancer-to-normal ratio values close to one (Table 4 in the SI). We calculated corrected cutoff values considering specific values of type I error (in this work, error rate of 0.05 was used) and PIP, using distributions of DR for ratio inflation as described in Materials and Methods. At first, we tried to estimate the distribution using widely used parametric family of distributions, such as normal, gamma, and log-normal. However, distributions of DR were not well described by any of these parametric forms as shown by Q−Q plots (Figures 9 and 10 in the SI) and results of the Kolmogorov−Smirnov goodness-of-fit test (p-values < 2.2 × 10−16). Thus, we used the empirical cumulative distribution function instead of these parametric distributions for calculating the corrected cutoff values, to extract 2-fold DEPs with an expected false positive ratio of 0.05. The resulting corrected cutoff values for MODa

Influence of Peptide Cofragmentation on Peptide Quantification

We only used the correctly identified PSMs from multiplexed spectra for investigating the effect of cofragmentation on iTRAQ-based quantification. We examined the ratio compression phenomenon in our iTRAQ experiments. A multiplexed spectrum was defined as ratio compressed if its ratio became closer to one (zero in the logarithm of base two) than the ratio of its original spectrum. Table 5 in the SI shows the number and proportion of ratio compressed spectra at various PIP values. We observed the ratio compression phenomenon in 74% of the multiplexed spectra correctly identified by MODa. That is, more than a quarter of the correctly identified multiplexed spectra were not ratio compressed (including some “ratio inflated” multiplexed spectra). Figure 5 in the SI shows changes in the proportion between ratio compressed and ratio inflated multiplexed spectra according to the true expression ratio (see Materials and Methods). When the true expression ratio ranges from 1 to 2, about 33% of multiplexed spectra showed ratio inflation. Thus, a uniform application of the same correction factor for mitigating ratio compression could produce false positive 2-fold DEPs. The proportion of ratio 3493

dx.doi.org/10.1021/pr500060d | J. Proteome Res. 2014, 13, 3488−3497

Journal of Proteome Research

Technical Note

Figure 4. Two-fold differentially expressed peptide (DEP) identification results for MODa (A) and MS-GF+ (B) using cutoff values corrected for type I error rate 0.05. The x-axis shows a range of precursor isolation purity values (%) of the spectra. The y-axis shows the number of identified DEPs. “Putative false positives” denote DEPs identified by usual cutoff values (−1 and 1) only. Numbers in parentheses denote the proportions of peptides with respect to the total number of identified peptides.

and MS-GF+ are shown in Table 1. The corrected cutoff values at 99% PIP were −1.02 and 1.03 for both search tools. The absolute values of the corrected cutoff values increased as PIP decreased. For multiplexed spectra with PIP 50%, the corrected cutoff values were −1.91 and 2.00 for MODa and −1.93 and 1.98 for MS-GF+. This means that only PSMs with more than 4-fold expression difference can be considered as actually 2-fold differentially expressed when two different precursors of the same amount are cofragmented. Results for Proteome Discoverer were also similar and are shown in Figure 6 in the SI.
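The idea of reading a cutoff off an empirical distribution rather than a fitted parametric one can be illustrated with a minimal Python sketch. This is not the procedure defined in Materials and Methods. It assumes, purely for illustration, that the corrected upper cutoff is the usual log2 cutoff shifted by the (1 − α) empirical quantile of DR for ratio-inflated spectra at a given PIP. The function names and the toy DR values are hypothetical.

```python
import math


def ecdf_quantile(values, prob):
    """Return the smallest sample value whose empirical CDF is >= prob."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in (0, 1]")
    ordered = sorted(values)
    # The ECDF at ordered[k - 1] is at least k / n, so take the first k
    # with k / n >= prob. The small tolerance guards against float rounding.
    k = max(1, math.ceil(prob * len(ordered) - 1e-12))
    return ordered[k - 1]


def corrected_upper_cutoff(dr_inflation, base_cutoff=1.0, alpha=0.05):
    """Assumed rule (hypothetical): usual log2 cutoff plus the
    (1 - alpha) empirical quantile of DR for ratio-inflated spectra."""
    return base_cutoff + ecdf_quantile(dr_inflation, 1.0 - alpha)


# Toy DR values in log2 units, invented for illustration only.
toy_dr = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
cutoff = corrected_upper_cutoff(toy_dr)

# A log2 cutoff c corresponds to a fold change of 2**c.
fold = 2.0 ** cutoff
```

Under the same conversion, the log2 cutoff of 2.00 reported for MODa at PIP 50% corresponds to 2^2.00 = 4, which is the 4-fold requirement stated above.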

of specificity is required. On the contrary, it showed the lowest sensitivity. It should be noted that the CUTOFF_COR method can be adapted to control type II error (false negative) rate when a high level of sensitivity is required (Materials and Methods and SI Methods). We tested CUTOFF_COR with varying type II error rates (0.3 to 0.5) on the simulated multiplexed spectra correctly identified by MODa. When applying type II error rate of 0.5, CUTOFF_COR and RATIO_COR showed similar performances. We observed that CUTOFF_COR achieved slightly higher PPVs and slightly lower sensitivities compared to RATIO_COR (Figure 12 in the SI); however, both methods showed the same performance for all the PIP values except for 70% when assessed by F1 score (Figure 13 in the SI). This result suggests that CUTOFF_COR could also be used for effective reduction of false negatives.
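For reference, the three measures used in this comparison (PPV, sensitivity, and F1 score) follow their standard definitions and can be computed from sets of true and called DEPs. The sketch below is a minimal Python illustration; the function name and the peptide identifiers are hypothetical and are not taken from the data set.

```python
def dep_metrics(true_deps, called_deps):
    """Compute PPV, sensitivity, and F1 score for a set of called DEPs."""
    true_deps, called_deps = set(true_deps), set(called_deps)
    tp = len(true_deps & called_deps)   # true positives
    fp = len(called_deps - true_deps)   # false positives
    fn = len(true_deps - called_deps)   # false negatives
    ppv = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of PPV and sensitivity.
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return {"ppv": ppv, "sensitivity": sensitivity, "f1": f1}


# Hypothetical peptide identifiers, for illustration only.
metrics = dep_metrics(
    true_deps={"pepA", "pepB", "pepC", "pepD"},
    called_deps={"pepA", "pepB", "pepE"},
)
```

A method tuned for a low type I error rate tends to raise PPV at the cost of sensitivity, and the F1 score summarizes the balance between the two.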

Performance of Differentially Expressed Peptide Identification Methods

Using the simulated spectra, we compared the performance of different DEP identification methods: RATIO_COR, CONV, and CUTOFF_COR (see Materials and Methods). The RATIO_COR method aims to improve sensitivity by increasing the level of expression ratio under the assumption that cofragmentation in iTRAQ experiments always results in ratio compression. On the contrary, the CUTOFF_COR method was devised for minimizing the chance of false positive DEP identification. Figure 11 of the SI shows the comparison results measured by PPV and sensitivity for MODa. As expected, PPVs of CUTOFF_COR were the highest among the three methods, regardless of the level of noise (PIP). When PIP was 70%, 95% of the DEPs identified by CUTOFF_COR were true positives. Among the DEPs identified by RATIO_COR, however, only 65% were true positives. Thus, CUTOFF_COR can be a reasonable option when a high level

Identifying Differentially Expressed Peptides Using Corrected Cutoff Values Considering PIP Values

We applied the corrected cutoff values for various PIP values calculated in the previous section to real multiplexed (i.e., cofragmented) spectra from the same sample. The numbers of PSMs obtained from spectra with PIP larger than or equal to 50% were 381,603 for MODa and 439,808 for MS-GF+ (see Tables 1 and 2 in the SI). These PSMs included 95,387 (for MODa) and 114,394 (for MS-GF+) unique peptides. Figure 4 shows changes in the number of 2-fold DEPs identified by usual log2-ratio cutoff values (−1 and 1) and the cutoff values corrected according to type I error rate 0.05 and PIP values, for MODa and MS-GF+ (see Table 1). Results for Proteome Discoverer are shown in Figure 14 in the


Figure 5. (A) Amino acid sequence of the protein Q9BX68. Boldface letters indicate the identified unique sequence peptides. The unique sequence peptides that have MS/MS spectra at PIP 100% are underlined, and the corresponding iTRAQ reporter ion spectra are shown, with the observed fold changes. (B) The enlarged MS spectrum of a precursor ion (m/z 738.0451) within the isolation window (±0.8 Th). The target precursor ion (indicated by red dots) was cofragmented with another ion (indicated by blue dots). The PIP value of the target precursor ion is 20%. (C) The resultant MS/MS spectrum of the two cofragmented ions with its iTRAQ reporter ion spectrum shown in the inset. The calculated fold change was 2.32. The cofragmented spectrum was annotated by fragments from the two cofragmented peptides (SLPADILYEDQQCLVFR in red and VFIPVLQSVTAR in blue).

MS spectra at their PIP 100%, whose reporter ion spectra resulted in an average fold change of 1.19 (Figure 5A). However, the reporter ion spectrum of SLPADILYEDQQCLVFR at PIP 20% resulted in a fold change of 2.32. Although the fold change seems larger than 2, the corrected cutoff value for DEP at PIP under 50% is 3.94, and thus, the peptide was not identified as a DEP and was removed as a putative false positive. SLPADILYEDQQCLVFR is a peptide sequence unique to the protein Q9BX68. Since Q9BX68 has no known isoforms, the ratio inflation is not likely to be caused by “isoform mixing”, where two or more isoform proteins of different expression result in ratio changes. We confirmed that the ratio inflation in this case was caused by the cofragmentation of a peptide, VFIPVLQSVTAR, whose reporter ion spectrum at PIP 84% indicates a fold change of 2.73. A list of proteins that contain false-positive DEPs resulting from ratio inflation is provided in Table 6 in the SI.
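How a coeluting peptide can inflate rather than compress a ratio can be seen in a toy mixing model, sketched below in Python. The model is an assumption made for illustration and is not the simulation procedure of this work: each peptide's reporter pair is scaled so that its normal channel equals 1, and the two pairs are combined with weights PIP and 1 − PIP. Real reporter intensities depend on absolute precursor abundances, so the toy value is not expected to reproduce the observed fold change of 2.32. It only shows that the distortion goes in the same direction. The function names are hypothetical.

```python
import math


def mixed_fold_change(target_fold, interferer_fold, pip):
    """Toy model: PIP-weighted mix of two reporter pairs whose normal
    channels are both scaled to 1 (an assumption, not measured data)."""
    if not 0.0 < pip <= 1.0:
        raise ValueError("pip must be a fraction in (0, 1]")
    cancer = pip * target_fold + (1.0 - pip) * interferer_fold
    normal = pip * 1.0 + (1.0 - pip) * 1.0
    return cancer / normal


def distortion(original_fold, mixed_fold):
    """Label a mixed spectrum by comparing |log2 ratio| with the original:
    closer to zero is compression, farther from zero is inflation."""
    original = abs(math.log2(original_fold))
    mixed = abs(math.log2(mixed_fold))
    if mixed < original:
        return "compressed"
    if mixed > original:
        return "inflated"
    return "unchanged"


# Fold changes quoted in the text: target 1.19, interferer 2.73, target PIP 20%.
toy_fold = mixed_fold_change(1.19, 2.73, pip=0.20)  # about 2.42 in this toy model
label = distortion(1.19, toy_fold)                  # "inflated"
```

In the same model, a strongly regulated target mixed with an unregulated interferer (for example, fold changes 4 and 1 at PIP 50%) moves toward one, which is the familiar ratio compression case.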

SI. When applying corrected cutoff values to a spectrum, the corrected cutoff value for the largest PIP smaller than or equal to the spectrum’s actual PIP was used. For example, the corrected cutoff value for PIP 80% was applied to the spectrum with actual PIP of 85%. The numbers of DEPs identified from spectra without cofragmentation (100% PIP) were 9891 (MODa), 11,234 (MS-GF+), and 8257 (Proteome Discoverer), which did not include “putative” false positives. We defined putative false positive DEPs as the DEPs identified by applying the usual cutoff values, but not by corrected cutoff values. The number of identified DEPs by the usual cutoff values and the number of putative false positives increased as the level of noise in spectra increased. However, the increase in the number of DEPs detected by corrected cutoff values stagnated when the minimum PIP of spectra was less than or equal to 70%. For MODa, 14,795 DEPs were identified from spectra with PIP greater than or equal to 70%. When we included the spectra with a minimum PIP of 50%, only 278 DEPs were additionally identified. Results for MS-GF+ and Proteome Discoverer were also similar. Thus, we concluded that spectra with PIP less than or equal to 70% were too noisy to reliably detect 2-fold DEPs. However, our approach was able to reliably identify small numbers of DEPs highly likely to be true positives even from such noisy spectra by applying cutoff values corrected according to a specified type I error and noise level.
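The lookup rule described above (use the cutoff tabulated for the largest PIP not exceeding the spectrum's actual PIP) and the definition of putative false positives can be sketched in Python as follows. The table is only an example: the entries for PIP 50% and 99% are the MODa values quoted in the text, whereas the entry for PIP 80% is a hypothetical placeholder and is not taken from Table 1. The function names and the use of non-strict inequalities are likewise assumptions.

```python
import bisect

# log2 cutoffs (lower, upper) keyed by PIP in percent.
EXAMPLE_CUTOFFS = {
    50: (-1.91, 2.00),  # MODa values quoted in the text
    80: (-1.30, 1.30),  # hypothetical placeholder, NOT from Table 1
    99: (-1.02, 1.03),  # values quoted in the text
}


def cutoffs_for_pip(pip, table):
    """Return the cutoffs for the largest tabulated PIP <= the actual PIP."""
    levels = sorted(table)
    index = bisect.bisect_right(levels, pip)
    if index == 0:
        raise ValueError("PIP is below the lowest tabulated level")
    return table[levels[index - 1]]


def classify(log2_ratio, pip, table, usual=(-1.0, 1.0)):
    """Label a peptide as a DEP, a putative false positive, or neither."""
    lower, upper = cutoffs_for_pip(pip, table)
    if log2_ratio <= lower or log2_ratio >= upper:
        return "DEP"
    if log2_ratio <= usual[0] or log2_ratio >= usual[1]:
        # Passes the usual cutoffs but not the corrected ones.
        return "putative false positive"
    return "not DEP"


# A spectrum with actual PIP 85% uses the cutoffs tabulated for PIP 80%.
chosen = cutoffs_for_pip(85, EXAMPLE_CUTOFFS)
```

Using `bisect_right` on the sorted PIP levels keeps the lookup correct when the actual PIP coincides with a tabulated level, since an exact match is then selected rather than the level below it.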



CONCLUSION

Isobaric tag-based quantification methods such as iTRAQ and TMT have advantages: direct relative quantification with less experimental bias, increased experimental throughput, and wide quantification coverage. However, they suffer from deterioration of accuracy in both quantification and identification of peptides due to cofragmentation at the MS/MS stage. In order to estimate the effect of cofragmentation in iTRAQ experiments, we simulated multiplexed spectra with known precursor composition and various PIP values from global proteome profiling experiments on a gastric tumor−normal tissue pair. For peptide identification, generally used peptide search tools achieved an error rate lower than 1% when searching multiplexed spectra with PIP over 80%. From the iTRAQ

Multiplexed Spectra with Ratio Inflation

We examined whether the corrected cutoff values can identify and reduce false-positive DEPs resulting from “ratio inflation”. Figure 5 shows an example of ratio inflation. In this experiment, the protein Q9BX68 was identified by seven unique sequence peptides. Five of the seven peptides have MS/


sensitivity) of 2-fold differentially expressed peptide identification methods: ratio correction based on the signal-to-interference measure (RATIO_COR), conventional log2 cutoffs −1 and 1 (CONV), and corrected cut-offs for type II error rate of 0.5 (CUTOFF_COR). Figure 13. Performance comparison (F1 score) of 2-fold differentially expressed peptide identification methods: ratio correction based on the signal-to-interference measure (RATIO_COR), conventional log2 cutoffs −1 and 1 (CONV), and corrected cut-offs for type II error rate of 0.5 (CUTOFF_COR). Figure 14. Results of 2-fold differentially expressed peptides (DEPs) from Proteome Discoverer. This material is available free of charge via the Internet at http://pubs.acs.org.

quantification results, we observed the ratio compression phenomenon. However, a fair proportion of multiplexed spectra also showed ratio inflation. On the basis of the estimated effect of cofragmentation on iTRAQ quantification, we calculated corrected cutoff values for a specified type I error rate and various PIP values. By applying the corrected cutoff values, we were able to effectively eliminate false positive DEPs due to ratio inflation. In contrast to simplistic spectrum filtering methods based on noise level, our approach could salvage true positive DEPs from noisy spectra. The estimated effect of cofragmentation on iTRAQ experiments and the calculated corrected cutoff values for DEP detection may well depend on various experimental parameters and settings. However, the proposed methods for simulating multiplexed spectra and related calculations are readily generalizable to other experiments using isobaric tag-based quantification in mass spectrometry.





AUTHOR INFORMATION

Corresponding Author

*Tel.: +82-2-2220-2377; fax: +82-2-2220-1723; e-mail: [email protected].

ASSOCIATED CONTENT

Supporting Information

Notes

Methods. Corrected cutoff calculation method for given precursor isolation purity (PIP) and type II error values. Table 1. Numbers of peptide spectrum matches (PSMs) identified from MODa search, stratified by charge states and precursor isolation purity (PIP). Table 2. Numbers of peptide spectrum matches (PSMs) identified from MS-GF+ search, stratified by charge states and precursor isolation purity (PIP). Table 3. Numbers of peptide spectrum matches (PSMs) identified from Proteome Discoverer, stratified by charge states and precursor isolation purity (PIP). Table 4. Numbers of peptide spectrum matches (PSMs) concordantly identified by the three database search tools, stratified by cancer-to-normal or normal-to-cancer expression ratio. Table 5. Phenomenon of “ratio compression” in the simulation experiment. Table 6. List of proteins including putative false-positive differentially expressed peptides (DEPs) resulting from “ratio inflation”. Figure 1. Scatter plot showing the correlation between S2I and PIP for the spectra in a TMT data set from the study by Ting and colleagues (Ting et al., 2011). Figure 2. Identification results from Proteome Discoverer database search engine. Figure 3. Identifications per different precursor intensity ranges from MODa and MS-GF+. Figure 4. Peptide identification results per different true expression ratio ranges for MODa. Figure 5. Proportion between ratio compressed and ratio inflated multiplexed spectra for MODa. Figure 6. Distributions of differences in log2 fold-change ratios (DR) between multiplexed and original spectra for Proteome Discoverer. Figure 7. Distributions of the degree of ratio compression (DR) stratified by the true expression ratio of simulated multiplexed spectra for MODa. Figure 8. Distributions of the degree of ratio inflation (DR) stratified by the true expression ratio of simulated multiplexed spectra for MODa. Figure 9. 
Results of fitting DR distributions for ratio compression as parametric functions for MODa. Figure 10. Results of fitting DR distributions for ratio inflation as parametric functions for MODa. Figure 11. Performance comparison (positive predictive value and sensitivity) of 2-fold differentially expressed peptide identification methods: ratio correction based on the signal-to-interference measure (RATIO_COR), conventional log2 cut-offs −1 and 1 (CONV), and corrected cut-offs for type I error rate of 0.05 (CUTOFF_COR). Figure 12. Performance comparison (positive predictive value and



The authors declare no competing financial interest.

ACKNOWLEDGMENTS

This research was supported by the Proteogenomics Research Program through the National Research Foundation of Korea funded by the Korean Ministry of Education, Science and Technology (NRF-2012M3A9B9036676), by the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2012M3A9D1054452; NRF-2013M3C7A1069644), and by a grant from KRIBB Research Initiative Program. This work was also funded by a grant from the National Project for Personalized Genomic Medicine, Ministry for Health & Welfare, Republic of Korea (A111218-CP02). H.L. and K.B.H. were supported by the National Research Foundation of Korea (NRF-2012R1A1A2039822; NRF-2012M3A9D1054705).



REFERENCES

(1) Bantscheff, M.; Lemeer, S.; Savitski, M. M.; Kuster, B. Quantitative mass spectrometry in proteomics: critical review update from 2007 to the present. Anal. Bioanal. Chem. 2012, 404 (4), 939−965.
(2) Everley, P. A.; Krijgsveld, J.; Zetter, B. R.; Gygi, S. P. Quantitative cancer proteomics: stable isotope labeling with amino acids in cell culture (SILAC) as a tool for prostate cancer research. Mol. Cell Proteomics 2004, 3 (7), 729−735.
(3) Gygi, S. P.; Rist, B.; Gerber, S. A.; Turecek, F.; Gelb, M. H.; Aebersold, R. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat. Biotechnol. 1999, 17 (10), 994−999.
(4) Pan, S.; Aebersold, R.; Chen, R.; Rush, J.; Goodlett, D. R.; McIntosh, M. W.; Zhang, J.; Brentnall, T. A. Mass spectrometry based targeted protein quantification: Methods and applications. J. Proteome Res. 2009, 8 (2), 787−797.
(5) Zhang, G.; Ueberheide, B. M.; Waldemarson, S.; Myung, S.; Molloy, K.; Eriksson, J.; Chait, B. T.; Neubert, T. A.; Fenyo, D. Protein quantitation using mass spectrometry. Methods Mol. Biol. 2010, 673, 211−22.
(6) Wang, W.; Zhou, H.; Lin, H.; Roy, S.; Shaler, T. A.; Hill, L. R.; Norton, S.; Kumar, P.; Anderle, M.; Becker, C. H. Quantification of proteins and metabolites by mass spectrometry without isotopic labeling or spiked standards. Anal. Chem. 2003, 75 (18), 4818−4826.
(7) Mueller, L. N.; Brusniak, M. Y.; Mani, D. R.; Aebersold, R. An assessment of software solutions for the analysis of mass spectrometry based quantitative proteomics data. J. Proteome Res. 2008, 7 (1), 51−61.
(8) Asara, J. M.; Christofk, H. R.; Freimark, L. M.; Cantley, L. C. A label-free quantification method by MS/MS TIC compared to SILAC and spectral counting in a proteomics screen. Proteomics 2008, 8 (5), 994−999.
(9) Ong, S. E.; Blagoev, B.; Kratchmarova, I.; Kristensen, D. B.; Steen, H.; Pandey, A.; Mann, M. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell Proteomics 2002, 1 (5), 376−386.
(10) Ross, P. L.; Huang, Y. N.; Marchese, J. N.; Williamson, B.; Parker, K.; Hattan, S.; Khainovski, N.; Pillai, S.; Dey, S.; Daniels, S.; Purkayastha, S.; Juhasz, P.; Martin, S.; Bartlet-Jones, M.; He, F.; Jacobson, A.; Pappin, D. J. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell Proteomics 2004, 3 (12), 1154−1169.
(11) Gafken, P. R.; Lampe, P. D. Methodologies for characterizing phosphoproteins by mass spectrometry. Cell Commun. Adhes. 2006, 13 (5−6), 249−262.
(12) Zieske, L. R. A perspective on the use of iTRAQ reagent technology for protein complex and profiling studies. J. Exp. Bot. 2006, 57 (7), 1501−1508.
(13) Dayon, L.; Hainard, A.; Licker, V.; Turck, N.; Kuhn, K.; Hochstrasser, D. F.; Burkhard, P. R.; Sanchez, J. C. Relative quantification of proteins in human cerebrospinal fluids by MS/MS using 6-plex isobaric tags. Anal. Chem. 2008, 80 (8), 2921−2931.
(14) Thompson, A.; Schafer, J.; Kuhn, K.; Kienle, S.; Schwarz, J.; Schmidt, G.; Neumann, T.; Johnstone, R.; Mohammed, A. K.; Hamon, C. Tandem mass tags: A novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal. Chem. 2003, 75 (8), 1895−1904.
(15) Desouza, L. V.; Voisin, S. N.; Siu, K. W. iTRAQ-labeling for biomarker discovery. Methods Mol. Biol. 2013, 1002, 105−114.
(16) Ernoult, E.; Bourreau, A.; Gamelin, E.; Guette, C. A proteomic approach for plasma biomarker discovery with iTRAQ labelling and OFFGEL fractionation. J. Biomed. Biotechnol. 2010, 2010, 927917.
(17) Tsuchida, S.; Satoh, M.; Kawashima, Y.; Sogawa, K.; Kado, S.; Sawai, S.; Nishimura, M.; Ogita, M.; Takeuchi, Y.; Kobyashi, H.; Aoki, A.; Kodera, Y.; Matsushita, K.; Izumi, Y.; Nomura, F. Application of quantitative proteomic analysis using tandem mass tags for discovery and identification of novel biomarkers in periodontal disease. Proteomics 2013, 13 (15), 2339−2350.
(18) Latterich, M.; Abramovitz, M.; Leyland-Jones, B. Proteomics: New technologies and clinical applications. Eur. J. Cancer 2008, 44 (18), 2737−2741.
(19) Dean, R. A.; Overall, C. M. Proteomics discovery of metalloproteinase substrates in the cellular context by iTRAQ labeling reveals a diverse MMP-2 substrate degradome. Mol. Cell Proteomics 2007, 6 (4), 611−623.
(20) Ow, S. Y.; Salim, M.; Noirel, J.; Evans, C.; Wright, P. C. Minimising iTRAQ ratio compression through understanding LC-MS elution dependence and high-resolution HILIC fractionation. Proteomics 2011, 11 (11), 2341−2346.
(21) Savitski, M. M.; Sweetman, G.; Askenazi, M.; Marto, J. A.; Lang, M.; Zinn, N.; Bantscheff, M. Delayed fragmentation and optimized isolation width settings for improvement of protein identification and accuracy of isobaric mass tag quantification on Orbitrap-type mass spectrometers. Anal. Chem. 2011, 83 (23), 8959−8967.
(22) Bantscheff, M.; Boesche, M.; Eberhard, D.; Matthieson, T.; Sweetman, G.; Kuster, B. Robust and sensitive iTRAQ quantification on an LTQ Orbitrap mass spectrometer. Mol. Cell Proteomics 2008, 7 (9), 1702−1713.
(23) Ting, L.; Rad, R.; Gygi, S. P.; Haas, W. MS3 eliminates ratio distortion in isobaric multiplexed quantitative proteomics. Nat. Methods 2011, 8 (11), 937−940.
(24) Wenger, C. D.; Lee, M. V.; Hebert, A. S.; McAlister, G. C.; Phanstiel, D. H.; Westphall, M. S.; Coon, J. J. Gas-phase purification enables accurate, multiplexed proteome quantification with isobaric tagging. Nat. Methods 2011, 8 (11), 933−935.

(25) Vaudel, M.; Burkhart, J. M.; Radau, S.; Zahedi, R. P.; Martens, L.; Sickmann, A. Integral quantification accuracy estimation for reporter ion-based quantitative proteomics (iQuARI). J. Proteome Res. 2012, 11 (10), 5072−5080.
(26) Savitski, M. M.; Mathieson, T.; Zinn, N.; Sweetman, G.; Doce, C.; Becher, I.; Pachl, F.; Kuster, B.; Bantscheff, M. Measuring and managing ratio compression for accurate iTRAQ/TMT quantification. J. Proteome Res. 2013, 12 (8), 3586−3598.
(27) Sandberg, A.; Branca, R. M.; Lehtio, J.; Forshed, J. Quantitative accuracy in mass spectrometry based proteomics of complex samples: The impact of labeling and precursor interference. J. Proteomics 2013, 96C, 133−144.
(28) Wuhr, M.; Haas, W.; McAlister, G. C.; Peshkin, L.; Rad, R.; Kirschner, M. W.; Gygi, S. P. Accurate multiplexed proteomics at the MS2 level using the complement reporter ion cluster. Anal. Chem. 2012, 84 (21), 9214−9221.
(29) Wisniewski, J. R.; Zougman, A.; Nagaraj, N.; Mann, M. Universal sample preparation method for proteome analysis. Nat. Methods 2009, 6 (5), 359−362.
(30) Wang, Y.; Yang, F.; Gritsenko, M. A.; Wang, Y.; Clauss, T.; Liu, T.; Shen, Y.; Monroe, M. E.; Lopez-Ferrer, D.; Reno, T.; Moore, R. J.; Klemke, R. L.; Camp, D. G., 2nd; Smith, R. D. Reversed-phase chromatography with multiple fraction concatenation strategy for proteome profiling of human MCF10A cells. Proteomics 2011, 11 (10), 2019−2026.
(31) Lee, H.; Lee, J. H.; Kim, H.; Kim, S.-J.; Bae, J.; Kim, H. K.; Lee, S.-W. A fully automated dual-online multifunctional ultrahigh pressure liquid chromatography system for high-throughput proteomics analysis. J. Chromatogr., A 2014, 1329, 83−89.
(32) Shin, B.; Jung, H. J.; Hyung, S. W.; Kim, H.; Lee, D.; Lee, C.; Yu, M. H.; Lee, S. W. Postexperiment monoisotopic mass filtering and refinement (PE-MMR) of tandem mass spectrometric data increases accuracy of peptide identification in LC/MS/MS. Mol. Cell Proteomics 2008, 7 (6), 1124−1134.
(33) Na, S.; Bandeira, N.; Paek, E. Fast multi-blind modification search through tandem mass spectrometry. Mol. Cell Proteomics 2012, 11 (4), M111.010199.
(34) Mertins, P.; Udeshi, N. D.; Clauser, K. R.; Mani, D. R.; Patel, J.; Ong, S. E.; Jaffe, J. D.; Carr, S. A. iTRAQ labeling is superior to mTRAQ for quantitative global proteomics and phosphoproteomics. Mol. Cell Proteomics 2012, 11 (6), M111.014423.
(35) Pappin, D. An iTRAQ Primer. http://www.ushupo.org/portals/0/ushupo_techtalk_itraq.pdf.
