Cross-Correlation of Spectral Count Ranking to Validate Quantitative

Article pubs.acs.org/jpr

Cross-Correlation of Spectral Count Ranking to Validate Quantitative Proteome Measurements

Olli Kannaste,†,⊥ Tomi Suomi,‡,⊥ Jussi Salmi,† Esa Uusipaikka,§ Olli Nevalainen,‡ and Garry L. Corthals*,†,#

† Turku Centre for Biotechnology, University of Turku and Åbo Akademi University, 20520 Turku, Finland
‡ Department of Information Technology, University of Turku, 20014 Turku, Finland
§ Department of Statistics, University of Turku, 20014 Turku, Finland

Supporting Information

ABSTRACT: The measurement of change in biological systems through protein quantification is a central theme in modern biosciences and medicine. Label-free MS-based methods have greatly increased the ease and throughput in performing this task. Spectral counting is one such method that uses detected MS2 peptide fragmentation ions as a measure of the protein amount. The method is straightforward to use and has gained widespread interest. Additionally, reports on new statistical methods for analyzing spectral count data appear at regular intervals, but a systematic evaluation of these is rarely seen. In this work, we studied how similar the results are from different spectral count data analysis methods, given the same biological input data. For this, we chose the algorithms Beta Binomial, PLGEM, QSpec, and PepC to analyze three biological data sets of varying complexity. For analyzing the capability of the methods to detect differences in protein abundance, we also performed controlled experiments by spiking a mixture of 48 human proteins in varying concentrations into a yeast protein digest to mimic biological fold changes. In general, the agreement of the analysis methods was not particularly good on the proteome-wide scale, as considerable differences were found between the different algorithms. However, we observed good agreement between the methods for the proteins with the largest abundance changes, indicating that for a smaller fraction of the proteome changes are measurable, and the methods may be used as valuable tools in the discovery-validation pipeline when applying a cross-validation approach as described here. Performance ranking of the algorithms using samples of known composition showed PLGEM to be superior, followed by Beta Binomial, PepC, and QSpec. Similarly, the normalized versions of the same method, when available, generally outperformed the standard ones. Statistical detection of protein abundance differences was strongly influenced by the number of spectra acquired for the protein and, correspondingly, its molecular mass.

KEYWORDS: LC/MS2, protein quantification, spectral counting

1. INTRODUCTION

In mass spectrometry based proteomics, quantitative analysis has had a long time between its inception and application, mainly due to several technical challenges. Detectability of peptides is not uniform but depends on numerous physicochemical characteristics,1 resulting in differences in MS signal intensities for isomolar but different molecular species, hindering quantitative analysis from becoming a straightforward analytical measurement. In shotgun proteomics based approaches, further discrepancy results from the enzymatic digestion of proteins having different numbers of peptide cleavage sites, resulting in different numbers of detectable peptides. Further challenges for MS based measurements are encountered when proteins undergo biologically controlled post-translational modifications (PTMs), resulting in site-specific changes to the masses of amino acids in peptides. The summation of effects that play a role in the distribution of the concentration of peptides and subsequent masses for a single protein is indeed complex. Acknowledging these challenges has also opened up opportunities for specific solutions, and the field has a suite of robust methods that enable the measurement of many, but not all, proteins and their PTMs.

The fact that isotopic variants of a peptide have identical ionization properties, but are distinguishable according to mass by MS, has been exploited in proteomics quantification strategies. Proteins or peptides in a biological sample can be labeled with stable isotopes, enabling quantification by comparing signal intensities of labeled and unlabeled peptides corresponding to the same protein originating from experimentally treated and control samples. Labeling can be done chemically after protein purification (ICAT, iTRAQ)2,3 or metabolically by the organism or in cell culture (SILAC).4

Received: November 6, 2013
Published: March 10, 2014
© 2014 American Chemical Society

dx.doi.org/10.1021/pr401096z | J. Proteome Res. 2014, 13, 1957−1968

Journal of Proteome Research

Article

These techniques are used for relative quantification, that is, the measurement of change (in signal intensities) without knowledge about the amount of protein. Absolute protein quantification is enabled by “spiking” the sample with known concentrations of stable isotope labeled synthetic peptides of interest, to which the signal intensities of unlabeled biological peptides are compared (AQUA).5 Thus, change measurements are extracted from the MS signal intensities, which are directly proportional to the amount (grams or moles) of peptides present in the sample.

Increasingly, label-free methods have emerged as an option for protein quantification in MS. These developments have been spurred partly by the costly and laborious nature of stable isotope labeling methods. However, a practical impetus has also fuelled them, as label-free approaches facilitate the comparison of larger numbers of different biological states and conditions, and larger numbers of patient samples. This is because the distinction of different pools of proteins is not restricted to pairwise comparisons, which is characteristic of any of the labeling-based methods.

Label-free quantification can be performed either at the precursor ion (MS) or fragment ion (MS/MS or MS2, as termed here) level. In the first approach, signal intensities for precursor ions are measured as a function of chromatographic retention time in order to extract ion current curves, which are integrated to yield peak areas corresponding to peptide abundance.6 Precursor ion chromatograms are then mapped across multiple runs using computational feature finding, and the intensities of MS signals are compared. For fragmentation ion comparisons, the number of MS2 fragment ions identified for a protein is used to express quantitative differences between samples. This method is commonly known as spectral counting.7 The approach has the benefits of being more straightforward to implement and having a slightly better dynamic range (than MS quantification), although it also has its shortcomings, namely the poor signal-to-noise ratio when spectral counts are low.8 Recently, the spectral counting approach has been refined by considering the cumulative signal intensities of MS2 ions identified for a given protein.9,10

For the statistical analysis of spectral count (SC) data, numerous methods have already been developed.8,11−18 Some of them have been directly adapted from microarray data analysis tools, whereas others are formulated in a way to specifically consider the discrete nature of spectral count data. An example of the former methods is the power law global error model (PLGEM).12 For PLGEM analysis, spectral counts are normalized by protein length and by the sum of all length-normalized counts in a sample to generate normalized spectral abundance factors (NSAF).19 These were shown to have similar statistical properties as microarray data after natural-log transformation, namely, the relationship between abundance and standard deviation is characterized by a power law distribution.12 The authors of the method argue that this validates the use of PLGEM for describing and analyzing proteomics data. In PLGEM, the power law relation is used to obtain standard deviation values from a linear error model in log−log space. The reasoning is that standard deviations obtained by statistical modeling are more accurate than calculating them independently for each protein from a small number of replicates, which is usually the situation in proteomics experiments. Protein abundances, as defined by NSAF values, and model-derived standard deviations are used in calculating protein expression signal-to-noise ratios, which after permutation analysis yield p values for differential expression. Although protein size normalization seems intuitively sound, Lundgren et al.20 identified a higher fraction of true positives with PLGEM using raw spectral counts instead of NSAF values from a SILAC data set containing simulated protein fold changes.

An example of a statistical method designed to specifically address the properties of spectral count data is QSpec.8 It takes into consideration issues like the distribution of spectral count data, modeled using a Poisson distribution, as well as the commonly encountered limited number of replicates in proteomic studies. The framework is based on the use of hierarchical Bayes estimation of a generalized linear mixed model. The authors argue that this approach is more powerful than conventional calculation approaches utilizing signal-to-noise ratios for individual proteins.

Another statistical method used for proteomics data is PepC,16 which uses both the t-test and the G-test. It balances the trade-off between the number of differentially expressed proteins identified and the false discovery rate (FDR).

The recent model proposed by Pham et al.13 uses the β-binomial distribution to model spectral count data. In the model, the within-sample variation is modeled with a binomial distribution, and the between-sample variation is modeled by treating the parameter of the binomial distribution as a random variable from a β distribution. Parameter inference is based on a likelihood ratio test. In this model, each protein is treated separately, in contrast to QSpec, in which statistical information across all proteins is used.

In evaluating the performance of statistical methods for spectral count data, information about the actual changes of protein amount is important as a reference point. In previous studies, this has been addressed by using candidate protein lists based on previous knowledge about the particular biological system11 or by inserting simulated protein fold changes into data sets.8 These approaches are not ideal, as the preconceptions of researchers about the expected changes in protein expression might not reflect the real situation, and simulations using arbitrary fold change values are likely not realistic models of actual biological signaling. However, given the fundamentally unknown quantitative composition of complex biological samples, the use of these approaches still seems reasonable.

In this study, we approach the issue of evaluating four spectral count analysis methods from a different point of view. We test how similar the results from the methods are, given the same biological input data. Specifically, we address similarity by ranking identified proteins according to the probability of differential expression, as calculated by the statistical algorithms, and by comparing the degrees of similarity of the ranked protein lists. We reason that a high degree of consensus in this “cross-validation” approach would increase confidence in results generated by the different analysis methods, whereas low overlap would indicate poor discriminating capability among the methods. For evaluating the statistical spectral counting methods for biological samples, we use three data sets: a rat epileptogenic hippocampal proteome data set and a pig wound tissue data set, both related to ongoing separate translational proteomics projects, and a comparative growth phase yeast data set published by Pavelka et al.12 Also, because of the fundamental uncertainty about the validity of quantification of these biological data sets, we conducted experiments in which we spiked different concentrations of


Universal Proteomics Standard (UPS1) from Sigma, containing 48 human proteins of equimolar concentration, into a complex yeast proteome background. We hereby wanted to create a reference data set with known UPS protein fold changes and analyze the resulting spectral count data with the different algorithms. This gives us insight into the ability of the methods to detect differences between different concentrations of the spiked proteins. The results of the above experiments lead us to conclude that when considering the proteome-wide expression of different proteins, the correlation between the different methods varies considerably. However, in regard to the proteins having the highest change in abundance, we find that there are methods with good consensus.

2. EXPERIMENTAL PROCEDURES

2.1. Rat Brain Data

Epileptogenic and control animal groups were established by obtaining six postnatal day 9 rats and injecting them intraperitoneally with kainate (n = 3) or saline (n = 3), respectively. All animal experiments were conducted in accordance with the guidelines of the European Community Council Directives 86/609/EEC and had the approval of the Office of the Regional Government of Western Finland. All efforts were made to minimize the pain and discomfort of the experimental animals. Kainate is a subtype-specific glutamate receptor agonist that causes a condition that, in rodent models, mimics temporal lobe epilepsy in humans, that is, recurrent seizures originating from limbic brain regions, typically the hippocampus. After one week, animals were sacrificed and their hippocampi collected. Immediately following dissection, the hippocampi were treated with the Stabilizor T1 system (Denator, Gothenburg, Sweden) to thermally inactivate intracellular enzyme activity. After this step, the tissues were snap frozen in liquid nitrogen and stored at −80 °C until further processing.

For LC/MS2 analysis, tissues were homogenized and proteins were acetone precipitated. Cysteine residues were reduced and alkylated, and proteins were digested with trypsin overnight. Samples were desalted using C18 pipet tips (ZipTip, Millipore, Billerica, MA, U.S.A.). Isoelectric focusing was employed for sample prefractionation on an IPGphor IEF system (GE Healthcare, Waukesha, WI, U.S.A.), in which peptides were separated on 13 cm IPG strips (GE Healthcare, Waukesha, WI, U.S.A.), which were subsequently cut into five fractions. After IEF, peptides were extracted from IPG strips, and samples were once more desalted with C18 tips, dried, and resuspended in 1% formic acid.

Samples were analyzed with a QSTAR Pulsar mass spectrometer (Applied Biosystems/MDS Sciex, Canada) coupled online with a nanoflow HPLC system (Famos, Switchos II and Ultimate, LC Packings, Amsterdam, Netherlands/CapLC, Waters, Milford, MA, U.S.A.). Peptides were first loaded on a trapping column (0.3 × 5 mm PepMap C18, LC Packings) and subsequently separated inline on a 15 cm C18 column (75 μm × 15 cm, Magic 5 μm 200 Å C18, Michrom BioResources Inc., Sacramento, CA, U.S.A.). A 90 min LC gradient from 95% solvent A (5% acetonitrile, 0.1% formic acid) to 95% solvent B (95% acetonitrile, 0.1% formic acid) with a flow rate of 200 nL/min was used, followed by electrospray ionization and data-dependent analysis in positive ion mode. A mass window of 350−1600 m/z was used in the full MS scan, and the two most intense doubly or triply charged ions were selected for fragmentation. Dynamic precursor ion exclusion with a duration of 120 s was used.

Raw MS files were searched against the UniProt Knowledgebase (Release 15.5: consisting of UniProtKB/Swiss-Prot Release 57.5/UniProtKB/TrEMBL Release 40.5, Rattus sequences, 16647 entries) using the Mascot (Version 2.2.0, Matrix Science, Boston, MA, U.S.A.) search engine for protein identification. For the in silico digestion, trypsin was selected with a maximum of two missed cleavages allowed. Allowed peptide charges were set at 1+, 2+, and 3+. Peptide and MS2 mass tolerances were set at 0.1 and 0.2 Da, respectively. Monoisotopic masses were used. Cysteine carbamidomethylation and methionine oxidation were set as variable modifications. A probability-based score cutoff for protein identifications, equivalent to a p value of 0.05, was applied. For statistical validation of peptide-spectrum matches and protein identifications, MS2 based peptide and protein identifications were validated with the PeptideProphet21 and ProteinProphet22 algorithms using Scaffold software (version 2.06.00, Proteome Software, Portland, OR, U.S.A.). Probability thresholds of 95% and 99% were selected for peptides and proteins, respectively, with a minimum of two peptides assigned per protein.

2.2. Pig Wound Tissue Data

In six Duroc pigs, 15 × 27 mm wounds were made in the oral mucosa and the skin of the back for assessment of differential wound healing of these tissues. For proteomic analysis, the tissues were sampled three days after wounding. Biopsy samples were frozen and manually ground into powder, after which they were solubilized in standard Laemmli SDS sample buffer. Proteins were separated on SDS-PAGE. Gels were cut into 12 pieces, and proteins were in-gel reduced, alkylated, and digested with trypsin. LC-ESI-MS2 analyses of tryptic peptides were performed on a nanoflow HPLC system (Ultimate 3000, Dionex, Sunnyvale, CA) coupled to a QSTAR Elite mass spectrometer (Applied Biosystems/MDS Sciex, Canada) equipped with a nanoelectrospray ionization source (Proxeon, Odense, Denmark). Peptides were first loaded on a trapping column (0.3 × 5 mm PepMap C18, LC Packings) and subsequently separated inline on a 15 cm C18 column (75 μm × 15 cm, Magic 5 μm 200 Å C18, Michrom BioResources Inc., Sacramento, CA, U.S.A.). The mobile phase consisted of water/acetonitrile (98:2 (v/v)) with 0.2% formic acid (solvent A) or acetonitrile/water (95:5 (v/v)) with 0.2% formic acid (solvent B). A linear 60 min gradient from 2% to 35% solvent B was used to elute peptides. A flow rate of 200 nL/min was used. Data-dependent acquisition was performed with the MS scan mass window set at 350−1500 m/z, and the top three ions with charge states 2−4 were selected for fragmentation. Dynamic precursor ion exclusion was used with a duration of 60 s.

Proteins were identified by Mascot (Version 2.2.0, Matrix Science, Boston, MA, U.S.A.) search using a custom database consisting of “Sus scrofa” sequences downloaded from UniProt (8313 entries) and common contaminants from ftp://ftp.thegpm.org/fasta/cRAP. One tryptic miscleavage was allowed. Peptide charge states were set at 2+ and 3+. Peptide and MS2 mass tolerances were 0.3 Da. Cysteine carbamidomethylation and methionine oxidation were set as variable modifications. Peptide and protein hits were validated in Scaffold software (version 3). Probability thresholds of 95% were selected for peptides and proteins, with a minimum of two peptides assigned per protein. Analyses were done and data were provided by Noora Jaakkola.


2.3. Yeast Data

The yeast data set was obtained from Pavelka et al.12 It consists of LTQ data from four biological replicates of the BY4741 yeast strain sampled at two different phases of cell growth, namely logarithmic and stationary, totaling eight samples (Table 1). Protein mixtures were TCA precipitated, urea denatured, reduced, alkylated, and digested with Lys-C followed by trypsin. Peptide mixtures were separated by reverse phase and strong cation exchange chromatography columns in 12-step MudPIT runs. MS2 spectra were obtained with a Finnigan LTQ mass spectrometer (Thermo Electron, San Jose, CA, U.S.A.) coupled to an Agilent HP1100 quaternary pump (Agilent Technologies, Palo Alto, CA, U.S.A.). The five most intense ions from each full MS scan, with a mass window of 400−1600 m/z, were selected for fragmentation by collision induced dissociation using data-dependent acquisition. Proteins were identified by database search using SEQUEST software. The list of search parameters is included in the supplementary methods of the original publication.

2.4. UPS1 Spiking

Universal Proteomics Standard Set (UPS1, Sigma-Aldrich, St. Louis, MO, U.S.A.) was dissolved in in-solution digestion buffer, reduced and alkylated, and digested with trypsin. After digestion, the peptide mixture was desalted using C18 pipet tips, evaporated to dryness, and resuspended in 0.1% formic acid. The digested UPS1 mixture was spiked into a yeast proteome digest, provided by Dr. Tiina Pakula, VTT Technical Research Centre of Finland, to create the following UPS1 concentrations: 2, 4, 10, 25, and 50 fmol/μL. The amount of yeast peptides per injection was 100 ng. Three runs per spiked concentration were analyzed on an LTQ Orbitrap Velos mass spectrometer (Thermo Fisher Scientific, Waltham, MA, U.S.A.) coupled to an EASY-nLC nanoflow liquid chromatography system (Thermo Fisher Scientific, Waltham, MA, U.S.A.). The length of the LC gradient was 110 min and the flow rate was 300 nL/min. Peptides were separated on an in-house built C18 analytical column and ionized by ESI. Data-dependent analysis was used, with the top 20 ions selected for fragmentation by CID. The mass scan range was 300−2000 m/z. Dynamic exclusion was enabled using the following settings: repeat count 1, repeat duration 30, exclusion list size 500, exclusion list duration 60. Raw data files were submitted to protein database search (UniProt KB/SwissProt release 2011_03, 525997 entries, with UPS protein sequences appended) using the Mascot search algorithm (Version 2.2.6) in Proteome Discoverer software (Version 1.2, Thermo Fisher Scientific, Waltham, MA, U.S.A.). Precursor and fragment mass tolerances were 5 ppm and 0.5 Da, respectively. Methionine oxidation was selected as a dynamic modification and cysteine carbamidomethylation as a fixed modification. One tryptic miscleavage was allowed. A Mascot score corresponding to 95% identification probability was used as a cutoff for peptide and protein identifications.

Table 1. Data Sets Used in the Study

                      rat               pig               yeast     yeast + UPS1
instrument            Q-TOF             Q-TOF             LTQ       LTQ-Orbitrap
replicates (1)        3 vs 3            3 vs 3            4 vs 4    5×3
identified proteins   340               643               1856      1493−1931 (2)
MS2 spectra (3)       6504              48061             229805    34219−42306 (2)
data analysis (4)     Mascot, Scaffold  Mascot, Scaffold  SEQUEST   Mascot

(1) Number of biological replicates or replicated MS runs (yeast + UPS1 data set). (2) Value range in pairwise UPS1 concentration comparisons. (3) Number of MS2 spectra used for protein identifications. (4) PeptideProphet and ProteinProphet validation algorithms were run in Scaffold software.

2.5. Statistical Analysis of Spectral Count Data

Spectral count data from all four data sets were used as input for four different data analysis algorithms: QSpec,8 Beta Binomial,13 PLGEM,12 and PepC.16 All of them work with plain text files that contain either tab-separated or comma-separated spectral count values, which the user must export and prepare beforehand. The algorithms produce statistical significance values for differential protein expression between two sample groups after analyzing a data set containing MS-identified proteins and their corresponding SCs. With PLGEM, normalized spectral abundance factors (NSAF)19 as well as unnormalized SCs were used as input. NSAF values were calculated using eq 1, where SpC is the spectral count for protein k and L is the length of protein k. Individual SCs are divided by the corresponding protein sequence lengths and by the sum of all such normalized values in the sample.

$$(\mathrm{NSAF})_k = \frac{(\mathrm{SpC}/L)_k}{\sum_{i=1}^{n} (\mathrm{SpC}/L)_i} \qquad (1)$$

QSpec analysis was also performed in two variations: with and without normalization of SCs. Normalization is done by dividing individual SCs by the sum of all counts in a sample. After analyzing the data, identified proteins were assigned a rank number based on the statistical significance of differential expression for the protein. For each identified protein, differences in rank numbers were calculated according to eq 2 for all 15 possible pairwise algorithmic combinations. In this formula, n is the number of identified proteins in the data set, and Ai and Bi are the ranks of protein i given by methods a and b. Identification of proteins and calculation of the corresponding MS2 spectral counts are done before this step, so all analysis methods operate with the same input data. This means that the two arrays A and B are of the same length and both contain rank numbers from the set {1, 2, ..., n}. Note that in A (or B) the same rank may appear more than once. The absolute values of the rank differences were summed to obtain a score Sa,b corresponding to the degree of similarity between the results of two methods. Thus, a value of zero would indicate perfect agreement. For a discussion of other rank list similarity metrics, see Boulesteix and Slawski.23
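As an illustration of the two normalizations used in this study, NSAF (eq 1) for PLGEM and total-count normalization for QSpec, the following minimal Python sketch may be helpful. It is not taken from any of the published tools; the function names `nsaf` and `total_count_normalize` and the example numbers are hypothetical.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factors for one sample (eq 1).

    spectral_counts[k] is SpC for protein k, lengths[k] is its sequence length L.
    """
    if len(spectral_counts) != len(lengths):
        raise ValueError("one length is needed per spectral count")
    # Length-normalize each count: (SpC / L)_k
    saf = [spc / length for spc, length in zip(spectral_counts, lengths)]
    total = sum(saf)
    if total == 0:
        raise ValueError("sample contains no spectra")
    # Divide by the sum of all length-normalized counts in the sample
    return [value / total for value in saf]


def total_count_normalize(spectral_counts):
    """Divide each spectral count by the sum of all counts in the sample."""
    total = sum(spectral_counts)
    if total == 0:
        raise ValueError("sample contains no spectra")
    return [spc / total for spc in spectral_counts]


# Hypothetical sample with three proteins
counts = [10, 20, 30]
lengths = [100, 400, 300]
nsaf_values = nsaf(counts, lengths)
normalized_counts = total_count_normalize(counts)
```

In this toy sample the second protein has twice the count of the first but four times the length, so its NSAF value is half that of the first protein; both normalized vectors sum to one within a sample.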

$$S_{a,b} = \sum_{i=1}^{n} \lvert A_i - B_i \rvert \qquad (2)$$
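A minimal Python sketch of the rank assignment and of eq 2 is given here for illustration. The text does not specify how tied significance values were ranked, so letting tied proteins share the lowest available rank is an assumption, as are the function names and the example p values. For methods that report Bayes factors rather than p values, the scores would first need to be oriented so that smaller means more significant.

```python
def rank_by_significance(p_values):
    """Rank proteins so that rank 1 is the most significant (smallest p value).

    Assumption: tied p values share the lowest rank of the tie, so the same
    rank may appear more than once, as noted in the text.
    """
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    ranks = [0] * len(p_values)
    for position, index in enumerate(order):
        previous = order[position - 1] if position > 0 else None
        if previous is not None and p_values[index] == p_values[previous]:
            ranks[index] = ranks[previous]  # tie: reuse the rank
        else:
            ranks[index] = position + 1
    return ranks


def rank_difference_sum(ranks_a, ranks_b):
    """S_{a,b} of eq 2: sum of absolute rank differences over all proteins."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both methods must rank the same proteins")
    return sum(abs(a - b) for a, b in zip(ranks_a, ranks_b))


# Hypothetical p values for four proteins from two methods a and b
p_a = [0.001, 0.20, 0.03, 0.03]
p_b = [0.010, 0.02, 0.50, 0.04]
ranks_a = rank_by_significance(p_a)  # [1, 4, 2, 2]
ranks_b = rank_by_significance(p_b)  # [1, 2, 4, 3]
s_ab = rank_difference_sum(ranks_a, ranks_b)  # 0 + 2 + 2 + 1 = 5
```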

To compare the aforementioned Sa,b values against values expected for two methods having no systematic agreement, we randomly permuted the rank values of proteins and calculated the Sa,b values according to eq 2. This operation was iterated 100 000 times to assess whether the observed Sa,b value indicates significant agreement between the two methods. The generated Sa,b values were sorted, and if the value of the sorted list at the low 0.05 one-tailed significance limit was larger than the actual value between the methods, then the agreement was deemed significant. The similarity of protein rank lists between different analysis method pairs was also assessed by calculating Spearman’s correlation coefficients for the set of all rank numbers. Also, rank number scatter plots and heat maps were generated for visualization.

The relationship of data set size and pairwise method agreement was assessed in a controlled way using a series of reduced data sets obtained by randomly removing peptide spectra from the yeast data of ref 12.

From the yeast + UPS1 data, we compared the numbers of UPS1 proteins detected by different analysis methods to be significantly altered in quantity between different UPS1 concentrations. We also examined how sensitive and specific the algorithms were in detecting changed abundance of the proteins. For this, we consider the detected UPS1 proteins as true positives due to their differential spiking amounts, and the detected background yeast proteins as false positives because the yeast background is fixed for all the samples. Then, the sensitivity of the method is defined as the fraction of detected UPS1 proteins from the set of all UPS1 proteins. Specificity is likewise defined as the fraction of undetected proteins from the set of non-UPS1 proteins (the background data). These computations were done for multiple p values and Bayes factors, more specifically for the set of statistical significance values corresponding to the range of all possible sensitivity values (1−47 detected UPS1 proteins). By this, we wanted to evaluate how the analysis methods perform with varying UPS1 fold changes and absolute UPS1 amounts.
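The sensitivity and specificity definitions above can be illustrated with a short Python sketch. The function name, the boolean `is_spiked` labels, and the example values are hypothetical; a protein is treated as detected when its p value is at or below the chosen threshold, which is an assumption about how the cutoff was applied.

```python
def sensitivity_specificity(p_values, is_spiked, threshold):
    """Return (sensitivity, specificity) at one significance threshold.

    is_spiked[i] is True for UPS1 proteins (true changes) and False for
    background yeast proteins (no change expected).
    """
    tp = fn = fp = tn = 0
    for p, spiked in zip(p_values, is_spiked):
        detected = p <= threshold
        if spiked and detected:
            tp += 1  # UPS1 protein reported as changed
        elif spiked:
            fn += 1  # UPS1 protein missed
        elif detected:
            fp += 1  # background protein reported as changed
        else:
            tn += 1  # background protein correctly left undetected
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both spiked and background proteins are required")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: three spiked and three background proteins
p_values = [0.001, 0.04, 0.30, 0.02, 0.60, 0.90]
is_spiked = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(p_values, is_spiked, threshold=0.05)
```

Sweeping `threshold` over the observed significance values gives one sensitivity and specificity pair per cutoff, which is the kind of series the comparison across UPS1 concentrations relies on.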

Table 2. Sums of Absolute Differences of Protein Rank Values When Calculated for All Pairwise Analysis Methods (a)

comparison                       rat       pig       yeast
QSpec - QSpec (N)              6 881    25 844     369 122
QSpec - Beta Binomial         20 279    55 097     542 859
QSpec - PLGEM SC              19 725    43 722     691 219
QSpec - PLGEM NSAF            21 472    70 336     608 025
QSpec - PepC                  23 512    56 419     528 808
QSpec (N) - Beta Binomial     19 914    46 238     515 287
QSpec (N) - PLGEM SC          20 249    52 558     700 047
QSpec (N) - PLGEM NSAF        21 378    57 908     575 041
QSpec (N) - PepC              24 244    65 218     548 395
Beta Binomial - PLGEM SC      15 103    67 458     916 479
Beta Binomial - PLGEM NSAF    11 868    54 913     585 844
Beta Binomial - PepC          12 811    43 438     348 134
PLGEM SC - PLGEM NSAF         17 159    77 856     658 757
PLGEM SC - PepC               18 250    64 703     885 359
PLGEM NSAF - PepC             16 014    76 313     610 814
critical sum (P = 0.05)       36 344   154 630   1 120 256

(a) The smaller the sum is, the better the consensus between the methods. The critical sum is the value of the sum of absolute rank differences corresponding to a one-tailed p value of 0.05 for random permutations.
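A critical sum of the kind reported in Table 2 could be estimated along the following lines. This is a sketch under stated assumptions rather than the authors' implementation: the function names, the fixed seed, and the default of 10 000 iterations are illustrative (the study used 100 000 permutations), and the quantile is taken as the value at index alpha × iterations of the sorted null sums.

```python
import random


def rank_difference_sum(ranks_a, ranks_b):
    """S_{a,b} of eq 2: sum of absolute rank differences."""
    return sum(abs(a - b) for a, b in zip(ranks_a, ranks_b))


def critical_sum(ranks_a, ranks_b, iterations=10000, alpha=0.05, seed=0):
    """Lower-tail critical value of S_{a,b} under random permutation of ranks."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    shuffled = list(ranks_b)
    null_sums = []
    for _ in range(iterations):
        rng.shuffle(shuffled)  # break any association between the two lists
        null_sums.append(rank_difference_sum(ranks_a, shuffled))
    null_sums.sort()
    return null_sums[int(alpha * iterations)]


def agreement_is_significant(ranks_a, ranks_b, iterations=10000, alpha=0.05, seed=0):
    """Agreement is significant when the observed sum is below the critical sum."""
    observed = rank_difference_sum(ranks_a, ranks_b)
    return observed < critical_sum(ranks_a, ranks_b, iterations, alpha, seed)


# Hypothetical rank lists for 50 proteins
ranks = list(range(1, 51))
same = agreement_is_significant(ranks, ranks, iterations=2000)
opposite = agreement_is_significant(ranks, ranks[::-1], iterations=2000)
```

Identical rank lists give an observed sum of zero and are flagged as agreeing, whereas fully reversed lists give the maximum possible sum and are not.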

3. RESULTS

3.1. Comparison of Analysis Methods in Ranking of Differential Expression of Proteins

A summary of the data sets and their characteristics is shown in Table 1. Protein rank numbers, based on statistical significance of differential expression, were calculated, and the sums of absolute differences of the ranks over all proteins were calculated by eq 2 for all pairwise method comparisons. The significance of the agreement between two methods was determined by randomly permuting the order of proteins and their rank numbers and calculating the rank difference sum, Sa,b, as described above. The observed Sa,b values from the actual analysis method comparisons were below the critical sums in all comparisons, indicating that the agreement of the protein rank lists cannot be explained by random effects only (Table 2). The agreement of the analysis methods was also assessed by correlation analysis of protein rank numbers (Table 3). For the rat brain data, the lowest correlation was between QSpec and PepC results, whereas the results for normalized and unnormalized QSpec were most similar. For the pig data, the lowest correlation was surprisingly between the two PLGEM versions, whereas the QSpec versions were most similar, with very high correlation. Overall, the pairwise correlations and sums of rank differences are mostly average, and one might expect better consensus between the methods. There are a few comparisons in the yeast data, the largest data set, where the correlation between two methods is exceptionally poor. These are the comparisons PLGEM SC - Beta Binomial and PLGEM SC - PepC. For this particular data set, PLGEM SC gives the most different results compared to the others.

For the heat maps, the proteins were sorted by their average rank given by the six different methods (Figure 1). The heat maps show visually how much the tested methods differ individually from the general consensus and in what part of the rank spectrum this happens. Here, differences among the highest ranks, that is, the proteins that are most significantly changed in abundance, are of special interest. Differentially expressed proteins (red) should not appear at the bottom of the list, but this is exactly the case with PLGEM SC in the yeast data, which reflects the previous observations made by analyzing the pairwise correlations. In the case of the rat data (Figure 1a), Beta Binomial gives the best overall spectrum. The methods that differ the most from the consensus in the high rank region are Beta Binomial, PLGEM NSAF, and PepC. In the pig data (Figure 1b), the most differing methods are PLGEM NSAF and PepC. In the yeast data (Figure 1c), the methods differing most from the consensus are both PLGEM variants.

The pairwise comparisons having the best and worst overall correlations were also visualized using scatter plots (Figures 2, 3, 4). Scatter plots for all of the possible combinations are found in the Supporting Information (Figures S1, S2, and S3).

Table 3. Correlation Coefficients of Protein Rank Numbers for Different SC Data Analysis Methods (a)

comparison                     rat     pig   yeast
QSpec - QSpec (N)             0.96    0.94    0.78
QSpec - Beta Binomial         0.71    0.79    0.69
QSpec - PLGEM SC              0.70    0.87    0.57
QSpec - PLGEM NSAF            0.69    0.70    0.67
QSpec - PepC                  0.64    0.79    0.71
QSpec (N) - Beta Binomial     0.71    0.83    0.71
QSpec (N) - PLGEM SC          0.70    0.83    0.56
QSpec (N) - PLGEM NSAF        0.69    0.79    0.70
QSpec (N) - PepC              0.61    0.73    0.69
Beta Binomial - PLGEM SC      0.79    0.73    0.34
Beta Binomial - PLGEM NSAF    0.88    0.81    0.65
Beta Binomial - PepC          0.88    0.88    0.87
PLGEM SC - PLGEM NSAF         0.77    0.65    0.67
PLGEM SC - PepC               0.75    0.74    0.40
PLGEM NSAF - PepC             0.83    0.66    0.68

(a) Proteins were ranked by their statistical significance of differential expression.
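Rank correlations such as those in Table 3 can be computed as the Pearson correlation of rank-transformed values. The sketch below uses only the Python standard library; the function names are hypothetical, and assigning tied values their average rank is the textbook convention, which is an assumption here because the text does not state how ties were handled.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from smallest to largest; ties receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value is tied with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical rank lists from two methods for five proteins
rho = spearman([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```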

dx.doi.org/10.1021/pr401096z | J. Proteome Res. 2014, 13, 1957−1968

Journal of Proteome Research

Article

Figure 1. Heat maps of protein rank number consensus for (a) rat data, (b) pig data, and (c) yeast data. Proteins were sorted by their averaged rank number of differential expression calculated from the ranking results given by the six different spectral count data analysis methods. Color gradation shows the ranking consensus between the individual analysis method results where red indicates the highest ranks, that is, proteins with the most significant expression change.

Figure 2. Protein rank number scatter plots for pairwise comparisons where the overall correlation is highest or lowest among all the comparisons (rat brain data).


Figure 3. Protein rank number scatter plots for pairwise comparisons where the overall correlation is highest or lowest among all the comparisons (pig data).

Figure 4. Protein rank number scatter plots for pairwise comparisons where the overall correlation is highest or lowest among all the comparisons (yeast data).

Figure 5. Combined sets of the top 25 abundance-changed proteins as reported by the analysis methods (rat data, 57 proteins; pig data, 68 proteins; yeast data, 69 proteins). Proteins are grouped according to the number of methods that have them in their list of top 25 abundance changed proteins. For example, in the yeast data all 6 methods found 9 common proteins among the 25 most changed proteins and 10 common proteins by at least 5 of the methods.
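The grouping used for Figure 5 can be sketched as follows, assuming that each method yields one rank per protein, with rank 1 being the most significantly changed. The method names, protein identifiers, and the cutoff of 2 in the toy example are hypothetical; ties at the cutoff are broken arbitrarily here, which is a simplification.

```python
from collections import Counter


def top_n_agreement(rankings, n=25):
    """Count, for each protein, how many methods place it among their top n.

    rankings maps a method name to a dict of protein -> rank (1 = most changed).
    Returns the per-protein counts and the number of proteins per count.
    """
    per_protein = Counter()
    for ranks in rankings.values():
        top = sorted(ranks, key=ranks.get)[:n]  # n best-ranked proteins of this method
        per_protein.update(top)
    group_sizes = Counter(per_protein.values())
    return per_protein, group_sizes


# Toy example with three hypothetical methods and a cutoff of 2.
rankings = {
    "method_A": {"P1": 1, "P2": 2, "P3": 3, "P4": 4},
    "method_B": {"P1": 1, "P3": 2, "P2": 3, "P4": 4},
    "method_C": {"P4": 1, "P1": 2, "P2": 3, "P3": 4},
}
per_protein, group_sizes = top_n_agreement(rankings, n=2)
print(len(per_protein), dict(group_sizes))
```

The size of the combined set (here the number of keys in `per_protein`) exceeds the cutoff whenever the methods disagree, which is why the combined top 25 lists contain 57 to 69 proteins.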

The scatter plots were created by plotting protein ranks from two different methods in an (x, y) coordinate system. Because our data have shared ranks, one dot can represent multiple proteins if both methods give them the same rank. Each data point is plotted as partially transparent, so multiple points in the same position become increasingly dark. In an optimal case where two different methods rank proteins equally, the resulting graph would be a diagonal line starting at the origin. This is almost the case for QSpec and QSpec (N). The horizontal and vertical lines in these plots originate from cases where multiple proteins


Figure 6. Correlation coefficients of the five highest-correlating methods calculated for reduced yeast data. Peptide spectra were randomly removed resulting in simulated data sets containing 80, 60, 40, and 20% of total spectra from the original yeast data set of ref 12.
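The data reduction behind Figure 6 can be approximated by a short sketch. It assumes that the input is a flat list with one protein identifier per identified peptide spectrum and that spectra are removed uniformly at random without replacement; how replicates and sample groups were handled in the original analysis is not specified here, so this is only an illustration. The function name and protein identifiers are hypothetical.

```python
import random
from collections import Counter


def subsample_spectral_counts(spectra, fraction, seed=0):
    """Keep a random fraction of spectra and recount them per protein.

    spectra is a list with one protein identifier per identified spectrum.
    """
    rng = random.Random(seed)
    n_keep = round(len(spectra) * fraction)
    kept = rng.sample(spectra, n_keep)  # sampling without replacement
    return Counter(kept)


# Toy data set: 100 spectra assigned to three hypothetical proteins.
spectra = ["protA"] * 50 + ["protB"] * 30 + ["protC"] * 20
reduced = {f: subsample_spectral_counts(spectra, f) for f in (0.8, 0.6, 0.4, 0.2)}
for fraction, counts in reduced.items():
    print(fraction, dict(counts))
```

Each reduced count table would then be passed to the SC analysis methods, and the resulting rank lists compared as for the full data set.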

Figure 7. (A) Number of UPS proteins (in total 48) detected as significantly changed in abundance by the SC analysis methods in pairwise comparisons of UPS concentrations. (B) Number of UPS peptide spectra summed over all replicates for the 2−4 fmol and 25−50 fmol samples.

are given the same rank by one method but different ranks by the other. For all of the data sets, the plots recapitulate the information contained in the protein rank number correlation coefficients; for many of the comparisons, the correlation is weak. On the rat and pig data sets, the distinction between the best and the worst case is easy to make, but on the yeast data, one must focus on the highest ranks, that is, on the lower left corner. The large yeast data set also reveals that a vast number of proteins are defined as equally differentially expressed by one method but not by the other, hence the vertical or horizontal lines.

Given the nature of the measurement method (MS) and the fact that small perturbations are likely to occur in biological systems, we next focused our attention on the highest ranking proteins. This was done by generating tables with combined lists of the top 25 proteins ranked by the different analysis methods and calculating, for each protein, how many of the methods had it in their list of the top 25 proteins (Figure 5). The total number of different proteins is thus larger than 25 because the methods are not in full agreement. In an optimal case, the same 25 proteins would be reported by all of the methods; having only a fraction of the proteins shared by all lists is not desirable, but it is expected when more methods are added to the comparison. In all data sets, the biggest fraction consists of proteins assigned to the top 25 by only one method. Less than half of the proteins were shared between three or more methods. The distributions are quite similar for all the data sets; however, the yeast data is characterized by a slightly larger fraction of proteins shared between all six methods in the top 25. The results show that for some of the most significantly abundance-changed proteins, there is good agreement between the methods. The combined top 25 proteins are listed in Supporting Information (Table S1). Selection of the top 25 as the cutoff point was arbitrary, and the results vary slightly when the value is changed. This approach is used not only to show what the results look like for the proteins with the largest change in abundance but also for practical reasons, as the selected SC analysis methods do not share a common statistical framework; Beta Binomial and PLGEM report p values, whereas QSpec calculates Bayes factors. We made a similar analysis for the frequentist methods alone, that is, Beta Binomial, PLGEM SC, and PLGEM NSAF, using all proteins with statistically significant differential expression. These results show that the methods clearly have the highest agreement in the pig data set; see Supporting Information (Figures S4, S5, and S6).

To understand whether the amount of data (productive MS2 sequencing cycles per unit of time) influenced the SC analysis performance, we randomly removed peptide spectra from the yeast data of Pavelka et al.12 and carried out a series of method agreement analyses using progressively smaller data sets. Data reduction changed the pairwise analysis method agreement in varying ways (Figure 6), but no clear trend was evident. For the highest amount of data reduction, most pairwise comparisons had lower rank correlation coefficients than with the full data set, but the differences were small. Furthermore, unexpectedly, the highest correlations were often seen in the simulated data sets of intermediate size.

3.2. Comparison of Analysis Methods in Detecting Differences in Spiked Protein Amounts

The performance of the SC analysis methods was also evaluated by spiking different amounts of UPS proteins into yeast protein samples. Figure 7a shows how the number of UPS proteins with statistically significant differential expression increases for the different methods as the fold change of the spiked proteins gets


Figure 8. Number of UPS proteins detected by QSpec as changed in abundance when using different Bayes factor (BF) as a threshold.

Figure 9. Relationship of spectral counts (SC) for UPS proteins and statistical significance of abundance change for individual UPS proteins as assigned by the spectral count data analysis methods. UPS proteins were spiked into a yeast protein digest at two different concentrations, 10 and 50 fmol/μL. Spectral counts for individual UPS proteins were summed from six LC/MS2 runs.

larger. Keeping the fold change constant but increasing the absolute concentration produced a similar effect (25−50 fmol as opposed to 2−4 fmol). This is most likely due to the larger number of UPS peptide spectra, which increases the signal-to-noise ratio of the SC differences (Figure 7b). It is notable that Beta Binomial detects only 1 and 5 UPS proteins in the fold change 2 cases, whereas for fold change 5, its performance is quite similar to that of the other two methods. Finally, PLGEM performs more evenly across the different cases than the other methods. Because of the nontrivial statistical relationship between Bayes factors and p values, differential UPS protein detection statistics for QSpec and QSpec (N) are presented in a separate chart in Figure 8. Four different cutoff values for Bayes factors were used to show the relationship between differential UPS protein detection and UPS fold changes. As in Figure 7, the number of UPS proteins detected as changed increases with the fold change or the absolute concentration. Using normalization in QSpec resulted in more UPS proteins being detected as changed in abundance, with the exception of the 2−4 fmol comparison.

The relationship between UPS protein spectral counts and differential detection statistics was also examined at the individual protein level. For UPS proteins detected in the 10−50 fmol comparisons, we plotted the total SCs for each protein across replicate samples against the statistical significance value assigned by the analysis methods (Figure 9). There was a clear trend of higher statistical significance for proteins having higher SCs. However, there was no correlation between these variables for PLGEM NSAF, due to the protein sequence length


Figure 10. Receiver operating characteristic (ROC) curves for various spiked-in concentrations of UPS proteins.
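How a ROC curve and its area under the curve (AUC) can be obtained from protein-level scores is sketched below. The sketch assumes that each protein has a score for which higher means more significant (for example, 1 minus the p value), that spiked UPS proteins are the true positives and background yeast proteins the true negatives, and that both classes are present. The function names and toy values are hypothetical and do not reproduce the data behind Figure 10.

```python
def roc_points(scores, is_spiked):
    """Return (false positive rate, true positive rate) points for all thresholds."""
    pairs = sorted(zip(scores, is_spiked), key=lambda pair: -pair[0])
    n_pos = sum(1 for _, spiked in pairs if spiked)
    n_neg = len(pairs) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Process all proteins sharing this score together so ties form one step.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy example: three spiked proteins and three background proteins.
scores = [0.99, 0.95, 0.90, 0.60, 0.40, 0.10]
is_spiked = [True, True, False, True, False, False]
auc = area_under_curve(roc_points(scores, is_spiked))
print(round(auc, 3))  # 0.889, i.e. 8 of the 9 spiked/background pairs are ordered correctly
```

An AUC close to 1 means that spiked proteins are consistently scored above background proteins, which is the sense in which a larger AUC indicates better sensitivity and specificity.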

normalization inherent in NSAF, which corrects for SC bias due to protein size differences. In the sensitivity and specificity analysis, the area under the curve (AUC) increases for all methods as the fold change increases (Figure 10). Overall, PLGEM NSAF shows the best performance in terms of sensitivity and specificity. The unnormalized QSpec deviates clearly from the rest of the methods in the 25−50 fmol comparison; although this happens only in this particular case, it highlights the need for normalization. For fold changes 2.5 and 5 (10−25 fmol and 10−50 fmol), all methods had both high sensitivity and specificity.

4. DISCUSSION

Protein quantification is a central part of modern biosciences and medicine. The development of MS-based proteomics methods has greatly improved the precision and throughput in this subfield of science. Extensive development is occurring particularly in the field of label-free quantitative proteomics, where two different approaches are being investigated: MS1-level precursor ion quantification and MS2-level spectral counting of fragment ion spectra. In this work, different statistical analysis methods for SC data were evaluated. The panel of analysis methods consisted of the algorithms Beta Binomial, PepC, PLGEM, and QSpec (the latter two with and without normalization; PLGEM NSAF, PLGEM SC, QSpec (N), QSpec).8,12,13,16 An older version of QSpec did not operate reliably; those tests were therefore rejected and rerun using an updated version. There are also other existing statistical methods that we omitted from this work, such as QuasiTel, an approach based on quasi-likelihood modeling.15 It only reads output from protein assemblies created by IDPicker. Methods such as SASPECT24 require additional data besides spectral counts and general peptide information. These methods are, therefore, not applicable in the present study. In addition, there are some published methods that do not have readily working programs or scripts available, such as Spectral Index (SpI),11 the Bayesian mixture model,18 and resampling-based significance analysis (ReSASC).17 It is possible to add more methods to this type of study, but the number of pairwise comparisons would eventually become too large for our purposes, that is, to show the issues that popular spectral counting methods currently have. One possibility for further studies is to implement all of the proposed algorithms and do a large-scale survey of the SC analysis methods.

As a novel approach, we studied how similar the results produced by the different algorithms are, given the same biological input data. A good agreement of results from the chosen algorithms would provide a means for cross-validation of analysis method performance, which is particularly important in the case of complex biological samples where protein concentration changes between two sample groups are unknown. For evaluation of the level of agreement, we used three data sets: an epileptogenic rat brain data set, a pig wound tissue data set, and a comparative growth phase yeast data set. Each of the chosen algorithms produces a set of statistical significance values of differential expression between two sample groups for all identified proteins. Proteins were ranked according to these values, and these protein rank values were used for analyzing the agreement of the method results. We did not evaluate the effects of using multiple search engines and prefiltering methods for upstream data processing before statistical analysis of spectral


count data but instead restricted our focus to the effects of the data analysis algorithms, in isolation from other factors. This was done to see how much the algorithm choice in itself impacts the results and to keep the complexity of the experimental design within reasonable bounds. We showed that for all data sets, the agreement between pairwise method comparisons was significantly higher than what was obtained from simulated comparisons using randomly permuted protein ranks (Table 2). However, correlation analysis revealed intra- and interdata set variability in pairwise method agreement, that is, variability between different method comparisons for the same data set, as well as between data sets for the same method comparisons (Table 3). Because the rat data set has a substantially lower number of total spectra than the others, one might expect its cross-validated results to be inferior, but this was not the case; instead, we find the largest variation in the yeast data, where the number of total spectra was largest. The sum of absolute values of rank differences and the correlation coefficient of rank numbers are composite expressions that do not reveal all aspects of the underlying data. From the heat maps (Figure 1) and rank number scatter plots (Figures 2, 3, and 4), it is seen that in some test cases the methods seem to agree reasonably well in the highest rank region, and the agreement gets more random toward the lowest ranks. Regarding cross-validation, this would be desirable because the high rank proteins are by definition the relevant ones when analyzing protein expression changes. However, we could not confirm this observation statistically; correlation coefficients from subsets of high rank proteins were not higher than coefficients calculated from all rank values (data not shown).

Although the agreement of the methods on the high rank proteins as a whole did not turn out to be as good as desired, a number of high-confidence proteins could be distinguished as a result of being ranked among the top abundance-changed proteins (here, top 25) by many or all methods (Figure 5). Results are better when examining only a couple of the methods at a time, but we also wanted to investigate all methods together instead of only pairwise comparisons. It is perhaps reasonable to assume that being ranked high by most methods increases the likelihood that a protein is truly differentially expressed. This type of cross-validation, we argue, might be a good approach to spectral counting workflows, that is, to use several analysis methods to generate a list of putative differentially expressed proteins, which can subsequently be followed up using other experimental methods, such as immunoassays, Western blotting, or targeted MS. It is also possible to fine-tune the properties of new spectral counting methods to perform well on specific data sets, certain equipment, and other variables in the quantification process. These details could be extensively researched for all existing methods and gathered into a definitive list. However, such a valuable list would also be somewhat impractical to maintain and follow when performing experiments. Therefore, we suggest that new spectral counting methods should be cross-validated more thoroughly and that they should also be tested using more data sets.

Large-scale validation of protein abundance changes as reported by SC analysis methods is problematic, as the number of identified proteins in complex biological data sets is generally in the range of hundreds to thousands. This is why the cross-validation approach was chosen as the most feasible option. In addition, we performed controlled experiments where a fraction of proteins in a complex proteome digest were of known quantity. With these experiments, we could compare the quantification results of spectral counting to the known protein abundance changes between samples. For this purpose, a mixture of 48 human proteins (Sigma Universal Proteomics Standard Set) was spiked into a complex yeast proteome digest in five different concentrations. In our experiments, PLGEM outperformed the other methods in detecting differential abundance of UPS proteins (Figure 7a). The difference was substantial particularly in the smaller fold change comparisons. Also, when NSAF values were used for PLGEM instead of unnormalized spectral counts, the sensitivity of detection was higher, in contrast to the findings of Lundgren et al.20 Like the other methods, QSpec detected more abundance-changed proteins as the fold change or absolute concentration increased (Figure 8). However, because of the nontrivial statistical relationship between p values and Bayes factors, comparison of Figures 7a and 8 is not straightforward. All methods reported some background yeast proteins to be significantly altered in abundance. This observation, together with the different statistical logic employed by QSpec compared to the other methods, motivated us to carry out sensitivity and specificity analyses (Figure 10). The reason for the reportedly altered yeast proteins is likely competition with the spiked UPS proteins for detection by LC-MS2. Raising the amount of UPS proteins results in fewer yeast peptides being detected, which the analysis algorithms interpret as a genuine change in abundance, even if the amount of yeast peptides is in reality constant. Nevertheless, sensitivity and specificity were taken as good performance indicators of the analysis methods in discriminating between proteins having truly differential abundance and “false positive” background proteins.

In the performance ranking of the algorithms, these analyses replicated the results shown in Figure 7a; PLGEM NSAF performed best. Interestingly, the performance of unnormalized QSpec was exceptionally poor in the 25−50 fmol comparison, whereas QSpec (N) was almost comparable to the other methods. This implies that smaller fold changes are difficult to detect without normalization, even if the number of peptide spectra is high. In general, all of the methods perform well in the 2.5 and 5 fold change cases, showing high sensitivity and specificity. It is notable, however, that the spiked UPS amounts are quite large here and, thus, their changes are probably easier to detect than those of many low abundance proteins in biological samples. Also, if the practical lower limit of reliable quantification by spectral counting were around a 2.5-fold change, then many important changes would remain undetected. When examined at the individual protein level, it was seen that UPS proteins that had more SCs also had higher statistical significance of protein abundance changes, as assigned by the analysis methods (Figure 9). The SCs also correlated strongly (r = 0.82, data not shown) with the protein molecular weight. This implies that spectral counting is not equally suitable for quantifying proteins across the whole span of molecular sizes, even if the protein copy number is the same.



ASSOCIATED CONTENT

Supporting Information

Graphs of statistically significant differential proteins by data set and detailed listing of top 25 abundance changed proteins and scatterplots for all possible pairwise comparisons. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author

*G. L. Corthals. E-mail: garcor@utu.fi. Tel.: +358 2 333 8889. Fax: +358 2 2518808.

Present Address

#Van’t Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands. E-mail: [email protected].

Author Contributions

⊥These authors contributed equally.

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

We acknowledge the support and technical expertise provided by Turku Proteomics Facility and the Biocenter Finland Proteomics and Metabolomics infrastructure. We are grateful to Tiina Pakula and Susumu Imanishi for providing the yeast samples, and Noora Jaakkola for the pig data. This publication was partially supported through a grant to GLC from Nordforsk “NordiQ: Nordic Education Network for Quantitative Proteomics (070178)” and from the Academy of Finland (grant 128712).

ABBREVIATIONS

MS, mass spectrometry; PTM, post-translational modification; ICAT, isotope-coded affinity tag; iTRAQ, isobaric tag for relative and absolute quantitation; SILAC, stable isotope labeling by amino acids in cell culture; PLGEM, power-law global error model; NSAF, normalized spectral abundance factor; UPS, universal proteomics standard; IEF, isoelectric focusing; LC, liquid chromatography; ESI, electrospray ionization; HPLC, high-performance liquid chromatography; CID, collision-induced dissociation; SC, spectral count; FC, fold change

REFERENCES

(1) Tang, H.; Arnold, R. J.; Alves, P.; Xun, Z.; Clemmer, D. E.; Novotny, M. V.; Reilly, J. P.; Radivojac, P. A computational approach toward label-free protein quantification using predicted peptide detectability. Bioinformatics 2006, 22 (14), 481−88.
(2) Gygi, S. P.; Rist, B.; Gerber, S. A.; Turecek, F.; Gelb, M. H.; Aebersold, R. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat. Biotechnol. 1999, 17 (10), 994−9.
(3) Ross, P. L.; Huang, Y. N.; Marchese, J. N.; Williamson, B.; Parker, K.; Hattan, S.; Khainovski, N.; Pillai, S.; Dey, S.; Daniels, S.; et al. Multiplexed protein quantification in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell. Proteomics 2004, 3 (12), 1154−69.
(4) Ong, S. E.; Blagoev, B.; Kratchmarova, I.; Kristensen, D. B.; Steen, H.; Pandey, A.; Mann, M. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell. Proteomics 2002, 1 (5), 376−86.
(5) Gerber, S. A.; Rush, J.; Stemman, O.; Kirschner, M. W.; Gygi, S. P. Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS. Proc. Natl. Acad. Sci. U. S. A. 2003, 100 (12), 6940−45.
(6) Wang, W.; Zhou, H.; Lin, H.; Roy, S.; Shaler, T. A.; Hill, L. R.; Norton, S.; Kumar, P.; Anderle, M.; Becker, C. H. Quantification of proteins and metabolites by mass spectrometry without isotopic labeling or spiked standards. Anal. Chem. 2003, 75 (18), 4818−26.
(7) Liu, H.; Sadygov, R. G.; Yates, J. R. A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal. Chem. 2004, 76 (14), 4193−201.
(8) Choi, H.; Fermin, D.; Nesvizhskii, A. I. Significance analysis of spectral count data in label-free shotgun proteomics. Mol. Cell. Proteomics 2008, 7 (12), 2373−85.
(9) Griffin, N. M.; Yu, J.; Long, F.; Oh, P.; Shore, S.; Li, Y.; Koziol, J. A.; Schnitzer, J. E. Label-free, normalized quantification of complex mass spectrometry data for proteomic analysis. Nat. Biotechnol. 2010, 28 (1), 83−89.
(10) Colaert, N.; Gevaert, K.; Martens, L. RIBAR and xRIBAR: Methods for reproducible relative MS/MS-based label-free protein quantification. J. Proteome Res. 2011, 10 (7), 3183−89.
(11) Fu, X.; Gharib, S. A.; Green, P. S.; Aitken, M. L.; Frazer, D. A.; Park, D. R.; Vaisar, T.; Heinecke, J. W. Spectral index for assessment of differential protein expression in shotgun proteomics. J. Proteome Res. 2008, 7 (3), 845−54.
(12) Pavelka, N.; Fournier, M. L.; Swanson, S. K.; Pelizzola, M.; Ricciardi-Castagnoli, P.; Florens, L.; Washburn, M. P. Statistical similarities between transcriptomics and quantitative shotgun proteomics data. Mol. Cell. Proteomics 2008, 7 (4), 631−44.
(13) Pham, T. V.; Piersma, S. R.; Warmoes, M.; Jimenez, C. R. On the β-binomial model for analysis of spectral count data in label-free tandem mass spectrometry-based proteomics. Bioinformatics 2010, 26 (3), 363−69.
(14) Carvalho, P. C.; Fischer, J. S.; Chen, E. I.; Yates, J. R., 3rd; Barbosa, V. C. PatternLab for proteomics: a tool for differential shotgun proteomics. BMC Bioinformatics 2008, 9, 316.
(15) Li, M.; Gray, W.; Zhang, H.; Chung, C. H.; Billheimer, D.; Yarbrough, W. G.; Liebler, D. C.; Shyr, Y.; Slebos, R. J. Comparative shotgun proteomics using spectral count data and quasi-likelihood modeling. J. Proteome Res. 2010, 9, 4295−4305.
(16) Heinecke, N. L.; Pratt, B. S.; Vaisar, T.; Becker, L. PepC: proteomics software for identifying differentially expressed proteins based on spectral counting. Bioinformatics 2010, 26, 1574−75.
(17) Little, K. M.; Lee, J. K.; Ley, K. ReSASC: a resampling based algorithm to determine differential protein expression from spectral count data. Proteomics 2010, 10, 1212−22.
(18) Booth, J. G.; Eilertson, K. E.; Olinares, P. D.; Yu, H. A Bayesian mixture model for comparative spectral count data in shotgun proteomics. Mol. Cell. Proteomics 2011, 10 (8), M110.007203.
(19) Zybailov, B.; Mosley, A. L.; Sardiu, M. E.; Coleman, M. K.; Florens, L.; Washburn, M. P. Statistical analysis of membrane proteome expression changes in Saccharomyces cerevisiae. J. Proteome Res. 2006, 5 (9), 2339−47.
(20) Lundgren, D. H.; Hwang, S.; Wu, L.; Han, D. K. Role of spectral counting in quantitative proteomics. Expert Rev. Proteomics 2010, 7 (1), 39−53.
(21) Keller, A.; Nesvizhskii, A. I.; Kolker, E.; Aebersold, R. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal. Chem. 2002, 74 (20), 5383−92.
(22) Nesvizhskii, A. I.; Keller, A.; Kolker, E.; Aebersold, R. A statistical model for identifying proteins by tandem mass spectrometry. Anal. Chem. 2003, 75 (17), 4646−58.
(23) Boulesteix, A. L.; Slawski, M. Stability and aggregation of ranked gene lists. Briefings Bioinf. 2009, 10 (5), 556−68.
(24) Whiteaker, J. R.; Zhang, H.; Zhao, L.; Wang, P.; Kelly-Spratt, K. S.; Ivey, R. G.; Piening, B. D.; Feng, L.; Kasarda, E.; Gurley, K. E.; Eng, J. K.; Chodosh, L. A.; Kemp, C. J.; McIntosh, M. W.; Paulovich, A. G. Integrated Pipeline for Mass Spectrometry-Based Discovery and Confirmation of Biomarkers Demonstrated in a Mouse Model of Breast Cancer. J. Proteome Res. 2007, 6 (10), 3962−3975.