IsobariQ: Software for Isobaric Quantitative Proteomics using IPTL, iTRAQ, and TMT

Magnus Ø. Arntzen,†,‡,§ Christian J. Koehler,† Harald Barsnes,|,⊥ Frode S. Berven,| Achim Treumann,# and Bernd Thiede*,†

The Biotechnology Centre of Oslo, University of Oslo, Oslo, Norway; Proteomics Core Facility, Oslo University Hospital-Rikshospitalet & University of Oslo, 0027 Oslo, Norway; Proteomics Core Facility, Norwegian University of Life Sciences, 1432 Ås, Norway; Proteomics Unit, Department of Biomedicine, University of Bergen, Bergen, Norway; Computational Biology Unit, Uni BCCS, University of Bergen, Norway; and NEPAF, Devonshire Building, Newcastle upon Tyne, NE1 7RU, United Kingdom

Received October 1, 2010

Abstract: Isobaric peptide labeling plays an important role in relative quantitative comparisons of proteomes. Isobaric labeling techniques utilize MS/MS spectra for relative quantification, which can be either based on the relative intensities of reporter ions in the low mass region (iTRAQ and TMT) or on the relative intensities of quantification signatures throughout the spectrum due to isobaric peptide termini labeling (IPTL). Due to the increased quantitative information found in MS/MS fragment spectra generated by the recently developed IPTL approach, new software was required to extract the quantitative information. IsobariQ was specifically developed for this purpose; however, support for the reporter ion techniques iTRAQ and TMT is also included. In addition, to address recently emphasized issues about heterogeneity of variance in proteomics data sets, IsobariQ employs the statistical software package R and variance stabilizing normalization (VSN) algorithms available therein. Finally, the functionality of IsobariQ is validated with data sets of experiments using 6-plex TMT and IPTL. Notably, protein substrates resulting from cleavage by proteases can be identified as shown for caspase targets in apoptosis.

Keywords: bioinformatics • degradomics • IPTL • iTRAQ • mass spectrometry • proteomics • quantification • software • TMT • VSN

* To whom correspondence should be addressed. Bernd Thiede, The Biotechnology Centre of Oslo, University of Oslo, P.O. Box 1125 Blindern, N-0317 Oslo, Norway. Email: [email protected]. Tel: +47-22840533. Fax: +47-22840501.
† University of Oslo. ‡ Oslo University Hospital-Rikshospitalet & University of Oslo. § Norwegian University of Life Sciences. | Department of Biomedicine, University of Bergen. ⊥ Uni BCCS, University of Bergen. # NEPAF.
10.1021/pr1009977 © 2011 American Chemical Society

Introduction

MS-based proteomics has matured as a powerful tool for the identification and quantification of proteins and peptides. During the past decade, several different quantification techniques have been developed including stable isotope labeling of amino acids in cell culture (SILAC),1 tandem mass tagging (TMT),2 isobaric tags for relative and absolute quantification (iTRAQ)3 and isobaric peptide termini labeling (IPTL).4 SILAC enables peptides derived from different physiological conditions to be quantified at the MS level, while TMT, iTRAQ and IPTL yield isobaric peptides and thus require MS/MS data to reveal quantitative information for the peptides. SILAC has proven to be a stable and robust method for quantification,5 but in many cases, for example when studying human tissue or body fluids, metabolic labeling is not applicable and chemical labeling methods are therefore required. The clear advantage of the reporter ion methods iTRAQ and TMT is their ability to compare up to eight6 or six7 different physiological conditions within one experiment, respectively, without increasing the complexity at the MS level. The drawbacks of iTRAQ and TMT are their relatively high cost, systematic dampening,8 and the fact that the low mass region where the reporter ions appear is not accessible to all mass spectrometers.

IPTL is a recently developed duplex isobaric quantification method that is similar to the reporter ion techniques in that there is no increased complexity at the MS level. In contrast to reporter ion techniques, IPTL produces several quantification points per peptide that yield a robust and accurate calculation of protein abundances. In addition, IPTL is not hampered by the low mass cutoff seen in trapping type mass spectrometers, such as ion traps.

With the development of novel methodologies, software tools, both proprietary and open source, have been designed to address and exploit the different strengths of each technique.9 For the SILAC methodology, a commonly used free program suite is MaxQuant,10 which provides thorough preprocessing, sequence identification and quantification, all in an automated workflow.
In addition, MaxQuant presents advanced algorithms for data analysis and statistical tests, thus setting a standard for new proteomics software. For the iTRAQ methodology, free software like i-Tracker,11 Multi-Q12 and jTraqX13 exists for relative quantification, while VEMS14 can be used for iTRAQ and TMT. Recently, Karp and colleagues15 described the additive-multiplicative error model found in iTRAQ data, that is, the dependence of variance on the mean signal intensity. They propose using variance stabilizing normalization (VSN) to transform the signal intensities in such a way that the variance is stabilized over the entire intensity range to simplify downstream analysis. Variance heterogeneity appears as a wider spread of ratios in the low intensity region when compared to the high intensity region.10,16 This effect is not unique to iTRAQ or specific instruments but is also seen in other quantification techniques. To overcome this, the developers of MaxQuant suggested using intensity bins for calculating significance limits for up- and down-regulated proteins (significance B). The variance stabilization methods, on the other hand, aim to transform the distribution in such a way that the same limit can be applied across the whole intensity range.

Until recently, no software existed for the new quantification method IPTL. Peptides labeled with IPTL at opposite peptide termini were searched twice with the search engine Mascot17 using opposite fixed modifications and the two protein scores were then used to calculate the ratio. Although an almost linear relationship was found for simple protein digests between protein scores and abundance levels,4 a more accurate and robust method including application of statistics was needed for complex samples to take full advantage of IPTL. Mascot recently implemented IPTL as a quantification technique from version 2.3, but no free software exists. Here, we present IsobariQ, an open-source, free-of-charge quantification software that supports relative quantification of IPTL, iTRAQ, TMT and other reporter ion techniques (e.g., DiLeu18 and CILAT19). In addition, recently described variance stabilization algorithms are included in IsobariQ.

Journal of Proteome Research 2011, 10, 913–920. Published on Web 11/10/2010.

Experimentals/Overview of the Software

IsobariQ was developed in C++ for the Windows platform and released under the GNU General Public License version 3. IsobariQ can be downloaded free of charge for academic users at http://www.biotek.uio.no/research/thiede_group/software. To perform variance stabilizing normalization, IsobariQ requires the statistical language R and the server Rserve, which have to be installed separately. R and Rserve can be downloaded at http://www.r-project.org and http://www.rforge.net/Rserve, respectively. IsobariQ runs on all computers, but we recommend at least 4 GB of RAM for optimal performance and the 64-bit version for comprehensive data sets. Typically, IsobariQ quantifies 1000 proteins within 3 h, including the time for preprocessing of raw files, generation of merged peak list files, identification, and data validation. Algorithms for quantification implemented in IsobariQ use the following workflow: (a) peptide quantification, (b) peptide normalization, (c) protein quantification and (d) statistical tests.

Peptide Quantification, iTRAQ and TMT. Before quantification, the reporter ion intensities are first corrected for their isotopic overlap using a correction matrix previously described by Vaudel and colleagues.20 In brief, the original reporter ion intensities are stored in a matrix $M_{\text{measured}}$ when detected and the corrected intensities are calculated as follows:

$$M_{\text{corrected}} = C^{-1} \times M_{\text{measured}}$$

where $C$ is a correction matrix reflecting the purity information obtained with the reporter ion kit or from separate experimental data. The inverse is computed using LU decomposition (Gaussian elimination with partial pivoting21) followed by solving the system $Ax = b$ for each column of the identity matrix. All computations are performed using the GNU Scientific Library.22 Exemplified with purity information from iTRAQ 4-plex and intensity readings of 1000 counts for each of the reporters 114, 115, 116, and 117, this yields:

$$M_{\text{corrected}} = \left( 10^{-2} \begin{bmatrix} 92.9 & 2.0 & 0 & 0 \\ 5.9 & 92.3 & 3.0 & 0 \\ 0.2 & 5.6 & 92.4 & 4.0 \\ 0 & 0.1 & 4.5 & 92.4 \end{bmatrix} \right)^{-1} \times \begin{bmatrix} 1000 \\ 1000 \\ 1000 \\ 1000 \end{bmatrix} = \begin{bmatrix} 1055 \\ 984 \\ 976 \\ 1034 \end{bmatrix}$$

After the correction process, the different ratios are calculated using the corrected reporter ion intensities.

Peptide Quantification, IPTL. With the IPTL technique, several quantification points per MS/MS spectrum are used to calculate the ratio. In IPTL, all fragments in an MS/MS spectrum which include the peptide N- or C-terminus should be detected as pairs (unless the particular peptide is absent from one of the two samples), and thus IsobariQ needs to know the peptide sequence in order to calculate the correct ratio. For this reason, only the fragments that are assigned to the sequence can be used for calculating the peptide ratios. The peptide ratio is calculated as the median of the individual fragment ratios and the peptide variability as the average deviation calculated in log space:

$$\text{peptide variability} = \frac{\sum_{i=1}^{n} |x_i - M|}{n}$$

where $M$ is the median of the log-transformed fragment ratios, $x_i$ is the log-transformed ratio of fragment $i$, and $n$ is the number of fragment ratios. The variability describes the average distance from the median. The calculation is performed in log space to ensure equal treatment of distances on both sides of the median. In addition, we introduce the ratio angle as a new metric for describing the variability. This metric is calculated in log space and is the angle of a linear least-squares fitted line going through all the ratios when sorted from smallest to largest. This metric is more sensitive to variations than the average deviation and is intended to ease the finding of outlier ratios within a single MS/MS spectrum.

Peptide Ratio Normalization. The user can choose between three different types of normalization:
(a) Division by median. This normalization technique is probably the most common technique used for proteomics data, and is a linear shift in log space that ensures that the median of the log-transformed ratios is 0.
(b) Variance stabilizing normalization (VSN). This technique normalizes the data in a fashion that takes into account the heterogeneity of variance seen in proteomics data,15,16 that is, the phenomenon that the statistical spread of ratios is larger for low-intensity mass peaks than for high-intensity ones. VSN has previously been described for microarray data, and a software package, vsn,23 for the statistical language R24 is available through the Bioconductor project.25 IsobariQ performs VSN via interaction with R.
(c) Division by channel sum (reporter ions only). For each channel (reporter), the total intensity of this reporter ion in all peptides is calculated. All individual reporter ion intensities are then divided by this sum. This ensures that the overall labeling becomes 1:1:1:1 for 4-plex methods.

Protein Quantification. Once all the peptides are successfully quantified and normalized, the protein ratios are calculated as the median of the individual peptide ratios to minimize the effect of outliers. The variability of these ratios is estimated as the average deviation of reporter ion intensities in log space (reporter ions) or pooled standard deviation of all quantification points (IPTL). In IsobariQ, the user can select which peptides contribute to the calculation of protein values. Mascot assigns peptides to proteins following the Occam’s Razor principle,26 and in IsobariQ, the user can choose whether to include nonunique peptides when calculating the protein values, or to use only unique peptides. In addition, the user can limit the calculation to only include peptides assigned by Mascot to be “bold”. MS/MS spectra can be matched to several different peptides in the Mascot report; however, requiring “bold” typeface ensures that the quantification of an MS/MS spectrum contributes to the quantification of one single protein, the one with the highest score. Furthermore, manual alterations made by selecting or deselecting peptides within the protein override these general settings. Only peptides that are selected to contribute to the protein quantification are taken into account when the protein ratio and variability are calculated.

Statistical Tests. IsobariQ uses z-statistics, similar to what MaxQuant does for SILAC data,10 to assess the significance of a protein ratio. All calculations are performed in log space to ensure equal treatment of up- and down-regulated proteins.
Since the distribution of ratios is not expected to be Gaussian, but rather to contain up- and down-regulated proteins, we first calculate an upper and a lower standard deviation using percentiles:

$$SD_{\text{lower}} = M - \text{Percentile}_{15.87}$$

$$SD_{\text{upper}} = \text{Percentile}_{84.13} - M$$

where $M$ is the median of the normalized log-transformed protein ratios. Second, the z-score for every ratio is calculated using the corresponding standard deviation. For ratios lower than the median:

$$z = \frac{M - r}{SD_{\text{lower}}}$$

For ratios higher than the median:

$$z = \frac{r - M}{SD_{\text{upper}}}$$

where $M$ is the median of the normalized log-transformed protein ratios and $r$ is the normalized log-transformed ratio. The probability of regulation is calculated using the Gaussian probability function

$$\text{significance} = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-t^{2}/2} \, dt$$

which returns the upper tail of the distribution above $x$, with $x$ set to the z-score. For ratios below the median we ask: “What is the chance of obtaining a ratio this low or lower by chance alone?”, while for ratios above the median we ask the opposite: “What is the chance of obtaining a ratio this large or larger by chance alone?” Thus the test is one-sided for each question and the values obtained will be in the range from 0 to 0.5. The significance level to use as a cutoff is up to the user, but IsobariQ calculates the limits at α = 0.05 as follows:

$$\text{Lower limit}_{0.05} = M - 1.645 \times SD_{\text{lower}}$$

$$\text{Upper limit}_{0.05} = M + 1.645 \times SD_{\text{upper}}$$

where $M$ is the median of the normalized log-transformed protein ratios. The constant 1.645 is the z-score from the normal probability table that corresponds to a one-sided tail probability of 5%. For each protein, IsobariQ also provides a significance value that is corrected for multiple hypothesis testing by the Benjamini-Hochberg method.27
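The statistical test above can be illustrated with a minimal Python sketch. It is not IsobariQ's C++ code: the function names are hypothetical, the percentiles are taken by linear interpolation (the text does not say which percentile definition IsobariQ uses), and the input is assumed to be a list of already normalized, log-transformed protein ratios.

```python
import math
import statistics


def percentile(sorted_vals, p):
    """Percentile p (0..100) of an already sorted list, by linear interpolation.
    The interpolation rule is an assumption of this sketch."""
    k = (len(sorted_vals) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def upper_tail(z):
    """Upper tail P(Z >= z) of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def regulation_significance(log_ratios):
    """One-sided significance per ratio, using separate lower/upper SDs."""
    s = sorted(log_ratios)
    m = statistics.median(s)
    sd_lower = m - percentile(s, 15.87)
    sd_upper = percentile(s, 84.13) - m
    if sd_lower <= 0 or sd_upper <= 0:
        raise ValueError("ratio distribution has no spread on one side")
    result = []
    for r in log_ratios:
        # Ratios below the median use SD_lower, all others use SD_upper.
        z = (m - r) / sd_lower if r < m else (r - m) / sd_upper
        result.append(upper_tail(z))
    return result


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative values only, not data from the paper.
log_ratios = [-2.0, -1.0, 0.0, 1.0, 2.0]
p = regulation_significance(log_ratios)
p_corrected = benjamini_hochberg(p)
```

A ratio equal to the median gets z = 0 and therefore the maximum value of 0.5, which matches the stated range of 0 to 0.5.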

Results

The Workflow of IsobariQ. Mascot17 is the most commonly used search engine for mass spectrometry data and was therefore selected as the search engine to support in IsobariQ. A prerequisite for performing quantifications through IsobariQ is that the data have previously been searched with Mascot. The Mascot search results are imported into IsobariQ from the dat-files by using a wizard where the user selects the parameters, for example, the significance threshold that affects the loading of the data. At this point the user selects between the quantification methods IPTL, tryptic-IPTL, iTRAQ, TMT and a custom reporter ion technique, as well as between CID and ETD fragmentation. A part of the loading procedure includes filtering the Mascot results, for example to remove sequence hits with low ion scores. The filtering, however, has the most pronounced effect on IPTL data because IsobariQ constructs peptide pairs out of detected sequences with opposite IPTL labeling. The probability of finding by chance the same sequence twice in the same MS/MS spectrum with opposite IPTL labeling is low, and the ion score of such a pair is therefore recalculated. This affects the overall protein score, and the user can choose between standard and MudPIT scoring schemes. Once loaded, the proteins with their peptides are listed in a table where the user can explore the data in more detail (Figure 1). All proteins should be quantified at the same time in order for IsobariQ to compute the correct normalization factors and significances. The workflow after loading the data is therefore to “quantify all proteins”, which causes IsobariQ to perform several consecutive steps: (a) peptide quantification, (b) peptide normalization, (c) protein quantification and (d) statistical tests, all as described in detail above. After the quantification procedure, the user can examine the results in the table (Figure 1) or view more details in a specific module (Figure 2). Peptide ratios or (for IPTL) quant point ratios within single MS/MS spectra can be explicitly included in or excluded from the overall protein ratio calculation, which causes IsobariQ to recalculate everything on the fly. The results can be stored in an XML format, or exported to a spreadsheet application for further analysis. The XML format stores all the identifications, their quantitative information and the user’s interactions and enables the user to reload the data and perform further analyses at a later time.

Figure 1. IsobariQ main window showing all identified proteins, their respective peptides, scores, and quantitative information: ratio, normalized ratio, significance, Benjamini-Hochberg corrected significance, variability, CV and ratio count. In addition to the peptides, the retention time and raw file name are included to aid in pinpointing the location of the protein or peptide on the HPLC column and on the gel. Every peptide may have several quantification points (for IPTL) that could enable a robust and accurate protein ratio. The protein ratio is calculated as the median of the individual peptide ratios to minimize the effect of outliers. The results can be saved as XML or exported to a spreadsheet application for further analysis. The graphs in the lower right corner of the main window show the distribution of protein ratios as a histogram (red) and as a function of protein intensity (blue).

The Quantification Modules Q-IPTL and Q-Reporter. Detailed viewing of sequence assignments and their comparison with the spectra that gave rise to them is possible using the quantification modules Q-IPTL or Q-Reporter. When double-clicking a protein in IsobariQ, the protein with all detected peptides and their MS/MS spectra is loaded into either the Q-IPTL (Figure 2A) or the Q-Reporter module (Figure 2B). These modules are specific to their respective quantification method and contain three main parts. (i) The MS/MS spectrum: Annotated MS/MS spectra can be viewed and the user can browse through all the peptides for the given protein and look at all spectra and sequence assignments. (ii) A list of all suggested sequences for the spectra: Mascot usually suggests more than one hit for an MS/MS spectrum. The various hits can be tried and the identification of the peptide manually validated. (iii) Quantification information: When an MS/MS spectrum is quantified, detailed information about the ratios, the peptide variability, extreme ratios (possible outliers), reporter intensities and corrected intensity due to normalization is available. The user can interact with the data and select or deselect ratios which should contribute to the protein ratio. All changes cause IsobariQ to update the protein ratios on the fly.

The Graphs Module. The main window of IsobariQ displays the distribution of protein ratios as a function of protein intensity, as well as a histogram (Figure 3A). In addition to this graph, the Graphs Module in IsobariQ contains several types of graphs, for example “Quant Point Intensities” (Figure 3B) to visualize the relationship between the IPTL fragment ratios (quant points) between two physiological states, “Quant Point CV vs. Rank of Mean Intensity” (Figure 3C) to visualize how the variance depends on the intensities and the effect of VSN, and “Quant Point Ratio vs. Rank of Mean Intensity” (Figure 3D) to visualize the effect of VSN on ratios and that the statistical spread seen in low-intensity data is reduced after VSN. These and many more graphs are available in the Graphs Module.

Application of IsobariQ to Experimental TMT Data. To validate the use of IsobariQ, six aliquots of tryptic BSA peptides were labeled with the TMT 6-plex reagents (Thermo Scientific) and subsequently mixed in a 1:1:1:1:1:1 ratio. The generated mixture was analyzed by nano-LC/LTQ-Orbitrap XL operating in data-dependent mode with alternating CID and HCD fragmentation, essentially as described by Dayon et al.7 Raw files were preprocessed using DtaSuperCharge version 1.43 (available from http://msquant.alwaysdata.net) and peak lists from HCD and CID were merged in the low mass region using the CID-HCD-merger tool in IsobariQ before searching with an in-house installation of Mascot version 2.2. Mascot dat-files were imported in IsobariQ and quantified using the Division by Channel Sum algorithm.


Figure 2. IsobariQ modules (A) Q-IPTL and (B) Q-Reporter. Each module consists of three main parts. (1) A list of all suggested sequences for one single spectrum: Different Mascot hits can be matched to the spectrum and the identification of the peptide can thus be manually validated. (2) The MS/MS spectrum: Annotated MS/MS spectra can be viewed and the user can browse all the peptides in the protein for which MS/MS spectra have been acquired, look at all the corresponding spectra and verify their sequence assignments. (3) Quantification information: When a MS/MS spectrum is quantified, detailed information about the ratios, intensities and normalized intensities, peptide variability, number of quantification points in one MS/MS spectrum (IPTL) and extreme ratios (yellow) is available. The user can interact with the data and select or deselect ratios which should contribute to the protein ratio. (4) Reporter ions spectrum: In addition, Q-Reporter has a truncated MS/MS spectrum showing only the reporter region for easy inspection of the reporter ions.

IsobariQ successfully quantified 468 queries identifying 17 unique BSA peptides (32% sequence coverage) in this data set using reporter 126 as control (Figure 2B). The five ratios 127/126, 128/126, 129/126, 130/126 and 131/126 all showed 1:1 regulation with a CV of 2% between the individual peptide ratios.
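The two reporter ion steps used for this kind of data, isotopic purity correction and Division by Channel Sum, can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than IsobariQ's implementation (which is written in C++ and uses the GNU Scientific Library): Gaussian elimination with partial pivoting is written out by hand, the purity matrix is the iTRAQ 4-plex example from the Experimental section rather than TMT 6-plex purities, and the function names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("correction matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


# Purity information (percent) from the iTRAQ 4-plex example in the text.
PURITY_PERCENT = [
    [92.9, 2.0, 0.0, 0.0],
    [5.9, 92.3, 3.0, 0.0],
    [0.2, 5.6, 92.4, 4.0],
    [0.0, 0.1, 4.5, 92.4],
]
C = [[v / 100.0 for v in row] for row in PURITY_PERCENT]


def correct_reporters(measured):
    """Corrected intensities, i.e. the solution of C x = measured."""
    return solve(C, measured)


def channel_sum_normalize(spectra):
    """Divide every reporter intensity by the total of its channel.

    spectra is a list of per-spectrum intensity lists, one value per channel.
    """
    n_channels = len(spectra[0])
    totals = [sum(s[ch] for s in spectra) for ch in range(n_channels)]
    return [[s[ch] / totals[ch] for ch in range(n_channels)] for s in spectra]


# Reproduces the worked example: 1000 counts in each of 114, 115, 116, 117.
corrected = correct_reporters([1000.0, 1000.0, 1000.0, 1000.0])
print([round(v) for v in corrected])  # [1055, 984, 976, 1034]
```

Solving the linear system directly gives the same result as multiplying by the inverse of the correction matrix, and avoids forming the inverse explicitly.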


Figure 3. IsobariQ graphs. (A) In the main window of IsobariQ the distribution of normalized protein ratios is displayed. The normalized protein ratio is plotted against the protein intensity (blue dots) and a histogram of the ratios is displayed (red bars). When a protein is selected, the normalized ratio is indicated for that protein (purple bar on top) and all its peptides (yellow bars on the top). This allows for easy viewing of the ratio spread within one protein. In addition, IsobariQ has a Graph Module. Depending on the quantification method used, a number of different plots can be generated. (B) “Quant Point Intensities” visualizes the relation between the IPTL fragment ratios (quant points) between two states, (C) “Quant Point CV vs. Rank of Mean Intensity” visualizes how the variance depends on the intensities and the effect of VSN, and (D) “Quant Point Ratio vs. Rank of Mean Intensity” visualizes the effect of VSN on ratios and that the statistical spread seen on low intense data is reduced after VSN. For B, C and D: Blue dots are original data, orange dots are normalized data using variance stabilization.
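The quant points plotted in Figure 3B-D are the individual fragment ratios of IPTL spectra. As a rough illustration of how the peptide ratio, the peptide variability and the ratio angle defined in the Experimental section could be derived from the quant points of one MS/MS spectrum, consider the following Python sketch. It assumes base-2 logarithms and regresses the sorted log ratios on their rank (0, 1, 2, ...); neither choice is specified in the text, and `peptide_stats` is a hypothetical helper, not part of IsobariQ.

```python
import math
import statistics


def peptide_stats(fragment_ratios):
    """Summarize the fragment ratios (all > 0) of one MS/MS spectrum."""
    logs = sorted(math.log2(r) for r in fragment_ratios)
    n = len(logs)
    m = statistics.median(logs)

    # Peptide variability: average absolute deviation from the median.
    variability = sum(abs(x - m) for x in logs) / n

    # Ratio angle: slope of a least-squares line through the sorted log
    # ratios, using the rank as x coordinate (assumption of this sketch).
    mean_x = (n - 1) / 2.0
    mean_y = sum(logs) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(logs))
    slope = sxy / sxx if sxx > 0 else 0.0

    return {
        "ratio": 2.0 ** m,  # median ratio, transformed back from log space
        "variability": variability,
        "ratio_angle_deg": math.degrees(math.atan(slope)),
    }


# Illustrative values only, not data from the paper.
stats = peptide_stats([0.5, 1.0, 2.0])
```

With identical fragment ratios the fitted line is flat and the angle is zero; the more the sorted ratios spread out, the steeper the line becomes, which is why the angle reacts strongly to a single extreme quant point.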

Application of IsobariQ to Experimental IPTL Data. To validate the use of IsobariQ for IPTL data, HeLa cells were treated with S-trityl-L-cysteine (STLC) to induce mitotic arrest after 16 h and apoptosis after 40 h, respectively. Cell culture, incubation with STLC, and SDS-PAGE were performed as previously described28 and Coomassie Blue G-250 stained bands were excised and digested with Lys-C. Peptides originating from cells incubated for 16 h with STLC were labeled first at lysines with MDHI-d4 and subsequently N-terminally with succinic anhydride. Likewise, peptides from cells incubated for 40 h with STLC were labeled first at lysines with MDHI and afterward on their N-termini with succinic anhydride-d4. The peptide digests were combined for corresponding gel bands and analyzed using a nano-LC/LTQ-Orbitrap XL in data-dependent mode according to the method previously described in detail.4 The generated raw files were preprocessed using DtaSuperCharge and searched using Mascot. Mascot result files (dat-files) were imported in IsobariQ and quantified utilizing the VSN algorithm.

Mascot identified 822 proteins with at least two peptides each from this dataset. Furthermore, when quantifying this dataset with IsobariQ, and in addition requiring at least two peptides per protein having quantification events, 296 proteins could be successfully quantified. The average number of matched queries per protein was 14 and the average number of quantification points per protein was 78. This number is sufficiently high to infer an accurate and robust estimate of the protein ratio, whereas for other isobaric quantification methods only one quantification point per peptide can be used. Among the proteins with at least two quantifiable peptides, three showed significant up-regulation and seven down-regulation (Benjamini-Hochberg corrected p < 0.05) after 40 h compared to 16 h incubation with STLC (Table 1). Interestingly, four of these regulated proteins (neutral alpha-glucosidase AB (GANAB), phosphoserine aminotransferase (SERC), 60S ribosomal protein L10 (RL10), and 60S ribosomal protein L24 (RL24)) were identified by quantitative proteome analysis of Bax-expressing and deficient colorectal carcinoma cells after induction of apoptosis using tumor necrosis factor-related apoptosis inducing ligand (TRAIL).29 In addition, protein DJ-1 (PARK7),30 phosphatidylethanolamine-binding protein 1 (PEBP),31 and neutral alpha-glucosidase AB (GANAB)31 were identified in other proteomics studies of apoptosis.

Table 1. Significantly Up- or Down-Regulated Proteins in Response to Incubation with STLC for 16 h (Mitotic Arrest) and 40 h (Apoptosis) in HeLa Cells Are Displayed(a)

| Uniprot ID | normalized ratio | BH corrected significance | quantification points |
| --- | --- | --- | --- |
| GANAB_HUMAN | 7.93 | 2.28 × 10^-2 | 87 |
| PARK7_HUMAN | 9.19 | 1.23 × 10^-2 | 111 |
| PEBP1_HUMAN | 0.15 | 1.21 × 10^-3 | 14 |
| RAB1B_HUMAN | 9.35 | 1.24 × 10^-2 | 28 |
| RAB14_HUMAN | 0.13 | 4.04 × 10^-4 | 13 |
| RL9_HUMAN | 0.09 | 8.88 × 10^-6 | 17 |
| RL10_HUMAN | 0.21 | 1.20 × 10^-2 | 10 |
| RL23A_HUMAN | 0.26 | 3.46 × 10^-2 | 62 |
| RL24_HUMAN | 0.08 | 3.70 × 10^-6 | 24 |
| SERC_HUMAN | 0.20 | 1.06 × 10^-2 | 37 |

(a) BH, Benjamini-Hochberg.

Proteases have an important role in many signaling pathways. However, if all lanes of a 1D gel are merged for data analysis and if both the full-length protein and the cleavage products are detected, the cleavage event might not be detected because it will not be reflected by a quantitative change.32 IsobariQ highlights proteins which show diverging regulation between different locations on a 1D gel, and 65 proteins in the data set fulfilled this criterion. Out of these, 35 contained at least two quantified peptides in both directions (up- and down-regulation). Of these, 12 showed the typical pattern of protein degradation (Table 2), that is, the full length protein is reduced in amount and the cleaved protein product is increased. This included, for example, poly[ADP-ribose]polymerase-1 (PARP-1), a main target protein for activated caspase-3.33 In IsobariQ this protein was identified in two distinct bands on the SDS-PAGE gel, with opposite regulations. This regulation of PARP-1 in two directions confirmed activation of caspase-3 and apoptosis after 40 h. Other proteins showing this characteristic pattern were cytoskeletal proteins such as filamin, lamin A/C, myosin-9, plectin, and vimentin, all of them well-known targets of caspases,34 verifying the execution of apoptosis after 40 h. In fact, 8 of the 12 proteins were previously reported as caspase targets according to the CAspase Substrate dataBAse Homepage (CASBAH)35 (available at http://bioinf.gen.tcd.ie/casbah), while 4 were previously not known (Table 2). Out of these four proteins, fatty acid synthase,29,36 T-complex protein 1 subunit eta (TCPH),29 and 40S ribosomal protein S3A29,37 have been reported in proteome analyses of apoptotic cells. In addition, GraBCas,38 a bioinformatics tool for score-based prediction of caspase cleavage, revealed caspase-3 and -7 cleavage sites with high probability for nucleolar RNA helicase 2 (DDX21) with an ideal site at the sequence DEVD/Q (342) and the maximum score of 100, and a high probability of cleavage of fatty acid synthase (Fas) at the sequence DDPD/P (965) with a probability score of 10.

Table 2. Proteins Selected by IsobariQ to be Identified at Different Locations on the 1D Gel, which also Showed Different Regulation Depending on the Gel Location(a)

| Uniprot ID | full length protein: normalized ratio | full length protein: quantification points | cleaved protein product: normalized ratio | cleaved protein product: quantification points |
| --- | --- | --- | --- | --- |
| DDX21_HUMAN | 0.55 | 42 | 1.88 | 28 |
| EIF3A_HUMAN* | 0.79 | 83 | 2.42 | 34 |
| FAS_HUMAN | 0.83 | 85 | 6.28 | 25 |
| FLNA_HUMAN* | 0.84 | 366 | 4.38 | 33 |
| LMNA_HUMAN* | 0.31 | 36 | 4.10 | 10 |
| MAP4_HUMAN* | 0.78 | 31 | 3.31 | 26 |
| MYH9_HUMAN* | 0.63 | 237 | 4.83 | 26 |
| PARP1_HUMAN* | 0.49 | 82 | 2.81 | 44 |
| PLEC1_HUMAN* | 0.42 | 252 | 3.72 | 140 |
| RS3A_HUMAN | 0.34 | 21 | 1.67 | 6 |
| TCPH_HUMAN | 1.31 | 150 | 5.74 | 12 |
| VIME_HUMAN* | 0.33 | 81 | 5.30 | 29 |

(a) Only proteins showing a distinct cleavage pattern (the amount of the full length protein is reduced while the cleaved protein product is increased) are shown. Eight of the 12 proteins (marked with an asterisk) have previously been reported as caspase targets according to the CAspase Substrate dataBAse Homepage (http://bioinf.gen.tcd.ie/casbah).35

Taking this information into account, a single ratio per protein may not be a satisfactory measure of protein abundance. Different protein species (e.g., full-length, cleaved, and modified proteins) might show different abundances during apoptosis, and multiple ratios for each protein must be inferred. With the increasing number of protein species identifications, future quantification software will have to include algorithms that take this issue into account.
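The idea of flagging proteins with diverging regulation across gel locations can be sketched as below. The text does not state the exact criterion IsobariQ applies, so this Python sketch makes a simple assumption: a protein is flagged when at least one gel band shows a ratio below 1 and at least one other band shows a ratio above 1. The function name, the band labels and the optional `min_fold` threshold are hypothetical; the two example proteins use the normalized ratios listed in Table 2.

```python
import math


def diverging_proteins(band_ratios, min_fold=1.0):
    """Flag proteins that are up in one gel band and down in another.

    band_ratios maps protein -> {band label: normalized ratio (> 0)}.
    min_fold is an optional minimum fold change in either direction
    (an assumption of this sketch; 1.0 means any change of sign counts).
    """
    cutoff = math.log2(min_fold)
    flagged = {}
    for protein, bands in band_ratios.items():
        up = sorted(b for b, r in bands.items()
                    if math.log2(r) > 0 and math.log2(r) >= cutoff)
        down = sorted(b for b, r in bands.items()
                      if math.log2(r) < 0 and -math.log2(r) >= cutoff)
        if up and down:
            flagged[protein] = {"up": up, "down": down}
    return flagged


# Normalized ratios from Table 2; the band labels are illustrative.
example = {
    "PARP1_HUMAN": {"full_length_band": 0.49, "cleaved_band": 2.81},
    "VIME_HUMAN": {"full_length_band": 0.33, "cleaved_band": 5.30},
}
flagged = diverging_proteins(example)
```

Both example proteins are flagged because the full length band is reduced while the band containing the cleavage product is increased. A stricter `min_fold` would drop proteins whose change in one of the bands is small.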

Conclusions

In summary, IsobariQ is a novel tool for accurate quantification of IPTL, iTRAQ and TMT data. Currently, it is the only freely available tool to analyze IPTL data and it allows the user to examine proteomic data in detail and interactively through a graphical user interface. In addition, it provides the user with advanced statistical analysis, including the recently published variance stabilizing normalization (VSN).

Abbreviations: CID, collision induced dissociation; ETD, electron transfer dissociation; HCD, higher-energy C-trap dissociation; iTRAQ, isobaric tags for relative and absolute quantification; IPTL, isobaric peptide termini labeling; MDHI, 2-methoxy-4,5-dihydro-1H-imidazole; PARP-1, poly[ADP-ribose]polymerase-1; SILAC, stable isotope labeling with amino acids in cell culture; STLC, S-trityl-L-cysteine; TMT, tandem mass tagging; TRAIL, tumor necrosis factor-related apoptosis inducing ligand; VSN, variance stabilizing normalization.

Acknowledgment. This work was supported by the National Program for Research in Functional Genomics in Norway (FUGE, project no. 183418/S10) of the Norwegian Research Council and FUGE-Øst to B.T. C.J.K. and B.T. are the inventors of “Quantitative proteomics using isobaric peptide termini labeling”, which was made as employees of the University of Oslo and constitutes the basis for a patent application. A.T. is working for NEPAF, a contract proteome analysis facility (funded by the regional development agency ONE Northeast and by the European Regional Development Fund) that will offer IPTL-based protein quantification as a service. H.B. and F.B. are partly supported by the Western Norway Regional Health Authority and by a grant from the Meltzer foundation.

References

(1) Ong, S. E.; Blagoev, B.; Kratchmarova, I.; Kristensen, D. B.; Steen, H.; Pandey, A.; Mann, M. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell. Proteomics 2002, 1 (5), 376–86.
(2) Thompson, A.; Schafer, J.; Kuhn, K.; Kienle, S.; Schwarz, J.; Schmidt, G.; Neumann, T.; Johnstone, R.; Mohammed, A. K.; Hamon, C. Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal. Chem. 2003, 75 (8), 1895–904.

Journal of Proteome Research • Vol. 10, No. 2, 2011 919

(3) Ross, P. L.; Huang, Y. N.; Marchese, J. N.; Williamson, B.; Parker, K.; Hattan, S.; Khainovski, N.; Pillai, S.; Dey, S.; Daniels, S.; Purkayastha, S.; Juhasz, P.; Martin, S.; Bartlet-Jones, M.; He, F.; Jacobson, A.; Pappin, D. J. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell. Proteomics 2004, 3 (12), 1154–69.
(4) Koehler, C. J.; Strozynski, M.; Kozielski, F.; Treumann, A.; Thiede, B. Isobaric peptide termini labeling for MS/MS-based quantitative proteomics. J. Proteome Res. 2009, 8 (9), 4333–41.
(5) Mann, M. Functional and quantitative proteomics using SILAC. Nat. Rev. Mol. Cell Biol. 2006, 7 (12), 952–8.
(6) Choe, L.; D’Ascenzo, M.; Relkin, N. R.; Pappin, D.; Ross, P.; Williamson, B.; Guertin, S.; Pribil, P.; Lee, K. H. 8-plex quantitation of changes in cerebrospinal fluid protein expression in subjects undergoing intravenous immunoglobulin treatment for Alzheimer’s disease. Proteomics 2007, 7 (20), 3651–60.
(7) Dayon, L.; Pasquarello, C.; Hoogland, C.; Sanchez, J. C.; Scherl, A. Combining low- and high-energy tandem mass spectra for optimized peptide quantification with isobaric tags. J. Proteomics 2010, 73 (4), 769–77.
(8) Ow, S. Y.; Salim, M.; Noirel, J.; Evans, C.; Rehman, I.; Wright, P. C. iTRAQ underestimation in simple and complex mixtures: “the good, the bad and the ugly”. J. Proteome Res. 2009, 8 (11), 5347–55.
(9) Mueller, L. N.; Brusniak, M. Y.; Mani, D. R.; Aebersold, R. An assessment of software solutions for the analysis of mass spectrometry based quantitative proteomics data. J. Proteome Res. 2008, 7 (1), 51–61.
(10) Cox, J.; Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 2008, 26 (12), 1367–72.
(11) Shadforth, I. P.; Dunkley, T. P.; Lilley, K. S.; Bessant, C. i-Tracker: for quantitative proteomics using iTRAQ. BMC Genomics 2005, 6, 145.
(12) Lin, W. T.; Hung, W. N.; Yian, Y. H.; Wu, K. P.; Han, C. L.; Chen, Y. R.; Chen, Y. J.; Sung, T. Y.; Hsu, W. L. Multi-Q: a fully automated tool for multiplexed protein quantitation. J. Proteome Res. 2006, 5 (9), 2328–38.
(13) Muth, T.; Keller, D.; Puetz, S. M.; Martens, L.; Sickmann, A.; Boehm, A. M. jTraqX: a free, platform independent tool for isobaric tag quantitation at the protein level. Proteomics 2010, 10 (6), 1223–5.
(14) Rodriguez-Suarez, E.; Gubb, E.; Alzueta, I. F.; Falcon-Perez, J. M.; Amorim, A.; Elortza, F.; Matthiesen, R. Virtual expert mass spectrometrist: iTRAQ tool for database-dependent search, quantitation and result storage. Proteomics 2010, 10 (8), 1545–56.
(15) Karp, N. A.; Huber, W.; Sadowski, P. G.; Charles, P. D.; Hester, S. V.; Lilley, K. S. Addressing accuracy and precision issues in iTRAQ quantitation. Mol. Cell. Proteomics 2010, 9 (9), 1885–97.
(16) Bantscheff, M.; Schirle, M.; Sweetman, G.; Rick, J.; Kuster, B. Quantitative mass spectrometry in proteomics: a critical review. Anal. Bioanal. Chem. 2007, 389 (4), 1017–31.
(17) Perkins, D. N.; Pappin, D. J.; Creasy, D. M.; Cottrell, J. S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 1999, 20 (18), 3551–67.
(18) Xiang, F.; Ye, H.; Chen, R.; Fu, Q.; Li, L. N,N-dimethyl leucines as novel isobaric tandem mass tags for quantitative proteomics and peptidomics. Anal. Chem. 2010, 82 (7), 2817–25.
(19) Li, S.; Zeng, D. CILAT--a new reagent for quantitative proteomics. Chem. Commun. 2007, (21), 2181–3.
(20) Vaudel, M.; Sickmann, A.; Martens, L. Peptide and protein quantification: a map of the minefield. Proteomics 2010, 10 (4), 650–70.
(21) Golub, G. H.; Van Loan, C. F. Matrix computations, 3rd ed.; Johns Hopkins University Press: Baltimore, 1996; pp xxvii, 694.


(22) Galassi, M. GNU scientific library: reference manual, 2nd ed.; Network Theory: Bristol, 2002; pp xvi, 601.
(23) Huber, W.; von Heydebreck, A.; Sultmann, H.; Poustka, A.; Vingron, M. Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics 2002, 18 (1), S96–104.
(24) R Development Core Team. R: A Language and Environment for Statistical Computing; http://www.R-project.org, 2010.
(25) Gentleman, R. C.; Carey, V. J.; Bates, D. M.; Bolstad, B.; Dettling, M.; Dudoit, S.; Ellis, B.; Gautier, L.; Ge, Y.; Gentry, J.; Hornik, K.; Hothorn, T.; Huber, W.; Iacus, S.; Irizarry, R.; Leisch, F.; Li, C.; Maechler, M.; Rossini, A. J.; Sawitzki, G.; Smith, C.; Smyth, G.; Tierney, L.; Yang, J. Y.; Zhang, J. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004, 5 (10), R80.
(26) Nesvizhskii, A. I.; Aebersold, R. Interpretation of shotgun proteomic data: the protein inference problem. Mol. Cell. Proteomics 2005, 4 (10), 1419–40.
(27) Benjamini, Y.; Hochberg, Y. Controlling the False Discovery Rate - a Practical and Powerful Approach to Multiple Testing. J. R. Stat. Soc., Ser. B 1995, 57 (1), 289–300.
(28) Kozielski, F.; Skoufias, D. A.; Indorato, R. L.; Saoudi, Y.; Jungblut, P. R.; Hustoft, H. K.; Strozynski, M.; Thiede, B. Proteome analysis of apoptosis signaling by S-trityl-L-cysteine, a potent reversible inhibitor of human mitotic kinesin Eg5. Proteomics 2008, 8 (2), 289–300.
(29) Wang, P.; Lo, A.; Young, J. B.; Song, J. H.; Lai, R.; Kneteman, N. M.; Hao, C.; Li, L. Targeted quantitative mass spectrometric identification of differentially expressed proteins between Bax-expressing and deficient colorectal carcinoma cells. J. Proteome Res. 2009, 8 (7), 3403–14.
(30) He, F.; Zeng, Y.; Wu, X.; Ji, Y.; He, X.; Andrus, T.; Zhu, T.; Wang, T. Endogenous HIV-1 Vpr-mediated apoptosis and proteome alteration of human T-cell leukemia virus-1 transformed C8166 cells. Apoptosis 2009, 14 (10), 1212–26.
(31) Kim, D. W.; Chae, J. I.; Kim, J. Y.; Pak, J. H.; Koo, D. B.; Bahk, Y. Y.; Seo, S. B. Proteomic analysis of apoptosis related proteins regulated by proto-oncogene protein DEK. J. Cell Biochem. 2009, 106 (6), 1048–59.
(32) Thiede, B.; Treumann, A.; Kretschmer, A.; Sohlke, J.; Rudel, T. Shotgun proteome analysis of protein cleavage in apoptotic cells. Proteomics 2005, 5 (8), 2123–30.
(33) Hong, S. J.; Dawson, T. M.; Dawson, V. L. Nuclear and mitochondrial conversations in cell death: PARP-1 and AIF signaling. Trends Pharmacol. Sci. 2004, 25 (5), 259–64.
(34) Fischer, U.; Janicke, R. U.; Schulze-Osthoff, K. Many cuts to ruin: a comprehensive update of caspase substrates. Cell Death Differ. 2003, 10 (1), 76–100.
(35) Luthi, A. U.; Martin, S. J. The CASBAH: a searchable database of caspase substrates. Cell Death Differ. 2007, 14 (4), 641–50.
(36) Solstad, T.; Bjorgo, E.; Koehler, C. J.; Strozynski, M.; Torgersen, K. M.; Tasken, K.; Thiede, B. Quantitative proteome analysis of detergent-resistant membranes identifies the differential regulation of protein kinase C isoforms in apoptotic T cells. Proteomics 2010, 10 (15), 2758–68.
(37) Tabbert, A.; Kappes, F.; Knippers, R.; Kellermann, J.; Lottspeich, F.; Ferrando-May, E. Hypophosphorylation of the architectural chromatin protein DEK in death-receptor-induced apoptosis revealed by the isotope coded protein label proteomic platform. Proteomics 2006, 6 (21), 5758–72.
(38) Backes, C.; Kuentzer, J.; Lenhof, H. P.; Comtesse, N.; Meese, E. GraBCas: a bioinformatics tool for score-based prediction of Caspase- and Granzyme B-cleavage sites in protein sequences. Nucleic Acids Res. 2005, 33 (Web Server issue), W208–13.

PR1009977