ARTICLE pubs.acs.org/jpr
Sense and Nonsense of Pathway Analysis Software in Proteomics
Thorsten Müller,*,† Andreas Schrötter,† Christina Loosse,† Stefan Helling,† Christian Stephan,‡ Maike Ahrens,‡ Julian Uszkoreit,‡ Martin Eisenacher,‡ Helmut E. Meyer,‡ and Katrin Marcus†
† Functional Proteomics, Medizinisches Proteom-Center, Ruhr-University Bochum, D-44780 Bochum, Germany
‡ Bioanalytics, Medizinisches Proteom-Center, Ruhr-University Bochum, D-44780 Bochum, Germany
Supporting Information
ABSTRACT: New developments in proteomics enable scientists to examine hundreds to thousands of proteins in parallel. Quantitative proteomics allows the comparison of different proteomes of cells, tissues, or body fluids with each other. Analyzing and especially organizing these data sets is often a Herculean task. Pathway analysis software tools aim to take over this task on the basis of present knowledge. Companies promise that their algorithms help to understand the significance of scientists' data, but the benefit remains questionable, and a fundamental systematic evaluation of the potential of such tools has not been performed until now. Here, we tested the commercial Ingenuity Pathway Analysis tool as well as the freely available software STRING, using a well-defined study design, with regard to the applicability and value of their results for proteome studies. Our goal was to cover a wide range of scientific issues by simulating different established pathways, including mitochondrial apoptosis, tau phosphorylation, and Insulin-, App-, and Wnt-signaling. In addition to a general assessment and comparison of the pathway analysis tools, we provide recommendations for users as well as for software developers to improve the added value of implementing a pathway study in proteomic pipelines.
Received: July 13, 2011
Published: October 06, 2011
© 2011 American Chemical Society
dx.doi.org/10.1021/pr200654k | J. Proteome Res. 2011, 10, 5398–5408

■ INTRODUCTION
Pathway analysis tools are popular because they promise a fast interpretation of OMICS data, revealing background information on affected pathways or mechanisms. At present, 55 publications report the use of the Ingenuity Pathway Analysis (IPA) software (Ingenuity Systems, http://www.ingenuity.com/) in the field of proteomics; 24 of them have been published since 2010 (PubMed search for the term “Ingenuity proteomics”). Applications range from the analysis of tissues from treated animals,1 cell lines,2–4 conditioned media,5 and biopsied human tissue6 to human milk7 and human plasma.8 The software tool STRING (http://string-db.org/) is used in a similar way,9–12 though not as extensively as IPA (6 PubMed publications). Like the sample types, the scientific questions addressed cover a broad area, including neurological diseases,13,14 hepatic disorders,15 diabetes,16 sepsis,17 lung injury,18 and cancer.5 Finally, different proteomic discovery techniques have been combined with in-silico pathway analysis.1,5,15,16 IPA and STRING are among the most frequently used pathway tools, but many other programs are available as well (for example, GeneGo MetaCore (http://www.genego.com/metacore.php) or Ariadne Pathway Studio (http://www.ariadnegenomics.com/products/pathway-studio/)). In all setups, researchers used pathway tools to report underlying mechanisms that were putatively changed in the context of their specific scientific question. In some of the cited articles, validation studies and subsequent experiments are planned on the basis of pathway analyses. However, there are currently no publications that test or analyze the correctness of pathway tools in proteomics. IPA was compared to the pathway tool ArrayUnlock only in the field of microarrays.19 The authors reported that both tools allow similar conclusions regarding the interpretation of a chicken infection model, but little is known about the sense or nonsense of pathway tools for proteome data. Some impressions can be drawn from a bootstrap strategy using 1000 sets of 13 random proteins, which reported that IPA can provide additional insight into proteomic data sets.20 However, the authors indicate that extreme caution is needed when interpreting the IPA scores, which correspond to the likelihood that the association between a set of focus genes/proteins in an experiment and a given process or pathway is due to random chance (according to the IPA white paper).

Because of this gap in the field of proteomics, and because our lab is interested in using pathway tools for data analysis and as a basis for the design of subsequent (validation or functional) experiments, we set up a test study to evaluate the power of pathway generation in IPA and STRING (in IPA, pathways are termed “networks”). Although the two tools use different algorithms, they report basically similar results (when certain software parameters are used): a network of the uploaded proteins plus additional proteins that are densely connected with the input proteins. Our strategy was divided into two parts. On the one hand, we aimed to assess the accuracy of the tools by importing proteins (uploading protein lists into the software) that belong to an established pathway (based on the literature); we call this the pathway study. On the other hand, our goal was to test the capability of the software tools to identify pathways from an imported protein list that includes proteins known to belong to a certain pathway (the proteins of the pathway study) plus additional false positive proteins. The latter are false positives with respect to the biological context and are subsequently termed biological irrelevant to separate them from false positives arising from technical reasons. (We chose this name knowing that for certain questions (e.g., balancer studies) biological irrelevant proteins are, of course, worth studying.) Biological irrelevants were obtained from a real-world complex proteome analysis comparing six biological replicates of the HEK293T cell line. Such biological irrelevant proteins are often found in screening experiments, and some of them belong to the group of so-called deja-vu proteins.21 Others might be balancer proteins necessary to equilibrate the proteome in response to certain stimuli.22 These proteins might be of biological relevance but are often irrelevant to the experiment and, thus, should be excluded from the data analysis. We wondered whether the software tools were able to handle these biological irrelevant proteins and whether they were nevertheless capable of reporting the simulated affected pathways as well as possible. We called the combined analysis of pathway proteins and biological irrelevant proteins the background study. We assessed the correctness of protein annotation and of the description of canonical pathways (where provided by the software). To cover different research areas, we simulated established pathways in five different fields: mitochondrial apoptosis, tau phosphorylation, and Insulin-, App-, and Wnt-signaling.
Our evaluation studies allowed us to score the ability of the software tools to identify underlying pathways from less specific (including biological irrelevant proteins) and more specific (excluding biological irrelevant proteins) data sets, the correct annotation of proteins and pathways, and the applicability to different research questions.
■ MATERIAL AND METHODS
Samples, Mass Spectrometry Analysis, and Data Preanalysis
The proteome of HEK293T cells was identified as background for the subsequent simulation of pathway changes. For this purpose, six biological replicates of HEK293T cells (n = 6) were grown in culture dishes and lysed in Laemmli buffer, and protein lysates were prefractionated using NuPAGE Novex 4–12% Bis-Tris gels (Invitrogen, Karlsruhe, Germany). The gel was stained with Imperial Protein Stain (Thermo Scientific, Waltham, MA). Following destaining, the complete gel was reduced using DTT and alkylated with iodoacetamide. Ten gel pieces per sample (per lane) were prepared for subsequent mass spectrometry (MS) analysis, corresponding to 60 analyzed MS samples in total. Gel pieces were excised, and in-gel digestion was performed overnight at 37 °C using trypsin dissolved in 10 mM HCl and 50 mM ammonium hydrogen carbonate (NH4HCO3) at pH 7.8. Resulting peptides were extracted once with 100 μL of 1% FA and twice with 100 μL of 5% FA, 50% ACN. Extracts were combined, and ACN was removed in vacuo. For nanoLC-MS/MS analysis, a final volume of 40 μL was prepared by addition of 1% FA. ESI-MS/MS was performed on an HCT Ultra plus ion trap instrument (Bruker Daltonics, Bremen, Germany). Fragment ions were generated by low-energy collision-induced dissociation (CID) on isolated ions with a fragmentation amplitude of 0.5 V. MS spectra were summed from four individual scans ranging from m/z 300–1500 with a scanning speed of 8100 (m/z)/s, and MS/MS spectra were a sum of two scans ranging from m/z 100–2800 at a scan rate of 26 000 (m/z)/s. Raw files were transformed to *.mgf files (Data Analysis 3.2), imported into ProteinScape (version 2.1, Bruker Daltonics), and analyzed using Mascot (Matrix Science, London, U.K.) with a mass tolerance of 1.2 and 0.3 Da for MS and MS/MS masses, respectively. Searches were performed allowing two missed cleavages for tryptic digestion. Carbamidomethylation (C), phosphorylation (S,T,Y), and oxidation (M) were considered as variable modifications. A human IPI decoy database (version 3.66, 86 845 protein entries) was used, restricted to the taxonomy Homo sapiens. Proteins were accepted as identified if the Mascot score of one peptide was higher than 27. To limit the false discovery rate (FDR) to 5%, the original database (“target” part) was concatenated with a duplicate of itself (“decoy” part) in which the amino acid sequence of each protein entry was shuffled.23 Protein data were read out from ProteinScape using an in-house Visual Basic-based script. Data were further processed using the pivot table function of Microsoft Excel, resulting in a table of spectral counts for every peptide belonging to a certain protein. A spectral index (SI) based on spectral and peptide counts was calculated as published in ref 24 and was subsequently used as the basis for label-free quantification.
Data Analysis in IPA and STRING
IPA (version 9.0) generates hypothetical protein–protein interaction clusters based on the “Ingenuity Knowledge Base”. Interactions are, however, not limited to “binding” but also include “activation”, “inhibition”, “expression”, and other interactions described in the literature. The Ingenuity Knowledge Base is composed of Expert Findings, ExpertAssist Findings, Expert Knowledge, and Supported Third Party Information (more details at http://www.ingenuity.com/products/pathways_knowledge.html). To compare the power of pathway generation, we used the STRING analysis tool (version 8.3), which quantitatively integrates interaction data from high-throughput experiments, genomic context, coexpression, and the literature. STRING was initiated by the European Molecular Biology Laboratory, the Swiss Institute of Bioinformatics, the Novo Nordisk Foundation Center for Protein Research, and the Technical University of Dresden. In principle, a pathway study encompasses the upload of a list of proteins (or genes) that were identified as putatively differentially abundant in a real comparative experiment (two or more groups). The value of the software tool lies in interpreting the resulting lists by generating pathways that display underlying mechanisms, known (canonical) pathways, and known protein interactions. On the one hand, the software should include as many of the uploaded proteins as possible; on the other hand, it should recognize the most important or central proteins describing a putatively relevant pathway. As a basis for the subsequent pathway studies, the proteome data of the six HEK293T cell culture replicates (protein lysates) were randomly assigned to two groups, allowing the simulation of a 3 vs 3 sample proteomic experiment. Spectral index values of each replicate were used to calculate t test p-values and to identify differentially abundant proteins between the two groups.
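The group comparison described above can be sketched in a few lines. This is a minimal illustration, not the authors' script: the text does not state which t test variant was used, so a pooled-variance (Student) two-sample test is assumed here, and the function names and the spectral index values in the usage lines are hypothetical.

```python
import math
from statistics import mean, variance

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value from Simpson integration of the density over [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    s = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = s * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)

def student_t_test(group_a, group_b):
    """Pooled-variance two-sample t test (an assumed variant); returns (t, p)."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    if se == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(group_a) - mean(group_b)) / se
    return t, two_sided_p(t, df)

# Hypothetical spectral index values for one protein in a 3 vs 3 comparison.
t_stat, p_value = student_t_test([1.0, 1.2, 1.1], [2.0, 2.3, 2.1])
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, significant = {p_value < 0.05}")
```

Applied to every protein in the table of spectral index values, a filter of this kind with a 0.05 cutoff would produce a candidate list comparable in spirit to the one described in the text. No multiple-testing correction is shown because none is mentioned in the text.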
p-Values less than 0.05 were considered significant, resulting in 53 differentially abundant proteins (called background proteins, Table 1), which were included in the analysis of the background studies.

Table 1. Terminology
proteins of interest: proteins for which interactions (direct or indirect) or even pathways have long been known (e.g., Wnt, Axin, and Tcf, which all belong to the Wnt signaling pathway)
interplay proteins: in addition to the proteins of interest, the interplay proteins include all other proteins needed to describe the complete pathway as comprehensively as possible. We expect those proteins to be included by the software tools in the pathway modeling
pathway study: study including only proteins of interest
background study: in addition to the pathway study, the proteins identified as biological irrelevant (in our hands, differentially abundant proteins from a HEK293T study) were included in the pathway generation
biological irrelevant: proteins that were found to be significantly differentially regulated in a proteome experiment but that do not mirror the biological background of an experiment (e.g., deja-vu proteins, balancer proteins)

In general, we tested the software tools in the following two studies:
Pathway Study. Proteins of interest belonging to a selected, long-established pathway were uploaded for in-silico pathway analysis (see also Table 1, Terminology).
Background Study. In addition to the proteins of the pathway study, the proteins identified as biological irrelevant (from a differential HEK293T study) were included in the in-silico pathway generation.
For both types of studies, we assessed in IPA and STRING the correct interplay of proteins, for example, the binding of Wnt to its receptor Fz. As literature references describing the evaluated pathways, we used the mentioned high-impact reviews. We emphasize that the terms correct and correctness are always relative and depend on the current state of scientific knowledge. Thus, future research, or taking other publications as a basis, might result in alternative solutions or models. For the protein interplay, we considered more than the proteins of interest, which correspond to the manually selected proteins uploaded to the software tool: we included all proteins that are necessary to model the complete pathway. For example, Wnt, Axin, and Tcf correspond to the proteins of interest, whereas the complete pathway also encompasses Fz, Lrp5, Lrp6, Dvl, Ck1, Ctnnb, Gsk3, and Apc. The proteins of this complete pathway are subsequently termed interplay proteins (Table 1). To calculate a factor for the protein interplay, we compared the in-silico pathway analysis to the respective high-impact review and counted correct vs false interactions. Missing interactions were counted as false hits. The interplay factor is given as the percentage of correct interactions among all interactions. IPA enables pathway illustration in a subcellular fashion and provides additional information.
Thus, in IPA we also assessed: (a) The correct protein localization calculated as percentage of correctly annotated proteins to all interplay proteins. As reference, we used the indicated high impact reviews as well as the LOCATE subcellular localization database (http://locate.imb.uq.edu.au/). (b) The correct protein family affiliation (e.g., receptor, ligand, transcription factor,...) as percentage of correctly annotated proteins to all interplay proteins. As reference, we used the indicated high impact reviews. Unfortunately, (a) and (b) are not immediately applicable in STRING, which links additional protein information to UniProt (http://www.uniprot.org/). Thus, the evaluation of the correct annotation for STRING was not examined. In general, we focused on pathways that have been described for a long time and are thought to be established. If more than one pathway (this is called network in IPA) was offered by IPA,
the one encompassing most of the interplay proteins was used as the basis for our assessment. IPA offers presets for the number of proteins displayed in one pathway. To make IPA and STRING as comparable as possible, we chose the maximum number of proteins for the pathway generation in IPA (140 proteins). In STRING, a more/less function extends or reduces the pathway description: analyzing a protein list for the first time in STRING results in a network illustration containing only the uploaded proteins. By using the “more” button, however, the pathway description is extended with additional interplay proteins that were not part of the uploaded list. The “more” (and similarly the “less”) function can be used as often as necessary to adapt the network description with more or fewer interplay proteins. Using this function, we chose the pathway encompassing most of the proteins of interest/interplay proteins. Protein lists were uploaded using Ensembl (http://www.ensembl.org) IDs as identifiers, which are recognized by both IPA and STRING.
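The interplay factor defined in this section (percentage of correct interactions, with missing interactions counted as false hits) can be illustrated with a short sketch. The function name and the data layout are assumptions made for illustration. The expected interactions are those named for Wnt signaling in the Results, and the "reported" list is a hypothetical tool output in which Fz and Lrp6 are absent from the network.

```python
def interplay_factor(expected, reported):
    """Return (correct, total, percent) for a set of expected interactions.

    Interactions are treated as undirected pairs. An expected interaction
    that the tool does not report counts as a false hit.
    """
    expected_pairs = {frozenset(pair) for pair in expected}
    reported_pairs = {frozenset(pair) for pair in reported}
    if not expected_pairs:
        raise ValueError("no expected interactions given")
    correct = len(expected_pairs & reported_pairs)
    total = len(expected_pairs)
    return correct, total, 100.0 * correct / total

# Expected Wnt interactions, taken from the literature-based list in the Results.
WNT_EXPECTED = [
    ("Wnt", "Fz"), ("Wnt", "Lrp5"), ("Wnt", "Lrp6"), ("Fz", "Dvl"),
    ("Ck1", "Ctnnb"), ("Gsk3", "Ctnnb"), ("Ctnnb", "Tcf"), ("Dvl", "Axin"),
    ("Ck1", "Axin"), ("Gsk3", "Axin"), ("Axin", "Apc"),
]

# Hypothetical tool output: everything except interactions involving Fz or Lrp6.
reported = [pair for pair in WNT_EXPECTED if "Fz" not in pair and "Lrp6" not in pair]

correct, total, percent = interplay_factor(WNT_EXPECTED, reported)
print(f"{correct}/{total} ({percent:.0f}%)")  # prints 8/11 (73%)
```

Only the expected interactions enter the denominator in this sketch; whether additional, unexpected edges drawn by a tool were also scored is not specified in the text.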
■ RESULTS
Pathway analysis tools are becoming more and more popular. Scientists working in the field of OMICS in particular often use them to identify candidates or pathways that differ between two or more groups of samples. Here, we tested IPA and STRING for their applicability to proteomic data, based on selected proteins of a certain pathway and the proteome of the cell line HEK293T, which was identified using an HCT ultra mass spectrometer (the completely identified proteome of the HEK293T cell line is given in Supplement Table 1, Supporting Information). IPA directly creates networks combining the input proteins and additional proteins that are densely connected with the input proteins. STRING initially reports just a network showing connections for the uploaded proteins. However, using the more/less function in STRING, additional proteins are added as well, resulting in basically comparable results. To assess the value of pathway analysis software for the detection of underlying mechanisms and the subsequent design of further experiments, we pursued the following strategy. On the one hand, we aimed to score the accurate interaction of certain proteins of interest (compare Table 1, Terminology), defined as proteins for which interactions or even pathways have long been known (like the proteins Wnt, Axin, and Tcf, which all belong to the Wnt signaling cascade). To select these proteins accurately, we thoroughly screened the present literature and selected high-impact reviews as the basis for the identification of proteins belonging to a certain pathway. We uploaded just the proteins of interest and termed the resulting network the pathway study (Table 1). To force the software to mirror the complete underlying pathway, we chose the proteins of interest such that the proposed start (e.g., a receptor ligand) and the end (e.g., a transcription factor) of the signaling cascade are represented among the proteins of interest. This approach basically simulates a “real-world” experiment, in which not every member of a signaling pathway will be differentially expressed. The resulting pathway should include additional proteins (next to the uploaded ones (Wnt, Axin, Tcf)) that describe the pathway as comprehensively as possible. The whole set of proteins determining a pathway is termed interplay proteins (proteins of interest (Wnt, Axin, Tcf) plus additional proteins of the Wnt pathway (Fz, Lrp, Dvl, ...) = interplay proteins, Table 1). On the other hand, we aimed to examine the interactions of the interplay proteins against the background of a real proteomic experiment containing false positively regulated proteins. The latter are false positives with respect to the biological context and are subsequently termed biological irrelevant to separate them from false positives arising from technical reasons. We added the biological irrelevant proteins obtained from a 3 vs 3 HEK293T sample comparison experiment to the list of uploaded proteins and reassessed the pathway generation by the software (subsequently termed background study, Table 1). Detailed results are given in Table 2, whereas Table 3 summarizes our findings. This approach was chosen to evaluate the effect of additional proteins that are irrelevant to the context of the studied pathway. Because we aimed to focus specifically on the proteome/mass spectrometry relevant background, we extended the uploaded lists with candidates from a real comparative proteome study instead of randomly selected proteins.

Table 2. Results of Five Different Pathway Studies(a)

(A) Wnt signaling
IPA annotation
interplay protein | family | localization
Wnt (POI) | group | extracellular space
Axin (POI) | other | cytoplasm
Tcf (POI) | transcr. regul. | nucleus
Fz | G-prot. coupl. rec. | plasma membrane
Lrp5 | other | plasma membrane
Lrp6 | other | plasma membrane
Dvl | other | cytoplasm
Ck1 | complex | cytoplasm
Ctnnb | transcr. regul. | nucleus
Gsk3 | kinase | nucleus (cytoplasm)
Apc | enzyme | nucleus
Results | 6/11 (54%) | 10/11 (91%)
Interactions studied: Wnt-Fz, Wnt-Lrp5, Wnt-Lrp6, Fz-Dvl, Ck1-Ctnnb, Gsk3-Ctnnb, Ctnnb-Tcf, Dvl-Axin, Ck1-Axin, Gsk-Axin, Axin-Apc
Results | IPA | STRING
pathway study | 8/11 (73%) | 9/11 (82%)
background study | 7/11 (64%) | 4/11 (36%)

(B) Mapt
IPA annotation
interplay protein | family | localization
Mapt (POI) | other | cytoplasm
Gsk3b (POI) | kinase | nucleus (cytoplasm)
Tesk1 (POI) | kinase | nucleus (cytoplasm)
Mark | kinase | cytoplasm
Akt | kinase | cytoplasm
Klc1 (POI) | other | cytoplasm
Results | 4/6 (67%) | 4/6 (67%)
Interactions studied: Gsk3b-Mapt, Mark-Mapt, Tesk1-Mark, Akt-Gsk3b, Akt-Mapt, Mapt-Klc1
Results | IPA | STRING
pathway study | 5/6 (83%) | 4/6 (67%)
background study | 4/6 (67%) | 2/6 (33%)

(C) Insu
IPA annotation
interplay protein | family | localization
Igfr | |
Irs1 (POI) | other | cytoplasm (plasma membrane)
Pi3k | complex | cytoplasm
PIP | chemical | unknown
Akt | group | unknown
mTor (POI) | kinase | nucleus/cytoplasm
Tsc1 | complex | cytoplasm
Tsc2 | complex | cytoplasm
Rheb (POI) | enzyme | plasma membrane (cytoplasm)
Raptor | other | cytoplasm
4Ebp1 (POI) | translat. regul. | cytoplasm
S6k | group | unknown
Results | 7/12 (58%) | 6/12 (50%)
Interactions studied: Igfr-Irs1, Irs1-Pi3k, Pi3k-PIP, PIP-Akt, Akt-mTor, Akt-Tsc1, Akt-Tsc2, Tsc-Rheb, mTor-Raptor, mTor-4Ebp1, mTor-S6k
Results | IPA | STRING
pathway study | 8/11 (73%) | 8/11 (73%)
background study | 8/11 (73%) | 8/11 (73%)

(D) Apop
IPA annotation
interplay protein | family | localization
FasL (POI) | cytokine | extrac. space
FasR | transm. receptor | plasma memb.
Casp8 | |
Bid | other | cytoplasm
Bbc3 (Puma) (POI) | other | cytoplasm
Bcl2 | other | cytoplasm
Bax, Bak | other | cytoplasm
Cytc somatic (Cycs) | enzyme | cytoplasm
Apaf1 | other | cytoplasm
Casp9 (POI) | peptidase | cytoplasm
diablo | other | cytoplasm
Results | 4/12 (33%) | 10/12 (83%)
Interactions studied: FasL-FasR, FasR-Casp8, Casp8-Bid, Bbc3-Bcl2, Bcl-Bax, Bcl-Bak, Bax-Bak, Cycs-Apaf1, Cycs-Casp9, diablo-Casp9
Results | IPA | STRING
pathway study | 6/10 (60%) | 10/10 (100%)
background study | 1/10 (10%) | 8/10 (80%)

(E) Aicd
IPA annotation
interplay protein | family | localization
APP (POI) | other | plasma membrane
Fe65 (POI) | other | cytoplasm (also in the nucleus)
Tip60 (Kat5) (POI) | transcription regulator | nucleus
Kai1 (CD82) (POI) | other | plasma membrane
Gsk3β, Tagln | other | cytoplasm
Acta2 | other | cytoplasm
Psen1 | |
Ncstn | |
PsenEn | |
Aph1 | |
Results | 1/11 (9%) | 5/11 (45%)
Interactions studied: APP-Fe65, Fe65-Tip60, APP-Kai1, APP-Tagln, APP-Gsk3, APP-Psen1, APP-Ncstn, Psen1-PsenEn, Psen1-Aph1
Results | IPA | STRING
pathway study | 4/9 (44%) | 7/9 (78%)
background study | 3/9 (33%) | 3/9 (33%)

(a) POI = protein of interest used for the pathway study (in bold); interplay proteins = POI + additional proteins belonging to the pathway. Incorrect or unspecific descriptions of protein family or localization annotation are crossed out.

Wingless-type Signaling (Wnt)
Initially, we studied the proteins of the Wnt signaling pathway. Appropriate proteins were selected according to recent high-impact reviews (refs 25 and 26). For Wnt signaling, we uploaded the proteins Wnt, Axin, and Tcf as proteins of interest for the pathway study. In brief, Axin is part of the destruction complex that is inactivated following binding of Wnt to the Frizzled receptor (Fz). Tcf is involved in the coactivation of Wnt-responsive genes in the nucleus. The complete pathway (interplay proteins) also involves the proteins Fz, Lrp, Dvl, Gsk, Apc, Ck1 (Csnk1), and Ctnnb (Table 2A, interplay proteins, and Figure 1A). Family annotation in IPA was not correct for the interplay proteins Axin, Lrp5, Dvl, and Lrp6 (54% correct). Localization annotation in IPA was correct for 91% of the proteins. In the demonstrated pathway, Gsk3 was wrongly annotated as being located exclusively in the nucleus. Surprisingly, evaluation of the pathway study revealed that Fz was not included in the network at all. According to the literature, we expected the following interactions to be displayed by the software (IPA and STRING): Wnt binding to Fz, Wnt-dependent recruitment of Lrp5 and Lrp6, Fz-dependent recruitment of Dvl, Ck1-dependent phosphorylation of Ctnnb, Gsk3-dependent phosphorylation of Ctnnb, Ctnnb association with Tcf, Dvl action on Axin, Ck1–Axin interaction, Gsk–Axin interaction, and Axin–Apc interaction. Using these assumptions, the pathway study in IPA revealed a correctness of 73% for the interplay proteins, which is due to the missing contribution of Fz and Lrp6 in the pathway. In STRING, the pathway study of interplay proteins revealed a correctness of 82%, due to the missing Fz receptor.
The pathway diagrams of both analyses are shown in Figure 1 (B (IPA) and C (STRING)); high-resolution images are given in Supplement Figure 1 (IPA, pathway study of Wnt signaling) and Supplement Figure 2 (STRING, pathway study of Wnt signaling), Supporting Information. The background study of Wnt signaling in IPA, which included the background proteins of the HEK293T proteome study, revealed a correct description of 64% with respect to the Wnt signaling pathway (Figure 1D, Table 2A, and Supplement Figure 3, Supporting Information). Due to the inclusion of biological irrelevant proteins, some of the interactions were no longer displayed. The results of the STRING software were even worse (36%, Figure 1E, Table 2A, and Supplement Figure 4, Supporting Information).
MAPT Phosphorylation (Mapt)
Next, we studied a pathway involved in the course of neurodegenerative diseases. As proteins of interest, we uploaded Gsk3β, Mapt, and Tesk1 for pathway generation. The underlying pathway
Table 3. Summary of Test Results
pathway | IPA annotation: family [%] | IPA annotation: localization [%] | pathway study: IPA | pathway study: STRING | background study: IPA | background study: STRING
Wnt | 54 | 91 | 8/11 (72.7%) | 9/11 (81.8%) | 7/11 (63.6%) | 4/11 (36.4%)
Mapt | 67 | 67 | 5/6 (83.3%) | 4/6 (66.7%) | 4/6 (66.7%) | 2/6 (33.3%)
Insu | 58 | 50 | 8/11 (72.7%) | 8/11 (72.7%) | 8/11 (72.7%) | 8/11 (72.7%)
Apop | 33 | 83 | 6/10 (60.0%) | 10/10 (100.0%) | 1/10 (10.0%) | 8/10 (80.0%)
Aicd | 9 | 45 | 4/9 (44.4%) | 7/9 (77.8%) | 3/9 (33.3%) | 3/9 (33.3%)
average (Pstudy tool) | 44.2 | 70.4 | 31/47 (66.0%) | 38/47 (80.1%) | 23/47 (48.9%) | 25/47 (53.2%)
proportion test p-value: pathway study, 0.1613; background study, 0.8365
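The two proportion test p-values in Table 3 compare the pooled IPA and STRING counts within each study type. The sketch below shows one way to compute such a test; it assumes a two-sided chi-square test of equal proportions with Yates continuity correction (the default behavior of R's prop.test), which the table itself does not specify. The function name is hypothetical.

```python
import math

def prop_test_two_sample(x1, n1, x2, n2, correct=True):
    """Two-sided chi-square test (1 df) for equality of two proportions.

    x1/n1 and x2/n2 are successes/trials in the two groups. With
    correct=True, the Yates continuity correction is applied.
    """
    a, b = x1, n1 - x1          # group 1: successes, failures
    c, d = x2, n2 - x2          # group 2: successes, failures
    n = n1 + n2
    diff = abs(a * d - b * c)
    if correct:
        diff = max(0.0, diff - n / 2)
    denom = n1 * n2 * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("degenerate 2x2 table")
    chi2 = n * diff ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Pooled counts from the "average" row of Table 3 (IPA vs STRING).
_, p_pathway = prop_test_two_sample(31, 47, 38, 47)
_, p_background = prop_test_two_sample(23, 47, 25, 47)
print(f"pathway study p = {p_pathway:.4f}")
print(f"background study p = {p_background:.4f}")
```

Under these assumptions the computed values are approximately 0.161 for the pathway study and 0.837 for the background study, in line with the values given in Table 3; neither comparison indicates a significant difference between the two tools at the 0.05 level.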
Figure 1. Results of the Wnt pathway study in IPA and STRING. The Wnt pathway is illustrated in A according to literature data. Upload of pathway proteins (three selected proteins of the Wnt pathway: Wnt, Axin, and Tcf) resulted in a good description of the underlying pathway in IPA (B, 73% correct) and STRING (C, 82%). Addition of background proteins from a real proteome experiment, including biological irrelevant proteins, impaired the recovery of the Wnt pathway in the illustration (D (IPA, 64%) and E (STRING, 36%)). A high-resolution image of each part (B–E) is given in Supplement Figures 1–4 (Supporting Information).
of the Mapt protein phosphorylation is well described and has long been known.27 In brief, Mapt can be phosphorylated by several kinases, among them Gsk3β and Mark. Tesk1 is a protein able to inactivate Mark. In IPA, the protein family annotation was correct for 67% of the interplay proteins (Table 2B). The localization was correctly annotated for Mapt, Mark, Akt, and Klc1, whereas Gsk3β and Tesk1 are mainly located in the cytoplasm (contrary to the IPA annotation), resulting in a correct protein localization of 67% (compare Table 2B). For the pathway study, the additional proteins Mark, Akt, and Klc1 were expected to be included in the pathway (detailed pathway results are given in Supplement Figure 7, Supporting Information). Although the Mark protein was present in the pathway, its direct binding to Tesk1 was not detected. Other protein interactions were correctly visualized (83%). In STRING, Tesk1 was not associated with Mapt, neither directly nor indirectly via Mark. The latter was not included in the pathway at all. Phosphorylation of Mapt by Gsk3β was found and described correctly. All in all, 67% of the interactions were described correctly (Table 2B; detailed pathway results are given in Supplement Figure 8, Supporting Information). The background study of Mapt in IPA failed to demonstrate the Mark-dependent phosphorylation of Mapt (67%; detailed pathway results are given in Supplement Figure 9, Supporting Information). In STRING, Akt was no longer involved in the pathway generation (33%; detailed pathway results are given in Supplement Figure 10, Supporting Information).

Figure 2. Additional useful features in IPA. IPA offers illustration of canonical pathways (A, Wnt signaling), with red-labeled proteins representing those that were uploaded by the user. Proteins belonging to a certain canonical pathway (here Wnt signaling) can be highlighted in the pathway description as shown in B (blue lines label proteins belonging to Wnt signaling). In contrast to Figure 1, pathways can also be illustrated with the localization information of proteins, as shown in B. A high-resolution image of each part is given in Supplement Figures 5–6 (Supporting Information).

Insulin Signaling (Insu)
Next, we analyzed the pathway generation of the Insulin-dependent signaling cascade, including the mammalian target of rapamycin (mTor), based on the publication by Corradetti et al.28 As components of the pathway, we first uploaded the proteins Irs1, Rheb, mTor, and 4Ebp1 (Eif4ebp). For the interplay proteins, we additionally considered the proteins/components Igfr, Pi3k, PIP, Akt, Tsc1, Tsc2, Raptor, and S6k. In brief, Insulin initiates the signaling cascade at the Insulin receptor, resulting in the activation of Irs1 and Pi3k. Subsequently, Akt is recruited to the plasma membrane in a PIP-dependent manner and phosphorylated by kinases like mTor and others. Next, the Tsc1/2 complex is phosphorylated and the G protein Rheb is inactivated. Finally, the activity of the translation control proteins S6k and 4Ebp1 is modulated. In IPA, the Igf receptor was not considered in the calculated pathway (Table 2C). Family annotation of Irs1, Akt, Raptor, and S6k was insufficient. IPA generally distinguishes cytoplasmic and plasma membrane localization, both of which are true for Irs1 according to the literature. However, the protein was annotated as being localized solely in the cytoplasm, which we evaluated as an unspecific (false) annotation. Similarly inaccurate localization information was given for Rheb. For some other components of the pathway, the localization was missing (50% correct localization overall). For the pathway study, we expected the following interactions to be displayed by the software (IPA and STRING): Igfr-Irs1 binding, Irs1-Pi3k activation, Pi3k-PIP conversion, PIP-Akt recruitment, mTor-Akt phosphorylation, Akt-Tsc phosphorylation, Gap activity of Tsc toward Rheb, mTor-Raptor interaction, and mTor-dependent phosphorylation of 4Ebp1 and S6k. The pathway study in IPA (detailed pathway results are given in Supplement Figure 11, Supporting Information) revealed a correctness of 73%, equivalent to that in STRING (73%; detailed pathway results are given in Supplement Figure 12, Supporting Information). The analysis of the Insulin pathway including the background proteins from the HEK293T study in IPA also revealed a correctness of 73% (detailed pathway results are given in Supplement Figure 13, Supporting Information) and was again equivalent to the STRING pathway generation (73%; detailed pathway results are given in Supplement Figure 14, Supporting Information).
Apoptosis (Apop)
As the fourth signaling cascade we studied a classical apoptosis pathway. Proteins were selected on the basis of present knowledge by screening the literature for high-impact reviews.29,30 As components of the pathway we initially uploaded the proteins FasL, Puma, and Caspase9. As interplay proteins we expected the proteins/components FasR, Caspase8, Bid, Bcl-2, Bax, Bak, Cytc, Apaf1, and diablo to be included in the analysis (Table 2D). Many proteins in IPA were annotated as “other”, which was assessed as a negative hit in our test (33%). In contrast, the localization annotation was mainly correct (83%).
Table 4. Analysis of Canonical Pathways in IPA
Rank | IPA pathway study (no. of pathway proteins) [correctness of canonical pathway] | IPA background study (no. of pathway proteins)
Wnt (Wnt, Axin, Tcf)
1 | Basal Cell Carcinoma Signaling (3) | Fcγ Receptor-mediated Phagocytosis in Macrophages and Monocytes (0)
2 | Role of Wnt/Gsk-3β Signaling in the Pathogenesis of Influenza (3) | Basal Cell Carcinoma Signaling (3)
3 | Human Embryonic Stem Cell Pluripotency (3) | Germ Cell-Sertoli Junction Signaling (1)
4 | Ovarian Cancer Signaling (3) | Wnt/β-catenin Signaling (3)
5 | Wnt/β-catenin Signaling (3) [82%] | Role of Wnt/Gsk-3β Signaling in the Pathogenesis of Influenza (3)
Mapt (Mapt, Gsk3b, Tesk1)
1 | Amyloid Processing (2) | Fcγ Receptor-mediated Phagocytosis in Macrophages and Monocytes (0)
2 | Reelin Signaling in Neurons (2) | Integrin Signaling (1)
3 | 14-3-3-mediated Signaling (2) | 14-3-3-mediated Signaling (2)
4 | ILK Signaling (2) | Axonal Guidance Signaling (2)
5 | Axonal Guidance Signaling (2) | Regulation of Actin-based Motility by Rho (0)
Insu (Irs1, mTor, Rheb, 4Ebp1)
1 | mTor Signaling (4) [100%] | mTor Signaling (4)
2 | Regulation of eIF4 and p70S6K Signaling (4) | Fcγ Receptor-mediated Phagocytosis in Macrophages and Monocytes (0)
3 | Pi3k/Akt Signaling (4) | Regulation of eIF4 and p70S6K Signaling (3)
4 | Insulin Receptor Signaling (4) | Pi3k/Akt Signaling (3)
5 | Ampk Signaling (4) | Regulation of Actin-based Motility by Rho (0)
Apop (FasL, BBC3, Casp9)
1 | Induction of Apoptosis by HIV1 (3) [100%] | Fcγ Receptor-mediated Phagocytosis in Macrophages and Monocytes (0)
2 | Tumoricidal Function of Hepatic Natural Killer Cells (2) | Induction of Apoptosis by HIV1 (3)
3 | Molecular Mechanisms of Cancer (3) | Regulation of Actin-based Motility by Rho (0)
4 | Death Receptor Signaling (2) | Tumoricidal Function of Hepatic Natural Killer Cells (2)
5 | Myc Mediated Apoptosis Signaling (2) | Apoptosis Signaling (2)
Aicd (APP, Fe65, Tip60, Kai1)
1 | Reelin Signaling in Neurons (2) | Fcγ Receptor-mediated Phagocytosis in Macrophages and Monocytes (0)
2 | Docosahexaenoic Acid (DHA) Signaling (1) | Regulation of Actin-based Motility by Rho (0)
3 | Neuroprotective Role of THOP1 in Alzheimer’s Disease (1) | Integrin Signaling (0)
4 | Amyloid Processing (1) | RhoA Signaling (0)
5 | Mitochondrial Dysfunction (1) | Neuroprotective Role of THOP1 in Alzheimer’s Disease (1)
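The numbers in round brackets in Table 4 can be read as set overlaps between the uploaded proteins and the members of a canonical pathway. The following Python sketch illustrates only this counting step. The function name and the pathway memberships are hypothetical toy data, not IPA's library, and the sketch makes no attempt to reproduce IPA's ranking, which is not described here (Table 4 shows that entries with zero uploaded proteins can rank first).

```python
from typing import Dict, Set


def overlap_counts(uploaded: Set[str], pathways: Dict[str, Set[str]]) -> Dict[str, int]:
    """Count how many uploaded proteins are members of each pathway."""
    return {name: len(uploaded & members) for name, members in pathways.items()}


# Uploaded proteins of the Insu study as named in the text.
uploaded = {"Irs1", "mTor", "Rheb", "4Ebp1"}

# Hypothetical memberships, for illustration only.
toy_pathways = {
    "toy pathway A": {"Irs1", "mTor", "Rheb", "4Ebp1", "Akt", "Tsc1", "Tsc2"},
    "toy pathway B": {"mTor", "Akt"},
    "toy pathway C": {"FasL"},
}

counts = overlap_counts(uploaded, toy_pathways)
for name, count in counts.items():
    print(f"{name} ({count})")
```

A count equal to the number of uploaded proteins, as for the toy pathway A, corresponds to a pathway that contains all proteins of interest.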
Surprisingly, the pathway study in IPA revealed a low degree of correctness (60%; detailed pathway results are given in Supplement Figure 15, Supporting Information), which is due to the absence of some proteins belonging to the pathway (Table 2D). In contrast, STRING was able to correctly identify all interactions (100%; detailed pathway results are given in Supplement Figure 16, Supporting Information). Inclusion of the background proteins resulted in a correctness of 10% in IPA (detailed pathway results are given in Supplement Figure 17, Supporting Information) and 80% in STRING (detailed pathway results are given in Supplement Figure 18, Supporting Information). Inexplicably, only one interaction was reproduced by the IPA analysis.
Aicd
Finally, we tested a pathway generation including the action of a certain subdomain of APP. In brief, the APP intracellular domain (AICD) is thought to act as a transcriptional activator in combination with the adapter protein Fe65 and the histone acetyltransferase Tip60. The underlying mechanisms belong to the main expertise of our lab.31 One of the discussed target genes of the complex is the tetraspanin Kai1. Thus, for the pathway study we uploaded APP, Fe65, Tip60, and Kai1 in IPA and STRING, respectively. The expected interactions are shown in Table 2E. The annotation in IPA was weak, resulting in 9% correctness for the family annotation and 45% for the protein localization. The pathway generation revealed a correctness of 44% in IPA (detailed pathway results are given in Supplement Figure 19, Supporting Information) and 78% in STRING (detailed pathway results are given in Supplement Figure 20, Supporting Information) for the pathway study, whereas the inclusion of the background proteins yielded a correctness of 33% for the APP-dependent pathway in both software tools (detailed pathway results are given in Supplement Figures 21 (IPA) and 22 (STRING), Supporting Information).
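The percentage values reported for the pathway and background studies can be read as the share of expected interactions that a tool reproduced. A minimal Python sketch of such a score is given below under our own assumptions: the function name, the encoding of interactions as pairs, and the set of reported interactions are hypothetical. The expected pairs are the insulin interactions listed in the text, counted here as ten pairs, which need not match the counting used in Table 2.

```python
from typing import Iterable, Tuple

Interaction = Tuple[str, str]


def correctness_percent(expected: Iterable[Interaction],
                        reported: Iterable[Interaction]) -> float:
    """Share of expected interactions that the tool reported, in percent."""
    expected_set = set(expected)
    if not expected_set:
        raise ValueError("expected interaction list must not be empty")
    found = expected_set & set(reported)
    return 100.0 * len(found) / len(expected_set)


# Expected insulin interactions as listed in the text (assumed to be ten pairs).
expected = [
    ("Igfr", "Irs1"), ("Irs1", "Pi3k"), ("Pi3k", "PIP"), ("PIP", "Akt"),
    ("mTor", "Akt"), ("Akt", "Tsc"), ("Tsc", "Rheb"), ("mTor", "Raptor"),
    ("mTor", "4Ebp1"), ("mTor", "S6k"),
]

# Hypothetical tool output: seven expected pairs plus one unexpected pair.
reported = expected[3:] + [("Akt", "Gsk3b")]

score = correctness_percent(expected, reported)
print(f"correctness: {score:.0f}%")
```

Interactions that a tool reports in addition to the expected ones do not lower this score; whether such extra edges should be penalized is a design choice that the sketch leaves open.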
Table 5. Advantages/Disadvantages and User Recommendations^a
Category | IPA | STRING
Facts | 67% correctness in pathway study | 80% correctness in pathway study
Facts | 49% correctness in background study | 51% correctness in background study
Facts | 94% correctness in canonical pathway description (neurodegenerative pathways excluded) |
Advantages | + canonical pathway implementation | + more/less function to adapt pathway (enables clearer pathway representation)
Advantages | + localization information includible | + clear indication of source
Disadvantages | protein family and localization annotation needs improvement | no localization information
Disadvantages | number of proteins included in a pathway (network) is not dynamically adjustable | not possible to save analysis data
Disadvantages | getting source information complicated | restricted to some data format (does not upload IPI entries)
Disadvantages | | not recognizable which proteins of the generated pathway belong to the uploaded ones (not indicated by color)
^a Three recommendations:
1. Remove unspecific, balancer, and deja-vu proteins as effectively as possible before upload (less is more). For this purpose, literature data or (better) a replicate study should be used.
2. Check the reliability of pathways of interest in the software by uploading some proteins belonging to the pathway.
3. In IPA: visualize putative canonical pathways in the network description.
An overview of the results of all tested pathways is given in Table 3. To assess the observed difference in performance, a statistical test was conducted. An appropriate test for the comparison of two proportions is based on the observed number of correctly detected items: we regard the event of a correct identification as the realization of a Bernoulli-distributed random variable with success probability $p$, estimated by the relative frequency $\hat{p}$. The estimated probabilities that IPA and STRING come to a right decision are $\hat{p}^{\mathrm{path}}_{\mathrm{IPA}} = 0.66$ and $\hat{p}^{\mathrm{path}}_{\mathrm{STRING}} = 0.81$, respectively, in the pathway study (absolute difference: 0.15). In the background study, $\hat{p}^{\mathrm{back}}_{\mathrm{IPA}} = 0.53$ and $\hat{p}^{\mathrm{back}}_{\mathrm{STRING}} = 0.49$ lead to an absolute difference of 0.04. The values are shown in Table 3, which also contains the p-values of the corresponding tests32 (with continuity correction). Neither for the pathway study nor for the background study can the null hypothesis of equal proportions be rejected at the significance level of 0.05. All computations were conducted by means of R version 2.12.1.33 On average, no significant differences were found in regard to the power of pathway visualization in IPA vs STRING.
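The comparison of two proportions described above can be illustrated with a chi-square test on a 2x2 table using the Yates continuity correction. The following Python sketch is not the R code used for the study, and the counts are hypothetical: they were chosen only so that the proportions are close to 0.66 and 0.81, because the underlying numbers of items are given in Table 3 and not in the text. The function name is likewise our own.

```python
import math
from typing import Tuple


def two_proportion_test(success_a: int, total_a: int,
                        success_b: int, total_b: int) -> Tuple[float, float]:
    """Chi-square test of equal proportions with Yates continuity correction.

    Returns (chi_square, p_value) for a 2x2 table with one degree of freedom.
    """
    fail_a = total_a - success_a
    fail_b = total_b - success_b
    n = total_a + total_b
    col_success = success_a + success_b
    col_fail = fail_a + fail_b
    if min(total_a, total_b, col_success, col_fail) <= 0:
        raise ValueError("every row and column total must be positive")
    # Yates correction: shrink |ad - bc| by n/2, but never below zero.
    diff = max(0.0, abs(success_a * fail_b - fail_a * success_b) - n / 2.0)
    chi_square = n * diff ** 2 / (total_a * total_b * col_success * col_fail)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical counts (31 of 47 vs 38 of 47 correct items), illustration only.
chi_square, p_value = two_proportion_test(31, 47, 38, 47)
print(f"chi-square = {chi_square:.3f}, p = {p_value:.3f}")
```

With counts of this size, a difference of 0.15 between the two proportions does not reach the 0.05 significance level, which is in line with the conclusion stated above.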
Additional Features in IPA/STRING Considered in This Study
The STRING software tool is dedicated to the analysis of functional protein association networks. For the above-mentioned pathways we used the “action view”, which shows different types of action (activation, inhibition, etc.) in different colors. Furthermore, STRING provides the “evidence view”, which shows the source of information from which the evidence for an association was deduced. Finally, the “confidence view” separates stronger associations (thicker lines) from weaker associations (thinner lines). As we compared the results of STRING to the respective high-impact paper, we focused on the action view, which encompasses all information necessary to compare the results to those produced by IPA. Additional features are not implemented in STRING. IPA offers a large list of additional tools that seemed more or less helpful to us. Next to the generation of pathways, the software provides a ranked list of canonical pathways of which the uploaded (or significant) proteins are part (Table 4). The number of proteins of interest in each canonical pathway is given in round brackets. For our pathway study as well as for the background study we marked the relevant pathway in Table 4 in bold. Note that for the Mapt and the Aicd study no appropriate canonical pathways could be generated. There was no pathway containing all proteins of interest. Moreover, the suggested canonical pathway “Amyloid Processing” wrongly includes the Mapt phosphorylation cascade, whereas the cleavage of APP by secretases is insufficiently described. In contrast, the Wnt, Insu, and Apop pathways were well described by the IPA canonical pathways (correctness indicated in square brackets in Table 4; on average 94% when the neurodegenerative pathways are excluded for the reasons mentioned above; Figure 2A shows the Wnt canonical pathway in IPA as an example). The representation of the canonical pathway can be included in the pathway (network) diagram via the overlay function (Figure 2B). Additionally, IPA enables a pathway view showing the localization of each protein (Figure 2B). The sense and nonsense of these features are discussed below.
■ DISCUSSION
Pathway tools are popular as they promise a fast and reliable analysis of the mechanisms behind the results produced by OMICS analyses. Here, we tested the pathway tools IPA and STRING for their applicability in proteomics. A summarized overview of our test results, including facts, advantages, and disadvantages of IPA and STRING as well as user recommendations, is given in Table 5. The selection of appropriate proteins for pathway simulation was done according to recent scientific knowledge by screening high-impact reviews for the corresponding pathway. Based on these articles we used the terms “correct” and “correctness” to evaluate the quality of protein localization, family annotation, and network integration. At this point, we would like to emphasize that these evaluations are always relative and clearly depend on current knowledge. Moreover, taking other reviews as a basis might result in different findings. Thus, it was our endeavor to focus on established pathways known for a long time in order to justify the mentioned terms (correct, correctness) as well as possible. Considering the description of correct pathways, we did not find a significant difference between IPA and STRING, but STRING tended to describe the underlying pathways better than IPA in our pathway study. All in all, both tools delivered acceptable results with respect to the description of the underlying pathway. These results changed when background proteins were additionally uploaded. In STRING, there was a significant impairment, which is (of course) due to the slightly (but not significantly) better results of the pathway study. In general, the pathway description within the background study was deficient, with 50% correctness on average for both tools. Moreover, due to our presetting in IPA (to enable comparability with STRING), the pathway pictogram was confusing, in contrast to STRING, in which the more/less function enables an adjustment of the pathway description. For some questions, for example, the background study of Wnt in STRING and the background study of Apop in IPA, it is not possible to deduce that the corresponding pathway was affected. Note that in our Wnt signaling study the Fz receptor was not included in the pathway description (in IPA and STRING), although its ligand as well as downstream targets were included in the uploaded protein list. Of course, such negative findings result from the software algorithm trying to include all imported proteins in one network, raising the question of whether this is the right strategy. Thus, for the user of these software tools, there is a clear message: remove all putatively unspecific, balancer, or deja-vu (biologically irrelevant) proteins from the uploaded protein candidate list as effectively as possible in order to get a good description of the underlying pathway mechanisms (“less is more” principle).
In a real biological experiment this effect is even larger, as artificially overexpressed or silenced proteins of interest will result not only in biologically irrelevant proteins but also in a number of secondary changes (e.g., protein interaction with the protein of interest). For the removal of deja-vu proteins, the comparison of different proteome studies is of interest, as done by Petrak et al.21 Balancer proteins were initially mentioned in ref 22. However, the best strategy would be to perform a basic biological/technical replicate study to specifically identify those proteins that are found as unspecific targets in a given experimental setting and that have to be removed before data upload. Another strategy to analyze proteome data (or OMICS data in general) might be to focus on the identification of canonical pathways instead of generating complex pathways including all proteins of interest. Of the two tools mentioned, only IPA offers a library of canonical pathways. As for the pathway study, the user should initially check whether the research field of interest is properly implemented and annotated. This can easily be done by importing just a limited number of proteins belonging to a certain pathway of interest. Within our tests, analyzing the correctness of canonical pathways in IPA revealed that particularly the signaling mechanisms associated with neurodegeneration are deficiently implemented in the software. On the other hand, other pathways like Wnt and Insu signaling were annotated very well in IPA, pointing to reliable pathway analysis in this respect. Of course, these pretests are only possible if the user has an idea of which pathways she/he is looking for. The possibility to overlay canonical pathways on the pathway description in IPA seemed very helpful to us and should be applied (compare Figure 2B) to detect biological mechanisms. With STRING, it is not possible to visualize canonical pathways. In addition, IPA offers the function “grow” to increase the number of interacting proteins/molecules associated with a protein of choice. However, this function was not helpful for us, as it branched out in all directions from the protein of choice without considering other interactions in the pathway, resulting in a more confusing pathway description without added value. Instead, a “grow” function for canonical pathways in the network description seems much more desirable. Another suggestion to the software developers is the implementation of additional annotation of functional complexes like the apoptosome, proteasome, etc. in the data. This would help the user to understand the mechanisms behind her/his data faster. Moreover, neither in IPA nor in STRING is it possible to specify the origin of samples in detail. For example, researchers interested in the secretome of cells or the human secretome (like CSF) would like to filter explicitly by localization, as is possible for “Function and Disease” in IPA. Similarly, uploading data on posttranslational modifications of proteins, which are often very important for understanding pathway mechanisms, is currently not implemented. Finally, a filter function suggesting candidates belonging to the group of so-called deja-vu or balancer proteins (that might be excluded from the analysis) would be helpful21,22 to reduce the list of proteins for pathway generation to the basic essentials. In conclusion, the quality of pathway description is not significantly different between IPA and STRING, but IPA offers some additional helpful tools. Thus, for the “casual OMICS researcher”, we would recommend using the freely available STRING tool to get a “feeling” for putatively underlying pathways/mechanisms behind the data.
For research facilities in which proteomics is part of the daily business, the use of IPA might be more helpful because of its additional features. Considering our recommendations (Table 5), especially the removal of biologically irrelevant proteins from uploaded protein lists, the use of pathway tools is reasonable to identify underlying mechanisms and to design further validation experiments or functional studies. Of course, software-suggested pathways need to be handled with care, and our test strategy revealed that the analysis of complex data like proteomics data is not as easy as promised by the software producers. Incorrect handling can misdirect data interpretation. Finally, a researcher specialized in a certain disease or pathway will surely recognize specific mechanisms within his OMICS data much faster than software ever will.
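The first recommendation in Table 5, the removal of unspecific, balancer, and deja-vu proteins before upload, amounts to a simple list filter. A minimal Python sketch under our own assumptions is given below: the function name and all protein names in the exclusion lists are placeholders, and in practice the exclusion lists would come from literature data or (better) from a replicate study, as discussed above.

```python
from typing import Iterable, List, Set


def prefilter(candidates: Iterable[str], *exclusion_lists: Iterable[str]) -> List[str]:
    """Drop excluded and duplicate proteins, keeping the original order.

    Names are compared case-insensitively after stripping whitespace.
    """
    excluded: Set[str] = {
        name.strip().upper() for names in exclusion_lists for name in names
    }
    kept: List[str] = []
    seen: Set[str] = set()
    for name in candidates:
        key = name.strip().upper()
        if key and key not in excluded and key not in seen:
            seen.add(key)
            kept.append(name.strip())
    return kept


# Placeholder names: "ProtX" and "ProtY" stand for proteins flagged as
# deja-vu or as unspecific in a replicate study (hypothetical).
candidates = ["Irs1", "mTor", "ProtX", "Rheb", "ProtY", "irs1"]
deja_vu = ["ProtX"]
replicate_unspecific = ["ProtY"]

upload_list = prefilter(candidates, deja_vu, replicate_unspecific)
print(upload_list)
```

Only the reduced list would then be uploaded to the pathway tool, following the “less is more” principle.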
■ ASSOCIATED CONTENT
Supporting Information
Supplementary figures and table. This material is available free of charge via the Internet at http://pubs.acs.org.
■ AUTHOR INFORMATION
Corresponding Author
*Thorsten Müller, PhD, Functional Proteomics, Medizinisches Proteom-Center, Ruhr-University Bochum, D-44780 Bochum, Germany. E-mail: [email protected].
■ ACKNOWLEDGMENT
This work was funded by EU (cNEUPRO, sixth EU FP, project LSHM-CT-2007-037950), BMBF (NGFNPlus, project 01GS08138) and by FoRUM (Forschungsförderung Ruhr-Universität Bochum Medizinische Fakultät) AZ-F616-08 and AZ-F680-09. M.E. and C.S. are funded by PURE (Protein Unit for Research in Europe), a project of Nordrhein-Westfalen, a federal state of Germany. J.U. is part of CLIB (“Cluster Industrielle Biotechnologie”) within the QProM project, contract number 616 40003 0315413B. M.A. is funded by BioNRW.PROFILE, Förderkennzeichen 005-1006-0050, Projektträger Jülich (PTJ) project number z0911bt004f.
■ REFERENCES
(1) Dail, M. B.; Shack, L. A.; Chambers, J. E.; Burgess, S. C. Global liver proteomics of rats exposed for 5 days to phenobarbital identifies changes associated with cancer and with CYP metabolism. Toxicol. Sci. 2008, 106, 556–69.
(2) Dai, L.; Li, C.; Shedden, K. A.; Misek, D. E.; Lubman, D. M. Comparative proteomic study of two closely related ovarian endometrioid adenocarcinoma cell lines using cIEF fractionation and pathway analysis. Electrophoresis 2009, 30, 1119–31.
(3) Young, C.; Truman, P.; Boucher, M.; Keyzers, R. A.; Northcote, P.; Jordan, T. W. The algal metabolite yessotoxin affects heterogeneous nuclear ribonucleoproteins in HepG2 cells. Proteomics 2009, 9, 2529–42.
(4) Munday, D. C.; Hiscox, J. A.; Barr, J. N. Quantitative proteomic analysis of A549 cells infected with human respiratory syncytial virus subgroup B using SILAC coupled to LC-MS/MS. Proteomics 2010, 10, 4320–34.
(5) Gunawardana, C. G.; Kuk, C.; Smith, C. R.; Batruch, I.; Soosaipillai, A.; Diamandis, E. P. Comprehensive analysis of conditioned media from ovarian cancer cell lines identifies novel candidate markers of epithelial ovarian cancer. J. Proteome Res. 2009, 8, 4705–13.
(6) Gourley, G. R.; Yang, L.; Higgins, L.; Riviere, M. A.; David, L. L. Proteomic analysis of biopsied human colonic mucosa. J. Pediatr. Gastroenterol. Nutr. 2010, 51, 46–54.
(7) D’Alessandro, A.; Scaloni, A.; Zolla, L. Human milk proteins: an interactomics and updated functional overview. J. Proteome Res. 2010, 9, 3339–73.
(8) Overgaard, A. J.; Thingholm, T. E.; Larsen, M. R.; Tarnow, L.; Rossing, P.; McGuire, J. N.; et al. Quantitative iTRAQ-Based Proteomic Identification of Candidate Biomarkers for Diabetic Nephropathy in Plasma of Type 1 Diabetic Patients. Clin. Proteomics 2010, 6, 105–14.
(9) Klammer, M.; Godl, K.; Tebbe, A.; Schaab, C. Identifying differentially regulated subnetworks from phosphoproteomic data. BMC Bioinform. 2010, 11, 351.
(10) Zhang, Z.; Zhang, L.; Hua, Y.; Jia, X.; Li, J.; Hu, S.; et al. Comparative proteomic analysis of plasma membrane proteins between human osteosarcoma and normal osteoblastic cell lines. BMC Cancer 2010, 10, 206.
(11) Liu, Y.; Teng, X.; Yang, X.; Song, Q.; Lu, R.; Xiong, J.; et al. Shotgun proteomics and network analysis between plasma membrane and extracellular matrix proteins from rat olfactory ensheathing cells. Cell Transplant. 2010, 19, 133–46.
(12) Wang, H. Q.; Yang, B.; Xu, C. L.; Wang, L. H.; Zhang, Y. X.; Xu, B.; et al. Differential phosphoprotein levels and pathway analysis identify the transition mechanism of LNCaP cells into androgen-independent cells. Prostate 2010, 70, 508–17.
(13) Skynner, H. A.; Amos, D. P.; Murray, F.; Salim, K.; Knowles, M. R.; Munoz-Sanjuan, I.; et al. Proteomic analysis identifies alterations in cellular morphology and cell death pathways in mouse brain after chronic corticosterone treatment. Brain Res. 2006, 1102, 12–26.
(14) Liu, X. Y.; Yang, J. L.; Chen, L. J.; Zhang, Y.; Yang, M. L.; Wu, Y. Y.; et al. Comparative proteomics and correlated signaling network of rat hippocampus in the pilocarpine model of temporal lobe epilepsy. Proteomics 2008, 8, 582–603.
(15) Meneses-Lorente, G.; Watt, A.; Salim, K.; Gaskell, S. J.; Muniappa, N.; Lawrence, J.; et al. Identification of early proteomic markers for hepatic steatosis. Chem. Res. Toxicol. 2006, 19, 986–98.
(16) Tilton, R. G.; Haidacher, S. J.; Lejeune, W. S.; Zhang, X.; Zhao, Y.; Kurosky, A.; et al. Diabetes-induced changes in the renal cortical proteome assessed with two-dimensional gel electrophoresis and mass spectrometry. Proteomics 2007, 7, 1729–42.
(17) Hinkelbein, J.; Feldmann, R. E., Jr.; Peterka, A.; Schubert, C.; Schelshorn, D.; Maurer, M. H.; et al. Alterations in cerebral metabolomics and proteomic expression during sepsis. Curr. Neurovasc. Res. 2007, 4, 280–8.
(18) Chang, D. W.; Hayashi, S.; Gharib, S. A.; Vaisar, T.; King, S. T.; Tsuchiya, M.; et al. Proteomic and computational analysis of bronchoalveolar proteins during the course of the acute respiratory distress syndrome. Am. J. Respir. Crit. Care Med. 2008, 178, 701–9.
(19) Jimenez-Marin, A.; Collado-Romero, M.; Ramirez-Boo, M.; Arce, C.; Garrido, J. J. Biological pathway analysis by ArrayUnlock and Ingenuity Pathway Analysis. BMC Proc. 2009, 3 (Suppl 4), S6.
(20) Deighton, R. F.; Kerr, L. E.; Short, D. M.; Allerhand, M.; Whittle, I. R.; McCulloch, J. Network generation enhances interpretation of proteomic data from induced apoptosis. Proteomics 2010, 10, 1307–15.
(21) Petrak, J.; Ivanek, R.; Toman, O.; Cmejla, R.; Cmejlova, J.; Vyoral, D.; et al. Deja vu in proteomics. A hit parade of repeatedly identified differentially expressed proteins. Proteomics 2008, 8, 1744–9.
(22) Mao, L.; Zabel, C.; Herrmann, M.; Nolden, T.; Mertes, F.; Magnol, L.; et al. Proteomic shifts in embryonic stem cells with gene dose modifications suggest the presence of balancer proteins in protein regulatory networks. PLoS One 2007, 2, e1218.
(23) Stephan, C.; Reidegeld, K. A.; Hamacher, M.; van Hall, A.; Marcus, K.; Taylor, C.; et al. Automated reprocessing pipeline for searching heterogeneous mass spectrometric data of the HUPO Brain Proteome Project pilot phase. Proteomics 2006, 6, 5015–29.
(24) Muller, T.; Loosse, C.; Schroetter, A.; Schnabel, A.; Helling, S.; Egensperger, R.; et al. The AICD interacting protein DAB1 is upregulated in Alzheimer frontal cortex brain samples and causes deregulation of proteins involved in gene expression changes. Curr. Alzheimer Res. 2011, 8, 573–82.
(25) Roberts, D. M.; Slep, K. C.; Peifer, M. It takes more than two to tango: Dishevelled polymerization and Wnt signaling. Nat. Struct. Mol. Biol. 2007, 14, 463–5.
(26) Clevers, H. Wnt/beta-catenin signaling in development and disease. Cell 2006, 127, 469–80.
(27) Matenia, D.; Mandelkow, E. M. The tau of MARK: a polarized view of the cytoskeleton. Trends Biochem. Sci. 2009, 34, 332–42.
(28) Corradetti, M. N.; Guan, K. L. Upstream of the mammalian target of rapamycin: do all roads pass through mTOR? Oncogene 2006, 25, 6347–60.
(29) Roos, W. P.; Kaina, B. DNA damage-induced cell death by apoptosis. Trends Mol. Med. 2006, 12, 440–50.
(30) Yu, J.; Zhang, L. PUMA, a potent killer with or without p53. Oncogene 2008, 27 (Suppl 1), S71–S83.
(31) Muller, T.; Meyer, H. E.; Egensperger, R.; Marcus, K. The amyloid precursor protein intracellular domain (AICD) as modulator of gene expression, apoptosis, and cytoskeletal dynamics-Relevance for Alzheimer’s disease. Prog. Neurobiol. 2008, 85, 393–406.
(32) Wilson, E. B. Probable inference, the law of succession, and statistical inference. J. Am. Stat. Assoc. 1927, 22, 209–12.
(33) R Development Core Team. R: A language and environment for statistical computing; R Foundation for Statistical Computing: Vienna, Austria, 2010.