Update of the Functional Mitochondrial Human Proteome Network


Article

An update of the functional mitochondrial human proteome network Chiara Monti, Lydie Lane, Mauro Fasano, and Tiziana Alberio J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.8b00447 • Publication Date (Web): 19 Sep 2018 Downloaded from http://pubs.acs.org on September 20, 2018



Journal of Proteome Research

An update of the functional mitochondrial human proteome network

Chiara Monti (a), Lydie Lane (b), Mauro Fasano (a), Tiziana Alberio* (a)

(a) Department of Science and High Technology, and Center of Bioinformatics, University of Insubria, Busto Arsizio, Italy.

(b) Computer and Laboratory Investigation of Proteins of Human Origin (CALIPHO), SIB Swiss Institute of Bioinformatics, and Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, Centre Médical Universitaire (CMU), Geneva, Switzerland.

*corresponding author: Tiziana Alberio, PhD
Department of Science and High Technology
University of Insubria
Via Manara, 7
I-21052, Busto Arsizio, (VA)
e-mail: [email protected]

1 ACS Paragon Plus Environment


Abstract

Due to the pivotal role of mitochondrial alterations in several diseases, the Human Proteome Organization (HUPO) has promoted in recent years an initiative to characterize the mitochondrial human proteome, the mitochondrial human proteome project (mt-HPP). Here, we generated an updated version of the functional mitochondrial human proteome network, made of nodes (mitochondrial proteins) and edges (gold binary interactions), using data retrieved from neXtProt, the reference database for HPP metrics. The principal new concept suggested was the consideration of mitochondria-associated proteins (first interactors), which may influence mitochondrial functions. All the proteins described as mitochondrial in the sublocation and/or the GO Cellular Component sections of neXtProt were considered. Their other sub-cellular and sub-mitochondrial localizations have been analyzed. The network represents the effort to collect all the high-quality binary interactions described so far for mitochondrial proteins and gives the community the possibility to reuse the information collected. As a proof of principle, we mapped proteins with no annotated function, in order to speculate on their role from the background knowledge of their interactors, and proteins described to be involved in Parkinson’s Disease, a neurodegenerative disorder in which mitochondria are known to play a central role.

Keywords

Mitochondria, Protein-Protein Interaction, neXtProt, Mitochondrial Human Proteome Project, Network


Introduction

Mitochondria are complex and essential organelles of eukaryotic cells and play a leading role in several cellular functions such as energy production, calcium ion homeostasis, regulation of death pathways and aging1,2. Mitochondrial impairment has been associated with many pathological conditions, including metabolic syndromes, cardiovascular diseases, cancers and neurodegenerative disorders3-5. Several human disorders are strictly associated with mutations in the mitochondrial genome, whereas others are related to alterations of proteins encoded by nuclear chromosomes that are imported into mitochondria, or to mutations in proteins that, interacting with mitochondria, regulate mitochondrial homeostasis6,7. Therefore, over the past 20 years, the study of the mitochondrial proteome associated with particular physiological or pathological conditions has increased. Indeed, defining the molecular landscape underlying mitochondrial dysfunction could open new opportunities for prevention, diagnosis and therapy for several diseases8.

Mass spectrometry (MS)-based proteomics is commonly used to define the mitochondrial proteome. To increase the identification of mitochondrial proteins, mitochondrial enrichment of the sample is needed. However, enrichment protocols have some drawbacks and can i) deplete mitochondrial matrix proteins, due to mitochondrial damage and the consequent leakage of these proteins, ii) generate membrane-surrounded artifacts, containing proteins originally not present in the organelle, iii) lose interactions with proteins associated with mitochondria (e.g., proteins of the endoplasmic reticulum or the autophagosome) that influence mitochondrial functionality, maintenance, dynamics and metabolism9. Moreover, a few hundred proteins are more dynamically present, such as proteins present under specific conditions or proteins present only in certain tissues10-13. For these reasons, a great number of mitochondrial proteins have yet to be identified and characterized.

Due to the pivotal role of mitochondrial alterations in several diseases, the Human Proteome Organization (HUPO) has promoted in recent years an initiative to characterize the mitochondrial human proteome, the mitochondrial human proteome project (mt-HPP), as part of both the chromosome-centric (C-HPP) and the “biology and disease” (B/D-HPP) initiatives. The main goal of mt-HPP is to obtain robust information about the integrative role of proteins acting at the mitochondrial level. By combining proteomics and computational biology, it would be possible to systematically consider both proteins encoded by the mitochondrial DNA (mt-DNA) and the nuclear-encoded mitochondrial proteins (either imported into mitochondria or simply associated with them), to understand the function of the mitochondrial proteome and its crosstalk with the proteome of other organelles14,15.


Several databases are available for the community investigating the mitochondrial proteome. Two main contributions in the field are MitoCarta and the Integrated Mitochondrial Protein Index (IMPI). The human MitoCarta 2.0 reports 1158 genes encoding mitochondrial proteins. It was built by combining data obtained from the literature, APEX-based mass spectrometry experiments, GFP-tagging/microscopy and the results of seven large-scale mitochondrial datasets16. IMPI, on the other hand, uses a random forest machine learning classifier that scores all human genes to recognize mitochondrial proteins. The IMPI version Q2 2018 contains 1626 human genes encoding mitochondrial proteins17.

neXtProt is a comprehensive human-centric discovery platform that is used as the reference database for HPP metrics18. In neXtProt, data obtained from several high-throughput approaches (such as micro-array, RNA sequencing, antibody screens, proteomics, interactomics) are added to the information on human proteins available in UniProtKB/Swiss-Prot. All of these data are carefully selected so that only high-quality data are provided19,20. For instance, neXtProt imports antibody-based data on tissue expression and subcellular location from the Human Protein Atlas (HPA)12. Each data point is considered “Gold” or “Silver” depending on quality criteria that have been defined in collaboration with the HPA team. Therefore, we decided to use neXtProt to generate an updated version of the functional mitochondrial human proteome network14, made of all mitochondrial proteins and their first interactors.

This functional mitochondrial human proteome network can be used to: i) improve the neXtProt database and the general knowledge about mitochondrial proteins (e.g., specify their sub-mitochondrial localization, validate their existence at protein level, identify other cellular localizations for mitochondrial proteins), ii) hypothesize functions of mitochondrial proteins with unknown function, through the analysis of the function of first neighbors and iii) map proteins derived from proteomics studies focused on diseases involving mitochondrial alterations, with the aim of highlighting the mitochondrial molecular factors and pathways involved. As an example, we used the network to retrieve information about the main mitochondrial processes involved in Parkinson’s Disease and highlight them in a protein network.


Experimental Procedures

Input list generation

Using the following query in the advanced search of the neXtProt database (https://www.nextprot.org/, release 2018-01-17), all proteins annotated as mitochondrial in the subcellular location and/or GO Cellular Component sections with high quality (gold label) were collected.

select distinct ?entry where {
  values ?mitoloc {cv:SL-0173 cv:GO_0005739}
  ?entry :isoform ?iso.
  ?iso :cellularComponent ?loc .
  ?loc :term/:childOf ?mitoloc .
  ?loc :evidence / :quality :GOLD .
  filter not exists {?loc :negativeEvidence ?_.}
}

Two input lists were generated: the SUB_LOC list and the GO_CC list. A Venn diagram (http://bioinformatics.psb.ugent.be/cgi-bin/liste/Venn/calculate_venn.htpl) was used to compare the SUB_LOC list and the GO_CC list. The MITO list was created by merging the SUB_LOC list and the GO_CC list. MitoMiner 4.0 was used to download the MitoCarta (http://mitominer.mrc-mbu.cam.ac.uk/release-4.0/mitocarta.do) and IMPI (http://mitominer.mrc-mbu.cam.ac.uk/release-4.0/impi.do) datasets17. In IMPI, predictions with a score of 0.8 or greater indicate strong evidence of mitochondrial localisation, and only these were kept in the final IMPI dataset used for the comparison with gold neXtProt annotations. A Venn diagram (http://bioinformatics.psb.ugent.be/cgi-bin/liste/Venn/calculate_venn.htpl) was used to compare the three datasets.
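The list comparison and merging described above can be illustrated with plain set operations. This is only a minimal sketch: the accession numbers are hypothetical placeholders, and the function name `compare_lists` is an assumption made for illustration, not part of the authors' workflow (which used the web-based Venn tool cited in the text).

```python
# Hypothetical accession lists; the real ones come from the neXtProt query results.
sub_loc = {"NX_P00001", "NX_P00002", "NX_P00003"}
go_cc = {"NX_P00002", "NX_P00003", "NX_P00004"}


def compare_lists(a, b):
    """Return the three regions of a two-set Venn diagram plus the union."""
    return {
        "only_a": a - b,   # present only in the first list
        "only_b": b - a,   # present only in the second list
        "shared": a & b,   # present in both lists
        "union": a | b,    # merged, duplicate-free list
    }


regions = compare_lists(sub_loc, go_cc)

# The merged list plays the role of the MITO list in this toy example.
mito_list = sorted(regions["union"])
print(len(regions["shared"]), "shared entries;", len(mito_list), "entries in the merged list")
```

The same function can be applied pairwise to compare the merged list with the MitoCarta and IMPI identifiers once they are mapped to a common identifier type.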

Identification of mitochondrial missing proteins

The MITO list was analyzed using the Protein Evidence (PE) information retrieved from the neXtProt database to highlight missing proteins. Proteins with evidence at transcript level (PE2), inferred from homology (PE3) and predicted (PE4) were used to generate the MISSING list. For previously unobserved proteins, we looked for demonstrations of the protein existence in the literature (www.pubmed.org) and in the Global Proteome Machine database GPMdb (gpmdb.thegpm.org)21.
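The selection of missing proteins by PE level can be sketched as a simple filter. The records below are hypothetical, and the names `MISSING_LEVELS` and `missing_proteins` are assumptions introduced for this illustration only.

```python
from collections import Counter

# Hypothetical (accession, protein evidence level) pairs for entries of a MITO-like list.
mito_pe = [
    ("NX_A", "PE1"),
    ("NX_B", "PE2"),
    ("NX_C", "PE3"),
    ("NX_D", "PE1"),
]

# PE2 (transcript level), PE3 (homology) and PE4 (predicted) define missing proteins.
MISSING_LEVELS = {"PE2", "PE3", "PE4"}


def missing_proteins(records):
    """Keep the accessions whose evidence level marks them as missing proteins."""
    return [accession for accession, level in records if level in MISSING_LEVELS]


missing_list = missing_proteins(mito_pe)
level_counts = Counter(level for _, level in mito_pe)
print(missing_list, dict(level_counts))
```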


The functional mitochondrial human proteome network generation

To obtain a comprehensive mitochondrial network, the SnorQL search tool of neXtProt (http://snorql.nextprot.org/) was used, and all gold protein-protein interactions for proteins in the MITO list were retrieved from the protein interaction section, using the following query:

select distinct ?entry ?interactant where {
  values ?entry {
    #This is to replace with MITO list entries
    entry:NX_….
    entry:NX_….
  }
  ?entry :isoform /:binaryInteraction ?interaction.
  ?interaction :interactant ?interactant; :quality :GOLD.
  ?interactant a :Entry.
}
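Since the query above has to be filled with the entries of the MITO list, the substitution step can be automated. The following is a minimal sketch that only builds the query text; the accessions are hypothetical placeholders, the function name `build_interaction_query` is an assumption, and sending the query to the neXtProt SPARQL service is deliberately left out.

```python
# Template mirroring the interaction query; doubled braces are literal braces in str.format.
QUERY_TEMPLATE = """select distinct ?entry ?interactant where {{
  values ?entry {{ {entries} }}
  ?entry :isoform / :binaryInteraction ?interaction .
  ?interaction :interactant ?interactant ; :quality :{quality} .
  ?interactant a :Entry .
}}"""

ALLOWED_QUALITIES = {"GOLD", "SILVER"}


def build_interaction_query(accessions, quality="GOLD"):
    """Insert a list of neXtProt accessions into the VALUES clause of the query."""
    if quality not in ALLOWED_QUALITIES:
        raise ValueError(f"unknown quality label: {quality}")
    entries = " ".join(f"entry:{accession}" for accession in accessions)
    return QUERY_TEMPLATE.format(entries=entries, quality=quality)


# Hypothetical accessions standing in for the MITO list.
query = build_interaction_query(["NX_P00001", "NX_P00002"])
print(query)
```

Calling the function with `quality="SILVER"` produces the corresponding query for the less stringent interaction set.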

Moreover, we also retrieved silver interactions (:quality :SILVER) and reported this information in the Supporting Information Table 1. The SnorQL search tool was used to retrieve all gold information in the Interactions section for the IDs in the MITO list, using the following query:

select distinct ?entry ?comment where {
  values ?entry {
    entry: #This is to replace with MITO list entries
  }
  ?entry :isoform ?iso.
  ?iso :interactionInfo /rdfs:comment ?comment.
}

Binary interactions were manually retrieved from the free text annotations and reported in the network table, indicating whether the information came from studies performed on human samples or was inferred from homology (by similarity) (Supporting Information Table 1). The functional mitochondrial human proteome network (MITO network) was generated using Cytoscape 3.6.1 (http://www.cytoscape.org/)22. In this network, nodes represent mitochondrial proteins and their first interactors, while edges between nodes represent protein-protein interaction data (gold).
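The structure of such a network (mitochondrial nodes, first interactors and connected clusters) can be illustrated without any dedicated software. The sketch below uses only the Python standard library and a toy edge list; the node names and the helper functions `build_network` and `connected_components` are assumptions for illustration and do not replace the Cytoscape workflow used by the authors.

```python
from collections import defaultdict, deque


def build_network(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def connected_components(adjacency):
    """Return the connected components, largest first, using breadth-first search."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Toy data: M* are mitochondrial proteins, X* are non-mitochondrial partners.
mito = {"M1", "M2", "M3"}
edges = [("M1", "X1"), ("M1", "M2"), ("M3", "X2")]

network = build_network(edges)
first_interactors = set(network) - mito      # nodes present only as interaction partners
components = connected_components(network)
principal_cluster = components[0]            # the largest connected component
print(len(network), "nodes,", len(first_interactors), "first interactors")
```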


Bioinformatics Analysis

Sub-cellular and sub-mitochondrial location

Gold data of the subcellular location and GO Cellular Component sections of the neXtProt database were used to classify the cellular compartment and the sub-mitochondrial localization (e.g., mitochondrial inner membrane, mitochondrial outer membrane, mitochondrial matrix) of mitochondrial proteins of the MITO network. These data were obtained using the advanced search of the neXtProt database and substituting the desired pair of codes (Table 1 and Table 2) in the following query:

select distinct ?entry where {
  values ?mitoloc {cv:#Uniprot code cv:GO code}
  ?entry :isoform ?iso.
  ?iso :cellularComponent ?loc .
  ?loc :term/:childOf ?mitoloc .
  ?loc :evidence / :quality :GOLD .
  filter not exists {?loc :negativeEvidence ?_.}
}

Possible application of the network

Function

Using the advanced search of the neXtProt database, proteins without any annotated function were retrieved, using the query:

select distinct ?entry where {
  ?entry :isoform ?iso.
  filter not exists { ?iso :functionInfo ?_ . }
  filter not exists {
    ?iso :function ?func .
    optional {?func :term ?fterm1 .}
    filter(!bound(?fterm1)) #eliminates functions from pathways
  }
  filter not exists {
    ?iso :function / :term ?fterm .
    filter(?fterm != cv:GO_0005524 && ?fterm != cv:GO_0000287 && ?fterm != cv:GO_0005515 && ?fterm != cv:GO_0042802 &&
           ?fterm != cv:GO_0008270 && ?fterm != cv:GO_0003676 && ?fterm != cv:GO_0051260 && ?fterm != cv:GO_0005509 &&
           ?fterm != cv:GO_0003824 && ?fterm != cv:GO_0007165 && ?fterm != cv:GO_0035556)
    # eliminates proteins whose ONLY GO functions are one of ATP-binding, magnesium-binding, calcium-binding, zinc-binding,
    # nucleic acid binding, protein-binding, identical protein binding, protein homooligomerization, catalytic activity, signal transduction,
  }
}

Then, each of these proteins was mapped on the MITO network, its interactors detected, and subclusters extracted.
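The mapping step described above amounts to extracting, for each protein of interest, the edges that touch it and the partners at the other end. A minimal sketch follows; the two interactions CCDC90B-MCU and HSDL1-STYXL1 are taken from the text, whereas the third edge and the function name `neighborhood` are hypothetical additions for illustration.

```python
def neighborhood(edges, protein):
    """Return the edges touching `protein` and the set of its first neighbors."""
    sub_edges = [(a, b) for a, b in edges if protein in (a, b)]
    neighbors = {a if b == protein else b for a, b in sub_edges}
    neighbors.discard(protein)  # ignore self-interactions
    return sub_edges, neighbors


edges = [
    ("CCDC90B", "MCU"),     # interaction reported in the text
    ("HSDL1", "STYXL1"),    # interaction reported in the text
    ("PROT_A", "PROT_B"),   # hypothetical unrelated edge
]

for protein in ("CCDC90B", "HSDL1"):
    sub_edges, partners = neighborhood(edges, protein)
    print(protein, "->", sorted(partners))
```

The annotated functions and locations of the returned partners are then the background knowledge from which a hypothesis about the unannotated protein can be drawn.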

Proteins relevant for a particular condition

Using the MITO network, it was possible to highlight mitochondrial proteome alterations involved in Parkinson’s Disease pathogenesis. To find molecular pathways specifically involved in Parkinson’s Disease, a list of proteins altered only in Parkinson’s Disease23 was mapped on the MITO network. Mapped nodes were extracted (Parkinson’s Disease sub-cluster). Then, an over-representation analysis of the nodes of the Parkinson’s Disease sub-cluster was performed using the “Analyze tool” in Reactome24 and Panther25 (http://www.geneontology.org/) with the GO Biological Process database. Arbitrarily, only GO Biological Processes with fold enrichment > 5 were considered26. This limit allowed us to focus on specific ontologies (end nodes of ontology trees) and eliminate more general macro GO terms. For both analyses only categories with FDR [...]

[...] 0.8 as fixed by curators). The majority of mitochondrial proteins (853) are in common among the three databases. 63% (897 out of 1413) of neXtProt mitochondrial IDs are in common with MitoCarta, while 71% (1000 out of 1413) of IDs are shared with IMPI. 554 proteins considered mitochondrial by other resources are not classified as such by neXtProt, or at least they do not have a gold mitochondrial annotation. Conversely, 369 proteins considered mitochondrial by neXtProt are not classified as such by the two other databases.

Protein Evidence level of mitochondrial proteins

Proteins were classified according to the Protein Evidence (PE) information obtained from the neXtProt database (Table 3). The majority of proteins had evidence at the protein level (1390 IDs


were PE1). Twenty proteins had evidence at the transcript level (PE2), only 3 proteins were inferred from homology (PE3), and no mitochondrial protein was predicted (PE4). “Missing proteins” as defined by the HUPO HPP consortium are neXtProt PE2-4 entries. They are proteins confidently predicted from genome annotation which have not yet been validated in biological samples. Validation of these proteins is one of the key current challenges of the HPP consortium27,28. Missing proteins from our network were carefully examined in order to possibly complete their annotation. PE2 and PE3 protein IDs constituted a new list (MISSING list, containing 23 IDs).

We searched the literature for demonstrations of the existence of proteins of the MISSING list and found that methyltransferase-like protein 12 (METTL12) has already been characterized by Malecki and coworkers29. Using several approaches (in vivo and in vitro), they demonstrated that METTL12 methylates citrate synthase (CS). The methylation of Lys-395 inhibits CS. This post-translational modification is blocked by oxaloacetate and by S-adenosylhomocysteine. Moreover, in another paper, Rhein and coworkers30 described the methylation of another lysine (Lys-368) of CS by METTL12, probably influencing protein-protein interactions in the metabolism of the citric acid cycle29. Due to these new data, the corresponding entry (A8MUP2) has been updated by UniProtKB (2018_01 release) and its PE status will be upgraded to PE1 in the next neXtProt release. We also looked for mass spectrometry evidence in GPMdb, but we did not find peptides for the mitochondrial MISSING proteins in addition to the ones already reported in neXtProt.

ATP5G2 and ATP5G3 are two genes encoding precursors of the mitochondrial ATP synthase proteolipid. They have different import sequences, but the mature proteins after cleavage are identical to the mature form of ATP5G1, a PE1 protein. Since the mature forms of these proteins cannot be distinguished at protein level, specific approaches will be required to validate them in the precursor form. Four proteins with evidence at the transcript level (PE2) have no annotated function in the neXtProt database (MCCD1, GDF5OS, TMEM71, ANKRD37). The others are predicted to act in mitochondrial transport, the Electron Transport Chain (ETC) and apoptosis (Table 4). As concerns PE3 proteins (Table 5), two of them were inferred from mouse and rat (KIF28P) or yeast and mouse (UQCRHL). On the other hand, ATP5EP2 was defined as a pseudogene, but Yu and coworkers31 identified one of its peptides by MS in the context of a phosphoproteome analysis.

The functional mitochondrial human proteome network generation

The main goal of mt-HPP is to obtain robust information about the mitochondrial proteome, considering its relationship with other cellular structures. For this purpose, a preliminary version of a functional mitochondrial network has been proposed14. The principal new concept suggested was


the consideration of mitochondria-associated proteins, which may influence mitochondrial functionality, maintenance, dynamics and metabolism and, consequently, all the cellular events guided by these organelles. In order to generate an updated version of the functional mitochondrial network, we used node (protein) and edge (interaction) information retrieved from neXtProt, the reference database for HPP metrics32. Starting from the MITO list, we performed a network-based analysis, using the protein-protein interaction information from neXtProt. This information is stored in two different ways: binary interaction annotations imported from IntAct, in the Protein-protein interaction section, and textual descriptions imported from UniProtKB/Swiss-Prot that summarize the literature on protein complexes, in the Interactions section. We decided to be stringent and keep only “Gold” high-quality data, in order to eliminate false positive binary interactions. Out of the 1413 proteins from the MITO list, 588 were excluded since they did not have gold annotated interactions. The resulting functional mitochondrial human proteome network (MITO network) had 3395 nodes: 825 were mitochondrial proteins (yellow nodes), while 2570 were first interactors (blue nodes) (Figure 2). 3195 nodes were in the principal cluster. The adjacency table with all the nodes and edges is available in the Supporting Information, so that the network can be reproduced in Cytoscape and interactively re-used (Supporting Information Table 1). Information about the data origin (human studies or inferred by homology) has been added. Although we used only gold interactions, the adjacency table also reports silver ones, so that it can be used in less stringent analyses than those reported here.
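Re-using the adjacency table at a different stringency comes down to filtering its rows on the quality and data-origin annotations. The sketch below assumes a tab-separated layout with columns named `source`, `target`, `quality` and `origin`; these column names, the toy rows and the function `load_edges` are hypothetical and would need to be adapted to the actual layout of Supporting Information Table 1.

```python
import csv
import io

# Toy tab-separated table with hypothetical column names and rows.
TABLE = """source\ttarget\tquality\torigin
A\tB\tGOLD\thuman
A\tC\tSILVER\thuman
B\tD\tGOLD\tby similarity
"""


def load_edges(handle, qualities=("GOLD",), origins=None):
    """Read edges, keeping only rows with the requested quality and data origin."""
    edges = []
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["quality"] not in qualities:
            continue
        if origins is not None and row["origin"] not in origins:
            continue
        edges.append((row["source"], row["target"]))
    return edges


gold_edges = load_edges(io.StringIO(TABLE))
relaxed_edges = load_edges(io.StringIO(TABLE), qualities=("GOLD", "SILVER"))
human_gold_edges = load_edges(io.StringIO(TABLE), origins=("human",))
print(len(gold_edges), len(relaxed_edges), len(human_gold_edges))
```

Keeping the filter parameters explicit makes it easy to record which stringency level a derived network was built with.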

Sub-cellular and sub-mitochondrial location of mitochondrial nodes

As already mentioned, mitochondrial proteins that were retrieved thanks to annotations in the subcellular location and GO Cellular Component sections may not be exclusively mitochondrial. Indeed, according to HPA, more than half of the human proteome has multiple subcellular localizations, either simultaneously occurring within a cell or alternatively occurring at different stages of the cell cycle12. To understand how many proteins were exclusively mitochondrial in our network and which other cellular locations were associated with mitochondrial proteins, we classified all the mitochondrial nodes (yellow nodes, Figure 2) of the functional mitochondrial network in terms of other sub-cellular locations. To do this, we collected all gold information about their locations in the same sections of the neXtProt database and summarized it in a graphical visualization (Figure 3). We decided to take into account only the six most represented cellular components (Cytoplasm, Golgi Apparatus, Lysosome, Endoplasmic Reticulum, Mitochondrion, Nucleus). 4% (36) were exclusively mitochondrial (Figure 3), whereas the majority of the proteins had other cellular locations. Cytoplasm is the most represented other cellular location for


mitochondrial proteins (93% of mitochondrial proteins are also localized in the cytoplasm). This is probably due either to their detection before entry into the organelle or to their shuttling to and from mitochondria under different conditions. The second most represented cellular location is the nucleus (41% of mitochondrial proteins are also localized in the nucleus). This may also be due to the fact that nuclei are the major contaminants when isolating mitochondria9. Nevertheless, it is well known that many proteins translocate from the nucleus to mitochondria and vice versa following several stimuli, contributing, for example, to mitochondrial biogenesis and functionality during stress and aging33.

Mitochondria are complex organelles where different functions are associated with distinct sub-compartments, i.e., the outer membrane, the intermembrane space, the inner membrane and the matrix. While 376 proteins only have a generic mitochondrion localization annotation in the neXtProt subcellular location or GO Cellular Component sections, the other 449 mitochondrial nodes have more precise sub-mitochondrial annotations. 138 proteins were localized in the mitochondrial matrix, 103 were inner mitochondrial membrane proteins, 92 were localized in the outer membrane and 16 were localized in the intermembrane space. The others either have an unclear localization in the outer or inner membrane (5), or are located in multiple sub-compartments (Figure 4).

Speculations about the sub-mitochondrial location and function of proteins using the functional mitochondrial network

The functional mitochondrial proteome network can be a starting point to perform several analyses. For example, it can be used to enrich the database with annotations, by speculating on the precise sub-mitochondrial location of proteins classified as “generic mitochondrion” or on their function, exploiting the background knowledge of their interactors.
Using the neXtProt database, we identified 4 mitochondrial proteins (yellow nodes) and 80 first-interactor proteins (light blue nodes) that had unknown function (Figure 5). CCDC90B, HSDL1, NIPSNAP3A and FAM210A are PE1 mitochondrial proteins that had no function annotated in neXtProt. The localization of NIPSNAP3A in the mitochondrion was inferred on the basis of the phylogenetic tree (RefGenome database), whereas this protein was also found in cytoplasmic vesicles of macrophages by antibody-based assays34 and in sperm nuclei by mass spectrometry35. FAM210A is expressed in mitochondria and cytoplasm in mouse muscle, heart and brain. Based on the sequence similarity of FAM210A in mouse and human, it was inferred that human FAM210A is localized in mitochondria36. CCDC90B and HSDL1 were shown to be mitochondrial by antibody-based assays performed by HPA. To show examples of a possible use of the MITO network, we extracted the CCDC90B and HSDL1 neighborhoods from the network (Figure 6). CCDC90B, which has a transmembrane domain,


interacts with MCU (Figure 6A). Moreover, it is a paralogue of MCUR1. Based on these observations, it is possible to speculate that CCDC90B is a protein of the inner mitochondrial membrane. MCUR1, formerly named CCDC90A, was described to play a role both in MCU and complex IV assembly37,38. Since the two proteins are paralogues, it is possible to speculate that CCDC90B is also a scaffold protein for inner mitochondrial membrane proteins, with a different specificity with respect to MCUR1. HSDL1 interacts with STYXL1, a non-mitochondrial protein in our network (Figure 6B). HSDL1 is a hydroxysteroid dehydrogenase-like protein expected to be inactive because of the substitution of the tyrosine in the active site with a phenylalanine. STYXL1 is an inactive phosphatase that was shown to regulate mitochondrial-dependent apoptosis39, possibly by negatively regulating PTPMT1, an important component of the cardiolipin biosynthetic pathway40. This example highlights the importance of considering mitochondria-associated proteins when investigating the mitochondrial proteome, since a nominally non-mitochondrial protein (STYXL1) was demonstrated to be fundamental in influencing mitochondrial function. Moreover, we may speculate that HSDL1 influences STYXL1 action at the mitochondrial level.

Focus on the mitochondrial impairment in Parkinson’s Disease

The functional mitochondrial human proteome network can also be used to focus on mitochondrial protein complexes or biochemical pathways potentially involved in disease conditions where mitochondrial impairment plays a major role, such as Parkinson’s Disease. 675 proteins have already been related specifically to Parkinson’s Disease pathogenesis23 (Supporting Information Table 3). 208 of these proteins were mapped onto the functional mitochondrial human proteome network (Figure 7). Therefore, a considerable percentage of Parkinson’s Disease-related proteins (31%) mapped on the MITO network, highlighting the central role of mitochondria in this pathology. Mapping Parkinson’s Disease proteins on the MITO network allowed us to eliminate all proteins that were not mitochondrial (or associated with this organelle). Thus, we were able to focus our attention only on mitochondrial pathways altered in Parkinson’s Disease pathogenesis. For this purpose, an over-representation analysis was performed using Reactome and GO Biological Process as reference databases. The whole list of results (Supporting Information Table 4-5) highlighted the processes in Parkinson’s Disease pathogenesis that are due to mitochondrial proteome alterations. Reactome results highlighted the central role of mitophagy (“Pink/Parkin Mediated Mitophagy”, FDR 3.92×10⁻⁴ and “Mitophagy”, FDR 8.02×10⁻⁴) and of “mitochondrial protein import” (FDR 5.24×10⁻⁴) (Supporting Information Table 4). Mutations in PINK1 and parkin genes have been


associated with familial forms of Parkinson’s Disease. Nevertheless, several pieces of evidence suggest that dysfunctional, depolarized mitochondria are not properly disposed of in the sporadic forms of the disease either, thereby contributing to bioenergetic failure and oxidative stress41-43. Moreover, the alteration of proteins involved in protein import into the mitochondrion is probably due to the loss of mitochondrial membrane potential44,45. The GO Biological Process mainly involved was also “mitochondrial transport” (FDR 1.94×10⁻⁸, Fold Enrichment 8.81) (Supporting Information Table 5). An interesting result was the over-representation of the GO Biological Process “regulation of neurotransmitter transport” (FDR 7.08×10⁻⁴, Fold Enrichment 6.91) (Supporting Information Table 5). It has already been proposed that synaptic dysfunction may be an early and leading event in determining neuronal degeneration and loss46,47.
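To make the quantities reported above more concrete, the following sketch shows how a fold enrichment and an over-representation p-value can be computed for a single category. It is a generic illustration based on the hypergeometric distribution, not the exact procedure implemented by Reactome or Panther; the counts are hypothetical, the function names are assumptions, and the multiple-testing correction that yields the FDR is not included.

```python
from math import comb


def fold_enrichment(k, n, K, N):
    """Observed fraction of annotated proteins in the list over the background fraction.

    k: annotated proteins in the list, n: list size,
    K: annotated proteins in the background, N: background size.
    """
    return (k * N) / (n * K)


def hypergeom_pvalue(k, n, K, N):
    """Probability of drawing at least k annotated proteins in a list of size n."""
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Hypothetical counts for one category.
k, n, K, N = 3, 5, 10, 100
enrichment = fold_enrichment(k, n, K, N)
p_value = hypergeom_pvalue(k, n, K, N)

# The text keeps only GO Biological Processes with fold enrichment > 5.
passes_threshold = enrichment > 5
print(enrichment, p_value, passes_threshold)
```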

Conclusions

The updated version of the functional mitochondrial network is intended as a tool for researchers focused on the mitochondrial proteome and its involvement in physiological or pathological conditions. The MITO network gathers all the information collected by the neXtProt database, which is the reference for the HPP community. It constitutes a valuable starting point for several analyses. To zoom in on biological processes involving mitochondria, it may be used to map the results of proteomics investigations. Finally, the table in which the binary interactions are summarized along with related information (gold/silver interaction, human study/by similarity) gives the opportunity to customize the functional mitochondrial network.
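As an illustration of how such a table of binary interactions might be used to customize the network, the sketch below filters edges by annotation quality and evidence type and builds an undirected adjacency map. The column names (`source`, `target`, `quality`, `evidence`) and the toy rows are assumptions made for this example; the actual layout of the adjacency table in the Supporting Information may differ.

```python
import csv
import io
from collections import defaultdict

# Hypothetical toy table; the column names are assumed for illustration.
TOY_TABLE = """source,target,quality,evidence
P1,P2,gold,human study
P1,P3,silver,human study
P2,P4,gold,by similarity
"""


def load_edges(handle):
    """Read the interaction table into a list of dictionaries."""
    return list(csv.DictReader(handle))


def filter_edges(edges, qualities=("gold",), evidences=None):
    """Keep edges whose quality is in `qualities` and, if `evidences`
    is given, whose evidence type is in `evidences`."""
    kept = []
    for edge in edges:
        if edge["quality"] not in qualities:
            continue
        if evidences is not None and edge["evidence"] not in evidences:
            continue
        kept.append(edge)
    return kept


def to_adjacency(edges):
    """Build an undirected adjacency map from the filtered edges."""
    adjacency = defaultdict(set)
    for edge in edges:
        adjacency[edge["source"]].add(edge["target"])
        adjacency[edge["target"]].add(edge["source"])
    return dict(adjacency)


edges = load_edges(io.StringIO(TOY_TABLE))
gold_human = filter_edges(edges, qualities=("gold",), evidences=("human study",))
network = to_adjacency(gold_human)
```

Passing `qualities=("gold", "silver")` would instead keep lower-confidence interactions, producing a larger but less stringent network.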

Acknowledgments

The authors thank all the neXtProt team for developing and maintaining the neXtProt database, and especially Alain Gateau for help with SPARQL queries. The neXtProt server is hosted by Vital-IT, the SIB Swiss Institute of Bioinformatics’ Competence Centre in Bioinformatics and Computational Biology.

Supporting Information

Supporting Results:
Table S-1: adjacency table with all the nodes and edges of the functional mitochondrial human proteome network.
Table S-2: proteins of the MITO list.
Table S-3: list of proteins involved in Parkinson’s Disease mapped on the MITO network.
Table S-4: over-representation analysis of the Parkinson’s Disease sub-cluster. Reactome was used as reference.
Table S-5: over-representation analysis of the Parkinson’s Disease sub-cluster. GO Biological Process was used as reference.


References

1) Kasahara, A.; Scorrano, L. Mitochondria: from cell death executioners to regulators of cell differentiation. Trends Cell Biol. 2014, 24, 761-70.
2) Tilokani, L.; Nagashima, S.; Paupe, V.; Prudent, J. Mitochondrial dynamics: overview of molecular mechanisms. Essays Biochem. 2018, 62, 341-360.
3) Zeng, XS.; Geng, WS.; Jia, JJ.; Chen, L.; Zhang, PP. Cellular and Molecular Basis of Neurodegeneration in Parkinson Disease. Front Aging Neurosci. 2018, DOI: 10.3389/fnagi.2018.00109.
4) Hassanpour, SH.; Dehghani, MA.; Karami, SZ. Study of respiratory chain dysfunction in heart disease. J Cardiovasc Thorac Res. 2018, 10, 1-13.
5) Buj, R.; Aird, KM. Deoxyribonucleotide Triphosphate Metabolism in Cancer and Metabolic Disease. Front Endocrinol. 2018, DOI: 10.3389/fendo.2018.00177.
6) Gorman, GS.; Schaefer, AM.; Ng, Y.; Gomez, N.; Blakely, EL.; Alston, CL.; Feeney, C.; Horvath, R.; Yu-Wai-Man, P.; Chinnery, PF.; Taylor, RW.; Turnbull, DM.; McFarland, R. Prevalence of nuclear and mitochondrial DNA mutations related to adult mitochondrial disease. Ann Neurol. 2015, 77, 753-759.
7) Su, T.; Turnbull, DM.; Greaves, LC. Roles of Mitochondrial DNA Mutations in Stem Cell Ageing. Genes (Basel). 2018, DOI: 10.3390/genes9040182.
8) Picard, M.; Wallace, DC.; Burelle, Y. The rise of mitochondria in medicine. Mitochondrion. 2016, 30, 105-116.
9) Alberio, T.; Pieroni, L.; Ronci, M.; Banfi, C.; Bongarzone, I.; Bottoni, P.; Brioschi, M.; Caterino, M.; Chinello, C.; Cormio, A.; Cozzolino, F.; Cunsolo, V.; Fontana, S.; Garavaglia, B.; Giusti, L.; Greco, V.; Lucacchini, A.; Maffioli, E.; Magni, F.; Monteleone, F.; Monti, M.; Monti, V.; Musicco, C.; Petrosillo, G.; Porcelli, V.; Saletti, R.; Scatena, R.; Soggiu, A.; Tedeschi, G.; Zilocchi, M.; Roncada, P.; Urbani, A.; Fasano, M. Toward the Standardization of Mitochondrial Proteomics: The Italian Mitochondrial Human Proteome Project Initiative. J Proteome Res. 2017, 16, 4319-4329.
10) Palmfeldt, J.; Bross, P. Proteomics of human mitochondria. Mitochondrion. 2017, 33, 2-14.
11) Thul, PJ.; Lindskog, C. The human protein atlas: A spatial map of the human proteome. Protein Sci. 2018, 27, 233-244.
12) Thul, PJ.; Åkesson, L.; Wiking, M.; Mahdessian, D.; Geladaki, A.; Ait Blal, H.; Alm, T.; Asplund, A.; Björk, L.; Breckels, LM.; Bäckström, A.; Danielsson, F.; Fagerberg, L.; Fall, J.; Gatto, L.; Gnann, C.; Hober, S.; Hjelmare, M.; Johansson, F.; Lee, S.; Lindskog, C.; Mulder, J.; Mulvey, CM.; Nilsson, P.; Oksvold, P.; Rockberg, J.; Schutten, R.; Schwenk, JM.; Sivertsson, Å.; Sjöstedt, E.; Skogs, M.; Stadler, C.; Sullivan, DP.; Tegel, H.; Winsnes, C.; Zhang, C.; Zwahlen, M.; Mardinoglu, A.; Pontén, F.; von Feilitzen, K.; Lilley, KS.; Uhlén, M.; Lundberg, E. A subcellular map of the human proteome. Science. 2017, DOI: 10.1126/science.aal3321.
13) Palmfeldt, J.; Bross, P. Proteomics of human mitochondria. Mitochondrion. 2017, 33, 2-14.
14) Fasano, M.; Alberio, T.; Babu, M.; Lundberg, E.; Urbani, A. Towards a functional definition of the mitochondrial human proteome. EuPA Open Proteomics. 2016, 10, 24-27.
15) Vlasblom, J.; Jin, K.; Kassir, S.; Babu, M. Exploring mitochondrial system properties of neurodegenerative diseases through interactome mapping. J Proteomics. 2014, 100, 8-24.
16) Calvo, SE.; Clauser, KR.; Mootha, VK. MitoCarta2.0: an updated inventory of mammalian mitochondrial proteins. Nucleic Acids Res. 2016, DOI: 10.1093/nar/gkv1003.
17) Smith, AC.; Robinson, AJ. MitoMiner v3.1, an update on the mitochondrial proteomics database. Nucleic Acids Res. 2016, DOI: 10.1093/nar/gkv1001.
18) Omenn, GS.; Lane, L.; Overall, CM.; Corrales, FJ.; Schwenk, JM.; Paik, YK.; Van Eyk, JE.; Liu, S.; Snyder, M.; Baker, MS.; Deutsch, EW. Progress on Identifying and Characterizing the Human Proteome: 2018 Metrics from the HUPO Human Proteome Project. J Proteome Res. 2018, DOI: 10.1021/acs.jproteome.8b00441.
19) Gaudet, P.; Argoud-Puy, G.; Cusin, I.; Duek, P.; Evalet, O.; Gateau, A.; Gleizes, A.; Pereira, M.; Zahn-Zabal, M.; Zwahlen, C.; Bairoch, A.; Lane, L. neXtProt: organizing protein knowledge in the context of human proteome projects. J Proteome Res. 2013, DOI: 10.1021/pr300830v.
20) Gaudet, P.; Michel, PA.; Zahn-Zabal, M.; Britan, A.; Cusin, I.; Domagalski, M.; Duek, PD.; Gateau, A.; Gleizes, A.; Hinard, V.; Rech de Laval, V.; Lin, J.; Nikitin, F.; Schaeffer, M.; Teixeira, D.; Lane, L.; Bairoch, A. The neXtProt knowledgebase on human proteins: 2017 update. Nucleic Acids Res. 2017, DOI: 10.1093/nar/gkw1062.
21) Craig, R.; Cortens, JP.; Beavis, RC. Open source system for analyzing, validating, and storing protein identification data. J Proteome Res. 2004, 3, 1234-42.
22) Shannon, P.; Markiel, A.; Ozier, O.; Baliga, NS.; Wang, JT.; Ramage, D.; Amin, N.; Schwikowski, B.; Ideker, T. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13, 2498-2504.
23) Monti, C.; Colugnat, I.; Lopiano, L.; Chiò, A.; Alberio, T. Network Analysis Identifies Disease-Specific Pathways for Parkinson's Disease. Mol Neurobiol. 2018, 55, 370-381.


24) Fabregat, A.; Jupe, S.; Matthews, L.; Sidiropoulos, K.; Gillespie, M.; Garapati, P.; Haw, R.; Jassal, B.; Korninger, F.; May, B.; Milacic, M.; Roca, CD.; Rothfels, K.; Sevilla, C.; Shamovsky, V.; Shorser, S.; Varusai, T.; Viteri, G.; Weiser, J.; Wu, G.; Stein, L.; Hermjakob, H.; D'Eustachio, P. The Reactome Pathway Knowledgebase. Nucleic Acids Res. 2018, 46, D649-D655.
25) The Gene Ontology Consortium. Expansion of the Gene Ontology knowledgebase and resources. Nucleic Acids Res. 2017, DOI: 10.1093/nar/gkw1108.
26) Macron, C.; Lane, L.; Núñez Galindo, A.; Dayon, L. Deep Dive on the Proteome of Human Cerebrospinal Fluid: A Valuable Data Resource for Biomarker Discovery and Missing Protein Identification. J Proteome Res. 2018, DOI: 10.1021/acs.jproteome.8b00300.
27) Omenn, GS.; Lane, L.; Lundberg, EK.; Overall, CM.; Deutsch, EW. Progress on the HUPO Draft Human Proteome: 2017 Metrics of the Human Proteome Project. J Proteome Res. 2017, 16, 4281-4287.
28) Omenn, GS.; Lane, L.; Overall, CM.; Corrales, FJ.; Schwenk, JM.; Paik, YK.; Van Eyk, JE.; Liu, S.; Snyder, M.; Baker, MS.; Deutsch, EW. Progress on Identifying and Characterizing the Human Proteome: 2018 Metrics from the HUPO Human Proteome Project. J Proteome Res. 2018, DOI: 10.1021/acs.jproteome.8b00441.
29) Małecki, J.; Jakobsson, ME.; Ho, AYY.; Moen, A.; Rustan, AC.; Falnes, PØ. Uncovering human METTL12 as a mitochondrial methyltransferase that modulates citrate synthase activity through metabolite-sensitive lysine methylation. J Biol Chem. 2017, 292, 17950-17962.
30) Rhein, VF.; Carroll, J.; Ding, S.; Fearnley, IM.; Walker, JE. Human METTL12 is a mitochondrial methyltransferase that modifies citrate synthase. FEBS Lett. 2017, 591, 1641-1652.
31) Yu, LR.; Zhu, Z.; Chan, KC.; Issaq, HJ.; Dimitrov, DS.; Veenstra, TD. Improved titanium dioxide enrichment of phosphopeptides from HeLa cells and high confident phosphopeptide identification by cross-validation of MS/MS and MS/MS/MS spectra. J Proteome Res. 2007, 6, 4150-4162.
32) Lane, L.; Bairoch, A.; Beavis, RC.; Deutsch, EW.; Gaudet, P.; Lundberg, E.; Omenn, GS. Metrics for the Human Proteome Project 2013-2014 and strategies for finding missing proteins. J Proteome Res. 2014, 13, 15-20.
33) Lionaki, E.; Gkikas, I.; Tavernarakis, N. Differential Protein Distribution between the Nucleus and Mitochondria: Implications in Aging. Front Genet. 2016, DOI: 10.3389/fgene.2016.00162.


34) Lee, AH.; Zareei, MP.; Daefler, S. Identification of a NIPSNAP homologue as host cell target for Salmonella virulence protein SpiC. Cell Microbiol. 2002, 4, 739-50.
35) de Mateo, S.; Castillo, J.; Estanyol, JM.; Ballescà, JL.; Oliva, R. Proteomic characterization of the human sperm nucleus. Proteomics. 2011, 11, 2714-26.
36) Tanaka, KI.; Xue, Y.; Nguyen-Yamamoto, L.; Morris, JA.; Kanazawa, I.; Sugimoto, T.; Wing, SS.; Richards, JB.; Goltzman, D. FAM210A is a novel determinant of bone and muscle structure and strength. Proc Natl Acad Sci U S A. 2018, DOI: 10.1073/pnas.1719089115.
37) Tomar, D.; Dong, Z.; Shanmughapriya, S.; Koch, DA.; Thomas, T.; Hoffman, NE.; Timbalia, SA.; Goldman, SJ.; Breves, SL.; Corbally, DP.; Nemani, N.; Fairweather, JP.; Cutri, AR.; Zhang, X.; Song, J.; Jaña, F.; Huang, J.; Barrero, C.; Rabinowitz, JE.; Luongo, TS.; Schumacher, SM.; Rockman, ME.; Dietrich, A.; Merali, S.; Caplan, J.; Stathopulos, P.; Ahima, RS.; Cheung, JY.; Houser, SR.; Koch, WJ.; Patel, V.; Gohil, VM.; Elrod, JW.; Rajan, S.; Madesh, M. MCUR1 Is a Scaffold Factor for the MCU Complex Function and Promotes Mitochondrial Bioenergetics. Cell Rep. 2016, 15, 1673-85.
38) Paupe, V.; Prudent, J.; Dassa, EP.; Rendon, OZ.; Shoubridge, EA. CCDC90A (MCUR1) is a cytochrome c oxidase assembly factor and not a regulator of the mitochondrial calcium uniporter. Cell Metab. 2015, 21, 109-16.
39) Niemi, NM.; Lanning, NJ.; Klomp, JA.; Tait, SW.; Xu, Y.; Dykema, KJ.; Murphy, LO.; Gaither, LA.; Xu, HE.; Furge, KA.; Green, DR.; MacKeigan, JP. MK-STYX, a catalytically inactive phosphatase regulating mitochondrially dependent apoptosis. Mol Cell Biol. 2011, 31, 1357-1368.
40) Niemi, NM.; Sacoman, JL.; Westrate, LM.; Gaither, LA.; Lanning, NJ.; Martin, KR.; MacKeigan, JP. The pseudophosphatase MK-STYX physically and genetically interacts with the mitochondrial phosphatase PTPMT1. PLoS One. 2014, DOI: 10.1371/journal.pone.0093896.
41) Zilocchi, M.; Finzi, G.; Lualdi, M.; Sessa, F.; Fasano, M.; Alberio, T. Mitochondrial alterations in Parkinson's disease human samples and cellular models. Neurochem Int. 2018, 118, 61-72.
42) Bondi, H.; Zilocchi, M.; Mare, MG.; D'Agostino, G.; Giovannardi, S.; Ambrosio, S.; Fasano, M.; Alberio, T. Dopamine induces mitochondrial depolarization without activating PINK1-mediated mitophagy. J Neurochem. 2015, DOI: 10.1111/jnc.13506.
43) Tilokani, L.; Nagashima, S.; Paupe, V.; Prudent, J. Mitochondrial dynamics: overview of molecular mechanisms. Essays Biochem. 2018, 62, 341-360.


44) Narendra, D.; Tanaka, A.; Suen, DF.; Youle, RJ. Parkin is recruited selectively to impaired mitochondria and promotes their autophagy. J Cell Biol. 2008, 183, 795-803.
45) Pickles, S.; Vigié, P.; Youle, RJ. Mitophagy and Quality Control Mechanisms in Mitochondrial Maintenance. Curr Biol. 2018, DOI: 10.1016/j.cub.2018.01.004.
46) Monti, C.; Bondi, H.; Urbani, A.; Fasano, M.; Alberio, T. Systems biology analysis of the proteomic alterations induced by MPP(+), a Parkinson's disease-related mitochondrial toxin. Front Cell Neurosci. 2015, DOI: 10.3389/fncel.2015.00014.
47) Schirinzi, T.; Madeo, G.; Martella, G.; Maltese, M.; Picconi, B.; Calabresi, P.; Pisani, A. Early synaptic dysfunction in Parkinson's disease: Insights from animal models. Mov Disord. 2016, 31, 802-13.


Table 1: UniProt and GO Cellular Component codes of sub-cellular locations

| Sub-cellular localization | UniProt code | GO Cellular Component code |
|---|---|---|
| Cytoplasm | SL-0086 | GO_0005737 |
| Endoplasmic reticulum | SL-0095 | GO_0005783 |
| Golgi apparatus | SL-0132 | GO_0005794 |
| Lysosome | SL-0158 | GO_0005764 |
| Nucleus | SL-0191 | GO_0005634 |
| Mitochondrion | SL-0173 | GO_0005739 |


Table 2: UniProt and GO Cellular Component codes of sub-mitochondrial locations

| Sub-mitochondrial localization | UniProt code | GO Cellular Component code |
|---|---|---|
| Inner membrane | SL-0168 | GO_0005743 |
| Inter membrane space | SL-0169 | GO_0005758 |
| Matrix | SL-0170 | GO_0005759 |
| Membrane | SL-0171 | GO_0031966 |
| Outer membrane | SL-0172 | GO_0005741 |


Table 3: Analysis of the MITO list using the PE information

| Entries in the | PE1 (evidence at protein level) | PE2 (evidence at transcript level) | PE3 (inferred from homology) | PE4 (predicted) |
|---|---|---|---|---|
| neXtProt database | 17470 | 1660 | 452 | 74 |
| MITO list | 1390 | 20 | 3 | 0 |
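The counts in Table 3 can be turned into proportions with a few lines of arithmetic. The short sketch below uses only the numbers given in the table; the variable and function names are illustrative and are not part of the original analysis.

```python
# Counts taken from Table 3 (entries per protein existence level).
NEXTPROT = {"PE1": 17470, "PE2": 1660, "PE3": 452, "PE4": 74}
MITO = {"PE1": 1390, "PE2": 20, "PE3": 3, "PE4": 0}


def share_of_database(subset, database):
    """Percentage of the database entries at each PE level that are
    also present in the subset."""
    return {pe: 100.0 * subset[pe] / database[pe] for pe in database}


def composition(subset):
    """Percentage of the subset that falls into each PE level."""
    total = sum(subset.values())
    return {pe: 100.0 * count / total for pe, count in subset.items()}


mito_total = sum(MITO.values())  # 1390 + 20 + 3 + 0 = 1413
mito_share = share_of_database(MITO, NEXTPROT)
mito_composition = composition(MITO)

for pe in NEXTPROT:
    print(f"{pe}: {mito_share[pe]:.2f}% of neXtProt entries, "
          f"{mito_composition[pe]:.2f}% of the MITO list")
```

Read this way, the MITO list is composed almost entirely of PE1 entries, with the 20 PE2 and 3 PE3 proteins being those detailed in Tables 4 and 5.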


Table 4: PE2 proteins of the MITO list. All data were collected from the neXtProt database. Only high-quality annotations (gold) were retrieved.

| PE2 neXtProt ID | Gene Name | Protein Name | Biological Pathway |
|---|---|---|---|
| NX_P0C7M7 | ACSM4 | Acyl-coenzyme A synthetase ACSM4, mitochondrial | GO BP: Fatty acid biosynthetic process; Reactome: Conjugation of salicylate with glycine |
| NX_Q6P461 | ACSM6 | Acyl-coenzyme A synthetase ACSM6, mitochondrial | GO BP: Fatty acid biosynthetic process, Positive regulation of protein targeting to mitochondrion; Reactome: Beta oxidation of butanoyl-CoA to acetyl-CoA |
| NX_Q7Z713 | ANKRD37 | Ankyrin repeat domain-containing protein 37 | |
| NX_Q06055 | ATP5G2 | ATP synthase F(0) complex subunit C2, mitochondrial | GO BP: ATP synthesis coupled proton transport; Reactome: Formation of ATP by chemiosmotic coupling, Cristae formation |
| NX_P48201 | ATP5G3 | ATP synthase F(0) complex subunit C3, mitochondrial | GO BP: ATP synthesis coupled proton transport; Reactome: Formation of ATP by chemiosmotic coupling, Cristae formation |
| NX_Q8TF08 | COX7B2 | Cytochrome c oxidase subunit 7B2, mitochondrial | GO BP: Hydrogen ion transmembrane transport, Electron transport chain |
| NX_Q7Z4L0 | COX8C | Cytochrome c oxidase subunit 8C, mitochondrial | GO BP: Mitochondrial electron transport, cytochrome c to oxygen, Hydrogen ion transmembrane transport |
| NX_Q5U4N7 | GDF5OS | Protein GDF5OS, mitochondrial | |
| NX_P59942 | MCCD1 | Mitochondrial coiled-coil domain protein 1 | |
| NX_A8MUP2 | METTL12 | Methyltransferase-like protein 12, mitochondrial | GO BP: Protein methylation |
| NX_P0CJ72 | MTRNR2L5 | Humanin-like 5 (HN5) | GO BP: Negative regulation of mitochondrial membrane permeability involved in apoptotic process |
| NX_Q8N945 | PRELID2 | PRELI domain-containing protein 2 | GO BP: Phospholipid transport |
| NX_Q8IW03 | SIAH3 | Seven in absentia homolog 3 (Siah-3) | GO BP: Regulation of protein stability, Negative regulation of protein targeting to mitochondrion |
| NX_Q6PIV7 | SLC25A34 | Solute carrier family 25 member 34 | GO BP: Mitochondrial transport |
| NX_Q8N413 | SLC25A45 | Solute carrier family 25 member 45 | GO BP: Mitochondrial transport |
| NX_Q6Q0C1 | SLC25A47 | Solute carrier family 25 member 47 | GO BP: Mitochondrial transport |
| NX_Q3SY17 | SLC25A52 | Solute carrier family 25 member 52 | GO BP: Mitochondrial transport |
| NX_Q8IZJ6 | TDH | Inactive L-threonine 3-dehydrogenase, mitochondrial | Reactome: Threonine catabolism |
| NX_Q6P5X7 | TMEM71 | Transmembrane protein 71 | |
| NX_Q9HCN2 | TP53AIP1 | p53-regulated apoptosis-inducing protein 1 (p53AIP1) | GO BP: Apoptotic process; Reactome: TP53 Regulates Transcription of Genes Involved in Cytochrome C Release |

Abbreviation: GO BP, GO Biological Process.


Table 5: PE3 proteins of the MITO list. All data were collected from the neXtProt database.

| PE3 neXtProt ID | Gene Name | Protein Name | Biological Pathway | Note |
|---|---|---|---|---|
| NX_Q5VTU8 | ATP5EP2 | ATP synthase subunit epsilon-like protein, mitochondrial | GO BP: ATP hydrolysis coupled cation transmembrane transport | Defined as a pseudogene by HGNC. However, proteomic data suggest the existence of this protein (Yu et al., 2007) |
| NX_B7ZC32 | KIF28P | Kinesin-like protein KIF28P | GO BP: Mitochondrion organization | The sequence of the protein was deduced from the genomic sequence and ESTs by similarity to the mouse and rat sequence |
| NX_A0A096LP55 | UQCRHL | Cytochrome b-c1 complex subunit 6-like, mitochondrial | GO BP: Mitochondrial Electron Transport Chain, ubiquinol to cytochrome c | Sequence similarity evidence used in manual assertion with P00127 (QCR6_YEAST); 90% identity with P99028 (QCR6_MOUSE) (UniProt database) |

Abbreviation: GO BP, GO Biological Process.


Captions to the Figures

Figure 1: Mitochondrial IDs Venn diagrams. (A) The Venn diagram shows the source of mitochondrial IDs in the MITO list (retrieved from the neXtProt database) and the number of proteins shared by the SUB_LOC list (retrieved from the sublocation section) and the GO_CC list (retrieved from the GO Cellular Component section). (B) The Venn diagram shows the comparison of IDs shared by three databases: neXtProt (MITO list), MitoCarta (MitoCarta list) and IMPI (IMPI list).

Figure 2: The functional mitochondrial human proteome network. Yellow nodes represent mitochondrial proteins (encoded by the mitochondrial genome or translocated to the mitochondrion, if indicated as mitochondrial in the sublocation and/or GO Cellular Component sections of the neXtProt database). Blue nodes represent gold interactors of the mitochondrial proteins, as obtained by querying the protein interaction sections of the neXtProt database.

Figure 3: Graphical representation of multiple sub-cellular locations. The color code is explained on the left. The size of each category square is proportional to the number of proteins in that group, which is written inside the square. Rectangles are placed progressively from the largest, most represented category on the left to the smallest, least represented one on the bottom-right. Category squares describing multiple locations are filled with white (mitochondrion) and colored stripes (other location).

Figure 4: Graphical representation of sub-mitochondrial locations. The color code is explained on the left. The size of each category square is proportional to the number of proteins in that group, which is written inside the square. Category squares describing multiple locations are filled with different colors. Rectangles are placed progressively from the largest, most represented category on the left to the smallest, least represented one on the bottom-right.

Figure 5: Proteins with unknown function. Proteins of unknown function are highlighted (yellow nodes represent mitochondrial proteins, while blue nodes represent their interactors).

Figure 6: CCDC90B and HSDL1 and their neighbors. (A) Eight nodes were present in the cluster of the CCDC90B neighborhood, extracted from the MITO network. (B) Only two nodes were present in the cluster of HSDL1, extracted from the MITO network. The color code represents the sub-mitochondrial localization of proteins: mitochondrial inner membrane (green nodes), generic mitochondrion sub-location (yellow node), inter membrane space (blue nodes) and mitochondrial membrane (gray node). Multiple colors indicate multiple locations. Light blue nodes: non-mitochondrial proteins.

Figure 7: Parkinson’s Disease specific proteins mapped onto the MITO network. Yellow nodes represent mitochondrial proteins, while blue nodes represent their first interactors.
