A Web of Possibilities: Network-Based Discovery of Protein Interaction


Perspective

A web of possibilities: network-based discovery of protein interaction codes Daniel L. Winter, Melissa A. Erce, and Marc R. Wilkins J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/pr500585p • Publication Date (Web): 22 Oct 2014 Downloaded from http://pubs.acs.org on October 28, 2014


Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Proteome Research

A web of possibilities: network-based discovery of protein interaction codes

Daniel L. Winter, Melissa A. Erce, Marc R. Wilkins*

Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, NSW 2052, Australia

Keywords: post-translational modifications / networks / interaction codes / protein-protein interactions


ACS Paragon Plus Environment


Abstract

Many proteins, including p53, the FoxO transcription factors, RNA polymerase II, pRb and the chaperones, have extensive post-translational modifications (PTMs). Many of these modifications modulate protein-protein interactions, controlling interaction presence / absence and specificity. Here we propose the notion of the interaction code: a widespread means by which modifications are used to control interactions in the proteome. Minimal interaction codes are likely to exist on proteins that have two or more modifications on an interaction interface and two or more interaction partners. By contrast, complex interaction codes are likely to be found on ‘date hub’ proteins that have many interactions, many PTMs, and/or are targeted by many modifying and demodifying enzymes. Proteins with new interaction codes should be discoverable by examining protein interaction networks, annotated with PTMs and protein-modifying enzyme-substrate links. Multiple instances or combinations of phosphorylation, acetylation, methylation, O-GlcNAc and/or ubiquitination will likely form interaction codes, especially when co-located on a protein’s single interaction interface. A network-based example of code discovery is given, predicting the yeast protein Npl3p to have a methylation / phosphorylation dependent interaction code.


Introduction

Most proteins are modified by at least one post-translational modification (PTM) [1]. PTMs, alongside alternative splicing, can expand a genome of tens of thousands of genes into a proteome of hundreds of thousands of unique proteins [2]. PTMs are regulated through the action of modifying and demodifying enzymes, such as kinases and phosphatases, and allow the cell to respond rapidly to different stimuli. Of particular interest are proteins that feature more than one PTM site. These proteins can potentially exist in a greater variety of modforms [2] (the different modification states of a protein as PTMs are added or removed) than singly modified proteins. A well-characterised example is the histones, which are highly modified on their N-terminal tails with phosphorylation, acetylation, methylation, ADP-ribosylation, O-linked N-acetylglucosamination (O-GlcNAc) and ubiquitination or ubiquitin-like modifications. These PTMs regulate the specificity of the protein-protein interactions (PPIs) of histones with their binding partners, or interactors [3]. This mechanism, which highlights the role of the combinatorial effects of PTMs, has been dubbed the histone code.

With the advent of high-throughput proteomics, there are now tens of thousands of known protein PTM sites. Most of this knowledge concerns the better characterised modifications such as phosphorylation, acetylation and methylation [4]. With this information at hand, the idea of a broader code, dubbed the PTM code, has emerged [5]. The idea consists of identifying the physiological modforms of proteins that harbor more than one PTM and then characterizing the specific functional properties of each modform.

Comprehensive research on specific proteins has unveiled new PTM codes, such as the p53 code [6], the FoxO code [7], the RNA polymerase II carboxy-terminal domain (CTD) code [8], the pRb code [9] and the chaperone code [10]. p53 works primarily as a transcription factor which recruits chromatin remodelers, histone modifiers or RNA polymerases to DNA. Its capacity to form different transcription complexes with its interactors is due to a complex combination of PTMs, which leads to the formation of docking motifs for interacting domains [6]. The activity of the FoxO transcription factors depends on the recruitment of co-activators or co-repressors, and it has been suggested that PTMs regulate these interactions [7]. Moreover, the localization of FoxO transcription factors is also regulated by PPIs that depend on PTMs. Similarly, the binding partners of RNA polymerase II, which include transcription factors, chromatin modifying factors and RNA processing factors, are thought to recognise PTM patterns on its C-terminal domain, and a writer-reader-eraser model has been proposed [8]. The modulation of pRb PPIs by PTMs has been carefully studied [9]. In short, its interaction with E2F-1 is dependent on the phosphorylation state of pRb, and its interactions with HP1, L3MBTL1 and 53BP1 are modulated by three distinct methylation sites [9, 11]. Moreover, acetylation of pRb increases its affinity for Mdm2 and has also been proposed to recruit bromodomain-containing proteins [9]. Finally, the high number of PTMs on chaperone proteins – some of the most highly interacting of all proteins, due to their role in protein folding [12] – suggests that a PTM code could participate in recruiting different co-chaperones and allocating each chaperone modform to a different type of protein [10]. In fact, it is known that different co-chaperones can recruit different client proteins [12] and that PTMs can regulate the binding of chaperones to co-chaperones. For example, C-terminal phosphorylation of Hsp70 and Hsp90 shifts their binding affinity from the co-chaperone CHIP to the co-chaperone HOP [13], and acetylation of Hsp90 increases its affinity to several co-chaperones [14].

In short, all the aforementioned proteins are known to participate in many PPIs and to feature several PTMs; there is also evidence that their affinity to different interactors is PTM-regulated.


Interaction codes

Here we propose that PTM codes that specifically modulate PPIs, such as those referred to above [6-10] and more generally proposed by Wilkins and Kummerfeld [15], be termed interaction codes. We also propose that they will involve specific types of PTMs, rather than all types of PTM that occur within the cell. The histone code and other newly proposed interaction codes should thus not be considered as isolated or rare examples; interaction codes of varying degrees of complexity are likely to be widespread in the proteome. A minimal interaction code would consist of two PTM sites, of the same or different type, that modulate the interactions of a protein with at least two different interactors. More complex codes will involve a greater number of PTMs, which modulate a greater number of interactions (Figure 1A).

Conceptually, different types of codes can be imagined. In a protein with one interaction interface, but multiple interaction partners, the PTMs on that interface – alone or in combination – can modulate interactions in a mutually exclusive manner. Alternatively, a protein with multiple interaction interfaces but only one partner per interface might have one PTM per interface; the PTMs would switch each interface’s particular interaction on or off. A third scenario, which lies between the previous two, is a protein with multiple interaction interfaces and more than one PTM and interactor per interface. In this case, the total interactions of the protein will depend on the type and total number of PTMs present.

The search for new interaction codes could be accelerated if promising protein candidates were easily identifiable. The challenge is how these proteins can be found. We suggest that PPI networks, co-analysed with PTM and other relevant data, can serve as a means to predict which proteins are likely to feature an interaction code. We will consider the situation of the complex code rather than the minimal interaction code; complex codes involve proteins with many interactions and many PTMs and should thus be more easily found.

Hubs in protein networks are likely candidates for interaction codes

In a PPI network, proteins with a large number of interactions (five or more in yeast) are termed hub proteins or hubs. These should be evaluated as candidates for interaction codes. However, there are two types of hub proteins, ‘date’ and ‘party’ hubs (Figure 2). ‘Date hub’ proteins are those that have a large number of interaction partners but a small number of interaction interfaces [16]. They cannot interact with all of their partners at once, raising the possibility that PTMs are used to specifically control their interactions at one or more interfaces. On the other hand, ‘party hub’ proteins are members of large, stable molecular machines. In networks, these are identifiable as proteins with a characteristically large number of PPIs shared amongst all or many members of the molecular machine. The ‘party hub’ proteins tend to interact with all their partners at the same place and time [16], having less biological need for regulation of interactions. The search for interaction codes should thus focus on ‘date hubs’; for example, there are 91 of these reported in yeast [16], of which 36 have just one or two interaction interfaces [17]. Hubs like these should then be examined in the context of their known PTMs.

Modifications and enzyme-substrate relationships highlight interaction codes

Do all types of PTMs participate in interaction codes? In the histone code, as well as in the recently described interaction codes, it is notable that a small subset of PTMs is used in all five known codes. To date, these are phosphorylation, acetylation, methylation, O-GlcNAc and ubiquitination or ubiquitin-like modifications (Table 1). ADP-ribosylation is also part of three codes.
This is not to say that interaction codes are necessarily limited to only these modifications, but these PTMs are likely to form the ‘core’ alphabet. Interestingly, PTMs involved in known codes occur on only a few amino acids: lysine (methylation, acetylation, ubiquitin-like modifications), serine and threonine (phosphorylation, O-GlcNAc), arginine (methylation, ADP-ribosylation) and tyrosine (phosphorylation). Moreover, except for ADP-ribosylation, these types of PTM were shown to co-evolve, indicating some sort of functional association [18]. It is also notable that code-associated PTMs affect a range of physicochemical properties of their covalently linked amino acids. This concurs with the idea that PTMs involved in interaction codes are able to create or block interactions by changing the shape, hydrophobicity or charge of a surface. Ultimately, this can affect domain-domain or domain-motif interactions; for example, the phosphorylation of a tyrosine can modulate interactions with an Src-homology-2 (SH2) domain-carrying partner, and trimethylation of lysine can be specifically recognised by the chromodomain [19].

The existence of a core set of modifications in interaction codes suggests that date hubs should be examined for these PTMs. Those hubs that have two or more core PTMs may carry an interaction code, with those carrying larger numbers being of most interest.

PTM-associated information can be integrated with PPI networks in two ways to assist in the discovery of proteins with interaction codes. First is the integration of PTM types and their incidence into networks, as demonstrated by us in 2008 [20] and by Woodsmith and Stelzl in 2014 [21]. Many proteome-scale studies of PTMs now exist, in model systems and human cells [4], providing extensive data for this purpose.
Second is the integration of PTM-associated enzyme-substrate relationships with PPI networks, such as kinase-substrate or methyltransferase-substrate interactions [22]. These enzyme-substrate relationships can be easily visualised in PPI networks, for example with the use of coloured edges (Figure 2), and constitute a useful strategy to identify ‘date hubs’ targeted by relevant enzymes. As suggested above, hubs subject to many enzyme-substrate interactions will be those most likely to carry an interaction code. However, more large-scale enzyme-substrate data concerning protein modifications are needed. Two-hybrid screens, used to detect PPIs, do not typically confirm enzyme-substrate relationships, because modifying enzymes also interact with non-substrate partners and enzyme-substrate interactions can be too transient to be detected by two-hybrid techniques [23]. Large-scale unbiased enzyme-substrate mapping studies, employing other techniques such as the proteome arrays used in the global analysis of protein phosphorylation in yeast [24], will be critical for the construction of modifying enzyme-protein substrate relationships that can be integrated with PPI networks.

Filtering of candidates with structural and localization data

Candidate proteins for interaction codes, discovered as above, can finally be filtered with structural and intracellular localisation data. Structural data, when available, are useful to verify that PTM sites are located at interaction surfaces and thus accessible to potential interactors [25]. Sites that are buried in the core of proteins, although rare, are unlikely to participate in interaction codes. Data on intrinsic disorder of proteins are also relevant. In fact, many types of PTMs occur preferentially in disordered regions [26]; such regions can participate in PPIs via disorder-to-order transitions, and PTMs have been proposed as a mechanism to modulate such interactions [27]. Finally, data on subcellular localization or tissue-specificity of interaction partners and modifying enzymes may be helpful to filter out false positives. If, for example, a human date hub has many interactors but they are expressed in mutually exclusive tissues, an interaction code would not be required to control interaction specificity.

The different types of data to integrate with PPI networks in the search for interaction codes, and the corresponding lines of evidence, are summarised in Table 2.
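The integration of enzyme-substrate links with a PPI network can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the protein and enzyme names are hypothetical placeholders, the hub threshold of five interactions is the yeast figure quoted in the text, and the cut-off of two modifying enzymes follows the "two or more" criterion stated for enzyme-substrate relationships.

```python
from collections import defaultdict

# Hypothetical toy data; names are placeholders, not real annotations.
ppi_edges = [
    ("B", "P1"), ("B", "P2"), ("B", "P3"), ("B", "P4"), ("B", "P5"),
    ("C", "P1"), ("C", "P6"), ("C", "P7"), ("C", "P8"), ("C", "P9"),
]
# Directed (enzyme, substrate, modification type) links.
enzyme_substrate = [
    ("K1", "B", "phosphorylation"),
    ("K2", "B", "phosphorylation"),
    ("M1", "B", "methylation"),
    ("K1", "C", "phosphorylation"),
]

HUB_MIN_DEGREE = 5  # hub threshold quoted in the text for yeast
MIN_ENZYMES = 2     # assumed cut-off: "two or more modifying enzymes"


def shortlist_hubs(ppi_edges, enzyme_substrate):
    """Return hubs targeted by at least MIN_ENZYMES enzymes, most-targeted first."""
    neighbours = defaultdict(set)
    for a, b in ppi_edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    enzymes = defaultdict(set)
    ptm_types = defaultdict(set)
    for enzyme, substrate, ptm in enzyme_substrate:
        enzymes[substrate].add(enzyme)
        ptm_types[substrate].add(ptm)

    rows = []
    for protein, partners in neighbours.items():
        if len(partners) >= HUB_MIN_DEGREE and len(enzymes[protein]) >= MIN_ENZYMES:
            rows.append(
                (protein, len(partners), len(enzymes[protein]), sorted(ptm_types[protein]))
            )
    # Sort by number of targeting enzymes (descending), then by name.
    return sorted(rows, key=lambda row: (-row[2], row[0]))


shortlist = shortlist_hubs(ppi_edges, enzyme_substrate)
print(shortlist)
```

In this toy network both B and C are hubs, but only B is targeted by several enzymes carrying two distinct modification types, so only B is shortlisted. Structural, disorder and localization filters would be applied to such a shortlist afterwards.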


Crosstalk in interaction codes

In addition to directly regulating interactions, PTMs can also regulate each other. This is known as PTM crosstalk (Figure 1B) and will feature in codes on interaction interfaces that involve many modifications and interactors. Because PTM crosstalk limits the number of possible modforms of a protein, not all PTMs can occur on a protein at the same time. As a consequence, neither can all PTM-mediated PPIs.

The simplest type of PTM crosstalk is where more than one type of PTM can compete for the same residue (Figure 1B, example i). This type of crosstalk is most striking in the case of lysine residues, which can harbour many different types of PTMs (see above) [28]. For example, in p53, there is competition between methylation, acetylation and ubiquitination on several C-terminal lysines, modulating the fate of the protein [29]. A more complex type of crosstalk is where one PTM is required to promote the modification of a second, often nearby PTM site (Figure 1B, example ii). This is the case for methylation of arginine 27 of STAT6, which is required for STAT6 phosphorylation [30]. Alternatively, a PTM can block another PTM on a proximal site (Figure 1B, example iii). In this case, the crosstalk can be bi-directional, whereby each PTM will block the other, or uni-directional, where one PTM can block the other but not the converse. In pRb, methylation of lysine 810 prevents phosphorylation of nearby serine 807 by disturbing the interaction between pRb and the Cdk kinase. Phosphorylation of pRb, however, has no impact on methylation of lysine 810 [31], making this an example of uni-directional crosstalk. Finally, it should be noted that crosstalk is possible between distant PTM sites through allosteric effects [32] (Figure 1B, example iv) or when two PTM sites that are distant on the primary sequence of a protein are close in 3-D structure.


To summarise, PTMs can influence one another in several ways that together constitute crosstalk. Some of these will affect the interactions of modifying enzymes with their substrate proteins. Equally, PTM crosstalk can affect non-enzyme protein-protein interaction specificity and, in the context of date hub proteins, is likely to be associated with many interaction codes.
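How crosstalk limits the number of modforms can be made concrete with a small enumeration. The sketch below is an illustration under stated assumptions, not a model from the literature: the sites and marks are hypothetical, competition is modelled by letting each residue hold at most one mark, and only bi-directional blocking is encoded as a pair of marks that never co-occur. Uni-directional blocking and priming depend on the order in which marks are added, so a static enumeration like this one does not capture them.

```python
from itertools import product

# Hypothetical protein: each site maps to the PTM types that can occupy it.
# A residue holds at most one mark at a time (competition for the same residue).
sites = {
    "K1": ["methyl", "acetyl", "ubiquitin"],
    "S2": ["phospho"],
    "R3": ["methyl"],
}

# Hypothetical bi-directional block: these two marks never co-occur.
mutually_exclusive = [(("K1", "methyl"), ("S2", "phospho"))]


def enumerate_modforms(sites, mutually_exclusive=()):
    """List every modform (site -> mark or None) allowed by the exclusion rules."""
    names = list(sites)
    # None stands for the unmodified state of a site.
    options = [[None] + sites[name] for name in names]
    modforms = []
    for combo in product(*options):
        state = dict(zip(names, combo))
        blocked = any(
            state[site_a] == mark_a and state[site_b] == mark_b
            for (site_a, mark_a), (site_b, mark_b) in mutually_exclusive
        )
        if not blocked:
            modforms.append(state)
    return modforms


unconstrained = enumerate_modforms(sites)
constrained = enumerate_modforms(sites, mutually_exclusive)
print(len(unconstrained), len(constrained))
```

Five marks treated as independent switches would give 2^5 = 32 combinations. Competition for the single lysine reduces this to 4 x 2 x 2 = 16 modforms, and the one exclusion rule removes two more, leaving 14.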


New candidates for interaction codes

The principles of interaction code-containing proteins, outlined above, are best illustrated by example. Erce et al. recently constructed a yeast methylproteome network, embedded within a kinase-substrate network [22]. We have extracted two hub proteins from the methylproteome networks (histone Hht1p, well-characterised as featuring an interaction code, and mRNA transport protein Npl3p), and co-visualised them with one hub protein extracted from the entire yeast interactome (spliceosome-associated protein Brr2p) and known enzyme-substrate relationships (Figure 3).

A hub protein that features an interaction code, namely histone Hht1p, differs from a hub protein that does not, such as spliceosome-associated protein Brr2p. Both proteins participate in more than ten PPIs; however, Brr2p harbors only a few phosphorylation sites targeted by Cdk1, whereas Hht1p is targeted by many enzymes and carries many more PTMs. Brr2p features the typical topology of a ‘party hub’, with many interactions and many PPIs between its interactors, resulting in a very dense network that indicates the existence of a large molecular machine (in this case, the spliceosome). The network topology of Hht1p, at first glance, might indicate a stable complex as well. However, the network topologies of ‘date hubs’ are quite varied [33] and Hht1p is in fact such a ‘date hub’. The density of its network is due to the fact that Hht1p participates in many different complexes, as opposed to simple PPIs.

The topology of the network in Figure 3 suggests Npl3p as an interesting candidate for an interaction code. Npl3p is a highly interacting RNA-binding protein that is targeted by one methyltransferase, several kinases and one phosphatase. Moreover, it features the typical topology of a ‘date hub’, with many interactions but very few PPIs between its interactors. Using a conditional two-hybrid system, Erce et al. demonstrated that methylation of Npl3p by Hmt1p modulates several, but not all, of its PPIs [34]. The fact that Npl3p can also be phosphorylated by a number of kinases is likely to explain this observation, as phosphorylation may modulate some of its other interactions. This is consistent with the phosphorylation-methylation interplay that regulates Npl3p import and export from the cell nucleus [35]. Npl3p is not currently known to harbor as many types of PTMs as the histones or p53, but not all interaction codes will necessarily share the same degree of complexity. What is important to note, however, is that Npl3p harbors at least two physicochemically distinct types of PTMs and has a relatively large number of known modification sites (18 methylarginines [36] and at least 2 phosphoserines [37]).

A careful examination of integrated networks, such as those in yeast from high-throughput protein-protein interactions and enzyme-substrate interactions, along with known PTMs from large-scale mass spectrometric screens and other relevant data, should reveal many other proteins which are subject to interaction codes.
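The topological contrast drawn above, many PPIs between a hub's interactors versus very few, can be approximated by the fraction of possible interactor pairs that are themselves linked. The sketch below is a rough proxy under stated assumptions: the network is a hypothetical toy, the degree threshold of five is the yeast figure quoted earlier, and the density cut-off of 0.5 is an arbitrary illustrative value. As the Hht1p case shows, density alone can mislead, so the labels are deliberately "party-like" and "date-like" rather than definitive classes.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from a list of PPI pairs (self-loops ignored)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        if a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def interactor_density(adjacency, protein):
    """Fraction of possible PPIs between a protein's interactors that are observed."""
    partners = sorted(adjacency[protein])
    pairs = list(combinations(partners, 2))
    if not pairs:
        return 0.0
    linked = sum(1 for a, b in pairs if b in adjacency[a])
    return linked / len(pairs)


def classify_hub(adjacency, protein, min_degree=5, party_density=0.5):
    """Label a node by degree and interactor density (thresholds are assumptions)."""
    if len(adjacency[protein]) < min_degree:
        return "not a hub"
    if interactor_density(adjacency, protein) >= party_density:
        return "party-like"
    return "date-like"


# Hypothetical toy network: A sits in a fully connected complex,
# B has five partners of which only one pair interacts.
a_partners = ["a1", "a2", "a3", "a4", "a5"]
b_partners = ["b1", "b2", "b3", "b4", "b5"]
edges = (
    [("A", p) for p in a_partners]
    + list(combinations(a_partners, 2))
    + [("B", p) for p in b_partners]
    + [("b1", "b2")]
)
adjacency = build_adjacency(edges)
labels = {name: classify_hub(adjacency, name) for name in ("A", "B", "b3")}
print(labels)
```

A date-like hub flagged this way is only a starting point; it becomes a candidate for an interaction code once its PTMs and modifying enzymes are taken into account.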


Conclusions

We have discussed how PPI networks, integrated with knowledge of protein PTMs, will help identify proteins with new interaction codes. Minimal interaction codes are likely to exist on proteins that have two or more modifications on an interaction interface and two or more interaction partners. By contrast, complex interaction codes are likely to be found on date hub proteins that have many PTMs and/or are targeted by many modifying and demodifying enzymes. Increasingly complete and accurate proteome-scale data for PTMs, and the establishment of detailed relationships between PTM modifying enzymes and their substrates, will underpin this process of discovery.


FIGURES

Figure 1 – The logic of interaction codes. (A) Interaction codes are likely to exist on proteins that have two or more PTMs and few interaction surfaces. A minimal code is composed of two PTMs regulating two different interactions. Two types of minimal interaction codes can be imagined: (i) the two PTMs occur on the same interaction interface, thereby modulating interactions in a mutually exclusive manner; or (ii) they can instead occur on two distinct interfaces, and switch each interaction on or off independently (in the illustrated example, the yellow PTM promotes an interaction whereas the blue PTM blocks an interaction). More complex types of interaction codes (iii) would feature several PTMs per interface, and PTMs might act in combination to promote or block interactions. Grey disks represent a protein; coloured disks represent interaction partners; hatched areas represent interaction interfaces; coloured symbols represent PTMs; dotted coloured symbols represent unmodified PTM sites. (B) PTM crosstalk underpins interaction codes by limiting the number of modforms of a given protein. PTMs might block or promote each other in a number of ways: (i) PTMs can compete for the same residue; (ii) a PTM can recruit a modifying enzyme to further modify a protein; (iii) a PTM can block a modifying enzyme, preventing further modification of the protein; (iv) a PTM can promote structural changes in a protein, exposing (or concealing, not shown) another PTM site.


Figure 2 – The integration of PPI networks and enzyme-substrate relationships can reveal interaction codes. (A) A PPI network (white nodes: proteins; grey edges: PPIs) contains ‘party hubs’ (protein A; note the dense network) and ‘date hubs’ (proteins B and C). (B) The integration of enzyme-substrate relationships into a PPI network (white nodes: proteins; coloured nodes: modifying enzymes; grey edges: PPIs; coloured directed edges: enzyme-substrate relationships) reveals that protein B is likely to feature an interaction code.


Figure 3 – Examples of hub proteins from the yeast interactome. Histone protein Hht1p features an interaction code. Note the number of interactions and enzymes targeting Hht1p. Npl3p features a ‘date hub’ topology and is a promising candidate for a new interaction code. In contrast, Brr2p features the topology of a ‘party hub’ protein, with tens of interactions, but only harbors one type of PTM and is unlikely to feature an interaction code. White nodes, proteins; orange nodes, kinases; red nodes, methyltransferases; green nodes, acetyltransferases; grey edges, PPIs; orange edges, phosphorylation; red edges, methylation; green edges, acetylation. Data from Erce et al. [22], UniProtKB and NetworKIN, visualised with Cytoscape [38].


TABLES

Table 1. PTMs involved in interaction codes; numbers represent literature references.

| Protein | Modifications* | References |
|---|---|---|
| Histones | Pho, Ac, Met, O-GlcNAc, Ubq, ADP-ribosylation | 3, 39 |
| FoxO transcription factors | Pho, Ac, Met, O-GlcNAc, Ubq, ADP-ribosylation | 7, 40 |
| p53 | Pho, Ac, Met, O-GlcNAc, Ubq, ADP-ribosylation | 6 |
| Hsp90 | Pho, Ac, Met, O-GlcNAc, Ubq | 13-14, 41 |
| pRb | Pho, Ac, Met, O-GlcNAc, Ubq | 9, 42 |
| RNA polymerase II | Pho, Ac, Met, O-GlcNAc, Ubq | 8, 43 |

* Abbreviations: Pho – phosphorylation, Ac – acetylation, Met – methylation, Ubq – ubiquitination


Table 2. Data integration for the discovery of interaction codes

| Data | Evidence for an interaction code |
|---|---|
| PPI datasets | Node appears as a ‘date hub’ |
| Enzyme-substrate relationships | Node is targeted by two or more modifying enzymes; a greater number indicates greater likelihood of an interaction code |
| PTM sites | Two or more PTMs; a greater number indicates greater likelihood of an interaction code |
| PTM sites – interaction interface | PTM sites occur on interaction surfaces; evidence is stronger if several types of PTM occur on the same interaction interface |
| PTM sites – structural disorder | PTM sites occur on intrinsically disordered domains; evidence is stronger if several types of PTM occur on the same domain |
| Subcellular localization | Interaction code-containing protein candidate, its modifying enzymes, and its interactors are co-localised |


AUTHOR INFORMATION

Corresponding Author
*Wilkins, M. R. ([email protected])

AUTHOR CONTRIBUTIONS

The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript. All authors contributed equally.

FUNDING SOURCES

DLW acknowledges support from the University of New South Wales. MAE acknowledges support from the University of New South Wales. MRW acknowledges support from the Australian Research Council, the Australian Federal Government EIF Super Science scheme and the New South Wales State Government Science Leveraging Fund.

ACKNOWLEDGMENT

DLW, MAE and MRW acknowledge the assistance of Aidan Tay in the generation of Figure 3. The authors declare that they have no conflict of interest.

ABBREVIATIONS

Ac, acetylation; Met, methylation; O-GlcNAc, O-linked N-acetylglucosamination; Pho, phosphorylation; PPI, protein-protein interaction; PTM, post-translational modification; Ubq, ubiquitination.


