
Investigating the network basis of negative genetic interactions in Saccharomyces cerevisiae with integrated biological networks and triplet motif analysis
Chi Nam Ignatius Pang, Apurv Goel, and Marc R. Wilkins
J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.7b00649 • Publication Date (Web): 02 Feb 2018


Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Proteome Research

Chi Nam Ignatius Pang1, Apurv Goel1 and Marc R. Wilkins1,*
1: Systems Biology Initiative, School of Biotechnology and Biomolecular Sciences, University of New South Wales, NSW 2052, Australia.
* Corresponding Author

Keywords: Synthetic lethality, triplet motifs, negative genetic interactions, networks, Saccharomyces cerevisiae


Abstract

Negative genetic interactions in Saccharomyces cerevisiae have been systematically screened to near-completeness, with >500,000 interactions identified. Nevertheless, the biological basis of these interactions remains poorly understood. To investigate this, we analyzed negative genetic interactions within an integrated biological network, being the union of protein-protein, kinase-substrate, and transcription factor-target gene interactions. Network triplets, containing two genes/proteins that show a negative genetic interaction and a third protein from the network, were then analyzed. Strikingly, just six out of 15 possible triplet motif types were overrepresented as compared to randomized networks. These fell into three clear groups: protein-protein interactions, signaling triplets and regulatory triplets, where the latter two showed no overlap. In the triplets, negative genetic interactions were associated with paralogs and ohnologs; however, these were very rare. Negative genetic interactions among the six triplet motifs did, however, show strong dosage constraints, with genes being significantly associated with toxicity on overexpression and periodicity in the cell cycle. Negative genetic interactions overlapped with other interaction types in 37% of cases; these were predominantly associated with protein complexes or signaling events. Finally, we highlight regions of ‘network vulnerability’ containing multiple negative genetic interactions; these could be targeted in fungal species for the regulation of cell growth.


Background

The cell can be represented and studied as a series of networks. These include protein-protein interaction networks, signaling and gene regulatory networks, and networks of genetic interactions. When studied individually, these networks each have their own structures and properties. Yet it is biologically of most relevance for them to be integrated and studied together. Given the increasing availability of high quality network data, at least in model organisms such as Saccharomyces cerevisiae, it is of interest to build integrated networks and study the features that they contain (see Beyer et al.1 for review). This should reveal the important building blocks of integrated networks in living things. There are many examples of integrated networks, and many ways in which they have been investigated. Wang et al.2 integrated the phosphorylation and transcriptional networks of yeast to infer functions of uncharacterised transcription factors. Fiedler et al.3 used an integrated network to provide insights into the regulation of signaling pathways, revealing novel links between kinases and the regulation of chromatin integrity during transcription. Pang et al.4 integrated kinase-substrate networks with protein-protein interactions to identify dense interacting clusters of kinases, and their regulatory partners, which act as signal integrators or broadcasters. Other studies, in which phosphorylation and genetic interaction networks were integrated, have demonstrated the usefulness of this approach in understanding relationships that exist within and between signaling pathways.3,5,6 Recently, integrated networks have been used to create a more detailed map of the gene ontology and to predict novel negative genetic interactions.7–10 To assist their study, integrated networks can be broken down into motifs. These reflect fundamental relationships that exist within the cell.
Enrichment analysis can then be used to discover motifs that are overrepresented.6,11–16 Through the integration of transcription
regulation with protein-protein interaction data, two-, three- and four-protein motifs have been discovered which involve mixed-feedback loops and regulatory complexes.12 Sharifpoor et al.6 analysed kinase-substrate interactions with synthetic dosage lethality and conducted motif enrichment on their resulting integrated network. They reported overrepresentation of network motifs associated with the signaling regulation of biological pathways, such as those involved in cell wall integrity and mitotic exit. The integration of regulatory interactions and protein-protein interactions, in other studies, revealed overrepresentation of certain substructures within networks, including feed-forward loops and backwards activation.11,14,17 These are likely to be found in all cells and represent the modules that form integrated biological networks (see review by Alon18). Negative genetic interactions represent a form of redundancy whereby the deletion of two genes results in a deleterious effect that is synergistic, not just additive, with respect to the effects of single gene mutants.19–21 A global genetic interaction screen in yeast, in which approximately 6,000 genes were examined, identified about 900,000 genetic interactions, of which two thirds were negative.21 Within-pathway and between-pathway effects were noted as major underpinnings of genetic interactions.21 Irrespective of this, the network basis of negative genetic interactions is not completely understood.5 Early analyses of negative genetic interactions within integrated networks (e.g. Beyer et al.1; Zhang et al.13) revealed relationships between interactions, essential proteins, and paralogs. However, the details of these relationships have not been elucidated and their relationships to network motifs are not well understood. Furthermore, the early observations made from integrated networks must be tested with recent, more comprehensive data.
Here we have used high quality integrated networks, built from recently published global studies, to further investigate the network basis of negative genetic interactions. By studying only triplets that contain a pair of proteins with a negative genetic interaction and a third protein, we found that negative genetic interactions are predominantly associated with six overrepresented triplet motifs. Triplets in which the negative genetic interactions overlapped with other types of interactions were further analysed to highlight the functional role of triplet motifs and to investigate the formation of signaling or regulatory feed-forward or feed-back loops. The triplets were also co-analysed with phenotypic datasets, including proteins that are essential or toxic upon overexpression, and cell cycle-regulated expression, to identify triplet motifs that were likely to be recurrent functional modules associated with cellular fitness. Triplets for which all three proteins were associated with the same phenotype were identified, and these triplets helped highlight intracellular subnetworks that are of critical importance to cell survival.

Methods

Data sources for S. cerevisiae interactions and phenotypic annotations

Negative genetic interactions were obtained from Costanzo et al.21. Only high confidence negative genetic interactions (ε < -0.12 and p-value < 0.05) were considered; this interaction set contains 139,573 interactions between 5,828 proteins (Supporting Information Table S1).

Protein-protein interactions were an updated version of the ‘SBI’ interaction set from Pang et al.4. This dataset was constructed in a similar manner to Bertin et al.22 in that confidence was increased by only considering interactions that had been reported in at least two high-throughput interaction screens. The SBI interaction set has been updated using the BioGRID database in tab2 format (version 3.2.104)23 and the BIND24, DIP25, IntAct26 and MINT27 databases obtained using the PSICQUIC server28 (downloaded on the 2nd May 2013). This dataset had a total of 13,292 interactions between 3,843 proteins (Supporting Information Table S2).
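The confidence filter applied to the genetic interaction scores above can be sketched as follows. This is a minimal illustration only: the record layout, the function name and the toy gene labels are assumptions made for the example and are not taken from the authors' code or from the Costanzo et al. data files.

```python
from typing import NamedTuple


class GeneticInteraction(NamedTuple):
    """Hypothetical record layout for one scored gene pair."""
    gene_a: str
    gene_b: str
    epsilon: float  # genetic interaction score
    p_value: float


def is_high_confidence_negative(gi, eps_cutoff=-0.12, p_cutoff=0.05):
    """Apply the thresholds stated in the text: epsilon < -0.12 and p < 0.05."""
    return gi.epsilon < eps_cutoff and gi.p_value < p_cutoff


# Toy records with made-up gene labels and scores.
toy_interactions = [
    GeneticInteraction("geneA", "geneB", -0.30, 0.001),  # kept
    GeneticInteraction("geneA", "geneC", -0.05, 0.001),  # score too weak
    GeneticInteraction("geneB", "geneC", -0.40, 0.200),  # p-value too high
    GeneticInteraction("geneC", "geneD", 0.25, 0.001),   # positive interaction
]

high_confidence = [gi for gi in toy_interactions if is_high_confidence_negative(gi)]
print([(gi.gene_a, gi.gene_b) for gi in high_confidence])
```

Both cutoffs are strict inequalities, so a pair scoring exactly -0.12 would be excluded under this reading.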

Kinase-substrate interaction data were obtained from the KID database29 (version 2016-03-14). Data in the KID database were gathered using literature mining. Each interaction was scored depending on the physical evidence present that would identify the interaction as a true kinase-substrate pair, as opposed to a false positive. The kinase-substrate interactions are directional. A score threshold of 2.5 was used, to obtain 2,282 interactions, involving 117 kinases and 947 substrates (Supporting Information Table S3).

Transcriptional regulation interactions were curated from two ChIP-chip datasets, Lee et al.30 and MacIsaac et al.31, and from databases of curated transcriptional regulation interactions from Reimand et al.32 and the Saccharomyces Genome Database33. An additional 84 interactions from combinatorial experimental evidence were obtained from Madhani et al.34. These datasets were combined through the union of the entries. Duplicated entries, in which the same regulatory interaction reported by the same study was recorded more than once, were removed from further analyses. An interaction was deemed high-confidence and kept for further analysis if identified by 1) two independent ChIP-chip studies, or 2) a combination of expression evidence from microarray analysis of transcriptional regulatory factor gene knockout experiments and evidence from a ChIP-chip experiment. The high-confidence transcriptional regulatory interaction dataset included 6,141 transcription factor-target gene interactions, involving 156 transcription factors and 2,938 target genes (Supporting Information Table S4). The transcription factors analysed include the general transcription factors, such as proteins from the mediator complex and the SAGA complex, and site-specific transcription factors that target certain DNA motifs.
Cell cycle gene expression data were obtained from Granovskaia et al.35, which identified a list of 598 genes that were periodically expressed in the cell cycle (Supporting Information Table S5). This set of genes was determined using microarray analysis of cdc28∆ mutant yeast cells that had been synchronized for cell growth. Gene expression was measured from 0 to 215 minutes at 5-minute intervals.

A list of 1,100 essential proteins defined by Giaever et al.36 was obtained from the Saccharomyces Genome Database (SGD)33 (Supporting Information Table S5). A list of 768 proteins that are toxic to the cell upon overexpression was obtained from Sopko et al.37 (Supporting Information Table S5). A list of paralogous yeast proteins was obtained from the OrthoMCL database38 and a list of ohnologs in yeast was obtained from Byrne and Wolfe39. The union of the list of paralogs and the list of ohnologs resulted in a list containing 4,477 proteins (Supporting Information Table S6).

The gene ontology (GO) slim terms were obtained from the SGD33. Top level GO slim terms ‘biological_process’, ‘cellular_component’, and ‘molecular_function’ were excluded from the dataset. This resulted in GO slim annotations for 5,949 genes.

Phenotype annotations associated with each protein were from the SGD33. This phenotypic dataset includes analyses of single gene mutants (e.g. gene knockout or knockdown, site directed mutagenesis, and overexpression) which show a different phenotype to wild type. The dataset was filtered to only include data where the background strain was one of "S288C", "S288C (BY4742)", "S288C (BY4743)" or "S288C (BY4741)". To avoid overlap with the analysis of gene overexpression and essential genes described above, further filtering was applied to remove annotations from the Sopko et al.37 gene overexpression dataset and annotations on essential genes, which are indicated by the ‘inviable’ keyword in the phenotype description. The phenotype data included analysis of increased or decreased resistance to chemical or environmental stress, which contained the keywords ‘resistance’ or ‘thermotolerance’, and abnormal changes to cell morphology, which contained the keyword ‘morphology’.
The annotations of the chemical agents and their respective doses in each experiment, if available, were included for data analysis. This dataset included phenotype annotations for 5,734 genes.

Integrated networks and motif enrichment analysis

An integrated network was generated from a union of the above data sources. The integrated network is included as Supporting Information (Table S7) and the number of interactions in each network is tabulated (Supporting Information Table S8). Motif enrichment was conducted by first enumerating all triplets within the integrated network. Triplet motifs with counts of less than 2% of the total, equivalent to fewer than ~1000 instances, were not considered, as they could potentially arise from noise, which is inherent in high-throughput data and cannot be completely controlled by randomisation and bootstrapping techniques. The counts of the remaining motifs were then compared with their distributions in randomised networks. To build a randomised network, each of the subnetworks (i.e. protein-protein interactions, kinase-substrate interactions, gene regulatory interactions and genetic interactions) was treated separately. Edges were shuffled but node degree was kept the same, similar to previous studies.2,13,40 Effectively, this preserved the node degree distribution and topology of all networks. All four types of networks were then integrated through the union of the nodes and edges, to give a final randomised network; this whole process was repeated 2,000 times. For each of the 15 possible triplet motifs, a motif type was deemed significant, with a Bonferroni-adjusted p-value of less than 0.05, if the observed count was less than the count in the randomized networks no more than 6 out of 2,000 times.
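The two steps described above, degree-preserving shuffling and the empirical significance rule, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the shuffle uses double edge swaps on an undirected network (directed kinase-substrate and transcription factor-target gene edges would need a variant that preserves in- and out-degree), and all function names and the toy network are hypothetical.

```python
import random
from collections import Counter


def degree_preserving_shuffle(edges, n_swaps, rng):
    """Randomise an undirected simple network with double edge swaps.

    Each accepted swap turns edges (a, b) and (c, d) into (a, d) and (c, b),
    so every node keeps its degree. Swaps that would create a self-loop or
    a duplicate edge are rejected.
    """
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:  # pick one of the two possible rewirings
            c, d = d, c
        if a == d or c == b:
            continue  # would create a self-loop
        new_1, new_2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if new_1 in edge_set or new_2 in edge_set:
            continue  # would duplicate an existing edge
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new_1, new_2}
        edge_list[i], edge_list[j] = new_1, new_2
        done += 1
    return edge_list


def degrees(edge_list):
    """Count how many edges touch each node."""
    count = Counter()
    for a, b in edge_list:
        count[a] += 1
        count[b] += 1
    return count


def bonferroni_empirical_p(observed, random_counts, n_tests=15):
    """Fraction of randomised counts above the observed count, times n_tests."""
    exceed = sum(1 for c in random_counts if c > observed)
    return min(1.0, n_tests * exceed / len(random_counts))


# Toy network with made-up node labels.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"),
             ("E", "F"), ("F", "A"), ("A", "D")]
shuffled = degree_preserving_shuffle(toy_edges, n_swaps=20, rng=random.Random(0))
print(degrees(shuffled) == degrees(toy_edges))
```

With 15 tests and 2,000 randomisations, six exceedances give an adjusted p-value of 15 × 6 / 2,000 = 0.045, whereas seven give 0.0525, which is consistent with the "no more than 6 out of 2,000" rule in the text.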

Robustness analysis of triplet motifs

To test the effect of false negatives, edges in the negative genetic interaction networks were randomly removed. For sizes between 20 and 90% of the original network, in increments of 10%, 2,000 observed subnetworks were randomly sampled for each size. Degree-preserving randomization was performed on each observed subnetwork, yielding 2,000 randomized subnetworks for each increment of subnetwork size. The significance of each type of triplet motif was determined by comparing the frequency distribution of triplet motifs between observed and randomized subnetworks. The count threshold for each type of triplet motif, corresponding to a raw p-value of less than 0.05, was defined as the 95th percentile of the counts of that triplet motif among the 2,000 randomized subnetworks. The raw p-value for each type of triplet motif was the proportion of triplet motif counts below the threshold, among the 2,000 observed subnetworks. The p-values for the 15 triplet motifs were then adjusted with Bonferroni correction.

To test the impact of false positives, the size of the original negative genetic interaction network was increased by including 10 to 30% additional edges that connect random pairs of proteins. The adjusted p-values for the 15 triplet motifs in the expanded networks were calculated by the steps above.

To further test the robustness of the six overrepresented triplets, triplet motif analysis was repeated with increasingly stringent genetic interaction fitness scores. The set of all negative genetic interactions with ε < -0.12 was analysed. From this set of interactions, 20 subsets including 100% to 5% of the interactions were generated, with the smallest set representing a much smaller but high stringency subset. The statistical significance of the six triplets in integrated networks resulting from these 20 increasingly stringent subsets was analysed. A triplet motif type was deemed statistically significant if it had a Bonferroni-adjusted p-value of less than 0.05 and an observed count greater than 2% of the total.
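The false-negative test above can be sketched as follows, assuming motif counts have already been obtained for each observed and randomized subnetwork. The function names are hypothetical, and the nearest-rank percentile is only one of several common definitions; the text does not say which one was used.

```python
import random


def subsample_edges(edges, fraction, rng):
    """Keep a random subset of edges, simulating false negatives."""
    edges = list(edges)
    keep = round(fraction * len(edges))
    return rng.sample(edges, keep)


def nearest_rank_percentile(values, percent=95):
    """Nearest-rank percentile (an assumed definition, see lead-in)."""
    ordered = sorted(values)
    rank = max(1, -(-percent * len(ordered) // 100))  # ceiling division
    return ordered[rank - 1]


def robustness_p_value(observed_counts, randomized_counts, n_tests=15):
    """Return (raw, Bonferroni-adjusted) p-values for one motif type.

    The raw p-value is the proportion of observed subnetworks whose motif
    count falls below the 95th percentile of the randomized counts.
    """
    threshold = nearest_rank_percentile(randomized_counts, 95)
    below = sum(1 for c in observed_counts if c < threshold)
    raw = below / len(observed_counts)
    return raw, min(1.0, raw * n_tests)


# Toy counts: 100 randomized subnetworks and 100 observed subnetworks.
toy_randomized = list(range(1, 101))
toy_observed = [200] * 99 + [50]
raw_p, adjusted_p = robustness_p_value(toy_observed, toy_randomized)
print(raw_p, adjusted_p)
```

A motif is robust at a given subnetwork size when nearly all observed subsamples exceed the randomized threshold, which drives the raw p-value towards zero.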

Analysing the overlap of negative genetic interaction with other types of biological interactions

To decipher the role of the negative genetic interactions among the overrepresented triplet motifs, the overlaps between the negative genetic interactions and other types of biological interactions were enumerated and analysed. The observed counts were collated for each of the overrepresented triplet motifs and compared to the expected counts obtained from 4,000 randomized networks. The p-values were adjusted using the Bonferroni method, accounting for the 30 combinations of five types of interactions and six overrepresented triplet motifs.

Analysis of negative genetic interactions that are shared by two or more triplets

To test whether negative genetic interactions were associated with multiple triplets and represent highly connected interaction modules in the network, we counted the number of genetic interactions that are associated with two or more triplets. The counts were binned according to the number of triplets that shared each genetic interaction. Triplets of the same functional class, including protein complexes, signaling triplets, or regulatory triplets as described in Figure 2a, were grouped together for analysis. The observed counts in the integrated network were compared to the counts from 2,000 randomized integrated networks. The adjusted p-value was calculated using Bonferroni adjustment with respect to the 15 possible triplet motifs.

Co-analysis of triplet motifs with other ‘-omics’ datasets

Several ‘-omics’ datasets were co-analysed with the triplet motifs (see Data Sources, above). These were categorical datasets, including essential proteins, proteins that are toxic upon overexpression, cell cycle-regulated genes, protein paralogs and ohnologs. The co-occurrence of the categorical data with the triplets was enumerated and compared to the expected counts from 2,000 randomized networks. The adjusted p-value was calculated using Bonferroni adjustment with respect to the 15 possible triplet motifs.

Source code availability

The source code and data used for all analyses of triplet motifs described above were deposited in the following GitHub repository: https://github.com/IgnatiusPang/Triplet_Motifs

Results

Integrated network construction and identification of triplet motifs associated with negative genetic interactions

An integrated biological network was formed from the union of four types of networks: negative genetic interactions, protein-protein interactions, kinase-substrate interactions and transcription factor – target gene interactions. A portion of the integrated biological network is shown in Figure 1a. We identified all triplets from the network that contained two proteins with a negative genetic interaction and a third protein having another interaction type (protein-protein, kinase-substrate, or transcription factor – target gene). We avoided the confounding effects of two or three negative interactions in any triplet by filtering the integrated network to contain just triplets with a single genetic interaction. The resulting filtered network comprised 30,850 triplets, 3,293 proteins and 23,226 interactions (Figure 1b, Supporting Information Tables S8 and S9). This contained 12,000 negative genetic interactions, 8.6% of the total in the stringent set, meaning that the majority of negative genetic interactions involve relationships that are more distant than in triplet motifs. Nevertheless, the large number of negative genetic interactions in triplet motifs, and their relative ease of interpretation as compared to more distant network relationships, warranted their detailed investigation. To test that our filtered network preserved functional organisation in the cell, the spatial analysis of functional enrichment (SAFE) tool41 was used to identify enriched GO terms. Seventeen GO terms were found to be functionally enriched, in 1,083 proteins (Figure 1c), confirming that the filtered network preserved many biological processes.
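The triplet selection described above can be sketched as follows. This reflects one reading of the text (a third protein must be linked to both members of a negative genetic interaction pair by a non-genetic interaction, and must not itself have a negative genetic interaction with either member); the function name, input layout and toy labels are assumptions made for the example.

```python
from collections import defaultdict


def find_single_gi_triplets(negative_gi, other_edges):
    """List triplets (a, b, c) that contain exactly one negative genetic interaction.

    negative_gi: iterable of (a, b) gene pairs with a negative genetic interaction.
    other_edges: iterable of (source, target, kind) for the other interaction types.
    """
    gi_pairs = {frozenset(pair) for pair in negative_gi}
    neighbours = defaultdict(set)
    for source, target, _kind in other_edges:
        neighbours[source].add(target)
        neighbours[target].add(source)

    triplets = []
    for pair in sorted(gi_pairs, key=sorted):
        a, b = sorted(pair)
        shared = (neighbours[a] & neighbours[b]) - {a, b}
        for c in sorted(shared):
            # Skip triplets that would contain two or three genetic interactions.
            if frozenset((a, c)) in gi_pairs or frozenset((b, c)) in gi_pairs:
                continue
            triplets.append((a, b, c))
    return triplets


# Toy network with made-up labels: "P" marks a protein-protein interaction.
toy_gi = [("A", "B"), ("C", "D"), ("D", "E")]
toy_other = [("X", "A", "P"), ("X", "B", "P"), ("E", "C", "P"), ("E", "D", "P")]
print(find_single_gi_triplets(toy_gi, toy_other))
```

In the toy data, protein E is linked to both C and D but also has a negative genetic interaction with D, so that triplet is dropped and only (A, B, X) remains.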

Figure 1. Visualization of the triplet motifs network and spatial analysis of functional enrichment. a) Each node represents a gene or the corresponding protein product. The edges represent different types of biological interactions, including negative genetic interactions (green line), protein-protein interactions (black line), kinase-substrate interactions (blue line with arrow pointing towards the substrate), and transcription factor – target gene interaction (red line with arrow pointing towards the target gene). These edges are represented with the same colours throughout this study. b) Visualization of the entire triplet motif network, with edge colours as in (a). c) The SAFE tool was used to highlight regions of the network in panel (b) that shared similar gene ontology terms. This figure shows 10 out of the 17 enriched GO biological processes identified, shown as circled regions and their corresponding GO labels. The other seven enriched GO terms (not shown) were mitotic cell cycle, actin filament organization, endonucleolytic cleavage involved in rRNA processing, exocytosis, vacuolar transport and translational initiation. Force-directed layout was used to position the nodes in the network. The networks were visualized using Cytoscape.


In this study, triplet motifs refer to network triplets which have specific types of interactions between the nodes. As noted above, the triplet motifs contain two proteins known to have a negative genetic interaction and a third protein having another interaction type (protein-protein, kinase-substrate, transcription factor – target gene). The naming convention we used for the triplet motifs, inspired by previous approaches13, is based on the direction and type of interactions of the third protein in the triplet. A protein-protein interaction with the third protein was labelled as ‘P’, transcriptional regulation as ‘T’ and kinase-mediated phosphorylation as ‘K’. In the case of transcriptional regulation and phosphorylation, an additional qualifier was required to represent the directional nature of the interactions. We standardised the triplets so the two proteins with negative genetic interaction are at the top when drawn on the page; ‘up’ (U) or ‘down’ (D) then refers to the direction of the transcription or kinase-substrate interactions from the third protein in the triplet (see Figure 2a for examples). In this manner, there are 15 possible triplet motifs in the integrated network: PP, TDP, PTU, PKD, PKU, TDTD, TDTU, TDKD, TDKU, TUTU, TUKD, TUKU, KDKD, KDKU, KUKU. Edge colour usage was also standardised (Figures 1, 2), where green represents negative genetic interactions, black represents protein-protein interactions, red arrows indicate directional transcription factor – target interactions, and blue arrows represent directional kinase-substrate interactions.
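The count of 15 follows from the naming scheme: the third protein connects to each member of the pair by one of five edge labels (P, TU, TD, KU, KD), and the order of the two edges does not matter, giving 5 + C(5,2) = 15 unordered combinations. The short check below confirms that the 15 names listed above cover each combination exactly once; the helper name `tokenize` is introduced only for this illustration.

```python
from itertools import combinations_with_replacement

# The five edge labels implied by the naming convention in the text.
EDGE_LABELS = ["P", "TU", "TD", "KU", "KD"]

# Motif names exactly as listed in the text.
MOTIF_NAMES = ["PP", "TDP", "PTU", "PKD", "PKU", "TDTD", "TDTU", "TDKD",
               "TDKU", "TUTU", "TUKD", "TUKU", "KDKD", "KDKU", "KUKU"]


def tokenize(name):
    """Split a motif name into its two edge labels, ignoring their order."""
    tokens, i = [], 0
    while i < len(name):
        if name[i] == "P":
            tokens.append("P")
            i += 1
        else:  # 'T' or 'K' is always followed by 'U' or 'D'
            tokens.append(name[i:i + 2])
            i += 2
    return tuple(sorted(tokens))


all_pairs = {tuple(sorted(p)) for p in combinations_with_replacement(EDGE_LABELS, 2)}
named_pairs = {tokenize(name) for name in MOTIF_NAMES}

print(len(all_pairs))            # 15
print(named_pairs == all_pairs)  # True
```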

Triplet motifs containing negative genetic interactions are not random

Negative genetic interactions should occur more frequently within certain motifs, thus helping to define their ‘network basis’. To investigate this, we compared the triplets observed in the integrated network with triplets in 2,000 randomized networks. For this, we analysed triplets that contained all possible types of negative genetic pairs. Specifically, this included those which were, or were not, also protein-protein, kinase-substrate or transcription factor – target gene interactions. All 15 possible triplet motifs were present in the integrated network; however, we had to assess their occurrence and reliability. Those seen with a count of less than 1,000 were potentially unreliable due to their low count and the existence of noise, which is inherent in high-throughput data and cannot be completely controlled by randomisation and bootstrapping. Potentially unreliable triplets of this type included KDKD, KDKU, and TDKU. After removal of these, statistical analysis of the integrated network revealed that six triplet motifs were significantly overrepresented (Figure 2a, b). The six triplet motifs had fold enrichments of 1.39 to 3.29 relative to the mean counts from 2,000 randomized networks, standard deviations of 2 to 6%, and mean adjusted p-values < 7.5 × 10^-3 (Supporting Information Table S10). The six triplet motifs represent recurring building blocks in our integrated biological network. They included triplet motifs involving a negative genetic interaction and two protein-protein interactions (PP), a transcription factor targeting the two genes involved in the genetic interaction (TUTU), one transcription factor – target gene pair and one protein-protein interaction (TDP), a kinase that phosphorylates a pair of proteins (KUKU), or one protein-protein interaction and one kinase-substrate interaction (PKU and PKD). These overrepresented triplet motifs can be functionally classified as being protein complexes (PP), regulatory triplets (TUTU, TDP) and signaling triplets (KUKU, PKU, PKD). The remaining nine triplet motifs (PTU, TDTD, TDTU, TDKD, TDKU, TUKD, TUKU, KDKD, and KDKU) were not of statistical significance, being infrequently used.

To confirm the functional relevance of the six triplet motifs, we examined whether proteins in the same motifs showed enrichment for the same phenotype on knockout and for the same GO slim terms, as compared to a randomized network. All six triplet motifs had a statistically significant overrepresentation of proteins with shared phenotype on knockout (Supporting Information Figure S1). For the analysis of GO, the majority of triplet motifs were statistically significantly enriched for cases where all three proteins in the triplet shared the same biological process, cellular component, or molecular function (Supporting Information Figure S2). The six overrepresented triplet motifs are the focus of our subsequent analyses, below.
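The fold enrichment quoted above is the observed motif count divided by the mean count across the randomized networks. A minimal sketch, using made-up counts and a hypothetical function name rather than values from Supporting Information Table S10:

```python
from statistics import mean


def fold_enrichment(observed_count, randomized_counts):
    """Observed count divided by the mean count across randomised networks."""
    return observed_count / mean(randomized_counts)


# Toy numbers only; these are not counts from the study.
toy_randomized_counts = [90, 100, 110]
print(fold_enrichment(300, toy_randomized_counts))  # 3.0
```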


Figure 2. Six triplet motifs were significantly overrepresented as compared to randomized networks. a) Network representations of the six significantly overrepresented triplet motifs. The triplets are represented such that the two proteins showing negative genetic interaction are at the top and connected by a green edge. Interactions with the third protein in the triplet are shown as a black edge (protein-protein), a red arrow (directional transcription factor – target gene), or a blue arrow (directional kinase-substrate). Text abbreviations for the interaction types with the third protein in motifs: P = protein-protein interaction; T = transcription factor – target gene interaction; K = kinase-substrate interaction; U = up; D = down. b) Six triplet motifs are significantly overrepresented in the integrated biological network (p < 0.05). c) Paralogs and ohnologs are enriched among the negative genetic interactions of triplet motifs, but they account for < 5% of the triplet motifs. In the above graphs, the y-axis represents the frequency counts of the triplet motifs and the x-axis represents the six triplet motifs, grouped by class of triplet motif (protein complexes, regulatory motifs, and signaling motifs). The y-axis is shown on a log scale and a pseudocount of one was added to all values for the observed and randomized networks to avoid problems with the log of zero. The observed frequency counts for significantly overrepresented motifs are shown by a red circle (adjusted p-value < 0.05), while non-significant motifs are shown by a blue circle. The frequency distributions of triplet motifs among the 2,000 randomized networks are represented as box-and-whisker plots.

Overrepresented triplet motifs are robust to random addition or removal of negative genetic interactions

It has previously been noted that genetic interactions may have up to 80% false negatives and 30% false positives.20 This may affect the reliable identification of overrepresented triplet motifs. To test the effect of false negatives, 10 to 80% of the edges in the negative genetic interaction networks were randomly removed to generate subnetworks with increasing false negative rates. Strikingly, the six triplet motifs (PP, TUTU, PKU, KUKU, TDP and PKD, Figure 2) remained significantly overrepresented (adj. p-value < 0.05) when the network was reduced in this manner (Figure 3). The only exception was the PKD motif, which was not significantly overrepresented after removal of 80% of edges. To test the impact of false positives, the original negative genetic interaction network was expanded to include 10 to 30% additional edges that connect randomly selected pairs of proteins. Here too, the six triplet motifs remained significantly overrepresented (adj. p-value < 0.05) (Figure 3). For both analyses above, the nine other possible triplet motifs showed no significant overrepresentation under any condition. The fold-enrichment ratios, which are the ratios of the median triplet counts before versus after network randomization, and the adjusted p-values from the randomization test are also represented as heat maps and are included as Supplementary Figures S3a and b, respectively.

To further test the robustness of the six overrepresented triplet motifs, we repeated the triplet motif analysis with data of increasingly stringent genetic interaction scores. This was done 20 times, with data sets becoming increasingly stringent and concomitantly smaller. Interestingly, all six triplet motifs were found to be significantly enriched at all levels of stringency (Supporting Information Figure S4). Together, these results show that the six overrepresented triplet motifs are robust to random removal or addition of edges and represent true entities associated with negative genetic interactions.
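The two perturbations described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the analysis: edges are treated as undirected pairs stored as sorted tuples, the helper names `remove_edges` and `add_random_edges` are hypothetical, and the toy ring network stands in for the negative genetic interaction network.

```python
import random


def remove_edges(edges, fraction, rng):
    """Simulate false negatives: keep a random subset after dropping `fraction` of the edges."""
    edge_list = sorted(edges)
    n_keep = len(edge_list) - round(fraction * len(edge_list))
    return set(rng.sample(edge_list, n_keep))


def add_random_edges(edges, nodes, fraction, rng):
    """Simulate false positives: add `fraction` extra edges between randomly chosen node pairs."""
    result = set(edges)
    node_list = sorted(nodes)
    target = len(result) + round(fraction * len(result))
    max_edges = len(node_list) * (len(node_list) - 1) // 2
    if target > max_edges:
        raise ValueError("not enough node pairs to add the requested number of edges")
    while len(result) < target:
        a, b = rng.sample(node_list, 2)        # two distinct nodes, so no self-loops
        result.add(tuple(sorted((a, b))))      # the set ignores duplicates of existing edges
    return result


# Hypothetical toy network: a ring of ten genes.
rng = random.Random(0)
genes = [f"g{i}" for i in range(10)]
original = {tuple(sorted((genes[i], genes[(i + 1) % 10]))) for i in range(10)}

reduced = remove_edges(original, 0.8, rng)             # 80% false-negative scenario
expanded = add_random_edges(original, genes, 0.3, rng) # 30% false-positive scenario
```

Each perturbed network would then be passed through the same motif counting and randomization test as the original network.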


Figure 3. Random removal or addition of negative genetic interactions highlights the robustness of the six triplet motifs. To test whether false negative and false positive negative genetic interactions could affect the overrepresentation of triplet motifs, negative genetic interactions were randomly removed from or added to the integrated network and the analysis was repeated. False negative rates of 10 to 80% were tested by randomly removing the corresponding proportion of stringent negative genetic interactions from the network. False positive rates of 10 to 30% were tested by adding the corresponding proportion of random edges to the negative genetic interaction network. The updated network, in which edges were added or removed, was then compared to the same network but with the edges randomized by degree-preserving randomization. This process was repeated 2,000 times for each false negative or false positive rate increment to identify whether any triplet motifs were overrepresented in comparison to randomized networks (adjusted p-value < 0.05). Red shaded panels represent significantly higher observed triplet counts than random, while blue shaded panels represent non-significant results.
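For readers unfamiliar with degree-preserving randomization, the sketch below shows one common way to do it, by repeated edge swaps, together with a simple empirical fold enrichment and p-value. It is an assumption-laden illustration: the source does not state which randomization algorithm, p-value formula or multiple-testing adjustment was used, and the function names and toy values are hypothetical.

```python
import random
from collections import Counter


def degree_preserving_shuffle(edges, n_swaps, rng):
    """Randomize an undirected network by swapping edge endpoints.

    Each accepted swap turns (a, b), (c, d) into (a, d), (c, b),
    so every node keeps its original degree.
    """
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c                         # consider both orientations of the second edge
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if a == d or c == b or new1 in edge_set or new2 in edge_set:
            continue                            # reject self-loops and duplicate edges
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
    return edge_set


def degrees(edges):
    """Number of edges touching each node."""
    return Counter(node for edge in edges for node in edge)


def enrichment(observed, random_counts):
    """Fold enrichment over the mean random count, and an empirical one-sided p-value."""
    mean_random = sum(random_counts) / len(random_counts)
    fold = observed / mean_random if mean_random else float("inf")
    n_as_extreme = sum(count >= observed for count in random_counts)
    p_value = (1 + n_as_extreme) / (1 + len(random_counts))
    return fold, p_value


# Hypothetical toy network: a ring of eight nodes plus two chords.
rng = random.Random(1)
toy_edges = {tuple(sorted((i, (i + 1) % 8))) for i in range(8)} | {(0, 4), (2, 6)}
shuffled = degree_preserving_shuffle(toy_edges, 200, rng)

# Hypothetical motif counts: 30 observed versus ten randomized networks.
fold, p_value = enrichment(30, [10] * 9 + [30])
```

In practice the motif count would be recomputed on each of the 2,000 shuffled networks, and the resulting p-values adjusted for testing many motifs; that adjustment step is not shown here.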

Paralogs are overrepresented among negative genetic interactions of triplet motifs but are rare

Duplicated or paralogous proteins can provide redundancy in a cell, whereby the inactivation of a protein can be rescued by the presence of a paralog. This also applies to ohnologs, which are duplicated genes that have arisen from genome-wide duplication.39 It follows that deletion of both copies of a duplicated protein, where its function is essential, could result in a negative genetic interaction. To investigate this, pairs of proteins that participate in negative genetic interactions were examined to detect whether they were paralogs or ohnologs. This was done for all members of the six overrepresented triplet motifs and revealed that all, except for the TDP motif, showed significant overrepresentation of paralogs or ohnologs in the two negatively interacting proteins (Figure 2c). Consistent with the above, as compared to all negative genetic interactions, there was a 760% increase in the proportion of paralogs or ohnologs that shared negative genetic interactions among the six overrepresented motifs (Fisher’s exact test p-value < 1 × 10⁻⁴, Supporting Information Table S11). Despite this, the number of pairs of paralogs or ohnologs that participated in negative genetic interactions was very low; indeed, these involved only 113 negative genetic pairs from 4,377 pairs of paralogs or ohnologs and only 160 occurrences among 30,500 triplets. Thus, whilst paralogs and ohnologs were enriched for negative genetic interactions, as noted elsewhere42, the vast majority of negative genetic interactions in triplets cannot be explained by the presence of paralogs and ohnologs.
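The proportions quoted above, and the kind of test behind the reported p-value, can be reproduced with a short Python sketch. The one-sided Fisher's exact test below is written from the hypergeometric distribution using only the standard library; it is an illustration, not the implementation used in this study, and the 2×2 table passed to it is a hypothetical textbook example because the underlying counts are given in Supporting Information Table S11 rather than in the text. The function name is also hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability of seeing `a` or more in the top-left cell
    when the row and column totals are held fixed.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    k_max = min(row1, col1)
    tail = sum(comb(row1, k) * comb(total - row1, col1 - k) for k in range(a, k_max + 1))
    return tail / comb(total, col1)


# Proportions taken from the figures quoted in the text.
paralog_pairs_with_negative_gi = 113 / 4377   # fraction of paralog/ohnolog pairs
triplets_with_paralog_pair = 160 / 30500      # fraction of triplets

# Hypothetical 2x2 table, used only to exercise the function.
p_value = fisher_exact_greater(3, 1, 1, 3)
```

The two proportions come to roughly 2.6% of paralog or ohnolog pairs and 0.5% of triplets, which is why the enrichment, although significant, explains only a small part of the negative genetic interactions in triplets.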

Negative genetic interactions overlap with some other types of interactions

The overlap between negative genetic interactions and other types of biological interactions may highlight functional or mechanistic underpinnings of genetic interactions (Figure 4a). Such overlaps could also highlight the function of particular triplet motifs and whether negative genetic interactions are one step away in the integrated network, in which a negative genetic interaction overlaps with another interaction, or are two steps away, which involves the third protein in the triplet. To investigate this, the negative genetic interactions in the six overrepresented triplet motifs were analysed to determine whether they showed overlap with other interaction types (exemplified in Figure 4a). The majority of all members of the six types of motifs (63%) did not have negative genetic interactions that significantly overlapped with other types of biological interactions. This highlights that indirect interactions between the negative genetic proteins are predominantly responsible for their phenotypic effects. However, negative genetic interactions that overlapped with one or more other interaction types were present in 37% (11,006 in total) of the six triplet motifs, and these triplet types are summarised in Figure 4a. Interestingly, when compared to the entire network, the negative genetic interactions that overlap with other interaction types were ~800% more frequent in triplet motifs (Supporting Information Table S11).

Analysis of the 37% of triplets in which negative genetic interactions overlapped with other types of interactions highlighted the roles of different triplet motifs within the network. Protein-protein interactions that overlap with negative genetic interactions were found to be the most common, and were significant (adjusted p-value < 0.05) among all six overrepresented triplet motifs (Figure 4a, b). As suggested in previous studies20,43, this was most pronounced in the PP triplets (8,785 incidences), confirming that protein-protein interactions between multiple members of a complex are crucial for their essential function. Negative genetic interactions that overlap with protein-protein interactions were also found in the TUTU, TDP, KUKU, PKU and PKD triplets (841 incidences) but to a lesser degree than in PP triplets.

Negative genetic interactions in signaling triplet motifs (KUKU, PKU, and PKD) showed significant overlap with kinase-substrate interactions (659 incidences) (Figure 4a, b). Signaling feed-forward loops are thus present, as the KUKU motifs have a negative genetic interaction which is also a signaling interaction.5 The PKD and PKU triplets, in which the negative genetic interaction was also a kinase-substrate interaction, reflect a number of signaling roles. These include indirect loops in which one edge in the loop is a protein-protein interaction (as noted previously by Varusai et al.44), a kinase that targets two proteins involved in a protein complex, or two interacting kinases that target a common substrate. These results suggest that signaling triplets in which negative genetic interactions overlap with protein-protein or signaling interactions represent common network modules.

Interestingly, the only triplet motif type to show overlap between genetic interactions and transcription factor – target gene interactions was the TDP triplet motif (362 incidences) (Figure 4a, b). This situation represents one transcription factor that targets two interacting genes / proteins, making them likely to be co-regulated. Examples include the Rpn4p transcription factor, which activates the expression of genes involved in the proteasome complex45, and the heat shock transcription factor Hsf1p, which activates the expression of heat shock factors such as Ssa1p and Sse1p46.

Figure 4. Overlap of the negative genetic interactions in triplet motifs with other types of interactions. a) Triplet motifs that have an overrepresentation of another interaction type with the negative genetic interaction in the triplet. Edges with double arrows represent bidirectional signaling or regulatory interactions. b) Each panel represents one of the six types of overrepresented triplet motifs as per Figure 2b. For each panel, the y-axis represents the observed counts of the triplets and the x-axis represents the 5 possible types of interactions that may overlap with a negative genetic interaction in a triplet motif. P = protein-protein, K = kinase-substrate, T = transcription factor – target gene. Directional interactions are marked as ‘1’ or ‘2’ to indicate direction of left to right in the triplet, or right to left, respectively. Negative genetic interactions that do not overlap with any other type of interaction are represented as ‘None’. The y-axis is shown on a log scale and a pseudocount of one was added to all values for the observed and randomized networks to avoid problems with the log of zero. Red coloured dots represent significant overrepresentation of the corresponding type of interaction (adjusted p-value < 0.05) as compared to 4,000 randomized networks (box-and-whisker plots). Blue coloured dots represent non-significant representation of the type of interaction among negative genetic interactions in triplet motifs.

Overlapping triplets highlight larger protein complexes, and extended signaling or regulatory modules

Negative genetic interactions can be present in one or more triplets in the integrated network. Those that are present in more than one triplet are of high biological interest. To find negative genetic interactions that are shared by two or more triplets more frequently than expected by chance, their frequency was compared between observed and randomized networks (Figure 5a). For the triplets in protein complexes (PP triplets), specific negative genetic interactions were present in 2 to 21 triplets (with frequency of 6 to 917) and were significantly overrepresented. An example of a protein complex with overlapping PP triplets is the TIM/TOM complex (Figure 5b), involved in the import of cytosolic proteins into the mitochondria47. For the regulatory triplets (TUTU, TDP), significant overrepresentation of negative genetic interactions in more than one triplet was observed for a third of all cases: specific genetic interactions were present in 2 to 20 triplets (frequency of 1 to 306). An example is the negatively interacting genes Ctf19 and Vik1 (Figure 5c); their expression is redundantly controlled by transcription factors that are known to regulate spindle pole assembly48 or nutrient homeostasis49–51. Negative genetic interactions associated with signaling triplets (KUKU, PKU, and PKD) were typically not found in multiple triplets, compared to protein complexes and regulatory triplets. However, negative genetic interactions shared in two, three, four and 16 overlapping triplets were significantly overrepresented (frequency of 1 to 316). For example, the negatively interacting kinases Mps1p and Bub1p are involved in a signaling cascade that regulates spindle pole assembly, but their activity and stability are regulated by the upstream kinase Cdc28p (Figure 5d).52,53 The above suggests that, besides protein complexes, overlapping triplets can reveal signaling and regulatory modules that control diverse biological functions18.
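Counting overlapping triplets amounts to grouping triplets by their negative genetic interaction and tallying the group sizes. The Python sketch below illustrates this with the two examples named above (Figure 5c and 5d); it lists only those triplets, so the counts are illustrative rather than the full counts in the network, and the function names are hypothetical.

```python
from collections import Counter


def triplets_per_interaction(triplets):
    """Map each negative genetic interaction to the number of triplets that contain it.

    Each triplet is (partner_a, partner_b, third_protein); the first two
    entries are the negatively interacting pair.
    """
    return Counter(frozenset((a, b)) for a, b, _third in triplets)


def overlap_histogram(triplets):
    """Map 'number of triplets sharing an interaction' to 'number of such interactions'."""
    return Counter(triplets_per_interaction(triplets).values())


# Triplets taken from the examples in the text; not an exhaustive list.
example_triplets = [
    ("Ctf19", "Vik1", "Aft1p"),
    ("Ctf19", "Vik1", "Bas1p"),
    ("Ctf19", "Vik1", "Hir1p"),
    ("Ctf19", "Vik1", "Pho4p"),
    ("Mps1p", "Bub1p", "Cdc28p"),
]

per_pair = triplets_per_interaction(example_triplets)
histogram = overlap_histogram(example_triplets)
```

The same histogram computed on each randomized network gives the background distribution against which the observed counts in Figure 5a are compared.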


Figure 5. Genetic interactions shared by two or more overlapping triplets highlight functional interaction modules. a) The number of genetic interactions which are shared by two or more triplets and thus represent overlapping triplets (y-axis) and the number of triplets that are shared (x-axis). Triplets were grouped together and counted based on their functional classifications, including protein complexes (PP), regulatory triplets (TUTU and TDP), and signaling triplets (KUKU, PKU, and PKD), as per Figure 2a. For the protein complexes, the x-axis was truncated beyond 35 overlapping triplets and the remainder is not shown. The y-axis is shown on a log scale and a pseudocount of one was added to all values for the observed and randomized networks to avoid problems with the log of zero. Red coloured dots represent significant overrepresentation of the corresponding type of interaction (adjusted p-value < 0.05) as compared to 2,000 randomized networks (box-and-whisker plots). Blue coloured dots represent non-significant representation of the type of interaction among negative genetic interactions in triplet motifs. b) An example of overlapping PP triplets involving the same negative genetic interaction between Sam25p and Tom20p and members of the TIM/TOM complex. c) An example of overlapping TUTU triplets in which the same negatively interacting genes Ctf19 and Vik1 were targeted by transcription factors Aft1p, Bas1p, Hir1p and Pho4p. d) An example of overlapping signaling triplets that shared the negative genetic interaction involving Mps1p and Bub1p. It highlights a signaling cascade in which Cdc28p phosphorylates Bub1p and Mps1p, which further regulate proteins involved in spindle pole assembly.

Enriched triplets contain significant numbers of essential proteins

In early small-scale analyses of negative genetic triplets in yeast13, it was shown that triplet motifs are enriched for essential proteins. Subsequently, Baryshnikova et al.43 showed that complexes containing negative genetic interactions between their constituent proteins can have at least one essential gene. Talavera et al.54 also reported that pairs of negatively interacting proteins cluster with essential proteins in protein interaction networks. Given that the recent study from Costanzo et al.21 included an analysis of essential genes as temperature-sensitive mutants, we were able to comprehensively determine which of the six types of triplet motifs show associations with essential proteins. Overall, we noted that essential proteins were ~30% more abundant in triplets as compared to all negative genetic interactions (Fisher’s exact test p-value < 1 × 10⁻⁴, Supporting Information Table S11). However, we found that essential proteins in the negative genetic interacting pair were only overrepresented in PP, TUTU and KUKU triplet motifs (Figure 6a). This detail was not revealed in Costanzo et al.21, who noted that essential proteins form densely connected networks that involve negative genetic interactions. However, our observations are consistent with the literature, in which essential proteins have been noted to be co-regulated and to be subunits of essential protein complexes55,56, as highlighted by the TUTU and PP motifs respectively, or to participate in essential signaling pathways57, as highlighted by the KUKU motifs. We also determined whether the third protein in the triplet motif, separate to the negative genetic interacting pair, was essential. The PP, TDP and PKD triplet motifs were significantly enriched for an essential third protein (Figure 6b).


Figure 6. Co-analysis of the triplet motifs with essential proteins, genes that are periodically expressed in the cell cycle and proteins that are toxic on overexpression. Triplet motifs that have a significant overrepresentation of a) a pair of essential proteins involved in negative genetic interactions, b) an essential third protein not part of the negative genetic interaction in the triplet motif, c) a pair of cell cycle-regulated genes involved in negative genetic interactions, d) a cell cycle-regulated protein not part of the negative genetic interaction in the triplet motif, e) a pair of proteins that are toxic upon overexpression and involved in negative genetic interaction, and f) a protein that is toxic upon overexpression and not part of the negative genetic interaction in the triplet motif. The y-axis represents the frequency counts of the triplets and the x-axis represents the six triplet motifs. The y-axis is shown on a log scale and a pseudocount of one was added to all values for the observed and randomized networks to avoid problems with the log of zero. The observed frequency counts for significantly overrepresented motifs are shown by a red circle (adjusted p-value < 0.05), while non-significant motifs are shown by a blue circle. The frequency distribution of triplet motifs among the 2,000 randomized networks is represented as box-and-whisker plots.

Enrichment of cell cycle-regulated genes in motifs associated with protein complexes and signaling regulation

Since cell cycle-regulated gene expression is important for the regulation of signaling pathways involved in cell growth and division58, it was expected that such genes would be enriched among triplet motifs that contain negative genetic interactions. Consistent with this, the proportion of periodically expressed genes in the cell cycle35 found in the six overrepresented triplet motifs was 34% greater than in all negative genetic interactions (Fisher’s exact test p-value < 1 × 10⁻⁴, Supporting Information Table S11). By contextualising the periodically expressed genes on the six triplets (Figure 6c), we identified a significant enrichment in the negatively interacting protein pair of the PP motif. This is consistent with the periodic expression of genes that encode subunits of cell cycle-regulated protein complexes.59 In addition, there was a significant enrichment of periodically expressed genes among the signaling PKU and KUKU motifs, which highlights the importance of signal-based regulation in the cell cycle.58 We also investigated the overrepresentation of periodically expressed genes among the third interacting protein, separate to the genetic interaction (Figure 6d). Significant enrichment was revealed for this in the PP motif, consistent with observations above, and for the PKD signaling motif. Unexpectedly, none of the triplet motifs involved in transcriptional regulation were enriched for any periodically expressed genes. This suggests that a high level of genetic redundancy may exist among transcription factors involved in cell cycle regulation of gene expression, which is to be expected given the importance of this biological process, or that the transcription factors could be subject to posttranslational rather than transcriptional regulation.

The majority of triplet motifs are enriched for proteins that are toxic upon protein overexpression

It has been suggested that overexpression of certain proteins, such as the ectopic expression of cell cycle-regulated genes, could disrupt important cellular functions.60 Ma et al.61 also suggested that overexpression of proteins could induce cellular toxicity through promiscuous binding with other proteins. In a large-scale screen, Sopko et al.37 identified 768 genes that are toxic to the yeast cell upon gene overexpression; we refer to these as toxic overexpressed proteins. To investigate whether triplet motifs were vulnerable to protein overexpression, overlaps between toxic overexpressed proteins and the negative genetic interacting pairs in triplets were analysed (Figure 6e). Triplet motifs, as compared to all negative genetic interactions, showed a 21% increase in the proportion of negative genetic interactions that involved at least one gene that is toxic on overexpression (Supporting Information Table S11). All types of triplet motifs, except for the TDP motif, had negative genetic interactions enriched for toxic overexpressed proteins. In contrast, only the PP and PKD motifs showed enrichment for toxic overexpressed proteins when the third interacting protein in the motif, separate to the negative genetic interacting pair, was analysed (Figure 6f). These observations confirm that all triplets, with the exception of those in the TDP motif, are susceptible to changes in the abundance of proteins, whether from overexpression or the deletion of their genes.

Cellular subnetworks of exceptional importance identified through integrated analysis of triplets with genes that are cell cycle-regulated, essential and toxic on overexpression

Having established that abundance changes appear to contribute strongly to the negative genetic interactions in the six motifs, we finally explored the utility of the six motif types to reveal subnetworks of exceptional biological importance in the eukaryotic cell. Such regions of ‘network vulnerability’ could become the focus of future intervention, for control of fungi that are animal or plant pathogens.62,63 To do this, we searched for triplets where 1) all three proteins were essential, or 2) all proteins were toxic upon overexpression, or 3) all corresponding genes were cell cycle-regulated. We then asked whether these triplets were present in one, two or all of these cases and visualized this in a Venn diagram (Figure 7a). For overlapping regions in the Venn diagram, we identified the relevant subnetworks and analyzed their function (Supporting Information Table S9).

Strikingly, there was only one triplet in which all three proteins were essential, toxic on overexpression and for which the corresponding genes were cell cycle-regulated (Figure 7b). This triplet consists of Spc29p and Spc42p, which are part of the spindle pole body complex, and the Mps1p kinase that phosphorylates Spc42p and regulates the proper assembly of Spc42p into the spindle pole body.64 Our finding is consistent with the function of the spindle pole body as a dynamic multi-subunit complex that is essential to cell division and cell cycle regulation. There were 123 triplet motifs in which all three genes per triplet were cell cycle-regulated and essential. Interestingly, this highlighted one area of ‘network vulnerability’ of exceptional biological importance. The subnetwork included parts of the DNA prereplication, DNA polymerase and cohesin complexes, containing a mix of triplet motifs PP, KUKU, PKD and PKU (Figure 7c). These complexes were centrally connected by a signaling triplet involving two key kinases, the cell cycle and DNA replication checkpoint kinases Rad53p and Cdc5p, which regulate cohesin and condensin functions.65,66

Two other areas of ‘network vulnerability’ (Figure 7d and e) had core transcription factors that served to connect components involved in multiple biological functions. For the subnetwork in which all proteins were essential and toxic upon overexpression (Figure 7d), the transcription factor Abf1p regulates genes involved in diverse functions, including genes involved in preribosome assembly67, and protein folding and nucleocytoplasmic transport.68 The regulation of genes involved in glycerolipid and sphingolipid metabolism by Abf1p is a novel observation. For the subnetwork in which all proteins were cell cycle-regulated and toxic upon overexpression (Figure 7e), the transcription factor Swi4p activates the expression of histone genes.69 Swi4p has also been implicated in cytokinesis.70 To summarize, the construction of subnetworks, by combining triplet motifs with data concerning cellular effects of modifying expression levels, has highlighted vital functions that exist in regions of network vulnerability in the cell.
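The triplet selection behind the Venn diagram can be expressed as simple set operations, as in the Python sketch below. It is a minimal illustration, not the code used in this study. The spindle pole body triplet (Spc29p, Spc42p, Mps1p) and its three annotations are taken from the text; the second triplet, its proteins X1p, X2p and X3p, and their annotations are hypothetical placeholders, as is the function name.

```python
def triplets_with_all(triplets, annotated_proteins):
    """Return the triplets in which every one of the three proteins carries the annotation."""
    return {t for t in triplets if all(protein in annotated_proteins for protein in t)}


spindle_triplet = ("Spc29p", "Spc42p", "Mps1p")   # named in the text
placeholder_triplet = ("X1p", "X2p", "X3p")       # hypothetical
triplets = {spindle_triplet, placeholder_triplet}

# Annotation sets: the spindle entries follow the text, the X entries are made up.
essential = {"Spc29p", "Spc42p", "Mps1p", "X1p", "X2p", "X3p"}
toxic_on_overexpression = {"Spc29p", "Spc42p", "Mps1p", "X1p"}
cell_cycle_regulated = {"Spc29p", "Spc42p", "Mps1p", "X1p", "X2p", "X3p"}

all_essential = triplets_with_all(triplets, essential)
all_toxic = triplets_with_all(triplets, toxic_on_overexpression)
all_cell_cycle = triplets_with_all(triplets, cell_cycle_regulated)

# Regions of the Venn diagram follow from set intersection and difference.
in_all_three = all_essential & all_toxic & all_cell_cycle
essential_and_cell_cycle_only = (all_essential & all_cell_cycle) - all_toxic
```

Here `in_all_three` contains only the spindle pole body triplet, while the placeholder triplet falls in the region that is essential and cell cycle-regulated but not uniformly toxic on overexpression.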


Figure 7. Triplet motifs consisting of proteins that are essential, toxic upon protein overexpression and those periodically expressed in the cell cycle. a) Of the six overrepresented triplet motifs, we identified instances where all three proteins were associated with the same set of phenotypes. These phenotypes include essential proteins, proteins that are toxic to the cell upon overexpression and cell cycle-regulated gene expression. The frequencies of triplets that match these criteria are shown in the Venn diagram, including cases where the different phenotypes overlap. For each of the four overlapping areas in the Venn diagram, the largest connected networks consisting of the corresponding triplets are shown (b-e). b) The proteins Spc42p, Mps1p, and Spc29p are essential, toxic upon overexpression, and their corresponding genes are cell cycle-regulated. These proteins form a triplet involved in spindle pole body formation. This example highlights a subnetwork which is regulated by the combination of protein-protein interactions and kinase-substrate interactions. c) Triplets in which all proteins are essential and their corresponding genes are expressed periodically in the cell cycle. This subnetwork consists of different protein complexes involved in DNA replication, cohesin, condensin and mitotic exit. These complexes are connected by the cell cycle-regulated kinases Rad53p and Cdc5p. These two proteins represent major regulators of the connected protein complexes. d) Triplets where all proteins are essential and toxic upon overexpression. This example shows that the transcription factor Abf1p regulates diverse functions. The triplet motifs connect gene targets of Abf1p that are involved in different biological functions, including glycerolipid and sphingolipid metabolism, pre-ribosome assembly, protein folding and nucleocytoplasmic transport. e) All three genes in the triplet motif are cell cycle-regulated and the corresponding protein products are toxic upon overexpression. This subnetwork includes the histone complex and proteins involved in bud neck morphology and cytokinesis. The transcription factor Swi4p regulates the expression of histone genes and forms a negative genetic interaction with Cdc11, which is required for cytokinesis. The kinase Rad53p interacts with both Swi4p and Cdc11p. The colour of the nodes represents the enriched GO terms from Figure 2. The enriched GO terms are rRNA assembly and maturation (cyan), nucleocytoplasmic transport (magenta), mitotic cell-cycle regulation (red), positive regulation of gene expression (blue), and no enriched GO terms (grey).


Discussion

This study has investigated the network basis of negative genetic interactions in the model organism S. cerevisiae. An integrated network was built by combining high-stringency negative genetic interactions21 with high-quality data from large-scale studies of protein-protein interactions4, kinase-substrate mapping data29 and transcription factor – target gene relationships30–32. A targeted triplet motif analysis12,13 was then undertaken on this network, in which we analysed only triplets containing two genes / proteins known to show a negative genetic interaction and a third protein from the network. Whilst a relatively small proportion of all negative genetic interactions are in triplet motifs, this allowed us to focus our investigation and to interpret the results in the simplest possible context. Interestingly, our analysis showed that only six triplets (PP, TUTU, PKU, KUKU, TDP and PKD), out of the 15 possible motifs, were significantly overrepresented compared to randomized networks. These fell into three non-overlapping groups: those involving only protein-protein interactions (PP), those involving signaling (PKU, KUKU and PKD) and those involving transcription factor – target relationships (TUTU and TDP). These six triplets were robust to false-positive rates of up to 30% and false-negative rates of 80% for the negative genetic interactions and thus represent important building blocks of integrated networks. Further features of the six triplets are discussed below.
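The comparison of observed triplet counts against randomized networks can be illustrated with a minimal sketch. It is not the procedure used in this study: the toy edge list, the undirected edge labels ('P', 'K'), the node-label shuffling null model and the helper names `count_triplets`, `shuffle_labels` and `enrichment` are all assumptions made for illustration, and the direction of kinase-substrate and transcription factor – target edges is ignored.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def count_triplets(neg_gi, other):
    """Count triplets made of one negative genetic interaction (a, b) plus a
    third node c that is linked to both a and b by other interaction types.

    neg_gi: set of frozenset({a, b})
    other:  dict mapping frozenset({x, y}) -> edge type label (hypothetical)
    Returns a Counter keyed by the sorted pair of edge labels, e.g. ('P', 'P').
    """
    neighbours = {}
    for pair in other:
        x, y = tuple(pair)
        neighbours.setdefault(x, set()).add(y)
        neighbours.setdefault(y, set()).add(x)
    counts = Counter()
    for pair in neg_gi:
        a, b = tuple(pair)
        for c in neighbours.get(a, set()) & neighbours.get(b, set()):
            labels = sorted((other[frozenset((a, c))], other[frozenset((b, c))]))
            counts[tuple(labels)] += 1
    return counts


def shuffle_labels(neg_gi, nodes, rng):
    """Assumed null model: permute node identities in the genetic network."""
    perm = dict(zip(nodes, rng.sample(nodes, len(nodes))))
    return {frozenset((perm[a], perm[b])) for a, b in map(tuple, neg_gi)}


def enrichment(neg_gi, other, motif, n_random=200, seed=0):
    """Return observed count, null mean, z-score and empirical p-value."""
    rng = random.Random(seed)
    nodes = sorted({n for edge in list(neg_gi) + list(other) for n in edge})
    observed = count_triplets(neg_gi, other)[motif]
    null = [count_triplets(shuffle_labels(neg_gi, nodes, rng), other)[motif]
            for _ in range(n_random)]
    mu, sd = mean(null), pstdev(null)
    z = (observed - mu) / sd if sd > 0 else float("nan")
    p = (1 + sum(x >= observed for x in null)) / (1 + n_random)
    return observed, mu, z, p


# Toy network with hypothetical node names.
neg_gi = {frozenset(("A", "B"))}
other = {
    frozenset(("A", "C")): "P",
    frozenset(("B", "C")): "P",
    frozenset(("A", "D")): "K",
    frozenset(("B", "D")): "P",
}
counts = count_triplets(neg_gi, other)
observed, null_mean, z, p = enrichment(neg_gi, other, ("P", "P"))
```

On a real network, the null model would need to preserve properties such as node degree, and the p-values would need correction for the 15 motif types tested.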

Paralogs are overrepresented in triplets but are rare in negative genetic interactions

In this study, we have shown that negative genetic interactions in triplet motifs are enriched for pairs of paralogs and ohnologs. It has been suggested that paralogous genes, since they are duplicated, can provide functional redundancy if either copy is deleted or mutated to be nonfunctional. Our results are consistent with VanderSluis et al.71, who showed an enrichment of negative genetic interactions among paralogous gene pairs, confirming the presence of genetic buffering. However, the proportion of paralog-containing triplets with negative genetic interactions in our study was very low (~1%), so paralogs account for only a small part of the total number of triplets with negative genetic interactions. This conclusion is supported by Ihmels et al.72 and Stein and Aloy73, who also showed that paralogs could explain only a small percentage of negative genetic interactions in yeast. Interestingly, Diss et al.74 showed, from the analysis of interaction partners of 56 pairs of paralogous proteins in yeast, that paralogs often form heterodimers and that deletion of either paralog interferes with the binding of the heterodimer with other partners, thus perturbing the intracellular network.

Signaling and regulatory triplets show no direct interconnections but contain feed-forward loops

The cell integrates inputs, including environmental cues and cellular status, to determine the appropriate output response. This occurs in signaling and regulatory networks, often through recurring logical processing modules.17 About one third of the triplets identified in this study contained negative genetic interactions that overlapped with other interactions in protein complexes, with kinase-substrate interactions in signaling loops or with a transcriptional regulatory relationship. However, we noted that regulatory and signaling triplets, and the regulatory / signaling networks that they are part of, show a striking lack of direct overlap with each other. Analysis of the genetic interactions in regulatory triplets that overlapped with other interaction types revealed a depletion of kinase-substrate interactions. Equally, analysis of the genetic interactions in signaling triplets that overlapped with other interaction types revealed a reduction of transcription factor – target regulatory relationships. Consistent with this, we observed a low representation of ‘mixed triplets’ in our integrated network (e.g. those that contained both kinase-substrate and transcription factor – target regulatory relationships, such as TDKD, TDKU, TUKD and TUKU). Together, this implies that the signaling and regulatory networks as studied here show remarkable robustness with respect to each other and/or that any negative genetic interactions that exist between these two networks are indirect (and multiple steps away from each other in an integrated network). Despite the low degree of direct interconnection between regulatory and signaling events in the triplet motifs, the regulatory and signaling triplets each had negative genetic interactions overlapping with regulatory and signaling interactions respectively. These observations are consistent with feed-forward loops within signaling and regulatory networks.12,15,75 Such loops are involved in detecting persistent signals and filtering out short signal pulses76, increasing the transcriptional response time76, detecting cellular signals based on fold-change77 and maintaining the stability and reversibility of cellular state78.
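As an illustration of what a feed-forward loop looks like in a directed network, the following sketch enumerates all node triples (x, y, z) in which x regulates y, y regulates z, and x also regulates z directly. The function name `feed_forward_loops` and the node names in the toy edge list are hypothetical; the sketch does not reproduce the analysis performed in this study.

```python
def feed_forward_loops(edges):
    """Return all (x, y, z) with directed edges x->y, y->z and x->z."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)
    loops = set()
    for x, x_targets in successors.items():
        for y in x_targets:
            if y == x:
                continue  # ignore self-regulation
            # z must be a target of both the upstream and the intermediate node
            for z in successors.get(y, set()) & x_targets:
                if z != x and z != y:
                    loops.add((x, y, z))
    return loops


# Toy directed network (regulator -> target); names are placeholders.
edges = [
    ("TF1", "TF2"),
    ("TF2", "geneG"),
    ("TF1", "geneG"),
    ("kinaseK", "substrateS"),
]
loops = feed_forward_loops(edges)
```

In this toy example the only loop is TF1 → TF2 → geneG with the direct edge TF1 → geneG; the isolated kinase-substrate edge forms no loop.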

Regions of high network vulnerability are highlighted by triplets that are deleterious when perturbed

Recent studies have identified ‘network vulnerability’ as an important cause of disease, in which perturbation of specific proteins and their local network partners can lead to detrimental effects on cellular homeostasis and growth.79 Examples of network vulnerability include the enrichment of gene mutations in network ‘disease modules’ associated with complex and multi-allelic diseases79,80 and the role of perturbation and mutations of kinases and phosphorylation targets in rewiring and sustaining signaling pathways associated with cancer hallmarks81–84. Here, co-analysis of the integrated biological network with phenotypic data identified regions of high network vulnerability, thus defining a number of ‘Achilles’ heels’ in the cell. These subnetworks contained triplets in which all three proteins were essential, toxic upon overexpression and/or their corresponding genes were cell cycle-regulated. Importantly, these subnetworks show extreme sensitivity to perturbation of gene or protein expression levels and are also likely to show extreme sensitivity to mutation. Interestingly, the cohesin complex was identified as part of a subnetwork of exceptional biological importance in our study. Evolutionarily conserved proteins in this complex have been reported as mutated in colorectal cancer, suggesting them as potential drug targets for synergistic chemotherapy treatment.85 Our observations of regions of network vulnerability highlight how critical network regions can be defined when rich data for intracellular networks exist. As more human biological network data become available86–90, it will be of interest to apply methods similar to those used in this study to analyse genetic interactions in the context of integrated biological networks. This will help highlight regions of network vulnerability in the human cell, which could in turn reveal targets for single or synergistic drugs and help understand mechanisms of synergy.91
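The co-analysis of triplets with phenotypic data amounts to asking, for each triplet, which phenotypes are shared by all three members. A minimal sketch of this set-based classification is given below; the function name `phenotype_regions`, the gene identifiers and the phenotype sets are placeholders, not data from this study.

```python
from collections import Counter


def phenotype_regions(triplets, phenotypes):
    """Count triplets per combination of phenotypes shared by all members.

    triplets:   iterable of 3-tuples of gene names
    phenotypes: dict mapping phenotype name -> set of genes with that phenotype
    Returns a Counter keyed by frozenset of shared phenotype names
    (one key per occupied region of the corresponding Venn diagram).
    """
    regions = Counter()
    for triplet in triplets:
        shared = frozenset(
            name for name, genes in phenotypes.items() if set(triplet) <= genes
        )
        if shared:  # skip triplets with no phenotype common to all members
            regions[shared] += 1
    return regions


# Hypothetical phenotype annotations and triplets.
phenotypes = {
    "essential": {"g1", "g2", "g3", "g4"},
    "toxic_on_overexpression": {"g1", "g2", "g3"},
    "cell_cycle_regulated": {"g4", "g5", "g6"},
}
triplets = [
    ("g1", "g2", "g3"),
    ("g1", "g2", "g4"),
    ("g4", "g5", "g6"),
    ("g1", "g5", "g6"),
]
regions = phenotype_regions(triplets, phenotypes)
```

Triplets falling into regions with two or more shared phenotypes would be the candidates for regions of high network vulnerability in this simplified view.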

Limitations of the triplet motif analyses

There are a number of limitations to this study. The coverage of the networks will have directly affected the number and type of triplets identified. The most comprehensive network is the genetic interaction network, with 90% of the network mapped to date.21 Although screens of yeast protein-protein interactions are approaching saturation, a lower proportion of interactions have been identified independently in multiple screens, with 18-35% estimated coverage in our network using this criterion55. The coverage of the signaling (9%) and regulatory (3%) networks in this study is low compared to upper estimates of 26,651 kinase-substrate interactions29 and 206,299 regulatory interactions92. These upper estimates include indirect interactions arising from the secondary effects of a kinase or transcription factor knockout; therefore, the number of actual direct interactions is likely to be much lower (e.g. MacIsaac et al.31; Ptacek et al.75). The low coverage of signaling and regulatory networks here will have contributed to the relatively low overall number of negative genetic interactions that were found within triplet motifs. While large-scale screens of yeast protein-protein interactions, such as that of Yu et al.93, are of high quality, these datasets will not be free of false positives94. Using only interactions verified independently in multiple experiments should reduce the number of false-positive interactions22,29; this approach was used here for the protein-protein, signaling and regulatory networks. Screening with orthogonal methods would likely increase the interactome coverage and reduce false-negative identifications.95
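The filtering step described above, in which only interactions verified independently in multiple experiments are retained, can be sketched as follows. The record layout (protein A, protein B, source identifier), the threshold parameter `min_sources` and the function name `high_confidence` are assumptions for illustration and do not describe the exact criteria applied to the datasets in this study.

```python
from collections import defaultdict


def high_confidence(records, min_sources=2):
    """Keep unordered protein pairs reported by at least `min_sources`
    distinct sources (e.g. independent publications or screens)."""
    support = defaultdict(set)
    for protein_a, protein_b, source in records:
        if protein_a != protein_b:  # ignore self-interactions
            support[frozenset((protein_a, protein_b))].add(source)
    return {pair for pair, sources in support.items()
            if len(sources) >= min_sources}


# Hypothetical interaction records: (protein A, protein B, source identifier).
records = [
    ("p1", "p2", "study_1"),
    ("p2", "p1", "study_2"),  # same pair, reversed order, second source
    ("p1", "p3", "study_1"),
    ("p1", "p3", "study_1"),  # duplicate report from the same source
]
kept = high_confidence(records)
```

Counting distinct sources rather than records prevents a single study that reports a pair several times from being treated as independent confirmation.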

The majority of large-scale interaction screens were performed in standard laboratory conditions, but there are many interactions that will only be present in specific environmental or stress conditions, which have been experimentally identified in a small number of studies96–99. Thus most condition-specific interactions will be missing from the current interactome. A further limitation is that the networks analyzed in this study, with the exception of cell-cycle changes, are static. To better represent the dynamics of networks, further data could be considered, including but not limited to subcellular localization, stress stimuli, and interactions that are dependent on protein co- or post-translational modifications.100–104 Most screening techniques remain limited to ‘brute force’ techniques that involve ‘one-at-a-time’ experiments, including pairwise screens for interactions (e.g. yeast two-hybrid) and screening of bait proteins with their interaction partners (e.g. affinity purification and mass spectrometry). The development of novel approaches to measure many protein-protein interactions in parallel and under different environments (e.g. Celaj et al.99) would enable better understanding of the dynamics of biological networks.

This study elucidated the ‘network basis’ of negative genetic interactions by focusing on a subset of triplets; these represent the smallest possible set of network motifs beyond pairwise interactions that involve a single negative genetic interaction. Our analyses showed that one-third of the negative genetic interactions identified in the six overrepresented triplet motifs could be accounted for by overlaps with known physical or regulatory interactions (one-step interactions), while the remainder could be explained by indirect interactions within the triplet (two-step interactions). Since many kinase-substrate and transcription factor-target relationships remain undefined as compared to protein-protein interactions, the one- and two-step interactions associated with negative genetic interactions are likely to be underestimated in yeast. In addition, restricting the analysis to a specific subclass of triplets, those with one negative genetic interaction, could lead to bias and limit the ability to analyse the full spectrum of indirect interactions and cross-talk between signaling and regulatory networks. It would be of interest to extend our investigations, such as overlapping motifs, overlapping interactions, and co-analyses with diverse phenotypic datasets, to larger integrated networks involving additional types of interactions and motifs of larger size. Analyses of different types of genetic interactions within triplet motifs6, or analyses of quadruplets and larger motifs12,16,105, have been reported elsewhere. Nevertheless, our study has identified core features within integrated biological networks. It is expected that, as data quantity and quality improve, systems-level analysis of network motifs and their dynamic regulation could further illuminate network features, including how negative genetic interactions might emerge.
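The distinction between one-step and two-step interactions can be made concrete with a small sketch. Here a negative genetic interaction is labelled ‘one-step’ if the two genes are also directly linked by another interaction type, and ‘two-step’ if they are linked only through a shared third node. The function name `classify_negative_gi`, the ‘unexplained’ label and the toy edges are hypothetical and are not taken from this study.

```python
def classify_negative_gi(pair, other_edges):
    """Classify a negative genetic interaction relative to other edge types.

    pair:        tuple (a, b) of genes with a negative genetic interaction
    other_edges: set of frozenset({x, y}) for physical/regulatory interactions
    """
    a, b = pair
    if frozenset((a, b)) in other_edges:
        return "one-step"  # the genetic interaction overlaps a direct edge
    neighbours = {a: set(), b: set()}
    for edge in other_edges:
        for node in (a, b):
            if node in edge:
                neighbours[node] |= set(edge) - {node}
    if neighbours[a] & neighbours[b]:
        return "two-step"  # linked only through a shared third node
    return "unexplained"


# Toy data with placeholder gene names.
other_edges = {
    frozenset(("a", "b")),
    frozenset(("c", "x")),
    frozenset(("d", "x")),
    frozenset(("e", "y")),
}
labels = {
    pair: classify_negative_gi(pair, other_edges)
    for pair in [("a", "b"), ("c", "d"), ("e", "f")]
}
```

Because every negative genetic interaction analysed in a triplet has, by construction, a third node connected to both genes, the ‘unexplained’ label would only arise for pairs outside triplet motifs.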

Conclusions

In conclusion, using triplet analysis of integrated networks, this study has defined many biological features of networks that are significantly associated with negative genetic interactions. Together, these explain a substantial proportion of the negative genetic interactions that exist in the yeast cell. Analysis of triplets that contained proteins that cause a deleterious phenotype when their abundance is perturbed helped define regions of network vulnerability, which could be targets of intervention for the future control of pathogenic fungi. It is highly likely that the motifs we have described will exist, and show similar relationships, in all living things and thus represent some of the important building blocks of eukaryotic intracellular networks.

Supporting Information

The following files are available free of charge.

Figure S1. Proteins in triplet motifs show significant association with the same phenotypes on genetic perturbation (PDF).
Figure S2. Proteins in triplet motifs show an overrepresentation of one or more GO terms, especially for biological process and cellular component (PDF).
Figure S3. Random removal or addition of negative genetic interactions highlights the robustness of the six triplet motifs (PDF).
Figure S4. The six triplet motif types are overrepresented in integrated networks constructed with increasingly stringent and smaller subsets of the negative genetic interaction data (PDF).
Table S1. The list of negative genetic interactions (ZIP).
Table S2. The list of protein-protein interactions (ZIP).
Table S3. The list of kinase-substrate interactions (ZIP).
Table S4. The list of transcription factor – target gene interactions used in this study (ZIP).
Table S5. The list of genes that are cell cycle-regulated, essential or toxic on overexpression (ZIP).
Table S6. The list of all paralogs and ohnologs in S. cerevisiae (ZIP).
Table S7. All interactions present in the integrated biological network, including information on which interactions are part of triplets (ZIP).
Table S8. The number of interactions in each network (ZIP).
Table S9. The list of all the triplets in the filtered integrated network and triplets with genes that are cell cycle-regulated, essential and toxic on overexpression, and their shared phenotype on knockout and GO slim terms (ZIP).
Table S10. The observed counts, standard deviation, fold-enrichments and p-value of the six overrepresented triplet motifs (ZIP).
Table S11. Biological properties associated with negative genetic interactions and their increased predominance in triplet motifs (ZIP).

Corresponding Author

*Phone: (+61) 2 9385 3633; fax: (+61) 2 9385 1483; e-mail: [email protected]

ORCID

Chi Nam Ignatius Pang: 0000-0001-9703-5741

Authors’ contributions

C.N.I.P. carried out the bioinformatics analyses. A.G. carried out preliminary bioinformatics analyses. M.R.W. conceived and coordinated the project and reviewed all results. C.N.I.P. and M.R.W. wrote and edited the manuscript with contributions from A.G. All authors gave final approval for publication. The authors declare no competing financial interest.

Funding Sources

M.R.W. acknowledges support from the Australian Research Council Discovery Project Grant (DP130100349) and NCRIS funding administered by Bioplatforms Australia. A.G. was the recipient of an Australian Postgraduate Award.


References

(1) Beyer, A.; Bandyopadhyay, S.; Ideker, T. Integrating physical and genetic maps: from genomes to interaction networks. Nature Rev. Genetics 2007, 8 (9), 699–710.

(2) Wang, L.; Hou, L.; Qian, M.; Deng, M. Integrating phosphorylation network with transcriptional network reveals novel functional relationships. PLoS ONE 2012, 7 (3), e33160.

(3) Fiedler, D.; Braberg, H.; Mehta, M.; Chechik, G.; Cagney, G.; Mukherjee, P.; Silva, A. C.; Shales, M.; Collins, S. R.; van Wageningen, S.; et al. Functional Organization of the S. cerevisiae Phosphorylation Network. Cell 2009, 136 (5), 952–963.

(4) Pang, C. N. I.; Goel, A.; Li, S. S.; Wilkins, M. R. A Multidimensional Matrix for Systems Biology Research and Its Application to Interaction Networks. Journal of Proteome Research 2012, 11 (11), 5204–5220.

(5) van Wageningen, S.; Kemmeren, P.; Lijnzaad, P.; Margaritis, T.; Benschop, J. J.; de Castro, I. J.; van Leenen, D.; Groot Koerkamp, M. J.; Ko, C. W.; Miles, A. J.; et al. Functional overlap and regulatory links shape genetic interactions between signaling pathways. Cell 2010, 143 (6), 991–1004.

(6) Sharifpoor, S.; Van Dyk, D.; Costanzo, M.; Baryshnikova, A.; Friesen, H.; Douglas, A. C.; Youn, J. Y.; VanderSluis, B.; Myers, C. L.; Papp, B.; et al. Functional wiring of the yeast kinome revealed by global analysis of genetic network motifs. Genome Research 2012, 22 (4), 791–801.

(7) Cho, H.; Berger, B.; Peng, J. Compact Integration of Multi-Network Topology for Functional Analysis of Genes. Cell Systems 2016, 3 (6), 1–9.

(8) Dutkowski, J.; Ono, K.; Kramer, M.; Yu, M.; Pratt, D.; Demchak, B.; Ideker, T. NeXO Web: The NeXO ontology database and visualization platform. Nucleic Acids Research 2014, 42 (Database issue), D1269-74.

(9) Yu, M. K.; Kramer, M.; Dutkowski, J.; Srivas, R.; Licon, K.; Kreisberg, J. F.; Ng, C. T.; Krogan, N.; Sharan, R.; Ideker, T. Translation of genotype to phenotype by a hierarchy of cell subsystems. Cell Systems 2016, 2 (2), 77–88.

(10) Young, J. H.; Marcotte, E. M. Predictability of Genetic Interactions from Functional Gene Modules. G3 (Bethesda) 2017, 7 (2), 617–624.

(11) Chechik, G.; Oh, E.; Rando, O.; Weissman, J.; Regev, A.; Koller, D. Activity motifs reveal principles of timing in transcriptional control of the yeast metabolic network. Nature Biotechnology 2008, 26 (11), 1251–1259.

(12) Yeger-Lotem, E.; Sattath, S.; Kashtan, N.; Itzkovitz, S.; Milo, R.; Pinter, R. Y.; Alon, U.; Margalit, H. Network motifs in integrated cellular networks of transcription-regulation and protein-protein interaction. Proceedings of the National Academy of Sciences of the United States of America 2004, 101 (16), 5934–5939.

(13) Zhang, L. V; King, O. D.; Wong, S. L.; Goldberg, D. S.; Tong, A. H. Y.; Lesage, G.; Andrews, B.; Bussey, H.; Boone, C.; Roth, F. P. Motifs, themes and thematic maps of an integrated Saccharomyces cerevisiae interaction network. Journal of Biology 2005, 4 (2), 6.

(14) Shen-Orr, S. S.; Milo, R.; Mangan, S.; Alon, U. Network motifs in the transcriptional regulation network of Escherichia coli. Nature Genetics 2002, 31 (1), 64–68.

(15) Kemmeren, P.; Sameith, K.; van de Pasch, L. A. L.; Benschop, J. J.; Lenstra, T. L.; Margaritis, T.; O’Duibhir, E.; Apweiler, E.; van Wageningen, S.; Ko, C. W.; et al. Large-Scale Genetic Perturbations Reveal Regulatory Networks and an Abundance of Gene-Specific Repressors. Cell 2014, 157 (3), 740–752.

(16) Chung, S. S.; Pandini, A.; Annibale, A.; Coolen, A. C. C.; Thomas, N. S. B.; Fraternali, F.; Vidal, M.; Liang, Z.; Xu, M.; Teng, M.; et al. Bridging topological and functional information in protein interaction networks by short loops profiling. Sci. Rep. 2015, 5, 8540.

(17) Milo, R.; Shen-Orr, S.; Itzkovitz, S.; Kashtan, N.; Chklovskii, D.; Alon, U. Network motifs: simple building blocks of complex networks. Science 2002, 298 (5594), 824–827.

(18) Alon, U. Network motifs: theory and experimental approaches. Nat. Rev. Genet. 2007, 8 (6), 450–461.

(19) Tong, A. H.; Evangelista, M.; Parsons, A. B.; Xu, H.; Bader, G. D.; Pagé, N.; Robinson, M.; Raghibizadeh, S.; Hogue, C. W.; Bussey, H.; et al. Systematic genetic analysis with ordered arrays of yeast deletion mutants. Science 2001, 294 (5550), 2364–2368.

(20) Costanzo, M.; Baryshnikova, A.; Bellay, J.; Kim, Y.; Spear, E. D.; Sevier, C. S.; Ding, H.; Koh, J. L.; Toufighi, K.; Mostafavi, S.; et al. The genetic landscape of a cell. Science 2010, 327 (5964), 425–431.

(21) Costanzo, M.; VanderSluis, B.; Koch, E. N.; Baryshnikova, A.; Pons, C.; Tan, G.; Wang, W.; Usaj, M.; Hanchard, J.; Lee, S. D.; et al. A global genetic interaction network maps a wiring diagram of cellular function. Science 2016, 353 (6306), aaf1420.

(22) Bertin, N.; Simonis, N.; Dupuy, D.; Cusick, M. E.; Han, J. D.; Fraser, H. B.; Roth, F. P.; Vidal, M. Confirmation of organized modularity in the yeast interactome. PLoS Biol. 2007, 5 (6), e153.

(23) Chatr-aryamontri, A.; Oughtred, R.; Boucher, L.; Rust, J.; Chang, C.; Kolas, N. K.; O’Donnell, L.; Oster, S.; Theesfeld, C.; Sellam, A.; et al. The BioGRID interaction database: 2017 update. Nucleic Acids Research 2017, 45 (D1), D369–D379.

(24) Bader, G. D.; Donaldson, I.; Wolting, C.; Ouellette, B. F.; Pawson, T.; Hogue, C. W. V; Betel, D.; Hogue, C. W. V. BIND: The Biomolecular Interaction Network Database. Nucleic Acids Research 2003, 31 (1), 248–250.

(25) Salwinski, L.; Miller, C. S.; Smith, A. J.; Pettit, F. K.; Bowie, J. U.; Eisenberg, D. The Database of Interacting Proteins: 2004 update. Nucleic Acids Research 2004, 32 (Database issue), D449-51.

(26) Orchard, S.; Ammari, M.; Aranda, B.; Breuza, L.; Briganti, L.; Broackes-Carter, F.; Campbell, N. H.; Chavali, G.; Chen, C.; Del-Toro, N.; et al. The MIntAct project IntAct as a common curation platform for 11 molecular interaction databases. Nucleic Acids Research 2014, 42 (Database issue), D358-63.

(27) Licata, L.; Briganti, L.; Peluso, D.; Perfetto, L.; Iannuccelli, M.; Galeota, E.; Sacco, F.; Palma, A.; Nardozza, A. P.; Santonico, E.; et al. MINT, the molecular interaction database: 2012 Update. Nucleic Acids Research 2012, 40 (Database issue), D857-61.

(28) del-Toro, N.; Dumousseau, M.; Orchard, S.; Jimenez, R. C.; Galeota, E.; Launay, G.; Goll, J.; Breuer, K.; Ono, K.; Salwinski, L.; et al. A new reference implementation of the PSICQUIC web service. Nucleic Acids Research 2013, 41 (Web Server issue), W601-6.

(29) Sharifpoor, S.; Nguyen Ba, A. N.; Young, J. Y.; van Dyk, D.; Friesen, H.; Douglas, A. C.; Kurat, C. F.; Chong, Y. T.; Founk, K.; Moses, A. M.; et al. A quantitative literature-curated gold standard for kinase-substrate pairs. Genome Biol 2011, 12 (4), R39.

(30) Lee, T. I.; Rinaldi, N. J.; Robert, F.; Odom, D. T.; Bar-Joseph, Z.; Gerber, G. K.; Hannett, N. M.; Harbison, C. T.; Thompson, C. M.; Simon, I.; et al. Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 2002, 298 (5594), 799–804.

(31) MacIsaac, K. D.; Wang, T.; Gordon, D. B.; Gifford, D. K.; Stormo, G. D.; Fraenkel, E. An improved map of conserved regulatory sites for Saccharomyces cerevisiae. BMC Bioinformatics 2006, 7 (1), 113.

(32) Reimand, J.; Aun, A.; Vilo, J.; Vaquerizas, J. M.; Sedman, J.; Luscombe, N. M. m:Explorer: multinomial regression models reveal positive and negative regulators of longevity in yeast quiescence. Genome Biol 2012, 13 (6), R55.

(33) Cherry, J. M.; Hong, E. L.; Amundsen, C.; Balakrishnan, R.; Binkley, G.; Chan, E. T.; Christie, K. R.; Costanzo, M. C.; Dwight, S. S.; Engel, S. R.; et al. Saccharomyces Genome Database: the genomics resource of budding yeast. Nucleic Acids Res. 2012, 40 (Database issue), D700-5.

(34) Madhani, H. D.; Galitski, T.; Lander, E. S.; Fink, G. R. Effectors of a developmental mitogen-activated protein kinase cascade revealed by expression signatures of signaling mutants. Proceedings of the National Academy of Sciences of the United States of America 1999, 96 (22), 12530–12535.

(35) Granovskaia, M. V; Jensen, L. J.; Ritchie, M. E.; Toedling, J.; Ning, Y.; Bork, P.; Huber, W.; Steinmetz, L. M. High-resolution transcription atlas of the mitotic cell cycle in budding yeast. Genome Biology 2010, 11 (3), R24.

(36) Giaever, G.; Chu, A. M.; Ni, L.; Connelly, C.; Riles, L.; Véronneau, S.; Dow, S.; Lucau-Danila, A.; Anderson, K.; André, B.; et al. Functional profiling of the Saccharomyces cerevisiae genome. Nature 2002, 418 (6896), 387–391.

(37) Sopko, R.; Huang, D.; Preston, N.; Chua, G.; Papp, B.; Kafadar, K.; Snyder, M.; Oliver, S. G.; Cyert, M.; Hughes, T. R.; et al. Mapping pathways and phenotypes by systematic gene overexpression. Mol. Cell 2006, 21 (3), 319–330.

(38) Fischer, S.; Brunk, B. P.; Chen, F.; Gao, X.; Harb, O. S.; Iodice, J. B.; Shanmugam, D.; Roos, D. S.; Stoeckert Jr., C. J. Using OrthoMCL to assign proteins to OrthoMCL-DB groups or to cluster proteomes into new ortholog groups. Current protocols in bioinformatics / editorial board, Andreas D. Baxevanis ... [et al.] 2011, Chapter 6, Unit 6.12.1-19.

(39) Byrne, K. P.; Wolfe, K. H. The Yeast Gene Order Browser: Combining curated homology and syntenic context reveals gene fate in polyploid species. Genome Research 2005, 15 (10), 1456–1461.

(40) Yu, X.; Lin, J.; Zack, D. J.; Mendell, J. T.; Qian, J. Analysis of regulatory network topology reveals functionally distinct classes of microRNAs. Nucleic Acids Res 2008, 36 (20), 6494–6503.

(41) Baryshnikova, A. Systematic Functional Annotation and Visualization of Biological Networks. Cell Systems 2016, 2 (6), 412–421.

(42) Li, J.; Yuan, Z.; Zhang, Z. The Cellular Robustness by Genetic Redundancy in Budding Yeast. PLoS Genetics 2010, 6 (11), e1001187.

(43) Baryshnikova, A.; Costanzo, M.; Kim, Y.; Ding, H.; Koh, J.; Toufighi, K.; Youn, J.-Y.; Ou, J.; San Luis, B.-J.; Bandyopadhyay, S.; et al. Quantitative analysis of fitness and genetic interactions in yeast on a genome scale. Nature Methods 2010, 7 (12), 1017–1024.

(44) Varusai, T. M.; Kolch, W.; Kholodenko, B. N.; Nguyen, L. K. Protein-protein interactions generate hidden feedback and feed-forward loops to trigger bistable switches, oscillations and biphasic dose-responses. Mol. Biosyst. 2015, 11 (10), 2750–2762.

(45) Owsianik, G.; Balzi l, L.; Ghislain, M. Control of 26S proteasome expression by transcription factors regulating multidrug resistance in Saccharomyces cerevisiae. Mol. Microbiol. 2002, 43 (5), 1295–308.

(46) Hahn, J. S.; Hu, Z.; Thiele, D. J.; Iyer, V. R. Genome-wide analysis of the biology of stress responses through heat shock transcription factor. Mol Cell Biol 2004, 24 (12), 5249–5256.

(47) Pfanner, N.; Meijer, M. Mitochondrial biogenesis: The Tom and Tim machine. Current Biology 1997, 7 (2), R100–R103.

(48) Sharp, J. A.; Franco, A. A.; Osley, M. A.; Kaufman, P. D. Chromatin assembly factor I and Hir proteins contribute to building functional kinetochores in S. cerevisiae. Genes and Development 2002, 16 (1), 85–100.

(49) Yamaguchi-Iwai, Y.; Dancis, A.; Klausner, R. D. AFT1: a mediator of iron regulated transcriptional control in Saccharomyces cerevisiae. Embo J 1995, 14 (6), 1231–1239.

(50)

Lemire, J. M.; Willcocks, T.; Halvorson, H. O.; Bostian, K. A. Regulation of 49 ACS Paragon Plus Environment

Journal of Proteome Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 50 of 58

repressible acid phosphatase gene transcription in Saccharomyces cerevisiae. Mol Cell Biol 1985, 5 (8), 2131–2141. (51)

Daignan-Fornier, B.; Fink, G. R. Coregulation of purine and histidine biosynthesis by the transcriptional activators BAS1 and BAS2. Proceedings of the National Academy of Sciences of the United States of America 1992, 89 (15), 6746–6750.

(52)

Goto, G. H.; Mishra, A.; Abdulle, R.; Slaughter, C. A.; Kitagawa, K. Bub1-mediated adaptation of the spindle checkpoint. PLoS Genetics 2011, 7 (1), e1001282.

(53)

Jaspersen, S. L.; Huneycutt, B. J.; Giddings, T. H.; Resing, K. A.; Ahn, N. G.; Winey, M. Cdc28/Cdk1 regulates spindle pole body duplication through phosphorylation of Spc42 and Mps1. Developmental Cell 2004, 7 (2), 263–274.

(54)

Talavera, D.; Robertson, D. L.; Lovell, S. C. The Role of Protein Interactions in Mediating Essentiality and Synthetic Lethality. PLoS ONE 2013, 8 (4), e62866.

(55)

Hart, G. T.; Ramani, A. K.; Marcotte, E. M. How complete are current yeast and human protein-interaction networks? Genome Biol 2006, 7 (11), 120.

(56)

Benschop, J. J.; Brabers, N.; van Leenen, D.; Bakker, L. V.; van Deutekom, H. W.; van Berkum, N. L.; Apweiler, E.; Lijnzaad, P.; Holstege, F. C.; Kemmeren, P. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Mol. Cell 2010, 38 (6), 916–928.

(57)

Cappell, S. D.; Baker, R.; Skowyra, D.; Dohlman, H. G. Systematic analysis of essential genes reveals important regulators of G protein signaling. Mol. Cell 2010, 38 (5), 746–757.

(58)

Jensen, L. J.; Jensen, T. S.; de Lichtenberg, U.; Brunak, S.; Bork, P. Co-evolution of transcriptional and post-translational cell-cycle regulation. Nature 2006, 443 (7111), 594–597.

(59)

de Lichtenberg, U.; Jensen, L. J.; Brunak, S.; Bork, P. Dynamic complex formation during the yeast cell cycle. Science 2005, 307 (5710), 724–727.

(60)

Niu, W.; Li, Z.; Zhan, W.; Iyer, V. R.; Marcotte, E. M. Mechanisms of cell cycle control revealed by a systematic and quantitative overexpression screen in S. cerevisiae. PLoS Genet 2008, 4 (7), e1000120.

(61)

Ma, L.; Pang, C. N. I.; Li, S. S.; Wilkins, M. R. Proteins deleterious on overexpression are associated with high intrinsic disorder, specific interaction domains, and low abundance. Journal of Proteome Research 2010, 9 (3), 1218–1225.

(62)

Jerby-Arnon, L.; Pfetzer, N.; Waldman, Y. Y.; McGarry, L.; James, D.; Shanks, E.; Seashore-Ludlow, B.; Weinstock, A.; Geiger, T.; Clemons, P. A.; et al. Predicting cancer-specific vulnerability via data-driven detection of synthetic lethality. Cell 2014, 158 (5), 1199–1209.

(63)

Roemer, T.; Boone, C. Systems-level antimicrobial drug and drug synergy discovery. Nature Chemical Biology 2013, 9 (4), 222–231.

(64)

Castillo, A. R.; Meehl, J. B.; Morgan, G.; Schutz-Geschwender, A.; Winey, M. The yeast protein kinase Mps1p is required for assembly of the integral spindle pole body component Spc42p. The Journal of Cell Biology 2002, 156 (3), 453–465.

(65)

Pakchuen, S.; Ishibashi, M.; Takakusagi, E.; Shirahige, K.; Sutani, T. Physical Association of Saccharomyces cerevisiae Polo-like Kinase Cdc5 with Chromosomal Cohesin Facilitates DNA Damage Response. Journal of Biological Chemistry 2016, 291 (33), 17228–17246.

(66)

St-Pierre, J.; Douziech, M.; Bazile, F.; Pascariu, M.; Bonneil, É.; Sauvé, V.; Ratsima, H.; D’Amours, D. Polo Kinase Regulates Mitotic Chromosome Condensation by Hyperactivation of Condensin DNA Supercoiling Activity. Molecular Cell 2009, 34 (4), 416–426.

(67)

Planta, R. J.; Goncalves, P. M.; Mager, W. H. Global regulators of ribosome biosynthesis in yeast. Biochem. Cell Biol. 1995, 73 (11–12), 825–834.

(68)

Loch, C. M.; Mosammaparast, N.; Miyake, T.; Pemberton, L. F.; Li, R. Functional and physical interactions between autonomously replicating sequence-binding factor 1 and the nuclear transport machinery. Traffic 2004, 5 (12), 925–935.

(69)

Eriksson, P. R.; Ganguli, D.; Clark, D. J. Spt10 and Swi4 control the timing of histone H2A/H2B gene activation in budding yeast. Molecular and Cellular Biology 2011, 31 (3), 557–572.

(70)

Igual, J. C.; Toone, W. M.; Johnston, L. H. A genetic screen reveals a role for the late G1-specific transcription factor Swi4p in diverse cellular functions including cytokinesis. Journal of Cell Science 1997, 110 (Pt 14), 1647–1654.

(71)

VanderSluis, B.; Bellay, J.; Musso, G.; Costanzo, M.; Papp, B.; Vizeacoumar, F. J.; Baryshnikova, A.; Andrews, B.; Boone, C.; Myers, C. L. Genetic interactions reveal the evolutionary trajectories of duplicate genes. Molecular Systems Biology 2010, 6 (429), 1–13.

(72)

Ihmels, J.; Collins, S. R.; Schuldiner, M.; Krogan, N. J.; Weissman, J. S. Backup without redundancy: genetic interactions reveal the cost of duplicate gene loss. Molecular Systems Biology 2007, 3 (86), 86.

(73)

Stein, A.; Aloy, P. A molecular interpretation of genetic interactions in yeast. FEBS Letters 2008, 582 (8), 1245–1250.

(74)

Diss, G.; Gagnon-Arsenault, I.; Dion-Coté, A.-M.; Vignaud, H.; Ascencio, D. I.; Berger, C. M.; Landry, C. R. Gene duplication can impart fragility, not robustness, in the yeast protein interaction network. Science 2017, 355 (6325), 630–634.

(75)

Ptacek, J.; Devgan, G.; Michaud, G.; Zhu, H.; Zhu, X.; Fasolo, J.; Guo, H.; Jona, G.; Breitkreutz, A.; Sopko, R.; et al. Global analysis of protein phosphorylation in yeast. Nature 2005, 438 (7068), 679–684.

(76)

Mangan, S.; Alon, U. Structure and function of the feed-forward loop network motif. Proceedings of the National Academy of Sciences of the United States of America 2003, 100 (21), 11980–11985.

(77)

Goentoro, L.; Shoval, O.; Kirschner, M. W.; Alon, U. The Incoherent Feedforward Loop Can Provide Fold-Change Detection in Gene Regulation. Molecular Cell 2009, 36 (5), 894–899.

(78)

Doncic, A.; Skotheim, J. M. Feed-forward regulation ensures stability and rapid reversibility of a cellular state. Mol. Cell 2014, 50 (6), 856–868.

(79)

Menche, J.; Sharma, A.; Kitsak, M.; Ghiassian, S. D.; Vidal, M.; Loscalzo, J.; Barabási, A.-L. Uncovering disease-disease relationships through the incomplete interactome. Science 2015, 347 (6224), 1257601.

(80)

Sharma, A.; Menche, J.; Chris Huang, C.; Ort, T.; Zhou, X.; Kitsak, M.; Sahni, N.; Thibault, D.; Voung, L.; Guo, F.; et al. A disease module in the interactome explains disease heterogeneity, drug response and captures novel pathways and genes in asthma. Human Molecular Genetics 2014, 24 (11), 3005–3020.

(81)

Reimand, J.; Bader, G. D. Systematic analysis of somatic mutations in phosphorylation signaling predicts novel cancer drivers. Molecular Systems Biology 2014, 9 (1), 637–637.

(82)

Creixell, P.; Schoof, E. M.; Simpson, C. D.; Longden, J.; Miller, C. J.; Lou, H. J.; Perryman, L.; Cox, T. R.; Zivanovic, N.; Palmeri, A.; et al. Kinome-wide Decoding of Network Attacking Mutations Driving Cancer Signaling. Cell 2015, 163 (1), 1–16.

(83)

Creixell, P.; Palmeri, A.; Miller, C. J.; Lou, H. J.; Santini, C. C.; Nielsen, M.; Turk, B. E.; Linding, R. Unmasking Determinants of Specificity in the Human Kinome. Cell 2015, 163 (1), 187–201.

(84)

Rozenblatt-Rosen, O.; Deo, R. C.; Padi, M.; Adelmant, G.; Calderwood, M. A.; Rolland, T.; Grace, M.; Dricot, A.; Askenazi, M.; Tavares, M.; et al. Interpreting cancer genomes using systematic host network perturbations by tumour virus proteins. Nature 2012, 487 (7408), 491–495.

(85)

Barber, T. D.; McManus, K.; Yuen, K. W. Y.; Reis, M.; Parmigiani, G.; Shen, D.; Barrett, I.; Nouhi, Y.; Spencer, F.; Markowitz, S.; et al. Chromatid cohesion defects may underlie chromosome instability in human colorectal cancers. Proceedings of the National Academy of Sciences of the United States of America 2008, 105 (9), 3443–3448.

(86)

Rolland, T.; Taşan, M.; Charloteaux, B.; Pevzner, S. J.; Zhong, Q.; Sahni, N.; Yi, S.; Lemmens, I.; Fontanillo, C.; Mosca, R.; et al. A proteome-scale map of the human interactome network. Cell 2014, 159 (5), 1212–1226.

(87)

Huttlin, E. L.; Bruckner, R. J.; Paulo, J. A.; Cannon, J. R.; Ting, L.; Baltier, K.; Colby, G.; Gebreab, F.; Gygi, M. P.; Parzen, H.; et al. Architecture of the human interactome defines protein communities and disease networks. Nature 2017, 545 (7655), 505–509.

(88)

Drew, K.; Lee, C.; Huizar, R. L.; Tu, F.; Borgeson, B.; McWhite, C. D.; Ma, Y.; Wallingford, J. B.; Marcotte, E. M. Integration of over 9,000 mass spectrometry experiments builds a global map of human protein complexes. Molecular Systems Biology 2017, 13 (6), 1–21.

(89)

Li, T.; Wernersson, R.; Hansen, R. B.; Horn, H.; Mercer, J.; Slodkowicz, G.; Workman, C. T.; Rigina, O.; Rapacki, K.; Stærfeldt, H. H.; et al. A scored human protein–protein interaction network to catalyze genomic interpretation. Nature Methods 2016, 14 (1), 61–64.

(90)

Srivas, R.; Shen, J. P.; Yang, C. C.; Sun, S. M.; Li, J.; Gross, A. M.; Jensen, J.; Licon, K.; Bojorquez-Gomez, A.; Klepper, K.; et al. A Network of Conserved Synthetic Lethal Interactions for Exploration of Precision Cancer Therapy. Molecular Cell 2016, 63 (3), 514–525.

(91)

Du, D.; Roguev, A.; Gordon, D. E.; Chen, M.; Chen, S.-H.; Shales, M.; Shen, J. P.; Ideker, T.; Mali, P.; Qi, L. S.; et al. Genetic interaction mapping in mammalian cells using CRISPR interference. Nature Methods 2017, 14 (6), 577–580.

(92)

Teixeira, M. C.; Monteiro, P. T.; Guerreiro, J. F.; Gonçalves, J. P.; Mira, N. P.; dos Santos, S. C.; Cabrito, T. R.; Palma, M.; Costa, C.; Francisco, A. P.; et al. The YEASTRACT database: an upgraded information system for the analysis of gene and genomic transcription regulation in Saccharomyces cerevisiae. Nucleic Acids Research 2014, 42 (Database issue), D161–D166.

(93)

Yu, H.; Braun, P.; Yildirim, M. A.; Lemmens, I.; Venkatesan, K.; Sahalie, J.; Hirozane-Kishikawa, T.; Gebreab, F.; Li, N.; Simonis, N.; et al. High-quality binary protein interaction map of the yeast interactome network. Science 2008, 322 (5898), 104–110.

(94)

Huang, H.; Jedynak, B. M.; Bader, J. S. Where have all the interactions gone? Estimating the coverage of two-hybrid protein interaction maps. PLoS Computational Biology 2007, 3 (11), 2155–2174.

(95)

Jensen, L. J.; Bork, P. Biochemistry. Not comparable, but complementary. Science 2008, 322 (5898), 56–57.

(96)

Filteau, M.; Diss, G.; Torres-Quiroz, F.; Dubé, A. K.; Schraffl, A.; Bachmann, V. A.; Gagnon-Arsenault, I.; Chrétien, A.-È.; Steunou, A.-L.; Dionne, U.; et al. Systematic identification of signal integration by protein kinase A. Proceedings of the National Academy of Sciences 2015, 112 (14), 4501–4506.

(97)

Kumar, A.; Beloglazova, N.; Bundalovic-Torma, C.; Phanse, S.; Deineko, V.; Gagarinova, A.; Musso, G.; Vlasblom, J.; Lemak, S.; Hooshyar, M.; et al. Conditional Epistatic Interaction Maps Reveal Global Functional Rewiring of Genome Integrity Pathways in Escherichia coli. Cell Reports 2015, 14 (3), 648–661.

(98)

Gutin, J.; Sadeh, A.; Rahat, A.; Aharoni, A.; Friedman, N. Condition-specific genetic interaction maps reveal crosstalk between the cAMP/PKA and the HOG MAPK pathways in the activation of the general stress response. Molecular Systems Biology 2015, 11 (10), 1–21.

(99)

Celaj, A.; Schlecht, U.; Smith, J. D.; Xu, W.; Suresh, S.; Miranda, M.; Aparicio, A. M.; Proctor, M.; Davis, R. W.; Roth, F. P.; et al. Quantitative analysis of protein interaction network dynamics in yeast. Molecular Systems Biology 2017, 13 (7), 934.

(100)

Przytycka, T. M.; Singh, M.; Slonim, D. K. Toward the dynamic interactome: It’s about time. Briefings in Bioinformatics 2010, 11 (1), 15–29.

(101)

Winter, D. L.; Erce, M. A.; Wilkins, M. R. A web of possibilities: network-based discovery of protein interaction codes. Journal of Proteome Research 2014, 13 (12), 5333–5338.

(102)

Seet, B. T.; Dikic, I.; Zhou, M.-M.; Pawson, T. Reading protein modifications with interaction domains. Nat. Rev. Mol. Cell Biol. 2006, 7 (7), 473–483.

(103)

Mackay, J. P.; Sunde, M.; Lowry, J. A.; Crossley, M.; Matthews, J. M. Protein interactions: is seeing believing? Trends in Biochemical Sciences 2007, 32 (12), 530–531.

(104)

Wilkins, M. R.; Kummerfeld, S. K. Sticking together? Falling apart? Exploring the dynamics of the interactome. Trends Biochem. Sci. 2008, 33 (5), 195–200.

(105)

Wernicke, S. Efficient detection of network motifs. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2006, 3 (4), 347–359.
