Global De Novo Protein–Protein Interactome Elucidates Interactions of


Article

Global de novo protein-protein interactome elucidates interactions of drought responsive proteins in horsegram (Macrotyloma uniflorum) Jyoti Bhardwaj, Indu Gangwar, Ganesh Prabhakar Panzade, Ravi Shankar, and Sudesh Kumar Yadav J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.5b01114 • Publication Date (Web): 10 May 2016 Downloaded from http://pubs.acs.org on May 16, 2016




Journal of Proteome Research

Global de novo protein-protein interactome elucidates interactions of drought responsive proteins in horsegram (Macrotyloma uniflorum)

Jyoti Bhardwaj1,†, Indu Gangwar2,3,†, Ganesh Panzade2,3, Ravi Shankar2,3,* and Sudesh Kumar Yadav1,3,4,*

1Plant Metabolic Engineering Laboratory and 2Studio of Computational Biology & Bioinformatics, Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology
3Academy of Scientific and Innovative Research, New Delhi, India
4Center of Innovative and Applied Bioprocessing (CIAB), Mohali-160071, Punjab, India

†Contributed equally
*Corresponding authors
E-mail: [email protected]; [email protected] [email protected]; [email protected]; [email protected]
Fax number: +91-1894-230433


ACS Paragon Plus Environment


ABSTRACT

Inspired by the availability of the de novo transcriptome of horsegram (Macrotyloma uniflorum) and by recent developments in systems biology, the first global protein-protein interactome (PPI) map was constructed for this highly drought tolerant legume. Large-scale study of PPIs and the resulting database can provide a rationale for the interplay, at cascading translational levels, behind drought stress adaptive mechanisms in horsegram. Using a bidirectional approach (interolog and domain-based), a high confidence interactome map and database for horsegram were constructed. Available transcriptomic information for shoot and root tissues of a drought sensitive genotype (M-191; genotype 1) and a drought tolerant genotype (M-249; genotype 2) of horsegram was used to draw comparative PPI sub-networks under drought stress. A total of 6804 high confidence interactions were predicted among 1812 proteins, covering about one-fourth of the horsegram proteome. The largest share of interactions in the horsegram interactome (33.86%) matched Arabidopsis PPI data. The top five hub nodes mostly included ubiquitin and heat shock related proteins. More PPIs were responsive in shoot tissue (416) and root tissue (2228) of genotype 2 than in shoot tissue (136) and root tissue (579) of genotype 1. Characterization of PPIs by gene ontology analysis revealed that kinase and transferase activities involved in signal transduction, cellular processes, nucleocytoplasmic transport, protein ubiquitination and localization of molecules were most responsive to drought stress. Hence, these could be framed in the stress adaptive mechanisms of horsegram. As the first global PPI map for a legume, it should provide new insights into gene and protein regulatory networks for drought stress tolerance mechanisms in horsegram. The information compiled in the form of a database (MauPIR) will provide much needed high confidence systems biology information on horsegram genes, proteins and the processes involved. This information should ease the effort and increase the efficacy of similar studies on other legumes. Public access is available at http://14.139.59.221/MauPIR/.

Keywords: Computational, domain-based study, drought stress, horsegram (Macrotyloma uniflorum), interologs, protein interactome


INTRODUCTION

Drought cannot be prevented, but it can be coped with. Drought stress in particular is an impending danger to agriculturally important plants1. Horsegram (Macrotyloma uniflorum), a highly drought tolerant yet under-exploited legume, could be a potential solution to drought stress2. Unfortunately, horsegram has not been subjected to high-throughput analysis, except for a recent transcriptomic study in which genes and pathways associated with drought tolerance were unravelled through the development of a de novo transcriptome assembly3. Despite its importance, the genomic and proteomic information available for horsegram remains limited in this post-genomic era compared with other plants. Genes alone cannot provide a comprehensive understanding of stress tolerance in horsegram. Therefore, to complete the picture, a systems biology approach is needed to delve deeper into the drought tolerance mechanisms of horsegram by unfolding the gene products and their interactions. Systems biology is a high-throughput application of science dedicated to establishing complex protein-protein interactions4,5. Proteins act together with other proteins in complex network systems to perform their varied functions, and they handle some of the most fundamental functions, such as cell to cell interactions, metabolism and development5,6. Thus, proteins and their interactions range from elementary molecular functions to complex biological processes. PPIs hint at the function and character of proteins because their interactions affect functions on a larger scale. These facts make it necessary to understand such complex protein networks. The in vivo and in vitro methods used to unfold these networks, such as yeast two-hybrid methods7, affinity chromatography8 and NMR spectrometry9, are mostly time consuming, costly and tedious.

Hence, in silico approaches are increasingly used to understand complex systems. Complementary to wet lab experiments, computational methods have been developed based on various attributes, such as domain interactions10, gene expression profiles11, gene ontology12 and interologs13. Of these, the interolog approach is the most reliable and hence widely used. It is based on the idea that many PPIs are conserved across organisms and can therefore be derived from the experimentally validated interactomes of other species13. The domain approach, on the other hand, predicts PPIs based on the fact that domains are independent yet conserved key regulators of protein-protein interactions, thus complementing the interolog approach10. Most of the


studies on protein interactions focus on model systems by virtue of their importance. These include A. thaliana14, nematode (C. elegans)15,16, human (H. sapiens)17-19, insect (D. melanogaster)20,21, fungi (Fusarium, M. grisea)22,23 and yeast (S. cerevisiae)24-26. To the best of our knowledge, this is the first computational effort to frame a global interactome map and database (MauPIR) for horsegram. Using stringent selection criteria with both interolog and domain-based approaches, differential quantitative PPIs are presented for root and shoot tissues of a drought tolerant genotype (M-249; genotype 2) and a drought sensitive genotype (M-191; genotype 1) of horsegram. Important protein hubs were identified in the interactome. The biological and functional importance of the identified PPIs was analyzed to streamline the complex responses of horsegram to drought stress. This study meets the strong demand for a basic framework for future understanding of drought stress tolerance mechanisms in horsegram, and the resources generated may be used for studies of other legumes.

EXPERIMENTAL PROCEDURES

Data Collection

Horsegram transcriptome data and the GO and KEGG annotation details were obtained from the previously published research article3. The 29603 assembled transcript sequences of horsegram were used for interactome construction. A total of eight conditions were used for RNA-seq expression calculations3: Genotype 1 shoot control (G1SHC), Genotype 1 shoot stressed (G1SHS), Genotype 1 root control (G1RC), Genotype 1 root stressed (G1RS), Genotype 2 shoot control (G2SHC), Genotype 2 shoot stressed (G2SHS), Genotype 2 root control (G2RC) and Genotype 2 root stressed (G2RS). Experimentally verified, high quality protein-protein interaction datasets for five well characterized reference model organisms commonly used for interactome prediction were downloaded from BIOGRID27, IntAct28, DIP29, MINT30, HPRD19 and TAIR31. The reference model organisms were A. thaliana, S. cerevisiae, C. elegans, D. melanogaster and H. sapiens. Redundant protein-protein interactions from the different PPI data resources were unified. The numbers of PPIs considered for each reference species are given in Table 1. For the prediction of orthologs, 35386 A. thaliana peptide sequences were gathered from Ensembl Plants (http://plants.ensembl.org/index.html), whereas 100778 peptide sequences of H. sapiens, 30362 of D. melanogaster, 30939 of C. elegans and 6692 of S. cerevisiae were retrieved


from Ensembl (www.ensembl.org/index.html)32.

Orthologs search and protein-protein interactions prediction using interolog approach

We used Inparanoid version 4.1 standalone33,34 with default parameters to detect orthologous proteins between horsegram and the five reference species. The 29603 transcripts of M. uniflorum were translated into the corresponding protein sequences using the Transeq tool of the EMBOSS package. This produced many ORFs per transcript; the longest ORF starting at a methionine (Met) residue was selected as the final peptide sequence of that transcript. In the absence of a Met residue at the start position, the longest ORF was chosen as the final peptide sequence. Inparanoid, a BLAST-based method for ortholog identification, was used to capture many-to-many and one-to-many associations. Orthologous groups between two proteomes were established on the basis of pairwise similarity scores. Initially, the two seed orthologs were identified using two-way best hits. Orthologous groups, including inparalogs, were then built by adding further sequences close to the seed orthologs in the two proteomes.

The term interolog was introduced by Walhout35. Conserved interactions between protein pairs that have interacting homologs in other organisms are referred to as interologs. Suppose X and Y are two different proteins of the target organism and X' and Y' are two interacting proteins of a reference organism. X and Y are then predicted to interact if a known interaction between X' and Y' exists in the reference organism, such that X is orthologous to X' and Y is orthologous to Y'. We assigned a prediction score to each PPI based on orthology, defined as

$$S(\text{orthologs}) = \sum_{i=1}^{N} \left\{ \text{Score}(XX') \times \text{Score}(YY') \right\}$$

where N is the total number of interologs of protein pair XY determined in the reference organisms, and Score(XX') and Score(YY') are the normalized Inparanoid scores between proteins X and X' and between proteins Y and Y', respectively, evaluated in the reference organisms.

Domain-based protein-protein interaction prediction

A protein domain is a conserved structural and functional unit of a protein sequence that can fold and perform its function independently of the rest of the structure. Domains are key regulators of protein-protein interactions. The domain-based PPI prediction method uses domain-domain interaction (DDI)


information to infer potential PPIs, on the assumption that if proteins A and B contain an interacting domain pair, the two proteins are expected to interact with each other. The horsegram protein sequences identified previously were subjected to domain assignment by searching them against the Pfam database36 (PfamA) (http://pfam.sanger.ac.uk/) using the HMMER tool (http://hmmer.janelia.org). The e-value threshold and the domain length alignment coverage were set to 1e-2 and >=50%, respectively. DDIs for horsegram were identified according to the DDIs in DOMINE. The DOMINE database37 is a large collection of experimentally validated and predicted DDIs consisting of 5410 domains (26219 DDIs) gathered from 15 different sources, including iPfam, 3did, ME, RCDP, P-value, Interdom, DPEA, PE, GPE, DIPD, RDFF, K-GIDDI, Insite, DomainGA, and DIMA. Interacting Pfam domain pairs in the DOMINE database were used to predict the PPIs for horsegram. If two domains were found to interact with each other in the DOMINE database, the pair was considered a candidate for a possible domain interaction. DDIs were filtered by removing self-interacting and very low confidence DDI pairs. PPIs were then predicted based on two assumptions: (1) the DDIs are independent and (2) two proteins interact with each other only if they have at least one pair of interacting domains.

DOMINE provides a rich set of DDIs for understanding interaction interfaces, although it also yields many false positives and false negatives. To overcome the limitations of one-to-one domain interactions, we filtered the PPIs obtained from the traditional DDI approach using the domain combination (dc) method of Han38, which accounts for the effect of multiple domain interactions. The method treats dc and dc-pairs as the basic units of PPIs. To score each predicted PPI based on dc, we built a model that calculates all dc-pairs for each interacting protein pair of horsegram in the reference organisms, together with their contribution to the interacting protein pairs. First, domains were identified in all five reference organisms using the PfamA database and the thresholds mentioned previously.
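The basic DDI rule described above (two proteins become a candidate pair when they carry at least one interacting domain pair, with self DDIs removed) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the protein identifiers, the Pfam accessions and the `toy_ddis` list are made-up placeholders, not data from DOMINE or from this study, and the domain combination filtering step is not included.

```python
from itertools import combinations

def predict_ppis_from_ddis(protein_domains, ddi_pairs):
    """Return candidate PPIs: protein pairs with >= 1 interacting domain pair.

    protein_domains: dict mapping protein id -> set of Pfam domain ids
    ddi_pairs: iterable of (domain_a, domain_b) tuples, treated as undirected
    """
    # Store DDIs as unordered pairs and drop self-interacting domain pairs.
    ddis = {frozenset(p) for p in ddi_pairs if p[0] != p[1]}

    predicted = {}
    for prot_a, prot_b in combinations(sorted(protein_domains), 2):
        supporting = {
            (da, db)
            for da in protein_domains[prot_a]
            for db in protein_domains[prot_b]
            if frozenset((da, db)) in ddis
        }
        if supporting:  # at least one interacting domain pair was found
            predicted[(prot_a, prot_b)] = supporting
    return predicted

# Hypothetical toy input (all identifiers are placeholders).
toy_domains = {
    "P1": {"PF00001", "PF00002"},
    "P2": {"PF00003"},
    "P3": {"PF00004"},
}
toy_ddis = [("PF00001", "PF00003"), ("PF00002", "PF00002")]

candidates = predict_ppis_from_ddis(toy_domains, toy_ddis)
print(candidates)
```

Keeping the supporting domain pairs for each candidate, rather than a bare yes/no, makes it straightforward to apply a later confidence filter to the same result.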
If proteins X and Y contain m and n domains, respectively, the number of dc for protein X is (2^m - 1) and for protein Y is (2^n - 1). The number of dc-pairs for proteins X and Y is therefore (2^m - 1)(2^n - 1). The score for each dc-pair in the five model organisms was obtained using the equation

$$\text{Score}_{\text{dc-pair}} = \frac{1}{(2^{m} - 1)(2^{n} - 1)}$$

When a dc-pair appeared in more than one protein pair with different scores, the highest score was selected. After evaluating the scores of all dc-pairs, the score of each potential interacting protein pair in horsegram was calculated


using the equation defined as follows

$$\text{Score}_{\text{domain-combinations}} = \sum_{i=1}^{m} \sum_{j=1}^{n} \text{Score}_{\text{dc-pair}} \times \frac{\text{dc-pairs-ref}}{\text{all-dc-pairs}}$$

For any protein pair XY, dc-pairs-ref and all-dc-pairs represent the dc-pairs of protein pair XY found in the reference organisms and all possible dc-pairs of protein pair XY, respectively. Computational prediction of protein-protein interactions generates many false positives and false negatives. Therefore, to overcome the limitations of the in silico approaches, the protein-protein interactions obtained from the interolog-based and domain-based approaches were integrated, and the PPIs common to both were used to generate a core PPI map of high confidence interactions.

Randomization of PPI network

A pragmatic and widely used way of assessing a predicted interactome is to show that it is not a random network. Using probability and graph theory, randomization was performed to build 1000 random network models. The widely used Erdos-Renyi random graph39, with the same number of vertices and a connection probability equal to the edge density of the protein-protein interaction network, was employed. 1000 randomized networks with the same degree distribution as the predicted PPI network were built using GraphCrunch40. GraphCrunch is a software tool for constructing a variety of random network models and for assessing the fit of the models to real world networks. The stubs method was used to generate the random graphs. Based on the degree distribution of the PPI network, a number of stubs was assigned to each node, and edges were drawn randomly between pairs of nodes. The number of stubs of a node was decreased by one after each edge was generated. The process was repeated until random graphs with the same degree distribution as the predicted PPI network were built41.

In silico validation of predicted PPIs

No horsegram genome sequence or other experimentally determined or in silico predicted interaction database is available to date, which made it difficult to validate the predicted horsegram PPIs.
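The stubs procedure described above for the degree-preserving randomized networks can be illustrated with a small Python sketch. It is not GraphCrunch's implementation: the function name, the rejection of self-loops and duplicate edges, the uniform choice among nodes that still have free stubs, and the restart-on-dead-end strategy are all assumptions made here for illustration.

```python
import random

def stub_random_graph(degrees, seed=None, max_restarts=1000):
    """Build a random simple graph in which every node keeps its given degree."""
    rng = random.Random(seed)
    if sum(degrees.values()) % 2:
        raise ValueError("the sum of degrees must be even")
    for _ in range(max_restarts):
        stubs = dict(degrees)      # remaining stubs per node
        edges = set()
        while True:
            open_nodes = [n for n, s in stubs.items() if s > 0]
            if not open_nodes:
                return edges       # every stub has been used
            u = rng.choice(open_nodes)
            partners = [v for v in open_nodes
                        if v != u and frozenset((u, v)) not in edges]
            if not partners:
                break              # dead end: discard and restart
            v = rng.choice(partners)
            edges.add(frozenset((u, v)))
            stubs[u] -= 1          # one stub consumed at each endpoint
            stubs[v] -= 1
    raise RuntimeError("no valid pairing found within max_restarts")

# Hypothetical degree sequence (node names are placeholders).
toy_degrees = {"a": 2, "b": 2, "c": 1, "d": 1}
random_edges = stub_random_graph(toy_degrees, seed=1)
```

Calling the function repeatedly with different seeds gives an ensemble of random graphs against which clustering coefficient or path length of the real network can be compared.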
We used three computational methods, based on Gene Ontology (GO), correlation coefficient and network topology, to assess the authenticity of the predicted protein interactome. GO is a convenient and widely favoured classification for describing the functional aspects of gene products. Proteins that form a complex are located in the same cellular component.


When proteins lie in the same functional module, they are expected to be involved in similar biological roles and to perform similar molecular functions. The numbers of protein-protein interactions sharing at least one GO term at depths 3 to 8 were compared between the predicted and randomized networks. GO terms were retrieved from the horsegram GO annotations of the recently published transcriptomic data3. Proteins occurring in a complex generally share some functions in their gene ontology annotations. Consequently, the predicted interactome was expected to contain a larger number of protein-protein interactions sharing at least one GO term than the randomized networks.

The gene expression profiles of interacting proteins should generally be correlated42-44. The level of co-expression of an interacting protein pair was measured using the Pearson Correlation Coefficient (PCC). The PCC value for each pair of non-self-interacting proteins was calculated from the Reads Per Kilobase of transcript per Million mapped reads (RPKM) values of mRNA expression under the eight conditions. A positive PCC between two different genes indicated that the genes were co-expressed. To estimate the reliability of the predicted network, the average PCC of the predicted network was compared with that of the 1000 randomized networks. We also compared the number of protein interactions with PCC values between 0.5 and 1.0 in the predicted network with that in the randomized networks.

Generation and topological validation of protein-protein interactome

Protein-protein interactions that passed the above filters and were common to both the ortholog-based and domain-based prediction methods were used for interactome construction.
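The co-expression measure described above can be sketched as a minimal pure-Python Pearson correlation over the expression values of two interacting transcripts across the eight conditions. The RPKM numbers below are invented for illustration and are not taken from the horsegram data; the function and variable names are likewise hypothetical.

```python
import math

# The eight conditions named in the Data Collection section.
CONDITIONS = ["G1SHC", "G1SHS", "G1RC", "G1RS",
              "G2SHC", "G2SHS", "G2RC", "G2RS"]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a constant expression profile
    return cov / math.sqrt(var_x * var_y)

# Hypothetical RPKM values, one per condition in CONDITIONS order.
rpkm_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
rpkm_b = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]  # exactly 2 * rpkm_a
rpkm_c = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]      # rpkm_a reversed

pcc_ab = pearson(rpkm_a, rpkm_b)  # perfectly co-expressed pair
pcc_ac = pearson(rpkm_a, rpkm_c)  # perfectly anti-correlated pair
```

Averaging such values over all interacting pairs of a network gives the single number that is compared between the predicted and randomized networks.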
The interactome was built and its topological features, such as Degree, Clustering Coefficient (CC), Betweenness Centrality (BC) and Shortest Path Length (SPL), were determined using the popular network construction tool Cytoscape45. Protein-protein interaction networks of different organisms share topological features that distinguish them from random networks. It is therefore essential to analyze the topological properties of the predicted PPI network and compare them with those of the 1000 randomized networks. PPI networks of various organisms show scale-free topology. These networks follow a power-law (heavy-tailed) degree distribution, with a large number of nodes having few connections and a small number of hub nodes having many connections46. CC in PPI networks differs significantly from that in randomized


networks. The CC of a node in the PPI network describes the extent to which the neighbours of the node are connected to each other. A CC of 1 means that all neighbours of the node are fully connected to each other, whereas a value of 0 means that none of the neighbours are connected to each other. The CC of the network is defined as the average CC of all nodes in the network. Like other real world networks, PPI networks have a high CC compared with random networks, denoting a high probability of interaction within a neighbourhood47. These two features of real world networks, a power-law degree distribution and a higher CC, were used to distinguish a real world network from random graphs47. The average diameter of the predicted network and of the 1000 random networks was also calculated, defined as the mean shortest path length across all pairs of vertices in the network. The path length between a pair of nodes is defined as the shortest route between them.

Subnetworks of differentially expressed genes (DEGs) in drought stress

DEGs with a two-fold or greater change were identified in G1SHS, G1RS, G2SHS and G2RS. The DEGs were mapped onto the protein-protein interactome, and the first degree neighbours of these genes were captured to build PPI subnetworks. Some nodes in a PPI network occur in highly interconnected clusters compared with the rest of the network. A group of such highly connected nodes is referred to as a clique. Such cliques are very meaningful clusters of a network. Therefore, the CFinder program48 was employed to explore the cliques of the DEG subnetworks. The program first identifies all densely interconnected subgraphs (cliques) in the network using the Clique Percolation algorithm. It then determines communities within the network from overlapping cliques. The CFinder algorithm captures cliques at varying k-values; a higher k-value corresponds to more highly interconnected nodes within the clique. In the present study, we extracted clusters at a k-value threshold of 4.
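The clique percolation idea behind CFinder can be illustrated with a small pure-Python sketch for k = 4: all k-cliques are enumerated, two k-cliques are treated as adjacent when they share k - 1 nodes, and each community is the union of a connected chain of adjacent cliques. This brute-force version is meant only for toy graphs and is not CFinder's implementation; the function name and the toy edge list are assumptions.

```python
from itertools import combinations

def k_clique_communities(edges, k=4):
    """Communities formed by k-cliques that are linked by sharing k-1 nodes."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Brute-force enumeration of all k-cliques (suitable for small graphs only).
    cliques = [
        frozenset(c) for c in combinations(sorted(adj), k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    # Union-find over cliques; merge cliques that overlap in k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    communities = {}
    for i, clique in enumerate(cliques):
        communities.setdefault(find(i), set()).update(clique)
    return sorted(communities.values(), key=sorted)

# Hypothetical toy network: two overlapping 4-cliques (A-D and B-E),
# plus a separate 4-clique (F-I) attached by the single edge E-F.
toy_edges = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D"),
    ("B", "E"), ("C", "E"), ("D", "E"),
    ("F", "G"), ("F", "H"), ("F", "I"), ("G", "H"), ("G", "I"), ("H", "I"),
    ("E", "F"),
]
communities = k_clique_communities(toy_edges, k=4)
print(communities)
```

In the toy network the single bridging edge E-F does not merge the two groups, because a community can only grow through cliques that share three of their four nodes.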
For each identified subnetwork, GO enrichment was performed using the widely used BINGO49 tool available among the Cytoscape applications. The gene ontology categories Biological Process and Molecular Function were chosen for enrichment analysis. BINGO uses the Fisher Exact test to calculate p-values of significance, followed by False Discovery Rate (FDR) correction to obtain adjusted p-values. The most significantly over-represented GO terms, as determined by p-value, were assigned to every cluster within the DEG subnetworks.

Metabolic Pathways analysis of enriched proteins

Enriched proteins identified through GO analysis were further subjected to PMN pathway analysis. SoyCyc present


in the PMN database (www.plantcyc.org) was used to elaborate the metabolic pathways influenced by differentially expressed genes in horsegram, owing to the high similarity of horsegram to soybean. An edge was drawn between two pathways if they shared at least one protein in any of the differentially expressed conditions of the two genotypes.

RESULTS AND DISCUSSION

Construction of de novo global PPI map of horsegram

A bidirectional workflow was used to obtain a stringent de novo global PPI map for horsegram (Figure 1). Since large-scale PPI data always leave scope for false positives and false negatives, two strategies were employed to complement each other and increase the confidence level of the predicted data. First, we used the interolog-based approach, in which the stringent InParanoid algorithm was used to distinguish true orthologs50. Second, we used the domain-based approach, which relies on the regulation of protein-protein interactions by domains, the independent and conserved units of a protein sequence10.

Contribution and Quantitation of drought stress responsive PPIs

The combined PPI data from A. thaliana, S. cerevisiae, C. elegans, D. melanogaster and H. sapiens were used to identify true orthologs in horsegram. In horsegram, 33.86% (10201) of the interactions could be directly correlated with the PPI data of A. thaliana, followed by 24.38% (65486) from S. cerevisiae, 14.49% (27551) from H. sapiens, 12.90% (5308) from D. melanogaster and the least, 5.28% (960), from C. elegans (Figure 2A). Out of the total 129779 PPIs, 9613 interologs were unique from A. thaliana, followed by 63328 from S. cerevisiae, 25165 from H. sapiens, 4377 from D. melanogaster and 722 from C. elegans. Drought is a complex quantitative trait51. To weigh the number of differential as well as unique PPIs in the test samples, sub-networks of protein interactions were derived from the global interactome of horsegram.
The test samples allowed a comparative account of shoot and root tissues and of the drought tolerance of the two horsegram genotypes (Figure 2B). Only 9 PPIs were commonly up-regulated in shoot tissue of both genotypes under drought stress, whereas in root tissue 224 PPIs were commonly up-regulated in both genotypes. In shoot tissue, 204 PPIs were uniquely up-regulated in G2 compared with 10 in G1. In root tissue, 1851 PPIs were unique to G2 as opposed to only 229 in G1. The results suggest that more protein interactions were responsive to drought stress in root tissue than in shoot tissue of horsegram. More


PPIs were up-regulated in G2 than in G1. Hence, PPIs and the associated proteins in root tissue of G2 appear to be the backbone of high drought tolerance in horsegram. Recently published transcriptomic data also suggest that root tissue was more responsive3. The predicted interactome covered about one fourth of the horsegram proteome through each of the two approaches separately, which supports the selection criteria used and the results obtained. In agreement with our data, although the genome of the rice blast fungus M. grisea was sequenced back in 2005, only one fourth of its proteome was covered during construction of its first PPI map23. Similarly, despite the availability of the whole genome sequence of the fungus F. graminearum in 2007, only about half of the proteome was covered during development of its first PPI map22. The results of this study should be viewed in light of the following facts: 1) this is the first effort to construct a PPI map for a legume, and for horsegram in particular; 2) the horsegram genome sequence is not yet available; 3) the results were derived only from the available transcriptomic data; and 4) the proteomic information available for horsegram is limited.

Topology of PPI network

Combining the two strategies yielded 6804 interactions among 1812 horsegram proteins (Figure 3). In the study that developed the first PPI map for M. grisea, a fungal causal agent of rice blast disease, the interolog approach predicted 11674 PPIs among 3017 proteins23. Another first PPI map was generated for the fungal pathogen F. graminearum, the causal agent of several destructive diseases of wheat, barley and maize; using interolog and domain-based approaches, a high confidence core PPI set of 27102 interactions among 3745 proteins was constructed22. The Arabidopsis interactome has also been developed using the interolog approach14. The rice interactome was developed by the interolog method, yielding 37112 interactions among 4567 proteins52.
The numbers of interactions and proteins obtained in this study are comparable with those of the other studies discussed here. The topological properties indicated that the predicted interactome was a scale-free network, i.e. most nodes had low degrees of connection, whereas a few hub nodes had very high degrees of connection. In total, 4 proteins superscripted () had more than 100 connections. These are included in the top ten hub nodes, namely: heat shock protein (Hsp) (Scaffold11194_65), Hsp SSB1 (C92475_65), polyubiquitin (Scaffold7450_67), ubiquitin C variant (C89789_65), CDC 2 protein kinase (C90647_65), histone acetyl transferase gcn5 (C100797_65), cell division cycle protein (C101937_65), polyadenylate binding protein (Scaffold4415_65), nucleic acid binding protein (Piso0-001939) and an uncharacterized protein (C91771_65). These data indicated that drought


stress has primarily affected ubiquitin-related proteins and Hsps in the horsegram interactome. Similar proteins have been found to be enriched under drought stress in the woody plant Populus53. The uncharacterized protein, although lower in ranking, is novel, and its inclusion in the top ten hubs suggests its importance in the response to drought stress in horsegram. GO analysis suggested that it could be a non-specific serine/threonine kinase involved in photoperiodism or flowering under drought stress. The top BC nodes were a protein involved in glycolysis (C83733_65), an uncharacterized protein (C83071_65), a basal transcription factor (C89727_65), Cu/Zn-superoxide dismutase (C87917_65), a hypothetical protein involved in plant hormone signal transduction (C97349_65), a non-specific serine/threonine protein kinase (Scaffold4291_65), a ubiquitin-conjugating enzyme (C101617_65), an unknown protein (C85875_65), protein geranylgeranyl transferase type I (C88455_65) and peptide-N(4)-(N-acetyl-beta-glucosaminyl) asparagine amidase (C68387_65).

Validation of network

In the absence of any direct method to evaluate the quality of the predicted interactome, three indirect computational approaches, as described in the Methods section, were employed, based on functional similarity, expression correlation measures and graph-theoretical properties. The results indicate good reliability of the predicted network of interactions at all depths, implying a high number of functionally similar interactions (Figure 4A). The expression values of interacting proteins were found to be highly correlated. PCC was measured between all pairs of proteins in the interactome using an in-house script. In the predicted network, the average PCC between all interacting pairs of proteins was 0.586042, while for 1000 randomized networks the mean value was 0.5145 with a standard deviation of 0.00403133. Thus, the predicted network has a significantly higher average PCC than random networks.
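The comparison of the observed mean PCC with the randomized networks can be illustrated with a minimal sketch. The authors' in-house script is not shown in the text, so everything here is an assumption: the function names are hypothetical, the randomization scheme (random protein pairs over the same node set with the same number of edges) is only one plausible choice, and the z-score is used purely as an illustration of how far the observed mean lies from the random mean.

```python
import math
import random
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / (sx * sy)


def mean_edge_pcc(edges, expr):
    """Average PCC over all interacting pairs; expr maps protein id -> values."""
    return mean(pearson(expr[u], expr[v]) for u, v in edges)


def random_network(nodes, n_edges, rng):
    """Assumed null model: n_edges distinct random pairs over the same nodes."""
    edges = set()
    while len(edges) < n_edges:
        u, v = rng.sample(nodes, 2)
        edges.add((min(u, v), max(u, v)))
    return sorted(edges)


def z_score(observed, null_values):
    """Distance of the observed value from the null mean, in null std devs."""
    return (observed - mean(null_values)) / pstdev(null_values)


# Worked value using only the summary statistics quoted in the text.
reported_z = (0.586042 - 0.5145) / 0.00403133

if __name__ == "__main__":
    print(f"z from the reported values: {reported_z:.2f}")
```

With the reported mean and standard deviation, the observed average PCC lies roughly 17.7 standard deviations above the mean of the randomized networks, which is consistent with the statement that the difference is significant.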
We also plotted the number of PPIs with PCC varying from 0.5 to 1.0 at an interval of 0.5. The predicted network contained more interactions at these PCC values than the 1000 randomized networks (Figure 4B). The degree distribution of the PPI network follows a power law, as shown in the log-log and cumulative log-log plots in Figure 4C and D. The exponent of the fitted power-law distribution was calculated as 4.475726. The null hypothesis was that the degree distribution of the predicted network is drawn from the fitted power-law distribution. The Kolmogorov-Smirnov test was applied and the p-value was calculated as 0.999996 (>0.05); thus the null hypothesis could not be rejected. The CC and average diameter of the predicted network were calculated as 0.0831519 and
4.014226 which is higher than that of 1,000 random networks represented in Table 2. CC, diameter and PCC of
degree distribution of the predicted network in comparison with 1,000 random networks are presented in Supplementary Table S6. The predicted network was found to be significantly better than the randomized networks and to reproduce the graph-theoretical features of PPI networks, as do many real-world network models. Hence, we confidently state that the quality of our predicted interactome is acceptable.

Identification and characterization of drought stress responsive PPIs

To identify the important proteins responsive to drought, together with their functions and interactions, GO analysis of the global interactome map of horsegram was conducted. A broad repertoire of protective proteins was detected. However, in each category only the two biological processes or molecular functions with the most significant p-values are discussed (Supplementary Table S1: clustering information of PPIs for GO analysis; Supplementary Tables S2-S5: full details of GO analysis results). The enriched proteins were also subjected to PMN pathway analysis to identify the interdependency of important pathways among the enriched proteins (Figure 7).

Biological processes

Signalosome assembly (p-value: 3.60E-006) and regulation of cell morphogenesis (p-value: 2.44E-005) were the most enriched biological processes in G1SHS (Figure 5A). The signalosome is a highly conserved protein complex54 which impinges upon various developmental processes such as cell division, DNA repair and plant responses to external stimuli55,56. It also has a scaffolding role with the ubiquitin-proteasome system57. In the perennial woody plant Populus euphratica, signalosome assembly has been observed to be the most enriched biological process. Two genes encoding subunits of the COP9 signalosome were observed to be highly up-regulated under drought stress. The authors suggested that signal transduction, the ubiquitin-proteasome system and associated kinases may maintain the woody plant under water stress conditions52.
Another study on Arabidopsis also corroborates that the signalosome is associated with different cellular and developmental processes necessary for plant homeostasis under drought stress conditions. Regulation of some of these processes is mediated through crosstalk between the signalosome and the signaling cascades of hormones including auxin, ABA, ethylene and salicylic acid58,59. Cellular processes are the basis of normal growth and development in plants. Hence, proteins associated with cellular morphogenesis play a crucial role, particularly under drought stress conditions when plant demands are high
for survival60. In rice, various stress-responsive genes and their products are involved in cell growth and development. It was proposed that cellular morphogenesis could be a regulatory mechanism of stress responses in rice61. In potato (Solanum tuberosum L.), MYB-related transcription factors (TFs) are believed to be involved in the control of cellular morphogenesis under drought stress62. However, MYB-TFs in G1SHS were not identified as a separate category in the enrichment analysis. The bHLH-type TF OsPIL1 modulates the expression of cell elongation-related genes in rice. Under drought stress conditions OsPIL1 expression is reduced, leading to a reduction in shoot growth and surface area. Hence, it has been proposed that the energy conserved during photosynthesis could be used for activation of mechanisms involved in stress tolerance60. In tolerant genotypes of tomato, genes responsive to drought stress were observed to be involved in cell growth, differentiation and morphogenesis. This indicated that adaptation of drought-tolerant lines to water deficit may occur partially through organ morphogenesis63.

In G1RS, nucleocytoplasmic transport (p-value: 1.64E-005) and nuclear transport (p-value: 1.77E-005) were observed as the most significant processes (Figure 5B). MYB-TFs in G1RS were also observed in the enrichment analysis (Table S4). Nucleocytoplasmic transport involves nuclear pore complex (NPC)-controlled transport of material from the nucleoplasm across the nuclear membrane to the cytoplasm64. It is considered to be functionally conserved in plants and encompasses functions ranging from crucial basic cellular functions of growth and development65 to complex processes like signaling cascades, cellular differentiation and response to the external environment64. Nucleoporins are fundamental to cellular functions and hence are thought to be essential for plant viability, particularly under stress conditions.
SAD2 [super sensitive to abscisic acid (ABA) and drought 2], an ortholog of vertebrate importin 7 and 8, was suggested to regulate various hormone and environmental response pathways in A. thaliana. The sad2 mutant showed hypersensitivity to ABA and stress66. In Arabidopsis, the kpnb1 mutation has been reported to modulate the ABA signalling pathway through nucleocytoplasmic transport. The kpnb1 mutation increases stomatal closure in response to ABA, reduces the rate of water loss, and substantially increases drought tolerance. Studies involving this mutation thus suggested a role for KPNB1 during abiotic stress67.

Regulation of protein modification (p-value: 6.05E-006) and protein ubiquitination (p-value: 1.19E-005) were the two most significant processes observed for G2SHS (Figure 5C). Proteins are pivotal in the regulation of adaptive mechanisms of plants to stresses. Hence, their post-translational modifications (PTMs) contribute
substantially to plant stress responses68. Ubiquitination and SUMO (Small Ubiquitin-like MOdifier) activity, which modify protein localization and activity, are especially important in this regard69. Ubiquitination is the mechanism through which the small ubiquitin molecule is conjugated to a protein substrate, commonly marking it for degradation by the 26S proteasome. An increasing number of studies on stress-related mutants in plants indicate a strong link between ABA signaling, ubiquitination and drought tolerance70,71. In Arabidopsis it has been observed that cer9 (a mutant in the E3 ubiquitin ligase) tolerated drought stress through ABA regulation and increased accumulation of cutin72. Sumoylation is a reversible post-translational modification of protein substrates based on covalent conjugation of the SUMO peptide. It can induce changes in conformation or enzymatic activity. Sumoylation often occurs on lysine residues. Phosphorylation may regulate the sumoylation of a substrate73. In Arabidopsis, SIZ1 is responsible for SUMO conjugation in response to stress. Analysis of the siz1 mutant showed improved drought tolerance, indicating the importance of sumoylation74.

Localization (p-value: 1.66E-006) and cellular macromolecular localization (p-value: 5.11E-006) were the most important processes in G2RS (Figure 5D, E). The spatio-temporal localization of different molecules has been implicated in the response to abiotic stress in plants. These molecules range from osmolytes, kinases and hormones to TFs and Hsps75-77. Compatible solutes like mannitol, proline and glycinebetaine do not interfere with the cellular machinery but provide protection either via osmoregulation or via a cellular compatibility mechanism during stress75. Studies on transgenic plants have shown that differential stress tolerance is exhibited depending on the localization of these compounds during drought stress.
In transgenic rice, although glycinebetaine accumulation was 5 times higher in the cytosol than in the chloroplast, glycinebetaine accumulation in the chloroplast was shown to provide more protection of the photosynthetic machinery against salt and cold stress78. Plant small Hsps localized in mitochondria and chloroplasts protect the electron transport chain and exhibit translational control during heat stress79,80. Spatial localization of a gene (OsPIN3t) involved in auxin transport was suggested to be important in the regulation of the drought stress response in rice81. The plasma membrane-localized histidine kinase receptors in Arabidopsis (AHKs) have been shown to be regulators of the osmotic stress response82. Gluconeogenesis (p-value: 7.51E-005) was also observed among the highly enriched processes under drought stress in the root tissue of the tolerant genotype, which coincides with the result observed from transcriptome analysis of horsegram. Hence, it could be concluded that, besides the other processes, gluconeogenesis may be an inherent stress adaptive
mechanism employed by horsegram to counteract drought stress3.

Molecular Functions

Protein kinase and phosphotransferase activities were observed as the two most significant molecular functions common to all four types of samples (Figure 6A-F). Drought perception by plants involves an intricate interplay of complex signaling networks involving mainly protein receptors, kinases and TFs. Their effects are transmitted downstream to drive the expression of drought-responsive genes1. In particular, signal transduction factors having kinase83 and transferase activity84 are at the heart of plant responses to drought stress. Different kinase families have been implicated in drought stress responses. In addition to MAPKs, the SnRK2 (SNF1-related protein kinase 2) family has been implicated in the drought stress response in soybean85. A study on cotton proposed that over-expression of GbRLK (receptor-like kinase) may improve stress tolerance by regulating stress-responsive genes to reduce water loss86. Up-regulation of an adenylate kinase participating in ATP biosynthesis in the tolerant genotype of tomato was shown to provide more ATP for maintaining cellular activities under drought stress63. Transferase activities involving the transfer of phosphate and palmitoyl groups were observed to be central to the drought response in horsegram. Phosphorylation and de-phosphorylation of proteins are key activities in signal transduction pathways87. Rice NAC4 TF essentially undergoes phosphorylation during its localization to the nucleus. Kinases regulate NAC TFs under stress conditions through phosphorylation, modulating their sub-cellular localization, DNA binding activity and interactions with other proteins88. Palmitoylation is a post-translational modification in which a saturated lipid group is reversibly added to the sulfhydryl group of a cysteine residue in a protein.
It is also known as protein S-acylation and is catalyzed by protein S-acyl transferases (PATs). Palmitoylation regulates protein stability, activity and sub-cellular localization89,90. In a study on Arabidopsis, the ltp3 (lipid transfer protein) mutant was sensitive to drought stress, whereas plants over-expressing the ltp3 gene were drought tolerant90. In another study on Arabidopsis, loss-of-function mutants for PAT10 were observed to be hypersensitive to salt stress89. DNA-dependent ATPase activity (p-value: 6.68E-005) was observed to be highly enriched in G2RS in addition to the other two activities (Figure 6D). Helicases exhibit DNA-dependent ATPase activity and are involved in many cellular processes such as RNA synthesis and translation. It has been reported that helicases are induced under different types of stress and could be central to stress signaling pathways91,92.
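The enrichment p-values quoted in the Biological processes and Molecular Functions subsections are of the kind produced by an over-representation test. The text here does not say which test or software was used, so the sketch below simply assumes a one-sided hypergeometric test, a common choice for GO enrichment. The function name and the numbers in the usage example are hypothetical and are not taken from the study.

```python
from math import comb  # available in Python 3.8 and later


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    population: number of proteins in the background set
    annotated:  background proteins carrying the GO term
    sample:     proteins in the set under study (e.g. an up-regulated cluster)
    hits:       proteins in the study set carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probabilities of seeing `hits` or more annotated proteins
    # when `sample` proteins are drawn without replacement.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical toy numbers, chosen only to show the call.
    p = hypergeom_enrichment_p(population=10, annotated=3, sample=4, hits=2)
    print(f"p = {p:.4f}")
```

In practice many GO terms are tested at once, so raw p-values of this kind are usually followed by a multiple-testing correction, which this sketch does not cover.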
Hence, the above discussion allows us to conclude that pathways involving the biosynthesis and/or degradation of the four biomolecules of life (carbohydrates, proteins, lipids and nucleic acids) are central to the drought stress response (Figure 7). However, these contribute independently of each other to the drought tolerance of horsegram. Fatty acid metabolism was the only defence pathway common to the two types of tissues and the two differentially drought-responsive genotypes. This shows that the membrane, being the first line of defence, contributes critically to the maintenance of drought stress tolerance irrespective of genotype in horsegram. We observed that pathways involving the degradation of reactive oxygen species (ROS) and the biosynthesis of choline, phosphatides and rosmarinic acid were unique to G2RS, suggesting the strength of drought tolerance in G2 of horsegram87,93.

MauPIR Portal

The 6804 predicted high-confidence protein interactions for M. uniflorum can be searched through an interactive web portal, MauPIR, available at http://14.139.59.221/MauPIR/. The database workflow and the web interface are shown in Figures 8 and 9, respectively. The database was developed using HTML5, PHP scripts, jQuery libraries and a MySQL database, with interaction visualization based on the Linkurious SigmaJS network library (https://github.com/Linkurious/linkurious.js/wiki). Users can submit a query in two ways: by selecting the protein id from the provided list, or by entering the id in the text box and submitting it. Each protein entry in the database is unique and carries domain information with PFAM id, number of interactors, description of ortholog proteins, KEGG and EC annotation, and the sequence and length of the protein.
Once the results are displayed, the user can view all related information dynamically through modal popups, which appear on clicking options such as interolog search, UniProt ortholog search, interaction-specific network properties in tabular format, and visual representation as an undirected graph. At every node we have provided the BLAST functional description of the parent and child protein interactors in the network graphs. The number of connections for each node can also be viewed as a tooltip on right click. A special feature of the network visualization is that the user can click on further interacting nodes to run an independent interaction search for them. The network visualization page also provides an option to export the graph as an image. The interaction table contains information on shortest path length, betweenness centrality, closeness centrality, clustering coefficient, degree and eccentricity. The last option gives the results of gene enrichment for different ontology terms under Biological Process, Cellular Component and Molecular Function.
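Most of the node-level properties listed for the interaction table can be computed from an undirected edge list with a breadth-first search. The sketch below is illustrative only: it is not the portal's code, the function names are hypothetical, and it uses common textbook definitions (closeness as the reciprocal of the average shortest path length to reachable nodes) that may differ from those used by MauPIR. Betweenness centrality is left out to keep the example short.

```python
from collections import deque


def build_graph(edges):
    """Undirected adjacency sets from (protein_a, protein_b) pairs."""
    graph = {}
    for u, v in edges:
        if u == v:
            continue  # self-interactions are ignored in this sketch
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def bfs_distances(graph, source):
    """Shortest path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def clustering_coefficient(graph, node):
    """Fraction of pairs of neighbours that are themselves connected."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    # Each neighbour-neighbour edge is seen from both ends, hence the // 2.
    links = sum(1 for a in neighbours for b in graph[a] if b in neighbours) // 2
    return 2 * links / (k * (k - 1))


def node_properties(graph, node):
    """Degree, clustering coefficient, closeness, eccentricity, mean path length."""
    dist = bfs_distances(graph, node)
    others = [d for n, d in dist.items() if n != node]
    return {
        "degree": len(graph[node]),
        "clustering_coefficient": clustering_coefficient(graph, node),
        "closeness": len(others) / sum(others) if others else 0.0,
        "eccentricity": max(others) if others else 0,
        "average_shortest_path_length": sum(others) / len(others) if others else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical four-protein network, not data from the study.
    toy = build_graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
    for protein in sorted(toy):
        print(protein, node_properties(toy, protein))
```

In the toy network, protein C has three interactors of which only one pair (A and B) interacts, so its clustering coefficient is 1/3, while the peripheral protein D has degree 1 and eccentricity 2.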
CONCLUSIONS

It is known that variation in stress tolerance exists naturally among genotypes. Comparing a drought-sensitive and a drought-tolerant genotype gave us an overview of the similar or differential strategies that may be employed by the same plant under the same condition, i.e. drought stress. Varying the tissues in the study allowed us to determine which part of the plant is more affected and the associated mechanism in response to drought stress. PPIs functional in roots are quantitatively more responsive to drought stress than those in shoot tissue. PPIs in the drought-tolerant genotype of horsegram could constitute the backbone for high drought tolerance in horsegram. Processes such as signal transduction (primarily involving kinase and transferase activities), cellular processes, nuclear transport, ubiquitination of proteins and localization of molecules could involve stomatal conductance, photosynthesis, cellular metabolism and hormone signaling pathways as prospective mechanisms for drought stress tolerance. These results indicated that horsegram could adapt to drought stress conditions by decreasing energy dissipation, increasing ATP energy, and reducing oxidative damage, largely through protein metabolism, translational regulation and enzyme activity.

COMPETING INTERESTS

The authors declare that they have no competing interests.

AUTHORS’ CONTRIBUTIONS

JB and SKY developed the concept, provided the raw data and mentored the study. IG performed the computational part of this study. JB and IG interpreted and analyzed the results. GP constructed the database (MauPIR). RS supervised the computational analysis performed by IG. JB and IG wrote the manuscript. RS and SKY edited and approved the manuscript. All authors have read and approved the manuscript.

ACKNOWLEDGEMENTS

We are thankful to the Director, CSIR-IHBT, for providing all the facilities and guidance needed for completion of this study.
We acknowledge the financial support provided under CSIR-YSA research grant to SKY and BSC-0121 research project to RS. JB is thankful to DST, GOI for providing WOS-A fellowship. IG is thankful to CSIR for Senior Research Fellowship and GP to DST for Junior Research Fellowship. The overall financial assistance and infrastructure provided by Council of Scientific and Industrial Research (CSIR) is duly acknowledged. This
manuscript represents IHBT publication number 3931.

ASSOCIATED CONTENT

Supplemental Material

Supplementary Table S1: Communities identified at clique size 4 in subnetworks of Differentially Expressed Genes (DEGs).
Supplementary Table S2: Gene Ontology enrichment of G1SHS upregulated protein-protein interactions in drought stress.
Supplementary Table S3: Gene Ontology enrichment of G2SHS upregulated protein-protein interactions in drought stress.
Supplementary Table S4: Gene Ontology enrichment of G1RS upregulated protein-protein interactions in drought stress.
Supplementary Table S5: Gene Ontology enrichment of G2RS upregulated protein-protein interactions in drought stress.
Supplementary Table S6: Clustering coefficient, diameter and PCC of degree distribution of predicted network in comparison to 1,000 random networks.

REFERENCES

1. Bhardwaj, J.; Yadav, S. K. Genetic Mechanisms of Drought Stress Tolerance, Implications of Transgenic Crops for Agriculture. In Agroecology and Strategies for Climate Change of Sustainable Agriculture Reviews, Volume 8; Lichtfouse, E., Ed.; Springer: Netherlands, 2012; pp 213–235.
2. Bhardwaj, J.; Yadav, S. K. Drought Stress Tolerant Horse Gram for Sustainable Agriculture. Sustainable Agriculture Reviews, Volume 15; Lichtfouse, E., Ed.; Springer: Netherlands, 2015; pp 293–328.
3. Bhardwaj, J.; Chauhan, R.; Swarnkar, M. K.; Chahota, R. K.; Shankar, R.; Yadav, S. K. Comprehensive Transcriptomic Study on horse gram (Macrotyloma uniflorum): De novo Assembly, Functional Characterization and Comparative Analysis in Relation to Drought Stress. BMC Genomics 2013, 14, 647.
4. Ding, Y. D.; Chang, J. W.; Guo, J.; Chen, D.; Li, S.; Xu, Q.; Deng, X. X.; Cheng, Y. J.; Chen, L. L. Prediction and functional analysis of the sweet orange protein-protein interaction network. BMC Plant Biol. 2014, 14, 213.
5. Miteva, Y. V.; Budayeva, H. G.; Cristea, I. M. Proteomics-Based Methods for Discovery, Quantification, and Validation of Protein–Protein Interactions. Anal. Chem. 2013, 85, 749–768.
6. De Las Rivas, J.; Fontanillo, C. Protein–Protein Interactions Essentials: Key Concepts to Building and Analyzing Interactome Networks. PLoS Comput. Biol. 2010, 6, e1000807.
7. Gietz, R. D.; Woods, R. A. Transformation of yeast by lithium acetate/single-stranded carrier DNA/polyethylene glycol method. Methods Enzymol. 2002, 350, 87–96.
8. Rohila, J. S.; Chen, M.; Cerny, R.; Fromm, M. E. Improved tandem affinity purification tag and methods for isolation of protein heterocomplexes from plants. Plant J. 2004, 38, 172–181.
9. O’Connell, M. R.; Gamsjaeger, R.; Mackay, J. P. The structural analysis of protein-protein interactions by NMR spectroscopy. Proteomics 2009, 9, 5224–5232.
10. Ng, S. K.; Zhang, Z.; Tan, S. H. Integrative approach for computationally inferring protein domain interactions. Bioinformatics 2003, 19, 923–929.
11. Ideker, T.; Ozier, O.; Schwikowski, B.; Siegel, A. F. Discovering regulatory and signalling circuits in
molecular interaction networks. Bioinformatics 2002, 18, S233–S240.
12. Wu, X.; Zhu, L.; Guo, J.; Zhang, D. Y.; Lin, K. Prediction of yeast protein–protein interaction network: insights from the Gene Ontology and annotations. Nucleic Acids Res. 2006, 34, 2137–2150.
13. Matthews, L. R.; Vaglio, P.; Reboul, J.; Ge, H.; Davis, B. P.; Garrels, J.; Vincent, S.; Vidal, M. Identification of potential interaction networks using sequence-based searches for conserved protein-protein interactions or “interologs.” Genome Res. 2001, 11, 2120–2126.
14. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science 2011, 333, 601–607.
15. Simonis, N.; Rual, J. F.; Carvunis, A. R.; Tasan, M.; Lemmens, I.; Hirozane-Kishikawa, T.; Hao, T.; Sahalie, J. M.; Venkatesan, K.; Gebreab, F.; Cevik, S.; Klitgord, N.; Fan, C.; Braun, P.; Li, N.; Ayivi-Guedehoussou, N.; Dann, E.; Bertin, N.; Szeto, D.; Dricot, A.; Yildirim, M. A.; Lin, C.; de Smet, A. S.; Kao, H. L.; Simon, C.; Smolyar, A.; Ahn, J. S.; Tewari, M.; Boxem, M.; Milstein, S.; Yu, H.; Dreze, M.; Vandenhaute, J.; Gunsalus, K.; Cusick, M. E.; Hill, D. E.; Tavernier, J.; Roth, F. P.; Vidal, M. Empirically controlled mapping of the Caenorhabditis elegans protein-protein interactome network. Nat. Methods 2009, 6, 47–54.
16. Li, S.; Armstrong, C. M.; Bertin, N.; Ge, H.; Milstein, S.; Boxem, M.; Vidalain, P. O.; Han, J. D. J.; Chesneau, A.; Hao, T.; Goldberg, D. S.; Li, N.; Martinez, M.; Rual, J. F.; Lamesch, P.; Xu, L.; Tewari, M.; Wong, S. L.; Zhang, L. V.; Berriz, G. F.; Jacotot, L.; Vaglio, P.; Reboul, J.; Hirozane-Kishikawa, T.; Li, Q.; Gabel, H. W.; Elewa, A.; Baumgartner, B.; Rose, D. J.; Yu, H.; Bosak, S.; Sequerra, R.; Fraser, A.; Mango, S. E.; Saxton, W. M.; Strome, S.; Van Den Heuvel, S.; Piano, F.; Vandenhaute, J.; Sardet, C.; Gerstein, M.; Doucette-Stamm, L.; Gunsalus, K. C.; Harper, J. W.; Cusick, M. E.; Roth, F. P.; Hill, D. E.; Vidal, M. A map of the interactome network of the metazoan C. elegans. Science 2004, 303, 540–543.
17. Rual, J. F.; Venkatesan, K.; Hao, T.; Hirozane-Kishikawa, T.; Dricot, A.; Li, N.; Berriz, G. F.; Gibbons, F. D.; Dreze, M.; Ayivi-Guedehoussou, N.; Klitgord, N.; Simon, C.; Boxem, M.; Milstein, S.; Rosenberg, J.; Goldberg, D. S.; Zhang, L. V.; Wong, S. L.; Franklin, G.; Li, S.; Albala, J. S.; Lim, J.; Fraughton, C.; Llamosas, E.; Cevik, S.; Bex, C.; Lamesch, P.; Sikorski, R. S.; Vandenhaute, J.; Zoghbi, H. Y.; Smolyar,
A.; Bosak, S.; Sequerra, R.; Doucette-Stamm, L.; Cusick, M. E.; Hill, D. E.; Roth, F. P.; Vidal, M. Towards a proteome-scale map of the human protein-protein interaction network. Nature 2005, 437, 1173–1178.
18. Stelzl, U.; Worm, U.; Lalowski, M.; Haenig, C.; Brembeck, F. H.; Goehler, H.; Stroedicke, M.; Zenkner, M.; Schoenherr, A.; Koeppen, S.; Timm, J.; Mintzlaff, S.; Abraham, C.; Bock, N.; Kietzmann, S.; Goedde, A.; Toksoz, E.; Droege, A.; Krobitsch, S.; Korn, B.; Birchmeier, W.; Lehrach, H.; Wanker, E. E. A human protein-protein interaction network: a resource for annotating the proteome. Cell 2005, 122, 957–968.
19. Peri, S.; Navarro, J. D.; Kristiansen, T. Z.; Amanchy, R.; Surendranath, V.; Muthusamy, B.; Gandhi, T. K. B.; Chandrika, K. N.; Deshpande, N.; Suresh, S.; Rashmi, B. P.; Shanker, K.; Padma, N.; Niranjan, V.; Harsha, H. C.; Talreja, N.; Vrushabendra, B. M.; Ramya, M. A.; Yatish, A. J.; Joy, M.; Shivashankar, H. N.; Kavitha, M. P.; Menezes, M.; Choudhury, D. R.; Ghosh, N.; Saravana, R.; Chandran, S.; Mohan, S.; Jonnalagadda, C. K.; Prasad, C. K.; Sinha, C. K.; Deshpande, K. S.; Pandey, A. Human protein reference database as a discovery resource for proteomics. Nucleic Acids Res. 2004, 32, D497–D501.
20. Formstecher, E.; Aresta, S.; Collura, V.; Hamburger, A.; Meil, A.; Trehin, A.; Reverdy, C.; Betin, V.; Maire, S.; Brun, C.; Jacq, B.; Arpin, M.; Bellaiche, Y.; Bellusci, S.; Benaroch, P.; Bornens, M.; Chanet, R.; Chavrier, P.; Delattre, O.; Doye, V.; Fehon, R.; Faye, G.; Galli, T.; Girault, J-A.; Goud, B.; de Gunzburg, J.; Johannes, L.; Junier, M-P.; Mirouse, V.; Mukherjee, A.; Papadopoulo, D.; Perez, F.; Plessis, A.; Rosse, C.; Saule, S.; Stoppa-Lyonnet, D.; Vincent, A.; White, M.; Legrain, P.; Wojcik, J.; Camonis, J.; Daviet, L. Protein interaction mapping: a Drosophila case study. Genome Res. 2005, 15, 376–384.
21. Giot, L.; Bader, J. S.; Brouwer, C.; Chaudhuri, A.; Kuang, B.; Li, Y.; Hao, Y. L.; Ooi, C. E.; Godwin, B.; Vitols, E.; Vijayadamodar, G.; Pochart, P.; Machineni, H.; Welsh, M.; Kong, Y.; Zerhusen, B.; Malcolm, R.; Varrone, Z.; Collis, A.; Minto, M.; Burgess, S.; McDaniel, L.; Stimpson, E.; Spriggs, F.; Williams, J.; Neurath, K.; Ioime, N.; Agee, M.; Voss, E.; Furtak, K.; Renzulli, R.; Aanensen, N.; Carrolla, S.; Bickelhaupt, E.; Lazovatsky, Y.; DaSilva, A.; Zhong, J.; Stanyon, C. A.; Finley Jr., R. L.; White, K. Braverman, P.; Jarvie, T.; Gold, S.; Leach, M.; Knight, J.; Shimkets, R. A.; McKenna, M. P.; Chant, J.; Rothberg, M. A Protein Interaction Map of Drosophila melanogaster. Science 2003, 302, 1727–1736.
22. Zhao, X. M.; Zhang, X. W.; Tang, W. H.; Chen, L. FPPI: Fusarium graminearum protein-protein interaction database. J. Proteome Res. 2009, 8, 4714–4721.
23. He, F.; Zhang, Y.; Chen, H.; Zhang, Z.; Peng, Y. L. The prediction of protein-protein interaction networks in rice blast fungus. BMC Genomics 2008, 9, 519.
24. Uetz, P.; Giot, L.; Cagney, G.; Mansfield, T. A.; Judson, R. S.; Knight, J. R.; Lockshon, D.; Narayan, V.; Srinivasan, M.; Pochart, P.; Qureshi-Emili, A.; Li, Y.; Godwin, B.; Conover, D.; Kalbfleisch, T.; Vijayadamodar, G.; Yang, M.; Johnston, M.; Fields, S.; Rothberg, J. M. A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 2000, 403, 623–627.
25. Yu, H.; Braun, P.; Yildirim, M. A.; Lemmens, I.; Venkatesan, K.; Sahalie, J.; Hirozane-Kishikawa, T.; Gebreab, F.; Li, N.; Simonis, N.; Hao, T.; Rual, J-F.; Dricot, A.; Vazquez, A.; Murray, R. R.; Simon, C.; Tardivo, L.; Tam, S.; Svrzikapa, N.; Fan, C.; de Smet, A-S.; Motyl, A.; Hudson, M. E.; Park, J.; Xin, X.; Cusick, M. E.; Moore, T.; Boone, C.; Snyder, M.; Roth, F. P.; Barabasi, A-L.; Tavernier, J.; Hill, D. E.; Vidal, M. High-quality binary protein interaction map of the yeast interactome network. Science 2008, 322, 104–110.
26. Ito, T.; Chiba, T.; Ozawa, R.; Yoshida, M.; Hattori, M.; Sakaki, Y. A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl. Acad. Sci. U.S.A. 2001, 98, 4569–4574.
27. Stark, C.; Breitkreutz, B. J.; Reguly, T.; Boucher, L.; Breitkreutz, A.; Tyers, M. BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 2006, 34, D535–D539.
28. Hermjakob, H.; Montecchi-Palazzi, L.; Lewington, C.; Mudali, S.; Kerrien, S.; Orchard, S.; Vingron, M.; Roechert, B.; Roepstorff, P.; Valencia, A.; Margalit, H.; Armstrong, J.; Bairoch, A.; Cesareni, G.; Sherman, D.; Apweiler, R. IntAct: an open source molecular interaction database. Nucleic Acids Res. 2004, 32, D452–D455.
29. Xenarios, I.; Rice, D. W.; Salwinski, L.; Baron, M. K.; Marcotte, E. M.; Eisenberg, D. DIP: the Database of Interacting Proteins. Nucleic Acids Res. 2000, 28, 289–291.
30. Chatraryamontri, A.; Ceol, A.; Palazzi, L. M.; Nardelli, G.; Schneider, M. V.; Castagnoli, L.; Cesareni, G.

23

ACS Paragon Plus Environment

Journal of Proteome Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 24 of 43

MINT: the Molecular INTeraction database. Nucleic Acids Res. 2007, 35, D572–D574. 31. Huala, E.; Dickerman, A. W.; Garcia-Hernandez, M.; Weems, D.; Reiser, L.; LaFond, F.; Hanley, D.; Kiphart, D.; Zhuang, M.; Huang, W.; Mueller, L. A.; Bhattacharyya, D.; Bhaya, D.; Sobral, B. W.; Beavis, W.; Meinke, D. W.; Town, C. D.; Somerville, C.; Rhee, S. Y. The Arabidopsis Information Resource (TAIR): a comprehensive database and web-based information retrieval, analysis, and visualization system for a model plant. Nucleic Acids Res. 2001, 29, 102–105. 32. Flicek, P.; Aken, B. L.; Ballester, B.; Beal, K.; Bragin, E.; Brent, S.; Chen, Y.; Clapham, P.; Coates, G.; Fairley, S.; Fitzgerald, S.; Fernandez-Banet, J.; Gordon, L.; Graf, S.; Haider, S.; Hammond, M.; Howe, K.; Jenkinson, A.; Johnson1, N.; Kahari, A.; Keefe, D.; Keenan, S.; Kinsella, R.; Kokocinski, F.; Koscielny, G.; Kulesha, E.; Lawson, D.; Longden, I.; Massingham, T.; McLaren, W.; Megy, K.; Overduin, B.; Pritchard, B.; Rios, D.; Ruffier, M.; Schuster, M.; Slater, G.; Smedley, D.; Spudich, G.; Tang, Y. A.; Trevanion, S.; Vilella, A.; Vogel, J.; White, S.; Wilder, S. P.; Zadissa, A.; Birney, E.; Cunningham, F.; Dunham, I.; Durbin, R.; Fernández-Suarez, X. M.; Herrero, J.; Hubbard, T. J. P.; Parker, A.; Proctor, G.; Smith, J.; Searle, S. M. J. Ensembl’s 10th year. Nucleic Acids Res. 2010, 38, D557–D562. 33. O’Brien, K. P.; Remm, M.; Sonnhammer, E. L. L. Inparanoid: a comprehensive database of eukaryotic orthologs. Nucleic Acids Res. 2005, 33, D476–D480. 34. Ostlund, G.; Schmitt, T.; Forslund, K.; Köstler, T.; Messina, D. N.; Roopra, S.; Frings, O.; Sonnhammer, E. L. L. InParanoid 7: new algorithms and tools for eukaryotic orthology analysis. Nucleic Acids Res. 2010, 38, D196–D203. 35. Walhout, A. J.; Sordella, R.; Lu, X.; Hartley, J. L.; Temple, G. F.; Brasch, M. A.; Thierry-Mieg, N.; Vidal, M. Protein interaction mapping in C. elegans using proteins involved in vulval development. Science 2000, 287 (5450), 116–122. 36. 
Finn, R. D.; Bateman, A.; Clements, J.; Coggill, P.; Eberhardt, R. Y.; Eddy, S. R.; Heger, A.; Hetherington, K.; Holm, L.; Mistry, J.; Sonnhammer, E. L. L.; Take, J. Pfam: the protein families database. Nucleic Acids Res. 2014, 42, D222–D230.

37. Raghavachari, B.; Tasneem, A.; Przytycka, T. M.; Jothi, R. DOMINE: a database of protein domain interactions. Nucleic Acids Res. 2008, 36, D656–D661.
38. Han, D. S.; Kim, H. S.; Jang, W. H.; Lee, S. D.; Suh, J. K. PreSPI: a domain combination based prediction system for protein-protein interaction. Nucleic Acids Res. 2004, 32, 6312–6320.
39. Erdös, P.; Rényi, A. On random graphs. Publ. Math. 1959, 6, 290–297.
40. Milenković, T.; Lai, J.; Pržulj, N. GraphCrunch: A tool for large network analyses. BMC Bioinformatics 2008, 9, 70.
41. Molloy, M.; Reed, B. A critical point for random graphs with a given degree sequence. Random Struct. Algor. 1995, 6, 161–180.
42. Bhardwaj, N.; Lu, H. Correlation between gene expression profiles and protein-protein interactions within and across genomes. Bioinformatics 2005, 21, 2730–2738.
43. Grigoriev, A. A relationship between gene expression and protein interactions on the proteome scale: analysis of the bacteriophage T7 and the yeast Saccharomyces cerevisiae. Nucleic Acids Res. 2001, 29, 3513–3519.
44. Jansen, R.; Greenbaum, D.; Gerstein, M. Relating Whole-Genome Expression Data with Protein-Protein Interactions. Genome Res. 2002, 12, 37–46.
45. Shannon, P.; Markiel, A.; Ozier, O.; Baliga, N. S.; Wang, J. T.; Ramage, D.; Amin, N.; Schwikowski, B.; Ideker, T. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13, 2498–2504.
46. Jeong, H.; Mason, S. P.; Barabási, A. L.; Oltvai, Z. N. Lethality and centrality in protein networks. Nature 2001, 411, 41–42.
47. Albert, R.; Barabasi, A-L. Statistical mechanics of complex networks. Rev. Mod. Phys. 2002, 74, 47–97.
48. Adamcsek, B.; Palla, G.; Farkas, I. J.; Derényi, I.; Vicsek, T. CFinder: locating cliques and overlapping modules in biological networks. Bioinformatics 2006, 22, 1021–1023.

49. Maere, S.; Heymans, K.; Kuiper, M. BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks. Bioinformatics 2005, 21, 3448–3449.
50. Remm, M.; Storm, C. E.; Sonnhammer, E. L. Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. J. Mol. Biol. 2001, 314, 1041–1052.
51. Cattivelli, L.; Rizza, F.; Badeck, F. W.; Mazzucotelli, E.; Mastrangelo, A. M.; Francia, E.; Marè, C.; Tondelli, A.; Stanca, A. M. Drought tolerance improvement in crop plants: An integrated view from breeding to genomics. Field Crop Res. 2008, 105, 1–14.
52. Ho, C. L.; Wu, Y.; Shen, H-B.; Provart, N. J.; Geisler, M. A predicted protein interactome for rice. Rice 2012, 5, 15.
53. Tang, S.; Liang, H.; Yan, D.; Zhao, Y.; Han, X.; Carlson, J. E.; Xia, X.; Yin, W. Populus euphratica: the transcriptomic response to drought stress. Plant Mol. Biol. 2013, 83, 539–557.
54. Dessau, M.; Halimi, Y.; Erez, T.; Chomsky-Hecht, O.; Chamovitz, D. A.; Hirsch, J. A. The Arabidopsis COP9 Signalosome Subunit 7 Is a Model PCI Domain Protein with Subdomains Involved in COP9 Signalosome Assembly. Plant Cell 2008, 20, 2815–2834.
55. Busch, S.; Eckert, S. E.; Krappmann, S.; Braus, G. H. The COP9 signalosome is an essential regulator of development in the filamentous fungus Aspergillus nidulans. Mol. Microbiol. 2003, 49, 717–730.
56. Richardson, K. S.; Zundel, W. The emerging role of the COP9 signalosome in cancer. Mol. Cancer Res. 2005, 3, 645–653.
57. Gusmaroli, G.; Figueroa, P.; Serino, G.; Deng, X. W. Role of the MPN subunits in COP9 signalosome assembly and activity, and their regulatory interaction with Arabidopsis Cullin3-based E3 ligases. Plant Cell 2007, 19, 564–581.
58. Yang, C. J.; Zhang, C.; Lu, Y. N.; Jin, J. Q.; Wang, X. L. The mechanisms of brassinosteroids action: from signal transduction to plant development. Mol. Plant 2011, 4, 588–600.
59. Witthöft, J.; Harter, K. Latest news on Arabidopsis brassinosteroid perception and signaling. Front Plant Sci. 2011, 2, 58.

60. Todaka, D.; Nakashima, K.; Maruyama, K.; Kidokoro, S.; Osakabe, Y.; Ito, Y.; Matsukura, S.; Fujita, Y.; Yoshiwara, K.; Ohme-Takagi, M.; Sakakibara, H.; Shinozaki, K.; Yamaguchi-Shinozaki, K. Rice phytochrome-interacting factor-like protein OsPIL1 functions as a key regulator of internode elongation and induces a morphological response to drought stress. Proc. Natl. Acad. Sci. U.S.A. 2012, 109, 15947–15952.
61. Song, S. Y.; Chen, Y.; Chen, J.; Dai, X. Y.; Zhang, W. H. Physiological mechanisms underlying OsNAC5-dependent tolerance of rice plants to abiotic stress. Planta 2011, 234, 331–345.
62. Shin, D.; Moon, S. J.; Han, S.; Kim, B. G.; Park, S. R.; Lee, S. K.; Yoon, H. J.; Lee, H. E.; Kwon, H. B.; Baek, D.; Yi, B. Y.; Byon, M. O. Expression of StMYB1R-1, a novel potato single MYB-like domain transcription factor, increases drought tolerance. Plant Physiol. 2011, 155, 421–432.
63. Gong, P.; Zhang, J.; Li, H.; Yang, C.; Zhang, C.; Zhang, X.; Khurram, Z.; Zhang, Y.; Wang, T.; Fei, Z.; Zhibiao, Y. Transcriptional profiles of drought-responsive genes in modulating transcription signal transduction, and biochemical pathways in tomato. J. Exp. Bot. 2010, 61, 3563–3575.
64. Tamura, K.; Hara-Nishimura, I. Functional insights of nucleocytoplasmic transport in plants. Front Plant Sci. 2014, 5, 118.
65. Parry, G. Assessing the function of the plant nuclear pore complex and the search for specificity. J. Exp. Bot. 2013, 64, 833–845.
66. Verslues, P. E.; Guo, Y.; Dong, C. H.; Ma, W.; Zhu, J. K. Mutation of SAD2, an importin beta-domain protein in Arabidopsis, alters abscisic acid sensitivity. Plant J. 2006, 47, 776–787.
67. Luo, Y.; Wang, Z.; Ji, H.; Fang, H.; Wang, S.; Tian, L.; Li, X. An Arabidopsis homolog of importin β1 is required for ABA response and drought tolerance. Plant J. 2013, 75, 377–389.
68. Romero-Puertas, M. C.; Rodríguez-Serrano, M.; Sandalio, L. M. Protein S-nitrosylation in plants under abiotic stress: an overview. Front Plant Sci. 2013, 4, 373.
69. Guerra, D.; Crosatti, C.; Khoshro, H. H.; Mastrangelo, A. M.; Mica, E.; Mazzucotelli, E. Post-transcriptional and post-translational regulations of drought and heat response in plants: a spider’s web of mechanisms. Front Plant Sci. 2015, 6, 57.
70. Lyzenga, W. J.; Liu, H.; Schofield, A.; Muise-Hennessey, A.; Stone, S. L. Arabidopsis CIPK26 interacts with KEG, components of the ABA signalling network and is degraded by the ubiquitin–proteasome system. J. Exp. Bot. 2013, 64, 2779–2791.
71. Stone, S. L. The role of ubiquitin and the 26S proteasome in plant abiotic stress signaling. Front Plant Sci. 2014, 5, 135.
72. Lü, S.; Zhao, H.; Des Marais, D. L.; Parsons, E. P.; Wen, X.; Xu, X.; Bangarusamy, D. K.; Wang, G.; Rowland, O.; Juenger, T.; Bressan, R. A.; Jenks, M. A. Arabidopsis ECERIFERUM9 involvement in cuticle formation and maintenance of plant water status. Plant Physiol. 2012, 159, 930–944.
73. Park, H. J.; Yun, D. J. New insights into the role of the small ubiquitin-like modifier (SUMO) in plants. Int. Rev. Cell. Mol. Biol. 2013, 300, 161–209.
74. Park, H. C.; Kim, H.; Koo, S. C.; Park, H. J.; Cheong, M. S.; Hong, H.; Baek, D.; Chung, W. S.; Kim, D. H.; Bressan, R. A.; Lee, S. Y.; Bohnert, H. J.; Yun, D. J. Functional characterization of the SIZ/PIAS-type SUMO E3 ligases, OsSIZ1 and OsSIZ2 in rice. Plant Cell Environ. 2010, 33, 1923–1934.
75. Chen, T. H. H.; Murata, N. Enhancement of tolerance of abiotic stress by metabolic engineering of betaines and other compatible solutes. Curr. Opin. Plant Biol. 2002, 5, 250–257.
76. Osakabe, Y.; Yamaguchi-Shinozaki, K.; Shinozaki, K.; Tran, L. S. P. Sensing the environment: key roles of membrane-localized kinases in plant perception and response to abiotic stress. J. Exp. Bot. 2013, 64, 445–458.
77. Hanin, M.; Brini, F.; Ebel, C.; Toda, Y.; Takeda, S.; Masmoudi, K. Plant dehydrins and stress tolerance: versatile proteins for complex mechanisms. Plant Signal Behav. 2011, 6, 1503–1509.
78. Sakamoto, A.; Murata, A. N. Metabolic engineering of rice leading to biosynthesis of glycinebetaine and tolerance to salt and cold. Plant Mol. Biol. 1998, 38, 1011–1019.
79. Heckathorn, S. A.; Downs, C. A.; Sharkey, T. D.; Coleman, J. S. The small, methionine-rich chloroplast heat-shock protein protects photosystem II electron transport during heat stress. Plant Physiol. 1998, 116, 439–444.
80. Downs, C.; Heckathorn, S.; Bryan, J.; Coleman, J. The methionine-rich low-molecular-weight chloroplast heat-shock protein: evolutionary conservation and accumulation in relation to thermotolerance. Am. J. Bot. 1998, 85, 175.
81. Zhang, Q.; Li, J.; Zhang, W.; Yan, S.; Wang, R.; Zhao, J.; Li, Y.; Qi, Z.; Sun, Z.; Zhu, Z. The putative auxin efflux carrier OsPIN3t is involved in the drought stress response and drought tolerance. Plant J. 2012, 72, 805–816.
82. Desikan, R.; Horák, J.; Chaban, C.; Mira-Rodado, V.; Witthöft, J.; Elgass, K.; Grefen, C.; Cheung, M. K.; Meixner, A. J.; Hooley, R.; Neill, S. J.; Hancock, J. T.; Harter, K. The histidine kinase AHK5 integrates endogenous and environmental signals in Arabidopsis guard cells. PLoS One 2008, 3, e2491.
83. Bartels, S.; González Besteiro, M. A.; Lang, D.; Ulm, R. Emerging functions for plant MAP kinase phosphatases. Trends Plant Sci. 2010, 15, 322–329.
84. Arraes, F. B.; Beneventi, M. A.; Sa, M. E. L. de; Paixao, J. F.; Albuquerque, E. V.; Marin, S. R.; Purgatto, E.; Nepomuceno, A. L.; Grossi-de-Sa, M. F. Implications of ethylene biosynthesis and signaling in soybean drought stress tolerance. BMC Plant Biol. 2015, 15, 213.
85. Yang, L.; Ji, W.; Gao, P.; Li, Y.; Cai, H.; Bai, X.; Chen, Q.; Zhu, Y. GsAPK, an ABA-activated and calcium-independent SnRK2-type kinase from G. soja, mediates the regulation of plant tolerance to salinity and ABA stress. PLoS One 2012, 7, e33838.
86. Zhao, J.; Gao, Y.; Zhang, Z.; Chen, T.; Guo, W.; Zhang, T. A receptor-like kinase gene (GbRLK) from Gossypium barbadense enhances salinity and drought-stress tolerance in Arabidopsis. BMC Plant Biol. 2013, 13, 110.
87. Guenther, J. F.; Chanmanivone, N.; Galetovic, M. P.; Wallace, I. S.; Cobb, J. A.; Roberts, D. M. Phosphorylation of soybean nodulin 26 on serine 262 enhances water permeability and is regulated developmentally and by osmotic signals. Plant Cell 2003, 15, 981–991.
88. Kaneda, T.; Taga, Y.; Takai, R.; Iwano, M.; Matsui, H.; Takayama, S.; Isogai, A.; Che, F. S. The transcription factor OsNAC4 is a key positive regulator of plant hypersensitive cell death. EMBO J. 2009, 28, 926–936.
89. Zhou, L. Z.; Li, S.; Feng, Q. N.; Zhang, Y. L.; Zhao, X.; Zeng, Y.; Wang, H.; Jiang, L.; Zhang, Y. Protein S-ACYL Transferase10 is critical for development and salt tolerance in Arabidopsis. Plant Cell 2013, 25, 1093–1107.
90. Guo, L.; Yang, H.; Zhang, X.; Yang, S. Lipid transfer protein 3 as a target of MYB96 mediates freezing and drought stress in Arabidopsis. J. Exp. Bot. 2013, 64, 1755–1767.
91. Kim, J. S.; Kim, K. A.; Oh, T. R.; Park, C. M.; Kang, H. Functional characterization of DEAD-box RNA helicases in Arabidopsis thaliana under abiotic stress conditions. Plant Cell Physiol. 2008, 49, 1563–1571.
92. Linder, P.; Owttrim, G. W. Plant RNA helicases: linking aberrant and silencing RNA. Trends Plant Sci. 2009, 14, 344–352.
93. Celik, O.; Atak, C.; Suludere, Z. Response of soybean plants to gamma radiation: Biochemical analyses and expression patterns of trichome development. Plant Omics 2014, 7, 382–391.

TABLES

Table 1: Summary of protein-protein interactions from different PPI data resources

Organism                   BioGRID   IntAct   MINT    TAIR   HPRD    Unique PPIs   Unique PPIs (Nonself)
Arabidopsis thaliana       19099     14214    -       2177   -       34248         30120
Caenorhabditis elegans     8264      12054    3642    -      -       22263         18179
Drosophila melanogaster    34392     25316    15768   -      -       61444         41143
Homo sapiens               151394    68900    17564   -      35345   229667        190129
Saccharomyces cerevisiae   262590    82761    28038   -      -       319635        268575
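The "Unique PPIs" and "Unique PPIs (Nonself)" columns of Table 1 can be read as the result of pooling the per-database interaction lists, collapsing duplicates, and then discarding self-interactions. For Arabidopsis thaliana, for example, the three source counts sum to 19099 + 14214 + 2177 = 35490, whereas 34248 unique PPIs are reported, which is consistent with some pairs being present in more than one resource. The following minimal sketch illustrates that bookkeeping on toy data. It assumes that interactions are undirected, so that (A, B) and (B, A) count once. The function names and protein identifiers are hypothetical, and this is not the authors' actual pipeline.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[str, str]


def canonical(a: str, b: str) -> Pair:
    """Order the two identifiers so that (A, B) and (B, A) compare equal."""
    return (a, b) if a <= b else (b, a)


def merge_ppis(*sources: Iterable[Pair]) -> Set[Pair]:
    """Pool interaction lists from several resources into unique undirected pairs."""
    unique: Set[Pair] = set()
    for source in sources:
        for a, b in source:
            unique.add(canonical(a, b))
    return unique


def non_self(pairs: Iterable[Pair]) -> Set[Pair]:
    """Drop self-interactions (a protein paired with itself)."""
    return {(a, b) for a, b in pairs if a != b}


# Toy input: hypothetical identifiers standing in for two database exports.
source_one = [("P1", "P2"), ("P2", "P1"), ("P3", "P3")]
source_two = [("P1", "P2"), ("P2", "P4")]

unique_pairs = merge_ppis(source_one, source_two)
non_self_pairs = non_self(unique_pairs)
print(len(unique_pairs), len(non_self_pairs))
```

On the toy input, five raw records reduce to three unique pairs, of which two are non-self, mirroring the relationship between the last two columns of the table.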

Table 2: Topological features of the protein-protein interaction network of Macrotyloma uniflorum in comparison with 1,000 random networks

Feature                  Predicted Network   Random Networks
Clustering Coefficient   0.0831519           0.02738 ± 0.00289747
Average Diameter         4.014226            3.40941 ± 0.00876534
Average Degree           7.50                7.50
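The comparison in Table 2 can be illustrated with a short sketch. Average degree is 2m/n for a network of n proteins and m interactions; with the 6804 interactions among 1812 proteins given in the Figure 3 legend this is 2 × 6804 / 1812 ≈ 7.51, close to the reported 7.50. Random networks with the same n and m share this value by construction, so only clustering and path-length measures can differ from the null. The code below is a sketch under stated assumptions: it uses the average local clustering coefficient and an Erdős–Rényi G(n, m) null model, neither of which is confirmed by this section as the authors' exact choice, and all function names are hypothetical.

```python
import random
import statistics
from itertools import combinations


def adjacency(n, edges):
    """Build an undirected adjacency map over nodes 0..n-1, ignoring self-loops."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def average_degree(adj):
    """Mean number of neighbours per node, equal to 2m/n."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def average_clustering(adj):
    """Average local clustering coefficient; nodes of degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def random_graph(n, m, rng):
    """G(n, m) null model: m distinct edges drawn uniformly at random.

    Enumerating all node pairs is quadratic in n, acceptable for a sketch.
    """
    return rng.sample(list(combinations(range(n), 2)), m)


def compare_with_random(n, edges, trials=100, seed=0):
    """Return observed clustering and the mean and s.d. over random networks."""
    rng = random.Random(seed)
    adj = adjacency(n, edges)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    observed = average_clustering(adj)
    null = [
        average_clustering(adjacency(n, random_graph(n, m, rng)))
        for _ in range(trials)
    ]
    return observed, statistics.mean(null), statistics.stdev(null)


# Toy network: two triangles joined by a single bridging edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
observed, null_mean, null_sd = compare_with_random(6, toy_edges, trials=200)
print(f"clustering: {observed:.4f} vs random {null_mean:.4f} ± {null_sd:.4f}")
```

A predicted network whose clustering coefficient lies well above the random mean, as in Table 2, indicates more locally dense neighbourhoods than expected by chance for the same number of nodes and edges.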

FIGURE LEGENDS

Figure 1: The workflow of in silico prediction of the protein-protein interactome in Macrotyloma uniflorum, illustrating the interolog and domain-based methods.

Figure 2: Contribution and quantitative analysis of PPIs. (A) Species-wise contribution of the five reference organisms in interolog-based PPI prediction. (B) Venn diagram representing root- and shoot-specific PPIs in two genotypes of horsegram.

Figure 3: Predicted protein-protein interactome consisting of 6804 interactions among 1812 proteins.............. Hub nodes common to both methods are depicted in bold and red color.

Figure 4: Validation of network. (A) Gene ontology based validation of PPI with 1,000 random networks. (B) Pearson correlation coefficient based validation of PPI with 1,000 random networks. (C) Log-log plot of the degree distribution of the predicted PPI. (D) Cumulative log-log plot of the degree distribution of the predicted PPI.

Figure 5: Gene Ontology enrichment in category Biological Processes. (A) PPIs unique to Genotype 1 shoot stress. (B) PPIs unique to Genotype 1 root stress. (C) PPIs unique to Genotype 2 shoot stress. (D) PPIs unique to Genotype 2 root stress (K4_1 cluster). (E) PPIs unique to Genotype 2 root stress (K4_2 cluster).

Figure 6: Gene Ontology enrichment in category Molecular Functions. (A) PPIs unique to Genotype 1 shoot stress. (B) PPIs unique to Genotype 1 root stress. (C) PPIs unique to Genotype 2 shoot stress. (D) PPIs unique to Genotype 2 root stress (K4_0 cluster). (E) PPIs unique to Genotype 2 root stress (K4_1 cluster). (F) PPIs unique to Genotype 2 root stress (K4_2 cluster).

Figure 7: PMN pathway analysis of highly enriched proteins, revealing important network pathways involved in drought stress tolerance in horsegram and their interdependency.

Figure 8: Workflow for construction of the MauPIR database. Every query returns search output in four forms: interolog, ortholog, interaction and gene enrichment. The information available after a query search is presented in sky blue color.

Figure 9: Screenshot of the front page of the MauPIR database showing (A) homepage of the database. (B) Functional description of a protein query.

for TOC only

Figure 1

Figure 2

Figure 3

Figure 4

Figure 5 - Part 1

Figure 5 - Part 2

Figure 6 - Part 1

Figure 6 - Part 2

Figure 7

Figure 8

Figure 9