
Article

Pathogenicity genes in Ustilaginoidea virens revealed by a predicted protein-protein interaction network Kang Zhang, Yuejiao Li, Tengjiao Li, Zhi-Gang Li, Tom Hsiang, Ziding Zhang, and Wenxian Sun J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.6b00720 • Publication Date (Web): 18 Jan 2017 Downloaded from http://pubs.acs.org on January 19, 2017


Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Pathogenicity genes in Ustilaginoidea virens revealed by a predicted protein-protein interaction network

Kang Zhang, Yuejiao Li, Tengjiao Li, Zhi-Gang Li, Tom Hsiang, Ziding Zhang, and Wenxian Sun*



†Department of Plant Pathology and the Ministry of Agriculture Key Laboratory for Plant Pathology, China Agricultural University, Beijing 100193, China.

‡School of Environmental Sciences, University of Guelph, Guelph, Canada N1G 2W1.

§State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China.

Address for correspondence and proofs:

Wenxian Sun

Department of Plant Pathology

China Agricultural University

2 West Yuanmingyuan Rd., Haidian District

Beijing 100193, China

Telephone: +86 10 6273 3532;

Fax: +86 10 6273 3532;

E-mail: [email protected]

Running title: The PPI network of Ustilaginoidea virens

KEYWORDS: Ustilaginoidea virens; protein-protein interaction (PPI); interolog; domain-domain interaction; pathogenicity; secreted protein


ABSTRACT

Rice false smut, caused by Ustilaginoidea virens, produces significant losses in rice yield and grain quality, and has recently emerged as one of the most important rice diseases worldwide. Despite its importance in rice production, relatively few studies have been conducted to illustrate the complex interactome and the pathogenicity gene interactions of the pathogen. Here, a protein-protein interaction (PPI) network of U. virens was built through two well-recognized approaches, the interolog and domain-domain interaction-based methods. A total of 20,217 interactions associated with 3,305 proteins were predicted after strict filtering. The reliability of the network was assessed computationally and experimentally. The topology of the interactome network revealed highly connected proteins. A pathogenicity-related subnetwork involving genes up-regulated during early U. virens infection was also constructed, and many novel pathogenicity proteins were predicted in the subnetwork. In addition, we built an interspecies PPI network between U. virens and Oryza sativa, providing new insights into the molecular interactions of this host-pathogen pathosystem. A web-based, publicly available interactive database based on these interaction networks has also been released. In summary, a proteome-scale map of the PPI network was described for U. virens, which will provide new perspectives for finely dissecting interactions of genes related to its pathogenicity.


INTRODUCTION

Ustilaginoidea virens (teleomorph: Villosiclava virens), the causal agent of rice false smut, has recently become an economically important fungal pathogen.1-2 Because of changes in the dominant cultivars of rice (Oryza sativa) and the overuse of chemical fertilizers, false smut has expanded rapidly in the majority of rice-growing regions around the world, especially in China.2 Together with fungal blast and sheath blight, it is now considered one of the most important rice diseases in Asia. After infecting and colonizing rice floral organs, U. virens produces yellow or dark green false smut balls covered with powdery chlamydospores in spikelets.3 The disease thereby causes significant yield losses in rice. In addition, various types of mycotoxins, including ustiloxins and ustilaginoidins, are produced during the formation of the spore balls; these not only reduce grain quality but also pose health threats to humans and animals.4,5 Although many efforts have been made to study U. virens infection processes,3 genetic diversity,6,7 morphology,8,9 and mycotoxin characterization,5,10,11 the molecular mechanisms underlying its pathogenicity remain largely unknown. This poses an obstacle to the development of disease management strategies based on a thorough understanding of the disease at the molecular and cellular levels.

Recently, genomic features adapted for biotrophy and a floret infection lifestyle, gene inventories, and interspecific evolutionary relationships of U. virens were revealed by comparative genomics.12 The biosynthetic gene clusters for ustiloxins and ustilaginoidins have also been predicted.12,13 In addition, a specific set of mycotoxin biosynthesis genes and effector genes were predicted to be pathogenicity genes by comparing gene expression profiles before and after U. virens infection.12 However, the interactions and functional relevance of the proteins encoded by the predicted pathogenicity genes are largely unknown.

Specific cellular processes or biological functions usually depend on interconnected protein interaction networks rather than on individual proteins,14 and pathogenicity depends on a complicated protein interactome network consisting of numerous small biological modules in the pathogens.15 Protein interactome networks have been demonstrated to be powerful in predicting novel essential genes in specific signal transduction pathways. For example, the pattern recognition receptor XA21 confers high resistance to rice bacterial leaf blight.16 Using a network guilt-by-association approach followed by protein-protein interaction assays, Lee et al. predicted three regulators of XA21-mediated immunity (Rox) using RiceNet, a genome-scale protein interaction network for rice.17 Functional analyses indicate that Rox1 and Rox2 are positive regulators while Rox3 is a negative one.17 Similarly, pathogenicity genes could be predicted via protein interactomes of plant pathogens. Liu et al. identified two intensely interconnected network modules from the pathogenicity-related protein network of Fusarium graminearum.18 Homologous genes of many components in these modules were putative pathogenicity genes in other pathogens.18


Several high-throughput technologies, such as yeast two-hybrid screening19 and affinity purification coupled with mass spectrometry,20 have been developed to build protein-protein interaction networks. At present, a large number of binary PPIs have been experimentally verified in multiple model organisms, such as Homo sapiens,21 Drosophila melanogaster,22 Saccharomyces cerevisiae,23 and Escherichia coli.24 The availability of these interactome databases greatly facilitates studies on protein biological functions and signaling pathways in these model species. Nevertheless, no similar PPI network is available for U. virens or for most fungal plant pathogens, despite their economic importance.

Because experimental approaches to reveal PPI networks are time-consuming and extremely expensive, computational prediction methods including interolog,25 domain-domain interaction (DDI),26,27 structural matching,28 gene expression profiling,29 co-evolution,30 and machine learning31 have been developed based on genomic, proteomic and other resources. To date, PPI networks have been predicted for some model plant species, such as Oryza sativa17,32 and Arabidopsis thaliana,35,33,34 and for a few plant pathogens including Magnaporthe oryzae,35 F. graminearum,36 Rhizoctonia solani AG1,37 and Xanthomonas oryzae pv. oryzae,38 using various computational methods. Among these methods, the interolog and DDI-based methods are the most widely used. The interolog method is based on the existence of conserved PPIs in different organisms.39 An interaction between a pair of proteins can be predicted if their respective homologs interact physically with each other in other organisms. In the DDI-based approach, two proteins are predicted to interact if they contain structural domains through which related proteins have been experimentally shown to interact in other species.26 Both approaches rely primarily on the structural and sequence similarity of the proteins, making them especially suitable for non-model organisms.

A comprehensive PPI network covering approximately one-fourth of the proteome has been successfully predicted for M. oryzae using the interolog approach.35 Similarly, a global PPI network has also been constructed for F. graminearum through interolog and DDI-based methods.36 These networks provide valuable information for understanding the molecular mechanisms underlying growth, sporulation and pathogenicity of these important plant pathogens. Unlike the hemibiotrophic pathogens M. oryzae and F. graminearum, U. virens is a biotrophic pathogen.3 Construction of the protein interactome network for U. virens will therefore facilitate understanding of pathogenicity mechanisms in biotrophic pathogens.

The availability of the U. virens genome offers an excellent opportunity to explore the complex interaction network of the pathogen.12 Here, an interactome network of U. virens was constructed at the genome scale. A total of 20,217 PPIs associated with 3,305 proteins were predicted using the interolog and DDI-based methods. To provide insights into protein functions and pathogenicity of U. virens, a pathogenicity-related subnetwork was extracted from the global network based on the expression profiles during U. virens infection. Furthermore, an interspecies PPI network between U. virens and its rice host was constructed using similar approaches, and it illustrated the intense battle between the host and the pathogen.

MATERIALS AND METHODS

Construction of protein-protein interaction network for U. virens

Derived protein sequences of 8,426 predicted genes in U. virens were used for the PPI prediction.15 Protein sequences of the model organisms S. cerevisiae, Caenorhabditis elegans, D. melanogaster, E. coli and H. sapiens were downloaded from UniProt, FlyBase and Ensembl. Experimentally verified PPIs of these model organisms were downloaded from the Database of Interacting Proteins (DIP)40 and other related databases for the interolog prediction (Supplementary Table S1, Supporting Information). All potential orthologs between U. virens and these model organisms were first identified using Inparanoid v4.0.41 The candidate with the greatest Inparanoid score was considered the ortholog, and the others were considered potential inparalogs.42 The Reciprocal Best Hit (RBH) algorithm was used as a supplement for detecting orthologs.43 BLASTP was performed to reciprocally compare all proteins in U. virens and in the model organisms.44 Only protein pairs that were reciprocal best matches (E-value ≤ 1e-5) and did not contradict the orthologs identified by Inparanoid were considered orthologs. The experimentally determined protein interactions in the model organisms were then transferred to the identified orthologs and inparalogs to predict PPIs. The resultant PPIs were called raw interolog-based PPIs (oPPIs) (Figure 1).

In the DDI-based method, the protein domains were identified following the method described by Li et al.45 with minor modification. Briefly, all U. virens protein sequences were subjected to a Pfam46 scan to identify potential protein domains (E-value ≤ 1e-3). Only protein domains with length coverage ≥ 80% were used for subsequent analyses. Raw DDI-based PPIs (dPPIs) were preliminarily predicted based on experimentally determined DDIs, which were downloaded from iPfam47 and 3did48 and merged as templates. A simplified but stringent strategy was then used to filter the raw dPPIs: a pair was retained only if all domains in one protein interacted with all of the corresponding domains of the other protein. Furthermore, the total length of the domains was required to cover ≥ 40% of the protein to decrease prediction error. As a final filtering step, dPPIs involving protein pairs that were not predicted by WolfPSort to target the same subcellular localization were eliminated.49 The remaining dPPIs were merged with the ortholog-associated oPPIs and the overlap of raw oPPIs and raw dPPIs into an integrated network, as illustrated in Figure 1.
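The three dPPI filters described above (all-against-all domain support, domain coverage of at least 40%, and matching predicted localization) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the record layout (`pfam`, `spans`, `length`, `loc`), the function names, the toy identifiers, and the choice to merge overlapping domain spans before measuring coverage are all hypothetical.

```python
from itertools import product


def domain_coverage(spans, protein_length):
    """Fraction of the protein covered by domains.

    spans: list of (start, end) positions, 1-based and inclusive.
    Overlapping spans are merged so residues are not counted twice
    (an assumption; the source does not say how overlaps are handled).
    """
    covered = 0
    last_end = 0
    for start, end in sorted(spans):
        start = max(start, last_end + 1)  # skip residues already counted
        if end >= start:
            covered += end - start + 1
            last_end = end
    return covered / protein_length


def all_domains_interact(domains_a, domains_b, known_ddis):
    """True only if every domain of A pairs with every domain of B."""
    if not domains_a or not domains_b:
        return False
    return all(
        frozenset((da, db)) in known_ddis
        for da, db in product(domains_a, domains_b)
    )


def keep_dppi(a, b, known_ddis, min_coverage=0.40):
    """Apply the three filters to one candidate protein pair."""
    return (
        all_domains_interact(a["pfam"], b["pfam"], known_ddis)
        and domain_coverage(a["spans"], a["length"]) >= min_coverage
        and domain_coverage(b["spans"], b["length"]) >= min_coverage
        and a["loc"] == b["loc"]  # same predicted subcellular localization
    )


# Toy data: domain identifiers and proteins are made up for illustration.
known = {frozenset(("PF_A", "PF_B")), frozenset(("PF_A", "PF_C"))}
p1 = {"pfam": ["PF_A"], "spans": [(1, 60)], "length": 100, "loc": "nucl"}
p2 = {"pfam": ["PF_B", "PF_C"], "spans": [(1, 30), (20, 50)],
      "length": 100, "loc": "nucl"}
p3 = {"pfam": ["PF_B", "PF_D"], "spans": [(1, 80)], "length": 100, "loc": "nucl"}
print(keep_dppi(p1, p2, known), keep_dppi(p1, p3, known))
```

Storing each template DDI as a `frozenset` makes the lookup order-independent, which matches the undirected nature of a domain-domain interaction.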

Assessment of the predicted network

The quality of the PPI network was evaluated independently by computational and experimental methods. First, the Gene Ontology (GO)50 annotation test was conducted to evaluate the proportion of interacting pairs sharing the same GO term in the PPI network of U. virens. Because U. virens proteins were annotated with GO terms at different depths in the GO hierarchy, it was difficult to directly compare the annotations of the proteins in an interacting pair. A GO hierarchy tree was therefore constructed using the parent relationships between terms described in the OBO format files provided by the GO Consortium.51 If a GO term had more than one path to the root terms, the shortest path length was defined as the depth of this term. In addition, 100 randomly connected networks of comparable size, using all proteins in the predicted network, were created as null models for comparison. Self-interacting proteins were removed to eliminate interference. The proportions of interacting pairs sharing the same GO term at different depths (from 3 to 8, and more than 8) in the GO hierarchies were calculated for the predicted and randomized networks.

Second, Pearson’s correlation coefficients (PCCs) and Spearman’s correlation coefficients (SCCs) were used to evaluate the reliability of pair-relationship predictions using gene expression profiles of U. virens from seven independent samples during early infection (GEO: GSE87345, BioProject: PRJNA344467).12,52 Interacting pairs without expression data and self-interactions were excluded from the following calculations. The absolute value of the PCC for each interaction pair was computed based on the fragments per kilobase of transcript per million mapped reads (FPKM) values of the corresponding mRNAs in the seven samples. Randomized networks of comparable size were created and analyzed in the same way.

Third, the yeast two-hybrid (Y2H) assay was performed to validate the predicted interactions following the manufacturer’s instructions. The U. virens strain UV8b was cultured in potato sucrose broth for 6 d at 28 °C and then collected for RNA isolation. Total RNAs were extracted using the Ultrapure RNA Kit (CWBIO, Beijing, China). Complementary DNA was synthesized by reverse transcriptase M-MLV (Takara, Dalian, China) using total RNAs as template. Coding sequences of the tested genes were amplified from cDNA by polymerase chain reaction (PCR). The PCR primer sets are listed in Supplementary Table S2 (Supporting Information). The amplified fragments of UV_1325, UV_4823 and UV_7680 were cloned into pGBKT7, while sequences of the predicted partners were cloned into pGADT7. The corresponding pair of pGADT7 and pGBKT7 constructs for each interaction was co-transformed into the yeast Gold strain (Frozen-EZ Yeast Transformation II Kit). The transformants were screened on plates with double dropout medium lacking Leu and Trp. Successful transformants (5 µl) at different concentrations (OD600 = 1, 0.1 and 0.01) were inoculated onto selective plates with quadruple dropout medium lacking Leu, Trp, His and Ade to validate the interactions. The plasmids pGADT7-T and pGBKT7-53 were co-transformed into yeast cells as a positive control, while pGADT7-T and pGBKT7-λ were used as a negative control.

Network analysis

Cytoscape v3.1.1 was used for visualizing the interaction network and analyzing its basic properties.53 Topological parameters of the predicted and randomized networks, including the average degree of nodes (i.e., the number of interacting partners of a specific protein), network diameter, clustering coefficient, and shortest paths, were calculated using NetworkAnalyzer,54 a plugin of Cytoscape.
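Two of the topological parameters named above, average degree and average clustering coefficient, have simple definitions that can be written out directly. The sketch below is illustrative only and does not reproduce NetworkAnalyzer; the function names are hypothetical, self-interactions are dropped, and nodes with fewer than two neighbors are assigned a clustering coefficient of 0, which are assumptions made for the example.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets; self-interactions are ignored (assumption)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def average_degree(adj):
    """Mean number of interacting partners per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def clustering_coefficient(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumption: undefined cases count as 0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    return sum(clustering_coefficient(adj, n) for n in adj) / len(adj)


# Toy network: a triangle (a, b, c) with one pendant node d.
toy = build_adjacency([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
print(average_degree(toy), average_clustering(toy))
```

For an undirected network with E edges and N nodes, the average degree equals 2E/N. If each of the 20,217 predicted interactions among 3,305 proteins is counted as one edge, this gives 2 × 20,217 / 3,305 ≈ 12.23.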

Prediction of the pathogenicity protein-based subnetwork

Putative pathogenicity genes in U. virens were taken from the comparison with PHI-base55 performed in our previous study.12 A pathogenicity-related PPI subnetwork was extracted from the entire PPI network based on the putative pathogenicity genes (also named seed nodes) that were up-regulated during U. virens infection.12 Proteins interacting with multiple seed nodes were also considered potential pathogenicity-related proteins. Potential functional clusters in the subnetwork were revealed using CFinder,56 which applies the clique percolation method to locate communities of fully connected subgraphs (k-cliques) in networks.57 For each cluster, a Fisher Exact test followed by False Discovery Rate (FDR) correction58 was conducted to determine the most significantly enriched GO term in the biological process category. For simplicity, only GO terms at depth level 4 in the GO hierarchy were considered.
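The enrichment step for one cluster can be sketched as a one-sided Fisher exact test followed by an FDR adjustment. The source does not state the test direction or the FDR procedure, so the over-representation tail and the Benjamini-Hochberg adjustment used here are assumptions, and the function names are hypothetical.

```python
from math import comb


def enrichment_p(k, n, K, N):
    """One-sided Fisher exact (hypergeometric tail) p-value.

    N: annotated proteins in the background
    K: background proteins carrying the GO term
    n: proteins in the cluster
    k: cluster proteins carrying the GO term
    Returns P(X >= k) for X drawn without replacement.
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example: 2 of 3 cluster proteins carry a term held by 4 of 10 proteins.
p = enrichment_p(k=2, n=3, K=4, N=10)
print(p, benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

In use, one p-value would be computed per candidate GO term in a cluster, the list adjusted together, and the term with the smallest adjusted value reported.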

Construction of interspecies PPI network with Oryza sativa

The interspecies PPI network between U. virens and Oryza sativa was constructed using the interolog and DDI-based methods. Rice protein sequences (TIGR Rice Genome Annotation Release 5)59 were downloaded from http://rice.plantbiology.msu.edu. The DDI-based method was used as described above, except that the subcellular localizations of the proteins were not taken into consideration. In the interolog method, all experimentally verified PPIs and the corresponding sequences deposited in DIP were downloaded as templates. Proteins in U. virens and in rice were compared against the proteins in DIP to identify potential homologs. To improve the accuracy of predictions, only BLAST hits with E-value ≤ 1e-20, sequence identity ≥ 30% and length coverage ≥ 80% were considered homologs. The interacting pairs in DIP were then mapped to predict interactions between the respective homologs in U. virens and in rice. Finally, the PPIs predicted using the interolog and DDI-based methods were combined to construct the interspecies PPI network, which was visualized using Cytoscape.

GO annotations of the rice proteome were downloaded from the GO Consortium.51 The GO terms at depth level 3 or 4 in the GO hierarchy were summarized, and GO enrichments were analyzed for both species using the methods described above. Subcellular localizations of rice proteins were also predicted by WolfPSort.49 In addition, the degrees of the rice proteins in the interspecies network that were also present in RiceNet17 were calculated and compared with their degrees in RiceNet.
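The homolog thresholds and the cross-species mapping described above can be sketched as follows. This is a minimal sketch under assumptions: the hit records and their field names (`query`, `template`, `evalue`, `identity`, `coverage`), the function names and all identifiers are hypothetical, and coverage is treated as a single percentage because the source does not say which sequence it is measured on.

```python
def is_homolog(hit, max_evalue=1e-20, min_identity=30.0, min_coverage=80.0):
    """Apply the three thresholds stated in the text to one BLAST hit."""
    return (
        hit["evalue"] <= max_evalue
        and hit["identity"] >= min_identity
        and hit["coverage"] >= min_coverage
    )


def homologs_by_template(hits):
    """Map each template (DIP) protein to the query proteins that pass."""
    by_template = {}
    for hit in hits:
        if is_homolog(hit):
            by_template.setdefault(hit["template"], set()).add(hit["query"])
    return by_template


def interspecies_ppis(template_ppis, pathogen_hits, rice_hits):
    """Predict (pathogen protein, rice protein) pairs from template PPIs."""
    pathogen = homologs_by_template(pathogen_hits)
    rice = homologs_by_template(rice_hits)
    predicted = set()
    for a, b in template_ppis:
        # A template pair is undirected, so try both orientations.
        for x, y in ((a, b), (b, a)):
            for p in pathogen.get(x, ()):
                for r in rice.get(y, ()):
                    predicted.add((p, r))
    return predicted


# Toy data; every identifier below is made up for illustration.
pathogen_hits = [
    {"query": "uvA", "template": "T1", "evalue": 1e-30, "identity": 45.0, "coverage": 90.0},
    {"query": "uvB", "template": "T2", "evalue": 1e-10, "identity": 60.0, "coverage": 95.0},
]
rice_hits = [
    {"query": "osX", "template": "T2", "evalue": 1e-50, "identity": 35.0, "coverage": 85.0},
    {"query": "osY", "template": "T1", "evalue": 1e-25, "identity": 29.0, "coverage": 99.0},
]
print(interspecies_ppis([("T1", "T2")], pathogen_hits, rice_hits))
```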

RESULTS

Protein-protein interaction network for U. virens

To increase network coverage, the orthologs identified by the Inparanoid and RBH methods and the potential inparalogs were merged to build the oPPI network in U. virens (Supplementary Table S3, Supporting Information). A total of 22,120 raw oPPIs were predicted after transferring the experimentally verified PPIs in the model organisms to these orthologs and inparalogs, of which 18,244 interactions were associated with 3,044 orthologs (Supplementary Table S1, Supporting Information). Most of the predicted oPPIs were derived from S. cerevisiae (81.46%), whereas those from C. elegans accounted for only 2.29%, the lowest among the five species. In addition, 47,681 raw DDI-based interactions associated with 2,483 U. virens proteins were predicted based on experimentally determined DDIs downloaded from iPfam and 3did. A total of 2,021 dPPIs remained after filtering with the strict strategy described in the Materials and Methods section (Supplementary Figure S1A, Supporting Information). Following the pipeline summarized in Figure 1, an integrated PPI network containing 20,217 PPIs associated with 3,305 proteins was created for U. virens (Supplementary Figure S1B, Supporting Information). Notably, 1,372 interactions among 897 proteins were supported by both the interolog and DDI-based methods, which is significantly more than the overlap obtained from two randomly constructed PPI datasets (P < 0.001).
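The reciprocal-best-hit check and the transfer of template interactions to U. virens proteins can be sketched as below. This is an illustration, not the authors' code: the dictionaries of best hits, the function names and all protein identifiers are hypothetical, and the Inparanoid step is not reproduced.

```python
def reciprocal_best_hits(best_in_model, best_in_uv):
    """Keep pairs that are each other's best hit.

    best_in_model: U. virens protein -> its best hit in the model organism
    best_in_uv:    model-organism protein -> its best hit in U. virens
    """
    return {
        uv: mo for uv, mo in best_in_model.items() if best_in_uv.get(mo) == uv
    }


def invert(uv_to_model):
    """Turn a U. virens -> model map into model -> set of U. virens proteins."""
    model_to_uv = {}
    for uv, mo in uv_to_model.items():
        model_to_uv.setdefault(mo, set()).add(uv)
    return model_to_uv


def transfer_interologs(template_ppis, model_to_uv):
    """Predict a PPI whenever both partners of a template PPI are mapped."""
    predicted = set()
    for a, b in template_ppis:
        for x in model_to_uv.get(a, ()):
            for y in model_to_uv.get(b, ()):
                predicted.add(tuple(sorted((x, y))))  # undirected pair
    return predicted


# Toy data; identifiers are made up for illustration.
best_in_model = {"uvA": "Y1", "uvB": "Y2", "uvC": "Y2"}
best_in_uv = {"Y1": "uvA", "Y2": "uvB"}
rbh = reciprocal_best_hits(best_in_model, best_in_uv)
print(transfer_interologs([("Y1", "Y2"), ("Y2", "Y2")], invert(rbh)))
```

Because the mapping stores a set of U. virens proteins per template protein, inparalogs could be added to the same structure to widen coverage, as the text describes.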


[Figure 1 flowchart. Inputs: PPIs from H. sapiens, E. coli, C. elegans, D. melanogaster and S. cerevisiae for the interolog-based method; DDIs from iPfam and 3did for the DDI-based method. Steps: raw oPPIs and raw dPPIs, with their overlap; extraction of oPPIs and filtration of dPPIs; the interactome network for U. virens; web service of the database.]

Figure 1. The pipeline for predicting the protein-protein interaction (PPI) network for U. virens. The interolog-based and domain-domain interaction (DDI)-based methods were used to generate the U. virens PPI network. In the interolog-based approach, the protein interactions in U. virens were predicted by transferring the experimentally established interactions in model organisms. The interactions associated with orthologs, which were identified using Inparanoid and Reciprocal Best Hits, were considered the oPPIs. The dPPIs were predicted through the DDI-based approach followed by the stringent filtering process described in the Materials and Methods section. Finally, the oPPIs, the dPPIs and the overlap of raw oPPIs and raw dPPIs were combined to form an integrated network.

Evaluation of the predicted network

To evaluate the quality of the bioinformatics-based PPI network, three independent tests of its reliability were performed. The first approach was based on GO annotations of the predicted proteins. It is well established that two interacting proteins tend to have similar or related functions, and therefore interacting protein pairs tend to share identical GO terms.60,61 The quality of the predicted PPI network can thus be evaluated by calculating the proportion of PPIs that involve proteins with the same GO terms.
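The GO-based test relies on a depth for each term, defined in the Materials and Methods as the shortest path to a root term. A breadth-first search from the roots gives exactly that. The sketch below is illustrative: the toy hierarchy, the function names, the convention that roots have depth 0, and the propagation of annotations to ancestor terms before comparison are assumptions not stated in the source.

```python
from collections import deque


def go_depths(parents):
    """Shortest path length from each term to any root.

    parents: dict mapping a term to the list of its parent terms;
    root terms map to an empty list.
    """
    children = {}
    for term, term_parents in parents.items():
        children.setdefault(term, [])
        for parent in term_parents:
            children.setdefault(parent, []).append(term)
    roots = [t for t in children if not parents.get(t)]
    depth = {r: 0 for r in roots}  # assumption: roots sit at depth 0
    queue = deque(roots)
    while queue:
        term = queue.popleft()
        for child in children[term]:
            if child not in depth:  # first visit is the shortest path
                depth[child] = depth[term] + 1
                queue.append(child)
    return depth


def with_ancestors(terms, parents):
    """The given terms plus all of their ancestors."""
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(parents.get(term, []))
    return seen


def share_term_at_depth(terms_a, terms_b, parents, depth, d):
    """True if two proteins share at least one GO term located at depth d."""
    common = with_ancestors(terms_a, parents) & with_ancestors(terms_b, parents)
    return any(depth.get(t) == d for t in common)


# Toy hierarchy: D reaches the root through C and A (3 steps) or through B (2 steps).
toy_parents = {"root": [], "A": ["root"], "B": ["root"],
               "C": ["A"], "D": ["C", "B"], "E": ["A"]}
toy_depth = go_depths(toy_parents)
print(toy_depth["D"], share_term_at_depth(["C"], ["E"], toy_parents, toy_depth, 1))
```

The proportion reported for a network at a given depth would then be the number of non-self pairs for which `share_term_at_depth` is true, divided by the number of annotated pairs.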


The constructed PPI network of U. virens included 18,644 non-self, GO annotated interaction pairs. The proportion of the predicted interacting proteins with the same GO terms in the network was compared with that of 100 randomized networks at different depths in GO hierarchies. The performance of the constructed network was significantly better than that of randomized ones at all depths (Student’s t-test, P < 0.001) (Figure 2A). About 51% (9,561) of the predicted PPIs shared at least one GO term at depth 5, while ~30% (5,617) of the randomly selected protein pairs had the same GO term at this depth. Furthermore, the discrepancy between the predicted and randomized networks was increasingly evident with greater depths within GO hierarchies. The number of PPIs sharing identical GO terms at the depths of more than 8 in the predicted network was 20 times more than that in the randomized networks. These results indicate that the predicted network was much more accurate in connecting proteins with related functions as compared to randomized ones.

[Figure 2 panels. (A) Bar chart: percentage of interaction pairs sharing an identical GO term (%) versus depth of GO terms (3, 4, 5, 6, 7, 8, >8), for the predicted network and randomized networks. (B) Bar chart: percentage of interactions (%) versus Pearson’s correlation coefficient interval, for the predicted network and randomized networks. (C) Yeast two-hybrid spot assays at dilutions 10^0, 10^-1 and 10^-2 for the pairs AD-T/BD-53, AD-T/BD-λ, AD-2494/BD-1325, AD-1267/BD-1325, AD-1325/BD-1325, AD-860/BD-1325, AD-2516/BD-1325, AD-4506/BD-1325, AD-6503/BD-1325, AD-3187/BD-4823, AD-3334/BD-4823, AD-7716/BD-7680 and AD-7445/BD-7680.]

Figure 2. Validation of the reliability of the predicted U. virens PPI network. (A) The percentages of non-self interaction pairs sharing an identical GO term in the predicted network were significantly greater than those in randomized networks at different depths in the GO hierarchies. The proteins without GO annotations in the PPI network were excluded from this analysis. (B) The Pearson’s correlation coefficient (PCC) distribution of non-self interaction pairs in the predicted network was compared to that in randomized networks. The PCC value of each interaction pair was calculated based on FPKM values in the expression profiles during early infection of U. virens. The proteins without expression data in the PPI network were excluded from this analysis. For comparisons in (A) and (B), randomized networks were constructed using the same group of proteins as in the predicted network. (C) Eleven interactions associated with three independent proteins, UV_1325, UV_4823 and UV_7680, were verified in the yeast two-hybrid (Y2H) assay. The growth of yeast colonies on the selective quadruple dropout medium indicated a positive interaction. The assays were repeated at least three times with similar results. The pGADT7-T and pGBKT7-53 plasmids were co-transformed into the yeast Gold strain as a positive control, while pGADT7-T and pGBKT7-λ were used as a negative control. AD, pGADT7; BD, pGBKT7.

Second, the reliability of the PPI network was assessed using gene co-expression profiles, based on the hypothesis that interacting pairs have similar expression patterns.62,63 The correlation of expression profiles was calculated for each predicted interacting pair in U. virens to evaluate the quality of the predictions in the network. In our predicted network, 3,054 proteins involved in 18,405 non-self interactions were found in the expression profiles during early infection of U. virens. The PCC distribution diagram showed that the expression correlation values of PPIs in the predicted network were significantly greater than those in randomized networks, especially in the high PCC interval of 0.5 to 1.0 (Figure 2B). A total of 3,128 interacting pairs in the predicted network had a PCC value ≥ 0.8, whereas the number dropped to 1,182 in randomized networks. A similar tendency was found when Spearman’s correlation coefficients were used to evaluate the predicted PPIs (Supplementary Figure S2, Supporting Information). Collectively, these results indicate that the constructed PPI network is more reliable than random networks.

Subsequently, all predicted interactions associated with three independent nodes, UV_1325, UV_4823 and UV_7680, were investigated using the Y2H assay, except for four interactions for which we failed to make the plasmid constructs (Supplementary Figure S3, Supplementary Table S4, Supporting Information). The results showed that 7 out of 15 potential interactions for UV_1325, 2 out of 9 for UV_4823 and 2 out of 4 for UV_7680 were validated by the assays (Figure 2C). UV_1325 was self-associated and interacted strongly with UV_6503, UV_2516, UV_1267 and UV_2494, and weakly with UV_860 and UV_4506. As a negative control, UV_4823 was not associated with any of the 15 potential UV_1325-interacting partners in Y2H assays (Supplementary Figure S4A, Supporting Information). Taken together, these experimentally validated interactions further supported the reliability of the constructed PPI network in U. virens.
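The co-expression test described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy expression profiles and the randomization scheme (permuting node labels, which keeps the network topology fixed) are assumptions made here for illustration, since the randomization procedure is not specified in this section.

```python
import math
import random

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile has no defined correlation
    return sxy / math.sqrt(sxx * syy)

def count_high_pcc(edges, expr, cutoff=0.8):
    """Count non-self pairs whose expression profiles reach the PCC cutoff."""
    return sum(
        1
        for a, b in edges
        if a != b and a in expr and b in expr
        and pearson(expr[a], expr[b]) >= cutoff
    )

def randomize_edges(edges, rng):
    """Assumed null model: permute node labels, preserving the topology."""
    nodes = sorted({n for edge in edges for n in edge})
    shuffled = nodes[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(nodes, shuffled))
    return [(mapping[a], mapping[b]) for a, b in edges]

# Toy data with hypothetical protein IDs (not taken from the study).
expr = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1]}
edges = [("A", "B"), ("A", "C")]
observed = count_high_pcc(edges, expr)
random_counts = [
    count_high_pcc(randomize_edges(edges, random.Random(seed)), expr)
    for seed in range(100)
]
```

Comparing `observed` with the distribution in `random_counts` mirrors the comparison of 3,128 versus 1,182 high-PCC pairs reported above.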

Properties of the predicted network

The constructed network was displayed using Cytoscape (Supplementary Figure S5, Supporting Information).53 Proteins in the network are referred to as nodes, and the interactions between nodes as edges. The average degree of nodes, i.e. the average number of edges per node, was 12.23. This was higher than that in the predicted networks of R. solani AG1,37 and M. oryzae (Supplementary Table S5, Supporting Information),35 suggesting that this network may be more complex and may provide more opportunities for the discovery of potential interactions. Compared with randomized networks, the characteristic path length of the established network was longer (Supplementary Table S5, Supporting Information). Moreover, the predicted network had a higher average clustering coefficient (0.1539) than the randomized ones (0.0037 ± 0.0002), indicating cohesiveness of clusters in the network. These results suggest that most nodes in the network connect directly or associate with each other indirectly through a few edges, which is a typical property of a “small world” model.64 Furthermore, similar to the M. oryzae PPI network,37 the degree of nodes in the predicted network followed a power-law distribution rather than a Poisson distribution (Supplementary Figure S6A, Supporting Information), indicating that the predicted network possesses a scale-free structure.65 In a scale-free network, a few nodes with very high degrees connect many other nodes, while the overwhelming majority of nodes have only a limited number of neighbors. These highly connected nodes, also known as hubs, may have multiple functions in maintaining essential cellular processes and biological networks.

Since no node had a degree of more than 40 in randomized networks, a degree of 40 was used as the threshold to define a hub in our predicted network. In total, 214 hub proteins were identified. GO enrichment analyses revealed that proteins with GO terms in the category of molecular function, such as protein binding and translation regulation, were significantly enriched in the hubs (Supplementary Table S6, Supporting Information). Furthermore, many GO terms in the category of biological process, such as reproduction, filamentous growth, biogenesis, response to multiple stimuli, regulation of biological process and cellular localization, were also enriched (Supplementary Table S6, Supporting Information). The results were not unexpected, since many of these hubs may play roles in essential cellular processes through associating with other proteins.

Taking advantage of the expression data of U. virens, static hubs and dynamic hubs were identified based on their average PCC values with partners. The distribution of hubs against average PCC values was clearly bimodal (Supplementary Figure S6B, Supporting Information). Accordingly, the hubs could be split into two categories: static hubs with higher average PCC values (≥ 0.3) and dynamic hubs with average PCC values of < 0.3. While most static hubs were ribosomal proteins, dynamic hubs included protein kinases, DNA/RNA binding proteins and transcription-related proteins (Supplementary Table S7, Supporting Information). Compared with static hubs, which may play key roles in shaping interactions within modules, dynamic hubs are more important in connecting different complexes, in scaffolding the structure of the network and in further mediating the proteome biologically.66 For instance, many serine/threonine protein kinases, including four potential mitogen-activated protein kinases (MAPKs), were dynamic hubs in the network (Supplementary Table S7, Supporting Information). UV_2494, a homolog of Fus3/Kss1, had 92 interacting partners, and another potential MAPK (UV_368) had 55. Since MAPKs are known to mediate many signaling pathways, the importance of these kinases was appropriately reflected by the hub classification.

Aside from their distinct functions, the two categories of hubs differed greatly in topological characteristics. The average clustering coefficient of static hubs was 0.386, whereas that of dynamic hubs was 0.074 (Table 1). This difference indicates that the interacting partners of static hubs tended to interact with one another to form highly connected modules. To better explore the influence of different types of hubs on the topology of the predicted network, the network was reconstructed by randomly removing dynamic or static hubs. The characteristic path length increased significantly when dynamic hubs were removed from the network, while it remained almost unchanged when static hubs were eliminated (Supplementary Figure S6C, Supporting Information). These results suggest that these predicted dynamic hubs were more topologically important in linking nodes than the static ones.
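The hub definition and the static/dynamic split described in this section can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names and toy data are hypothetical, and treating a hub as a node whose degree is strictly greater than the cutoff is an assumption, since the text only states that a degree of 40 was used as the threshold.

```python
from collections import defaultdict
from itertools import combinations

def build_adjacency(edges):
    """Undirected adjacency sets; self-interactions are ignored."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def clustering_coefficient(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))

def classify_hubs(adj, edge_pcc, degree_cutoff=40, pcc_cutoff=0.3):
    """Label hubs as 'static' (mean PCC >= cutoff) or 'dynamic' (below it).

    Assumption: a hub is a node with degree > degree_cutoff.
    edge_pcc maps frozenset({a, b}) to the PCC of that interacting pair.
    """
    hubs = {}
    for node, nbrs in adj.items():
        if len(nbrs) <= degree_cutoff:
            continue
        values = [
            edge_pcc[frozenset((node, n))]
            for n in nbrs
            if frozenset((node, n)) in edge_pcc
        ]
        if values:
            mean_pcc = sum(values) / len(values)
            hubs[node] = "static" if mean_pcc >= pcc_cutoff else "dynamic"
    return hubs

# Toy network with hypothetical IDs; a low cutoff keeps the example small.
edges = [("H", "a"), ("H", "b"), ("H", "c"), ("a", "b")]
adj = build_adjacency(edges)
pcc = {
    frozenset(("H", "a")): 0.9,
    frozenset(("H", "b")): 0.5,
    frozenset(("H", "c")): 0.1,
}
hubs = classify_hubs(adj, pcc, degree_cutoff=2)
hub_cc = clustering_coefficient(adj, "H")
```

In this toy case node H has three partners with a mean PCC of 0.5, so it is labelled static, and one of its three partner pairs is connected, giving a clustering coefficient of 1/3.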

Pathogenicity-related subnetwork

Fungal pathogenicity is associated with large numbers of proteins that can degrade host cell walls, overcome or suppress host defenses, generate mycotoxins and mediate signaling pathways.67 Therefore, it is important to be able to predict the interactions associated with such proteins. Our previous study suggested that 1,103 genes are probably involved in pathogen-host interactions, based on comparisons with PHI-base.12 The majority of these putative pathogenicity proteins (665) were present in the PPI network. These proteins had an average degree of 14.8, which is slightly greater than that of other nodes. Among them, 53 proteins were predicted to be hubs, most of which (41) were defined as dynamic hubs, indicating their important roles in maintaining biological networks. Because a large number of putative pathogenicity proteins and corresponding interactions were predicted, the set was narrowed down using expression profiles. As reported previously, pathogenicity genes tend to be differentially regulated during infection.68,69 Based on this characteristic, a total of 110 proteins encoded by putative pathogenicity genes that were up-regulated during infection were chosen as seed nodes to build a subnetwork. The resultant pathogenicity-related subnetwork included 1,010 interactions associated with these seed nodes (Supplementary Figure S7, Supporting Information).
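The seed-based extraction described above amounts to keeping every interaction that touches at least one seed node. The following Python sketch shows the idea under stated assumptions: the function names and the toy IDs are hypothetical, and the counting of seed partners per non-seed node is added here only as a convenient summary of the resulting subnetwork.

```python
def seed_subnetwork(edges, seeds):
    """Keep every interaction in which at least one partner is a seed node."""
    seeds = set(seeds)
    return [(a, b) for a, b in edges if a in seeds or b in seeds]

def seed_partner_counts(sub_edges, seeds):
    """For each non-seed node, count the distinct seed nodes it interacts with."""
    seeds = set(seeds)
    partners = {}
    for a, b in sub_edges:
        for node, other in ((a, b), (b, a)):
            if node not in seeds and other in seeds:
                partners.setdefault(node, set()).add(other)
    return {node: len(found) for node, found in partners.items()}

# Toy network with hypothetical IDs: S1 and S2 play the role of seed nodes.
edges = [("S1", "X"), ("S2", "X"), ("S1", "Y"), ("X", "Y"), ("S1", "S2")]
seeds = {"S1", "S2"}
sub_edges = seed_subnetwork(edges, seeds)
counts = seed_partner_counts(sub_edges, seeds)
```

Here the X-Y interaction is dropped because neither partner is a seed, X is connected to two seeds and Y to one.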

ACS Paragon Plus Environment

Page 23 of 66

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Proteome Research

The degrees of seed nodes varied greatly, implying distinct functions. For example, UV_1036, a putative β-1,3-glucanosyltransferase, had only three partners (Figure 3A). In contrast, UV_428, a cell division control protein (CDC42), might interact with 33 proteins (Figure 3B). As a member of the GH72 family, UV_1036 may act as an important enzyme in fungal cell wall biogenesis.70 Two of its partners, casein kinase (UV_1965) and another GH72 protein (UV_7648), were also annotated as putative pathogenicity proteins. Although the third partner, UV_4979, an aminopeptidase, was not predicted as a PHI-base protein, it was up-regulated during the early infection process, suggesting that all four proteins are likely involved in pathogenicity. UV_428 was predicted to be a homolog of CDC42, a small GTPase of the Rho family that has been reported to interact with many proteins to affect diverse cellular functions, including cell cycle and cell polarity, in yeast.71 As shown in Figure 3B, many of its interactions were supported by both methods, and many partners were also putative pathogenicity proteins.

[Figure 3 image, panels A-D: network diagrams with nodes labelled by gene number; panel D shows Cluster #1, “Secondary metabolic process”. Legend: gene category (PHI-base, MAPK pathway, secreted, NRPS/PKS); degree (1-4, 5-14, 15-39, 40-79, >=80); GO depth (no test, 3 to >=9); PCC (0.5 to 0.9); PPI type (oPPI, dPPI, both); expression (up-regulated, other).]

Figure 3. The predicted interactions associated with several potential pathogenicity proteins in the pathogenicity-related subnetwork for U. virens. (A) The predicted interaction partners of UV_1036, a putative beta-1,3-glucanosyltransferase, were all predicted to be involved in pathogenicity. (B) UV_428, a cell division control protein, was predicted to interact with 33 partners, many of which were putative pathogenicity proteins. (C) The bridging node UV_4095, a putative MAP kinase involved in pathogenicity, was predicted to interact with four seed nodes, UV_2830, UV_4452, UV_428 and UV_7744, in the subnetwork. These proteins were all putative pathogenicity proteins and were annotated as 3-isopropylmalate dehydrogenase, plasma membrane ATPase, cell division control protein 42 and MAP kinase kinase, respectively. For simplicity, only nodes shared by these seed nodes are shown in the figure. (D) The interconnected protein cluster associated with “secondary metabolic process” was revealed by the clique percolation algorithm. Ten clusters were found in the pathogenicity-related subnetwork in this analysis (k = 3). Other clusters are shown in Supplementary Figure S8. The GO terms enriched in the cluster at depth level 4 in the GO hierarchy were identified by Fisher’s exact test followed by FDR correction. The nodes are connected by different lines (i.e. edges). The prefix “UV_” was omitted from the gene names. Four important gene categories are indicated by different circle colors. The degrees of the nodes are indicated by different circle sizes. GO depth levels are indicated by different points of the lines. The PCC values are represented by different line colors. The oPPIs are indicated by solid lines, the dPPIs by dashed lines, and interactions supported by both oPPI and dPPI by double lines. Up-regulated genes are shown by circles with a red border.

In the pathogenicity-related subnetwork, the majority of nodes (447) were “free end” nodes, which connect with only one seed node. Meanwhile, 180 bridging nodes were predicted to associate with two or more seed nodes (Supplementary Table S8, Supporting Information). Although the majority of the bridging nodes were not up-regulated during early infection, they might also be involved in pathogenicity. Interestingly, 19 out of 26 highly connected bridging nodes, each interacting with four or more seed nodes, were predicted PHI-base proteins.12 Many proteins involved in the MAPK-mediated pathways were found to be bridging nodes, while several were predicted as seed nodes (Supplementary Figure S7, Supporting Information). UV_4095, the putative MAP kinase Slt2, was predicted to interact with four seed nodes, including the MAPKK UV_7744 and UV_428 (Figure 3C). Slt2 is known to be important for cell wall integrity and pathogenicity in many plant pathogenic fungi.72 Thus, UV_4095 may be essential for successful infection by U. virens, although it was not differentially expressed during early infection.12 These results suggest that bridging nodes can facilitate the prediction of pathogenicity genes.

In addition, individual pathogenicity proteins in the predicted network functioning in the same signaling pathway or within the same biological process were clustered using the clique percolation algorithm.73 Ten clusters were found in the pathogenicity-related subnetwork (k = 3) (Figure 3D and Supplementary Figure S8, Supporting Information). GO terms such as secondary metabolic process, cellular homeostasis, sulfur compound transport and carbohydrate metabolic process were enriched in these clusters. Interestingly, the secondary metabolic process cluster, composed of eight proteins, was found to be highly interconnected. These proteins were all putative pathogenicity proteins, and four of them were up-regulated during early U. virens infection (Figure 3D). Therefore, the densely connected clusters provided important information for predicting novel pathogenicity proteins.
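For readers unfamiliar with clique percolation, the k = 3 case used above can be sketched compactly: every triangle (3-clique) is found, triangles that share an edge (k - 1 = 2 nodes) are merged, and each merged group of nodes forms one cluster. The Python below is a minimal illustration with hypothetical node names, not the software used in the study, and it handles only k = 3.

```python
from collections import defaultdict
from itertools import combinations

def triangle_communities(edges):
    """Clique percolation for k = 3: merge triangles that share an edge."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)

    # Enumerate every triangle exactly once as a frozenset of three nodes.
    found = set()
    for a in adj:
        for b, c in combinations(sorted(adj[a]), 2):
            if c in adj[b]:
                found.add(frozenset((a, b, c)))
    triangles = sorted(found, key=sorted)

    # Union-find over triangle indices.
    parent = list(range(len(triangles)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Two triangles are adjacent when they have an edge (two nodes) in common.
    by_edge = defaultdict(list)
    for idx, tri in enumerate(triangles):
        for pair in combinations(sorted(tri), 2):
            by_edge[pair].append(idx)
    for members in by_edge.values():
        for other in members[1:]:
            parent[find(other)] = find(members[0])

    communities = defaultdict(set)
    for idx, tri in enumerate(triangles):
        communities[find(idx)] |= tri
    return sorted(sorted(nodes) for nodes in communities.values())

# Toy network: two edge-sharing triangles, one separate triangle, one bridge.
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),   # triangle a-b-c
    ("b", "d"), ("c", "d"),               # triangle b-c-d shares edge b-c
    ("e", "f"), ("f", "g"), ("e", "g"),   # separate triangle e-f-g
    ("d", "e"),                           # bridge edge, in no triangle
]
clusters = triangle_communities(edges)
```

The bridge edge d-e belongs to no triangle, so the two groups stay separate; a node may still appear in more than one cluster when its triangles share only that node.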

Interspecies PPI network with Oryza sativa

In our predicted PPI network, 118 out of 628 putative secreted proteins were predicted to interact with 423 other proteins. These secreted proteins had an average degree of 6.37, which is significantly less than that of other proteins in the network (Student’s t-test, P < 0.001; Supplementary Table S9, Supporting Information). Since many of these proteins are secreted and delivered into host cells, it is of interest to examine their interactions with host proteins.45,74 Therefore, a U. virens-rice interspecies PPI network associated with secreted proteins in U. virens was constructed. A total of 1,319 interactions were predicted through the interolog-based method and 2,281 through the DDI-based method, resulting in 3,595 interactions between U. virens and rice (Supplementary Figure S9 and Table S10, Supporting Information). In the interspecies PPI network, 97 secreted proteins, covering 15.5% of the secretome in U. virens, were predicted to interact with 1,818 proteins in rice. Interestingly, 44 of these secreted proteins were also putative PHI-base proteins, which were significantly enriched in the interspecies PPI network (Chi-squared test, P < 0.01). This result supports the hypothesis that fungal secreted proteins are more likely to be involved in pathogenicity.75

Many secreted proteins were predicted to have multiple interacting partners in rice (Supplementary Figure S9, Supporting Information). In particular, 12 secreted proteins were predicted to be hubs using a degree cutoff of 65, since no secreted protein had a degree of more than 65 in random networks. Six of them, including one putative effector protein (UV_44), were PHI-base proteins. To better understand the functions of interacting proteins in the network, the GO annotation of each protein was analyzed and summarized. The most highly enriched terms in the category of molecular function were peptidase activity, hydrolase activity and glucanosyltransferase activity (Figure 4A). Proteases secreted by fungal pathogens are known as potential virulence factors and are required for successful infection.76 The enriched terms indicate that these peptidase-related proteins may exert their functions through interacting with host proteins. In the category of biological process, protein metabolic process, response to host, cell wall organization or biogenesis, and organic substance catabolic process were significantly enriched in the 97 U. virens secreted proteins (Figure 4A).
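The merging of interolog-based and DDI-based predictions into one interspecies network can be pictured as a set union in which each pair is tagged by its supporting method. The sketch below uses the oPPI/dPPI/both labels from the figure legends; the function name and the toy identifiers are hypothetical. If the reported total of 3,595 is such a union, the counts above would imply 1,319 + 2,281 - 3,595 = 5 pairs supported by both methods, although this section does not state that number explicitly.

```python
def merge_predictions(oppi, dppi):
    """Union of two prediction sets, tagging each pair by supporting method.

    Each input is an iterable of (pathogen_protein, host_protein) pairs.
    """
    o, d = set(oppi), set(dppi)
    merged = {}
    for pair in o | d:
        if pair in o and pair in d:
            merged[pair] = "both"
        elif pair in o:
            merged[pair] = "oPPI"
        else:
            merged[pair] = "dPPI"
    return merged

# Toy predictions with hypothetical identifiers.
oppi = [("P1", "R1"), ("P1", "R2")]
dppi = [("P1", "R2"), ("P2", "R3")]
merged = merge_predictions(oppi, dppi)
n_both = sum(1 for tag in merged.values() if tag == "both")
```

The size of the union equals the sum of the two input sizes minus the number of pairs tagged "both", which is the arithmetic used in the lead-in.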

[Figure 4 image. Panel A: paired bar chart of representative enriched GO terms for U. virens and Oryza sativa proteins; x-axis: percentage of proteins in specific GO terms (%), 0 to 45; bars are grouped into molecular function and biological process. Panel B: UV_7256 and its predicted rice partners, each labelled with its gene name and RiceNet degree. Legend: subcellular location of Oryza sativa node (chloroplast, cytosol, nuclear, mitochondria, extracellular); PPI type (oPPI, dPPI, both); degree of Oryza sativa node in RiceNet (1-19, 20-49, 50-99, 100-199, 200-499, >=500).]

Figure 4. Characteristics of the interspecies network between U. virens and Oryza sativa revealed by GO enrichment analyses and exemplified by interaction partners of UV_7256. (A) The representative enriched GO terms in the categories of molecular function (orange) and biological process (green) for U. virens and Oryza sativa proteins in the interspecies PPI network were revealed by GO annotation. (B) UV_7256, a putative secreted carboxypeptidase, was predicted to interact with 25 rice proteins in the interspecies network. The average degree of these rice proteins in RiceNet was 334.5, which was significantly higher than the average of all proteins (64.0). The prefix “LOC_Os” was omitted from the gene names in rice. The degree of each node is indicated by the number in parentheses below the gene name. Subcellular localizations of Oryza sativa nodes are represented by different circle colors.

As for rice proteins in the interspecies network, some GO terms in the category of molecular function, such as nucleotide binding and signal transducer activity, were overrepresented. Several terms in the category of biological process, including transport, signal transduction, response to stimulus and protein metabolic process, were highly enriched (Figure 4A). Interestingly, more than 50 rice proteins were predicted to interact with seven or more secreted proteins of U. virens (Supplementary Table S11, Supporting Information). The majority of these rice proteins were calcium-dependent protein kinases (CDPKs), protein inhibitors, phosphatases and cyclin-dependent kinases. It is well known that CDPKs are important Ca2+ sensor proteins that transmit signals to activate downstream signaling processes, and are often triggered by effectors or pathogen-associated molecular patterns.77 These predicted interactions indicate that CDPKs may be not only essential signaling components in rice, but also potential targets of pathogen secreted proteins.

The average degree of rice proteins present in the interspecies network (132.3) was markedly higher than the average of all proteins (64.0) in the RiceNet PPI network.17 Several U. virens nodes, including carboxypeptidase, glucosidase and glucosyl hydrolases, had rice partners with an average degree of more than 200 in RiceNet (Supplementary Table S12, Supporting Information). For example, UV_7256, a putative secreted carboxypeptidase, was predicted to interact with 25 rice proteins (Figure 4B). The average degree of these partners in RiceNet was 334.5, implying that they may act as hubs in the rice PPI network. These results suggest that these U. virens proteins may have special roles in targeting hubs in the rice PPI network.
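The comparison of partner degrees described above reduces to averaging, for each pathogen protein, the host-network degree of its predicted host partners. The following Python sketch shows that calculation with hypothetical identifiers and degree values; it does not reproduce RiceNet data or the authors' scripts.

```python
def mean_partner_degree(interspecies_edges, host_degree):
    """Average host-network degree of the host partners of each pathogen protein.

    interspecies_edges: iterable of (pathogen_protein, host_protein) pairs.
    host_degree: mapping from host protein to its degree in the host network.
    Host partners without a known degree are skipped.
    """
    partners = {}
    for pathogen, host in interspecies_edges:
        partners.setdefault(pathogen, set()).add(host)
    result = {}
    for pathogen, hosts in partners.items():
        degrees = [host_degree[h] for h in hosts if h in host_degree]
        if degrees:
            result[pathogen] = sum(degrees) / len(degrees)
    return result

# Toy data with hypothetical identifiers and degrees.
edges = [("P1", "R1"), ("P1", "R2"), ("P2", "R3"), ("P2", "R4")]
host_degree = {"R1": 300, "R2": 400, "R3": 10}
averages = mean_partner_degree(edges, host_degree)
```

A pathogen protein whose average is far above the host-network mean, as reported for UV_7256 (334.5 versus 64.0), would be flagged as preferentially contacting host hubs.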

Availability of the databases

An online resource was constructed to give public access to the U. virens PPI network and the U. virens-rice interspecies PPI network at http://sunlab.cau.edu.cn/uvpid/ (Figure 5). In this interactive system, both local U. virens protein IDs (UV_0000, UV8b_0000) and NCBI GI numbers can be used to query the database. In the interspecies network, a rice protein ID in MSU rice gene format (e.g. LOC_Os01g52880) is accepted as long as the protein is included in the network. The output consists of the query and a table listing the predicted interacting partners. By selecting a partner protein, the user can access all interactions associated with that protein. Furthermore, an interactive network for the query protein is integrated into the system. The characteristics of nodes and edges mentioned above are displayed in the same way, making it simple to explore the related interactions. The list of interacting pairs in the database can be downloaded from the website for further investigation.

Figure 5. Availability of the U. virens PPI network database. The PPI network can be accessed from the website http://sunlab.cau.edu.cn/uvpid/. The interaction partners of a specific protein can be queried by the local protein ID, such as UV_1325, and NCBI GI number. A table listing the interaction partners will be presented as output. The protein interaction diagram with other information, such as the degree of nodes, PCC value, and the depth level in GO hierarchy can be accessed by clicking the Figure button above the table.

DISCUSSION

As an emerging fungal disease, rice false smut has become increasingly important in commercial rice production. Therefore, understanding protein-protein interactions underlying U. virens pathogenicity can greatly facilitate the development of strategies for disease management. In this study, a PPI network containing over 20 thousand protein-protein interactions was constructed based on the predicted U. virens proteome using an integrated pipeline. In addition, a pathogenicity-related PPI subnetwork was generated using PHI-base genes and their expression profiles during infection.

Multiple strategies were used to improve the accuracy of the network. First, only experimentally verified PPIs or DDIs were used for the interolog and DDI predictions, respectively. Second, both approaches were accompanied by stringent filtering processes in an attempt to eliminate possible false positive PPIs. Although criteria such as “ortholog selection” and “interaction supported by all domains” may cause biases and exclude a substantial number of interacting pairs, they should increase the accuracy of the prediction at the expense of coverage. Furthermore, the reliability of the predicted network was validated by different computation-based and experiment-based approaches, including the GO annotation test, the PCC test and the Y2H assay.

In the Y2H assay, 11 out of 28 tested PPIs associated with three independent nodes were validated. Besides self-association, five out of the six validated interacting partners of UV_1325 were predicted as PHI-base proteins (Figure 2C). For example, the mutants of Ste50, Ste11 and Ste7, the homologous genes of UV_2516, UV_1325 and UV_6503, respectively, were defective in conidiation and pathogenicity in Fusarium graminearum.78 Another partner, UV_2494, is the homolog of Fus3/Kss1, which is known as a pathogenicity MAPK and is required for virulence in many phytopathogenic fungi.79 UV_1325 was also predicted to interact with UV_7828 and UV_1888, which encode the homologs of Pbs2 and Hog1, respectively, but these interactions were not confirmed in the Y2H assays. Several possibilities might explain these results. First, these U. virens proteins might not be well expressed in yeast, so that they cannot activate the reporter genes. Second, some weak interactions might be difficult to detect under strict selection conditions with the quadruple dropout media. Only two UV_4823-associated interactions were confirmed in the quadruple dropout media, but eight interacting pairs were able to grow in the triple dropout media lacking Leu, Trp and His (Supplementary Figure S4B, Supporting Information). In addition, we cannot rule out the possibility that significant PPI differences exist between U. virens and model species such as S. cerevisiae. Collectively, these results support the reliability of the predicted PPI network.

The topology of the network revealed special characteristics of the biological network. The high clustering coefficient, short characteristic path length and degree distribution suggest that the network is an empirical example of a “small world” network with a “scale-free” structure (Supplementary Figure S6A, Supporting Information). The topology of the predicted network was consistent not only with that of the predicted M. oryzae PPI network,37 but also with that of experimentally built biological PPI networks in other organisms.80 It is generally accepted that proteins with higher degrees are more likely to be essential for biological functions and processes.81 This hypothesis was tested by analyzing 84 protein kinases in U. virens, which were matched to their homologs in F. graminearum. Gene-deletion mutants for all 84 kinase-encoding genes in F. graminearum have been investigated for phenotypes,82 and it was found here that the average degree of the U. virens protein kinases whose homolog mutations cause observable phenotypic changes in F. graminearum was significantly higher than that of the kinases whose homolog mutations cause no visible phenotypic change, supporting this hypothesis (Supplementary Table S13, Supporting Information).

The hub proteins identified from the network topology facilitate the discovery of functionally essential proteins in the network. Dynamic and static hubs were identified based on the average PCC values of the hubs. Many proteins involved in signal transduction, regulation, chaperone function and stimulus responses were categorized as dynamic hubs, indicating that these hubs are more likely to play important roles in biological functions and processes. In addition, dynamic hubs may have important functions in pathogenicity: most of the PHI-base proteins predicted as hubs were dynamic hubs. Therefore, hub classification can not only deepen the understanding of network structure, but also help to predict proteins essential for growth and development and for pathogenicity.

A pathogenicity-related subnetwork, in combination with transcriptome data, can be considered robust in the prediction of novel pathogenicity proteins. For example, UV_2830, a putative 3-isopropylmalate dehydrogenase, was predicted to be a dynamic hub, and was up-regulated during early infection of U. virens. Furthermore, UV_2830 was found to connect to three disparate interacting protein clusters, in which proteins with GO terms including macromolecule metabolic process, carbohydrate metabolic process and regulation of biological process were significantly enriched (Supplementary Figure S8, Supporting Information). The M. oryzae homolog of this enzyme, which is involved in amino acid biosynthesis, was previously reported to be essential for M. oryzae pathogenicity.83 We therefore speculate that UV_2830 is an indispensable pathogenicity protein in U. virens.

In addition, bridging nodes in the subnetwork can also be exploited to predict novel pathogenicity proteins. UV_2443, an importin alpha subunit encoded by a non-PHI-base gene, was predicted to interact with nine seed nodes, including two lyases and three transferases (Supplementary Figure S10, Supporting Information). These seed nodes are predicted with confidence to be pathogenicity proteins, implying that UV_2443 might be an important pathogenicity protein. This speculation is consistent with the previous finding that PsIMPA1, which shares 41.8% identity with UV_2443, is required for Phytophthora sojae pathogenicity.84 The PsIMPA1 mutants exhibited a decreased growth rate and could neither form sporangia nor release zoospores. Similarly, several signaling proteins, such as UV_7103 (histidine kinase HHK5p), UV_6310 (CDC28), UV_1267 (G-protein alpha subunit GPA1) and UV_2494 (Fus3/Kss1), had three seed partners. These proteins are known to be pathogenicity-associated proteins.72,85-87

A draft interspecies PPI network between U. virens and rice was also constructed, reflecting the co-evolutionary interaction of the host-pathogen pair. The U. virens-rice PPI network is a valuable resource and is expected to bring new insights into U. virens pathogenicity. Rice proteins in this interspecies network were significantly enriched in biological processes such as transport and signal transduction, suggesting that they act as potential targets of U. virens. Interestingly, the putative effector proteins UV_1044 and UV_5851 may interact with an LRR family protein, LOC_Os01g52880, in rice. Upon checking the raw dPPIs, we found that both putative effectors were predicted to interact with several NBS-LRR proteins. To our knowledge, no gene-for-gene resistance against U. virens has been reported in rice. However, the interactions predicted here suggest the possible existence of R gene-mediated resistance, which is yet to be confirmed experimentally.

In summary, this is the first systematic, high-confidence PPI network of U. virens at the proteome level. The pathogenicity-related subnetwork was extracted to provide insights into the molecular mechanisms underlying U. virens pathogenicity. A draft interspecies PPI network with Oryza sativa was also generated to provide insights into U. virens virulence and pathogenicity. These valuable resources are freely available on the web and can provide helpful guidance for designing experiments for future functional studies on U. virens proteins.


ASSOCIATED CONTENT

Supporting Information

Supporting Information file 1: Supplementary Figures S1-S10.

Figure S1: Venn diagram showing the composition of the predicted dPPIs and PPI network for U. virens;

Figure S2: Spearman’s correlation coefficient distribution of non-self interaction pairs in the predicted network;

Figure S3: the interactions associated with UV_1325, UV_4823 and UV_7680 in the predicted network;

Figure S4: the yeast two-hybrid assay showing negative controls and the UV_4823-associated interactions;

Figure S5: the predicted PPI network for U. virens;

Figure S6: the characteristics of the predicted PPI network and potential hubs;

Figure S7: the pathogenicity-related PPI subnetwork for U. virens;

Figure S8: ten closely interconnected clusters identified in the pathogenicity-related subnetwork;


Figure S9: the interspecies PPI network between U. virens and Oryza sativa;

Figure S10: the predicted interactions associated with a bridging node UV_2443 in the pathogenicity-related subnetwork.

Supporting Information file 2: Supplementary Tables S1-S13.

Table S1: summary of the protein-protein interaction network constructed through the interolog-based method;

Table S2: the primers used for the construction of pGADT7 and pGBKT7 plasmids in this study;

Table S3: annotations of the U. virens proteins that were identified as the orthologs of proteins in five model organisms;

Table S4: annotations of the interaction pairs tested by the yeast two-hybrid assay;

Table S5: characteristics of the predicted PPI networks for U. virens and M. oryzae and randomized networks;

Table S6: GO enrichment analysis of potential hubs in the predicted PPI network;

Table S7: average Pearson's correlation coefficient values of static and dynamic hubs with their partners;

Table S8: novel potential pathogenicity proteins that were predicted to interact with at least one seed node;


Table S9: protein-protein interactions associated with putative secreted proteins in the predicted network for U. virens;

Table S10: summary of the interspecies PPI network between U. virens and Oryza sativa;

Table S11: rice proteins interacting with multiple (7-9) U. virens proteins in the interspecies PPI network;

Table S12: U. virens nodes in the interspecies PPI network having rice interacting partners that have extremely high degrees (≥200) in RiceNet;

Table S13: phenotypes of gene-deletion mutants of 84 protein kinase-encoding genes in F. graminearum that are homologous to U. virens genes.

This material is available via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author

*Tel/Fax: +86 10 6273 3532. E-mail: [email protected].

Author Contributions

KZ and WS conceived and designed the study. KZ wrote programs, analyzed the data, and developed the database. YL and TL conducted the yeast two-hybrid experiments. ZL, TH and WS analyzed the data. KZ, TH, ZZ and WS wrote the manuscript. All the authors have read and approved the final manuscript.

Notes


The authors declare no competing financial interest.

ACKNOWLEDGMENTS

The work was supported by the National Natural Science Foundation of China grant 31371728, Key Projects in the National Science & Technology Pillar Program 2012BAD19B03, and the 111 project B13006 to W. S.


REFERENCES

(1) Tanaka, E.; Ashizawa, T.; Sonoda, R.; Tanaka, C. Villosiclava virens gen. nov., comb. nov., teleomorph of Ustilaginoidea virens, the causal agent of rice false smut. Mycotaxon 2008, 106, 491-501.

(2) Guo, X.; Li, Y.; Fan, J.; Li, L.; Huang, F.; Wang, W. Progress in the study of false smut disease in rice. J. Agric. Sci. Technol. A 2012, 2 (11), 1211-1217.

(3) Tang, Y. X.; Jin, J.; Hu, D. W.; Yong, M. L.; Xu, Y.; He, L. P. Elucidation of the infection process of Ustilaginoidea virens (teleomorph: Villosiclava virens) in rice spikelets. Plant Pathol. 2013, 62 (1), 1-8.

(4) Li, Y.; Koiso, Y.; Kobayashi, H.; Hashimoto, Y.; Iwasaki, S. Ustiloxins, new antimitotic cyclic peptides: interaction with porcine brain tubulin. Biochem. Pharmacol. (Amsterdam, Neth.) 1995, 49 (10), 1367-1372.

(5) Koyama, K.; Natori, S. Further characterization of seven bis(naphtho-γ-pyrone) congeners of ustilaginoidins, coloring matters of Claviceps virens (Ustilaginoidea virens). Chem. Pharm. Bull. 1988, 36 (1), 146-152.

(6) Sun, X.; Kang, S.; Zhang, Y.; Tan, X.; Yu, Y.; He, H.; Zhang, X.; Liu, Y.; Wang, S.; Sun, W.; Cai, L.; Li, S. Genetic diversity and population structure of rice pathogen Ustilaginoidea virens in China. PLoS One 2013, 8 (9), e76879.


(7) Wang, F.; Zhang, S.; Liu, M. G.; Lin, X. S.; Liu, H. J.; Peng, Y. L.; Lin, Y.; Huang, J. B.; Luo, C. X. Genetic diversity analysis reveals that geographical environment plays a more important role than rice cultivar in Villosiclava virens population selection. Appl. Environ. Microbiol. 2014, 80 (9), 2811-2820.

(8) Kim, K. W.; Park, E. W. Ultrastructure of spined conidia and hyphae of the rice false smut fungus Ustilaginoidea virens. Micron 2007, 38 (6), 626-631.

(9) Fu, R.; Ding, L.; Zhu, J.; Li, P.; Zheng, A.-P. Morphological structure of propagules and electrophoretic karyotype analysis of false smut Villosiclava virens in rice. J. Microbiol. (Seoul, Repub. Korea) 2012, 50 (2), 263-269.

(10) Koiso, Y.; Li, Y.; Iwasaki, S.; Hanaoka, K.; Kobayashi, T.; Sonoda, R.; Fujita, Y.; Yaegashi, H.; Sato, Z. Ustiloxin, antimitotic cyclic peptides from false smut balls on rice panicles caused by Ustilaginoidea virens. J. Antibiot. 1994, 47 (7), 765-773.

(11) Shan, T.; Sun, W.; Liu, H.; Gao, S.; Lu, S.; Wang, M.; Sun, W.; Chen, Z.; Wang, S.; Zhou, L. Determination and analysis of ustiloxins A and B by LC-ESI-MS and HPLC in false smut balls of rice. Int. J. Mol. Sci. 2012, 13 (9), 11275-11287.

(12) Zhang, Y.; Zhang, K.; Fang, A.; Han, Y.; Yang, J.; Xue, M.; Bao, J.; Hu, D.; Zhou, B.; Sun, X.; Li, S.; Wen, M.; Yao, N.; Ma, L.-J.; Liu, Y.; Zhang, M.; Huang, F.; Luo, C.; Zhou, L.; Li, J.; Chen, Z.; Miao, J.; Wang, S.; Lai, J.; Xu, J.-R.; Hsiang, T.; Peng, Y.-L.; Sun, W. Specific adaptation of Ustilaginoidea virens in occupying host florets revealed by comparative and functional genomics. Nat. Commun. 2014, 5, 3849.

(13) Tsukui, T.; Nagano, N.; Umemura, M.; Kumagai, T.; Terai, G.; Machida, M.; Asai, K. Ustiloxins, fungal cyclic peptides, are ribosomally synthesized in Ustilaginoidea virens. Bioinformatics 2015, 31 (7), 981-985.

(14) Hartwell, L. H.; Hopfield, J. J.; Leibler, S.; Murray, A. W. From molecular to modular cell biology. Nature 1999, 402 (Suppl 6761), C47-52.

(15) Gonzalez-Fernandez, R.; Prats, E.; Jorrin-Novo, J. V. Proteomics of plant pathogenic fungi. J. Biomed. Biotechnol. 2010, 2010, 932527.

(16) Song, W. Y.; Wang, G. L.; Chen, L. L.; Kim, H. S.; Pi, L. Y.; Holsten, T.; Gardner, J.; Wang, B.; Zhai, W. X.; Zhu, L. H.; Fauquet, C.; Ronald, P. A receptor kinase-like protein encoded by the rice disease resistance gene, Xa21. Science 1995, 270 (5243), 1804-1806.

(17) Lee, I.; Seo, Y. S.; Coltrane, D.; Hwang, S.; Oh, T.; Marcotte, E. M.; Ronald, P. C. Genetic dissection of the biotic stress response using a genome-scale gene network for rice. Proc. Natl. Acad. Sci. U. S. A. 2011, 108 (45), 18548-18553.

(18) Liu, X.; Tang, W. H.; Zhao, X. M.; Chen, L. A network approach to predict pathogenic genes for Fusarium graminearum. PLoS One 2010, 5 (10), e13021.


(19) Fields, S.; Song, O.-k. A novel genetic system to detect protein-protein interactions. Nature 1989, 340 (6230), 245-246.

(20) Gingras, A. C.; Gstaiger, M.; Raught, B.; Aebersold, R. Analysis of protein complexes using mass spectrometry. Nat. Rev. Mol. Cell Biol. 2007, 8 (8), 645-654.

(21) Stelzl, U.; Worm, U.; Lalowski, M.; Haenig, C.; Brembeck, F. H.; Goehler, H.; Stroedicke, M.; Zenkner, M.; Schoenherr, A.; Koeppen, S.; Timm, J.; Mintzlaff, S.; Abraham, C.; Bock, N.; Kietzmann, S.; Goedde, A.; Toksoz, E.; Droege, A.; Krobitsch, S.; Korn, B.; Birchmeier, W.; Lehrach, H.; Wanker, E. E. A human protein-protein interaction network: a resource for annotating the proteome. Cell 2005, 122 (6), 957-968.

(22) Giot, L.; Bader, J. S.; Brouwer, C.; Chaudhuri, A.; Kuang, B.; Li, Y.; Hao, Y. L.; Ooi, C. E.; Godwin, B.; Vitols, E.; Vijayadamodar, G.; Pochart, P.; Machineni, H.; Welsh, M.; Kong, Y.; Zerhusen, B.; Malcolm, R.; Varrone, Z.; Collis, A.; Minto, M.; Burgess, S.; McDaniel, L.; Stimpson, E.; Spriggs, F.; Williams, J.; Neurath, K.; Ioime, N.; Agee, M.; Voss, E.; Furtak, K.; Renzulli, R.; Aanensen, N.; Carrolla, S.; Bickelhaupt, E.; Lazovatsky, Y.; DaSilva, A.; Zhong, J.; Stanyon, C. A.; Finley, R. L.; White, K. P.; Braverman, M.; Jarvie, T.; Gold, S.; Leach, M.; Knight, J.; Shimkets, R. A.; McKenna, M. P.; Chant, J.; Rothberg, J. M. A protein interaction map of Drosophila melanogaster. Science 2003, 302 (5651), 1727-1736.


(23) Uetz, P.; Giot, L.; Cagney, G.; Mansfield, T. A.; Judson, R. S.; Knight, J. R.; Lockshon, D.; Narayan, V.; Srinivasan, M.; Pochart, P.; Qureshi-Emili, A.; Li, Y.; Godwin, B.; Conover, D.; Kalbfleisch, T.; Vijayadamodar, G.; Yang, M.; Johnston, M.; Fields, S.; Rothberg, J. M. A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 2000, 403 (6770), 623-627.

(24) Butland, G.; Peregrin-Alvarez, J. M.; Li, J.; Yang, W. H.; Yang, X. C.; Canadien, V.; Starostine, A.; Richards, D.; Beattie, B.; Krogan, N.; Davey, M.; Parkinson, J.; Greenblatt, J.; Emili, A. Interaction network containing conserved and essential protein complexes in Escherichia coli. Nature 2005, 433 (7025), 531-537.

(25) Doerks, T.; Copley, R. R.; Schultz, J.; Ponting, C. P.; Bork, P. Identification of potential interaction networks using sequence-based searches for conserved protein-protein interactions or "interologs". Genome Res. 2002, 12 (1), 47-56.

(26) Deng, M.; Mehta, S.; Sun, F.; Chen, T. Inferring domain-domain interactions from protein-protein interactions. Genome Res. 2002, 12 (10), 1540-1548.

(27) Ng, S. K.; Zhang, Z.; Tan, S. H. Integrative approach for computationally inferring protein domain interactions. Bioinformatics 2003, 19 (8), 923-929.

(28) Ogmen, U.; Keskin, O.; Aytuna, A. S.; Nussinov, R.; Gursoy, A. PRISM: protein interactions by structural matching. Nucleic Acids Res. 2005, 33 (Suppl 2), W331-W336.


(29) Ideker, T.; Ozier, O.; Schwikowski, B.; Siegel, A. F. Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics 2002, 18 (Suppl 1), S233-240.

(30) Jothi, R.; Kann, M. G.; Przytycka, T. M. Predicting protein-protein interaction by searching evolutionary tree automorphism space. Bioinformatics 2005, 21 (Suppl 1), i241-250.

(31) Shen, J.; Zhang, J.; Luo, X.; Zhu, W.; Yu, K.; Chen, K.; Li, Y.; Jiang, H. Predicting protein-protein interactions based only on sequences information. Proc. Natl. Acad. Sci. U. S. A. 2007, 104 (11), 4337-4341.

(32) Gu, H.; Zhu, P.; Jiao, Y.; Meng, Y.; Chen, M. PRIN: a predicted rice interactome network. BMC Bioinf. 2011, 12 (1), 161.

(33) Geisler-Lee, J.; O'Toole, N.; Ammar, R.; Provart, N. J.; Millar, A. H.; Geisler, M. A predicted interactome for Arabidopsis. Plant Physiol. 2007, 145 (2), 317-329.

(34) Lee, I.; Ambaru, B.; Thakkar, P.; Marcotte, E. M.; Rhee, S. Y. Rational association of genes with traits using a genome-scale gene network for Arabidopsis thaliana. Nat. Biotechnol. 2010, 28 (2), 149-156.

(35) He, F.; Zhang, Y.; Chen, H.; Zhang, Z.; Peng, Y. L. The prediction of protein-protein interaction networks in rice blast fungus. BMC Genomics 2008, 9 (1), 519.


(36) Zhao, X. M.; Zhang, X. W.; Tang, W. H.; Chen, L. FPPI: Fusarium graminearum protein-protein interaction database. J. Proteome Res. 2009, 8 (10), 4714-4721.

(37) Lei, D.; Lin, R.; Yin, C.; Li, P.; Zheng, A. Global protein-protein interaction network of rice sheath blight pathogen. J. Proteome Res. 2014, 13 (7), 3277-3293.

(38) Guo, J.; Li, H.; Chang, J.-W.; Lei, Y.; Li, S.; Chen, L.-L. Prediction and characterization of protein–protein interaction network in Xanthomonas oryzae pv. oryzae PXO99A. Research in Microbiology 2013, 164 (10), 1035-1044.

(39) Hirsh, E.; Sharan, R. Identification of conserved protein complexes based on a model of protein network evolution. Bioinformatics 2007, 23 (2), e170-176.

(40) Xenarios, I.; Salwinski, L.; Duan, X. J.; Higney, P.; Kim, S. M.; Eisenberg, D. DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Res. 2002, 30 (1), 303-305.

(41) Remm, M.; Storm, C. E.; Sonnhammer, E. L. Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. J. Mol. Biol. 2001, 314 (5), 1041-1052.

(42) O'Brien, K. P.; Remm, M.; Sonnhammer, E. L. Inparanoid: a comprehensive database of eukaryotic orthologs. Nucleic Acids Res. 2005, 33 (Database issue), D476-480.


(43) Moreno-Hagelsieb, G.; Latimer, K. Choosing BLAST options for better detection of orthologs as reciprocal best hits. Bioinformatics 2008, 24 (3), 319-324.

(44) Altschul, S. F.; Madden, T. L.; Schaffer, A. A.; Zhang, J.; Zhang, Z.; Miller, W.; Lipman, D. J. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25 (17), 3389-3402.

(45) Li, Z. G.; He, F.; Zhang, Z.; Peng, Y. L. Prediction of protein-protein interactions between Ralstonia solanacearum and Arabidopsis thaliana. Amino Acids 2012, 42 (6), 2363-2371.

(46) Finn, R. D.; Bateman, A.; Clements, J.; Coggill, P.; Eberhardt, R. Y.; Eddy, S. R.; Heger, A.; Hetherington, K.; Holm, L.; Mistry, J.; Sonnhammer, E. L.; Tate, J.; Punta, M. Pfam: the protein families database. Nucleic Acids Res. 2014, 42 (Database issue), D222-230.

(47) Finn, R. D.; Miller, B. L.; Clements, J.; Bateman, A. iPfam: a database of protein family and domain interactions found in the Protein Data Bank. Nucleic Acids Res. 2014, 42 (Database issue), D364-373.

(48) Mosca, R.; Ceol, A.; Stein, A.; Olivella, R.; Aloy, P. 3did: a catalog of domain-based interactions of known three-dimensional structure. Nucleic Acids Res. 2014, 42 (Database issue), D374-379.


(49) Horton, P.; Park, K. J.; Obayashi, T.; Fujita, N.; Harada, H.; Adams-Collier, C. J.; Nakai, K. WoLF PSORT: protein localization predictor. Nucleic Acids Res. 2007, 35 (Web Server issue), W585-587.

(50) Ashburner, M.; Ball, C. A.; Blake, J. A.; Botstein, D.; Butler, H.; Cherry, J. M.; Davis, A. P.; Dolinski, K.; Dwight, S. S.; Eppig, J. T.; Harris, M. A.; Hill, D. P.; Issel-Tarver, L.; Kasarskis, A.; Lewis, S.; Matese, J. C.; Richardson, J. E.; Ringwald, M.; Rubin, G. M.; Sherlock, G. Gene ontology: tool for the unification of biology. Nat. Genet. 2000, 25 (1), 25-29.

(51) Harris, M. A.; Clark, J.; Ireland, A.; Lomax, J.; Ashburner, M.; Foulger, R.; Eilbeck, K.; Lewis, S.; Marshall, B.; Mungall, C.; Richter, J.; Rubin, G. M.; Blake, J. A.; Bult, C.; Dolan, M.; Drabkin, H.; Eppig, J. T.; Hill, D. P.; Ni, L.; Ringwald, M.; Balakrishnan, R.; Cherry, J. M.; Christie, K. R.; Costanzo, M. C.; Dwight, S. S.; Engel, S.; Fisk, D. G.; Hirschman, J. E.; Hong, E. L.; Nash, R. S.; Sethuraman, A.; Theesfeld, C. L.; Botstein, D.; Dolinski, K.; Feierbach, B.; Berardini, T.; Mundodi, S.; Rhee, S. Y.; Apweiler, R.; Barrell, D.; Camon, E.; Dimmer, E.; Lee, V.; Chisholm, R.; Gaudet, P.; Kibbe, W.; Kishore, R.; Schwarz, E. M.; Sternberg, P.; Gwinn, M.; Hannick, L.; Wortman, J.; Berriman, M.; Wood, V.; de la Cruz, N.; Tonellato, P.; Jaiswal, P.; Seigfried, T.; White, R.; Gene Ontology, C. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res. 2004, 32 (Database issue), D258-261.


(52) Han, Y.; Zhang, K.; Yang, J.; Zhang, N.; Fang, A.; Zhang, Y.; Liu, Y.; Chen, Z.; Hsiang, T.; Sun, W. Differential expression profiling of the early response to Ustilaginoidea virens between false smut resistant and susceptible rice varieties. BMC Genomics 2015, 16 (1), 955.

(53) Shannon, P.; Markiel, A.; Ozier, O.; Baliga, N. S.; Wang, J. T.; Ramage, D.; Amin, N.; Schwikowski, B.; Ideker, T. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13 (11), 2498-2504.

(54) Assenov, Y.; Ramirez, F.; Schelhorn, S. E.; Lengauer, T.; Albrecht, M. Computing topological parameters of biological networks. Bioinformatics 2007, 24 (2), 282-284.

(55) Winnenburg, R.; Baldwin, T. K.; Urban, M.; Rawlings, C.; Kohler, J.; Hammond-Kosack, K. E. PHI-base: a new database for pathogen host interactions. Nucleic Acids Res. 2006, 34, D459-464.

(56) Adamcsek, B.; Palla, G.; Farkas, I. J.; Derenyi, I.; Vicsek, T. CFinder: locating cliques and overlapping modules in biological networks. Bioinformatics 2006, 22 (8), 1021-1023.

(57) Palla, G.; Derenyi, I.; Farkas, I.; Vicsek, T. Uncovering the overlapping community structure of complex networks in nature and society. Nature 2005, 435 (7043), 814-818.


(58) Storey, J. D. A direct approach to false discovery rates. J. Roy. Stat. Soc. B. 2002, 64 (3), 479-498.

(59) Ouyang, S.; Zhu, W.; Hamilton, J.; Lin, H.; Campbell, M.; Childs, K.; Thibaud-Nissen, F.; Malek, R. L.; Lee, Y.; Zheng, L.; Orvis, J.; Haas, B.; Wortman, J.; Buell, C. R. The TIGR Rice Genome Annotation Resource: improvements and new features. Nucleic Acids Res. 2007, 35 (Database issue), D883-887.

(60) Sharan, R.; Ulitsky, I.; Shamir, R. Network-based prediction of protein function. Mol. Syst. Biol. 2007, 3 (1), 88.

(61) Lehner, B.; Fraser, A. G. A first-draft human protein-interaction map. Genome Biol. 2004, 5 (9), R63.

(62) Jansen, R.; Greenbaum, D.; Gerstein, M. Relating whole-genome expression data with protein-protein interactions. Genome Res. 2002, 12 (1), 37-46.

(63) Grigoriev, A. A relationship between gene expression and protein interactions on the proteome scale: analysis of the bacteriophage T7 and the yeast Saccharomyces cerevisiae. Nucleic Acids Res. 2001, 29 (17), 3513-3519.

(64) Watts, D. J.; Strogatz, S. H. Collective dynamics of 'small-world' networks. Nature 1998, 393 (6684), 440-442.


(65) Hein, O.; Schwind, M.; Konig, W. Scale-free networks: the impact of fat tailed degree distribution on diffusion and communication processes. Wirtschaftsinformatik 2006, 48 (4), 267-275.

(66) Han, J. D.; Bertin, N.; Hao, T.; Goldberg, D. S.; Berriz, G. F.; Zhang, L. V.; Dupuy, D.; Walhout, A. J.; Cusick, M. E.; Roth, F. P.; Vidal, M. Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Nature 2004, 430 (6995), 88-93.

(67) Idnurm, A.; Howlett, B. J. Pathogenicity genes of phytopathogenic fungi. Molecular Plant Pathology 2001, 2 (4), 241-255.

(68) Breakspear, A.; Momany, M. The first fifty microarray studies in filamentous fungi. Microbiology 2007, 153 (Pt 1), 7-15.

(69) O'Connell, R. J.; Thon, M. R.; Hacquard, S.; Amyotte, S. G.; Kleemann, J.; Torres, M. F.; Damm, U.; Buiate, E. A.; Epstein, L.; Alkan, N.; Altmuller, J.; Alvarado-Balderrama, L.; Bauser, C. A.; Becker, C.; Birren, B. W.; Chen, Z.; Choi, J.; Crouch, J. A.; Duvick, J. P.; Farman, M. A.; Gan, P.; Heiman, D.; Henrissat, B.; Howard, R. J.; Kabbage, M.; Koch, C.; Kracher, B.; Kubo, Y.; Law, A. D.; Lebrun, M. H.; Lee, Y. H.; Miyara, I.; Moore, N.; Neumann, U.; Nordstrom, K.; Panaccione, D. G.; Panstruga, R.; Place, M.; Proctor, R. H.; Prusky, D.; Rech, G.; Reinhardt, R.; Rollins, J. A.; Rounsley, S.; Schardl, C. L.; Schwartz, D. C.; Shenoy, N.; Shirasu, K.; Sikhakolli, U. R.; Stuber, K.; Sukno, S. A.; Sweigard, J. A.; Takano, Y.; Takahara, H.; Trail, F.; van der Does, H. C.; Voll, L. M.; Will, I.; Young, S.; Zeng, Q.; Zhang, J.; Zhou, S.; Dickman, M. B.; Schulze-Lefert, P.; Ver Loren van Themaat, E.; Ma, L. J.; Vaillancourt, L. J. Lifestyle transitions in plant pathogenic Colletotrichum fungi deciphered by genome and transcriptome analyses. Nat. Genet. 2012, 44 (9), 1060-1065.

(70) Caracuel, Z.; Martinez-Rocha, A. L.; Di Pietro, A.; Madrid, M. P.; Roncero, M. I. Fusarium oxysporum gas1 encodes a putative beta-1,3-glucanosyltransferase required for virulence on tomato plants. Mol. Plant-Microbe Interact. 2005, 18 (11), 1140-1147.

(71) Etienne-Manneville, S. Cdc42 - the centre of polarity. J. Cell Sci. 2004, 117 (8), 1291-1300.

(72) Zhao, X.; Mehrabi, R.; Xu, J. R. Mitogen-activated protein kinase pathways and fungal pathogenesis. Eukaryot Cell 2007, 6 (10), 1701-1714.

(73) Zhang, S.; Ning, X.; Zhang, X. S. Identification of functional modules in a PPI network by clique percolation clustering. Comput. Biol. Chem. 2006, 30 (6), 445-451.

(74) Sahu, S. S.; Weirick, T.; Kaundal, R. Predicting genome-scale Arabidopsis-Pseudomonas syringae interactome using domain and interolog-based approaches. BMC Bioinf. 2014, 15 (Suppl 11), S13.


(75) Gupta, R.; Lee, S. E.; Agrawal, G. K.; Rakwal, R.; Park, S.; Wang, Y.; Kim, S. T. Understanding the plant-pathogen interactions in the context of proteomics-generated apoplastic proteins inventory. Front. Plant Sci. (New Haven, CT, U. S.) 2015, 6, 352.

(76) Yike, I. Fungal proteases and their pathophysiological effects. Mycopathologia 2011, 171 (5), 299-323.

(77) Gao, X.; Cox Jr, K.; He, P. Functions of calcium-dependent protein kinases in plant innate immunity. Plants 2014, 3 (1), 160-176.

(78) Gu, Q.; Chen, Y.; Liu, Y.; Zhang, C.; Ma, Z. The transmembrane protein FgSho1 regulates fungal development and pathogenicity via the MAPK module Ste50-Ste11-Ste7 in Fusarium graminearum. New Phytol. 2015, 206 (1), 315-328.

(79) Turrà, D.; Segorbe, D.; Di Pietro, A. Protein kinases in plant-pathogenic fungi: conserved regulators of infection. Annu. Rev. Phytopathol. 2014, 52 (1), 267-288.

(80) Bork, P.; Jensen, L. J.; von Mering, C.; Ramani, A. K.; Lee, I.; Marcotte, E. M. Protein interaction networks from yeast to human. Curr. Opin. Struct. Biol. 2004, 14 (3), 292-299.

(81) Jeong, H.; Mason, S. P.; Barabasi, A. L.; Oltvai, Z. N. Lethality and centrality in protein networks. Nature 2001, 411 (6833), 41-42.


(82) Wang, C.; Zhang, S.; Hou, R.; Zhao, Z.; Zheng, Q.; Xu, Q.; Zheng, D.; Wang, G.; Liu, H.; Gao, X.; Ma, J. W.; Kistler, H. C.; Kang, Z.; Xu, J. R. Functional analysis of the kinome of the wheat scab fungus Fusarium graminearum. PLoS Pathog. 2011, 7 (12), e1002460.

(83) Hamer, L.; Adachi, K.; Dezwaan, T.; Lo, C.; Frank, S.; Darveaux, B.; Mahanty, S.; Heiniger, R.; Skalchunes, A.; Pan, H.; Tarpey, R.; Shuster, J.; Tanzer, M. M. Methods for the identification of inhibitors of 3-isopropylmalate dehydratase as antibiotics. 2003143657, 2001.

(84) Yang, X.; Ding, F.; Zhang, L.; Sheng, Y.; Zheng, X.; Wang, Y. The importin α subunit PsIMPA1 mediates the oxidative stress response and is required for the pathogenicity of Phytophthora sojae. Fungal Genet. Biol. 2015, 82, 108-115.

(85) Zhang, H.; Liu, K.; Zhang, X.; Song, W.; Zhao, Q.; Dong, Y.; Guo, M.; Zheng, X.; Zhang, Z. A two-component histidine kinase, MoSLN1, is required for cell wall integrity and pathogenicity of the rice blast fungus, Magnaporthe oryzae. Curr. Genet. 2010, 56 (6), 517-528.

(86) Jiang, C.; Xu, J. R.; Liu, H. Distinct cell cycle regulation during saprophytic and pathogenic growth in fungal pathogens. Curr. Genet. 2016, 62 (1), 185-189.

(87) Alspaugh, J. A.; Perfect, J. R.; Heitman, J. Cryptococcus neoformans mating and virulence are regulated by the G-protein α subunit GPA1 and cAMP. Genes Dev. 1997, 11 (23), 3206-3217.


TABLES

Table 1. Topological Characteristics of Static and Dynamic Hubs in the Predicted Network

| Characteristics | Dynamic Hubs | Static Hubs | P value* |
| --- | --- | --- | --- |
| Degree | 64.0 | 58.1 | 0.0326 |
| Cluster Coefficient | 0.0744 | 0.386 | 8.61E-32 |
| Average Shortest Path Length | 2.93 | 3.10 | 4.50E-19 |
| Betweenness Centrality | 0.00876 | 0.00261 | 7.75E-14 |
| Closeness Centrality | 0.342 | 0.323 | 9.33E-20 |
| Neighborhood Connectivity | 33.2 | 46.4 | 1.31E-22 |

* Student’s t-test.
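To make three of the characteristics in Table 1 concrete, the following minimal sketch computes the degree, clustering coefficient and neighborhood connectivity of a single node in an undirected network. It assumes the standard definitions (clustering coefficient as the fraction of neighbor pairs that are themselves connected, neighborhood connectivity as the mean degree of a node's neighbors); the toy edge list and function names are hypothetical and do not reproduce the values in the table.

```python
from itertools import combinations

def build_adjacency(edges):
    """Build an undirected adjacency map, skipping self-interactions."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def node_stats(adj, node):
    """Return (degree, clustering coefficient, neighborhood connectivity)."""
    nbrs = adj[node]
    k = len(nbrs)
    # Number of edges that exist among the neighbors of `node`
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    clustering = 2 * links / (k * (k - 1)) if k > 1 else 0.0
    # Mean degree of the neighbors
    connectivity = sum(len(adj[n]) for n in nbrs) / k if k else 0.0
    return k, clustering, connectivity

# Toy network (hypothetical)
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
adj = build_adjacency(edges)
degree, clustering, connectivity = node_stats(adj, "A")
print(degree, clustering, connectivity)
```

Under these definitions, a node whose partners rarely interact with one another has a low clustering coefficient, which is the pattern Table 1 reports for dynamic hubs relative to static hubs.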


FIGURE LEGENDS

Figure 1. The pipeline for predicting the protein-protein interaction (PPI) network for U. virens. The interolog-based and domain-domain interaction (DDI)-based methods were used to generate the U. virens PPI network. In the interolog-based approach, protein interactions in U. virens were inferred by analogy to experimentally established interactions in model organisms. The interactions associated with orthologs, which were identified using Inparanoid and Reciprocal Best Hits, were considered oPPIs. The dPPIs were predicted through the DDI-based approach followed by the stringent filtering process described in the Materials and Methods section. Finally, the oPPIs, the dPPIs and the overlap of raw oPPIs and dPPIs were combined to form an integrated network.
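The interolog step of the pipeline can be sketched in a few lines: an interaction known in a model organism is transferred to the target species whenever both partners have orthologs there. This is a simplified illustration that assumes the ortholog table has already been computed (in the study, by Inparanoid and Reciprocal Best Hits); the function name `predict_interologs` and all identifiers are hypothetical, and none of the filtering described in the Materials and Methods is modeled.

```python
def predict_interologs(template_ppis, ortholog_map):
    """Transfer template interactions to the target species via orthologs.

    template_ppis: iterable of (protein_a, protein_b) in a model organism.
    ortholog_map:  {model-organism protein: [target-species orthologs]}.
    Returns a set of unordered target-species pairs stored as sorted tuples.
    """
    predicted = set()
    for a, b in template_ppis:
        for target_a in ortholog_map.get(a, ()):
            for target_b in ortholog_map.get(b, ()):
                predicted.add(tuple(sorted((target_a, target_b))))
    return predicted

# Toy inputs (hypothetical identifiers)
template_ppis = [("Y1", "Y2"), ("Y2", "Y3"), ("Y4", "Y5")]
ortholog_map = {"Y1": ["UV_a"], "Y2": ["UV_b"], "Y3": ["UV_c", "UV_d"]}

oppis = predict_interologs(template_ppis, ortholog_map)
print(sorted(oppis))
```

Storing each pair as a sorted tuple treats A-B and B-A as the same interaction, so templates from several model organisms can be merged without duplicates.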

Figure 2. Validation of the reliability of the predicted U. virens PPI network. (A) The percentages of non-self interaction pairs sharing an identical GO term in the predicted network were significantly greater than those in randomized networks at different depths in the GO hierarchies. The proteins without GO annotations in the PPI network were excluded from this analysis. (B) The Pearson’s correlation coefficient (PCC) distribution of non-self interaction pairs in the predicted network was compared to that in randomized networks. The PCC value of each interaction pair was calculated based on FPKM in the expression profiles during early infection of U. virens. The proteins without expression data in the PPI network were excluded from this analysis. For comparisons in (A) and (B), randomized networks were constructed using the same group of proteins in the predicted network. (C) Eleven interactions associated with three independent proteins, UV_1325, UV_4823 and UV_7680, were verified in the yeast two-hybrid assay. The growth of yeast colonies on the selective quadruple dropout media indicated a positive interaction. The assays were repeated at least three times with similar results. The pGADT7-T and pGBKT7-53 plasmids were co-transformed in the yeast Gold strain as a positive control, while pGADT7-T and pGBKT7-λ were used as a negative control. AD, pGADT7; BD, pGBKT7.
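The co-expression check in panel (B) can be sketched as follows: compute the PCC of the expression profiles of each predicted pair, then repeat the calculation on random pairs drawn from the same proteins. The expression values, protein names and the random-pair sampler below are hypothetical; in particular, drawing random pairs uniformly is an assumed randomization scheme, since the legend states only that randomized networks used the same group of proteins.

```python
import math
import random

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined for a constant profile
    return sxy / math.sqrt(sxx * syy)

def mean_pcc(pairs, expr):
    """Mean PCC over non-self pairs whose proteins both have expression data."""
    vals = [pearson(expr[a], expr[b]) for a, b in pairs
            if a != b and a in expr and b in expr]
    vals = [v for v in vals if not math.isnan(v)]
    return sum(vals) / len(vals) if vals else float("nan")

def random_pairs(proteins, n_pairs, rng):
    """Draw n_pairs distinct unordered pairs from the same protein set."""
    pairs = set()
    while len(pairs) < n_pairs:
        a, b = rng.sample(proteins, 2)
        pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)

# Toy FPKM-like profiles across four time points (hypothetical values)
expr = {"P1": [1, 2, 3, 4], "P2": [2, 4, 6, 8], "P3": [4, 3, 2, 1], "P4": [1, 3, 2, 5]}
predicted = [("P1", "P2"), ("P3", "P3")]  # the self-interaction is skipped

observed = mean_pcc(predicted, expr)
background = mean_pcc(random_pairs(sorted(expr), 3, random.Random(0)), expr)
print(observed, background)
```

In practice the background calculation would be repeated over many randomized networks so that the observed distribution can be compared against a null distribution rather than a single draw.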

Figure 3. The predicted interactions associated with several potential pathogenicity proteins in the pathogenicity-related subnetwork for U. virens. (A) The predicted interaction partners of UV_1036, a putative beta-1,3-glucanosyltransferase, were all predicted to be involved in pathogenicity. (B) UV_428, a cell division control protein, was predicted to interact with 33 partners, many of which were putative pathogenicity proteins. (C) The bridging node UV_4095, a putative MAP kinase involved in pathogenicity, was predicted to interact with four seed nodes, UV_2830, UV_4452, UV_428 and UV_7744, in the subnetwork. These proteins were all putative pathogenicity proteins and were annotated as 3-isopropylmalate dehydrogenase, plasma membrane ATPase, cell division control protein 42 and MAP kinase kinase, respectively. For simplicity, only nodes shared by these seed nodes are shown in the figure. (D) The interconnected protein cluster associated with “secondary metabolic process” was revealed by the clique percolation algorithm. Ten clusters were found in the pathogenicity-related subnetwork in this analysis (k = 3). Other clusters are shown in Supplementary Figure S8. The GO terms enriched in the cluster at depth level 4 in the GO hierarchy were identified by Fisher’s exact test followed by FDR correction. The nodes are all connected with different lines (i.e. edges). The prefix “UV_” is omitted from the gene names. Four important gene categories are indicated by different circle colors. The degrees of the nodes are indicated by different circle sizes. GO depth levels are indicated by different points of the lines. The PCC values are represented by different line colors. The oPPIs are indicated by solid lines, the dPPIs by dashed lines, and the overlap of oPPIs and dPPIs by double lines. Up-regulated genes are shown by circles with a red border.
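For readers unfamiliar with clique percolation, the idea behind panel (D) can be sketched in a few lines of Python. This is a brute-force illustration on a hypothetical toy graph, not the software used in the study: every k-clique is enumerated (k = 3 means triangles), two k-cliques are treated as adjacent when they share k - 1 nodes, and each cluster is the union of a connected group of cliques.

```python
from itertools import combinations


def k_clique_communities(edges, k=3):
    """Clique percolation by brute force; suitable for small graphs only."""
    # Build an undirected adjacency map, ignoring self-interactions.
    adj = {}
    for a, b in edges:
        if a == b:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    # Enumerate all k-cliques: node sets in which every pair is connected.
    cliques = [
        frozenset(nodes)
        for nodes in combinations(sorted(adj), k)
        if all(v in adj[u] for u, v in combinations(nodes, 2))
    ]

    # Union-find over cliques; merge cliques that share k - 1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # Each community is the union of the nodes of one group of cliques.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy graph: two triangles sharing an edge, a bridge, a third triangle.
toy_edges = [
    (1, 2), (1, 3), (2, 3),  # triangle 1-2-3
    (2, 4), (3, 4),          # triangle 2-3-4 shares edge 2-3 with the first
    (4, 5),                  # bridge that belongs to no triangle
    (5, 6), (5, 7), (6, 7),  # separate triangle 5-6-7
]
communities = k_clique_communities(toy_edges, k=3)
```

Note that the bridging edge 4-5 does not merge the two clusters, because it is not part of any triangle.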

Figure 4. Characteristics of the interspecies network between U. virens and Oryza sativa revealed by GO enrichment analyses and exemplified by interaction partners of UV_7256. (A) The representative enriched GO terms in the categories of molecular function (orange) and biological process (green) for U. virens and Oryza sativa proteins in the interspecies PPI network were revealed by GO annotation. (B) UV_7256, a putative secreted carboxypeptidase, was predicted to interact with 25 rice proteins in the interspecies network. The average degree of these rice proteins in RiceNet was 334.5, which was significantly higher than the average of all proteins (64.0). The prefix “LOC_Os” is omitted from the rice gene names. The degree of each node is indicated by the number in parentheses below the gene name. Subcellular localizations of Oryza sativa nodes are represented by different circle colors.

Figure 5. Availability of the U. virens PPI network database. The PPI network can be accessed from the website http://sunlab.cau.edu.cn/uvpid/. The interaction partners of a specific protein can be queried by the local protein ID, such as UV_1325, and NCBI GI number. A table listing the interaction partners is presented as output. The protein interaction diagram with other information, such as the degree of nodes, the PCC value, and the depth level in the GO hierarchy, can be accessed by clicking the Figure button above the table.


For TOC Only


Figure 1. The pipeline for predicting the protein-protein interaction (PPI) network for U. virens. The interolog-based and domain-domain interaction (DDI)-based methods were used to generate the U. virens PPI network. In the interolog-based approach, protein interactions in U. virens were predicted by analogy to experimentally established interactions in model organisms. The interactions associated with orthologs, which were identified using Inparanoid and Reciprocal Best Hit, were considered as oPPIs. The dPPIs were predicted through the DDI-based approach followed by a stringent filtering process described in the Materials and Methods section. Finally, the oPPIs, the dPPIs and the overlap of raw oPPIs and raw dPPIs were combined to form an integrated network.
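The interolog step of the pipeline can be illustrated with a short Python sketch. It assumes the orthology assignment (by Inparanoid or Reciprocal Best Hit in the paper) has already been done and is available as a mapping; the function name, the mapping layout and all protein IDs below are hypothetical, and the filtering steps of the actual pipeline are not reproduced.

```python
def interolog_ppis(template_ppis, orthologs):
    """Transfer template interactions to the target species via orthologs.

    template_ppis: iterable of (a, b) pairs from a model organism.
    orthologs: dict mapping a model-organism protein to its target orthologs.
    """
    predicted = set()
    for a, b in template_ppis:
        # An interaction is transferred only if both partners have orthologs.
        for ua in orthologs.get(a, ()):
            for ub in orthologs.get(b, ()):
                predicted.add(tuple(sorted((ua, ub))))
    return sorted(predicted)


# Hypothetical toy input: three model-organism proteins, two with orthologs.
template = [("Y1", "Y2"), ("Y2", "Y3")]
ortholog_map = {"Y1": ["U1"], "Y2": ["U2", "U3"]}
oppis = interolog_ppis(template, ortholog_map)
```

Here the template pair Y2-Y3 yields no prediction because Y3 has no ortholog in the mapping.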
