Article

Comparative proteomic analysis of developing rhizomes of the ancient vascular plant Equisetum hyemale and different monocot species

Fernanda Salvato, Tiago Santana Balbuena, William Nelson, R. Shyama Prasad Rao, Ruifeng He, Carol A Soderlund, David R Gang, and Jay J Thelen

J. Proteome Res., Just Accepted Manuscript • Publication Date (Web): 26 Feb 2015



Journal of Proteome Research

Fernanda Salvato1¥, Tiago S. Balbuena1§, William Nelson2, R. Shyama Prasad Rao1ǂ, Ruifeng He3, Carol A. Soderlund2, David R. Gang3, Jay J. Thelen1*

1 Department of Biochemistry, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, USA.

2 BIO5 Institute, The University of Arizona, Tucson, AZ, USA.

3 Institute of Biological Chemistry, Washington State University, Pullman, WA, USA.

*Email: [email protected], Fax: 573-884-9676, Phone: 573-884-1325

Keywords: rhizome development, proteomics, label free quantification

ABSTRACT

The rhizome is responsible for the invasiveness and competitiveness of many plants with great economic and agricultural impact worldwide. Besides its value as an invasive organ, the rhizome plays a role in the establishment and massive growth of forage, providing biomass for biofuel production. Despite these features, little is known about the molecular mechanisms that contribute to rhizome growth, development and function in plants. In this work, we characterized the proteome of rhizome apical tips and elongation zones from different species using a GeLC-MS/MS (one-dimensional electrophoresis in combination with liquid chromatography coupled on-line with tandem mass spectrometry) spectral-counting proteomics strategy. Five rhizomatous grasses and an ancient species were compared to study the protein regulation in rhizomes. An average of 2,200 rhizome proteins per species were confidently identified and quantified. Rhizome-characteristic proteins showed similar functional distributions across all species analyzed. The over-representation of proteins associated with central roles in cellular, metabolic, and developmental processes indicated accelerated metabolism in growing rhizomes. Moreover, 61 rhizome-characteristic proteins appeared to be regulated similarly among analyzed plants. In addition, 36 showed conserved regulation between rhizome apical tips and elongation zones across species. These proteins were preferentially expressed in rhizome tissues regardless of the species analyzed, making them interesting candidates for more detailed investigative studies about their roles in rhizome development.

INTRODUCTION

Rhizomes are modified underground stems that are diageotropic and have fundamental agricultural roles in plant competitiveness and invasiveness. Many of the most noxious weeds, such as Johnsongrass (Sorghum halepense), quack grass (Agropyron repens), and cogon grass (Imperata cylindrica), produce rhizomes as an important organ of dispersion and “weediness”.1 Beyond their role in invasiveness, rhizomes are important for the establishment and massive growth of forage and turf grasses, and of species cultivated to provide plant biomass for biofuel production.1,2 In addition to their agricultural significance, rhizomes have evolutionary importance. They were the original stem of the vascular plant lineage3 and are still present in primitive plants such as ferns and fern allies. However, many advanced plants also produce rhizomes, with great economic impact, owing both to the need for heavy weed control and to potential gains in productivity from increased biomass production. Therefore, understanding the molecular basis of rhizome growth and development may have implications for the overall use of herbicides as well as for better occupation of marginal lands for agricultural expansion.

Despite its importance, very little is known about gene expression involved in rhizome growth and development. Jang and coworkers2 identified several genes that are expressed in the rhizomes of Sorghum species related to the “rhizomatousness” of those species. However, the exact function of these genes remains unclear. The utilization of new technologies enabled the identification of genes and proteins related to rhizome development. Recently, He et al.4 employed next-generation sequencing technologies and quantitative proteomics techniques to characterize the transcriptome and proteome of common reed rhizomes. In a similar way, Balbuena et al.5 identified nearly 2,000 proteins in rhizome tissues of Equisetum hyemale (horsetail) employing spectral counting quantification. Both studies revealed several rhizome-characteristic genes and proteins associated with rhizome development, setting the stage for this investigation.

The study of proteins can reveal important mechanisms of regulation of biological processes because proteins are involved in a broad spectrum of essential functions, such as the regulation of DNA replication, RNA splicing and editing, and transcription and translation control, and they act as molecular catalysts, receptors, or intracellular signals in different metabolic processes. Furthermore, RNA and protein expression are not always concordant,6 demonstrating the importance of complementing transcriptome studies with proteomic analyses.

Here, we describe an extensive proteomic study on rhizome development of different species, including five grasses, quack grass (Elytrigia repens), cogon grass (Imperata cylindrica), miscanthus (Miscanthus x giganteus), red rice (Oryza longistaminata) and common reed (Phragmites australis), and a fern ally representative, horsetail (Equisetum hyemale). Preparative SDS-PAGE separation of proteins followed by in-gel digestion and tandem mass spectrometry is a common strategy for unbiased characterization of proteomes, and is referred to as GeLC-MS. GeLC-MS analyses and relative quantification based on spectral counting were performed to identify proteins from rhizome elongation zones and apical tips of each species. These two rhizome tissues were chosen because they are linked to biological processes such as cell division, elongation, and differentiation related to rhizome growth and development. The comparison of rhizome-enriched proteins in several monocotyledons and horsetail may identify conserved expression of rhizome-related genes across a primitive rhizome species (E. hyemale) and more evolved rhizomatous species (monocotyledons). Widespread conservation of gene expression between these groups of species can reveal candidate genes responsible for the ecological success of more modern invasive plants (monocots). Thus, the present work comprises not only the characterization of differentially regulated proteins in rhizome tissues of different species, but also the investigation of key genes with conserved expression across monocots and primitive plants, with the aim of understanding more comprehensively the process of rhizome growth and development at the proteome level.

MATERIAL AND METHODS

Plant Material

Species evaluated included the primitive plant Equisetum hyemale (also called horsetail) and five monocots: quack grass (Elytrigia repens), cogon grass (Imperata cylindrica), miscanthus (Miscanthus x giganteus), red rice (Oryza longistaminata) and common reed (Phragmites australis). Plants from the Plant Rhizome project living collection were maintained in a greenhouse under controlled conditions as described by He et al.4 Samples from the rhizome apical tip and elongation zone, as well as root samples, were dissected and immediately frozen in liquid N2. Samples were stored at -80 °C until protein extraction.

Protein Extraction and SDS-PAGE

Five biological replicates of rhizome apical tip, rhizome elongation zone and roots for all species were used for protein extraction. Frozen samples were ground with a mortar and pestle to produce a fine powder, and proteins were extracted using a phenol-based protocol described by Balbuena et al.5 Proteins were precipitated in methanol containing 0.1 M ammonium acetate and incubated overnight at -20 °C. Precipitates were collected by centrifugation at 5,000 g for 15 min and pellets were washed twice with methanol containing 0.1 M ammonium acetate. Protein precipitates were resuspended in 200 µL resuspension buffer [65 mM Tris (pH 6.8), 2 % (w/v) SDS] and protein concentration was estimated with the BCA Protein Kit (Thermo Fisher Scientific, Houston, TX) using BSA as standard. Aliquots of 100 µg of the protein extracts for each biological replicate were mixed with an equal volume of loading buffer containing 125 mM Tris (pH 6.8), 20 % (v/v) glycerol, 4 % (w/v) SDS, 0.5 % (w/v) DTT and traces of bromophenol blue, and incubated for 5 min at 99 °C. Gel electrophoresis was performed under denaturing conditions in 12 % polyacrylamide gels using 20 mA per gel. After protein migration, gels were stained with colloidal Coomassie Blue stain under standard conditions.

In-gel digestion and LC-MS/MS analyses

The gel lane of each biological replicate was sliced into 30 equal-size fragments and transferred into a 96-well plate device (MultiScreen Solvinert Plates, Millipore) for gel destaining and in-gel digestion as described by Balbuena et al.5 Protein digestion was performed with the addition of 700 ng of porcine trypsin (Promega, Madison, WI) to each well. Samples were incubated overnight at 37 °C for digestion. After digestion, the gel pieces were saturated with 400 µL of extraction buffer containing 5 % formic acid (FA): acetonitrile (1:2, v/v) and kept under horizontal agitation for 30 min. Supernatants were collected by centrifugal filtration (3,000 g for 30 min), dried down in a vacuum centrifuge and kept at -80 °C until LC-MS/MS analyses. Extracted peptides were resuspended in 0.1 % (v/v) FA and injected on a ProteomeX-LTQ Workstation (Thermo, San Jose, CA), and mass spectra were acquired according to Balbuena et al.5 Peptides were separated at a flow rate of 150 µL/min on a 10 cm × 150 µm ID in-house packed nanocolumn (C18, 100 Å, 5 µm, Michrom Bioresources) using the following mobile phase gradient: from 5 to 35% of solvent B in 25 min, from 35 to 70% in 25 min, then back to 5% in 5 min. Solvent A was water containing 0.1% FA; solvent B was acetonitrile (ACN) containing 0.1% FA. Peptides were positively ionized at 2.1 kV and 250 °C and injected into the mass spectrometer. Mass spectrometry data were acquired in data-dependent acquisition (DDA) mode controlled by XCalibur 2.0 software (Thermo Fisher Scientific). The typical DDA cycle consisted of a survey scan within m/z 200–2,000 followed by MS/MS fragmentation of the seven most abundant precursor ions under normalized collision energy of 35%. Fragmented precursor ions were dynamically excluded according to the following settings: repeat counts: 3, repeat duration: 30 s, exclusion duration: 30 s.
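For illustration, the mobile phase gradient described above can be written as a piecewise-linear function of time. This is a minimal Python sketch that assumes the three segments run consecutively and that each ramp is linear; the name `percent_b` is hypothetical and is not part of the instrument software.

```python
# Breakpoints (minutes, % solvent B) taken from the gradient stated in the text.
# Assumption: segments are consecutive and each ramp is linear.
GRADIENT = [(0.0, 5.0), (25.0, 35.0), (50.0, 70.0), (55.0, 5.0)]


def percent_b(t_min):
    """Return the % of solvent B at time t_min (minutes) by linear interpolation."""
    if t_min <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    # After the last breakpoint the column is held at the starting composition.
    return GRADIENT[-1][1]
```

Under these assumptions the full cycle lasts 55 min per gel slice, which helps when estimating instrument time for 30 slices per lane.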

Databases for protein identification

For each species, the database used for protein identification was obtained through the isolation of RNA from different tissues. The cDNA libraries were constructed from five types of tissues: rhizome tip, rhizome zone, whole rhizome, root, stem and leaf. A total of five replicate plants (five independent biological replicates for each type of tissue) were sampled. The same tissue samples (from a common grind) were used for RNA and protein isolation for all procedures to ensure that direct comparisons between RNA and protein data could be made. The CLC assembly program is widely used for de novo transcriptome assembly.7,8 To ensure the quality of sequence assembly, we used high-stringency settings. The raw sequencing reads were trimmed based on a quality value ≥20, and reads shorter than 20 bp were removed. For each species, the filtered reads from all five types of RNA sequence datasets were initially assembled using CLC Genomics Workbench 5.0 with default stringency settings, including trim quality score of 0.01, maximum ambiguous nucleotide of 2, mismatch cost of 2, insert cost of 3, similarity of 0.8, and minimum contig length of 200 bp. Further, possible poly-A tails were removed with EMBOSS trimest, and the assembly was finalized with MIRA and CAP3 in the iAssembler package.9 The assembly metrics provide insight into the quality of the assembly and are summarized in Table S1.
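The read filtering step can be sketched as follows. This is only an illustration of the two stated thresholds (quality value ≥20, minimum length 20 bp); it trims low-quality bases from both read ends, which is an assumption and not the trimming algorithm implemented in CLC Genomics Workbench. The function names are hypothetical.

```python
MIN_QUALITY = 20  # Phred quality threshold stated in the text
MIN_LENGTH = 20   # reads shorter than this (bp) are discarded


def trim_read(seq, quals, min_quality=MIN_QUALITY):
    """Trim bases below min_quality from both ends of a read (illustrative rule)."""
    start, end = 0, len(seq)
    while start < end and quals[start] < min_quality:
        start += 1
    while end > start and quals[end - 1] < min_quality:
        end -= 1
    return seq[start:end], quals[start:end]


def filter_reads(reads, min_quality=MIN_QUALITY, min_length=MIN_LENGTH):
    """Trim every (sequence, qualities) pair and keep reads of sufficient length."""
    kept = []
    for seq, quals in reads:
        trimmed_seq, trimmed_quals = trim_read(seq, quals, min_quality)
        if len(trimmed_seq) >= min_length:
            kept.append((trimmed_seq, trimmed_quals))
    return kept
```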

The final assemblies (unique transcripts) were translated using the Virtual Ribosome software10 (version 1.1), and the longest open reading frame (ORF) was reported. For each species, the peptide database was combined with randomized (i.e., decoy) sequences, resulting in a concatenated search database that permitted false discovery rate (FDR) calculations.
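The idea of reporting the longest ORF per transcript can be sketched in a few lines of Python. This is not the Virtual Ribosome implementation; it assumes the standard genetic code, searches all six reading frames, and defines an ORF as running from the first ATG in a stop-free stretch to the next stop codon or the sequence end. All names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate codon by codon; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def longest_orf(dna):
    """Return the longest Met-initiated peptide over all six reading frames."""
    dna = dna.upper()
    reverse_complement = dna.translate(COMPLEMENT)[::-1]
    best = ""
    for strand in (dna, reverse_complement):
        for frame in range(3):
            # Stop codons ('*') delimit candidate open reading frames.
            for segment in translate(strand[frame:]).split("*"):
                start = segment.find("M")
                if start >= 0 and len(segment) - start > len(best):
                    best = segment[start:]
    return best
```

Because de novo transcripts have no known orientation, both strands are scanned; a sequence and its reverse complement therefore yield the same peptide.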

Protein identification

Database searches were performed using the SEQUEST search engine integrated within the Bioworks 3.3.1 SP1 software package (Thermo). Search parameters were set as follows: oxidation of methionine as a variable modification and carbamidomethylation of cysteine as a static modification; enzyme: trypsin; number of allowed missed cleavages: 2; mass range: 200 to 2,000; threshold: 500; minimum ion count: 10; peptide tolerance: 1,000 ppm; fragment ion tolerance: 1 Da. Duplicate peptide matches were reported. After database searches, candidate peptide-spectrum matches (PSMs) were validated computationally using the Search Engine Processor tool11 as described by Balbuena et al.5 For confident protein identification, spectrum, peptide and protein cutoffs were adjusted to achieve a false discovery rate of 1 % at the protein level for each biological replicate.
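The principle behind a target-decoy FDR cutoff can be illustrated with a short Python sketch. It assumes a single score per match and the common estimator FDR = decoy hits / target hits for a concatenated database; the Search Engine Processor combines several scores and works at the spectrum, peptide and protein levels, so this is a simplification and not its actual procedure. Function names are hypothetical.

```python
def fdr_at_threshold(matches, threshold):
    """Estimate FDR as decoys/targets among matches scoring >= threshold.

    matches: iterable of (score, is_decoy) pairs.
    """
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0 if decoys == 0 else float("inf")
    return decoys / targets


def score_cutoff(matches, max_fdr=0.01):
    """Return the lowest score cutoff whose accepted set has FDR <= max_fdr, or None."""
    best = None
    for threshold in sorted({score for score, _ in matches}, reverse=True):
        if fdr_at_threshold(matches, threshold) <= max_fdr:
            best = threshold
    return best
```

In this toy form, lowering the cutoff admits more target matches but also more decoys; the accepted cutoff is the most permissive one that keeps the estimate at or below 1 %.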

Quantification based on spectral counting

Proteins containing common peptides were grouped, and relative protein quantification was based on the number of spectral counts per protein group. This procedure was done with the help of the Regrouper tool (available for download at http://pcarvalho.com/patternlab/sepro.shtml). At this step, the spectral counts of the shared peptides within each proposed group were counted only once to avoid overestimation. Replicate groups having three or more 0-count replicates were capped at a maximum count of 5 to remove potentially spurious counts. The spectral counts were loaded into the TCW database for analysis, as described below.
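The two counting rules above can be sketched as follows. This is an illustrative reading of the text, not the Regrouper code: spectra are assumed to carry unique identifiers so that a spectrum shared by several proteins in a group is counted once, and the cap is applied to the five replicate counts of one protein group in one tissue. The function names are hypothetical.

```python
MAX_SPARSE_COUNT = 5      # cap stated in the text
MIN_ZERO_REPLICATES = 3   # "three or more 0-count replicates"


def group_spectral_count(spectra_by_protein):
    """Count distinct spectra in a protein group; shared spectra count only once.

    spectra_by_protein: dict mapping protein ID -> iterable of spectrum IDs.
    """
    return len(set().union(*spectra_by_protein.values()))


def cap_sparse_replicates(counts):
    """Cap counts at MAX_SPARSE_COUNT when most replicates have zero counts."""
    if sum(1 for c in counts if c == 0) >= MIN_ZERO_REPLICATES:
        return [min(c, MAX_SPARSE_COUNT) for c in counts]
    return list(counts)
```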

Single-species and multi-species databases

The Transcriptome Computational Workbench12 (TCW) was used to build and analyze both single-species databases (five grasses and E. hyemale) and a multi-species database of all species. To build the single-species TCW databases, the runSingleTCW program was used to load the protein sequences and raw spectral counts, compare the protein sequences against taxonomic UniProt13 using BLAST,14 execute edgeR15 to determine differential expression, and load the results into the database. The multiSingleTCW program was used to build a database of the protein sequences, annotations, spectral counts and differential expression from the single-species databases, execute OrthoMCL,16 and load the results into the database. The viewSingleTCW program was used to query for differential expression and the viewMultiTCW program was used to query the orthologous clusters. A custom script was written to determine the orthologous groups differentially regulated in rhizome apical tips (AT) compared to roots in at least two grasses and E. hyemale. The single-species and multi-species databases can be queried and viewed via a Java applet from www.plantrhizome.org; the peptide files are also available from this site.

Functional classification and hierarchical clustering

The functional classification of the differentially regulated proteins in the rhizome tissues (AT and EZ) compared to roots was based on GO term retrieval using the Blast2GO tool.17 The combined graphs for biological process, molecular function and cellular component are presented at the second level of depth. Hierarchical clustering was performed using the PermutMatrix software.18 For proteins that were classified in the same orthologous group and showed similar trends of expression among different species, the mean of the normalized spectral counts was taken and the fold change was calculated for AT vs. roots and EZ vs. roots, followed by log2 transformation for heat map representation. Dissimilarities were then calculated based on Euclidean distances, and hierarchical clustering was carried out according to the Unweighted Pair Group Method with Arithmetic Mean (UPGMA).19
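The clustering steps described above (log2 fold change, Euclidean dissimilarity, UPGMA linkage) can be sketched in plain Python. This is not the PermutMatrix implementation; the `pseudocount` argument is an assumption added here to avoid division by zero and is not mentioned in the text, and all function names are hypothetical.

```python
import math


def log2_fold_change(tissue_mean, root_mean, pseudocount=1.0):
    """log2 ratio of mean normalized counts (pseudocount is an assumption)."""
    return math.log2((tissue_mean + pseudocount) / (root_mean + pseudocount))


def euclidean(a, b):
    """Euclidean distance between two equally long profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def upgma(profiles):
    """Cluster a non-empty dict label -> profile; return a nested-tuple tree."""
    labels = list(profiles)
    clusters = {i: (label, 1) for i, label in enumerate(labels)}  # id -> (tree, size)
    dist = {
        (i, j): euclidean(profiles[labels[i]], profiles[labels[j]])
        for i in range(len(labels))
        for j in range(i + 1, len(labels))
    }
    next_id = len(labels)
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)  # closest pair of clusters
        tree_i, size_i = clusters.pop(i)
        tree_j, size_j = clusters.pop(j)
        # UPGMA: distance to the merged cluster is the size-weighted average.
        merged = {
            k: (size_i * dist[(min(i, k), max(i, k))]
                + size_j * dist[(min(j, k), max(j, k))]) / (size_i + size_j)
            for k in clusters
        }
        dist = {p: d for p, d in dist.items() if i not in p and j not in p}
        for k, d in merged.items():
            dist[(k, next_id)] = d
        clusters[next_id] = ((tree_i, tree_j), size_i + size_j)
        next_id += 1
    return next(iter(clusters.values()))[0]
```

Each profile would hold the log2 fold changes of one orthologous group across the species and tissues; the returned tree gives the merge order used to arrange heat map rows.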

RESULTS

Large-scale protein identification of underground tissues from rhizomatous species

A simple workflow based on GeLC experiments was employed in the present study for in-depth protein identification and quantification (Figure S1) of the underground system of different rhizomatous species. Homologous databases from RNA sequencing projects for each species were used to mine the MS/MS data to obtain maximum proteome coverage for each tissue and the maximum number of confident peptide-spectrum matches (PSMs). Five biological replicates were employed for each tissue of all six species to improve the protein assessment by mass spectrometry and the accuracy of quantification. All SePro-filtered MS/MS spectra and the corresponding Sequest scores for each PSM and for each species may be found in Tables S2 to S19 in the Supplementary Material. To reduce the protein inference problem, proteins identified by the same set of peptides were classified in the same protein group, and the quantitative analysis considered the number of spectral counts per protein group (Table S20). Using this strategy, around 30,000 assigned spectra and 10,000 peptides were acquired per biological replicate, corresponding on average to nearly 2,200 proteins per species confidently identified and quantified with a false discovery rate (FDR) of less than 1 %.

Overall, the number of peptides and proteins identified was comparable among the various plants analyzed, except for miscanthus, which showed about 40 % lower values. In general, the number of non-redundant proteins identified in rhizome apical tips (AT) was similar to the number identified in rhizome elongation zones (EZ), while the number of non-redundant proteins identified in roots was lower than in rhizome tissues (Figure 1). Comparative analysis based on spectral counting between rhizome tissues (AT and EZ) and developing roots (used for comparison) was performed to identify rhizome-characteristic proteins, i.e., proteins up-regulated in the AT and EZ compared to roots in each species. Thus, two pairwise comparisons were carried out in all species: AT versus roots and EZ versus roots. In addition, differences between AT and EZ were computed. On average, 40 % of proteins identified in AT and EZ showed differences in abundance when compared to roots in each species, while the differences between AT and EZ ranged from 3 % (quack grass) to 26 % (miscanthus) (Table 1). The number of rhizome-characteristic proteins varied by species from 81 in miscanthus to 266 in red rice (Table 1).

Quantitative proteome profiles of underground tissues indicate a wide dynamic range

To evaluate protein abundance differences between tissues, we employed spectral counting, in which the spectra for a given peptide are counted and then summed with the spectra of all other peptides assigned to the same protein. This strategy has shown promising results, with the advantage that relative abundances of different proteins present in a complex sample can be measured. Protein standards added to yeast extracts produced a linear correlation between protein abundance and spectral counts.20 Spectral counting also agreed well with peak area intensity measurements and with independent measurements based on gel staining intensities in complex samples of human cells and plant seed.21,22 Relatively simple analytical procedures are required to employ this kind of strategy. After normalization of spectral counts, biological changes in protein abundance can be determined by different statistical tests. In the present work, normalization and differential expression of proteins between different tissues were determined using edgeR15 and TCW.12 Figure 1B shows the log10-transformed spectral count distributions of the rhizome and root proteins identified in all species studied. Each bin corresponds to 0.25 orders of magnitude difference in protein abundance. The most abundant proteins in roots, AT, and EZ showed log10 values of 5.39, 5.21, and 5.11, while the least abundant proteins showed log10 values of 0.23, 0.11, and 0.16, respectively (Table S20). This provides a global dynamic range of five orders of magnitude. Besides the wide dynamic range detected in these proteomes, Figure 1C shows a large number of proteins at least two orders of magnitude below the abundance of the most prominently expressed protein in the sample, indicating good coverage of low-abundance proteins.
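The dynamic range and binning described above amount to simple arithmetic on log10-transformed counts, sketched here in Python. The bin width of 0.25 comes from the text; labelling each bin by its lower edge and ignoring zero counts are assumptions, and the function names are hypothetical.

```python
import math
from collections import Counter

BIN_WIDTH = 0.25  # orders of magnitude per bin, as stated in the text


def dynamic_range(counts):
    """Orders of magnitude between the largest and smallest positive count."""
    positive = [c for c in counts if c > 0]
    return math.log10(max(positive)) - math.log10(min(positive))


def log10_histogram(counts, bin_width=BIN_WIDTH):
    """Histogram of log10 counts; each key is the lower edge of its bin."""
    bins = Counter()
    for c in counts:
        if c > 0:  # zero counts have no logarithm and are skipped
            bins[math.floor(math.log10(c) / bin_width) * bin_width] += 1
    return dict(sorted(bins.items()))
```

With the extreme values reported in the text (log10 of 5.39 for the most abundant and 0.11 for the least abundant protein), the difference is 5.28, consistent with the stated range of about five orders of magnitude.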

The most abundant proteins considering all tissues and species analyzed were found in quack grass roots (ErRi_37329), quack grass AT (ErRi_37329), and miscanthus EZ (MgRi_39473). These proteins were classified as unknown since the annotation did not return any description. On the other hand, the least abundant proteins were found in red rice roots (OlR_075127), common reed AT (PaRi_001192) and red rice EZ (OlR_057933), and were annotated as E3 ubiquitin-protein ligase UPL2, transformation/transcription domain-associated protein, and HEAT repeat-containing protein, respectively (Table S20). According to the functional classification of the 100 most abundant and 100 least abundant proteins in each underground tissue (Figures S2 and S3), in the biological process category the functional distribution of proteins across the subcategories was very similar among the underground tissues and also between the more and less abundant protein sets (Figure S2). The same could not be said for the molecular function classification, especially among the 100 most abundant proteins, where proteins classified in the structural molecule activity subcategory were clearly more abundant in rhizome tissues than in roots (Figure S3). Proteins in this subcategory are associated with the structural integrity of complexes or their assembly, which may have greater importance in rhizome tissues due to their constant growth and development. Among the 100 least abundant proteins, the distribution of the molecular function categories did not indicate striking differences among the different tissues. However, compared to the most abundant proteins, slight differences were found, with the addition of subcategories such as enzyme regulator activity, present in all tissues, and electron carrier activity, present only in rhizome tissues.

Functional classification of the rhizome-characteristic proteins showed similar distribution among rhizomatous species

The rhizome-characteristic proteins in each species evaluated were functionally classified based on Gene Ontology (GO) terms in the three main categories: biological process, molecular function and cellular component. In general, similar distributions of GO terms were observed for these proteins across the rhizomatous species (Figure S4). In Figure S4, these proteins were classified in 20, 15 and 10 functional categories representing biological process, molecular function and cellular component, respectively. The most frequent terms in the biological process category were cellular process (~15 %), followed by metabolic process (~15 %) and cellular component organization (~9 %). Other important terms associated with developmental process, biological regulation, and response to stimulus also showed significant representation. In the molecular function category, binding (~47 %), catalytic activity (~35 %) and structural molecule activity (~10 %) were the most frequent terms. The terms cell (~25 %), organelle (~22 %) and membrane (~14 %) occurred most frequently in the cellular component category (Figure S4). Thus, the functional categorization of rhizome-characteristic proteins showed over-representation of the terms associated with central roles in cellular, metabolic and developmental processes, indicating an accelerated metabolism mainly due to the inherent role of these tissues in rhizome development and growth.

Conservation of expression of rhizome proteins across grasses and Equisetum hyemale

Protein abundance analysis in rhizome apical tips and elongation zones in different species can highlight candidate proteins associated with the process of rhizome growth and development. Thus, the classification of differentially regulated proteins in these tissues into orthologous groups was performed with the aim of comparing trends of protein enrichment among the different species. The classification of orthologous groups was done using OrthoMCL16 and TCW. The protein classification into orthologous group clusters was based on sequence similarity, where 91 % of proteins identified in the present study were classified in orthologous groups and could be analyzed among the rhizomatous species.

To evaluate similarity of protein regulation in rhizome tissues among species, only proteins differentially regulated in the rhizome tissues were considered (Table 1). Orthologous groups that showed the same trend of regulation (up- or down-regulation) in rhizome tissues compared to roots in at least two grasses and Equisetum hyemale (horsetail) were used to produce the cluster heat maps (Figures 3 and 4). A total of 223 orthologous groups were differentially regulated in AT compared to roots in at least two grasses and E. hyemale. The same trend of regulation was observed in those species, with 126 groups up-regulated and 97 down-regulated in AT compared to roots (Table S21). In the same way, 158 orthologous groups differentially regulated in EZ compared to roots showed conservation of differential regulation in at least two grasses and E. hyemale, with 92 and 66 groups up- and down-regulated in EZ, respectively (Table S22, Figure 2). All of these proteins, summarized in Tables S21 and S22, are also represented in schematic metabolic pathways in Figure 2 and Table S25, which give a general overview of important pathways differentially regulated in rhizome tissues of different rhizomatous species (at least two grasses and E. hyemale) compared to roots.
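The selection criterion used above (same direction of regulation in E. hyemale and in at least two grasses) can be expressed as a small filter. The authors' custom script is not reproduced in the text, so this Python sketch is only one plausible reading: each orthologous group is assumed to carry a per-species call of +1 (up), -1 (down) or 0 (not significant), and the names are hypothetical.

```python
GRASSES = {
    "E. repens", "I. cylindrica", "M. x giganteus",
    "O. longistaminata", "P. australis",
}
HORSETAIL = "E. hyemale"


def conserved_direction(calls, min_grasses=2):
    """Return +1 or -1 if the trend is conserved, otherwise 0.

    calls: dict mapping species -> +1 (up), -1 (down) or 0 (no change)
    for one orthologous group in one comparison (e.g. AT vs. roots).
    """
    direction = calls.get(HORSETAIL, 0)
    if direction == 0:
        return 0
    agreeing = sum(
        1 for species, call in calls.items()
        if species in GRASSES and call == direction
    )
    return direction if agreeing >= min_grasses else 0
```

Applying the filter to every orthologous group and tallying the +1 and -1 results would give the kind of up- and down-regulated group counts reported for each comparison.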

From those pairwise analyses (AT × roots and EZ × roots), 61 rhizome-characteristic orthologous protein groups could be selected and clustered (Table S23, Figure 3). These rhizome-characteristic groups are the groups up-regulated in both rhizome tissues (AT and EZ) compared to roots. They are in general associated with information pathways, especially RNA processing, and with cellular processes such as proteolysis, protein synthesis and cell proliferation. Among these proteins are ribosomal proteins (40S and 60S), elongation factors, a translation initiation factor, a TCP domain transcription factor, arginyl-tRNA synthetase, ribophorin, heat shock proteins, 26S proteasome and proliferation-associated proteins, which were distributed in ten main clusters (Figure 3). Cluster analysis of expression data for the 61 protein groups was used to identify groups of similarly expressed proteins in different species. As shown in Figure 3, clusters B to G represent proteins that
showed conservation of expression among different species, in particular between E. hyemale and E. repens tissues, where expression in E. hyemale was lower than in E. repens. Cluster J comprised the protein groups with the highest expression values in the rhizome tissues (AT and EZ) regardless of species. Tissue (AT and EZ) expression profiles from all species were also clustered, as shown at the top of the columns in Figure 3. Rhizome tissues from the same species clustered together, showing similarity of expression between the AT and EZ tissues of a given species.
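
The grouping of tissues by expression profile can be illustrated with a small average-linkage (UPGMA) clustering routine. The Python sketch below is an illustration under stated assumptions rather than the procedure used to produce Figure 3: the Euclidean distance, the function names and the toy profiles are all invented for the example.

```python
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def upgma(profiles):
    """Average-linkage agglomerative clustering.

    profiles: dict mapping a sample name to a list of abundance values.
    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of sample names.
    """

    def average_distance(c1, c2):
        # Mean of all pairwise distances between members of the two clusters.
        total = sum(euclidean(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters that are closest on average.
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(*pair),
        )
        merges.append((c1, c2, average_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Invented abundance profiles for two tissues from two hypothetical species.
toy = {
    "sp1_AT": [1.0, 1.0, 4.0],
    "sp1_EZ": [1.0, 2.0, 4.0],
    "sp2_AT": [6.0, 5.0, 0.0],
    "sp2_EZ": [6.0, 7.0, 0.0],
}
history = upgma(toy)
for cluster_a, cluster_b, distance in history:
    print(cluster_a, cluster_b, round(distance, 3))
```

In this toy input the two tissues of each hypothetical species merge first, which is the pattern described for the AT and EZ samples above.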

Using the same approach, spatial differences were detected by comparing AT and EZ protein abundances; 36 orthologous groups were differentially regulated between AT and EZ (Table S24). These protein groups fell into eight main clusters based on their regulation profiles within the rhizome samples and are represented by functional class in Figure 4. Six ribosomal protein groups were distributed throughout clusters A to C along with other proteins such as leukotriene A4 hydrolase (cluster A), xyloglucan endoglucanase and ketol-acid reductoisomerase (cluster B), and S-norcoclaurine synthase, citrate synthase, vacuolar membrane proton pump and S-adenosylmethionine synthase (cluster C). Cluster D comprised only two unknown proteins, while cluster E included a histone H2B and RuBisCO large subunit. Cluster F contained proteins involved in carbohydrate metabolism (sucrose synthase, endoglucanase), proteins associated with the endoplasmic reticulum (endoplasmin and calreticulin) or the cytoskeleton (alpha-tubulin), stress-related proteins (alcohol dehydrogenase and DERPP4) and a membrane protein (porin). Cluster G contained one enzyme, phenylalanine ammonia lyase. Finally, cluster H grouped two proteins involved in ROS detoxification (glutathione S-transferase and glutathione peroxidase) and a heat shock protein. In summary, proteins related to carbohydrate metabolism, lignin biosynthesis, detoxification, stress response, protein folding, cell
wall organization, TCA cycle and lipid breakdown were up-regulated in EZ compared to AT, while proteins associated with transcriptional control, RNA splicing, protein synthesis, photosynthesis, amino acid biosynthesis, alkaloid metabolism and leukotriene biosynthesis were up-regulated in AT compared to EZ (Figure 4).

Next-generation sequencing databases for data mining

Mass spectral data mining for proteomics relies on high quality genome or transcript sequence databases. Transcriptome database quality can be assessed using many metrics, including coverage, contig length, and sequence ambiguity. At present, no specific guidelines exist for validating databases for mass spectral data mining. This is not surprising given the recent development of Next Generation Sequencing (NGS), the many different NGS platforms, and the diversity of assembly algorithms. However, there have been recent attempts to address these aspects by generating RNA-seq-based proteome databases.23,24 Furthermore, transcriptome databases reflect the source material from which they were derived. For example, a rhizome-specific NGS database would have different transcript (contig) frequencies than a database derived from leaves of the same plant. The abundant transcripts in such databases would generally have a higher quality contig assembly due to the increased read counts, which is quite relevant for proteomic studies given that even the most comprehensive ones identify only the top 10% of abundant proteins. Thus, any stringency requirement implemented for “mass spectral mining quality” could not account for all possible combinations of platform, algorithm, and RNA source material. For this reason, the criteria and thresholds mentioned here are not intended to be recommendations, though we strongly endorse the use of homologous databases with respect to species and tissue sources. Instead, we recommend monitoring outputs that reveal poor spectral matching when mining NGS databases, including: 1) false discovery rate, FDR (using the decoy
database strategy); 2) proteome coverage (number of unique peptide matches to achieve a standard 1% FDR cutoff); and 3) correlation of differentially expressed proteins among parallel proteomic studies (i.e., same tissues, but multiple organisms), as performed here. In this investigation, the use of these stringent criteria for proteomics data supports the quality of the database assemblies.

DISCUSSION

In the first vascular land plants, rhizomes were common. As plants colonized the land, they would have experienced powerful selective pressure that favored upright stems over underground stems. Plants abandoned the underground “stems” (rhizomes) and increased the surface area of their absorptive systems in soil while also expanding their photosynthetic organs to acquire more nutrients. Thus, selective pressures favored the development of extensive root systems and upright stems. However, rhizomes reemerged (or never entirely disappeared) because of the benefits they offer, such as fire resistance, the ability to reach distant water and nutrients, and the ability to rapidly expand into new territory in an environment that is less favorable to seed dispersal. Thus, many of the most noxious weeds use rhizomes as an important organ for invasiveness and massive proliferation. Understanding rhizome development may not only permit us one day to better control weedy grasses, but may also help us to understand how an upright stem grows and develops, possibly improving the yields of different plant cultivars with better use of land resources.

In this study we employ a quantitative proteomic approach to identify and relatively quantify proteins involved in rhizome growth and development. Protein classification by ortholog mapping was necessary for subsequent analyses, allowing for the same proteins to have their
expression evaluated within the studied tissues in different rhizomatous species, including a more primitive species (E. hyemale) and five grasses.

Proteomics of plant rhizomes identified and relatively quantified approximately 2,000 proteins per species

The GeLC-MS/MS and spectral counting method used in this study is an example of a simple and fast global proteomics approach for detecting differences in protein expression. Using five biological replicates per tissue type improved the sampling of peptides by mass spectrometry, because in data-dependent acquisition (DDA) mode the number of peptides sampled in each run is limited by the MS/MS sampling speed of the mass analyzer. Moreover, to maximize proteome coverage and the number of assigned PSMs, custom databases for each plant species studied were produced from comprehensive RNA sequencing of rhizome tissues25 (www.plantrhizome.org).
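
As a rough illustration of how spectral counts from replicate runs can be turned into a relative abundance comparison, the Python sketch below scales each replicate by its total spectral count, averages across replicates, and computes a log2 ratio against a reference tissue. It is a simplified stand-in and not the statistical workflow used in the study; the normalization scheme, the pseudocount, the function names and the counts are all assumptions.

```python
import math


def normalize(replicate):
    """Scale one replicate's spectral counts so that they sum to 1."""
    total = sum(replicate.values())
    if total == 0:
        raise ValueError("replicate contains no spectral counts")
    return {protein: count / total for protein, count in replicate.items()}


def mean_abundance(replicates):
    """Average the normalized abundance of each protein across replicates.

    A protein that is absent from a replicate contributes 0 for that run.
    """
    normalized = [normalize(r) for r in replicates]
    proteins = set().union(*normalized)
    return {
        p: sum(n.get(p, 0.0) for n in normalized) / len(normalized)
        for p in proteins
    }


def log2_ratio(tissue, reference, pseudocount=1e-6):
    """log2(tissue / reference) per protein; the pseudocount avoids log(0)."""
    proteins = set(tissue) | set(reference)
    return {
        p: math.log2(
            (tissue.get(p, 0.0) + pseudocount)
            / (reference.get(p, 0.0) + pseudocount)
        )
        for p in proteins
    }


# Invented counts for two proteins in two replicates per tissue.
apical_tip = [{"P1": 30, "P2": 10}, {"P1": 60, "P2": 20}]
roots = [{"P1": 10, "P2": 30}, {"P1": 20, "P2": 60}]
ratios = log2_ratio(mean_abundance(apical_tip), mean_abundance(roots))
print(ratios)
```

A positive ratio marks a protein that is more abundant in the tissue than in the reference. A real analysis would add a statistical test across replicates before calling a protein differentially regulated.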

Proteomics has previously been employed to study features of rhizomes, or rhizomatousness. One of the original studies was that of Lum et al.,26 who used 2D gels to find proteins that could be used as markers for different ginseng species and rhizome parts. More recently, Boonmee et al.27 (2011) also used 2D gels to identify several rhizome proteins from Curcuma comosa, most having roles in antioxidation. Then, using the same approach employed here, three comparative studies of rhizome tissues of Equisetum hyemale,5 common reed4 and red rice25 identified a large number of proteins related to rhizome development. In the first study, a total of 1,911 and 1,860 non-redundant proteins were identified in E. hyemale AT and EZ, respectively, while in the second, 1,195 and 1,236 non-redundant proteins were identified in AT and EZ of common reed. More recently, in the red rice paper,9 2,921 non-redundant proteins were detected
in the rhizome tissues, 41 of which were enriched in both rhizome tissues (AT and EZ) compared to roots. Similarly, in the present study, the GeLC-MS/MS strategy allowed the identification of approximately 2,000 non-redundant proteins from the studied organs in each species, producing an extensive database of rhizome proteins.

Moreover, our basic approach relied on the examination of proteins that differ to some degree in abundance in EZ and AT in different rhizomatous species. Root samples were used as background material, serving as a reference in the pairwise spectral counting comparisons. From these comparisons, an average of 200 rhizome-characteristic proteins were quantified in each species, and these showed similar functional distributions based on GO terms. This indicates that, regardless of the species studied, the same or functionally similar rhizome-characteristic proteins are active in these tissues.

Rhizome-characteristic proteins with conserved expression among rhizomatous species

As described above, the pairwise comparisons between rhizome tissues (AT and EZ) and roots in all species revealed rhizome-characteristic proteins, which were then classified into orthologous protein groups. From that analysis, 61 orthologous groups showed conservation of expression among the rhizomatous species studied (Figures 2 and 3).
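
Selecting the groups that are up-regulated in both rhizome tissues amounts to a set intersection. A minimal Python sketch follows, assuming each comparison against roots has already been reduced to a mapping from group identifier to trend; the function name and the identifiers are hypothetical.

```python
def rhizome_characteristic(at_vs_root, ez_vs_root):
    """Return the groups up-regulated versus roots in both AT and EZ.

    Each argument maps an orthologous group identifier to "up" or "down".
    """
    at_up = {group for group, trend in at_vs_root.items() if trend == "up"}
    ez_up = {group for group, trend in ez_vs_root.items() if trend == "up"}
    return sorted(at_up & ez_up)


# Invented identifiers and trends.
at_vs_root = {"OG_A": "up", "OG_B": "up", "OG_C": "down"}
ez_vs_root = {"OG_A": "up", "OG_B": "down", "OG_D": "up"}
shared = rhizome_characteristic(at_vs_root, ez_vs_root)
print(shared)
```

Groups that are up-regulated in only one tissue, or absent from one comparison, are excluded by construction.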

From those, the ribosomal proteins (40S and 60S subunits) were the main representatives, appearing in six of the ten clusters defined in Figure 3. A total of 15 ribosomal protein orthologous groups showed higher expression levels in AT in relation to EZ. This is clear for cluster C, which was composed mainly of ribosomal orthologous groups. Arabidopsis mutants in ribosomal proteins show a large range of developmental phenotypes suggesting that ribosomes
have specific functions regulating the expression of developmental genes.9,28,29,30 These proteins may have an important role in the development of rhizomes especially in meristematic tissues (AT) where cellular differentiation is active.

In addition to ribosomal proteins, other proteins mainly involved in protein synthesis, cell division, transcription regulation, and protein degradation were distributed throughout the clusters (Figure 3). The presence of these protein categories indicates intense proliferative activity in both tissues (AT and EZ). These proteins can also be seen at the bottom right of Figure 2, where they are shown with a red signal (up-regulation) for both tissues.

Increased abundance of elongation factors (clusters C and H), eukaryotic translation initiation factor 3 (clusters G and H), and arginyl-tRNA synthetase (cluster F) in rhizome tissues, in addition to the numerous up-regulated ribosomal proteins, suggests increased protein synthesis to support developmental changes and cell growth. Elongation factors are essential for protein synthesis, mediating the translocation step in peptide chain elongation. In tobacco, elongation factor EF-1 was shown to accumulate in meristems, rapidly growing tissues, and developing gametophytes.31 Similarly, translation initiation factors (eIF) are the main elements affecting global translation, through changes in either their level or their phosphorylation (reviewed in Mathews et al.32). The eIF3 subunits b and e (clusters H and G, respectively) were found to be enriched in rhizome tissues. eIF3b transcripts accumulate in tissues with high mitotic activity and are regulated in a cell-specific manner.33 On the other hand, eIF3e, in addition to its presence in the eIF3 complex, was found to be associated with the COP9 signalosome complex and to be localized in the nucleus, suggesting a dual function that may be regulatory.34,35,36

Another protein associated with ribosomal proteins and found among the rhizome-characteristic proteins was ribophorin II (cluster I). This protein is located in the membrane of the rough endoplasmic reticulum and has a key role in the binding of ribosomes to the rough ER as well as in co-translational processes. In humans, ribophorin II regulates breast tumor initiation and metastasis.37 In Arabidopsis, ribophorin II was up-regulated in leaves under cold acclimation.38 Its role in rhizome development is yet to be determined.

Another protein enriched in the rhizome tissues was the T-complex protein (TCP-1), represented by two different orthologous groups in the same cluster (cluster D). TCP-1 assists with the folding of newly translated proteins.39 Other proteins related to protein fate, such as heat shock proteins (clusters E and G) and 26S proteasome (clusters I and J), were also enriched in the rhizome tissues, further pointing to broader proteome remodeling in this organ.

Proteins involved in transcriptional regulation, such as proliferation-associated protein 2G4 (cluster H) and a TCP transcription factor family protein (cluster A), were also up-regulated in rhizome tissues. The proliferation-associated protein 2G4 is modified in a cell cycle-specific manner.40 Members of the TCP transcription factor family are important regulators of plant growth and development, including leaf morphogenesis.41,42 They can modulate cell proliferation by controlling transcription of cyclin43 or by inducing cell differentiation.42 In the present study, both proteins showed higher expression in rhizome AT, where active cell proliferation and differentiation are in progress.

Three proteins related to cell division were also enriched in these tissues: cell division control protein, dynamin-related protein and proliferating cell nuclear antigen (PCNA). All of these were grouped in cluster C. PCNA serves as an accessory factor of
DNA polymerase epsilon, required for DNA replication44 and meristematic cell division.45 Additionally, promoter elements of PCNA genes can be regulated by binding to TCP proteins (also found enriched in the same tissues), demonstrating the interaction of TCP and PCNA genes in plant growth and development.45

Differences between rhizome contiguous tissues – Apical tip and Elongation Zone

Differences in protein regulation between AT and EZ were also evaluated in all species, and the protein groups with conserved regulation are reported in Figure 4. In AT samples, we observed an enrichment of ribosomal proteins relative to EZ samples. Six ortholog groups of ribosomal proteins (clusters A to C, Figure 4) were detected, including 40S and 60S subunits. As discussed in the previous section, 15 ribosomal protein groups were enriched in both rhizome tissues when compared to roots. However, only two of the six ortholog groups differentially regulated between AT and EZ were among those 15 groups. In other words, the other four ribosomal protein groups (OM4_0000244, OM4_0000537, OM4_0000120 and OM4_0000178 in Table S24) differentially expressed in AT compared to EZ seem to have a specific role in the development of AT tissue, since they were not enriched in EZ samples.

Two proteins involved in transcription regulation were up-regulated in AT: histone H2B (cluster E, Figure 4) and DEAD-box ATP-dependent RNA helicase 15 (cluster H, Figure 4). Both are classified as nucleotide-binding proteins, the first a DNA-binding protein and the second an RNA-binding protein, and both function during transcription. Histone H2B is a nuclear protein involved in the structure of chromatin as a core component of nucleosomes. Histone modification plays an essential role in reprogramming gene expression, which is important in cellular proliferation and differentiation processes46,47 and in defense responses.48 Mutations of
histone H2B, hub1 and hub2, affected leaf growth rate due to the interruption of the G2-M transition, increasing the cell cycle length.49 Several genes associated with mitotic transition were down-regulated in the hub1 mutant. Up-regulation of histone H2B in AT is consistent with the proliferative activity of this tissue. DEAD-box ATP-dependent RNA helicases can participate in the regulation of essentially all of the processes involved in RNA metabolism, from transcription to degradation.50 Several DEAD-box RNA helicases have been shown to play crucial roles at important steps of splicing.51

RuBisCO “large chain” is a 60 kDa chaperonin subunit alpha, which participates in the assembly of the RuBisCO complex in chloroplasts of higher plants. This chaperone subunit was up-regulated in AT of all species analyzed (except red rice), which suggests that this protein has an important role in actively dividing tissues. Chaperone proteins are employed by plants to avoid misfolded proteins and to prevent or reverse incorrect protein interactions.52

In EZ, proteins associated with cell wall metabolism were up-regulated compared to AT, except xyloglucan endotransglucosylase/hydrolase (XTH), which was up-regulated in AT. The α-1,4-glucan-protein synthase 2, endoglucanase 25, and sucrose synthase (SuSy) enzymes were each up-regulated in EZ compared to AT. Sucrose synthase, a key enzyme of carbohydrate metabolism, was up-regulated in EZ of all species analyzed except red rice. This conserved regulation in EZ tissues points to the importance of this enzyme for this tissue. The UDP-glucose formed from sucrose by SuSy is used directly as a substrate by the cellulose synthase complex. Suppression of SuSy gene expression repressed cotton fiber cell initiation, elongation and seed development.53 The α-1,4-glucan-protein synthase is a self-glycosylating protein localized in the Golgi apparatus, with a potential role in hemicellulose synthesis.54 Although enzymes involved in the biosynthesis of carbohydrate cell components were up-regulated in EZ, an endoglucanase with
hydrolase activity against cell wall polysaccharides was also up-regulated in this tissue. It has been shown that the action of endoglucanase causes cell walls to extend in vitro,55 which is a required process in EZ tissues. Participating in the same process of cell wall loosening and rearrangement, the XTHs have been shown to be the key enzymes that catalyze the reconstruction, rearrangement, breakdown and incorporation of xyloglucan bound to adjacent cellulose microfibrils in the cell wall.56 Characterization of a specific XTH in Arabidopsis, XTH9, revealed strong expression in rapidly dividing and expanding tissues, suggesting an important function in the development and morphogenesis of tissues close to shoot apices.57

In addition to carbohydrate metabolism-related proteins, EZ samples showed up-regulation of proteins involved in detoxification and the phenylpropanoid pathway. In the detoxification category, glutathione S-transferase (GST) and glutathione peroxidase (GPX) were detected (cluster H) in all species analyzed, with higher accumulation in EZ samples than in AT. In tobacco, transgenic lines overexpressing GST/GPX showed improved seed germination and seedling growth, and this difference was accentuated under stress conditions.58 The phenylpropanoid pathway was represented by phenylalanine ammonia-lyase (PAL) (cluster G), which was up-regulated in EZ in all species analyzed except common reed. PAL catalyzes the first step of the phenylpropanoid pathway, which produces precursors to a variety of important metabolites such as lignin, flavonoids, anthocyanins and plant hormones.59 Most lignification occurs during secondary cell wall deposition, which takes place after cell wall elongation.60 This might be the case in rhizome EZ tissues, which are synthesizing cell wall components at high rates and show PAL up-regulation compared to AT tissues.

Other proteins associated with different metabolic pathways were also up-regulated in EZ compared to AT with lower conservation of regulation among the species analyzed. Proteins such as those involved in lipid biosynthesis, protein folding, stress response, cell organization and transport were up-regulated in horsetail and in two or three grasses.

CONCLUSION

To better define rhizomes at the proteome level we employed large-scale proteome analyses using spectral counting for comparison of developing rhizome tissues against developing roots. The present study encompasses different rhizomatous species including an ancient species (E. hyemale) and five grasses. Utilization of multiple plant species allowed us to identify consensus proteins important for establishment, growth and development of rhizomes. These proteins are specifically enriched in rhizome tissues across different species. Among these were proteins related to protein fate, specifically ribosomal proteins and proteins related to transcription regulation. Differences between apical tips and elongation zones pointed to active carbohydrate metabolism and cell wall elongation in elongation zones. Further studies aiming to characterize tissue-specific protein isoforms may be necessary to further expand our understanding of protein regulation during rhizome growth and development.

ASSOCIATED CONTENT

Table S1: Statistics of de novo transcriptome assemblies of horsetail and different monocot species. Table S2 to S4: List of SePro filtered MS/MS spectra containing the SEQUEST scores and proposed peptide sequences detected in Equisetum hyemale tissues. Table S5 to S7: List of SePro filtered MS/MS spectra containing the SEQUEST scores and proposed peptide sequences
detected in Elytrigia repens tissues. Table S8 to S10: List of SePro filtered MS/MS spectra containing the SEQUEST scores and proposed peptide sequences detected in Imperata cylindrica tissues. Table S11 to S13: List of SePro filtered MS/MS spectra containing the SEQUEST scores and proposed peptide sequences detected in Miscanthus × giganteus tissues. Table S14 to S16: List of SePro filtered MS/MS spectra containing the SEQUEST scores and proposed peptide sequences detected in Oryza longistaminata tissues. Table S17 to S19: List of SePro filtered MS/MS spectra containing the SEQUEST scores and proposed peptide sequences detected in Phragmites australis tissues. Table S20: Proteins identified by GeLC-MS/MS in rhizome tissues (apical tip and elongation zone) and roots of different rhizomatous species. Table S21: Orthologous groups that showed the same pattern of regulation across rhizome apical tip tissues (AT) compared to roots in different monocot species and horsetail. Table S22: Orthologous groups that showed the same pattern of regulation across rhizome elongation zone tissues (EZ) compared to roots in different monocot species and horsetail. Table S23: Rhizome-characteristic orthologous protein groups up-regulated in AT and EZ compared to roots in horsetail and across two, three or four grasses. Table S24: Orthologous groups that showed the same pattern of regulation across rhizome apical tip (AT) compared to elongation zone tissues (EZ) in different monocot species and horsetail. Table S25: Orthologous groups that showed the same pattern of regulation across rhizome tissues (apical tip and elongation zone) compared to roots in different monocot species and horsetail. Figure S1: GeLC-MS/MS workflow employed in the identification and spectral counting-based quantification of rhizome-characteristic proteins of different species.
Figure S2: Functional classification based on GO terms associated with biological process of the top 100 most abundant and least abundant proteins identified in each tissue based on normalized spectral counts and regardless of species type. R, roots; AT, apical tip;
EZ, elongation zone. Figure S3: Functional classification based on GO terms associated with molecular function of the top 100 most abundant and least abundant proteins identified in each tissue based on normalized spectral counts regardless of species type. R, roots; AT, apical tip; EZ, elongation zone. Figure S4: Gene Ontology categorization of up-regulated proteins in the apical tip and elongation zone (rhizome-characteristic proteins) of all six rhizome species analyzed. Up-regulated proteins were classified based on GO terms for biological process, molecular function and cellular component. The data are represented at the second GO hierarchical level. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author

* Jay J. Thelen, Christopher S. Bond Life Sciences Center, Department of Biochemistry, University of Missouri, 1201 Rollins St., Columbia, MO, USA. E-mail: [email protected]

Present Addresses

¥ State University of Campinas, Institute of Biology, Campinas, SP, Brazil and São Paulo State University “Julio de Mesquita Filho”, Department of Technology, Jaboticabal, SP, Brazil

§ São Paulo State University “Julio de Mesquita Filho”, Department of Technology, Jaboticabal, SP, Brazil

ǂ Biostatistics and Bioinformatics Division, Yenepoya Research Center/Yenepoya University, Mangalore 575018, India

Author Contributions

DRG and JJT designed the research. FS, TS and RH performed experiments. FS and TS analyzed the proteomic data. RPR contributed computational tools. CAS and WN adapted the TCW software for protein analysis and built the databases. FS, JJT and DRG wrote the paper.

Funding Sources

National Science Foundation (Grant IOS-1044821).

ACKNOWLEDGMENT

We gratefully acknowledge the US National Science Foundation (Grant IOS-1044821) for financial support of this research.

REFERENCES

(1) Jang, C.S.; Kamps, T.L.; Tang, H.; Bowers, J.E.; Lemke, C.; Paterson, A.H. Evolutionary fate of rhizome-specific genes in a non-rhizomatous Sorghum genotype. Heredity 2009, 102, 266–273
(2) Jang, C.S.; Kamps, T.L.; Skinner, D.N.; Schulze, S.R.; Vencill, W.K.; Paterson, A.H. Functional classification, genomic organization, putatively cis-acting regulatory elements, and relationship to quantitative trait loci, of sorghum genes with rhizome-enriched expression. Plant Physiol 2006, 142, 1148–1159
(3) Mauseth, J.D. Plant Anatomy. The Benjamin/Cummings Publishing Co.: Menlo Park, CA, 1988
(4) He, R.; Kim, M.J.; Nelson, W.; Balbuena, T.S.; Kim, R.; Kramer, R.; Crow, J.A.; May, G.D.; Thelen, J.J.; Soderlund, C.A.; Gang, D.R. Next generation sequencing based transcriptomic and proteomic analysis of the common reed, Phragmites australis (Poaceae) reveals genes involved in invasiveness and rhizome specificity. Am J Bot 2012, 99, 232–247
(5) Balbuena, T.S.; He, R.; Salvato, F.; Gang, D.R.; Thelen, J.J. Large-scale proteome comparative analysis of developing rhizomes of the ancient vascular plant Equisetum hyemale. Front Plant Sci 2012, 3, 131 (6) Hajduch, M.; Hearne, L.B.; Miernyk, J.A.; Casteel, J.E.; Joshi, T.; Agrawal, G.K.; Song, Z.; Zhou, M.; Xu, D.; Thelen, J.J. Systems analysis of seed filling in Arabidopsis using general linear modeling to assess concordance of transcript and protein expression. Plant Physiol 2010, 152, 2078-2087 (7) Jayasena, A.S.; Secco, D.; Bernath-Levin, K.; Berkowitz, O.; Whelan, J.; Mylne, J.S. Next generation sequencing and de novo transcriptomics to study gene evolution. Plant Methods 2014, 10, 34 (8) Torre, S.; Tattini, M.; Brunetti, C.; Fineschi, S.; Fini, A.; Ferrini, F.; Sebastiani, F. RNA-seq analysis of Quercus pubescens leaves: De Novo transcriptome assembly, annotation and functional markers development. Plos One 2014, 9, 11, e112487 (9) He, R.; Salvato, F.; Park, J.J.; Kim, M.J.; Nelson, W.; Balbuena, T.S.; Willer, M.; Crow, J.A.; May, G.D.; Soderlund, C.A.; Thelen, J.J.; Gang, D.R. A systems-wide comparison of red rice (Oryza longistaminata) tissues identifies rhizome specific genes and proteins that are targets for cultivated rice improvement. BMC Plant Biology 2014, 14, 46 (10) Wernersson, R. Virtual ribosome – a comprehensive translation tool with support for sequence feature integration. Nucleic Acids Res 2006, 34, W385–W388 (11) Carvalho, P.C.; Fischer, J.S.G.; Tao, X.; Cociorva, D.; Balbuena, T.S.; Valente, R.; Perales, J.; Yates, J.R.III; Barbosa, V.C. Search engine processor: filtering and organizing peptide spectrum matches. Proteomics 2012, 12, 944–949

(12) Soderlund, C.; Nelson, W.; Willer, M.; Gang, D.R. TCW: Transcriptome computational workbench. PLoS One 2013, 8, e69401
(13) Dimmer, E.C.; Huntley, R.P.; Alam-Faruque, Y.; Sawford, T.; O'Donovan, C.; Martin, M.J. et al. The UniProt-GO Annotation database in 2011. Nucleic Acids Res 2012, 40, D565–D570
(14) Altschul, S.F.; Madden, T.L.; Schäffer, A.A.; Zhang, J.; Zhang, Z.; Miller, W.; Lipman, D.J. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25, 3389–3402
(15) Robinson, M.D.; McCarthy, D.J.; Smyth, G.K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 2010, 26, 139–140
(16) Li, L.; Stoeckert, C.J. Jr.; Roos, D.S. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res 2003, 13, 2178–2189
(17) Conesa, A.; Gotz, S.; Garcia-Gomez, J.M.; Terol, J.; Talon, M.; Robles, M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 2005, 21, 3674–3676
(18) Caraux, G.; Pinloche, S. PermutMatrix: a graphical environment to arrange gene expression profiles in optimal linear order. Bioinformatics 2005, 21, 1280–1281
(19) Sokal, R.R.; Michener, C.D. A statistical method for evaluating systematic relationships. Univ Kans Sci Bull 1958, 38, 1409–1438
(20) Liu, H.; Sadygov, R.G.; Yates, J.R. III. A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Analytical Chemistry 2004, 76, 4193–4201
(21) Old, W.M.; Meyer-Arendt, K.; Aveline-Wolf, L.; Pierce, K.G.; Mendoza, A.; Sevinsky, J.R.; Resing, K.A.; Ahn, N.G. Comparison of label-free methods for quantifying human proteins by shotgun proteomics. Mol Cell Proteomics 2005, 4, 1487–1502
(22) Stevenson, S.E.; Chu, Y.; Ozias-Akins, P.; Thelen, J.J. Validation of gel-free quantitative proteomics approaches: applications for seed allergen profiling. Journal of Proteomics 2009, 72, 555–566
(23) Sheynkman, G.M.; Johnson, J.E.; Jagtap, P.D.; Shortreed, M.R.; Onsongo, G.; Frey, B.L.; Griffin, T.J.; Smith, L.M. Using Galaxy-P to leverage RNA-Seq for the discovery of novel protein variations. BMC Genomics 2014, 15, 703
(24) Park, H.; Bae, J.; Kim, H.; Kim, S.; Kim, H.; Mun, D.G.; Joh, Y.; Lee, W.; Chae, S.; Lee, S.; Kim, H.K.; Hwang, D.; Lee, S.W.; Paek, E. Compact variant-rich customized sequence database and a fast and sensitive database search for efficient proteogenomic analyses. Proteomics 2014, 14, 2742–2749
(25) Lum, J.H-K.; Fung, K-L.; Wong, M-S.; Lee, C-H.; Kwok, FS-L.; Leung, MC-P.; Hui, P-K.; Lo, SC-L. Proteome of oriental ginseng Panax ginseng C.A. Meyer and the potential to use it as an identification tool. Proteomics 2002, 2, 1123–1130
(26) Boonmee, A.; Srisomsap, C.; Chockchaichamnankit, D.; Karnchanatat, A.; Sangvanich, P. A proteomic analysis of Curcuma comosa Roxb. rhizomes. Proteome Science 2011, 9, 43
(27) Degenhardt, R.F.; Bonham-Smith, P.C. Arabidopsis ribosomal proteins RPL23aA and RPL23aB are differentially targeted to the nucleolus and are disparately required for normal development. Plant Physiol 2008, 147, 128–142
(28) Byrne, M.E. A role for the ribosome in development. Trends Plant Sci 2009, 14, 512–519
(29) Horiguchi, G.; Mollá-Morales, A.; Pérez-Pérez, J.M.; Kojima, K.; Robles, P.; Ponce, M.R.; Micol, J.L.; Tsukaya, H. Differential contributions of ribosomal protein genes to Arabidopsis thaliana leaf development. Plant J 2011, 65, 724–736
(30) Szakonyi, D.; Byrne, M.E. Ribosomal protein L27a is required for growth and patterning in Arabidopsis thaliana. Plant J 2011, 65, 269–281
(31) Ursin, V.M.; Irvine, J.M.; Hiatt, W.R.; Shewmaker, C.K. Developmental analysis of elongation factor-1α expression in transgenic tobacco. Plant Cell 1991, 3, 583–591
(32) Mathews, M.B.; Sonenberg, N.; Hershey, J.W.B. Origins and targets of translational control. In Translational Control; Mathews, M.B.; Sonenberg, N.; Hershey, J.W.B., Eds.; Cold Spring Harbor Laboratory Press: New York, 1996, 1–29
(33) Shen, W.H.; Gigot, C. Characterization of Prt1, a gene encoding for one of the subunits of the translation initiation factor 3 (eIF3) from Nicotiana tabacum. Plant Sci 1999, 143, 45–54
(34) Yahalom, A.; Kim, T.H.; Winter, E.; Karniol, B.; Von Arnim, A.G.; Chamovitz, D.A. Arabidopsis eIF3e (INT-6) associates with both eIF3c and the COP9 signalosome subunit CSN7. J Biol Chem 2000, 276, 334–340
(35) Desbois, C.; Rousset, R.; Bantignies, F.; Jalinot, P. Exclusion of Int-6 from PML nuclear bodies by binding to the HTLV-I Tax oncoprotein. Science 1996, 273, 951–953
(36) Paz-Aviram, T.; Yahalom, A.; Chamovitz, D.A. Arabidopsis eIF3e interacts with subunits of the ribosome, Cop9 signalosome and proteasome. Plant Signaling & Behavior 2008, 3, 409–411
(37) Takahashi, R.; Takeshita, F.; Honma, K.; Ono, M.; Kato, K.; Ochiya, T. Ribophorin II regulates breast tumor initiation and metastasis through the functional suppression of GSK3β. Sci Rep 2013, 3, 2474
(38) Kawamura, Y.; Uemura, M. Mass spectrometric approach for identifying putative plasma membrane proteins of Arabidopsis leaves associated with cold acclimation. Plant J 2003, 36, 141–154
(39) Yaffe, M.B.; Farr, G.W.; Miklos, D.; Horwich, A.L.; Sternlicht, M.L.; Sternlicht, H. Tcp1 complex is a molecular chaperone in tubulin biogenesis. Nature 1992, 358, 245–248
(40) Radomski, N.; Jost, E. Molecular cloning of a murine cDNA encoding a novel protein p382G4, which varies with the cell cycle. Experimental Cell Research 1995, 220, 434–445
(41) Ori, N.; Cohen, A.R.; Etzioni, A.; Brand, A.; Yanai, O.; Shleizer, S.; Menda, N.; Amsellem, Z.; Efroni, I.; Pekker, I.; Alvarez, J.P.; Blum, E.; Zamir, D.; Eshed, Y. Regulation of LANCEOLATE by miR319 is required for compound-leaf development in tomato. Nat Genet 2007, 39, 787–791
(42) Nath, U.; Crawford, B.C.; Carpenter, R.; Coen, E. Genetic control of surface curvature. Science 2003, 299, 1404–1407
(43) Li, C.; Potuschak, T.; Colón-Carmona, A.; Gutiérrez, R.A.; Doerner, P. Arabidopsis TCP20 links regulation of growth and cell division control pathways. Proc Natl Acad Sci USA 2005, 102, 12978–12983
(44) Tan, C.K.; Castillo, C.; So, A.G.; Downey, K.M. An auxiliary protein for DNA polymerase delta from fetal calf thymus. J Biol Chem 1986, 261, 12310–12316
(45) Cubas, P.; Lauter, N.; Doebley, J.; Coen, E. The TCP domain: a motif found in proteins regulating plant growth and development. Plant J 1999, 18, 215–222
(46) Braszewska-Zalewska, A.J.; Wolny, E.A.; Smialek, L.; Hasterok, R. Tissue-specific epigenetic modifications in root apical meristem cells of Hordeum vulgare. PLoS One 2013, 8, e69204
(47) Wright, D.E.; Wang, C.Y.; Kao, C.F. Flickin' the ubiquitin switch: the role of H2B ubiquitylation in development. Epigenetics 2011, 6, 1165–1175

(48) Zou, B.; Yang, D.L.; Shi, Z.; Dong, H.; Hua, J. Monoubiquitination of histone 2B at the disease resistance gene locus regulates its expression and impacts immune responses in Arabidopsis. Plant Physiol 2014, 165, 309–318
(49) Fleury, D.; Himanen, K.; Cnops, G.; Nelissen, H.; Boccardi, T.M.; Maere, S.; Beemster, G.T.; Neyt, P.; Anami, S.; Robles, P.; Micol, J.L.; Inzé, D.; Van Lijsebettens, M. The Arabidopsis thaliana homolog of yeast BRE1 has a function in cell cycle regulation during early leaf and root growth. Plant Cell 2007, 19, 417–432
(50) Cordin, O.; Banroques, J.; Tanner, N.K.; Linder, P. The DEAD-box protein family of RNA helicases. Gene 2006, 367, 17–37
(51) Schwer, B.; Meszaros, T. RNA helicase dynamics in pre-mRNA splicing. EMBO J 2000, 19, 6582–6591
(52) Ranford, J.C.; Coates, A.R.; Henderson, B. Chaperonins are cell-signalling proteins: the unfolding biology of molecular chaperones. Expert Rev Mol Med 2000, 2, 1–17
(53) Ruan, Y.L.; Llewellyn, D.J.; Furbank, R.T. Suppression of sucrose synthase gene expression represses cotton fiber cell initiation, elongation, and seed development. Plant Cell 2003, 15, 952–964
(54) Dhugga, K.S.; Ulvskov, P.; Gallagher, S.R.; Ray, P.M. Plant polypeptides reversibly glycosylated by UDPglucose: possible components of Golgi beta-glucan synthase in pea cells. J Biol Chem 1991, 266, 21977–21984
(55) Yuan, S.; Wu, Y.; Cosgrove, D.J. A fungal endoglucanase with plant cell wall extension activity. Plant Physiol 2001, 127, 324–333

(56) Yokoyama, R.; Nishitani, K. A comprehensive expression analysis of all members of a gene family encoding cell-wall enzymes allowed us to predict cis-regulatory regions involved in cell-wall construction in specific organs of Arabidopsis. Plant Cell Physiol 2001, 42, 1025–1033
(57) Hyodo, H.; Yamakawa, S.; Takeda, Y.; Tsuduki, M.; Yokota, A.; Nishitani, K.; Kohchi, T. Active gene expression of a xyloglucan endotransglucosylase/hydrolase gene, XTH9, in inflorescence apices is related to cell elongation in Arabidopsis thaliana. Plant Mol Biol 2003, 52, 473–482
(58) Roxas, V.P.; Lodhi, S.A.; Garrett, D.K.; Mahan, J.R.; Allen, R.D. Stress tolerance in transgenic tobacco seedlings that overexpress glutathione-S-transferase/glutathione peroxidase. Plant Cell Physiol 2000, 41, 1229–1234
(59) La Camera, S.; Gouzerh, G.; Dhondt, S.; Hoffmann, L.; Fritig, B.; Legrand, M.; Heitz, T. Metabolic reprogramming in plant innate immunity: the contributions of phenylpropanoid and oxylipin pathways. Immunological Reviews 2004, 198, 267–284
(60) Jung, H.G.; Casler, M.D. Maize stem tissues: cell wall concentration and composition during development. Crop Science 2006, 46, 1793–1800

Table I: Number and proportion (%) of differentially expressed proteins resulting from pairwise comparisons among apical tip (AT), elongation zone (EZ) and roots (R). The percentage (%) represents the ratio between the number of differentially expressed proteins and the total number of proteins identified in each tissue. Statistical tests were performed using edgeR, and differentially expressed proteins were determined with a p value < 0.05. Proteins up-regulated in both AT and EZ when compared to roots were defined as rhizome characteristic-proteins (last column).

Species        AT x R               EZ x R               AT x EZ              Rhizome characteristic
               up AT  down AT   %   up EZ  down EZ   %   up AT  down AT   %   up AT & EZ
horsetail        417      279  38     330      206  30     114      147  14          227
quack grass      345      292  36     350      264  36      18       43   3          252
cogon grass      399      296  46     338      238  40      99      123  15          249
miscanthus       186      195  38     171      196  40     134      124  26           81
red rice         453      484  48     323      500  49     114       69   9          266
reed             239      213  35     233      194  28      56       94  12          146
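The two derived quantities in Table I can be illustrated with a minimal sketch. It assumes the percentage is (up + down) divided by the total number of proteins identified, as the caption states, and that the rhizome characteristic set is the intersection of the proteins up-regulated in AT and in EZ relative to roots. The function names, the protein identifiers, and the total of 1830 are hypothetical values chosen for illustration; the per-tissue totals are not listed in the table.

```python
def de_percentage(up, down, total_identified):
    """Percentage of identified proteins that are differentially expressed."""
    if total_identified <= 0:
        raise ValueError("total_identified must be positive")
    return 100.0 * (up + down) / total_identified


def rhizome_characteristic(up_at_vs_roots, up_ez_vs_roots):
    """Proteins up-regulated in both AT and EZ when compared to roots."""
    return set(up_at_vs_roots) & set(up_ez_vs_roots)


if __name__ == "__main__":
    # 417 up and 279 down are the horsetail AT x R counts from Table I;
    # the total of 1830 identified proteins is a hypothetical value.
    print(round(de_percentage(417, 279, 1830)))

    # Hypothetical protein identifiers, for illustration only.
    up_at = {"P1", "P2", "P3"}
    up_ez = {"P2", "P3", "P4"}
    print(sorted(rhizome_characteristic(up_at, up_ez)))
```

The intersection reading is consistent with the table, where the last column is never larger than either the "up AT" or the "up EZ" count for the same species.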

Figure 1: Number of proteins identified and their abundance in underground tissues of the studied species. A) Number of proteins identified in each tissue per species. Total, total number of non-redundant proteins identified in all tissues; AT, apical tip; EZ, elongation zone; R, roots. B) Protein abundance distribution considering all proteins identified in each tissue regardless of species. Normalized spectral counts were log10 transformed and are represented on the x-axis. Each category on the x-axis corresponds to a difference of 0.25 orders of magnitude in protein abundance. The difference between the most abundant and the least abundant proteins spans five orders of magnitude. C) Abundance distribution by total protein identifications. The x-axis represents all proteins identified in the respective tissue and the y-axis represents spectral counts. The plus sign (+) and minus sign (-) correspond to more abundant (spectral counts > 3) and less abundant (spectral counts < 3) proteins, respectively. Spectral counts in B and C are normalized and log10 transformed.
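The binning described for panel B can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes each normalized spectral count is log10 transformed and assigned to a category 0.25 log10 units wide, so that five orders of magnitude span 20 categories. The names `BIN_WIDTH`, `abundance_bin` and `abundance_histogram` are hypothetical.

```python
import math
from collections import Counter

BIN_WIDTH = 0.25  # width of each x-axis category, in log10 units


def abundance_bin(normalized_count):
    """Index of the 0.25-log10-unit category containing a normalized spectral count."""
    if normalized_count <= 0:
        raise ValueError("normalized spectral counts must be positive")
    return math.floor(math.log10(normalized_count) / BIN_WIDTH)


def abundance_histogram(normalized_counts):
    """Number of proteins falling into each abundance category."""
    return Counter(abundance_bin(c) for c in normalized_counts)


if __name__ == "__main__":
    # Hypothetical normalized spectral counts, for illustration only.
    histogram = abundance_histogram([0.5, 2.0, 2.5, 50.0, 500.0])
    for index in sorted(histogram):
        lower = index * BIN_WIDTH
        print(f"log10 in [{lower:.2f}, {lower + BIN_WIDTH:.2f}): {histogram[index]}")
```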

Figure 2: Summary of differentially expressed proteins in rhizome tissues compared to roots in different species. Schematic representation of plant cell metabolic pathways. Each number represents one or more proteins identified and quantitatively evaluated as differentially expressed in the apical tip and/or elongation zone of rhizomes from at least two monocotyledons and E. hyemale compared to roots. Enzyme numbers with the respective protein names and expression values can be found in Table S25. Symbols to the left and right of each protein number represent the expression in apical tips and elongation zones, respectively. Red: up-regulated compared to roots; Green: down-regulated compared to roots.

Figure 3: Hierarchical clustering of orthologous rhizome characteristic-protein groups enriched in the elongation zone (EZ) and apical tip (AT) compared to roots (p < 0.05) in different species. Log2-transformed normalized spectral counts are represented. Arrows indicate the two main clusters, which reflect tissue-specific regulation among species. 1, Equisetum hyemale; 2, Elytrigia repens; 3, Imperata cylindrica; 4, Miscanthus x giganteus; 5, Oryza longistaminata; 6, Phragmites australis.

Figure 4: Proteins differentially regulated between AT and EZ in rhizomatous species. Upper panel: Hierarchical clustering of orthologous protein groups differentially regulated (p < 0.05) between rhizome apical tip (AT) and elongation zone (EZ) in different species. 1, Equisetum hyemale; 2, Elytrigia repens; 3, Imperata cylindrica; 4, Miscanthus x giganteus; 5, Oryza longistaminata; 6, Phragmites australis. Bottom panel: Plant cell scheme showing the same proteins shown in the upper panel, classified into functional categories. Arrows indicate up- and down-regulation. Red squares in a functional category indicate an overall up-regulation of the category in AT compared to EZ, while green squares indicate down-regulation.

Abstract Graphic 76x50mm (300 x 300 DPI)
