
Article

Proteomic Analysis of Differences in Fiber Development between Wild and Cultivated Gossypium hirsutum L.
Yuan Qin, Hengling Wei, Huiru Sun, Pengbo Hao, Hantao Wang, Junji Su, and Shuxun Yu
J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.7b00122 • Publication Date (Web): 07 Jul 2017
Downloaded from http://pubs.acs.org on July 8, 2017


Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Proteome Research

Proteomic Analysis of Differences in Fiber Development between Wild and Cultivated Gossypium hirsutum L.

Yuan Qin1,2, Hengling Wei2, Huiru Sun1,2, Pengbo Hao1,2, Hantao Wang2, Junji Su2, Shuxun Yu1,2*

1. College of Agronomy, Northwest A&F University, No.3 Taicheng Road, Yangling, Shaanxi 712100, China
2. State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, No.38 Huanghe Road, Anyang, Henan 455000, China


ACS Paragon Plus Environment


ABSTRACT: Upland cotton (Gossypium hirsutum L.) is one of the world’s most important fiber crops, accounting for more than 90% of all cotton production. While its wild progenitors have relatively short, coarse, often tan-colored fibers, modern cotton cultivars possess longer, finer, stronger, and whiter fibers. In this study, the selected wild and cultivated cottons (YU-3 and TM-1) showed significant morphological differences in their fibers at 10 days post-anthesis (DPA), 20 DPA and maturity. To explore the effects of domestication, reveal the molecular mechanisms underlying these phenotypic differences and better inform efforts to further enhance cotton fiber quality, an iTRAQ-facilitated proteomic analysis was performed on developing fibers. A total of 6990 proteins were identified, of which 336 were defined as differentially expressed proteins (DEPs) between fibers of wild and domesticated cotton. The proteins down- or up-regulated in wild cotton were involved in phenylpropanoid biosynthesis, zeatin biosynthesis, fatty acid elongation and other processes. Association analysis between the transcriptome and the proteome showed positive correlations between transcripts and proteins at both 10 DPA and 20 DPA. The proteomic differences were verified at the mRNA level by qPCR and validated at the physiological and biochemical levels by peroxidase (POD) activity assays and zeatin (ZA) content estimates. This work corroborates the major pathways involved in cotton fiber development and indicates that POD activity and zeatin content are potentially related to fiber elongation and thickening.


KEYWORDS: Gossypium hirsutum, fiber development, domestication, iTRAQ, phenylpropanoid, zeatin

INTRODUCTION

Cotton fibers are elongated epidermal hairs emanating from the surface of seeds of members of the Malvaceae genus Gossypium, commonly referred to as ‘the cotton genus’. Cotton is the main raw material of the textile industry and occupies an important position among textile fibers. Products made from cotton fiber have many advantages over those made from synthetic fiber, such as softness, warmth, good hygroscopicity and air permeability. Although chemical fibers, especially synthetic fibers such as polyester, occupied an important position in the 20th-century textile industry, cotton fiber still has a significant market share. According to the National Bureau of Statistics of the People's Republic of China, cotton production was 5600 thousand tons in 2015. In the same year, the cotton import quota was 894 thousand tons (http://www.stats.gov.cn/). Owing to public concern about the use of nonrenewable oil resources to produce synthetic fiber, increased environmental consciousness and the pursuit of a higher standard of living, cotton fiber retains strong market competitiveness in the textile industry.

The development of cotton fiber can be separated into four distinct but overlapping stages: initiation (-1~5 days post anthesis, DPA), elongation (4~16 DPA), secondary cell wall thickening (15~26 DPA) and maturation (25~50 DPA).1, 2 To date, many genes have been verified to function during fiber development, such as GhHOX3, GhSusA1, ACO1-3, GhFLA1 and GhRDL1.2-6 Suppression of sucrose synthase gene (Sus) activity by 70% or more in the ovule epidermis confers a fiberless phenotype.7 GhACT1, which encodes an actin cytoskeleton protein, plays a key role during fiber development but not fiber initiation.8 As Han et al.9 reported, overexpression of WLIM1a in Upland cotton led to obvious fiber elongation and a more compact secondary cell wall, which helped to improve fiber strength and fineness. However, the regulatory networks that determine fiber length and strength remain unclear.

Both the wild and cultivated species of the genus Gossypium (cotton) belong to the order Malvales, family Malvaceae, subfamily Malvoideae. Four Gossypium species have been domesticated; the rest are wild10 and are distributed over vast areas of the tropics and subtropics. According to their geographic distribution, wild G. hirsutum cottons were divided into seven races by Hutchinson (1951). Race yucatanense, named after the Yucatan peninsula, is one of the seven races of semi-wild Upland cotton. The phenotype of G. hirsutum differs greatly between wild and cultivated forms.11 The yucatanense 3 (i.e., YU-3) strain studied here has small plants with more branches, small round bolls, short lint fiber and brown fuzz fiber. Domesticated cotton, represented by the genetic standard line TM-1, has wide adaptability, strong stress resistance, and longer, white fibers.

Draft whole-genome sequences of Gossypium hirsutum L. acc. TM-1 (refs 12, 13) provide a reference database for functional genomic analysis of Upland cotton. These annotated genome maps laid an important genetic foundation for accelerating the genetic improvement of cotton varieties. Large-scale proteomic profiling of plants offers the opportunity to explore the mechanisms underlying different phenotypes. Proteins directly carry out cellular functions and vital activities, and proteome analysis can contribute to a deeper understanding of developmental mechanisms. Isobaric tags for relative and absolute quantification (iTRAQ) is a technique in which peptides are labeled in vitro with isobaric tags and then analyzed by liquid chromatography coupled with high-precision tandem mass spectrometry (LC-MS/MS).14 It is a high-throughput technique that has been commonly used in quantitative proteomics in recent years.

Here, iTRAQ-facilitated proteomic analysis of developing fibers from wild and cultivated Upland cotton (i.e., YU-3 and TM-1) was performed. Two representative stages were selected: 10 DPA, during primary cell wall elongation, and 20 DPA, during the transition to secondary cell wall thickening.1, 15 The purpose of this design was to identify candidate proteins and processes involved in fiber development during cotton domestication. With these results, we hope to lay a foundation for future research and contribute to cotton fiber improvement.

MATERIALS AND METHODS

Plant Materials, Sample Collection and Protein Preparation

To investigate the effects of domestication on fiber development, we used one cultivated and one wild accession of Gossypium hirsutum. The modern cultivated line selected was Texas Marker Stock 1 (TM-1, considered by some to be a genetic and cytogenetic standard). The wild accession selected was var. yucatanense 3 (YU-3, US Department of Agriculture GRIN accession PI501501). This is an unambiguously wild form according to existing morphological and molecular evidence.10 Var. yucatanense accessions have been used in many studies of the evolution and domestication of Upland cotton.11, 16, 17 TM-1 and YU-3 plants were grown in greenhouses at the south breeding base of the Chinese Academy of Agricultural Sciences (CAAS), Damao (E 109°37′, N 18°20′), Sanya, China. The seeds were sown in sand, and the seedlings were then transplanted to the glasshouse. Plants were grown under conventional field management from September 2013 to February 2014. Flowers were tagged once the two accessions began flowering. Cotton bolls were harvested at 10 DPA and 20 DPA. The harvested bolls were opened immediately, and fibers were separated from ovules in liquid nitrogen. Each accession was divided into three biological replicates, with five plants in each replicate.18 Fiber tissues were finely ground in liquid nitrogen. Protein extraction and concentration determination were then conducted as reported.19, 20 The detailed procedure is described in Protocol S1. Three biological replicates of TM-1 fibers and one replicate of YU-3 fibers were analyzed.

Proteome Analysis of Cotton Fibers by iTRAQ

iTRAQ analysis was carried out at the Beijing Genomics Institute, Shenzhen, China. Before iTRAQ labeling, total protein (100 µg) taken from each sample solution was digested; the peptides were then reconstituted and labeled using 8-plex iTRAQ reagent (Applied Biosystems). The TM-1 proteins were labeled with iTRAQ tags as follows: 113 (10 DPA_1), 114 (20 DPA_1), 115 (10 DPA_2), 116 (20 DPA_2), 117 (10 DPA_3) and 118 (20 DPA_3); the YU-3 proteins were labeled with tags 119 (10 DPA) and 121 (20 DPA). After labeling, the peptides were incubated at room temperature for 2 h and then pooled and dried by vacuum centrifugation (Protocol S2). The dried peptides were redissolved and fractionated by strong cation exchange chromatography as described in Protocol S3. The fractions were analyzed by LC-ESI-MS/MS on a TripleTOF 5600 (Protocol S4).

Raw data files acquired from the TripleTOF 5600 System were converted into MGF files using Proteome Discoverer 1.2 (PD 1.2, Thermo) [5600 msconverter], and the MGF files were searched. Protein identification was performed using the Mascot search engine (Matrix Science, London, UK; version 2.3.02) against an Upland cotton TM-1 database (Gossypium_hirsutum_v1.1, http://mascotton.njau.edu.cn/) containing 70478 protein sequences.13 At least one unique peptide was required for an identified protein. The search parameters were as follows: a mass tolerance of 0.05 Da (ppm) was permitted for intact peptide masses and 0.1 Da for fragment ions, with allowance for one missed cleavage in trypsin digests; Oxidation (M) and iTRAQ8plex (Y) as potential variable modifications; and Carbamidomethyl (C), iTRAQ8plex (N-term) and iTRAQ8plex (K) as fixed modifications. The peptide match results are shown in Table S1.

IQuant (ref. 21), an automated software tool for quantitatively analyzing peptides labeled with isobaric tags, was used. The main IQuant quantitation parameters were as follows: all unique peptides were used for protein quantitation, and a protein was required to contain at least one unique spectrum. We only used ratios with p-values < 0.05, and only fold changes > 2 were considered significant.

RNA-Seq Data Acquisition and Processing

To complement the proteome analysis, transcriptome data for the same developmental stages reported by Yoo et al.17 were utilized. Four samples (Gossypium hirsutum cv. TM1 10 DPA, G. hirsutum cv. TM1 20 DPA, G. hirsutum var. yucatanense TX2094 10 DPA, and G. hirsutum var. yucatanense TX2094 20 DPA) were downloaded from the NCBI Sequence Read Archive (SRA) (http://www.ncbi.nlm.nih.gov/sra/?term=SRP017061) as raw data for bioinformatics analysis with RNA-seq procedures. Clean reads of each sample were mapped to the TM-1 genome (http://mascotton.njau.edu.cn/). Gene and isoform expression levels were quantified with RSEM (RNA-Seq by Expectation Maximization) software. We used 'FDR ≤ 0.001 (ref. 22) and the absolute value of Log2Ratio ≥ 2' as the threshold for judging the significance of gene expression differences.

Functional Analysis Methods

Protein ortholog classification was performed with the Cluster of Orthologous Groups of proteins (COG) database. GO and KEGG enrichment analyses were performed using the OmicShare tools, a free online platform for data analysis (www.omicshare.com/tools).

Quantitative Real-Time PCR (qPCR)

Total RNA from fiber samples was extracted using the RNAprep Pure Plant Kit (DP441, TIANGEN, Beijing). Reverse transcription and real-time PCR were performed as described by Liu et al.23 All reactions were run with three technical replicates and three biological replicates. Relative gene expression was calculated using the ΔΔCt method. GhActin7 (AT5G09810) was used as the endogenous reference gene, and TM-1 at 10 DPA was used as the reference sample for data normalization, with its expression level set to 1. All primer pairs used are listed in Table S2.

Enzyme Activity Determination and Substance Content Estimation

Peroxidase (POD) assay. Peroxidase activity was determined with a microplate reader at 470 nm according to its ability to oxidize guaiacol. Crude enzyme solution extraction and determination step I were performed according to Ma et al.24 See the detailed description in "Supporting Information", Protocol S5.

Zeatin (ZA) content measurement. ZA content in cotton fibers was determined by HPLC. Samples were prepared according to a protocol adapted from Ai et al.25 See the detailed description in "Supporting Information", Protocol S6. The instrument used was a Rigol L3000 high-performance liquid chromatograph with a Kromasil C18 reversed-phase column (250 mm × 4.6 mm, 5 µm). The injection volume was 10 µL, the flow rate was 0.8 mL/min, the column temperature was 35 °C and the detection wavelength was 254 nm.

RESULTS

Phenotypic Differences between Domesticated and Wild Cotton


Figure 1. Phenotype of cotton seeds with attached fiber at different developmental stages from TM-1 and YU-3.

The phenotypes of developing and mature fibers differ markedly between TM-1 (domesticated) and YU-3 (wild). As shown in Figure 1A, wild cotton (i.e., YU-3) seeds at 10 DPA are smaller than domesticated seeds and bear few fibers on the surface, which are difficult to tease away with a dissecting needle. In contrast, seeds from TM-1 have more fibers, which can be peeled away gently. Seeds at 20 DPA, after boiling and combing under running water, are shown in Figure 1B. The large difference in fiber number and fiber length between TM-1 and YU-3 can be clearly seen. The fully mature, dehydrated fibers of TM-1 were also longer than those of YU-3 (Figure 1C). TM-1 fiber is glossy and abundant, while YU-3 fiber is short and sparse (Figure 1D). TM-1 showed an average fiber length of 29.3 mm at 20 DPA, versus 21.3 mm for YU-3, a significant difference based on two-tailed Student's t-tests (Figure 1E). The mature fiber lengths of TM-1 and YU-3 were 31.5 mm and 23.5 mm, respectively, a highly significant difference (Figure 1F). At 20 DPA, TM-1 fibers (still elongating) were highly flexible and could easily be combed neatly, whereas wild cotton fibers were more difficult to comb because they had entered a later stage of development. Previous results have shown that the duration of fiber elongation in Upland cotton cultivars was greatly prolonged by the domestication process.17 Another comparative proteomic analysis suggested that domestication strengthened and/or extended fiber elongation by altering both invertase activity and osmotic regulation pathways.16 While the exact reason remains unclear, these results greatly increased our interest in revealing the molecular mechanisms associated with fiber domestication in Upland cotton.

Basic Information for the Identification of Cotton Fiber Proteome

Proteomic profiling of developing fibers at 10 DPA and 20 DPA from TM-1 and YU-3 plants was performed with iTRAQ technology. Proteins were extracted from each sample separately, digested and then labeled with isobaric tags. The eight samples were then mixed in equal amounts, separated and identified by LC-ESI-MS/MS (Figure S1). A total of 387565 spectra were generated, of which 33338 were unique after data filtering to remove low-scoring spectra. By searching against the G. hirsutum database with a cutoff of Mascot Percolator (ref. 26) Q-value ≤ 0.01, a total of 17297 unique peptides representing 6990 proteins were identified (Figure 2A and Table S3). The distribution of the number of unique peptides per protein is shown in Figure 2B. The vast majority of identified proteins contained fewer than 10 peptides, and the number of proteins decreased as the number of peptides increased. About half of the proteins (50.03%) were mapped with more than one unique peptide, indicating that the data are reliable. The 6990 proteins identified accounted for 9.92% of all proteins annotated in the G. hirsutum genome (Figure 2C). The large number of fiber proteins identified implies an intricate protein regulatory network in cotton fiber elongation and thickening.
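The coverage figure quoted above follows from simple arithmetic, which the short Python sketch below reproduces from the numbers given in the text. The function and variable names are illustrative only, and the count of multi-peptide proteins is merely the value implied by the reported 50.03%, not a number stated in the manuscript.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return part / whole * 100.0


# Values quoted in the text.
IDENTIFIED_PROTEINS = 6990    # proteins identified by iTRAQ
DATABASE_PROTEINS = 70478     # sequences in the TM-1 protein database
MULTI_PEPTIDE_SHARE = 50.03   # % of proteins with more than one unique peptide

coverage = percent(IDENTIFIED_PROTEINS, DATABASE_PROTEINS)

# Assumption: back-calculated from the reported percentage, not stated in the text.
implied_multi_peptide = round(IDENTIFIED_PROTEINS * MULTI_PEPTIDE_SHARE / 100)

print(f"Proteome coverage: {coverage:.2f}%")
print(f"Implied proteins with >1 unique peptide: {implied_multi_peptide}")
```

The same helper can be reused for other proportions reported in the Results, provided both the numerator and the denominator are taken directly from the text or the supporting tables.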

Figure 2. Identification of fiber proteomes and analyses of differentially expressed proteins (DEPs) between wild and domesticated cotton.

Differential Protein Expression between TM-1 and YU-3 Fibers

Analysis of reproducibility23, 27 among the TM-1 replicate samples at 10 DPA (TM10) and 20 DPA (TM20) revealed average variation of 20% and 8%, respectively (Table S4 and Figure 2D), indicating good repeatability. Using the criteria of a 95% confidence level and a 2-fold change, 336 differentially expressed proteins (DEPs) were identified between YU-3 and TM-1 across the two stages: 80 at 10 DPA and 277 at 20 DPA (Table S5), with 21 DEPs identified at both stages (Figure 2E). At 10 DPA, 35 proteins were up-regulated and 45 were down-regulated in YU-3 fibers, whereas at 20 DPA, 179 DEPs were up-regulated and 98 were down-regulated in YU-3 (Figure 2F), consistent with the large difference in fiber morphology. The difference in protein abundance between wild and domesticated cotton was more pronounced at 20 DPA than at 10 DPA, again consistent with the phenotypes shown in Figure 1.

To explore the potential functions of the DEPs, a preliminary analysis of clusters of orthologous groups of proteins (COGs) was conducted. A total of 159 DEPs (47%) had no COG assignment (Table S5), and the other 177 (Figure S2) were mainly categorized in R (General function prediction only), G (Carbohydrate transport and metabolism), and O (Post-translational modification, protein turnover, chaperones), which contained 43, 21 and 20 proteins, respectively. A few other COG groups containing slightly fewer proteins were also identified, for example, K (Transcription), J (Translation, ribosomal structure and biogenesis), T (Signal transduction mechanisms), C (Energy production and conversion), and I (Lipid transport and metabolism). The many DEPs functioning in carbohydrate transport, post-translational modification and transcription suggest that fiber development and domestication may be related to these pathways. These DEPs provide a foundation for explaining the causes of the phenotypic differences, and a deeper analysis should give a better understanding of the impact of domestication on cotton fiber.

Functional Analysis of DEPs

To uncover biological mechanisms that differentiate wild and domesticated cotton fibers, we annotated the DEPs with GO terms and conducted a GO biological process analysis (Table S6). The differentially expressed proteins between YU-3 and TM-1 (Table 1) were found to be involved in multiple biological processes, including L-fucose biosynthetic process (GO:0006005, p = 1.95E-03), cell wall organization or biogenesis (GO:0071554, p = 8.17E-03), hexose biosynthetic process (GO:0019319, p = 4.60E-02), very long-chain fatty acid metabolic process (GO:0000038, p = 3.53E-02), L-phenylalanine metabolic process (GO:0006558, p = 2.60E-02), carbohydrate biosynthetic process (GO:0016051, p = 3.37E-02) and acyl-CoA metabolic process (GO:0006637, p = 3.67E-02). A series of components, including carbohydrate, microtubule, cytoskeleton, cellulose and lipid, from the three different ontologies have previously been reported to be related to fiber development,1, 28-33 supporting our results.

Table 1. Significantly enriched Gene Ontology categories of biological processes for all DEPs at 10 DPA and 20 DPA.

GO ID | Description | Total (336) No.a | Total (336) P value | 10 DPA (80) No. | 10 DPA (80) P value | 20 DPA (277) No. | 20 DPA (277) P value
GO:0006005 | L-fucose biosynthetic process | 2 | 1.95E-03 | - | - | 2 | 1.29E-03
GO:0006874 | cellular calcium ion homeostasis | 2 | 1.10E-02 | - | - | 2 | 7.40E-03
GO:0019255 | glucose 1-phosphate metabolic process | 2 | 1.10E-02 | - | - | 2 | 7.40E-03
GO:0019388 | galactose catabolic process | 2 | 1.10E-02 | - | - | 2 | 7.40E-03
GO:1902223 | erythrose 4-phosphate/phosphoenolpyruvate family amino acid biosynthetic process | 1 | 4.43E-02 | - | - | 1 | 3.61E-02
GO:0019319 | hexose biosynthetic process | 4 | 4.60E-02 | - | - | 4 | 2.40E-02
GO:0071555 | cell wall organization | 7 | 2.74E-02 | - | - | 7 | 9.77E-03
GO:0000038 | very long-chain fatty acid metabolic process | 2 | 3.53E-02 | - | - | 2 | 2.41E-02
GO:0009834 | plant-type secondary cell wall biogenesis | 2 | 3.53E-02 | - | - | 2 | 2.41E-02
GO:0045229 | external encapsulating structure organization | 9 | 5.36E-03 | 1 | - | 8 | 5.15E-03
GO:0071554 | cell wall organization or biogenesis | 11 | 8.17E-03 | 1 | - | 10 | 5.47E-03
GO:0043648 | dicarboxylic acid metabolic process | 8 | 1.04E-02 | 1 | - | 8 | 3.06E-03
GO:0006558 | L-phenylalanine metabolic process | 2 | 2.60E-02 | 1 | - | 1 | 2.74E-02
GO:0044699 | single-organism process | 111 | 2.39E-02 | 25 | - | 92 | -
GO:0016051 | carbohydrate biosynthetic process | 11 | 3.37E-02 | 2 | - | 9 | -
GO:0006637 | acyl-CoA metabolic process | 3 | 3.67E-02 | 1 | - | 2 | -
GO:0052548 | regulation of endopeptidase activity | 2 | 5.68E-03 | 2 | 3.53E-04 | - | -
GO:0052547 | regulation of peptidase activity | 2 | 1.10E-02 | 2 | 7.00E-04 | - | -
GO:0032269 | negative regulation of cellular protein metabolic process | 2 | 4.58E-02 | 2 | 3.18E-03 | - | -

a The number of DEPs enriched in this GO term. The number in parentheses is the total number of DEPs.

To further understand the pathways enriched among these DEPs, a KEGG pathway enrichment analysis was performed. Separate analyses were done for 10 DPA and 20 DPA, and the twenty pathways with the lowest P values were selected for each stage. Some pathways appeared in the lists for both stages, including Phenylpropanoid biosynthesis, Taurine and hypotaurine metabolism, beta-Alanine metabolism, and Butanoate metabolism (Figure 3). Zeatin biosynthesis and Phenylalanine metabolism were significantly enriched at 10 DPA, whereas Monoterpenoid biosynthesis, Fatty acid elongation, Pentose phosphate pathway and Tyrosine metabolism were significantly enriched only at 20 DPA.
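The manuscript does not state which statistical test the OmicShare platform applies for KEGG enrichment. As an illustration only, the sketch below assumes the widely used one-sided hypergeometric test, implemented with the Python standard library. The background of 6990 identified proteins and the 80 DEPs at 10 DPA are taken from the text; the pathway size and hit count are hypothetical placeholders, and the function name is not from the study.

```python
from math import comb


def enrichment_p_value(hits: int, deps: int, pathway_size: int, background: int) -> float:
    """Upper-tail hypergeometric probability P(X >= hits).

    background   -- number of proteins in the reference set
    pathway_size -- number of background proteins annotated to the pathway
    deps         -- number of differentially expressed proteins drawn
    hits         -- number of DEPs annotated to the pathway
    """
    max_hits = min(deps, pathway_size)
    total = comb(background, deps)
    tail = sum(
        comb(pathway_size, i) * comb(background - pathway_size, deps - i)
        for i in range(hits, max_hits + 1)
    )
    return tail / total


# 6990 identified proteins and 80 DEPs at 10 DPA come from the text;
# the pathway size (60) and hit count (5) are hypothetical placeholders.
p = enrichment_p_value(hits=5, deps=80, pathway_size=60, background=6990)
print(f"p = {p:.3g}")
```

In practice the background would normally be restricted to proteins that carry a KEGG annotation, and the resulting P values would be corrected for multiple testing; neither detail is specified in the text, so both are left out of this sketch.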

Figure 3. KEGG pathway enrichment analysis of differentially expressed proteins. There are 3 significantly enriched pathways at 10 DPA and 8 at 20 DPA (Figure 3). Only one pathway, i.e. Phenylpropanoid biosynthesis, shows significant enrichment at both

stages.

After

alignment

with

an

Arabidopsis

database

(TAIR,

http://www.arabidopsis.org/), 53 DEPs from ten significantly enriched pathways were described with functional annotation (Table 2). Table 2. DEPs with homologs from Arabidopsis in significantly enriched KEGG pathways. G.hirsutum

Arabidopsis

ArabDesc

Gene 16

ACS Paragon Plus Environment

Protein level

Gene level

Page 17 of 48

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Proteome Research

gene ID

ID

name

Log2(YU-3/TM-1)

Log2(YU-3/TM-1)

10 DPA

20 DPA

10 DPA

20 DPA

1.58

1.12

-

-

Phenylpropanoid biosynthesis Gh_A04G1322

AT5G05340

Peroxidase superfamily protein

PRX52

Gh_A08G0711

AT5G19890

Peroxidase superfamily protein

AT5G19890

0.19

1.44

11.86

0.30

Gh_A10G1317

AT5G05340

Peroxidase superfamily protein

PRX52

0.45

1.06

4.48

-0.75

Gh_A11G1873

AT4G37970

cinnamyl alcohol dehydrogenase

CAD6

0.32

1.23

2.72

1.61

Gh_A12G0193a

AT4G34230

CAD5

0.58

1.39

0.29

2.95

6 cinnamyl alcohol dehydrogenase 5 Gh_A13G0772

AT1G05260

Peroxidase superfamily protein

RCI3

1.79

2.08

-

-

Gh_D03G0059

AT5G42180

Peroxidase superfamily protein

PER64

-0.1

1.4

9.28

1.78

Gh_D03G1382

AT5G05340

Peroxidase superfamily protein

PRX52

0.61

2.01

3.92

3.59

Gh_D06G2262

AT5G14980

alpha/beta-Hydrolases

MAGL14

-0.69

-1.69

0.01

-0.35

CAD6

1.3

1.56

4.29

0.67

superfamily protein Gh_D11G2098

AT4G37970

cinnamyl alcohol dehydrogenase

Gh_Sca005268G01

AT1G14550

Peroxidase superfamily protein

AT1G14550

1.49

2.36

-

-

Gh_A03G1091

AT1G62940

acyl-CoA synthetase 5

ACOS5

1.14

-0.27

-

-

Gh_A05G0157

AT3G24503

aldehyde dehydrogenase 2C4

ALDH2C4

1.62

0.18

12.72

4.54

Gh_A06G0667

AT3G53260

phenylalanine ammonia-lyase 2

PAL2

1.29

0.58

7.41

0.00

Gh_D11G2151

AT2G22420

Peroxidase superfamily protein

PRX17

1.06

0.54

1.87

0.89

D-mannose binding lectin

AT1G78850

1.28

1.29

4.05

-1.93

6

Phenylalanine metabolism Gh_A01G0388

AT1G78850

protein with Apple-like carbohydrate-binding domain Gh_A03G1091

AT1G62940

acyl-CoA synthetase 5

ACOS5

1.14

-0.27

-

-

Gh_A06G0667

AT3G53260

phenylalanine ammonia-lyase 2

PAL2

1.29

0.58

7.41

0.00

cytochrome P450, family 714,

ELA1

-1.47

-0.62

-

-

Zeatin biosynthesis Gh_A10G1778

AT5G24910

subfamily A, polypeptide 1

Taurine and hypotaurine metabolism Gh_A12G1414

AT2G02010

glutamate decarboxylase 4

GAD4

-1.12

-1.09

-0.03

-2.55

Gh_D02G1390

AT2G02010

glutamate decarboxylase 4

GAD4

-0.51

-1.36

-

-

Gh_D12G1534

AT2G02010

glutamate decarboxylase 4

GAD4

-

-1.36

-0.60

-1.76

Phototropic-responsive NPH3

RPT2

0.53

1.11

-0.64

-1.00

SDR1

0.33

1.03

-0.14

0.00

Monoterpenoid biosynthesis Gh_A08G1808

AT2G30520

family protein Gh_D07G1144

AT3G61220

NAD(P)-binding Rossmann-fold superfamily protein

beta-Alanine metabolism 17

ACS Paragon Plus Environment

Journal of Proteome Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Gh_A01G0388

AT1G78850

D-mannose binding lectin

AT1G78850

1.28

1.29

4.05

-1.93

glutamate decarboxylase 4

GAD4

-1.12

-1.09

-0.03

-2.55

D-mannose binding lectin

AT1G78850

0.69

1.3

3.22

-11.65

protein with Apple-like carbohydrate-binding domain Gh_A12G1414

AT2G02010

Gh_D01G2338

AT1G78850

protein with Apple-like carbohydrate-binding domain Gh_D02G1390

AT2G02010

glutamate decarboxylase 4

GAD4

-0.51

-1.36

-

-

Gh_D07G2232

AT1G23800

aldehyde dehydrogenase 2B7

ALDH2B7

0.68

1.2

3.10

2.44

Gh_D12G1534

AT2G02010

glutamate decarboxylase 4

GAD4

-

-1.36

-

-

3-ketoacyl-CoA synthase 6

KCS6

-0.79

-1.06

-0.20

-1.04

Fatty acid elongation Gh_A03G1286

AT1G68530

Gh_A05G0840

AT2G28630

3-ketoacyl-CoA synthase 12

KCS12

-0.71

-1.22

-2.13

-3.07

Gh_A13G1665

AT2G26250

3-ketoacyl-CoA synthase 10

KCS10

-0.27

-1.06

-1.34

-1.53

Gh_D02G0422

AT4G14440

3-hydroxyacyl-CoA dehydratase

HCD1

-0.03

-1.22

-

-

1

Plant-pathogen interaction Gh_A05G3063

AT1G54470

RNI-like superfamily protein

RPP27

-0.06

1.43

-

-

Gh_A12G2018

AT1G12310

Calcium-binding EF-hand family

AT1G12310

1.28

-1.29

-0.82

-0.72

Gh_D01G1045

AT1G35710

AT1G35710

0.68

1.01

-0.15

-9.51

Gh_D02G0422

AT4G14440

HCD1

-0.03

-1.22

-

-

AT5G04170

0.52

2.32

16.45

16.06

CNGC17

-0.67

-1.06

1.33

1.68

protein Protein kinase family protein with leucine-rich repeat domain 3-hydroxyacyl-CoA dehydratase 1 Gh_D03G1708

AT5G04170

Calcium-binding EF-hand family protein

Gh_D04G0818

AT4G30360

cyclic nucleotide-gated channel

Gh_D05G0323

AT4G29810

MAP kinase kinase 2

MKK2

0.34

2.15

1.27

0.60

Gh_D08G2441

AT1G34110

Leucine-rich receptor-like

RGI5

0.59

1.56

-4.55

-4.16

Gh_D10G2326

AT5G43470

RPP8

0.29

1.4

0.39

10.89

AT1G12310

0.79

-1.79

-0.89

-0.92

17

protein kinase family protein Disease resistance protein (CC-NBS-LRR class) family Gh_D12G2196

AT1G12310

Calcium-binding EF-hand family protein

Butanoate metabolism Gh_A12G1414

AT2G02010

glutamate decarboxylase 4

GAD4

-1.12

-1.09

-0.03

-2.55

Gh_D02G0154

AT3G48560

chlorsulfuron/imidazolinone

CSR1

-0.32

-1.12

-8.71

-1.24

resistant 1 Gh_D02G1390

AT2G02010

glutamate decarboxylase 4

GAD4

-0.51

-1.36

-

-

Gh_D12G1534

AT2G02010

glutamate decarboxylase 4

GAD4

-

-1.36

-

-

Tyrosine metabolism

Gh_A01G0388

AT1G78850

D-mannose binding lectin

AT1G78850

1.28

1.29

4.05

-1.93

alcohol dehydrogenase 1

ADH1

-0.03

1.98

0.79

-0.40

Phosphoenolpyruvate

AT4G10750

0.58

1.29

-1.65

-1.26

AT1G78850

0.69

1.3

3.22

-11.65

ADH1

0.36

1.98

-

2.85

protein with Apple-like carbohydrate-binding domain Gh_A01G1605

AT1G77120

Gh_A02G0965

AT4G10750

carboxylase family protein Gh_D01G2338

AT1G78850

D-mannose binding lectin protein with Apple-like carbohydrate-binding domain

Gh_D02G0730

AT1G77120

alcohol dehydrogenase 1

-, not detected at the gene level or protein level.

a Bold lines indicate that the gene was selected for qPCR analysis.

Integrative Analysis of Proteome and Transcriptome

To complement the proteome analysis, transcriptome data from the same developmental stages reported by Yoo et al.17 were utilized. These data were reanalyzed against the TM-1 genome sequence13 using a standard RNA-seq workflow. Applying the criteria FDR (false discovery rate) ≤ 0.001 and |log2Ratio| ≥ 2 to all genes identified by RNA-seq yielded a total of 8296 differentially expressed genes (DEGs) (Table S7 and Figure S3). Of these DEGs, 3308 and 4988 were detected between wild and cultivated upland cotton at 10 DPA and 20 DPA, respectively. Among the DEGs, 51 genes had expression patterns consistent with those at the protein level (Table S8). Of these 51 genes, 30 could not be observed at 10 DPA and 2 could not be observed at 20 DPA at either the protein or the transcript level. These stage-specific DEGs/DEPs may play roles only at 10 DPA or 20 DPA. Ten proteins could not be detected at the protein level but differed at the transcript level at 10 DPA, and five proteins behaved the same way at 20 DPA, implying that these proteins were translated more in later fiber development.

To compare proteomes and transcriptomes, we matched DEGs with quantifiable proteins. At 10 DPA, 11 genes were identified as both DEGs and DEPs; at 20 DPA, 54 genes were. Positive correlations between transcript and protein levels were observed at both 10 DPA and 20 DPA (Figure 4). Although the correlation was lower at 20 DPA than at 10 DPA (0.4232 vs. 0.6398, Figure 4), it was more significant at 20 DPA.
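The filtering and matching steps described above can be illustrated with a minimal Python sketch. It applies the two stated thresholds (FDR ≤ 0.001 and |log2Ratio| ≥ 2), keeps the DEGs that also have a quantified protein, and correlates the two sets of log2 ratios. The gene names and values are hypothetical, and the use of a Pearson coefficient is an assumption, since the text does not state which correlation measure was used.

```python
import math


def is_deg(fdr, log2_ratio, fdr_max=0.001, min_abs_log2=2.0):
    """Return True when a gene passes both thresholds quoted in the text."""
    return fdr <= fdr_max and abs(log2_ratio) >= min_abs_log2


def pearson_r(xs, ys):
    """Pearson correlation coefficient (assumed measure, not confirmed by the source)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical records: gene -> (FDR, transcript log2 ratio, protein log2 ratio or None)
records = {
    "gene_a": (0.0001, 2.5, 1.2),
    "gene_b": (0.0005, -3.0, -1.1),
    "gene_c": (0.02, 4.0, 0.9),     # fails the FDR threshold
    "gene_d": (0.0002, 1.5, 0.7),   # fails the fold-change threshold
    "gene_e": (0.0001, 2.2, None),  # DEG without a quantified protein
    "gene_f": (0.0009, -2.0, -0.4),
}

# Step 1: call DEGs from the transcript data.
degs = {gene for gene, (fdr, rna, _) in records.items() if is_deg(fdr, rna)}

# Step 2: keep only DEGs that were also quantified at the protein level.
matched = sorted(gene for gene in degs if records[gene][2] is not None)

# Step 3: correlate transcript and protein log2 ratios for the matched genes.
r = pearson_r(
    [records[gene][1] for gene in matched],
    [records[gene][2] for gene in matched],
)
print(matched, round(r, 4))
```

With only a handful of matched genes, as at 10 DPA (11 genes), a high coefficient can still carry weak statistical support, which may be one reason the lower coefficient at 20 DPA (54 genes) was reported as more significant.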

Figure 4. Concordance between changes in transcript and protein abundance at 10 DPA (A) and 20 DPA (B).

qRT-PCR Analysis Reveals Response Pathways between Wild and Domesticated Cotton

Figure 5. Relative mRNA abundance, as measured by qPCR, of genes from significantly enriched pathways in YU-3 and TM-1 fibers.

qRT-PCR analysis was performed on twelve genes, one or more selected from each enriched pathway (Table 2), at two developmental stages in TM-1 and YU-3 (Figure 5). Four of the 12 genes (ALDH2B7, ALDH2C4, CAD5, and RPT2) showed expression patterns similar to the iTRAQ results at both developmental stages, despite some differences in expression level. In contrast, CNGC17 mRNA abundance showed a trend opposite to the iTRAQ result: the CNGC17 gene was up regulated in YU-3 at both stages at the mRNA level but down regulated at the protein level (Figure 5 and Table 2). Interestingly, the remaining 7 genes all showed the same trend
between mRNA and protein levels at 10 DPA but opposite trends at 20 DPA. These differing trends may be caused by more post-transcriptional regulation at 20 DPA. At 10 DPA, apart from GAD4, which was not detected by iTRAQ (Table 2), 3 genes (KCS10, KCS12, and ELA1) showed similar down regulation, while 3 others (AT1G12310, AT1G78850, and MKK2) showed similar up regulation at both levels (Figure 5 and Table 2). At 20 DPA, 2 genes, AT1G78850 and MKK2, showed down regulation in YU-3 at the mRNA level but up regulation at the protein level. The remaining 5 genes (KCS10, KCS12, AT1G12310, ELA1, and GAD4) were up regulated at the mRNA level but down regulated at the protein level (Figure 5 and Table 2). These results suggest a moderate correlation between mRNA and protein expression profiles at 10 DPA but a poor correlation at 20 DPA; that is, different regulation patterns act at different developmental stages. As our COG analysis (Figure S2) showed, most of the differentially expressed proteins were related to Post-translational modification, protein turnover, chaperones (O) and Transcription (K). This indicates that transcriptional and post-translational regulation may play a significant role in fiber elongation and thickening.

DISCUSSION

A Key Period in Fiber Development and Domestication

Cotton fiber undergoes rapid elongation at 10 DPA, with a transition at 16 DPA to the secondary cell wall synthesis stage.1 We defined 10 DPA to 20 DPA as the fiber elongation stage. In the study reported by Hu et al.,16 four representative
developmental stages (5 DPA, 10 DPA, 20 DPA, and 25 DPA) were selected from both wild and domesticated cotton, and protein expression differences between adjacent stages were compared. Their hierarchical cluster analysis showed that the two periods 5–10 DPA and 20–25 DPA were similar between wild and cultivated cotton in both iTRAQ and 2-DE results. This suggested that the most noticeable diversification of the cotton fiber proteome caused by domestication and crop improvement occurred during the period from 10 to 20 DPA.16 In addition, Yoo et al. selected only two developmental time points, 10 DPA and 20 DPA, to examine differences in gene expression levels between 4 wild and 5 domesticated G. hirsutum accessions by RNA-Seq,17 since these time points represent critical stages of primary cell wall elongation and the transition to secondary cell wall thickening, respectively.1, 15 For these reasons, our study also used these two time points.

In our results, the number of proteins differentially expressed between developmental stages was greater than that between the two genotypes (Figure 6A and Figure 6B), consistent with previously published results.16, 17, 34 The large proteome difference between 10 DPA and 20 DPA was consistent with the phenotypic difference displayed in Figure 1. Time course analysis and comparisons between time points during this period within a species may help to discover critical changes in developmental mode. To this end, the 88 proteins (70+11+7) identified in Figure 6B were annotated by KEGG enrichment analysis. Four pathways were significantly enriched: Phenylpropanoid biosynthesis, Fatty acid elongation, Carbon fixation in photosynthetic organisms, and Zeatin biosynthesis. The
fiber experienced rapid elongation and generated extreme variation between the two genotypes during this period. This period is of great significance for fiber development and may be more affected by domestication than other periods.
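The pathway enrichment mentioned above is commonly scored with a one-sided hypergeometric test. The text does not state which test was used here, so the following Python sketch is only an illustration under that assumption. The background size (6990 identified proteins) and the list size (88 proteins) are taken from the text, whereas the pathway size and the number of hits are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_in_pathway, n_selected, n_hits):
    """One-sided P(X >= n_hits) when n_selected proteins are drawn without
    replacement from n_background, of which n_in_pathway belong to the pathway."""
    upper = min(n_in_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_in_pathway, k) * comb(n_background - n_in_pathway, n_selected - k)
        for k in range(n_hits, upper + 1)
    )
    return tail / total


# Background (6990) and list size (88) come from the text; the pathway size (40)
# and hit count (5) are hypothetical values chosen only for illustration.
p_value = hypergeom_enrichment_p(6990, 40, 88, 5)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice such p-values are usually corrected for multiple testing across all pathways before a pathway is called significantly enriched.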

Figure 6. Summary of differentially expressed proteins identified in this study.

The G. hirsutum Genome is a Good Reference for Fiber Development Work in Upland Cotton

A whole genome draft sequence of G. hirsutum published in 2015 (refs 12, 13) made it possible to study Upland cotton using its own genome, which is in theory more accurate and reliable than using other databases. Gossypium hirsutum is a primary cultivated allotetraploid species (AADD; 2n = 4× = 52) formed from the combination of genomes resembling those of G. raimondii (DD; 2n = 2× = 26) and G. arboreum (AA; 2n = 2× = 26).10 Neither the D nor the A genome alone is a complete and comprehensive representation of the allotetraploid cotton genome. In the G. barbadense fiber proteome profiling reported by Hu et al.,18 1317 proteins were detected by searching against a G. raimondii database. The 6990 proteins identified here are more than five times that number (6990 vs. 1317), providing more comprehensive information.

Differences in the fiber proteomes of wild and domesticated G. hirsutum were investigated by iTRAQ. RNA-seq analysis was also conducted on previously published data.17 By mapping to the TM-1 genome database,13 we found a total of 8296 genes to be differentially expressed at the two time points (Table S7). Proteome analysis identified 6990 proteins, of which 80 and 277 were differentially expressed at 10 DPA and 20 DPA, respectively (Table S3 and S5). A poor correlation between proteome and transcriptome has been widely reported and attributed to several factors, such as post-transcriptional, translational, and post-translational regulatory processes. Another contributing factor may be the unequal development of detection technologies for gene and protein expression. The positive correlations in this study indicate that the DEPs identified here may be more reliable. Of all the DEPs detected, 53 from ten significantly enriched pathways were functionally
annotated (Table 2). These proteins had inferred functions related to Phenylpropanoid biosynthesis, Zeatin biosynthesis, Fatty acid elongation, etc., based on BLAST homology searches in TAIR (http://www.arabidopsis.org/).

Phenylpropanoid Biosynthetic Processes

Phenylpropanoids function in many aspects of plant responses to biotic and abiotic stimuli.35 They include coumarins, lignans, lignins, and flavonoids, all synthesized via the cinnamic acid pathway. They are indicators of plant stress responses to light or mineral treatments and key regulators of plant resistance to pests.36 Plant aldehyde dehydrogenases (ALDHs) play important roles in these functions.37 The functional identification of two orthologs of AtREF1 from Brassica napus showed that these ALDHs are involved in the formation of ferulate and sinapate from the corresponding aldehydes, thereby linking lignin and HCA (hydroxycinnamic acid) biosynthesis.38 AtPrx52 from Arabidopsis is the only peroxidase that has been unequivocally linked to lignin formation: two different atprx52 knock-out mutants showed decreased lignin amounts compared with the wild type.39 In many plant species, cinnamyl alcohol dehydrogenase (CAD) has been shown to be closely associated with lignification. In Arabidopsis, AtCAD4 and AtCAD5 are involved in monolignol biosynthesis.40 Lignin is composed of polymerized aromatic alcohols and is a major component of the secondary cell wall in plants. By forming an interwoven network, lignin mainly serves to harden the cell wall. Cotton fiber development is affected differently when different flavonoids are added during ovule culture. Tan et al.41 reported that naringenin (NAR, a flavonoid) exhibited negative
effects on fiber development. Notably, all these proteins were up regulated in the wild cotton, YU-3 (Figure 6C and Table 2), indicating that phenylpropanoid compounds accumulated extensively during YU-3 fiber development. Among the 15 proteins significantly enriched in the Phenylpropanoid biosynthesis pathway, most were up regulated in YU-3 at 10 DPA or 20 DPA, with only PER64, ACOS5, and MAGL14 slightly down regulated (Table 2). At the mRNA level, the expression trends of ALDH2C4 and CAD5 assessed by qPCR (Figure 5) were consistent with the results at the protein level, supporting our deduction. Further, peroxidase activity (Figure 6D) in YU-3 was significantly higher than that in TM-1 at both 10 DPA and 20 DPA, consistent with our proteome profiling results. Taken together, our proteomic results and experimental validation indicate that phenylpropanoids accumulated extensively during YU-3 fiber development, whereas phenylpropanoid contents were significantly decreased in cultivar fibers as a result of domestication.

Phenylpropanoids are a group of plant secondary metabolites derived from phenylalanine that have a wide variety of functions as both structural and signaling molecules.42 Compounds resulting from the activities of these proteins could contribute to the pest resistance and environmental adaptability43 of wild cotton. To survive in natural environments, wild cotton must rely on its own strong adaptability to abiotic stress and immunity to pathogens and pests. It can be deduced that phenylalanine metabolic pathways played a key role in this adaptability. Improved cultivation and management measures made cotton growth conditions more stable, reducing the selective advantage of adaptability to biotic and
abiotic stresses. On the other hand, fiber elongation and thickening were negatively related to phenylpropanoid content; a similar relation to naringenin (NAR) was reported by Tan et al.41 Moreover, demand for excellent fiber added negative selection pressure on phenylalanine metabolic pathways. As Yoo et al.17 reported, carbon resources might have been reallocated in cotton fibers by domestication. Our results suggest that carbon resources have been diverted away from phenylpropanoid metabolism during domestication, which is consistent with their speculation.

Zeatin Biosynthetic Processes

Zeatin (ZA) is a natural cytokinin that plays an important role in multiple aspects of plant growth and development, such as seed germination, chloroplast differentiation, apical dominance, flower and fruit development, and leaf senescence. In the reference pathway of zeatin biosynthesis (KEGG PATHWAY: map 00908; http://www.genome.jp/kegg-bin/show_pathway?map00908), one enzyme, cytokinin trans-hydroxylase (CYP735A), was significantly down-regulated in wild cotton (Figure 6E and Table 2). CYP735A has several other names, including ELA1 and EUI-LIKE P450 A1. This gene has been widely studied in plants and usually interacts with gibberellin (GA) in regulating plant growth and development. As reported in rice, the amounts of biologically active GAs were increased in ELA1-RNAi plants, and overexpression of ELA1 and ELA2 reduced the amounts of GAs in the internodes of transgenic rice, consistent with their dwarf phenotypes.44 In Arabidopsis, two CYP714 members, CYP714A1 and CYP714A2,
were shown to be negative regulators of GA metabolism by analyzing the enzymatic activities of their recombinant proteins in a yeast expression system.45 Endogenous mechanical stresses regulate plant growth and development, and Creff et al.46 suggested that ELA1 and ELA2, which function in the deactivation of bioactive GAs, may provide a link between the regulation of GA levels and responses to mechanical stress. In our study, zeatin biosynthesis was significantly enriched at 10 DPA, and only one protein involved in this pathway was found. This protein (Gh_A10G1778) had lower abundance in wild cotton at both 10 and 20 DPA by iTRAQ (Table 2). By qPCR, ELA1 showed the same expression pattern as at the protein level at 10 DPA (Figure 5), suggesting a large difference between the two accessions during the rapid fiber elongation stage. The higher abundance at both mRNA and protein levels in TM-1 suggests that more zeatin or related cytokinins would accumulate in its fiber cells, which might promote fiber elongation. The total zeatin content of cotton fibers at 10 DPA and 20 DPA was determined by high performance liquid chromatography. ZA content in TM-1 was significantly higher than in YU-3 at 10 DPA (Figure 6F), consistent with the results at the mRNA and protein levels and supporting our deduction. At 20 DPA, ELA1 gene expression showed no significant difference between genotypes (Figure 5), but ELA1 was down regulated in YU-3 at the protein level (Table 2), and no significant difference in ZA content was found (Figure 6F), suggesting regulation at the transcriptional, post-transcriptional, and post-translational levels.

Fatty Acid Elongation Processes

According to the number of carbon atoms in the synthesized product, the fatty acid elongation pathway can be divided into two independent parts. Biosynthesis of fatty acids with 4