Domestic fowl breed variation in egg white protein expression


Omics Technologies Applied to Agriculture and Food

Domestic fowl breed variation in egg white protein expression: application of proteomics and transcriptomics
Barbora Bílková, Zuzana Świderská, Lukáš Zita, Denis Laloë, Mathieu Charles, Vladimir Benes, Pavel Stopka, and Michal Vinkler
J. Agric. Food Chem., Just Accepted Manuscript • DOI: 10.1021/acs.jafc.8b03099 • Publication Date (Web): 08 Oct 2018



Journal of Agricultural and Food Chemistry

Domestic fowl breed variation in egg white protein expression: application of proteomics and transcriptomics

Barbora Bílková (a), Zuzana Świderská (a,b), Lukáš Zita (c), Denis Laloë (d), Mathieu Charles (d), Vladimír Beneš (e), Pavel Stopka (a), Michal Vinkler (a)

a) Charles University, Faculty of Science, Department of Zoology, Prague, Czech Republic, EU
b) Charles University, Faculty of Science, Department of Cell Biology, Prague, Czech Republic, EU
c) Czech University of Life Sciences, Faculty of Agrobiology, Food and Natural Resources, Department of Animal Husbandry, Prague, Czech Republic, EU
d) GABI, INRA, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France, EU
e) European Molecular Biology Laboratory, Heidelberg, Germany, EU

List of abbreviations

BGA - between group analysis
CIA - co-inertia analysis
PLGEM - power law global error model
RPKM - reads per kilobase per million mapped reads

Keywords

antimicrobial peptides, avian oviduct transcriptome, bird albumen, chicken breed, egg white proteome

Abstract

Avian egg white is essential for protecting and nourishing bird embryos during their development. Because egg white is produced in the magnum of the hen oviduct, variability in oviduct gene expression may affect egg white composition in domestic chickens. Since traditional poultry breeds may represent a source of variation, in the present study we describe the egg white proteome (mass spectrometry) and corresponding magnum transcriptome (high-throughput sequencing) for twenty hens from five domestic fowl breeds (large breeds: Araucana, Czech golden pencilled, Minorca; small breeds: Booted bantam, Rosecomb bantam). In total, we identified 189 egg white proteins and 16391 magnum-expressed genes. The majority of egg white protein content comprised proteins with an antimicrobial function. Despite general similarity, Between-class Principal Component Analysis revealed significant breed-specific variability in protein abundances, differentiating especially small and large breeds. Though we found a strong association between magnum mRNA expression and egg white protein abundance across genes, co-inertia analysis revealed no transcriptome/proteome co-structure at the individual level. Our study is the first to show variation in protein abundances in egg white across chicken breeds with potential effects on egg quality, biosafety and chick development. The observed inter-individual variation probably results from post-transcriptional regulation creating a discrepancy between proteomic and transcriptomic data.

Introduction

The structure of the avian egg is a characteristic apomorphy of present-day birds. The egg develops gradually as it passes through specialised sections of the female oviduct, where distinct egg structures are progressively formed¹. Given that the domestic chicken (Gallus gallus f. domestica) is both a model research species in avian biology and an economically important agricultural species, most of our present knowledge about the egg formation process comes from this species. In chickens, it has previously been reported that distinct parts of the hen oviduct differ in the amount of proteins expressed, corresponding to their respective function in egg formation². By volume, the egg white makes up the largest part of the chicken egg (ca. two thirds). Correspondingly, the magnum, where the egg white is formed, is the longest part of the hen oviduct. The magnum is covered with a mucosal tissue formed by tubular gland cells folded in a spiral pattern that enlarges the secreting surface¹. During the three-hour period when the egg yolk passes through the magnum, the complex egg white substance is secreted, which is involved in both protection and nourishment of the developing embryo.

To date, only four studies have attempted to describe the complete composition of chicken egg white using modern proteomic methods⁴⁻⁶. Each of these studies was conducted using modern commercially produced eggs only, in which they found between 78 and 202 distinct egg white proteins. Earlier results gained by two-dimensional electrophoresis suggest that egg white protein abundances could differ substantially between both chicken breeds and populations⁷. Since the egg white contains large amounts of proteins involved in antimicrobial defence⁸, such variation could potentially affect egg quality, biosafety and chicken embryo development.

Furthermore, the whole magnum transcriptome has not yet been described for chickens. While previous research has targeted experimentally induced changes in mRNA expression of genes within the oviduct⁹, associations between such changes and egg proteome composition in terms of both protein expression and their abundances remain largely unknown. As demonstrated by Kim and Choi¹⁰, experimental corticosterone treatment affects both magnum mRNA expression and egg white protein abundance of certain genes, though not necessarily in the same direction. In various systems, integrative studies of whole proteome and transcriptome data have shown that, despite their general consistency¹¹⁻¹², large discrepancies between mRNA expression and respective protein expression can be found¹³,¹⁴, suggesting an important role of post-transcriptional regulation of gene expression¹⁵.

In this study, we attempt to improve our understanding of breed-specific variability in egg white protein composition, in terms of protein identification and quantification of relative protein abundances, and to fill gaps in our knowledge of the relationship between the hen magnum transcriptome and the egg proteome. We focus on traditional chicken breeds as these exhibit high phenotypic variability in morphological, physiological and immunological traits¹⁶,¹⁷ and egg production¹⁸. We selected five distinct traditional chicken breeds (Araucana, Booted bantam, Czech golden pencilled, Rosecomb bantam and Minorca) that differ greatly in phenotypic traits such as body size, weight and shape, and also in area of origin. We describe both the egg proteome and hen magnum transcriptome in order to highlight differences between breeds in both protein and transcript amounts and to test for associations between proteomic and transcriptomic data between individuals.

Methods

Animals

A total of 535 fertilised eggs from hens of five different chicken breeds (Araucana, Booted bantam, Czech golden pencilled, Minorca and Rosecomb bantam) were kindly provided by small non-commercial breeders located in the Czech Republic, EU. All of the chosen breeds can be considered layer breeds, with a relatively high laying capacity of 90-190 eggs per year. Booted bantam and Rosecomb bantam also belong to the dwarf fancy breeds, referred to herein as the small breeds, with an adult body weight of 0.4-0.8 kg; the body weights of the Araucana, Czech golden pencilled and Minorca breeds range from 1.2 to 1.7 kg, and these are therefore referred to as the large breeds. The eggs were incubated to hatching in an OvaEasy 380 Advance EX automatic egg incubator (Brinsea, Weston-super-Mare, UK) at a temperature of 37.5 °C and 50% humidity. The twenty hens used in this experiment hatched between 13th May 2015 and 22nd June 2015 (Table S1) and were immediately marked with two numbered wing-marks. All hens were housed under standardised conditions at the animal facility of the Czech University of Life Sciences in 0.50 × 0.88 × 0.45 m cages with access to food and water ad libitum (initially in breed-specific flocks of 5-10 animals, later individually). After reaching reproductive maturity (estimated by the onset of egg production; for all hens between 224 and 244 days of age), three eggs (excluding the first three) were collected from each hen. The hens were then euthanised by rapid cervical dislocation. Samples of the magnum tissue (the segment of the oviduct where egg white is formed) were taken within 15 min after euthanasia, placed into RNAlater reagent (Qiagen, Hilden, Germany), kept overnight at +4 °C and afterwards stored at -80 °C. This research was approved by the Ethical Committee of the Faculty of Science, Charles University (reference number 1373/2016-4).

Egg white sample preparation

All eggs were collected from hens within 12 hours after oviposition and immediately stored at 5 °C. To minimise changes in the abundance of some egg white proteins during storage¹⁹, we processed all eggs within two days of oviposition. First, the eggshells were opened with a sterilised knife and the egg whites separated from the yolks. The whites were then homogenised with a glass mechanical hand homogeniser for 60 seconds and divided into six 250 µL aliquots that were kept at -20 °C until analysis. Prior to analysis, all egg white samples from the three eggs of the same individual were pooled by mixing 4 µL of each sample (12 µL of egg white in total). The pooled samples were then diluted in 12 µL of PBS and precipitated with 112 µL of 98% ethanol. All precipitated samples were centrifuged at 2800 g and 4 °C for 15 min. After centrifugation, the supernatant was discarded and the samples dried for 30 min at 37 °C. These were then re-suspended in 48 µL of digestion buffer (1% SDS, 100 mM triethylammonium bicarbonate (TEAB), pH 8.5) and cleaved with trypsin (1/50, trypsin/protein) at 37 °C overnight.

nLC-MS2 analysis

Nano reversed-phase columns were used to elute peptide cations using the same method as Cerna²⁰. The eluting peptide cations were converted to gas-phase ions by electrospray ionisation and analysed on a Thermo Orbitrap Fusion mass spectrometer (Q-OT-qIT, Thermo). Survey scans of peptide precursors from 400 to 1600 m/z were performed at 120K resolution (at 200 m/z) with a 5 × 10⁵ ion count target. Tandem MS/MS was performed by isolation at 1.5 Th with the quadrupole, high-energy collision dissociation (HCD) fragmentation with a normalised collision energy of 30, and rapid scan MS analysis in the ion trap. The MS/MS ion count target was set to 10⁴ and the maximum injection time was 35 ms. Only those precursors with a charge state of 2–6 were sampled for MS/MS. The dynamic exclusion duration was set to 45 s with a 10 ppm tolerance around the selected precursor and its isotopes. Monoisotopic precursor selection was turned on and the instrument was run at top speed with 2 s cycles.

Proteome data analysis

All data were collected and quantified using MaxQuant software version 1.5.3.8²¹. The false discovery rate (FDR) was set to 1% for identification of all peptides and proteins. We set a minimum peptide length of seven amino acids. The Andromeda search engine was used for the MS/MS spectra search against the UniProt database, with all duplicates removed. Enzyme specificity was set as C-terminal to Arg and Lys, also allowing cleavage at proline bonds and a maximum of two missed cleavages. Dithiomethylation of cysteine was selected as a fixed modification and N-terminal protein acetylation and methionine oxidation as variable modifications. Quantifications were performed with the label-free algorithms²¹.

We were able to detect 266 proteins, of which 67 were assigned as contaminants. To standardise the total abundance of non-contaminating egg white proteins across samples, we normalised the amount of each protein according to the formula $C_{net} = C_{raw}/(1 - P_{cont})$, where $C_{net}$ indicates the net amount of protein counts after normalisation, $C_{raw}$ the raw abundance of protein counts and $P_{cont}$ the proportion of contamination in a given sample. In addition to chicken proteins, we also detected 10 proteins of avian-specific viruses. We excluded all proteins occurring in three or fewer samples from further analysis, leaving a total dataset of 115 egg white proteins.
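The contaminant normalisation and the presence filter described above can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the authors' actual post-processing: the function names, the layout (proteins in rows, samples in columns) and the reading of P_cont as the contaminant share of each sample's total signal are assumptions made for the example.

```python
import numpy as np

def normalise_for_contaminants(raw, is_contaminant):
    """Apply C_net = C_raw / (1 - P_cont) separately to each sample (column).

    raw: 2-D array of abundances, proteins in rows, samples in columns.
    is_contaminant: boolean vector flagging contaminant proteins (rows).
    Returns normalised abundances of the non-contaminant proteins only.
    """
    raw = np.asarray(raw, dtype=float)
    is_contaminant = np.asarray(is_contaminant, dtype=bool)
    # P_cont (assumed definition): contaminant share of each sample's total signal
    p_cont = raw[is_contaminant].sum(axis=0) / raw.sum(axis=0)
    return raw[~is_contaminant] / (1.0 - p_cont)

def keep_frequent_proteins(abundances, min_samples=4):
    """Drop proteins detected (abundance > 0) in three or fewer samples."""
    abundances = np.asarray(abundances, dtype=float)
    detected = (abundances > 0).sum(axis=1)
    return abundances[detected >= min_samples]
```

With this definition of P_cont, the normalised non-contaminant abundances of each sample sum to that sample's original total signal, which is what makes samples with different contamination levels comparable.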

For protein classification, we used the online PANTHER library²². To find overrepresented Gene Ontologies (GOs) among the egg white proteins, we ran the PANTHER Overrepresentation Test with Bonferroni correction for multiple testing against the G. gallus reference list (all genes in the database), using the GO molecular function complete and GO biological process complete annotation data sets (GO Ontology database release date: 14 August 2017).

Transcriptomic data analysis

Total RNA was isolated from magnum samples using the High Pure RNA Tissue Kit (Roche, Basel, Switzerland), according to the manufacturer’s instructions. The total amount of extracted RNA was quantified using an Agilent 2100 Bioanalyser with the Agilent RNA 6000 Nano Kit (Agilent Technologies, California, USA). Library preparation and stranded paired-end mRNA sequencing were performed at the European Molecular Biology Laboratory (EMBL), Heidelberg. Barcoded stranded mRNA-seq libraries were prepared using the Illumina TruSeq RNA Sample Preparation v2 Kit (Illumina, San Diego, CA, USA), implemented on the liquid handling robot Beckman FXP2. The libraries obtained were pooled in equimolar amounts, and a 1.8 pM solution of this pool was loaded on the Illumina NextSeq 500 sequencer and sequenced bi-directionally, each read being 85 bases long, thereby generating ~25 million sequence pairs for each library. The sequencing results were submitted to the NCBI Sequence Read Archive (SRA Acc. No. SRP126816).

Read sequences were trimmed of sequencing adaptors using Trim Galore! software (Babraham Bioinformatics, Babraham Institute, Cambridge, UK) and low-quality bases (Phred quality score < 30) were removed from both (3’ and 5’) ends using SICKLE²³ software. Reads with a resulting trimmed sequence of less than 30 bp were discarded. The remaining high-quality trimmed reads were then aligned to the G. gallus reference genome assembly Gallus_gallus-5.0 (GCA_000002315.3) using STAR software²⁴ for ultrafast read alignment. The percentage of uniquely mapped reads ranged from 91.82% to 94.34% (Table S2). Sequence reads were assigned and quantified at the gene level using the featureCounts program from the Subread package²⁵. In order to compare mean expression between genes, the read counts obtained from the featureCounts program were converted to Reads Per Kilobase per Million mapped reads (RPKM).
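The RPKM conversion follows the standard definition, RPKM = 10⁹ × C / (N × L), where C is the read count of a gene, L its length in base pairs and N the total number of reads in the library. A minimal Python sketch is given below. The function name and array layout are assumptions for illustration, and N is taken here as the sum of the counts assigned to genes, which may differ from the total used by the authors.

```python
import numpy as np

def rpkm(counts, gene_lengths_bp):
    """Convert raw read counts to RPKM.

    counts: 2-D array, genes in rows, libraries (samples) in columns.
    gene_lengths_bp: 1-D array of gene lengths in base pairs, one per row.
    """
    counts = np.asarray(counts, dtype=float)
    lengths = np.asarray(gene_lengths_bp, dtype=float)
    # Assumption: library size N = total reads assigned to genes in that library
    total_reads = counts.sum(axis=0)
    # 1e9 = 1e3 (per kilobase) * 1e6 (per million reads)
    return counts * 1e9 / (lengths[:, None] * total_reads)
```

Dividing by gene length is what allows expression to be compared between genes within a library, which is the stated purpose of the conversion.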

Statistical analysis

Prior to analysis, we transformed both the proteomic and transcriptomic data using a natural logarithmic transformation (y = log(x + 1)). All data were then centred and scaled. First, we performed a PCA on the proteome and transcriptome datasets and analysed the effects of breed on proteome and transcriptome inter-individual variation. Second, we performed a Between Groups Component Analysis (BGA) on the proteomic dataset, with breed groups (breed; small breeds [Booted bantam and Rosecomb bantam] versus large breeds [Araucana, Czech golden pencilled, Minorca]) used as grouping factors. BGA focuses on between-group variability by performing a PCA on group means. The importance of the difference between groups is assessed by the ratio of between-group inertia to total inertia²⁶. The statistical significance of differences between groups was checked using a Monte Carlo permutation test.
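The preprocessing and the between-group analysis can be sketched in Python with NumPy as follows. This is a simplified illustration, not the ade4 implementation used in the study: all function names are hypothetical, samples are given uniform weights, and the permutation test simply reshuffles the group labels.

```python
import numpy as np

def preprocess(x):
    """log(x + 1) transform, then centre and scale each variable (column)."""
    z = np.log1p(np.asarray(x, dtype=float))
    z = z - z.mean(axis=0)
    sd = z.std(axis=0)
    sd[sd == 0] = 1.0  # leave constant columns at zero instead of dividing by 0
    return z / sd

def between_group_ratio(z, groups):
    """Ratio of between-group inertia to total inertia for centred data z."""
    groups = np.asarray(groups)
    total = (z ** 2).sum()
    between = sum(
        (groups == g).sum() * (z[groups == g].mean(axis=0) ** 2).sum()
        for g in np.unique(groups)
    )
    return between / total

def bga_axes(z, groups):
    """PCA of the group means, each mean weighted by its relative group size."""
    groups = np.asarray(groups)
    labels = np.unique(groups)
    means = np.vstack([z[groups == g].mean(axis=0) for g in labels])
    weights = np.array([(groups == g).sum() for g in labels]) / len(groups)
    _, s, vt = np.linalg.svd(np.sqrt(weights)[:, None] * means, full_matrices=False)
    return vt, s ** 2  # discriminating axes and the inertia carried by each

def permutation_p_value(z, groups, n_perm=999, seed=0):
    """Monte Carlo test: how often do shuffled labels match the observed ratio?"""
    groups = np.asarray(groups)
    rng = np.random.default_rng(seed)
    observed = between_group_ratio(z, groups)
    hits = sum(
        between_group_ratio(z, rng.permutation(groups)) >= observed
        for _ in range(n_perm)
    )
    return observed, (hits + 1) / (n_perm + 1)
```

Because the data are centred, the grand mean is zero, so the between-group inertia reduces to the size-weighted squared norms of the group means; the inertias returned by `bga_axes` sum to that quantity divided by the number of samples.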

We used the Power Law Global Error Model (PLGEM)²⁷ to identify egg white proteins differing in abundance between small and large breeds. The signal-to-noise ratio was calculated as it explicitly takes unequal variances into account and penalises proteins that have a higher variance in each group more than proteins that have a high variance in one group and a low variance in the other²⁷. As PLGEM can only be fitted on a set of replicates under the same experimental conditions, we first fitted the model to the Czech golden pencilled data. Correlation between mean values and standard deviations was high (r² = 0.96, Pearson = 0.883); hence, we continued with the resampled signal-to-noise ratio and calculated differences with the corresponding p-values between the small and large breed groups.
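The idea behind the PLGEM signal-to-noise ratio can be illustrated with a short Python sketch. PLGEM assumes a power-law relation between the standard deviation and the mean of replicated measurements, ln(sd) = k·ln(mean) + c, and uses the modelled rather than the observed spread in the denominator of the statistic. The code below is a deliberately simplified assumption: it fits the power law by ordinary least squares on all proteins, whereas the published method fits it on binned quantiles and derives p-values by resampling, neither of which is reproduced here. All function names are hypothetical.

```python
import numpy as np

def fit_power_law(replicates):
    """Fit ln(sd) = k * ln(mean) + c across proteins (rows) by least squares."""
    replicates = np.asarray(replicates, dtype=float)
    mean = replicates.mean(axis=1)
    sd = replicates.std(axis=1, ddof=1)
    ok = (mean > 0) & (sd > 0)  # logarithms need strictly positive values
    k, c = np.polyfit(np.log(mean[ok]), np.log(sd[ok]), 1)
    return k, c

def modelled_sd(mean, k, c):
    """Spread predicted by the fitted power law for a given mean abundance."""
    return np.exp(c) * np.power(mean, k)

def plgem_stn(group_a, group_b, k, c):
    """Signal-to-noise ratio: mean difference over the summed modelled spreads."""
    mean_a = np.asarray(group_a, dtype=float).mean(axis=1)
    mean_b = np.asarray(group_b, dtype=float).mean(axis=1)
    return (mean_a - mean_b) / (modelled_sd(mean_a, k, c) + modelled_sd(mean_b, k, c))
```

In this sketch the model would be fitted once on the replicates of a single condition (here, one breed) and then reused to score the comparison between the two breed groups.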

For the purpose of consistency, we selected the 97 gene-protein pairs that were common to both datasets (Table S3) when comparing the proteomic and transcriptomic data. We calculated Spearman’s correlation between the mean mRNA expression values and mean protein abundance. To further describe relationships between the proteome and transcriptome datasets, we used Co-Inertia Analysis (CIA)²⁸,²⁹, a multivariate method. CIA is useful for analysing the relationships between two tables (here representing proteomic and transcriptomic data) having the same samples in rows. This method finds the maximum shared structure between two datasets representing the same individuals. CIA finds the ordinations (dimension reduction diagrams) of the two datasets that are most similar. This is done by finding successive axes from the two datasets with maximum covariance. Co-structure between the proteome and transcriptome datasets is measured by the RV-coefficient, ranging from 0 to 1, where 1 indicates the highest and 0 the lowest degree of co-structure. The statistical significance of co-inertia was evaluated with a Monte Carlo permutation test. CIA can be applied to datasets where the number of variables far exceeds the number of samples (which is the case when applied to x-omics data; see ²⁹,³⁰). All statistical analyses were undertaken in R software version 3.4.0³¹ with the ade4 package³².
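The RV-coefficient and its permutation test can be sketched in Python with NumPy. The RV-coefficient compares the sample-by-sample configuration matrices of the two tables, RV = tr(XXᵀYYᵀ) / sqrt(tr((XXᵀ)²) · tr((YYᵀ)²)), after column centring. The code is a hedged illustration rather than the ade4 routine used in the study; the function names are hypothetical, and the permutation test shuffles the rows of one table, which is one common way of breaking the pairing between individuals.

```python
import numpy as np

def rv_coefficient(x, y):
    """RV-coefficient between two tables sharing the same samples in rows."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    sxx = x @ x.T  # sample-by-sample configuration of the first table
    syy = y @ y.T  # sample-by-sample configuration of the second table
    # For symmetric matrices, trace(A @ B) equals the sum of elementwise products
    return (sxx * syy).sum() / np.sqrt((sxx * sxx).sum() * (syy * syy).sum())

def rv_permutation_test(x, y, n_perm=999, seed=0):
    """Monte Carlo test: shuffle the rows of y to break the sample pairing."""
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    observed = rv_coefficient(x, y)
    hits = sum(
        rv_coefficient(x, y[rng.permutation(len(y))]) >= observed
        for _ in range(n_perm)
    )
    return observed, (hits + 1) / (n_perm + 1)
```

The coefficient is unchanged by rotating or uniformly rescaling either table, so it measures shared structure among individuals rather than agreement of individual variables.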


Results and discussion

General description of the hen egg white proteome

At the proteomic level, we detected 115 egg white proteins at 0.01 FDR, ranging from 109 to 112 in each breed-specific proteome (Table S4). The successful identification of these proteins was due to the relatively high number of peptides per identification (9.45 ± 13.19, mean ± SD), high sequence coverage (27.02 ± 19.86%) and high unique sequence coverage (26.08 ± 19.97%). Furthermore, we obtained a high and significant Spearman’s rank correlation (r = 0.720, P