Genome-Wide Association Mapping of Seed Coat ... - ACS Publications

Jun 26, 2017 - Genome-Wide Association Mapping of Seed Coat Color in Brassica napus. Jia Wang,. †,‡,§. Xiaohua Xian,. †,‡. Xinfu Xu,. ‡. Cu...
Article pubs.acs.org/JAFC

Genome-Wide Association Mapping of Seed Coat Color in Brassica napus

Jia Wang,†,‡,§ Xiaohua Xian,†,‡ Xinfu Xu,‡ Cunmin Qu,‡ Kun Lu,‡ Jiana Li,‡ and Liezhao Liu*,‡

‡College of Agronomy and Biotechnology, Southwest University, Beibei, Chongqing 400715, People’s Republic of China
§Nanchong Academy of Agricultural Sciences, Nanchong, Sichuan 637000, People’s Republic of China

S Supporting Information *

ABSTRACT: Seed coat color is an extremely important breeding characteristic of Brassica napus. To elucidate the factors affecting the genetic architecture of seed coat color, a genome-wide association study (GWAS) of seed coat color was conducted with a diversity panel comprising 520 B. napus cultivars and inbred lines. In total, 22 single-nucleotide polymorphisms (SNPs) distributed on 7 chromosomes were found to be associated with seed coat color. The most significant SNPs were found in 2014 near Bn-scaff_15763_1-p233999, only 43.42 kb away from BnaC06g17050D, which is orthologous to Arabidopsis thaliana TRANSPARENT TESTA 12 (TT12), an important gene involved in the transportation of proanthocyanidin precursors into the vacuole. Two of eight repeatedly detected SNPs can be identified and digested by restriction enzymes. Candidate gene mining revealed that the relevant regions of significant SNP loci on the A09 and C08 chromosomes are highly homologous. Moreover, a comparison of the GWAS results to those of previous quantitative trait locus (QTL) studies showed that 11 SNPs were located in the confidence intervals of the QTLs identified in previous studies based on linkage analyses or association mapping. Our results provide insights into the genetic basis of seed coat color in B. napus, and the beneficial allele, SNP information, and candidate genes should be useful for selecting yellow seeds in B. napus breeding.

KEYWORDS: association mapping, seed coat color, Brassica napus



INTRODUCTION

Brassica napus L., which is grown worldwide, is one of the most important oilseed crops, and the yellow seed coat color is one of the most important traits of B. napus.1,2 With the improvement in living standards, there are higher requirements for the quality of edible oil; moreover, the development of more nutritious feed is required for livestock and poultry. Thus, the objective of rapeseed breeding should be to improve the oil yield per unit area as well as the protein content and to improve the edible quality of rapeseed oil and the quality of feed. With the same genetic background, yellow-seeded B. napus has a thinner seed coat, higher protein and oil contents, and a lower fiber content than black-seeded B. napus.3 In addition, yellow-seeded B. napus has a series of advantages in industrialization. First, the characteristics of yellow-seeded hybrid rapeseed are obvious, which is conducive to breeding for this trait and to the development of high-quality seed. Second, the characteristics of oil from yellow-seeded B. napus are obvious, which makes it easy to form brands and helps to prevent the development of counterfeit products. Because of the many advantages of yellow-seeded lines, research on yellow-seeded B. napus is a hot topic in the field of rapeseed breeding for quality. However, as a result of allotetraploidy, multiple gene inheritance, maternal effects, and environmental effects, seed coat color is difficult to use as a morphological marker for improved oil quality in breeding programs.4 Therefore, gene mapping and cloning of the yellow-seeded traits play crucial roles in understanding the mechanism of these traits and can also establish a solid theoretical foundation for the better utilization of yellow-seeded lines in production.

With highly developed molecular marker technologies, previous studies have mapped seed coat color using molecular markers. Badani et al.5 observed major quantitative trait loci (QTLs) with a large effect on seed coat color in multiple environments on chromosome N18, and this result was supported by Zhang et al., who found that a significant QTL in linkage group N18 explained 47.16−60.92% of the phenotypic variation in seed coat color using one cross, namely, the Quantum 9 × No. 2127-17 (HZ-1) population.6 Similarly, a major QTL with a large effect on seed coat color has been found at a locus on homologous chromosome A09 using different genetic backgrounds in different studies.4,7−11 In addition, seed coat color QTLs have been reported on other chromosomes.5,9,12,13 However, conventional QTL mapping efforts using the segregated progeny of a biparental cross enable the detection of only a subset of loci/alleles within the crop and offer limited resolution as a result of the small number of informative recombination events between linked genetic loci.14 Genome-wide association studies (GWASs), also called association mapping or linkage disequilibrium (LD) mapping, are a valuable tool for the dissection of QTLs controlling complex traits in crop plants.15 In recent years, GWASs have been widely used in research on B. napus complex traits, including seed quality traits, such as seed oil content,16 seed weight,17 phenolic compounds,18 and seed tocopherol content and composition,19 agronomic traits,20 such as flowering time,21

Received: March 20, 2017 Revised: June 4, 2017 Accepted: June 14, 2017


DOI: 10.1021/acs.jafc.7b01226 J. Agric. Food Chem. XXXX, XXX, XXX−XXX


Journal of Agricultural and Food Chemistry

Table 1. Phenotypic Variations in Seed Coat Color R Value in the B. napus Panel

trait        range          mean ± SD      CV (%)a   skewness   kurtosis   G   E   G×E   h2 (%)b
14 R value   30.39−168.42   84.25 ± 0.97   33.38     0.905      1.52       c   c   c     85.66
15 R value   24.50−157.16   71.26 ± 1.01   36.39     0.766      1.16

a CV is an abbreviation of the coefficient of variation, which was estimated as the ratio of the standard deviation to the mean of all accessions. b h2 is broad-sense heritability; $h^2 = \hat{\sigma}_g^2 / (\hat{\sigma}_g^2 + \hat{\sigma}_{ge}^2/n + \hat{\sigma}_e^2/(nr)) \times 100\%$, where $\hat{\sigma}_g^2$ is the genetic variance, $\hat{\sigma}_{ge}^2$ is the variance due to the G × E interaction, $\hat{\sigma}_e^2$ represents the residual error, n is the number of environments (years), and r is the number of replicates. c The values are significant at p < 0.01 for the effect of genotype (G), environment (E), and genotype × environment interaction (G × E) on phenotypic variance estimated by two-way ANOVA. The G, E, G × E, and h2 entries apply to both years jointly.
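As a minimal sketch of the heritability formula in footnote b, the calculation can be written in a few lines of Python. The variance components below are hypothetical placeholders, not estimates from this study; only n = 2 environments (years) and r = 3 replicates follow the experimental design described in the text, and the function name is illustrative.

```python
def broad_sense_heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Return h^2 in percent: var_g / (var_g + var_ge/n + var_e/(n*r)) * 100."""
    denominator = var_g + var_ge / n_env + var_e / (n_env * n_rep)
    return 100.0 * var_g / denominator


# Hypothetical variance components, for illustration only.
h2 = broad_sense_heritability(var_g=600.0, var_ge=120.0, var_e=90.0,
                              n_env=2, n_rep=3)
print(f"h2 = {h2:.2f}%")
```

Dividing the interaction and residual variances by n and nr reflects that the heritability is expressed on an entry-mean basis, so more years or replicates raise the estimate for the same variance components.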

Table 2. Genome-Wide Significant Association Signals of Seed Coat Color a

                                                          p value                          R2 (%)
marker                      chromosome  allele  MAF   2014            2015            2014    2015
Bn-A06-p21127973            A06         A/C     0.45  5.36 × 10−8     3.46 × 10−7     8.11    6.98
Bn-A06-p21127779            A06         C/T     0.36  2.94 × 10−7     NA              7.09    NA
Bn-A06-p21127805            A06         A/G     0.44  6.73 × 10−7     NA              6.74    NA
Bn-A06-p21127138            A06         A/C     0.37  1.56 × 10−6     NA              5.55    NA
Bn-scaff_15762_1-p1132736   A07         A/G     0.06  1.55 × 10−6     NA              5.32    NA
Bn-A09-p34469671            A09         T/C     0.13  1.89 × 10−8     5.72 × 10−8     8.32    8.03
Bn-A09-p35321301            A09         T/G     0.29  9.99 × 10−8     2.17 × 10−7     7.56    7.18
Bn-A09-p35508720            A09         T/C     0.19  4.12 × 10−7     1.81 × 10−7     6.87    7.29
Bn-A09-p35505120            A09         C/A     0.12  NA              9.06 × 10−7     NA      6.31
Bn-A09-p30463234            A09         C/A     0.32  NA              1.37 × 10−6     NA      5.87
Bn-scaff_18181_1-p1267885   C05         C/A     0.24  1.12 × 10−6     NA              5.93    NA
Bn-scaff_15763_1-p233999    C06         T/C     0.15  4.13 × 10−10    NA              10.58   NA
Bn-scaff_15763_1-p234548    C06         A/G     0.40  3.10 × 10−8     NA              8.23    NA
Bn-scaff_17984_1-p517694    C06         G/A     0.06  NA              1.26 × 10−6     NA      5.98
Bn-scaff_23771_1-p151934    C07         G/A     0.13  NA              1.01 × 10−6     NA      6.21
Bn-scaff_20947_1-p93326     C08         T/C     0.42  3.11 × 10−9     4.47 × 10−10    8.73    9.99
Bn-scaff_21269_1-p158333    C08         T/C     0.29  9.23 × 10−9     6.23 × 10−9     8.66    8.72
Bn-scaff_21269_1-p122418    C08         A/C     0.28  1.27 × 10−7     1.26 × 10−7     7.38    7.42
Bn-scaff_22788_1-p739       C08         G/T     0.17  8.10 × 10−7     8.30 × 10−7     6.53    6.46
Bn-scaff_20947_1-p93377     C08         C/T     0.40  NA              1.11 × 10−6     NA      6.02
Bn-scaff_21269_1-p204828    C08         T/C     0.23  NA              7.24 × 10−7     NA      6.62
Bn-scaff_21269_1-p246861    C08         C/T     0.20  NA              8.10 × 10−8     NA      7.64

a MAF, minor allele frequency; NA, this SNP did not reach the threshold value of the year.
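The threshold referred to in the footnote is the Bonferroni-style cutoff of 0.05/n with n = 31 839 SNPs that is described in Materials and Methods. A short Python sketch of that arithmetic follows; the function names are illustrative and are not part of TASSEL or any other tool used in the study.

```python
import math

N_SNPS = 31839   # SNPs used in the association analysis (from the text)
ALPHA = 0.05     # family-wise significance level used in the text


def bonferroni_cutoff(alpha: float, n_tests: int) -> float:
    """Per-test p value cutoff after Bonferroni correction."""
    return alpha / n_tests


def is_significant(p_value: float, alpha: float = ALPHA,
                   n_tests: int = N_SNPS) -> bool:
    """True when a SNP-trait association passes the corrected cutoff."""
    return p_value < bonferroni_cutoff(alpha, n_tests)


p_cut = bonferroni_cutoff(ALPHA, N_SNPS)
neg_log10_cut = -math.log10(p_cut)
print(f"p cutoff = {p_cut:.3g}, -log10(p) cutoff = {neg_log10_cut:.2f}")
# prints: p cutoff = 1.57e-06, -log10(p) cutoff = 5.80
```

With this cutoff, the weakest retained signals in Table 2 (for example, p = 1.56 × 10−6) pass only narrowly, which is why several SNPs are listed as NA in one of the two years.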

plant height,22,23 primary branch number,23 branch angle,24 and pod shatter resistance,13 and disease resistance, such as Leptosphaeria maculans resistance,25 stem canker,26 and Sclerotinia stem rot resistance.27 However, few association mapping studies of seed coat color have been reported. In this study, we performed a GWAS for seed coat color with 520 B. napus inbred lines. The objectives of our study were (i) to identify the single-nucleotide polymorphisms (SNPs) associated with seed coat color, (ii) to compare previous QTL results to our analysis results, and (iii) to identify candidate genes and trait−SNPs for seed coat color.



MATERIALS AND METHODS

Plant Material. A total of 520 B. napus lines were collected from spring, winter, and semi-winter accessions and were cultivated under natural growing conditions on the experimental farm of the Chongqing Engineering Research Center for Rapeseed, Southwest University in Beibei, Chongqing, China (106.40° E, 29.80° N), for 2 consecutive years.27,28 The lines were arranged in a randomized complete block design with three replicates. In the growing periods from September 2013 to May 2014 (referred to as 2014) and from September 2014 to May 2015 (referred to as 2015), each line was planted in two rows of 10 plants per row, with 30 cm between rows and a distance of 20 cm between plants within each row. Open-pollinated seeds were collected from five randomly chosen plants in each line at maturity for near-infrared reflectance spectroscopy (NIRS) analysis.

Measurement of Seed Coat Color. The seed coat color was scored using NIRS with a NIR System 6500 and the WinISI II software (FOSS GmbH, Rellingen, Germany). The spectra between 1100 and 2498 nm were recorded, registering log(1/R) absorbance values at 2 nm intervals for each sample.10,28 The phenotype values of the seed coat color were extrapolated from near-infrared (NIR) spectra using NIR calibrations developed in our lab specifically for the measurement of the seed coat color in B. napus.29 NIR-derived estimates for seed coat color were averaged over three technical repetitions.

Genome-Wide Association Analysis. Population structure, relative kinship, and LD analysis had already been completed in previous studies.27,28 A trait−SNP association analysis was performed using the Q + K model with a total of 31 839 SNP sites [missing data < 20%, and minor allele frequency (MAF) > 0.05], and the Q + K model was implemented via a mixed linear model (MLM)30 by a variance component estimation in TASSEL 5.1. Because the Bonferroni criterion at the 0.01 level (0.01/number of tests) is typically a very strict threshold,31 we corrected for multiple hypothesis testing using a threshold of −log10(0.05/n), where n is the total number of SNPs used in the association analysis. In this study, 5.8 [−log10(0.05/31 839) ≈ 5.8] was used as a threshold for the significance of associations between SNPs and traits. After this correction, the association between a SNP locus and a target trait was considered significant if −log10(p value) > 5.8. A Manhattan plot was displayed using qqman software.32


Figure 1. Genome-wide association studies of seed coat color. Manhattan plots of the compressed MLMs for seed coat color. Negative log10-transformed p values from a genome-wide scan are plotted against position on each of the 19 chromosomes. The black horizontal dashed line indicates the genome-wide significance threshold, and the green marker is the repeatedly detected locus. The corresponding candidate genes are annotated. Because multiple candidate genes may be identified for one locus, the most likely candidate gene is annotated in the plots. Full gene information is listed in Table S3 of the Supporting Information.

Haplotype blocks were constructed via the four-gamete rule with Haploview 4.2.33 The parameters were set as follows: the Hardy−Weinberg p value cutoff was 0.001; the minimum genotype was 75%; the maximum number of Mendel errors was 1; and the MAF was 0.05. NEBcutter V2.0 (http://nc2.neb.com/NEBcutter2/index.php) was used for enzyme digestion site prediction based on SNP site base differences.

Comparison of the GWAS Results to Those of Previous QTLs. To compare the GWAS results to those of previous QTLs, we performed a comparison of the physical position of the seed coat color QTL among different populations. The QTL information for seed coat color was collected from previously reported mapping populations, including six B. napus linkage populations, i.e., GY F2,34 GZ RIL,1 HZ F2,6 YE DH, EV DH,11 and GP RIL,10 two Brassica rapa linkage populations, i.e., S3 RIL9 and LR F2,35 and one association map with 217 accessions.36 To align these QTLs with the B. napus reference genome, we performed a BLASTN search against the B. napus reference genome with the primer sequences of their flanking markers (Table S1 of the Supporting Information). For traditional markers with left and right primers, such as simple sequence repeat (SSR) markers, the polymerase chain reaction (PCR) products were generally 100−500 bp in length; thus, only markers whose left and right primers aligned to the same chromosome less than 500 bp apart were retained for further analysis.22

Synteny Analysis of Significant Association Intervals on A09 and C08. A synteny analysis of significant association intervals on A09 and C08 was performed using GSV (http://cas-bioinfo.cas.unt.edu/gsv/homepage.php). In accordance with the requirements of the GSV format, the scores and E value were obtained via a homology comparison of the source sequences of the SNPs from significant association intervals on the A09 and C08 chromosomes.
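The primer-pair filter described above for SSR-type markers (both primers on the same chromosome, less than 500 bp apart) can be sketched as follows. This is only an illustration of the rule as stated in the text: the `Hit` record, the function name, and the example coordinates are hypothetical and do not come from the study's BLASTN output.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    """Best alignment of one primer (hypothetical record layout)."""
    chrom: str
    pos: int  # alignment start on the reference, in bp


def retain_marker(left: Optional[Hit], right: Optional[Hit],
                  max_dist: int = 500) -> bool:
    """Keep a marker only if both primers hit the same chromosome
    and the two hits are less than max_dist bp apart."""
    if left is None or right is None:  # one primer had no hit
        return False
    return left.chrom == right.chrom and abs(left.pos - right.pos) < max_dist


# Invented coordinates, for illustration only.
print(retain_marker(Hit("A09", 30_000_100), Hit("A09", 30_000_350)))  # True
print(retain_marker(Hit("A09", 30_000_100), Hit("C08", 30_000_350)))  # False
print(retain_marker(Hit("A09", 30_000_100), Hit("A09", 30_002_000)))  # False
```

In an allopolyploid such as B. napus, primers often align to homoeologous regions of both subgenomes, so a filter of this kind discards markers whose physical position cannot be assigned unambiguously.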

RESULTS

Phenotypic Characteristics of Seed Coat Color R Value. Seed coat color was reflected by an extrapolated visual light absorbance value. The distribution of the R value in the B. napus accessions is shown in Table 1 and Figure S1 of the Supporting Information. The distribution of the R value for 2014 followed the normal distribution and ranged from 30.39 to 168.42 with a mean of 84.25. The R value for 2015 showed a similar distribution, but the range of 24.50−157.16 with a mean of 71.26 was relatively low. Two-way analysis of variance (ANOVA) was performed for the seed coat color R value using SPSS 20.0 software for the GWAS population, and the genotype (G), environment (E), and genotype × environment interaction (G × E) were indicated to have significant effects on the trait (p < 0.01). In addition, a correlation analysis showed a strong correlation between 2014 and 2015, and the seed coat color R value and the seed acid detergent lignin (ADL) content had an extremely significant negative correlation in 2014 and 2015, with coefficients of 0.705 and 0.555, respectively (Table S2 of the Supporting Information).

Genome-Wide Association Analysis. The results of the trait−SNP association analysis are shown in Table 2 and Figure 1. A total of 22 significant associations were detected on A06, A07, A09, C05, C06, C07, and C08 at p < 1.57 × 10−6 (p = 0.05/31 839, and −log10 p = 5.80) in the 2 years. Among these, 8 of 22 were repeatedly detected in 2014 and 2015. The most significant SNP, Bn-scaff_15763_1-p233999, was only detected in 2014. The peak SNP locus (Bn-scaff_20947_1-p93326) on C08 explained 8.73 and 9.99% of the total phenotypic variance for 2014 and 2015, respectively, based on the R2 values.

Enzyme digestion site prediction for these eight significantly associated SNP sequences found that Bn-A09-p34469671 can be identified and digested by the restriction endonuclease Apo I (R▼AATTY; R, arbitrary purine; Y, arbitrary pyrimidine) when the Bn-A09-p34469671 allele is a thymine (T) (Figure 2A). However, the site cannot be identified and digested when the SNP genotype is a cytosine (C). In addition, Bn-scaff_21269_1-p158333 can be identified and digested by the restriction endonuclease Sac I (GAGCT▼C) when the allele of Bn-scaff_21269_1-p158333 is a thymine (T) (Figure 2B) but not a cytosine (C).

Figure 2. Prediction of the restriction site of two SNPs. Inverted black triangles denote the SNP site: (A) when the allele of Bn-A09-p34469671 was T and (B) when the allele of Bn-scaff_21269_1-p158333 was T.

Further analysis of these two SNPs found that accessions with a thymine (T) allele at Bn-A09-p34469671 displayed, on average, 19.3 and 19.9% reduced seed coat color R values compared to the accessions with a cytosine (C) allele in 2014 and 2015, respectively. The minor allele (C) was represented in only 13% of the 520 accessions. Accessions with a thymine (T) allele at Bn-scaff_21269_1-p158333 had, on average, 16.4 and 19.9% reduced seed coat color R values compared to the accessions with a cytosine (C) allele in 2014 and 2015, respectively (Figure 3). In addition, accessions with a thymine (T) allele at Bn-scaff_22788_1-p739 had, on average, more than 42.3 and 61.4% increased seed coat color R values compared to the accessions with a guanine (G) allele in 2014 and 2015, respectively (Table S4 of the Supporting Information). The minor allele (T) was represented in only 17% of the 520 accessions.

Haplotype block structures were investigated for chromosomes A09 and C08. A total of 40 and 26 haplotype blocks were found on A09 and C08, respectively. Significant SNPs (Bn-scaff_21269_1-p122418 and Bn-scaff_21269_1-p158333) on C08 were located in a haplotype block of 312 kb (from 36.98 to 37.01 Mb), and another SNP, Bn-scaff_20947_1-p93326, was located in an adjacent haplotype block. For A09, the region of significant association ranged from 31.70 to 32.65 Mb, the peak SNP (Bn-A09-p34469671) was located in a haplotype block of 37 kb, and another significant SNP (Bn-A09-p35508720) was located in a haplotype block of 279 kb (Figure 4).

Candidate Gene Mining. To define the regions of interest that contain potential candidate genes, genes within the LD decay range (500 kb; r2 > 0.1) up- and downstream of the GWAS loci were mined. A total of 14 candidate genes for seed coat color regulation were predicted for 22 loci (Table S3 of the Supporting Information); among these, 42.9% (6/14) of the candidate genes were annotated as associated with TRANSPARENT TESTA (TT) genes. BnaC06g17050D (TT12), an important gene involved in the transportation of proanthocyanidin precursors into the vacuole, was located 43.42 kb upstream from the most significant SNP, Bn-scaff_15763_1-p233999, on C06. BnaC08g42690D (MYB61) encodes a putative transcription factor; reduced quantities of mucilage are deposited during the development of the seed coat epidermis in myb61 mutants. In our study, MYB61 was located 10.37 kb upstream from the peak SNP Bn-scaff_20947_1-p93326 on C08. We also identified the MYB61 gene BnaA09g48440D, located 11.69 kb downstream from the peak SNP Bn-A09-p35321301 on A09. In addition, the peak SNP Bn-A09-p35321301 was found within BnaA09g48450D (PIF3), which encodes a transcription factor that binds to anthocyanin biosynthetic genes in a light- and HY5-independent fashion and is annotated with positive regulation of the anthocyanin metabolic process during the plant embryo globular stage. We also identified the PIF3 gene BnaC08g42700D, located 20.78 kb upstream of the peak SNP Bn-scaff_20947_1-p93326 on C08 (Figures 1 and 4). We also found that all significantly associated SNPs on A06 were within a proanthocyanidin-specific gene, BnaA06g30430D (TT10).37

Synteny Analysis of Significant Association Intervals on A09 and C08. During candidate gene mining, we detected the same candidate genes in the vicinity of the significant SNP loci on chromosomes A09 and C08. Early studies reported that the A09 linkage group was highly homologous to the C08 linkage group in B. napus (http://genomevolution.org/wiki/index.php/Brassica_oleracea_v._Brassica_rapa). We conducted a synteny analysis of significant association intervals on A09 and C08 using SNP sequences. The results, as shown in Figure 5, show that 32.45−32.66 Mb on A09 and 36.74−37.02 Mb on C08 have very high collinearity, indicating that the two regions may be homologous. Interestingly, the peak SNP on A09 and the peak SNP on C08 are from these two intervals, respectively. This agreement supports the reliability and stability of the significant association intervals.

Comparison of the GWAS Results to Those of Previous QTLs. The primer sequences of QTL flanking

Figure 3. Phenotypic differences between lines carrying different alleles of the SNPs associated with seed coat color: (A) phenotypic difference between lines carrying different alleles, TT, TC, and CC, of locus Bn-A09-p34469671 and Bn-scaff_21269_1-p158333 in 2014 and (B) phenotypic difference between lines carrying different alleles, TT, TC, and CC, of locus Bn-A09-p34469671 and Bn-scaff_21269_1-p158333 in 2015. The difference of the mean and p value based on ANOVA are also given. (∗∗) Significant at p ≤ 0.01. (∗∗∗) Significant at p ≤ 0.001. The numbers in parentheses behind the allele refer to the number of accessions carrying the corresponding allele.
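The allele comparisons summarized in this figure come down to differences between group means of the R value. The sketch below shows one plausible way to express such a difference as a percentage of a reference group; the R values are invented for illustration, and the choice of the CC group as the reference is an assumption rather than something stated in the text.

```python
from statistics import mean


def percent_difference(group, reference):
    """Difference of mean(group) from mean(reference), in percent of the
    reference mean. Negative values indicate a reduced R value."""
    ref_mean = mean(reference)
    return 100.0 * (mean(group) - ref_mean) / ref_mean


# Invented seed coat color R values for two genotype groups at one SNP.
r_values_tt = [60.0, 70.0, 80.0]    # lines homozygous for T
r_values_cc = [80.0, 90.0, 100.0]   # lines homozygous for C

change = percent_difference(r_values_tt, r_values_cc)
print(f"TT vs CC: {change:.1f}%")
```

A significance test such as the ANOVA named in the caption would be applied on top of these group means; it is left out here to keep the sketch dependency-free.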


Figure 4. Candidate genes near the SNPs associated with seed coat color and pairwise LD estimates in A09 and C08 haplotype blocks. The green arrows denote the significantly associated SNPs located in the haplotype block, and the red arrows denote the candidate genes located in the chromosome.
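Pairwise LD between two biallelic SNPs is commonly summarized as r2, the quantity also used for the LD decay criterion (r2 > 0.1) in the text. The sketch below computes it from 0/1-coded alleles. It assumes phased haplotypes or fully homozygous inbred lines, so that one allele per line is enough; it is not Haploview's implementation, and the example data are invented.

```python
def ld_r2(alleles_a, alleles_b):
    """r^2 between two biallelic loci from 0/1-coded alleles per line.

    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), with D = pAB - pA * pB.
    """
    if len(alleles_a) != len(alleles_b) or not alleles_a:
        raise ValueError("allele vectors must be non-empty and equal length")
    n = len(alleles_a)
    p_a = sum(alleles_a) / n
    p_b = sum(alleles_b) / n
    p_ab = sum(1 for a, b in zip(alleles_a, alleles_b) if a == 1 and b == 1) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined for a monomorphic locus")
    d = p_ab - p_a * p_b
    return d * d / denominator


# Invented data: four lines, alleles coded 0/1 at each SNP.
print(ld_r2([1, 1, 0, 0], [1, 1, 0, 0]))  # 1.0, complete LD
print(ld_r2([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.0, no association
```

For heterozygous genotype data, haplotype frequencies would first have to be estimated (for example, by an EM algorithm), which is outside the scope of this sketch.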

Figure 5. Synteny analysis map of the significant association interval on A09 and C08 in B. napus. Black triangles denote the SNP site.

markers associated with seed coat color were mapped via alignment to the reference genome sequence of B. napus (http://www.genoscope.cns.fr/brassicanapus/). We compared our results to these QTLs from the publicly available literature for seed coat color, and the results are shown in Figure 6. The significantly associated SNPs and QTLs with overlapping physical positions among different populations were observed on A09 and C08. On chromosome A09, five peak SNPs and four QTLs, from the YE, EV, GP, and S3 populations, had an overlapping physical position of 27.12−32.65 Mb. On chromosome C08, one QTL was projected from the HZ population and co-localized with six peak SNPs (Bn-scaff_20947_1-p93326, Bn-scaff_21269_1-p158333, Bn-scaff_21269_1-p122418, Bn-scaff_20947_1-p93377, Bn-scaff_21269_1-p204828, and Bn-scaff_21269_1-p246861). In addition, there were no significant SNPs in other overlapping QTL intervals, such as A01, A03, A05, and A08.

DISCUSSION

Marker-assisted selection (MAS) is the primary component of molecular breeding. With the results of a GWAS, selection can act directly on functional loci that have specific effects on trait variation. This is also a direction for the further development of molecular breeding. Yellow seed coat color is an important agronomic trait that affects the quality of B. napus. Wittkop et al. found that seed color and oil content were significantly positively correlated, whereas seed color and protein content were significantly negatively correlated.38 In our previous work, we found the co-location of seed coat color and oil content QTL on N8 in the RIL population;12 this may reflect linkage between the seed oil content and seed color loci or a contribution of seed color itself. In breeding programs, the seed color has been used as a visible phenotype marker for high oil content selection. In recent decades, many researchers have attempted to find the yellow-seeded gene and conduct parallel genetic research to solve the problem of seed coat color. However, currently, there are few yellow-seeded genes applied to large-scale production. On the one hand, there might be a linkage drag from unfavorable genes. On the other hand, the seed coat color trait is influenced by factors such as maternal effects and the environment. In this study, eight SNPs significantly associated with seed coat color


Figure 6. Comparison of the GWAS results to those of previous QTLs. The SNPs associated with seed coat color are shown on the outside circle. The inside circles represent loci from association mapping (AM) and QTLs projected from the GY, GZ, HZ, YE, EV, GP, S3, and LR populations. The gray traits indicated that the SNP loci and QTLs detected in different populations were co-localized.
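Co-localization in this comparison amounts to testing whether the physical position of a SNP lies inside a projected QTL interval on the same chromosome. A minimal sketch of that check follows. The A09 interval uses the 27.12−32.65 Mb overlap reported in the text; the queried SNP positions, the function name, and the data layout are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start_mb, end_mb) on the reference genome


def colocalized_qtls(chrom: str, pos_mb: float,
                     qtls: Dict[str, List[Interval]]) -> List[Interval]:
    """Return every QTL interval on `chrom` that contains `pos_mb`."""
    return [(start, end) for (start, end) in qtls.get(chrom, [])
            if start <= pos_mb <= end]


# One interval taken from the text; a real table would hold one list per
# chromosome with all projected QTLs.
qtl_intervals = {"A09": [(27.12, 32.65)]}

hits = colocalized_qtls("A09", 32.10, qtl_intervals)    # hypothetical position
misses = colocalized_qtls("A09", 35.00, qtl_intervals)  # hypothetical position
print(hits, misses)
```

Returning the matching intervals rather than a single boolean keeps track of which population each overlap came from once the intervals are labeled.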

were repeatedly detected on chromosomes A06, A09, and C08, and the phenotypic variation explained by individual loci ranged from 6.53 to 9.99% for these SNPs. Further analysis showed that, for each of the eight significant SNPs, the phenotype differed significantly between allele groups according to a t test, and two SNPs (Bn-A09-p34469671 and Bn-scaff_21269_1-p158333) can be identified and digested by restriction endonucleases Apo I and Sac I, respectively. We also found 27 lines (including ‘ZY821’, a typical black-seeded cultivar) containing all lower value alleles of eight significant SNPs, and their average seed coat color R value was only 67.57. No lines contained all higher value alleles of the eight significant SNPs. These SNPs can provide useful information for molecular breeding for the selection of yellow-seeded lines in B. napus.

Research on Arabidopsis showed that the core component of seed coat pigments is proanthocyanidins (PAs), which accumulate in the endothelium.39 The seed coat color of B. napus is also determined by the content of the phenolic compounds cyanidin and procyanidins.40−42 In previous studies, TT genes, phenylpropanoid biosynthetic genes, and transcription factor genes were also considered to be important candidates to explain seed coat color differences in Brassica species.1,14,40 In our study, the yellow-seeded phenotype-related candidate genes were found in the vicinity of significant SNPs. Among these, six candidate genes were annotated as being associated with TT genes and seven candidate genes were involved in procyanidin and anthocyanin biosynthesis or metabolic processes. TT10, which has four SNPs, encodes an enzyme that oxidizes procyanidins to yield PAs in the seed coat. A previous study showed that there was a significant difference in the expression of BnTT10 between ZY821 and GH06, a typical yellow-seeded cultivar, at 28 and 42 days after planting (DAP).14 TT12, as a transporter, was involved in the transport of intermediate products from the cytoplasm to the vacuole in the phenylpropanoid and flavonoid pathways.43 Moreover, it is reported that TTG1 is necessary for correct expression of BAN in seed endothelium.44 Qu et al. found that ZY821 and GH06 contained two copies of BnTT12, and BnTT12 was expressed more strongly in ZY821 than in GH06, the same as TTG1 and ANR.14 These genes may play important roles in the formation of yellow seed coat color. Further fine mapping and cloning of these genes can not only promote an understanding of the formation mechanism of the yellow seed coat color but also provide yellow-seeded genetic resources for the production of rapeseed.

Although there are many advantages to GWASs, there are still some unavoidable weaknesses. For example, the smaller effect value and lower frequency of the gene function site may make detection difficult and, to a certain extent, cause the loss of heritability.45,46 However, linkage analysis can compensate for these deficiencies in GWASs. Thus, the combination of linkage analysis and GWASs is an effective way to analyze the genetic basis of complex quantitative traits.47 GWASs and linkage analysis can be combined in two ways. First, on the basis of the results of GWASs, a near-isogenic line of candidate genes can be constructed to determine whether the phenotype and genotype co-segregate. Another method is to analyze the co-location of candidate functional sites and

QTL; if the candidate functional site of the association study is co-located with the QTL of the same trait in a segregating population, then it will likely affect the trait. Many researchers have studied the various agronomic traits of maize using this method, and the method has a high detection efficiency and accuracy.48−50 In the present study, we found three co-location regions between the association studies and linkage analysis on A09, C06, and C08. Among these, a co-location region on A09 was considered a major QTL site of yellow seed coat color. The transcription factor gene TTG1, which functions in the flavonoid biosynthetic pathway, and some anthocyanin- and proanthocyanidin-specific genes were found in this region. These results showed that GWASs can help identify the closely linked SNPs and improve our chances of isolating key genes for seed coat color using map-based cloning. However, some significant SNPs were located outside previous QTL intervals. This phenomenon indicates the likely existence of a new yellow seed coat color-related site or an environment-specific SNP. Although it is difficult to promote the development of yellow-seeded breeding using an environment-specific SNP by molecular marker-assisted selection in the case of the G × E interaction, genetic engineering using the candidate genes may provide a more detailed understanding of the genetic mechanism of yellow-seeded lines. Such findings would be of great value to the production and development of rapeseed worldwide.

Rapeseed has an amphidiploid genome that originated from interspecific hybridization between B. rapa and Brassica oleracea. Normally, 2−6 homologous copies of a gene are located in different subgenomes (A or C), and different nucleotide sequences exist among various homologous copies.51 In this study, the relevant regions with significant SNP loci on chromosomes A09 and C08 showed high homology. We aligned the candidate gene sequences of BnaA09g48450D and BnaC08g42700D with the sequences in the Arabidopsis thaliana genome database and found the best match in PIF3, with the same result as CRT3 and MYB61. Our results showed that the significant region on C08 was a homologous copy of the region on A09, and these three genes may be closely linked to regulate the formation of yellow-seeded B. napus.

ORCID

Jia Wang: 0000-0002-4066-0201

Author Contributions

†Jia Wang and Xiaohua Xian contributed equally to this work. Liezhao Liu designed the study. Jia Wang and Xiaohua Xian conducted the study. Jia Wang, Xiaohua Xian, Cunmin Qu, Liezhao Liu, and Kun Lu analyzed the data. Xinfu Xu, Liezhao Liu, and Jiana Li provided resources. Jia Wang and Xiaohua Xian wrote the manuscript. All authors read and approved the final manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (31371655), the Fundamental Research Funds for the Central Universities (XDJK2017A009), the Chongqing Science and Technology Commission (cstc2016shmszx80083), the National Key Research and Development Program of China (2016YFD0100202), and the 973 Program (2015CB150201).

Notes

The authors declare no competing financial interest.



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jafc.7b01226.

Frequency distribution on the phenotype for 2014 and 2015 (Figure S1), correlation analysis for the seed coat color R value and seed ADL content from the B. napus population in 2014 and 2015 (Table S2), BLAST search results for the preselected candidate genes for the SNP−trait associations with p < 1.57 × 10−6 within a distance of 0.5 Mb around the SNPs (Table S3), and significantly associated SNPs for seed coat color (Table S4) (PDF)

Detailed information on QTLs and loci collected from the nine published populations (Table S1) (XLSX)



REFERENCES

(1) Fu, F. Y.; Liu, L. Z.; Chai, Y. R.; Chen, L.; Yang, T.; Jin, M. Y.; Ma, A. F.; Yan, X. Y.; Zhang, Z. S.; Li, J. N. Localization of QTLs for seed color using recombinant inbred lines of Brassica napus in different environments. Genome 2007, 50, 840−854. (2) Vu, T. T.; Jeong, C. Y.; Nguyen, H. N.; Lee, D.; Lee, S. A.; Kim, J. H.; Hong, S. W.; Lee, H. Characterization of Brassica napus Flavonol Synthase Involved in Flavonol Biosynthesis in Brassica napus L. J. Agric. Food Chem. 2015, 63, 7819−7829. (3) Wittkop, B.; Snowdon, R. J.; Friedt, W. Status and perspectives of breeding for enhanced yield and quality of oilseed crops for Europe. Euphytica 2009, 170, 131. (4) Liu, L.; Stein, A.; Wittkop, B.; Sarvari, P.; Li, J.; Yan, X.; Dreyer, F.; Frauen, M.; Friedt, W.; Snowdon, R. J. A knockout mutation in the lignin biosynthesis gene CCR1 explains a major QTL for acid detergent lignin content in Brassica napus seeds. Theor. Appl. Genet. 2012, 124, 1573−1586. (5) Badani, A. G.; Snowdon, R. J.; Wittkop, B.; Lipsa, F. D.; Baetzel, R.; Horn, R.; De Haro, A.; Font, R.; Luhs, W.; Friedt, W. Colocalization of a partially dominant gene for yellow seed colour with a major QTL influencing acid detergent fibre (ADF) content in different crosses of oilseed rape (Brassica napus). Genome 2006, 49, 1499−1509. (6) Zhang, Y.; Li, X.; Chen, W.; Yi, B.; Wen, J.; Shen, J.; Ma, C.; Chen, B.; Tu, J.; Fu, T. Identification of two major QTL for yellow seed color in two crosses of resynthesized Brassica napus line No. 2127−17. Mol. Breed. 2011, 28, 335−342. (7) Rahman, M.; McVetty, P. A review of Brassica seed color. Can. J. Plant Sci. 2011, 91, 437−446. (8) Xiao, L.; Zhao, Z.; Du, D.; Yao, Y.; Xu, L.; Tang, G. Genetic characterization and fine mapping of a yellow-seeded gene in Dahuang (a Brassica rapa landrace). Theor. Appl. Genet. 2012, 124, 903−909. (9) Kebede, B.; Cheema, K.; Greenshields, D. L.; Li, C.; Selvaraj, G.; Rahman, H. 
Construction of genetic linkage map and mapping of QTL for seed color in Brassica rapa. Genome 2012, 55, 813−823. (10) Liu, L.; Qu, C.; Wittkop, B.; Yi, B.; Xiao, Y.; He, Y.; Snowdon, R. J.; Li, J. A high-density SNP map for accurate mapping of seed fibre QTL in Brassica napus L. PLoS One 2013, 8, e83052. (11) Stein, A.; Wittkop, B.; Liu, L.; Obermeier, C.; Friedt, W.; Snowdon, R. J. Dissection of a major QTL for seed colour and fibre content in Brassica napus reveals colocalization with candidate genes for phenylpropanoid biosynthesis and flavonoid deposition. Plant Breed. 2013, 132, 382−389.

AUTHOR INFORMATION

Corresponding Author

*Telephone: 86-23-13883628614. Fax: 86-23-68250701. Email: [email protected]. G



(12) Yan, X. Y.; Li, J. N.; Fu, F. Y.; Jin, M. Y.; Chen, L.; Liu, L. Z. Colocation of seed oil content, seed hull content and seed coat color QTL in three different environments in Brassica napus L. Euphytica 2009, 170, 355−364.
(13) Raman, H.; Raman, R.; Kilian, A.; Detering, F.; Carling, J.; Coombes, N.; Diffey, S.; Kadkol, G.; Edwards, D.; McCully, M.; Ruperao, P.; Parkin, I. A.; Batley, J.; Luckett, D. J.; Wratten, N. Genome-wide delineation of natural variation for pod shatter resistance in Brassica napus. PLoS One 2014, 9, e101673.
(14) Qu, C.; Fu, F.; Lu, K.; Zhang, K.; Wang, R.; Xu, X.; Wang, M.; Lu, J.; Wan, H.; Zhanglin, T.; Li, J. Differential accumulation of phenolic compounds and expression of related genes in black- and yellow-seeded Brassica napus. J. Exp. Bot. 2013, 64, 2885−2898.
(15) Flint-Garcia, S. A.; Thornsberry, J. M.; Buckler, E. S. Structure of linkage disequilibrium in plants. Annu. Rev. Plant Biol. 2003, 54, 357−374.
(16) Zou, J.; Jiang, C.; Cao, Z.; Li, R.; Long, Y.; Chen, S.; Meng, J. Association mapping of seed oil content in Brassica napus and comparison with quantitative trait loci identified from linkage mapping. Genome 2010, 53, 908−916.
(17) Li, F.; Chen, B.; Xu, K.; Wu, J.; Song, W.; Bancroft, I.; Harper, A. L.; Trick, M.; Liu, S.; Gao, G.; Wang, N.; Yan, G.; Qiao, J.; Li, J.; Li, H.; Xiao, X.; Zhang, T.; Wu, X. Genome-wide association study dissects the genetic architecture of seed weight and seed quality in rapeseed (Brassica napus L.). DNA Res. 2014, 21, 355−367.
(18) Rezaeizad, A.; Wittkop, B.; Snowdon, R.; Hasan, M.; Mohammadi, V.; Zali, A.; Friedt, W. Identification of QTLs for phenolic compounds in oilseed rape (Brassica napus L.) by association mapping using SSR markers. Euphytica 2011, 177, 335−342.
(19) Fritsche, S.; Wang, X.; Li, J.; Stich, B.; Kopisch-Obuch, F. J.; Endrigkeit, J.; Leckband, G.; Dreyer, F.; Friedt, W.; Meng, J.; Jung, C. A candidate gene-based association study of tocopherol content and composition in rapeseed (Brassica napus). Front. Plant Sci. 2012, 3, 129.
(20) Chen, B.; Xu, K.; Li, J.; Li, F.; Qiao, J.; Li, H.; Gao, G.; Yan, G.; Wu, X. Evaluation of yield and agronomic traits and their genetic variation in 488 global collections of Brassica napus L. Genet. Resour. Crop Evol. 2014, 61, 979−999.
(21) Xu, L.; Hu, K.; Zhang, Z.; Guan, C.; Chen, S.; Hua, W.; Li, J.; Wen, J.; Yi, B.; Shen, J.; Ma, C.; Tu, J.; Fu, T. Genome-wide association study reveals the genetic architecture of flowering time in rapeseed (Brassica napus L.). DNA Res. 2015, 23, 43−52.
(22) Sun, C.; Wang, B.; Yan, L.; Hu, K.; Liu, S.; Zhou, Y.; Guan, C.; Zhang, Z.; Li, J.; Zhang, J.; Chen, S.; Wen, J.; Ma, C.; Tu, J.; Shen, J.; Fu, T.; Yi, B. Genome-Wide Association Study Provides Insight into the Genetic Control of Plant Height in Rapeseed (Brassica napus L.). Front. Plant Sci. 2016, 7, 1102.
(23) Li, F.; Chen, B.; Xu, K.; Gao, G.; Yan, G.; Qiao, J.; Li, J.; Li, H.; Li, L.; Xiao, X.; Zhang, T.; Nishio, T.; Wu, X. A genome-wide association study of plant height and primary branch number in rapeseed (Brassica napus). Plant Sci. 2016, 242, 169−177.
(24) Liu, J.; Wang, W.; Mei, D.; Wang, H.; Fu, L.; Liu, D.; Li, Y.; Hu, Q. Characterizing Variation of Branch Angle and Genome-Wide Association Mapping in Rapeseed (Brassica napus L.). Front. Plant Sci. 2016, 7, 21.
(25) Jestin, C.; Lodé, M.; Vallée, P.; Domin, C.; Falentin, C.; Horvais, R.; Coedel, S.; Manzanares-Dauleux, M. J.; Delourme, R. Association mapping of quantitative resistance for Leptosphaeria maculans in oilseed rape (Brassica napus L.). Mol. Breed. 2011, 27, 271−287.
(26) Fopa Fomeju, B.; Falentin, C.; Lassalle, G.; Manzanares-Dauleux, M. J.; Delourme, R. Homoeologous duplicated regions are involved in quantitative resistance of Brassica napus to stem canker. BMC Genomics 2014, 15, 498.
(27) Wei, L.; Jian, H.; Lu, K.; Filardo, F.; Yin, N.; Liu, L.; Qu, C.; Li, W.; Du, H.; Li, J. Genome-wide association analysis and differential expression analysis of resistance to Sclerotinia stem rot in Brassica napus. Plant Biotechnol. J. 2016, 14, 1368−1380.
(28) Wang, J.; Jian, H.; Wei, L.; Qu, C.; Xu, X.; Lu, K.; Qian, W.; Li, J.; Li, M.; Liu, L. Genome-Wide Analysis of Seed Acid Detergent Lignin (ADL) and Hull Content in Rapeseed (Brassica napus L.). PLoS One 2015, 10, e0145045.
(29) Li, Y.; Liu, X.; Li, J.; Ying, J.; Xu, X. Construction of near-infrared reflectance spectroscopy model for seed color of rapeseed. Chin. J. Oil Crop Sci. 2012, 34, 533−536.
(30) Zhang, Z.; Ersoz, E.; Lai, C. Q.; Todhunter, R. J.; Tiwari, H. K.; Gore, M. A.; Bradbury, P. J.; Yu, J.; Arnett, D. K.; Ordovas, J. M.; Buckler, E. S. Mixed linear model approach adapted for genome-wide association studies. Nat. Genet. 2010, 42, 355−360.
(31) Yang, J.; Manolio, T. A.; Pasquale, L. R.; Boerwinkle, E.; Caporaso, N.; Cunningham, J. M.; de Andrade, M.; Feenstra, B.; Feingold, E.; Hayes, M. G.; Hill, W. G.; Landi, M. T.; Alonso, A.; Lettre, G.; Lin, P.; Ling, H.; Lowe, W.; Mathias, R. A.; Melbye, M.; Pugh, E.; Cornelis, M. C.; Weir, B. S.; Goddard, M. E.; Visscher, P. M. Genome partitioning of genetic variation for complex traits using common SNPs. Nat. Genet. 2011, 43, 519−525.
(32) Turner, S. D. qqman: An R package for visualizing GWAS results using Q−Q and Manhattan plots. bioRxiv 2014, 1−2.
(33) Barrett, J. C.; Fry, B.; Maller, J.; Daly, M. J. Haploview: Analysis and visualization of LD and haplotype maps. Bioinformatics 2005, 21, 263−265.
(34) Liu, L. Z.; Meng, J. L.; Lin, N.; Chen, L.; Tang, Z. L.; Zhang, X. K.; Li, J. N. QTL mapping of seed coat color for yellow seeded Brassica napus. Yichuan Xuebao 2006, 33, 181−187.
(35) Bagheri, H.; Pino-Del-Carpio, D.; Hahnart, C.; Bonnema, G.; Keurentjes, J.; Aarts, M. G. M. Identification of seed-related QTL in Brassica rapa. Span. J. Agric. Res. 2013, 11, 1085−1093.
(36) Qu, C.; Hasan, M.; Lu, K.; Liu, L.; Zhang, K.; Fu, F.; Wang, M.; Liu, S.; Bu, H.; Wang, R.; Xu, X.; Chen, L.; Li, J. Identification of QTL for seed coat colour and oil content in Brassica napus by association mapping using SSR markers. Can. J. Plant Sci. 2015, 95, 387−395.
(37) Zhang, K.; Lu, K.; Qu, C.; Liang, Y.; Wang, R.; Chai, Y.; Li, J. Gene silencing of BnTT10 family genes causes retarded pigmentation and lignin reduction in the seed coat of Brassica napus. PLoS One 2013, 8, e61247.
(38) Wittkop, B.; Snowdon, R. J.; Friedt, W. New NIRS calibrations for fiber fractions reveal broad genetic variation in Brassica napus seed quality. J. Agric. Food Chem. 2012, 60, 2248−2256.
(39) Mizzotti, C.; Ezquer, I.; Paolo, D.; Rueda-Romero, P.; Guerra, R. F.; Battaglia, R.; Rogachev, I.; Aharoni, A.; Kater, M. M.; Caporali, E.; Colombo, L. SEEDSTICK is a master regulator of development and metabolism in the Arabidopsis seed coat. PLoS Genet. 2014, 10, e1004856.
(40) Akhov, L.; Ashe, P.; Tan, Y.; Datla, R.; Selvaraj, G. Proanthocyanidin biosynthesis in the seed coat of yellow-seeded, canola quality Brassica napus YN01−429 is constrained at the committed step catalyzed by dihydroflavonol 4-reductase. Botany 2009, 87, 616−625.
(41) Nesi, N.; Lucas, M. O.; Auger, B.; Baron, C.; Lecureuil, A.; Guerche, P.; Kronenberger, J.; Lepiniec, L.; Debeaujon, I.; Renard, M. The promoter of the Arabidopsis thaliana BAN gene is active in proanthocyanidin-accumulating cells of the Brassica napus seed coat. Plant Cell Rep. 2009, 28, 601−617.
(42) Auger, B.; Marnet, N.; Gautier, V.; Maia-Grondard, A.; Leprince, F.; Renard, M.; Guyot, S.; Nesi, N.; Routaboul, J. M. A detailed survey of seed coat flavonoids in developing seeds of Brassica napus L. J. Agric. Food Chem. 2010, 58, 6246−6256.
(43) Zhao, J.; Dixon, R. A. MATE transporters facilitate vacuolar uptake of epicatechin 3′-O-glucoside for proanthocyanidin biosynthesis in Medicago truncatula and Arabidopsis. Plant Cell 2009, 21, 2323−2340.
(44) Li, X.; Westcott, N.; Links, M.; Gruber, M. Y. Seed coat phenolics and the developing silique transcriptome of Brassica carinata. J. Agric. Food Chem. 2010, 58, 10918−10928.
(45) McCarthy, M. I.; Abecasis, G. R.; Cardon, L. R.; Goldstein, D. B.; Little, J.; Ioannidis, J. P.; Hirschhorn, J. N. Genome-wide association studies for complex traits: Consensus, uncertainty and challenges. Nat. Rev. Genet. 2008, 9, 356−369.


(46) Myles, S.; Peiffer, J.; Brown, P. J.; Ersoz, E. S.; Zhang, Z.; Costich, D. E.; Buckler, E. S. Association mapping: Critical considerations shift from genotyping to experimental design. Plant Cell 2009, 21, 2194−2202.
(47) Wu, R.; Ma, C. X.; Casella, G. Joint linkage and linkage disequilibrium mapping of quantitative trait loci in natural populations. Genetics 2002, 160, 779−792.
(48) McMullen, M. D.; Kresovich, S.; Villeda, H. S.; Bradbury, P.; Li, H.; Sun, Q.; Flint-Garcia, S.; Thornsberry, J.; Acharya, C.; Bottoms, C.; Brown, P.; Browne, C.; Eller, M.; Guill, K.; Harjes, C.; Kroon, D.; Lepak, N.; Mitchell, S. E.; Peterson, B.; Pressoir, G.; Romero, S.; Oropeza Rosas, M.; Salvo, S.; Yates, H.; Hanson, M.; Jones, E.; Smith, S.; Glaubitz, J. C.; Goodman, M.; Ware, D.; Holland, J. B.; Buckler, E. S. Genetic properties of the maize nested association mapping population. Science 2009, 325, 737−740.
(49) Yu, J.; Holland, J. B.; McMullen, M. D.; Buckler, E. S. Genetic design and statistical power of nested association mapping in maize. Genetics 2008, 178, 539−551.
(50) Tian, F.; Bradbury, P. J.; Brown, P. J.; Hung, H.; Sun, Q.; Flint-Garcia, S.; Rocheford, T. R.; McMullen, M. D.; Holland, J. B.; Buckler, E. S. Genome-wide association study of leaf architecture in the maize nested association mapping population. Nat. Genet. 2011, 43, 159−162.
(51) Zhou, L.; Li, Y.; Hussain, N.; Li, Z.; Wu, D.; Jiang, L. Allelic Variation of BnaC.TT2.a and Its Association with Seed Coat Color and Fatty Acids in Rapeseed (Brassica napus L.). PLoS One 2016, 11, e0146661.
