Metabolomic Assessment of Key Maize Resources: GC-MS and NMR

Article pubs.acs.org/JAFC

Metabolomic Assessment of Key Maize Resources: GC-MS and NMR Profiling of Grain from B73 Hybrids of the Nested Association Mapping (NAM) Founders and of Geographically Diverse Landraces

Tyamagondlu V. Venkatesh,*,† Alexander W. Chassy,§ Oliver Fiehn,§,# Sherry Flint-Garcia,*,Δ,⊥ Qin Zeng,† Kirsten Skogerson,† and George G. Harrigan†

† Monsanto Company, 800 North Lindbergh Boulevard, St. Louis, Missouri 63167, United States
§ Genome Center − Metabolomics, University of California at Davis, Davis, California 95616, United States
# Biochemistry Department, King Abdulaziz University, Jeddah, Saudi Arabia
Δ Agricultural Research Service, U.S. Department of Agriculture, Columbia, Missouri 65211, United States
⊥ Division of Plant Sciences, University of Missouri, Columbia, Missouri 65211, United States

Supporting Information

ABSTRACT: The present study expands metabolomic assessments of maize beyond commercial lines to include two sets of hybrids used extensively in the scientific community. One set included hybrids derived from the nested association mapping (NAM) founder lines, a collection of 25 inbreds selected on the basis of genetic diversity and used to investigate the genetic basis of complex plant traits. A second set included 24 hybrids derived from a collection of landraces representative of native diversity from North and South America that may serve as a source of new alleles for improving modern maize hybrids. Metabolomic analysis of grain harvested from these hybrids utilized gas chromatography−time-of-flight mass spectrometry (GC-TOF-MS) and 1H nuclear magnetic resonance spectroscopy (1H NMR) techniques. Results highlighted extensive metabolomic variation in grain from both hybrid sets, but also demonstrated that, within each hybrid set, subpopulations could be differentiated in a pattern consistent with the known genetic and compositional variation of these lines. Correlation analysis did not indicate a strong association of the metabolomic data with grain nutrient composition, although some metabolites did show moderately strong correlations with agronomic features such as plant and ear height. Overall, this study provides insights into the extensive metabolomic diversity associated with conventional maize germplasm.

KEYWORDS: maize hybrids, nested association mapping (NAM) founders, landraces, metabolomics, GC-TOF-MS, NMR



Received: October 8, 2015
Revised: February 16, 2016
Accepted: February 28, 2016

INTRODUCTION

Maize (Zea mays L.) grain is a major food and feed commodity as well as a source of raw material for industrial processes worldwide.1 The major nutritional components are starch, protein, oil, and fiber, which encompass ∼95% of the biomass of harvested grain.2 The low-abundance small metabolite pool is of less nutritional and economic significance, but does contain some minor nutrients and antinutrients that are the target of breeding programs. For example, increasing vitamin A levels, a target of specific breeding programs,3 could offer health benefits, whereas reduction of phytic acid levels would potentially improve nutritional status.4 Developments in high-throughput metabolite data acquisition platforms have facilitated their application in crop improvement. Evaluation of the extensive natural variation inherent to the metabolite composition of crop plants and the role of metabolite assessments in trait discovery are under investigation.4b,5 Use of metabolite markers as a tool for selecting complex traits in plants has also been discussed.5b

To date, metabolomic studies on maize grain have typically been driven by comparative assessments of genetically modified (GM) and non-GM lines or have focused on the relative influences of germplasm and growing location. These studies have demonstrated that the maize grain metabolome is greatly influenced by environment and germplasm and that the effect of GM is negligible.6

Today, maize germplasm can be defined broadly in terms of landraces and modern inbred lines, both of which are derived from the ancestral progenitor teosinte (Z. mays ssp. parviglumis).7a The extensive metabolite variation recorded in the scientific literature for even a small set of modern commercial hybrids clearly reflects the influence of geographic adaptation and selective breeding to develop corn as a nutritious food source. Landraces are highly heterozygous and heterogeneous open-pollinated populations adapted to specific environments, and many consider that they may serve as a source of potential new alleles.7 Modern inbred lines, on the other hand, are uniform and homozygous (i.e., the genome of each line will not change as additional seed is made by self-pollination); these inbred lines are the basis of the commercial maize hybrids that are generated today.

Because maize domestication and modern breeding efforts have resulted in a reduction in genetic diversity,7 prior studies


DOI: 10.1021/acs.jafc.5b04901 J. Agric. Food Chem. XXXX, XXX, XXX−XXX


Journal of Agricultural and Food Chemistry

Table 1. Germplasm Used in the Study: Hybrids Derived by Crossing B73 as Female with Pollen from 26 NAM Founder Inbred Parents and 24 Landrace Inbreds^a

Rows are listed as broad group: inbreds, under each diversity group.

NAM founder parents
  northern flint: HP301, Il14H, P39
  temperate: B97, Ky21, M162W, Mo17^b, MS71, Oh43, Oh7B
  mixed: M37W, Mo18W, Tx303
  tropical: CML103, CML228, CML247, CML277, CML322, CML333, CML52, CML69, Ki11, Ki3, NC350, NC358, Tzi8

landrace inbreds
  tropical/subtropical: MR01 (Araguito), MR03 (Bolita), MR04 (Canilla), MR05 (Cateto), MR07 (Comiteco), MR08 (Costeno), MR09 (Cravo Riogranense), MR10 (Crystalino Norteno), MR11 (Cuban Flint), MR16 (Pepetilla), MR18 (Reventador), MR21 (Tabloncillo), MR22 (Tuxpeno), MR23 (Zapalote Chico), MR25 (Poropo), MR26 (Pollo)
  northern flint: MR02 (Assiniboine), MR12 (Havasupai), MR14 (Longfellow Flint), MR19 (Santo Domingo)
  intermediate: MR06 (Chapalote), MR15 (Palomero de Jalisco)
  temperate: MR13 (Hickory King), MR20 (Shoe Peg)

a Each of these inbreds and landraces was classified broadly as one of four groups (northern flint, temperate, intermediate/mixed, or tropical/subtropical) on the basis of population structure using marker data,12 phenology,15 breeding history,16 and/or race characterization.17 Brief descriptions of these groups are also provided in Venkatesh et al.13
b The Mo17 hybrid appeared to be cross-contaminated and/or was not phenotypically consistent with known Mo17 characteristics; thus, it was omitted from statistical analysis of acquired data.

have accessed only a limited amount of genetic information associated with metabolomic diversity in maize. Thus, data from these prior studies may not represent a complete resource for the broader scientific community or for breeders seeking new alleles that could support the development of improved maize.

The present study, therefore, expands metabolomic assessments of maize beyond previous studies reported on commercial lines to include diverse germplasm used in the scientific community to investigate the genetic basis of complex plant traits and as a source of new alleles for improving modern maize hybrids by breeding. The two sets of publicly available lines used in the present study represent diversity within the landraces and public inbred lines. The nested association mapping (NAM) founders are a collection of 25 inbreds selected on the basis of genetic diversity from the Goodman panel of 301 maize lines8 and widely used in investigations on maize genetics and phenotypic traits.9a,b The 24 landrace accessions assessed in this study represent unimproved germplasm from North and South American geographies and provide a unique representation of corn alleles relative to commercial maize lines. The materials for the current metabolomic analysis included grain harvested from maize hybrids of both the NAM founders and landraces crossed with the inbred line B73. B73 has contributed extensively to the pedigree of many important commercial lines10 and serves as the reference genome for maize.11

The metabolomic data acquisition utilized both gas chromatography−time-of-flight mass spectrometry (GC-TOF-MS) and 1H nuclear magnetic resonance (1H NMR) spectroscopy. GC-MS is one of the most commonly used platforms in metabolomics, whereas NMR offers advantages in platform robustness and capabilities for absolute quantitation. Alternative multivariate options were pursued for each complementary platform with partial least-squares discriminant analysis (PLS-DA) used for the MS data and canonical discriminant analysis (CDA) used for the NMR data.

MATERIALS AND METHODS

Chemicals. All chemical reagents were of analytical grade from Sigma (St. Louis, MO, USA). Metabolites from GC-MS were identified to level 1 of the Metabolomics Standards Initiative (MSI).18a Identification of chemicals described in the NMR section is based on Chenomx assignments, which were confirmed with data collected on pure standards under identical buffer and instrumental parameters.

Biological Materials. The germplasm used in this study (Table 1) was selected to represent a broad diversity within maize landraces and inbred lines. Each of these inbreds and landraces was classified broadly as one of four groups (northern flint, temperate, intermediate/mixed, or tropical/subtropical) based on phenology,15 breeding history,16 and/or race characterization17 (Supporting Tables 1 and 2).

Production of Hybrids. Hybrid seeds of landrace and NAM founder lines crossed to B73 were produced over four seasons in different nurseries: 24 and 2 entries were produced, respectively, at Columbia, MO, USA, and Puerto Rico in 2008; 18 entries were produced in Puerto Rico in 2009; and 6 entries were produced in 2010 at Columbia, MO, USA. All hybrids were produced by controlled pollination.

Field Design. Materials for metabolomic analyses were generated in 2012 at Ithaca, NY, USA. The B73 × NAM parental line hybrids and B73 × landrace hybrids were planted in a randomized complete block design in 3 m × 0.9 m rows in three replications. Standard agronomic practices were employed for the production of grain in this region, and no specific disease or pest problems were observed. Three plants were self-pollinated in each row, and selfed ears were hand harvested and dried to 12−13% moisture. Dried ears were shelled, and grain was stored at room temperature prior to metabolomic analyses. Grain samples were homogenized by grinding on dry ice to a fine powder and stored frozen at approximately −20 °C until analysis.

GC-MS Sample Extraction and Derivatization. Powdered samples (20 mg) were extracted with 1 mL of degassed extraction solvent (5:2:2 methanol/chloroform/water). The mixture was vortexed and shaken for 6 min at 4 °C prior to centrifugation at 14000 rcf for 2 min. The supernatant was removed (200 μL) and dried using a centrifugal vacuum concentrator at room temperature. Derivatization was performed as previously described.18a To each sample was added 10 μL of methoxyamine hydrochloride (40 mg/mL in pyridine) prior to 1.5 h of shaking at 30 °C. A mixture of MSTFA containing fatty acid methyl ester (FAME) markers was aliquoted (91 μL) into the sample and incubated at 37 °C for 30 min.

GC-MS Profiling. Samples were analyzed using a GC-TOF approach.18,19 The study design was entered into the MiniX database.19 A Gerstel MPS2 automatic liner exchange system (ALEX) was used to eliminate cross-contamination from sample matrix occurring between sample runs. Derivatized sample injections of 0.5 μL were made in splitless mode with a purge time of 25 s and temperature program as follows: from 50 to 275 °C at a linear rate of 12 °C/s and held for 3 min. An Agilent 6890 gas chromatograph (Santa Clara, CA, USA) was used with a 30 m long, 0.25 mm i.d., Rtx5Sil-MS column with a 0.25 μm 5% diphenyl film; an additional 10 m integrated guard column was used (Restek, Bellefonte, PA, USA). Chromatography was performed at a constant flow of 1 mL/min, ramping the oven temperature from 50 to 330 °C over 22 min at a linear rate. Mass spectrometry used a Leco Pegasus IV time-of-flight (TOF) mass spectrometer with 280 °C transfer line temperature, electron ionization at −70 V, and an ion source temperature of 250 °C. Mass spectra were acquired from m/z 85 to 500 at 20 spectra/s and 1750 V detector voltage. Result files were exported and further processed by our metabolomics BinBase database.19 All database entries in BinBase were matched against the Fiehn mass spectral library of 1200 authentic metabolite spectra using retention index and mass spectrum information or the NIST11 commercial library. Identified metabolites were reported if present in at least 50% of the samples per study design group (as defined in the MiniX database). Data from samples were exported to the netCDF format for further data evaluation with BinBase. Briefly, output results were exported to the BinBase database and filtered by multiple parameters to exclude noisy or inconsistent peaks. Quantitation was reported as peak height using the unique ion as default. Prior to data analyses, based on the underlying assumption that the total amounts of ionized metabolites that reach the detector are similar for different samples, metabolite peak heights were normalized by the sum of all peak heights for annotated metabolites for each sample.

NMR Profiling. Grain samples (50 mg ground and lyophilized) were spiked with 50 mM maleic acid (50 μL) as an internal standard and then extracted with 1.8 mL of MeOH. After 15 min of sonication, the sample was centrifuged and the supernatant collected. The pellet was re-extracted twice as above. Pooled supernatants were dried and resuspended in qNMR buffer (100 mM sodium phosphate, pD 7.4 + DSS chemical shift standard) and then filtered due to the presence of insoluble material.
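As an illustration of the GC-MS peak-height normalization described under GC-MS Profiling, the following minimal Python sketch divides each annotated metabolite's peak height by the sum of annotated peak heights in the same sample. The function name, the dictionary layout, and the peak heights are assumptions made for illustration only; the BinBase processing itself is not reproduced here.

```python
def sum_normalize(sample_peaks):
    """Scale one sample's annotated peak heights so that they sum to 1.

    sample_peaks: dict mapping metabolite name -> raw peak height
    (hypothetical layout; not the BinBase export format).
    """
    total = sum(sample_peaks.values())
    if total <= 0:
        raise ValueError("sample has no positive peak heights")
    return {name: height / total for name, height in sample_peaks.items()}


# Made-up peak heights: sample_a has twice the overall signal of sample_b
# but the same relative metabolite profile.
sample_a = {"alanine": 2000.0, "malic acid": 6000.0, "raffinose": 2000.0}
sample_b = {"alanine": 1000.0, "malic acid": 3000.0, "raffinose": 1000.0}

norm_a = sum_normalize(sample_a)
norm_b = sum_normalize(sample_b)
```

After normalization the two invented samples have identical profiles, which is the intended effect when the total ionized signal is assumed to be similar across samples.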
Proton NMR spectra with water suppression were generated on a 600 MHz Bruker AVANCE III spectrometer equipped with a 60-sample autochanger and a Bruker 5 mm TCI CryoProbe. Raw data were processed (Fourier transform, apodization, phasing, and chemical shift calibration) with Bruker TopSpin 3.1 and then uploaded into Chenomx for standardized area normalization and spectral binning. Analysis in this study used spectra binned at 0.04 and 0.005 ppm. Some spectral regions were deleted as follows: δ 3.332−3.352 (residual MeOH), δ 2.195−2.23 (residual acetone), δ −0.02 to 0.02, δ 0.59−0.65, δ 1.725−1.79, δ 2.89−2.925 (DSS), δ 4.68−4.88 (H2O), δ 1.139−1.185, δ 3.99−4.02 (residual isopropanol), δ 5.82−6.16 (maleic acid internal standard), δ 2.70−2.74 (putative dimethylamine). Four samples analyzed by GC-MS were removed from the NMR analysis due to excessive levels of residual isopropanol from the NMR tube washing procedure (one replicate of Poropo landrace and one replicate each of NAM Oh7B, CML333, and P39). Chenomx was used for quantitation of metabolites.

Statistical Analysis of GC-MS Data. Prior to statistical analyses, metabolite peaks were removed on the basis of two criteria: a low log average peak area or high variability (150% threshold). Together, the two criteria removed 171 metabolite peaks (25% of the total), retaining 504 quality peaks that could be included in statistical analyses (Supporting Figure 1; see also Associated Content Files 1−4 for all data). In this study, statistical analysis focused on identified metabolites. Statistical analyses were performed using the open-source ImDev software package.20 Principal component analysis (PCA), hierarchical cluster analysis (HCA), and partial least-squares discriminant analysis (PLS-DA) on autoscaled data were used to differentiate NAM and landrace lines individually or based upon their broad classification or breeding group (see Supporting Tables 1 and 2 for groupings).
Partial least-squares (PLS) is a versatile algorithm that can be used to predict either continuous or discrete/categorical variables.21 Classification with PLS is termed PLS-DA, where the DA stands for discriminant analysis. The PLS-DA algorithm has many favorable properties for dealing with multivariate data such as omics data, including metabolomics data, where the analysis relies on the class (subgroup) membership of each observation (metabolite level).22
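To make the PLS-DA idea concrete, the sketch below autoscales a data matrix, dummy-codes the class labels, and extracts sample scores with a simple SVD-based PLS2 procedure. It is a minimal illustration under stated assumptions, not the ImDev implementation used in the study: the function names, the group labels chosen for the toy data, and the synthetic values are all hypothetical.

```python
import numpy as np


def autoscale(X):
    """Mean-center each column and scale it to unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def plsda_scores(X, labels, n_components=2):
    """Return PLS-DA sample scores (n_samples x n_components).

    Classes are dummy-coded (one column per class). Each component uses the
    leading left singular vector of X'Y as the weight vector, after which
    X and Y are deflated.
    """
    classes = sorted(set(labels))
    Y = np.array([[1.0 if lab == c else 0.0 for c in classes] for lab in labels])
    Xr = autoscale(np.asarray(X, dtype=float))
    Yr = Y - Y.mean(axis=0)
    scores = []
    for _ in range(n_components):
        u = np.linalg.svd(Xr.T @ Yr, full_matrices=False)[0]
        w = u[:, 0]                  # X weight vector (unit length)
        t = Xr @ w                   # sample scores for this component
        p = Xr.T @ t / (t @ t)       # X loadings
        q = Yr.T @ t / (t @ t)       # Y loadings
        Xr = Xr - np.outer(t, p)     # deflate X
        Yr = Yr - np.outer(t, q)     # deflate Y
        scores.append(t)
    return np.column_stack(scores)


# Synthetic data: 3 groups x 10 samples, 6 "metabolites". The first two
# metabolites are shifted up in one group and down in another.
rng = np.random.default_rng(0)
labels = ["flint"] * 10 + ["temperate"] * 10 + ["tropical"] * 10
X = rng.normal(size=(30, 6))
X[:10, :2] += 3.0
X[20:, :2] -= 3.0

T = plsda_scores(X, labels)
flint_mean = T[:10, 0].mean()
tropical_mean = T[20:, 0].mean()
```

In this toy setting the first score separates the two shifted groups on opposite sides of zero, loosely analogous to plotting groups along the first axis of a PLS-DA score plot. The sign of a score axis is arbitrary, so only the relative positions of the groups are meaningful.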

The Mo17 hybrid appeared to be cross-contaminated and/or was not phenotypically consistent with known Mo17 characteristics. GC-MS data were generated for this sample but not utilized in the PLS-DA of known samples or the correlation analyses.

Statistical Analysis of 1H NMR Data. Means and ranges of identified metabolites were determined (Tables 3 and 4; Associated Content File 8). Canonical discriminant analysis (CDA) was used to determine if patterns existed in the NMR data that could separate the hybrid groups, using the multivariate linear discriminant analysis available in JMP software V10 (SAS Institute). The classification variable was hybrid groupings for NAM and landrace inbreds, and the quantitative variables were metabolites. Pairwise correlation values between each of the metabolite mean values and canonical scores 1 and 2 were derived using the multivariate procedure in the JMP software (Supporting Table 3). Box plots for subgroups of NAM and landrace hybrid sets were generated using the “ggplot2” plotting package available in the R statistical package. Statistical differences between subgroup mean values of NAM and landrace hybrid sets were determined by one-way analysis of variance (ANOVA) using the “aov” function available in the R statistical package “agricolae”.

Correlation Analysis. Pearson correlations were calculated between annotated/identified GC-MS and NMR metabolites and previously reported composition and agronomic phenotypic data from the same samples13 using the “cor” function in the R statistical package (Associated Content Files 6 (GC-MS) and 9 (NMR)). Only significant correlations are discussed herein.

Genetic Fingerprint Analysis of Maize Inbreds (NAM Founders and Landraces). Single nucleotide polymorphism (SNP) genotype (Illumina maize SNP 50) data for the NAM founders and a subset of landrace inbred lines14a were downloaded from the Panzea Web site (http://www.panzea.org/). The NAM founder Tzi8 and the landraces Canilla, Palomero de Jalisco, and Pepetilla had not been genotyped previously14b and were therefore not included in the genetic analysis in this study. TASSEL software23 version 5.2.3 was used to analyze SNP genotype data and construct neighbor-joining (NJ) trees using default parameters. The NJ trees were imported in “nwk” format into the MEGA6 software24 for visualization and annotation. SNP data from one of the teosinte lines14b was used as an out-group to root the tree.
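The Pearson correlation step described under Correlation Analysis was carried out with R's “cor” function. As a rough Python equivalent, the sketch below computes the same coefficient from first principles and keeps metabolites whose absolute correlation with a trait exceeds 0.5. The function name, the 0.5 cutoff as a code parameter, and all numeric values are assumptions for illustration; none of the numbers are data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # undefined if either input is constant


# Invented values for five hypothetical hybrids.
plant_height = [180.0, 195.0, 210.0, 225.0, 240.0]
metabolites = {
    "malic acid": [1.0, 1.5, 2.0, 2.5, 3.0],    # rises with height
    "trigonelline": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls with height
    "lactate": [2.0, 1.0, 2.0, 1.0, 2.0],       # no linear trend
}

r_values = {name: pearson_r(vals, plant_height)
            for name, vals in metabolites.items()}
strong = sorted(name for name, r in r_values.items() if abs(r) > 0.5)
```

With these made-up inputs, `strong` contains the two metabolites with a linear trend against the trait, one positive and one negative, while the third is excluded.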



RESULTS AND DISCUSSION

Descriptive Overview of GC-MS and 1H NMR Data. In the GC-MS analysis profile, a total of 675 metabolites were collected in both NAM and landrace hybrid sets. All collected metabolites were common to both NAM and landrace hybrid sets. No metabolite was exclusively present in or absent from either hybrid data set, and none was associated exclusively with a specific geography or genetic grouping. This highlights the fact that the extensive geographical adaptation and varietal development associated with maize is not accompanied by major qualitative changes in the metabolome, but rather that variation is associated with changes in the levels of shared or common metabolites. As described under Materials and Methods, peaks in the GC-MS profile were removed on the basis of two criteria: a low log average peak area or high variability (150% threshold) (Supporting Figure 1). This resulted in a total of 504 peaks in both hybrid sets that included 174 annotated metabolites identified to level 1 of the Metabolomics Standards Initiative (MSI).18a Notably, sucrose was removed due to high variability. The means and ranges for these metabolites are presented in Associated Content File 5, according to four broad groups (flint, temperate, intermediate/mixed, and tropical/subtropical) for both NAM founder and landrace hybrids. The identified metabolites in both the NAM and landrace hybrid sets included


Figure 1. PLS of GC-MS data (known metabolites, identified to level 1 of the Metabolomics Standards Initiative18b) based on broad classification of hybrids derived from the NAM founders.

34 amino acids, 23 free fatty acids and related metabolites, 23 saccharides, 18 sugar alcohols, 10 sugar acids, 27 organic acids, 16 compounds classified as “nitrogenous”, and 11 compounds classified broadly as “sterols, tocopherols, and others”. Twelve metabolites annotated as carbohydrate or saccharide or miscellaneous were also recorded (Associated Content File 5). A total of 80 compounds overlapped with those listed in Rao et al., who utilized both GC- and LC-MS technologies in an evaluation of 13 inbred lines and 1 hybrid popularly grown in China.25 A total of 53 compounds overlapped with those listed in Röhlig et al., who applied GC-MS in an assessment of four different maize hybrids grown in France and Germany.6i Associated Content File 5 includes the list of overlapping metabolites as well as the compound classification used by Rao et al. for their metabolites.25

Although less sensitive than MS-based approaches, 1H NMR profiling offers advantages in terms of platform robustness and the ability to quantitate levels of identified metabolites. In the current study a total of 33 metabolites with known biochemical identity were measured. These included 17 amino acids (4-aminobutyrate (GABA), alanine, arginine, asparagine, aspartate, glutamate, glutamine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, threonine, tyrosine, and valine), seven organic acids (acetate, formate, fumarate, lactate, malate, maleate (internal standard), and succinate), five sugars (fructose, galactose, glucose, maltose, and sucrose), and choline, dimethylamine, methylamine, and trigonelline. Data for all samples are presented in Associated Content File 8. 1H NMR profiles contained all amino acids that were detected in the GC-MS platform; alanine, asparagine, aspartate, glutamate, and proline were the most abundant amino acids observed in both platforms. The most abundant metabolite classes observed in the 1H NMR profiles, however, were the sugars (sucrose, maltose, glucose, fructose, and galactose). Notably, sucrose was observed as the most abundant metabolite, overall, in the 1H NMR profiles, whereas measurement of sucrose was highly variable by GC-MS. Of the other metabolites measured by 1H NMR, acetate, choline, dimethylamine, and methylamine were not detected in the GC-MS platform.

Previous 1H NMR analyses of maize grain have been relatively limited in scope and focused on comparative assessments of GM and non-GM maize6c,g,h and have therefore not addressed the utility of 1H NMR when applied to the greater sources of metabolite variation such as growing location and the large diversity in genetic backgrounds associated with conventional germplasm. Manetti et al. identified a total of 25 metabolites,6g 22 of which were detected in the platform utilized here, with the exceptions being melibiose, pyruvate, and ferulic acid. Piccioni et al. identified a total of 39 metabolites, of which 28 overlapped with our data (see Associated Content File 8).6h

Although encompassing broad coverage, some limitations of nontargeted profiling approaches such as metabolomics should be considered. For example, of the two key antinutrients in maize (raffinose and phytic acid) as listed by the Organisation for Economic Co-operation and Development (OECD) in its consensus document on maize composition,26 phytic acid was not observed in the GC-MS platform utilized here or by others,6i,25 although it is measured in the LC-MS platform of Rao et al.25 Phytic acid was established to be present in the samples used in this study in an earlier targeted compositional assessment.13 Clearly, programs focused on assessing this metabolite would require validation of any profiling approach if this were preferred over a targeted analysis. Of selected nutrients listed in the OECD consensus document, vitamin A (as β-carotene) is known to be present in these corn samples,13 but this metabolite was not captured in the current metabolomic platforms. Vitamin E (as α- and γ-tocopherols) was observed using the current metabolomic platforms, whereas the water-soluble B vitamins were not.

In summary, the metabolomic platforms utilized here allowed broad biochemical coverage consistent with those utilized in previous metabolomic studies, but, as with any platform, may not capture all metabolites of potential interest.


Figure 2. Genetic distance of NAM founder and landrace lines: neighbor-joining tree of NAM founders (left) and selected landrace lines using IlluminaSNP50 genotype data: blue squares, tropical; yellow triangles, northern flint; green inverted triangles, mixed; turquoise diamonds, temperate; pink solid circles, intermediate; red open circles, teosinte line (used as out group for rooting the tree).

Figure 3. PLS of GC-MS data (known metabolites, identified to level 1 of the Metabolomics Standards Initiative18b) based on broad classification of hybrids derived from the landraces.

These observations point to limitations of metabolomics in safety assessments. Overall, this technology is not consistent with principles that safety assessments should address targeted end points of direct relevance to safety and nutrition, provide strong testing of relevant hypotheses,27 and follow standardized methodologies that would be highly reproducible across the safety assessment community.

Multivariate Analysis of GC-MS Data. On the basis of geography and/or breeding history, the NAM parents and landraces used in the present study were classified into distinct groups8,14a (Supporting Tables 1 and 2). To understand the metabolic diversity of maize in the context of maize breeding and domestication, we performed multivariate analysis of the metabolic data generated in this study to differentiate NAM and landrace hybrids individually or into distinct breeding groups. Multivariate analyses focused on the GC-MS data set using PCA, HCA, and PLS-DA techniques. Associated Content File 7 shows representative results from these approaches. Discussion here focuses on results using a broad classification for the NAM founders and landraces using PLS-DA. For each set (NAM founder or landrace), PLS-DA of the 174 identified metabolites was conducted and based on individual hybrids or on group classifications (see Supporting Tables 1 and 2 for group classifications).

For the NAM founder hybrids, PLS-DA based on broad group classification (Figure 1) showed a clear separation of the northern flint samples, especially the sweet corn lines, from the tropical/subtropical set on the x-axis with the temperate/mixed set clustered in the middle. The PLS scores showed that a range of diverse metabolite classes contributed to this separation. Key metabolites that contributed to the observed metabolic pattern of a given subgroup or population are listed in Supporting Figure 2 and Associated Content File 7. It was shown earlier13 that NAM hybrids studied here were also compositionally distinct from each other. Metabolites positively associated with the flint group were mainly dominated by sugars, especially maltose and isomaltose, consistent with the large effects of sugary1 on starch/sugar composition. Abnormal starch content of sweet corn lines due to the sugary1 allele is most probably a contributing factor to the metabolic differences reflected in the multivariate analysis results. The other line with flint ancestry (HP301) is a popcorn line and coclustered with the other nonsweet lines.

Encouragingly, the metabolic differentiation of the NAM and landrace hybrid groups was broadly consistent with the genetic differentiation of their inbred lines based on 50K SNP markers (Figure 2). As with the GC-MS data, the SNP data indicate that northern flint lines are clearly distinct and that separation of the temperate, mixed, and tropical/subtropical lines can be observed. Our findings from the metabolomic study reported here are similar to results from cluster analysis of genetic distances of over 2500 maize inbred lines15 that included the NAM founders and a number of the landrace samples used in the current study. In that genetic analysis, multivariate statistics on genetic marker data (>680,000 SNPs generated by genotyping-by-sequencing; GBS) could describe the genetic variation among breeding lines in accordance with their known ancestral history. The genetic analysis by Romay et al. highlighted differences between subpopulations of the present metabolomic study, many of which could be attributed to geography and/or discrete university/country breeding programs.15

A corresponding analysis of the landrace hybrids also highlighted that a range of metabolites contributed to differences among subgroups and that there was broad alignment between the genetic and metabolic differentiation of the lines derived from northern flint ancestors (Figures 2 and 3; Supporting Figure 3; Associated Content File 7). Interestingly, 16 of the landraces included here were also evaluated in the genotyping analysis conducted by Romay et al., with results showing a clear distinction between northern flint (Assiniboine and Longfellow Flint) and tropical lines.15

Correlation Analysis of GC-MS Data with Crop Composition and Phenotypic Data. Correlation analysis was conducted to gain insight into the relationship between metabolomic data and composition data generated using targeted assessments of crop components. The components assessed in targeted composition studies are listed by the OECD in its consensus document on maize composition.26 Therefore, the GC-MS data reported here were correlated with grain components (protein, amino acids, oil, fatty acids, fibers, raffinose, phytic acid, and tocopherols) previously reported for the same sample set of NAM and landrace hybrids.13 Results of the correlation analysis are summarized in Associated Content File 6. Only raffinose, α-tocopherol, and γ-tocopherol were present in both data sets, as many of the components listed in the OECD consensus documents could not be assessed by the GC-MS methodology (note that the amino acids and fatty acids in the previous composition analysis represent protein and oil-derived pools, respectively, whereas the amino acids and fatty acids in the GC-MS analysis represent free pools). Correlations between the analytes found in both the nontargeted profiling and targeted (compositional analysis) quantitative approach ranged from 0.55 to 0.82. However, correlation values between the nonshared components assessed in the GC-MS analysis and those assessed in the prior targeted analysis (both from the same samples) were typically low and were not consistent across the NAM and landrace hybrid sets (Associated Content File 6). This is not surprising given the low abundance of many measured metabolites and the high sensitivity of the metabolome to genetic and environmental variation relative to that of the key nutrients (starch, protein, oil, and fiber) that comprise the bulk biomass of maize grain.2,28 This lack of association of the metabolome with nutrient composition, in addition to the limited nutrient coverage offered by nontargeted profiling, means that metabolomics is unable to contribute to the “identification of potential safety and nutrition issues” as effectively as quantitative analyses of known nutrients and antinutrients.26

We likewise examined the correlations among metabolite and phenotypic data (Associated Content File 6). Again, the number of strong correlations (>0.5) was low, but was greatest for days to anthesis, plant height, and ear height. Several of these correlations were consistent across the NAM and landrace hybrid sets. Metabolites that correlated with the three aforementioned traits are listed in Table 2. It is not immediately obvious why these grain metabolites, measured as a snapshot in time, would correlate with these complex growth-

Table 2. GC-MS Metabolites Correlated (r > 0.5) with Selected Phenotypic Traits across Both NAM and Landrace Hybrid Sets

| trait | correlated metabolites |
|---|---|
| days to anthesis | alanine, aspartate, fumaric acid, glutamic acid, glutamine, histidine, isoleucine, leucine, lysine, malic acid, methionine, succinic acid, threonine, valine, 4-aminobutyrate, trigonelline (−ve) |
| plant height | malic acid, succinic acid, glutamic acid, trigonelline (−ve) |
| ear height | fumaric acid, glutamic acid, lysine, malic acid, methionine, succinic acid, 4-aminobutyrate, trigonelline (−ve) |
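The screening summarized in Table 2 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: it assumes a Pearson coefficient, treats "(−ve)" as marking a negative correlation, applies the 0.5 cutoff to the absolute value of r, and uses invented numbers. The function names `pearson_r` and `correlated_metabolites` are hypothetical.

```python
# Screen metabolites for correlation with a phenotypic trait.
# Values are invented for illustration and are not data from this study.
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlated_metabolites(metabolite_levels, trait, threshold=0.5):
    """Return {metabolite: r} for metabolites with |r| above the threshold."""
    hits = {}
    for name, levels in metabolite_levels.items():
        r = pearson_r(levels, trait)
        if abs(r) > threshold:
            hits[name] = r
    return hits


plant_height = [180.0, 200.0, 220.0, 240.0, 260.0]  # hypothetical, cm
levels = {
    "malic acid": [20.0, 24.0, 27.0, 33.0, 36.0],
    "trigonelline": [2.4, 2.1, 1.9, 1.5, 1.2],
    "lactate": [1.2, 1.0, 1.3, 1.0, 1.2],
}
hits = correlated_metabolites(levels, plant_height)
```

With these toy inputs, malic acid is retained with a positive coefficient, trigonelline with a negative one, and lactate is dropped.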

DOI: 10.1021/acs.jafc.5b04901 J. Agric. Food Chem. XXXX, XXX, XXX−XXX


Table 3. NMR Metabolite Values for Hybrids Derived from the NAM Founders

| metabolite^a | flint mean | flint min | flint max | mixed mean | mixed min | mixed max | temperate mean | temperate min | temperate max | tropical mean | tropical min | tropical max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4-aminobutyrate | 4.02 | 2.71 | 5.56 | 2.81 | 2.01 | 4.14 | 3.69 | 2.29 | 6.00 | 3.75 | 1.63 | 7.79 |
| acetate | 2.84 | 0.43 | 6.09 | 4.10 | 0.56 | 7.06 | 4.44 | 0.68 | 8.87 | 3.24 | 0.56 | 6.68 |
| alanine^b | 9.31 | 6.50 | 10.67 | 5.70 | 4.32 | 7.21 | 9.26 | 5.00 | 16.43 | 10.18 | 4.27 | 23.48 |
| arginine | 4.88 | 3.07 | 5.96 | 3.91 | 2.56 | 6.01 | 4.24 | 2.30 | 11.25 | 5.21 | 2.26 | 9.67 |
| asparagine | 24.56 | 19.43 | 26.53 | 20.87 | 14.80 | 28.26 | 25.51 | 15.87 | 39.70 | 27.35 | 11.28 | 62.85 |
| aspartate^b | 14.46 | 9.88 | 16.81 | 12.09 | 9.97 | 14.65 | 14.68 | 9.98 | 20.71 | 17.74 | 8.70 | 29.56 |
| choline | 7.04 | 3.67 | 10.73 | 5.52 | 3.74 | 8.00 | 5.61 | 3.52 | 8.16 | 6.63 | 3.43 | 11.65 |
| dimethylamine | 61.63 | 5.41 | 141.00 | 67.85 | 5.79 | 129.02 | 70.32 | 2.34 | 134.37 | 70.74 | 2.00 | 151.52 |
| formate | 1.19 | 0.60 | 2.10 | 1.38 | 0.48 | 2.83 | 1.34 | 0.54 | 2.13 | 1.24 | 0.27 | 2.23 |
| fructose^b | 78.50 | 52.52 | 122.42 | 58.73 | 33.91 | 88.40 | 53.02 | 26.45 | 91.48 | 55.58 | 15.48 | 112.17 |
| fumarate^b | 1.27 | 0.77 | 2.01 | 0.82 | 0.39 | 1.03 | 1.02 | 0.59 | 1.88 | 1.18 | 0.63 | 1.96 |
| galactose | 9.69 | 6.56 | 12.30 | 8.64 | 5.12 | 10.81 | 8.52 | 6.45 | 11.98 | 9.66 | 6.49 | 14.84 |
| glucose^b | 104.45 | 72.33 | 132.07 | 152.97 | 53.70 | 264.38 | 76.19 | 59.72 | 132.67 | 82.49 | 41.60 | 127.91 |
| glutamate | 14.81 | 9.49 | 19.86 | 12.08 | 8.12 | 15.30 | 13.82 | 8.30 | 21.94 | 15.81 | 8.25 | 26.81 |
| glutamine | 4.92 | 3.27 | 6.78 | 4.58 | 2.98 | 6.24 | 4.05 | 3.16 | 4.94 | 5.29 | 1.80 | 10.73 |
| histidine^b | 1.24 | 0.98 | 1.58 | 1.08 | 0.90 | 1.38 | 1.27 | 0.71 | 2.19 | 1.47 | 0.71 | 2.28 |
| isoleucine^b | 1.14 | 0.92 | 1.50 | 0.69 | 0.47 | 0.84 | 0.83 | 0.56 | 1.27 | 1.19 | 0.54 | 2.64 |
| lactate | 1.41 | 1.04 | 2.14 | 1.18 | 0.62 | 1.84 | 1.35 | 0.67 | 2.54 | 1.16 | 0.72 | 2.08 |
| leucine^b | 1.16 | 0.81 | 1.59 | 0.77 | 0.59 | 0.94 | 0.87 | 0.60 | 1.22 | 1.13 | 0.67 | 1.82 |
| lysine^b | 2.50 | 1.36 | 3.11 | 2.08 | 0.98 | 2.94 | 2.74 | 1.48 | 6.87 | 5.67 | 1.29 | 18.11 |
| malate^b | 38.72 | 24.87 | 56.40 | 21.38 | 8.72 | 35.77 | 30.74 | 21.07 | 39.15 | 34.07 | 14.67 | 74.02 |
| maltose^b | 16.06 | 8.93 | 31.46 | 82.82 | 12.73 | 251.55 | 14.88 | 10.54 | 21.12 | 15.96 | 5.44 | 31.25 |
| methionine^b | 0.76 | 0.48 | 1.00 | 0.61 | 0.45 | 0.73 | 0.61 | 0.40 | 0.78 | 0.94 | 0.51 | 1.70 |
| methylamine | 0.44 | 0.06 | 0.92 | 0.48 | 0.05 | 0.88 | 0.49 | 0.05 | 0.91 | 0.49 | 0.03 | 0.99 |
| phenylalanine | 1.37 | 0.86 | 1.90 | 1.41 | 0.99 | 1.72 | 1.51 | 1.07 | 2.26 | 1.61 | 0.93 | 2.61 |
| proline | 61.46 | 54.93 | 68.40 | 51.72 | 28.09 | 69.04 | 56.23 | 36.24 | 74.47 | 59.49 | 29.31 | 117.24 |
| succinate^b | 3.86 | 2.30 | 5.66 | 1.53 | 1.03 | 2.36 | 2.57 | 1.25 | 4.45 | 3.42 | 1.67 | 6.36 |
| sucrose | 1209.65 | 1084.43 | 1371.68 | 1419.00 | 684.87 | 2269.19 | 1137.37 | 920.47 | 1421.11 | 1218.76 | 942.86 | 1611.87 |
| threonine^b | 2.28 | 1.50 | 3.49 | 1.57 | 1.11 | 2.32 | 1.79 | 1.35 | 2.33 | 2.30 | 1.25 | 4.55 |
| trigonelline | 1.45 | 0.93 | 2.15 | 1.61 | 1.01 | 2.26 | 1.37 | 1.06 | 2.04 | 1.53 | 0.86 | 2.28 |
| tyrosine | 3.03 | 2.19 | 3.86 | 3.00 | 1.90 | 3.68 | 3.02 | 2.19 | 4.11 | 3.15 | 1.88 | 4.95 |
| valine^a | 1.93 | 1.43 | 2.42 | 1.37 | 1.04 | 1.77 | 1.62 | 1.04 | 2.37 | 2.08 | 1.04 | 3.82 |

^a Metabolites identified on the basis of Chenomx assignments (see text for details). ^b Mean difference at p < 0.05.
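The per-group mean, minimum, and maximum layout of this table corresponds to a simple descriptive summary. A minimal sketch of how such a summary might be computed is given below; the function name `summarize_by_group` and all values are hypothetical, and the significance test behind footnote b is not reproduced here.

```python
# Descriptive summary (mean, min, max) of one metabolite per breeding group.
# Values are invented for illustration and are not data from this study.
from collections import defaultdict


def summarize_by_group(records):
    """records: iterable of (group, value). Returns {group: (mean, min, max)}."""
    grouped = defaultdict(list)
    for group, value in records:
        grouped[group].append(value)
    return {
        group: (round(sum(values) / len(values), 2), min(values), max(values))
        for group, values in grouped.items()
    }


maltose = [
    ("flint", 10.0),
    ("flint", 30.0),
    ("flint", 20.0),
    ("tropical", 6.0),
    ("tropical", 16.0),
]
summary = summarize_by_group(maltose)
```

Each entry of `summary` then maps a group label to the three values reported per group in the table.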

dependent traits. Riedelsheimer et al. observed that metabolite profiles of leaf samples of maize inbred lines could serve as good predictors of some phenotypic characteristics in hybrids,29 including plant height, but also felt that such results could be considered unexpected due to (i) the early vegetative stage at which metabolites were profiled, (ii) the incomplete coverage of metabolic profiles relative to the metabolites present, and (iii) the fact that the metabolite profiles represent a snapshot in time, whereas plant phenotypes represent the end-point of a complex dynamic system over an extended growing time.

Statistical Analysis of NMR Profiling. Our analysis focused on simple descriptive summaries of the quantitative metabolite data (Tables 3 and 4; see also Supporting Figures 4 and 5 and Associated Content File 8) to allow discussion in the context of results obtained by the GC-MS platform. CDA of the NMR-derived metabolite data was also conducted. CDA is a dimension reduction technique related to PCA and canonical correlation. The CDA technique finds linear combinations of the quantitative variables that provide maximal separation between classes or groups. CDA has been successfully applied to 1H NMR-based metabolomic data of soybean seed.30 It was also utilized in the analysis of seed composition data13 generated on the same NAM and landrace hybrid sample set used in the current study, and its application here, over other multivariate approaches, was selected to allow a more direct comparison to the composition data set.

Overall, results from the 1H NMR profiling were consistent with results from the GC-MS analyses. Many of the differences between subpopulations for NAM hybrids were attributable to different metabolites. Among NAM hybrids, in addition to sugars (fructose, glucose, and maltose), significant differences (p < 0.05) between subgroups were observed for alanine, fumarate, histidine, isoleucine, lysine, malate, methionine, succinate, threonine, and valine (Table 3; see also Supporting Figure 4 and Associated Content File 8). For the NAM founders, the NMR data established that hybrids derived from the northern flint group, especially the sweet corn lines (P39 and Il14H), tended to have higher levels of sugars; sugar levels in HP301-derived hybrids were generally more similar to those of the other hybrids. Interestingly, levels of maltose were notably higher in the Il14H hybrids (187.53 μg/50 mg DW) when compared to all other NAM lines (9.39−21.66 μg/50 mg DW) (Associated Content File 8). Sweet corn lines that contain the sugary enhancer (se) allele can be distinguished from other sweet corn lines on the basis of their higher accumulation of grain maltose.31 However, whereas Il14H is a sweet corn that carries the sugary1 allele, it is not known to carry the se allele. Because Il14H shares ancestry with the population where se was first identified, it is likely that Il14H carries multiple QTL alleles


Table 4. NMR Metabolite Values for Hybrids Derived from the Landraces

| metabolite^a | flint mean | flint min | flint max | mixed mean | mixed min | mixed max | temperate mean | temperate min | temperate max | tropical mean | tropical min | tropical max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4-aminobutyrate^b | 3.26 | 3.10 | 3.52 | 3.05 | 2.09 | 4.83 | 5.08 | 4.63 | 5.75 | 3.89 | 1.48 | 7.22 |
| acetate | 3.54 | 0.53 | 8.02 | 3.99 | 0.55 | 7.66 | 3.41 | 0.79 | 5.37 | 2.75 | 0.39 | 6.67 |
| alanine | 7.14 | 4.66 | 12.19 | 6.94 | 4.73 | 11.57 | 10.94 | 8.62 | 15.32 | 8.80 | 3.79 | 18.89 |
| arginine | 3.65 | 2.77 | 4.39 | 5.38 | 2.73 | 7.60 | 6.34 | 3.57 | 13.54 | 4.68 | 2.42 | 8.85 |
| asparagine | 22.55 | 20.28 | 23.57 | 34.51 | 22.14 | 50.01 | 29.17 | 15.50 | 61.75 | 30.00 | 14.43 | 48.02 |
| aspartate^a | 12.76 | 10.41 | 17.34 | 14.01 | 11.22 | 20.82 | 19.00 | 14.10 | 22.39 | 16.27 | 6.77 | 27.66 |
| choline | 6.30 | 6.10 | 6.69 | 6.49 | 4.29 | 8.34 | 6.34 | 4.59 | 7.97 | 5.87 | 3.24 | 9.13 |
| dimethylamine | 47.50 | 1.68 | 83.56 | 78.38 | 1.95 | 155.85 | 58.55 | 5.07 | 85.19 | 67.84 | 1.65 | 134.13 |
| formate | 1.23 | 1.03 | 1.45 | 1.24 | 0.47 | 2.18 | 1.27 | 0.95 | 1.68 | 1.16 | 0.49 | 1.91 |
| fructose^b | 47.34 | 25.65 | 100.08 | 26.50 | 16.02 | 39.22 | 66.50 | 27.60 | 93.65 | 46.43 | 17.31 | 115.12 |
| fumarate | 1.02 | 0.52 | 1.49 | 1.14 | 0.66 | 1.69 | 1.36 | 0.80 | 2.01 | 1.23 | 0.63 | 2.45 |
| galactose | 8.29 | 6.05 | 12.94 | 8.04 | 5.66 | 10.03 | 9.57 | 7.01 | 12.07 | 9.41 | 4.23 | 20.20 |
| glucose^b | 65.40 | 47.72 | 106.29 | 50.77 | 36.50 | 77.61 | 90.87 | 50.12 | 119.53 | 67.87 | 29.13 | 139.04 |
| glutamate^b | 14.53 | 11.59 | 17.02 | 11.73 | 5.99 | 17.08 | 20.29 | 15.64 | 26.06 | 16.43 | 5.16 | 34.85 |
| glutamine | 4.90 | 4.76 | 4.95 | 3.65 | 2.28 | 5.31 | 4.72 | 3.19 | 8.24 | 5.02 | 1.70 | 13.71 |
| histidine^b | 0.96 | 0.42 | 1.27 | 1.65 | 0.95 | 2.70 | 1.27 | 1.10 | 1.64 | 1.35 | 0.65 | 2.28 |
| isoleucine | 0.91 | 0.76 | 1.31 | 0.78 | 0.58 | 1.15 | 1.16 | 0.72 | 1.61 | 1.11 | 0.42 | 2.74 |
| lactate | 1.13 | 0.77 | 1.54 | 1.08 | 0.76 | 1.66 | 1.60 | 1.00 | 2.22 | 1.23 | 0.46 | 2.16 |
| leucine | 0.97 | 0.76 | 1.30 | 0.92 | 0.59 | 1.55 | 1.06 | 0.94 | 1.15 | 1.18 | 0.54 | 2.49 |
| lysine | 2.22 | 1.58 | 3.68 | 4.27 | 2.05 | 7.88 | 3.56 | 2.34 | 6.10 | 4.96 | 1.55 | 12.85 |
| malate^b | 19.06 | 12.05 | 30.69 | 24.56 | 12.63 | 47.20 | 45.13 | 29.75 | 59.78 | 33.24 | 16.05 | 57.28 |
| maltose | 15.09 | 11.77 | 20.47 | 15.10 | 3.70 | 36.18 | 16.20 | 5.61 | 37.62 | 14.13 | 4.48 | 22.90 |
| methionine | 0.69 | 0.52 | 1.01 | 0.81 | 0.55 | 1.09 | 0.71 | 0.45 | 1.00 | 0.87 | 0.42 | 1.52 |
| methylamine | 0.35 | 0.07 | 0.57 | 0.56 | 0.06 | 1.10 | 0.42 | 0.04 | 0.61 | 0.48 | 0.03 | 0.93 |
| phenylalanine | 1.88 | 1.77 | 2.00 | 1.95 | 1.21 | 2.44 | 1.38 | 0.96 | 1.64 | 1.76 | 1.07 | 3.55 |
| proline^b | 65.73 | 60.60 | 71.22 | 70.84 | 17.15 | 133.82 | 61.63 | 51.27 | 75.27 | 49.17 | 16.10 | 91.13 |
| succinate^b | 2.77 | 1.15 | 7.17 | 1.68 | 1.03 | 2.29 | 3.81 | 2.05 | 5.61 | 3.43 | 0.66 | 7.51 |
| sucrose^b | 1229.85 | 1058.21 | 1358.57 | 1459.45 | 975.44 | 2045.97 | 1320.55 | 1064.20 | 1615.88 | 1145.76 | 678.67 | 1717.23 |
| threonine | 1.94 | 1.44 | 2.70 | 1.77 | 1.12 | 2.48 | 2.39 | 2.06 | 2.75 | 2.20 | 0.96 | 4.53 |
| trigonelline | 1.88 | 1.26 | 2.28 | 1.82 | 1.14 | 2.91 | 1.60 | 0.95 | 2.73 | 1.47 | 0.81 | 2.19 |
| tyrosine | 3.10 | 2.99 | 3.24 | 3.59 | 2.34 | 5.20 | 3.23 | 2.83 | 3.82 | 3.37 | 2.32 | 5.96 |
| valine | 1.81 | 1.46 | 2.32 | 1.70 | 1.31 | 2.37 | 2.06 | 1.56 | 2.86 | 2.07 | 0.98 | 4.37 |

^a Metabolites identified on the basis of Chenomx assignments (see text for details). ^b Mean difference at p < 0.05.

that contribute to the eating quality characteristic of se varieties (personal communication, William Tracy and Matt Murray). Sucrose and glucose levels were also higher for the P39 (1.55 mg/50 mg DW and 160 μg/50 mg DW, respectively) and Il14H (1.95 mg/50 mg DW and 230 μg/50 mg DW, respectively) hybrids. Differences in sugar and other metabolite levels between subgroups of landrace-derived hybrids were less apparent than those observed in the NAM founder set hybrids, most probably reflecting the absence of sweet corn lines. In addition to sugars (fructose, glucose, and sucrose), significant differences in 4-aminobutyrate, glutamate, histidine, malate, proline, and succinate were observed between landrace hybrid subgroups (Table 4; see also Supporting Figure 5 and Associated Content File 8). The CDA of the 1H NMR data for the NAM and landrace hybrid data sets established effective separation of the breeding groups assigned for the NAM hybrids and landrace hybrids (Figure 4). These results, like composition data reported previously,13 clearly imply unique metabolic characteristics that can distinguish between different sets of maize hybrids. As with the previously reported composition data, overall correlations between the canonical scores (Can1 and Can2; Supporting Table 3) and individual metabolites that provided the separation were generally low to moderate, and no single component was uniquely different in one breeding group versus another.

Correlation analysis of 1H NMR data showed few correlations with composition data. As with the GC-MS data, there were more correlations associated with growth-dependent traits, specifically days to anthesis, plant height, and ear height (Associated Content File 9). For the NAM data, glutamate, succinate, and trigonelline showed correlations with all three phenotypic traits, whereas for the landrace data set malate and succinate correlated with these three traits.

In summary, the present study expands metabolomic assessments of maize beyond previously reported commercial lines to include genetic resources that are used by the maize research community to investigate the genetic basis of complex plant traits or that may serve as a source of new alleles for improving modern maize hybrids. These resources include the NAM founder lines and a collection of landrace lines representing North and South America. Our results using two different and complementary profiling techniques (GC-MS and NMR) highlighted extensive metabolomic variation in the grain harvested from B73 hybrids derived from these two genetic resources. This wide variation is consistent with the compositional diversity observed in the same sample set,13 as well as results observed from metabolic profiling of commercial hybrids.6a Of potentially more relevance is that, within each hybrid set, subpopulations could be differentiated in a pattern consistent with the known genetic and compositional variation of these


Figure 4. Canonical discriminant analysis of NMR metabolite data from the landrace (top panel) and the NAM (bottom panel) hybrids. Labeled circle represents the multivariate mean for each group. The size of the circle corresponds to a 95% confidence limit for the mean. Groups that are significantly different tend to have nonintersecting circles.
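As an illustration of the principle behind CDA (finding a linear combination of variables that maximally separates groups), the sketch below handles only the two-group, two-variable special case, where the canonical direction reduces to Fisher's linear discriminant. It is not the software or procedure used for Figure 4; the function names and all values are hypothetical.

```python
# Two-group Fisher discriminant: w = Sw^{-1} (mean_a - mean_b),
# where Sw is the pooled within-group scatter matrix (2 x 2 here).
# Values are invented for illustration and are not data from this study.


def mean_vector(rows):
    """Column means of a samples x 2 matrix."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter_2x2(rows, mean):
    """Return (sxx, sxy, syy) sums of squares and cross products."""
    sxx = sum((r[0] - mean[0]) ** 2 for r in rows)
    sxy = sum((r[0] - mean[0]) * (r[1] - mean[1]) for r in rows)
    syy = sum((r[1] - mean[1]) ** 2 for r in rows)
    return sxx, sxy, syy


def fisher_direction(group_a, group_b):
    """Discriminant direction separating two groups of 2-D samples."""
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    axx, axy, ayy = scatter_2x2(group_a, ma)
    bxx, bxy, byy = scatter_2x2(group_b, mb)
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # Apply the closed-form inverse of [[sxx, sxy], [sxy, syy]] to (dx, dy).
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    return wx, wy


def project(rows, w):
    """Canonical-style scores: projection of each sample onto w."""
    return [r[0] * w[0] + r[1] * w[1] for r in rows]


# Hypothetical variables: [maltose, histidine]
flint = [[16.0, 1.2], [20.0, 1.0], [18.0, 1.5], [22.0, 1.3]]
tropical = [[9.0, 1.1], [11.0, 1.4], [10.0, 1.0], [12.0, 1.3]]
w = fisher_direction(flint, tropical)
flint_scores = project(flint, w)
tropical_scores = project(tropical, w)
```

In this toy case the projected scores of the two groups do not overlap. With more than two groups, CDA yields several canonical variables (such as Can1 and Can2), which requires solving a generalized eigenvalue problem rather than this closed form.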

lines, as highlighted in Figures 1−4. In other words, analysis of the GC-MS and NMR data allowed discrete classification of the flint, tropical/subtropical, and intermediate/mixed lines, as did the genetic fingerprint analysis. This implies that there may be metabolic features that are associated with different genetic groupings, but our data suggest that these are subtly expressed with the exception of some metabolites in the popcorn/sweet corn samples; no analytes were uniquely or consistently higher in one group than the other. Furthermore, it is interesting to note that all of the metabolites identified in this study using two distinct metabolomic approaches (GC-MS and NMR) were identified in both hybrid sets, as well as their subpopulations. It is interesting that major developmental and morphological changes in the plant result from varietal development and geographical adaptation of maize across the Americas, but these adaptations and breeding do not result in extreme changes in the metabolome. All metabolites were present across samples with subtle variation, except where major mutations (i.e., sugary1 in the sweet corn lines) cause dramatic changes in seed composition. The correlations observed between metabolite and phenotypic data, although admittedly surprising for reasons described earlier, may suggest the potential of metabolomics to enable trait development. Conversely, the low correlations observed between metabolomic and crop composition data call attention to the limitations of metabolomics in safety assessments. In addition to the limited nutrient coverage offered by nontargeted profiling, it is clear that metabolomics does not address targeted end-points of direct relevance to safety and nutrition and does not provide strong testing of relevant safety hypotheses. Encouragingly, this study helps provide an assurance that extensive metabolite variation is not associated with a safety concern, given its association with the diversity-selected and widely adapted conventional germplasm evaluated here. Utilization of publicly available germplasm to improve modern maize lines will, no doubt, be accompanied by incorporation of metabolic diversity that is beyond the potential unintended changes due to genetic modification (GM) using modern agricultural biotechnology.32

The development of improved maize hybrids will require optimization of current resources to guide breeding strategies. The genetic diversity preserved in germplasm banks represents a key current resource. In this study we conducted the first metabolomic survey of kernel composition of hybrids from two important genetic resources: the NAM founder lines as well as landraces representing unimproved germplasm from a wide geographic area. It is our hope that the data generated here will encourage more detailed metabolomic investigations of publicly available resources that will provide value to plant breeders and plant biotechnologists involved in the nutritional and agronomic improvement of maize.




ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jafc.5b04901.
Group information for NAM and landrace hybrids, correlation scores of canonical scores 1 and 2 for NMR metabolites, data filtering and PLS analysis for GC-MS data, and box plots for NMR metabolites (PDF)
Raw metabolomics as well as statistically analyzed data (ZIP)

AUTHOR INFORMATION

Corresponding Authors

*(T.V.V.) E-mail: [email protected]. Phone: (314) 7573954. Fax: (314) 300-8020.
*(S.F.-G.) E-mail: [email protected]. Phone: (573) 884-0116. Fax: (573) 884-7850.

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

We are grateful to Nordine Cheikh for his support and encouragement in initiating this project. We thank Erin Bell for a critical review of the manuscript. We also thank the Buckler Laboratory at Cornell University for contributing seed for the analyses and the Sample Management Team at Monsanto Co. for preparing samples for compositional analyses.

REFERENCES

(1) (a) FAO Crops Primary Equivalent, http://faostat.fao.org. (b) USDA-NASS National Statistics for Corn, http://www.nass.usda.gov.
(2) Watson, S. A. Description, development, structure, and composition of the corn kernel. In Corn Chemistry and Technology; White, P., Johnson, L., Eds.; American Association of Cereal Chemists: St. Paul, MN, USA, 2003; pp 1−38.
(3) Yan, J.; Kandianis, C. B.; Harjes, C. E.; Bai, L.; Kim, E.-H.; Yang, X.; Skinner, D. J.; Fu, Z.; Mitchell, S.; Li, Q.; Fernandez, M. G. S.; Zaharieva, M.; Babu, R.; Fu, Y.; Palacios, N.; Li, J.; DellaPenna, D.; Brutnell, T.; Buckler, E. S.; Warburton, M. L.; Rocheford, T. Rare genetic variation at Zea mays crtRB1 increases β-carotene in maize grain. Nat. Genet. 2010, 42 (4), 322−327.
(4) (a) Cerino Badone, F.; Amelotti, M.; Cassani, E.; Pilu, R. Study of Low Phytic Acid1−7 (lpa1−7), a new ZmMRP4 mutation in maize. J. Hered. 2012, 103 (4), 598−605. (b) Fernie, A. R.; Schauer, N. Metabolomics-assisted breeding: a viable option for crop improvement? Trends Genet. 2009, 25 (1), 39−48.
(5) (a) Kusano, M.; Saito, K. Role of metabolomics in crop improvement. J. Plant Biochem. Biotechnol. 2012, 21 (1), 24−31. (b) van Dongen, J. T.; Schauer, N. Metabolic marker as selection tool in plant breeding. ISB News Rep. 2010, 7−10.
(6) (a) Asiago, V. M.; Hazebroek, J.; Harp, T.; Zhong, C. Effects of genetics and environment on the metabolome of commercial maize hybrids: a multisite study. J. Agric. Food Chem. 2012, 60 (46), 11498−11508. (b) Baniasadi, H.; Vlahakis, C.; Hazebroek, J.; Zhong, C.; Asiago, V. Effect of environment and genotype on commercial maize hybrids using LC/MS-based metabolomics. J. Agric. Food Chem. 2014, 62 (6), 1412−1422. (c) Barros, E.; Lezar, S.; Anttonen, M. J.; Van Dijk, J. P.; Röhlig, R. M.; Kok, E. J.; Engel, K.-H. Comparison of two GM maize varieties with a near-isogenic non-GM variety using transcriptomics, proteomics and metabolomics. Plant Biotechnol. J. 2010, 8 (4), 436−451. (d) Frank, T.; Röhlig, R. M.; Davies, H. V.; Barros, E.; Engel, K.-H. Metabolite profiling of maize kernels - genetic modification versus environmental influence. J. Agric. Food Chem. 2012, 60 (12), 3005−3012. (e) Leon, C.; Rodriguez-Meizoso, I.; Lucio, M.; Garcia-Cañas, V.; Ibañez, E.; Schmitt-Kopplin, P.; Cifuentes, A. Metabolomics of transgenic maize combining Fourier transform-ion cyclotron resonance-mass spectrometry, capillary electrophoresis-mass spectrometry and pressurized liquid extraction. J. Chromatogr. A 2009, 1216 (43), 7314−7323. (f) Levandi, T.; Leon, C.; Kaljurand, M.; Garcia-Cañas, V.; Cifuentes, A. Capillary electrophoresis time-of-flight mass spectrometry for comparative metabolomics of transgenic versus conventional maize. Anal. Chem. 2008, 80 (16), 6329−6335. (g) Manetti, C.; Bianchetti, C.; Casciani, L.; Castro, C.; Di Cocco, M. E.; Miccheli, A.; Motto, M.; Conti, F. A metabonomic study of transgenic maize (Zea mays) seeds revealed variations in osmolytes and branched amino acids. J. Exp. Bot. 2006, 57 (11), 2613−2625. (h) Piccioni, F.; Capitani, D.; Zolla, L.; Mannina, L. NMR metabolic profiling of transgenic maize with the Cry1A(b) gene. J. Agric. Food Chem. 2009, 57 (14), 6041−6049. (i) Röhlig, R.; Eder, J.; Engel, K.-H. Metabolite profiling of maize grain: differentiation due to genetics and environment. Metabolomics 2009, 5 (4), 459−477. (j) Skogerson, K.; Harrigan, G. G.; Reynolds, T. L.; Halls, S. C.; Ruebelt, M.; Iandolino, A.; Pandravada, A.; Glenn, K. C.; Fiehn, O. Impact of genetics and environment on the metabolite composition of maize grain. J. Agric. Food Chem. 2010, 58 (6), 3600−3610. (k) Zeng, W.; Hazebroek, J.; Beatty, M.; Hayes, K.; Ponte, C.; Maxwell, C.; Zhong, C. X. Analytical method evaluation and discovery of variation within maize varieties in the context of food safety: transcript profiling and metabolomics. J. Agric. Food Chem. 2014, 62 (13), 2997−3009.
(7) (a) Flint-Garcia, S. A. Genetics and consequences of crop domestication. J. Agric. Food Chem. 2013, 61 (35), 8267−8276. (b) Palmgren, M. G.; Edenbrandt, A. K.; Vedel, S. E.; Andersen, M. M.; Landes, X.; Østerberg, J. T.; Falhof, J.; Olsen, L. I.; Christensen, S. B.; Sandøe, P.; Gamborg, C.; Kappel, K.; Thorsen, B. J.; Pagh, P. Are we ready for back-to-nature crop breeding? Trends Plant Sci. 2015, 20 (3), 155−164.
(8) Flint-Garcia, S. A.; Thuillet, A.-C.; Yu, J.; Pressoir, G.; Romero, S. M.; Mitchell, S. E.; Doebley, J.; Kresovich, S.; Goodman, M. M.; Buckler, E. S. Maize association population: a high-resolution platform for quantitative trait locus dissection. Plant J. 2005, 44 (6), 1054−1064.
(9) (a) McMullen, M. D.; Kresovich, S.; Villeda, H. S.; Bradbury, P.; Li, H.; Sun, Q.; Flint-Garcia, S.; Thornsberry, J.; Acharya, C.; Bottoms, C.; Brown, P.; Browne, C.; Eller, M.; Guill, K.; Harjes, C.; Kroon, D.; Lepak, N.; Mitchell, S. E.; Peterson, B.; Pressoir, G.; Romero, S.; Rosas, M. O.; Salvo, S.; Yates, H.; Hanson, M.; Jones, E.; Smith, S.; Glaubitz, J. C.; Goodman, M.; Ware, D.; Holland, J. B.; Buckler, E. S. Genetic properties of the maize nested association mapping population. Science 2009, 325 (5941), 737−740. (b) Tian, F.; Bradbury, P. J.; Brown, P. J.; Hung, H.; Sun, Q.; Flint-Garcia, S.; Rocheford, T. R.; McMullen, M. D.; Holland, J. B.; Buckler, E. S. Genome-wide association study of leaf architecture in the maize nested association mapping population. Nat. Genet. 2011, 43, 159−162.
(10) Mikel, M. A.; Dudley, J. W. Evolution of North American dent corn from public to proprietary germplasm. Crop Sci. 2006, 46 (3), 1193−1205.
(11) Schnable, P. S.; Ware, D.; Fulton, R. S.; Stein, J. C.; Wei, F.; Pasternak, S.; Liang, C.; Zhang, J.; Fulton, L.; Graves, T. A.; Minx, P.; Reily, A. D.; Courtney, L.; Kruchowski, S. S.; Tomlinson, C.; Strong, C.; Delehaunty, K.; Fronick, C.; Courtney, B.; Rock, S. M.; Belter, E.; Du, F.; Kim, K.; Abbott, R. M.; Cotton, M.; Levy, A.; Marchetto, P.; Ochoa, K.; Jackson, S. M.; Gillam, B.; Chen, W.; Yan, L.; Higginbotham, J.; Cardenas, M.; Waligorski, J.; Applebaum, E.; Phelps, L.; Falcone, J.; Kanchi, K.; Thane, T.; Scimone, A.; Thane, N.; Henke, J.; Wang, T.; Ruppert, J.; Shah, N.; Rotter, K.; Hodges, J.; Ingenthron, E.; Cordes, M.; Kohlberg, S.; Sgro, J.; Delgado, B.; Mead, K.; Chinwalla, A.; Leonard, S.; Crouse, K.; Collura, K.; Kudrna, D.; Currie, J.; He, R.; Angelova, A.; Rajasekar, S.; Mueller, T.; Lomeli, R.; Scara, G.; Ko, A.; Delaney, K.; Wissotski, M.; Lopez, G.; Campos, D.; Braidotti, M.; Ashley, E.; Golser, W.; Kim, H.; Lee, S.; Lin, J.; Dujmic, Z.; Kim, W.; Talag, J.; Zuccolo, A.; Fan, C.; Sebastian, A.; Kramer, M.; Spiegel, L.; Nascimento, L.; Zutavern, T.; Miller, B.; Ambroise, C.; Muller, S.; Spooner, W.; Narechania, A.; Ren, L.; Wei, S.; Kumari, S.; Faga, B.; Levy, M. J.; McMahan, L.; Van Buren, P.; Vaughn, M. W.; Ying, K.; Yeh, C. T.; Emrich, S. J.; Jia, Y.; Kalyanaraman, A.; Hsia, A. P.; Barbazuk, W. B.; Baucom, R. S.; Brutnell, T. P.; Carpita, N. C.; Chaparro, C.; Chia, J. M.; Deragon, J. M.; Estill, J. C.; Fu, Y.; Jeddeloh, J. A.; Han, Y.; Lee, H.; Li, P.; Lisch, D. R.; Liu, S.; Liu, Z.; Nagel, D. H.; McCann, M. C.; SanMiguel, P.; Myers, A. M.; Nettleton, D.; Nguyen, J.; Penning, B. W.; Ponnala, L.; Schneider, K. L.; Schwartz, D. C.; Sharma, A.; Soderlund, C.; Springer, N. M.; Sun, Q.; Wang, H.; Waterman, M.; Westerman, R.; Wolfgruber, T. K.; Yang, L.; Yu, Y.; Zhang, L.; Zhou, S.; Zhu, Q.; Bennetzen, J. L.; Dawe, R. K.; Jiang, J.; Jiang, N.; Presting, G. G.; Wessler, S. R.; Aluru, S.; Martienssen, R. A.; Clifton, S. W.; McCombie, W. R.; Wing, R. A.; Wilson, R. K. The B73 maize genome: complexity, diversity and dynamics. Science 2009, 326, 1112−1115.
(12) Flint-Garcia, S. A.; Thuillet, A. C.; Yu, J.; Pressoir, G.; Romero, S. M.; Mitchell, S. E.; Doebley, J.; Kresovich, S.; Goodman, M. M.; Buckler, E. S. Plant J. 2005, 44, 1054.
(13) Venkatesh, T. V.; Harrigan, G. G.; Perez, T.; Flint-Garcia, S. Compositional assessments of key maize populations: B73 hybrids of the nested association mapping founder lines and diverse landrace inbred lines. J. Agric. Food Chem. 2015, 63 (21), 5282−5295.
(14) (a) Hufford, M. B.; Xu, X.; van Heerwaarden, J.; Pyhajarvi, T.; Chia, J.-M.; Cartwright, R. A.; Elshire, R. J.; Glaubitz, J. C.; Guill, K. E.; Kaeppler, S. M.; Lai, J.; Morrell, P. L.; Shannon, L. M.; Song, C.; Springer, N. M.; Swanson-Wagner, R. A.; Tiffin, P.; Wang, J.; Zhang, G.; Doebley, J.; McMullen, M. D.; Ware, D.; Buckler, E. S.; Yang, S.; Ross-Ibarra, J. Comparative population genomics of maize domestication and improvement. Nat. Genet. 2012, 44 (7), 808−811. (b) Chia, J. M.; Song, C.; Bradbury, P. J.; Costich, D.; de Leon, N.; Doebley, J.; Elshire, R. J.; Gaut, B.; Geller, L.; Glaubitz, J. C.; Gore, M.; Guill, K. E.; Holland, J.; Hufford, M. B.; Lai, J.; Li, M.; Liu, X.; Lu, Y.; McCombie, R.; Nelson, R.; Poland, J.; Prasanna, B. M.; Pyhajarvi, T.; Rong, T.; Sekhon, R. S.; Sun, Q.; Tenaillon, M. I.; Tian, F.; Wang, J.; Xu, X.; Zhang, Z.; Kaeppler, S. M.; Ross-Ibarra, J.; McMullen, M. D.; Buckler, E. S.; Zhang, G.; Xu, Y.; Ware, D. Maize HapMap2 identifies extant variation from a genome in flux. Nat. Genet. 2012, 44, 803−807.
(15) Romay, M.; Millard, M.; Glaubitz, J.; Peiffer, J.; Swarts, K.; Casstevens, T.; Elshire, R.; Acharya, C.; Mitchell, S.; Flint-Garcia, S.; McMullen, M.; Holland, J.; Buckler, E.; Gardner, C. Comprehensive genotyping of the USA national maize inbred seed bank. Genome Biol. 2013, 14 (6), R55.
(16) Compilation of North American Maize Breeding Germplasm; Crop Science Society of America: Madison, WI, USA, 1993.
(17) Goodman, M. Races of corn. In Corn and Corn Improvement, 3rd ed.; Sprague, G. F., Dudley, J. W., Eds.; American Society of Agronomy: Madison, WI, USA, 1988; pp 33−79.
(18) (a) Fiehn, O.; Wohlgemuth, G.; Scholz, M.; Kind, T.; Lee, D. Y.; Lu, Y.; Moon, S.; Nikolau, B. Quality control for plant metabolomics: reporting MSI-compliant studies. Plant J. 2008, 53 (4), 691−704. (b) Sumner, L. W.; Amberg, A.; Barrett, D.; Beger, R.; Beale, M. H.; Daykin, C.; Fan, T. W.; Fiehn, O.; Goodacre, R.; Griffin, J. L.; Hankemeier, T.; Hardy, N.; Harnly, J.; Higashi, R.; Kopka, J.; Lane, A. N.; Lindon, J. C.; Marriott, P.; Nicholls, A. W.; Reily, M. D.; Thaden, J. J.; Viant, M. R. Proposed minimum reporting standards for chemical analysis. Metabolomics 2007, 3, 211−221.
(19) Kind, T.; Wohlgemuth, G.; Lee, D. Y.; Lu, Y.; Palazoglu, M.; Shahbaz, S.; Fiehn, O. FiehnLib: Mass spectral and retention index libraries for metabolomics based on quadrupole and time-of-flight gas chromatography/mass spectrometry. Anal. Chem. 2009, 81 (24), 10038−10048.
(20) Grapov, D.; Newman, J. W. imDEV: a graphical user interface to R multivariate analysis tools in Microsoft Excel. Bioinformatics 2012, 28 (17), 2288−2290.
(21) Partial least squares regression, https://en.wikipedia.org/wiki/Partial_least_squares_regression.
(22) Pérez-Enciso, M.; Tenenhaus, M. Prediction of clinical outcome with microarray data: a partial least squares discriminant analysis (PLS-DA) approach. Hum. Genet. 2003, 112 (5−6), 581−592.
(23) Bradbury, P. J.; Zhang, Z.; Kroon, D. E.; Casstevens, T. M.; Ramdoss, Y.; Buckler, E. S. TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 2007, 23 (19), 2633−2635.
(24) Tamura, K.; Stecher, G.; Peterson, D.; Filipski, A.; Kumar, S. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol. Biol. Evol. 2013, 30 (12), 2725−2729.
(25) Rao, J.; Cheng, F.; Hu, C.; Quan, S.; Lin, H.; Wang, J.; Chen, G.; Zhao, X.; Alexander, D.; Guo, L.; Wang, G.; Lai, J.; Zhang, D.; Shi, J. Metabolic map of mature maize kernels. Metabolomics 2014, 10 (5), 775−787.
(26) OECD. Consensus Document on Compositional Considerations for New Varieties of Maize (Zea mays): Key Food and Feed Nutrients, Antinutrients and Secondary Plant Metabolites; Organisation of Economic Co-Operation and Development: Paris, France, 2002; Vol. OECD ENV/JM/MONO (2002)25.
(27) (a) Codex Alimentarius. Foods Derived from Modern Biotechnology, 2nd ed.; Codex Alimentarius Commission, Joint FAO/WHO Food Standards Programme; Food and Agriculture Organization of the United Nations: Rome, Italy, 2009. (b) ILSI. International Life Science Institute Crop Composition Database, v5, www.cropcomposition.org, 2014.
(28) Harrigan, G. G.; Stork, L. G.; Riordan, S. G.; Reynolds, T. L.; Ridley, W. P.; Masucci, J. D.; MacIsaac, S.; Halls, S. C.; Orth, R.; Smith, R. G.; Wen, L.; Brown, W. E.; Welsch, M.; Riley, R.; McFarland, D.; Pandravada, A.; Glenn, K. C. Impact of genetics and environment on nutritional and metabolite components of maize grain. J. Agric. Food Chem. 2007, 55 (15), 6177−6185.
(29) Riedelsheimer, C.; Czedik-Eysenberg, A.; Grieder, C.; Lisec, J.; Technow, F.; Sulpice, R.; Altmann, T.; Stitt, M.; Willmitzer, L.; Melchinger, A. E. Genomic and metabolic prediction of complex heterotic traits in hybrid maize. Nat. Genet. 2012, 44 (2), 217−220.
(30) Harrigan, G. G.; Skogerson, K.; MacIsaac, S.; Bickel, A.; Perez, T.; Li, X. Application of 1H NMR profiling to assess seed metabolomic diversity. A case study on a soybean era population. J. Agric. Food Chem. 2015, 63 (18), 4690−4697.
(31) Ferguson, J. E.; Dickinson, D. B.; Rhodes, A. M. Analysis of endosperm sugars in a sweet corn inbred (Illinois 677a) which contains the sugary enhancer (se) gene and comparison of se with other corn genotypes. Plant Physiol. 1979, 63 (3), 416−420.
(32) (a) Venkatesh, T. V.; Cook, K.; Liu, B.; Perez, T.; Willse, A.; Tichich, R.; Feng, P.; Harrigan, G. G. Compositional differences between near-isogenic GM and conventional maize hybrids are associated with backcrossing practices in conventional breeding. Plant Biotechnol. J. 2015, 13, 200−210. (b) Ricroch, A. E.; Bergé, J. B.; Kuntz, M. Evaluation of genetically engineered crops using transcriptomic, proteomic, and metabolomic profiling techniques. Plant Physiol. 2011, 155 (4), 1752−1761. (c) Ricroch, A. E. Assessment of GE food safety using ‘-omics’ techniques and long-term animal feeding studies. New Biotechnol. 2013, 30, 349−354.
