Article pubs.acs.org/JAFC
Effect of Environment and Genotype on Commercial Maize Hybrids Using LC/MS-Based Metabolomics
Hamid Baniasadi,† Chris Vlahakis,† Jan Hazebroek,† Cathy Zhong,‡ and Vincent Asiago*,†
†DuPont Pioneer, Analytical & Genomics Technologies, 7300 NW 62nd Avenue, Johnston, Iowa 50131-1004, United States
‡Regulatory Group, DuPont Experimental Station, Route 141 and Henry Clay Road, Wilmington, Delaware 19880, United States
Supporting Information
ABSTRACT: We recently applied gas chromatography coupled to time-of-flight mass spectrometry (GC/TOF-MS) and multivariate statistical analysis to measure biological variation of many metabolites due to environment and genotype in forage and grain samples collected from 50 genetically diverse nongenetically modified (non-GM) DuPont Pioneer commercial maize hybrids grown at six North American locations. In the present study, the metabolome coverage was extended using a core subset of these grain and forage samples employing ultra high pressure liquid chromatography (uHPLC) mass spectrometry (LC/MS). A total of 286 and 857 metabolites were detected in grain and forage samples, respectively, using LC/MS. Multivariate statistical analysis was utilized to compare and correlate the metabolite profiles. Environment had a greater effect on the metabolome than genetic background. The results of this study support and extend previously published insights into the environmentally and genetically associated perturbations to the metabolome that are not associated with transgenic modification.
KEYWORDS: metabolomics, metabolite profiling, LC/MS, GC/MS, chromatography, multivariate analysis, corn, maize, substantial equivalence
■ INTRODUCTION
Genetically modified crops are expected to play a critical role in the future due to their actual or projected high quality, high yield, increased nutrition, disease resistance, and drought tolerance in comparison with conventional crops.1 Although there is no direct evidence that genetically modified (GM) products are harmful,2,3 much research has focused on ways to evaluate the substantial equivalence of GM crops to their conventional counterparts.4,5 Several nontargeted approaches such as proteomics,6,7 genomics,8,9 and metabolomics10−12 have been suggested to supplement traditional targeted substantial equivalence approaches and perhaps to better identify unintended changes. Among the “omics” technologies, metabolomics is of particular interest because it can provide a direct biochemical picture of an organism’s phenotype.13 Metabolomics has been applied to detect unexpected, potentially undesired changes among the metabolites in genetically modified and conventional crops. Several studies have also reported using metabolomics for substantial equivalence measurements in other plants such as rice, Arabidopsis, tomato, potatoes, and wheat, among others.14−24 These studies show that genetic modification is not a major contributor to omics variation.25,26
Metabolites can be identified from untargeted metabolomics analyses performed with a variety of analytical methods. Common analytical platforms for metabolomics studies include gas chromatography−mass spectrometry (GC/MS),21,27,28 nuclear magnetic resonance (NMR),16,20 and liquid chromatography−mass spectrometry (LC/MS).29−32 Each platform is sensitive and selective to a specific set of metabolites and typically provides complementary but often poorly overlapping information on metabolites. GC/MS offers broad coverage of analytical compounds and easier peak identification.33 However, GC/MS requires time-consuming sample derivatization and mainly detects primary metabolites.13,34 NMR has some advantages in that it provides quick, accurate quantitative measurements. However, NMR has poor dynamic range (less than 5 orders of magnitude), and only metabolites that exceed a relatively high concentration threshold can be detected.35,36 LC/MS has demonstrated several advantages over NMR, including greater sensitivity and dynamic range. LC/MS analysis does not require sample derivatization and provides analytical access to molecules of more diverse mass and chemical structures such as secondary metabolites.13 Therefore, using multiple platforms covering more metabolites should improve the understanding of biological variation while enhancing the ability to detect altered metabolites.37
In the present study, an LC/MS metabolomics method was applied to investigate the effect of environment, genotype, or both on variation of many metabolites in commercial maize. Multivariate statistical analysis, including principal component analysis (PCA), was used to discriminate the metabolomics profiles of grain and forage samples collected from five genetically diverse non-GM maize hybrids grown at six locations. We extend the metabolome coverage of these grain and forage samples by using a combination of GC/MS and LC/MS. Our results may help to better understand the natural variation contributed by environment and genotype before metabolomics could be potentially considered in GM crop assessment.
© 2014 American Chemical Society
Received: October 22, 2013
Revised: January 17, 2014
Accepted: January 18, 2014
Published: January 18, 2014
dx.doi.org/10.1021/jf404702g | J. Agric. Food Chem. 2014, 62, 1412−1422
Journal of Agricultural and Food Chemistry
(uHPLC) system coupled to an LTQ-FT hybrid mass spectrometer (Thermo Scientific, West Palm Beach, FL, USA) via an electrospray ionization (ESI) interface. Sample extracts were separated by reversed-phase chromatography on a Thermo Scientific Hypersil Gold C18 column (2.1 × 200 mm, 1.9 μm). The mobile phase consisted of water/0.1% formic acid (A) and acetonitrile/0.1% formic acid (B), and the flow rate was set to 0.5 mL/min. A linear gradient was used, beginning with 10% B at 0.98 min and reaching 100% B at 18 min; 100% B was held until 35 min, after which the system was returned to the initial conditions within 30 s and held there for 4.5 min. Injection volume and column temperature were 25 μL and 40 °C, respectively. The optimized parameters for positive and negative ion electrospray were set as follows: capillary temperature, 250 °C; sheath gas flow rate, 70 arbitrary units; sweep gas flow rate, 5 arbitrary units. The instrument was tuned prior to each batch run. Metabolites were detected using the linear ion trap detector collecting full-scan data from 150 to 975 m/z. Automatic gain control was set at 30 000 for both positive and negative ionization modes. The source voltage was set at 2.5 kV for positive mode and 4 kV for negative mode. Xcalibur version 2.1.0 SP1 software (Thermo Scientific) was used for data acquisition.
Data Pretreatment. Xcalibur raw files were imported to Genedata Expressionist Refiner MS version 6.1 (Basel, Switzerland) for chromatogram gridding, aligning, chemical noise subtraction, peak detection, isotope clustering, adduct detection, and signal clustering. Preprocessed data consisted of intensities for each m/z value and retention time combination, or peak. For a peak to be valid and retained, it had to be present in at least 50% of the samples in addition to being detected in all QC samples. The data matrix was exported to Genedata Expressionist Analyst version 2.2 for normalization by both internal standard signal and the sample dry weight.
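As a rough illustration of the peak-retention rule above, the following Python sketch keeps a peak only if it is detected in at least 50% of the study samples and in every QC injection. The data layout (a dictionary of per-injection intensities, with `None` for an undetected peak), the function name, and the reading of "samples" as non-QC injections are assumptions made for this example; they do not describe how Refiner MS implements the filter.

```python
from typing import Dict, List, Optional

# Hypothetical layout: peak id -> one intensity per injection (None = not detected).
PeakTable = Dict[str, List[Optional[float]]]


def retain_valid_peaks(peaks: PeakTable, is_qc: List[bool],
                       min_fraction: float = 0.5) -> PeakTable:
    """Keep peaks seen in >= min_fraction of study samples and in every QC sample."""
    sample_idx = [i for i, qc in enumerate(is_qc) if not qc]
    qc_idx = [i for i, qc in enumerate(is_qc) if qc]
    kept: PeakTable = {}
    for peak_id, intensities in peaks.items():
        n_present = sum(intensities[i] is not None for i in sample_idx)
        in_all_qc = all(intensities[i] is not None for i in qc_idx)
        if in_all_qc and n_present >= min_fraction * len(sample_idx):
            kept[peak_id] = intensities
    return kept


# Invented toy data: injections 0 and 5 are QC samples, 1-4 are study samples.
is_qc = [True, False, False, False, False, True]
peaks = {
    "peak_A": [10.0, 4.0, None, 6.0, None, 11.0],  # 2 of 4 samples, both QCs
    "peak_B": [8.0, None, None, 3.0, None, 9.0],   # only 1 of 4 samples
    "peak_C": [None, 5.0, 5.5, 6.0, 4.0, 7.0],     # missing in one QC
}
valid = retain_valid_peaks(peaks, is_qc)
```

With these toy inputs only `peak_A` survives: `peak_B` fails the 50% rule and `peak_C` is absent from one QC injection.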
To further minimize unwanted variation in our data set due to within-batch run-order drift or across-batch variability, we performed QC and reference sample normalizations in addition to internal standard and weight normalizations. QC sample normalization was used to minimize and correct for within-batch variation; it was performed by dividing the relative abundance of each sample peak by the mean of that peak in the two bracketing QC samples within a batch. Reference sample normalization was used to correct for and minimize variability across batches; it was performed after QC normalization by dividing the relative abundance of each sample peak by the mean of that peak in all reference samples within a batch. Fully normalized data were exported to Matlab or Excel 2007 for further data analysis. We used the LC/MS data individually and in combination with GC/MS data for statistical analyses. GC/MS data were obtained from the same samples and were processed in the same way as described elsewhere10 prior to combining the two data sets.
Data Analysis. Univariate Analysis. To measure the variance in the metabolomics data due to different genotypes and environments, we calculated the coefficient of variation (CV) for each metabolite. The mean CV for all metabolites in samples grown in all locations was used to quantify the effect of the environment. The mean CV for all metabolites from each genotype using samples grown in each location was used to evaluate the effect of genotype. To investigate the effects of both environment and genotype, a paired Student’s t test was applied to each metabolite. The p-value was calculated for each metabolite and used to determine those metabolites that were significantly altered (p-value < 0.01). To determine the effect of environment, the p-values of each metabolite from one location were compared to those in the other locations.
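The QC and reference normalizations and the per-metabolite CV described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, each list holds one peak's intensities in injection order within a single batch, study samples lacking a QC injection on both sides are skipped, and the CV uses the sample standard deviation.

```python
from statistics import mean, stdev
from typing import Dict, List, Sequence


def qc_bracket_normalize(run: Sequence[float], is_qc: Sequence[bool]) -> List[float]:
    """Divide each study-sample intensity by the mean of its two bracketing QCs."""
    qc_positions = [i for i, flag in enumerate(is_qc) if flag]
    normalized: List[float] = []
    for i, value in enumerate(run):
        if is_qc[i]:
            continue  # QC injections themselves are not reported
        before = [p for p in qc_positions if p < i]
        after = [p for p in qc_positions if p > i]
        if before and after:  # assumption: skip samples that are not bracketed
            bracket_mean = mean([run[before[-1]], run[after[0]]])
            normalized.append(value / bracket_mean)
    return normalized


def reference_normalize(values: Sequence[float],
                        reference_values: Sequence[float]) -> List[float]:
    """Divide QC-normalized values by the mean of the batch's reference samples."""
    ref_mean = mean(reference_values)
    return [v / ref_mean for v in values]


def percent_cv(values: Sequence[float]) -> float:
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * stdev(values) / mean(values)


def mean_percent_cv(metabolites: Dict[str, Sequence[float]]) -> float:
    """Average the percent CV over all metabolites in one group of samples."""
    return mean(percent_cv(v) for v in metabolites.values())
```

Applied per peak and per batch, `qc_bracket_normalize` followed by `reference_normalize` mirrors the order given in the text (QC first, reference second). `mean_percent_cv` then summarizes one location or one genotype-by-location group, in the manner of the mean percent CV values reported in Tables 1 and 2.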
To evaluate the effect of genotype, p-values of every metabolite for a given genotype were compared to those from the other genotypes at the same location. To further determine the consistency of those metabolites that were altered due to environment or genotype, we first identified those metabolites that were statistically significant from one location compared to the other five (a total of 15 comparisons were made). We then calculated the number of paired comparisons (locations) in which the abundances of each of those metabolites were significantly altered (p-value < 0.01); for example, if a metabolite was altered between Kansas and Illinois, was it also altered in Kansas vs Texas, etc. A similar exercise was performed to identify those metabolites that
■ MATERIALS AND METHODS
Chemicals. Acetonitrile was purchased from Fisher Scientific (Fair Lawn, NJ, USA). Water was purified to 18.1 MΩ·cm using a Barnstead E-Pure System (Barnstead International, Dubuque, IA, USA). Methanol was purchased from EMD Millipore Corporation (Billerica, MA, USA). Formic acid (98−100% purity) was purchased from EMD Chemicals (Gibbstown, NJ, USA). Met-Arg-Phe-Ala (MRFA), purchased from Research Plus, Inc. (Barnegat, NJ, USA), and taurocholic acid sodium salt hydrate, purchased from Sigma-Aldrich (St. Louis, MO, USA), were used as internal standards.
Plant Material. Fifty genetically diverse non-GM commercial maize hybrids from DuPont Pioneer were planted at six different locations. Five of the locations were in the U.S. (Illinois, Kansas, Minnesota, Nebraska, and Texas), and the sixth location was in Ontario, Canada. Planting locations for the hybrids were selected based on days to maturity such that each location had 20 unique genotypes. At every location, each genotype was planted in three randomized blocks (three blocks and two replicate plots per block). Each block was separated by an alley at least 36 in. wide and surrounded on each end by two-row borders. Details and a pictorial representation are shown elsewhere.10 Agronomic practices such as irrigation, fertilization, and herbicide and pesticide applications were applied uniformly across locations and were consistent with the normally acceptable practices for maize production. From the 50 genotypes, a core subset of five that were grown at all six locations was used for LC/MS analysis. Two forage samples were collected from three plants after flowering for each genotype and block and placed immediately on dry ice. Each forage sample represented the pooled aerial portions of three entire plants. Collected frozen samples were stored temporarily at less than or equal to −10 °C for less than 10 min. Each grain sample was collected at maturity from five hand-pollinated ears from each genotype and block.
For each sample, the resulting grain from five shelled ears was pooled and immediately placed on dry ice until transferred to a less than or equal to −10 °C freezer for temporary storage. The same harvesting protocol was adopted in all locations. Samples were shipped frozen to the DuPont Pioneer Regulatory Science processing laboratory in Ankeny, Iowa, where they were lyophilized and ground. Lyophilized forage and grain samples were then shipped on dry ice to the DuPont Pioneer metabolomics laboratory in Johnston, Iowa, where they were stored at −80 °C until analyzed.
Sample Preparation. For forage samples, metabolites were extracted from lyophilized tissue with a mean dry weight of 6.7 mg. Metabolites were extracted from grain samples with a mean dry weight of 29.9 mg. Five hundred microliters of cold extraction solvent (80% methanol/20% water) containing 10 mg/mL Met-Arg-Phe-Ala (MRFA) and 5 mg/mL taurocholate as internal standards for positive and negative electrospray, respectively, was added to each sample and reference sample in a 1.1 mL polypropylene microtube containing two 5/32 in. stainless steel ball bearings. Samples were homogenized in a 2000 Geno/Grinder ball mill (SPEX CertiPrep, Inc., Metuchen, NJ) at a setting of 1650 rpm for 1 min and then rotated on an end-over-end mixer (Glas-Col, LLC, Terre Haute, IN) at 4 °C for 30 min. Samples were then centrifuged at 1454g at 4 °C for 15 min. The supernatant of each sample was transferred to a 0.2 μm spin filter (VWR International LLC, Radnor, PA) and centrifuged at 800g. The solution was transferred to a limited volume autosampler vial. Forage and grain samples were separately arranged into 10 analytical batches. Each batch contained samples from every genotype and location to minimize analytical errors and system bias in the acquired data. We prepared four quality control (QC) samples for each batch by pooling extract aliquots from each individual sample within the batch.
In addition, we ran three reference samples in each batch that were prepared from pooled grain or forage material. The reference samples were sourced from grain or forage originating from all locations and containing all genotypes. Both QC and reference samples were placed at regular intervals within each batch.
Liquid Chromatography−Mass Spectrometry. Analyses were performed on an Accela ultra high pressure liquid chromatograph
Table 1. Mean Percent CV of Metabolomics Data from Different Locations in (a) Forage and (b) Grain Samples Using LC/MS Data Set Alone and in Combination with GC/MS Data Set

(a) Forage
location    # of samples   LC/+ESI/MS   LC/−ESI/MS   LC/±ESI/MS   GC/MS and LC/MS
Illinois    30             41.63        38.11        41.48        41.46
Kansas      30             43.86        37.25        43.49        43.49
Minnesota   29             37.74        34.13        37.31        36.60
Nebraska    30             42.48        37.56        42.40        44.73
Ontario     23             43.48        35.96        43.02        42.87
Texas       30             48.73        45.55        48.39        50.97

(b) Grain
location    # of samples   LC/+ESI/MS   LC/−ESI/MS   LC/±ESI/MS   GC/MS and LC/MS
Illinois    28             47.59        57.16        49.83        52.95
Kansas      27             45.9         62.79        49.86        55.7
Minnesota   29             44.85        61.22        48.69        52.51
Nebraska    30             47.95        56.94        50.05        59.42
Ontario     29             46.71        65.61        51.14        54.41
Texas       29             43.77        53.11        45.95        49.37
were altered due to genotype. For each location, we first determined the number of genotype paired comparisons in which a particular metabolite exhibited a significantly altered abundance (a total of 10 comparisons were made for every location). Comparisons across different locations asked the same question; for example, if metabolite X was altered between entries 1 and 2 in Kansas, was it also altered in other locations? Finally, those metabolites that were altered by both the environment and genotype were identified.
Multivariate Analysis. Principal component analysis (PCA) was performed using Matlab version R2010b (Mathworks) with the PLS Toolbox version 6.0.1 (Eigenvector Research Inc., Wenatchee, WA). All data sets were normalized and autoscaled prior to modeling. The R statistical package (version 2.12.1) was used for hierarchical cluster analysis (HCA) to generate dendrograms and heat maps and to calculate correlations between metabolites.
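The pairwise bookkeeping described under Univariate Analysis (15 location pairs, and 10 genotype pairs within each location) can be sketched in Python as follows. The function name and the `p_value` callback are hypothetical: the callback stands in for whatever routine returns the paired t test p-value for one metabolite and one pair of groups, and the p-values in the usage lines are invented purely to show the counting.

```python
from itertools import combinations
from typing import Callable, Hashable, Sequence

LOCATIONS = ["Illinois", "Kansas", "Minnesota", "Nebraska", "Ontario", "Texas"]
GENOTYPES = [1, 2, 3, 4, 5]


def count_significant_pairs(groups: Sequence[Hashable],
                            p_value: Callable[[Hashable, Hashable], float],
                            alpha: float = 0.01) -> int:
    """Count the pairwise group comparisons in which one metabolite has p < alpha."""
    return sum(p_value(a, b) < alpha for a, b in combinations(groups, 2))


# Six locations give 15 unordered pairs; five genotypes give 10 per location.
n_location_pairs = len(list(combinations(LOCATIONS, 2)))
n_genotype_pairs = len(list(combinations(GENOTYPES, 2)))

# Invented p-values for a single metabolite, for illustration only.
example_p = {("Illinois", "Kansas"): 0.004, ("Kansas", "Texas"): 0.0002}


def lookup(a: Hashable, b: Hashable) -> float:
    """Return the invented p-value for a pair, or 1.0 if none was listed."""
    return example_p.get((a, b), 1.0)


n_altered = count_significant_pairs(LOCATIONS, lookup)
```

A metabolite with a high count across the 15 location pairs is one that is consistently altered by environment; running the same function on `GENOTYPES` within each location gives the corresponding genotype count.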
Table 2. Mean Percent CV of Metabolomics Data from Different Genotypes in Different Locations in (a) Forage and (b) Grain Samples Using LC/MS Data Set

(a) Forage
genotype   Illinois   Kansas   Minnesota   Nebraska   Ontario   Texas
1          36.36      32.58    31.79       37.59      39.54     42.36
2          36.44      39.59    30.71       37.34      35        32.75
3          31.14      30.28    30.24       39.54      32.91     45.52
4          41.28      37.56    28.56       33.8       35.68     49.09
5          34.88      34.74    35.96       34.36      38.06     40.44

(b) Grain
genotype   Illinois   Kansas   Minnesota   Nebraska   Ontario   Texas
1          38.96      44.23    39.1        38.94      34.89     32.53
2          37.02      38.15    36.22       31.19      36.63     38.29
3          37.53      42.01    40.52       47.03      34        39.46
4          38.77      35.81    35.32       35.69      37.12     35.17
5          33.4       36.1     36.12       30.62      33.5      35.17

■ RESULTS AND DISCUSSION
Data Variability in Grain and Forage (Coefficient of Variance Analysis). To evaluate the effect of the environment, we compared the mean percent CV of all metabolites detected using samples grown at different locations (Table 1). The mean percent CV for all metabolites in forage samples was lower than for grain samples grown at the same locations, with the exception of Texas. Similar trends were observed in the mean percent CV using LC/MS data acquired in positive or negative ionization mode analyzed independently, although more metabolites were detected in the positive mode (866 metabolites) compared to the negative mode (277 metabolites), as shown in Supporting Information Table S1. This result was also consistent with what we observed using GC/MS data, in which grain samples had the higher mean percent CV compared to forage samples originating from the same location. To investigate the effect of different genotypes on data variability, we compared the mean percent CV of all metabolites from different genotypes grown at different locations, as shown in Table 2. The number of samples per genotype and location is shown in Supporting Information Table S2. These results allowed for the comparison of the mean percent CV of one genotype to another within and across different environments. We did not observe any differences in mean percent CV across different locations for any genotype, indicative of good reproducibility across different locations. In addition, mean percent CV values for different genotypes at the same location were very similar to those for the same genotype at different locations, with the exception of a few cases. For example, grain
samples from Minnesota had higher mean percent CV values compared to forage samples.
Univariate Analysis, Effect of Environment. To determine the number of metabolites detected by LC/MS that were affected by the environment (location), we calculated and compared p-values for the relative amount of each metabolite from one location to another in both grain and forage samples. Unless stated otherwise, herein and hereafter, LC/MS data sets comprised combined positive and negative ionization mode data. Table 3a,b shows the percentage of metabolites with statistically significantly altered levels (p-values