Article pubs.acs.org/ac
Integrated Analysis of Seaweed Components during Seasonal Fluctuation by Data Mining Across Heterogeneous Chemical Measurements with Network Visualization
Kengo Ito,† Kenji Sakata,‡ Yasuhiro Date,†,‡ and Jun Kikuchi*,†,‡,§,¶
† Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehirocho, Tsurumi-ku, Yokohama 230-0045, Japan
‡ RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 235-0045, Japan
§ Biomass Engineering Research Program, RIKEN Research Cluster for Innovation, 2-1 Hirosawa, Wako 351-0198, Japan
¶ Graduate School of Bioagricultural Sciences and School of Agricultural Sciences, Nagoya University, 1 Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
ABSTRACT: Biological information is intricately intertwined with several factors. Therefore, comprehensive analytical methods such as integrated data analysis, combining several data measurements, are required. In this study, we describe a method of data preprocessing that can perform comprehensively integrated analysis based on a variety of multimeasurement of organic and inorganic chemical data from Sargassum fusiforme and explore the concealed biological information by statistical analyses with integrated data. Chemical components including polar and semipolar metabolites, minerals, major elemental and isotopic ratio, and thermal decompositional data were measured as environmentally responsive biological data in the seasonal variation. The obtained spectral data of complex chemical components were preprocessed to isolate pure peaks by removing noise and separating overlapping signals using the multivariate curve resolution alternating least-squares method before integrated analyses. By the input of these preprocessed multimeasurement chemical data, principal component analysis and self-organizing maps of integrated data showed changes in the chemical compositions during the mature stage and identified trends in seasonal variation. Correlation network analysis revealed multiple relationships between organic and inorganic components. Moreover, in terms of the relationship between metal group and metabolites, the results of structural equation modeling suggest that the structure of alginic acid changes during the growth of S. fusiforme, which affects its metal binding ability. This integrated analytical approach using a variety of chemical data can be developed for practical applications to obtain new biochemical knowledge including genetic and environmental information.
Received: August 30, 2013 Accepted: December 19, 2013 Published: January 8, 2014
© 2014 American Chemical Society
dx.doi.org/10.1021/ac402869b | Anal. Chem. 2014, 86, 1098−1105

Since the theory of evolution was proposed by Charles Darwin, biological variety has very often been characterized by genetic factors, particularly nucleotide sequence information, in modern biological science.1 Although such a research approach is never in doubt, biological information is also influenced by environmental factors, resulting in chemical diversity even within the same species. For example, the taste of food varies with harvesting season or region of origin because the sum of its constituent chemicals is strongly affected by the surrounding environment. Therefore, environmental metabolomics have now gained considerable attention as a method of understanding the mechanism of variation of the sum of composite metabolites with growth environment.2 The environmental metabolomics of aquatic organisms is a subject of increasing concern in relation to food and environmental issues.3,4 However, information on the evaluation of constituted chemical diversity is scarce because the high molecular biomass, which is usually untargeted in the metabolomics, is abundantly included because minerals are also important constituents of seaweed systems. Approximately 8 million tons of wet seaweeds are annually harvested worldwide; stranded seaweeds on the beach constitute a considerable part of this.5 Stranded seaweeds are harvested from beaches and utilized for a variety of purposes such as feed, fertilizer, and a source of raw material for industrial production of phytochemicals of commercial importance.6 In addition, seaweeds have a biosorption function, and the metal-sorbing potential of their constituent polysaccharides has been indicated.7−10 To measure chemical components such as macromolecules, metabolites, and elements, a variety of analytical methods and their integrated data analysis are required for comprehensive characterization and evaluation of the seaweed biomass.

Several measurement instruments can be used for macromolecular evaluation. For example, we have recently demonstrated supramolecular and compositional characterization of cell wall components of terrestrial plants by Fourier transform infrared spectroscopy (FT-IR), solid- and solution-state nuclear magnetic resonance (NMR), and thermogravimetry-differential thermal analysis (TG-DTA).11−15 Furthermore, inductively coupled plasma-optical emission spectrometry (ICP-OES) and CHNS/O total elemental analysis can be used for characterization of elemental composition. Among them, the highly useful polysaccharides in seaweeds have been widely measured using NMR and characterized by various statistical analyses.16 The numeric data acquired by several instruments are analyzed on the basis of chemometric techniques.17 In our previous study, for example, heterogeneous measurement data obtained by different instruments were successfully combined by correlation analysis (one of the chemometric techniques), and the combined data have revealed biological relationships in the ecosystems.18,19 Therefore, chemometrics is useful for exploring features and relationships between numerical values obtained by different instruments.

Components included in natural samples vary flexibly by responding to environmental conditions. Metabolomics can track the metabolic pathway and provide the relationship between the environment and metabolites affected by metabolism by responding to environmental variation.20−22 Recent metabolome studies have demonstrated that the same species of food ingredients varies according to the area of production (i.e., different growing environment) and that variations in the hydrosphere environment affect the metabolic profile of aquatic organisms.23,24 Therefore, it is important to capture and trace the variations of relative abundance in each component and the relationships between the components. To capture the variations, comprehensive measurements based on “relative” quantifications, such as metabolomic and ionomic approaches (i.e., nontargeted analysis and data-driven approaches) are suitable and useful compared to traditional “absolute” quantification methods for each component (i.e., targeted analysis and hypothesis-driven approaches).20 Using comprehensive measurements of relative variations, the covariations (i.e., synchronous and continuous variations) between chemicals and/or elements can be highlighted by viewing the relationships based on combined relative variation data.

Integrated analysis incorporating various data measurements can comprehensively detect variations in chemicals over time or relationships between organic and inorganic components. However, no useful data processing technique with combination of existing methods and optimization has been developed, which can perform comprehensive evaluation based on chemometrics using several measurements. Therefore, a chemometric approach to comprehensively evaluate organisms can be a critical technology for bioinnovation. Integrated analysis based on chemometrics using plant metabolites and elements has been reported in a recent study.25,26 Integrated analysis of metabolites and elements provides new information in areas such as reaction to environmental changes and biological functions. However, these studies had a decided target source. Biological information is intricately intertwined with several other factors such as metabolites, genes, and environmental factors. For example, a metabolomic study of rice using two analytical platforms of NMR and gas chromatography/mass spectrometry suggested that environmental conditions affect metabolic responses.27 Therefore, constituted chemicals should be measured as broadly and comprehensively as possible to understand the biological system. Integrated analysis using data measured in this way without a predetermined target can see the natural situation without losing biological information and, thus, provide new biochemical knowledge in areas such as biological mechanisms or complex biological systems. The study of metabolomics in the hydrosphere is important to explore the roles and functions of complex biological substances, but such a comprehensive approach has not been applied to the study of seaweed thus far.

Integration of spectral data with other data to perform statistical analysis can be difficult. Without extracting each fraction from a natural sample, pure peaks overlap in IR and NMR spectra. In this case, noise and misalignment affect the results of statistical analysis and relationships and tendencies cannot be clearly seen. If a fraction is extracted, there is a loss of information as well as a lot of difficulty in performing the extraction. For comprehensive evaluation of biomass using integrated data including spectra, spectral data should be preprocessed by spectral decomposition. In this study, a data preprocessing technique for integrated data analysis combining various measurements was constructed and a plant from the hydrosphere was comprehensively evaluated using several methods of statistical analysis. The experimental species in this study, Sargassum fusiforme, contains abundant polysaccharides associated with various minerals, and was used to evaluate variations in chemical composition with the seasons. To evaluate seasonal environmental fluctuations in S. fusiforme, we chose a natural sampling point in Sagami Bay, Japan, which has long been favored by Japanese emperors for the study of aquatic organisms.28,29
■ MATERIALS AND METHODS
Samples. The natural samples of the brown algae S. fusiforme used in this study were collected from an intertidal area at Aburatsubo in Miura City, Kanagawa, Japan (35°16′ N, 139°62′ E) between May 2011 and April 2012 (one or two times per month) (Supplementary Figure 1 in the Supporting Information). The 17 collected samples were washed with distilled water, the holdfast organ and any contaminants attached to the samples were removed, and samples were frozen at −30 °C. Lyophilized samples were crushed to a powder using an Automill machine (Tokken, Inc., Chiba, Japan).

Biomass Measurements. All 13C solid-state NMR spectra were recorded on a DRX-500 spectrometer (500 MHz BrukerBioSpin, Billerica, MA) operating at 500.13 MHz with a Bruker MAS VTN 500SB BL4 probe, using a 4-mm cross-polarization−magic angle spinning (CP−MAS) probe head, as described in a previous study.14,30,31 In solution-state NMR, one-dimensional (1D) Watergate (WG)32 spectra of these samples were acquired at 298 K on a 700 MHz Bruker Biospin NMR instrument (AVANCE II-700) equipped with an inverse (proton coils closest to the sample) gradient 5 mm Cryo 1H/13C/15N probe (Bruker Biospin, Rheinstetten, Germany). The two-dimensional (2D) 1H−13C heteronuclear single quantum coherence (HSQC) method for NMR measurements has been previously described.18,33−35 The ATR−FT-IR spectra (650−4000 cm−1) of the lyophilized and powdered samples were obtained using a Nicolet 6700 FT-IR spectrometer (Thermo Fisher Scientific Inc., Waltham, MA) with KBr disks, according to a previously published method.18 Thermogravimetric analysis was conducted using an EXSTAR TG/DTA 6300 (SII Nanotechnology Inc., Tokyo, Japan) instrument, according to a previously described method.13 The ICP-OES analysis was conducted using an SPS 5510 (SII Nanotechnology Inc., Tokyo, Japan) instrument with CCD detector, with a range of wavelengths from 167 to 785 nm and 74 applicable elements.36 Elemental analysis of the biomass samples was performed with a CHNS/O analyzer (Vario Micro cube, Elementar Analysensysteme GmbH, Hanau, Germany) using helium as the carrier gas. Such analysis simultaneously gives the weight percent of carbon, hydrogen, nitrogen, and sulfur in the samples.37 The isotope ratio mass spectrometry (IR-MS) analysis was performed on an IsoPrime 100 (Jasco International Co., Ltd., Tokyo, Japan) in combination with an elemental analyzer (Vario MICRO cube) in “CN mode,” and isotopic ratios of carbon and nitrogen in the samples were measured as CO2 and N2 gases using IR-MS.

Figure 1. Concept and strategies for our technological advances. (1) Several instruments were used to comprehensively extract the components of S. fusiforme. Evaluation of intact biomass components was conducted by solid-state NMR and FT-IR. Characterization of thermal decomposition profiles used TG-DTA. Metabolic profiles of S. fusiforme were obtained by solution-state NMR. Seasonal variation of elemental content was measured by ICP-OES, CHNS/O analyzer, and IR-MS. (2) PCA and SOM statistical techniques were used to comprehensively extract features of the components of S. fusiforme. (3) The relationships between organic and inorganic matter were explored by network analysis. (4) SEM was used to verify the relationship between organic and inorganic matter estimated by network analysis. If the model shows a statistically good fit, biochemical phenomena can be suggested.

Spectra Data Preprocessing. All solution 1H NMR spectra were manually phased and baseline-corrected and normalized to DSS intensity using Excel. All solid-state 13C CP−MAS NMR spectra were manually phased and baseline-corrected and normalized to total integral area. All FT-IR spectra were normalized to 650 cm−1 intensity. Solution NMR spectra were processed using Topspin software (Bruker Biospin). A 1H−13C HSQC spectrum of S. fusiforme collected in May 2011 was processed by peak picking of discriminable peaks from noise, and candidate annotation was performed using the SpinAssign program on the PRIMe Web site (http://prime.psc.riken.jp/).38−40 Intensity scores were extracted from relative peaks in 1H NMR spectra.
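The normalization schemes named above (to a reference intensity such as the DSS peak or the 650 cm−1 point, and to total integral area) can be sketched in Python as follows. This is only a minimal illustration and not the authors' workflow, which the text reports was carried out with Excel and Topspin; the function names, the toy intensity values, and the rectangle-rule integral with uniform point spacing are all assumptions made for this example.

```python
from typing import List, Sequence


def normalize_to_reference(intensities: Sequence[float], ref_index: int) -> List[float]:
    """Divide every point by the intensity at a chosen reference position
    (for example the DSS peak in 1H NMR or the 650 cm-1 point in FT-IR)."""
    ref = intensities[ref_index]
    if ref == 0:
        raise ValueError("reference intensity is zero")
    return [v / ref for v in intensities]


def normalize_to_total_area(intensities: Sequence[float], step: float = 1.0) -> List[float]:
    """Divide every point by the total integral area.
    Assumption: uniform point spacing `step`, rectangle-rule integration."""
    area = sum(intensities) * step
    if area == 0:
        raise ValueError("total area is zero")
    return [v / area for v in intensities]


# Hypothetical toy spectra (not measured data).
ftir = [2.0, 4.0, 6.0]    # index 0 stands in for the 650 cm-1 point
cpmas = [1.0, 3.0, 4.0]   # stands in for a 13C CP-MAS trace

ftir_norm = normalize_to_reference(ftir, ref_index=0)
cpmas_norm = normalize_to_total_area(cpmas)
print(ftir_norm)
print(cpmas_norm)
```

After either normalization, spectra from different samples share a common scale, which is what allows intensity scores from different sampling dates to be compared.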
Multivariate Spectral Decomposition. In a recent study, the multivariate curve resolution alternating least-squares (MCR-ALS) method was suggested as a useful method for separating the overlapping peaks in 13C CP−MAS spectra.16 MCR components were fitted with Gaussian distributions using the program Fityk (http://fityk.nieto.pl/).

Statistical Analyses of Integrated Data. Principal component analysis (PCA) was used to explore any biomass component clustering based on intrinsic biochemical similarities between the samples in annual variation. In this study, the data from each measurement were integrated after scaling, and the integrated data set was subsequently imported into “R” software (http://www.r-project.org/). PCA was calculated using a correlation matrix. Self-organizing maps (SOMs) can be considered as a nonlinear mapping technique, which identifies clusters in an unsupervised way within data sets without the rigid assumptions of linearity or normality associated with traditional statistical techniques. In this study, all calculations concerning SOM were performed by the Kohonen library on the R platform. Correlation analysis enables identification of relationships between several components. In this study, Pearson’s correlations of integrated data were calculated using “R” software. Pairs of variables with a high correlation coefficient (|r| > 0.7) were collected from all correlation coefficients and transformed into a matrix of connections between source and target nodes. The transformed data matrix was imported into the network analysis software Gephi (https://gephi.org/). Structural equation modeling (SEM) is a technique of multivariate statistical analysis. In this study, SEM was used to verify the relationship between elements and metabolites estimated by correlation network analysis (CNA). SEM was conducted using AMOS software (http://www-03.ibm.com/software/products/jp/ja/spss-amos/) for Windows.
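The thresholding step of the correlation analysis described above (keeping only variable pairs with |r| > 0.7 and listing them as source and target) can be sketched in Python. The study itself reports using R and Gephi, so the code below is a stand-in written under stated assumptions: the function names, the dictionary layout, and the toy values are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a constant variable is treated as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def correlation_edges(data: Dict[str, Sequence[float]],
                      threshold: float = 0.7) -> List[Tuple[str, str, float]]:
    """Return (source, target, r) for every variable pair with |r| > threshold."""
    edges = []
    for a, b in combinations(sorted(data), 2):
        r = pearson(data[a], data[b])
        if abs(r) > threshold:
            edges.append((a, b, round(r, 3)))
    return edges


# Hypothetical toy values across four samples (not measured data).
toy = {
    "Fe": [1.0, 2.0, 3.0, 4.0],
    "Al": [2.0, 4.0, 6.0, 8.0],    # rises together with Fe
    "K": [4.0, 3.0, 2.0, 1.0],     # falls as Fe rises
    "X": [1.0, -1.0, -1.0, 1.0],   # no linear relationship with the others
}
edges = correlation_edges(toy)
for source, target, r in edges:
    print(source, target, r)
```

The sign of r is kept on each edge so that positive and negative correlations can be drawn differently in a network view; the variable "X" produces no edge because none of its correlations exceeds the threshold.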
■ RESULTS AND DISCUSSION
Concept and Strategies of Integrated Analyses Across Heterogeneous Measurements. The concept and strategies behind our technological advances are shown in Figure 1. Several instruments were used to comprehensively extract components of S. fusiforme. Solid-state NMR and FT-IR were used to evaluate intact biomass components, TG-DTA was used to characterize thermal decomposition profiles, solution-state NMR was used to evaluate metabolic profiles of S. fusiforme, and ICP-OES, CHNS/O analyzer, and IR-MS were used to examine the seasonal variation of elemental contents. For integrated analyses of multiple measurements, preprocessing steps were performed for all measurement data. Noise and tiny differences in the chemical shift of the same metabolite affect the results of statistical analysis using integrated and spectral data. In addition, spectra of natural samples are not suitable for evaluation as single components because different components of the spectra overlap. Preprocessing is therefore required for integrated data analysis using spectra from natural samples. In this study, this preprocessing involved estimating and extracting unmixed component peaks, and removing the noise and displacement of chemical shifts. In this case, MCR-ALS was a useful method for the statistical analysis of integrated data because it is able to extract a Gaussian peak as a single component peak. Spectral data were preprocessed by MCR-ALS and peak picking prior to the integrated data analyses. All measurement data after preprocessing were integrated to one matrix and normalized.

PCA and SOM using integrated data can extract comprehensive features of the overall components, whereas PCA of one set of measurement data reveals only part of the biological information. Therefore, all data extracted by data preprocessing were integrated for comprehensive feature extraction of biomass by statistical analyses. An integrated data matrix containing all measurements of 17 samples × 166 variables was constructed. All organic and inorganic chemical data were integrated to identify seasonal features of S. fusiforme components using PCA and SOM. PCA was performed using a correlation matrix. CNA using integrated data can explore the relationships between organic and inorganic matter. Calculating the correlation coefficient is useful for investigating the relationships between metabolites and elements. Two-variable plots and heat maps were used to visualize the correlation coefficient. In biology, an integrated analysis approach can reveal several relationships using correlation coefficients visualized by heat maps and networks.41−44 However, few complex relationships are seen in this case. Network analysis is a multiple data mining method involving several components. Therefore, CNA was used in this study to explore complex relationships within the integrated data. Data with a high correlation coefficient (|r| > 0.7) were collected and used for CNA after calculation of Pearson’s correlation of integrated data. Groups were created by the OpenOrd algorithm and manual layout. SEM using integrated data can verify specific relationships. CNA determines the relationship but not the causal relationship. SEM can suggest the causal mechanism statistically using the relationship estimated by CNA.

Evaluation of Intact Biomass Components by Solid-State NMR and FT-IR. Using FT-IR, 17 samples were measured to evaluate seasonal tendencies and features. In samples of the same species, spectral data showed little seasonal variation in their biomass components; therefore, PCA was subsequently performed to evaluate seasonal features. From a score plot of PC1 vs PC2, the sample from July 5th showed a characteristically different profile to that of other samples (Supplementary Figure 2A in the Supporting Information). Assignment of alginic acid in S. fusiforme was based on previous reports45,46 of alginic acid from brown algae. O−H (3150−3250 cm−1), COO− (asymmetric) (1600 cm−1), and C−C and C−O stretching (1120−1130 cm−1) of alginic acid were used to produce a loading plot (Supplementary Figure 2a in the Supporting Information). CP−MAS spectra were assigned to the mannuronic and guluronic acid components of the alginic acid monomer, according to published references.16,47 Samples from December to March were clustered on the PC1 axis of the PCA score plot (Supplementary Figure 2B in the Supporting Information). The positive scores of PC1 and PC2 resulted in a high MG ratio (mannuronic:guluronic acid), which can be seen from the loading plot (Supplementary Figure 2B in the Supporting Information).

Characterization of Thermal Decomposition Profiles. Derivative thermogravimetry (DTG) was used to evaluate the seasonal variation in thermal decomposition profiles. Chemical standards of major polysaccharides in S. fusiforme were measured by TG−DTA, and DTG spectra of S. fusiforme were assigned using standard spectra. Thus, the peak from 230 to 240 °C was assigned to alginic acid, whereas the peak from 300 to 310 °C was assigned to laminarin. In the PCA of DTG, score plots showed seasonal changes from June to August.
Thermal decomposition profiles of alginic acid peaks in this season indicated greater changes than those in other seasons (Supplementary Figure 2C and c in the Supporting Information).

Metabolic Profiles of S. fusiforme. Water-soluble and methanol-soluble fractions were evaluated by 1H and HSQC NMR spectra with KPi/D2O and MeOD extraction, respectively. The assignment of 1H NMR spectra is illustrated in Supplementary Figure 3A,B in the Supporting Information. Seasonal variations in the water-soluble and methanol-soluble fractions were evaluated by applying PCA to the 1H NMR data; factor loadings were assigned on the basis of HSQC assignments (Supplementary Figure 4A,B and Supplementary Tables 1 and 2 in the Supporting Information). No seasonal variation was discernible on the PCA score plot because the mannitol signal was of higher intensity than those of other components (Supplementary Figure 2D,d,F,f in the Supporting Information). Therefore, PCA was repeated after removing the mannitol peaks in the binning data. The modified score plot showed seasonal features when samples from June to July were plotted on the PC1 axis (Supplementary Figure 2E in the Supporting Information) and samples from December to April were plotted on the PC2 axis (Supplementary Figure 2G in the Supporting Information).

Seasonal Variation of Elemental Content. The elemental content of S. fusiforme was evaluated using ICP-OES, CHNS/O analyzer, and IR-MS instruments. Time series and correlation analyses were used for biomass evaluation. Levels of aluminum, titanium, iron, silicon, and manganese increased around July and were all highly correlated (Supplementary Figure 5 in the Supporting Information). In contrast, phosphorus and potassium showed a negative correlation with the other elements. The PCA results and time series analysis suggest distinctive features of the different seasons and a tendency for these features to change over the growth cycle.
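The step reported under Metabolic Profiles, in which the mannitol peaks were removed from the binned 1H NMR data before PCA was repeated, can be illustrated with a short Python sketch. It shows how a single intense signal can account for nearly all of the variance in unscaled binned data and how the corresponding bins can be dropped. The excluded window (3.6 to 3.9 ppm), the bin centers, the intensity values, and the function names are hypothetical; the paper does not state which bins were removed.

```python
from typing import List, Sequence, Tuple


def variance_share(spectra: Sequence[Sequence[float]]) -> List[float]:
    """Fraction of the total sample variance contributed by each bin (column)."""
    n = len(spectra)
    variances = []
    for col in zip(*spectra):
        mean = sum(col) / n
        variances.append(sum((v - mean) ** 2 for v in col) / (n - 1))
    total = sum(variances)
    return [v / total for v in variances]


def drop_bins(ppm: Sequence[float],
              spectra: Sequence[Sequence[float]],
              exclude: Sequence[Tuple[float, float]]):
    """Remove bins whose center lies inside any (low, high) ppm window."""
    keep = [i for i, p in enumerate(ppm)
            if not any(lo <= p <= hi for lo, hi in exclude)]
    kept_ppm = [ppm[i] for i in keep]
    kept_spectra = [[row[i] for i in keep] for row in spectra]
    return kept_ppm, kept_spectra


# Hypothetical binned data: three samples x four bins (not measured data).
ppm = [1.3, 3.7, 3.8, 5.2]
spectra = [
    [1.0, 50.0, 40.0, 2.0],
    [1.2, 80.0, 70.0, 1.5],
    [0.9, 20.0, 10.0, 2.5],
]

share = variance_share(spectra)                # the two intense bins dominate
kept_ppm, kept_spectra = drop_bins(ppm, spectra, exclude=[(3.6, 3.9)])
print(share)
print(kept_ppm)
```

In this toy case the two intense bins carry more than 99% of the total variance, so an unscaled PCA would mostly describe them; after they are dropped, the remaining bins determine the components.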
Data Preprocessing before Integrated Data Analyses (Multivariate Spectral Decomposition). Single component peaks in FT-IR, DTG, and CP−MAS spectra were extracted using the MCR-ALS method. After using multivariate spectral decomposition to avoid picking noise, 20 peaks were estimated in FT-IR and CP−MAS and 5 peaks were estimated in DTG (Figure 2). A CP−MAS peak of approximately 50−100 ppm overlapped with several other peaks in this region. In FT-IR spectra, multiple peaks were hidden between 1000 and 2000 cm−1. Thus, area and intensity scores of the Gaussian function were extracted for integrated data analyses. In addition, deconvolution peaks of Gaussian function were produced as a correction peak for fitting. The normality of the height and area of the Gaussian functions was verified by the Kolmogorov−Smirnov (KS) test. Almost all variables followed normal distributions (Supplementary Table 3 in the Supporting Information).

Figure 2. Multivariate spectral decomposition of FT-IR (A), DTG (B), and CP−MAS (C) spectra in sample 1 using Fityk software. Black dots indicate binning data. Red lines show Gaussian peaks estimated as pure components and deconvolution peaks, and the fitting line is represented in blue. The values of area and height of each Gaussian peak were extracted for integrated data analyses. Central positions and height and area of the Gaussian function were validated by the Kolmogorov−Smirnov test (Supplementary Table 3 in the Supporting Information).

Comprehensive Feature Extraction. Supplementary Figure 6 in the Supporting Information shows the results of PCA on PC1 to PC3 and contains the information of approximately half of the integrated data set. Score plots of samples from July and other seasons were clustered on the PC1 axis. In addition, the PC2 and PC3 axes showed a tendency for seasonal variation. Furthermore, SOM showed the same result as PCA on PC2 and PC3 (Supplementary Figure 7 in the Supporting Information). Loading plots of the PC1 axis on PCA show that mannitol, alginic acid, citrate, L-glutamate, and L-glutamine appear in the positive direction as metabolites. Silicon, boron, magnesium, iron, aluminum, manganese, titanium, yttrium, sodium, strontium, and cadmium appear in the positive direction as inorganic materials, while delta 13 carbon and the C/N ratio (carbon/nitrogen) appear in the positive direction as organic materials. It is known that the C/N ratio is low in young plants but high in mature plants. In China, maturation begins in mid-April (seawater temperature, 19 °C−21 °C), reaches a peak in mid-May (maturation rate ∼70%; seawater temperature, 23.5 °C−25 °C), and finishes in late June (seawater temperature, 27.5 °C−30 °C).48 Further, a study of other algae revealed that the C/N weight ratios
showed a marked seasonal variation, with low and almost constant values (approximately 10) during September through March, a sharp increase in April reaching a maximum in June−July, followed by a decrease in August−September. The highest values in June−July were 37 for Macrocystis integrifolia and 24 for Nereocystis luetkeana, which indicate a 2- to 4-fold change in C/N ratio during the year.49 The environmental condition of seawater varies according to area and year, but the seasonal variation of the C/N ratio found in this study closely correlates to the result of another study.49 Therefore, the scores suggested separation by maturation factor on the PC1 axis and show that the amounts of several biomasses widely varied in this season. The results of loading plots on the PC2 and PC3 axes of SOM show that the acidic polysaccharide alginic acid increases in summer, while laminarin, a source of stored sugar, increases in winter. In a recent study, results of integrated PCA provided strong evidence of the relationship between changes in foliar C/N/P/K stoichiometry of Erica multiflora and changes in the leaf’s metabolome during plant growth and environmental stress.25 In addition, in this study, some metabolites were related to some minerals. Analysis by PCA and SOM using integrated data of several measurements is able to show multiple tendencies and evaluate variations in biomass. In addition, integrated PCA can reveal tendencies for similar variation by different factors.

Figure 3. The correlation network of S. fusiforme components based on Pearson’s correlation coefficients (|r| > 0.7) using 166 variables was made to explore the relationships between metabolites and elements and was visualized using Gephi software. A comprehensive network was created using the OpenOrd algorithm with manual modifications (A). The cluster in the dotted square in network A primarily contains elements and components. The network of nodes in this dotted square, as well as its neighbor nodes, was created manually to explore the relationships between saccharides and elements (B). Nodes are colored to discriminate different biomass measurements (FT-IR, yellow; DTG, orange; elements, purple; CP−MAS, brown; 1H NMR with MeOD extraction, green; 1H NMR with KPi/D2O extraction, light blue). For the lines linked to each node, blue corresponds to positive correlations, whereas red corresponds to negative correlations. Line thickness also shows three phases in B (0.7 < |r| ≤ 0.8, thin; 0.8 < |r| ≤ 0.9, medium; 0.9 < |r| ≤ 1.0, bold). In the spectral data, labels show the abbreviation of the measurement and the central position.

Exploring and Verification of the Relationships. CNA revealed several relationships between inorganic and organic substances (Figure 3A). The biggest group in CNA primarily consisted of polysaccharides and amino acids derived from extracted data of CP−MAS, FT-IR, and KPi/D2O extraction of 1H solution NMR and DTG. The second biggest group primarily consisted of fatty acids derived from methanol extraction of 1H solution NMR. The central group in CNA primarily consisted of elements of ICP-OES measurement related to several fatty acids and polysaccharides (Figure 3B). For example, there was a negative relationship between fatty acids and boron. After its discovery in 1910, boron was identified as one of the essential microelements for higher plants, with cell wall structure, flowering and fruiting, plant hormone regulation, sugar transport, and cell division all being known functions of boron in plant nutrition, and its biological role has been the subject of a number of studies.50−52 In addition, in contrast to the generally boron-poor terrestrial environment, the relatively high concentration of boron in the marine environment suggests that boron deficiency is not likely to be an issue for marine primary productivity despite it also being an essential element for marine algae. However, the potential toxicity of boron coupled with its high concentration in the ocean suggests the need for some homeostatic control mechanisms in marine organisms.53 It was expected that this network would reveal some effect of boron on the metabolism of fatty acids in S. fusiforme or some factor affecting this relationship. However, no such relationship could be confirmed in the present study. Silicon and a metabolite of 4.9 ppm in 1H solution NMR extracted by KPi/D2O buffer have a high positive correlation, and a component of approximately 31 ppm in CP−MAS measurement is also highly negatively correlated with them. Some metals have a positive relationship with some polysaccharides estimated by CP−MAS, DTG, and 1H solution NMR. In a recent study, an integrated PCA and correlation network was able to provide information on fruit metabolism and mechanisms of plant responses to environmental modifications, thereby paving the way for metabolomics-guided improvement of cultural practices for better fruit quality.26 Similarly, in this study, the metabolomics approach in combination with mineral element analysis provided new information on seaweed metabolism and seaweed response to seasonal modifications and fluctuations. A comprehensive exploration map made by correlation network analysis can
dx.doi.org/10.1021/ac402869b | Anal. Chem. 2014, 86, 1098−1105
Analytical Chemistry
Article
reveal the relationships between organic and inorganic components. The central group formed by CNA showed the relationship between alginic acid and specific metals. In addition, it revealed that alginic acid is the main polysaccharide constituent of brown algae. Therefore, the anomeric region of alginic acid was estimated with the MCR-ALS method using CP−MAS spectra, and iron, aluminum, and titanium, which were used to verify the relationship between alginic acid and metals, were measured using ICP-OES. The hypothetical model was verified using SEM. Area and intensity of the anomeric region of alginic acid formed the latent variables. Iron, aluminum, and titanium were also latent variables. The correlation coefficient of two latent variables was 0.81, and the model fitting score was satisfactory (Figure 4A).54
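As a quick consistency check on the fit statistics quoted for this model (χ2 = 1.717, df = 4; Figure 4), the following minimal Python sketch recomputes the P-value and the RMSEA from χ2 and df alone. It is not the AMOS calculation used in this study. It assumes the commonly used definition RMSEA = sqrt(max(χ2 − df, 0)/(df(N − 1))), and the sample size n = 20 is a hypothetical placeholder; the function names are likewise illustrative.

```python
from math import exp, factorial, sqrt


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-squared variate.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which is valid only for even degrees of freedom.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    return exp(-half) * sum(half**k / factorial(k) for k in range(df // 2))


def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA under the assumed definition sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# chi2 and df are the values reported in the Figure 4 caption.
p_value = chi2_sf_even_df(1.717, 4)
# n = 20 is a hypothetical sample size, not a value taken from the study.
fit_rmsea = rmsea(1.717, 4, n=20)

print(f"P-value = {p_value:.3f}, RMSEA = {fit_rmsea:.2f}")
```

Because χ2 is smaller than df here, the RMSEA under this definition is zero for any sample size, so the placeholder n does not affect the result.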
The mechanisms of biosorption are not fully understood, but it is well-known that alginic acid in algal cell walls plays an important role in metal binding.55 Recently, alginic acid has been shown to bind aluminum, titanium, and iron.56−58 Alginic acid is a linear binary copolymer consisting of (1 → 4)-linked β-D-mannuronic acid (M) and α-L-guluronic acid (G), which are arranged in a block structure of homopolymeric (MM and GG) and/or heteropolymeric (MG and GM) blocks.59 Our result suggests that the structure of alginic acid changes with growth and that this change affects the adsorption of metals (Figure 4B). Complex relationship maps made by CNA using several sets of organic and inorganic chemical data can be used to test several biological hypotheses. Moreover, SEM is a useful method to verify relationships estimated using CNA.
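The edge rules used for the correlation network (Figure 3: |r| > 0.7, blue for positive and red for negative correlations, three thickness classes) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the network in this study was built and laid out in Gephi, the variable names and values below are invented toy data, and `pearson`, `thickness`, and `correlation_edges` are hypothetical helper names.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def thickness(r):
    """Map |r| to the three line-thickness classes of Figure 3B."""
    a = abs(r)
    if a > 0.9:
        return "bold"
    if a > 0.8:
        return "medium"
    return "thin"


def correlation_edges(variables, threshold=0.7):
    """Return (node_a, node_b, r, color, thickness) for every pair with |r| > threshold."""
    edges = []
    for (name_a, xa), (name_b, xb) in combinations(variables.items(), 2):
        r = pearson(xa, xb)
        if abs(r) > threshold:
            color = "blue" if r > 0 else "red"  # positive vs negative correlation
            edges.append((name_a, name_b, round(r, 3), color, thickness(r)))
    return edges


# Hypothetical toy time series (five sampling points); not data from this study.
toy = {
    "metal_X": [1.0, 2.0, 3.0, 4.0, 5.0],
    "polysaccharide_Y": [2.1, 3.9, 6.2, 8.1, 9.8],
    "fatty_acid_Z": [5.0, 4.2, 2.9, 2.1, 1.0],
    "unrelated_W": [1.0, -1.0, 1.0, -1.0, 1.0],
}

edges = correlation_edges(toy)
for edge in edges:
    print(edge)
```

The resulting edge list could be exported as a table for a network viewer; weakly correlated variables such as `unrelated_W` simply drop out of the network.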
■
CONCLUSIONS
Integrated analyses based on chemometrics showed multiple differences, similarities, and trends in chemical composition in a hydrosphere plant. Statistical analysis using integrated data from several measurements allows samples to be evaluated from a variety of perspectives. We show for the first time that progressing from individual PCA to integrated PCA and network analysis can comprehensively extract similar tendencies in variation between different measurement data sets, and that these relationships can be evaluated by SEM. Moreover, integrated analysis involving genetic and environmental information as well as a variety of chemical data can be applied for practical development and to obtain new biochemical knowledge.
■
ASSOCIATED CONTENT
Supporting Information
Water-soluble compounds detected in 1H−13C HSQC spectra, methanol-soluble compounds detected in 1H−13C HSQC spectra, verification of Gaussian function, location of the sampling spot, PCA of each measurement, 1H NMR spectra of S. fusiforme, 1H−13C HSQC NMR spectra of S. fusiforme, time series variation of elements, score and loading plots of PCA using integrated data, and loading plots of PCA and SOM using integrated data. This material is available free of charge via the Internet at http://pubs.acs.org.
■
AUTHOR INFORMATION
Corresponding Author
*E-mail: [email protected].

Figure 4. The relationship between biomass composition of alginic acid and metals, verified by SEM. The anomeric region (C1) of alginic acid in CP−MAS spectra was estimated by the MCR-ALS method (Figure 2C). The relationship between alginic acid and three metals was suggested by correlation network analysis (Figure 3). The SEM used to verify the relationship was calculated and normalized using AMOS software. Squares indicate observed variables, and circles indicate latent variables. The values on single-ended arrows indicate path coefficients, and double-ended arrows indicate correlations. The values on observed variables are coefficients of determination (R2). The relationship between the two latent variables was verified by confirmatory factor analysis (A). This model fit (chi-squared (χ2) = 1.717, degrees of freedom (df) = 4, P-value = 0.788, goodness of fit index (GFI) = 0.96, adjusted goodness of fit index (AGFI) = 0.85, comparative fit index (CFI) = 1.00, root-mean-square error of approximation (RMSEA) = 0.00) is appropriate according to a previously published study.54 On the basis of the correlation between the metal and glycoside binding site, the model suggests that changing the structure of the monomers (M and G blocks) in alginic acid with growth affects the adsorption of several metals (B).

Author Contributions
The manuscript was written with contributions from all the authors. All the authors have approved the final version of the manuscript.
Funding
This research was supported in part by Grants-in-Aid for Scientific Research (Grant No. 25513012) (to J.K.) and the Advanced Low Carbon Technology Research and Developmental Program (Grant No. 200210023, ALCA to J.K.) from the Ministry of Education, Culture and Sports.
Notes
The authors declare no competing financial interest.
■
ACKNOWLEDGMENTS
The authors wish to thank S. Moriya (RIKEN) for his advice and encouragement during this study. The numerical data for the 166 variables used in the integrated analysis can be provided on request.
■
REFERENCES
(1) Rocha, E. P. Science 2013, 339, 1154−1155.
(2) Jung, Y.; Lee, J.; Kim, H. K.; Moon, B. C.; Ji, Y.; Ryu do, H.; Hwang, G. S. Analyst 2012, 137, 5597−5606.
(3) Kwon, Y. K.; Jung, Y. S.; Park, J. C.; Seo, J.; Choi, M. S.; Hwang, G. S. Mar. Pollut. Bull. 2012, 64, 1874−1879.
(4) Karakach, T.; Huenupi, E.; Soo, E.; Walter, J.; Afonso, L. B. Metabolomics 2009, 5, 123−137.
(5) McHugh, D. J. FAO Fish Tech. Pap. 2003, 441, 105.
(6) Kirkman, H.; Kendrick, G. A. J. Appl. Phycol. 1997, 9, 311−326.
(7) Aderhold, D.; Williams, C. J.; Edyvean, R. G. J. Bioresour. Technol. 1996, 58, 1−6.
(8) Matheickal, J. T.; Yu, Q. M. Water Sci. Technol. 1996, 34, 1−7.
(9) Sandau, E.; Sandau, P.; Pulz, O.; Zimmermann, M. Acta Biotechnol. 1996, 16, 103−119.
(10) Figueira, M. M.; Volesky, B.; Ciminelli, V. S. T.; Roddick, F. A. Water Res. 2000, 34, 196−204.
(11) Ogata, Y.; Chikayama, E.; Morioka, Y.; Everroad, R. C.; Shino, A.; Matsushima, A.; Haruna, H.; Moriya, S.; Toyoda, T.; Kikuchi, J. PLoS One 2012, 7, e30263.
(12) Watanabe, T.; Shino, A.; Akashi, K.; Kikuchi, J. Plant Biotechnol. 2012, 29, 163−170.
(13) Ogura, T.; Date, Y.; Kikuchi, J. PLoS One 2013, 8, e66919.
(14) Komatsu, T.; Kikuchi, J. J. Phys. Chem. Lett. 2013, 4, 2279−2283.
(15) Komatsu, T.; Kikuchi, J. Anal. Chem. 2013, 85, 8857−8865.
(16) Salomonsen, T.; Jensen, H. M.; Larsen, F. H.; Steuernagel, S.; Engelsen, S. B. Carbohydr. Res. 2009, 344, 2014−2022.
(17) Trygg, J.; Holmes, E.; Lundstedt, T. J. Proteome Res. 2007, 6, 469−479.
(18) Date, Y.; Sakata, K.; Kikuchi, J. Polym. J. 2012, 44, 888−894.
(19) Date, Y.; Nakanishi, Y.; Fukuda, S.; Kato, T.; Tsuneda, S.; Ohno, H.; Kikuchi, J. J. Biosci. Bioeng. 2010, 110, 87−93.
(20) Zhang, G. F.; Sadhukhan, S.; Tochtrop, G. P.; Brunengraber, H. J. Biol. Chem. 2011, 286, 23631−23635.
(21) Miller, M. G. J. Proteome Res. 2007, 6, 540−545.
(22) Yamazawa, A.; Date, Y.; Ito, K.; Kikuchi, J. J. Biosci. Bioeng. 2013, DOI: 10.1016/j.biosc.2013.08.010.
(23) Caruso, M.; Galgano, F.; Castiglione Morelli, M. A.; Viggiani, L.; Lencioni, L.; Giussani, B.; Favati, F. J. Agric. Food Chem. 2012, 60, 7−15.
(24) Samuelsson, L. M.; Bjorlenius, B.; Forlin, L.; Larsson, D. G. Environ. Sci. Technol. 2011, 45, 1703−1710.
(25) Rivas-Ubach, A.; Sardans, J.; Perez-Trujillo, M.; Estiarte, M.; Penuelas, J. Proc. Natl. Acad. Sci. U.S.A. 2012, 109, 4181−4186.
(26) Bernillon, S.; Biais, B.; Deborde, C.; Maucourt, M.; Cabasson, C.; Gibon, Y.; Hansen, T. H.; Husted, S.; de Vos, R. C. H.; Mumm, R.; Jonker, H.; Ward, J. L.; Miller, S. J.; Baker, J. M.; Burger, J.; Tadmor, Y.; Beale, M. H.; Schjoerring, J. K.; Schaffer, A. A.; Rolin, D.; Hall, R. D.; Moing, A. Metabolomics 2013, 9, 57−77.
(27) Barding, G. A., Jr.; Beni, S.; Fukao, T.; Bailey-Serres, J.; Larive, C. K. J. Proteome Res. 2013, 12, 898−909.
(28) Akihito, P. Jpn. J. Ichthyol. 1972, 19, 103−110.
(29) Fumihito, A.; Ikeda, Y.; Aizawa, M.; Makino, T.; Umehara, Y.; Kai, Y.; Nishimoto, Y.; Hasegawa, M.; Nakabo, T.; Gojobori, T. Gene 2008, 427, 7−18.
(30) Mori, T.; Chikayama, E.; Tsuboi, Y.; Ishida, N.; Shisa, N.; Noritake, Y.; Moriya, S.; Kikuchi, J. Carbohydr. Polym. 2012, 90, 1197−1203.
(31) Okushita, K.; Komatsu, T.; Chikayama, E.; Kikuchi, J. Polym. J. 2012, 44, 895−900.
(32) Piotto, M.; Saudek, V.; Sklenar, V. J. Biomol. NMR 1992, 2, 661−665.
(33) Kikuchi, J.; Hirayama, T. Methods Mol. Biol. 2007, 358, 273−286.
(34) Sekiyama, Y.; Chikayama, E.; Kikuchi, J. Anal. Chem. 2010, 82, 1643−1652.
(35) Sekiyama, Y.; Kikuchi, J. Phytochemistry 2007, 68, 2320−2329.
(36) Sekiyama, Y.; Chikayama, E.; Kikuchi, J. Anal. Chem. 2011, 83, 719−726.
(37) Feng, Y.; Xiao, B.; Goerner, K.; Cheng, G.; Wang, J. Smart Grid Renewable Energy 2011, 2, 158−164.
(38) Akiyama, K.; Chikayama, E.; Yuasa, H.; Shimada, Y.; Tohge, T.; Shinozaki, K.; Hirai, M. Y.; Sakurai, T.; Kikuchi, J.; Saito, K. In Silico Biol. 2008, 8, 339−345.
(39) Chikayama, E.; Sekiyama, Y.; Okamoto, M.; Nakanishi, Y.; Tsuboi, Y.; Akiyama, K.; Saito, K.; Shinozaki, K.; Kikuchi, J. Anal. Chem. 2010, 82, 1653−1658.
(40) Chikayama, E.; Suto, M.; Nishihara, T.; Shinozaki, K.; Kikuchi, J. PLoS One 2008, 3, e3805.
(41) Carrari, F.; Baxter, C.; Usadel, B.; Urbanczyk-Wochniak, E.; Zanor, M. I.; Nunes-Nesi, A.; Nikiforova, V.; Centero, D.; Ratzka, A.; Pauly, M.; Sweetlove, L. J.; Fernie, A. R. Plant Physiol. 2006, 142, 1380−1396.
(42) Crockford, D. J.; Holmes, E.; Lindon, J. C.; Plumb, R. S.; Zirah, S.; Bruce, S. J.; Rainville, P.; Stumpf, C. L.; Nicholson, J. K. Anal. Chem. 2006, 78, 363−371.
(43) Peluffo, L.; Lia, V.; Troglia, C.; Maringolo, C.; Norma, P.; Escande, A.; Hopp, H. E.; Lytovchenko, A.; Fernie, A. R.; Heinz, R.; Carrari, F. Phytochemistry 2010, 71, 70−80.
(44) Mounet, F.; Moing, A.; Garcia, V.; Petit, J.; Maucourt, M.; Deborde, C.; Bernillon, S.; Le Gall, G.; Colquhoun, I.; Defernez, M.; Giraudel, J. L.; Rolin, D.; Rothan, C.; Lemaire-Chamley, M. Plant Physiol. 2009, 149, 1505−1528.
(45) Sartori, C.; Finch, D. S.; Ralph, B.; Gilding, K. Polymer 1997, 38, 43−51.
(46) Leal, D.; Matsuhiro, B.; Rossi, M.; Caruso, F. Carbohydr. Res. 2008, 343, 308−316.
(47) Mollica, G.; Ziarelli, F.; Lack, S.; Brunel, F.; Viel, S. Carbohydr. Polym. 2012, 87, 383−391.
(48) Zou, D.; Gao, K.; Ruan, Z. J. Appl. Phycol. 2006, 18, 195−201.
(49) Rosell, K. G.; Srivastava, L. M. J. Phycol. 1985, 21, 304−309.
(50) Blevins, D. G.; Lukaszewski, K. M. Annu. Rev. Plant Physiol. 1998, 49, 481−500.
(51) Brown, P. H.; Bellaloui, N.; Wimmer, M. A.; Bassil, E. S.; Ruiz, J.; Hu, H.; Pfeffer, H.; Dannel, F.; Römheld, V. Plant Biol. 2002, 4, 205−223.
(52) Bolaños, L.; Lukaszewski, K.; Bonilla, I.; Blevins, D. Plant Physiol. Biochem. 2004, 42, 907−912.
(53) Carrano, C. J.; Schellenberg, S.; Amin, S. A.; Green, D. H.; Kupper, F. C. Mar. Biotechnol. 2009, 11, 431−440.
(54) Hooper, D.; Coughlan, J.; Mullen, M. Electron. J. Bus. Res. Methods 2008, 6, 53−60.
(55) Kuyucak, N.; Volesky, B. Biotechnol. Bioeng. 1989, 33, 823−831.
(56) Gregor, J. E.; Fenton, E.; Brokenshire, G.; Van Den Brink, P.; O’Sullivan, B. Water Res. 1996, 30, 1319−1324.
(57) Brizzolara, R. A. Surf. Interface Anal. 2002, 33, 351−360.
(58) Min, J. H.; Hering, J. G. Water Res. 1998, 32, 1544−1552.
(59) Burana-osot, J.; Hosoyama, S.; Nagamoto, Y.; Suzuki, S.; Linhardt, R. J.; Toida, T. Carbohydr. Res. 2009, 344, 2023−2027.
dx.doi.org/10.1021/ac402869b | Anal. Chem. 2014, 86, 1098−1105