Article
Multivariate analyses of phytoplankton pigment fluorescence from a freshwater river network. Ruchi Bhattacharya and Christopher L. Osburn. Environ. Sci. Technol., Just Accepted Manuscript. Publication Date (Web): 16 May 2017.
Environmental Science & Technology
Multivariate analyses of phytoplankton pigment fluorescence from a freshwater river network
Ruchi Bhattacharya*, Christopher L. Osburn
Department of Marine, Earth, and Atmospheric Sciences, North Carolina State University, Raleigh, NC, USA 27606
*Corresponding author info: [email protected]
ACS Paragon Plus Environment
Abstract
Monitoring phytoplankton classes in river networks is critical to understanding phytoplankton dynamics and to predicting the ecosystem response to changing land use and seasons. The applicability of phytoplankton fluorescence as a quick and effective ecological monitoring approach is relatively unexplored in freshwater ecosystems. We used multivariate analyses of fluorescence from pigment extracted in 90% acetone to assess the variability in phytoplankton classes, herbivory and organic matter quality in a freshwater river network. Four models developed by Parallel Factor Analysis (PARAFAC) of fluorescence excitation and emission matrices identified 6 components: Model 1 (Pheophytin-A, Chlorophyll-A), Model 2 (Chl-B, Chl-C), Model 3 (Phe-B) and Model 4 (Phe-C). Redundancy analyses revealed that in summer, urban and agricultural streams were abundant in chlorophylls, fresh organic matter and organic nitrogen, whereas in winter, streams were high in phaeopigments. A slow-moving, light-limited wetland stream was an exception, as high phaeopigment abundance was observed in both seasons. The PARAFAC components were used to develop a partial least squares regression-based model (r² = 0.53; NSE = 0.5; n = 147) that successfully predicted Chl-A concentrations from an external subset of river water samples (r² = 0.41; p
95% of the variation in the data. An efficient alternating least squares approach is used to minimize residuals between measured and modeled EEMs and usually produces robust models [45; 46]. The PARAFAC models were validated by first dividing the data array into two halves, and then using Tucker's congruence coefficient (TCC, set to 95% similarity) to test separate PARAFAC models fit to each half (split-half validation) [24]. Random initialization was next used on 10 models to find the best-fit model having the lowest sum of squared error [46]. Residuals were determined as the difference between the measured and the modeled EEMs [47]. Where distinct residual peaks, with intensities a few orders of magnitude lower than those of the measured PARAFAC components, were observed at a specific excitation/emission pair, the residual EEMs were modeled using in-house MATLAB functions built around the DOMFluor toolbox [24]. The modeled residuals were also split-half validated using the TCC and random initialization methods described above.
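The split-half comparison can be illustrated with a minimal sketch of Tucker's congruence coefficient, which for two loading vectors x and y is the sum of their element-wise products divided by the square root of the product of their sums of squares. The sketch below is not the in-house MATLAB code; the function names, the example loadings, and the reading of "95% similarity" as a threshold of 0.95 are assumptions made for illustration only.

```python
import math


def tucker_congruence(x, y):
    """Tucker's congruence coefficient between two loading vectors."""
    if len(x) != len(y):
        raise ValueError("loading vectors must have the same length")
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if denominator == 0:
        raise ValueError("congruence is undefined for an all-zero vector")
    return numerator / denominator


def halves_match(loadings_a, loadings_b, threshold=0.95):
    """Return True when the two loadings reach the congruence threshold.

    The 0.95 default is an assumed reading of "95% similarity".
    """
    return tucker_congruence(loadings_a, loadings_b) >= threshold


# Hypothetical emission loadings of one component from each data half.
half_1 = [0.02, 0.10, 0.45, 0.80, 0.35, 0.05]
half_2 = [0.03, 0.12, 0.43, 0.79, 0.37, 0.06]
# A loading whose peak sits at a different wavelength, for contrast.
shifted = [0.80, 0.35, 0.05, 0.02, 0.10, 0.45]

print(halves_match(half_1, half_2))
print(halves_match(half_1, shifted))
```

In practice such a check would typically be applied to the excitation and the emission loadings of each component separately, with a component counted as validated only when both pass.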
Principal component analysis (PCA), a linear response ordination method [37], was used to explore the spatio-temporal variability in PARAFAC components. The PCA was conducted on the maximum fluorescence intensity (Fmax) of each PARAFAC component, which is typically used as a qualitative measure of the strength or presence of that component [44; 45]. A PCA was also conducted on the water quality parameters to explore their spatio-temporal variability (Table S2). Prior to conducting PCA, the distributions of PARAFAC component Fmax and water quality parameters were checked by constructing q-q plots and histograms; parameters that were not normally distributed were centered and log transformed (log10(n + 10)). Linear regressions were conducted to identify and remove outliers (high leverage). The PCA was also used to explore the inter-variable relationships by calculating correlation coefficients between the variables [48]. Major gradients in the dataset were identified by calculating correlation coefficients for the variables and each of the ordination axes [48]. Next, the relation between the water quality parameters and the PARAFAC component Fmax was determined by redundancy analysis (RDA). In order to identify a minimum subset of variables that significantly explain variation in the PARAFAC components, we first assessed multicollinearity in the data set by checking for high variance inflation factors (VIF > 10). Second, the redundant variables were removed through step-wise regression with Monte Carlo permutation tests [49]. In addition to the statistical significance of each variable, its relative ecological importance in explaining phytoplankton abundance was also considered prior to variable removal. We further conducted variance partitioning to explore the percent variance uniquely explained by each variable. The PCA and RDA were conducted using the "vegan 2.3-1" package in R statistical software [50].
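As an illustration of the log10(n + 10) transformation described above, the following minimal sketch shifts each value by 10, takes the base-10 logarithm, and then centers the result on its mean. The function name and the example values are hypothetical, and applying the centering after the log transform is an assumption, since the text does not state the order of the two steps.

```python
import math


def log_shift_center(values, shift=10.0):
    """Apply log10(n + shift) to each value, then subtract the mean."""
    if not values:
        raise ValueError("at least one value is required")
    if any(v + shift <= 0 for v in values):
        raise ValueError("every value plus the shift must be positive")
    logged = [math.log10(v + shift) for v in values]
    mean = sum(logged) / len(logged)
    return [x - mean for x in logged]


# Hypothetical right-skewed Fmax values for one PARAFAC component.
fmax_raw = [0.0, 90.0, 990.0]
fmax_transformed = log_shift_center(fmax_raw)
print(fmax_transformed)
```

The constant shift keeps zero intensities defined under the logarithm, which is presumably why an offset was used rather than a plain log10 transform.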
Partial least squares regression (PLS) [26] was conducted on PARAFAC component Fmax using the "rioja" package in R statistical software [51]. The PLS model was calibrated with sample Subset 1 (n = 147), which was analyzed for pigment fluorescence, Chl-A concentrations and other water quality parameters. The water quality parameter that individually explained the most variance was used for the development of the calibration model. A leave-one-out cross-validation procedure (jack-knife) was used to validate the error predictions [52, 53]. The model efficiency was evaluated by r² and the Nash-Sutcliffe Efficiency (NSE) value. The PLS-based calibration model was externally validated using the PARAFAC component Fmax values from a second sample subset (Subset 2, n = 75). Subset 2 was analyzed for pigment fluorescence and Chl-A, but the associated water quality parameters were not available. However, we ensured that both Subset 1 and Subset 2 were representative of the environmental and seasonal gradients.
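The two efficiency measures and the leave-one-out procedure can be sketched as follows. NSE is computed as one minus the ratio of the residual sum of squares to the sum of squared deviations of the observations from their mean, and r² is taken here as the squared Pearson correlation between observed and predicted values. A one-predictor least-squares line stands in for the PLS model purely to keep the sketch self-contained; it is not the rioja implementation, and all function names and data values are hypothetical.

```python
def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe Efficiency: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("NSE is undefined when observations are constant")
    return 1.0 - ss_res / ss_tot


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    mean_o = sum(observed) / len(observed)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("r2 is undefined when either series is constant")
    return cov * cov / (var_o * var_p)


def fit_line(x, y):
    """Ordinary least-squares slope and intercept (stand-in for PLS)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("predictor has zero variance")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_predictions(x, y):
    """Predict each sample from a model fitted to all other samples."""
    predictions = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        predictions.append(intercept + slope * x[i])
    return predictions


# Hypothetical Fmax values and measured Chl-A concentrations.
fmax = [1.0, 2.0, 3.0, 4.0, 5.0]
chl_a = [2.1, 3.9, 6.2, 7.8, 10.1]
cv_pred = leave_one_out_predictions(fmax, chl_a)
print(nash_sutcliffe(chl_a, cv_pred), r_squared(chl_a, cv_pred))
```

Reporting both measures is informative because r² is insensitive to systematic bias, whereas NSE penalizes any offset or scaling error in the predictions and drops to zero when the model performs no better than the mean of the observations.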
Results
PARAFAC Modeling
Figure 1. Excitation and emission spectral loadings of the PARAFAC model components (solid lines). The excitation and emission loadings of the corresponding split-half validated component (dashed lines) are also plotted with each PARAFAC component. An overlap between the PARAFAC and split-half validated components indicates good model performance.
A total of four PARAFAC models (see Figures S2 and S3) were developed using two sets of data: 1) the chlorophyll pigment EEMs and 2) the phaeopigment EEMs. The first PARAFAC model (Model 1) was developed from the chlorophyll pigment EEMs and resulted in two components that were validated by split-half analysis. The excitation (Ex) and emission (Em) spectral loadings of the two modeled components resembled Phe-A (Model1-Phe-A) and Chl-A (M1-Chl-A), respectively (Figure 1A-B, Table 1, Figure S2). Examination of the residual EEMs from Model 1 revealed two distinct, low-intensity fluorescence peaks. The second PARAFAC model (Model 2) was developed using these residual EEMs, and two validated components were identified as Chl-B (M2-Chl-B) and Chl-C (M2-Chl-C) (Figure 1C-D, Table 1). Then, the phaeopigment EEMs were used to develop a two-component PARAFAC model (Model 3), and the validated components corresponded with Phe-A (M3-Phe-A; not shown in Figure 1) and Phe-B (M3-Phe-B) (Figure 1E; Table 1). The Model 3 residuals also revealed a distinct low-intensity fluorescence peak. Thus, the fourth and final model (Model 4) was developed from the residual EEMs from Model 3, and the validated component was identified as Phe-C (M4-Phe-C) (Figure 1F, Table 1, Figure S2).
Spatial and temporal variability in PARAFAC components
PCA conducted on the PARAFAC component Fmax (log10(n + 10)) showed that PCA axes 1 and 2 explained 47.7% and 32.9% of the variance, respectively (Figure 2A). The M3-Phe-B and M4-Phe-C vectors were significantly and positively correlated to each other (r² = 0.4, p