General Research
Analysis of Multicomponent Ionic Mixtures using Blind Source Separation - a Processing Case Study Giovanni Maria Maggioni, Stefani Kocevska, Martha A. Grover, and Ronald W. Rousseau Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/acs.iecr.9b03214 • Publication Date (Web): 27 Aug 2019 Downloaded from pubs.acs.org on August 30, 2019
Industrial & Engineering Chemistry Research
Analysis of Multicomponent Ionic Mixtures using Blind Source Separation - a Processing Case Study
Giovanni Maria Maggioni, Stefani Kocevska, Martha A. Grover,∗ and Ronald W. Rousseau∗
Georgia Institute of Technology
E-mail: [email protected]; [email protected]
August 13, 2019
Abstract
Management and remediation of complex nuclear waste solutions require identification and quantification of multiple species. Some of the species forming the solution are unknown and they can be different from vessel to vessel, thus limiting the utility of standard calibration approaches. To cope with such limited information, we propose a procedure based on blind source separation (BSS) techniques, in particular independent component analysis and multivariate curve resolution, with a one-point calibration library. Here we show the applicability and reliability of our procedure for on-line measurements of aqueous ionic solutions by proposing an automatic procedure to identify the number of species in the mixture, estimate the spectra of the pure species, and label the spectra with respect to a library of reference components. We test our procedure against simulated and experimental data for mixtures with six species (water plus five sodium salts) for the case of Raman and ATR-FTIR spectroscopy.
1 Introduction
The low-level radioactive waste at the Hanford Site in Washington State (USA) is to be vitrified to achieve long-term, safe, and environmentally sustainable storage. The process selected to achieve this aim significantly reduces the total volume of waste by separating most of the water contained in the waste from dissolved species. The process comprises several unit operations and is expected to run continuously for about fifty years (from its scheduled start in 2023) to complete treatment of the whole mass of waste. 1,2
Process safety, efficiency, and stability require that operating conditions remain within a relatively narrow range of values. The key variables are the temperature, the identity of the species present in the feed, and their relative concentrations.
Spectroscopic techniques, such as Infrared (IR) in the form of Attenuated Total Reflection-Fourier Transform IR (ATR-FTIR) and Raman, are commonly used to analyze and monitor the composition of solutions and slurries. The standard approach to obtain quantitative information from these techniques usually relies on time-consuming calibration procedures, 3–6 which also need carefully designed sets of experiments with known species and concentrations to estimate model parameters. Additionally, if the species present in the mixture change, a new calibration typically becomes necessary, which may halt or delay processing. In the case of nuclear-waste treatment, this is clearly an undesired event, since the process is intended to run continuously for several decades. 1
The waste at Hanford originated from various processes and treatments. 1,2 Due to the history of the tank-waste farm, the waste is not homogeneous: each tank may contain different species and would require its own calibration for analysis. Therefore, in the present work we have developed a protocol (1) to identify the spectra of pure major species and (2) to compute a reliable estimate of their relative concentrations based on Blind Source Separation (BSS) techniques and using a library that stores a single reference spectrum for each species. The protocol builds upon two different well-established techniques, namely Independent Component Analysis (ICA) and Multivariate Curve Resolution - Alternating Least Squares (MCR-ALS). In particular, we have investigated the application of our protocol to Raman and ATR-FTIR spectroscopy, using a simulant of the actual low-level radioactive waste.
The paper is organized as follows. In Section 2, we review the basic principles of Raman and IR spectroscopy, the structure of the algorithms used, and the main data pre-processing techniques. In Section 3, we examine the results obtained with multi-component mixtures. First, we briefly discuss the details of the simulant mixture used in this study; second, we test our procedure against synthetic, simulated data (Section 3.2); third, we test the procedure on actual measurements (Section 3.3). Finally, in Section 4, we summarize our findings.
2 Modelling
In this contribution, we consider only two spectroscopic techniques, namely ATR-FTIR and Raman: the mathematical treatment developed in this section applies equally to both techniques. We assume that the intensity of the measured spectroscopic signal is linearly proportional to the concentration (Beer-Lambert Law). Additionally, we assume that the total intensity, at any wavenumber, is given by the linear superposition of the intensities of the individual species. Mathematically, these relationships can be written as a linear system:
$$X = CL \tag{1}$$
where $X \in \mathbb{R}^{n_N \times n_L}$ is the matrix of measured spectra, $C \in \mathbb{R}^{n_N \times n_K}$ the matrix of concentrations, and $L \in \mathbb{R}^{n_K \times n_L}$ the matrix containing the spectra of the pure species. Note that $n_K$ is the number of species, $n_N$ the number of measurements, and $n_L$ the number of sampled points in the wavenumber space. Standard calibration approaches rely on various forms of supervised learning, such as Partial Least Squares (PLS), Principal Component Analysis (PCA), or Support Vector Machines (SVM). 5–10 In the context of nuclear waste processing, extensive investigations on the use and reliability of calibration techniques, PLS in particular, have been conducted by Bryan and co-workers at the Pacific Northwest National Laboratory. 8–10 Bryan et al. investigated several anionic systems at various pH and temperature conditions, including some non-linear modifications to account for cases at particularly high concentrations, where the Beer-Lambert law was found to be inaccurate. PLS and related methods can determine with high accuracy and robustness the concentration of samples within the range of training values. However, the main disadvantage of this approach is that a significant amount of prior information is required in order to design an appropriate calibration set. During the calibration phase, PLS techniques require both X and C (or even its extended version, containing information about the temperature and the pH as well) to be known. Thus, the creation of a robust and accurate PLS model for a multi-component system may require the preparation and measurement of tens or hundreds of different samples. In general, accurate and robust predictions at the price of lengthy calibrations are typical of supervised learning approaches, not only of PLS. For example, the SVM approach investigated by Griffin et al. 6 required appropriate training with tens of samples taken at different conditions. Finally, PLS does not allow direct inference of the spectra of the pure species from the data, since the technique is designed to exploit so-called latent variables that best explain the variance between the input and output data.
2.1 Blind Source Separation
As described earlier, standard calibration approaches may not be feasible during the operations involved in the treatment of nuclear waste, since only limited information may be available and/or model recalibration could be too lengthy. In that case, one must extract from the data themselves the number of species present in the system, their identity, and their concentration without (or with minimal) prior knowledge of the system itself. Because of such blindness, the methods developed to meet these conditions are known in the field of signal analysis as Blind Source Separation techniques. Two among the several approaches available have gained popularity in analytical spectroscopy: Independent Component Analysis (ICA) 11 and Multivariate Curve Resolution, particularly its Alternating Least Squares variant (MCR-ALS). 12 We shall now briefly review their essential features.
2.1.1 Independent Component Analysis
ICA is based on the assumption that a signal can be decomposed into a linear combination of statistically independent, non-Gaussian components (or sources) that correspond to the spectra of the pure species. ICA aims to find an approximate solution to Eq. (1) by identifying two matrices A and S such that:
$$X = AS \tag{2}$$
with the matrices A and S related to the original spectra and concentration matrices:
$$A \longleftrightarrow C \tag{3}$$
$$S \longleftrightarrow L \tag{4}$$
where $A \in \mathbb{R}^{n_N \times \hat{n}_K}$ is the mixing matrix; $S \in \mathbb{R}^{\hat{n}_K \times n_L}$ is the sources matrix; $\hat{n}_K$ is the estimated number of species, obtained from the analysis of the data and used instead of $n_K$, which is unknown. However, because ICA inherently suffers from permutation, rotation, and scaling ambiguity, 11 Eqs. (3) and (4) are equivalences, not identities. In fact, the related matrices are the same up to a scaling and a permutation of their columns (A) or rows (S). Consequently, the actual spectra of pure species and the independent components computed by ICA have the same shape and, upon normalization, the spectra and the independent components should overlap when the algorithm converges to the correct solution. From a logical perspective, ICA can be broken down into two main steps: first, de-correlate the data (a process usually called whitening) and reduce their dimensionality (to find $\hat{n}_K$); second, rotate the data in the reduced space to find the independent components. 11,13
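As an illustration of the first of these two steps only, the following is a minimal sketch of PCA-based whitening with dimensionality reduction. It is not the implementation used in this work: the function name `whiten`, the synthetic Gaussian bands, and the choice of an eigendecomposition of the covariance matrix are assumptions made for the example. The rotation step, which is what distinguishes ICA algorithms from one another, is deliberately left out.

```python
import numpy as np

def whiten(X, n_components):
    """PCA-whiten the measurements in X and keep n_components directions.

    X has shape (n_N, n_L): one measured spectrum per row.
    Returns Z (n_components, n_L) with identity sample covariance and the
    whitening matrix W (n_components, n_N) such that Z = W @ X_centered.
    """
    Xc = X - X.mean(axis=1, keepdims=True)            # center each spectrum over wavenumbers
    cov = Xc @ Xc.T / Xc.shape[1]                     # (n_N, n_N) covariance between measurements
    eigval, eigvec = np.linalg.eigh(cov)              # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:n_components]   # indices of the largest eigenvalues
    W = (eigvec[:, order] / np.sqrt(eigval[order])).T
    return W @ Xc, W

rng = np.random.default_rng(0)
wavenumber = np.linspace(0.0, 1.0, 400)
# Three hypothetical pure spectra (Gaussian bands) and ten random mixtures, X = C L.
L_true = np.stack([np.exp(-0.5 * ((wavenumber - c) / 0.03) ** 2) for c in (0.25, 0.5, 0.75)])
C_true = rng.uniform(0.5, 1.5, size=(10, 3))
X = C_true @ L_true

Z, W = whiten(X, n_components=3)
cov_Z = Z @ Z.T / Z.shape[1]   # close to the 3 x 3 identity matrix
```

After whitening, the reduced data are uncorrelated with unit variance, so that the remaining task for ICA is to find the rotation that makes the components as independent as possible.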
There are several alternative algorithms to compute the whitening matrix W and the rotation matrix R, divided into two main classes: those minimizing the mutual dependence (entropy maximization, mutual information minimization) of the sources and those maximizing their non-Gaussianity (kurtosis, higher order cumulants). A popular and efficient algorithm, based on maximization of non-Gaussianity, is FastICA, developed by Hyvärinen and co-workers; 14 another algorithm is MILCA, based on minimization of mutual information, developed by Stögbauer and co-workers. 15 The details of the two algorithms can be found in the dedicated literature. 11,15,16 Note that the number of independent components, which ICA algorithms use to perform the analysis, is a decision variable provided by the user.
The use of ICA for analytical spectroscopy was first proposed in 2001 by Chen et al. 17 in the context of Near-IR, to study a ternary system (starch-protein-water). More recently, the technique has also been applied to NMR, IR, UV, Raman, and Fluorescence, 18–20 in particular investigating the possibility of using a one-point calibration, as shown by Monakhova et al. in UV and IR. 21 Nevertheless, these studies have been mainly limited to three- or four-component systems of organic substances and have focused on analytical, rather than processing applications.
2.1.2 Multivariate Curve Resolution
Multivariate Curve Resolution - Alternating Least Squares (MCR-ALS) is a well-established chemometric technique 12,22–25 specifically aimed at retrieving the mixing and the source matrices. From a mathematical perspective, MCR-ALS solves the same problem as ICA, but it does so without relying on the independence of the sources. In fact, MCR-ALS seeks solution matrices A and S by solving alternating least-squares problems such that:
$$\min_{A,S} \| X - AS \| \tag{5}$$
where an initial guess for either A or S must be provided. Note that, similarly to ICA, MCR-ALS also suffers from permutation, rotation, and scaling ambiguity and that the number of species, $n_K$, is a degree of freedom of the algorithm, even though embedded into the initial guess.
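The alternating structure of Eq. (5) can be sketched in a few lines. The example below is a simplified illustration, not the MCR-ALS implementation used in this work: the function name `mcr_als`, the synthetic spectra with a constant baseline, and the enforcement of non-negativity by clipping after each unconstrained least-squares solve are all assumptions made for the sake of a short example (dedicated non-negative least-squares solvers are the more rigorous choice).

```python
import numpy as np

def mcr_als(X, S0, n_iter=50):
    """Minimal alternating least squares with non-negativity imposed by clipping.

    X  : (n_N, n_L) measured spectra
    S0 : (n_K, n_L) initial guess for the source (pure-spectra) matrix
    """
    S = np.clip(S0, 0.0, None)
    for _ in range(n_iter):
        A = np.clip(X @ np.linalg.pinv(S), 0.0, None)   # fix S, least-squares update of A
        S = np.clip(np.linalg.pinv(A) @ X, 0.0, None)   # fix A, least-squares update of S
    rel_err = np.linalg.norm(X - A @ S) / np.linalg.norm(X)
    return A, S, rel_err

rng = np.random.default_rng(0)
grid = np.arange(200.0)
# Hypothetical pure spectra: one Gaussian band each on a constant baseline.
S_true = 0.2 + np.stack([np.exp(-0.5 * ((grid - c) / 10.0) ** 2) for c in (50.0, 100.0, 150.0)])
C_true = rng.uniform(0.5, 1.5, size=(12, 3))
X = C_true @ S_true

# Initial guess: the true spectra distorted by 5 % multiplicative noise.
S0 = S_true * (1.0 + 0.05 * rng.standard_normal(S_true.shape))
A_hat, S_hat, rel_err = mcr_als(X, S0)
```

A small reconstruction error only indicates that the product of the two matrices reproduces X; because of the ambiguities mentioned above, the individual factors are not guaranteed to coincide with C and L.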
The MCR-ALS formulation can also take advantage of some physical properties of the spectra, such as non-negativity and mass-balance closure, to constrain the space of solutions. However, when no prior information about the structure of the sought matrices is known, i.e. without providing explicit search directions for the ALS algorithm, MCR-ALS may not converge, or may converge only very slowly to a solution; additionally, in some cases the solution may not be unique. Therefore, the initial guess for either the mixing matrix or the sources matrix is crucial in determining the quality of the decomposition. When several species are present, derivative spectroscopy may be better suited to estimate the spectra of the individual species, 26–28 even though in that case MCR-ALS cannot exploit the non-negativity constraint on the spectra.
MCR-ALS has been mainly applied in systems undergoing kinetic reactions, where the concentration of the species and their absorbance/scattering may be unknown, but where their identity was known, or at least their evolution in time was constrained by kinetics. For example, Chen and co-workers have recently adopted this approach to estimate both the kinetic parameters and the unknown absorbance profiles using MCR-ALS. 29–31 However, in the case of interest here, no underlying kinetic reaction constrains the system and no a priori information on which species are in solution is available.
2.2 Three-step Procedure
We propose a three-step procedure to analyze spectroscopic data sets: the first two steps focus on determining the spectra of pure species, while the third estimates the composition using a one-point calibration. We suggest sequentially exploiting both ICA and MCR-ALS, rather than using them individually; Valdemara et al. 32 proposed a similar approach in the context of IR spectroscopy for food processing, although they neither investigated thoroughly the robustness of the method nor explored the feasibility and reliability of one-point calibration.
2.2.1 Step One: Determination of $\hat{n}_K$
In the first step, one must determine the number of species in the system, $\hat{n}_K$, since this is the only degree of freedom in BSS algorithms. Several methods have been suggested based on two main approaches: one based on inspection of the eigenvalues of the matrix X; the other based on a trial-and-error procedure. Here, we have adopted a variant of the former, looking at the singular values.
To this aim, we perform a singular value decomposition (SVD) of the data matrix, i.e. $X = U S V^T$. The diagonal elements of $S \in \mathbb{R}^{n_N \times n_L}$ are the singular values, in decreasing magnitude, i.e. $\mathrm{diag}(S) = [S_1, S_2, \ldots, S_N]$ and $S_1 \geq S_2 \geq \ldots \geq S_N$. It is well-known from linear algebra that the row rank of X equals the number of non-zero singular values: based on this property, linear superposition, and Eq. (1), one sees that $\mathrm{rank}(X) = \hat{n}_K$ in the absence of noise. In real systems, though, noise is present and the singular values are usually not exactly zero, hence we assume that $\hat{n}_K$ is equal to the number of relevant singular values; i.e. we look at the so-called effective rank of X. Three main criteria can be used to determine the effective rank. The first one is based on the relationship between the singular values of X and the eigenvalues of its covariance matrix: it looks at the explained variance, 12 V, and determines $\hat{n}_K$ as:
$$\hat{n}_K \ \text{s.t.} \ V = \frac{\sum_{i=1}^{\hat{n}_K} S_i}{\sum_{i=1}^{n_N} S_i} \geq \alpha_1 \tag{6}$$
where $\alpha_1 \in [0, 1)$ is a constant sufficiently close to one, e.g. 0.99. The second criterion is based on the distance, measured by a p-norm, between the original set of data and the one reconstructed using the first $\hat{n}_K$ singular values:
$$\hat{n}_K \ \text{s.t.} \ \varepsilon_p = \frac{\| X - (U S V^T)_{\hat{n}_K} \|_p}{\| X \|_p} \leq \alpha_2 \tag{7}$$
where $\alpha_2$ is a constant and the subscript $\hat{n}_K$ indicates that only the first $\hat{n}_K$ singular values of the SVD have been used, $p \in [0, +\infty)$. The third criterion looks at the rate of variation of the p-norm and can be written as:
$$\hat{n}_K \ \text{s.t.} \ |\varepsilon_p(\hat{n}_K) - \varepsilon_p(\hat{n}_K - 1)| \leq \alpha_3 \tag{8}$$
where $\alpha_3$ should be close to zero. The choice of $\alpha_1$, $\alpha_2$, and $\alpha_3$ depends on the level of noise corrupting the data, as we discuss in Section 3. In the following, we use the second and third criteria together.
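The quantities entering Eqs. (6) to (8) can all be obtained from a single SVD. The sketch below is illustrative only: the function names, the use of the Frobenius norm in place of a generic p-norm, the threshold value, and the decision to select $\hat{n}_K$ from Eq. (7) alone (returning the quantity of Eq. (8) only as a diagnostic) are assumptions, since the exact rule used to combine the second and third criteria is not spelled out here.

```python
import numpy as np

def rank_diagnostics(X):
    """Return V(k), eps(k) and |eps(k) - eps(k-1)| for k = 1..N (Eqs. 6 to 8)."""
    s = np.linalg.svd(X, compute_uv=False)            # singular values, descending
    explained = np.cumsum(s) / np.sum(s)              # V(k) of Eq. (6)
    tail = np.sqrt(np.cumsum((s ** 2)[::-1])[::-1])   # tail[k] = ||X - X_k||_F for k = 0..N-1
    eps = np.append(tail[1:], 0.0) / tail[0]          # eps(k) of Eq. (7), Frobenius norm
    delta = np.abs(np.diff(eps, prepend=1.0))         # Eq. (8), taking eps(0) = 1
    return explained, eps, delta

def estimate_n_species(X, alpha2=1e-2):
    """Smallest k whose relative reconstruction error does not exceed alpha2."""
    _, eps, _ = rank_diagnostics(X)
    return int(np.argmax(eps <= alpha2)) + 1

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 300)
# Three hypothetical pure spectra and fifteen random mixtures, with and without noise.
L_true = np.stack([np.exp(-0.5 * ((grid - c) / 0.03) ** 2) for c in (0.25, 0.5, 0.75)])
C_true = rng.uniform(0.5, 1.5, size=(15, 3))
X_clean = C_true @ L_true
X_noisy = X_clean + 1e-3 * rng.standard_normal(X_clean.shape)

explained, eps, delta = rank_diagnostics(X_noisy)
n_hat = estimate_n_species(X_noisy)
```

With the Frobenius norm, the truncation error follows directly from the discarded singular values, so no explicit reconstruction of X is needed. As the noise level grows, the threshold `alpha2` has to be relaxed accordingly, which is the dependence on noise mentioned above.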
2.2.2 Step Two: Species Identification
After determining $\hat{n}_K$, in the second step we remove the blindness about the system. We compute the first (n = 1) or second (n = 2) derivative of X with Savitzky-Golay differentiation 33,34 and estimate the mixing matrix $A_I$ with the ICA algorithm:
$$\frac{d^{(n)} X}{d\lambda^{(n)}} = A_I \frac{d^{(n)} S_I}{d\lambda^{(n)}} \tag{9}$$
Because of the linearity of Eq. (1) and of the derivative operator, one can directly compute the sources matrix, $S_I$, associated with the original data matrix X by the pseudo-inverse $A_I^{-1}$:
$$S_I = A_I^{-1} X \tag{10}$$
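The step from Eq. (9) to Eq. (10) relies only on linearity, which can be checked numerically. In the sketch below, a simple finite-difference derivative (`np.gradient`) stands in for Savitzky-Golay differentiation, and a known mixing matrix stands in for the one estimated by ICA; both substitutions, as well as the synthetic spectra, are assumptions made to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 300)
# Two hypothetical sources and eight mixtures, X = A S.
S_true = np.stack([np.exp(-0.5 * ((grid - c) / 0.04) ** 2) for c in (0.3, 0.6)])
A_true = rng.uniform(0.5, 1.5, size=(8, 2))
X = A_true @ S_true

# Differentiation along the wavenumber axis is linear, so the derivative data
# share the mixing matrix of the original data.
dX = np.gradient(X, grid, axis=1)
dS = np.gradient(S_true, grid, axis=1)
derivative_is_linear = np.allclose(dX, A_true @ dS)

# Eq. (10): the pseudo-inverse of the mixing matrix, applied to the original
# (non-differentiated) data, returns the sources themselves.
S_rec = np.linalg.pinv(A_true) @ X
```

In practice the mixing matrix is only known up to the scaling and permutation discussed in Section 2.1.1, so the recovered sources inherit the same ambiguity.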
The element-wise square of $S_I$ is used as initial guess for MCR-ALS. We use such a matrix for two reasons. First, ICA is not constrained to provide non-negative solutions, 11,15,17 hence some (or all) sources retrieved by ICA may be negative; such negative entries slow the convergence of MCR-ALS, which, if the solution is not unique, can even converge to an incorrect solution. Second, taking the element-wise square usually improves the strength of the signal over the noise. The input to the MCR-ALS algorithm is the actual spectra, X, rather than their derivative, which allows non-negativity constraints to be enforced; the outputs are the final mixing matrix, $A_M$, and its associated source matrix, $S_M$.
To proceed with species identification, one must create a library of spectra of pure species. To construct the library, we have measured the spectra of pure analytes in water at a known molar concentration. With this point, we compute the linear relation between the spectrum and the concentration, enforcing that a species not present in the mixture has zero molar concentration. The calibration line obtained in this manner uses a single experimental concentration (hence one-point calibration) and will be used in Step 3. Note that, to obtain the spectrum at 1 M, we divide the measured spectrum by its associated known concentration, relying on the assumption of the Beer-Lambert law. The robustness of one-point calibration is improved by the fact that we do not use a single value of the spectrum (e.g. the maximum of the peak), but rather the whole spectrum, therefore mitigating minor deviations from the regime of validity of the Beer-Lambert law.
Theoretically, the estimated sources and their associated actual spectra are equivalent up to an arbitrary scaling and they should overlap when the separation has been correctly carried out. Given the library, if one envisages the concentration of each pure species as a random variable, then the intensities at the different wavenumbers in its spectral response correspond to the realizations of such a random variable; the same can be thought of for the sources. Therefore, one can compute the correlation coefficients, γ, between each normalized library spectrum and each normalized source spectrum and create a correlation matrix, with as many rows as pure species and as many columns as sources (from MCR-ALS). For each row, the highest value of γ identifies the matching pure species. 18
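The labelling by correlation can be sketched as follows. The function name `match_sources`, the synthetic library, and the artificial permutation and scaling applied to build the sources are hypothetical and serve only to illustrate the matching rule; the Pearson correlation coefficient is insensitive to scaling, so no explicit normalization is needed in this sketch.

```python
import numpy as np

def match_sources(library, sources):
    """Correlation matrix (library species x sources) and best source per species."""
    n_lib = library.shape[0]
    # np.corrcoef stacks both sets of rows; keep the library-versus-source block.
    gamma = np.corrcoef(library, sources)[:n_lib, n_lib:]
    return gamma, np.argmax(gamma, axis=1)

rng = np.random.default_rng(2)
grid = np.linspace(0.0, 1.0, 300)
# Hypothetical library of three reference spectra.
library = np.stack([np.exp(-0.5 * ((grid - c) / 0.03) ** 2) for c in (0.2, 0.5, 0.8)])

# Sources as a BSS algorithm might return them: permuted, arbitrarily scaled, noisy.
perm = [2, 0, 1]
scale = np.array([3.0, 0.5, 10.0])
sources = scale[:, None] * library[perm] + 0.01 * rng.standard_normal(library.shape)

gamma, best = match_sources(library, sources)
```

Here `best[i]` is the index of the source assigned to library species `i`. A threshold on the maximum of γ (not shown) would be one way to flag sources that match no library entry.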
Note that this procedure may not associate each source with an actual species, either because the library does not contain the species or because the identified source is not a real chemical species. The latter case occurs, for instance, when ICA splits the contribution of one species into two or more sources due to high noise levels, resulting in $\hat{n}_K > n_K$; the opposite scenario may occur (i.e. $\hat{n}_K < n_K$) due to peak overlap. Other instances where spurious sources can be identified may occur when a change in temperature and/or pH takes place, or when the addition of a new species causes nonlinear interactions with a species already present. In all these cases, the actual spectra of the pure species may undergo changes in intensity and/or drifts, i.e. nonlinear behaviors. However, since the model is forced to be linear, they are interpreted as the appearance of a new independent source.
2.2.3 Step Three: Compositions Estimation
We now estimate species compositions. Ideally, BSS methods yield both the source and the mixing matrix, but the inherent ambiguity (see Section 2.1) means that this typically does not occur. Most algorithms are constructed in such a way that the mixing matrix A retrieved from either ICA or MCR-ALS does not even retain the relative proportions among the species. This is because in reconstructing the signal the product of the two matrices is important, rather than their individual entries. To alleviate this problem, Chen et al. 17 proposed a calibration step, during which one estimates a matrix B such that C = BA, where C is the concentration matrix and A is the mixing matrix. However, such a calibration can be an effective solution only in a laboratory, off-line framework, not for on-line process control: the appearance of a new species or a concentration outside the calibration range would require a new calibration campaign. We have adopted a different approach and exploited the pre-constructed library used in Step 2, where spectra have been recorded at 1 M. With the species identified in Step 2, one constructs the matrix L, containing the spectra of the identified species, and solves the inverse problem of Eq. (1):
$$X L^{-1} = G \propto C \tag{11}$$
where, more precisely, each row of G is a multiple of the corresponding row of C, and:
$$\chi_{ik} = \frac{G_{ik}}{\sum_{k=1}^{\hat{n}_K} G_{ik}} \quad \forall i = 1, \ldots, n_N \tag{12}$$
where $\chi_{ik}$ represents the molar fraction of species k in measurement i. Note that, because of linearity, any error and uncertainty in X propagates linearly to C and χ, with $L^{-1}$ corresponding to the relative local sensitivity. Note also that G represents an estimate of the concentration matrix, but there are two limitations for its direct use. First, the solution is affected by the noise in X. Second, it assumes that the one-point calibration is valid for each measuring device, independent of the fact that the reference spectra and the measurements may be obtained with different machines, thus neglecting the device-specific bias. By bias, we mean here the dependence of the spectrum on the intensity of the excitation source: since this relationship is a property of the specific instrument used, so are the absolute values of a spectrum as well as the one-point calibration. Let us suppose that all the spectra measured with the same device are affected by the same type and amount of bias. Then the relative intensities, i.e. the ratio between two characteristic peaks of pure species (or between the areas underneath the spectra), are inherent properties of the materials and should not be changed by bias. For these reasons, the estimate of the mole fractions should be more robust than that of the molar concentrations.
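Equations (11) and (12) and the argument about device bias can be illustrated with a short numerical sketch. Since L is not square, the inverse is taken here as the Moore-Penrose pseudo-inverse; the function name `mole_fractions`, the synthetic library, and the modelling of device bias as a single multiplicative gain are assumptions made for the example.

```python
import numpy as np

def mole_fractions(X, L):
    """Eq. (11): G = X L^+ (pseudo-inverse); Eq. (12): row-normalize G."""
    G = X @ np.linalg.pinv(L)
    chi = G / G.sum(axis=1, keepdims=True)
    return G, chi

rng = np.random.default_rng(3)
grid = np.linspace(0.0, 1.0, 300)
# Hypothetical library: three reference spectra, taken as recorded at 1 M.
L = np.stack([np.exp(-0.5 * ((grid - c) / 0.03) ** 2) for c in (0.2, 0.5, 0.8)])
C_true = rng.uniform(0.5, 2.0, size=(5, 3))

# Assumed device bias: the measuring instrument scales every spectrum by an unknown gain.
gain = 0.7
X = gain * (C_true @ L)

G, chi = mole_fractions(X, L)
chi_true = C_true / C_true.sum(axis=1, keepdims=True)
```

In this idealized case G equals the true concentrations multiplied by the unknown gain, whereas the mole fractions are recovered exactly, which is the reason for preferring them given above.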
3 Results
3.1 Simulant and Experimental Conditions
Low-level nuclear waste is composed of more than 20 known species. 1 However, most of its mass consists of water and a limited number of sodium and potassium salts.
In typical laboratory studies, actual radioactive waste is replaced with non-radioactive simulant mixtures that contain the relevant ions in proportions such that the chemical and physical properties are very similar to those of the actual material (see for example Nassif et al. 35 ). In this work, we have chosen to study a simulant formed by aqueous solutions of five sodium salts, 36 namely sodium phosphate (Na₃PO₄), sulfate (Na₂SO₄), nitrite (NaNO₂), carbonate (Na₂CO₃), and nitrate (NaNO₃). Raman and IR spectra of the anions and water are reported in Figure 1; the sodium ion is neither Raman- nor IR-active. The simultaneous use of IR and Raman spectroscopy for on-line in situ monitoring could also offer several advantages. First, since some species have different Raman and IR activities, their combined use allows a larger number of species to be monitored; e.g. PO₄³⁻ is weakly Raman-active, but strongly IR-active. Second, they are independent methods providing an effective way to cross-check the results of BSS for species identification and composition estimates. Third, Raman spectroscopy can also detect solid material, thus allowing possible identification of the onset of precipitation, which may be problematic in waste processing.
Figure 1: The Raman (left) and IR (right) spectra of the pure species measured at 1 M.
3.2 Simulated Data
3.2.1 Data Generation
We tested the three-step procedure on data generated via computer simulations, which by construction comply with the hypotheses of linearity and linear superposition. The spectra of such simulated mixtures have been produced using the measured Raman and IR spectra of each of the five sodium salts and of water (see Figure 1) constituting the simulant. For each species in each mixture, the value of concentration is a random number drawn from a Gaussian distribution centered around a mean µ (see Table 1) with variance $\sigma^2 = (\kappa\mu)^2$, where $\kappa = \sigma/\mu$ is the coefficient of variation. Each set of synthetic data forms a ($n_N \times n_L$) matrix, $X_c$. To study the effect of inherent sample variability on the decomposition performance, we simulated two types of mixtures, one with $\kappa_1 = 0.10$ and another with $\kappa_2 = 0.01$. We also performed simulations with κ > 0.10, namely 0.25, 0.50 and 0.70, which are representative of the values often used during calibration. The results (reported in Section SI-1.4 of the Supplementary Information) did not differ qualitatively from those reported for $\kappa_1$, indicating that once the dispersion of the data is sufficiently large (or conversely the information sufficiently high) the algorithm's performance does not improve.
245
Each set of simulated data consisted of 15 mixtures (i.e. nN = 15) and here we consider
246
the simulations of Raman spectra. Figure 2 illustrates typical examples of data with the
247
chosen values of κ; the insets in each plot show a magnification of one characteristic peak of
248
NO2 – to illustrate how the data sets change with κ.
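The data-generation step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `simulate_mixtures` is hypothetical, the pure-species spectra are placeholder Gaussian peaks rather than the measured spectra of Figure 1, and the spectra are assumed to scale linearly with concentration. The means are those of Table 1.

```python
import numpy as np

def simulate_mixtures(pure_spectra, mu, kappa, n_mixtures=15, seed=0):
    """Return (C, Xc): Gaussian concentrations and noise-free mixture spectra.

    pure_spectra: (n_K, n_L) array, one spectrum per species (per unit concentration).
    mu:           length-n_K mean concentrations.
    kappa:        coefficient of variation, so that sigma = kappa * mu.
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    sigma = kappa * mu
    # One row per mixture, one column per species.
    C = rng.normal(loc=mu, scale=sigma, size=(n_mixtures, mu.size))
    # Linear superposition: each mixture spectrum is a weighted sum of pure spectra.
    Xc = C @ pure_spectra
    return C, Xc

# Placeholder pure spectra (assumption): six Gaussian peaks on an arbitrary axis.
n_L = 200
axis = np.linspace(0.0, 1.0, n_L)
centers = np.linspace(0.1, 0.9, 6)
pure_spectra = np.exp(-0.5 * ((axis[None, :] - centers[:, None]) / 0.02) ** 2)

mu = [0.6, 0.6, 1.85, 1.25, 1.85, 55.0]  # mol/L, from Table 1
C, Xc = simulate_mixtures(pure_spectra, mu, kappa=0.10)
```

Changing `kappa` from 0.10 to 0.01 narrows the spread of the rows of `C` around `mu`, which is the effect the insets of Figure 2 illustrate.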
3.2.2 Noise

Actual data are affected by noise, which we assumed to be additive to $X_c$:

$$X = X_c + \eta \qquad (13)$$

where $\eta \in \mathbb{R}^{n_N \times n_L}$ is the noise matrix and $X$ the input matrix for the three-step procedure. The noise, acting at each wavelength, is Gaussian and white, i.e. generated from a multivariate Gaussian distribution $\mathcal{G}(0, \sigma_\eta^2 I)$. The noise covariance matrix, in which $I$ is the identity matrix, is controlled by the constant $\sigma_\eta^2$, i.e. the variance of the noise; implicit in this formulation is the assumption that the noise is a stationary property of the measurement device and system of interest, so that the average noise intensity does not change over time or between samples. Furthermore, the noise intensity does not depend on the wavenumber.

Table 1: The mean value, $\mu$, and the coefficients of variation, $\kappa$, used for the creation of simulated data.

Species              PO$_4^{3-}$   SO$_4^{2-}$   NO$_2^-$   CO$_3^{2-}$   NO$_3^-$   H$_2$O
$\mu$ [mol L$^{-1}$] 0.6           0.6           1.85       1.25          1.85       55
$\kappa_1$           0.10          0.10          0.10       0.10          0.10       0.10
$\kappa_2$           0.01          0.01          0.01       0.01          0.01       0.01

Figure 2: Examples of typical sets of simulated Raman spectra based on the references in Figure 1. Each set contains 15 random mixtures: the one on the left with $\kappa = 0.10$, the one on the right with $\kappa = 0.01$. Na$^+$ is their common counter-ion. The insets show a magnification of a nitrite peak to illustrate the spectra variability for different values of $\kappa$.
We validated our assumptions on the noise against ad hoc experimental data for our system. The importance of noise with respect to the signal of interest is usually measured by the signal-to-noise ratio, SNR. We will use two different types of signal-to-noise ratios in the discussion of our results. The first is a local signal-to-noise ratio, $\mathrm{SNR}_l$, which measures the ratio of the intensity at a specific wavenumber $\lambda_l$ to the average noise:
$$\mathrm{SNR}_l = \frac{X_{il}^2}{\sigma_\eta^2} \qquad (14)$$
The second one is the average signal-to-noise ratio, SNR, defined as:

$$\mathrm{SNR} = \frac{1}{n_L}\sum_{l=1}^{n_L} \mathrm{SNR}_l = \frac{1}{n_L\,\sigma_\eta^2}\sum_{l=1}^{n_L} X_{il}^2 = \langle \mathrm{SNR}_l \rangle \qquad (15)$$
where $\langle \mathrm{SNR}_l \rangle$ indicates the arithmetic average of the local signal-to-noise ratio. By specifying the value of SNR, one can compute for each set of simulated data the noise covariance and generate an appropriate noise matrix $\eta$. Since SNR can span several orders of magnitude, we report its value (and similarly for $\mathrm{SNR}_l$) in decibels (dB), where the value of SNR in decibels is defined as $10 \log_{10}(\mathrm{SNR})$.
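The step from a target SNR to a noise matrix can be sketched as below. This is an illustrative reading of Eqs. (13)-(15), not the authors' implementation: the function name `add_white_noise` is hypothetical, and the average in Eq. (15) is assumed here to run over all entries of $X_c$ (all mixtures and all wavenumbers). The placeholder `Xc` stands in for a simulated data set.

```python
import numpy as np

def add_white_noise(Xc, snr_db, seed=0):
    """Add white Gaussian noise so that the average SNR equals snr_db.

    The noise variance follows from SNR = mean(Xc**2) / sigma2, with the
    decibel value converted back via SNR = 10**(snr_db / 10).
    """
    rng = np.random.default_rng(seed)
    snr_linear = 10.0 ** (snr_db / 10.0)
    sigma2 = np.mean(Xc ** 2) / snr_linear
    eta = rng.normal(0.0, np.sqrt(sigma2), size=Xc.shape)
    return Xc + eta, sigma2

# Placeholder noise-free data (assumption): 15 "spectra" of 400 points each.
Xc = np.abs(np.sin(np.linspace(0.0, 3.0, 15 * 400))).reshape(15, 400) + 1.0
X, sigma2 = add_white_noise(Xc, snr_db=50.0)

# Recover the average SNR in dB from the noise variance actually used.
snr_check_db = 10.0 * np.log10(np.mean(Xc ** 2) / sigma2)
```

The same `sigma2` is used at every wavenumber and for every sample, which mirrors the stationarity assumption stated for the noise.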
3.2.3 Analysis of Simulations

Our investigations focus on the performance of the three-step procedure under progressively noisier conditions. It is well known (and rather intuitive) that noise deteriorates the performance of ICA and MCR-ALS algorithms, 11,15 and it is therefore important to determine the level of noise above which the results of Blind Source Separation are no longer reliable. Concerning the algorithms, we have chosen FastICA, developed by Hyvärinen and co-workers 11 and known to be efficient and robust, as the ICA algorithm, and pyMCR, developed by Camp and freely downloadable from the PyPI project website (https://pypi.org/project/pyMCR), for MCR-ALS. The noisy data are first centered and scaled via Pareto scaling, 37 then the Savitzky-Golay filter is used to compute the derivative of the spectra from the simulated data (to which noise was added). Note that the Savitzky-Golay filter not only yields an estimate of the derivative, but also mitigates the effect of noise thanks to its smoothing properties. The smoothing effect increases with the window size, i.e. for larger values of the number of points $S_w$ used by the algorithm. Unfortunately, values of $S_w$ that are too large distort the signal shape by broadening the peaks, reducing their intensity, and causing the maxima to drift 34,38,39: an optimal trade-off exists between noise removal and signal distortion. Additionally, when peaks are partially or totally overlapping, windows that are too broad hinder a complete peak resolution. Note that the sources obtained from ICA are corrected to account for the scaling and centering step, prior to applying MCR-ALS. We have investigated a broad range of SNR values, from 110 dB (practically a noise-free system) to 20 dB (a very noisy one), and of $S_w$ values, from 3 (the minimum value to compute a second derivative of degree two) to 39 (roughly the width of the Raman nitrate peak). For each set of parameters, we have generated 100 simulations.
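The two pre-processing operations named above, Pareto scaling and Savitzky-Golay differentiation, can be sketched as follows. This is a hedged illustration: `pareto_scale` and `sg_derivative` are hypothetical helper names, the polynomial order (2) and derivative order (1) are assumptions not stated in this section, and `scipy.signal.savgol_filter` is used in place of whatever implementation the authors employed. Pareto scaling is taken in its usual sense: each variable is mean-centered and divided by the square root of its standard deviation.

```python
import numpy as np
from scipy.signal import savgol_filter

def pareto_scale(X):
    """Center each column and divide it by the square root of its standard deviation."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    scale = np.sqrt(np.where(std > 0, std, 1.0))  # leave constant columns unscaled
    return (X - mean) / scale, mean, scale

def sg_derivative(X, window=11, polyorder=2, deriv=1, delta=1.0):
    """Savitzky-Golay derivative along the wavenumber axis (rows are spectra)."""
    return savgol_filter(X, window_length=window, polyorder=polyorder,
                         deriv=deriv, delta=delta, axis=1)

# Pareto scaling on placeholder data: 15 samples, 50 spectral points.
rng = np.random.default_rng(0)
X = rng.normal(5.0, 2.0, size=(15, 50))
Xs, mean, scale = pareto_scale(X)

# Sanity check of the derivative on a parabola, where a degree-2 fit is exact.
x = np.linspace(0.0, 1.0, 101)
parabola = (3.0 * x ** 2 + 2.0 * x)[None, :]
d = sg_derivative(parabola, window=11, polyorder=2, deriv=1, delta=x[1] - x[0])
```

Returning `mean` and `scale` from `pareto_scale` makes it possible to undo the centering and scaling later, as the text requires before the ICA sources are passed to MCR-ALS. Increasing `window` strengthens the smoothing at the price of the peak broadening discussed above.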
279
As discussed in Section 2.2, first we need to determine the number of species, n ˆ K , which
280
we do by looking at the singular values of X for the system with κ = 0.10. We applied
281
Criterion 1 in Eq. (6) and plotted on the left side of Figure 3 the logarithms of singular
282
values, Sk , and, on the right side, the associated fraction of explained variance, V, as functions
283
of the singular values, k. The color shades from black to red indicate that SN R decreases
284
from 110 db to 20 dB, while the dashed vertical lines visualize the condition k = 6, i.e.
285
the actual number of species in the system. If we set α1 = 0.99, the algorithm determines
286
that n ˆ K = 6 for SN R ∈ [50, 110] dB, while for SN R < 50 dB, n ˆ K > 6. By inspecting
287
directly the singular values associated with SN R ≥ 50 dB, one sees that for k > 6 they
288
are almost constant and much smaller than those for k ≤ 6, thus suggesting that they are
289
describing the (low level of) noise. On the contrary, when the noise becomes more important
290
(SN R < 50 dB), overshadowing the information of interest, the number of singular values
291
necessary to describe the system increases. Since the noise is random and independent from
292
one measurement to another, the effective rank of X increases.
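An explained-variance rule of this kind can be sketched as below. Eq. (6) is not reproduced in this section, so the exact form is an assumption: the sketch picks the smallest $k$ whose cumulative fraction of squared singular values reaches $\alpha_1$. The function name `n_components_by_variance` and the synthetic test matrix are hypothetical.

```python
import numpy as np

def n_components_by_variance(X, alpha=0.99):
    """Smallest k whose cumulative explained variance is at least alpha."""
    s = np.linalg.svd(X, compute_uv=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    # First index where the cumulative fraction reaches alpha (0-based), plus one.
    k = int(np.searchsorted(explained, alpha)) + 1
    return min(k, s.size), s, explained

# Synthetic matrix with six large singular values and nine small ones (assumption).
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.normal(size=(15, 15)))
V, _ = np.linalg.qr(rng.normal(size=(100, 15)))
s_true = np.array([10, 9, 8, 7, 6, 5] + [0.01] * 9, dtype=float)
X = (U * s_true) @ V.T  # X = U diag(s_true) V^T

k, s, explained = n_components_by_variance(X, alpha=0.99)
```

In this toy case the nine small singular values play the role of low-level noise; raising them toward the sixth value mimics the low-SNR regime in which the rule returns more than six components.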
Figure 3: On the left, the base-10 logarithm of the singular values, $S_k$, for a typical matrix $X$; on the right, the explained variance, $V$; both $S_k$ and $V$ are given as functions of $k$. The color from black to red indicates an increasingly noisy system, with SNR decreasing from 110 dB to 20 dB, while the dashed vertical lines mark the condition $k = 6$, i.e. the actual number of species in the system.
The use of Criterion 2 in Eq. (7), shown in Figure 4 (right) with $\alpha_2 = 10^{-3}$, leads to conclusions qualitatively similar to those from Criterion 1. Figure 4 (left) shows the difference $\Delta\varepsilon_1$ between two consecutive norms (i.e. Criterion 3 in Eq. (8)) for $p = 1$ and $\alpha_3 = 10^{-4}$; the color shades from black to blue indicate a decrease of SNR. For high values of SNR, it is apparent that setting $\hat{n}_K > 6$ improves the reconstruction of the original data matrix only marginally ($\Delta\varepsilon_1$ almost zero). Conversely, for SNR < 35 dB, consistently with the results provided by Criteria 1 and 2, the noise covers the actual signal and the number of singular values needed to reproduce the original data correctly increases. Therefore, based on these criteria, we have chosen to set $\hat{n}_K = 6$. Note that at high SNR levels, Criterion 3 provides the best guidance for the selection of $\hat{n}_K$, while Criteria 1 and 2 can lead to an erroneous selection of $\hat{n}_K$.
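A reconstruction-error rule in the spirit of Criteria 2 and 3 can be sketched as follows. Eqs. (7) and (8) are not shown in this section, so the definitions used here are assumptions: $\varepsilon_1(k)$ is taken as the entry-wise $L_1$ norm of the residual of the rank-$k$ truncated SVD, relative to the $L_1$ norm of $X$, and the selected $k$ is the first one for which adding a further singular value lowers $\varepsilon_1$ by less than $\alpha_3$. All function names are hypothetical.

```python
import numpy as np

def l1_reconstruction_errors(X):
    """Relative L1 error of the rank-k truncated SVD, for k = 1..min(X.shape)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    norm_X = np.abs(X).sum()
    errors = np.empty(s.size)
    for k in range(1, s.size + 1):
        Xk = (U[:, :k] * s[:k]) @ Vt[:k]
        errors[k - 1] = np.abs(X - Xk).sum() / norm_X
    return errors

def select_by_error_gain(errors, alpha=1e-4):
    """First k such that going from k to k + 1 components gains less than alpha."""
    gain = errors[:-1] - errors[1:]
    below = gain < alpha
    return int(np.argmax(below)) + 1 if below.any() else errors.size

# Noise-free synthetic matrix of exact rank six (assumption).
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.normal(size=(15, 15)))
V, _ = np.linalg.qr(rng.normal(size=(100, 15)))
s6 = np.array([10, 9, 8, 7, 6, 5], dtype=float)
X = (U[:, :6] * s6) @ V[:, :6].T

errors = l1_reconstruction_errors(X)
k_hat = select_by_error_gain(errors, alpha=1e-4)
```

Adding noise to `X` before calling `l1_reconstruction_errors` lets one reproduce, qualitatively, the behavior described above: the gain beyond six components stays near zero at high SNR and grows as the noise increases.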
Figure 4: On the left, the value of $\varepsilon_1$, measured by the $L_1$-norm, as a function of the SNR, for an increasing number of singular values (from black, $k = 1$, to blue, $k = 15$). The horizontal dashed black line indicates the threshold $\alpha_2$. On the right, the rate of change of the reconstruction error, $\Delta\varepsilon_1$, measured by the $L_1$-norm relative to the $L_1$-norm of the original data set, as a function of the number $k$ of singular values used.
After determining $\hat{n}_K$, we can run the BSS algorithm and proceed towards spectra identification. We set $\hat{n}_K = 6$ for all levels of noise and allow the system to retrieve the spectra of all species. Recognize, though, that the algorithm selects $\hat{n}_K > 6$ for high levels of noise to satisfy both Criteria 2 and 3.
310
We first inspect the identifiability of pure species, using the correlation coefficient, γ,
311
between the reference spectrum and the spectrum produced by the algorithm. In the case
312
of phosphate, nitrite, and water, the values of γ > 0.90 extend from the noiseless region
313
(105 dB) up to about 50 dB, for all values of Sw . When SN R decreases below 50 dB, γ
314
rapidly decreases below 0.50, eventually dropping to zero for the lowest values of SN R,
315
where the sources corresponding to these three species incorporate features from the noise.
316
On the contrary, sulfate (which has an inherently stronger signal, but low concentration) and
317
carbonate (weaker than sulfate, but in greater concentration) are retrieved overall better,
318
with values of γ decreasing, but never reaching zero, only for SN R below 30 dB. Finally, 19
ACS Paragon Plus Environment
Industrial & Engineering Chemistry Research 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
319
nitrate is the only component always retrievable and the values of γ associated with it are
320
above 0.90 in all conditions explored. Contour maps of average gamma in the (SN R, Sw )-
321
plane for all the species in the mixture (namely phosphate, sulfate, nitrite, carbonate, nitrate,
322
and water) are reported in the Supplementary Information (Figure SI-2).
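The matching of recovered sources to library spectra via the correlation coefficient can be sketched as below. This is an assumption-laden illustration: `match_sources` is a hypothetical helper, $\gamma$ is taken to be the Pearson correlation coefficient, and each reference is simply assigned the source with the highest correlation. The reference peaks are placeholders.

```python
import numpy as np

def match_sources(references, sources):
    """For each reference spectrum, return the best-matching source and its gamma."""
    n_ref = references.shape[0]
    # Cross-correlation block between the rows of references and the rows of sources.
    corr = np.corrcoef(references, sources)[:n_ref, n_ref:]
    best = corr.argmax(axis=1)
    gamma = corr[np.arange(n_ref), best]
    return best, gamma

# Placeholder references (assumption): three well-separated Gaussian peaks.
axis = np.linspace(0.0, 1.0, 300)
centers = np.array([0.2, 0.5, 0.8])
references = np.exp(-0.5 * ((axis[None, :] - centers[:, None]) / 0.03) ** 2)

# "Recovered" sources: the references in shuffled order plus a little noise.
rng = np.random.default_rng(0)
perm = [2, 0, 1]
sources = references[perm] + 0.01 * rng.normal(size=references.shape)

best, gamma = match_sources(references, sources)
```

A low maximum $\gamma$ for a given reference, such as the values below 0.10 mentioned later for phosphate and water, would signal that no recovered source corresponds to that species.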
323
The identifiability of different species is easily rationalized from the results in terms
324
of SN Rl , rather than SN R. In the simulant, nitrate and sulfate are very Raman active,
325
while nitrite, carbonate, and water are moderately Raman active, and phosphate is only
326
weakly active; additionally, the (average) concentration of phosphate and sulfate is much
327
lower compared to that of the other species. For these reasons, the peak contribution of
328
phosphate is much smaller than that of either nitrate or sulfate: the variations in the signals
329
of phosphate can be close to the noise level and lost, even though its peak does not overlap
330
with the signals of nitrate and sulfate. A similar analysis holds for water: although its
331
concentration is high, its inherent Raman activity (in the region accessible to our device) is
332
small, so that its peak is much smaller compared to the other species. In spite of the fact
333
that the average SN R seems quite strong (30 dB indicates roughly that average signal is 30
334
times stronger that the noise), it is actually a mean between extremely strong contributions
335
(due to nitrate, and in second order sulfate, carbonate, and nitrite) and weak ones (due to
336
phosphate and water, whose signal is as intense as the noise, i.e. SN Rl ≈ 5 dB even if
337
SN R = 30 dB).
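As a quick check of the decibel values quoted above, using the definition $10\log_{10}(\mathrm{SNR})$ with SNR a ratio of squared intensity to noise variance (Eq. (14)), the conversion back to linear ratios reads:

$$
30\ \mathrm{dB}:\quad \mathrm{SNR} = 10^{30/10} = 10^{3}, \qquad \sqrt{\mathrm{SNR}} \approx 31.6;
\qquad\qquad
5\ \mathrm{dB}:\quad \mathrm{SNR} = 10^{5/10} \approx 3.16, \qquad \sqrt{\mathrm{SNR}} \approx 1.78 .
$$

Under this reading, "30 times stronger" refers to the amplitude ratio $\sqrt{\mathrm{SNR}}$, and a local value of 5 dB corresponds to a peak less than twice as large as the noise standard deviation.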
338
It is important to recall that the tolerable level of noise is also determined by the inherent
339
variability of each data set, measured by κ. Intuitively, the variability of the spectra due
340
to actual differences in concentration may be overshadowed by the variability due to noise
341
for the sets of data, in which κ is sufficiently small. To illustrate this issue, we compare in
342
Figure 5 the reference spectra with the sources recovered with our procedure at SN R = 50
343
dB, for κ = 0.10 (left) and κ = 0.01 (right). The reference spectra are reported as dashed
344
black lines, with the estimated sources in dashed color lines. While for κ = 0.10 the match is
345
almost perfect (as expected from the high values of γ), for κ = 0.01 the spectra of phosphate
20
ACS Paragon Plus Environment
Page 20 of 40
Page 21 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Industrial & Engineering Chemistry Research
346
and of water are no longer recognized (the best matches between the sources and the actual
347
spectra have γ < 0.10). Even sulfate and nitrite are more affected by noise: spurious peaks
348
belonging to nitrate and sulfate appear in the source associated with nitrite, while a spurious
349
peak from nitrate appears in the sources associated with sulfate. However, in both cases the
350
correlation coefficient with the correct spectra is still larger than 0.80.
Figure 5: Comparison between the reference spectra of the pure species (black solid lines) and their corresponding estimates from ICA/MCR-ALS (dashed colored lines), for simulations at κ = 0.10 and κ = 0.01, in the left and right plots, respectively. The level of noise was set to 50 dB, and the Sw parameter to 11. The effect of noise is clear for the data in the right plot. One sees spurious bumps in the BSS spectra of sulfate and carbonate, due to the incomplete separation between each other and with nitrite. Moreover, the spectra of nitrite and phosphate are no longer identified by comparing the BSS spectra with the reference spectra, hence no dashed line for either species is reported.
After discussing the performance in terms of species identifiability, we turn to the estimates of compositions, using Eq. (12). We consider $\kappa = 0.10$, where all species can be correctly identified, and compare three values of SNR, namely 100, 70, and 50 dB, with $S_w$ set to 11 in all cases. Figure 6 shows the parity plots with the actual compositions used to generate the spectra, $\chi_o$, along the abscissa and their associated estimates, $\chi_m$, along the ordinate. The dashed black line in the ($\chi_o$, $\chi_m$)-plane represents a perfect estimate, the dashed blue lines an estimate within ±10% error, and the dashed red lines within ±20% error. It is apparent that the estimates are rather good, even in the presence of moderate noise (50 dB), when all species are identified.
361
We have also looked at the influence of a set’s inherent variability by simulating the
362
case with κ = 0.01. The results are quite similar to those shown in Figure 6: the figure
363
for κ = 0.01 can be found in the Supplementary Information (Figure SI-6). The main
364
qualitative difference for κ = 0.01 is observed for the highest level of noise (SN R = 50 dB),
365
where PO43 – and NO2 – are no longer correctly identified: consequently, the fractions of
366
the remaining species deviate from their actual values.
Figure 6: Estimation of the composition for a set of 15 mixtures, with $\kappa = 0.10$, with SNR = 100, 70, 50 dB, from left to right; the values have been computed enforcing the spectra nonnegativity and choosing a Savitzky-Golay window of 11 for all values of SNR. The black, blue, and red dashed lines in each plot indicate a perfect match, the ±10% boundaries, and the ±20% boundaries, respectively. H$_2$O, NO$_3^-$, NO$_2^-$, CO$_3^{2-}$, SO$_4^{2-}$, PO$_4^{3-}$ are reported as red, black, light blue, violet, orange, and green symbols, respectively.

3.3 Experimental Data
We tested the performance of the three-step procedure with simulant solutions made of the five sodium salts used to obtain the simulated data, namely sodium phosphate (Na$_3$PO$_4$), sulfate (Na$_2$SO$_4$), nitrite (NaNO$_2$), carbonate (Na$_2$CO$_3$), and nitrate (NaNO$_3$), plus water, i.e. $n_K = 6$. Raman and IR spectra of simulant solutions at different concentrations were collected simultaneously at a constant temperature $T = 298$ K, thus providing complementary, but independent information. Pre-processing consisted of removing the effect of cosmic rays (de-spiking) for the Raman spectra, followed by baseline correction for both Raman and IR data; the pre-processed Raman (left) and IR (right) data sets are reported in Figure 7. The data fed to the ICA algorithm were also pre-processed with Pareto scaling. Further details about pre-processing are reported in the Supplementary Information.
The three-step procedure was carried out separately on Raman and IR measurements. All Raman spectra were collected using a MarqMetrix All-in-one Raman System, with a 785 nm laser, at 300 mW laser power and 30 s integration time. ATR-FTIR spectra were collected using a ReactIR 10 ATR-FTIR system from Mettler Toledo. The experimental measurements were conducted in a 250-mL vessel stirred at 400 rpm to ensure well-mixed conditions. The set of data used to run the algorithm comprises 18 samples, of which 14 had different compositions and 4 were pure water; the values of the mole fractions are reported in the Supplementary Information (Table SI-1). Note that the values of $\kappa$ for the experimental data (excluding the measurements with only water) vary between 0.6 and 0.8.
Figure 7: The pre-processed spectra used in the three-step procedure from Raman (left) and IR (right) measurements.
3.3.1 Species Identification and Composition Estimation
We applied the three-step procedure as we did for the simulated data. Inspection of the singular values, using either Criterion 2 or 3 of Section 2.2.1, suggests setting $\hat{n}_K = 6$ for both Raman and IR spectra (see also Figure 8, where the vertical dashed lines indicate the condition $k = 6$); values of $\hat{n}_K > 6$ do not significantly improve the reconstruction of the original spectra.
Figure 8: The error norm $\varepsilon_1$ as a function of the number of singular values, $k$, used to determine $\hat{n}_K$, for the set of data measured with Raman (left) and IR (right). The vertical dashed lines correspond to $k = 6$.
Analysis of the IR spectra associates each source with a different species used to produce the simulant, with $\gamma > 0.95$ for every species. However, the analysis of the Raman data did not identify any source to associate with phosphate; instead, two other non-identical sources were both identified as belonging to nitrate. To understand this behavior, we examined the SNR, since a major factor hindering identifiability is the noise (see Section 3.2): the SNR for the IR data is about 50 dB, whereas for Raman it is only 30 dB. Moreover, the maximum value of $\mathrm{SNR}_l$ corresponding to the peak of phosphate is about 45 dB for the IR data, but only about 5 dB for the Raman data. This indicates that, essentially, the contribution of phosphate to the whole spectrum and its variability are covered by the noise, to the point that they are lost in the Raman measurements. After identifying all species by cross-checking the Raman and IR results, we can use Eqs. (11) and (12) to compute the composition with the spectra from the library. The estimates of the mole fractions, for both Raman and IR data, are reported in Figure 9 in the form of parity plots. The actual numerical values can be found in the Supplementary Information, Tables SI-1, SI-2, and SI-3.
Figure 9: Parity plot showing the estimates of our three-step procedure (vertical axis) against the actual composition of a simulant mixture (horizontal axis); the blue and red dashed lines represent ±10% and ±20% deviations from the experimental values. The plots along the left column show the data from Raman measurements, while those along the right column show those from IR measurements. The number of species identified is set to $\hat{n}_K = 6$. The colors for water, phosphate, sulfate, nitrite, carbonate, and nitrate are red, orange, green, light blue, violet, and black, respectively. We have reported the water (upper row) separately from the other components (lower row) for clarity, since most of the mixture (mole-wise) is made of this substance.

3.3.2 Error Analysis
Even though the estimates in Figure 9 mostly lie within a ±20% interval (red dashed lines) of the actual concentrations, it is important to verify our explanation for the incorrect identification obtained from Raman measurements alone. With the matrix $G$ identified using Eq. (11), the sampled data set can be reconstructed and compared with the measured set, and we can compute the error as the distance between the actual spectra and their reconstructions. The reconstructed and the actual sets match perfectly only under two conditions: first, the library spectra truly represent the spectra of the pure species, and, second, linear superposition and the Beer-Lambert law hold.
We adopt the simplest possible metric to quantify the distance between the reconstructed data and the measured data, namely the difference between each entry of $X$ and its corresponding entry of $GL$, and analyze this error on the data from both Raman and IR measurements.
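The element-wise error can be sketched as follows, under stated assumptions: `reconstruction_error` is a hypothetical helper, $G$ is taken as the matrix of coefficients and $L$ as the library of pure spectra (one per row), and the library consists of placeholder Gaussian peaks whose positions are illustrative only. The example shows how leaving one species out of the reconstruction leaves a residual peak at that species' band.

```python
import numpy as np

def reconstruction_error(X, G, L):
    """Element-wise difference between the measured data X and the reconstruction G L."""
    return X - G @ L

# Placeholder library (assumption): six Gaussian bands on a 1 cm^-1 grid.
axis = np.linspace(800.0, 1200.0, 401)
centers = np.array([850.0, 920.0, 980.0, 1049.0, 1067.0, 1100.0])
L = np.exp(-0.5 * ((axis[None, :] - centers[:, None]) / 5.0) ** 2)

rng = np.random.default_rng(0)
G = rng.uniform(0.1, 1.0, size=(14, 6))  # 14 hypothetical samples
X = G @ L                                # ideal, noise-free measurements

# Full library: the residual vanishes.
E_full = reconstruction_error(X, G, L)

# Library without the second species (index 1): its band shows up in the residual.
keep = [0, 2, 3, 4, 5]
E_missing = reconstruction_error(X, G[:, keep], L[keep])
peak_location = axis[np.abs(E_missing).mean(axis=0).argmax()]
```

With real data the residual would also contain noise and any deviation from linear superposition, which is what the discussion of the nitrate region below relies on.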
Overall, the error for the IR spectra (reported in Figure SI-7 in the Supplementary Information) is very small, and it follows a pseudo-sinusoidal pattern along the wavenumber coordinate, which is likely due to baseline pre-processing. The error for the Raman measurements is reported in Figure 10: on the left side, we show the error when the reconstruction does not account for phosphate (i.e. using only the results of Raman measurements); on the right side, the error accounts for phosphate, whose presence was identified by IR. The insets highlight the region where the phosphate peak is located (about 920 cm$^{-1}$). Note that the peaks clearly visible in the left plots disappear on the right; nevertheless, the improvement is minimal. Only two deviations, one negative and one positive, are very significant. These two deviations have about the same magnitude: the negative one corresponds to the peak of reference nitrate (about 1049 cm$^{-1}$), the positive one to the spurious source identified by MCR-ALS (about 1053 cm$^{-1}$). The correlation coefficient analysis attributes the spurious source to nitrate ($\gamma = 0.89$).
Figure 10: The error between the measured and the reconstructed spectra for Raman measurements, computed using the element-wise difference. On the left, we report the case where phosphate is not used for the reconstruction, using only the results from Raman measurements. On the right, we show the case where phosphate is included, taking into account the results from IR measurements.
A careful analysis of the spectra revealed that the error spikes in the nitrate region are correlated with the presence of nitrite, which induced a drift in the nitrate peak. This drift violates one of the assumptions of the Beer-Lambert law that hold for all other species. Drifts in the peaks are typically associated with complex interactions among ionic species. For example, Sun Qin 40 reported such an effect in carbonate-water solutions, while Ahmed et al. 41 have shown that in aqueous solution the stretch band of the hydroxide anion located at 3400 cm$^{-1}$ changes in the presence of nitrate, sulfate, and phosphate. Previous works 42–44 have shown that the peak of nitrate at 1048 cm$^{-1}$ shifts towards higher wavenumbers with increasing temperature and with increasing nitrate concentration. We ruled out effects due to temperature and pH, since both were monitored and did not change in our experiments. We also found that repeating the experiment at an overall lower concentration still showed a shift in the spectrum. A detailed investigation of what causes this peak shift is beyond the scope of this work and will be addressed separately, since this phenomenon may require the use of nonlinear BSS to improve the composition estimates.
3.3.3 BSS versus Alternative Approaches

One could question the need to utilize BSS techniques when a library providing one-point calibration is available. In fact, using the library, the inverse problem of Eq. (1) could be solved directly with classical least-squares (CLS) or the least absolute shrinkage and selection operator (LASSO), thus obtaining the concentration matrix $C$. However, our procedure is more advantageous than CLS and LASSO because it determines the number of species and the shape of their spectra using only the experimental measurements. It does so at an insignificant additional computational cost. In contrast, CLS and LASSO could determine the number of relevant components in the system only after estimating the concentration matrix, i.e. during a post-processing step. Theoretically, all species not present in the mixture yield zero entries in the concentration matrix $C$, which should thus be sparse. In reality, because of noise, most entries in the concentration matrix estimated by CLS are small, but non-zero and sometimes negative. To select the actual components, a criterion to discriminate between entries due to noise and actual low concentrations is required. LASSO partly alleviates the issues of CLS via regularization and yields as sparse a $C$ as possible, thus also performing species determination. Even with LASSO, one must still determine the level of regularization, e.g. by Bayesian inference. In addition to the issues mentioned above, our procedure is superior to CLS and LASSO when one or more species are not included in the library. CLS and LASSO cannot determine the shape of spectra missing from the library: this information has to be inferred by inspecting the residuals in post-processing. BSS techniques are able to estimate all relevant spectra independently of the existence and correctness of the library.
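The behavior of CLS described above can be illustrated with a small sketch. It is not the authors' implementation: `cls_concentrations` is a hypothetical helper, Eq. (1) is assumed to have the linear form $X = CL$ with the library spectra as rows of $L$, and the library and concentrations are synthetic. Two of the six library species are absent from the mixtures, yet their estimated concentrations do not come out exactly zero once noise is present.

```python
import numpy as np

def cls_concentrations(X, L):
    """Classical least squares: minimize ||X - C L|| over C, given library L."""
    # lstsq solves L^T C^T = X^T column by column.
    C_T, *_ = np.linalg.lstsq(L.T, X.T, rcond=None)
    return C_T.T

# Placeholder library (assumption): six Gaussian bands on a 1 cm^-1 grid.
axis = np.linspace(800.0, 1200.0, 401)
centers = np.array([850.0, 920.0, 980.0, 1049.0, 1067.0, 1100.0])
L = np.exp(-0.5 * ((axis[None, :] - centers[:, None]) / 5.0) ** 2)

# Ten mixtures containing only the first four species.
rng = np.random.default_rng(0)
C_true = np.zeros((10, 6))
C_true[:, :4] = rng.uniform(0.5, 2.0, size=(10, 4))
X = C_true @ L + 0.01 * rng.normal(size=(10, axis.size))

C_hat = cls_concentrations(X, L)
absent = C_hat[:, 4:]  # estimates for the two species that are not in the mixtures
```

The entries of `absent` are small but non-zero, and with symmetric noise they can come out negative, which is why a separate thresholding criterion is needed before CLS can be used to decide which species are present.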
As a proof of concept for the potential of BSS techniques, let us briefly look at the results obtained by re-running our procedure on the experimental Raman data discussed in Section 3.3.1, but removing the reference carbonate spectrum from the library. Since the BSS part of the algorithm does not depend on the library, the spectra reconstructed by the BSS procedure are not affected by the missing carbonate, as shown in Figure 11, where the errors between the measured and reconstructed spectra are reported. CLS and LASSO do not provide any conclusive information about the shape of the missing spectrum, since the error due to nonlinearities overlaps with the signal associated with carbonate, but BSS still performs very well and its residual error corresponds to noise. Qualitatively similar results can be obtained by removing nitrite or sulfate from the library: the corresponding Figures SI-9 and SI-10 are provided in the Supplementary Information.
Figure 11: On the left column, we have reported the element-wise error between the measured Raman spectra and the reconstructed ones using BSS, CLS, or LASSO. The BSS residual is basically background noise, whereas CLS and LASSO capture neither the peak drift of nitrate nor the peak of carbonate.
Additionally, the BSS part of the algorithm provides a clearer insight into how the spectra of the pure species should look, as one can see in Figure 12, where the vertical dashed lines mark the locations of the characteristic peaks of the actual species in the mixture. Note that the color code matches the one we used to report the reference spectra in Figure 1. The dashed spectrum $S_4$, which does not match any real species, represents the spurious component associated with the nitrate peak drift. Interestingly, the BSS spectrum $S_3$ associated with nitrite also presents a small bump at the location (about 1051 cm$^{-1}$) of the spurious component peak: this result suggests that the nitrate peak drift is linked with the presence of nitrite in solution. We also recall that no BSS peak is associated with phosphate (see Section 3.3.1): nevertheless, the BSS spectrum $S_1$ (associated with sulfate) shows a small bump corresponding to the phosphate peak location (about 940 cm$^{-1}$).
Figure 12: The spectra of the independent relevant species recovered by the BSS algorithms after analyzing the experimental data. The dashed lines indicate one characteristic peak for each substance, using the same color code as in Figure 1. The dashed spectrum labeled as S4 represents the spurious component associated with the shift of nitrate peak. Note that the spectrum S3 associated with nitrite exhibits a small peak at the location of the spurious component, suggesting a correlation between the shift and the presence of nitrite.
4 Summary and Conclusions
Estimating solution composition, including the identity and concentration of solutes, is of great importance in managing and processing low-level radioactive waste at the Hanford site. Here we demonstrate the potential use of IR and Raman spectroscopy to achieve these goals. However, the unique features of radioactive wastes make standard calibration methods difficult to implement, which led us to develop a three-step procedure based on blind source separation (BSS) techniques. The methodology assumes the validity of the Beer-Lambert law and of linear superposition of the spectra of pure species.
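The linear-superposition assumption can be written, in the notation of the List of Symbols, as X = C L + η, with X the matrix of mixture spectra, C the concentrations, L the pure-species spectra, and η the noise. The following minimal sketch builds synthetic data under that assumption and recovers C by ordinary least squares when L is known. The Gaussian peaks, their positions, the noise level, and all function names are hypothetical and chosen only for illustration; this is not the three-step procedure itself.

```python
import numpy as np

def gaussian_peak(wavenumber, center, width):
    """Synthetic single-peak spectrum (assumed Gaussian line shape)."""
    return np.exp(-0.5 * ((wavenumber - center) / width) ** 2)

rng = np.random.default_rng(0)
wavenumber = np.linspace(800.0, 1200.0, 401)

# Hypothetical library L of pure-species spectra, shape (n_K, n_L).
L = np.vstack([
    gaussian_peak(wavenumber, 900.0, 6.0),
    gaussian_peak(wavenumber, 1000.0, 6.0),
    gaussian_peak(wavenumber, 1040.0, 6.0),
])

# Hypothetical concentrations C, shape (n_N, n_K), and white noise eta.
C_true = rng.uniform(0.1, 1.0, size=(20, 3))
eta = 0.01 * rng.standard_normal((20, wavenumber.size))

# Linear superposition of pure-species spectra: X = C L + eta.
X = C_true @ L + eta

# With L known, estimate C by least squares: solve L^T C^T = X^T.
C_est = np.linalg.lstsq(L.T, X.T, rcond=None)[0].T

max_abs_error = np.max(np.abs(C_est - C_true))
print(f"largest concentration error: {max_abs_error:.4f}")
```

When L is incomplete or unknown, this direct regression is no longer available, which is the situation a blind source separation step is meant to address.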
The results, with simulated and experimental data, demonstrate that the proposed procedure is efficient and robust in identifying the species and estimating their relative concentrations, even in the presence of noise and/or moderate deviations from linearity. Moreover, BSS techniques are useful in determining the presence of unexpected species (those not in the library) and facilitate estimation of the spectra of such species, a task not easily achieved with other methods. The estimated BSS spectra can be used to scan larger databases and identify the best candidate species to expand the library. Therefore, in a data-driven and computationally efficient manner, it is possible to gain deeper insight into the system to be analyzed using only limited prior knowledge of it.
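Scanning a database with an estimated spectrum can be sketched as a correlation search. The example below is a minimal illustration under stated assumptions: the function name match_to_library, the 0.9 threshold, and the synthetic Gaussian spectra are hypothetical, and the absolute value of the correlation coefficient is used because separated sources are generally recovered only up to sign and scale. It is not presented as the labeling step used in this work.

```python
import numpy as np

def match_to_library(estimated, library, threshold=0.9):
    """For each estimated spectrum, return (library index or None, |correlation|).

    estimated: array (n_estimated, n_L); library: array (n_library, n_L).
    An index is returned only if the best |correlation| reaches the threshold.
    """
    n_est = estimated.shape[0]
    # np.corrcoef stacks the rows of both arrays; the upper-right block
    # holds the correlations between estimated and library spectra.
    corr = np.corrcoef(estimated, library)[:n_est, n_est:]
    matches = []
    for row in np.abs(corr):
        best = int(np.argmax(row))
        score = float(row[best])
        matches.append((best if score >= threshold else None, score))
    return matches

# Synthetic demonstration with hypothetical Gaussian spectra.
wavenumber = np.linspace(800.0, 1200.0, 401)

def peak(center, width=6.0):
    return np.exp(-0.5 * ((wavenumber - center) / width) ** 2)

library = np.vstack([peak(900.0), peak(1000.0), peak(1100.0)])

rng = np.random.default_rng(0)
estimated = np.vstack([
    -2.0 * library[1] + 0.01 * rng.standard_normal(wavenumber.size),  # flipped, rescaled
    peak(950.0),                                                      # not in the library
])

matches = match_to_library(estimated, library)
for i, (index, score) in enumerate(matches):
    print(f"estimated spectrum {i}: library index = {index}, |corr| = {score:.3f}")
```

A spectrum that matches no library entry above the threshold is a candidate for an unexpected species, or for a spurious component, and could be compared against a larger database in the same way.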
List of Symbols

Symbol    Meaning
X         Spectral intensity of a mixture
C         Molar concentration
a         Absorbance
T         Temperature
S_w       Number of window points of Savitzky-Golay filter
S_p       Degree of the polynomial of Savitzky-Golay filter
λ         Wavenumber
χ         Mole fraction
η         Gaussian white noise
σ²        Variance
µ         Expected value, mean
κ         Coefficient of variation
γ         Correlation coefficient
ε_p       Relative p-norm between congruent matrices
[?]       Element-wise error between congruent matrices
SNR       Signal-to-noise ratio
SNR_l     Local signal-to-noise ratio
X         Matrix of spectral intensities of a set of mixtures
C         Matrix of concentrations of a set of mixtures
L         Matrix of spectra of pure species
A         Mixing matrix of a BSS
S         Source matrix of a BSS
η         Noise matrix
S         Singular values
n_N       Number of measurements of a set of mixtures
n_K       Number of species in a set of mixtures
n̂_K       Estimated number of species in a set of mixtures
n_L       Number of sampling points of the discretized spectra
Acknowledgments
Financial support from the Consortium for Risk Evaluation with Stakeholder Participation (CRESP) is gratefully acknowledged. The authors are also thankful to Michael Stone, Richard Wyrwas, and the Real-Time, in Line Monitoring Program Group at Savannah River National Laboratory for providing the simulant recipe and useful discussions.
For Table of Contents Use Only
Title: Analysis of Multicomponent Ionic Mixtures using Blind Source Separation - a Processing Case Study
Authors: Giovanni Maria Maggioni, Stefani Kocevska, Ronald W. Rousseau, and Martha A. Grover
Synopsis: We have developed a blind source separation procedure for low-level nuclear waste processing that identifies the number of species in an aqueous mixture, labels them with respect to a reference library, and determines their relative concentrations. We have tested the procedure against simulated and experimental data for a mixture of water plus five sodium salts, using both Raman and ATR-FTIR measurements.