Local Rank Deficiency Caused Problems in ... - ACS Publications

Jan 23, 2017 - Local Rank Deficiency Caused Problems in Analyzing Chemical Data. Mahsa Akbari Lakeh, Róbert Rajkó, and Hamid Abdollahi

Article

On local rank deficiency caused problems in analysing chemical data Mahsa Akbari Lakeh, Róbert Rajkó, and Hamid Abdollahi Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.6b03134 • Publication Date (Web): 23 Jan 2017 Downloaded from http://pubs.acs.org on January 23, 2017


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


On local rank deficiency caused problems in analysing chemical data

Mahsa Akbari Lakeh a, Róbert Rajkó b, Hamid Abdollahi a,*

a Faculty of Chemistry, Institute for Advanced Studies in Basic Sciences, P.O. Box 45195-1159, Zanjan, Iran

b Institute of Process Engineering, Faculty of Engineering, University of Szeged, Moszkvai krt. 5-7, H-6725 Szeged, Hungary

* Corresponding author at: Faculty of Chemistry, Institute for Advanced Studies in Basic Sciences, Gavazang Road, Zanjan, Iran. Tel.: +98 24 33153122; fax: +98 24 33153232. E-mail address: [email protected] (H. Abdollahi)


Abstract

Multivariate curve resolution (MCR) is a powerful methodology for analyzing chemical data in application fields such as pharmaceutical analysis, agriculture, food chemistry, environmental analysis, and industrial and clinical chemistry. However, MCR results are often complicated by rotational ambiguity, meaning that there is a range of feasible solutions that fulfill the constraints and explain the observed experimental data equally well. Constraints determine the properties of the resolved profiles in MCR methods by enforcing different assumptions on the data. The constraints applied to chemical data sets should be derived from the physical nature of the system under study and from prior knowledge about it. Therefore, the reliability of the constraints is a critical aspect that analytical chemists who use MCR methods should consider in order to obtain accurate results. Local rank information plays a key role in the curve resolution of multicomponent chemical systems. Applying the local rank constraint can considerably reduce the extent of rotational ambiguity, and in some cases unique solutions can be achieved. Local rank exploratory methods such as Evolving Factor Analysis (EFA) provide local rank maps in order to obtain the presence pattern of the components, on the main assumption that the number of components in each window is equal to its rank. This work shows that the local rank is a mathematical concept that may not be in concordance with chemical information. Thus, applying the local rank constraint to restrict the rotational ambiguity in MCR methods can lead to incorrect solutions. This problem is due to “local rank deficiency”, which is introduced in this contribution.


Keywords: Local rank deficiency; Local rank constraint; Borgen plot; Multivariate curve resolution; Rotational ambiguity.

INTRODUCTION

The rapid growth of advanced instruments that produce huge amounts of multivariate data, together with the availability of inexpensive, powerful computers, has made chemometrics part of solving most analytical chemistry problems. Multivariate Curve Resolution (MCR)1,2 has emerged as a powerful tool in chemometrics for extracting useful chemical information from multivariate data sets. MCR methods are extensively applied to a wide range of data generated by numerous different processes for both qualitative and quantitative purposes.3-6 In many cases, MCR is the only way currently available for obtaining the pure composition and spectra from chemical measurements. However, the main drawback of these methods is rotational ambiguity,7 meaning that there is a range of feasible solutions for the species profiles that explain the observed experimental data equally well while fulfilling the applied constraints.

Several approaches have been proposed in the literature for calculating the range of feasible solutions of a specific system.1,8-13 One of the oldest chemometric procedures in this area is the Self-Modeling Curve Resolution (SMCR) approach, introduced by Lawton and Sylvestre (LS) in 1971 for decomposing two-component systems.1 Since then, many efforts were made to generalize the LS method to more complex mixtures, but it was not until 1985 that Borgen and Kowalski proposed an analytical solution for the resolution of three-component systems.8 The interested reader can find a more detailed history of the early years in the paper of Hamilton and Gemperline.14

Rotational ambiguity can significantly affect the accuracy of the results, so calculating the range of feasible solutions and evaluating the range of prediction errors is highly recommended in all MCR applications. Ahmadi and Abdollahi15 investigated the effect of rotational ambiguity on the accuracy of quantitative results obtained by MCR methods. When the range of feasible solutions is wide, implementing more effective restrictions can reduce the extent of rotational ambiguity and improve the quantitative results. The constraints applied to chemical data should always be correct and reliable; otherwise, some true feasible profiles will be eliminated or, in some cases, a wrong unique solution will be obtained. Consequently, studying the reliability and effectiveness of constraints can help analytical chemists who use MCR methods to obtain more accurate results.

Various constraints carry different weights in restricting rotational ambiguity and systematic errors. Among them, local rank has been recognized as the most important constraint for achieving true solutions since the very beginning of curve resolution studies.7 It has been used in many MCR applications to considerably reduce the extent of rotational ambiguity or, in the most favorable cases, to find unique solutions. In fact, if the conditions of Manne’s resolution theorems16 are satisfied in a specific system, unique solutions will be obtained by applying the local rank constraint.17

In general, rank is an algebraic concept defined as the number of linearly independent vectors whose linear combinations can construct all columns/rows of a data matrix. The rank of a data matrix is one of the first and most important pieces of information revealed by factor analysis methods.
In chemical data analysis, the rank of a data matrix is usually attributed to the number of chemical components in the system. However, there are various experimental situations where the rank of a data matrix is less than (rank deficiency) or greater than (rank excessiveness) the number of components. Such data matrices cannot be analyzed directly, and different approaches have been proposed to deal with them.18,19 Furthermore, the sub-matrices of a data matrix can be subjected to factor analysis methods in order to determine the local rank information. In the literature, it is common to suppose that the local rank is either equal to or greater than (because of noise) the number of chemical components present in the considered rows or columns. For instance, it is common to assume just one component in a local rank-one window. If this assumption is correct, tracking the local rank by local rank exploratory methods, such as EFA20 or other algorithms based on EFA,21-23 can provide essential knowledge such as selective zones and zero-concentration zones for components. Ultimately, it is assumed that the presence window of each component (e.g., elution windows in the case of chromatographic data) can be obtained from the local rank information.

What happens to the results of MCR methods based on local rank information if these assumptions are incorrect? Researchers who apply MCR methods in analytical chemistry would want to know in which cases these assumptions are acceptable and in which cases they should be avoided. In this contribution, the local rank constraint is studied in detail, and the concordance of the local rank information with the presence windows of components is examined in bilinear chemical data sets. Two simulated data examples and an experimental data set are used to clarify the discussion.

THEORETICAL BACKGROUND

General theory

The main assumption of all MCR methods is the bilinearity of the measured data matrix, so that it can be decomposed into the product of two physically meaningful matrices as follows:

$$R_{m,n} = C_{m,p}\, S^{T}_{p,n} \qquad (1)$$

where $R$ (m×n) is the measured bilinear data matrix, the columns of $C$ (m×p) contain the concentration profiles, and the rows of $S^{T}$ (p×n) contain the pure instrumental response profiles of the p pure components present in the mixture. Superscript T denotes the transpose of a matrix, i.e., interchanging its rows and columns. Given a measured data matrix $R$, MCR methods try to obtain both the $C$ and $S$ matrices. A frequent starting point for this purpose is to decompose the data matrix by Singular Value Decomposition (SVD) or Principal Component Analysis (PCA) according to Eq. (2):

$$R_{m,n} = U_{m,p}\, D_{p,p}\, V^{T}_{p,n} = X_{m,p}\, V^{T}_{p,n} = U_{m,p}\, Y^{T}_{p,n} \qquad (2)$$

where the columns of $U$ (m×p) and the rows of $V^{T}$ (p×n) contain the left and right singular vectors of $R$, respectively; $D$ (p×p) is a diagonal matrix of singular values; $X$ (m×p) is a matrix containing the coordinates of the row vectors in row space (V space); and $Y^{T}$ (p×n) is a matrix containing the coordinates of the column vectors in column space (U space). The SVD provides unique solutions, $U$ and $V^{T}$ (as long as the singular values are non-degenerate), but these are not the chemically meaningful results sought in curve resolution, as shown in Eq. (1). However, the abstract solutions $X$ and $V^{T}$ are related to the matrices $C$ and $S^{T}$, respectively, according to the following equation:

$$R_{m,n} = (X_{m,p}\, T_{p,p})(T^{-1}_{p,p}\, V^{T}_{p,n}) = C_{m,p}\, S^{T}_{p,n} \qquad (3)$$

$T$ (p×p) is a non-singular transformation matrix that transforms the abstract solutions ($X$ and $V^{T}$) into the concentration and pure response profiles. If no particular scale is prescribed for $C$ or $S$, an infinite number of solutions is possible for the above equation. The true scale can only be determined from calibration samples of known concentrations. Thus, a specific normalization can be used for either $C$ or $S$ to find the solutions. In most cases, $C$ and $S$ cannot be determined uniquely even after a proper normalization, meaning that there are several sets of feasible solutions obtained through a range of different transformation matrices. All the possible solutions fulfill the constraints and fit the data equally well, so there is no reason to prefer one over another unless more information is provided. This intrinsic indeterminacy of all MCR methods is well known as rotational ambiguity. Different kinds of restrictions on the feasible solutions, such as local rank, unimodality of concentration profiles, and equality, usually decrease the range of rotational ambiguity, and in some cases a set of unique solutions can be obtained.24-26

Borgen/Rajkó Plot

The Borgen/Rajkó8,10 plot is a comprehensive geometrical tool that brings out the details of a three-component system. A typical Borgen plot (shown in Fig. 3) consists of three important sections: the inner polygon, the outer polygon, and the feasible region(s). The inner polygon is a convex polygon (convex hull) that encloses all the data points in the desired reduced abstract subspace (U- or V-subspace) after a proper normalization.27 The outer polygon is again a convex polygon; it defines the boundary between the positive and negative parts of the reduced abstract subspace. All the retransformed points (i.e., the profiles) inside this polygon have no negative values. The feasible region(s) lies between the inner and outer polygons and contains all the possible solutions of the curve resolution problem. In the case of distinct feasible regions, the retransformed points of each feasible region produce the permissible band for a specific component.
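Eqs. (1) to (3) and the resulting rotational ambiguity can be illustrated numerically. The following is a minimal sketch, assuming NumPy and randomly generated non-negative profiles (hypothetical data, not the data sets of this work). It recovers the matrix T that maps the abstract solution onto the generating profiles and then shows that a second non-singular matrix, here called `T2`, reproduces R equally well. The names `M`, `T2`, `C2` and `St2` are illustrative only, and the second solution is not checked for non-negativity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical non-negative profiles: m rows, n columns, p components.
m, n, p = 20, 30, 2
C = rng.random((m, p))        # concentration profiles (columns of C)
St = rng.random((p, n))       # pure response profiles (rows of S^T)
R = C @ St                    # Eq. (1): bilinear data matrix

# Eq. (2): SVD truncated to the p significant factors.
U, d, Vt = np.linalg.svd(R, full_matrices=False)
U, d, Vt = U[:, :p], d[:p], Vt[:p, :]
X = U * d                     # X = U D, coordinates of the rows in V space

# Eq. (3): since V^T V = I, S^T V = T^{-1}, so T = (S^T V)^{-1}.
T = np.linalg.inv(St @ Vt.T)
C_rec = X @ T                 # equals C
St_rec = np.linalg.inv(T) @ Vt  # equals S^T

# Rotational ambiguity: another non-singular matrix fits R equally well.
M = np.array([[1.0, 0.3],
              [0.2, 1.0]])    # arbitrary illustrative mixing matrix
T2 = T @ M
C2 = X @ T2
St2 = np.linalg.inv(T2) @ Vt
print(np.allclose(C2 @ St2, R), np.allclose(C2, C))
```

Constraints such as non-negativity or local rank act by excluding some of these transformation matrices; the sketch only shows that the fit to R alone cannot distinguish between them.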


EXPERIMENTAL SECTION

Two simulated data sets are investigated in this contribution. The first is a typical noise-free data set, which is used for a basic evaluation of the concordance of the local rank concept with the presence windows of the chemical species. In the second simulated example, the effect of implementing the local rank constraint on the quantitative results of MCR methods is studied. Moreover, an experimental data set was used to extend the discussion to real cases. The simulations were carried out in the MATLAB environment on a personal computer. The SMCR calculations for obtaining the Borgen plots and feasible solutions were made with the FAC-PACK software (version 1.20, generalized Borgen plots module). This software can be freely downloaded at http://www.math.uni-rostock.de/facpack/. The data sets are introduced in more detail below.

Simulated example data 1

The first simulated data set is obtained from the chromatograms and spectra shown in Fig. 1, corresponding to a mixture of three co-eluting components in a chromatographic system monitored with a diode array detector. The dimension of the simulated data set is 34 rows (elution times) by 51 columns (wavelengths). As can be deduced from the elution and spectral profiles, the rows of the data matrix contain local rank windows of three different ranks: one, two, and three.

Figure 1. (a) Simulated elution profiles (Conc. versus retention times); (b) simulated spectral profiles (Abs. versus wavelength).

Simulated example data 2

A second-order multivariate data set is generated, similar to the previous example, to simulate a mixture of two co-eluting components in a chromatographic system. The pure spectra and elution profiles of the components are shown in Fig. 2. It is assumed that one of the components is known in this experiment. Four calibration samples are therefore built with the single analyte at nominal concentrations of 1, 1.5, 2, and 2.5 (in arbitrary units). To allow simultaneous analysis of the mixture and the standard samples, the individual data matrices were arranged in a column-wise augmented matrix (keeping the wavelengths in common). As all the generated data sets are of equal size, the resulting augmented data matrix has 85 rows (5 samples × 17 elution times) and 91 columns (wavelengths). To produce conditions close to those of real systems, a random homoscedastic white noise matrix was created with zero mean and a standard deviation of 0.007 times the maximum value of the signal. This noise matrix was then added to the simulated augmented data matrix.

Figure 2. Simulated elution profiles (Conc. versus retention times), absorption spectra (molar abs. versus wavelength), and augmented data matrix (abs. versus wavelength and retention time).
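The construction of the augmented, noisy data matrix described for simulated example data 2 can be sketched as follows. This is a minimal illustration assuming NumPy; the Gaussian peak shapes, their positions and widths, the 300 to 400 wavelength axis, and the helper name `gaussian` are hypothetical choices and are not the profiles used in this work. Only the matrix sizes, the calibration levels, the mixture concentration of 1.8 (a.u.) taken from Table 1, and the noise level follow the text.

```python
import numpy as np

rng = np.random.default_rng(1)

def gaussian(x, center, width):
    """Gaussian peak used as a stand-in for a pure profile (assumption)."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

t = np.arange(17)                    # 17 elution times per sample
wl = np.linspace(300, 400, 91)       # 91 wavelengths (axis range assumed)

analyte_elution = gaussian(t, 7, 2)
interferent_elution = gaussian(t, 10, 2)
analyte_spectrum = gaussian(wl, 340, 15)
interferent_spectrum = gaussian(wl, 365, 15)

# Mixture: analyte at 1.8 (a.u.) plus the unknown interferent.
mixture = (1.8 * np.outer(analyte_elution, analyte_spectrum)
           + np.outer(interferent_elution, interferent_spectrum))

# Four calibration samples containing only the analyte.
standards = [c * np.outer(analyte_elution, analyte_spectrum)
             for c in (1.0, 1.5, 2.0, 2.5)]

# Column-wise augmentation: samples stacked on top of each other,
# wavelengths kept in common.
augmented = np.vstack([mixture] + standards)

# Homoscedastic white noise, zero mean, sd = 0.007 * maximum signal.
noise_sd = 0.007 * augmented.max()
noise = rng.normal(0.0, noise_sd, size=augmented.shape)
noisy = augmented + noise
print(noisy.shape)
```

Placing the mixture block first is an arbitrary ordering chosen for the sketch.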

Experimental data

The experimental data set was taken from a published paper,28 an investigation of the dissolution properties of meloxicam-mannitol binary systems. From the whole data matrix, two dissolution experiments were selected: one sample containing a physical mixture of meloxicam and mannitol in the ratio of 3:7 (w/w) (ME1), and one sample of mannitol (Mannit2; amount as in the 3:7 mixture), each dissolved in artificial enteric juice (900 ml) with a pH of 7.5 (± 0.1) at 37 °C (± 0.5 °C). Samples were taken after 5, 10, 20, 30, 60 and 90 min, and the spectra of the filtered sample solutions were measured in the UV-Vis range, i.e., from 220 to 436 nm in steps of 4 nm (Helios α Spectronic, Unicam, Cambridge, UK). The 2×6 process spectra were augmented by the spectrum of the solvent, and the collection formed the 13 rows of the data matrix used for the analyses.

RESULTS AND DISCUSSION

This section focuses mainly on the concordance of the local rank information with the presence windows of components in bilinear data matrices. The agreement of the local rank information with the actual presence windows of the components in noise-free matrices is of great importance, because the local rank constraint is applied on the assumption that this concordance holds. For instance, when selectivity is applied as a constraint, exactly one component is forced to account for the rows or columns with local rank one. The main question arises here: is it possible for more than one component to participate in building the local rank-one window? More generally, can rank deficiency be present in the sub-windows of a data matrix?

The simulated example data 1

Fig. 3 shows the Borgen plot of the first example data set in row space. In this plot, each circle represents the spectrum at a specific retention time, so the number of circles is equal to the number of rows of the data. However, spectra with an identical shape are located at the same position in row space. The three distinct feasible regions in Fig. 3 illustrate that rotational ambiguity exists for all three components. A range of feasible spectra can therefore be defined for each component under the non-negativity constraint. MCR methods usually try to find a set of feasible spectra that can reconstruct the data with corresponding non-negative concentration profiles. In the Borgen plot, a feasible set of solutions can be defined by any three points, each located in the feasible region of one component, that form a triangle enclosing all the data points.29,30 This gives a set of non-negative spectra and concentration profiles. Three typical sets of solutions are shown in Fig. 4 (a, b and c).

Figure 3. Borgen plot of the simulated data set (axes X2 and X3). The hollow circles indicate the rows of the data in row space.
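The statement that spectra of identical shape fall on the same point of the row-space plot can be checked with a short sketch. It assumes NumPy, hypothetical Gaussian spectra, and one common choice of normalization, namely dividing the second and third row coordinates by the first; the text only refers to "a proper normalization", so this particular scaling is an assumption, and the names `gaussian`, `C_mix`, `C_pure` and `points` are illustrative.

```python
import numpy as np

def gaussian(x, center, width):
    """Gaussian band used as a stand-in for a pure spectrum (assumption)."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

wl = np.linspace(450, 600, 51)
St = np.vstack([gaussian(wl, 480, 15),
                gaussian(wl, 520, 20),
                gaussian(wl, 560, 15)])

# Hypothetical concentrations: five mixture rows, then three rows that
# contain only component III at different concentration levels.
rng = np.random.default_rng(2)
C_mix = rng.random((5, 3)) + 0.1
C_pure = np.array([[0.0, 0.0, 0.4],
                   [0.0, 0.0, 1.0],
                   [0.0, 0.0, 2.5]])
R = np.vstack([C_mix, C_pure]) @ St

# Row coordinates in the three-dimensional V space: X = U D.
U, d, Vt = np.linalg.svd(R, full_matrices=False)
X = U[:, :3] * d[:3]

# Assumed normalization: scale every row so that its first coordinate is 1
# and plot the remaining two coordinates.
points = X[:, 1:3] / X[:, [0]]
print(points[5:])   # the three single-component rows coincide
```

Rows that differ only by a scale factor are mapped to one point, which is why a whole local rank-one window appears as a single circle in the plot.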

As seen in Scheme 1, different local rank windows are present in the rows of the simulated data: the first eight spectra of the data set (corresponding to the first eight elution times) form a window with rank two, the next 14 spectra form a window with rank three, the spectra in rows 23 to 28 again form a window with rank two, and finally the last six rows form a window with rank one.

Scheme 1. (a) One of the possible chromatographic elution patterns for the simulated data set; (b) local rank windows in the rows of the data matrix.
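The local rank windows listed above can be reproduced on a synthetic data set with the same presence pattern. The sketch below assumes NumPy and hypothetical Gaussian elution and spectral profiles (not the profiles of Fig. 1); the helper `numerical_rank` counts the singular values above a relative threshold, which is one simple way to estimate the rank of a noise-free submatrix and is not the EFA algorithm.

```python
import numpy as np

def gaussian(x, center, width):
    """Gaussian peak used as a stand-in for a pure profile (assumption)."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

def numerical_rank(A, rel_tol=1e-8):
    """Number of singular values larger than rel_tol times the largest."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

t = np.arange(34)                    # 34 elution times
wl = np.linspace(450, 600, 51)       # 51 wavelengths

# Elution profiles of components I, II, III with imposed presence windows
# (0-based indices): I in rows 0-21, II in rows 0-27, III in rows 8-33.
C = np.column_stack([gaussian(t, 8, 5),
                     gaussian(t, 14, 6),
                     gaussian(t, 24, 5)])
C[22:, 0] = 0.0
C[28:, 1] = 0.0
C[:8, 2] = 0.0

St = np.vstack([gaussian(wl, 480, 15),
                gaussian(wl, 520, 20),
                gaussian(wl, 560, 15)])
R = C @ St

# Row windows corresponding to rows 1-8, 9-22, 23-28 and 29-34 (1-based).
windows = [(0, 8), (8, 22), (22, 28), (28, 34)]
ranks = [numerical_rank(R[a:b]) for a, b in windows]
print(ranks)
```

For noisy data a fixed relative threshold is usually not adequate, and the choice of the number of significant singular values becomes part of the analysis.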

As stated before, the spectra of the local rank-one window have the same coordinates in row space because they all have the same shape. Thus, the six spectra of the local rank-one window are indicated as a single circle in Fig. 4 (a, b and c); the other data points are not shown for the sake of clarity. In the Borgen plot in Fig. 4, the local rank-one window is located inside the feasible region of the third component. Thus, the spectral profile of the rank-one window can be considered as the pure spectrum of the third component. Fig. 4(a) represents a set of feasible solutions in which the local rank-one window is a selective window in the data. One can find several sets of feasible solutions with this property by fixing the position of the third spectrum and changing the other two positions. As can be seen in the elution profiles in Fig. 4(a), there is a selective window for the third component (elution times 29 to 34, in arbitrary units), which shows that the contribution of the other components to the local rank-one window is zero. However, other sets of feasible solutions may not confirm this conclusion. Fig. 4(b) illustrates that a linear combination of the spectral profiles of the second and third components can construct the rows with rank one, i.e., the linear combination of points 1 and 2 in Fig. 4(b). The corresponding elution profiles confirm the presence of two components in the rank-one submatrix.

Figure 4. Positions of three possible sets of solutions with different presence patterns in the Borgen plot (axes X2 and X3), each shown with its normalized spectral profiles (Abs. versus wavelength) and normalized elution profiles (Conc. versus retention times). In these sets, (a) just one component, (b) two components, and (c) three components are present in the local rank-one window.


In the same way, the local rank-one window can be constructed from a linear combination of three different components, e.g., the points 3, 4 and 5 in Fig. 4(c). Consequently, because of rotational ambiguity, one, two, or three components may be present in this local rank-one submatrix. Prior chemical knowledge may confirm the presence of more than one component in this window. In such cases, the number of recognizable contributions in a local window is lower than the number of chemical components, and for this reason we call it “local rank deficiency”. It is very important to note that all sets of feasible solutions of a system satisfy the local rank information, but the presence patterns may differ from one set to another because of local rank deficiency. Thus, a number of different presence windows may exist within the feasible band of a component. In Figs. S1−S3 (see the Supporting Information), the feasible elution profiles are shown for each component of the simulated data. Some of the possible elution windows, with their bands, are also displayed for each component.
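Local rank deficiency can be reproduced with a small synthetic example. The sketch below assumes NumPy and hypothetical Gaussian profiles in which the spectrum of component III partly overlaps the band of component II (a design choice made so that both solutions stay non-negative; it does not represent the data of this work). Starting from a solution with a selective window for component III in the last six rows, an illustrative transformation matrix `T` produces a second non-negative solution that reproduces the same data but has two components present, in a constant ratio, in the same rank-one window.

```python
import numpy as np

def gaussian(x, center, width):
    """Gaussian peak used as a stand-in for a pure profile (assumption)."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

def numerical_rank(A, rel_tol=1e-8):
    """Number of singular values larger than rel_tol times the largest."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

t = np.arange(34)
wl = np.linspace(450, 600, 51)

# First solution: component III is alone in rows 28-33 (0-based).
C = np.column_stack([gaussian(t, 8, 5),
                     gaussian(t, 14, 6),
                     gaussian(t, 24, 5)])
C[22:, 0] = 0.0
C[28:, 1] = 0.0
C[:8, 2] = 0.0

s1 = gaussian(wl, 480, 15)
s2 = gaussian(wl, 520, 20)
s3 = gaussian(wl, 560, 15) + 0.6 * s2   # spectrum III overlaps band of II
St = np.vstack([s1, s2, s3])
R = C @ St

# Second solution: move part of component III into component II.
alpha = 0.5
T = np.eye(3)
T[2, 1] = alpha
C_alt = C @ T                      # c_II' = c_II + alpha * c_III
St_alt = np.linalg.inv(T) @ St     # s_III' = s_III - alpha * s_II

window = slice(28, 34)
rank_window = numerical_rank(R[window])
present_first = int(np.count_nonzero(C[window].sum(axis=0) > 0))
present_alt = int(np.count_nonzero(C_alt[window].sum(axis=0) > 0))
print(rank_window, present_first, present_alt)
```

Because the pure spectra are linearly independent, the rank of a row window equals the rank of the corresponding block of concentrations, so two components can share a rank-one window only when their concentration profiles are proportional there. That is exactly what the second solution exhibits.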

The simulated example data 2

To evaluate the effect of the local rank constraint on quantification problems, the simulated augmented data set was analyzed. The aim of the data analysis here is to quantify the selected analyte in the presence of an unknown species in the simulated binary mixture. This quantification can be done by simultaneous resolution of the mixture and the pure standard data of the analyte, using any MCR method that allows prediction of the analyte in the presence of unmodeled interferents. For this purpose, once the resolution is achieved, the ratio between the areas of the analyte peak in the unknown and the standard sample is calculated. The quantification is then done by multiplying this ratio by the known concentration of the analyte in the standard sample.
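The peak-area calculation described above can be sketched as follows, assuming NumPy and a hypothetical resolved analyte profile of 85 points (5 samples × 17 elution times) in which the mixture block comes first. The Gaussian peak shape, the block ordering, and the use of a plain sum as the peak area are assumptions made for illustration.

```python
import numpy as np

def gaussian(x, center, width):
    """Gaussian peak used as a stand-in for an elution profile (assumption)."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

n_times = 17
t = np.arange(n_times)
shape = gaussian(t, 8, 2)

known = [1.0, 1.5, 2.0, 2.5]     # nominal concentrations of the standards
c_mixture = 1.8                  # value used to build the example profile

# Hypothetical resolved analyte profile over the augmented direction:
# mixture block followed by the four standard blocks.
analyte_profile = np.concatenate([c_mixture * shape]
                                 + [c * shape for c in known])

# Peak area of each block, taken as the sum over equally spaced times.
areas = analyte_profile.reshape(5, n_times).sum(axis=1)

# Area ratio (unknown / standard) times the known standard concentration.
estimates = [areas[0] / areas[i + 1] * c for i, c in enumerate(known)]
print(np.round(estimates, 3))
```

With an exactly resolved profile every standard gives the same estimate; under rotational ambiguity the resolved profile, and therefore the area ratio, depends on which feasible solution is selected.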


Local rank analysis was performed on the mixture data, and the local rank constraint was then employed in the data analysis. Investigations based on feasible band calculations revealed that, in this case, the local rank constraint provides sufficient conditions for uniqueness of the analyte concentration profile. The MCR-ALS algorithm31 was employed to extract this unique profile because local rank information can be applied easily in this method. The initial estimates for MCR-ALS were obtained by the EFA method, and the imposed constraints were non-negativity of both concentration and absorption values together with the local rank information. After convergence of the MCR-ALS algorithm, the estimated concentration profile of the analyte was used for its quantification in the unknown mixture. The quantitative calculations based on peak area revealed that the concentration estimated under the local rank constraint deviates considerably from the true value (see Table 1).

Similar calculations were performed for the analyte concentration by employing only the non-negativity constraint. Since non-negativity cannot provide a unique solution in this instance, all the possible non-negative concentration profiles of the analyte were calculated and then subjected to quantification. A systematic grid search method32 was used for calculating the feasible solutions. For each feasible non-negative profile, a distinct calibration procedure was performed as before, so different values were obtained for the analyte concentration in the unknown mixture. The calculated concentrations were then sorted into a range from the minimum to the maximum value. The quantitative results and the percentage quantitation errors obtained by imposing the different constraints are summarized in Table 1. It is worth noting that both the concentration estimated by the MCR-ALS algorithm and the real value lie within this range. So, instead of choosing one solution based on local rank, all the non-negative solutions were used for quantification. The range of feasible non-negative solutions, the concentration profile estimated under the local rank constraint, and the real concentration profile are shown in Fig. 5.


In fact, the local rank constraint selects one particular concentration profile for the analyte among all the possible non-negative profiles. However, a physically correct solution is never guaranteed. Consequently, in such systems, it seems reasonable to define a range of concentrations based on physicochemically meaningful restrictions rather than a unique solution based on mathematical limitations. The defined concentration range reflects the measurement uncertainty in the presence of rotational ambiguity.

Table 1. Quantitative results by applying different constraints on the simulated example data 2

                                 Quantitative results
Constraints                      Analyte concentration    % error a
Non-negativity                   0.99 b – 2.18 c          -45.00% b – 21.11% c
Non-negativity and local rank    2.15                     19.44%

a % quantitation error = [(C obtained - C true) / C true] × 100; C true is taken as 1.8 (a.u.).
b Lower-level concentration or error.
c Upper-level concentration or error.
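As a quick arithmetic check, the error column of Table 1 can be recomputed from the reported concentrations with the formula in footnote a. The function name and dictionary labels below are illustrative only; the numbers are those given in the table.

```python
def quantitation_error(c_obtained: float, c_true: float) -> float:
    """Percent quantitation error as defined in footnote a of Table 1."""
    return (c_obtained - c_true) / c_true * 100.0

C_TRUE = 1.8  # true analyte concentration (a.u.), from Table 1

# Concentrations reported in Table 1
reported = {
    "non-negativity, lower bound": 0.99,
    "non-negativity, upper bound": 2.18,
    "non-negativity and local rank": 2.15,
}

errors = {label: round(quantitation_error(c, C_TRUE), 2)
          for label, c in reported.items()}

for label, err in errors.items():
    print(f"{label}: {err:.2f}%")
```

The recomputed values agree with the table: the true concentration lies inside the non-negativity range, while the single local-rank solution sits near its upper end.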

[Figure 5 plot: normalized elution profiles; y-axis: Conc. (0 to 1); x-axis: retention times (0 to 100).]

Figure 5. All the feasible non-negative elution profiles for the analyte. The estimated MCR-ALS profile is marked with ‘●’ and the true profile is shown as the solid line.


The experimental data set

In this section, the local rank information is assessed in a real chemical data set. The experimental data set (13 rows × 55 columns) consists of the spectrum of the solvent sample (the juice), the spectra of the mixture sample (containing the solvent, mannitol, and meloxicam), and the spectra of the standard sample of mannitol (containing mannitol and solvent).28 Scheme 2 represents the decomposition of the experimental data into the concentration and spectral profiles. All the feasible dissolution rate profiles for the three components were calculated by the Borgen approach under the non-negativity constraint. As the feasible profiles in Fig. 6 show, there are several presence windows in the band of each component. Local rank analysis was then performed on the experimental data and showed that rows 8 to 13 have local rank two. Interestingly, some sets of feasible solutions (see Fig. 6) show contributions from three components to these rows of local rank two. However, in this case, prior chemical information can reliably determine the number of components in the standard and mixture submatrices. Rows 8 to 13 belong to the spectra of the standard matrix, and surely one component (meloxicam) does not participate in constructing them.

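[Scheme 2 diagram: the data matrix (row blocks: Solvent, Mixture, Standard) equals the product of the concentration profiles, which contain zero entries, and the spectral profiles of Solvent, Mannitol, and Meloxicam.]

Scheme 2. Decomposition of the experimental data into the concentration and spectral profiles.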

So, applying the chemical knowledge in this case rules out some solutions. For this purpose, the zero-concentration-region constraint must be applied to meloxicam, which results in a considerable reduction in the feasible solutions of all components.
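The block structure described above can be mimicked with simulated data to show why the standard rows have local rank two. The sketch below is an assumption-laden illustration, not the experimental data: the Gaussian spectra, the random concentrations, and the assignment of rows 2 to 7 to the mixture samples are hypothetical, and `numerical_rank` is a helper written for this example.

```python
import numpy as np

def numerical_rank(block, rel_tol=1e-8):
    """Count singular values above rel_tol times the largest one."""
    s = np.linalg.svd(block, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(0)
wavelengths = np.linspace(200, 450, 55)

# Hypothetical Gaussian spectra for juice, mannitol, and meloxicam (rows of S)
centers = np.array([250.0, 320.0, 380.0])
S = np.exp(-((wavelengths[None, :] - centers[:, None]) / 30.0) ** 2)

# Hypothetical concentrations: 13 samples x 3 components
C = rng.uniform(0.2, 1.0, size=(13, 3))
C[0, 1:] = 0.0   # row 1: solvent sample, no mannitol or meloxicam
C[7:, 2] = 0.0   # rows 8 to 13: standard samples, no meloxicam

D = C @ S  # noise-free bilinear data, 13 x 55

rank_full = numerical_rank(D)             # whole data matrix
rank_standard = numerical_rank(D[7:13])   # rows 8 to 13 (1-based)
print(rank_full, rank_standard)
```

Here the local rank of two in the standard rows coincides with the chemical knowledge that meloxicam is absent, which is what justifies setting its concentration to zero in those rows.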


[Figure 6 plot: y-axis: relative dissolution rate (0 to 1.6); x-axis: samples 0 to 14 (Solvent, Mixture Samples); bands labelled Me, J, and Ma.]

Figure 6. Bands for the dissolution rate profiles of the three components (Ma: Mannitol, Me: Meloxicam, J: Juice) for experimental data.

More interestingly, based on rank analysis, the local rank of columns 6 to 9 is two, but none of the possible solutions in the spectral bands confirms this information (see Fig. 7). This means that local rank deficiency exists in these columns, and they are certainly constructed from three components! In this case, applying the local rank constraint may lead to wrong answers.

[Figure 7 plot: y-axis: relative absorbance (-0.2 to 1.2); x-axis: wavelength (200 to 450); bands labelled Ma, Me, and J.]

Figure 7. The spectral bands of the three components (Ma: Mannitol, Me: Meloxicam, J: Juice) for the experimental data; the narrowest band is for juice. The dashed rectangle indicates the relative absorbance values corresponding to the local rank-two window in the columns.
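Local rank deficiency in a column window can be reproduced with a small simulation in which all three components absorb in the window, yet the window has rank two. This is a hypothetical construction, not the experimental data: the random matrices and the choice of making the third spectrum a linear combination of the other two inside the window are assumptions made only to produce the effect.

```python
import numpy as np

def numerical_rank(block, rel_tol=1e-8):
    """Count singular values above rel_tol times the largest one."""
    s = np.linalg.svd(block, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(1)
C = rng.uniform(0.2, 1.0, size=(13, 3))   # all components in every row
S = rng.uniform(0.2, 1.0, size=(3, 55))   # all components absorb everywhere

# Inside the window, the third spectrum is a mix of the first two,
# so the three spectra are linearly dependent there.
window = slice(5, 9)  # columns 6 to 9 in 1-based numbering
S[2, window] = 0.5 * S[0, window] + 0.5 * S[1, window]

D = C @ S

n_present = int(np.sum(np.all(S[:, window] > 0, axis=1)))
rank_window = numerical_rank(D[:, window])
rank_full = numerical_rank(D)
print(n_present, rank_window, rank_full)
```

Reading the window rank as a component count would suggest that one component is absent from columns 6 to 9, and a zero-absorbance constraint built on that reading would be wrong for every component.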


DISCUSSION

Rank deficiency may be present in local windows of a data set rather than in the whole data matrix, so applying a constraint that limits the feasible regions based only on local rank information is risky unless there is reliable chemical information about the number of components in the local windows. In other words, local rank is a mathematical constraint that should be used with caution. In many cases, local rank cannot be attributed directly to the number of components, and in some cases where this concordance exists, applying the local rank information does not reduce the amount of rotational ambiguity. Evidence supporting this claim comes from several basic chemometric studies and will be published in a more specialized journal.

In those cases in which the whole data matrix is rank deficient, the number of significant contributions is lower than the total number of components present in the system. So, the resolution of the rank-deficient data matrix into all the real process contributions is impossible. Moreover, it is obvious that the information obtained by local rank analysis of such data sets does not match the real concentration windows. A modified version of the EFA method has been proposed to deal with these problems.33 The proposed algorithm is adapted to work with full-rank data matrices obtained by augmenting the rank-deficient data with a full-rank standard matrix. The main idea of the modified EFA algorithm is to start both the forward and backward steps of EFA from the full-rank standard matrix in order to extract the concentration windows of the components present exclusively in the rank-deficient matrix. As stated before, information about the presence windows of components is reliable when it comes from chemical knowledge. In the modified EFA algorithm, the augmentation step breaks the rank deficiency, so the total number of components can be obtained. Furthermore, employing


the additional chemical information of the standard matrix may help to find the true concentration windows. The local rank deficiency problem can be overcome by a similar approach. If appropriate chemical knowledge is available, a specific presence window or even a specific profile can be chosen among all the possible windows and profiles. However, augmentation by itself may not provide unique resolution for unknown components of the rank-deficient data34 and, as shown before, rotational ambiguity can produce different presence windows in the feasible band of a certain component. It follows that any algorithm that computes just one presence pattern for the components will ignore other possible solutions and patterns. Thus, further research is suggested to evaluate the performance of the modified EFA algorithm.
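The effect of augmentation on a rank-deficient matrix can be illustrated with simulated data. The sketch below is not the modified EFA algorithm of ref 33. It only shows, under invented assumptions, that stacking a standard block on a rank-deficient block restores the full rank, and it runs a simplified forward pass that starts from the standard rows. The proportional concentration profiles, the single-component standard, and the helper `numerical_rank` are all hypothetical.

```python
import numpy as np

def numerical_rank(block, rel_tol=1e-8):
    """Count singular values above rel_tol times the largest one."""
    s = np.linalg.svd(block, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(2)
S = rng.uniform(0.2, 1.0, size=(3, 55))  # three hypothetical spectra

# Rank-deficient block: component 3 is always proportional to component 1
C_def = rng.uniform(0.2, 1.0, size=(10, 3))
C_def[:, 2] = 2.0 * C_def[:, 0]
D_def = C_def @ S

# Standard block containing only component 3
C_std = np.zeros((4, 3))
C_std[:, 2] = rng.uniform(0.2, 1.0, size=4)
D_std = C_std @ S

# Augmentation: standard rows first, then the rank-deficient rows
D_aug = np.vstack([D_std, D_def])

rank_def = numerical_rank(D_def)
rank_std = numerical_rank(D_std)
rank_aug = numerical_rank(D_aug)

# Simplified forward pass: rank of the first i rows, starting in the standard
forward_ranks = [numerical_rank(D_aug[:i]) for i in range(1, D_aug.shape[0] + 1)]
print(rank_def, rank_std, rank_aug, forward_ranks)
```

The augmented matrix recovers the total number of components, but this says nothing about which of the feasible presence windows is the true one, which is the limitation raised in the paragraph above.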

CONCLUSIONS

Constraints have been studied since the early days of MCR methodology, and their application is still an active field of data evaluation research. Local rank is one of the most important constraints used in the chemometrics literature. It has been used in many different MCR applications to reduce the extent of rotational ambiguity. Furthermore, it has been shown that under special circumstances unique solutions can be produced by applying this constraint. In such studies, the main assumption is that the local rank information is in concordance with the presence pattern of the components. However, it is shown in this work that the local rank information cannot automatically be equated with the number of components because of the so-called “local rank deficiency” problem. Since local rank deficiency may occur in all systems, whatever the number of components, applying the local rank information is indeed risky. Furthermore, in those cases where the local rank information matches the number of components,


applying the local rank constraint does not reduce the range of possible solutions because all the possible solutions already obey it. Employing the local rank constraint in MCR methods can result in serious systematic errors in both qualitative and quantitative analysis. In fact, there is no guarantee that the local rank information gives physically correct solutions. Generally, this risk exists whenever the information about the presence or absence of components comes from mathematical concepts such as local rank; when this information comes from true chemical knowledge, however, it is valuable and very effective in reducing the rotational ambiguity. For example, when information about the selectivity or zero region of a component comes from previous chemical knowledge, or when the correspondence among species constraint is applied in multi-set data analysis, the information about the presence or absence of components is reliable.

Supporting Information

Feasible non-negative elution profiles, and different elution windows inside each feasible band.

REFERENCES

(1) Lawton, W. H.; Sylvestre, E. A. Technometrics 1971, 13, 617-633.

(2) de Juan, A.; Tauler, R. Anal. Chim. Acta 2003, 500, 195-210.

(3) Garrido, M.; Rius, F.; Larrechi, M. Anal. Bioanal. Chem. 2008, 390, 2059-2066.

(4) de Juan, A.; Tauler, R. Crit. Rev. Anal. Chem. 2006, 36, 163-176.

(5) Ruckebusch, C.; Blanchet, L. Anal. Chim. Acta 2013, 765, 28-36.


(6) Hantao, L. W.; Aleme, H. G.; Pedroso, M. P.; Sabin, G. P.; Poppi, R. J.; Augusto, F. Anal. Chim. Acta 2012, 731, 11-23.

(7) Tauler, R.; Smilde, A.; Kowalski, B. J. Chemom. 1995, 9, 31-58.

(8) Borgen, O. S.; Kowalski, B. R. Anal. Chim. Acta 1985, 174, 1-26.

(9) Wentzell, P. D.; Wang, J.-H.; Loucks, L. F.; Miller, K. M. Can. J. Chem. 1998, 76, 1144-1155.

(10) Rajkó, R.; István, K. J. Chemom. 2005, 19, 448-463.

(11) Gemperline, P. Anal. Chem. 1999, 71, 5398-5404.

(12) Sawall, M.; Neymeyr, K. J. Chemom. 2014, 28, 633-644.

(13) Golshan, A.; Abdollahi, H.; Beyramysoltan, S.; Maeder, M.; Neymeyr, K.; Rajkó, R.; Sawall, M.; Tauler, R. Anal. Chim. Acta 2016, 911, 1-13.

(14) Hamilton, J. C.; Gemperline, P. J. J. Chemom. 1990, 4, 1-13.

(15) Ahmadi, G.; Tauler, R.; Abdollahi, H. Chemometr. Intell. Lab. 2015, 142, 143-150.

(16) Manne, R. Chemometr. Intell. Lab. 1995, 27, 89-94.

(17) Akbari, M.; Abdollahi, H. J. Chemom. 2013, 27, 278-286.

(18) de Juan, A.; Casassas, E.; Tauler, R. Soft-Modelling of Analytical Data, Wiley, New York 2000.

(19) Ruckebusch, C.; de Juan, A.; Duponchel, L.; Huvenne, J. Chemometr. Intell. Lab. 2006, 80, 209-214.

(20) Gampp, H.; Maeder, M.; Meyer, C. J.; Zuberbühler, A. D. Talanta 1985, 32, 257-264.

(21) Keller, H.; Massart, D. Anal. Chim. Acta 1991, 246, 379-390.

(22) Whitson, A. C.; Maeder, M. J. Chemom. 2001, 15, 475-484.


(23) Zeng, Z.-D.; Xu, C.-J.; Liang, Y.-Z.; Li, B.-Y. Chemometr. Intell. Lab. 2003, 69, 89-101.

(24) Geladi, P.; Wold, S. Chemometr. Intell. Lab. 1987, 2, 273-281.

(25) de Juan, A.; Vander Heyden, Y.; Tauler, R.; Massart, D. Anal. Chim. Acta 1997, 346, 307-318.

(26) Beyramysoltan, S.; Rajkó, R.; Abdollahi, H. Anal. Chim. Acta 2013, 791, 25-35.

(27) Rajkó, R. J. Chemom. 2009, 23, 265-274.

(28) Rajkó, R.; Nassab, P. R.; Szabó-Révész, P. Talanta 2009, 79, 268-274.

(29) Henry, R. C. Chemometr. Intell. Lab. 2005, 77, 59-63.

(30) Rajkó, R. J. Chemom. 2006, 20, 164-169.

(31) Jaumot, J.; Gargallo, R.; de Juan, A.; Tauler, R. Chemometr. Intell. Lab. 2005, 76, 101-110.

(32) Vosough, M.; Mason, C.; Tauler, R.; Jalali-Heravi, M.; Maeder, M. J. Chemom. 2006, 20, 302-310.

(33) de Juan, A.; Navea, S.; Diewok, J.; Tauler, R. Chemometr. Intell. Lab. 2004, 70, 11-21.

(34) Alinaghi, M.; Rajkó, R.; Abdollahi, H. Chemometr. Intell. Lab. 2016, 153, 22-32.
