Anal. Chem. 2010, 82, 8753–8754
Reply to the Comments on “Near-Infrared Hyperspectral Unmixing Based on a Minimum Volume Criterion for Fast and Accurate Chemometric Characterization of Counterfeit Tablets”

Marta B. Lopes,*,† Jean-Claude Wolff,‡ José M. Bioucas-Dias,† and Mário A. T. Figueiredo†

Instituto de Telecomunicações, Instituto Superior Técnico, Technical University of Lisbon, Avenida Rovisco Pais, 1049-001 Lisboa, Portugal, and GlaxoSmithKline, Medicines Research Centre, Gunnels Wood Road, Stevenage, Hertfordshire SG1 2NY, U.K.

Commenting on our recent paper (ref 1), Róbert Rajkó (RR) pointed out what he claims to be a set of flaws in that paper. In this reply, we argue that RR’s claims are essentially based on a set of misunderstandings/misconceptions and on a careless reading of our paper.

RR began by claiming that other researchers had proposed minimum simplex volume criteria earlier than Craig (ref 2); he cites, to support his claim, the work of Perczel et al. (ref 3), published in 1991, while the paper by Craig that we cited is from 1994. Although the paper by Craig that we chose to cite is the archival journal version, the approach described therein had originally been proposed by Craig in 1990, in an earlier conference paper (see ref 5 in Craig’s paper, ref 2). However, Perczel et al. had in fact independently proposed this idea in 1989 (ref 4), a fact that RR did not mention, so we acknowledge that we should also have cited that work by Perczel et al. (ref 4).

RR claimed that “...the technical terms and background history should have been smoothly transformed to the technical terms and background history used in the fertilized scientific area (chemometrics).” We concede that we could have made an effort to use more chemometrics jargon; however, we believe that the use of community-specific terminologies is in part responsible for a large quantity of duplicate work across different disciplines.
We have written our paper using neutral terminology, carefully defining all the mathematical objects; the paper was peer-reviewed, accepted, and published in a prestigious chemistry journal, thus we are confident that the absence of chemometrics jargon is not an obstacle to its understanding by the chemometrics community.

* To whom correspondence should be addressed. Phone: +351218418387. Fax: +351218418472. E-mail: [email protected].
† Technical University of Lisbon.
‡ GlaxoSmithKline.
(1) Lopes, M. B.; Wolff, J.-C.; Bioucas-Dias, J. M.; Figueiredo, M. A. T. Anal. Chem. 2010, 82 (4), 1462–1469.
(2) Craig, M. IEEE Trans. Geosci. Remote Sens. 1994, 32 (3), 542–552.
(3) Perczel, A.; Hollósi, M.; Tusnády, G.; Fasman, D. Protein Eng. 1991, 4 (6), 669–679.
(4) Perczel, A.; Hollósi, M.; Tusnády, G.; Fasman, D. Croatica Chim. Acta 1989, 62, 189–200.
10.1021/ac102443w © 2010 American Chemical Society. Published on Web 09/29/2010

The core of RR’s negative comments on our paper was focused on what he considered to be a harmful statement made in Lopes et al. (ref 1): “The main disadvantage of MCR-ALS is the so-called rotational ambiguity problem, i.e., a set of simplices with different orientations, all enclosing the data points, are minimizers of the least-squares criterion.” In response to that statement, RR argued that the rotational ambiguity is not due to MCR-ALS (or any other related algorithm) but it is a feature of the underlying bilinear decomposition of the data matrix. Of course it is! Let us recall that MCR-ALS is nothing but an iterative technique to solve the following optimization problem

$$\min_{(M,A)\in\mathcal{F}} \|Y - MA\|_F^2 \tag{1}$$
where F is the feasible set of pairs of matrices satisfying the adopted constraints (such as non-negativity, closure, unimodality, etc.); for details about the meanings and dimensions of matrices Y, M, and A, see Lopes et al. (ref 1). MCR-ALS alternates between solving with respect to (wrt) M, with A fixed, and solving wrt A, with M fixed. It is obvious that, under the non-negativity constraint, or under both the non-negativity and closure constraints, problem 1 has an infinite set of solutions, as clearly illustrated in Figure 2 of Lopes et al. (ref 1) and explained in the “Comments” paper by RR. Consequently, this nonuniqueness is of course inherited by any algorithm designed to solve problem 1, as is the case of MCR-ALS; to which solution the algorithm converges (if it converges at all) depends critically on the initialization, because problem 1 is nonconvex.

RR stated that we applied the non-negativity constraints and the necessary condition: “each facet of the simplex contains at least p - 1 spectral vectors.” This is simply wrong and can only result from a careless reading of our paper; RR mistook a necessary (though not sufficient) condition for the minimum simplex volume criterion to be successful for a constraint imposed on the solution of an optimization problem. Nowhere in the methods used in our paper was such a constraint imposed on the solution. Moreover, RR completely missed the meaning of the condition; he wrote “The chemometric meaning of this condition is: there should exist at least p - 1 spectral bands at which only one component has zero absorbance (signal) for all components (endmembers). This strict condition is rather arbitrary and can be unattainable.” This is simply wrong; the real meaning of this condition is: for each of the p pure materials (endmembers), there should exist in the data at least p - 1 pixels where the abundance of this pure material is zero; this is highly likely to happen in large data sets and is a much weaker condition than the existence of pure pixels.

Analytical Chemistry, Vol. 82, No. 20, October 15, 2010

In the following paragraph, RR used some data to argue that “the methods used by them are not always optimal.” This was a futile effort, since we never claimed that the minimum simplex volume criterion always yields optimal solutions. First of all, such a statement would require defining what is meant by optimality. The only thing we claim is: under the linear noiseless model (eq 1 in Lopes et al., ref 1), a necessary (not sufficient) condition for the solution of the minimum volume simplex criterion (eq 4 in Lopes et al., ref 1) to be the true underlying simplex is that the data Y contains at least p - 1 points on each facet of the true underlying simplex. This is an objective mathematical statement, which is provable, thus not subject to discussion. Of course, the linear noiseless model is simply a model of reality, not reality itself, thus the quality/accuracy of these solutions and how well they match the underlying reality (which is probably what RR was thinking about when he wrote “optimality”) depends on how well the model describes the underlying reality.

RR claimed that “There is a deep confusion in the term of unique solution.” Yes there is, but the confusion is in RR’s comments, as explained next. First of all, RR used the term “optimal” several times in this paragraph, without ever defining what he means by “optimal”; optimality can only be discussed in the context of an optimality criterion, which is usually formalized as an optimization problem.
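The nonuniqueness of problem 1 discussed above can be checked numerically. The following is a minimal sketch on synthetic data, not the data or code of Lopes et al.; the names `make_mixture` and `shrink_transform`, the matrix sizes, and the parameters `floor` and `eps` are assumptions made purely for illustration. It builds a non-negative pair (M, A) satisfying closure, then a second pair (MT, T⁻¹A) that also satisfies non-negativity and closure and reproduces Y exactly, so both pairs attain the same value of the least-squares criterion (zero, in this noiseless case).

```python
import numpy as np

def make_mixture(p=3, bands=40, pixels=200, floor=0.1, seed=0):
    """Synthetic non-negative spectra M and abundances A with closure."""
    rng = np.random.default_rng(seed)
    M = rng.uniform(0.1, 1.0, size=(bands, p))        # columns: pure spectra
    # Every abundance is at least `floor`; each column still sums to one.
    A = floor + (1.0 - p * floor) * rng.dirichlet(np.ones(p), size=pixels).T
    return M, A

def shrink_transform(p, eps):
    """Invertible T whose columns sum to one, so closure is preserved."""
    return (1.0 - eps) * np.eye(p) + (eps / p) * np.ones((p, p))

M, A = make_mixture()
Y = M @ A                               # noiseless bilinear data
T = shrink_transform(p=3, eps=0.2)      # needs eps / p <= floor
M2 = M @ T                              # a different set of spectra
A2 = np.linalg.solve(T, A)              # T^{-1} A, different abundances

same_fit = np.allclose(Y, M2 @ A2)
feasible = bool((M2 >= 0).all() and (A2 >= 0).all()
                and np.allclose(A2.sum(axis=0), 1.0))
print(same_fit, feasible, np.allclose(M, M2))
```

Geometrically, the columns of MT are the vertices of a smaller simplex that still encloses all data points; this is possible in the sketch because no abundance falls below `floor`, i.e., no data point lies on a facet of the original simplex.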
In writing that “Programmers and mathematicians have been and are developing algorithms providing unique solution”, RR showed a strange ignorance about the whole mathematical field of optimization, which should be highly relevant for chemometrics. Uniqueness of solution is not a characteristic of an algorithm but of an optimization problem; there are optimization problems that have unique solutions, others have many solutions, still others have no solution at all. RR then wrote “Most of the algorithms can provide only one solution at the end of the iterations; however, a large set of equally optimal solutions (not revealed by the used algorithms) can exist.” This is again simply wrong; leaving aside the fact that RR was (wrongly) assuming that all algorithms are iterative and that there is such a thing as the “end of the iterations”, it is not true that most algorithms, when dealing with an optimization problem with multiple solutions, yield a single solution. When dealing with optimization problems with multiple solutions, most iterative algorithms will converge to different solutions, depending on how they are initialized.

The example of PCA used by RR is indeed a good example of the danger of using the term “optimal” without defining the underlying optimality criterion; he wrote: “Mathematically (...) PCA can provide a unique solution, because of the used restrictions of orthogonality, normalization, sign (e.g., only nonnegative components of the first eigenvector), and maximum variance. However, there exist infinite equally optimal solutions based on eq 3”. What are “optimal solutions based on eq 3”? Equation 3 is not an optimality criterion, it is simply an under-determined matrix equation, with an infinite set of solutions. The additional restrictions mentioned by RR simply serve to select one element from this infinite set; whether this one element makes scientific sense or not depends on how well the adopted model and restrictions describe the underlying reality.

RR wrote that “It should be mentioned here that both SISAL and MVSA suffer from the nonuniqueness because of the used λ parameter.” Here, RR was referring to a completely different source of nonuniqueness; of course, for different choices of λ, SISAL/MVSA will lead to different solutions, because they are solving different optimization problems. Parameter λ was introduced to allow the minimum simplex volume criterion to be robust wrt noise and/or outliers; if it is known that there are neither outliers nor noise, then λ can be set to +∞ and the non-negativity will be fully enforced. RR’s claim that “the result is not always equal to the true one” is again useless; nowhere in the paper (and especially in the case of data with noise and/or outliers) do we claim that our methods are guaranteed to find the underlying truth.

In the same paragraph, RR wrote “Moreover it can be easily constructed such a data set for which the condition is fulfilled but not uniquely, i.e., there exist several minimal volume simplices.” This is again a futile effort; of course RR’s example is correct; however, notice that we never claimed that the presence of p - 1 points on each facet of the true underlying simplex was sufficient for minimum simplex volume criteria to recover the true underlying simplex. It is a necessary condition, not a sufficient condition. Quoting from our paper: “A necessary condition for this idea to work is that each facet of the simplex contains at least p - 1 spectral vectors.”

Finally, RR discussed the normalization issue, claiming that “the assumption of 1-norm normalization is arbitrary in this case.
Because of the scale ambiguity of the decomposition in eq 1, any concentration profile can be multiplied with a constant if we divide the corresponding absorbance profile with the same constant simultaneously.” Of course this is true, and we never argued otherwise in the paper; this is simply not an issue in our paper, as explained next. Notice that the quality/accuracy of the solutions is assessed by comparing the true and estimated M matrices; thus, as long as all M matrices (true and estimated) undergo the same normalization, they can be directly compared.

To conclude, we would like to stress that the main conclusions in Lopes et al. (ref 1) were not challenged or commented on by RR. These conclusions, which resulted from our experimental evaluation of the minimum simplex volume criterion (implemented via the MVSA and SISAL algorithms) and the least-squares bilinear factorization criterion (implemented by MCR-ALS) in the context of the data sets considered in the paper, are that MVSA/SISAL yields much more accurate estimates than MCR-ALS and that it does so with a much lighter computational effort.
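The normalization argument made earlier in this reply (that scale ambiguity does not affect a comparison of M matrices once all of them undergo the same normalization) can be illustrated with a short numerical sketch. The synthetic matrices, the helper `l1_normalize_columns`, and the choice of column-wise 1-norm normalization are assumptions made for illustration only; they are not taken from the code used in Lopes et al.

```python
import numpy as np

def l1_normalize_columns(M):
    """Scale each column of M to unit 1-norm."""
    return M / np.abs(M).sum(axis=0, keepdims=True)

rng = np.random.default_rng(1)
M_true = rng.uniform(0.1, 1.0, size=(40, 3))       # hypothetical spectra
A_true = rng.dirichlet(np.ones(3), size=100).T     # hypothetical abundances

scales = np.array([0.5, 2.0, 7.0])     # arbitrary positive constants
M_scaled = M_true * scales             # multiply each column of M
A_scaled = A_true / scales[:, None]    # divide the matching row of A

# The product, and hence the fit to the data, is unchanged by the rescaling.
same_product = np.allclose(M_true @ A_true, M_scaled @ A_scaled)

# After the same normalization, the two M matrices coincide.
same_after_norm = np.allclose(l1_normalize_columns(M_true),
                              l1_normalize_columns(M_scaled))
print(same_product, same_after_norm)
```

In this sketch the rescaled factors differ from the originals, yet they become directly comparable once both M matrices are normalized in the same way.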
Received for review September 14, 2010 AC102443W