Modeling Methylene Blue Aggregation in Acidic Solution to the Limits

Article pubs.acs.org/ac

Modeling Methylene Blue Aggregation in Acidic Solution to the Limits of Factor Analysis

Emily K. Golz and Douglas A. Vander Griend*

Department of Chemistry and Biochemistry, Calvin College, Grand Rapids, Michigan 49546, United States

ABSTRACT: Methylene blue (MB+), a common cationic thiazine dye, aggregates in acidic solutions. Absorbance data for equilibrated solutions of the chloride salt were analyzed over a concentration range of 1.0 × 10−3 to 2.6 × 10−5 M, in both 0.1 M HCl and 0.1 M HNO3. Factor analyses of the raw absorbance data sets (categorically a better choice than effective absorbance) definitively show that there are at least three distinct molecular absorbers regardless of acid type. A model with monomer, dimer, and trimer works well, but extensive testing has yielded several other good models, some with higher order aggregates and some with chloride anions. Good models were frequently indistinguishable from each other by quality of fit or reasonability of molar absorptivity curves. The modeling of simulated data sets demonstrates the cases in which, and the degree to which, signal noise in the original data obscures the true model. In particular, the more mathematically similar (less orthogonal) the molar absorptivity curves of the chemical species in a model are, the less signal noise it takes to obscure the true model among other potentially good models. Unfortunately, the molar absorptivity curves in dye aggregation systems like that of methylene blue tend to be similar enough that models are obscured even at the noise levels (0.0001 ABS) of typical benchtop spectrophotometers.

Methylene blue (MB) has attracted the attention of many scientists not only for its practical uses as a dye but also for the unique physicochemical properties it exhibits in solution. Its structure is shown in Figure 1. In particular, the spectrum of MB+ exhibits a shift from a band at 660 nm to a band at 610 nm and, for that reason, does not appear to follow the Beer−Lambert law.1 Deviation from the Beer−Lambert law is common to many organic dyes,2 and aggregation has been proposed as the cause.1,3−7 The dimerization of MB+ has been extensively studied along with that of several other dyes, as it appears there is an enthalpy−entropy compensation that occurs when the solution-phase structure of the dye changes from a monomer to a dimer.8 There is much debate in the literature as to how and why MB+ and other organic dyes aggregate. Padday has drawn this debate together, concluding that the aggregation of organic dye molecules occurs because of hydrophobic bonding6 and not because of counterion involvement.7 More recently, however, a study concluded from the orientation of aggregates that the size of the counterions induces different orientations, and thus that the counterions are involved in inducing the spectral deviations.9 Nandini and Vishalakshi furthered this theory by studying the thermodynamic parameters of MB+ associated with large organic anions and concluded that binding occurred not only due to electrostatic attractions but also because of hydrophobic attractions, unlike other molecules they had studied.10 Even as the understanding of how and why MB+ aggregates has grown, the debate persists and many more questions continue to be raised. MB+ has been studied via vapor pressure osmometry,4 protonation equilibria,11 isoextraction methods,5,11−13 and most commonly by spectrophotometric methods1,3,4,11,14−16 to explain why spectral deviations occur. Of particular relevance to this study, Malinowski and Zhao used factor analysis to study which set of chemical reactions best describes the aggregates that are formed in a solution of MB+.14,16 Before the advent of factor analysis, deconvolution of complex data was impossible when the data comprised several features that could not be directly delineated. Malinowski helped expand factor analysis from its original use in the social sciences to where it is today in chemistry and other scientific fields.17 His work on MB+ systems still stands as an archetypal application of factor analysis in chemistry.14 Information about several MB+ aggregates was simultaneously obtained without

Figure 1. Structure of methylene blue chloride.

© 2012 American Chemical Society

Received: November 9, 2012 Accepted: December 19, 2012 Published: December 19, 2012

dx.doi.org/10.1021/ac303271m | Anal. Chem. 2013, 85, 1240−1246

Analytical Chemistry

Article

1.0029 × 10−4 mol) in 0.1 M HNO3 (density = 0.9984 g/mL) in a 100 mL volumetric flask. A total of 30 subsequent methylene blue solutions were made, with a concentration range of 1.0029 × 10−3 to 2.5571 × 10−5 M, by diluting the stock solution with 0.1 M HNO3 by mass. A second set of solutions was made by dissolving methylene blue hydrate (0.0094 g, 2.5140 × 10−5 mol) to make a stock solution of 1.0056 × 10−3 M in 0.1 M HCl (density = 0.9987 g/mL) in a 25 mL volumetric flask. A total of 30 subsequent solutions were made having a concentration range from 1.0056 × 10−3 to 2.4112 × 10−5 M. Acid mitigates the adsorption of methylene blue on glassware. Solutions were measured within 24 h since the dye tends to degrade in aqueous solution.

Instrumentation. The visible absorbance values from 350 to 900 nm for each of the 29 methylene blue solutions were measured on a Varian Cary 100 UV−vis spectrophotometer with a scan rate of 600 nm/min and a spectral beam width of 2.0 nm, in double beam mode with a reference cell of 0.1 M acid, interfaced with an Optiplex GX620 computer. The baseline for the spectra was corrected relative to the signal from the sample cell filled with only 0.1 M acid. Quartz cells with a path length of 0.1 cm were used for the entire solution set.

Data Analysis. The complete data sets of measured absorbance values for each acid medium (HNO3 and HCl) comprised 29 spectral curves taken over 350−900 nm. These data sets were examined by unrestricted factor analysis to determine the additive mathematical structure of the data without any chemical parameters in order to establish the total number of unique chemical absorbers that can account for the complete data set. The data sets were then also modeled by restricting the data to the parameters of equilibria for the set of chemical reactions provided by the user. Such restrictions essentially require that the complete set of concentration profiles be generated from just two independent parameters, specifically two equilibrium constants. Furthermore, because equilibrium is enforced, mathematical constraints such as nonnegativity and unimodality are automatically satisfied. The computer program “Sivvu” was used to perform this analysis, and the details of specific calculations can be found in the Supporting Information.25 Starting with raw absorbance values, Sivvu determines the molar absorptivity curves and the concentration profile that best rebuild the complete data set by performing least-squares analysis of the overdetermined system of linear equations. The degree of fit of the rebuilt data set is expressed by the root-mean-square residual (RMSR) between the measured and calculated absorbance values. A data set can be modeled in two distinct ways: as raw absorbance or as effective absorbance. Effective absorbance, Aeff, is found by dividing the raw absorbance by the total nominal concentration of methylene blue, [MB]T, and the cell path length, L.

Simulated Data. Sivvu was also used to simulate complete data sets comparable to the measured data sets. The wavelength range (350−900 nm), the number of raw absorbance curves (51), and the compositions of each chemical solution (1.0 × 10−3 M to 2.5 × 10−5 M) for each simulated data set were identical. Each data set was simulated on the chemical model with three chemical absorbers suggested by Malinowski14 and depicted in Scheme 1 (Model A). Since the equilibrium values of the chemical reactions were not changed, a single set of concentration profiles was in fact used for each simulated data set.
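As a rough illustration of the equilibrium-restricted fit described under Data Analysis, the sketch below generates concentration profiles from two equilibrium constants and then solves the overdetermined linear system for the molar absorptivity curves. It is not Sivvu's implementation: the simple monomer/dimer/trimer model without chloride, the function names, the Gaussian test curves, and the constants are all assumptions chosen for illustration.

```python
import numpy as np

def monomer_concentration(total, k_dimer, k_trimer, iterations=200):
    """Free monomer m solving total = m + 2*K2*m**2 + 3*K3*m**3 (bisection)."""
    lo, hi = 0.0, total
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid + 2 * k_dimer * mid**2 + 3 * k_trimer * mid**3 > total:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def concentration_profiles(totals, k_dimer, k_trimer):
    """Columns: monomer, dimer, and trimer concentrations for each solution."""
    rows = []
    for total in totals:
        m = monomer_concentration(total, k_dimer, k_trimer)
        rows.append([m, k_dimer * m**2, k_trimer * m**3])
    return np.array(rows)

def fit_absorptivities(absorbance, conc, path_length):
    """Least-squares molar absorptivities for A = L * C @ E, plus the RMS residual."""
    design = path_length * conc
    eps, *_ = np.linalg.lstsq(design, absorbance, rcond=None)
    rmsr = np.sqrt(np.mean((absorbance - design @ eps) ** 2))
    return eps, rmsr

# Illustrative inputs only (hypothetical constants, not the paper's fitted values).
totals = np.geomspace(2.5e-5, 1.0e-3, 30)           # nominal [MB]T in M
conc = concentration_profiles(totals, k_dimer=7.0e3, k_trimer=1.0e7)
wavelengths = np.arange(350, 901)                   # nm
centers, heights = (660, 610, 580), (8e4, 6e4, 5e4)
eps_true = np.array([h * np.exp(-0.5 * ((wavelengths - c) / 30.0) ** 2)
                     for c, h in zip(centers, heights)])
path_length = 0.1                                   # cm
raw = path_length * conc @ eps_true                 # noise-free raw absorbance
eps_fit, rmsr = fit_absorptivities(raw, conc, path_length)
effective = raw / (totals[:, None] * path_length)   # Aeff = A / ([MB]T * L)
```

In a full analysis the two equilibrium constants would themselves be optimized to minimize the RMS residual; here they are held fixed so that only the linear step is shown.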

the need to chemically isolate them. Malinowski and Zhao applied factor analysis to the complexity of MB+ in solution with much success. Their model for MB+ in solution, which we hereafter refer to as Model A, is described by the chemical reactions in Scheme 1.14

Scheme 1. Model A as Determined and Described by Malinowski and Zhao14

They also continued their study of MB+ by determining the hydration of MB+ aggregates.18 Hemmateenejad et al. furthered the study of MB+ by collecting data at different temperatures and determining its thermodynamic parameters via factor analysis.16 It is important to note that in both of these studies on MB+, researchers employed a version of factor analysis that applies certain mathematical constraints, like nonnegativity and unimodality, in the determination of concentration profiles but does not directly assume any particular equilibrium relationships between the chemical species involved. It is only in a second step that the possible equilibrium models are tested for how well they match the concentration profiles that were mathematically extracted. This is how Malinowski and Zhao determined that Model A, with chloride involvement in the trimer, was superior to a simple monomer/dimer/trimer model.14 By contrast, our work imposes chemical equilibrium restrictions directly on the mathematical deconvolution process.

Factor analysis has many advantages but also has been shown to have limitations.19−24 Malinowski initially wrote on the theory of error in factor analysis, commenting that if the part of the data that results from error is removed, all that is left are the factors that build the data, ultimately leading to data improvement.24 Included in those studies is work done by Swain, Bryndza, and Swain on the misleading results that missing data can create.23 For example, when factor analysis was applied to simulated mass spectra data, the ability to correctly interpret data depended on the peak width, resolution, and similarity in fragmentation pattern as well as the amount of noise in the data. Tway and Love warn against combinations of the above that can increase the subjectivity of the analysis.21 Interestingly, in an investigation by Conny and Meglen, it was suggested that white noise could improve the quality of factor analysis.22 Clearly factor analysis is a powerful and widely used tool that has great potential for unraveling complicated solution-phase dynamics. To use it well, though, requires considerable care, as we will demonstrate with studies on simulated data.



EXPERIMENTAL SECTION

Reagents. All chemicals were analytical grade. Methylene blue hydrate (C16H18ClN3S·3H2O) was obtained from Sigma Aldrich and was used without further purification. Type I water was obtained from a NANOpure diamond water system. Concentrated nitric acid and hydrochloric acid were obtained from Wheaton Scientific.

Methylene Blue Solutions. A stock solution with a concentration of 1.0029 × 10−3 M and density of 0.9984 g/mL was prepared by dissolving methylene blue hydrate (0.0375 g,


Figure 2. Sets of molar absorptivity curves used for data simulation. Sets I and II are molar absorptivity curves extracted from measured data sets. Sets III and IV are arbitrary curves generated from simple mathematical functions.

Beer−Lambert law predicts. Solutions of MB+ contain only MB+ and chloride, thus the overwhelming conclusion in the literature is that this apparent violation of the Beer−Lambert law occurs because of aggregation. In order to begin to study the specific relationships between the aggregates in solution and their corresponding molar absorptivity curves, the number of distinct chemical species must be ascertained. This can be readily accomplished through unrestricted factor analysis. Unrestricted Factor Analysis. Unrestricted factor analysis in effect calculates the percentage of data that can be reproduced from an arbitrary number of additive factors. Every additional factor helps to account for more of the data, but if the factor corresponds to a real chemical species, then the improvement to the fit will be significant, whereas if the factor is simply accounting for a little more random error, the improvement to the fit will be small and predictable. This trend is shown in Table 1 by unrestricted factor analysis of the simulated data sets with added error (standard deviation = 0.001), showing just three clear factors. After the third factor, the factors’ contribution increases at regular intervals of ∼0.02%. The regular increase is due to the regularity of the normal distribution of the random error. Unrestricted factor analysis of the nitric acid MB+ data set conclusively shows there are at least three chemical species responsible for absorption, accounting for 99.55% of the data, but there could be up to six species contributing to total absorbance data. The results of unrestricted factor analysis of the measured data sets in 0.1 M acid are also found in Table 1. When studying the same problem over a concentration range of 1 × 10−7 to 1.6 × 10−2, Malinowski’s data originally appeared to have 13 factors. 
Malinowski utilized multiple cuvettes with different pathlengths and thus could justifiably truncate the 13 factors down to 3 factors.14 Our entire data set was measured in one cuvette so any additional factors are either real or the result of some unknown systematic error. Given that they are so small, it is difficult to tell. MB+ in HCl solutions has three definitive species contributing 99.61% of the data but possibly has up to five species. When data sets for either acid were modeled according to chemical equilibria with four or more species, the

Four different data sets were simulated by multiplying the common set of concentration profiles by one of four sets of molar absorptivity curves (one curve for each distinct chemical absorber). The four sets of molar absorptivity curves used to simulate data sets are represented in Figure 2. Sets I and II represent the curves that resulted from modeling measured MB+ data, either as raw absorbance or effective absorbance, respectively. Sets III and IV are entirely artificial sets of molar absorptivity curves in which the curves themselves are quite mathematically distinct by design. Error could also be added to each of the four created data sets. Signal noise with a normal distribution and standard deviation of 0.001 was added to each of the four. In addition, noise with standard deviations of 0.1, 0.01, 0.0001, and 0.000 01 was added to data sets II and IV. Simulated data sets were then analyzed in a manner identical to that of the measured data sets as described in the previous section.
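The simulation procedure can be sketched in a few lines: multiply concentration profiles by a set of molar absorptivity curves, then add normally distributed noise at each level. The Gaussian curves, the placeholder concentration profiles (parametrized directly by free monomer concentration), and the function name below are hypothetical stand-ins, not the curve sets or profiles used in the study.

```python
import numpy as np

def simulate_absorbance(conc, eps, path_length, noise_sd, rng):
    """Return (clean, noisy) raw absorbance: A = L * C @ E plus Gaussian noise."""
    clean = path_length * conc @ eps
    noisy = clean + rng.normal(0.0, noise_sd, size=clean.shape)
    return clean, noisy

rng = np.random.default_rng(seed=0)
wavelengths = np.arange(350, 901)                       # 350-900 nm
# Hypothetical, well-separated Gaussian curves (in the spirit of sets III and IV).
eps = np.array([8e4 * np.exp(-0.5 * ((wavelengths - c) / 20.0) ** 2)
                for c in (450, 625, 800)])
# Placeholder profiles for 51 solutions: monomer, dimer, trimer columns.
monomer = np.geomspace(1.0e-5, 2.0e-4, 51)
conc = np.column_stack([monomer, 7.0e3 * monomer**2, 1.0e7 * monomer**3])

data_sets = {}
for noise_sd in (0.1, 0.01, 0.001, 0.0001, 0.00001):
    clean, noisy = simulate_absorbance(conc, eps, 0.1, noise_sd, rng)
    data_sets[noise_sd] = noisy
```

Because every data set shares the same noise-free matrix, any difference in the subsequent modeling can be attributed to the noise level alone.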



RESULTS AND DISCUSSION

Determining the molar absorptivity curves of methylene blue aggregates in solution is inherently difficult because there is no direct way to isolate a particular aggregate species as they exist in equilibrium with other species. Figure 3 shows the measured data points of MB+ at 660 nm as a function of concentration. Clearly MB+ does not follow the linear relationship that the

Figure 3. MB+ absorbance at 660 nm as a function of nominal concentration.


contrast, the most concentrated solutions exhibit a peak at 610 nm and a shoulder at 660 nm. There is also no clear isosbestic point, which suggests that there are more than two distinct chemical species. Several articles in the literature suggest the set of chemical reactions (1) and (2) in Scheme 1 as the model for the equilibrium of MB+ in solution.14,16,18 Malinowski and Zhao established that this particular set of chemical reactions was a better representation of their data, which was modeled based on effective absorptivity, than a simple monomer/dimer/trimer model. They found log Kd values of 3.84 and 10.49 for the complete dissociation constants of the dimer and trimer with chloride, respectively.14 Modeling our nitric acid data set based on effective absorbance with these reactions led to a log Kd value of 3.85 for the dissociation of the dimer and 10.98 for the complete dissociation of the trimer with chloride. The first value matches well with Malinowski’s findings, but the second value is not as close. To better understand the discrepancies in our results, we first undertook a study to see whether absorbance data is best modeled as raw absorbance or as effective absorbance.

When the data set of MB+ in 0.1 M nitric acid was modeled according to Model A, the resulting sets of molar absorptivity curves (Figure 5) differed depending on whether raw or effective absorbance data was used for modeling. The two sets of curves look very similar in general shape, relative peak height, and placement but have some key differences as well. Most noticeable is the bump at 700 nm in the dimer curve modeled with raw absorbance, which is not seen in the dimer curve modeled on effective absorbance. Overall, using effective absorbance appears to produce slightly better looking curves in our opinion, but nothing conclusive can be said about which set of curves more accurately represents the true molar absorptivity curves. In order to determine which, if either, method of modeling is really more effective, simulated data was modeled with both raw and effective absorbance and the results were compared back to the parameters used to build the simulated data set.

Raw Absorbance vs Effective Absorbance. Modeling effective absorbance was compared to modeling raw absorbance using simulated data sets. The true molar absorptivity curves as well as the true equilibrium constants are now known, so direct comparisons between the results can be readily made. Data sets were built from curve set II (Figure 2), and then random error with different standard deviations was added to the raw absorbance. The maximum deviations of the log K values obtained from the absorbance data sets are shown in Figure 6. Data sets with no error added resulted in a deviation in the log K values of 1.8 × 10−5 or less for either type of absorbance data, which is

Table 1. Factor Number along with the Total Percent Contribution of Factors Are Listed for Unrestricted Factor Analysis of Complete Data Sets^a

^a Dark shading represents necessary factors, light shading represents possible factors, and lack of shading represents random error.
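The kind of tabulation summarized in Table 1 can be illustrated with a singular value decomposition. The sketch below assumes that a factor's contribution is measured by its share of the total sum of squares, which is one common convention and may differ from the definition used by Sivvu; the three-absorber test data are synthetic and hypothetical.

```python
import numpy as np

def cumulative_percent(data):
    """Cumulative percent of the sum of squares captured by the first k factors."""
    s = np.linalg.svd(data, compute_uv=False)
    return 100.0 * np.cumsum(s**2) / np.sum(s**2)

rng = np.random.default_rng(seed=1)
wavelengths = np.arange(350, 901)
curves = np.array([np.exp(-0.5 * ((wavelengths - c) / 30.0) ** 2)
                   for c in (660, 610, 580)])
conc = rng.uniform(0.1, 1.0, size=(30, 3))      # arbitrary three-absorber mixtures
clean = conc @ curves
noisy = clean + rng.normal(0.0, 0.001, size=clean.shape)

pct_clean = cumulative_percent(clean)
pct_noisy = cumulative_percent(noisy)
gains = np.diff(pct_noisy)                       # gain from each additional factor
```

With noise-free data the first three factors account for essentially everything; with added noise the fourth and later factors each contribute only a small increment, which is the signature used to separate real absorbers from random error.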

resulting molar absorptivity curves were irregular with abrupt changes, as if modeling error rather than the absorbance of a chemical species. Therefore we too determine that over this concentration range, there are only three notable absorbing species. The effective absorptivity of MB+ as a function of nominal concentration is shown in Figure 4. The most dilute solutions

Figure 4. Plot of effective absorptivity of MB+ in 0.1 M HNO3.

exhibit a peak at 660 nm and a shoulder at 610 nm. This curve shape is thought to correspond to the monomer species. In

Figure 5. Molar absorptivity curves determined by modeling (A) raw absorbance and (B) effective absorbance with the chemical species MB+ (), MB22+ (- - -), and MB3Cl2+ (- - ) (model A).


note that each individual curve deviated differently than its counterparts. The dimer curve and monomer curve had a much greater deviation than the trimer. A Varian Cary 100 spectrophotometer has a standard deviation of 0.000 16 for a measurement at 590 nm with an absorbance of 1 abs.26 This amount of error falls within the range in which modeling raw absorbance is noticeably more accurate for both log K values and molar absorptivity curves. Therefore, raw absorbance should be used in most, if not all, cases for the determination of either log K values or molar absorptivity curves via factor analysis modeling.

Multiple Models. Given the clear superiority of modeling raw absorbance over effective absorbance, the modeling of the measured and simulated data sets is henceforth exclusively based on raw absorbance. For the real data sets, in both nitric and hydrochloric acid, over 80 different models were tested with Sivvu, the details of which can be found in the Supporting Information. Table 2 lists the top 10 models for each data set

Figure 6. Deviation of log K1 and log K2 from their true values for simulated data sets with different levels of added error. Both raw and effective absorbance data were modeled.

well within the computational rounding error of the calculations. As random error was introduced into the simulated raw absorbance data, the accuracy of the model decreases overall. For all data sets, no matter the quantity of error added, raw absorbance was more accurate. The six simulated data sets used to test the accuracy of equilibrium constants were also used to ascertain the accuracy of the resulting molar absorptivity curves. The difference between the modeled molar absorptivity curves and the true molar absorptivity curves is usefully quantified by treating the molar absorptivity curves as n-dimensional vectors and calculating the angle between the vectors

$$\theta = \arccos\left(\frac{\vec{a}\cdot\vec{b}}{\lVert\vec{a}\rVert\,\lVert\vec{b}\rVert}\right)$$
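The angle between two curves treated as vectors is straightforward to compute. The following is a minimal sketch; the function name is an assumption, and the clipping step is added only to guard against floating-point rounding pushing the cosine slightly outside [−1, 1].

```python
import numpy as np

def curve_angle_deg(a, b):
    """Angle in degrees between two curves sampled at the same wavelengths."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))

# Two non-overlapping curves are orthogonal; a rescaled copy gives no angle.
orthogonal = curve_angle_deg([1.0, 0.0], [0.0, 1.0])
rescaled = curve_angle_deg([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Because the angle depends only on direction, multiplying a curve by a constant (a uniformly larger molar absorptivity) leaves it unchanged; only differences in shape register.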

An angle of 0° indicates that two vectors, i.e., the two curves, are completely identical; an angle of 90° indicates that two vectors are orthogonal, meaning the curves have no common component. The angles corresponding to each species when there is no added error were less than 0.0002°, indicating the modeled curves are virtually identical to the curves the data set was built with, as expected. However, again as error is added in, all three molar absorptivity curves begin to deviate (Figure 7). Regardless of the amount of error, modeling raw absorbance always resulted in less deviation (smaller angles) than modeling effective absorbance. Clearly, error affects the molar absorptivity curves much more when modeled with effective absorbance compared to raw absorbance. It is interesting to

Table 2. Minimized RMS Residuals for Different Optimized Models of the Absorbance Data Sets of MB+ in HNO3 and HCl^a

| models for data set in 0.1 M HNO3 | RMS residual | models for data set in 0.1 M HCl | RMS residual |
|---|---|---|---|
| MB+, MB22+, MB3Cl22+ | 0.021 726 | MB+, MB33+, MB88+ | 0.033 010 |
| MB+, MB22+, MB77+ | 0.021 890 | MB+, MB33+, MB99+ | 0.033 181 |
| MB+, MB22+, MB3Cl3 | 0.021 930 | MB+, MB33+, MB77+ | 0.033 369 |
| MB+, MB22+, MB88+ | 0.021 950 | MB+, MB2Cl+, MB5Cl32+ | 0.033 469 |
| MB+, MB2+, MB6+ | 0.022 080 | MB+, MB22+, MB55+ | 0.033 474 |
| MB+, MB2Cl+, MB5Cl32+ | 0.022 141 | MB+, MB33+, MB1010+ | 0.033 595 |
| MB+, MB22+, MB9+ | 0.022 158 | MB+, MB2Cl+, MB4Cl4 | 0.033 806 |
| MB+, MB22+, MB3Cl2+ | 0.022 350 | MB+, MB2Cl+, MB4Cl3+ | 0.033 819 |
| MB+, MB33+, MB1212+ | 0.022 351 | MB+, MB2Cl+, MB4Cl22+ | 0.033 851 |
| MB+, MB33+, MB1111+ | 0.022 381 | MB+, MB22+, MB44+ | 0.033 858 |

^a The three absorbers for each model are listed.

along with the corresponding RMS residual values. No one particular model stands out as the clear favorite. RMS residual values differ by arguably insignificant amounts and all molar absorptivity curves are smooth, normal looking curves without jagged lines or abrupt changes. Each model had different log K values and similar, but not identical, molar absorptivity curves, making it difficult to conclude which model is the one true model among so many other good models. It is important to note that the top models for the nitric acid data set are quite distinct from the ones that fit the hydrochloric acid data set. The only model common to both top 10 lists includes the absorbers MB+, MB2Cl+, and MB5Cl32+. As an overall trend, the second species in the nitric acid data set models was typically a dimer, whereas in the hydrochloric acid data set models it was typically a three-part species such as a trimer or a dimer with chloride. It seems then that higher aggregates might be favored in chloride solution.

Error Effects on Models. Simulated data was used to understand how measured data sets of MB+ can be modeled with so many different sets of chemical reactions without the results being readily differentiated from each other using RMS residual values as the primary quantitative criterion. Models of simulated data were then analyzed to see how added random error obscures the true answer and if added error can mask the true

Figure 7. Comparison of the angle between true molar absorptivity curves and modeled molar absorptivity curves of the monomer, dimer, and trimer as a function of the amount of error that is added to the raw absorbance data of a simulated data set that is subsequently modeled as raw absorbance data (raw) or effective absorbance data (eff).


conclusively determined to be the right model. Conversely though, if one model is decidedly better in RMS residual than all other reasonable models, then this is quite strong evidence of its validity. This is because of the simple observation that the number of good but incorrect models is legion, whereas the right model is usually singular.

Uniqueness of Molar Absorptivity Curves. Simulated data sets were also studied to determine if there was a correlation between the uniqueness of the molar absorptivity curves used to build a simulated data set and how well the true model could be differentiated from other good models. Each of the four curve sets from Figure 2 was used to build a data set. The corresponding angles between the three molar absorptivity curves in each set are shown in Table 3. Again, the difference between two molar absorptivity curves is quantified by calculating the n-dimensional angle between them, where n is the vector length of the curve. The table also lists the difference in RMS residual between the true model and the best incorrect model. All data sets used the same chemical reactions (Model A) and all had added error with a standard deviation of 0.001. The results indicate that the greater the distinctions between the three molar absorptivity curves, the greater the gap between the RMS residual of the true model and the next best models. Curve sets I and II are realistic molar absorptivity curves taken from measured data sets with relatively small angles between the molar absorptivity curves in the set. These curves, with average angles of 42.12° and 30.48° (Table 3), had a very small difference in the RMS residual of the true model and the next best model, differing by only 0.5%, and their resulting molar absorptivity curves were very reasonable. Curve set I exhibited this to an extreme, with the model including species MB+, MB2Cl+, and MB5Cl32+ having a better RMS residual than the model with which the data set was built. The artificial curves in sets III and IV are more mutually orthogonal, and the RMS residual values between the true model and the best alternate differ by 24% and 47%, respectively. The greater the uniqueness of a set of molar absorptivity curves, the better this type of mathematical analysis can lock in on the correct model and the associated log K values and molar absorptivity curves. Conversely, the less unique a set of molar absorptivity curves is, the more freedom this type of mathematical analysis has and the smaller the amount of error needed to obscure the data. This makes it difficult to differentiate a good model from the true model. The combination of added error with relatively similar molar absorptivity curves creates a very slippery mathematical situation. Since the molar absorptivity curves for the real data are like curve set II, the uniqueness of a set of curves, or lack thereof, explains why, in the case of MB+ solutions, so many

model so that all tested models would give similar RMS residual values. Different amounts of error were added to the raw simulated data set built with curve set II from Figure 2. The data sets differ only by the amount of error introduced into the simulated raw absorbance data. Each set was then modeled according to Model A (Scheme 1), the one from which they were built, as well as with two other models represented by the chemical reactions in Schemes 2 and 3.

Scheme 2. Model B

Scheme 3. Model C

Figure 8 depicts how error obfuscates the true model with other models. When no or minimal error was added to the data

Figure 8. Graph of RMS residual for data simulated with curve set II according to Model A, and then modeled according to Models A, B, and C. See text for details.

set, the true model (Model A) was clearly distinguishable from other models by RMS residual values. When error with a standard deviation of 0.0001 was added to the data set, differentiating the true model from the others based on RMS residual became much more difficult. With added error having a standard deviation of 0.001 or more, the models could not be differentiated at all: the RMS residual values were only 0.7% greater for the incorrect models than for the true model. It is interesting to note that in the cases with 0.001 error, factor analysis on the residuals shows that using Model C leaves a significant factor behind, whereas using Model B or Model A does not (see the Supporting Information). Clearly the degree of random noise must be taken into account before a particular model can be

Table 3. RMS Residual Values for the Modeling of a Simulated Data Set (With Error Added) Based on the True Model and the Best Alternate Model^a

| curve set | angles between molar absorptivity curves (deg) | average angle (deg) | RMS residual of true model | RMS residual of best alternate | difference in RMS residual |
|---|---|---|---|---|---|
| I | 58.15/47.84/20.36 | 42.12 | 0.012 377 | 0.012 319 | −0.5% |
| II | 33.58/42.53/15.53 | 30.48 | 0.009 793 4 | 0.009 840 6 | +0.5% |
| III | 69.17/89.99/69.65 | 76.27 | 0.010 25 | 0.012 75 | +24% |
| IV | 90/90/90 | 90.0 | 0.009 985 7 | 0.014 665 | +47% |

^a The angles between the molar absorptivity curves used to build each simulated data set correspond to the orthogonality of the molar absorptivity curves to each other.
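The last column of Table 3 can be reproduced from the two RMS residual columns, assuming the difference is expressed relative to the residual of the true model (the table does not state the definition, but this reading matches the reported values). The dictionary layout and function name are illustrative.

```python
# RMS residuals from Table 3: (true model, best alternate) for each curve set.
table3 = {
    "I": (0.012377, 0.012319),
    "II": (0.0097934, 0.0098406),
    "III": (0.01025, 0.01275),
    "IV": (0.0099857, 0.014665),
}

def percent_gap(true_rmsr, alternate_rmsr):
    """Relative excess of the alternate model's RMS residual, in percent."""
    return 100.0 * (alternate_rmsr - true_rmsr) / true_rmsr

gaps = {name: percent_gap(*values) for name, values in table3.items()}
```

A negative gap, as for curve set I, means the alternate model fits the noisy data slightly better than the model that actually generated it.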


standard error in raw absorbance be greater than 10^7 to ensure that the best model is likely the correct one. Invoking these measures will help make factor analysis a more reliable chemometric tool as well as make it more discernible to researchers who use and evaluate it.
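The suggested quantitative check can be written as a one-line ratio. The sketch below reads the threshold as 10^7, which is an assumption about a value whose superscript appears to have been flattened in the extracted text; the function names are hypothetical, and the angles are taken from Table 3.

```python
def uniqueness_ratio(angles_deg, noise_sd):
    """Smallest curve angle (degrees) squared, divided by the absorbance standard error."""
    return min(angles_deg) ** 2 / noise_sd

def likely_distinguishable(angles_deg, noise_sd, threshold=1.0e7):
    """True when the ratio exceeds the (assumed) threshold of 10^7."""
    return uniqueness_ratio(angles_deg, noise_sd) > threshold

curve_set_ii = (33.58, 42.53, 15.53)    # angles from Table 3
curve_set_iv = (90.0, 90.0, 90.0)

ii_at_1e4 = likely_distinguishable(curve_set_ii, 1.0e-4)
ii_at_1e5 = likely_distinguishable(curve_set_ii, 1.0e-5)
iv_at_1e3 = likely_distinguishable(curve_set_iv, 1.0e-3)
iv_at_1e4 = likely_distinguishable(curve_set_iv, 1.0e-4)
```

Under this reading, curve set II fails the check at a noise level of 0.0001 and curve set IV fails at 0.001, which are the levels at which the text reports the RMS residuals of competing models becoming hard to tell apart.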

models resulted in similar RMS residual values such that no one model could be identified as the true model. A final test of simulated models as a function of added error was also carried out with a simulated data set built from curve set IV to distinguish how curve uniqueness, as opposed to random error, was obscuring the true model. Upon modeling the data set with all three models (A, B, and C), it is clear that there is much more separation between the RMS residuals of the different models. Comparing Figure 9 to Figure 8, only



ASSOCIATED CONTENT

Supporting Information

Plethora of modeling results of both real and simulated data. This material is available free of charge via the Internet at http://pubs.acs.org.



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

This material is based upon work supported by the National Science Foundation under Grant No. CHE-0911527 and by the American Chemical Society Petroleum Research Fund (Grant No. 49503-UR3).

Figure 9. RMS residuals for data set simulated with curve set IV according to Model A and then modeled with Models A, B, and C.



upon the addition of error with a standard deviation of 0.001 do the RMS residuals start to become indistinguishable. This is 10 times the amount of error that it took to obscure the data set built with curve set II. For the measured data sets of MB+ in nitric and hydrochloric acid, the molar absorptivity curves produced from modeling had an average calculated angle of 30.48°. Furthermore, it is likely that the molar absorptivity curves of aggregates of MB, or any other dye molecules, will always share common spectral features that render them quite comparable mathematically. Thus their lack of uniqueness in combination with the random error from instrument noise creates a situation in which the actual model cannot be definitively identified by this method of analysis. Furthermore, molar absorptivity curves for any set of aggregates will likely result in a similar situation that limits the sensitivity of modeling through factor analysis to the point where findings are inconclusive. To further study MB+ in solution it is clear that the error must be reduced. This can be done with better instrumentation but also by acquiring more data. Beyond this, if additional information on the identity of possible chemical species can be elucidated by alternate means such as mass spectrometry or conductivity, then the list of possible models for the solutionphase behavior of methylene blue can be pared down considerably. Ultimately, care must be taken when studying molar absorptivity curves via factor analysis. Specifically, in order to optimize the validity of the models based on equilibriumrestricted factor analysis (1) raw (rather than effective) absorbance should be used, (2) RMS residual values should be analyzed and reported for not only the best model but also for at least the five next best models, and (3) the degree of uniqueness between all molar absorptivity curves should be analyzed and reported. 
For the final criterion, we suggest that calculating the angle between curves as vectors taken here is a simple and thorough approach. Furthermore, on the basis of our initial studies, we suggest for a quantitative formulation that the smallest curve angle in degrees squared divided by the

REFERENCES

(1) Rabinowitch, E.; Epstein, L. F. J. Am. Chem. Soc. 1941, 63, 69−78.
(2) Holmes, W. C. Ind. Eng. Chem. 1924, 16, 35−40.
(3) Bergmann, K.; O’Konski, C. T. J. Phys. Chem. 1963, 67, 2169−2177.
(4) Braswell, E. J. Phys. Chem. 1968, 72, 2477−2483.
(5) Mukerjee, P.; Ghosh, A. K. J. Am. Chem. Soc. 1970, 92, 6408−6412.
(6) Némethy, G.; Scheraga, H. A. J. Phys. Chem. 1962, 66, 1773−1789.
(7) Padday, J. F. J. Phys. Chem. 1968, 72, 1259−1264.
(8) Murakami, K. Dyes Pigm. 2002, 53, 31−43.
(9) Yao, H.; Isohashi, T.; Kimura, K. J. Phys. Chem. B 2007, 111, 7176−7183.
(10) Nandini, R.; Vishalakshi, B. E-J. Chem. 2011, 8, S253−S265.
(11) Ghosh, A. K. J. Am. Chem. Soc. 1970, 92, 6415−6418.
(12) Mukerjee, P.; Ghosh, A. K. J. Am. Chem. Soc. 1970, 92, 6413−6415.
(13) Mukerjee, P.; Ghosh, A. K. J. Am. Chem. Soc. 1970, 92, 6403−6407.
(14) Zhao, Z. M.; Malinowski, E. R. J. Chemom. 1999, 13, 83−94.
(15) Patil, K.; Pawar, R.; Talap, P. Phys. Chem. Chem. Phys. 2000, 2, 4313−4317.
(16) Hemmateenejad, B.; Absalan, G.; Hasanpour, M. J. Iran. Chem. Soc. 2011, 8, 166−175.
(17) Malinowski, E. R. Factor Analysis in Chemistry, 3rd ed.; Wiley: New York, 2002.
(18) Zhao, Z. M.; Malinowski, E. R. Appl. Spectrosc. 1999, 53, 1567−1574.
(19) Liu, Z.; Patterson, D. G.; Lee, M. L. Anal. Chem. 1995, 67, 3840−3845.
(20) Duewer, D. L.; Kowalski, B. R.; Fasching, J. L. Anal. Chem. 1976, 48, 2002−2010.
(21) Woodruff, H. B.; Tway, P. C.; Love, L. J. C. Anal. Chem. 1981, 53, 81−84.
(22) Conny, J. M.; Meglen, R. R. Anal. Chem. 1992, 64, 2580−2589.
(23) Swain, C. G.; Bryndza, H. E.; Swain, M. S. J. Chem. Inf. Comput. Sci. 1979, 19, 19−23.
(24) Malinowski, E. R. Anal. Chem. 1977, 49, 606−612.
(25) Vander Griend, D. A.; DeVries, M. J. “Sivvu” software, http://www.calvin.edu/~dav4/Sivvu.htm (accessed December 14, 2012).
(26) Cary-specifications.pdf.
