Anal. Chem. 2004, 76, 5726-5733
Quantification of Ternary Mixtures of Heavy Metal Cations from Metallochromic Absorbance Spectra Using Neural Network Inversion
Dan Mikami,† Toshifumi Ohki,‡ Ken Yamaji,† Saeko Ishihara,‡ Daniel Citterio,§ Masafumi Hagiwara,† and Koji Suzuki*,‡,§,#
Departments of Information and Computer Science and Applied Chemistry, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan, Kanagawa Academy of Science and Technology (KAST), KSP West 614, 3-2-1 Sakado, Takatsu-ku, Kawasaki 213-0012, Japan, and Core Research for Evolutional Science and Technology (CREST), JST Agency, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012, Japan
A new method based on artificial neural networks (ANN) for the processing of spectrophotometric data is proposed and illustrated on the example of the simultaneous quantification of ternary mixtures of zinc, cadmium, and mercury cations in aqueous solutions. Three types of commercially available metallochromic indicators were used as a simple model setup to create spectral data analogous to those normally received from an optical sensor array. In conventional ANN training methods for chemical sensors based on spectrophotometric data, a calibration is established by mathematically correlating the measured optical signal as network input with the concentration of the calibration sample as network output. In several situations, however, especially when dealing with mixed sample solutions, the relationship between a measured absorption spectrum and the corresponding ion concentrations is ambiguous, resulting in an “ill-posed problem”. On the other hand, if the training direction is reversed by correlating known sample concentrations with measured optical signals, the relationship becomes reasonable for the ANN to obtain its structure. The proposed model illustrated in this paper is based on a more reasonable direct mapping and estimation by artificial neural network inversion (ANNI). In the training step, sample mixtures of known concentrations are optically measured to construct networks correlating the input data (ion concentrations) and the output data (absorption spectra). In the estimation step, the ion concentrations of unknown samples are estimated using the constructed ANN. The measured spectra of the unknown samples are fed to the output layer, and the appropriate input concentrations are determined by ANNI. 
When training the ANN system with 143 ternary mixtures of Zn2+, Cd2+, and Hg2+ in a concentration range from 1 to 100 µM, root-mean-square errors of prediction (RMSEP) of 0.45 (Zn2+), 0.96 (Cd2+), and 0.32 µM (Hg2+) were observed for the estimation of concentrations in 30 test samples, using the ANNI procedure. This newly proposed model, which involves the construction of an ANN based on direct mapping and estimation by ANNI, opens up one way to overcome the limitations of nonselective sensors, allowing the use of more easily accessible semiselective receptors to realize smart chemical sensing systems.

* Corresponding author. E-mail: [email protected]. Fax: +81-45-5645095. Tel.: +81-45-566-1568.
† Department of Information and Computer Science, Keio University.
‡ Department of Applied Chemistry, Keio University.
§ Kanagawa Academy of Science and Technology.
# Core Research for Evolutional Science and Technology.
10.1021/ac040024e CCC: $27.50 © 2004 American Chemical Society. Published on Web 08/21/2004
5726 Analytical Chemistry, Vol. 76, No. 19, October 1, 2004

In recent years, processing of analytical data by computer methods has become routine. Due to the immense progress in analytical instrumentation, enormous amounts of data can now be acquired within a short time. This situation required the development of special mathematical and statistical methods in order to extract the desired quantitative chemical information from the overall data, giving rise to the field of chemometrics. The technique of data mining, which searches for relationships and correlations within large amounts of raw data, has become highly important. In this context methods such as principal component analysis (PCA), cluster analysis, discriminant analysis, partial least-squares analysis (PLS), and artificial neural networks (ANN) are widely used among others for complicated information processing.1 Chemometrical methods have been widely applied for the analysis of sample mixtures2-11 and in combination with chemical sensors.12-20 Among those, ANNs are of special interest because of their ability for analytical learning inspired by the human brain.21-31 ANNs can be applied in situations where other methods often fail because a profound knowledge of the theoretical response mechanism in terms of a mathematical function is not required. This is especially useful when simultaneously dealing with a large number of chemical equilibria of different stoichiometries as they are often found in mixed samples. Therefore, most applications of ANNs concerning chemical sensing are focused on multianalyte detection. With the development of electronic noses for gaseous sample mixtures and, more recently, electronic tongues for liquid samples, a paradigm shift in the field of chemical sensing can be observed, where focus is moved from highly selective recognition elements to semiselective receptors.32-35 With these developments, ANNs have gained in importance again.

Concerning chemical sensors, a molecular receptor or ionophore can be regarded as “hardware” and the data processing of the raw signal as “software”. Whereas past studies have mainly aimed at the improvement of the hardware (e.g., selectivity, sensitivity) and the software (data processing) independently, collaborations between chemists and computer engineers have become more common.36,37 Harmonizing the performance of the hardware with the software, instead of independent development of both, may enable the establishment of smart chemical sensing systems in the future. In this paper, we demonstrate an application of an artificial neural network system, which was specifically developed for the analysis of spectrophotometrical data and which is new in terms of its application to chemical sensing. The goal of this work is to establish a novel software model for spectra analysis and to validate its efficiency. Since the quantification of heavy metal cations in aqueous systems is an important task in analytical chemistry and the number of suitable highly selective sensors is still limited,38,39 we selected a simple multianalyte sensing array using commercially available semiselective metal ion indicators as a model application. The binding of the metal ion to the indicator is monitored by measuring absorption spectra of aqueous solutions containing one of several dyes and a mixture of cations.

The estimation of the ion concentration based on measured optical spectra can sometimes be regarded as an “inverse problem”. In this example, the cause, meaning the concentration of the different ions, is estimated based on the measured result, represented by the spectra. In many cases, it is difficult to achieve the inverse mapping which transforms the result into the cause since one specific result can be due to different possible causes. Conventional methods for multivariate data analysis (e.g., back-propagation neural networks and partial least-squares regression) by themselves do not have the ability to describe such relationships reliably. Therefore, we propose and apply a new method consisting of a training step based on direct mapping and an estimation step based on artificial neural network inversion (ANNI).40,41 During the training step, the ion concentrations are correlated with the absorbance spectra, linking the cause and the result in a proper direction. For the estimation of unknown samples, the measured spectra are fed to the output layer of the constructed network. Then, the input (ion concentrations) is determined using ANNI. Instead of inverse mapping using conventional methods resulting in unreasonable estimations, this combination of direct mapping and network inversion enables the more reliable quantification of ions in mixed solutions. To the best of our knowledge, the technique of network inversion has not been applied to the analysis of spectral data before. In a model experiment we demonstrate the simultaneous quantification of Zn2+, Cd2+, and Hg2+ in mixed aqueous solutions using three metallochromic indicators, methylthymol blue, murexide, and 4,7-dihydroxy-1,10-phenanthroline, as the single “sensor elements” combined into a sensor array. This simple model study shows that nonselective multiple sensing elements or sensors together with the newly proposed software procedure allow the realization of multiple and selective analyte determinations.

(1) Duda, R. O.; Hart, P. E.; Stork, D. G. Pattern Classification, 2nd ed.; John Wiley & Sons: New York, 2001.
(2) Vitouchová, M.; Jancár, L.; Sommer, L. Fresenius J. Anal. Chem. 1992, 343, 274-279.
(3) Vitouchová, M.; Jancár, L.; Sommer, L. Fresenius J. Anal. Chem. 1992, 343, 274-279.
(4) Ni, Y. Anal. Chim. Acta 1993, 284, 199-205.
(5) JiJi, R. D.; Cooper, G. A.; Booksh, K. S. Anal. Chim. Acta 1999, 397, 61-72.
(6) Kompany-Zareh, M.; Massoumi, A. Fresenius J. Anal. Chem. 1999, 363, 219-223.
(7) Kompany-Zareh, M.; Massoumi, A.; Pezeshk-Zadeh, Sh. Talanta 1999, 48, 283-292.
(8) Esteves da Silva, J. C. G.; Oliveira, C. J. S. Talanta 1999, 49, 889-897.
(9) Thomas, E. V. Anal. Chem. 2000, 72, 2821-2827.
(10) Moberg, L.; Karlberg, B.; Blomqvist, S.; Larsson, U. Anal. Chim. Acta 2000, 411, 137-143.
(11) Espinosa-Mansilla, A.; Valenzuela, M. I. A.; Muñoz de la Peña, A.; Salinas, F.; Cañada F. C. Anal. Chim. Acta 2001, 427, 129-136.
(12) Abdollahi, H. Anal. Chim. Acta 2001, 442, 327-336.
(13) Saurina, J.; Hernández-Cassou, S. Anal. Chim. Acta 2001, 438, 335-352.
(14) Beebe, K. R.; Kowalski, B. R. Anal. Chem. 1988, 60, 2273-2278.
(15) Forster, R. J.; Regan, F.; Diamond, D. Anal. Chem. 1991, 63, 876-882.
(16) Müller-Ackermann, E.; Panne, U.; Niessner, R. Anal. Methods Instrum. 1995, 2, 182-189.
(17) Jurs, P. C.; Bakken, G. A.; McClelland, H. E. Chem. Rev. 2000, 100, 2649-2678.
(18) Ertas, N.; Akkaya, E. U.; Ataman, O. Y. Talanta 2000, 51, 693-699.
(19) Plegge, V.; Slama, M.; Süselbeck, B.; Wienke, D.; Spener, F.; Knoll, M.; Zaborosch, C. Anal. Chem. 2000, 72, 2937-2942.
(20) Krantz-Rülcker, C.; Stenberg, M.; Winquist, F.; Lundström, I. Anal. Chim. Acta 2001, 426, 217-226.
(21) Grate, J. W.; Patrash, S. J.; Kaganove, S. N.; Abraham, M. H.; Wise, B. M.; Gallagher, N. B. Anal. Chem. 2001, 73, 5247-5259.
(22) Zupan, J.; Gasteiger, J. Anal. Chim. Acta 1991, 248, 1-30.
(23) White, J.; Kauer, J. S.; Dickinson, T. A.; Walt, D. R. Anal. Chem. 1996, 68, 2191-2202.
(24) Hongmei, W.; Lishi, W.; Wanli, X.; Baogui, Z.; Chengjun, L.; Jianxing, F. Anal. Chem. 1997, 69, 699-702.
(25) Sutter, J. M.; Jurs, P. C. Anal. Chem. 1997, 69, 856-862.
(26) Chan, H.; Butler, A.; Falck, D. M.; Freund, M. S. Anal. Chem. 1997, 69, 2373-2378.
(27) Johnson, S. R.; Sutter, J. M.; Engelhardt, H. L.; Jurs, P. C.; White, J.; Kauer, J. S.; Dickinson, T. A.; Walt, D. R. Anal. Chem. 1997, 69, 4641-4648.
(28) Walt, D. R.; Dickinson, T.; White, J.; Kauer, J.; Johnson, S.; Engelhardt, H.; Sutter, J.; Jurs, P. Biosens. Bioelectron. 1998, 13, 697-699.
(29) Bessant, C.; Selwayan, S. Anal. Chem. 1999, 71, 2806-2813.
(30) Legin, A.; Rudnitskaya, A.; Vlasov, Y.; Di Natale, C.; Mazzone, E.; D’Amico, A. Electroanalysis 1999, 11, 814-820.
(31) Mortensen, J.; Legin, A.; Ipatov, A.; Rudnitskaya, A.; Vlasov, Y.; Hjuler, K. Anal. Chim. Acta 2000, 403, 273-277.
(32) Li, Q.; Yao, X.; Chen, X.; Liu, M.; Zhang, R.; Zhang, X.; Hu, Z. Analyst 2000, 125, 2049-2053.
(33) Lavigne, J. L.; Savoy, S.; Clevenger, M. B.; Ritchie, J. E.; McDoniel, B.; Yoo, S.-J.; Anslyn, E. V.; McDevitt, J. T.; Shear, J. B.; Neikirk, D. J. Am. Chem. Soc. 1998, 120, 6429-6430.
(34) Albert, K. J.; Lewis, N. S.; Schauer, C. L.; Sotzing, G. A.; Stitzel, S. E.; Vaid, T. P.; Walt, D. R. Chem. Rev. 2000, 100, 2595-2626.
(35) Raimundo, I. A.; Narayanaswamy, R. Anal. Chim. Acta 2003, 90, 189-197.
(36) Lavigne, J. J.; Anslyn, E. V. Angew. Chem., Int. Ed. 2001, 40, 3118-3130.
(37) Lavine, B. K.; Workman, J., Jr. Anal. Chem. 2002, 74, 2763-2769.
(38) Goodey, A.; Lavigne, J. J.; Savoy, S. M.; Rodriguez, M. D.; Curey, T.; Tsao, A.; Simmons, G.; Wright, J.; Yoo, W. J.; Sohn, Y.; Anslyn, E. V.; Shear, J. B.; Neikirk, D. P.; McDevitt, J. T. J. Am. Chem. Soc. 2001, 123, 2259-2570.
(39) Bühlmann, P.; Pretsch, E.; Bakker, E. Chem. Rev. 1998, 98, 1593-1687.
(40) Linden, A.; Kindermann, J. Proc. Int. Joint Conf. Neural Networks (IJCNN), Washington, DC, 1989; Vol. 2, pp 425-430.
(41) Jensen, C. A.; Russell, D. R.; Marks, R. J.; El-Sharkawi, M. A.; Jung, J.-B.; Miyamoto, R. T.; Anderson, G. M.; Eggen, C. J. Proc. IEEE 1999, 87, 1536-1549.

THEORETICAL CONSIDERATIONS

Spectral Analysis. The absorption spectra of metallochromic indicators vary with the concentration of metal ions in a mixed sample. In other words, the concentration of the ion is the cause leading to the result of changing absorption spectra. Therefore, to estimate the concentration of an unknown sample mixture from its absorption spectrum means to estimate the cause from the result, posing a typical inverse problem. The difficulty with the inverse problem encountered here is the fact that the inverse mapping is not generally available due to the possible nonexistence of an appropriate answer or due to the lack of uniqueness in the reverse relationship as a result of using nonselective and nonlinearly responding sensors in mixed sample solutions. This situation, which leads to unreliable estimations, is termed an “ill-posed problem”. Figure 1 shows a schematic representation of the inverse relationship between the result (absorbance) and the cause (concentration). The first arrow, mapping a point from A into B, denotes an example of direct mapping, which does not suffer from problems as mentioned above. The second arrow, mapping a point from B into A, is an example of inverse mapping, which estimates a correct answer. Such a situation is encountered when a sensor selectively responds to a specific ion and its concentration lies within the dynamic response range. The third arrow represents an example of inverse mapping not leading to a correct answer due to the nonexistence of an appropriate solution. The fourth arrow, which does not lead to a single answer, shows a lack of uniqueness. This case is commonly observed when a sensor does not respond to a specific analyte with high selectivity but to several analytes simultaneously. A unique correct answer by inverse mapping is only obtained in the situation described by arrow 2, which applies when using highly selective sensors. In this situation only, an inverse mapping transforms a measured absorbance value into a concentration value.
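The lack of uniqueness can be illustrated with a small numerical sketch. The response function below is purely hypothetical (a saturating response to a weighted sum of two cation concentrations, with made-up sensitivities `k_zn` and `k_cd`); it is not a model taken from this work and only shows why a single nonselective signal cannot be inverted.

```python
import math

def toy_absorbance(c_zn, c_cd, k_zn=0.02, k_cd=0.02):
    """Hypothetical nonselective indicator (assumption, not from the paper):
    saturating response to a weighted sum of two concentrations in µM."""
    load = k_zn * c_zn + k_cd * c_cd
    return load / (1.0 + load)

# Direct mapping: each concentration pair gives exactly one absorbance.
a1 = toy_absorbance(40.0, 6.4)
a2 = toy_absorbance(6.4, 40.0)
a3 = toy_absorbance(23.2, 23.2)

# Inverse mapping: three different mixtures give the same absorbance, so the
# absorbance alone cannot identify the mixture (arrow 4 in Figure 1).
same_signal = math.isclose(a1, a2) and math.isclose(a1, a3)
print(a1, a2, a3, same_signal)
```

In this toy case every mixture with the same weighted sum lies on one line in concentration space, which is the kind of candidate range that a second, differently selective indicator is needed to narrow down.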
For the simultaneous determination of several ions in mixed solutions with poorly selective sensors, this situation is rarely encountered. A semiselective sensor may respond similarly to concentration changes of several cations when exposed to mixed solutions, and it becomes impossible to distinguish the contributions of the individual analytes to the overall signal change. This leads to an ill-posed situation due to a lack of uniqueness as described above (Figure 1, arrow (4)). Conventional calibration methods based on inverse mapping may fail in such cases. Direct mapping, on the other hand, is expected to give more reasonable results.

Analysis Procedure. The newly proposed method requires separate acquisition of the spectral data for every sensor in mixed solutions. Figure 2 shows a schematic outline of the complete process, which can be divided into two steps, the training step and the estimation step. In the training step (Figure 2A), separate neural networks are created for each sensor in the array and trained to learn the correlation between the ion concentration and the resulting spectra (direct mapping) using the back-propagation learning algorithm. However, since the target is the quantification of an unknown ion concentration using a measured spectrum, the resulting networks have to be inverted in the estimation step following the training session. In the estimation step (Figure 2B), the measured spectra of test samples are fed to the output layer of the constructed network, and network inversion is applied to search for the appropriate input (ion concentrations). This procedure is repeated for all three networks, resulting in three candidate concentration ranges. Then all the predictions are combined and the overlapping data points are averaged to output the final result. In the following, the training step and the estimation step will be discussed in more detail.

The ANN applied in this model is a commonly used three-layer feed-forward neural network shown in Figure 3. This hierarchical ANN consists of one input layer (IL) with I input neurons, one hidden layer (HL) with J neurons, and one output layer (OL) with K neurons. The input neurons and the hidden neurons, as well as the hidden and the output neurons, are connected by connection weights of Vji and Wkj, respectively. The learning algorithm used in the training step is back-propagation. This method has been proven useful for learning complicated relationships between input and output. However, as mentioned above, BP-ANNs are not successful in some cases of inverse problems. To overcome this problem, the direction of training for the network is reversed compared to conventional applications of BP-ANNs, where the spectral changes are fed into the input layer and the ion concentrations are received from the output layer. The advantage of the opposite training direction is that there is only one correct output (spectrum) for every concentration input. In addition, we used a modified definition of the error to cope with a wide range of concentrations. Normally, the error E is defined as shown in eq 1, whereas we used the definition given in eq 2,
$$E = \frac{1}{2}\sum_{k=1}^{K} (y_k - \hat{y}_k)^2 \tag{1}$$

$$E = \frac{1}{2}\sum_{k=1}^{K} \left(\frac{y_k - \hat{y}_k}{y_k}\right)^2 \tag{2}$$
with K being the number of output neurons of the BP-ANN, yk the estimated output, and ŷk the target output of the kth output node. The output of the network is calculated as follows:
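$$y_k = f(\mathrm{net}_k) \tag{3}$$

$$\mathrm{net}_k = \sum_{j=1}^{J} o_j W_{kj} \tag{4}$$

$$o_j = f(\mathrm{net}_j) \tag{5}$$

$$\mathrm{net}_j = \sum_{i=1}^{I} x_i V_{ji} \tag{6}$$

$$f(x) = \frac{1}{1 + e^{-x}} \tag{7}$$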
where “net” is the net input value and o the output value. The transfer functions used in the hidden and the output neurons are both sigmoidal functions defined in eq 7. Based on the back-propagation algorithm, the connection weights between the hidden and the output layer Wkj are optimized by the following calculation
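$$W_{kj} = W_{kj} - \eta \frac{\partial E}{\partial W_{kj}} \tag{8}$$

where

$$\frac{\partial E}{\partial W_{kj}} = \frac{1}{\hat{y}_k}\left(\frac{y_k - \hat{y}_k}{\hat{y}_k}\right) y_k (1 - y_k)\, o_j \tag{9}$$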
In a similar manner, the weights between the input neurons and the hidden neurons Vji are determined as follows:
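$$V_{ji} = V_{ji} - \eta \frac{\partial E}{\partial V_{ji}} \tag{10}$$

$$\frac{\partial E}{\partial V_{ji}} = \sum_{k=1}^{K} \frac{1}{\hat{y}_k}\left(\frac{y_k - \hat{y}_k}{\hat{y}_k}\right) y_k (1 - y_k)\, W_{kj}\, o_j (1 - o_j)\, x_i \tag{11}$$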
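A minimal sketch of the forward pass (eqs 3-7) and of one weight update (eqs 8-11) is given here for orientation. The update terms are coded as eqs 9 and 11 are written, with each difference divided by the target value, and the monitored error uses the same scaling. The layer sizes 3, 10, and 3 are those quoted later in the text; the random initial weights, the single training sample, the learning coefficient, and all function names are illustrative assumptions, and the authors' own implementation (written in C) is not reproduced.

```python
import math
import random

def sigmoid(x):
    """Transfer function of eq 7."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, V, W):
    """Eqs 3-6: inputs x -> hidden outputs o -> network outputs y."""
    o = [sigmoid(sum(V[j][i] * x[i] for i in range(len(x)))) for j in range(len(V))]
    y = [sigmoid(sum(W[k][j] * o[j] for j in range(len(o)))) for k in range(len(W))]
    return o, y

def train_step(x, target, V, W, eta):
    """Update W (eqs 8-9) and V (eqs 10-11) in place for one sample."""
    o, y = forward(x, V, W)
    # Factor shared by eqs 9 and 11 for each output neuron k.
    d = [(1.0 / t) * ((yk - t) / t) * yk * (1.0 - yk) for yk, t in zip(y, target)]
    # Both gradients are computed from the old weights before any update.
    grad_W = [[d[k] * o[j] for j in range(len(o))] for k in range(len(W))]
    grad_V = [[sum(d[k] * W[k][j] for k in range(len(W))) * o[j] * (1.0 - o[j]) * x[i]
               for i in range(len(x))] for j in range(len(V))]
    for k in range(len(W)):
        for j in range(len(o)):
            W[k][j] -= eta * grad_W[k][j]
    for j in range(len(V)):
        for i in range(len(x)):
            V[j][i] -= eta * grad_V[j][i]

def scaled_error(y, target):
    """Half the summed squared differences, each divided by the target value."""
    return 0.5 * sum(((yk - t) / t) ** 2 for yk, t in zip(y, target))

rng = random.Random(0)
I, J, K = 3, 10, 3                       # layer sizes quoted in the text
V = [[rng.uniform(-0.5, 0.5) for _ in range(I)] for _ in range(J)]
W = [[rng.uniform(-0.5, 0.5) for _ in range(J)] for _ in range(K)]
x = [0.4, 0.4, 0.25]                     # hypothetical normalized concentrations
target = [0.3, 0.6, 0.8]                 # hypothetical absorbances

err_before = scaled_error(forward(x, V, W)[1], target)
for _ in range(500):
    train_step(x, target, V, W, eta=0.05)
err_after = scaled_error(forward(x, V, W)[1], target)
print(err_before, err_after)
```

A real calibration would loop over all training mixtures in every epoch; a single sample is used here only to keep the sketch short.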
In the estimation step, the technique of network inversion is applied to test sample data, not previously used for network training. Again, this procedure is performed separately for the spectral data of every single sensor. The inversion is done by computing iteratively an input vector which minimizes the sum of square errors to approximate a given output target. This network inversion is a numerical searching process based on a gradient descent algorithm. The algorithm is well-known as a classical minimization/maximization method for searching an input value resulting in a minimized/maximized error function. In some cases of an ill-posed problem, this minimizing of the sum of square errors is a powerful solution to estimate an approximate answer. In this process, the neural network system constructed during the training step is now used to search for possible, at first unknown, input data (concentrations), leading to the experimentally measured output (absorbance). At first, a random input value x_i^0 (i = 1, 2, ..., I) is created, which is then iterated as mathematically described by eqs 12-14,
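$$\mathbf{y} = g(\mathbf{x}^{t}) \tag{12}$$

$$E \equiv \frac{1}{2}\sum_{k=1}^{K} \left(\frac{y_k - \hat{y}_k}{y_k}\right)^2 \tag{13}$$

$$x_i^{t+1} = x_i^{t} - \eta \frac{\partial E}{\partial x_i} \tag{14}$$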
with η being the learning coefficient. Provided
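$$\frac{\partial E}{\partial x_i} = \sum_{k=1}^{K} \frac{\partial E_k}{\partial x_i} \tag{15}$$

$$\frac{\partial E_k}{\partial x_i} = \frac{\hat{y}_k}{y_k^{2}}\left(\frac{y_k - \hat{y}_k}{y_k}\right) y_k (1 - y_k) \sum_{j=1}^{J} W_{kj}\, o_j (1 - o_j)\, V_{ji} \tag{16}$$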
Following these calculations, the appropriate input values are searched and determined.

Figure 1. Schematic representation of an inverse relationship between a cause (concentration of the analyte) and the corresponding result (measured absorbance for an indicator): (1) direct mapping with unique correlation; (2) inverse mapping leading to the correct answer; (3) inverse mapping with nonexisting answer; (4) inverse mapping with lack of uniqueness.

EXPERIMENTAL SECTION

Reagents. Commercially available reagents of the highest grade were used for the preparations of the aqueous test electrolytes and pH buffer solutions. The distilled and deionized
water used had a resistivity of greater than 1.5 × 10⁷ Ω cm at 25 °C. The metallochromic indicators methylthymol blue (MTB) and murexide ammonium salt (MAS) were bought from Tokyo Kasei Kogyo Co. (Tokyo, Japan) and used as received. 4,7-Dihydroxy-1,10-phenanthroline (DHP) was obtained from Aldrich Chemical Co. (Milwaukee, WI). To increase the solubility in water, the phenanthroline indicator was transformed into the disodium salt by treating 1 equiv of indicator with 2 equiv of 1 M NaOH in methanolic solution. After evaporation of the solvents, the residual dye was dried under high vacuum and used without further purification.

Instruments. All absorbance spectra were recorded on a SPECTRAmax PLUS384 microplate reader (Molecular Devices Corporation, Sunnyvale, CA) using Costar UV-plate 96-well flat bottom microplates (Corning Inc., Corning, NY).

Preparation of Sample Solutions for ANN Training and Testing. A total of 144 ternary mixtures of Zn2+, Cd2+, and Hg2+ (all acetate salts) in magnesium acetate buffered (pH = 5.70, I = 0.0075 M) solutions were prepared as stock samples. Stock solutions for the metal indicators were prepared separately in the same buffer solution. The samples for the actual measurement were prepared by mixing 1:1 of ion stock and indicator stock inside the wells of a microplate. Before spectra acquisition, the samples were allowed to equilibrate for 20 min. The final concentrations for the indicator dyes were 1.80 × 10⁻⁴ M for MTB, 1.51 × 10⁻⁴ M for MAS, and 3.78 × 10⁻⁵ M for DHP. The investigated metal ion concentrations were 1, 2.6, 6.4, 16.0, 40.0, and 100 µM for Zn2+ and Cd2+ and 1, 2.6, 6.4, and 16.0 µM for Hg2+. All possible ternary mixtures were prepared and analyzed. In total, 144 spectra from 300 to 700 nm were measured for each of the metal ion indicators. The evaluation of the ANN was performed based on a cross-validation using the leave-one-out method.
In this case, 143 samples out of the total of 144 were used for the training of the ANN. The remaining sample is treated as the test sample and used to evaluate the reliability of the constructed network. The spectra of the test sample were fed to the output layer of the network, and the input data (ion concentrations) were estimated by ANNI. This procedure was repeated for 30 different test solutions to evaluate the versatility of the network. The test samples were selected in the following manner. From the total of 144 samples, the upper and lower limiting concentrations for each ion were excluded to keep the test samples within the range defined by the training samples. Out of the remaining 32 samples, 30 samples with estimable resulting concentrations were used for testing.

Figure 2. Schematic outline of the data processing system: (A) Training step: construction and training of separate BP-ANNs for each of the sensors in the array. The ion concentrations of the training samples are correlated with their spectra. (B) Estimation step: estimation of concentrations for the test sample by network inversion. The spectral data of the test sample are fed to the output layer (OL) of the network to search for the corresponding input.

Training of ANN and Network Inversion. Out of the complete spectral data, three wavelengths were selected from each set of spectra for the final network construction: 312, 340, and 360 nm for DHP, 448, 548, and 608 nm for MTB, and 480, 536, and 552 nm for MAS. These wavelengths are in the vicinity of the wavelengths of maximum absorbance of each spectrum and reflect the increase and decrease in absorption as well as the shift in the peak. Preliminary tests have shown that the use of three carefully selected wavelengths was sufficient, with no significant improvements observed by using the complete spectra. The concentration values were normalized by c_norm = 100 × √c in order to fit all values in the interval 0-1. The three normalized cation concentrations were used as input and the absorbance at the three selected wavelengths as output data for the training of the BP-ANN. The following network architecture was selected: 3 input neurons, 3 output neurons, 10 hidden neurons, and 200 000 training epochs. During the network inversion, 5000 iterations with a learning coefficient η of 0.01 were performed to calculate candidate concentration ranges for each analyte in the mixture. Then the predictions from three sensors were combined, and the overlapping data points were averaged to give the final concentration estimation. To prevent the result from falling into a local minimum, 20 000 different data points were randomly picked as the first input value x_i^0. All data were processed on FreeBSD-based personal computers using the C language.
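The sample design described above can be checked with a few lines of code. The sketch below enumerates the 6 × 6 × 4 concentration grid, removes the highest and lowest level of each ion to obtain the candidate test samples, and applies the normalization. Reading the normalization as c_norm = 100·√c with c expressed in mol/L is an assumption made here (it maps 1-100 µM onto 0.1-1); the variable and function names are illustrative only.

```python
import itertools
import math

zn_cd_levels = [1, 2.6, 6.4, 16.0, 40.0, 100]   # µM, levels listed in the text
hg_levels = [1, 2.6, 6.4, 16.0]                 # µM

# All ternary mixtures (Zn2+, Cd2+, Hg2+): 6 x 6 x 4 combinations.
mixtures = list(itertools.product(zn_cd_levels, zn_cd_levels, hg_levels))

# Candidate test samples: drop the lowest and highest level of each ion.
inner = [(zn, cd, hg) for zn, cd, hg in mixtures
         if zn in zn_cd_levels[1:-1]
         and cd in zn_cd_levels[1:-1]
         and hg in hg_levels[1:-1]]

def normalize(c_micromolar):
    """c_norm = 100 * sqrt(c), assuming c is taken in mol/L (assumption)."""
    return 100.0 * math.sqrt(c_micromolar * 1e-6)

print(len(mixtures), len(inner))
print(normalize(1), normalize(40.0), normalize(100))
```

The counts reproduce the 144 mixtures and the 32 remaining samples mentioned in the text, from which 30 were used for testing.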
Figure 3. Structure of the three-layer back-propagation neural network with Vji and Wkj being the connection weights between the input and the hidden neurons and the hidden and the output neurons, respectively. The input layer (IL), the hidden layer (HL), and the output layer (OL) consist of I, J, and K neurons, respectively.
RESULTS AND DISCUSSION

Direction of Training for the BP-ANN. During the training session for the newly proposed method, a BP-ANN is taught to learn the relationship between the concentration of ions and the absorption spectra. This direction is the opposite of the normally
applied training direction for ANNs dealing with spectrophotometrical data. Therefore, we compared the learning ability of a BP-ANN for both directions of training. In Figure 4, the BP-ANN predicted concentration output values are plotted against the real concentration values in ternary mixtures obtained from the spectra of MTB for a network trained in the conventional direction with absorbance values at three wavelengths as the input and the concentrations of the three cations as the output. No agreement between predicted and real values can be observed, clearly demonstrating the difficulties of ANNs to treat nonunique correlations. If the direction of training is inversed and concentration values are used as the input data however, a very high correlation between the predicted and the actually measured absorbance is found for all investigated wavelengths (Figure 5), with the BP-ANN estimated results fitting the theoretical line with slope of unity. Similar results were obtained for the remaining metal indicators. Three independent networks for the three indicators were constructed. Estimation of Unknown Sample Concentrations by Network Inversion. After the successful training of the three BPANNs was completed, the concentrations of the three ions in a mixture of a test sample were estimated by means of network inversion. For every network system, a separate concentration estimation is obtained. Due to the high cross-sensitivity of the applied metal indicators, no reliable estimation of the ion concentration is possible by using the information gained from one sens-
Figure 4. Plots of real concentrations versus BP-ANN estimated concentrations for the methylthymol blue network trained in a conventional direction with ternary mixtures of Zn2+, Cd2+, and Hg2+. The concentrations were estimated using absorbance values at three wavelengths as the input data: (a) Zn2+ concentration; (b) Cd2+ concentration; (c) Hg2+ concentration.
Figure 5. Plots of measured absorbance versus BP-ANN estimated absorbance for the methylthymol blue network trained in the reversed direction with ternary mixtures of Zn2+, Cd2+, and Hg2+. The absorbances were estimated using the concentrations of the three cations as the input data: (a) 448 nm; (b) 548 nm; (c) 608 nm.
Analytical Chemistry, Vol. 76, No. 19, October 1, 2004
5731
Figure 6. Concentration ranges (normalized concentrations) estimated by neural network inversion for one example of a ternary mixture of 40.0 µM Zn2+, 40.0 µM Cd2+, and 6.4 µM Hg2+ using three indicators: (a) DHP; (b) MTB; (c) MAS; (d) combination of the single concentration estimations resulting in a final intersection at 39.93 µM Zn2+, 40.86 µM Cd2+, and 6.45 µM Hg2+. Table 1. Comparison of the Actual Concentration Values and the Results Estimated by Network Inversion for 30 Ternary Mixtures of Zn2+, Cd2+, and Hg2+ a concentration found (µM)b
concentration added (µM)
relative error of prediction (%)
sample
Zn2+
Cd2+
Hg2+
Zn2+
Cd2+
Hg2+
Zn2+
Cd2+
Hg2+
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
2.56 2.56 2.56 2.56 2.56 2.56 2.56 6.40 6.40 6.40 6.40 6.40 6.40 6.40 6.40 16.00 16.00 16.00 16.00 16.00 16.00 16.00 40.00 40.00 40.00 40.00 40.00 40.00 40.00 40.00
2.56 6.40 6.40 16.00 16.00 40.00 40.00 2.56 2.56 6.40 6.40 16.00 16.00 40.00 40.00 2.56 2.56 6.40 6.40 16.00 16.00 40.00 2.56 2.56 6.40 6.40 16.00 16.00 40.00 40.00
2.56 2.56 6.40 2.56 6.40 2.56 6.40 2.56 6.40 2.56 6.40 2.56 6.40 2.56 6.40 2.56 6.40 2.56 6.40 2.56 6.40 6.40 2.56 6.40 2.56 6.40 2.56 6.40 2.56 6.40
2.07 ( 0.15 2.28 ( 0.01 2.39 ( 0.29 2.16 ( 0.02 2.45 ( 0.10 2.65 ( 0.09 2.47 ( 0.15 6.53 ( 0.09 6.31 ( 0.02 6.17 ( 0.12 6.11 ( 0.11 6.26 ( 0.03 6.73 ( 0.28 6.10 ( 0.09 5.70 ( 0.23 16.37 ( 0.08 16.43 ( 0.19 16.09 ( 0.06 16.58 ( 0.11 16.22 ( 0.08 15.87 ( 0.45 16.12 ( 0.17 39.52 ( 0.09 39.38 ( 0.09 38.89 ( 0.19 39.37 ( 0.18 38.97 ( 0.13 39.83 ( 0.15 39.22 ( 0.17 39.93 ( 0.16
3.50 ( 0.64 6.88 ( 0.02 6.34 ( 1.33 16.56 ( 0.17 17.53 ( 0.70 40.70 ( 0.69 42.31 ( 0.98 2.68 ( 0.31 2.60 ( 0.05 6.37 ( 0.43 6.51 ( 0.53 14.58 ( 0.17 15.21 ( 0.96 41.50 ( 0.50 41.97 ( 0.91 2.03 ( 0.31 2.31 ( 0.62 7.23 ( 0.22 5.75 ( 0.44 15.29 ( 0.35 16.26 ( 1.25 41.51 ( 0.91 2.22 ( 0.39 2.22 ( 0.70 5.99 ( 0.86 5.84 ( 0.95 15.92 ( 0.73 15.50 ( 1.03 41.82 ( 1.05 40.86 ( 1.01
2.43 ( 0.08 2.41 ( 0.04 6.13 ( 0.19 2.34 ( 0.03 5.85 ( 0.18 2.50 ( 0.04 6.19 ( 0.11 2.28 ( 0.04 5.38 ( 0.09 2.26 ( 0.04 6.16 ( 0.10 2.73 ( 0.04 6.26 ( 0.17 2.49 ( 0.05 6.50 ( 0.12 2.79 ( 0.06 6.43 ( 0.11 2.97 ( 0.06 6.76 ( 0.08 2.18 ( 0.04 5.94 ( 0.01 6.14 ( 0.11 2.81 ( 0.05 6.32 ( 0.10 2.70 ( 0.07 6.63 ( 0.12 2.79 ( 0.06 6.78 ( 0.10 2.91 ( 0.05 6.45 ( 0.12
-19.30 -10.99 -6.64 -15.59 -4.20 3.68 -3.59 2.05 -1.43 -3.67 -4.52 -2.24 5.15 -4.62 -10.95 2.29 2.69 0.59 3.65 1.36 -0.84 0.73 -1.21 -1.54 -2.77 -1.59 -2.59 -0.42 -1.94 -0.17
36.67 7.46 -0.95 3.48 9.56 1.75 5.77 4.63 1.37 -0.44 1.77 -8.91 -4.95 3.75 4.94 -20.65 -9.81 12.90 -10.21 -4.44 1.61 3.77 -13.11 -13.39 -6.44 -8.68 -0.47 -3.15 4.55 2.14
-5.23 -6.02 -4.25 -8.73 -8.54 -2.53 -3.30 -10.84 -16.01 -11.79 -3.79 6.81 -2.26 -2.54 1.56 8.91 0.51 15.87 5.60 -14.92 -7.13 -4.06 9.68 -1.31 5.44 3.56 9.13 5.92 13.52 0.76
a Result indicates the mean value and the standard deviation of the estimated concentration. The relative error of prediction refers to the actual concentration as 100%. b Mean value and standard deviation.
ing element alone. Figure 6a shows a typical example of an output obtained from the DHP network. With the data obtained from DHP as a metal indicator, only the concentration of Hg2+ ions could be estimated satisfactorily. The concentrations of Zn2+ and Cd2+ remain undetermined, indicating a partial selectivity of DHP toward Hg2+ compared to the other ions investigated. The inverted ANN returns a three-dimensional candidate concentration range as its answer. Each plotted spot represents the three component concentrations of a ternary mixture that, according to the ANN estimation, lead to the same absorption spectrum. When the information obtained from all sensing elements involved is combined, however (Figure 6a-c), the three-dimensional concentration ranges from the single indicators are reduced to a narrow intersection, which gives the final concentration estimate for the unknown sample. These data demonstrate the importance of the semiselectivity of the sensing elements used. Only if the applied indicators differ significantly in their selectivity behavior will there be a narrow intersection between the ranges of the single contributing elements, allowing a reliable estimation of the concentrations in a mixture. A typical example of an estimation with the three indicators MAS, DHP, and MTB is given in Figure 6d.

In Table 1, the individual estimation results for the 30 ternary mixture test samples are summarized. The root-mean-square errors of prediction (RMSEPj) were calculated for the three analytes separately (j = Zn2+, Cd2+, Hg2+) according to eq 17,
$$\mathrm{RMSEP}_j = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(c_{ij} - \hat{c}_{ij}\right)^{2}} \qquad (17)$$
where m is the number of samples, $c_{ij}$ is the concentration of the jth component in the ith sample, and $\hat{c}_{ij}$ is the estimated concentration for the same sample. The corresponding values were 0.45 µM for Zn2+, 0.96 µM for Cd2+, and 0.32 µM for Hg2+. The relative errors of the individual results for single ternary mixtures in Table 1 show no specific trends. Many of the mixed samples with three disparate analyte concentrations are correctly quantified (e.g., numbers 7, 14, 22, 25, 27, and 28); however, some others show larger deviations, especially for the Cd2+ ion (e.g., samples 5, 12, 17, 18, and 24). Mixtures with low ion concentrations (2.56 µM) do not generally result in larger error values than mixtures with high ion concentrations (40.00 µM). These facts indicate that the established neural network system can reliably quantify the investigated ternary mixtures within the range limited by the calibration mixtures without the occurrence of concentration-dependent clusters of erroneous results.

CONCLUSIONS

The quantification of ternary mixtures of heavy metal ions was successful with an accuracy sufficient for many applications, despite the chemical simplicity of the sensing elements used. With this approach based on artificial neural networks, no mathematical model of the chemical equilibria involved is required. The same accuracy as obtained with highly selective chemical sensors cannot be expected for such a system. However, for the many cases where such sensors are not yet available or are very cumbersome to develop, it is a very useful alternative to other, often more complicated and more cost- and labor-intensive analytical methods. Additionally, further improvement of the overall performance can be expected when the ANN system is applied to more sophisticated sensing elements. In the application example presented in this paper, metallochromic indicators in aqueous solution were used as a model for an optical sensing system, and complete absorbance spectra were measured. For the data analysis with the newly developed software package, however, it was sufficient to use only a few data points (absorbance at specific wavelengths) selected from the whole spectrum without a significant loss of chemical information. This indicates that the same technique might be applied to other sensor devices, such as ion-selective electrodes, where only one data point (electrochemical potential) is obtained for one sensing element and one sample. Therefore, the newly established software tool is a promising analytical model for the development of diverse, smart chemical sensing systems for multianalyte detection and quantification.

ACKNOWLEDGMENT

This study was partly supported by the Japanese Ministry of Education's Academic Frontier Promotional Project "Science and Technology Program on Molecules, Supra-Molecules and Supra-Structured Materials". D.C. gratefully acknowledges a research fellowship granted by the Science and Technology Agency (STA) of Japan.

Received for review February 13, 2004. Accepted June 10, 2004. AC040024E
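The narrowing of the candidate concentration ranges discussed for Figure 6 can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: three invented linear response functions stand in for the trained indicator networks, a brute-force search over an integer grid stands in for the ANN inversion, and the concentrations are in arbitrary units. It shows only the principle that each partially selective element alone leaves many candidate mixtures, while their intersection is narrow.

```python
from itertools import product

# Hypothetical linear responses (weights for Zn, Cd, Hg) standing in for the
# three trained indicator networks; the coefficients are invented.
RESPONSES = {
    "indicator_A": (1.0, 0.5, 0.1),
    "indicator_B": (0.1, 0.1, 1.0),  # responds mostly to the third ion
    "indicator_C": (0.2, 1.0, 0.3),
}

def predict(weights, conc):
    """Forward model: concentrations (zn, cd, hg) -> one scalar signal."""
    return sum(w * c for w, c in zip(weights, conc))

def candidates(weights, observed, grid, tol=0.05):
    """Grid points whose predicted signal matches the observed signal."""
    return {conc for conc in grid if abs(predict(weights, conc) - observed) <= tol}

grid = list(product(range(11), repeat=3))  # 0..10 for each ion, arbitrary units
true_sample = (3, 5, 2)                    # the "unknown" mixture to recover

# Candidate set of each element alone, using its simulated observation.
per_indicator = {
    name: candidates(w, predict(w, true_sample), grid)
    for name, w in RESPONSES.items()
}
# Combining all elements keeps only mixtures consistent with every signal.
intersection = set.intersection(*per_indicator.values())

for name, cands in per_indicator.items():
    print(name, "alone leaves", len(cands), "candidate mixtures")
print("combined candidates:", sorted(intersection))
```

If the three response functions were nearly proportional to one another, the individual candidate sets would overlap almost completely and the intersection would stay wide, which is the role the text assigns to differences in selectivity.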