Anal. Chem. 2005, 77, 1368-1375
Correlation Coefficient Mapping in Fluorescence Spectroscopy: Tissue Classification for Cancer Detection
Ed Crowell,† Gufeng Wang,† Jason Cox,† Charles P. Platz,‡ and Lei Geng*,†
Department of Chemistry, the Optical Science and Technology Center, and the Center for Biocatalysis and Bioprocessing, University of Iowa, Iowa City, Iowa 52242, and Department of Pathology, College of Medicine, University of Iowa, Iowa City, Iowa 52242
Correlation coefficient mapping has been applied to intrinsic fluorescence spectra of colonic tissue for the purpose of cancer diagnosis. Fluorescence emission spectra were collected from 57 colonic tissue sites in a range of 4 physiological conditions: normal (29), hyperplastic (2), adenomatous (5), and cancerous tissues (21). The sample-sample correlation was used to examine the ability of correlation coefficient mapping to determine tissue disease state. The correlation coefficient map indicates two main categories of samples. These categories were found to relate to disease states of the tissue. Sensitivity, selectivity, predictive value positive, and predictive value negative for differentiation between normal tissue and all other categories were all above 92%. This was found to be similar to, or higher than, tissue classification using existing methods of data reduction. Wavelength-wavelength correlation among the samples highlights areas of importance for tissue classification. The two-dimensional correlation map reveals absorption by NADH and hemoglobin in the samples as negative correlation, an effect not obvious from the one-dimensional fluorescence spectra alone. The integrity of tissue was examined in a time series of spectra of a single tissue sample taken after tissue resection. The wavelength-wavelength correlation coefficient map shows the areas of significance for each fluorophore and their relation to each other. NADH displays negative correlation to collagen and FAD, arising from the absorption of emission or from fluorescence resonance energy transfer. The well-defined pattern of the wavelength-wavelength correlation map for the decay set also clearly shows that there are only three fluorophores of importance in the samples. The sample-sample correlation coefficient map reveals the changes over time and their impact on tissue classification. Correlation coefficient mapping proves to be an effective method for sample classification and cancer detection.

* Corresponding author. E-mail: [email protected]. Tel: (319)335-3167. Fax: (319)335-1270.
† Department of Chemistry, the Optical Science and Technology Center, and the Center for Biocatalysis and Bioprocessing.
‡ Department of Pathology, College of Medicine.
1368 Analytical Chemistry, Vol. 77, No. 5, March 1, 2005
10.1021/ac049074+ CCC: $30.25
© 2005 American Chemical Society. Published on Web 02/02/2005

The successful detection and treatment of cancer is a crucial area of health management. In the year 2004 alone, it is estimated1 that in the United States 1.37 million new cancer cases will be diagnosed and over 500 000 people will die from cancer. In all cancers, early diagnosis is the key to effective treatment and long-term survival. In colorectal cancer, for example, detection of the cancer before it can spread leads to a 90% five-year survival rate. If the cancer has spread to nearby organs and lymph nodes, the five-year survival rate drops to 66%. When the cancer is not detected until it has spread to distant organs, only 9% of the patients will survive the next five years. At this time, ∼38% of colorectal cancers are detected before they spread, leading to an overall five-year survival rate of 62%.1

Cancer diagnosis with optical spectroscopy is currently a very active area of research. A wide variety of methods are under study, including Raman,2-4 fluorescence emission spectra,5-10 fluorescence lifetime,11,12 and diffuse reflectance.5,13 Fluorescence alone has been applied to cancers from many organs of the body, including the oral cavity,14,15 esophagus,16 larynx,17,18 lung,9,19-21 stomach,22 cervix,23 colon,24-31 bladder,32,33 breast,5,34,35 skin,7 and brain.36

Detection of colonic cancer is of particular interest because it is one of the most frequently occurring cancers.1 Colonic cancer is readily recognizable by polyp formation on the otherwise smooth and relatively uniform tissue surface. Colonic tissue consists of two primary layers, the mucosa and the submucosa. The mucosa is an epithelial layer that forms a physical barrier but allows for absorption of nutrients into the blood. This means that the mucosa is well supplied with blood vessels. Further, the epithelial layer grows and sloughs off much as skin, a similar tissue, does. Thus, the mucosa is a very active layer and is well supplied with reduced nicotinamide adenine dinucleotide (NADH) and flavin adenine dinucleotide (FAD), metabolic components that are two of the primary fluorescent species in this tissue. The submucosa is primarily connective tissue containing collagen type IV. Collagen type IV is the third fluorescent species observed in colonic tissue samples.

As cancer develops in colonic tissue, changes in the fluorescence emission spectrum can be observed. With increasing polyp formation, the relative contribution of collagen decreases while the contribution from NADH increases. This allows for disease state differentiation based on these two components. However, there are several complications in the application of this method. Some of the difficulty is due to spectral overlap and hemoglobin absorption within the complex system of the tissue. Further, sample-to-sample and patient-to-patient variation limits the direct use of fluorescence emission spectra as a means of diagnosis.

A crucial step in the optical detection of disease is the classification of samples into normal and diseased categories. A few methods have been developed in the literature for tissue classification, including multivariate linear regression37 and differential normalization,38 with different levels of success. For clinical applications, the classification method needs to be simple and robust. In addition, transportability between laboratories is required for a successful method. In this paper, we utilize correlation coefficient mapping to analyze fluorescence spectra of colonic tissue samples. This work is the first application of correlation coefficient mapping to sample classification, as well as the first application to cancer investigation.39

Two-dimensional correlation analysis is an effective method for enhancing spectral resolution and establishing correlation among spectral bands. It spreads the one-dimensional spectrum into two dimensions by evaluating the correlation between each pair of spectral variables. Two types of correlation are commonly employed to generate 2D correlation maps: the perturbation-based dynamic 2D correlation40-46 and 2D correlation based on statistical parameters.47-56 The former approach collects dynamic data in a specific sequence upon external perturbations, while the latter can be used for static data where spectra are randomly collected, which is ideal for the purpose of cancer diagnosis.

One form of 2D correlation based on statistical parameters is covariance mapping. The central idea of covariance mapping is to plot the variance-covariance matrix, which has been known in the statistical community for many years, as a continuous 2D map. The concept of 2D covariance mapping was introduced by Frasinski et al. in an application of mass spectrometry to study the ionization mechanism.47 Covariance mapping is extensively used in mass spectrometry47-49 and has found applications in infrared,50 Raman,51 and very recently NMR spectroscopy.52 In covariance mapping, the absolute value of covariance is used to indicate the correlation of dynamic changes of spectral intensity, or the similarity of two sets of spectral intensities, between each pair of variables. The covariance contains information on the absolute values of the spectral intensities, which can mask the dynamic changes. It is therefore informative to normalize the covariance by the respective standard deviations to obtain the correlation coefficient. Correlation coefficient mapping was used in infrared and near-infrared spectroscopy by Barton53 and later applied to various forms of spectroscopy.54-56 As has been pointed out by Sasic and Ozaki,54 after normalization, the correlation coefficient falls in the range between -1 and 1, where 1 indicates perfect correlation or similarity, -1 indicates perfect anticorrelation or antisimilarity, and 0 indicates no correlation or no similarity. Any value in between is a measure of the extent of correlation or similarity. Both covariance mapping and correlation coefficient mapping are based on statistical parameters and are methods of statistical 2D correlation spectroscopy.

Correlation analysis was initially performed between variables, so-called variable-variable correlation, or wavelength-wavelength correlation if the wavelength is the variable. Of particular interest is that the correlation can also be evaluated between samples, namely, sample-sample correlation.57,58 The species dynamics upon external perturbations can be obtained through sample-sample correlation.58 It has been demonstrated that sample-sample correlation can also be used in the classification of vibrational spectra.59 In this paper, both sample-sample and variable-variable correlations are used in the application of cancer diagnosis. Sample-sample correlation is examined for its ability to determine the tissue disease state. Variable-variable correlation is used to highlight the spectral regions of importance for tissue classification. In a time series of spectra taken after tissue resection, wavelength-wavelength correlation discloses the relative rates of change of the spectral regions of significance, while the sample-to-sample correlation coefficient map reveals the changes over time and their impact on tissue classification.

(1) Cancer Facts and Figures 2003; American Cancer Society: Atlanta, GA, 2003.
(2) Stone, N.; Kendall, C.; Smith, J.; Crow, P.; Barr, H. Faraday Discuss. 2004, 126, 141-157.
(3) Huang, Z. W.; McWilliams, A.; Lui, H.; McLean, D. I.; Lam, S.; Zeng, H. S. Int. J. Cancer 2003, 107, 1047-1052.
(4) Crow, P.; Stone, N.; Kendall, C.; Uff, J.; Ritchie, A.; Wright, M. J. Urol. 2003, 169, 871.
(5) Breslin, T. M.; Xu, F. S.; Palmer, G. M.; Zhu, C. F.; Gilchrist, K. W.; Ramanujam, N. Ann. Surg. Oncol. 2004, 11, 65-70.
(6) Smith, P. W. J. Cell. Biochem. 2002, (Suppl. 39), 54-59.
(7) Panjehpour, M.; Julius, C. E.; Phan, M. N.; Vo-Dinh, T.; Overholt, S. Lasers Surg. Med. 2002, 31, 367-373.
(8) Brewer, M.; Utzinger, U.; Silva, E.; Gershenson, D.; Bast, R. C.; Follen, M.; Richards-Kortum, R. Lasers Surg. Med. 2001, 29, 128-135.
(9) Zellweger, M.; Goujon, D.; Conde, R.; Forrer, M.; van den Bergh, H.; Wagnieres, G. Appl. Opt. 2001, 40, 3784-3791.
(10) Panjehpour, M.; Overholt, B. F.; Schmidhammer, J. L.; Farris, C.; Buckley, P. F.; Vo-Dinh, T. Gastrointest. Endoscopy 1995, 41, 577-581.
(11) Tadrous, P. J.; Siegel, J.; French, P. M. W.; Shousha, S.; Lalani, E. N.; Stamp, G. W. H. J. Pathol. 2003, 199, 309-317.
(12) Mizeret, J.; Wagnieres, G.; Stepinac, T.; van den Bergh, H. Lasers Med. Sci. 1997, 12, 209-217.
(13) Utzinger, U.; Brewer, M.; Silva, E.; Gershenson, D.; Bast, R. C.; Follen, M.; Richards-Kortum, R. Lasers Surg. Med. 2001, 28, 56-66.
(14) Manjunath, B. K.; Kurein, J.; Rao, L.; Krishna, C. M.; Chidananda, M. S.; Venkatakrishna, K.; Kartha, V. B. J. Photochem. Photobiol. B 2004, 73, 49-58.
(15) Ebihara, A.; Krasieva, T. B.; Liaw, L. H. L.; Fago, S.; Messadi, D.; Osann, K.; Wilder-Smith, P. Lasers Surg. Med. 2003, 32, 17-24.
(16) Pfefer, T. J.; Paithankar, D. Y.; Poneros, J. M.; Schomacker, K. T.; Nishioka, N. S. Lasers Surg. Med. 2003, 32, 10-16.
(17) Arens, C.; Dreyer, T.; Glanz, H.; Malzahn, K. Eur. Arch. Oto-Rhino-Laryngol. 2004, 261, 71-76.
(18) Malzahn, K.; Dreyer, T.; Glanz, H.; Arens, C. Laryngoscope 2002, 112, 488-493.
(19) Zeng, H. S.; Petek, M.; Zorman, M. T.; McWilliams, A.; Palcic, B.; Lam, S. Opt. Lett. 2004, 587-589.
(20) Ikeda, N.; Hiyoshi, T.; Kakihana, M.; Honda, H.; Kato, Y.; Okunaka, T.; Furukawa, K.; Tsuchida, T.; Kato, H.; Ebihara, Y. Lung Cancer 2003, 41, 303-309.
(21) Goujon, D.; Zellweger, M.; Radu, A.; Grosjean, P.; Weber, B. C.; van den Bergh, H.; Monnier, P.; Wagnieres, G. J. Biomed. Opt. 2003, 8, 17-25.
(22) Mayinger, B.; Jordan, M.; Horbach, T.; Horner, P.; Gerlach, C.; Mueller, S.; Hohenberger, W.; Hahn, E. G. Gastrointest. Endoscopy 2004, 59, 191-198.
(23) Mitchell, M. F.; Cantor, S. B.; Ramanujam, N.; Tortolero-Luna, G.; Richards-Kortum, R. Obstet. Gynecol. 1999, 93, 462-470.
(24) Mayinger, B.; Jordan, M.; Horner, P.; Gerlach, C.; Muehldorfer, S.; Bittorf, B. R.; Matzel, K. E.; Hohenberger, W.; Hahn, E. G.; Guenther, K. J. Photochem. Photobiol. B 2003, 70, 13-20.
(25) Schomacker, K. T.; Frisoli, J. K.; Compton, C. C.; Flotte, T. J.; Richter, J. M.; Deutsch, T. F.; Nishioka, N. S. Gastroenterology 1992, 102, 1155-1160.
(26) Schomacker, K. T.; Frisoli, J. K.; Compton, C. C.; Flotte, T. J.; Richter, J. M.; Nishioka, N. S.; Deutsch, T. F. Lasers Surg. Med. 1992, 12, 63-78.
(27) Wang, C. Y.; Lin, J. K.; Chen, B. F.; Chiang, H. H. K. J. Formosan Med. Assoc. 1999, 98, 837-843.
(28) Izuishi, K.; Tajiri, H.; Fujii, T.; Boku, N.; Ohtsu, A.; Ohnishi, T.; Ryu, M.; Kinoshita, T.; Yoshida, S. Endoscopy 1999, 31, 511-516.
(29) Wang, T. D.; Crawford, J. M.; Feld, M. S.; Wang, Y.; Itzkan, I.; Van Dam, J. Gastrointest. Endoscopy 1999, 49, 447-455.
(30) Richards-Kortum, R.; Rava, R. P.; Petras, R. E.; Fitzmaurice, M.; Sivak, M.; Feld, M. S. Photochem. Photobiol. 1991, 53, 777-786.
(31) Zonios, G.; Cothren, R.; Crawford, J. M.; Fitzmaurice, M.; Manoharan, R.; Van Dam, J.; Feld, M. S. Ann. N. Y. Acad. Sci. 1998, 838, 108-115.
(32) Hungerhuber, E.; Kriegmair, M.; Knuechel, R.; Hofstetter, A.; Zaak, D. J. Urol. 2003, 169, 863.
(33) Olivo, M.; Lau, W.; Manivasager, V.; Hoon, T. P.; Christopher, C. Int. J. Oncol. 2003, 22, 523-528.
(34) Sevick-Muraca, E. M.; Hawrysz, D. J. Neoplasia 2000, 388-417.
(35) Hage, R.; Galhanone, P. R.; Zangaro, R. A.; Rodrigues, K. C.; Pacheco, M. T. T.; Martin, A. A.; Netto, M. M.; Soares, F. A.; da Cunha, I. W. Lasers Med. Sci. 2003, 18, 171-176.
(36) Bottiroli, G.; Croce, A. C.; Locatelli, D.; Nano, R.; Giombelli, E.; Messina, A.; Benericetti, E. Cancer Detect. Prev. 1998, 22, 330-339.
(37) Kapadia, C. R.; Cutruzzola, F. W.; Obrien, K. M.; Stetz, M. L.; Enriquez, R.; Deckelbaum, L. I. Gastroenterology 1990, 99, 150-157.
(38) Vo-Dinh, T.; Panjehpour, M.; Overholt, B. F.; Buckley, P. Appl. Spectrosc. 1997, 51, 58-63.
(39) Geng, L. Presented at the 28th FACSS (Federation of Analytical Chemistry and Spectroscopy Societies), Detroit, 2001.
(40) Noda, I. J. Am. Chem. Soc. 1989, 111, 8116-8118.
(41) Noda, I. Appl. Spectrosc. 1993, 47, 1329-1336.
(42) Noda, I. Appl. Spectrosc. 2000, 54, 994-999.
(43) Noda, I.; Dowrey, A. E.; Marcott, C.; Story, G. M.; Ozaki, Y. Appl. Spectrosc. 2000, 54, 236-248A.
(44) Wang, G.; Geng, L. Anal. Chem. 2000, 72, 4531-4542.
(45) He, Y.; Wang, G.; Cox, J.; Geng, L. Anal. Chem. 2001, 73, 2302-2309.
(46) Geng, L.; Cox, J. M.; He, Y. Analyst 2001, 126, 1229-1239.
(47) Frasinski, L. J.; Codling, K.; Hatherly, P. A. Science 1989, 246, 1029-1031.
(48) Card, D. A.; Wisniewski, E. S.; Folmer, D. E.; Castleman, A. W. J. Chem. Phys. 2002, 116, 3554-3567.
(49) Foltin, M.; Stueber, G. J.; Bernstein, E. R. J. Chem. Phys. 1998, 109, 4342-4360.
(50) Marcott, C.; Story, G. M.; Dowrey, A. E.; Noda, I. In Computer-enhanced analytical spectroscopy; Wilkins, C. L., Ed.; Plenum Press: New York, 1993; Vol. 4, pp 237-255.
(51) Colomban, P.; Treppoz, F. J. Raman Spectrosc. 2001, 32, 93-102.
(52) Bruschweiler, R.; Zhang, F. J. Chem. Phys. 2004, 120, 5253-5260.
(53) Barton, F. E.; Himmelsbach, D. S.; Duckworth, J. H.; Smith, M. J. Appl. Spectrosc. 1992, 46, 420-429.
(54) Sasic, S.; Ozaki, Y. Anal. Chem. 2001, 73, 2294-2301.
(55) Windig, W.; Margevich, D. E.; McKenna, W. P. Chemom. Intell. Lab. Syst. 1995, 28, 109-128.
(56) Wang, G.; Geng, L. Anal. Chem. In press.
(57) Zimba, C. Presented at the Second International Symposium on Advanced Infrared Spectroscopy (AIRS II), Durham, NC, 17 June 1996; and at the First International Symposium on Two-Dimensional Correlation Spectroscopy (2DCOS), Sanda, Japan, 30 August 1999.
(58) Sasic, S.; Muszynski, A.; Ozaki, Y. J. Phys. Chem. A 2000, 104, 6380-6387.
(59) Sasic, S.; Katsumoto, Y.; Sato, H.; Ozaki, Y. Anal. Chem. 2003, 75, 4010-4018.

EXPERIMENTAL SECTION

Fluorescence Spectroscopy of Tissue. All tissue samples were examined in vitro, after resection from patients undergoing oncological surgery. Samples were provided by the Department of Pathology of the University of Iowa Hospitals and Clinics, and analysis was performed immediately after receipt. Normal sites were areas of the tissue samples that were away from the cancerous region. All samples were rinsed and irrigated with phosphate-buffered saline (pH 7.4, 150 mM) to prevent tissue degradation. Disease state was determined by a pathologist before fluorescence analysis. These studies were approved by the Institutional Review Board (IRB) of The University of Iowa.

The spectrofluorometer used for all measurements was a SLM48000MHF (Jobin Yvon, Edison, NJ) with a fiber-optic attachment for direction of excitation light and collection of fluorescence. For each sample, the position and angle of the fiber-optic probe were adjusted to maximize emission intensity and reduce scattered light, with care taken that the probe was not in contact with the sample. Excitation was provided by a HeCd laser at 325 nm. The excitation wavelength was selected such that all three major tissue fluorophores are excited. Although a shorter wavelength is more favorable for collagen excitation, 325 nm was chosen to avoid spectral interference from aromatic amino acids, which are excited at shorter wavelengths. Longer wavelength excitation, although more favorable for FAD, cannot be used because of the low absorption cross section of collagen. Emission was filtered using two 345-nm long-pass filters before being directed to the monochromator. These filters are necessary because of strong scattering of the excitation light by the tissue samples. Detection was by a high-sensitivity photomultiplier tube (Hamamatsu 1477).

Correlation Coefficient Mapping. For a series of spectroscopic data collected with m samples at n spectral variables, the covariance between each pair of variables is defined as
$$c(v_1,v_2) = \frac{1}{m-1}\sum_{i=1}^{m}\left[x_i(v_1)-x_{av}(v_1)\right]\left[x_i(v_2)-x_{av}(v_2)\right] = \frac{1}{m-1}\left[\sum_{i=1}^{m}x_i(v_1)\,x_i(v_2)-m\,x_{av}(v_1)\,x_{av}(v_2)\right] \qquad (1)$$
and correlation coefficient is defined as
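$$p(v_1,v_2) = \frac{c(v_1,v_2)}{\sigma(v_1)\,\sigma(v_2)} = \frac{1}{m-1}\sum_{i=1}^{m}\frac{\left[x_i(v_1)-x_{av}(v_1)\right]}{\sigma(v_1)}\,\frac{\left[x_i(v_2)-x_{av}(v_2)\right]}{\sigma(v_2)} \qquad (2)$$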
where xav(v) and σ(v) are the mean and the standard deviation, respectively, of the spectral intensities of the m measurements at variable v. The sample-sample covariance is calculated from
$$c(s_1,s_2) = \frac{1}{n-1}\sum_{j=1}^{n}\left[x_j(s_1)-x_{av}(s_1)\right]\left[x_j(s_2)-x_{av}(s_2)\right] = \frac{1}{n-1}\left[\sum_{j=1}^{n}x_j(s_1)\,x_j(s_2)-n\,x_{av}(s_1)\,x_{av}(s_2)\right] \qquad (3)$$
and sample-sample correlation coefficient is defined as
$$p(s_1,s_2) = \frac{c(s_1,s_2)}{\sigma(s_1)\,\sigma(s_2)} \qquad (4)$$
where xav(s) and σ(s) are the mean and the standard deviation, respectively, of the spectral intensities of the n variables of sample s. The plot of p(x,y) with respect to two sample axes or two wavelength axes is the correlation coefficient map. The correlation coefficient maps are calculated with programs written in MATLAB (MathWorks, Natick, MA).

RESULTS AND DISCUSSION

Fluorescence Spectra of Tissue. The autofluorescence emission spectra of colonic tissue samples are due to three primary tissue components: collagen, NADH, and FAD. The spectra also contain the effects of strong hemoglobin absorption. Figure 1 illustrates the emission spectra for the tissue samples, showing variation in the contributions of each component by tissue type and significant variation between samples of the same type. The variations between sites in the same tissue type reflect variations in the health and tissue conditions of the patients.

Figure 1. Integrated intensity-normalized fluorescence emission spectra for all tissue samples.

Tissue Sample Wavelength-Wavelength Correlation. The correlation coefficient map is calculated from a matrix containing the emission spectra of all samples, each column being the emission spectrum of a different sample. Each spectrum is first normalized by its total integrated intensity. The individual points within the normalized spectrum then have the mean subtracted from them and are divided by the standard deviation. The mean is subtracted because correlation coefficient mapping is concerned with changes between data points. Division by the standard deviation to calculate the correlation coefficients results in more equal weighting of the individual spectra within the correlation coefficient map so that no single spectrum dominates the data set.54 The resulting columns, though corresponding to the original spectra, no longer resemble them (Figure 2) but emphasize the changes in the spectra from point to point. It is from this data matrix that the correlation coefficient maps are calculated. Figure 3 is the wavelength-wavelength correlation coefficient map of the tissue spectra. The data set consists of fluorescence emission spectra of 57 colonic tissue sites of four types: 29 normal, 2 hyperplastic, 5 adenomatous, and 21 cancerous tissue samples. Hyperplastic tissues are benign growths in which nonviable cells do not detach as they do in normal tissue, resulting in an appearance similar to that of adenomatous tissue, which is precancerous.
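The preprocessing and wavelength-wavelength map described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' MATLAB program: the function name `wavelength_correlation_map` is hypothetical, the integrated intensity is approximated by a plain sum (which assumes evenly spaced wavelengths), and the mean and standard deviation are taken across samples at each wavelength, following eqs 1 and 2. The demonstration data are synthetic.

```python
import numpy as np


def wavelength_correlation_map(spectra):
    """Wavelength-wavelength correlation coefficients, following eqs 1 and 2.

    spectra: array of shape (n_wavelengths, m_samples), one spectrum per column.
    """
    x = np.asarray(spectra, dtype=float)
    # Normalize each spectrum (column) by its total intensity; a plain sum
    # stands in for the integrated intensity (assumes an even wavelength grid).
    x = x / x.sum(axis=0, keepdims=True)
    m = x.shape[1]
    # Mean-center and scale each wavelength (row) across the m samples.
    centered = x - x.mean(axis=1, keepdims=True)
    z = centered / x.std(axis=1, ddof=1, keepdims=True)
    # Eq 2: average product of standardized intensities over the samples.
    return z @ z.T / (m - 1)


# Synthetic stand-in data: 6 wavelengths, 10 samples (not measured spectra).
rng = np.random.default_rng(0)
demo_spectra = rng.random((6, 10)) + 1.0
demo_wl_map = wavelength_correlation_map(demo_spectra)
```

Because eq 2 is the ordinary Pearson correlation coefficient, the result should agree with `np.corrcoef` applied to the intensity-normalized matrix, which offers a convenient cross-check.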
The diagonal line shows the autocorrelation, with correlation coefficients of unity. The off-diagonal areas of high correlation form four blocks aligned on the diagonal, indicating four spectral regions of significance. The four spectral regions show low correlation with each other, indicating that they originate from different tissue components. Though on its own the correlation coefficient map cannot determine what each component is, it is clear that there are only four major contributors that are related to tissue disease state. This is an important result when considered in the context of the complexity of a tissue sample. Tissue is an elaborate mix of many chemicals in a wide range of microenvironments. That only four of these many possible contributors are of significance is one of the reasons a fluorescence-based optical biopsy method is possible, as it reduces an otherwise complex real system to a simple composition. Previously, it has been assumed that there were only four major contributors to tissue fluorescence, and this assumption was used successfully in spectral fitting for cancer diagnosis.26,31 This assumption is validated by correlation coefficient mapping, which directly displays changes between samples without any prior knowledge of tissue or prior model assumption. As such, the results shown in Figure 3 directly indicate a four-component model.

Figure 2. Mean-centered fluorescence spectra of tissue. Each curve is a one-dimensional fluorescence emission spectrum after subtraction of the mean and division by the standard deviation.

Figure 3. Wavelength-wavelength correlation coefficient map of tissue spectra.

In the correlation coefficient map, each spectral region can be assigned to a tissue component based on its absorption and emission spectra. In this map, collagen is dominant in the range of 350-410 nm. The Soret band of hemoglobin absorption is apparent in the wavelength region from 410 to 440 nm, overlapping with collagen and NADH. NADH overlaps both hemoglobin and FAD, with a range of 440-525 nm. FAD emits between 525 and 600 nm but has low fluorescence intensity due to the unfavorable excitation at 325 nm. It should be noted that these regions correspond to, but are not exactly the same as, the individual fluorescence emission or absorption spectra. The correlation coefficient map emphasizes areas of change rather than contribution to the total intensity; thus, the wavelength ranges important for the correlation coefficient differ from those of the emission spectra. The changes that the map considers are, in this case, due to sample variation. Thus, a component of high intensity that shows little change from sample to sample would not appear on the correlation coefficient map. Conversely, a relatively minor component in terms of intensity that shows significant change from sample to sample would be readily apparent in the correlation coefficient map.

By establishing the correspondence between spectral regions and contributors, the correlation of fluorescent components in tissue samples can be revealed. For example, inspecting the correlation of the 410-490-nm spectral region with all other regions shows that hemoglobin displays negative correlation with all other components. Similarly, collagen exhibits negative correlation with NADH and almost zero correlation with FAD. NADH shows weak positive correlation with FAD. That hemoglobin shows negative correlation with all other components is expected, since it is a strong absorber, centered at 414 nm but showing absorption across the wavelength range of interest. This indicates that the hemoglobin contribution tends to change in a direction opposite to that of the other components. For example, samples with a high concentration of hemoglobin will show less collagen and NADH contribution and vice versa. Negative correlation of NADH and collagen is potentially due to absorption of collagen fluorescence by NADH, as seen by the overlap of the emission and excitation spectra for these components. Between samples, this means that samples with high collagen contribution tend to show lower NADH intensity and vice versa.
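One way to put numbers on the block-to-block relationships read off the map is to average the correlation coefficients within each pair of spectral regions. The sketch below is only an illustration: the paper reads these correlations from the map itself, the helper `region_mean_correlation` is hypothetical, and the region boundaries are the approximate ranges quoted in the text (350-410, 410-440, 440-525, and 525-600 nm). The demonstration map is a toy matrix, not measured data.

```python
import numpy as np

# Approximate spectral regions (nm) taken from the assignments in the text.
REGIONS = {
    "collagen": (350, 410),
    "hemoglobin": (410, 440),
    "NADH": (440, 525),
    "FAD": (525, 600),
}


def region_mean_correlation(corr_map, wavelengths, regions=REGIONS):
    """Mean correlation coefficient for every pair of spectral regions.

    corr_map: (n, n) wavelength-wavelength correlation coefficient map.
    wavelengths: length-n wavelength axis in nm.
    Returns a dict keyed by (region_a, region_b).
    """
    corr = np.asarray(corr_map, dtype=float)
    wl = np.asarray(wavelengths, dtype=float)
    # Boolean mask per region; lower bound inclusive, upper bound exclusive.
    masks = {name: (wl >= lo) & (wl < hi) for name, (lo, hi) in regions.items()}
    result = {}
    for name_a, mask_a in masks.items():
        for name_b, mask_b in masks.items():
            block = corr[np.ix_(mask_a, mask_b)]
            result[(name_a, name_b)] = float(block.mean())
    return result


# Toy map: everything perfectly correlated except the collagen/NADH blocks.
demo_wl = np.arange(350, 600, 10)
demo_corr = np.ones((demo_wl.size, demo_wl.size))
collagen = (demo_wl >= 350) & (demo_wl < 410)
nadh = (demo_wl >= 440) & (demo_wl < 525)
demo_corr[np.ix_(collagen, nadh)] = -0.5
demo_corr[np.ix_(nadh, collagen)] = -0.5
demo_summary = region_mean_correlation(demo_corr, demo_wl)
```

A block average discards the fine structure inside each region, so it is best read alongside the full map rather than in place of it.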
This change in the relative contributions of collagen and NADH with disease state has been noted before.31 In general, normal tissues show a higher contribution from collagen than from NADH, while cancerous tissues show increased NADH emission relative to collagen emission. This is thought to be due to a decreased contribution of the collagen-rich submucosa to the total tissue emission, with an increased contribution from the metabolically active, thickened, cancerous mucosa. Though correlation coefficient mapping cannot determine the nature of the signal, it does clearly show that collagen and NADH are being sampled differently and is consistent with the morphological model.

Zero correlation between collagen and FAD can be understood from the large wavelength separation of the absorption and fluorescence of these components. Between samples, this additionally indicates that FAD and collagen are sampled independently within the tissue. That is, the FAD contribution varies independently of the collagen contribution.

The positive correlation of NADH and FAD, along with the overlap of areas of importance, can be attributed to the spectral overlap of their emission as well as the overlap of NADH emission with FAD absorption. In terms of sample classification, the positive correlation indicates that a high NADH contribution tends to appear in samples that also show a higher FAD contribution. As both NADH and FAD are metabolic components, it is reasonable that they would both be present in higher concentrations within rapidly growing cancerous tissues.

These qualitative results, overall, are not surprising but do show the utility of correlation coefficient mapping in visually displaying these correlations. What the maps also provide, which is not easily interpreted from the one-dimensional spectra, are the regions of importance as well as a measure of the extent of the correlations.

Tissue Sample-Sample Correlation. The sample-sample correlation coefficient map is shown in Figure 4. From this figure, it is apparent that there are two primary categories, which are well correlated within themselves and show negative correlation coefficients with each other. For the purposes of this map, samples were grouped by type: normal, hyperplastic, adenomatous, and cancerous, as determined by histopathology. It can be seen that the normal tissue samples (1-29) correlate well with each other, with the exception of two samples (sites 6 and 28) that do not correlate with the other normal samples. Similarly, the cancerous samples (sites 37-57) correlate well with each other, with the exception of four (sites 37-40) that do not show high correlation with the other cancerous samples. The hyperplastic samples (samples 30 and 31) show fair correlation with the normal samples but also some correlation with cancerous samples. This can be interpreted as a sign of their abnormal, but noncancerous, state. Adenomatous samples (samples 32-36) do not show consistent correlation with either state, likely due to high sample-to-sample variation.

Figure 4. Sample-sample correlation coefficient map of tissue spectra.

Interpretation of the correlation coefficient maps to determine disease states can be accomplished in two ways. The first is to simply inspect the map visually and examine the strength of correlation between the sample in question and the normal samples. High correlation to the normal samples, indicated by bright areas, and low correlation to the cancerous samples lead to the conclusion that the sample is normal. Conversely, low correlation to other normal samples and high correlation to cancerous samples would indicate a cancerous sample. Though direct and simple, such a method is not truly rigorous and presents difficulties in determining the degree of correlation.
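The sample-sample map itself follows from eqs 3 and 4 and can be sketched as below. This is an illustrative NumPy version, not the authors' MATLAB code: the function name `sample_correlation_map` is hypothetical, and the input is assumed to be the already preprocessed data matrix with one sample per column. The demonstration uses two synthetic groups with opposite trends to mimic the two negatively correlated categories seen in the map.

```python
import numpy as np


def sample_correlation_map(data):
    """Sample-sample correlation coefficients, following eqs 3 and 4.

    data: array of shape (n_variables, m_samples), one sample per column,
    assumed to be preprocessed already.
    """
    x = np.asarray(data, dtype=float)
    n = x.shape[0]
    # Mean-center and scale each sample (column) over its n variables.
    centered = x - x.mean(axis=0, keepdims=True)
    z = centered / x.std(axis=0, ddof=1, keepdims=True)
    # Eq 4: covariance between samples divided by their standard deviations.
    return z.T @ z / (n - 1)


# Synthetic stand-in data: two groups of three samples with opposite trends.
rng = np.random.default_rng(1)
trend = np.linspace(-1.0, 1.0, 50)
group_a = trend[:, None] + 0.05 * rng.standard_normal((50, 3))
group_b = -trend[:, None] + 0.05 * rng.standard_normal((50, 3))
demo_data = np.hstack([group_a, group_b])
demo_sample_map = sample_correlation_map(demo_data)
```

In this toy case, samples within a group correlate strongly and positively while samples from different groups correlate negatively, which is the block pattern the visual inspection described above relies on.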
Using the correlation values for each sample allows for a more discriminating and quantitative method to be used. This is done by calculating the average correlation coefficient for the sample to all normal samples. In this way, a high average correlation to normal tissue samples is used as an indicator of normal tissue and low average correlation as an indicator of an abnormal tissue. Figure 5A shows
Figure 5. Average correlation of tissue samples to all (A) normal and (B) cancerous samples.
the average correlation to all normal tissues by each tissue sample. Figure 5B is the average correlation to all cancerous tissue by each tissue sample. These figures illustrate the expected trend. Samples 1-29 tend to have positive average correlation to the other normal tissue samples, with the exception of sample 6, and negative correlation to cancerous with the same exception. Abnormal tissue samples (30-57) tend to show negative correlation to normal tissue and corresponding positive correlation to cancerous samples, with the exceptions of samples 32, 37, 38, and 39. Classification between normal and abnormal tissues is made using this plot by categorizing those samples which show 0.22 or greater positive average correlation to the normal tissue samples. Using this cutoff, correlation coefficient analysis gives results that agree quite well with the pathological determination of the tissues with only 4 samples of the 57 misclassified as either normal or abnormal. Samples 6 and 28 fall below the cutoff, resulting in false positives for cancerous tissue. Samples 32 and 38 are false negatives. All other samples are correctly classified, resulting in 26 true positive and 27 true negative results. It should be mentioned that the cutoff threshold of 0.22 is determined for the tissue population in this work. With access to a larger set of spectral data, this number could and should be different for best disease detection. The standard measures of the effectiveness of a diagnostic method are sensitivity, specificity, predictive value positive, predictive value negative, and false positive rate. Sensitivity is the
probability that a disease state is correctly diagnosed and is calculated by dividing the number of correctly diagnosed disease samples (true positives) by the total number of disease samples (true positives and false negatives). Specificity is the probability that a nondisease state is correctly diagnosed and is calculated as the ratio of correctly diagnosed normal samples (true negatives) to all normal samples (false positives and true negatives). Predictive value positive is a measure of the probability that a positive result is truly diseased and is found by dividing the number of correctly diagnosed diseased samples by all positive diagnoses (true and false positives). Predictive value negative is the converse: the probability that a negative diagnosis is truly negative, calculated as the ratio of correctly diagnosed normal samples (true negatives) to all normal diagnoses (true negatives and false negatives). The false alarm rate is the measure of the probability that a diagnosis of disease is incorrect and is calculated as the ratio of incorrectly diagnosed normal samples (false positives) to all positive diagnoses (false positives and true positives). The above determinations of tissue state using correlation coefficient mapping result in a sensitivity of 93%, specificity of 93%, predictive value positive of 93%, predictive value negative of 93%, and false alarm rate of 7% for differentiation between normal tissue and abnormal tissue states. This was found to be similar to, or better than, the existing methods of discrimination for the same samples based on one-dimensional fluorescence spectra.3 The ability of correlation coefficient mapping to determine disease state is not surprising since it uses the same information as one-dimensional spectral discrimination and is similarly limited by sample-to-sample variations. Visual representation and use of the correlation values, however, make interpretation of the results very simple.
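The classification rule and the diagnostic measures defined above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and two processing details are assumptions not stated in this section, namely that spectra are centered on the mean spectrum of the whole set before correlating and that a sample's correlation with itself is left out of its average. Only the counts passed to `diagnostic_metrics` and the 0.22 cutoff come from the text.

```python
import numpy as np


def mean_correlation_to_reference(spectra, reference_idx):
    """Average Pearson correlation of every sample to a reference group.

    spectra: array of shape (n_samples, n_wavelengths).
    reference_idx: indices of the reference samples (e.g. normal tissues).
    Assumptions: spectra are centered on the mean spectrum of the whole set,
    and each sample's correlation with itself is excluded from its average.
    """
    spectra = np.asarray(spectra, dtype=float)
    centered = spectra - spectra.mean(axis=0)   # remove the average spectrum
    corr = np.corrcoef(centered)                # sample-sample correlation map
    reference_idx = np.asarray(list(reference_idx))
    means = np.empty(len(spectra))
    for i in range(len(spectra)):
        others = reference_idx[reference_idx != i]  # skip self-correlation
        means[i] = corr[i, others].mean()
    return means


def classify_normal(mean_corr_to_normal, cutoff=0.22):
    """True where a sample is called normal (average correlation >= cutoff)."""
    return np.asarray(mean_corr_to_normal) >= cutoff


def diagnostic_metrics(tp, tn, fp, fn):
    """Measures as defined in the text; 'positive' means abnormal tissue."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "predictive_value_positive": tp / (tp + fp),
        "predictive_value_negative": tn / (tn + fn),
        "false_alarm_rate": fp / (fp + tp),
    }


# Counts reported in the text: 26 true positives, 27 true negatives,
# 2 false positives (samples 6 and 28), 2 false negatives (samples 32 and 38).
metrics = diagnostic_metrics(tp=26, tn=27, fp=2, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With the reported counts, the four success measures each come out at roughly 93% and the false alarm rate at roughly 7%, in line with the values quoted above.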
The simplicity of comparison to cancerous or normal sets may prove to be a useful tool for initial diagnosis or a guide for determining if further analysis is required. This is an important consideration due to the disadvantages of classification from biopsy. Currently, tissue samples are removed in biopsy from patient sites that are considered suspicious on visual examination. The sample is then fixed, sliced, dyed, and examined by a trained pathologist. This method is time-consuming and expensive and requires the removal of tissue from the patient. In comparison, correlation coefficient mapping could potentially reduce the number of biopsies required by providing a quick and simple method for determining if the tissue in question requires further examination. In the case of routine checks, it could eliminate unnecessary biopsies by quickly determining that there are no sites requiring further examination. Collection of the fluorescence emission can be easily accomplished using fiber optics added to an endoscope to direct the laser excitation to the tissue and to collect the resulting fluorescence.
With the data used, it was not possible to reliably distinguish hyperplastic, adenomatous, and cancerous tissues from each other. This is due to the small sample sizes for hyperplastic and adenomatous tissues, only two and five, respectively. Additional samples for these tissue states would likely result in increased ability to differentiate the disease states.
Decay of Tissue Autofluorescence after Resection. It has been noted that in vitro measurements of tissue samples show
Figure 6. One-dimensional emission spectra of a single tissue sample after resection over time. Spectra taken at 5-min intervals.
Figure 7. Wavelength-wavelength correlation coefficient map for decay spectra.
Figure 8. Sample-sample correlation coefficient map for decay spectra.
significant spectral changes over time.60 Spectra of a tissue sample taken at 5-min intervals after tissue resection are shown in Figure 6. The spectra show a distinct trend over time. The area corresponding to collagen fluorescence, centered at 390 nm, increases slightly over time while the area corresponding to NADH, centered at 470 nm, shows a significant decrease over time. Correlation coefficient maps, sample-to-sample and wavelength-to-wavelength, provide a wealth of information about the decay process. Figure 7 is the wavelength-wavelength correlation coefficient map for all the decay spectra. The correlation coefficient map shows three clearly defined areas and strong autocorrelation. Using information from the fluorescence emission spectra, these areas can be assigned to the three intrinsic fluorophores in the tissue sample. The first region, 350-440 nm, can be assigned to collagen. The second region, 440-525 nm, is assigned to NADH, with the third region due to FAD. It can be seen from the emission spectrum series that, in time, NADH is decreasing with increases in the collagen and FAD contributions to the spectrum. What is not apparent from the one-dimensional emission spectra is the clear delineation of the three components and their areas of significance.
(60) Palmer, G. M.; Marshek, C. L.; Vrotsos, K. M.; Ramanujam, N. Lasers Surg. Med. 2002, 30, 191-200.
It is of note that the wavelength ranges of importance for each component in this map differ slightly from those observed in Figure 3. This occurs for two primary reasons. The first is that hemoglobin shows no effect on the correlation coefficient map for the decay of resected tissue. From this, it can be concluded that hemoglobin undergoes no changes affecting its fluorescence absorption properties within the time frame of the analysis, information not apparent in the one-dimensional spectra. It was previously thought that loss of hemoglobin, during sample preparation and over time, contributed to the spectral changes. That it is not present in the correlation coefficient maps contradicts the previous hypothesis. The second reason for the change in wavelength regions attributed to each component is based on the fact that correlation coefficient mapping reveals areas of simultaneous change. Figure 3 reflects changes between different samples while Figure 7 relates changes over time in a single sample. The areas where these changes are most prominent, though due to the same components, will vary somewhat because what makes the individual spectra differ from each other is not the same in the two data sets. The correlation coefficient map in Figure 7 clearly shows that collagen has negative correlation to NADH and positive correlation to FAD. NADH displays negative correlation to both collagen and FAD. Interpretation of the correlation coefficient values must be made carefully here as well. That collagen and FAD are positively correlated does not mean that there is direct influence between the two. The correct interpretation would be that NADH shows negative correlation to both collagen and FAD. This indicates that the decrease in NADH, likely due to NADH oxidation over time, results in the observed increase in collagen and FAD fluorescence intensity.
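The pattern described above, with one component falling while two others rise, can be reproduced on synthetic data. The following Python sketch is illustrative only: the Gaussian band shapes, widths, amplitudes, noise level, and the FAD band center near 550 nm are assumptions chosen for the demonstration, not values from this work. Only the 5-min sampling, the 390 nm and 470 nm band centers, and the direction of each trend follow the text.

```python
import numpy as np

rng = np.random.default_rng(0)
wavelengths = np.arange(350, 601, 5)   # nm, illustrative grid
times = np.arange(0, 165, 5)           # min, 33 spectra at 5-min steps


def band(center, width=25.0):
    """Gaussian emission band; shape and width are illustrative only."""
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)


def at(nm):
    """Index of the grid point closest to the requested wavelength."""
    return int(np.argmin(np.abs(wavelengths - nm)))


# Hypothetical amplitudes: NADH falls over time, collagen and FAD rise.
frac = times / times.max()
collagen = np.outer(1.0 + 0.2 * frac, band(390))
nadh = np.outer(1.0 - 0.6 * frac, band(470))
fad = np.outer(0.3 + 0.2 * frac, band(550))   # assumed FAD band center
spectra = collagen + nadh + fad
spectra += 0.002 * rng.standard_normal(spectra.shape)   # small noise

# Wavelength-wavelength map: correlate columns (wavelengths) across time.
ww_map = np.corrcoef(spectra, rowvar=False)

print("collagen vs NADH:", ww_map[at(390), at(470)])
print("collagen vs FAD: ", ww_map[at(390), at(550)])
print("NADH vs FAD:     ", ww_map[at(470), at(550)])
```

In this toy model the collagen and FAD regions correlate positively only because both move opposite to NADH, which is the same caution about interpretation raised in the text.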
As mentioned, NADH absorbs collagen fluorescence; thus, a decrease in NADH concentration would result in greater collagen intensity. In the case of FAD, a decrease in the overlap of NADH would cause an apparent increase in FAD intensity. The sample-to-sample correlation coefficient map for the decay spectra provides information about the decay process (Figure 8). The map shows well-defined groups of samples 1-14 (0-70 min) and samples 17-33 (85-165 min). The samples correlate well
with each other within each group and almost perfectly anticorrelate with each other between groups. Samples 15 and 16 show weak correlation with all other samples, possibly because of the close resemblance of their spectra to the average of all the spectra. This correlation coefficient map indicates a uniform change in the samples over time, which is caused by the oxidation of NADH.
CONCLUSIONS
This work demonstrates for the first time that correlation coefficient mapping is an effective method for sample classification. Further, correlation coefficient mapping can allow clear visualization of inherent correlations not readily apparent in the 1D fluorescence spectra. In particular, it shows that four tissue components are responsible for the fluorescence signals, without the prior assumptions needed in the analysis of 1D spectra. We have applied correlation coefficient mapping to tissue autofluorescence spectra for tissue diagnosis as well as for investigation of the process of tissue decay after resection. By providing an easily interpreted visual map of spectral changes and differences, correlation coefficient mapping shows its utility as an analysis method in fluorescence spectroscopy. When applied to tissue classification for the purposes of cancer diagnosis, correlation coefficient mapping shows itself to be a simple method for sample classification based on the inherent correlations between samples. As such, the method shows promise as a way to screen tissue for further analysis: it provides a quick and simple test to determine whether more detailed examination is required, and it could reduce the number of tissue samples that would have to be removed from the patient for examination in biopsy procedures.
ACKNOWLEDGMENT
Support from the National Institutes of Health, National Cancer Institute (CA100741), is gratefully acknowledged. G.W. thanks the Center for Biocatalysis and Bioprocessing of the University of Iowa for a predoctoral fellowship.
Received for review June 24, 2004. Accepted November 2, 2004. AC049074+