Anal. Chem. 2009, 81, 7773–7777
Blood Species Identification for Forensic Purposes Using Raman Spectroscopy Combined with Advanced Statistical Analysis

Kelly Virkler and Igor K. Lednev*

Department of Chemistry, University at Albany, State University of New York, 1400 Washington Avenue, Albany, New York 12222

Forensic analysis has become one of the fastest-growing areas of analytical chemistry in recent years. The ability to determine the species of origin of a body fluid sample is a crucial part of a forensic investigation. We introduce here a new technique that combines Raman spectroscopy with advanced statistics to analyze the composition of blood traces from different species. Near-infrared (NIR) Raman spectroscopy was used to analyze multiple dry samples of human, canine, and feline blood for the ultimate application to forensic species identification. All of the spectra were combined into a single data matrix, and the number of principal components that described the system was determined using multiple statistical methods such as significant factor analysis (SFA), principal component analysis (PCA), and several cross-validation methods. Of the six principal components that were determined to be present, the first three, which together contributed over 90% of the spectral data of the system, were used to form a three-dimensional scores plot that clearly showed significant separation between the three groups of species. Ellipsoids representing a 99% confidence interval surrounding each species group showed no overlap. This Raman spectroscopic technique is nondestructive and quick and can potentially be performed at the scene of a crime.

In recent years, forensic analysis has become one of the most rapidly expanding areas of analytical chemistry.1 The identification of body fluids is a very important aspect of forensic investigations.
Most cases involving biological evidence begin with locating and collecting human body fluids such as blood, semen, and saliva, and the usual procedure involves performing a destructive presumptive test2-4 to determine the identity of the unknown fluid that was collected. Sometimes, however, the species of origin of a questioned sample is just as important as the identity of the fluid itself, whether it be for a criminal case5,6 or wildlife investigation.7,8 This scenario occurs most often with questioned blood samples, so additional testing is needed along with the presumptive screening tests that determine that the sample is in fact blood. Sometimes a test result needs only to show that the blood is not of human origin to be useful, and other times the specific species needs to be determined.

There are several tests currently in use that can identify a blood sample as either human or nonhuman, and there are also tests that can identify the species of origin if it is determined not to be human. These specific tests are not often performed, however, since most cases only require a positive or negative human result.9,10 The basic premise of these tests is the reaction of antigens from an unknown sample with antibodies that are provided by the testing procedure.11 One of the most popular types of human blood identification involves an immunochromatographic assay that can be performed both at a crime scene and in the laboratory. Two commercially available kits that use an antibody-antigen-antibody sandwich are Hexagon OBTI and ABAcard Hematrace. Both kits contain antihuman hemoglobin (Hb) antibodies, which recognize and bind to human Hb. These tests will show a positive result for human blood and a negative result for any other animal blood with the exception of some higher primates. There were also some false positive results with other body fluids containing low amounts of Hb. Both test kits are also destructive to the sample, and this can be a large disadvantage if very little sample is available to work with. Other similar kits, such as RSID-Blood, contain antibodies that recognize human glycophorin A (GPA), and this test shows no cross-reactivity with other higher primates or body fluids.9

* To whom correspondence should be addressed. Phone: (518) 591-8863. Fax: (518) 442-3462. E-mail: [email protected].
10.1021/ac901350a CCC: $40.75. 2009 American Chemical Society. Published on Web 08/11/2009.

(1) Brettell, T. A.; Butler, J. M.; Almirall, J. R. Anal. Chem. 2009, 81, 4695–4711.
(2) Johnston, E.; Ames, C. E.; Dagnall, K. E.; Foster, J.; Daniel, B. E. J. Forensic Sci. 2008, 53, 687–689.
(3) Hochmeister, M. N.; Budowle, B.; Rudin, O.; Gehrig, C.; Borer, U.; Thali, M.; Dirnhofer, R. J. Forensic Sci. 1999, 44, 1057–1060.
(4) Myers, J. R.; Adkins, W. K. J. Forensic Sci. 2008, 53, 862–867.
(5) Duarte, J.; Pacheco, M. T.; Machado, R. Z.; Silveira, L., Jr.; Zangaro, R. A.; Villaverd, A. B. Cell. Mol. Biol. (Noisy-le-grand) 2002, 48, 585–589.
(6) Terada, N.; Ohno, N.; Saitoh, S.; Ohno, S. J. Struct. Biol. 2008, 163, 147–154.
(7) Adrian, W. J. Wildlife Forensic Field Manual; Association of Midwest Fish and Game Law Enforcement Officers, 1994.
(8) Stroud, R. K.; Adrian, W. J. In Noninfectious Diseases of Wildlife; Fairbrother, A., Locke, L. N., Hoff, G. L., Eds.; Iowa State University Press: Ames, IA, 1996; pp 3-18.
(9) Li, R. Forensic Biology; CRC Press: Boca Raton, FL, 2008.
(10) Watson, N. In Crime Scene to Court: The Essentials of Forensic Science; Royal Society of Chemistry: Cambridge, U.K., 2004; pp 377-413.
(11) Spalding, R. P. In Forensic Science: An Introduction to Scientific and Investigative Techniques; James, S. H., Nordby, J. J., Eds.; CRC Press: Boca Raton, FL, 2003; pp 181-201.
Analytical Chemistry, Vol. 81, No. 18, September 15, 2009
There are also a few tests that can specifically identify the species of a blood sample. Two types of double diffusion assays are the ring assay and the Ouchterlony assay.9,11 The ring assay can test only one antigen at a time, and it involves the formation of a precipitin line between antiserum and antigen solutions. The antiserum solution that is used depends on the species to be identified. The Ouchterlony double diffusion technique can compare multiple antigens to one another in addition to identifying their species. The nature of the precipitin lines that form on an agarose gel reveals whether two antigens being compared are identical. A similar gel method called crossed-over immunoelectrophoresis uses an electric current to migrate certain antigens toward a questioned antibody in order to produce a precipitin line when there is a match.9,11 Again, these double diffusion tests are all destructive to the sample and can be performed only in the laboratory. The Ouchterlony method even requires incubation overnight. It would be advantageous to develop a simple, quick, and nondestructive analytical technique that can be used at the scene of a crime to identify the specific species of a blood sample.
We have recently reported that Raman spectroscopy has great potential to distinguish between different body fluids12 as well as to provide nondestructive, confirmatory identification of body fluids at the scene of a crime.13 Our approach is based on the ability of Raman spectroscopy to accurately identify organic, inorganic, and biological species, an advantage that is lacking in many other analytical techniques such as ultraviolet absorbance and fluorescence spectroscopies.14 As stated by Mann and Vickers, Raman spectroscopy “is unusually, if not uniquely, suited to be the process control star of the next century.”15 Raman spectroscopy has already been used for several different forensic applications.16-20 There have also been nonforensic studies involving Raman spectroscopy of feline plasma,5 oxygen saturation in mouse blood,6 and prionic proteins in the blood of cattle, goats, humans, and birds.21

It is possible for both canine and feline blood to be found at a crime scene in addition to human blood,22,23 so it is important for crime scene investigators to be able to determine the origin of a bloodstain. De Wael et al.24 have recently utilized Raman spectroscopy to analyze human, canine, and feline blood by focusing on hemoglobin bands present in the spectra. They found that human, canine, and feline blood all showed very similar Raman spectra that were difficult to visually distinguish. We report here an alternative approach for distinguishing blood of different species based on Raman spectroscopy. Our approach is based on the analysis of the total Raman spectrum and evaluation of the composition of a blood sample, which varies with species. For example, the concentration of hemoglobin in whole blood for human males has been reported to be 16.3 g/100 mL, while canine and feline blood contain only 14.8 and 11.2 g/100 mL, respectively.25 We utilized advanced statistical analysis of NIR Raman spectra of blood acquired from multiple donors and found that the scores values of each species were not only visually distinguishable on a three-dimensional plot based on the first three principal components, but ellipsoids enclosing a 99% confidence interval around each species also did not overlap, indicating that the three species were in fact distinguishable using Raman spectroscopy. This technique is nondestructive and rapid and can potentially be used with a portable device at the scene of a crime.

(12) Virkler, K.; Lednev, I. K. Forensic Sci. Int. 2008, 181, e1–e5.
(13) Virkler, K.; Lednev, I. K. Forensic Sci. Int. 2009, 188, 1–17.
(14) Williams, T. L.; Collette, T. W. In Handbook of Raman Spectroscopy: From the Research Laboratory to the Process Line; Lewis, I. R., Edwards, H. G. M., Eds.; Marcel Dekker, Inc.: New York, 2001; pp 683-731.
(15) Mann, C. K.; Vickers, T. J. In Handbook of Raman Spectroscopy: From the Research Laboratory to the Process Line; Lewis, I. R., Edwards, H. G. M., Eds.; Marcel Dekker, Inc.: New York, 2001; pp 251-274.
(16) Coyle, T.; Anwar, N. Sci. Justice 2008, 49 (1), 32–40.
(17) Hodges, C. M.; Akhavan, J. J. Mol. Spectrosc. 1990, 46, 303–307.
(18) Mazzella, W. D.; Buzzini, P. Forensic Sci. Int. 2005, 152, 241–247.
(19) Rodger, C.; Broughton, D. Analyst 1998, 123, 1823–1826.
(20) Thomas, J.; Buzzini, P.; Massonnet, G.; Reedy, B.; Roux, C. Forensic Sci. Int. 2005, 152, 189–197.
(21) Hernandez, P. C.; Llop, J. M.; Moscardo, E. M.; Garces, M. M.; Diez, J. J. B. WO/2004/010122, A1, 2004.
(22) Agronis, A. In Dateline UC Davis; UC Davis: Davis, CA, 2001.
(23) Wilson, P.; Norris, G. In International Young Adult Mental Health Conference; Faculty of Humanities and Social Sciences, Bond University: Gold Coast, Australia, 2003.
(24) De Wael, K.; Lepot, L.; Gason, F.; Gilbert, B. Forensic Sci. Int. 2008, 180, 37–42.

MATERIALS AND METHODS

Samples. Eight samples each of human, canine, and feline blood were obtained, the human samples from anonymous donors and the canine and feline samples from a local veterinary clinic. The samples were prepared by placing a 10 µL drop on a circular glass slide designed for use with an automatic mapping stage. Each sample was allowed to dry completely and was measured using automatic mapping that scanned a sample area of 75 µm × 75 µm and measured Raman spectra from 16 random points within the area with ten accumulations of 10 s each at each point. The spectra were collected within a wavenumber range of 250-1800 cm⁻¹.

Raman Microscope. A Renishaw inVia confocal Raman spectrometer equipped with a research-grade Leica microscope, a 20× long-range objective (numerical aperture of 0.35), and WiRE 2.0 software was used.
For the automatic mapping, the lower plate of a Nanonics AFM MultiView 1000 system was set up under the microscope, and measurements were taken using Quartz II and QuartzSpec software. A 785 nm laser was used for excitation. The laser power on the sample was about 11.5 mW.

(25) Altman, P. L. Blood and Other Body Fluids; Federation of American Societies for Experimental Biology: Washington, DC, 1961.
(26) Malinowski, E. R. Factor Analysis in Chemistry, 3rd ed.; John Wiley & Sons: New York, 2002.

Data Treatment. Spectra of all dried blood samples measured using automatic mapping were first treated using GRAMS/AI 7.01 software to remove cosmic ray interference and subtract background fluorescence. The spectra were then imported into MATLAB 7.4.0 for statistical analysis and normalized to adjust for the varying amount of background interference in each spectrum. The average spectrum was calculated for each of the 24 samples, and these data were combined into a single matrix. Significant factor analysis (SFA)26 was performed to determine the number of principal components that described the entire system of all three species, and then principal component analysis (PCA)26,27 was performed based on the number of components found. Cross-validation methods known as contiguous block, leave-one-out, Venetian blind, and random subset were carried out to verify the number of principal components in the system.27,28 Scores values were determined for each species based on each of the principal components, and the scores for the first three
Figure 1. Raman spectra of dry samples of human blood (A), feline blood (B), and canine blood (C).

Table 1. Results of Significant Factor Analysis (SFA) for the Combined Set of Blood Samples(a)

n    EV               RE               IND              REV              AUTO             %SL
1    7.3703 × 10^3    3.1645 × 10^-2   5.9821 × 10^-5   1.8887 × 10^-1   9.9969 × 10^-1   0
2    2.2774 × 10^1    1.9810 × 10^-2   4.0931 × 10^-5   6.0934 × 10^-4   9.9552 × 10^-1   4.4872 × 10^-2
3    8.7440 × 10^0    1.2595 × 10^-2   2.8561 × 10^-5   2.4474 × 10^-4   9.9394 × 10^-1   7.7765 × 10^-2
4    2.4182 × 10^0    9.6944 × 10^-3   2.4236 × 10^-5   7.0950 × 10^-5   9.8515 × 10^-1   1.6506 × 10^0
5    1.5472 × 10^0    7.2532 × 10^-3   2.0092 × 10^-5   4.7696 × 10^-5   9.7425 × 10^-1   1.3374 × 10^0
6    6.3971 × 10^-1   5.9585 × 10^-3   1.8391 × 10^-5   2.0770 × 10^-5   9.5146 × 10^-1   5.2288 × 10^0
7    2.5730 × 10^-1   5.3502 × 10^-3   1.8513 × 10^-5   8.8236 × 10^-6   8.9118 × 10^-1   1.7182 × 10^1
8    2.3671 × 10^-1   4.7217 × 10^-3   1.8444 × 10^-5   8.6005 × 10^-6   8.8481 × 10^-1   1.5200 × 10^1
9    1.5615 × 10^-1   4.2567 × 10^-3   1.8919 × 10^-5   6.0317 × 10^-6   7.8690 × 10^-1   2.0913 × 10^1
10   1.2614 × 10^-1   3.8401 × 10^-3   1.9592 × 10^-5   5.2004 × 10^-6   7.6960 × 10^-1   2.2714 × 10^1
(a) n represents the number of principal components; EV, eigenvalue; RE, real error; IND, Malinowski factor indicator function; REV, reduced eigenvalue; AUTO, autocorrelation coefficient; and %SL, significance level for the F-test.
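As a rough consistency check on Table 1, the IND column can be recomputed from the RE column with Malinowski's definition IND(n) = RE(n)/(c − n)². The sketch below assumes that c is the number of averaged spectra (24); the paper does not state which matrix dimension was used, and the function names are illustrative only.

```python
# RE (real error) column of Table 1 for n = 1..10 principal components.
RE = [3.1645e-2, 1.9810e-2, 1.2595e-2, 9.6944e-3, 7.2532e-3,
      5.9585e-3, 5.3502e-3, 4.7217e-3, 4.2567e-3, 3.8401e-3]
# IND column of Table 1, used here only to cross-check the recomputed values.
IND_TABLE = [5.9821e-5, 4.0931e-5, 2.8561e-5, 2.4236e-5, 2.0092e-5,
             1.8391e-5, 1.8513e-5, 1.8444e-5, 1.8919e-5, 1.9592e-5]

# Assumption: c in IND = RE / (c - n)^2 is the number of averaged spectra.
N_SAMPLES = 24


def indicator_function(real_errors, n_columns):
    """Malinowski indicator function IND(n) = RE(n) / (c - n)^2, n = 1, 2, ..."""
    return [re_n / (n_columns - n) ** 2
            for n, re_n in enumerate(real_errors, start=1)]


def factors_at_minimum(ind_values):
    """Return the 1-based n at which the indicator function is smallest."""
    return min(range(len(ind_values)), key=ind_values.__getitem__) + 1


ind = indicator_function(RE, N_SAMPLES)
n_factors = factors_at_minimum(ind)
# Largest relative difference between recomputed and tabulated IND values.
max_rel_dev = max(abs(a - b) / b for a, b in zip(ind, IND_TABLE))
print(n_factors, max_rel_dev)
```

Under this assumption the recomputed values agree with the tabulated IND column to within rounding, and the minimum falls at the same n that the text identifies from Table 1.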
components were used to form a three-dimensional plot of each species clustered in space.29

RESULTS AND DISCUSSION

Main Approach. The spectra obtained from eight samples each of human, canine, and feline blood were treated with advanced statistical methods in order to show that Raman spectroscopy can be used to distinguish between blood of different species. Although the spectra of the blood samples all looked similar both in our study and in a past experiment,24 SFA and PCA revealed clear and promising differences. The use of Raman spectroscopy to distinguish between human, canine, and feline samples of blood can potentially be very useful to forensic investigations due to its nondestructive nature13 and portable capabilities.30,31

(27) Wise, B. M.; Gallagher, N. B.; Bro, R.; Shaver, J. M.; Windig, W.; Koch, R. S. PLS Toolbox; Eigenvector Research, Inc.: Wenatchee, WA, 2005.
(28) Xu, M.; Shashilov, V. A.; Ermolenkov, V. V.; Fredriksen, L.; Zagorevski, D.; Lednev, I. K. Protein Sci. 2007, 16, 815–832.
(29) Bhartia, R.; Hug, W. F.; Salas, E. C.; Reid, R. D.; Sijapati, K. K.; Tsapin, A.; Abbey, W.; Nealson, K. H.; Lane, A. L.; Conrad, P. G. Appl. Spectrosc. 2008, 62, 1070–1077.
(30) Jacoby, M. Chem. Eng. News 2008, 86, 59–60.
(31) Eckenrode, A.; Bartick, E. G.; Harvey, S.; Vucelick, M. E.; Wright, B. W.; Huff, R. A. Forensic Sci. Commun. 2001, 3.

Sample Analysis. The average spectrum from one sample each of human, canine, and feline blood is shown in Figure 1. Visually the spectra are all very similar. They contain the same
major peaks, and the only noticeable differences are slight peak intensity fluctuations, which are most apparent at 342, 372, 1371, and 1620 cm⁻¹. The fact that the spectra were not easily distinguishable is consistent with the results shown by De Wael et al.24 There are of course some slight differences in the composition of human, canine, and feline blood, but these differences are due to varying concentrations of components and not to a change in components. For example, the concentration of hemoglobin in whole blood for human males is slightly higher than that of canine blood, and feline blood contains the least.25 Human blood on average also contains more glucose, ascorbic acid, niacin, and insulin than the blood of the other two species.25 These and other slight hormonal or enzyme variations could cause the fluctuations in Raman peak intensity from one species to another, but there are no major peak differences since all of the blood samples contain basically the same chemical components. The statistical analysis carried out on the averages of all the samples will reveal any differences among the three species that cannot be seen visually.

Figure 2. Plot of the root-mean-square error of cross-validation (RMSECV) versus the number of principal components. The break at n = 6 indicates that there are six components.

SFA Analysis. Table 1 shows the results of the SFA analysis on the data matrix, which contains the spectral data of all 24 samples. Certain criteria indicate that there are in fact six major components. The column of eigenvalues (EV) in Table 1 shows that after the first six components there is not a significant change from one component to the next. The fact that the EVs for n = 7,
8, 9, ... vary only slightly suggests that the principal components after n = 6 are error factors.27,28 In addition, the Malinowski factor indicator function (IND) typically is at its lowest value where n equals the number of significant factors composing the data set,28 which in this case is at n = 6 (Table 1). Further evidence of the presence of six components is based on the reduced eigenvalues (REV). They should be statistically equal26,28 when corresponding to error factors, and after n = 6 they exhibit this characteristic. The significance level (%SL) of the F-test indicates how likely it is that a particular EV is actually an error factor and not a principal component.26,28 Table 1 shows that this value increases significantly after n = 6. Finally, autocorrelation coefficients (AUTOs) are normally close to 1 for principal components and less than 0.5 for error factors.26,28 The AUTO values in Table 1 are close to 1 for the first six components. The values after that do not fall below 0.5, but they are noticeably lower than the first six. Taken together, the criteria in Table 1 suggest that there are in fact six principal components. On the basis of this number, scores values for each sample on each component were calculated and used to determine whether there was any statistical overlap among the three species.

PCA Analysis. PCA was performed on the data matrix assuming that there were six principal components. The mean-centering method27,28 was used to preprocess the data. In an effort to confirm that there were in fact six principal components, cross-validation methods known as contiguous block, leave-one-out, Venetian blind, and random subset were carried out.27,28 The resulting plot of the root-mean-square error of cross-validation (RMSECV) versus the number of principal components for the contiguous block method is shown in Figure 2.
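The idea behind an RMSECV curve can be illustrated with a simplified sketch. This is not the PLS Toolbox routine cited above: it uses synthetic data with a known number of factors, holds out contiguous blocks of spectra, and measures a plain projection-and-reconstruction error. The function name and all parameter choices are hypothetical.

```python
import numpy as np


def rmsecv_contiguous_blocks(X, max_components, n_blocks):
    """RMSECV versus number of principal components (simplified sketch).

    Rows of X are spectra. Each contiguous block of rows is held out in turn,
    PCA loadings are fitted to the mean-centered remainder, and the held-out
    rows are projected onto the first k loadings and reconstructed.
    """
    blocks = np.array_split(np.arange(X.shape[0]), n_blocks)
    press = np.zeros(max_components)  # summed squared residuals for each k
    for test_idx in blocks:
        train = np.delete(X, test_idx, axis=0)
        mean = train.mean(axis=0)
        # Rows of vt are the principal component loadings of the training set.
        _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
        test = X[test_idx] - mean
        for k in range(1, max_components + 1):
            loadings_k = vt[:k]
            residual = test - test @ loadings_k.T @ loadings_k
            press[k - 1] += np.sum(residual ** 2)
    # Every element of X is held out exactly once, so divide by X.size.
    return np.sqrt(press / X.size)


# Synthetic stand-in for a 24-sample data matrix: 3 true factors plus noise.
rng = np.random.default_rng(0)
true_scores = rng.normal(size=(24, 3)) * np.array([10.0, 5.0, 2.0])
true_loadings = rng.normal(size=(3, 200))
X = true_scores @ true_loadings + 0.01 * rng.normal(size=(24, 200))

rmsecv = rmsecv_contiguous_blocks(X, max_components=6, n_blocks=8)
for k, value in enumerate(rmsecv, start=1):
    print(k, float(value))
```

In this toy case the error should drop sharply up to the third component and then level off, which is the kind of break in the curve that the text looks for when choosing the number of components.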
The RMSECV plots for the other cross-validation methods yielded similar results (data not shown). In this type of plot, a local minimum or distinct change in slope will occur at the number of principal components that describes the system.26,27 There is a break in the curve at n = 6, and this reinforces the result that there are six principal components. Scores values based on each of the six components were found for each sample based on the PCA analysis. The scores for the
Figure 3. A three-dimensional principal component plot for human (blue), feline (green), and canine (red) blood samples based on the first three principal spectral components (3A). Each ellipsoid encloses a 99% confidence interval around each species cluster, and the two-dimensional view (3B) clearly shows no overlap.
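A minimal sketch of how a scores plot with confidence ellipsoids of this kind might be computed is given below, using synthetic data. The chi-square threshold for the 99% ellipsoid is an assumption, since the paper does not state how its ellipsoids were constructed, and all function names are hypothetical.

```python
import numpy as np

# 0.99 quantile of the chi-square distribution with 3 degrees of freedom.
# Assumption: each cluster is treated as a 3-D Gaussian; the paper does not
# say how its 99% ellipsoids were computed.
CHI2_99_3DOF = 11.345


def pca_scores(X, n_components=3):
    """Scores of the rows of X on the first principal components (mean-centered)."""
    centered = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def inside_ellipsoid(point, cluster, threshold=CHI2_99_3DOF):
    """True if the squared Mahalanobis distance to the cluster is within threshold."""
    center = cluster.mean(axis=0)
    cov = np.cov(cluster, rowvar=False)
    diff = point - center
    d2 = float(diff @ np.linalg.solve(cov, diff))
    return d2 <= threshold


# Synthetic stand-in: 3 "species" x 8 samples, 50 spectral channels.
rng = np.random.default_rng(1)
species_means = rng.normal(scale=5.0, size=(3, 50))
X = np.vstack([m + rng.normal(scale=0.5, size=(8, 50)) for m in species_means])
labels = np.repeat(np.arange(3), 8)

scores = pca_scores(X)
clusters = [scores[labels == s] for s in range(3)]
# Row i, column j: does the center of cluster i lie inside the ellipsoid of cluster j?
membership = np.array([[inside_ellipsoid(c.mean(axis=0), other) for other in clusters]
                       for c in clusters])
print(membership)
```

An unknown spectrum could be projected onto the same components and tested against each ellipsoid in the same way. With only eight samples per group, a small-sample limit such as one based on Hotelling's T² would give wider ellipsoids than the chi-square value used here.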
first three components were used to create a three-dimensional plot of the species in space in order to determine whether samples of the same species would cluster and separate from the other species. There was in fact clustering and separation, and the extent of this separation is shown in Figure 3. With the sample set of n = 8 for each species, an ellipsoid was drawn around each species cluster that describes the 99% confidence interval. Figure 3B shows a two-dimensional view based on components 1 and 2, which clearly shows that none of the ellipsoids overlap. Figure 3A shows the same plot from a different angle to demonstrate the third dimension.

Although there are in fact six principal components that describe the system, the first three were used for this comparison for two main reasons. First, components 1-3 contribute over 90% of the total spectral data of the system and contain more important information than components 4-6, which do not vary as much between species. Components 4-6 contribute only 3.0%, 2.5%, and 0.9%, respectively, and are not necessary for proving that there is separation even though they were determined to be principal components through SFA. Second, demonstrating that there is no overlap in two- and three-dimensional space is sufficient to ensure that there is no overlap in six-dimensional space, because clusters that are separated in a projection cannot overlap in the full space. In addition, two and three dimensions are easy to illustrate with clear results. For the purposes of this study, the first three components are sufficient.

The fact that a 99% confidence interval for each species can be clearly separated is a satisfactory result for the purposes of a forensic investigation. There is a very small chance of a false positive identification using this method, and this is desirable since the ultimate goal is to identify the species of a blood sample. This technique could be applied by introducing an unknown blood sample into the data matrix to see where it would fall on the PCA plot. Additional testing would be necessary if the unknown sample was plotted in between the ellipsoids, but a confident conclusion can be made if it falls within one of the species clusters. It is important to note that these data are preliminary since only eight samples for each species do not represent the entire population adequately; however, the results clearly demonstrate the power of combining advanced spectroscopy and rigorous statistical analysis.

CONCLUSIONS

A novel analytical approach for characterizing complex biological samples such as blood was developed. The combination of Raman spectroscopy and advanced statistical analysis was used to distinguish human, canine, and feline blood samples, which have many similarities in composition, causing their spectra to look almost identical by visual comparison alone. After measurement of multiple spectra from eight samples each of human, canine, and feline blood using automatic Raman spectroscopy, it was found that the three species can be statistically separated within a confidence interval of 99%. PCA and multiple cross-validation methods indicated that there are six principal components that describe the system containing all 24 samples, and three of these six components were used to develop a three-dimensional scores plot that clearly shows significant separation among all three species. This plot is based on the three components that contribute the largest percentage to the entire system (91.2%).

Other species could also be tested, such as horse, rabbit, cow, deer, and other animals that might play a role in forensic investigations. Analyzing a larger sample set would be another possible addition to this experiment to better represent the diversity of the entire population(s). This Raman spectroscopic method is nondestructive to the blood samples, unlike the other available testing procedures for identifying a specific species. This is important when only a small amount of sample is available and further testing is needed on the sample after the initial identification. In addition, portable Raman instruments have already been developed,30,31 and similar instruments could be applied to this method with the proper software implementation so that analysis can take place at the crime scene. This technique shows great potential in the forensic identification of blood species without the need for time-consuming and destructive testing procedures that are limited to a laboratory.

ACKNOWLEDGMENT

We would like to acknowledge Dr. Victor Shashilov for his advice and valuable discussions. We are also grateful to the former Director of the North East Regional Forensic Institute (NERFI), W. Mark Dale (presently at the U.S. Army Criminal Investigation Laboratory), the present NERFI Director, John Hicks, and Dr. Barry Duceman, Director of Biological Science in the New York State Police Forensic Investigation Center, for continued support. Special thanks to the Just Cats Veterinary Clinic and the Albany County Veterinary Hospital for providing canine and feline blood samples. This work is supported through the Faculty Research Award Program, University at Albany, State University of New York (I.K.L.).

Received for review June 22, 2009. Accepted July 27, 2009.
AC901350A