Characterizing the Reproducibility of a Protein Profiling Method for the Analysis of Mouse Bronchoalveolar Lavage Fluid

Anne McLachlan,†,‡ Michael Borchers,§ Prakash Velayutham,‡ Michael Wagner,‡ and Patrick A. Limbach*,†

Rieveschl Laboratories for Mass Spectrometry, Department of Chemistry, P.O. Box 210172, University of Cincinnati, Cincinnati, Ohio 45221, Department of Environmental Health, P.O. Box 670056, University of Cincinnati, Cincinnati, Ohio 45267, and Division of Biomedical Informatics, Cincinnati Children’s Hospital Research Foundation, 3333 Burnet Avenue, Cincinnati, Ohio 45229

Received May 19, 2006

The detection of biomarkers in biological fluids has been advanced by the introduction of mass spectrometry screening methods such as matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOFMS), which enables the detection of the presence and the molecular mass of proteins in unfractionated mixtures. The generation of reproducible mass spectra over the course of an experiment is vital in obtaining data in which differences in protein profiles between diseased and healthy states can be assessed correctly. We have developed a protocol to automate the collection of protein profiling data from a large number of samples using MALDI-TOFMS, and we used these samples to characterize the technical reproducibility of the method. This protocol has been used for the analysis of proteins found in bronchoalveolar lavage fluid samples from mice with the ultimate goal of enabling the discovery of differential expression patterns predictive of the development of chronic obstructive pulmonary disease. Samples were purified using magnetic bead-based technology and analyzed on an AnchorChip target plate. Our results demonstrate that the number of peaks detected reproducibly decreases significantly as sample size increases, which motivates the need for technical replicates to be explicitly included in the analysis of MALDI-TOF-based protein profiling studies.

Keywords: MALDI-TOF mass spectrometry • automation • technical reproducibility • computational methods • mouse bronchoalveolar lavage fluid • chronic obstructive pulmonary disease

Introduction

Proteomic pattern analysis using mass spectrometry has been recognized as one of the most promising new approaches for the identification of biomarkers that indicate healthy and diseased states.1 A major research objective in this field is to search for biomarkers in complex biological fluids.2 The detection of biomarkers has been advanced considerably by the introduction of screening methods such as matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOFMS), which enables the detection of the presence and the molecular mass of proteins in unfractionated mixtures. Enrichment of specific subsets of proteins can be achieved by the application of magnetic bead technology3 in which protein enrichment occurs by the relative adsorption of specific protein groups to magnetic beads that are functionalized with resins of varying hydrophobicities.

* To whom correspondence should be addressed. Phone: (513) 556-1871. Fax: (513) 556-9239. E-mail: [email protected].
† Rieveschl Laboratories for Mass Spectrometry, Department of Chemistry, University of Cincinnati.
‡ Division of Biomedical Informatics, Cincinnati Children’s Hospital Research Foundation.
§ Department of Environmental Health, University of Cincinnati.

10.1021/pr060241r CCC: $33.50 © 2006 American Chemical Society

Journal of Proteome Research 2006, 5, 3059-3065. Published on Web 10/13/2006.

The production of reproducible mass spectra over the course of an experiment is vital in obtaining data in which differences in protein profiles between diseased and healthy states can be assessed correctly. Experimental variations arising within a MALDI-TOFMS experiment are well-known and have been discussed by others previously.4-6 Baumann et al.7 and others8-10 discuss the importance of a standardized approach to proteome profiling, particularly in reference to the application of methods that use magnetic bead technology for protein fractionation. They suggest that a standardized analytical approach in combination with the use of magnetic bead fractionation decreases the variability of proteome patterns in human serum as assessed using MALDI-TOFMS. A majority of prior protein profiling studies investigating MALDI (or SELDI) experimental reproducibility have focused on variability in peak intensities among several samples8,11 or have examined the sources of variability for replicate analyses of a single sample.12 De Noo and co-workers introduced a novel bioinformatics pipeline to quantitatively examine the effects of sample preparation and handling on data reproducibility and noted that a typical coefficient of variation between samples was around 20%.8 Hong et al. used principal component analysis to demonstrate no systematic biases in SELDI data acquisition.12

In this paper, we propose and evaluate a protocol to automate the collection of protein profiling data from a large sample set of mouse bronchoalveolar lavage (BAL) fluid using MALDI-TOFMS. This method was developed using pooled BAL fluid from mice and then applied to a large (48/48) sample/control cohort to assess reproducibility across technical and biological replicates. The focus here is to enable high-throughput studies using high degrees of automation, followed by a careful assessment of method reproducibility. Our findings demonstrate that technical reproducibility must be assessed for large sample sets before the usefulness of any biomarker classification scheme can be determined.

Experimental Section

Acrolein Exposure. Mice were exposed to acrolein as previously described.13 The chamber atmosphere was sampled with a series of two glass-fritted impingers, each containing 10 mL of 50% aqueous ethanol solution. A fraction of each sample was mixed with 50 mM hexylresorcinol, 2.1 mM mercury chloride, and 29.7 M trichloroacetic acid. Equal volumes of samples and known standards were heated at 65 °C for 15 min and allowed to cool to 22 °C for 15 min, followed by measurement of the absorbance at 605 nm. The mice were sacrificed by an intraperitoneal injection of pentobarbital sodium, and their posterior abdominal aorta was severed. The trachea was ligated, and the lungs were washed with 1 mL of Hank’s Balanced Salt Solution (HBSS) without calcium and magnesium (GIBCO BRL, Life Technologies, Gaithersburg, MD). Cells were removed by centrifugation at 300g for 10 min. The supernatants were stored at -80 °C. BAL fluid samples that contained any visible blood were discarded. The unexposed mice were sacrificed similarly.

Protein Purification. All protein purifications were carried out using MB-HIC (Magnetic Bead based Hydrophobic Interaction Chromatography) purification kits (Bruker Daltonics, Billerica, MA). An aliquot of 5 µL of sample and 10 µL of binding solution was added to 5 µL of C3 bead slurry and mixed thoroughly. This solution was incubated for 1 min and placed in a magnetic bead separator for 20 s to enable the separation of the beads and supernatant. The supernatant (C3S1) was removed and saved for later analysis to determine if the bead capacity was exceeded. The beads were then washed three times by adding 100 µL of wash solution to the beads and moving them back and forth between the poles of the magnetic separator. The washes were discarded. The protein fraction was removed from the beads using 5 µL of 50% aq acetonitrile (HPLC/Spectrograde, Tedia, Fairfield, OH) and incubating the solution for 1 min. The solution was placed in the magnetic separator for 30 s to separate the beads and the supernatant (C3S2). Supernatants C3S1 and C3S2 were analyzed using MALDI-TOFMS. Supernatant C3S2 was not found to contain any mass spectral peaks in common with supernatant C3S1, suggesting that the C3 bead capacity had not been exceeded in this protocol.

Two approaches for the sequential purification of samples were performed using C3 beads followed by C18 beads. In the first, the supernatant from the C3 bead purification was applied to C18 beads for further potential enrichment. In the second, the supernatant from the C3 purification was taken to dryness under vacuum and then re-hydrated using the binding solution before purification using C18 beads.


MALDI-TOFMS. A 2 mg/mL stock solution of α-cyano-4-hydroxycinnamic acid (HCCA) (purity > 99.0%, Fluka, Switzerland) prepared in ethanol/acetone (2:1 v/v) was used as the matrix for all investigations. Initially, 1 µL of sample was combined with 9 µL of matrix, and 1 µL of this mixture was spotted on a 600 µm-384 AnchorChip target plate (Bruker Daltonics, Billerica, MA). The ratio of analyte to matrix was varied according to the nature of the analyte and the results obtained after preliminary analysis using MALDI-TOFMS. On-target washing of the samples spotted on the AnchorChip target plate was carried out, when necessary, to remove any water-soluble contaminants. An aliquot of 1-5 µL of ice-cold 0.1% trifluoroacetic acid (TFA; Fisher Scientific, Pittsburgh, PA) solution was added to the sample spot and allowed to incubate for 3-5 s before removal. In some instances, recrystallization was required to ensure the sample was located on the MALDI sample target anchor. An aliquot of 0.5-1 µL of ethanol/acetone/0.1% TFA (6:3:1 v/v/v) was applied to the sample spot and allowed to dry.

MALDI-TOF mass spectra were acquired using a Bruker Reflex IV (Bruker Daltonics, Billerica, MA) mass spectrometer. The instrument was operated in linear, positive-ion mode with a nitrogen laser (337 nm). The acceleration voltage applied was 20 kV. Mass spectra were acquired over the mass range of 1-70 kDa with the low mass cutoff set to 500 Da. External calibration was performed using the calibrant positions of the MALDI sample target. Peptide calibration standard (m/z range 1000-3150), protein calibration standard I (m/z range 5700-17000), and protein calibration standard II (m/z range 22300-66500) (Bruker Daltonics, Billerica, MA) were used for external calibration.

Automated Spectral Acquisition.
The AnchorChip plate geometry was calibrated by spotting a 1 µL aliquot of matrix on the anchors at locations A1, A24, and P24 on the AnchorChip target plate and centering the cross hairs over the anchor in each location. Once the center of the anchor was located, the coordinates were saved to a plate geometry calibration file. Representative data acquisitions from each anchor were collected by creating a series of five spiral rasters that sampled over a 600 µm anchor (i.e., spot). The purpose of defining these spiral rasters, and of obtaining five replicates per spot, was to enable the assessment of within-spot technical variability, as well as to smooth over local characteristics in a particular spot. Each anchor was divided into quadrants, and within each of these quadrants, data was acquired in a spiral raster which consisted of five coordinates, at each of which 20 mass spectra were acquired. A fifth spiral raster, which also consisted of five coordinates where 20 mass spectra were acquired, was located at the center of the anchor (Figure S1, Supporting Information). The 20 mass spectra acquired at each of the coordinates of the rasters were summed to give 100 mass spectra acquired per spiral raster, and a total of five data files, each containing 100 mass spectra, acquired over each anchor. Each sample was spotted on five anchors to assess spot-to-spot reproducibility, which yielded a total of 25 replicates of 100 mass spectra per sample. Files were written to define the coordinates of these raster positions. A separate method file was created for each of the five spiral rasters to enable the collection of the data from each raster separately. Instrumental acquisition parameters relating to the laser power, the summing of mass spectra, and movement across the anchors were set using the AutoXecute Method Editor program (Table 1).

Table 1. Instrumental Acquisition Parameters Set Using AutoXecute Method Editor

Tab       Parameter                                          Setting
General   Select spectrum directory
General   Select FLEX Control method
General   Switch on high voltage                             1 min before starting acquisition
General   After AutoXecute run set high voltage to           Off
Laser     Fuzzy control                                      Off
Laser     Initial laser power                                20%
Laser     Maximal laser power                                30%
Laser     Matrix blaster                                     Fire initially
Summing   Fuzzy control                                      Off
Summing   Sum up                                             100 in 20 shot steps
Moving    Maximal allowed shot number at one position        20
Moving    Maximal number of consecutively rejected trials    20
Moving    Use () laser attenuation for new spot              Initially specified
Moving    Maximal allowed shot number is a () threshold      Hard

In-house software was written to interface an AutoXecute Sequence with Excel spreadsheets, which included information relevant to sample annotation including a mapping file that mapped the sample locations on the AnchorChip target plate, experimental details, the number of raster positions, the method files for these raster positions, the output data folder, and the name and location of the text file. Once this information was entered, the spreadsheets were saved using an appropriate name and location. The information was then transferred directly into an AutoXecute Sequence file using software written in-house. A macro was written in the Bruker XTOF program that converted the mass spectral data files to ASCII files and placed the converted files in a folder. A program (CollectFiles) was written to collect files from each experiment in order to keep all information relating to an experiment with the data from that experiment. This program gathered the top level experimental directory, the mapping file, the text file, the Excel file, the ASCII folder, and any other information files relevant to a particular experiment. Once the CollectFiles program was run, the files were zipped and uploaded to the server for analysis. Figure 1 presents a flow chart summarizing the analysis protocol developed and utilized in this work.

Bioinformatic Analysis. A total of 25 replicates of each biological sample was profiled, and each mass spectrum was processed independently as described previously.14 Local linear regression techniques were used to detect the baseline abundance for each mass spectrum. As previously observed,14 the baseline is not constant across the mass spectrum, but rather decreases with increasing m/z. The baseline also varies between mass spectra, which may be due to a varying number of mass spectra added in each acquisition. An iterative procedure was applied to simultaneously measure the local noise level and to determine mass regions that exhibit a signal-to-noise ratio >3. The peak shapes were approximated by triangles, and subpeaks (local maxima) were deemed to be significant if their abundances exceeded 3 times the signal-to-noise level above the triangular interpolation. This procedure reduced each mass spectrum to a set of peaks (the peak list) that was taken to be characteristic of the spectrum and representative of its relevant features. This process served to reduce each spectrum from tens of thousands of data points to typically a few hundred, and to ensure that only m/z values with a sufficiently large signal-to-noise ratio were considered in the subsequent analytical steps. To enable a comparison of peak lists from different mass spectra, peaks whose m/z values were determined to be within a user-specified mass tolerance window (e.g., 0.2%) were considered to have the same origin.
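The tolerance-window comparison described above can be illustrated with a minimal sketch. This is not the authors' software: the function names are hypothetical, and defining the window relative to the mean of the two m/z values is an assumption, since the text does not state which reference mass the percentage applies to.

```python
def within_tolerance(mz_a, mz_b, rel_tol=0.002):
    """True if two m/z values differ by at most rel_tol (0.2%) of their mean.

    Using the mean as the reference mass is an assumption of this sketch.
    """
    return abs(mz_a - mz_b) <= rel_tol * (mz_a + mz_b) / 2.0


def shared_peaks(peaks_a, peaks_b, rel_tol=0.002):
    """Pair peaks from two ascending m/z lists that fall in the same window."""
    pairs = []
    i = j = 0
    while i < len(peaks_a) and j < len(peaks_b):
        a, b = peaks_a[i], peaks_b[j]
        if within_tolerance(a, b, rel_tol):
            pairs.append((a, b))  # treated as having the same origin
            i += 1
            j += 1
        elif a < b:
            i += 1  # peak a has no partner in the other list
        else:
            j += 1  # peak b has no partner in the other list
    return pairs


# Illustrative (made-up) peak lists from two spectra.
spectrum_a = [2000.0, 5000.0, 11000.0]
spectrum_b = [2003.0, 5020.0, 11010.0]
print(shared_peaks(spectrum_a, spectrum_b))
```

A relative window widens with mass, which matches the way peak width and mass error tend to grow with m/z in linear-mode TOF data; an absolute window in Da would be a different design choice.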

The 25 replicates per biological sample were consolidated into a single peak list based on the rationale that, for the presence of a particular peak to be considered characteristic of a sample, it should appear in at least the majority of technical replicates. Therefore, the five replicates from each anchor location on the MALDI target were combined by running a majority voting algorithm. If a peak was detected in three or more replicates, it was included in the peak list for that anchor location. Similarly, if a peak was detected in a majority of the five replicate AnchorChip locations, then it was included in the peak list of that sample. Once complete, these iterations resulted in a peak expression matrix which was comprised of samples in its rows and peak masses in its columns.
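The two-level majority vote can be sketched as follows. This is an illustration under stated assumptions rather than the published implementation: the function names are hypothetical, peaks are grouped greedily into windows anchored at the lowest m/z in each group, and the consensus m/z is taken as the group mean.

```python
def majority_vote(peak_lists, rel_tol=0.002):
    """Return m/z values detected in more than half of the given peak lists."""
    n_lists = len(peak_lists)
    # Tag each peak with the index of the list it came from and sort by m/z.
    tagged = sorted((mz, idx) for idx, peaks in enumerate(peak_lists) for mz in peaks)
    tagged.append((float("inf"), -1))  # sentinel that flushes the last group
    consensus, cluster = [], []
    for mz, idx in tagged:
        # Close the current group once the next peak leaves its tolerance window.
        if cluster and mz - cluster[0][0] > rel_tol * cluster[0][0]:
            voters = {i for _, i in cluster}  # each list votes at most once
            if len(voters) > n_lists / 2:
                consensus.append(sum(m for m, _ in cluster) / len(cluster))
            cluster = []
        cluster.append((mz, idx))
    return consensus


def consolidate_sample(anchors, rel_tol=0.002):
    """anchors: one entry per anchor, each a list of replicate peak lists."""
    anchor_lists = [majority_vote(replicates, rel_tol) for replicates in anchors]
    return majority_vote(anchor_lists, rel_tol)


# Made-up example: five replicate peak lists from one anchor.
replicates = [[1000.0, 2000.0], [1000.5, 2001.0], [999.8], [1000.2, 3000.0], [3001.0]]
print(majority_vote(replicates))  # only the peak near m/z 1000 has 4 of 5 votes
```

With five replicates per anchor and five anchors per sample, "more than half" reduces to the three-out-of-five rule stated in the text.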

Results

Media Selection. The protocol developed in this work has been tested by successfully analyzing protein profiles of other biological fluids, namely, serum and urine, but the discussion will be limited to mouse BAL fluid. The aim of these studies was to enable high-throughput studies of large datasets to characterize the reproducibility of MALDI-TOFMS over significant numbers of technical replicates. Magnetic bead technology was chosen as the method of protein fractionation because of its ease and speed of extraction. Initial studies were carried out on pooled samples of mouse BAL fluid. These samples were subjected to fractionation with magnetic beads of varying hydrophobicities (C3, C8, and C18) and were analyzed using MALDI-TOFMS (Figure S2, Supporting Information). Although extraction with C3 beads produced the richest mass spectra, sequential extractions were performed on mouse BAL fluid using C3 beads, followed by C18 magnetic beads, to ensure the maximum number of proteins was extracted. These sequential extractions provided little additional enrichment and were not pursued further. All further studies were performed using C3 beads only.

Sampling. The reproducibility of the data obtained during analysis depends upon how thoroughly each sample spot was analyzed. A number of different raster patterns used for data acquisition were investigated. These rasters were comprised of a variety of shapes including horizontal and vertical lines, and squares and spirals in a range of dimensions. Although the line, square, and larger spiral raster patterns were sampled across the anchor, only five collection points of 100 mass spectra were acquired. When the five spiral rasters were used, data was acquired from 25 positions across the anchor with each of the five rasters comprising five collection points each of 20 mass spectra (Figure S1, Supporting Information). It was decided to use the five spiral rasters as this configuration enabled better data acquisition coverage across the anchors than the other raster patterns investigated.

Automation. The MALDI-TOF mass spectrometer used in this study was not initially configured for high-throughput analysis. The automation of the sampling regime and data collection, conversion of mass spectral data into other file formats, and documentation of experimental details enabled a high throughput of samples to be obtained in a timely fashion while maintaining the integrity of the data acquired. Scripts that were written to automate the instrumental data acquisition methods using five spiral rasters per sample spot enabled data to be collected in individual files of 100 mass spectra rather than 500 summed spectra. Thus, the reproducibility of the mass spectra acquired can be examined from 25 different positions over the anchor (Figure 2), which allows for the determination of anchor (sample spot) and sample (multiple sample spotting) reproducibility (Figure S3, Supporting Information).

Figure 1. Outline of development of protein profiling protocol.

Figure 2. Twenty-five replicates of BAL from a single mouse.

BAL-Small Sample Sets. The BAL fluid from eight unexposed mice and eight mice that were exposed to acrolein was analyzed using the sampling and automation protocol developed in this work. The MALDI-TOF mass spectra of this 8 × 8 dataset clearly showed differences between filtered-air exposed (control) mice and the mice exposed to acrolein, as illustrated in Figure 3a,b. The data within each group, that is, control and exposed, were reproducible. Figure 3c shows that the control mice are grouped together and the exposed mice are grouped together, demonstrating that a difference between the protein profiles of exposed and control mice exists.

Reproducibility Analysis. A thorough review of the literature did not reveal any previously defined standards on how reproducibility should be assessed for protein profiling data of the type acquired in these MALDI-TOFMS analyses. As one aim of this study was to enable the later use of peak intensities to classify samples and distinguish between sample populations, one immediate area of interest when looking at technical and biological variability in this data was to determine the number of peaks whose presence (intensity above noise level) could be detected reproducibly in the replicates. The first characterization focused on the reproducibility within a given spot, which is composed of the five data files obtained from the five spiral raster patterns. The average number of peaks detected reproducibly across acquisitions from the same spot (i.e., in at least 3 out of 5 data files of 100 mass spectra each) was 85 (σ = 13; n = 240 separate spots). If the most stringent reproducibility criterion is demanded, that is, each peak must be detected in all five data files obtained over any one spot, then this value drops to 30 (σ = 8; n = 240 separate spots). To account for uneven sample distribution on a spot or other sources of variability arising from the protocol, all subsequent examinations of the data used the criterion that, for a peak to be assigned as representative of a spot, it must appear in a majority of acquisitions (3 out of 5) from that spot (anchor). The next characterization focused on spot-to-spot (i.e., inter-anchor) variability. Using the same measures as above, it was observed that the average number of reproducible peaks (that is, the number appearing in a majority of the spot peak profiles) was 59 (σ = 12).
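The averages reported above can be compared directly with a short worked calculation. The variable and function names below are illustrative only; the three counts are the values quoted in the text.

```python
# Average reproducible-peak counts as reported in the text.
within_spot_majority = 85   # peaks seen in at least 3 of 5 data files from one spot
within_spot_all_five = 30   # peaks seen in all 5 data files from one spot
spot_to_spot_majority = 59  # peaks seen in a majority of the spot peak profiles


def relative_drop(before, after):
    """Fractional decrease when going from `before` peaks to `after` peaks."""
    return (before - after) / before


strict_penalty = relative_drop(within_spot_majority, within_spot_all_five)
spot_to_spot_loss = relative_drop(within_spot_majority, spot_to_spot_majority)
print(f"majority -> all five files within a spot: {strict_penalty:.0%} fewer peaks")
print(f"within a spot -> across spots: {spot_to_spot_loss:.0%} fewer peaks")
```

On these figures, demanding detection in all five files removes roughly two-thirds of the peaks that pass the majority criterion, while moving from a single spot to multiple spots removes a little under one-third.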
In this representative study, the technical reproducibility of the protocol decreases by approximately 30% when examining any one particular sample across multiple anchors. Moreover, any one mass spectral data file typically contains on the order of 200 peaks as determined by the bioinformatics approach described. Thus, only around one-third of the acquired mass spectral information can be considered statistically relevant once one examines the reproducibility of the MALDI technique.

Figure 3. Representative MALDI-TOF mass spectra of BAL from unexposed mice and mice exposed to acrolein from an 8 × 8 experiment with mass ranges (a) 2-25 kDa and (b) 2-71 kDa. (c) Dendrogram of the protein profiles of 8 unexposed mice and 8 mice exposed to acrolein.

Figure 4. Reproducibility across mice exposed to acrolein, unexposed mice, and all samples using a mass tolerance of 0.4% for peak alignment.

Another meaningful measure of protocol reproducibility is the sample-to-sample variability. This study was constructed using cohorts of genetically identical mice so as to minimize the influence of biological variability on any examination of technical reproducibility. To obtain some measure of sample-to-sample variability, the number of reproducible peaks (identified by the majority voting algorithm described above) was plotted against the relative fraction of samples containing the reproducible peaks. Figure 4 displays the sample-to-sample variability for three sample populations: 48 control mice, 48 mice that were exposed to acrolein, and the 96 samples combined together. As seen, the level of reproducibility was similar among all three sample populations. As might be expected, a larger number of peaks appear reproducibly within a smaller percentage of replicate samples, with a significantly smaller number of peaks appearing reproducibly within nearly every replicate (10-20 peaks at the 90% level).
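A curve of this kind can be computed from a binary peak expression matrix (samples in rows, aligned peaks in columns). The sketch below is a hypothetical illustration with made-up data, not the analysis code used for Figure 4; it assumes that a peak counts as reproducible at level p when it is present in at least a fraction p of the samples.

```python
def reproducibility_curve(presence, fractions):
    """Count peaks present in at least each given fraction of samples.

    presence: list of rows (one per sample), each a list of 0/1 flags per peak.
    fractions: iterable of thresholds p between 0 and 1.
    """
    n_samples = len(presence)
    n_peaks = len(presence[0]) if presence else 0
    # Fraction of samples in which each peak (column) was detected.
    peak_freq = [
        sum(row[k] for row in presence) / n_samples for k in range(n_peaks)
    ]
    return {p: sum(1 for f in peak_freq if f >= p) for p in fractions}


# Made-up matrix: 4 samples (rows) by 4 aligned peaks (columns).
presence = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
]
print(reproducibility_curve(presence, [0.25, 0.5, 0.9]))
```

The counts can only stay the same or fall as p increases, which is the monotone decrease described in the text.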

Discussion

Much current research in protein science centers on the search for biomarkers through protein profiling studies of protein expression. Such research demands high throughput of large numbers of samples. If a high sample throughput is to be meaningful, it must be based on analytical methods that are reproducible and rigorous. Mass spectrometry, in particular MALDI-TOFMS, has been proposed as a method of choice for these types of analyses because of its ability to yield data on intact proteins without costly or time-consuming sample preparation steps. While a number of studies have focused on standardization of sample preparation protocols,7-10 very little has been reported regarding the reproducibility of this approach,8,12,15 especially across larger datasets. Within this study, an automated approach for the high-throughput analysis of large sample sets was developed, and then used to examine technical reproducibility at various levels of analysis. The feasibility of this approach was assessed by sampling and profiling the bronchoalveolar lavage (BAL) fluid of mice repeatedly exposed to the respiratory toxicant acrolein. Acrolein (2-propenal) is a potent respiratory tract irritant found in tobacco smoke and photochemical smog that has been shown to produce the lesions associated with COPD in experimental animals.16-19 It was anticipated that a significant number of changes in the BAL protein profile of mice exposed to acrolein for 1 week would be detected based on prior microarray data (M. Borchers, private communication). Early results based on a smaller dataset consisting of eight control mice and eight mice that had been exposed to acrolein indicated that differences in peak intensities correlated strongly with exposure to acrolein (Figure 3).
The results from a smaller dataset were encouraging enough to expand this study to larger sample sizes, which provided a unique opportunity to determine the technical reproducibility of this protocol. The larger dataset consisted of 96 genetically identical mice, to minimize biological variance, with 48 of the mice exposed to acrolein, and the other 48 mice serving as filtered-air exposed controls. The reproducibility of this data set was assessed by determining the numbers of peaks that appeared in the same location in replicate mass spectra. Raw data was processed following our previously described approach,14 which accounts for variations in baseline noise levels across the m/z range. A typical result may be seen in Figure 2, which displays 25 replicate MALDI-TOF mass spectra acquired from the BAL fluid of a single mouse. Our analysis shows that for intra-spot variability, around 50% of all peaks were found to occur in the majority of multiply sampled spots. Spot-to-spot variability was found to be even higher, with only 30% of all peaks reproducibly detected in a majority of the multiply sampled spots. These data clearly show that in order to maximize the probability that only legitimate (and thus identifiable) peaks are considered in later analyses, it is essential to explicitly incorporate technical replicates into the study design. These results are most likely directly attributable to the MALDI process and reflect variability in spot formation (e.g., pipetting and/or crystallization) and the desorption/ionization process. Thus, improvements to the primary MALDI steps, such as those reported recently in MALDI imaging,4 should result in significant improvements in sampling reproducibility.

While the results described above are not surprising, these data do not address the compounding effects of reproducibility across increasingly larger datasets. To examine such effects, replicate spectra were acquired from a large (n = 48) set of biologically similar samples. Although the study, as designed, cannot directly separate biological variability from technical variability, the use of genetically identical mice for this study was desired so as to minimize biological variances typically observed when profiling human biological fluids.15 Figure 4 and Figure S4 (Supporting Information) illustrate the number of reproducible peaks as a function of how many sample peak profiles a peak has to appear in to be considered reproducible.
At a very rigorous level, for example, p = 90%, we see that only a small number of peaks (10-20) are appropriate to be used as a basis for classification. Such results clearly suggest that any conclusions drawn from small sample sizes should be carefully interpreted before claiming biological relevance. In particular, classification from small sample sets based upon peaks which are found not to be reproducible when a study is expanded to larger datasets could lead to a favorable bias in the final analysis.6 More sophisticated peak extraction and alignment algorithms than those used in this study, such as those reported by Coombes et al.,20 might allow a higher number of peaks to be reproducibly detected. However, our results did not change qualitatively when parameters in the algorithm such as the mass tolerance were varied (cf. Figure S4, Supporting Information).

For the data obtained in this study, while the technical reproducibility decreases with increasing biological replicates, even at the most rigorous criteria, statistically significant differences exist between the peaks detected from exposed and unexposed mice. Thus, the preliminary results obtained in the 8 × 8 study may also hold for the 48 × 48 study. Even so, profiling studies still require careful evaluation of sample-to-sample reproducibility across a large dataset to ensure peaks chosen for classification meet appropriate criteria for technical reproducibility. Careful examination of these data also reiterates results found in prior studies.8,11,15 At small sample sizes using the majority voting algorithm to define reproducible peaks, the number of peaks would be equal to or slightly less than the number of peaks found when p = 51% (i.e., 60). It is the cumulative effect of intra-spot and spot-to-spot variability effective over numerous samples which leads to fewer peaks being reproducible in these larger sample sizes. As mentioned above, while these results do not discount prior studies performed with small sample sizes, they do point out that larger-scale technical reproducibility issues may affect the ultimate usefulness of any biomarker classification scheme. Given these results, it is essential that both technical and biological reproducibility be carefully assessed using these or other methods, especially until gold standards for different biological fluids are established.

Acknowledgment. Financial support of this work was provided by the National Institutes of Health (P30-ES06096), the Center for Environmental Genetics at UC, Cincinnati, OH, Cincinnati Children’s Hospital Research Foundation, and the University of Cincinnati.

Supporting Information Available: Figures showing the schematic of the raster patterns used for data acquisition (Figure S1); MALDI-TOFMS of protein enrichment of mouse BAL using C3 and C18 magnetic beads (Figure S2); mass spectra of pooled mouse BAL fluid for determination of anchor and sample reproducibility (Figure S3); and the reproducibility across mice exposed to acrolein, unexposed mice, and all samples using a peak tolerance of 0.2% for peak alignment (Figure S4). This material is available free of charge via the Internet at http://pubs.acs.org.

References
(1) Etzioni, R.; Urban, N.; Ramsay, S.; McIntosh, M.; Schwartz, S.; Reid, B.; Radlich, J.; Hartwell, L. Nat. Rev. Cancer 2003, 3, 243-252.
(2) Aldred, S.; Grant, M. M.; Griffiths, H. R. Clin. Biochem. 2004, 37, 943-952.
(3) Zhang, X.; Leung, S. M.; Morris, C. R.; Shigenaga, M. K. J. Biomed. Tech. 2004, 25, 167-175.
(4) Aerni, H.-R.; Cornett, D. S.; Caprioli, R. M. Anal. Chem. 2006, 78, 827-834.
(5) Fenselau, C. Anal. Chem. 1997, 69, 661A-665A.
(6) Hilario, M.; Kalousis, A.; Pellegrini, C.; Mueller, M. Mass Spectrom. Rev. 2006, 25, 409-449.
(7) Baumann, S.; Ceglarek, U.; Fiedler, G. M.; Lembcke, J.; Leichtle, A.; Thiery, J. Clin. Chem. 2005, 51, 973-980.
(8) de Noo, M. E.; Tollenaar, R. A. E. M.; Ozalp, A.; Kuppen, P. J. K.; Bladergroen, M. R.; Eilers, P. H. C.; Deelder, A. M. Anal. Chem. 2005, 77, 7232-7241.
(9) Rai, A. J.; Gelfand, C. A.; Haywood, B. C.; Warunek, D. J.; Yi, J.; Schuchard, M. D.; Mehigh, R. J.; Cockrill, S. L.; Scott, G. B. I.; Tammen, H.; Schulz-Knappe, P.; Speicher, D. W.; Vitzthum, F.; Haab, B. B.; Siest, G.; Chan, D. W. Proteomics 2005, 5, 3262-3277.
(10) West-Nielsen, M.; Hogdall, E. V.; Marchiori, E.; Hogdall, C. K.; Schou, C.; Heegaard, N. H. H. Anal. Chem. 2005, 77, 5114-5123.
(11) Dekker, L. J.; Dalebout, J. C.; Siccama, I.; Jenster, G.; Smitt, P. A. S.; Luider, T. M. Rapid Commun. Mass Spectrom. 2005, 19, 865-870.
(12) Hong, H.; Dragan, Y.; Epstein, J.; Teitel, C.; Chen, B.; Xie, Q.; Fang, H.; Shi, L.; Perkins, R.; Tong, W. BMC Bioinf. 2005, 6, S5.
(13) Borchers, M. T.; Wert, S. E.; Leikauf, G. D. Am. J. Physiol. 1998, 274, L573-L581.
(14) Wagner, M.; Naik, D.; Pothen, A. Proteomics 2003, 3, 1692-1698.
(15) Semmes, O.; Feng, Z.; Adam, B.; Banez, L.; Bigbee, W.; Campos, D.; Cazares, L.; Chan, D.; Grizzle, W.; Izbicka, E.; Kagan, J.; Malik, G.; McLerran, D.; Moul, J.; Partin, A.; Prasanna, P.; Rosenzweig, J.; Sokoll, L.; Srivastava, S.; Srivastava, S.; Thompson, I.; Welsh, M.; White, N.; Winget, M.; Yasui, Y.; Zhang, Z.; Zhu, L. Clin. Chem. 2005, 51, 102-112.
(16) Borchers, M. T.; Wesselkamper, S.; Wert, S. E.; Shapiro, S. D.; Leikauf, G. D. Am. J. Physiol. 1999, 277, L489-L497.
(17) Costa, D. L.; Kutzman, R. S.; Lehmann, J. R.; Drew, R. T. Am. Rev. Respir. Dis. 1986, 133, 286-291.
(18) Feron, V. J.; Kruysse, A.; Til, H. P.; Immel, H. R. Toxicology 1978, 9, 47-57.
(19) Lyon, J. P.; Jenkins, L. J.; Jones, R. A.; Coon, R. A.; Siegel, J. Toxicol. Appl. Pharmacol. 1970, 17, 726-732.
(20) Coombes, K. R.; Tsavachidis, S.; Morris, J.; Baggerly, K. A.; Hung, M. C.; Kuerer, H. M. Proteomics 2005, 5, 4107-4117.
