Comparison of Software Tools to Improve the Detection of Carcinogen

Comparison of Software Tools to Improve the Detection of. Carcinogen Induced Changes in the Rat Liver Proteome by Analyzing. SELDI-TOF-MS Spectra...
16 downloads 0 Views 579KB Size
Comparison of Software Tools to Improve the Detection of Carcinogen Induced Changes in the Rat Liver Proteome by Analyzing SELDI-TOF-MS Spectra Suse Beyer,† Yvonne Walter,† Juergen Hellmann,† Peter-Juergen Kramer,† Annette Kopp-Schneider,‡ Michaela Kroeger,*,† and Carina Ittrich‡ Institute of Toxicology, Merck KGaA, Darmstadt, and German Cancer Research Center, Heidelberg Received August 24, 2005

A common animal model of chemical hepatocarcinogenesis was used to demonstrate the potential identification of carcinogenicity related protein signatures/biomarkers. Therefore, an animal study in which rats were treated with the known liver carcinogen N-nitrosomorpholine (NNM) or the corresponding vehicle was evaluated. Histopathological investigation as well as SELDI-TOF-MS analysis was performed. SELDI-TOF-MS is an affinity-based mass spectrometry method in which subsets of proteins from biological samples are selectively adsorbed to a chemically modified surface. The proteins are subsequently analyzed with respect to their mass-charge ratios (m/z) by a time of flight (TOF) mass spectrometry (MS) approach. As data preprocessing of SELDI-TOF-MS spectra is essential, baseline correction, normalization, peak detection, and alignment of raw spectra were performed using either the Ciphergen ProteinChip Software 3.1 or functions implemented in the library PROcess of the BioConductor Project. Baseline correction and normalization algorithms of both tools lead to comparable results, whereas results after peak detection and alignment steps differed. Variability between technical and biological replicates was investigated. A linear mixed model with factors experimental group and time point was applied for each protein peak, taking into account the different correlation structure of technical and biological replicates. Alternatively, only median intensity values of technical replicates were used. Results of both models were similar and correlated well with those of the histopathological evaluation of the study. In conclusion, statistical analyses lead to comparable results, whereas parameter settings for preprocessing proved to be crucial. Keywords: SELDI-TOF-MS • toxicoproteomics • early predictive biomarker patterns • liver toxicity and carcinogenicity • linear mixed model

Introduction Currently, toxicological risk assessment of compounds is based primarily on histopathological and biochemical investigation of biological specimen from animal studies.1-4 These studies include classical short-term tests to examine acute as well as chronic toxic effects (e.g., 28 day tests) and two year carcinogenicity studies. To reduce time and cost of animal studies, new techniques, such as genomics, proteomics, and metabonomics are becoming widely accepted.5,6 Toxicoproteomics, in particular, is the examination of changes in the proteome at the molecular level following exposure to a toxicant.7,8 To relate such changes to conventional endpoints of toxicity, a common animal model of chemical hepatocarcinogenesis was used.9 The known liver carcinogen N-Nitrosomorpholine (NNM) at 20 mg/kg body weight was applied to young adult male Wistar rats for 7 weeks to induce hepato* To whom the correspondence should be adressed. Tel.: +49-6151723587. Fax: +49-6151-727673. E-mail: [email protected]. † Merck KGaA. ‡ German Cancer Research Center.

254

Journal of Proteome Research 2006, 5, 254-261

Published on Web 01/06/2006

carcinogenesis, followed by an 18 weeks exposure-free period. Animals were sacrificed on day one and after 25 weeks, respectively. Left liver lobes from control and treated animals were prepared for histopathological evaluation, as well as proteomic investigation with SELDI-TOF-MS (Surface enhanced laser desorption/ionization time-of-flight mass spectrometry).10-12 With SELDI-TOF-MS, complex protein samples are prefractionated by capturing protein species of certain biochemical properties on a special chip surface. By using stringent washing procedures, contaminants such as buffer salts and detergents are removed from the surface. Remaining proteins are cocrystallized with small organic and solid-state electron absorbing molecules (EAMs), which absorb laser energy at certain laser wavelengths. After converting laser energy into thermal energy and transferring it to the proteins bound on the chip surface, consequently, proteins are detached and then freely fly through a time-of-flight tube. The time it takes for the molecules to pass through the tube is a function of their molecular weight and charge. Intensity signals of each discrete time-of-flight are 10.1021/pr050279o CCC: $33.50

 2006 American Chemical Society

Carcinogen Induced Changes in the Rat Liver Proteome Table 1. Comparison of Baseline Correction and normalization Algorithms from PROcess and the Ciphergen ProteinChip Software 3.1 across all 182 Spectra of the Study baseline correction normalization all m/z

maximum absolute deviation 74.64 mean absolute min across 1.416 deviation 182 spectra max across 3.65 182 spectra

m/z > 2000 Da

m/z > 2000 Da

74.64 0.11

55.50 0.72

3.05

2.52

collected by a detector at the end of the tube. Single raw spectra created in this manner from complex protein mixtures consist of approximately 25 000 data points covering a m/z range of 2000-20 000 Da, which was the same for our experimental setup. SELDI-TOF-MS technology has been previously investigated for potential use in serum biomarker development in clinical research.13 However, reliability of data sets derived from this approach has been critically discussed in the literature,14 although discussion about reliability of SELDI-TOF-MS results could be triggered by wrong assumptions.15 As preprocessing of SELDI-TOF-MS spectra is an essential step prior to data analysis,16 we have tried to find optimal parameter settings for baseline correction, normalization, peak detection and alignment of multiple spectra by comparing two software tools, the Ciphergen ProteinChip Software 3.1 and library PROcess of the BioConductor Project2 (www.bioconductor.org) available through R.1 The reliability of subsequent data analysis is important to ensure the significance of potentially interesting protein peaks.17 To allow for a simultaneous evaluation of both time points, we applied a linear mixed model and a linear model for each selected protein peak. Results of statistical analysis initially comprise unidentified protein peaks, for which only molecular weights are known but not the protein identification and, therefore, not the cellular function. However, these protein peaks may be potentially taken as signatures corresponding to different toxicological or carcinogenic endpoints.3

Materials and Methods Animal Study. Male Wistar rats were treated with NNM.9 The animals were dosed daily by gavage with either vehicle (drinking water) or 20 mg/kg body weight NNM for 7 weeks and were sacrificed after 1 day or 25 weeks after the start of the experiment. The sample size was 6 animals per group and time point. Necropsy and Histopathology. For proteomics and histopathological investigations, left liver lobes were removed from the animals. For protein extraction liver tissue was cut into small pieces, frozen in liquid nitrogen, and stored at -80 °C. For histopathology, three slices (2-3 mm) of each liver lobe were fixed in formalin, embedded in paraffin and stained with hematoxilin-eosin (HE). The observed liver alterations were scored and divided into different categories as described in Table 1 of Fella et al.18 Protein Extraction. Frozen liver samples (100 ( 50 mg) were ground with a mortar and pestle under liquid nitrogen. Powdered tissue was transferred into a tube containing 125 µL lysis buffer I (10 mM Tris/ pH 7.5, 1 mM EDTA, 0.2 M Saccharose, Benzonase and inhibitor cocktail) and completely

research articles suspended. Afterward, 875 µL lysis buffer II (7 M urea, 2 M thiourea, 4% CHAPS, 40 mM DTT, 20 mM spermine) was added. Suspensions were incubated for 1 h at room temperature and centrifuged at 500 rpm. To separate contaminants and non- soluble ingredients, samples were ultracentrifuged for 30 min at 10 °C and 35 000 rpm. Concentrations of the supernatants were determined by the Bradford assay. Samples were stored at -20 °C. SELDI Protein Chip Preparation. CM10 (weak cation exchanger) ProteinChip arrays from Ciphergen Biosystems containing 8 spots each with the ability of binding the same subgroup of proteins were equilibrated with 50 µL preactivating buffer (10 mM HCl) for 10 min at room temperature. Protein samples were diluted with binding buffer (0.1 M Sodium acetate, 0.05% Triton X-100, pH 4.5) to a concentration of 300 µg/mL. One-hundred microliters of the protein solutions were applied to single spots on the arrays, incubated for 1 h at room temperature and centrifuged at 270 rpm. Each animal group was represented by 6 biological replicates. Per animal, 8 technical replicates were performed on one of the 8 spots per chip. Each spot was washed 3 times with 300 µL binding buffer for 7 min each followed by a centrifugation at 270 rpm in order to purify the spots from unbound molecules and contaminants. Finally, the protein chips were washed with ultrapure water and dried at room temperature for 30 min. As an EAM sinapinic acid (SPA; 400 µL l 50% ACN, 0.5% TFA) was added in two aliquots of 0.7 µL. Subsequently, TOF-analysis was carried out. SELDI Raw Spectra. Raw spectra for each single replicate were obtained by using a Ciphergen Protein Chip Reader (PBS II) with the following settings: 2x warming shots at laser intensity 240 (not collected), collection of 5× laser intensity 230 every five positions between 20 and 80, high mass 50 000 Da, detector voltage 1800 V and focus mass 12 000 Da. Spectra were calibrated externally using purified peptide and protein standards (insulin, ubiquitin, cytochrome C, superoxide dismutase, myoglobin). Ten spectra were not evaluable due to experimental problems. Overall the study comprised 182 spectra. Software for Data Preprocessing and Statistical Analysis. Data preprocessing was completed using the Ciphergen ProteinChip Software 3.1 as well as the library PROcess of the BioConductor Project (Version 1.5)2 available through R (Version 2.01).1 Statistical data analysis was performed using library limma of the BioConductor Project2 and the SAS procedure PROC MIXED (SAS Institute, Cary, USA). In preparation for further data analysis, several preprocessing steps were applied: baseline correction, normalization, quality control, peak detection, and alignment. Baseline Correction. Baseline Correction for each single spectrum is necessary to reduce noise in the data.19 Using PROcess, a loess curve was fitted to the spectrums local minima20 in order to estimate the corresponding baseline. The algorithm implemented in the Ciphergen ProteinChip Software 3.1 fits a piecewise convex-hull that attempts to find the bottom of the spectra and to correct the peak height as well as the area.21 Baseline corrected spectra result from subtracting the baseline from raw spectra. To enhance peaks, the baseline was smoothed before correction. Normalization. Intensities of m/z values are measurements of relative rather than absolute quantities of proteins due to variation in sample preparation, matrix crystallization and ion detection.16 Therefore, prior to differential expression analysis, intensities have to be normalized. The Ciphergen ProteinChip Journal of Proteome Research • Vol. 5, No. 2, 2006 255

research articles Software 3.1 as well as PROcess perform global normalization approaches. Using PROcess, a set of spectra is normalized to their median AUC (area under the curve), which is the sum of the intensities between two cut off points (2000 and 20 000 Da in our study) and therefore summarizes the information from a series of measurements.20 The Ciphergen ProteinChip Software 3.1 normalizes spectra by calculating the Total Ion Current (TIC), which is equal to the sum of the intensity values in a selected m/z range.18 The intensity values of each spectrum were scaled such that the resulting TIC’s were equal for all spectra.21 Quality Control. Quality Control of spectra was performed using the function quality from PROcess with the standard threshold for the parameters quality, retain, and peak.20 Considering the Ciphergen ProteinChip Software 3.1, spectra were deemed of poor quality when their normalization factors were at least larger than the 75% quantile plus 1.5 times the interquartile range of all normalization factors. A second round of normalization was performed after having removed the deficient spectra from the datasets. Peak Detection in Single Spectra. Peaks can be interpreted as local maxima within spectra.22,23 For peak detection, different signal-to-noise ratios were used in order to find a large intersect between both tools. Using PROcess, a m/z-value was called a peak if its intensity signal was 2 or 15 times larger than local noise (signal-to-noise ratio), and if its signal intensity was the maximum in a given window (span ) 81).20 Using the Ciphergen ProteinChip Software 3.1, peaks were detected using the Biomarker Wizard 21 with signal-to-noise-ratios of 2 and 15. Peak Alignment Across Spectra. To take into account a maximum horizontal shift of 0.3% (vendors’ notes) for peak alignment, in PROcess each peak is treated as an interval censored observation and the Maximum-Likelihood-Estimates of the common peak locations conditional that the observed intervals are estimated by a method introduced by Gentleman and Geyer.24 To be kept as a peak after alignment, the peak under investigation had to be present in at least 10% of the spectra. By using the Ciphergen ProteinChip Software 3.1, the signal-to-noise ratio for adding estimated peaks to complete clusters across multiple spectra was set to 1020 and the peak under investigation had to be present in at least 10% of the spectra. Peaks after alignment are referred to as m/z cluster. Variability of the Data. Standard functions of the Bioconductor Project,2 which are available through the statistical software package R,1 were used to access the variability of the data. Therefore, the coefficients of variation (cv’s), defined as the standard deviation divided by the mean, for biological and technical replicates were calculated always taking into account all m/z cluster. For variability of biological replicates, cv’s were calculated for each group and time point separately after having summarized the intensities of corresponding technical replicates to their median. Technical variability was assessed by calculating the median cv for each animal across corresponding technical replicates. Cv’s then were summed up by their median across animals for each group and time point separately. Data Transformation. Intensities of m/z clusters were transformed using the arsinh function (f(x) ) log2(x + sqrt(x2 + 1)) - log2 (2)). This approach stabilizes the variance and is well-defined even for negative intensity values introduced into the dataset by baseline correction.25 Linear Mixed Model. The study includes replicates of biological specimen. Intensity values of each m/z cluster are 256

Journal of Proteome Research • Vol. 5, No. 2, 2006

Beyer et al.

uncorrelated between different biological samples but technical replicates are correlated. Taking into account this correlation structure, for each time point a linear mixed model was applied to the data (1). y ) Xβ + , ≈ N(0,Σ)

(1)

where y is the vector of the arsinh transformed intensity values, X the design matrix (X ) 1n, x1, x2, x3; x1 ) 1: treat ) NNM, 0: treat ) control, x2 ) 1: timepoint ) 25 weeks, 0: timepoint ) 1day, x3 ) x1 * x2), β ) (β0, β1, β2, β3 ) the parameter vector (β0: intercept, β1: time effect for the control group, β2: treatment effect for the first time point (day one), β3: interaction term), and  the error vector. The matrix Σ is block diagonal with blocks corresponding to the individuals and with each block having a compound-symmetry structure. This structure has two unknown parameters, one modeling a common covariance and the other residual variance, estimated separately for each treatment group. The SAS procedure PROC MIXED was used. The covariance matrix and the parameter vector β were estimated by Maximum-Likelihood-Method. The m/z clusters were identified as significantly different between control and treatment-group separately for each time point using 0.005 as cutoff for p-values. With this significance level, less than 1 false positive m/z cluster is expected taking into account all 33 m/z clusters detected with both software tools. Linear Model. Median m/z cluster intensities from technical replicates were calculated. For each m/z cluster a linear model was applied with factors treatment and time. The error terms were assumed to be independent and normally distributed with mean zero and variance σ2. Treatment effects were tested with Wald-Tests. M/z cluster with p-values e 0.005 were selected. The analysis was carried out using library limma from BioConductor Project.2

Results Necropsy and Histopathology. In livers of untreated animals, no alterations were observed. After 1 day of NNM exposure, histologically no visible alterations were found in the liver that could be unequivocally attributed to NNM treatment. After 25 weeks, histopathological examination of the livers revealed liver cell adenomas as well as carcinomas in 4 rats but no neoplastic lesions in 2 rats. A representative HE stained liver section is shown in Figure 1. Quality Control and Preprocessing. Figure 2 shows a heatmap of the raw spectra before baseline correction and normalization for a global overview of the data quality.26 No gross outlying features or general shifts between spectra could be observed, although the intensities of some spectra varied. We focused on careful preprocessing of SELDI-TOF-MS spectra, since altering settings during preprocessing can have a large impact on data analysis results. For measurement of discrepancies between both tools after baseline correction and normalization, the maximum absolute deviation between all m/z values across spectra from all technical and biological replicates was calculated (Table 1). In addition, the mean absolute deviation across all m/z values of all spectra was calculated. The minimum and maximum across all spectra is summarized in Table 1. After baseline correction, larger deviations were detected between both software tools in the molecular range below 2000 Da than above 2000 Da. This is probably due to high variability and numerous irregularities caused by EAM’s (Table 1). There-

Carcinogen Induced Changes in the Rat Liver Proteome

research articles

Figure 2. Heatmap including all 182 spectra of the study. Control and treated samples are indicated for both time points, respectively.

Figure 1. Liver histopathology of rats treated with NNM (HE stain). Left panel: Liver cell adenoma adjacent to apparently normal liver tissue (200×). Right panel: Hepatocellular carcinoma (400×).

fore, low molecular weight parts of the spectra were excluded from further analysis and normalization was carried out including only the mass range from 2000 to 20 000 Da. Since normalization is performed after baseline correction, results after normalization are likely to be biased by results from baseline correction. To assess the maximal mean absolute deviation of 2.52 after comparison of normalization algorithms (Table 1), the median relative intensity of m/z clusters across spectra was calculated and resulted in 23.9 (range: 1.5-48.4) for PROcess and 25.7 (range: 0.8-45.9) for the Ciphergen ProteinChip Software 3.1. The maximum relative intensity was 100 using the PBSII-reader. Numerical quality control was carried out after the first round of normalization employing the function quality of PROcess. Although some spectra showed only m/z values in the low molecular weight range up to 2000 Da only one spectrum was flagged as bad. Normalization factors derived from the Ciphergen ProteinChip Software 3.1 were used to assess m/z intensity distributions within spectra. Nine spectra were excluded from further analysis because their normalization factors were at least larger than the 75% quantile plus 1.5 times the interquartile range of all normalization factors. A second round of normalization was carried out with the remaining spectra.

The mean number of peaks per spectrum was 87 and 21 detected by the Ciphergen ProteinChip Software 3.1 and 30 and 1 detected by PROcess with signal-to-noise thresholds of 2 and 15, respectively. The number of m/z clusters obtained after aligning spectra and adding estimated peaks was 127 and 40 using the Ciphergen ProteinChip Software 3.1 and 53 and 1 using PROcess with signal-to-noise ratios of 2 and 15, respectively. Thirty-three m/z clusters were detected by both approaches. For each software tool, an error of 0.3% on the m/z axis (vendor’s notes) was allowed for peaks to be assigned to the same cluster. As both software tools use slightly different starting points for peak detection, the criteria for designation of clusters being the same between both tools had to be assessed individually. Intervals for each m/z cluster were defined by the exact position of each m/z cluster plus the modulus of 0.3% as maximal error on the m/z axis. These intervals then had to be overlapping between both software tools for m/z values belonging to the same cluster. To assess the consistency of m/z cluster intensities from both software tools, Pearson’s correlation coefficient was calculated. Therefore, intensities of the 33 m/z clusters detected with both software tools across all spectra before numerical quality control during normalization were considered. The median correlation of m/z cluster intensities was 0.93 with a minimum of 0.05 and a maximum of 0.98. Spectra where correlation of intensities was bad could be shown to be those being removed during quality control. The number of detected peaks and m/z clusters between both approaches differed depending on the settings. Regarding signal-to-noise thresholds of 15 for PROcess and 2 for the Ciphergen ProteinChip Software 3.1, discrepancies in the number of m/z clusters were mainly due to the detection of peaks by functions of PROcess in two particular regions of the spectra (15 100 Da to 16 200 and 4440 Da to 5630 Da). The number of detected peaks and resulting m/z clusters can easily be altered by using more stringent signal-to-noise ratios (e.g., mean number of detected peaks ) 26 with signal-to-noise ratio ) 10 for PROcess) as well as by modifying the percentage of spectra in which a peak has to be present in order to be kept after alignment. Journal of Proteome Research • Vol. 5, No. 2, 2006 257

research articles

Beyer et al.

Figure 3. Hierarchical clustering across the animals of time point 25 weeks. Concerning preprocessing with the Ciphergen ProteinChip Software 3.1, the median intensity values per animal were used. Control and treatment groups are marked. Blue is assigned to low, red to high-intensity values.

Variability of the data was determined by calculating the coefficient of variation using m/z cluster intensities from SELDI-TOF-MS spectra which were preprocessed with the Ciphergen ProteinChip Software 3.1. The median coefficient of technical variation was 0.09 (range ) 0.03-1.04) for the control group on day one, 0.09 (range ) 0.02-0.60) for the treatment group on day one, 0.11 (range ) 0.03-1.70) for the control group at 25 weeks and 0.27 (range ) 0.06-2.04) for the treatment group at 25 weeks. We observed a technical variation of 0.41 in liver tissue samples analyzed with 2DE electrophoresis derived from the same animal study (data not shown). The median biological variability was 0.24 (range ) 0.05-0.80) for the control group on day one, 0.27 (range ) 0.09-0.76) for the treatment group on day one, 0.39 (range ) 0.09-1.17) for the control group at 25 weeks and 1.02 (range ) 0.32-1.93) for the treatment group at 25 weeks. The biological variability increased over time and was highest in the treatment group at 25 weeks. Also, for the technical variability a time effect was observed, although the overall technical variability was lower than the biological variability. Increasing technical variability in each group was not related to the molecular weight of m/z clusters (data not shown). Data Analysis. No differentially expressed m/z cluster (pvalues e 0.005 and fold changes e 0.5 or g 2) were observed on day one with a linear mixed model or a linear model analysis. After 25 weeks, the same significant m/z clusters were found with a linear mixed model and linear model analysis. Since the study design includes biological as well as technical replicates, application of a linear mixed model is the more appropriate statistical method. Therefore, only results from the linear mixed model fit will be reported below. Results were visualized with a hierarchical cluster (method ) complete, distance ) euclidean) across m/z cluster intensities of the animals of 25 weeks using median m/z cluster intensities per animal (Figure 3). Patterns observed in the cluster dendrogram concerning m/z cluster intensities of four animals in the treatment group at 25 258

Journal of Proteome Research • Vol. 5, No. 2, 2006

Figure 4. Volcano-plots displaying the fold change estimates and the p-values after fitting a linear mixed model for all detected m/z values after preprocessing with Ciphergen ProteinChip Software 3.1. Considering 25 weeks, fewer differentially expressed m/z values were detected between control group and treatment group without neoplastic lesions (A) than between control animals and treated animals displaying neoplasia (B).

weeks compared to m/z cluster intensities of the remaining two animals could be correlated to histopathology. A linear mixed model was applied after splitting the treatment group into two subsets, four animals with neoplastic lesions (denoted by t_nl) and the remaining two animals with no carcinogenic findings (denoted by t_no_nl). The control group was denoted by c. The volcanoplots in Figure 4 show, that in those animals, which had undergone a progression toward neoplasia, more m/z clusters were significantly regulated than in those animals which did not show neoplastic lesions. It has to be pointed out, that statistical analysis was carried out with groups of small sampling sizes (four animals against two animals). The analysis seems to be suitable since observed biological effects were large, however. With both software tools, nine differentially expressed m/z clusters were detected (fold changes e 0.5 or g 2, which corresponds to log2 ratios e -1 or g 1, and p-values e 0.005) in at least one of the comparisons c/t_no_nl or c/t_nl (Table 2). Differentially expressed m/z cluster from both software tools were identical in their m/z taking into account a shift on the m/z axis due to different starting points for peak detection. Of the nine m/z clusters shown in Table 2, five m/z cluster were regulated to a greater extent in the neoplasia group compared to the remaining two animals of the treatment group at 25 weeks (Table 2). Out of these, three m/z clusters were down and two were up regulated.

Discussion Currently, carcinogenic effects of chemical substances or pharmaceuticals are evaluated after having carried out long-

research articles

Carcinogen Induced Changes in the Rat Liver Proteome

Table 2. Differentially Expressed m/z Values after Pre-processing of the Spectra Either with PROcess or Ciphergen ProteinChip Software 3.1.a Ciphergen

PROcess

log2 ratio m/z value

M8546_62 M8676_42 M8907_84 M9327_21 M9915_61 M7581_35 M11284_0 M13742_1 M13975_2

log2 ratio

c/t_no_nl

c/t_nl

t_no_nl/t_nl

0.01 1.07b -0.56b 1.93d -0.15 -2.91d -0.89b -2.5d -0.57

-2.55d

-2.56d

-2.28d -4.01d -1.84b -4.12d -1.49c 2.40d 2.26c 1.94d

-3.35d -3.46d -3.77d -3.96d 1.42b 3.29d 4.76d 2.51d

m/z value

M8547.05 M8672.27 M8907.28 M9322.73 M9917.02 M7583.57 M11305.6 M13747.4 M13974.4

c/t_no_nl

c/t_nl

t_no_nl/t_nl

-0.15 0.66b -0.55b 1.81d -0.14 -2.98d -0.80b -2.28d -0.65

-1.73d

-1.84d -2.43d -2.72d -3.01d -3.37d 2.10b 3.22d 4.44d 2.48d

-1.77d -3.27d -1.19 -3.51d -0.92 2.42d 2.16d 1.83d

a Fold changes and p-values after fitting a linear mixed model for the contrasts c/t_no_nl, c/t_nl and t_no_nl/t_nl are reported. b p-value e 0.05, cp-value e 0.005, dp-value e 0.0005.

term animal studies (e.g., two year chronic rodent bioassays) by investigating histopathological endpoints. These animal studies are time and cost consuming, and above all, ethically problematic.27 Therefore, it is important to search for reliable and sensitive alternative approaches, which could detect endpoint related changes at early stages of the carcinogenic process. SELDI-TOF-MS technology could meet this expectation, however, effectiveness has to be proven and validated. For this reason, we have evaluated the discriminatory power of SELDI-TOF-MS by detecting endpoint related changes in animals from a carcinogenicity study with severe histopathological findings. A positive result could lead to the routine use of SELDI-TOF-MS as an early screening platform for toxicological-related questions. Data Preprocessing. A molecular approach in toxicology accepted by drug approval agencies would have to deliver robust outcomes. For this reason, we compared results from two software tools after preprocessing of the raw data. Although results for baseline correction and normalization of both tools were nearly the same (Table 1), ‘smoothing’ the baseline is a critical step.28 If the ‘smoothing’ parameter is set too high, wide rather than local changes in intensities will be recognized and signals are likely to get lost. For normalization, we considered intensity signals between 2000 and 20 000 Da as we used sinapinic acid (SPA) as energy absorbing molecule. SPA causes many signals in the molecular range below 2000 Da but allows the display of signals in a wide m/z range up to 20 000 Da. In general, the matrix solvent composition can significantly affect result.29 Both software tools perform global normalization approaches considering all intensity signals in selected m/z ranges. Each spectrum displays signals due to EAM molecules. Independent of the overall intensity distribution in the spectrum, spectra with low intensity signals have to be removed before final normalization. Otherwise, signals from the EAM’s will be adapted to the overall intensity of the spectrum and, accordingly, the intensity distribution will be skewed. Normalization algorithms from the Ciphergen ProteinChip Software 3.1 and PROcess have the same underlying concept and give similar results, but generally it is questionable if global normalization is suitable since the focus lies on identification of differentially expressed proteins.30 Alternative normalization procedures could improve results, e.g., normalizing only to intensities of external spikes. Peak detection and alignment are the most critical steps in preprocessing of SELDI-TOF-MS spectra. To obtain meaningful sets of m/z clusters for ongoing data analysis, it is necessary

to use parameter settings, which are suitable for the biological samples under investigation. As we were analyzing proteins from complex rat liver tissues without pre-fractionation, ion suppression effects between high and low abundant ions have to be taken into account. These effects are well-known for MSbased approaches.31 Consequently, even if low abundant proteins are supposed to carry important information,32,33 these ions may be superposed in our experiments. Therefore, using stringent signal-to-noise ratios excluding low intensity signals may improve reproducibility of results, although results could be biased when concentrating only on high signal-to-noise ratios. The detection of the true maximum intensity of a peak is essential. Otherwise, fold-change estimates will be biased. We observed that using the function getPeaks from PROcess the ‘smoothing’ parameter has to be set to zero to obtain reasonable results (Figure 5). For the Ciphergen ProteinChip Software 3.1 this parameter cannot be controlled. Assigning signal intensities to spectra that do not already have a peak within a given cluster is an issue of importance. Controlling this parameter is difficult in experiments being composed of technical as well biological replicates. The number of detected peaks and m/z clusters can easily be altered and, consequently, the false positive discovery rate can hardly be controlled. An alternative approach for peak alignment is complete linkage hierarchical clustering, as proposed by Tibshirani et al.23 Note, that intensities below signal level can be detected during peak alignment when estimated peaks are added to complete cluster. Data Analysis. Linear models have the underlying idea that the observed data can be explained by a model in which the effects of different influences are additive.34 Applying a linear model, taking into account median intensities across technical replicates or a linear mixed model, with the covariance matrix being compound symmetric, gives quantitatively and qualitatively the same results for both preprocessing approaches. As with the linear mixed model approach the particular correlation structure between replicates can be taken into account, fitting this model to the data seems to be more appropriate, although more parameters have to be estimated. In all cases, the quality of the model fit has to be approved. In general, the assumption of the normal distribution of the error in small datasets, as is always the case in toxicological studies, could be critical. Alternative approaches for data analysis like nonparametric tests might deliver more robust results.35 Correlation of Results from SELDI-TOF-MS to Histopathology. Histopathological outcomes could be supported by Journal of Proteome Research • Vol. 5, No. 2, 2006 259

research articles

Beyer et al.

Figure 5. Typical spectrum from rat liver lysate in the molecular range of 2000 to 20 000 Da and zoomed in the region of approximately 7600 to 8400 Da. Detection of the maximum intensity at a given m/z is inevitable (right panel). Intensities below signal level can be detected during peak alignment when estimated peaks are added to complete cluster. Red and green asterix indicate the intensities quantified by Ciphergen and PROcess, respectively.

results obtained after analysis of SELDI-TOF-MS spectra. Nine significantly regulated m/z clusters after NNM treatment could be detected with SELDI-TOF-MS using both software tools for preprocessing. Differential expression of these m/z clusters after 25 weeks suggests that they are specific carcinogenicity markers/signatures rather than protein profiles related to acute toxic effects. It has to be pointed out here, that absolute quantitation is not possible by SELDI-TOF-MS investigations in general. However, relative quantitation seems to be possible, if differences in intensities are statistically significant and above stringent thresholds. With respect to robust statistical data analysis, one clear hindrance in toxicological studies is that experiments constitute small sample sizes. However, it has to be pointed out that in the study under investigation here outbred Wistar rats were used. These animals display much smaller biological variability than patient samples in clinical studies for which SELDI-TOF-MS often is utilized and for which standardization and validation approaches are underway.36

Conclusions Concerning preprocessing of SELDI-TOF-MS spectra, baseline correction, and normalization algorithms implemented in two available software tools give fairly the same results. However, the most critical issue concerning preprocessing is peak detection and alignment of peaks across spectra. Our results are comprised of significantly regulated m/z clusters between control and treatment groups visualized in a hierarchical cluster. These m/z clusters can be taken as a set of endpoint markers for the carcinogenetic effects of NNM. Moreover, our findings demonstrate the possibility to discriminate between different stages of NNM-related hepatocarcinogenicity with SELDI-TOF-MS. Although it seems to be possible in general to identify biomarkers,37 the biochemical properties of some proteins may circumvent the identification process. However, we believe that the information, which is carried by protein sets comprised of m/z clusters rather than single, wellknown molecules is large enough for discrimination. Our data show proof of concept for the endpoint of cancer. The power for prediction of early proteomic changes is under further investigation. 260

Journal of Proteome Research • Vol. 5, No. 2, 2006

Acknowledgment. The studies were supported by grants from the Bundesministerium fu¨r Bildung und Forschung of the Federal Republic of Germany PTJ-BIO 0312618 and 0312619. We thank two anonymous reviewers for their valuable comments. References (1) R Development Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, http://www. R-project.org, 2005. (2) Gentleman, R.; Carey, V. J.; Bates, D. J.; Bolstad, B.; Dettling, M.; Dudoit, S. Genome Biol. 2004, 1. (3) Bandara, L. R.; Kennedy S. Drug Discov. Today 2002, 7, 411418. (4) Waring, J. F.; Ciurlionis, R.; Jolly, R. A.; Heindel, M.; Ulrich, R. G. Toxicol. Lett. 2001, 120, 359-368. (5) Kennedy, S. Biomarkers 2002, 7, 269-290. (6) Banks, D. Chance 2003, 16, 6-7. (7) Moyses, C. Int. J. Pharm. Med. 1999, 13, 197-202. (8) Steiner, S.; Anderson, N. L. Toxicol. Lett. 2000, 112, 467-471. (9) Weber, E.; Bannasch, P. Carcinogenesis 1994, 15, 1227-1234. (10) Hutchens, T. W.; Yip, T. T. Rapid Commun. Mass Spectrom. 1993, 7, 576-580. (11) Grizzle, W. E.; Adam, B. L.; Bigbee, W. L.; Conrads, T. P.; Carroll, C.; Feng, Z.; et al. Dis. Markers 2003, 2004, 185-195. (12) Merchant, M.; Weinberger, S. R. Electrophoresis 2000, 21, 11641167. (13) Petricoin, E. F.; Ardekani, A. M.; Hitt, B. A.; Levine, P. J.; Fusaro, V. A.; Steinberg, S. M. et al. Lancet 2002, 359, 572-577. (14) Baggerly, K. A.; Morris, J. S.; Coombes, K. R. Bioinformatics 2004, 20, 777-785. (15) Liotta, L. A.; Lowenthal, M.; Mehta, A.; Conrads, T. P.; Veenstra, T. D.; Fishman, D. A.; Petricoin, III, E. F. J. Nat’l Cancer Inst. 2005, 97, 310-314. (16) Yasui, Y.; Randolph, T.; Feng, Z. In: Handbook of Statistics in Clinical Oncology, 2nd ed.; Marcel Dekker Inc.: New York, 2005. (17) Petricoin, E. F.; Liotta, L. A. Curr. Opin. Biotechnol. 2004, 15, 2430. (18) Fella, K.; Glu ¨ ckmann, M.; Hellmann, J.; Karas, M.; Kramer, P.-J.; Kro¨ger, M. Proteomics 2005, 5, 1914-1927. (19) Baggerly, K. A.; Morris, J. S.; Wang, J.; Gold, D.; Xiao, L.-C.; Coombes, K. R. Proteomics 2003, 3, 1667-1682. (20) Li, Y.; BioConductor Vignette ‘PROcess’, www.bioconductor.org. (21) Ciphergen User Manual 2.0, Ciphergen Biosystems, Inc., Freemont, USA. (22) Yasui, Y.; Pepe, M.; Thompson, M. L.; Adam, B. L.; Wright, G. L.; Qu, Y. Biostatistics 2003, 4, 449-463. (23) Tibshirani, R.; Hastie, T.; Narasimhan, B.; Soltys, S.; Shi, G.; Koong, A.; Le, Q. T. Bioinformatics 2004, 20, 3034-3044. (24) Gentleman, R.; Geyer, C. Biometrika 1994, 82, 618-632.

research articles

Carcinogen Induced Changes in the Rat Liver Proteome (25) Huber, W.; von Heydebreck, A.; Su ¨ ltmann, H.; Poustka, A.; Vingron, M. Bioinformatics 2002, 18 suppl. 1, 96-104. (26) Hawkins, D. M.; Wolfinger, R. D.; Liu, L.; Young, S. S. Chance 2003, 16, 19-23. (27) Cohen, S. M. Human Carcinogenic Risk Evaluation: Toxicol. Sci. 2004, 80, 225-229. (28) Eilers, P. H.; de Menezes, R. X. Bioinformatics 2005, 21, 11461153. (29) Cohen, S. L.; Chait, B. T. Anal. Chem. 1996, 68, 31-37. (30) Bolstad, B. M.; Irizarry, R. A.; Astrand, M.; Speed, T. P. Bioinformatics 2003, 19, 185-193. (31) Annesley, T. M. Clin. Chem. 2003, 49, 1041-1044. (32) Lowenthal, M. S.; Mehta, A. I.; Frogale, K.; Bandle, R. W.; Araujo, R. P.; Hood, B. L. et al. Clin. Chem. 2005, 51, 1933-1945.

(33) Lopez, M. F.; Mikulskis, A.; Kuzdzal, S.; Bennett, D. A.; Kelly, J.; Golenko, E. et al. Clin. Chem. 2005, 51, 1946-1954. (34) Altman, D. G. Practical Statistics for Medical Research; Chapman & Hall: New York, 1991. (35) Brunner, E.; Domhof, S.; Langer, F. Nonparametric Analysis of Longitudinal Data in Factorial Designs; Wiley: New York, 2001. (36) Grizzle, W. E.; Semmes, O. J.; Basler, J.; Izbicka, E.; Feng, Z.; Kagan, J. et al. Urol. Oncol. 2004, 22, 337-343. (37) Zhang, Z.; Bast, R. C.; Yu, Y.; Li, J.; Sokoll, L. J.; Rai, A. J. et al. Cancer Res. 2004, 64, 5882-5890.

PR050279O

Journal of Proteome Research • Vol. 5, No. 2, 2006 261