Impact of Replicate Types on Proteomic Expression Analysis

Natasha A. Karp,† Matthew Spencer,‡ Helen Lindsay,§ Kevin O’Dell,§ and Kathryn S. Lilley*,†

Biochemistry Department, University of Cambridge, Cambridge, England, Department of Mathematics and Statistics and Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada, and IBLS Division of Molecular Genetics, University of Glasgow, Glasgow, Scotland

Received April 1, 2005

In expression proteomics, the samples utilized within an experimental design may include technical, biological, or pooled replicates. This manuscript discusses various experimental designs and the conclusions that can be drawn from them. Specifically, it addresses the impact of mixing replicate types on the statistical analysis that can be performed. This study focuses on difference gel electrophoresis (DiGE), but the issues are equally applicable to all quantitative methodologies assessing relative changes in protein expression.

Keywords: expression analysis • proteomics • differential gel electrophoresis • replication • nested ANOVA

* To whom correspondence should be addressed. E-mail: KSL23@cam.ac.uk.
† University of Cambridge. ‡ Dalhousie University. § University of Glasgow.
Journal of Proteome Research 2005, 4, 1867-1871. Published on Web 08/17/2005. DOI: 10.1021/pr050084g. © 2005 American Chemical Society.

Introduction

Proteomics is the study of ‘the entire protein complement expressed by a genome in a cell or tissue type’ and is a major tool in the post-genomics era (1). Expression proteomics is the comparison of distinct proteomes (e.g., control versus treatment or control versus disease) to identify protein species with changes in expression. Many quantitative techniques are in use or under development to compare samples from one state with another. These include stable isotope labeling, either in vitro (2-4) or in vivo (5), and 2D gel electrophoresis with post-staining (6,7) or pre-labeling (8,9). Each technique has strengths and weaknesses and plays a complementary role in proteomics.

The design of an experiment is crucial to the type of analyses that can be utilized, the types of questions that can be addressed, and the robustness of the results obtained. Early expression studies compared one sample with another, and the analyses were restricted to looking for changes above a threshold determined by the system’s experimental noise (3,5,9). This method of analysis limits the sensitivity of the system, as biologically relevant changes smaller than the threshold cannot be detected. For example, with difference gel electrophoresis (DiGE) in a pairwise comparison, the sensitivity is limited to changes above a 2-fold threshold (10). Furthermore, any changes in expression identified are changes between the particular samples compared within the study, and the assumption is made that any given sample is representative of the group of samples from which it is drawn.

Recent developments in quantitative proteomic techniques have allowed multiple samples to be compared simultaneously. For example, the recently developed isobaric iTRAQ labeling system allows the multiplexing of up to four samples in a single experiment and the collection of several data points for each protein (11). In the 2D gel electrophoresis field, the development of spectrally resolvable CyDyes, together with the use of an internal standard sample, allows multiplexing of samples and quantitation across a gel series (12). With the collection of multiple data points, statistical methods can then be utilized to identify changes in protein expression with increased confidence. This can be achieved by a variety of different experimental designs. The experimental design, however, affects the conclusions that can be drawn from any quantitative analysis. Careful planning is thus essential to maximize the information gained from an experiment.

In particular, there seems to be a certain degree of confusion in the proteomics community about the nature of replication. Peng et al. have recently discussed a similar problem with microarray data (13). Replicates are of two main types. Technical replicates are repeated measures from the same biological sample. Biological replicates are different samples from the same treatment group. The type of replicate used affects the type of statistical analysis that can be carried out and the conclusions that can be drawn. This manuscript discusses issues concerning replicate types and provides some guidelines for the analysis of proteomic data. All of the ideas involved are well-established in other fields but have not been extensively applied to proteomics.

Univariate methods such as the Student’s t-test are used to detect significant changes in the expression of individual proteins (a variable with a number of observations) (12,14). Multivariate methods, e.g., principal components analysis, utilize all of the data (all observations for all variables) simultaneously to look for patterns in expression changes. The univariate approach will identify protein species that exhibit significant changes in expression, while the multivariate approaches can detect more subtle changes across sets of proteins that work in concert, such as those that might be involved in a specific cellular pathway or function. Both approaches have a role to play in data analysis. The univariate method, however, is the simplest to interpret and the most commonly used; consequently, this study focuses on the use of univariate methods for data analysis.

The issues surrounding replicate types are illustrated within this manuscript by considering an expression study comparing mutant and wild-type Drosophila. Quantitative expression data were obtained using difference gel electrophoresis (DiGE), developed by Ünlü et al. (15), where samples are labeled with spectrally resolvable CyDyes (GE Healthcare, Sweden) prior to electrophoresis. A mix of biological and technical replicates was obtained from pooled samples, and the proteins identified as changing under different statistical tests were compared. The standard statistical tests rely on the assumption that each measurement is an independent sample. Here we argue that this assumption is likely to be unfounded, as technical replicates from the same biological replicate are not independent and consequently give similar measurements (16). An alternative statistical test based on a nested analysis of variance (ANOVA) is thus proposed, and the impact of the incorrect method in overestimating the significance is demonstrated. The influence of different types of replicates on the robustness of conclusions that can be drawn from such experiments is also discussed.

Experimental Section

Experimental Design and Sample Preparation. The study compared mutant and wild-type strains of Drosophila melanogaster, where the mutant arises from a missense mutation in an ion channel gene leading to a temperature-sensitive paralysis. Flies were grown in one incubation chamber and harvested simultaneously. To extract the soluble protein, whole adult male flies between 1 and 2 days old were homogenized in 150 µL of lysis buffer (10 mM Tris pH 8.0, 5 mM magnesium acetate, 8 M urea, and 2% 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate) and incubated on ice, and the supernatant was collected after centrifugation at 13 000 rpm for 10 min at 4 °C. The protein concentration was determined using the Bio-Rad DC protein assay as described by the manufacturer (Bio-Rad, UK). As an individual fly yielded only a small quantity of protein, biological replicates were formed by pooling 12 flies. Three biological replicates were prepared for each group (mutant versus wild-type). Technical replicates were obtained by three repeated measures on each biological replicate, for a total of 18 data points (9 per group).

Difference Gel Electrophoresis. Difference gel electrophoresis (DiGE) was performed as previously described (12). In short, individual protein samples were minimally labeled with Cy3 or Cy5 (GE Healthcare, Sweden). A protein pool consisting of all protein samples included in the study was generated for use as an internal standard and was minimally labeled with Cy2. Proteins labeled with Cy2 (pool), Cy3, and Cy5 were mixed and separated by isoelectric focusing (IEF) using immobilized pH gradient (IPG) DryStrips, pH 3-10 (GE Healthcare, Sweden), according to the manufacturer’s instructions. Proteins were further separated according to molecular weight using 12% SDS-polyacrylamide gels. Following electrophoresis, labeled proteins were visualized by scanning the gels at 100 µm resolution and at the appropriate wavelengths for Cy2, Cy3, and Cy5 fluorescence using the Typhoon 9400 (GE Healthcare, Sweden). A random design with a dye-mix approach was used to avoid experimental artifacts (17). Quantitative analysis was completed with DeCyder version 4 (GE Healthcare, Sweden). After matching across gels, spot volumes were exported, and the log standardized abundance (the logarithm of the ratio of the normalized sample spot volume to the standard spot volume) was used for statistical analysis with the software package SPSS 11.5 (SPSS, USA).

Data Analysis. The log standardized abundance was used as the log transformation improves normality (18). In the one-way ANOVA (equivalent to a Student’s t-test when comparing two groups), the log standardized abundance was defined as the dependent variable and status (i.e., wild-type versus mutant) as the controlled factor. In the nested ANOVA, log standardized abundance was defined as the dependent variable and status as the controlled factor, with a random factor of biological replicate nested within status. Type III sums of squares were used, as recommended by Maxwell and Delaney for this model (19). Pooling of mean squares was not used, as debate exists over the value and desirability of pooling and, for the majority of spots, the conservative set of rules for pooling described by Sokal and Rohlf was not met (20). The study focused on 40 spots chosen at random across the master gel, which were matched across all gels in the experiment and spanned a variety of spot volumes.
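The two models described under Data Analysis can be sketched for a single spot as follows. This is a minimal illustration in plain Python, assuming a balanced design; it is not the SPSS procedure used in the study. The function name `oneway_and_nested_f` and the abundance values are hypothetical and were made up for the example.

```python
from statistics import mean


def oneway_and_nested_f(data):
    """Return (F, df_num, df_den) for the one-way and the nested ANOVA.

    data maps each treatment to a list of biological replicates, each of
    which is a list of technical-replicate measurements (balanced design).
    """
    treatments = list(data)
    a = len(treatments)              # treatments (status: wild-type, mutant)
    b = len(data[treatments[0]])     # biological replicates per treatment
    n = len(data[treatments[0]][0])  # technical replicates per biological replicate

    grand = mean(y for t in treatments for rep in data[t] for y in rep)
    t_mean = {t: mean(y for rep in data[t] for y in rep) for t in treatments}

    # Sums of squares for treatment, biological replicate within treatment,
    # and technical replicate within biological replicate.
    ss_treat = b * n * sum((t_mean[t] - grand) ** 2 for t in treatments)
    ss_bio = n * sum((mean(rep) - t_mean[t]) ** 2
                     for t in treatments for rep in data[t])
    ss_tech = sum((y - mean(rep)) ** 2
                  for t in treatments for rep in data[t] for y in rep)

    df_treat, df_bio, df_tech = a - 1, a * (b - 1), a * b * (n - 1)
    ms_treat = ss_treat / df_treat

    # One-way ANOVA: every measurement is treated as independent, so the
    # error term pools biological and technical variation.
    f_oneway = ms_treat / ((ss_bio + ss_tech) / (df_bio + df_tech))
    # Nested ANOVA: treatment is tested against the variation among
    # biological replicates within a treatment.
    f_nested = ms_treat / (ss_bio / df_bio)

    return {"oneway": (f_oneway, df_treat, df_bio + df_tech),
            "nested": (f_nested, df_treat, df_bio)}


# Made-up log standardized abundances for one spot (illustration only):
# 2 groups x 3 biological replicates x 3 technical replicates.
example = {
    "wild-type": [[-0.01, 0.00, 0.01], [0.09, 0.10, 0.11], [0.19, 0.20, 0.21]],
    "mutant": [[0.29, 0.30, 0.31], [0.39, 0.40, 0.41], [0.49, 0.50, 0.51]],
}
result = oneway_and_nested_f(example)
print(result)
```

With these made-up values the one-way F is about 53.5 on (1, 16) degrees of freedom, whereas the nested F is 13.5 on (1, 4). Against the tabulated 1% critical values (about 8.53 and 21.20, respectively), the spot would be called significant by the one-way test but not by the nested test.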

Results and Discussion

Design Strategies. Replication is central to experimental design and allows more robust data analysis. Replicates can be split into two main types: technical and biological replicates. Technical replicates, also called repeated measures, address the measurement error or noise in the experiment. For 2D gels, a technical replicate would be obtained by running another gel with the same sample. Alternatively, repeating the extraction from a sample and running a second gel would also be a type of technical replicate. By taking multiple measurements, the uncertainty about the true reading for a given sample is reduced. Examples of technical noise that can affect difference gel electrophoresis (DiGE) include noise within the imaging process (caused by factors such as dust), irreproducibility of sample preparation, and variation in 2D gel running parameters.

Biological replicates are individuals of a particular variety or type and correspond to replication in the classical statistical sense. The random sampling of biological replicates from populations subjected to different treatments allows inferences to be made about the effect of the treatments relative to the biological noise of the system. Biological variability is intrinsic to all organisms and can arise from genetic or environmental factors. Designing an experiment that employs biological replicates addresses the issue of biological variability, as any significant change in expression can be considered significant above biological noise. In many proteomic studies where genetically defined populations do not exist, for example in human proteomics, the use of biological replicates is essential to identify changes in protein expression that are significant above the natural variation observed within a population. An example of such a study is that of Prabakaran and co-workers, who were searching for protein expression changes related to schizophrenia (21). Biological replicates also encompass the technical noise in a given system. Without an assessment of biological variation, no inferences about the differences between populations can be made, unless the unrealistic assumption is made that there is no biological variability. In an experiment where only technical replicates are used, all that can be concluded is that the individuals sampled differ; such differences may suggest, but cannot establish, biologically relevant changes.

An alternative approach to using many individual organisms or individually prepared cultures is to use pooled samples, with equal contribution from each sample forming the pool. This may be necessary when insufficient material is obtained from an individual, or when the number of possible replicates is limited (e.g., by cost) and the biological variance is high. When few replicates are available and the variance is high, the experiment will have low power (ability to detect changes in expression) and hence will be ineffective in detecting expression changes. Pooling of randomly selected biological replicates reduces the biological variance by forming an average sample. If multiple technical replicates are obtained from a single pooled sample, then the only conclusion possible is that the average sample exhibits a significant change above technical noise. If pooling is required, then it is better to use several small pools, so that the variance among pooled samples within treatments can be estimated (22). Sample pooling should not be used if the researcher wishes to correlate expression with a variable measured at the subject level, as this information will have been lost in the pooling process (13). For example, a researcher who pools samples from diseased patients loses the option of finding proteins that differentiate subtypes of the disease.

Figure 1. Structure of replicates for the two types of ANOVA used. (A) In a one-way ANOVA, each replicate is treated as equivalent. (B) In a nested ANOVA, a hierarchical structure is utilized that nests the technical replicates beneath the relevant biological replicate. Each biological replicate is represented by a number (1 to 3) and each technical replicate of that biological replicate by a subscript letter (A to C).

Mixing Replicate Types. The Student’s t-test (equivalent to a one-way ANOVA) is the statistical test most frequently used to identify proteins with significant changes in expression from one condition to another. Underlying the test of significance is the assumption that the samples are independently drawn from normal distributions with the same variance in both groups (homogeneity of variance). In an experiment where multiple replicate types have been obtained, for example both biological and technical replicates, the sampling is not independent, as the technical replicates from a biological replicate will have similar results.
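The loss of independence can be made explicit with a simple variance-components model. The notation below is introduced here for illustration only and assumes independent, normally distributed effects: $y_{ijk}$ is the log standardized abundance of technical replicate $k$ from biological replicate $j$ in treatment $i$, $\sigma_B^2$ is the biological variance, and $\sigma_T^2$ is the technical variance.

```latex
y_{ijk} = \mu + \alpha_i + B_{j(i)} + \varepsilon_{k(ij)}, \qquad
B_{j(i)} \sim N(0, \sigma_B^2), \quad
\varepsilon_{k(ij)} \sim N(0, \sigma_T^2)

\operatorname{Corr}\left(y_{ijk},\, y_{ijk'}\right)
  = \frac{\sigma_B^2}{\sigma_B^2 + \sigma_T^2}, \qquad k \neq k'
```

Under this model, two technical replicates of the same biological replicate share the term $B_{j(i)}$ and are therefore positively correlated whenever $\sigma_B^2 > 0$. The correlation approaches 1 when the technical variance is small relative to the biological variance, which is the situation in which treating technical replicates as independent samples is most misleading.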
The leading software packages utilized in DiGE data analysis, for example Progenesis (Non-Linear Dynamics, UK) and DeCyder (GE Healthcare, Sweden), cannot perform more complex univariate tests that take the mix of replicate types into account. The ANOVA is a univariate statistical test that can be extended to such situations. Underwood provides a good explanation of the ANOVA and of how it can be used to model more complex experimental designs (16).

An appropriate alternative to the standard Student’s t-test is a nested ANOVA, in which a mix of technical and biological replicates can be used. A nested ANOVA takes into account the relationship between technical replicates drawn from the same biological replicate when looking for significant changes. Figure 1 provides a graphical representation of the hierarchy considered in the one-way ANOVA versus the nested ANOVA. In a nested analysis, the test assesses whether the variance due to the treatment is greater than the variance among biological replicates within a treatment. To investigate the impact of mixing replicate types under the traditional and the adjusted approach to data analysis, an expression study comparing wild-type with mutant samples was completed with a mixture of biological and technical replicates, as described in the Experimental Section. The p-values for forty randomly chosen spots obtained from a one-way ANOVA (equivalent to a Student’s t-test when comparing two groups), which ignores the difference in replicate type, were compared with those from a nested ANOVA, in which the hierarchical structure was considered. The one-way ANOVA gave lower p-values than the nested ANOVA for the majority of protein spots studied (Figure 2) and thus overestimated the significance of the difference between the groups. If such an underestimated p-value drops below the significance threshold, typically 0.01 in expression proteomics studies, the result is a false positive (a change in expression incorrectly called significant). The size of the difference in p-values depends on the difference between the technical and biological variance: the more similar the biological variance is to the technical variance, the smaller the difference in p-values. This study demonstrates that mixing replicate types without the appropriate analysis will often lead to false positives.
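Why the one-way test tends to overstate significance can be seen from the expected mean squares of a balanced design. The following is a textbook-style sketch, not a derivation taken from the study: $a$ treatments, $b$ biological replicates per treatment, and $n$ technical replicates per biological replicate are assumed, with biological variance $\sigma_B^2$, technical variance $\sigma_T^2$, and treatment effects $\alpha_i$ summing to zero (all symbols introduced here for illustration).

```latex
\begin{aligned}
E\left[MS_{\text{treatment}}\right]
  &= \sigma_T^2 + n\,\sigma_B^2 + \frac{bn}{a-1}\sum_{i=1}^{a}\alpha_i^2 \\
E\left[MS_{\text{bio(treatment)}}\right]
  &= \sigma_T^2 + n\,\sigma_B^2 \\
E\left[MS_{\text{pooled error}}\right]
  &= \sigma_T^2 + \frac{n(b-1)}{bn-1}\,\sigma_B^2
\end{aligned}

F_{\text{nested}} = \frac{MS_{\text{treatment}}}{MS_{\text{bio(treatment)}}}
  \ \text{on}\ \bigl(a-1,\ a(b-1)\bigr)\ \text{d.f.}, \qquad
F_{\text{one-way}} = \frac{MS_{\text{treatment}}}{MS_{\text{pooled error}}}
  \ \text{on}\ \bigl(a-1,\ a(bn-1)\bigr)\ \text{d.f.}
```

For the design used here ($a = 2$, $b = 3$, $n = 3$), the nested test has (1, 4) degrees of freedom and the one-way test (1, 16). The pooled error term carries the biological variance with a coefficient of $6/8 = 0.75$ rather than 3, so under the null hypothesis the one-way denominator is too small whenever $\sigma_B^2 > 0$, and its degrees of freedom are inflated. Both effects push the one-way p-value downward; when $\sigma_B^2$ is negligible the two denominators agree in expectation, and the remaining gap comes from the degrees of freedom alone.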
Nested ANOVA, in common with other ANOVA designs and the Student’s t-test, requires that the errors are normally distributed with the same variance in different treatments (homogeneity of variance). Karp and Lilley (18) have demonstrated that technical replicates in DiGE lead to data that are normally distributed, with no evidence of heterogeneity of variance.


Figure 2. Comparison of the one-way ANOVA p-values with the nested ANOVA p-values for 40 protein spots obtained in an expression study with a mixture of biological and technical replicates. The diagonal line represents the results if the two tests gave identical p-values. The majority of points lie above this equivalence line, showing that a higher p-value was frequently obtained with the nested ANOVA. The one-way ANOVA therefore overestimates the significance of the difference between the groups, increasing the number of false positives. The dotted lines show the 0.01 threshold typically used to assess significance. With the one-way ANOVA, 23 spots had a p-value below the threshold (to the left of the vertical dotted line), while the nested ANOVA gave three significant spots (below the horizontal dotted line).

Normal quantile plots (Q-Q plots) of the residuals from the nested ANOVA confirmed that errors were approximately normal. However, in the nested ANOVA some spots had higher variance in one treatment than in the other (assessed using the Levene Test for Equality of Variances) (23). Some of the problems with unequal variance were related to gel running issues that produced an outlier value. For example, in some cases a single spot was observed in some subgroups but was resolved into distinct spots in others, and this effect increased the noise. The treatment under study could also be a source of unequal variance; current approaches in the field assume that the treatment affects only the level of expression and not its variance. Research has shown that the ANOVA is ‘robust’ to violations of its assumptions, tending to give conservative answers: the probability of type I errors remains low, at the cost of reduced power (24-27). An exception to this is when the average signal is correlated with the variance. Karp and Lilley have demonstrated that the technical variance does not correlate with signal strength (18). Thus, the test will be robust even in situations where homogeneity of variance does not hold.

Alternative methods, such as the Student’s t-test for unequal variance, could be utilized for spots identified with heterogeneous variance. For repeated measures experiments, the test can be used with data averaged over technical replicates. Justifying this approach when the heterogeneity could be arising from outliers is difficult, as outlier removal may be a more appropriate adjustment to the data analysis. Further investigation with more replicates is needed to tease out the issues involved. To complicate issues further, changes in variance from a treatment might be biologically significant even when no significant changes in mean are detected. Regardless of the remaining difficulties, the data demonstrate that a nested model for identifying significant changes is a more appropriate approach when a mix of replicate types is used.

With the experimental design utilized in this study, the nested ANOVA identifies spots with changes above the technical noise from the DiGE process, above the technical noise in the extraction of the sample from each pool, and above the biological noise remaining after the pooling approach. As 12 flies were used to form each pool, there will be some component of biological noise, but the influence of biological outliers has been reduced. Thus, the changes are seen in the average sample formed by the pooling process. This pooling approach has advantages when only a few replicates are possible (e.g., due to financial cost or low protein yield). Given that recent studies have shown that technical variance within the DiGE system is low (18), it could be argued that an appropriate experimental design should focus on employing only biological replicates. For example, one could use biological replicates only and increase the biological element by preparing each pooled sample from fewer flies (protein yield permitting). Focusing on biological replicates would increase the ability to detect small changes in expression above the biological noise. In studies with other quantitative techniques, however, such as single-stain 2D gel electrophoresis systems (18), the technical noise is higher than with DiGE. The value of technical replicates therefore increases, and in these cases a nested analysis utilizing a mix of replicates would be appropriate. In a study with high technical noise from sample preparation, such as those that involve pre-fractionation of proteins, confidence in the biological significance of the changes could be increased by the use of technical extraction replicates with a nested ANOVA to detect significant changes.
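The fallback mentioned above for spots with heterogeneous variance, averaging over technical replicates and then applying a t-test for unequal variances, can be sketched as follows. This is a minimal illustration using Welch's formulation with the Welch-Satterthwaite degrees of freedom; the function name `welch_on_replicate_means` and the abundance values are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_on_replicate_means(group_a, group_b):
    """Welch t statistic and degrees of freedom on biological-replicate means.

    Each group is a list of biological replicates; each biological
    replicate is a list of technical-replicate measurements.
    """
    # Collapse the technical replicates so that each biological replicate
    # contributes one (independent) value.
    means_a = [mean(rep) for rep in group_a]
    means_b = [mean(rep) for rep in group_b]

    # Squared standard errors of the two group means (sample variances).
    se2_a = variance(means_a) / len(means_a)
    se2_b = variance(means_b) / len(means_b)

    t = (mean(means_a) - mean(means_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(means_a) - 1) + se2_b ** 2 / (len(means_b) - 1))
    return t, df


# Made-up log standardized abundances with unequal spread between groups.
wild_type = [[-0.01, 0.00, 0.01], [0.09, 0.10, 0.11], [0.19, 0.20, 0.21]]
mutant = [[0.19, 0.20, 0.21], [0.39, 0.40, 0.41], [0.59, 0.60, 0.61]]
t_stat, welch_df = welch_on_replicate_means(wild_type, mutant)
print(t_stat, welch_df)
```

Averaging first means the test sees three values per group rather than nine, so the degrees of freedom reflect the number of biological replicates. The p-value would then be read from a t distribution with the (generally non-integer) degrees of freedom returned.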

Conclusion

Developments in quantitative proteomics have allowed multiple data points to be collected in a single study. This increases the complexity of both the experiment and the data analysis but allows more robust interpretation. In many studies, it is now possible to generate different replicate types; their strengths, weaknesses, and implications for the conclusions drawn have been discussed here. Ideally, biological replicates should be utilized, which leads to changes in expression being identified above the biological variability in the population. However, in experiments with low sample yield, using individual biological replicates is not always possible, and in experiments with cost restrictions, and hence few replicates, biological replicates might not be appropriate, as the resulting low power could prevent any significant changes from being detected. An alternative is to use pooled samples; any expression changes then identified are changes in the average sample, seen above the biological noise remaining after pooling. Pooling reduces the biological variance, increasing the power to detect changes in expression in the average sample formed. The use of only technical replicates leads to expression changes in the sample being identified above the technical noise of the experiment, which is less robust as biological noise is not taken into account. Traditionally, little attention has been given to the types of replicate used in a proteomic expression study, and at times researchers have mixed replicate types in the same study without adapting the statistical test utilized. The underlying assumption of independent sampling no longer holds when replicate types are mixed, and hence the standard tests (Student’s t-test or one-way ANOVA) are no longer appropriate.


Within this manuscript, it has been demonstrated that treating technical replicates as if they were biological replicates will increase the false positive rate (false calls of significance). Nested designs are thus essential to account for the dependency among technical replicates when a repeated measures design is utilized. Nested designs also arise in many other experimental settings, whenever subgroups are randomly chosen. Consider a study of mutant versus wild-type samples using genetically identical mice, where multiple mice are sampled from different growth chambers. For this experiment, the data for each mouse should be nested within each chamber. With an experiment involving genetically identical cultured cells, each plate could be considered a biological replicate, with gel repeats as the technical replicates (splitting the harvested cell pellet and separately extracting protein from the pellet portions would also constitute a technical replicate). In a study with extraction replicates and gel replicates, a model with the gel replicates nested within the extraction replicates would be required, as the gel replicates from the same extraction replicate will tend to be similar. With the low technical variance seen with DiGE, the value of mixing replicate types can be questioned. A simpler approach, which focuses the resources on the highest source of variance (i.e., biological replicates), might be more appropriate. This will, however, depend on the number of replicates possible for the researcher’s experiment. If biological noise is high and the number of possible replicates is low, then pooled replicates, which retain some of the biological noise, will give more meaningful results than technical replicates alone while maintaining the power to detect changes. For DiGE users who wish to utilize the commercial software in its current format, the conclusion of this study is that the user should not mix replicate types. For other quantitative expression techniques, the technical variance has yet to be assessed; hence, a nested design that includes technical replicates might be necessary.
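The extraction-and-gel example above can be written as a three-level nested model. The sketch below is illustrative only: the symbols are introduced here, a balanced design with independent random effects is assumed, and the biological replicate level is included as an assumption about how such a study might be laid out.

```latex
y_{ijkl} = \mu + \alpha_i + B_{j(i)} + X_{k(ij)} + \varepsilon_{l(ijk)}

B_{j(i)} \sim N(0, \sigma_B^2), \quad
X_{k(ij)} \sim N(0, \sigma_X^2), \quad
\varepsilon_{l(ijk)} \sim N(0, \sigma_G^2)
```

Here $\alpha_i$ is the treatment effect, $B_{j(i)}$ the biological replicate within treatment, $X_{k(ij)}$ the extraction replicate within biological replicate, and $\varepsilon_{l(ijk)}$ the gel replicate within extraction. In a fully nested balanced design of this kind, each level is tested against the mean square of the level immediately beneath it, so the treatment effect is still judged against the variation among biological replicates rather than against the gel-to-gel noise.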

Acknowledgment. This work was supported by a BBSRC Grant (BB/C50694/1), which also funds Dr. N. Karp as a BBSRC research associate. Dr. M. Spencer is supported by the Arts & Humanities Research Board and Miss H. Lindsay is funded by a BBSRC Special Research Studentship (02/B1/S/08145) awarded to Dr. K. O’Dell.

References

(1) Wasinger, V. C.; Cordwell, S. J.; Cerpa-Poljak, A.; Yan, J. X.; Gooley, A. A.; Wilkins, M. R.; Duncan, M. W.; Harris, R.; Williams, K. L.; Humphery-Smith, I.; et al. Progress with gene-product mapping of the Mollicutes: Mycoplasma genitalium. Electrophoresis 1995, 16 (7), 1090-1094.
(2) Gygi, S. P.; Rist, B.; Gerber, S. A.; Turecek, F.; Gelb, M. H.; Aebersold, R. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat. Biotechnol. 1999, 17 (10), 994-999.
(3) Zhou, H.; Ranish, J. A.; Watts, J. D.; Aebersold, R. Quantitative proteome analysis by solid-phase isotope tagging and mass spectrometry. Nat. Biotechnol. 2002, 20 (5), 512-515.
(4) Yao, X.; Freas, A.; Ramirez, J.; Demirev, P. A.; Fenselau, C. Proteolytic 18O labeling for comparative proteomics: model studies with two serotypes of adenovirus. Anal. Chem. 2001, 73 (13), 2836-2842.
(5) Everley, P.; Krijgsveld, J.; Zetter, B.; Gygi, S. Quantitative cancer proteomics: stable isotope labeling with amino acids in cell culture (SILAC) as a tool for prostate cancer research. Mol. Cell. Proteomics 2004, 3 (7), 729-735.
(6) Fievet, J.; Dillmann, C.; Lagniel, G.; Davanture, M.; Negroni, L.; Labarre, J.; de Vienne, D. Assessing factors for reliable quantitative proteomics based on two-dimensional gel electrophoresis. Proteomics 2004, 4 (7), 1939-1949.
(7) Smejkal, G. B.; Robinson, M. H.; Lazarev, A. Comparison of fluorescent stains: relative photostability and differential staining of proteins in two-dimensional gels. Electrophoresis 2004, 25 (15), 2511-2519.
(8) Yan, J. X.; Devenish, A. T.; Wait, R.; Stone, T.; Lewis, S.; Fowler, S. Fluorescence two-dimensional difference gel electrophoresis and mass spectrometry based proteomic analysis of Escherichia coli. Proteomics 2002, 2 (12), 1682-1698.
(9) Hu, Y.; Wang, G.; Chen, G. Y.; Fu, X.; Yao, S. Q. Proteome analysis of Saccharomyces cerevisiae under metal stress by two-dimensional differential gel electrophoresis. Electrophoresis 2003, 24 (9), 1458-1470.
(10) Karp, N.; Kreil, D.; Lilley, K. Determining a significant change in protein expression with DeCyder™ during a pairwise comparison using two-dimensional difference gel electrophoresis. Proteomics 2004, 4 (5), 1421-1432.
(11) Ross, P.; Huang, Y.; Marchese, J.; Williamson, B.; Parker, K.; Hattan, S.; Khainovski, N.; Pillai, S.; Dey, S.; Daniels, S.; Purkayastha, S.; Juhasz, P.; Martin, S.; Bartlet-Jones, M.; He, F.; Jacobson, A.; Pappin, D. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell. Proteomics 2004, 3 (12), 1154-1169.
(12) Alban, A.; David, S. O.; Bjorkesten, L.; Andersson, C.; Sloge, E.; Lewis, S.; Currie, I. A novel experimental design for comparative two-dimensional gel analysis: Two-dimensional difference gel electrophoresis incorporating a pooled internal standard. Proteomics 2003, 3 (1), 36-44.
(13) Peng, X.; Wood, C.; Blalock, E.; Chen, K.; Landfield, P.; Stromberg, A. Statistical implications of pooling RNA samples for microarray experiments. BMC Bioinformatics 2003, 4 (1), 26.
(14) Bergh, G. V. D.; Clerens, S.; Vandesande, F.; Arckens, L. Reversed-phase high-performance liquid chromatography prefractionation prior to two-dimensional difference gel electrophoresis and mass spectrometry identifies new differentially expressed proteins between striate cortex of kitten and adult cat. Electrophoresis 2003, 24 (9), 1471-1481.
(15) Ünlü, M.; Morgan, M. E.; Minden, J. S. Difference gel electrophoresis: a single gel method for detecting changes in protein extracts. Electrophoresis 1997, 18 (11), 2071-2077.
(16) Underwood, A. Experiments in Ecology: Their Logical Design and Interpretation Using Analysis of Variance; Cambridge University Press: Cambridge, 1997.
(17) Karp, N.; Griffin, J.; Lilley, K. Application of partial least squares discriminant analysis to two-dimensional difference gel studies in expression proteomics. Proteomics 2005, 5 (1), 81-90.
(18) Karp, N.; Lilley, K. Maximising sensitivity for detecting changes in protein expression: experimental design using Minimal CyDyes. Proteomics 2005, 5 (12), in print.
(19) Maxwell, S. E.; Delaney, H. D. Designing Experiments and Analyzing Data: A Model Comparison Perspective, Second ed.; Lawrence Erlbaum Associates: New Jersey, 2004.
(20) Sokal, R.; Rohlf, F. Biometry: The Principles and Practice of Statistics in Biological Research, Third ed.; W. H. Freeman and Company: New York, 1995; p 887.
(21) Prabakaran, S.; Swatton, J. E.; Ryan, M. M.; Huffaker, S. J.; Huang, J. J.; Griffin, J. L.; Wayland, M.; Freeman, T.; Dudbridge, F.; Lilley, K. S.; Karp, N. A.; Hester, S.; Tkachev, D.; Mimmack, M. L.; Yolken, R. H.; Webster, M. J.; Torrey, E. F.; Bahn, S. Mitochondrial dysfunction in Schizophrenia: evidence for compromised brain metabolism and oxidative stress. Mol. Psychiatry 2004, 9, 684-697.
(22) Kendziorski, C.; Irizarry, R.; Chen, K.; Haag, J.; Gould, M. On the utility of pooling biological samples in microarray experiments. Proc. Natl. Acad. Sci. U.S.A. 2005, 102 (12), 4252-4257.
(23) Levene, H. In Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling; Stanford University Press: Stanford, 1960.
(24) Box, G. E. P. Some theorems on quadratic forms applied in the study of analysis of variance problems: I. Effect of inequality of variances in the one-way classification. Ann. Math. Stat. 1954, 25, 290-302.
(25) Box, G. E. P. Some theorems on quadratic forms applied in the study of analysis of variance problems: II. Effect of inequality of variances and of correlation of errors in the two-way classification. Ann. Math. Stat. 1954, 25, 484-498.
(26) Hsu, P. L. Contributions to the theory of Student’s t test as applied to the problem of two samples. Stat. Res. Memoirs 1938, 2, 1-24.
(27) Lindman, H. R. Analysis of Variance in Complex Experimental Designs; W. H. Freeman & Co.: San Francisco, 1974.
