PROTEOMICS PROJECTS
HUPO HBPP: a case for big science

With the publication of 15 research papers in Proteomics, another HUPO project has concluded its pilot phase (Proteomics 2006, 6, 4890–5086). Members of the Human Brain Proteome Project (HBPP) have identified 792 nonredundant proteins from mouse brain samples and 1832 from human brain samples. Surprisingly, only ~30% of the proteins were identified by more than one group. Even groups that used the same technique often identified different data sets. According to HBPP members, this result is a vote of confidence for large-scale proteomics consortia.

Researchers had the option of studying mouse or human brain samples, or both. Mouse brains representing three developmental stages were available in addition to two human samples. Some groups performed relative quantitation experiments, whereas others cataloged all the proteins present in a given sample. The researchers were free to analyze the specimens with any technique, so several LC/MS/MS and gel-based methods were used. All ~750,000 MS/MS spectra were reprocessed centrally with a new informatics pipeline to create a high-confidence protein list. Overlap among the data sets that were generated by different laboratories was minimal; two-thirds of the proteins
were identified by only one group. This result may seem like more ammunition for critics of proteomics who contend that the methods are not reproducible, but the chair of HBPP, Helmut Meyer of Ruhr-Universität Bochum (RUB, Germany), takes a different view. “You get reproducible data,” he says. “The lesson we have learned is that you have to reproduce your data individually because it is not feasible that another lab taking even the same samples can reproduce the same results.” The samples are so complex and so many steps are involved in each method that any small deviation from a protocol can generate a different, but somewhat overlapping, data set. Proteomics researchers “have to follow routinely and like a slave” standard operating procedures to ensure reproducibility at least within a particular group, he says.

Christian Stephan at RUB, who is also a member of HBPP, says that proteomics methods produce complementary results. No single research group will identify all the proteins of a proteome with one method. “So this is why we need big consortia to analyze all of the proteins of one organ or sample,” he says. Instead of accepting a list of protein identifications from participating laboratories, HBPP members at RUB established the Data Collection Center (DCC) and reanalyzed all MS data with
a central reprocessing pipeline. According to Stephan, this new analysis scheme allowed DCC researchers to assess the data with a defined set of parameters “to compare these results among totally different methods and totally different laboratories.” To identify peptides, the researchers analyzed MS data with three search engines that sifted through the International Protein Index databases in addition to decoy databases for the determination of false-positive rates. Protein lists were developed and merged into one master list with a false-positive rate of ~5%.

Meyer says that the most challenging aspect of the pilot phase was the implementation of this pipeline. It took ~6–9 months to analyze all the mass spectra and troubleshoot the system as problems arose. With the present configuration, however, all ~750,000 mass spectra could be analyzed in just 2–3 weeks. All HBPP data are publicly available at the Proteomics Identifications Database (PRIDE).

In the next phase of HBPP, participants will tackle the study of neurodegenerative disorders, such as Parkinson’s and Alzheimer’s diseases. Already, they are evaluating suitable mouse models for this work. Meyer says the goal is to discover markers for early diagnoses and to better understand the progression of these conditions. —Katie Cottingham
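The decoy-database step lends itself to a brief illustration. In the standard target-decoy approach, searches are run against reversed or shuffled "decoy" sequences alongside the real database, and the false-positive rate above a score cutoff is estimated from the fraction of decoy hits that survive. The Python sketch below uses made-up scores and a simplified decoy/target ratio; it is not the DCC pipeline itself, which combined three search engines against the International Protein Index and decoy databases.

```python
# Minimal sketch of target-decoy false-positive-rate estimation, the idea
# behind the ~5% threshold on the HBPP master list. Scores, data, and the
# simple decoy/target ratio are illustrative assumptions.

def fdr_at_threshold(psms, threshold):
    """Estimate the false-positive rate above a score cutoff as
    (# decoy hits) / (# target hits)."""
    targets = sum(1 for score, is_decoy in psms
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0

def threshold_for_fdr(psms, max_fdr=0.05):
    """Walk score cutoffs from high to low; return the lowest cutoff whose
    estimated false-positive rate still stays within max_fdr."""
    best = None
    for score, _ in sorted(psms, reverse=True):
        if fdr_at_threshold(psms, score) <= max_fdr:
            best = score
        else:
            break
    return best

# Each entry is (search-engine score, matched a decoy sequence?).
psms = [(10, False), (9, False), (8, False), (7, True), (6, False), (5, True)]
cutoff = threshold_for_fdr(psms, max_fdr=0.05)  # keeps only decoy-free scores here
```

In practice, consortium pipelines refine this basic ratio (separate peptide- and protein-level estimates, per-engine score normalization before merging lists), but the cutoff-selection logic follows the same pattern.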
GOVERNMENT AND SOCIETY
PSI MI group examines antibody–antigen interactions

At the HUPO Proteomics Standards Initiative (PSI) fall meeting held September 25–27 at the American Chemical Society in Washington, D.C., the Molecular Interactions (MI) group assessed the ability of the PSI MI standard to represent antibody–antigen interactions. Members deemed the current model appropriate for the description of single antibody–antigen interactions but not for multiple binding events that occur on high-throughput platforms, such as microarrays. Henning Hermjakob at the European Bioinformatics Institute (U.K.), who is the chair of PSI MI, says that these interactions had not been considered when the original model was created,
but now that major antibody production and characterization efforts are gaining steam, the time has come. The efforts include those by the Human Protein Atlas group, the EU ProteomeBinders consortium, and the U.S. National Cancer Institute. Although the standard will work for antibody–antigen interactions, it may not be sufficient for the description of protein microarray studies, which typically feature antibodies. But all is not lost—a group known as MAGE (Microarray Gene Expression) will include protein microarrays in its future work on standards. The PSI MI schema will be used, therefore, to report the results of individual interactions discovered on microarrays, whereas “the other efforts that are more microarray-oriented, especially MAGE, will work on capturing the underlying experimental details,” says Hermjakob. —Katie Cottingham

2886 Journal of Proteome Research • Vol. 5, No. 11, 2006
U.S. FDA to regulate algorithms in medical tests

Algorithms used to diagnose, treat, or prevent illnesses will come under the purview of the U.S. Food and Drug Administration (FDA) by next year. These in vitro diagnostic multivariate index assays (IVDMIAs) include mathematical functions that analyze high-throughput protein or gene expression data to make clinical assessments. Comments on the draft guidelines (posted at www.fda.gov/cdrh/oivd/guidance/1610.html) are due to FDA by December 6, 2006.