RESEARCH PROFILES

CPAS: A proteomics data management system for the masses

As recipients of a National Cancer Institute (NCI) contract to develop a free, open-source informatics platform for managing data from NCI’s Proteomics Initiative, Martin McIntosh and colleagues at the Fred Hutchinson Cancer Research Center (FHCRC) and the University of Michigan had bigger plans up their sleeves. “We took our mission more broadly than the contract language strictly required,” says McIntosh. “Early on, I realized that the best outcome would [result] not by developing a platform specific to our experimental plan but rather a more general-purpose proteomics [platform] suitable to support clinical proteomics research.”

Only one year into the contract, the researchers have already produced the first version of the Computational Proteomics Analysis System (CPAS), which is described in this issue of JPR (pp 112–121). McIntosh credits his computer-savvy group of former Microsoft software engineers and stellar academic bioinformaticists for the speedy development of CPAS. An additional grant from the Canary Fund, an organization that provides funding for research on cancer biomarkers, helped the scientists go the extra step and make CPAS an easy-to-use, installable product that is accessible to laboratory groups that do not have informatics staff. CPAS, its source code, and user documentation are freely available at http://cpas.fhcrc.org.

CPAS manages raw LC/MS/MS proteomics data that are in the mzXML format, a model used to store mass spectral data, and ties several analysis steps together. Proteomics data are fed into the data pipeline module. The LC/MS/MS data analytic module searches any proteomics database that a scientist chooses for peptide identification. CPAS includes the free, open-source search algorithm X!Tandem, but many commercial search engines, such as Sequest and Mascot, could be used instead. “Although many of these features individually have been freely available before, I think for the first time, here’s a free and usable platform that integrates all of these steps,” says McIntosh.

Figure: Data flow. A representation of a typical CPAS workflow for an LC/MS/MS experiment.

The two modules interact with four core components. The experiment annotation component displays information about the experiment in graphical and tabular formats. The sample management feature, in combination with experiment annotation, links samples with the results they generate. The protein services component stores and updates information about the proteins identified in each experiment. Finally, the project management module allows collaboration. Users can restrict access to only their collaborators or, because CPAS is a web-based application, they can open up their results to the entire world.

McIntosh says the project management core component is a key feature of CPAS. “There are a lot of great resources that will take data and store and annotate the conclusions of proteomics projects, so we decided to focus CPAS on the ability to have a system suitable for managing public or consortia data earlier during the discovery process,” he says. Another important aspect of CPAS is that it allows scientists to dig into and work with the data generated by other groups. Users can search data in CPAS with different algorithms or apply their own favorite filtering criteria. Mass spectra from multiple runs can also be compared.

Because the core components are not specific to LC/MS/MS-based experiments, researchers can use CPAS to store and manage information generated by other types of proteomics and nonproteomics projects. McIntosh says that several laboratories already store data on local CPAS installations. “We have a lot of projects that take samples and send them around the world to have them analyzed,” he says. “We use CPAS as a tool to capture the results and also to [collaborate].” He adds that these researchers use diverse techniques, such as flow cytometry, ELISAs, and DNA microarrays, to generate the data they store on CPAS. The analysis of such data is not currently supported by a CPAS module, but McIntosh and colleagues plan to develop analysis modules for other types of data, including those produced by MALDI fingerprinting and accurate mass- and time-tag approaches.

In addition to expanding the analytic capabilities of CPAS, the researchers plan to maintain the system by working with a spin-off company called LabKey Software. Formed in October 2005 by some of the researchers who developed CPAS, the company exists to maintain and update the system. “The bioinformatics experts of my FHCRC laboratory will continue to develop the scientific aspects of CPAS, but LabKey Software was formed to take responsibility for supporting the core software as an open-source product,” explains McIntosh. Updates will include making the system compatible with new standards, including those developed by HUPO’s Proteomics Standards Initiative. Any software produced by the company or McIntosh’s group will be owned and distributed freely by FHCRC to the scientific community.

—Katie Cottingham

14 Journal of Proteome Research • Vol. 5, No. 1, 2006