Government and Society

Human Proteome Folding Project seeks volunteers

J. Proteome Res., 2005, 4 (1), pp 20–20

Did you know that you could help solve protein structures simply by turning on your computer? In November 2004, United Devices, IBM, and the Institute for Systems Biology (ISB) announced that volunteers are needed to donate their excess computing power to the Human Proteome Folding Project. Volunteers download Rosetta, a program that predicts protein structures, and their computers do the rest without interfering with other applications.

The goal of the Human Proteome Folding Project “is to fold the proteins of unknown function in the human proteome and predict their structures,” says Rich Bonneau at ISB. “Once we predict the structures, we will be able to use the predicted structures to guess at the functions for many of the proteins.” Although human proteins of unknown function are given priority in the queue, proteins from human pathogens and other organisms with completely sequenced genomes will also be run through the prediction software. When a volunteer downloads Rosetta software at either www.grid.org or www.worldcommunitygrid.org, he or she joins a grid of millions of other volunteers.
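The queueing policy mentioned above can be pictured with a small sketch. It is only an illustration under stated assumptions: the category names, the `WorkQueue` class, and the protein identifiers are hypothetical, and nothing here is drawn from the actual grid software, whose scheduler the article does not describe.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

# Hypothetical priority tiers (assumption): a lower number is served first.
PRIORITY = {
    "human_unknown_function": 0,
    "human_pathogen": 1,
    "other_sequenced_genome": 2,
}


@dataclass(order=True)
class WorkUnit:
    priority: int
    seq_no: int  # tie-breaker so equal priorities keep submission order
    protein_id: str = field(compare=False)


class WorkQueue:
    """Toy priority queue of proteins waiting for structure prediction."""

    def __init__(self):
        self._heap = []
        self._counter = count()

    def submit(self, protein_id, category):
        # Map the category to its tier and push the unit onto the heap.
        unit = WorkUnit(PRIORITY[category], next(self._counter), protein_id)
        heapq.heappush(self._heap, unit)

    def next_unit(self):
        # Return the highest-priority protein ID, or None if the queue is empty.
        if not self._heap:
            return None
        return heapq.heappop(self._heap).protein_id


queue = WorkQueue()
queue.submit("pathogen_A", "human_pathogen")
queue.submit("human_X", "human_unknown_function")
queue.submit("yeast_B", "other_sequenced_genome")
queue.submit("human_Y", "human_unknown_function")

order = [queue.next_unit() for _ in range(4)]
print(order)
```

In this sketch the human proteins of unknown function are handed out first, in the order they were submitted, followed by the pathogen protein and then the protein from another sequenced genome.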

Jikku Venkat at United Devices explains, “Grid computing provides you [with] the ability to tie your computer resources together so that all of the people in your organization or people who want to do research projects now have access to [more] computer capacity.” United Devices’ grid computing technology can accommodate up to 9 million machines.

Once a volunteer joins the project, new bits of work are downloaded periodically from the Internet onto his or her computer. Structure prediction can occur offline, and Venkat says that this process will not slow down one’s computer. “We do a lot of things within our technology to make sure it’s as unobtrusive as possible. The grid computing is running at a much lower priority on your machine than the normal tasks you are doing,” he says. “In over four years of operating a global grid of millions of devices, we’ve never had an obtrusiveness or security issue.”

The Rosetta program predicts protein folding by first calculating the local structures of small fragments of the protein. Bonneau says that the next step is to put these pieces together to form a global structure. “You take the things that are good locally and you put them together to … bring all those little pieces of helix, sheet, and loop [together] into something that looks like a protein,” he explains. The Rosetta program generates large sets of likely global structures, which are collated by IBM and United Devices and sent to ISB. ISB scientists and their collaborators at the University of Washington analyze the structures and predict functions for the proteins.

Bonneau says that he and his colleagues are “really excited about scaling up de novo structure prediction.” Without grid computing, he says a project of this magnitude would be impossible.

Katie Cottingham

Finding a common language for metabolomics

First came MAGE, then PEDRo, and now ArMet. The large amounts of data generated by the DNA microarray, proteomics, and metabolomics research communities triggered a need for frameworks that can not only handle this information but also facilitate meaningful analyses.

Architecture for metabolomics, or ArMet, was developed independently from MAGE (microarray gene expression) and PEDRo (proteomics experiment data repository). “The standard software engineering practice is to establish requirements for closely associated fields separately and subsequently to reconcile the results,” explains Nigel Hardy, a computer science lecturer at the University of Wales, Aberystwyth, and an author of a recent paper describing ArMet’s application in plant metabolomics (Nat. Biotechnol. 2004, 22, 1601–1606).

Originally created to support a project involving two laboratories, ArMet was generalized for broader use. It differs from prior databases in that it contains more “metadata”: information about the full experimental context used to collect the metabolomics data. In fact, the first eight of ArMet’s nine components simply place these data sets in context. Given the dynamic nature of the metabolome, this was considered a critical feature of the design process.

Although to date ArMet’s utility has only been demonstrated in plants, the framework is expected to have far-reaching implications. “ArMet is modular so new instruments, and specifically new organisms and new ways of growing them and sampling them, can all be incorporated straightforwardly into the basic architecture,” says Douglas B. Kell, research chair in bioanalytical sciences at the University of Manchester (U.K.) and an author of the ArMet paper. Kell is currently working to extend ArMet to microbiology.

Other recent activity in the field includes the creation of MIAMET (minimum information on a metabolomics experiment), a first attempt at data standards for plant metabolomics, and SMRS (standard metabonomics reporting structure), a metabonomics/metabolomics framework that, to date, has focused on toxicity trials.

The flexibility of these data systems, which can be used as modules, is considered key to their future compatibility. “All of these are pieces on the table,” says Bruce Kristal, an associate professor at Cornell University’s Weill Medical College and secretary of the Metabolomics Society. “When we finally assemble the puzzle, I don’t think you can predict today what pieces will be in it.” Hardy, who is involved in both ArMet and SMRS, expects to see a “useful synergy between the two initiatives.”

The Metabolomics Society, which was formed in March 2004, is helping to organize a workshop this spring to facilitate the process. “We have to generate a common language that we can all understand,” says Rima Kaddurah-Daouk, the society’s president.

Vida Foubister

Journal of Proteome Research • Vol. 4, No. 1, 2005

Figure: Bent into shape. An example of a protein folded by Rosetta. (Image: Rich Bonneau, Institute for Systems Biology)