Using the semantic web to retrieve proteomics information

At the end of a proteomics MS experiment, scientists typically have a list of protein names and accession numbers. To learn more about the proteins, investigators could search PubMed. However, PubMed searches conducted with protein identifiers rarely work, say Mark Gerstein and colleagues at Yale University. Therefore, the researchers take advantage of an evolving concept, called the semantic web, for these searches. With the semantic web, the Resource Description Framework (RDF), the Web Ontology Language (OWL), and XML allow bioinformaticians to add meaning and structure to biological content so that documents can be automatically analyzed and integrated. Gerstein and colleagues developed the LinkHub system, in which biological identifiers and the relationships among them are represented as a graph. Known related web documents are linked to the identifier nodes in the RDF graph. Nodes and documents that are close to the node that represents the queried identifier are given high scores and are used as a training set. Words in the documents are given weighted values, and these terms are used to search unstructured sources, such as PubMed, for relevant documents. The investigators say that this method increases the accuracy of identifier searches compared with current procedures. (Bioinformatics 2007, 23, 3073–3079)
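The scoring idea described above can be illustrated with a minimal Python sketch. It is not the LinkHub implementation: the breadth-first decay score, the additive word weights, the function names, and all identifiers and document texts are assumptions chosen only for illustration.

```python
from collections import Counter, deque

def proximity_scores(graph, query, decay=0.5):
    """Score each reachable node as decay ** (hops from the query node).

    Assumption: a simple breadth-first decay stands in for whatever
    graph-proximity measure the published system uses.
    """
    scores = {query: 1.0}
    queue = deque([query])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in scores:
                scores[neighbor] = scores[node] * decay
                queue.append(neighbor)
    return scores

def term_weights(node_scores, linked_docs):
    """Give each word the summed score of the nodes whose documents contain it."""
    weights = Counter()
    for node, docs in linked_docs.items():
        score = node_scores.get(node, 0.0)
        for text in docs:
            for word in set(text.lower().split()):
                weights[word] += score
    return weights

def rank(candidates, weights):
    """Order unstructured documents by the total weight of the words they contain."""
    def doc_score(text):
        return sum(weights.get(word, 0.0) for word in set(text.lower().split()))
    return sorted(candidates, key=doc_score, reverse=True)

# Hypothetical identifier graph and linked documents (illustrative only).
graph = {
    "P12345": ["GENE_A"],
    "GENE_A": ["P12345", "PFAM_X"],
    "PFAM_X": ["GENE_A"],
}
linked_docs = {
    "P12345": ["kinase signaling membrane"],
    "PFAM_X": ["kinase domain fold"],
}

node_scores = proximity_scores(graph, "P12345")
weights = term_weights(node_scores, linked_docs)
print(rank(["unrelated chemistry report", "membrane kinase study"], weights))
```

In this toy setup, words from documents attached to nodes near the queried identifier carry the most weight, so free-text documents that share those words rise to the top of the ranking.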

Glycosylation site prediction

Computational approaches could reduce the cost and the amount of time required to determine glycosylation sites on proteins. Machine learning algorithms, such as support vector machines (SVMs), are popular methods for the prediction of glycosylation types, so Cornelia Caragea and co-workers at Iowa State University compared the performance of single SVMs with that of ensembles of SVMs. The ensembles were collections of single SVMs, each of which was trained on a different subset of the data instead of on the entire data set. In all cases, ensembles of SVMs outperformed single SVMs. (BMC Bioinformatics 2007, 8, 438)
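The ensemble construction can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' method: the members are bias-free linear SVMs fitted by subgradient descent on the hinge loss, the subsets are random samples drawn without replacement, the members are combined by majority vote, and the two-feature toy data are invented.

```python
import random

def train_linear_svm(samples, epochs=20, lr=0.1, lam=0.01):
    """Fit a linear SVM without a bias term by subgradient descent on the hinge loss."""
    dim = len(samples[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in samples:
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            # L2 regularization shrinks the weights at every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            # Samples inside the margin also pull the weights toward their label.
            if margin < 1.0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
    return w

def predict_single(w, x):
    """Return +1 or -1 according to the side of the hyperplane x falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

def train_ensemble(samples, n_models=5, fraction=0.6, seed=0):
    """Train each member on a different random subset of the data."""
    rng = random.Random(seed)
    k = max(1, int(len(samples) * fraction))
    return [train_linear_svm(rng.sample(samples, k)) for _ in range(n_models)]

def predict_ensemble(models, x):
    """Combine the members by majority vote (use an odd number of members)."""
    votes = sum(predict_single(w, x) for w in models)
    return 1 if votes >= 0 else -1

# Invented toy data: +1 stands for "site", -1 for "non-site".
positives = [(1.0, 2.0), (2.0, 1.0), (2.5, 3.0), (3.0, 2.0), (1.5, 1.5)]
data = [(x, 1) for x in positives] + [((-a, -b), -1) for a, b in positives]

ensemble = train_ensemble(data)
print(predict_ensemble(ensemble, (2.0, 1.5)), predict_ensemble(ensemble, (-1.0, -3.0)))
```

The sketch shows only how such an ensemble is assembled and queried. It makes no claim about relative accuracy, which in the study was measured on real glycosylation data.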

E. coli proteome on a chip

Heng Zhu, Chuan He, and colleagues at the Johns Hopkins University School of Medicine, National Taiwan University, and the University of Chicago have developed a microarray that includes nearly all of the proteins that can be produced by the E. coli K12 strain. According to the researchers, this new microarray is the first example of a non-yeast whole-proteome chip. To isolate proteins for the chip, the researchers devised a rapid, high-throughput purification protocol. They obtained a library of plasmids that includes the open reading frames of 4256 of the 4288 genes of the E. coli genome. The plasmids overexpress the proteins encoded by these genes when isopropyl-β-D-thiogalactoside (known as IPTG) is added to the growth medium of the bacterial cells. Individual cultures were grown in microwell plates, and the same plate was used for the entire purification procedure, including cell lysis, affinity capture, and washes. The proteins were isolated within 10 hours. Most of the purified proteins had the expected molecular weights, and half of them were the major band on Coomassie-stained 1DE gels.

The purified proteins were printed on glass slides. Because the proteins were fused to an N-terminal polyhistidine tag, they could be visualized on the slides with an antibody against the tag and a fluorescent secondary antibody. Almost all of the proteins had a signal above background levels. As a proof-of-principle demonstration, the investigators incubated the arrays with DNA probes that had specific types of damage. With this method, they identified a few E. coli proteins that bound to the damaged DNA. Of these proteins, two (YbcN and YbaZ) were chosen for further study. Both proteins bound to damaged DNA in biochemical assays, and they appear to repair DNA via a base-flipping mechanism. (Nat. Methods 2008, 5, 69–74)

Toolbox

Whole-organism imaging MS

Many imaging MS (MALDI IMS) studies involve the analysis of organs or tissue sections. To gain a more complete understanding of processes taking place within an entire organism, however, Tuhin Sinha and colleagues at Vanderbilt University developed a method to apply 3D MALDI IMS to whole-animal tissue sections. They also created a tool to combine MALDI IMS images with other types of data, such as those from magnetic resonance imaging (MRI) experiments.

With in vivo MRI, the researchers imaged the heads of mice with brain tumors. The mice were then sacrificed, perfused with saline, and frozen. MALDI IMS images were acquired from selected tissue sections throughout whole mice with a lateral resolution of 150–300 μm, with 300 laser shots per pixel. To generate spatially resolved 3D volume reconstructions of entire mice, Sinha and colleagues performed several postprocessing steps that matched the mass spectra to the targeting image, which was aligned with optical images of the tissue slices.

MRI scans of the mouse brains also were aligned with optical images of the tissue slices so that they could be compared with the MALDI IMS results. In overlays, regions of high protein concentration detected in the MALDI IMS studies corresponded well to MRI contrast variations. In addition, brain regions within the tumor area were significantly different in terms of the measured MALDI and MRI parameters compared with other regions. (Nat. Methods 2008, 5, 57–59)

Figure (credit: TUHIN SINHA): Pretty pictures. The left image represents aligned MALDI IMS and MRI data obtained from a tissue slice through a mouse head. The MRI data are depicted in black and white, whereas the levels of a protein detected with MALDI IMS are represented by the colored area. On the right is a reconstructed optical image of the mouse head.

836 Journal of Proteome Research • Vol. 7, No. 3, 2008

A protein implicated in intrauterine growth restriction

Fetuses with intrauterine growth restriction (IUGR) develop slowly in utero and, at birth, are much smaller than most newborns. Ironically, as these individuals reach adulthood, they often exhibit symptoms of metabolic syndrome, such as obesity, type 2 diabetes, and cardiovascular problems. To discover proteins that could explain the pathophysiology of the condition at birth and later in life, George Chrousos, Panagiotis Karamessinis, and co-workers at the Academy of Athens, the University of Athens, and the Center for Medical Genomics (Switzerland) performed a proteomics study. They found that the forms of fetuin A present in the umbilical cord blood of

Amyloid fibrils formed by semen proteins boost HIV infection

Most HIV-positive patients were infected upon genital exposure to semen from infected men. The factors in semen that affect the transmission of HIV, however, are not well understood. Therefore, Frank Kirchhoff, Wolf-Georg Forssmann, and colleagues at several universities, companies, and institutes in Germany, Spain, and the U.S. screened a library of peptides and proteins to discover inhibitors and enhancers of HIV infection. They found that fragments of prostatic acidic phosphatase (PAP) form amyloid fibrils that bind the virus particles and augment their attachment to host cells. A library of peptides and proteins (