news

RESEARCH PROFILES

Location proteomics analysis of Human Protein Atlas images

To discover biomarkers, proteomics researchers typically compare the proteins in healthy and diseased tissues for differences in expression levels or variations in posttranslational modifications. However, some proteins are expressed and modified to the same extent in both tissue types yet located in different subcellular compartments. For example, a protein that usually resides in the cytoplasm can activate cancer-causing pathways if it moves into the nucleus. In research reported in JPR (2008, 7, 2300–2308), Robert F. Murphy and Justin Newberg of the Center for Bioimage Informatics at Carnegie Mellon University developed an automated method to determine the subcellular locations of proteins within tissues and applied the method to images from the Human Protein Atlas (HPA).

When HPA was publicly announced for the first time at the HUPO Fourth Annual World Congress in Munich (in 2005), Murphy’s group already had been developing automated methods for the location proteomics of cultured cells for many years. “I first heard about the project [at that meeting], and it stimulated me to think that [HPA] was something that we could do automated analysis for,” Murphy remembers.

The annotations currently posted on the HPA website (www.proteinatlas.org) were determined manually by pathologists who viewed millions of tissue images. Their goal was simply to note in which tissues a particular protein was expressed. According to Murphy, only occasionally did the pathologists include comments about subcellular location, such as whether a protein appeared to be in the nucleus, in the cytoplasm, or associated with a membrane.

All of the HPA data are freely available online, so Murphy and Newberg, a biomedical engineering Ph.D. student, immediately got to work. They analyzed HPA images for 16 proteins that were known to be located in one of eight major cellular compartments (2 proteins per compartment). However, the leap from figuring out whether a protein is located in a certain compartment in fluorescently labeled cultured cells to doing this type of study for stained tissues that contain several cell types was a huge one.

The researchers faced a few challenges when they tried to adapt their previous method to the HPA data set. Murphy and colleagues originally developed their automation method for the analysis of fluorescently labeled single cells. In these types of studies, signals from multiple fluorescent probes can be viewed separately by switching between different channels on a microscope. However, HPA samples are stained with two or three immunohistochemistry dyes, each of which reflects light in the visible spectrum. Thus, each HPA image includes at least two colors that are viewed simultaneously. To adapt the automation method, the investigators first had to devise a way to tease apart the colors. Newberg reviewed the literature and tested two strategies, linear unmixing and non-negative matrix factorization (NMF; a blind approach). Although NMF provided a better visual separation of the colors, linear unmixing provided better classification results.

Next, Newberg and Murphy chose features with which to train and evaluate mathematical classifiers. This step is relatively easy with cultured cells that have virtually identical morphologies, but tissues can contain several cell types. “Rather than trying to segment individual cells in each tissue, which is the way we would handle that problem in cultured cells, we calculated features of the image as a whole,” explains Murphy. Therefore, the scientists simply used features that they knew could distinguish subcellular patterns without separating out each cell.

To determine whether these features matched any of the eight major subcellular patterns, the researchers then used support-vector machine classifiers. With a voting scheme, they looked at the data in many different ways by applying various filters to the classifiers. In the end, the classification accuracy of the method was 83% when applied to 45 tissues. When only the highest-confidence answers (classification likelihood >0.5) were considered, the accuracy increased to 97%.

Although the automation method worked well in this proof-of-principle study, Murphy says that several refinements are in the works. For example, improved unmixing and segmentation schemes would allow researchers to better analyze the heterogeneity of cells within tissue samples. “Ultimately, we want to be able to look at images of tissues and see patterns beyond the major ones,” he says. Looking even farther ahead, he plans to classify all of the >3000 proteins included in the HPA images and compare protein patterns of healthy and cancerous tissues.

—Katie Cottingham

Location, location, location. HPA tissue samples analyzed by linear unmixing. (Top row) Examples of unprocessed HPA images, (second row) isolated DNA staining, (third row) isolated protein staining, and (bottom row) a composite of the DNA and protein-staining images. Labels along the top indicate the subcellular location of the protein of interest. (ER = endoplasmic reticulum, Cyto. = cytoplasm, and Endo. = endosome.)
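For readers curious about what linear unmixing involves, the following is a minimal Python sketch, not the authors' implementation. It assumes that each pixel's color is converted to optical density (a Beer-Lambert style transform) and then expressed as a non-negative combination of known stain colors. The stain vectors, the function names, and the background value of 255 are all hypothetical choices made for illustration; none of them come from the study.

```python
import numpy as np

# Hypothetical stain colour vectors in optical-density (OD) space.
# One row per stain; columns are R, G, B. Illustrative values only.
STAINS = np.array([
    [0.65, 0.70, 0.29],   # assumed DNA stain
    [0.27, 0.57, 0.78],   # assumed protein stain
])
STAINS = STAINS / np.linalg.norm(STAINS, axis=1, keepdims=True)


def to_optical_density(rgb, background=255.0):
    """Convert RGB intensities to optical density (assumed Beer-Lambert model)."""
    rgb = np.clip(np.asarray(rgb, dtype=float), 1.0, background)
    return -np.log10(rgb / background)


def linear_unmix(rgb_image, stains=STAINS):
    """Estimate per-pixel stain amounts; returns shape (H, W, n_stains)."""
    h, w, _ = rgb_image.shape
    od = to_optical_density(rgb_image).reshape(-1, 3)            # (N, 3)
    # Least-squares solution of od = amounts @ stains for all pixels at once.
    amounts, *_ = np.linalg.lstsq(stains.T, od.T, rcond=None)    # (n_stains, N)
    # Negative amounts are not physically meaningful, so clip them to zero.
    return np.clip(amounts.T, 0.0, None).reshape(h, w, -1)


def synthesize(amounts, stains=STAINS, background=255.0):
    """Inverse of the model: build an RGB image from known stain amounts."""
    od = np.asarray(amounts, dtype=float) @ stains
    return background * 10.0 ** (-od)
```

A quick way to check the sketch is to build a tiny synthetic image with `synthesize`, pass it to `linear_unmix`, and confirm that the original stain amounts come back. Unlike this approach, a blind method such as NMF would also have to estimate the stain colors themselves from the image.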

2188 Journal of Proteome Research • Vol. 7, No. 6, 2008

© 2008 American Chemical Society