Highlights of the Biology and Disease-driven Human Proteome Project

Perspective

Highlights of the Biology and Disease-driven Human Proteome Project, 2015-2016

Jennifer E. Van Eyk, Fernando Jose Corrales, Ruedi Aebersold, Ferdinando Cerciello, Eric W. Deutsch, Paola Roncada, Jean-Charles Sanchez, Tadashi Yamamoto, Pengyuan Yang, Hui Zhang, and Gilbert S Omenn

J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.6b00444 • Publication Date (Web): 30 Aug 2016

Journal of Proteome Research

Highlights of the Biology and Disease-driven Human Proteome Project, 2015-2016

Jennifer E. Van Eyk1,2, Fernando J. Corrales1,3, Ruedi Aebersold4, Ferdinando Cerciello4, Eric W. Deutsch5, Paola Roncada6, Jean-Charles Sanchez7, Tadashi Yamamoto8, Pengyuan Yang9, Hui Zhang10, Gilbert S. Omenn*,1,11

1 Equal contribution

2 Advanced Clinical BioSystems Research Institute, Department of Medicine, Cedars-Sinai Medical Centre, Los Angeles, CA, USA, 90038

3 Department of Hepatology, Proteomics laboratory, CIMA, University of Navarra; Ciberhed; PRB2, ProteoRed-ISCIII. 31008 Pamplona, Spain

4 Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, 8093 Zürich, Switzerland

5 Institute for Systems Biology, Seattle, WA 98109 USA

6 Istituto Sperimentale Italiano L. Spallanzani, 20133 Milano, Italy

7 Centre Medicale Universitaire, Human Protein Sciences Department, CH-1211 Geneva, Switzerland

8 Niigata University, Department of Structural Pathology, Institute of Nephrology, Medical and Dental School, Asachimachi-dori Niigata, 951-8510, Japan

9 Fudan University, Department of Chemistry, Shanghai, P.R. China

10 Johns Hopkins University, Department of Pathology, Baltimore, Maryland, USA

*,11 Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA, [email protected], (734) 763-7583.

Keywords: Biology and Disease-driven Human Proteome Project, B/D-HPP; selected reaction monitoring-MS, SRM-MS; mass spectrometry, MS

ABSTRACT

The Biology and Disease-driven Human Proteome Project (B/D-HPP) is aimed at supporting and enhancing the broad use of state-of-the-art proteomic methods to characterize and quantify proteins for in-depth understanding of the molecular mechanisms of biological processes and human disease. Based on a foundation of the pre-existing HUPO initiatives begun in 2002, the B/D-HPP is designed to provide standardized methods and resources for mass spectrometry (MS) and specific protein affinity reagents, and to facilitate accessibility of these resources to the broader life sciences research and clinical communities. Currently there are 22 B/D-HPP initiatives and 3 closely related HPP resource pillars. The B/D-HPP groups are working to define sets of protein targets that are highly relevant to each particular field, to deliver relevant assays for the measurement of these selected targets, and to disseminate and make publicly accessible the information and tools generated. Major developments are the 2016 publications of
the Human SRM Atlas and of “popular protein sets” for six organ systems. Here we present the current activities and plans of the B/D-HPP initiatives as highlighted in numerous B/D-HPP workshops at the 14th annual HUPO 2015 World Congress of Proteomics in Vancouver, Canada.

INTRODUCTION, ORGANIZATION, AND GOALS

Under the aegis of the Human Proteome Organization (HUPO), the Human Proteome Project (HPP) was formed in 2010 and launched over the subsequent years to promote advances in our understanding of the human proteome. The HPP is a focal point for many proteomic research laboratories around the world, and enhances the quality and interconnectedness of proteomics data resources (https://www.hupo.org/human-proteome-project/)1-3. The HPP is composed of three resource pillars, the C-HPP (chromosome-centric) initiative, and the Biology and Disease-driven B/D-HPP. The B/D-HPP aims to develop targeted and high-throughput proteomics analyses, address research challenges of biological and disease networks, and generate multiplex assays of proteins especially suited for particular cells, tissues and organs in health and across many diseases (http://www.thehpp.org/BD-HPP.php).

The initial concept of the B/D-HPP was to accelerate the slow uptake of proteomics, compared to other ‘omics fields, in the research and clinical communities. Although protein biochemistry (including generation of mutant proteins) is a component of most published biology or disease-based scientific papers, the vast majority of this global effort remains focused on a small subset of proteins4, most prominently the kinase families. One reason is the shortage of standardized tools, technologies, and informatics pipelines tailored for biomedical and clinical research4. Thus, the goal of the multifaceted B/D-HPP is to produce these resources and create communities able to promote the development and adoption of proteomics in order to address
biological and disease questions, and to provide a framework for interaction within biology- and disease-focused research groups, including engagement of early career scientists in our field and outreach to the broader scientific communities.

The B/D-HPP is an alliance of independent groups of researchers who are focused on specific diseases and/or molecular processes (see Figure 1). The B/D-HPP initiatives as of 2013 were brain, cancer, cardiovascular, diabetes, epigenetics and chromatin, glyco-proteomics, infectious disease, kidney and urine, liver, mitochondria, model organisms, plasma, and stem cells. Nearly all remain active (see Table 1). Since then there has been grassroots expansion of the B/D-HPP to embrace initiatives in the areas of extreme conditions, eye, food and nutrition, immunopeptidome, musculo-skeletal, pediatric, protein aggregation, and toxicoproteomics. Each B/D-HPP group has a chair and co-chairs from different geographic regions of the world; they are responsible for establishing the specific goals and milestones for each group and for participating in workshops and scientific sessions at each annual HUPO Congress, as well as their own meetings. (See Supplementary Table 1 for a list of the leaders of each B/D-HPP group and the 3 primary resource pillars.) The goal for each B/D-HPP group is to form a network with their participating researchers to cross-fertilize ideas and share data, facilitate adoption of technologies, and address challenges within their specific disease or molecular process domain. This includes the adoption and promotion of the HUPO standards in all aspects of MS and bioinformatics (including peptide and protein identification).

The HPP also encourages links between the B/D-HPP and C-HPP groups, now facilitated by the designation by the C-HPP of clusters, as shown on the cover of this special issue, for cancers, reproductive health, membrane proteins, neurodegenerative (and protein misfolding) disorders, and the in vitro transcription/translation technology platform.

The B/D-HPP education and outreach program provides an annual Mentoring Day and a series of awards for early investigators: predoctoral, postdoctoral and clinical fellows, and early faculty. Information about the B/D-HPP is available at www.thehpp.org. Finally, a B/D-HPP newsletter is being launched, alongside the long-standing extensive newsletter series of the C-HPP.

THE HUMAN SRM ATLAS: THE OVERRIDING PROJECT OF THE B/D-HPP

Since the launching years, led by the Aebersold lab in Zurich and the Moritz lab in Seattle, and building on early work by Anderson et al5 and by Carr6, the B/D-HPP has pursued the goal of enhancing the capabilities of mass-spectrometry-based proteomics and bringing its applications to all fields of biology and disease through a major program of development of targeted proteomics. This year's release of the comprehensive Human SRM Atlas7 reports data on 166,174 proteotypic peptides that specifically identify 99.7% of the 20,277 predicted and annotated human proteins by the widely accessible, sensitive, and robust targeted MS method, Selected Reaction Monitoring (SRM). These assays detect, verify, and quantitate such peptides from proteins of interest in specific pathways, as well as their splice isoforms, sequence variants, and post-translational modifications. The team demonstrates the utility of the SRM approach by examining the network response to inhibition of cholesterol synthesis in liver cells and to docetaxel treatment of prostate cancer cells. The data are freely accessible at http://www.srmatlas.org.

Recent advances in data-independent acquisition mass spectrometry technologies such as SWATH-MS enable a deeper recording of the peptide contents of samples, including peptides with modifications. Keller et al8 published a novel approach that applies the power of SWATH-MS analysis to the automated pursuit of modified peptides. A new SWATHProphet (PTM)
functionality added to the open source SWATHProphet software permits identification of precursor ions consistent with a modification, along with the mass and localization of the modification in the peptide sequence. The method is sensitive and does not require anticipation of the modifications in advance. Keller et al8 detected a wide assortment of modified peptides, many unanticipated, in phospho-enriched human tissue culture cell samples, as well as in urine containing unpurified synthetic peptides. These methods are likely to prove transformative in cell biology.

Two streams of work came together from the collaboration of the B/D-HPP and the HPP Bioinformatics Resource Pillar on “priority proteins” and “popular proteins.” The result is a set of publicly available, comprehensive resources for SRM targeted proteomics for proteins associated with particular diseases and biological systems, including type 2 diabetes mellitus, ovarian cancers, breast cancers, colon cancers, chromatin proteins, type 1 diabetes, ascending aortic aneurysm, the heart, and the eye (follow www.thehpp.org to https://db.systemsbiology.net/sbeams/cgi/PeptideAtlas/proteinListSelector).

Finally, a cross-B/D activity has been the bibliometric analysis of the literature to produce sets of “popular proteins” (highly published) for six organ systems: cardiovascular, cerebral, hepatic, renal, pulmonary, and intestinal9. For each protein in each protein set, SRM assays are specified, so that many users can plan appropriate studies9.

B/D-HPP WORKSHOPS AT 2015 HUPO WORLD CONGRESS IN VANCOUVER

Here we present highlights of the scientific and educational activities at the HUPO Congress in Vancouver in September 2015 organized by the B/D-HPP and the three HPP resource pillars, namely Knowledgebase, Antibody Profiling, and MS (see online Supplement Table 2 for a list of speakers). Workshop summaries are in alphabetical order.

Affinity-Based Protein Capture Resource Pillar and the Human Protein Atlas (HPA) (led by E. Lundberg, A. Bandrowski, C. Lindskog, M. Skogs, T. L. Alm, H. Rodriguez, D. Shankar, M. Uhlen)

The original Human Antibody Initiative has continued to be a key component of the HPP and HUPO itself. This consortium impacts all B/D-HPP groups and the broader scientific community. One goal of the September 2015 meeting and a preceding major workshop at the EuPA annual meeting (Milano in June 2015) was to address the problems of antigen-antibody validation, proper identification of reagents used, and the need for greater reproducibility in reported findings throughout the scientific community. The Affinity Binders Knockdown Initiative aims to create a pipeline of well-validated antibodies and standard operating procedures for antibody validation. The results will be disseminated through Antibodypedia (www.antibodypedia.org). There was a major release of validated antibodies from the HPA in the January 2015 issue of Science10, along with extensive annotations of protein type (secreted, membrane-spanning, housekeeping, regulatory), druggability, tissue specificity, presence in cancer or other cell lines, and, if relevant, the protein's relationship with metabolism. These classifications were based on immunohistochemical studies of 44 tissues and RNA sequencing of 32 tissues. These findings about protein expression are widely applicable at organ, tissue, and single-cell levels and contribute to the finding and annotation of missing proteins10.

Bioinformatics Workshop (led by E. Deutsch)

A major goal of the HPP Bioinformatics group at the HUPO 2015 World Congress was to bring together bioinformaticians, HPP leadership, and investigators to meet the need to generate more stringent quality assurance guidelines for the HPP and the larger community. This goal is
grounded in the recognition that there are variable practices across laboratories, different criteria for protein identification with various search engines and databases, the experience of lax analytical filters in the large-scale reports by Kim et al11 and Wilhelm et al12, and the many challenges of enhancing the methodology for very large datasets. Specific guidelines and a checklist for authors and reviewers were generated, discussed in depth at the HPP post-Congress Workshop, and then refined for release in this 2016 HPP Special Issue of the Journal of Proteome Research (JPR) (see www.thehpp.org/guidelines)13. Clearly the current and future HPP data guidelines are an issue of high importance. Because pre-existing guidelines and their enforcement have limitations, the HPP’s new guidelines specifically address how best to make claims of confidently detecting missing proteins and novel translation products more reliable. The major subtopics included: current HPP data guidelines, data deposition in ProteomeXchange, the 1% protein-level FDR requirement, manual inspection of spectra supporting extraordinary claims, consideration of alternate explanations of the data, use of synthetic reference peptides, and the use of SRM to confirm shotgun results. Each subtopic was introduced separately, followed by active discussion by the workshop participants. The outcome is the new HPP Data Interpretation Guidelines (version 2.1) (https://www.hupo.org/2015/12/news/establishing-new-hpp-guidelines/)13. JPR determined that all of the manuscripts published in the 2016 special issue must comply with these standards. The B/D-HPP recommends that these Guidelines and Checklist be adopted broadly by all proteomics researchers and by other proteomics and life sciences journals.

Proteomics Standards Initiative and ProteomeXchange Consortium (led by E. Deutsch, H. Hermjakob, J.A. Vizcaíno)

The goal of the Vancouver workshop was to focus on the implementation of the current set of proteomics standards and to review the steps that have been taken over the past year. The program started with an overview of the Proteomics Standards Initiative (PSI) and its recent advances, and highlighted recent publications14-21 on the current set of standards and the progress made at the 2015 PSI workshop in Seattle, USA, which included the PSI Extended FASTA File (PEFF) format. The PSI Molecular Interactions Working Group presented an update of the PSI-MI XML format for encoding molecular interactions in a rich XML schema, which is widely used among the molecular interactions databases. PSI-MI XML version 3.0, which enhances the format to describe more than 2 protein interaction partners and conditional interactions, is under development. The PSI Proteomics Informatics Working Group promoted the adoption of three recently completed formats: mzIdentML, for encoding proteins, peptides and peptide-spectrum matches in XML; mzQuantML, a highly flexible XML format for encoding abundance measurements; and mzTab, a simplified tab-delimited text format for encoding the relevant summary information from mzIdentML and mzQuantML files. The current activity also includes development of proBED and proBAM, which are standardized formats for encoding the output of proteomics experiments in the widely used genomics BED and BAM formats, which can be read by many genomics tools. These new formats are aimed at proteogenomics workflows.

Finally, an overview of the ProteomeXchange Consortium of current proteomics data repositories was presented. Current members are PRIDE at the European Bioinformatics Institute (EBI), PeptideAtlas at the Institute for Systems Biology (ISB), and MassIVE at the University of California San Diego (UCSD). There are now more than 3000 datasets in ProteomeXchange. There are plans to include additional repositories, such as jPOST in Japan. Datasets deposited in PRIDE
(MS/MS) and MassIVE are routinely downloaded for standardized reanalysis with complementary pipelines by PeptideAtlas and by GPMDB. The 2016 PSI Spring Meeting was held in Ghent, Belgium (web site http://psidev.info/).

Brain Proteome Project (led by H.E. Meyer, G. Schmitz, Y.M. Park, A. Häggmark, A. Urbani, P. Marin, K. Marcus, P. Nilsson, D. Martins-De-Souza)

The HUPO Brain Proteome Project (HBPP) is an international interdisciplinary initiative focused on the investigation of the human brain (see www.hbpp.org). The brain is a complex organ composed of a large number of different tissue layers and cells. The goal of the brain proteome project is to understand the roles of the proteome in the functions of the human brain. To date, the B/D-HPP researchers have used a broad spectrum of methods and have fostered strong cooperation between scientists from different fields (affinity proteomics, bioinformatics, biostatistics, proteomics, analytical biotechnology, clinical science, neurobiology, biochemistry, neuroanatomy, neuropathology). At the Vancouver Congress the initiative focused on projects related to understanding the biogenesis of the most common forms of dementia: Alzheimer’s Disease (AD), Parkinson’s Disease (PD), Frontotemporal Dementia (FTD), and other neurodegenerative diseases. Insights from biomarker research on neurodegenerative diseases and the underlying disease mechanisms, especially in the hippocampus, were highlighted. This was the 25th Workshop of the HBPP (see www.hbpp.org).

Cancer Proteome Project (led by H. Zhang, C. R. Jimenez, E. Nice, M. S. Baker, J. Qin, A. Umar, K. Abbott, Y. Chen)

The goal of the Cancer-HPP is to characterize different cancer proteomes, determine the correlation of transcriptome and proteome, identify high priority proteins for each tumor type, and generate and disseminate assays and resources to support the analysis of complex biological
networks or clinical specimens underlying different disease processes. The use of targeted antibody and MS/MS-based methods allows quantitative analysis of high priority target proteins. This group has proposed formation of an international cancer proteomic effort [similar to The Cancer Genome Atlas (TCGA) project] to identify and validate cancer proteins for different cancer types using MS-based methods (see CPTAC-HPP section). In Vancouver, data from membrane proteome studies on colorectal cancers provided insights for subtype classification and novel biomarker definition. The primary role of post-translational modifications was underscored with phosphoproteomics studies that identified tissue signatures in non-small cell lung cancers and with glycomic analyses showing a unique N-linked glycan structure prevalent in ovarian cancers. The clinical perspectives of breast cancer proteomics were also discussed. Finally, the Chinese Human Proteome Project (CNHPP) presented extensive work to define the proteomic landscape of liver cancers.

Clinical Proteomic Tumor Analysis Consortium (CPTAC) HPP (led by H. Rodriguez, C. Kinsinger, B. Zhang, S. Thomas, J. Whiteaker)

The goal of the CPTAC-HPP Workshop was to address the many challenges and the progress in developing precise quantitative tools for protein analysis in the discovery and targeted arenas22. NCI’s public portal (http://assays.cancer.gov) contains nearly 800 fit-for-purpose targeted SRM-MS assays developed in coordination with the U.S. Food and Drug Administration (FDA) and the American Association for Clinical Chemistry. Next, these assays need to be adapted so that they can be used easily across many cell and tissue types. Much like antibody-based assays, these MS-based tools will require transparency around expectations, validation, and limitations.

Cardiovascular Proteome Project (led by P. Srinivas, S.J. Parker, A.W. Herren, T. Klein, M. Lam, M.E. McComb)

The goal of the cardiovascular proteome project is to promote the broad use of proteomic technology across the heart and vascular research domains. The group has determined which proteins are most studied in cardiovascular research as a means of identifying proteins and pathways that could be a focus for developing targeted quantitative assays. The cardiovascular initiative published a paper in 2015 entitled “Prioritizing Proteomics Assay Development for Clinical Translation” about the most studied of the