Viewpoint Cite This: ACS Pharmacol. Transl. Sci. XXXX, XXX, XXX−XXX

pubs.acs.org/ptsci

Data Sharing Advances Rare and Neglected Disease Clinical Research and Treatments

Rachelle J. Bienstock*


RJB Computational Modeling LLC, Chapel Hill, North Carolina 27514, United States

ABSTRACT: Because of the decreased cost and increased ease of whole genome analysis, the diagnosis of rare, orphan diseases has entered a new era. This technological advance, combined with worldwide web connectivity, now permits sharing, searching, and linking of genotype, phenotype, and other information to facilitate diagnosis. Databases currently accessible to and searchable by researchers, clinicians, and patients are presented and discussed.



WHAT IS A RARE OR ORPHAN DISEASE?

There is no universally recognized definition of a rare, neglected, or orphan disease. In the US, such a disease is generally regarded as a life-threatening or debilitating disease that affects a “small” percentage of the population; under the Rare Diseases Act of 2002, “small” means fewer than 200 000 persons. In the EU, a rare disease is defined as affecting no more than 1 in 2000 persons, and in Japan as affecting fewer than 50 000 persons, or about 1 in 2500 persons. There are roughly 6000−8000 identified rare diseases, so although each is called rare, at least 30 million people in the US are living with a rare disease, and together these diseases affect roughly 8% of the world’s population. It is estimated that 350 million people worldwide are suffering from a rare disease (1). Roughly 80% of rare diseases are genetic in origin and result from dysfunction of a particular pathway due to defective gene expression or protein production (1). Because these diseases occur so infrequently, the physician to whom a patient presents symptoms may never have seen that particular rare or orphan disease. This creates a challenge for diagnosis: on average, 4.8 years pass between the onset of symptoms and diagnosis, and patients typically need to see many physicians and specialists before receiving the correct diagnosis. Most patients are children, and approximately 30% will die before reaching age 5. The vast majority of rare diseases still have no treatment (it is estimated that 95% of rare diseases lack an FDA-approved treatment), although many are monogenic disorders with known mutations. Because prevalence is low, sharing of genetic, drug discovery, treatment, and outcome data becomes essential. Prior to 1983, rare diseases did not offer a large customer base and therefore lacked commercial incentives for drug development. However, President Reagan signed the US Orphan Drug Act in 1983, which facilitates orphan drug development by granting market exclusivity to companies developing drugs and therapies for orphan and rare diseases and by reducing research and development costs. Because of these incentives, large pharmaceutical companies have now refocused on drug development for rare diseases, away from the traditional research and development areas, as rare and orphan disease clinical trials require fewer patients and patient advocacy groups assist with patient recruitment. Additionally, the regulatory path for rare disease drug approval is easier than for traditional drugs.

© XXXX American Chemical Society



RARE DISEASE DIAGNOSIS

Initial evidence of an underlying genetic disorder comes from the patient’s external manifestation (phenotype), and the patient is usually referred to the clinic. Genetic testing is necessary to confirm a molecular diagnosis of a rare disease. Rapid technological advances and declining costs have facilitated the routine clinical use of high-resolution genetic methods such as array-comparative genomic hybridization, single-nucleotide polymorphism (SNP) genotyping, and exome sequencing (2, 3). This has resulted in major advances in the ability to diagnose rare genetic diseases and has led to the identification of many new genes responsible for rare disorders. The idea behind the development of networked and shared databases is to provide genotype and phenotype matching algorithms that link cases with common phenotypes and gene mutations. For rare diseases, it is extremely challenging, yet essential for diagnosis and treatment, to identify cases from other, unrelated individuals. Multiple projects and groups are developing genotype and phenotype matching algorithms. Several of these web-searchable services are discussed here, in addition to an effort to link their information into a single service, the Matchmaker Exchange.
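To make the idea of genotype and phenotype matching concrete, the following is a minimal, purely illustrative sketch and not the algorithm of any service discussed in this article. It assumes each case is described by a set of phenotype term identifiers and a set of candidate gene symbols, scores phenotype overlap with a plain Jaccard index, and ranks cases that share a candidate gene first. The function names, case identifiers, and term identifiers are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap of two term sets: shared terms / all terms (0.0 if both empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_cases(query_terms, query_genes, cases):
    """Rank stored cases against a query case.

    cases maps a case identifier to a (phenotype_terms, candidate_genes) pair.
    Returns (case_id, phenotype_score, shared_genes) tuples, with cases that
    share at least one candidate gene first, then by descending phenotype score.
    """
    ranked = []
    for case_id, (terms, genes) in cases.items():
        shared = sorted(set(query_genes) & set(genes))
        ranked.append((case_id, jaccard(query_terms, terms), shared))
    # Sort key: (shares a gene?, phenotype overlap), highest first.
    ranked.sort(key=lambda r: (bool(r[2]), r[1]), reverse=True)
    return ranked
```

Real matching services may instead weight terms by their position in a phenotype ontology or by variant-level evidence; the simple set overlap here only shows the shape of the problem.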



Received: May 20, 2019
DOI: 10.1021/acsptsci.9b00034

GLOBAL RARE DISEASES REGISTRY AND REPOSITORY

The Global Rare Diseases Registry and Repository (RaDaR) (4), https://ncats.nih.gov/radar, is a database sponsored by the National Center for Advancing Translational Sciences (NCATS). It provides contact and demographic information to connect researchers and to help them find patients who are interested in, and appropriate for, participating in research studies. The registry contains information regarding a patient’s conditions and diagnosis. Registry databases and tissue repositories are essential tools for providing important data to a broad community of researchers. The registry also contains patient outcomes from treatments and scientific understanding of disease and assists therapy development. The goal for the registries is to connect patient communities with researchers. Similar registries to RaDaR in the global rare disease space include the International Rare Disease Research Consortium and ERA-Net for Research Programs on Rare Diseases, which function at the funding level to connect institutions with similar interests and group data. Data registries such as these are essential for diagnosis when each physician sees so few patients presenting with the disease.

DATA SHARING IN THE UNDIAGNOSED DISEASES NETWORK (UDN)
https://www.genome.gov/27550959/undiagnosed-diseases-network-udn/

The Undiagnosed Diseases Network (UDN) (5) is structured to make advances in multiple disciplines related to the fields of genomics and undiagnosed diseases. The network includes 12 clinical coordinating centers (Baylor College of Medicine, Houston; Brigham and Women’s Hospital, Boston; Children’s Hospital University of Pennsylvania; Duke University, Durham; The National Human Genome Research Institute (NHGRI), Bethesda; Stanford Medicine, Stanford CA; University of California, Los Angeles; University of Miami School of Medicine, Miami; University of Utah; University of Washington, Seattle; Washington University, St. Louis; and Vanderbilt University) and a sequencing core center at Baylor College of Medicine. The primary goals of the UDN are to improve the level of diagnosis and care, facilitate research, and create an integrated and collaborative research community. One critical goal of the UDN is to identify the causes of disease in the genomes of the patients studied and, in general, to contribute to the science of genome interpretation. The UDN is a member of Matchmaker Exchange’s collaborative effort to address the common challenge of exome and genome sequencing.

Orphanet (https://www.orpha.net/consor/cgi-bin/index.php). Orphanet (6) was established by INSERM (French National Institute for Health and Medical Research) in 1997. In 2000, Orphanet became a European Union project with grants from the European Commission, consisting of a consortium of 40 member countries. Orphanet experts collect, annotate, and classify (according to the Orphanet nomenclature) information on rare diseases with their associated genes. Phenotypes are associated with rare disorders, and the data are made freely available (OrphaData). This includes data on expert centers, patient organizations, diagnostic laboratories, research projects, clinical trials, registries, databases, and infrastructures for research. Orphanet also provides a rare disease nomenclature (ORPHAnumber). Each disease in Orphanet is given a unique identifier, the Orpha number (e.g., ORPHA57146, Rare hepatic disease), and assigned a classification. A set of rules for assignment ensures that classifications are consistent. These rules are publicly available and described in detail at https://www.orpha.net/orphacom/cahiers/docs/GB/Orphanet_linearisation_rules.pdf. The data are organized hierarchically, and each data set includes the types of disorders and the relationships between disorders. The graphical interface for Orphanet is illustrated in Figure 1.

Orphanet is linked to and affiliated with the Orphanet Journal of Rare Diseases (https://ojrd.biomedcentral.com/), an open access peer-reviewed journal that publishes articles related to rare diseases and orphan drugs. Orphanet also produces a report series with information on rare diseases, their epidemiology, orphan drugs, and rare disease registries. Orphanet also supports OrphaNews, an electronic newsletter, as well as ORDO, the Orphanet Rare Disease Ontology, which provides for computational analysis and captures the relationships between diseases and genes. Orphanet serves as a rare disease data reference source and has enabled researchers to share information regarding orphan drug discovery (7). Drug repositioning, or repurposing, is an approach that explores the use of already known drugs for orphan diseases. OrphaData provides access to the aggregated data from Orphanet: rare diseases and their associated genes, phenotypes associated with rare disorders, and free rare disease data (www.orphadata.org/cgi-bin/index.php). This allows the data to be downloaded freely in bulk for use in projects, without being limited to searching through a web graphical interface. For example, one study (7) systematically explored 320 856 possible links between known drugs in DrugBank and orphan proteins obtained from Orphanet and revealed as many as 18 145 candidate drugs for repurposing for rare or orphan diseases.

Figure 1. Orphanet web interface illustrates the available links and options to orphan drugs, diagnostic laboratories, research projects, and registries and Orphanet reports. Image reproduced with permission. Copyright 2019 Orphanet.

DECIPHER (https://decipher.sanger.ac.uk). DECIPHER (8) is another collaborative database developed to share genotype and phenotype information to assist rare disease diagnosis. Established in 2004, DECIPHER permits the secure deposition and sharing of rare and orphan disease gene variants and offers a web-searchable interface. DECIPHER has data from more than 250 registered centers and, at this writing, openly shares more than 30 000 patient records from which identifiable information has been removed. DECIPHER offers users analysis and visualization tools to help with understanding the meaning of deposited sequence variants or copy-number variants (CNVs). DECIPHER is secure and password-protected. Phenotype information (e.g., age, gestation in weeks for prenatal data, sex, affected status of parents, images) can be deposited confidentially through the online interface. Access to some deposited information, such as images, is restricted to the depositing organization. Explicit patient consent must be obtained before a patient record is openly shared and made available for searching in an unidentifiable format. DECIPHER was one of the first web-based databases that allowed for the searching and matching of genomic information and permitted contact and collaboration among users.

Figure 2. Genomic and phenotype data searchable through DECIPHER. DECIPHER enables various types of searches. These searches and identification of a patient’s specific mutation facilitate diagnosis. (A) DECIPHER provides a Genome Browser for searching by genome position, gene name, or chromosome band; (B) searching phenotype abnormalities exhibited by patients in the open-access portion of DECIPHER. From here you can link to the patient’s phenotype; (C) CNV Syndromes, a list of curated microdeletion and microduplication syndromes involved with developmental disorders; (D) CNV Syndromes in karyotype view. Images reproduced with permission. Copyright 2009 Genome Research Limited.

A curated list of well-described syndromes due to gene deletions and duplications, DECIPHER CNV syndromes (https://decipher.sanger.ac.uk/disorders), is also included in the database and aids the interpretation of rare copy-number variants in the clinic. DECIPHER has tools for variant analysis and assists with identifying patients exhibiting similar genotype−phenotype characteristics. DECIPHER data from which patient identification information has been removed include CNVs (copy-number variants) as well as sequence variants, including SNVs (single-nucleotide variants). Variant information, pathogenicity, inheritance, and phenotype information are also included. DECIPHER also includes information regarding mitochondrial genome variants. DECIPHER supports sharing between linked projects (consortium sharing). At this writing, more than 56 000 patient records are shared within consortia. The depositing center determines the extent of sharing. When data are deposited, the sharing status of a patient variant can be set to “private,” “managed” (if the project is part of a consortium), or “public.” Variants marked private are not publicly searchable (they are visible only to the depositing center). Variants marked managed are shared with other projects that are part of the same consortium but are not visible to users who are not part of the same project or consortium. Variants marked public are openly shared (this requires explicit patient consent). The patient record (including phenotype information) is shared openly when at least one variant is marked as public. Sharing is granular, so different variants for the same patient can have different sharing statuses. DECIPHER offers the ability to graph data and interactive web tools for data visualization. These are illustrated in Figure 2.

PhenomeCentral Portal (https://phenomecentral.org). PhenomeCentral (9) is a restricted-access network in which clinicians, researchers, and scientific consortia share patient phenotype and genotype data to discover similar patients. A centralized web portal enables clinicians to share undiagnosed patient information. One method by which rare conditions are diagnosed is the comparison of the exomes or genomes of unrelated individuals with a specific disorder. PhenomeCentral permits clinicians at different hospitals to share and compare this type of data for more efficient diagnosis of rare diseases. After entering a patient record into PhenomeCentral, the researcher can decide to have the record participate in the MME (Matchmaker Exchange), a linked network of rare disease patient databases.

GeneMatcher (http://www.genematcher.org). GeneMatcher (10), another web-based tool, was developed by the Baylor-Hopkins Center for Mendelian Genomics. It highlights individuals with rare phenotypes who have variants in the same candidate disease gene. GeneMatcher does not collect individually identifiable data. It permits researchers to search genes (by gene symbol, Entrez Gene ID, or Ensembl Gene ID) and facilitates connections between researchers posting information regarding the same gene. Gene matches are made automatically upon gene submission, and both submitters receive email notification.

GenomeConnect (https://www.genomeconnect.org/). GenomeConnect (11) is an online resource for patients developed by the Clinical Genome Resource (ClinGen) and maintained by the Institutional Review Board at Geisinger Health System. The project collects genotypic and phenotypic data from patients and uses them to build an open database. Patients register and complete informed consent forms online, followed by a general survey about their health, and then upload their genetic test results.

Café Variome (http://www.cafevariome.org). Café Variome (12) differs from the other Web sites in that it is a customizable web tool, developed by the University of Leicester Bioinformatics Group, with which users make their own data and data sets available for others to search. It offers a developed structure, predefined data types, and a simple way to build queries against the data. Many different types of data and data fields can be accommodated in Café Variome, including genotype and phenotype details, with interfaces that facilitate easy querying of the data. Café Variome is a single software package designed to be easy to install and configure by groups or institutions that want to share their data. The Café Variome system provides a core MySQL database, search interfaces, ontology components, results presentation interfaces, user and user-group settings, data management tools, administration functions, interface styling control, Web service capabilities, and inbuilt messaging services. This data discovery tool is currently being used by several projects, including the EPAD (European Prevention of Alzheimer’s Dementia) Consortium, SolveORD, BBMRI.uk (Biobanking and Biomolecular Resources Research Infrastructure), EMIF, eTOX, efpai, Photosystem, Tissue Directory, Coordination Centre Innovative Medicines Initiative (IMI), and PhenoSystems. A genotype−phenotype installation, Café Variome Central, created from publicly available data sets, is available for searching (https://central.cafevariome.org/). It includes the currently searchable data and databases listed in Table 1.

Table 1. Searchable Databases from Café Variome

name | description | website
1000genomes | 1000 Genomes Project: largest public catalogue of human variation and genotype data. | http://www.internationalgenome.org/
dbsnp | dbSNP: contains human single nucleotide variations | https://www.ncbi.nlm.nih.gov/snp/
diagnostic | Diagnostic Variants: Variants submitted from diagnostic laboratories | https://central.cafevariome.org/discover/source/diagnostic
dmudb | Diagnostic Mutation Database: The Diagnostic Mutation Database (DMuDB) was established by The National Genetics Reference Laboratories (NGRLs) in Manchester & Wessex, UK Dept. of Health to support NHS diagnostic genetic services. | http://www.ngrl.org.uk/
FINDbase | The frequency of Inherited Disorders Database: records frequencies of causative mutations leading to inherited disorders worldwide. | http://www.findbase.org/
Findis | Finnish Disease Heritage Database: disease mutations enriched in the Finnish population | http://www.findis.org/
forge | FORGE (Finding of Rare Disease Genes) Canada Consortium: National consortium using next-generation sequencing technology to find genes linked to rare pediatric-onset disorders in the Canadian population. | https://www.genomebc.ca/
hgmd | Human Gene Mutation Database at the Institute of Medical Genetics in Cardiff | http://www.hgmd.cf.ac.uk/ac/index.php
phencode | PhenCode: connects human phenotype and clinical data in various locus-specific mutation databases (LSDBs) with data on genome sequences, evolutionary history, and function in the UCSC Genome Browser. PhenCode is a collaboration among researchers at Penn State, UC Santa Cruz, and locus experts at other institutions. | http://phencode.bx.psu.edu/
uniprot | Uniprot: Funded by NIH, EMBL-EBI, PIR and SIB. Provides high-quality and freely accessible resource of protein sequence and functional information | https://www.uniprot.org/

Collaborative Efforts. Methods are needed to facilitate interaction between multiple disparate and disconnected projects. The Matchmaker Exchange, MME (13) (http://www.matchmakerexchange.org), is an attempt to fill this niche. It is a network of genotype and rare phenotype databases that unifies data held in separate databases. The MME serves as a tissue repository database and facilitates interaction between multiple disconnected projects as a collaborative effort to facilitate
identification of cases with similar phenotype and genotype profiles through a standardized application programming interface (API).

THE MATCHMAKER EXCHANGE (MME)

The International Consortium of Human Phenotype Task Force is developing standards for interoperability among databases. The Matchmaker Exchange (MME) was initiated to create a standard way of connecting a network of databases of rare disease information, genotypes, and phenotypes using a common application programming interface (API); each database can nevertheless maintain its own data organization schema. The Matchmaker Exchange facilitates interaction between multiple disconnected projects and enables searches of multiple databases. Participating matchmaker services are required to implement a standardized API, consistent with standards developed by the GA4GH Data Working Group, for exchanging genotypic and phenotypic information. The Matchmaker Exchange project receives funding from the participant databases and from the Global Alliance for Genomics and Health, IRDiRC (the International Rare Diseases Research Consortium), CanSHARE, NIH Centers for Mendelian Genomics, Genome Canada, CIHR, Care4Rare, and RD-Connect. Collaborative Matchmaker Exchange member organizations include AGHA Patient Archive, DECIPHER, GeneMatcher, IRUD, matchbox, MyGene2, and PhenomeCentral. Matchmaker Exchange participants include Café Variome, Centers for Mendelian Genomics, ClinGen GenomeConnect, GENESIS, LOVD, PEER, RD-Connect, and the Undiagnosed Diseases Network. This common API to query, access, and link rare disease databases is essential for patients, physicians, and researchers who may be isolated, being the only one in their local area or institution (or the only one they are aware of) interested in finding information about a specific orphan disease. The existence of many small siloed data sets can impede the diagnosis and definition of a rare disease, because a patient or a researcher may not be able to search effectively across linked information held in different databases. They could be searching one database that has genomic information regarding a rare disease while the diagnosis data are in a different database; with no connection between the two databases, the link between disease genotype and diagnosis may never be made.

How Matchmaker Exchange Works. There is a user agreement for those wishing to use the MME and a steering committee that governs the program and sets the rules and standards. The steering committee is composed of a representative from each of the MME participating services, as well as program organizer representatives from the Global Alliance for Genomics and Health (GA4GH) and the International Rare Diseases Research Consortium (IRDiRC). The steering committee maintains the service requirements, the user agreement, and oversight of the API to ensure that the MME meets the needs of the rare disease community and reflects consensus standards and best practices as set forth by the GA4GH and IRDiRC.

Matchmaker Exchange Application Programming Interface (MME API). The Matchmaker API (application programming interface) (14) specification, together with information on its use, is available through GitHub at https://github.com/ga4gh/mme-apis, and the complete API specification handshake is described at https://github.com/ga4gh/mme-apis/blob/master/search-api.md. A sample search request could be implemented or edited as shown. Current live implementations of the Matchmaker API are listed at https://github.com/ga4gh/mme-apis/wiki/Endpoints.

Figure 3. Matchmaker Exchange: (A) starting search page for a patient looking for information; (B) databases to which each user group (patients, researchers, and clinicians) has access; (C) type of information stored in each database; (D) criteria used in each database for similarity scoring for matching patients. Images reproduced with permission. Copyright 2019 GitHub Inc.


(2) Shaw-Smith, C., Redon, R., Rickman, L., Rio, M., Willatt, L., Fiegler, H., Firth, H., Sanlaville, D., Winter, R., Colleaux, L., Bobrow, M., and Carter, N. P. (2004) Microarray based comparative genomic hybridization (array-CGH) detects submicroscopic chromosomal deletions and duplications in patients with learning disability/mental retardation and dysmorphic features. J. Med. Genet. 41, 241−248.
(3) Gambin, T., Yuan, B., Bi, W., Liu, P., Rosenfeld, J. A., Coban-Akdemir, Z., Pursley, A. N., Nagamani, S. C. S., Marom, R., Golla, S., Dengle, L., Petrie, H. G., Matalon, R., Emrick, L., Proud, M. B., Treadwell-Deering, D., Chao, H. T., Koillinen, H., Brown, C., Urraca, N., Mostafavi, R., Bernes, S., Roeder, E. R., Nugent, K. M., Bader, P., Bellus, G., Cummings, M., Northrup, H., Ashfaq, M., Westman, R., Wildin, R., Beck, A. E., Immken, L., Elton, L., Varghese, S., Buchanan, E., Faivre, L., Lefebvre, M., Schaaf, C. P., Walkiewicz, M., Yang, Y., Kang, S. L., Lalani, S. R., Bacino, C. A., Beaudet, A. L., Breman, A. M., Smith, J. L., Cheung, S. W., Lupski, J. R., Patel, A., Shaw, C. A., and Stankiewicz, P. (2017) Identification of novel candidate disease genes from de novo exonic copy number variants. Genome Med. 9, 83−98.
(4) Rubinstein, Y. R., and McInnes, P. (2015) NIH/NCATS/GRDR® Common Data Elements: A leading force for standardized data collection. Contemp. Clin. Trials 42, 78−80.
(5) Brownstein, C. A., Holm, I. A., Ramoni, R., and Goldstein, D. B. (2015) Data Sharing in the Undiagnosed Diseases Network. Hum. Mutat. 36, 985−988.
(6) Pavan, S., Rommel, K., Mateo Marquina, M. E., Hohn, S., Lanneau, V., and Rath, A. (2017) Clinical Practice Guidelines for Rare Diseases: The Orphanet Database. PLoS One 12, e0170365.
(7) Brylinski, M., Naderi, M., Govindaraj, R. G., and Lemoine, J. (2018) eRepo-ORP: Exploring the Opportunity Space to Combat Orphan Diseases with Existing Drugs. J. Mol. Biol. 430, 2266−2273.
(8) Chatzimichali, E. A., Brent, S., Hutton, B., Perrett, D., Wright, C. F., Bevan, A. P., Hurles, M. E., Firth, H. V., and Swaminathan, G. J. (2015) Facilitating Collaboration in Rare Genetic Disorders Through Effective Matchmaking in DECIPHER. Hum. Mutat. 36, 941−949.
(9) Buske, O. J., Girdea, M., Dumitriu, S., Gallinger, B., Hartley, T., Trang, H., Misyura, A., Friedman, T., Beaulieu, C., Bone, W. P., Links, A. E., Washington, N. L., Haendel, M. A., Robinson, P. N., Boerkoel, C. F., Adams, D., Gahl, W. A., Boycott, K. M., and Brudno, M. (2015) PhenomeCentral: A Portal for Phenotypic and Genotypic Matchmaking of Patients with Rare Genetic Diseases. Hum. Mutat. 36, 941−949.
(10) Sobreira, N., Schiettecatte, F., Valle, D., and Hamosh, A. (2015) GeneMatcher: A Matching Tool for Connecting Investigators with an Interest in the Same Gene. Hum. Mutat. 36, 928−930.
(11) Kirkpatrick, B. E., Riggs, E. R., Azzariti, D. R., Miller, V. R., Ledbetter, D. H., Miller, D. T., Rehm, H., Martin, C. L., and Faucett, A. W. (2015) Genome Connect: Matchmaking Between Patients, Clinical Laboratories, and Researchers to Improve Genomic Knowledge. Hum. Mutat. 36, 974−978.
(12) Lancaster, O., Beck, T., Atlan, D., Swertz, M., Thangavelu, D., Veal, C., Dalgleish, R., and Brookes, A. J. (2015) Cafe Variome: General-Purpose Software for Making Genotype−Phenotype Data Discoverable in Restricted or Open Access Contexts. Hum. Mutat. 36, 957−964.
(13) Philippakis, A. A., Azzariti, D. R., Beltran, S., Brookes, A. J., Brownstein, C. A., Brudno, M., Brunner, H. G., Buske, O. J., Carey, K., Doll, C., Dumitriu, S., Dyke, S. O. M., den Dunnen, J. T., Firth, H. V., Gibbs, R. A., Girdea, M., Gonzalez, M., Haendel, M. A., Hamosh, A., Holm, I. A., Huang, L., Hurles, M. E., Hutton, B., Krier, J. B., Misyura, A., Mungall, C. J., Paschall, J., Paten, B., Robinson, P. N., Schiettecatte, F., Sobreira, N. L., Swaminathan, G. J., Taschner, P. E., Terry, S. F., Washington, N. L., Züchner, S., Boycott, K. M., and Rehm, H. L. (2015) The Matchmaker Exchange: A Platform for Rare Disease Gene Discovery. Hum. Mutat. 36, 915−921.
(14) Buske, O. J., Schiettecatte, F., Hutton, B., Dumitriu, S., Misyura, A., Huang, L., Hartley, T., Girdea, M., Sobreira, N., Mungall, C., and Brudno, M. (2015) Hum. Mutat. 36, 922−927.

The MME API specifies the format of both the query that is sent to participating databases and the response. The request is simply a description of the individual to be matched, and the response is a list of descriptions of similar individuals. Because the API is built around the description of an individual rather than a complex query language, it is easy to understand. A match request contains a single case used as the query, and the match response contains a scored list of the most similar cases in the remote system, in the same format. The patient type parameter is flexible to facilitate matchmaking between cases. There are few required fields, which makes the API easy to implement regardless of the data stored by the individual matchmaker service, and there are many optional fields, which enable the inclusion of additional information. Within the Matchmaker Exchange, controlled terminologies are standardized. The API defines a set of data types with a corresponding set of properties. Standard identifiers are used for exchanging patient profiles. Figure 3 illustrates the Matchmaker Exchange interface that a patient would see when initiating a search. A patient can go to the Matchmaker Exchange Web site and search for genotype and phenotype matches. The types of data that are searchable and available through the Matchmaker Exchange Web site and the individual database services include phenotype, genotype, candidate genes, and whether there is a linked diagnosis. Figure 3 also indicates which databases are accessible to clinicians, patients, and researchers through the Matchmaker Exchange and the parameters used for matching in each database.
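The request and response shapes described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference client: the field names ("patient", "contact", "features", "genomicFeatures", "results", "score") follow the public search-api.md specification as understood here and should be checked against the current specification before use, and the helper names build_match_request and rank_matches are hypothetical. No network call is made, because endpoints and authentication are specific to each participating service.

```python
import json


def build_match_request(patient_id, contact_name, contact_href, hpo_ids, gene_ids):
    """Build a match-request body: a description of the single case to be matched.

    Field names are assumed from the public MME search API specification.
    """
    patient = {
        "id": patient_id,
        "contact": {"name": contact_name, "href": contact_href},
    }
    if hpo_ids:
        # Phenotype terms, given as standard ontology identifiers.
        patient["features"] = [{"id": h, "observed": "yes"} for h in hpo_ids]
    if gene_ids:
        # Candidate genes for the case.
        patient["genomicFeatures"] = [{"gene": {"id": g}} for g in gene_ids]
    if "features" not in patient and "genomicFeatures" not in patient:
        raise ValueError("a case needs at least one phenotype term or candidate gene")
    return {"patient": patient}


def rank_matches(response, min_score=0.0):
    """Return (score, patient_id) pairs from a match response, best match first."""
    pairs = [
        (result["score"]["patient"], result["patient"]["id"])
        for result in response.get("results", [])
        if result["score"]["patient"] >= min_score
    ]
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


def to_json(request_body):
    """Serialize a request body as the JSON text that would be sent to a service."""
    return json.dumps(request_body, indent=2)
```

For example, build_match_request("case-1", "Dr. Example", "mailto:clinician@example.org", ["HP:0000522"], ["NGLY1"]) describes one illustrative case by a phenotype term and a candidate gene, and rank_matches applied to the parsed response orders the returned cases by their similarity score. All identifiers and contact details in this example are placeholders.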



CONCLUSIONS

There are many databases collecting individual genotype and phenotype information characteristic of rare and orphan diseases. There are also new collaborative efforts to share and access data on rare diseases and syndromes in order to assist clinicians with diagnosis and treatment. The ability to share and search specific genotype and phenotype data, even down to the specifics of genes, mutations, and CNVs, facilitates diagnosis. Through efforts such as Matchmaker, a common application programming interface (API), data can be easily shared and searched across different databases, even those with disparate data types.



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

ORCID

Rachelle J. Bienstock: 0000-0001-5228-3610

Notes

The author declares no competing financial interest.



ACKNOWLEDGMENTS

We thank Dr. Julia Foreman, DECIPHER project manager, Wellcome Sanger Institute, for reviewing sections of the manuscript.



REFERENCES

(1) Julkowska, D., Austin, C. P., Cutillo, C. M., Gancberg, D., Hager, C., Halftermeyer, J., Jonker, A. H., Lau, L. P. L., Norstedt, I., Rath, A., Schuster, R., Simelyte, E., and van Weely, S. (2017) The importance of international collaboration for rare diseases research: a European perspective. Gene Ther. 24, 562−571.