
Article

Innovation in Small-Molecule-Druggable Chemical Space: Where are the Initial Modulators of New Targets Published? Stephanie Kay Ashenden, Thierry Kogej, Ola Engkvist, and Andreas Bender J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.7b00295 • Publication Date (Web): 25 Oct 2017 Downloaded from http://pubs.acs.org on October 26, 2017




Journal of Chemical Information and Modeling

Innovation in Small-Molecule-Druggable Chemical Space: Where are the Initial Modulators of New Targets Published?

Stephanie K Ashenden1, Thierry Kogej2, Ola Engkvist2, Andreas Bender1*

1 Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, UK
2 Discovery Sciences, IMED Biotech Unit, AstraZeneca, Gothenburg, Sweden, 431 50, SE
*[email protected]


Abstract

It is well established that the number of publications of novel small-molecule drugs, and of their associated targets, has increased over the years. This work provides an update on publishing trends, with a particular focus on comparing patents and scientific literature as accessible through the ChEMBL and GOSTAR databases. More precisely, the patents and scientific literature associated with bioactive molecules and their target annotations have been compared to identify where novelty originated. To analyse potential target class influences, the data have been further split into eight different target classes. Moreover, small-molecule modulators for protein targets are usually published either in both scientific literature and patents (45%) or only in scientific literature (51%), but rarely in patents only. It has been observed that, generally, novel targets and their associated compounds are published primarily in literature, whereas novel compounds (regardless of their associated targets) tend to be published in patents first.
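The two comparisons summarised in the abstract (which sources report modulators of a target, and which source reports them first) can be illustrated with a minimal sketch. The record layout, the identifiers, the years, and the function names below are all hypothetical and chosen only for illustration; they are not taken from ChEMBL or GOSTAR, and this is not necessarily the procedure used in the study.

```python
from collections import defaultdict

# Hypothetical records: (target_id, compound_id, source_type, year).
# All values are invented for illustration; they are not real database entries.
RECORDS = [
    ("T1", "C1", "literature", 2001),
    ("T1", "C2", "patent", 2003),
    ("T2", "C3", "literature", 2005),
    ("T3", "C4", "patent", 1999),
    ("T3", "C4", "literature", 2004),
    ("T4", "C5", "patent", 2010),
]


def source_category(records):
    """Label each target as 'both', 'literature only' or 'patent only'."""
    sources = defaultdict(set)
    for target, _compound, source, _year in records:
        sources[target].add(source)
    labels = {}
    for target, seen in sources.items():
        if seen == {"literature", "patent"}:
            labels[target] = "both"
        elif seen == {"literature"}:
            labels[target] = "literature only"
        elif seen == {"patent"}:
            labels[target] = "patent only"
        else:
            raise ValueError(f"unexpected source types for {target}: {seen}")
    return labels


def first_disclosure(records):
    """Return the source type of the earliest record for each target.

    If two records share the earliest year, the one listed first is kept.
    """
    earliest = {}
    for target, _compound, source, year in records:
        if target not in earliest or year < earliest[target][0]:
            earliest[target] = (year, source)
    return {target: source for target, (_year, source) in earliest.items()}


if __name__ == "__main__":
    print(source_category(RECORDS))
    print(first_disclosure(RECORDS))
```

Grouping by target gives the "both / literature only / patent only" split, while taking the earliest year per target gives the source of first disclosure. The same grouping could be keyed on compound instead of target to compare the two kinds of novelty.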

Introduction

Drug discovery is a costly and lengthy process, and only a small proportion of the molecules identified as candidate drugs are approved as new drugs each year1. Despite this, an increasing number of novel druggable targets have been identified over the years, and a plethora of compounds have been identified and published. Analyzing these data as a time course allows researchers to understand the preferred modes of publishing modulators of protein targets and to identify trends over time. This study aims to achieve this goal by examining compounds and their associated targets over time in the two main avenues of dissemination, namely patents and peer-reviewed scientific literature.

Occasionally, findings are published exclusively in patents (in particular by private companies); however, publishing in scientific journals usually increases the exposure of the data, which may lead to collaboration and further funding opportunities. Journal publication thus represents additional value for researchers in companies and is crucial in academia and research institutes. What is communicated depends on where the information is published; for example, a patent will not necessarily contain all the biological activity information, such as the activity type, whereas a journal publication may not depict the molecular structures2. Indeed, it has been shown that patents contain more chemical information than publications, and it has even been suggested that they may contain this information up to decades before it appears in the literature3. Thus, during a drug discovery program, accessing all the published scientific knowledge around a biological target, through both scientific literature and patents, seems crucial.

Time is a tremendously important parameter in pharmaceutical development, and numerous studies have measured the time needed for drug discovery and development. Among those, the difference between the launch date of a drug and its publication date (the date the drug was published in either a patent or scientific literature) has been investigated for oral drugs. In one study, the authors noted that the earliest publication date for oral drugs usually corresponds to a patent4,5. Nevertheless, the analysed dataset was fairly small (592 drugs), mainly because it was restricted to launched drugs for which all necessary information could be identified. Additionally, a previous study analysed a small number of protein modulators and considered the delay between the publication of these annotations in a patent and their subsequent publication in scientific literature. In that study, the authors found that, for compound-target interactions, there is on average a four-year delay between publication in a patent and publication in scientific literature, which also highlighted the need for scientists to be able to search patents reliably6.

The main objective of this study is to understand where pharmaceutical innovations, in the form of new modulators of protein targets, are reported as a function of time. To achieve this, we investigated whether the first bioactive compound (a compound that has been shown to have activity on a particular target) for a novel target tends to be published primarily in patents or in scientific literature. In the remainder of the manuscript, we refer to a protein modulator (a compound and the target it has been associated with by a measured activity), as a compound that has a bioactivity (IC50, EC50, Ki and Kd) (