Article
An expert system to predict the forced degradation of organic molecules Alexis D.C. Parenty, William G. Button, and Martin A. Ott Mol. Pharmaceutics, Just Accepted Manuscript • DOI: 10.1021/mp400083h • Publication Date (Web): 03 Jul 2013 Downloaded from http://pubs.acs.org on July 9, 2013
Molecular Pharmaceutics is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.
Molecular Pharmaceutics
An expert system to predict the forced degradation of organic molecules Alexis D. C. Parenty, William G. Button and Martin A. Ott* Lhasa Limited, 22-23 Blenheim Terrace, Woodhouse Lane, Leeds, LS2 9HD, UK.
Abstract
In this paper we describe Zeneth, a new expert computational system for the prediction of forced degradation pathways of organic compounds. Intermolecular reactions such as dimerization, reactions between the query compound and its degradants, as well as interactions with excipients can be predicted. The program employs a knowledge base of patterns and reasoning rules to suggest the most likely transformations under various environmental conditions relevant to the pharmaceutical industry. Building the knowledge base is facilitated by data sharing between the users.
KEYWORDS: Zeneth, forced degradation, stress testing, degradation pathways, degradant, degradation products, expert system, knowledge base, reasoning rules, prediction, data sharing.
* E-Mail: [email protected]; Tel: +44 (0) 113 394 6044; Fax: +44 (0) 113 394 6099
Introduction
ACS Paragon Plus Environment
For the registration of a new Active Pharmaceutical Ingredient (API), regulatory agencies require that the structures of significant degradation products observed during storage be known and recommend that the degradation chemistry of the API be assessed,1,2 as described in guidelines of the International Conference on Harmonization (ICH).3–6 This can be a difficult task since these degradants often occur in minute quantities, dispersed within the API and excipient. Forced degradation studies on pharmaceutical compounds, also called stress testing studies, use physical and chemical stresses in order to screen product stability and rapidly obtain larger quantities of degradants than would have been obtained under standard storage conditions as in long-term or accelerated stability studies.3 Such increased production of degradants facilitates their isolation and characterization, leading to a better understanding of the degradation chemistry of the API and elucidation of its degradation pathways. Therefore, forced degradation studies are a vital part of pharmaceutical development, especially in the area of formulation, where better understanding of drug-excipient interactions can be gained, and also in the area of manufacture and packaging.2 Traditionally, forced degradation has been the preserve of skillful chemists with an encyclopedic knowledge of both organic and analytical chemistry. Although this situation persists today, the continuous and rapid development of scientific knowledge places an increasing burden on the chemist’s memory. Moreover, the outcome of such an exercise is subjective since it depends on the chemist’s experience in the field.
In contrast, by combining the fields of chemistry and computer science, in silico approaches have the potential to offer fast and full recall of information in a more objective manner.7–9 Computers are also very good at applying rearrangements and perceiving symmetry in a molecule.10 Molecular perception is crucial in forced degradation studies, particularly in structurally complex molecules. For example, a computer program will not have difficulty in recognizing that the mono-oxidation of diol 1
(Scheme 1) yields the same product 2, whichever hydroxyl is oxidized, due to the symmetry present in the skeleton of the molecule. However, it is less easy for the human brain to handle three-dimensional representations and reach the same conclusion as quickly. Two fundamental in silico approaches can be used to predict chemical fates: logic-oriented and information-oriented.11 Logic-oriented systems make use of first principles to transform a starting material into its corresponding product via bond-making and bond-breaking steps. In order to predict a transformation, logic-oriented systems typically evaluate physicochemical properties, such as electronegativity, bond dissociation energy, polarizability, strain and enthalpy of formation. When such properties are estimated with sufficient precision, the system is able to predict known as well as unknown reactions with equal ease. Nevertheless, this method requires a high level of confidence in the prediction of every thermodynamic parameter, which is still very expensive in terms of processing time, even with today’s computer technology. Moreover, the multi-factor nature of logic-oriented systems makes them hard to fine-tune: if the predictions do not match experiments, it is not easy to implement the necessary adjustments to improve the correspondence. In contrast, information-oriented systems such as Zeneth, also called expert systems, are able to emulate the decision making of a human expert by using transformation rules based on experimental precedence or on knowledge previously processed by the human brain. These transformation rules are described by patterns (Markush structures) defining the scope and the structural modification that takes place during the conversion of the query compound into its products. Together, these transformation rules form what is called the knowledge base. 
When particular patterns in the structure of the starting material are recognized by the program and the reaction condition requirements are met, the corresponding rules are triggered, causing the
transformation of the molecule into its corresponding products. Note that unlike logic-oriented systems, expert systems are limited by the content of their library: they cannot generate reactions that are not present in the knowledge base and continuous revision is necessary to keep the knowledge base up to date. Expert system programming has been flourishing during the last 35 years in almost every field of chemistry where a considerable amount of knowledge is required to tackle a problem effectively.12 For example, expert systems can be found for the prediction of synthetic organic reactions,13–16 reaction kinetic modelling,14,17 synthetic route design via retrosynthetic analysis,18–22 prediction of biodegradation pathways,23 prediction of mass spectrometry fragmentations,14 database proofreading,24 as well as the prediction of toxicology,25 metabolism26 and forced degradation.27,28 The idea of developing predictive software for forced degradation was initiated by earlier expert systems able to predict forward reactions such as CAMEO,13 EROS,14 and ROBIA,15,16 which were not actually designed for this purpose, but were also able to predict degradation reactions with some success.29 In 2006 Pfizer established DELPHI (Degradation Expert Leading to PHarmaceutical Insight), the first expert system entirely devoted to the prediction of forced degradation.28 However, DELPHI remains non-commercial software built mostly using confidential forced degradation data obtained internally and is not easily taught new chemistry, which has led to its discontinuation.
Experimental Section
Overview of the Zeneth program
Forced degradation studies are time consuming, and are mainly performed on confidential compounds. For this reason, the field of degradation chemistry is relatively immature and generates a limited amount of publicly available data. To facilitate the gathering of this dispersed information, in 2008 Steven Baertschi, an expert in forced degradation chemistry,30 and Lhasa Limited, a company with expertise in data sharing and in silico prediction systems,26 started the development of Zeneth. As Lhasa Limited had already developed Meteor,26 a comparable expert system for the prediction of metabolism, the design of Zeneth was based on that of Meteor. The program interface allows the user to specify the compound(s) to be processed and generates prediction results in the form of a tree-shaped diagram of degradation products, together with the likelihood of their formation (Figure 1). The results display screen also provides options to view the complete pathway to a degradant, including intermediate structures (i.e., transient structures within a step). Several processing parameters are available to influence the prediction results, such as the number of steps, the minimum likelihood, and the reaction conditions. In addition, several result filters (which are applied afterwards) allow the refinement of the degradation tree by highlighting reaction pathways that lead to degradants with experimentally observed exact mass, molecular formula and/or structures. Note that when a query structure is processed, all specified conditions are considered to be present alongside each other as a “set” of conditions – there is no loop over reaction conditions or reagents. Different sets of conditions require another run of the program; AutoZeneth (vide infra) can do this automatically.
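The exact-mass result filter mentioned above can be illustrated with a minimal Python sketch. It is not Zeneth's implementation: the `Degradant` class, the function names, the tolerance value and the placeholder labels D1 and D2 are all assumptions made for illustration. Only the idea of highlighting predicted degradants whose exact mass matches an experimentally observed value comes from the text.

```python
from dataclasses import dataclass

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}


@dataclass
class Degradant:
    """Hypothetical container for one predicted degradant."""
    label: str
    formula: dict  # element symbol -> atom count


def exact_mass(formula):
    """Monoisotopic mass of a molecular formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def filter_by_mass(degradants, observed_mass, tolerance=0.005):
    """Keep degradants whose exact mass lies within `tolerance` u of the observed mass."""
    return [
        d for d in degradants
        if abs(exact_mass(d.formula) - observed_mass) <= tolerance
    ]


# Placeholder predictions: D2 differs from D1 by one oxygen atom.
predicted = [
    Degradant("D1", {"C": 7, "H": 8, "S": 1}),
    Degradant("D2", {"C": 7, "H": 8, "O": 1, "S": 1}),
]
matches = filter_by_mass(predicted, observed_mass=140.0296)
```

In this sketch the filter is applied after prediction, mirroring the statement that result filters refine an already generated tree rather than steering the prediction itself.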
Specification
Zeneth is a standalone application that runs on Microsoft Windows 7,31 XP and Vista. The user interface is written in Visual Basic while the structure matching and chemistry perception are handled by C++ modules. Additional technical features are support for a range of third-party chemical drawing packages (Accelrys Draw, Symyx Draw, ISIS/Draw and ChemDraw) and the ability to save prediction results and to create customizable reports in various formats (Rich Text Format, Microsoft Excel, ISIS sketch file, SDfile32 and tab-delimited text). An “AutoZeneth” feature allows users to process multiple structures (from Molfiles32 or SDfiles) or to process the same structure against a number of sets of reaction conditions. The AutoZeneth functionality can also be invoked from the command line. The program incorporates a knowledge base editor that is used to enter chemical knowledge (transformations, references, reasoning rules, examples and excipients). Users are able to add their own knowledge to a separate “custom” knowledge base.
Knowledge base building and information sources
Zeneth’s knowledge base is developed in very much the same way as Meteor’s. Available degradation data are examined to establish degradation rules that together form a knowledge base (Figure 2). A large amount of degradation chemistry has been summarized in books.29,33 In addition, the primary literature contains much more information that is often more detailed and specific, even though this information is highly dispersed over a number of pharmaceutical journals. Useful information can also be obtained from general chemistry textbooks and from the web-based CambridgeSoft Pharmaceutical Drug Degradation Database, Pharma D3.34 That database, however, has been growing slowly since its creation in 2004 and to date contains only 1,200
degradation transformations from 383 compounds. In comparison, knowledge base development for Meteor, an analogous expert system for the prediction of metabolism,26 benefits from large databases such as Accelrys Metabolite, which currently contains 95,000 metabolic transformations from 13,400 compounds.35 Confidential data can often be used to build non-confidential knowledge. Compared to metabolic or toxicological data, forced degradation data are less dependent on features relating to the full structure of the molecule. Often, degradation chemistry takes place on a functional group which can be expressed as a substructure bearing R groups. Therefore, data sharing appears to be more readily applicable to the field of forced degradation chemistry than to many other areas. The exception would be where a degradation is specific to a confidential scaffold, such as a novel heterocyclic system. In such a case, the data may be donated anonymously, if at all. Of course, users may prefer to build their own “custom” knowledge base to complement the one provided with the program.
Results
Zeneth knowledge base
Transformation patterns
The conversion of data into knowledge can be exemplified by the lactonization of a cephalosporin (Scheme 2, Steps 1-2). Reacting structures are reduced to their “degradophore”, i.e., the smallest part of a molecule directly responsible for the degradation reaction (Scheme 2, Step 2); full structures do not appear in the software. Also in Step 2, the knowledge base scientist
elucidates the mechanism of the degradation and tries to understand what the requirements are for the reaction to take place (Scheme 2, Step 3). In Step 3, the degradation reaction is generalized by the use of R groups and a new transformation rule is created in the knowledge base (in this case transformation 046) consisting of a pattern and its reaction change. Such patterns define both the transformation and its scope and are at the core of the knowledge base of Zeneth. The formulation of the transformation rule takes into account the mechanism of the reaction, geometric considerations (e.g., cis/trans double bonds, distances, angles and steric hindrance) as well as general reactivity principles. In this example, the knowledge base writer has recognized that the cis-alkene fragment present in the substructure (Scheme 2, Step 1) can also be part of an aromatic ring (substitution at a benzylic versus an allylic position) (Scheme 2, Step 3). The likelihood of a transformation and its dependence on reaction conditions are described elsewhere in the knowledge base, through reasoning rules.
Reasoning rules
The program is presently able to handle the four main pharmaceutically relevant degradation conditions that have to be tested according to the ICH guidelines,3,4 which are thermolytic, hydrolytic, oxidative and photolytic conditions.2 To predict those types of degradations, the software allows the selection of a number of conditions: desired temperature and pH (numerical values), presence or absence of water, oxygen, metal, radical initiator, peroxide and light. In addition to writing transformation patterns, the knowledge base scientist tries to understand from reported examples and chemistry knowledge which reaction conditions need to be selected in
order to predict the likelihood of a particular degradation. Two types of reasoning rules36 (absolute and relative) are written for each transformation rule. Absolute reasoning rules36 determine the likelihood of a transformation and rank it on one of five levels: [very likely], [likely], [equivocal], [unlikely] and [very unlikely], depending on the reaction type (as defined by the pattern(s)) and the reaction conditions that have been selected by the user. Degradants coming from matching transformation rules present in the knowledge base are only shown in the results if their likelihood is equal to or greater than the likelihood threshold constraint selected by the user. If the likelihood of a degradant is lower than that threshold, the degradant is neither displayed nor used to generate further degradants. This helps to avoid an uncontrolled combinatorial explosion of unlikely predictions and provides information on which degradation pathways are more likely to take place. For example, an absolute reasoning rule sets the S-oxidation of a thioether as [very likely] when either O2 or peroxides are present, whereas the relatively less likely oxidation at benzylic positions is set to [likely] when both O2 and radical initiator are present. As another example, in the acid/base-catalyzed hydrolysis of imides, water is a prerequisite, and a higher likelihood is assigned at the two extremes of the pH scale than at neutral pH (see orange pH profile in Figure 3). Other pH profiles can be used to write other absolute reasoning rules depending on the reaction under consideration, such as the red pH profile in Figure 3 for a reaction that is catalyzed only by acid. As many hydrolysis and isomerization reactions are acid- and/or base-catalyzed, pH profiles are used extensively in the knowledge base. At present, dependence on temperature is implemented only for “thermal” (non-catalyzed) eliminations, fragmentations and rearrangements (e.g., a Cope elimination).
For other transformation rules, the predictions will not change with temperature.
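The five likelihood levels and the user-selected threshold can be sketched in Python as an ordered enumeration. This is an illustrative model only, not the program's code: the function names and the dictionary keys for conditions are assumptions, and the fallback level returned when the stated conditions are absent is also an assumption, since the text gives only the level assigned when the conditions are met.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    """The five levels named in the text, ordered from lowest to highest."""
    VERY_UNLIKELY = 1
    UNLIKELY = 2
    EQUIVOCAL = 3
    LIKELY = 4
    VERY_LIKELY = 5


def s_oxidation_of_thioether(conditions):
    """[very likely] when either O2 or peroxide is present (from the text)."""
    if conditions.get("oxygen") or conditions.get("peroxide"):
        return Likelihood.VERY_LIKELY
    return Likelihood.VERY_UNLIKELY  # assumed fallback, not stated in the text


def benzylic_oxidation(conditions):
    """[likely] when both O2 and a radical initiator are present (from the text)."""
    if conditions.get("oxygen") and conditions.get("radical_initiator"):
        return Likelihood.LIKELY
    return Likelihood.VERY_UNLIKELY  # assumed fallback, not stated in the text


def passes_threshold(likelihood, threshold):
    """A degradant is kept only if its likelihood is equal to or above the threshold."""
    return likelihood >= threshold


conditions = {"oxygen": True, "radical_initiator": False}
kept = [
    rule.__name__
    for rule in (s_oxidation_of_thioether, benzylic_oxidation)
    if passes_threshold(rule(conditions), Likelihood.LIKELY)
]
```

With oxygen alone, only the thioether oxidation survives a [likely] threshold; adding a radical initiator would let the benzylic oxidation through as well.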
All absolute reasoning rules entered in the knowledge base follow the format: if [Grounds] is [Threshold] then [Proposition] is [Force].36 A reasoning phrase must be placed in each of the [Grounds], [Threshold], [Proposition] and [Force] fields. A list of reasoning phrases is available from a drop-down menu to fill the fields or new ones can be added. The [Grounds] field is the evidence to be considered by the reasoning rule; the [Threshold] field is the level of likelihood above which the [Grounds] must be for the [Proposition] to be assigned the [Force]; the [Proposition] field is the question to be answered by the absolute reasoning rule; and the [Force] is the reasoning outcome assigned to the [Proposition]. For example, the absolute reasoning rules in Figure 4 have been entered in the knowledge base. Absolute reasoning rule 14 (Figure 4) says that if the starting material matches a pattern of transformation 011 (oxidation of thioether to sulfoxide), then the resulting product has a likelihood of formation equal to [very likely] when “any oxidant” (oxygen or peroxide) is present. Note that absolute reasoning rules 90 and 91 define the meaning of the [Grounds] for rule 63 by setting the value of the variable “any oxidant”. This variable may be re-used by other rules. Figure 4 also provides an example with a pH dependency, the acid-catalyzed transformation 006 (dehydration of hydroxylamine to imine). Absolute reasoning rule 9 says that if the starting material matches a pattern of transformation 006, then the associated product (here an imine) has a likelihood of formation governed by “pH profile 5” (the red pH profile of Figure 3), which is defined by rules 80-84 as [very unlikely] when the pH is 8 or higher to [very likely] for a pH below 2, reflecting the acid-catalyzed nature of the transformation. The variable “likelihood with pH profile 5” can be re-used for other acid-catalyzed transformations that have comparable reactivity.
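A rough Python sketch of how such rules chain together is given below. It is a simplified model and not the knowledge base format itself: the function names are hypothetical, and the intermediate pH breakpoints (4 and 6) are assumptions, because the text states only the two ends of “pH profile 5” ([very likely] below pH 2, [very unlikely] at pH 8 or higher).

```python
def any_oxidant(conditions):
    """Reusable variable: true when oxygen or peroxide is present."""
    return bool(conditions.get("oxygen") or conditions.get("peroxide"))


def ph_profile_5(ph):
    """Acid-catalyzed profile. End points follow the text; the rest is assumed."""
    if ph < 2:
        return "very likely"      # from the text
    if ph < 4:
        return "likely"           # assumed breakpoint
    if ph < 6:
        return "equivocal"        # assumed breakpoint
    if ph < 8:
        return "unlikely"         # assumed breakpoint
    return "very unlikely"        # from the text: pH 8 or higher


def rule_14(matched_transformations, conditions):
    """If transformation 011 matched and any oxidant is present: [very likely]."""
    if "011" in matched_transformations and any_oxidant(conditions):
        return "very likely"
    return None  # the rule does not fire


def rule_9(matched_transformations, conditions):
    """If transformation 006 matched: likelihood follows pH profile 5."""
    if "006" in matched_transformations:
        return ph_profile_5(conditions["pH"])
    return None  # the rule does not fire
```

Keeping `any_oxidant` and `ph_profile_5` as separate functions mirrors the point made in the text that such variables can be re-used by other rules with comparable reactivity.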
Relative reasoning rules36 are also present in the knowledge base and describe the relative reactivity of functional groups. These rules are also implemented by the knowledge base scientist and are used by the reasoning engine to assess competing reactions. The likelihood of formation for a degradant does not necessarily depend on the likelihood of its isolated transformation coming from reaction conditions and absolute reasoning rules. This is because other transformations can compete for the same substrate. For example, faster transformations may deplete a shared reagent and prevent slower reactions from taking place. Therefore, to be able to control the behavior of competing reactions, relative reasoning rules describing the relative reactivity of functional groups are implemented for every transformation present in the knowledge base. For example, a rule will state that the hydrolysis of an imine is quicker (more likely) than the hydrolysis of an amide. As exemplified for the hydrolysis of bromazepam in Scheme 3, two different hydrolysis sequences could mechanistically give the same degradants D3 and D4, but the first hydrolysis is more likely to take place on the most electrophilic site of the molecule (here the imine moiety), competing for a hydroxyl anion or water molecule and therefore preventing the less electrophilic amide moiety present in the molecule from reacting during the first degradation step to form midway degradant D2. This prediction is in accordance with experimental results.37 Therefore, to describe the most likely pathway and help to reduce the number of redundant structures, an option in the program can be activated by the user so that the reasoning engine eliminates the slowest competing reactions, such as the one that would lead to D2 in Scheme 3. Note that relative reasoning is applied analogously in the Meteor program for metabolism prediction.26,36
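The pruning of slower competing reactions can be sketched as follows. This is a deliberately simplified Python model: it treats all candidate transformations as competing with one another, whereas the reasoning engine described above assesses competition in the context of the actual structure. The names `RELATIVE_RULES` and `prune_slower` are hypothetical; only the imine-versus-amide ordering is taken from the text.

```python
# Each pair reads (faster, slower). The imine/amide ordering comes from the text.
RELATIVE_RULES = {("imine hydrolysis", "amide hydrolysis")}


def prune_slower(candidates, relative_rules=RELATIVE_RULES):
    """Drop every candidate that a relative rule marks as slower than
    another candidate that is also applicable."""
    applicable = set(candidates)
    return [
        candidate
        for candidate in candidates
        if not any((faster, candidate) in relative_rules for faster in applicable)
    ]


# Both hydrolyses match the query, so the slower amide hydrolysis is removed.
first_step = prune_slower(["amide hydrolysis", "imine hydrolysis"])
```

Note that a slower reaction is only removed when a faster competitor is actually applicable; on its own, the amide hydrolysis would still be predicted.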
Knowledge base status
All the prediction results presented here were obtained using version 2012.1.1 of the knowledge base and version 4 of the Zeneth program.38 This knowledge base consists of 243 different named transformations, together with reasoning rules (419 absolute and 248 relative). The knowledge base also contains further information on each transformation (Markush structure defining the scope of the reaction, comments, references and examples) that can be viewed by the user when the transformation is applied. A transformation usually contains more than one pattern (666 in total) in order to deal with variants of the transformation and issues such as regioselectivity. Substructures can be quite general, allowing those patterns to generate a large number of different reactions under the same name; or they can be very restrictive for very specific reactions. The chemical transformations belong to one of several general classes of reaction categories: hydrolysis, oxidation, condensation, addition, elimination, substitution, isomerization, rearrangement, and photochemical reactions (Table 1). Note that the knowledge base contains more transformations under the categories of hydrolysis and oxidation, accounting for about half of all, and fewer in the categories of isomerizations, rearrangements and photochemical reactions. This does not necessarily reflect how frequently these rules are applied on APIs since it depends on the scope of the pattern used, as well as the structure of the molecules (some structural features being more common than others). The current number of photochemical transformations is probably relatively low and this chemistry will require more attention in the near future. This is also the case, albeit to a lesser extent, for isomerizations and rearrangements.
Transformation categories           Number of transformations present
                                    in the knowledge base by year
                                    2010    2011    2012
Hydrolyses                            44      55      56
Oxidations                            41      56      76
Condensations and additions           27      42      42
Eliminations and fragmentations       17      20      25
Isomerization and rearrangements      19      23      25
Photochemical reactions               14      19      19
Total                                162     215     243
Table 1. Knowledge base status of Zeneth.
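As a quick arithmetic check on Table 1, the short Python sketch below recomputes the yearly totals and the combined share of hydrolyses and oxidations in 2012. The variable names are illustrative; the numbers are those of the table.

```python
# Rows of Table 1: category -> counts for (2010, 2011, 2012).
TABLE_1 = {
    "Hydrolyses": (44, 55, 56),
    "Oxidations": (41, 56, 76),
    "Condensations and additions": (27, 42, 42),
    "Eliminations and fragmentations": (17, 20, 25),
    "Isomerization and rearrangements": (19, 23, 25),
    "Photochemical reactions": (14, 19, 19),
}

# Column sums, one per year.
totals = tuple(sum(column) for column in zip(*TABLE_1.values()))

# Fraction of the 2012 transformations that are hydrolyses or oxidations.
share_2012 = (TABLE_1["Hydrolyses"][2] + TABLE_1["Oxidations"][2]) / totals[2]
```

The column sums reproduce the Total row, and hydrolyses plus oxidations make up 132 of the 243 transformations in 2012, consistent with the statement that these two categories account for about half of the knowledge base.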
Hydrolysis reactions are in general the simplest ones to predict without the help of an expert system, whereas isomerization, rearrangement and photochemical reactions are much more difficult to predict without the software and are the ones that chemists may overlook. Oxidation reactions are not especially difficult to predict but a particular functional group (or heterocycle) can often produce multiple oxidation products (e.g., a tertiary amine may lead to an N-oxide, iminium compounds, hemiaminals and dealkylated products/carbonyl compounds). Zeneth is also capable of handling intermolecular reactions such as dimerization and reactions with excipients. In this respect, common excipients (currently 49, see Table 2) as well as their known contaminants have been entered into the knowledge base and are available from a dropdown menu in the program to be processed with the API. This list is updated periodically with new excipients and new contaminants. Note that any other structures that are not yet present in the knowledge base, such as flavoring agents or combination products, can be added by the user to be processed for intermolecular reactions with the API.
Benzaldehyde               D-Sorbitol                       Polyethylene glycol
Benzyl alcohol             Ethyl cellulose                  Polylactide
Benzylparaben              Ethylparaben                     Polyvinyl alcohol
Butylated hydroxytoluene   Glyceryl 1,3-dibehenate          Polyvinylpyrrolidone
Butylparaben               Hydroxyethyl cellulose           Propylparaben
Cellulose                  Hydroxyethyl ethyl cellulose     Retinal
Citric acid                Hydroxyethyl methyl cellulose    Retinoic acid
(Cros)carmellose           Hydroxypropyl cellulose          Retinol
(Cros)carmellose sodium    Hypromellose                     Sodium starch glycolate
Crospovidone               Hypromellose acetate succinate   StarCap 1500
D-Fructose                 Hypromellose phthalate           Starch
D-Galactose                Magnesium stearate               Stearic acid
D-Glucose                  Maleic acid                      Succinic acid
D-Lactitol                 Methyl cellulose                 Triacetin
D-Lactose                  Methylparaben                    Triethyl acetylcitrate
D-Mannitol                 Opadry
D-Mannose                  Phthalic acid
Table 2. Excipients in the Zeneth knowledge base (version 2012.1.1).
Considerable improvement in Zeneth’s knowledge base has taken place during the last four years of its development. Although many principal degradation transformations are already present, the knowledge base continues to grow steadily through implementation of new chemistry that is increasingly structure-specific. Scopes of existing transformations are updated periodically to allow better refinement. In addition, reasoning rules are often revisited to achieve more accurate predictions, taking into consideration the feedback from users. All new knowledge must be supported by literature evidence or by a strong rationale before it is implemented into the knowledge base. A number of users are interested in observing the maturation of the knowledge base. Recently, five pharmaceutical companies have agreed to collectively compare different knowledge base versions against experimental results obtained from forced degradation studies. This benchmarking study will form the basis of a future publication. Any observed degradants missed
by the prediction will also form the grounds for implementing new rules in future knowledge base releases.
Execution of a transformation rule
The software recognizes any query structure entered by the user that contains the “degradophore” for a particular degradation reaction (such as the one of transformation 046 from the example in Scheme 2, Step 3). Provided that the reaction conditions are met, the corresponding products are generated (Scheme 4). Note that the second reaction in Scheme 4 would normally not be shown as the minimum likelihood level is usually set higher than [very unlikely].
Logic flowchart
A simplified logic flow of Zeneth is represented in Figure 5. The main input for Zeneth is the structure of the starting material. In addition, the user may select processing constraints which define the reaction conditions, the number of degradation steps allowed and the likelihood threshold. A chemistry engine begins the processing of the query structure by perceiving structural features such as atoms, bonds, charges and aromaticity. If the structure matches any transformation pattern present in the knowledge base, the program applies the corresponding transformation as explained earlier. The reasoning engine uses the rules present in the knowledge base to estimate the likelihood of the reactions, taking the reaction conditions into consideration. If the likelihood meets the
threshold, the program displays the structure of the degradant along with its ID number, reaction name, comments, references, and literature examples. The program will loop until all transformations have been applied, until all degradants of the generation have been processed, or until the number of steps has been reached. Note that this diagram does not explicitly illustrate the treatment of dimerization reactions; dimerizations are handled as an inner loop within the box “Match transformations … generate degradants”.
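The loop described above can be illustrated with a minimal sketch. It is not Zeneth's implementation: structures are plain strings, a "degradophore" is a substring, and the `Rule` class, the `predict` function and the ordering of the five likelihood levels are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Assumed ordering of the likelihood levels, lowest to highest.
LEVELS = ["very unlikely", "unlikely", "equivocal", "likely", "very likely"]


@dataclass(frozen=True)
class Rule:
    """Toy stand-in for a knowledge base transformation (hypothetical)."""
    name: str
    pattern: str     # substring playing the role of the degradophore
    product: str     # replacement playing the role of the product
    likelihood: str  # step likelihood under the chosen conditions


def predict(query, rules, max_steps, threshold):
    """Generate degradants generation by generation."""
    cutoff = LEVELS.index(threshold)
    results = []  # (degradant, parent, rule name, step, step likelihood)
    current = [query]
    for step in range(1, max_steps + 1):
        next_generation = []
        for structure in current:
            for rule in rules:
                if rule.pattern not in structure:
                    continue  # no pattern match
                if LEVELS.index(rule.likelihood) < cutoff:
                    continue  # below the likelihood threshold
                degradant = structure.replace(rule.pattern, rule.product, 1)
                results.append(
                    (degradant, structure, rule.name, step, rule.likelihood)
                )
                next_generation.append(degradant)
        if not next_generation:
            break  # nothing left to process
        current = next_generation
    return results


rules = [
    Rule("ester hydrolysis", "COOR", "COOH", "very likely"),
    Rule("decarboxylation", "COOH", "H", "unlikely"),
]
for row in predict("Ar-COOR", rules, max_steps=2, threshold="equivocal"):
    print(row)
```

With the threshold at equivocal only the hydrolysis product is reported; lowering it to very unlikely also admits the decarboxylation product in the second generation.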
Result display
Zeneth results are displayed on the screen in the form of an inverted tree and in tabular format (see also Figure 1). In the tree, a color code is used to distinguish the likelihood of degradants. Two forms of likelihood are calculated. The step likelihood gives the likelihood of each individual step, whereas the pathway likelihood considers the full reaction pathway.
Degradation tree with step likelihood
The program searches for transformation rules in the knowledge base that can be applied to the query structure Q. If a pattern is recognized and the reaction conditions are met, the program follows the sequence described in the flowchart (Figure 5) to predict the first generation of degradants (i.e., D1-D3 in Figure 6). If degradation rules can be re-applied to the degradants and if the user wishes to process additional generations, the program will repeat the same process for each degradant, generating structures for the next level of degradation products (D4-D8 and D9-D14, respectively, in Figure 6). The structures of predicted degradants are given along with their corresponding transformation names, comments, references and examples to inform the user about the scope and scientific
basis of the transformation rule. In addition, as explained earlier, a reasoning engine assesses the likelihood of formation for each degradant in the tree, represented by a color code. This is called the step likelihood. It can be concluded from this representation that any degradant with a higher likelihood of formation than its parent (like D11 and D7) might consume it. Therefore, in the example of Figure 6, parent degradants D6 and D3 might only exist transiently and might not be observed experimentally, owing to the more likely subsequent transformation steps leading to D11 and D7, respectively.
Degradation tree with pathway likelihood
In practice, the reaction pathway leading to a degradant of the second or a subsequent generation might have to be taken into consideration to predict its likelihood of formation. This is because the formation of such a degradant depends on the likelihood of all degradants on its pathway. A degradant cannot possibly have a higher likelihood of existence than any other degradant earlier in its path. Therefore, the software can display another likelihood representation that takes the degradation pathway into account, called the pathway likelihood. In this tree representation (Figure 7), the software only allows a degradant to have a likelihood of formation equal to or lower than that of its parent degradant (such as D6 to D11, or D7 to D14). Comparison of the two tree representations (Figure 6 and Figure 7) shows that even though the formation of D11 from D6 is very likely to take place (Figure 6), D11 is unlikely to be formed because of the “bottleneck reaction” leading to D6 (Figure 7).
Reaction path display
The reaction path leading to any product in the tree can be highlighted from the software, such as the degradation pathway from acetylsalicylic acid leading to degradant D26 (Scheme 5). D26 comes from the photo-Fries rearrangement of acetylsalicylic acid Q leading to D3, followed by a decarboxylation reaction to D9 and a Baeyer-Villiger rearrangement. Additional information for each step, such as comments, references, examples and scope of the transformation, can be displayed if needed. Note that the software also displays the structure of intermediates involved in the reactions to help understanding of the reaction mechanism. Both the photo-Fries rearrangement and the decarboxylation reaction proceed through unstable tautomer intermediates (I1 and I7, Scheme 5).
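Tracing a path back to the query structure also gives a convenient way to illustrate the step and pathway likelihoods discussed earlier. The sketch below is an assumption-laden illustration rather than the program's algorithm: the tree is invented (only the very likely step from D6 to D11 and the resulting unlikely pathway follow the text; the parent of D6 and the other values are hypothetical), and the pathway likelihood is taken as the lowest step likelihood along the path.

```python
# Assumed ordering of the likelihood levels, lowest to highest.
LEVELS = ["very unlikely", "unlikely", "equivocal", "likely", "very likely"]
RANK = {name: i for i, name in enumerate(LEVELS)}

# Hypothetical tree: degradant -> (parent, likelihood of the step forming it).
TREE = {
    "D2": ("Q", "likely"),
    "D6": ("D2", "unlikely"),      # the "bottleneck reaction"
    "D11": ("D6", "very likely"),
}


def path_to(degradant, tree):
    """Return the reaction path from the query structure to a degradant."""
    path = [degradant]
    while path[-1] in tree:
        path.append(tree[path[-1]][0])
    return path[::-1]


def pathway_likelihood(degradant, tree):
    """Lowest step likelihood on the path (call for degradants only)."""
    steps = []
    node = degradant
    while node in tree:
        node, level = tree[node]
        steps.append(level)
    return min(steps, key=RANK.__getitem__)


def may_consume_parent(degradant, tree):
    """True if the step forming a degradant is more likely than the step
    that formed its parent, so that the parent may be transient."""
    parent, level = tree[degradant]
    if parent not in tree:
        return False  # the parent is the query structure itself
    return RANK[level] > RANK[tree[parent][1]]


print(path_to("D11", TREE))
print(pathway_likelihood("D11", TREE))
print(may_consume_parent("D11", TREE))
```

Taking the minimum guarantees that no degradant is rated higher than any degradant earlier on its path, which is the constraint stated for the pathway likelihood tree.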
Results table
A results table (see Figure 1) is also provided to facilitate the analysis of results. For each transformation, it lists information such as likelihood, reaction name, formula and exact mass of the degradant, formula gain, formula loss, and (exact) mass difference.
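How the formula gain, formula loss and mass difference columns relate to one another can be shown with a short sketch. The helper names are hypothetical, and the example (ester hydrolysis of acetylsalicylic acid, C9H8O4, to salicylic acid, C7H6O3) only considers the aromatic product; the masses are standard monoisotopic values.

```python
import re
from collections import Counter

# Monoisotopic masses of the most abundant isotopes, in Da.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def parse_formula(formula):
    """Turn a simple molecular formula such as 'C9H8O4' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def exact_mass(counts):
    """Monoisotopic mass for a dictionary of element counts."""
    return sum(MASS[element] * n for element, n in counts.items())


def table_row(parent_formula, degradant_formula):
    """Formula gain, formula loss and exact mass difference for one step."""
    parent = parse_formula(parent_formula)
    degradant = parse_formula(degradant_formula)
    return {
        # Counter subtraction keeps only the positive differences.
        "formula gain": dict(degradant - parent),
        "formula loss": dict(parent - degradant),
        "mass difference": round(exact_mass(degradant) - exact_mass(parent), 4),
    }


print(table_row("C9H8O4", "C7H6O3"))
```

For this step the sketch reports no gain, a loss of C2H2O and a mass difference of about -42.01 Da.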
Prediction of intermolecular reactions
In degradation, intermolecular reactions can take place in a number of ways: combination product interactions, interactions between APIs and excipients (and/or their contaminants), and dimerization reactions.
Dimerizations
As an option, processing constraints in Zeneth can be chosen to allow the query compound to react with itself and with its degradants, as well as to allow degradants to react with themselves (but not with each other, as that would lead to a combinatorial explosion). Collectively, these types of reaction can be classified as “dimerization” reactions. For example, when processing acetylsalicylic acid (two steps, likelihood at least equivocal) without allowing dimerization reactions, 11 degradants are obtained (Scheme 6, Run 1). If the same query compound is also allowed to react with its degradants and with itself, 20 degradants are obtained, such as degradant D8, resulting from an ester hydrolysis followed by an esterification of the phenol D2 with the query compound (Scheme 6, Run 2). If degradants are allowed to react with themselves as well, the prediction returns 30 degradants, such as degradant D8 from the oxidative coupling of phenol D2 (Scheme 6, Run 3).
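The restriction on which species may be paired can be made concrete with a small sketch. The function and option names are hypothetical and do not correspond to Zeneth settings; the point is only to compare the number of allowed reactant pairs with the number that would arise if every species could react with every other.

```python
from itertools import combinations_with_replacement


def reactant_pairs(query, degradants,
                   query_reactions=False, degradant_self_reactions=False):
    """Pairs of species allowed to react under the dimerization options."""
    pairs = []
    if query_reactions:
        pairs.append((query, query))                 # query with itself
        pairs.extend((query, d) for d in degradants)  # query with a degradant
    if degradant_self_reactions:
        pairs.extend((d, d) for d in degradants)      # degradant with itself
    return pairs


def unrestricted_pair_count(query, degradants):
    """Number of pairs if any two species were allowed to react."""
    species = [query, *degradants]
    return sum(1 for _ in combinations_with_replacement(species, 2))


degradants = [f"D{i}" for i in range(1, 12)]  # 11 first-pass degradants
allowed = reactant_pairs("Q", degradants, True, True)
print(len(allowed), unrestricted_pair_count("Q", degradants))
```

For n degradants the restricted scheme gives 2n + 1 pairs, whereas the unrestricted count grows quadratically as (n + 1)(n + 2)/2, and each additional generation compounds the difference.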
Reactions of combination products
Combination product interactions can be predicted by processing one structure in the presence of one or several others. As an example, the combination product of acetylsalicylic acid and ascorbic acid, under the previous conditions, allowing two generations with a likelihood threshold set to equivocal, yields 170 degradants, such as D146 in Scheme 7 (first run), coming from the esterification of the query compound with the primary alcohol of ascorbic acid, followed by an elimination reaction opening the lactone.
Reaction with excipients and contaminants
The knowledge base contains a number of excipient structures which can be selected from a drop-down menu to be processed together with the API to predict possible intermolecular reactions. On its own, acetylsalicylic acid leads to the prediction of 11 degradants when processed over two generations with the previous reaction conditions, but Zeneth returns 14 degradants when it is processed in the presence of the anti-adherent magnesium stearate (more specifically, the stearic acid present in this excipient), such as degradant D11 (Scheme 7, second run), coming from the esterification of the phenol moiety of degradant D3 by stearic acid. In addition, the user can process APIs with impurities that are commonly present in the excipients. A list of known contaminants has been cross-referenced to the list of excipients in the knowledge base and is updated periodically. For example, when the user processes acetylsalicylic acid with the excipient D-glucose, Zeneth offers the possibility of adding the common contaminant 5-hydroxymethylfurfural (5-HMF) to the prediction. If these two compounds are taken into consideration, Zeneth returns 235 degradants over two generations, such as degradant D33 in Scheme 7 (third run), coming from the esterification of carboxylic acid degradant D2 with contaminant 5-HMF.
Note that the high number of degradants can be handled easily by a number of filters available within the software, able to highlight the prediction results.
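The cross-referencing of excipients to their reactive components and common contaminants can be pictured as a pair of lookup tables. The sketch below is purely illustrative: the dictionaries contain only the two examples mentioned in the text, and the function name and data layout are assumptions, not a description of how the knowledge base stores this information.

```python
# Hypothetical excerpts; both entries follow the examples given in the text.
REACTIVE_SPECIES = {"Magnesium stearate": ["Stearic acid"]}
CONTAMINANTS = {"D-Glucose": ["5-Hydroxymethylfurfural"]}


def co_reactants(excipients, include_contaminants=False):
    """Species to process together with the API for the chosen excipients."""
    species = []
    for excipient in excipients:
        # Use the reactive component if one is recorded, else the excipient.
        species.extend(REACTIVE_SPECIES.get(excipient, [excipient]))
        if include_contaminants:
            species.extend(CONTAMINANTS.get(excipient, []))
    # Remove repeats while preserving the order of first appearance.
    return list(dict.fromkeys(species))


print(co_reactants(["Magnesium stearate"]))
print(co_reactants(["D-Glucose"], include_contaminants=True))
```

Keeping contaminants optional mirrors the text, where adding 5-HMF to the D-glucose run is offered as a choice rather than applied automatically.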
Filtering results
It is possible to switch off Zeneth’s reasoning engine and therefore apply all transformations whose reaction patterns match, regardless of the reaction conditions, for as many generations as desired. This will often lead to a combinatorial explosion of transformations, giving a large number of degradants. Similarly, setting the threshold of likelihood to (very) unlikely could cause many degradants to be generated. This unconstrained mode of processing can still be meaningful because Zeneth can filter structures in a number of ways to help focus on specific results.
Forced degradation of leflunomide: mass and duplicate filters
In addition to having to understand the chemical behavior of a product, the forced degradation scientist has the difficult task of trying to characterize and, if possible, isolate traces of degradants observed in stress testing experiments. These degradants are often numerous, and can co-elute or elute very close to each other on a standard HPLC or LC-MS method, making interpretation difficult. Most of the time, the only experimental information available to aid in solving the structure of a degradant is a mass spectrum obtained by LC-MS along with its mass fragmentation information.39 Trying to deduce the structure of a degradant using only mass spectrometry data can be a difficult exercise, especially in the absence of any intuitive guesses. Zeneth has the ability to filter possible degradants by exact mass to quickly propose degradation pathways leading to compounds with a particular observed mass. Several degradants might have
the same exact mass, but Zeneth’s assessment of the likelihood of different degradation pathways offers further guidance towards final structure elucidation. Duplicate degradants can indeed be generated via different reactions or different orders of reactions within a multi-step pathway. It is possible to leave out duplicate structures coming from redundant or longer pathways. In Figure 8, D4, D4’ and D4’’ are identical structures coming from different reaction pathways, but degradants D4’ and D4’’ can be hidden from the reaction tree because they come from a longer reaction pathway or from a less likely intermediate, respectively. As a (hypothetical) example, let us assume that during the forced degradation of leflunomide the exact mass of an unknown degradant was found to be 143.022 Dalton. Zeneth predicts 96 possible degradants from leflunomide over three generations at pH 13 in the presence of water, oxygen and light (Figure 9, details not shown). The mass filter option allows the results to be restricted to a specific exact mass, here 143.02 (with an appropriate tolerance; 0.01 was used), and returns in this case only five degradants. Of those degradants, three are identical but come from different reaction pathways with different likelihoods; the redundant copies can be filtered out by choosing not to display the less likely duplicate structures (Figure 9). Human judgement is often required for the final decisions in the elucidation of the degradant structure. The reaction pathways generated by the program can be examined taking into account the number of steps in the reaction pathways and the likelihood of the steps involved. In the example of leflunomide (Scheme 8), it would appear that structure D10 is the more likely degradant able to give an exact mass of 143.02 under the selected conditions, because the other isomer D45 would require an additional isomerization step (degradants lower in the tree tend to be less likely than higher ones).
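The mass and duplicate filters can be sketched as two small functions. Everything in the example is hypothetical except the target mass (143.02), the tolerance (0.01) and the outcome that five hits reduce to three distinct structures: the `Hit` record, the structure keys, most labels, step counts and likelihoods are invented, and ranking duplicates first by pathway likelihood and then by path length is an assumption about how "less likely or longer" might be resolved.

```python
from dataclasses import dataclass

# Assumed ordering of the likelihood levels, lowest to highest.
LEVELS = ["very unlikely", "unlikely", "equivocal", "likely", "very likely"]
RANK = {name: i for i, name in enumerate(LEVELS)}


@dataclass(frozen=True)
class Hit:
    label: str
    structure: str           # stand-in for a canonical structure identifier
    exact_mass: float
    steps: int
    pathway_likelihood: str


def mass_filter(hits, target, tolerance=0.01):
    """Keep hits whose exact mass lies within the tolerance of the target."""
    return [h for h in hits if abs(h.exact_mass - target) <= tolerance]


def drop_duplicates(hits):
    """Keep one hit per structure: the most likely, then the shortest path."""
    def score(hit):
        return (RANK[hit.pathway_likelihood], -hit.steps)

    best = {}
    for hit in hits:
        kept = best.get(hit.structure)
        if kept is None or score(hit) > score(kept):
            best[hit.structure] = hit
    return list(best.values())


hits = [  # invented data for illustration
    Hit("D10", "A", 143.022, 2, "likely"),
    Hit("D10'", "A", 143.022, 3, "likely"),
    Hit("D10''", "A", 143.022, 3, "equivocal"),
    Hit("D45", "B", 143.022, 3, "equivocal"),
    Hit("D50", "C", 143.022, 3, "unlikely"),
    Hit("D7", "D", 160.050, 1, "likely"),
]
matching = mass_filter(hits, 143.02)
print([h.label for h in drop_duplicates(matching)])
```

The ranking deliberately leaves distinct isomers such as D10 and D45 in the list, since choosing between them is the step for which the text calls on human judgement.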
Forced degradation of lornoxicam: molecular formula filter
It is also possible to filter large numbers of result structures by molecular formula obtained via accurate mass spectrometry. For example, the experimental forced degradation of lornoxicam in hydrolytic, oxidative and photolytic conditions gives an unknown degradant with an exact mass value of m/z = 250.9478, which corresponds to the molecular formula C7H6ClNO3S2.40 Running Zeneth at pH 1 in the presence of water, oxygen and light, allowing three steps and a likelihood of at least likely, gives 40 possible degradants. With the molecular formula filter set to C7H6ClNO3S2, Zeneth returns one structure, in accordance with the experimental finding,40 through a three-step reaction sequence with high likelihood (Scheme 9).
Forced degradation of D-fructose: structure filter
Full characterization of a degradant should be supported by a plausible suggested route of formation. In other words, analytical evidence should be in agreement with the chemistry that can reasonably take place during stress testing. Therefore, as well as having to fully characterize a degradant, the forced degradation scientist should clarify how the degradant is formed from the starting material. A structure filter can be used to locate an experimentally characterized degradant in the degradation tree, after which the reaction pathway can be examined in order to understand the transformation steps and their mechanisms. As an example, the structure of maltol (Scheme 10-A), a known degradant from ketohexoses, can be found in a tree of 216 degradants generated by the processing of D-fructose in water at pH 11 over four generations with a likelihood threshold set to likely, and its reaction pathway can be highlighted (Scheme 10-B). D-Fructose (Q in Scheme 10-B) undergoes a hydroxyl elimination via the six-membered ring transition state of tautomer I3, leading to enol intermediate I4, which gives D6 after tautomerization.
Note that intermediate I4 is formed via a rearrangement reaction
similar to the Amadori rearrangement of N-glycosides involved in the Maillard reaction.41 Internal hemiacetalization of D6 takes place leading to D38, followed by a hydroxyl elimination to form the α,β-unsaturated ketone intermediate D139 which dehydrates through a tautomer intermediate I148 to lead to maltol D386.42
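Returning to the molecular formula filter used in the lornoxicam example, the sketch below compares element counts rather than formula strings, so that differently ordered formulae still match, and recomputes the monoisotopic mass of C7H6ClNO3S2 as a consistency check against the quoted value of 250.9478. The candidate list and function names are hypothetical; the isotope masses are standard values.

```python
import re
from collections import Counter

# Monoisotopic masses of the most abundant isotopes, in Da.
MASS = {
    "C": 12.0, "H": 1.00782503, "N": 14.0030740,
    "O": 15.9949146, "S": 31.9720707, "Cl": 34.9688527,
}


def parse_formula(formula):
    """Turn a simple molecular formula into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def monoisotopic_mass(formula):
    """Monoisotopic mass of the neutral formula."""
    return sum(MASS[e] * n for e, n in parse_formula(formula).items())


def formula_filter(candidates, target):
    """Names of candidates whose element counts equal those of the target."""
    wanted = parse_formula(target)
    return [name for name, formula in candidates
            if parse_formula(formula) == wanted]


candidates = [  # invented (name, formula) pairs for illustration
    ("D5", "C13H10ClN3O4S2"),
    ("D21", "C7H6NO3S2Cl"),   # same composition, different element order
    ("D30", "C7H7NO4S2"),
]
print(formula_filter(candidates, "C7H6ClNO3S2"))
print(round(monoisotopic_mass("C7H6ClNO3S2"), 4))
```

A structure filter would need genuine structure matching and is not attempted here; the formula comparison only narrows the tree to candidates of the right composition.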
Creation of reports
Prediction results from Zeneth can be stored as a Zeneth file to be re-opened later (from within the program). The results can also be presented as reports in a number of standard formats such as TXT, RTF, SDF32 and XLS files. The reports contain a range of information such as program and knowledge base version, details of the reasoning, reaction conditions used, filters applied, reaction name, and reaction pathway with associated likelihood, showing the structures of the starting material, intermediates (if any) and resulting degradants. Additional information such as reaction description with its scope, references, examples and comments can be reported if required.
Discussion
Scope
The main use case of the Zeneth program is to assist in the structure determination of products formed during forced degradation of APIs and the elucidation of their pathways of formation. The program does not aim to estimate stability, quantities of products or rates of formation, although the likelihoods it produces can be expected to correlate with all of these. Zeneth does aim to predict all possible degradation reactions, and for this goal a primary requirement is a
comprehensive knowledge base. While the knowledge base has reached some degree of maturity, it can certainly not be considered comprehensive at this moment. The evaluation of knowledge base development and coverage will be the subject of a forthcoming paper. The area of chemistry in which the program performs best is that of small molecules. In particular, proteins are outside its scope, as are polymers. However, it is possible to process small oligopeptides, and the knowledge base contains a number of transformations that are relevant to amino acid residues. Query structures must, in general, be provided in a “neutral” form (i.e., neither protonated nor deprotonated) regardless of the pH. The patterns, together with the pH profiles, are designed to address the pH-dependency of reactions.
Limitations
As the knowledge base has been growing steadily, the number of degradation pathways that the system predicts has grown exponentially. This combinatorial explosion is especially severe with compounds that contain many functional groups and also when compounds are being processed with excipients and their contaminants, in particular with excipients such as sugars which contain dense functionality themselves. In general, not all of the suggested products will be observed experimentally. However, the knowledge base is still in a relatively early phase of its growth and until now it has been considered more important to try to minimize the number of unpredicted observed transformations rather than trying to minimize the number of non-observed predictions. Nevertheless, it is clear that the accuracy of predictions will have to improve. Two avenues are available to effect this improvement. Firstly, reasoning rules can be implemented which control the growth of multi-step pathways. Current research is focused on the identification of such rules which, of course, should not prevent the generation of likely pathways. Secondly, the program
should be able to discriminate better between good and bad applications of a particular transformation. Currently, the likelihood of a transformation is fixed and cannot be modulated by ancillary structural features – a different likelihood can only be obtained through another transformation with a different pattern. Modulation of transformation likelihoods by way of calculable properties (e.g., a base-catalyzed epimerisation that depends on the pKa of the stereocenter) would be a better method. This will be investigated in the future. As mentioned earlier, dependence on temperature is currently not fully treated in the system. Only “thermal” (non-catalyzed) reactions such as eliminations, fragmentations and rearrangements have been provided with a “temperature profile” analogous to the pH profiles. In a later stage of development, reasoning rules will be designed that can handle temperature and pH profiles simultaneously. Such rules must be checked carefully against experimental data, where available. Tautomerism is another area which will require more attention in the near future. The ability of the program to deal adequately with tautomeric structures is essential in order to recognize the many ways a structure can sometimes be represented. While the system does contain a small knowledge base of tautomeric conversions, it is currently not equipped to deal with the (often redundant) products from tautomers of the API (and of its degradants), causing serious inefficiency. The problem of how to take tautomers into account without introducing redundancy is currently being investigated. When it is solved, more tautomeric conversions will be implemented. Finally, a prediction system that does not perform extensive (and CPU-intensive) calculations to reach its conclusions can provide only approximate answers and will always need the expertise
of the user who will have the final say regarding the likelihood of degradation products and their pathways.
Conclusion
A computer system that aims to predict the forced degradation pathways of organic compounds has been described. Even though its development started only relatively recently, Zeneth is able to predict plausible pathways taking into account reaction conditions and other compounds such as excipients. Two of the main advantages of the system are total recall of information and the absence of bias (offset somewhat by the fact that the knowledge base will never contain all of chemistry). A further major benefit is the steady accumulation of knowledge. Zeneth can help analytical chemists to anticipate, analyze and understand the results of forced degradation studies and to reduce their requirement for expertise in organic chemistry. Also, for the novice bench scientist, Zeneth can serve as a means of training and for benchmarking of personal skills. Last but not least, degradation experts can use the system to enhance their productivity. The knowledge base undergoes continuous expansion and is made available to users twice a year. This process is undertaken by Lhasa’s scientists and benefits from data supplied by users. A simple editor interface allows users to create their own knowledge base if needed, making the application flexible and extensible. Consequently, it is fair to say that the knowledge base and its associated prediction quality can only improve with time and with the number of users.
Acknowledgements
We thank the members of the steering committee, Steve Baertschi, Patrick Jansen and Kimmer Smith from Eli Lilly & Co; Rhonda Jackson, Dave Breslin and Steven Hostyn from Johnson & Johnson; Mark Kleinman and Dan Reynolds from GlaxoSmithKline; Darren Reid and Roman Shimanovich from Amgen; Karen Alsante, Weitao Pan, Dinos Santafianos and Chris Foti from Pfizer, for their strategic directions, sharing of valuable knowledge and monitoring of the software performance. We are especially grateful to Steve Baertschi who provided the initial impetus for the project. Finally, we are indebted to Carol Marchant from Lhasa Limited for her valuable advice on the lay-out of the manuscript.
References
(1) Ngwa, G. Forced Degradation as an Integral Part of HPLC Stability-Indicating Method Development. Drug Delivery Technol. 2010, 10, 56–59.
(2) Reynolds, D. W.; Facchine, K. L.; Mullaney, J. F.; Alsante, K. M.; Hatajik, T. D.; Motto, M. G. Available Guidance and Best Practices for Conducting Forced Degradation Studies. Pharm. Technol. 2002, 48–56.
(3) ICH 2003. Q1A(R2): Stability Testing of New Drug Substances and Products. Retrieved June 3, 2013 from http://www.ich.org/products/guidelines/quality/article/quality-guidelines.html.
(4) ICH 1996. Q1B: Stability Testing: Photostability Testing of New Drug Substances and Products. Retrieved June 3, 2013 from http://www.ich.org/products/guidelines/quality/article/quality-guidelines.html.
(5) ICH 2006. Q3A(R2): Impurities in New Drug Substances. Retrieved June 3, 2013 from http://www.ich.org/products/guidelines/quality/article/quality-guidelines.html.
(6) ICH 2006. Q3B(R2): Impurities in New Drug Products. Retrieved June 3, 2013 from http://www.ich.org/products/guidelines/quality/article/quality-guidelines.html.
(7) Brown, F. K. Chemoinformatics: What is it and How does it Impact Drug Discovery. In Annual Reports in Medicinal Chemistry, volume 33; Bristol, J. A., Ed.; Academic Press, 1998; pp 375–384.
(8) Brown, F. Editorial opinion: chemoinformatics - a ten year update. Curr. Opin. Drug Discovery Dev. 2005, 8, 298–302.
(9) Gasteiger, J.; Engel, T. Chemoinformatics: A Textbook; Wiley-VCH: Weinheim, 2003; 649 p.
(10) Judson, P. Knowledge-based expert systems in chemistry: not counting on computers; Royal Society of Chemistry: Cambridge, 2009; pp 16–34.
(11) Ugi, I. K.; Bauer, J.; Baumgartner, R.; Fontain, E.; Forstmeyer, D.; Lohberger, S. Computer assistance in the design of syntheses and a new generation of computer programs for the solution of chemical problems by molecular logic. Pure Appl. Chem. 1988, 60, 1573–1586.
(12) Jackson, P. Introduction to Expert Systems, 3rd ed.; Addison Wesley: Harlow, 1999; 542 p.
(13) Jorgensen, W. L.; Laird, E. R.; Gushurst, A. J.; Fleischer, J. M.; Gothe, S. A.; Helson, H. E.; Paderes, G. D.; Sinclair, S. CAMEO: a program for the logical prediction of the products of organic reactions. Pure Appl. Chem. 1990, 62, 1921–1932.
(14) Hollering, R.; Gasteiger, J.; Steinhauer, L.; Schulz, K. P.; Herwig, A. Simulation of Organic Reactions: From the Degradation of Chemicals to Combinatorial Synthesis. J. Chem. Inf. Comput. Sci. 2000, 40, 482–494.
(15) Socorro, I. M.; Taylor, K.; Goodman, J. M. ROBIA: a reaction prediction program. Org. Lett. 2005, 7, 3541–3544.
(16) Socorro, I. M.; Goodman, J. M. The ROBIA program for predicting organic reactivity. J. Chem. Inf. Model. 2006, 46, 606–614.
(17) Gasteiger, J.; Bauerschmidt, S.; Burkard, U.; Hemmer, M. C.; Herwig, A.; Von Homeyer, A.; Höllering, R.; Kleinöder, T.; Kostka, T.; Schwab, C.; Selzer, P.; Steinhauer, L. Decision support systems for chemical structure representation, reaction modeling, and spectra simulation. SAR QSAR Environ. Res. 2002, 13, 89–110.
(18) Law, J.; Zsoldos, Z.; Simon, A.; Reid, D.; Liu, Y.; Khew, S. Y.; Johnson, A. P.; Major, S.; Wade, R. A.; Ando, H. Y. Route Designer: a retrosynthetic analysis tool utilizing automated retrosynthetic rule generation. J. Chem. Inf. Model. 2009, 49, 593–602.
(19) Huang, Q.; Li, L.-L.; Yang, S.-Y. RASA: A Rapid Retrosynthesis-Based Scoring Method for the Assessment of Synthetic Accessibility of Drug-like Molecules. J. Chem. Inf. Model. 2011, 51, 2768–2777.
(20) Corey, E. J.; Long, A. K.; Rubenstein, S. D. Computer-Assisted Analysis in Organic Synthesis. Science 1985, 228, 408–418.
(21) Johnson, A. P.; Marshall, C. Starting material oriented retrosynthetic analysis in the LHASA program. 3. Heuristic estimation of synthetic proximity. J. Chem. Inf. Comput. Sci. 1992, 32, 426–429.
(22) Ott, M. A.; Noordik, J. H. Long-range strategies in the LHASA program: the quinone Diels-Alder transform. J. Chem. Inf. Comput. Sci. 1997, 37, 98–108.
(23) Fenner, K.; Gao, J.; Kramer, S.; Ellis, L.; Wackett, L. Data-driven extraction of relative reasoning rules to limit combinatorial explosion in biodegradation pathway prediction. Bioinformatics 2008, 24, 2079–2085.
(24) Durant, J. L.; Leland, B. A.; Nourse, J. G. VET: a tool for reaction plausibility checking. J. Chem. Inf. Model. 2006, 46, 762–766.
(25) Matthews, E. J.; Kruhlak, N. L.; Benz, R. D.; Contrera, J. F.; Marchant, C. A.; Yang, C. Combined Use of MC4PC, MDL-QSAR, BioEpisteme, Leadscope PDM, and Derek for Windows Software to Achieve High-Performance, High-Confidence, Mode of Action-Based Predictions of Chemical Carcinogenesis in Rodents. Toxicol. Mech. Methods 2008, 18, 189–206.
(26) Marchant, C. A.; Briggs, K. A.; Long, A. In silico tools for sharing data and knowledge on toxicity and metabolism: Derek for Windows, Meteor, and Vitic. Toxicol. Mech. Methods 2008, 18, 177–187.
(27) Lee, P. H.; Rafferty, M. F. In Silico Pharmaceutical Property Prediction. Mol. Pharmaceutics 2007, 4, 487–488.
(28) Pole, D. L.; Ando, H. Y.; Murphy, S. T. Prediction of drug degradants using DELPHI: an expert system for focusing knowledge. Mol. Pharmaceutics 2007, 4, 539–549.
(29) Baertschi, S. W.; Alsante, K. M.; Reed, R. A. Pharmaceutical Stress Testing: Predicting Drug Degradation, 2nd ed.; Informa Healthcare: London, 2011; 624 p.
(30) Baertschi, S. W.; Alsante, K. M.; Santafianos, D. Stress Testing: The Chemistry of Drug Degradation. In Pharmaceutical Stress Testing, 2nd ed.; Baertschi, S. W.; Alsante, K. M.; Reed, R. A., Eds.; Informa Healthcare: London, 2011; pp 49–141.
(31) The Zeneth program has not yet been fully tested on Windows 8 platforms.
(32) Dalby, A.; Nourse, J. G.; Hounshell, W. D.; Gushurst, A. K. I.; Grier, D. L.; Leland, B. A.; Laufer, J. Description of several chemical structure file formats used by computer programs developed at Molecular Design Limited. J. Chem. Inf. Comput. Sci. 1992, 32, 244–255.
(33) Li, M. Organic Chemistry of Drug Degradation; RSC Publishing: Cambridge, 2012; 305 p.
(34) Drug Degradation Database. Retrieved June 3, 2013 from http://d3.cambridgesoft.com.
(35) Accelrys Metabolite. Retrieved November 15, 2012 from http://accelrys.com/products/databases/bioactivity/metabolite.html.
(36) Button, W. G.; Judson, P. N.; Long, A.; Vessey, J. D. Using absolute and relative reasoning in the prediction of the potential metabolism of xenobiotics. J. Chem. Inf. Comput. Sci. 2003, 43, 1371–1377.
(37) Panderi, I.; Archotaki, H.; Gikas, E.; Parissi-Poulou, M. Acidic hydrolysis of bromazepam studied by high performance liquid chromatography. Isolation and identification of its degradation products. J. Pharm. Biomed. Anal. 1998, 17, 327–335.
(38) The most recent version of Zeneth is version 5 with knowledge base version 2012.2.0 containing 277 transformations, released January 7th 2013.
(39) Gonsalves, A. R.; Pineiro, M.; Martins, J. M.; Barata, P. A.; Menezes, J. C. Identification of Alprazolam and its degradation products using LC-MS-MS. ARKIVOC 2010, 128–141.
(40) Modhave, D. T.; Handa, T.; Shah, R. P.; Singh, S. Stress degradation studies on lornoxicam using LC, LC-MS/TOF and LC-MS(n). J. Pharm. Biomed. Anal. 2011, 56, 538–545.
(41) Nursten, H. The Maillard Reaction: Chemistry, Biochemistry and Implications; The Royal Society of Chemistry: Cambridge, 2005.
(42) Yaylayan, V. A.; Mandeville, S. Stereochemical Control of Maltol Formation in Maillard Reaction. J. Agric. Food Chem. 1994, 42, 771–775.
Journal: Molecular Pharmaceutics
Manuscript ID: mp-2013-00083h
Title: “An expert system to predict the forced degradation of organic molecules”
Author(s): Parenty, Alexis D. C.; Button, William G.; Ott, Martin A.
List of Tables
Number of transformations present in the knowledge base by year

Transformation categories            2010   2011   2012
Hydrolyses                             44     55     56
Oxidations                             41     56     76
Condensations and additions            27     42     42
Eliminations and fragmentations        17     20     25
Isomerization and rearrangements       19     23     25
Photochemical reactions                14     19     19
Total                                 162    215    243
Table 1. Knowledge base status of Zeneth.
Benzaldehyde              D-Sorbitol                      Polyethylene glycol
Benzyl alcohol            Ethyl cellulose                 Polylactide
Benzylparaben             Ethylparaben                    Polyvinyl alcohol
Butylated hydroxytoluene  Glyceryl 1,3-dibehenate         Polyvinylpyrrolidone
Butylparaben              Hydroxyethyl cellulose          Propylparaben
Cellulose                 Hydroxyethyl ethyl cellulose    Retinal
Citric acid               Hydroxyethyl methyl cellulose   Retinoic acid
(Cros)carmellose          Hydroxypropyl cellulose         Retinol
(Cros)carmellose sodium   Hypromellose                    Sodium starch glycolate
Crospovidone              Hypromellose acetate succinate  StarCap 1500
D-Fructose                Hypromellose phthalate          Starch
D-Galactose               Magnesium stearate              Stearic acid
D-Glucose                 Maleic acid                     Succinic acid
D-Lactitol                Methyl cellulose                Triacetin
D-Lactose                 Methylparaben                   Triethyl acetylcitrate
D-Mannitol                Opadry
D-Mannose                 Phthalic acid
Table 2. Excipients in the Zeneth knowledge base (version 2012.1.1).
List of Figures and Schemes
Scheme 1: Two identical structures of a product, which at first sight might appear different to the human eye.
Figure 1. A prediction result (for bromazepam; conditions: pH 1, water, light) as displayed by Zeneth, showing the summary and detail trees and the results table.
Figure 2. Schematic overview of the knowledge building process in the Zeneth system.
Scheme 2. Taking into account the mechanism, geometric considerations, reaction conditions and general reactivity principles, the degradation is generalized and a new rule is created in the knowledge base (Steps 1 to 3). The “degradophore” is the characteristic substructure required for a reaction type.
Figure 3. Examples of pH profiles showing how pH affects the likelihood of pH-dependent reactions.
Figure 4. Excerpt from Zeneth’s absolute reasoning rule editor screen showing how conditions influence likelihoods: transformation 006 (dehydration of hydroxylamine to imine) depends on the pH and transformation 011 (oxidation of thioether) depends on the presence of oxidants.
Scheme 3. Relative reasoning rules present in the knowledge base are used by the reasoning engine to assess competing reaction pathways and eliminate less likely degradants (formed more slowly), in this case D2.
Scheme 4. The program analyzes the query structure and generates the corresponding degradant structure when a “degradophore” in the knowledge base is recognized (i.e., a pattern of a transformation has matched and the corresponding reasoning rule, describing the conditions, has been satisfied).
Figure 5. The logic flow of Zeneth showing the processing of a starting material into its degradants.
Figure 6. Example of a three-generation degradation tree showing the step likelihood of transformations by color coding.
Figure 7. Example of a three-generation degradation tree showing the pathway likelihood leading to the degradants.
Scheme 5. A reaction pathway prediction for degradant D26 from acetylsalicylic acid processed by Zeneth at pH 1 in the presence of water, oxygen, metal, radical initiator, peroxide and light. Electron-pushing arrows and two explicit hydrogens have been added to show the mechanisms more clearly.
Scheme 6. How different types of dimerization reactions can be controlled. Processing constraints common to all three runs: likelihood at least equivocal, two steps, and conditions pH 7, water, oxygen, peroxide, radical initiator, metal and light.
Scheme 7. Examples of degradation predictions for acetylsalicylic acid processed with various other compounds (ascorbic acid, the acids in magnesium stearate, and D-glucose with contaminant 5-HMF, respectively). Processing constraints: likelihood at least equivocal, two steps, and conditions pH 7, water, oxygen, peroxide, radical initiator, metal and light.
Figure 8. Three pathways leading to the same degradant (D4 = D4’ = D4’’). Q is the query structure, Dn is a degradant, L = likely step, and VL = very likely step. The two structures coming from the longer reaction path or from a less likely first-generation degradant can be hidden from the tree.
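The selection described in the caption of Figure 8 can be illustrated with a minimal sketch. It is not Zeneth's implementation: the `Pathway` record, the `preferred` function, the numeric ranks and the three example routes are hypothetical, and the tie-breaking order (shorter path first, then the more likely first step) is an assumption read from the caption.

```python
from dataclasses import dataclass

# Likelihood labels as used in the captions; the numeric ranks are an assumption.
RANK = {"equivocal": 0, "L": 1, "VL": 2}

@dataclass(frozen=True)
class Pathway:
    degradant: str   # identifier of the final structure (e.g. "D4")
    steps: tuple     # step likelihoods, starting from the query Q

def preferred(pathways):
    """Keep one pathway per degradant: shortest first, then likeliest first step."""
    best = {}
    for p in pathways:
        # Smaller key wins: fewer steps, then higher rank of the first step.
        key = (len(p.steps), -RANK[p.steps[0]])
        if p.degradant not in best or key < best[p.degradant][0]:
            best[p.degradant] = (key, p)
    return {name: entry[1] for name, entry in best.items()}

# Hypothetical routes to the same degradant (D4 = D4' = D4'').
routes = [
    Pathway("D4", ("VL", "L")),       # two steps, very likely first step
    Pathway("D4", ("L", "L")),        # two steps, less likely first step
    Pathway("D4", ("VL", "L", "L")),  # longer path
]
kept = preferred(routes)
print(kept["D4"].steps)
```

In this sketch the other two routes are only hidden from the result, mirroring the caption's wording that duplicate structures "can be hidden from the tree" rather than deleted.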
Figure 9. Schematic showing the three usual phases necessary to elucidate the structure of an unknown degradant knowing its exact mass.
Scheme 8. Proposed reaction pathways, leading to the two degradants D10 and D45 with an exact mass of 143.02, filtered out from the 96 possible degradants of leflunomide obtained after processing at pH 13, in the presence of water, oxygen and light.
Scheme 9. Zeneth’s proposed degradation pathway of lornoxicam to a degradant structure with molecular formula C7H6ClNO3S2 (processed with pH = 1, water, oxygen and light).
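Filtering predicted degradants by exact mass or molecular formula, as in Schemes 8 and 9, can be sketched as follows. This is an illustration under stated assumptions, not Zeneth's code: `exact_mass`, `filter_by_mass`, the 0.01 tolerance and the candidate labels X1 and X2 are hypothetical, and only neutral monoisotopic masses of simple formulas (no brackets, charges or adducts) are handled. The one formula taken from the text is C7H6ClNO3S2 from Scheme 9.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207100,
    "Cl": 34.96885268,
}

def exact_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C7H6ClNO3S2'."""
    mass = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[symbol] * (int(count) if count else 1)
    return mass

def filter_by_mass(candidates, target, tolerance=0.01):
    """Return labels of candidates whose mass is within tolerance of target."""
    return [
        label
        for label, formula in candidates.items()
        if abs(exact_mass(formula) - target) <= tolerance
    ]

# Hypothetical candidate list; X1 carries the formula quoted in Scheme 9.
candidates = {"X1": "C7H6ClNO3S2", "X2": "C6H6O"}
print(round(exact_mass("C7H6ClNO3S2"), 4))
print(filter_by_mass(candidates, target=250.95))
```

A fixed absolute tolerance keeps the sketch short; a tolerance expressed in ppm of the measured mass would be a natural alternative.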
Scheme 10. A) Zeneth’s degradation prediction (summarized) for D-fructose (constraints: four steps, likelihood at least likely and conditions pH 11 and water). B) Retrieval of the reaction pathway leading to maltol using the structure of maltol as a filter.
For Table of Contents Use Only
Journal: Molecular Pharmaceutics
Manuscript ID: mp-2013-00083h
Title: “An expert system to predict the forced degradation of organic molecules”
Author(s): Parenty, Alexis D. C.; Button, William G.; Ott, Martin A.
Abstract Graphic (.tif)
Abstract Graphic (.png)