Logistic Classification Models for pH-Permeability Profile: Predicting


Pharmaceutical Modeling

Logistic Classification Models for pH-Permeability Profile: Predicting Permeability Classes for the Biopharmaceutical Classification System Mare Oja, Sulev Sild, and Uko Maran J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.8b00833 • Publication Date (Web): 21 Feb 2019 Downloaded from http://pubs.acs.org on February 24, 2019


Journal of Chemical Information and Modeling

Logistic Classification Models for pH-Permeability Profile: Predicting Permeability Classes for the Biopharmaceutical Classification System Mare Oja, Sulev Sild, Uko Maran* Institute of Chemistry, University of Tartu, Ravila 14A, Tartu 50411, Estonia *corresponding author [email protected], phone +372 7 375 254, fax +372 7 375 264

1 ACS Paragon Plus Environment


Abstract

Permeability is used to describe and evaluate the absorption of drug substances in the human gastrointestinal tract (GIT). Permeability depends largely on the fluctuating pH, which causes the ionization of drug substances and also influences regional absorption in the GIT. Therefore, classification models that characterize permeability over a wide range of pH values were derived in the current study. For this, drug substances were described with six data series measured with a parallel artificial membrane permeability assay (PAMPA): a permeability profile at four pH values (3, 5, 7.4 and 9), the highest membrane permeability, and the intrinsic membrane permeability. Logistic regression classification models were developed and compared using two distinct sets of descriptors: 1) a hydrophobicity descriptor, the logarithm of the octanol-water partition (logPow) or distribution (logD) coefficient, and 2) theoretical molecular descriptors. In both cases, the models have good classification and descriptive capabilities for the training set (accuracy: 0.76 to 0.91). Triple validation with three sets of drug substances shows good prediction capability for all models: validation set (accuracy: 0.73 to 0.91), external validation set (accuracy: 0.72 to 0.90) and the permeability classes of FDA reference drugs for the biopharmaceutical classification system (BCS) (accuracy: 0.72 to 0.88). The identification of BCS permeability classes was further improved with decision trees that consolidated the predictions from the models of each descriptor type. These decision trees have higher confidence and accuracy (0.91 for theoretical molecular descriptors and 0.81 for hydrophobicity descriptors) than the individual models in assigning drug substances to BCS permeability classes. A detailed analysis of the classification models and related decision trees suggests that they are suitable for predicting permeability classes for passively transported drug substances, specifically also within the BCS framework. All developed models are available at the QsarDB repository (http://dx.doi.org/10.15152/QDB.206).


1. Introduction

Absorption in the gastrointestinal tract (GIT) is an important parameter for drug substances and drug candidates for many reasons. Above all, oral delivery is the preferred, cost-effective and convenient way to administer drugs. The pharmaceutical effect of orally administered drugs depends greatly on the absorption and distribution of the active substance. Absorption depends on the solubility and permeability of the drug substance and on other factors related to the GIT. Analyses have shown that poor pharmacokinetics and pharmacodynamics are among the most frequent reasons for drug attrition 1. The regions of the human GIT have different properties, so absorption varies regionally 2,3. One factor that affects regional absorption is the fluctuating pH, which varies widely from acidic (pH ~2-3) to basic (pH ~8-9) 4,5: the pH is usually lower in the upper part and higher in the lower part of the GIT. Furthermore, the biopharmaceutical classification system (BCS) 6 uses solubility and permeability data to group orally administered drugs into four classes (high solubility – high permeability, low solubility – high permeability, high solubility – low permeability, and low solubility – low permeability). The United States Food and Drug Administration (FDA) recommends the BCS for selecting suitable biowaivers for in vivo bioavailability and/or bioequivalence studies 7. The FDA guideline 7 does not suggest a specific pH for the permeability measurements, although for the solubility measurements the pH range is defined as 1 to 6.8 for slow dissolution compounds. The guideline foresees in vivo (human and animal models), in situ (animal models) and in vitro (tissue and cell) methods for detecting permeability classes 7. All listed methods are considered appropriate for passively transported drug substances, and only in vivo human models for detecting efflux transport 7. In practice, all these methods are time consuming and costly.


The parallel artificial membrane permeability assay (PAMPA) 5,8 is a more accessible and faster method for measuring passive transport than cell-, animal- and human-based methods. The membrane permeability measured by PAMPA has been shown to match well with human intestinal absorption 9. Therefore, prediction models based on membrane permeability measured with PAMPA serve as a reasonable alternative to FDA reference methods for detecting permeability in the early stage of drug discovery, when the suitability of drug substance candidates for oral administration is estimated. An advantage of PAMPA is the ability to measure membrane permeability over a wide pH range, which enables the description of the entire pH range of the GIT and thus allows regional absorption to be assessed. Surprisingly, the effect of pH on membrane permeability is rarely considered during the characterization and modelling of drug substance candidates. Exceptions are our recent experimental and multilinear regression (MLR) modelling studies that analyse the pH effect on membrane permeability for series of acidic and basic 10, neutral and amphoteric 11 drug substances, for the highest membrane permeability over the pH range in the GIT 9,12 and for the quantitative pH-permeability profile 9. According to a recent comprehensive summary of existing quantitative models for PAMPA 12, a large number of them are MLR models developed to describe and predict membrane permeability at or close to neutral pH. Qualitative methods are another way to develop descriptive and predictive models. However, to the best of our knowledge, classification models for PAMPA are not available in the scientific literature. At the same time, classification models that take into account a wide range of pH for membrane permeability can help to better understand and improve region-based absorption studies, the virtual screening of chemical libraries, and the detection of biowaivers in the BCS framework. It should be noted that for other parameters related to intestinal absorption, such as human intestinal absorption 13,14,15,16,17 and the cell-based membrane permeability assay (Caco-2) 15,18, several classification models have been developed using different classification methods, such as linear discriminant analysis, decision trees, support vector machines and artificial neural networks.

In the current study, the focus is on developing classification models for systematically measured PAMPA data that mimic the full range of pH in the GIT. First, classification models built with the logistic regression method were studied using only a hydrophobicity descriptor (the logarithm of the octanol-water partition or distribution coefficient), which is regarded as a significant structural parameter for modelling membrane permeability. The study was then extended to freely available and easily calculable 1D and 2D molecular descriptors to derive easily interpretable and predictive models for the classification of high- and low-permeable drug substances. The final models are triple-validated with a validation set, an external validation set and FDA reference compounds for BCS permeability classes. Finally, decision trees are used to analyse the predictions of the six classification models and to assign drug substances to BCS permeability classes.

2. Methods and materials

2.1 Data series

Experimental effective membrane permeability (logPe, unit log(cm/s)) values and the corresponding high and low permeability classes (Tables S2-S7 in the Supporting Information) were adapted from our prior work 9, where systematic PAMPA measurements for drug substances were performed and used for the development and validation of multilinear regression QSARs complemented with a cut-off based classification. This cut-off originated from the comparison of membrane permeability measured with PAMPA and human intestinal absorption data, and the same cut-off was used in this study. According to this cut-off, compounds with logPe ≥ -6.2 were labelled as high permeable and compounds with logPe < -6.2 as low permeable. The dataset included 178 drug substances (for simplicity they are referred to as ‘compounds’ in the following text), with a distribution of basic, acidic, amphoteric and neutral compounds similar to that in the approved drugs database 9. This dataset was divided into training and validation sets depending on the distribution of high- and low-permeable compounds (see details below in the section “Balancing data series”). After the selection of final models, we also adopted a blind validation set of 60 compounds from our previous work 9 and used it for external validation. Six data series corresponding to PAMPA measurements at different pH values were used in this study. The first four data series mimic the full pH range (pH 3, 5, 7.4 and 9) in the GIT and establish a class-based pH-permeability profile. The fifth data series considers the highest membrane permeability over the above given pH values (logPe_highest) and indicates whether the compound is permeable in the GIT. The sixth data series, the intrinsic membrane permeability (logPo), characterizes the permeability class of the uncharged compound. A more detailed analysis and discussion of the experimental data is available in our previous publication 9.
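The cut-off rule and the derived data series can be illustrated with a minimal sketch. The cut-off of -6.2 comes from the text; the function names and the example profile values are hypothetical and are not taken from the measured dataset.

```python
# Cut-off from the text: logPe >= -6.2 is high permeable, otherwise low permeable.
LOGPE_CUTOFF = -6.2


def permeability_class(log_pe, cutoff=LOGPE_CUTOFF):
    """Return 'high' or 'low' for an effective membrane permeability in log(cm/s)."""
    return "high" if log_pe >= cutoff else "low"


def highest_log_pe(profile):
    """logPe_highest: the largest logPe over the pH values in the profile."""
    return max(profile.values())


# Hypothetical pH -> logPe profile; the numbers are illustrative only.
example_profile = {3.0: -7.1, 5.0: -6.5, 7.4: -5.4, 9.0: -4.9}

# Class-based pH-permeability profile for the example compound.
class_profile = {ph: permeability_class(v) for ph, v in example_profile.items()}
highest_class = permeability_class(highest_log_pe(example_profile))
```

For this made-up profile the compound is low permeable at pH 3 and 5 and high permeable at pH 7.4 and 9, which is the pattern the text later describes for bases.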

2.2 Dataset of FDA reference drugs

The FDA uses 40 reference (model) drugs 7 for controlling the suitability of permeability measurement methods for detecting high and low permeability for the BCS. This list includes 11 high-, 10 moderate-, 10 low- and 5 zero-permeable compounds, and 4 efflux substrates. In the current work, permeability classes for FDA compounds were assigned according to the BCS, and thus the FDA compounds were grouped into two classes: FDA moderate-, low- and zero-permeable compounds (i.e. human intestinal absorption (fa) < 85%) were considered low permeable, and FDA high-permeable compounds (i.e. fa ≥ 85%) were considered high permeable. This allowed a more comprehensive testing of the developed classification models for their ability to distinguish high- and low-permeable compounds; hence the FDA compounds were used as a third validation set and provided a practical use case for predicting permeability classes for the BCS. Efflux substrates (digoxin, paclitaxel, quinidine, vinblastine) were excluded from the analyses, because the PAMPA method cannot account for efflux transport. Additionally, three polymers (polyethylene glycol 400 and 4000, FITC-Dextran) and one polysaccharide (inulin) were excluded. In total, the prediction results of 32 FDA reference drugs (11 high- and 21 low-permeable compounds) were analysed. It should be noted that the group of high-permeable compounds for the BCS overlapped strongly with the compounds used during model development and validation (10 out of 11), while the group of low-permeable compounds included mostly new compounds, i.e. compounds not used for model development and validation (15 out of 21). Nevertheless, this FDA dataset provides a valuable comparison of how consistent the predictions from PAMPA classification models are with permeability data measured in the GIT (in vivo) under completely different experimental settings.
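The bookkeeping behind the 32 analysed reference drugs can be checked with a few lines. This is only an arithmetic sketch: the group sizes and the 85% fa threshold are from the text, while the function and variable names are illustrative.

```python
FA_HIGH_CUTOFF = 85.0  # percent absorbed; fa >= 85% is treated as high permeable


def bcs_permeability_class(fa_percent):
    """Map human intestinal absorption (fa, in percent) to a two-class BCS label."""
    return "high" if fa_percent >= FA_HIGH_CUTOFF else "low"


# Group sizes of the FDA reference list as given in the text.
fda_list = {"high": 11, "moderate": 10, "low": 10, "zero": 5, "efflux": 4}

n_total = sum(fda_list.values())
n_excluded = fda_list["efflux"] + 3 + 1  # efflux substrates, three polymers, inulin
n_analysed = n_total - n_excluded
n_high = fda_list["high"]
n_low = n_analysed - n_high
```

The counts reproduce the figures in the text: 40 reference drugs, 32 analysed, of which 11 are high and 21 are low permeable.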

2.3 Balancing data series

The division of compounds into the training and validation sets was adapted from the original publication of the experimental data 9. In the original publication, the compounds in each data series were sorted in descending order of experimental values; every fourth compound was selected into the validation set and the remaining compounds formed the training set. This method ensured a similar distribution of membrane permeability values in the training and validation sets for regression analysis, but for classification purposes the training sets of most data series were unbalanced.

[Figure 1 approximate location]

The preliminary analysis (Figure 1) showed that the distribution of acidic and basic compounds had a major influence on the balance between high- and low-permeable compounds in the data series. Basic compounds are usually low permeable at acidic pH and constituted about 50% of the dataset 9. This shifted the balance of the data series for acidic pH towards a majority of low-permeable compounds. For example, the balance for the data series at pH 3 is 44 high vs. 134 low and at pH 5 is 58 high vs. 120 low. Acids are low permeable at pH 7.4 and 9, but their influence was not as substantial as that of the basic compounds, because the dataset contained only about 20% of acids 9. For this reason, the data series at pH 7.4 and 9 were balanced (90 high vs. 88 low). Since the majority of acids and bases are high permeable in the uncharged state, the data series for logPe_highest and logPo had a majority of high-permeable compounds (110 high vs. 68 low and 115 high vs. 63 low, respectively). Most ampholytes (around 20% of the dataset 9) were low permeable in all data series, and their influence on the balance was not significantly affected by pH. Neutral compounds (around 10% of the dataset 9) had the lowest influence on the balance, because their permeability class is constant with pH. Consequently, the distribution of high- and low-permeability classes was highly influenced by the pH-permeability profiles of acidic and basic compounds, which caused the imbalance in the training sets.

[Table 1 approximate location]

Training set balance influences the development and results of classification models 19,20,21, i.e. a large difference in the number of compounds between the classes of a training set can cause problems, such as overly optimistic accuracy estimates due to the bias towards the majority class and poor predictive accuracy for the minority class. Therefore, the original training sets 9 were balanced before developing classification models. The proportions of high- and low-permeable compounds in the initial data series were used to select appropriate ratios for making balanced training sets. For pH 3 every third and for pH 5 every second low-permeable compound in the training set was selected into the balanced training set, and the remaining low-permeable compounds were moved to the validation set. For logPe_highest every third and for logPo every second high-permeable compound from the training set was moved to the validation set, and the remaining compounds formed the balanced training set. The selections were made while keeping the same order based on experimental values as in the original training sets. The sizes of the final training and validation sets within the data series are shown in Table 1.
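The splitting and balancing procedure described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' script: the text does not state which position in each group of three or four is picked, so the `offset` values are assumptions, and the compound names and logPe values are made up.

```python
CUTOFF = -6.2  # logPe cut-off between high and low permeable, from the text


def split_every_nth(items, n, offset=0):
    """Return (selected, rest); every n-th item, starting at index `offset`, is selected."""
    selected = [x for i, x in enumerate(items) if i % n == offset]
    rest = [x for i, x in enumerate(items) if i % n != offset]
    return selected, rest


def initial_split(compounds):
    """Sort (name, logPe) pairs by descending logPe; every fourth goes to validation."""
    ordered = sorted(compounds, key=lambda c: c[1], reverse=True)
    validation, training = split_every_nth(ordered, 4, offset=3)  # offset is assumed
    return training, validation


def rebalance(training, validation, in_majority, n, move_selected):
    """Thin out the majority class of the training set, keeping the original order.

    move_selected=False: every n-th majority compound stays, the other majority
    compounds move to validation (the pH 3 and pH 5 case in the text).
    move_selected=True: every n-th majority compound moves to validation
    (the logPe_highest and logPo case in the text).
    """
    majority = [c for c in training if in_majority(c)]
    selected, others = split_every_nth(majority, n)
    to_move = set(selected if move_selected else others)
    new_training = [c for c in training if c not in to_move]
    new_validation = validation + [c for c in training if c in to_move]
    return new_training, new_validation


def is_low(compound):
    return compound[1] < CUTOFF


# Synthetic, deliberately unbalanced series: 3 high- and 13 low-permeable compounds.
compounds = [("cmpd%02d" % i, -5.5 - 0.25 * i) for i in range(16)]
training, validation = initial_split(compounds)
balanced_training, extended_validation = rebalance(
    training, validation, is_low, n=3, move_selected=False
)
```

In this toy case the original training set holds 3 high- and 9 low-permeable compounds, and keeping every third low-permeable compound gives a balanced training set of 3 vs. 3, with the removed compounds added to the validation set.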

2.4 Chemical structure characterization

For the calculation of descriptors, all structures were standardized to make them uncharged and salt- and hydrate-free. Then the dominant tautomeric forms of the compounds were detected and the aromatic SMILES (Tables S1 and S8 in the Supporting Information) were generated with JChem for Excel 22. The octanol-water partition (Pow) and distribution (D) coefficients have often been associated with permeability; therefore their logarithmic values (logPow and logD) were used as descriptors. For logPow, experimental values were retrieved from the PhysProp database 23. For missing logPow values, an appropriate prediction method was identified. For this, different calculators were compared against the 142 experimental logPow values of the current dataset. First, four different logPow calculators (ALOGP, XLogP, CrippenLogP, MLogP) from the PaDEL-Descriptor software 24,25 were considered, but all of them resulted in correlations with R² less than 0.70 (data not shown). A satisfactory correlation (R² = 0.99, data not shown) between experimental and calculated logPow values for the 142 compounds was achieved with XlogP3 (version 3.2.2) 26, which was therefore used to fill in the missing experimental values. The logD values at pH 3, 5, 7.4 and 9 were calculated with JChem for Excel 22. The highest logD (logDhighest) value for each compound was picked over the selected pH values.

The PaDEL-Descriptor software 24,25 was used to calculate 1D and 2D molecular descriptors with the “detect aromaticity” and “standardize nitro groups” settings. Descriptors with near zero variance, highly correlated descriptors (R = 0.9999), autocorrelation descriptors, and atom- and fragment-specific descriptors were removed from the descriptor collection. The reduced set of 224 descriptors was used in the further analysis.
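The variance and correlation filters can be sketched in plain Python. Only the correlation limit of 0.9999 comes from the text; the variance threshold, the rule of keeping the first descriptor of a correlated pair, and the descriptor names are assumptions made for illustration.

```python
import math


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long, non-constant lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def filter_descriptors(table, min_variance=1e-8, max_abs_r=0.9999):
    """Return names of descriptors that survive both filters.

    table maps descriptor name -> list of values (one per compound).
    min_variance is an assumed threshold; max_abs_r follows the text.
    """
    kept = []
    for name, values in table.items():
        if variance(values) <= min_variance:
            continue  # near zero variance: carries no information
        if any(abs(pearson_r(values, table[k])) >= max_abs_r for k in kept):
            continue  # duplicates an already kept descriptor
        kept.append(name)
    return kept


# Hypothetical descriptor table for four compounds.
example_table = {
    "desc_a": [10, 12, 15, 20],
    "desc_const": [1, 1, 1, 1],
    "desc_a_scaled": [20, 24, 30, 40],
    "desc_b": [40.0, 90.5, 20.1, 60.3],
}
kept_descriptors = filter_descriptors(example_table)
```

Here the constant descriptor and the rescaled copy are dropped, while the two independent descriptors remain.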

2.5 Logistic regression

Classification models for the membrane permeability were developed with logistic regression using the glm function in R software (version 3.3.2) 27. Logistic regression was selected because of its ability to estimate the probability of the predicted response and because of its simple mathematical representation, which makes the developed models easy to use. For the model building, all correlations of one descriptor and combinations of two and three descriptors were analysed. Logistic regression 28 is a statistical method for binary classification problems, where the dependent variable is the probability of a binary event, calculated as the logistic (inverse logit) function of a linear combination of independent variables. The probability (P) of the outcome (Eq. (1)) being high (P ≤ 0.5) or low (P > 0.5) permeable is calculated using the intercept (b) and the coefficients (a1, a2, … an) of the variables (X1, X2, … Xn) in the logistic regression (z).

$$P = \frac{e^{(b + a_1 X_1 + a_2 X_2 + \dots + a_n X_n)}}{1 + e^{(b + a_1 X_1 + a_2 X_2 + \dots + a_n X_n)}} = \frac{e^{z}}{1 + e^{z}} \qquad (1)$$
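Eq. (1) and the class assignment can be evaluated with a few lines of code. The sketch below is not one of the published models: the intercept and coefficient are hypothetical numbers chosen only to show the calculation, and the function names are illustrative.

```python
import math


def logistic_probability(intercept, coefficients, descriptors):
    """Evaluate Eq. (1): P = e^z / (1 + e^z) with z = b + a1*X1 + ... + an*Xn."""
    z = intercept + sum(a * x for a, x in zip(coefficients, descriptors))
    # Two algebraically equal forms are used to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


def predicted_class(p):
    """Class rule from the text: high permeable if P <= 0.5, low permeable if P > 0.5."""
    return "high" if p <= 0.5 else "low"


# Hypothetical one-descriptor model: z = 1.0 - 1.5 * logD (not a fitted model).
b, a = 1.0, [-1.5]
p_lipophilic = logistic_probability(b, a, [2.0])    # z = -2.0
p_hydrophilic = logistic_probability(b, a, [-1.0])  # z = 2.5
```

With these made-up coefficients a compound with logD = 2 gets P of about 0.12 and is classified as high permeable, while a compound with logD = -1 gets P of about 0.92 and is classified as low permeable.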

The final classification models suitable for predictions were archived in the QSAR Data Bank format 29 and uploaded to the QsarDB repository 30,31 (http://dx.doi.org/10.15152/QDB.206).

2.6 Performance analysis

The confusion matrix 32 was used to assess the performance of classification models. Each row in the confusion matrix shows how many compounds are in a specific class based on the predicted values, and each column shows how many compounds are in a specific class based on the experimental values. In the current study, “positive” (class 0) means high-permeable compounds and “negative” (class 1) means low-permeable compounds. The comparison of experimental and predicted membrane permeability classes groups predictions into four categories:

(i) True positive (TP) – correctly predicted high-permeable compounds
(ii) True negative (TN) – correctly predicted low-permeable compounds
(iii) False positive (FP) – low-permeable compounds that are predicted as high-permeable
(iv) False negative (FN) – high-permeable compounds that are predicted as low-permeable

The model performance was described with sensitivity, specificity and accuracy. Sensitivity is defined as TP/(TP+FN) and measures the proportion of correctly predicted high-permeable compounds. Specificity is defined as TN/(TN+FP) and measures the proportion of correctly predicted low-permeable compounds. Accuracy is defined as (TP+TN)/(TP+TN+FP+FN) and measures the proportion of correctly predicted responses by the model.
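The three measures follow directly from the four counts. The sketch below uses the definitions given above with "high" as the positive class; the function names and the example labels are illustrative and are not results from this study.

```python
def confusion_counts(experimental, predicted, positive="high"):
    """Count TP, TN, FP and FN, treating `positive` (high permeable) as the positive class."""
    tp = tn = fp = fn = 0
    for exp, pred in zip(experimental, predicted):
        if exp == positive and pred == positive:
            tp += 1
        elif exp != positive and pred != positive:
            tn += 1
        elif exp != positive and pred == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def performance(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy as defined in the text."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Made-up experimental and predicted classes for eight compounds.
experimental = ["high", "high", "high", "low", "low", "low", "low", "high"]
predicted = ["high", "high", "low", "low", "low", "high", "low", "high"]

counts = confusion_counts(experimental, predicted)
metrics = performance(*counts)
```

For this toy example the counts are TP = 3, TN = 3, FP = 1 and FN = 1, so all three measures equal 0.75.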

2.7 Visualization of models

Classification models were visualised and analysed with receiver operating characteristic (ROC) curves, which were created by plotting the true positive rate (sensitivity) against the false positive rate at various threshold settings and were quantified with the area under the curve (AUC). A higher AUC value and a ROC curve nearer to the upper left corner indicate higher prediction accuracy and a better ability to distinguish between high- and low-permeable compounds. The diagonal line on the ROC plot indicates a random guess, for which the AUC value is 0.5. In addition to the ROC curves, the distribution of probabilities for high- and low-permeable compounds was analysed. This showed how close the predicted probabilities of most compounds are to the probability cut-off (P = 0.5). A prediction far from the cut-off has more confidence than one close to it. Models with higher confidence therefore have more high-permeable compounds near 0 and more low-permeable compounds near 1, i.e. fewer points around 0.5. Finally, the distribution of the descriptors in the model was analysed for the training set to ensure that each descriptor can distinguish high- and low-permeable compounds, i.e. that high- and low-permeable compounds have characteristic distributions of descriptor values.
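The construction of a ROC curve and its AUC can be sketched without any plotting library. Because P in this study is high for low-permeable compounds while the positive class is high permeable, the sketch scores each compound with 1 - P; this choice, the function names and the probabilities are illustrative assumptions, and the original analysis was done in R.

```python
def roc_points(is_positive, scores):
    """Return (false positive rate, true positive rate) pairs for all score thresholds.

    is_positive: True for high-permeable (positive) compounds.
    scores: larger values mean 'more likely positive'.
    """
    n_pos = sum(1 for flag in is_positive if flag)
    n_neg = len(is_positive) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for f, s in zip(is_positive, scores) if f and s >= threshold)
        fp = sum(1 for f, s in zip(is_positive, scores) if not f and s >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under ROC points ordered by increasing false positive rate."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up model outputs P (probability of the low-permeable class).
p_values = [0.1, 0.3, 0.6, 0.4, 0.8, 0.9]
is_high = [True, True, True, False, False, False]

scores = [1.0 - p for p in p_values]  # score for the positive (high) class
points = roc_points(is_high, scores)
auc = area_under_curve(points)
```

In this toy set eight of the nine high/low pairs are ranked correctly, so the AUC is 8/9, and the curve ends at the point (1, 1).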

2.8 Decision trees for permeability classes in BCS

In total, six PAMPA classification models were derived in this study for each of the descriptor types. Since pH influences permeability in the GIT, the final PAMPA models of each descriptor type were combined into a simple decision tree and tested for predicting BCS permeability classes with the FDA reference drugs. The criterion of the decision tree was the number of high permeability predictions over the six models (#HighPrediction). To find the threshold for the criterion, the following rules were considered: (i) a decision could not be made only on the basis of one or two high-permeable predictions and (ii) a prediction could not be made solely on the basis of high permeability predictions from the logPe_highest and/or logPo models. This meant that the #HighPrediction value had to be greater than two (#HighPrediction > 2) for a compound to be classified as high permeable in the BCS framework. The classes predicted by the classification models and decision trees were compared to the experimental BCS permeability class, and the match was described with accuracy, sensitivity and specificity (see details in the "Performance analysis" section).
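The decision rule reduces to counting high-permeable predictions. The sketch below applies the #HighPrediction > 2 criterion from the text; the dictionary keys, the function name and the two example prediction sets are illustrative.

```python
MODEL_NAMES = ("pH 3", "pH 5", "pH 7.4", "pH 9", "logPe_highest", "logPo")


def bcs_class_from_models(predictions, threshold=2):
    """Return 'high' if more than `threshold` of the six models predict high permeability."""
    n_high = sum(1 for name in MODEL_NAMES if predictions[name] == "high")
    return "high" if n_high > threshold else "low"


# Made-up predictions resembling a base: high permeable only at neutral and basic pH.
base_like = {
    "pH 3": "low", "pH 5": "low", "pH 7.4": "high", "pH 9": "high",
    "logPe_highest": "high", "logPo": "high",
}

# Made-up case where only the two general models predict high permeability.
only_general = {
    "pH 3": "low", "pH 5": "low", "pH 7.4": "low", "pH 9": "low",
    "logPe_highest": "high", "logPo": "high",
}

base_like_class = bcs_class_from_models(base_like)
only_general_class = bcs_class_from_models(only_general)
```

The second example shows why the threshold is two: high predictions from the logPe_highest and logPo models alone are not enough for a high-permeable BCS class.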

3. Results and Discussions

3.1 Experimental class-based pH-permeability profiles

[Table 2 approximate location]

Different chemical classes (acids, bases, ampholytes, and neutrals) have characteristic pH-permeability profiles according to the measured PAMPA data. To demonstrate the different class-based pH-permeability profiles, one illustrative compound for each chemical class (Table 2) was selected from the FDA reference drug list 7. Acids, for example ketoprofen (Table 2: a, Exp.), are typically low permeable in basic environments and high permeable in acidic environments. This behaviour clearly influences the estimation of the permeability class: the FDA assigns ketoprofen (Table 2: a) to the high-permeable class, and PAMPA reproduces this class only at acidic pH. Bases, for example propranolol (Table 2: b, Exp.), are usually high permeable in basic environments and low permeable in acidic environments. Indeed, propranolol (Table 2: b, Exp.) is assigned as high permeable by the FDA, and therefore the membrane permeability at pH 7.4 and 9 must be considered. Ampholytes are typically low-permeable compounds; for example, the FDA assigns famotidine (Table 2: c, Exp.) as a low-permeable compound, which is consistent with the membrane permeability classes from the experimental pH-permeability profile. The membrane permeability of neutral compounds, such as carbamazepine (Table 2: d, Exp.), does not depend on the pH of the environment. The FDA assigns carbamazepine (Table 2: d) as high permeable, and the same result was obtained from our experimental pH-permeability profile. The comparison between FDA permeability classes and our experimental class-based pH-permeability profiles indicates that the membrane permeability measured in the correct pH range matches well with the FDA permeability classes. The more general measures, the highest (logPe_highest) and intrinsic (logPo) membrane permeability, correspond correctly to the FDA class for all four reference compounds.

3.2 Membrane permeability vs. hydrophobicity

LogPow and logD are commonly used descriptors in QSAR models for membrane permeability (ref 12). Although they correlate with membrane permeability (R² < 0.65), this correlation is not strong enough for regression models to have descriptive and predictive capability (ref 9). Conceptually, identifying group or class membership is easier than precisely estimating a quantitative response. Therefore, it was investigated how good a classification model can be obtained using only one of these hydrophobicity descriptors. Depending on the nature of the data series, either logD or logPow was used. For the pH-specific membrane permeability models, logD values at the specific pH were used. For logPe_highest, logDhighest was used. For the logPo model, logPow was used instead of logD.

[Figure 2 approximate location]

[Figure 3 approximate location]

[Table 3 approximate location]

The resulting models with the hydrophobicity descriptor (Figure 2, Table 3, Table S2-S7) show very good descriptive properties for the training set (accuracy: 0.84 to 0.91) and prediction capabilities for the validation set (accuracy: 0.73 to 0.90). Sensitivities and specificities for the training sets (Figure 2, Table 3) are in the range of 0.80 to 0.91. The differences between sensitivity and specificity are small, with a maximum difference of 0.09 for model M5 (logPe_highest), which indicates that high- and low-permeable compounds are classified equally well. However, the validation sets for the six data series (Figure 2, Table 3) show larger differences between sensitivity and specificity than the training sets and follow a trend in which sensitivity (0.86 to 0.95) is higher than specificity (0.69 to 0.88), with differences between 0.01 and 0.25. This indicates a better prediction capability for high-permeable compounds than for low-permeable compounds, which may be caused by the unbalanced validation sets in most of the data series. A similar result can be seen in the ROC curves, where AUC values (Figure 3) are high and similar for both the training (AUC: 0.91 to 0.95) and validation (AUC: 0.88 to 0.97) sets. This shows that models with only one hydrophobicity descriptor have good potential for describing and predicting permeability classes for all data series.

However, three issues should be considered when using models with a hydrophobicity descriptor. The first issue is the precision of the logPow or logD values, since both are experimental parameters that are often calculated using predictive models. For logPow, comparisons between experimental and calculated data are available. These calculations have shown questionable results for compounds with a complex pattern of functional groups in the chemical structure, which is typical for drug substances (see the example in the “Chemical structure characterization” chapter and in our previous publication, ref 11). In contrast to logPow, the prediction capability of logD calculators is hard to verify because very little experimental logD data at various pH values has been published. Therefore, logD calculators have not been widely tested and validated at various pH values, which hinders the use of logD as a descriptor. The second issue is related to the assumption that only uncharged compounds can partition into octanol; the logD descriptor is therefore calculated from the relationship between logPow and the dissociation constant (pKa), which means that the precision of logD values depends strongly on the accuracy of logPow and pKa. The third issue is the lack of freely available logD calculators, which limits the practical use of the developed classification models.
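The dependence of logD on logPow and pKa described in the second issue can be illustrated with a minimal sketch. It assumes a monoprotic compound and that only the neutral species partitions into octanol, in which case logD follows from the standard Henderson-Hasselbalch-type relationship. The function name `log_d` and the example logPow and pKa values are hypothetical and are not taken from this study; actual logD calculators may use more elaborate schemes.

```python
import math

def log_d(log_p, pka, ph, kind):
    """logD of a monoprotic compound, assuming only the neutral form partitions.

    kind is "acid" or "base". Illustrative helper, not a tool used in the study.
    """
    if kind == "acid":
        exponent = ph - pka      # ionised fraction grows above the pKa
    elif kind == "base":
        exponent = pka - ph      # ionised fraction grows below the pKa
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_p - math.log10(1.0 + 10.0 ** exponent)

# Hypothetical acid with logPow = 3.0 and pKa = 4.0 at the four modelled pH values.
for ph in (3.0, 5.0, 7.4, 9.0):
    print(ph, round(log_d(3.0, 4.0, ph, "acid"), 2))

# Effect of a 0.5 unit error in pKa on logD at pH 7.4 (compound mostly ionised).
shift = log_d(3.0, 4.5, 7.4, "acid") - log_d(3.0, 4.0, 7.4, "acid")
print(round(shift, 2))
```

Under these assumptions, an error in pKa carries over almost one-to-one into logD once the compound is mostly ionised, which is why the precision of the descriptor depends on both inputs.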


3.3 Membrane permeability vs. theoretical molecular descriptors

[Table 4 approximate location]

Considering the complex nature of the hydrophobicity descriptors and the concerns above, alternatives were sought to replace logD and logPow in the classification models. 1D and 2D theoretical molecular descriptors calculated with the PaDEL-Descriptor software (refs 24, 25) were tested as a source of alternative descriptors. The initial analysis of one-parameter models showed that their descriptive and predictive capabilities are not sufficient for predicting membrane permeability (data not shown), i.e., a much lower accuracy was observed for the training set. The development of two- and three-parameter models improved the performance.

The models with theoretical molecular descriptors (Figure 2, Table 4, Table S2-S7) show prediction capability similar to that of the models with the hydrophobicity descriptor. The training set accuracies (0.76 to 0.88) are very similar to the validation set accuracies (0.75 to 0.91), and the difference between the training and validation accuracies is less than 0.1 (Figure 2, Table 4), which indicates that all models are well balanced between descriptive and predictive capabilities. The lowest accuracy is observed for the training set of the model at pH 5 and for the validation sets of the models at pH 3 and 5, which indicates that predicting permeability classes at acidic pH is more difficult than for the other data series.

Differences between sensitivity (0.77 to 0.88) and specificity (0.73 to 0.88) for the training sets (Figure 2, Table 4) are less than 0.1, which indicates equally good descriptive capability for high- and low-permeable compounds. The outcome for the validation sets (Figure 2, Table 4) is more varied. The difference between sensitivity (0.80 to 0.93) and specificity (0.72 to 0.91) is less than 0.1 for the models at pH 7.4, pH 9, logPe_highest and logPo, but more than 0.1 for the models at pH 3 and 5. This suggests that predicting low-permeable compounds at acidic pH is more difficult than predicting high-permeable compounds. It can also be caused by the fact that the validation sets for acidic pH include far more low-permeable than high-permeable compounds (100 vs 11 at pH 3; Table 1). Smaller and balanced validation sets for the models at pH 7.4 and pH 9 (Table 1) improve their statistical performance in comparison with the training sets, while for the other models the validation set statistics are lower than or comparable with the training set statistics (Table 4). The analysis of ROC curves and AUC values shows that the training (AUC: 0.84 to 0.92) and validation (AUC: 0.78 to 0.96) sets have relatively high and similar AUC values (Figure 3), which indicates that these models have good descriptive and predictive properties.
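The effect of an unbalanced validation set on sensitivity and specificity can be shown with a short sketch. The class sizes are those of the pH 3 validation set in Table 1 (11 high, 100 low); the split into correct and incorrect predictions is hypothetical and chosen only for illustration, as is the function name `classification_metrics`.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity with 'high permeable' as the positive class."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # fraction of high-permeable compounds recovered
    specificity = tn / (tn + fp)   # fraction of low-permeable compounds recovered
    return accuracy, sensitivity, specificity

# Class sizes from Table 1 (pH 3 validation set).
n_high, n_low = 11, 100

# Hypothetical outcome: 1 high-permeable and 20 low-permeable compounds misclassified.
base = classification_metrics(tp=n_high - 1, fn=1, tn=n_low - 20, fp=20)

# One additional error in each class, applied separately.
one_more_high_missed = classification_metrics(tp=n_high - 2, fn=2, tn=n_low - 20, fp=20)
one_more_low_missed = classification_metrics(tp=n_high - 1, fn=1, tn=n_low - 21, fp=21)

print(round(base[1] - one_more_high_missed[1], 3))  # drop in sensitivity
print(round(base[2] - one_more_low_missed[2], 3))   # drop in specificity
```

With only 11 high-permeable compounds, a single additional error changes sensitivity by 1/11 (about 0.09), whereas a single error among 100 low-permeable compounds changes specificity by 0.01, so differences between the two statistics on such sets should be read with the class sizes in mind.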

3.4 Mechanistic interpretation of theoretical molecular descriptors

[Table 5 approximate location]

In the final models with theoretical molecular descriptors, two descriptors were included (M7-M10). The exception was the model at pH 7.4 (M9), which included three descriptors. A more detailed analysis showed that the descriptors selected into the models (Table 5) describe properties of the molecular structure associated with ionization, polarity, and the size and complexity of the molecule.

All models for a specific pH (Table 4: M7-M10) include one ionization-related descriptor (Table 5): the number of basic (nBase) or acidic (nAcid) groups, or the mean first ionization potential (scaled on the carbon atom) (Mi). The descriptor nBase (refs 24, 25) identifies compounds with basic groups described by the following SMARTS: "[$([NH2][CX4])]", "[$([NH](-[CX4])-[CX4])]", "[$(N(-[CX4])(-[CX4])-[CX4])]", "[$(N=C-N)]", and "[$(N-C=N)]". The descriptor nAcid (refs 24, 25) identifies compounds with acidic groups (SMARTS: $([O;H1]-[C,S,P]=O), $([NH](S(=O)=O)C(F)(F)F) and $(n1nnnc1)). The ionization of compounds reduces membrane permeability (Figure S1, S3, S4: b), and both descriptors identify compounds that are partially or fully ionized at the modelled pH (basic groups at pH 3 and acidic groups at pH 7.4 and 9). The ionization of both acidic and basic groups for the model at pH 5 is described by the Mi descriptor (ref 25), which is calculated from the atomic ionisation potentials of a molecule. Higher Mi values (Table 4: M8, Figure S2: b) more likely belong to low-permeable compounds, indicating that high-permeable compounds have a stronger charge localization (more double bonds), which makes it easier to remove electrons. Also, higher Mi values indicate the presence of nitrogen and oxygen atoms (i.e., electronegative groups), which need more energy to remove electrons. Overall, these descriptors consider the constitution of functional groups in compounds, i.e., the pattern of chemical classes and ionization and their different permeability due to the pH of the environment.

[Figure 4 approximate location]

The descriptors associated with the polarity of molecules (Table 5) are related to hydrogen bonding and surface area. Two of these descriptors, a measure of the contribution of hydrogen bond donor atoms (ETA_dEpsilon_D) and the number of hydrogen bond donors (nHBDon), are directly associated with hydrogen bond donors and are included in almost all developed models, except the model at pH 9. ETA_dEpsilon_D (ref 33) accounts for the electronegativity contribution of hydrogen bond donor atoms in relation to the electronegativity of the heavy atoms in a compound. nHBDon (refs 25, 34) is calculated as the number of hydrogen bond donors in a compound, which are any –OH or –NH groups where the formal charge of the oxygen or nitrogen is non-negative (i.e., formal charge ≥ 0). Both descriptors indicate (Table 4: M7-9, M10; Figure 4: b, Figure S1-S3, S5) that compounds with more hydrogen bond donors are more likely to be low permeable than high permeable. The ETA_Psi_1 descriptor appears only once, in the model at pH 7.4; this extended topochemical atom descriptor is a measure of the hydrogen bonding propensity of the molecules and/or the polar surface area (ref 33). A higher ETA_Psi_1 value (Table 4: M9, Figure S3) describes compounds with fewer electronegative atoms, which take part in hydrogen bond formation and contribute to the polar surface area. At more basic pH (pH 9), the polarity of the molecule, as described by the topological polar surface area (TopoPSA), becomes more important than hydrogen bond properties. TopoPSA (ref 35), calculated with the fragment contribution method, describes the polar surface area of a compound. Compounds with a higher polar surface area (Table 4: M10, Figure S4) are more frequently low permeable than high permeable, which can be caused by stronger interactions with the solvent and the membrane, which in turn make it difficult to move across the membrane.

Only one descriptor is associated with the size and complexity of molecules (Table 5): the 0th order information content index (IC0). IC0 is calculated from Shannon’s entropy as \(-\sum_i p_i \log_2 p_i\), where \(p_i\) is the probability of randomly selecting an atom of a specific type i in the molecule (refs 36, 37). This descriptor characterises molecular complexity as the average amount of information per atom type. IC0 is included only in the logPe_highest and logPo models, which indicates that this descriptor is significant for describing the permeability properties of uncharged compounds. The sign of its regression coefficient and the distribution of the descriptor (Table 4: M11-M12, Figure 4: c, Figure S5) show that compounds containing more different atom types (i.e., more complex molecules) are more frequently low permeable than high permeable.

In summary, the pH-specific models (Table 4: M7-M10) usually include one descriptor that can be attributed to interaction processes related to the pH of the environment (ionization descriptor) and a second descriptor that accounts for interaction patterns occurring during permeation through the membrane (polarity descriptor). The models for logPe_highest and logPo (Table 4: M11-M12) indicate that when the influence of ionization is less important, membrane permeability is influenced by hydrogen bond donors (nHBDon) and the complexity of the molecule (IC0); both properties are also represented in the pH-specific models through several descriptors (ETA_dEpsilon_D, nHBDon, ETA_Psi_1, TopoPSA). The performance of the models and the simple interpretability of the involved descriptors show good potential for classifying the membrane permeability of compounds with high accuracy using only a few structure-related 1D/2D descriptors.
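The Shannon entropy expression given for IC0 can be sketched in a few lines. This is a minimal illustration, assuming that atom types are simply element symbols and that hydrogens are listed explicitly; the exact atom typing used by PaDEL-Descriptor may differ, and the function name `ic0` and the hand-written atom lists are illustrative.

```python
import math
from collections import Counter

def ic0(atom_types):
    """Zeroth-order information content: Shannon entropy of the atom-type distribution."""
    counts = Counter(atom_types)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Atom lists written out by hand, hydrogens included (an assumption of this sketch).
methane = ["C"] + ["H"] * 4          # CH4: two atom types
methanol = ["C", "O"] + ["H"] * 4    # CH4O: three atom types

print(round(ic0(methane), 3))
print(round(ic0(methanol), 3))
```

Adding a third atom type raises the value, which matches the interpretation in the text that a higher IC0 corresponds to a molecule built from a more varied set of atoms.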

3.5 External validation

The external validation set of 60 compounds was used for the independent validation of the derived classification models. For the models with hydrophobicity descriptors (Table 3, Figure 2), the models at pH 3, 5 and 7.4 (accuracy: 0.8 to 0.9) have slightly higher prediction ability for the external validation set than the models for pH 9, logPe_highest and logPo (accuracy: 0.72 to 0.77). The same trend is visible in the ROC graphs (Figure 3), where the AUC values are notably higher for the models at pH 3 (AUC: 0.96) and 7.4 (AUC: 0.94) and lowest for logPe_highest (AUC: 0.85) and logPo (AUC: 0.82). The prediction capability for the external validation set is very similar for high-permeable (i.e., sensitivity) and low-permeable (i.e., specificity) compounds for almost all series (Table 3, Figure 2), except for the models at pH 5 and for logPo, where specificity is slightly lower than sensitivity. The prediction statistics for the external validation set show that the models with the hydrophobicity descriptor can be used to predict the permeability classes of new compounds, but the limitations discussed at the end of the chapter “Membrane permeability vs. hydrophobicity” must be kept in mind.

The models with theoretical molecular descriptors (Table 4, Figure 2) also have good prediction ability. The accuracies for the validation and external validation sets are very similar, i.e., the difference is less than 0.1 (Figure 2, Table 4), except for the models at pH 7.4 and 9, where the accuracy is markedly lower (difference more than 0.14). The same trend is visible in the ROC graphs (Figure 3): the difference in AUC values between the validation and external validation sets is less than 0.1 for the models at pH 3, 5, logPe_highest and logPo, and more than 0.1 for the models at pH 7.4 and 9. Interestingly, for the external validation set, specificity is slightly higher than sensitivity for all models with theoretical molecular descriptors except the logPo model. The results of the external validation show that these models can be used to distinguish high- and low-permeable compounds with high accuracy and without restrictions.
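The AUC values quoted above can be read as the probability that a randomly chosen high-permeable compound receives a higher predicted probability than a randomly chosen low-permeable one. The sketch below computes AUC in exactly this pairwise way; the probabilities and labels are hypothetical and the function name `roc_auc` is illustrative, so this is not the procedure or software used in the study.

```python
def roc_auc(probabilities, labels):
    """AUC as the share of (high, low) pairs ranked correctly; ties count one half.

    labels: 1 for high permeable, 0 for low permeable.
    """
    pos = [p for p, y in zip(probabilities, labels) if y == 1]
    neg = [p for p, y in zip(probabilities, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted probabilities for three high- and three low-permeable compounds.
probs = [0.92, 0.81, 0.45, 0.60, 0.30, 0.12]
labels = [1, 1, 1, 0, 0, 0]
print(round(roc_auc(probs, labels), 3))
```

Because AUC depends only on the ranking of the predicted probabilities, it is unaffected by the choice of the 0.5 cut-off, which makes it a useful complement to accuracy, sensitivity and specificity.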

3.6 Comparison of models: hydrophobicity vs theoretical molecular descriptors

Models with the hydrophobicity descriptor and models derived from theoretical molecular descriptors have comparable statistical parameters (Table 3 vs Table 4). The accuracy for the training, validation and external validation sets is usually slightly higher for the models relying on the hydrophobicity descriptor than for the models with theoretical molecular descriptors (Figure 2). Deviations from this trend are observed for the validation sets of the models at pH 7.4 and 9 and for the external validation sets of the models at pH 9 and for logPe_highest, where the accuracy is slightly higher for the models with theoretical molecular descriptors. The differences between sensitivity and specificity span a slightly smaller range for the models with theoretical molecular descriptors (average and range over absolute values: 0.06, 0-0.21) than for the models with the hydrophobicity descriptor (average and range over absolute values: 0.09, 0-0.25). Models with the hydrophobicity descriptor perform slightly better according to the ROC curves (Figure 3), where their AUC values are usually slightly higher than those of the models with theoretical molecular descriptors. Thus, we can conclude from the comparison of statistical parameters that, although models with the hydrophobicity descriptor usually have slightly higher performance parameters (accuracy, sensitivity, specificity, AUC) than the models with theoretical molecular descriptors, both types of models are sufficient to predict permeability classes for new compounds.

A more detailed view of the prediction confidence of the models comes from the analysis of the distribution of predicted probabilities. A detailed example is given for the logPe_highest models (Figure 4), which are more universal and cover the wide range of pH in the GIT. The distribution of probabilities (Figure 4: a) for the models with the hydrophobicity descriptor and with theoretical molecular descriptors shows that, for the training and validation sets, most low- and high-permeable compounds lie near 0 (solid line) or near 1 (dashed line), respectively. For the external validation set, the distribution of probabilities shows that the predictions of the model with theoretical molecular descriptors are more reliable than those of the model with the hydrophobicity descriptor, because a much larger number of compounds lies near 0 and 1. Less than 10% of the high- and low-permeable compounds for both types of models have a probability value near the threshold (0.5 ± 0.1), which means that the predictions for the majority of compounds are far from the cut-off value.

A deeper analysis of the distribution of the descriptors (Figure 4: b, c) shows that logDhighest, nHBDon and IC0 have different ranges and spreads of values for high- and low-permeable compounds, indicating that all these descriptors have good discriminative ability to classify high- and low-permeable compounds. A disadvantage of the model with the hydrophobicity descriptor is that the cut-off value is defined by only one descriptor, because logDhighest values show a small overlap between high- and low-permeable compounds. Consequently, the model with theoretical molecular descriptors, which includes two descriptors, is more robust and predicts the external validation set with more confidence than the model with the hydrophobicity descriptor. The analysis of probability and descriptor distributions for the other data series (Figure S1-S5) shows similar results.
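The confidence analysis above can be sketched as follows: the linear predictor z of a logistic model is mapped to a probability, and the share of predictions within 0.5 ± 0.1 of the cut-off is counted. The z values are hypothetical, the function names are illustrative, and the probability is read on the scale described in the text, where low-permeable compounds cluster near 0 and high-permeable compounds near 1.

```python
import math

def probability(z):
    """Logistic function: maps the linear predictor z of a model to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fraction_near_threshold(probabilities, threshold=0.5, margin=0.1):
    """Share of predictions that fall within threshold ± margin."""
    near = [p for p in probabilities if abs(p - threshold) <= margin]
    return len(near) / len(probabilities)

# Hypothetical linear predictor values z for eight compounds.
z_values = [-3.1, -2.0, -0.2, 0.1, 1.5, 2.4, 3.0, 4.2]
probs = [probability(z) for z in z_values]

print([round(p, 2) for p in probs])
print(fraction_near_threshold(probs))
```

A small fraction near the threshold means that most class assignments would not change under a modest shift of the cut-off, which is the sense in which the predictions are described as confident.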

3.7 Predicted class-based pH-permeability profiles

The ability of the derived models to predict class-based pH-permeability profiles, as well as the highest and intrinsic membrane permeability, was illustrated using the four representative compounds already examined for the experimental pH-permeability profiles (Table 2). The high coherence between experimental and predicted data is evident for all models. Only one misclassification occurred, for propranolol (Table 2: b, logD/logPow): the model with the hydrophobicity descriptor at pH 7.4 incorrectly predicted it as low permeable instead of high permeable, while the model with theoretical molecular descriptors predicted it correctly. The acidic compound ketoprofen (Table 2: a) and the basic compound propranolol (Table 2: b) illustrate that models with both descriptor types can be used to predict class-based pH-permeability profiles. Consequently, the developed classification models can be used for predicting permeability classes with high accuracy. Depending on the purpose, the pH-specific models allow the prediction of class-based pH-permeability profiles and regionally dependent absorption; the model for logPe_highest predicts the highest permeability class in the GIT; and the model for logPo predicts the permeability class of the uncharged form of the compound.

3.8 Evaluating permeability classes in BCS for FDA reference drugs

[Table 6 approximate location]

All models from Table 3 and Table 4 were applied to predict six PAMPA permeability classes for the FDA reference (model) drugs within the BCS (Table S8-S10). The obtained predictions were used to analyse pH-permeability profiles, which turned out to be useful for evaluating BCS permeability classes. This evaluation can be considered a third validation and a practical use case for the developed models.

Prediction accuracies for the FDA reference compounds across all data series (Table 6) show that models with theoretical molecular descriptors (accuracy: 0.81 to 0.88) perform better than models with the hydrophobicity descriptor (accuracy: 0.72 to 0.81). This trend is also present for sensitivities (0.55-0.82 vs 0.27-0.64), but not for specificities, which are in similar ranges (0.90-0.95 vs 0.86-1.00) for both descriptor types. The best performance characteristics were obtained using the models with theoretical molecular descriptors for logPe_highest and logPo. These models probably perform better because the full pH range in the GIT is considered. Most problematic for both types of models are high-permeable compounds, which can be correctly classified by using the pH-permeability profile. Namely, BCS permeability classes are determined based on the full pH range in the GIT, whereas evaluating the permeability class at only a single pH value might underestimate it because the compound can be high permeable at some other pH. To take advantage of the predictions made with the pH-dependent permeability models and to improve the confidence of the BCS permeability classification, we can look for agreement over all predictions and represent this as a simple decision tree.

[Figure 5 approximate location]

Both decision trees (Figure 5) use the number of predictions into the high-permeable class over the six models (#HighPrediction) as a descriptor; its value must be at least three to classify the compound as high permeable. The decision tree with the hydrophobicity descriptor has the same performance characteristics as the model for logPo, but the decision tree with theoretical molecular descriptors has higher accuracy (0.91) and specificity (0.9) than any individual model (Table 6). Comparing the two decision trees, the one with theoretical molecular descriptors has significantly higher accuracy (0.91 vs 0.81), sensitivity (0.82 vs 0.64) and specificity (0.95 vs 0.90) than the one with the hydrophobicity descriptor. The decision tree with the hydrophobicity descriptor misclassified six compounds, twice as many as the decision tree with theoretical molecular descriptors (Table 6). This shows that the hydrophobicity descriptor captures fewer structural features than the theoretical molecular descriptors. Among these six misclassified compounds, three (theophylline, minoxidil, and chlorphenamine) were misclassified by both decision trees. This indicates that the permeability of these compounds is influenced by structural properties that are not captured by either type of model. For example, misclassifications of low-permeable compounds (chlorphenamine and furosemide) occur only when they are moderately permeable according to the FDA classification. This may be due to low solubility at the pH values where permeability is high, which decreases absorption in the GIT; solubility, however, is not explicitly considered in the developed classification models for permeability.

In conclusion, although some misclassifications exist for both high- and low-permeable compounds, the decision trees predict the BCS permeability classes with higher confidence and accuracy than the individual models; the improvement is especially pronounced when using models with theoretical molecular descriptors. Consequently, the decision trees allowed evaluating the consensus of the six models and attributing BCS permeability classes to the FDA reference compounds more precisely. From this we can conclude that such an approach will help to apply these models and decision trees to predict the passive transport of new drug candidates with high reliability.
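The decision rule described for Figure 5 (at least three of the six models predicting the high class) can be written as a short sketch. The function name `bcs_permeability_class` is illustrative; the example predictions are the theoretical molecular descriptor rows for ketoprofen and famotidine from Table 2, in the order pH 3, pH 5, pH 7.4, pH 9, logPe_highest, logPo.

```python
def bcs_permeability_class(predictions, min_high=3):
    """Consensus class from the six model predictions ('H' or 'L')."""
    if len(predictions) != 6:
        raise ValueError("expected predictions from six models")
    n_high = sum(1 for p in predictions if p == "H")   # #HighPrediction
    return "H" if n_high >= min_high else "L"

# Predictions with theoretical molecular descriptors taken from Table 2.
ketoprofen = ["H", "H", "L", "L", "H", "H"]
famotidine = ["L", "L", "L", "L", "L", "L"]

print(bcs_permeability_class(ketoprofen))   # FDA class in Table 2: high
print(bcs_permeability_class(famotidine))   # FDA class in Table 2: low
```

Counting agreement across the pH-specific, highest and intrinsic permeability models means that a compound that is low permeable at some pH values, such as ketoprofen at pH 7.4 and 9, can still be assigned to the high class.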


4. Conclusion

Classification models for membrane permeability measured by PAMPA at different pH values have been developed using the logistic regression method. These models predict high and low class-based pH-permeability profiles (at pH 3, 5, 7.4, and 9), the highest membrane permeability classes in the pH range of the GIT, and intrinsic membrane permeability classes. The models systematically cover the full pH range in the GIT and provide a realistic estimation of human intestinal absorption. Two types of classification models were developed: models with a hydrophobicity descriptor (logPow or logD) and models with theoretical molecular descriptors. Both types of classification models show satisfactory performance for the training, validation and external validation sets of all six data series. Models with theoretical molecular descriptors describe properties of the molecular structure that capture interactions in the lipid phase (polarity, and the size and complexity of the molecule) and in the environment caused by the pH (ionization). Experimental and predicted pH-permeability profiles show very good coherence for both types of models, which indicates that these models can predict pH-permeability profiles and also assess regionally dependent absorption.

In addition, the developed classification models were tested for predicting the permeability classes of FDA reference drugs for the BCS. Models with theoretical molecular descriptors have a slightly better prediction capability (accuracy: 0.81-0.88) for the reference drugs than the models with the hydrophobicity descriptor (accuracy: 0.72-0.81). Higher accuracy was also observed when the highest or intrinsic membrane permeability is considered, which demonstrates the importance of considering a wide pH range in the GIT and that doing so improves the quality of predictions for human intestinal absorption.

To facilitate decision making in early drug development, the predictions from both types of models can be used as inputs for the decision trees. The decision trees relying on the predictions of the high-permeable class from six models with the hydrophobicity descriptor (accuracy: 0.84) and with the theoretical molecular descriptors (accuracy: 0.91) significantly improve the classification of FDA reference drugs into BCS permeability classes when compared to the individual models. This confirms that decision trees derived from the predictions of the developed classification models will be highly usable for predicting passive transport in the GIT. They allow the selection of suitable drug candidates and biowaivers during the drug discovery process and are potentially useful as an alternative method for predicting permeability classes for the BCS.


5. Acknowledgements

This work was supported by the Ministry of Education and Research, Republic of Estonia [grant number IUT34-14] and the European Union European Regional Development Fund through Foundation Archimedes [grant number TK143, Centre of Excellence in Molecular Cell Engineering]. The authors are grateful to Dr. Alex Avdeef (in-ADME Research) for helpful discussions.


6. Supporting Information

Supporting Information Available: Graphical analysis of classification models (Figures S1 to S5), SMILES for drug substances (Table S1) and FDA reference drugs (Table S8), data for each classification model (Tables S2 to S7) and data for FDA reference drugs for both types of models (Tables S9 and S10).


7. Figures

Figure 1 Distribution of high (H) and low (L) permeable compounds in data series and chemical classes within them.


Figure 2 Classification results for single-parameter models with hydrophobicity (logD/logPow, M1-M6) and multi-parameter models with theoretical molecular descriptors (M7-M12) for all data series: T - training, V - validation, Ext - external validation sets.


Figure 3 Comparison of ROC plots for models with hydrophobicity descriptor (M1-M6) and theoretical molecular descriptors (M7-M12) for corresponding training, validation and external validation sets. The dashed line indicates random classification with AUC value 0.5.


Figure 4 Comparisons for highest membrane permeability models (logPe_highest) with hydrophobicity and theoretical molecular descriptors: distribution of probabilities (a) and descriptor values (b, c).


Figure 5 Decision trees for models with hydrophobicity descriptor and theoretical molecular descriptors: predicted classes for FDA reference compounds with performance characteristics. The number of high permeability predictions (#HighPrediction) is taken over six models.


8. Tables

Table 1 Number of compounds and distribution of high- and low-permeable compounds in the training, validation and external validation sets within each data series.

                 Training             Validation           External
Data series      Total  High  Low     Total  High  Low     Total  High  Low
pH 3             67     33    34      111    11    100     60     15    45
pH 5             89     44    45      89     14    75      60     21    39
pH 7.4           134    68    66      44     22    22      60     40    20
pH 9             134    68    66      44     22    22      60     40    20
logPe_highest    107    56    51      71     54    17      60     45    15
logPo            91     44    47      87     71    16      60     47    13

Table 2 Examples of pH-specific, highest (high) and intrinsic (logPo) membrane permeability for chemical classes with predicted classes for models with hydrophobicity descriptor (logD/logPow) and theoretical molecular descriptors (Theor. mol. desc.) with FDA permeability classes.

a) Ketoprofen (Acid), FDA: high
                     pH 3  pH 5  pH 7.4  pH 9  high  logPo
Exp                  H     H     L       L     H     H
logD/logPow          H     H     L       L     H     H
Theor. mol. desc.    H     H     L       L     H     H

b) Propranolol (Base), FDA: high
                     pH 3  pH 5  pH 7.4  pH 9  high  logPo
Exp                  L     L     H       H     H     H
logD/logPow          L     L     L       H     H     H
Theor. mol. desc.    L     L     H       H     H     H

c) Famotidine (Ampholyte), FDA: low
                     pH 3  pH 5  pH 7.4  pH 9  high  logPo
Exp                  L     L     L       L     L     L
logD/logPow          L     L     L       L     L     L
Theor. mol. desc.    L     L     L       L     L     L

d) Carbamazepine (Neutral), FDA: high
                     pH 3  pH 5  pH 7.4  pH 9  high  logPo
Exp                  H     H     H       H     H     H
logD/logPow          H     H     H       H     H     H
Theor. mol. desc.    H     H     H       H     H     H

Table 3 One-parameter classification models (z) with the intercept and the coefficients of the variable using hydrophobicity descriptor and their performance characteristics (Exp. – experimental, Pred. – predicted, Train – training set, Val – validation set, Ext – external validation set, H – high, L – low, Acc – accuracy, Sens – sensitivity, Spec – specificity).

M1: Model at pH 3 (Figure S1, Table S2)
zM1 = 1.9182(±0.5936) – 1.7339(±0.4366)*logDpH3

                  Pred. H  Pred. L
Exp. H   Train    30       3
         Val      10       1
         Ext      14       1
Exp. L   Train    3        31
         Val      19       81
         Ext      5        40

         Accuracy  Sensitivity  Specificity
Train    0.91      0.91         0.91
Val      0.82      0.91         0.81
Ext      0.90      0.93         0.89

M2: Model at pH 5 (Figure S2, Table S3)
zM2 = 1.3569(±0.4149) – 1.6234(±0.3427)*logDpH5

                  Pred. H  Pred. L
Exp. H   Train    36       8
         Val      13       1
         Ext      19       2
Exp. L   Train    6        39
         Val      23       52
         Ext      10       29

         Accuracy  Sensitivity  Specificity
Train    0.84      0.82         0.87
Val      0.73      0.93         0.69
Ext      0.80      0.90         0.74

M3: Model at pH 7.4 (Figure S3, Table S4)
zM3 = 1.3899(±0.3298) – 1.6063(±0.2686)*logDpH7.4

                  Pred. H  Pred. L
Exp. H   Train    58       10
         Val      19       3
         Ext      36       4
Exp. L   Train    9        57
         Val      6        16
         Ext      3        17

         Accuracy  Sensitivity  Specificity
Train    0.86      0.85         0.86
Val      0.80      0.86         0.73
Ext      0.88      0.90         0.85

M4: Model at pH 9 (Figure S4, Table S5)
zM4 = 2.0555(±0.4488) – 1.6828(±0.2695)*logDpH9

                  Pred. H  Pred. L
Exp. H   Train    59       9
         Val      21       1
         Ext      28       12
Exp. L   Train    10       56
         Val      5        17
         Ext      5        15

         Accuracy  Sensitivity  Specificity
Train    0.86      0.87         0.85
Val      0.86      0.95         0.77
Ext      0.72      0.70         0.75

M5: Model for highest membrane permeability over pH range 3 to 9 (Figure 4, Table S6)
zM5 = 3.0469(±0.7128) – 2.0974(±0.4200)*logDhighest

                  Pred. H  Pred. L
Exp. H   Train    50       6
         Val      48       6
         Ext      32       13
Exp. L   Train    10       41
         Val      2        15
         Ext      3        12

         Accuracy  Sensitivity  Specificity
Train    0.85      0.89         0.80
Val      0.89      0.89         0.88
Ext      0.73      0.71         0.80

M6: Model for intrinsic membrane permeability (logPo) (Figure S5, Table S7)
zM6 = 2.7082(±0.6470) – 1.7840(±0.3696)*logPow

                  Pred. H  Pred. L
Exp. H   Train    38       6
         Val      67       4
         Ext      37       10
Exp. L   Train    7        40
         Val      5        11
         Ext      4        9

         Accuracy  Sensitivity  Specificity
Train    0.86      0.86         0.85
Val      0.90      0.94         0.69
Ext      0.77      0.79         0.69
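
The performance characteristics in Table 3 follow directly from the confusion counts when high permeability is treated as the positive class, which is the reading the tabulated values are consistent with. The sketch below recomputes them for model M1; the class and method names are illustrative and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion counts for one data set, with 'high' as the positive class."""

    exp_high_pred_high: int  # experimentally high, predicted high
    exp_high_pred_low: int   # experimentally high, predicted low
    exp_low_pred_high: int   # experimentally low, predicted high
    exp_low_pred_low: int    # experimentally low, predicted low

    def accuracy(self) -> float:
        correct = self.exp_high_pred_high + self.exp_low_pred_low
        total = correct + self.exp_high_pred_low + self.exp_low_pred_high
        return correct / total

    def sensitivity(self) -> float:
        # Fraction of experimentally high compounds predicted as high.
        return self.exp_high_pred_high / (self.exp_high_pred_high + self.exp_high_pred_low)

    def specificity(self) -> float:
        # Fraction of experimentally low compounds predicted as low.
        return self.exp_low_pred_low / (self.exp_low_pred_low + self.exp_low_pred_high)


# Counts for model M1 (pH 3) as listed in Table 3.
M1 = {
    "Train": ConfusionCounts(30, 3, 3, 31),
    "Val": ConfusionCounts(10, 1, 19, 81),
    "Ext": ConfusionCounts(14, 1, 5, 40),
}

for name, counts in M1.items():
    print(
        f"{name}: Acc {counts.accuracy():.2f}, "
        f"Sens {counts.sensitivity():.2f}, Spec {counts.specificity():.2f}"
    )
```

Rounded to two decimals, the results match the values reported for M1. The same calculation applies unchanged to the counts of the other models.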

Table 4 Multi-parametric classification models (z) with the intercept and the coefficients of the variable using theoretical molecular descriptors and their performance characteristics (Exp. – experimental, Pred. – predicted, Train – training set, Val – validation set, Ext – external validation set, H – high, L – low, Acc – accuracy, Sens – sensitivity, Spec – specificity).

M7: Model at pH 3 (Figure S1, Table S2)
zM7 = -2.2465(±0.6331) + 3.1714(±0.8013)*nBase + 31.7661(±10.1616)*ETA_dEpsilon_D

                  Pred. H  Pred. L
Exp. H   Train    29       4
         Val      10       1
         Ext      12       3
Exp. L   Train    4        30
         Val      22       78
         Ext      7        38

         Accuracy  Sensitivity  Specificity
Train    0.88      0.88         0.88
Val      0.79      0.91         0.78
Ext      0.83      0.80         0.84

M8: Model at pH 5 (Figure S2, Table S3)
zM8 = -93.311(±26.727) + 12.034(±3.500)*Mi + 1.262(±0.353)*nHBDon

                  Pred. H  Pred. L
Exp. H   Train    35       9
         Val      13       1
         Ext      14       7
Exp. L   Train    12       33
         Val      21       54
         Ext      9        30

         Accuracy  Sensitivity  Specificity
Train    0.76      0.80         0.73
Val      0.75      0.93         0.72
Ext      0.73      0.67         0.77

M9: Model at pH 7.4 (Figure S3, Table S4)
zM9 = 8.1235(±3.0492) + 1.8048(±0.6662)*nAcid + 49.3794(±10.5649)*ETA_dEpsilon_D – 19.4353(±5.4894)*ETA_Psi_1

                  Pred. H  Pred. L
Exp. H   Train    56       12
         Val      20       2
         Ext      27       13
Exp. L   Train    11       55
         Val      3        19
         Ext      4        16

         Accuracy  Sensitivity  Specificity
Train    0.83      0.82         0.83
Val      0.89      0.91         0.86
Ext      0.72      0.68         0.80

M10: Model at pH 9 (Figure S4, Table S5)
zM10 = -4.4598(±0.7984) + 3.1851(±0.8401)*nAcid + 0.05472(±0.00981)*TopoPSA

                  Pred. H  Pred. L
Exp. H   Train    56       12
         Val      20       2
         Ext      30       10
Exp. L   Train    13       53
         Val      2        20
         Ext      4        16

         Accuracy  Sensitivity  Specificity
Train    0.81      0.82         0.80
Val      0.91      0.91         0.91
Ext      0.77      0.75         0.80

M11: Model for highest membrane permeability over pH range 3 to 9 (Figure 4, Table S6)
zM11 = -19.2012(±3.8467) + 1.5724(±0.4009)*nHBDon + 10.1887(±2.1448)*IC0

                  Pred. H  Pred. L
Exp. H   Train    47       9
         Val      43       11
         Ext      35       10
Exp. L   Train    7        44
         Val      2        15
         Ext      2        13

         Accuracy  Sensitivity  Specificity
Train    0.85      0.84         0.86
Val      0.82      0.80         0.88
Ext      0.80      0.78         0.87

M12: Model for intrinsic membrane permeability (logPo) (Figure S5, Table S7)
zM12 = -13.6556(±3.0374) + 0.8675(±0.3012)*nHBDon + 7.4158(±1.7392)*IC0

                  Pred. H  Pred. L
Exp. H   Train    34       10
         Val      60       11
         Ext      37       10
Exp. L   Train    7        40
         Val      2        14
         Ext      3        10

         Accuracy  Sensitivity  Specificity
Train    0.81      0.77         0.85
Val      0.85      0.85         0.88
Ext      0.78      0.79         0.77
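
Each model in Tables 3 and 4 is a linear predictor z built from an intercept and one or more descriptor coefficients. The sketch below evaluates z for M1 and M10 using the tabulated coefficients (standard errors omitted). The descriptor values are hypothetical inputs chosen only to show the arithmetic, and the dictionary layout and function names are illustrative. The standard logistic function is included for completeness; which permeability class the resulting probability refers to, and the cut-off applied to it, are defined in the methods of the paper and are not assumed here.

```python
import math
from typing import Mapping

# Intercepts and coefficients copied from Tables 3 and 4 (standard errors omitted).
MODELS = {
    "M1": {"intercept": 1.9182, "coef": {"logD_pH3": -1.7339}},
    "M10": {"intercept": -4.4598, "coef": {"nAcid": 3.1851, "TopoPSA": 0.05472}},
}


def linear_predictor(model: Mapping, descriptors: Mapping[str, float]) -> float:
    """Compute z = intercept + sum(coefficient * descriptor value)."""
    return model["intercept"] + sum(
        coefficient * descriptors[name] for name, coefficient in model["coef"].items()
    )


def logistic(z: float) -> float:
    """Standard logistic function; its class assignment is not specified here."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical descriptor values, for illustration only.
z_m1 = linear_predictor(MODELS["M1"], {"logD_pH3": 1.0})
z_m10 = linear_predictor(MODELS["M10"], {"nAcid": 1, "TopoPSA": 60.0})

print(f"zM1 = {z_m1:.4f}, logistic(zM1) = {logistic(z_m1):.3f}")
print(f"zM10 = {z_m10:.4f}, logistic(zM10) = {logistic(z_m10):.3f}")
```

Keeping the coefficients in a dictionary keyed by descriptor name lets the one-parameter and multi-parameter models share the same evaluation function.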

Table 5 Descriptors in the multi-parametric classification models (M7-M12) grouped according to the properties of molecular structure.

Properties of molecular structure: Ionization; Polarity; Size and complexity of molecule

Descriptor        Models
nBase             pH 3
Mi                pH 5
nAcid             pH 7.4 and 9
ETA_dEpsilon_D    pH 3 and 7.4
nHBDon            pH 5, logPe_highest, logPo
ETA_Psi_1         pH 7.4
TopoPSA           pH 9
IC0               logPe_highest, logPo

Table 6 Comparison of BCS permeability classes with predictions (H – high, L – low) from the models that use either the hydrophobicity descriptor (Table 3: M1-M6) or theoretical molecular descriptors (Table 4: M7-M12), and the final decision based on the decision tree (Figure 5) for FDA reference drugs. Misclassified compounds are marked with a red circle.

pH 3     pH 5     pH 7.4   pH 9
M1 M7    M2 M8    M3 M9    M4 M10

BCS: High permeability (fa≥85%)
a) FDA: High permeability (fa≥85%)
Antipyrine* H H H H H H H
Caffeine* H H H
Ketoprofen* H H H H
Naproxen* H H H H
Theophylline* H H
Metoprolol* H H
Propranolol* H H H
Carbamazepine* H H H H H H H H
Disopyramide* H H H H
Minoxidil*
Phenytoin H H H H H H H H
BCS: Low permeability (fa2

[Decision tree graphic (cf. Figure 5): No branch: Low (Low: 20, High: 2); Yes branch: High (High: 9, Low: 1). Performance values shown: Sensitivity 0.64, Specificity 0.90, Accuracy 0.81; Sensitivity 0.82, Specificity 0.95, Accuracy 0.91]
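
As a worked check, the leaf counts above (High leaf: High 9, Low 1; Low leaf: Low 20, High 2) reproduce the second set of performance values. This assumes that the counts are the numbers of experimentally high and low compounds ending up in each leaf, and that high permeability is the positive class. That reading is an interpretation, not something stated in the extracted text.

```latex
\begin{align*}
\text{Sensitivity} &= \frac{9}{9 + 2} \approx 0.82, \\
\text{Specificity} &= \frac{20}{20 + 1} \approx 0.95, \\
\text{Accuracy} &= \frac{9 + 20}{9 + 1 + 20 + 2} = \frac{29}{32} \approx 0.91 .
\end{align*}
```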

[Graphic, page 49 of 49: PAMPA pH-permeability profile (pH 3, pH 5, pH 7.4, pH 9, Highest, logPo); classification models; decision tree (#HighPrediction>2: No: Low; Yes: High); BCS]