15 A New Approach to Bioactive Synthesis PHILIP S. MAGEE
Chevron Chemical Co., 940 Hensley Street, Richmond, CA 94804
The application of QSAR to bioactive synthesis has always suffered from an unfortunate paradox. In order to develop a useful equation, it is necessary to first complete a substantial fraction of the synthesis. Only then can the derived equation assist in extending or optimizing the bioactive series. No help is available for the earliest or intermediate stages of synthesis which have already been passed. Nor is it certain that a useful equation can be gained from the first 10-20 members of a series. Poor selection of structural changes, variable biodata, differential metabolism of some members and the presence of unknown factors can all lead to poor correlations of little practical use. These problems are common to anyone who has attempted QSAR on novel bioactive series. There is another problem that is equally vexing and this relates to drug or pesticide modification. In most cases, the complete structural series and biodata were developed in another laboratory and are unavailable. The only available knowledge may be the structures of the commercial and patented bioactives. No equation is forthcoming unless the entire study is repeated in your own laboratories, a project unlikely to gain approval. How then can one apply computer assisted methods based on QSAR to aid synthesis at any stage in either type of problem? Inventive use of the transport, enzyme association and reaction model in conjunction with a large operational table of physically measured parameters provides a partial solution.
0-8412-0521-3/79/47-112-319$05.50/0 © 1979 American Chemical Society
Olson and Christoffersen; Computer-Assisted Drug Design ACS Symposium Series; American Chemical Society: Washington, DC, 1979.
The Necessary Models

The approach depends on familiar structural and bioenergetic models. In the structural model, a bioactive series is described as a parent system plus substituents. We will deal only with substituents and their effects on structure rather than with whole molecules. A whole molecule approach is not impossible but poses more problems in its design. Figure 1 shows a simplified version of the bioenergetic model leading to the general Hansch equation. Transport from point of application to the region containing the active site, represented here as an enzyme, is shown. For passive transport, the dependency is some function of log P or the derived parameter, π. Substrate binding, however, may depend on π or MR according to the nature of the association site. Inhibition is then considered to be an organic reaction likely to correlate with electronic and steric parameters. Finally, the sequence of events that follows inhibition leads to a measured bioresponse that can be cast into free-energy form by the expression log 1/ED50. For a related series of compounds, we then accept the linear combination of free energy factors based on substituent constants as being frequently capable of correlating the biodata.
Figure 1. QSAR model. Schematic: point of application, transport across phases (log P, π), binding at the enzyme site (π, MR), inhibition (σ's, υ, Es), leading to

$\log 1/\mathrm{ED}_{50} = a\pi - b\pi^{2} + c\sigma + d\upsilon + e$
The important point of this model is that we restrict ourselves to simple transport, binding and reaction parameters.
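As a minimal sketch of how such a model is evaluated, the Python functions below compute the parabolic Hansch form that Figure 1 appears to show, log 1/ED50 = aπ - bπ² + cσ + dυ + e, together with the π value at which the transport/binding term peaks. The coefficient values are hypothetical placeholders chosen only for illustration (the chapter gives none), and the function names are not from the source.

```python
def hansch_log_inv_ed50(pi, sigma, upsilon, a, b, c, d, e):
    """Parabolic Hansch form: a*pi - b*pi**2 + c*sigma + d*upsilon + e."""
    return a * pi - b * pi ** 2 + c * sigma + d * upsilon + e


def optimum_pi(a, b):
    """pi value that maximizes a*pi - b*pi**2 (requires b > 0)."""
    if b <= 0:
        raise ValueError("b must be positive for a parabolic optimum")
    return a / (2.0 * b)


# Hypothetical coefficients, for illustration only (not from the source).
coeffs = dict(a=1.2, b=0.4, c=0.8, d=-0.3, e=2.0)

# Substituent constants for a para -CN group as listed in Figure 2:
# pi = -0.57, sigma = 0.66. upsilon is set to 0.0 as a placeholder
# because no upsilon value is legible in the figure.
activity = hansch_log_inv_ed50(pi=-0.57, sigma=0.66, upsilon=0.0, **coeffs)
best_pi = optimum_pi(coeffs["a"], coeffs["b"])
print(f"predicted log 1/ED50 = {activity:.3f}, optimum pi = {best_pi:.2f}")
```

The optimum follows from setting the derivative of aπ - bπ² to zero, which is why only a and b enter it.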
Master Data

The key to computer assisted synthesis based on these models lies in the size and format of the data base. In its current version, the Master Data parameter table lists 390 substituents in 780 lines. As shown in Figure 2, each substituent is entered on two lines. The odd lines contain the meta parameters; the even lines contain the para and aliphatic values. Following the case number and a simplified substituent name, the headings are: molecular weight of fragment, molar refraction, pi, pi squared, Hammett's sigma, Brown's sigma plus, Charton's sigma localized and Charton's upsilon values.
Figure 2. Master data

CASE  NAME  MWF   MR     PI     PIQ   SIG   SIGP
17    CN    26.0   6.33  -0.57  0.32  0.56  0.56
18    CN    26.0   6.33  -0.57  0.32  0.66  0.66
701   SOME  63.1  13.70  -1.58  2.50  0.52
702   SOME  63.1  13.70  -1.58  2.50  0.49

One further value per substituent is legible after these columns (CN 0.60 under SIG1; SOME 0.39, column uncertain).
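The two-line layout can be mirrored in a small data structure. The sketch below is an assumption about how such a table might be held in memory, not the original data base format; the field names follow the Figure 2 headings, only values legible in Figure 2 are filled in, and the odd/even rule is the one stated in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MasterLine:
    case: int                      # odd = meta line, even = para/aliphatic line
    name: str                      # simplified substituent name
    mwf: float                     # molecular weight of fragment
    mr: float                      # molar refraction
    pi: float                      # hydrophobic substituent constant
    sigma: Optional[float] = None  # Hammett sigma for this position

    @property
    def piq(self) -> float:
        """pi squared, which the source table stores as its own column (PIQ)."""
        return round(self.pi ** 2, 2)

    @property
    def position(self) -> str:
        return "meta" if self.case % 2 else "para/aliphatic"


# The four lines legible in Figure 2.
MASTER = [
    MasterLine(17, "CN", 26.0, 6.33, -0.57, 0.56),
    MasterLine(18, "CN", 26.0, 6.33, -0.57, 0.66),
    MasterLine(701, "SOME", 63.1, 13.70, -1.58, 0.52),
    MasterLine(702, "SOME", 63.1, 13.70, -1.58, 0.49),
]

for line in MASTER:
    print(line.case, line.name, line.position, line.pi, line.piq, line.sigma)
```

Deriving PIQ from PI rather than storing it is a design choice of this sketch; the recomputed values agree with the printed column to two decimals.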
Originally designed as a data base for multiple regression, the main table has several sub-table routines for combining selected lines with kinetic, equilibrium or biodata. One of the routines converts biodata in ppm, a common industry form, to the molar equivalent, log MW/ED50. This is generated by introducing the parent MW into a program that uses MWF from Master Data. The most important feature of Master Data is the simple data base format. This allows manipulation of the data by computer programs designed to assist synthesis. To facilitate use of these programs, MWF and each of the parameters are assigned a numerical code (1-8) as column identifiers. In the following sections, the Master Data programs are described with relevant examples.

RANGE Program - Isolipophilic Groups
Many drug and pesticide series are either transport or binding dependent and exhibit optimum behavior in log P or in substituent π values. This is often apparent from the structures of commercial members of a class. Within a factor of two of the optimum (log 1/C - 0.3), the width of most parabolic or bilinear plots is 1.5-3.0 log P units (1). Thus, if a class is log P or Σπ dependent, its best members should cluster about
the optimum with the majority falling in the range, log P (avg) ± 1.0. Figures 3 and 4 show four drug series that fit this criterion.
Figure 3. Some drug classes (structure drawings; surviving panel labels: Diuretic, Local Anesthetic)
Figure 4. Probable transport or binding dependence

Class              Avg. Σπ  Range (±1.0)  No. in Range  n
Barbiturates       3.23     2.23-4.23     30            34
Antipsychotics     1.43     0.43-2.43     19            22
Diuretics          2.58     1.58-3.58     14            16
Local Anesthetics  2.32     1.32-3.32     15            18
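The ±1.0 criterion is simple to check numerically. The sketch below recomputes the windows and the in-range fractions from the Figure 4 entries; the function and variable names are illustrative and not part of the original programs.

```python
# (class, average sum-of-pi, number in range, total n) as listed in Figure 4
CLASSES = [
    ("Barbiturates", 3.23, 30, 34),
    ("Antipsychotics", 1.43, 19, 22),
    ("Diuretics", 2.58, 14, 16),
    ("Local Anesthetics", 2.32, 15, 18),
]


def pi_window(avg, half_width=1.0):
    """Window of +/- half_width around the class average, to two decimals."""
    return (round(avg - half_width, 2), round(avg + half_width, 2))


def summarize(classes):
    """Return (name, lower, upper, fraction in range) for each class."""
    rows = []
    for name, avg, in_range, n in classes:
        lo, hi = pi_window(avg)
        rows.append((name, lo, hi, in_range / n))
    return rows


for name, lo, hi, frac in summarize(CLASSES):
    print(f"{name:18s} {lo:.2f}-{hi:.2f}  {frac:.0%} in range")
```

In each of the four classes, more than 80% of the listed members fall inside the window.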
Concentrating on one of these classes, we note in Figure 5 that 27 of 34 commercial barbiturates have ethyl or allyl as one of the gem-dialkyl groups (2). In designing a program to replace ethyl or allyl with novel groups, the most common cause of low activity could well be serious departure
from optimum log P. What is really needed then is a design program for novel isolipophilic groups. Such groups may still fail to excel but not because they possess inadequate π values.
Figure 5. Various barbiturates (structure drawings; surviving labels: Mephobarbital, Talbutal, Allobarbital)

C-Alkyl      No. of Cases  n   π
C2H5         15            34  1.02
CH2=CHCH2    12            34  1.10
(CH3)2CH     5             34  1.40
In a simplistic approach, Master Data can be rapidly searched by a PL/1 program called RANGE for groups having π values close to ethyl and allyl. RANGE is activated by selecting the parameter code for π (=2) and specifying the lower and upper limits of the search. The program segregates the desired data, ranks it in ascending order and prints out within seconds. Response on a CRT terminal is immediate. This is useful, but a more inventive approach uses group fragments called RANGE modifiers. Shown in Figure 6 are a few of the groups used to modify any selected range of π values by combining these groups with those on the program print-out. In this procedure, we are taking advantage of an approximation, the near additivity of π values. Since these groups contribute to the total π value,
their values must be subtracted from the range being searched.

Figure 6. Range modifiers

-S-, -O-, -CH=CH-, >C=O, -(CH2)n-, -SO2-, -CN, -OCH3, -SCN, -CO2CH3, -N(CH3)2, -SCH3, -SO2CH3, -Cl, -Br
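The basic RANGE search reduces to filtering one parameter column between two limits and ranking the hits. The Python sketch below is a stand-in for the PL/1 program, whose source is not given; the function name, the dictionary layout and the small π table are assumptions for illustration. The π values are ones quoted in this chapter, and the limits 0.72 and 1.40 are the extended ethyl-allyl range used in the worked example.

```python
# Small illustrative pi table; values are those quoted in the chapter.
PI_TABLE = {
    "ethyl": 1.02,
    "allyl": 1.10,
    "isopropyl": 1.40,
    "vinyl": 0.82,
    "CN": -0.57,
    "SOMe": -1.58,
}


def range_search(table, lower, upper):
    """Return (name, value) pairs with lower <= value <= upper, ascending."""
    if lower > upper:
        raise ValueError("lower limit exceeds upper limit")
    hits = [(name, v) for name, v in table.items() if lower <= v <= upper]
    return sorted(hits, key=lambda item: item[1])


hits = range_search(PI_TABLE, 0.72, 1.40)
for name, value in hits:
    print(f"{value:6.2f}  {name}")
```

Both limits are inclusive in this sketch, so a group sitting exactly on the boundary (isopropyl at 1.40) is reported.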
As shown in Figure 7, if three -CH2- groups are added to substituents printed out by the RANGE program, then clearly this contribution must be subtracted to stay in the desired range.

Figure 7. Modifier correction

Modifier   π      Correction
-(CH2)3-   1.55   -1.55
-CH=CH-    0.82   -0.82
-OCH3      -0.02  +0.02
>C=O       -1.06  +1.06
-SO2-      -2.14  +2.14
In actual practice, the RANGE problem of generating groups isolipophilic with ethyl-allyl would be handled as follows. As both groups are close in π value, a single nominal range of 1.02-1.10 can be used. This range is arbitrarily extended to accommodate errors in π and to avoid missing adjacent groups of interest. An extension of 0.3 was selected for this problem, a value which includes two other small groups used in barbiturates, vinyl (π = 0.82) and isopropyl (π = 1.40). The extended range (0.72-1.40) is now broadened to cover all modifiers in a single run (search range = -0.83 to 3.54). As the print-out is ranked in π, it is simple to mark and label the top
and bottom of each modifier. Within these individual ranges, one looks for interesting groups to combine with the modifiers. The near additivity of π values is an approximation, but the width of most log P curves is such that few outliers will be generated (1). It should be remembered that no attempt is being made to duplicate exact numbers but simply to fall within or near a selected range. Because of this scope and "margin for error", the combination process of modifier + print-out groups can be highly inventive, with cyclizations and isomerizations freely allowed. Figure 8 shows a few of the many groups generated as ethyl/allyl replacements. Note that part of the inventive process includes casting the isolipophilic group into a suitable intermediate for the desired reaction, alkylation of malonic ester in this case. The combination of Master Data and the RANGE program is merely an assistant or co-inventor at best. Human inventive skills are still required to complete the procedure. The potential for generating novel isolipophilic groups is nearly unlimited even though Master Data is presently far from complete in π values.
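The range arithmetic in this example can be reproduced directly. Assuming π values are additive, each modifier shifts the extended range by its correction, and the single search range is the envelope of the shifted ranges. The sketch below uses the modifier π values from Figure 7 (the label of the -0.02 entry is read here as -OCH3); the function names are illustrative.

```python
# Modifier pi values as listed in Figure 7.
MODIFIER_PI = {
    "-(CH2)3-": 1.55,
    "-CH=CH-": 0.82,
    "-OCH3": -0.02,
    ">C=O": -1.06,
    "-SO2-": -2.14,
}


def extend(nominal, margin):
    """Widen a nominal (low, high) range by margin on both sides."""
    lo, hi = nominal
    return (round(lo - margin, 2), round(hi + margin, 2))


def corrected_ranges(target, modifiers):
    """Shift the target range by minus each modifier's pi contribution."""
    lo, hi = target
    return {m: (round(lo - p, 2), round(hi - p, 2)) for m, p in modifiers.items()}


def envelope(ranges):
    """Smallest single range that covers every range in the list."""
    return (min(r[0] for r in ranges), max(r[1] for r in ranges))


extended = extend((1.02, 1.10), 0.3)            # ethyl-allyl plus 0.3 margin
per_modifier = corrected_ranges(extended, MODIFIER_PI)
search = envelope(list(per_modifier.values()) + [extended])

print("extended range:", extended)
for modifier, window in per_modifier.items():
    print(f"{modifier:10s} search {window[0]:.2f} to {window[1]:.2f}")
print("single-run search range:", search)
```

The lower limit is set by the most lipophilic modifier, -(CH2)3-, and the upper limit by the most hydrophilic, -SO2-, which matches the search range quoted in the text.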
Figure 8. Groups isolipophilic with C2H5-C3H5 (structure drawings pairing modifier and print-out groups with isolipophilic reactants). Legible groups: -SCH3, -CH2CH2OCH3, -N(CH3)2, -CH2CH=CH2, -CH2C(CH3)3, >C=O. Legible reactants: CH3OCH2C≡CCH2Cl, (CH3)2NCH=CHCH2Cl, (CH3)3CC(=O)CH2Br. Example: alkylation of malonic ester with CH3OCH2C≡CCH2Cl leading to a barbiturate with Σπ = 2.93.
Figures 9 and 10 describe a similar pesticide example based on the Stauffer Chemical series of thiolcarbamate herbicides. The implied optimum in transport or binding is
clear from inspection of Figure 9, these four being the strongest of the series. The problem chosen was replacement of the RS-alkyl group with groups isolipophilic with ethyl/propyl. As seen in Figure 10, the range searched is nearly the same as that for ethyl/allyl. This is coincidental and implies no limitations on the procedure. Any group from most lipophilic to most hydrophilic can be processed. Figure 10 also contains an interesting example of the inventive process. In the third example, the valerolactone generated from the carboxyl modifier and the butyl group is difficult to prepare but the isomeric bromobutyrolactone is available from Aldrich Chemical.
Figure 9. Thiolcarbamate herbicides (structure drawings; surviving labels: EPTAM, ORDRAM, VERNAM)
Figure 11. Fungicides inhibiting dehydrogenases (structure drawings; surviving labels: Griseofulvin; inhibition by Michael addition)
From a pattern recognition point of view, over 50% of commercial fungicides are, in fact, SH inhibitors (4). These three systems represent a broad general class and have in common an inhibition step related to the Michael addition. The RANGE program could be used to examine each fungicide individually for the purpose of improving each class. But a much more interesting approach is to treat the Michael addition itself. This can be done by considering the electronic range of those groups known to activate the double bond for nucleophilic addition. Five of the nine groups selected for this study are shown in Figure 12 and these include the two groups at each end of the range (-NO2 and -CONH2). Sigma para (σp) was selected to represent the electronic range of these groups because of its high resonance component. A much better choice would be one of Charton's sigma delocalized parameters
Figure 12. Michael addition activators (-CH=CH-X)

X          σp
-NO2       0.78
-SO2CH3    0.72
-CN        0.66
-COOCH3    0.45
-CONH2     0.36

Range:  0.36-0.78
Search: 0.3-0.9
(σD or